CH 10

The document provides a detailed overview of one-way Analysis of Variance (ANOVA), explaining its purpose, methodology, and key terms. It outlines how to conduct the test, interpret results, and perform supplemental analyses, including post-hoc tests. Additionally, it includes examples and learning checks to illustrate the application of ANOVA in comparing group means.

Uploaded by

김봉기
Copyright
© © All Rights Reserved

CHAPTER 10

ANALYSIS OF VARIANCE
Analysis of variance (ANOVA)
• A one-way ANOVA is used to compare two or more
treatment means
• Analysis of “VARIANCE”: a hypothesis test where the
variance of group means is compared to variance within
those groups; produces an F-ratio
• Instead of comparing means one pair at a time, it requires
calculating a single number to tell how much all of those
means vary from each other.
• Research questions for the one-way ANOVA should ask if
there are any differences between the groups
• Example: Do any of the groups have different scores than the other
groups?
ANOVA Example
Joyce Kuhlman manages a regional financial center. She wishes to
compare the productivity, as measured by the number of customers served,
among three employees. Four days are randomly selected and the number
of customers served by each employee is recorded. Is there a difference in
the mean number of customers served?

Wolfe White Korosa


55 66 47
54 76 51
59 67 46
56 71 48
The ANOVA Test
First find the overall mean of the 12 observations. It is 58. Next, find the difference between each particular value and the overall mean. Square these differences and sum them. The result is the total variation, here 1,082.

Wolfe White Korosa
55 66 47
54 76 51
59 67 46
56 71 48

TOTAL VARIATION The sum of the squared differences between each observation and the overall mean.

Now, break this total variation into two components: treatment variation and random variation.

TREATMENT (Between) VARIATION The sum of the squared differences between each treatment mean and the grand or overall mean.

RANDOM (Within) VARIATION The sum of the squared differences between each observation and its treatment mean.
The ANOVA Test Continued
Recall, the overall mean is 58 and the total variation is 1,082. Now, break this total variation into two components: treatment variation and random variation.
• The variation due to treatments is 992, found by squaring the difference between each treatment mean and the overall mean and then multiplying each squared difference by the number of observations in each treatment.
$4(56-58)^2 + 4(70-58)^2 + 4(48-58)^2 = 992$
• The random variation is 90, found by summing the squared differences between each value and the mean for its treatment.
$(55-56)^2 + (54-56)^2 + \dots + (48-48)^2 = 90$
• Calculate the test statistic, F
$F = \frac{992/2}{90/9} = 49.6$
This ratio is much larger than 1, so we can conclude there is a difference in the mean number of customers served by the three employees.
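The arithmetic above can be checked with a short script. This is a minimal sketch in plain Python; the function name `one_way_anova` and the variable names are illustrative choices, not part of the original slides.

```python
# Customers served per day by each employee (from the example table).
data = {
    "Wolfe": [55, 54, 59, 56],
    "White": [66, 76, 67, 71],
    "Korosa": [47, 51, 46, 48],
}


def one_way_anova(groups):
    """Return (ss_total, ss_treatment, ss_random, f_ratio) for a dict of samples."""
    values = [x for g in groups.values() for x in g]
    n_total, k = len(values), len(groups)
    grand_mean = sum(values) / n_total

    # Total variation: every observation against the overall mean.
    ss_total = sum((x - grand_mean) ** 2 for x in values)
    # Treatment variation: each group mean against the overall mean,
    # weighted by the number of observations in that group.
    ss_treatment = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    # Random variation: every observation against its own group mean.
    ss_random = sum(
        (x - sum(g) / len(g)) ** 2 for g in groups.values() for x in g
    )
    f_ratio = (ss_treatment / (k - 1)) / (ss_random / (n_total - k))
    return ss_total, ss_treatment, ss_random, f_ratio


ss_total, ss_treatment, ss_random, f_ratio = one_way_anova(data)
print(ss_total, ss_treatment, ss_random, round(f_ratio, 1))
```

Run on the table above, the script should reproduce the slide's values: total variation 1,082, treatment variation 992, random variation 90, and F = 49.6.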
ANOVA
• Between-groups variability (treatment): the degree to
which group means vary from one another within ANOVA
• How big is the difference between means?
• Expressed as overall variability, this is the sum of squares between (or treatment), SSB
• Expressed as average variability, this is the mean square between (or treatment), MSB
• Within-groups variability (error, random): the degree to which members of a group vary from one another within ANOVA
• How closely do members of groups cluster around their group’s mean?
• Expressed as overall variability, this is the sum of squares within (or error), SSW (SSE)
• Expressed as average variability, this is the mean square within (or error), MSW (MSE)
One-way ANOVA terms
• F: the ratio of between-group variability to within-group variability used in ANOVA
• k: the number of groups being compared in ANOVA
• dfB: between-groups degrees of freedom, calculated as k – 1; MSB = SSB/(k – 1)
• dfW: within-groups degrees of freedom, calculated as N – k, where N is the total number of observations; MSW = SSW/(N – k)
• SS: sum of squares; shorthand for “sum of the squared deviations”
• SST (total sum of squares) = SSB + SSW
One-way ANOVA
• Research questions for the one-way ANOVA should
ask if there are any differences between the groups
• Example: Do any of the groups have different scores than the
other groups?

• Hypotheses for one-way ANOVA use the format:
• H0: μ1 = μ2 = μ3
• H1: At least two means differ (or: not all treatment means are the same)
• One-way ANOVA is an ANOVA comparing multiple independent group means against one another
• Similar in purpose to an independent-samples t-test (a two-group comparison) but can compare any number of groups
Characteristics of the F Distribution
• There is a family of F distributions. Each time the degrees
of freedom in either the numerator or the denominator
change, a new distribution is created
• The F distribution is continuous
• The F statistic cannot be negative
• The F distribution is positively skewed
• The F distribution is asymptotic: as F increases, the curve approaches the horizontal axis but never touches it
One-way ANOVA: Critical values and decision rules
• The critical value for the one-way ANOVA depends
on alpha and the degrees of freedom for the test
• Example: If α = .05, k = 4, and N = 22, Fcrit(3, 18) = 3.16
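Critical values are normally read from an F table or statistical software. As a rough cross-check, the sketch below approximates the critical value by numerically integrating the F density (midpoint rule) and bisecting on the result. The helper names `f_pdf`, `f_cdf`, and `f_critical` are hypothetical, and the approach assumes at least 2 numerator degrees of freedom so that the density stays finite near zero.

```python
import math


def f_pdf(x, df1, df2):
    """Density of the F distribution with (df1, df2) degrees of freedom, for x > 0."""
    log_beta = (math.lgamma(df1 / 2) + math.lgamma(df2 / 2)
                - math.lgamma((df1 + df2) / 2))
    log_pdf = ((df1 / 2) * math.log(df1 / df2)
               + (df1 / 2 - 1) * math.log(x)
               - ((df1 + df2) / 2) * math.log(1 + df1 * x / df2)
               - log_beta)
    return math.exp(log_pdf)


def f_cdf(x, df1, df2, steps=4000):
    """Approximate P(F <= x) with the midpoint rule on [0, x]."""
    h = x / steps
    return h * sum(f_pdf((i + 0.5) * h, df1, df2) for i in range(steps))


def f_critical(alpha, df1, df2, upper=100.0):
    """Approximate the value cut off by an upper-tail area of alpha (bisection)."""
    lo, hi = 0.0, upper
    for _ in range(40):
        mid = (lo + hi) / 2
        if f_cdf(mid, df1, df2) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Example from the slide: alpha = .05, k = 4 groups, N = 22 observations.
k, n_total, alpha = 4, 22, 0.05
crit = f_critical(alpha, k - 1, n_total - k)
print(round(crit, 2))
```

With k = 4 and N = 22 the degrees of freedom are (3, 18), and the approximation should agree with the tabled value of 3.16 to two decimals.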
One-way ANOVA: Conducting the statistical test
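• The formula for the total sum of squares, SS total (SST in the notation above), is $SS\ total = \sum (x - \bar{x}_G)^2$
• The formula for the error sum of squares, SSE (SSW above), is $SSE = \sum (x - \bar{x}_c)^2$
• The formula for the treatment sum of squares, written SST in this notation (SSB above), is $SST = \sum n_c(\bar{x}_c - \bar{x}_G)^2 = SS\ total - SSE$
• Here $x$ is an observation, $\bar{x}_G$ is the grand mean, $\bar{x}_c$ is the mean of treatment $c$, and $n_c$ is the number of observations in treatment $c$
• This information is summarized in the ANOVA table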


Design A   Design B   Design C   Design D
55         115        86         71
71         86         108        62
72         98         66         48
62         120        37         69
67         115        90         55
-          103        -          57
Filling in the ANOVA table

SS df MS F

Between-groups 7428.024422 dfB= MSB= F=

Within-groups 4342.566487 dfW= MSW=

Total 11770.590909 dfT=


SS df MS F

Between-groups 7428.02 3 2476.01 10.26

Within-groups 4342.57 18 241.25

Total 11770.59 21
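The filled-in table can be reproduced from the raw design data. The sketch below is illustrative only. It assumes the two extra values in the last data row (103 and 57) belong to Design B and Design D, which is the assignment that reproduces the sums of squares shown in the table; the variable names are not from the slides.

```python
# Raw scores per design. ASSUMPTION: 103 belongs to Design B and 57 to Design D.
designs = {
    "A": [55, 71, 72, 62, 67],
    "B": [115, 86, 98, 120, 115, 103],
    "C": [86, 108, 66, 37, 90],
    "D": [71, 62, 48, 69, 55, 57],
}

values = [x for g in designs.values() for x in g]
n_total, k = len(values), len(designs)
grand_mean = sum(values) / n_total

ss_between = sum(
    len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in designs.values()
)
ss_total = sum((x - grand_mean) ** 2 for x in values)
ss_within = ss_total - ss_between

df_between, df_within, df_total = k - 1, n_total - k, n_total - 1
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

# Print the table in the same row order as the slide.
print("Source          SS        df   MS        F")
print(f"Between-groups  {ss_between:9.2f} {df_between:3d}  {ms_between:8.2f}  {f_ratio:.2f}")
print(f"Within-groups   {ss_within:9.2f} {df_within:3d}  {ms_within:8.2f}")
print(f"Total           {ss_total:9.2f} {df_total:3d}")
```

Under that assumption, the output should match the completed table: SSB ≈ 7428.02 with df 3, SSW ≈ 4342.57 with df 18, and F ≈ 10.26.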
One-way ANOVA: Formally stating the results
• We formally state the results of a one-way ANOVA using the format:
• Using Excel/SPSS: F(dfB, dfW) = F value, p = exact p-value, or p stated relative to α (p < α or p > α)
• Example: F(2, 12) = .46, p = .64 or p > .05
One-way ANOVA: Conducting supplemental analyses
• If we found statistical significance, compute an effect size and a post-hoc test.
• If we did not find statistical significance, no further analyses are needed.
• η²: a common measure of overall effect size for ANOVA
• Calculated as SSB / SST
• The most common post-hoc tests: Least Significant Difference (LSD), Tukey’s Honestly Significant Difference (HSD), and Scheffé’s test (conservative: harder to obtain a significant result). Each compares every pair of groups; with three groups, the pairs are 1-2, 1-3, and 2-3.
• To calculate a post hoc test:
• Compute the difference between each pair of means
• Compute critical mean differences for each of those pairs
• Compare the absolute value of each mean difference to the critical mean difference
• If the mean difference is larger, the difference is statistically significant.
• If the mean difference is smaller, the difference is not statistically significant.
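• To calculate critical mean differences, use the formula (the Scheffé critical difference): $xD_{crit} = \sqrt{(k-1)\,F_{crit}\,MSW\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}$, where $n_1$ and $n_2$ are the sizes of the two groups being compared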
One-way ANOVA: Drawing conclusions
• For the one-way ANOVA, your conclusions should
include:
• A formal statement about retaining the null or rejecting the null
and accepting the alternative.
• A formal statement about the statistical significance of the
finding.
• A sentence interpreting the results in terms of the research
question.
• Interpretation of any supplemental analyses.
Hypothesis testing using the one-way ANOVA: Example
• Acme Co. wants to know if there are differences in the
perceived effectiveness of different types of training.
They randomly assign 15 employees into 3 groups: in
person training, online training, and blended training.
After the training, employees report how effective they
think the training is on a 1–5 scale (1 = not at all
effective, 5 = very effective). Data is provided below:
In Person Online Blended
1 1 4
4 3 3
3 3 2
2 3 2
3 3 4
• RQ: Are there differences in the perceived
effectiveness of different types of training?
• Hypotheses:
‒ H0: μ1 = μ2 = μ3
‒ H1: At least two means differ

• α = .05
• Fcrit(2, 12) = 3.89
• $SS_{Between} = 5(2.6-2.73333)^2 + 5(2.6-2.73333)^2 + 5(3-2.73333)^2 = .533333$
• $SS_{Total} = 125 - \frac{41^2}{15} = 12.933333$
• $SS_{Within} = 12.933333 - .533333 = 12.40$
Source    SS          df     MS        F
Between   0.533333    k-1    SSB/dfB   MSB/MSW
Within    12.40       N-k    SSW/dfW
Total     12.933333   N-1

Source    SS          df            MS                       F
Between   0.533333    3 - 1 = 2     .533333/2 = 0.2666665    .2666665/1.033333 = 0.258064
Within    12.40       15 - 3 = 12   12.40/12 = 1.033333
Total     12.933333   15 - 1 = 14

F(2, 12) = .26, p > .05


• Retain the null. The difference was not statistically significant: F(2, 12) = .26 is below the critical value of 3.89. There is no evidence of differences in the perceived effectiveness of different types of training.
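The Acme example can be replayed in a few lines. This sketch uses the same computational shortcut as the slide for the total sum of squares (sum of squared scores minus the squared sum divided by N) and takes the critical value 3.89 from the slide rather than computing it. Variable names are illustrative.

```python
# Perceived effectiveness ratings (1-5) for each training type.
training = {
    "In Person": [1, 4, 3, 2, 3],
    "Online": [1, 3, 3, 3, 3],
    "Blended": [4, 3, 2, 2, 4],
}

scores = [x for g in training.values() for x in g]
n_total, k = len(scores), len(training)
grand_mean = sum(scores) / n_total

# Shortcut from the slide: sum of x^2 minus (sum of x)^2 / N.
ss_total = sum(x * x for x in scores) - sum(scores) ** 2 / n_total
ss_between = sum(
    len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in training.values()
)
ss_within = ss_total - ss_between

ms_between = ss_between / (k - 1)
ms_within = ss_within / (n_total - k)
f_ratio = ms_between / ms_within

f_crit = 3.89  # Fcrit(2, 12) at alpha = .05, taken from the slide
decision = "reject H0" if f_ratio > f_crit else "retain H0"
print(f"F({k - 1}, {n_total - k}) = {f_ratio:.2f}: {decision}")
```

The script should arrive at the same sums of squares as the table (0.533333, 12.40, 12.933333) and the same decision to retain the null.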

Learning check
• Beta Inc. wants to compare the number of sales made by
employees in three teams of employees. The company
wants to know whether there are differences in sales
between the three teams. Data is provided below:

Team 1 Team 2 Team 3


45 36 53
45 28 65
39 48 58
• RQ: Are there differences in the number of sales between
the three teams?
• Hypotheses:
‒ H0: μ1 = μ2 = μ3
‒ H1: At least two means differ

• α = .05
• Fcrit ( , ) =

• SSBetween =
• SSTotal =
• SSWithin =
Learning check
• Fcrit(2, 6) = 5.14
• $SSB = 3(43-46.333333)^2 + 3(37.333333-46.333333)^2 + 3(58.666667-46.333333)^2 = 732.666709$
• $SST = 20353 - \frac{417^2}{9} = 1032$
• $SSW = 1032 - 732.666709 = 299.333291$
Source    SS           df     MS        F
Between   732.666709   k-1    SSB/dfB   MSB/MSW
Within    299.333291   N-k    SSW/dfW
Total     1032         N-1

Source    SS           df          MS                          F
Between   732.666709   3 - 1 = 2   732.666709/2 = 366.333355   366.333355/49.888882 = 7.342956
Within    299.333291   9 - 3 = 6   299.333291/6 = 49.888882
Total     1032         9 - 1 = 8

F(2, 6) = 7.34, p < .05


• η² = SSB/SST = 732.666709/1032 = .709948
• $xD_{crit} = \sqrt{(3-1)(5.14)(49.888882)\left(\frac{1}{3}+\frac{1}{3}\right)} = 18.4906477$
• MeanTeam1 – MeanTeam2 = 43 – 37.333333 = 5.666667
• MeanTeam1 – MeanTeam3 = 43 – 58.666667 = -15.666667
• MeanTeam2 – MeanTeam3 = 37.333333 – 58.666667 = -21.333333*
• * The absolute difference exceeds the critical mean difference, so this pair differs significantly.
• Reject the null and accept the alternative. The difference
was statistically significant. There are differences in the
number of sales between the three teams. There is a
significant difference between the number of sales for
team 2 and team 3. 71% of the variance in number of
sales can be explained by the differences between the
teams.
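The whole learning check, including the effect size and the post-hoc comparison, can be traced with the sketch below. It takes Fcrit(2, 6) = 5.14 from the slide and uses the critical mean difference in the form shown in the calculation above (the Scheffé form). The variable names are illustrative.

```python
import math

# Number of sales per employee in each team.
teams = {
    "Team 1": [45, 45, 39],
    "Team 2": [36, 28, 48],
    "Team 3": [53, 65, 58],
}

sales = [x for g in teams.values() for x in g]
n_total, k = len(sales), len(teams)
grand_mean = sum(sales) / n_total
means = {name: sum(g) / len(g) for name, g in teams.items()}

ss_between = sum(len(g) * (means[name] - grand_mean) ** 2 for name, g in teams.items())
ss_total = sum(x * x for x in sales) - sum(sales) ** 2 / n_total
ss_within = ss_total - ss_between
ms_within = ss_within / (n_total - k)
f_ratio = (ss_between / (k - 1)) / ms_within

eta_squared = ss_between / ss_total  # proportion of variance explained

f_crit = 5.14  # Fcrit(2, 6) at alpha = .05, taken from the slide
names = list(teams)
significant_pairs = []
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        a, b = names[i], names[j]
        # Critical mean difference for this pair of group sizes.
        d_crit = math.sqrt(
            (k - 1) * f_crit * ms_within * (1 / len(teams[a]) + 1 / len(teams[b]))
        )
        diff = means[a] - means[b]
        if abs(diff) > d_crit:
            significant_pairs.append((a, b))
        print(f"{a} vs {b}: diff = {diff:.2f}, critical = {d_crit:.2f}")

print(f"F = {f_ratio:.2f}, eta squared = {eta_squared:.2f}, significant: {significant_pairs}")
```

Only the Team 2 versus Team 3 difference should exceed the critical mean difference of about 18.49, which matches the conclusion stated above.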
A Two-Way ANOVA
• Example: study time (amount studied) and difficulty. When difficulty is low, scores fall as study time increases; when difficulty is high, scores rise as study time increases.
• In a two-way ANOVA, we consider a second treatment variable
• This reduces the amount of error variance
• The second treatment variable is called the blocking variable
• Notation: SSB1 is the sum of squares for the first treatment variable and SSB2 (SSBlock) is the sum of squares for the blocking variable
• SST = SSB1 + SSB2 (blocking) + SSW
• The SSE (SSW) term, or sum of squares error, is found by subtracting the between-groups sums of squares from the total:
• SSW = SST – SSB (one IV: one-way ANOVA)
• SSW = SST – SSB1 – SSB2 (two IVs: two-way ANOVA)
ANOVA Test Example
WARTA, the Warren Area Regional Transit Authority, is expanding bus service
from the suburb of Starbrick to the business district of Warren. There are four
routes being considered, U.S. 6, West End, Hickory St., and Rte. 59. WARTA
conducted tests to determine whether there is a difference in the mean travel
times along the four routes; each driver drove each route. See the travel times in
minutes for each driver-route combination below.

At the .05 significance level, is there a difference in the mean travel time along
the four routes? If we remove the effects of the drivers, is there a difference in
the mean travel time?
ANOVA Test Example Continued
Step 1: State the null and alternate hypotheses
H0: μ1 = μ2 = μ3 = μ4
H1: Not all treatment means are the same
Step 2: Select the level of significance; we decide to use .05
Step 3: Select the test statistic; we use F
Step 4: State the decision rule: reject H0 if F > 3.24
Step 5: Make a decision: F = 2.483, so we do not reject the null hypothesis
Step 6: Interpret: there is no reason to conclude that any one of the routes is faster than any other.
The Blocking Variable
• In the WARTA example, we only considered the variation
due to routes and took all other variables to be random
• Now, we’ll include the variance due to the drivers by
letting the drivers be the blocking variable (Route, Driver)

BLOCKING VARIABLE A second treatment variable that, when included in the ANOVA analysis, will have the effect of reducing the SSE (SSW) term.

• Doing so requires that we calculate SSB, the sum of squares due to blocks
Two-Way Analysis of Variance
Including the variance of the drivers, here is a table of the drivers' respective means with an overall mean of 22.8 minutes.

Substituting this information in formula 12-6, we determine SSB is 119.7

$SSB = k\sum(\bar{x}_b - \bar{x}_G)^2$
$= 4(19.5-22.8)^2 + 4(21.0-22.8)^2 + 4(22.5-22.8)^2 + 4(24.75-22.8)^2 + 4(26.25-22.8)^2 = 119.7$
Then use the formula to find SSE (SSW). Here SS total is the total sum of squares, SST is the treatment (route) sum of squares, and SSB is the block (driver) sum of squares.
SSE (SSW) = SS total – SST – SSB = 229.2 – 72.8 – 119.7 = 36.7
A Second Treatment Variable Continued
• Determine the F statistics for the treatment variable and
the blocking variable from the following ANOVA table
Hypothesis Test of Equal Block Means
Step 1: State the null and alternate hypotheses
H0: The treatment means are equal (μ1 = μ2 = μ3 = μ4)
H1: At least one treatment mean is different
H0: The block means are equal (μD = μS = μO = μZ = μF)
H1: At least one block mean is different
Step 2: Select the level of significance; we'll use .05
Step 3: Select the test statistic; we use F
Step 4: State the decision rule for the first set of hypotheses: reject H0 if F > 3.49
Step 5: Make a decision: the computed F ratio is 7.93, so we reject the null hypothesis that all treatment means are equal
$F = \frac{MST}{MSE} = \frac{24.27}{3.06} = 7.93$
Step 6: Interpret: we conclude that at least one route's mean travel time is different from the other routes

Next, we test whether the mean travel times for the various drivers are equal.

Critical values: the one-way test uses F(k-1, n-k); the two-way test for treatments uses F(k-1, (k-1)(b-1)), where b is the number of blocks. Here that is F(4-1, (4-1)(5-1)) = F(3, 12).
Hypothesis Test of Equal Block Means Continued
State the decision rule for the second set of hypotheses: reject H0 if F > 3.26, the critical value of F(4, 12)
Make a decision: the computed F ratio is 9.78, so we reject the null hypothesis
$F = \frac{MSB}{MSE} = \frac{29.93}{3.06} = 9.78$
Interpret: we conclude at least one driver’s mean travel time is different from the others. WARTA management can conclude, based on the sample results, that there is a difference in the mean travel times of drivers.
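The effect of blocking can be seen by computing the F ratios with and without the driver term. Because the raw travel times are not reproduced in this text, the sketch below starts from the summary figures quoted above (SS total = 229.2, SST = 72.8, the five driver means, and the overall mean of 22.8). The variable names are illustrative.

```python
k_routes, b_drivers = 4, 5            # treatments and blocks
n_total = k_routes * b_drivers        # each driver drove each route once

ss_total = 229.2                      # total sum of squares (from the slides)
ss_treatment = 72.8                   # route sum of squares, SST (from the slides)
driver_means = [19.5, 21.0, 22.5, 24.75, 26.25]
grand_mean = 22.8

# Block sum of squares: SSB = k * sum((block mean - grand mean)^2).
ss_blocks = k_routes * sum((m - grand_mean) ** 2 for m in driver_means)

ms_treatment = ss_treatment / (k_routes - 1)

# One-way analysis: driver variation stays inside the error term.
ms_error_one_way = (ss_total - ss_treatment) / (n_total - k_routes)
f_one_way = ms_treatment / ms_error_one_way

# Two-way analysis: driver variation is removed from the error term.
ss_error = ss_total - ss_treatment - ss_blocks
ms_error = ss_error / ((k_routes - 1) * (b_drivers - 1))
f_routes = ms_treatment / ms_error
f_drivers = (ss_blocks / (b_drivers - 1)) / ms_error

print(f"one-way F for routes: {f_one_way:.3f}")
print(f"two-way F for routes: {f_routes:.2f}, for drivers: {f_drivers:.2f}")
```

Removing the driver variation shrinks the error sum of squares from 156.4 to 36.7, which is why the route F ratio rises from about 2.48 (not significant against 3.24) to about 7.93 (significant against 3.49).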
Excel has a two-factor ANOVA procedure. The output for the WARTA example just completed is shown.
Interaction Plot
• Interaction: the effect of IV1 on the DV changes depending on IV2.
• The IV1-DV relationship may be significant at level A of IV2 (levels A, B, C) but not significant at B and C.
• Or the IV1-DV relationship may be significant at A, B, and C but take a different form at each level.
• For example, when IV2 = A the IV1-DV relationship is positive and significant, at B negative and significant, and at C positive and significant (the slopes can differ).

• An interaction plot illustrates the interaction of the two factors, route and driver; travel time is the response variable
INTERACTION The effect of one factor on a response variable differs depending on the value of another factor.

Routes
Drivers    U.S. 6   West End   Hickory   Rte. 59
Deans      18       17         21        22
Snaverly   16       22.33      23        22
Ormson     18       23         26        22
Zollaco    23       22         29        23.67
Filbeck    25       24         28        28
Hypothesis Tests for Interaction
• The next step is to investigate the interaction effects
• Is there an interaction between drivers and routes?
• Are the mean travel times for drivers the same?
• Are the mean travel times for the routes the same?
• Test three sets of hypotheses
• H0: There is no interaction between drivers and routes
• H1: There is interaction between drivers and routes
• H0: The driver means are equal
• H1: At least one driver travel time mean is different
• H0: The route means are equal
• H1: At least one route travel time mean is different
ANOVA Table including Interactions
• The complete ANOVA table including interactions
A One-Way ANOVA to Test a Hypothesis
• We will continue the analysis by conducting a one-way ANOVA for each route, testing the hypothesis H0: Driver times are equal (for each route, do travel times differ among drivers?)

The results show there are significant differences in the mean travel times among the drivers for every route except Route 59, which has a p-value of .06.
