
Analysis of Variance-1

The document discusses analysis of variance (ANOVA) techniques, including one-way and two-way ANOVA. It covers experimental design, assumptions of ANOVA, partitioning variation, and calculating sums of squares for total, among-group, and within-group variation. Hypotheses for one-way ANOVA are also presented.

Uploaded by

Varun Bhayana
Copyright: © All Rights Reserved

Analysis of Variance

Proprietary content. ©Great Learning. All Rights Reserved. Unauthorized use or distribution prohibited.
Learning Objectives
In this chapter, you learn:
• The basic concepts of experimental design
• How to use one-way analysis of variance to test for differences among
the means of several groups
• How to use two-way analysis of variance and interpret the interaction
effect
• How to perform multiple comparisons in a one-way analysis of variance
and a two-way analysis of variance

General ANOVA Setting
• Investigator controls one or more factors of interest
• Each factor contains two or more levels
• Levels can be numerical or categorical
• Different levels produce different groups
• Think of each group as a sample from a different
population
• Observe effects on the dependent variable
• Are the groups the same?
• Experimental design: the plan used to collect the data

Completely Randomized Design

• Experimental units (subjects) are assigned randomly to groups


• Subjects are assumed homogeneous
• Only one factor or independent variable
• With two or more levels
• Analyzed by one-factor analysis of variance (ANOVA)

One-Way Analysis of Variance
• Evaluate the difference among the means of three or
more groups
Examples: Number of accidents for 1st, 2nd, and 3rd shift
Expected mileage for five brands of tires

• Assumptions
• Populations are normally distributed
• Populations have equal variances
• Samples are randomly and independently drawn

Hypotheses of One-Way ANOVA

• H0: μ1 = μ2 = μ3 = ⋯ = μc
• All population means are equal
• i.e., no factor effect (no variation in means among groups)

• H1: Not all of the population means are equal
• At least one population mean is different
• i.e., there is a factor effect
• Does not mean that all population means are different
(some pairs may be the same)

One-Way ANOVA
H0: μ1 = μ2 = μ3 = ⋯ = μc
H1: Not all μj are equal
The Null Hypothesis is True
All Means are the same:
(No Factor Effect)

μ1 = μ2 = μ3

One-Way ANOVA
(continued)
H0: μ1 = μ2 = μ3 = ⋯ = μc
H1: Not all μj are equal
The Null Hypothesis is NOT true
At least one of the means is different
(Factor Effect is present)

μ1 = μ2 ≠ μ3 or μ1 ≠ μ2 ≠ μ3

Partitioning the Variation
• Total variation can be split into two parts:

SST = SSA + SSW

SST = Total Sum of Squares (Total variation)
SSA = Sum of Squares Among Groups (Among-group variation)
SSW = Sum of Squares Within Groups (Within-group variation)

Partitioning the Variation

SST = SSA + SSW

Total Variation = the aggregate variation of the individual data values across the various factor levels (SST)

Among-Group Variation = variation among the factor sample means (SSA)

Within-Group Variation = variation that exists among the data values within a particular factor level (SSW)

Partition of Total Variation
Total Variation (SST) = Variation Due to Factor (SSA) + Variation Due to Random Error (SSW)

Total Sum of Squares
SST = SSA + SSW

$$SST = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2$$

Where:
SST = Total sum of squares
c = number of groups or levels
$n_j$ = number of observations in group j
$X_{ij}$ = ith observation from group j
$\bar{X}$ = grand mean (mean of all data values)

Total Variation
(continued)

$$SST = (X_{11} - \bar{X})^2 + (X_{12} - \bar{X})^2 + \cdots + (X_{n_c c} - \bar{X})^2$$

[Figure: Response, X plotted for Group 1, Group 2 and Group 3]

Among-Group Variation
SST = SSA + SSW

$$SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{X})^2$$

Where:
SSA = Sum of squares among groups
c = number of groups
$n_j$ = sample size from group j
$\bar{X}_j$ = sample mean from group j
$\bar{X}$ = grand mean (mean of all data values)

Among-Group Variation
(continued)

$$SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{X})^2$$

Variation Due to Differences Among Groups

$$MSA = \frac{SSA}{c - 1}$$

Mean Square Among = SSA/degrees of freedom

Among-Group Variation
(continued)

$$SSA = n_1 (\bar{X}_1 - \bar{X})^2 + n_2 (\bar{X}_2 - \bar{X})^2 + \cdots + n_c (\bar{X}_c - \bar{X})^2$$

[Figure: Response, X plotted for Group 1, Group 2 and Group 3, marking the group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ and the grand mean $\bar{X}$]

Within-Group Variation
SST = SSA + SSW

$$SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$$

Where:
SSW = Sum of squares within groups
c = number of groups
$n_j$ = sample size from group j
$\bar{X}_j$ = sample mean from group j
$X_{ij}$ = ith observation in group j

Within-Group Variation
(continued)

$$SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$$

Summing the variation within each group and then adding over all groups

$$MSW = \frac{SSW}{n - c}$$

Mean Square Within = SSW/degrees of freedom

Within-Group Variation
(continued)

$$SSW = (X_{11} - \bar{X}_1)^2 + (X_{12} - \bar{X}_2)^2 + \cdots + (X_{n_c c} - \bar{X}_c)^2$$

[Figure: Response, X plotted for Group 1, Group 2 and Group 3, marking the group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$]

Obtaining the Mean Squares
The Mean Squares are obtained by dividing the various
sums of squares by their associated degrees of freedom

Mean Square Among (d.f. = c − 1):
$$MSA = \frac{SSA}{c - 1}$$

Mean Square Within (d.f. = n − c):
$$MSW = \frac{SSW}{n - c}$$

Mean Square Total (d.f. = n − 1):
$$MST = \frac{SST}{n - 1}$$

One-Way ANOVA Table

Source of Variation   Degrees of Freedom   Sum of Squares   Mean Square (Variance)   F
Among Groups          c − 1                SSA              MSA = SSA/(c − 1)        FSTAT = MSA/MSW
Within Groups         n − c                SSW              MSW = SSW/(n − c)
Total                 n − 1                SST

c = number of groups
n = sum of the sample sizes from all groups
df = degrees of freedom
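
The entries of the ANOVA table above can be computed directly from the raw samples. The following is a minimal sketch, not part of the original slides: the function name `one_way_anova`, the dictionary keys and the data values are illustrative assumptions, and the groups are deliberately of unequal size to show the role of the weights n_j in SSA.

```python
from statistics import fmean


def one_way_anova(groups):
    """Return the one-way ANOVA table entries for a list of samples."""
    c = len(groups)                          # number of groups
    n = sum(len(g) for g in groups)          # total number of observations
    grand_mean = fmean(x for g in groups for x in g)
    means = [fmean(g) for g in groups]

    # Among-group, within-group and total sums of squares
    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    sst = sum((x - grand_mean) ** 2 for g in groups for x in g)

    msa = ssa / (c - 1)
    msw = ssw / (n - c)
    return {
        "SSA": ssa, "SSW": ssw, "SST": sst,
        "df_among": c - 1, "df_within": n - c, "df_total": n - 1,
        "MSA": msa, "MSW": msw, "F": msa / msw,
    }


# Hypothetical data with unequal group sizes (not from the slides)
example_groups = [[4.0, 6.0, 8.0], [9.0, 11.0], [1.0, 2.0, 3.0, 6.0]]
table = one_way_anova(example_groups)
for name, value in table.items():
    text = f"{value:.4f}" if isinstance(value, float) else str(value)
    print(f"{name:>10}: {text}")
```

In this sketch SST is computed independently rather than as SSA + SSW, so comparing the two gives a quick check of the partition.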

One-Way ANOVA
F Test Statistic
H0: μ1 = μ2 = … = μc
H1: At least two population means are different

• Test statistic
$$F_{STAT} = \frac{MSA}{MSW}$$
MSA is mean squares among groups
MSW is mean squares within groups

• Degrees of freedom
• df1 = c – 1 (c = number of groups)
• df2 = n – c (n = sum of sample sizes from all populations)

Interpreting One-Way ANOVA
F Statistic
• The F statistic is the ratio of the among estimate
of variance and the within estimate of variance
• The ratio can never be negative
• df1 = c – 1 will typically be small
• df2 = n – c will typically be large

Decision Rule:
• Reject H0 if FSTAT > Fα; otherwise do not reject H0

[Figure: F distribution starting at 0; the region below Fα is labeled "Do not reject H0" and the upper tail of area α beyond Fα is labeled "Reject H0"]

One-Way ANOVA
F Test Example

You want to see if three different golf clubs yield different distances. You randomly select five measurements from trials on an automated driving machine for each club. At the 0.05 significance level, is there a difference in mean distance?

Club 1   Club 2   Club 3
254      234      200
263      218      222
241      235      197
237      227      206
251      216      204

One-Way ANOVA Example: Scatter Plot

[Figure: scatter plot of Distance (vertical axis, 190 to 270) against Club (1, 2, 3), marking the group means and the grand mean]

$\bar{X}_1 = 249.2$, $\bar{X}_2 = 226.0$, $\bar{X}_3 = 205.8$
$\bar{X} = 227.0$

One-Way ANOVA Example
Computations

Club 1   Club 2   Club 3
254      234      200
263      218      222
241      235      197
237      227      206
251      216      204

$\bar{X}_1 = 249.2$, $n_1 = 5$
$\bar{X}_2 = 226.0$, $n_2 = 5$
$\bar{X}_3 = 205.8$, $n_3 = 5$
$\bar{X} = 227.0$, n = 15, c = 3

SSA = 5 (249.2 – 227)² + 5 (226 – 227)² + 5 (205.8 – 227)² = 4716.4
SSW = (254 – 249.2)² + (263 – 249.2)² + … + (204 – 205.8)² = 1119.6

MSA = 4716.4 / (3 – 1) = 2358.2
MSW = 1119.6 / (15 – 3) = 93.3

FSTAT = 2358.2 / 93.3 = 25.275

One-Way ANOVA Example Solution

H0: μ1 = μ2 = μ3
H1: μj not all equal
α = 0.05
df1 = 2, df2 = 12

Critical Value:
Fα = 3.89

Test Statistic:
FSTAT = MSA / MSW = 2358.2 / 93.3 = 25.275

Decision:
Reject H0 at α = 0.05

Conclusion:
There is evidence that at least one μj differs from the rest

[Figure: F distribution with α = .05 in the upper tail; FSTAT = 25.275 falls in the "Reject H0" region to the right of Fα = 3.89]
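
The critical value and the decision above can be checked numerically. In the special case df1 = 2 (three groups, as in this example) the upper-tail probability of the F distribution has a closed form, P(F > f) = (1 + 2f/df2)^(−df2/2). The sketch below relies on that special case only; the function names are illustrative and not from the slides.

```python
def f_upper_tail_df1_2(f, df2):
    """P(F > f) for an F distribution with df1 = 2 and df2 degrees of freedom."""
    return (1.0 + 2.0 * f / df2) ** (-df2 / 2.0)


def f_critical_df1_2(alpha, df2):
    """Value with upper-tail area alpha; valid only when df1 = 2."""
    return (df2 / 2.0) * (alpha ** (-2.0 / df2) - 1.0)


# Values taken from the golf club example
f_stat, df2, alpha = 25.275, 12, 0.05

f_crit = f_critical_df1_2(alpha, df2)
p_value = f_upper_tail_df1_2(f_stat, df2)

print(f"F critical = {f_crit:.2f}")
print(f"p-value    = {p_value:.6f}")
print("Reject H0" if f_stat > f_crit else "Do not reject H0")
```

For other values of df1 this closed form does not apply, and an F table or a statistics library would be needed instead.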

One-Way ANOVA
Excel Output

SUMMARY
Groups   Count   Sum    Average   Variance
Club 1   5       1246   249.2     108.2
Club 2   5       1130   226       77.5
Club 3   5       1029   205.8     94.2

ANOVA
Source of Variation   SS       df   MS       F        P-value   F crit
Between Groups        4716.4   2    2358.2   25.275   0.0000    3.89
Within Groups         1119.6   12   93.3
Total                 5836.0   14
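
The ANOVA block of the Excel output can be rebuilt from the SUMMARY block alone, because each sample variance equals that group's within-group sum of squares divided by n_j − 1. A minimal sketch under that observation; the variable names are illustrative, and the numbers are copied from the SUMMARY block.

```python
# (count, average, variance) per group, copied from the SUMMARY block
summary = {
    "Club 1": (5, 249.2, 108.2),
    "Club 2": (5, 226.0, 77.5),
    "Club 3": (5, 205.8, 94.2),
}

n = sum(count for count, _, _ in summary.values())
c = len(summary)
grand_mean = sum(count * mean for count, mean, _ in summary.values()) / n

# SSW pools the sample variances: variance_j = SS_j / (n_j - 1)
ssw = sum((count - 1) * var for count, _, var in summary.values())
# SSA weights each squared deviation of a group mean by its sample size
ssa = sum(count * (mean - grand_mean) ** 2 for count, mean, _ in summary.values())

msa = ssa / (c - 1)
msw = ssw / (n - c)
f_stat = msa / msw

print(f"SSA = {ssa:.1f}, SSW = {ssw:.1f}, SST = {ssa + ssw:.1f}")
print(f"MSA = {msa:.1f}, MSW = {msw:.1f}, F = {f_stat:.3f}")
```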

ANOVA Assumptions
• Randomness and Independence
• Select random samples from the c groups (or randomly
assign the levels)
• Normality
• The sample values for each group are from a normal
population
• Homogeneity of Variance
• All populations sampled from have the same variance
• Can be tested with Levene’s Test
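
The slides name Levene's Test without giving its formula. As a rough sketch, assuming the mean-centered form of the statistic, Levene's W is the one-way ANOVA F statistic computed on the absolute deviations of each observation from its own group mean. The function name is illustrative, and the data are the golf club distances from the earlier example.

```python
from statistics import fmean


def levene_statistic(groups):
    """Levene's W (mean-centered form, an assumption of this sketch):
    the one-way ANOVA F statistic on absolute deviations from group means."""
    z = [[abs(x - fmean(g)) for x in g] for g in groups]
    k = len(z)                               # number of groups
    n = sum(len(g) for g in z)               # total number of observations
    z_bar = fmean(v for g in z for v in g)
    z_means = [fmean(g) for g in z]
    among = sum(len(g) * (m - z_bar) ** 2 for g, m in zip(z, z_means))
    within = sum((v - m) ** 2 for g, m in zip(z, z_means) for v in g)
    return (among / (k - 1)) / (within / (n - k))


clubs = [
    [254, 263, 241, 237, 251],
    [234, 218, 235, 227, 216],
    [200, 222, 197, 206, 204],
]
w = levene_statistic(clubs)
print(f"Levene W = {w:.3f} on ({len(clubs) - 1}, {15 - len(clubs)}) degrees of freedom")
```

W is compared with an F critical value on (c − 1, n − c) degrees of freedom; a W below that value gives no evidence against equal variances. A common variant centers on the group median instead of the mean.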

Factorial Design:
Two-Way ANOVA

• Examines the effect of


• Two factors of interest on the dependent variable
• e.g., Percent carbonation and line speed on soft drink
bottling process
• Interaction between the different levels of these
two factors
• e.g., Does the effect of one particular carbonation level
depend on which level the line speed is set?

Two-Way ANOVA
(continued)

• Assumptions

• Populations are normally distributed


• Populations have equal variances
• Independent random samples are drawn

Two-Way ANOVA
Sources of Variation
Two Factors of interest: A and B
r = number of levels of factor A
c = number of levels of factor B
n’ = number of replications for each cell
n = total number of observations in all cells
n = (r)(c)(n’)
Xijk = value of the kth observation of level i of
factor A and level j of factor B

Two-Way ANOVA
Sources of Variation (continued)

SST = SSA + SSB + SSAB + SSE

Source of Variation                            Sum of Squares   Degrees of Freedom
Factor A Variation                             SSA              r – 1
Factor B Variation                             SSB              c – 1
Variation due to interaction between A and B   SSAB             (r – 1)(c – 1)
Random variation (Error)                       SSE              rc(n’ – 1)
Total Variation                                SST              n – 1

Two-Way ANOVA Equations

Total Variation:
$$SST = \sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{n'} (X_{ijk} - \bar{X})^2$$

Factor A Variation:
$$SSA = c n' \sum_{i=1}^{r} (\bar{X}_{i..} - \bar{X})^2$$

Factor B Variation:
$$SSB = r n' \sum_{j=1}^{c} (\bar{X}_{.j.} - \bar{X})^2$$

Two-Way ANOVA Equations
(continued)

Interaction Variation:
$$SSAB = n' \sum_{i=1}^{r} \sum_{j=1}^{c} (\bar{X}_{ij.} - \bar{X}_{i..} - \bar{X}_{.j.} + \bar{X})^2$$

Sum of Squares Error:
$$SSE = \sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{n'} (X_{ijk} - \bar{X}_{ij.})^2$$

Two-Way ANOVA Equations
(continued)

where:

Grand Mean:
$$\bar{X} = \frac{\sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{n'} X_{ijk}}{rcn'}$$

Mean of ith level of factor A (i = 1, 2, ..., r):
$$\bar{X}_{i..} = \frac{\sum_{j=1}^{c} \sum_{k=1}^{n'} X_{ijk}}{cn'}$$

Mean of jth level of factor B (j = 1, 2, ..., c):
$$\bar{X}_{.j.} = \frac{\sum_{i=1}^{r} \sum_{k=1}^{n'} X_{ijk}}{rn'}$$

Mean of cell ij:
$$\bar{X}_{ij.} = \sum_{k=1}^{n'} \frac{X_{ijk}}{n'}$$

r = number of levels of factor A
c = number of levels of factor B
n’ = number of replications in each cell

Mean Square Calculations

$$MSA = \text{Mean square factor A} = \frac{SSA}{r - 1}$$

$$MSB = \text{Mean square factor B} = \frac{SSB}{c - 1}$$

$$MSAB = \text{Mean square interaction} = \frac{SSAB}{(r - 1)(c - 1)}$$

$$MSE = \text{Mean square error} = \frac{SSE}{rc(n' - 1)}$$

Two-Way ANOVA:
The F Test Statistics

F Test for Factor A Effect
H0: μ1.. = μ2.. = μ3.. = ⋯ = μr..
H1: Not all μi.. are equal
FSTAT = MSA / MSE
Reject H0 if FSTAT > Fα

F Test for Factor B Effect
H0: μ.1. = μ.2. = μ.3. = ⋯ = μ.c.
H1: Not all μ.j. are equal
FSTAT = MSB / MSE
Reject H0 if FSTAT > Fα

F Test for Interaction Effect
H0: the interaction of A and B is equal to zero
H1: the interaction of A and B is not zero
FSTAT = MSAB / MSE
Reject H0 if FSTAT > Fα

Two-Way ANOVA
Summary Table

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Squares                       F
Factor A              SSA              r – 1                MSA = SSA / (r – 1)                MSA / MSE
Factor B              SSB              c – 1                MSB = SSB / (c – 1)                MSB / MSE
AB (Interaction)      SSAB             (r – 1)(c – 1)       MSAB = SSAB / [(r – 1)(c – 1)]     MSAB / MSE
Error                 SSE              rc(n’ – 1)           MSE = SSE / [rc(n’ – 1)]
Total                 SST              n – 1
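
The two-way quantities in the summary table can be computed from a balanced data set as follows. This is a minimal sketch with hypothetical numbers (r = 2, c = 3, n’ = 2); the variable names and the nested-list layout `data[i][j]` are illustrative assumptions, not from the slides.

```python
from statistics import fmean

# Hypothetical balanced data: data[i][j] holds the n' replications for
# level i of factor A and level j of factor B (r = 2, c = 3, n' = 2)
data = [
    [[10.0, 12.0], [14.0, 16.0], [20.0, 22.0]],
    [[11.0, 13.0], [19.0, 21.0], [18.0, 20.0]],
]
r, c, n_rep = len(data), len(data[0]), len(data[0][0])

grand = fmean(x for row in data for cell in row for x in cell)
a_means = [fmean(x for cell in row for x in cell) for row in data]
b_means = [fmean(x for row in data for x in row[j]) for j in range(c)]
cell_means = [[fmean(cell) for cell in row] for row in data]

sst = sum((x - grand) ** 2 for row in data for cell in row for x in cell)
ssa = c * n_rep * sum((m - grand) ** 2 for m in a_means)
ssb = r * n_rep * sum((m - grand) ** 2 for m in b_means)
ssab = n_rep * sum(
    (cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
    for i in range(r) for j in range(c)
)
sse = sum(
    (x - cell_means[i][j]) ** 2
    for i in range(r) for j in range(c) for x in data[i][j]
)

# Every F statistic shares the same denominator, MSE
mse = sse / (r * c * (n_rep - 1))
f_a = (ssa / (r - 1)) / mse
f_b = (ssb / (c - 1)) / mse
f_ab = (ssab / ((r - 1) * (c - 1))) / mse

print(f"SST = {sst:.3f}, SSA + SSB + SSAB + SSE = {ssa + ssb + ssab + sse:.3f}")
print(f"F(A) = {f_a:.3f}, F(B) = {f_b:.3f}, F(AB) = {f_ab:.3f}")
```

The formulas with the multipliers cn’, rn’ and n’ assume the same number of replications in every cell, so the sketch should not be applied to unbalanced data.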

Features of Two-Way ANOVA
F Test
• Degrees of freedom always add up
• n-1 = rc(n’-1) + (r-1) + (c-1) + (r-1)(c-1)
• Total = error + factor A + factor B + interaction

• The denominators of the F Test are always the same but the numerators are different
• The sums of squares always add up
• SST = SSE + SSA + SSB + SSAB
• Total = error + factor A + factor B + interaction
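
The degrees-of-freedom identity listed above can be confirmed by expanding the right-hand side and using n = rcn’. A short algebra sketch (not part of the original slides):

```latex
\begin{aligned}
rc(n'-1) + (r-1) + (c-1) + (r-1)(c-1)
  &= rcn' - rc + r - 1 + c - 1 + rc - r - c + 1 \\
  &= rcn' - 1 \\
  &= n - 1
\end{aligned}
```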

Examples:
Interaction vs. No Interaction
• No interaction: line segments are parallel
• Interaction is present: some line segments not parallel

[Figure: two plots of Mean Response against Factor A Levels, each with one line per level of Factor B (Level 1, Level 2, Level 3); the lines are parallel in the no-interaction plot and not parallel in the interaction plot]
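
Parallel line segments mean that the gap between any two Factor B profiles is the same at every level of Factor A. The check below is a minimal sketch on hypothetical cell means; the function name and the numbers are illustrative. With real sample means the lines are rarely exactly parallel, and the F test for interaction decides whether the departure is significant.

```python
def has_parallel_profiles(cell_means, tol=1e-9):
    """cell_means[j][i] is the mean response for Factor B level j at
    Factor A level i. Profiles are parallel when each B profile differs
    from the first one by a constant amount across the levels of A."""
    reference = cell_means[0]
    for profile in cell_means[1:]:
        gaps = [p - q for p, q in zip(profile, reference)]
        if max(gaps) - min(gaps) > tol:
            return False
    return True


# Hypothetical cell means (rows: Factor B levels, columns: Factor A levels)
no_interaction = [[10.0, 14.0, 18.0], [12.0, 16.0, 20.0], [15.0, 19.0, 23.0]]
interaction = [[10.0, 14.0, 18.0], [12.0, 13.0, 25.0], [15.0, 19.0, 23.0]]

print(has_parallel_profiles(no_interaction))  # True
print(has_parallel_profiles(interaction))     # False
```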

Summary
In this chapter we discussed
• The one-way analysis of variance
• The logic of ANOVA
• ANOVA assumptions
• F test for difference in c means
• The two-way analysis of variance
• Examined effects of multiple factors
• Examined interaction between factors
