Week 6 & 7 - Analysis of Variance (WK 5 & 6)

This chapter discusses analysis of variance (ANOVA) techniques. It will cover one-way ANOVA, randomized block ANOVA, and two-factor ANOVA with replications. The goals are to understand different ANOVA designs, perform hypothesis tests for single-factor and two-factor ANOVA, and interpret the results including post-hoc comparisons. Key concepts covered include partitioning total variation into between- and within-group components, calculating sum of squares, mean squares, and the F-test statistic.

Chapter 11

Analysis of Variance

Chap 11-1
Chapter Goals

After completing this chapter, you should be able to:
 Recognize situations in which to use analysis of variance
 Understand different analysis of variance designs
 Perform a single-factor hypothesis test and interpret results
 Conduct and interpret post-ANOVA pairwise comparison procedures
 Set up and perform randomized blocks analysis
 Analyze the results of a two-factor analysis of variance with replications
Chap 11-2
Chapter Overview

Analysis of Variance (ANOVA)

One-Way ANOVA (F-test, Tukey-Kramer test)
Randomized Complete Block ANOVA (F-test, Fisher's Least Significant Difference test)
Two-factor ANOVA with replication
Chap 11-3
General ANOVA Setting
 Investigator controls one or more independent
variables
 Called factors (or treatment variables)
 Each factor contains two or more levels (or
categories/classifications)
 Observe effects on dependent variable
 Response to levels of independent variable
 Experimental design: the plan used to test the hypothesis

Chap 11-4
One-Way Analysis of Variance

 Evaluate the difference among the means of three or more populations
Examples: Accident rates for 1st, 2nd, and 3rd shift
Expected mileage for five brands of tires

 Assumptions
 Populations are normally distributed

 Populations have equal variances

 Samples are randomly and independently drawn

Chap 11-5
Completely Randomized Design

 Experimental units (subjects) are assigned randomly to treatments
 Only one factor or independent variable
 With two or more treatment levels
 Analyzed by
 One-factor analysis of variance (one-way ANOVA)
 Called a Balanced Design if all factor levels
have equal sample size

Chap 11-6
Hypotheses of One-Way ANOVA


H0: μ1 = μ2 = μ3 = … = μk
 All population means are equal
 i.e., no treatment effect (no variation in means among
groups)

HA : Not all of the population means are the same
 At least one population mean is different
 i.e., there is a treatment effect
 Does not mean that all population means are different
(some pairs may be the same)

Chap 11-7
One-Factor ANOVA
H0: μ1 = μ2 = μ3 = … = μk
HA : Not all μi are the same

All Means are the same:


The Null Hypothesis is True
(No Treatment Effect)

μ1 = μ2 = μ3
Chap 11-8
One-Factor ANOVA
(continued)
H0: μ1 = μ2 = μ3 = … = μk
HA : Not all μi are the same

At least one mean is different:


The Null Hypothesis is NOT true
(Treatment Effect is present)

μ1 = μ2 ≠ μ3     or     μ1 ≠ μ2 ≠ μ3
Chap 11-9
Partitioning the Variation
 Total variation can be split into two parts:

SST = SSB + SSW

SST = Total Sum of Squares


SSB = Sum of Squares Between
SSW = Sum of Squares Within

Chap 11-10
Partitioning the Variation
(continued)

SST = SSB + SSW

Total Variation = the aggregate dispersion of the individual data values across the various factor levels (SST)

Between-Sample Variation = dispersion among the factor sample means (SSB)

Within-Sample Variation = dispersion that exists among the data values within a particular factor level (SSW)

Chap 11-11
Partition of Total Variation

Total Variation (SST) = Variation Due to Factor (SSB) + Variation Due to Random Sampling (SSW)

SSB is commonly referred to as:
 Sum of Squares Between
 Sum of Squares Explained
 Among-Groups Variation

SSW is commonly referred to as:
 Sum of Squares Within
 Sum of Squares Error
 Sum of Squares Unexplained
 Within-Groups Variation

Chap 11-12
Total Sum of Squares

SST = SSB + SSW


SST = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x})^2

Where:
SST = Total sum of squares
k = number of populations (levels or treatments)
n_i = sample size from population i
x_{ij} = jth measurement from population i
\bar{x} = grand mean (mean of all data values)
Chap 11-13
Total Variation
(continued)

SST = (x_{11} - \bar{x})^2 + (x_{12} - \bar{x})^2 + \cdots + (x_{k n_k} - \bar{x})^2

(Figure: response values X for Group 1, Group 2, and Group 3, with each value's deviation from the grand mean)

Chap 11-14
Sum of Squares Between

SST = SSB + SSW


SSB = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x})^2

Where:
SSB = Sum of squares between
k = number of populations
n_i = sample size from population i
\bar{x}_i = sample mean from population i
\bar{x} = grand mean (mean of all data values)
Chap 11-15
Between-Group Variation
SSB = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x})^2

Variation due to differences among groups.

MSB = SSB / (k - 1)

Mean Square Between = SSB / degrees of freedom
Chap 11-16
Between-Group Variation
(continued)

SSB = n_1 (\bar{x}_1 - \bar{x})^2 + n_2 (\bar{x}_2 - \bar{x})^2 + \cdots + n_k (\bar{x}_k - \bar{x})^2

(Figure: response values X for Groups 1-3, showing the group means \bar{x}_1, \bar{x}_2, \bar{x}_3 and the grand mean \bar{x})


Chap 11-17
Sum of Squares Within

SST = SSB + SSW


SSW = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2

Where:
SSW = Sum of squares within
k = number of populations
n_i = sample size from population i
\bar{x}_i = sample mean from population i
x_{ij} = jth measurement from population i
Chap 11-18
Within-Group Variation

SSW = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2

Summing the variation within each group and then adding over all groups.

MSW = SSW / (N - k)

Mean Square Within = SSW / degrees of freedom
Chap 11-19
Within-Group Variation
(continued)

SSW = (x_{11} - \bar{x}_1)^2 + (x_{12} - \bar{x}_1)^2 + \cdots + (x_{k n_k} - \bar{x}_k)^2

(Figure: response values X for Groups 1-3, with each value's deviation from its own group mean \bar{x}_1, \bar{x}_2, or \bar{x}_3)


Chap 11-20
One-Way ANOVA Table

Source of Variation   SS                df      MS                  F ratio
Between Samples       SSB               k - 1   MSB = SSB/(k - 1)   F = MSB/MSW
Within Samples        SSW               N - k   MSW = SSW/(N - k)
Total                 SST = SSB + SSW   N - 1
k = number of populations
N = sum of the sample sizes from all populations
df = degrees of freedom
Chap 11-21
One-Factor ANOVA
F Test Statistic
H0: μ1= μ2 = … = μ k
HA: At least two population means are different
Test statistic:
F = MSB / MSW
MSB is the mean square between groups
MSW is the mean square within groups
 Degrees of freedom
 df1 = k – 1 (k = number of populations)
 df2 = N – k (N = sum of sample sizes from all populations)

Chap 11-22
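The critical value Fα is normally read from an F table; as a quick cross-check, a minimal sketch in Python (SciPy is an assumption, not part of these slides) computes it directly:

```python
from scipy import stats

# One-way ANOVA degrees of freedom: df1 = k - 1, df2 = N - k
k, N, alpha = 3, 15, 0.05      # values matching the golf-club example later in the chapter
df1, df2 = k - 1, N - k

# Upper-tail critical value F_alpha such that P(F > F_alpha) = alpha
f_crit = stats.f.ppf(1 - alpha, df1, df2)
print(f"F({alpha}; {df1}, {df2}) = {f_crit:.3f}")   # about 3.885, matching the table value
```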
Interpreting One-Factor ANOVA
F Statistic
 The F statistic is the ratio of the between
estimate of variance and the within estimate
of variance
 The ratio must always be positive
 df1 = k -1 will typically be small
 df2 = N - k will typically be large

The ratio should be close to 1 if H0: μ1 = μ2 = … = μk is true

The ratio will be larger than 1 if H0: μ1 = μ2 = … = μk is false
Chap 11-23
One-Factor ANOVA
F Test Example

You want to see if three different golf clubs yield different distances. You randomly select five measurements from trials on an automated driving machine for each club. At the .05 significance level, is there a difference in mean distance?

Club 1   Club 2   Club 3
254      234      200
263      218      222
241      235      197
237      227      206
251      216      204

Chap 11-24
One-Factor ANOVA Example:
Scatter Diagram
(Scatter diagram: distance, 190 to 270, plotted against Club 1, 2, 3, with each club's five measurements, the club means, and the grand mean marked)

x1 = 249.2    x2 = 226.0    x3 = 205.8    x (grand mean) = 227.0
Chap 11-25
One-Factor ANOVA Example
Computations
Club 1   Club 2   Club 3
254      234      200
263      218      222
241      235      197
237      227      206
251      216      204

x1 = 249.2, n1 = 5
x2 = 226.0, n2 = 5
x3 = 205.8, n3 = 5
x = 227.0 (grand mean), N = 15, k = 3
SSB = 5 [ (249.2 – 227)2 + (226 – 227)2 + (205.8 – 227)2 ] = 4716.4
SSW = (254 – 249.2)2 + (263 – 249.2)2 +…+ (204 – 205.8)2 = 1119.6

MSB = 4716.4 / (3 - 1) = 2358.2
MSW = 1119.6 / (15 - 3) = 93.3
F = 2358.2 / 93.3 = 25.275
Chap 11-26
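A minimal sketch of the same computation (assuming Python with NumPy, which the slides do not use) reproduces SSB, SSW, MSB, MSW, and F for the golf-club data:

```python
import numpy as np

clubs = [
    np.array([254, 263, 241, 237, 251]),   # Club 1
    np.array([234, 218, 235, 227, 216]),   # Club 2
    np.array([200, 222, 197, 206, 204]),   # Club 3
]

k = len(clubs)
N = sum(len(c) for c in clubs)
grand_mean = np.concatenate(clubs).mean()                          # x = 227.0

# Partition the variation: between-group and within-group sums of squares
ssb = sum(len(c) * (c.mean() - grand_mean) ** 2 for c in clubs)    # 4716.4
ssw = sum(((c - c.mean()) ** 2).sum() for c in clubs)              # 1119.6

msb = ssb / (k - 1)        # 2358.2
msw = ssw / (N - k)        # 93.3
f_stat = msb / msw         # about 25.275

print(ssb, ssw, msb, msw, f_stat)
```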
One-Factor ANOVA Example
Solution
H0: μ1 = μ2 = μ3
HA: μi not all equal
α = .05
df1 = 2, df2 = 12
Critical Value: F.05 = 3.885

Test Statistic:
F = MSB / MSW = 2358.2 / 93.3 = 25.275

Decision: Reject H0 at α = 0.05, since F = 25.275 > 3.885
Conclusion: There is evidence that at least one μi differs from the rest
Chap 11-27
ANOVA -- Single Factor:
Excel Output
EXCEL: tools | data analysis | ANOVA: single factor
SUMMARY
Groups Count Sum Average Variance
Club 1 5 1246 249.2 108.2
Club 2 5 1130 226 77.5
Club 3 5 1029 205.8 94.2
ANOVA
Source of Variation   SS       df   MS       F        P-value    F crit
Between Groups        4716.4   2    2358.2   25.275   4.99E-05   3.885
Within Groups         1119.6   12   93.3
Total                 5836.0   14

Chap 11-28
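The F statistic and p-value in the Excel output can also be reproduced with a short script; this sketch assumes SciPy, which is not part of the original material:

```python
from scipy import stats

club1 = [254, 263, 241, 237, 251]
club2 = [234, 218, 235, 227, 216]
club3 = [200, 222, 197, 206, 204]

# One-way ANOVA across the three clubs
f_stat, p_value = stats.f_oneway(club1, club2, club3)
print(f"F = {f_stat:.3f}, p-value = {p_value:.2e}")   # F ≈ 25.275, p ≈ 4.99e-05
```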
The Tukey-Kramer Procedure
 Tells which population means are significantly
different
 e.g.: μ1 = μ2 ≠ μ3
 Done after rejection of equal means in ANOVA
 Allows pair-wise comparisons
 Compare absolute mean differences with critical
range

(Figure: two distributions with μ1 = μ2 and a third with a different μ3, shown along the x-axis)
Chap 11-29
Tukey-Kramer Critical Range

Critical Range = q_\alpha \sqrt{\frac{MSW}{2} \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}

where:
q = Value from standardized range table
with k and N - k degrees of freedom for
the desired level of 
MSW = Mean Square Within
ni and nj = Sample sizes from populations (levels) i and j

Chap 11-30
The Tukey-Kramer Procedure:
Example
1. Compute absolute mean differences:
|x1 - x2| = |249.2 - 226.0| = 23.2
|x1 - x3| = |249.2 - 205.8| = 43.4
|x2 - x3| = |226.0 - 205.8| = 20.2

2. Find the q value from the table with k and N - k degrees of freedom for the desired level of α:
q(α, k, N - k) = q(0.05, 3, 12) = 3.77   (Table L.22, Walpole)
Chap 11-31
The Tukey-Kramer Procedure:
Example
3. Compute Critical Range:
Critical Range = q_\alpha \sqrt{\frac{MSW}{2} \left( \frac{1}{n_i} + \frac{1}{n_j} \right)} = 3.77 \sqrt{\frac{93.3}{2} \left( \frac{1}{5} + \frac{1}{5} \right)} = 16.285

4. Compare each absolute mean difference with the critical range:
|x1 - x2| = 23.2, |x1 - x3| = 43.4, |x2 - x3| = 20.2
5. All of the absolute mean differences are greater than the critical range. Therefore there is a significant difference between each pair of means at the 5% level of significance.

Chap 11-32
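A sketch of the same Tukey-Kramer steps in Python; SciPy's studentized_range distribution (available in SciPy 1.7 and later) is an assumption standing in for the printed q table:

```python
import numpy as np
from scipy.stats import studentized_range

clubs = {
    "Club 1": np.array([254, 263, 241, 237, 251]),
    "Club 2": np.array([234, 218, 235, 227, 216]),
    "Club 3": np.array([200, 222, 197, 206, 204]),
}

k = len(clubs)
N = sum(len(v) for v in clubs.values())
msw = sum(((v - v.mean()) ** 2).sum() for v in clubs.values()) / (N - k)   # 93.3

# q value in place of the table lookup: q(0.05, 3, 12) ≈ 3.77
q = studentized_range.ppf(0.95, k, N - k)

names = list(clubs)
for i in range(k):
    for j in range(i + 1, k):
        ni, nj = len(clubs[names[i]]), len(clubs[names[j]])
        critical_range = q * np.sqrt(msw / 2 * (1 / ni + 1 / nj))          # about 16.3
        diff = abs(clubs[names[i]].mean() - clubs[names[j]].mean())
        verdict = "significant" if diff > critical_range else "not significant"
        print(f"{names[i]} vs {names[j]}: |diff| = {diff:.1f}, "
              f"critical range = {critical_range:.2f} -> {verdict}")
```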
Tukey-Kramer in PHStat

Chap 11-33
Randomized Complete Block ANOVA

 Like One-Way ANOVA, we test for equal population means (for different factor levels, for example)...

 ...but we want to control for possible variation from a second factor (with two or more levels)

 Used when more than one factor may influence the value of the dependent variable, but only one is of key interest

 Levels of the secondary factor are called blocks

Chap 11-34
Partitioning the Variation
 Total variation can now be split into three parts:

SST = SSB + SSBL + SSW

SST = Total sum of squares


SSB = Sum of squares between factor levels
SSBL = Sum of squares between blocks
SSW = Sum of squares within levels

Chap 11-35
Sum of Squares for Blocking

SST = SSB + SSBL + SSW


SSBL = \sum_{j=1}^{b} k (\bar{x}_j - \bar{x})^2

Where:
k = number of levels for this factor
b = number of blocks
\bar{x}_j = sample mean from the jth block
\bar{x} = grand mean (mean of all data values)
Chap 11-36
Partitioning the Variation
 Total variation can now be split into three parts:

SST = SSB + SSBL + SSW

SST and SSB are computed as they were in One-Way ANOVA
SSW = SST - (SSB + SSBL)

Chap 11-37
Mean Squares

MSBL = Mean square blocking = SSBL / (b - 1)

MSB = Mean square between = SSB / (k - 1)

MSW = Mean square within = SSW / [(k - 1)(b - 1)]

Chap 11-38
Randomized Block ANOVA Table
Source of Variation   SS     df               MS     F ratio
Between Blocks        SSBL   b - 1            MSBL   MSBL/MSW
Between Samples       SSB    k - 1            MSB    MSB/MSW
Within Samples        SSW    (k - 1)(b - 1)   MSW
Total                 SST    N - 1

k = number of populations    N = sum of the sample sizes from all populations
b = number of blocks         df = degrees of freedom
Chap 11-39
Blocking Test
H0: μb1 = μb2 = μb3 = ...
HA: Not all block means are equal

Blocking test: F = MSBL / MSW, with df1 = b - 1 and df2 = (k - 1)(b - 1)

Reject H0 if F > Fα

Chap 11-40
Main Factor Test
H0: μ1 = μ2 = μ3 = ... = μk
HA: Not all population means are equal

Main factor test: F = MSB / MSW, with df1 = k - 1 and df2 = (k - 1)(b - 1)

Reject H0 if F > Fα

Chap 11-41
Example
Alarm type
Room 1 2 3 4
1 5.2 7.4 3.9 12.3
2 6.3 8.1 6.4 9.4
3 4.9 5.9 7.9 7.8
4 3.2 6.5 9.2 10.8
5 6.8 4.9 4.1 8.5

Chap 11-42
ANOVA
Source of Variation   SS       df   MS      F       P-value   F crit
Rows                  6.07     4    1.516   0.426   0.787     3.259
Columns               56.28    3    18.76   5.27    0.015     3.490
Error                 42.7     12   3.56
Total                 105.06   19

Chap 11-43
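The randomized block table above can be reproduced programmatically; this sketch assumes pandas and statsmodels (not used in the slides), treating room as the block and alarm type as the factor of interest:

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Response times by room (block) and alarm type (main factor), from the example table
times = {
    1: [5.2, 7.4, 3.9, 12.3],
    2: [6.3, 8.1, 6.4, 9.4],
    3: [4.9, 5.9, 7.9, 7.8],
    4: [3.2, 6.5, 9.2, 10.8],
    5: [6.8, 4.9, 4.1, 8.5],
}
records = [
    {"room": room, "alarm": alarm, "time": t}
    for room, values in times.items()
    for alarm, t in enumerate(values, start=1)
]
df = pd.DataFrame(records)

# Randomized block ANOVA: block (room) plus main factor (alarm), no interaction term
model = ols("time ~ C(room) + C(alarm)", data=df).fit()
print(sm.stats.anova_lm(model))   # rows SS ≈ 6.07, columns SS ≈ 56.28, error SS ≈ 42.7
```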
Fisher’s
Least Significant Difference Test
 To test which population means are significantly
different
 e.g.: μ1 = μ2 ≠ μ3
 Done after rejection of equal means in randomized
block ANOVA design
 Allows pair-wise comparisons
 Compare absolute mean differences with critical
range

(Figure: two distributions with μ1 = μ2 and a third with a different μ3, shown along the x-axis)
Chap 11-44
Fisher’s Least Significant
Difference (LSD) Test

LSD = t_{\alpha/2} \sqrt{MSW \cdot \frac{2}{b}}

where:
t/2 = Upper-tailed value from Student’s t-distribution
for /2 and (k -1)(n - 1) degrees of freedom
MSW = Mean square within from ANOVA table
b = number of blocks
k = number of levels of the main factor

Chap 11-45
Fisher’s Least Significant
Difference (LSD) Test (continued)
LSD = t_{\alpha/2} \sqrt{MSW \cdot \frac{2}{b}}

Compare: Is |x_i - x_j| > LSD ?
|x1 - x2|, |x1 - x3|, |x2 - x3|, etc.
If the absolute mean difference is greater than LSD then there is a significant difference between that pair of means at the chosen level of significance.
Chap 11-46
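As a sketch of the LSD calculation (SciPy is an assumption; the numbers plug in MSW = 3.56, k = 4, and b = 5 from the alarm-type example above):

```python
from math import sqrt
from scipy import stats

msw, k, b, alpha = 3.56, 4, 5, 0.05     # values from the randomized block example
df = (k - 1) * (b - 1)                  # 12 degrees of freedom for MSW

# Fisher's Least Significant Difference
t_crit = stats.t.ppf(1 - alpha / 2, df)           # upper-tail t value for alpha/2
lsd = t_crit * sqrt(msw * 2 / b)
print(f"t = {t_crit:.3f}, LSD = {lsd:.3f}")

# Any pair of alarm-type sample means differing by more than LSD is
# significantly different at the chosen level of significance.
```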
Two-Way ANOVA

 Examines the effect of
 Two or more factors of interest on the dependent variable
 e.g.: Percent carbonation and line speed on a soft drink bottling process
 Interaction between the different levels of these two factors
 e.g.: Does the effect of one particular percentage of carbonation depend on the level at which the line speed is set?

Chap 11-47
Two-Way ANOVA
(continued)

 Assumptions

 Populations are normally distributed


 Populations have equal variances
 Independent random samples are
drawn

Chap 11-48
Two-Way ANOVA
Sources of Variation

Two Factors of interest: A and B


a = number of levels of factor A
b = number of levels of factor B
N = total number of observations in all cells

Chap 11-49
Two-Way ANOVA
Sources of Variation
(continued)

SST = SSA + SSB + SSAB + SSE

Degrees of Freedom:
SST = Total Variation                                   N - 1
SSA = Variation due to factor A                         a - 1
SSB = Variation due to factor B                         b - 1
SSAB = Variation due to interaction between A and B     (a - 1)(b - 1)
SSE = Inherent variation (Error)                        N - ab
Chap 11-50
Two Factor ANOVA Equations

Total Sum of Squares:
SST = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n'} (x_{ijk} - \bar{x})^2

Sum of Squares Factor A:
SS_A = b n' \sum_{i=1}^{a} (\bar{x}_i - \bar{x})^2

Sum of Squares Factor B:
SS_B = a n' \sum_{j=1}^{b} (\bar{x}_j - \bar{x})^2
Chap 11-51
Two Factor ANOVA Equations
(continued)

Sum of Squares Interaction Between A and B:
SS_{AB} = n' \sum_{i=1}^{a} \sum_{j=1}^{b} (\bar{x}_{ij} - \bar{x}_i - \bar{x}_j + \bar{x})^2

Sum of Squares Error:
SSE = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n'} (x_{ijk} - \bar{x}_{ij})^2

Chap 11-52
Two Factor ANOVA Equations
(continued)
where:
\bar{x} = \frac{\sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n'} x_{ijk}}{a b n'} = Grand Mean

\bar{x}_i = \frac{\sum_{j=1}^{b} \sum_{k=1}^{n'} x_{ijk}}{b n'} = Mean of each level of factor A

\bar{x}_j = \frac{\sum_{i=1}^{a} \sum_{k=1}^{n'} x_{ijk}}{a n'} = Mean of each level of factor B

\bar{x}_{ij} = \frac{\sum_{k=1}^{n'} x_{ijk}}{n'} = Mean of each cell

a = number of levels of factor A
b = number of levels of factor B
n' = number of replications in each cell
Chap 11-53
Mean Square Calculations
MS_A = Mean square factor A = SS_A / (a - 1)

MS_B = Mean square factor B = SS_B / (b - 1)

MS_{AB} = Mean square interaction = SS_{AB} / [(a - 1)(b - 1)]

MSE = Mean square error = SSE / (N - ab)
Chap 11-54
Two-Way ANOVA:
The F Test Statistic
F Test for Factor A Main Effect
H0: μA1 = μA2 = μA3 = ...
HA: Not all μAi are equal
F = MS_A / MSE      Reject H0 if F > Fα

F Test for Factor B Main Effect
H0: μB1 = μB2 = μB3 = ...
HA: Not all μBi are equal
F = MS_B / MSE      Reject H0 if F > Fα

F Test for Interaction Effect
H0: factors A and B do not interact to affect the mean response
HA: factors A and B do interact
F = MS_AB / MSE     Reject H0 if F > Fα
Chap 11-55
Two-Way ANOVA
Summary Table
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Squares                      F Statistic
Factor A              SSA              a - 1                MSA = SSA / (a - 1)               MSA / MSE
Factor B              SSB              b - 1                MSB = SSB / (b - 1)               MSB / MSE
AB (Interaction)      SSAB             (a - 1)(b - 1)       MSAB = SSAB / [(a - 1)(b - 1)]    MSAB / MSE
Error                 SSE              N - ab               MSE = SSE / (N - ab)
Total                 SST              N - 1
Chap 11-56
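A sketch of a two-factor ANOVA with replications; pandas/statsmodels and the made-up carbonation/line-speed data below are assumptions used only to illustrate the table (the slides give no data for this design):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

rng = np.random.default_rng(0)

# Hypothetical bottling data: factor A = percent carbonation (a = 3 levels),
# factor B = line speed (b = 2 levels), n' = 4 replications per cell
records = []
for carbonation in (10, 12, 14):
    for speed in ("low", "high"):
        for _ in range(4):
            records.append({
                "carbonation": carbonation,
                "speed": speed,
                "fill_height": rng.normal(loc=0.1 * carbonation, scale=0.5),
            })
df = pd.DataFrame(records)

# Two-way ANOVA with interaction: SSA, SSB, SSAB, SSE and the three F tests
model = ols("fill_height ~ C(carbonation) * C(speed)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
```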
Features of Two-Way ANOVA
F Test
 Degrees of freedom always add up
 N-1 = (N-ab) + (a-1) + (b-1) + (a-1)(b-1)
 Total = error + factor A + factor B + interaction
 The denominator of the F Test is always the
same but the numerator is different
 The sums of squares always add up
 SST = SSE + SSA + SSB + SSAB
 Total = error + factor A + factor B + interaction

Chap 11-57
Examples:
Interaction vs. No Interaction
(Two plots of mean response versus Factor A levels 1 and 2, with one line for each of Factor B levels 1, 2, and 3.
No interaction: the Factor B lines are parallel across the Factor A levels.
Interaction is present: the Factor B lines are not parallel.)
Chap 11-58
Chapter Summary
 Described one-way analysis of variance
 The logic of ANOVA
 ANOVA assumptions
 F test for difference in k means
 The Tukey-Kramer procedure for multiple comparisons
 Described randomized complete block designs
 F test
 Fisher’s least significant difference test for multiple
comparisons
 Described two-way analysis of variance
 Examined effects of multiple factors and interaction
Chap 11-59
