Topic 5 Analysis of Variance

This chapter discusses analysis of variance (ANOVA), including one-way ANOVA for comparing means of three or more populations, two-factor ANOVA, assumptions, hypotheses, partitioning variation into sums of squares between and within groups, F-tests, and post-hoc tests. Key points covered include evaluating differences among population means, assumptions of normality and equal variances, partitioning total variation, calculating mean squares, and interpreting the F-statistic to determine if population means are equal or different.

Uploaded by

1221305124
Topic 5

Analysis of Variance
Chapter Outcomes

After completing this chapter, you should be able to:

• Recognize situations in which to use analysis of variance
• Understand different analysis of variance designs
• Perform a single-factor hypothesis test and interpret results
• Conduct and interpret post-analysis of variance pairwise comparison procedures
• Set up and perform randomized blocks analysis
• Analyze the results of a two-factor analysis of variance test with replications
Overview of the Topic

Analysis of Variance (ANOVA)
• One-Way ANOVA
  - F-test
  - Tukey-Kramer test
• Two-factor ANOVA with replication
General ANOVA Setting
• Investigator controls one or more independent variables
  - Called factors (or treatment variables)
  - Each factor contains two or more levels (or categories/classifications)
• Observe effects on dependent variable
  - Response to levels of independent variable
• Experimental design: the plan used to test hypothesis
One-Way Analysis of Variance

• Evaluate the difference among the means of three or more populations
  Examples: Accident rates for 1st, 2nd, and 3rd shift
            Expected mileage for five brands of tires

• Assumptions
  - Populations are normally distributed
  - Populations have equal variances
  - Samples are randomly and independently drawn

Hypotheses of One-Way ANOVA

H0: μ1 = μ2 = μ3 = … = μk
• All population means are equal
• i.e., no treatment effect (no variation in means among groups)

HA: Not all of the population means are the same
• At least one population mean is different
• i.e., there is a treatment effect
• Does not mean that all population means are different (some pairs may be the same)
One-Factor ANOVA
H0: μ1 = μ2 = μ3 = … = μk
HA: Not all μi are the same

All means are the same:
The null hypothesis is true (no treatment effect)

μ1 = μ2 = μ3
One-Factor ANOVA
(continued)
H0: μ1 = μ2 = μ3 = … = μk
HA: Not all μi are the same

At least one mean is different:
The null hypothesis is NOT true (treatment effect is present)

μ1 = μ2 ≠ μ3   or   μ1 ≠ μ2 ≠ μ3
Partitioning the Variation
• Total variation can be split into two parts:

SST = SSB + SSW

SST = Total Sum of Squares
SSB = Sum of Squares Between
SSW = Sum of Squares Within
Partitioning the Variation
(continued)

SST = SSB + SSW

Total Variation = the aggregate dispersion of the individual data values across the various factor levels (SST)

Between-Sample Variation = dispersion among the factor sample means (SSB)

Within-Sample Variation = dispersion that exists among the data values within a particular factor level (SSW)
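
The identity SST = SSB + SSW follows from splitting each deviation from the grand mean into a within-group part and a between-group part. The derivation below is a sketch that writes \bar{x}_i for the mean of sample i and \bar{\bar{x}} for the grand mean; that notation is an assumption, since the bars are not legible in the slide text.

```latex
\begin{align*}
SST &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{\bar{x}})^2
     = \sum_{i=1}^{k} \sum_{j=1}^{n_i}
       \bigl[(x_{ij} - \bar{x}_i) + (\bar{x}_i - \bar{\bar{x}})\bigr]^2 \\
    &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2
     + 2 \sum_{i=1}^{k} (\bar{x}_i - \bar{\bar{x}})
         \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)
     + \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{\bar{x}})^2 \\
    &= SSW + 0 + SSB
\end{align*}
```

The middle term vanishes because the deviations of a sample from its own mean always sum to zero.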
Partition of Total Variation

Total Variation (SST) = Variation Due to Factor (SSB) + Variation Due to Random Sampling (SSW)

SSB is commonly referred to as:
• Sum of Squares Between
• Sum of Squares Among
• Sum of Squares Explained
• Among Groups Variation

SSW is commonly referred to as:
• Sum of Squares Within
• Sum of Squares Error
• Sum of Squares Unexplained
• Within Groups Variation
Total Sum of Squares

SST = SSB + SSW

$$SST = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{\bar{x}})^2$$

Where:
SST = Total sum of squares
k = number of populations (levels or treatments)
$n_i$ = sample size from population i
$x_{ij}$ = jth measurement from population i
$\bar{\bar{x}}$ = grand mean (mean of all data values)
Total Variation
(continued)

$$SST = (x_{11} - \bar{\bar{x}})^2 + (x_{12} - \bar{\bar{x}})^2 + \dots + (x_{k n_k} - \bar{\bar{x}})^2$$

[Figure: Response, X for Group 1, Group 2, and Group 3]
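
As a minimal sketch, the total sum of squares can be computed directly from its definition. The data are the golf-club distances from the worked example in this topic; the function name `total_sum_of_squares` is an illustrative choice, not something from the slides.

```python
# Golf-club distances from the worked example in this topic.
groups = [
    [254, 263, 241, 237, 251],  # Club 1
    [234, 218, 235, 227, 216],  # Club 2
    [200, 222, 197, 206, 204],  # Club 3
]


def total_sum_of_squares(groups):
    """SST: squared deviation of every observation from the grand mean."""
    values = [x for g in groups for x in g]
    grand_mean = sum(values) / len(values)
    return sum((x - grand_mean) ** 2 for x in values)


sst = total_sum_of_squares(groups)
print(f"SST = {sst:.1f}")
```

The result agrees with the Total row (5836.0) of the Excel output shown for the same data.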


Sum of Squares Between

SST = SSB + SSW

$$SSB = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{\bar{x}})^2$$

Where:
SSB = Sum of squares between
k = number of populations
$n_i$ = sample size from population i
$\bar{x}_i$ = sample mean from population i
$\bar{\bar{x}}$ = grand mean (mean of all data values)
Between-Group Variation

$$SSB = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{\bar{x}})^2$$

Variation due to differences among groups

$$MSB = \frac{SSB}{k - 1}$$

Mean Square Between = SSB / degrees of freedom
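Between-Group Variation
(continued)

$$SSB = n_1 (\bar{x}_1 - \bar{\bar{x}})^2 + n_2 (\bar{x}_2 - \bar{\bar{x}})^2 + \dots + n_k (\bar{x}_k - \bar{\bar{x}})^2$$

[Figure: Response, X for Group 1, Group 2, and Group 3, with the group means X1, X2, X3 and the grand mean X marked]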
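
The between-group sum of squares and its mean square can be sketched the same way. The data are the golf-club distances from the worked example in this topic, and the helper name `sum_of_squares_between` is hypothetical.

```python
# Golf-club distances from the worked example in this topic.
groups = [
    [254, 263, 241, 237, 251],  # Club 1
    [234, 218, 235, 227, 216],  # Club 2
    [200, 222, 197, 206, 204],  # Club 3
]


def sum_of_squares_between(groups):
    """SSB: each group's size times the squared gap between its mean and the grand mean."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    return sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)


ssb = sum_of_squares_between(groups)
msb = ssb / (len(groups) - 1)  # divide by k - 1 degrees of freedom
print(f"SSB = {ssb:.1f}, MSB = {msb:.1f}")
```

These values match the SSB = 4716.4 and MSB = 2358.2 reported in the worked example.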


Sum of Squares Within

SST = SSB + SSW

$$SSW = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2$$

Where:
SSW = Sum of squares within
k = number of populations
$n_i$ = sample size from population i
$\bar{x}_i$ = sample mean from population i
$x_{ij}$ = jth measurement from population i
Within-Group Variation

$$SSW = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2$$

Summing the variation within each group and then adding over all groups

$$MSW = \frac{SSW}{N - k}$$

Mean Square Within = SSW / degrees of freedom
Within-Group Variation
(continued)

$$SSW = (x_{11} - \bar{x}_1)^2 + (x_{12} - \bar{x}_1)^2 + \dots + (x_{k n_k} - \bar{x}_k)^2$$

[Figure: Response, X for Group 1, Group 2, and Group 3, with the group means X1, X2, X3 marked]


Sum of Squares Within
(Alternative formula)

SST = SSB + SSW

$$SSW = \sum_{i=1}^{k} (n_i - 1) s_i^2$$

where $s_i^2$ is the sample variance of the sample from population i.
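
The two SSW formulas can be checked against each other in a few lines. This sketch uses the golf-club distances from the worked example in this topic; both function names are hypothetical, and `statistics.variance` from the Python standard library supplies the sample variance with an n - 1 denominator.

```python
import statistics

# Golf-club distances from the worked example in this topic.
groups = [
    [254, 263, 241, 237, 251],  # Club 1
    [234, 218, 235, 227, 216],  # Club 2
    [200, 222, 197, 206, 204],  # Club 3
]


def ssw_direct(groups):
    """SSW from the definition: squared deviations from each group's own mean."""
    total = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        total += sum((x - group_mean) ** 2 for x in g)
    return total


def ssw_from_variances(groups):
    """SSW from the alternative formula: sum of (n_i - 1) * s_i^2."""
    return sum((len(g) - 1) * statistics.variance(g) for g in groups)


print(f"direct: {ssw_direct(groups):.1f}")
print(f"from variances: {ssw_from_variances(groups):.1f}")
```

The alternative form is convenient when only the group sizes and sample variances are available, as in a summary table.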
One-Way ANOVA Table

Source of Variation   SS                df      MS                    F ratio
Between Samples       SSB               k - 1   MSB = SSB / (k - 1)   F = MSB / MSW
Within Samples        SSW               N - k   MSW = SSW / (N - k)
Total                 SST = SSB + SSW   N - 1

k = number of populations
N = sum of the sample sizes from all populations
df = degrees of freedom
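
The table can be filled in mechanically once SSB, SSW, k, and N are known. The following is a sketch only: the function `anova_table` and its dictionary layout are illustrative choices, and the inputs are the SSB and SSW values from the golf-club example in this topic.

```python
def anova_table(ssb, ssw, k, n_total):
    """Assemble the one-way ANOVA table from the two sums of squares."""
    df_between = k - 1
    df_within = n_total - k
    msb = ssb / df_between
    msw = ssw / df_within
    return {
        "Between Samples": {"SS": ssb, "df": df_between, "MS": msb, "F": msb / msw},
        "Within Samples": {"SS": ssw, "df": df_within, "MS": msw},
        "Total": {"SS": ssb + ssw, "df": n_total - 1},
    }


table = anova_table(ssb=4716.4, ssw=1119.6, k=3, n_total=15)
for source, row in table.items():
    cells = ", ".join(f"{name} = {value:.3f}" for name, value in row.items())
    print(f"{source}: {cells}")
```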
One-Factor ANOVA
F Test Statistic
H0: μ1 = μ2 = … = μk
HA: At least two population means are different
Test statistic

$$F = \frac{MSB}{MSW}$$

MSB is the mean square between groups (the between-sample estimate of variance)
MSW is the mean square within groups (the within-sample estimate of variance)
• Degrees of freedom
  - df1 = k − 1 (k = number of populations)
  - df2 = N − k (N = sum of sample sizes from all populations)
Interpreting One-Factor ANOVA
F Statistic
• The F statistic is the ratio of the between estimate of variance to the within estimate of variance
  - The ratio is never negative
  - df1 = k − 1 will typically be small
  - df2 = N − k will typically be large

The ratio should be close to 1 if
H0: μ1 = μ2 = … = μk is true

The ratio will tend to be larger than 1 if
H0: μ1 = μ2 = … = μk is false
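
A small simulation can illustrate this behaviour. It is a sketch under stated assumptions: normal populations with a common standard deviation of 10, five observations per group, and population means chosen only for illustration. None of these settings come from the slides. When H0 is true the long-run average of F is df2 / (df2 − 2), which is 1.2 for df2 = 12, so "close to 1" should be read loosely for small samples.

```python
import random
import statistics


def f_statistic(groups):
    """Return MSB / MSW for a list of samples (one list per group)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [statistics.fmean(g) for g in groups]
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ssb / (k - 1)) / (ssw / (n_total - k))


def mean_f(population_means, n_per_group=5, sigma=10.0, trials=2000, seed=1):
    """Average F over repeated samples drawn from normal populations."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        groups = [
            [rng.gauss(mu, sigma) for _ in range(n_per_group)]
            for mu in population_means
        ]
        total += f_statistic(groups)
    return total / trials


f_null = mean_f([227.0, 227.0, 227.0])  # H0 true: all means equal
f_alt = mean_f([249.0, 226.0, 206.0])   # H0 false: means differ
print(f"average F when H0 is true:  {f_null:.2f}")
print(f"average F when H0 is false: {f_alt:.2f}")
```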
One-Factor ANOVA
F Test Worked Example

You want to see if three different golf clubs yield different distances. You randomly select five measurements from trials on an automated driving machine for each club. At the .05 significance level, is there a difference in mean distance?

Club 1   Club 2   Club 3
254      234      200
263      218      222
241      235      197
237      227      206
251      216      204
One-Factor ANOVA Example:
Scatter Diagram

[Figure: scatter diagram of Distance (axis from 190 to 270) against Club (1, 2, 3), with the three group means and the grand mean marked]

x̄1 = 249.2   x̄2 = 226.0   x̄3 = 205.8
Grand mean = 227.0
One-Factor ANOVA Example
Computations

x̄1 = 249.2   n1 = 5
x̄2 = 226.0   n2 = 5
x̄3 = 205.8   n3 = 5
Grand mean = 227.0   N = 15   k = 3

SSB = 5[(249.2 − 227)² + (226 − 227)² + (205.8 − 227)²] = 4716.4
SSW = (254 − 249.2)² + (263 − 249.2)² + … + (204 − 205.8)² = 1119.6

MSB = 4716.4 / (3 − 1) = 2358.2
MSW = 1119.6 / (15 − 3) = 93.3

F = 2358.2 / 93.3 = 25.275
One-Factor ANOVA Example
Solution

H0: μ1 = μ2 = μ3
HA: μi not all equal
α = .05
df1 = 2   df2 = 12

Critical value: F.05 = 3.885

Test statistic: F = MSB / MSW = 2358.2 / 93.3 = 25.275

Decision: Reject H0 at α = 0.05, since F = 25.275 > F.05 = 3.885

Conclusion: There is evidence that at least one μi differs from the rest

[Figure: F distribution with the rejection region (α = .05) to the right of F.05 = 3.885; F = 25.275 falls in the rejection region]
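
For the special case df1 = 2 (three groups), the upper-tail probability of the F distribution has a closed form, P(F > f) = (1 + 2f/df2)^(−df2/2), so the critical value and p-value for this example can be checked without a table. The sketch below relies on that identity and does not apply for other values of df1; the function names are hypothetical.

```python
def f_upper_tail_df1_2(f, df2):
    """P(F > f) for an F distribution with df1 = 2 and df2 denominator df."""
    return (1 + 2 * f / df2) ** (-df2 / 2)


def f_critical_df1_2(alpha, df2):
    """Value c with P(F > c) = alpha, again only valid for df1 = 2."""
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)


alpha = 0.05
df2 = 12
f_stat = 2358.2 / 93.3  # MSB / MSW from the computations

f_crit = f_critical_df1_2(alpha, df2)
p_value = f_upper_tail_df1_2(f_stat, df2)
reject_h0 = f_stat > f_crit

print(f"F = {f_stat:.3f}, F crit = {f_crit:.3f}, p-value = {p_value:.2e}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Rejecting when F exceeds the critical value is equivalent to rejecting when the p-value is below α.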
ANOVA -- Single Factor:
Excel Output
EXCEL: tools | data analysis | ANOVA: single factor

SUMMARY
Groups   Count   Sum    Average   Variance
Club 1   5       1246   249.2     108.2
Club 2   5       1130   226       77.5
Club 3   5       1029   205.8     94.2

ANOVA
Source of Variation   SS       df   MS       F        P-value    F crit
Between Groups        4716.4   2    2358.2   25.275   4.99E-05   3.885
Within Groups         1119.6   12   93.3
Total                 5836.0   14
Example 3.1
The Westfall Relocation Company, located in Denver, Colorado, contracts with major corporations forced to lay off employees. Westfall provides a variety of services, including job searches, specialized training, and resume development. Westfall then bills the corporation for the services. It currently operates in three regions: west, southwest, and northwest. Recently, Westfall's general manager questioned whether the company's mean billing amount differed by region. You are required to carry out a test to determine this.
Example 3.1
(continued)
The following sample data were collected:

West      Southwest   Northwest
$3,700    $3,300      $2,900
2,900     2,100       4,300
4,100     2,600       5,200
4,900     2,100       3,300
4,900     3,600       3,600
5,300     2,700       3,300
2,200     4,500       3,700
3,700     2,400       2,400
4,800                 4,400
3,000                 3,300
                      4,400
                      3,200
ANOVA TABLE

ANOVA
Source of Variation   SS            df   MS
Between Groups        5011583.333   2    2505791.667
Within Groups         21080416.67   27   780756.1728
Total                 26092000      29
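
The table stops at the mean squares, so the F ratio still has to be formed. The sketch below recomputes the table from the sample data. It assumes that the two-value rows of the flattened data belong to West and Northwest and the single-value rows to Northwest (West n = 10, Southwest n = 8, Northwest n = 12); this assignment is an inference, made because it reproduces the SS values in the table.

```python
# Billing amounts in dollars. Column membership of the last four data rows is
# an assumption (West n = 10, Southwest n = 8, Northwest n = 12).
billing = {
    "West": [3700, 2900, 4100, 4900, 4900, 5300, 2200, 3700, 4800, 3000],
    "Southwest": [3300, 2100, 2600, 2100, 3600, 2700, 4500, 2400],
    "Northwest": [2900, 4300, 5200, 3300, 3600, 3300, 3700, 2400,
                  4400, 3300, 4400, 3200],
}

groups = list(billing.values())
k = len(groups)
n_total = sum(len(g) for g in groups)
grand_mean = sum(sum(g) for g in groups) / n_total

# Between-group and within-group sums of squares from their definitions.
ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
ssw = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

msb = ssb / (k - 1)
msw = ssw / (n_total - k)
f_stat = msb / msw

print(f"SSB = {ssb:.3f}, SSW = {ssw:.2f}")
print(f"MSB = {msb:.3f}, MSW = {msw:.4f}")
print(f"F = {f_stat:.3f} with df1 = {k - 1}, df2 = {n_total - k}")
```

Under these assumptions F is about 3.21. The decision then depends on comparing this value with the F critical value for 2 and 27 degrees of freedom at the chosen significance level, which the example does not state.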
