Chap 011


A PowerPoint Presentation Package to Accompany

Applied Statistics in Business & Economics, 5th edition

David P. Doane and Lori E. Seward

Prepared by Lloyd R. Jaisingh

McGraw-Hill/Irwin Copyright © 2015 by The McGraw-Hill Companies, Inc. All rights reserved.
Chapter 11
Analysis of Variance

Chapter Contents

11.1 Overview of ANOVA


11.2 One-Factor ANOVA (Completely Randomized Model)
11.3 Multiple Comparisons
11.4 Tests for Homogeneity of Variances
11.5 Two-Factor ANOVA without Replication (Randomized Block Model)
11.6 Two-Factor ANOVA with Replication (Full Factorial Model)
11.7 Higher Order ANOVA Models (Optional)

11-2
Chapter 11
Analysis of Variance

Chapter Learning Objectives (LOs)

LO11-1: Use basic ANOVA terminology correctly.
LO11-2: Explain the assumptions of ANOVA and why they are important.
LO11-3: Recognize from data format when one-factor ANOVA is
appropriate.
LO11-4: Interpret sums of squares and calculations in an ANOVA table.
LO11-5: Use Excel or other software for ANOVA calculations.
LO11-6: Use a table or Excel to find critical values for the F distribution.
LO11-7: Understand and perform Tukey’s test for paired means.

11-3
Chapter 11
Analysis of Variance

Chapter Learning Objectives (LOs)

LO11-8: Use Hartley's test for equal variances in c treatment
groups.
LO11-9: Recognize from data format when two-factor ANOVA is
needed.
LO11-10: Interpret results in a two-factor ANOVA without replication.
LO11-11: Interpret main effects and interaction effects in two-factor
ANOVA.
LO11-12: Recognize the need for experimental design and GLM
(optional).

11-4
Chapter 11
LO11-1 11.1 Overview of ANOVA

LO11-1: Use basic ANOVA terminology correctly.

• Analysis of variance (ANOVA) is a comparison of
means.
• ANOVA allows one to compare more than two
means simultaneously.
• Proper experimental design efficiently uses limited
data to draw the strongest possible inferences.

11-5
Chapter 11
LO11-1 11.1 Overview of ANOVA

The Goal: Explaining Variation


• ANOVA seeks to identify sources of variation in a
numerical dependent variable Y (the response
variable).
• Variation in Y about its mean is explained by one or
more categorical independent variables (the factors) or
is unexplained (random error).

11-6
Chapter 11
LO11-1 11.1 Overview of ANOVA

The Goal: Explaining Variation


• Each possible value of a factor or combination of factors is a
treatment.
• We test to see if each factor has a significant effect on Y using
(for example) the hypotheses:
H0: μ1 = μ2 = μ3 = μ4 (e.g., mean defect rates are the same for
all four plants)
H1: Not all the means are equal
• The test uses the F distribution.
• If we cannot reject H0, we conclude that observations within each
treatment have a common mean μ.

11-7
Chapter 11
LO11-1 11.1 Overview of ANOVA

One-Factor ANOVA Example

11-8
Chapter 11
LO11-1 11.1 Overview of ANOVA

The Goal: Explaining Variation


• For example, a one-factor ANOVA would test the hypothesis that
the length of hospital stay (LOS) is affected by Type of Fracture:
Length of stay = f(type of fracture). See Figure 11.3 (slide # 10).
• A two-factor ANOVA would test the hypothesis that the length of
hospital stay (LOS) is affected by Type of Fracture and Age
Group:
Length of stay = f(type of fracture, age group)
• We can also test for interaction between factors.
• Another Example: Paint quality is a major concern of car makers.
A key characteristic of paint is its viscosity, a continuous
numerical variable. Viscosity is to be tested for dependence on
application temperature (low, medium, high), as illustrated in
Figure 11.3 (slide# 10).
11-9
Chapter 11
LO11-1 11.1 Overview of ANOVA
The Goal: Explaining Variation

Figure 11.3

11-10
Chapter 11
LO11-2 11.1 Overview of ANOVA

LO11-2: Explain the assumptions of ANOVA and why they
are important.
ANOVA Assumptions
• Analysis of Variance assumes that the
- observations on Y are independent,
- populations being sampled are normal,
- populations being sampled have equal
variances.
• ANOVA is somewhat robust to departures from
normality and equal variance assumptions.

11-11
Chapter 11
11.1 Overview of ANOVA

ANOVA Calculations

• Software (e.g., Excel, MegaStat, MINITAB, SPSS) can be
used to analyze data.
• Large samples increase the power of the test,
but power also depends on the degree of variation in Y.
• Lowest power would be in a small sample with high
variation in Y.

11-12
Chapter 11
LO11-3
11.2 One-Factor ANOVA
(Completely Randomized Model)
LO11-3: Recognize from data format when one-factor ANOVA
is appropriate.
Data Format
• A one-factor ANOVA only compares the means of c groups
(treatments or factor levels).
• Consider the format for a one-factor ANOVA with treatments T1,
T2, …, Tc.

Table 11.1

11-13


Chapter 11
LO11-3
11.2 One-Factor ANOVA
(Completely Randomized Model)
Data Format
• Sample sizes within each treatment do not need to be equal (i.e.,
the design does not need to be balanced).
• The total number of observations is equal to n = n1 + n2 + … + nc

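Hypothesis to Be Tested
H0: μ1 = μ2 = … = μc
H1: Not all the means are equal
• ANOVA tests all means simultaneously and so does not inflate
the type I error, as running separate two-sample tests on every pair of means would.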

11-14
Chapter 11
LO11-3
11.2 One-Factor ANOVA
(Completely Randomized Model)
One-Factor ANOVA as a Linear Model
• An equivalent way to express the one-factor model is
to say that observations in treatment j came from a population with a
common mean (μ) plus a treatment effect (Tj) plus
random error (εij):
• yij = μ + Tj + εij
j = 1, 2, …, c and i = 1, 2, …, nj
• Random error is assumed to be normally distributed
with zero mean and the same variance for all
treatments.
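
As a rough illustration of the linear model above, the sketch below simulates data from a common mean plus treatment effects plus normal error. The values of the mean, the effects, the error standard deviation, and the group size are all hypothetical and chosen only for the demonstration.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the illustration is repeatable

mu = 50.0                    # common mean (hypothetical value)
effects = [-2.0, 0.0, 2.0]   # treatment effects T_j (hypothetical values)
sigma = 1.0                  # same error standard deviation for every treatment
n_per_group = 200            # observations per treatment (hypothetical)

# y_ij = mu + T_j + e_ij, with e_ij drawn from a normal distribution with mean 0
samples = [
    [mu + t + random.gauss(0.0, sigma) for _ in range(n_per_group)]
    for t in effects
]

# Each sample mean should land near mu + T_j
estimated_means = [mean(s) for s in samples]
for t, est in zip(effects, estimated_means):
    print(f"true mean {mu + t:.1f}, sample mean {est:.2f}")
```

With all effects set to zero, every group would be drawn from the same population, which is the situation described by the null hypothesis.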

11-15
Chapter 11
LO11-3
11.2 One-Factor ANOVA
(Completely Randomized Model)
One-Factor ANOVA as a Linear Model

• A fixed effects model only looks at what happens to the
response for particular levels of the factor.
H0: T1 = T2 = … = Tc = 0
H1: Not all Tj are zero
• If H0 is true, then the ANOVA model collapses to yij = μ + εij

11-16
Chapter 11
LO11-4
11.2 One-Factor ANOVA
(Completely Randomized Model)
LO11-4: Interpret sums of squares and calculations in
an ANOVA table.
Group Means
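• The mean of each group is calculated as:
ȳj = (1/nj) Σi yij (sum over the nj observations in treatment j)
• The overall sample mean (grand mean) can be calculated as:
ȳ = (1/n) Σj Σi yij = (1/n) Σj nj ȳj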
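
A minimal sketch of the group-mean and grand-mean calculations, using made-up observations for three treatments (the data and the names T1, T2, T3 are illustrative, not taken from the text). It also checks that the grand mean equals the size-weighted average of the group means.

```python
from statistics import mean

# Hypothetical observations for c = 3 treatments (not from the source)
groups = {
    "T1": [10, 12, 11, 13],
    "T2": [14, 15, 13, 16],
    "T3": [9, 8, 10, 9],
}

# Mean of each group: sum of the group's observations divided by its size
group_means = {name: mean(values) for name, values in groups.items()}

# Total number of observations n = n1 + n2 + ... + nc
n = sum(len(values) for values in groups.values())

# Grand mean: all observations summed, divided by n
grand_mean = sum(sum(values) for values in groups.values()) / n

# Equivalent form: group means weighted by group size
weighted_mean = sum(len(v) * group_means[k] for k, v in groups.items()) / n

print(group_means)
print(grand_mean, weighted_mean)
```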

11-17
Chapter 11
LO11-4
11.2 One-Factor ANOVA
(Completely Randomized Model)
Partitioned Sum of Squares

11-18
Chapter 11
LO11-4
11.2 One-Factor ANOVA
(Completely Randomized Model)
Partitioned Sum of Squares
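• The deviation of each observation from the grand mean splits into a
between-treatment part and a within-treatment part:
(yij − ȳ) = (ȳj − ȳ) + (yij − ȳj)
• This relationship is also true for sums of squared deviations, yielding the
partitioned sum of squares:
SST = SSA + SSE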

11-19
Chapter 11
LO11-4
11.2 One-Factor ANOVA
(Completely Randomized Model)
Partitioned Sum of Squares

• SSA and SSE are used to test the hypothesis of equal
treatment means by dividing each sum of squares by its
degrees of freedom to adjust for group size.
• These ratios are called Mean Squares (MSA = SSA/(c − 1) and
MSE = SSE/(n − c)).
• The resulting test statistic is F = MSA/MSE.
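
The partition and the F statistic can be worked through by hand on a small data set. The sketch below uses hypothetical observations for three treatments (not data from the text) and confirms that the between-treatment and within-treatment sums of squares add up to the total.

```python
from statistics import mean

# Hypothetical observations for c = 3 treatments (not from the source)
groups = {
    "T1": [10, 12, 11, 13],
    "T2": [14, 15, 13, 16],
    "T3": [9, 8, 10, 9],
}

c = len(groups)
all_values = [y for values in groups.values() for y in values]
n = len(all_values)
grand_mean = mean(all_values)

# SSA: between-treatment variation (group means around the grand mean)
ssa = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())

# SSE: within-treatment variation (observations around their own group mean)
sse = sum((y - mean(v)) ** 2 for v in groups.values() for y in v)

# SST: total variation (observations around the grand mean)
sst = sum((y - grand_mean) ** 2 for y in all_values)

# Mean squares: each sum of squares divided by its degrees of freedom
msa = ssa / (c - 1)
mse = sse / (n - c)
f_calc = msa / mse

print(f"SSA={ssa:.4f} SSE={sse:.4f} SST={sst:.4f}")
print(f"MSA={msa:.4f} MSE={mse:.4f} F={f_calc:.2f}")
```

For these made-up numbers the group means are far apart relative to the spread within groups, so F comes out well above 1.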

11-20
Chapter 11
LO11-4
11.2 One-Factor ANOVA
(Completely Randomized Model)

Partitioned Sum of Squares

11-21
Chapter 11
LO11-5
11.2 One-Factor ANOVA
(Completely Randomized Model)
LO11-5: Use Excel or other software for ANOVA
calculations.
• The ANOVA calculations are mathematically simple but involve
tedious sums.
• One can use Excel’s one-factor ANOVA menu using Data
Analysis to analyze data.

11-22
Chapter 11
LO11-4
11.2 One-Factor ANOVA
(Completely Randomized Model)
Test Statistic
• The F distribution describes the ratio of two variances.
• The F statistic is the ratio of the variance due to
treatments (MSA) to the variance due to error (MSE).

11-23
Chapter 11
LO11-5
11.2 One-Factor ANOVA
(Completely Randomized Model)
Test Statistic
• When F is near zero, then there is little difference
among treatments and we would not expect to reject
the hypothesis of equal treatment means.
Decision Rule

• F cannot be negative and has no upper limit.
• For ANOVA, the F test is a right-tailed test.
• Use Appendix F or Excel (or other appropriate
software) to obtain the critical value of F for a given α.
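
As one example of "other appropriate software", the sketch below looks up the right-tail critical value with SciPy, which is an assumption here since the slides name only Excel and Appendix F. The number of groups, the sample size, and the calculated F are hypothetical.

```python
from scipy import stats

alpha = 0.05          # chosen level of significance
c, n = 3, 12          # hypothetical: 3 treatments, 12 observations in total
df1, df2 = c - 1, n - c

# Right-tail critical value: the point with area alpha to its right
f_crit = stats.f.ppf(1 - alpha, df1, df2)

f_calc = 22.75        # hypothetical test statistic, F = MSA/MSE
p_value = stats.f.sf(f_calc, df1, df2)   # right-tail area beyond f_calc

# Right-tailed decision rule: reject H0 if F exceeds the critical value
reject_h0 = f_calc > f_crit

print(f"critical F = {f_crit:.2f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

Comparing F with the critical value and comparing the p-value with α are two forms of the same decision rule, so they always agree.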

11-24
Chapter 11
LO11-5
11.2 One-Factor ANOVA
(Completely Randomized Model)

Decision Rule for an F-test

11-25
Chapter 11
11.2 One-Factor ANOVA
(Completely Randomized Model)
Example: Carton Packing.
Is the variation among stations within the range attributable to
chance, or do these samples indicate actual differences in the
means?

11-26
Chapter 11
11.2 One-Factor ANOVA
(Completely Randomized Model)
Example: Carton Packing.
As a preliminary step, we plot the data to check for any time
pattern and just to visualize the data. We see some potential
differences in means, but no obvious time pattern (otherwise we
would have to consider observation order as a second factor).

Figure 11.6
11-27
Chapter 11
11.2 One-Factor ANOVA
(Completely Randomized Model)
Example: Carton Packing.

11-28
Chapter 11
LO11-6
11.2 One-Factor ANOVA
(Completely Randomized Model)
LO11-6: Use a table or Excel to find critical values for the
F distribution.
Example: Carton Packing.

11-29
Chapter 11
LO11-5
11.2 One-Factor ANOVA
(Completely Randomized Model)
LO11-5: Use Excel or other software for ANOVA calculations.

Example: Carton Packing.

11-30
Chapter 11
11.2 One-Factor ANOVA
(Completely Randomized Model)

Example: Carton Packing.

11-31
Chapter 11
LO11-7 11.3 Multiple Comparison Tests
LO11-7: Understand and perform Tukey's test for paired
means.
Tukey’s Test
• After rejecting the hypothesis of equal means, we naturally want to
know: Which means differ significantly?
• In order to maintain the desired overall probability of type I error, a
simultaneous confidence interval for the difference of means must
be obtained.
• For c groups, there are c(c – 1)/2 distinct pairs of means to be
compared.
• These types of comparisons are called Multiple Comparison
Tests.

11-32
Chapter 11
LO11-7 11.3 Multiple Comparison Tests

Tukey’s Test
• Tukey’s studentized range test (or HSD for
“honestly significant difference” test) is a multiple
comparison test that has good power and is widely
used.
• Named for statistician John Wilder Tukey (1915 –
2000).
• This test is not available in Excel’s Tools > Data
Analysis but is available in MegaStat and Minitab.

11-33
Chapter 11
LO11-7 11.3 Multiple Comparison Tests

Tukey’s Test
• Tukey’s is a two-tailed test for equality of paired means from c
groups compared simultaneously.
• The hypotheses are:
H0: μj = μk
H1: μj ≠ μk

Decision Rule

Reject H0 if Tcalc > Tc,n−c, where Tc,n−c is a critical
value of the Tukey test statistic Tcalc for the desired level of
significance.
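
The formula for Tcalc did not survive on this slide. The sketch below assumes one common form, Tcalc = |ȳj − ȳk| / sqrt(MSE (1/nj + 1/nk)), and applies it to every one of the c(c − 1)/2 pairs using hypothetical data. The critical value is not computed here; it would come from a table or from software.

```python
from itertools import combinations
from math import sqrt
from statistics import mean

# Hypothetical observations for c = 3 treatments (not from the source)
groups = {
    "T1": [10, 12, 11, 13],
    "T2": [14, 15, 13, 16],
    "T3": [9, 8, 10, 9],
}

c = len(groups)
n = sum(len(v) for v in groups.values())

# MSE from the one-factor ANOVA: pooled within-treatment variance
sse = sum((y - mean(v)) ** 2 for v in groups.values() for y in v)
mse = sse / (n - c)

# All distinct pairs of treatments: c(c - 1)/2 of them
pairs = list(combinations(groups, 2))

# Assumed form of the statistic for each pair (j, k)
t_calc = {}
for j, k in pairs:
    diff = abs(mean(groups[j]) - mean(groups[k]))
    std_err = sqrt(mse * (1 / len(groups[j]) + 1 / len(groups[k])))
    t_calc[(j, k)] = diff / std_err

for pair, value in t_calc.items():
    print(pair, round(value, 3))
```

Each pair whose statistic exceeds the critical value would be declared significantly different at the chosen level.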

11-34
Chapter 11
LO11-7 11.3 Multiple Comparison Tests

Tukey’s Test
• For example, here is the upper 5% of studentized range:

11-35
Chapter 11
LO11-8 11.4 Tests for Homogeneity of Variances
LO11-8: Use Hartley's test for equal variances in c treatment
groups.

ANOVA Assumptions
• ANOVA assumes that observations on the response variable are
from normally distributed populations that have the same
variance.
• The one-factor ANOVA test is only slightly affected by inequality
of variance when group sizes are equal.
• One can test this assumption of homogeneous variances by
using Hartley’s Fmax Test.

11-36
Chapter 11
LO11-8 11.4 Tests for Homogeneity of Variances
Hartley’s Test
• The hypotheses are
H0: σ1² = σ2² = … = σc² (equal variances)
H1: Not all the σj² are equal (unequal variances)
• The test statistic is the ratio of the largest sample variance to the
smallest sample variance
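
A short sketch of the ratio described above, computed on hypothetical data with equal group sizes. The variable name h_calc is illustrative only. The result would then be compared with a tabulated critical value of Fmax, which this sketch does not look up.

```python
from statistics import variance

# Hypothetical observations for c = 3 treatments of equal size (not from the source)
groups = {
    "T1": [10, 12, 11, 13],
    "T2": [14, 15, 13, 16],
    "T3": [9, 8, 10, 9],
}

# Sample variance of each treatment group (divisor n_j - 1)
variances = {name: variance(values) for name, values in groups.items()}

# Hartley's statistic: largest sample variance over smallest sample variance
h_calc = max(variances.values()) / min(variances.values())

print(variances)
print(f"largest / smallest variance = {h_calc:.2f}")
```

A ratio close to 1 is consistent with equal variances, while a large ratio is evidence against them.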

11-37
Chapter 11
LO11-8 11.4 Tests for Homogeneity of Variances
Hartley’s Test

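• The decision rule is: reject H0 if the ratio of the largest to the smallest
sample variance exceeds the critical value of Fmax.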

11-38
Chapter 11
LO11-8 11.4 Tests for Homogeneity of Variances
Hartley’s Test

• Assuming equal group sizes, critical values of Fmax are found
using degrees of freedom: numerator c (the number of groups) and
denominator n/c − 1 (one less than the common group size).

11-39
Chapter 11
LO11-8 11.4 Tests for Homogeneity of Variances

Levene’s Test

• Levene’s test is a more robust alternative to Hartley’s
Fmax test.
• Levene’s test does not assume a normal distribution.
• It is based on the distances of the observations from
their sample medians rather than their sample means.
• A computer program (e.g., MINITAB) is needed to
perform this test.
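
To make the median-based idea concrete, the sketch below computes a Levene-type statistic by hand on hypothetical data: it takes the absolute distance of each observation from its group median and then forms a one-factor ANOVA F ratio on those distances. This is an illustrative calculation under those assumptions, not the output of MINITAB; in practice a library routine such as `scipy.stats.levene` with `center='median'` would normally be used and would also give a p-value.

```python
from statistics import mean, median

# Hypothetical observations for c = 3 treatments (not from the source)
groups = {
    "T1": [10, 12, 11, 13],
    "T2": [14, 15, 13, 16],
    "T3": [9, 8, 10, 9],
}

# Absolute distance of each observation from its own group median
distances = {
    name: [abs(y - median(values)) for y in values]
    for name, values in groups.items()
}

c = len(distances)
all_d = [d for ds in distances.values() for d in ds]
n = len(all_d)
grand = mean(all_d)

# One-factor ANOVA on the distances: between-group and within-group sums of squares
between = sum(len(ds) * (mean(ds) - grand) ** 2 for ds in distances.values())
within = sum((d - mean(ds)) ** 2 for ds in distances.values() for d in ds)

# Levene-type statistic: ratio of the two mean squares
w_stat = (between / (c - 1)) / (within / (n - c))

print(f"W = {w_stat:.3f}")
```

A large value of the statistic suggests that the typical distance from the median differs across groups, that is, that the variances are not equal.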

11-40
Chapter 11
LO11-8 11.4 Tests for Homogeneity of Variances

Levene’s Test
for carton-packing data

11-41
Chapter 11
Please refer to your text for information
on Sections 11.5, 11.6 and 11.7

11-42
