Group 2 - Chap 11 Analysis of Variance

1) One-way ANOVA can be used to test for differences among the means of three or more groups. It partitions the total variation into variation among groups and variation within groups. 2) The Tukey-Kramer procedure allows for multiple comparisons after a one-way ANOVA. It compares the absolute differences between all pairings of group means to a critical range value. Differences greater than this value are considered statistically significant. 3) Assumptions of one-way ANOVA include random sampling, normality of data, and homogeneity of variances. Levene's test can test the assumption of equal variances among groups.

Uploaded by

Chendie C. Maya

ANALYSIS OF VARIANCE

Baranda, Dulay,
Escobillo, Estares, Liboa
Discussion Outline

The Completely Randomized Design: One-Way ANOVA
||Erica Dulay and April Escobillo||
➔ One-Way ANOVA F Test for Differences Among More than Two Means
➔ Multiple Comparisons: The Tukey-Kramer Procedure

The Randomized Block Design
||Rynheart Estares||
➔ Testing for Factor and Block Effects
➔ Multiple Comparisons: The Tukey Procedure

The Factorial Design: Two-Way Analysis of Variance
||Fhritz Liboa and Jen Baranda||
➔ Testing for Factor and Interaction Effects
➔ Multiple Comparisons: The Tukey Procedure
➔ Visualizing Interaction Effects: The Cell Means Plot
➔ Interpreting Interaction Effects
CHAPTER OVERVIEW
General ANOVA Setting
● Investigator controls one or more factors of interest
○ Each factor contains two or more levels
○ Levels can be numerical or categorical
○ Different levels produce different groups
○ Think of each group as a sample from a different population
● Evaluate differences among more than two groups.
● Groups are classified according to levels of a factor of interest.
● When there is only one factor, the experimental design is called a
Completely Randomized Design.
Completely Randomized Design
● Experimental units (subjects) are assigned randomly to groups
○ Subjects are assumed homogeneous
● Only one factor or independent variable
○ With two or more levels
● Analyzed by one-factor analysis of variance (ANOVA)
The Completely Randomized Design: One-Way
Analysis of Variance
One-Way Analysis of Variance
Evaluate the difference among the means of three or more groups

Assumptions:
● Populations are normally distributed
● Populations have equal variances
● Samples are randomly and independently drawn
Hypotheses of One-Way ANOVA
● Null hypothesis of no difference: H0: μ1 = μ2 = ... = μc
○ All population means are equal
○ i.e., no factor effect (no variation in means among groups)

● Alternative hypothesis: H1: Not all μj are equal
○ At least one population mean is different
○ i.e., there is a factor effect
○ Does not mean that all population means are different (some pairs may
be the same)
One-Way ANOVA
Partitioning the Variation
● Total variation can be split into two parts: SST = SSA + SSW
○ SSA = variation among groups
○ SSW = variation within groups
One-Way ANOVA
1. Compute Sample Means
2. Compute Grand Mean
3. Compute SSA
4. Compute SSW
5. Compute SST
6. Compute Mean Squares: MSA and MSW
7. Compute Fstat
The situation/problem
You are the production manager at the Perfect Parachutes Company.
Parachutes are woven in your factory using a synthetic fiber purchased from
one of four different suppliers. Strength of these fibers is an important
characteristic that ensures quality parachutes. You need to decide whether
the synthetic fibers from each of your four suppliers result in parachutes
of equal strength. Furthermore, your factory uses two types of looms to
produce parachutes, the Jetta and the Turk. You need to establish that the
parachutes woven on both types of looms are equally strong. You also want to
know if any differences in the strength of the parachute that can be attributed
to the four suppliers are dependent on the type of loom used. How would you
go about finding this information?
Step 1: Compute sample means
Step 2: Compute the Grand Mean
Sum all values and divide the sum
by the total number of values: grand mean = (sum of all Xij) / n,
where n is the total number of values in all groups combined.
Step 3: Compute SSA
AMONG-GROUP VARIATION (SUM OF SQUARES AMONG GROUPS)
● Sum the squared differences between the sample mean of each group and
the grand mean, weighted by the sample size in each group.
Step 4: Compute SSW
WITHIN-GROUP VARIATION (SUM OF SQUARES WITHIN GROUPS)
● Sum the squared differences between each value in a group and the mean
of its own group.
Step 5: Compute SST
TOTAL VARIATION (SUM OF SQUARES TOTAL)
● Sum the squared differences between each individual value and the
grand mean.
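The verbal definitions in Steps 2 to 5 can be written compactly as follows. The notation is an assumption, since the slide formulas did not survive extraction: X_{ij} is the i-th value in group j, n_j is the size of group j, c is the number of groups, n is the total sample size, \bar{X}_j is the mean of group j, and \bar{\bar{X}} is the grand mean.

```latex
\begin{align*}
\bar{\bar{X}} &= \frac{1}{n}\sum_{j=1}^{c}\sum_{i=1}^{n_j} X_{ij} \\
SSA &= \sum_{j=1}^{c} n_j \left(\bar{X}_j - \bar{\bar{X}}\right)^2 \\
SSW &= \sum_{j=1}^{c}\sum_{i=1}^{n_j} \left(X_{ij} - \bar{X}_j\right)^2 \\
SST &= \sum_{j=1}^{c}\sum_{i=1}^{n_j} \left(X_{ij} - \bar{\bar{X}}\right)^2 = SSA + SSW
\end{align*}
```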
Step 6: Compute Mean Squares
The Mean Squares are obtained by dividing the various sum of squares by
their associated degrees of freedom.

Mean Square Among: MSA = SSA / (c – 1)   (d.f. = c – 1)
Mean Square Within: MSW = SSW / (n – c)   (d.f. = n – c)
Mean Square Total: MST = SST / (n – 1)   (d.f. = n – 1)
Step 7: Compute Fstat
● Fstat = MSA / MSW
● For a given level of significance, you reject the null hypothesis if
the Fstat test statistic is greater than the upper-tail critical value.
● Fstat follows an F distribution
Degrees of Freedom
○ df1 = c – 1 (numerator)
○ df2 = n – c (denominator)
● The ratio is never negative, because both mean squares are built from
squared differences
● df1 = c – 1 will typically be small
● df2 = n – c will typically be large
Because Fstat = 3.4616 is greater than the upper-tail critical value of
3.24, you reject the null hypothesis.

You conclude that there is a significant difference in the mean tensile
strength among the four suppliers.
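One-Way ANOVA Summary Table

Source          df       SS     MS                      F
Among groups    c – 1    SSA    MSA = SSA / (c – 1)     Fstat = MSA / MSW
Within groups   n – c    SSW    MSW = SSW / (n – c)
Total           n – 1    SST

c = number of groups
n = sum of the sample sizes from all groups
df = degrees of freedom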
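The seven steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the presenters' worksheet: the function name `one_way_anova` and the three small samples are hypothetical, chosen only so the arithmetic is easy to check by hand.

```python
from statistics import mean


def one_way_anova(groups):
    """Compute the one-way ANOVA quantities for a list of samples."""
    all_values = [x for g in groups for x in g]
    n = len(all_values)          # total sample size
    c = len(groups)              # number of groups
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]

    # SSA: squared distance of each group mean from the grand mean,
    # weighted by the group size
    ssa = sum(len(g) * (m - grand_mean) ** 2
              for g, m in zip(groups, group_means))
    # SSW: squared distance of each value from its own group mean
    ssw = sum((x - m) ** 2
              for g, m in zip(groups, group_means) for x in g)
    # SST: squared distance of each value from the grand mean
    sst = sum((x - grand_mean) ** 2 for x in all_values)

    msa = ssa / (c - 1)
    msw = ssw / (n - c)
    return {"SSA": ssa, "SSW": ssw, "SST": sst,
            "MSA": msa, "MSW": msw, "Fstat": msa / msw}


# Hypothetical data (NOT the parachute data from the slides)
samples = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]
result = one_way_anova(samples)
print(result)
```

For these made-up samples SST = SSA + SSW (48 = 42 + 6) and Fstat = 21. The decision still requires comparing Fstat with the upper-tail critical value for c – 1 and n – c degrees of freedom from an F table.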
MULTIPLE COMPARISONS:
The Tukey-Kramer Procedure

● Tells which population means are significantly different
○ e.g.: μ1 = μ2 ≠ μ3
○ Done after rejection of equal means in ANOVA
● Allows paired comparisons
○ Compare absolute mean differences with critical range
4 Steps to Construct Comparisons:
1. Compute the absolute mean differences, |X̄j – X̄j’| (where j ≠ j’), among all
c(c - 1)/2 pairs of sample means.
2. Compute the critical range for the Tukey-Kramer procedure.
3. Compare each of the c(c - 1)/2 pairs of means against its corresponding
critical range. You declare a specific pair significantly different if the
absolute difference in the sample means, |X̄j – X̄j’|, is greater than the
critical range.
4. Interpret the results.
Step 1: Compute absolute mean differences
a. There are four (4) suppliers, thus there are 4 (4-1)/2 = 6 pairwise
comparisons.
b. Compute the absolute mean differences for all six pairwise comparisons:
Find the Qα value
Find the Qα value from the table in Appendix E.7 with c = 4 and n - c =
20 - 4 = 16 degrees of freedom. The upper-tail critical value of the test
statistic is 4.05.
Step 2: Compute the Critical Range

Critical range = Qα √[ (MSW / 2) (1/nj + 1/nj’) ]

where:
Qα = Upper-tail critical value from a Studentized range distribution with
c and n - c degrees of freedom (see Table E.7 in the appendix)
MSW = Mean Square Within
nj and nj’ = Sample sizes from groups j and j’

Thus, in this example, the critical range is 4.4712.

Step 3: Compare each of the pairs of means against
its corresponding critical range.
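Suppliers 1 and 2: 4.74 > 4.4712
Suppliers 1 and 3: absolute difference < 4.4712
Suppliers 1 and 4: absolute difference < 4.4712
Suppliers 2 and 3: absolute difference < 4.4712
Suppliers 2 and 4: absolute difference < 4.4712
Suppliers 3 and 4: absolute difference < 4.4712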

NOTE: Because 4.74 > 4.4712, there is a significant difference between the
means of Suppliers 1 and 2. All other pairwise differences are less than 4.4712.
Step 4: Interpret the results
With 95% confidence, you can conclude that parachutes woven using fiber
from Supplier 1 have a lower mean tensile strength than those from Supplier
2, but there are no statistically significant differences between Suppliers
1 and 3, Suppliers 1 and 4, Suppliers 2 and 3, Suppliers 2 and 4, and
Suppliers 3 and 4.
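The critical range used in Steps 2 and 3 can be checked with a short sketch. Qα = 4.05 and the difference 4.74 come from the slides. Two inputs are assumptions: the group sizes of 5 (20 observations split evenly over four suppliers) and MSW = 6.094, which the extracted slides do not state and which is back-solved here so that the result agrees with the quoted 4.4712. The function name is hypothetical.

```python
from math import sqrt


def tukey_kramer_critical_range(q_alpha, msw, n_j, n_j_prime):
    """Critical range = Q_alpha * sqrt((MSW / 2) * (1/n_j + 1/n_j'))."""
    return q_alpha * sqrt((msw / 2) * (1 / n_j + 1 / n_j_prime))


Q_ALPHA = 4.05   # from the slides (c = 4, n - c = 16)
MSW = 6.094      # ASSUMED: not in the slides, back-solved from 4.4712
N_PER_GROUP = 5  # ASSUMED: 20 observations / 4 suppliers

critical_range = tukey_kramer_critical_range(Q_ALPHA, MSW,
                                             N_PER_GROUP, N_PER_GROUP)

diff_1_2 = 4.74  # |mean of Supplier 1 - mean of Supplier 2|, from the slides
significant = diff_1_2 > critical_range
print(round(critical_range, 4), significant)
```

Because the formula takes both sample sizes, the same function covers unequal group sizes, in which case each pair gets its own critical range.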
ANOVA Assumptions
● Randomness and Independence
○ Select random samples from the c groups (or randomly assign the levels)
● Normality
○ The sample values for each group are from a normal population
● Homogeneity of Variance
○ All populations sampled from have the same variance
○ Can be tested with Levene’s Test
Levene Test for Homogeneity of Variance

● Tests whether the variances of two or more groups are approximately
equal (homogeneous).
● A powerful yet simple procedure for testing the equality of variances.
To test for the homogeneity of variance, you use the following null hypothesis:
H0: σ1² = σ2² = ... = σc²
Against the alternative hypothesis:
H1: Not all σj² are equal (j = 1, 2, ..., c)
2 Steps to Test the Null Hypothesis of Equal
Variances
1. Compute the absolute value of the difference between each value and
the median of its group.
2. Perform one-way ANOVA on these absolute differences.
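The two steps can be sketched as follows: replace every value by its absolute deviation from its group median, then compute the ordinary one-way ANOVA Fstat on those deviations. The helper names and the data are hypothetical; only the procedure itself is taken from the slides.

```python
from statistics import mean, median


def anova_fstat(groups):
    """One-way ANOVA Fstat = MSA / MSW for a list of samples."""
    values = [x for g in groups for x in g]
    n, c = len(values), len(groups)
    grand = mean(values)
    ssa = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ssa / (c - 1)) / (ssw / (n - c))


def levene_fstat(groups):
    """Median-based Levene statistic, following the two steps above."""
    # Step 1: absolute difference between each value and its group median
    abs_dev = [[abs(x - median(g)) for x in g] for g in groups]
    # Step 2: one-way ANOVA on the absolute differences
    return anova_fstat(abs_dev)


# Hypothetical data (NOT the parachute data from the slides)
samples = [[1, 2, 3], [2, 4, 6], [1, 5, 9]]
levene_f = levene_fstat(samples)
print(levene_f)
```

As with the ANOVA itself, the resulting Fstat is compared with the upper-tail critical value for c – 1 and n – c degrees of freedom.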
Step 1: Compute the Absolute Value
Step 2.1: Compute the Grand Mean
Sum all absolute differences and divide the sum
by the total number of values.
Step 2.2: Compute SSA
Step 2.3: Compute SSW
Step 2.4: Compute SST
Step 2.5: Compute Mean Squares
The Mean Squares are obtained by dividing the various sum of squares by
their associated degrees of freedom.

Mean Square Among: MSA = SSA / (c – 1)   (d.f. = c – 1)
Mean Square Within: MSW = SSW / (n – c)   (d.f. = n – c)
Mean Square Total: MST = SST / (n – 1)   (d.f. = n – 1)
Step 2.6: Compute Fstat
Because Fstat = 0.2068 is less than the upper-tail critical value of 3.24, you do not reject
the null hypothesis.

You conclude that there is no evidence of a significant difference among the four variances.
In other words, it is reasonable to assume that the materials from the four suppliers
produce parachutes with an equal amount of variability. Therefore, the homogeneity-of-
variance assumption for the ANOVA procedure is justified.
The Randomized Block Design
Randomized Block Design
● Evaluates differences among more than two groups that contain matched
samples or repeated measures that have been placed in blocks.
● Blocking removes as much variability as possible from measures of random
error so that differences among groups are more evident.
● Levels of the secondary factor are called blocks
Partitioning the Variation

SST = SSA + SSBL + SSE

SST = Total variation
SSA = Among-Group variation
SSBL = Among-Block variation
SSE = Error variation
Note: SST and SSA are computed as they were in One-Way ANOVA
Sum of Squares for AMONG-Group
Total Sum of Squares
Sum of Squares for BLOCK
Sum of Squares for ERROR

SSA = 1,787.46
SST = 2,295.63
SSBL = 283.375

SSE = SST – (SSA + SSBL) = 2,295.63 – (1,787.46 + 283.375) = 224.795
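The four sums of squares named above can be written out as follows. The notation is an assumption, because the slide formulas were lost in extraction: X_{ij} is the value in block i for group j, r is the number of blocks, c is the number of groups, \bar{X}_{.j} is the mean of group j, \bar{X}_{i.} is the mean of block i, and \bar{\bar{X}} is the grand mean.

```latex
\begin{align*}
SST  &= \sum_{j=1}^{c}\sum_{i=1}^{r} \left(X_{ij} - \bar{\bar{X}}\right)^2 \\
SSA  &= r \sum_{j=1}^{c} \left(\bar{X}_{.j} - \bar{\bar{X}}\right)^2 \\
SSBL &= c \sum_{i=1}^{r} \left(\bar{X}_{i.} - \bar{\bar{X}}\right)^2 \\
SSE  &= \sum_{j=1}^{c}\sum_{i=1}^{r}
        \left(X_{ij} - \bar{X}_{.j} - \bar{X}_{i.} + \bar{\bar{X}}\right)^2
      = SST - (SSA + SSBL)
\end{align*}
```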


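Randomized Block ANOVA Table

Source          df                 SS      MS                                F
Among groups    c – 1              SSA     MSA = SSA / (c – 1)               MSA / MSE
Among blocks    r – 1              SSBL    MSBL = SSBL / (r – 1)             MSBL / MSE
Error           (r – 1)(c – 1)     SSE     MSE = SSE / [(r – 1)(c – 1)]
Total           rc – 1             SST

c = number of populations
r = number of blocks
rc = total number of observations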


Testing for Factor Effect
Main Factor Test:
df1 = c – 1 = 4 – 1 = 3
df2 = (r – 1)(c – 1) = (6 – 1)(4 – 1) = 15

See Table E.5 to get the Critical Value for F

Testing for Block Effect
Blocking Test:
df1 = r – 1 = 6 – 1 = 5
df2 = (r – 1)(c – 1) = (6-1)(4-1) = 15

See Table E.5 to get the Critical Value for F

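The two test statistics can be reproduced from the sums of squares quoted earlier. This sketch uses only numbers stated in the slides (SSA, SST, SSBL, r = 6, c = 4); the variable names are illustrative, and the critical values still have to be read from Table E.5.

```python
# Sums of squares quoted in the slides
ssa, sst, ssbl = 1787.46, 2295.63, 283.375
r, c = 6, 4  # number of blocks, number of groups

sse = sst - (ssa + ssbl)

# Mean squares: each sum of squares divided by its degrees of freedom
msa = ssa / (c - 1)
msbl = ssbl / (r - 1)
mse = sse / ((r - 1) * (c - 1))

# Fstat for the main factor and for the blocks share the same denominator
f_factor = msa / mse   # df1 = c - 1 = 3, df2 = (r - 1)(c - 1) = 15
f_block = msbl / mse   # df1 = r - 1 = 5, df2 = (r - 1)(c - 1) = 15

print(f"MSE = {mse:.3f}, MSBL = {msbl:.3f}")
print(f"Fstat (factor) = {f_factor:.4f}, Fstat (blocks) = {f_block:.4f}")
```

Each Fstat is compared with the upper-tail critical value for its own pair of degrees of freedom; H0 is rejected when Fstat is larger.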
Estimated Relative Efficiency
r = 6
c = 4
MSBL = 56.675
MSE = 14.986

A relative efficiency of 1.6 means that it would take 1.6 times as many observations in a
one-way ANOVA design as in the randomized block design in order to have the same
precision in comparing the restaurants.
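The value 1.6 can be checked from the four inputs listed above. The formula is not visible in the extracted slides, so the sketch assumes the usual textbook definition RE = [(r – 1)·MSBL + r(c – 1)·MSE] / [(rc – 1)·MSE]; the function name is hypothetical.

```python
def relative_efficiency(r, c, msbl, mse):
    """Estimated relative efficiency of blocking vs. a one-way design.

    ASSUMED formula (not shown in the extracted slides):
    RE = ((r - 1) * MSBL + r * (c - 1) * MSE) / ((r * c - 1) * MSE)
    """
    return ((r - 1) * msbl + r * (c - 1) * mse) / ((r * c - 1) * mse)


# Inputs quoted in the slides
re_value = relative_efficiency(r=6, c=4, msbl=56.675, mse=14.986)
print(round(re_value, 2))
```

A value above 1 indicates that blocking removed enough variability to make the randomized block design more efficient than a completely randomized design.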
Multiple Comparisons: The Tukey Procedure

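Determine which groups are significantly different from the others.

Critical range = Qα √(MSE / r)
where Qα is the upper-tail critical value from a Studentized range distribution
with c and (r – 1)(c – 1) degrees of freedom.

In this example the critical range is 6.448. Of the six pairwise comparisons,
five absolute mean differences are greater than 6.448 and one is less than 6.448.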
FACTORIAL DESIGN:
TWO-WAY ANOVA
FACTORIAL DESIGN: TWO-WAY ANOVA
EXAMINES THE EFFECT OF
❏ Two factors of interest on the dependent variable
e.g., age and gender of employees
❏ Interaction between the different levels of these two factors
e.g., employees have been classified by gender into two levels:
male and female
e.g., employees have been classified by age into three levels:
(a) less than 40, (b) 40 to 55, and (c) above 55
FACTORIAL DESIGN: TWO-WAY ANOVA
ASSUMPTIONS
❏ Populations are normally distributed
❏ Populations have equal variances
❏ Independent random samples are drawn
FACTORIAL DESIGN: TWO-WAY ANOVA
TWO FACTORS OF INTEREST: A AND B
FACTORIAL DESIGN: TWO-WAY ANOVA
TWO-WAY ANOVA SOURCES OF VARIATION
FACTORIAL DESIGN: TWO-WAY ANOVA
TWO-WAY ANOVA EQUATIONS

TOTAL VARIATION IN TWO-WAY ANOVA

FACTOR A VARIATION

FACTOR B VARIATION

INTERACTION VARIATION

RANDOM VARIATION IN TWO-WAY ANOVA


FACTORIAL DESIGN: TWO-WAY ANOVA
WHERE
r = number of levels of factor A
c = number of levels of factor B
n’ = number of replications in each cell
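The five sources of variation listed above can be written out with these definitions of r, c, and n'. The remaining notation is an assumption, since the slide equations did not survive extraction: X_{ijk} is the k-th replicate in the cell for level i of factor A and level j of factor B, \bar{X}_{i..} and \bar{X}_{.j.} are the factor-level means, \bar{X}_{ij.} is the cell mean, and \bar{\bar{X}} is the grand mean.

```latex
\begin{align*}
SST  &= \sum_{i=1}^{r}\sum_{j=1}^{c}\sum_{k=1}^{n'}
        \left(X_{ijk} - \bar{\bar{X}}\right)^2 \\
SSA  &= c\,n' \sum_{i=1}^{r} \left(\bar{X}_{i..} - \bar{\bar{X}}\right)^2 \\
SSB  &= r\,n' \sum_{j=1}^{c} \left(\bar{X}_{.j.} - \bar{\bar{X}}\right)^2 \\
SSAB &= n' \sum_{i=1}^{r}\sum_{j=1}^{c}
        \left(\bar{X}_{ij.} - \bar{X}_{i..} - \bar{X}_{.j.} + \bar{\bar{X}}\right)^2 \\
SSE  &= \sum_{i=1}^{r}\sum_{j=1}^{c}\sum_{k=1}^{n'}
        \left(X_{ijk} - \bar{X}_{ij.}\right)^2 \\
SST  &= SSA + SSB + SSAB + SSE
\end{align*}
```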
FACTORIAL DESIGN: TWO-WAY ANOVA
MEAN SQUARE CALCULATIONS
FACTORIAL DESIGN: TWO-WAY ANOVA
THE F TEST STATISTICS
F Test for Factor A Effect
H0: μ1.. = μ2.. = μ3.. = • • = μr..
H1: Not all μi.. are equal
FSTAT = MSA / MSE; reject H0 if FSTAT > Fα

F Test for Factor B Effect
H0: μ.1. = μ.2. = μ.3. = • • = μ.c.
H1: Not all μ.j. are equal
FSTAT = MSB / MSE; reject H0 if FSTAT > Fα

F Test for Interaction Effect
H0: the interaction of A and B is equal to zero
H1: the interaction of A and B is not zero
FSTAT = MSAB / MSE; reject H0 if FSTAT > Fα
FACTORIAL DESIGN: TWO-WAY ANOVA
TWO-WAY ANOVA SUMMARY TABLE
Tensile Strengths of Parachutes Woven by Two Types of Looms,
Using Synthetic Fibers from Four Suppliers
As production manager at Perfect Parachutes, the business problem
you decided to examine involved not just the different suppliers but
also whether parachutes woven on the Jetta looms are as strong as
those woven on the Turk looms. In addition, you need to determine
whether any differences among the four suppliers in the strength of the
parachutes are dependent on the type of loom being used. Thus, you
have decided to collect the data by performing an experiment in which
five different parachutes from each supplier are manufactured on each
of the two different looms.
Step 1: Compute Sample Means
Step 2: Compute Sources of Variation
Step 2.1: Compute SSA
SSA = 6.9723
Step 2.2: Compute SSB
SSB = 134.3488
Step 2.3: Compute SSAB
SSAB = 0.2867
Step 2.4: Compute SSE
SSE = 275.5920
Step 3: Compute Degree of Freedom
Degrees of Freedom

Factor A (SSA): r – 1
Factor B (SSB): c – 1
Interaction (SSAB): (r – 1)(c – 1)
Error (SSE): rc(n’ – 1)
Total (SST): n – 1
Step 4: Compute Mean Squares
Step 5: Compute Fstat
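Steps 3 to 5 can be reproduced from the sums of squares in Step 2. The sketch uses only values stated in the slides: r = 2 looms (factor A), c = 4 suppliers (factor B), and n' = 5 parachutes per cell. The dictionary layout is just one convenient way to organize the arithmetic.

```python
# Sums of squares from Step 2 of the slides
ss = {"A": 6.9723, "B": 134.3488, "AB": 0.2867, "E": 275.5920}
r, c, n_prime = 2, 4, 5  # looms, suppliers, replicates per cell

# Step 3: degrees of freedom for each source of variation
df = {
    "A": r - 1,
    "B": c - 1,
    "AB": (r - 1) * (c - 1),
    "E": r * c * (n_prime - 1),
}

# Step 4: mean square = sum of squares / degrees of freedom
ms = {source: ss[source] / df[source] for source in ss}

# Step 5: every Fstat uses MSE as the denominator
f_stat = {source: ms[source] / ms["E"] for source in ("A", "B", "AB")}

for source in ("A", "B", "AB"):
    print(f"{source}: df = {df[source]}, MS = {ms[source]:.4f}, "
          f"Fstat = {f_stat[source]:.4f}")
```

The resulting Fstat values agree with the ones used in the tests that follow. Because every Fstat is a ratio of two non-negative mean squares, none of them can be negative.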
Step 6: Interpret Results
Testing for Factor and Interaction Effects
F Test for Interaction Effect
Reject H0 if FSTAT > Fα

Because FSTAT = 0.0111 < 2.9011 or the p-value = 0.9984 > 0.05, you do not
reject H0. You conclude that there is insufficient evidence of an interaction
effect between loom and supplier.
Testing for Factor and Interaction Effects
F Test for Factor A Effect
Reject H0 if FSTAT > Fα

Because FSTAT = 0.8096 < 4.1491 or the p-value = 0.3750 > 0.05, you do not
reject H0. You conclude that there is insufficient evidence of a difference
in the mean tensile strength of the parachutes between the two looms.
Testing for Factor and Interaction Effects
F Test for Factor B Effect
Reject H0 if FSTAT > Fα

Because FSTAT = 5.1999 > 2.9011 or the p-value = 0.0049 < 0.05, you reject
H0. You conclude that there is evidence of a difference in the mean tensile
strength of the parachutes among the suppliers.
MULTIPLE COMPARISONS: THE TUKEY PROCEDURE
If one or both of the factor effects are significant and
there is no significant interaction effect, when there
are more than two levels of a factor, you can
determine the particular levels that are significantly
different by using the Tukey multiple comparisons
procedure for two-way ANOVA.
MULTIPLE COMPARISONS: THE TUKEY PROCEDURE
Using the α = 0.05 level of significance:
❑ The interaction effect is not significant.
❑ There is no evidence of a significant difference between the two looms
(Jetta and Turk) that comprise factor A.
❑ There is evidence of a significant difference among the four suppliers
that comprise factor B.

➔ Thus, you can use the Tukey multiple comparisons procedure to
determine which of the four suppliers differ.
MULTIPLE COMPARISONS: THE TUKEY PROCEDURE
CRITICAL RANGE FOR FACTOR A

Critical range = Qα √(MSE / (c n’))

where Qα is the upper-tail critical value from a Studentized range distribution having r and
rc(n’ – 1) degrees of freedom.

CRITICAL RANGE FOR FACTOR B

Critical range = Qα √(MSE / (r n’))

where Qα is the upper-tail critical value from a Studentized range distribution
having c and rc(n’ – 1) degrees of freedom.
MULTIPLE COMPARISONS: THE TUKEY PROCEDURE
1. Compute the absolute mean differences:
Because there are four suppliers, there are
4(4-1)/2= 6 pairwise comparisons.
MULTIPLE COMPARISONS: THE TUKEY PROCEDURE
2. Find the Qα value from the table in Appendix E.7 [for α = 0.05, c = 4, and
rc(n’ – 1) = 32 degrees of freedom]:

Qα = 3.84

3. Compute the critical range:

Critical Range = 3.56

4. Compare:
Because 4.93 > 3.56, only the means of Suppliers 1 and 2 are different. You can
conclude that the mean tensile strength is lower for Supplier 1 than for Supplier 2.

There are no statistically significant differences between Suppliers 1 and 3,
Suppliers 1 and 4, Suppliers 2 and 3, Suppliers 2 and 4, and Suppliers 3 and 4.
Note that by using α = 0.05, you are able to make all six comparisons with an
overall error rate of only 5%.
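The critical range of 3.56 for the suppliers (factor B) can be checked from numbers already given in the slides: SSE = 275.5920, r = 2, c = 4, n' = 5, and Qα = 3.84. The formula Qα·√(MSE / (r·n')) is the standard one for factor B and is an assumption here, since the slide equation was lost in extraction.

```python
from math import sqrt

# Values quoted in the slides
sse = 275.5920
r, c, n_prime = 2, 4, 5   # looms, suppliers, replicates per cell
q_alpha = 3.84            # Studentized range value for c = 4, df = 32

# MSE = SSE / [rc(n' - 1)]
mse = sse / (r * c * (n_prime - 1))

# ASSUMED formula for factor B: Q_alpha * sqrt(MSE / (r * n'))
critical_range_b = q_alpha * sqrt(mse / (r * n_prime))

diff_1_2 = 4.93  # |mean of Supplier 1 - mean of Supplier 2|, from the slides
print(round(critical_range_b, 2), diff_1_2 > critical_range_b)
```

Each supplier mean is based on r·n' = 10 observations, which is why that product appears in the denominator.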
VISUALIZING & INTERPRETING INTERACTION
EFFECTS: The Cell Means Plot
A cell mean is the mean of the observations in one cell, that is, one
combination of a level of factor A and a level of factor B. Cell means are
plotted to check whether the differences among the levels of one factor
depend on the level of the other factor (an interaction effect).
As production manager, the business problem you decided to
examine involved not just the different suppliers but also
whether parachutes woven on the Jetta looms are as strong
as those woven on the Turk looms. In addition, you need to
determine whether any differences among the four suppliers
in the strength of the parachutes are dependent on the type
of loom being used.
Tensile Strengths of Parachutes Woven by Two
Types of Loom
This cell means plot shows two roughly parallel series. Although they are
slightly shifted from one another, the parallel pattern indicates that
there is no interaction between Factors A and B.
Since the p-value = 0.9984 > 0.05, we can conclude that there is insufficient
evidence of an interaction effect between loom and supplier.

If the interaction effect is significant, further analysis will focus on this
interaction. If the interaction effect is not significant, you can focus on the
main effects: potential differences in looms (factor A) and potential
differences in suppliers (factor B).
[Figure: cell means plot with one line per level of factor A (A1, A2) across the levels of factor B (B1, B2, B3, B4)]
If there is interaction, the lines would be nonparallel. Some levels of
factor A would respond better with certain levels of factor B.

The difference between the looms is then no longer the same for all suppliers.
THE END
