Analysis of Variance
(Analysis of Variance)
ANOVA: Introduction
• Many studies involve comparisons between more
than two groups of subjects.
• If the outcome is categorical (count) data, a Chi-square
test for a table larger than 2 × 2 can be used to
compare proportions between groups.
• If the outcome is numerical, ANOVA can be used to
compare the means between groups.
• ANOVA is an abbreviation for the full name of the
method: ANalysis Of Variance
– Invented by R.A. Fisher in the 1920s
Why ANOVA instead of multiple t-tests?
• If you are comparing means between more than two
groups, why not just do several two sample t-tests to
compare the mean from one group with the mean from
each of the other groups?
– Before ANOVA, this was the only option available to
compare means between more than two groups.
• The problem with the multiple t-tests approach is that
as the number of groups increases, the number of two
sample t-tests also increases.
• As the number of tests increases, the probability of
making at least one Type I error also increases.
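The growth in Type I error risk can be sketched numerically. The snippet below counts the two-sample t-tests needed to compare every pair of k groups and estimates the chance of at least one false positive. It assumes the tests are independent, which pairwise tests on shared groups are not, so the result is only a rough approximation. The function names are illustrative and do not come from the slides.

```python
def pairwise_tests(k):
    """Number of two-sample t-tests needed to compare every pair of k groups."""
    return k * (k - 1) // 2


def familywise_error(k, alpha=0.05):
    """Approximate probability of at least one Type I error across all
    pairwise tests, ASSUMING the tests are independent (a simplification)."""
    m = pairwise_tests(k)
    return 1 - (1 - alpha) ** m


if __name__ == "__main__":
    for k in (3, 4, 5, 6):
        print(k, pairwise_tests(k), round(familywise_error(k), 4))
```

Under this approximation, three groups need 3 tests with a combined error rate of about 0.14, and five groups need 10 tests with a rate of about 0.40.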
ANOVA: a single test for multiple comparisons
• The advantage of using ANOVA over multiple
t-tests is that a single test identifies whether any
of the group means are significantly different.
• If the significance level is set at 0.05, the
probability of a Type I error for the ANOVA F-test
is 0.05, regardless of the number of groups being
compared.
• If the ANOVA F-test is significant, further
comparisons can be done to determine which
groups have significantly different means.
ANOVA Hypotheses
• The Null hypothesis for ANOVA is that the means for
all groups are equal:
H0: µ1 = µ2 = µ3 = ... = µk
• The Alternative hypothesis for ANOVA is that at least
two of the means are not equal.
• The test statistic for ANOVA is the ANOVA
F-statistic.
Analysis of Variance
• ANOVA is used to compare means between three or
more groups, so why is it called Analysis of
VARIANCE?
• The ANOVA F-test is a comparison of the average
variability between groups to the average variability
within groups.
– The variability within each group is a measure of the spread of
the data within each of the groups.
– The variability between groups is a measure of the spread of
the group means around the overall mean for all groups
combined.
– F = (average variability between groups) /
(average variability within groups)
ANOVA: F-statistic
• If variability between groups is large relative to
the variability within groups, the F-statistic will be
large.
• If variability between groups is similar or smaller
than variability within groups, the F-statistic will
be small.
• If the F-statistic is large enough, the null
hypothesis that all means are equal is rejected.
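The ratio described above can be sketched in a few lines of code. This is a minimal illustration, assuming the usual one-way ANOVA definitions: the between-groups mean square is the between-groups sum of squares divided by k - 1, and the within-groups mean square is the within-groups sum of squares divided by N - k. The function name and both data sets are made up for illustration.

```python
from statistics import mean


def anova_f(groups):
    """One-way ANOVA F-statistic for a list of groups of numbers."""
    k = len(groups)                          # number of groups
    n = sum(len(g) for g in groups)          # total number of observations
    grand = mean([x for g in groups for x in g])

    # Spread of the group means around the overall mean
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Spread of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


if __name__ == "__main__":
    # Hypothetical data: well-separated means, tight groups (large F)
    print(anova_f([[4, 5, 6], [6, 7, 8], [10, 11, 12]]))
    # Hypothetical data: similar means, widely spread groups (small F)
    print(anova_f([[1, 5, 9], [2, 6, 10], [3, 7, 11]]))
```

The first data set has group means far apart relative to the spread inside each group, so F is large (28). In the second, the group means are close and the groups are widely spread, so F is well below 1.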
Illustration of small F-statistic
[Figure: labels shown are within-group spread, Group 1 mean, Group 2 mean, Group 3 mean, and overall mean.]
ANOVA
Source of Variation   SS    df   MS         F          P-value     F crit
Between Groups        292    2   146        5.732984   0.0141425   3.682317
Within Groups         382   15   25.46667
Total                 674   17
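The entries in the ANOVA table can be re-derived from its sums of squares and degrees of freedom, as a check on how the columns relate. The sketch below uses a closed form of the F distribution that holds only when the numerator degrees of freedom equal 2, as they do here. Other cases need a statistics library. It also assumes a 0.05 significance level for F crit. The function name is illustrative.

```python
def anova_table_check(ss_between, df_between, ss_within, df_within, alpha=0.05):
    """Recompute MS, F, P-value and F crit from SS and df.

    The P-value and F crit formulas are the closed form of the F
    distribution for numerator df = 2 ONLY; other cases are rejected.
    """
    if df_between != 2:
        raise ValueError("closed form below is valid only for df_between == 2")

    ms_between = ss_between / df_between   # mean square = SS / df
    ms_within = ss_within / df_within
    f_stat = ms_between / ms_within

    # P(F > f_stat) for an F(2, df_within) distribution
    p_value = (df_within / (df_within + 2 * f_stat)) ** (df_within / 2)
    # Value of F that leaves probability alpha in the upper tail
    f_crit = (df_within / 2) * (alpha ** (-2 / df_within) - 1)

    return {"MS between": ms_between, "MS within": ms_within,
            "F": f_stat, "P-value": p_value, "F crit": f_crit}


if __name__ == "__main__":
    for name, value in anova_table_check(292, 2, 382, 15).items():
        print(f"{name}: {value:.6f}")
```

Because F (about 5.73) exceeds F crit (about 3.68), and equivalently the P-value (about 0.014) is below 0.05, the null hypothesis that all group means are equal is rejected for this table.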