One-Way ANOVA Lecture Notes
Concepts:
Comparing Several Means
The Analysis of Variance F Test
The Idea of Analysis of Variance
Conditions for ANOVA
F Distributions and Degrees of Freedom
Objectives:
Describe the problem of multiple comparisons.
Describe the idea of analysis of variance.
Check the conditions for ANOVA.
Describe the F distributions.
Conduct and interpret an ANOVA F test.
References:
Moore, D. S., Notz, W. I., & Fligner, M. A. (2013). The basic practice of statistics (6th
ed.). New York, NY: W. H. Freeman and Company.
Introduction
ANOVA (analysis of variance) analyzes variation: how spread apart the
individuals are within each group, as well as how far apart the different
groups are from one another.
Although there are many types of analysis of variance, these notes will focus
on the simplest type of ANOVA, which is called the one-way analysis of
variance.
ANOVA – Example (Hypotheses)
Null Hypothesis: In the population, the mean test scores are the same across
the three cohorts of students.
H0: μ_face-to-face = μ_hybrid = μ_online
Study Background
This example illustrates how to conduct a one-way analysis of variance to
compare the means of three groups.
In this example, the researcher would like to determine which method of
instruction is the most effective: face-to-face, online, or hybrid (which
includes both face-to-face meetings and online sessions).
To investigate this, the researcher randomly assigns the participants in the
study to one of the three cohorts. The participants have similar levels of
background knowledge on the topic and they take exactly the same course
from the same instructor. The only way the courses differ is the type of
instructional delivery. At the end of the course, all students take the same
achievement test to measure the extent to which they mastered the course
objectives.
The basic conditions for inference are that there is a random sample from each
population and that each population is Normally distributed.
The alternative hypothesis is that there is some difference. That is, not all
means are equal.
This hypothesis is not one-sided or two-sided; it is “many-sided.”
Multiple Comparisons
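When statistical analyses involving multiple comparisons are conducted, there are
two main steps that must be followed:
First, an overall test to determine whether there is good evidence of any
differences among the population means.
Second, a detailed follow-up (post-hoc) analysis to determine which means differ.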
The main idea of an ANOVA is that if researchers ask if a set of sample means gives
evidence of differences in the population means, what matters is not how far apart
the sample means are, but how far apart they are relative to the variability of
individual observations.
Example:
The three sample means are the same in set (a) as in set (b). However, the
range of each sample in (a) is much larger.
The variation among the sample means for (a) is therefore identical to that for (b).
The variation among the individuals within the three samples is much smaller
for (b), so the same differences in means give stronger evidence in (b).
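The comparison of sets (a) and (b) can be sketched numerically. The scores below are made up for illustration (they are not the data behind the figure), and the helper name f_statistic is an assumption of this sketch. Both sets have group means of 10, 12, and 14, but set (a) has much more spread within each group.

```python
from statistics import mean

def f_statistic(groups):
    """Ratio of between-group variation to within-group variation."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    k = len(groups)              # number of groups
    n_total = len(all_scores)    # total number of observations
    # Variation among the sample means, weighted by group size
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individuals around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Hypothetical data: identical group means, different within-group spread
set_a = [[4, 10, 16], [6, 12, 18], [8, 14, 20]]    # wide spread
set_b = [[9, 10, 11], [11, 12, 13], [13, 14, 15]]  # narrow spread

f_a = f_statistic(set_a)
f_b = f_statistic(set_b)
print(f_a, f_b)
```

With these made-up numbers the ratio is far larger for set (b), even though the means are the same in both sets.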
In the example, software can run some descriptive analyses for the test
scores obtained from the students in the three groups.
When statistical software is used to run ANOVAs, a table of descriptive
statistics, like the one shown below, is included in the output.
However, it is always a good idea to look at the descriptive statistics before
conducting the analysis.
In this case, it is clear that students in the hybrid cohort had the highest
average score, followed by the online cohort and the face-to-face cohort.
(Face-to-face instruction is labeled as traditional instruction in this data set.)
The boxplots show that the hybrid group has the highest median, but also has
the largest spread (both the box and whiskers are longer), so there is more
variability in scores compared to the other two groups.
This information is taken into account when computing the ANOVA test
statistic.
The test statistic will tell researchers whether the differences between these
means are statistically significant.
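A minimal sketch of the descriptive step, using only the Python standard library. The scores are hypothetical and are NOT the data from the study described in these notes. They were chosen only so that the hybrid group has the highest mean and the largest spread, as in the description above.

```python
from statistics import mean, median, stdev

# Hypothetical scores (assumed for illustration only)
scores = {
    "traditional": [70, 72, 75, 78, 80],
    "online": [72, 76, 78, 81, 83],
    "hybrid": [70, 78, 84, 90, 98],
}

def describe(values):
    """Basic descriptive statistics for one group."""
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "sd": stdev(values),      # sample standard deviation
        "min": min(values),
        "max": max(values),
    }

summary = {name: describe(vals) for name, vals in scores.items()}
for name, stats in summary.items():
    print(name, stats)
```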
The ANOVA F Statistic
The general formula for computing the test statistic is a ratio of the variation
between sample means and the variation within the groups.
The computation process for the F statistic is a little more tedious than for
the t or the z test statistics.
It can be computed by hand, but the computation involves many steps.
Statistical software is usually used to compute the F statistic.
The F statistic is never negative and, just like with other test statistics,
a larger value gives a smaller p-value (for given degrees of freedom) and
therefore stronger evidence for rejecting the null hypothesis.
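The ratio described above can be written out as follows. The symbols are assumptions of this sketch, since the notes do not define notation: k is the number of groups, n_i, \bar{x}_i, and s_i are the size, mean, and standard deviation of sample i, \bar{x} is the mean of all observations, and N is the total number of observations.

```latex
F = \frac{MS_{\text{between}}}{MS_{\text{within}}},
\qquad
MS_{\text{between}} = \frac{\sum_{i=1}^{k} n_i\,(\bar{x}_i - \bar{x})^2}{k - 1},
\qquad
MS_{\text{within}} = \frac{\sum_{i=1}^{k} (n_i - 1)\,s_i^2}{N - k}
```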
ANOVA – Degrees of Freedom
Generally, the degrees of freedom are calculated using statistical software,
rather than by hand.
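For reference, the degrees of freedom of a one-way ANOVA follow directly from the number of groups and the group sizes. The sketch below assumes three groups of 20 students each. These sizes are hypothetical (the notes do not give the sample sizes here), and the function name is illustrative.

```python
def anova_degrees_of_freedom(group_sizes):
    """Degrees of freedom for a one-way ANOVA table."""
    k = len(group_sizes)          # number of groups
    n_total = sum(group_sizes)    # total number of observations
    return {
        "between": k - 1,         # numerator df of the F statistic
        "within": n_total - k,    # denominator df of the F statistic
        "total": n_total - 1,
    }

df = anova_degrees_of_freedom([20, 20, 20])  # hypothetical cohort sizes
print(df)
```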
The next step of an ANOVA is to run the statistical analysis and interpret the
results.
The first column of the table, labeled “Sum of Squares,” shows the sum of
squared deviations between groups and within groups, as well as the total
sum for the entire sample.
There is also a column for degrees of freedom, labeled “df,” for each type of
comparison and for the total sample.
The “Mean Square” column shows the average variation of scores between
groups and within each group.
These values are obtained by dividing the values in the “Sum of Squares”
column by the corresponding degrees of freedom.
The Within Groups variance is also called error because it is the variation
among individuals that is not explained by group membership. The more
homogeneous the groups are, the smaller this error term is.
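The columns described above can be reproduced in a few lines. This is a sketch under stated assumptions: the scores are hypothetical (not the study data), and the function name and dictionary layout are illustrative rather than the output format of any particular statistical package.

```python
from statistics import mean

def one_way_anova_table(groups):
    """Sum of Squares, df, and Mean Square rows, plus the F statistic."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    k, n_total = len(groups), len(all_scores)

    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k

    # Mean Square = Sum of Squares / degrees of freedom
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "between": {"ss": ss_between, "df": df_between, "ms": ms_between},
        "within": {"ss": ss_within, "df": df_within, "ms": ms_within},
        "total": {"ss": ss_between + ss_within, "df": n_total - 1},
        "F": ms_between / ms_within,
    }

# Hypothetical scores: traditional, online, hybrid
table = one_way_anova_table([
    [70, 72, 75, 78, 80],
    [72, 76, 78, 81, 83],
    [70, 78, 84, 90, 98],
])
print(table)
```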
The p-value is lower than the alpha level of .05, so the null hypothesis can be
rejected in favor of the alternative hypothesis.
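To make the decision rule concrete, the sketch below computes a p-value for the special case of three groups. When the numerator degrees of freedom equal 2, the upper-tail area of the F distribution has a simple closed form. For any other number of groups this shortcut does not apply, and statistical software should be used instead. The F value and degrees of freedom are hypothetical, not the values from the study.

```python
def p_value_three_groups(f_stat, df_within):
    """Upper-tail area of the F(2, df_within) distribution.

    Valid only when the numerator degrees of freedom equal 2,
    that is, when exactly three groups are compared.
    """
    return (1 + 2 * f_stat / df_within) ** (-df_within / 2)

alpha = 0.05
p = p_value_three_groups(f_stat=5.0, df_within=12)  # hypothetical values
reject_null = p < alpha
print(round(p, 4), reject_null)
```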
Although it is now clear that there are significant differences in student
performance across the three groups, researchers still need to determine
where these differences occur, which requires post-hoc analysis.
ANOVA – Post-Hoc Analysis
Example:
The first column shows the first term of the subtraction, then the second
term of the subtraction. In the first row, it is online mean minus hybrid
mean.
The second column displays the differences between the means. The
differences that are statistically significant are marked with an asterisk.
In the above example, two mean differences are marked as significant;
however, they are the same difference in reverse order. This means there is
only one significant difference.
Conclusion: there is a significant difference between hybrid and traditional
instruction.
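The layout of such a table, in which every pair appears twice with opposite signs, can be sketched as follows. The scores are hypothetical. The sketch only computes the mean differences and does NOT perform the Tukey significance test, which requires statistical software.

```python
from itertools import permutations
from statistics import mean

# Hypothetical scores (assumed for illustration only)
scores = {
    "traditional": [70, 72, 75, 78, 80],
    "online": [72, 76, 78, 81, 83],
    "hybrid": [70, 78, 84, 90, 98],
}
means = {name: mean(vals) for name, vals in scores.items()}

# Each ordered pair (first term, second term) gets its own row
differences = {(a, b): means[a] - means[b] for a, b in permutations(means, 2)}
for (a, b), diff in differences.items():
    print(f"{a} - {b}: {diff}")
```

Each difference appears once as a positive value and once as a negative value, which is why a single significant difference is flagged in two rows.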
The Tukey Post-Hoc Test also creates clusters of groups that have similar
average scores.
Example:
The table above shows that traditional and online instruction were grouped
into the first subset because they have similar means (there is not a
statistically significant difference between their means).
Similarly, online and hybrid instruction were grouped into a subset because
they have similar means (there is not a statistically significant difference
between their means).
Conclusion: students in hybrid courses had the highest average test score,
followed by students enrolled in online courses and students who attended
face-to-face courses.
The difference between hybrid and traditional instruction was statistically
significant, whereas the differences between hybrid and online instruction
and between online and traditional instruction were not statistically
significant.
Conditions for ANOVA
There are certain assumptions that must be met before conducting analysis
of variance.
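One check that is often taught for the spread of the groups can be sketched as follows. This is an assumption of the sketch, since the notes do not list the conditions at this point: a common rule of thumb is that the F test is approximately correct when the largest sample standard deviation is no more than twice the smallest. The scores are hypothetical, not the study data.

```python
from statistics import stdev

def sd_ratio(groups):
    """Largest sample standard deviation divided by the smallest."""
    sds = [stdev(g) for g in groups]
    return max(sds) / min(sds)

# Hypothetical scores: traditional, online, hybrid
groups = [
    [70, 72, 75, 78, 80],
    [72, 76, 78, 81, 83],
    [70, 78, 84, 90, 98],
]
ratio = sd_ratio(groups)
rule_ok = ratio <= 2   # rule-of-thumb threshold (assumed, see lead-in)
print(round(ratio, 2), rule_ok)
```

With these made-up scores the ratio exceeds 2, so the rule of thumb would call for caution before relying on the F test.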