Examples of ANOVA
Introduction
Note: If your study design not only involves one dependent variable and one
independent variable, but also a third variable (known as a "covariate") that
you want to "statistically control", you may need to perform an ANCOVA
(analysis of covariance), which can be thought of as an extension of the one-
way ANOVA. To learn more, see our SPSS Statistics guide on ANCOVA.
Alternatively, if your dependent variable is the time until an event happens,
you might need to run a Kaplan-Meier analysis.
This "quick start" guide shows you how to carry out a one-way ANOVA using
SPSS Statistics, as well as interpret and report the results from this test. Since
the one-way ANOVA is often followed up with a post hoc test, we also show
you how to carry out a post hoc test using SPSS Statistics. However, before
we introduce you to this procedure, you need to understand the different
assumptions that your data must meet in order for a one-way ANOVA to give
you a valid result. We discuss these assumptions next.
Assumptions
When you choose to analyse your data using a one-way ANOVA, part of the
process involves checking to make sure that the data you want to analyse can
actually be analysed using a one-way ANOVA. You need to do this because it
is only appropriate to use a one-way ANOVA if your data "passes" six
assumptions that are required for a one-way ANOVA to give you a valid result.
In practice, checking for these six assumptions just adds a little bit more time
to your analysis, requiring you to click a few more buttons in SPSS Statistics
when performing your analysis, as well as think a little bit more about your
data, but it is not a difficult task.
You can check assumptions #4, #5 and #6 using SPSS Statistics. Before doing this,
you should make sure that your data meets assumptions #1, #2 and #3, although you
don't need SPSS Statistics to do this. Remember that if you do not run the statistical
tests on these assumptions correctly, the results you get when running a one-way
ANOVA might not be valid. This is why we dedicate a number of sections of our
enhanced one-way ANOVA guide to help you get this right. You can find out about
our enhanced one-way ANOVA guide here, or more generally, our enhanced content
as a whole here.
In the section, Test Procedure in SPSS Statistics, we illustrate the SPSS Statistics
procedure to perform a one-way ANOVA assuming that no assumptions have been
violated. First, we set out the example we use to explain the one-way ANOVA
procedure in SPSS Statistics.
Example
A manager wants to raise the productivity at his company by increasing the speed at
which his employees can use a particular spreadsheet program. As he does not have
the skills in-house, he employs an external agency which provides training in this
spreadsheet program. They offer 3 courses: a beginner, intermediate and advanced
course. He is unsure which course is needed for the type of work they do at his
company, so he sends 10 employees on the beginner course, 10 on the intermediate
and 10 on the advanced course. When they all return from the training, he gives them
a problem to solve using the spreadsheet program, and times how long it takes them to
complete the problem. He then compares the three courses (beginner, intermediate,
advanced) to see if there are any differences in the average time it took to complete
the problem.
Entering Data in SPSS Statistics
SPSS Statistics sets out its data in a spreadsheet-like manner. The principle behind
entering data in almost all cases in SPSS Statistics is to enter each unique case on a
new row. A case is the "object" which you are measuring in some way. Usually, a case
is an individual, but it can also be a commercial product or a biological cell (or
something else entirely). For the purposes of this explanation, we shall assume that a
case is an individual. Therefore, when entering data into SPSS Statistics you must put
one person's data on one row only. If you find that you have an individual's data on
more than one row then you have made a mistake. Equally, if a row contains more
than one person's data, you have also made a mistake.
We shall now look at the three most common tasks you face when entering data into
SPSS Statistics, plus two more advanced setups:
Entering Variables
If you do not have repeated measures, SPSS Statistics treats each column as a separate
variable. Thus, each variable goes in a separate column. For example, if we had
measured the height and weight of a group of individuals, the data in SPSS Statistics
would look like the following:
The Subject column has been added so that it is clear that each individual is placed on a separate
row. However, SPSS Statistics does not need you to enter this column, and it is mostly for you to
be able to better visualize your data. So, even if we ignored the Subject column, we can see that
one individual was 1.55 m tall and weighed 56 kg, looking at the Height and Weight columns,
respectively. How to label variable columns is in our Working with Variables guide. To add more
variables, simply add more columns - one column per variable. The only variation to this is
discussed later in this guide when we have to enter repeated measures.
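As a rough illustration of the one-case-per-row layout outside SPSS Statistics, the Python sketch below stores each individual as one row and each variable as one column. Only the first row (1.55 m, 56 kg) comes from the text above; the remaining rows and the helper names one_case_per_row and column are invented for this example.

```python
# One dictionary per case (row); one key per variable (column).
# Only Subject 1 (1.55 m, 56 kg) is taken from the guide; the rest are made up.
rows = [
    {"Subject": 1, "Height": 1.55, "Weight": 56},
    {"Subject": 2, "Height": 1.70, "Weight": 68},
    {"Subject": 3, "Height": 1.82, "Weight": 80},
]


def one_case_per_row(rows, id_column="Subject"):
    """Return True when no case identifier appears on more than one row."""
    ids = [row[id_column] for row in rows]
    return len(ids) == len(set(ids))


def column(rows, name):
    """Return all values of one variable, in row order."""
    return [row[name] for row in rows]


heights = column(rows, "Height")
layout_ok = one_case_per_row(rows)
```

A check like one_case_per_row only catches an individual who appears on more than one row; it cannot detect a row that mixes two people's data.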
Defining Separate Groups
Separate groups are more commonly called between-subjects factors or independent groups. They
are groups where the individuals in each group are unique (i.e., no person is in more than one
group). In this sense, you could call the groups "mutually exclusive". A common example is
gender: you want to label some of your individuals as female and others as male. To identify
which subjects were males and which were females, you need to
create a "grouping variable" in SPSS Statistics. This is a separate column that includes
information on which group a subject belongs to. We do this by labelling our groups numerically.
For example, we label "males" as "1" and "females" as "2". By using the value attribute we can
label these numbers as representing males and females, respectively. An example is shown
below:
Looking at the columns on the left we can see that we have created a "grouping variable" called
"Gender" that has two categories: "1" and "2". Because we labelled the numbers using the value
attribute we can use the Value Label Button to switch to the text version of the "grouping
variable" categories. In this example, we can see that "1" and "2" are replaced by "Male" and
"Female", respectively. How to do this is explained in our guide on Working with Variables. You
do not need to add text labels – SPSS Statistics will work fine without them – but it can provide
extra clarity when analysing your data (especially as text labels are often used in the output
instead of the numbers – this helps greatly). We can see in this example that the first three
subjects were males and the last four subjects were females. If your "grouping variable" has more
than two categories, simply add more numbers, ideally with corresponding text labels.
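The idea of a numerically coded "grouping variable" with value labels can be sketched in Python as follows. This is only an analogy for what SPSS Statistics does internally; the names gender_codes, value_labels and apply_value_labels are hypothetical, and the codes mirror the example described above (three males followed by four females).

```python
# Numeric codes as they would be typed into the grouping column.
gender_codes = [1, 1, 1, 2, 2, 2, 2]

# Value labels: the text shown in place of each numeric code.
value_labels = {1: "Male", 2: "Female"}


def apply_value_labels(codes, labels):
    """Swap each code for its label; unlabelled codes are shown as-is."""
    return [labels.get(code, str(code)) for code in codes]


gender_text = apply_value_labels(gender_codes, value_labels)
```

Adding a third category would only need a new code and a new entry in value_labels.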
Entering Repeated Measures
Repeated measures, also called within-subject factors or related groups, are variables
that are measured on more than one occasion. This can occur when you have
measured the same subject for the same variable at more than one time point or under
more than one condition. For example, you measured body weight at the beginning
and end of a weight-loss programme. To enter this into SPSS Statistics, you must
ignore the "one-variable-one-column" rule and put each time point or condition in a
new column as follows:
Here, we have labelled their weight at the beginning of the weight-loss programme as
"Weight_Pre" and their weight after the weight-loss programme as "Weight_Post". It does not
matter what you call these "related" columns (you could have called them weight1 and weight2,
for example), as long as the column names make sense to you. If you have a lot of time points and/or
conditions, labelling the variables logically is important because otherwise it can become very
confusing determining which variable is which. This is important as SPSS Statistics cannot tell
the difference between columns that contain different variables and columns that contain a
repeated variable. Therefore, it cannot help you.
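The repeated-measures layout described above (one column per time point) might look like this in a Python sketch. The weights are invented for illustration, and weight_change is a hypothetical helper; only the column names Weight_Pre and Weight_Post come from the text.

```python
# One row per subject; each time point of the same variable gets its own column.
# All weights (kg) are made-up illustration values.
rows = [
    {"Subject": 1, "Weight_Pre": 82.0, "Weight_Post": 79.5},
    {"Subject": 2, "Weight_Pre": 90.0, "Weight_Post": 88.0},
    {"Subject": 3, "Weight_Pre": 75.5, "Weight_Post": 75.0},
]


def weight_change(rows, pre="Weight_Pre", post="Weight_Post"):
    """Return post minus pre for every subject, in row order."""
    return [row[post] - row[pre] for row in rows]


changes = weight_change(rows)
```

Because both columns sit on the same row, each subject's change can be computed without matching rows, which is the practical reason for this layout.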
Multiple Separating Groups
Sometimes, such as when running a two-way ANOVA or when entering in your whole study
data, you need to separate your subjects twice (i.e., on two separate variables). For example, you
need to separate subjects by their gender (male/female) and their physical activity level
(sedentary/active). This will require two columns that act as "grouping variables", as shown
below:
Here, we can see that, for example, Subject 1 was male and sedentary, and Subject 7 was female
and active. Notice that we are using the text labels as described earlier in this guide for added
clarity.
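A layout with two "grouping variables" can be sketched in Python as below, assuming seven subjects. Subject 1 (male, sedentary) and Subject 7 (female, active) follow the text; the other five rows and the helper cell_counts are invented for illustration.

```python
from collections import Counter

# (Subject, Gender, Activity): two grouping columns per row.
# Only Subjects 1 and 7 match the guide; Subjects 2 to 6 are made up.
rows = [
    (1, "Male", "Sedentary"),
    (2, "Male", "Active"),
    (3, "Male", "Active"),
    (4, "Female", "Sedentary"),
    (5, "Female", "Sedentary"),
    (6, "Female", "Active"),
    (7, "Female", "Active"),
]


def cell_counts(rows):
    """Count how many subjects fall into each Gender x Activity cell."""
    return Counter((gender, activity) for _, gender, activity in rows)


counts = cell_counts(rows)
```

Counting the cells this way is a quick check that every combination of the two factors contains subjects before a two-way analysis is attempted.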
Sometimes, we have separated subjects into groups and then measured them
repeatedly on the same dependent variable. Such data might be analysed using a
mixed ANOVA. If we had males and females undertake a weight-loss programme and
we weighed them pre- and post-intervention, we would have the following setup in
SPSS Statistics:
Published with written permission from SPSS Statistics, IBM Corporation.
To generate this type of setup, simply use the rules you have learnt in this guide
under the Defining Separate Groups and Entering Repeated Measures sections.
The eight steps below show you how to analyse your data using a one-way ANOVA
in SPSS Statistics when the six assumptions in the previous section, Assumptions,
have not been violated. At the end of these eight steps, we show you how to interpret
the results from this test. If you are looking for help to make sure your data meets
assumptions #4, #5 and #6, which are required when using a one-way ANOVA, and
can be tested using SPSS Statistics, you can learn more here.
Click Analyze > Compare Means > One-Way ANOVA... on the top menu, as shown
below.
Transfer the dependent variable, Time, into the Dependent List: box and the independent
variable, Course, into the Factor: box using the appropriate buttons (or drag-and-drop the
variables into the boxes), as shown below:
Click the Options button. Tick the Descriptive checkbox in the –Statistics– area, as
shown below:
NOTE: When testing for some of the assumptions of the one-way ANOVA, you will need to
tick more of these checkboxes. We take you through this, including how to interpret the output,
in our enhanced one-way ANOVA guide.
SPSS Statistics generates quite a few tables in its one-way ANOVA analysis. In this section, we
show you only the main tables required to understand your results from the one-way ANOVA
and Tukey post hoc test. For a complete explanation of the output you have to interpret when
checking your data for the six assumptions required to carry out a one-way ANOVA, see our
enhanced guide here. This includes relevant boxplots, and output from the Shapiro-Wilk test for
normality and test for homogeneity of variances. Also, if your data failed the assumption of
homogeneity of variances, we take you through the results for Welch ANOVA, which you will
have to interpret rather than the standard one-way ANOVA in this guide. Below, we focus on the
descriptives table, as well as the results for the one-way ANOVA and Tukey post hoc test only.
We will go through each table in turn.
Descriptives Table
The descriptives table (see below) provides some very useful descriptive statistics, including the
mean, standard deviation and 95% confidence intervals for the dependent variable (Time) for
each separate group (Beginners, Intermediate and Advanced), as well as when all groups are
combined (Total). These figures are useful when you need to describe your data.
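To make the contents of the descriptives table concrete, here is a minimal Python sketch that computes the mean, standard deviation and 95% confidence interval for each group. The completion times are invented (they are not the data behind the SPSS Statistics output in this guide), the helper describe is hypothetical, and the t critical value is hard-coded from a standard table on the assumption that every group has 10 cases.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical completion times; NOT the data used in this guide.
times = {
    "Beginner":     [30, 28, 27, 25, 29, 26, 31, 24, 27, 23],
    "Intermediate": [24, 22, 25, 21, 23, 26, 20, 24, 22, 23],
    "Advanced":     [23, 21, 24, 20, 22, 25, 19, 23, 21, 22],
}

# Two-sided 95% critical value of Student's t with 9 degrees of freedom
# (n - 1 for n = 10), taken from a standard t table. Valid only for n = 10.
T_CRIT_DF9 = 2.262


def describe(values, t_crit=T_CRIT_DF9):
    """Return n, mean, sample SD and a 95% confidence interval for the mean."""
    n = len(values)
    m = mean(values)
    sd = stdev(values)                  # sample SD (divides by n - 1)
    half_width = t_crit * sd / sqrt(n)  # t critical value times standard error
    return {"n": n, "mean": m, "sd": sd, "ci": (m - half_width, m + half_width)}


descriptives = {course: describe(values) for course, values in times.items()}
```

The Total row of the table would need a different critical value (29 degrees of freedom for 30 cases), so it is left out of this sketch.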
ANOVA Table
This is the table that shows the output of the ANOVA analysis and whether there is a statistically
significant difference between our group means. We can see that the significance value is 0.021
(i.e., p = .021), which is below 0.05 and, therefore, there is a statistically significant difference
in the mean length of time to complete the spreadsheet problem between the different courses
taken. This is great to know, but we do not know which of the specific groups differed. Luckily,
we can find this out in the Multiple Comparisons table which contains the results of the Tukey
post hoc test.
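The quantities behind an ANOVA table (sums of squares, degrees of freedom and the F statistic) can be sketched in plain Python as follows. The data are invented and will not reproduce p = .021; one_way_anova is a hypothetical helper, and because the Python standard library has no F distribution, the sketch compares F against an approximate critical value from a standard table instead of computing a p-value.

```python
from statistics import mean

# Hypothetical completion times; NOT the data used in this guide.
times = {
    "Beginner":     [30, 28, 27, 25, 29, 26, 31, 24, 27, 23],
    "Intermediate": [24, 22, 25, 21, 23, 26, 20, 24, 22, 23],
    "Advanced":     [23, 21, 24, 20, 22, 25, 19, 23, 21, 22],
}

# Approximate 5% critical value of F with (2, 27) degrees of freedom,
# taken from a standard F table (an assumption, not SPSS Statistics output).
F_CRIT_2_27 = 3.35


def one_way_anova(groups):
    """Return sums of squares, degrees of freedom and F for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    # Between-groups: spread of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-groups: spread of each value around its own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return {"ss_between": ss_between, "ss_within": ss_within,
            "df_between": df_between, "df_within": df_within, "F": f_stat}


result = one_way_anova(list(times.values()))
significant = result["F"] > F_CRIT_2_27
```

With three groups of 10 the degrees of freedom are 2 and 27, which is why that particular critical value is used; a different design would need a different table entry.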
Multiple Comparisons Table
From the results so far, we know that there are statistically significant differences
between the groups as a whole. The table below, Multiple Comparisons, shows
which groups differed from each other. The Tukey post hoc test is generally the
preferred test for conducting post hoc tests on a one-way ANOVA, but there are many
others. We can see from the table below that there is a statistically significant
difference in time to complete the problem between the group that took the beginner
course and the intermediate course (p = 0.046), as well as between the beginner course
and advanced course (p = 0.034). However, there was no statistically significant difference
between the groups that took the intermediate and advanced courses (p = 0.989).
Based on the results above, you could report the results of the study as follows (N.B., this does
not include the results from your assumptions tests or effect size calculations):
In our enhanced one-way ANOVA guide, we show you how to write up the results from your
assumptions tests, one-way ANOVA and Tukey post hoc results if you need to report this in a
dissertation, thesis, assignment or research report. We do this using the Harvard and APA styles
(see here). It is also worth noting that in addition to reporting the results from your assumptions,
one-way ANOVA and Tukey post hoc test, you are increasingly expected to report an effect size.
Whilst there are many different ways you can do this, we show you how to calculate an effect
size from your SPSS Statistics results in our enhanced one-way ANOVA guide. Effect sizes are
important because whilst the one-way ANOVA tells you whether differences between group
means are "real" (i.e., different in the population), it does not tell you the "size" of the difference.
Providing an effect size in your results helps to overcome this limitation. You can learn more
about our enhanced one-way ANOVA guide here, or our enhanced content in general here.
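As one possible illustration of an effect size, the sketch below computes eta squared, the proportion of total variation in the dependent variable that is attributable to group membership. This is an assumption for illustration only: the guide does not state which effect size measure its enhanced content uses, the data are invented, and eta_squared is a hypothetical helper.

```python
from statistics import mean

# Hypothetical completion times; NOT the data used in this guide.
times = {
    "Beginner":     [30, 28, 27, 25, 29, 26, 31, 24, 27, 23],
    "Intermediate": [24, 22, 25, 21, 23, 26, 20, 24, 22, 23],
    "Advanced":     [23, 21, 24, 20, 22, 25, 19, 23, 21, 22],
}


def eta_squared(groups):
    """Return SS_between / SS_total for a one-way design."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    return ss_between / ss_total


effect_size = eta_squared(list(times.values()))
```

Both sums of squares appear in the ANOVA table, so the same ratio can be worked out by hand from the SPSS Statistics output.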