Analysis of Variance
ANOVA
In the ANOVA test, two types of mean are calculated: the Grand mean and the Sample mean.
• A Sample mean (μn) represents the average value for a group, while the Grand mean (μ) represents the average value of the sample means of the different groups.
The F statistic is the ratio of the variation associated with the effect we are measuring (in the numerator) to the variation associated with error (in the denominator).
● When F > 1, variation due to the effect > variation due to error.
● If F < 1, variation due to the effect < variation due to error.
● When F = 1, variation due to the effect = variation due to error. This situation is not so favorable.
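Sums of Squares
The sum of squares is the sum of the squared deviations of the observations from their mean, also known as variation. It measures the spread of a data set.
5. Mean Squared Error (MSE)
The mean squared error is the sum of squares divided by its degrees of freedom.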
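A minimal Python sketch of how the sums of squares and mean squares could be computed by hand. The three groups are hypothetical values chosen for easy arithmetic, and the function name `sums_of_squares` is an illustrative assumption, not part of any library.

```python
from statistics import mean


def sums_of_squares(groups):
    """Return (between-group SS, within-group SS, total SS) for a list of groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)  # mean of every observation
    # Between-group: squared distance of each sample mean from the grand mean,
    # weighted by the group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group: squared distance of each observation from its own sample mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    # Total: squared distance of each observation from the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    return ss_between, ss_within, ss_total


# Hypothetical data: three groups of three observations each.
groups = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]
ss_between, ss_within, ss_total = sums_of_squares(groups)

# A mean square is a sum of squares divided by its degrees of freedom.
df_between = len(groups) - 1
df_within = sum(len(g) for g in groups) - len(groups)
ms_between = ss_between / df_between
ms_within = ss_within / df_within  # the mean squared error (MSE)

print(ss_between, ss_within, ss_total, ms_between, ms_within)
```

In this example the between-group and within-group sums of squares add up to the total sum of squares, which is the partition that ANOVA relies on.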
6. Hypothesis (Alternate and Null)
We use a Null Hypothesis (H0) and an Alternate Hypothesis (H1). The Null Hypothesis in ANOVA is valid when all the sample means are equal, that is, when there is no significant difference among them.
The Alternate Hypothesis is valid when at least one of the sample means is different from the others.
7. Group Variability (Within-group and Between-group)
In the ANOVA test, a group is the set of samples within the independent variable.
● When there is a big variation between the sample distributions of the individual groups, it is called between-group variability.
● On the other hand, when there are variations in the sample distribution within an individual group, it is called within-group variability.
For example
How well you perform in a race depends on your training.
How much you weigh depends on your diet.
How much you earn depends upon the number of hours you work.
Types of ANOVA Test
The ANOVA test is generally done in three ways depending on the number of Independent
Variables (IVs) included in the test. Sometimes the test includes one IV, sometimes it has two
IVs, and sometimes the test may include multiple IVs.
1. One-Way ANOVA
2. Two-Way ANOVA
3. N-Way ANOVA (factorial ANOVA)
One-Way ANOVA
It is generally the most used method of performing the ANOVA test. It is also referred to as one-factor ANOVA.
The null and alternative hypotheses of one-way ANOVA can be expressed as:
H0: μ1 = μ2 = ... = μk (all k group means are equal)
H1: at least one μi differs from the others
- The null hypothesis can be thought of as a nullifiable hypothesis. That means you can nullify it, or reject it.
Note: The One-Way ANOVA is considered an omnibus (Latin for “all”) test because the F test indicates whether the model is significant overall, that is, whether or not there are any significant differences in the means between any of the groups. It does not indicate which specific groups differ.
In ANOVA, the null hypothesis is that there is no difference among group means. If any
group differs significantly from the overall group mean, then the ANOVA will report a
statistically significant result.
Significant differences among group means are calculated using the F statistic, which is
the ratio of the mean sum of squares (the variance explained by the independent variable)
to the mean square error (the variance left over).
If the F statistic is higher than the critical value (the value of F that corresponds with your
alpha value, usually 0.05), then the difference among groups is deemed statistically
significant.
Critical value is a point on the distribution of the test statistic under the null hypothesis
that defines a set of values that call for rejecting the null hypothesis. This set is called
critical or rejection region.
Critical values on the standard normal distribution for α = 0.05
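The decision rule described above can be sketched in Python. The data are hypothetical, the helper name `one_way_f` is an illustrative assumption, and the critical value 5.14 is taken from a standard F table for 2 and 6 degrees of freedom at α = 0.05 rather than computed here.

```python
from statistics import mean


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA on a list of groups."""
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    # F = mean square explained by the independent variable / mean square error
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical data: three groups of three observations each.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_stat, df1, df2 = one_way_f(groups)

# Assumption: tabulated critical value of F(2, 6) at alpha = 0.05.
F_CRITICAL = 5.14
reject_null = f_stat > F_CRITICAL

print(f"F({df1}, {df2}) = {f_stat:.2f}, reject H0: {reject_null}")
```

Because the F statistic falls in the rejection region here, the difference among the hypothetical group means would be deemed statistically significant at the 0.05 level.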
LIMITATIONS
A.) ANOVA is applicable if:
● all populations of interest are normally distributed.
● the populations have equal standard deviations.
● samples (not necessarily of the same size) are randomly and
independently selected from each population.
● there is one independent variable and one dependent variable.
The test statistic for analysis of variance is the F-ratio.
B.) One-Way ANOVA: this method is applicable if:
● all populations of interest are normally distributed.
● the populations have equal standard deviations
● samples (not necessarily of the same size) are randomly and
independently selected from each population.
The test statistic for analysis of variance is the F-ratio.
Two-Way ANOVA
- It is an extension of the One-Way ANOVA. With a One-Way ANOVA, you have one independent variable affecting a dependent variable. With a Two-Way ANOVA, there are two independent variables. Use a two-way ANOVA when you have one measurement variable (i.e. a quantitative variable) and two nominal variables.
For example, you might want to find out if there is an interaction between income and gender for anxiety level at job interviews. The anxiety level is the outcome, or the variable that can be measured. Gender and income are the two categorical variables.
Two-way ANOVA is performed in two ways:
1. Two-way ANOVA with replication
2. Two-way ANOVA without replication: this is used when you have only one group but you are double-testing that group.
Two-Way ANOVA:
A botanist wants to know whether or not plant growth is influenced by sunlight exposure and watering frequency. She plants 40 seeds and lets them grow for two months under different conditions for sunlight exposure and watering frequency. After two months, she records the height of each plant. The results are shown below:
● The p-value for the interaction between watering
frequency and sunlight exposure was 0.310898. This
is not statistically significant at alpha level 0.05.
● The p-value for watering frequency was 0.975975.
This is not statistically significant at alpha level
0.05.
● The p-value for sunlight exposure was 3.9E-8
(0.000000039). This is statistically significant at
alpha level 0.05.
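As a rough illustration of where the three F ratios in a two-way ANOVA with replication come from, the sketch below computes them by hand for a small balanced design. The plant heights are hypothetical and are not the botanist's data; the function name `two_way_f_ratios` and the factor labels are assumptions made for the example.

```python
from statistics import mean

# Hypothetical plant heights: data[sunlight][watering] -> replicate measurements.
data = {
    "low": {"daily": [4, 6], "weekly": [3, 5]},
    "high": {"daily": [8, 10], "weekly": [9, 11]},
}


def two_way_f_ratios(data):
    """F ratios for factor A, factor B and their interaction (balanced design assumed)."""
    a_levels = list(data)
    b_levels = list(data[a_levels[0]])
    a, b = len(a_levels), len(b_levels)
    r = len(data[a_levels[0]][b_levels[0]])  # replicates per cell
    grand = mean([x for i in a_levels for j in b_levels for x in data[i][j]])
    a_mean = {i: mean([x for j in b_levels for x in data[i][j]]) for i in a_levels}
    b_mean = {j: mean([x for i in a_levels for x in data[i][j]]) for j in b_levels}
    cell = {(i, j): mean(data[i][j]) for i in a_levels for j in b_levels}

    ss_a = r * b * sum((a_mean[i] - grand) ** 2 for i in a_levels)
    ss_b = r * a * sum((b_mean[j] - grand) ** 2 for j in b_levels)
    ss_ab = r * sum(
        (cell[i, j] - a_mean[i] - b_mean[j] + grand) ** 2
        for i in a_levels for j in b_levels
    )
    ss_error = sum(
        (x - cell[i, j]) ** 2 for i in a_levels for j in b_levels for x in data[i][j]
    )
    ms_error = ss_error / (a * b * (r - 1))
    return {
        "sunlight": ss_a / (a - 1) / ms_error,
        "watering": ss_b / (b - 1) / ms_error,
        "interaction": ss_ab / ((a - 1) * (b - 1)) / ms_error,
    }


f_ratios = two_way_f_ratios(data)
print(f_ratios)
```

Each ratio would then be compared with a critical value (about 7.71 for F(1, 4) at α = 0.05 in a standard F table) or converted to a p-value with statistical software. In this made-up data set only the sunlight factor exceeds the critical value.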
Factorial ANOVA
A factorial ANOVA is an Analysis of Variance test with more than one independent variable, or “factor“. It can also refer to more than one Level of Independent Variable.
Some examples of factorial ANOVAs include:
● Testing the combined effects of vaccination (vaccinated or not vaccinated) and health status (healthy or pre-existing condition) on the rate of flu infection in a population.
The only difference between one-way and two-way ANOVA is the number
of independent variables. A one-way ANOVA has one independent
variable, while a two-way ANOVA has two.
● One-way ANOVA: Testing the relationship between shoe brand (Nike,
Adidas, Saucony, Hoka) and race finish times in a marathon.
● Two-way ANOVA: Testing the relationship between shoe brand (Nike,
Adidas, Saucony, Hoka), runner age group (junior, senior, master’s),
and race finishing times in a marathon.
All ANOVAs are designed to test for differences among three or more
groups. If you are only testing for a difference between two groups, use a
t-test instead.
Post hoc (Latin, meaning “after this”) means to analyze the
results of your experimental data.
The most common post hoc tests are:
● Bonferroni Procedure
● Duncan’s new multiple range test (MRT)
● Dunn’s Multiple Comparison Test
● Fisher’s Least Significant Difference (LSD)
● Holm-Bonferroni Procedure
● Newman-Keuls
● Rodger’s Method
● Scheffé’s Method
● Tukey’s Test
● Dunnett’s correction
● Benjamini-Hochberg (BH) procedure
1.) Duncan’s new multiple range test (MRT)
- Duncan’s Multiple Range Test will identify the pairs of means (from at least three) that differ. The MRT is similar to the LSD, but instead of a t-value, a Q value is used.
2.) Benjamini-Hochberg (BH) procedure
- If you perform a very large number of tests, one or more of the tests will have a significant result purely by chance alone. This post hoc procedure controls the false discovery rate.
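A minimal sketch of the Benjamini-Hochberg step-up procedure, shown next to the simpler Bonferroni procedure for comparison. The p-values are hypothetical, and the function names are illustrative assumptions rather than a library API.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans, True where the test is rejected at FDR level q."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


def bonferroni(p_values, alpha=0.05):
    """Reject only the tests whose p-value is at most alpha divided by the number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


# Hypothetical p-values from five comparisons.
p_values = [0.001, 0.012, 0.025, 0.039, 0.20]
bh_result = benjamini_hochberg(p_values)
bonf_result = bonferroni(p_values)

print("BH:        ", bh_result)
print("Bonferroni:", bonf_result)
```

With these made-up p-values the BH procedure rejects four of the five tests while Bonferroni rejects only one, which illustrates why BH is often chosen when many comparisons are made.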
One-Way ANOVA Using SPSS
A manager wants to raise the productivity at his company by increasing the speed at which his employees can use a particular spreadsheet
program. As he does not have the skills in-house, he employs an external agency which provides training in this spreadsheet program.
They offer 3 courses: a beginner, intermediate and advanced course. He is unsure which course is needed for the type of work they do at
his company, so he sends 10 employees on the beginner course, 10 on the intermediate and 10 on the advanced course. When they all
return from the training, he gives them a problem to solve using the spreadsheet program, and times how long it takes them to complete
the problem. He then compares the three courses (beginner, intermediate, advanced) to see if there are any differences in the average time
it took to complete the problem.
In SPSS Statistics, we separated the groups for analysis by creating a grouping variable called Course (i.e., the independent variable), and gave the beginners course a value of "1", the intermediate course a value of "2" and the advanced course a value of "3". Time to complete the set problem was entered under the variable name Time (i.e., the dependent variable).
Transfer the dependent variable, Time, into the Dependent List: box and the independent variable, Course, into the Factor: box using the appropriate buttons (or drag-and-drop the variables into the boxes), as shown below:
This is the table that shows the output of the ANOVA analysis and whether there is a statistically significant difference between our group means. We can see that the significance value is 0.021 (i.e., p = .021), which is below 0.05 and, therefore, there is a statistically significant difference in the mean length of time to complete the spreadsheet problem between the different courses taken.
In order to know which of the specific groups differed, we can find this out in the Multiple Comparisons table, which contains the results of the Tukey post hoc test.
There was a statistically significant difference between groups as determined by one-way ANOVA (F(2,27) = 4.467, p = .021). A Tukey post hoc test revealed that the time to complete the problem was statistically significantly lower after taking the intermediate (23.6 ± 3.3 min, p = .046) and advanced (23.4 ± 3.2 min, p = .034) course compared to the beginners course (27.2 ± 3.0 min). There was no statistically significant difference between the intermediate and advanced groups (p = .989).
From the results so far, we know that there are statistically significant differences between the groups
as a whole. The table below, Multiple Comparisons, shows which groups differed from each other.
The Tukey post hoc test is generally the preferred test for conducting post hoc tests on a one-way
ANOVA, but there are many others. We can see from the table below that there is a statistically
significant difference in time to complete the problem between the group that took the beginner
course and the intermediate course (p = 0.046), as well as between the beginner course and
advanced course (p = 0.034). However, there were no differences between the groups that took the
intermediate and advanced course (p = 0.989).
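The reported degrees of freedom and the set of pairwise comparisons can be checked with a short Python sketch. The group sizes and Tukey p-values are the ones quoted in the text above; the variable names are illustrative assumptions.

```python
from itertools import combinations

# Ten employees were sent on each of the three courses.
group_sizes = {"beginner": 10, "intermediate": 10, "advanced": 10}

k = len(group_sizes)                 # number of groups
n_total = sum(group_sizes.values())  # total number of participants
df_between = k - 1                   # numerator degrees of freedom
df_within = n_total - k              # denominator degrees of freedom

# Tukey post hoc p-values as reported in the Multiple Comparisons table.
reported_p = {
    ("beginner", "intermediate"): 0.046,
    ("beginner", "advanced"): 0.034,
    ("intermediate", "advanced"): 0.989,
}

alpha = 0.05
pairs = list(combinations(group_sizes, 2))  # every pair of courses
significant = [pair for pair in pairs if reported_p[pair] < alpha]

print(f"F({df_between},{df_within})")
for pair in pairs:
    verdict = "significant" if pair in significant else "not significant"
    print(pair, reported_p[pair], verdict)
```

Three groups give three pairwise comparisons, and only the two that involve the beginner course fall below the 0.05 threshold.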
A researcher was interested in whether an individual's interest in politics was influenced by their level of education
and gender. They recruited a random sample of participants to their study and asked them about their interest in
politics, which they scored from 0 to 100, with higher scores indicating a greater interest in politics. The
researcher then divided the participants by gender (Male/Female) and then again by level of education
(School/College/University). Therefore, the dependent variable was "interest in politics", and the two independent
variables were "gender" and "education".
A researcher had previously discovered that interest in politics is influenced by level of education. When participants were classified
into three groups according to their highest level of education; namely "school", "college" or "university", in that order; higher education
levels were associated with a greater interest in politics. Having demonstrated this, the researcher was now interested in determining
whether this effect of education level on interest in politics was different for males and females (i.e., different depending on your
gender). To answer this question, they recruited 60 participants: 30 males and 30 females, equally split by level of education
(School/College/University) (i.e., 10 participants in each group). The researcher had participants complete a questionnaire that
assessed their interest in politics, which they called the "Political Interest" scale. Participants could score anything between 0 and 100,
with higher scores indicating a greater interest in politics.
Transfer the dependent variable, political_interest, into the Dependent Variable: box, and the independent variables, gender and education_level, into the Fixed Factor(s): box.
Click on the button. This will add this profile plot, which it labels
"education_level*gender", into the Plots: box, as shown below:
Click on the EM Means button. You will be presented with the Univariate: Estimated Marginal Means dialogue box, as shown below:
Transfer the interaction effect, "gender*education_level", from the Factor(s) and Factor Interactions: box to the Display Means for: box by highlighting it and clicking on the button, as shown below:
Click on the Continue button. You will be returned to the Univariate dialogue box.
Click on the Options button. You will be presented with the Univariate: Options dialogue box, as shown below:
Transfer education_level from the Factor(s): box to the Post Hoc Tests for: box using the button. This will activate the –Equal Variances Assumed– area (i.e., it will no longer be greyed out) and present you with some choices for which post hoc test to use. For this example, we are going to select Tukey, which is a good, all-round post hoc test.
Note: You only need to transfer independent variables that have more than
two groups into the Post Hoc Tests for: box. This is why we do not
transfer gender.
RESULTS
The particular rows we are interested in are the "gender", "education_level" and "gender*education_level" rows, and these are highlighted above. These rows inform us whether our independent variables (the "gender" and "education_level" rows) and their interaction (the "gender*education_level" row) have a statistically significant effect on the dependent variable, "interest in politics". It is important to first look at the "gender*education_level" interaction as this will determine how you can interpret your results.
You can see from the "Sig." column that we have a statistically significant interaction at the p = .002 level. You may also wish to report the
results of "gender" and "education_level", but again, these need to be interpreted in the context of the interaction result. We can see from the
table above that there was no statistically significant difference in mean interest in politics between males and females (p = .448), but there
were statistically significant differences between educational levels (p < .001).
There is a statistically significant difference between all three different educational levels (p < .001).
▰ RESULTS OF TWO WAY ANOVA
A two-way ANOVA was conducted that examined the effect of gender and education level on interest in politics. There was a statistically significant interaction between the effects of gender and education level on interest in politics, F(2, 52) = 7.315, p = .002.
Simple main effects analysis showed that males were significantly more interested in politics than females when educated to university level (p = .002), but there were no differences between genders when educated to school (p = .465) or college level (p = .793).
Research Question
How to lose weight effectively? Do diets really work
and what about exercise? In order to find out, 180
participants were assigned to one of 3 diets and one
of 3 exercise levels. After two months, participants
were asked how many kilos they had lost.
Our means plot was very useful for describing the pattern of means resulting from diet and exercise in our sample. But perhaps things are different in the larger population. If neither diet nor exercise affects weight loss, could we find these sample results by mere sampling fluctuation? Short answer: no.