
ANALYSIS OF VARIANCE

ANOVA

The ANOVA, which stands for the Analysis of Variance test, is a tool in statistics that is concerned with comparing the means of two or more groups of data and determining to what extent they differ.

● A manufacturer has two different processes to make light bulbs. They want to know if one process is better than the other.
● Students from different colleges take the same exam. You want to see if one college outperforms the other.
● A group of psychiatric patients are trying three different therapies: counseling, medication and biofeedback. You want to see if one therapy is better than the others.
Terminologies

1. Means (Grand and Sample)

• Mean is defined as the arithmetic average of a given range of values. In the ANOVA test, two types of mean are calculated: the Grand mean and the Sample mean.

• A Sample mean (μn) represents the average value for an individual group, while the Grand mean (μ) represents the average of the sample means of the different groups, or the mean of all the observations combined.
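To make the distinction concrete, here is a minimal illustrative Python sketch (NumPy, with made-up group values):

import numpy as np

# Hypothetical observations for three groups (made-up values)
group_a = np.array([4.0, 5.0, 6.0])
group_b = np.array([7.0, 8.0, 9.0])
group_c = np.array([1.0, 2.0, 3.0])

# Sample mean: the average of each individual group
sample_means = [g.mean() for g in (group_a, group_b, group_c)]   # [5.0, 8.0, 2.0]

# Grand mean: the mean of all observations combined
grand_mean = np.concatenate([group_a, group_b, group_c]).mean()  # 5.0

print(sample_means, grand_mean)
# With equal group sizes, the grand mean also equals the mean of the sample means.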


2. F-Statistic

The statistic which measures the extent of difference between the means of different samples, or how significantly the means differ, is called the F-statistic or F-ratio. It is the ratio of the variation due to the effect we are measuring (in the numerator) to the variation associated with sampling error (in the denominator).

● When F > 1, the variance due to the effect is larger than the variance associated with sampling error: variation due to the effect > variation due to error.
● When F < 1, variation due to the effect < variation due to error.
● When F = 1, variation due to the effect = variation due to error. This situation is not favorable for detecting an effect.
3. Sums of Squares (SS)

The sum of squares tells you about the deviation from the mean; it is also known as variation.

4. Degrees of Freedom (Df)

Degrees of freedom refers to the maximum number of logically independent values that are free to vary in a data set.
5. Mean Squared Error (MSE)

The mean squared error tells us about the average error in a data set. To find the mean squared error, we divide the sum of squares by the degrees of freedom.
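Putting the terms above together, here is a minimal Python sketch (NumPy only, made-up data) of how the sums of squares, degrees of freedom, mean squares and the F-ratio fit together for a one-way design:

import numpy as np

# Hypothetical observations for k = 3 groups (made-up values)
groups = [np.array([4.0, 5.0, 6.0]),
          np.array([7.0, 8.0, 9.0]),
          np.array([1.0, 2.0, 3.0])]

all_obs = np.concatenate(groups)
grand_mean = all_obs.mean()
k = len(groups)        # number of groups
n = all_obs.size       # total number of observations

# Between-group sum of squares: how far each group mean is from the grand mean
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# Within-group sum of squares: how far each observation is from its own group mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

df_between = k - 1     # degrees of freedom for the effect
df_within = n - k      # degrees of freedom for the error

ms_between = ss_between / df_between   # mean square for the effect
ms_within = ss_within / df_within      # mean square error (MSE)

f_ratio = ms_between / ms_within
print(f_ratio)   # F > 1: more variation due to the effect than due to error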
6. Hypothesis (Alternate and Null)

We use a Null Hypothesis (H0) and an Alternate Hypothesis (H1). In ANOVA, the Null Hypothesis holds when the sample means are equal or have no significant difference. The Alternate Hypothesis holds when at least one of the sample means is different from the others.
7. Group Variability (Within-group and Between-group)

In the ANOVA test, a group is the set of samples within a level of the independent variable.

● Variation between the sample distributions of the individual groups (i.e., differences among the group means) is called between-group variability.
● Variation in the sample distribution within an individual group is called within-group variability.


Independent variable (IV) - a variable that stands on its own and isn't affected by anything that you, as a researcher, do.
Example: you want to know how calorie intake affects weight. Calorie intake is your independent variable and weight is your dependent variable. You can choose the calories given to participants, and you see how that independent variable affects the weights.

Dependent variable (DV) - a variable that depends upon some factor that you, the researcher, control.

For example
How well you perform in a race depends on your training.
How much you weigh depends on your diet.
How much you earn depends upon the number of hours you work.
Types of ANOVA Test

The ANOVA test is generally done in three ways depending on the number of Independent
Variables (IVs) included in the test. Sometimes the test includes one IV, sometimes it has two
IVs, and sometimes the test may include multiple IVs.

We have three known types of ANOVA test:

1. One-Way ANOVA
2. Two-Way ANOVA
3. N-Way ANOVA (three or more IVs; sometimes loosely called MANOVA, although strictly MANOVA refers to designs with more than one dependent variable)
One-Way ANOVA

It is generally the most used method of performing the ANOVA test. It is also referred to as one-factor

ANOVA, between-subjects ANOVA, and an independent factor ANOVA.


One-way or two-way refers to the number of independent variables (IVs) in your Analysis of
Variance test.
● One-way has one independent variable (with at least 2 levels). For example: brand of cereal.
● Two-way has two independent variables (each can have multiple levels). For example: brand of cereal, calories.
Example of when to use a one-way ANOVA

Situation 1: You have a group of individuals randomly split into smaller groups, with each group completing a different task. For example, you might be studying the effects of tea on weight loss and form three groups: green tea, black tea, and no tea.

This test is also known as:
● One-Factor ANOVA
● One-Way Analysis of Variance
● Between Subjects ANOVA
HYPOTHESES

The null and alternative hypotheses of one-way ANOVA can be expressed as:

H0: µ1 = µ2 = µ3 = ... = µk ("all k population means are equal")

H1: At least one µi is different ("at least one of the k population means is not equal to the others")

- The null hypothesis can be thought of as a nullifiable hypothesis; that means you can nullify it, or reject it.

Note: The one-way ANOVA is considered an omnibus (Latin for "all") test because the F test indicates whether the model is significant overall, i.e., whether or not there are any significant differences in the means between any of the groups.

However, it does not indicate which mean is different. Determining which


specific pairs of means are significantly different requires either contrasts
or post hoc (Latin for “after this”) tests.
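As an illustration, the omnibus one-way test for a three-group design like the tea example could be run in Python with SciPy; the group values below are made up:

from scipy import stats

# Hypothetical weight loss (kg) for three tea groups (made-up values)
green_tea = [3.1, 2.8, 3.5, 2.9, 3.3]
black_tea = [2.2, 2.5, 1.9, 2.4, 2.1]
no_tea    = [1.0, 1.4, 0.8, 1.2, 1.1]

# One-way ANOVA: H0 is that all three population means are equal
f_stat, p_value = stats.f_oneway(green_tea, black_tea, no_tea)
print(f_stat, p_value)

# A small p-value lets us reject H0, but the omnibus F test alone does not say
# which groups differ; that requires contrasts or post hoc tests.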
HOW IS STATISTICAL SIGNIFICANCE COMPUTED IN ANOVA?

In ANOVA, the null hypothesis is that there is no difference among group means. If any
group differs significantly from the overall group mean, then the ANOVA will report a
statistically significant result.
Significant differences among group means are calculated using the F statistic, which is
the ratio of the mean sum of squares (the variance explained by the independent variable)
to the mean square error (the variance left over).
If the F statistic is higher than the critical value (the value of F that corresponds with your
alpha value, usually 0.05), then the difference among groups is deemed statistically
significant.
A critical value is a point on the distribution of the test statistic under the null hypothesis that defines a set of values that call for rejecting the null hypothesis. This set is called the critical or rejection region.
(Figure: critical values on the standard normal distribution for α = 0.05.)
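To make the decision rule concrete, here is a small Python sketch (SciPy) that looks up the F critical value for alpha = 0.05 and compares it with an observed F statistic; the degrees of freedom and the observed F below are assumed, illustrative values:

from scipy import stats

alpha = 0.05
df_between = 2     # k - 1, assuming k = 3 groups
df_within = 27     # n - k, assuming n = 30 observations in total

# Critical value: the point on the F distribution beyond which H0 is rejected
f_critical = stats.f.ppf(1 - alpha, df_between, df_within)

f_observed = 4.47  # illustrative observed F statistic

if f_observed > f_critical:
    print("Reject H0: the difference among group means is statistically significant")
else:
    print("Fail to reject H0")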
LIMITATIONS

A.) ANOVA is applicable if:
● all populations of interest are normally distributed.
● the populations have equal standard deviations.
● samples (not necessarily of the same size) are randomly and independently selected from each population.
● there is one independent variable and one dependent variable.
The test statistic for analysis of variance is the F-ratio.

B.) One-Way ANOVA is applicable if:
● all populations of interest are normally distributed.
● the populations have equal standard deviations.
● samples (not necessarily of the same size) are randomly and independently selected from each population.
The test statistic for analysis of variance is the F-ratio.
Two Way Anova

It is an extension of the one-way ANOVA. With a one-way, you have one independent variable affecting a dependent variable. With a two-way ANOVA, there are two independent variables. Use a two-way ANOVA when you have one measurement variable (i.e., a quantitative variable) and two nominal variables.

You might want to find out if there is an interaction between income and
gender for anxiety level at job interviews.
The anxiety level is the outcome, or the variable that can be measured.
Gender and Income are the two categorical variables.

These categorical variables are also the independent variables, which are called factors in a two-way ANOVA.
Assumptions of two-way ANOVA

To use a two-way ANOVA your data should meet certain assumptions. Two-way ANOVA makes all of the normal assumptions of a parametric test of difference:
1. Homogeneity of variance (a.k.a. homoscedasticity)
2. Independence of observations
3. Normally distributed dependent variable
- The values of the dependent variable should follow a bell curve.
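These assumptions are usually checked before running the test. A minimal Python sketch (SciPy, made-up cell data) of two common checks, Levene's test for homogeneity of variance and the Shapiro-Wilk test for normality:

from scipy import stats

# Hypothetical dependent-variable values in three cells of the design (made-up)
cell_1 = [5.1, 4.8, 5.5, 5.0, 4.9]
cell_2 = [6.2, 6.0, 5.8, 6.4, 6.1]
cell_3 = [4.0, 4.3, 3.9, 4.1, 4.2]

# Homogeneity of variance: H0 is that all groups have equal variances
lev_stat, lev_p = stats.levene(cell_1, cell_2, cell_3)

# Normality: Shapiro-Wilk on the pooled values for brevity
# (checking each group, or the model residuals, is also common)
sw_stat, sw_p = stats.shapiro(cell_1 + cell_2 + cell_3)

print("Levene p =", lev_p, "| Shapiro-Wilk p =", sw_p)
# p-values above 0.05 give no evidence against the corresponding assumption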
Two-Way ANOVA:

Two-way ANOVA is performed in two ways:
1. Two-way ANOVA with replication: this is used when you have more than one observation for each combination of the two factor levels (e.g., each group is measured under several conditions).
2. Two-way ANOVA without replication: this is used when you have only one group and you are double-testing that same group.

Example: A botanist wants to know whether or not plant growth is influenced by sunlight exposure and watering frequency. She plants 40 seeds and lets them grow for two months under different conditions for sunlight exposure and watering frequency. After two months, she records the height of each plant. The results are shown below:
● The p-value for the interaction between watering
frequency and sunlight exposure was 0.310898. This
is not statistically significant at alpha level 0.05.
● The p-value for watering frequency was 0.975975.
This is not statistically significant at alpha level
0.05.
● The p-value for sunlight exposure was 3.9E-8
(0.000000039). This is statistically significant at
alpha level 0.05.

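A hedged sketch of how such a two-way ANOVA with interaction could be computed in Python (statsmodels with a long-format data frame; the column names height, sunlight and watering, and all values, are assumptions for illustration, not the botanist's actual data):

import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Hypothetical long-format data: one row per plant (illustrative values only)
df = pd.DataFrame({
    "height":   [6.1, 5.8, 7.2, 7.0, 4.9, 5.1, 6.8, 6.5],
    "sunlight": ["low", "low", "high", "high", "low", "low", "high", "high"],
    "watering": ["daily", "weekly", "daily", "weekly",
                 "daily", "weekly", "daily", "weekly"],
})

# Two-way ANOVA with interaction: main effects plus sunlight x watering
model = ols("height ~ C(sunlight) * C(watering)", data=df).fit()
anova_table = sm.stats.anova_lm(model, typ=2)
print(anova_table)   # one p-value per main effect and one for the interaction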
Factorial ANOVA

A factorial ANOVA is an Analysis of Variance test with more than one independent variable, or "factor". It can also refer to more than one level of an independent variable. A factorial ANOVA is any ANOVA that uses more than one categorical independent variable. A two-way ANOVA is a type of factorial ANOVA.

Some examples of factorial ANOVAs include:
● Testing the combined effects of vaccination (vaccinated or not vaccinated) and health status (healthy or pre-existing condition) on the rate of flu infection in a population.
● Testing the effects of marital status (married, single, divorced, widowed), job status (employed, self-employed, unemployed, retired), and family history (no family history, some family history) on the incidence of depression in a population.
● Testing the effects of feed type (type A, B, or C) and barn crowding (not crowded, somewhat crowded, very crowded) on the weight gain of cows.
DIFFERENCE BETWEEN ONE-WAY ANOVA AND TWO-WAY ANOVA

The only difference between one-way and two-way ANOVA is the number
of independent variables. A one-way ANOVA has one independent
variable, while a two-way ANOVA has two.
● One-way ANOVA: Testing the relationship between shoe brand (Nike,
Adidas, Saucony, Hoka) and race finish times in a marathon.
● Two-way ANOVA: Testing the relationship between shoe brand (Nike,
Adidas, Saucony, Hoka), runner age group (junior, senior, master’s),
and race finishing times in a marathon.
All ANOVAs are designed to test for differences among three or more groups. If you are only testing for a difference between two groups, use a t-test instead.
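For the two-group case mentioned above, here is a minimal Python sketch (SciPy, made-up finishing times) of the independent-samples t-test you would use instead of an ANOVA:

from scipy import stats

# Hypothetical marathon finish times (minutes) for two shoe brands (made-up values)
brand_a = [215.0, 222.5, 208.3, 230.1, 219.7]
brand_b = [225.4, 231.0, 228.8, 240.2, 233.5]

# Independent-samples t-test: H0 is that the two group means are equal
t_stat, p_value = stats.ttest_ind(brand_a, brand_b)
print(t_stat, p_value)

# With three or more brands, switch to a one-way ANOVA (stats.f_oneway),
# or a two-way ANOVA if a second factor such as age group is added.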
Post hoc (Latin, meaning "after this") tests are used to analyze the results of your experimental data after an overall (omnibus) test has been run.

The most common post hoc tests are:
● Bonferroni Procedure
● Duncan’s new multiple range test (MRT)
● Dunn’s Multiple Comparison Test
● Fisher’s Least Significant Difference (LSD)
● Holm-Bonferroni Procedure
● Newman-Keuls
● Rodger’s Method
● Scheffé’s Method
● Tukey’s Test
● Dunnett’s correction
● Benjamini-Hochberg (BH) procedure
1.) Duncan’s new multiple range test (MRT)
Duncan’s Multiple Range Test will identify the pairs of means (from at least three) that differ. The MRT is similar to the LSD, but instead of a t-value, a Q value is used.

2.) Fisher’s Least Significant Difference (LSD)
A tool to identify which pairs of means are statistically different. Essentially the same as Duncan’s MRT, but with t-values instead of Q values.
3.) Newman-Keuls
Like Tukey’s, this post hoc test identifies sample means that are different from each other. Newman-Keuls uses different critical values for comparing pairs of means; therefore, it is more likely to find significant differences.
4.) Rodger’s Method
Considered by some to be the most powerful post hoc test for detecting differences among groups. This test protects against loss of statistical power as the degrees of freedom increase.
5.) Scheffé’s Method
Used when you want to look at post hoc comparisons in general (as opposed to just pairwise comparisons). Scheffé’s method controls for the overall confidence level. It is customarily used with unequal sample sizes.

6.) Tukey’s Test
The purpose of Tukey’s test is to figure out which groups in your sample differ. It uses the "Honest Significant Difference," a number that represents the distance between groups, to compare every mean with every other mean (see the sketch after this list).

7.) Dunnett’s correction
Like Tukey’s, this post hoc test is used to compare means. Unlike Tukey’s, it compares every mean to a control mean.

8.) Benjamini-Hochberg (BH) procedure
If you perform a very large number of tests, one or more of them will have a significant result purely by chance alone. This post hoc test accounts for that false discovery rate.
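As a hedged illustration, Tukey’s HSD and the Benjamini-Hochberg procedure are both available in Python through statsmodels; the scores, group labels and p-values below are made-up examples:

import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd
from statsmodels.stats.multitest import multipletests

# Hypothetical scores and group labels for three groups (made-up values)
scores = np.array([27.2, 26.8, 28.0, 23.6, 23.9, 23.1, 23.4, 23.0, 23.8])
groups = np.array(["beginner"] * 3 + ["intermediate"] * 3 + ["advanced"] * 3)

# Tukey's HSD: compares every group mean with every other group mean
tukey = pairwise_tukeyhsd(endog=scores, groups=groups, alpha=0.05)
print(tukey.summary())

# Benjamini-Hochberg: adjusts a set of raw p-values to control the false discovery rate
raw_p_values = [0.001, 0.020, 0.049, 0.300, 0.750]   # example p-values
reject, p_adjusted, _, _ = multipletests(raw_p_values, alpha=0.05, method="fdr_bh")
print(reject, p_adjusted)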

References:
Stephanie Glen. "ANOVA Test: Definition, Types, Examples, SPSS". From StatisticsHowTo.com: Elementary Statistics for the rest of us! https://www.statisticshowto.com/probability-and-statistics/hypothesis-testing/anova/

Stephanie Glen. "Independent Variable (Treatment Variable) Definition and Uses". From StatisticsHowTo.com: Elementary Statistics for the rest of us! https://www.statisticshowto.com/independent-variable-definition/

Stephanie Glen. "Dependent Variable: Definition and Examples". From StatisticsHowTo.com: Elementary Statistics for the rest of us! https://www.statisticshowto.com/dependent-variable-definition/

Stephanie Glen. "Post Hoc Definition and Types of Tests". From StatisticsHowTo.com: Elementary Statistics for the rest of us! https://www.statisticshowto.com/probability-and-statistics/statistics-definitions/post-hoc/

https://libguides.library.kent.edu/spss/onewayanova

Ritesh Pathak (Mar 02, 2021). ANOVA Test - Definition and Examples. https://www.analyticssteps.com/blogs/anova-test-definition-and-examples

Zach (December 30, 2018). Two-Way ANOVA: Definition, Formula, and Example. https://www.statology.org/two-way-anova/
One-Way ANOVA Using SPSS
A manager wants to raise the productivity at his company by increasing the speed at which his employees can use a particular spreadsheet
program. As he does not have the skills in-house, he employs an external agency which provides training in this spreadsheet program.
They offer 3 courses: a beginner, intermediate and advanced course. He is unsure which course is needed for the type of work they do at
his company, so he sends 10 employees on the beginner course, 10 on the intermediate and 10 on the advanced course. When they all
return from the training, he gives them a problem to solve using the spreadsheet program, and times how long it takes them to complete
the problem. He then compares the three courses (beginner, intermediate, advanced) to see if there are any differences in the average time
it took to complete the problem.

In SPSS Statistics, we separated the groups for analysis by creating a grouping variable called Course (i.e., the independent variable), and gave the beginners course a value of "1", the intermediate course a value of "2" and the advanced course a value of "3". Time to complete the set problem was entered under the variable name Time (i.e., the dependent variable). In our enhanced one-way ANOVA guide, we show you how to correctly enter data in SPSS Statistics to run a one-way ANOVA.
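Outside of SPSS, the same data layout and omnibus test can be sketched in Python (pandas + SciPy); the times below are placeholders, not the actual values from this example:

import pandas as pd
from scipy import stats

# Same layout as the SPSS file: Course is the grouping variable
# (1 = beginner, 2 = intermediate, 3 = advanced) and Time is the dependent variable
data = pd.DataFrame({
    "Course": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "Time":   [27.5, 26.0, 28.1, 23.9, 23.2, 24.0, 23.7, 22.9, 23.5],  # placeholder minutes
})

groups = [g["Time"].values for _, g in data.groupby("Course")]
f_stat, p_value = stats.f_oneway(*groups)
print(f_stat, p_value)   # compare the p-value against 0.05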
Transfer the dependent variable, Time, into the Dependent List: box and the independent variable, Course, into the Factor: box using the appropriate buttons (or drag-and-drop the variables into the boxes), as shown below:
This is the table that shows the output of the ANOVA analysis and whether there is a statistically significant difference between our group means. We can see that the significance value is 0.021 (i.e., p = .021), which is below 0.05, and therefore there is a statistically significant difference in the mean length of time to complete the spreadsheet problem between the different courses taken.
In order to know which of the specific groups differed, we can find this out in the Multiple Comparisons table, which contains the results of the Tukey post hoc test. There was a statistically significant difference between groups as determined by one-way ANOVA (F(2,27) = 4.467, p = .021). A Tukey post hoc test revealed that the time to complete the problem was statistically significantly lower after taking the intermediate (23.6 ± 3.3 min, p = .046) and advanced (23.4 ± 3.2 min, p = .034) courses compared to the beginners course (27.2 ± 3.0 min). There was no statistically significant difference between the intermediate and advanced groups (p = .989).

From the results so far, we know that there are statistically significant differences between the groups
as a whole. The table below, Multiple Comparisons, shows which groups differed from each other.
The Tukey post hoc test is generally the preferred test for conducting post hoc tests on a one-way
ANOVA, but there are many others. We can see from the table below that there is a statistically
significant difference in time to complete the problem between the group that took the beginner
course and the intermediate course (p = 0.046), as well as between the beginner course and
advanced course (p = 0.034). However, there were no differences between the groups that took the
intermediate and advanced course (p = 0.989).

Two-Way ANOVA Using SPSS

A researcher was interested in whether an individual's interest in politics was influenced by their level of education
and gender. They recruited a random sample of participants to their study and asked them about their interest in
politics, which they scored from 0 to 100, with higher scores indicating a greater interest in politics. The
researcher then divided the participants by gender (Male/Female) and then again by level of education
(School/College/University). Therefore, the dependent variable was "interest in politics", and the two independent
variables were "gender" and "education".

A researcher had previously discovered that interest in politics is influenced by level of education. When participants were classified
into three groups according to their highest level of education; namely "school", "college" or "university", in that order; higher education
levels were associated with a greater interest in politics. Having demonstrated this, the researcher was now interested in determining
whether this effect of education level on interest in politics was different for males and females (i.e., different depending on your
gender). To answer this question, they recruited 60 participants: 30 males and 30 females, equally split by level of education
(School/College/University) (i.e., 10 participants in each group). The researcher had participants complete a questionnaire that
assessed their interest in politics, which they called the "Political Interest" scale. Participants could score anything between 0 and 100,
with higher scores indicating a greater interest in politics.
Transfer the dependent variable, political_interest, into the Dependent Variable: box, and the independent variables, gender and education_level, into the Fixed Factor(s): box.

Click on the button. This will add this profile plot, which it labels "education_level*gender", into the Plots: box, as shown below:
Click on the EM Means button. You will be presented with the Univariate: Estimated Marginal Means dialogue box, as shown below:

Transfer the interaction effect, "gender*education_level", from the Factor(s) and Factor Interactions: box to the Display Means for: box by highlighting it and clicking on the button, as shown below. Then click on the Continue button. You will be returned to the Univariate dialogue box.
Click on the Options button. You will be presented with the Univariate: Options dialogue box, as shown below:

Transfer education_level from the Factor(s): box to the Post Hoc Tests for: box using the button. This will activate the –Equal Variances Assumed– area (i.e., it will no longer be greyed out) and present you with some choices for which post hoc test to use. For this example, we are going to select Tukey, which is a good, all-round post hoc test.

Note: You only need to transfer independent variables that have more than two groups into the Post Hoc Tests for: box. This is why we do not transfer gender.
RESULTS

The particular rows we are interested in are the "gender", "education_level" and "gender*education_level" rows, and these are highlighted above. These rows inform us whether our independent variables (the "gender" and "education_level" rows) and their interaction (the "gender*education_level" row) have a statistically significant effect on the dependent variable, "interest in politics". It is important to first look at the "gender*education_level" interaction, as this will determine how you can interpret your results (see our enhanced guide for more information).

You can see from the "Sig." column that we have a statistically significant interaction at the p = .002 level. You may also wish to report the results of "gender" and "education_level", but again, these need to be interpreted in the context of the interaction result. We can see from the table above that there was no statistically significant difference in mean interest in politics between males and females (p = .448), but there were statistically significant differences between educational levels (p < .001).

▰ RESULTS OF TWO-WAY ANOVA

A two-way ANOVA was conducted that examined the effect of gender and education level on interest in politics. There was a statistically significant interaction between the effects of gender and education level on interest in politics, F(2, 52) = 7.315, p = .002.

Simple main effects analysis showed that males were significantly more interested in politics than females when educated to university level (p = .002), but there were no differences between genders when educated to school (p = .465) or college level (p = .793).

There was also a statistically significant difference between all three different educational levels (p < .001).
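A hedged sketch of a simplified simple-main-effects style follow-up in Python (pandas + SciPy; the column names political_interest, gender and education, and all values, are assumptions): for each education level, compare males and females.

import pandas as pd
from scipy import stats

# Hypothetical long-format data (illustrative values only)
df = pd.DataFrame({
    "political_interest": [40, 45, 38, 42, 55, 60, 52, 58, 70, 85, 65, 90],
    "gender":    ["male", "female"] * 6,
    "education": ["school"] * 4 + ["college"] * 4 + ["university"] * 4,
})

# Simple main effect of gender at each level of education:
# compare the two gender groups within each education level
for level, subset in df.groupby("education"):
    males = subset.loc[subset["gender"] == "male", "political_interest"]
    females = subset.loc[subset["gender"] == "female", "political_interest"]
    t_stat, p_value = stats.ttest_ind(males, females)
    print(level, round(p_value, 3))

# Note: a full simple-main-effects analysis would normally use the error term from
# the overall two-way model; the per-level t-tests above are only an approximation.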

MANOVA Using SPSS: Examples
Research Question
How do you lose weight effectively? Do diets really work, and what about exercise? In order to find out, 180 participants were assigned to one of 3 diets and one of 3 exercise levels. After two months, participants were asked how many kilos they had lost.

▰ Using SPSS for computation

Two-Way ANOVA Output - Between Subjects Effects

Our means plot was very useful for describing the pattern of means resulting from diet and exercise in our sample. But perhaps things are different in the larger population. If neither diet nor exercise affects weight loss, could we find these sample results by mere sampling fluctuation? Short answer: no.

Two-Way ANOVA Output - Multiple Comparisons

We now know that the average weight loss is not equal for all the different diets and exercise levels. So precisely which means are different? We can figure that out with post hoc tests, the most common of which is Tukey’s HSD.

References:
SPSS Basics Tutorial. Retrieved from https://www.spss-tutorials.com/spss-two-way-anova-basics-tutorial/

https://statistics.laerd.com/spss-tutorials/one-way-anova-using-spss-statistics-2.php
