Unit 4-1

Uploaded by

emmanuelemlucks

Statistical Inference:

By
Wilhemina Adoma Pels

Department of Statistics and Actuarial Science

KNUST

Analysis of Variance (ANOVA)

July 1, 2024
Introduction to Analysis of Variance (ANOVA)

What is an ANOVA?
An ANOVA test is a type of statistical test used to determine if
there is a statistically significant difference between two or more
categorical groups by testing for differences of means using
variance.

ANOVA

ANOVA is used to compare means across multiple groups or conditions.

It is a statistical technique for making inferences about population means based on sample data.

ANOVA allows one to determine whether the differences between the samples are simply due to random error (sampling error) or whether there are systematic treatment effects that cause the mean in one group to differ from the mean in another.

Assumptions of ANOVA

Normality: Observations within each group follow a normal distribution.

Independence: Observations are independent of each other, within and between groups. In particular, subjects in the first group cannot also be in the second group (i.e. independent samples, a between-groups design).

Homogeneity of variance (homoscedasticity): The spread of scores (measured by the standard deviation, for example) is similar across the populations being compared.

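Sample sizes: The different groups/levels should ideally have equal sample sizes (a balanced design). This is not a strict requirement of one-way ANOVA, but it makes the test more robust to unequal variances.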

Hypotheses in ANOVA

Null Hypothesis
Null hypothesis ($H_0$): There is no significant difference between the means of the groups.
$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_t$
All population means are equal
No treatment effect

Alternative Hypothesis
Alternative hypothesis ($H_A$): There is a significant difference between at least one pair of group means.
$H_A$: at least one $\mu_i$ is different
At least one population mean is different
Treatment effect
Note that $H_A$ is NOT $\mu_1 \neq \mu_2 \neq \mu_3 \neq \cdots \neq \mu_t$: the means do not all have to differ from one another.

ANOVA Test Statistic

The F-test is used as the test statistic in ANOVA.

It compares the between-group variance to the

within-group variance.

The formula for calculating the F-statistic is:

$$F = \frac{\text{Between-group variance}}{\text{Within-group variance}}$$

ANOVA Test Procedure

1 Formulate hypotheses.

2 Calculate the test statistic (F-statistic).

3 Determine the p-value.

4 Compare the p-value to the significance level (e.g., α = 0.05).

5 Make a conclusion based on the p-value and significance level.

Interpreting ANOVA Results

If the p-value is less than the significance level, reject the null hypothesis and conclude that there is a significant difference between at least one pair of group means.

If the p-value is greater than the significance level, fail to reject the null hypothesis and conclude that there is not enough evidence to support a significant difference between group means.
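
The decision rule above can be sketched as a small function. This is only an illustration: the function name `anova_decision` and the two p-values passed to it are hypothetical, not taken from the slides.

```python
def anova_decision(p_value: float, alpha: float = 0.05) -> str:
    """Return the conclusion of an ANOVA F-test for a given p-value."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    if p_value < alpha:
        # Evidence against H0: at least one group mean differs.
        return "reject H0: at least one group mean differs"
    # Not enough evidence against H0.
    return "fail to reject H0: not enough evidence of a difference"


# Hypothetical p-values, used only to exercise both branches.
print(anova_decision(0.012))
print(anova_decision(0.20))
```

Failing to reject H0 is not the same as proving that the means are equal; it only says the data do not provide enough evidence of a difference.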

Types of ANOVA

One-way ANOVA

• A one-way ANOVA (analysis of variance) has one categorical independent variable (also known as a factor) and a normally distributed continuous (i.e., interval or ratio level) dependent variable.
• A one-way ANOVA is used to compare the means of independent (unrelated) groups using the F-distribution, typically three or more groups; with only two groups it is equivalent to the equal-variance two-sample t-test. The null hypothesis for the test is that all the group means are equal. Therefore, a significant result means that at least one group mean differs from the others.

Hence, we want to study the effect of a qualitative variable on a quantitative outcome variable.

One-way ANOVA example

Note: Qualitative variables are referred to as factors

Scenario
As a food scientist, you want to test the effect of three different additive mixtures on yoghurt making. You can use a one-way ANOVA to find out if there is a difference in the shelf life of the yoghurt between the three groups.

When to use a one-way ANOVA

• Use a one-way ANOVA when you have collected data about one categorical independent variable and one quantitative dependent variable. The independent variable should have at least three levels (i.e. at least three different groups or categories).

• ANOVA tells you if the dependent variable changes according to the level of the independent variable. For example:

• Your independent variable is brand of soda, and you collect data on Coke, Pepsi, Sprite, and Fanta to find out if there is a difference in the price per 100ml.

Null Hypothesis
The null hypothesis (H0) of ANOVA is that there is no difference among group means, i.e. all the groups have equal means:
H0 : µ1 = µ2 = µ3

Alternative Hypothesis
The alternative hypothesis (Ha) is that at least one group mean differs from the others.

Level of significance α is selected as 0.05

Why not several t-tests

• Imagine we have a design with three groups that have to be compared: G1, G2, G3

• We would have to run three separate t-tests (one to compare G1 with G2, one to compare G1 with G3, and one to compare G2 with G3)

• For every test we use a general α-level of 0.05

Cont’d

• α-level = 0.05

• Each test carries a 5% probability of a Type I error, i.e. rejecting H0 when H0 is actually true.

• Our aim is to keep the probability of a Type I error low.

• If we were to run 3 separate t-tests to compare G1, G2 and G3, each with an α-level of 0.05, and treat the tests as independent, the overall probability of making no Type I error would be (0.95)^3 = 0.857.

• Subtracting that from 1 (= 100%) gives the probability of making at least one Type I error: 1 − 0.857 = 0.143.
• We have about a 14% chance of making at least one Type I error.
• 14% is almost three times the usual 5%.
• That is too high to accept, which is why a single ANOVA is preferred to several t-tests.
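
The arithmetic above generalises to any number of groups. The sketch below assumes, as the slide does, that the pairwise tests are independent; the function name `familywise_error_rate` is a hypothetical helper, not something defined in the lecture.

```python
from math import comb


def familywise_error_rate(n_groups: int, alpha: float = 0.05) -> float:
    """Probability of at least one Type I error across all pairwise tests.

    Assumes the pairwise tests are independent, which is a simplification.
    """
    n_tests = comb(n_groups, 2)          # number of pairwise comparisons
    return 1.0 - (1.0 - alpha) ** n_tests


for groups in (3, 4, 5):
    rate = familywise_error_rate(groups)
    print(f"{groups} groups, {comb(groups, 2)} t-tests: {rate:.3f}")
```

For three groups this reproduces the 0.143 computed above, and the rate grows quickly as more groups are added.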

How does an ANOVA test work

• ANOVA determines whether the groups created by the levels of the independent variable are statistically different by calculating whether the means of the treatment levels are different from the overall mean of the dependent variable.

• If any of the group means is significantly different from the overall mean, then the null hypothesis is rejected.

• ANOVA uses the F-test for statistical significance. This allows for comparison of multiple means at once, because the error is calculated for the whole set of comparisons rather than for each individual pairwise comparison (which would happen with a t-test).

The F-test compares the variance among the group means (between-group variance) with the variance of the observations within the groups. If the variance within groups is small relative to the variance between groups, the F-test will find a higher F-value, and the observed difference is less likely to be due to chance alone.

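Computing the one way ANOVA

Here is the basic one-way ANOVA table (n total observations, m groups):

Source of Variation      Sum of Squares   Degrees of Freedom   Mean Square            F
Between Groups           SSTR             m − 1                MSTR = SSTR/(m − 1)    MSTR/MSE
Within Groups (Error)    SSE              n − m                MSE = SSE/(n − m)
Total                    SST              n − 1
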
We can see that there are two different sources of variation that
an ANOVA measures:

• Between-Group Variation: The total variation between each group mean and the overall mean.

• Within-Group Variation: The total variation between the individual values in each group and their group mean, also called the “unexplained” or random error.

• If the between-group variation is high relative to the within-group variation, then the F-statistic of the ANOVA will be higher and the corresponding p-value will be lower, which makes it more likely that we’ll reject the null hypothesis that the group means are equal.
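
A minimal sketch of this idea, using made-up numbers: the two hypothetical datasets below have identical within-group spread, but the group means are much further apart in the second one, so its F-statistic is much larger. The function name `f_statistic` and both datasets are illustrative assumptions.

```python
from statistics import mean


def f_statistic(groups):
    """One-way ANOVA F-statistic for a list of groups (lists of numbers)."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    n, m = len(all_values), len(groups)
    # Between-group variation: group means versus the overall mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation: observations versus their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (m - 1)) / (ss_within / (n - m))


# Hypothetical data: same spread inside each group in both cases.
close_means = [[10, 12, 14], [11, 13, 15], [12, 14, 16]]
far_means = [[10, 12, 14], [21, 23, 25], [32, 34, 36]]

print(f_statistic(close_means))  # small F: means barely differ
print(f_statistic(far_means))    # large F: means are far apart
```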

Degrees of freedom

• The total degrees of freedom are n − 1, where n is the total number of observations collected.

• If there are m groups being compared, then there are m − 1 degrees of freedom associated with the factor of interest (between groups): the number of groups minus one.

• For the within-group (error) term, subtract the degrees of freedom for groups from the total degrees of freedom. If there are n total data points collected and m groups being compared, then there are (n − 1) − (m − 1) = n − m error degrees of freedom.
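
The three rules can be checked with a few lines of code. This is a sketch only; `anova_degrees_of_freedom` is a hypothetical helper name, and the call uses 9 observations in 3 groups purely as an illustration.

```python
def anova_degrees_of_freedom(n_observations: int, n_groups: int) -> dict:
    """Degrees of freedom for a one-way ANOVA with n observations and m groups."""
    if n_groups < 2 or n_observations <= n_groups:
        raise ValueError("need at least 2 groups and more observations than groups")
    df_between = n_groups - 1               # m - 1
    df_within = n_observations - n_groups   # n - m
    df_total = n_observations - 1           # n - 1
    # The between and within parts always add up to the total.
    assert df_between + df_within == df_total
    return {"between": df_between, "within": df_within, "total": df_total}


print(anova_degrees_of_freedom(9, 3))  # {'between': 2, 'within': 6, 'total': 8}
```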

Example

Suppose the National Transportation Safety Board (NTSB) wants to examine the safety of compact cars, midsize cars, and full-size cars. It collects a sample of three for each of the treatments (car types). Using the hypothetical data provided below, test whether the mean pressure applied to the driver’s head during a crash test is equal for each type of car. Use α = 5%.

        Compact cars   Midsize cars   Full-size cars
        643            469            484
        655            427            456
        702            525            402
X̄       666.67         473.67         447.33
S       31.18          49.17          41.68

Solution
1 State the null and alternative hypotheses. The null hypothesis for an ANOVA always assumes the population means are equal. Hence, we may write the null hypothesis as:
H0 : µ1 = µ2 = µ3
This means the mean head pressure is equal across the three types of cars.
• Since the null hypothesis assumes all the means are equal, we can reject the null hypothesis if even one mean is not equal to the others. Thus, the alternative hypothesis is:
Ha : At least one mean pressure is not equal to the others.
2 Calculate the appropriate test statistic. The test statistic in ANOVA is the ratio of the between variation to the within variation in the data. It follows an F distribution.

Solution

Total Sum of Squares: the total variation in the data. It is the sum of the between and within variation.

• Total Sum of Squares: $SST = \sum_{i=1}^{r} \sum_{j=1}^{c} (X_{ij} - \bar{\bar{X}})^2$

where r is the number of rows in the table, c is the number of columns, $\bar{\bar{X}}$ is the grand mean, and $X_{ij}$ is the i-th observation in the j-th column. Using the data in the table we may find the grand mean:

$\bar{\bar{X}} = \frac{\sum X_{ij}}{N} = \frac{643 + 655 + 702 + 469 + 427 + 525 + 484 + 456 + 402}{9} = 529.22$

$SST = (643 - 529.22)^2 + (655 - 529.22)^2 + (702 - 529.22)^2 + (469 - 529.22)^2 + \ldots + (402 - 529.22)^2 = 96303.55$

Solution cont’d

Between Sum of Squares (or Treatment Sum of Squares): variation in the data between the different samples (or treatments).

• Treatment Sum of Squares: $SSTR = \sum_{j} r_j (\bar{X}_j - \bar{\bar{X}})^2$

where $r_j$ is the number of rows in the j-th treatment and $\bar{X}_j$ is the mean of the j-th treatment. Using the data in the table:

$SSTR = 3 \times (666.67 - 529.22)^2 + 3 \times (473.67 - 529.22)^2 + 3 \times (447.33 - 529.22)^2 = 86049.55$

Solution Cont’d

Within variation (or Error Sum of Squares): variation in the data within each individual treatment.

• Error Sum of Squares: $SSE = \sum_{j} \sum_{i} (X_{ij} - \bar{X}_j)^2$

From the table,

$SSE = (643 - 666.67)^2 + (655 - 666.67)^2 + (702 - 666.67)^2 + (469 - 473.67)^2 + (427 - 473.67)^2 + (525 - 473.67)^2 + (484 - 447.33)^2 + (456 - 447.33)^2 + (402 - 447.33)^2 = 10254$

Note that

$SST = SSTR + SSE$, i.e. $96303.55 = 86049.55 + 10254$
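
The three sums of squares can be recomputed from the crash-test data as a check. This is a minimal sketch using only the Python standard library; the variable names are my own. Because the code works with unrounded means, its results may differ from the slide values in the last decimal place.

```python
from statistics import mean

# Crash-test head pressure data from the example table.
crash_data = {
    "compact": [643, 655, 702],
    "midsize": [469, 427, 525],
    "full_size": [484, 456, 402],
}

all_values = [x for values in crash_data.values() for x in values]
grand_mean = mean(all_values)

# Total variation: every observation versus the grand mean.
sst = sum((x - grand_mean) ** 2 for x in all_values)
# Between (treatment) variation: group means versus the grand mean.
sstr = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in crash_data.values())
# Within (error) variation: observations versus their own group mean.
sse = sum((x - mean(v)) ** 2 for v in crash_data.values() for x in v)

print(f"grand mean = {grand_mean:.2f}")
print(f"SST = {sst:.2f}, SSTR = {sstr:.2f}, SSE = {sse:.2f}")
print(f"SSTR + SSE = {sstr + sse:.2f}")
```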

• Hence, you only need to compute any two of the three sources of variation to conduct an ANOVA. Especially for the first few problems you work out, you should calculate all three for practice.
• The next step in an ANOVA is to compute the “average” sources of variation in the data using SST, SSTR, and SSE.
• Total Mean Square: $MST = \frac{SST}{N - 1}$ → “average total variation in the data” (N is the total number of observations)
$MST = \frac{96303.55}{9 - 1} = 12037.94$
• Mean Square Treatment: $MSTR = \frac{SSTR}{c - 1}$ → “average between variation” (c is the number of columns in the data table)
$MSTR = \frac{86049.55}{3 - 1} = 43024.78$
• Mean Square Error: $MSE = \frac{SSE}{N - c}$ → “average within variation”
$MSE = \frac{10254}{9 - 3} = 1709$

The test statistic may now be calculated. For a one-way ANOVA the test statistic is equal to the ratio of MSTR and MSE. This is the ratio of the “average between variation” to the “average within variation”. In addition, this ratio is known to follow an F distribution. Hence,
$F = \frac{MSTR}{MSE} = \frac{43024.78}{1709} = 25.17$
The intuition here is relatively straightforward. If the average between variation rises relative to the average within variation, the F statistic will rise and so will our chance of rejecting the null hypothesis.
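
As a quick check of the arithmetic, the mean squares and the F statistic can be recomputed from the sums of squares quoted in the slides. The variable names below are my own. The unrounded ratio is about 25.175, which the slides report as 25.17.

```python
# Sums of squares as quoted in the worked example.
sstr = 86049.55   # treatment (between) sum of squares
sse = 10254.0     # error (within) sum of squares
n_obs = 9         # N, total number of observations
n_groups = 3      # c, number of columns (treatments)

mstr = sstr / (n_groups - 1)      # average between variation
mse = sse / (n_obs - n_groups)    # average within variation
f_stat = mstr / mse

print(f"MSTR = {mstr:.3f}")
print(f"MSE  = {mse:.3f}")
print(f"F    = {f_stat:.3f}")
```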

Obtain the Critical Value
To find the critical value from an F distribution you must know the degrees of freedom for the numerator (MSTR) and for the denominator (MSE), along with the significance level.
• $F_{CV}$ has $df_1$ and $df_2$ degrees of freedom, where $df_1$ is the numerator degrees of freedom, equal to m − 1, and $df_2$ is the denominator degrees of freedom, equal to n − m.
• In our example, $df_1 = 3 - 1 = 2$ and $df_2 = 9 - 3 = 6$. Hence we need to find the critical value of F corresponding to α = 5%. Using the F tables in your text we determine that $F_{0.05;\,2,6} = 5.14$
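
The table value can be cross-checked without a statistics library in this particular case. When the numerator has exactly 2 degrees of freedom, the upper tail of the F distribution has the closed form P(F > x) = (1 + 2x/df2)^(−df2/2). The sketch below relies on that special case only; for other numerator degrees of freedom use F tables or statistical software. The function names are hypothetical.

```python
def f_upper_tail_df1_2(x: float, df2: int) -> float:
    """P(F > x) for an F distribution with (2, df2) degrees of freedom.

    Valid only when the numerator degrees of freedom equal 2.
    """
    return (1.0 + 2.0 * x / df2) ** (-df2 / 2.0)


def f_critical_df1_2(alpha: float, df2: int) -> float:
    """Critical value with upper-tail area alpha, for (2, df2) degrees of freedom."""
    return (df2 / 2.0) * (alpha ** (-2.0 / df2) - 1.0)


critical_value = f_critical_df1_2(0.05, 6)
p_value = f_upper_tail_df1_2(25.17, 6)   # F statistic from the worked example

print(f"critical value = {critical_value:.2f}")  # 5.14
print(f"p-value        = {p_value:.4f}")
```

The p-value is far below 0.05, which agrees with comparing the F statistic against the critical value.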

One-way ANOVA Decision Criteria
• If F_calculated > F_critical, we reject the null hypothesis and conclude that at least two of the group means differ.
• If F_calculated < F_critical, we fail to reject the null hypothesis and conclude that there is not enough evidence of a difference between the group means.
Decision rule per example
In our example 25.17 > 5.14, so we reject the null hypothesis.

Interpretation
Since we rejected the null hypothesis, we conclude at the 5% significance level that the mean head pressure is not equal for compact, midsize, and full-size cars. However, since only one mean must be different to reject the null, we do not yet know which mean(s) is/are different. In short, an ANOVA test will tell us that at least one mean is different, but an additional (post hoc) test must be conducted to determine which mean(s) is/are different.

Example 2: Reed Manufacturing

J. R. Reed would like to know if the mean number of hours worked per week is the same for the department managers at her three manufacturing plants (Buffalo: Plant 1, Pittsburgh: Plant 2, and Detroit: Plant 3). A simple random sample of 5 managers from each of the three plants was taken and the number of hours worked by each manager for the previous week is shown on the next slide.

Example 2: Reed Manufacturing

                   Plant 1    Plant 2      Plant 3
Observation        Buffalo    Pittsburgh   Detroit
1                  48         73           51
2                  54         63           63
3                  57         66           61
4                  54         64           54
5                  62         74           56
Sample Mean        55         68           57
Sample Variance    26.0       26.5         24.5

Solution

Hypothesis
H0 : µ1 = µ2 = µ3
Ha : not all the means are equal

where:
µ1 = mean number of hours worked per week by the managers at Plant 1,
µ2 = mean number of hours worked per week by the managers at Plant 2,
µ3 = mean number of hours worked per week by the managers at Plant 3

Cont’d

Mean Square Between
Since the sample sizes are all equal, the grand mean is the average of the three sample means:
$\bar{\bar{x}} = (55 + 68 + 57)/3 = 60$
$SSB = 5(55 - 60)^2 + 5(68 - 60)^2 + 5(57 - 60)^2 = 490$
$MSB = 490/(3 - 1) = 245$

Mean Square Within
$SSW = 4(26.0) + 4(26.5) + 4(24.5) = 308$
$MSW = 308/(15 - 3) = 25.667$

F-test
If H0 is true, the ratio MSB/MSW should be near 1 since both MSB and MSW are estimating σ². If Ha is true, the ratio should be significantly larger than 1 since MSB tends to overestimate σ².

Rejection Rule
Assuming α = 0.05, F0.05 = 3.89 (2 d.f. numerator, 12 d.f. denominator).
Reject H0 if F > 3.89.

Test Statistic
F = MSB/MSW = 245/25.667 = 9.55

Conclusion
F = 9.55 > F0.05 = 3.89, so we reject H0. The mean number of hours worked per week by department managers is not the same at each plant.

ANOVA Table
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F
Between Groups        490              2                    245           9.55
Within Groups         308              12                   25.667
Total                 798              14
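
The whole Reed Manufacturing table can be rebuilt from the raw observations. This is a sketch using only the Python standard library; the variable names are my own, and the critical value 3.89 is taken from the slides rather than computed.

```python
from statistics import mean, variance

# Hours worked per week by the sampled managers at each plant.
hours = {
    "Buffalo": [48, 54, 57, 54, 62],
    "Pittsburgh": [73, 63, 66, 64, 74],
    "Detroit": [51, 63, 61, 54, 56],
}

n_groups = len(hours)
all_hours = [x for values in hours.values() for x in values]
n_obs = len(all_hours)
grand_mean = mean(all_hours)

# Between-groups sum of squares from the group means.
ssb = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in hours.values())
# Within-groups sum of squares from the sample variances.
ssw = sum((len(v) - 1) * variance(v) for v in hours.values())

msb = ssb / (n_groups - 1)
msw = ssw / (n_obs - n_groups)
f_stat = msb / msw
f_critical = 3.89  # table value for alpha = 0.05 with (2, 12) d.f.

print(f"Between Groups  SS={ssb:.0f}  df={n_groups - 1}  MS={msb:.3f}  F={f_stat:.2f}")
print(f"Within Groups   SS={ssw:.0f}  df={n_obs - n_groups}  MS={msw:.3f}")
print(f"Total           SS={ssb + ssw:.0f}  df={n_obs - 1}")
print("reject H0" if f_stat > f_critical else "fail to reject H0")
```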

Group Assignment
