
CHAPTER 4

ANALYSIS OF VARIANCE (ANOVA)


Introduction
In Chapter 3, we presented a method for testing the equality of two population means. Suppose now that we wish to test the equality of more than two population means. The test procedure described in the previous chapter applies only to two means and is therefore inappropriate. Hence, we will employ a more general method of data analysis, the analysis of variance.
4.1. Test of hypothesis about the equality of more than two population means
When an F test is used to test a hypothesis concerning the means of three or more populations,
the technique is called analysis of variance (ANOVA).
At first glance, you might think that to compare the means of three or more samples, you could use the t test, comparing two means at a time. But there are several reasons why the t test should not be used in this way.
 First, when you are comparing two means at a time, the rest of the means under study are
ignored. With the F test, all the means are compared simultaneously.
 Second, when you are comparing two means at a time and making all pairwise comparisons,
the probability of rejecting the null hypothesis when it is true is increased, since the more t
tests that are conducted, the greater is the likelihood of getting significant differences by
chance alone.
 Third, the more means there are to compare, the more t tests are needed. For example, comparing 3 means two at a time requires 3 t tests; comparing 5 means two at a time requires 10 tests; and comparing 10 means two at a time requires 45 tests. The short sketch after this list shows how quickly the number of tests, and the chance of a false rejection, grows.
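As an illustration (an addition to the text, not from the original), the Python sketch below counts the pairwise t tests, k(k − 1)/2, and estimates the familywise Type I error rate as 1 − (1 − α)^m, which assumes the m tests are independent; this simplifying assumption is used only to show how quickly the error rate grows.

from math import comb

alpha = 0.05
for k in (3, 5, 10):
    m = comb(k, 2)                     # number of pairwise t tests: k(k - 1)/2
    familywise = 1 - (1 - alpha) ** m  # approximate overall chance of a false rejection
    print(k, m, round(familywise, 3))  # k = 10 gives 45 tests and roughly a 0.90 error rate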

Even though you are comparing three or more means in this use of the F test, variances are used
in the test instead of means. With the F test, two different estimates of the population variance
are made. The first estimate is called the between-group variance, and it involves finding the
variance of the means. The second estimate, the within-group variance, is made by computing
the variance using all the data and is not affected by differences in the means. If there is no
difference in the means, the between-group variance estimate will be approximately equal to the
within-group variance estimate, and the F test value will be approximately equal to 1. The null
hypothesis will not be rejected. However, when the means differ significantly, the between-
group variance will be much larger than the within-group variance; the F test value will be

significantly greater than 1; and the null hypothesis will be rejected. Since variances are
compared, this procedure is called analysis of variance (ANOVA).
For a test of the difference among 3 or more means, the following hypotheses are used:
H0: μ1 = μ2 = μ3 = … = μk versus H1: at least one mean is different from the others.
As stated previously, a significant test value means that there is a high probability that the difference in means is not due to chance, but it does not indicate where the difference lies. The degrees of freedom for this F test are the numerator degrees of freedom, d.f.N. = k − 1, where k is the number of groups, and the denominator degrees of freedom, d.f.D. = N − k, where N = n1 + n2 + n3 + … + nk is the sum of the sample sizes of the groups. The sample sizes need not be equal. The F test to compare means is always right-tailed.
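As a side note (not in the original text), the right-tailed critical value F_α(k − 1, N − k) can be obtained in Python with SciPy; the sketch below assumes SciPy is installed and uses α = 0.05, k = 3, and N = 15, the values that appear in Example 1 below.

from scipy import stats

alpha, k, N = 0.05, 3, 15
critical_value = stats.f.ppf(1 - alpha, k - 1, N - k)  # right-tail critical value F_0.05(2, 12)
print(round(critical_value, 2))                        # approximately 3.89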
Examples below illustrate the computational procedure for the ANOVA technique for comparing
three or more means, and the steps are summarized in the Procedure Table shown after the
examples.
Example 1: A researcher wishes to try three different techniques to lower the blood pressure of individuals diagnosed with high blood pressure. The subjects are randomly assigned to three groups; the first group takes medication, the second group exercises, and the third group follows a special diet. After four weeks, the reduction in each person’s blood pressure is recorded. At α = 0.05, test the claim that there is no difference among the means. [The data table is not reproduced here; each group has n = 5 subjects, with sample means 11.8, 3.8, and 7.6 and sample variances 5.7, 10.2, and 10.3.]
Step 1: State the hypotheses. H0: μ1 = μ2 = μ3 versus H1: at least one mean is different from the others.
Step 2: Find the critical value. With d.f.N. = 3 − 1 = 2 and d.f.D. = 15 − 3 = 12, the critical value is F0.05(2, 12) = 3.89.
Step 3: Compute the test value.
a. Find the mean of each sample: X̄1 = 11.8, X̄2 = 3.8, X̄3 = 7.6.
b. Find the grand mean, the mean of all the data values: X̄_GM = 7.73.

c. Find the between-group variance, denoted by S²_B:
S²_B = ∑ n_i(X̄_i − X̄_GM)² / (k − 1)
     = [5(11.8 − 7.73)² + 5(3.8 − 7.73)² + 5(7.6 − 7.73)²] / (3 − 1)
     = 160.13 / 2 = 80.07
Note: This formula finds the variance among the means by using the sample sizes as weights and
considers the differences in the means.
d. Find the within-group variance, denoted by S²_W:
S²_W = ∑ (n_i − 1)S²_i / ∑ (n_i − 1)
     = [(5 − 1)(5.7) + (5 − 1)(10.2) + (5 − 1)(10.3)] / [(5 − 1) + (5 − 1) + (5 − 1)]
     = 104.80 / 12 = 8.73
Note: This formula finds an overall variance by calculating a weighted average of the individual
variances. It does not involve using differences of the means.

e. Find the F test value:
F = S²_B / S²_W = 80.07 / 8.73 = 9.17
Step 4: Make the decision. The decision is to reject the null hypothesis, since 9.17 > 3.89.
Conclusion: There is enough evidence to reject the claim and conclude that at least one mean is
different from the others.
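As a cross-check (an addition, not part of the original solution), the Python sketch below reproduces the Example 1 computation from the summary statistics quoted above; it assumes NumPy and SciPy are installed.

import numpy as np
from scipy import stats

n = np.array([5, 5, 5])                      # sample sizes
means = np.array([11.8, 3.8, 7.6])           # sample means (medication, exercise, diet)
variances = np.array([5.7, 10.2, 10.3])      # sample variances
k, N = len(n), n.sum()

grand_mean = (n * means).sum() / N                            # 7.73
s2_between = (n * (means - grand_mean) ** 2).sum() / (k - 1)  # about 80.07
s2_within = ((n - 1) * variances).sum() / (n - 1).sum()       # about 8.73
F = s2_between / s2_within                                    # about 9.17
critical = stats.f.ppf(0.95, k - 1, N - k)                    # about 3.89
print(F > critical)                                           # True, so H0 is rejected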
The numerator of the fraction obtained in step 3, part c, of the computational procedure is called
the sum of squares between groups, denoted by SSB. The numerator of the fraction obtained in
step 3, part d, of the computational procedure is called the sum of squares within groups,
denoted by SSW. This statistic is also called the sum of squares for the error. SSB is divided by
d.f.N. to obtain the between-group variance. SSW is divided by N−k to obtain the within-group
or error variance. These two variances are sometimes called mean squares, denoted by MSB and
MSW. These terms are used to summarize the analysis of variance and are placed in a summary table, as shown below.
Analysis of Variance Summary Table
Source            Sum of squares    d.f.     Mean square    F
Between           SSB               k − 1    MSB            MSB / MSW
Within (error)    SSW               N − k    MSW
Total             SSB + SSW         N − 1

where MSB = SSB / (k − 1), MSW = SSW / (N − k), k = number of groups, and
N = n1 + n2 + n3 + … + nk = sum of the sample sizes for the groups.
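For illustration (an addition, not in the original), the mean squares and F value can be computed directly from the sums of squares; the sketch below plugs in the Example 1 quantities SSB = 160.13, SSW = 104.80, k = 3, and N = 15.

ssb, ssw, k, N = 160.13, 104.80, 3, 15

msb = ssb / (k - 1)   # between-group mean square, about 80.07
msw = ssw / (N - k)   # within-group (error) mean square, about 8.73
f_value = msb / msw   # about 9.17, matching Example 1
total_ss = ssb + ssw  # total sum of squares, with N - 1 degrees of freedom
print(msb, msw, f_value, total_ss)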
Example 2: A state employee wishes to see whether there is a significant difference in the number of employees at the interchanges of three state toll roads. At α = 0.05, can it be concluded that there is a significant difference in the average number of employees at each interchange? [The data table is not reproduced here; each toll road has n = 6 interchanges, with sample means 15.5, 4, and 5.8 and sample variances 81.9, 25.6, and 29.0.]
Step 1: State the hypotheses. H0: μ1 = μ2 = μ3 versus H1: at least one mean is different from the others.
Step 2: Find the critical value. With d.f.N. = 3 − 1 = 2 and d.f.D. = 18 − 3 = 15, the critical value is F0.05(2, 15) = 3.68.
Step 3: Compute the test value.
a. Find the mean of each sample: X̄1 = 15.5, X̄2 = 4, X̄3 = 5.8.
b. Find the grand mean: X̄_GM = 8.4.

c. Find SSB and SSW
SSB = ∑ n_i(X̄_i − X̄_GM)² = 6(15.5 − 8.4)² + 6(4 − 8.4)² + 6(5.8 − 8.4)² = 459.18
SSW = ∑ (n_i − 1)S²_i = (6 − 1)(81.9) + (6 − 1)(25.6) + (6 − 1)(29.0) = 682.5

d. Construct the ANOVA summary table


Analysis of Variance Summary Table
Source            Sum of squares    d.f.    Mean square    F
Between           459.18            2       229.59         5.05
Within (error)    682.5             15      45.5
Total             1141.68           17
Step 4: Make the decision. The decision is to reject the null hypothesis, since 5.05 > F0.05(2, 15) = 3.68.
Conclusion: There is enough evidence to support the claim that there is a difference among the
means.
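As a cross-check (again an addition, not part of the original solution), the Example 2 table can be reproduced from the summary statistics in the same way as Example 1; the sketch below assumes NumPy and SciPy are installed.

import numpy as np
from scipy import stats

n = np.array([6, 6, 6])                    # interchanges sampled on each toll road
means = np.array([15.5, 4.0, 5.8])         # sample means
variances = np.array([81.9, 25.6, 29.0])   # sample variances
k, N = len(n), n.sum()

grand_mean = (n * means).sum() / N           # about 8.4
ssb = (n * (means - grand_mean) ** 2).sum()  # about 459 (the text's 459.18 uses the rounded grand mean 8.4)
ssw = ((n - 1) * variances).sum()            # 682.5
F = (ssb / (k - 1)) / (ssw / (N - k))        # about 5.05
print(F > stats.f.ppf(0.95, k - 1, N - k))   # True: reject H0 at the 0.05 level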
When the null hypothesis is rejected in ANOVA, it means only that at least one mean is different from the others. To locate the difference or differences among the means, it is necessary to use other tests, such as Fisher’s least significant difference (LSD) test, the Tukey test, or the Scheffé test.
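For completeness (an addition, not in the original), a post hoc comparison can be run in Python; recent versions of SciPy provide a Tukey HSD routine, and the sketch below applies it to three purely hypothetical samples, since the raw data for the examples above are not reproduced here.

from scipy.stats import tukey_hsd

# Hypothetical samples, for illustration only (not the textbook data).
group1 = [10, 12, 14, 9, 11]
group2 = [5, 7, 6, 4, 8]
group3 = [9, 8, 10, 7, 11]

result = tukey_hsd(group1, group2, group3)
print(result)         # pairwise mean differences with confidence intervals
print(result.pvalue)  # matrix of p-values; small entries show which pairs of means differ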
