
Chapter 5, ANOVA

The document discusses analysis of variance (ANOVA) which is used to compare population means simultaneously using the F-distribution. ANOVA can be used in one-way or two-way designs and makes assumptions about normality, equal variances, and independence. The document outlines the steps in ANOVA testing including stating hypotheses, calculating test statistics like sum of squares between and within groups, and determining whether to reject the null hypothesis of equal means.

Uploaded by

Girma erena

CHAPTER-FIVE

ANALYSIS OF VARIANCE (ANOVA)

5.1. Area of Application

The probability distribution used in this chapter is the F-distribution. It was named in honor of Sir
Ronald Fisher, one of the founders of modern statistics. This distribution serves as the
distribution of the test statistic in several situations. It is used to test whether two samples
come from populations with equal variances, and it is also applied when we want to compare several
population means simultaneously. The simultaneous comparison of several population means is
known as analysis of variance (ANOVA).

Characteristics of the F-Distribution

1. There is a family of F-distributions. A particular member of the family is determined by two
parameters: the degrees of freedom in the numerator and the degrees of freedom in the
denominator.
2. The F-distribution is continuous. It can assume an infinite number of values between zero
and positive infinity, that is, 0 ≤ F < ∞.
3. The F-distribution cannot be negative. The smallest value F can assume is zero.
4. It is positively skewed. The long tail of the distribution is on the right-hand side. As the
degrees of freedom increase in both the numerator and the denominator, the
distribution approaches the normal distribution.
5. It is asymptotic. As the value of F increases, the F-curve approaches the horizontal axis but never
touches it. This is similar to the normal distribution.
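
The characteristics listed above can be checked numerically. The following is a minimal sketch in Python using only the standard library. The function name f_pdf and the chosen degrees of freedom are illustrative assumptions, not part of the original chapter. It evaluates the F density from its standard formula, shows that the mode lies to the left of the mean (positive skew), and shows that the density keeps shrinking in the right tail while staying positive.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F-distribution with d1 (numerator) and d2 (denominator) df."""
    if x <= 0:
        # No probability lies below zero (x = 0 is treated as 0 here for simplicity).
        return 0.0
    log_beta = math.lgamma(d1 / 2) + math.lgamma(d2 / 2) - math.lgamma((d1 + d2) / 2)
    log_pdf = (0.5 * (d1 * math.log(d1 * x) + d2 * math.log(d2)
                      - (d1 + d2) * math.log(d1 * x + d2))
               - math.log(x) - log_beta)
    return math.exp(log_pdf)

d1, d2 = 4, 20  # hypothetical degrees of freedom, chosen only for illustration
mode = ((d1 - 2) / d1) * (d2 / (d2 + 2))  # formula valid for d1 > 2
mean = d2 / (d2 - 2)                      # formula valid for d2 > 2
print(f"mode = {mode:.3f}, mean = {mean:.3f}")  # mode below mean: long right tail

for x in (0.5, 1, 2, 5, 10):
    # The density falls toward zero as x grows but never reaches it.
    print(x, f"{f_pdf(x, d1, d2):.6f}")
```

A different pair of degrees of freedom selects a different member of the family, which is the first characteristic in the list.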

Types of ANOVA

1. One-way ANOVA

It refers to situations in which only one factor or variable is considered. For example, in testing for
differences in sales among three salesmen, we consider only one factor, which is the
salesman's selling ability.

2. Two-way ANOVA

If we consider two factors simultaneously and investigate the differences among their various
categories, each having several possible values, we are said to use two-way analysis of variance.
For example, sales may be affected not only by the salesman's selling ability but also by the
price charged by the company.

ANOVA Assumptions

Another use of the F-distribution is the ANOVA technique, in which we compare three or more
population means to determine whether they could be equal. To use ANOVA, we assume the
following:

a. The populations follow the normal distribution
b. The populations have equal standard deviations
c. The populations are independent

When these conditions are met, F is used as the distribution of the test statistic.

Steps in ANOVA Testing

1. State the null and the alternative hypotheses
H0: µ1 = µ2 = µ3 = ... = µk, where k is the number of population categories
HA: The population means are not all equal
2. Select the level of significance
3. Determine the test statistic. It follows the F-distribution
4. Formulate the decision rule. To determine the decision rule, we need the critical value. The
critical value is obtained from the F-table by using the two degrees of freedom:
df(Numerator) = k – 1
df(Denominator) = N – k
where k is the number of samples and N is the total number of observations.
5. Select the sample, perform the calculations, and make a decision. The test statistic is

$$F = \frac{\text{Variance between}}{\text{Variance within}} = \frac{SSB/(k-1)}{SSW/(N-k)} = \frac{MSB}{MSW}$$
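
The F-ratio in step 5 can be sketched as a small Python function. This is a minimal illustration under stated assumptions: the function name f_ratio and the input figures are hypothetical, and the denominator uses N – k degrees of freedom, as in the ANOVA table later in the chapter.

```python
def f_ratio(ssb, ssw, k, n_total):
    """Return (MSB, MSW, F) for k samples and n_total observations in all."""
    msb = ssb / (k - 1)        # variance between samples
    msw = ssw / (n_total - k)  # variance within samples
    return msb, msw, msb / msw

# Hypothetical sums of squares: 3 samples, 15 observations in total.
msb, msw, f_stat = f_ratio(ssb=40.0, ssw=60.0, k=3, n_total=15)
print(msb, msw, f_stat)  # 20.0 5.0 4.0
```

The computed F is then compared with the critical value read from the F-table at the chosen level of significance.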

Rationale behind ANOVA

The population variance σ² is estimated in two different ways:

I. Variance Within Samples (σ²within)

Even though all observations in a sample come from the same population, some chance variation can
occur. This variation may be due to sampling error or other natural causes. It can be calculated
through the following steps.

1. Calculate the mean of each sample, i.e. X̄1, X̄2, X̄3, ..., X̄k
2. Take one sample at a time and take the deviation of each item in the sample from its sample mean.
3. Square these deviations and sum all the squared deviations over all samples. This sum is
known as the sum of squares within (SSW).
4. Divide SSW by the corresponding df; df = N – k
5. The figure SSW/df is the estimate (σ²within). It is called the mean square within (MSW).
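
The five steps for the variance within samples can be sketched in Python as follows. The function name ss_within and the three small samples are hypothetical and serve only to illustrate the arithmetic.

```python
def ss_within(samples):
    """Sum of squared deviations of each observation from its own sample mean."""
    ssw = 0.0
    for sample in samples:
        mean = sum(sample) / len(sample)             # step 1: sample mean
        ssw += sum((x - mean) ** 2 for x in sample)  # steps 2-3: squared deviations
    return ssw

samples = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # hypothetical data, k = 3 samples
k = len(samples)
n_total = sum(len(s) for s in samples)       # N = 9

ssw = ss_within(samples)
msw = ssw / (n_total - k)                    # steps 4-5: divide by df = N - k
print(ssw, msw)  # 6.0 1.0
```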
II. Variance Between Samples (σ²between)

It is due to the effect of different treatments, i.e. the population means (µ) may be affected by
the factor under consideration, which makes the sample means differ. This inter-sample variability
is measured by the sum of squares between samples (SSB). SSB can be calculated as follows:

1. Take k samples of size n each and calculate the mean of each sample, i.e. X̄1, X̄2, X̄3, ..., X̄k
2. Calculate the grand mean X̿ of these sample means:

$$\bar{\bar{X}} = \frac{1}{k}\sum_{i=1}^{k} \bar{X}_i$$

If the sample sizes are unequal, the grand mean is instead the mean of all N observations.
3. Take the difference between each sample mean and the grand mean, i.e. (X̄1 – X̿,
X̄2 – X̿, ..., X̄k – X̿)
4. Square each difference, multiply each squared deviation by the size of its sample, and sum the
products. This gives the value of SSB:

$$SSB = \sum_{i=1}^{k} n_i(\bar{X}_i - \bar{\bar{X}})^2 = n_1(\bar{X}_1 - \bar{\bar{X}})^2 + n_2(\bar{X}_2 - \bar{\bar{X}})^2 + \dots + n_k(\bar{X}_k - \bar{\bar{X}})^2$$

where n_i is the size of the i-th sample.

5. Divide SSB by its df, which is (k – 1), where k is the number of samples. This
gives the value of (σ²between), also called the mean square between (MSB).
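
The steps for the variance between samples can be sketched in the same way. The function name ss_between and the data are hypothetical. The grand mean is computed here as the mean of all observations, which equals the mean of the sample means when every sample has the same size.

```python
def ss_between(samples):
    """Sum over samples of n_i * (sample mean - grand mean) squared."""
    sample_means = [sum(s) / len(s) for s in samples]    # step 1
    n_total = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n_total  # step 2
    return sum(len(s) * (m - grand_mean) ** 2            # steps 3-4
               for s, m in zip(samples, sample_means))

samples = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # hypothetical data, k = 3 samples
k = len(samples)

ssb = ss_between(samples)
msb = ssb / (k - 1)                          # step 5: divide by df = k - 1
print(ssb, msb)  # 54.0 27.0
```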

Degrees of Freedom

Degrees of freedom are associated with both the numerator and the denominator of the F-ratio.

1. Numerator: The variance between samples (σ²between) is computed from the k sample means, so
the df associated with the numerator is k – 1.
2. Denominator: The variance within samples pools the variances of the k samples. Each sample of
size ni contributes ni – 1 degrees of freedom, so the df associated with the denominator is N – k.

The computed value of F is then compared with the critical value of F from the table, and a decision
is made about the validity of the null hypothesis.

ANOVA-Table

After SSB, SSW, and the degrees of freedom have been calculated, these figures can be
presented in a simple table called the ANOVA table, as follows:

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square          F-Ratio
Treatment (between)   SSB              k – 1                MSB = SSB/(k – 1)    F = MSB/MSW
Within                SSW              N – k                MSW = SSW/(N – k)
Total                 SST              N – 1

Here SST = SSB + SSW.
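
As an illustration, the table can be assembled programmatically. The sketch below is one possible layout, not a prescribed one. The names anova_table and show and the input figures are hypothetical. The total row uses SST = SSB + SSW with N – 1 degrees of freedom.

```python
def anova_table(ssb, ssw, k, n_total):
    """Return the rows of a one-way ANOVA table as tuples."""
    msb = ssb / (k - 1)
    msw = ssw / (n_total - k)
    return [
        # source, sum of squares, df, mean square, F-ratio
        ("Treatment", ssb, k - 1, msb, msb / msw),
        ("Within", ssw, n_total - k, msw, None),
        ("Total", ssb + ssw, n_total - 1, None, None),
    ]

def show(value):
    """Format a cell, leaving cells that have no value blank."""
    return "" if value is None else f"{value:g}"

# Hypothetical inputs: k = 3 samples, N = 9 observations.
rows = anova_table(ssb=54.0, ssw=6.0, k=3, n_total=9)
print(f"{'Source':<10}{'SS':>8}{'df':>6}{'MS':>8}{'F':>8}")
for source, ss, df, ms, f in rows:
    print(f"{source:<10}{show(ss):>8}{show(df):>6}{show(ms):>8}{show(f):>8}")
```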

5.2. Comparison of the Means of More Than Two Populations and Variance Test

When we use ANOVA to test whether the means of k populations are equal, rejection of the null
hypothesis allows us to conclude only that the population means are not all equal. In some cases,
we will want to go a step further and determine where the differences among the means occur. The
purpose of this section is to show how multiple comparison procedures can be used to conduct
statistical comparisons between pairs of population means.

Example:

1. To test whether all teachers teach the same material in different sections of the Statistics for
Management class, four sections of the same course were selected and a common
test was administered to five students selected at random from each section. The scores of
the students in each section are given below. We want to test for any
difference in learning as reflected in the average score for each section. At α = 0.05, test
whether there is any significant difference in teaching.

Student   Section-1   Section-2   Section-3   Section-4
1         8           12          10          12
2         10          12          13          15
3         12          10          11          13
4         10          8           12          10
5         5           13          14          10
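
One way to work through this first example is sketched below in Python. The variable names are illustrative, and the critical value 3.24 is an assumption taken from a standard F-table for α = 0.05 with 3 and 16 degrees of freedom; it should be confirmed against the table used in class.

```python
sections = {
    "Section-1": [8, 10, 12, 10, 5],
    "Section-2": [12, 12, 10, 8, 13],
    "Section-3": [10, 13, 11, 12, 14],
    "Section-4": [12, 15, 13, 10, 10],
}
samples = list(sections.values())
k = len(samples)                                # 4 sections
n_total = sum(len(s) for s in samples)          # 20 students
means = [sum(s) / len(s) for s in samples]      # 9, 11, 12, 12
grand_mean = sum(sum(s) for s in samples) / n_total  # 11

ssb = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
ssw = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
msb = ssb / (k - 1)        # df numerator = 3
msw = ssw / (n_total - k)  # df denominator = 16
f_stat = msb / msw

F_CRITICAL = 3.24  # assumed table value: alpha = 0.05, df = (3, 16)
print(f"SSB = {ssb}, SSW = {ssw}, F = {f_stat:.2f}")  # SSB = 30.0, SSW = 72.0, F = 2.22
print("Reject H0" if f_stat > F_CRITICAL else "Fail to reject H0")
```

Under these assumptions the computed F is below the critical value, so the data do not show a significant difference among the section means at the 5% level.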
2. There are three sections of an introductory course in Business Statistics. Each section is
taught by a different instructor. There are some complaints that at least one of the instructors
does not cover the necessary material. To make sure that all students receive the same level
of material in a similar manner, the chairperson of the department has prepared a common test
to be given to students of the three sections. A random sample of seven students is selected
from each class and their test scores, out of a total of 20 points, are tabulated as follows:

Student   Section-1   Section-2   Section-3
1         20          12          16
2         18          11          15
3         18          10          18
4         16          14          16
5         14          15          16
6         18          12          17
7         15          10          14

Required:
a. State the appropriate H0 and HA for determining if there is any significant difference in
the average scores of students from these three sections.
b. Compute the sum of squares between the samples (SSB)
c. Compute the sum of squares within the samples (SSW)
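
A worked sketch of the three requirements, computed directly from the scores tabulated above (the arithmetic is supplied here for illustration and is not part of the original exercise):

$$H_0: \mu_1 = \mu_2 = \mu_3 \qquad H_A: \text{the section means are not all equal}$$

$$\bar{X}_1 = \frac{119}{7} = 17, \quad \bar{X}_2 = \frac{84}{7} = 12, \quad \bar{X}_3 = \frac{112}{7} = 16, \quad \bar{\bar{X}} = \frac{315}{21} = 15$$

$$SSB = 7\left[(17-15)^2 + (12-15)^2 + (16-15)^2\right] = 7(4 + 9 + 1) = 98$$

$$SSW = 26 + 22 + 10 = 58$$

Here 26, 22, and 10 are the sums of squared deviations of the scores in Section-1, Section-2, and Section-3 from their own section means.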

3. Awash Insurance Company wants to test whether three of its salesmen, A, B, and C, in a
given territory make a similar number of appointments with prospective customers during a
given period of time. A record of the previous four months shows the following results for
the number of appointments made by each salesman in each month.

          Salesmen
Month     A     B     C
1         8     6     14
2         9     8     12
3         11    10    18
4         12    4     8

Do you think that, at the 95% confidence level, there is a significant difference in the average
number of appointments made by the three salesmen per month?
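
A worked sketch for this example, assuming α = 0.05 (the complement of the 95% confidence level) and a critical value of 4.26 read from a standard F-table for 2 and 9 degrees of freedom; that table value is an assumption to be checked against the table in use:

$$\bar{X}_A = 10, \quad \bar{X}_B = 7, \quad \bar{X}_C = 13, \quad \bar{\bar{X}} = \frac{120}{12} = 10$$

$$SSB = 4\left[(10-10)^2 + (7-10)^2 + (13-10)^2\right] = 4(0 + 9 + 9) = 72$$

$$SSW = 10 + 20 + 52 = 82$$

$$F = \frac{SSB/(k-1)}{SSW/(N-k)} = \frac{72/2}{82/9} = \frac{36}{9.11} \approx 3.95$$

Since 3.95 is less than 4.26, the null hypothesis of equal means is not rejected under these assumptions.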

4. A department store chain is considering building a new store at one of three locations. An
important factor in making such a decision is the household income in these areas. If the
average income per household is similar, then the chain can pick any one of the three locations. A
random survey of households in each location is undertaken and their annual combined
income is recorded. The data are tabulated below. At α = 0.01, test whether the average income
per household in all these locations can be considered the same.

Area      Annual household income (1,000 birr)
Area-1    70     72     75     80     83     -      -
Area-2    100    110    108    112    113    120    100
Area-3    60     65     57     84     84     70     -
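
Because the three areas have different sample sizes, the grand mean must be the mean of all observations rather than the simple average of the three area means. The Python sketch below illustrates this. The variable names are illustrative, and the critical value 6.36 is an assumption taken from a standard F-table for α = 0.01 with 2 and 15 degrees of freedom.

```python
areas = {
    "Area-1": [70, 72, 75, 80, 83],
    "Area-2": [100, 110, 108, 112, 113, 120, 100],
    "Area-3": [60, 65, 57, 84, 84, 70],
}
samples = list(areas.values())
k = len(samples)                                # 3 areas
n_total = sum(len(s) for s in samples)          # 18 households
means = [sum(s) / len(s) for s in samples]      # 76, 109, 70

grand_mean = sum(sum(s) for s in samples) / n_total  # mean of all observations
mean_of_means = sum(means) / k                       # differs when sizes are unequal
print(f"grand mean = {grand_mean:.2f}, mean of means = {mean_of_means:.2f}")

ssb = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
ssw = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
f_stat = (ssb / (k - 1)) / (ssw / (n_total - k))

F_CRITICAL = 6.36  # assumed table value: alpha = 0.01, df = (2, 15)
print(f"SSB = {ssb:.1f}, SSW = {ssw:.1f}, F = {f_stat:.2f}")
print("Reject H0" if f_stat > F_CRITICAL else "Fail to reject H0")
```

Under these assumptions the computed F is far above the critical value, which suggests the average household incomes of the three areas are not all the same.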
