Chi-Square Note
1.1 Introduction
In the previous chapter, Hypothesis Testing, we examined whether a hypothesized value of a
population parameter is reasonable, and whether to accept or reject that hypothesis. In this
chapter our concern is still testing a hypothesis about a population using information
gathered from a sample, but our focus is only on the chi-square test, which is a right-tailed
test. A right-tailed test rejects the null hypothesis if the sample statistic is significantly
higher than the hypothesized population parameter.
The variable χ² cannot be negative, so chi-square curves do not extend to the left of zero.
They are positively skewed and extend indefinitely in the positive direction. When ν
exceeds 2, χ² curves have one mode, and as ν increases the skewness becomes less
apparent. In fact, when ν is very large, the chi-square distribution is almost the same
as a normal distribution with mean ν and standard deviation √(2ν). In the applications
here, however, ν is not large enough to permit us to use normal curve probabilities as
approximations of χ² probabilities. The significance levels α in this chapter are
right-tail areas of chi-square distributions. The symbol χ²_{α, ν} means the value of chi-square
such that the distribution with ν degrees of freedom has a right-tail area of α.
Figure 4-2 The meaning of χ²_{α, ν}
Example: χ²_{0.02, 4} = 11.668
The values of χ²_{α, ν} are given in the χ² table.
Exercise: What is the value of χ²_{0.05, 3}?
Answer: 7.815
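Table values such as these can also be checked numerically. The sketch below is one possible way to do so in Python with only the standard library; the function names chi2_right_tail and chi2_critical are illustrative and are not part of this note. It evaluates the right-tail area through a series for the incomplete gamma function, then searches by bisection for the value of chi-square whose right-tail area is α.

```python
import math


def chi2_right_tail(x, nu):
    """Right-tail area beyond x for a chi-square curve with nu degrees of freedom."""
    if x <= 0:
        return 1.0  # the whole curve lies to the right of zero
    a, z = nu / 2.0, x / 2.0
    # Series for the lower incomplete gamma function:
    # sum over n >= 0 of z**n / (a * (a + 1) * ... * (a + n))
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    left_area = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return 1.0 - left_area


def chi2_critical(alpha, nu):
    """Value of chi-square whose right-tail area is alpha (found by bisection)."""
    low, high = 0.0, 1.0
    while chi2_right_tail(high, nu) > alpha:  # widen until the answer is bracketed
        high *= 2.0
    for _ in range(100):
        mid = (low + high) / 2.0
        if chi2_right_tail(mid, nu) > alpha:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


print(round(chi2_critical(0.02, 4), 3))
print(round(chi2_critical(0.05, 3), 3))
```

The two printed values should agree with the table entries quoted above to the three decimals a printed table carries. The series is adequate for the moderate values of ν used in this chapter; it is not tuned for very large ν.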
4.3 Areas of Application
The chi-square distribution is used in a number of statistical tests, including:
Test for independence between two variables
Goodness-of-fit tests
Test for equality of several proportions
4.3.1 The χ² test for independence
This involves a contingency table analysis. A contingency table is one that consists of
count data (data obtained by counting the frequency of an occurrence in a given
category from a simple random sample) arranged in r rows and c columns. The actual
sample counts are called observed frequencies, and are denoted by fo.
The expected frequencies, fe (the counts that would be expected if the row and column
categories were independent), and the observed frequencies, fo, are used to compute a
sample statistic for testing the hypothesis that the row and column categories are
independent. The underlying idea is that the observed frequencies should be close to the
expected frequencies if the categories are independent. Large differences will lead us to
reject the hypothesis of independence. The statistic that is used for the test is called the
sample χ². It is computed as:
$$\text{Sample } \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
The formula shows that the larger the squared differences are relative to their respective
expected frequencies, the larger will be the value of the sample χ². Therefore, large
values of the sample χ² will lead to rejecting the independence hypothesis.
No expected frequency should be too small: each fe should be at least 5. If fe < 5 for
some cell, we combine adjacent rows (or columns) in the contingency table to get fe values
of at least 5 before computing the sample χ². The number of degrees of freedom ν is then
computed after any rows or columns have been combined.
The steps to be followed in the contingency table test are illustrated with the following
example.
Example: The manager of Wabi-Shebele Hotel has collected opinions on the quality of
the hotel's service from a random sample of customers. The customers visited the Hotel’s
branches in three regions, namely Addis Ababa, Nazareth and Awasa – Langano, and
rated the services on a scale of 1 (best) to 4. The sample data are given in Table 4.1 below.
The manager wants to know whether quality ratings are independent of the regions where
the Hotel’s branches are found. Perform a test for independence at the 5% level.
Table 4.1
Customer quality ratings of service by region

Quality                  Hotel's Branch                Row
Rating          Addis Ababa   Nazareth   Langano      Total
1                   15           10          6          31
2                    7           13         12          32
3                   11           12          8          31
4                    3            8         15          26
Column Total        36           43         41         120 (grand total)
Solution:
Steps
1. State the hypotheses
   Ho: Quality rating is independent of Hotel's Branch
   Ha: Quality rating is not independent of Hotel's Branch
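2. State the decision rule
   The table has r = 4 rows and c = 3 columns, so ν = (r – 1)(c – 1) = (4 – 1)(3 – 1) = 6.
   With α = 0.05, χ²_{0.05, 6} = 12.592, so reject Ho if sample χ² > 12.592.
3. Sample χ²
   Each expected frequency is computed as
       fe = (row total) x (column total) / (grand total)
   For the cell in the first row, first column:
       fe = (31 x 36) / 120 = 9.3
   For the cells in the first row, 2nd column and 3rd column:
       fe = (31 x 43) / 120 = 11.1        fe = (31 x 41) / 120 = 10.6
   Alternatively, for the last (3rd) column we can find fe by subtracting the sum of the
   previous expected frequencies from the row total:
       fe = 31 – (9.3 + 11.1) = 10.6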
Table 4.2 shows the observed and expected frequencies for each cell, and from these
frequencies we compute the value of the test statistic, the sample χ², as shown in Table 4.3.
Table 4.2
Service quality ratings observed and expected frequencies

                        Hotel’s Branch
Quality       Addis Ababa      Nazareth        Langano
Ratings        fo     fe       fo     fe       fo     fe
1              15     9.3      10    11.1       6    10.6
2               7     9.6      13    11.5      12    10.9
3              11     9.3      12    11.1       8    10.6
4               3     7.8       8     9.3      15     8.9
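Table 4.3
Calculating the sample χ² value
For each of the 12 cells in Table 4.2, the columns of the calculation are fo, fe, fo – fe,
(fo – fe)² and (fo – fe)²/fe; the last column is summed over all cells:
Sample χ² = Σ (fo – fe)²/fe = 14.9475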
4. Accept or reject Ho
   Because the value of the sample test statistic, 14.9475, exceeds the value 12.592 in
   the decision rule, we reject Ho.
The test result means that customer quality rating of services is not independent of the
Hotel’s branch where the service took place. The sample provides convincing evidence
that service quality rating depends on the Hotel’s Branch.
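The arithmetic of this example can be reproduced with a short script. The following is a minimal sketch, not part of the original module: it uses plain Python lists for Table 4.1, rounds each expected frequency to one decimal place only so that the numbers match Table 4.2, and assumes the usual (r – 1)(c – 1) rule for the degrees of freedom of a contingency table. The variable names are illustrative.

```python
observed = [
    [15, 10, 6],   # rating 1: Addis Ababa, Nazareth, Langano
    [7, 13, 12],   # rating 2
    [11, 12, 8],   # rating 3
    [3, 8, 15],    # rating 4
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# fe = (row total x column total) / grand total, rounded to one decimal
# place only to match Table 4.2 (an assumption made for comparison).
expected = [
    [round(r * c / grand_total, 1) for c in col_totals]
    for r in row_totals
]

# Sample chi-square: sum of (fo - fe)^2 / fe over every cell.
sample_chi2 = sum(
    (fo - fe) ** 2 / fe
    for fo_row, fe_row in zip(observed, expected)
    for fo, fe in zip(fo_row, fe_row)
)

nu = (len(observed) - 1) * (len(observed[0]) - 1)  # (r - 1)(c - 1), assumed rule
critical_value = 12.592  # table value quoted in the decision rule of the example

print("expected frequencies:", expected)
print("sample chi-square:", round(sample_chi2, 4))
print("degrees of freedom:", nu)
print("reject Ho:", sample_chi2 > critical_value)
```

Without the rounding step the statistic changes slightly in the later decimals, which does not affect the decision here.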
Check Point 4.1
In investigating whether there is a relationship between the qualification test scores of persons
who have gone through a certain job-training program and their subsequent performance on the
job, a sample of 400 persons was taken. The results are shown in Table 4.4 below.
Table 4.4
Qualification test           Performance            Row
scores               Poor     Fair     Good        Total
Below Average         67       64       25          156
Average               42       76       56          174
Above Average         10       23       37           70
Column total         119      163      118          400 (grand total)
At a 0.01 level of significance, test whether the on-the-job performance of persons who have
gone through the training program is independent of their qualification test score.
Answer: Since χ² = 40.89 exceeds 13.277, reject Ho. That is, we conclude that there is a
relationship between qualification test score and on-the-job performance.
The hypotheses of the test for equal proportions can be stated either in terms of equal row
proportions or in terms of equal column proportions. We will use row proportions Pr (that is,
the ratios of cell counts to column totals) in our hypotheses. Thus, the hypotheses will be:
Ho: The proportions Pr in any row are equal.
Ha: The proportions Pr in at least one row are not all equal.
Example: Table 4.5 contains counts for a random sample of n = 200 workers. We want to test
the hypothesis that the population proportions of satisfactory workers in education levels 1, 2,
and 3, are equal at a 5% significance level.
Table 4.5
Performance rating of workers by educational level

Supervisor                    Educational level                  Row
Rating              Elementary (1)   Junior (2)   Secondary (3)  total
Satisfactory              12             63            65         140
Not-satisfactory           8             17            35          60
Column total              20             80           100         200 (grand total)
Solution

                              Educational Level
Supervisor          Elementary     Junior high school   Secondary high school    Row
Rating              fo     fe      fo     fe            fo     fe                Total
Satisfactory        12     14      63     56            65     70                140
Not-satisfactory     8      6      17     24            35     30                 60
Column Total        20             80                  100                       200

fo     fe     fo – fe    (fo – fe)²    (fo – fe)²/fe
12     14       -2           4            0.2857
63     56        7          49            0.8750
65     70       -5          25            0.3571
 8      6        2           4            0.6667
17     24       -7          49            2.0417
35     30        5          25            0.8333
                                     χ² = 5.0595
3. Sample χ² = 5.0595
The sample χ² value, 5.0595, does not exceed the value 5.991 in the decision rule, so
we accept Ho.
4. Accept or reject: Accept Ho.
Accepting Ho means that the sample gives no convincing evidence against the hypothesis
that the proportion of satisfactory-rated workers is the same for all three educational levels.
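Another way to look at the expected frequencies in this example is through the pooled proportion of satisfactory workers: if Ho is true, every educational level shares the same proportion, which can be estimated from the whole sample. The sketch below illustrates that view with the counts of Table 4.5; the variable names are hypothetical and the script is only an illustration of the calculation shown above.

```python
satisfactory = [12, 63, 65]       # fo, educational levels 1, 2, 3
not_satisfactory = [8, 17, 35]    # fo
column_totals = [s + n for s, n in zip(satisfactory, not_satisfactory)]

# Sample proportion of satisfactory workers at each level.
proportions = [s / t for s, t in zip(satisfactory, column_totals)]

# Under Ho every level shares one proportion, estimated by pooling the sample.
pooled = sum(satisfactory) / sum(column_totals)

# Expected counts if Ho is true.
fe_satisfactory = [pooled * t for t in column_totals]
fe_not_satisfactory = [(1 - pooled) * t for t in column_totals]

# Sample chi-square: sum of (fo - fe)^2 / fe over all six cells.
sample_chi2 = sum(
    (fo - fe) ** 2 / fe
    for fo, fe in zip(
        satisfactory + not_satisfactory,
        fe_satisfactory + fe_not_satisfactory,
    )
)

print("proportions:", [round(p, 4) for p in proportions])
print("pooled proportion:", round(pooled, 4))
print("sample chi-square:", round(sample_chi2, 4))
```

Multiplying the pooled proportion by each column total gives the same expected frequencies as the (row total) x (column total) / (grand total) rule, since the pooled proportion is the row total divided by the grand total.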
Check Point 4.2
The quality-control manager of Vita Company examined a random sample of parts made during
the three shifts that the company operates. The manager classified the parts as good or defective
as shown in Table 4.8. Perform, at the 5 percent level, a test of the hypothesis that equal
proportions of defective parts are made by the three shifts.
Table 4.8
                                     Shift
                             Day    Middle    Night
Number of good parts         427     273       240
Number of defective parts     23      27        10
Answer: The proportions of defective parts made by the shifts are not all equal. The
sample χ², 7.19, is greater than 5.991.
A goodness-of-fit test is a hypothesis test that is performed by first computing the frequencies
that would be expected if Ho is true. The differences between the frequencies observed in the
sample, fo, and the expected frequencies, fe, are then used to calculate:
$$\text{Sample } \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
Finally, the sample χ² is compared with the appropriate χ²_{α, ν} to decide whether Ho should be
accepted or rejected.
Goodness-of-fit tests differ from independence tests both in the methods used to compute
expected frequencies and in the rule for determining the number of degrees of freedom. In a
goodness-of-fit test, the method for calculating the expected frequencies depends on the
population assumptions that are made; and the number of degrees of freedom in a goodness-of-fit
test is:
ν = ne – 1 – g
Where: ne = number of fe values used in computing the sample χ²
g = number of population parameters (e.g. μ, σ) estimated from the sample.
a. Goodness-of-fit: Binomial Distribution
A particular binomial distribution is specified by the values of two parameters, n and p, where:
n = number of trials
p = probability of success in a trial
From binomial probability tables, we can find the probability of 0, 1, 2, … successes in n
trials. Each of these probabilities, when multiplied by the number of observations in the
sample, is an expected frequency for that number of successes in n trials.
Example: Mr. X, a sales representative for Moon Paper Company has five accounts to visit per
day. It is suggested that sales by Mr. X may be described by the binomial distribution with the
probability of selling each account being 0.4. Given the following frequency distribution of Mr.
X’s number of sales per day, can we conclude that the data do in fact follow the binomial
distribution with n = 5 and p = 0.4? Use the 0.05 significance level.
Number of sales per day    Frequency
0                           10
1                           41
2                           60
3                           20
4                            6
5                            3
Total                      140
Solution:
We first need to find the expected frequencies of the number of sales with n = 5 and p = 0.4 from
the binomial distribution table. The expected frequencies are calculated as:
Table 4.9
No. of sales per day     fo     fe
0                        10     0.0778 x 140 = 10.892
1                        41     0.2592 x 140 = 36.288
2                        60     0.3456 x 140 = 48.384
3                        20     0.2304 x 140 = 32.256
4                         6     0.0768 x 140 = 10.752
5                         3     0.0102 x 140 =  1.428
4 or more (combined)      9     10.752 + 1.428 = 12.18
Total                   140
Since we previously said that each fe must be at least 5, we combine the frequencies for 4 and 5
sales per day, as shown in Table 4.9 above, and proceed with the calculation of χ².
Table 4.10
No. of sales per day    fo     fe        (fo – fe)²    (fo – fe)²/fe
0                       10    10.892       0.7957         0.0731
1                       41    36.288      22.2029         0.6119
2                       60    48.384     134.9315         2.7888
3                       20    32.256     150.2095         4.6568
4 or more                9    12.18       10.1124         0.8302
                                                     χ² = 8.9608
We did not state the decision rule at the beginning because, when goodness-of-fit tests are
performed, it is often not possible to know how many degrees of freedom there will be until after
the expected frequencies are computed; that’s because it may be necessary to combine some
frequencies before the sample χ2 is computed. So now that we have the information needed to
determine the number of degrees of freedom, let’s start at the beginning: The hypotheses are:
1. Ho: The number of sales per day by Mr. X follows a binomial distribution with n = 5
   and p = 0.4
   Ha: The number of sales per day by Mr. X does not follow a binomial distribution
   with n = 5 and p = 0.4
The number of degrees of freedom is
   ν = ne – 1 – g = 5 – 1 – 0 = 4
With α = 0.05 and ν = 4, we find (from the table at the back of the module) χ²_{0.05, 4} = 9.49.
Hence, the decision rule is:
2. Reject Ho if sample χ² > 9.49.
3. Sample χ² = 8.9608
As 8.9608 does not exceed 9.49, we
4. Accept Ho.
Thus the sample evidence supports the hypothesis that the number of sales per day by
Mr. X follows a binomial distribution with n = 5 and p = 0.4.
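The whole binomial goodness-of-fit calculation can be sketched in a few lines of Python. This is an illustrative script rather than part of the module: the probabilities are rounded to four decimals on the assumption that this mimics a printed binomial table, and the loop that combines classes only merges from the upper end, which is all this example needs.

```python
import math

n_trials, p = 5, 0.4                  # hypothesized binomial parameters
observed = [10, 41, 60, 20, 6, 3]     # days with 0, 1, ..., 5 sales
n_days = sum(observed)                # 140 days in the sample

# Binomial probabilities, rounded to four decimals as in a printed table.
probs = [
    round(math.comb(n_trials, k) * p**k * (1 - p) ** (n_trials - k), 4)
    for k in range(n_trials + 1)
]
expected = [prob * n_days for prob in probs]

# Combine classes from the upper end until the last fe is at least 5.
fo, fe = observed[:], expected[:]
while len(fe) > 1 and fe[-1] < 5:
    last_fe, last_fo = fe.pop(), fo.pop()
    fe[-1] += last_fe
    fo[-1] += last_fo

sample_chi2 = sum((o - e) ** 2 / e for o, e in zip(fo, fe))
g = 0                                 # n and p were specified, not estimated
nu = len(fe) - 1 - g                  # nu = ne - 1 - g

print("fo:", fo)
print("fe:", [round(e, 3) for e in fe])
print("sample chi-square:", round(sample_chi2, 4))
print("degrees of freedom:", nu)
```

Because each term is not rounded before summing, the printed statistic may differ from Table 4.10 in the fourth decimal place.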
Check Point 4.3
A manufacturer packages drinking glasses in boxes of 50. All glasses from a sample of 100
boxes were examined, and the number of defective glasses in each box was recorded. The
sample data are given in Table 4.11:
a. How many glasses were examined?
b. How many defective glasses were found?
c. Compute the sample proportion defective.
d. Do the sample data support the null hypothesis that the number of defective glasses in a
box is binomially distributed? (Perform a goodness-of-fit test at the 5 percent level.)
Table 4.11
Number of defectives in a box Number of boxes
0 69
1 22
2 4
3 1
4 3
5 1
Answer: a. 5000 b. 50 c. 0.01
d. Accept the hypothesis that the distribution is a binomial distribution. The sample χ²,
3.59, is less than 3.841.
b. Goodness-of-fit: Normal Distribution
The test of a normal fit is similar to the binomial fit test in the foregoing; only here the expected
frequencies are determined using normal probabilities.
Example: For inventory planning and control purposes, a certain chemical company wants to
know if its sales of a liquid chemical are normally distributed. Sales for a random sample of 200
days are given in Table 4.12. The sample mean and sample standard deviation calculated from
the 200 sample daily sales numbers are:
x̄ = 40 thousand gallons
Sx = 2.5 thousand gallons
At a 5 percent level, perform a test of the hypothesis that sales are normally distributed. (The
values to be used for the parameters μ and σ are the sample estimates. That will cost us 2
degrees of freedom when we compute the value of ν.)
Table 4.12
Sales for 200 days

Sales                        No. of days
(in thousands of gallons)    fo
Less than 34.0                 0
34.0 and under 35.5           13
35.5 and under 37.0           20
37.0 and under 38.5           35
38.5 and under 40.0           43
40.0 and under 41.5           51
41.5 and under 43.0           27
43.0 and under 44.5           10
44.5 and under 46.0            1
46.0 or more                   0
Total                        200
Solution:
To compute the expected frequency for the “less than 34.0” class, we first find the probability for
this class, and then multiply this probability by the sample size, 200.
At x = 34, we compute
$$Z = \frac{x - \mu}{\sigma} = \frac{34 - 40}{2.5} = -2.4$$
From the table at the back of the module:
P(0 to -2.4) = P(0 to 2.4) = 0.4918
The tail area probability we want is
0.5 – P(0 to -2.4) = 0.5 – 0.4918 = 0.0082
Since the sample includes 200 numbers, the expected frequency for the “less than 34.0” class is:
fe = 0.0082 (200) = 1.64
The expected frequency for the “34.0 and under 35.5” class is found from:
Z = (34 – 40) / 2.5 = -2.4        P(0 to 2.4) = 0.4918
Z = (35.5 – 40) / 2.5 = -1.8      P(0 to 1.8) = 0.4641
Class probability = 0.4918 – 0.4641 = 0.0277
fe = 0.0277 x 200 = 5.54
Exercise: Compute the expected frequency for the "35.5 and under 37.0" class.
Answer: 15.84
The complete list of expected frequencies, computed as just illustrated, is given in the fourth
column of table 4.13.
Table 4.13
Calculation of expected frequencies and the sample χ² for a normal goodness-of-fit test

                             Sales class    fe = 200 x     Combined   Combined
Sales class            fo    probability    probability    fo         fe         fo – fe   (fo – fe)²/fe
Less than 34.0          0    0.0082           1.64
34.0 and under 35.5    13    0.0277           5.54           13         7.18       5.82       4.7176
35.5 and under 37.0    20    0.0792          15.84           20        15.84       4.16       1.0925
37.0 and under 38.5    35    0.1592          31.84           35        31.84       3.16       0.3136
38.5 and under 40.0    43    0.2257          45.14           43        45.14      -2.14       0.1015
40.0 and under 41.5    51    0.2257          45.14           51        45.14       5.86       0.7607
41.5 and under 43.0    27    0.1592          31.84           27        31.84      -4.84       0.7357
43.0 and under 44.5    10    0.0792          15.84           10        15.84      -5.84       2.1531
44.5 and under 46.0     1    0.0277           5.54
46.0 or more            0    0.0082           1.64            1         7.18      -6.18       5.3193
                                                                                         χ² = 15.1940
After combining frequencies so that fe values are at least 5, the sample χ² is computed to be
15.1940. Now let's summarize the test.
The hypotheses are:
1. Ho: daily sales are normally distributed
   Ha: daily sales are not normally distributed
The sample χ² was computed using ne = 8 expected frequencies. Two parameters, μ and σ,
were estimated from the sample. Hence, the number of degrees of freedom is
ν = ne – 1 – g
ν = 8 – 1 – 2 = 5 degrees of freedom
With α = 0.05, χ²_{0.05, 5} = 11.070, so the decision rule is:
2. Reject Ho if sample χ² > 11.070.
From Table 4.13,
3. Sample χ² = 15.1940
The sample χ² exceeds the value 11.070 in the decision rule, so we
4. Reject Ho.
Our conclusion is that daily sales are not normally distributed.
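For readers who want to check Table 4.13, the following sketch recomputes the class probabilities, the expected frequencies and the sample χ² in Python. It is an illustration under stated assumptions: the normal areas come from math.erf and are rounded to four decimals to imitate a printed normal table, and the helper name table_cdf is hypothetical.

```python
import math

mean, sd, n_days = 40.0, 2.5, 200
boundaries = [34.0, 35.5, 37.0, 38.5, 40.0, 41.5, 43.0, 44.5, 46.0]
observed = [0, 13, 20, 35, 43, 51, 27, 10, 1, 0]   # fo from Table 4.12


def table_cdf(x):
    """Normal area to the left of x, rounded to 4 decimals like a printed table."""
    z = (x - mean) / sd
    return round(0.5 * (1.0 + math.erf(z / math.sqrt(2.0))), 4)


# Cumulative areas at minus infinity, each class boundary, and plus infinity.
cumulative = [0.0] + [table_cdf(b) for b in boundaries] + [1.0]
probs = [hi - lo for lo, hi in zip(cumulative, cumulative[1:])]
expected = [prob * n_days for prob in probs]

# Combine classes at both ends until the end classes have fe of at least 5.
fo, fe = observed[:], expected[:]
while len(fe) > 1 and fe[0] < 5:      # lower end
    first_fe, first_fo = fe.pop(0), fo.pop(0)
    fe[0] += first_fe
    fo[0] += first_fo
while len(fe) > 1 and fe[-1] < 5:     # upper end
    last_fe, last_fo = fe.pop(), fo.pop()
    fe[-1] += last_fe
    fo[-1] += last_fo

sample_chi2 = sum((o - e) ** 2 / e for o, e in zip(fo, fe))
g = 2                                  # mu and sigma estimated from the sample
nu = len(fe) - 1 - g                   # nu = ne - 1 - g

print("fe:", [round(e, 2) for e in fe])
print("sample chi-square:", round(sample_chi2, 4))
print("degrees of freedom:", nu)
```

Using unrounded normal areas instead of table values would change the expected frequencies slightly, so the statistic would not match the table digit for digit.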
In this section we let t = 1 unit of time, so λt = λ(1) = λ. The Poisson formula then becomes:
$$P(x) = \frac{e^{-\lambda}\lambda^{x}}{x!}$$
Where x = number of arrivals in 1 unit of time
λ = average arrival rate per unit of time
The formula (or the Poisson distribution table) is used to determine expected frequencies in a test of
the hypothesis that a distribution is a Poisson distribution with a stated value of the parameter λ.
Example: When a beer bottle-filling machine breaks a bottle, the machine must be shut down
while the broken glass is removed. The production manager at Harar Brewery has been using a
Poisson distribution with λ = 3 shutdowns per day, on the average, to determine the probabilities
of 0,1,2,3… shutdowns in a day. The manager has tabulated the number of shutdowns per day in
a random sample of 120 operating days, as shown in Table 4.14. We want to test, at the 5
percent level, the hypothesis that the number of shutdowns in a day has a Poisson distribution
with λt = λ = 3.
Table 4.14
Number of shutdowns in a day

No. of shutdowns in a day    Number of days
x                            fo
0                             3
1                            20
2                            29
3                            22
4                            23
5                            10
6 or more                    13
Solution:
The formula for the probability of x shutdowns in a day is:
$$P(x) = \frac{e^{-\lambda}\lambda^{x}}{x!} = \frac{e^{-3}\,3^{x}}{x!}$$
To compute the expected number of days when there will be x = 0 shutdowns, we first compute
$$P(0) = \frac{e^{-3}\,3^{0}}{0!} = \frac{e^{-3}(1)}{1} = 0.0498$$
Then we multiply P(0) by the number of days in the sample, 120, to obtain
fe(0) = 0.0498 (120) = 5.976
as the expected frequency of x = 0 shutdowns. Poisson probabilities can be computed on a hand
calculator, or they can be found in the tables.
The probability of a or more arrivals is 1 minus the probability that the number of arrivals is 0, or
1, or 2, or … or (a – 1). That is:
$$P(a \text{ or more}) = 1 - \left[P(0) + P(1) + \dots + P(a - 1)\right]$$
In our example
P(6 or more) = 1 – P(0) – P(1) – … – P(5)
= 1 – (0.0498) – (0.1494) – … – (0.1008)
= 0.0840
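The Poisson probabilities and expected frequencies above can be computed directly from the formula. The sketch below is an illustration only; the function name poisson_prob is hypothetical. It uses unrounded probabilities, so a value may differ from the rounded table figures in the last printed digit (for example, the tail probability comes out near 0.0839 rather than 0.0840).

```python
import math

lam, n_days = 3.0, 120   # hypothesized rate and number of days in the sample


def poisson_prob(x, lam):
    """P(x) = e**(-lam) * lam**x / x!"""
    return math.exp(-lam) * lam**x / math.factorial(x)


probs = [poisson_prob(x, lam) for x in range(6)]   # P(0) ... P(5)
probs.append(1.0 - sum(probs))                     # P(6 or more)
expected = [prob * n_days for prob in probs]       # fe for each class

labels = ["0", "1", "2", "3", "4", "5", "6 or more"]
for label, prob, fe in zip(labels, probs, expected):
    print(f"{label:>9}  P = {prob:.4f}  fe = {fe:.3f}")
```

The expected frequencies add up to the sample size, 120, because the last class absorbs all of the remaining probability.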