Chapter 11. Goodness of Fit and Contingency Tables
Goodness of fit involves a comparison of the frequency observed in the sample with the
expected frequency based on some theoretical model. If the differences between the observed
and the expected frequencies are so great that they are unlikely to be due to chance alone, we
conclude that the sample is not taken from the population that was used to calculate the expected
frequencies. Suppose we have k observed frequencies, O1, O2, ..., Ok and their corresponding
expected frequencies, E1, E2, ..., Ek, then the expression
$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$
is approximately χ2 distributed with k-1 degrees of freedom. The closer the agreement between
the observed and expected values, the smaller will be the value of χ2, and a value of zero
indicates perfect agreement. Calculated values of χ2 exceeding those in Appendix Table A-5
indicate that, at the corresponding probability level, there is significant disagreement between
observed and expected values.
$$\text{adj. } \chi^2 = \sum_{i=1}^{2} \frac{\left(|O_i - E_i| - \tfrac{1}{2}\right)^2}{E_i} \quad \text{with df} = 1$$
In adjusted χ2 the absolute value of each of the differences between observed and expected
frequencies is reduced by 1/2 before being squared. The correction 1/2 is applicable only in the
case of one degree of freedom (k=2). For more degrees of freedom the corrections are more
complicated and are not generally used, but for 1 df the adjusted value of χ2 is always
appropriate.
An example in the use of adjusted χ2
It is hypothesized that blood groups are inherited in a simple Mendelian manner so that a
cross of parents both of whom have AB blood should give 3/4 type AB or AA children and 1/4
type BB children (theoretical model). Suppose that among 400 children from such parents, 292
are of type AB or AA. Does the observation conform to the theoretical hypothesis?
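____________________________________________________
Blood Type:          AB or AA      BB      Total
____________________________________________________
Observed                292        108      400
Expected (3:1)          300        100      400
____________________________________________________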
Referring to Appendix Table A-5 with 1 df, we find that the observed χ2 (0.75) is not
significant (p > 0.10).
Thus the observed frequencies of the sample are consistent with the hypothetical ratio.
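The arithmetic behind this conclusion can be sketched in Python as follows. The function name `adjusted_chi_square` is illustrative and does not come from the text. The expected counts are obtained by applying the hypothesized 3:1 ratio to the 400 children, and the count of BB children is taken as 400 − 292 = 108.

```python
def adjusted_chi_square(observed, expected):
    """Chi-square with the 1/2 continuity correction (intended for 1 df only).

    Assumes every |O - E| is at least 0.5, as is the case in this example.
    """
    return sum((abs(o - e) - 0.5) ** 2 / e for o, e in zip(observed, expected))


n = 400
observed = [292, n - 292]          # AB or AA, BB
expected = [n * 3 / 4, n * 1 / 4]  # 300 and 100 under the 3:1 hypothesis

adj_chi2 = adjusted_chi_square(observed, expected)
print(round(adj_chi2, 2))
```

The result, 0.75, is well below 2.706, the tabular χ2 value at the 10% level for 1 df, which agrees with the conclusion above.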
In a cross between ivory and red snapdragons the following counts were observed in the
F2 generation.
Red 20
Pink 55
Ivory 25
Total 100
On the basis of these data, can segregation be assumed to occur in the simple Mendelian ratio of
1:2:1?
__________________________________________
Color                  Red   Pink   Ivory
__________________________________________
Observed frequency      20     55      25
Expected frequency      25     50      25
__________________________________________
Note that because this χ2 has 2 df, the adjustment for noncontinuity is not necessary. Again, the
calculated χ2 (1.50) is much smaller than the tabular χ2 at the 10% level (4.605). Therefore we
conclude that the observed color frequencies are consistent with the Mendelian ratio.
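As a check on the snapdragon example, the unadjusted statistic can be computed directly from the observed counts and the 1:2:1 ratio. This is a minimal sketch; the helper name `chi_square` is an illustrative choice, not something defined in the text.

```python
def chi_square(observed, expected):
    """Unadjusted goodness-of-fit chi-square: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [20, 55, 25]   # red, pink, ivory
ratio = [1, 2, 1]         # hypothesized Mendelian ratio

n = sum(observed)
expected = [n * r / sum(ratio) for r in ratio]   # 25, 50, 25

chi2 = chi_square(observed, expected)
df = len(observed) - 1    # k - 1 degrees of freedom

print(chi2, df)
```

The statistic is 1.5 with 2 df, which is below the 10% tabular value of 4.605.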
To illustrate the procedure, we will use the % sucrose data presented in Table 2-1,
summarized in Table 2-2, and graphed in Figure 2-1. We will test to see if these data can be
considered to be normally distributed. To compute the expected frequencies for each class
interval, we need to determine the probability associated with each interval. This procedure is
summarized in Table 11-1 and the computational steps follow the table.
Table 11-1. Observed and expected frequencies to test the goodness of fit of percent sucrose
values to a normal distribution.
        Mid-    End-point   Obs. freq.   Est. Z        Cum.     Int.     Exp. freq.   Contribution
Class   point   (Yi)        (Oi)         (Yi − Ȳ)/S    prob.    prob.    (Ei)         to χ2
1        4.8     5.6          1          -2.54         0.0055   0.0055    0.6         0.27
2        6.3     7.1          4          -1.96         0.0250   0.0195    2.0         2.00
3        7.8     8.6          4          -1.38         0.0838   0.0588    5.9         0.61
4        9.3    10.1         13          -0.81         0.2090   0.1252   12.5         0.02
5       10.8    11.6         10          -0.23         0.4090   0.2000   20.0         5.00
6       12.3    13.1         24           0.35         0.6368   0.2278   22.8         0.06
7       13.8    14.6         23           0.92         0.8186   0.1818   18.2         1.27
8       15.3    16.1         17           1.50         0.9332   0.1146   11.5         2.63
9       16.8    17.6          4           2.08         0.9812   0.0480    4.8         0.13
1. Columns 1 through 4 are recorded from the frequency table, Table 2-2.
2. Standardize the class interval end points, (Yi − Ȳ)/S, where Ȳ = 12.2 and S = 2.6 as
previously calculated.
3. Determine the cumulative probability for each standardized value from Appendix Table
A-4. For example, for class 1, P(Z ≤ -2.54) = 0.0055.
4. Calculate the probability for each class interval, e.g., for class 2, the interval probability =
0.0250 - 0.0055 = 0.0195.
5. The expected frequency for each class is calculated by multiplying the interval
probability by the sample size, e.g., for class 2, the expected frequency = (0.0195) • (100)
= 1.95 ≈ 2.0.
The calculated χ2, the sum of the last column of Table 11-1, is 11.99. From Appendix
Table A-5, χ2 0.05,6 = 12.592. Although the calculated χ2 is not
significant at the 5% level, it is nearly so. Therefore we cannot be too sure that the data
of Figure 2-1 are normally distributed. However, we can conclude that the data are near
enough to being normally distributed to have no effect on the AOV procedures we are
using in the evaluation of this variable.
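The computational steps for Table 11-1 can be sketched in Python as shown here. Two assumptions are worth stating. First, the cumulative probabilities are computed with `math.erf` rather than read from Appendix Table A-4, so they differ slightly from the rounded tabled values. Second, the 6 degrees of freedom are taken as the number of classes minus 1, minus 2 for the mean and standard deviation estimated from the sample, which is the usual rule and matches the 6 df used in the text. The variable names are illustrative.

```python
import math


def normal_cdf(z):
    """Cumulative probability of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mean, sd, n = 12.2, 2.6, 100
endpoints = [5.6, 7.1, 8.6, 10.1, 11.6, 13.1, 14.6, 16.1, 17.6]
observed = [1, 4, 4, 13, 10, 24, 23, 17, 4]

# Step 2: standardize the class end points.
z = [(y - mean) / sd for y in endpoints]
# Step 3: cumulative probability for each standardized value.
cum = [normal_cdf(v) for v in z]
# Step 4: probability of each class interval (difference of neighbors).
interval = [cum[0]] + [cum[i] - cum[i - 1] for i in range(1, len(cum))]
# Step 5: expected frequency = interval probability times sample size.
expected = [n * p for p in interval]

# Chi-square using the rounded expected frequencies printed in Table 11-1,
# so that the result matches the table's last column.
table_expected = [0.6, 2.0, 5.9, 12.5, 20.0, 22.8, 18.2, 11.5, 4.8]
chi2_table = sum((o - e) ** 2 / e for o, e in zip(observed, table_expected))

# Assumed rule: classes - 1 - number of estimated parameters (mean and sd).
df = len(observed) - 1 - 2

print(round(chi2_table, 2), df)
```

With the tabled expected frequencies the statistic is about 11.99 on 6 df, just under the tabular value of 12.592. Using the unrounded `expected` list instead gives a somewhat different value, because the small expected frequencies in the first two classes are sensitive to rounding.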
The following data show the effect of a certain type of fumigation on fruit spoilage.
Unfumigated 8 16 24
Fumigated 2 14 16
Totals 10 30 40
Does the amount of fruit spoilage depend upon whether the fruit has been fumigated?
In any contingency table, we set up the hypothesis that the two criteria of classification
are independent. The marginal totals are accepted as part of the hypothesis. For a population in
which the distribution in the classes is shown by the marginal totals and the classes are
independent, we are asking what proportion of a large series of samples similar to the one under
consideration will deviate as much or more from the theoretical as the one observed. On the
basis of the marginal total the expected entry in the upper left-hand corner would be (24) •
(10)/40 = 6. After this entry or any other has been calculated, all the remaining entries can be
obtained by subtraction from the marginal totals. Since only one value need be calculated, for a
2 x 2 table, there is only one degree of freedom. This checks with the (j - 1) (k - 1) degrees of
freedom for a j x k table since in this case j = k = 2. The expected (theoretical) values are given
below:
Unfumigated 6 18 24
Fumigated 4 12 16
Totals 10 30 40
Therefore,
$$\text{adjusted } \chi^2 = \frac{(|8 - 6| - 0.5)^2}{6} + \frac{(|16 - 18| - 0.5)^2}{18} + \frac{(|2 - 4| - 0.5)^2}{4} + \frac{(|14 - 12| - 0.5)^2}{12}$$
$$= \frac{(1.5)^2}{6} + \frac{(1.5)^2}{18} + \frac{(1.5)^2}{4} + \frac{(1.5)^2}{12}$$
$$= 0.375 + 0.125 + 0.5625 + 0.1875 = 1.25$$
Note the use of the correction for non-continuity for 1 df. From Appendix Table A-5, χ2 0.05,1
= 3.84. Since χ2 = 1.25 < 3.84, there is no reason to reject the null hypothesis of independence.
It appears, therefore, that fumigation has had no significant effect in reducing spoilage.
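The fumigation calculation can be reproduced with a short Python sketch. The expected values are built from the marginal totals, and the 1/2 correction is applied because the table has 1 df. The variable names are illustrative, and the column meanings are left unlabeled because the source table does not state them.

```python
observed = [[8, 16],    # unfumigated
            [2, 14]]    # fumigated

row_totals = [sum(row) for row in observed]          # 24, 16
col_totals = [sum(col) for col in zip(*observed)]    # 10, 30
n = sum(row_totals)                                  # 40

# Expected entry = (row total) * (column total) / (grand total).
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Adjusted chi-square: reduce each |O - E| by 1/2 before squaring.
adj_chi2 = sum(
    (abs(o - e) - 0.5) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)

print(expected, round(adj_chi2, 2), df)
```

This reproduces the expected table (6, 18, 4, 12) and the adjusted χ2 of 1.25 on 1 df.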
The following are tabulated data on 82 strains of oats divided into 2 groups according to the
presence or absence of awns, and into 3 groups according to yield. Do these data permit the
conclusion that more of the awned strains occur in the highest yielding classes than do awnless
strains?
Since χ2 = 15.55 > χ2 0.05,2 = 5.99, the observed values are not distributed as expected on the basis
of the marginal totals, and the awned strains do occur in the highest yielding classes more
frequently.
SUMMARY
1. The chi-square distribution can be used to test the goodness of fit of data to a
hypothesized model. Assume observations are classified into k frequency classes, then
$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$
where Oi is the observed frequency and Ei is the expected frequency from the
hypothesized model.
2. The chi-square distribution can also be used to test the hypothesis of statistical
independence between two variables, each of which is classified into a number of
categories or attributes. The two-way table of the classified frequencies is called a
contingency table.
$$\chi^2 = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
where Oij is the observed frequency in the ith row and the jth column, with expected
frequency Eij. If there are j rows and k columns, the χ2 has (j - 1) (k - 1) df.
3. When the calculated χ2 has 1 df, the above formula needs to be modified as
$$\text{adj. } \chi^2 = \sum_{i} \frac{(|O_i - E_i| - 0.5)^2}{E_i}$$
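The contingency-table formula in item 2 can be written as one general function for a table of any size. This is a minimal sketch: the function name `contingency_chi_square` is an illustrative assumption, and no continuity correction is applied, so it is meant for tables with more than 1 df. The counts used to exercise it are the rice sterility data of Exercise 11.

```python
def contingency_chi_square(table):
    """Return (chi-square, df) for a table of observed counts.

    Expected entries are (row total) * (column total) / (grand total).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            chi2 += (o - e) ** 2 / e

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Rows: no problem, moderate, severe; columns: genotypes A, B, C, D.
sterility = [[20, 15, 12, 10],
             [70, 60, 80, 50],
             [10, 25, 8, 40]]

chi2, df = contingency_chi_square(sterility)
print(round(chi2, 3), df)
```

For these counts the function returns a χ2 of about 43.807 on 6 df, which matches the answer printed with Exercise 11.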
EXERCISES
1. Suppose 50 years rainfall data are summarized in the table below. Test whether the data
approximate a normal distribution.
Class Midpoint (inches) Frequency Endpoint
1 8 3 9.5
2 11 9 12.5
3 14 14 15.5
4 17 10 18.5
5 20 7 21.5
6 23 0 24.5
7 26 4 27.5
8 29 2 30.5
9 32 1 33.5
2. Of 64 offspring of a certain cross of guinea pigs, 34 are red, 10 are black and 20 are
white. According to the genetic model, these numbers should be in the ratio of 9:3:4.
Are the data consistent with the model?
3. In an experiment involving the crossing of two hybrids of a species of flower the results
shown below are observed. Are these results consistent with the expected population
9:3:3:1?
4. It was hypothesized that the F2 generation of a particular barley cultivar would show a 9
hooded to 3 long-awned to 4 short-awned ratio. The observed data showed 348 hooded to
115 long-awned to 157 short-awned individuals. Test the hypothesis.
5. Suppose in a particular orange growing area of California that on the average 25% of the
oranges are graded as best grade, 40% are placed in the above average grade, 25% are
graded as average and 10% are graded as poor. A random sample of 500 bushels from
one orchard in the region yields 100 bushels of best grade, 300 bushels of above average
grade, 50 bushels of average grade and 50 bushels of poor grade oranges. Is the quality
of oranges from this orchard representative of this area?
(χ2 = 167.5)
Test the hypothesis that there is no difference between the two grasses in their
susceptibility to mildew. (χ2 = 199.40)
7. Two barley fields containing two different varieties of barley were examined for rust.
The results are presented below.
Do these data indicate the effectiveness of the inoculation on the basis of the 1% level of
significance? (Yes, since adj. χ2 = 35.79)
9. Twenty-two animals are suffering from a disease, the severity of which is about the same
in each case. In order to test the therapeutic value of a serum, it is administered to 10 of
the animals; 12 remain uninoculated as a control. The results are shown below.
Recovered Died
Inoculated 7 3
Not inoculated 3 9
Has inoculation been effective? (5% level) (No, since adj. χ2 = 2.82)
10. It is suspected that different combinations of temperature and humidity affect the number
of defective articles produced in a certain workroom. Do the following data confirm this
suspicion? (5% level) (Yes, since adj. χ2 = 5.95)
Humidity
Low High
Low 10 5
Temperature
High 4 16
11. A plant breeder wants to know if the sterility of rice is a genetic problem. Samples were
taken from a large field study of 400 plots and the sterility of each plot was rated as
follows:
Genotypes
Sterility A B C D
No problem 20 15 12 10
Moderate 70 60 80 50
Severe 10 25 8 40
Test the hypothesis that the severity of sterility is independent of genetic make-up or
genotype. (χ2 = 43.807)
Can he conclude that resistance to crown rot and growth habit are independent?
( χ2 = 33.636)
13. A pastry chef wishes to determine whether the proportion of unsatisfactory bear claws is
affected by oven temperature. He has baked batches of them at 350°, 400°, and 450°, and
has obtained the following results:
If the chef wishes to test the null hypothesis of identical proportions at an α = 0.05
significance level, what conclusion should he reach? (χ2 = 13.712)