Chapter 12

This document discusses using chi-square tests to compare population proportions for three or more groups. It provides an example of using a chi-square test to determine if perceptions of administrative expenses at a United Way organization differ based on respondents' occupations. The test revealed perceptions were not independent of occupation, with some occupations having more inaccurate perceptions of expenses. This helped the organization understand how to adjust its programs and fundraising. The document also discusses how chi-square tests can be used to compare loyalty proportions for three car models to see if they are equal.
CHAPTER 12

Comparing Multiple Proportions, Test of Independence and Goodness of Fit

CONTENTS
STATISTICS IN PRACTICE: UNITED WAY
12.1 TESTING THE EQUALITY OF POPULATION PROPORTIONS FOR THREE OR MORE POPULATIONS
     A Multiple Comparison Procedure
12.2 TEST OF INDEPENDENCE
12.3 GOODNESS OF FIT TEST
     Multinomial Probability Distribution
     Normal Probability Distribution

APPENDIXES
12.1 CHI-SQUARE TESTS USING MINITAB
12.2 CHI-SQUARE TESTS USING EXCEL

Copyright 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. WCN 02-200-203

STATISTICS in PRACTICE
UNITED WAY*
ROCHESTER, NEW YORK
United Way of Greater Rochester is a nonprofit organization dedicated to improving the
quality of life for all people in the seven counties it serves by meeting the community’s
most important human care needs.
The annual United Way/Red Cross fund-raising campaign funds hundreds of programs
offered by more than 200 service providers. These providers meet a wide variety of human
needs (physical, mental, and social) and serve people of all ages, backgrounds, and
economic means.

[Photo: United Way programs meet the needs of children as well as adults. Credit: Hero Images/Getty Images]

The United Way of Greater Rochester decided to conduct a survey to learn more about
community perceptions of charities. Focus-group interviews were held with professional,
service, and general worker groups to obtain preliminary information on perceptions. The
information obtained was then used to help develop the questionnaire for the survey. The
questionnaire was pretested, modified, and distributed to 440 individuals.
A variety of descriptive statistics, including frequency distributions and
crosstabulations, were provided from the data collected. An important part of the analysis
involved the use of chi-square tests of independence. One use of such statistical tests was
to determine whether perceptions of administrative expenses were independent of the
occupation of the respondent.
The hypotheses for the test of independence were:

H0: Perception of United Way administrative expenses is independent of the occupation of the respondent.
Ha: Perception of United Way administrative expenses is not independent of the occupation of the respondent.

Two questions in the survey provided categorical data for the statistical test. One
question obtained data on perceptions of the percentage of funds going to administrative
expenses (up to 10%, 11–20%, and 21% or more). The other question asked for the
occupation of the respondent.
The test of independence led to rejection of the null hypothesis and to the conclusion
that perception of United Way administrative expenses is not independent of the
occupation of the respondent. Actual administrative expenses were less than 9%, but 35%
of the respondents perceived that administrative expenses were 21% or more. Hence, many
respondents had inaccurate perceptions of administrative expenses. In this group,
production-line, clerical, sales, and professional-technical employees had the more
inaccurate perceptions.
The community perceptions study helped United Way of Rochester develop adjustments
to its programs and fund-raising activities. In this chapter, you will learn how tests,
such as described here, are conducted.

*The authors are indebted to Dr. Philip R. Tyler, marketing consultant to the United Way,
for providing this Statistics in Practice.

In Chapters 9, 10, and 11 we introduced methods of statistical inference for hypothesis tests
about the means, proportions, and variances of one and two populations. In this chapter,
we introduce three additional hypothesis-testing procedures that expand our capacity for
making statistical inferences about populations.


The test statistic used in conducting the hypothesis tests in this chapter is based on the
chi-square (χ²) distribution. In all cases, the data are categorical. These chi-square tests are
versatile and expand hypothesis testing with the following applications.
1. Testing the equality of population proportions for three or more populations
2. Testing the independence of two categorical variables
3. Testing whether a probability distribution for a population follows a specific his-
torical or theoretical probability distribution
We begin by considering hypothesis tests for the equality of population proportions for
three or more populations.

12.1 Testing the Equality of Population Proportions for Three or More Populations

In Section 10.2 we introduced methods of statistical inference for population proportions
with two populations where the hypothesis test conclusion was based on the standard
normal (z) test statistic. We now show how the chi-square (χ²) test statistic can be used to
make statistical inferences about the equality of population proportions for three or more
populations. Using the notation
p1 = population proportion for population 1
p2 = population proportion for population 2
and
pk = population proportion for population k

the hypotheses for the equality of population proportions for k ≥ 3 populations are as
follows:

H0: p1 = p2 = . . . = pk
Ha: Not all population proportions are equal

If the sample data and the chi-square test computations indicate H0 cannot be rejected, we
cannot detect a difference among the k population proportions. However, if the sample data
and the chi-square test computations indicate H0 can be rejected, we have the statistical
evidence to conclude that not all k population proportions are equal; that is, one or more
population proportions differ from the other population proportions. Further analyses can
be done to conclude which population proportion or proportions are significantly different
from others. Let us demonstrate this chi-square test by considering an application.
Organizations such as J.D. Power and Associates use the proportion of owners likely to
repurchase a particular automobile as an indication of customer loyalty for the automobile.
An automobile with a greater proportion of owners likely to repurchase is concluded to
have greater customer loyalty. Suppose that in a particular study we want to compare the
customer loyalty for three automobiles: Chevrolet Impala, Ford Fusion, and Honda Accord.
The current owners of each of the three automobiles form the three populations for the
study. The three population proportions of interest are as follows:

p1 = proportion likely to repurchase an Impala for the population of Chevrolet Impala owners
p2 = proportion likely to repurchase a Fusion for the population of Ford Fusion owners
p3 = proportion likely to repurchase an Accord for the population of Honda Accord owners


TABLE 12.1 SAMPLE RESULTS OF LIKELY TO REPURCHASE FOR THREE POPULATIONS OF AUTOMOBILE OWNERS (OBSERVED FREQUENCIES)

Data file: AutoLoyalty

                                     Automobile Owners
Likely to Repurchase   Chevrolet Impala   Ford Fusion   Honda Accord   Total
Yes                           69              120           123         312
No                            56               80            52         188
Total                        125              200           175         500

The hypotheses are stated as follows:

H0: p1 = p2 = p3
Ha: Not all population proportions are equal

To conduct this hypothesis test we begin by taking a sample of owners from each of
the three populations. Thus we will have a sample of Chevrolet Impala owners, a sample
of Ford Fusion owners, and a sample of Honda Accord owners. Each sample provides
categorical data indicating whether the respondents are likely or not likely to repurchase
the automobile. The data for samples of 125 Chevrolet Impala owners, 200 Ford Fusion
owners, and 175 Honda Accord owners are summarized in the tabular format shown in
Table 12.1. This table has two rows for the responses Yes and No and three columns, one
corresponding to each of the populations. The observed frequencies are summarized in
the six cells of the table corresponding to each combination of the likely to repurchase
responses and the three populations.
Note: In studies such as these, we often use the same sample size for each population.
We have chosen different sample sizes in this example to show that the chi-square test is
not restricted to equal sample sizes for each of the k populations.
Using Table 12.1, we see that 69 of the 125 Chevrolet Impala owners indicated that
they were likely to repurchase a Chevrolet Impala. One hundred and twenty of the 200
Ford Fusion owners and 123 of the 175 Honda Accord owners indicated that they were
likely to repurchase their current automobile. Also, across all three samples, 312 of the
500 owners in the study indicated that they were likely to repurchase their current
automobile. The question now is how do we analyze the data in Table 12.1 to determine if the
hypothesis H0: p1 = p2 = p3 should be rejected?
The data in Table 12.1 are the observed frequencies for each of the six cells that repre-
sent the six combinations of the likely to repurchase response and the owner population. If
we can determine the expected frequencies under the assumption H0 is true, we can use the
chi-square test statistic to determine whether there is a significant difference between the
observed and expected frequencies. If a significant difference exists between the observed
and expected frequencies, the hypothesis H0 can be rejected and there is evidence that not
all the population proportions are equal.
Expected frequencies for the six cells of the table are based on the following rationale.
First, we assume that the null hypothesis of equal population proportions is true. Then we
note that in the entire sample of 500 owners, a total of 312 owners indicated that they were
likely to repurchase their current automobile. Thus, 312/500 = .624 is the overall sample
proportion of owners indicating they are likely to repurchase their current automobile. If
H0: p1 = p2 = p3 is true, .624 would be the best estimate of the proportion responding likely
to repurchase for each of the automobile owner populations. So if the assumption of H0 is
true, we would expect .624 of the 125 Chevrolet Impala owners, or .624(125) = 78 owners
to indicate they are likely to repurchase the Impala. Using the .624 overall sample proportion,
we would expect .624(200) = 124.8 of the 200 Ford Fusion owners and .624(175) = 109.2
of the Honda Accord owners to respond that they are likely to repurchase their respective
model of automobile.


Let us generalize the approach to computing expected frequencies by letting eij de-
note the expected frequency for the cell in row i and column j of the table. With this
notation, now reconsider the expected frequency calculation for the response of likely
to repurchase Yes (row 1) for Chevrolet Impala owners (column 1), that is, the expected
frequency e11.
Note that 312 is the total number of Yes responses (row 1 total), 125 is the total sample
size for Chevrolet Impala owners (column 1 total), and 500 is the total sample size.
Following the logic in the preceding paragraph, we can show

$$e_{11} = \left(\frac{\text{Row 1 Total}}{\text{Total Sample Size}}\right)(\text{Column 1 Total}) = \left(\frac{312}{500}\right)125 = (.624)125 = 78$$

Starting with the first part of the above expression, we can write

$$e_{11} = \frac{(\text{Row 1 Total})(\text{Column 1 Total})}{\text{Total Sample Size}}$$

Generalizing this expression shows that the following formula can be used to provide the
expected frequencies under the assumption H0 is true.

EXPECTED FREQUENCIES UNDER THE ASSUMPTION H0 IS TRUE

$$e_{ij} = \frac{(\text{Row } i \text{ Total})(\text{Column } j \text{ Total})}{\text{Total Sample Size}} \tag{12.1}$$

Using equation (12.1), we see that the expected frequency of Yes responses (row 1) for
Honda Accord owners (column 3) would be e13 = (Row 1 Total)(Column 3 Total)/(Total
Sample Size) = (312)(175)/500 = 109.2. Use equation (12.1) to verify the other expected
frequencies are as shown in Table 12.2.
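
Equation (12.1) can also be checked numerically. The following is a minimal Python sketch, not part of the original text: the function name `expected_frequencies` and the list-of-rows layout are illustrative choices. It takes the observed frequencies of Table 12.1 and computes the expected frequency for every cell under the assumption H0 is true.

```python
# Observed frequencies from Table 12.1.
# Rows: Yes, No. Columns: Chevrolet Impala, Ford Fusion, Honda Accord.
observed = [
    [69, 120, 123],
    [56, 80, 52],
]


def expected_frequencies(table):
    """Apply equation (12.1): e_ij = (row i total)(column j total) / total sample size."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


expected = expected_frequencies(observed)
for label, row in zip(("Yes", "No"), expected):
    print(label, [round(e, 1) for e in row])
```

The printed rows can be compared cell by cell with the expected frequencies the text derives for the Yes and No responses.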
The test procedure for comparing the observed frequencies of Table 12.1 with the
expected frequencies of Table 12.2 involves the computation of the following chi-square
statistic:

CHI-SQUARE TEST STATISTIC

$$\chi^2 = \sum_i \sum_j \frac{(f_{ij} - e_{ij})^2}{e_{ij}} \tag{12.2}$$

where

fij = observed frequency for the cell in row i and column j
eij = expected frequency for the cell in row i and column j under the assumption H0 is true

Note: In a chi-square test involving the equality of k population proportions, the
above test statistic has a chi-square distribution with k − 1 degrees of freedom
provided the expected frequency is 5 or more for each cell.


TABLE 12.2 EXPECTED FREQUENCIES FOR LIKELY TO REPURCHASE FOR THREE POPULATIONS OF AUTOMOBILE OWNERS IF H0 IS TRUE

                                     Automobile Owners
Likely to Repurchase   Chevrolet Impala   Ford Fusion   Honda Accord   Total
Yes                           78             124.8         109.2        312
No                            47              75.2          65.8        188
Total                        125             200           175          500

Reviewing the expected frequencies in Table 12.2, we see that the expected frequency
is at least five for each cell in the table. We therefore proceed with the computation of the
chi-square test statistic. The calculations necessary to compute the value of the test statistic
are shown in Table 12.3. In this case, we see that the value of the test statistic is χ² = 7.89.
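
As a rough check on this arithmetic, the sketch below evaluates equation (12.2) directly in Python. It is an illustration only: the function name `chi_square_statistic` is a hypothetical helper, and the observed and expected frequencies are typed in from Tables 12.1 and 12.2.

```python
# Observed frequencies (Table 12.1) and expected frequencies (Table 12.2).
# Rows: Yes, No. Columns: Chevrolet Impala, Ford Fusion, Honda Accord.
observed = [[69, 120, 123], [56, 80, 52]]
expected = [[78.0, 124.8, 109.2], [47.0, 75.2, 65.8]]


def chi_square_statistic(f, e):
    """Equation (12.2): sum over all cells of (f_ij - e_ij)^2 / e_ij."""
    return sum(
        (f_ij - e_ij) ** 2 / e_ij
        for f_row, e_row in zip(f, e)
        for f_ij, e_ij in zip(f_row, e_row)
    )


chi2 = chi_square_statistic(observed, expected)
print(round(chi2, 2))
```

Summing the unrounded cell contributions avoids the small rounding differences that appear when the two-decimal entries of a hand computation are added.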
In order to understand whether or not χ² = 7.89 leads us to reject H0: p1 = p2 = p3,
you will need to understand and refer to values of the chi-square distribution. Table 12.4
shows the general shape of the chi-square distribution, but note that the shape of a specific
chi-square distribution depends upon the number of degrees of freedom. The table shows
the upper tail areas of .10, .05, .025, .01, and .005 for chi-square distributions with up to
15 degrees of freedom. This version of the chi-square table will enable you to conduct the
hypothesis tests presented in this chapter.
Since the expected frequencies shown in Table 12.2 are based on the assumption
that H0: p1 = p2 = p3 is true, observed frequencies, fij, that are in agreement with expected
frequencies, eij, provide small values of (fij − eij)² in equation (12.2). If this is the case, the
value of the chi-square test statistic will be relatively small and H0 cannot be rejected. On
the other hand, if the differences between the observed and expected frequencies are large,
values of (fij − eij)² and the computed value of the test statistic will be large. In this case,
the null hypothesis of equal population proportions can be rejected. Thus a chi-square test
for equal population proportions will always be an upper tail test with rejection of H0
occurring when the test statistic is in the upper tail of the chi-square distribution.
Note: The chi-square test presented in this section is always a one-tailed test with the
rejection of H0 occurring in the upper tail of the chi-square distribution.
We can use the upper tail area of the appropriate chi-square distribution and the
p-value approach to determine whether the null hypothesis can be rejected. In the automobile
brand loyalty study, the three owner populations indicate that the appropriate chi-square

TABLE 12.3 COMPUTATION OF THE CHI-SQUARE TEST STATISTIC FOR THE TEST OF EQUAL POPULATION PROPORTIONS

                                                                               Squared          Squared Difference Divided
Likely to     Automobile   Observed          Expected          Difference     Difference       by Expected Frequency
Repurchase?   Owner        Frequency (fij)   Frequency (eij)   (fij − eij)    (fij − eij)²     (fij − eij)²/eij
Yes           Impala            69               78.0             −9.0           81.00             1.04
Yes           Fusion           120              124.8             −4.8           23.04             0.18
Yes           Accord           123              109.2             13.8          190.44             1.74
No            Impala            56               47.0              9.0           81.00             1.72
No            Fusion            80               75.2              4.8           23.04             0.31
No            Accord            52               65.8            −13.8          190.44             2.89
Total                          500              500                                            χ² = 7.89


TABLE 12.4 SELECTED VALUES OF THE CHI-SQUARE DISTRIBUTION

[Figure: chi-square distribution curve; the area or probability shown lies in the upper tail, to the right of the χ² value]

Area in Upper Tail
Degrees of Freedom .10 .05 .025 .01 .005
1 2.706 3.841 5.024 6.635 7.879
2 4.605 5.991 7.378 9.210 10.597
3 6.251 7.815 9.348 11.345 12.838
4 7.779 9.488 11.143 13.277 14.860
5 9.236 11.070 12.832 15.086 16.750
6 10.645 12.592 14.449 16.812 18.548
7 12.017 14.067 16.013 18.475 20.278
8 13.362 15.507 17.535 20.090 21.955
9 14.684 16.919 19.023 21.666 23.589
10 15.987 18.307 20.483 23.209 25.188
11 17.275 19.675 21.920 24.725 26.757
12 18.549 21.026 23.337 26.217 28.300
13 19.812 22.362 24.736 27.688 29.819
14 21.064 23.685 26.119 29.141 31.319
15 22.307 24.996 27.488 30.578 32.801

distribution has k − 1 = 3 − 1 = 2 degrees of freedom. Using row two of the chi-square
distribution table, we have the following:

Area in Upper Tail   .10     .05     .025    .01     .005
χ² Value (2 df)      4.605   5.991   7.378   9.210   10.597

χ² = 7.89 falls between 7.378 and 9.210.

We see the upper tail area at χ² = 7.89 is between .025 and .01. Thus, the corresponding
upper tail area or p-value must be between .025 and .01. With p-value ≤ .05, we reject
H0 and conclude that the three population proportions are not all equal and thus there
is a difference in brand loyalties among the Chevrolet Impala, Ford Fusion, and Honda
Accord owners. Minitab or Excel procedures provided in Appendix F can be used to show
χ² = 7.89 with 2 degrees of freedom yields a p-value = .0193.
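
The p-value can also be reproduced without statistical software in this particular case. The sketch below is an illustration under a stated assumption: it relies on the fact that, for 2 degrees of freedom only, the upper tail area of the chi-square distribution has the closed form exp(−x/2). The function name `upper_tail_area_2df` is hypothetical, and the shortcut does not carry over to other degrees of freedom.

```python
import math

# Observed frequencies (Table 12.1) and expected frequencies (Table 12.2).
observed = [[69, 120, 123], [56, 80, 52]]
expected = [[78.0, 124.8, 109.2], [47.0, 75.2, 65.8]]

# Unrounded test statistic from equation (12.2).
chi2 = sum(
    (f - e) ** 2 / e
    for f_row, e_row in zip(observed, expected)
    for f, e in zip(f_row, e_row)
)


def upper_tail_area_2df(x):
    """Upper tail area of the chi-square distribution with 2 degrees of freedom.

    Valid for 2 degrees of freedom only, where the area equals exp(-x / 2).
    """
    return math.exp(-x / 2)


p_value = upper_tail_area_2df(chi2)
print(round(p_value, 4))
```

The unrounded statistic is used here because rounding it to 7.89 first shifts the fourth decimal place of the p-value slightly.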


Instead of using the p-value, we could use the critical value approach to draw the same
conclusion. With α = .05 and 2 degrees of freedom, the critical value for the chi-square
test statistic is χ² = 5.991. The upper tail rejection region becomes

Reject H0 if χ² ≥ 5.991

With 7.89 ≥ 5.991, we reject H0. Thus, the p-value approach and the critical value approach
provide the same hypothesis-testing conclusion.
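
For readers who want to see where the critical value comes from, here is a small Python sketch. It again assumes the 2-degrees-of-freedom special case, in which solving exp(−x/2) = α for x gives x = −2 ln(α); the function name `critical_value_2df` is illustrative. The same formula reproduces row two of Table 12.4.

```python
import math


def critical_value_2df(alpha):
    """Chi-square value with upper tail area alpha, for 2 degrees of freedom only."""
    # Solving exp(-x / 2) = alpha for x gives x = -2 ln(alpha).
    return -2 * math.log(alpha)


chi2 = 7.89   # test statistic from the automobile brand loyalty study
alpha = 0.05  # level of significance
critical = critical_value_2df(alpha)
print(round(critical, 3), chi2 >= critical)

# Row two of Table 12.4, recomputed from the same formula.
row_two = [round(critical_value_2df(a), 3) for a in (0.10, 0.05, 0.025, 0.01, 0.005)]
print(row_two)
```

For other degrees of freedom no such simple formula applies, so a chi-square table or statistical software is needed.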
Let us summarize the general steps that can be used to conduct a chi-square test for the
equality of the population proportions for three or more populations.

A CHI-SQUARE TEST FOR THE EQUALITY OF POPULATION PROPORTIONS FOR k ≥ 3 POPULATIONS

1. State the null and alternative hypotheses

H0: p1 = p2 = . . . = pk
Ha: Not all population proportions are equal

2. Select a random sample from each of the populations and record the observed
frequencies, fij, in a table with 2 rows and k columns
3. Assume the null hypothesis is true and compute the expected frequencies, eij
4. If the expected frequency, eij, is 5 or more for each cell, compute the test
statistic:

$$\chi^2 = \sum_i \sum_j \frac{(f_{ij} - e_{ij})^2}{e_{ij}}$$

5. Rejection rule:

p-value approach: Reject H0 if p-value ≤ α
Critical value approach: Reject H0 if χ² ≥ χ²α

where the chi-square distribution has k − 1 degrees of freedom and α is the
level of significance for the test.
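
The five steps above can be strung together in a few lines of code. The following Python sketch is one possible arrangement, not a definitive implementation: the function name `equal_proportions_test` and the returned dictionary are illustrative, the level of significance is fixed at .05, and the critical values are copied from the .05 column of Table 12.4, so the sketch only covers up to 15 degrees of freedom.

```python
# Upper tail .05 critical values copied from Table 12.4, keyed by degrees of freedom.
CRITICAL_VALUES_05 = {
    1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070,
    6: 12.592, 7: 14.067, 8: 15.507, 9: 16.919, 10: 18.307,
    11: 19.675, 12: 21.026, 13: 22.362, 14: 23.685, 15: 24.996,
}


def equal_proportions_test(observed):
    """Chi-square test of H0: p1 = p2 = ... = pk at the .05 level of significance.

    `observed` holds two rows (Yes counts, No counts) and k >= 3 columns.
    """
    if len(observed) != 2 or len(observed[0]) != len(observed[1]) or len(observed[0]) < 3:
        raise ValueError("expected a table with 2 rows and k >= 3 columns")

    # Step 3: expected frequencies under H0, equation (12.1).
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Step 4: every expected frequency must be 5 or more before computing the statistic.
    if min(min(row) for row in expected) < 5:
        raise ValueError("an expected frequency is below 5")
    chi2 = sum(
        (f - e) ** 2 / e
        for f_row, e_row in zip(observed, expected)
        for f, e in zip(f_row, e_row)
    )

    # Step 5: critical value approach with k - 1 degrees of freedom.
    df = len(col_totals) - 1
    critical = CRITICAL_VALUES_05[df]
    return {"chi2": chi2, "df": df, "critical": critical, "reject_h0": chi2 >= critical}


# Automobile brand loyalty data from Table 12.1.
result = equal_proportions_test([[69, 120, 123], [56, 80, 52]])
print(result)
```

Using the tabled critical values keeps the sketch tied to the critical value approach; a p-value version would need a chi-square distribution function from statistical software.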

A Multiple Comparison Procedure


We have used a chi-square test to conclude that the population proportions for the three
populations of automobile owners are not all equal. Thus, some differences among the
population proportions exist and the study indicates that customer loyalties are not all
the same for the Chevrolet Impala, Ford Fusion, and Honda Accord owners. To identify
where the differences between population proportions exist, we can begin by computing
the three sample proportions as follows:
Brand Loyalty Sample Proportions

Chevrolet Impala   p̄1 = 69/125 = .5520
Ford Fusion        p̄2 = 120/200 = .6000
Honda Accord       p̄3 = 123/175 = .7029
Since the chi-square test indicated that not all population proportions are equal, it is
reasonable for us to proceed by attempting to determine where differences among the


population proportions exist. For this we will rely on a multiple comparison procedure that
can be used to conduct statistical tests between all pairs of population proportions. In the fol-
lowing, we discuss a multiple comparison procedure known as the Marascuilo procedure.
This is a relatively straightforward procedure for making pairwise comparisons of all pairs
of population proportions. We will demonstrate the computations required by this multiple
comparison test procedure for the automobile customer loyalty study.
We begin by computing the absolute value of the pairwise difference between sample
proportions for each pair of populations in the study. In the three-population automobile
brand loyalty study we compare populations 1 and 2, populations 1 and 3, and then popula-
tions 2 and 3 using the sample proportions as follows:
Chevrolet Impala and Ford Fusion

|p̄1 − p̄2| = |.5520 − .6000| = .0480

Chevrolet Impala and Honda Accord

|p̄1 − p̄3| = |.5520 − .7029| = .1509

Ford Fusion and Honda Accord

|p̄2 − p̄3| = |.6000 − .7029| = .1029

In a second step, we select a level of significance and compute the corresponding critical
value for each pairwise comparison using the following expression.

CRITICAL VALUES FOR THE MARASCUILO PAIRWISE COMPARISON PROCEDURE FOR k POPULATION PROPORTIONS

For each pairwise comparison compute a critical value as follows:

$$CV_{ij} = \sqrt{\chi^2_{\alpha}}\,\sqrt{\frac{\bar{p}_i(1 - \bar{p}_i)}{n_i} + \frac{\bar{p}_j(1 - \bar{p}_j)}{n_j}} \tag{12.3}$$

where
χ²α = chi-square with a level of significance α and k − 1 degrees of freedom
p̄i and p̄j = sample proportions for populations i and j
ni and nj = sample sizes for populations i and j

Using the chi-square distribution in Table 12.4, k − 1 = 3 − 1 = 2 degrees of freedom,
and a .05 level of significance, we have χ².05 = 5.991. Now using the sample proportions
p̄1 = .5520, p̄2 = .6000, and p̄3 = .7029, the critical values for the three pairwise
comparison tests are as follows:

Chevrolet Impala and Ford Fusion

$$CV_{12} = \sqrt{5.991}\,\sqrt{\frac{.5520(1 - .5520)}{125} + \frac{.6000(1 - .6000)}{200}} = .1380$$

Chevrolet Impala and Honda Accord

$$CV_{13} = \sqrt{5.991}\,\sqrt{\frac{.5520(1 - .5520)}{125} + \frac{.7029(1 - .7029)}{175}} = .1379$$


TABLE 12.5 PAIRWISE COMPARISON TESTS FOR THE AUTOMOBILE BRAND LOYALTY STUDY

                                                              Significant if
Pairwise Comparison                  |p̄i − p̄j|     CVij      |p̄i − p̄j| > CVij
Chevrolet Impala vs. Ford Fusion       .0480       .1380     Not significant
Chevrolet Impala vs. Honda Accord      .1509       .1379     Significant
Ford Fusion vs. Honda Accord           .1029       .1198     Not significant

Ford Fusion and Honda Accord

$$CV_{23} = \sqrt{5.991}\,\sqrt{\frac{.6000(1 - .6000)}{200} + \frac{.7029(1 - .7029)}{175}} = .1198$$

If the absolute value of any pairwise sample proportion difference |p̄i − p̄j| exceeds its
corresponding critical value, CVij, the pairwise difference is significant at the .05 level of
significance and we can conclude that the two corresponding population proportions are
different. The final step of the pairwise comparison procedure is summarized in Table 12.5.
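
The pairwise comparisons can be automated along the following lines. This Python sketch is an illustration of equation (12.3), with hypothetical names (`samples`, `marascuilo`) and with the critical chi-square value 5.991 entered by hand from Table 12.4. Because the code works with unrounded sample proportions, its critical values agree with the hand computations to about three decimal places and may differ in the fourth.

```python
import math
from itertools import combinations

# (Yes count, sample size) for each population, from Table 12.1.
samples = {
    "Chevrolet Impala": (69, 125),
    "Ford Fusion": (120, 200),
    "Honda Accord": (123, 175),
}
CHI2_CRITICAL = 5.991  # .05 level of significance, k - 1 = 2 degrees of freedom


def marascuilo(samples, chi2_critical):
    """Return (name_i, name_j, |p_i - p_j|, CV_ij, significant) for every pair."""
    results = []
    for (name_i, (yes_i, n_i)), (name_j, (yes_j, n_j)) in combinations(samples.items(), 2):
        p_i, p_j = yes_i / n_i, yes_j / n_j
        diff = abs(p_i - p_j)
        # Equation (12.3).
        cv = math.sqrt(chi2_critical) * math.sqrt(
            p_i * (1 - p_i) / n_i + p_j * (1 - p_j) / n_j
        )
        results.append((name_i, name_j, diff, cv, diff > cv))
    return results


results = marascuilo(samples, CHI2_CRITICAL)
for name_i, name_j, diff, cv, significant in results:
    verdict = "Significant" if significant else "Not significant"
    print(f"{name_i} vs. {name_j}: diff={diff:.4f}, CV={cv:.4f}, {verdict}")
```

The same function works for any number of populations, provided the critical chi-square value is looked up with k − 1 degrees of freedom.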
The conclusion from the pairwise comparison procedure is that the only significant
difference in customer loyalty occurs between the Chevrolet Impala and the Honda Accord.
Our sample results indicate that the Honda Accord had a greater population proportion of
owners who say they are likely to repurchase the Honda Accord. Thus, we can conclude that
the Honda Accord (p̄3 = .7029) has a greater customer loyalty than the Chevrolet Impala
(p̄1 = .5520).
The results of the study are inconclusive as to the comparative loyalty of the Ford Fusion.
While the Ford Fusion did not show significantly different results when compared to the
Chevrolet Impala or Honda Accord, a larger sample may have revealed a significant differ-
ence between Ford Fusion and the other two automobiles in terms of customer loyalty. It is
not uncommon for a multiple comparison procedure to show significance for some pairwise
comparisons and yet not show significance for other pairwise comparisons in the study.

NOTES AND COMMENTS

1. In Chapter 10, we used the standard normal distribution and the z test statistic to
conduct hypothesis tests about the proportions of two populations. However, the
chi-square test introduced in this section can also be used to conduct the hypothesis
test that the proportions of two populations are equal. The results will be the same
under both test procedures and the value of the test statistic χ² will be equal to the
square of the value of the test statistic z. An advantage of the methodology in
Chapter 10 is that it can be used for either a one-tailed or a two-tailed hypothesis
about the proportions of two populations, whereas the chi-square test in this section
can be used only for two-tailed tests. Exercise 12.6 will give you a chance to use the
chi-square test for the hypothesis that the proportions of two populations are equal.
2. Each of the k populations in this section had two response outcomes, Yes or No. In
effect, each population had a binomial distribution with parameter p the population
proportion of Yes responses. An extension of the chi-square procedure in this section
applies when each of the k populations has three or more possible responses. In this
case, each population is said to have a multinomial distribution. The chi-square
calculations for the expected frequencies, eij, and the test statistic, χ², are the same
as shown in expressions (12.1) and (12.2). The only difference is that the null
hypothesis assumes that the multinomial distribution for the response variable is the
same for all populations. With r responses for each of the k populations, the
chi-square test statistic has (r − 1)(k − 1) degrees of freedom. Exercise 12.8 will give
you a chance to use the chi-square test to compare three populations with multinomial
distributions.
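
The relationship between χ² and z described in the first note can be seen numerically. The sketch below is an illustration only: it reuses the Impala and Accord samples from Table 12.1 as a two-population example, and it assumes the z statistic takes the pooled-proportion form used for testing the equality of two population proportions.

```python
import math

# Impala and Accord samples from Table 12.1: Yes count and sample size.
yes1, n1 = 69, 125
yes2, n2 = 123, 175

# z statistic using the pooled estimate of the common proportion (assumed form).
p1, p2 = yes1 / n1, yes2 / n2
pooled = (yes1 + yes2) / (n1 + n2)
z = (p1 - p2) / math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))

# Chi-square statistic for the same data arranged as a 2 x 2 table.
observed = [[yes1, yes2], [n1 - yes1, n2 - yes2]]
row_totals = [sum(row) for row in observed]
col_totals = [n1, n2]
n = n1 + n2
chi2 = sum(
    (observed[i][j] - row_totals[i] * col_totals[j] / n) ** 2
    / (row_totals[i] * col_totals[j] / n)
    for i in range(2)
    for j in range(2)
)

# The two printed values should agree up to floating-point rounding.
print(round(z ** 2, 4), round(chi2, 4))
```

The sign of z is lost when it is squared, which is one way to see why the chi-square version cannot distinguish the direction of a difference.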


Exercises

Methods
1. Use the sample data below to test the hypotheses
H0: p1 = p2 = p3
Ha: Not all population proportions are equal
where pi is the population proportion of Yes responses for population i. Using a .05 level
of significance, what is the p-value and what is your conclusion?

Populations
Response 1 2 3
Yes 150 150 96
No 100 150 104

2. Reconsider the observed frequencies in exercise 1.
a. Compute the sample proportion for each population.
b. Use the multiple comparison procedure to determine which population proportions
differ significantly. Use a .05 level of significance.

Applications
3. The sample data below represent the number of late and on time flights for Delta, United,
and US Airways (Bureau of Transportation Statistics, March 2012).

Airline
Flight Delta United US Airways
Late 39 51 56
On Time 261 249 344

a. Formulate the hypotheses for a test that will determine if the population proportion of
late flights is the same for all three airlines.
b. Conduct the hypothesis test with a .05 level of significance. What is the p-value and
what is your conclusion?
c. Compute the sample proportion of late flights for each airline. What is the overall
proportion of late flights for the three airlines?
4. Benson Manufacturing is considering ordering electronic components from three different
suppliers. The suppliers may differ in terms of quality in that the proportion or percentage
of defective components may differ among the suppliers. To evaluate the proportion of
defective components for the suppliers, Benson has requested a sample shipment of 500
components from each supplier. The number of defective components and the number of
good components found in each shipment are as follows.

Supplier
Component A B C
Defective 15 20 40
Good 485 480 460


a. Formulate the hypotheses that can be used to test for equal proportions of defective
components provided by the three suppliers.
b. Using a .05 level of significance, conduct the hypothesis test. What is the p-value and
what is your conclusion?
c. Conduct a multiple comparison test to determine if there is an overall best supplier or
if one supplier can be eliminated because of poor quality.
5. Kate Sanders, a researcher in the department of biology at IPFW University, studied the
effect of agriculture contaminants on the stream fish population in Northeastern Indiana
(April 2012). Specially designed traps collected samples of fish at each of four stream
locations. A research question was, Did the differences in agricultural contaminants found
at the four locations alter the proportion of the fish population by gender? Observed
frequencies were as follows.

Stream Locations
Gender A B C D
Male 49 44 49 39
Female 41 46 36 44

a. Focusing on the proportion of male fish at each location, test the hypothesis that the
population proportions are equal for all four locations. Use a .05 level of significance.
What is the p-value and what is your conclusion?
b. Does it appear that differences in agricultural contaminants found at the four locations
altered the fish population by gender?
Note: Exercise 6 shows a chi-square test can be used when the hypothesis is about the
equality of two population proportions.
6. A tax preparation firm is interested in comparing the quality of work at two of its regional
offices. The observed frequencies showing the number of sampled returns with errors and
the number of sampled returns that were correct are as follows.

Regional Office
Return Office 1 Office 2
Error 35 27
Correct 215 273

a. What are the sample proportions of returns with errors at the two offices?
b. Use the chi-square test procedure to see if there is a significant difference between
the population proportions of returns with errors at the two offices. Test the null hypothesis
H0: p1 = p2 with a .10 level of significance. What is the p-value and what is your
conclusion? Note: We generally use the chi-square test of equal proportions when
there are three or more populations, but this example shows that the same chi-square
test can be used for testing equal proportions with two populations.
c. In Section 10.2, a z test was used to conduct the above test. Either a χ² test statistic
or a z test statistic may be used to test the hypothesis. However, when we want to make
inferences about the proportions for two populations, we generally prefer the z test
statistic procedure. Refer to the Notes and Comments at the end of this section and
comment on why the z test statistic provides the user with more options for inferences
about the proportions of two populations.
7. Social networking is becoming more and more popular around the world. Pew Research
Center used a survey of adults in several countries to determine the percentage of adults
who use social networking sites (USA Today, February 8, 2012). Assume that the results
for surveys in Great Britain, Israel, Russia, and the United States are as follows.


                                                 Country
Use Social Networking Sites    Great Britain    Israel    Russia    United States
Yes                                 344           265       301          500
No                                  456           235       399          500

a. Conduct a hypothesis test to determine whether the proportion of adults using social
networking sites is equal for all four countries. What is the p-value? Using a .05 level
of significance, what is your conclusion?
b. What are the sample proportions for each of the four countries? Which country has
the largest proportion of adults using social networking sites?
c. Using a .05 level of significance, conduct multiple pairwise comparison tests among
the four countries. What is your conclusion?
Exercise 8 shows a chi-square test can also be used for multiple population tests when
the categorical response variable has three or more outcomes.

8. A manufacturer is considering purchasing parts from three different suppliers. The parts
received from the suppliers are classified as having a minor defect, having a major defect,
or being good. Test results from samples of parts received from each of the three suppliers
are shown below. Note that any test with these data is no longer a test of proportions for the
three supplier populations because the categorical response variable has three outcomes:
minor defect, major defect, and good.

Supplier
Part Tested A B C
Minor Defect 15 13 21
Major Defect 5 11 5
Good 130 126 124

Using the data above, conduct a hypothesis test to determine if the distribution of defects is the
same for the three suppliers. Use the chi-square test calculations as presented in this section
with the exception that a table with r rows and c columns results in a chi-square test statistic
with (r – 1)(c – 1) degrees of freedom. Using a .05 level of significance, what is the p-value
and what is your conclusion?

12.2 Test of Independence


An important application of a chi-square test involves using sample data to test for the in-
dependence of two categorical variables. For this test we take one sample from a population
and record the observations for two categorical variables. We will summarize the data by
counting the number of responses for each combination of a category for variable 1 and a
category for variable 2. The null hypothesis for this test is that the two categorical variables
are independent. Thus, the test is referred to as a test of independence. We will illustrate
this test with the following example.
A beer industry association conducts a survey to determine the preferences of beer drink-
ers for light, regular, and dark beers. A sample of 200 beer drinkers is taken with each person
in the sample asked to indicate a preference for one of the three types of beers: light, regular,
or dark. At the end of the survey questionnaire, the respondent is asked to provide informa-
tion on a variety of demographics including gender: male or female. A research question of
interest to the association is whether preference for the three types of beer is independent of
the gender of the beer drinker. If the two categorical variables, beer preference and gender, are
independent, beer preference does not depend on gender and the preference for light, regular,
and dark beer can be expected to be the same for male and female beer drinkers. However, if
the test conclusion is that the two categorical variables are not independent, we have evidence
that beer preference is associated with or dependent upon the gender of the beer drinker. As a result,
we can expect beer preferences to differ for male and female beer drinkers. In this case, a beer
manufacturer could use this information to customize its promotions and advertising for the
different target markets of male and female beer drinkers.
The hypotheses for this test of independence are as follows:
H0: Beer preference is independent of gender
Ha: Beer preference is not independent of gender
The sample data will be summarized in a two-way table with beer preferences of light,
regular, and dark as one of the variables and gender of male and female as the other variable.
Since an objective of the study is to determine if there is a difference between the beer
preferences for male and female beer drinkers, we consider gender an explanatory variable
and follow the usual practice of making the explanatory variable the column variable in the
data tabulation table. The beer preference is the categorical response variable and is shown
as the row variable. The sample results of the 200 beer drinkers in the study are summarized
in Table 12.6.
The sample data are summarized based on the combination of beer preference and
gender for the individual respondents. For example, 51 individuals in the study were males
who preferred light beer, 56 individuals in the study were males who preferred regular
beer, and so on. Let us now analyze the data in the table and test for independence of beer
preference and gender.
First of all, since we selected a sample of beer drinkers, summarizing the data for each
variable separately will provide some insights into the characteristics of the beer drinker
population. For the categorical variable gender, we see 132 of the 200 in the sample were
male. This gives us the estimate that 132/200 = .66, or 66%, of the beer drinker population
is male. Similarly we estimate that 68/200 = .34, or 34%, of the beer drinker population is
female. Thus male beer drinkers appear to outnumber female beer drinkers approximately
2 to 1. Sample proportions or percentages for the three types of beer are
Prefer Light Beer 90/200 = .450, or 45.0%
Prefer Regular Beer 77/200 = .385, or 38.5%
Prefer Dark Beer 33/200 = .165, or 16.5%
Across all beer drinkers in the sample, light beer is preferred most often and dark beer is
preferred least often.
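
The marginal summaries above can be reproduced with a few lines of code. The following is a
minimal Python sketch, not part of the original text; the variable names are illustrative
assumptions, and the counts are the observed frequencies reported in Table 12.6.

```python
# Observed frequencies from Table 12.6: each row is a beer preference,
# and the two entries in a row are the male and female counts.
observed = {
    "Light": [51, 39],
    "Regular": [56, 21],
    "Dark": [25, 8],
}

n = sum(sum(row) for row in observed.values())                  # sample size
gender_totals = [sum(col) for col in zip(*observed.values())]   # column totals
preference_totals = {pref: sum(row) for pref, row in observed.items()}  # row totals

# Sample proportions for each variable taken separately.
gender_share = [total / n for total in gender_totals]
preference_share = {pref: total / n for pref, total in preference_totals.items()}

print(gender_share)       # proportions for male, female
print(preference_share)   # proportions for light, regular, dark
```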
TABLE 12.6 SAMPLE RESULTS FOR BEER PREFERENCES OF MALE AND FEMALE
BEER DRINKERS (OBSERVED FREQUENCIES)

                           Gender
Beer Preference      Male    Female    Total
Light                  51        39       90
Regular                56        21       77
Dark                   25         8       33
Total                 132        68      200

TABLE 12.7 EXPECTED FREQUENCIES IF BEER PREFERENCE IS INDEPENDENT
OF THE GENDER OF THE BEER DRINKER

                           Gender
Beer Preference      Male    Female    Total
Light               59.40     30.60       90
Regular             50.82     26.18       77
Dark                21.78     11.22       33
Total                 132        68      200

Let us now conduct the chi-square test to determine if beer preference and gender
are independent. The computations and formulas used are the same as those used for
the chi-square test in Section 12.1. Utilizing the observed frequencies in Table 12.6 for
row i and column j, fij, we compute the expected frequencies, eij, under the assumption
that the beer preferences and gender are independent. The computation of the expected
frequencies follows the same logic and formula used in Section 12.1. Thus the expected
frequency for row i and column j is given by
$$e_{ij} = \frac{(\text{Row } i \text{ Total})(\text{Column } j \text{ Total})}{\text{Sample Size}} \tag{12.4}$$
For example, e11 = (90)(132)/200 = 59.40 is the expected frequency for male beer drink-
ers who would prefer light beer if beer preference is independent of gender. Show that
equation (12.4) can be used to find the other expected frequencies shown in Table 12.7.
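
As one way to carry out the check suggested above, the following Python sketch applies
equation (12.4) to every cell of Table 12.6. It is an illustrative sketch only; the function
name expected_frequencies is an assumption introduced here, not something defined in the text.

```python
# Observed frequencies from Table 12.6 (rows: light, regular, dark;
# columns: male, female).
observed = [
    [51, 39],
    [56, 21],
    [25, 8],
]

def expected_frequencies(table):
    """Apply equation (12.4) to every cell of a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # (row i total)(column j total) / sample size
    return [[r * c / n for c in col_totals] for r in row_totals]

expected = expected_frequencies(observed)
for row in expected:
    print([round(e, 2) for e in row])
```

The printed rows can be compared cell by cell with Table 12.7.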
Following the chi-square test procedure discussed in Section 12.1, we use the following
expression to compute the value of the chi-square test statistic.
$$\chi^2 = \sum_i \sum_j \frac{(f_{ij} - e_{ij})^2}{e_{ij}} \tag{12.5}$$

With r rows and c columns in the table, the chi-square distribution will have (r – 1)(c – 1)
degrees of freedom provided the expected frequency is at least 5 for each cell. Thus, in this
application we will use a chi-square distribution with (3 – 1)(2 – 1) = 2 degrees of freedom.
The complete steps to compute the chi-square test statistic are summarized in Table 12.8.
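
Equation (12.5) and the degrees of freedom rule can be sketched in Python as follows. The
function name chi_square_statistic is hypothetical, and the observed counts are those of
Table 12.6. This sketch does not check the requirement that every expected frequency be at
least 5.

```python
def chi_square_statistic(observed):
    """Return the chi-square statistic of equation (12.5) and its degrees of freedom."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, f_ij in enumerate(row):
            e_ij = row_totals[i] * col_totals[j] / n   # equation (12.4)
            statistic += (f_ij - e_ij) ** 2 / e_ij     # one term of equation (12.5)

    degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, degrees_of_freedom

beer = [[51, 39], [56, 21], [25, 8]]   # rows: light, regular, dark; columns: male, female
chi_square, df = chi_square_statistic(beer)
print(round(chi_square, 2), df)
```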
TABLE 12.8 COMPUTATION OF THE CHI-SQUARE TEST STATISTIC FOR THE TEST
OF INDEPENDENCE BETWEEN BEER PREFERENCE AND GENDER

                                                                           Squared Difference
                        Observed     Expected                  Squared        Divided by
Beer                    Frequency    Frequency   Difference    Difference     Expected Frequency
Preference    Gender    fij          eij         (fij − eij)   (fij − eij)²   (fij − eij)²/eij
Light         Male       51          59.40        −8.40         70.56          1.19
Light         Female     39          30.60         8.40         70.56          2.31
Regular       Male       56          50.82         5.18         26.83           .53
Regular       Female     21          26.18        −5.18         26.83          1.02
Dark          Male       25          21.78         3.22         10.37           .48
Dark          Female      8          11.22        −3.22         10.37           .92
Total                   200         200                                        χ² = 6.45

We can use the upper tail area of the chi-square distribution with 2 degrees of freedom
and the p-value approach to determine whether the null hypothesis that beer preference is
independent of gender can be rejected. Using row two of the chi-square distribution table
shown in Table 12.4, we have the following:

Area in Upper Tail      .10      .05      .025     .01      .005
χ² Value (2 df)        4.605    5.991    7.378    9.210    10.597
                                     χ² = 6.45

Thus, the test statistic χ² = 6.45 lies between 5.991 and 7.378, and so the corresponding
upper tail area, or p-value, must be between .05 and .025. With p-value ≤ .05, we reject
H0 and conclude that beer preference is not independent of the gender of the beer drinker.
Stated another way, the study shows that beer preference can be expected to differ for male
and female beer drinkers. Minitab or Excel procedures provided in Appendix F can be used
to show χ² = 6.45 with two degrees of freedom yields a p-value = .0398.
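
For readers without Minitab or Excel at hand, the p-value can also be sketched in Python
using only the standard library. The sketch relies on a standard fact that is not stated in
the text: for a chi-square distribution with exactly 2 degrees of freedom, the upper tail
area at x equals exp(−x/2). The function name is an assumption, and the shortcut does not
apply to other degrees of freedom.

```python
import math

def upper_tail_area_2df(chi_square):
    """Upper tail area of a chi-square distribution with 2 degrees of freedom.

    Uses the closed form exp(-x / 2), which holds only when df = 2.
    """
    return math.exp(-chi_square / 2)

p_value = upper_tail_area_2df(6.45)   # test statistic from Table 12.8
alpha = 0.05                          # level of significance

print(round(p_value, 4))
print("Reject H0" if p_value <= alpha else "Do not reject H0")
```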
Instead of using the p-value, we could use the critical value approach to draw the same
conclusion. With α = .05 and 2 degrees of freedom, the critical value for the chi-square
test statistic is $\chi^2_{.05} = 5.991$. The upper tail rejection region becomes
Reject H0 if χ² ≥ 5.991
With 6.45 ≥ 5.991, we reject H0. Again we see that the p-value approach and the critical
value approach provide the same conclusion.
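
The critical values in row two of Table 12.4 can be reproduced with a short Python sketch.
It relies on a closed form for 2 degrees of freedom that goes beyond the text: setting the
upper tail area exp(−x/2) equal to α and solving for x gives a critical value of −2 ln(α).
The function name is illustrative, and the formula is valid for 2 degrees of freedom only.

```python
import math

def critical_value_2df(alpha):
    """Chi-square value with upper tail area alpha, valid for df = 2 only."""
    return -2 * math.log(alpha)

# Upper tail areas used as column headings in the chi-square table.
for alpha in (0.10, 0.05, 0.025, 0.01, 0.005):
    print(alpha, round(critical_value_2df(alpha), 3))

chi_square = 6.45   # test statistic from Table 12.8
reject = chi_square >= critical_value_2df(0.05)
print("Reject H0" if reject else "Do not reject H0")
```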
While we now have evidence that beer preference and gender are not independent, we
will need to gain additional insight from the data to assess the nature of the association
between these two variables. One way to do this is to compute the probability of the beer
preference responses for males and females separately. These calculations are as follows:

Beer Preference Male Female


Light 51/132 = .3864, or 38.64% 39/68 = .5735, or 57.35%
Regular 56/132 = .4242, or 42.42% 21/68 = .3088, or 30.88%
Dark 25/132 = .1894, or 18.94% 8/68 = .1176, or 11.76%
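
These column-wise calculations can be sketched in Python as follows. The sketch is
illustrative only; the dictionary layout and variable names are assumptions, and the counts
are taken from Table 12.6.

```python
# Observed counts from Table 12.6 as (male, female) pairs.
observed = {"Light": (51, 39), "Regular": (56, 21), "Dark": (25, 8)}

male_total = sum(male for male, _ in observed.values())
female_total = sum(female for _, female in observed.values())

# Share of each gender's respondents choosing each type of beer.
by_gender = {
    preference: (male / male_total, female / female_total)
    for preference, (male, female) in observed.items()
}

for preference, (male_share, female_share) in by_gender.items():
    print(f"{preference:8s} male {male_share:.4f}   female {female_share:.4f}")
```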

The bar chart for male and female beer drinkers of the three kinds of beer is shown in
Figure 12.1.

FIGURE 12.1 BAR CHART COMPARISON OF BEER PREFERENCE BY GENDER
[Bar chart: probability (vertical axis, 0 to 0.7) by beer preference (Light, Regular, Dark), with separate bars for Male and Female]


What observations can you make about the association between beer preference and
gender? For female beer drinkers in the sample, the highest preference is for light beer at
57.35%. For male beer drinkers in the sample, regular beer is most frequently preferred
at 42.42%. While female beer drinkers have a higher preference for light beer than males,
male beer drinkers have a higher preference for both regular beer and dark beer. Data
visualization through bar charts such as the one shown in Figure 12.1 is helpful in gaining
insight as to how two categorical variables are associated.
Before we leave this discussion, we summarize the steps for a test of independence.

CHI-SQUARE TEST FOR INDEPENDENCE OF TWO CATEGORICAL VARIABLES

1. State the null and alternative hypotheses.
H0: The two categorical variables are independent
Ha: The two categorical variables are not independent
2. Select a random sample from the population and collect data for both variables
for every element in the sample. Record the observed frequencies, fij, in
a table with r rows and c columns.
3. Assume the null hypothesis is true and compute the expected frequencies, eij.
4. If the expected frequency, eij, is 5 or more for each cell, compute the test statistic:

$$\chi^2 = \sum_i \sum_j \frac{(f_{ij} - e_{ij})^2}{e_{ij}}$$

5. Rejection rule:
p-value approach: Reject H0 if p-value ≤ α
Critical value approach: Reject H0 if $\chi^2 \geq \chi^2_{\alpha}$
where the chi-square distribution has (r – 1)(c – 1) degrees of freedom and α
is the level of significance for the test.

The expected frequencies must all be 5 or more for the chi-square test to be valid.

This chi-square test is also a one-tailed test with rejection of H0 occurring in the upper
tail of a chi-square distribution with (r – 1)(c – 1) degrees of freedom.

Finally, if the null hypothesis of independence is rejected, summarizing the probabilities
as shown in the above example will help the analyst determine where the association or
dependence exists for the two categorical variables.
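
Steps 3 to 5 of the summary can be collected into one Python sketch that uses the p-value
approach. This is an illustration under stated assumptions, not a procedure given in the
text: the function names are hypothetical, and the upper tail area is computed from standard
closed forms for the chi-square distribution (exp(−x/2) for 2 degrees of freedom, the
complementary error function for 1 degree of freedom, and a recurrence that adds 2 degrees
of freedom at a time) rather than from Minitab or Excel.

```python
import math

def chi_square_upper_tail(x, df):
    """Upper tail area of a chi-square distribution with df degrees of freedom."""
    y = x / 2
    if df % 2 == 0:
        a, tail = 1.0, math.exp(-y)              # exact tail for df = 2
    else:
        a, tail = 0.5, math.erfc(math.sqrt(y))   # exact tail for df = 1
    # Each pass adds the term that raises the degrees of freedom by 2.
    while a < df / 2:
        tail += y ** a * math.exp(-y) / math.gamma(a + 1)
        a += 1
    return tail

def independence_test(observed, alpha=0.05):
    """Return (statistic, df, p_value, reject_h0) for a two-way table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]   # step 3
    if min(min(row) for row in expected) < 5:
        raise ValueError("every expected frequency must be 5 or more")
    statistic = sum(                                                   # step 4
        (f - e) ** 2 / e
        for f_row, e_row in zip(observed, expected)
        for f, e in zip(f_row, e_row)
    )
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    p_value = chi_square_upper_tail(statistic, df)
    return statistic, df, p_value, p_value <= alpha                    # step 5

beer = [[51, 39], [56, 21], [25, 8]]   # observed frequencies from Table 12.6
statistic, df, p_value, reject = independence_test(beer)
print(round(statistic, 2), df, round(p_value, 4), reject)
```

Raising an error when an expected frequency falls below 5 is a design choice of this sketch;
it simply refuses to report a statistic when the validity condition in step 4 is not met.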

Exercises

Methods
9. The following table contains observed frequencies for a sample of 200. Test for independence
of the row and column variables using α = .05.

Column Variable
Row Variable A B C
P 20 44 50
Q 30 26 30


10. The following table contains observed frequencies for a sample of 240. Test for independence
of the row and column variables using α = .05.

Column Variable
Row Variable A B C
P 20 30 20
Q 30 60 25
R 10 15 30

Applications
11. A Bloomberg Businessweek subscriber study asked, “In the past 12 months, when traveling
for business, what type of airline ticket did you purchase most often?” A second question
asked if the type of airline ticket purchased most often was for domestic or international
travel. Sample data obtained are shown in the following table.

Type of Flight
Type of Ticket Domestic International
First class 29 22
Business class 95 121
Economy class 518 135

a. Using a .05 level of significance, is the type of ticket purchased independent of the
type of flight? What is your conclusion?
b. Discuss any dependence that exists between the type of ticket and type of flight.
12. A Deloitte employment survey asked a sample of human resource executives how their
company planned to change its workforce over the next 12 months. A categorical response
variable showed three options: The company plans to hire and add to the number of employees,
the company plans no change in the number of employees, or the company plans
to lay off and reduce the number of employees. Another categorical variable indicated if the
company was private or public. Sample data for 180 companies are summarized as follows.

Company
Employment Plan Private Public
Add Employees 37 32
No Change 19 34
Lay Off Employees 16 42

a. Conduct a test of independence to determine if the employment plan for the next
12 months is independent of the type of company. At a .05 level of significance, what
is your conclusion?
b. Discuss any differences in the employment plans for private and public companies
over the next 12 months.
13. Health insurance benefits vary by the size of the company (Atlanta Business Chronicle,
December 31, 2010). The sample data below show the number of companies providing
health insurance for small, medium, and large companies. For purposes of this study,
small companies are companies that have fewer than 100 employees. Medium-sized companies
have 100 to 999 employees, and large companies have 1000 or more employees.
The questionnaire sent to 225 employees asked whether or not the employee had health
insurance and then asked the employee to indicate the size of the company.

Size of the Company


Health Insurance Small Medium Large
Yes 36 65 88
No 14 10 12

a. Conduct a test of independence to determine whether health insurance coverage is
independent of the size of the company. What is the p-value? Using a .05 level of
significance, what is your conclusion?
b. A newspaper article indicated employees of small companies are more likely to lack
health insurance coverage. Use percentages based on the above data to support this
conclusion.
14. A vehicle quality survey asked new owners a variety of questions about their recently
purchased automobile. One question asked for the owner’s rating of the vehicle using
categorical responses of average, outstanding, and exceptional. Another question asked for
the owner’s education level with the categorical responses some high school, high school
graduate, some college, and college graduate. Assume the sample data below are for 500
owners who had recently purchased an automobile.

Education
Quality Rating Some HS HS Grad Some College College Grad
Average 35 30 20 60
Outstanding 45 45 50 90
Exceptional 20 25 30 50

a. Use a .05 level of significance and a test of independence to determine if a new
owner’s vehicle quality rating is independent of the owner’s education. What is the
p-value and what is your conclusion?
b. Use the overall percentage of average, outstanding, and exceptional ratings to comment
upon how new owners rate the quality of their recently purchased automobiles.
15. The Wall Street Journal Corporate Perceptions Study 2011 surveyed readers and asked
how each rated the quality of management and the reputation of the company for over
250 worldwide corporations. Both the quality of management and the reputation of the
company were rated on an excellent, good, and fair categorical scale. Assume the sample
data for 200 respondents below apply to this study.

Reputation of Company
Quality of Management Excellent Good Fair
Excellent 40 25 5
Good 35 35 10
Fair 25 10 15

a. Use a .05 level of significance and test for independence of the quality of management
and the reputation of the company. What is the p-value and what is your conclusion?
b. If there is a dependence or association between the two ratings, discuss and use prob-
abilities to justify your answer.


16. The race for the 2013 Academy Award for Actress in a Leading Role was extremely tight,
featuring several worthy performances (ABC News online, February 22, 2013). The nom-
inees were Jessica Chastain for Zero Dark Thirty, Jennifer Lawrence for Silver Linings
Playbook, Emmanuelle Riva for Amour, Quvenzhané Wallis for Beasts of the Southern
Wild, and Naomi Watts for The Impossible. In a survey, movie fans who had seen each
of the movies for which these five actresses had been nominated were asked to select the
actress who was most deserving of the 2013 Academy Award for Actress in a Leading
Role. The responses follow.

18–30 31–44 45–58 Over 58


Jessica Chastain 51 50 41 42
Jennifer Lawrence 63 55 37 50
Emmanuelle Riva 15 44 56 74
Quvenzhané Wallis 48 25 22 31
Naomi Watts 36 65 62 33

a. How large was the sample in this survey?


b. Jennifer Lawrence received the 2013 Academy Award for Actress in a Leading Role for
her performance in Silver Linings Playbook. Did the respondents favor Ms. Lawrence?
c. At α = .05, conduct a hypothesis test to determine whether people’s attitude toward
the actress who was most deserving of the 2013 Academy Award for Actress in a
Leading Role is independent of respondent age. What is your conclusion?
17. The National Sleep Foundation used a survey to determine whether hours of sleep per night
are independent of age. A sample of individuals was asked to indicate the number of hours
of sleep per night with categorical options: fewer than 6 hours, 6 to 6.9 hours, 7 to 7.9 hours,
and 8 hours or more. Later in the survey, the individuals were asked to indicate their age
with categorical options: age 39 or younger and age 40 or older. Sample data follow.

Age Group
Hours of Sleep 39 or younger 40 or older
Fewer than 6 38 36
6 to 6.9 60 57
7 to 7.9 77 75
8 or more 65 92

a. Conduct a test of independence to determine whether hours of sleep are independent of
age. Using a .05 level of significance, what is the p-value and what is your conclusion?
b. What is your estimate of the percentages of individuals who sleep fewer than 6 hours,
6 to 6.9 hours, 7 to 7.9 hours, and 8 hours or more per night?
18. On a syndicated television show the two hosts often create the impression that they
strongly disagree about which movies are best. Each movie review is categorized as Pro
(“thumbs up”), Con (“thumbs down”), or Mixed. The results of 160 movie ratings by the
two hosts are shown here.

Host B
Host A Con Mixed Pro
Con 24 8 13
Mixed 8 13 11
Pro 10 9 64

Use a test of independence with a .01 level of significance to analyze the data. What is
your conclusion?


12.3 Goodness of Fit Test


In this section we use a chi-square test to determine whether a population being sampled has
a specific probability distribution. We first consider a population with a historical multino-
mial probability distribution and use a goodness of fit test to determine if new sample data
indicate there has been a change in the population distribution compared to the historical
distribution. We then consider a situation where an assumption is made that a population
has a normal probability distribution. In this case, we use a goodness of fit test to determine
if sample data indicate that the assumption of a normal probability distribution is or is not
appropriate. Both tests are referred to as goodness of fit tests.

Multinomial Probability Distribution


With a multinomial probability distribution, each element of a population is assigned
to one and only one of three or more categories. As an example, consider the market share
study being conducted by Scott Marketing Research. Over the past year, market shares for
a certain product have stabilized at 30% for company A, 50% for company B, and 20% for
company C. Since each customer is classified as buying from one of these companies, we
have a multinomial probability distribution with three possible outcomes. The probability
for each of the three outcomes is as follows.
pA = probability a customer purchases the company A product
pB = probability a customer purchases the company B product
pC = probability a customer purchases the company C product
Using the historical market shares, we have a multinomial probability distribution with
pA = .30, pB = .50, and pC = .20.

The multinomial probability distribution is an extension of the binomial probability
distribution to the case where there are three or more outcomes per trial.

The sum of the probabilities for a multinomial probability distribution equals 1.

Company C plans to introduce a “new and improved” product to replace its current entry
in the market. Company C has retained Scott Marketing Research to determine whether the
new product will alter or change the market shares for the three companies. Specifically, the
Scott Marketing Research study will introduce a sample of customers to the new company
C product and then ask the customers to indicate a preference for the company A product,
the company B product, or the new company C product. Based on the sample data, the fol-
lowing hypothesis test can be used to determine if the new company C product is likely to
change the historical market shares for the three companies.
H0: pA = .30, pB = .50, and pC = .20
Ha: The population proportions are not pA = .30, pB = .50, and pC = .20
The null hypothesis is based on the historical multinomial probability distribution for the
market shares. If sample results lead to the rejection of H0, Scott Marketing Research will
have evidence to conclude that the introduction of the new company C product will change
the market shares.
Let us assume that the market research firm has used a consumer panel of 200 custom-
ers. Each customer was asked to specify a purchase preference among the three alternatives:
company A’s product, company B’s product, and company C’s new product. The 200
responses are summarized here.

                      Observed Frequency
Company A’s Product    Company B’s Product    Company C’s New Product
        48                     98                       54


We now can perform a goodness of fit test that will determine whether the sample
of 200 customer purchase preferences is consistent with the null hypothesis. Like other
chi-square tests, the goodness of fit test is based on a comparison of observed frequencies
with the expected frequencies under the assumption that the null hypothesis is true. Hence,
the next step is to compute expected purchase preferences for the 200 customers under
the assumption that H0: pA = .30, pB = .50, and pC = .20 is true. Doing so provides the
expected frequencies as follows.

                      Expected Frequency
Company A’s Product    Company B’s Product    Company C’s New Product
  200(.30) = 60          200(.50) = 100           200(.20) = 40

Note that the expected frequency for each category is found by multiplying the sample size
of 200 by the hypothesized proportion for the category.
The goodness of fit test now focuses on the differences between the observed fre-
quencies and the expected frequencies. Whether the differences between the observed and
expected frequencies are “large” or “small” is a question answered with the aid of the fol-
lowing chi-square test statistic.

TEST STATISTIC FOR GOODNESS OF FIT

$$\chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i} \tag{12.6}$$

where

fi = observed frequency for category i
ei = expected frequency for category i
k = the number of categories

Note: The test statistic has a chi-square distribution with k − 1 degrees of freedom
provided that the expected frequencies are 5 or more for all categories.
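
Equation (12.6) can be sketched in Python for the Scott Marketing Research data as follows.
This is a minimal illustration, not part of the original text; the dictionary names are
assumptions, the hypothesized proportions are the historical market shares, and the observed
counts are the 200 consumer panel responses.

```python
hypothesized = {"A": 0.30, "B": 0.50, "C": 0.20}   # market shares under H0
observed = {"A": 48, "B": 98, "C": 54}             # consumer panel responses

n = sum(observed.values())
# Expected frequency: sample size times the hypothesized proportion.
expected = {company: n * p for company, p in hypothesized.items()}

# Equation (12.6): squared differences divided by expected frequencies, summed.
chi_square = sum(
    (observed[company] - expected[company]) ** 2 / expected[company]
    for company in observed
)
df = len(observed) - 1   # k - 1 degrees of freedom

print(expected)
print(round(chi_square, 2), df)
```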

Let us continue with the Scott Marketing Research example and use the sample data to
test the hypothesis that the multinomial population has the market share proportions pA = .30,
pB = .50, and pC = .20. We will use an α = .05 level of significance. We proceed by using
the observed and expected frequencies to compute the value of the test statistic. With the
expected frequencies all 5 or more, the computation of the chi-square test statistic is shown
in Table 12.9. Thus, we have χ² = 7.34.

TABLE 12.9 COMPUTATION OF THE CHI-SQUARE TEST STATISTIC FOR THE SCOTT MARKETING
RESEARCH MARKET SHARE STUDY

                                                                           Squared Difference
                            Observed     Expected                 Squared       Divided by
             Hypothesized   Frequency    Frequency   Difference   Difference    Expected Frequency
Category     Proportion     (fi)         (ei)        (fi − ei)    (fi − ei)²    (fi − ei)²/ei
Company A    .30             48           60          −12          144           2.40
Company B    .50             98          100           −2            4           0.04
Company C    .20             54           40           14          196           4.90
Total                       200                                                  χ² = 7.34

The test for goodness of fit is always a one-tailed test with the rejection occurring
in the upper tail of the chi-square distribution.

We will reject the null hypothesis if the differences between the observed and expected
frequencies are large. Thus the test of goodness of fit will always be an upper tail test.
We can use the upper tail area for the test statistic and the p-value approach to determine
whether the null hypothesis can be rejected. With k − 1 = 3 − 1 = 2 degrees of freedom,
row two of the chi-square distribution table in Table 12.4 provides the following:

Area in Upper Tail      .10      .05      .025     .01      .005
χ² Value (2 df)        4.605    5.991    7.378    9.210    10.597
                                     χ² = 7.34

The test statistic χ² = 7.34 is between 5.991 and 7.378. Thus, the corresponding upper
tail area or p-value must be between .05 and .025. With p-value ≤ .05, we reject H0 and
conclude that the introduction of the new product by company C will alter the historical
market shares. Minitab or Excel procedures provided in Appendix F can be used to show
χ² = 7.34 provides a p-value = .0255.
Instead of using the p-value, we could use the critical value approach to draw the same
conclusion. With α = .05 and 2 degrees of freedom, the critical value for the test statistic
is $\chi^2_{.05} = 5.991$. The upper tail rejection rule becomes

Reject H0 if χ² ≥ 5.991

With 7.34 ≥ 5.991, we reject H0. The p-value approach and critical value approach provide
the same hypothesis testing conclusion.
Now that we have concluded the introduction of a new company C product will alter
the market shares for the three companies, we are interested in knowing more about how
the market shares are likely to change. Using the historical market shares and the sample
data, we summarize the data as follows:

Company Historical Market Share (%) Sample Data Market Share (%)
A 30 48/200 = .24, or 24
B 50 98/200 = .49, or 49
C 20 54/200 = .27, or 27

The historical market shares and the sample market shares are compared in the bar chart
shown in Figure 12.2. This data visualization process shows that the new product will
likely increase the market share for company C. Comparisons for the other two companies
indicate that company C’s gain in market share will hurt company A more than company B.
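
The comparison of historical and sample market shares can be sketched in Python as follows.
The sketch is illustrative only; the variable names are assumptions, and the figures are the
historical shares and the consumer panel counts given above.

```python
historical = {"A": 0.30, "B": 0.50, "C": 0.20}   # historical market shares
observed = {"A": 48, "B": 98, "C": 54}           # consumer panel responses

n = sum(observed.values())
sample_share = {company: count / n for company, count in observed.items()}

# Change in share, sample minus historical, as a proportion.
change = {company: sample_share[company] - historical[company] for company in historical}

for company in historical:
    print(
        f"{company}: {historical[company]:.2f} -> {sample_share[company]:.2f} "
        f"({change[company] * 100:+.0f} percentage points)"
    )
```

Comparing the sizes of the changes for companies A and B shows which of the two gives up
more share to company C in the sample.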


FIGURE 12.2 BAR CHART OF MARKET SHARES BY COMPANY BEFORE AND AFTER
THE NEW PRODUCT FOR COMPANY C
[Bar chart: probability (vertical axis, 0 to 0.6) by company (A, B, C), with separate bars for Historical Market Share and After New Product C]

Let us summarize the steps that can be used to conduct a goodness of fit test for a
hypothesized multinomial population distribution.

MULTINOMIAL PROBABILITY DISTRIBUTION GOODNESS OF FIT TEST


1. State the null and alternative hypotheses.
H0: The population follows a multinomial probability distribution with
specified probabilities for each of the k categories
Ha: The population does not follow a multinomial distribution with the
specified probabilities for each of the k categories
2. Select a random sample and record the observed frequencies fi for each
category.
3. Assume the null hypothesis is true and determine the expected frequency ei
in each category by multiplying the category probability by the sample size.
4. If the expected frequency ei is at least 5 for each category, compute the value
of the test statistic.
$$\chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i}$$
5. Rejection rule:
p-value approach: Reject H0 if p-value ≤ α
Critical value approach: Reject H0 if χ² ≥ χ²_α
where α is the level of significance for the test and there are k − 1 degrees
of freedom.
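
The steps in the box above can be sketched in Python. This is a minimal illustration, not a procedure from the text: the function name `multinomial_goodness_of_fit` and its `min_expected` parameter are our own, and the critical value 5.991 is simply read from the chi-square table row for 2 degrees of freedom with α = .05. The example reuses the market share data (observed counts 48, 98, 54 and historical shares .30, .50, .20).

```python
def multinomial_goodness_of_fit(observed, probabilities, min_expected=5):
    """Return (chi_square, degrees_of_freedom, expected) for steps 3 and 4."""
    if len(observed) != len(probabilities):
        raise ValueError("need one hypothesized probability per category")
    n = sum(observed)
    # Step 3: expected frequency = category probability x sample size
    expected = [p * n for p in probabilities]
    # Step 4 precondition: every expected frequency should be at least 5
    if min(expected) < min_expected:
        raise ValueError("an expected frequency is below the rule-of-thumb minimum")
    # Step 4: sum of (f_i - e_i)^2 / e_i over the k categories
    chi_square = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
    return chi_square, len(observed) - 1, expected


# Market share example: companies A, B, and C
chi_square, df, expected = multinomial_goodness_of_fit(
    [48, 98, 54], [0.30, 0.50, 0.20]
)
critical_value = 5.991  # table value for alpha = .05 and 2 degrees of freedom
reject_h0 = chi_square >= critical_value  # step 5, critical value approach
print(round(chi_square, 2), df, reject_h0)
```

Raising an error when an expected frequency falls below five is one possible design choice; combining adjacent categories before calling the function is the usual remedy.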

Normal Probability Distribution


The goodness of fit test for a normal probability distribution is also based on the use of the
chi-square distribution. In particular, observed frequencies for several categories of sample
data are compared to expected frequencies under the assumption that the population has a


normal probability distribution. Because the normal probability distribution is continuous,
we must modify the way the categories are defined and how the expected frequencies are
computed. Let us demonstrate the goodness of fit test for a normal distribution by considering
the job applicant test data for Chemline, Inc., shown in Table 12.10.
Chemline hires approximately 400 new employees annually for its four plants located
throughout the United States. The personnel director asks whether a normal distribution
applies for the population of test scores. If such a distribution can be used, the distribution
would be helpful in evaluating specific test scores; that is, scores in the upper 20%, lower
40%, and so on, could be identified quickly. Hence, we want to test the null hypothesis that
the population of test scores has a normal distribution.

TABLE 12.10 CHEMLINE EMPLOYEE APTITUDE TEST SCORES FOR 50 RANDOMLY CHOSEN JOB APPLICANTS

71 66 61 65 54 93
60 86 70 70 73 73
55 63 56 62 76 54
82 79 76 68 53 58
85 80 56 61 61 64
65 62 90 69 76 79
77 54 64 74 65 65
61 56 63 80 56 71
79 84

Let us first use the data in Table 12.10 to develop estimates of the mean and standard
deviation of the normal distribution that will be considered in the null hypothesis. We use
the sample mean x̄ and the sample standard deviation s as point estimators of the mean and
standard deviation of the normal distribution. The calculations follow.

$$\bar{x} = \frac{\sum x_i}{n} = \frac{3421}{50} = 68.42$$

$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{5310.0369}{49}} = 10.41$$
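
The point estimates above can be reproduced with a short Python sketch. It is illustrative only: the variable names are our own, and it relies on the standard library `statistics` module, whose `stdev` function uses the n − 1 divisor shown in the formula for s.

```python
import statistics

# The 50 Chemline test scores from Table 12.10
scores = [
    71, 66, 61, 65, 54, 93,
    60, 86, 70, 70, 73, 73,
    55, 63, 56, 62, 76, 54,
    82, 79, 76, 68, 53, 58,
    85, 80, 56, 61, 61, 64,
    65, 62, 90, 69, 76, 79,
    77, 54, 64, 74, 65, 65,
    61, 56, 63, 80, 56, 71,
    79, 84,
]

n = len(scores)
sample_mean = statistics.mean(scores)  # sum of the scores divided by n
sample_sd = statistics.stdev(scores)   # sample standard deviation (n - 1 divisor)

print(n, sum(scores), round(sample_mean, 2), round(sample_sd, 2))
```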

Using these values, we state the following hypotheses about the distribution of the job
applicant test scores.
H0: The population of test scores has a normal distribution with mean 68.42
and standard deviation 10.41
Ha: The population of test scores does not have a normal distribution with
mean 68.42 and standard deviation 10.41
The hypothesized normal distribution is shown in Figure 12.3.
FIGURE 12.3 HYPOTHESIZED NORMAL DISTRIBUTION OF TEST SCORES
FOR THE CHEMLINE JOB APPLICANTS

[Normal curve with mean 68.42 and standard deviation 10.41.]

With the continuous normal probability distribution, we must use a different procedure for
defining the categories. We need to define the categories in terms of intervals of test scores.
Recall the rule of thumb for an expected frequency of at least five in each interval or
category. We define the categories of test scores such that the expected frequencies will be
at least five for each category. With a sample size of 50, one way of establishing categories
is to divide the normal probability distribution into 10 equal-probability intervals (see
Figure 12.4). With a sample size of 50, we would expect five outcomes in each interval or
category, and the rule of thumb for expected frequencies would be satisfied.

Margin note: With a continuous probability distribution, establish intervals such that each
interval has an expected frequency of five or more.

FIGURE 12.4 NORMAL DISTRIBUTION FOR THE CHEMLINE EXAMPLE
WITH 10 EQUAL-PROBABILITY INTERVALS

[Normal curve divided at 55.10, 59.68, 63.01, 65.82, 68.42, 71.02, 73.83, 77.16, and 81.74. Note: Each interval has a probability of .10.]
Let us look more closely at the procedure for calculating the category boundaries. When
the normal probability distribution is assumed, the standard normal probability tables can be
used to determine these boundaries. First consider the test score cutting off the lowest 10%
of the test scores. From the table for the standard normal distribution we find that the z value
for this test score is −1.28. Therefore, the test score of x = 68.42 − 1.28(10.41) = 55.10
provides this cutoff value for the lowest 10% of the scores. For the lowest 20%, we find
z = −.84, and thus x = 68.42 − .84(10.41) = 59.68. Working through the normal distribu-
tion in that way provides the following test score values.

Percentage z Test Score


10% −1.28 68.42 − 1.28(10.41) = 55.10
20% −.84 68.42 − .84(10.41) = 59.68
30% −.52 68.42 − .52(10.41) = 63.01
40% −.25 68.42 − .25(10.41) = 65.82
50% .00 68.42 + 0(10.41) = 68.42
60% +.25 68.42 + .25(10.41) = 71.02
70% +.52 68.42 + .52(10.41) = 73.83
80% +.84 68.42 + .84(10.41) = 77.16
90% +1.28 68.42 + 1.28(10.41) = 81.74

These cutoff or interval boundary points are identified on the graph in Figure 12.4.
With the categories or intervals of test scores now defined and with the known expected
frequency of five per category, we can return to the sample data of Table 12.10 and determine
the observed frequencies for the categories. Doing so provides the results in Table 12.11.
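
The boundary and frequency calculations just described can be sketched in Python. This is an illustration under stated assumptions rather than the text's own procedure: the variable names are ours, and the sketch deliberately reuses the rounded z values from the standard normal table so that the boundaries agree with the printed ones.

```python
from bisect import bisect_right

mean, sd = 68.42, 10.41

# Rounded z values for the 10%, 20%, ..., 90% cutoffs, as read from the table
z_values = [-1.28, -0.84, -0.52, -0.25, 0.00, 0.25, 0.52, 0.84, 1.28]
boundaries = [mean + z * sd for z in z_values]

# The 50 Chemline test scores from Table 12.10
scores = [
    71, 66, 61, 65, 54, 93, 60, 86, 70, 70, 73, 73,
    55, 63, 56, 62, 76, 54, 82, 79, 76, 68, 53, 58,
    85, 80, 56, 61, 61, 64, 65, 62, 90, 69, 76, 79,
    77, 54, 64, 74, 65, 65, 61, 56, 63, 80, 56, 71,
    79, 84,
]

# Count how many scores fall in each of the 10 intervals;
# bisect_right returns the index of the interval containing x
observed = [0] * (len(boundaries) + 1)
for x in scores:
    observed[bisect_right(boundaries, x)] += 1

# Equal-probability intervals give the same expected frequency everywhere
expected = [len(scores) / len(observed)] * len(observed)

print([round(b, 2) for b in boundaries])
print(observed)
```

Using unrounded z values from software instead of the table would shift the boundaries slightly, which can move a score that sits right next to a boundary into a neighboring interval, so the rounded values are kept here.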
With the results in Table 12.11, the goodness of fit calculations proceed exactly as
before. Namely, we compare the observed and expected results by computing a χ² value.
The calculations necessary to compute the chi-square test statistic are shown in Table 12.12.
We see that the value of the test statistic is χ² = 7.2.
To determine whether the computed χ² value of 7.2 is large enough to reject H0, we need
to refer to the appropriate chi-square distribution table. Using the rule for computing the
number of degrees of freedom for the goodness of fit test, we have k − p − 1 =
10 − 2 − 1 = 7 degrees of freedom based on k = 10 categories and p = 2 parameters
(mean and standard deviation) estimated from the sample data.

TABLE 12.11 OBSERVED AND EXPECTED FREQUENCIES FOR CHEMLINE JOB
APPLICANT TEST SCORES

                       Observed     Expected
                       Frequency    Frequency
Test Score Interval    (fi)         (ei)
Less than 55.10        5            5
55.10 to 59.68         5            5
59.68 to 63.01         9            5
63.01 to 65.82         6            5
65.82 to 68.42         2            5
68.42 to 71.02         5            5
71.02 to 73.83         2            5
73.83 to 77.16         5            5
77.16 to 81.74         5            5
81.74 and over         6            5
Total                  50           50

TABLE 12.12 COMPUTATION OF THE CHI-SQUARE TEST STATISTIC
FOR THE CHEMLINE JOB APPLICANT EXAMPLE

                                                                             Squared Difference
                     Observed     Expected                    Squared        Divided by
Test Score           Frequency    Frequency    Difference     Difference     Expected Frequency
Interval             (fi)         (ei)         (fi − ei)      (fi − ei)²     (fi − ei)²/ei
Less than 55.10      5            5            0              0              0.0
55.10 to 59.68       5            5            0              0              0.0
59.68 to 63.01       9            5            4              16             3.2
63.01 to 65.82       6            5            1              1              0.2
65.82 to 68.42       2            5            −3             9              1.8
68.42 to 71.02       5            5            0              0              0.0
71.02 to 73.83       2            5            −3             9              1.8
73.83 to 77.16       5            5            0              0              0.0
77.16 to 81.74       5            5            0              0              0.0
81.74 and over       6            5            1              1              0.2
Total                50           50                                         χ² = 7.2
Suppose that we test the null hypothesis that the distribution for the test scores is a
normal distribution with a .10 level of significance. To test this hypothesis, we need to
determine the p-value for the test statistic χ² = 7.2 by finding the area in the upper tail of
a chi-square distribution with 7 degrees of freedom. Using row seven of Table 12.4, we
find that χ² = 7.2 provides an area in the upper tail greater than .10. Thus, we know that
the p-value is greater than .10. Minitab or Excel procedures in Appendix F can be used
to show that χ² = 7.2 provides a p-value of .4084. With p-value > .10, the hypothesis that the
probability distribution for the Chemline job applicant test scores is a normal probability
distribution cannot be rejected. The normal probability distribution may be applied to assist
in the interpretation of test scores. A summary of the goodness of fit test for a normal
probability distribution follows.

Margin note: Estimating the two parameters of the normal distribution will cause a loss of
two degrees of freedom in the χ² test.

NORMAL PROBABILITY DISTRIBUTION GOODNESS OF FIT TEST


1. State the null and alternative hypotheses.
H0: The population has a normal probability distribution
Ha: The population does not have a normal probability distribution
2. Select a random sample and
a. Compute the sample mean and sample standard deviation.
b. Define k intervals of values so that the expected frequency is at least five
for each interval. Using equal probability intervals is a good approach.
c. Record the observed frequency of data values fi in each interval defined.
3. Compute the expected number of occurrences ei for each interval of values
defined in step 2(b). Multiply the sample size by the probability of a normal
random variable being in the interval.
4. Compute the value of the test statistic.
$$\chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i}$$

5. Rejection rule:
p-value approach: Reject H0 if p-value ≤ α
Critical value approach: Reject H0 if χ² ≥ χ²_α
where α is the level of significance. The degrees of freedom = k − p − 1,
where p is the number of parameters of the distribution estimated by the sam-
ple. In step 2a, the sample is used to estimate the mean and standard deviation.
Thus, p = 2 and the degrees of freedom = k − 2 − 1 = k − 3.
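
Steps 4 and 5 for the Chemline example can be sketched in Python using only the standard library. The helper `chi2_upper_tail` is our own name, not something from the text or from Minitab or Excel; it evaluates the chi-square upper tail area with the standard closed-form series for integer degrees of freedom, so treat it as an illustration rather than a replacement for statistical software.

```python
import math


def chi2_upper_tail(x, df):
    """Upper tail area of a chi-square distribution with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over j < df/2 of (x/2)^j / j!
        term = total = 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite correction series
    term, total = 1.0, 0.0
    for j in range((df - 1) // 2):
        if j > 0:
            term *= x / (2 * j + 1)
        total += term
    return (math.erfc(math.sqrt(half))
            + math.sqrt(2 * x / math.pi) * math.exp(-half) * total)


# Observed and expected frequencies from Table 12.11
observed = [5, 5, 9, 6, 2, 5, 2, 5, 5, 6]
expected = [5] * 10

chi_square = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
k = len(observed)   # number of intervals
p = 2               # parameters estimated from the sample (mean and sd)
df = k - p - 1
p_value = chi2_upper_tail(chi_square, df)

alpha = 0.10
reject_h0 = p_value <= alpha
print(round(chi_square, 1), df, round(p_value, 4), reject_h0)
```

With these inputs the sketch gives χ² = 7.2 with 7 degrees of freedom and a p-value of about .41, in line with the value reported above, so H0 is not rejected at the .10 level.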

Exercises

Methods
19. Test the following hypotheses by using the χ² goodness of fit test.

H0: pA = .40, pB = .40, and pC = .20
Ha: The population proportions are not
pA = .40, pB = .40, and pC = .20

A sample of size 200 yielded 60 in category A, 120 in category B, and 20 in category C.
Use α = .01 and test to see whether the proportions are as stated in H0.
a. Use the p-value approach.
b. Repeat the test using the critical value approach.
20. The following data are believed to have come from a normal distribution. Use the
goodness of fit test and α = .05 to test this claim.
17 23 22 24 19 23 18 22 20 13 11 21 18 20 21
21 18 15 24 23 23 43 29 27 26 30 28 33 23 29


Applications
21. During the first 13 weeks of the television season, the Saturday evening 8:00 p.m. to
9:00 p.m. audience proportions were recorded as ABC 29%, CBS 28%, NBC 25%, and
independents 18%. A sample of 300 homes two weeks after a Saturday night schedule
revision yielded the following viewing audience data: ABC 95 homes, CBS 70 homes, NBC
89 homes, and independents 46 homes. Test with α = .05 to determine whether the viewing
audience proportions changed.
22. Mars, Inc. manufactures M&M’s, one of the most popular candy treats in the world. The milk
chocolate candies come in a variety of colors including blue, brown, green, orange, red, and
yellow. The overall proportions for the colors are .24 blue, .13 brown, .20 green, .16 orange,
.13 red, and .14 yellow. In a sampling study, several bags of M&M milk chocolates were
opened and the following color counts were obtained.

Blue Brown Green Orange Red Yellow


105 72 89 84 70 80

Use a .05 level of significance and the sample data to test the hypothesis that the overall
proportions for the colors are as stated above. What is your conclusion?
23. The Wall Street Journal’s Shareholder Scoreboard tracks the performance of 1000 major
U.S. companies. The performance of each company is rated based on the annual total return,
including stock price changes and the reinvestment of dividends. Ratings are assigned by
dividing all 1000 companies into five groups from A (top 20%), B (next 20%), to E (bottom
20%). Shown here are the one-year ratings for a sample of 60 of the largest companies. Do
the largest companies differ in performance from the performance of the 1000 companies in
the Shareholder Scoreboard? Use α = .05.

A B C D E
5 8 15 20 12

24. The National Highway Traffic Safety Administration reported the percentage of traffic
accidents occurring each day of the week. Assume that a sample of 420 accidents provided
the following data.

Sunday Monday Tuesday Wednesday Thursday Friday Saturday


66 50 53 47 55 69 80

a. Conduct a hypothesis test to determine if the proportion of traffic accidents is the same
for each day of the week. What is the p-value? Using a .05 level of significance, what
is your conclusion?
b. Compute the percentage of traffic accidents occurring on each day of the week. What
day has the highest percentage of traffic accidents? Does this seem reasonable? Discuss.
25. Use α = .01 and conduct a goodness of fit test to see whether the following sample appears
to have been selected from a normal probability distribution.
55 86 94 58 55 95 55 52 69 95 90 65 87 50 56
55 57 98 58 79 92 62 59 88 65
After you complete the goodness of fit calculations, construct a histogram of the data. Does
the histogram representation support the conclusion reached with the goodness of fit test?
(Note: x̄ = 71 and s = 17.)


26. The weekly demand for a product is believed to be normally distributed. Use a goodness
of fit test and the following data to test this assumption. Use α = .10. The sample mean is
24.5 and the sample standard deviation is 3.
18 20 22 27 22
25 22 27 25 24
26 23 20 24 26
27 25 19 21 25
26 25 31 29 25
25 28 26 28 24

Summary

In this chapter we have introduced hypothesis tests for the following applications.
1. Testing the equality of population proportions for three or more populations.
2. Testing the independence of two categorical variables.
3. Testing whether a probability distribution for a population follows a specific histori-
cal or theoretical probability distribution.
All tests apply to categorical variables and all tests use a chi-square (χ²) test statistic that
is based on the differences between observed frequencies and expected frequencies. In each
case, expected frequencies are computed under the assumption that the null hypothesis is
true. These chi-square tests are upper tailed tests. Large differences between observed and
expected frequencies provide a large value for the chi-square test statistic and indicate that
the null hypothesis should be rejected.
The test for the equality of population proportions for three or more populations is based
on independent random samples selected from each of the populations. The sample data show
the counts for each of two categorical responses for each population. The null hypothesis is that
the population proportions are equal. Rejection of the null hypothesis supports the conclusion
that the population proportions are not all equal.
The test of independence between two categorical variables uses one sample from a
population with the data showing the counts for each combination of two categorical vari-
ables. The null hypothesis is that the two variables are independent and the test is referred
to as a test of independence. If the null hypothesis is rejected, there is statistical evidence
of an association or dependency between the two variables.
The goodness of fit test is used to test the hypothesis that a population has a specific histori-
cal or theoretical probability distribution. We showed applications for populations with a mul-
tinomial probability distribution and with a normal probability distribution. Since the normal
probability distribution applies to continuous data, intervals of data values were established to
create the categories for the categorical variable required for the goodness of fit test.

Glossary

Goodness of fit test A chi-square test that can be used to test that a population probabil-
ity distribution has a specific historical or theoretical probability distribution. This test
was demonstrated for both a multinomial probability distribution and a normal probability
distribution.
Marascuilo procedure A multiple comparison procedure that can be used to test for a
significant difference between pairs of population proportions. This test can be helpful in
identifying differences between pairs of population proportions whenever the hypothesis
of equal population proportions has been rejected.


Multinomial probability distribution A probability distribution where each outcome
belongs to one of three or more categories. The multinomial probability distribution extends
the binomial probability distribution from two to three or more outcomes per trial.
Test of independence A chi-square test that can be used to test for the independence
between two categorical variables. If the hypothesis of independence is rejected, it can be
concluded that the categorical variables are associated or dependent.

Key Formulas

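Expected Frequencies Under the Assumption H0 Is True

$$e_{ij} = \frac{(\text{Row } i \text{ Total})(\text{Column } j \text{ Total})}{\text{Sample Size}} \tag{12.1}$$

Chi-Square Test Statistic

$$\chi^2 = \sum_{i} \sum_{j} \frac{(f_{ij} - e_{ij})^2}{e_{ij}} \tag{12.2}$$

Critical Values for the Marascuilo Pairwise Comparison Procedure

$$CV_{ij} = \sqrt{\chi^2_{\alpha}} \, \sqrt{\frac{\bar{p}_i(1 - \bar{p}_i)}{n_i} + \frac{\bar{p}_j(1 - \bar{p}_j)}{n_j}} \tag{12.3}$$

Chi-Square Test Statistic for the Goodness of Fit Test

$$\chi^2 = \sum_{i} \frac{(f_i - e_i)^2}{e_i} \tag{12.6}$$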

Supplementary Exercises

27. In a quality control test of parts manufactured at Dabco Corporation, an engineer sampled
parts produced on the first, second, and third shifts. The research study was designed to
determine if the population proportion of good parts was the same for all three shifts.
Sample data follow.

Production Shift
Quality First Second Third
Good 285 368 176
Defective 15 32 24

a. Using a .05 level of significance, conduct a hypothesis test to determine if the popula-
tion proportion of good parts is the same for all three shifts. What is the p-value and
what is your conclusion?
b. If the conclusion is that the population proportions are not all equal, use a multiple
comparison procedure to determine how the shifts differ in terms of quality. What shift
or shifts need to improve the quality of parts produced?
28. Phoenix Marketing International identified Bridgeport, Connecticut; Los Alamos, New Mexico;
Naples, Florida; and Washington, D.C., as the four U.S. cities with the highest percentage


of millionaires. Data consistent with that study show the following number of millionaires
for samples of individuals from each of the four cities.

City
Millionaire Bridgeport Los Alamos Naples Washington DC
Yes 44 35 36 34
No 456 265 364 366

a. What is the estimate of the percentage of millionaires in each of these cities?


b. Using a .05 level of significance, test for the equality of the population proportion of
millionaires for these four cities. What is the p-value and what is your conclusion?
29. The five most popular art museums in the world are Musée du Louvre, the Metropolitan
Museum of Art, British Museum, National Gallery, and Tate Modern (The Art Newspaper,
April 2012). Which of these five museums would visitors most frequently rate as spec-
tacular? Samples of recent visitors of each of these museums were taken, and the results
of these samples follow.

                            Musée du    Metropolitan     British    National    Tate
                            Louvre      Museum of Art    Museum     Gallery     Modern
Rated Spectacular           113         94               96         78          88
Did Not Rate Spectacular    37          46               64         42          22

a. Use the sample data to calculate the point estimate of the population proportion of
visitors who rated each of these museums as spectacular.
b. Conduct a hypothesis test to determine if the population proportion of visitors who
rated the museum as spectacular is equal for these five museums. Using a .05 level of
significance, what is the p-value and what is your conclusion?
30. A Pew Research Center survey asked respondents if they would rather live in a place with
a slower pace of life or a place with a faster pace of life. The survey also asked the respon-
dent’s gender. Consider the following sample data.

Gender
Preferred Pace of Life Male Female
Slower 230 218
No Preference 20 24
Faster 90 48

a. Is the preferred pace of life independent of gender? Using a .05 level of significance,
what is the p-value and what is your conclusion?
b. Discuss any differences between the preferences of men and women.
31. Bara Research Group conducted a survey about church attendance. The survey respondents
were asked about their church attendance and asked to indicate their age. Use the sample
data to determine whether church attendance is independent of age. Using a .05 level of
significance, what is the p-value and what is your conclusion? What conclusion can you
draw about church attendance as individuals grow older?

Age
Church Attendance 20 to 29 30 to 39 40 to 49 50 to 59
Yes 31 63 94 72
No 69 87 106 78


32. An ambulance service responds to emergency calls for two counties in Virginia. One
county is an urban county and the other is a rural county. A sample of 471 ambulance calls
over the past two years showed the county and the day of the week for each emergency
call. Data are as follows.

Day of Week
County Sun Mon Tue Wed Thu Fri Sat
Urban 61 48 50 55 63 73 43
Rural 7 9 16 13 9 14 10

Test for independence of the county and the day of the week. Using a .05 level of signifi-
cance, what is the p-value and what is your conclusion?
33. Based on sales over a six-month period, the five top-selling compact cars are Chevy Cruze,
Ford Focus, Hyundai Elantra, Honda Civic, and Toyota Corolla (Motor Trend, November
2, 2011). Based on total sales, the market shares for these five compact cars were Chevy
Cruze 24%, Ford Focus 21%, Hyundai Elantra 20%, Honda Civic 18%, and Toyota Co-
rolla 17%. A sample of 400 compact car sales in Chicago showed the following number
of vehicles sold.

Chevy Cruze 108


Ford Focus 92
Hyundai Elantra 64
Honda Civic 84
Toyota Corolla 52

Use a goodness of fit test to determine if the sample data indicate that the market shares
for the five compact cars in Chicago are different than the market shares reported by Motor
Trend. Using a .05 level of significance, what is the p-value and what is your conclusion?
What market share differences, if any, exist in Chicago?
34. A random sample of final examination grades for a college course follows.

Grades 55 85 72 99 48 71 88 70 59 98 80 74 93 85 74
82 90 71 83 60 95 77 84 73 63 72 95 79 51 85
76 81 78 65 75 87 86 70 80 64

Use α = .05 and test to determine whether a normal probability distribution should be
rejected as being representative of the population distribution of grades.
35. A salesperson makes four calls per day. A sample of 100 days gives the following fre-
quencies of sales volumes.

Observed Frequency
Number of Sales (days)
0 30
1 32
2 25
3 10
4 3
Total 100


Records show sales are made to 30% of all sales calls. Assuming independent sales calls,
the number of sales per day should follow a binomial probability distribution. The binomial
probability function presented in Chapter 5 is

$$f(x) = \frac{n!}{x!(n - x)!} \, p^x (1 - p)^{n - x}$$

For this exercise, assume that the population has a binomial probability distribution with
n = 4, p = .30, and x = 0, 1, 2, 3, and 4.
a. Compute the expected frequencies for x = 0, 1, 2, 3, and 4 by using the binomial
probability function. Combine categories if necessary to satisfy the requirement that
the expected frequency is five or more for all categories.
b. Use the goodness of fit test to determine whether the assumption of a binomial
probability distribution should be rejected. Use α = .05. Because no parameters of the
binomial probability distribution were estimated from the sample data, the degrees of
freedom are k − 1, where k is the number of categories.

Case Problem A Bipartisan Agenda for Change


In a study conducted by Zogby International for the Democrat and Chronicle, more than
700 New Yorkers were polled to determine whether the New York state government
works. Respondents surveyed were asked questions involving pay cuts for state legislators,
restrictions on lobbyists, term limits for legislators, and whether state citizens should be
able to put matters directly on the state ballot for a vote. The results regarding several pro-
posed reforms had broad support, crossing all demographic and political lines.
Suppose that a follow-up survey of 100 individuals who live in the western region of
New York was conducted. The party affiliation (Democrat, Independent, Republican) of each
individual surveyed was recorded, as well as their responses to the following three questions.

1. Should legislative pay be cut for every day the state budget is late?
Yes ____ No ____
2. Should there be more restrictions on lobbyists?
Yes ____ No ____
3. Should there be term limits requiring that legislators serve a fixed number of years?
Yes ____ No ____

The responses were coded using 1 for a Yes response and 2 for a No response. The complete
data set is available in the file named NYReform.

Managerial Report
1. Use descriptive statistics to summarize the data from this study. What are your pre-
liminary conclusions about the independence of the response (Yes or No) and party
affiliation for each of the three questions in the survey?
2. With regard to question 1, test for the independence of the response (Yes and No)
and party affiliation. Use α = .05.
3. With regard to question 2, test for the independence of the response (Yes and No)
and party affiliation. Use α = .05.
4. With regard to question 3, test for the independence of the response (Yes and No)
and party affiliation. Use α = .05.
5. Does it appear that there is broad support for change across all political lines? Explain.


Appendix 12.1 Chi-Square Tests Using Minitab


Test the Equality of Population Proportions
and Test of Independence
The Minitab procedure is identical for both of these applications. We will describe the
procedure for the following situations.
1. A data set shows the responses for each element in the sample.
2. A tabular summary of the data shows the observed frequencies for the response
categories.
We begin with the automobile loyalty example presented in Section 12.1. Responses
for a sample of 500 automobile owners are contained in the DATAfile AutoLoyalty. Column
C1 shows the population the owner belongs to (Chevrolet Impala, Ford Fusion, or Honda
Accord) and column C2 contains the likely to purchase response (Yes or No). The Minitab
steps to conduct a chi-square test using the data set follow.
Step 1. Select the Stat menu
Step 2. Select Tables
Step 3. Choose Cross Tabulation and Chi-Square
Step 4. When the Cross Tabulation and Chi-Square dialog box appears:
Select Raw data (categorical variables)
Enter C2 in the Rows box
Enter C1 in the Columns box
Under the Display options, select Counts
Select Chi-Square
Step 5. When the Cross Tabulation: Chi-Square dialog box appears:
Select Chi-square test
Click OK
Step 6. Click OK
The output shows both a tabular summary of the data and the chi-square test results.
Margin note: You can use Minitab to perform a test of independence on either (i) a data set
that shows the responses for each element in the sample, or (ii) a tabular summary of the data
that shows the observed frequencies for the response categories. Choose Chi-Square Test for
Association in step 3 and then follow the steps outlined for Test the Equality of Population
Proportions. Use the DATAfile BeerPreference to conduct the test for the example in Section 12.2.

Next let us show how to conduct this test if a tabular summary of the data showing
observed frequencies has already been obtained. We begin with a new Minitab worksheet
and label the columns C2 to C4 with the titles of the three populations: Chevrolet Impala,
Ford Fusion, and Honda Accord. We then enter the labels Yes and No, respectively, in the
first two cells of column C1. Finally we enter the observed frequencies of the Yes and No
responses for each population in its corresponding column. Thus, we enter 69 and 56 in the
first two cells of column C2, enter 120 and 80 in the first two cells of column C3, and enter
123 and 52 in the first two cells of column C4. The Minitab steps for this test are as follows.
Step 1. Select the Stat menu
Step 2. Select Tables
Step 3. Choose Cross Tabulation and Chi-Square
Step 4. When the Cross Tabulation and Chi-Square dialog box appears:
Select Summarized data in a two-way table
Enter C2-C4 in the Columns containing the table box
Enter C1 in the Rows box
Enter Auto Owner Population in the Columns box
Select Chi-Square
Step 5. When the Cross Tabulation: Chi-Square dialog box appears:
Select Chi-square test
Click OK
Step 6. Click OK
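
For readers who want to check the result by hand, the calculation applied to the tabular summary can be sketched in Python. This is not Minitab's implementation; the variable names are our own, the frequencies are the ones entered in columns C2 to C4 above, and α = .05 with the table critical value 5.991 for 2 degrees of freedom is assumed purely for illustration.

```python
# Observed frequencies from the tabular summary
# (columns: Chevrolet Impala, Ford Fusion, Honda Accord)
observed = [
    [69, 120, 123],  # Yes
    [56, 80, 52],    # No
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Equation (12.1): e_ij = (row i total)(column j total) / sample size
expected = [[r * c / n for c in column_totals] for r in row_totals]

# Equation (12.2): sum of (f_ij - e_ij)^2 / e_ij over all cells
chi_square = sum(
    (f - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for f, e in zip(obs_row, exp_row)
)

# Degrees of freedom for a two-way table: (rows - 1)(columns - 1)
df = (len(observed) - 1) * (len(column_totals) - 1)

critical_value = 5.991  # assumed: alpha = .05 with 2 degrees of freedom
reject_h0 = chi_square >= critical_value
print(round(chi_square, 2), df, reject_h0)
```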


Goodness of Fit Test


In order to use Minitab to conduct a goodness of fit test, the user must first obtain a sample
from the population and determine the observed frequency for each of k categories. Under
the assumption that the hypothesized population distribution is true, the user must also
determine the hypothesized or expected proportion for each of the k categories. Using a new
Minitab worksheet, the observed frequencies are entered in column C1 and the corresponding
hypothesized proportions are entered in column C2.
Using the Scott Marketing Research example presented in Section 12.3, the sample
of 200 customer preferences for products A, B, and C provided observed frequencies
of 48, 98, and 54. These frequencies are entered in column C1. Using historical market
share data, the hypothesized proportions, .30, .50, and .20, are entered in column C2.
The Minitab steps for the goodness of fit test for this multinomial probability distribution
follow.
Step 1. Select the Stat menu
Step 2. Select Tables
Step 3. Choose Chi-Square Goodness-of-Fit Test (One Variable)
Step 4. When the Chi-Square Goodness-of-Fit Test dialog box appears:
Select Observed counts
Enter C1 in the Observed counts box
Select Specific proportions
Enter C2 in the Specific proportions box
Click OK
If in any application of the goodness of fit test the null hypothesis is equal proportions
for the k categories, column C2 is not necessary. In this case, the user can select Equal
proportions rather than Specific proportions in step 4.
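
As a cross-check on the Minitab output, the computation can be sketched in a few lines of Python using the Scott Marketing Research figures given above. This is only an illustrative sketch: the function name goodness_of_fit is an assumption and is not part of Minitab, and the closed-form p-value used here holds only when there are k - 1 = 2 degrees of freedom, as in this three-category example. Other values of k would require a general chi-square distribution routine.

```python
import math

def goodness_of_fit(observed, proportions):
    """Return expected frequencies, the chi-square statistic, and degrees of freedom.

    Hypothetical helper for illustration; not a Minitab function.
    """
    n = sum(observed)
    # Expected frequency for each category under the hypothesized proportions
    expected = [n * p for p in proportions]
    # Chi-square statistic: sum of (observed - expected)^2 / expected
    statistic = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
    return expected, statistic, len(observed) - 1

# Observed frequencies and hypothesized proportions from the text
observed = [48, 98, 54]
proportions = [0.30, 0.50, 0.20]

expected, statistic, df = goodness_of_fit(observed, proportions)

# Assumption: the shortcut below is valid only for exactly 2 degrees of freedom,
# where the upper-tail area of the chi-square distribution is exp(-x / 2).
if df != 2:
    raise ValueError("closed-form p-value shown here requires 2 degrees of freedom")
p_value = math.exp(-statistic / 2)

print("expected frequencies:", expected)
print("chi-square statistic:", round(statistic, 2))
print("p-value:", round(p_value, 4))
```

A small p-value from this calculation would lead to rejecting the hypothesized market shares, which is the same decision rule applied to the p-value that Minitab reports.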

Appendix 12.2 Chi-Square Tests Using Excel


The Excel procedures for tests for the equality of population proportions, tests of
independence, and goodness of fit tests are essentially the same, as all make use of the Excel
chi-square function CHISQ.TEST. Regardless of the application, the user must do the
following before creating an Excel worksheet that will perform the test.
1. Select a sample from the population or populations and record the data
2. Summarize the data to show observed frequencies in a tabular format
Excel’s PivotTable can be used to summarize the data in step 2 above. Since this procedure
was previously presented in Appendix 2.2, we shall not describe it in this appendix. Rather,
we will begin the Excel chi-square test procedure with the understanding that the user has
already determined the observed frequencies for the study.
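
For readers working outside Excel, the summarizing step can be sketched in Python. The records below are hypothetical and are not taken from any DATAfile; the function name cross_tabulate and the labels "Model A", "Model B", and "Model C" are likewise assumptions made for illustration. The sketch only shows the idea of turning raw responses into a table of observed frequencies, which is what the PivotTable does.

```python
from collections import Counter

# Hypothetical raw records: (population, response). Not data from the text.
records = [
    ("Model A", "Yes"), ("Model B", "No"), ("Model C", "Yes"),
    ("Model B", "Yes"), ("Model A", "No"), ("Model C", "Yes"),
    ("Model C", "No"), ("Model B", "Yes"),
]

def cross_tabulate(records):
    """Count each (population, response) pair and arrange the counts in a table.

    Rows are response categories and columns are populations.
    """
    counts = Counter(records)
    columns = sorted({population for population, _ in records})
    rows = sorted({response for _, response in records})
    # Counter returns 0 for any pair that never occurs
    table = [[counts[(col, row)] for col in columns] for row in rows]
    return rows, columns, table

rows, columns, table = cross_tabulate(records)
print("columns:", columns)
for label, row in zip(rows, table):
    print(label, row)
```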
Let us demonstrate the Excel chi-square test by considering the automobile loyalty
example presented in Section 12.1. Using the data in the DATAfile AutoLoyalty and the
Excel PivotTable procedure, we obtained the observed frequencies shown in the Excel
worksheet of Figure 12.5. The user must next insert Excel formulas in the worksheet to
compute the expected frequencies. Using equation (12.1), the Excel formulas for expected
frequencies are as shown in the background worksheet of Figure 12.5.
The last step is to insert the Excel function CHISQ.TEST. The format of this function
is as follows:

=CHISQ.TEST(Observed Frequency Cells, Expected Frequency Cells)


FIGURE 12.5 EXCEL WORKSHEET FOR THE AUTOMOBILE LOYALTY STUDY


In Figure 12.5, the observed frequency cells are B7 to D8, written B7:D8 and
the expected frequency cells are B16 to D17, written B16:D17. The function
=CHISQ.TEST(B7:D8,B16:D17) is shown in cell E20 of the background worksheet. This
function does all the chi-square test computations and returns the p-value for the test.
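
To make the worksheet logic concrete, the following Python sketch mirrors the two pieces described above: expected frequencies computed from the row and column totals, and a p-value computed from the observed and expected tables. It is an illustration under stated assumptions, not Excel's implementation. The function names are hypothetical. The counts 120, 80, 123, and 52 appear in this chapter's Minitab appendix; the first-column counts 69 and 56 are not shown in this section and are assumed here to match the observed frequencies of Section 12.1. The p-value shortcut is valid only because a 2-by-3 table has (2 - 1)(3 - 1) = 2 degrees of freedom.

```python
import math

def expected_frequencies(observed):
    """Expected frequency for each cell: (row total)(column total) / sample size."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all cells."""
    return sum(
        (f - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for f, e in zip(obs_row, exp_row)
    )

# Rows: likely to repurchase (yes, no); columns: the three automobile populations.
# Assumption: the first column (69, 56) is taken from Section 12.1, not this section.
observed = [
    [69, 120, 123],
    [56, 80, 52],
]

expected = expected_frequencies(observed)
statistic = chi_square_statistic(observed, expected)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Assumption: exp(-x / 2) is the chi-square upper-tail area only when df == 2.
if df != 2:
    raise ValueError("closed-form p-value shown here requires 2 degrees of freedom")
p_value = math.exp(-statistic / 2)

print("expected frequencies:", expected)
print("chi-square statistic:", round(statistic, 3))
print("p-value:", round(p_value, 4))
```

Keeping the expected-frequency step separate from the test step follows the worksheet layout, where the expected frequencies occupy their own cell range and the test function is given both ranges.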
The test of independence summarizes the observed frequencies in a tabular format
very similar to the one shown in Figure 12.5. The formulas to compute expected
frequencies are also very similar to the formulas shown in the background worksheet. For
the goodness of fit test, the user provides the observed frequencies in a column rather than
a table. The user must also provide the associated expected frequencies in another column.
Lastly, the CHISQ.TEST function is used to obtain the p-value as described above.

The Excel worksheet shown in Figure 12.5 is available in the DATAfile ChiSquare.
