
Stats Lecture. 13. Chi Square Test

The chi-square test can be used to test for independence between two categorical variables. It tests whether the variables are independent in the population by comparing observed and expected frequencies in a contingency table. The document provides steps for performing a chi-square test of independence, including stating hypotheses, calculating expected counts, computing the test statistic, determining degrees of freedom, finding the p-value, and interpreting results. An example contingency table is used to demonstrate applying these steps to determine if opinion about speaking a language differs between countries.

Chi Square Test

Shair Muhammad Hazara


MSPH, MSBE, BSN, Ped. N
Email address: [email protected]
Learning Objectives
By the end of this session, the learners will be able to:
• STATE appropriate hypotheses and COMPUTE the expected
counts and chi-square test statistic for a chi-square test based
on data in a two-way table.
• STATE and CHECK the Random, 10%, and Large Counts
conditions for a chi-square test based on data in a two-way
table.
• CALCULATE the degrees of freedom and P-value for a chi-square
test based on data in a two-way table.
• PERFORM a chi-square test for homogeneity.
• PERFORM a chi-square test for independence.
• CHOOSE the appropriate chi-square test in a given setting.
Chi Square Test
Assumptions for using the chi-square test
• Both variables are categorical.
• All observations are independent.
• Cells in the contingency table are mutually
exclusive.
• Expected value of cells should be 5 or
greater in at least 80% of cells.
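The last assumption can be checked mechanically once the expected counts are known. The sketch below is one possible way to do it; the function name `meets_large_counts_rule` and both example tables are hypothetical and serve only as illustration.

```python
def meets_large_counts_rule(expected_counts, minimum=5.0, required_share=0.80):
    """Return True if at least `required_share` of cells have expected count >= `minimum`."""
    # Flatten the two-way table of expected counts into one list of cells.
    cells = [value for row in expected_counts for value in row]
    large_enough = sum(1 for value in cells if value >= minimum)
    return large_enough / len(cells) >= required_share


# Hypothetical expected counts, for illustration only.
all_large = [[34.5, 34.5], [10.0, 10.0], [5.5, 5.5]]
too_sparse = [[12.0, 3.0], [4.0, 1.0]]

print(meets_large_counts_rule(all_large))   # every cell is at least 5
print(meets_large_counts_rule(too_sparse))  # only 1 of 4 cells is at least 5
```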
Tests for Homogeneity:
Expected Counts and the Chi-Square Test Statistic
Problem: For a class project, Abby and Mia wanted to know if the gender of an
interviewer could affect the responses to a survey question. The subjects in
their experiment were 100 males from their school. Half of the males were
randomly assigned to be asked, “Would you vote for a female president?” by a
female interviewer. The other half of the males were asked the same question
by a male interviewer. The table shows the results.
(a) State the appropriate null
and alternative hypotheses.
(b) Show the calculation for the
expected count in the
Male/Yes cell. Then provide
a complete table of expected
counts.
(c) Calculate the value of the chi-square test statistic.
(a)
H0: There is no difference in the true distributions of response to this question when asked by a male
interviewer and when asked by a female interviewer for subjects like these.
Ha: There is a difference in the true distributions of response to this question when asked by a male
interviewer and when asked by a female interviewer for subjects like these.
(b)
The expected count
for the Male/Yes cell
is (69 · 50)/100 = 34.5.
The rest of the expected
counts are shown in the table.
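The arithmetic for a single cell can be reproduced in a few lines. This is a minimal sketch: the helper name `expected_count` is hypothetical, while the totals (69 "Yes" responses, 50 subjects per interviewer, 100 subjects overall) come from the calculation above.

```python
def expected_count(row_total, column_total, table_total):
    """Expected count for one cell when the null hypothesis is true."""
    return row_total * column_total / table_total


# Male interviewer / "Yes" cell: 69 "Yes" responses, 50 subjects per interviewer.
male_yes = expected_count(row_total=69, column_total=50, table_total=100)
print(male_yes)  # 34.5
```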
(c)
$$\chi^2 = \frac{(30 - 34.5)^2}{34.5} + \frac{(39 - 34.5)^2}{34.5} + \cdots + \frac{(8 - 10)^2}{10} = 4.25$$
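The full sum can be sketched in code. The observed table is not reproduced in this text, so the counts below are an assumption: 30, 39, and 8 appear in the terms shown above, and the remaining three are inferred from the column totals of 50, the "Yes" total of 69, the expected count of 10, and the stated result of 4.25. The order of the two counts in the middle row is a guess and does not change the statistic.

```python
# Rows are response categories; columns are (male interviewer, female interviewer).
# ASSUMPTION: only 30, 39, and 8 are stated in the text; the rest are inferred.
observed = [
    [30, 39],  # "Yes" (first two terms of the sum)
    [8, 3],    # inferred row; which interviewer got 8 is a guess
    [12, 8],   # 8 is the last term of the sum; 12 follows from expected count 10
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
table_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        # Expected count = row total x column total / table total.
        expected = row_totals[i] * column_totals[j] / table_total
        chi_square += (count - expected) ** 2 / expected

print(round(chi_square, 2))  # 4.25
```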
Expected Counts and the Chi-Square Test Statistic
Problem:
Earlier, we calculated χ² = 4.25.

(b) Use Table C to find the P-value. Then use your calculator’s χ²cdf command.

(b)
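df = (3 – 1)(2 – 1) = 2; the critical value of χ² at α = 0.05 is 5.991.
Using Table C:
The P-value is between 0.10 and 0.15.
Using technology:
The P-value is approximately 0.119. Equivalently, the calculated χ² = 4.25 is less than the critical value 5.991, so it does not fall in the critical region and we fail to reject H0.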
(d) Result Interpretation.


Because the P-value of 0.119 > α = 0.05, we fail to reject H0. There is not enough
evidence of a difference in the true distributions of response to this question
when asked by a male interviewer and when asked by a female interviewer for
subjects like these.
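The P-value of 0.119 can be checked without a table or calculator. For exactly 2 degrees of freedom, the area to the right of a value x under the chi-square density curve is exp(−x/2). The sketch below relies on that special case only; the function name is hypothetical.

```python
import math


def chi_square_p_value_df2(statistic):
    """Upper-tail area for a chi-square distribution with exactly 2 degrees of freedom."""
    return math.exp(-statistic / 2)


p_value = chi_square_p_value_df2(4.25)
alpha = 0.05
decision = "reject H0" if p_value < alpha else "fail to reject H0"

print(round(p_value, 3), decision)  # 0.119 fail to reject H0
```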
Relationships Between Two Categorical Variables
1. Are people who are prone to sudden anger more likely to develop heart
disease? An observational study followed a random sample of 8474 people
with normal blood pressure for about four years.
• Each person took the Spielberger Trait Anger Scale test. Researchers also
recorded whether each individual developed coronary heart disease (CHD).

Do these data provide convincing evidence of an association between the variables in the larger population?
Tests for Independence: Stating Hypotheses
H0: There is no association between anger level and heart-disease status
in the population of people with normal blood pressure.

Ha: There is an association between anger level and heart-disease status
in the population of people with normal blood pressure.

An equivalent way to state the hypotheses is

H0: Anger and heart-disease status are independent in the population of
people with normal blood pressure.

Ha: Anger and heart-disease status are not independent in the population
of people with normal blood pressure.
Tests for Independence: Expected Counts

$$\text{expected count} = \frac{\text{row total} \times \text{column total}}{\text{table total}}$$
Putting It All Together:
The Chi-Square Test for Independence
Chi-Square Test for Independence
Suppose the conditions are met. To perform a test of
H0: There is no association between two categorical variables in the
population of interest
compute the chi-square test statistic:
$$\chi^2 = \sum \frac{(\text{Observed count} - \text{Expected count})^2}{\text{Expected count}}$$
where the sum is over all cells (not including totals) in the two-way
table. The P-value is the area to the right of χ² under the chi-square
density curve with degrees of freedom = (number of rows – 1)(number of columns – 1).
df = (r – 1)(c – 1); significance level α = 0.05.
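The procedure in this box can be sketched as a single function that takes any two-way table of observed counts and returns the test statistic and the degrees of freedom. The function name and the 2 × 2 table are hypothetical, chosen so that the result is easy to verify by hand (every expected count is 25).

```python
def chi_square_statistic(observed):
    """Return (chi-square statistic, degrees of freedom) for a two-way table of counts."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    table_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            expected = row_totals[i] * column_totals[j] / table_total
            statistic += (count - expected) ** 2 / expected

    # df = (number of rows - 1)(number of columns - 1)
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df


# Hypothetical counts, for illustration only.
example_table = [[20, 30], [30, 20]]
statistic, df = chi_square_statistic(example_table)
print(statistic, df)  # 4.0 1
```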
Putting It All Together:
The Chi-Square Test for Independence
Problem: The Pew Research Center conducts surveys about a variety of topics in many
different countries. In one survey, it wanted to investigate how residents of different
countries feel about the importance of speaking the national language. Separate random
samples of residents of Australia, the United Kingdom, and the United States were asked
many questions, including the following: “Some people say that the following things are
important for being truly [survey country nationality]. Others say they are not important.
How important do you think it is to be able to speak English?” The two-way table
summarizes the responses to this question. Do these data provide convincing evidence at
the α = 0.05 level that the distributions of opinion about speaking English differ for
residents of Australia, the U.K., and the U.S.?
The Chi-Square Test for Independence
STATE
H0: There is no difference in the true distributions of opinion about speaking English for residents
of Australia, the U.K., and the U.S.
Ha: There is a difference in the true distributions of opinion about speaking English for residents of
Australia, the U.K., and the U.S.
We’ll use α = 0.05.

Test Statistic
$$\chi^2 = \frac{(690 - 741.8)^2}{741.8} + \frac{(1177 - 1083.1)^2}{1083.1} + \cdots = 68.57$$

df = (4 – 1)(3 – 1) = 6
Using Table C: the critical value of χ² for df = 6 at α = 0.05 is 12.592.
Using technology: the calculated χ² = 68.57 is greater than the tabulated χ² = 12.592, so it falls in the critical region and we reject H0.
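The comparison with the critical value can also be expressed as a P-value. When the degrees of freedom are even, the area to the right of x has the closed form exp(−x/2) · Σ (x/2)^k / k!, summed over k = 0, …, df/2 − 1. The sketch below uses that identity in place of a statistics library; the function name is hypothetical.

```python
import math


def chi_square_upper_tail_even_df(statistic, df):
    """Area to the right of `statistic` under a chi-square curve with even `df`."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = statistic / 2
    term = 1.0   # (x/2)^k / k! for k = 0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


alpha = 0.05
p_value = chi_square_upper_tail_even_df(68.57, df=6)
print(p_value < alpha)  # True, so H0 is rejected

# The tabulated critical value 12.592 should leave about 0.05 in the upper tail.
print(round(chi_square_upper_tail_even_df(12.592, df=6), 3))  # 0.05
```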
The Chi-Square Test for Independence

Step No. 05. Result Interpretation


The calculated χ² = 68.57 falls in the rejection region, so we reject H0.
There is convincing evidence of a difference in the true
distributions of opinion about speaking English for residents of Australia, the
U.K., and the U.S.
