
9-Chi Square Test

The document discusses the chi-square test for independence, which is used to explore the relationship between two categorical variables. It explains that the test compares the observed frequencies between categories to expected frequencies if there was no association. A contingency table is used to calculate expected cell counts as the total of a row multiplied by the total of a column over the total sample size. Assumptions of the test include random sampling, independence of observations, and no more than 20% of cells having an expected count less than 5. Examples are provided of research questions and null/alternative hypotheses that can be tested using a chi-square test for independence.

CHI-SQUARE TEST FOR INDEPENDENCE (SRT605/SRT666)


CHI-SQUARE TEST FOR INDEPENDENCE

• To explore the relationship between two categorical variables
• Each of the variables can have two or more categories (e.g. gender: M/F; smoker: Y/N)
• The test compares the observed frequencies/proportions of cases that occur in each category with the values that would be expected if there were no association between the two variables being measured
                    Smoker
                    Yes      No       TOTAL
Gender    Male
          Female
TOTAL

Based on a crosstabulation/contingency table


Examples of RQ:
• Is there an association between gender and smoking behaviour?
• Are males more likely to be smokers than females?
• Is the proportion of males that smoke the same as the proportion of females?
Each row-by-column combination in the table (e.g. Female and Smoker = Yes) is a cell.

Assumptions:
1. Random sampling
2. Independence of observations
3. No cell should have an expected count of < 5, OR no more than 20% of cells should have an expected count of < 5
HOW TO CALCULATE EXPECTED COUNT?

                             Group A
                             Yes        No         Total
Group B   Yes   Observed     a          b          a+b
                Expected     (a+c) x (a+b) / (a+b+c+d)
          No    Observed     c          d          c+d
                Expected
Total                        a+c        b+d        a+b+c+d

Expected count = (Total of row x Total of column) / Total = (TR x TC) / T

% cells with EC < 5 = (No. of cells with EC < 5 / Total no. of cells) x 100

Example:

Expected count = (Total of row x Total of column) / Total
               = (22 x 31) / 69
               = 9.9
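The row-total times column-total rule above can be sketched in a few lines of Python. This is only an illustration: the function name expected_counts and the 2 x 2 counts are hypothetical, and only the single-cell check (22 x 31 / 69) comes from the slide.

```python
def expected_counts(observed):
    """Expected count for every cell: (row total x column total) / grand total.

    `observed` is a list of rows, e.g. [[a, b], [c, d]].
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[tr * tc / grand_total for tc in col_totals] for tr in row_totals]


# Single-cell check from the slide: row total 22, column total 31, n = 69.
single_cell = 22 * 31 / 69
print(round(single_cell, 1))

# Hypothetical 2 x 2 observed counts (a, b / c, d), for illustration only.
observed = [[10, 20], [30, 40]]
expected = expected_counts(observed)
print(expected)
```

The same function works for tables of any size, since it only relies on the row and column totals.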
• In this 2 X 3 table, there is no cell with EC < 5
• Assumption met for Chi-square test for independence
• If you have a 2 X 2 table with any cell that has an EC < 10, use Fisher's Exact test
HOW TO CALCULATE % OF EC < 5?

                                 Group A
                                 Yes        No         Total
Group B   High      Observed     25         16         41
                    Expected     18.48      22.52
          Average   Observed     4          20         24
                    Expected     10.82      13.18
          Low       Observed     3          3          6
                    Expected     2.70       3.30
Total                            32         39         71

Percentage of cells with EC less than 5 = (No. of cells with EC < 5 / Total no. of cells) x 100
                                        = (2 / 6) x 100
                                        = 33.33%
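As a rough check of the calculation above, the following Python sketch recomputes the expected counts for the 3 x 2 table and the percentage of cells with EC < 5. The observed counts are taken from the table; the function name percent_cells_below is a hypothetical helper, not part of any package.

```python
def percent_cells_below(observed, threshold=5.0):
    """Percentage of cells whose expected count is below `threshold`."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count for each cell = row total x column total / grand total
    expected = [tr * tc / n for tr in row_totals for tc in col_totals]
    low = sum(1 for e in expected if e < threshold)
    return 100.0 * low / len(expected)


# Observed counts from the table: rows High / Average / Low, columns Yes / No.
observed = [[25, 16], [4, 20], [3, 3]]
pct = percent_cells_below(observed)
print(f"{pct:.2f}% of cells have EC < 5")
```

Since 33.33% is more than 20%, this table would not meet the expected-count assumption.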
RQ: IS THERE AN ASSOCIATION BETWEEN RISK FACTORS AND GOAL ACHIEVEMENT?

Step 1: Generate H0 and HA (2-tailed)
• H0: There is no association between risk factors and goal achievement (Observed = Expected)
• HA: There is an association between risk factors and goal achievement (Observed ≠ Expected)

Step 2: Set the significance level (α)
• α = 0.05
Step 3: Check the assumptions
• Dataset: https://goo.gl/ng8Nc7
• Assumption 1: Random sampling
• Assumption 2: Independence of observations
• Assumption 3: No cell should have an EC < 5, OR no more than 20% of cells should have an EC < 5
• Analyze → Descriptive Statistics → Crosstabs
• In the Crosstabs dialogue box, move risk factor into Row(s) and achieve_goal.3 into Column(s)
• Click Statistics and select Chi-square → Continue
• Click Cells → tick Observed and Expected in the Counts box → Continue → OK
Step 4: Determine the test statistic and p-value
• Refer to the Pearson's Chi-square Asymp. Sig. value
• χ² (2, n = 69) = 11.48
• The p-value = 0.003
Step 5: Statistical decision
• Reject H0 as the p-value is < 0.05
Step 6: Conclusion
A Chi-Square Test for Independence indicated a significant association between risk factors and goal achievement, χ² (2, n = 69) = 11.48, p = .003
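For readers who want to see what sits behind the output in Steps 4 to 6, here is a minimal Python sketch. It assumes the usual Pearson statistic (sum over cells of (Observed - Expected)² / Expected) with df = (rows - 1) x (columns - 1), neither of which is written out on the slides. The function names and the 3 x 2 counts are hypothetical; only the reported values 11.48, df = 2 and α = 0.05 come from the example above.

```python
import math


def chi_square_statistic(observed):
    """Pearson chi-square statistic and degrees of freedom for a table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count
            stat += (o - e) ** 2 / e
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


def chi_square_p_value(stat, df):
    """Upper-tail probability of a chi-square distribution (integer df >= 1)."""
    y = stat / 2.0
    if df % 2 == 0:
        # Even df: closed form exp(-y) * sum of y^k / k!
        return math.exp(-y) * sum(y ** k / math.factorial(k) for k in range(df // 2))
    # Odd df: start from erfc(sqrt(y)) (the df = 1 case) and add correction terms
    tail = sum(y ** (k - 0.5) / math.gamma(k + 0.5) for k in range(1, df // 2 + 1))
    return math.erfc(math.sqrt(y)) + math.exp(-y) * tail


# Hypothetical 3 x 2 observed counts, for illustration only.
stat, df = chi_square_statistic([[20, 10], [15, 15], [10, 20]])
p_hypothetical = chi_square_p_value(stat, df)

# p-value for the statistic reported in Step 4: chi-square = 11.48, df = 2.
p_reported = chi_square_p_value(11.48, 2)
alpha = 0.05
print(round(p_reported, 3), "reject H0" if p_reported < alpha else "fail to reject H0")
```

The pure-Python tail probability is used here only to keep the sketch self-contained; a statistics package would normally supply it.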
RQ: IS THERE AN ASSOCIATION BETWEEN GENDER
AND SMOKING BEHAVIOUR?
• For a 2 X 2 table, refer to the Continuity Correction Asymp. Sig. value
• This is the Yates' Correction for Continuity, which compensates for the overestimate of the chi-square value when used with a 2 X 2 table
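The effect of the continuity correction can be illustrated with a short Python sketch. It assumes the common form of Yates' correction, in which 0.5 is subtracted from each |Observed - Expected| before squaring (never going below zero). The function name chi_square_2x2 and the counts are hypothetical and are not the gender and smoking data.

```python
def chi_square_2x2(observed, yates=True):
    """Chi-square statistic for a 2 x 2 table, optionally with Yates' correction."""
    (a, b), (c, d) = observed
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count
            diff = abs(o - e)
            if yates:
                # Assumed form of the correction: shrink |O - E| by 0.5, floor at 0
                diff = max(diff - 0.5, 0.0)
            stat += diff ** 2 / e
    return stat


# Hypothetical 2 x 2 counts, for illustration only.
table = [[10, 20], [30, 40]]
uncorrected = chi_square_2x2(table, yates=False)
corrected = chi_square_2x2(table, yates=True)
print(round(uncorrected, 3), round(corrected, 3))
```

The corrected value is always less than or equal to the uncorrected one, which is how the correction offsets the overestimate.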
• The p-value = 0.562, i.e. > 0.05
• Fail to reject H0
• A Chi-Square Test for Independence (with Yates' Continuity Correction) indicated no significant association between gender and smoking status, χ² (1, n = 436) = .34, p = .56
FISHER'S EXACT TEST

• Fisher's exact test is automatically generated and provided as part of the output from chi-square for a 2 X 2 table
• Use when the chi-square EC assumption is not met
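To give an idea of what Fisher's exact test computes, here is a minimal Python sketch for a 2 x 2 table. It assumes one common two-sided convention: with the row and column totals held fixed, add up the probabilities of all tables that are no more likely than the observed one. The function name and the counts are hypothetical, and a statistics package may use a slightly different two-sided definition.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)   # smallest possible top-left count
    highest = min(row1, col1)      # largest possible top-left count
    # Sum the probabilities of tables at least as extreme as the observed one
    return sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= p_observed * (1 + 1e-9)
    )


# Hypothetical small-sample 2 x 2 counts, for illustration only.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))
```

Because the p-value is built from exact probabilities rather than the chi-square approximation, it does not depend on the expected-count assumption.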