Lecture LAB5 Chi Square
Be able to correctly use and interpret the Pearson Chi-Square test, a test for
independence between two qualitative variables.
Pearson Chi-Square test, a test for independence
Is used to examine the relationship between two qualitative variables.
The null hypothesis: There is no association (relationship) between the two
variables
The alternative hypothesis: The two variables are associated
Scenarios where you would use the Chi-Square test:
• Does smoking status at baseline depend on gender?
• Is there a relationship between coffee consumption and age group?
Assumptions
Random Sample: The sample should be randomly selected from the population
Independence: Observations must be independent from each other (Not matched pairs,
e.g. Matched Case-control study)
Adequate sample size:
• If the expected frequencies are too small then FISHER’S EXACT TEST should be used instead of the Pearson Chi-Square test
2 Qualitative (Nominal) Variables
Ho : There is no association between the 2 variables (2 vars. are indep.)
Ha : There is an association between the 2 variables (2 vars. not indep.)
• All assumptions are met: Pearson Chi-Square
• The assumption of independence is not met (paired observations): McNemar Test
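The decision rules above can be sketched as a small Python function. This is only an illustration: the function name `choose_test` and the cut-off `min_expected=5` are assumptions (the lecture says only that the expected frequencies are "too small"; 5 is a commonly quoted rule of thumb), and the expected counts passed in are the ones from Example 1.

```python
def choose_test(paired, expected_counts, min_expected=5):
    """Suggest a test for two qualitative variables.

    paired          -- True if the observations are matched pairs
    expected_counts -- table (list of rows) of expected frequencies
    min_expected    -- ASSUMED cut-off for "too small"; not stated in the lecture
    """
    if paired:
        # Independence assumption is violated (paired observations)
        return "McNemar test"
    smallest = min(min(row) for row in expected_counts)
    if smallest < min_expected:
        # Expected frequencies too small for the chi-square approximation
        return "Fisher's exact test"
    return "Pearson chi-square test"


# Expected counts from Example 1 (smoking status by gender)
print(choose_test(paired=False, expected_counts=[[12.9, 20.1], [146.1, 226.9]]))
```

For the Example 1 table every expected count is well above the assumed cut-off, so the sketch suggests the Pearson Chi-Square test.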
Evidence or Proof
P-value (Sig.)
• If p-value > 0.05, we fail to reject the null hypothesis
Example 1: Does smoking status at baseline depend on gender?
Observed and Expected Values
                                   Gender
Smoking Status               Male      Female     Total
Smoker        Observed       16        17         33
              Expected       12.9      20.1
Non-smoker    Observed       143       230        373
              Expected       146.1     226.9
Total                        159       247        406
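The expected values in the table come from the usual rule for independence: expected count = (row total × column total) / grand total. A minimal Python sketch of that arithmetic is shown here; the function name `expected_counts` is just an illustrative choice.

```python
# Observed counts from Example 1
# rows: smoker, non-smoker; columns: male, female
observed = [[16, 17],
            [143, 230]]


def expected_counts(table):
    """Expected count per cell under independence:
    row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


expected = expected_counts(observed)
for row in expected:
    print([round(e, 1) for e in row])
```

Rounded to one decimal, the results should match the Expected rows of the table (12.9, 20.1, 146.1, 226.9).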
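Test Statistic
$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$
$$\chi^2 = \frac{(16 - 12.9)^2}{12.9} + \frac{(17 - 20.1)^2}{20.1} + \frac{(143 - 146.1)^2}{146.1} + \frac{(230 - 226.9)^2}{226.9} = 1.31$$
(The value 1.31 is obtained with the unrounded expected counts.)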
Degrees of freedom: (rows - 1)(columns - 1) = 1
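Chi-square test statistic = 1.31
Critical value from table (α = 0.05, df = 1) = 3.841
Decision: 1.31 < 3.841, so we fail to reject (FTR) H0
[Figure: chi-square distribution with the critical value 3.84 marked and the fail-to-reject (FTR) region indicated]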
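The hand calculation above can be checked with a short Python sketch that uses only the standard library. The function names are illustrative. The p-value formula `erfc(sqrt(x / 2))` is the upper-tail probability of a chi-square variable with 1 degree of freedom, so this helper applies to 2 × 2 tables only; the critical value 3.841 is the one quoted in the lecture.

```python
import math

# Observed counts from Example 1
# rows: smoker, non-smoker; columns: male, female
observed = [[16, 17],
            [143, 230]]


def chi_square_statistic(table):
    """Pearson chi-square: sum over cells of (O - E)^2 / E."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count
            stat += (o - e) ** 2 / e
    return stat


def p_value_df1(stat):
    """Upper-tail probability for chi-square with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


stat = chi_square_statistic(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)  # (rows - 1)(columns - 1)
p_value = p_value_df1(stat)
critical_value = 3.841  # alpha = 0.05, df = 1 (from the lecture)
decision = "reject H0" if stat > critical_value else "fail to reject H0"
print(round(stat, 2), df, round(p_value, 3), decision)
```

The statistic and p-value should agree with the 1.31 computed by hand and the 0.252 reported in the SPSS summary table for this example.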
[Screenshots: SPSS Variable View and Data View]
Using Weight cases option
Analyze → Descriptive Statistics → Crosstabs option → Select smoking as the row variable → Select gender as the column variable → Click Cells and select column percentages → Click continue
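For readers who want to see what weighting cases accomplishes, here is a rough Python analogue. It assumes the data were entered as one row per cell of the table with a count column (the original screenshots are not available, so this layout is an assumption), and it reproduces the cross tabulation with column percentages using the Example 1 counts.

```python
from collections import Counter

# ASSUMED layout: one row per cell with a frequency count
aggregated = [
    ("Smoker", "Male", 16),
    ("Smoker", "Female", 17),
    ("Non-smoker", "Male", 143),
    ("Non-smoker", "Female", 230),
]

# Weighting by the count: each row contributes `count` cases to its cell
crosstab = Counter()
for smoking, gender, count in aggregated:
    crosstab[(smoking, gender)] += count

# Column totals (number of cases per gender)
column_totals = Counter()
for (smoking, gender), count in crosstab.items():
    column_totals[gender] += count

# Column percentages: smoking status within each gender
column_percent = {
    cell: round(100 * count / column_totals[cell[1]], 1)
    for cell, count in crosstab.items()
}

for cell in crosstab:
    print(cell, crosstab[cell], column_percent[cell])
```

The percentages are computed within each gender, which is what the column percentages option in Crosstabs reports.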
SPSS output
                                              Gender
Characteristic                          Female        Male          P-value
                                        n=247         n=159
Smoking Status at Baseline [n (%)]
  Smoker                                17 (6.9)      16 (10.1)     0.252*
  Non-smoker                            230 (93.1)    143 (89.9)
*Pearson Chi-Square
Conclusion: at the 0.05 level of significance, there is no significant association between smoking status at baseline and gender.
                                     BMI Groups
Characteristic        Underweight   Normal        Overweight    Obese        P-value
                      n=71          n=2152        n=1866        n=601
CHD Status [n (%)]
  CHD (No)            63 (88.7)     1622 (75.4)   1180 (63.2)   353 (58.7)   <0.01*
  CHD (Yes)           8 (11.3)      530 (24.6)    686 (36.8)    248 (41.3)
*Pearson Chi-Square
Decision: reject H0
Conclusion: at the 0.05 level of significance, there is an association between BMI groups and CHD status.
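The same calculation extends to the 2 × 4 BMI table, where the degrees of freedom are (2 - 1)(4 - 1) = 3. The sketch below is an illustration using only the Python standard library; the helper `p_value_df3` relies on the closed-form upper-tail probability of a chi-square variable with 3 degrees of freedom and does not apply to other table sizes.

```python
import math

# Observed counts from the BMI table
# rows: CHD (No), CHD (Yes); columns: underweight, normal, overweight, obese
observed = [[63, 1622, 1180, 353],
            [8, 530, 686, 248]]


def chi_square_statistic(table):
    """Pearson chi-square: sum over cells of (O - E)^2 / E."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count
            stat += (o - e) ** 2 / e
    return stat


def p_value_df3(stat):
    """Upper-tail probability for chi-square with 3 degrees of freedom."""
    half = stat / 2
    return math.erfc(math.sqrt(half)) + math.sqrt(2 * stat / math.pi) * math.exp(-half)


stat = chi_square_statistic(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)  # (rows - 1)(columns - 1)
p_value = p_value_df3(stat)
# Lecture rule: if p-value > 0.05, fail to reject H0
decision = "fail to reject H0" if p_value > 0.05 else "reject H0"
print(df, p_value < 0.01, decision)
```

The p-value should fall below 0.01, in line with the summary table and the decision to reject H0.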
• Random Sample: we will assume that the sample is randomly selected from the population
• Independence: Observations are independent from each other (not matched pairs)
• Sample Size:
Analyze → Descriptive Statistics → Crosstabs option → Select marital status at baseline as the row variable → Select marital status as the column variable → Click Cells and select column and row percentages → Click continue
Click statistics option and select Chi-square → Click Exact option and select Monte Carlo with 99% CI → Click continue then OK
Please note that this option is not available for Mac users.
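For readers without access to the Exact option, the idea behind a Monte Carlo p-value with a 99% confidence interval can be illustrated in Python. This is a permutation-based sketch, not the procedure SPSS itself runs: it reshuffles one variable so that both sets of margins stay fixed, recomputes the chi-square statistic each time, and reports the share of shuffles at least as extreme as the observed table. Because the marital status data are not reproduced in these notes, the sketch reuses the Example 1 counts; the seed, the number of samples (2000) and the normal-approximation interval are all assumptions.

```python
import random


def crosstab(rows, cols, n_rows, n_cols):
    """Count cases in each (row category, column category) cell."""
    table = [[0] * n_cols for _ in range(n_rows)]
    for r, c in zip(rows, cols):
        table[r][c] += 1
    return table


def chi_square_statistic(table):
    """Pearson chi-square: sum over cells of (O - E)^2 / E."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            stat += (o - e) ** 2 / e
    return stat


# Case-level data rebuilt from Example 1
# rows: 0 = smoker, 1 = non-smoker; cols: 0 = male, 1 = female
rows = [0] * 33 + [1] * 373
cols = [0] * 16 + [1] * 17 + [0] * 143 + [1] * 230

observed_stat = chi_square_statistic(crosstab(rows, cols, 2, 2))

rng = random.Random(2024)  # fixed seed (assumption) so the sketch is repeatable
n_samples = 2000           # ASSUMED number of Monte Carlo samples
shuffled = cols[:]
hits = 0
for _ in range(n_samples):
    rng.shuffle(shuffled)  # breaks any association, keeps both margins fixed
    if chi_square_statistic(crosstab(rows, shuffled, 2, 2)) >= observed_stat - 1e-9:
        hits += 1

p_hat = hits / n_samples
# 99% normal-approximation confidence interval for the estimated p-value
half_width = 2.576 * (p_hat * (1 - p_hat) / n_samples) ** 0.5
print(p_hat, (p_hat - half_width, p_hat + half_width))
```

The estimate will vary slightly with the seed and the number of samples, which is why a confidence interval is reported alongside it.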
SPSS output
Summary table
Marital status
Single Married Divorced Widowed p-value*
N % N % N % N %
Assumptions
SPSS Output:
Cross tabulation
Chi-square tests
Summary Table
Decision/interpretation/conclusion
Thank you