12 Chi Square

Chi-square tests are nonparametric tests used to analyze relationships between categorical variables. There are two main types of chi-square tests: tests of goodness of fit and tests of independence. Chi-square tests of goodness of fit compare observed counts to expected counts under a hypothetical distribution. Chi-square tests of independence examine whether two categorical variables are associated or independent of each other. To perform a chi-square test, observed and expected frequencies are calculated and compared using a chi-square distribution. For a goodness-of-fit test the degrees of freedom equal (number of categories - 1); for a test of independence they equal (number of columns - 1) * (number of rows - 1). Larger differences between observed and expected values indicate stronger evidence against the null hypothesis (that the data fit the hypothetical distribution, or that the variables are independent).

Uploaded by

Bagus Mahendra

Chi Square Tests

Chi-Square Test (χ2)

• Nonparametric test for nominal independent variables
– These variables, also called "attribute variables" or
"categorical variables," classify observations into a small
number of categories.
• Examples?
– The dependent variable: the count of
observations in each category of the nominal
variable
• e.g. the number of men versus women; or the number
of accidents in dry weather versus in wet weather
Two uses of χ2
• Chi Square Goodness of Fit: when we have one
independent variable (e.g. weather)
– we compare the numbers of observations in the
categories of this variable (e.g. wet and dry) to what
we would expect if the variable did not make any
difference (e.g. out of 100 accidents 50 in dry weather
and 50 in wet weather)
• Chi Square Test of Independence: when we have
two or more independent variables
– we compare the numbers of observations in each
category of each variable to the numbers we would
expect if the variables were independent of each
other
Chi-Square Goodness of Fit
• The observed counts of numbers of observations in each category are
compared with the expected counts, which are calculated using some
kind of theoretical expectation, such as a 1:1 sex ratio.
• An example:
– An area of shore that has 59% of the area covered in sand, 28% mud and 13%
rocks;
– if seagulls were standing in random places, your null hypothesis would be that
59% of the seagulls were standing on sand, 28% on mud and 13% on rocks.
– the independent variable is type of shore, the dependent variable is the observed
number of seagulls
[Pie chart: composition of the shore. Sand 59%, Mud 28%, Rock 13%]
Tabulating Chi Square Goodness of Fit

                    Sand  Mud  Rocks  Total
Observed seagulls   35    8    57     100
Expected seagulls   59    28   13     100
Calculating the test value
• The test statistic is calculated by taking an observed
number (O), subtracting the expected number (E), then
squaring this difference. The larger the deviation from
the null hypothesis, the larger the difference between
observed and expected.

\chi^2 = \sum \frac{(O - E)^2}{E}

• Each squared difference is divided by the expected
number, and these standardized ratios are summed: the
larger the differences between what you would expect and
what you get, the bigger the number.
Calculate Chi square for the seagull data
(O – E)² divided by E:

Now check your answer on Graphpad:

http://www.graphpad.com

Or on Social Science Statistics:

http://www.socscistatistics.com/tests/Default.aspx
(this will also give you the steps of the calculation)
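The hand calculation for the seagull data can also be cross-checked with a few lines of Python. This is a minimal sketch that uses only the observed and expected counts from the table; the variable and function names are illustrative and not part of the original slides.

```python
# Observed and expected seagull counts from the table (100 birds in total).
observed = {"sand": 35, "mud": 8, "rocks": 57}
expected = {"sand": 59, "mud": 28, "rocks": 13}


def chi_square(obs, exp):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((obs[k] - exp[k]) ** 2 / exp[k] for k in obs)


# Show the contribution of each category, then the total.
for category in observed:
    o, e = observed[category], expected[category]
    print(f"{category:>5}: (O - E)^2 / E = {(o - e) ** 2 / e:.3f}")

chi2 = chi_square(observed, expected)
print(f"chi-square = {chi2:.2f}")
```

The three contributions sum to roughly 172.97, most of it from the rocks category, where far more seagulls were observed than expected.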
Goodness of fit in SPSS
• Create a variable column (Surface)
• Create frequency column
• Type the observed frequencies for each category of the
independent variable
• From the data menu, weight cases by frequency
• Go to Analyse – Nonparametric – One Sample - Chi square
• Select Surface as test field
• In Options, you can set the expected values to be equal
percentages of the categories (33% here) or you can assign
expected values
• Run the analysis
How the test works

1. Identify population distribution & assumptions

a) Two populations, one distribution that matches
expected outcomes and another where distribution
matches observed outcomes.
b) Null hypothesis: the two distributions do not differ
c) Comparison distribution is chi-square
Chi-Square Test for Goodness-of-Fit: characteristics of the comparison distribution
• Degrees of freedom: number of categories – 1
Chi-Square Goodness of Fit:
Test Assumptions

1. Random and independent sampling.

2. Sample size must be sufficiently large (no
more than 20% of cells should have an
expected value of less than 5)
3. Values of the variable are mutually exclusive
and exhaustive. Every subject must fall in
only one category.

Note: If these assumptions are not met, the critical values
in the chi-square table are not necessarily correct.
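The sample size rule in assumption 2 is easy to check mechanically. The sketch below is one possible way to do it; the function name and the default threshold argument are assumptions made for illustration, and the expected counts are those of the seagull example.

```python
def share_of_small_expected(expected_counts, threshold=5):
    """Fraction of cells whose expected count is below the threshold."""
    small = [e for e in expected_counts if e < threshold]
    return len(small) / len(expected_counts)


# Expected seagull counts from the goodness-of-fit example (sand, mud, rocks).
expected = [59, 28, 13]
share = share_of_small_expected(expected)

print(f"{share:.0%} of cells have an expected count below 5")
if share <= 0.20:
    print("Sample size rule satisfied")
else:
    print("Sample size rule violated: chi-square critical values may be off")
```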
Exercise 1: Popularity
• A Psychology course is offered by four different
professors.
• The table shows the number of students enrolled in
the course of each.
• Is one professor more popular than another or are the
different enrollment numbers due to chance?
• Complete the table and run the analysis.
          Prof A  Prof B  Prof C  Prof D
Observed  25      29      22      17
?????
Chi Square Test of Independence
• Two or more nominal variables
• We test the independence of the variables
(whether they affect each other)
Chi-Square Test of Independence
Example
A researcher wants to know if there is a significant difference in the
frequencies with which males come from small, medium, or large
cities as contrasted with females. The two variables are hometown
size (small, medium, or large) and sex (male or female). Another way
of putting our research question is: Is gender independent of size of
hometown?
Contingency table for the data for 30 females and 6 males:
Frequency with which males and females come from small, medium, and large cities

          Small  Medium  Large  Totals
Female    10     14      6      30
Male      4      1       1      6
Totals    14     15      7      36
The formula for chi-square is:

\chi^2 = \sum \frac{(O - E)^2}{E}

Where:
O is the observed frequency, and
E is the expected frequency.

The degrees of freedom for the two-dimensional chi-square statistic are:

df = (Columns - 1) × (Rows - 1)
Computing Expected Frequencies
Frequency with which males and females come from small, medium, and large cities

          Small  Medium  Large  Totals
Female    10     14      6      30
Male      4      1       1      6
Totals    14     15      7      36

Expected Frequency for each Cell:

The cell’s Column Total x the cell’s Row Total / Grand Total

In our example:
Column Totals are 14 (small), 15 (medium), and 7 (large).
Row Totals are 30 (female) and 6 (male).
Grand total is 36.
Computing Expected Frequencies
Frequency with which males and females come from small, medium, and large cities

          Small  Medium  Large  Totals
Female    10     14      6      30
Male      4      1       1      6
Totals    14     15      7      36

The expected frequency:

1. Small female cell: 14 X 30 / 36 = 11.667
2. Medium female cell: 15 X 30 / 36 = 12.500
3. Large female cell: 7 X 30 / 36 = 5.833
4. Small male cell: 14 X 6 / 36 = 2.333
5. Medium male cell: 15 X 6 / 36 = 2.500
6. Large male cell: 7 X 6 / 36 = 1.167
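The six expected frequencies listed above all follow the same rule (column total × row total / grand total), so they can be generated in one pass. The following is a small sketch with illustrative names, using the observed counts from the contingency table.

```python
# Observed counts from the contingency table (rows: sex, columns: hometown size).
observed = {
    "Female": {"Small": 10, "Medium": 14, "Large": 6},
    "Male": {"Small": 4, "Medium": 1, "Large": 1},
}


def expected_frequencies(table):
    """Expected count per cell: column total * row total / grand total."""
    row_totals = {row: sum(cells.values()) for row, cells in table.items()}
    columns = list(next(iter(table.values())))
    col_totals = {col: sum(table[row][col] for row in table) for col in columns}
    grand_total = sum(row_totals.values())
    return {
        row: {col: col_totals[col] * row_totals[row] / grand_total for col in columns}
        for row in table
    }


expected = expected_frequencies(observed)
for row, cells in expected.items():
    for col, value in cells.items():
        print(f"{col} {row.lower()} cell: {value:.3f}")
```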
Observed frequencies, expected frequencies, and (O - E)²/E for males and females from small,
medium, and large cities

          Small                     Medium                    Large                     Totals
          Obs.  Exp.    (O-E)²/E    Obs.  Exp.    (O-E)²/E    Obs.  Exp.   (O-E)²/E
Female    10    11.667  0.238       14    12.500  0.180       6     5.833  0.005        30
Male      4     2.333   1.191       1     2.500   0.900       1     1.167  0.024        6
Totals    14                        15                        7                         36
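To turn the table into a test statistic, the six (O - E)²/E values are summed and compared with a chi-square distribution with (3 - 1) × (2 - 1) = 2 degrees of freedom. The sketch below does this without any statistics library. The p-value line relies on the fact that, for exactly 2 degrees of freedom, the upper-tail probability is exp(-χ²/2); it is not a general formula. Function and variable names are illustrative.

```python
import math

observed = [[10, 14, 6],   # Female: small, medium, large
            [4, 1, 1]]     # Male:   small, medium, large


def chi_square_independence(table):
    """Return (chi-square, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected frequency
            chi2 += (o - e) ** 2 / e
    df = (len(col_totals) - 1) * (len(row_totals) - 1)
    return chi2, df


chi2, df = chi_square_independence(observed)

# Shortcut valid only for df = 2: P(X >= chi2) = exp(-chi2 / 2).
p_value = math.exp(-chi2 / 2)

print(f"chi-square({df}) = {chi2:.3f}, p = {p_value:.3f}")
```

Working from unrounded expected frequencies gives a statistic of about 2.54, which matches the sum of the rounded table entries to two decimals.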
• For practice:
– Check the accuracy of the hand calculations on
Social Science Statistics. Are the two variables
independent of each other?
Fisher’s Exact Test
• The chi square test is not accurate when we have
a small number of observations (expected
frequency of less than 5 in more than 20% of
cells)
• We can substitute Fisher’s exact test in a 2 x 2
design
Exercise 2: Cycling (handlebar.sav)
• Are the Dutch reckless cyclists?
– Keep only one hand on the handlebar
• Variables:
– Nationality (English, Dutch)
– Hands on handlebar (One, Two)
Observed frequencies
             Dutch  English
One Handed   120    17
Two Handed   578    154

In SPSS:
• Weight cases by frequency
• Go to Analyze -> Descriptive Statistics -> Crosstabs
• Choose Chi square from Statistics
• You can also choose Phi and Cramer’s V – an effect size
• From Exact, you can choose Fisher’s exact
• From Cells, choose the data you need (expected, possibly
percentages)
• Run
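For readers without SPSS, the same 2 x 2 analysis can be sketched in plain Python. The block below computes the Pearson chi-square (without continuity correction), phi, and a two-sided Fisher's exact p-value for the handlebar counts. Two points are assumptions rather than anything stated in the slides: the p-value shortcut erfc(sqrt(χ²/2)) holds only for 1 degree of freedom, and the two-sided Fisher p-value is defined here as the total probability of all tables that are no more likely than the observed one, which is one common convention. All names are illustrative.

```python
import math
from fractions import Fraction

# Observed counts from the table above.
# Rows: one-handed, two-handed; columns: Dutch, English.
a, b = 120, 17
c, d = 578, 154
n = a + b + c + d

# Pearson chi-square for a 2 x 2 table (no continuity correction), df = 1.
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# For df = 1 only, the upper-tail probability equals erfc(sqrt(chi2 / 2)).
p_chi2 = math.erfc(math.sqrt(chi2 / 2))

# Phi: effect size for a 2 x 2 table (identical to Cramer's V in this case).
phi = math.sqrt(chi2 / n)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins whose probability does not exceed that of the observed table.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    denom = math.comb(total, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return Fraction(math.comb(row1, x) * math.comb(total - row1, col1 - x), denom)

    lowest = max(0, col1 - (total - row1))
    highest = min(row1, col1)
    p_observed = prob(a)
    probs = (prob(x) for x in range(lowest, highest + 1))
    return float(sum(p for p in probs if p <= p_observed))


p_fisher = fisher_exact_two_sided(a, b, c, d)

print(f"chi-square(1) = {chi2:.2f}, p = {p_chi2:.3f}, phi = {phi:.3f}")
print(f"Fisher's exact p (two-sided) = {p_fisher:.3f}")
```

Exact fractions are used inside the Fisher function so that the comparison with the observed table's probability is not affected by floating-point rounding.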
Other way of running Chi-Square using SPSS:
ResponseIncentive.sav

A researcher is interested in whether people are more
likely to return survey questionnaires if the questionnaire
offers an incentive. He sends out 100 questionnaires:
• 20 promise that the respondent will get the survey
results
• 30 say the respondent will be entered in a prize draw
• 50 have no incentive

We have two categorical variables: Incentive (results vs.
prize draw vs. none) and Response (questionnaire
returned or not).
Steps
• Analyse – Descriptive Statistics – Crosstabs
• Choose rows and columns
• Click display bar charts
• Choose statistics (Chi2 and Cramer’s V for
effect size)
• Choose which cells you want displayed
(observed and expected)
• Run the analysis
SPSS Output
How big is the effect?: Cramer’s V

.27 out of 1 = a medium association between type of
incentive and whether people return a questionnaire.
Can be viewed like a correlation coefficient. The
significance level indicates it is unlikely the observed
pattern of data is due to chance.
A more useful effect size: Odds Ratio
1. Odds that a Q was returned given a promise of results.
Odds(responding to results) = number that responded to results / number that didn’t respond = 9 / 11 = .82

2. Odds that a Q was returned given a promise of prize draw.
Odds(responding to draw) = number that responded to draw / number that didn’t respond = 16 / 14 = 1.14

3. Odds ratio.
Odds ratio = Odds(responding to draw) / Odds(responding to results) = 1.14 / .82 = 1.39

Odds ratios can be calculated for any pairs of categories.
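The odds and the odds ratio can be reproduced directly from the four counts given in the slide (9 and 11 for the results group, 16 and 14 for the prize draw group). The dictionary layout and names below are illustrative choices.

```python
# Counts given in the slide: returned vs. not returned, per incentive.
returned = {"results": 9, "prize draw": 16}
not_returned = {"results": 11, "prize draw": 14}


def odds(group):
    """Odds of returning the questionnaire = returned / not returned."""
    return returned[group] / not_returned[group]


odds_results = odds("results")      # 9 / 11
odds_draw = odds("prize draw")      # 16 / 14
odds_ratio = odds_draw / odds_results

print(f"odds(results) = {odds_results:.2f}")
print(f"odds(draw)    = {odds_draw:.2f}")
print(f"odds ratio    = {odds_ratio:.2f}")
```

Dividing the unrounded odds gives a ratio of about 1.40; the 1.39 on the slide results from dividing the already rounded values 1.14 and .82.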

Writing up the results
• There was a significant association between the
type of incentive and whether people returned
the questionnaire, 2(2 )= 7.61, p = .022. People
offered either type of incentive were more likely
to respond than those not offered any incentive
but, based on the odds ratio, the odds of
returning the questionnaire were 1.39 times
higher if people were promised a prize draw
than if they were promised the results of the
survey.
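The reported figures can be sanity-checked from the statistic alone. The sketch below assumes N = 100 (all questionnaires sent out are classified as returned or not) and a 3 x 2 table (three incentive types by two response outcomes). It uses the fact that for exactly 2 degrees of freedom the chi-square upper-tail probability is exp(-χ²/2), and the usual definition of Cramér's V as sqrt(χ² / (N × (min(rows, columns) - 1))).

```python
import math

chi2 = 7.61          # reported chi-square statistic
n = 100              # questionnaires sent out (assumed to be the sample size)
rows, cols = 3, 2    # incentive types x response outcomes

df = (rows - 1) * (cols - 1)

# For df = 2 only: P(X >= chi2) = exp(-chi2 / 2).
p_value = math.exp(-chi2 / 2)

# Cramer's V = sqrt(chi2 / (N * (min(rows, cols) - 1))).
cramers_v = math.sqrt(chi2 / (n * (min(rows, cols) - 1)))

print(f"df = {df}")
print(f"p = {p_value:.3f}")
print(f"Cramer's V = {cramers_v:.3f}")
```

The p-value agrees with the reported .022. Cramér's V comes out at about 0.276 when computed from the rounded statistic, close to the .27 quoted from the SPSS output.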
Exercise 4
The relationship between drug companies and medical
researchers is under scrutiny because of possible conflict of
interest. The issue that started the controversy was a 1995
case-control study that suggested that the use of calcium-channel
blockers to treat hypertension led to an increased risk
of heart disease. This led to an intense debate. Researchers
writing in the New England Journal of Medicine (“Conflict of
Interest in the Debate over Calcium Channel Antagonists,”
January 8, 1998, p. 101) looked at the 70 research reports that
appeared during 1996–1997, classifying them as favorable,
neutral, or critical toward the drugs. The researchers then
contacted the authors of the reports and questioned them
about financial ties to drug companies. The results are in
ResearchBribes.sav
Homework
• Sonnentag(2012).sav
– Is there an association between the amount of
time pressure at work and whether we can relax
when not working (SwitchOff)?
• Births.sav
– Are births equally distributed over the year or are
more babies born in some months than in others?
Make-up Homework
Does a negative example on TV make us more negative in
our relationships? Eastenders.sav
• Couples watched three types of TV programmes:
– EastEnders (British soap opera with very miserable and
mean people)
– Friends (American sitcom with exaggeratedly nice and
helpful people)
– A neutral nature programme
• After watching each programme, the couples were left
alone for an hour and the number of
sharp/nasty/unfriendly comments they make to each
other was counted.
Choose the appropriate test, run it and report the results.
