Chi Square
Prepared by:
SARAH JOY V. TADEJA
MAED-Math
Objectives
01 define Chi-Square;
02 compare the two kinds of Chi-Square;
03 solve problems using the Chi-Square test.
01
What is Chi-Square?
Chi-Square
- It is a statistical tool used to measure how a
model compares to actual observed data.
- The data used in calculating a chi-square
statistic must be random, raw, and mutually
exclusive, drawn from independent variables
and from a large enough sample.
Chi-Square
- It is often used to test hypotheses. It
compares the size of discrepancies
between the expected results and the
actual results, given the size of the
sample and the number of variables in
the relationship.
02
What are the
two kinds of
Chi-Square?
Chi-Square Goodness of
Fit
Chi-Square Test for
Independence
What is Chi-Square Goodness of
Fit?
It provides a way to test how well a
sample of data matches the
(known or assumed)
characteristics of the larger
population that the sample is
intended to represent.
What is Chi-Square Goodness of
Fit?
When goodness of fit is high, the values
expected based on the model are close to the
observed values.
When goodness of fit is low, the values
expected based on the model are far from
the observed values.
Example Problem
You’re hired by a dog food company to help
them test three new dog food flavors.
You recruit a random sample of 75 dogs and
offer each dog a choice between the three
flavors by placing bowls in front of them. You
expect that the flavors will be equally popular
among the dogs, with about 25 dogs choosing
each flavor.
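As a short worked step for this example, the expected count per flavor and the degrees of freedom follow from the sample size alone. The symbols n (sample size), k (number of flavors), and E (expected count) are introduced here only for illustration:

\[ E = \frac{n}{k} = \frac{75}{3} = 25, \qquad df = k - 1 = 3 - 1 = 2 \]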
What is Chi-Square Test for
Independence?
It is used to determine whether
two or more categorical or
nominal variables are likely to be
related or not.
Example Problem
If you were studying whether the preference for
a dog food flavor is independent of the dog's
breed size (e.g., small, medium, large), you
would set up a contingency table with rows
representing breed sizes and columns
representing dog food flavors. The chi-square
test of independence would assess whether
there is a significant association between the
two categorical variables.
03
How to solve
problems using
Chi-Square?
Tash, the manager of a car dealership, did not want to
stock cars that were bought less frequently because of
their unpopular color. The five colors that he ordered
were red, yellow, green, blue and white. According to
Tash, the expected frequencies or number of
customers choosing each color should follow the
percentages of last year. He felt 20% would choose
yellow, 30% red, 10% green, 10% blue and 30%
white. He now took a random sample of 150
customers and asked them their color preferences.
Tash's survey results for car color preference from his
sample of 150 customers
Red     Yellow   Green   Blue    White
50      35       30      10      25
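df = 5 - 1 = 4
Critical value (df = 4, level of significance 0.05): 9.488
Computed chi-square value: 26.95
Since 26.95 > 9.488, the observed color preferences differ significantly from last year's percentages.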
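The goodness-of-fit computation for the car color survey can be sketched in a few lines of Python. This is a minimal illustration using only the counts and percentages stated in the problem; the variable names are assumptions made for this sketch, and the critical value is taken from a chi-square table rather than computed.

```python
# Observed counts from Tash's survey of 150 customers.
observed = {"red": 50, "yellow": 35, "green": 30, "blue": 10, "white": 25}

# Last year's percentages, used as the expected proportions.
expected_share = {"red": 0.30, "yellow": 0.20, "green": 0.10,
                  "blue": 0.10, "white": 0.30}

n = sum(observed.values())  # sample size

# Expected frequency for each color = sample size x expected proportion.
expected = {color: n * share for color, share in expected_share.items()}

# Chi-square statistic: sum of (O - E)^2 / E over all categories.
chi_square = sum((observed[c] - expected[c]) ** 2 / expected[c]
                 for c in observed)

df = len(observed) - 1      # number of categories minus 1
critical_value = 9.488      # table value for df = 4 at the 0.05 level

print(f"chi-square = {chi_square:.2f}, df = {df}")
print("reject Ho" if chi_square > critical_value else "do not reject Ho")
```

Without intermediate rounding the statistic comes out at about 26.94; summing the five terms after rounding each to two decimals gives 26.95. Either way it is well above 9.488.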
GENDER    MILD    MEDIUM    HOT
MALE      24      20        19
FEMALE    13      15        20
Ho: The spiciness level selected by an individual is
independent of the gender of the individual.
Ha: The spiciness level selected by an individual is
dependent on the gender of the individual.
Level of significance: 0.05
Statistical tool: Chi-square test of independence
\[ \chi^2 = \sum \frac{(O - E)^2}{E} \]
OBSERVED FREQUENCIES (spiciness level by gender)
GENDER    MILD    MEDIUM    HOT     TOTAL
MALE      24      20        19      63
FEMALE    13      15        20      48
TOTAL     37      35        39      111
Expected value = (row total × column total) / grand total
EXPECTED FREQUENCIES (spiciness level by gender, cells to be computed)
GENDER    MILD    MEDIUM    HOT     TOTAL
MALE                                63
FEMALE                              48
TOTAL     37      35        39      111
OBSERVED FREQUENCIES (spiciness level by gender)
GENDER    MILD    MEDIUM    HOT     TOTAL
MALE      24      20        19      63
FEMALE    13      15        20      48
TOTAL     37      35        39      111

EXPECTED FREQUENCIES (spiciness level by gender)
GENDER    MILD    MEDIUM    HOT     TOTAL
MALE      21      19.86     22.14   63
FEMALE    16      15.14     16.86   48
TOTAL     37      35        39      111
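The expected frequencies in the table can be reproduced from the row and column totals. The following is a minimal Python sketch using the observed counts above; the variable names are assumptions made for this illustration.

```python
# Observed counts: rows are gender, columns are spiciness level.
observed = [
    [24, 20, 19],  # MALE:   mild, medium, hot
    [13, 15, 20],  # FEMALE: mild, medium, hot
]

row_totals = [sum(row) for row in observed]          # totals per gender
col_totals = [sum(col) for col in zip(*observed)]    # totals per spiciness level
grand_total = sum(row_totals)

# Expected value for each cell = (row total x column total) / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

for label, row in zip(["MALE", "FEMALE"], expected):
    print(label, [round(value, 2) for value in row])
```

Each expected row sums to the same total as the corresponding observed row, which is a quick check that the table was filled in correctly.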
\[ \chi^2 = \frac{(24-21)^2}{21} + \frac{(13-16)^2}{16} + \frac{(20-19.86)^2}{19.86} + \frac{(15-15.14)^2}{15.14} + \frac{(19-22.14)^2}{22.14} + \frac{(20-16.86)^2}{16.86} \]
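\[ \chi^2 = 0.43 + 0.56 + 0.00099 + 0.0013 + 0.45 + 0.58 \]
\[ \chi^2 = 2.02229 \]
df = (2 - 1)(3 - 1) = 2
Critical value (df = 2, level of significance 0.05): 5.991
Computed chi-square value: 2.02229
Since 2.02229 < 5.991, do not reject Ho: the data do not show that the spiciness level selected depends on gender.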
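The whole test of independence for the spiciness data can be checked with a short Python sketch. It assumes the observed counts given above and takes the critical value from a chi-square table; the variable names are illustrative only.

```python
# Observed counts: rows are gender (male, female),
# columns are spiciness level (mild, medium, hot).
observed = [
    [24, 20, 19],
    [13, 15, 20],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Sum (O - E)^2 / E over every cell, with
# E = (row total x column total) / grand total.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total
        chi_square += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)   # (rows - 1)(columns - 1)
critical_value = 5.991   # table value for df = 2 at the 0.05 level

print(f"chi-square = {chi_square:.4f}, df = {df}")
print("reject Ho" if chi_square > critical_value else "do not reject Ho")
```

Computed without intermediate rounding, the statistic is about 2.02, which is below 5.991, so Ho is not rejected.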