Chi-Square (X²) Distribution


Dr. Nadia Aziz
Senior lecturer
F.A.B.H.S(C.M)
College of Medicine
University of Baghdad
 Describe cross-tabulation and assess the
relationship between two categorical
(nominal- or ordinal-level) variables with two or
more categories.
 Understand the concept of observed and
expected frequencies.
 Interpret the SPSS output for the Chi-square
procedure.
 Understand the applications of Fisher's exact
test and Yates' continuity correction.
 PROPERTIES:
1. It is one of the most widely used distributions
in statistical applications.
2. This distribution may be derived from the normal
distribution.
3. This distribution assumes values from
zero to + infinity.
4. X² relates to the frequencies of occurrence of
individuals (or events) in the categories of
one or more variables.
5. The X² test is used to test the agreement between
the observed frequencies with certain
characteristics and the expected frequencies
under a certain hypothesis.
 The values of the test statistic in the Chi-square
distribution lie between zero and + ∞. No
negative values are present since they are
squared values.
 The Chi-square distribution has one tail
only (positively skewed distribution).
 The higher the df, the more flattened the
curve is.
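
A minimal simulation sketch of these properties, assuming the standard construction of a chi-square variable with k degrees of freedom as the sum of k squared standard normal values. The function name, the seed, and the number of draws are illustrative choices and do not come from the lecture.

```python
import random
import statistics

def simulate_chi_square(df, n_draws=20000, seed=0):
    """Draw n_draws values, each the sum of df squared standard normal values."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for repeatability
    return [
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
        for _ in range(n_draws)
    ]

draws = simulate_chi_square(df=3)
sample_min = min(draws)                   # never below zero: every term is squared
sample_mean = statistics.mean(draws)      # close to df for a chi-square variable
sample_median = statistics.median(draws)  # below the mean, a sign of positive skew
print(sample_min, sample_mean, sample_median)
```

Re-running with a larger df should give a wider, flatter spread of values, in line with the last bullet.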

 CHI-SQUARE (X²) test of Goodness of fit (non-parametric)
 CHI-SQUARE (X²) test of homogeneity
 CHI-SQUARE (X²) test of Independence
 The test of independence is used to test the null
hypothesis that two criteria of classification, when
applied to the same set of entities, are independent
(NO ASSOCIATION).
 Generally, a single sample of size (n) is
drawn from a population, and the entities are
cross-classified on the basis of the two variables
of interest (X and Y).
The corresponding cells are formed by the
intersections of the rows (r) and the columns
(c).
The table is called the 'contingency table'.
 Calculation of the expected frequency is based
on probability theory.
 The hypotheses and conclusions are stated
in terms of the independence or lack of
independence of the two variables.
 X² = ∑ (O - E)² / E
 df = (r - 1)(c - 1)
1. Hypotheses
Ho: The 2 criteria are independent (no
association)
HA: The 2 criteria are not independent (there
is an association)

2. Construct the contingency table

          Y1      Y2      Total (row total)
X1        a       b       a+b
X2        c       d       c+d
Total     a+c     b+d     N=a+b+c+d
(column total)
3. Calculate the expected frequency for each cell
by multiplying the corresponding marginal totals
of that cell and dividing the product by the sample size:
E = (row total x column total) / grand total
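
The expected-frequency rule can be sketched in a few lines of Python. This is only an illustration: the function name and the 2x2 counts are hypothetical and were chosen so the arithmetic is easy to check by hand.

```python
def expected_frequencies(observed):
    """Return E = (row total x column total) / grand total for every cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [
        [r * c / grand_total for c in col_totals]
        for r in row_totals
    ]

# Hypothetical 2x2 counts (a, b in the first row; c, d in the second).
observed = [[10, 20],
            [30, 40]]
expected = expected_frequencies(observed)
print(expected)
```

The expected counts keep the same row and column totals as the observed counts, which is a quick check on the calculation.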
4. Calculate the X² value (calculated X², written X²c)
X² = ∑ (O - E)² / E
For each cell we calculate (O - E)² / E.
The values for all the cells of the contingency
table are added together to find X²c.
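
A short sketch of this step, assuming the observed and expected frequencies are already available as nested lists. The counts below are hypothetical and serve only to show the cell-by-cell summation.

```python
def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over all cells of the table."""
    total = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for o, e in zip(obs_row, exp_row):
            total += (o - e) ** 2 / e  # contribution of one cell
    return total

# Hypothetical counts with their expected frequencies, for illustration only.
observed = [[10, 20], [30, 40]]
expected = [[12.0, 18.0], [28.0, 42.0]]
x2_calculated = chi_square_statistic(observed, expected)
print(round(x2_calculated, 3))
```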
5. Define the critical value (tabulated X², written X²tab)
This depends on the alpha level of significance
and the degrees of freedom. The value is
determined from the X² table.
df = (r - 1)(c - 1)
r: no. of rows
c: no. of columns
6. Conclusion
If X²c is less than X²tab, we accept (fail to reject) Ho.
If X²c is more than X²tab, we reject Ho.

The tabulated X² for a 2x2 table with df = 1 and
alpha error = 0.05 is equal to (1.96)² = 3.84.
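
The link between 1.96 and 3.84 can be checked with the Python standard library. The sketch relies on the fact that a chi-square variable with df = 1 is the square of a standard normal variable, so it applies only to df = 1; other degrees of freedom still need the chi-square table. The function names and the calculated values passed to the decision rule are hypothetical.

```python
from statistics import NormalDist

def critical_value_df1(alpha=0.05):
    """Tabulated chi-square for df = 1, obtained as the squared two-sided z value."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    return z ** 2

def conclusion(x2_calculated, x2_tabulated):
    """Reject Ho only when the calculated value exceeds the tabulated value."""
    return "reject Ho" if x2_calculated > x2_tabulated else "do not reject Ho"

x2_tab = critical_value_df1(0.05)
print(round(x2_tab, 2))
print(conclusion(5.0, x2_tab))  # 5.0 is a hypothetical calculated value
```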
 The table shows the distribution of
individuals according to 3 categories of
Socioeconomic Index Level (SEIL).

SEIL      No.    %
Low       50     25
Average   110    55
High      40     20
Total     200    100
 In the same sample, the location of residence
was also classified into 3 sectors: south, center and
north.

Sector    No.    %
South     44     22
Center    96     48
North     60     30
Total     200    100
 Cross-tabulation shows the relationship between two
categorical variables, tabulated one against the
other.

SEIL      South   Center   North   Total
Low       33      7        10      50
Average   9       81       20      110
High      2       8        30      40
Total     44      96       60      200
[SPSS dialog screenshots: add the variables in the marked box, then click Continue.]
The last output table is the most important. It shows the
calculated Chi-square = 126.2, with df = 4 and
P value < 0.001.
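
The reported values can be reproduced by hand from the SEIL by residence table. The sketch below is plain Python, not SPSS output. The closed-form tail probability it uses is valid only for df = 4 and is included as an assumption-laden shortcut to avoid an external library.

```python
import math

observed = [
    [33, 7, 10],   # Low SEIL: south, center, north
    [9, 81, 20],   # Average SEIL
    [2, 8, 30],    # High SEIL
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

x2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n  # expected frequency of the cell
        x2 += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)

# Upper-tail probability of the chi-square distribution; this form holds only for df = 4.
p_value = math.exp(-x2 / 2) * (1 + x2 / 2)
print(round(x2, 1), df, p_value < 0.001)
```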
The Chi-square test is not applicable when:
1. The expected frequency of any cell is < 1
2. The summation of the least expected
frequencies in 20% of the cells is < 5
3. In a 2 x 2 table, the expected frequency of any cell is < 5
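
A rough screening sketch for small expected frequencies. It assumes the widely quoted rule of thumb (no expected frequency below 1, and at most 20% of cells with an expected frequency below 5), which may not match the wording of the conditions listed here exactly. The function names are hypothetical, and the small 2x2 table is made up for illustration.

```python
def small_expected_summary(observed):
    """Report the smallest expected frequency and the share of cells below 5."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    expected = [r * c / n for r in row_totals for c in col_totals]
    return {
        "min_expected": min(expected),
        "share_below_5": sum(e < 5 for e in expected) / len(expected),
    }

def chi_square_applicable(observed):
    """Assumed rule of thumb: min expected >= 1 and at most 20% of cells below 5."""
    s = small_expected_summary(observed)
    return s["min_expected"] >= 1 and s["share_below_5"] <= 0.20

seil_by_residence = [[33, 7, 10], [9, 81, 20], [2, 8, 30]]
small_table = [[3, 1], [1, 3]]  # hypothetical counts with small expected values
print(chi_square_applicable(seil_by_residence), chi_square_applicable(small_table))
```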
 Fisher's exact test is used if the chi-square test is not
applicable because of small expected values
(when the expected frequency of any cell in a
2 x 2 table is less than 5).
 For tables in which the use of the chi-square test
is appropriate, the two tests give very similar
results.
P = [(a+b)! (c+d)! (a+c)! (b+d)!] / [n! a! b! c! d!]
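
The factorial formula translates directly into Python. The sketch gives the probability of one particular 2 x 2 table with fixed margins; the p-value of the full test is obtained by adding the probabilities of the observed table and of the tables that are at least as extreme, which this sketch does not do. The function name and the counts are hypothetical.

```python
from math import factorial

def fisher_table_probability(a, b, c, d):
    """Probability of a single 2 x 2 table (a, b / c, d) with fixed marginal totals."""
    n = a + b + c + d
    numerator = (factorial(a + b) * factorial(c + d)
                 * factorial(a + c) * factorial(b + d))
    denominator = (factorial(n) * factorial(a) * factorial(b)
                   * factorial(c) * factorial(d))
    return numerator / denominator

# Hypothetical small table: a=3, b=1, c=1, d=3.
p = fisher_table_probability(3, 1, 1, 3)
print(round(p, 4))
```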
 Because Chi-square is a continuous
distribution and categorical data are
discrete, some statisticians use a version of
chi-square called Yates' corrected chi-
square.
 X² = ∑ (|O - E| - 0.5)² / E

 The corrected version is more conservative
than the non-corrected version.
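
A small comparison sketch of the corrected and non-corrected statistics for a 2 x 2 table. The function name and the counts are hypothetical; the point is only that subtracting 0.5 from each |O - E| gives a smaller statistic, which makes rejecting Ho harder.

```python
def chi_square_with_and_without_yates(a, b, c, d):
    """Return (non-corrected, Yates-corrected) chi-square for a 2 x 2 table."""
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    n = a + b + c + d
    uncorrected = 0.0
    corrected = 0.0
    for i in range(2):
        for j in range(2):
            e = row_totals[i] * col_totals[j] / n
            diff = abs(observed[i][j] - e)
            uncorrected += diff ** 2 / e
            corrected += (diff - 0.5) ** 2 / e  # continuity correction of 0.5
    return uncorrected, corrected

# Hypothetical counts, for illustration only.
uncorrected, corrected = chi_square_with_and_without_yates(10, 20, 30, 40)
print(round(uncorrected, 3), round(corrected, 3))
```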
THANK YOU
