Chi-Square (X2) Distribution
Nadia Aziz
Senior lecturer
F.A.B.H.S(C.M)
College of Medicine
University of Baghdad
Describe cross-tabulation and assess the
relationship between two categorical
(nominal or ordinal level) variables with two or
more categories.
Understand the concept of observed and
expected frequencies.
Interpret the SPSS output for the Chi-square
procedure.
Understand the applications of Fisher's exact
test and Yates' continuity correction.
PROPERTIES:
1. It is one of the most widely used distributions
in statistical applications.
2. This distribution may be derived from the normal
distribution.
3. This distribution takes values from
zero to + infinity.
4. X2 relates to the frequencies of occurrence of
individuals (or events) in the categories of
one or more variables.
5. The X2 test is used to test the agreement between
the observed frequencies of certain
characteristics and the frequencies expected
under a given hypothesis.
The value of the test statistic in the Chi-square
distribution lies between zero and + ∞. No
negative values occur because the statistic is a
sum of squared values.
The Chi-square distribution has one tail
only (it is a positively skewed distribution).
The higher the df, the flatter the curve.
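The link to the normal distribution and the effect of df can be illustrated with a small simulation. This is only a sketch: it assumes the standard construction of a chi-square value as the sum of df squared standard normal values, and the function name `chi_square_sample`, the seed, and the number of draws are illustrative choices, not part of the lecture.

```python
import random

def chi_square_sample(df, rng):
    """Draw one chi-square value as the sum of df squared standard normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))

rng = random.Random(0)  # fixed seed so the run is repeatable
for df in (1, 3, 10):
    draws = [chi_square_sample(df, rng) for _ in range(20000)]
    mean = sum(draws) / len(draws)
    # The minimum is never negative, and the mean moves right as df grows.
    print(f"df={df:2d}  min={min(draws):.4f}  mean={mean:.2f}")
```

Every simulated value is zero or positive, and the sample mean sits close to df, which is why the curve spreads out and flattens as df increases.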
CHI-SQUARE(X2) test of goodness of fit
(non-parametric)
CHI-SQUARE(X2) test of homogeneity
CHI-SQUARE(X2) test of Independence
It is used to test the null hypothesis that two
criteria of classification, when applied to the
same set of entities, are independent (NO
ASSOCIATION).
Generally, a single sample of size (n) is
drawn from a population, and the entities are
cross-classified on the basis of the two variables
of interest (X & Y), giving a frequency of
occurrence for each combination of categories.
The cells are formed by the
intersections of the rows (r) and the columns
(c).
The table is called the ‘contingency table’.
Calculation of expected frequency is based
on the Probability Theory
df=(r-1)(c-1)
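The probability argument behind the expected frequency can be written out as follows. The symbols R_i (total of row i), C_j (total of column j) and E_ij (expected frequency of the cell in row i and column j) are notation introduced here for illustration; n is the sample size as above. Under Ho the two criteria are independent, so the probability of falling in a cell is the product of the two marginal probabilities:

```latex
P(\text{row } i \cap \text{column } j)
  = P(\text{row } i)\, P(\text{column } j)
  = \frac{R_i}{n} \cdot \frac{C_j}{n}

E_{ij} = n \cdot \frac{R_i}{n} \cdot \frac{C_j}{n} = \frac{R_i \, C_j}{n}
```

In words: expected frequency = (row total x column total) / grand total.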
1. Hypotheses
Ho: The 2 criteria are independent (there is no
association)
HA: The 2 criteria are not independent (there
is an association)
(Contingency table layout; surviving row: X2 | c | d | c+d)
X2 = ∑ (O - E)² / E
where O is the observed and E the expected frequency of a cell.
For each cell we calculate (O - E)² / E.
The values for all the cells of the contingency
table are added together to find the calculated X2 (X2 c).
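The cell-by-cell calculation can be sketched in a few lines of Python. The function name `chi_square_statistic` and the 2 x 2 counts below are hypothetical and chosen only for illustration; the expected frequency of each cell is taken as (row total x column total) / n.

```python
def chi_square_statistic(observed):
    """Return (calculated X2, df) for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected frequency of the cell
            statistic += (o - e) ** 2 / e          # this cell's contribution
    df = (len(observed) - 1) * (len(col_totals) - 1)
    return statistic, df

# Hypothetical counts, not taken from the lecture.
x2_c, df = chi_square_statistic([[10, 20], [30, 40]])
print(round(x2_c, 3), df)
```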
5. Define the critical value (tabulated X2)
This depends on the alpha level of significance
and the degrees of freedom. The value is
determined from the X2 table.
df=(r-1)(c-1)
r: no. of rows
c: no. of columns
6. Conclusion
If the calculated X2 (X2 c) is less than the tabulated X2 (X2 tab), we do not reject Ho;
otherwise we reject Ho in favour of HA.
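The decision rule can be sketched as a simple table lookup. The critical values below are the standard tabulated X2 values for alpha = 0.05 and df = 1 to 4; the names `CRITICAL_005` and `decide` are hypothetical and used only for this illustration.

```python
# Tabulated X2 values for alpha = 0.05 (df = 1..4 only).
CRITICAL_005 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

def decide(x2_calculated, df, table=CRITICAL_005):
    """Compare the calculated X2 with the tabulated X2 for the given df."""
    x2_tabulated = table[df]
    if x2_calculated < x2_tabulated:
        return "do not reject H0"
    return "reject H0"

print(decide(2.5, 1))    # a small calculated value with df = 1
print(decide(126.2, 4))  # the calculated value reported in the SPSS example
```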
           N     %
Low        50    25
Average   110    55
High       40    20
Total     200   100
In the same sample, the location of residence
was also classified into 3 sectors: south, center and
north.
           N     %
south      44    22
center     96    48
north      60    30
Total     200   100
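From these two sets of marginal totals alone, the expected frequency of each of the 9 cells under independence can be worked out as (row total x column total) / n. The sketch below uses only the totals shown in the two tables; the observed joint counts are not given here, so the calculated X2 itself cannot be reproduced from them. The variable names are illustrative.

```python
row_totals = {"Low": 50, "Average": 110, "High": 40}
col_totals = {"south": 44, "center": 96, "north": 60}
n = 200

# Expected frequency of each cell if the two variables are independent.
expected = {
    (r, c): rt * ct / n
    for r, rt in row_totals.items()
    for c, ct in col_totals.items()
}
df = (len(row_totals) - 1) * (len(col_totals) - 1)

for (r, c), e in expected.items():
    print(f"{r:8s} {c:7s} {e:6.1f}")
print("df =", df)
print("smallest expected frequency =", min(expected.values()))
```

With 3 rows and 3 columns, df = (3-1)(3-1) = 4.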
Cross-tabulation shows the relationship between two
categorical variables, tabulated one against the
other.
Then click Continue.
The last one is the most important. It shows the
calculated Chi-square = 126.2,
with df = 4 and P value < 0.001.
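The reported P value can be checked by hand for this case. The sketch below relies on a closed form for the upper-tail probability of the chi-square distribution that holds only when df is even; it is not how SPSS computes the value, and the function name `chi_square_sf_even_df` is hypothetical.

```python
import math

def chi_square_sf_even_df(x, df):
    """Upper-tail probability P(X2 > x); closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for even, positive df")
    half = x / 2.0
    # exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
    terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms

p_value = chi_square_sf_even_df(126.2, 4)
print(p_value, p_value < 0.001)
```

For odd df a statistical package or a printed X2 table is needed instead.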
The Chi-square test is not applicable when:
1. The expected frequency of any cell is < 1
2. More than 20% of the cells have an expected
frequency < 5
The expected frequency of any cell is <5
Fisher's exact test is used if the Chi-square test is not
applicable because of small expected values
(when the expected frequency of any cell in a
2 X 2 table is less than 5).
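For a 2 X 2 table with small expected frequencies, an exact P value can be computed directly from the hypergeometric probabilities of all tables that share the same row and column totals. The sketch below is one common two-sided convention (summing the probabilities of all tables that are no more likely than the observed one); the function name `fisher_exact_two_sided` and the counts in the example are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided exact P value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(k):
        # Probability that the top-left cell equals k with all margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Add up every table that is as likely as, or less likely than, the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= p_observed * (1 + 1e-9)
    )

# Hypothetical small table: every expected frequency is 2, i.e. below 5.
print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))
```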