
12) Chi - Square

Uploaded by

selesmab
Copyright: © All Rights Reserved

CHI-SQUARE TEST

The chi-square test is the most commonly used non-parametric test. It is applicable only
to qualitative data (nominal and ordinal scale) such as intelligence, color, health, response to a
drug, etc.

It is a test of significance used when data are expressed as frequencies (or as percentages
that can be converted to frequencies). It helps us determine the degree of deviation between
observed and expected frequencies and conclude whether the deviation is due to chance or to
some other influence.

The test statistic is given by

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where O = observed frequency,

E = expected frequency; for a cell of a contingency table, E = (row total × column total) / grand total.

• If any expected frequency is less than 5, Yates' correction is applied. Therefore

$$\chi^2 = \sum \frac{\left(|O - E| - \tfrac{1}{2}\right)^2}{E}$$
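The two formulas above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `chi_square_statistic` and the sample counts are assumptions made up for the example, and the correction is applied exactly as written in the formula (no clamping of small deviations).

```python
def chi_square_statistic(observed, expected, yates=False):
    """Return sum((O - E)^2 / E), optionally with Yates' correction."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    total = 0.0
    for o, e in zip(observed, expected):
        deviation = abs(o - e)
        if yates:
            # Yates' correction: reduce each absolute deviation by 1/2
            deviation -= 0.5
        total += deviation ** 2 / e
    return total


# Illustrative (made-up) counts: 12 and 8 observed, 10 and 10 expected
print(chi_square_statistic([12, 8], [10, 10]))
print(chi_square_statistic([12, 8], [10, 10], yates=True))
```

The corrected value is always smaller than the uncorrected one when every deviation exceeds 1/2, which makes the corrected test more conservative.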
Applications of the χ² test:

1. To test goodness of fit.

2. To test independence of attributes (testing of association).

3. To test homogeneity.

Criteria for applying the chi-square test:

1. Random sample.

2. Qualitative data.

3. Lowest expected frequency not less than 5.

4. Sample size should be at least 50.


1) Test of goodness of fit

It is used to determine whether the observed values differ from the expected
theoretical values only by chance, or whether the sample is drawn from a different
population. It is used for fitting binomial, Poisson and normal distributions.

H0: There is no significant difference between the observed and expected values.

H1: There is a significant difference between the observed and expected values.


Q. A random sample gave the following observations.

Academic          Nutrition status          Total
performance       Poor         Good
Poor              105 (A)      15 (B)       120
Satisfactory      80 (C)       300 (D)      380
Total             185          315          500

Is there any relation between nutrition status and academic performance?

Solution:

H0: There is no significant association between nutrition status and academic performance.

H1: There is a significant association between nutrition status and academic performance.
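Expected frequency for each cell = (row total × column total) / grand total:

A = (120 × 185) / 500 = 44.4

B = (120 × 315) / 500 = 75.6

C = (380 × 185) / 500 = 140.6

D = (380 × 315) / 500 = 239.4

Academic          Nutrition status              Total
performance       Poor           Good
Poor              O = 105        O = 15         120
                  E = 44.4       E = 75.6
Satisfactory      O = 80         O = 300        380
                  E = 140.6      E = 239.4
Total             185            315            500

Calculated

$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(105-44.4)^2}{44.4} + \frac{(15-75.6)^2}{75.6} + \frac{(80-140.6)^2}{140.6} + \frac{(300-239.4)^2}{239.4} \approx 172.7$$

Degrees of freedom: df = (r-1)(c-1) = (2-1)(2-1) = 1

Tabulated χ² value at the 5% level of significance (95% confidence level) and 1 df = 3.841

Conclusion:
Calculated χ² > tabulated χ², so we reject H0, i.e. there is a significant
association between nutrition status and academic performance.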
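The expected frequencies and the statistic for this table can be checked with a short script. It is a sketch under the usual rule E = (row total × column total) / grand total; the helper names `expected_frequencies` and `chi_square_from_table` are chosen here for illustration and do not come from the original notes.

```python
def expected_frequencies(table):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_from_table(table):
    """Sum of (O - E)^2 / E over all cells of an r x c table."""
    expected = expected_frequencies(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Rows: academic performance (poor, satisfactory); columns: nutrition (poor, good)
observed = [[105, 15], [80, 300]]
expected = expected_frequencies(observed)
chi2 = chi_square_from_table(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)

print([[round(e, 1) for e in row] for row in expected])
print(round(chi2, 1), df)
```

Because the helpers work on any list of rows, the same sketch applies to larger r × c tables, with df = (r-1)(c-1).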
Q. Nephropathy was looked for in 100 diabetics from each of 4 classes
graded by severity. Is the inequality among the different groups due to severity?

Class                    I      II     III    IV
Observed frequency       8      15     14     17

Solution:

Null hypothesis H0: The severity of diabetes and the incidence of nephropathy are
independent (no association).

Alternative hypothesis H1: The severity of diabetes and the incidence of nephropathy
are dependent (association).

Now, expected frequency for each class = (8 + 15 + 14 + 17) / 4 = 54 / 4 = 13.5

Class                      I      II     III    IV
Observed frequency (O)     8      15     14     17
Expected frequency (E)     13.5   13.5   13.5   13.5

Calculated

$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(8-13.5)^2 + (15-13.5)^2 + (14-13.5)^2 + (17-13.5)^2}{13.5} = \frac{45}{13.5} = 3.33$$

df = 4 - 1 = 3

Now, tabulated χ² at the 5% level of significance (95% confidence level) and 3 df = 7.82

Conclusion:

Calculated χ² < tabulated χ², so we do not reject H0,

i.e. the severity of diabetes and the incidence of nephropathy are independent.
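The same calculation can be sketched in Python, assuming (as in the solution) that the expected frequency is the same for every class. The constant name `CRITICAL_5PCT_DF3` is just a label for the tabulated value 7.82 quoted above.

```python
observed = [8, 15, 14, 17]

# Under H0 every class has the same expected frequency: 54 / 4 = 13.5
expected_each = sum(observed) / len(observed)

chi2 = sum((o - expected_each) ** 2 / expected_each for o in observed)
df = len(observed) - 1

CRITICAL_5PCT_DF3 = 7.82  # tabulated value quoted in the text for 3 df
reject_h0 = chi2 > CRITICAL_5PCT_DF3

print(round(chi2, 2), df, reject_h0)
```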


Example
A researcher wants to find out whether the distribution of plants follows Mendel's
9:3:3:1 law or not.

Phenotype   Round yellow   Round green   Wrinkled yellow   Wrinkled green   Total
N           315            103           108               34               560

Hint:
Hypotheses
Null hypothesis (H0): The given distribution follows Mendel's law.
Alternative hypothesis (H1): The given distribution does not follow Mendel's law.
The expected frequencies are calculated as
9x + 3x + 3x + 1x = 560
16x = 560
x = 35
Then the expected frequencies are 9x = 315, 3x = 105, 3x = 105 and x = 35.
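As a rough sketch of how the hint can be carried through, the snippet below derives the expected frequencies from the 9:3:3:1 ratio and computes the statistic. The helper `expected_from_ratio` is a hypothetical name introduced for this example; the result would be compared with the tabulated value for 4 - 1 = 3 df (7.82 at the 5% level, as quoted earlier).

```python
def expected_from_ratio(ratio, total):
    """Split `total` into expected counts proportional to `ratio`."""
    unit = total / sum(ratio)  # the "x" in 9x + 3x + 3x + 1x = total
    return [part * unit for part in ratio]


# Round yellow, round green, wrinkled yellow, wrinkled green
observed = [315, 103, 108, 34]
expected = expected_from_ratio([9, 3, 3, 1], sum(observed))

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

print(expected)
print(round(chi2, 3), df)
```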
2) Test of association (test of independence of attributes)

It tests whether there is an association between two discrete attributes,
such as smoking and cancer, vaccination and immunity, weight and diabetes, etc.

H0: The two variables are independent.

OR

There is no association between the two variables.

H1: The two variables are dependent.

OR

There is a significant association between the two variables.

• 2 × 2 contingency table with observed frequencies a, b, c, d

Group       Result                        Total
            Positive      Negative
X           a             b               a+b
Y           c             d               c+d
Total       a+c           b+d             a+b+c+d = N
Alternative formula

$$\chi^2 = \frac{N(ad - bc)^2}{(a+b)(c+d)(a+c)(b+d)}$$

If the expected value in any cell is less than 5, Yates' correction is applied:

$$\chi^2 = \frac{N\left(|ad - bc| - \tfrac{N}{2}\right)^2}{(a+b)(c+d)(a+c)(b+d)}$$

Degrees of freedom:
df = (r-1)(c-1)
where r = number of rows and c = number of columns
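The 2 × 2 shortcut, χ² = N(ad - bc)² / [(a+b)(c+d)(a+c)(b+d)], and its Yates-corrected form can be sketched as a single function. The name `chi_square_2x2` and the `yates` flag are assumptions made for illustration; the correction follows the formula as written, subtracting N/2 from |ad - bc|.

```python
def chi_square_2x2(a, b, c, d, yates=False):
    """Shortcut chi-square for a 2 x 2 table with cells a, b / c, d."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Yates' correction for the shortcut formula: subtract N/2
        diff -= n / 2
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Nutrition status vs academic performance: A = 105, B = 15, C = 80, D = 300
print(round(chi_square_2x2(105, 15, 80, 300), 1))
print(round(chi_square_2x2(105, 15, 80, 300, yates=True), 1))
```

The shortcut avoids computing the four expected frequencies separately and, without the correction, gives the same value as the cell-by-cell sum.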
Q. A random sample gave the following observations.

Academic          Nutrition status          Total
performance       Poor         Good
Poor              105 (A)      15 (B)       120
Satisfactory      80 (C)       300 (D)      380
Total             185          315          500

Is there any relation between nutrition status and academic performance?

Solution:

H0: There is no significant association between nutrition status and academic performance.

H1: There is a significant association between nutrition status and academic performance.

Alternative method:

Calculated

$$\chi^2 = \frac{N(ad - bc)^2}{(a+b)(c+d)(a+c)(b+d)} = \frac{500\,(105 \times 300 - 15 \times 80)^2}{120 \times 380 \times 185 \times 315} = 172.7$$

Degrees of freedom: df = (r-1)(c-1) = (2-1)(2-1) = 1

Tabulated χ² value at the 5% level of significance (95% confidence level) and 1 df = 3.841

Conclusion:
Calculated χ² > tabulated χ², so we reject H0, i.e. there is a significant
association between nutrition status and academic performance.
3) Test of homogeneity
The chi-square test is also used to test the homogeneity of attributes with
respect to a particular characteristic, or to compare the population and sample
variances. The test of independence concerns whether two attributes are
independent or not, whereas the test of homogeneity concerns whether the
samples drawn from different populations are homogeneous with respect to
some criterion of classification.
Example
A researcher wishes to find out whether two districts are
homogeneous with respect to smoking habits. He collected
the following data:

                Smoking habit
Districts       Yes       No        Total
A               90        140       230
B               80        190       270
Total           170       330       500

H0: The two districts are homogeneous with respect to smoking habit.
H1: The two districts are not homogeneous with respect to smoking habit.
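One way to work this example through is sketched below. Under H0 both districts share the pooled smoking distribution, so each expected count is the district total multiplied by the pooled proportion. The variable names are illustrative, and the decision uses the tabulated value 3.841 for 1 df quoted earlier in these notes.

```python
# Rows: districts A and B; values: (smokers, non-smokers)
observed = {"A": (90, 140), "B": (80, 190)}

CRITICAL_5PCT_DF1 = 3.841  # tabulated value quoted in the text for 1 df

# Share of smokers in each district (the quantity homogeneity compares)
smoker_share = {name: yes / (yes + no) for name, (yes, no) in observed.items()}

# Pooled proportions under H0: both districts have the same distribution
total_yes = sum(yes for yes, _ in observed.values())
total_no = sum(no for _, no in observed.values())
grand_total = total_yes + total_no
pooled = (total_yes / grand_total, total_no / grand_total)

chi2 = 0.0
for yes, no in observed.values():
    district_total = yes + no
    for o, share in zip((yes, no), pooled):
        e = district_total * share  # expected count if districts are homogeneous
        chi2 += (o - e) ** 2 / e

reject_h0 = chi2 > CRITICAL_5PCT_DF1
print(smoker_share, round(chi2, 2), reject_h0)
```

The arithmetic is identical to the test of independence; only the sampling design and the wording of the hypotheses differ.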
Miscellaneous questions

Q. Test whether the incidence of bradycardia has any predilection for
the site of infarction, using the data in the following table.

Site of         No. of patients       No. of patients          Total
infarction      with bradycardia      without bradycardia
Posterior       31                    35                       66
Anterior        6                     28                       34
Total           37                    63                       100
Q. Apply the chi-square test to find the efficacy of a vaccine. The following results were obtained:

Vaccine         Result                    Total
                Died        Survived
Given           2           10            12
Not given       6           6             12
Total           8           16            24

Null hypothesis: The vaccine is not effective in controlling the disease.
Alternative hypothesis: The vaccine is effective in controlling the disease.
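In this table some expected frequencies fall below 5, which is the situation where these notes apply Yates' correction. The sketch below checks the smallest expected frequency first and applies the correction only if needed; the helper name `expected_frequencies` and the `use_yates` flag are assumptions for illustration.

```python
def expected_frequencies(table):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Rows: vaccine given / not given; columns: died / survived
observed = [[2, 10], [6, 6]]
expected = expected_frequencies(observed)

# Apply Yates' correction when any expected frequency is below 5
smallest_expected = min(min(row) for row in expected)
use_yates = smallest_expected < 5

chi2 = 0.0
for obs_row, exp_row in zip(observed, expected):
    for o, e in zip(obs_row, exp_row):
        deviation = abs(o - e)
        if use_yates:
            deviation -= 0.5
        chi2 += deviation ** 2 / e

print(smallest_expected, use_yates, chi2)
```

The resulting value is then compared with the tabulated 3.841 for 1 df at the 5% level.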
Q. Determine whether there is any association
between whooping cough and tonsillectomy
when, in a random sample of 100 children of a
school, 25 had a history of tonsillectomy,
60 had a history of whooping cough, 10 had both and
25 had neither.

Whooping        Tonsillectomy                 Total
cough           Present       Absent
Present         10 (A)        50 (B)          60
Absent          15 (C)        25 (D)          40
Total           25            75              100
Q. Two groups A and B consist of 100 people each who have a
disease. A serum is given to group A but not to group B
(called the control); otherwise the two groups are treated
identically. It is found that in groups A and B, 75 and 64
people respectively recovered from the disease. At the
0.05 significance level, use the chi-square test to test the
hypothesis that the serum helps cure the disease.
(Given: chi-square at the 5% level of significance for 1 degree of
freedom is 3.841.)
Hint: 2 × 2 contingency table:

                      Recovered       Not recovered     Total
Group A               O = 75          O = 25            100
                      E = 69.5        E = 30.5
Group B (control)     O = 64          O = 36            100
                      E = 69.5        E = 30.5
Total                 139             61                200


Q. A survey of 400 children in the age group 0-5 years
showed the prevalence rate of protein calorie malnutrition
to be 15%. Another study showed a prevalence of 5% in
a sample of 300 of a similar age group. Can we say that
the difference between the two prevalence rates is
statistically significant?

Community       Protein calorie malnutrition                      Total
                Yes                        No
Survey I        O = 15% of 400 = 60        O = 400 - 60 = 340     400
Survey II       O = 5% of 300 = 15         O = 300 - 15 = 285     300
Total           75                         625                    700

Null hypothesis: There is no significant difference between the two
prevalence rates of protein calorie malnutrition.
Alternative hypothesis: There is a significant difference between the two prevalence rates of
protein calorie malnutrition.
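Since the chi-square test works on frequencies, the percentages are first converted to counts, as in the table. A minimal sketch of that step followed by the 2 × 2 shortcut χ² = N(ad - bc)² / [(a+b)(c+d)(a+c)(b+d)] is given below; `counts_from_prevalence` is a hypothetical helper name.

```python
def counts_from_prevalence(sample_size, prevalence_percent):
    """Convert a prevalence percentage into (cases, non-cases) counts."""
    cases = round(sample_size * prevalence_percent / 100)
    return cases, sample_size - cases


a, b = counts_from_prevalence(400, 15)  # Survey I
c, d = counts_from_prevalence(300, 5)   # Survey II
n = a + b + c + d

# 2 x 2 shortcut formula (no Yates' correction)
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

print((a, b), (c, d), round(chi2, 2))
```

The value is compared with the tabulated 3.841 for 1 df at the 5% level of significance.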
Q. An outbreak of pediculosis capitis is being
investigated in a girls' school containing 291 pupils. Of
130 children who live in a nearby housing estate, 18 were
infested, and of 161 who live elsewhere, 37 were
infested. Is there any significant difference between the two
groups with respect to infestation?
Hint:

Group of people       Infested        Non-infested      Row total (RT)
Living nearby         O = 18          O = 112           130
                      E = 24.57       E = 105.43
Living elsewhere      O = 37          O = 124           161
                      E = 30.43       E = 130.57
Column total (CT)     55              236               291


Q. In an air pollution study, a random sample of 200 households was selected from
each of two communities. A respondent in each household was asked whether or
not anyone in the household was bothered by air pollution. The responses were as
follows:

Community       Any member of family bothered     Total
                by air pollution
                Yes           No
I               43            157                 200
II              81            119                 200
Total           124           276                 400

Can the researcher conclude that the two communities differ with respect
to the variable of interest?
Q. In a sample of 100 persons, the observed and expected blood group frequencies are
given below. Find whether the observed distribution fits the hypothetical (expected)
distribution.

Blood group       A       B       AB      O
Observed (O)      23      35      5       37
Expected (E)      42      9       3       46
Q. The ratio of female to male births in the universe is expected to be 1:1
(the proportion of male and female births is 50:50).

In one village it was found that 52 males and 48 females were born. Is this
difference due to chance?

Hint:

        Male      Female
O       52        48
E       50        50

Since ΣOi = 100

Number of classes = 2

Therefore expected frequency = 100/2 = 50
