
Test of Goodness of Fit and Independence: Chi-Square Test as a Test of Independence

Uploaded by

Bharti

Paper: 15, Quantitative Techniques for Management Decisions

Module: 24, Test of Goodness of Fit and Independence: Chi-Square Test as a Test of Independence

Prof. S P Bansal
Principal Investigator Vice Chancellor
Maharaja Agrasen University, Baddi

Prof. Yoginder Verma
Co-Principal Investigator Pro-Vice Chancellor
Central University of Himachal Pradesh, Kangra, H.P.

Prof. Pankaj Madan
Paper Coordinator Dean, FMS
Gurukul Kangri Vishwavidyalaya, Haridwar

Dr Deependra Sharma
Content Writer Associate Professor
Amity University Gurgaon, Haryana

Items             Description of Module

Subject Name      Management
Paper Name        Quantitative Techniques for Management Decisions
Module Title      Test of Goodness of Fit and Independence: Chi-Square Test as a Test of Independence
Module Id         24
Pre-Requisites    Basic mathematical operations
Objectives        After reading this module, students will be able to:
                  • Understand the concept of non-parametric tests
                  • Apply the χ² test as a test for independence
                  • Understand the procedure for conducting the χ² (chi-square) test
Keywords          χ², test of independence, normal distribution

Module 24: Test of Goodness of Fit and Independence: Chi-Square Test as a Test of Independence

Introduction

χ2 Test for Independence

Use of Chi-Square Test for Independence

Self-Check Questions

Summary
Quadrant-I

Test of Goodness of Fit and Independence: Chi-Square Test as a Test of Independence

Learning objective

After reading this module, students will be able to:

• Understand the concept of non-parametric tests
• Apply the chi-square test as a test for independence
• Understand the procedure for conducting the chi-square test

Introduction

A given set of data can be analyzed with various tools, chosen on the basis of the following factors:
• Size of the sample
• Size of the population
• Scale used to measure the data, and the dependency of the measurements

Tests fall into two main categories: parametric and non-parametric. Three parametric tests, the t, z and F tests, are used to estimate and test population parameters. The prerequisites for applying these tests are:
• The data are measured on an interval or ratio scale
• The hypothesis concerns a specific population parameter
• The population is assumed to be normal, and it is clear whether or not the standard deviation is known

When these conditions are not met, non-parametric (distribution-free) tests are applied. These tests:
• Do not require a specific population distribution, and the data can be nominal or ordinal
• Do not take population parameters into consideration
• Do not require a normally distributed population

These tests are easy to apply and can be calculated from nominal or ordinal data. They provide broad conclusions with approximate solutions and do not require a normally distributed population. The chi-square (χ²) test is one of the non-parametric tests used to test hypotheses.

χ2 test for Independence


The test applies when there are two categorical variables from a single population. Its purpose is to find out whether there is a significant association between the two variables. For example, in an election survey, voters may be classified by gender (male or female) and by party inclination (Democrat, Republican, or Independent). A chi-square test for independence is conducted to determine whether gender is related to party inclination.
This test is suitable under the following conditions:

• The sample is selected through simple random sampling.
• The variables are categorical.
• The expected frequency count for each cell of the contingency table is at least 5.

Procedure
The procedure for testing the association between two categorical variables, where the sample data are presented in a contingency table with r rows and c columns, is summarized in the following steps.

1. State the null and alternative hypotheses

H0: No relationship or association exists between variables.

Ha: A relationship or association exists between variables i.e., they are related.

2. Select a random sample and record the observed frequencies (O) in each cell of the contingency table
and calculate the row, column and grand total.

3. Calculate the expected frequency (E) for each cell:

$E = \dfrac{\text{row total} \times \text{column total}}{\text{grand total}}$

4. Compute the value of the test statistic:

$\chi^2 = \sum \dfrac{(O - E)^2}{E}$

where O is the observed frequency count and E is the expected frequency count of a cell, and the sum runs over all cells of the table.

5. Calculate the degrees of freedom:

$df = (r - 1)(c - 1)$

where r and c are the numbers of levels of the two categorical variables (the rows and columns of the contingency table).

6. Use the level of significance α and df to find the table (critical) value of χ².

7. Compare the calculated value with the table value. If the calculated value of chi-square is less than the table value, do not reject the null hypothesis; otherwise, reject it.
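
The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `chi_square_independence` and the 2 x 2 table of counts are hypothetical and chosen only to show the mechanics, and the critical value 3.84 is the chi-square table value for α = 0.05 and df = 1.

```python
def chi_square_independence(observed):
    """Return (statistic, df, expected) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # Step 3: E = row total * column total / grand total, for every cell
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Step 4: chi-square = sum over all cells of (O - E)^2 / E
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # Step 5: df = (r - 1)(c - 1)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


# Hypothetical counts, for illustration only (not data from this module)
table = [[30, 20], [20, 30]]
stat, df, expected = chi_square_independence(table)

# Steps 6 and 7: compare with the table value at alpha = 0.05, df = 1
critical_value = 3.84
reject_h0 = stat > critical_value
print(stat, df, reject_h0)
```

The function returns the expected frequencies as well, so that the condition on the minimum expected count per cell can be checked before the result is interpreted.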
Example

A simple random sample of 2000 prospective voters was taken. The voters were classified by gender (male, M, or female, F) and by party liking (Republican, Democrat, or Independent). The contingency table below shows the results.

                Party liking
                Republican   Democrat   Independent   Row total
M               400          300        100           800
F               500          600        100           1200
Column total    900          900        200           2000

Does the party liking of men differ significantly from that of women? Use a 0.05 level of significance.

Solution

The procedure described above is followed.

• The first step is to state the null hypothesis and the alternative hypothesis.

H0: Gender and party liking are independent.

Ha: Gender and party liking are not independent.

• The significance level for this analysis is 0.05, and the chi-square test for independence is used.
• The degrees of freedom, the expected frequency counts, and the chi-square test statistic are calculated.

$df = (r - 1)(c - 1) = (2 - 1)(3 - 1) = 2$

$E_{1,1} = (800 \times 900) / 2000 = 720000 / 2000 = 360$
$E_{1,2} = (800 \times 900) / 2000 = 360$
$E_{1,3} = (800 \times 200) / 2000 = 80$
$E_{2,1} = (1200 \times 900) / 2000 = 540$
$E_{2,2} = (1200 \times 900) / 2000 = 540$
$E_{2,3} = (1200 \times 200) / 2000 = 120$

$\chi^2 = \sum \dfrac{(O_{i,j} - E_{i,j})^2}{E_{i,j}}$

$\chi^2 = \dfrac{(400 - 360)^2}{360} + \dfrac{(300 - 360)^2}{360} + \dfrac{(100 - 80)^2}{80} + \dfrac{(500 - 540)^2}{540} + \dfrac{(600 - 540)^2}{540} + \dfrac{(100 - 120)^2}{120}$

$\chi^2 = 4.44 + 10.00 + 5.00 + 2.96 + 6.67 + 3.33 = 32.41$

The calculated value of the chi-square statistic, with 2 degrees of freedom, is greater than the table value of 5.99 at α = 0.05 (refer to the chi-square table), so the null hypothesis is rejected. Thus, we conclude that there is a relationship between gender and party liking.
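
As a cross-check on this example, the statistic can be recomputed from the observed table, and the comparison with the table value can be restated as a p-value. The sketch below relies on one assumption that is not part of the module: for exactly 2 degrees of freedom, the upper-tail probability of the chi-square distribution has the closed form exp(-x / 2). This shortcut does not hold for other degrees of freedom.

```python
import math

# Observed counts from the example (rows: M, F; columns: Republican, Democrat, Independent)
observed = [[400, 300, 100], [500, 600, 100]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

statistic = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n  # expected frequency of cell (i, j)
        statistic += (o - e) ** 2 / e

# Valid only for df = 2: P(chi-square > x) = exp(-x / 2)
p_value = math.exp(-statistic / 2)

# The same closed form gives the df = 2 critical value at alpha = 0.05
critical_value = -2 * math.log(0.05)

print(round(statistic, 2), round(critical_value, 2), p_value < 0.05)
```

The statistic comes out at about 32.41 and the critical value at about 5.99, in line with the chi-square table, and the p-value is far below 0.05, so both routes lead to rejecting the null hypothesis.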

Self-Check Questions:

Question 1: Two hundred randomly selected adults were asked whether TV shows as a whole are primarily entertaining, educational or a waste of time. The respondents were classified by gender. Their responses are given in the following table:

          Opinion
Gender    Entertaining   Educational   Waste of time   Total
Female    52             28            30              110
Male      28             12            50              90
Total     80             40            80              200

Do these data provide convincing evidence of a relationship between gender and opinion in the population of interest?

Solution: Let us take the null hypothesis that the opinion of adults is independent of gender.

Since the contingency table is of size 2 x 3, the degrees of freedom are (2 - 1)(3 - 1) = 2. This implies that only two expected frequencies need to be calculated; the other four follow automatically from the row and column totals, as shown below:

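$E_{11} = \dfrac{\text{Row 1 total} \times \text{Column 1 total}}{\text{Grand total}} = \dfrac{110 \times 80}{200} = 44$

$E_{12} = \dfrac{110 \times 40}{200} = 22$

$E_{13} = 110 - (44 + 22) = 44$

$E_{21} = 80 - E_{11} = 80 - 44 = 36$

$E_{22} = 40 - E_{12} = 40 - 22 = 18$

$E_{23} = 80 - E_{13} = 80 - 44 = 36$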

The contingency table of expected frequencies is as follows:

          Opinion
Gender    Entertaining   Educational   Waste of time   Total
Female    44             22            44              110
Male      36             18            36              90
Total     80             40            80              200
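
The claim that only two expected frequencies need to be calculated directly can be checked with a short sketch. Here two cells are computed from the formula, the other four are obtained by subtraction from the row and column totals, and the result is compared with the full cell-by-cell calculation. The variable names are illustrative only.

```python
row_totals = [110, 90]      # Female, Male
col_totals = [80, 40, 80]   # Entertaining, Educational, Waste of time
n = 200

# Two cells computed directly from E = row total * column total / grand total
e11 = row_totals[0] * col_totals[0] / n
e12 = row_totals[0] * col_totals[1] / n

# Remaining four cells obtained by subtraction from the marginal totals
e13 = row_totals[0] - (e11 + e12)
e21 = col_totals[0] - e11
e22 = col_totals[1] - e12
e23 = col_totals[2] - e13

by_subtraction = [[e11, e12, e13], [e21, e22, e23]]

# Full calculation, cell by cell, for comparison
by_formula = [[r * c / n for c in col_totals] for r in row_totals]

print(by_subtraction == by_formula)
```

Both routes give the same table of expected frequencies, which is what df = 2 means in practice: once two cells are fixed, the marginal totals determine the rest.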

The observed and expected frequencies are arranged as follows to calculate the value of the χ² test statistic:

Observed (O)   Expected (E)   O - E   (O - E)²   (O - E)²/E
52             44             8       64         1.455
28             22             6       36         1.636
30             44             -14     196        4.455
28             36             -8      64         1.778
12             18             -6      36         2.000
50             36             14      196        5.444
Total                                            16.768

Since the calculated value χ² = 16.768 is more than the critical value χ² = 5.99 at α = 0.05 and df = 2, the null hypothesis is rejected. Hence, we conclude that the opinion of adults is not independent of gender.
Question 2: A sample analysis of the examination results of 500 students was made. It was found that 220 students had failed, 170 had secured a third division, 90 were placed in the second division and 20 got a first division. Are these figures commensurate with the general examination result, which is in the ratio 4:3:2:1 for the respective categories?

Solution: Let us take the null hypothesis that the observed results are commensurate with the general examination result, which is in the ratio 4:3:2:1.

The expected numbers of students who failed, obtained a third division, a second division and a first division, respectively, are

$E_1 = 500 \times 4/10 = 200$, $E_2 = 500 \times 3/10 = 150$, $E_3 = 500 \times 2/10 = 100$ and $E_4 = 500 \times 1/10 = 50$

The table of observed and expected frequencies is as follows:

Category       O     E     (O - E)²   (O - E)²/E
Failed         220   200   400        2.000
3rd division   170   150   400        2.667
2nd division   90    100   100        1.000
1st division   20    50    900        18.000
Total                                 23.667

Since the calculated value χ² = 23.667 is more than the table value χ² = 7.81 at the α = 0.05 level of significance and df = n - 1 = 4 - 1 = 3, the null hypothesis is rejected. The observed results are not commensurate with the 4:3:2:1 ratio.
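
Question 2 compares one set of observed counts with a hypothesized ratio, so the expected frequencies come from the ratio rather than from row and column totals. A minimal sketch of that calculation follows; the function name `chi_square_goodness_of_fit` is hypothetical, and 7.81 is the table value quoted above for α = 0.05 and df = 3.

```python
def chi_square_goodness_of_fit(observed, ratio):
    """Return (statistic, df, expected) for observed counts against a hypothesized ratio."""
    total = sum(observed)
    # Expected count of each category = total * its share of the ratio
    expected = [total * part / sum(ratio) for part in ratio]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return statistic, df, expected


# Failed, 3rd division, 2nd division, 1st division
stat, df, expected = chi_square_goodness_of_fit([220, 170, 90, 20], [4, 3, 2, 1])

critical_value = 7.81  # table value at alpha = 0.05, df = 3
print(round(stat, 3), df, stat > critical_value)
```

The expected counts are 200, 150, 100 and 50, matching the hand calculation, and the statistic is about 23.667.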

Question 3: Information on the tenancy status of 1000 randomly selected fields and on the use of fertilizers on these fields was collected in an agro-economic survey. The following classification was noted:

                         Owned   Rented   Total
Using fertilizers        416     184      600
Not using fertilizers    64      336      400
Total                    480     520      1000

Would you conclude that owner-cultivators are more inclined towards the use of fertilizers, at the 5% level of significance? Carry out a chi-square test as per the testing procedure.
Solution: Let us take the null hypothesis that ownership of fields and the use of fertilizers are independent attributes. Since the contingency table is of size 2 x 2, the degrees of freedom are (2 - 1)(2 - 1) = 1. This implies that only one expected frequency needs to be calculated; the others follow automatically from the row and column totals:

$E_{11} = (600 \times 480)/1000 = 288$

$E_{12} = 600 - 288 = 312$

$E_{21} = 480 - 288 = 192$

$E_{22} = 400 - 192 = 208$

The table of observed and expected frequencies is as follows:

Observed (O)   Expected (E)   (O - E)²   (O - E)²/E
416            288            16,384     56.889
64             192            16,384     85.333
184            312            16,384     52.513
336            208            16,384     78.769
Total                                    273.504

The calculated value χ² = 273.504, with df = (r - 1)(c - 1) = (2 - 1)(2 - 1) = 1, is much more than the table value χ² = 3.84 at the α = 0.05 level of significance. The null hypothesis H0 is rejected. Since the observed number of owned fields using fertilizers (416) is well above the expected number (288), it can be concluded that owner-cultivators are more inclined towards the use of fertilizers.
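
For a 2 x 2 table, the same statistic can be obtained without computing expected frequencies, using the standard shortcut $\chi^2 = \dfrac{N(ad - bc)^2}{(a + b)(c + d)(a + c)(b + d)}$, where a, b, c, d are the four cell counts. This shortcut is not part of the module's procedure; it is shown here only as an independent check on the arithmetic, and the function name `chi_square_2x2` is hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square statistic for the 2 x 2 table [[a, b], [c, d]] via the shortcut formula."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Rows: using / not using fertilizers; columns: owned / rented
stat = chi_square_2x2(416, 184, 64, 336)

critical_value = 3.84  # table value at alpha = 0.05, df = 1
print(round(stat, 3), stat > critical_value)
```

The shortcut gives approximately 273.504, which agrees with the sum of the four (O - E)²/E terms computed from the counts.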

Summary

Tests fall into two main categories: parametric and non-parametric. Three parametric tests, the t test, z test and F test, are used to estimate and test population parameters. The prerequisites for applying these tests are that the data are measured on an interval or ratio scale, that the hypothesis concerns a specific parameter, that the population is assumed to be normal, and that it is clear whether or not the standard deviation is known.
When these conditions are not met, non-parametric (distribution-free) tests are applied. These tests are easy to apply and can be calculated from nominal or ordinal data. They provide broad conclusions with approximate solutions and do not require a normally distributed population. The chi-square test is one of the non-parametric tests used to test hypotheses. The chi-square test for independence is applied when there are two categorical variables from a single population. It is used to determine whether there is a significant association between the two variables.
