
Hypothesis Testing Using the Chi-Square (χ²) Distribution

The document provides information on the chi-square test, including: 1) The chi-square test is a nonparametric test that does not require assumptions about population parameters. It is used to analyze frequency data organized into categories. 2) Some key applications of the chi-square test are goodness of fit tests, tests of independence, and tests of equality of proportions. 3) Performing a chi-square test involves stating hypotheses, determining degrees of freedom, calculating the test statistic, and deciding whether to reject the null hypothesis based on comparing the test statistic to the critical value.

Uploaded by

Abdii Dhufeera

Chapter 4

Hypothesis Testing Using the Chi-Square (χ²) Distribution

Nonparametric Tests
The term "non-parametric" refers to the fact that
the chi-square tests do not require assumptions
about population parameters nor do they test
hypotheses about population parameters.
Other hypothesis tests, such as the t tests and
analysis of variance, are parametric tests and
they do include assumptions about parameters
and hypotheses about parameters.

Normal distribution assumption
Symmetric (not asymmetric)

Nonparametric Tests (cont.)
The most obvious difference between the
chi-square tests and the other hypothesis
tests we have considered (t and ANOVA)
is the nature of the data.
For chi-square, the data are frequencies
rather than numerical scores.
If a variable is measured on a higher scale, it can
always be converted into categories
(e.g., gender, marital status, age, income, etc.).
Important
The chi-square test can only be used on
data that have the following characteristics:
The data must be in the form of frequencies.
The frequency data must have a precise numerical value and must be
organised into categories or groups.
The expected frequency in any one cell of the table must be at least 5.
The total number of observations must be greater than 20.
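The characteristics listed above can be checked before a test is run. The following is a minimal sketch in Python, not part of the original slides: the function name `check_chi_square_conditions`, its parameters, and the sample counts are hypothetical, and it uses the "at least 5" form of the expected-frequency rule that the chapter states for the independence test.

```python
def check_chi_square_conditions(observed, expected, min_expected=5, min_total=20):
    """Return a list of rule-of-thumb violations (an empty list means none found)."""
    problems = []
    # Frequencies are counts, so they must be non-negative whole numbers.
    if any(o < 0 or o != int(o) for o in observed):
        problems.append("observed values must be non-negative counts")
    # Every expected cell frequency should reach the minimum.
    if any(e < min_expected for e in expected):
        problems.append("an expected frequency is below %s" % min_expected)
    # The total number of observations should exceed the minimum.
    if sum(observed) <= min_total:
        problems.append("total observations must exceed %s" % min_total)
    return problems


# Hypothetical counts, invented only to exercise the checks.
ok_case = check_chi_square_conditions([12, 9, 14], [11.67, 11.67, 11.66])
bad_case = check_chi_square_conditions([3, 2, 4], [3.0, 3.0, 3.0])
print(ok_case)
print(bad_case)
```

The thresholds are keyword arguments so that a stricter or looser rule of thumb can be substituted without changing the function.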
Application of chi square
There are many applications of the chi-square
test. Some of them are:
1. Chi-square test for goodness of fit
2. Chi-square test for the independence of
two variables
3. Chi-square test for the equality of more
than two proportions

The Chi-Square Test for
Goodness-of-Fit
The chi-square goodness-of-fit test is a non-parametric
test used to find out whether the
observed values of a given phenomenon
differ significantly from the expected values.
Each individual in the sample is classified into
one category on the scale of measurement (e.g.,
gender: male/female; marital status: married/unmarried).
The data, called observed frequencies, simply
count how many individuals from the sample
are in each category.
Hypothesis Testing with Chi-Square
Chi-square follows five steps:
1. Making assumptions (random sampling)
2. Stating the research and null hypotheses and
selecting alpha
3. Selecting the sampling distribution and
specifying the test statistic
4. Computing the test statistic
5. Making a decision and interpreting the results
Step 1. The Assumptions
The chi-square test requires no
assumptions about the shape of the
population distribution from which the
sample was drawn.
However, like all inferential techniques, it
assumes random sampling.
It can be applied to variables measured at
a nominal and/or an ordinal level of
measurement.
Stating Research and Null Hypotheses
The research hypothesis (H1) proposes that
the two variables are related in the population.
The null hypothesis (H0) states that no
association exists between the two cross-tabulated
variables in the population, and
therefore the variables are statistically
independent.
Formula

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

χ² = the value of chi square
O = the observed value
E = the expected value
Σ (O - E)²/E = each squared difference (O - E)² divided by its
expected value E, then all of these added together
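The formula translates directly into a few lines of code. The sketch below is an illustration only: the function name `chi_square_statistic` and the die-roll counts are hypothetical and do not come from the slides.

```python
def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 60 rolls of a die, so 10 rolls are expected per face.
observed = [8, 12, 9, 11, 10, 10]
expected = [10] * 6
chi2 = chi_square_statistic(observed, expected)
print(chi2)  # (4 + 4 + 1 + 1 + 0 + 0) / 10 = 1.0
```

Each term is divided by its own expected value before summing, which is why a deviation of the same size counts for more in a cell with a small expected frequency.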
Example
Genetic theory states that children having one
parent of blood type A and the other of blood
type B will always be of one of three types, A,
AB, B, and that the proportions of the three types will
on average be 1 : 2 : 1.
A report states that out of 300 children having
one A parent and one B parent, 30 per cent were
found to be type A, 45 per cent type
AB and the remainder type B.
Test the hypothesis using the χ² test.

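Soln.
Given information
The observed frequencies of types A, AB and B
given in the question are 90, 135 and 75
respectively.
The expected frequencies of types A, AB and B
(as per the genetic theory) should have been
75, 150 and 75 respectively (1 : 2 : 1).

We now calculate the value of χ² as follows:

$$\chi^2 = \frac{(90 - 75)^2}{75} + \frac{(135 - 150)^2}{150} + \frac{(75 - 75)^2}{75} = 3 + 1.5 + 0 = 4.5$$

With three categories, d.f. = 3 - 1 = 2. Assuming the 5 per cent
level of significance, the table value of χ² for 2 d.f. is 5.991.
The calculated value 4.5 is smaller than the table value, so the
null hypothesis is not rejected: the report is consistent with
the genetic theory.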
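The blood-type example can be reproduced in a short script. This is a sketch under stated assumptions: the variable names are illustrative, and the 5 per cent significance level (critical value 5.991 for 2 degrees of freedom, a standard table value) is assumed because the question itself does not state a level.

```python
total = 300
observed = {"A": 90, "AB": 135, "B": 75}  # 30%, 45% and the remaining 25% of 300
ratio = {"A": 1, "AB": 2, "B": 1}         # genetic theory: 1 : 2 : 1

# Expected frequencies split the total in proportion to the theoretical ratio.
ratio_sum = sum(ratio.values())
expected = {k: total * r / ratio_sum for k, r in ratio.items()}

chi2 = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)
df = len(observed) - 1       # categories minus one
critical_value = 5.991       # df = 2, alpha = 0.05 (alpha is an assumption here)
reject_h0 = chi2 > critical_value

print(expected)
print(chi2, df, reject_h0)
```

Because the calculated value does not exceed the critical value, `reject_h0` is false and the observed proportions are treated as compatible with the 1 : 2 : 1 ratio.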
The Chi-Square Test for
Independence
The χ² test enables us to determine whether or not two
attributes are associated.
For instance, we may be interested in knowing
whether a new medicine is effective in
controlling fever or not; the χ² test will help us in
deciding this issue.
In such a situation, we proceed with the null
hypothesis that the two attributes (viz., new
medicine and control of fever) are independent,
which means that the new medicine is not effective in
controlling fever.
Cont.
A chi-square independence test is used to test the
independence of two variables. Using a chi-square
test, you can determine whether the occurrence of
one variable affects the probability of the occurrence
of the other variable.
For the chi-square independence test to be used, the
following must be true.
1. The observed frequencies must be obtained by
using a random sample.
2. Each expected frequency must be greater than or
equal to 5.
Chi-Square Independence Test
Performing a Chi-Square Independence Test

In Words                                     In Symbols
1. Identify the claim. State the null        State H0 and Ha.
   and alternative hypotheses.
2. Specify the level of significance.        Identify α.
3. Identify the degrees of freedom.          d.f. = (r - 1)(c - 1)
4. Determine the critical value.             Use the χ² table value.
5. Determine the rejection region.
6. Calculate the test statistic.             χ² = Σ (O - E)² / E
7. Make a decision to reject or fail to      If χ² is in the rejection region,
   reject the null hypothesis.               reject H0. Otherwise, fail to
                                             reject H0.
8. Interpret the decision in the context
   of the original claim.
Example:
The following contingency table shows a random
sample of 321 seriously injured passenger vehicle
drivers by age and gender. The expected
frequencies are displayed in parentheses. At α =
0.05, can you conclude that the drivers' ages are
related to gender in such accidents?

                                  Age
Gender   16-20     21-30     31-40     41-50     51-60     61 and older   Total
Male     32        51        52        43        28        10             216
         (30.28)   (49.12)   (57.20)   (43.07)   (25.57)   (10.77)
Female   13        22        33        21        10        6              105
         (14.72)   (23.88)   (27.80)   (20.93)   (12.43)   (5.23)
Total    45        73        85        64        38        16             321
Example continued:
Because each expected frequency is at least 5 and the
drivers were randomly selected, the chi-square
independence test can be used to test whether the
variables are independent.
H0: The drivers' ages are independent of gender.
Ha: The drivers' ages are dependent on gender. (Claim)
d.f. = (r - 1)(c - 1) = (2 - 1)(6 - 1) = (1)(5) = 5
With d.f. = 5 and α = 0.05, the critical value is χ²₀ = 11.071.

O     E        O - E    (O - E)²   (O - E)²/E
32    30.28     1.72     2.9584     0.0977
51    49.12     1.88     3.5344     0.072
52    57.20    -5.2     27.04       0.4727
43    43.07    -0.07     0.0049     0.0001
28    25.57     2.43     5.9049     0.2309
10    10.77    -0.77     0.5929     0.0551
13    14.72    -1.72     2.9584     0.201
22    23.88    -1.88     3.5344     0.148
33    27.80     5.2     27.04       0.9727
21    20.93     0.07     0.0049     0.0002
10    12.43    -2.43     5.9049     0.4751
6     5.23      0.77     0.5929     0.1134

$$\chi^2 = \sum \frac{(O - E)^2}{E} \approx 2.84$$

Rejection region: χ² > χ²₀ = 11.071 (area 0.05 in the right tail).
χ² ≈ 2.84 is not in the rejection region. Fail to reject H0.

There is not enough evidence at the 5% level to
conclude that age is dependent on gender in such
accidents.
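The driver example can be checked by recomputing the expected frequencies from the row and column totals. The sketch below is illustrative only: the variable names are assumptions, and the critical value 11.071 is the table value quoted in the example for d.f. = 5 and α = 0.05.

```python
observed = [
    [32, 51, 52, 43, 28, 10],  # male drivers, by age group
    [13, 22, 33, 21, 10, 6],   # female drivers, by age group
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency of a cell = (row total * column total) / sample size.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)
critical_value = 11.071
reject_h0 = chi2 > critical_value

print(row_totals, col_totals, n)
print(round(chi2, 2), df, reject_h0)
```

Working from unrounded expected frequencies gives a statistic that agrees with the tabulated 2.84 to within rounding, so the decision is unchanged.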
Example 2
Question: Are the homicide rate and volume of gun
sales related for a sample of 25 cities?

                 HOMICIDE RATE
GUN SALES     Low     High     Totals
High          8       5        13
Low           4       8        12
Totals        12      13       N = 25
Solution Using 5-Step Method
Step 1 Make Assumptions and Meet Test
Requirements
Independent random samples
Level of measurement is nominal
Note that no assumption is made about the
shape of the population distribution. When the
distribution is normal, a parametric test (Z- or
t-test, ANOVA) can be used.
The chi-square test is non-parametric. It can
be used when normality is not assumed.
Step 2 State the Null and Alternate
Hypotheses
H0: The variables are independent
You can also say: H0: fo = fe
H1: The variables are dependent
Or: H1: fo ≠ fe
Step 3 Select the Sampling Distribution
and Establish the Critical Region
Because normality is not assumed and our
data are in tabular form, our Sampling
Distribution = χ²
Alpha = .05
df = (r-1)(c-1) = 1
χ² (critical) = 3.841
Step 4 Calculate the Test Statistic
Formula:

$$\chi^2(\text{obtained}) = \sum \frac{(f_o - f_e)^2}{f_e}$$

Method:
1. Find expected frequencies for each cell.
2. Complete computational table to find χ² (obtained)

1. Find expected frequencies for each cell.

To find fe:

$$f_e = \frac{\text{row marginal} \times \text{column marginal}}{N}$$

Multiply column and row marginals for each cell and
divide by N.
(13*12)/25 = 156/25 = 6.24
(13*13)/25 = 169/25 = 6.76
(12*12)/25 = 144/25 = 5.76
(12*13)/25 = 156/25 = 6.24
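The expected-frequency rule (row marginal times column marginal, divided by N) can be written as a small helper. This is a sketch for illustration: the function name `expected_frequencies` and the variable names are hypothetical, and the counts are the observed gun sales and homicide rate frequencies from this example.

```python
def expected_frequencies(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Rows: gun sales High, Low. Columns: homicide rate Low, High.
gun_sales_by_homicide = [
    [8, 5],
    [4, 8],
]
fe = expected_frequencies(gun_sales_by_homicide)
for row in fe:
    print([round(value, 2) for value in row])
```

The expected frequencies in each row and column add up to the same marginal totals as the observed frequencies, which is a quick check on the arithmetic.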
Observed and Expected Frequencies
for each cell (Note that totals are
unchanged):

                 HOMICIDE RATE
GUN SALES     Low            High           Total
High          fo = 8         fo = 5         13
              fe = 6.24      fe = 6.76
Low           fo = 4         fo = 8         12
              fe = 5.76      fe = 6.24
Total         12             13             N = 25
2. Complete Computational Table

A table like this will help organize the computations:


(a) Add values for fo and fe for each cell to table.

         fo     fe      fo - fe   (fo - fe)²   (fo - fe)²/fe
         8      6.24
         5      6.76
         4      5.76
         8      6.24
Total    25     25
Computational Table (cont.)
(b) Subtract each fe from each fo. The total of this
column must be zero.

         fo     fe      fo - fe   (fo - fe)²   (fo - fe)²/fe
         8      6.24     1.76
         5      6.76    -1.76
         4      5.76    -1.76
         8      6.24     1.76
Total    25     25       0
Computational Table (cont.)
(c) Square each of these values

         fo     fe      fo - fe   (fo - fe)²   (fo - fe)²/fe
         8      6.24     1.76      3.10
         5      6.76    -1.76      3.10
         4      5.76    -1.76      3.10
         8      6.24     1.76      3.10
Total    25     25       0
Computational Table (cont.)
(d) Divide each of the squared values by the fe for that cell.
(e) The sum of this column is chi square.

         fo     fe      fo - fe   (fo - fe)²   (fo - fe)²/fe
         8      6.24     1.76      3.10         .50
         5      6.76    -1.76      3.10         .46
         4      5.76    -1.76      3.10         .54
         8      6.24     1.76      3.10         .50
Total    25     25       0                      χ² = 2.00

χ² (obtained) = 2.00
Step 5 Make a Decision and Interpret the
Results of the Test

χ² (critical) = 3.841
χ² (obtained) = 2.00
Because 2.00 < 3.841, the test statistic is not in the Critical
Region. Fail to reject the H0.
There is no significant relationship
between homicide rate and gun sales.
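The computational table and the decision for the gun sales example can be reproduced as follows. This is a sketch with illustrative variable names. The unrounded sum comes to about 1.99; the 2.00 in the table results from adding the column after rounding each entry to two decimals, and the decision is the same either way.

```python
fo = [8, 5, 4, 8]              # observed frequencies, cell by cell
fe = [6.24, 6.76, 5.76, 6.24]  # expected frequencies from Step 4

rows = []
for o, e in zip(fo, fe):
    diff = o - e
    rows.append((o, e, diff, diff ** 2, diff ** 2 / e))

diff_total = sum(r[2] for r in rows)     # should be zero, apart from float error
chi2_obtained = sum(r[4] for r in rows)
chi2_critical = 3.841                    # df = 1, alpha = .05
reject_h0 = chi2_obtained > chi2_critical

for o, e, diff, sq, term in rows:
    print(o, e, round(diff, 2), round(sq, 2), round(term, 2))
print(round(chi2_obtained, 2), reject_h0)
```

Checking that the fo - fe column sums to zero, as the slides recommend, catches most errors in the expected frequencies before the statistic is computed.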
The End

THANK YOU
