Week 16 - Testing For Independence - Pearson Chi-Square Test

Uploaded by

bonolobadire447
Week 16

Hypothesis Testing: Testing for Independence
Objectives

By the end of this lesson, you must be able to:
• Use the Pearson chi-square distribution to test for independence.
Definition

A test of independence tests the null hypothesis that, in a contingency table, the row and column variables are independent.
Pearson chi square

We use the Pearson chi-square test to conduct significance tests in 2x2 tables or larger contingency tables, that is, to test whether the observed differences are statistically significant.
Definition

• A contingency table (or two-way frequency table) is a table in which frequencies correspond to two variables.
• One variable is used to categorize rows, and a second variable is used to categorize columns.
• Contingency tables have at least two rows and at least two columns.
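
As a rough illustration of how such a table arises from raw data, the sketch below counts (row category, column category) pairs. The records and the helper name contingency_table are made up for this illustration and are not part of the lesson's data.

```python
from collections import Counter

# Hypothetical raw observations: one (row category, column category) pair per person.
records = [
    ("smoker", "male"), ("smoker", "female"), ("non-smoker", "male"),
    ("non-smoker", "female"), ("smoker", "male"), ("non-smoker", "female"),
]

def contingency_table(pairs):
    """Count how often each (row, column) combination occurs."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})   # row labels, alphabetical
    cols = sorted({c for _, c in pairs})   # column labels, alphabetical
    table = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, table

rows, cols, table = contingency_table(records)
print(cols)
for label, row in zip(rows, table):
    print(label, row)
```

Each inner list is one row of frequencies; labels are kept separately so that they are never counted as data.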
Notation
O represents the observed frequency in a cell of a contingency table.
E represents the expected frequency in a cell, found by assuming that the row and column variables are independent.
r represents the number of rows in a contingency table (not including labels).
c represents the number of columns in a contingency table (not including labels).
Hypotheses and Test Statistic

H0: The row and column variables are independent.
H1: The row and column variables are dependent.

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

$$E = \frac{(\text{row total})(\text{column total})}{(\text{grand total})}$$

• O is the observed frequency in a cell and E is the expected frequency in a cell.
• ALWAYS A RIGHT-TAILED TEST
Pearson Chi square
Distribution
• It is a measure of the difference between actual (observed) and expected frequencies.
• Helps us understand the relationship between two categorical variables.
• Involves the frequency of events: observed vs. expected.
• Helps answer the question of whether the differences are due to chance or to some other important phenomenon.
Pearson Chi square
Distribution
• The expected frequencies are what we would see if there were no difference between the sets of results (the null hypothesis). If the observed frequencies matched them exactly, the chi-square value would be zero.
• The larger the observed difference between the sets of results, the greater the chi-square value.
• However, it is difficult to interpret the chi-square value by itself, as it depends on the number of categories studied (through the degrees of freedom).
• We use critical values and P-values to interpret the chi-square value.
Degrees of freedom

• The degrees of freedom for a contingency table are calculated by:
• df = (R-1) x (C-1), where R = number of rows and C = number of columns

NOTE: *don't count the column/row totals and labels*
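
A minimal sketch of this calculation, assuming the table is stored as a list of rows holding only the cell counts (no totals, no labels). The function name and the example tables are illustrative only.

```python
def degrees_of_freedom(table):
    """Return (R - 1) * (C - 1) for a table given as a list of rows of counts."""
    n_rows = len(table)
    n_cols = len(table[0])
    if n_rows < 2 or n_cols < 2:
        raise ValueError("a contingency table needs at least two rows and two columns")
    return (n_rows - 1) * (n_cols - 1)

# Hypothetical tables: only their shapes matter here.
two_by_two = [[5, 7], [9, 3]]
three_by_four = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]

print(degrees_of_freedom(two_by_two))     # (2-1)*(2-1)
print(degrees_of_freedom(three_by_four))  # (3-1)*(4-1)
```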
Expected Frequencies

The expected frequency for each cell is calculated by:

$$E = \frac{(\text{row total})(\text{column total})}{(\text{grand total})}$$
Example 1

Is there an association between smoking status and gender? Use the chi-square test for independence.
- State your null and alternative hypotheses.

             MALE   FEMALE
SMOKER        14      11
NON-SMOKER    17      19
Example 1

Assuming that there is independence (no association), the expected values would be:

             MALE     FEMALE   TOTAL
SMOKER       12.705   12.295   25
NON-SMOKER   18.295   17.705   36
TOTAL        31       30       61
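
The expected values can be checked with a short sketch that applies (row total)(column total)/(grand total) to every cell of the observed table from Example 1. The function name expected_frequencies is an illustrative choice, not something defined in the lesson.

```python
def expected_frequencies(observed):
    """Expected count per cell under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Observed counts from Example 1: rows = smoker, non-smoker; columns = male, female.
observed = [[14, 11], [17, 19]]

expected = expected_frequencies(observed)
for row in expected:
    print([round(e, 3) for e in row])
```

Note that the expected values in each row and column add up to the same totals as the observed values (25 and 36 for the rows, 31 and 30 for the columns).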
Example 1

From the observed and expected values above, find the chi-square value (test statistic). It's easier to do so by using a table:

O       E         O-E       (O-E)^2/E
14      12.705     1.295    0.132
11      12.295    -1.295    0.136
17      18.295    -1.295    0.0917
19      17.705     1.295    0.0947
TOTAL
61      61         0        0.455

$\chi^2 \approx 0.455$
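
The table can be reproduced with the sketch below, which keeps the expected values unrounded (written as fractions of the totals 25, 36, 31, 30 and 61 from Example 1). Without intermediate rounding the total comes to about 0.455. Variable names are illustrative.

```python
# Observed counts from Example 1, cell by cell.
observed = [14, 11, 17, 19]

# Expected counts under independence: (row total)(column total)/(grand total).
expected = [25 * 31 / 61, 25 * 30 / 61, 36 * 31 / 61, 36 * 30 / 61]

# Contribution of each cell: (O - E)^2 / E.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(contributions)

for o, e, term in zip(observed, expected, contributions):
    print(o, round(e, 3), round(o - e, 3), round(term, 4))
print("chi-square:", round(chi_square, 3))
```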
Example 1

• Calculate the degrees of freedom.
df = (R-1) x (C-1) = (2-1)(2-1) = 1

• Find the P-value.
P-value = 0.5

• Do you reject or fail to reject the null hypothesis?
Fail to reject the null hypothesis, because the P-value of 0.5 is larger than any commonly used significance level (for example, 0.05).

• State your conclusion.
There is not sufficient evidence to conclude that smoking status and gender are dependent. OR
In this data, there is no significant association between whether someone smokes and their gender.
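
The P-value for this example can be checked with a small sketch. It relies on the fact that, with 1 degree of freedom, the right-tail chi-square probability equals erfc(sqrt(x/2)), so it does not apply to larger tables. The significance level of 0.05 is an assumption, since the example does not state one, and the function name is illustrative.

```python
import math

def p_value_df1(chi_square):
    """Right-tail probability of the chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2))

chi_square = 0.455   # test statistic from Example 1 (unrounded arithmetic)
alpha = 0.05         # assumed significance level, not given in the example

p_value = p_value_df1(chi_square)
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(round(p_value, 2), decision)
```

Because the test is always right-tailed, only large chi-square values lead to rejection; a value as small as the one in Example 1 gives a large P-value.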
END WEEK 16
