STATISTICS Unit 7 Chi-Squared Introduction

Unit 7 introduces the chi-squared test for hypothesis testing, focusing on goodness of fit and dependency between attributes using contingency tables. Key concepts include the calculation of observed and expected values, critical values based on significance levels, and the application of Yates' continuity correction for 2x2 tables. The document also outlines formulas for chi-squared calculations and degrees of freedom relevant to statistical analysis.

CMM Subject Support Strand: STATISTICS Unit 7 Chi-Squared: Introduction

Unit 7 Chi-Squared Introduction

Learning objectives
Here we see how we can test particular hypotheses using the chi-squared test, sometimes referred to as
'goodness of fit' testing. After studying this unit you should
• understand how the chi-squared test can be used to test a uniform distribution, using critical values
• understand how the chi-squared test can be used to test the dependency between different attributes
by using a contingency table.

Notes
The first use of the test was by the English mathematician Karl Pearson (1857-1936) who was working
on correlation and regression problems. He wanted to test his methods using real data and, for
example, used over 4000 trials of roulette and tosses of a coin. Although he published his work in
1900, it was not until many decades later that it was recognised as an important advance in statistical
analysis.

Key points
• In testing whether data fits a particular distribution, using chi-squared, you need to have
observed data, $O_i$, and theoretical (expected) data, $E_i$

• Critical values for the chi-squared test are based on the significance level and number of degrees
of freedom

• For 2 × 2 contingency tables, you need to calculate the chi-squared value using Yates' continuity
correction

• Groups in contingency tables must be combined if expected values are less than 5.

Facts to remember

• The formula for chi-squared ($\chi^2$) is given by

$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$

where $O_i$ are the observed values and $E_i$ are the expected values.

• The number of degrees of freedom, $\nu$, for testing a distribution is given by

$$\nu = \text{number of classes} - \text{number of constraints}$$

• For 2 × 2 contingency tables, Yates' continuity correction is used to calculate $\chi^2$; that is

$$\chi^2 = \sum_{i=1}^{n} \frac{\left(|O_i - E_i| - 0.5\right)^2}{E_i}$$

• For contingency tables, the expected frequencies are calculated by

$$\text{Expected frequency} = \frac{(\text{row total}) \times (\text{column total})}{\text{grand total}}$$

• For h × k contingency tables (h rows and k columns), the number of degrees of freedom, $\nu$, is
given by

$$\nu = (h - 1) \times (k - 1)$$
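
As an illustration of the chi-squared formula and the degrees of freedom for testing a distribution, the following Python sketch tests whether a die is fair (a uniform distribution). The observed counts are hypothetical and are not taken from this unit, the function name `chi_squared` is chosen only for this example, and the 5% critical value for 5 degrees of freedom is copied from standard chi-squared tables.

```python
# Hypothetical data: results of 120 rolls of a die (not from this unit).
observed = [25, 17, 15, 23, 24, 16]


def chi_squared(observed, expected):
    """Return the sum of (O_i - E_i)^2 / E_i over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


total = sum(observed)

# Uniform distribution: every class has the same expected frequency.
expected = [total / len(observed)] * len(observed)

# One constraint: the expected frequencies must add up to the observed total.
degrees_of_freedom = len(observed) - 1

statistic = chi_squared(observed, expected)

# 5% critical value for 5 degrees of freedom, taken from standard tables.
CRITICAL_5_PERCENT_DF5 = 11.070

# The hypothesis of a uniform distribution is rejected only if the
# statistic exceeds the critical value.
reject_uniform = statistic > CRITICAL_5_PERCENT_DF5

print(f"chi-squared = {statistic:.2f}, df = {degrees_of_freedom}, "
      f"reject uniform = {reject_uniform}")
```

With these made-up counts every expected frequency is 20, so no classes need to be combined before the test is applied.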

Glossary of terms
Observed and expected frequencies: observed data, $O_i$, and corresponding theoretical values, $E_i$,
based on the distribution being tested

Chi-squared: distribution based on the chi-squared values

$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$

and the degrees of freedom

Degrees of freedom: calculated from

$$\text{number of groups} - \text{number of constraints}$$

Contingency table: gives the numbers in different categories for 2 factors (that may or may not be
related), e.g.
          French   Russian   Total
Male          39        16      55
Female        21        14      35
Total         60        30      90
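
The 2 × 2 table above can be used to sketch the whole contingency-table calculation: expected frequencies from the row and column totals, the degrees of freedom, and chi-squared with Yates' continuity correction. This is a minimal Python illustration rather than part of the unit. The function name `yates_chi_squared` is chosen only for this example, the correction is applied to the absolute difference between observed and expected values, and the 5% critical value for 1 degree of freedom is copied from standard chi-squared tables.

```python
# Observed frequencies from the contingency table above
# (rows: Male, Female; columns: French, Russian).
observed = [[39, 16],
            [21, 14]]

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency = (row total) x (column total) / (grand total)
expected = [[r * c / grand_total for c in column_totals] for r in row_totals]

# Degrees of freedom for an h x k table: (h - 1) x (k - 1)
degrees_of_freedom = (len(row_totals) - 1) * (len(column_totals) - 1)


def yates_chi_squared(observed, expected):
    """Return the sum of (|O - E| - 0.5)^2 / E over every cell."""
    return sum(
        (abs(o - e) - 0.5) ** 2 / e
        for observed_row, expected_row in zip(observed, expected)
        for o, e in zip(observed_row, expected_row)
    )


statistic = yates_chi_squared(observed, expected)

# 5% critical value for 1 degree of freedom, taken from standard tables.
CRITICAL_5_PERCENT_DF1 = 3.841

# The attributes are judged to be dependent only if the statistic
# exceeds the critical value.
attributes_dependent = statistic > CRITICAL_5_PERCENT_DF1

print(f"chi-squared = {statistic:.3f}, df = {degrees_of_freedom}, "
      f"dependent = {attributes_dependent}")
```

For this table the smallest expected frequency is about 11.7, so no groups need to be combined, and the corrected statistic (about 0.71) is well below the 5% critical value.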
