Assessment in Learning 1 Chi Square

- The chi-square (χ2) statistic is used to test whether observed data matches expected values or a theoretical distribution.
- It measures the difference between observed and expected frequencies across categories. A larger χ2 value indicates a greater difference and less fit to the theoretical distribution.
- The chi-square goodness of fit test was used as an example to evaluate if the distribution of candy flavors across bags matched the expected equal distribution. The χ2 test statistic was larger than the critical value, indicating the observed data did not match the expected equal distribution across flavors.

Uploaded by

Ky Sha

Chi-Square Test

The chi-square (χ2) statistic measures how well a model's expected values agree with the actual
observed data. The data used to calculate the chi-square statistic must be random, raw, mutually
exclusive, drawn from independent observations, and obtained from a sufficiently large sample.
For example, the results of tossing a fair coin meet these criteria.
Chi-square tests are often used to test hypotheses. The chi-square statistic compares the size of
any difference between the expected results and the actual results, given the sample size and
number of variables in the relationship.
For these tests, degrees of freedom are used to determine whether a specific null hypothesis can
be rejected based on the total number of variables and samples within the experiment. As with
any statistic, the larger the sample size, the more reliable the results.
• The chi-square (χ2) statistic is a measure of the difference between the observed and
expected frequency of outcomes of a set of events or variables.
• Chi-square is useful for analyzing such differences in categorical variables, especially
those that are nominal in nature.
• χ2 depends on the size of the difference between the expected and observed values, the
degrees of freedom, and the sample size.
• χ2 can be used to test whether two variables are related or independent of each other.
• It can also be used to test the goodness of fit between an observed distribution and a
theoretical distribution of frequencies.
The Formula for Chi-Square Is

$$\chi^2_c = \sum_i \frac{(O_i - E_i)^2}{E_i}$$

where:
c = Degrees of freedom
O = Observed value(s)
E = Expected value(s)
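
As a minimal sketch of the formula above, the following Python function sums (O − E)² / E over all categories. The function name `chi_square_statistic` and the coin-toss counts (55 heads and 45 tails in 100 tosses) are hypothetical and chosen only for illustration.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 100 tosses of a coin assumed to be fair.
observed = [55, 45]   # heads, tails (made-up counts)
expected = [50, 50]   # a fair coin predicts an even split

statistic = chi_square_statistic(observed, expected)
print(statistic)  # 25/50 + 25/50 = 1.0
```

The function takes counts rather than proportions, so the expected list has to be on the same scale as the observed list.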

Chi-Square Goodness of Fit

The chi-square goodness of fit test is a hypothesis test for a categorical variable. Goodness of
fit is a measure of how well a statistical model fits a set of observations.
When the goodness of fit is high, the expected values based on the model are close to the
observed values. When the goodness of fit is low, the expected values based on the model are far
from the observed values.
Statistical models evaluated by the chi-square goodness of fit test are distributions. They can be
any distribution, from as simple as an equal probability for all groups, to as complex as a
probability distribution with many parameters.
The Chi-square goodness of fit test checks whether your sample data is likely from a certain
theoretical distribution. We have a set of data values, and an idea about how the data values are
distributed. The test gives us a way to decide if the data values have a "good enough" fit with our
idea, or if our idea is questionable.
To apply the goodness of fit test to a data set we need:

• Data values that are a simple random sample from the full population.
• Categorical or nominal data. The Chi-square goodness of fit test is not appropriate for
continuous data.
• A data set that is large enough so that at least five values are expected in each of the
observed data categories.
Chi-square goodness of fit test example
Let's use candy bags as an example. We collect a random sample of ten bags. Each bag has 100
pieces of candy and five flavors. Our hypothesis is that the proportions of the five flavors in each
bag are the same.

Let's start by answering: Is the Chi-square goodness of fit test an appropriate way to evaluate the
distribution of flavors in candy bags?
• We have a simple random sample of 10 bags of candy. We meet this need.
• Our categorical variable is candy flavors. We have a number of each flavor in 10 bags of
candy. We meet this need.
• Each bag has 100 pieces of candy. Each bag has five flavors of candy. We expect to have
equal numbers for each flavor. This means we expect 100 / 5 = 20 pieces of candy in
each flavor from each bag. For the 10 bags in our sample, we expect 10 x 20 = 200 pieces
of candy in each flavor. This is more than the requirement of five expected values in each
category.
Based on the answers above, yes, the Chi-square goodness of fit test is an appropriate method to
evaluate the distribution of flavors in candy bags.
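
The expected-count check described above can be sketched in a few lines of Python. The variable names are illustrative, and the threshold of five is the rule of thumb stated in the requirements list.

```python
bags = 10
pieces_per_bag = 100
flavors = ["Apple", "Lime", "Cherry", "Orange", "Grapes"]

# Under equal proportions, each flavor gets the same share of every bag.
expected_per_bag = pieces_per_bag / len(flavors)   # 100 / 5 = 20
expected_per_flavor = bags * expected_per_bag      # 10 * 20 = 200

# Rule of thumb from the text: at least five expected values per category.
MIN_EXPECTED = 5
meets_requirement = expected_per_flavor >= MIN_EXPECTED

print(expected_per_flavor, meets_requirement)  # 200.0 True
```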
Let’s start by listing what we expect if each bag has the same number of pieces for each flavor.
Above, we calculated this as 200 for 10 bags of candy.
Comparison of actual vs. expected number of pieces of each flavor of candy

Flavor   Observed Pieces (10 bags)   Expected Pieces
Apple    180                         200
Lime     250                         200
Cherry   120                         200
Orange   225                         200
Grapes   225                         200
Now, we see a difference between what we observed in our data and what we expected.
Difference between observed and expected pieces of candy by flavor

Flavor   Observed Pieces (10 bags)   Expected Pieces   Observed - Expected
Apple    180                         200               180 - 200 = -20
Lime     250                         200               250 - 200 = 50
Cherry   120                         200               120 - 200 = -80
Orange   225                         200               225 - 200 = 25
Grapes   225                         200               225 - 200 = 25
Some of the differences are positive and some are negative. If we just add them, we get zero.
Instead, we square the differences. It gives equal importance to candy flavors with fewer pieces
than expected, and flavors with more pieces than expected.
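
A short Python sketch, using the counts from the candy example, shows why the raw differences are not useful on their own: they cancel out, while their squares do not.

```python
observed = {"Apple": 180, "Lime": 250, "Cherry": 120, "Orange": 225, "Grapes": 225}
expected = 200  # expected pieces per flavor across the 10 bags

# Signed differences: negative means fewer pieces than expected.
differences = {flavor: count - expected for flavor, count in observed.items()}

# Squaring removes the sign, so shortfalls and surpluses both count.
squared = {flavor: diff ** 2 for flavor, diff in differences.items()}

print(sum(differences.values()))             # 0
print(squared["Apple"], squared["Cherry"])   # 400 6400
```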
Calculation of the squared difference between Observed and Expected for each flavor of candy

Flavor   Observed Pieces (10 bags)   Expected Pieces   Observed - Expected   Squared Difference
Apple    180                         200               180 - 200 = -20       400
Lime     250                         200               250 - 200 = 50        2500
Cherry   120                         200               120 - 200 = -80       6400
Orange   225                         200               225 - 200 = 25        625
Grapes   225                         200               225 - 200 = 25        625
Next, we divide the squared difference by the expected number:
Calculation of the squared difference / expected number of pieces of candy per flavor

Flavor   Observed Pieces (10 bags)   Expected Pieces   Observed - Expected   Squared Difference   Squared Difference / Expected
Apple    180                         200               180 - 200 = -20       400                  400 / 200 = 2
Lime     250                         200               250 - 200 = 50        2500                 2500 / 200 = 12.5
Cherry   120                         200               120 - 200 = -80       6400                 6400 / 200 = 32
Orange   225                         200               225 - 200 = 25        625                  625 / 200 = 3.125
Grapes   225                         200               225 - 200 = 25        625                  625 / 200 = 3.125
Finally, we add the numbers in the final column to calculate our test statistic:
2+12.5+32+3.125+3.125=52.75
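
The whole calculation can be reproduced with a short Python sketch. It uses only the counts from the candy example; the variable names are illustrative.

```python
observed = {"Apple": 180, "Lime": 250, "Cherry": 120, "Orange": 225, "Grapes": 225}
expected = 200  # expected pieces per flavor across the 10 bags

# Each flavor contributes (observed - expected)^2 / expected.
contributions = {
    flavor: (count - expected) ** 2 / expected
    for flavor, count in observed.items()
}

for flavor, value in contributions.items():
    print(f"{flavor:<7}{value:>8.3f}")

# The test statistic is the sum of the per-flavor contributions.
test_statistic = sum(contributions.values())
print(test_statistic)  # 52.75
```

Cherry alone contributes 32 of the 52.75, so most of the lack of fit comes from that one flavor.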

To draw a conclusion, we compare the test statistic to a critical value from the Chi-Square
distribution. This activity involves four steps:

1. We first decide on the risk we are willing to take of drawing an incorrect conclusion
based on our sample observations. For the candy data, we decide prior to collecting
data that we are willing to take a 5% risk of concluding that the flavor counts in each
bag across the full population are not equal when they really are. In statistics-speak,
we set the significance level, α , to 0.05.
2. We calculate a test statistic. Our test statistic is 52.75.
3. We find the critical value from the Chi-square distribution based on our
significance level. If the bags really contain the same number of pieces of candy
for each flavor, the test statistic would exceed this value only 5% of the time.

In addition to the significance level, we also need the degrees of freedom to find this
value. For the goodness of fit test, this is one fewer than the number of categories. We
have five flavors of candy, so we have 5 – 1 = 4 degrees of freedom.

The Chi-square value with α = 0.05 and 4 degrees of freedom is 9.488.

4. We compare the value of our test statistic (52.75) to the Chi-square value. Since
52.75 > 9.488, we reject the null hypothesis that the proportions of flavors of candy
are equal.
We make a practical conclusion that bags of candy across the full population do not have
an equal number of pieces for the five flavors. This makes sense if you look at the
original data. If your favorite flavor is Lime, you are likely to have more of your favorite
flavor than the other flavors. If your favorite flavor is Cherry, you are likely to be
unhappy because there will be fewer pieces of Cherry candy than you expect.
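
The four steps above can be sketched as a small Python helper. The lookup table holds standard tabulated critical values for α = 0.05 only, and the function name `reject_null` is hypothetical; a real analysis would normally take critical values from a statistics library or a full table.

```python
# Standard tabulated chi-square critical values for alpha = 0.05,
# keyed by degrees of freedom (only 1 to 5 are listed here).
CRITICAL_VALUES_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def reject_null(test_statistic, categories):
    """Return True if the statistic exceeds the alpha = 0.05 critical value."""
    degrees_of_freedom = categories - 1  # goodness of fit: categories minus one
    return test_statistic > CRITICAL_VALUES_05[degrees_of_freedom]


# Candy example: five flavors, test statistic 52.75.
print(reject_null(52.75, categories=5))  # True, since 52.75 > 9.488
```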

[Figure: Bar chart comparing actual vs. expected counts of candy by flavor (Apple, Lime, Cherry, Orange, Grapes); series: Expected, Actual]

Statistical details
Let’s look at the candy data and the Chi-square test for goodness of fit using statistical terms.
This test is also known as Pearson’s Chi-square test.
Our null hypothesis is that the proportion of flavors in each bag is the same. We have five
flavors. The null hypothesis is written as:
$$H_0: p_1 = p_2 = p_3 = p_4 = p_5$$
The formula above uses p for the proportion of each flavor. If each 100-piece bag contains equal
numbers of pieces of candy for each of the five flavors, then the bag contains 20 pieces of each
flavor. The proportion of each flavor is 20 / 100 = 0.2.
The alternative hypothesis is that at least one of the proportions is different from the others. This
is written as:
$$H_a: \text{at least one } p_i \text{ not equal}$$
In some cases, we are not testing for equal proportions. Taking an example of children's
sports teams with three categories and hypothesized proportions of 0.2, 0.65 and 0.15, our null
and alternative hypotheses are:
$$H_0: p_1 = 0.2,\ p_2 = 0.65,\ p_3 = 0.15$$
$$H_a: \text{at least one } p_i \text{ not equal to its expected value}$$
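
When the hypothesized proportions are unequal, the expected counts are simply the total sample size times each proportion. The sketch below uses the three proportions from the hypotheses above; the total of 100 observations and the function name `expected_counts` are hypothetical, since the source gives no sample size for this case.

```python
def expected_counts(total, proportions):
    """Scale hypothesized proportions to expected counts for a sample of size total."""
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    return [total * p for p in proportions]


proportions = [0.2, 0.65, 0.15]   # from the null hypothesis
total = 100                       # hypothetical sample size

expected = expected_counts(total, proportions)
print([round(e, 2) for e in expected])

# Same rule of thumb as before: every expected count should be at least five.
print(min(expected) >= 5)
```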
Unlike other hypotheses that involve a single population parameter, we cannot use just a
formula. We need to use words as well as symbols to describe our hypotheses.
We calculate the test statistic using the formula below:

$$\sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$
In the formula above, we have n groups. The ∑ symbol means to add up the calculations for
each group. For each group, we do the same steps as in the candy example. The formula
shows O_i as the Observed value and E_i as the Expected value for a group.
We then compare the test statistic to a Chi-square value with our chosen significance level (also
called the alpha level) and the degrees of freedom for our data. Using the candy data as an
example, we set α = 0.05 and have four degrees of freedom. For the candy data, the Chi-square
value is written as:
$$\chi^2_{0.05,\,4}$$
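
The Chi-square value for a given α can also be found numerically instead of from a table. The sketch below is one possible approach, not the method used in the source: it relies on the closed-form upper-tail probability that holds when the degrees of freedom are even, and bisects for the point where that probability equals α. The function names are hypothetical.

```python
import math


def chi2_sf_even_df(x, df):
    """P(X > x) for a chi-square variable with a positive, even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    half = x / 2
    term, total = 1.0, 1.0
    # Sum (x/2)^j / j! for j = 0 .. df/2 - 1.
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def critical_value(alpha, df, lo=0.0, hi=200.0):
    """Bisect for x with P(X > x) = alpha; the tail probability falls as x grows."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_sf_even_df(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(round(critical_value(0.05, 4), 3))  # 9.488
```

For odd degrees of freedom the closed form does not apply, and a statistics library or a printed table would be needed.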

There are two possible results from our comparison:

• The test statistic is lower than the Chi-square value. You fail to reject the hypothesis of
equal proportions. The data are consistent with bags of candy across the entire population
having the same number of pieces of each flavor in them. The fit of equal proportions is “good
enough.”
• The test statistic is higher than the Chi-square value. You reject the hypothesis of equal
proportions. You cannot conclude that the bags of candy have the same number of pieces
of each flavor. The fit of equal proportions is “not good enough.”
Sharmaine A. Mislang

Advanced Educational Statistics

The Chi Square Tests

Goodness of Fit
