Chapter 14: Nonparametric Statistics and The Chi-Square Test of Independence
I. Overview
a. Most statistical procedures in this book assume that the data to be analyzed are normally
distributed.
b. In the real world, data are often not normally distributed, so there is another class
of statistics, called nonparametric statistics, for these situations.
i. Nonparametric statistics do not rely on the assumption of a normal distribution.
c. There is a wide variety of nonparametric statistics. In this chapter, we focus on one:
the chi-square test of independence.
II. The Chi-Square Test of Independence
a. Purpose: To determine whether cases are distributed among the categories of two
variables in the numbers one would expect by chance alone.
c. Procedure: A contingency table is produced that shows how many cases in each group on
one variable fall into each of the categories on the second variable. These numbers are
then compared to the number of cases one would expect to see in each category by
chance alone.
i. e.g., Suppose that in one high school, there are 100 students in eleventh grade.
Half of these students are girls, the other half are boys. Half of these 100
students are taking advanced math, and half are taking basic math.
1. Based on the number of boys and girls in the sample, and the number of
students in each level of math, we would expect there to be 25 students in
each cell of this contingency table, by chance alone. So the numbers in
the table below are the expected values.
        Basic math   Advanced math
Girl        25            25
Boy         25            25
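The expected values above follow from the row and column totals: for each cell, multiply its row total by its column total and divide by the total number of cases (here, 50 × 50 / 100 = 25). A minimal sketch of that arithmetic, assuming Python; the function name `expected_from_margins` is hypothetical and used only for illustration:

```python
def expected_from_margins(row_totals, col_totals):
    """Expected cell counts by chance alone: row total * column total / N."""
    n = sum(row_totals)
    if n != sum(col_totals):
        raise ValueError("row and column totals must describe the same cases")
    return [[r * c / n for c in col_totals] for r in row_totals]

# Totals from the example: 50 girls and 50 boys; 50 students in each math level.
expected = expected_from_margins([50, 50], [50, 50])
print(expected)  # [[25.0, 25.0], [25.0, 25.0]]
```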
ii. Now suppose that I examine the actual number of cases that appear in each cell
of the table. The table below contains the observed number of students in each
cell:
        Basic math   Advanced math
Girl        15            35
Boy         35            15
iii. As we can see, the observed values in the cells look quite different from the
expected values. Advanced math has more girls and fewer boys than we would expect by
chance alone, and basic math has the reverse.
iv. Now we need to determine whether the difference between the observed and
expected values is statistically significant.
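\chi^2 = \sum \frac{(O - E)^2}{E}
where O is the observed value and E is the expected value in a cell, and the sum is taken
over all cells of the contingency table.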
1. If the χ2 value that you calculated is statistically significant, this tells you
that the observed values in your contingency table differ from the
expected values by more than we would expect from chance alone. It does not
tell you exactly what is causing this difference, but you can usually get a
pretty good idea by looking at the values in your contingency table.
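To make the significance check concrete for the math-class example, the sketch below computes the standard Pearson chi-square statistic: for each cell, square the difference between the observed and expected values, divide by the expected value, and add up the results. It assumes Python and uses only the standard library. The names `chi_square`, `df`, and `p_value` are illustrative, and the p-value shortcut via `math.erfc` holds only for 1 degree of freedom, which is the case for a 2 × 2 table.

```python
import math

# Observed and expected counts from the example (rows: girl, boy).
observed = [[15, 35], [35, 15]]
expected = [[25, 25], [25, 25]]

def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over every cell."""
    return sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )

chi2 = chi_square(observed, expected)

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

# For df = 1 only, the chi-square tail probability equals erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(chi2)  # 16.0
print(df)    # 1
print(p_value < 0.001)  # True
```

Each of the four cells contributes (10)² / 25 = 4, so the statistic is 16. With 1 degree of freedom, the critical value at the .05 level is 3.84, so a value of 16 would be statistically significant.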
III. Summary
b. There are several nonparametric statistics researchers can use when their data are not
normally distributed.