Questions - CLT
7.1.2 Introduction
What does it mean to be average? Why are we so concerned with averages? Two reasons are that they give
us a middle ground for comparison and they are easy to calculate. In this chapter, you will study averages
and the Central Limit Theorem.
The Central Limit Theorem (CLT for short) is one of the most powerful and useful ideas in all of statistics.
The theorem has two alternative forms. Both are concerned with drawing finite samples of size n from a
population with a known mean, µ, and a known standard deviation, σ. The first alternative says that if we
collect samples of size n and n is "large enough," calculate each sample's mean, and create a histogram of
those means, then the resulting histogram will tend to have an approximately normal bell shape. The second
alternative says that if we again collect samples of size n that are "large enough," calculate the sum of each
sample, and create a histogram, then the resulting histogram will again tend to have a normal bell shape.
In either case, it does not matter what the distribution of the original population is, and you do not even
need to know it. The important fact is that the sample means (averages) and the sums tend to follow the
normal distribution. You will learn the rest in this chapter.
The size of the sample, n, that is required in order to be "large enough" depends on the original
population from which the samples are drawn. If the original population is far from normal, then more
observations are needed for the sample averages or the sample sums to be approximately normal. Sampling is done with
replacement.
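The two alternatives can be tried out with a short simulation. The following is a minimal sketch, not part of the original exercise: the exponential population, the sample size n = 40, the 5000 repetitions, and the helper name `sample_means_and_sums` are all assumptions chosen for illustration. The population is strongly skewed, yet the sample means and sample sums should center on µ and nµ with spreads near σ/√n and σ√n.

```python
import random
import statistics

def sample_means_and_sums(population_draw, n, num_samples, seed=0):
    """Draw num_samples samples of size n; return the sample means and sample sums."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means, sums = [], []
    for _ in range(num_samples):
        sample = [population_draw(rng) for _ in range(n)]
        total = sum(sample)
        sums.append(total)
        means.append(total / n)
    return means, sums

# Assumed population: exponential with mean 1 and standard deviation 1 (far from normal).
def draw_exponential(rng):
    return rng.expovariate(1.0)

means, sums = sample_means_and_sums(draw_exponential, n=40, num_samples=5000)

print("mean of sample means:", round(statistics.mean(means), 3))
print("sd of sample means:  ", round(statistics.stdev(means), 3))
print("mean of sample sums: ", round(statistics.mean(sums), 3))
print("sd of sample sums:   ", round(statistics.stdev(sums), 3))
```

Plotting a histogram of `means` or `sums` with any plotting tool should show the bell shape described above.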
CHAPTER 7. THE CENTRAL LIMIT THEOREM
Do the following example in class: Suppose 8 of you roll 1 fair die 10 times, 7 of you roll 2 fair dice 10
times, 9 of you roll 5 fair dice 10 times, and 11 of you roll 10 fair dice 10 times. (The 8, 7, 9, and 11 were
randomly chosen.)
Each time a person rolls more than one die, he/she calculates the average of the faces showing. For example,
one person might roll 5 fair dice and get a 2, 2, 3, 4, 6 on one roll.
The average is (2 + 2 + 3 + 4 + 6)/5 = 3.4. The 3.4 is one average when 5 fair dice are rolled. This same person
would roll the 5 dice 9 more times and calculate 9 more averages for a total of 10 averages.
Your instructor will pass out the dice to several people as described above. Roll your dice 10 times. For
each roll, record the faces and find the average. Round to the nearest 0.5.
Your instructor (and possibly you) will produce one graph (it might be a histogram) for 1 die, one graph
for 2 dice, one graph for 5 dice, and one graph for 10 dice. Since the "average" when you roll one die is just
the face on the die, what distribution do these "averages" appear to be representing?
Draw the graph for the averages using 2 dice. Do the averages show any kind of pattern?
Draw the graph for the averages using 5 dice. Do you see any pattern emerging?
Finally, draw the graph for the averages using 10 dice. Do you see any pattern to the graph? What can
you conclude as you increase the number of dice?
As the number of dice rolled increases from 1 to 2 to 5 to 10, the following is happening:
The Central Limit Theorem tells you that as you increase the number of dice, the sample means (averages)
tend toward a normal distribution (the sampling distribution).
If you draw random samples of size n, then as n increases, the random variable $\bar{X}$, which consists of sample
means, tends to be normally distributed and
$$\bar{X} \sim N\left(\mu_X, \frac{\sigma_X}{\sqrt{n}}\right)$$
The Central Limit Theorem for Sample Means (Averages) says that if you keep drawing larger and larger
samples (like rolling 1, 2, 5, and, finally, 10 dice) and calculating their means, the sample means (averages)
form their own normal distribution (the sampling distribution). This normal distribution has the same
mean as the original distribution and a variance that equals the original variance divided by n, the sample
size (so its standard deviation is the original standard deviation divided by the square root of n). Here, n is
the number of values that are averaged together, not the number of times the experiment is done.
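The dice exercise can also be simulated to compare the observed averages with these claims. This is a rough sketch under stated assumptions: the 10,000 repetitions, the seed, and the variable names are illustrative choices, not part of the class activity. Note that `ROLLS` is the number of times the experiment is done, while n is the number of dice averaged together.

```python
import math
import random
import statistics

rng = random.Random(1)   # fixed seed so the run is repeatable
ROLLS = 10000            # number of times the experiment is repeated (this is not n)

# Exact mean and standard deviation of a single fair die.
faces = range(1, 7)
mu = sum(faces) / 6
sigma = math.sqrt(sum((f - mu) ** 2 for f in faces) / 6)

results = {}
for n in (1, 2, 5, 10):  # n = number of dice averaged together on each roll
    averages = [
        sum(rng.randint(1, 6) for _ in range(n)) / n
        for _ in range(ROLLS)
    ]
    results[n] = (statistics.mean(averages), statistics.pstdev(averages))
    print(n, "dice: mean", round(results[n][0], 3),
          "sd", round(results[n][1], 3),
          "theory sd", round(sigma / math.sqrt(n), 3))
```

In each row the mean of the averages should stay close to the single-die mean, while the standard deviation should shrink roughly in proportion to 1/√n.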
2 This content is available online at <http://cnx.org/content/m16947/1.20/>.
The random variable $\bar{X}$ has a different z-score associated with it than the random variable X. $\bar{x}$ is the value
of $\bar{X}$ in one sample.
$$z = \frac{\bar{x} - \mu_X}{\left(\frac{\sigma_X}{\sqrt{n}}\right)} \qquad (7.1)$$
Let $\bar{X}$ = the mean or average of a sample of size 25. Since $\mu_X = 90$, $\sigma_X = 15$, and $n = 25$,
then $\bar{X} \sim N\left(90, \frac{15}{\sqrt{25}}\right)$.
TI-83 or 84: normalcdf(lower value, upper value, mean for averages, stdev for averages)
stdev = standard deviation
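The calculator pattern above (lower value, upper value, mean for averages, standard deviation for averages) can be mirrored in Python. The following is a sketch only: `normal_cdf_between` is a hypothetical helper name, and the bounds 87 and 93 are illustrative values chosen here (one standard deviation of the averages on either side of 90), not values taken from the example.

```python
import math
from statistics import NormalDist

def normal_cdf_between(lower, upper, mean, stdev):
    """P(lower < X < upper) for a normal distribution with the given mean and stdev."""
    dist = NormalDist(mu=mean, sigma=stdev)
    return dist.cdf(upper) - dist.cdf(lower)

mean_of_averages = 90
stdev_of_averages = 15 / math.sqrt(25)  # sigma_X / sqrt(n) = 3

# Illustrative bounds only: one standard deviation either side of the mean.
p = normal_cdf_between(87, 93, mean_of_averages, stdev_of_averages)
print(round(p, 4))
```

The key point is that the fourth argument is the standard deviation of the averages, 15/√25, not the population standard deviation 15.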
Problem 2
Find the average value that is 2 standard deviations above the mean of the averages.
Solution
To find the average value that is 2 standard deviations above the mean of the averages, use the
formula
$$\text{value} = \mu_X + (\#\text{ofSTDEVs}) \left(\frac{\sigma_X}{\sqrt{n}}\right)$$
$$\text{value} = 90 + 2 \cdot \frac{15}{\sqrt{25}} = 96$$
So, the average value that is 2 standard deviations above the mean of the averages is 96.
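As a quick check of this arithmetic, the calculation can be written out in a few lines of Python. This is a minimal sketch; the function names `value_at` and `z_score` are hypothetical and introduced only for illustration. It also confirms that the z-score of the resulting value, computed against the standard deviation of the averages, is 2.

```python
import math

def value_at(num_stdevs, mean, sigma, n):
    """Average value that lies num_stdevs standard deviations (of the averages) from the mean."""
    return mean + num_stdevs * sigma / math.sqrt(n)

def z_score(x_bar, mean, sigma, n):
    """z-score of a sample mean x_bar, using sigma / sqrt(n) as the standard deviation."""
    return (x_bar - mean) / (sigma / math.sqrt(n))

value = value_at(2, 90, 15, 25)
print(value)                        # 96.0
print(z_score(value, 90, 15, 25))   # 2.0
```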
Example 7.2
The length of time, in hours, it takes an "over 40" group of people to play one soccer match is
normally distributed with a mean of 2 hours and a standard deviation of 0.5 hours. A sample of
size n = 50 is drawn randomly from the population.
Problem
Find the probability that the sample mean is between 1.8 hours and 2.3 hours.
Solution
Let X = the time, in hours, it takes to play one soccer match.
The probability question asks you to find a probability for the sample mean or average time, in
hours, it takes to play one soccer match.
Let $\bar{X}$ = the average time, in hours, it takes to play one soccer match.
The probability that the sample mean is between 1.8 hours and 2.3 hours is ______.
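One way to check your answer is to compute the probability directly. The sketch below assumes Python 3.8 or later (for `statistics.NormalDist`) and uses the values given in the problem: mean 2, standard deviation 0.5, and n = 50. The variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 2.0, 0.5, 50

# Distribution of the sample mean: same mean, standard deviation sigma / sqrt(n).
x_bar_dist = NormalDist(mu=mu, sigma=sigma / math.sqrt(n))

# P(1.8 < sample mean < 2.3)
probability = x_bar_dist.cdf(2.3) - x_bar_dist.cdf(1.8)
print(round(probability, 4))
```

Because the standard deviation of the averages is much smaller than 0.5, the interval from 1.8 to 2.3 hours covers several standard deviations on either side of the mean.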