Week 006-007 - Course Module Central Limit Theorem
CK-12
The names “CK-12” and “CK12” and associated logos and the
terms “FlexBook®” and “FlexBook Platform®” (collectively
“CK-12 Marks”) are trademarks and service marks of CK-12
Foundation and are protected by federal, state, and international
laws.
Chapter 1: Central Limit Theorem
What is the Central Limit Theorem? How does the Central Limit Theorem relate other distributions to the normal
distribution?
This lesson describes the relationship between the normal distribution and the Central Limit Theorem.
The Central Limit Theorem is a very powerful statement in statistics. It says that as you take more and more
samples from a random variable, the distribution of the means of the samples (if you completed the lesson titled
“The Mean of Means”, you will recognize this as “the sampling distribution of the sample means”) will approximate
a normal distribution. As a rule of thumb, this is true regardless of the original distribution of the random variable,
provided the number of data points in each sample is 30 or more. In fact, even a discrete random variable
with a pretty odd distribution will produce an approximately normal distribution from the means of enough samples.
Formally, the CLT says:
If samples of size n are drawn at random from any population with a finite mean and standard deviation, then
the sampling distribution of the sample means, \(\bar{x}\), approximates a normal distribution as n increases.
In “normal English”:
If you collect many samples from an ordinary random variable, and calculate the mean of each sample, then
the means will be distributed in an approximate bell-curve, and the “mean of means” will be the same as the
mean of the population. The larger the size of the samples you collect, the more closely the distribution of
their means will approximate a normal distribution.
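The statement above can be checked with a small simulation. The sketch below is only an illustration under assumed settings: the exponential population, the sample size of 30, the 5000 repetitions, and all variable names are choices made here, not part of the lesson.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

POP_MEAN = 1.0      # an exponential distribution with rate 1 has mean 1 ...
POP_SD = 1.0        # ... and standard deviation 1 (a strongly skewed population)
SAMPLE_SIZE = 30    # n, the number of data points in each sample
NUM_SAMPLES = 5000  # how many samples (and therefore sample means) to collect

def one_sample_mean(n):
    """Draw n values from the skewed population and return their mean."""
    return statistics.fmean(random.expovariate(1.0 / POP_MEAN) for _ in range(n))

sample_means = [one_sample_mean(SAMPLE_SIZE) for _ in range(NUM_SAMPLES)]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)

print(f"mean of sample means: {mean_of_means:.3f} (population mean {POP_MEAN})")
print(f"sd of sample means:   {sd_of_means:.3f} "
      f"(sigma / sqrt(n) = {POP_SD / SAMPLE_SIZE ** 0.5:.3f})")
```

With these settings the mean of the sample means should land close to 1, and their spread close to 1/√30, even though the population itself is far from bell-shaped.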
Notes to remember:
• As long as your sample size is 30 or greater, you may assume the distribution of the sample means to be
approximately normal. This means you can use the z-score of a sample mean to calculate the probability that
the mean of a single sample of size 30 or greater falls in a given range.
• The mean of the distribution created from many sample means approaches the mean of the population.
Formally: \(\mu_{\bar{x}} = \mu\)
• The standard deviation of the distribution of the sample means is found by dividing the standard deviation of the
population by the square root of the sample size. Formally: \(\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\)
• Use the notation \(\bar{x}\) (x-bar) rather than the random variable x to indicate that the random variable you are
describing is a sample mean.
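The notes above can be put together in a few lines of Python. This is a minimal sketch: the function name `sample_mean_distribution` and the population values (mean 100, standard deviation 15, samples of size 36) are hypothetical and chosen only to keep the arithmetic simple.

```python
from math import sqrt
from statistics import NormalDist

def sample_mean_distribution(mu, sigma, n):
    """Approximate distribution of the sample mean for samples of size n."""
    return NormalDist(mu, sigma / sqrt(n))

# Hypothetical population: mean 100, standard deviation 15, samples of size 36.
x_bar = sample_mean_distribution(mu=100, sigma=15, n=36)
print(x_bar.mean)    # mean of the sample means, equal to mu
print(x_bar.stdev)   # sigma / sqrt(n) = 15 / 6 = 2.5

# Probability that one sample mean falls between 97.5 and 102.5 (z = -1 to +1)
p = x_bar.cdf(102.5) - x_bar.cdf(97.5)
print(round(p, 4))
```

Working with `NormalDist` in the original units avoids converting to z-scores by hand, although the z-score route gives the same area.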
You may use the z-score percentage reference table below as needed:
TABLE 1.1: Cumulative probability P(Z < z) by z-score
Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 Z
0.0 0.5 0.504 0.508 0.512 0.516 0.5199 0.5239 0.5279 0.5319 0.5359 0.0
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753 0.1
0.2 0.5793 0.5832 0.5871 0.591 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141 0.2
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.648 0.6517 0.3
0.4 0.6554 0.6591 0.6628 0.6664 0.67 0.6736 0.6772 0.6808 0.6844 0.6879 0.4
0.5 0.6915 0.695 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.719 0.7224 0.5
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549 0.6
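The entries in a table like this are values of the standard normal cumulative distribution function. As a rough sketch, the rows shown here can be regenerated with Python's standard library; the helper name `table_row` is made up for this illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def table_row(z_tenths):
    """Return the ten cumulative probabilities P(Z < z) for one table row."""
    return [round(standard_normal.cdf(z_tenths + h / 100), 4) for h in range(10)]

for tenths in range(7):  # rows 0.0 through 0.6
    z = tenths / 10
    print(f"{z:.1f}  " + "  ".join(f"{p:.4f}" for p in table_row(z)))
```

For a negative z-score, the area to the left is 1 minus the table entry for the matching positive z-score.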
www.ck12.org
Mack asked 42 fellow high-school students how much they spent for lunch, on average. According to his research
online, the amount spent for lunch by high school students nationwide has µ = $15, with σ = $9. What is the
probability that Mack’s random sample will have a mean within $0.01 of the national average?
• Mack’s sample is 42 students. Since 42 ≥ 30, he can safely assume that the distribution of the sample mean is
approximately normal, according to the Central Limit Theorem.
• The range we are considering is $14.99 to $15.01, since that represents $0.01 above and below the mean.
• The mean of the sample means should approximate the mean of the population, in other words \(\mu_{\bar{x}} = \mu\)
• The standard deviation of Mack’s sample mean, \(\sigma_{\bar{x}}\), can be calculated as \(\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\), where n = 42
\[\sigma_{\bar{x}} = \frac{9}{\sqrt{42}} = \frac{9}{6.48} = 1.389\]
Since the mean of Mack’s sample of 42 students can be assumed to be normally distributed, and since we now know the standard
deviation of the sample mean, 1.389, we can calculate the z-scores of the range using \(Z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}}\):
\[Z_1 = \frac{15.01 - 15.00}{1.389} \approx +0.01\]
\[Z_2 = \frac{14.99 - 15.00}{1.389} \approx -0.01\]
Finally, we look up Z1 and Z2 on the z-score probability table to get 50.4% and 49.6%, a difference of 50.4% - 49.6% = 0.80%.
The probability that Mack’s sample will have a mean within $0.01 of the population mean of $15.00 is a little
less than 1%.
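Because the range in this problem is so narrow, rounding the z-scores to two decimals has a visible effect on the answer. The sketch below repeats the calculation twice, once with the table-style rounded z-score and once without rounding; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 15.00, 9.00, 42
sd_of_mean = sigma / sqrt(n)            # about 1.389

z_exact = (15.01 - mu) / sd_of_mean     # about 0.0072, before rounding
z_table = round(z_exact, 2)             # 0.01, the precision of the printed table

std_normal = NormalDist()
p_table = std_normal.cdf(z_table) - std_normal.cdf(-z_table)
p_exact = std_normal.cdf(z_exact) - std_normal.cdf(-z_exact)

print(f"sd of the sample mean: {sd_of_mean:.3f}")
print(f"table-style answer (z = +/-{z_table}): {p_table:.2%}")
print(f"unrounded answer  (z = +/-{z_exact:.4f}): {p_exact:.2%}")
```

Both results are below 1%, which agrees with the conclusion of the worked solution; the unrounded figure is somewhat smaller than the table-based 0.80%.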
The time it takes a student to complete the mid-term for Algebra II is a bi-modal distribution with µ = 1 hr and
σ = 1 hr. During the month of June, Professor Spence administers the test 64 times. What is the probability that the
average mid-term completion time for students during the month of June exceeds 48 minutes?
Important facts:
• The sample size is 64, which is more than 30, so the Central Limit Theorem applies
• The mean of the sample means should approximate the mean of the population, in other words \(\mu_{\bar{x}} = \mu\)
• The standard deviation of Professor Spence’s sample mean, \(\sigma_{\bar{x}}\), can be calculated as \(\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\), where n = 64 (the
number of tests)
• 48 minutes is the same as \(\frac{48}{60} = 0.8\) hrs, so the range we are interested in is \(\bar{x} > 0.8\) hrs
\[\sigma_{\bar{x}} = \frac{1}{\sqrt{64}} = 0.125\]
Since the sample mean is approximately normally distributed, according to the CLT, we can use the standard deviation of the sample mean to
calculate the z-score of the minimum value in the relevant range, 0.80 hrs:
\[Z = \frac{0.80 - 1}{0.125} = -1.60\]
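Finally, we use a z-score probability reference to convert the z-score of -1.60 into the probability of a value
greater than that. By symmetry, P(Z > -1.60) = P(Z < 1.60) = 0.9452, so there is approximately a 94.52% probability
that the average completion time exceeds 48 minutes.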
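For a "greater than" question, the area wanted lies to the right of the z-score, which is 1 minus the cumulative (left-hand) area. A short sketch of that step, with variable names chosen here for illustration:

```python
from math import sqrt
from statistics import NormalDist

mu_hours, sigma_hours, n = 1.0, 1.0, 64
threshold_hours = 48 / 60                      # 48 minutes = 0.8 hours

sd_of_mean = sigma_hours / sqrt(n)             # 1 / 8 = 0.125
z = (threshold_hours - mu_hours) / sd_of_mean  # about -1.60

# P(Z > z) is the area to the right of z: 1 minus the table-style left area.
p_exceeds = 1 - NormalDist().cdf(z)
print(f"z = {z:.2f}, P(mean time > 48 min) = {p_exceeds:.4f}")
```

Because the normal curve is symmetric, this is the same area that a table lists for z = +1.60.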
Evan price-checked 123 online auction sellers to record their average asking price for his favorite game. According
to a major national price-checking site, the national average online auction cost for the game is $35.00 with a standard
deviation of $3.00. Evan found an average price of less than $34.86. How likely is this result?
Since there are more than 30 samples (123 > 30), we can apply the CLT and treat the sample mean as normally
distributed.
The standard deviation of the sample mean is: \(\sigma_{\bar{x}} = \frac{3}{\sqrt{123}} = \frac{3}{11.09} = 0.27\)
The z-score for Evan’s price point of $34.86 is:
\[Z = \frac{34.86 - 35}{0.27} = \frac{-0.14}{0.27} = -0.518\]
Consulting the z-score probability table, we learn that the area under the normal curve to the left of z = -0.52 is 0.3015, or
30.15%.
The likelihood of 123 samples having a mean below $34.86 is approximately 30.15%.
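A "less than" question uses the cumulative area directly. The following sketch repeats Evan's calculation in the original dollar units, assuming the CLT approximation holds; the names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 35.00, 3.00, 123

# Approximate sampling distribution of the mean asking price (CLT assumption)
mean_price = NormalDist(mu, sigma / sqrt(n))

z = (34.86 - mu) / mean_price.stdev   # about -0.52
p_below = mean_price.cdf(34.86)       # area to the left of $34.86

print(f"z = {z:.3f}")
print(f"P(mean price < $34.86) = {p_below:.4f}")
```

The unrounded result is close to, but not identical with, the table value, because the table lookup uses the z-score rounded to -0.52.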
What is the Central Limit Theorem? How does the Central Limit Theorem relate other distributions to the normal
distribution?
The Central Limit Theorem says that the larger the sample size, the more closely the distribution of the sample means will resemble
a normal distribution. Since that is true regardless of the original distribution, the CLT can be used as a bridge
between other types of distributions and a normal distribution.
Examples
Example 1
The time it takes to drive from Cheyenne WY to Denver CO has a µ of 1 hr and σ of 15 minutes. Over the course of
a month, a highway patrolman makes the trip 55 times. What is the probability that his average travel time exceeds
60 minutes?
The mean of the sample means, \(\mu_{\bar{x}}\), is the same as the population mean: 1 hr = 60 mins.
The standard deviation of the sample mean is \(\frac{15 \text{ min}}{\sqrt{55}} = \frac{15}{7.42} = 2.02\) min
The 55 trips made by the patrolman exceed the minimum sample size of 30 required to apply the CLT, so we may
assume the sample means to be normally distributed.
The z-score of the patrolman’s average time is: \(\frac{60 - 60}{2.02} = \frac{0}{2.02} = 0\)
According to the z-score percentage reference, a z-score of 0 corresponds to .50 or 50%
There is a 50% probability that the patrolman’s mean travel time is greater than 60 mins.
Example 2
Abbi polls 95 high school students for their GPA. According to the school, the GPA of high school students
has a mean of 3.0 and a standard deviation of 0.5. What is the probability that Abbi’s random sample will have a
mean within 0.01 of the population mean?
The mean of the sample means for the 95 polled GPAs is the same as the population mean: 3.0
The standard deviation of the sample mean is \(\frac{0.5}{\sqrt{95}} = \frac{0.5}{9.75} = 0.05\)
The 95 sampled GPAs exceed the minimum sample size of 30, so we may apply the CLT.
The z-scores of the minimum and maximum values in the range of interest, 2.99 to 3.01, are:
\[Z_1 = \frac{2.99 - 3.00}{0.05} = -0.2 \qquad Z_2 = \frac{3.01 - 3.00}{0.05} = +0.2\]
Referring to the z-score reference table, the z-scores -0.2 and 0.2 cover a range of approximately 57.93% - 42.07% = 15.86%
Example 3
A recipe website has calculated that the time it takes to cook Sunday dinner has µ of 1 hour with σ of 25 minutes.
Over the course of a month, 172 users report their time spent cooking Sunday dinner. What is the probability that
the users’ average reported time is less than 45 minutes?
The mean of the sample means, \(\mu_{\bar{x}}\), is the same as the population mean: 1 hr = 60 mins.
The standard deviation of the sample mean is \(\frac{25 \text{ min}}{\sqrt{172}} = \frac{25}{13.11} = 1.91\) min
The 172 users reporting cooking times exceed the minimum sample size of 30 required to apply the CLT, so we may
assume the sample means to be normally distributed.
The z-score of the average reported cooking time is: \(\frac{45 - 60}{1.91} = \frac{-15}{1.91} = -7.85\)
According to the z-score percentage reference, a z-score of -7.85 corresponds to 0%.
There is essentially zero probability that 172 users would average only 45 mins.
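A z-score this extreme is far outside any printed table, but the tail area can still be computed. The sketch below uses the complementary error function `math.erfc`, a choice made here because it stays accurate for very small tail areas; it relies on the identity P(Z < z) = 0.5 · erfc(-z / √2).

```python
from math import erfc, sqrt

mu_min, sigma_min, n = 60, 25, 172
sd_of_mean = sigma_min / sqrt(n)   # about 1.91 minutes
z = (45 - mu_min) / sd_of_mean     # about -7.87 (-7.85 if 1.91 is rounded first)

# Left-tail area of the standard normal distribution, written with erfc
# so that the tiny result is not lost to floating-point rounding.
p_below = 0.5 * erfc(-z / sqrt(2))

print(f"z = {z:.2f}")
print(f"P(mean < 45 min) = {p_below:.3e}")  # printed in scientific notation
```

The result is positive but vanishingly small, which matches the lesson's conclusion that the probability is essentially zero.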
Review
1. 128 randomly-sampled students reported how much they spent on a movie at the theater. If the national
average amount spent at the movies has a mean of $15 and standard deviation of $8, what is the probability
that the random sample will give a result within $0.01 of the true value?
2. The time an American family spends doing dishes in the evening has µ = 60 mins and σ = 60 mins. 58
Americans were polled to find the time they spend doing dishes. What is the probability that their average
time exceeds 60 minutes?
3. Rachel asked 65 second year college students how many credits they have taken. According to the colleges,
the average number of credits taken by 2nd year students is 15, with a standard deviation of 7. How likely is it
that Rachel got less than 17.17 on average?
4. What do you need in order to apply the Central Limit Theorem to sample means?
5. 117 business women were asked how much they spend for lunch, on average. If the national average has a
mean of $30, and standard deviation of $9, what is the probability that the random sample will return a
result within $0.01 of the true value?
6. According to the phone company, the daily average number of calls made by Americans is 30, with a standard
deviation of 10. What is the probability that 117 Americans reported less than 30.92 calls per day, on average?
7. The time spent by the average technician repairing a laptop is governed by an exponential distribution where
µ and σ are each 60 minutes. In the month of June, a technician repairs 76 laptops. How likely is it that the
average repair time is greater than 77 minutes?
8. 46 teenagers were asked how many .mp3’s they purchase each month. According to .mp3 sales data, the average
has a mean of 15, with a standard deviation of 2. How likely is it that the 46 polled teens averaged within
0.02 of the national average?
9. 44 classrooms were investigated to see how many students they contained. According to school data, the
average number of students per classroom is 35, with a standard deviation of 10. How likely is it that the 44
classrooms averaged fewer than 33.49 students?
10. 100 bags of candy were counted to see how many pieces they contained. According to the company that fills
the bags, the average number of candies per bag has a mean of 50, and standard deviation of 10. What is
the probability that the 100 bags will have an average number within 0.02 of the production average?
Review (Answers)
To view the Review answers, open this PDF file and look for section 9.7.
Z-scores III
Chapter 1: Z-scores III
Do z-score probabilities always need to be calculated as the chance of a value either above or below a given score?
How would you calculate the probability of a z-score between -0.08 and +1.92?
Z-Scores
To calculate the probability of getting a value with a z-score between two other z-scores, you can either use a
reference table to look up the value for both scores and subtract them to find the difference, or you can use technology.
In this lesson, which is an extension of Z-scores and Z-scores II, we will practice both methods.
Historically, it has been very common to use a z-score probability table like the one below to look up the probability
associated with a given z-score:
TABLE 1.1: Cumulative probability P(Z < z) by z-score
Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 Z
0.0 0.5 0.504 0.508 0.512 0.516 0.5199 0.5239 0.5279 0.5319 0.5359 0.0
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753 0.1
0.2 0.5793 0.5832 0.5871 0.591 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141 0.2
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.648 0.6517 0.3
0.4 0.6554 0.6591 0.6628 0.6664 0.67 0.6736 0.6772 0.6808 0.6844 0.6879 0.4
0.5 0.6915 0.695 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.719 0.7224 0.5
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549 0.6
0.7 0.758 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852 0.7
0.8 0.7881 0.791 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133 0.8
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.834 0.8365 0.8389 0.9
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621 1.0
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.877 0.879 0.881 0.883 1.1
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.898 0.8997 0.9015 1.2
1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177 1.3
1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319 1.4
1.5 0.9332 0.9345 0.9357 0.937 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441 1.5
1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545 1.6
1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633 1.7
1.8 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706 1.8
1.9 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.975 0.9756 0.9761 0.9767 1.9
2.0 0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817 2.0
2.1 0.9821 0.9826 0.983 0.9834 0.9838 0.9842 0.9846 0.985 0.9854 0.9857 2.1
2.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.989 2.2
2.3 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916 2.3
2.4 0.9918 0.992 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936 2.4
2.5 0.9938 0.994 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952 2.5
2.6 0.9953 0.9955 0.9956 0.9957 0.9959 0.996 0.9961 0.9962 0.9963 0.9964 2.6
2.7 0.9965 0.9966 0.9967 0.9968 0.9969 0.997 0.9971 0.9972 0.9973 0.9974 2.7
2.8 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.998 0.9981 2.8
2.9 0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986 2.9
3.0 0.9987 0.9987 0.9987 0.9988 0.9988 0.9989 0.9989 0.9989 0.999 0.999 3.0
3.1 0.999 0.9991 0.9991 0.9991 0.9992 0.9992 0.9992 0.9992 0.9993 0.9993 3.1
Since the proliferation of the Internet, however, you can also use a free online calculator.
MEDIA
URL: https://www.ck12.org/flx/render/embeddedobject/67469
Calculating Probability
What is the probability associated with a z-score between 1.2 and 2.31?
To evaluate the probability of a value occurring within a given range, you need to find the probability of both the
upper and lower values in the range, and subtract to find the difference.
• First find z = 1.2 on the z-score probability reference above: 0.8849. Remember that this value represents the
proportion of values below z = 1.2.
• Next, find and record the value associated with z = 2.31: 0.9896
• Since approximately 88.49% of all values are below z = 1.2 and approximately 98.96% of all values are below
z = 2.31, there are 98.96% - 88.49% = 10.47% of values between them.
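The "look up both values and subtract" method translates directly into code. This is a minimal sketch using Python's standard library in place of the table; the function name `prob_between` is made up for this example.

```python
from statistics import NormalDist

def prob_between(z_low, z_high):
    """Area under the standard normal curve between two z-scores."""
    std_normal = NormalDist()
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)

# The range from the example: z between 1.2 and 2.31
p = prob_between(1.2, 2.31)
print(f"{p:.4f}")

# The lesson's opening question: z between -0.08 and +1.92
q = prob_between(-0.08, 1.92)
print(f"{q:.4f}")
```

Results computed this way can differ from table-based answers in the last digit, since each table entry is already rounded to four decimals before the subtraction.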
To find the probability of a z-score between -1.32 and 1.49 with the online calculator, select the radio button to the
left of the first type of probability, input “-1.32” into the first box, and “1.49” into the second. When you click
“Compute”, you should get the result 0.8385,
which tells us that there is approximately an 83.85% probability that a value with a z-score between -1.32 and 1.49
will occur in a normal distribution.
Notice that the calculator also details the steps involved with finding the answer:
1. Estimate the probability using a graph, so you have an idea of what your answer should be.
2. Find the probability of z < 1.49, using a reference. (0.9319)
3. Find the probability of z < -1.32, again, using a reference. (0.0934)
4. Subtract the values: 0.9319 - 0.0934 = 0.8385 or 83.85%
2. What is the probability that a random selection will be between 8.45 and 10.25, if it is from a normal distribution
with µ = 10 and σ = 2?
This question requires us to first find the z-scores for the values 8.45 and 10.25, then calculate the percentage of values
between them by using values from a z-score reference and finding the difference.
1. Find the z-score for 8.45, using the z-score formula \(z = \frac{x - \mu}{\sigma}\):
\[z = \frac{8.45 - 10}{2} = \frac{-1.55}{2} \approx -0.78\]
2. Find the z-score for 10.25 in the same way:
\[z = \frac{10.25 - 10}{2} = \frac{0.25}{2} \approx 0.13\]
3. Now find the percentages for each, using a reference (don’t forget we want the probability of values less than our
negative score and less than our positive score, so we can find the values between): P(z < -0.78) = 0.2177 and
P(z < 0.13) = 0.5517, so the difference is 0.5517 - 0.2177 = 0.3340.
4. At this point, let’s sketch the graph to get an idea what we are looking for:
There is approximately a 33.4% probability that a value between 8.45 and 10.25 would result from a random
selection of a normal distribution with mean 10 and standard deviation 2.
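The same answer can be reached two ways in code: through the rounded z-scores, as with a table, or directly in the original units. The sketch below shows both; the variable names are illustrative, and the rounded z-scores are entered by hand to match the worked steps.

```python
from statistics import NormalDist

population = NormalDist(mu=10, sigma=2)
std_normal = NormalDist()  # mean 0, standard deviation 1

# Table method: use the z-scores already rounded to two decimals.
z_low, z_high = -0.78, 0.13
p_table = std_normal.cdf(z_high) - std_normal.cdf(z_low)

# Direct method: skip the z-scores and work with the original values.
p_direct = population.cdf(10.25) - population.cdf(8.45)

print(f"table method:  {p_table:.4f}")
print(f"direct method: {p_direct:.4f}")
```

The two results differ slightly because the table method rounds each z-score before the lookup; both support the conclusion of roughly a one-in-three chance.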
Do z-score probabilities always need to be calculated as the chance of a value either above or below a given score?
How would you calculate the probability of a z-score between -0.08 and +1.92?
After this lesson, you should know without question that z-score probabilities are not limited to the chance of a value
above or below a given score; the probability between two values can also be calculated.
The probability of a z-score below -0.08 is 46.81%, and the probability of a z-score below 1.92 is 97.26%, so the
probability between them is 97.26% - 46.81% = 50.45%.
Examples
Solutions:
Example 1
Example 2
Example 3
Review
Find the probabilities using the table from the lesson or an online resource.
14. What is the probability of getting a value between 1.2 and 2.3 from the random output of a normally distributed
set with µ = 2.6 and σ = 0.9?
Review (Answers)
To view the Review answers, open this PDF file and look for section 9.5.