Central Limit Theorem
Introduction
The Central Limit Theorem (CLT) is one of the most fundamental results in probability theory
and statistics. It tells us that under certain conditions, the distribution of the sum (or average) of a
large number of independent and identically distributed (i.i.d.) random variables approaches a
normal distribution, regardless of the shape of the original distribution, provided its variance is finite.
In simpler terms: No matter the shape of the original population distribution, the
distribution of the sample mean will tend to become more normal (bell-shaped) as the
sample size increases, as long as the random variables are independent and identically
distributed.
Let \(X_1, X_2, \dots, X_n\) be a random sample of size \(n\) from a population with mean \(\mu\)
and finite standard deviation \(\sigma\), and let \(\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i\) be the
sample mean. The CLT states that, as \(n \to \infty\),
\[ \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \xrightarrow{d} N(0, 1). \]
This means that, for large \(n\), the distribution of the sample mean is approximately normal with
mean \(\mu\) and standard deviation \(\frac{\sigma}{\sqrt{n}}\), which is called the standard error.
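The statement above can be checked empirically. The following is a minimal simulation sketch, not part of the original material: it assumes an exponential population with rate 1 (mean 1, standard deviation 1, strongly right-skewed), and the helper name `sample_means` and the choices of `n`, `trials`, and `seed` are illustrative.

```python
import random
import statistics


def sample_means(n, trials, rate=1.0, seed=0):
    """Draw `trials` samples of size `n` from an exponential population
    and return the list of their sample means."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        statistics.fmean(rng.expovariate(rate) for _ in range(n))
        for _ in range(trials)
    ]


# Assumed population: Exponential(rate=1), so mu = 1 and sigma = 1.
means = sample_means(n=50, trials=5000)

# The CLT predicts a mean near mu = 1 and a spread near sigma / sqrt(n) = 0.141.
print("mean of sample means:", round(statistics.fmean(means), 3))
print("sd of sample means:  ", round(statistics.stdev(means), 3))
```

Plotting a histogram of `means` should show a roughly bell-shaped curve even though the individual observations are far from normal.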
Suppose you are measuring the height of adult women in a certain country. The population of
heights is not normally distributed (perhaps it's skewed). However, by taking a sample of 50
women and calculating the sample mean, you can use the CLT to assert that the sampling
distribution of the sample mean will be approximately normal, with mean \(\mu\) (the population
mean) and standard deviation \(\frac{\sigma}{\sqrt{50}}\), where \(\sigma\) is the
population standard deviation.
Here are 10 problems with solutions that demonstrate the application of the Central Limit
Theorem:
The average height of women in a certain country is 160 cm with a standard deviation of 10 cm.
A random sample of 36 women is selected. Find the probability that the sample mean height is
greater than 162 cm.
Solution:
1. The population mean is \(\mu = 160\) cm, and the population standard deviation is
\(\sigma = 10\) cm.
2. The sample size is \(n = 36\).
3. The sampling distribution of the sample mean \(\bar{X}\) has mean
\(\mu_{\bar{X}} = \mu = 160\) cm and standard deviation (standard error)
\(\frac{\sigma}{\sqrt{n}} = \frac{10}{\sqrt{36}} = \frac{10}{6} \approx 1.67\) cm.
We need to find the probability that \(\bar{X} > 162\). First, we standardize the
value using the z-score formula:
\[ z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{162 - 160}{10/6} = 1.20 \]
Using the standard normal distribution table, the probability of \(z > 1.20\) is
approximately 0.1151. Therefore, the probability that the sample mean is greater than 162 cm is
0.1151 or 11.51%.
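The same calculation can be done without a printed table. This is a small sketch using Python's standard-library `statistics.NormalDist`; the function name `prob_sample_mean_above` is an illustrative choice, not something from the original text.

```python
from math import sqrt
from statistics import NormalDist


def prob_sample_mean_above(x, mu, sigma, n):
    """Approximate P(sample mean > x) using the CLT normal approximation."""
    standard_error = sigma / sqrt(n)
    return 1.0 - NormalDist(mu, standard_error).cdf(x)


# Height example: mu = 160 cm, sigma = 10 cm, n = 36, threshold 162 cm.
p = prob_sample_mean_above(162, mu=160, sigma=10, n=36)
print(round(p, 4))  # about 0.1151, matching the table lookup
```

The helper applies to any upper-tail question about a sample mean once the population mean, standard deviation, and sample size are known.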
The population distribution of incomes in a region is positively skewed with a mean income of
$40,000 and a standard deviation of $6,000. If you take a random sample of 50 households, what
is the probability that the average income for the sample is between $39,000 and $41,000?
Solution:
With \(\mu = 40{,}000\), \(\sigma = 6{,}000\), and \(n = 50\), the standard error is
\(\frac{6{,}000}{\sqrt{50}} \approx 848.53\), so the z-scores of the two endpoints are:
\[ z_1 = \frac{39{,}000 - 40{,}000}{848.53} \approx -1.18, \qquad z_2 = \frac{41{,}000 - 40{,}000}{848.53} \approx 1.18 \]
Using the standard normal distribution table, the cumulative probability corresponding to
\(z_1 = -1.18\) is approximately 0.1190, and the cumulative probability corresponding to
\(z_2 = 1.18\) is approximately 0.8810.
Thus, the probability that the sample mean is between $39,000 and $41,000 is:
\[ P(39{,}000 \leq \bar{X} \leq 41{,}000) = P(Z \leq z_2) - P(Z \leq z_1) = 0.8810 - 0.1190 = 0.7620 \]
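A two-sided probability like this one is the difference of two cumulative probabilities. The sketch below assumes the normal approximation holds for the skewed income data at this sample size; the function name `prob_sample_mean_between` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def prob_sample_mean_between(low, high, mu, sigma, n):
    """Approximate P(low <= sample mean <= high) via the CLT."""
    dist = NormalDist(mu, sigma / sqrt(n))  # sampling distribution of the mean
    return dist.cdf(high) - dist.cdf(low)


# Income example: mu = 40,000, sigma = 6,000, n = 50.
p = prob_sample_mean_between(39_000, 41_000, mu=40_000, sigma=6_000, n=50)
print(round(p, 3))
```

The result is close to, but not identical with, the table-based answer, because the code does not round the z-scores to two decimal places before looking them up.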
A researcher wants to estimate the average age of employees in a large company. The population
standard deviation is known to be 5 years. If the researcher wants the margin of error to be no
more than 1 year, what sample size should be used to achieve this?
Solution:
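The formula for the margin of error (ME) when estimating the population mean is:
\[ ME = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \]
Where:
\(z_{\alpha/2}\) is the critical value for the chosen confidence level,
\(\sigma\) is the population standard deviation,
\(n\) is the sample size.
We are given:
\(\sigma = 5\),
\(ME = 1\),
\(z_{\alpha/2} = 1.96\) (the problem does not state a confidence level; this value corresponds to 95%).
Solving the formula for \(n\):
\[ n = \left(\frac{z_{\alpha/2} \cdot \sigma}{ME}\right)^2 = \left(\frac{1.96 \times 5}{1}\right)^2 = 96.04 \]
Since the sample size must be an integer, we round up to \(n = 97\). Therefore, the
required sample size is 97.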
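The sample-size calculation can be sketched as a short function. The 95% confidence level is an assumption (it is what the critical value 1.96 implies), and the name `required_sample_size` is illustrative.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(sigma, margin, confidence=0.95):
    """Smallest n such that z * sigma / sqrt(n) <= margin."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / margin) ** 2)  # always round up, never down


n = required_sample_size(sigma=5, margin=1)
print(n)  # 97
```

Rounding up rather than to the nearest integer matters here: a sample of 96 would leave the margin of error slightly above 1 year.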
Solution:
The Central Limit Theorem tells us that regardless of the population's distribution (in this case,
uniform), the distribution of the sample mean will approach a normal distribution as the sample
size increases.
The sampling distribution of the sample mean \(\bar{X}\) will be approximately normal with mean
\(\mu_{\bar{X}} = \mu\) and standard error \(\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}\).
Thus, the distribution of the sample mean is approximately normal with mean 5 and standard
deviation 0.577.
The mean daily number of customers at a restaurant is 80, and the standard deviation is 20. What
is the probability that the sample mean of a sample of 100 days is greater than 82?
Solution:
1. The population mean is \(\mu = 80\), and the population standard deviation is \(\sigma = 20\).
2. The sample size is \(n = 100\).
3. The sampling distribution of the sample mean \(\bar{X}\) will have:
o Mean \(\mu_{\bar{X}} = 80\),
o Standard error \(\frac{\sigma}{\sqrt{n}} = \frac{20}{\sqrt{100}} = 2\).
We need to find the probability that \(\bar{X} > 82\). First, calculate the z-score:
\[ z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma/\sqrt{n}} = \frac{82 - 80}{2} = 1 \]
The probability of \(z > 1\) is approximately 0.1587. Therefore, the probability that the
sample mean is greater than 82 is 0.1587 or 15.87%.
Problem 6: Confidence Interval Using the CLT
Suppose the heights of adult males in a city are normally distributed with a mean of 70
inches and a standard deviation of 4 inches. A random sample of 64 adult males is selected.
Construct a 95% confidence interval for the sample mean, that is, the interval centered on the
population mean that contains the sample mean with probability 0.95.
Solution:
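We are given:
\(\mu = 70\) inches,
\(\sigma = 4\) inches,
\(n = 64\).
For a 95% confidence level, the critical value \(z_{\alpha/2}\) is 1.96 (for a normal
distribution). The standard error is \(\frac{\sigma}{\sqrt{n}} = \frac{4}{\sqrt{64}} = 0.5\), so the interval is:
\[ \mu \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} = 70 \pm 1.96 \times 0.5 = 70 \pm 0.98 \]
So, the 95% confidence interval for the sample mean is (69.02, 70.98) inches.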
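As a cross-check, the interval can be computed directly from the normal quantile function. This is a sketch only; `sample_mean_interval` is a hypothetical helper name, and it treats the population mean and standard deviation as known, as the problem does.

```python
from math import sqrt
from statistics import NormalDist


def sample_mean_interval(mu, sigma, n, level=0.95):
    """Interval centered on mu that contains the sample mean
    with probability `level`, using the standard error sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for level=0.95
    half_width = z * sigma / sqrt(n)
    return mu - half_width, mu + half_width


low, high = sample_mean_interval(mu=70, sigma=4, n=64)
print(round(low, 2), round(high, 2))  # 69.02 70.98
```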
Solution:
We are given:
We need to find the probability that the sample mean \(\bar{X}\) is between 83 and 87. First,
we standardize the values using the z-score formula:
\[ z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \]
Thus, the probability that \(83 \leq \bar{X} \leq 87\) is:
Solution:
We are given:
We need to find the probability that the sample mean \(\bar{X}\) is less than $490. First,
calculate the z-score for \(\bar{X} = 490\):
\[ z = \frac{490 - \mu}{\sigma/\sqrt{n}} = -0.75 \]
Using the standard normal distribution table, the probability of \(z < -0.75\) is
approximately 0.2266.
Thus, the probability that the sample mean is less than $490 is 0.2266 or 22.66%.
A factory produces light bulbs, and the lifetime of each bulb is measured. The lifetime of the
bulbs follows an exponential distribution with a mean of 1,000 hours and a standard deviation of
1,000 hours. If you select a random sample of 36 bulbs, what is the probability that the average
lifetime of the bulbs in the sample exceeds 1,050 hours?
Solution:
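We are given:
\(\mu = 1{,}000\) hours,
\(\sigma = 1{,}000\) hours,
\(n = 36\).
We need to find the probability that \(\bar{X} > 1050\). First, calculate the z-score
for \(\bar{X} = 1050\):
\[ z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{1050 - 1000}{1000/\sqrt{36}} = \frac{50}{166.67} = 0.30 \]
Using the standard normal distribution table, the probability of \(z > 0.30\) is
approximately 0.3821.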
Thus, the probability that the sample mean exceeds 1,050 hours is 0.3821 or 38.21%.
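Because the exponential distribution is strongly skewed, it is worth checking how good the normal approximation is at a sample size of 36. The following is a simulation sketch under stated assumptions: the trial count and seed are arbitrary, and `simulated_tail_prob` is a hypothetical helper, not part of the original solution.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def simulated_tail_prob(threshold, mean_life, n, trials, seed=0):
    """Estimate P(sample mean > threshold) by drawing `trials` samples
    of `n` exponential lifetimes with the given mean."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    hits = 0
    for _ in range(trials):
        # expovariate takes the rate, which is 1 / mean.
        sample_mean = fmean(rng.expovariate(1.0 / mean_life) for _ in range(n))
        if sample_mean > threshold:
            hits += 1
    return hits / trials


# CLT approximation: normal with mean 1000 and standard error 1000 / sqrt(36).
clt_estimate = 1.0 - NormalDist(1000, 1000 / sqrt(36)).cdf(1050)
simulated = simulated_tail_prob(1050, mean_life=1000, n=36, trials=20_000)

print("CLT approximation:", round(clt_estimate, 4))
print("simulation:       ", round(simulated, 4))
```

The two numbers should be close but need not agree exactly. With a skewed population and a moderate sample size, the normal approximation can be off by a couple of percentage points, and the gap shrinks as the sample size grows.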
A random sample of 400 voters is selected, and 60% of the sample is found to support a
particular candidate. Construct a 99% confidence interval for the proportion of voters in the
entire population who support this candidate.
Solution:
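We are given:
\(\hat{p} = 0.60\),
\(n = 400\),
\(z_{\alpha/2} = 2.576\) for a 99% confidence level.
The standard error of the sample proportion is
\(\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = \sqrt{\frac{0.60 \times 0.40}{400}} \approx 0.0245\), so the interval is:
\[ \hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = 0.60 \pm 2.576 \times 0.0245 \approx 0.60 \pm 0.0631 = [0.5369, 0.6631] \]
Thus, the 99% confidence interval for the proportion of voters supporting the candidate is
(0.5369, 0.6631).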
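The interval for a proportion follows the same pattern as the interval for a mean, with the standard error taken from the sample proportion. This sketch assumes the usual normal-approximation (Wald) interval; the function name `proportion_confidence_interval` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(p_hat, n, level=0.99):
    """Normal-approximation confidence interval for a population proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 2.576 for level=0.99
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    margin = z * standard_error
    return p_hat - margin, p_hat + margin


low, high = proportion_confidence_interval(p_hat=0.60, n=400)
print(round(low, 4), round(high, 4))
```

Computed this way the bounds come out at roughly 0.5369 and 0.6631. Hand calculations that round the critical value or the standard error early can differ from this in the fourth decimal place.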
Conclusion
The Central Limit Theorem is a powerful tool that allows us to make inferences about
population parameters based on sample data, even when the underlying population distribution is
not normal. It tells us that, under certain conditions (large sample size, independence,
identical distribution, and finite variance), the sampling distribution of the sample mean is
approximately normal, allowing us to calculate probabilities and construct confidence intervals
for population parameters. The key
takeaway is that the CLT is essential in statistical practice, especially when dealing with large
samples.