Central Limit Theorem
Statistics is one of the most widely used branches of mathematics, and we rely on it almost
every day. It is also essential knowledge for a data scientist. The Central Limit Theorem is
one of its cornerstones.
In statistics, a given data set usually represents a sample from an entire population. Using
this sample, we look for patterns in the data and then generalize those patterns to the
population when making predictions. The Central Limit Theorem helps us make inferences
about population parameters from sample statistics.
The Central Limit Theorem states that the sampling distribution of the mean approaches a
normal distribution as the sample size increases.
Regardless of the initial shape of the population distribution, if samples of size n are
randomly selected from a population, the sampling distribution of the sample means will
approach a normal distribution as the sample size n gets larger.
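The statement above can be checked with a small simulation. The sketch below is illustrative only: the exponential population, the sample size of 50, the 2000 repetitions, and the seed are all assumptions chosen for the example, not values from this lesson. It draws repeated samples from a strongly skewed population and summarizes the resulting sample means.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable (arbitrary choice)

# Assumed population: exponential with rate 1, which is strongly right-skewed
# and has mean 1 and standard deviation 1.
POPULATION_MEAN = 1.0
POPULATION_SD = 1.0
SAMPLE_SIZE = 50     # n, assumed for illustration
NUM_SAMPLES = 2000   # number of samples of size n to draw


def sample_mean(n):
    """Draw n values from the skewed population and return their mean."""
    values = [random.expovariate(1.0) for _ in range(n)]
    return statistics.fmean(values)


sample_means = [sample_mean(SAMPLE_SIZE) for _ in range(NUM_SAMPLES)]

center = statistics.fmean(sample_means)   # should be near the population mean
spread = statistics.stdev(sample_means)   # should be near sd / sqrt(n)

# For a normal distribution, roughly 68% of values lie within one
# standard deviation of the center.
within_one_sd = sum(abs(m - center) <= spread for m in sample_means) / NUM_SAMPLES

print(f"mean of sample means:  {center:.3f}")
print(f"sd of sample means:    {spread:.3f}")
print(f"fraction within 1 sd:  {within_one_sd:.3f}")
```

Even though the population itself is far from normal, the sample means should cluster around the population mean, with a spread close to the population standard deviation divided by √n.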
The standard error of the mean measures the accuracy of the sample mean (x̄) as an estimate
of the population mean (µ). It is also known as the standard deviation of the sampling
distribution of the sample mean, denoted by σ_x̄. For a population with standard deviation σ
and samples of size n, σ_x̄ = σ / √n.
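A minimal sketch of this computation, assuming the usual formula for the standard error of the mean (population standard deviation divided by the square root of the sample size). The function name `standard_error` and the value σ = 10 are hypothetical, chosen only to show how the standard error shrinks as n grows.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean: population sd divided by sqrt(sample size)."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return sigma / math.sqrt(n)


# Hypothetical population standard deviation, used only for illustration.
SIGMA = 10.0

errors = {n: standard_error(SIGMA, n) for n in (4, 25, 100)}
for n, se in errors.items():
    print(f"n = {n:3d}: standard error = {se:.2f}")
```

Quadrupling the sample size only halves the standard error, because n enters the formula under a square root.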
Remember that to get a good estimate of the population mean, we have to make n sufficiently
large. The Central Limit Theorem formalizes this fact.
Now, can you determine the standard error of the mean of the given set of data below? Your
knowledge of the formula and your skill in manipulating the given data will be handy in
solving this problem.
The Central Limit Theorem is important because it allows researchers to use a limited sample
to draw sound conclusions about a larger population. It also justifies the use of
normal-curve methods for a wide range of problems.
Furthermore, it justifies the use of the formula z = (X̄ − µ) / (σ / √n) when computing the
probability that X̄ will take a value within a given range in the sampling distribution of X̄.
a. If 50 randomly selected senior high school students take the examination, what is the
probability that the mean time it takes the group to complete the test will be less
than 43 minutes? Does it seem reasonable that the mean of the 50 senior high school
students could be less than 43 minutes?
Solution for #1:
Step 1: Identify the parts of the problem.
Given: µ = 46.2 minutes; σ = 8 minutes; n = 50; X̄ = 43 minutes
Find: P(X̄ < 43)
Step 2: Use the formula to find the z-score.
z = (X̄ − µ) / (σ / √n) = (43 − 46.2) / (8 / √50) ≈ −3.2 / 1.1314 ≈ −2.83
Note that dividing by σ alone, (43 − 46.2) / 8 = −0.40 (corresponding area 0.3446), gives
the z-score for a single student, not for the mean of 50 students.
Step 3: Use the z-table to look up the z-score you calculated in step 2.
z = −2.83 has a corresponding area of 0.0023.
Step 4: Draw a graph and plot the z-score and its corresponding area. Then, shade
the part that you’re looking for: P(X̄ < 43)
Since we are looking for the probability of a mean less than 43 minutes, the shaded part
will be to the left of −2.83.
Step 5: Read off the probability. The area 0.0023 is already the area to the left of
z = −2.83, so P(X̄ < 43) = 0.0023. If your table gives the area between 0 and z instead,
subtract that area from 0.5000: 0.5000 − 0.4977 = 0.0023.
Therefore, the probability that 50 randomly selected senior high school students will have
a mean completion time of less than 43 minutes is 0.23%. No, it does not seem reasonable,
since the probability is less than 1%.
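The arithmetic in this solution can be checked with a short script. This is a sketch, not part of the original lesson: it uses Python's standard-library `statistics.NormalDist` in place of a printed z-table, and the variable names are assumptions. It also computes the single-student case (n = 1) to show where z = −0.40 comes from.

```python
import math
from statistics import NormalDist

# Values given in the problem.
mu = 46.2      # population mean, minutes
sigma = 8.0    # population standard deviation, minutes
n = 50         # sample size
x_bar = 43.0   # sample mean of interest, minutes

# Mean of 50 students: divide by the standard error sigma / sqrt(n).
standard_error = sigma / math.sqrt(n)
z_mean = (x_bar - mu) / standard_error
p_mean = NormalDist().cdf(z_mean)   # area to the left of z_mean

# Single student, for contrast: divide by sigma itself.
z_single = (x_bar - mu) / sigma
p_single = NormalDist().cdf(z_single)

print(f"mean of {n}: z = {z_mean:.2f}, P = {p_mean:.4f}")
print(f"one student: z = {z_single:.2f}, P = {p_single:.4f}")
```

The contrast shows why the sample size matters: a single student finishing in under 43 minutes is fairly common, while a group of 50 averaging under 43 minutes is rare.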
2. An electrical company claims that the average life of the bulbs it manufactures is
1200 hours with a standard deviation of 250 hours. If a random sample of 100 bulbs
is chosen, what is the probability that the sample mean will be between 1150 hours
and 1250 hours?
Solution:
Step 1: Identify the parts of the problem.
Given: µ = 1200 hours; σ = 250 hours; n = 100 bulbs
X̄ = 1150 and 1250 hours
Unknown: P(1150 < X̄ < 1250)
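Step 2: Use the formula to find the z-scores.
z = (X̄ − µ) / (σ / √n)
For X̄ = 1150: z = (1150 − 1200) / (250 / √100) = −50 / 25 = −2
For X̄ = 1250: z = (1250 − 1200) / (250 / √100) = 50 / 25 = 2
Step 3: Use the z-table to look up the z-scores you calculated in step 2.
z = 2 has a corresponding area of 0.9772, and z = −2 has a corresponding area of 0.0228.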
Step 4: Draw a graph and plot the z-scores and their corresponding areas. Then, shade
the part that you’re looking for: P(1150 < X̄ < 1250)
Since we are looking for the probability between 1150 hours and 1250 hours,
the shaded part will be between −2 and 2.
The shaded area is the difference between the two table areas: 0.9772 − 0.0228 = 0.9544.
Therefore, the probability that a random sample of 100 bulbs has a sample mean
between 1150 hours and 1250 hours is 95.44%.
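As a check on this result, the same interval probability can be computed directly. This is a sketch using the standard-library `statistics.NormalDist` instead of a z-table; the variable names are assumptions made for the example.

```python
import math
from statistics import NormalDist

# Values given in the problem.
mu = 1200.0      # claimed average bulb life, hours
sigma = 250.0    # population standard deviation, hours
n = 100          # sample size
lower, upper = 1150.0, 1250.0   # bounds on the sample mean, hours

standard_error = sigma / math.sqrt(n)       # 250 / 10 = 25 hours
z_lower = (lower - mu) / standard_error
z_upper = (upper - mu) / standard_error

# Area between the two z-scores under the standard normal curve.
std_normal = NormalDist()
p_between = std_normal.cdf(z_upper) - std_normal.cdf(z_lower)

print(f"z-scores: {z_lower:.2f} and {z_upper:.2f}")
print(f"P({lower:.0f} < sample mean < {upper:.0f}) = {p_between:.4f}")
```

The unrounded result is about 0.9545, which differs from the table-based 0.9544 only because the z-table entries are rounded to four decimal places before subtracting.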