Module 5
STAT 151
By Haile Gessesse
Winter 2024
5 Sampling Distribution of statistics
Goal: We discuss the Q-Q plot (normal probability plot),
the distribution of sample means, and the distribution of
sample proportions.
5.1 Normal Probability Plot
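• A normal probability plot is a plot used to identify normally
distributed variables: if the plotted points fall roughly along
a straight line, the variable is approximately normally distributed.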
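A minimal sketch of how the points of a normal probability plot can be computed, using only the Python standard library. The plotting positions (i − 0.5)/n are one common convention and are an assumption here, as is the made-up data set `sample`; the helper names are hypothetical.

```python
from statistics import NormalDist, mean

def normal_scores(n):
    """Expected standard-normal quantiles for n ordered observations.

    Uses the plotting positions (i - 0.5) / n, one common convention
    (an assumption; other conventions differ slightly).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

def qq_points(data):
    """Pair each sorted observation with its normal score."""
    ordered = sorted(data)
    return list(zip(normal_scores(len(ordered)), ordered))

def correlation(pairs):
    """Pearson correlation of the (normal score, observation) pairs."""
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical data, made up for illustration only.
sample = [61, 64, 66, 67, 68, 69, 70, 71, 72, 74, 77]
points = qq_points(sample)   # (normal score, observation) pairs to plot
r = correlation(points)      # close to 1 when the points are nearly linear
```

Plotting the pairs in `points` gives the normal probability plot; a correlation `r` close to 1 is one rough numerical indication that the points lie near a straight line.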
5.2 Sampling Distribution of the Sample Means
• We are often interested in estimating population
parameters (like µ, σ, p) using sample statistics (like x̄,
s, p̂), called point estimates.
• Sample statistics vary from sample to sample which
leads us to study their distribution.
• The sampling distribution of a statistic is the dis-
tribution of all values of the statistic when all possible
samples of the same size n are taken from the same
population.
• If the mean of a sampling distribution of a statistic
is equal to the true value of the parameter, then the
statistic is called an unbiased estimator.
• The sampling distribution of the sample mean
is the distribution of all possible sample means (or the
distribution of the variable x̄), with all samples having
the same sample size n taken from the same population.
Example: Consider the population consisting of the
values {1, 2, 5}. Thus, µ = 8/3
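A minimal sketch that enumerates every sample from this population and checks that the sample means average out to µ. The sample size n = 2 and sampling with replacement are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 5]
n = 2  # assumed sample size, chosen for illustration

mu = Fraction(sum(population), len(population))  # population mean, 8/3

# All ordered samples of size n drawn with replacement (3**2 = 9 of them).
samples = list(product(population, repeat=n))
sample_means = [Fraction(sum(s), n) for s in samples]

# Mean of the sampling distribution of the sample mean.
mean_of_sample_means = sum(sample_means) / len(sample_means)

print(mu, mean_of_sample_means)  # both are 8/3
```

Exact fractions are used so that the equality between the mean of the sample means and µ is not blurred by floating-point rounding.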
The mean of all possible sample means equals the population
mean µ; this is actually generally true. We say that the
sample means tend to target the population mean.
• If all possible random samples of size n are selected
(with replacement) from a population with mean µ and
standard deviation σ, the mean of the sample means is
denoted by $\mu_{\bar{x}}$, and
$$\mu_{\bar{x}} = \mu$$
and the standard deviation of the sample means is
denoted by $\sigma_{\bar{x}}$, and
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
which measures the sampling variability.
• Note that if the sample size gets larger, then the sam-
pling variability gets smaller.
• The Central Limit Theorem: For a large sample
size (n ≥ 30), the possible sample means are approxi-
mately normally distributed, regardless of the distribu-
tion of the variable under consideration.
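A minimal simulation sketch of the points above. The population (exponential with mean 1 and standard deviation 1), the seed, and the number of simulated samples are all assumptions chosen for illustration; the simulation draws many random samples and does not enumerate all possible samples.

```python
import random
from statistics import mean, stdev

random.seed(151)  # fixed seed so the run is reproducible

n = 30              # sample size
num_samples = 5000  # number of simulated samples (an arbitrary choice)

# Skewed population: exponential with mean 1 and standard deviation 1.
mu, sigma = 1.0, 1.0

# Draw num_samples samples of size n and record each sample mean.
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

simulated_mean = mean(sample_means)   # should be close to mu
simulated_sd = stdev(sample_means)    # should be close to sigma / sqrt(n)
theoretical_sd = sigma / n ** 0.5

# Share of sample means within one theoretical standard deviation of mu;
# for a normal distribution this share is about 0.68.
within_one_sd = sum(
    abs(m - mu) <= theoretical_sd for m in sample_means
) / num_samples
```

Even though the population is strongly skewed, the simulated sample means centre on µ, their spread is close to σ/√n, and roughly 68% of them fall within one standard deviation of µ, as expected for an approximately normal distribution.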
Summary
• Suppose that a variable x of a population has mean µ
and standard deviation σ. Then, for samples of size n,
– The mean of x̄ is µ. That is, $\mu_{\bar{x}} = \mu$.
– The standard deviation of x̄ is $\sigma/\sqrt{n}$.
– if x is normally distributed, so is x̄, regardless of
sample size; and
– if the sample size is large (usually n ≥ 30), x̄ is
approximately normally distributed, regardless of the
distribution of x.
i.e.
$$\bar{x} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right) \text{ approximately, when } n \ge 30$$
– As the sample size increases we would expect samples
to yield more consistent sample means, hence
the variability among the sample means would be
lower.
This is because $\sigma_{\bar{x}} = \sigma/\sqrt{n}$.
Example: IQ scores are normally distributed with mean
100 and standard deviation 16. Find the probability
that
(a) a randomly selected person has an IQ score of at
least 104
(b) a group of 64 people has an average IQ score of at
least 104.
Solution:
(a) Let x = IQ score. Then x ∼ N(100, 16). So, to
compute the probability that a randomly selected
person has an IQ score of at least 104, we should
compute P(x ≥ 104). By finding the z-score,
this means
$$P(x \ge 104) = P\left(z \ge \frac{104 - 100}{16}\right) = P(z \ge 0.25) = 1 - 0.5987 = 0.4013$$
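A minimal sketch that checks part (a) numerically and applies the same approach to part (b), assuming the result stated in the summary that the mean of n = 64 IQ scores is normally distributed with mean µ and standard deviation σ/√n. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 100, 16

# (a) One randomly selected person: x ~ N(100, 16).
p_individual = 1 - NormalDist(mu, sigma).cdf(104)

# (b) Mean of n = 64 people: the sample mean has
#     standard deviation sigma / sqrt(n) = 16 / 8 = 2.
n = 64
sigma_xbar = sigma / sqrt(n)
p_group_mean = 1 - NormalDist(mu, sigma_xbar).cdf(104)

print(round(p_individual, 4))  # 0.4013
print(round(p_group_mean, 4))  # 0.0228
```

The z-score for the group mean is (104 − 100)/2 = 2, much larger than the 0.25 for an individual, which is why an average of at least 104 is far less likely for a group of 64 than a score of at least 104 is for one person.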
So, we find that the area under the standard normal
curve between -1.88 and 1.88 equals 0.9398. Conse-
quently, 93.98% of all samples of 400 male babies have
mean birth weights within 0.125 lb of the population
mean birth weight of all male babies.
Interpretation: There is about a 94% chance that
the sampling error made in estimating the mean birth
weight of all male babies by that of a sample of 400
male babies will be at most 0.125 lb.
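A minimal sketch that recomputes the area between −1.88 and 1.88 under the standard normal curve, taking the z-scores from the text as given.

```python
from statistics import NormalDist

z = 1.88  # z-score bound quoted in the text
std_normal = NormalDist()  # mean 0, standard deviation 1

# Area under the standard normal curve between -z and z.
area = std_normal.cdf(z) - std_normal.cdf(-z)

print(round(area, 2))  # 0.94
```

The unrounded result can differ from the table-based 0.9398 in the fourth decimal place, because table entries are themselves rounded to four decimals.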
Example: Find the mean, standard deviation, and
the distribution of the sample totals of samples of size
36 if the population mean is µ = 12 and the population
standard deviation is σ = 5.
Solution:
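A minimal sketch of the computation, assuming the standard results that the total of n independent observations has mean nµ and standard deviation √n·σ, and that the total is approximately normally distributed when n ≥ 30 (it is n times the sample mean).

```python
from math import sqrt

mu, sigma, n = 12, 5, 36

# Mean and standard deviation of the sample total (sum of n observations).
mean_of_totals = n * mu         # 36 * 12 = 432
sd_of_totals = sqrt(n) * sigma  # 6 * 5 = 30

# n >= 30, so by the Central Limit Theorem the sample totals are
# approximately normally distributed with these parameters.
print(mean_of_totals, sd_of_totals)  # 432 30.0
```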
• The standard deviation of the sample proportions p̂ is
$\sqrt{p(1-p)/n}$, i.e.
$$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$$
which measures the sampling variability.
• Note that if the sample size gets larger, then the sam-
pling variability gets smaller.
• Moreover, the sample proportions are approximately normally
distributed for large enough n.
Example: Assume that about 20% of students in
Canadian universities are international students. Com-
pute the probability that
(a) out of 100 randomly selected students, at least 10%
are international students.
(b) out of 81 randomly selected students, at most 18 of
them are international students.
Solution:
(a) The sample proportion of international students in
a group of 100 students is approximately p̂ ∼ N(0.2, 0.04).
That is, the mean of the sample proportion is 0.2 and
the standard deviation is
$\sqrt{p(1-p)/n} = \sqrt{0.2(1-0.2)/100} = 0.04$.
So, to compute the probability that out of
100 randomly selected students, at least 10% are
international students, we should compute P(p̂ ≥ 0.1).
By finding the z-score, this means
$$P(\hat{p} \ge 0.1) = P\left(z \ge \frac{0.1 - 0.2}{0.04}\right) = P(z \ge -2.5) = 1 - P(z < -2.5) = 1 - 0.0062 = 0.9938$$
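A minimal sketch that checks part (a) numerically and applies the same normal approximation to part (b), where "at most 18 of 81" is read as p̂ ≤ 18/81. No continuity correction is used, matching the approach in part (a); the helper name `phat_distribution` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

p = 0.20  # assumed population proportion of international students

def phat_distribution(p, n):
    """Approximate normal distribution of the sample proportion."""
    return NormalDist(mu=p, sigma=sqrt(p * (1 - p) / n))

# (a) n = 100, at least 10% international students.
dist_a = phat_distribution(p, 100)   # standard deviation 0.04
prob_a = 1 - dist_a.cdf(0.10)

# (b) n = 81, at most 18 students, i.e. p-hat <= 18/81.
dist_b = phat_distribution(p, 81)    # standard deviation 0.4/9
prob_b = dist_b.cdf(18 / 81)

print(round(prob_a, 4))  # 0.9938
print(round(prob_b, 4))  # 0.6915
```

In part (b) the z-score is (18/81 − 0.2)/(0.4/9) = 0.5, so the probability is the area to the left of z = 0.5.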