Unit 3. Sampling Distribution
The goal of many research projects is to learn more about a population, that is, the whole group
you are interested in. The question is, what steps can
we take to understand the population better?
What we can do is take a sample. Using the sample we collect, we can construct estimates
of the population parameters that we are interested in.
The sampling distribution is important because it can be viewed as the bridge that takes us from
probability to statistical inference. It shows every possible value a statistic can take over all possible
samples of a given size from a population, and how often each value occurs.
This unit focuses on understanding the basic concepts of the sampling distribution of the sample
mean. We will also compute its mean and variance and solve problems using the Central
Limit Theorem.
Lesson Objectives:
Let’s Review:
Suppose we have 50 numbered papers in a box. If you were asked to select 10 papers from that
box, what would you consider?
To reduce the potential for human bias in the selection of elements to be included, we use
sampling. Sampling is a statistical method of obtaining representative data or observations from a group.
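One way to take human judgment out of the paper-drawing task above is to let a random number generator pick the papers. The following is a minimal sketch in Python, assuming the papers are numbered 1 to 50; the seed value is arbitrary and is there only so the draw can be repeated.

```python
import random

# Papers numbered 1 to 50, as in the box described above.
papers = list(range(1, 51))

# A seeded generator makes the draw reproducible; the seed 3 is an arbitrary choice.
rng = random.Random(3)

# sample() draws without replacement, so no paper can be picked twice.
selected = rng.sample(papers, k=10)

print(sorted(selected))
```

Every paper has the same chance of ending up among the 10 selected, which is the idea behind simple random sampling.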
Let’s Explore:
A teacher wants to understand more about the schedules of the students at a single university.
Suppose the university has 8,000 students and that 200 of them were surveyed. Identify
the population and the sample for this problem.
The population is all 8,000 students of the university, and the sample is the 200 students who were surveyed.
Why do we use a sample instead of the entire population? Most of the time, human
resources are limited and there is not enough time, so studying the entire population is not possible.
A population consists of all individuals or units that are being studied while a sample is a group
of individuals or units selected from a population.
Figure 1. The relationship between a population (POPULATION) and a sample (SAMPLE) selected from it
What we are usually after in a study is a description of the entire population. A measure that
describes a population is called a parameter. But parameters are difficult to obtain exactly. In statistics, to
estimate a population parameter, we use a sample statistic. A measure that describes a sample is called
a statistic.
Name                      Population Parameter   Sample Statistic
Size                      N                      n
Mean                      μ                      x̄
Variance                  σ²                     s²
Standard Deviation        σ                      s
Proportion                π                      p
Correlation Coefficient   ρ                      r
Regression Coefficient    β                      b
Table 1. Comparison between parameters and statistics (symbols)
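The difference between a parameter and a statistic can be sketched numerically. The example below is only an illustration: the population values are made up, the seed is arbitrary, and Python's standard `statistics` module is used, where `pvariance` divides by N (population variance) and `variance` divides by n - 1 (sample variance).

```python
import random
import statistics

# Hypothetical population of N = 8 values (made up for illustration).
population = [4, 8, 6, 5, 3, 7, 9, 6]

# Parameters describe the whole population.
N = len(population)
mu = statistics.mean(population)           # population mean
sigma2 = statistics.pvariance(population)  # population variance, divides by N

# Draw a sample of n = 4 values without replacement (arbitrary seed).
rng = random.Random(1)
sample = rng.sample(population, k=4)

# Statistics describe only the sample and serve as estimates of the parameters.
n = len(sample)
x_bar = statistics.mean(sample)            # sample mean
s2 = statistics.variance(sample)           # sample variance, divides by n - 1

print("parameters:", N, mu, sigma2)
print("statistics:", n, x_bar, s2)
```

The parameters stay fixed for a given population, while the statistics change whenever a different sample is drawn.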
Statisticians use different methods of sampling to obtain samples that give each element in the
population an equal chance of being selected.
2. Systematic Sampling
Systematic sampling is a sampling method in which the elements to be included from a
population are selected at regular intervals.
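Selection at regular intervals can be sketched as follows. This is a minimal illustration, assuming a hypothetical population of 50 numbered elements and a desired sample size of 10; the names `k` (the interval) and `start` (the random starting position) are introduced here for the example only.

```python
import random

# Hypothetical population: 50 elements numbered 1 to 50.
population = list(range(1, 51))
n = 10                      # desired sample size
k = len(population) // n    # sampling interval, here 5

# Pick a random starting position within the first interval (arbitrary seed).
rng = random.Random(7)
start = rng.randrange(k)

# Take every k-th element from the starting position onward.
sample = population[start::k]

print(sample)
```

Once the starting position is fixed, the rest of the sample is fully determined by the interval.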
4. Cluster Sampling
Cluster sampling is a sampling method in which the population is divided into clusters. Some of
the clusters are randomly selected. The sample consists of the elements from the selected
clusters only.
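The two steps of cluster sampling (divide into clusters, then randomly select whole clusters) can be sketched as below. The sections and student labels are hypothetical and exist only to make the example concrete; the seed is arbitrary.

```python
import random

# Hypothetical population: 6 sections (clusters) with 4 students each.
clusters = {
    f"Section {i}": [f"S{i}-{j}" for j in range(1, 5)]
    for i in range(1, 7)
}

# Randomly select 2 whole clusters (arbitrary seed for reproducibility).
rng = random.Random(11)
chosen = rng.sample(sorted(clusters), k=2)

# The sample contains every element of the chosen clusters and nothing else.
sample = [student for name in chosen for student in clusters[name]]

print(chosen)
print(sample)
```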
Let’s Summarize:
To reduce the potential for human bias in the selection of elements to be included, we use
sampling. Sampling is a statistical method of obtaining representative data or observations from a group.
When the use of the entire population is not possible, we use a sample. A population consists of
all individuals or units that are being studied, while a sample is a group of individuals or units selected
from a population.
Since parameters are difficult to obtain most of the time, we estimate a population parameter with a sample statistic.
A parameter is a measure that describes a population, while a statistic is a measure that describes a sample.
Statisticians use different methods of sampling to obtain samples: simple random
sampling, systematic sampling, stratified sampling, and cluster sampling.
Let’s Try:
Given the following problems, decide if the situation is dealing with a population data set or with
a sample data set.
1. Mr. Medina wants to do a statistical analysis of the students’ grades in his Algebra class for the past
year.
2. Mrs. Panoy wants to conduct a survey of all freshman students in the Philippines to determine
the number of pets in each student’s household.
3. A teacher needs to find the variance of the heights of all the students in his Trigonometry class.
Solution:
1. POPULATION DATA SET. Mr. Medina is working only with the students’ grades from his own class. He
has all of the data, and there is no need to generalize the results.
2. SAMPLE DATA SET. In this situation, the population is extremely large. It is difficult to collect data from the
entire population, so in practice only a sample can be surveyed.
3. POPULATION DATA SET. The population in this problem is only the students in one specific
class. The entire population is available for this situation.
Let’s Practice:
1. A survey will be given to 100 students randomly selected from the freshmen class at Laguna
State Polytechnic University. What is the population?
A. The 100 selected students
B. All freshmen at Laguna State Polytechnic University
C. All students at Laguna State Polytechnic University
2. Sixty bottles of soft drinks were randomly selected from a large collection of bottles in a
container. The large collection of bottles is referred to as the
A. Sample
B. Parameter
C. Population
3. A researcher wants to estimate the average weight of children aged 7 – 10. From a simple
random sample of 50 children, the researcher obtains a sample mean weight of 23.4 kg. Identify
the statistic in the study.
A. Average weight of children aged 7 – 10
B. 50 children selected
C. Sample mean weight of 23.4 kg
4. A teacher wants to estimate the proportion of adults aged 30 or older who had read at least one
book the previous year. A random sample of 325 adults aged 30 or older is obtained, and 128 of
those adults had read at least one book during the previous year. What is the parameter in this
study?
A. 128 adults who had read at least one book during the previous year
B. 325 adults aged 30 or older
C. The proportion of adults aged 30 or older who had read at least one book the previous year
Let’s Reflect:
Lesson Objectives:
At the end of the lesson, learners …
Let’s Review:
Suppose 5 people take a test that has a maximum score of 100. Their scores are 78, 65, 86, 91
and 68. What is their average score?
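The review question can be checked with a few lines of Python. This is a small sketch using the five scores given above and the standard `statistics` module.

```python
import statistics

# The five test scores from the review question.
scores = [78, 65, 86, 91, 68]

total = sum(scores)                # 78 + 65 + 86 + 91 + 68 = 388
average = statistics.mean(scores)  # 388 / 5

print(average)  # 77.6
```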
Let’s Explore:
\[ \mu = \frac{\sum X}{N} = \frac{34}{5} = 6.8 \]

\[ \sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{40.80}{5} = 8.16 \]
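The population mean and variance can be recomputed step by step. The sketch below assumes the population is 3, 5, 6, 9, 11, which is inferred from the values that appear in the samples of Table 2 (their sum is 34 and N = 5); `statistics.pvariance` is used only as a cross-check.

```python
import statistics

# Population inferred from the samples listed in Table 2 (an assumption).
population = [3, 5, 6, 9, 11]
N = len(population)

# Population mean: sum of the values divided by N.
mu = sum(population) / N

# Population variance: average squared deviation from the mean.
squared_devs = [(x - mu) ** 2 for x in population]
sigma2 = sum(squared_devs) / N

# Cross-check against the standard library's population variance.
check = statistics.pvariance(population)

print(round(mu, 2), round(sigma2, 2))  # 6.8 8.16
```

The squared deviations are 14.44, 3.24, 0.64, 4.84, and 17.64, which add up to 40.80.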
C. List all the possible samples of size 2 that can be drawn from the population and compute
the mean for each sample.
Observation   Sample   Mean (x̄)
1             3, 5     4.0
2             3, 6     4.5
3             3, 9     6.0
4             3, 11    7.0
5             5, 6     5.5
6             5, 9     7.0
7             5, 11    8.0
8             6, 9     7.5
9             6, 11    8.5
10            9, 11    10.0
Table 2. Possible samples of size 2
Notice that there are 10 possible samples of size 2 that can be drawn from the given population.
Observe that some values of the mean (x̄) appear in more than one sample.
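The list of samples can also be generated programmatically. The sketch below assumes the population 3, 5, 6, 9, 11 (the values appearing in Table 2) and uses `itertools.combinations`, which enumerates samples drawn without replacement where order does not matter.

```python
from collections import Counter
from itertools import combinations

population = [3, 5, 6, 9, 11]  # values appearing in Table 2
n = 2                          # sample size

# All possible samples of size 2, drawn without replacement.
samples = list(combinations(population, n))

# Mean of each sample.
sample_means = [sum(s) / n for s in samples]

for i, (s, m) in enumerate(zip(samples, sample_means), start=1):
    print(i, s, m)

# Means that occur in more than one sample.
repeated = [m for m, count in Counter(sample_means).items() if count > 1]
print(repeated)
```

With this population, the only repeated mean is 7.0, which comes from the samples (3, 11) and (5, 9).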
Note that each of the sample means represents an estimate of the population mean. A
sampling distribution of sample means is the probability distribution of the sample means of all
possible samples of a given size drawn from a population.
\[ \mu_{\bar{x}} = \sum \bar{x} \cdot P(\bar{x}) = 6.8 \]
\[ \sigma^2_{\bar{x}} = \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1} = \frac{8.16}{2} \cdot \frac{5 - 2}{5 - 1} = 3.06 \]
Thus, the variance of the sampling distribution of the sample means is 3.06.
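As a check, the mean and variance of the sampling distribution can be computed directly from the 10 sample means and compared with the formula that uses the finite population correction factor (N - n)/(N - 1). The sketch assumes the population 3, 5, 6, 9, 11 from Table 2 and uses exact fractions to avoid rounding noise; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

population = [3, 5, 6, 9, 11]  # values appearing in Table 2
N, n = len(population), 2

# Exact mean of every possible sample of size n.
means = [Fraction(sum(s), n) for s in combinations(population, n)]

# Sampling distribution: P(x_bar) = frequency / number of samples.
dist = {m: Fraction(c, len(means)) for m, c in Counter(means).items()}

# Mean and variance computed directly from the sampling distribution.
mu_xbar = sum(m * p for m, p in dist.items())
var_xbar = sum((m - mu_xbar) ** 2 * p for m, p in dist.items())

# The same variance from the population variance and the correction factor.
mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N
var_formula = sigma2 / n * Fraction(N - n, N - 1)

print(float(mu_xbar), float(var_xbar), float(var_formula))  # 6.8 3.06 3.06
```

Both routes give 3.06, and the mean of the sample means equals the population mean of 6.8.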
Let’s Summarize:
Let’s Try:
Let’s Reflect:
Lesson Objectives:
Let’s Review:
A teacher recorded the following ages of four children walking at the school playground: 3, 5, 7 and
9. What is the standard deviation of the ages of the four children?
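The answer depends on whether the four children are treated as the whole population or as a sample from a larger group, and the question does not say which. The sketch below computes both with Python's `statistics` module: `pstdev` divides by N and `stdev` divides by n - 1.

```python
import statistics

ages = [3, 5, 7, 9]

# Treating the four children as the whole population (divide by N).
sigma = statistics.pstdev(ages)

# Treating them as a sample from a larger group (divide by n - 1).
s = statistics.stdev(ages)

print(round(sigma, 2), round(s, 2))  # 2.24 2.58
```

The mean age is 6 and the squared deviations are 9, 1, 1, and 9, so the population variance is 20/4 = 5 and the sample variance is 20/3.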
Let’s Explore:
What is the use of the standard deviation? The standard deviation is a measure of how spread out
the values are around the mean. Using it, we have a standard way of judging which values are typical
and which are unusually far from the mean.
Let’s Summarize:
Let’s Try:
Let’s Practice:
Let’s Reflect: