Statistics and Probability Module 3 Modified
Module 3
What’s In
Some research studies aim to study, describe, and infer patterns of behaviour, properties,
and characteristics of a population. Sometimes the population of interest is very large,
and studying all of it is infeasible, impractical, or inconvenient, so we must select a
representative sample from the population. In this lesson, you will learn sampling techniques
that help researchers select samples that support true inferences about the population from
which these samples came.
Analysis: Case number 1 deals only with the students' final examination scores in one
specific class of 40 students. The intent is not to generalize to the final examination
scores of students in a much bigger population. Since the teacher has the entire population
available in this situation, she should use all of it. In case number 2, the population is
extremely large, and obtaining all of the data in the population is impractical and
inconvenient. You simply will not have all of the data available for your use,
especially if you only have limited time. You will need to use a sample of the population.
What is It
A population is the group you want to generalize about. It consists of all the members of
the group you are interested in. A sample is the subset of the population you want to
examine. A population commonly contains too many individuals to study conveniently and
practically, so an investigation is often restricted to one or more samples drawn from it.
A well-chosen sample will contain most of the information about a particular population
parameter, but the relation between the sample and the population must be such as to allow
true inferences to be made about the population from that sample.
Sampling is a process used in statistical analysis in which a predetermined number of
observations are taken from a larger population. There are various sampling methods that
allow all the units in the population to have an equal chance of being selected. These sampling
methods are discussed below.
a. Lottery Method
Every member is assigned a unique number. These numbers are put in a jar and
thoroughly mixed. After that, the researcher picks some numbers without looking at
them, and the people with those numbers are included in the study.
b. Use of Table of Random Numbers
This table consists of a series of digits (0-9) that are generated randomly. The
numbers are arranged in rows and columns and can be read in any direction. All the
digits are equally probable.
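The lottery method can be imitated in software. The following is a minimal sketch, assuming a hypothetical population of 40 numbered members and using Python's `random` module in place of a jar or a printed table of random numbers; the function name `lottery_sample` is illustrative only.

```python
import random

def lottery_sample(population, sample_size, seed=None):
    """Draw a simple random sample without replacement, like picking numbers from a jar."""
    rng = random.Random(seed)  # the seed is optional; it only makes the draw repeatable
    return rng.sample(population, sample_size)

# Hypothetical population: members numbered 1 to 40.
members = list(range(1, 41))
chosen = lottery_sample(members, sample_size=10, seed=1)
print(chosen)
```

Every member has the same chance of being picked, and no member can be picked twice.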
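To determine the sample size needed for a population of a given size, different formulas
can be used, one of which is Slovin’s Formula:

$$n = \frac{N}{1 + Ne^2}$$

where n is the sample size, N is the population size, and e is the margin of error.

Example: If N = 1000 and e = 0.05,

$$n = \frac{1000}{1 + 1000(0.05)^2} = \frac{1000}{1 + 1000(0.0025)} = \frac{1000}{3.5} \approx 285.71$$

Rounding up, n = 286 (sample size).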
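Slovin's formula in its usual form, n = N / (1 + N e²), is easy to check with a few lines of code. The sketch below assumes a margin of error e = 0.05 and rounds the result up to the next whole person; the rounding rule and the function name `slovin_sample_size` are assumptions made for illustration.

```python
import math

def slovin_sample_size(population_size, margin_of_error):
    """Return the Slovin sample size n = N / (1 + N * e**2), rounded up."""
    n = population_size / (1 + population_size * margin_of_error ** 2)
    return math.ceil(n)

print(slovin_sample_size(1000, 0.05))  # 286
```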
2. Systematic Random Sampling: This is done by listing all the elements in the
population and selecting every kth element in the population list. It is as
precise as simple random sampling and is often used on long population lists. To
determine the interval k used to identify who will participate in the study, use the
formula

$$k = \frac{N}{n}$$

where N is the population size and n is the sample size.
Example:
If the population is N = 2000 and the sample size is n = 500, then k = N/n = 2000/500 = 4.
Use a table of random numbers to determine the starting point for selecting every 4th
subject. With the list of the 2000 subjects in the sampling frame, go to the starting point
and select every 4th name on the list until the sample size is reached. You will probably
have to return to the beginning of the list to complete the selection of the sample.
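The procedure in this example can be sketched in code. The sketch assumes a hypothetical sampling frame of 2000 labelled subjects and uses a pseudo-random starting point instead of a table of random numbers; `systematic_sample` is an illustrative name, not part of the module.

```python
import random

def systematic_sample(sampling_frame, sample_size, seed=None):
    """Select every k-th element, wrapping around to the start of the list if needed."""
    population_size = len(sampling_frame)
    k = population_size // sample_size       # sampling interval k = N / n
    rng = random.Random(seed)
    start = rng.randrange(population_size)   # random starting point in the list
    return [sampling_frame[(start + i * k) % population_size]
            for i in range(sample_size)]

frame = [f"Subject {i}" for i in range(1, 2001)]
sample = systematic_sample(frame, sample_size=500, seed=3)
print(len(sample), sample[:3])
```

The modulo operation is what returns the selection to the beginning of the list once the end is reached.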
Example: Assume you have a population of 1000 students, with 500 from grade school, 300
from high school, and 200 from senior high school. First determine how many samples you
need, using Slovin’s Formula or any other formula for computing the sample size. Suppose
the required sample size is 400. To get the number of samples from each stratum, divide
400 by 1000 to get the sampling fraction 0.4, then multiply the number of students in each
stratum by 0.4 (e.g., 0.4 x 500 grade school students = 200).
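The allocation for all three strata can be computed the same way. This is a small sketch of proportional allocation using the stratum sizes from the example and a total sample size of 400; the function name and the use of `round` are illustrative choices.

```python
def proportional_allocation(strata_sizes, sample_size):
    """Return the number of units to sample from each stratum."""
    population_size = sum(strata_sizes.values())
    return {name: round(size * sample_size / population_size)
            for name, size in strata_sizes.items()}

strata = {"grade school": 500, "high school": 300, "senior high school": 200}
allocation = proportional_allocation(strata, sample_size=400)
print(allocation)
# {'grade school': 200, 'high school': 120, 'senior high school': 80}
```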
Stratified Sampling
A teacher who is conducting research on the effects of using mobile phones in teaching
English decided to divide her students into male and female groups and then selected
students from each gender group.
Systematic Sampling
The school office personnel gave the researcher a list of 2000 Grade 10 students. The
researcher selected every 25th name on the list.
Cluster Sampling
A researcher surveyed all dengue patients in each of the 10 randomly selected hospitals in
Misamis Oriental.
What’s In
In the previous lessons, you learned about random sampling, a process done in
selecting an unbiased representative sample from a population. This time, you will be
introduced to the measures involved in the population and sample. Measures from a very large
population are impossible to obtain especially if your resources are limited. This brings us to
other measures which are from the representative samples. This lesson will help you identify
measures about population and sample.
What’s New
Study the cases below. Identify which of the cases involves measures from a population
and a sample.
1. A researcher randomly selected a sample of 1600 people in Cagayan de Oro City and
asked if they use a certain detergent brand and 40% of them said yes.
2. A researcher interviewed all the members of the Mathematics department with 10 female
teachers, 15 male teachers and 1 department head. He wants to know the average hours
per day they spend in training students for competitions and found out that they spend an
average of 2 hours per day for training.
Analysis: The first case contains a measure from a sample. It indicates that 40% of the
1600 people sampled said yes. On the other hand, the second case contains a measure from a
population because the average of 2 hours per day spent in training is computed from all of
the people in the Mathematics department.
What is It
Examples:
Parameter: 50% of the 24 Philippine senators agreed to support a certain measure.
Statistic: A researcher found out that 25% of the students in the Philippines reported to have
internet connection at home.
Explanation:
1. The first example is a parameter because it is computed from all 24 senators in the
Philippines, which is the entire population.
2. The second example is a statistic because researchers cannot ask millions of students
if they have an internet connection at home, so they take a sample from the target
population and calculate the percentage from it.
Let’s Summarize!
• A parameter is a measure that describes a population. Parameters include the population
mean $\mu$, population variance $\sigma^2$, and population standard deviation $\sigma$.
• A statistic is a measure that describes a sample. Statistics include the sample mean
$\bar{x}$, sample variance $s^2$, and sample standard deviation $s$.
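The difference between a parameter and a statistic shows up in how the variance is computed. The sketch below uses Python's `statistics` module on hypothetical data; it assumes the usual convention that the population variance divides by N while the sample variance divides by n - 1, a detail that this summary does not spell out.

```python
import statistics

data = [2, 5, 8]  # hypothetical values

# Treating the data as an entire population (parameters):
mu = statistics.mean(data)
sigma_squared = statistics.pvariance(data)   # divides by N
sigma = statistics.pstdev(data)

# Treating the same data as a sample (statistics):
x_bar = statistics.mean(data)
s_squared = statistics.variance(data)        # divides by n - 1
s = statistics.stdev(data)

print(mu, sigma_squared, round(sigma, 2))
print(x_bar, s_squared, round(s, 2))
```

For these values the population variance is 6 and the sample variance is 9, while the mean is 5 in both cases.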
Lesson 3 Sampling Distribution of the
Sample Means
What’s In
In the previous chapters, you learned about discrete and continuous probability
distributions. In this lesson, you will learn how to identify and construct the
sampling distribution of sample means. You will also learn how to find the mean and
variance of the sampling distribution of the sample means.
What’s New
Activity
1. A population consists of 2, 3, and 4. List all possible samples of size 2 which can be drawn
with replacement from this population and compute the mean of each sample. One
possible sample is given as your guide.
Sample    Mean
2, 3      2.5
What is It
If you list all the possible samples of size 2 drawn with replacement from a population
of 3 elements, 2, 3, and 4, you will have 9 samples. The table below shows the 9 samples
and their corresponding means.
Sample Sample Mean
2, 2 2.0
2, 3 2.5
2, 4 3.0
3, 2 2.5
3, 3 3.0
3, 4 3.5
4, 2 3.0
4, 3 3.5
4, 4 4.0
This lets us construct a probability distribution of the sample means. We shall
call this distribution the sampling distribution of sample means.
Sample Mean $\bar{x}$    Frequency    Probability $P(\bar{x})$
2                        1            1/9
2.5                      2            2/9
3                        3            3/9 = 1/3
3.5                      2            2/9
4                        1            1/9
Total                    n = 9        1.00
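The table can be reproduced by enumerating every sample. The following sketch assumes the population 2, 3, 4 and samples of size 2 drawn with replacement, and uses exact fractions so the probabilities appear as 1/9, 2/9, and 1/3.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [2, 3, 4]
sample_size = 2

# All ordered samples drawn with replacement: (2, 2), (2, 3), ..., (4, 4)
samples = list(product(population, repeat=sample_size))
means = [Fraction(sum(s), sample_size) for s in samples]

# Frequency of each sample mean, then probability = frequency / number of samples
frequency = Counter(means)
distribution = {m: Fraction(f, len(samples)) for m, f in sorted(frequency.items())}

for m, p in distribution.items():
    print(float(m), frequency[m], p)
```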
Finding the Mean and Variance of the Sampling Distribution of Sample Means
The following are formulae needed to compute the mean, variance and standard
deviation of a population and mean, variance, and standard deviation of the sampling
distribution of sample means.
Example 1:
$$\sigma = \sqrt{6} \approx 2.45$$
Hence, the population standard deviation is 2.45.
d. List all the possible samples of size 2 with replacement and their corresponding means.
Observation    Samples    $\bar{x}$
1              2, 2       2.0
2              2, 5       3.5
3              2, 8       5.0
4              5, 2       3.5
5              5, 5       5.0
6              5, 8       6.5
7              8, 2       5.0
8              8, 5       6.5
9              8, 8       8.0
e. Find the mean of the sampling distribution of means.
Observation    Samples    $\bar{x}$
1              2, 2       2.0
2              2, 5       3.5
3              2, 8       5.0
4              5, 2       3.5
5              5, 5       5.0
6              5, 8       6.5
7              8, 2       5.0
8              8, 5       6.5
9              8, 8       8.0
                          $\sum \bar{x} = 45$

$$\mu_{\bar{x}} = \frac{\sum \bar{x}}{n} = \frac{45}{9} = 5$$

$$\sigma^2_{\bar{x}} = \frac{\sum (\bar{x} - \mu_{\bar{x}})^2}{n} = \frac{27}{9} = 3$$
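To see where the numerator 27 comes from, the squared deviations of the nine sample means (2.0, 3.5, 5.0, 3.5, 5.0, 6.5, 5.0, 6.5, 8.0) from the mean 5 can be written out, grouping equal values together:

```latex
\begin{aligned}
\sum (\bar{x} - \mu_{\bar{x}})^2
  &= (2 - 5)^2 + 2(3.5 - 5)^2 + 3(5 - 5)^2 + 2(6.5 - 5)^2 + (8 - 5)^2 \\
  &= 9 + 4.5 + 0 + 4.5 + 9 \\
  &= 27
\end{aligned}
```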
Solution:
a. Compute the population mean.
$$\mu = \frac{\sum x}{N} = \frac{1 + 3 + 5}{3} = \frac{9}{3} = 3$$

$$\sigma = \sqrt{2.67} \approx 1.63$$
Observation    Samples    $\bar{x}$
1              1, 3       2
2              1, 5       3
3              3, 1       2
4              3, 5       4
5              5, 1       3
6              5, 3       4
e. Find the mean of the sampling distribution of means.
Observation    Samples    $\bar{x}$
1              1, 3       2
2              1, 5       3
3              3, 1       2
4              3, 5       4
5              5, 1       3
6              5, 3       4
                          $\sum \bar{x} = 18$

$$\mu_{\bar{x}} = \frac{\sum \bar{x}}{n} = \frac{18}{6} = 3$$

Hence, the mean of the sampling distribution of sample means is 3.

$$\sigma^2_{\bar{x}} = \frac{\sum (\bar{x} - \mu_{\bar{x}})^2}{n} = \frac{4}{6} \approx 0.67$$
Hence, the variance of the sampling distribution of sample means is 0.67.
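The numerator 4 is the sum of the squared deviations of the six sample means (2, 3, 2, 4, 3, 4) from the mean 3. Written out with equal values grouped together:

```latex
\begin{aligned}
\sum (\bar{x} - \mu_{\bar{x}})^2
  &= 2(2 - 3)^2 + 2(3 - 3)^2 + 2(4 - 3)^2 \\
  &= 2 + 0 + 2 \\
  &= 4
\end{aligned}
```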
1. What do you notice about the population mean and the mean of the sampling
distribution of sample means? How do you compare them?
2. How do you compare the population variance and the variance of the sampling
distribution of sample means?
Let us summarize the example above by comparing the means and variances of
population and the sampling distribution of the sample means.
                      With Replacement                                Without Replacement
                      Population           Sampling Distribution      Population            Sampling Distribution
                                           of Sample Means                                  of Sample Means
Mean                  $\mu = 5$            $\mu_{\bar{x}} = 5$         $\mu = 3$             $\mu_{\bar{x}} = 3$
Variance              $\sigma^2 = 6$       $\sigma^2_{\bar{x}} = 3$    $\sigma^2 = 2.67$     $\sigma^2_{\bar{x}} = 0.67$
Standard Deviation    $\sigma = 2.45$      $\sigma_{\bar{x}} = 1.73$   $\sigma = 1.63$       $\sigma_{\bar{x}} = 0.82$
If all possible samples of size n are drawn from a population of size N with
mean $\mu$ and variance $\sigma^2$, then the sampling distribution of the sample means has
the following properties.
With Replacement
• The mean of the sampling distribution of means is equal to the mean of the
population.
𝜇𝑥̅ = 𝜇
• The variance of the sampling distribution of means is equal to the population
variance divided by the sample size n. That is,
$$\sigma^2_{\bar{x}} = \frac{\sigma^2}{n}$$
• The standard deviation of the sampling distribution of means is equal to the
population standard deviation divided by the square root of the sample size n.
That is,
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
Without Replacement
• The mean of the sampling distribution of means is equal to the mean of the
population.
𝜇𝑥̅ = 𝜇
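• The variance of the sampling distribution of means is equal to the population
variance divided by the sample size n, multiplied by the finite population correction
factor (N − n)/(N − 1). That is,
$$\sigma^2_{\bar{x}} = \frac{\sigma^2}{n}\left(\frac{N - n}{N - 1}\right)$$
• The standard deviation of the sampling distribution of means is equal to the
population standard deviation divided by the square root of the sample size n,
multiplied by the square root of the finite population correction factor. That is,
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}}$$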
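These properties can be checked numerically against the two worked examples. The sketch below enumerates every ordered sample of size 2 from the populations 2, 5, 8 (with replacement) and 1, 3, 5 (without replacement) and compares the variance of the sample means with the formulas; the helper names are illustrative, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import permutations, product

def mean(values):
    return Fraction(sum(values), len(values))

def variance(values):
    """Population-style variance: average squared deviation from the mean."""
    m = mean(values)
    return sum((Fraction(v) - m) ** 2 for v in values) / len(values)

def sample_means(population, n, replacement):
    draws = product(population, repeat=n) if replacement else permutations(population, n)
    return [mean(s) for s in draws]

n = 2

# With replacement: population 2, 5, 8
pop_a = [2, 5, 8]
means_a = sample_means(pop_a, n, replacement=True)
print(mean(means_a), variance(means_a), variance(pop_a) / n)

# Without replacement: population 1, 3, 5
pop_b = [1, 3, 5]
N = len(pop_b)
means_b = sample_means(pop_b, n, replacement=False)
fpc = Fraction(N - n, N - 1)  # finite population correction factor
print(mean(means_b), variance(means_b), variance(pop_b) / n * fpc)
```

In each printed line, the second and third values agree: 3 in the first case and 2/3 (about 0.67) in the second.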
What I Have Learned
Let’s Summarize!
• If all possible samples of size n are drawn from a population of size N
with mean $\mu$ and variance $\sigma^2$, then the sampling distribution of the sample means
has the following properties.
With Replacement
• The mean of the sampling distribution of means is equal to the mean of the
population.
$$\mu_{\bar{x}} = \mu$$
• The variance of the sampling distribution of means is equal to the population
variance divided by the sample size n. That is,
$$\sigma^2_{\bar{x}} = \frac{\sigma^2}{n}$$
• The standard deviation of the sampling distribution of means is equal to the
population standard deviation divided by the square root of the sample size n.
That is,
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
Without Replacement
• The mean of the sampling distribution of means is equal to the mean of the
population.
$$\mu_{\bar{x}} = \mu$$
• The variance of the sampling distribution of means is equal to the
population variance divided by the sample size n, multiplied by the finite
population correction factor (N − n)/(N − 1). That is,
$$\sigma^2_{\bar{x}} = \frac{\sigma^2}{n}\left(\frac{N - n}{N - 1}\right)$$
• The standard deviation of the sampling distribution of means is equal to the
population standard deviation divided by the square root of the sample size n,
multiplied by the square root of the finite population correction factor. That is,
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}}$$
Lesson 4 Sampling Distribution of the Sample Means using the
Central Limit Theorem
What’s New
The central limit theorem states that the sampling distribution of the mean
approximates a normal distribution with a mean of µ and a standard deviation of
$\sigma / \sqrt{n}$ if the sample size n of the random samples is large enough. In this
case, as more samples with large sample sizes are taken from a certain population with
replacement, the sampling distribution will closely resemble a normal distribution. On the
question, “How large should the sample size be?”, statisticians differ: some suggest 30,
and others suggest 50 or more. Larger sample sizes are usually needed when the population
does not appear to be normal. In general, as the sample size increases, the sample mean
tends to be normally distributed around the population mean, and its standard deviation
decreases.
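A small simulation can illustrate the theorem. This is a sketch under stated assumptions: the skewed population below is hypothetical, the sample size of 30 and the 5000 repetitions are arbitrary choices, and a fixed seed is used only to make the run repeatable.

```python
import random
import statistics

def simulate_sample_means(population, n, repetitions, seed=0):
    """Draw `repetitions` samples of size n with replacement and return their means."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.choices(population, k=n))
            for _ in range(repetitions)]

# Hypothetical population that is clearly not normal (skewed to the right).
population = [1, 1, 1, 2, 2, 3, 4, 10]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

n = 30
means = simulate_sample_means(population, n, repetitions=5000)

print("population mean:", mu)
print("mean of sample means:", round(statistics.fmean(means), 3))
print("sigma / sqrt(n):", round(sigma / n ** 0.5, 3))
print("std. dev. of sample means:", round(statistics.pstdev(means), 3))
```

The mean of the simulated sample means should land close to the population mean, and their standard deviation close to σ/√n, even though the population itself is skewed.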
Mean:
$$\bar{x} = \frac{1 + 2 + 3 + 4 + 5 + 6}{6} = 3.5$$
(Note: Computation for population mean and sampling distribution mean is the same for n = 1.)
Variance:
$$\sigma^2 = \frac{(1-3.5)^2 + (2-3.5)^2 + (3-3.5)^2 + (4-3.5)^2 + (5-3.5)^2 + (6-3.5)^2}{6} = \frac{17.5}{6} \approx 2.92$$
Standard Deviation:
$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{17.5}{6}} \approx 1.71$$
Sample    $\bar{x}$    Probability
1         1            1/6
2         2            1/6
3         3            1/6
4         4            1/6
5         5            1/6
6         6            1/6
[Figure: histogram of the sampling distribution for n = 1; x-axis: X (1 to 6), y-axis: Percentage (0 to 0.2)]
Interpretations:
The population mean and the mean of the sampling distribution are equal; both are
3.5. The distribution has a variance of approximately 2.92 and a standard deviation of
approximately 1.71. Since all samples have the same probability of 1/6, or about 16.67%,
the histogram is flat, like a horizontal line.
If n = 2,
Population Mean:
$$\bar{X} = \frac{1 + 2 + 3 + 4 + 5 + 6}{6} = 3.5$$
Variance:
$$\sigma^2 = \frac{(1-3.5)^2 + 2(1.5-3.5)^2 + 3(2-3.5)^2 + 4(2.5-3.5)^2 + 5(3-3.5)^2 + 6(3.5-3.5)^2 + 5(4-3.5)^2 + 4(4.5-3.5)^2 + 3(5-3.5)^2 + 2(5.5-3.5)^2 + (6-3.5)^2}{36} = \frac{52.5}{36} \approx 1.46$$
Standard Deviation:
$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{52.5}{36}} \approx 1.21$$
There are 36 possible samples in total. The probabilities of the sample means are shown in
the histogram for n = 2.
[Figure: histogram of the sampling distribution for n = 2; x-axis: X (1 to 6 in steps of 0.5), y-axis: Percentage (0 to 0.2)]
Interpretations: The population mean and the mean of the sampling distribution are the
same, 3.5. The sampling distribution has a variance of approximately 1.46 and a standard
deviation of approximately 1.21. Most of the data are concentrated at the middle values of
the sample means. As observed in the graph, the distribution begins to resemble a normal
curve, which is consistent with the central limit theorem.
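The figures for n = 1 and n = 2 can be reproduced exactly by enumeration. The sketch below assumes the population is the six values 1 to 6 with equal probability and builds the exact sampling distribution of the mean of n draws with replacement; n = 3 is included only to show the trend continuing, and the function names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def sampling_distribution(faces, n):
    """Exact distribution of the mean of n draws with replacement from `faces`."""
    counts = Counter(Fraction(sum(s), n) for s in product(faces, repeat=n))
    total = len(faces) ** n
    return {m: Fraction(c, total) for m, c in sorted(counts.items())}

def describe(dist):
    """Return the mean and variance of a distribution given as {value: probability}."""
    mu = sum(m * p for m, p in dist.items())
    var = sum((m - mu) ** 2 * p for m, p in dist.items())
    return mu, var

faces = [1, 2, 3, 4, 5, 6]
for n in (1, 2, 3):
    mu, var = describe(sampling_distribution(faces, n))
    print(n, float(mu), round(float(var), 2), round(float(var) ** 0.5, 2))
```

The mean stays at 3.5 for every n, while the variance shrinks from about 2.92 (n = 1) to about 1.46 (n = 2), matching the computations above.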
Examples:
A school principal claims that grade 11 students have a mean grade of 86 with a standard
deviation of 4. Suppose that the distribution is approximately normal.
A. What is the probability that a randomly selected grade will be less than 84?
B. What is the probability that a randomly selected grade will be greater than 82 but less
than 90?
C. What is the probability that the mean of a random sample of 9 students will be less than
88?
A. What is the probability that a randomly selected grade will be less than 84?
Given: $\sigma = 4$, $X = 84$, $\mu = 86$
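For part A, the solution can be sketched in the same way as parts B and C, assuming a standard normal table in which the area between z = 0 and z = 0.5 is 0.1915:

```latex
\begin{aligned}
P(X < 84) &= P\left(z < \frac{84 - \mu}{\sigma}\right) \\
          &= P\left(z < \frac{84 - 86}{4}\right) \\
          &= P(z < -0.5) \\
          &= 0.5 - 0.1915 \\
          &= 0.3085
\end{aligned}
```

Under these assumptions, the probability that a randomly selected grade will be less than 84 is about 0.3085.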
B. What is the probability that a randomly selected grade will be greater than 82
but less than 90?
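Given: $\sigma = 4$, $X_1 = 82$, $X_2 = 90$, $\mu = 86$
Solution:
$$\begin{aligned}
P(82 < X < 90) &= P\left(\frac{82 - \mu}{\sigma} < z < \frac{90 - \mu}{\sigma}\right) \\
&= P\left(\frac{82 - 86}{4} < z < \frac{90 - 86}{4}\right) \\
&= P\left(\frac{-4}{4} < z < \frac{4}{4}\right) \\
&= P(-1 < z < 1) \\
&= 0.3413 + 0.3413 \\
&= 0.6826
\end{aligned}$$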
Conclusion:
Therefore, the probability that a randomly selected grade will be greater than 82
but less than 90 is 0.6826.
C. What is the probability that the mean of a random sample of 9 students will be
less than 88?
Given: $\sigma = 4$, $\bar{X} = 88$, $\mu = 86$, $n = 9$
Solution:
$$\begin{aligned}
P(\bar{X} < 88) &= P\left(z < \frac{88 - \mu}{\sigma / \sqrt{n}}\right) \\
&= P\left(z < \frac{88 - 86}{4 / \sqrt{9}}\right) \\
&= P\left(z < \frac{2}{4/3}\right) \\
&= P(z < 1.5) \\
&= 0.5 + 0.4332 \\
&= 0.9332
\end{aligned}$$
Conclusion:
Therefore, the probability that the mean of a random sample of 9 students will be
less than 88 is 0.9332.
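The three probabilities can be cross-checked without a z-table. The sketch below uses `statistics.NormalDist` from the Python standard library, with the mean of 86 and standard deviation of 4 from the example; for part C, the distribution of the sample mean is assumed to be normal with standard deviation σ/√n.

```python
from math import sqrt
from statistics import NormalDist

grades = NormalDist(mu=86, sigma=4)             # distribution of individual grades
p_a = grades.cdf(84)                            # A: P(X < 84)
p_b = grades.cdf(90) - grades.cdf(82)           # B: P(82 < X < 90)

n = 9
sample_mean = NormalDist(mu=86, sigma=4 / sqrt(n))  # standard deviation sigma / sqrt(n)
p_c = sample_mean.cdf(88)                       # C: P(sample mean < 88)

print(round(p_a, 4), round(p_b, 4), round(p_c, 4))
```

The value for part B may differ from the table-based answer in the fourth decimal place, because z-table entries are themselves rounded.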