Statistics and Probability 2021 - Quarter 3
PROBABILITY
Level: SENIOR HIGH SCHOOL Semester: SECOND
Subject Group: CORE SUBJECT Quarter: THIRD
Course Description:
At the end of the course, the students must know how to find the mean and variance of a
random variable, to apply sampling techniques and distributions, to estimate population mean
and proportion, to perform hypothesis testing on population mean and proportion, and to
perform correlation and regression analyses on real-life problems.
Course Requirements:
Below is the list of activities that must be completed and submitted with their corresponding
percentage.
WEEK ACTIVITIES Date of Completion Final Grade
1 Enabling Assessment Activity No.1 January 14, 2022 10%
2 Mini Performance Task 1 January 21, 2022 15%
3 Enabling Assessment Activity No.2 January 28, 2022 10%
4 Mini Performance Task 2 February 4, 2022 15%
5 Enabling Assessment Activity No.3 February 11, 2022 10%
6 Mini Performance Task 3 February 18, 2022 15%
7 Enabling Assessment Activity No.4 February 25, 2022 10%
8 Final Performance Task March 4, 2022 15%
TOTAL 100%
CRITERIA PERCENTAGE
Relevance (The output contains timely information and reasonable type of vacation options) 40%
Clarity of plan & process (The output shows clear data regarding the result of the survey) 30%
Presentation of data (Data should be presented accurately and precisely based on formula) 30%
Total 100%
Week 1 and 2
DISCRETE RANDOM VARIABLE
INTRODUCTION
In this lesson, you will learn the difference between a discrete and a continuous variable.
LEARNING MATERIALS: Module, pen, paper, internet (if applicable), scientific calculator
PRAYER: Father God, please guide me in the lesson today and help me grow in love and
kindness more like Jesus every day. AMEN
LESSON PROPER
Discrete and Continuous Random Variables:
A variable is a quantity whose value changes.
A discrete variable is a variable whose value is obtained by counting.
Colegio de Los Baños – STATISTICS AND PROBABILITY 2
Example: A fair coin is tossed twice. Let X be the number of heads that are observed.
a. Construct the probability distribution of X.
b. Find the probability that at least one head is observed.
Solution:
a. The possible values that X can take are 0, 1, and 2. Each of these numbers corresponds
to an event in the sample space
S = {hh,ht,th,tt}
The probability of each of these events, hence of the corresponding value of X, can be
found simply by counting, to give
x 0 1 2
P(x) 0.25 0.5 0.25
b. “At least one head” is the event X≥1, which is the union of the mutually exclusive
events X=1 and X=2. Thus
P(X≥1) = P(1) + P(2) = 0.5 + 0.25 = 0.75
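The coin-toss distribution can also be checked by listing the sample space with a short program. This is only a sketch in Python; the variable names are illustrative and not part of the module.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate the sample space S = {hh, ht, th, tt} for two tosses of a fair coin.
sample_space = ["".join(pair) for pair in product("ht", repeat=2)]

# X = number of heads in each outcome; each outcome has probability 1/4.
counts = Counter(outcome.count("h") for outcome in sample_space)
distribution = {x: Fraction(n, len(sample_space)) for x, n in sorted(counts.items())}

# P(X >= 1) is the sum over the mutually exclusive events X = 1 and X = 2.
p_at_least_one_head = sum(p for x, p in distribution.items() if x >= 1)

for x, p in distribution.items():
    print(x, float(p))
print("P(X >= 1) =", float(p_at_least_one_head))
```

Exact fractions are used so that the probabilities add up to exactly 1.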
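Example: A pair of fair dice is rolled. Let X be the sum of the two faces. The 36 equally
likely outcomes are
(1,1) (2,1) (3,1) (4,1) (5,1) (6,1)
(1,2) (2,2) (3,2) (4,2) (5,2) (6,2)
(1,3) (2,3) (3,3) (4,3) (5,3) (6,3)
(1,4) (2,4) (3,4) (4,4) (5,4) (6,4)
(1,5) (2,5) (3,5) (4,5) (5,5) (6,5)
(1,6) (2,6) (3,6) (4,6) (5,6) (6,6)
a. Counting the outcomes that give each sum yields the probability distribution of X:
X 2 3 4 5 6 7 8 9 10 11 12
P(X) 1/36 2/36 3/36 4/36 5/36 6/36 5/36 4/36 3/36 2/36 1/36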
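b. The event X≥9 is the union of the mutually exclusive events X=9, X=10, X=11,
and X=12. Thus
P(X≥9) = P(9) + P(10) + P(11) + P(12) = 4/36 + 3/36 + 2/36 + 1/36 = 10/36 ≈ 0.2778
c. Before we immediately jump to the conclusion that the probability that X takes an even
value must be 0.5, note that X takes six different even values but only five different odd
values. We compute
P(X is even) = P(2) + P(4) + P(6) + P(8) + P(10) + P(12)
= 1/36 + 3/36 + 5/36 + 5/36 + 3/36 + 1/36 = 18/36 = 0.5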
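The counting for the two-dice example can be sketched in Python as follows. The names used here are illustrative assumptions, and X is taken to be the sum of the two faces.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

# X = sum of the two faces; count how many outcomes give each sum.
counts = Counter(a + b for a, b in outcomes)
dice_distribution = {x: Fraction(n, 36) for x, n in sorted(counts.items())}

# Part b: add the probabilities of the mutually exclusive events X = 9, 10, 11, 12.
p_at_least_9 = sum(p for x, p in dice_distribution.items() if x >= 9)

# Part c: add the probabilities of the six even values of X.
p_even = sum(p for x, p in dice_distribution.items() if x % 2 == 0)

print(float(p_at_least_9), float(p_even))
```

Even though X has six even values and only five odd values, the even values together still carry exactly half of the probability.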
When we know the probability p of every value x we can calculate the Expected Value
(Mean) of X:
μ = Σxp
Example continued:
x 1 2 3 4 5 6
p 0.1 0.1 0.1 0.1 0.1 0.5
xp 0.1 0.2 0.3 0.4 0.5 3
μ = Σxp
= 0.1+0.2+0.3+0.4+0.5+3
= 4.5
Variance: Var(X)
Example continued:
x 1 2 3 4 5 6
p 0.1 0.1 0.1 0.1 0.1 0.5
x²p 0.1 0.4 0.9 1.6 2.5 18
Σx²p = 0.1 + 0.4 + 0.9 + 1.6 + 2.5 + 18 = 23.5
Var(X) = Σx²p − μ²
= 23.5 – (4.5)² = 3.25
The variance is 3.25
Standard Deviation: σ
σ = √Var(X)
Example continued:
x 1 2 3 4 5 6
p 0.1 0.1 0.1 0.1 0.1 0.5
x²p 0.1 0.4 0.9 1.6 2.5 18
σ = √Var(X) = √3.25 ≈ 1.803
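The three formulas (mean, variance, standard deviation) can be applied to the example table with a few lines of Python. This is a minimal sketch; the variable names are illustrative.

```python
import math

# Values and probabilities from the example table.
x_values = [1, 2, 3, 4, 5, 6]
probabilities = [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]

# Mean: sum of x * p.
mean = sum(x * p for x, p in zip(x_values, probabilities))

# Variance: sum of x^2 * p, minus the square of the mean.
mean_of_squares = sum(x**2 * p for x, p in zip(x_values, probabilities))
variance = mean_of_squares - mean**2

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print(round(mean, 2), round(variance, 2), round(std_dev, 3))
```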
In some ways, the standard deviation is the more tangible of the two measures, since it is in
the same units as X. For example, if X is a random variable measuring lengths in meters,
then the standard deviation is in meters (m), while the variance is in square meters (m²).
Unlike the mean, there is no simple direct interpretation of the variance or standard
deviation. The variance is analogous to the moment of inertia in physics, but that is not
ANSWER SHEET (Please submit only the answers. Do not return the entire module.)
Name:_______________________________ Section: _______________________
LAST NAME, FIRST NAME, MIDDLE INITIAL
ENGAGEMENT
The number of days in the winter months that a construction crew cannot
work because of the weather has the following probability distribution
X 3 4 5 6 7 8 9 10
P(X) 0.05 0.10 0.20 0.25 0.15 0.10 0.08 0.07
a. Find the probability that the crew cannot work for no more than 5 days next winter. (5 pts)
b. Find the probability that the crew cannot work for 4 to 8 days next winter. (5 pts)
c. Find the probability that the crew cannot work for at most 7 days next winter. (5 pts)
d. Compute the mean and standard deviation of X. Interpret the mean in
the context of the problem (10 pts)
ASSIMILATION
Answer in 3-5 sentences.
Is it possible for the probability of an event to be 0? Justify
your answer by giving an example. (10pts)
___________________________________________________________________
SIGNATURE OVER PRINTED NAME OF PARENT/GUARDIAN
DATE: ___________________
ANSWER SHEET (Please submit only the answers. Do not return the entire module.)
Name: ________________________________ Section: _______________________
LAST NAME, FIRST NAME MIDDLE INITIAL
Probability
Find the mean, variance and standard deviation of the said event
___________________________________________________________________
SIGNATURE OVER PRINTED NAME OF PARENT/GUARDIAN
DATE: ______________
LEARNING MATERIALS: Module, pen, paper, internet (if applicable), scientific calculator
PRAYER: Father God, please guide me in the lesson today and help me grow in love and
kindness more like Jesus every day. AMEN
INTRODUCTION:
MELC: At the end of the lesson, you should be able to:
Illustrate a normal random variable and its characteristics
Identify regions under the normal curve corresponding to different standard normal values
Convert a normal random variable to a standard normal variable and vice versa
Compute probabilities and percentiles using the standard normal table
Illustrate random sampling
Distinguish between parameter and statistic
Identify sampling distributions of statistics (sample mean)
Find the mean and variance of the sampling distribution of the sample mean
DEVELOPMENT
LESSON PROPER
LESSON 3: NORMAL RANDOM VARIABLES
Many variables, such as weight, shoe sizes, foot lengths, and other human physical
characteristics, have distributions that are symmetric and bell-shaped. The symmetry indicates that the variable is just as
likely to take a value a certain distance below its mean as it is to take a value that same
distance above its mean. The bell shape indicates that values closer to the mean are more
likely, and it becomes increasingly unlikely to take values far from the mean in either
direction.
We use a mathematical model with a smooth bell-shaped curve to describe these bell-
shaped data distributions. These models are called normal curves or normal
distributions. They were first called “normal” because the pattern occurred in many
different types of common measurements.
The general shape of the mathematical model used to generate a normal curve looks like
this:
Because normal curves are mathematical models, we use Greek letters to represent the
mean and standard deviation of a normal curve. The mean of a normal distribution locates
its center. We use the Greek letter μ (pronounced “mu” ) to represent the mean. We use the
Greek letter σ (pronounced “sigma”) to represent the standard deviation of a normal
distribution. The standard deviation determines the spread of the distribution. In fact, the
shape of a normal curve is completely determined by specifying its standard deviation. As
we will see, if two normal distributions have the same standard deviation, then the shapes of
their normal curves will be identical.
In the language of statistics, we have just found the z-score for a male foot length of 13
inches to be z = +1.33. Or, to put it another way, we have standardized the value of 13. In
general, the standardized value z tells how many standard deviations below or above the
mean the original value is. It is calculated as follows:
The convention is to denote a value of our normal random variable X with the letter x. Since
the mean is written μ and the standard deviation σ, we may write the standardized value as
z = (x − μ)/σ
Determining probabilities
The following mathematical notations on probabilities are used to simplify lengthy
expressions.
P (a < z < b) denotes the probability that the z – score is between a and b.
P (z > a) denotes the probability that the z – score is greater than a.
P (z < a) denotes the probability that the z – score is less than a.
Thus, P(1 < z < 2) = 0.1359 is read as “ the probability that the z – score falls between z = 1
and z = 2 is 0.1359.”
Case 1. The required area is: greater than a, at least a, more than a, to the right of a, or
above a.
P (z > a)
Case 2. The required area is: less than a, at most a, no more than a, to the left of a, or
below a.
P (z < a)
Case 3. The required area is between a and b then P (a < z < b)
Find the probabilities indicated, where as always Z denotes a standard normal random
variable
1. P(Z<1.48).
The cumulative probability table lets us read this probability directly, without
any computation. The digits in the ones and tenths places of 1.48, namely 1.4, are
used to select the appropriate row of the table; the hundredths part of 1.48, namely 0.08, is
used to select the appropriate column of the table. The four decimal place number in the
interior of the table that lies in the intersection of the row and column selected, 0.9306, is the
probability sought:
P(Z<1.48) = 0.9306
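If a printed table is not at hand, the same cumulative probability can be checked with Python's standard library. This is only a sketch; `NormalDist` is part of the `statistics` module in Python 3.8 and later.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

# cdf(z) gives the area under the curve to the left of z, like the table.
p_less_than_1_48 = standard_normal.cdf(1.48)

print(round(p_less_than_1_48, 4))
```

Rounded to four decimal places, this agrees with the table entry in row 1.4, column 0.08.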
By the symmetry of the standard normal curve, for any number n,
P(Z > -n) = P(Z < n)
So P(0.5 < Z < 1.57) = P(Z < 1.57) - P(Z < 0.5)
For P(Z < 1.57) = 0.9418 and P(Z < 0.5) = 0.6915
Then P(Z < 1.57) - P(Z < 0.5) = 0.9418 – 0.6915 = 0.2503
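The "area between two z-values" computation can be sketched in Python as a difference of two cumulative areas. The variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area between a and b = cumulative area up to b minus cumulative area up to a.
p_between = z.cdf(1.57) - z.cdf(0.5)

# Symmetry check: the area to the right of -n equals the area to the left of n.
n = 1.57
right_of_minus_n = 1 - z.cdf(-n)
left_of_n = z.cdf(n)

print(round(p_between, 4))
```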
When we go to the table, we find that the value 0.90 is not there exactly; however, the values
0.8997 and 0.9015 are there and correspond to Z values of 1.28 and 1.29, respectively. Hence
we take the midpoint of the two z values as an estimate: (1.28 + 1.29)/2 = 1.285.
Sample 6: The mean BMI for men aged 60 is 29 with a standard deviation of 6. What is the 90th
percentile of BMI for men?
Given
μ = 29 σ = 6
Find: X90 = value at 90th percentile
Remember that Z(0.90) = 1.285 and z-score = (value – mean) / standard deviation.
So rearranging z = (x − μ)/σ gives X = μ + Zσ
Therefore X = μ + Zσ
X = 29 + (1.285 * 6) = 36.71
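The percentile conversion X = μ + Zσ can also be sketched in Python. Here the z-value is computed with `inv_cdf` instead of being read from the table, so the result differs slightly from the table-based 36.71.

```python
from statistics import NormalDist

mu, sigma = 29, 6  # mean and standard deviation of BMI from Sample 6

# z-value whose cumulative area is 0.90 (the table lookup approximates it as 1.285).
z_90 = NormalDist().inv_cdf(0.90)

# Convert the z-value back to the original scale: X = mu + Z * sigma.
x_90 = mu + z_90 * sigma

print(round(z_90, 3), round(x_90, 2))
```

The small difference comes only from the rounding of Z in the table.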
ANSWER SHEET (Please submit only the answers. Do not return the entire module.)
Name:_______________________________ Section: _______________________
ENGAGEMENT
Enabling Assessment Activity No.2. Normal Random Variable
ASSIMILATION
Answer in 3-5 sentences.
Stat teachers use Normal distribution curve to know if cheating incident happened during an
exam. How are they able to know if cheating happened just by looking at the scores of the
students? (10pts)
___________________________________________________________________
DATE: ____________________
Random sampling is a method of selection in which every element in a population has an equal chance
of being chosen for the sample.
Sampling Techniques
Cluster sampling starts by dividing a population into groups, or clusters. What makes this
different from stratified sampling is that each cluster must be representative of the
population. Then, you randomly select entire clusters to sample.
Systematic random sampling is a very common technique in which you sample every k-th
element. For example, if you were conducting surveys at a mall, you might survey every 100th
person that walks in.
If you have a sampling frame, then you would divide the size of the frame, N, by the desired
sample size, n, to get the index number, k. You would then choose every k-th element in the
frame to create your sample.
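The procedure for systematic sampling can be sketched in Python. The function name, the random starting point, and the sampling frame of 100 numbered units are assumptions made for illustration; they are not from the module.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick every k-th element of the frame, where k = N // n."""
    N = len(frame)
    k = N // n                      # index number (sampling interval)
    rng = random.Random(seed)
    start = rng.randrange(k)        # assumed: random start within the first k elements
    return [frame[i] for i in range(start, N, k)][:n]

# Hypothetical sampling frame: N = 100 units labelled 1 to 100.
frame = list(range(1, 101))
sample = systematic_sample(frame, n=10, seed=1)
print(sample)
```

With N = 100 and n = 10, the index number is k = 10, so the selected labels are always 10 apart.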
The total set of observations that can be made is called the population; its size is normally
denoted by N.
A sample is a set of observations drawn from a population. The sample size is normally
denoted by n.
A parameter is a measurable characteristic of a population, such as a mean or standard
deviation.
A statistic is a measurable characteristic of a sample, such as a mean or standard
deviation, which in effect can be used as an estimate of the population parameter.
σ_x̄ = √( Σ[x̄² · P(x̄)] − (μ_x̄)² )
σ_x̄ = 0.5774
ANSWER SHEET (Please submit only the answers. Do not return the entire module.)
Name: ________________________________ Section: _______________________
LAST NAME, FIRST NAME MIDDLE INITIAL
ENGAGEMENT
Mini-Performance Task No.2. VARIABLE ON ROLLING A DIE (50 pts)
1. Get a six-sided die. Roll it three times.
2. Enter the values of the die based on its outcome. (The first part of the table is already
filled in as your example.)
3. Calculate the mean of the three outcomes.
4. Repeat steps 1-3 four more times.
5. Tabulate the results
Total Σ
___________________________________________________________________
SIGNATURE OVER PRINTED NAME OF PARENT/GUARDIAN
DATE: ______________
LEARNING MATERIALS: Module, pen, paper, internet (if applicable), scientific calculator, T-
table
PRAYER: Father God, please guide me in the lesson today and help me grow in love and
kindness more like Jesus every day. AMEN
INTRODUCTION:
MELC: At the end of the lesson, you should be able to:
Define the sampling distribution of the sample mean for normal population when the
variance is: (a) known (b) unknown
Illustrate the Central Limit Theorem
Define the sampling distribution of the sample mean using the Central Limit Theorem
Solve problems involving sampling distributions of the sample mean
Illustrate the t-distribution
Identify percentiles using the t-table
DEVELOPMENT
MOTIVATION - PROCESS QUESTIONS:
1. What is the significance of the Central Limit Theorem?
2. What is the purpose of the t-table?
LESSON PROPER
LESSON 5: SAMPLING DISTRIBUTION OF THE SAMPLE MEAN
If the population is normally distributed with mean μ and standard deviation σ, then the sampling
distribution of the sample mean is also normally distributed no matter what the sample size is.
When the sampling is done with replacement or if the population size is large compared to the
sample size, then x̄ has mean μ and standard deviation σ/√n. We use the term standard error for
the standard deviation of a statistic, and since the sample average x̄ is a statistic, the standard
deviation of x̄ is also called the standard error of x̄.
When we know the sample mean is Normal or approximately Normal, then we can calculate a
z-score for the sample mean and determine probabilities for it using
z = (x̄ − μ)/(σ/√n)
Sample Problem 1
The engines made by Ford for speedboats have an average power of 220 horsepower (HP)
and standard deviation of 15 HP. You can assume the distribution of power follows a normal
distribution.
Consumer Reports is testing the engines and will dispute the company's claim if the sample
mean is less than 215 HP. If they take a sample of 4 engines, what is the probability the
mean is less than 215?
Answer
We want to find P(x̄ < 215).
Since the population follows a normal distribution, we can conclude that x̄ has a normal
distribution with mean 220 HP (μ = 220) and a standard deviation of σ/√n = 15/√4 = 7.5 HP.
P(x̄ < 215) = P(Z < (215 − 220)/7.5) = P(Z < -0.67)
= 0.2514 (Z table)
If Consumer Reports samples four engines, the probability that the mean is less than 215 HP
is 25.14%.
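The calculation for the sample mean can be sketched in Python. The function name is illustrative. Because the code does not round z to two decimal places, its result is slightly different from the table value of 0.2514.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 220, 15  # population mean and standard deviation in HP

def prob_sample_mean_below(threshold, n):
    """P(sample mean < threshold) for samples of size n from a normal population."""
    standard_error = sigma / sqrt(n)          # sigma / sqrt(n)
    z = (threshold - mu) / standard_error     # z-score of the sample mean
    return NormalDist().cdf(z)

p_n4 = prob_sample_mean_below(215, n=4)
print(round(p_n4, 4))
```

The same function can be reused with a different sample size to see how a larger n makes the standard error smaller.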
TRY THIS
Using the speedboat engines example above, answer the following question.
If Consumer Reports samples 100 engines, what is the probability that the sample mean will
be less than 215?
(Answer should be about 0.0004 or 0.04%, since the standard error is 15/√100 = 1.5 HP and z = -3.33)
Example: The average number of milligrams (mg) of cholesterol in a cup of a certain brand of
ice cream is 660 mg, and the standard deviation is 35 mg. Assume the variable is normally
distributed.
a) If a cup of ice cream is selected, what is the probability that the cholesterol content will be
more than 670 mg?
Solution:
Given : μ = 660, σ = 35, X = 670 ;
find P( X > 670) ;
since this involves an individual value, the regular z – score formula is used:
z = (X − μ)/σ = (670 − 660)/35 = 0.29.
Thus P (X > 670) = P (z > 0.29) = 0.5 – 0.1141 = 0.3859.
So the probability that the cholesterol content will be more than 670 mg is 38.59%.
Example. If a sample of 10 cups of ice cream is selected, what is the probability that the mean
of the sample will be larger than 670 mg? Assume that population mean = 660 and population
standard deviation = 35.
Solution :
Given : μ = 660, σ = 35, X̄ = 670, n = 10 ;
find P(X̄ > 670) ;
since this involves a sample mean, the z formula to use is
z = (X̄ − μ)/(σ/√n) = (670 − 660)/(35/√10) = 0.90.
Thus P (X̄ > 670) = P (z > 0.90) = 0.5 – 0.3159 = 0.1841.
So the probability that the mean cholesterol content of 10 randomly selected cups of ice cream
will be more than 670 mg is 18.41%.
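The two ice cream examples differ only in the standard deviation used: σ for one cup, σ/√n for the mean of n cups. A short Python sketch of both, with illustrative variable names, is shown below. The results differ a little from the table-based answers because z is not rounded.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 660, 35  # population mean and standard deviation in mg

# One cup: use the population standard deviation.
p_single = 1 - NormalDist(mu, sigma).cdf(670)

# Mean of n = 10 cups: use the standard error sigma / sqrt(n).
n = 10
p_mean = 1 - NormalDist(mu, sigma / sqrt(n)).cdf(670)

print(round(p_single, 4), round(p_mean, 4))
```

The probability for the sample mean is smaller because sample means vary less than individual values.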
T-DISTRIBUTION TABLE
When you conduct a t-test, you can compare the test statistic from the t-test to the critical
value from the t-Distribution table. If the absolute value of the test statistic is greater than the
critical value found in the table (and, for a one-tailed test, the statistic lies in the hypothesized
direction), then you can reject the null hypothesis of the t-test and conclude that the results
of the test are statistically significant.
A researcher recruits 20 subjects for a study and conducts a one-tailed t-test for a mean using
an alpha level of 0.05.
Question: Once she conducts her one-tailed t-test and obtains a test statistic t, what critical
value should she compare t to?
Answer: For a t-test with one sample, the degrees of freedom is equal to n-1, which is 20-1 =
19 in this case. The problem also tells us that she is conducting a one-tailed test and that she is
using an alpha level of 0.05, so the corresponding critical value in the t-distribution table
is 1.729.
A researcher recruits 18 subjects for a study and conducts a two-tailed t-test for a mean using
an alpha level of 0.10.
Question: Once she conducts her two-tailed t-test and obtains a test statistic t, what critical
value should she compare t to?
Answer: For a t-test with one sample, the degrees of freedom is equal to n-1, which is 18-1 =
17 in this case. The problem also tells us that she is conducting a two-tailed test and that she is
using an alpha level of 0.10, so the corresponding critical value in the t-distribution table is 1.74.
A researcher conducts a two-tailed t-test for a mean using a sample size of 14 and an alpha
level of 0.05.
Question: What would the absolute value of her test statistic t need to be in order for her to
reject the null hypothesis?
Answer: For a t-test with one sample, the degrees of freedom is equal to n-1, which is 14-1 =
13 in this case. The problem also tells us that she is conducting a two-tailed test and that she is
using an alpha level of 0.05, so the corresponding critical value in the t-distribution table
is 2.16. This means that she can reject the null hypothesis if the test statistic t is less than -2.16
or greater than 2.16.
A researcher conducts a right-tailed t-test for a mean using a sample size of 19 and an alpha
level of 0.10.
Question: The test statistic t turns out to be 1.48. Can she reject the null hypothesis?
Answer: For a t-test with one sample, the degrees of freedom is equal to n-1, which is 19-1 =
18 in this case. The problem also tells us that she is conducting a right-tailed test (which is a
one-tailed test) and that she is using an alpha level of 0.10, so the corresponding critical value
in the t-distribution table is 1.33. Since her test statistic t is greater than 1.33, she can reject the
null hypothesis.
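The decision rule used in the four examples above can be sketched in Python. The critical values below are copied from the worked examples, not computed, and the function and dictionary names are illustrative assumptions.

```python
def reject_null(t_statistic, critical_value, tail="two"):
    """Compare a t statistic with a positive critical value from the t-table."""
    if tail == "two":
        return abs(t_statistic) > critical_value
    if tail == "right":
        return t_statistic > critical_value
    if tail == "left":
        return t_statistic < -critical_value
    raise ValueError("tail must be 'two', 'right', or 'left'")

# Critical values from the examples: (degrees of freedom, tails, alpha) -> t
critical_values = {
    (19, "one", 0.05): 1.729,
    (17, "two", 0.10): 1.740,
    (13, "two", 0.05): 2.160,
    (18, "one", 0.10): 1.330,
}

# Last example: right-tailed test, n = 19, alpha = 0.10, t = 1.48.
decision = reject_null(1.48, critical_values[(18, "one", 0.10)], tail="right")
print(decision)
```

For a two-tailed test the absolute value of t is compared with the critical value, which matches the "less than -2.16 or greater than 2.16" rule in the third example.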
One problem that students frequently encounter is determining if they should use the t-
distribution table or the z table to find the critical values for a particular problem. If you’re stuck
on this decision, you can use the following flow chart to determine which table you should use:
ONE-TAILED VS TWO-TAILED
A two-tailed test is appropriate if you want to determine if there is any difference between the
groups you are comparing. For instance, if you want to see if Group A scored higher or lower
than Group B, then you would want to use a two-tailed test. This is because a two-tailed test
uses both the positive and negative tails of the distribution. In other words, it tests for the
possibility of positive or negative differences.
A one-tailed test is appropriate if you only want to determine if there is a difference between
groups in a specific direction. So, if you are only interested in determining if Group A scored
higher than Group B, and you are completely uninterested in the possibility of Group A scoring
lower than Group B, then you may want to use a one-tailed test. The main advantage of using a
one-tailed test is that it has more statistical power than a two-tailed test at the same significance
(alpha) level. In other words, your results are more likely to be significant for a one-tailed test if
there truly is a difference between the groups in the direction that you have predicted. This is
because only one tail of the distribution is used for the test.
ANSWER SHEET (Please submit only the answers. Do not return the entire module.)
Name:_______________________________ Section: _______________________
LAST NAME, FIRST NAME, MIDDLE INITIAL
ENGAGEMENT
Enabling Assessment Activity No.3. CENTRAL LIMIT THEOREM
Using the problem above at 95% confidence level, justify if the null hypothesis should be
rejected.
ASSIMILATION
Answer in 3-5 sentences.
How can the knowledge on Central Limit Theorem help you in your PR1? (10pts)
___________________________________________________________________
DATE: _______________
ANSWER SHEET (Please submit only the answers. Do not return the entire module.)
Name: ________________________________ Section: _______________________
LAST NAME, FIRST NAME MIDDLE INITIAL
___________________________________________________________________
SIGNATURE OVER PRINTED NAME OF PARENT/GUARDIAN
DATE: ______________
LEARNING MATERIALS: Module, pen, paper, internet (if applicable), scientific calculator
PRAYER: Father God, please guide me in the lesson today and help me grow in love and
kindness more like Jesus every day. AMEN
INTRODUCTION:
MELC: At the end of the lesson, you should be able to:
Identify the length of a confidence interval
Compute for the length of the confidence interval
Compute for an appropriate sample size using the length of the interval.
Solve problems involving sample size determination
DEVELOPMENT
LESSON PROPER
Confidence Intervals
Statisticians use a confidence interval to express the precision and uncertainty associated
with a particular sampling method. A confidence interval consists of three parts: confidence
level, statistic, and margin of error.
The confidence level describes the uncertainty of a sampling method. The statistic and the
margin of error define an interval estimate that describes the precision of the method. The
interval estimate of a confidence interval is defined by the sample statistic ± margin of error.
Confidence Level
The probability part of a confidence interval is called a confidence level. The confidence
level describes the likelihood that a particular sampling method will produce a confidence
interval that includes the true population parameter.
Suppose all possible samples from a given population were collected and the confidence
intervals for each sample computed. Some confidence intervals would include the true
population parameter; others would not. A 95% confidence level means that 95% of the
intervals contain the true population parameter; a 90% confidence level means that 90% of the
intervals contain the population parameter; and so on.
Margin of Error
A margin of error tells you how many percentage points your results will differ from the
real population value. For example, a 95% confidence interval with a 4 percent margin of error
means that your statistic will be within 4 percentage points of the real population value 95% of
the time.
More technically, the margin of error is the range of values below and above the sample
statistic in a confidence interval. The confidence interval is a way to show what
the uncertainty is with a certain statistic (i.e. from a poll or survey).
For example, a poll might state that there is a 98% confidence interval of 4.88 and 5.26. That
means that if the poll is repeated using the same techniques, the true population
parameter will fall within the interval estimates (i.e. between 4.88 and
5.26) 98% of the time.
Confidence interval = X̄ ± t (s/√n)
Where
X̄ = sample mean value
t = value using t-distribution table
s = standard deviation
n = sample size
240 ± 2.262 (25/√10)
Upper end is
240 + 2.262 (25/√10) = 257.883
Lower end is
240 − 2.262 (25/√10) = 222.117
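The upper and lower ends of this interval can be checked with a short Python sketch. The numbers are those of the example (mean 240, standard deviation 25, sample size 10, t = 2.262); the variable names are illustrative.

```python
from math import sqrt

sample_mean, s, n = 240, 25, 10
t_critical = 2.262  # value read from the t-distribution table in the example

# Margin of error: t times the estimated standard error s / sqrt(n).
margin = t_critical * s / sqrt(n)

lower = sample_mean - margin
upper = sample_mean + margin
print(round(lower, 3), round(upper, 3))
```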
Suppose the local newspaper conducts an election survey and reports that the independent
candidate will receive 30% of the vote. The newspaper states that the survey had a 5% margin
of error and a confidence level of 95%. These findings result in the following confidence interval:
The survey is 95% confident that the independent candidate will receive between 25% and 35%
of the vote.
Thus at a 95% confidence level, the interval X̄ − 1.96 (σ/√n) < μ < X̄ + 1.96 (σ/√n) gives
an interval estimate of the population mean μ. Also, a 95% confidence level corresponds to a 5% level of
significance. The level of significance is normally referred to as α. The general formula for
finding the confidence interval would be:
X̄ − z_(α/2) (σ/√n) < μ < X̄ + z_(α/2) (σ/√n).
For ready reference, a 90% confidence interval gives z_(α/2) = ±1.65 ; a 95% confidence
interval gives z_(α/2) = ±1.96 ; and a 99% confidence interval gives z_(α/2) = ±2.58.
The term z_(α/2) (σ/√n) is called the margin of error in statistics. This is simply defined as the
maximum likely difference between the observed sample mean and the true value of the
population mean. The above formula should only be used for sample sizes greater than 30.
Example: A researcher wants to estimate the number of hours that 5-year old children spend
watching television. A sample of 50 five-year old children was observed to have a mean viewing
time of 3 hours. The population is assumed to be normally distributed with a population standard
deviation of σ = 0.5 hours. Find the best point estimate of the population mean and the 95%
confidence interval.
Solution: Since the sample size is more than 30, the central limit theorem holds and therefore
the best point estimate of the population mean would be equal to the sample mean of 3 hours.
Since the confidence interval is at 95%, the confidence coefficient would be 1.96 and the margin
of error can be computed as
E = z_(α/2) (σ/√n) = 1.96 (0.5/√50) = 0.14.
3 + 0.14 = 3.14 would be the upper limit of the confidence interval and
3 − 0.14 = 2.86 would be the lower limit.
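The margin of error and the two limits for the television example can be sketched in Python as follows. The variable names are illustrative.

```python
from math import sqrt

sample_mean, sigma, n = 3, 0.5, 50  # hours, from the example
z_critical = 1.96                   # confidence coefficient for 95%

# Margin of error E = z * sigma / sqrt(n).
margin_of_error = z_critical * sigma / sqrt(n)

lower_limit = sample_mean - margin_of_error
upper_limit = sample_mean + margin_of_error
print(round(margin_of_error, 2), round(lower_limit, 2), round(upper_limit, 2))
```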
Feeding Program: In a certain village, Mrs. Ramos wants to estimate the mean weight μ, in
kilograms, of all six-year old children to be included in a feeding program. She wants to be 99%
confident that the estimate of μ is accurate to within 0.06 kg. Suppose from a previous study,
the standard deviation of the weights of the target population was 0.5 kg, what should the
sample size be?
Solution:
Given a confidence interval of 99% ,
α = 1 – 0.99 = 0.01,
and 𝑧𝛼/2 = 2.58 .
The desired margin of error E = 0.06 and σ = 0.5 kg.
n = (z_(α/2) ∗ σ / E)² = (2.58 ∗ 0.5 / 0.06)² = (21.5)²
= 462.25 , round up to 463 six-year old children.
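The sample size formula can be sketched as a small Python function. The function name is an illustrative assumption. The result is always rounded up, because rounding down would give a margin of error larger than the one desired.

```python
from math import ceil

def required_sample_size(z_critical, sigma, margin_of_error):
    """n = (z * sigma / E)^2, rounded up to the next whole number."""
    raw = (z_critical * sigma / margin_of_error) ** 2
    return ceil(raw)

# Feeding program example: 99% confidence, sigma = 0.5 kg, E = 0.06 kg.
n = required_sample_size(2.58, 0.5, 0.06)
print(n)
```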
Another convenient formula to use in computing the sample size is Slovin’s formula.
n = N / (1 + Ne²)
where :
n = sample size
N = population size
e = margin of error (between 1% and 10%)
A student wanted to conduct a survey regarding the perspective of the GAS Grade 11
students on the LGU’s protocols for COVID-19. There are 135 enrolled Grade 11 GAS students
in CDLB. For his data to be statistically valid, a 5% margin of error should be used in
sampling. How many Grade 11 GAS students should be his respondents?
Given
Population size, N = 135
Margin of error, e = 5% or 0.05
n = N / (1 + Ne²)
n = 135 / (1 + 135(0.05)²)
n = 100.93 ≈ 101 Grade 11 GAS students
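Slovin's formula can be sketched as a small Python function. The function name is an illustrative assumption. It returns both the unrounded value and the value rounded up to a whole number of respondents.

```python
from math import ceil

def slovin_sample_size(population_size, margin_of_error):
    """Slovin's formula: n = N / (1 + N * e^2)."""
    raw = population_size / (1 + population_size * margin_of_error ** 2)
    return raw, ceil(raw)

# Example: N = 135 Grade 11 GAS students, e = 5%.
raw_n, n = slovin_sample_size(135, 0.05)
print(round(raw_n, 2), n)
```

The same function can be called with a different population size and allowable error for other groups of respondents.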
ANSWER SHEET (Please submit only the answers. Do not return the entire module.)
Name:_______________________________ Section: _______________________
ENGAGEMENT
Enabling Assessment Activity No.4. Sample Size Determination
You are to conduct a survey using CDLB SHS students as your respondents.
Using Slovin’s formula, calculate how many respondents should be in your sample for each
strand if you are given a 10% allowable error. Show your solutions. (5 pts per strand)
ASSIMILATION
Answer in 3-5 sentences.
First time researchers, like students, are allowed to use 10% as allowable error, while in the
actual field research, 5% error is allowed. Why is this so? (10pts)
___________________________________________________________________
SIGNATURE OVER PRINTED NAME OF PARENT/GUARDIAN
GOAL – Prepare a normal distribution graph that will help in your decision making.
ROLE - Statistician
AUDIENCE – A client who wants to have a beach vacation.
SITUATION – A client of your travel company wants information on which month he should
take his beach vacation. As much as possible, your client does not want his vacation to coincide
with the peak season, when the beach is crowded.
PRODUCT – Survey at least 50 students in your strand and identify in which month of the year
they usually go to the beach. Based on the result, create a normal distribution graph and
identify in which month your client should take his vacation.
STANDARDS – The recommendation would be assessed based on the following criteria
CRITERIA PERCENTAGE
Relevance (The output contains timely information and reasonable type of vacation options) 40%
Clarity of plan & process (The output shows clear data regarding the result of the survey) 30%
Presentation of data (Data should be presented accurately and precisely based on formula) 30%
Total 100%