Statistics and Probability
Table of Contents
What I Know
Quarter I
Module 1
Module 2
Module 3
Module 4
Module 5
Module 6
Module 9
Module 10
Quarter II
Module 1
Module 2
Module 3
Module 4
Module 5
Module 6
Assessment
References
Directions: Choose the letter of the correct answer. Write your answer on a
separate sheet of paper.
Quarter I
1. It is a numerical quantity that is assigned to the outcome of an experiment.
A. Random variable C. Sample space
B. Sample point D. Variable
2. In how many ways can two coins fall?
A. 2 C. 6
B. 4 D. 8
3. It tells the distance of a score from the mean, measured in standard deviation
units.
A. normal curve C. z-score
B. sample mean D. area
4. Which of the following shows the probability that the z-score lies above a z-score
value?
A. P(a < z < b) C. P(z < a)
B. P(z > a) D. P(a = z)
5. What is the proportion of the area to the right of z = -1?
A. -0.3413 C. 0.3413
B. -0.8413 D. 0.8413
6. Statement 1: The number of students who are present in Filemon T. Lizan SHS
for the first day of class for the S.Y. 2020-2021
Statement 2: The number of Mayors in NCR who are present during the meeting
Which of the following is CORRECT?
8. Let B be the number of boys and G the number of girls in a family of four children.
Determine the values of the random variable B.
A. 0, 1 C. 0, 1, 2, 3
B. 0, 1, 2 D. 0, 1, 2, 3, 4
For numbers 9 – 10. Consider the probability distribution of the number of
mangoes given below.
R 3 2 1 0
P(R) 1/8 3/8 3/8 1/8
9. Find P(R = 3)
A. 1/8 C. 3/8
B. 5/8 D. 1
Quarter II
11. Random samples of size n = 3 are drawn from a finite population consisting of
the numbers 14, 25, 36, 47, 58 and 69. How many possible samples are there?
A. 12 C. 20
B. 16 D. 24
12. The random samples of size 3 are taken from a population of the numbers 1, 2,
3, 4, 5, 6, and 7. How many samples are there?
A. 35 C. 210
B. 120 D. 350
13. The random samples of size 4 are taken from a population of the numbers 1, 2,
3, 4, 5, 6, 7, and 8. How many samples are there?
A. 70 C. 1 680
B. 840 D. 3 024
14. Random samples of size n = 5 are drawn from a finite population consisting of
the numbers 15, 16, 17, 18, and 19. How many possible samples are there?
A. 1 C. 4
B. 2 D. 8
15. The following are the weights of five students in kg. Suppose samples of size 2
are taken from this population of five students.
16. A random sample of size 4 is taken with replacement from a population with
μ = 12 and σ² = 8. Find the variance of the sampling distribution of the sample mean (σ²_x̄).
A. 2.5 C. 1
B. 2 D. 1.5
17. A random sample of size 25 is taken with replacement from a population with μ = 121.4
and σ² = 50.5. Find the mean of the sampling distribution of the sample mean (μ_x̄).
A. 121.4 C. 122.5
B. 121.5 D. 122
19. Why is it important to sample not more than 10% of the population when the
sample is drawn without replacement?
20. The independence condition for the Central Limit Theorem is assumed to be
met when _____.
A. the sample is biased
B. the sample is randomly selected
C. the sample is drawn with replacement
D. the sample is drawn without replacement
Quarter I
MODULE 1
This module was designed and written with you in mind. It is here to help you
master the nature of Statistics and Probability. The scope of this module permits it
to be used in many different learning situations. The language used recognizes the
diverse vocabulary level of students. The lessons are arranged to follow the standard
sequence of the course. But the order in which you read them can be changed to
correspond with the textbook you are now using.
The module consists of the lessons, namely:
RANDOM VARIABLE
- is a variable that assumes numerical values associated with the outcome of a
random process or experiment.
OTHER DEFINITION OF TERMS
Experiment- any activity which can be done repeatedly under similar conditions.
Sample Space - the set of all possible outcomes in an experiment.
Event - a subset of a sample space.
Sample Point - the elements in a sample space.
Probability - the ratio of the number of favorable outcomes to the total number of
possible outcomes.
A random variable may be classified as discrete or continuous.
Discrete Random Variable - is one that can assume only a countable number of
values.
Continuous Random Variable - is one that can assume an infinite number of values in one or
more intervals.
Examples:
A. Classify each of the following as a Discrete Random Variable or a Continuous Random
Variable.
Example 1
Suppose two coins are tossed and we are interested in determining the number of
heads that will come out. Let us use H to represent the number of heads that will
come out. Determine the values of the random variable H.
Step 2. Count the number of heads in each outcome and assign this number to this
outcome.
Outcomes    Number of Heads (Value of H)
HH          2
HT          1
TH          1
TT          0
The values of the random variable H (no. of heads) in this experiment are 0,
1, and 2.
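The listing-and-counting procedure in Example 1 can be sketched in Python. This is only an illustration; the names `outcomes`, `heads_per_outcome`, and `values_of_H` are chosen here and are not part of the module.

```python
from itertools import product

# Step 1: list the sample space for tossing two coins (H = head, T = tail).
outcomes = ["".join(toss) for toss in product("HT", repeat=2)]

# Step 2: count the heads in each outcome and assign that number to the outcome.
heads_per_outcome = {outcome: outcome.count("H") for outcome in outcomes}

# The possible values of the random variable H, in increasing order.
values_of_H = sorted(set(heads_per_outcome.values()))

print(outcomes)     # ['HH', 'HT', 'TH', 'TT']
print(values_of_H)  # [0, 1, 2]
```

Changing `repeat=2` to another number of tosses lists the sample space for that experiment in the same way.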
Example 2
A basket contains 10 ripe and 4 unripe mangoes. If three mangoes are taken from
the basket one after the other, determine the possible values of the random variable
R representing the number of ripe mangoes.
Step 2. Count the number of ripe mangoes (R) in each outcome and assign this
number to this outcome.
The values of the random variable R (number of ripe mangoes) in this experiment are
0, 1, 2, and 3.
Different Presentations of a Discrete Probability Distribution
R 3 2 1 0
P(R) 1/8 3/8 3/8 1/8
ΣP(R) = 1/8 + 3/8 + 3/8 + 1/8 = 8/8 or 1
The bar graph shows the relationship between R, the value of the random variable,
and P(R), the probability of that number of ripe mangoes.
If we continue the process…
Step 3. Construct the frequency distribution of the values of the random variable R.
R 3 2 1 0
P(R) 1/8 3/8 3/8 1/8
Properties of a Discrete Probability Distribution
Examine the probability distribution that we have learned in the given example.
What do you notice about the probability values of the random variable in each
probability distribution?
What is the sum of the probabilities of a random variable?
Consider the probability of the number of bananas given below.
R 3 2 1 0
P(R) 1/8 3/8 3/8 1/8
In words, this is the probability that R is exactly 3.
Solution: Since the given value is exactly 3, P(R = 3) = P(3).
Therefore, the answer is 1/8.
3. P (R > 1)
In words, this is the probability that R is greater than 1.
There are two possible values of R. These are 2 and 3.
P (R > 1) = P(2) + P(3)
= 3/8 + 1/8
= 4/8 or ½
Note: Simplify the answer if possible
4. P (R < 3)
In words, this is the probability that R is less than 3.
There are three possible values of R. These are 2, 1 and 0.
P (R < 3) = P (2) + P(1) + P(0)
= 3/8 + 3/8 + 1/8
= 7/8
5. ΣP(R)
To find ΣP(R) we need to find the sum of all the probability values.
ΣP(R) = P (3) + P (2) + P (1) + P(0)
= 1/8 + 3/8 + 3/8 +1/8
= 8/8 or 1
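The computations above all follow one pattern: add P(R) over the values of R that satisfy the condition. A minimal Python sketch of that pattern, using exact fractions, is shown here; the helper name `prob` is hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

# Probability distribution from the table above: value of R -> P(R).
P = {3: Fraction(1, 8), 2: Fraction(3, 8), 1: Fraction(3, 8), 0: Fraction(1, 8)}

def prob(condition):
    """Add P(r) over every value r of R that satisfies the condition."""
    return sum((p for r, p in P.items() if condition(r)), Fraction(0))

p_equal_3 = prob(lambda r: r == 3)    # 1/8
p_greater_1 = prob(lambda r: r > 1)   # 4/8, simplified to 1/2
p_less_3 = prob(lambda r: r < 3)      # 7/8
total = prob(lambda r: True)          # 8/8, that is, 1

print(p_equal_3, p_greater_1, p_less_3, total)
```

`Fraction` simplifies each answer automatically, which matches the note to simplify the answer if possible.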
Classify Me Please!
Statement
1. the number of senators present in the meeting
2. the weight of newborn babies for the month of June
3. the number of ballpens in the box
4. the capacity of electrical resistors
5. the amount of salt needed to bake a loaf of bread
6. the capacity of an auditorium
7. the number of households with television
8. the height of mango tree in a farm
9. the area of lots in a subdivision
10. the number of students who joined the fieldtrip
11. the number of children in a family
12. the number of tails flipped in 4 trials
13. the time required to perform a job
14. the amount of sugar in a pineapple juice
15. the volume of mango juice in a 12 – ounce can
16. the Saturday night attendance at the prayer meeting
17. the number of patients of Dr. Naval in his clinic for three weeks
18. the time taken to complete an examination in Statistics and Probability
19. the interest rate given by the BDO bank
20. the weight of a fish
Analyze Me Please!
Directions: Determine the values of the random variable in each of the following
situations. Write your answers on a separate sheet of paper.
1. A coin is flipped four times. Let T be the number of tails that come out. Determine
the values of the random variable T.
a. List the sample space of the experiment.
S = { _________________________________}
b. Count the number of tails (T) in each outcome and assign this number to this
outcome.
Outcome Number of Tails
(Value of T)
b. Count the number of green dice (G) in each outcome and assign this number to
this outcome.
MODULE 2
This module was designed and written with you in mind. It is here to help you master
the random variable and probability distributions. The scope of this module permits it
to be used in many different learning situations. The language used recognizes the
diverse vocabulary level of students. The lessons are arranged to follow the standard
sequence of the course. But the order in which you read them can be changed to
correspond with the textbook you are now using.
The module is divided into two lessons, namely:
Lesson 2.1 Constructing Probability Distributions
In your previous study of mathematics, you have learned how to find the probability
of an event. In this lesson, you will learn how to construct a probability distribution
of a discrete random variable. Your knowledge of getting the probability of an event
is very important in understanding the present lesson. To find out if you are ready
to learn this new lesson, do the following activities.
According to the first property, for every element x in the support S (in other words,
the sample space), each probability must be positive. According to the second
property, the sum of the probabilities for all possible x values in the support S must
be equal to 1. The values of the discrete random variable X where f(x) > 0 are called its
mass points.
Steps and Solution
1. Determine the sample space. Let H represent head and T represent tail.
The sample space for this experiment is:
S = {TTT, TTH, THT, HTT, HHT, HTH, THH, HHH}
2. Count the number of tails in each outcome in the sample space and assign this
number to this outcome.
Possible Outcomes    Value of the Random Variable Y (No. of Tails)
TTT                  3
TTH                  2
THT                  2
HTT                  2
HHT                  1
HTH                  1
THH                  1
HHH                  0
3. There are four possible values of the random variable Y, representing the
number of tails. These are 0, 1, 2, and 3. Assign probability values P(Y) to
each value of the random variable.
- There are 8 possible outcomes and no tail occurs once, so the probability that
we shall assign to the value 0 is 1/8.
- There are 8 possible outcomes and 1 tail occurs three times, so the probability
that we shall assign to the value 1 is 3/8.
- There are 8 possible outcomes and 2 tails occur three times, so the probability
that we shall assign to the value 2 is 3/8.
- There are 8 possible outcomes and 3 tails occur once, so the probability that we
shall assign to the value 3 is 1/8.
Number of Tails (Y)    0      1      2      3
Probability P(Y)       1/8    3/8    3/8    1/8
Table 1.1 . The Probability Distribution or the Probability Mass Function of
Discrete Random Variable Y.
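The three steps used to build the distribution of Y (list the sample space, count the tails in each outcome, and divide each count by the number of outcomes) can be sketched in Python. This assumes a fair coin, so that all eight outcomes are equally likely; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Step 1: sample space for three coin tosses (H = head, T = tail).
sample_space = ["".join(toss) for toss in product("HT", repeat=3)]

# Step 2: count how many outcomes give each number of tails.
tail_counts = Counter(outcome.count("T") for outcome in sample_space)

# Step 3: divide each count by the number of outcomes to get P(Y).
pmf = {y: Fraction(count, len(sample_space))
       for y, count in sorted(tail_counts.items())}

for y, p in pmf.items():
    print(y, p)
```

The printed pairs give each value of Y with its probability, and the probabilities add up to 1.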
Steps and Solution
1. Determine the sample space. Let B represent the blue ball and R represent
the red ball.
The sample space for this experiment is:
S = {RR, RB, BR, BB}
Table 1.2 . The Probability Distribution or the Probability Mass Function of Discrete
Random Variable Z.
No. of blue balls (Z)    0      1      2
Probability P(Z)         1/4    1/2    1/4
The sum of the probabilities is ΣP(Z) = 1. This is a discrete probability distribution
because the sum of the probabilities is equal to 1.
Example 3. Determine whether the given values can serve as the values of a
probability distribution of the random variable x that can take on only the values
1, 2, 3, 4: P(1) = 1/19, P(2) = 10/19, P(3) = 5/19, P(4) = 5/19.
x       1       2       3       4
P(x)    1/19    10/19   5/19    5/19
The sum of the probabilities is ΣP(x) = 21/19 ≈ 1.105. These values cannot serve as a
probability distribution because the sum of the probabilities is not equal to 1.
Example 4. Determine whether the given values can serve as the values of a
probability distribution of the random variable X. P(x) = 1/8 for x = 1, 2, 3, ..., 8.
Solution:
x       1     2     3     4     5     6     7     8
P(x)    1/8   1/8   1/8   1/8   1/8   1/8   1/8   1/8
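The two properties can be checked mechanically. The sketch below is one possible check, not a prescribed method; the function name `is_probability_distribution` is hypothetical. It is applied to Table 1.2, to the four numerators 1, 10, 5, 5 over 19 listed in Example 3, and to the eight values of 1/8 in Example 4.

```python
from fractions import Fraction

def is_probability_distribution(probabilities):
    """Check that each value lies between 0 and 1 and that the values sum to 1."""
    each_in_range = all(0 <= p <= 1 for p in probabilities)
    return each_in_range and sum(probabilities) == 1

table_1_2 = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]
example_3 = [Fraction(n, 19) for n in (1, 10, 5, 5)]
example_4 = [Fraction(1, 8)] * 8

print(is_probability_distribution(table_1_2))   # True
print(is_probability_distribution(example_3))   # False, the sum is 21/19
print(is_probability_distribution(example_4))   # True
```

Exact fractions are used so that the comparison with 1 is not affected by decimal rounding.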
The variance of a random variable X is denoted by σ². It can likewise be written as
Var(X). The variance of a random variable is the expected value of the square of the
difference between the assumed value of the random variable and the mean. The
variance of X is:
σ² = Σ[(x − μ)² P(x)]
Where:
x = outcome, μ = population mean, P(x) = probability of the outcome
The larger the value of the variance, the farther the values of X are from the
mean. The variance is tricky to interpret since it uses the square of the unit of
measure of x. So, it is easier to interpret the value of the standard deviation because
it uses the same unit of measure as x.
The standard deviation of a discrete random variable x is written as σ. It is the square
root of the variance. The standard deviation is computed as:
σ = √(Σ[(x − μ)² P(x)])
x 0 1 2
x P(x) xP(x)
0 0.25 0
1 0.50 0.50
2 0.25 0.50
Σ[xP(x)] = 1
E(x) = Σ[xP(x)] = 1.00
The expected value is 1. So, the average number of college graduates in the
household of the small town is one.
Example 2. A security guard recorded the number of people entering the bank every
hour during one working day. The random variable x represents the number of people
who entered the bank. The probability distribution of x is shown below.
x 0 1 2 3 4 5
P(x) 0 0.1 0.2 0.4 0.2 0.1
What is the expected number of people who enter the bank every hour?
Solution:
x P(x) xP(x)
0 0 0
1 0.1 0.1
2 0.2 0.4
3 0.4 1.2
4 0.2 0.8
5 0.1 0.5
ΣP(x) = 1    Σ[xP(x)] = 3
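The table for the bank example multiplies each x by its P(x) and adds the products. A short Python sketch of the same computation is given here for checking; the variable names are illustrative.

```python
import math

x_values = [0, 1, 2, 3, 4, 5]
probabilities = [0, 0.1, 0.2, 0.4, 0.2, 0.1]

# The probabilities of a valid distribution should add up to 1.
total_probability = sum(probabilities)

# E(x) is the sum of each value times its probability.
expected_value = sum(x * p for x, p in zip(x_values, probabilities))

print(math.isclose(total_probability, 1.0))  # True
print(round(expected_value, 2))              # 3.0
```

`math.isclose` and `round` are used because decimal probabilities are stored as floating-point numbers and may differ from the exact value by a tiny amount.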
x 1 2 3 4 5 6
P(x) 0.15 0.25 0.30 0.15 0.10 0.05
Solution:
Steps
1. Find the expected value.
2. Subtract the expected value from each outcome. Square each difference.
3. Multiply each squared difference by the corresponding probability.
4. Sum up all the figures obtained in step 3.
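The four steps can be traced in Python using the distribution tabulated just before the solution (x from 1 to 6 with probabilities 0.15, 0.25, 0.30, 0.15, 0.10, 0.05). This is a sketch for checking hand computations; the variable names are chosen for this illustration.

```python
import math

x_values = [1, 2, 3, 4, 5, 6]
probabilities = [0.15, 0.25, 0.30, 0.15, 0.10, 0.05]

# Step 1: find the expected value (mean).
mean = sum(x * p for x, p in zip(x_values, probabilities))

# Step 2: subtract the mean from each outcome and square the difference.
squared_differences = [(x - mean) ** 2 for x in x_values]

# Step 3: multiply each squared difference by its probability.
weighted = [d * p for d, p in zip(squared_differences, probabilities)]

# Step 4: add the products to get the variance; its square root is the SD.
variance = sum(weighted)
standard_deviation = math.sqrt(variance)

print(round(mean, 2), round(variance, 4), round(standard_deviation, 2))
```

Rounded values are printed because the decimals are stored as floating-point numbers.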
A. Construct a probability distribution for the data. (2 points)
1. The probabilities that a surgeon operates on 3,4,5,6, or 7 patients in any
one day are 0.15,0.20 ,0.25,0.20, and 0.20 respectively.
2. The probabilities that a customer buys 2, 3, 4, 5, or 6 items in a convenience
store are 0.32, 0.12, 0.23, 0.18, and 0.15 respectively.
3. The probabilities that a student will borrow 1,2,3, or 4 books are
0.45,0.30,0.15, and 0.10, respectively.
4. The probabilities that a biased die will fall as 1, 2, 3, 4, 5, or 6 are
1/2, 1/6, 1/12, 1/12, 1/12, and 1/12, respectively.
5. The probabilities that a depositor will invest Php100,000, Php250,000, or
Php180,000 are 1/4, 1/4, and 1/4, respectively.
A. Find the expected value of each probability mass function below. (2 points)
1.
x 0 1 2 3
B. Find the variance and standard deviation of each of the following probability
distribution. (3 points)
1.
X 0 1 2 3
P(x) 0.10 0.45 0.25 0.20
2.
X 0 1 2 3
P(x) 0.15 0.38 0.33 0.14
MODULE 3
This module was designed and written with you in mind. It is here to help you
master the random variable and probability distributions. The scope of this module
permits it to be used in many different learning situations. The language used
recognizes the diverse vocabulary level of students. The lessons are arranged to follow
the standard sequence of the course. But the order in which you read them can be
changed to correspond with the textbook you are now using.
The module deals with an understanding of:
The table below shows the result of 4 tiles picked and returned to the jar 15 times.
If x represents the number on each tile and f represents the number of times it was
picked, your task is to answer the questions that follow.
Let us analyze and explore.
Just like frequency distribution, the probability distribution can be described by
computing its mean and variance. This time you will be exploring how to compute
for the mean and the variance for the discrete probability distribution.
To find the mean (μ) or the expected value E(x) of a discrete probability distribution,
we use the following formula:
μ = E(x) = Σ[x P(x)]
where: μ = mean
From the tile-and-jar experiment we discussed, we can let x represent the number on
the tile and obtain P(x) by dividing the number of times each tile was picked by 15.
The table thus becomes:
x       1       2       3       4
P(x)    2/15    4/15    8/15    1/15
(1). Find the mean of the discrete random variable using the table above.
x       P(x)     xP(x)
1       2/15     2/15
2       4/15     8/15
3       8/15     24/15
4       1/15     4/15
                 Σ[x P(x)] = 38/15
Step 2: Find the mean or the expected value of the probability distribution by getting
the sum of the values under the column x P(x).
μ = E(x) = Σ[x P(x)]
= 38/15
≈ 2.53
Therefore, the mean or the expected value of the probability distribution is 2.53.
where: μ = mean
X = value of the random variable
P(X) = the probability value of the random variable
σ² = variance
σ = standard deviation or SD
Now, let us try to find the variance and the standard deviation of the discrete random
variable x using the same example we used.
x       P(x)
1       2/15
2       4/15
3       8/15
4       1/15
Step 1: Find the mean of the probability distribution. Prepare a table as shown
below.
x       P(x)     xP(x)
1       2/15     2/15
2       4/15     8/15
3       8/15     24/15
4       1/15     4/15
                 Σ[x P(x)] = 38/15
μ = E(x) = Σ[x P(x)]
= 38/15 ≈ 2.53
Step 2: Square each value of the random variable and multiply by the corresponding
probability value (x² P(x)). The new table below shows how to square the values of
the random variable.
Step 3: Find the variance and the standard deviation by applying the formulas
σ² = Σ[x² P(x)] − μ²
= 106/15 − (38/15)²
≈ 7.07 − 6.42
≈ 0.65
σ = √σ² ≈ 0.81. Thus, the variance is about 0.65 and the standard deviation is about 0.81.
The variance and the standard deviation show how closely the values are spread around
the mean. In the experiment we discussed, a tile was picked and returned to the jar 15
times. The computed mean tells us that, on average, the number on a tile picked from
the jar is about 2.53.
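The tile computation can be checked with exact fractions. The sketch below assumes the probabilities 2/15, 4/15, 8/15, and 1/15 for tiles 1 to 4 (the number of times each tile was picked divided by 15) and applies σ² = Σ[x² P(x)] − μ². Note that the formula subtracts the square of the mean, μ², and not the mean itself. The variable names are illustrative.

```python
import math
from fractions import Fraction

# Tile number -> probability (times picked divided by 15).
pmf = {1: Fraction(2, 15), 2: Fraction(4, 15), 3: Fraction(8, 15), 4: Fraction(1, 15)}

mean = sum(x * p for x, p in pmf.items())                   # 38/15
mean_of_squares = sum(x ** 2 * p for x, p in pmf.items())   # 106/15
variance = mean_of_squares - mean ** 2                      # 146/225
standard_deviation = math.sqrt(variance)

print(mean, mean_of_squares, variance)
print(round(float(variance), 2), round(standard_deviation, 2))
```

Working in fractions until the last step avoids the small differences that appear when rounded decimals such as 2.53 are squared.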
Standard Deviation of a Discrete Probability Distribution
Try to answer the following by completing the table.
(1) What is the mean outcome if a fair die is rolled?
x       P(x)     xP(x)
1
2
3
4
5
6
                 Σ[x P(x)] = ____
(2) The random variable X, representing the number of nuts in a chocolate bar, has
the following probability distribution. Compute the mean.
x       0       1       2       3       4
P(x)    1/10    3/10    3/10    2/10    1/10
(3) The probability distribution below shows the number of typing errors (x) and the
probability P(x) of committing the errors whenever clerks type a document.
Complete the table.
x       0       1       2       3       4       5
P(x)    0.02    0.22    0.42    0.31    0.10    0.04
x       P(x)    x²      x²P(x)
Refer to the table in no. 3 to answer numbers 4 – 5.
x        0      1      2      3      4
P(x)     1/5    1/5    1/5    1/5    1/5
xP(x)
2.
H P(H) HP(H)
0 0.06
1 0.70
2 0.20
3 0.03
4 0.01
3-4. Determine the variance and the standard deviation of the random variable.
x        1      2      3      4      5
P(x)     1/5    1/5    1/5    1/5    1/5
xP(x)
MODULE 4
This module was designed and written with you in mind. It is here to help you master the
nature of Statistics and Probability. The scope of this module permits it to be used in many
different learning situations. The language used recognizes the diverse vocabulary level of
students. The lessons are arranged to follow the standard sequence of the course. But the
order in which you read them can be changed to correspond with the textbook you are now
using.
After going through this module, you are expected to:
Normal distribution or normal curve represents a group of data where a very large
number of cases exists and the mean, the median and the mode are all equal. When you
sketch the graph of a normal curve, you will find the following properties:
1. The distribution curve is unimodal and bell-shaped. Unimodal means that there is
only one peak point.
2. The mean, the median, and the mode coincide at the center.
3. The curve is symmetrical about its center. This means that when you draw a vertical
line through the center of the curve, each half is the mirror image of the other half.
4. The width of the curve is based on the standard deviation of the distribution.
5. The tails of the curve approach the base line, but it will never intersect the line. These
tails just go nearer and nearer to the base line, but never meet the line.
6. The total area under the curve is 1. Thus, the normal curve is also a probability distribution.
The normal curve is a standard normal curve when the mean µ = 0 and the standard
deviation σ = 1. This is mostly used to represent inferential statistics. You will find its area
by substituting the mean µ = 0 and the standard deviation σ = 1 in the formula that
describes a normal curve. But don’t worry! Mathematicians have already computed these
for everyone’s use.
You might be wondering why the area is considered as equal to 1 when the standard practice
is to show 99.73% of the area. Take note that .9973 is just the area between -3 and +3. In this
case, remember that the total area is not shown because the tails are asymptotic to the
horizontal line. Meaning, it just continues to approach the line but will never intersect the line.
Therefore, there is a little portion of the area at the tails of the distribution. So, when asked
about the area under a normal curve, you say 1.
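The figure .9973 can be reproduced without a table. The sketch below uses the error function `math.erf` from Python's standard library to compute areas under the standard normal curve; this is a substitute for the z-table, not the method used in the module, and the function name `cumulative_area` is hypothetical.

```python
import math

def cumulative_area(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Area between z = -3 and z = +3.
area_within_3 = cumulative_area(3) - cumulative_area(-3)

# The small remaining area lies in the two tails beyond -3 and +3.
area_in_tails = 1 - area_within_3

print(round(area_within_3, 4))  # 0.9973
print(round(area_in_tails, 4))  # 0.0027
```

The two pieces add up to 1, which is why the total area under the curve is taken as 1 even though only 99.73% is usually shown.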
Areas under the normal curve are found in the z-table. This time, you will learn how
to use the z-table to find the areas under the normal curve.
Steps in Finding the Areas under the Normal Curve Given a Z-Value
Illustrative Example 1: Find the area that corresponds to z = 0.72.
Note: The area that corresponds to z = 0.72 can also be understood as “the corresponding
area between z = 0 and z = 0.72.”
Steps:
1. Express the z-value into a three-digit form.
➢ z = 0.72 is already in three digits.
2. Locate the first two digits on the left column of the z-Table.
➢ The first two digits are 0.7. Find it in the left column.
3. In the z-table, match the third digit with the appropriate column on the right, just as
you would read a multiplication table.
➢ The last digit is 2. Find the column with the heading .02.
4. The intersection of the row and the column is the required area or probability.
➢ The area is 0.2642.
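A table lookup like this one can be cross-checked by computing the area between z = 0 and the given z directly. The sketch below uses `math.erf` from the standard library as a stand-in for the z-table; the function name `area_from_mean` is chosen for this illustration only.

```python
import math

def area_from_mean(z):
    """Area under the standard normal curve between 0 and z.

    The curve is symmetric, so a negative z gives the same area
    as the corresponding positive z.
    """
    return 0.5 * math.erf(abs(z) / math.sqrt(2))

print(round(area_from_mean(0.72), 4))  # 0.2642
```

Because of the symmetry built into the function, `area_from_mean(-0.72)` returns the same value.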
Illustrative Example 2: Find the area that corresponds to z = -1.5.
Notes: The area that corresponds to z = -1.5 can also be understood as “the
corresponding area between z = 0 and z = -1.5.” Moreover, note that “negative sign” in
the z-value is just a signal that the region is on the left side of the mean. This means
that the area corresponding to z = 1.5 is the same as the area for z = -1.5. The only difference
is their location on the graph. If it is positive, then the region is on the right of the mean.
If it is negative, then the region is on the left of the mean.
Steps                                                             Solution
Express the z-value into a three-digit form.                      z = -1.50
Locate the first two digits on the left column of the z-Table.
Now, sketch a normal curve and identify the region. Remember that the z-value is
negative, so the region is on the left of the mean.
Before proceeding to the next topic, remember that the mean divides the area
under the curve into two equal halves.
Recall that the z-value or z-score tells you the distance from the mean measured in
standard deviation units. It can be positive (above the mean), negative (below the mean),
or zero (equal to the mean). However, in real life, these scores are not usually given.
Thus, it is important that you know how to transform a raw score to its corresponding
z-score under the normal curve.
To get the z-value, use the formula:
z = (X − μ) / σ    (z-score for population data)
OR
z = (X − x̄) / s    (z-score for sample data)
σ = population standard deviation
x̄ = sample mean
s = sample standard deviation
Example 1: Find the z-value that corresponds to a score X = 58, given the mean μ = 50 ,
and standard deviation σ = 4.
(The symbols used are for population. Therefore, the z-score locates the raw score within a
population.)
Solution:
z = (X − μ) / σ = (58 − 50) / 4 = 8/4 = 2 (The resulting z-value is positive 2.)
The corresponding z-score to the raw score 58 is 2.
This means that the score 58 is 2 standard deviations above the mean.
Example 2: Locate the corresponding z-value to a score of 20 given that x̄ = 26 and s = 4.
(The given are sample data.)
Solution:
z = (X − x̄) / s = (20 − 26) / 4 = −6/4
= −1.5 (The resulting z-value is negative 1.5.)
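Both formulas have the same form: subtract the mean from the raw score and divide by the standard deviation. A minimal Python sketch, with a hypothetical function name `z_score`, reproduces the two examples above.

```python
def z_score(raw_score, mean, standard_deviation):
    """Distance of a raw score from the mean, in standard deviation units."""
    return (raw_score - mean) / standard_deviation

# Example 1 (population data): X = 58, mean 50, standard deviation 4.
z_example_1 = z_score(58, 50, 4)

# Example 2 (sample data): X = 20, mean 26, standard deviation 4.
z_example_2 = z_score(20, 26, 4)

print(z_example_1)  # 2.0
print(z_example_2)  # -1.5
```

The same function serves population and sample data; only the meaning of the mean and standard deviation passed in changes.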
Example 3: During the summer break, 1500 students took a test to apply for
scholarships in college. Their mean score is 80 with a standard deviation of 5. How many
students got a score between 75 and 82?
Steps and Solution
1. Convert the raw scores 75 and 82 to z-scores.
For the raw score 75,
z = (X − μ) / σ = (75 − 80) / 5 = −5/5 = −1
For the raw score 82,
z = (X − μ) / σ = (82 − 80) / 5 = 2/5 = 0.4
2. Find the area that corresponds to each z-score.
z = −1 corresponds to the area .3413
z = .4 corresponds to the area .1554
3. Sketch the graph of a normal curve showing the z-scores.
4. In your sketch, draw a line through the z-scores and shade the region
between them.
5. Analyze the sketch and determine the operation to use to find the total area.
The graph suggests addition.
.3413 + .1554 = .4967
6. Make a statement.
.4967 is 49.67% when expressed as a percent.
49.67% of the students got a score between 75 and 82. Since 1500 students took the
test, this is about .4967 × 1500 ≈ 745 students.
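The steps of Example 3 can be traced in Python. The sketch below assumes the scores are normally distributed and replaces the z-table lookup with `math.erf` from the standard library; the function name `cumulative_area` is hypothetical.

```python
import math

def cumulative_area(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mean, standard_deviation, number_of_students = 80, 5, 1500

# Step 1: convert the raw scores 75 and 82 to z-scores.
z_low = (75 - mean) / standard_deviation    # -1.0
z_high = (82 - mean) / standard_deviation   # 0.4

# Steps 2 to 5: the area between the two z-scores.
proportion = cumulative_area(z_high) - cumulative_area(z_low)

# Step 6: convert the proportion into a number of students.
expected_count = proportion * number_of_students

print(round(proportion, 4), round(expected_count))
```

Subtracting the two left-hand areas gives the same region that the table method obtains by adding the two areas measured from the mean.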
D. LOCATING PERCENTILES UNDER THE NORMAL CURVE
The following phrases are expressions of order. Are you familiar with them?
“Top 10”
‘a score of 75%’
Just like the z-score, a percentile tells the position of a value. It describes the relationship
of a value to the rest of the data. It is a point in the distribution below which a given
percentage of the cases fall. For example, if your score is at the 84th percentile, it means
that 84% of the scores were lower than yours and that 16% of the scores were higher than yours.
The Neophyte Statistician
Hello, neophyte! Prove your learning by solving the following problems.
1,000 children joined the physical fitness program last month. Their average weight
before the program was 35 kg with a standard deviation of 5 kg. How many of these
children weighed between 32 kg and 45 kg?
MODULE 5
This module was designed and written with you in mind. It is here to help you master the
nature of Statistics and Probability. The scope of this module permits it to be used in many
different learning situations. The language used recognizes the diverse vocabulary level of
students. The lessons are arranged to follow the standard sequence of the course. But the
order in which you read them can be changed to correspond with the textbook you are now
using.
Sample - a subset of the population from which the data is collected. It is a small part of
the population from which the researchers gather data.
OTHER DEFINITION OF TERMS
a. Simple random sampling - is the simplest form of random sampling where each
element or member of the population has an equal chance of being included in the
sample. The most commonly used is the lottery method.
k = N / n
where: k = interval size, N = population size, and n = sample size
c.1. Simple stratified sampling - is used when the population is divided into strata with
common characteristic/s and if we decide to get an equal number of samples from each
stratum.
c.2. Proportional stratified sampling -is used when the sample size is proportional to the
number of members of the stratum. This means that the smaller the number of stratum
members, the smaller the stratum's sample size will be.
d. Cluster sampling - usually used on a geographical basis and is sometimes called area
sampling. It requires a complete list of clusters that represent the sampling frame.
e. Multi-stage sampling - it involves two or more stages in selecting the samples from a
given population.
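The interval formula k = N / n listed among the definitions above can be illustrated with a short Python sketch. It assumes the usual systematic-sampling procedure of picking every k-th member starting from a chosen position; the function name `systematic_sample`, the `start` parameter, and the population of 20 numbered members are all hypothetical.

```python
def systematic_sample(population, n, start=0):
    """Pick n members at a fixed interval k = N // n, beginning at index start.

    start is assumed to be smaller than k so that every pick stays
    inside the population list.
    """
    N = len(population)
    k = N // n  # interval size, rounded down when N is not a multiple of n
    return [population[start + i * k] for i in range(n)]

members = list(range(1, 21))            # a population of N = 20 numbered members
sample = systematic_sample(members, 5)  # k = 20 // 5 = 4

print(sample)  # [1, 5, 9, 13, 17]
```

In practice the starting position is usually chosen at random from the first k members.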
b. Construct the sampling distribution of the sample means.
c. Construct the histogram of the sampling distribution of the sample means.
Solution
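1. Since the size of the population is 5, we have N = 5. We shall draw a sample of size
2 from this population, so n = 2. Thus, the number of possible samples of size 2 that
can be drawn from this population is computed with the combination formula, where
n is the number of items to choose from (here N = 5) and r is the number chosen
(here the sample size, 2):
C(n, r) = n! / [(n − r)! r!]
C(5, 2) = 5! / [(5 − 2)! 2!]
= 10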
The number of all possible samples of size 2 is 10. The table shows the list of all possible
samples with their corresponding means.
Possible samples of size 2    Mean
2, 3                          (2 + 3)/2 = 2.5
2, 4                          3.0
2, 5                          3.5
2, 6                          4.0
3, 4                          3.5
3, 5                          4.0
3, 6                          4.5
4, 5                          4.5
4, 6                          5.0
5, 6                          5.5
Observe that the means of the samples vary from sample to sample. The mean of the
population μ=4, while the means of the samples may be less than, greater than, or equal to
4.
1. We now construct the frequency distribution of the sample means.
Mean Frequency
2.5 1
3.0 1
3.5 2
4.0 2
4.5 2
5.0 1
5.5 1
Total 10
Next, we construct the probability distribution of the sample means. This is the sampling
distribution of the sample means.
Mean x̄    Probability P(x̄)
2.5 1/10
3.0 1/10
3.5 2/10 or 1/5
4.0 2/10 or 1/5
4.5 2/10 or 1/5
5.0 1/10
5.5 1/10
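The whole construction (list the samples, compute each mean, count the frequencies, and convert them to probabilities) can be sketched in Python. The population 2, 3, 4, 5, 6 is inferred here from the listed samples, and the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations
from math import comb

population = [2, 3, 4, 5, 6]   # inferred from the samples listed above
sample_size = 2

# All possible samples of size 2, drawn without replacement.
samples = list(combinations(population, sample_size))
number_of_samples = comb(len(population), sample_size)

# Mean of each sample, kept as an exact fraction.
sample_means = [Fraction(sum(s), sample_size) for s in samples]

# Frequency of each mean, then probability = frequency / number of samples.
frequency = Counter(sample_means)
sampling_distribution = {m: Fraction(f, len(samples))
                         for m, f in sorted(frequency.items())}

for m, p in sampling_distribution.items():
    print(float(m), p)
```

Each printed line pairs a sample mean with its probability, matching the rows of the sampling distribution table.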
2. The histogram of the sampling distribution of the sample means is constructed by
making a bar graph where the sample means are plotted on the horizontal axis and the
corresponding probabilities are shown in the vertical axis.
Example 2
The following table gives the monthly salaries (in thousands of pesos) of six officers in a
government office. Suppose that random samples of size 4 are taken from this population
of six officers.
Officer Salary
A 8
B 12
C 16
D 20
E 24
F 28
1. How many samples are possible? List them and compute the mean of each sample.
2. Construct the sampling distribution of the sample means.
3. Construct the histogram of the sampling distribution of the sample means.
Solution
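Since the size of the population is 6, we have N = 6. We shall draw a sample of size 4. Thus,
the number of possible samples of size 4 that can be drawn from this population is
computed with the combination formula, where n is the number of items to choose from
(here N = 6) and r is the number chosen (here the sample size, 4):
C(n, r) = n! / [(n − r)! r!]
C(6, 4) = 6! / [(6 − 4)! 4!]
= 15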
The number of all possible samples of size 4 is 15. The table shows the list of all
possible samples with their corresponding means.
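The 15 samples and their means can be generated with a short Python sketch, using the salaries of officers A to F given in the problem. The variable names are chosen for this illustration, and the code is offered only as a way to check a hand-made list.

```python
from itertools import combinations
from math import comb

# Monthly salaries in thousands of pesos, from the table in Example 2.
salaries = {"A": 8, "B": 12, "C": 16, "D": 20, "E": 24, "F": 28}
sample_size = 4

# Every possible group of 4 officers, drawn without replacement.
samples = list(combinations(salaries, sample_size))
number_of_samples = comb(len(salaries), sample_size)

# Mean salary of each sample, keyed by the officers in it (for example "ABCD").
sample_means = {"".join(s): sum(salaries[officer] for officer in s) / sample_size
                for s in samples}

for officers, mean_salary in sample_means.items():
    print(officers, mean_salary)
```

The average of the 15 sample means equals the population mean of 18, which is a useful check on the list.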
Next, we construct the probability distribution of the sample means. This is the sampling
distribution of the sample means.
2. The histogram of the sampling distribution of the sample means is constructed by making
a bar graph where the sample means are plotted on the horizontal axis and the
corresponding probabilities are shown in the vertical axis.
PARAMETER AND STATISTIC
The main objective of conducting a survey is to estimate the value of some of the
characteristics of a population. Let us consider one of the results of the XYZ survey before
the May 2016 presidential election. The actual percentage of all the voters represents the
population parameter, while the estimate of that percentage based on the sample is
known as the sample statistic.
The sampling method used in selecting the sample data strongly affects the quality
of the sample statistic with regard to its representativeness and accuracy. The table below
shows a list of the common symbols used for parameters and statistics:
Parameter                              Statistic
Population mean (µ)                    Sample mean (x̄)
Population standard deviation (σ)      Sample standard deviation (s)
Population variance (σ²)               Sample variance (s²)
Population proportion (P)              Sample proportion (p)
Examples
Identify the population parameter and sample statistic for each study.
A recent survey of 540 senior high school students in FTLSHS for the S.Y. 2019-2020 found
that 90% of the students could be classified as good in Mathematics.
Population Parameter: All senior high school students in FTLSHS for the S.Y.
2019-2020
Sample Statistic: Collection of 540 senior high school students in FTLSHS for
the S.Y. 2019-2020 or the 90% of all senior high school
students in FTLSHS for the S.Y. 2019-2020
The average weight of every seventh person entering the Ayala mall within a 3-hour period
was 168 pounds.
Population Parameter: All the people entering the Ayala mall within the
assigned 3-hour period
Sample Statistic: Every seventh person entering the Ayala mall within the 3-hour
period
SOLVE ME PLEASE!
Directions: Solve the following given.
Random sample size (n)    Finite population (N) consisting of       Solution    Final answer
1. n = 2                  3, 4, 5
2. n = 3                  1, 2, 3, 4, 5
3. n = 5                  6, 7, 8, 9, 10, 11, 12
4. n = 7                  4, 6, 8, 10, 12, 14, 16, 18, 20
5. n = 9                  1, 2, 3, 4, 5, 6, 7, 8, 9, 10
6. n = 5                  Odd numbers between 9 – 21 inclusive
ANALYZE ME PLEASE!
Directions: Answer the following problems.
MODULE 6
This module was designed and written with you in mind. It is here to help you master
the Estimation of Parameters. The scope of this module permits it to be used in many
different learning situations. The language used recognizes the diverse vocabulary level of
students. The lessons are arranged to follow the standard sequence of the course. But the
order in which you read them can be changed to correspond with the textbook you are now
using. The module is divided into two lessons, namely:
Lesson 6 – Sampling and Sampling Distribution
If all possible random samples of size n are taken with replacement (independent) from a
population with a mean $\mu$ and variance $\sigma^2$, then the mean ($\mu_{\bar{x}}$), variance ($\sigma^2_{\bar{x}}$), and
standard deviation ($\sigma_{\bar{x}}$) of the sampling distribution of the sample mean are:
$$\mu_{\bar{x}} = \mu \ \text{(mean)}, \qquad \sigma^2_{\bar{x}} = \frac{\sigma^2}{n} \ \text{(variance)}$$
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \ \text{(standard deviation or standard error)}$$
If all possible samples of size n are taken without replacement (dependent) from a finite
population of size N with a mean $\mu$ and variance $\sigma^2$, then the mean ($\mu_{\bar{x}}$) and variance ($\sigma^2_{\bar{x}}$) of the
sampling distribution of the sample mean are:
$$\mu_{\bar{x}} = \mu, \qquad \sigma^2_{\bar{x}} = \frac{\sigma^2}{n} \cdot \frac{N-n}{N-1}$$
Note: The factor $\frac{N-n}{N-1}$ is called the correction factor for a finite population. It will be close to
1 and can be safely ignored when n is small compared to N.
Note that as we increase the sample size, the variance of the sample mean
decreases.
THEOREM
If random samples of size n are taken from a population with a mean $\mu$ and standard
deviation $\sigma$, then the sampling distribution of the sample mean $\bar{x}$ approaches a normal
distribution with mean $\mu_{\bar{x}}$ and standard deviation $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$, and thus can be
standardized as:
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
As n increases, the sampling distribution of the sample mean gets nearer and nearer to
the normal distribution.
Example 1. Suppose a jar contains the numbers 1, 3, and 5. Show that $\mu_{\bar{x}} = \mu$ and
$\sigma^2_{\bar{x}} = \frac{\sigma^2}{n}$.
Solution: The probability distribution is:
x          1      3      5
f(x)       1/3    1/3    1/3
x f(x)     1/3    3/3    5/3
x²         1      9      25
x² f(x)    1/3    9/3    25/3
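Since the distribution is uniform, that is, the observations have the same probabilities, the
mean ($\mu$) and variance ($\sigma^2$) can be easily computed as
$$\mu = 1\left(\tfrac{1}{3}\right) + 3\left(\tfrac{1}{3}\right) + 5\left(\tfrac{1}{3}\right) = \tfrac{9}{3} = 3$$
$$\sigma^2 = (1-3)^2\left(\tfrac{1}{3}\right) + (3-3)^2\left(\tfrac{1}{3}\right) + (5-3)^2\left(\tfrac{1}{3}\right) = \tfrac{4}{3} + 0 + \tfrac{4}{3} = \tfrac{8}{3}$$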
If we take two numbers in succession with replacement, then the possible two-number samples
are: (1,1), (3,3), (5,5), (1,3), (3,1), (1,5), (5,1), (3,5), and (5,3). The averages or means of the pairs, in order,
are 1, 3, 5, 2, 2, 3, 3, 4, and 4, respectively. If we denote the means as the random variable $\bar{x}$, then:
The probability distribution of $\bar{x}$ becomes:
x̄          1      2      3       4       5
f(x̄)       1/9    2/9    3/9     2/9     1/9
x̄ f(x̄)     1/9    4/9    9/9     8/9     5/9
x̄²         1      4      9       16      25
x̄² f(x̄)    1/9    8/9    27/9    32/9    25/9
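The mean of the sampling distribution (using the formula for the mean of a
random variable) is:
$$\mu_{\bar{x}} = \sum \bar{x}\, f(\bar{x})$$
$$= 1\left(\tfrac{1}{9}\right) + 2\left(\tfrac{2}{9}\right) + 3\left(\tfrac{3}{9}\right) + 4\left(\tfrac{2}{9}\right) + 5\left(\tfrac{1}{9}\right)$$
$$= \tfrac{1}{9} + \tfrac{4}{9} + \tfrac{9}{9} + \tfrac{8}{9} + \tfrac{5}{9} = \tfrac{27}{9} = 3$$
But $\mu = 3$; therefore, $\mu_{\bar{x}} = \mu$.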
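The variance of the sampling distribution (using the formula for the variance
of a random variable) is:
$$\sigma^2_{\bar{x}} = E(\bar{x}^2) - [E(\bar{x})]^2 = \sum \bar{x}^2 f(\bar{x}) - \left[\sum \bar{x}\, f(\bar{x})\right]^2$$
$$= \left[1^2\left(\tfrac{1}{9}\right) + 2^2\left(\tfrac{2}{9}\right) + 3^2\left(\tfrac{3}{9}\right) + 4^2\left(\tfrac{2}{9}\right) + 5^2\left(\tfrac{1}{9}\right)\right] - (3)^2$$
$$= \tfrac{93}{9} - 9 = \tfrac{12}{9} = \tfrac{4}{3}$$
But $\frac{\sigma^2}{n} = \frac{8/3}{2} = \frac{4}{3}$, since the sample size n is 2. Therefore:
$$\sigma^2_{\bar{x}} = \frac{\sigma^2}{n}$$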
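The result of Example 1 can be checked by brute force. The sketch below assumes nothing beyond the example itself: it lists all nine ordered samples of size 2 drawn with replacement from the jar {1, 3, 5} and compares the mean and variance of the sample means with μ and σ²/n. The variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

population = [1, 3, 5]
n = 2  # sample size, drawn with replacement

# Population mean and variance (each value has probability 1/3).
mu = Fraction(sum(population), len(population))
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

# All ordered samples of size n with replacement, and their means.
means = [Fraction(sum(s), n) for s in product(population, repeat=n)]

# Mean and variance of the sampling distribution of the sample mean.
mu_xbar = sum(means) / len(means)
var_xbar = sum((m - mu_xbar) ** 2 for m in means) / len(means)

print("mu =", mu, " mu_xbar =", mu_xbar)
print("sigma^2 / n =", sigma2 / n, " var_xbar =", var_xbar)
```

Changing `n` to 3 repeats the check for samples of three numbers.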
Example 2.
If a random sample of size 16 is taken with replacement from the population 1, 1, 1, 2, 4, 5, 5, and 6,
what are the mean, variance, and standard deviation of the sampling distribution of the
sample mean?
Answer: We can write the probability distribution of our population as:
x       1      2      4      5      6
f(x)    3/8    1/8    1/8    2/8    1/8
The mean of the distribution is:
$$\mu = 1\left(\tfrac{3}{8}\right) + 2\left(\tfrac{1}{8}\right) + 4\left(\tfrac{1}{8}\right) + 5\left(\tfrac{2}{8}\right) + 6\left(\tfrac{1}{8}\right) = \tfrac{25}{8} \text{ or } 3.13$$
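And the variance is:
$$\sigma^2 = 1^2\left(\tfrac{3}{8}\right) + 2^2\left(\tfrac{1}{8}\right) + 4^2\left(\tfrac{1}{8}\right) + 5^2\left(\tfrac{2}{8}\right) + 6^2\left(\tfrac{1}{8}\right) - \left(\tfrac{25}{8}\right)^2$$
$$= \tfrac{3}{8} + \tfrac{4}{8} + \tfrac{16}{8} + \tfrac{50}{8} + \tfrac{36}{8} - \tfrac{625}{64}$$
$$= 13.625 - \tfrac{625}{64} = 3.86$$
Therefore, the mean of the sampling distribution of the sample mean ($\mu_{\bar{x}}$) is 3.13 (the same
as $\mu$), while the variance of the sampling distribution ($\sigma^2_{\bar{x}}$) is $\frac{3.86}{16} = 0.24$ (that is, $\sigma^2/n$). In
addition, the standard deviation $\sigma_{\bar{x}}$ is $\sqrt{0.24} = 0.49$ (the positive square root of the variance).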
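The arithmetic of Example 2 can be reproduced with a short script. This is a sketch with illustrative variable names; it uses the shortcut σ² = E(x²) − μ² from the solution and then applies σ²/n for sampling with replacement.

```python
import math
from fractions import Fraction

population = [1, 1, 1, 2, 4, 5, 5, 6]
n = 16  # sample size, drawn with replacement
N = len(population)

# Population mean and variance, using sigma^2 = E(x^2) - mu^2.
mu = Fraction(sum(population), N)
sigma2 = Fraction(sum(x * x for x in population), N) - mu ** 2

# Sampling distribution of the sample mean.
mu_xbar = mu
var_xbar = sigma2 / n
se_xbar = math.sqrt(var_xbar)

print("mean:", round(float(mu_xbar), 2))
print("variance:", round(float(var_xbar), 2))
print("standard error:", round(se_xbar, 2))
```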
Example 3.
A school has 5,000 students with a mean weight of 80 lbs., with a standard deviation of 20
lbs. If you draw a random sample of 50 students, what is the mean, variance and standard
deviation (standard error) of the sampling distribution of the sample mean?
Answer: We know from the given that $\mu = 80$ lbs; therefore, $\mu_{\bar{x}} = 80$ lbs. Since the sample
(n = 50) is taken from a finite population (N = 5000) whose $\sigma = 20$, the standard error ($\sigma_{\bar{x}}$) can
be computed as:
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{20}{\sqrt{50}}\sqrt{\frac{5000-50}{5000-1}}$$
$$= 2.828\sqrt{\frac{4950}{4999}} = (2.828)(0.995) = 2.814$$
The variance is the square of the standard error:
$$\sigma^2_{\bar{x}} = 2.814^2 = 7.9186 \text{ or } 7.919$$
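The computation in Example 3 can be sketched as a small function. The function name `standard_error_finite` is an assumption made for this illustration; it applies the finite population correction factor to σ/√n. Because the script does not round intermediate values, its variance differs slightly in the third decimal place from the hand computation.

```python
import math

def standard_error_finite(sigma, n, N):
    """Standard error of the sample mean when sampling without
    replacement from a finite population of size N."""
    correction = math.sqrt((N - n) / (N - 1))
    return (sigma / math.sqrt(n)) * correction

# Example 3: N = 5000 students, sigma = 20 lbs, sample of n = 50.
se = standard_error_finite(sigma=20, n=50, N=5000)
variance = se ** 2

print("standard error:", round(se, 3))
print("variance:", round(variance, 3))
```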
A. Determine the mean ($\mu_{\bar{x}}$), variance ($\sigma^2_{\bar{x}}$), and standard deviation ($\sigma_{\bar{x}}$) of each.
(1-3) A random sample of 20 independent observations is taken from a population
with $\mu = 23.8$ and $\sigma = 5$.
1. $\mu_{\bar{x}}$ = _____________
2. $\sigma^2_{\bar{x}}$ = _____________
3. $\sigma_{\bar{x}}$ = _____________
(4-6) A random sample of 30 independent observations is taken from a
population with $\mu = 48$ and $\sigma = 6.5$.
4. $\mu_{\bar{x}}$ = _____________
5. $\sigma^2_{\bar{x}}$ = _____________
6. $\sigma_{\bar{x}}$ = _____________
(7-9) A random sample of 120 independent observations is taken with replacement from a
population with $\mu = 120$ and $\sigma = 28$.
7. $\mu_{\bar{x}}$ = _____________
8. $\sigma^2_{\bar{x}}$ = _____________
9. $\sigma_{\bar{x}}$ = _____________
B. Compute the mean, variance, and standard deviation of the sampling distribution taken
from the following populations.
10-12. A community has 1500 people with a mean age of 42 and variance of 16.
If you draw a random sample of 30 people, what are the mean, variance,
and standard error of the sampling distribution of their ages?
13-15. What is the mean, variance, and standard error of the sample mean
when 60 students are taken from a population of 2000 with a mean
score of 75 and standard deviation of 5?
MODULE 7
This module was designed and written with you in mind. It is here to help you master
the nature of Statistics and Probability. The scope of this module permits it to be used in
many different learning situations. The language used recognizes the diverse vocabulary
level of students. The lessons are arranged to follow the standard sequence of the
course. But the order in which you read them can be changed to correspond with the
textbook you are now using.
Lesson 7 – The Central Limit Theorem
In the previous module, you have described the sampling distribution of the sample means
by computing its mean and variance. This time, you will continue to study sampling
distribution along with an important concept used in the field of Statistics.
The Central Limit Theorem (CLT) states that the sampling distribution of the
sample means moves closer to a normal distribution as the sample size increases
regardless of the shape of the population distribution. In other words, the distribution is
approximately normally distributed whenever the sample size is sufficiently large
irrespective of whether the distribution is normal, left-skewed, right-skewed, or uniform.
This theorem also tells us that the mean of the sampling distribution of the sample means
is always equal to the population mean.
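A small simulation can make the theorem concrete. The sketch below is only an illustration under stated assumptions: the population is a made-up, right-skewed set of 20,000 values drawn from an exponential distribution, the seed is fixed so that the run is repeatable, and samples are drawn with replacement. For each sample size it compares the mean and spread of 2,000 sample means with μ and σ/√n.

```python
import random
import statistics

random.seed(0)  # fixed seed so that the run is repeatable

# Hypothetical right-skewed population (exponential, mean near 1).
population = [random.expovariate(1.0) for _ in range(20_000)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

def sample_means(n, trials=2000):
    """Means of `trials` random samples of size n, drawn with replacement."""
    return [statistics.fmean(random.choices(population, k=n))
            for _ in range(trials)]

for n in (2, 30, 100):
    means = sample_means(n)
    print("n =", n,
          "| mean of sample means:", round(statistics.fmean(means), 3),
          "| sd of sample means:", round(statistics.stdev(means), 3),
          "| sigma / sqrt(n):", round(pop_sd / n ** 0.5, 3))
```

Plotting a histogram of `sample_means(2)` against `sample_means(100)` would show the shape moving from skewed toward bell-shaped as n grows.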
There are assumptions and conditions that you have to consider before using the
Central Limit Theorem.
1. The sample must be selected randomly. This is to secure that the data gathered are
unbiased data from a population.
2. The observations must be independent of each other. This means that the value of one
observation does not affect the value of another observation. Normally, when the
sample is randomly selected, it is assumed that independence is already met.
3. The sample size should not be more than 10% of the population when the sample is
drawn without replacement. Taking away each item in the observations changes the
population. When you sample only 10% or less of the population, eliminating each
observation does not change the population that much (Khan Academy).
4. The sample size must be large enough. How large? For many statisticians, a minimum
sample size of 30 is assumed to be sufficiently large. They believe that at this size the
sampling distribution of the mean approaches a normal distribution. But of course, this is
just a guideline and not a general rule because situations vary. If it is stated that the
population is already normally distributed, you may use a small sample size. However,
when the population is strongly skewed, you will need an adequately large sample size.
The basic concept is this: the more the distribution varies from being normal, the larger
the sample size that is required.
After knowing the Central Limit Theorem's definition, assumptions, and conditions, you
may now describe the sampling distribution of the sample means by computing the
mean and the standard error of the mean.
Example 1: The mean (μ) and standard deviation (σ) of a population distribution are 68 and 3,
respectively. Find the mean and standard deviation of the sampling distribution when a
random sample of size 25 is drawn from the population. Assume that the population is finite.
Solution:
Example 2: Samples of size 100 are selected randomly from a population with a mean 28.5
and a standard deviation of 1.5. Compute the mean and standard error of the sampling
distribution of the means given that the population is finite.
Solution:
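A minimal sketch of how the mean and standard error in Examples 1 and 2 could be computed is given here. It rests on one assumption that the examples do not state: because the population size N is not given, the finite population correction factor is ignored and the standard error is taken as σ/√n. The function name is illustrative.

```python
import math

def sampling_mean_and_se(mu, sigma, n):
    """Mean and standard error of the sampling distribution of the
    sample mean, ignoring the finite population correction (assumed)."""
    return mu, sigma / math.sqrt(n)

# Example 1: mu = 68, sigma = 3, n = 25
mean1, se1 = sampling_mean_and_se(68, 3, 25)

# Example 2: mu = 28.5, sigma = 1.5, n = 100
mean2, se2 = sampling_mean_and_se(28.5, 1.5, 100)

print("Example 1:", mean1, round(se1, 2))
print("Example 2:", mean2, round(se2, 2))
```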
The Central Limit Theorem gives us confidence that whatever the shape of the distribution
is, it will approach a normal distribution as the sample size increases.
In the next module, you will solve more real-life problems involving this theorem. Exciting,
right?
Crossword Puzzle
Across
1. this condition shows that the value of one observation does not affect the value of
another observation
3. the relationship between the population mean and the mean of the sampling
distribution of the sample means
4. the sample size should not exceed _____ percent of the population when the
sample is drawn without replacement
5. the standard error of the mean must be close to _____ to have a good estimate of the
mean
7. the more the distribution varies from being normal, the _____ the sample size that is
required
9. the approximate shape of the sampling distribution of the sample means when the
size of the random samples gets larger
Down
2. the standard error of the mean depends on this
6. the data collected must be from a _____ sample
8. a parameter for a standard normal distribution
10. abbreviation for Central Limit Theorem
Instructions:
1. Do only the task that is assigned to your group.
Group 1: Find the mean and standard deviation of the heights of all the students in
your section.
Group 2: Find the mean and standard deviation of the weights of all the students in
your section.
Group 3: Find the mean and standard deviation of the hours spent in sleeping of all
the students in your section.
Group 4: Find the mean and standard deviation of the waistline of all the students
in your section.
Group 5: Find the mean and standard deviation of the general average of all the
students in your section.
2. After finding the mean and standard deviation, create a problem using the Central
Limit Theorem for the solution.
3. Submit your work on a letter size bond paper before the end of the first quarter of
this semester
RUBRIC:
Mathematical Concept
5: The solution has no error.
3: The solution has a minor error.
1: The solution has lots of errors.
Score: _____
Presentation of the output
5: The submitted output shows neatness and orderliness.
3: The submitted output shows neatness and orderliness with minor flaws.
1: The submitted output is untidy and disorganized.
Score: _____
Timeliness
5: The output is submitted on or before the deadline.
3: The output is submitted one or two days after the deadline.
1: The output is submitted later than two days after the deadline.
Score: _____
TOTAL SCORE
MODULE 8
This module was designed and written with you in mind. It is here to help you master the
random variable and probability distributions. The scope of this module permits it to be
used in many different learning situations. The language used recognizes the diverse
vocabulary level of students. The lessons are arranged to follow the standard sequence of
the course. But the order in which you read them can be changed to correspond with the
textbook you are now using.
The module deals with an understanding of:
The knowledge taught to you in the previous lesson is pertinent in your succeeding
lesson. But before we proceed to the next lesson, let me check your understanding.
2. What is the probability that a random sample of size 35 will have a mean of 66 or
more?
Let’s take the step-by-step solution:
➢ the population is finite because its size is given and therefore, we can use the
normal distribution as CLT holds.
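✓ First, convert the values to z-scores using $z = \frac{\bar{X} - \mu}{\sigma_{\bar{x}}}$, where $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$.
✓ $z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{64 - 65}{5 / \sqrt{60}} = \frac{-1}{0.645} = -1.55$
✓ $z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{67 - 65}{5 / \sqrt{60}} = \frac{2}{0.645} = 3.10$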
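The z-score conversion can be sketched in Python using only the standard library. The function names are assumptions made for this illustration; `normal_cdf` gives the area to the left of z under the standard normal curve, so a "greater than" probability is 1 minus that area and a "between" probability is the difference of two areas.

```python
import math

def z_for_sample_mean(xbar, mu, sigma, n):
    """z-score of a sample mean: (xbar - mu) / (sigma / sqrt(n))."""
    return (xbar - mu) / (sigma / math.sqrt(n))

def normal_cdf(z):
    """Area to the left of z under the standard normal curve."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Values used in the computation: mu = 65, sigma = 5, n = 60.
z_low = z_for_sample_mean(64, 65, 5, 60)
z_high = z_for_sample_mean(67, 65, 5, 60)

print("z for 64:", round(z_low, 2))
print("z for 67:", round(z_high, 2))
print("area to the left of z for 64:", round(normal_cdf(z_low), 4))
print("area to the left of z for 67:", round(normal_cdf(z_high), 4))
```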
A. Compute the z value for each; assume that each population is normally distributed.
B.
4. 5.
46 245
2 9 20
?̅?̅̅ 45.5 248
n 20 25
𝒛= ?̅?̅̅− 𝝁
𝝈
√𝒏
Try to do the following activities to deepen your understanding of the Central Limit
Theorem.
Assume that the height of adult women was normally distributed with a
mean of 63 in and standard deviation of 2.5 in.
1. If 36 women are randomly selected, what is the probability that the mean height is
less than 62 in?
2. If 70 women are taken as samples, what is the probability that their mean height is
greater than 62.5 in?
MODULE 9
This module was designed and written with you in
mind. It is here to help you learn about T-Distribution and Percentile. The scope of this
module permits it to be used in many different learning situations. The language used
recognizes the diverse vocabulary level of learners. The lessons are arranged to follow the
standard sequence of the course.
The module is divided into 2 lessons, namely:
Lesson 9 – Understanding the T-Distribution
There are situations when sample sizes are not large enough for the Central Limit
Theorem to be applied. Can we still obtain an interval estimate of the population mean? Can
the assumptions be met? These questions and other pertinent procedures are discussed
in this lesson. To find out if you are ready to learn this new lesson, do the following activities.
n < 30
If the sample size is less than 30 (n < 30), it is considered small; thus, even if the
variance of the population is given, the formula for standardizing the sampling distribution
of the sample mean cannot be used. For this small sample, the normality of the distribution
of the sample mean cannot be guaranteed; thus, the z-table cannot be used.
The T-Distribution
Since the z-table cannot be used for small sample size, another type of distribution
is used. This special case is called t-distribution, formulated in 1908 by an Irish brewing
employee named W.S. Gosset.
Theorem
If $\bar{x}$ and s are the mean and standard deviation, respectively, of a random sample of size
n taken from a normally distributed population with a mean $\mu$, then the sample mean can be
standardized as
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
Where:
n = sample size
Steps Solution
1. Analyze the problem and identify what kind of distribution is going to be used.
Since n=18 and it is less than 30, we can use the t-distribution.
2. Write the formula.
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
3. Evaluate the formula.
$$t = \frac{43 - 42}{6 / \sqrt{25}}$$
4. Subtract the numerator and find the square root of the sample size (n).
$$t = \frac{1}{6 / 5}$$
5. Divide the denominator.
$$t = \frac{1}{1.2}$$
Example 2:
MATH Corporation manufactures light bulbs. The CEO claims that an average Acme
light bulb lasts 400 days. A researcher randomly selects 20 bulbs for testing. The sampled
bulbs last an average of 380 days, with a standard deviation of 50 days. Find the t-value of
the given data.
Steps Solution
3. Write the formula.
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
4. Evaluate the formula.
$$t = \frac{380 - 400}{50 / \sqrt{20}}$$
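The evaluation step can be carried through with a few lines of Python. This is a sketch: the function name `t_statistic` is an assumption, and the light bulb example is evaluated with n = 20, the number of bulbs given in the problem statement.

```python
import math

def t_statistic(xbar, mu, s, n):
    """t = (xbar - mu) / (s / sqrt(n)) for a sample of size n."""
    return (xbar - mu) / (s / math.sqrt(n))

# Light bulb example: xbar = 380, mu = 400, s = 50, n = 20 bulbs.
t_bulbs = t_statistic(380, 400, 50, 20)
print("t =", round(t_bulbs, 3))
```

A negative t-value means the sample mean lies below the claimed population mean.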
The T-Table
The t-table shows right-tail probabilities for selected t-distributions. You can use it
to solve the following problems. Suppose you have a sample of size 10 and you want to find
the 95th percentile of its corresponding t-distribution. You have n – 1 = 9 degrees of freedom,
so, using the t-table, you look at the row for df = 9. The 95th percentile is the number where
95% of the values lie below it and 5% lie above it, so you want the right-tail area to be 0.05.
Move across the row, find the column for 0.05, and you get t = 1.833. This is the 95th percentile of
the t-distribution with 9 degrees of freedom.
Now, if you increase the sample size to n = 20, the value of the 95th percentile
decreases; look at the row for 20 – 1 = 19 degrees of freedom, and in the column for 0.05 (a
right-tail probability of 0.05) you find t = 1.729. Larger degrees of freedom indicate a smaller standard
deviation and thus, the t-values are more concentrated about the mean, so you reach the
95th percentile with a value of t closer to 0.
df
28 1.313 1.701 2.048 2.467 2.763 3.408 3.674
Note:
The formula for Degree of freedom (df) is n -1, Where n is the sample size.
df = n – 1
Example 1:
Mr. Sotto conducts a survey of 25 people on the effectiveness of their new medicine.
He wants to know the 95th percentile of his survey. Find the t-value.
Steps
Compute the degrees of freedom.
df = n – 1
= 25 – 1
df = 24
Change the percentile into a percentage, then into a decimal.
95th = 95% = 0.95
Subtract the decimal from 1 to know what the right-tail area is.
α = 1 – 0.95
α = 0.05
Referring to the table, look for the column of 0.05 and the row of the df.
t = 1.711
Example 2: What is the t-value when n=15 at 90th percentile (one tail)
df =n-1 = 15-1 = 14 and 90th percentile is α=0.10.
Then referring to the table, t = 1.345.
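The table lookup can be mimicked with a small dictionary. The sketch below is an illustration only: `T_TABLE` holds just a few rows and two columns copied from a standard t-table of right-tail critical values, and the function name `t_percentile` is an assumption. A full table or a statistics library would be needed for other degrees of freedom.

```python
# A few right-tail critical values from a standard t-table.
# Outer key: degrees of freedom; inner key: right-tail area (alpha).
T_TABLE = {
    9:  {0.10: 1.383, 0.05: 1.833},
    14: {0.10: 1.345, 0.05: 1.761},
    19: {0.10: 1.328, 0.05: 1.729},
    24: {0.10: 1.318, 0.05: 1.711},
}

def t_percentile(n, percentile):
    """Look up the t-value for a sample of size n at the given percentile."""
    df = n - 1                               # degrees of freedom
    alpha = round(1 - percentile / 100, 2)   # right-tail area
    return T_TABLE[df][alpha]

print(t_percentile(25, 95))  # Example 1: df = 24, alpha = 0.05
print(t_percentile(15, 90))  # Example 2: df = 14, alpha = 0.10
```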
Complete the table and find the t-value on the given problems. (5 points)
1. Suppose scores on a Math test is normally distributed, with a population mean of 100.
Suppose 20 people are randomly selected and tested. The standard deviation in the sample
group is 15. What is the probability that the average test score in the sample group will be at
most 110?
This module was designed and written with you in mind. It is here to help you master
the Estimation of Parameters. The scope of this module permits it to be used in many
different learning situations. The language used recognizes the diverse vocabulary level of
students. The lessons are arranged to follow the standard sequence of the course. But the
order in which you read them can be changed to correspond with the textbook you are now
using.
The module is divided into two lessons, namely:
Length of Confidence Interval and Appropriate Sample Size
The length of a confidence interval is the absolute difference between the upper confidence
limit and the lower confidence limit. That is,
$$LCI = \left[\hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right] - \left[\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right]$$
$$= \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} - \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
$$LCI = 2 z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
The last equation above can be used to find the length of a confidence interval for a population
proportion.
Example 2. Find the length of the confidence interval given the following data
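Solution:
1. Find $\alpha$ in the $(1-\alpha)100\%$ confidence level, then find $z_{\alpha/2}$.
$(1-\alpha)100\% = 95\%$
$1-\alpha = 0.95$
$\alpha = 0.05$
$\frac{\alpha}{2} = \frac{0.05}{2} = 0.025$
Hence, using the Areas under the Standard Normal Curve table, $z_{\alpha/2} = 1.96$.
2. Compute the length of the confidence interval.
$$LCI = 2 z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 2(1.96)\sqrt{\frac{0.25(0.75)}{400}} = 3.92\sqrt{\frac{0.1875}{400}}$$
$$= 0.0849 \text{ or } 0.085$$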
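The length formula can be sketched as a one-line function. The name `ci_length_proportion` is an assumption made for this illustration; the call uses the values from the solution (sample proportion 0.25, n = 400, z = 1.96).

```python
import math

def ci_length_proportion(p_hat, n, z):
    """Length of a confidence interval for a population proportion:
    2 * z * sqrt(p_hat * (1 - p_hat) / n)."""
    return 2 * z * math.sqrt(p_hat * (1 - p_hat) / n)

length = ci_length_proportion(p_hat=0.25, n=400, z=1.96)
print("LCI =", round(length, 4))
```

When the two limits are already known, the length is simply the upper limit minus the lower limit.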
The last equation above can be used to find the length of the confidence interval.
Example 3. Find the length of the confidence interval, given the following data.
s=6.17 , n=12, confidence level : 99%
Solution:
1. Find the degrees of freedom df.
df= n−1
= 12 −1
=11
2. Find $\alpha$ in the $(1-\alpha)100\%$ confidence level, then find $z_{\alpha/2}$.
$(1-\alpha)100\% = 99\%$
$1-\alpha = 0.99$
$\alpha = 0.01$
$\frac{\alpha}{2} = \frac{0.01}{2} = 0.005$
Determining the Sample Size for Estimating p
The following are the formulas:
A. When an estimate $\hat{p}$ is known:
$$n = \frac{(z_{\alpha/2})^2\,\hat{p}(1-\hat{p})}{E^2}$$
B. When an estimate $\hat{p}$ is unknown:
$$n = \frac{0.25\,(z_{\alpha/2})^2}{E^2}$$
Where: n = desired sample size
$\hat{p}$ = sample proportion
$z_{\alpha/2}$ = z value
E = margin of error
0.25 is constant
If the computed sample size is not a whole number, it should be rounded up to the next
whole number.
Example 4.
In a previous study done by a student, it was found that 28.5% of the students used Twitter.
This year, your statistics teacher wants you to conduct a study on the current percentage
of Twitter users among the students in your school. How many students must you include
in your study to be 95% confident that the margin of error is no more than 3.5
percentage points?
Solution:
a. $\hat{p}$ = 28.5% = 0.285
$\hat{q} = 1 - 0.285 = 0.715$
b. E = 3.5% = 0.035
$$n = \frac{(z_{\alpha/2})^2\,\hat{p}\,\hat{q}}{E^2} = \frac{(1.96)^2(0.285)(0.715)}{(0.035)^2} = 639.04 \approx 640$$
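The sample size computation, including the rounding-up rule, can be sketched as follows. The function name and its parameters are assumptions made for this illustration; when no estimate of the proportion is passed, the constant 0.25 is used in its place.

```python
import math

def sample_size_for_proportion(E, z, p_hat=None):
    """Sample size needed to estimate a proportion within margin of error E.
    Uses p_hat * (1 - p_hat) when an estimate is known, else the constant 0.25."""
    pq = 0.25 if p_hat is None else p_hat * (1 - p_hat)
    n = z ** 2 * pq / E ** 2
    return math.ceil(n)  # always round up to the next whole number

# Example 4: p_hat = 0.285, E = 0.035, 95% confidence (z = 1.96).
n_students = sample_size_for_proportion(E=0.035, z=1.96, p_hat=0.285)
print("students needed:", n_students)
```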
A. Find the length of the following confidence interval.
1. 0.325< p <0.575 2. 0.137< p < 0.563 3. 0.338< p < 0.562 4. 0.301 < p < 0.751 5. 0.245 < p <
0.467
Find the length of the confidence interval given the following upper and lower
limits.
6. Upper confidence limit= 0.673
B.
C. Find the length of the confidence interval given the following data.
11. ?̂?̂ =0.35, n=400, confidence level= 95%
12. ?̂?̂ =0.45, n=350, confidence level= 95%
13. ?̂?̂ =0.48, n=410, confidence level= 95%
14. ?̂?̂ =0.51, n=420, confidence level= 95%
15. ?̂?̂ =0.42, n=300, confidence level= 95%
16. 𝜎=0.5, n=36, confidence level =95%
17. 𝜎=0.4, n=64, confidence level =95%
18. 𝜎=0.52 , n=40, confidence level =95%
19. 𝑠=5.25 n=14, confidence level =95%
20. 𝑠=6.05 n=12, confidence level =95%
Quarter II
MODULE 1
This module was designed and written with you in mind. It is here to help you master
the nature of Statistics and Probability. The scope of this module permits it to be used in
many different learning situations. The language used recognizes the diverse vocabulary
level of students. The lessons are arranged to follow the standard sequence of the course.
But the order in which you read them can be changed to correspond with the textbook you
are now using.
The module consists of one lesson which contains sub lessons:
Lesson 1 – Tests of Hypothesis
In daily life, we make tentative explanations of facts about a particular phenomenon by
formulating a hypothesis. This hypothesis may be correct or incorrect, depending on the
available evidence that we can gather to support it. We usually use a sample to
gather the information and evidence that we need to validate our hypothesis. The data that we
gather from this sample become the basis of our decision whether to accept or reject
our hypothesis regarding the entire population. The data obtained from this sample are
analyzed with the use of an appropriate statistical procedure to find out whether our
hypothesis should be accepted. This process is called hypothesis testing. In this lesson, we
shall discuss another aspect of inferential statistics: the testing of hypotheses. We shall learn
how to conduct a test of hypothesis that will help us arrive at the right decision.
A statistical hypothesis is a statement about the numerical value of a population
parameter. It is a statement or tentative assertion which aims to explain facts about a
certain phenomenon. We need to determine whether a hypothesis is true or not. Thus, it
must be subjected to a statistical testing procedure known as test of hypothesis or
hypothesis testing. If the hypothesis is found to be true, it is accepted; if it is found
false, it is rejected.
There are two kinds of hypothesis: the null and alternative hypotheses.
Since the null and alternative hypotheses are contradictory, you must examine evidence to
decide if you have enough evidence to reject the null hypothesis or not. The evidence is in
the form of sample data.
After you have determined which hypothesis the sample supports, you make a decision.
There are two options for a decision. They are “reject H0” if the sample information favors
the alternative hypothesis or “do not reject H0” or “decline to reject H0” if the sample
information is insufficient to reject the null hypothesis.
Notice that the null hypothesis is expressed through the use of the “equal” symbol while
the alternative hypothesis is expressed by the "not equal" symbol because the claim or
conjecture does not specify any direction.
Example 2
H0: No more than 30% of the registered voters in Santa Clara County voted in the primary
election. µ ≤ 30%
Ha: More than 30% of the registered voters in Santa Clara County voted in the primary
election. µ > 30%
Example 3
We want to test if college students take less than five years to graduate from college, on the
average. The null and alternative hypotheses are:
H0: μ ≥ 5
Ha: μ < 5
Types of Tests
A statistical test may either be directional (one-tailed) or non-directional (two-tailed).
We can determine whether a test is directional or nondirectional by looking at how the
alternative hypothesis is expressed.
Directional Test
A test of any statistical hypothesis where the alternative hypothesis is expressed using less
than (<) or greater than(>) is called directional test or one-tailed test since the critical or
rejection region lies entirely in one tail of the sampling distribution.
Study the following examples.
Example 4
Claim: The average weekly allowance of college students is less than Php 1 500.
H0: The average weekly allowance of college students is equal to Php 1 500
(µ = 1 500).
Ha: The average weekly allowance of college students is less than Php 1 500
(µ < 1 500).
This is a directional test or one-tailed test. More specifically, this is a left-tailed test because
the "less than" symbol was used in expressing the alternative hypothesis. Thus, the critical
region or the rejection region lies entirely in the left tail of the sampling distribution.
Example 5
Claim: The average weekly allowance of college students is greater than
Php 1 500.
H0: The average weekly allowance of college students is equal to Php 1 500
(µ = 1 500).
Ha: The average weekly allowance of college students is greater than Php 1 500
(µ > 1 500).
This is also a directional test or one-tailed test. More specifically, this is a right-tailed test
because the “greater than" symbol was used in expressing the alternative hypothesis.
Thus, the critical region or the rejection region lies entirely in the right tail of the sampling
distribution.
Nondirectional Test
A test of any statistical hypothesis where the alternative hypothesis is written with a
not equal sign (≠) is called a nondirectional test or two-tailed test since there is no assertion
made on the direction of the difference. The rejection region is split into two equal parts,
one in each tail of the sampling distribution.
Example 6
Claim: The average weekly allowance of college students is Php 1 500.
H0: The average weekly allowance of college students is equal to Php 1 500
(µ = 1 500).
Ha: The average weekly allowance of college students is not equal to Php 1 500
(µ ≠ 1 500).
Observe that the alternative hypothesis is expressed using the “not equal” (≠) symbol; the
test is two-tailed.
Types of Error
In decision-making, we sometimes make a wrong decision.
Likewise, when we test a hypothesis, there is a possibility that we shall also commit an error
of accepting or rejecting the hypothesis. There are two types of errors: the Type I error and
the Type II error.
Type I error occurs when we reject the null hypothesis when it is true. It is also
called alpha error (α error).
Type II error occurs when we accept the null hypothesis when it is false. It is also
called beta error (ß error).
In hypothesis testing, four outcomes are possible: two of which lead to incorrect decisions.
The four possible outcomes are described in the table below.
Fact
Decision Ho is true Ho is false
Accept Ho Correct decision Type II error
Reject Ho Type I error Correct decision
Level of Significance
The probability of committing a Type I error is called the level of significance. It is denoted
by the Greek letter α (alpha). Thus, the value of α tells us the probability of making an
error in rejecting the null hypothesis when it is true. The choice for the value of the
significance level is determined by the researcher. This depends on the risk or degree of
confidence the researcher is willing to take in committing a Type I error. The commonly
used levels of significance are 0.05 and 0.01. The level of significance should be set before
testing the hypothesis.
Example 7
A 0.01 level of significance means that the researcher is willing to take a 1% risk of error in
making a decision. It also implies that he is 99% confident that he will make a right decision.
Likewise, a 0.05 level of significance means that the researcher is willing to take a 5% risk of error in
making a decision. It also implies that he is 95% confident that he will make a right decision.
Steps in Testing the Hypothesis
Whenever we test hypotheses, we follow these steps.
Step 1: Identify the claim and formulate the null (H0) and alternative (Ha) hypotheses.
Step 2: Set the level of significance and determine whether the test is one-tailed or
two-tailed by looking at how the alternative hypothesis is expressed. Decide
on the test statistic to be used and find the critical value for the test. Draw or
illustrate the rejection region.
Step 3: Compute the test value, using the test statistic or formula for the
test.
Step 4: Make a decision whether to accept or reject the null hypothesis.
Step 5: Formulate a conclusion by answering the research question.
H0: µ = 50
Ha: µ ≠ 50
Test Value = 1.45
Critical Value = ±1.96
Decision: Accept H0

H0: µ = 50
Ha: µ ≠ 50
Test Value = -1.65
Critical Value = ±1.96
Decision: Accept H0

H0: µ = 50
Ha: µ ≠ 50
Test Value = 2.35
Critical Value = ±1.96
Decision: Reject H0

H0: µ = 50
Ha: µ ≠ 50
Test Value = -2.35
Critical Value = ±1.96
Decision: Reject H0

H0: µ ≤ 50
Ha: µ > 50
Test Value = 1.86
Critical Value = 1.65
Decision: Fail to accept/Reject H0

H0: µ ≤ 50
Ha: µ > 50
Test Value = 1.34
Critical Value = 1.65
Decision: Accept H0

H0: µ ≥ 50
Ha: µ < 50
Test Value = -2.05
Critical Value = −2.53
Decision: Accept H0

H0: µ ≥ 50
Ha: µ < 50
Test Value = -2.88
Critical Value = −2.53
Decision: Fail to accept/Reject H0
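The decision rule used in these examples can be sketched as a small function. The function name `decide` and the tail labels "two", "right", and "left" are assumptions made for this illustration. The rule rejects H0 when the test value falls in the rejection region, that is, beyond the critical value in the direction given by the alternative hypothesis.

```python
def decide(test_value, critical_value, tail):
    """Return the decision for a hypothesis test.

    tail: "two" for Ha with the not-equal sign,
          "right" for Ha with >, "left" for Ha with <.
    """
    if tail == "two":
        reject = abs(test_value) > abs(critical_value)
    elif tail == "right":
        reject = test_value > critical_value
    elif tail == "left":
        reject = test_value < critical_value
    else:
        raise ValueError("tail must be 'two', 'right', or 'left'")
    return "Reject H0" if reject else "Accept H0"

print(decide(1.45, 1.96, "two"))
print(decide(-2.35, 1.96, "two"))
print(decide(1.86, 1.65, "right"))
print(decide(-2.05, -2.53, "left"))
```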
Identify Me Please!
9. The average score of grades eleven students in Filemon T. Lizan Senior High in
Statistics and Probability during the Diagnostic Test is at most 45 out of 50-item test.
10. The mean number of cars a person owns in his/ her lifetime is not more than ten.
I Can Do These!
What’s My Decision?
Directions: Decide whether the null hypothesis is to be accepted or rejected, given the
test value and the critical value of test statistic.
Hypotheses Rejection Region Decision
1. H0: µ = 150
Ha: µ ≠ 150
Test Value = 2.35
Critical Value = ±1.96 __________________
2.
H0: µ = 150
Ha: µ ≠ 150
Test Value = -1.34 __________________
Critical Value = ±1.96
3.
H0: µ = 150
Ha: µ ≠ 150
Test Value = 1.97 __________________
Critical Value = ±1.96
4.
H0: µ = 150
Ha: µ ≠ 150
Test Value = -2.02 __________________
Critical Value = ±1.96
5.
H0: µ = 150
Ha: µ ≠ 150
Test Value = -1.99 __________________
Critical Value = ±1.96
6.
H0: µ ≤ 150
Ha: µ > 150
Test Value = 1.56 __________________
Critical Value = 1.65
7.
H0: µ ≤ 150
Ha: µ > 150
Test Value = 1.28 __________________
Critical Value = 1.65
8.
H0: µ ≥ 150
Ha: µ < 150
Test Value = -2.55
__________________
Critical Value = −2.53
10.
H0: µ ≤ 150
Ha: µ > 150
Test Value = 1.76 __________________
Critical Value = 1.65
MODULE 2
This module was designed and written with you in mind. It is here to help you master
the random variable and probability distributions. The scope of this module permits it to be
used in many different learning situations. The language used recognizes the diverse
vocabulary level of students. The lessons are arranged to follow the standard sequence of
the course. But the order in which you read them can be changed to correspond with the
textbook you are now using.
The module focuses on one topic, covered in two lessons:
Lesson 1 – Hypothesis Testing About a Population Mean When the Variance is
Known
Lesson 2 – Hypothesis Testing About a Population Mean When the Variance is
Unknown
Learning Competencies:
✓ formulates the appropriate null and alternative hypotheses on a population mean.
(M11/12SP-IVb-1)
✓ identifies the appropriate form of the test-statistic when: (a) the population variance
is assumed to be known; (b) the population variance is assumed to be unknown; and
(c) the Central Limit Theorem is to be used. (M11/12SP-IVb-2)
After going through this module, you are expected to:
✓ differentiate traditional approach from probability value approach of hypothesis
testing
✓ determine whether a hypothesis test is non-directional or directional
✓ determine whether a directional test is left-tailed or right-tailed
One of the ultimate goals of every nation is to produce professionals who will contribute to
scientific knowledge through research. In a research investigation, hypothesis testing is a vital
procedure. It is the process of deciding whether to accept or reject a statement or assumption about
some parameter in a research problem. From the results of correct decision making,
conclusions are drawn and facts are generated, which can then contribute to the
body of knowledge in the fields of education, business, medicine, commerce, economics, and
many others.
In this lesson, we will study the terminologies related to hypothesis testing, how to
calculate the probabilities of committing a Type I and a Type II error, hypothesis testing about
a population mean when the variance is known, hypothesis testing when the variance is
unknown, and hypothesis testing concerning proportions. Hypothesis testing, the focal
point of the lesson, brings to light the role of research in discovering new knowledge and
breakthroughs in different fields of discipline.
Example 1. The leader of the association of jeepney drivers claims that the average daily
take-home pay of all jeepney drivers in Navotas City is Php400.00. A random sample of 100
jeepney drivers in Navotas City was interviewed, and the average daily take-home pay of these
drivers was found to be Php425.00. Use a 0.05 significance level to find out if the average daily
take-home pay of all jeepney drivers in Navotas City is different from Php400.00. Assume
that the population variance is 8,464.
Solution:
Step 1. State the null and the alternative hypothesis.
𝐻𝑜: 𝜇 = 400
𝐻𝑎: 𝜇 ≠ 400
Step 2. Choose the level of significance: 𝛼 = 0.05.
Step 3. Compute the test statistic. Since it is the population mean that is being tested,
the population variance is known, and n > 30, the appropriate test statistic is the z-value.
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
Computation:
The standard deviation 𝜎 is the square root of the variance 𝜎². The square root of 8,464
is 92, hence 𝜎 = 92.
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{425 - 400}{92 / \sqrt{100}} = \frac{25}{9.2} = 2.72$$
For a two-tailed test at 𝛼 = 0.05, the critical values are ±1.96. Because the computed test statistic, z = 2.72, falls within the rejection region (beyond the
critical value +1.96), reject the null hypothesis and accept the alternative hypothesis.
Conclude that the average daily take-home pay of jeepney drivers is not equal to Php400.00.
This result is significant at the 𝛼 = 0.05 level.
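The computation in Example 1 can be checked with a short Python sketch. The helper name `z_statistic` is an illustrative assumption; the data (sample mean 425, claimed mean 400, variance 8,464, n = 100) and the critical value 1.96 come from the example.

```python
import math


def z_statistic(sample_mean, pop_mean, pop_sd, n):
    """z = (sample mean - population mean) / (sigma / sqrt(n))."""
    return (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))


sigma = math.sqrt(8464)               # standard deviation = 92.0
z = z_statistic(425, 400, sigma, 100)
critical = 1.96                       # two-tailed, alpha = 0.05

# Two-tailed rule: reject when |z| is beyond the critical value.
decision = "Reject H0" if abs(z) > critical else "Fail to reject H0"
print(round(z, 2))   # 2.72
print(decision)      # Reject H0
```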
B. By p-value method
This method is gaining popularity because of statistical computer programs, most of
which report a p-value. In deciding and drawing a
conclusion, the following rules are important.
a. If p-value ≤ 𝛼 , reject 𝐻𝑜 ,
b. If p-value > 𝛼, do not reject 𝐻𝑜
Solution
Step 1. State the null and the alternative hypothesis.
𝐻𝑜: 𝜇 = 400
𝐻𝑎: 𝜇 ≠ 400
Step 2. Choose the level of significance: 𝛼 = 0.05.
Step 3. Compute the test statistic. Since it is the population mean that is being tested,
the population variance is known, and n > 30, the appropriate test statistic is the z-value.
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
Computation:
The standard deviation 𝜎 is the square root of the variance 𝜎². The square root of 8,464
is 92, hence 𝜎 = 92.
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{425 - 400}{92 / \sqrt{100}} = \frac{25}{9.2} = 2.72$$
Step 4. Find the p-value. The computed test statistic is z = 2.72. Use the table of Areas
Under the Normal Curve to find the area that corresponds to z = 2.72, which is 0.4967, and subtract it
from 0.5. Hence, 0.5 − 0.4967 = 0.0033. Since this is a two-tailed test, double the result. The
p-value = 2(0.0033) = 0.0066.
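[Figure: normal curve centered at 𝜇 = 400, with the non-rejection region in the middle, a rejection region of area 𝛼/2 = 0.025 in each tail, and an area of ½p = 0.0033 beyond the test statistic in each tail.]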
Step 5. Draw a conclusion.
Since 0.0066 is less than 0.05, reject the null hypothesis and accept the alternative
hypothesis. Conclude that the average daily take home pay of jeepney drivers is not equal
to Php400.00. This result is significant at 𝛼=0.05 level.
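The table lookup in Step 4 can also be done numerically. The sketch below is only an illustration: it uses Python's standard `math.erfc` to get the upper-tail area of the standard normal curve, and the helper names `upper_tail` and `two_tailed_p` are assumptions. The result is about 0.0065, slightly different from the 0.0066 above because the table area is rounded to four decimals.

```python
import math


def upper_tail(z):
    """Area under the standard normal curve to the right of z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def two_tailed_p(z):
    """Two-tailed p-value: double the tail area beyond |z|."""
    return 2 * upper_tail(abs(z))


alpha = 0.05
p = two_tailed_p(2.72)
decision = "Reject H0" if p <= alpha else "Do not reject H0"
print(p)
print(decision)   # Reject H0
```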
Directions: Read each problem carefully. Choose the letter which corresponds
to the correct answer and write it in a separate sheet of paper.
1- 5. Find the appropriate rejection region in each case. (when the variance is
known)
1. 𝐻𝑎: 𝜇≠𝜇𝑜 , 𝛼 = 0.05.
a. z=−1.97 b. z=1.95 c. z=1.96 d. z=1.94
2. 𝐻𝑎: 𝜇>𝜇𝑜 , 𝛼 = 0.01.
a. z=+2.33 b. z=−2.33 c. z= +2.35 d. z= −2.35
3. A two tailed test at 10% level of significance.
a. The appropriate rejection region is the area to the right of the critical value z=+1.645
and the area to the left of the critical value z=−1.645.
b. The appropriate rejection region is the area to the right of the critical value z=−1.645
and the area to the left of the critical value z=+1.645.
c. The appropriate rejection region is the area to the right of the critical value z=+1.635
and the area to the left of the critical value z=−1.635.
d. The appropriate rejection region is the area to the right of the critical value z=+1.545
and the area to the left of the critical value z=−1.545.
4. A two tailed test at 95% level of confidence.
a. The appropriate rejection region is the area to the right of the critical value z=+1.645
and the area to the left of the critical value z=−1.645.
b. The appropriate rejection region is the area to the right of the critical value z=+1.96 and
the area to the left of the critical value z=−1.96.
c. The appropriate rejection region is the area to the right of the critical value z=+1.95 and
the area to the left of the critical value z=−1.95.
d. The appropriate rejection region is the area to the right of the critical value z=+2.96
and the area to the left of the critical value z=−2.96.
5. 𝐻𝑎: 𝜇<𝜇𝑜 , 𝛼 = 0.01.
a. z=2.33 b. −2.33 c. 2.45 d. -2.45
a. The computed test statistic z=1.77 does not fall within the rejection region,
hence do not reject the null hypothesis.
b. The computed test statistic z=−1.77 does not fall within the rejection region,
hence do not reject the null hypothesis.
c. The computed test statistic z=1.77 falls within the rejection region, hence, do
not reject the null hypothesis.
d. The computed test statistic z=1.77 does not fall within the rejection region,
hence accept the null hypothesis.
7. 𝐻𝑜: 𝜇 = 84 , 𝐻𝑎: 𝜇 ≠ 84. By using the p- value method.
Given: x̄ = 87, 𝜎 = 10, n = 35, 𝛼 = 0.05
a. The p-value of 0.077 is more than 0.05, hence accept the null hypothesis.
b. The p-value of 0.077 is less than 0.05, hence do not accept the null hypothesis.
c. The p-value of 0.077 is more than 0.05, hence do not reject the null hypothesis.
d. The p-value of 0.077 is less than 0.05, hence accept the null hypothesis.
8. 𝐻𝑜: 𝜇 = 45 , 𝐻𝑎: 𝜇 < 45. By using the critical value method.
Given: x̄ = 40, 𝜎 = 12, n = 32, 𝛼 = 0.01
a. The computed test statistic z=−2.36 falls within the rejection region, hence reject
the null hypothesis.
b. The computed test statistic z=2.36 does not fall within the rejection region, hence
reject the null hypothesis.
c. The computed test statistic z=2.36 does not fall within the rejection region, hence,
accept the null hypothesis.
d. The computed test statistic z=−2.36 falls within the rejection region, hence,
accept the null hypothesis.
9. 𝐻𝑜: 𝜇 = 45 , 𝐻𝑎: 𝜇 < 45. By using the P- value method.
Given: x̄ = 40, 𝜎 = 12, n = 32, 𝛼 = 0.01
a. The p-value of 0.0091 is less than −0.01, hence reject the null hypothesis.
b. The p-value of 0.0091 is more than 0.01, hence reject the null hypothesis.
c. The p-value of 0.0091 is less than 0.01, hence accept the null hypothesis.
d. The p-value of 0.0091 is less than 0.01, hence reject the null hypothesis.
10- 13. Find the critical value of the following. (When variance is unknown)
10. A right -tailed test, 𝛼=0.05 ; df =24.
a. Critical value = +1.711 c. Critical value = 1.750
b. Critical value = 1.712 d. Critical value = −1.711
11. A left-tailed test; 𝛼 = 0.01; df = 14.
a. Critical value = 2.553 c. Critical value = −2.553
b. Critical value = −2.624 d. Critical value = 2.624
12. A two-tailed test, 𝛼 = 0.01; df = 18.
a. Critical value = ±1.734 c. Critical value = −1.734
b. Critical value = +2.878 d. Critical value = −2.878
13. A two-tailed test, 𝛼 = 0.05; df = 16.
a. Critical value = +2.120 c. Critical value = ±2.120
b. Critical value = −2.120 d. Critical value = 2.120
MODULE 3
This module was designed and written with you in mind. It is here to help you master
the tests of hypothesis. The scope of this module permits it to be used in many different
learning situations. The language used recognizes the diverse vocabulary level of
students. The lessons are arranged to follow the standard sequence of the course. But
the order in which you read them can be changed to correspond with the textbook you
are now using.
The module consists of a lesson, namely:
Lesson 1 – Comparing the Sample Mean and the Population Mean in a Large
Sample Size
Let us explore.
Example 1.
A new drug on the market is claimed by its manufacturers to help overweight women lose
4.55 kg per month, with a standard deviation of 0.91 kg. Ten women chosen at random have
reported losing an average of 4.05 kg within a month. Do these data support the claim of
the manufacturer at the 0.05 level of significance?
So that you can easily understand how to test a hypothesis, a simplified approach to testing a
hypothesis is presented below. Study it carefully and follow it.
Example 2: The ABC company claims that the average lifetime of a certain tire is at least 28
000 km.
To check the claim, a taxi company puts 40 of these tires on its taxis and gets a mean
lifetime of 25 560 km. With a standard deviation of 1 350 km, is the claim true? Use the z-
test at 0.05.
II. Hypotheses:
Ho: The average lifetime of a certain tire is at least 28 000 km. (Ho: µ ≥ 28 000)
Ha: The average lifetime of a certain tire is less than 28 000 km. (Ha: µ < 28 000)
Since the claim says that the average lifetime of a certain tire is at least 28
000 km, the
alternative hypothesis is Ha: µ < 28 000.
III. Level of Significance: α = 0.05
Critical value (cv): z = -1.645
IV. Statistics: z-test for one-tailed
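The test value for Example 2 is not shown above, so the following Python sketch works it out under the stated setup: a left-tailed z-test with critical value -1.645, using the given standard deviation of 1 350 km in the z formula. The variable names are illustrative assumptions.

```python
import math

sample_mean = 25560      # mean lifetime of the 40 tires, in km
claimed_mean = 28000     # value under Ho
sd = 1350                # standard deviation given in the problem
n = 40
critical = -1.645        # left-tailed, alpha = 0.05

z = (sample_mean - claimed_mean) / (sd / math.sqrt(n))

# Left-tailed rule: reject Ho when z falls below the critical value.
decision = "Reject Ho" if z < critical else "Fail to reject Ho"
print(round(z, 2))   # -11.43
print(decision)      # Reject Ho
```

Under these assumptions the test value lies far inside the rejection region, so the sample does not support the claim that the average lifetime is at least 28 000 km.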
Determine the decision for each of the following, given the computed and
critical value of the z.
1. z computed = 1.82 z critical = 1.96
2. z computed = 2.54 z critical = 2.33
3. z computed = 1.02 z critical = 2.33
4. z computed = 2.54 z critical = 2.33
5. z computed = 2.54 z critical = 2.33
Determine the decision for each of the following given the computed z
note: Determine first the critical value using the confidence level .
Directions: Read and understand the problem carefully, then solve it.
A sociologist believes that it costs more than Php 90 000 to raise a child from birth to
age one. A random sample of 49 families, each with a child, is selected to see if this
figure is correct. The average expenses for these families reveal a mean of Php 92 000
with a standard deviation of Php 4 500. Based on these sample data, can it be
concluded that the sociologist is correct in his claim? Use the 0.05 level of significance.
I. Problem
II. Hypotheses:
III. Level of Significance: Critical value (cv)
IV. Statistics
Rejection Region:
Compute the test value, using the test statistics
V. Decision Rule:
VI. Conclusion:
A printer manufacturing company claims that its new ink-efficient printer can print
an average of 1500 pages of word documents with a standard deviation of 60. Thirty-
five (35) of these printers showed a mean of 1 475 pages. Does this support the
company's claim? Use the 95% confidence level.
I. Problem
II. Hypotheses:
III. Level of Significance: Critical value (cv)
IV. Statistics
Rejection Region:
Compute the test value, using the test statistics
V. Decision Rule:
VI. Conclusion:
MODULE 4
This module was designed and written with you in mind. It is here to help you master
the nature of Statistics and Probability. The scope of this module permits it to be
used in many different learning situations. The language used recognizes the diverse
vocabulary level of students. The lessons are arranged to follow the standard
sequence of the course. But the order in which you read them can be changed to
correspond with the textbook you are now using. This module targets the following
learning competencies:
1. Compute for the test statistic value (population mean) (M11/12SP-IVd-1).
2. Draw conclusion about the population mean based on the test-statistic value and
the rejection region (M11/12SP-IVd-2).
After going through this module, you are expected to:
• define tests of significance;
• compute the test statistic;
• find the p-value;
• compare the p-value with 𝛼; and
• draw a conclusion about the population mean based on the test-statistic value and
the rejection region.
Lesson 4 – Tests of Significance
Once a sample data has been collected, researchers will use a tool to find out the
probability that a relationship exists between two variables in every sample. They
need to assess whether or not the relationship between two variables does exist or it
is just because of random chance. In this module, you will learn how to do it. You will
know how to compute for the test statistic value (population mean), and draw a
conclusion about the population mean. The learning that you gained from the
previous modules will help you understand this lesson.
After formulating the null and alternative hypotheses, the next step is to
compute the test statistic. However, before doing the computation, you have to
identify first the appropriate significance test. Take note that the test statistic
follows a normal distribution where the mean is 0 and the standard deviation is 1.
𝜇 = population mean
s = sample standard deviation
n = sample size
To summarize when to use a t-test or a z-test, use this diagram:
In the past, statisticians used a z-test when n ≥ 30 and a t-test when n < 30.
That is because they assumed that the sampling distribution is approximately normal when the sample
size is large enough. However, there is no need to do this nowadays. We can now use a t-test
even if the sample size is greater than or equal to 30. Even statistical packages now use a
t-test for large sample sizes. This is because as the sample size increases, t gets closer to z,
meaning you do not lose anything when you use a t-test. The main point now is this: if the
population standard deviation (σ) is unknown, use a t-test regardless of the sample size.
In other words, the choice between a z-test and a t-test does not depend on n. So, whenever you use a sample
standard deviation (s) as an estimate of the population
standard deviation (σ) to compute the standard error, use a t-statistic.
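The rule above can be sketched in Python as follows. The function name `choose_test` is an illustrative assumption. The small dictionary holds two-tailed critical t values at 𝛼 = 0.05 copied from a standard t table, to show how t approaches the z critical value of 1.960 as the degrees of freedom grow.

```python
def choose_test(sigma_known):
    """Pick the test for a one-sample mean: z if sigma is known, else t."""
    return "z-test" if sigma_known else "t-test"


# Two-tailed critical values at alpha = 0.05, keyed by degrees of freedom
# (values copied from a standard t table).
T_CRITICAL_05 = {10: 2.228, 30: 2.042, 60: 2.000, 120: 1.980}
Z_CRITICAL_05 = 1.960

print(choose_test(sigma_known=True))    # z-test
print(choose_test(sigma_known=False))   # t-test

# The gap between t and z shrinks as the degrees of freedom increase.
gaps = [round(t - Z_CRITICAL_05, 3) for t in T_CRITICAL_05.values()]
print(gaps)   # [0.268, 0.082, 0.04, 0.02]
```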
Example 1: Compute the test statistic using the following data:
x̄ = 85, 𝜇 = 84, 𝜎 = 5, 𝑛 = 60
Steps and Solution
1. Identify the appropriate statistical test.
Since the population standard deviation 𝜎 is known, use the z-test.
2. Compute using the formula for the z statistic.
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{85 - 84}{5 / \sqrt{60}} = \frac{1}{5 / 7.7460} = \frac{1}{0.6455} = 1.55$$
Level of significance = 0.05
Steps and Solution
1. Identify the appropriate statistical test.
Since the population standard deviation 𝜎 is unknown, use the t-test.
2. Compute using the formula for the t statistic.
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{130.05 - 120}{9.96 / \sqrt{20}} = \frac{10.05}{9.96 / 4.4721} = \frac{10.05}{2.2271} = 4.512 \text{ or } 4.51$$
Reject the Ho
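The t computation above can be verified with a short Python sketch. The helper name `t_statistic` is an assumption; the numbers (sample mean 130.05, hypothesized mean 120, s = 9.96, n = 20) are read from the computation itself.

```python
import math


def t_statistic(sample_mean, pop_mean, sample_sd, n):
    """t = (sample mean - population mean) / (s / sqrt(n))."""
    return (sample_mean - pop_mean) / (sample_sd / math.sqrt(n))


t = t_statistic(130.05, 120, 9.96, 20)
df = 20 - 1               # degrees of freedom = n - 1
print(round(t, 2), df)    # 4.51 19
```

The critical value is then read from the t table at df = 19 for the chosen level of significance.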
B. The Probability-value Method (p-Value Method) Recall that the null hypothesis (H0) is the
A p-value helps you determine how likely the observed data are, assuming that H0 is true. For a
right-tailed test, it is the probability to the right of the test statistic; for a left-tailed test, it is the probability to the left. If you are doing a two-tailed test, then
it is the combined probability in the lower-left and upper-right tails beyond the test statistic. Note that it
does not tell you the probability that H0 is true (because in the first place, you assume this
to be true before doing the test). This belief is one of the biggest misconceptions about the
p-value. Another thing is that having a good p-value (or low p-value) does not mean that your
conclusion is correct. It only tells you how strong your evidence is to reject the null
hypothesis. Also, always bear in mind that you do not accept a null hypothesis. You either
reject it or fail to reject it. This is what we are doing in hypothesis testing: we are
gathering evidence to reject the null hypothesis.
1. Select the level of significance (𝛼). This is the cutoff value for p, and you set
this before doing the hypothesis testing. The most commonly used levels of
significance are 0.01, 0.05 and 0.10.
2. Compute the p-value.
3. Compare the p-value with the significance level (𝛼) and draw a relevant
conclusion. If the p-value is less than or equal to the significance level 𝛼, then
the evidence is sufficient to reject the null hypothesis.
Interpretation
p-value Interpretation
Less than .01 Highly statistically significant
There is very strong evidence against H0
.01 to .05 Statistically significant
Adequate evidence against H0
Greater than .05 Insufficient evidence against H0
Adapted from Statistics & Probability by R. Belecina et al, page 259
Decision Rule:
➢ Reject the null hypothesis when the p-value is equal to or smaller than alpha 𝛼.
(Reject H0 if p ≤ 𝛼)
➢ Do not reject the null hypothesis when the p-value is larger than alpha 𝛼.
(Do not reject H0 if p > 𝛼)
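The decision rule and the interpretation table above can be written as two small Python functions. This is a minimal sketch; the function names `decide_by_p` and `interpret` are illustrative assumptions, while the cutoffs and wording follow the rule and table in this lesson.

```python
def decide_by_p(p_value, alpha):
    """Reject H0 if p <= alpha; otherwise do not reject H0."""
    return "Reject H0" if p_value <= alpha else "Do not reject H0"


def interpret(p_value):
    """Interpretation bands from the table adapted from Belecina et al."""
    if p_value < 0.01:
        return "Highly statistically significant"
    if p_value <= 0.05:
        return "Statistically significant"
    return "Insufficient evidence against H0"


print(decide_by_p(0.0066, 0.05))   # Reject H0
print(interpret(0.0066))           # Highly statistically significant
print(decide_by_p(0.077, 0.05))    # Do not reject H0
print(interpret(0.077))            # Insufficient evidence against H0
```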
Example:
The owner of a company that sells a particular powdered juice claims that the average
weight content of their product is 100 g with a standard deviation of 5 g. However, a
group of students wants to test the claim, for they believe that it is less than 100 g. So, they
get a sample of 50 packs of such powdered juice, measure the weight content, and then
find the mean weight to be 99 g. Is the claim of the company owner true?
Solution:
Steps and Solution
1. Formulate the null hypothesis and the alternative hypothesis.
H0: µ = 100 g (or H0: µ ≥ 100 g)
Ha: µ < 100 g
2. Statistical Test
• Choose a significance level (𝛼): α = .05
• Is the test one-tailed or two-tailed? one-tailed
• What is the appropriate test statistic? z-test (note that σ is given)
3. Compute for the test statistic and the p-value.
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{99 - 100}{5 / \sqrt{50}} = \frac{-1}{5 / 7.0711} = \frac{-1}{0.7071} = -1.41$$
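The remaining part of step 3, the p-value, is not shown above. The following Python sketch is one way to obtain it, assuming a left-tailed test at α = .05 as set in step 2. The helper `normal_cdf`, built on the standard `math.erfc`, is an illustrative assumption and not the module's own method (the module uses the normal curve table).

```python
import math


def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


z = (99 - 100) / (5 / math.sqrt(50))
p_value = normal_cdf(z)          # left-tailed test: area to the left of z
alpha = 0.05

decision = "Reject H0" if p_value <= alpha else "Do not reject H0"
print(round(z, 2))          # -1.41
print(round(p_value, 2))    # 0.08
print(decision)             # Do not reject H0
```

Under these assumptions the p-value is larger than α, so the sample does not give sufficient evidence that the mean weight is less than 100 g.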
Activity: Rejected or Not?
Directions: Complete the table by filling out the missing values. Then, draw a decision about
the population mean based on the test statistic value and the probability value. (Assume
that there is only one variable and that all the assumptions are met.)
No.   Significance Level   Test Statistic (one-tailed)   p-value   Decision (Reject the null hypothesis or fail to reject the null hypothesis)
1     𝛼 = .05              z = 1.35
2     𝛼 = .10              z = -2.28
3     𝛼 = .01              z = -1.17
4     𝛼 = .05              z = 1.96
5     𝛼 = .05              z = 2.54
6     𝛼 = .01              t = 1.345; n = 15
7     𝛼 = .10              t = -1.19; n = 5
8     𝛼 = .05              t = 2.756; n = 30
9     𝛼 = .01              t = 3.25; n = 10
10    𝛼 = .01              t = -1.059; n = 25
Directions:
1. Compute the test statistic using the appropriate statistical test. (Write the test
statistic in three-digit form.)
2. Find the p-value.
3. Using the selected significance level, decide whether to reject the null hypothesis.
MODULE 5
This module was designed and written with you in mind. It is here to help you learn about
Hypothesis testing. The scope of this module permits it to be used in many different learning
situations. The language used recognizes the diverse vocabulary level of learners. The lessons
are arranged to follow the standard sequence of the course.
The module is divided into 2 lessons, namely:
Hypotheses or claims about a population mean or a population proportion can be tested
using the five-step hypothesis testing procedure. There are certain situations when the data
to be analyzed involve a population proportion or percentage.
The maker of the iPhone 12 Pro claims that their cellphone has a 2,185 mAh
battery with a standard deviation of 60. Forty-five (45) of these cellphones
showed a mean of 2,160 mAh. Does this support the company’s claim?
Use the 95% confidence level.
Answer
Using the five-step hypothesis testing procedure:
1. Null Hypothesis (H0) and Alternative Hypothesis (Ha)
2. Statistical Test
Since n = 45 (a large sample), use the z-test.
Because the hypotheses use the equal/not-equal signs, the test is two-tailed.
Confidence Level = 95%, α=0.05
Z-Critical = ±1.96
3. Computation
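A sketch of the computation step in Python follows. It assumes the hypotheses are H0: µ = 2,185 and Ha: µ ≠ 2,185, consistent with the two-tailed z-test and the critical value ±1.96 chosen in step 2; the variable names are illustrative.

```python
import math

sample_mean = 2160    # mean of the 45 sampled phones, in mAh
claimed_mean = 2185   # value under H0 (assumed: H0 is mu = 2185)
sd = 60
n = 45
critical = 1.96       # two-tailed, alpha = 0.05

z = (sample_mean - claimed_mean) / (sd / math.sqrt(n))

# Two-tailed rule: reject H0 when |z| exceeds the critical value.
decision = "Reject H0" if abs(z) > critical else "Do not reject H0"
print(round(z, 3))   # -2.795
print(decision)      # Reject H0
```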
In the example above, AITF may initially believe that 50% of the patients are
female. Suppose they gather enough data. Out of 100 records, 56 are female patients.
Would this support their initial belief?
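To test a claim about a population proportion, we use the z-test for Population
Proportion. The formula below is used.
$$z = \frac{\hat{p} - p}{\sqrt{pq / n}}$$
Where:
p̂ = sample proportion
p = hypothesized population proportion
q = 1 − p
n = sample size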
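Applied to the AITF example (56 female patients out of 100 records, against an initial belief of 50%), the formula can be sketched in Python as below. The function name `proportion_z` is an assumption, and so is the choice of a two-tailed test at α = 0.05 with critical value ±1.96, since the example does not state a level of significance.

```python
import math


def proportion_z(successes, n, p):
    """z = (p_hat - p) / sqrt(p * q / n), with q = 1 - p."""
    p_hat = successes / n
    q = 1 - p
    return (p_hat - p) / math.sqrt(p * q / n)


z = proportion_z(56, 100, 0.50)
critical = 1.96   # assumed: two-tailed test at alpha = 0.05

decision = "Reject H0" if abs(z) > critical else "Do not reject H0"
print(round(z, 2))   # 1.2
print(decision)      # Do not reject H0
```

Under these assumptions the data are consistent with the initial belief that 50% of the patients are female.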
Answer: 1.
Determine the decision for each of the following given.
Write R if Rejected, DNR if Do not Reject, the Null hypothesis.
1. Z-computed = 2.25 Z-critical = 2.87
2. Z-computed = 1.95 Z-critical = 2.50
3. Z-computed = 0.89 Z-critical = 0.89
4. Z-computed = 1.00 Z-critical = 3.00
5. T-computed = 0.27 T-critical = 3.00
6. T-computed = 1.56 T-critical = 1.97
7. T-computed = 2.34 T-critical = 2.43
8. T-computed = 1.23 T-critical = 2.13
9. Z-computed = 0.12 Confidence level = 90%, one-tailed
10. Z-computed = 1.97 Confidence level = 95%, two-tailed
11. Z-computed = 2.22 α = 0.01, one-tailed
12. T-computed = 1.11 Confidence level = 95%, two-tailed, n = 18
13. T-computed = 1.67 α = 0.1, one-tailed, n = 20
14. T-computed = 1.67 α = 0.1, two-tailed, n = 20
15. T-computed = 2.50 α = 0.05, one-tailed, n= 1
Answer the given questions.
1. In a recent survey, a researcher claims that the average life of a dog in a certain
country is 10 years. Is the claim correct if a random sample of 30 deaths from this
country showed a mean of 13 years with a standard deviation of 1.2 years? Use the 95%
confidence level.
2. Ms. Pelaez, an English teacher, believes that less than 15% of the students like
English. If 20 out of 55 randomly selected students like English, is the teacher’s claim valid? Use
the 95% confidence level.
MODULE 6
This module was designed and written with you in mind. It is here to help you master the
nature of Statistics and Probability. The scope of this module permits it to be used in many
different learning situations. The language used recognizes the diverse vocabulary level of
students. The lessons are arranged to follow the standard sequence of the course. But the
order in which you read them can be changed to correspond with the textbook you are
now using. The module consists of one lesson which contains sub lessons:
• Lesson 6 – Comparing Sample Proportion and Population Proportion
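To compare a sample proportion and a population proportion, we use the z-test for a one-
sample proportion. The test statistic for this test is
$$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}}$$
where p̂ is the sample proportion, p0 is the hypothesized population proportion, and n is the sample size.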
Example 3
It has been claimed that less than 60% of all purchasers of a certain kind of computer
program will call the manufacturer’s hotline within one month of purchase. If 55 out of 100
software purchasers selected at random call the hotline within a month of purchase, test
the claim at the 0.05 level of significance.
Solution
Ha: The proportion of purchasers that will call the manufacturer’s
hotline within one month of purchase is less than 60% or 0.60
(p < 0.60)
$$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}}$$
$$z = \frac{0.55 - 0.60}{\sqrt{\frac{0.60(1 - 0.60)}{100}}}$$
$$z = \frac{-0.05}{\sqrt{\frac{0.60(0.40)}{100}}}$$
Step 4: Decision:
Fail to reject the null hypothesis because the computed value (the
test value) falls outside the rejection region.
Step 5: Conclusion:
There is not sufficient evidence to conclude that the proportion of purchasers
that will call the manufacturer’s hotline within one month of purchase is less than
60%. Thus, the data do not support the claim.
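The test value in Example 3 can be finished numerically with the Python sketch below. It assumes a left-tailed test at the 0.05 level with critical value -1.645, which matches Ha: p < 0.60; the variable names are illustrative.

```python
import math

p_hat = 55 / 100      # sample proportion
p0 = 0.60             # hypothesized population proportion
n = 100
critical = -1.645     # assumed: left-tailed, alpha = 0.05

z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Left-tailed rule: reject H0 only when z falls below the critical value.
decision = "Reject H0" if z < critical else "Fail to reject H0"
print(round(z, 2))   # -1.02
print(decision)      # Fail to reject H0
```

Since -1.02 is not below -1.645, the test value falls outside the rejection region, which agrees with the decision in Step 4.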
Solve Me Please!
A doctor claims that only 10% of all patients exposed to a certain amount of
radiation will feel ill effects. If in a random sample, 5 of 18 patients exposed to
such radiation feel some ill effects, test the doctor’s claim at 0.01 level of
significance.
1. Formulate the null and alternative hypotheses
Ho:
Ha:
2. Type of test:
3. Compute the test value.
4. Decision:
5. Conclusion:
What’s My Decision?
Directions: Find the critical value, identify the type of test, draw the rejection region, compute the
value of the test statistic, and decide whether to reject or fail to reject
the null hypothesis in each of the following situations.
Directions: Choose the letter of the correct answer. Write your answer on a
separate sheet of paper.
1. Is a numerical quantity that is assigned to the outcome of an experiment.
A. Random variable C. Sample space
B. Sample point D. Variable
2. In how many ways can two coins fall?
A. 2 B. 4 C. 6 D. 8
3. It tells the distance of score from the mean measured in standard deviation
units.
A. normal curve C. z-score
B. sample mean D. area
4. Which of the following shows the probability that the z-score lies above a z-score
value?
A. P(a < z < b) B. P(z > a) C. P(z < a) D. P(a = z)
6. Statement 1: The number of students who are present in Filemon T. Lizan SHS
for the first day of class for the S.Y. 2020-2021
Statement 2: The number of Mayors in NCR who are present during the meeting
Which of the following is CORRECT?
A. Both statements are Discrete Random Variables
B. Both statements are Continuous Random Variables
C. Statement 1 is a Discrete Random Variable while Statement 2 is a Continuous Random Variable
D. Statement 1 is a Continuous Random Variable while Statement 2 is a Continuous Random Variable
7. Statement 1: the volume of soft drinks in a 12-ounce can
Statement 2: the time required to perform a job.
which of the following is CORRECT?
A. Both statements are Discrete Random Variables
B. Both statements are Continuous Random Variables
C. Statement 1 is a Discrete Random Variable while Statement 2 is a Continuous Random Variable
D. Statement 1 is a Continuous Random Variable while Statement 2 is a Continuous Random Variable
8. Let B be the number of boys and G the number of girls in a family of four children.
Determine the values of the random variable B.
A. 0, 1 B. 0, 1, 2 C. 0, 1, 2, 3 D. 0, 1, 2, 3, 4
For numbers 9 – 10. Consider the probability distribution of the number of
mangoes given below.
R 3 2 1 0
P(R) 1/8 3/8 3/8 1/8
9. Find P(R = 3)
A. 1/8 B. 5/8 C. 3/8 D. 1
10. Find P(R > 1)
A. 1/8 B. 3/8 C. 1/2 D. 1
References
Ocampo, J. M., & Marquez, W. G. (2006). Conceptual Math & Beyond. Brilliant Creations
Publishing, Inc.
Gabuyo, Y. A., & Cardenas, M. C. (2016). Statistics and Probability. The Inteligente Publishing,
Inc.
Belecina, R. R., Baccay, E. S., & Mateo, E. B. (2016). Statistics and Probability. Quezon City,
QC: Rex Book Store, Inc.