Statistics and Probability

The document is a comprehensive guide on Statistics and Probability, structured into modules for educational purposes. It includes various topics such as random variables, discrete probability distributions, and assessments for students to test their understanding. The content is designed to cater to different learning situations and provides examples and exercises to reinforce the concepts taught.
FE DEL MUNDO NHS

B. Del Mundo, Mansalay, Oriental Mindoro



Statistics and
Probability
Table of Contents
What I Know
Quarter I
Module 1
Module 2
Module 3
Module 4
Module 5
Module 6
Module 7
Module 8
Module 9
Module 10
Quarter II
Module 1
Module 2
Module 3
Module 4
Module 5
Module 6
Assessment
References
Directions: Choose the letter of the correct answer. Write your answer on a
separate sheet of paper.
Quarter I
1. It is a numerical quantity that is assigned to the outcome of an experiment.
A. Random variable C. Sample space
B. Sample point D. Variable
2. In how many ways can two coins fall?
A. 2 C. 6
B. 4 D. 8
3. It tells the distance of a score from the mean, measured in standard deviation
units.
A. normal curve C. z-score
B. sample mean D. area
4. Which of the following shows the probability that the z-score lies above a z-score
value?
A. P(a < z < b) C. P(z < a)
B. P(z > a) D. P(a = z)
5. What is the proportion of the area to the right of z = -1?
A. -0.3413 C. 0.3413
B. -0.8413 D. 0.8413

6. Statement 1: The number of students who are present in Filemon T. Lizan SHS
on the first day of class for S.Y. 2020-2021
Statement 2: The number of mayors in NCR who are present during the meeting
Which of the following is CORRECT?

A. Both statements are Discrete Random Variables.
B. Both statements are Continuous Random Variables.
C. Statement 1 is a Discrete Random Variable while Statement 2 is a Continuous Random Variable.
D. Statement 1 is a Continuous Random Variable while Statement 2 is a Continuous Random Variable.

7. Statement 1: The volume of soft drinks in a 12-ounce can
Statement 2: The time required to perform a job
Which of the following is CORRECT?

A. Both statements are Discrete Random Variables.
B. Both statements are Continuous Random Variables.
C. Statement 1 is a Discrete Random Variable while Statement 2 is a Continuous Random Variable.
D. Statement 1 is a Continuous Random Variable while Statement 2 is a Continuous Random Variable.

8. Let B be the number of boys and G the number of girls in a family of four children.
Determine the values of the random variable B.

A. 0, 1 C. 0, 1, 2, 3
B. 0, 1, 2 D. 0, 1, 2, 3, 4

For numbers 9 – 10. Consider the probability distribution of the number of
mangoes given below.
R 3 2 1 0
P(R) 1/8 3/8 3/8 1/8
9. Find P(R = 3)
A. 1/8 C. 3/8
B. 5/8 D. 1

10. Find P(R > 1)
A. 1/8 C. 1/2
B. 3/8 D. 1

Quarter II
11. Random samples of size n = 3 are drawn from a finite population consisting of
the numbers 14, 25, 36, 47, 58 and 69. How many possible samples are there?

A. 12 C. 20
B. 16 D. 24

12. The random samples of size 3 are taken from a population of the numbers 1, 2,
3, 4, 5, 6, and 7. How many samples are there?
A. 35 C. 210
B. 120 D. 350

13. The random samples of size 4 are taken from a population of the numbers 1, 2,
3, 4, 5, 6, 7, and 8. How many samples are there?
A. 70 C. 1 680
B. 840 D. 3 024

14. Random samples of size n = 5 are drawn from a finite population consisting of
the numbers 15, 16, 17, 18, and 19. How many possible samples are there?
A. 1 C. 4
B. 2 D. 8

15. The following are the weights of five students in kg. Suppose samples of size 2
are taken from this population of five students.

Student Weight (in kg.)


Rusty 55
Buchoy 38
Boyong 60
Jenny 45
Kathrina 75

How many samples are possible?


A. 8 C. 12
B. 10 D. 14

16. A random sample of size 4 is taken with replacement from a population with
$\mu = 12$ and $\sigma^2 = 8$. Find the variance of the sample mean ($\sigma^2_{\bar{x}}$).
A. 2.5 C. 1
B. 2 D. 1.5

17. A random sample of size 25 is taken with replacement from a population with $\mu = 121.4$
and $\sigma^2 = 50.5$. Find the mean of the sample mean, $\mu_{\bar{x}}$.
A. 121.4 C. 122.5
B. 121.5 D. 122

18. Which of the following is stated in the Central Limit Theorem?

A. The sampling distribution of the sample means approaches a normal
distribution as the sample size decreases, no matter what the shape of
the distribution is.
B. The sampling distribution of the sample means approaches a normal
distribution as the sample size increases, no matter what the shape of
the distribution is.
C. The sampling distribution of the sample means deviates from a normal
distribution as the sample size decreases, no matter what the shape of
the distribution is.
D. The sampling distribution of the sample means deviates from a normal
distribution as the sample size increases, no matter what the shape of
the distribution is.

19. Why is it important to sample not more than 10% of the population when the
sample is drawn without replacement?

A. To reduce the effort when gathering data from the sample
B. To lessen the expenses that may occur during the conduct of research
C. To minimize the chance of creating a significant change when removing
each item in the observation
D. To clearly see the behavior of the sample and then create a significant
change from the population

20. The independence condition for the Central Limit Theorem is assumed to be
met when _____.
A. the sample is biased
B. the sample is randomly selected
C. the sample is drawn with replacement
D. the sample is drawn without replacement

Quarter I
MODULE 1

This module was designed and written with you in mind. It is here to help you
master the nature of Statistics and Probability. The scope of this module permits it
to be used in many different learning situations. The language used recognizes the
diverse vocabulary level of students. The lessons are arranged to follow the standard
sequence of the course. But the order in which you read them can be changed to
correspond with the textbook you are now using.
The module consists of the following lessons:

– Understanding Random Variables
– Discrete Probability Distributions

After going through this module, you are expected to:

1. illustrate a random variable (discrete and continuous).
2. distinguish between a discrete and a continuous random variable.
3. find the possible values of a random variable.
4. illustrate a probability distribution for a discrete random variable and its
properties.

Lesson 1: Random Variables and Probability Distributions

The concept of a probability distribution is very important in the statistical analysis
of data. This is especially true when we try to estimate the true value of a variable
using sample data. Moreover, we also use probability distributions in hypothesis
testing. We have studied probability in the previous grade levels. We have also
studied frequency distributions in statistics. In this lesson you will learn to illustrate
a random variable (discrete and continuous), distinguish between a discrete and a
continuous random variable, find the possible values of a random variable, and
illustrate a probability distribution for a discrete random variable and its properties.

RANDOM VARIABLE
- a variable that assumes numerical values associated with the outcomes of a
random process or experiment.
OTHER DEFINITION OF TERMS
Experiment - any activity which can be done repeatedly under similar conditions.
Sample Space - the set of all possible outcomes of an experiment.
Event - a subset of a sample space.
Sample Point - an element of a sample space.
Probability - the ratio of the number of favorable outcomes to the total number of
possible outcomes, when all outcomes are equally likely.
A random variable may be classified as discrete or continuous.

Discrete Random Variable - one that can assume only a countable number of
values.
Continuous Random Variable - one that can assume an infinite number of values in one or
more intervals.
Examples:
A. Classify each of the following as a Discrete Random Variable or a Continuous Random
Variable.

1. number of pencils in a box: Discrete Random Variable
2. number of defective flashlights: Discrete Random Variable
3. voltage of radio batteries: Continuous Random Variable
4. amount of antibiotic in a vial: Continuous Random Variable
5. number of soldiers in a troop: Discrete Random Variable
6. length of wire ropes: Continuous Random Variable

Let’s Apply This to Problem Solving

Example 1

Suppose two coins are tossed and we are interested in the number of
heads that will come out. Let H represent the number of heads that will
come out. Determine the values of the random variable H.

Let’s follow the steps in solving this problem.

Step 1. List the sample space of the experiment.

S = { HH, HT, TH, TT }

Step 2. Count the number of heads in each outcome and assign this number to this
outcome.

Outcome    Number of Heads (Value of H)
HH         2
HT         1
TH         1
TT         0

The values of the random variable H (number of heads) in this experiment are 0,
1, and 2.
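The two steps above can be sketched in Python. This is only an illustration, not part of the module's required method; the names `sample_space`, `heads_per_outcome`, and `values_of_H` are chosen here for the example.

```python
from itertools import product

# Step 1: list the sample space for tossing two coins (every ordered pair of H/T).
sample_space = ["".join(outcome) for outcome in product("HT", repeat=2)]

# Step 2: assign to each outcome the number of heads it contains.
heads_per_outcome = {outcome: outcome.count("H") for outcome in sample_space}

# The possible values of the random variable H are the distinct counts.
values_of_H = sorted(set(heads_per_outcome.values()))

print(sample_space)       # ['HH', 'HT', 'TH', 'TT']
print(heads_per_outcome)  # {'HH': 2, 'HT': 1, 'TH': 1, 'TT': 0}
print(values_of_H)        # [0, 1, 2]
```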

Example 2
A basket contains 10 ripe and 4 unripe mangoes. If three mangoes are taken from
the basket one after the other, determine the possible values of the random variable
R representing the number of ripe mangoes.

Solution: Let R represent a ripe mango and U represent an unripe mango.

Step 1. List the sample space of the experiment.


S = {RRR, RRU, RUR, URR, UUR, URU, RUU, UUU}

Step 2. Count the number of ripe mangoes (R) in each outcome and assign this
number to this outcome.

Outcome    Number of Ripe Mangoes (Value of R)
RRR        3
RRU        2
RUR        2
URR        2
UUR        1
URU        1
RUU        1
UUU        0

The values of the random variable R (number of ripe mangoes) in this experiment are
0, 1, 2, and 3.

Different Presentations of a Discrete Probability Distribution

Probability distribution of a discrete random variable: a correspondence that
assigns probabilities to the values of a random variable. The probability distribution
of a discrete random variable is also called the probability mass function.

The probability distribution of a discrete random variable can be shown graphically
by constructing a histogram. The graph is called a probability histogram.

Probability Distribution of the Number of Ripe Mangoes

R       3      2      1      0
P(R)    1/8    3/8    3/8    1/8

For any discrete random variable X, the following are true.


● 0 ≤ P(X) ≤ 1, for each value of X
● ΣP(X) = 1

So, if we add the P(R) values, the sum is equal to 1:

ΣP(R) = 1/8 + 3/8 + 3/8 + 1/8 = 8/8 = 1

Therefore, it is a probability distribution of a discrete random variable, which is
sometimes called a probability mass function.

The bar graph shows the relationship between R, the value of the random variable,
and P(R), the probability of that number of ripe mangoes.
If we continue the process…

Step 3. Construct the frequency distribution of the values of the random variable R.

Number of Ripe Mangoes    Number of Occurrences
(Value of R)              (frequency)
3                         1
2                         3
1                         3
0                         1
Total                     8

Step 4. Construct the probability distribution of the random variable R by getting
the probability of occurrence of each value of the random variable. Divide each
frequency by the total number of outcomes, treating the 8 outcomes as equally likely.

Number of Ripe Mangoes    Number of Occurrences    Probability
(Value of R)              (frequency)              P(R)
3                         1                        1/8
2                         3                        3/8
1                         3                        3/8
0                         1                        1/8
Total                     8                        1

The probability distribution of the random variable R can be written as follows:

R       3      2      1      0
P(R)    1/8    3/8    3/8    1/8
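Steps 3 and 4 can be sketched in Python as a frequency count followed by a division. This is a minimal illustration that assumes, as the worked example does, that each of the 8 listed outcomes is counted as equally likely; the variable names are chosen here for the example.

```python
from collections import Counter
from fractions import Fraction

# Sample space from Step 1 (R = ripe, U = unripe).
sample_space = ["RRR", "RRU", "RUR", "URR", "UUR", "URU", "RUU", "UUU"]

# Step 3: frequency of each value of R (number of ripe mangoes per outcome).
frequency = Counter(outcome.count("R") for outcome in sample_space)

# Step 4: probability = frequency / total number of outcomes
# (assumption: all listed outcomes are equally likely).
total = len(sample_space)
distribution = {r: Fraction(frequency[r], total) for r in sorted(frequency, reverse=True)}

for r, p in distribution.items():
    print(f"R = {r}: frequency = {frequency[r]}, P(R) = {p}")
```

Using `Fraction` keeps the probabilities exact, so they can be compared directly with the table.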
Properties of a Discrete Probability Distribution
Examine the probability distribution that we have learned in the given example.
What have you noticed about the probability values of the random variable in each
probability distribution?
What is the sum of the probabilities of a random variable?
Consider the probability distribution of the number of ripe mangoes given below.

R 3 2 1 0
P(R) 1/8 3/8 3/8 1/8

Solve the following:

1. P(R = 2)
In words: the probability that R is exactly 2.
Solution: Read the probability of the value 2 from the table.
Therefore, the answer is 3/8.
2. P(R = 3)
In words: the probability that R is exactly 3.
Solution: Read the probability of the value 3 from the table.
Therefore, the answer is 1/8.
3. P(R > 1)
In words: the probability that R is greater than 1.
There are two possible values of R greater than 1. These are 2 and 3.
P(R > 1) = P(2) + P(3)
= 3/8 + 1/8
= 4/8 or 1/2
Note: Simplify the answer if possible.
4. P(R < 3)
In words: the probability that R is less than 3.
There are three possible values of R less than 3. These are 2, 1 and 0.
P(R < 3) = P(2) + P(1) + P(0)
= 3/8 + 3/8 + 1/8
= 7/8
5. ΣP(R)
To find ΣP(R) we need to find the sum of all the probability values.
ΣP(R) = P(3) + P(2) + P(1) + P(0)
= 1/8 + 3/8 + 3/8 + 1/8
= 8/8 or 1
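The five computations above all follow one pattern: add the probabilities of the values that satisfy a condition. A small Python sketch of that pattern follows; the helper name `prob` is hypothetical and introduced only for this illustration.

```python
from fractions import Fraction

# Probability distribution of R from the table above.
P = {3: Fraction(1, 8), 2: Fraction(3, 8), 1: Fraction(3, 8), 0: Fraction(1, 8)}

def prob(event):
    """Add P(R = r) over every value r for which event(r) is true."""
    return sum((p for r, p in P.items() if event(r)), Fraction(0))

p_eq_2 = prob(lambda r: r == 2)   # P(R = 2)
p_eq_3 = prob(lambda r: r == 3)   # P(R = 3)
p_gt_1 = prob(lambda r: r > 1)    # P(R > 1)
p_lt_3 = prob(lambda r: r < 3)    # P(R < 3)
total = prob(lambda r: True)      # sum of all probabilities

print(p_eq_2, p_eq_3, p_gt_1, p_lt_3, total)  # 3/8 1/8 1/2 7/8 1
```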

Classify Me Please!

Directions: Classify the following as discrete random variable (DRV) or continuous
random variable (CRV). Write your answers on a separate sheet of paper.

Statement
1. the number of senators present in the meeting
2. the weight of newborn babies for the month of June
3. the number of ballpens in the box
4. the capacity of electrical resistors
5. the amount of salt needed to bake a loaf of bread

6. the capacity of an auditorium
7. the number of households with television
8. the height of mango tree in a farm
9. the area of lots in a subdivision
10. the number of students who joined the fieldtrip
11. the number of children in a family
12. the number of tails flipped in 4 trials
13. the time required to perform a job
14. the amount of sugar in a pineapple juice
15. the volume of mango juice in a 12 – ounce can
16. the Saturday night attendance at the prayer meeting
17. the number of patients of Dr. Naval in his clinic for three weeks
18. the time taken to complete an examination in Statistics and Probability
19. the interest rate given by the BDO bank
20. the weight of a fish

Analyze Me Please!

Directions: Determine the values of the random variable in each of the following
situations. Write your answers on a separate sheet of paper.
1. A coin is flipped four times. Let T be the number of tails that come out. Determine
the values of the random variable T.
a. List the sample space of the experiment.
S = { _________________________________}
b. Count the number of tails (T) in each outcome and assign this number to this
outcome.
Outcome Number of Tails
(Value of T)

c. The values of the random variable are_________________________


2. A box contains 4 green and 2 blue dice. Three dice are chosen one after the other.
Determine the values of the random variable G representing the number of green
dice.
a. List the sample space of the experiment.
S = { _________________________________}

b. Count the number of green dice (G) in each outcome and assign this number to
this outcome.
Outcome Number of Green Dice
(Value of G)

c. The values of the random variable are_________________________

3. A meeting of consuls was attended by 4 Americans and 2 Germans. If three consuls
were selected at random, construct the probability distribution for the random
variable G representing the number of Germans.

a. List the sample space of the experiment.
S = { _________________________________}
b. Count the number of Germans (G) in each outcome and assign this number to
this outcome.
Outcome Number of Germans
(Value of G)

c. The values of the random variable are_________________________

MODULE 2

This module was designed and written with you in mind. It is here to help you master
the random variable and probability distributions. The scope of this module permits it
to be used in many different learning situations. The language used recognizes the
diverse vocabulary level of students. The lessons are arranged to follow the standard
sequence of the course. But the order in which you read them can be changed to
correspond with the textbook you are now using.
The module is divided into two lessons, namely:

• Lesson 1 – Constructing Probability Distribution
• Lesson 2 – Mean, Variance, and Standard Deviation of a Discrete Random
Variable

After going through this module, you are expected to:

1. compute probabilities corresponding to a given random variable.
2. illustrate the mean and variance of a discrete random variable.
3. calculate the mean and the variance of a discrete random variable.

Lesson 2.1: Constructing Probability Distributions

In your previous study of mathematics, you have learned how to find the probability
of an event. In this lesson, you will learn how to construct a probability distribution
of a discrete random variable. Your knowledge of getting the probability of an event
is very important in understanding the present lesson. To find out if you are ready
to learn this new lesson, do the following activities.

Discrete Probability Distribution Defined

A discrete probability distribution is a table showing all the possible values of
a discrete random variable together with their corresponding probabilities.
The mean of a discrete random variable x is also called the expected value
of x. It is the weighted average of all the values that the random variable x
would assume in the long run. The discrete random variable x assumes values
or outcomes in every trial of an experiment with their corresponding
probabilities. The expected value of x is the average of the outcomes that is
likely to be obtained if the trials are repeated many times. The expected value of x is
denoted by E(x).

A probability distribution describes the probability of each specific value of a random
variable. The probability distribution of a discrete random variable is called a probability mass
function (pmf). The pmf of X is denoted by $f(x)$ and satisfies the following two basic
properties:

1. $f(x) > 0$ for every $x$ in the support $S$
2. $\sum_{x \in S} f(x) = 1$

According to the first property, for every element x in the support S (in other words,
the set of possible values of X), the probability must be positive. According to the second
property, the sum of the probabilities over all possible x values in the support S must
be equal to 1. The values of the discrete random variable X where $f(x) > 0$ are called its
mass points.

Example 1. NUMBER OF TAILS
Suppose three coins are tossed. Let Y be the random variable representing the
number of tails that occur. Find the probability of each of the values of the random
variable Y.
Solution:

Step 1. Determine the sample space. Let H represent head and T represent tail.
The sample space for this experiment is: S = {TTT, TTH, THT, HTT, HHT, HTH, THH, HHH}

Step 2. Count the number of tails in each outcome in the sample space and
assign this number to this outcome.

Possible Outcome    Value of the Random Variable Y (No. of Tails)
TTT                 3
TTH                 2
THT                 2
HTT                 2
HHT                 1
HTH                 1
THH                 1
HHH                 0

Step 3. There are four possible values of the random variable Y, representing the
number of tails. These are 0, 1, 2 and 3. Assign probability values P(Y) to
each value of the random variable.
- There are 8 possible outcomes, and no tail occurs once, so the probability that
we shall assign to the random variable 0 is 1/8.
- There are 8 possible outcomes, and 1 tail occurs three times, so the
probability that we shall assign to the random variable 1 is 3/8.
- There are 8 possible outcomes, and 2 tails occur three times, so the
probability that we shall assign to the random variable 2 is 3/8.
- There are 8 possible outcomes, and 3 tails occur once, so the probability that we shall
assign to the random variable 3 is 1/8.

Number of Tails (Y)    Probability P(Y)
0                      1/8
1                      3/8
2                      3/8
3                      1/8
Table 1.1. The Probability Distribution or the Probability Mass Function of
Discrete Random Variable Y.

No. of Tails (Y)      0      1      2      3
Probability P(Y)      1/8    3/8    3/8    1/8

The sum of the probabilities is ΣP(Y) = 1, so this is a valid probability distribution of a discrete random variable.
Example 2. NUMBER OF BLUE BALLS
Two balls are drawn in succession without replacement from an urn containing 5
red balls and 6 blue balls. Let Z be the random variable representing the number
of blue balls. Construct the probability distribution of the random variable Z.
Solution:

Step 1. Determine the sample space. Let B represent the blue ball and R represent
the red ball.
The sample space for this experiment is: S = {RR, RB, BR, BB}

Step 2. Count the number of blue balls in each outcome in the sample space and
assign this number to this outcome.

Possible Outcome    Value of the Random Variable Z (No. of Blue Balls)
RR                  0
RB                  1
BR                  1
BB                  2

Step 3. There are three possible values of the random variable Z, representing the
number of blue balls. These are 0, 1, and 2. Assign probability values P(Z) to each
value of the random variable, counting each of the 4 listed outcomes as equally likely.
- There are 4 possible outcomes, and no blue ball occurs once, so the
probability that we shall assign to the random variable 0 is 1/4.
- There are 4 possible outcomes, and 1 blue ball occurs two times, so the
probability that we shall assign to the random variable 1 is 2/4 or 1/2.
- There are 4 possible outcomes, and 2 blue balls occur once, so the
probability that we shall assign to the random variable 2 is 1/4.

Number of Blue Balls (Z)    Probability P(Z)
0                           1/4
1                           1/2
2                           1/4
Table 1.2. The Probability Distribution or the Probability Mass Function of Discrete
Random Variable Z.

No. of Blue Balls (Z)    0      1      2
Probability P(Z)         1/4    1/2    1/4

The sum of the probabilities is ΣP(Z) = 1. This is a valid probability distribution of a
discrete random variable because the sum of the probabilities is equal to 1.
Example 3. Determine whether the given values can serve as the values of a
probability distribution of the random variable x that can take on only the values
1, 2, 3, 4: P(1) = 1/19, P(2) = 10/19, P(3) = 5/19, P(4) = 5/19.

x       1       2        3       4
P(x)    1/19    10/19    5/19    5/19

The sum of the probabilities is ΣP(x) = 21/19 ≈ 1.105. These values cannot serve as a
probability distribution because the sum of the probabilities is not equal to 1.
Example 4. Determine whether the given values can serve as the values of a
probability distribution of the random variable X: P(x) = 1/8 for x = 1, 2, 3, ..., 8.

Solution:

x       1      2      3      4      5      6      7      8
P(x)    1/8    1/8    1/8    1/8    1/8    1/8    1/8    1/8

The sum of the probabilities is ΣP(x) = 1. These values can serve as a probability
distribution because each probability is between 0 and 1 and
the sum of probabilities is equal to 1.
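The check used in Examples 3 and 4 can be sketched as a short Python function. This is an illustration only: the function name `is_valid_pmf` is hypothetical, and the Example 3 values are taken in the order given in the problem statement.

```python
from fractions import Fraction

def is_valid_pmf(probabilities):
    """Return True if every probability lies in [0, 1] and they sum to exactly 1."""
    return all(0 <= p <= 1 for p in probabilities) and sum(probabilities) == 1

# Example 3: P(1) = 1/19, P(2) = 10/19, P(3) = 5/19, P(4) = 5/19
example_3 = [Fraction(1, 19), Fraction(10, 19), Fraction(5, 19), Fraction(5, 19)]

# Example 4: P(x) = 1/8 for x = 1, 2, ..., 8
example_4 = [Fraction(1, 8)] * 8

print(sum(example_3), is_valid_pmf(example_3))  # 21/19 False
print(sum(example_4), is_valid_pmf(example_4))  # 1 True
```

Exact fractions are used instead of decimals so that the comparison with 1 is not affected by rounding.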

Lesson 2.2: Mean, Variance and Standard Deviation of a Discrete Random Variable

The mean or expected value of a discrete random variable X is computed using
the following formula:

$E(X) = \sum [x \, P(x)]$

where X = discrete random variable
x = outcome or value of the random variable
P(x) = probability of the outcome x

The variance of a random variable X is denoted by $\sigma^2$. It can likewise be written as
Var(X). The variance of a random variable is the expected value of the square of the
difference between the assumed value of the random variable and the mean. The
variance of X is:

$\mathrm{Var}(X) = \sum [(x - \mu)^2 P(x)]$ or $\sigma^2 = \sum [(x - \mu)^2 P(x)]$

where:
x = outcome, $\mu$ = population mean, P(x) = probability of the outcome
The larger the value of the variance, the farther the values of X are from the
mean. The variance is tricky to interpret since it uses the square of the unit of
measure of x. So, it is easier to interpret the value of the standard deviation because
it uses the same unit of measure as x.
The standard deviation of a discrete random variable x is written as $\sigma$. It is the square
root of the variance. The standard deviation is computed as:

$\sigma = \sqrt{\sum [(x - \mu)^2 P(x)]}$

Example 1. The variable x represents the number of college graduates in the households
of a small town. The probability distribution of x is shown below:

x       0       1       2
P(x)    0.25    0.50    0.25

Find the mean or expected value of x.

x    P(x)    xP(x)
0    0.25    0
1    0.50    0.50
2    0.25    0.50
Σ[xP(x)] = 1

E(x) = Σ[xP(x)] = 1.00
The expected value is 1. So, the average number of college graduates in the
households of the small town is one.

Example 2. A security guard recorded the number of people entering the bank every
hour during one working day. The random variable x represents the number of people
who entered the bank. The probability distribution of x is shown below.

x       0    1      2      3      4      5
P(x)    0    0.1    0.2    0.4    0.2    0.1

What is the expected number of people who enter the bank every hour?
Solution:

x    P(x)    xP(x)
0    0       0
1    0.1     0.1
2    0.2     0.4
3    0.4     1.2
4    0.2     0.8
5    0.1     0.5
ΣP(x) = 1    Σ[xP(x)] = 3

So, E(x) = 3.0
Therefore, the average number of people entering the bank every hour during that
working day is three.
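The expected-value computation in the two examples above can be sketched in Python as follows. The function name `expected_value` and the dictionary layout (value mapped to probability) are choices made for this illustration.

```python
def expected_value(pmf):
    """Return E(x) = sum of x * P(x) for a dict that maps each value x to P(x)."""
    return sum(x * p for x, p in pmf.items())

# Example 1: number of college graduates in a household.
graduates = {0: 0.25, 1: 0.50, 2: 0.25}

# Example 2: number of people entering the bank every hour.
bank_visitors = {0: 0.0, 1: 0.1, 2: 0.2, 3: 0.4, 4: 0.2, 5: 0.1}

# Rounding hides tiny floating-point errors from adding decimals.
print(round(expected_value(graduates), 2))      # 1.0
print(round(expected_value(bank_visitors), 2))  # 3.0
```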
Example 3. Determine the variance and the standard deviation of the following
probability mass function.

x       1       2       3       4       5       6
P(x)    0.15    0.25    0.30    0.15    0.10    0.05

Solution:
Steps
1. Find the expected value.
2. Subtract the expected value from each outcome. Square each difference.
3. Multiply each squared difference by the corresponding probability.
4. Sum up all the figures obtained in step 3.

x    P(x)    xP(x)    x − μ                 (x − μ)²    (x − μ)²P(x)
1    0.15    0.15     1 − 2.95 = −1.95      3.8025      0.570375
2    0.25    0.50     2 − 2.95 = −0.95      0.9025      0.225625
3    0.30    0.90     3 − 2.95 = 0.05       0.0025      0.000750
4    0.15    0.60     4 − 2.95 = 1.05       1.1025      0.165375
5    0.10    0.50     5 − 2.95 = 2.05       4.2025      0.420250
6    0.05    0.30     6 − 2.95 = 3.05       9.3025      0.465125
Σ[xP(x)] = 2.95       Σ[(x − μ)²P(x)] = 1.8475

$E(x) = \sum [x \, P(x)] = 2.95$

$\sigma^2 = \sum [(x - \mu)^2 P(x)] = 1.8475$

$\sigma = \sqrt{\sum [(x - \mu)^2 P(x)]} = \sqrt{1.8475} \approx 1.3592$, or about 1.36
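The four steps of Example 3 can be sketched in Python. This is a minimal illustration using the definition $\sigma^2 = \sum [(x - \mu)^2 P(x)]$; the variable names are chosen here for the example.

```python
import math

# Probability mass function from Example 3 (value x mapped to P(x)).
pmf = {1: 0.15, 2: 0.25, 3: 0.30, 4: 0.15, 5: 0.10, 6: 0.05}

# Step 1: expected value.
mean = sum(x * p for x, p in pmf.items())

# Steps 2 to 4: squared deviations from the mean, weighted by probability, then summed.
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

# The standard deviation is the square root of the variance.
std_dev = math.sqrt(variance)

print(round(mean, 2), round(variance, 4), round(std_dev, 2))
```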

A. Construct a probability distribution for the data. (2 points)
1. The probabilities that a surgeon operates on 3, 4, 5, 6, or 7 patients in any
one day are 0.15, 0.20, 0.25, 0.20, and 0.20, respectively.
2. The probabilities that a customer buys 2, 3, 4, 5, or 6 items in a convenience
store are 0.32, 0.12, 0.23, 0.18, and 0.15, respectively.
3. The probabilities that a student will borrow 1, 2, 3, or 4 books are
0.45, 0.30, 0.15, and 0.10, respectively.
4. The probabilities that a biased die will fall as 1, 2, 3, 4, 5, or 6 are
1/2, 1/6, 1/12, 1/12, 1/12, and 1/12, respectively.
5. The probabilities that a depositor will invest Php100,000, Php250,000, or
Php180,000 are 1/4, 1/4, and 1/4, respectively.

B. Find the expected value of each probability mass function below. (2 points)

1.
x       0       1       2       3
P(x)    0.15    0.32    0.37    0.16

2.
x       0       1       2       3
P(x)    0.17    0.33    0.36    0.14

3.
x       0       1       2       3
P(x)    0.20    0.30    0.32    0.18

C. Find the variance and standard deviation of each of the following probability
distributions. (3 points)

1.
x       0       1       2       3
P(x)    0.10    0.45    0.25    0.20

2.
x       0       1       2       3
P(x)    0.15    0.38    0.33    0.14

MODULE 3

This module was designed and written with you in mind. It is here to help you
master the random variable and probability distributions. The scope of this module
permits it to be used in many different learning situations. The language used
recognizes the diverse vocabulary level of students. The lessons are arranged to follow
the standard sequence of the course. But the order in which you read them can be
changed to correspond with the textbook you are now using.
The module deals with an understanding of:

▪ The Mean of a Discrete Probability Distribution and the Variance of a Discrete
Probability Distribution
▪ Applied Problems involving the Mean and the Variance of a Discrete
Probability Distribution

After going through this module, you are expected to:

1. interpret the mean and the variance of a discrete random variable.
2. solve problems involving the mean and the variance of a probability distribution.

Lesson 3: Finding the Mean and the Variance of a Discrete Probability Distribution

In Lessons 1 and 2, you have learned that a listing of the possible values of a
discrete random variable together with their probabilities is a probability
distribution, and that the probability distribution of a discrete random variable
is also called the probability mass function.
In this lesson, you will learn how to find the mean, the variance, and
the standard deviation of a probability distribution.

The table below shows the result of picking one of 4 numbered tiles from a jar and
returning it, repeated 15 times. If x represents the number on each tile and f
represents the number of times it was picked, your task is to answer the questions that follow.

x    1    2    3    4
f    2    4    8    1

Let us analyze and explore.
Just like a frequency distribution, a probability distribution can be described by
computing its mean and variance. This time you will explore how to compute
the mean and the variance of a discrete probability distribution.

THE MEAN OF A DISCRETE PROBABILITY DISTRIBUTION

To find the mean ($\mu$) or the expected value E(x) of a discrete probability distribution,
we use the following formula:

$\mu = E(x) = \sum [x \, P(x)]$

where: $\mu$ = mean
x = value of the random variable
P(x) = the probability value of the random variable

From the experiment we discussed on the tiles and the jar, we can let x represent the
number on the tile and obtain P(x) by dividing the number of times each tile was picked by 15.
Thus, the table becomes:

x       1       2       3       4
P(x)    2/15    4/15    8/15    1/15

Let us figure it out:

(1) Find the mean of the discrete random variable x using the table above.

Step 1: Multiply the value of x by its corresponding probability value P(x).

x    P(x)    xP(x)
1    2/15    2/15
2    4/15    8/15
3    8/15    24/15
4    1/15    4/15
Σ[xP(x)] = 38/15

Step 2: Find the mean or the expected value of the probability distribution by getting
the sum of the values under the column xP(x).

$\mu = E(x) = \sum [x \, P(x)] = \frac{38}{15} \approx 2.53$

Therefore, the mean or the expected value of the probability distribution is 2.53.
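The tile-and-jar computation can be sketched in Python, starting from the picking counts (2, 4, 8, and 1 out of 15). This is an illustration only; the names `frequency`, `pmf`, and `mean` are chosen here.

```python
from fractions import Fraction

# Number of times each tile (1 to 4) was picked in 15 draws.
frequency = {1: 2, 2: 4, 3: 8, 4: 1}
n = sum(frequency.values())  # 15 draws in total

# P(x) is each count divided by the total number of draws.
pmf = {x: Fraction(f, n) for x, f in frequency.items()}

# Step 1 and Step 2: multiply each x by P(x), then add the products.
mean = sum(x * p for x, p in pmf.items())

print(mean, round(float(mean), 2))  # 38/15 2.53
```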

THE VARIANCE AND STANDARD DEVIATION OF THE DISCRETE PROBABILITY DISTRIBUTION

To find the variance and the standard deviation of a discrete probability distribution,
we use these formulas:

$\sigma^2 = \sum [x^2 P(x)] - \mu^2$
$\sigma = \sqrt{\sum [x^2 P(x)] - \mu^2}$

where: $\mu$ = mean
x = value of the random variable
P(x) = the probability value of the random variable
$\sigma^2$ = variance
$\sigma$ = standard deviation or SD

Now, let us try to find the variance and the standard deviation of the discrete random
variable x using the same example we used.

x       1       2       3       4
P(x)    2/15    4/15    8/15    1/15

Step 1: Find the mean of the probability distribution. Prepare a table as shown
below.

x    P(x)    xP(x)
1    2/15    2/15
2    4/15    8/15
3    8/15    24/15
4    1/15    4/15
Σ[xP(x)] = 38/15

Using the formula for the mean of the probability distribution:

$\mu = E(x) = \sum [x \, P(x)] = \frac{38}{15} \approx 2.53$
Step 2: Square each value of the random variable and multiply it by the corresponding
probability value to get x²P(x). The new table below shows how to work with the
squared values of the random variable.

x    x²    P(x)    x²P(x)              xP(x)
1    1     2/15    1(2/15) = 2/15      2/15
2    4     4/15    4(4/15) = 16/15     8/15
3    9     8/15    9(8/15) = 72/15     24/15
4    16    1/15    16(1/15) = 16/15    4/15
     ∑[x² P(x)] = 106/15 ≈ 7.07        ∑[x P(x)] = 38/15 ≈ 2.53

Step 3: Find the variance and the standard deviation by applying the formulas.

$$\sigma^2 = \sum [x^2 P(x)] - \mu^2 = \frac{106}{15} - \left(\frac{38}{15}\right)^2 = \frac{146}{225} \approx 0.65$$

$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{146}{225}} \approx 0.81$$

Thus, the variance is 0.65 and the standard deviation is 0.81.

The variance and the standard deviation show how closely the values are spread around
the mean. Recall that in the experiment we discussed, a tile was picked and returned to
the jar 15 times. The computed mean tells us that, on average, the number on a picked
tile is about 2.53.
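
The three steps above can also be sketched in Python. The helper name `mean_and_variance` is hypothetical. Note that the formula subtracts the square of the mean, μ², from the sum of x²P(x).

```python
from fractions import Fraction
from math import sqrt

def mean_and_variance(distribution):
    """Return (mu, sigma^2) using sigma^2 = sum(x^2 * P(x)) - mu^2."""
    mu = sum(x * p for x, p in distribution.items())
    second_moment = sum(x**2 * p for x, p in distribution.items())
    return mu, second_moment - mu**2

# Same tile-and-jar distribution used in the worked example.
tiles = {
    1: Fraction(2, 15),
    2: Fraction(4, 15),
    3: Fraction(8, 15),
    4: Fraction(1, 15),
}

mu, variance = mean_and_variance(tiles)
std_dev = sqrt(variance)  # the standard deviation is the positive square root

print(mu, variance, round(std_dev, 2))
```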

Thinking-Out-Loud. To give you a better understanding of the mean, variance, and
standard deviation of a discrete probability distribution, try to answer the following by
completing the table.

1. What is the mean outcome if a fair die is rolled?

Step 1: Since the die is fair, then:

x      P(x)    xP(x)
1      1/6
2
3
4      1/6     4(1/6) = 4/6
5
6

(2) The random variable x, representing the number of nuts in a chocolate bar, has
the following probability distribution. Compute the mean.

x      0      1      2      3      4
P(x)   1/10   3/10   3/10   2/10   1/10

(3) The probability distribution below shows the number of typing errors (x) and the
probability P(x) of committing the errors whenever clerks type a document.
Complete the table.

𝒙 0 1 2 3 4 5
P(x) 0.02 0.22 0.42 0.31 0.10 0.04
x P(x)
𝒙𝟐
𝒙𝟐𝑷(𝒙)

Refer to the table in no. 3 to answer numbers 4 – 5.

(4) Compute the variance.

(5) Compute for the standard deviation.

Let me check what you gain from the lesson.


1. Complete the table below and find the mean or expected value of the following
probability distribution.

x       0     1     2     3     4
P(x)    1/5   1/5   1/5   1/5   1/5
xP(x)

2.

H P(H) HP(H)

0 0.06

1 0.70
2 0.20

3 0.03

4 0.01

3-4. Determine the variance and the standard deviation of the random variable.

x       1     2     3     4     5
P(x)    1/5   1/5   1/5   1/5   1/5
xP(x)

MODULE 4

This module was designed and written with you in mind. It is here to help you master the
nature of Statistics and Probability. The scope of this module permits it to be used in many
different learning situations. The language used recognizes the diverse vocabulary level of
students. The lessons are arranged to follow the standard sequence of the course. But the
order in which you read them can be changed to correspond with the textbook you are now
using.
After going through this module, you are expected to:

1. illustrate a normal random variable and its characteristics.


2. identify regions under the normal curve corresponding to different standard normal
values.
3. convert a normal random variable to a standard normal variable and vice-versa.
4. compute probabilities and percentiles using the standard normal table; and
5. apply the normal curve concepts in solving problems.

Lesson 4: Exploring the Normal Curve Distribution

In the previous module, you learned to solve problems involving the mean and variance
of a probability distribution. However, the data you used previously are samples of discrete
data. What if the data belong to a continuous type? This module will help you deal with
problems involving distributions of this type. Traditionally, we call this a normal probability
distribution or simply the normal curve.

A. PROPERTIES OF THE NORMAL CURVE

Normal distribution or normal curve represents a group of data where a very large
number of cases exists and the mean, the median and the mode are all equal. When you
sketch the graph of a normal curve, you will find the following properties:

1. The distribution curve is unimodal and bell-shaped. Unimodal means that there is
only one peak point.
2. The mean, the median, and the mode coincide at the center.

3. The curve is symmetrical about its center. This means that when you draw a vertical
line at the center of the curve, each half is a mirror image of the other half.
4. The width of the curve is based on the standard deviation of the distribution.
5. The tails of the curve approach the base line but never intersect it. The
tails get nearer and nearer to the base line but never meet it.
6. The area under the curve is 1. Thus, the normal curve is also a probability distribution.

B. EXPLORING THE STANDARD NORMAL CURVE

The normal curve is a standard normal curve when the mean µ = 0 and the standard
deviation σ = 1. This is mostly used to represent inferential statistics. You will find its area
by substituting the mean µ = 0 and the standard deviation σ = 1 in the formula that
describes a normal curve. But don’t worry! Mathematicians have already computed these
for everyone’s use.

Look at the image below. This is a graphical representation of a normal curve.

Fig 4.1. Areas under the Normal Curve

Source: Chegg Study. https://che.gg/2YDK2zM

You might be wondering why the area is considered as equal to 1 when the standard practice
is to show 99.73% of the area. Take note that .9973 is just the area between -3 and +3. In this
case, remember that the total area is not shown because the tails are asymptotic to the
horizontal line. Meaning, it just continues to approach the line but will never intersect the line.
Therefore, there is a little portion of the area at the tails of the distribution. So, when asked
about the area under a normal curve, you say 1.

Areas under the normal curve are found in the z-Table. This time, you will learn how
to use the z-Table in finding the areas under the normal curve.

Steps in Finding the Areas under the Normal Curve Given a Z-Value

1. Express the given z-value in three-digit form.
2. Locate the first two digits in the left column of the z-Table.
3. Match the third digit with the appropriate column on the right, just
as you would in a multiplication table.
4. The intersection of the row and the column is the required area or probability.

Illustrative Example 1: Find the area that corresponds to z = 0.72.

Note: The area that corresponds to z = 0.72 can also be understood as “the corresponding
area between z = 0 and z = 0.72.”

Steps:
1. Express the z-value into a three-digit form.
➢ z = 0.72 is already in three digits.
2. Locate the first two digits on the left column of the z-Table.
➢ The first two digits are 0.7. Find it in the left column.
3. In the z-table, match the third digit with the appropriate column on the right just like
in what you are doing in multiplication table.
➢ The last digit is 2. Find the column with the heading .02.
4. The intersection of the row and the column is the required area or probability.
➢ The area is 0.2642.
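
If a z-Table is not at hand, the same area can be approximated with a short Python sketch. This is an assumption-based check and not the method of the module: instead of reading a table, it uses the error function `math.erf`, and the function name `area_from_mean_to_z` is only illustrative.

```python
from math import erf, sqrt

def area_from_mean_to_z(z):
    """Area under the standard normal curve between z = 0 and the given z."""
    # The area to the left of z is Phi(z) = 0.5 * (1 + erf(z / sqrt(2))).
    # Subtracting the left half (0.5) leaves the area between the mean and z.
    # abs() is used because the curve is symmetric about the mean.
    return 0.5 * erf(abs(z) / sqrt(2))

print(round(area_from_mean_to_z(0.72), 4))   # compare with the z-Table entry for 0.72
print(round(area_from_mean_to_z(-1.5), 4))   # a negative z gives the same area as z = 1.5
```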

How will you show this in a graph?

• Sketch a normal curve.


• Draw a vertical line through the given z-values and shade the region.
• Note that the given z-value is positive, so the shaded region is on the right of the
mean

Illustrative Example 2: Find the area that corresponds to z = -1.5.
Notes: The area that corresponds to z = -1.5 can also be understood as “the
corresponding area between z = 0 and z = -1.5.” Moreover, note that “negative sign” in
the z-value is just a signal that the region is on the left side of the mean. This means
that the area corresponding to z = 1.5 is also the same with z = -1.5. The only difference
is their location on the graph. If it is positive, then the region is on the right of the mean.
If it is negative, then the region is on the left of the mean.

Steps and Solution
1. Express the z-value into a three-digit form: z = -1.50
2. Locate the first two digits in the left column of the z-Table.
3. In the z-Table, match the third digit with the appropriate column on the right.
4. Find the intersection of the row 1.5 and the column .00: area = .4332
Therefore, the required area is .4332.

Now, sketch a normal curve and identify the region. Remember that the z-value is
negative, so the region is on the left of the mean.
Before proceeding to the next topic, remember that the mean divides the area
under the curve into two equal parts.

C. UNDERSTANDING THE Z-SCORES

Recall that the z-value or z-score tells you the distance from the mean measured in
standard deviation units. It can be positive (above the mean), negative (below the mean), or zero
(equal to the mean). However, in real life, these scores are not usually given. Thus, it is important
that you know how to transform a raw score to its corresponding z-score under the
normal curve.
To get the z-value, use the formula:

$$z = \frac{X - \mu}{\sigma} \quad \text{(z-score for population data)}$$

OR

$$z = \frac{X - \bar{X}}{s} \quad \text{(z-score for sample data)}$$

where: X = given measurement
μ = population mean
σ = population standard deviation
X̄ = sample mean
s = sample standard deviation

Example 1: Find the z-value that corresponds to a score X = 58, given the mean μ = 50
and standard deviation σ = 4.
(The symbols used are for a population. Therefore, the z-score locates the raw score within a
population.)
Solution:

$$z = \frac{X - \mu}{\sigma} = \frac{58 - 50}{4} = \frac{8}{4} = 2$$

(The resulting z-value is positive 2.)
The corresponding z-score to the raw score 58 is 2.
Meaning, the score 58 is 2 standard deviations above the mean.

Example 2: Locate the corresponding z-value to a score of 20 given that X̄ = 26 and s = 4.
(The given are sample data.)
Solution:

$$z = \frac{X - \bar{X}}{s} = \frac{20 - 26}{4} = \frac{-6}{4} = -1.5$$

(The resulting z-value is negative 1.5.)
The corresponding z-score to the raw score 20 is -1.5.
Meaning, the score 20 is 1.5 standard deviations below the mean.
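
The conversion in the two examples above, and the reverse conversion from a z-score back to a raw score, can be sketched in Python as follows. The function names `z_score` and `raw_score` are hypothetical.

```python
def z_score(x, mean, std_dev):
    """Convert a raw score to a z-score: z = (x - mean) / std_dev."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / std_dev

def raw_score(z, mean, std_dev):
    """Convert a z-score back to a raw score: x = mean + z * std_dev."""
    return mean + z * std_dev

print(z_score(58, 50, 4))      # Example 1 (population data)
print(z_score(20, 26, 4))      # Example 2 (sample data)
print(raw_score(-1.5, 26, 4))  # going back from z = -1.5 to the raw score
```

The same function serves both population and sample data because only the symbols differ, not the arithmetic.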

Example 3: During the summer break, 1,500 students took a test to apply for
scholarships in college. Their mean score is 80 with a standard deviation of 5. How many
students got a score between 75 and 82?

Steps and Solution
1. Convert the raw scores 75 and 82 to z-scores.
   For the raw score 75: $z = \frac{X - \mu}{\sigma} = \frac{75 - 80}{5} = \frac{-5}{5} = -1$
   For the raw score 82: $z = \frac{X - \mu}{\sigma} = \frac{82 - 80}{5} = \frac{2}{5} = 0.4$
2. Find the area that corresponds to each z-score.
   z = -1 corresponds to the area .3413
   z = .4 corresponds to the area .1554
3. Sketch the graph of a normal curve showing the z-scores.
4. In your sketch, draw a line through the z-scores and shade the region between them.
5. Analyze the sketch and determine the operation to use to find the total area.
   The graph suggests addition: .3413 + .1554 = .4967
   .4967 is 49.67% when expressed as a percent.
6. Make a statement.
   49.67% of the students got a score between 75 and 82. Since .4967 × 1500 ≈ 745,
   about 745 students got a score between 75 and 82.
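
As a cross-check of Example 3, here is a minimal Python sketch. It assumes the scores are normally distributed and uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) instead of the z-Table, so the proportion may differ from the table-based value in the fourth decimal place.

```python
from statistics import NormalDist

# Assumed model: test scores are normal with mean 80 and standard deviation 5.
scores = NormalDist(mu=80, sigma=5)

# Area between 75 and 82 = (area to the left of 82) - (area to the left of 75).
proportion = scores.cdf(82) - scores.cdf(75)

# Multiply the proportion by the number of examinees to get a head count.
students = proportion * 1500

print(round(proportion, 4), round(students))
```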

D. LOCATING PERCENTILES UNDER THE NORMAL CURVE
The following phrases are expressions of order. Are you familiar with them?

“Top 10”
‘a score of 75%’

Just like the z-scores, a percentile tells the position of a value. It describes the relationship
of a value to the rest of the data. It is a point in the distribution below which a given
percentage of the cases fall. For example, if your score is at the 84th percentile, it means that
84% of the scores were lower than yours and that 16% of the scores were higher than yours.
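
The link between a percentile and a z-value can be sketched in Python. This is only an illustration that assumes a standard normal distribution and uses `statistics.NormalDist` from the standard library in place of the z-Table.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# z-value below which 84% of the area lies (the 84th percentile).
z_84 = standard_normal.inv_cdf(0.84)

# Going the other way: the percentage of the area below z = 1.
percent_below_one = standard_normal.cdf(1.0) * 100

print(round(z_84, 2), round(percent_below_one, 2))
```

The output shows that the 84th percentile sits roughly one standard deviation above the mean.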

The Neophyte Statistician. Hello, neophyte! Prove your learning by solving the following
problems. A total of 1,000 children joined the physical fitness program last month. Their
average weight before the program was 35 kg with a standard deviation of 5 kg. How many
of these children weighed between 32 kg and 45 kg?

Write your solutions in the figure.


1. The scores in a test given to 500 students are normally distributed with a mean of 110
and a standard deviation of 6. How many students got a score below 104?

2. 150 male students were asked about their shoe size. The result is normally distributed
with a mean of 8 inches and a standard deviation of 3 inches. Find the probability that a
male student, picked at random, has a shoe size of 8 in.

3. The heights of 800 children are normally distributed with an average of 95 cm and a
standard deviation of 5 cm. How many of the children have a height between 93 cm and 98 cm?

MODULE 5

This module was designed and written with you in mind. It is here to help you master the
nature of Statistics and Probability. The scope of this module permits it to be used in many
different learning situations. The language used recognizes the diverse vocabulary level of
students. The lessons are arranged to follow the standard sequence of the course. But the
order in which you read them can be changed to correspond with the textbook you are now
using.

The module is concentrated on the lesson


• Lesson 5 – Understanding Sampling and Sampling Distribution

After going through this module, you are expected to:


1. illustrate random sampling.
2. distinguish between parameter and statistic.
3. identify sampling distributions of statistics (sample mean).

Lesson 5: Understanding Sampling and Sampling Distribution

In the first lesson of this module, we have learned how to construct the probability
distribution of a discrete random variable. We have also learned how to compute the mean
and the standard deviation of a discrete random variable. We have also studied to identify
regions under the normal curve, to convert a normal random variable to a standard normal
variable and vice-versa, and to compute probabilities and percentiles using the standard
normal table. In this lesson we will learn to illustrate random sampling, to distinguish
between parameter and statistic and to identify sampling distributions of statistics (sample
mean). Also, we will learn how to construct the sampling distribution of sample means and
find out some characteristics of the sampling distribution of the sample means. This will
eventually help us to understand the process of making statistical inference about the
population, using a sample drawn from it.

Sample - a subset of the population from which the data is collected. It is a small part of
the population from which the researchers gather data.

OTHER DEFINITION OF TERMS

Sampling method - is concerned with selecting a subset of the population used to
estimate the entire population's characteristics.

Random sampling - is a method wherein each element of the population has an equal
chance of being chosen to represent the population.

TYPES OF RANDOM SAMPLING

a. Simple random sampling - is the simplest form of random sampling where each
element or member of the population has an equal chance of being included in the
sample. The most commonly used is the lottery method.

b. Systematic sampling - is another type of random sampling, which is also known as
interval sampling. This method uses a fixed interval in selecting a sample from a given
population. The interval is computed using the formula:

$$k = \frac{N}{n}$$

where: k = interval size, N = population size, and n = sample size

c. Stratified sampling - is a random sampling method that divides a population into
different homogeneous subgroups called strata.
Two types of stratified sampling

c.1. Simple stratified sampling - is used when the population is divided into strata with
common characteristic/s and if we decide to get an equal number of samples from each
stratum.

c.2. Proportional stratified sampling -is used when the sample size is proportional to the
number of members of the stratum. This means that the smaller the number of stratum
members, the smaller the stratum's sample size will be.

d. Cluster sampling - usually used on a geographical basis and is sometimes called area
sampling. It requires a complete list of clusters that represent the sampling frame.
e. Multi-stage sampling - it involves two or more stages in selecting the samples from a
given population.
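
Two of the sampling types above, systematic sampling and proportional stratified sampling, can be sketched in Python. Everything in this sketch is hypothetical: the population of 100 ID numbers, the grade-level strata and their sizes, and the function names are made up for illustration only.

```python
import random

def systematic_sample(population, n, rng):
    """Pick every k-th element after a random start, where k = N // n."""
    k = len(population) // n      # interval size k = N / n, rounded down
    start = rng.randrange(k)      # random starting point within the first interval
    return [population[start + i * k] for i in range(n)]

def proportional_allocation(strata_sizes, n):
    """Sample size for each stratum, proportional to its share of the population."""
    total = sum(strata_sizes.values())
    # Rounding may make the parts add up to slightly more or less than n.
    return {name: round(n * size / total) for name, size in strata_sizes.items()}

rng = random.Random(0)                 # fixed seed so the run is repeatable
students = list(range(1, 101))         # hypothetical population: ID numbers 1 to 100
sample = systematic_sample(students, 10, rng)
allocation = proportional_allocation({"Grade 11": 300, "Grade 12": 200}, 50)

print(sample)
print(allocation)
```

With N = 100 and n = 10, the interval is k = 10, so the selected ID numbers are always 10 apart.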

SAMPLING DISTRIBUTION OF THE SAMPLE MEANS – is a frequency distribution of the
sample means taken from a population.
Example 1
A population consists of five values (Php2, Php3, Php4, Php5, Php6). A sample of size 2 is
to be taken from this population.
a. How many samples are possible? List them and compute the mean of each sample.

b. Construct the sampling distribution of the sample means.
c. Construct the histogram of the sampling distribution of the sample means.

Solution
1. Since the size of the population is 5, we have N = 5. We shall draw a sample of size
2 from this population, so n = 2. Thus, the number of possible samples of size 2 that
can be drawn from this population is computed as follows:

$$C(n, r) = \frac{n!}{(n - r)! \, r!}$$

$$C(5, 2) = \frac{5!}{(5 - 2)! \, 2!} = 10$$

The number of all possible samples of size 2 is 10. The table shows the list of all possible
samples with their corresponding means.

Possible samples of size 2    Mean
2, 3                          (2 + 3)/2 = 2.5
2, 4                          3.0
2, 5                          3.5
2, 6                          4.0
3, 4                          3.5
3, 5                          4.0
3, 6                          4.5
4, 5                          4.5
4, 6                          5.0
5, 6                          5.5

Observe that the means of the samples vary from sample to sample. The mean of the
population μ=4, while the means of the samples may be less than, greater than, or equal to
4.
2. We now construct the frequency distribution of the sample means.

Mean     Frequency
2.5      1
3.0      1
3.5      2
4.0      2
4.5      2
5.0      1
5.5      1
Total    10

Next, we construct the probability distribution of the sample means. This is the sampling
distribution of the sample means.

Mean x̄    Probability P(x̄)
2.5        1/10
3.0        1/10
3.5        2/10 or 1/5
4.0        2/10 or 1/5
4.5        2/10 or 1/5
5.0        1/10
5.5        1/10

3. The histogram of the sampling distribution of the sample means is constructed by
making a bar graph where the sample means are plotted on the horizontal axis and the
corresponding probabilities are shown on the vertical axis.
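
The listing of samples and the construction of the sampling distribution in Example 1 can be reproduced with a short Python sketch. The function name `sampling_distribution` is hypothetical, and the sketch assumes, as in the example, that samples are drawn without replacement and that order does not matter.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

def sampling_distribution(population, n):
    """Map each possible sample mean to its probability."""
    # combinations() lists every sample of size n without repetition.
    means = [Fraction(sum(sample), n) for sample in combinations(population, n)]
    counts = Counter(means)
    return {m: Fraction(c, len(means)) for m, c in sorted(counts.items())}

distribution = sampling_distribution([2, 3, 4, 5, 6], 2)

for sample_mean, probability in distribution.items():
    print(float(sample_mean), probability)
```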

Example 2
The following table gives the monthly salaries (in thousands of pesos) of six officers in a
government office. Suppose that random samples of size 4 are taken from this population
of six officers.

Officer Salary
A 8
B 12
C 16
D 20
E 24
F 28

1. How many samples are possible? List them and compute the mean of each sample.
2. Construct the sampling distribution of the sample means.
3. Construct the histogram of the sampling distribution of the sample means.

Solution
1. Since the size of the population is 6, we have N = 6. We shall draw a sample of size 4. Thus,
the number of possible samples of size 4 that can be drawn from this population is
computed as follows.

$$C(n, r) = \frac{n!}{(n - r)! \, r!}$$

$$C(6, 4) = \frac{6!}{(6 - 4)! \, 4!} = 15$$

The number of all possible samples of size 4 is 15. The table shows the list of all
possible samples with their corresponding means.

Sample Salaries Mean
A, B, C, D 8, 12, 16, 20 14
A, B, C, E 8, 12, 16, 24 15
A, B, C, F 8, 12, 16, 28 16
A, B, D, E 8, 12, 20, 24 16
A, B, D, F 8, 12, 20, 28 17
A, B, E, F 8, 12, 24, 28 18
A, C, D, E 8, 16, 20, 24 17
A, C, D, F 8, 16, 20, 28 18
A, C, E, F 8, 16, 24, 28 19
A, D, E, F 8, 20, 24, 28 20
B, C, D, E 12, 16, 20, 24 18
B, C, D, F 12, 16, 20, 28 19
B, C, E, F 12, 16, 24, 28 20
B, D, E, F 12, 20, 24, 28 21
C, D, E, F 16, 20, 24, 28 22
Observe that the means of the samples vary from sample to sample. The mean of the
population μ = 18, while the mean of the samples may be less than, greater than, or equal
to 18.
2. We now construct the frequency distribution of the sample means.

Mean (x̄)    Frequency (f)
14           1
15           1
16           2
17           2
18           3
19           2
20           2
21           1
22           1
Total        15

Next, we construct the probability distribution of the sample means. This is the sampling
distribution of the sample means.

Mean x̄    Probability P(x̄)
14         1/15
15         1/15
16         2/15
17         2/15
18         3/15 or 1/5
19         2/15
20         2/15
21         1/15
22         1/15

3. The histogram of the sampling distribution of the sample means is constructed by making
a bar graph where the sample means are plotted on the horizontal axis and the
corresponding probabilities are shown on the vertical axis.
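
Since the histogram itself is a drawing, here is a rough Python sketch that prints a text version of it for Example 2. The bars are drawn sideways with `#` marks, one mark per sample, which is only an approximation of the bar graph described above.

```python
from collections import Counter
from itertools import combinations

# Monthly salaries (in thousands of pesos) of the six officers.
salaries = {"A": 8, "B": 12, "C": 16, "D": 20, "E": 24, "F": 28}

# Mean of every possible sample of 4 officers, drawn without replacement.
sample_means = [sum(s) / 4 for s in combinations(salaries.values(), 4)]
frequency = Counter(sample_means)
total = len(sample_means)

# One row per sample mean; the bar length equals the frequency.
for m in sorted(frequency):
    print(f"{m:>4.0f} | {'#' * frequency[m]}  ({frequency[m]}/{total})")
```

The tallest bar is at the sample mean 18, which is also the population mean.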
PARAMETER AND STATISTIC
The main objective of conducting a survey is to estimate the value of some of the
characteristics of a population. Let us consider one of the results of the XYZ survey before the
May 2016 presidential election. The actual percentage of all the voters represents the
population parameter, while the estimate of that percentage based on the sample is
known as the sample statistic.

The sampling method used in selecting the sample data strongly affects the quality
of the sample statistic with regard to its representativeness and accuracy. The table below
shows a list of the common symbols used for parameters and statistics:

Parameter                            Statistic
Population mean (µ)                  Sample mean (x̄)
Population standard deviation (σ)    Sample standard deviation (s)
Population variance (σ²)             Sample variance (s²)
Population proportion (P)            Sample proportion (p)

Examples
Identify the population parameter and sample statistic for each study.

A recent survey of 540 senior high school students in FTLSHS for the S.Y. 2019-2020 found
that 90% of the students could be classified as good in Mathematics.
Population Parameter: The percentage of all senior high school students in FTLSHS for
the S.Y. 2019-2020 who are good in Mathematics
Sample Statistic: The 90% of the 540 surveyed senior high school students in FTLSHS for
the S.Y. 2019-2020 who are good in Mathematics

The average weight of every seventh person entering the Ayala mall within a 3-hour period
was 168 pounds.
Population Parameter: The average weight of all the people entering the Ayala mall within
the assigned 3-hour period
Sample Statistic: The average weight (168 pounds) of every seventh person entering the
Ayala mall within the 3-hour period

SOLVE ME PLEASE!
Directions: Solve the following given.
Random sample size (n)    Finite population (N) consisting of       Solution    Final answer
1. n = 2                  3, 4, 5
2. n = 3                  1, 2, 3, 4, 5
3. n = 5                  6, 7, 8, 9, 10, 11, 12
4. n = 7                  4, 6, 8, 10, 12, 14, 16, 18, 20
5. n = 9                  1, 2, 3, 4, 5, 6, 7, 8, 9, 10
6. n = 5                  Odd numbers between 9 – 21 inclusive

ANALYZE ME PLEASE!
Directions: Answer the following problems.

Random samples of size n = 2 are drawn from a finite population consisting of the
numbers 5, 6, 7, 8, 9.

a. How many possible samples are there?
b. List all the possible samples and the corresponding mean for each sample.
c. Construct the sampling distribution of the sample means.
d. Construct the histogram for the sampling distribution of the sample means.
Describe the shape of the histogram.

MODULE 6

This module was designed and written with you in mind. It is here to help you master
the Estimation of Parameters. The scope of this module permits it to be used in many
different learning situations. The language used recognizes the diverse vocabulary level of
students. The lessons are arranged to follow the standard sequence of the course. But the
order in which you read them can be changed to correspond with the textbook you are now
using. The module is divided into two lessons, namely:
Lesson 6 – Sampling and Sampling Distribution

After going through this module, you are expected to:

a. find the mean and variance of the sampling distribution of the sample mean.
b. define the sampling distribution of the sample mean for a normal population when
the variance is: (a) known; (b) unknown

Lesson 6: Sampling and Sampling Distribution

Statisticians do not just describe the variations of the individual data values about the mean
of the population. They are also interested to know how the means of the samples of the
same size taken from the same population vary about the population mean. In this lesson,
you will learn how to describe the sampling distribution of the sample means by
computing its mean and variance. You will also make a general conclusion regarding the
mean, variance, and shape of the sampling distribution of the sample means. There
are many different possible samples of the same size that can be drawn from a given
population. A statistic such as the mean can be computed for each of the samples drawn.

If all possible random samples of size n are taken with replacement (independent) from a
population with a mean μ and variance σ², then the mean ($\mu_{\bar{x}}$), variance ($\sigma^2_{\bar{x}}$), and
standard deviation ($\sigma_{\bar{x}}$) of the sampling distribution of the sample mean are:

$$\mu_{\bar{x}} = \mu \ \text{(mean)}, \qquad \sigma^2_{\bar{x}} = \frac{\sigma^2}{n} \ \text{(variance)}$$

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \ \text{(standard deviation or standard error)}$$

If all possible samples of size n are taken without replacement (dependent) from a finite
population of size N with a mean μ and variance σ², then the mean ($\mu_{\bar{x}}$), variance ($\sigma^2_{\bar{x}}$),
and standard deviation ($\sigma_{\bar{x}}$) of the sampling distribution of the sample mean are:

$$\mu_{\bar{x}} = \mu \ \text{(mean)}, \qquad \sigma^2_{\bar{x}} = \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1} \ \text{(variance)}$$

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} \ \text{(standard deviation or standard error)}$$

Note: The factor (N − n)/(N − 1) is called the correction factor for a finite population. It will
be close to 1 and can be safely ignored when n is small compared to N.
Note also that as we increase the sample size, the variance of the sample mean
decreases.
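
The two standard error formulas can be combined in one small Python sketch. The function name `standard_error` and the numbers used (σ = 10, n = 25, and the two population sizes) are hypothetical and serve only to show how the correction factor behaves.

```python
from math import sqrt

def standard_error(sigma, n, N=None):
    """Standard deviation of the sampling distribution of the sample mean.

    If N is None, sampling is with replacement: sigma / sqrt(n).
    If N is given, sampling is without replacement from a finite population,
    so the finite population correction factor is applied.
    """
    se = sigma / sqrt(n)
    if N is not None:
        se *= sqrt((N - n) / (N - 1))
    return se

print(standard_error(10, 25))            # with replacement
print(standard_error(10, 25, N=100))     # n is a large part of N: noticeable correction
print(standard_error(10, 25, N=10000))   # n is small compared to N: factor is close to 1
```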

THEOREM
If random samples of size n are taken from a population with a mean μ and standard
deviation σ, then the sampling distribution of the sample mean x̄ approaches a normal
distribution with mean $\mu_{\bar{x}} = \mu$ and standard deviation $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$, and thus can be
standardized as:

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$

As n increases, the sampling distribution of the sample mean gets nearer and nearer to
the normal distribution.

➢ If σ is unknown, compute the sample standard deviation s, then use it to replace
σ in the formula if n ≥ 30.
➢ Even if n < 30, the formula can still be used provided that the population is
approximately normal and the population standard deviation σ is known.

Example 1. Suppose a jar contains the numbers 1, 3, and 5. Show that $\mu_{\bar{x}} = \mu$ and
$\sigma^2_{\bar{x}} = \frac{\sigma^2}{n}$.
Solution: The probability distribution is:

x         1      3      5
f(x)      1/3    1/3    1/3
xf(x)     1/3    3/3    5/3
x²        1      9      25
x²f(x)    1/3    9/3    25/3

Since the distribution is uniform, that is, the observations have the same probabilities, the
population mean (μ) and population variance (σ²) can be easily computed as

$$\mu = 1\left(\frac{1}{3}\right) + 3\left(\frac{1}{3}\right) + 5\left(\frac{1}{3}\right) = \frac{9}{3} = 3$$

$$\sigma^2 = (1 - 3)^2\left(\frac{1}{3}\right) + (3 - 3)^2\left(\frac{1}{3}\right) + (5 - 3)^2\left(\frac{1}{3}\right) = \frac{4}{3} + 0 + \frac{4}{3} = \frac{8}{3}$$

If we take two numbers in succession with replacement, then the possible two-number samples
are: (1,1), (3,3), (5,5), (1,3), (3,1), (1,5), (5,1), (3,5), and (5,3). The averages or means of the
pairs, in order, are 1, 3, 5, 2, 2, 3, 3, 4, and 4, respectively. If we denote the means as the
random variable x̄, then the probability distribution of x̄ becomes:

x̄          1      2      3      4      5
f(x̄)       1/9    2/9    3/9    2/9    1/9
x̄f(x̄)      1/9    4/9    9/9    8/9    5/9
x̄²         1      4      9      16     25
x̄²f(x̄)     1/9    8/9    27/9   32/9   25/9

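The mean of the sampling distribution (using the formula for the mean of a
random variable) is:

$$\mu_{\bar{x}} = \sum [\bar{x} \, f(\bar{x})] = 1\left(\frac{1}{9}\right) + 2\left(\frac{2}{9}\right) + 3\left(\frac{3}{9}\right) + 4\left(\frac{2}{9}\right) + 5\left(\frac{1}{9}\right)$$

$$= \frac{1}{9} + \frac{4}{9} + \frac{9}{9} + \frac{8}{9} + \frac{5}{9} = \frac{27}{9} = 3$$

But μ = 3; therefore, $\mu_{\bar{x}} = \mu$.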

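The variance of the sampling distribution (using the formula for the variance
of a random variable) is:

$$\sigma^2_{\bar{x}} = E(\bar{x}^2) - [E(\bar{x})]^2 = \sum [\bar{x}^2 f(\bar{x})] - \left[\sum \bar{x} f(\bar{x})\right]^2$$

$$= \left[1^2\left(\frac{1}{9}\right) + 2^2\left(\frac{2}{9}\right) + 3^2\left(\frac{3}{9}\right) + 4^2\left(\frac{2}{9}\right) + 5^2\left(\frac{1}{9}\right)\right] - (3)^2 = \frac{93}{9} - 9 = \frac{4}{3}$$

But $\frac{\sigma^2}{n} = \frac{8/3}{2} = \frac{4}{3}$, since the sample size n is 2. Therefore, $\sigma^2_{\bar{x}} = \frac{\sigma^2}{n}$.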
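
The whole jar example can be verified by listing every sample with a short Python sketch. The helper name `mean_and_variance` is hypothetical, and `itertools.product` is used because the two numbers are drawn with replacement, so order matters and repeats are allowed.

```python
from fractions import Fraction
from itertools import product

def mean_and_variance(values):
    """Mean and variance of a list of equally likely values."""
    count = len(values)
    mu = Fraction(sum(values), count)
    variance = sum((v - mu) ** 2 for v in values) / count
    return mu, variance

population = [1, 3, 5]
n = 2  # sample size

mu, sigma2 = mean_and_variance(population)

# All ordered samples of size 2 drawn with replacement: 3 x 3 = 9 samples.
sample_means = [Fraction(sum(s), n) for s in product(population, repeat=n)]
mu_xbar, sigma2_xbar = mean_and_variance(sample_means)

print(mu, sigma2)            # population mean and variance
print(mu_xbar, sigma2_xbar)  # mean and variance of the sample means
print(mu_xbar == mu, sigma2_xbar == sigma2 / n)
```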
Example 2.
If a random sample of size 16 is taken with replacement from the population 1, 1, 1, 2, 4, 5, 5,
and 6, what are the mean, variance, and standard deviation of the sampling distribution of
the sample mean?
Answer: We can write the probability distribution of our population as:

x       1      2      4      5      6
f(x)    3/8    1/8    1/8    2/8    1/8

The mean of the distribution is:

$$\mu = 1\left(\frac{3}{8}\right) + 2\left(\frac{1}{8}\right) + 4\left(\frac{1}{8}\right) + 5\left(\frac{2}{8}\right) + 6\left(\frac{1}{8}\right)$$

$$= \frac{3}{8} + \frac{2}{8} + \frac{4}{8} + \frac{10}{8} + \frac{6}{8} = \frac{25}{8} \text{ or } 3.13$$

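And the variance is:

$$\sigma^2 = 1^2\left(\frac{3}{8}\right) + 2^2\left(\frac{1}{8}\right) + 4^2\left(\frac{1}{8}\right) + 5^2\left(\frac{2}{8}\right) + 6^2\left(\frac{1}{8}\right) - \left(\frac{25}{8}\right)^2$$

$$= \frac{3}{8} + \frac{4}{8} + \frac{16}{8} + \frac{50}{8} + \frac{36}{8} - \frac{625}{64} = 13.625 - \frac{625}{64} \approx 3.86$$

Therefore, the mean of the sampling distribution ($\mu_{\bar{x}}$) of the sample mean is 3.13 (the same
as μ), while the variance of the sampling distribution ($\sigma^2_{\bar{x}}$) is 3.86/16 ≈ 0.24 (that is, σ²/n). In
addition, the standard deviation $\sigma_{\bar{x}}$ is √0.24 ≈ 0.49 (the positive square root of the variance).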
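
Example 2 can be checked without listing all the samples, because the formulas only need the population values and the sample size. The following Python sketch applies them directly; the variable names are illustrative.

```python
from fractions import Fraction
from math import sqrt

population = [1, 1, 1, 2, 4, 5, 5, 6]
n = 16                      # sample size, drawn with replacement
size = len(population)      # 8 equally likely values

# Population mean and variance: sigma^2 = E(x^2) - mu^2.
mu = Fraction(sum(population), size)
sigma2 = Fraction(sum(v * v for v in population), size) - mu**2

# Sampling distribution of the sample mean (with replacement).
mu_xbar = mu
variance_xbar = sigma2 / n
std_error = sqrt(variance_xbar)

print(float(mu_xbar), round(float(variance_xbar), 2), round(std_error, 2))
```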
Example 3.
A school has 5,000 students with a mean weight of 80 lbs. and a standard deviation of 20
lbs. If you draw a random sample of 50 students, what are the mean, variance, and standard
deviation (standard error) of the sampling distribution of the sample mean?
Answer: We know from the given that μ = 80 lbs; therefore, $\mu_{\bar{x}}$ = 80 lbs. Since the sample
(n = 50) is taken from a finite population (N = 5000) whose σ = 20, the standard error ($\sigma_{\bar{x}}$) can
be computed as:

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} = \frac{20}{\sqrt{50}} \sqrt{\frac{5000 - 50}{5000 - 1}} = 2.828 \sqrt{\frac{4950}{4999}} = (2.828)(0.995) \approx 2.814$$

The variance, which is the square of the standard error, is:

$$\sigma^2_{\bar{x}} = (2.814)^2 \approx 7.919$$
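
Here is a brief Python sketch of Example 3 that carries full precision through the computation. Because it does not round the intermediate values 2.828 and 0.995, its variance comes out near 7.92 rather than exactly the rounded figure in the worked solution.

```python
from fractions import Fraction
from math import sqrt

N = 5000     # population size
n = 50       # sample size, drawn without replacement
sigma = 20   # population standard deviation in lbs

# Variance of the sample mean with the finite population correction factor.
variance_xbar = Fraction(sigma**2, n) * Fraction(N - n, N - 1)
std_error = sqrt(variance_xbar)

print(round(float(variance_xbar), 4), round(std_error, 4))
```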

A. Determine the mean ($\mu_{\bar{x}}$), variance ($\sigma^2_{\bar{x}}$), and standard deviation ($\sigma_{\bar{x}}$) of each.
(1-3) A random sample of 20 independent observations is taken from a population
with μ = 23.8 and σ = 5.
1. $\mu_{\bar{x}}$ = _____________
2. $\sigma^2_{\bar{x}}$ = _____________
3. $\sigma_{\bar{x}}$ = _____________
(4-6) A random sample of 30 independent observations is taken from a
population with μ = 48 and σ = 6.5.
4. $\mu_{\bar{x}}$ = _____________
5. $\sigma^2_{\bar{x}}$ = _____________
6. $\sigma_{\bar{x}}$ = _____________
(7-9) A random sample of size 120 is taken with replacement from a
population with μ = 120 and σ = 28.
7. $\mu_{\bar{x}}$ = _____________
8. $\sigma^2_{\bar{x}}$ = _____________
9. $\sigma_{\bar{x}}$ = _____________

B. Compute the mean, variance, and standard deviation of the sampling distribution taken
from the following populations.
10-12. A community has 1500 people with a mean age of 42 and a variance of 16.
If you draw a random sample of 30 people, what are the mean, variance,
and standard error of the sampling distribution of their ages?

13-15. What is the mean, variance, and standard error of the sample mean
when 60 students are taken from a population of 2000 with a mean
score of 75 and standard deviation of 5?

MODULE 7

This module was designed and written with you in mind. It is here to help you master
the nature of Statistics and Probability. The scope of this module permits it to be used in
many different learning situations. The language used recognizes the diverse vocabulary
level of students. The lessons are arranged to follow the standard sequence of the
course. But the order in which you read them can be changed to correspond with the
textbook you are now using.

This module targets the following learning competencies:


1. Illustrate the Central Limit Theorem (M11/12SP-IIIe-2).
2. Define the sampling distribution of the sample mean using the Central Limit Theorem
(M11/12SP-III-3).

After going through this module, you are expected to:


• state and explain the Central Limit Theorem;
• identify the assumptions and conditions in using the Central Limit Theorem; and
• describe the sampling distribution of the sample means using the Central Limit
Theorem by finding the mean and the standard error of the mean.

Lesson 7: The Central Limit Theorem

In the previous module, you have described the sampling distribution of the sample means
by computing its mean and variance. This time, you will continue to study sampling
distribution along with an important concept used in the field of Statistics.

A. The Central Limit Theorem

In the previous module, you have learned to use parameters in describing a


population. For a standard normal distribution, the two parameters are the mean and
standard deviation (or you may use the variance). You do not get the mean and then
forget the standard deviation or the variance. These two go hand in hand. From there,
you may be able to conclude about a population.

As mentioned in the previous part of this module, it is difficult and almost
impossible to collect data from the entire population. So researchers obtain data
from a sample and then do the study. The resulting sample mean is treated as an
estimate of the population mean. Because it is only an estimate, it is important to
ensure that it is a good one. This is achieved when the standard deviation of the
sampling distribution of the sample means (also known as the standard error of the
mean) is small or close to zero. The standard error of the mean indicates how accurate
the sample mean is as an estimate of the population mean. Its value depends on the
sample size. Remember that the standard error of the mean (σ_x̄) is obtained by
dividing the population standard deviation (σ) by the square root of the sample size (n):

σ_x̄ = σ / √n

This means that, to get a better estimate of the population mean, you have to increase
the sample size. This is where the Central Limit Theorem comes in.
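
To see how the sample size affects the standard error of the mean, here is a minimal Python sketch. The population standard deviation of 10 and the sample sizes are hypothetical values chosen only for illustration; they do not come from this module.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    return sigma / math.sqrt(n)

# Hypothetical population standard deviation, chosen only for illustration.
sigma = 10.0
errors = {n: standard_error(sigma, n) for n in (4, 25, 100, 400)}
for n, se in errors.items():
    print(f"n = {n:>3}: standard error = {se:.2f}")
```

Each time the sample size is multiplied by 4, the standard error is cut in half, which is why a larger sample gives a better estimate of the population mean.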

The Central Limit Theorem (CLT) states that the sampling distribution of the
sample means moves closer to a normal distribution as the sample size increases,
regardless of the shape of the population distribution. In other words, the sampling
distribution of the sample means is approximately normal whenever the sample size is
sufficiently large, irrespective of whether the population distribution is normal,
left-skewed, right-skewed, or uniform. This theorem also tells us that the mean of the
sampling distribution of the sample means is always equal to the population mean.
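
The theorem can be checked with a small simulation. The sketch below is only an illustration under assumed values: the right-skewed population (exponential with mean 1 and standard deviation 1), the sample size of 40, the 2000 repetitions, and the seed are all choices made for this example and are not part of the module.

```python
import random
import statistics

random.seed(7)  # fixed seed so the run is repeatable

def sample_means(n, repetitions):
    """Draw `repetitions` random samples of size n from a right-skewed
    population (exponential with mean 1) and return their sample means."""
    return [
        statistics.fmean(random.expovariate(1.0) for _ in range(n))
        for _ in range(repetitions)
    ]

means = sample_means(n=40, repetitions=2000)
center = statistics.fmean(means)   # should be near the population mean, 1
spread = statistics.stdev(means)   # should be near 1 / sqrt(40), about 0.158
print(f"mean of sample means: {center:.3f}")
print(f"standard deviation of sample means: {spread:.3f}")
```

Even though the population is strongly skewed, the sample means cluster around the population mean, and their spread is close to σ / √n.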

B. Assumptions and Conditions for Using the CLT

There are assumptions and conditions that you have to consider before using the
Central Limit Theorem.

1. The sample must be selected randomly. This ensures that the data gathered are
unbiased data from the population.
2. The observations must be independent of each other. That is, the value of one
observation does not affect the value of another. Normally, when the sample is
randomly selected, independence is assumed to be met.
3. The sample size should not be more than 10% of the population when the sample is
drawn without replacement. Taking away each sampled item changes the population.
When you sample only 10% or less of the population, removing each observation does
not change the population that much (Khan Academy).
4. The sample size must be large enough. How large? For many statisticians, a minimum
sample size of 30 is assumed to be sufficiently large. They believe that, at this size,
the sampling distribution of the mean approaches a normal distribution. Of course, this
is just a guideline and not a general rule, because situations vary. If it is stated that
the population is already normally distributed, you may use a small sample size.
However, when the population is strongly skewed, you will need an adequately large
sample size. The basic concept is this: the more the population distribution departs
from normal, the larger the sample size required.

C. Defining the Sampling Distribution of the Sample Means Using the Central Limit Theorem

After knowing the Central Limit Theorem's definition, assumptions, and conditions, you
may now describe the sampling distribution of the sample means by computing the
mean and the standard error of the mean.

Example 1: The mean (μ) and standard deviation (σ) of a population distribution are 68
and 3, respectively. Find the mean and standard deviation of the sampling distribution
when a random sample of size 25 is drawn from the population. Assume that the
population is finite.
Solution:

Example 2: Samples of size 100 are selected randomly from a population with a mean of 28.5
and a standard deviation of 1.5. Compute the mean and standard error of the sampling
distribution of the means given that the population is finite.
Solution:
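
The quantities asked for in Examples 1 and 2 can be sketched in Python as follows. This is an illustration under one assumption: it uses σ_x̄ = σ / √n from Section A, with no finite population correction, because the population size is not given in either example. The function name is hypothetical.

```python
import math

def describe_sampling_distribution(mu, sigma, n):
    """Return (mean, standard error) of the sampling distribution of the
    sample means, assuming standard error = sigma / sqrt(n)."""
    return mu, sigma / math.sqrt(n)

mean_1, se_1 = describe_sampling_distribution(mu=68, sigma=3, n=25)
mean_2, se_2 = describe_sampling_distribution(mu=28.5, sigma=1.5, n=100)
print(f"Example 1: mean = {mean_1}, standard error = {se_1:.2f}")
print(f"Example 2: mean = {mean_2}, standard error = {se_2:.2f}")
```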

The Central Limit Theorem gives us confidence that whatever the shape of the distribution
is, it will approach a normal distribution as the sample size increases.
In the next module, you will solve more real-life problems involving this theorem. Exciting,
right?

Crossword Puzzle

Directions: Complete the puzzle by answering the given clues.

Across

1. this condition shows that the value of one observation does not affect the value
of another observation
3. the relationship between the population mean and the mean of the sampling
distribution of the sample means
4. the sample size should not exceed _____ percent of the population when the
sample is drawn without replacement
5. the standard error of the mean must be close to _____ to have a good estimate of
the mean
7. the more the distribution varies from being normal, the _____ the sample size
required
9. the approximate shape of the sampling distribution of the sample means when the
size of the random samples gets larger

Down

2. the standard error of the mean depends on this
6. the data collected must be from a _____ sample
8. a parameter for a standard normal distribution
10. abbreviation for Central Limit Theorem

Instructions:
1. Do only the task that is assigned to your group.
Group 1: Find the mean and standard deviation of the heights of all the students in
your section.
Group 2: Find the mean and standard deviation of the weights of all the students in
your section.
Group 3: Find the mean and standard deviation of the hours spent in sleeping of all
the students in your section.
Group 4: Find the mean and standard deviation of the waistline of all the students
in your section.
Group 5: Find the mean and standard deviation of the general average of all the
students in your section.

2. After finding the mean and standard deviation, create a problem using the Central
Limit Theorem for the solution.

3. Submit your work on a letter size bond paper before the end of the first quarter of
this semester
RUBRIC:

Mathematical Concept (Score: _____)
5: The solution has no error.
3: The solution has a minor error.
1: The solution has lots of errors.

Presentation of the output (Score: _____)
5: The submitted output shows neatness and orderliness.
3: The submitted output shows neatness and orderliness with minor flaws.
1: The submitted output is untidy and disorganized.

Timeliness (Score: _____)
5: The output is submitted on or before the deadline.
3: The output is submitted one or two days after the deadline.
1: The output is submitted later than two days after the deadline.

TOTAL SCORE

MODULE 8

This module was designed and written with you in mind. It is here to help you master the
random variable and probability distributions. The scope of this module permits it to be
used in many different learning situations. The language used recognizes the diverse
vocabulary level of students. The lessons are arranged to follow the standard sequence of
the course. But the order in which you read them can be changed to correspond with the
textbook you are now using.
The module deals with an understanding of:

• Applied Problems Involving Sampling Distribution of Sample Means

After going through this module, you are expected to:

1. Solve problems involving sampling distributions of the sample means

Lesson
8 Applied Problems Involving Sampling Distribution of Sample Means

Your previous lesson on computing the mean and the variance of the sampling distribution
of the means showed you that the mean of the sampling distribution of the sample
means (μ_x̄) is equal to the mean of the population. It also showed that the variance of
the sampling distribution of the sample means has two formulas, one for a finite
population and one for an infinite population. The same is true for the standard
deviation, since the standard deviation is simply the square root of the variance.

The knowledge taught to you in the previous lesson is pertinent in your succeeding
lesson. But before we proceed to the next lesson, let me check your understanding.

Let us explore how the Central Limit Theorem is applied in solving problems involving
sampling distributions.

A population of size N = 60 has a mean of μ = 65 and a standard deviation of σ = 5.

1. What is the probability that their mean is between 64 and 67?
2. What is the probability that a random sample of size 35 will have a mean of 66 or
more?

Let’s take the step-by-step solution:
➢ The population is finite because its size is given, and we can use the
normal distribution because the CLT holds.
✓ First, convert the values to z-scores using z = (x̄ − μ) / (σ/√n), with 60 used as the
sample size.
✓ For x̄ = 64: z = (64 − 65) / (5/√60) = −1 / 0.645 = −1.55
✓ For x̄ = 67: z = (67 − 65) / (5/√60) = 2 / 0.645 = 3.10
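
The two z-scores, and the probability between them, can be checked with Python's standard library. This is a sketch that follows the worked solution in using 60 as the sample size; the variable names are chosen only for this illustration.

```python
import math
from statistics import NormalDist

mu, sigma, n = 65, 5, 60  # values used in the worked solution

standard_error = sigma / math.sqrt(n)
z_low = (64 - mu) / standard_error
z_high = (67 - mu) / standard_error

# Area under the standard normal curve between the two z-scores.
standard_normal = NormalDist()
probability = standard_normal.cdf(z_high) - standard_normal.cdf(z_low)

print(f"z for 64: {z_low:.2f}")
print(f"z for 67: {z_high:.2f}")
print(f"P(64 < sample mean < 67) is about {probability:.4f}")
```

The same area can be read from the z-table by looking up the two z-scores and subtracting the smaller cumulative area from the larger one.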
√𝑛

A. Compute the z value for each; assume that each population is normally distributed.
(2 points each)

        1.        2.        3.
μ       100       140       62
σ       2         14        6
x̄       100.5     145       59
n       80        12        30
z = (x̄ − μ) / (σ/√n)

B.
        4.        5.
μ       46        245
σ²      9         20
x̄       45.5      248
n       20        25
z = (x̄ − μ) / (σ/√n)

Try the following activities to deepen your understanding of the Central Limit
Theorem.

Assume that the height of adult women is normally distributed with a
mean of 63 in and a standard deviation of 2.5 in.

1. If 36 women are randomly selected, what is the probability that the mean height is
less than 62 in?

2. If 70 women are taken as samples, what is the probability that their mean height is
greater than 62.5 in?

MODULE 9

This module was designed and written with you in mind. It is here to help you learn
about the T-Distribution and percentiles. The scope of this module permits it to be used
in many different learning situations. The language used recognizes the diverse
vocabulary level of learners. The lessons are arranged to follow the standard sequence
of the course.
The module is divided into 2 lessons, namely:

● Lesson 1 – Understanding the T-Distribution


● Lesson 2 –Percentile using T-Table

After going through this module, you are expected to:


1. Illustrate the T-Distribution
2. Identify percentiles using the T-Table

Lesson
9 Understanding the T-Distribution

There are situations when the sample size is not large enough for the Central Limit
Theorem to be applied. Can we still obtain an interval estimate of the population mean?
Can the assumptions be met? These questions and other pertinent procedures are discussed
in this lesson. To find out if you are ready to learn this new lesson, do the following activities.

N < 30

If the sample size is less than 30 (n < 30), it is considered small. Thus, even if the
variance of the population is given, the formula for standardizing the sampling distribution
of the sample mean cannot be used. For such a small sample, the normality of the sampling
distribution of the sample mean cannot be guaranteed, so the z-table cannot be used.

The T-Distribution

Since the z-table cannot be used for a small sample size, another type of distribution
is used. This special case is called the t-distribution, formulated in 1908 by W.S. Gosset,
an employee of the Guinness brewery in Ireland.

The t-distribution, like the z-distribution, is bell-shaped and symmetrical. But
compared to the z-distribution, the t-distribution is more variable, since its value depends
on the fluctuations of both the mean and the variance from sample to sample.

Theorem

If x̄ and s are the mean and standard deviation, respectively, of a random sample of size
n taken from a normally distributed population with mean μ, then

t = (x̄ − μ) / (s/√n)

is a value of a random variable T following the t-distribution, where:

x̄ = sample mean
μ = population mean
s = sample standard deviation
n = sample size

Remember that this formula should be used when the sample size (n) is less than 30.

The t-distribution allows us to conduct statistical analyses on certain data sets that are
not appropriate for analysis using the normal distribution. The t-distribution can be used
when your sample size (n) is less than 30 and the population standard deviation (σ) is
unknown. Here is an example of how to find the t-value.
Example 1:

What is the t-value when μ = 42, x̄ = 43, s = 6, and n = 25?

Steps and Solution

1. Analyze the problem and identify what kind of distribution is going to be used.
   Since n = 25 and it is less than 30, we can use the t-distribution.
2. Write the formula.
   t = (x̄ − μ) / (s/√n)
3. Evaluate the formula.
   t = (43 − 42) / (6/√25)
4. Subtract in the numerator and find the square root of the sample size (n).
   t = 1 / (6/5)
5. Divide in the denominator.
   t = 1 / 1.2
6. Simplify your answer.
   t = 0.8333 or 0.83

Example 2:

MATH Corporation manufactures light bulbs. The CEO claims that an average MATH
Corporation light bulb lasts 400 days. A researcher randomly selects 15 bulbs for testing.
The sampled bulbs last an average of 380 days, with a standard deviation of 50 days. Find
the t-value of the given data.

Steps and Solution

1. Write the given.
   μ = 400, x̄ = 380, s = 50, n = 15
2. Analyze the problem and identify what kind of distribution is going to be used.
   Since n = 15 and it is less than 30, we can use the t-distribution.
3. Write the formula.
   t = (x̄ − μ) / (s/√n)
4. Evaluate the formula.
   t = (380 − 400) / (50/√15)
5. Subtract in the numerator and find the square root of the sample size (n). If the
   sample size is not a perfect square, write the square root to at least 4 decimal
   places before rounding off.
   t = −20 / (50/3.8730)
6. Divide in the denominator.
   t = −20 / 12.9099
7. Simplify your answer.
   t = −1.5492 or −1.55
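
The steps in Examples 1 and 2 can be sketched as a short Python function. The function and variable names are hypothetical, and Example 2 is computed with n = 15, the sample size used in its worked solution.

```python
import math

def t_value(sample_mean, population_mean, s, n):
    """t = (sample_mean - population_mean) / (s / sqrt(n))."""
    return (sample_mean - population_mean) / (s / math.sqrt(n))

example_1 = t_value(sample_mean=43, population_mean=42, s=6, n=25)
example_2 = t_value(sample_mean=380, population_mean=400, s=50, n=15)
print(f"Example 1: t = {example_1:.2f}")
print(f"Example 2: t = {example_2:.2f}")
```

A negative t-value, as in Example 2, simply means that the sample mean is below the claimed population mean.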

The T-Table

The t-table shows right-tail probabilities for selected t-distributions. You can use it
to solve the following problems. Suppose you have a sample of size 10 and you want to find
the 95th percentile of its corresponding t-distribution. You have n – 1 = 9 degrees of freedom,
so, using the t-table, you look at the row for df = 9. The 95th percentile is the number where
95% of the values lie below it and 5% lie above it, so you want the right-tail area to be 0.05.
Move across the row, find the column for 0.05, and you get t = 1.833. This is the 95th
percentile of the t-distribution with 9 degrees of freedom.

Now, if you increase the sample size to n = 20, the value of the 95th percentile
decreases; look at the row for 20 – 1 = 19 degrees of freedom, and in the column for 0.05 (a
right-tail probability of 0.05) you find t = 1.729. Larger degrees of freedom indicate a
smaller standard deviation, and thus the t-values are more concentrated about the mean, so
you reach the 95th percentile with a value of t closer to 0.

t Table      t.80     t.90     t.95     t.98     t.99     t.998    t.999
             t.90     t.95     t.975    t.99     t.995    t.999    t.9995
Two-tails    0.20     0.10     0.050    0.02     0.010    0.002    0.001
One-tail     0.10     0.05     0.025    0.01     0.005    0.001    0.0005
df
1            3.078    6.314    12.71    31.82    63.66    318.31   636.62
2            1.886    2.920    4.303    6.965    9.925    22.327   31.599
3            1.638    2.353    3.182    4.541    5.841    10.215   12.924
4            1.533    2.132    2.776    3.747    4.604    7.173    8.610
5            1.476    2.015    2.571    3.365    4.032    5.893    6.869
6            1.440    1.943    2.447    3.143    3.707    5.208    5.959
7            1.415    1.895    2.365    2.998    3.499    4.785    5.408
8            1.397    1.860    2.306    2.896    3.355    4.501    5.041
9            1.383    1.833    2.262    2.821    3.250    4.297    4.781
10           1.372    1.812    2.228    2.764    3.169    4.144    4.587
11           1.363    1.796    2.201    2.718    3.106    4.025    4.437
12           1.356    1.782    2.179    2.681    3.055    3.930    4.318
13           1.350    1.771    2.160    2.650    3.012    3.852    4.221
14           1.345    1.761    2.145    2.624    2.977    3.787    4.140
15           1.341    1.753    2.131    2.602    2.947    3.733    4.073
16           1.337    1.746    2.120    2.583    2.921    3.686    4.015
17           1.333    1.740    2.110    2.567    2.898    3.646    3.965
18           1.330    1.734    2.101    2.552    2.878    3.610    3.922
19           1.328    1.729    2.093    2.539    2.861    3.579    3.883
20           1.325    1.725    2.086    2.528    2.845    3.552    3.850
21           1.323    1.721    2.080    2.518    2.831    3.527    3.819
22           1.321    1.717    2.074    2.508    2.819    3.505    3.792
23           1.319    1.714    2.069    2.500    2.807    3.485    3.768
24           1.318    1.711    2.064    2.492    2.797    3.467    3.745
25           1.316    1.708    2.060    2.485    2.787    3.450    3.725
26           1.315    1.706    2.056    2.479    2.779    3.435    3.707
27           1.314    1.703    2.052    2.473    2.771    3.421    3.690
28           1.313    1.701    2.048    2.467    2.763    3.408    3.674
29           1.311    1.699    2.045    2.462    2.756    3.396    3.659
30           1.310    1.697    2.042    2.457    2.750    3.385    3.646
z            1.282    1.645    1.960    2.326    2.576    3.090    3.291

Note:

The formula for Degree of freedom (df) is n -1, Where n is the sample size.
df = n – 1
Example 1:

Mr. Sotto surveys 25 people on the effectiveness of a new medicine.
He wants to know the 95th percentile of his survey. Find the t-value.

Steps and Solution

1. Compute the degrees of freedom.
   df = n – 1
      = 25 – 1
   df = 24
2. Change the percentile into a decimal.
   95th = 95% = 0.95
3. Subtract the decimal from 1 to get the right-tail area.
   α = 1 – 0.95
   α = 0.05
4. Refer to the table. Look for the column for 0.05 and the row for the df.
5. Write your answer.
   t-value = 1.711

Example 2: What is the t-value when n=15 at 90th percentile (one tail)
df =n-1 = 15-1 = 14 and 90th percentile is α=0.10.
Then referring to the table, t = 1.345.
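
The table lookup can also be sketched in Python. The dictionary below is only an excerpt of the t-table in this module (a few rows, and only the one-tail columns 0.10 and 0.05); the names T_TABLE and t_percentile are hypothetical and chosen for this illustration.

```python
# One-tail critical values copied from the t-table in this module
# (only the rows and columns needed for the examples are included).
T_TABLE = {
    9:  {0.10: 1.383, 0.05: 1.833},
    14: {0.10: 1.345, 0.05: 1.761},
    19: {0.10: 1.328, 0.05: 1.729},
    24: {0.10: 1.318, 0.05: 1.711},
}

def t_percentile(n, percentile):
    """Look up the t-value for a sample of size n at the given percentile
    (written as a decimal, e.g. 0.95)."""
    df = n - 1                          # degrees of freedom
    alpha = round(1 - percentile, 4)    # right-tail area
    return T_TABLE[df][alpha]

print(t_percentile(25, 0.95))  # row df = 24, column 0.05
print(t_percentile(15, 0.90))  # row df = 14, column 0.10
```

The rounding of the right-tail area is there only so that the computed value matches the column labels exactly.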

A. Compute the t-value for each number below. (2 points each)

1. μ = 36, x̄ = 30, s = 5, n = 20
2. μ = 536, x̄ = 550, s = 14, n = 9
3. μ = 47.2, x̄ = 48.9, s = 5, n = 12
4. μ = 77, x̄ = 82, s = 12, n = 22
5. μ = 200, x̄ = 189, s = 19, n = 16

Complete the table and find the t-value for the given problem. (5 points)
1. Suppose scores on a Math test are normally distributed, with a population mean of 100.
Suppose 20 people are randomly selected and tested. The standard deviation in the sample
group is 15. What is the probability that the average test score in the sample group will be at
most 110?

a. What is/are given?


b. Write the formula to be used
c. Solution

Answer the given questions.


1. The mean percentage score of 20 students answering the module is 89% with a
standard deviation of 1.6. But the teacher claims that the percentage score is 92%.
Find the t-value for this question.
2. Ms. Pelaez conducts a written evaluation to her 25 students. She wants to know
the 95th percentile of her assessment. Find its t-value.
MODULE 10

This module was designed and written with you in mind. It is here to help you master
the Estimation of Parameters. The scope of this module permits it to be used in many
different learning situations. The language used recognizes the diverse vocabulary level of
students. The lessons are arranged to follow the standard sequence of the course. But the
order in which you read them can be changed to correspond with the textbook you are now
using.
The module consists of one lesson, namely:

Lesson 10 – Length of Confidence Interval and Appropriate Sample Size

After going through this module, you are expected to:

1. identify the length of a confidence interval;
2. compute the length of the confidence interval;
3. compute an appropriate sample size using the length of the interval; and
4. solve problems involving sample size determination.

Lesson
10 Length of Confidence Interval and Appropriate Sample Size

Researchers cannot always study the entire population, especially when the population
is too large. When the population is too large, it may not be possible to determine
population parameters such as the population mean, population variance, and population
standard deviation. Having a slice of the population whose elements were randomly
selected from the entire population is not enough. Aside from the sampling techniques
discussed in Unit 2, researchers need additional techniques to be able to make inferences
about the entire population. These techniques are discussed in this unit. You will learn
the best point estimates for the population mean and population proportion, how to
compute these using point estimators, how to compute the confidence interval estimate
based on the appropriate form of estimator for the population mean, and how to solve
problems involving confidence interval estimation of the population proportion.

Length of Confidence Interval and Appropriate Sample Size

In statistics, a confidence interval (CI) is a type of estimate computed from the
statistics of the observed data. It proposes a range of plausible values for an
unknown parameter (for example, the mean). The interval has an associated
confidence level that the true parameter is in the proposed range.
An interval estimate, sometimes called a confidence interval, is a range or interval
with lower and upper limits used to estimate the population parameter. It is usually
written in the form a < θ < b, which says that the estimated parameter θ is between two
values (a and b) at a certain level of confidence.
Statistical inference is the process of making a conclusion or generalization about the
population based on the study of samples.
A point estimate is a single value that estimates the population parameter, such
as x̄ as an estimate for μ, or s as an estimate for σ.
More strictly speaking, the confidence level represents the frequency (i.e., the
proportion) of possible confidence intervals that contain the true value of the
unknown population parameter. In other words, if confidence intervals are
constructed using a given confidence level from an infinite number of
independent sample statistics, the proportion of those intervals that contain the
true value of the parameter will be equal to the confidence level. For example, if
the confidence level (CL) is 90%, then in hypothetical indefinite data collection,
the interval estimate will contain the population parameter in 90% of the samples.

The length of a confidence interval is the absolute difference between the upper confidence
limit and the lower confidence limit. That is,

LCI=|𝑈𝐶𝐿 − 𝐿𝐶𝐿|=|𝐿𝐶𝐿 − 𝑈𝐶𝐿| or LCI = UCL – LCL

Where : LCI = length of Confidence Interval


UCL = Upper Confidence Limit

LCL = Lower Confidence Limit


Example No. 1: Find the length of the following confidence intervals.
a. 0.357 < p < 0.603
b. 0.629 < p < 0.655
Solution:
a. LCI = UCL – LCL
       = 0.603 – 0.357
       = 0.246
b. LCI = UCL – LCL
       = 0.655 – 0.629
       = 0.026

➢ The formula for a confidence interval for the population proportion p is

p̂ − z_(α/2) √(p̂(1 − p̂)/n) < p < p̂ + z_(α/2) √(p̂(1 − p̂)/n)

so the length of the confidence interval is

LCI = [p̂ + z_(α/2) √(p̂(1 − p̂)/n)] − [p̂ − z_(α/2) √(p̂(1 − p̂)/n)]
    = p̂ + z_(α/2) √(p̂(1 − p̂)/n) − p̂ + z_(α/2) √(p̂(1 − p̂)/n)

LCI = 2 z_(α/2) √(p̂(1 − p̂)/n)

The last equation above can be used to find the length of a confidence interval for a
population proportion.

Example 2. Find the length of the confidence interval given the following data:

p̂ = 0.25, n = 400, confidence level: 95%

Solution:
1. Find α in the (1 − α)100% confidence level, then find z_(α/2).
   (1 − α)100% = 95%
   1 − α = 0.95
   α = 0.05
   α/2 = 0.05/2 = 0.025
   0.5 – 0.025 = 0.475
   Hence, using the Areas Under the Standard Normal Curve table, z_(α/2) = 1.96.
2. LCI = 2 z_(α/2) √(p̂(1 − p̂)/n)
       = 2(1.96) √(0.25(0.75)/400)
       = 3.92 √(0.1875/400)
       = 0.0849 or 0.085
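
The computation in Example 2 can be checked with a short Python sketch. The function name is hypothetical; the z value is obtained from the standard library's normal distribution and rounded to two decimal places so that it matches a printed z-table.

```python
import math
from statistics import NormalDist

def proportion_ci_length(p_hat, n, confidence):
    """Length of the confidence interval for a proportion:
    2 * z_(alpha/2) * sqrt(p_hat * (1 - p_hat) / n)."""
    alpha = 1 - confidence
    # z_(alpha/2), rounded to two decimals like a printed z-table value
    z = round(NormalDist().inv_cdf(1 - alpha / 2), 2)
    return 2 * z * math.sqrt(p_hat * (1 - p_hat) / n)

length = proportion_ci_length(p_hat=0.25, n=400, confidence=0.95)
print(f"LCI = {length:.3f}")
```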
The last equation above can be used to find the length of the confidence interval.
Example 3. Find the length of the confidence interval, given the following data.
s = 6.17, n = 12, confidence level: 99%
Solution:
1. Find the degrees of freedom df.
   df = n − 1
      = 12 − 1
      = 11
2. Find α in the (1 − α)100% confidence level, then find t_(α/2).
   (1 − α)100% = 99%
   1 − α = 0.99
   α = 0.01
   α/2 = 0.01/2 = 0.005
   Hence, using the t-table with df = 11 and a one-tail area of 0.005, t_(α/2) = 3.106.
3. LCI = 2 t_(α/2) (s/√n)
       = 2(3.106)(6.17/√12)
       = (6.212)(6.17/√12)
       = 11.064 or 11.06
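
The length of the interval in Example 3 can be sketched in Python in the same way. The function name is hypothetical, and the critical value 3.106 is taken from the t-table (df = 11, one-tail area 0.005) rather than computed.

```python
import math

def mean_ci_length(s, n, t_critical):
    """Length of the confidence interval for a mean when sigma is unknown:
    2 * t_(alpha/2) * s / sqrt(n)."""
    return 2 * t_critical * s / math.sqrt(n)

# t_(alpha/2) = 3.106 is read from the t-table (df = 11, one-tail area 0.005).
length = mean_ci_length(s=6.17, n=12, t_critical=3.106)
print(f"LCI = {length:.2f}")
```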

Determining the Sample Size for Estimating p
The following are the formulas:
A. When an estimate p̂ is known:
   n = (z_(α/2))² p̂(1 − p̂) / E²
B. When an estimate p̂ is unknown:
   n = 0.25 (z_(α/2))² / E²

Where: n = desired sample size
       p̂ = sample proportion
       z_(α/2) = z value
       E = margin of error
       0.25 is constant

If the computed sample size is not a whole number, it should be rounded up to the next
whole number.
Example 4.
In a previous study done by a student, it was found that 28.5% of the students used Twitter.
This year, your statistics teacher wants you to conduct a study on the current percentage
of Twitter users among the students in your school. How many students must you include
in your study to be 95% confident that the margin of error is no more than 3.5
percentage points?

Solution:
a. p̂ = 28.5%
      = 0.285
   q̂ = 1 − 0.285 = 0.715
b. E = 3.5%
     = 0.035
   (1 − α)100% = 95%
   1 − α = 0.95
   α = 0.05
   α/2 = 0.05/2 = 0.025
   0.5 – 0.025 = 0.475, hence z_(α/2) = 1.96.
c. n = (z_(α/2))² p̂(1 − p̂) / E²
     = (1.96)²(0.285)(0.715) / (0.035)²
     = 639.04 ≈ 640
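
The sample size formulas can be sketched as one Python function. The function name and its default argument are assumptions made for this illustration: when no estimate is given, p̂ = 0.5 is used, which makes p̂(1 − p̂) equal to the constant 0.25 in formula B.

```python
import math

def sample_size_for_proportion(z, margin_of_error, p_hat=0.5):
    """Sample size n = z^2 * p_hat * (1 - p_hat) / E^2, rounded up to the
    next whole number. With the default p_hat = 0.5 (no prior estimate),
    p_hat * (1 - p_hat) equals the constant 0.25 in formula B."""
    n = z ** 2 * p_hat * (1 - p_hat) / margin_of_error ** 2
    return math.ceil(n)

students = sample_size_for_proportion(z=1.96, margin_of_error=0.035, p_hat=0.285)
print(f"Students needed: {students}")
```

Rounding up with math.ceil, instead of rounding to the nearest whole number, guarantees that the margin of error does not exceed the target.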

A. Find the length of the following confidence intervals.
1. 0.325 < p < 0.575
2. 0.137 < p < 0.563
3. 0.338 < p < 0.562
4. 0.301 < p < 0.751
5. 0.245 < p < 0.467

B. Find the length of the confidence interval given the following upper and lower
limits.
6. Upper confidence limit = 0.673
   Lower confidence limit = 0.447
7. Upper confidence limit = 0.691
   Lower confidence limit = 0.409
8. Upper confidence limit = 0.765
   Lower confidence limit = 0.535
9. Upper confidence limit = 0.861
   Lower confidence limit = 0.619
10. Upper confidence limit = 0.852
    Lower confidence limit = 0.588

C. Find the length of the confidence interval given the following data.
11. p̂ = 0.35, n = 400, confidence level = 95%
12. p̂ = 0.45, n = 350, confidence level = 95%
13. p̂ = 0.48, n = 410, confidence level = 95%
14. p̂ = 0.51, n = 420, confidence level = 95%
15. p̂ = 0.42, n = 300, confidence level = 95%
16. σ = 0.5, n = 36, confidence level = 95%
17. σ = 0.4, n = 64, confidence level = 95%
18. σ = 0.52, n = 40, confidence level = 95%
19. s = 5.25, n = 14, confidence level = 95%
20. s = 6.05, n = 12, confidence level = 95%

Quarter II
MODULE 1

This module was designed and written with you in mind. It is here to help you master
the nature of Statistics and Probability. The scope of this module permits it to be used in
many different learning situations. The language used recognizes the diverse vocabulary
level of students. The lessons are arranged to follow the standard sequence of the course.
But the order in which you read them can be changed to correspond with the textbook you
are now using.
The module consists of one lesson, which contains the following sub-lessons:

• Lesson 1 – Basic Concepts of Hypothesis Testing
• 1.1. Null Hypothesis
• 1.2. Alternative Hypothesis
• 1.3. Level of Significance
• 1.4. Rejection Region
• 1.5. Types of Error in Hypothesis Testing
• 1.6. The Parameter to Be Tested Given a Real-Life Problem

After going through this module, you are expected to:

1. Illustrate: (a) null hypothesis; (b) alternative hypothesis; (c) level of significance; (d)
rejection region; (e) types of errors in hypothesis testing.
2. Identify the parameter to be tested given a real-life problem.

Lesson
1 Tests of Hypothesis
In daily life, we make tentative explanations of facts about a particular phenomenon by
formulating hypotheses. A hypothesis may be correct or incorrect, depending on the
available evidence that we can gather to support it. We usually use a sample to
gather the information and evidence that we need to validate our hypothesis. The data that we
gather from this sample become the basis of our decision whether we shall accept or reject
our hypothesis regarding the entire population. The data obtained from this sample are
analyzed with the use of an appropriate statistical procedure to find out whether our
hypothesis should be accepted. This process is called hypothesis testing. In this lesson, we
shall discuss another aspect of inferential statistics: the testing of hypotheses. We shall learn
how to conduct a test of hypothesis that will help us arrive at the right decision.

A statistical hypothesis is a statement about the numerical value of a population
parameter. It is a statement or tentative assertion which aims to explain facts about a
certain phenomenon. A hypothesis needs to be resolved whether it is true or not. Thus, it
must be subjected to statistical testing procedure known as test of hypothesis or
hypothesis testing. If the hypothesis is found to be true, it is accepted: if it is found
false, it is rejected.

There are two kinds of hypotheses: the null and the alternative hypotheses.

1. A null hypothesis, denoted by Ho, is a statement that there is no difference
between a parameter and a specific value.
2. An alternative hypothesis, denoted by Ha, is a statement that there exists a
difference between a parameter and a specific value. It is the opposite or negation of
the null hypothesis.

In a hypothesis test, sample data is evaluated in order to arrive at a decision about
some type of claim. If certain conditions about the sample are satisfied, then the claim can
be evaluated for a population. In a hypothesis test, we:

• Evaluate the null hypothesis, typically denoted with H0. The null is not rejected unless
the hypothesis test shows otherwise. The null statement must always contain some form
of equality (=, ≤, or ≥).
• Always write the alternative hypothesis, typically denoted with Ha or H1, using less
than, greater than, or not equals symbols, i.e., (≠, >, or <).
• If we reject the null hypothesis, then we can assume there is enough evidence to
support the alternative hypothesis.
• Never state that a claim is proven true or false. Keep in mind the underlying fact that
hypothesis testing is based on probability laws; therefore, we can talk only in terms of
non-absolute certainties.

Since the null and alternative hypotheses are contradictory, you must examine evidence to
decide if you have enough evidence to reject the null hypothesis or not. The evidence is in
the form of sample data.

After you have determined which hypothesis the sample supports, you make a decision.
There are two options for a decision. They are “reject H0” if the sample information favors
the alternative hypothesis or “do not reject H0” or “decline to reject H0” if the sample
information is insufficient to reject the null hypothesis

Study the following examples:


Example 1
Claim: The average monthly income of Filipino families who belong to low-income bracket
is Php8 000.
Ho: The average monthly income of Filipino families who belong to low
income bracket is Php8 000 (µ = 8 000).
Ha: The average monthly income of Filipino families who belong to low
income bracket is not equal to Php8 000 (µ ≠ 8 000).

Notice that the null hypothesis is expressed through the use of the “equal” symbol while
the alternative hypothesis is expressed by the "not equal" symbol because the claim or
conjecture does not specify any direction.

Example 2
H0: No more than 30% of the registered voters in Santa Clara County voted in the primary
election. p ≤ 30%
Ha: More than 30% of the registered voters in Santa Clara County voted in the primary
election. p > 30%
Here the parameter is a population proportion, so it is written as p.

Example 3
We want to test if college students take less than five years to graduate from college, on the
average. The null and alternative hypotheses are:
H0: μ ≥ 5
Ha: μ < 5

In formulating the hypotheses, we can use the following guidelines:


1. A null hypothesis is generally a statement of no change. Thus, a statement of equality or
one which involves the equality is usually considered in the null hypothesis. Possible forms
of the null hypothesis include (a) equality; (b) less than or equal; and (c) greater than or
equal.
2. The statistical hypothesis is about a parameter or distribution of the population values.
For example, the parameter in the statement is the average daily number of text messages
that a Grade 11 student sends. Usually, the parameter is represented by a symbol, like for
the population mean, we use µ. Hence, the null and alternative hypotheses could be stated
using symbols as “Ho: µ = 100 against Ha: µ ≠ 100.”
3. The null and alternative hypotheses are complementary and must not overlap. The usual
pairs are as follow:
(a) Ho: Parameter = Value versus Ha: Parameter ≠ Value.
(b) Ho: Parameter = Value versus Ha: Parameter < Value.
(c) Ho: Parameter = Value versus Ha: Parameter > Value.
(d) Ho: Parameter ≤ Value versus Ha: Parameter > Value, and
(e) Ho: Parameter ≥ Value versus Ha: Parameter < Value.

Types of Tests
A statistical test may either be directional (one-tailed) or non-directional (two-tailed).
We can determine whether a test is directional or nondirectional by looking at how
alternative hypothesis is expressed.

Directional Test
A test of any statistical hypothesis where the alternative hypothesis is expressed using less
than (<) or greater than(>) is called directional test or one-tailed test since the critical or
rejection region lies entirely in one tail of the sampling distribution.
Study the following examples.

Example 4
Claim: The average weekly allowance of college students is less than Php 1 500.

H0: The average weekly allowance of college students is equal to Php 1 500
(µ = 1 500).

Ha: The average weekly allowance of college students is less than Php 1 500
(µ < 1 500).

This is a directional test or one-tailed test. More specifically, this is a left-tailed
test because the "less than" symbol was used in expressing the alternative hypothesis.
Thus, the critical region or the rejection region lies entirely in the left tail of the
sampling distribution.
Example 5
Claim: The average weekly allowance of college students is greater than
Php 1 500.
H0: The average weekly allowance of college students is equal to Php 1 500
(µ = 1 500).
Ha: The average weekly allowance of college students is greater than Php 1 500
(µ > 1 500).

This is also a directional test or one-tailed test. More specifically, this is a right-tailed test
because the “greater than” symbol was used in expressing the alternative hypothesis.
Thus, the critical region or the rejection region lies entirely in the right tail of the sampling
distribution.

Nondirectional Test
A test of any statistical hypothesis where the alternative hypothesis is written with a
not equal sign (≠) is called a nondirectional test or two-tailed test since there is no assertion
made on the direction of the difference. The rejection region is split into two equal parts,
one in each tail of the sampling distribution.
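
The rule for naming the type of test from the symbol in the alternative hypothesis can be sketched in a few lines of Python. This is only an illustration; the function name `classify_test` is a hypothetical helper and not part of the module.

```python
def classify_test(alternative_symbol: str) -> str:
    """Return the type of test implied by the symbol used in Ha.

    Hypothetical helper for illustration only.
    """
    kinds = {
        "<": "one-tailed (left-tailed)",    # rejection region in the left tail
        ">": "one-tailed (right-tailed)",   # rejection region in the right tail
        "≠": "two-tailed",                  # rejection region split between both tails
    }
    if alternative_symbol not in kinds:
        raise ValueError("Ha must use <, > or ≠")
    return kinds[alternative_symbol]


# Try the three possible symbols for the alternative hypothesis.
for symbol in ["<", ">", "≠"]:
    print("Ha uses", symbol, "->", classify_test(symbol))
```

Note that only the alternative hypothesis is inspected; the form of the null hypothesis does not change the type of test.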
Example 6
Claim: The average weekly allowance of college students is Php 1 500.
H0: The average weekly allowance of college students is equal to Php 1 500
(µ = 1 500).
Ha: The average weekly allowance of college students is not equal to Php 1 500
(µ ≠ 1 500).

Observe that the alternative hypothesis is expressed using the “not equal” (≠) symbol; the
test is two-tailed.

Types of Error
In decision-making, we sometimes make a wrong decision. Likewise, when we test a
hypothesis, there is a possibility that we shall commit an error in accepting or rejecting
the hypothesis. There are two types of errors: the Type I error and the Type II error.

Type I error occurs when we reject the null hypothesis when it is true. It is also
called alpha error (α error).
Type II error occurs when we accept the null hypothesis when it is false. It is also
called beta error (ß error).
In hypothesis testing, four outcomes are possible: two of which lead to incorrect decisions.
The four possible outcomes are described in the table below.
                             Fact
Decision       Ho is true          Ho is false
Accept Ho      Correct decision    Type II error
Reject Ho      Type I error        Correct decision

Level of Significance
The probability of committing a Type I error is called the level of significance. It is denoted
by the Greek letter α (alpha). Thus, the value of α tells us the probability of making an
error in rejecting the null hypothesis when it is true. The choice for the value of the
significance level is determined by the researcher. This depends on the risk the researcher
is willing to take in committing a Type I error. The commonly
used levels of significance are 0.05 and 0.01. The level of significance should be set before
testing the hypothesis.

Example 7
A 0.01 level of significance means that the researcher is willing to take a 1% risk of error in
making a decision. It also implies that he is 99% confident that he will make a right decision.
Likewise, a 0.05 level of significance means that the researcher is willing to take a 5% risk of
error in making a decision. It also implies that he is 95% confident that he will make a right decision.
Steps in Testing the Hypothesis
Whenever we test hypotheses, we follow these steps.
Step 1: Identify the claim and formulate the null (H0) and alternative (Ha) hypotheses.
Step 2: Set the level of significance and determine whether the test is one-tailed or
two-tailed by looking at how the alternative hypothesis is expressed. Decide
on the test statistic to be used and find the critical value for the test. Draw or
illustrate the rejection region.

Step 3: Compute the test value, using the test statistic or formula for the
test.
Step 4: Make a decision whether to accept or reject the null hypothesis.
Step 5: Formulate a conclusion by answering the research question.

Accepting or Rejecting the Null Hypothesis


How do we decide on accepting or rejecting the null hypothesis? Follow these steps.
1. Determine the critical value using the appropriate statistical tables.
2. Draw the rejection region and the critical value.
3. If the test value or the computed value falls in the rejection region, then reject the null
hypothesis; otherwise, accept the null hypothesis.
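
The decision rule above can be sketched as a small Python function. This is a minimal illustration, not a prescribed procedure: the name `decide`, the `tail` labels, and the use of strict inequalities at the boundary are assumptions made for the example. It keeps the wording "Accept H0" used in this lesson.

```python
def decide(test_value: float, critical_value: float, tail: str) -> str:
    """Apply the rejection-region rule (hypothetical helper).

    tail is "two", "right", or "left". For a two-tailed test with
    critical values ±1.96, pass critical_value=1.96.
    """
    if tail == "two":
        in_rejection_region = abs(test_value) > abs(critical_value)
    elif tail == "right":
        in_rejection_region = test_value > critical_value
    elif tail == "left":
        in_rejection_region = test_value < critical_value
    else:
        raise ValueError("tail must be 'two', 'right', or 'left'")
    return "Reject H0" if in_rejection_region else "Accept H0"


print(decide(1.45, 1.96, "two"))     # test value is inside ±1.96
print(decide(2.35, 1.96, "two"))     # test value is beyond +1.96
print(decide(1.86, 1.65, "right"))   # test value is beyond 1.65
print(decide(-2.05, -2.53, "left"))  # test value is not beyond -2.53
```

The same function covers all three kinds of test because only the comparison changes with the direction of the alternative hypothesis.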

Study the following examples:


Example 8
Hypotheses                  Test Value   Critical Value   Rejection Region          Decision
H0: µ = 50, Ha: µ ≠ 50         1.45         ±1.96         z < −1.96 or z > 1.96     Accept H0
H0: µ = 50, Ha: µ ≠ 50        −1.65         ±1.96         z < −1.96 or z > 1.96     Accept H0
H0: µ = 50, Ha: µ ≠ 50         2.35         ±1.96         z < −1.96 or z > 1.96     Reject H0
H0: µ = 50, Ha: µ ≠ 50        −2.35         ±1.96         z < −1.96 or z > 1.96     Reject H0
H0: µ ≤ 50, Ha: µ > 50         1.86          1.65         z > 1.65                  Reject H0
H0: µ ≤ 50, Ha: µ > 50         1.34          1.65         z > 1.65                  Accept H0
H0: µ ≥ 50, Ha: µ < 50        −2.05         −2.53         z < −2.53                 Accept H0
H0: µ ≥ 50, Ha: µ < 50        −2.88         −2.53         z < −2.53                 Reject H0

Identify Me Please!

DIRECTIONS: Identify whether each of the following is a null hypothesis (Ho) or an alternative
hypothesis (Ha).
Statement
1. The average age of grade eleven students is 17 years old.
2. The mean content of citric acid in a bottle of juice drinks is greater than 2 ml.
3. The average monthly salary of private school teachers is less than Php16 000.
4. The mean weight of newborn babies is 0.5kg.
5. The average IQ of grade eleven students is less than 108.
6. The mean starting salary for education graduates is at least Php 250 000 per year.
7. The mean number of years Americans work before retiring is 34.
8. The average height of mango tree in a farm is at least 15 meters.

9. The average score of Grade 11 students in Filemon T. Lizan Senior High in
Statistics and Probability during the Diagnostic Test is at most 45 out of 50 items.
10. The mean number of cars a person owns in his/ her lifetime is not more than ten.

I Can Do These!

DIRECTIONS: Identify whether the test of hypothesis to be performed is one-tailed or
two-tailed.
1. The average time to commute from home to school is 32.8 minutes.
2. The average number of vehicles passing through NLEX daily is less than 21 000.
3. The average daily number of customers in a convenience store is less than 1 025.
4. The mean content of citric acid in a bottle of juice drinks is greater than 2 ml.
5. The average typing speed of a secretary is 23.8 words per minute.
6. H0: µ = 12    Ha: µ ≠ 12
7. H0: µ ≤ 10    Ha: µ > 10
8. H0: µ ≥ 12    Ha: µ < 12
9. H0: µ ≤ 12    Ha: µ > 12
10. H0: µ = 10   Ha: µ ≠ 10

What’s My Decision?

Directions: Decide whether the null hypothesis is to be accepted or rejected, given the
test value and the critical value of test statistic.
      Hypotheses                    Test Value   Critical Value   Decision
1.    H0: µ = 150, Ha: µ ≠ 150         2.35         ±1.96         __________________
2.    H0: µ = 150, Ha: µ ≠ 150        −1.34         ±1.96         __________________
3.    H0: µ = 150, Ha: µ ≠ 150         1.97         ±1.96         __________________
4.    H0: µ = 150, Ha: µ ≠ 150        −2.02         ±1.96         __________________
5.    H0: µ = 150, Ha: µ ≠ 150        −1.99         ±1.96         __________________
6.    H0: µ ≤ 150, Ha: µ > 150         1.56          1.65         __________________
7.    H0: µ ≤ 150, Ha: µ > 150         1.28          1.65         __________________
8.    H0: µ ≥ 150, Ha: µ < 150        −2.55         −2.53         __________________
9.    H0: µ ≥ 150, Ha: µ < 150        −3.01         −2.53         __________________
10.   H0: µ ≤ 150, Ha: µ > 150         1.76          1.65         __________________

MODULE 2

This module was designed and written with you in mind. It is here to help you master
the random variable and probability distributions. The scope of this module permits it to be
used in many different learning situations. The language used recognizes the diverse
vocabulary level of students. The lessons are arranged to follow the standard sequence of
the course. But the order in which you read them can be changed to correspond with the
textbook you are now using.
The module covers the following lessons:
Lesson 1 – Hypothesis Testing About a Population Mean When the Variance is
Known
Lesson 2 – Hypothesis Testing About a Population Mean When the Variance is
Unknown
Learning Competencies:
✓ formulates the appropriate null and alternative hypotheses on a population mean.
(M11/12SP-IVb-1)
✓ identifies the appropriate form of the test-statistic when: (a) the population variance
is assumed to be known; (b) the population variance is assumed to be unknown; and
(c) the Central Limit Theorem is to be used. (M11/12SP-IVb-2)
After going through this module, you are expected to:
✓ differentiate traditional approach from probability value approach of hypothesis
testing
✓ determine whether a hypothesis test is non-directional or directional
✓ determine whether a directional test is left -tailed or right -tailed

Lesson 2: Hypothesis Testing About a Population Mean When the Variance is Known
We make decisions all the time, consciously or not. In studying statistics, making decisions
based on observations or data that are considered random variables is an essential concept to
learn. Such a decision-making procedure is called hypothesis testing.

One of the ultimate goals of every nation is to produce professionals who will contribute to
scientific knowledge through research. In research investigation, hypothesis testing is a vital
procedure. It is deciding whether to accept or reject a statement or assumption about
some parameter in a research problem. From the results of correct decision-making,
conclusions are drawn and facts are generated, which can become a contribution to the
body of knowledge in the fields of education, business, medicine, commerce, economics, and
many others.
In this lesson, we will study the terminology related to hypothesis testing, how to
calculate the probabilities of committing a Type I and a Type II error, hypothesis testing about
a population mean when the variance is known, hypothesis testing when the variance is
unknown, and hypothesis testing concerning proportions. Hypothesis testing, the focal
point of this lesson, brings to light the role of research in discovering new knowledge and
breakthroughs in different fields of discipline.

Example 1. The leader of the association of jeepney drivers claims that the average daily
take-home pay of all jeepney drivers in Navotas City is Php400.00. A random sample of 100
jeepney drivers in Navotas City was interviewed, and the average daily take-home pay of these
drivers was found to be Php425.00. Use a 0.05 significance level to find out if the average daily
take-home pay of all jeepney drivers in Navotas City is different from Php400.00. Assume
that the population variance is 8,464.
Solution :

A. By critical Value Method


Step 1. State the null and the alternative hypotheses.

𝐻𝑜: 𝜇 = 400
𝐻𝑎: 𝜇 ≠ 400
Step 2. Choose the level of significance: 𝛼 = 0.05.

Step 3. Compute the test statistic. Since it is the population mean that is being tested,
the population variance is known, and n > 30, the appropriate test statistic is the z-value.

\[ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]

Computation:

The standard deviation 𝜎 is the square root of the variance 𝜎². The square root of 8,464
is 92; hence 𝜎 = 92.

\[ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{425 - 400}{92 / \sqrt{100}} = \frac{25}{9.2} \approx 2.72 \]
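
The computation in Step 3 can be checked with a short Python sketch. The function name `z_statistic` is a hypothetical helper introduced only for this illustration; the numbers are the ones given in Example 1.

```python
import math


def z_statistic(sample_mean, pop_mean, pop_variance, n):
    """z = (x̄ − µ) / (σ / √n), with σ taken as the square root of the variance."""
    sigma = math.sqrt(pop_variance)        # 92 for a variance of 8,464
    standard_error = sigma / math.sqrt(n)  # 92 / 10 = 9.2
    return (sample_mean - pop_mean) / standard_error


z = z_statistic(sample_mean=425, pop_mean=400, pop_variance=8464, n=100)
print(round(z, 2))  # 2.72
```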

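Step 4. Determine the critical value.

The alternative hypothesis is non-directional; hence the two-tailed test shall be used.
Divide 𝛼 by 2, and then subtract the quotient from 0.5.

\[ \frac{\alpha}{2} = \frac{0.05}{2} = 0.025 \]

0.5 − 0.025 = 0.475 or 0.4750

In the Areas under the Standard Normal Curve table, the area 0.4750 corresponds to
z = 1.96. Hence, the critical values are ±1.96.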
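
Instead of reading a printed table, the two-tailed critical value can also be obtained from the inverse of the standard normal distribution. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper name `z_critical_two_tailed` is an assumption made for this example.

```python
from statistics import NormalDist


def z_critical_two_tailed(alpha):
    """Positive critical value z such that each tail has area alpha / 2."""
    return NormalDist().inv_cdf(1 - alpha / 2)


cv = z_critical_two_tailed(0.05)
print(round(cv, 2))  # 1.96, so the critical values are ±1.96
```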

Step 5. Draw a conclusion.

Because the computed test statistic, z = 2.72, falls within the rejection region (beyond the
critical value ±1.96), reject the null hypothesis and accept the alternative hypothesis.
Conclude that the average daily take-home pay of jeepney drivers is not equal to Php400.00.
This result is significant at the 𝛼 = 0.05 level.
B. By p-value Method
This method is gaining popularity because of statistical computer programs. Most
statistical computer programs use the p-value method. For deciding and drawing a
conclusion, the following rules are important.

a. If p-value ≤ 𝛼 , reject 𝐻𝑜 ,
b. If p-value > 𝛼, do not reject 𝐻𝑜
Solution
Step 1. State the null and the alternative hypotheses.

𝐻𝑜: 𝜇 = 400
𝐻𝑎: 𝜇 ≠ 400
Step 2. Choose the level of significance: 𝛼 = 0.05.

Step 3. Compute the test statistic. Since it is the population mean that is being tested,
the population variance is known, and n > 30, the appropriate test statistic is the z-value.

\[ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]

Computation:

The standard deviation 𝜎 is the square root of the variance 𝜎². The square root of 8,464
is 92; hence 𝜎 = 92.

\[ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{425 - 400}{92 / \sqrt{100}} = \frac{25}{9.2} \approx 2.72 \]

Step 4. Determine the p-value.

The computed test statistic is z = 2.72. Use the Areas under the Standard Normal Curve
table. In the first column under z, look at 2.7. Move to the right along this row until the
column headed 0.02 is reached. The value under this column head is 0.4967. Subtract 0.4967
from 0.5. Since this is a two-tailed test, double the result. Hence, 0.5 − 0.4967 = 0.0033. The
p-value = 2(0.0033) = 0.0066.
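
The table lookup above can be cross-checked in Python. The sketch below is only an illustration; the helper name `two_tailed_p_value` is hypothetical. Because software does not round the tail area to four decimal places before doubling it, the result is close to, but not exactly, the table-based 0.0066.

```python
from statistics import NormalDist


def two_tailed_p_value(z):
    """Area in both tails beyond |z| under the standard normal curve."""
    upper_tail = 1 - NormalDist().cdf(abs(z))  # about 0.0033 for z = 2.72
    return 2 * upper_tail


p = two_tailed_p_value(2.72)
print(round(p, 4))
print("Reject H0" if p <= 0.05 else "Do not reject H0")
```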

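[Figure: standard normal curve centered at 𝜇 = 400, with a non-rejection region in the middle
and a rejection region of area 𝛼/2 = 0.025 in each tail; the area beyond the test statistic in
each tail is p/2 = 0.0033.]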

Step 5. Draw a conclusion.

Since 0.0066 is less than 0.05, reject the null hypothesis and accept the alternative
hypothesis. Conclude that the average daily take home pay of jeepney drivers is not equal
to Php400.00. This result is significant at 𝛼=0.05 level.

Directions: Read each problem carefully. Choose the letter which corresponds
to the correct answer and write it in a separate sheet of paper.

1- 5. Find the appropriate rejection region in each case. (when the variance is
known)
1. 𝐻𝑎: 𝜇≠𝜇𝑜 , 𝛼 = 0.05.
a. z=−1.97 b. z=1.95 c. z=1.96 d. z=1.94
2. 𝐻𝑎: 𝜇>𝜇𝑜 , 𝛼 = 0.01.
a. z=+2.33 b. z=−2.33 c. z= +2.35 d. z= −2.35
3. A two tailed test at 10% level of significance.
a. The appropriate rejection is the area to the right of the critical value z=+1.645
and the area to the left of the critical value z=−1.645.
b. The appropriate rejection is the area to the right of the critical value z=−1.645
and the area to the left of the critical value z=+1.645.
c. The appropriate rejection is the area to the right of the critical value z=+1.635
and the area to the left of the critical value z=-1.635.
d. The appropriate rejection is the area to the right of the critical value z=+1.545
and the area to the left of the critical value z=-1.545.
4. A two tailed test at 95% level of confidence.
a. The appropriate rejection is the area to the right of the critical value z=+1.645
and the area to the left of the critical value z=−1.645.
b. The appropriate rejection is the area to the right of the critical value z=+1.96 and
the area to the left of the critical value z=−1.96.
c. The appropriate rejection is the area to the right of the critical value z=+1.95 and
the area to the left of the critical value z=−1.95.

d. The appropriate rejection is the area to the right of the critical value z=+2.96
and the area to the left of the critical value z=-2.96
5. 𝐻𝑎: 𝜇<𝜇𝑜 , 𝛼 = 0.01.
a. z=2.33 b. −2.33 c. 2.45 d. -2.45

Test each of the following hypotheses using the given formula.

6. 𝐻𝑜: 𝜇 = 84 , 𝐻𝑎: 𝜇 ≠ 84. By using the critical value method.
Given: x̄ = 87, 𝜎 = 10, n = 35, 𝛼 = 0.05

a. The computed test statistic z=1.77 does not fall within the rejection region,
hence do not reject the null hypothesis.
b. The computed test statistic z=−1.77 does not fall within the rejection region,
hence do not reject the null hypothesis.
c. The computed test statistic z=1.77 falls within the rejection region, hence, do
not reject the null hypothesis.
d. The computed test statistic z=1.77 does not fall within the rejection region,
hence accept the null hypothesis.
7. 𝐻𝑜: 𝜇 = 84 , 𝐻𝑎: 𝜇 ≠ 84. By using the p- value method.
Given: x̄ = 87, 𝜎 = 10, n = 35, 𝛼 = 0.05

a. The p-value of 0.077 is more than 0.05, hence accept the null hypothesis.
b. The p-value of 0.077 is less than 0.05, hence do not accept the null hypothesis.
c. The p-value of 0.077 is more than 0.05, hence do not reject the null hypothesis.
d. The p-value of 0.077 is less than 0.05, hence accept the null hypothesis.
8. 𝐻𝑜: 𝜇 = 45 , 𝐻𝑎: 𝜇 < 45. By using the critical value method.
Given: x̄ = 40, 𝜎 = 12, n = 32, 𝛼 = 0.01

a. The computed test statistic z=−2.36 falls within the rejection region, hence reject
the null hypothesis.
b. The computed test statistic z=2.36 does not fall within the rejection region, hence
reject the null hypothesis.
c. The computed test statistic z=2.36 does not fall within the rejection region, hence,
accept the null hypothesis.
d. The computed test statistic z=-2.36 falls within the rejection region, hence,
accept the null hypothesis.
9. 𝐻𝑜: 𝜇 = 45 , 𝐻𝑎: 𝜇 < 45. By using the P- value method.
Given: x̄ = 40, 𝜎 = 12, n = 32, 𝛼 = 0.01

a. The p-value of 0.0091 is less than −0.01, hence reject the null hypothesis.
b. The p-value of 0.0091 is more than 0.01, hence reject the null hypothesis.
c. The p-value of 0.0091 is less than 0.01, hence accept the null hypothesis.
d. The p-value of 0.0091 is less than 0.01, hence reject the null hypothesis.

10- 13. Find the critical value of the following. (When variance is unknown)
10. A right -tailed test, 𝛼=0.05 ; df =24.
a. Critical value = +1.711    c. Critical value = 1.750
b. Critical value = 1.712     d. Critical value = −1.711
11. A left-tailed test, 𝛼 = 0.01; df = 14.
a. Critical value = 2.553     c. Critical value = −2.553
b. Critical value = −2.624    d. Critical value = 2.624
12. A two-tailed test, 𝛼 = 0.01; df = 18.
a. Critical value = ±1.734    c. Critical value = −1.734
b. Critical value = +2.878    d. Critical value = −2.878
13. A two-tailed test, 𝛼 = 0.05; df = 16.
a. Critical value = +2.120    c. Critical value = ±2.120
b. Critical value = −2.120    d. Critical value = 2.120

MODULE 3

This module was designed and written with you in mind. It is here to help you master
the tests of hypothesis. The scope of this module permits it to be used in many different
learning situations. The language used recognizes the diverse vocabulary level of
students. The lessons are arranged to follow the standard sequence of the course. But
the order in which you read them can be changed to correspond with the textbook you
are now using.
The module consists of a lesson, namely:
Lesson 1 – Comparing the Sample Mean and the Population Mean in a Large
Sample Size

After going through this module, you are expected to:


➢ Identify the appropriate rejection region for a given level of significance when:
(a) the population variance is assumed to be known; (b) the population variance is
assumed to be unknown; and (c) the Central Limit Theorem is to be used.

Lesson 3: Comparing the Sample Mean and the Population Mean in a Large Sample Size
In this lesson, we shall learn how to determine if a significant difference exists between a
sample mean and a population mean using the z-test of one sample mean. By a significant
difference, we mean that the difference is statistically significant.
To find out if you are ready to learn this new lesson, do the following activity before going
through this lesson.

Let us explore.
Example 1.

A new drug on the market is claimed by its manufacturers to reduce overweight women by
4.55 kg per month with a standard deviation of 0.91 kg. Ten women chosen at random have
reported losing an average of 4.05 kg within a month. Does this data support the claim of
the manufacturer at 0.05 level of significance?

So that you can easily understand how to test a hypothesis, a simplified approach to testing a
hypothesis is presented below. Study it carefully and follow it.

I. Problem:                   Is the claim true that the drug reduces overweight
                              women by 4.55 kg per month with a standard
                              deviation (σ) of 0.91 kg?
II. Hypotheses:               Ho: The average weight loss per month using a new
                              drug is equal to 4.55 kg (µ = 4.55).
                              Ha: The average weight loss per month using a new
                              drug is not equal to 4.55 kg (µ ≠ 4.55).
III. Level of Significance:   α = 0.05
     Critical value (cv):     c.v. = ±2.262
IV. Statistics:               t-test for two-tailed test or non-directional test

Example 2: The ABC company claims that the average lifetime of a certain tire is at least
28 000 km. To check the claim, a taxi company puts 40 of these tires on its taxis and gets a mean
lifetime of 25 560 km. With a standard deviation of 1 350 km, is the claim true? Use the
z-test at the 0.05 level of significance.

I. Problem:                   Is the claim true that the average lifetime of a
                              certain tire is at least 28 000 km?
II. Hypotheses:               Ho: The average lifetime of a certain tire is at
                              least 28 000 km (Ho: µ ≥ 28 000).
                              Ha: The average lifetime of a certain tire is less
                              than 28 000 km (Ha: µ < 28 000).
                              Since the claim says that the lifetime of a certain tire
                              is at least 28 000 km, the alternative hypothesis is
                              Ha: µ < 28 000.
III. Level of Significance:   α = 0.05
     Critical value (cv):     z = −1.645
IV. Statistics:               z-test for one-tailed test

It's your turn.

Determine the decision for each of the following, given the computed and
critical values of z.
1. z computed = 1.82     z critical = 1.96
2. z computed = 2.54     z critical = 2.33
3. z computed = 1.02     z critical = 2.33
4. z computed = 2.54     z critical = 2.33
5. z computed = 2.54     z critical = 2.33
Determine the decision for each of the following, given the computed z.
Note: Determine first the critical value using the confidence level or α.

6. z computed = 1.29     confidence level = 90%     two-tailed
7. z computed = 1.87     α = 0.05                   one-tailed
8. z computed = 1.11     confidence level = 90%     one-tailed
9. z computed = 3.11     α = 0.01                   two-tailed
10. z computed = 1.34    confidence level = 95%     one-tailed

Directions: Read and understand each problem carefully, then solve the following:

A sociologist believes that it costs more than Php 90 000 to raise a child from birth to
age one. A random sample of 49 families, each with a child is selected to see if this
figure is correct. The average expenses for these families reveal a mean of Php 92 000
with a standard deviation of Php 4 500. Based on these sample data, can it be
concluded that the sociologist is correct in his claim? Use the 0.05 level of significance.
I. Problem
II. Hypotheses:
III. Level of Significance: Critical value (cv)
IV. Statistics
Rejection Region:
Compute the test value, using the test statistics
V. Decision Rule:
VI. Conclusion:

A printer manufacturing company claims that its new ink-efficient printer can print
an average of 1500 pages of word documents with a standard deviation of 60. Thirty-
five (35) of these printers showed a mean of 1 475 pages. Does this support the
company's claim? Use the 95% confidence level.
I. Problem
II. Hypotheses:
III. Level of Significance: Critical value (cv)
IV. Statistics
Rejection Region:
Compute the test value, using the test statistics
V. Decision Rule:
VI. Conclusion:

MODULE 4

This module was designed and written with you in mind. It is here to help you master
the nature of Statistics and Probability. The scope of this module permits it to be
used in many different learning situations. The language used recognizes the diverse
vocabulary level of students. The lessons are arranged to follow the standard
sequence of the course. But the order in which you read them can be changed to
correspond with the textbook you are now using. This module targets the following
learning competencies:
1. Compute for the test statistic value (population mean) (M11/12SP-IVd-1).
2. Draw conclusion about the population mean based on the test-statistic value and
the rejection region (M11/12SP-IVd-2).
After going through this module, you are expected to:
• define tests of significance;
• compute the test statistic;
• find the p-value;
• compare the p-value with 𝛼; and
• draw a conclusion about the population mean based on the test-statistic value and
the rejection region.

Lesson 4: Tests of Significance

Once sample data have been collected, researchers use a tool to find out the
probability that a relationship exists between two variables in the sample. They
need to assess whether the relationship between the two variables really exists or
is just due to random chance. In this module, you will learn how to do it. You will
know how to compute the test statistic value (population mean) and draw a
conclusion about the population mean. The learning that you gained from the
previous modules will help you understand this lesson.

A. Computing for the Test-Statistic Value (Population Mean)

After formulating the null and alternative hypotheses, the next step is to
compute the test statistic. However, before doing the computation, you have to
identify first the appropriate significance test. Take note that the z test statistic
follows the standard normal distribution, where the mean is 0 and the standard deviation is 1.

1. Use z-tests when the population standard deviation σ is known.

This test statistic uses the formula:

\[ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]

where x̄ = sample mean
𝜇 = population mean
σ = population standard deviation
n = sample size

2. Use t-tests when the population standard deviation σ is unknown. Actually,
this statistical test is more commonly used than a z-test because in most
research cases, the population standard deviation σ is not known.

This test statistic uses the formula:

\[ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} \]

where x̄ = sample mean
𝜇 = population mean
s = sample standard deviation
n = sample size
To summarize when to use a t-test or a z-test, use this diagram:

In the past, statisticians used a z-test when n ≥ 30 and used a t-test when n < 30.
That is because they assume that a distribution is normally distributed when the sample
size is large enough. However, there is no need to do it nowadays. We can now use a t-test
even if the sample size is greater than or equal to 30. Even the statistical packages now use a
t-test for large sample sizes. This is because as the sample size increases, t gets closer to z.
Meaning, you do not lose anything when you use a t-test. The main point now is this: if the
population standard deviation (σ) is unknown, use a t-test regardless of the sample size.
Meaning, the use of a z-test or a t-test is not related to n. So, whenever you use a sample
standard deviation (s) to compute the standard error as an estimate for a population
standard deviation (σ), use a t-statistic.
Example 1: Compute the test statistic using the following data:
x̄ = 85, 𝜇 = 84, 𝜎 = 5, 𝑛 = 60

Steps                                      Solution
1. Identify the appropriate statistical    Since the population standard deviation
   test.                                   𝜎 is known, use the z-test.
2. Compute using the formula for the
   z statistic:

\[ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{85 - 84}{5 / \sqrt{60}} = \frac{1}{5 / 7.7460} = \frac{1}{0.6455} \approx 1.55 \]

Level of significance = 0.05

A = 0.5 − 0.4394 = 0.0606 or 6.06%

p-value = 0.0606 > 0.05

Fail to reject the Ho
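
The z statistic and one-tailed p-value of Example 1 can be reproduced with the short Python sketch below. It is only a check of the arithmetic under the assumption, as in the example, that the test is right-tailed; the variable names are chosen for this illustration.

```python
import math
from statistics import NormalDist

x_bar, mu, sigma, n = 85, 84, 5, 60

# z = (x̄ − µ) / (σ / √n), rounded to two decimals as in the z-table.
z = round((x_bar - mu) / (sigma / math.sqrt(n)), 2)

# One-tailed p-value: area to the right of z under the standard normal curve.
p_value = 1 - NormalDist().cdf(z)

print(z)                  # 1.55
print(round(p_value, 4))
print("Reject H0" if p_value <= 0.05 else "Fail to reject H0")
```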


Example 2: Compute the test statistic using the following data:

x̄ = 130.05, 𝜇 = 120, 𝑠 = 9.96, 𝑛 = 20

Steps                                      Solution
1. Identify the appropriate statistical    Since the population standard deviation
   test.                                   𝜎 is unknown, use the t-test.
2. Compute using the formula for the
   t statistic:

\[ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{130.05 - 120}{9.96 / \sqrt{20}} = \frac{10.05}{9.96 / 4.4721} = \frac{10.05}{2.2271} \approx 4.512 \text{ or } 4.51 \]

t-value = 4.51, level of significance = 0.05

p-value = 0.00012 < 0.05

Reject the Ho
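
For readers who want to verify Example 2 without a statistics package, the sketch below computes the t statistic and approximates the one-tailed p-value by numerically integrating the t density with Simpson's rule. This is an illustration under stated assumptions: the helper names are hypothetical, the degrees of freedom are taken as n − 1 = 19, and the numerical integration is a stand-in for the software or t-table one would normally use.

```python
import math


def t_statistic(sample_mean, pop_mean, s, n):
    """t = (x̄ − µ) / (s / √n)."""
    return (sample_mean - pop_mean) / (s / math.sqrt(n))


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail_p(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule (steps must be even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_from_0_to_t = total * h / 3
    return 0.5 - area_from_0_to_t  # half of the curve lies to the right of 0


t = t_statistic(130.05, 120, 9.96, 20)
p = upper_tail_p(t, df=20 - 1)
print(round(t, 2))  # 4.51
print("Reject H0" if p <= 0.05 else "Fail to reject H0")
```

The approximated p-value is of the same order as the 0.00012 reported in the example, far below 0.05.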

B. The Probability-value Method (p-Value Method)

Recall that the null hypothesis (H0) is the claim that is being tested by a test statistic. You
assume this to be true until you have gathered enough evidence that it is not. Once you
have found the test statistic, the next step is to find the probability of getting this score
when H0 is true. This probability is known as the p-value. The p-value approach has become
prevalent in testing hypotheses because of the convenience brought to us by computers,
calculators, and statistics software.

A p-value helps you determine how likely the data are, assuming that H0 is true. For a
right-tailed test, it is the probability to the right of the test statistic; for a left-tailed
test, it is the probability to the left. If you are doing a two-tailed test, then
it is the combined probability in the lower-left and upper-right tails beyond the test statistic.
Note that it does not tell you the probability that H0 is true (because in the first place, you
assume this to be true before doing the test). This belief is one of the biggest misconceptions
about a p-value. Another thing is that having a good p-value (or low p-value) does not mean that your
conclusion is correct. It only tells you how strong your evidence is to reject the null
hypothesis. Also, always bear in mind that you do not accept a null hypothesis. You either
reject it or fail to reject it. This is what we are doing in hypothesis testing. We are
gathering evidence to reject the null hypothesis.

Now, how do we use the p-value in testing the hypothesis?

1. Select the level of significance (𝛼). This is the cutoff value for p, and you set
this before doing the hypothesis testing. The most commonly used levels of
significance are 0.01, 0.05 and 0.10.
2. Compute the p-value.
3. Compare the p-value with the significance level (𝛼) and draw a relevant
conclusion. If the p-value is less than or equal to the significance level 𝛼, then
the evidence is sufficient to reject the null hypothesis.
Interpretation

p-value              Interpretation
Less than .01        Highly statistically significant
                     There is very strong evidence against H0
.01 to .05           Statistically significant
                     Adequate evidence against H0
Greater than .05     Insufficient evidence against H0
Adapted from Statistics & Probability by R. Belecina et al, page 259

Decision Rule:
➢ Reject the null hypothesis when the p-value is equal to or smaller than alpha (𝛼).
(Reject H0 if p ≤ 𝛼)
➢ Do not reject the null hypothesis when the p-value is larger than alpha (𝛼).
(Do not reject H0 if p > 𝛼)

Example:

The owner of a company that sells a particular powdered juice claims that the average
weight content of their product is 100 g with a standard deviation of 5 g. However, a
group of students wants to test the claim for they believe that it is less than 100 g. So, they
get a sample of 50 packs of such powdered juice, measure the weight content, and then
find the mean weight to be 99 g. Is the claim of the company owner true?
Solution:

Steps                                        Solution
1. Formulate the null hypothesis and         H0: µ = 100 g (or H0: µ ≥ 100 g)
   the alternative hypothesis.               Ha: µ < 100 g
2. Statistical Test
   • Choose a significance level (𝛼).        • α = .05
   • Is the test one-tailed or two-tailed?   • one-tailed
   • What is the appropriate test            • z-test (note that σ is given)
     statistic?
3. Compute the test statistic and the
   p-value:

\[ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{99 - 100}{5 / \sqrt{50}} = \frac{-1}{5 / 7.0711} = \frac{-1}{0.7071} \approx -1.41 \]

   *The table area for z = 1.41 is .4207.

   p-value = .5000 − .4207 = .0793

4. Compare the p-value with the              p-value is .0793; 𝛼 is .05
   significance level.                       0.0793 > 0.05
5. Make a conclusion.                        Since p > .05, the group of students fails
                                             to reject the null hypothesis, and the
                                             result is not significant at 𝛼 = .05.

                                             In context, the group of students does
                                             not have enough evidence that the
                                             weight content of each pack is less than
                                             100 g.
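
The five steps of this example can be sketched together in Python. The function name `left_tailed_z_test` is a hypothetical helper for this illustration; it rounds z to two decimals to mirror the z-table lookup used in the solution.

```python
import math
from statistics import NormalDist


def left_tailed_z_test(sample_mean, pop_mean, sigma, n, alpha):
    """Left-tailed z-test using the p-value method (illustrative helper)."""
    z = round((sample_mean - pop_mean) / (sigma / math.sqrt(n)), 2)
    p_value = NormalDist().cdf(z)  # area to the left of z
    decision = "Reject H0" if p_value <= alpha else "Fail to reject H0"
    return z, p_value, decision


z, p_value, decision = left_tailed_z_test(
    sample_mean=99, pop_mean=100, sigma=5, n=50, alpha=0.05
)
print(z, round(p_value, 4), decision)
```

For a right-tailed test, the p-value would instead be the area to the right of z, that is, `1 - NormalDist().cdf(z)`.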
Note:
For a t statistic, it is better to use software or Excel to find the exact p-value. However, if you
need to find the p-value manually, you may use a t-table and approximate the probability.

Activity: Rejected or Not?

Directions: Complete the table by filling out the missing values. Then, draw a decision about
the population mean based on the test statistic value and the probability value. (Assume
that there is only one variable and that all the assumptions are met.)

      Significance   Test Statistic                    Decision (Reject the null hypothesis or
      Level          (one-tailed)          p-value     fail to reject the null hypothesis)

1     𝛼 = .05        z = 1.35
2     𝛼 = .10        z = -2.28
3     𝛼 = .01        z = -1.17
4     𝛼 = .05        z = 1.96
5     𝛼 = .05        z = 2.54
6     𝛼 = .01        t = 1.345; n = 15
7     𝛼 = .10        t = -1.19; n = 5
8     𝛼 = .05        t = 2.756; n = 30
9     𝛼 = .01        t = 3.25; n = 10
10    𝛼 = .01        t = -1.059; n = 25

Directions:
1. Compute the test statistic using the appropriate statistical test. (Write the test
statistic in three-digit form.)
2. Find the p-value.
3. Using the selected significance level, decide whether to reject the null hypothesis.

x̄ = 102, 𝜇 = 100, 𝜎 = 5, 𝑛 = 36, 𝛼 = .05, one-tailed test

x̄ = 48.95, 𝜇 = 50, 𝑠 = 5, 𝑛 = 25, 𝛼 = .05, one-tailed test

x̄ = 24.8, 𝜇 = 25, 𝑠 = 5, 𝑛 = 25, 𝛼 = .05, one-tailed test

MODULE 5

This module was designed and written with you in mind. It is here to help you learn about
hypothesis testing. The scope of this module permits it to be used in many different learning
situations. The language used recognizes the diverse vocabulary level of learners. The lessons
are arranged to follow the standard sequence of the course.
The module is divided into 2 lessons, namely:

● Lesson 1 – Testing Hypothesis involving Population Mean
● Lesson 2 – Testing Hypothesis involving Population Proportion

After going through this module, you are expected to:

1. Solve problems involving test of hypothesis on the population mean.
2. Formulate the appropriate null and alternative hypotheses on a
population proportion.
3. Solve problems involving test of hypothesis on a population proportion.

Lesson 5.1 Testing Hypothesis involving Population Mean

A hypothesis or claim about a population mean or a population proportion can be tested
using the five-step hypothesis testing procedure. In some situations, the data to be
analyzed involve a population proportion or percentage rather than a mean.

Solves problems involving test of hypothesis on the population mean.


Example:

The owner of the iPhone 12 Pro claims that their cellphone has a 2,185 mAh
battery with a standard deviation of 60 mAh. Forty-five (45) of these cellphones
showed a mean battery capacity of 2,160 mAh. Does this support the company’s claim?
Use a 95% confidence level.
Answer
Using the five-step hypothesis testing procedure:
1. Null Hypothesis (H0) and Alternative Hypothesis (Ha)

H0: μ = 2 185 Ha: μ ≠ 2 185

2. Statistical Test
Since n = 45 (n ≥ 30) and the standard deviation is given, the z-test is used.
Since Ha uses the not-equal sign (≠), the test is two-tailed.
Confidence level = 95%, so α = 0.05

Z-Critical = ±1.96
3. Computation

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
   Write the appropriate equation by looking at the standard deviation and sample size.

$$z = \frac{2160 - 2185}{60/\sqrt{45}}$$
   Substitute the given values.

$$z = \frac{-25}{8.944272}$$
   Perform the operation. Write at least 6 decimal places.

$$z = -2.795085 \approx -2.80$$
   The answer is 2.80 (the negative sign is disregarded since the test is two-tailed).

4. Decision (Reject or Not Reject the Ho)


Z-Computed Z-Critical

2.80 > 1.96 -------> H0 is Rejected


5. Conclusion

There is enough evidence to reject the owner’s claim.
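The same two-tailed test can be sketched in Python with the standard library. The variable names below are chosen for illustration only; the critical value is obtained from the inverse normal CDF instead of a z-table.

```python
from math import sqrt
from statistics import NormalDist

# Values from the example: sample mean 2160, claimed mean 2185, sigma 60, n 45.
sample_mean, claimed_mean, sigma, n = 2160, 2185, 60, 45
alpha = 0.05  # 95% confidence level

z = (sample_mean - claimed_mean) / (sigma / sqrt(n))

# Two-tailed critical value: alpha / 2 in each tail.
z_critical = NormalDist().inv_cdf(1 - alpha / 2)

# Two-tailed p-value: twice the area beyond |z|.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Reject H0 when |z| exceeds the critical value.
reject_h0 = abs(z) > z_critical
print(f"z = {z:.6f}, critical = +/-{z_critical:.2f}, "
      f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

Comparing |z| with the critical value and comparing the p-value with α always lead to the same decision; the sketch computes both so either approach from the module can be checked.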

Lesson 5.2 Testing Hypothesis involving Population Proportion
The population proportion can be estimated only for a large sample size (n ≥ 30). The
same is true in testing a claim or hypothesis about the population proportion (p).
For example, the IATF (the Inter-Agency Task Force on Emerging Infectious Diseases,
a task force organized by the executive of the Philippine government to respond to
affairs concerning emerging infectious diseases in the Philippines) is studying the
rapid growth of COVID-19 patients in a region to determine the proportion of female
COVID-19 patients. They do not need to collect the records of all the patients; they
only need a sufficient sample from which to make an inference about the proportion of
female COVID-19 patients.

In the example above, the IATF may initially believe that 50% of the patients are
female. Suppose they gather enough data: out of 100 records, 56 are female patients.
Would this support their initial belief?

To test a claim about a population proportion, we use the z-test for population
proportion. The formula below is used.

$$z = \frac{\hat{p} - p}{\sqrt{pq/n}}$$

Where:

p = claimed / hypothesized proportion
p̂ = sample proportion (p̂ = x/n)
q = 1 - p
n = sample size

As in the use of the z-test for means, the decision rule below is used
(the computed and critical values are compared in absolute value):

Z-computed ≥ Z-critical -----> Reject H0
Z-computed < Z-critical -----> Do not Reject / Accept H0
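The IATF question raised earlier in this lesson can be worked through with this formula and decision rule. The derivation below assumes a two-tailed test at α = 0.05 (Z-critical = 1.96); the text does not state the significance level or the direction of the test, so both are assumptions made only for illustration.

```latex
H_0: p = 0.50 \qquad H_a: p \neq 0.50 \quad \text{(assumed two-tailed)}

\hat{p} = \frac{x}{n} = \frac{56}{100} = 0.56, \qquad q = 1 - p = 1 - 0.50 = 0.50

z = \frac{\hat{p} - p}{\sqrt{pq/n}}
  = \frac{0.56 - 0.50}{\sqrt{(0.50)(0.50)/100}}
  = \frac{0.06}{0.05}
  = 1.20
```

Since 1.20 < 1.96, H0 is not rejected under these assumptions: the sample of 100 records is consistent with the initial belief that 50% of the patients are female.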

Example: Compute z for each of the following.

1. p = 0.30, p̂ = 0.40, n = 30
2. p = 0.87, p̂ = 0.81, n = 45

Answer:

1. Compute the value of q:
   q = 1 – p = 1 – 0.30 = 0.70
   Write the equation:
   $$z = \frac{\hat{p} - p}{\sqrt{pq/n}}$$
   Substitute the given values:
   $$z = \frac{0.40 - 0.30}{\sqrt{(0.30)(0.70)/30}}$$
   Perform the mathematical operations (GEMDAS). Write at least 6 decimal places:
   $$z = \frac{0.10}{0.083666} = 1.195229$$
   The answer is 1.20.

2. Compute the value of q:
   q = 1 – p = 1 – 0.87 = 0.13
   Write the equation:
   $$z = \frac{\hat{p} - p}{\sqrt{pq/n}}$$
   Substitute the given values:
   $$z = \frac{0.81 - 0.87}{\sqrt{(0.87)(0.13)/45}}$$
   Perform the mathematical operations (GEMDAS). Write at least 6 decimal places:
   $$z = \frac{-0.06}{0.050133} = -1.196813$$
   The answer is -1.20.

Perform the mathematical operations (GEMDAS). Write at least 6 decimal places:
$$z = \frac{0.033846}{0.028501} = 1.187537 \approx 1.19$$
The answer is 1.19.
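The first two computations above can be verified with a small helper. This is a minimal sketch; the function name `z_proportion` is chosen here for illustration and is not part of the module.

```python
from math import sqrt

def z_proportion(p_hat, p, n):
    """Return z = (p_hat - p) / sqrt(p * q / n), with q = 1 - p."""
    q = 1 - p
    return (p_hat - p) / sqrt(p * q / n)

# Item 1: p = 0.30, sample proportion 0.40, n = 30
z1 = z_proportion(0.40, 0.30, 30)

# Item 2: p = 0.87, sample proportion 0.81, n = 45
z2 = z_proportion(0.81, 0.87, 45)

print(f"z1 = {z1:.6f}")
print(f"z2 = {z2:.6f}")
```

Keeping full precision until the last step, as the module advises with "at least 6 decimal places", avoids rounding drift in the final two-decimal answer.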

Determine the decision for each of the following given.
Write R if Rejected, DNR if Do not Reject, the Null hypothesis.
1. Z-computed = 2.25     Z-critical = 2.87
2. Z-computed = 1.95     Z-critical = 2.50
3. Z-computed = 0.89     Z-critical = 0.89
4. Z-computed = 1.00     Z-critical = 3.00
5. T-computed = 0.27     T-critical = 3.00
6. T-computed = 1.56     T-critical = 1.97
7. T-computed = 2.34     T-critical = 2.43
8. T-computed = 1.23     T-critical = 2.13
9. Z-computed = 0.12     Confidence level = 90%, one-tailed
10. Z-computed = 1.97    Confidence level = 95%, two-tailed
11. Z-computed = 2.22    α = 0.01, one-tailed
12. T-computed = 1.11    Confidence level = 95%, two-tailed, n = 18
13. T-computed = 1.67    α = 0.1, one-tailed, n = 20
14. T-computed = 1.67    α = 0.1, two-tailed, n = 20
15. T-computed = 2.50    α = 0.05, one-tailed, n = 1

Answer the given questions.

1. In a recent survey, a researcher claims that the average life of a dog in a certain
country is 10 years. Is the claim correct if a random sample of 30 deaths from this
country showed a mean of 13 years with a standard deviation of 1.2 years? Use a 95%
confidence level.
2. Ms. Pelaez, an English teacher, believes that less than 15% of the students like
English. If 20 out of 55 randomly selected students like English, is the teacher’s claim
valid? Use a 95% confidence level.

MODULE 6

This module was designed and written with you in mind. It is here to help you master the
nature of Statistics and Probability. The scope of this module permits it to be used in many
different learning situations. The language used recognizes the diverse vocabulary level of
students. The lessons are arranged to follow the standard sequence of the course. But the
order in which you read them can be changed to correspond with the textbook you are
now using. The module consists of one lesson which contains sub lessons:

• Lesson 6 – Comparing Sample Proportion and Population Proportion

Lesson 6 Comparing Sample Proportion and Population Proportion

In daily life, we make tentative explanations of facts about a particular phenomenon by
formulating hypotheses. A hypothesis may be correct or incorrect, depending on the
evidence that we can gather to support it. We usually use a sample to gather the
information and evidence that we need to validate our hypothesis. The data that we
gather from this sample become the basis of our decision whether to accept or reject
our hypothesis regarding the entire population. The data obtained from the sample are
analyzed with an appropriate statistical procedure to find out whether our hypothesis
should be accepted. This process is called hypothesis testing. In some instances, what
we want to compare are proportions. In this lesson, we shall learn how to determine
whether a proportion from a sample differs significantly from a population proportion,
and how to conduct a test of hypothesis that will help us arrive at the right decision.

To compare a sample proportion and a population proportion, we use the z-test for
one-sample proportion. The test statistic for this test is

$$z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}} \quad\text{or}\quad z = \frac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}}$$

where:
p̂ = sample proportion, p̂ = x/n
po = population proportion (p = po and q = 1 - p in the second form)
x = number of successes
n = size of the sample

Example 3
It has been claimed that less than 60% of all purchasers of a certain kind of computer
program will call the manufacturer’s hotline within one month of purchase. If 55 out of 100
software purchasers selected at random call the hotline within a month of purchase, test
the claim at the 0.05 level of significance.

Solution

Step 1: Formulate the null and alternative hypotheses

Ho: The proportion of purchasers that will call the manufacturer’s
hotline within one month of purchase is 60% (0.60) or more
( po ≥ 0.60 )

Ha: The proportion of purchasers that will call the manufacturer’s
hotline within one month of purchase is less than 60% or 0.60
( po < 0.60 )

Step 2: Type of test: The test is one-tailed (left-tailed).

Critical value: using the z-table, the critical value of z for a
one-tailed test at the 0.05 level is z = -1.65.

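Step 3: Compute the test value.

Given:
po = 0.60
n = 100
p̂ = 55/100 = 0.55

Substitute the given values in the formula below.

$$z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}}$$

$$z = \frac{0.55 - 0.60}{\sqrt{\dfrac{0.60(1 - 0.60)}{100}}}$$

$$z = \frac{-0.05}{\sqrt{\dfrac{0.60(0.40)}{100}}} = \frac{-0.05}{0.048990} \approx -1.02$$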

Step 4: Decision:
Fail to reject / Accept the null hypothesis because the computed test value
(z ≈ -1.02) is greater than the critical value (-1.65) and therefore falls
outside the rejection region.

Step 5: Conclusion:
There is not sufficient evidence to conclude that the proportion of purchasers
that will call the manufacturer’s hotline within one month of purchase is less than
60%. Thus, the claim is not supported by the sample.
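The whole of Example 3 can be reproduced with a few lines of Python. This is a sketch using only the standard library, with variable names chosen for illustration; it takes the critical value from the inverse normal CDF (about -1.645), which the z-table in the solution rounds to -1.65.

```python
from math import sqrt
from statistics import NormalDist

x, n = 55, 100   # purchasers who called the hotline, sample size
p0 = 0.60        # claimed population proportion
alpha = 0.05     # level of significance

p_hat = x / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Left-tailed test: the rejection region lies below the critical value.
z_critical = NormalDist().inv_cdf(alpha)

# Left-tailed p-value: area to the left of the computed z.
p_value = NormalDist().cdf(z)

reject_h0 = z < z_critical
print(f"p_hat = {p_hat:.2f}, z = {z:.4f}, critical = {z_critical:.3f}, "
      f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

Because the computed z is closer to zero than the critical value, it lies outside the left-tail rejection region, which matches the decision in Step 4.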

Solve Me Please!

Directions: Analyze and solve the given problem below.

A doctor claims that only 10% of all patients exposed to a certain amount of
radiation will feel ill effects. If in a random sample, 5 of 18 patients exposed to
such radiation feel some ill effects, test the doctor’s claim at 0.01 level of
significance.
1. Formulate the null and alternative hypotheses
Ho:
Ha:
2. Type of test:
3. Compute the test value.
4. Decision:
5. Conclusion:

What’s My Decision?

Directions: Find the critical value, identify the type of test, draw the rejection region,
compute the value of the test statistic, and decide whether to reject or fail to reject
the null hypothesis in each of the following situations.

Hypotheses                 Rejection Region           Decision

1.
H0: po ≤ 0.58
Ha: po > 0.58
Given:
po = 0.58
x = 80
n = 120
α = 0.05
Critical value:
Test statistic:
Type of test:

2.
H0: po ≥ 0.80
Ha: po < 0.80
Given:
po = 0.80
x = 140
n = 200
α = 0.01
Critical value:
Test statistic:
Type of test:

Directions: Choose the letter of the correct answer. Write your answer on a
separate sheet of paper.
1. It is a numerical quantity that is assigned to the outcome of an experiment.
A. Random variable C. Sample space
B. Sample point D. Variable
2. In how many ways can two coins fall?
A. 2 B. 4 C. 6 D. 8
3. It tells the distance of score from the mean measured in standard deviation
units.
A. normal curve C. z-score
B. sample mean D. area
4. Which of the following shows the probability that the z-score lies above a z-score
value?
A. P(a < z < b) B. P(z > a) C. P(z < a) D. P(a = z)
5. What is the proportion of the area to the right of z = -1?
A. -0.3413 B. -0.8413 C. 0.3413 D. 0.8413

6. Statement 1: The number of students who are present in Filemon T. Lizan SHS
for the first day of class for the S.Y. 2020-2021
Statement 2: The number of Mayors in NCR who are present during the meeting
Which of the following is CORRECT?
A. Both statements are Discrete Random Variables
B. Both statements are Continuous Random Variables
C. Statement 1 is a Discrete Random Variable while Statement 2 is a
Continuous Random Variable
D. Statement 1 is a Continuous Random Variable while Statement 2 is a
Continuous Random Variable
7. Statement 1: The volume of soft drinks in a 12-ounce can
Statement 2: The time required to perform a job
Which of the following is CORRECT?
A. Both statements are Discrete Random Variables
B. Both statements are Continuous Random Variables
C. Statement 1 is a Discrete Random Variable while Statement 2 is a
Continuous Random Variable
D. Statement 1 is a Continuous Random Variable while Statement 2 is a
Continuous Random Variable
8. Let B be the number of boys and G the number of girls in a family of four children.
Determine the values of the random variable B.
A. 0, 1 B. 0, 1, 2 C. 0, 1, 2, 3 D. 0, 1, 2, 3, 4
For numbers 9 – 10, consider the probability distribution of the number of
mangoes given below.
R      3     2     1     0
P(R)   1/8   3/8   3/8   1/8
9. Find P(R = 3)
A. 1/8 B. 5/8 C. 3/8 D. 1
10. Find P(R > 1)
A. 1/8 B. 3/8 C. 1/2 D. 1

References
Ocampo, J. M., & Marquez, W. G. (2006). Conceptual Math & Beyond. Brilliant Creations
Publishing, Inc.
Gabuyo, Y. A., & Cardenas, M. C. (2016). Statistics and Probability. The Inteligente Publishing,
Inc.
Belecina, R. R., Baccay, E. S., & Mateo, E. B. (2016). Statistics and Probability. Quezon City:
Rex Book Store, Inc.
Chegg Study. (n.d.). Normal curve [Image]. Retrieved from
https://www.chegg.com/homework-help/definitions/normal-curve-31. Copyright 2003-2020.
Crawford, J. (n.d.). Standard normal table [Image]. Retrieved from
https://faculty.tarleton.edu/m/crawford/documents/NormalTable.png. Copyright 2017.
Glen, S. (n.d.). Find the area under a normal curve [Image]. From StatisticsHowTo.com:
Elementary Statistics for the rest of us! Retrieved from
https://www.statisticshowto.com/probability-and-statistics/normal-distributions/find-the-area-under-a-normal-curve/
