Statistics and Probability 2nd Quarter

Uploaded by

Heart Gulla
Copyright
© © All Rights Reserved
Statistics and

Probability
Quarter 2 – Module 1:
Testing Hypothesis

Republic Act 8293, Section 176 states that: No copyright shall subsist in any work
of the Government of the Philippines. However, prior approval of the government agency or
office wherein the work is created shall be necessary for exploitation of such work for profit.
Such agency or office may, among other things, impose as a condition the payment of
royalties.
Borrowed materials (i.e., songs, stories, poems, pictures, photos, brand names,
trademarks, etc.) included in this module are owned by their respective copyright holders.
Every effort has been exerted to locate and seek permission to use these materials from
their respective copyright owners. The publisher and authors do not represent nor claim
ownership over them.
Published by the Department of Education
Secretary: Leonor Magtolis Briones
Undersecretary: Diosdado M. San Antonio

Development Team of the Module


Writer: Josephine B. Ramos
Editors: Gilberto M. Delfina, Josephine P. De Castro, and Pelagia L. Manalang
Reviewers: Josephine V. Cabulong, Nenita N. De Leon, and Tesalonica C. Abesamis
Illustrator: Jeewel C. Cabriga
Layout Artist: Edna E. Eclavea
Management Team: Wilfredo E. Cabral, Regional Director
Job S. Zape Jr., CLMD Chief
Elaine T. Balaogan, Regional ADM Coordinator
Fe M. Ong-ongowan, Regional Librarian
Aniano M. Ogayon, Schools Division Superintendent
Maylani L. Galicia, Assistant Schools Division Superintendent
Randy D. Punzalan, Assistant Schools Division Superintendent
Imelda C. Raymundo, CID Chief
Generosa F. Zubieta, EPS In-charge of LRMS
Pelagia L. Manalang, EPS

Printed in the Philippines by ________________________


Department of Education – Region IV-A CALABARZON
Office Address: Gate 2 Karangalan Village, Barangay San Isidro
Cainta, Rizal 1800
Telefax: 02-8682-5773/8684-4914/8647-7487
E-mail Address: [email protected]

Introductory Message
For the facilitator:

Welcome to the Statistics and Probability for Senior High School Alternative
Delivery Mode (ADM) Module.

This module was collaboratively designed, developed, and reviewed by
educators both from public and private institutions to assist you, the
teacher or the facilitator, in helping the learners meet the standards set by
the K to 12 Curriculum while overcoming their personal, social, and
economic constraints in schooling.

This learning resource hopes to engage the learners in guided and
independent learning activities at their own pace and time. Furthermore,
this also aims to help learners acquire the needed 21st century skills while
taking into consideration their needs and circumstances.

In addition to the material in the main text, you will also see this box in the
body of the module:

Notes to the Teacher


This contains helpful tips or strategies
that will help you in guiding the
learners.

As a facilitator, you are expected to orient the learners on how to use this
module. You also need to keep track of the learners' progress while allowing
them to manage their own learning. Furthermore, you are expected to
encourage and assist the learners as they do the tasks included in the
module.

For the learner:

Welcome to the Statistics and Probability for Senior High School Alternative
Delivery Mode (ADM) Module.

The hand is one of the most symbolical parts of the human body. It is often
used to depict skill, action, and purpose. Through our hands, we may learn,
create, and accomplish. Hence, the hand in this learning resource signifies
that as a learner, you are capable and empowered to successfully achieve
the relevant competencies and skills at your own pace and time. Your
academic success lies in your own hands!

This module was designed to provide you with fun and meaningful
opportunities for guided and independent learning at your own pace and
time. You will be enabled to process the contents of the learning resource
while being an active learner.

This module has the following parts and corresponding icons:

What I Need to Know: This will give you an idea on the skills or
competencies you are expected to learn in the module.

What I Know: This part includes an activity that aims to check what you
already know about the lesson to take. If you get all the answers correct
(100%), you may decide to skip this module.

What’s In: This is a brief drill or review to help you link the current
lesson with the previous one.

What’s New: In this portion, the new lesson will be introduced to you in
various ways such as a story, a song, a poem, a problem opener, an
activity, or a situation.

What Is It: This section provides a brief discussion of the lesson. This
aims to help you discover and understand new concepts and skills.

What’s More: This comprises activities for independent practice to solidify
your understanding and skills of the topic. You may check the answers to
the exercises using the Answer Key at the end of the module.

What I Have Learned: This includes questions or blank sentences/paragraphs
to be filled in to process what you learned from the lesson.

What I Can Do: This section provides an activity which will help you
transfer your new knowledge or skill into real life situations or concerns.

Assessment: This is a task which aims to evaluate your level of mastery in
achieving the learning competency.

Additional Activities: In this portion, another activity will be given to
you to enrich your knowledge or skill of the lesson learned. This also aims
for retention of learned concepts.

At the end of this module, you will also find:

References: This is a list of all sources used in developing this module.

The following are some reminders in using this module:
1. Use the module with care. Do not put unnecessary mark/s on any part of
the module. Use a separate sheet of paper in answering the exercises.
2. Don’t forget to answer What I Know before moving on to the other
activities included in the module.
3. Read the instructions carefully before doing each task.
4. Observe honesty and integrity in doing the tasks and checking your
answers.
5. Finish the task at hand before proceeding to the next.
6. Return this module to your teacher/facilitator once you are through with
it.

If you encounter any difficulty in answering the tasks in this module, do not
hesitate to consult your teacher or facilitator. Always bear in mind that you
are not alone.
We hope that through this material, you will experience meaningful learning
and gain deep understanding of the relevant competencies. You can do it!

What I Need to Know

Hypothesis testing can allow us to measure data in samples to learn more
about the data in populations that are often too large or inaccessible. We
can measure a sample mean to learn more about the mean in a population.
Here, we can either accept or reject our assumption using hypothesis
testing. This ADM module in hypothesis testing will help you study the
different concepts and steps in hypothesis testing as well as its application
in real-life situations.

After going through this module, you are expected to:


1. define and illustrate the null hypothesis, alternative hypothesis, level
of significance, rejection region, and types of errors in hypothesis
testing;
2. identify the rejection and non-rejection regions and the critical values;
and
3. differentiate Type I and Type II errors in claims and decisions.

Are you ready now to study hypothesis testing using your ADM module?
Good luck and may you find it helpful.

What I Know

Directions: Choose the best answer to the given questions or statements.
Write the letter of your answer on a separate sheet of paper.
1. It is a proposed explanation, assertion, or assumption about a population
parameter or about the distribution of a random variable.
A. Decision C. Probability
B. Statistics D. Hypothesis
2. What is the statistical method used in making decisions using
experimental data?
A. Simple analysis C. Hypothesis testing
B. Analytical testing D. Experimental testing
3. It is also the probability of committing an incorrect decision about the
null hypothesis.

A. Level of error C. Level of acceptance
B. Level of hypothesis D. Level of significance

4. Which of the following describes an alternative hypothesis using a two-tailed test?
A. 𝐻𝑎 : 𝜇 = 100 C. 𝐻𝑎 : 𝜇 > 100
B. 𝐻𝑎 : 𝜇 ≠ 100 D. 𝐻𝑎 : 𝜇 < 100

5. In a one-tailed test, for which of the critical values listed below will
the computed z of 2.313 fall in the acceptance region?
A. 1.383 C. 2.228
B. 1.533 D. 2.365
6. Which of the following would be an appropriate null hypothesis?
A. The mean of a sample is equal to 75.
B. The mean of a population is equal to 75.
C. The mean of a sample is not equal to 75.
D. The mean of a population is greater than 75.

7. When is a Type I error committed?


A. We reject a null hypothesis that is false.
B. We reject a null hypothesis that is true.
C. We fail to reject a null hypothesis that is true.
D. We fail to reject a null hypothesis that is false.

8. When is a Type II error committed?


A. We reject a null hypothesis that is true.
B. We reject a null hypothesis that is false.
C. We fail to reject a null hypothesis that is true.
D. We fail to reject a null hypothesis that is false.

9. Which of the following is a Type I error?


A. 𝐻0 is true; reject 𝐻0 . C. 𝐻0 is true; fail to reject 𝐻0 .
B. 𝐻0 is false; reject 𝐻0 . D. 𝐻0 is false; fail to reject 𝐻0 .

10. Which of the following describes an alternative hypothesis in a left-tailed test?
A. 𝐻𝑎 : 𝜇 > 100 B. 𝐻𝑎 : 𝜇 < 100 C. 𝐻𝑎 : 𝜇 = 100 D. 𝐻𝑎 : 𝜇 ≠ 100

11. Which of the following must be used as the level of significance if we want
a higher possibility of correct decision?
A. 1% B. 5% C. 10% D. 25%

12. Which of the following would be an appropriate alternative hypothesis for
a one-tailed test?
A. 𝐻𝑎 : 𝜇 < 100 B. 𝐻𝑎 : 𝜇 = 100 C. 𝐻𝑎 : 𝜇 ≥ 100 D. 𝐻𝑎 : 𝜇 ≤ 100

13. Using a left-tailed test, which of the following values of z falls in the
rejection region where the critical value is – 1.725?
A. – 1.700 B. – 1.715 C. – 1.724 D. – 1.728
14. If the computed z-value is 2.015 and the critical value is 1.833, which of
the following statements could be true?
A. It lies in the rejection region, 𝐻𝑜 must be rejected.
B. It lies in the rejection region, we failed to reject 𝐻𝑜 .
C. It lies in the non-rejection region, 𝐻𝑜 must be rejected.
D. It lies in the non-rejection region, we failed to reject 𝐻𝑜 .

15. If the computed z-value is – 1.290 and the critical value is – 2.571, which
of the following statements could be true?
A. It lies in the rejection region, 𝐻𝑜 must be rejected.
B. It lies in the rejection region, we failed to reject 𝐻𝑜 .
C. It lies in the non-rejection region, 𝐻𝑜 must be rejected.
D. It lies in the non-rejection region, we failed to reject 𝐻𝑜 .

Lesson

1 Testing Hypothesis

Have you ever asked yourself how you could decide whether to put up a
business and gain your expected profit? Or wondered if a judge in a trial
could have given a wrong decision in determining who is guilty? Or thought
about whether the average weights of classmates your age differ
significantly? Or imagined how a newly discovered medicine is tested for
human treatment?

This lesson will help you make sound decisions in dealing with these
situations.

What’s In

Where Am I Now?

Directions: Identify the region where each of the given values falls.

[Figure: bell-shaped curve divided into Region A, Region B, and Region C over
a horizontal axis marked from −3 to 3 in steps of 0.5]

1. 𝑡 = 1.95 ______________________________
2. 𝑡 = 0.15 ______________________________
3. 𝑡 = −1.45 ______________________________
4. 𝑡 = −2.4 ______________________________
5. 𝑡 = 2.73 ______________________________

Answer the following questions.


1. Are you familiar with the shape of the curve used in Activity 1?
2. What is the name of that curve?
3. In what type of distribution is this kind of curve used?
4. How were you able to locate in which region the given value falls?
5. What mathematical concepts did you apply in locating the region?

Notes to the Teacher


Check the student’s level of readiness for the next topic.
If she/he did not answer most of the items and the guide
questions, you may provide another review activity about
the normal curve.

What’s New

Keep Me Connected!
Directions: Analyze the situation below and answer the questions that follow.

According to a survey, internet users worldwide spend an average of
142 minutes per day on social media. Sofia conducts her own survey among
her friends to find out if their time spent on social media is
significantly higher than the global average.

Before her survey, she formulated the following claims:


Claim A: The average daily usage of social media of her friends is
the same as the global average usage.
Claim B: The average daily usage of social media of her friends is
higher than the global average usage.

The table shows Sofia’s friends and their respective time spent on social
media.

Friend’s Name     Minutes per Day Spent on Social Media
Allen             132
Bryan             148
Ellen             165
Jake              157
Mindie            120
Shamsi            144
Candice           136
Dory              160
Mitch             185
Mila              173

Answer the following questions:


1. What statistical data is/are needed to prove which of Sofia’s claims is
accepted or rejected?

2. What is the average daily usage of social media of her friends? Compare
it with the previous average usage.
3. Which of the two claims could probably be true? Why?
4. If Sofia computed the average daily internet usage of her friends to be
higher than the global survey, do you think it would be significantly
higher?
5. What is your idea of an average value being significantly higher than the
global average value?
6. What do you think is the difference between simple comparison of data
and hypothesis testing?

What Is It

Hypothesis testing is a statistical method applied in making decisions
using experimental data. Hypothesis testing is basically testing an
assumption that we make about a population.

A hypothesis is a proposed explanation, assertion, or assumption about a
population parameter or about the distribution of a random variable.

Here are examples of questions you can answer with a hypothesis test:
• Does the mean height of Grade 12 students differ from 66 inches?
• Do male and female Grade 7 and Grade 12 students differ in height
on average?
• Is the proportion of senior male students’ height significantly
higher than that of senior female students?

Key Terms and Concepts Used in Hypothesis Testing

The Null and Alternative Hypothesis

• The null hypothesis is an initial claim based on previous analyses,
which the researcher tries to disprove, reject, or nullify. It shows no
significant difference between two parameters. It is denoted by 𝐻𝑜 .
• The alternative hypothesis is contrary to the null hypothesis, which
shows that observations are the result of a real effect. It is denoted
by 𝐻𝑎 .

Note: You can think of the null hypothesis as the current value of the
population parameter, which you hope to disprove in favor of your
alternative hypothesis.

Take a look at this example.
The school record claims that the mean score in Math of the incoming
Grade 11 students is 81. The teacher wishes to find out if the claim is true.
She tests if there is a significant difference between the batch mean score
and the mean score of students in her class.

Solution:
Let 𝜇 be the population mean score and 𝑥̅ be the mean score of
students in her class.
You may select any of the following statements as your null and
alternative hypothesis as shown in Option 1 and Option 2.

Option 1:
𝐻𝑜 : The mean score of the incoming Grade 11 students is 81 or 𝜇 = 81.
𝐻𝑎 : The mean score of the incoming Grade 11 students is not 81 or 𝜇 ≠ 81.

Option 2:
𝐻𝑜 : The mean score of the incoming Grade 11 students has no significant
difference with the mean score of her students or 𝜇 = 𝑥̅ .
𝐻𝑎 : The mean score of the incoming Grade 11 students has a significant
difference with the mean score of her students or 𝜇 ≠ 𝑥̅ .

Now, it’s your turn!


Based on the first claim of Sofia in Activity 2 that “the average daily usage of
social media of her friends is the same as the global average usage”,
formulate two hypotheses about the global average usage (𝜇) and the average
usage of her friends (𝑥̅ ) on the blanks provided below.
𝐻𝑜 : _____________________________________________
𝐻𝑎 : _____________________________________________
You can verify your answer with your teacher and start working on the next
activity.
Here is another key term you should know!

Level of Significance
• The level of significance, denoted by alpha or 𝛂, refers to the degree of
significance at which we accept or reject the null hypothesis.
• 100% accuracy is not possible in accepting or rejecting a hypothesis.
• The significance level α is also the probability of making the wrong decision
when the null hypothesis is true.

Do you know that the most common levels of significance used are 1%, 5%,
or 10%?
Some statistics books provide tables of values for these levels of
significance.

Take a look at this example.
Maria uses a 5% level of significance in proving that there is no
significant change in the average number of enrollees in the 10 sections for
the last two years. It means that the chance that the null hypothesis (𝐻𝑜 )
would be rejected when it is true is 5%.

[Figure: normal curve over a horizontal axis from −3 to 3, with the rejection
region shaded and labeled 𝛼 = 0.05]

𝛼 = 0.05 is actually the area under the normal curve within the rejection
region.
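The idea that 𝛼 is an area under the curve can be checked numerically. The short sketch below assumes a right-tailed z-test (Maria's example does not state the tail, so this is only an assumption) and uses the `NormalDist` class from Python's standard `statistics` module. The names `alpha`, `z_critical`, and `tail_area` are illustrative only.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
standard_normal = NormalDist(mu=0, sigma=1)

alpha = 0.05  # level of significance

# Assumption: right-tailed test, so the whole area alpha sits in the right tail.
z_critical = standard_normal.inv_cdf(1 - alpha)

# Area under the curve to the right of the critical value.
tail_area = 1 - standard_normal.cdf(z_critical)

print(round(z_critical, 3))  # about 1.645
print(round(tail_area, 2))   # the tail area equals alpha
```

For a two-tailed test, the same area would be split into 𝛼/2 in each tail, so the critical values would move farther from zero.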

It’s your turn!


If Sofia used a 0.10 level of significance, what are the chances that she
would have a wrong conclusion if the two values have no significant
difference?

Here is another key term you should know!


Two-Tailed Test vs One-Tailed Test
• When the alternative hypothesis is two-sided, like 𝐻𝑎 : 𝜇 ≠ 𝜇0 , it is
called a two-tailed test.
• When the alternative hypothesis states that the parameter is less than or
greater than a given value, like 𝐻𝑎 : 𝜇 < 𝜇0 or 𝐻𝑎 : 𝜇 > 𝜇0 , it is called a
one-tailed test.

Here are some examples.


The school registrar believes that the average number of enrollees this
school year is not the same as the previous school year.
In the above situation,
let 𝜇0 be the average number of enrollees last year.
𝐻𝑜 : 𝜇 = 𝜇0
𝐻𝑎 : 𝜇 ≠ 𝜇0

If 𝐻𝑎 uses ≠, use a two-tailed test.

[Figure: curve with a shaded rejection region of area 𝛂/2 in each tail]

However, if the school registrar believes that the average number of enrollees
this school year is less than the previous school year, then you will have:
𝐻𝑜 : 𝜇 = 𝜇0
𝐻𝑎 : 𝜇 < 𝜇0

Use the left-tailed test when 𝐻𝑎 contains the symbol <.

On the other hand, if the school registrar believes that the average number
of enrollees this school year is greater than the previous school year, then
you will have:
𝐻𝑜 : 𝜇 = 𝜇0
𝐻𝑎 : 𝜇 > 𝜇0

Use the right-tailed test when 𝐻𝑎 contains the symbol >.
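The three rules in the registrar examples can be summarized in a few lines of code. This is only a sketch: the function name `tail_type` and the use of text symbols as input are assumptions made for illustration.

```python
def tail_type(symbol):
    """Return the kind of test implied by the comparison symbol in Ha."""
    kinds = {
        "≠": "two-tailed",    # Ha: mu ≠ mu0
        "<": "left-tailed",   # Ha: mu < mu0
        ">": "right-tailed",  # Ha: mu > mu0
    }
    if symbol not in kinds:
        raise ValueError("Ha must use one of the symbols ≠, < or >")
    return kinds[symbol]


# The registrar believes enrollment is NOT THE SAME as last year.
print(tail_type("≠"))  # two-tailed
# The registrar believes enrollment is LESS THAN last year.
print(tail_type("<"))  # left-tailed
```

Note that the symbol "=" is not accepted, because equality belongs to the null hypothesis, not to the alternative hypothesis.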

Now back to the two claims of Sofia, what do you think should be the type of
test in her following claims?
Claim A: The average daily usage of social media of her friends is
the same as the global average usage.

Claim B: The average daily usage of social media of her friends is
higher than the global average usage.

Here is the other concept!

Illustration of the Rejection Region
• The rejection region (or critical region) is the set of all values of the test
statistic that causes us to reject the null hypothesis.
• The non-rejection region (or acceptance region) is the set of all values of
the test statistic that causes us to fail to reject the null hypothesis.
• The critical value is a point (boundary) on the test distribution that is
compared to the test statistic to determine if the null hypothesis would
be rejected.

[Figure: curve over a horizontal axis from −3 to 3 showing the non-rejection
region, the rejection region, and the critical value that separates them]

Illustrative Example 1:

Now, let’s take a look at Sofia’s first claim. She assumed that the average
online usage of her friends is the same as the global usage (𝐻𝑜 ).

She computed for the t-value using the formula 𝑡 = (𝑥̅ − 𝜇) / (𝑠 / √𝑛), where
𝜇 = 142, 𝑥̅ = 152, 𝑠 = 19.855, and 𝑛 = 10. This t-test formula was discussed
in the last chapter.

𝑡 = (𝑥̅ − 𝜇) / (𝑠 / √𝑛)

𝑡 = (152 − 142) / (19.855 / √10)

𝑡 = 10 / 6.2787

𝑡 = 1.593

Use a scientific calculator to verify the computed t-value.
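If a computer is available, the same computation can be checked with a short program. The sketch below is one possible way to do it, using only Python's standard library. The list `minutes` holds the values from Sofia's table, and the variable names are illustrative.

```python
import math
import statistics

# Minutes per day spent on social media by Sofia's ten friends (from the table).
minutes = [132, 148, 165, 157, 120, 144, 136, 160, 185, 173]

mu = 142                          # global average usage (population mean)
n = len(minutes)                  # sample size
x_bar = statistics.mean(minutes)  # sample mean
s = statistics.stdev(minutes)     # sample standard deviation (divides by n - 1)

# t = (x_bar - mu) / (s / sqrt(n))
t = (x_bar - mu) / (s / math.sqrt(n))

print(x_bar, round(s, 3), round(t, 3))  # 152 19.855 1.593
```

The function `statistics.stdev` computes the sample standard deviation, which is the value 𝑠 used in the t-test formula.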

From the table of t-values, determine the critical value. Use df = n-1 = 9,
one-tailed test at 5% level of significance.
The critical t-value is 1.833.
How did we get that value?
Look at this illustration!

The table of t-values can be found at the last part of this module.
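The critical value 1.833 can also be checked without the table. The sketch below is not the method used in this module, which reads the value from the table of t-values. It approximates the area under the t distribution curve numerically (Simpson's rule) and searches for the value that leaves an area of 0.05 in the right tail. The function names `t_pdf`, `t_cdf`, and `t_critical` are hypothetical helpers written for this illustration.

```python
import math


def t_pdf(x, df):
    """Height of the t distribution curve at x for the given degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """Area under the t curve to the left of x (Simpson's rule plus symmetry)."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # area between 0 and |x|
    return 0.5 + area if x >= 0 else 0.5 - area


def t_critical(alpha, df):
    """Right-tailed critical value: the t with area alpha to its right (bisection)."""
    lo, hi = 0.0, 20.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, df) > alpha:
            lo = mid  # right-tail area still too large, move right
        else:
            hi = mid
    return (lo + hi) / 2


# df = n - 1 = 9, one-tailed test at the 5% level of significance
print(round(t_critical(0.05, 9), 3))  # 1.833
```

The printed value agrees with the critical t-value read from the table for df = 9 in a one-tailed test at 𝛼 = 0.05.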

Now, you can sketch a t distribution curve and label the rejection region
(shaded part), the non-rejection region, the critical value, and the
computed t-value. This is how your t distribution curve should look!

[Figure: t distribution curve over a horizontal axis from −3 to 3, with the
rejection region shaded to the right of the critical value 1.833 and the
computed value 1.593 inside the non-rejection region]

As you can see from your previous illustration, the computed t-value of
1.593 is at the left of the critical value 1.833. So, in which region do you
think the computed value falls?

The computed value is less than the critical value.

𝐻𝑜 : The average online usage of her friends is the same as the global usage.
𝐻𝑎 : The average online usage of her friends is higher than the global usage.

The computed t-value is at the non-rejection region. We fail to reject the
null hypothesis, 𝐻𝑜 .

Illustrative Example 2:
A medical trial is conducted to test whether or not a certain drug reduces
cholesterol level. Upon trial, the computed z-value of 2.715 lies in the
rejection area.

[Figure: curve over a horizontal axis from −3 to 3, with the computed z-value
of 2.715 marked in the rejection region]

The computed value is greater than the critical value.

𝐻𝑜 : The certain drug is effective in reducing cholesterol level by 60%.
𝐻𝑎 : The certain drug is not effective in reducing cholesterol level by 60%.

The computed z-value is at the rejection region. We reject the null
hypothesis, 𝐻𝑜 , in favour of 𝐻𝑎 .

Illustrative Example 3:
Sketch the rejection region of the test hypothesis with critical values of
±1.753 and determine if the computed t-value of –1.52 lies in that region.

Solution:

Draw a t-distribution curve. Since there are two critical values, it is a
two-tailed test. Locate the critical values and shade the rejection regions.

Now, locate the computed t-value of –1.52. You can clearly see that it is not
at the rejection region as shown in the following figure. The computed t-value
is at the non-rejection region. Therefore, we fail to reject the null hypothesis,
𝐻𝑜 .

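[Figure: t-distribution curve with rejection regions shaded beyond the
critical values –1.753 and 1.753; the computed t-value of –1.52 lies between
them, in the non-rejection region]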
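The comparisons made in Illustrative Examples 1 and 3 follow one rule, which can be sketched as a small function. The name `find_region` and the `tail` labels are assumptions made for illustration. The sketch also assumes that a test statistic exactly equal to the critical value is placed in the non-rejection region, a boundary case this module does not discuss.

```python
def find_region(test_statistic, critical_value, tail):
    """Tell whether the test statistic falls in the rejection region.

    tail is "left", "right", or "two". Assumption: a value exactly equal
    to the critical value is treated as NOT in the rejection region.
    """
    c = abs(critical_value)
    if tail == "right":
        reject = test_statistic > c
    elif tail == "left":
        reject = test_statistic < -c
    elif tail == "two":
        reject = abs(test_statistic) > c
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return "rejection region" if reject else "non-rejection region"


# Illustrative Example 1: computed t = 1.593, critical value 1.833, right-tailed
print(find_region(1.593, 1.833, "right"))  # non-rejection region
# Illustrative Example 3: computed t = -1.52, critical values ±1.753, two-tailed
print(find_region(-1.52, 1.753, "two"))    # non-rejection region
```

When the function returns "rejection region", the decision is to reject 𝐻𝑜 . Otherwise, we fail to reject 𝐻𝑜 .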

Type I and Type II Errors

• Rejecting the null hypothesis when it is true is called a Type I
error, with probability denoted by alpha (𝜶). In hypothesis testing,
the part of the normal curve that shows the critical region is called
the alpha region.
• Accepting the null hypothesis when it is false is called a Type II
error, with probability denoted by beta (𝛃). In hypothesis testing,
the part of the normal curve that shows the acceptance region is called
the beta region.
• For a fixed sample size, the larger the value of alpha, the smaller
the value of beta.
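The relationship between alpha and beta can be illustrated with numbers. The sketch below assumes a right-tailed z-test in which the true mean is 2 standard errors above the value stated in 𝐻𝑜 . The value `shift = 2.0` is purely hypothetical, and the function name `beta_for` is an illustrative helper. Under these assumptions, beta is the area to the left of the critical value under the curve where 𝐻𝑜 is false.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def beta_for(alpha, shift):
    """Probability of a Type II error for a right-tailed z-test.

    shift is the distance between the true mean and the mean under Ho,
    measured in standard errors (a hypothetical value in this sketch).
    """
    z_critical = standard_normal.inv_cdf(1 - alpha)
    # Fail to reject Ho when the test statistic is at or below z_critical;
    # when Ho is false, the test statistic is centered at `shift`.
    return standard_normal.cdf(z_critical - shift)


shift = 2.0  # hypothetical: true mean is 2 standard errors above the Ho value
betas = {alpha: beta_for(alpha, shift) for alpha in (0.01, 0.05, 0.10)}

for alpha, beta in betas.items():
    print(f"alpha = {alpha:.2f}  beta = {beta:.3f}")
```

In the printed list, beta becomes smaller as alpha becomes larger, which matches the last statement above.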

[Figure: curve for the region where 𝐻𝑜 is true, with the area α shaded]

This is the region of Type I error.
α = P [Type I error]
= P [𝐻𝑜 is true, Reject 𝐻𝑜 ]

[Figure: curve for the region where 𝐻𝑜 is false, with the area 𝜷 shaded]

This is the region of Type II error.
β = P [Type II error]
= P [𝐻𝑜 is false, Fail to reject 𝐻𝑜 ]

To summarize the difference between the Type I and Type II errors, take a
look at the table below.

Null Hypothesis 𝑯𝒐     Fail to Reject 𝑯𝒐                  Reject 𝑯𝒐

True                   Correct Decision                   Type I Error
                       - Failed to reject 𝐻𝑜 when         - Rejected 𝐻𝑜 when
                         it is true                         it is true

False                  Type II Error                      Correct Decision
                       - Failed to reject 𝐻𝑜 when         - Rejected 𝐻𝑜 when
                         it is false                        it is false
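The four cases in the summary table can also be written as a short function. This is only a sketch for checking your understanding. The function name `classify_outcome` and its two true/false inputs are assumptions made for illustration.

```python
def classify_outcome(null_is_true, rejected_null):
    """Name the outcome of a test, following the summary table."""
    if null_is_true and rejected_null:
        return "Type I Error"      # rejected Ho when it is true
    if not null_is_true and not rejected_null:
        return "Type II Error"     # failed to reject Ho when it is false
    return "Correct Decision"      # the two remaining cases


print(classify_outcome(null_is_true=True, rejected_null=True))    # Type I Error
print(classify_outcome(null_is_true=False, rejected_null=False))  # Type II Error
print(classify_outcome(null_is_true=True, rejected_null=False))   # Correct Decision
```

In practice, we do not know whether 𝐻𝑜 is really true, so the function only describes what each combination would be called.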

Now, complete the statements that follow.

Analyze the possibilities of Sofia’s conclusion. Identify if it is a Type I
Error, Type II Error, or a Correct Decision.
If Sofia finds out that her null hypothesis is …
1. true and she fails to reject it, then she commits a ____________________.
2. true and she rejects it, then she commits a _____________________.
3. false and she fails to reject it, then she commits a __________________.
4. false and she rejects it, then she commits a _____________________.
Your answers should be: 1) Correct Decision, 2) Type I Error, 3) Type II
Error, and 4) Correct Decision.

Illustrative Example:

Bryan is starting his own food cart


business and he is choosing cities where he
will run his business. He wants to survey
residents and test at 5% level of significance
whether or not the demand is high enough
to support his business before he applies for
the necessary permits to operate in his
selected city. He will only choose a city if
there is strong evidence that the demand
there is high enough. We can state the null
hypothesis for his test as:
𝐻𝑜 : The demand is high enough.

What would be the consequence of a Type I error in this setting?


_____ He doesn't choose a city where demand is actually high enough.
_____ He chooses a city where demand is actually high enough.
_____ He chooses a city where demand isn't actually high enough.

The Type I error is the first statement because he rejected the true
null hypothesis.

What would be the consequence of a Type II error in this setting?


_____ He doesn't choose a city where demand is actually high enough.
_____ He chooses a city where demand is actually high enough.
_____ He chooses a city where demand isn't actually high enough.

The Type II error is the third statement because he failed to
reject the false null hypothesis.

What is the probability of Type I error?


_____ 0.10 _____ 0.25 _____ 0.05 _____ 0.01
The probability of Type I error is 0.05 because it is the level of
significance used.

What’s More

Activity 1.1 Null Vs Alternative

Directions: State the null and the alternative hypotheses of the following
statements.
1. A medical trial is conducted to test whether or not a new medicine
reduces uric acid by 50%.
𝐻𝑜 : ____________________________________________________
𝐻𝑎 : ____________________________________________________
2. We want to test whether the general average of students in Math is
different from 80%.
𝐻𝑜 : ____________________________________________________
𝐻𝑎 : ____________________________________________________
3. We want to test whether the mean height of Grade 8 students is 58
inches.
𝐻𝑜 : ____________________________________________________
𝐻𝑎 : ____________________________________________________
4. We want to test if LPIHS students take more than four years to graduate
from high school, on the average.
𝐻𝑜 : ____________________________________________________
𝐻𝑎 : ____________________________________________________
5. We want to test if it takes less than 60 minutes to answer the quarterly
test in Calculus.
𝐻𝑜 : ____________________________________________________
𝐻𝑎 : ____________________________________________________
6. A medical test is conducted to determine whether or not a new vaccine
reduces the complications of dengue fever.
𝐻𝑜 : ____________________________________________________
𝐻𝑎 : ____________________________________________________
7. The enrolment in high school this school year increases by 10%.
𝐻𝑜 : ____________________________________________________
𝐻𝑎 : ____________________________________________________
8. The intelligence quotient of male grade 11 students is the same as the
female students.
𝐻𝑜 : ____________________________________________________
𝐻𝑎 : ____________________________________________________

9. The school wants to test if the students in Grade 7 prefer online distance
learning as the method of instruction.
𝐻𝑜 : ____________________________________________________
𝐻𝑎 : ____________________________________________________
10. The school librarian wants to find out if there was an increase in the
number of students accessing the school library.
𝐻𝑜 : ____________________________________________________
𝐻𝑎 : ____________________________________________________

Activity 1.2 The Tale of Tails

Directions: Determine if one-tailed test or two-tailed test fits the given
alternative hypothesis.

1. The mean height of Grade 12 students is less than 66 inches.


2. The standard deviation of their height is not equal to 5 inches.
3. Male Grade 7 and Grade 12 students differ in height on average.
4. The proportion of senior male students’ height is significantly higher than
that of senior female students.
5. The average grade of Grade 11 students in Statistics is lower than their
average grade in Calculus.
6. The newly found vaccine reduces the risks of viral infections of the
patients.
7. The enrolment in elementary schools is not the same as the enrolment in
the secondary schools.
8. Male adolescents have higher intelligence quotient level than the female
adolescents.
9. The average number of internet users this year is significantly higher
than last year.
10. Paracetamol and Ibuprofen have the same rate of time to reduce the
headache of the patients.

Activity 1.3 Are You In or Out?

Directions: Illustrate the rejection region given the critical value and
identify if the t-values lie in the non-rejection region or rejection
region.

1. critical t-value of 1.318


computed t-value of 1.1

The computed t-value is at the


__________ region.

2. critical t-value of −1.671
computed t-value of −2.45

The computed t-value is at the


__________ region.

3. critical t-value of 1.725


computed t-value of 2.14

The computed t-value is at the


__________ region.

4. critical t-value of ±1.311


computed t-value of −1.134

The computed t-value is at the


__________ region.

5. critical t-value of −1.701


computed t-value of −2.48

The computed t-value is at the


__________ region.

6. critical t-value of 2.12


computed t-value of 2.15

The computed t-value is at the


__________ region.

7. critical t-value of −2.306
computed t-value of −2.110

The computed t-value is at the


__________ region.

8. critical t-value of 2.228


computed t-value of 1.987

The computed t-value is at the


__________ region.

9. critical t-value of ±1.812


computed t-value of −1.915

The computed t-value is at the


__________ region.

10. critical t-value of −1.860


computed t-value of −2.3

The computed t-value is at the


__________ region.

Activity 1.4 Type I or Type II


Directions: Check the box that corresponds to your answer.
Situation 1:
A quality control expert wants to test the
null hypothesis that an imported solar
panel is an effective source of energy.

1. What would be the consequence of a Type I error in this context?
_____ They do not conclude that the solar panel is effective when it is
not actually effective.
_____ They do not conclude that the solar panel is effective when it is
actually effective.
_____ They conclude that the solar panel is effective when it is
actually effective.
_____ They conclude that the solar panel is effective when it is not
actually effective.

2. What would be the consequence of a Type II error?
_____ They do not conclude that the solar panel is effective when it is
not actually effective.
_____ They do not conclude that the solar panel is effective when it is
actually effective.
_____ They conclude that the solar panel is effective when it is
actually effective.
_____ They conclude that the solar panel is effective when it is not
actually effective.

Situation 2:
A resort owner does a daily water
quality test in their swimming pool. If
the level of contaminants is too high,
then he temporarily closes the pool to
perform a water treatment.
We can state the hypotheses for his
test as:
𝐻𝑜 : The water quality is acceptable.
𝐻𝑎 : The water quality is not acceptable.

3. What would be the consequence of a Type I error in this setting?


☐ The owner closes the pool when it needs to be closed.
☐ The owner does not close the pool when it needs to be closed.
☐ The owner closes the pool when it does not need to be closed.
☐ The owner does not close the pool when it does not need to be closed.

4. What would be the consequence of a Type II error in this setting?


☐ The owner closes the pool when it needs to be closed.
☐ The owner closes the pool when it does not need to be closed.
☐ The owner does not close the pool when it does not need to be closed.
☐ The owner does not close the pool when it needs to be closed.

5. In terms of safety, which error has more dangerous consequences in this
setting?

☐ Type I Error
☐ Type II Error
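
The two kinds of error in this activity depend on only two facts: whether the null hypothesis is actually true, and whether the test rejected it. A minimal Python sketch of that decision table follows. The function name `classify_decision` is an assumption made for this illustration and is not part of the module.

```python
def classify_decision(h0_is_true: bool, h0_rejected: bool) -> str:
    """Label the outcome of a hypothesis test decision."""
    if h0_is_true and h0_rejected:
        # A true null hypothesis was rejected.
        return "Type I error"
    if not h0_is_true and not h0_rejected:
        # A false null hypothesis was not rejected.
        return "Type II error"
    # Remaining cases: a true Ho kept, or a false Ho rejected.
    return "correct decision"


# Print all four combinations of truth and decision.
for truth in (True, False):
    for rejected in (True, False):
        print(f"Ho true={truth}, Ho rejected={rejected}: {classify_decision(truth, rejected)}")
```

To use it on a situation, first restate the situation as "Ho is true or false" and "Ho was rejected or not", then read off the label.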

What I Have Learned

Directions: Complete the following statements. Write the answers in your
notebook.
1. _________________________is a statistical method that is used in
making decisions using experimental data.
2. A ________________________ is a proposed explanation, assertion, or
assumption about a population parameter or about the distribution of
a random variable.
3. The null hypothesis is an initial claim which the researcher tries to
______________________________________.
4. The alternative hypothesis is contrary to the
______________________________________.
5. The level of significance is denoted by_______________________.
6. The significance level α is also the probability of making the wrong
decision when ____________________________________.
7. When the alternative hypothesis is two-sided, it is called
_____________________________.
8. When the given statistical hypothesis assumes a less than or greater
than value, it is called ______________________________.
9. The rejection region (or critical region) is the set of all values of the
test statistic that cause us to ________________________________.
10. Rejecting the null hypothesis when it is true results in what type of
error?

What I Can Do

Directions: Cite five (5) situations in your community where you can apply
hypothesis testing. Then choose one situation and:
1. create a problem statement;
2. formulate the null and alternative hypotheses;
3. select the level of significance and sketch the rejection region; and
4. state the possible Type I and Type II errors.

Assessment

Directions: Choose the best answer to the given questions or statements.
Write the letter of your answer on a separate sheet of paper.
1. It is the statistical method used in making decisions using experimental
data.
A. observation C. analytical testing
B. simple analysis D. hypothesis testing

2. What term describes a proposed explanation, assertion, or
assumption about a population parameter or about the distribution of a
random variable?
A. statistic B. decision C. hypothesis D. probability

3. It refers to the probability of making an incorrect decision
about the null hypothesis.
A. level of error C. level of acceptance
B. level of hypothesis D. level of significance

4. Which of the following would be an appropriate null hypothesis?


A. The mean of a sample is equal to 80.
B. The mean of a population is equal to 80.
C. The mean of a population is not equal to 80.
D. The mean of a population is greater than 80.

5. Which of the following describes a null hypothesis in a two-tailed test?
A. 𝐻0 : 𝜇 = 𝜇0 B. 𝐻0 : 𝜇 ≠ 𝜇0 C. 𝐻0 : 𝜇 ≥ 𝜇0 D. 𝐻0 : 𝜇 ≤ 𝜇0

6. Which of the following describes an alternative hypothesis in a two-tailed test?
A. 𝐻𝑎 : 𝜇 < 50 years old C. 𝐻𝑎 : 𝜇 ≠ 50 years old
B. 𝐻𝑎 : 𝜇 > 50 years old D. 𝐻𝑎 : 𝜇 = 50 years old

7. Which of the following must be used as the significance level if we want a
lower possibility of a correct decision?
A. 1% C. 5%
B. 2% D. 10%

8. Which of the following would be an appropriate alternative hypothesis for
a one-tailed test?
A. 𝐻𝑎 : 𝜇 = 85 B. 𝐻𝑎 : 𝜇 ≥ 85 C. 𝐻𝑎 : 𝜇 ≥ 85 D. 𝐻𝑎 : 𝜇 < 85

9. In a one-tailed test, for which critical value below will the computed z of
2.312 fall in the non-rejection region?
A. 1.383 B. 1.533 C. 2.228 D. 2.354

10. When is a Type I error committed?
A. We reject a null hypothesis that is true.
B. We reject a null hypothesis that is false.
C. We fail to reject a null hypothesis that is true.
D. We fail to reject a null hypothesis that is false.

11. When is a Type II error committed?


A. We reject a null hypothesis that is true.
B. We reject a null hypothesis that is false.
C. We fail to reject a null hypothesis that is true.
D. We fail to reject a null hypothesis that is false.

12. Which of the following is a Type I error?


A. 𝐻0 is true; reject 𝐻0 . C. 𝐻0 is true; fail to reject 𝐻0 .
B. 𝐻0 is false; reject 𝐻0 . D. 𝐻0 is false; fail to reject 𝐻0 .

13. If the computed z-value is 1.286 and the critical value is 1.383, which of
the following statements could be true?
A. It lies in the rejection region, 𝐻𝑜 must be rejected.
B. It lies in the rejection region, hence we fail to reject 𝐻𝑜.
C. It lies in the non-rejection region, 𝐻𝑜 must be rejected.
D. It lies in the non-rejection region, hence we fail to reject 𝐻𝑜.

14. Using a left-tailed test, which of the following values of z will not fall in the
rejection region where the critical value is −1.638?
A. −1.637 B. −1.639 C. −1.641 D. −1.706

15. If the computed z-value is 1.915 and the critical value is 1.812, which of
the following statements could be true?
A. It lies in the rejection region, 𝐻𝑜 must be rejected.
B. It lies in the rejection region, hence we fail to reject 𝐻𝑜.
C. It lies in the non-rejection region, 𝐻𝑜 must be rejected.
D. It lies in the non-rejection region, hence we fail to reject 𝐻𝑜.

Additional Activities

A medical trial is conducted to test whether or not a certain drug can treat a
certain allergy. Upon trial, the t-value is computed as 1.311. Sketch and
complete the table below to discuss the findings of the medical trial.

[Number line from −3 to 3, marked in steps of 0.5, for sketching the rejection region]

𝐻𝑜 :
𝐻𝑎 :
The computed t-value is at the ___________ region.
Decision:

Justify your decision by writing an explanation in 5-10 sentences.


Statistics and
Probability
Quarter 2 – Module 2:
Identifying Parameters for
Testing in Given Real-Life
Problems

What I Need to Know

In the previous module, you were introduced to hypothesis testing and
other terms related to it. You were able to determine the null and
alternative hypotheses in given statistical hypotheses. You also learned
to identify the steps used in hypothesis testing.

With this module, you will learn how to identify the parameter to be
tested in a statistical hypothesis. The first step in hypothesis testing,
defining the parameter, will be given emphasis in this module.

After going through this module, you are expected to:

1. define the parameters used in statistical analysis; and


2. identify the parameter to be tested in a real-life problem.

What I Know

Directions: Choose the best answer to the given questions or statements.
Write the letter of your answer on a separate sheet of paper.

1. Which of the following symbols is used for the population mean?
A. 𝜎 B. 𝜇 C. 𝜌 D. 𝑥

2. Which of the following statements best describes a parameter?


A. It describes the sample. C. It describes the hypothesis.
B. It describes the researcher. D. It describes the population.

3. A researcher estimates that the average height of buildings in a large city
is at least 700 feet. Based on the given data, which is the parameter?
A. the researcher C. buildings in the large city
B. at least 700 feet D. the average height of the buildings

4. The average height of a 1-year-old child is 29 inches. What is the
parameter?
A. a one-year old child
B. the mean height of 33 inches
C. the average height of 29 inches
D. a random sample of 30 children who are 1-year old

5. Which is the parameter in the given situation below? “The average age of
10 college students is 24 years.”
A. 24 years C. age of the college students
B. 10 college students D. the average age of 24 years

6. A parameter is...
A. a numerical value summarizing the sample data
B. a planned activity with results yielding a set of data
C. a numerical value that summarizes all the data of an entire
population
D. the set of values collected from the variable from each of the elements
that belongs to the sample

7. Which of the following is a parameter?


A. 𝑆𝑥 B. 𝑥̅ C. 𝑝̂ D. µ

8. Which of the following symbols represents population standard
deviation?
A. 𝑝 B. 𝜎 C. 𝑝̂ D. µ

9. An SWS survey was trying to see if people in the Philippines thought the
pollution was too high. Which choice best represents a parameter?
A. all people in the Philippines
B. 500 randomly selected residents of the Philippines
C. 71% of the residents surveyed who thought the pollution was too high
D. percentage of all people in the Philippines who thought the pollution
was too high

10. A study conducted at a certain company last year showed that 25% of
the employees would rather drink coffee than soft drinks during break
time. Which choice best represents a parameter?
A. 25 employees C. the percentage of employees
B. all employees D. coffee rather than soft drinks

11. What is the parameter in the problem that follows?
A fast food outlet claims that the mean waiting time in line is less than
1.9 minutes. A random sample of 20 customers has a mean of 1.7
minutes with a standard deviation of 0.8 minute. Test the fast food
outlet's claim at α = 0.05.
A. mean of 1.7 minutes
B. the level of significance of 0.05
C. random sample of 20 customers
D. the mean waiting time in line of less than 1.9 minutes

12. A councilor is concerned about the percentage of city residents who
express disapproval of her performance. Her political committee pays for
a newspaper ad, hoping to keep her disapproval rating below 21%. What
is the parameter?
A. average of all residents
B. disapproval rating below 21%
C. political committee paying a newspaper ad
D. percentage of city residents who express disapproval

13. A motorcycle manufacturer advertises that its new subcompact models
get 47 mpg. If μ is the mean mileage of these models, what kind of parameter is
used?
A. mean C. proportion
B. variance D. standard deviation

14. Three percent (3%) of cars of a certain model have needed new engines
after being driven between 0 and 80 miles. The manufacturer hopes
that redesigning one of the engine's components has solved this
problem. What kind of parameter is illustrated in the problem?
A. mean C. proportion
B. variance D. standard deviation

15. A random sample of 101 bottles of cologne showed an average content of
4 oz. It is known that the population standard deviation of the contents
is 0.22 oz. In this problem, translate the parameter into symbols.
A. µ = 4 B. 𝜎 = 101 C. µ = 0.22 D. 𝜎 = 0.22

Lesson 2
Identifying Parameters for Testing in Given Real-Life Problems

Inferential statistics uses sample data to make inferences and draw
conclusions about a population. The main activities of inferential statistics
are using sample data (1) to estimate a population parameter and (2) to test
a hypothesis or claim about a population parameter. But before you test a
hypothesis, you should first understand what a parameter is and how to
identify it in a real-life problem.

For instance, suppose you are interested in the average age of the students
in your section and find that it is 17. Do you think this is an
example of a parameter? To answer this question, read and
understand this module.

What’s In

Activity 1. Choose Wisely!


Directions: Choose the best answer and write the letter of your choice on a
separate sheet of paper.

1. The ____________ in a set of data is the sum of the values divided by the
total number of values.
A. mean B. range C. variance D. standard deviation

2. The ____________ is the middle value of a data set when it is arranged
from smallest to largest.
A. mean B. median C. variance D. standard deviation

3. The ________ is the item of data that appears most frequently in a set of
data.
A. mean B. mode C. median D. standard deviation

4. The measurement that shows how data are spread above and below the
mean is the _____.
A. mean B. range C. variance D. standard deviation

5. Mean, median, and mode are examples of measures of _____.


A. variation B. data sets C. statistics D. central tendency

Check the student’s level of readiness for the next topic.
If the student did not answer most of the items correctly, you may
provide another review activity on the concepts related to
quantitative statistics.

What’s New

Activity 2: Grouping!
Directions: Group the following symbols into two. Place the first group
inside Box A and the second group in Box B.

𝑥̅   𝑝   𝑠   𝑠²   𝑝̂   𝜇   𝜎   𝜎²

Box A          Box B

Guide Questions:
1. What are the symbols that you placed in Box A? Box B?
2. How did you categorize each symbol or notation?
3. What mathematical principle did you consider in answering the
activity?
4. Which symbols seemed to be familiar to you and which are not?

What Is It

Parameters are an important component of any statistical
analysis. In simple words, a parameter is any numerical quantity that
characterizes a given population or some of its aspects. This means the
parameter tells us something about the whole population.
In contrast, a numerical measure that is calculated from a sample is
called a statistic. A statistic is a known number, but it varies with
the portion of the population that is sampled.
A parameter denotes the true value that would be obtained if a census
rather than a sample was undertaken.
Examples of parameters are the measures of central tendency of a population.
These tell us how the data behave on average. For
example, the mean, median, and mode are measures of central tendency that
give us an idea about where the data concentrate. Meanwhile, the standard
deviation tells us how the data are spread around the center, i.e.,
whether the distribution is wide or narrow. Such parameters are often very
useful in analysis.
In the normal distribution, there are two parameters that
characterize the distribution: the mean and the standard deviation. By varying
these two parameters, you can get different normal distributions.
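
The difference between a parameter and a statistic can be seen in a small Python sketch. Everything in it is assumed for illustration only: the population is a made-up list of ages for one section of 40 students, and the variable names simply mirror the usual symbols. It uses Python's standard `statistics` module, where `pstdev` gives the population standard deviation and `stdev` gives the sample standard deviation.

```python
import random
import statistics

random.seed(11)  # fixed seed so the sketch is repeatable

# Hypothetical population: the ages of all 40 students in one section.
population = [random.choice([16, 17, 18]) for _ in range(40)]

# Parameters: computed from every member of the population (a census).
mu = statistics.mean(population)
sigma = statistics.pstdev(population)

# Statistics: computed from a random sample of 10 students only.
sample = random.sample(population, 10)
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)

print(f"parameters: mu    = {mu:.2f}, sigma = {sigma:.2f}")
print(f"statistics: x-bar = {x_bar:.2f}, s     = {s:.2f}")
```

Drawing a different sample would usually change the statistics, while the parameters stay fixed as long as the population does not change.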

Different symbols are used to denote parameters. Based on Activity 2,
symbols are grouped as indicated in the table below.
Measure              Statistic      Parameter
Mean                 𝑥̅ (x-bar)      𝜇 (mu)
Variance             𝑠²             𝜎² (sigma squared)
Standard deviation   𝑠              𝜎 (sigma)
Proportion           𝑝̂ (p-hat)      𝑝

Mean and standard deviation are two common parameters.

Identifying Parameter to be Tested


Illustrative Examples:
1. The average height of adult Filipinos 20 years and older is 163 cm for
males.
Parameter: the average height of adult Filipinos 20 years and older
In hypothesis testing, the parameter will be translated into symbols such
as 𝛍 = 𝟏𝟔𝟑 where 𝛍 is the symbol for mean/average and 163 is the value
that pertains to the average height.

2. A Grade 11 researcher reported that the average allowance of Senior High
School students is ₱100. A sample of 40 students has a mean allowance of
₱120. At 𝛼 = 0.01, test the claim that the students have an average allowance
of ₱100. The standard deviation of the population is ₱50.
Parameter: the average allowance of Senior High School students is
₱100 or 𝝁 = ₱𝟏𝟎𝟎

In this claim, different values are given, but the parameter
to be tested is the average allowance of Senior
High School students since it relates to the population, not the sample.
A statistical hypothesis is a conjecture about a population parameter. That is
why you look for the population mean, population standard deviation, or
population proportion, not the sample mean.

3. According to a survey, 63% of the parents are willing to spend extra
money for their children’s health and education matters.
Parameter: the percentage/proportion of parents willing to spend
extra money on their children’s health and education matters or 𝒑 = 𝟎.𝟔𝟑

To identify the parameters to be tested:
1. Look for the mean/average, standard deviation, variance,
or proportion of the population.
2. Determine the value that pertains to the given parameter,
then translate it into symbols for hypothesis testing.
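
The two steps above can be sketched in Python as a small record that stores the parameter symbol, the relation, and the value, then prints them in symbols. The class name `ParameterClaim`, its field names, and the helper constants are assumptions made for this illustration only; the two sample claims reuse the values from Illustrative Examples 1 and 3.

```python
from dataclasses import dataclass

MU = "\u03bc"                      # Greek letter mu
SIGMA = "\u03c3"                   # Greek letter sigma
PARAMETER_SYMBOLS = {MU, SIGMA, SIGMA + "\u00b2", "p"}
RELATIONS = {"=", "\u2260", "<", ">", "\u2264", "\u2265"}


@dataclass(frozen=True)
class ParameterClaim:
    symbol: str      # which population parameter the claim is about
    relation: str    # equality or inequality symbol
    value: float     # the value that pertains to the parameter

    def __post_init__(self) -> None:
        # Only population symbols are accepted; x-bar, s, and p-hat are statistics.
        if self.symbol not in PARAMETER_SYMBOLS:
            raise ValueError(f"{self.symbol!r} is not a population parameter symbol")
        if self.relation not in RELATIONS:
            raise ValueError(f"{self.relation!r} is not a supported relation")

    def to_symbols(self) -> str:
        return f"{self.symbol} {self.relation} {self.value:g}"


height = ParameterClaim(MU, "=", 163)        # Illustrative Example 1
proportion = ParameterClaim("p", "=", 0.63)  # Illustrative Example 3
print(height.to_symbols())
print(proportion.to_symbols())
```

Rejecting symbols such as x-bar is a design choice in this sketch: it mirrors the rule that a hypothesis is stated about the population, not the sample.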

What’s More

Activity 3. Translate It!


Directions: Determine the notation of the given parameter, inequality
symbol, or value of the parameter.

Notation: µ, 𝜎, 𝑝, or 𝜎²     Symbol: =, ≠, <, >, ≤, or ≥

1. Average salary of Polytechnic University of the Philippines (PUP)
   graduates is at most ₱324,000.
   Notation: _____   Symbol: ≤   Value: _____
2. The standard deviation of adults riding a bus is 1.5.
   Notation: _____   Symbol: =   Value: _____
3. Filipino employers offer a mean of 15 days of paid vacation for sick leave.
   Notation: _____   Symbol: _____   Value: 15
4. Survival rate of breast cancer in the Philippines is below 50%.
   Notation: _____   Symbol: _____   Value: .50
5. Mean number of vehicles in households is at most 1.9 personal vehicles.
   Notation: µ   Symbol: _____   Value: _____

Activity 4-5. What Is Your Parameter?

Directions: Determine the parameter to be tested in each situation by
writing your answer on a separate sheet of paper. Translate it into symbols.

1. The television habits of children were observed, and it was found that the
standard deviation is 12.4 hours per week.
2. A newspaper article stated that students in the country take an average
of 4 years to finish their undergraduate degrees. Suppose that you
believe the mean time is longer, so you conduct a survey of 49 students.
The survey yields a sample mean of 5 years with a sample standard deviation
of 1.2 years.
3. According to DOLE, registered nurses in government earned an average
monthly salary of ₱9,700. For that same year, a survey was conducted on
41 registered nurses to determine if the mean salary is higher than the
previous survey. The sample average was ₱10,000 with a sample
standard deviation of ₱2,500.
4. Records of the Department of Health (DOH) revealed that 14.7% of the
country's Filipino smokers have maintained their habit of smoking.

What I Have Learned

Answer the following questions.

1. What is a parameter?
________________________________________________________________________
2. What are the two commonly used parameters? What are their symbols or
notations?
________________________________________________________________________
3. What are the other notations used as parameters?
________________________________________________________________________
4. To identify the parameter to be tested in a claim/hypothesis, what are
the concepts to consider?
________________________________________________________________________

What I Can Do

List down five (5) different real-life situations where hypothesis testing can
be done. Identify the parameter to be tested in each situation.
1. ________________________________________________________________________
2. ________________________________________________________________________
3. ________________________________________________________________________
4. ________________________________________________________________________
5. ________________________________________________________________________

Assessment

Directions: Choose the best answer to the given questions or statements.
Write the letter of your answer on a separate sheet of paper.

1. The numerical measure that describes a certain characteristic of a
population is called ______________.
A. sample B. statistics C. parameter D. population

2. What are the two common parameters of a normal distribution?


A. 𝜇 and 𝜎 B. 𝜎 and 𝑝 C. 𝑝 and 𝜇 D. 𝑝̂ and 𝑝

3. Anna wants to estimate the average shower time of teenagers. From the
sample of 50 teenagers, she found out that it takes 5 minutes for
teenagers to shower. What is the parameter?
A. sample of 50 teenagers C. average shower time of teenagers
B. 50 teenagers in 5 minutes D. took 5 minutes for teenagers to shower

4. What kind of parameter is applied in the given situation? “The mean
height of all Grade 11 students is 170 cm.”
A. mean B. variance C. proportion D. standard deviation

5. An education official wants to estimate the proportion of adults aged 18
and above who had read at least one book during the previous year. A
random sample of 1,006 adults aged 18 or older is obtained, and 835 of
those adults had read at least one book during the previous year.
Determine the parameter in the situation.
A. The parameter is the 835 adults.
B. The parameter is the average of adults aged 18 and above.
C. The parameter is the proportion of adults who had read at least one
book during the previous year.
D. The parameter is the random sample of 1,006 adults 18 and above
who had read a book in the previous year.

6. Which of the following denotes the true value that would be obtained if a
census rather than a sample was undertaken?
A. sample B. statistic C. parameter D. population

7. Which of the following is NOT a parameter?


A. 𝜎 B. Σ C. µ D. 𝑝

8. Which of the following symbols is used for population variance?
A. Σ B. 𝜎² C. 𝜎 D. µ

9. A study claims that the mean survival time for a certain cancer patient
treated immediately with chemotherapy and radiation is 24 months.
Which is the parameter?
A. 24 months
B. study claims on cancer
C. mean survival time for a certain cancer patient
D. mean survival time of 24 months with chemotherapy and radiation

10. What is the parameter to be tested in this claim? As stated by a
company’s shipping department, the number of shipping errors per
million shipments has a standard deviation of 2.7.
A. million shipments C. number of shipping errors
B. standard deviation of 2.7 D. company shipping department

11. A researcher claims that the mean monthly consumption of coffee per
person is more than 19 cups. In a sample of 60 randomly selected
people, the mean monthly consumption was 20 cups. The standard deviation
of the sample was 4 cups. Which is the parameter to be tested in this
claim?
A. sample of 60 randomly selected people
B. mean consumption of 60 selected people
C. the mean consumption of coffee per person
D. the standard deviation of the sample which was 4 cups

12. A certified public accountant (CPA) claims that more than 30% of all
accountants advertise. What kind of parameter is used in this claim?
A. mean B. variance C. proportion D. standard deviation

13. The average baptismal cost includes 50 guests. A random sample of 32
baptisms during the past year in the National Capital Region has a
mean of 53 guests and a standard deviation of 10. Which is the
parameter?
A. the mean of 53 guests C. a random sample of 32 baptisms
B. the standard deviation of 10 D. the average baptismal cost including 50 guests

14. Powdered milk is packed in 1-kilogram bags. An inspector from the Department
of Trade and Industry (DTI) suspects that the bags may not contain 1
kilogram. A sample of 40 bags produces a mean of 0.96 kilogram and a
standard deviation of 0.12 kilogram. In this problem, 0.96 kilogram is
_______.
A. variance C. population mean
B. sample mean D. standard deviation

15. In symbols, what is the parameter in the given claim below?
In 2018, DepEd reported that the proportion of Grade 10 completers who
proceeded to Grade 11 is 93%.
A. 𝑝 = 0.93 B. 𝜎 = 0.93 C. µ = 0.93 D. 𝑥̅ = 0.93

Additional Activities

Activity 6. Parameter Plus!

Directions: Determine the parameter to be tested in the given problems
below.

1. An electric lamp manufacturer is testing a new method of producing
lamps that will be considered acceptable in a normal population with an
average life of 2,600 hours and a standard deviation equal to 350. A
sample of 80 lamps produced by this method has an average life of 2,630
hours. Can the hypothesis of validity for the new manufacturing process
be accepted with a risk equal to or less than 5%?

2. A car dealer claims that the average price of a Toyota Vios is at least
₱662,000.00. A client suspected that the claim is incorrect and found
that a random sample of 15 similar vehicles has a mean price of
₱640,000.00 and a standard deviation of ₱24,000.00. Is there enough
evidence to reject the dealer’s claim at 𝛼 = 0.05?


Statistics and
Probability
Quarter 2 – Module 3:
Formulating Appropriate Null
and Alternative Hypotheses on a
Population Mean

What I Need to Know

In the previous module, you learned about the parameters used in
hypothesis testing. You were able to identify the parameters to be tested in
given real-life problems. You also learned how to translate the parameter
into mathematical symbols as the first step in hypothesis testing.

In this module, you will learn how to formulate null and alternative
hypotheses on a population mean.

After going through this module, you are expected to:


1. identify the notation to be used in formulating hypotheses;
2. illustrate one-tailed and two-tailed tests;
3. differentiate null and alternative hypotheses; and
4. formulate null and alternative hypotheses.

What I Know

Directions: Choose the best answer to the given questions or statements.
Write the letter of your answer on a separate sheet of paper.

1. In formulating the alternative hypothesis, what mathematical symbol is
applicable to use in the statement, “The average score of Grade 11 (ABM)
in Business Statistics is 75.”?
A. < B. > C. = D. ≠

2. A vacuum cleaner consumes less than 46 kWh per year. What hypothesis
test can you use in this claim?
A. left-tailed C. null hypothesis
B. right-tailed D. alternative hypothesis

3. Which of the following steps is not included in formulating hypotheses?

A. Identify the claim to be tested.


B. Translate the claim into mathematical symbols/notations.
C. Use the data about sample then compute the test statistic.
D. Formulate first the null hypothesis and then the alternative
hypothesis.

4. The sign of the alternative hypothesis in a left-tailed test is always ________.
A. equal C. less than
B. not equal D. greater than

5. A scientist invented a substance that increases the life of an automobile
battery. If the mean lifetime of the battery is 24 months, then what are
his hypotheses?
A. 𝐻𝑜 : 𝜇 = 24, 𝐻𝑎 : 𝜇 ≠ 24 C. 𝐻𝑜 : 𝜇 = 24, 𝐻𝑎 : 𝜇 ≤ 24
B. 𝐻𝑜 : 𝜇 = 24, 𝐻𝑎 : 𝜇 > 24 D. 𝐻𝑜 : 𝑝 = 24, 𝐻𝑎 : 𝑝 > 24

6. A researcher reports that the average salary of an accountant is more
than ₱40,000. A sample of 30 accountants has a mean salary of ₱42,500.
At 𝛼 = 0.05, it is found that an accountant earns more than
₱40,000 a month. The standard deviation of the population is ₱3,000.
What is the alternative hypothesis?
A. The average salary of an accountant is equal to ₱40,000.
B. The average salary of an accountant is greater than ₱40,000.
C. The average salary of an accountant is less than or equal to ₱42,500.
D. The average salary of an accountant is greater than or equal to
₱42,500.

7. What kind of hypothesis is illustrated in the statement below?
“There is no significant difference between the average weekly allowances
of morning and afternoon students in Mabunga Integrated High School.”
A. one-tailed test C. null hypothesis
B. two-tailed test D. alternative hypothesis

8. “The introduction of modern computers affects the performance of the


students.” What kind of hypothesis is it?
A. null C. alternative
B. mean D. standard deviation

9. Consider this statement: “New cars are expected to last an average of at
least three (3) years before needing major service.” Which of the following
is the null hypothesis?
A. 𝐻𝑜 : µ ≤ 3 B. 𝐻𝑜 : µ < 3 C. 𝐻𝑜 : µ > 3 D. 𝐻𝑜 : µ ≥ 3

10. Which is the correct null hypothesis of the claim below? “Students take
an average of less than five (5) years to graduate from college.”
A. 𝐻𝑜 : µ = 5 B. 𝐻𝑜 : µ < 5 C. 𝐻𝑜 : µ ≠ 5 D. 𝐻𝑎 : µ < 5

11. In a driver’s test, an average of 300 drivers pass on their first try. We want
to test if more than an average of 300 pass on the first try. Which
inequality symbols are correct (=, ≠, ≥, <, ≤, >) for the null and alternative
hypotheses, 𝐻𝑜 : µ __ 300 and 𝐻𝑎 : µ __ 300?
A. <, > B. =, ≠ C. ≤, ≥ D. = , >

12. Which of these is a correct alternative hypothesis for a two-tailed test?
A. 𝐻𝑎 : µ ≠ 7 B. 𝐻𝑎 : µ = 7 C. 𝐻𝑎 : µ > 7 D. 𝐻𝑎 : µ < 7

13. In a commercial, a new diet program would like to claim that its
methods result in a mean weight loss of more than 22 kg in two (2)
weeks. To determine if this is a valid claim, they hire an agency that then
selects 25 people to be placed on this diet. What is the test of
hypothesis?
A. null C. one-tailed test
B. alternative D. two-tailed test

14. A researcher estimated that the average height of a building in the
Philippines is at least 150 meters. A random sample of 15 buildings is
selected and has a mean of 168 meters. What are the null and
alternative hypotheses?
A. 𝐻𝑜 : µ > 150, 𝐻𝑎 : µ ≤ 150 C. 𝐻𝑜 : µ = 150, 𝐻𝑎 : µ ≥ 150
B. 𝐻𝑜 : µ = 150, 𝐻𝑎 : µ ≠ 150 D. 𝐻𝑜 : µ ≥ 150, 𝐻𝑎 : µ < 150

15. A survey reported that teenagers spend an average of at most four (4)
hours each day on social media. The organization thinks that, currently,
the mean is higher. Fifteen (15) randomly chosen teenagers were asked
how many hours per day they spend on social media. The sample
mean was 4.5 hours with a sample standard deviation of 2.0. What is
the test of hypothesis?
A. left-tailed test C. hypothesis test
B. two-tailed test D. right-tailed test

Lesson 3: Formulating Appropriate Null and Alternative Hypotheses on
a Population Mean

In statistics, hypothesis testing is the process of using statistical tests
to determine whether an observed difference between two or more samples
is statistically significant. From a practical point of view, hypothesis
testing allows you to collect samples and make decisions based on facts, not
on how you feel or what you think is right. To test your assumptions, you
must first state the null and alternative hypotheses.

This module starts by recalling your knowledge of the equality and
inequality symbols. This concept will help you understand how to
formulate hypotheses.

What’s In

Activity 1. No More No Less!

Directions: Which of the given equality/inequality expressions describes
each situation? Select the best answer and write the letter of your choice on
a separate sheet of paper.

1. The survey shows that the number of students (n) who have parents with
a house of their own is less than 20.
A. 𝑛 < 20 B. 𝑛 > 20 C. 𝑛 ≤ 20 D. 𝑛 ≥ 20

2. Mother gives me at most P200 allowance (n) in a week.
A. 𝑛 ≥ 200 B. 𝑛 ≤ 200 C. 𝑛 > 200 D. 𝑛 < 200

3. Larry is an industrious appliance salesman. His average sales (n) in a
week is at least P10,000.
A. 𝑛 < 10,000 B. 𝑛 > 10,000 C. 𝑛 ≤ 10,000 D. 𝑛 ≥ 10,000

4. A son’s savings (n) is greater than P1,500.
A. 𝑛 = 1,500 B. 𝑛 ≠ 1,500 C. 𝑛 > 1,500 D. 𝑛 ≥ 1,500

5. Marco’s salary (n) is equal to P20,000.
A. 𝑛 = 20,000 B. 𝑛 ≠ 20,000 C. 𝑛 ≤ 20,000 D. 𝑛 < 20,000

Guide Questions:

1. How did you find the previous activity? Was it easy or difficult?
2. What previously learned principle did you apply in the activity?
3. Were you able to determine the correct expression that corresponds to
each situation? Elaborate.
4. Do you think you will apply these activities in formulating null and
alternative hypotheses?

Notes to the Teacher

Check the level of readiness of the students. If the students
did not answer all the items correctly, provide another activity to recall
past lessons that involve translating verbal phrases into symbols and
comparing quantities using different equality and inequality
symbols.

What’s New

Activity 2. Differentiate It!

Directions: Examine the pictures below then answer the guide questions
that follow.
“Effect of a Fertilizer on Plant Growth”

Without Fertilizer With Fertilizer

Guide Questions:

1. What have you observed between the two figures?
2. Do you think the fertilizer has an effect on the plant?
3. What do you think are the variables shown in the pictures?
4. Is there any relationship among the variables in Figure 1 and Figure
2?
5. How do these pictures relate to a hypothesis?

What Is It

A statistical hypothesis is a statement about a parameter and deals with
evaluating the value of the parameter.

In statistical hypothesis testing, there are always two hypotheses: the null
and alternative hypotheses. Below is a comparison between the two.

Null Hypothesis (𝑯𝒐)
- It states that there is no difference between a population parameter
  (such as the mean, standard deviation, and so on) and the
  hypothesized value.
- There is no observed effect.
- The null hypothesis is often an initial claim that is based on
  previous analyses or specialized knowledge.

Alternative Hypothesis (𝑯𝒂)
- It states that the population parameter differs from the hypothesized
  value (it is smaller than, greater than, or different from it).
- There is an observed effect.
- The alternative hypothesis is what you might believe to be true or
  hope to prove true.

To state the null and alternative hypotheses correctly:

1. Identify the parameter in the given problem.
2. Identify the claim to be tested, which may appear in either the null or
the alternative hypothesis.
3. Translate the claim into mathematical symbols/notations.
4. Formulate the null hypothesis (𝐻𝑜) first, then the alternative
hypothesis (𝐻𝑎), using one of the three ways of writing hypotheses
illustrated below:

𝑯𝒐 : µ = 𝒌 𝑯𝒐 : µ ≤ 𝒌 𝑯𝒐 : µ ≥ 𝒌
𝑯𝒂 : µ ≠ 𝒌 𝑯𝒂 : µ > 𝒌 𝑯𝒂 : µ < 𝒌
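
The steps and the three forms above can be sketched in code. This is only a minimal illustration and not part of the original module: the function name `formulate_hypotheses` and the use of plain text symbols are assumptions made for this sketch. It takes the symbol that the claim translates to and the hypothesized value k, then returns the null hypothesis, the alternative hypothesis, and which of the two carries the claim.

```python
# Complement (opposite) of each comparison symbol.
COMPLEMENT = {"=": "≠", "≠": "=", "≤": ">", ">": "≤", "≥": "<", "<": "≥"}

# Symbols that contain equality belong to the null hypothesis.
NULL_SYMBOLS = {"=", "≤", "≥"}


def formulate_hypotheses(claim_symbol, k):
    """Return (Ho, Ha, where the claim sits) for a claim about the mean."""
    if claim_symbol not in COMPLEMENT:
        raise ValueError(f"unknown symbol: {claim_symbol}")
    if claim_symbol in NULL_SYMBOLS:
        # The claim contains equality, so it is the null hypothesis.
        null_symbol = claim_symbol
        alt_symbol = COMPLEMENT[claim_symbol]
        claim_in = "null"
    else:
        # The claim has no equality, so it is the alternative hypothesis.
        null_symbol = COMPLEMENT[claim_symbol]
        alt_symbol = claim_symbol
        claim_in = "alternative"
    return f"Ho: µ {null_symbol} {k}", f"Ha: µ {alt_symbol} {k}", claim_in


print(formulate_hypotheses("≥", 150))
```

For a claim of "at least 150", the call returns Ho: µ ≥ 150 and Ha: µ < 150, and reports that the claim sits in the null hypothesis. The sketch always returns the ≤ or ≥ form of the null hypothesis for one-tailed cases; writing the null with = instead is also common.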

Hypothesis-Testing Common Phrases

=                               ≠
is equal to                     is not equal to
is the same as                  is not the same as
is exactly the same as          is different from
has not changed from            has changed from

>                               <
is increased                    is decreased
is greater than                 is less than
is higher than                  is lower than
is above                        is below
is bigger than                  is smaller than
is longer than                  is decreased or reduced from
is more than

≥                               ≤
is at least                     is at most
is not less than                is not more than
is greater than or equal to     is less than or equal to
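
The table can also be treated as a lookup. The sketch below is an illustration only: the dictionary holds a subset of the phrases above, and the name `symbol_for` is an assumption made for this example. It picks the longest matching phrase so that "is greater than or equal to" is not mistaken for "is greater than".

```python
# A subset of the phrases from the table, mapped to their symbols.
PHRASE_TO_SYMBOL = {
    "is equal to": "=",
    "is the same as": "=",
    "has not changed from": "=",
    "is not equal to": "≠",
    "is different from": "≠",
    "has changed from": "≠",
    "is greater than": ">",
    "is higher than": ">",
    "is more than": ">",
    "is above": ">",
    "is less than": "<",
    "is lower than": "<",
    "is below": "<",
    "is at least": "≥",
    "is not less than": "≥",
    "is greater than or equal to": "≥",
    "is at most": "≤",
    "is not more than": "≤",
    "is less than or equal to": "≤",
}


def symbol_for(statement):
    """Return the symbol of the longest table phrase found in the statement."""
    text = statement.lower()
    matches = [phrase for phrase in PHRASE_TO_SYMBOL if phrase in text]
    if not matches:
        return None  # no phrase from the table appears in the statement
    return PHRASE_TO_SYMBOL[max(matches, key=len)]


print(symbol_for("His average sales in a week is at least P10,000."))
```

A simple lookup like this only works when the statement uses the exact wording in the table. Real claims still need to be read carefully, because the same idea can be phrased in many ways.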

Let us take an example from your previous activity.

“The survey shows that the number of students (n) who have parents
with a house of their own is less than 20.”

The claim uses the phrase “less than,” which, as seen in the table above,
corresponds to the symbol (<). Therefore, the answer is 𝑛 < 20.

Note:
𝐻𝑜 always contains a form of equality (=, ≤, or ≥), while 𝐻𝑎 never does
(it uses ≠, >, or <). The choice of symbol depends on the wording of the
claim. However, be aware that many researchers use = (equal sign) in the
null hypothesis, even with > or < as the symbol in the alternative
hypothesis. Notice also that the notation of the alternative hypothesis
complements that of the null hypothesis.

Illustrative Examples:

1. The average height of all Grade 11 students in Senior High School is
169 cm. Is this claim true?

Solution: First, identify the parameter, which is the mean height of all
Grade 11 students. Since it is a population mean, use the notation 𝝁.
The claim in this example is that the average height is 169 cm, which
translates to 𝝁 = 𝟏𝟔𝟗 and is considered the null hypothesis. To formulate
the alternative hypothesis, write the complement/opposite of the null
hypothesis, which is that the average height is not equal to 169 cm.

𝑯𝒐 : The average height of all Grade 11 students is 169 cm. / 𝑯𝒐 : 𝝁 = 𝟏𝟔𝟗 (claim)
𝑯𝒂 : The average height of all Grade 11 students is not 169 cm. / 𝑯𝒂 : 𝝁 ≠ 𝟏𝟔𝟗

2. The average price per square meter of a residential lot in an exclusive
subdivision is above ₱15,000. A buyer wants to test the agent’s
claim.

Solution: In this hypothesis, the parameter is the population mean, so
you will use the symbol µ. The claim “above ₱15,000” can be written as
µ > ₱15,000. Since “greater than” has no equality in it, the claim falls
under the alternative hypothesis, 𝑯𝒂 : 𝝁 > ₱𝟏𝟓,𝟎𝟎𝟎. With the alternative
already formulated, the null hypothesis is its complement,
𝑯𝒐 : 𝝁 ≤ ₱𝟏𝟓,𝟎𝟎𝟎. You can also write the null hypothesis as
𝑯𝒐 : 𝝁 = ₱𝟏𝟓,𝟎𝟎𝟎.

𝑯𝒐 : 𝝁 ≤ ₱𝟏𝟓,𝟎𝟎𝟎 or 𝑯𝒐 : 𝝁 = ₱𝟏𝟓,𝟎𝟎𝟎
𝑯𝒂 : 𝝁 > ₱𝟏𝟓,𝟎𝟎𝟎 (claim)

3. Holistic Fitness Center claims that their members reduced an
average of 13 pounds after joining the center. An independent
agency that wanted to check this claim took a sample of 40 members and
found that they reduced an average of 12 pounds with a standard
deviation of 4 pounds. Determine the null and alternative
hypotheses.

Solution: In this example, the parameter to be tested is the population
mean (µ), and the claim involves a reduction of 13 pounds. The word
“reduced” corresponds to the notation (<). Therefore, the claim is placed
in the alternative hypothesis and can be written as 𝑯𝒂 : 𝝁 < 𝟏𝟑. The null
hypothesis would be 𝑯𝒐 : 𝝁 ≥ 𝟏𝟑 or 𝑯𝒐 : 𝝁 = 𝟏𝟑.

𝑯𝒐 : 𝝁 ≥ 𝟏𝟑 or 𝑯𝒐 : 𝝁 = 𝟏𝟑
𝑯𝒂 : 𝝁 < 𝟏𝟑 (claim)

4. The treasurer of a municipality claims that the average net worth
of families in the municipality is at least ₱730,000. A random
sample of 50 families from this area produced a mean net worth of
₱860,000 with a standard deviation of ₱65,000. What are the null and
alternative hypotheses?

Solution: In this example, the parameter is the population mean (µ), and
the claim is that the average is at least ₱730,000. The phrase “at least”
has the notation (≥), which means that the claim is in the null
hypothesis. In the alternative hypothesis, you will use (<) as its
complement. Therefore:

𝑯𝑶 : µ ≥ ₱𝟕𝟑𝟎,𝟎𝟎𝟎 or 𝑯𝑶 : µ = ₱𝟕𝟑𝟎,𝟎𝟎𝟎 (claim)
𝑯𝒂 : µ < ₱𝟕𝟑𝟎,𝟎𝟎𝟎

5. An academic organization claimed that Grade 11 students’ study
time is at most 240 minutes per day, on average. Another survey
was conducted to find whether the claim is true. The group took a
random sample of 30 students and found a mean study time of 300
minutes with a standard deviation of 90 minutes. What are the null
and alternative hypotheses?

Solution: The parameter in this example is the population mean (µ), and
the claim is that the average is at most 240 minutes. The phrase “at most”
has the notation (≤), which means that the claim is in the null hypothesis.
The null hypothesis would be 𝑯𝟎 : µ ≤ 𝟐𝟒𝟎. To formulate the alternative,
use the notation (>) as the complement of (≤). Therefore, the alternative
hypothesis is 𝑯𝒂 : µ > 𝟐𝟒𝟎.

𝑯𝑶 : µ ≤ 𝟐𝟒𝟎 or 𝑯𝑶 : µ = 𝟐𝟒𝟎 (claim)
𝑯𝒂 : µ > 𝟐𝟒𝟎

One-Tailed and Two-Tailed Tests

The alternative hypothesis can take different forms depending on how the
parameter is expected to differ from the null value: it may increase,
decrease, or simply change. When an alternative hypothesis predicts not
only that the mean differs from the hypothesized value but also the
specific direction of the difference (lower or higher), the test is called
a directional or one-tailed test because the rejection region is entirely
within one tail of the distribution.

On the other hand, some hypotheses predict only that one value will
be different from another, without additionally predicting which will be
higher. The test of such a hypothesis is nondirectional or two-tailed
because an extreme test statistic in either tail of the distribution
(positive or negative) will lead to the rejection of the null hypothesis of no
difference.

One-Tailed
- The alternative hypothesis contains the greater than (>) or less than
  (<) symbol.
- It is directional (either right-tailed or left-tailed).

Two-Tailed
- The alternative hypothesis contains the inequality (≠) symbol.
- It has no direction.

The table below shows the null and alternative hypotheses for each
type of test.

                        Two-Tailed Test   Right-Tailed Test   Left-Tailed Test
Null Hypothesis         𝐻𝑜 : 𝜇 = 𝜇𝑜       𝐻𝑜 : 𝜇 = 𝜇𝑜 or      𝐻𝑜 : 𝜇 = 𝜇𝑜 or
                                          𝐻𝑜 : 𝜇 ≤ 𝜇𝑜         𝐻𝑜 : 𝜇 ≥ 𝜇𝑜
Alternative Hypothesis  𝐻𝑎 : 𝜇 ≠ 𝜇𝑜       𝐻𝑎 : 𝜇 > 𝜇𝑜         𝐻𝑎 : 𝜇 < 𝜇𝑜

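
The rule in the table can be written as a small function. This is a sketch for illustration only, and the name `classify_test` is an assumption made here. It reads the symbol in the alternative hypothesis and reports the type of test.

```python
def classify_test(alternative_symbol):
    """Classify a test by the symbol used in its alternative hypothesis."""
    tails = {
        "≠": "two-tailed",    # no direction
        ">": "right-tailed",  # one-tailed, rejection region on the right
        "<": "left-tailed",   # one-tailed, rejection region on the left
    }
    if alternative_symbol not in tails:
        raise ValueError("Ha must use one of: ≠, >, <")
    return tails[alternative_symbol]


print(classify_test(">"))
```

Only the alternative hypothesis is inspected because the null hypothesis may be written with = in all three cases.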
Illustrative Examples:
Determine the hypotheses and the hypothesis test.
1. Teacher A wants to know if mathematical games affect the
performance of the students in learning Mathematics. A class of 45
students was used in the study. The mean score was 90 and the
standard deviation was 3. A previous study revealed that 𝝁 = 𝟖𝟓 and
the standard deviation 𝝈 = 𝟓.
The parameter is the population mean, and its hypothesized value is 85.
You can write the hypotheses in symbols: 𝐻𝑂 ∶ 𝜇 = 85 and 𝐻𝑎 ∶ 𝜇 ≠ 85. The
word ‘affect’ gives no clue about the direction of the effect, so it
implies either an increase or a decrease in performance. This tells you
that the test is a two-tailed test.
𝑯𝑶 ∶ 𝝁 = 𝟖𝟓 and 𝑯𝒂 ∶ 𝝁 ≠ 𝟖𝟓 (two-tailed test)

2. A piggery owner believes that using organic feeds on his pigs will
yield greater income. His average income from the previous year
was ₱120,000. State the hypotheses and identify the directional
test.
In this example, the null hypothesis is 𝑯𝑶 ∶ 𝝁 = 𝟏𝟐𝟎,𝟎𝟎𝟎. You may
notice that the statement uses the phrase ‘greater income,’ which is
associated with greater than. Therefore, 𝑯𝒂 ∶ 𝝁 > 𝟏𝟐𝟎,𝟎𝟎𝟎. This
hypothesis uses the inequality symbol (>), so it is a one-tailed test, and
greater than specifically calls for a right-tailed test.
𝑯𝑶 ∶ 𝝁 = 𝟏𝟐𝟎,𝟎𝟎𝟎 and 𝑯𝒂 ∶ 𝝁 > 𝟏𝟐𝟎,𝟎𝟎𝟎 (right-tailed test)

3. The average waiting time of all customers in a restaurant before
being served is less than 20 minutes. Determine the hypotheses and
the directional test.
You may notice that the statement uses the phrase ‘less than,’
which denotes that the alternative hypothesis is 𝑯𝒂 ∶ 𝝁 < 𝟐𝟎. This
hypothesis uses the inequality symbol (<), so it is a one-tailed test, and
less than specifically calls for a left-tailed test. In this
example, the null hypothesis is 𝑯𝑶 ∶ 𝝁 ≥ 𝟐𝟎.
𝑯𝑶 ∶ 𝝁 ≥ 𝟐𝟎 and 𝑯𝒂 ∶ 𝝁 < 𝟐𝟎 (left-tailed test)

What’s More

Activity 2. Fill Me!

Directions: Determine what is asked in each problem as indicated by the
blanks.

1. A school principal claims that the Grade 11 students in her high school
have a mean score of 92.
Parameter: ___________ Null Hypothesis: ___________
Claim: mean score of 92 Alternative Hypothesis: 𝐻𝑎 : µ ≠ 92

2. A medicine company claims that the medicine pill it manufactures
contains an average of 14 mg of active ingredient.
Parameter: average Null Hypothesis: ___________
Claim: average of 14mg Alternative Hypothesis: ___________

3. A certain product produced by a manufacturing company is supposed to
weigh at least 12 lbs.
Parameter: ___________ Null Hypothesis: 𝐻𝑜 : µ ≥ 12
Claim: weigh at least 12 lbs Alternative Hypothesis: ___________

4. The Bureau of Internal Revenue claims that the mean wait time for
taxpayers during a recent tax filing is at most 8.7 minutes. A random
sample of 11 taxpayers has a mean wait time of 8.7 minutes and a
standard deviation of 2.7 minutes. Is there enough evidence to reject the
claim at a significance level of 0.10?
Parameter: mean Null Hypothesis: ___________
Claim: ___________ Alternative Hypothesis: 𝐻𝑎 : µ > 8.7

5. According to a company, the mean pH level of the river water is 7.4. A
researcher randomly selected 15 river water samples and found out that
the mean is 6.7 with a standard deviation of 0.24.
Parameter: mean
Claim: mean pH level of the river water is 7.4
Null Hypothesis: ___________
Alternative Hypothesis: ___________

Activity 3. Let’s Hypothesize

Directions: Write the null hypothesis and alternative hypothesis in
notations for each given situation.

1. Mrs. Dela Cruz claims that her students scored an average of 91 in their
Mathematics quiz. The master teacher wants to know whether the
teacher’s claim is acceptable or not.
𝐻𝑜 : _________________________________________________
𝐻𝑎 : _________________________________________________

2. A car manufacturer claims that the mean selling price of all cars
manufactured is only ₱150,000. A consumer agency wants to test
whether the mean selling price of all the cars manufactured exceeds
₱150,000.
𝐻𝑜 : _________________________________________________
𝐻𝑎 : _________________________________________________

3. A manufacturer of soft drinks claims that all labeled 1.5-liter bottles
contain an average of 1.49 liters of soft drinks. A retailer wishes to test
whether the mean amount of soft drinks in labeled 1.5-liter bottles is less
than 1.49 liters.
𝐻𝑜 : _________________________________________________
𝐻𝑎 : _________________________________________________

4. A bus company in Manila claims that the mean waiting time for a bus
during rush hour is less than 12 minutes. A random sample of 30
waiting times has a mean of 15 minutes with a standard deviation of 4.8
minutes.
𝐻𝑜 : _________________________________________________
𝐻𝑎 : _________________________________________________

5. The average power consumption of an air conditioner is at most 2,700
watts as claimed by the owner. A survey made by an electric power company
found out that the mean consumption is 3,000 watts with a standard
deviation of 225 watts.
𝐻𝑜 : _________________________________________________
𝐻𝑎 : _________________________________________________

Activity 4. One-Tailed or Two-Tailed!

Directions: Identify whether the given hypothesis is one-tailed or two-tailed.
Write ONE if it is a one-tailed test and TWO if it is a two-tailed test.

1. A used car dealer says that the mean price of a car in the Philippines is
at least ₱350,000.

2. PAG-ASA reported that the mean annual rainfall in the Philippines is at
most 4,064 mm.

3. According to the survey, the average cost of visiting doctors is ₱500.

4. The mean age of students in a university in the previous years was 27
years old. An instructor thinks the mean age for students is older than
27. She randomly surveys 56 students and finds that the sample mean is
29 with a standard deviation of 2.

5. The mean work week for engineers in a new company is believed to be
about 40 hours. A newly hired engineer hopes that it is shorter. She asks
10 engineering friends for the lengths of their mean work weeks. Based
on the results, should she count on the mean work week to be shorter
than 40 hours?

Activity 5. Formu-Tail

Directions: Formulate the null and alternative hypotheses. Identify whether
the test is one-tailed or two-tailed. If it is one-tailed, identify whether
its direction is left or right. Write your answer on a separate sheet of
paper.

1. The average salary of an accountant is ₱24,620 per month in the
Philippines.
𝐻𝑜 : ________________ 𝐻𝑎 : __________________ _______-tailed test

2. A normal smartphone battery manufacturer claims that the mean life of
a certain type of battery is more than 650 hours.
𝐻𝑜 : ________________ 𝐻𝑎 : __________________ _______-tailed test

3. According to an international shipping company, a package from the US
can arrive in Manila in an average of less than 8 business days.
𝐻𝑜 : ________________ 𝐻𝑎 : _________________ _______-tailed test

4. The average price of a certain type of car is greater than ₱600,000.
𝐻𝑜 : _________________ 𝐻𝑎 : _________________ _______- tailed test

5. A research organization reports that the mean of adult grocery shoppers
who never buy the store brand in Metro Manila is 300.
𝐻𝑜 : _________________ 𝐻𝑎 : _________________ _______- tailed test

6. A study claims that the mean survival period for certain cancer patients
treated immediately with chemotherapy and radiation is 24 months.
𝐻𝑜 : _________________ 𝐻𝑎 : _________________ _______- tailed test

7. The average pre-school cost for tuition fees last year was ₱15,500. The
following year, 20 schools had a mean of ₱13,100 and a standard
deviation of ₱2,500.
𝐻𝑜 : _________________ 𝐻𝑎 : _________________ _______- tailed test

8. A magazine reports that a typical shopper spends less than 10 minutes
in line waiting to check out. A sample of 30 shoppers at the DM
Supermarket showed a mean of 9.5 minutes with a standard deviation of 2.7
minutes.
𝐻𝑜 : ________________ 𝐻𝑎 : __________________ _______-tailed test

9. The principal of Mabundok High School claims that the students in his
school have above average intelligence. A random sample of 30 students’
IQ scores have a mean score of 113. The mean population IQ is 100 with
a standard deviation of 15. Is there evidence to support his claim?
𝐻𝑜 : ________________ 𝐻𝑎 : __________________ _______-tailed test

10. The owner of BYD manufacturer claims that their batteries last an
average of at most 350 hours under normal use. A researcher randomly
selected 20 batteries from the production line and tested them. The
tested batteries had a mean life span of 270 hours with a standard
deviation of 50 hours.
𝐻𝑜 : ________________ 𝐻𝑎 : __________________ _______-tailed test

What I Have Learned

Direction: Complete the following statements.

1. ______________________ is a statement about a parameter and deals with
evaluating the value of the parameter.

2. The two kinds of hypothesis are ______________ and ____________.

3. To formulate a hypothesis, the steps are:
a. ________________________________________
b. ________________________________________
c. ________________________________________
d. ________________________________________

4. The test of hypothesis can be __________________ if the alternative
hypothesis uses the ≠ symbol or __________________ if it uses the < or >
symbol.

5. The null hypothesis and alternative hypothesis can be denoted as ______
and ______, respectively.

What I Can Do

Cite five (5) research questions used in real life and formulate your null and
alternative hypotheses.
Example: Is it true that turmeric can prevent viruses?
𝐻𝑜 : Drinking turmeric cannot prevent viruses.
𝐻𝑎 : Drinking turmeric can prevent viruses.

Assessment

Directions: Choose the best answer to the given questions or statements.
Write the letter of your answer on a separate sheet of paper.

1. This hypothesis states that there is no difference between population
parameters and the hypothesized value.
A. hypothesis C. alternative hypothesis
B. null hypothesis D. two-tailed hypothesis

2. When the value of the parameter has a significant difference from the
hypothesized value, then it is called ________________.
A. one-tailed test C. null hypothesis
B. two-tailed test D. alternative hypothesis

3. The sign of the alternative hypothesis in a left-tailed test is
always _________.
A. equal B. less than C. not equal D. greater than

4. If the researcher wishes to test the claim that the mean of the population
is 75, the appropriate null hypothesis is:
A. 𝜇 ≤ 75 B. 𝜇 ≥ 75 C. 𝜇 ≠ 75 D. 𝜇 = 75

5. A researcher thinks that if expectant mothers use vitamins, the birth
weight of the babies will increase. The average birth weight of the
population is 3.9 kgs. What is the alternative hypothesis?
A. 𝐻𝑎 : 𝜇 > 3.9 B. 𝐻𝑎 : 𝜇 < 3.9 C. 𝐻𝑎 : 𝜇 = 3.9 D. 𝐻𝑎 : 𝜇 ≠ 3.9

6. According to the report, the average weight of a Filipino newborn baby is
2.8 kgs. Mellissa wants to perform a significance test to see if this holds
true in her nation. She takes a random sample of babies and observes
that the average weight of newborns is 3 kgs. What is the null
hypothesis?
A. 𝐻𝑜 : 𝜇 > 2.8 B. 𝐻𝑜 : 𝜇 < 2.8 C. 𝐻𝑜 : 𝜇 = 2.8 D. 𝐻𝑜 : 𝜇 ≠ 2.8

7. What kind of hypothesis is illustrated below?
The mean score of all Grade 11 students is higher than 75.
A. one-tailed test C. null hypothesis
B. two-tailed test D. alternative hypothesis

8. “A modern approach in advertisement will not increase the demand for a
product.” This is an example of _______________ hypothesis.
A. Null C. alternative
B. Mean D. right-tailed

9. What is the alternative hypothesis in the following statement?
“The number of defective batteries produced by the company is not equal
to 15 batteries a day as claimed by the manager.”
A. µ = 15 B. µ ≠ 15 C. µ > 15 D. µ < 15

10. Which is the correct null hypothesis of the given statement?
“According to the owner, an average of 500 people buy food at
McDonalds during breakfast and lunch hours.”
A. 𝐻𝑜 : µ = 500 B. 𝐻𝑜 : µ ≠ 500 C. 𝐻𝑜 : µ < 500 D. 𝐻𝑜 : µ > 500

11. On average, the household electricity consumption in the country was
about 248.1 kilowatt-hours in 2015. Electricity was used primarily for
lighting purposes, cooking, recreation, and space cooling. Which
inequality symbols are correct (=, ≠, ≥, <, ≤, >) for the null and alternative
hypotheses: 𝐻𝑜 : µ __ 248.1 𝐻𝑎 : µ __ 248.1?
A. =, > B. <, > C. =, ≠ D. ≤, ≥

12. Which is the correct alternative hypothesis for a one-tailed test?
A. µ = 25 B. µ ≠ 25 C. µ ≥ 25 D. µ < 25

13. A teacher in Math announced that the mean score of Grade 9 students in
the first quarterly assessment in Mathematics was 89 and standard
deviation was 6. One student, who believed that the mean score was less
than this, randomly selected 30 students and computed the mean score.
What kind of test of hypothesis can describe this?
A. left-tailed B. two-tailed C. right-tailed D. multiple-tailed

14. Determine the null and alternative hypotheses.
“It was claimed that the average monthly income of aircraft pilots was
₱116,714.00. A random sample of 45 pilots is selected and it is found out
that the average monthly salary is ₱120,000. Using a 0.01 level of
significance, can it be concluded that there is an increase in the average
monthly income of pilots?”

A. 𝐻𝑜 : µ = ₱116,714.00, 𝐻𝑎 : µ ≤ ₱116,714.00
B. 𝐻𝑜 : µ = ₱116,714.00, 𝐻𝑎 : µ ≠ ₱116,714.00
C. 𝐻𝑜 : µ = ₱116,714.00, 𝐻𝑎 : µ > ₱116,714.00
D. 𝐻𝑜 : µ = ₱116,714.00, 𝐻𝑎 : µ < ₱116,714.00

15. Which directional test is illustrated in the given problem below?
In a recent survey, the average amount of money students have in their
wallet is ₱200.00 with a standard deviation of 45. A teacher feels that the
average amount is lower. She surveyed 80 randomly selected students
and found that the average amount is ₱35.
A. left-tailed B. two-tailed C. alternative D. right-tailed

Additional Activities

Activity 6. Let Us Take a Challenge!

1. Based on the data provided in a known website article entitled “Tuition
Fee Guide: 2019 Cost of College Education in the Philippines”, the
average tuition fee in private colleges and universities is greater than
₱145,000 a year. Suppose that we want to perform a hypothesis test to
find whether the average tuition fee is greater than ₱145,000.
a. Determine the null and alternative hypotheses for the hypothesis test.
b. Classify the hypothesis as two-tailed, left-tailed, or right-tailed.

2. A traffic enforcer believes that the number of cars passing through a
certain intersection during rush hours on weekdays follows a normal
distribution with an average of 800. A new highway is opened, and it is
hypothesized that the number of cars passing through the intersection
will decrease as a result. A sample of 15 weekdays is taken, and the
mean number of cars passing through the intersection is 750 with a
sample standard deviation of 42.
a. Determine the null and alternative hypotheses for the hypothesis test.
b. Classify the hypothesis as two-tailed, left-tailed, or right-tailed.

Statistics and Probability
Quarter 2 – Module 4:
Identifying Appropriate Test Statistics Involving Population Mean

What I Need to Know

In the previous module, you learned more about hypotheses. You
identified the two kinds of hypotheses and the direction (tail) of a test
of hypothesis. The module also discussed the notations commonly used
in formulating a hypothesis. You also accomplished activities on
identifying the type of test to be used after formulating the null and
alternative hypotheses.

This time, you are ready to identify the test statistic to be used when
the population variance is known and when it is unknown.

After going through this module, you are expected to:

1. define the statistical concepts related to tests concerning means;
2. identify the appropriate form of test statistics when: (a) the population
variance is assumed to be known; (b) the population variance is
assumed to be unknown; and (c) the Central Limit Theorem is to be
used; and
3. apply the concepts of test statistic on real-life problems.

What I Know

Directions: Choose the best answer to the given questions or statements.
Write the letter of your choice on a separate sheet of paper.

1. If the variance is unknown and the sample size is small, which test
statistic is appropriate?
A. t-test C. two-tailed test
B. z-test D. one-tailed test

2. One-sample z-statistic is used instead of one-sample t-statistic when
___________.
A. μ is known. C. μ is unknown.
B. σ is known. D. σ is unknown.

3. Based on the Central Limit Theorem, when the sample (n) is extremely
large and the variance is known, what is the statistical test to be used?
A. t-test C. two-tailed test
B. z-test D. one-tailed test

4. Which of the following characteristics can be considered in using the
z-test/statistic as an appropriate test?
A. Sample standard deviation is known.
B. Population is not normally distributed.
C. The sample size is greater than 30.
D. Population standard deviation is unknown.

5. What test is appropriate if the distribution is not normal, there is a
sufficiently large sample size, and population variance is unknown?
A. t-test C. null test
B. z-test D. hypothesis test

6. Which of the following notations is needed in identifying the test statistic
to be used in computing the test value?
A. µ B. α C. σ D. 𝑥

7. In a sample of n = 100 selected from a normal population, 𝑥̅ = 56 and
𝑠 = 12. What statistical test is applicable?
A. t-test C. left-tailed test
B. z-test D. two-tailed test

8. The t-test for single sample mean may be used when all the following
conditions are true except ____________.
A. Sample size is less than 30.
B. Sample standard deviation (𝑠) is known.
C. Population standard deviation (𝜎) is known.
D. Data are approximately normally distributed.

9. A simple random sample of 150 observations was taken from a large
population. The sample average and the sample standard deviation were
determined to be 70 and 16, respectively. What is the value of s?
A. 1.6 B. 16 C. 70 D. 150

10. A tire manufacturer tests the braking performance of one of its tire
models on a test track. From long-term records, the company knows the
value of σ. The company tried the tires on 10 different cars, recording the
stopping distance for each car on both wet and dry streets. Which test
statistic is appropriate to use?
A. t-test C. one-tailed test
B. z-test D. hypothesis test

11. “The average production of corn in the Philippines is 3,000 kgs. A new
plan on food has been developed and is tested on 60 plots. The mean
yield with the new plan on food is 3,200 kgs with standard deviation of
600 kgs. At α = 0.05 level of significance, can you conclude that the
production increased?” What test statistic is to be used on the given
problem?
A. t-test C. left-tailed test
B. z-test D. right-tailed test

12. In the given situation below, identify the population standard deviation.
“In a recent survey, the average amount of money a college student gets
is ₱200.00 with a standard deviation of ₱62.00. A teacher feels that the
average amount is higher. She surveys 80 randomly selected students
and finds that the average amount is ₱245.”
A. 𝜎 = 80 B. 𝜎 = ₱62.00 C. 𝜎 = ₱200.00 D. 𝜎 = ₱245.00

13. An agent believes that the average closing cost of purchasing a new home
is ₱328,250. She selects 40 new home sales at random and finds that the
average closing cost is ₱333,300. The standard deviation of the
population is ₱6,060. Which test statistic is appropriate to use?
A. t-test C. standard deviation
B. z-test D. Central Limit Theorem

14. What test statistic is appropriate to use in the given problem below? “A
random sample of 29 medical doctors showed that they work an average
of 55 hours per week with a standard deviation of 7.5 hours per week. If
the population average is 48 hours per week, is there evidence that the
sampled doctors work significantly more hours than the rest of the
medical doctors?”
A. t-test C. variance
B. z-test D. two-tailed test

15. In 2015, the government made a claim that the average income of the
Filipino people was ₱18,000. However, a sample was taken recently
showing an average income of ₱20,000 with a population standard
deviation of ₱1,300. Which test statistic is appropriate to use?
A. t-test C. one-tailed test
B. z-test D. two-tailed test

Lesson 4: Identifying Appropriate Test Statistics Involving Population
Mean

Hypothesis testing is a method of testing a claim or hypothesis about
a parameter in a population using data from a sample. In this method, we test
the hypothesis by determining how likely it is that the observed sample
statistic would be obtained if the hypothesis regarding the population
parameter were true. The process of hypothesis testing involves setting up two
contrasting hypotheses: the null hypothesis and the alternative hypothesis. One
selects a random sample, computes the appropriate test statistic from the
sample data, and then assesses whether the sample data provide enough
evidence to support the alternative hypothesis.

In the previous module, you were taught how to formulate null and
alternative hypotheses. You are now ready to analyze statistical hypotheses
to determine the correct test statistic to be used in computing the results
and making decisions.

What’s In

Activity 1: Is It Zee or Tee?

Directions: Write the letter “z” if the statement is a characteristic of the
standard normal distribution and “t” if the given characteristic describes the
t-distribution.

1. It is best applied if you have a limited sample size (n < 30) as long as
the variables are approximately normally distributed.
2. It is also applicable if you do not know the population’s
standard deviation.
3. This is the best to use in a statistical test if the
population standard deviation is known.
4. It is always used for normal distribution.

5. This test is often applied in large samples (n > 30).

Follow-up Questions:

1. In the items above, how did you differentiate the statements
describing the standard normal distribution from those involving the
t-distribution?
2. Were you able to answer them easily? If not, which item/s did you
find difficult to answer?
3. Were you able to differentiate the statements characterizing normal
distribution from those describing t-distribution?

Notes to the Teacher

Check the level of readiness of your student. If the
student failed to answer most of the items, help him/her recall
the concepts about z-distribution and t-distribution by
providing additional activities.

What’s New

Activity 2: Find Me!

Directions: Determine the needed data for each given problem. First, read
and understand the examples below before you proceed to the items that
follow.

Examples:

1. A Grade 11 researcher reported that the average allowance of Senior High
School students was more than ₱100. A sample of 40 students had a mean
allowance of ₱120. At the 𝛼 = 0.01 level of significance, test the claim that
the students had an allowance of more than ₱100. The standard deviation of
the population is ₱50.
𝜇 = 100 𝑥̅ = 120 𝑛 = 40 𝜎 = 50

2. According to a cell phone company, the average price of a cellular phone in
the Philippines is ₱12,999. However, in a sample of 20 customers
randomly asked about the price of their cellular phone, the data collected
showed an average of ₱9,999 and a standard deviation of ₱7,999. Using the
𝛼 = 0.05 level of significance, is there enough evidence that the
average price of a cellular phone is less than ₱12,999?
𝜇 = 12,999 𝑥̅ = 9,999 𝑛 = 20 𝑠 = 7,999

Now, it’s your turn…

1. The average number of ad clicks per day for Facebook before was
192,000 and the standard deviation was 100,000. Sixty-four (64) days
after the redesign, the mean number of ad clicks per day was 200,000.
𝜇 = ______ 𝑥̅ = ______ 𝑛 = ______ 𝜎 = ______

2. The average life of a typical incandescent bulb is 1,500 hours as claimed by
a light bulb company. Thinking that the average life of bulbs is less than
what the company claimed, a client tested a random sample of 55 light
bulbs. The test resulted in a sample mean of 1,300 hours and a standard
deviation of 25 hours. Is there enough evidence to prove that the average
life of the company’s light bulb is less than 1,500 hours?
𝜇 = ______ 𝑥̅ = ______ 𝑛 = ______ 𝑠 = ______
3. The mean number of close friends for the population of people living in
the Philippines is 5. The standard deviation of scores in this population is
1.2. An investigator predicts that the mean number of close friends for
introverts will be significantly different from the mean of the population.
The mean number of close friends for a sample of 26 introverts is 6.
𝜇 = ______ 𝑥̅ = ______ 𝑛 = ______ 𝜎 = ______

Guide Questions:

1. How did you find the activity?
2. What mathematical concepts did you apply in answering the
activity?
3. Were you able to determine the needed data for each notation?
4. Which value of notation/s seemed too difficult to identify on the
given problems?
5. Have you observed the differences in notation among the items? Is the
value of 𝑠 the same as σ? If not, how do they differ?
6. What do you think is the relationship of these notations to
determining the test statistic in hypothesis testing?

What Is It

Before we move forward to the different test statistics, it is important to
define the following terms:

• A population includes all of the elements from a set of data.
• A sample consists of one or more observations drawn from the population.
• Sample mean (𝒙̅) is the mean of sample values collected.
• Population mean (µ) is the mean of all the values in the population.
If the sample is randomly selected and sample size is large, then the
sample mean would be a good estimate of the population mean.
• Population standard deviation (𝝈) is a parameter which is a measure of
variability with fixed value calculated from every individual in the
population.
• Sample standard deviation (𝒔) is a statistic which means that this
measure of variability is calculated from only some of the individuals in a
population.
• Population variance (𝝈²), in the same sense, indicates how the
population data points are spread out. It is the average of the squared
distances from each data point in the population to the mean.
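
The difference between the population measures (𝝈, 𝝈²) and the sample measure (𝒔) can be sketched in a few lines of Python. This is only an illustration: the eight allowance values are hypothetical and are treated as if they were an entire population, and the "sample" is simply the first four of them.

```python
import statistics

# Hypothetical daily allowances (in pesos), treated here as a whole population.
population = [80, 100, 120, 90, 110, 150, 70, 80]

mu = statistics.mean(population)              # population mean
sigma_sq = statistics.pvariance(population)   # population variance (divides by N)
sigma = statistics.pstdev(population)         # population standard deviation

# Pretend only the first four values were observed (assumption for illustration).
sample = population[:4]
x_bar = statistics.mean(sample)               # sample mean
s = statistics.stdev(sample)                  # sample standard deviation (divides by n - 1)

print("mu =", mu, "sigma^2 =", sigma_sq, "sigma =", round(sigma, 3))
print("x_bar =", x_bar, "s =", round(s, 3))
```

The population functions use every value, while the sample functions use only the observed part and divide by n − 1, which is why 𝒔 is only an estimate of 𝝈.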

Since we have already defined the important terms for identifying the test
statistic in hypothesis testing, let us now identify them in a given
problem. Let’s use the example in Activity 2.

Example:

A Grade 11 researcher reported that the average allowance of
Senior High School students was ₱100. A sample of 40 students has a
mean allowance of ₱120. At the 𝛼 = 0.01 level of significance, it was claimed
that the students had an allowance of more than ₱100. The standard deviation
of the population is ₱50.

µ = ₱100 the average allowance of the population (Senior High School students)
𝑛 = 40 the number of students taken from all Senior High School students
𝑥̅ = ₱120 the mean allowance of the sample
𝜎 = ₱50 the standard deviation of the population

Now you already know how to get the data needed in choosing a test
statistic. This time, you will determine what test statistic is appropriate for
computing the test value in hypothesis testing.

A test statistic is a random variable that is calculated from sample
data and used in a hypothesis test. You can use test statistics to determine
whether to reject or fail to reject the null hypothesis. The test statistic
compares your data with what is expected under the null hypothesis.
To identify the test statistic, you must consider whether the
population standard deviation/variance is known or unknown. If the
population standard deviation σ is known, then the standardized sample mean
has a normal distribution. Use z-test. If the population standard deviation σ
is unknown, then the standardized sample mean has a t-distribution. Use
t-test, with the sample standard deviation in place of the population
standard deviation.
z-test
In a z-test, the sample is assumed to be normally distributed. A z-score
is calculated with population parameters such as “population
mean” and “population standard deviation”. It is used to validate a
hypothesis that the sample drawn belongs to the same population. When the
variance is known and either the distribution is normal or the sample size is
large, use a z-test statistic.
t-test
Like a z-test, a t-test also assumes a normal distribution of the
sample. A t-test is used when the population variance or standard deviation
is not known. When the variance is unknown and the sample size is less
than 30, use a t-test statistic, assuming that the population is normal or
approximately normal.

Central Limit Theorem

By the Central Limit Theorem, if the population is normally distributed
or the sample size is large, and the true population mean is µ = µ₀, then the
z statistic has (at least approximately) a standard normal distribution.
When the population standard deviation σ is not known, we may still use
the z-score by replacing the population standard deviation σ with its estimate,
the sample standard deviation s. Since the sample is large, the resulting test
statistic still has a distribution that is approximately standard normal.
Historically, this was very useful, as most statisticians in the past did not
have access to t-tables of quantiles for very large numbers of degrees of
freedom. With modern computers, using a t-test with a very large
sample size is not a problem at all.
However, since you will be using a t-table with only a limited number of
degrees of freedom, you will use the z-test when the sample size is large even
though the population standard deviation is unknown.
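
The claim that the t-distribution behaves like the standard normal distribution for large samples can be checked numerically. The sketch below is not part of the module's method: it approximates t critical values using only the Python standard library (Simpson's rule for the area and bisection for the quantile), and the helper names `t_pdf`, `t_cdf`, and `t_quantile` are made up for this illustration.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=1000):
    """P(T <= x): area from 0 to |x| by Simpson's rule, then symmetry."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_quantile(p, df):
    """Value t with P(T <= t) = p, found by bisection on [-50, 50]."""
    lo, hi = -50.0, 50.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Two-tailed critical values at alpha = 0.05 (upper tail area 0.025).
z_crit = NormalDist().inv_cdf(0.975)
for df in (10, 29, 100, 1000):
    print(df, round(t_quantile(0.975, df), 3), round(z_crit, 3))
```

As the degrees of freedom grow, the printed t critical values shrink toward the z critical value of about 1.96, which is why a z-table is an acceptable substitute when the sample is large.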
When sample sizes are small, the Central Limit Theorem does not
apply. You must then impose stricter assumptions on the population to give
statistical validity to the test procedure. One common assumption is that
the population from which the sample is taken has a normal probability
distribution to begin with. Under such circumstances, if the population
standard deviation is known, then the test statistic 𝑧 = (𝑥̅ − 𝜇₀) / (𝜎/√𝑛)
still has the standard normal distribution.
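
As a minimal sketch of how this test statistic is computed, the snippet below applies the formula to the allowance example from Activity 2 (µ₀ = 100, 𝑥̅ = 120, 𝜎 = 50, 𝑛 = 40). The function name `z_statistic` is hypothetical and chosen only for this illustration.

```python
import math

def z_statistic(x_bar, mu_0, sigma, n):
    """z = (x_bar - mu_0) / (sigma / sqrt(n))."""
    standard_error = sigma / math.sqrt(n)   # spread of the sample mean
    return (x_bar - mu_0) / standard_error

# Allowance example: hypothesized mean 100, sample mean 120, sigma 50, n 40.
z = z_statistic(x_bar=120, mu_0=100, sigma=50, n=40)
print(round(z, 2))
```

The same arithmetic is used for a t statistic; only the standard deviation changes from 𝜎 to 𝑠, and the result is compared against the t-distribution instead.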

The table shows what test statistic is appropriate when:

Population Variance Is Known | Population Variance Is Unknown | Central Limit Theorem (CLT)
Population is normally distributed. | Population is normal or nearly normally distributed. | Population may not be normally distributed.
𝑛 ≥ 30 | 𝑛 < 30 | 𝑛 ≥ 30 or considered sufficiently large
Population standard deviation (𝜎) is known. | Sample standard deviation (𝑠) is known. | Variance is known/unknown.
z-test | t-test | Population standard deviation (𝜎) is unknown. Use z-test by replacing population standard deviation (𝜎) by sample standard deviation (𝑠) in the formula.
Identifying Appropriate Test Statistic

Sample size (n) | σ is known | σ is not known
𝑛 ≥ 30 | z-test | z-test
𝑛 < 30 | z-test | t-test
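
The decision rule summarized above can be written as a short function. This is a sketch under the module's own convention (a sample of 30 or more counts as large); the name `choose_test` and the `large_n` parameter are assumptions made for this example, and for small samples the population is assumed to be at least approximately normal.

```python
def choose_test(n, sigma_known, large_n=30):
    """Return the test statistic suggested by the rule of thumb above.

    n           -- sample size
    sigma_known -- True if the population standard deviation is known
    large_n     -- cutoff for a "large" sample (30 in this module)
    """
    if sigma_known or n >= large_n:
        return "z-test"   # known sigma, or large sample (Central Limit Theorem)
    return "t-test"       # small sample and sigma unknown

# The two worked examples from Activity 2:
print(choose_test(n=40, sigma_known=True))    # allowance example
print(choose_test(n=20, sigma_known=False))   # cellular phone example
```

Only two facts from the problem are needed, the sample size and whether 𝜎 or 𝑠 is given, which is why the earlier activity focused on reading those values off each problem.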

Illustrative Examples:
1. A manufacturer claimed that the average life of batteries used in their
electronic games is 150 hours. It is known that the standard deviation of
this type of battery is 20 hours. A consumer wished to test the
manufacturer’s claim and accordingly tested 100 electronic games using
the battery. It was found out that the mean is equal to 144 hours.
Here, the sample size (n) is 100 (large) and the population
standard deviation (20 hours) is known, so the appropriate test
statistic to be used is z-test.

2. An English teacher wanted to test whether the mean reading speed of
students is 550 words per minute. A sample of 12 students revealed a
sample mean of 540 words per minute with a standard deviation of 5
words per minute. At 0.05 significance level, is the reading speed
different from 550 words per minute?
The sample size (n) is 12, which is less than 30, and only the sample
standard deviation (5 words per minute) was given, so the population
standard deviation is unknown. Therefore, the appropriate test is t-test.

3. A study was conducted to look at the average time students exercise. A
researcher claimed that, on average, students exercise less than 15 hours
per month. In a random sample of size n = 115, it was found that the mean
time students exercise is 𝑥̅ = 11.3 hours per month with s = 6.43 hours
per month.
Since n = 115, the sample size is large and the population variance is
unknown. Hence, z-test is the appropriate tool. (Central Limit Theorem)

Note:
The illustrative examples above used standard deviations instead of
variances. Variance is the square of the standard deviation and conversely,
the standard deviation is the square root of the variance. Hence, if the
standard deviation is known in the problem, then basically, variance is also
known.

What’s More

Activity 3: Mark My Numbers!

Directions: In each problem, underline the population standard
deviation/sample standard deviation and circle the sample size.

1. A sample of 160 people has a mean age of 27 with a population standard
deviation (σ) of 5. Test the hypothesis that the population mean is 26.7 at
α = 0.05.
2. An electric lamps manufacturer is testing a new production method that
will be considered acceptable if the lamps produced by this method result
in a normal population with an average life of 1,300 hours and a
standard deviation equal to 120. A sample of 100 lamps produced by this
method has an average life of 1,250 hours.

3. The cholesterol levels in a certain population have a mean of 210 and a
standard deviation of 21. The cholesterol levels for a random sample of 9
individuals are measured and the sample mean 𝑥̅ is determined. What is
the z-score for a sample mean 𝑥̅ = 180?
4. Mabunga Elementary School has 1,000 students. The principal of the
school thinks that the average IQ of students at Mabunga is at least 110.
To prove her point, she administers an IQ test to 20 randomly selected
students. Among the sampled students, the average IQ is 108 with a
standard deviation of 10.
5. A new energy-efficient lawn mower engine was developed by a well-known
inventor. He claims that the engine will run continuously for 5 hours on
a single gallon of regular gasoline. From his stock of 2,000 engines, the
inventor selects a simple random sample of 50 engines for testing. The
engines run for an average of 295 minutes with a standard deviation of
20 minutes.
Activity 4. Check It Out!

Directions: Read and analyze each problem. On the table below, put a
check on the columns of the criteria that correspond to the given problem.

1. It is claimed that the average age of working students in a certain
university is 35. A researcher selected a random sample of 25 working
students. The computation of their ages resulted in an average of 32
years with a standard deviation of 10 years.
2. A manufacturer of tires claims that their tire has a mean life of at least
50,000 km. A random sample of 28 of these tires is tested and the
sample mean is 33,000 km. Assume that the population standard
deviation is 3,000 km and the lives of the tires are approximately
normally distributed.
3. On average, a drink vending machine is adjusted so it dispenses
240 ml of fruit juice. However, the machine tends to go out of adjustment
and periodic checks are made to determine the average amount of fruit
juice being dispensed. A sample of 28 plastic cup drinks with a standard
deviation of 15 ml is taken to test the adjustment of the machine.
4. Uber company claims that the mean time to rent a car on their app is 60
seconds with a standard deviation of 30 seconds. A random sample of 36
customers attempted to rent a car on the app. The mean time of renting
was 75 seconds. Is this enough evidence to contradict the company's
claim?
5. The waiting time to be seated at the restaurant has population standard
deviation of 10 minutes. An expensive restaurant claims that the average
waiting time for dinner is approximately 1 hour, but we suspect that this
claim is inflated to make the restaurant appear more exclusive and
successful. A random sample of 30 customers yielded a sample average
waiting time of 50 minutes.

Problem | 𝑛 ≥ 30 | 𝑛 < 30 | 𝜎 is known. | 𝜎 is unknown. | z-test | t-test
1. | | | | | |
2. | | | | | |
3. | | | | | |
4. | | | | | |
5. | | | | | |

Activity 5. Which is Which?

Directions: Identify the appropriate test statistic to be used in each
problem. Write z-test or t-test on a separate sheet of paper.
___________1. A sample of n=25 is selected from a normal population, 𝑥̅ = 56
and s= 12.

___________2. Based on the report of the school nurse, the average height of
Grade 11 students has increased. Five years ago, the average height of
Grade 11 students was 170cm with standard deviation of 38cm. She took a
random sample of 150 students and derived the average height of 165cm.

___________3. Knowing from a previous study that the academic average of
student athletes is 80, an athletic adviser asked how his soccer players are
doing academically as compared to other student athletes. After an initiative
to help improve the average of student athletes, the adviser randomly selected
15 soccer players and found 85 as the average with a standard deviation of 1.25.

___________4. The CEO of a battery manufacturing company claimed that
their batteries would last an average of 280 hours under normal use. A
researcher randomly selected 20 batteries from the production line and
tested them. The tested batteries had a mean life span of 250 hours with a
standard deviation of 40 hours. Do we have enough evidence to suggest that
the claim of an average of 280 hours is false?

___________5. It was known that the number of tickets purchased by
students at the ticket window for the volleyball match of two popular
universities followed a distribution that has mean of 500 and standard
deviation of 8.9. Suppose that a few hours before the start of one of these
matches, there are 100 eager students standing in line to purchase tickets.
If there are 250 tickets remaining, what is the probability that all 100
students will be able to purchase the tickets they want?

What I Have Learned

Complete the following sentences by filling each blank with the correct word
or phrase.
1. __________________ is a random variable that is calculated from sample
data and is used in a hypothesis test.
2. ____________ includes all of the elements from a set of data while
______________ consists of one or more observations drawn from the
population.
3. ___________ is a measure of variability calculated from every individual in
the population while ______________ is calculated from only some of the
individuals in a population.
4. The two common test statistics to be computed in hypothesis testing are
________________________ and ____________________________________.
5. A z-score is calculated with population parameters such as population
mean and ______________________.
6. A t-test is used when the __________________ or standard deviation is not
known.
7. The sample size for z-test is ________________________ while
________________________ in t-test.
8. If the population standard deviation is known, use
______________________ and if it’s unknown, use
________________________.
9. The notations that need to be considered in identifying test statistics are
_____________________ and ____________________.
10. If the sample size is sufficiently large and the variance is
unknown, then ________________________ is appropriate to be used.

What I Can Do

Make a comic strip on how to determine the appropriate test statistic when the
variance is known, when the variance is unknown, and when the Central Limit
Theorem is used. Your work will be evaluated using the following rubric.

Clear Understanding of Mathematical Concept 30
Organization and Accuracy of Solution(s) 30
Clear Understanding of Vocabulary 10
Accuracy of Analysis 20
Presentation 10
Total 100

Assessment

Directions: Choose the best answer to the given questions or statements.
Write the letter of your choice on a separate sheet of paper.

1. If the variance is known, what test statistic is appropriate?
A. t-test C. two-tailed test
B. z-test D. one-tailed test
2. One-sample t-statistic is used instead of one-sample z-statistic when
___________________.
A. μ is known. C. μ is unknown.
B. σ is known. D. σ is unknown.
3. Based on the Central Limit Theorem, when the sample size (n) is sufficiently
large and the variance is unknown, what is the statistical test to be
used?
A. t-test C. two-tailed test
B. z-test D. one-tailed test
4. Which of the following is NOT a consideration in using z-test/statistic?
A. Variance is known.
B. Sample standard deviation is known.
C. The population mean is less than 30.
D. Population standard deviation is known.
5. What appropriate tool is applicable if the population is normal, sample
standard deviation is known, and sample size is less than 30?
A. t-test C. normal test
B. z-test D. Central Limit Theorem
6. Which of the following symbols is NOT needed when t-test is used in
computing values?
A. 𝑛 B. µ C. 𝜎 D. 𝑠

7. If in a sample n=16 selected from a normal population, 𝑥̅ = 56 and 𝑠 = 12,
what statistical test is applicable to be used?
A. f-test C. z-test
B. t-test D. Central Limit Theorem
8. Based on Central Limit Theorem, the z-test for single sample may be
used when all the following conditions are TRUE except
_________________.
A. Sample size is less than 30.
B. Data are normally distributed.
C. Population standard deviation is known.
D. Population standard deviation is unknown.
9. What is the sample standard deviation if a simple random sample of 220
students is drawn from a population of 2,740 college students? Among
the sampled students, the average IQ score is 115 with standard
deviation of 10.
A. 10 B. 115 C. 220 D. 2,740
10. The supervisor of a certain company claimed that the mean workday of
his workers is 8.3 hours per day. A sample of 20 workers was taken and it
was found out that the mean workday is 8 hours with standard deviation
of 1 hour. At 0.01 level of significance, is the mean workday less than 8.3
hours?
What test statistic is to be used in the given problem?
A. z-test C. right-tailed test
B. t-test D. left-tailed test

11. Based on the problem in no. 10, 8.3 hours is _____________.
A. σ B. µ C. 𝑥̅ D. 𝑠

12. A leader of an association of jeepney drivers claims that the average daily
take-home pay of all jeepney drivers in Caloocan is ₱350.00. A random
sample of 100 jeepney drivers in Caloocan was interviewed and the average
take-home pay was found to be ₱420.00. If 0.05 significance level was used to
find out whether the average take-home pay is different from ₱350.00 and
population variance was assumed to be ₱92.00, what is the appropriate
test statistic?
A. t-test C. left-tailed test
B. z-test D. right-tailed test
13. L.V. Co. has an average sale of ₱37 million per week from their products
in all their outlets. An area manager found out that the average gross
sales from the 28 outlets under her jurisdiction is ₱32.5 million per week
with standard deviation of ₱1.5 million. Does the mean sales of all outlets
differ from the mean sales of the 28 outlets under her jurisdiction? In the
given problem, what statistical tool is suitable to use?
A. t-test C. ANOVA
B. z-test D. chi-square test
14. A cellular battery manufacturer claims that his battery, when fully
charged, has a mean life of 24 hours with a standard deviation of 4 hours. A
dealer randomly chose a sample of 35 batteries to be tested, which resulted in
a mean life of 22.5 hours. In the given situation, 22.5 hours is __________.
A. sample mean C. sample standard deviation
B. sample size D. population standard deviation
15. According to a study, the average monthly expense for cell phone loads of
Senior High School students in the city is ₱250.00. Is there a reason to
believe that the amount has increased if a sample of 60 students has an
average monthly expense of ₱280.00 and the population standard deviation
is ₱77.00? What is the tool to be used in computing the test value?
A. z-test C. left-tailed test
B. t-test D. alternative test

Additional Activities

Activity 6. Read, Analyze, and Answer!


Directions: Answer the following.
1. In a sample of 𝑛 = 12 selected from a normal population, 𝑥̅ = 50, 𝑠 = 10,
and null hypothesis is 𝐻0 : µ = 45.
a. What is the number of degrees of freedom?
b. What is the test statistic to be used?
2. In order to test 𝐻0 : µ = 26 versus 𝐻𝑎 : µ < 26, a random sample of size 𝑛 =
37 is obtained from the population that is known to be normally
distributed with 𝜎 = 3.
a. Based on the given alternative hypothesis, what is the hypothesis
test?
b. What test statistic would you apply to compute for the value?

Statistics and
Probability
Quarter 2 – Module 5:
Identifying the Appropriate
Rejection Region for a Given
Level of Significance

What I Need to Know

In the previous module, you have learned to identify the appropriate
test statistic when the population variance is known or unknown. You were
able to define different statistical concepts related to z-test and t-test as the
tools for computing the test value in a hypothesis testing problem. The steps in
choosing the correct statistical test were also discussed. Moreover, the test
used under the Central Limit Theorem was explained.

Since you already know how to choose the test statistic applicable in
hypothesis testing, you are now ready to identify the appropriate rejection
region when population variance is known or unknown. In determining
rejection region, you will also be defining other statistical concepts such as
critical value.

After going through this module, you are expected to:


1. define the critical values, level of significance, hypothesis test, and
rejection region;
2. identify the critical value when population variance is known or
unknown; and
3. determine the appropriate rejection region for a given level of significance
when population variance is known/unknown and when the Central Limit
Theorem is to be used.

What I Know

Directions: Choose the best answer to the given questions or statements.
Write the letter of your choice on a separate sheet of paper.
1. In a right-tailed test with 𝛼 = 0.01, the critical value of z is:
A. 1.28 B. 1.65 C. 1.96 D. 2.33
2. The value that separates a rejection region from an acceptance region is
called a ___________.
A. parameter C. critical value
B. hypothesis D. significance level

3. For a two-tailed test with variance unknown, n= 19, and 𝛼 = 0.05, what is
the critical value?
A. ±2.092 B. ±2.101 C. ±2.145 D. ±2.878
4. For a two-tailed test with a sample size of 40, the null hypothesis will be
rejected at 5% level of significance if the test statistic is:
A. 𝑧 ≤ −1.28 𝑜𝑟 𝑧 ≥ 1.28 C. 𝑧 ≤ −1.96 𝑜𝑟 𝑧 ≥ 1.96
B. 𝑧 ≤ −1.645 𝑜𝑟 𝑧 ≥ 1.645 D. 𝑧 ≤ −2.33 𝑜𝑟 𝑧 ≥ 2.33
5. If the alpha level is increased from 0.01 to 0.05, then the boundaries for
the critical region move farther away from the center of the distribution.
A. True C. both A and B
B. False D. cannot be determined
6. In the two-tailed test, the rejection region lies on ___________ of the
normal distribution.
A. center B. left tail C. right tail D. both tails
7. Given the illustration at the right, which of the following is NOT TRUE?
[Figure: normal curve with the critical value 1.645 marked]
A. This is a left-tailed test.
B. This is a right-tailed test.
C. This has a critical value of 1.645.
D. This has a level of significance of 0.05.
8. Given the normal curve at the right, what is the rejection region?
[Figure: normal curve with −1.96 and 1.96 marked]
A. 𝑧 ≤ 1.645 𝑜𝑟 𝑧 ≥ 1.645
B. 𝑧 ≥ −1.645 𝑜𝑟 𝑧 ≥ 1.645
C. 𝑧 ≥ −1.96 𝑜𝑟 𝑧 ≤ 1.96
D. 𝑧 ≤ −1.96 𝑜𝑟 𝑧 ≥ 1.96
9. What is the critical value if the population variance is unknown, 𝑛 = 13,
𝛼 = 0.05, and it is a one-tailed test?
A. 𝑡 =1.782 B. 𝑡 =2.179 C. 𝑡 =2.681 D. 𝑡 =3.055
10. Given a two-tailed test, population variance is known, and 𝛼 = 0.10, what
is the critical region?
A. 𝑧 ≥ 1.28 C. 𝑧 ≤ −2.33 or 𝑧 ≥ 2.33
B. 𝑧 ≤ −1.96 D. 𝑧 ≤ −1.645 or 𝑧 ≥ 1.645

11. Which of the following is the sketch of the normal curve if 𝑧 ≥ 1.645?
A. B. C. D.

12. Which of the following graphs of rejection region show 𝑡 ≥ 2.074?


A. B. C. D.

13. In the given problem below, identify the rejection region.
It is claimed that the mean distance of a certain type of vehicle is 35
miles per gallon of gasoline with population standard deviation σ = 5
miles. What can be concluded about the claim using α = 0.1 if a random
sample of 49 such vehicles has sample mean, x̅ = 36 miles?
A. 𝑧 ≤ −1.28 C. 𝑧 ≤ −1.645 𝑜𝑟 𝑧 ≥ 1.645
B. 𝑧 ≥ 2.33 D. 𝑧 ≤ −2.575 𝑜𝑟 𝑧 ≥ 2.575
14. Based on the problem in no. 13, which is the correct graph?
A. B. C. D.

15. In a modeling agency, a researcher wishes to see if the average height of
female models is less than 67 inches, as the coach claims. A random
sample of 20 models has an average height of 65.8 inches. The standard
deviation of the sample is 1.7 inches. At 𝛼 = 0.05, which of the following
shows the appropriate rejection region for the given problem?
A. B. C. D.

Lesson 5
Identifying the Appropriate Rejection Region for a Given
Level of Significance

In hypothesis testing, a researcher collects sample data. From the
given data, the researcher formulates the null and alternative hypotheses.
Then, s/he chooses the appropriate test statistic and computes it. If the
test statistic falls within a specific range of values, the researcher rejects
the null hypothesis. The range of values that leads the researcher to reject
the null hypothesis is called the region of rejection. What is a rejection
region and how is it important in the process of hypothesis testing?

Before we discuss the topic, let us recall some concepts that will lead
you to the concept of rejection region.

What’s In

Activity 1: You Bring Color to My Life!

Directions: Given a standard normal curve, shade the required area with
color GREEN and for the remaining area, use color RED.

1. between 𝑧 = −1.56 and 𝑧 = +1.56

2. to the left of 𝑧 = 2.05

3. to the right of 𝑧 = −1.3

4. between 𝑧 = −1.58 and 𝑧 = 1.58

5. to the left of 𝑧 = 1.96

Notes to the Teacher


Check the level of readiness of your students. If the
students did not pass the first activity, provide other
activities that will help them recall how to determine the
areas under the normal curve.

What’s New

Activity 2: Let Me Read and Understand!


Directions: Carefully read the problem and answer the questions that
follow.
Problem 1. A banana company claims that the mean weight of their
banana is 150 grams with a standard deviation of 18 grams. Data generated
from a sample of 49 bananas randomly selected indicated a mean weight of
153.5 grams per banana. Is there sufficient evidence to reject the company’s
claim? Use 𝛼 = 0.05.
1. What are the hypotheses?
2. Is it two-tailed or one-tailed test?
3. What is the level of significance?
4. Is the population standard deviation known?
5. What appropriate test statistic (z-test or t-test) can you use?
6. Based on the level of significance, hypothesis test, and test statistic,
what is the critical value?
7. Draw the rejection region.

Problem 2. The manufacturer of an airport baggage scanning machine
claims it can handle an average of 530 bags per hour. At 𝛼 = 0.05 in a
left-tailed test, would a sample of 16 randomly chosen hours with a mean of
510 and standard deviation of 50 indicate that the manufacturer’s claim is
an overstatement?
1. What are the hypotheses?
2. Is it two-tailed test or one-tailed test?
3. What is the level of significance?
4. Is the population standard deviation known or unknown?
5. What appropriate test statistic (z-test or t-test) can you use?
6. Based on the level of significance, hypothesis test, and test statistic,
what is the critical value?
7. Draw the rejection region.

Guide Questions:
1. How did you find the activity?
2. What are the similarities and differences of the two problems?
3. Have you encountered previously learned statistical concepts? If yes,
will you discuss those concepts?
4. Were you able to answer all the follow-up questions? If not, why?
5. What are the concepts that seemed to be familiar and unfamiliar to
you?
6. How do these concepts relate to the rejection region?

What Is It

To be able to answer the questions in the next activities, please take
time to read and understand this section that discusses the next steps in
hypothesis testing.

Critical Value, Significance Level, and Rejection Region

In hypothesis testing, a critical value is a point on the test
distribution that is compared to the test statistic to determine whether to
reject the null hypothesis. Critical values for a test of hypothesis depend
on the test statistic, which is specific to the type of test, and on the
significance level (𝛼), which defines the sensitivity of the test. A value of
𝛼 = 0.05 means that, when the null hypothesis is in fact true, it is rejected
5% of the time. In practice, the common values of α are 0.10, 0.05, and 0.01.

Critical Value of z-Distribution

A critical value of z (z-score) is used when the sampling distribution
is normal or close to normal. Z-scores are used when the population
standard deviation is known or when the sample size is large. Although
the z-score can also be used to calculate probabilities when the standard
deviation is unknown and the sample is small, many statisticians prefer
using the t-distribution to calculate these probabilities.
Table of Critical Values (Z-Score)

                        Level of Significance
Test Type            𝛼 = 0.01   𝛼 = 0.025   𝛼 = 0.05   𝛼 = 0.10
left-tailed test     −2.33      −1.96       −1.645     −1.28
right-tailed test    2.33       1.96        1.645      1.28
two-tailed test      ±2.575     ±2.24       ±1.96      ±1.645

a. left-tailed test: If the alternative hypothesis 𝐻𝑎 contains the less-than
inequality symbol (<), the hypothesis test is a left-tailed test.
b. right-tailed test: If the alternative hypothesis 𝐻𝑎 contains the
greater-than inequality symbol (>), the hypothesis test is a right-tailed test.
c. two-tailed test: If the alternative hypothesis 𝐻𝑎 contains the not-equal-to
symbol (≠), the hypothesis test is a two-tailed test. In a two-tailed test,
each tail has an area of 𝛼/2.
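
The three rules above can be written as a small lookup. This is only a sketch for illustration: the function name classify_test and the use of "!=" as a keyboard-friendly stand-in for ≠ are assumptions, not part of the module.

```python
def classify_test(ha_symbol, alpha):
    """Return the test type and the area of each rejection tail.

    ha_symbol is the inequality symbol found in the alternative hypothesis.
    """
    if ha_symbol == "<":
        return "left-tailed", alpha
    if ha_symbol == ">":
        return "right-tailed", alpha
    if ha_symbol in ("≠", "!="):
        # A two-tailed test splits alpha equally between the two tails.
        return "two-tailed", alpha / 2
    raise ValueError("ha_symbol must be '<', '>', or '≠'")


print(classify_test("<", 0.01))   # ('left-tailed', 0.01)
print(classify_test("≠", 0.05))   # ('two-tailed', 0.025)
```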

Examples:

Find the critical z values. In each case, assume that the normal distribution
applies.

1. left-tailed test with α = 0.01: z = −2.33 (based on the table of critical
values of z)
2. two-tailed test with α = 0.05: z = ±1.96
3. right-tailed test with α = 0.025: z = 1.96
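
For readers who want to check these table values with software, the sketch below computes them from the standard normal distribution using Python's built-in statistics module. The function name critical_z is an assumption made for this illustration. The first result prints as −2.326, which the table rounds to −2.33.

```python
from statistics import NormalDist


def critical_z(tail, alpha):
    """Critical z-value for a left-, right-, or two-tailed test.

    For a two-tailed test the positive value is returned; use it as ±value.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return std_normal.inv_cdf(alpha)
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)
    if tail == "two":
        return std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("tail must be 'left', 'right', or 'two'")


print(round(critical_z("left", 0.01), 3))    # -2.326
print(round(critical_z("two", 0.05), 3))     # 1.96
print(round(critical_z("right", 0.025), 3))  # 1.96
```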

Critical Value of t-Distribution

The t-distribution table values are critical values of the
t-distribution. The column header is the t-distribution probabilities (alpha).
The row names are the degrees of freedom (df).
To find critical values for t-distribution:
1. Identify the level of significance.
2. Identify the degrees of freedom, df = n − 1.
3. Find the critical value using t-distribution in the row with n − 1 degrees of
freedom. If the hypothesis test is:
a. left-tailed, use “α one tail” column with a negative sign.
b. right-tailed, use “α one tail” column with a positive sign.
c. two-tailed, use “α two tails” column with a negative and a positive
sign.

Critical Value Table for t-Distribution

𝜶 for one-tailed test    0.05     0.025     0.01      0.005
𝜶 for two-tailed test    0.10     0.05      0.02      0.01
df = (n – 1)
 1                       6.314    12.706    31.821    63.657
 2                       2.920    4.303     6.965     9.925
 3                       2.353    3.182     4.541     5.841
 4                       2.132    2.776     3.747     4.604
 5                       2.015    2.571     3.365     4.032
 6                       1.943    2.447     3.143     3.707
 7                       1.895    2.365     2.998     3.499
 8                       1.860    2.306     2.896     3.355
 9                       1.833    2.262     2.821     3.250
10                       1.812    2.228     2.764     3.169
11                       1.796    2.201     2.718     3.106
12                       1.782    2.179     2.681     3.055
13                       1.771    2.160     2.650     3.012
14                       1.761    2.145     2.624     2.977
15                       1.753    2.131     2.602     2.947
16                       1.746    2.120     2.583     2.921
17                       1.740    2.110     2.567     2.898
18                       1.734    2.101     2.552     2.878
19                       1.729    2.093     2.539     2.861
20                       1.725    2.086     2.528     2.845
21                       1.721    2.080     2.518     2.831
22                       1.717    2.074     2.508     2.819
23                       1.714    2.069     2.500     2.807
24                       1.711    2.064     2.492     2.797
25                       1.708    2.060     2.485     2.787
26                       1.706    2.056     2.479     2.779
27                       1.703    2.052     2.473     2.771
28                       1.701    2.048     2.467     2.763
29                       1.699    2.045     2.462     2.756
30                       1.697    2.042     2.457     2.750

Examples:

a) Find the critical t-value for a left-tailed test with α = 0.05 and n = 21.
Answer: df = 21 − 1 = 20, so t = −1.725
b) Find the critical t-value for a right-tailed test with α = 0.01 and n = 17.
Answer: df = 17 − 1 = 16, so t = 2.583
c) Find the critical t-values for a two-tailed test with α = 0.05 and n = 26.
Answer: df = 26 − 1 = 25, so t = ±2.060
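
The lookup steps above can be sketched as a small program. Only five rows of the table (df = 9, 16, 20, 23, and 25) are copied here to keep the sketch short, and the names T_TABLE, ONE_TAIL_ALPHAS, and critical_t are assumptions made for this illustration.

```python
import math

# Rows copied from the critical value table: df -> critical values for
# one-tailed alpha = 0.05, 0.025, 0.01, 0.005
# (equivalently two-tailed alpha = 0.10, 0.05, 0.02, 0.01).
ONE_TAIL_ALPHAS = (0.05, 0.025, 0.01, 0.005)
T_TABLE = {
    9: (1.833, 2.262, 2.821, 3.250),
    16: (1.746, 2.120, 2.583, 2.921),
    20: (1.725, 2.086, 2.528, 2.845),
    23: (1.714, 2.069, 2.500, 2.807),
    25: (1.708, 2.060, 2.485, 2.787),
}


def critical_t(tail, alpha, n):
    """Look up the critical t-value; for a two-tailed test use it as ±value."""
    df = n - 1
    # A two-tailed test reads the column whose one-tailed alpha is alpha / 2.
    one_tail_alpha = alpha / 2 if tail == "two" else alpha
    for column, table_alpha in enumerate(ONE_TAIL_ALPHAS):
        if math.isclose(table_alpha, one_tail_alpha):
            value = T_TABLE[df][column]
            return -value if tail == "left" else value
    raise ValueError("alpha is not a column of the table")


print(critical_t("left", 0.05, 21))   # -1.725
print(critical_t("right", 0.01, 17))  # 2.583
print(critical_t("two", 0.05, 26))    # 2.06
```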

Critical Regions/Rejection Regions

Critical region, also known as the rejection region, describes the
entire area of values that indicates you reject the null hypothesis. In other
words, the critical region is the area encompassed by the values not
included in the acceptance region. It is the area of the “tails” of the
distribution.

The “tails” of a test are the values beyond the critical values. In
other words, the tails are the ends of the distribution, and they begin at
the critical values: a left tail consists of the values less than or equal to
the negative critical value, and a right tail consists of the values greater
than or equal to the positive critical value.

Rejection Region If Population Variance Is Known

To determine the critical region for a normal distribution, we use the
table for the standard normal distribution. If the level of significance is
α = 0.10, then for a one-tailed test, the critical region is below 𝑧 = −1.28
(left-tailed) or above 𝑧 = 1.28 (right-tailed). For a two-tailed test, use
α/2 = 0.05, and the critical region is below 𝑧 = −1.645 and above 𝑧 = 1.645.
If the computed statistic falls in the critical region (for a two-tailed
test, if its absolute value is equal to or greater than the critical value),
then the null hypothesis 𝐻𝑜 should be rejected and the alternative
hypothesis 𝐻𝑎 is assumed to be supported.

Rejection Region If Population Variance Is Unknown

To determine the critical region for a t-distribution, we use the table
of the t-distribution. (Assume that we use a t-distribution with 20 degrees of
freedom.) If the level of significance is α = 0.10, then for a one-tailed
test, 𝑡 = −1.325 (left-tailed) or 𝑡 = 1.325 (right-tailed). For a two-tailed
test, use α/2 = 0.05, and then 𝑡 = −1.725 and 𝑡 = 1.725. If the computed
statistic falls in the critical region (for a two-tailed test, if its
absolute value is equal to or greater than the critical value), then the
null hypothesis 𝐻𝑜 will be rejected and the alternative hypothesis 𝐻𝑎 is
assumed to be supported.

Hypothesis Tests and Their Tails

There are three types of tests from a “tails” standpoint:

• A left-tailed test only has a tail on the left side of the graph.
[Figure: curve with the rejection region in the left tail]

• A right-tailed test only has a tail on the right side of the graph.
[Figure: curve with the rejection region in the right tail]

• A two-tailed test has tails on both ends of the graph. This is a test
where the null hypothesis is a claim of a specific value.
[Figure: curve with a rejection region in each tail]

Illustrative Examples:

Determine the critical values and the appropriate rejection region. Sketch
the sampling distribution.

1. Right-tailed test where σ is known, α = 0.05, and n = 34

In this example, the population standard deviation is known. Therefore,
the test statistic is the z-test. For a level of significance of 0.05 and a
one-tailed test, the critical z-value from the table is 1.645. The hypothesis
test is right-tailed, so the inequality symbol is ≥. Hence, the rejection
region is z ≥ 1.645.
To sketch the graph, first locate the critical value of 1.645, which is
between 1 and 2 on the normal curve. Then, shade the region greater than
the critical value because it is a right-tailed test.

[Figure: normal curve with the critical value z = 1.645 marked and the
rejection region shaded to its right]

2. Two-tailed test where σ is unknown, α = 0.05, and n = 10

Since this is a two-tailed test, ½ of 0.05 = 0.025 of the values would be
in the left tail and the other 0.025 would be in the right tail. Looking up
the t-score with df = 10 − 1 = 9 associated with 0.025 on the reference
table, we find 2.262. Therefore, +2.262 is the critical value of the right
tail and −2.262 is the critical value of the left tail. The rejection region
is t ≤ −2.262 or t ≥ 2.262.

[Figure: t curve with the critical values t = −2.262 and t = +2.262 marked
and a rejection region shaded in each tail]

3. Left-tailed test where σ is known, α = 0.01, and n = 40

A one-tailed test with α = 0.01 would have 99% of the area under the curve
outside of the critical region. Since the variance is known, we use the
z-score as the reference to find the critical value. This is a left-tailed
test, so the critical value we need is negative. The critical value is
z = −2.326 (−2.33 when rounded to two decimal places, as in the table). The
rejection region is z ≤ −2.326.

[Figure: normal curve with the critical value z = −2.326 marked and the
rejection region shaded to its left]

In the first three examples, you were able to find the rejection region
given the hypothesis test, whether the population variance is known or
unknown, the sample size, and the level of significance. The following
examples show how to determine the appropriate rejection region in a
real-life problem.

4. A survey reports that a customer in the drive-thru lane of one fast food
chain waits eight minutes for his/her order. A sample of 24 customers at
the drive-thru lane showed a mean of 7.5 minutes with a standard
deviation of 3.2 minutes. Is the waiting time at the drive-thru lane less
than that reported in the survey? Use 0.05 significance level.

Hypotheses: 𝐻𝑜: μ = 8, 𝐻𝑎: μ < 8
Hypothesis Test: left-tailed test
Population Standard Deviation Known/Unknown: σ is unknown.
Level of Significance: α = 0.05
Number of Samples: n = 24
z-value or t-value: t-value

A one-tailed test with 0.05 level of significance has 95% of the area
under the curve outside of the critical region. Since the variance is unknown,
we use the t-score with df = 24 − 1 = 23 as the reference to determine the
critical value. This is a left-tailed test, so the critical value we need is
negative. From the one-tailed α = 0.05 column of the table, the critical
value is −1.714 and the rejection region is t ≤ −1.714.

5. A banana company claims that the mean weight of its bananas is 150
grams with a standard deviation of 18 grams. Data generated from a
sample of 49 bananas randomly selected indicated a mean weight of
153.5 grams per banana. Is there sufficient evidence to reject the
company’s claim? Use 𝛼 = 0.05.

Hypotheses: 𝐻𝑜: μ = 150, 𝐻𝑎: μ ≠ 150
Hypothesis Test: two-tailed test
Population Standard Deviation Known/Unknown: σ is known.
Level of Significance: α = 0.05
Number of Samples: n = 49
z-value or t-value: z-value

The rejection region is z ≤ −1.96 or z ≥ 1.96.

[Figure: normal curve with the critical values z = −1.96 and z = 1.96
marked and a rejection region shaded in each tail]
After you find the appropriate rejection region, you will then
compute the standard (z or t) value based on the given data in the
hypothesis problem. If the computed value is in the rejection region,
reject the null hypothesis; if not, do not reject the null
hypothesis. This decision-making step is discussed further in the
next module.
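
The comparison described above can be sketched in a few lines. The function name in_rejection_region and the computed values used in the two calls are hypothetical and serve only to illustrate the rule; the critical values 1.645 and 1.96 come from the z table.

```python
def in_rejection_region(statistic, critical_value, tail):
    """True when the computed z or t value lies in the rejection region."""
    c = abs(critical_value)  # magnitude from the table; the tail gives the sign
    if tail == "left":
        return statistic <= -c
    if tail == "right":
        return statistic >= c
    if tail == "two":
        return abs(statistic) >= c
    raise ValueError("tail must be 'left', 'right', or 'two'")


# Hypothetical computed values, for illustration only.
print(in_rejection_region(-2.10, 1.645, "left"))  # True: reject the null hypothesis
print(in_rejection_region(1.20, 1.96, "two"))     # False: do not reject
```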

What’s More

Activity 1: What is My Value?


Directions: Find the critical value of the following.
1. right-tailed test 𝛼 = 0.05 n=25
2. two-tailed test 𝛼 = 0.01 n=20
3. two-tailed test 𝛼 = 0.10 n=29
4. left-tailed test 𝛼 = 0.05 n=50
5. two-tailed test 𝛼 = 0.01 n=67
6. one-tailed test, σ known 𝛼 = 0.05, n=34
7. two-tailed test, σ unknown 𝛼 = 0.01 n=23
8. right-tailed test, σ unknown 𝛼 = 0.01 n=15
9. one-tailed test, σ known 𝛼 = 0.025 n=37
10. left-tailed test, σ known 𝛼 = 0.05 n=36

Activity 2. Reject It!


Directions: Find the rejection region for each hypothesis test based on the
information given.
1. 𝐻𝑜 : μ=121 𝐻𝑎 :μ >121 α=0.01 n=39 σ=known
2. 𝐻𝑜 : μ=98.6 𝐻𝑎 :μ ≠98.6 α=0.05 n= 25 σ=unknown
3. 𝐻𝑜 : μ=27 𝐻𝑎 : μ <27 α=0.05 n=12 σ=known
4. 𝐻𝑜 : μ=65 𝐻𝑎 : μ≠65 α=0.05 n=9 σ=unknown
5. 𝐻𝑜 : μ=2.9 𝐻𝑎 : μ>2.9 α=0.01 n=50 σ=known

Activity 3. Let’s Do Sketch!


Directions: Sketch the graph given the critical value and rejection region.
1. 𝑧 ≥ 2.33
2. 𝑧 ≤ −1.645 or 𝑧 ≥ 1.645
3. 𝑡 ≤ −2.145
4. 𝑡 ≤ −1.771 or 𝑡 ≥ 1.771
5. 𝑧 ≤ −1.28

Activity 4. Think Critically!
Directions: Identify the critical value of each given problem. Find the
rejection region and sketch the curve on a separate sheet of paper.
1. 𝐻𝑂 : 𝜇 = 90
𝐻𝑎 : 𝜇 ≠ 90
The sample mean is 69 and sample size is 35. The population follows a
normal distribution with standard deviation 5. Use 𝛼 = 0.05.
2. A survey reports the mean age at death in the Philippines is 70.95 years
old. An agency examines 100 randomly selected deaths and obtains a
mean of 73 years with standard deviation of 8.1 years. At 1% level of
significance, test whether the agency’s data support the alternative
hypothesis that the population mean is greater than 70.95.
3. The mean time a customer waits in line before checking out in a grocery
chain is claimed to be less than 10 minutes. To verify the performance of
the store, a sample of 25 customers was observed, and the mean time
obtained is 9.5 minutes with a standard deviation of 1.6 minutes. Use these
data to test the null hypothesis that the mean time is 10 minutes, at 0.01
level of significance.
4. A fast food restaurant cashier claimed that the average amount spent by
the customers for dinner is ₱125.00. Over a month period, a sample of 50
customers was selected and it was found that the average amount spent
for dinner was ₱130.00. Using 0.05 level of significance, can it be
concluded that the average amount spent by customers is more than
₱125.00? Assume that the population standard deviation is ₱7.00.
5. According to the radio announcer, the average price of a kilogram of pork
liempo is more than ₱210.00. However, a sample of 15 prices randomly
collected from different markets showed an average of ₱215.00 and
standard deviation of ₱9.00. Using 0.05 level of significance, is there
sufficient evidence to conclude that the average price of pork liempo is
more than ₱210.00?

What I Have Learned

Directions: Complete the following statements.


1. _____________ is a point on the test distribution that is compared to the
test statistic to determine whether to reject or accept the null hypothesis.
2. A _____________ may be defined as the sensitivity of the test.
3. The most used levels of significance are _______, _______, and _______.

4. Z-score is used when the population standard deviation is _____________
while t-score is used when the population standard deviation
is_____________.
5. _____________, also known as the critical region, describes the entire area
of values that indicates you reject the null hypothesis.
6. The values outside the critical values are the _____________.
7. To determine the critical region if population variance is known, use table
for _____________ distribution while if the variance is unknown, use table
for _____________ distribution.
8. If the hypothesis test is a right-tailed test, then the z-values or t-values
on the rejection region are _____________ the critical value.
9. When the given hypothesis test is a two-tailed test, then the rejection
regions are on ___________________ tails of the distribution.
10. To sketch the graph of the rejection region, locate first the _____________.

What I Can Do

Create a meme about concepts in hypothesis testing such as hypothesis,


test statistic, or rejection region.

Photo taken from https://memegenerator.net/instance/48001959/dikembe-mutombo-test-statistic-in-the-rejection-region-null-hypothesis-not-in-my-house

Criteria for Creating a Meme Equivalent Points


Design 15 points
Appropriateness 15 points
Uniqueness 10 points
Effectiveness 10 points
Total 50 points

Assessment

Directions: Choose the best answer to the given questions or statements.


Write the letter of your choice on a separate sheet of paper.

1. In a left-tailed test with 𝛼 = 0.01, the critical value of z is:


A. -2.576 B. -2.330 C. -1.960 D. -1.645
2. Which of the following defines the area encompassed by the values not
included in the non-rejection region or also the area of the tails of the
distribution?
A. critical value C. level of significance
B. rejection region D. population variance

3. For a two-tailed test with variance unknown, n= 16, and 𝛼 = 0.05, what is
the critical value?
A. ±2.092 B. ±2.131 C. ±2.145 D. ±2.145
4. For a left-tailed test with a sample of 15, the null hypothesis will be
rejected at 5% level of significance if the test statistic is:
A. 𝑡 ≤ −1.761 B. 𝑡 ≤ −1.753 C. 𝑡 ≤ −1.703 D. 𝑡 ≤ −1.697
5. If the level of significance decreased from 0.1 to 0.05, then the
boundaries for the critical region move farther away from the center of
the distribution.
A. true B. false C. both A and B D. cannot be determined
6. In a right-tailed test, the rejection region lies in the ________ tail of the distribution.
A. up B. left C. right D. down
7. Based on the graph, which of the following is TRUE?
[Graph: curve with −1.725 marked on the horizontal axis]
A. This is a two-tailed test.
B. This is a right-tailed test.
C. Level of significance is 0.025.
D. The rejection region is 𝑡 ≤ −1.725.
8. What is the rejection region of the given normal curve at the right?
A. 𝑧 ≥ 1.28
B. 𝑧 ≥ 1.645
C. 𝑧 ≥ 1.96
D. 𝑧 ≤ 2.33
9. Given a left-tailed test, population standard deviation is unknown, 𝑛 =
27, 𝛼 = 0.01, what is the critical value?

A. 𝑡 = −2.528 B. 𝑡 = −2.479 C. 𝑡 = −1.706 D. 𝑡 = 2.479
10. What is the critical value if the population variance is known, 𝛼 = 0.025,
and it is a two-tailed test?
A. 𝑧 = ±1.28 B. 𝑧 = ±1.645 C. 𝑧 = ±1.96 D. 𝑧 = ±2.24
11. Which of the following is the correct illustration of rejection region 𝑡 ≤
−1.943?
A. B. C. D.

12. Which of the following is the sketch of the normal curve if 𝑧 < −1.645 or
𝑧 > 1.645?
A. B. C. D.

13. Given the graph of the normal curve at the right, what are the directional
test of hypothesis and critical z value if 𝛼 = 0.01?
A. two-tailed test, ±2.33
B. two-tailed test, ± 2.575
C. left-tailed test, −1.645
D. right-tailed test, 1.645

14. In the given problem below, what is the rejection region?


The factory owner claimed that their bottled fruit juice has an
average capacity of less than 280 ml. To test the claim, a group of
consumers gets a sample of 80 bottles of the fruit juice, measures the
capacity of each, and finds the mean capacity to be 265 ml. The standard
deviation is 8 ml. Use 𝛼 = 0.05 level of significance to test the claim.
A. 𝑧 ≤ −1.645 C. 𝑧 ≥ 1.645
B. 𝑧 ≤ −1.28 D. 𝑧 ≥ 2.33
15. Based on the given problem in no. 14, which is the appropriate rejection
region?

A. C.

B. D.

Additional Activities

Activity 5. Do It Now!

Directions: Read and analyze the given problem. Supply the data being
asked for on the items that follow.

1. The effects of drugs and alcohol on the nervous system have been the
subject of significant research. A neurologist wants to test the effect of
a drug by injecting 100 rats with a unit dose of the drug, subjecting each
rat to a stimulus, and recording its response time. The sample mean
response time is x̅ = 1.05 with a standard deviation of s = 0.5. The mean
response time of a rat not injected with the drug is 1.2 seconds. She
wishes to test whether the mean response time for drug-injected rats
differs from 1.2 seconds. Assume that the population is normal and use
α = 0.05.

a. null and alternative hypotheses: ____________________________________


b. test of hypothesis: __________________________________________________
c. level of significance: ________________________________________________
d. population standard deviation: _____________________________________
e. sample standard deviation: _________________________________________
f. number of samples: ________________________________________________
g. test statistic: _______________________________________________________
h. critical value: ______________________________________________________
i. rejection region: ___________________________________________________
j. graph:

References
Textbooks

Chua, Jedd Amerson S. Soaring 21st Century Mathematics: Statistics and
Probability. Quezon City: Phoenix Publishing House Inc., 2016.
Sirug, Winston S. Statistics and Probability for Senior High School CORE
Subject A Comprehensive Approach K to 12 Curriculum Compliant.
Manila: Mindshapers Co., Inc., 2017.

Online Resources
Bognar, Matt. “Normal Curve Generator.” Accessed May 29, 2020.
https://homepage.divms.uiowa.edu/~mbognar/applets/normal.html
Ku Leuven. “Critical Region.” Accessed May 28, 2020.
https://lstat.kuleuven.be/training/coursedescriptions/Goodyear/critical_region.pdf
LibreTexts. “Testing Hypothesis.” Accessed May 29, 2020.
https://stats.libretexts.org/Bookshelves/Introductory_Statistics/Book%3A_Introductory_Statistics_(Shafer_and_Zhang)/08%3A_Testing_Hypotheses/8.E%3A_Testing_Hypotheses_(Exercises)
Glen, Stephanie. “Critical Values: Find a Critical Value in Any Tail.”
Accessed May 28, 2020.
https://www.statisticshowto.com/probability-and-statistics/find-critical-values/

Statistics and
Probability
Quarter 2 - Module 6:
Computing of Test Statistic on
Population Mean

What I Need to Know

In this module, you will learn how to compute the test statistic on a
population mean particularly the t-test and z-test. It is a skill that you need
to develop to be able to determine whether you reject the null hypothesis or
otherwise (to be discussed in the next module). Perform each activity
independently. If you find any difficulty in answering the exercises, you may
ask the assistance of your teacher or you may consult your peers.
After going through this module, you are expected to:
1. determine the appropriate test statistic to be used in the given
problem/situation; and
2. compute for the test statistic value (population mean).

What I Know

Directions: Choose the best answer to the given questions or statements.


Write the letter of your choice on a separate sheet of paper.
1. It refers to a value calculated from sample data which is needed in
deciding whether the null hypothesis is rejected or not.
A. test statistics C. null hypothesis
B. critical region D. alternative hypothesis
2. What test statistic will be used if the sample size is above 30?
A. t-test C. population mean
B. z-test D. standard deviation
3. What test statistic can be used when the population standard deviation
is known?
A. t-test C. population mean
B. z-test D. standard deviation
4. What test statistic can be used when the population standard deviation
is unknown?
A. t – test C. population mean
B. z – test D. standard deviation

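5. When finding the z-computed value, which formula should be used for
hypothesis testing?
A. $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
B. $z = \frac{\bar{x} - \mu}{s / \sqrt{n}}$
C. $z = \frac{\mu - \bar{x}}{s / \sqrt{n}}$
D. $z = \frac{\mu - \bar{x}}{\sigma / \sqrt{n}}$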

6. When should you use the t-test?


I. When you are testing for a population mean
II. When the sample standard deviation is given
III. When the population standard deviation is given
IV. When you are testing a proportion/percentage of a
population
A. I and II B. II and III C. I and III D. II and IV

For nos. 7-8, refer to the problem below:


Given: 𝐻𝑜 : μ = 8.6 𝐻𝑎 : μ > 8.6
The study, conducted among 25 respondents, has a sample mean of 9.1 and a
standard deviation of 2.1. Use 𝛼 = 0.05.
7. What test statistics should be used?
A. t-test C. population mean
B. z-test D. standard deviation
8. What is the computed value?
A. −1.190 B. −0.567 C. 0.567 D. 1.190
9. How many samples are best when dealing with z-test?
A. cannot be determined C. smaller than 30
B. exactly 30 D. equal or larger than 30

For nos. 10 – 12, refer to the given problem below:


The Choco Toppings, Inc. is one of the manufacturers of chocolate
toppings which uses a machine to dispense liquid ingredients into bottles
that move along a filling line. The owner claims that the machine can
dispense at an average of 50 grams with a standard deviation of 0.7 grams.
A sample of 35 bottles was selected and it was found out that the average
amount dispensed in the sample is 49.3 grams. Test the claims of the owner
of the company at 5% level of significance.
10. Which of the following information is correct?
A. 𝛼 = 0.5 B. 𝜎 = 0.7 C. 𝑥̅ = 35 D. 𝜇 = 49.3
11. What test statistic will be used?
A. t-test C. population mean
B. z-test D. standard deviation

12. Find the computed value.
A. -5.916 B. -4.950 C. 4.950 D. 5.916

13. Which test statistic will be used if the sample size is 15?
A. t-test C. cannot be determined
B. z-test D. neither t-test nor z-test
14. Which statistical method can you use when you have a normal
distribution of data?
A. t–test only C. either t–test or z–test
B. z–test only D. neither t–test nor z–test
15. A tire manufacturer claims that its tires have a mean life of 40,000 km. A
random sample of 46 of these tires is tested and the sample mean is
38,000 km. Assume that the population’s standard deviation is 2,000 km
and the lives of the tires are approximately normally distributed.
Determine the computed value at 5% level of significance.
A. -6.782 B. -3.033 C. 3.033 D. 6.782

How did you find this pre-test? Did you encounter both familiar and
unfamiliar terms? Kindly compare your answers with the Answer Key in the
last part of this module.

If you obtained 100% or a perfect score, skip the module and
immediately move to the next module. But if you missed a point, please
proceed with the module as it will enrich your knowledge in computing the
test statistic.

Lesson 6
Computing Test Statistic on Population Mean

One of the steps in hypothesis testing is the computation of the test
statistic. Remember that it is the value calculated from sample data that
is needed in deciding whether to reject the null hypothesis or not.

Do you still remember when to use t-test? How about z-test? Answer
the activity that follows for a short review on t-test and z-test.

What’s In

Is It T or Z?

Directions: Identify the appropriate test statistic to be used based on the


given information. Write T if it is t-test and Z if it is z-test.
1. The sample mean is 345 and the sample size is 46. The population is
normally distributed with a standard deviation of 11. Test the hypothesis
at 0.05 level of significance. Consider the hypotheses below:
𝐻𝑜 : 𝜇 = 342 𝐻𝑎 : 𝜇 ≠ 342
2. Test at 𝛼 = 0.05 the null hypothesis 𝐻𝑜 : 𝜇 = 2.19 against the alternative
hypothesis 𝐻𝑎 : 𝜇 < 2.19 with 𝑛 = 18, 𝑥̅ = 1.36, and 𝑠 = 0.14. Assume that
the population is approximately normal.
3. The sample size is less than 18 and the standard deviation is 3.67.
4. 𝑥̅ = 125.3 𝑠=5 𝜇 = 124 𝑛 = 24 𝛼 = 0.05
5. 𝑥̅ = 25.4 𝜇 = 22.6 𝜎 = 15 𝑛 = 118 𝛼 = 0.01
Were you able to answer all the questions correctly? If yes, the next
activity will be easy for you. If not, go back to your notes about the tests
concerning means.

What’s New

t-Test vs z-Test

Directions: Complete the diagram below.

Do you know the standard deviation (σ)?

   YES → 1. Use: __________
   NO  → Is the sample size above 30?
            YES → 2. Use: __________
            NO  → 3. Use: __________

I think you are very much ready for this topic. Read, analyze, and
study the given examples carefully.

What Is It

There are two specific test statistics used for hypothesis testing
concerning means: z-test and t-test.

If the sample size is large, where 𝑛 ≥ 30, and the population standard
deviation (𝜎) is known, use the z-test.

In finding the z-value, use the formula below:

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$

where: 𝑥̅ = sample mean          𝜇 = population mean
       𝑛 = sample size          𝜎 = population standard deviation

On the other hand, the t-test is used when 𝑛 < 30, the population is
normal or nearly normal, and the population standard deviation (𝜎) is
unknown, so the sample standard deviation (𝑠) is used in its place.
The formula for the t-value is:

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$

where: 𝑥̅ = sample mean          𝜇 = population mean
       𝑛 = sample size          𝑠 = sample standard deviation
The degrees of freedom is 𝑛 − 1 or 𝑑𝑓 = 𝑛 − 1.
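
The two formulas differ only in which standard deviation goes in the denominator, so they can be sketched with one helper. The names choose_test and compute_statistic are assumptions made for this illustration, the selection rule assumes that a known σ or a sample size of at least 30 leads to the z-test, and the data in the two calls are hypothetical.

```python
import math


def choose_test(n, sigma_known):
    """Return 'z' when sigma is known or n >= 30; otherwise return 't'."""
    return "z" if sigma_known or n >= 30 else "t"


def compute_statistic(sample_mean, pop_mean, sd, n):
    """(x-bar - mu) / (sd / sqrt(n)).

    Pass sigma as sd for a z-test, or s for a t-test (or for a z-test in
    which s serves as the estimate of sigma).
    """
    return (sample_mean - pop_mean) / (sd / math.sqrt(n))


# Hypothetical data, not taken from the module.
print(choose_test(n=36, sigma_known=True), compute_statistic(52, 50, 6, 36))   # z 2.0
print(choose_test(n=16, sigma_known=False), compute_statistic(47, 50, 4, 16))  # t -3.0
```

Keeping full precision until the final step, as the helper does, avoids the small differences that appear when the denominator is rounded early.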

Study the following examples.

Example 1: Compute the z-value given the following information. Use a
one-tailed test and 0.05 level of significance.
𝑥̅ = 70    𝜇 = 71.5    𝜎 = 8    𝑛 = 100
Solution: Since σ is known and n ≥ 30, we will use the z-test. Thus, we have:

Use the formula for z-test:
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
Substitute the given values into the formula:
$$z = \frac{70 - 71.5}{8 / \sqrt{100}}$$
Simplify:
$$z = \frac{-1.5}{8 / 10} = \frac{-1.5}{0.8}$$
$$z = -1.875$$
Therefore, the computed z-value is −1.875.

Example 2: In the first semester of the school year, a random sample of 200
students got a mean score of 81.72 with a population standard deviation of
15 in Statistics and Probability test. The population mean is 79.83. Use 0.05
level of significance.

Solution: To answer the problem, let us first identify the given. We have:
𝑥̅ = 81.72    𝜇 = 79.83    𝜎 = 15    𝑛 = 200
Since σ is known and n ≥ 30, we will use the z-test.

Use the formula for z-test:
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
Substitute the given values into the formula:
$$z = \frac{81.72 - 79.83}{15 / \sqrt{200}}$$
Simplify:
$$z = \frac{1.89}{15 / 14.142} = \frac{1.89}{1.0607}$$
$$z = 1.782$$
Therefore, the computed z-value is 1.782.

By the Central Limit Theorem, when the sample size is large (𝑛 ≥ 30),
the sample standard deviation (𝑠) may be used as an estimate of the
population standard deviation (𝜎) when the value of 𝜎 is unknown.

Consider the given examples below:


Example 3: In the past, the average length of an outgoing call from a
business office has been 140 seconds. A manager wishes to check whether
that average has decreased after the introduction of policy changes. A
sample of 150 telephone calls produced a mean of 135 seconds, with a
standard deviation of 30 seconds. Perform the relevant test at 1% level of
significance.

Solution: Let us first identify the given. We have:
𝑥̅ = 135    𝜇 = 140    𝑠 = 30    𝑛 = 150
Since n ≥ 30, we will use the z-test by replacing 𝝈 with its estimate s.

Use the formula for z-test:
$$z = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
Substitute the given values into the formula:
$$z = \frac{135 - 140}{30 / \sqrt{150}}$$
Simplify:
$$z = \frac{-5}{30 / 12.247} = \frac{-5}{2.4495}$$
$$z = -2.041$$
Therefore, the computed z-value is −2.041.

Example 4: Compute the t-value given the following information:
𝑥̅ = 129.5    𝜇 = 127    𝑠 = 5    𝑛 = 12

Solution: Since σ is unknown and n < 30, we will use the t-test. Thus, we have:

Use the formula for t-test:
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
Substitute the given values into the formula:
$$t = \frac{129.5 - 127}{5 / \sqrt{12}}$$
Simplify:
$$t = \frac{2.5}{5 / 3.464} = \frac{2.5}{1.4434}$$
$$t = 1.732$$
Therefore, the computed t-value is 1.732.

Example 5: The government claims that the monthly expenses of a Filipino
family with four members is P10,000. A sample of 26 families’ expenses has
a mean of P10,900 and a standard deviation of P1,250. Is there enough
evidence to reject the government’s claim at 𝛼 = 0.01?

Solution: Let us first identify the given, so we have:
𝑥̅ = P10,900    𝜇 = P10,000    𝑠 = P1,250    𝑛 = 26

Use the formula for t-test:
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
Substitute the given values into the formula:
$$t = \frac{10{,}900 - 10{,}000}{1{,}250 / \sqrt{26}}$$
Simplify:
$$t = \frac{900}{1{,}250 / 5.099} = \frac{900}{245.15}$$
$$t = 3.671$$
Therefore, the computed t-value is 3.671.

Now, it’s your turn to answer the following exercises.

What’s More

Activity 1: Find My z-Value!

Directions: Find the computed z-value of the following. Write your answer
to the nearest thousandths. Show your solutions.

1. 𝑥̅ = 21.75    𝜇 = 20.83    𝜎 = 2.75    𝑛 = 38
2. 𝑥̅ = 11.23    𝜇 = 12.01    𝜎 = 3.0    𝑛 = 44
3. 𝑥̅ = 891.75    𝜇 = 890.25    𝜎 = 11.75    𝑛 = 90
4. 𝑥̅ = 45 000    𝜇 = 46 100    𝜎 = 1 795    𝑛 = 50
5. 𝑥̅ = 1.72    𝜇 = 1.83    𝜎 = 1.05    𝑛 = 36

Activity 2: Find My t-Value!

Directions: Compute the t-value of the following. Write your answer to the
nearest thousandths. Show your solutions.

1. 𝑥̅ = 16.4    𝜇 = 15.86    𝑠 = 1.25    𝑛 = 21
2. 𝑥̅ = 246    𝜇 = 245.85    𝑠 = 3.25    𝑛 = 29
3. 𝑥̅ = 9.5    𝜇 = 8.25    𝑠 = 1.45    𝑛 = 16
4. 𝑥̅ = 1.83    𝜇 = 1.27    𝑠 = 2.15    𝑛 = 10
5. 𝑥̅ = 30.18    𝜇 = 31.23    𝑠 = 3.15    𝑛 = 23

Activity 3: Compute Me!

Directions: Solve the following. Round your answers to the nearest
thousandth.

1. 𝑥̅ = 7.7, 𝜇 = 8.1, 𝜎 = 5, 𝑛 = 135
2. 𝑥̅ = 19.8, 𝑠 = 4, 𝜇 = 18.3, 𝑛 = 11
3. 𝑥̅ = 12.5, 𝑠 = 3, 𝜇 = 10.75, 𝑛 = 18
4. 𝑥̅ = 125.3, 𝑠 = 5, 𝜇 = 124, 𝑛 = 24
5. 𝑥̅ = 25.4, 𝜇 = 22.6, 𝜎 = 15, 𝑛 = 118
6. 𝑥̅ = 18.1, 𝑠 = 3, 𝜇 = 19.2, 𝑛 = 15
7. 𝑥̅ = 98.7, 𝜇 = 4.6, 𝜎 = 99.1, 𝑛 = 105
8. 𝑥̅ = 129.1, 𝑠 = 7, 𝜇 = 128.3, 𝑛 = 23
9. 𝑥̅ = 17.2, 𝜇 = 3.1, 𝜎 = 16.9, 𝑛 = 100

Activity 4: Find My Value!

Directions: Determine the test statistic used. Then, find the value of the
following based on the given information.

1. 𝐻𝑜 : 𝜇 = 85 𝐻𝑎 : 𝜇 ≠ 85
The sample mean is 83, the sample size is 39, and the standard
deviation is 5. Use 𝛼 = 0.05.

2. 𝐻𝑜 : 𝜇 = 7.5 𝐻𝑎 : 𝜇 > 7.5
The sample mean is 8.3 and the sample size is 52. The population
follows a normal distribution with standard deviation 3.17. Use 𝛼 =
0.01.

3. 𝐻𝑜 : 𝜇 = 15 𝐻𝑎 : 𝜇 < 15
The sample mean is 10, the sample standard deviation is 6.1, and the
sample size is 9. Use 𝛼 = 0.05.

4. 𝐻𝑜 : 𝜇 = 116.12 𝐻𝑎 : 𝜇 > 116.12


The population follows a normal distribution with standard deviation
of 7.18, sample mean of 118.7, and sample size of 21. Use 𝛼 = 0.10.

5. 𝐻𝑜 : 𝜇 = 215 𝐻𝑎 : 𝜇 ≠ 215
The population is approximately normal. The sample mean is 219.3,
the sample standard deviation is 13.12, and the sample size is 22.
Use 𝛼 = 0.05.

6. 𝐻𝑜 : 𝜇 = 15 𝐻𝑎 : 𝜇 ≠ 15
The population is approximately normal. The sample mean is 15.3,
the sample standard deviation is 2.5, and the sample size is 12. Use 𝛼
= 0.05.

7. 𝐻𝑜 : 𝜇 = 65 𝐻𝑎 : 𝜇 > 65
The sample mean is 63, the sample size is 43, and the standard
deviation is 4. Use 𝛼 = 0.05.

8. 𝐻𝑜 : 𝜇 = 25 𝐻𝑎 : 𝜇 < 25
The sample mean is 23.75, the sample standard deviation is 4.5, and
the sample size is 12. Use 𝛼 = 0.05.

9. 𝐻𝑜 : 𝜇 = 106.22 𝐻𝑎 : 𝜇 > 106.22


The population follows a normal distribution with standard deviation
of 4.08, sample mean of 108.5 and sample size of 17. Use 𝛼 = 0.10.

10. 𝐻𝑜 : 𝜇 = 25.5 𝐻𝑎 : 𝜇 > 25.5


The sample mean is 23.8 and the sample size is 42. The population
follows a normal distribution with standard deviation 2.27. Use
𝛼 = 0.01.

What I Have Learned

Directions: What new realizations did you have about computing the test
statistic? To answer the question, complete the sentences below.

1. The __________ is used if the sample size is large, 𝑛 ≥ ___, and the
population standard deviation (𝜎) is __________.
2. The formula of the z-test is __________.
3. The __________ is used when 𝑛 < 30, the population is normal or nearly
normal, and sample standard deviation (𝑠) is __________.
4. The formula of the t-test is __________.
5. The formula for the degrees of freedom is __________.

What I Can Do

The Corona!
Directions: Determine the test statistic to be used, then find its computed
value.

The Coronavirus Disease (COVID-19) is an infectious disease caused
by a new strain of coronavirus. The World Health Organization (WHO) claims
that the incubation period of the virus in an infected person has a mean of
5.1 days. Doctors in the Philippines conducted a study and found that the
mean incubation period of the virus in the human body is 6.03 days
with a standard deviation of 3.32 days. The sample consisted of 46 COVID-19
patients. Is there enough evidence to conclude that the incubation period of
the virus is 5.1 days as stated, at 𝛼 = 0.01?

Assessment

Directions: Choose the best answer to the given questions or statements.


Write the letter of your choice on a separate sheet of paper.
1. What test statistic will be used if the sample size is below 30?
A. t-test C. population mean
B. z-test D. standard deviation

2. In using t-test for a population mean, we assume that the sample is


selected randomly. The given statement is:
A. always true C. sometimes true
B. always false D. sometimes false

3. If the population standard deviation is unknown, what test statistic is to


be used?
A. t-test C. population mean
B. z-test D. standard deviation

4. In finding the t-computed value, which formula should be used?


A. 𝑡 = (𝜇 − 𝑥̅) / (𝜎 / √𝑛)        C. 𝑡 = (𝑥̅ − 𝜇) / (𝑠 / √𝑛)
B. 𝑡 = (𝜇 − 𝑥̅) / (𝑠 / √𝑛)        D. 𝑡 = (𝑥̅ − 𝜇) / (𝜎 / √𝑛)

5. When should you use the z-test?


I. When you are testing for a population mean
II. When the population standard deviation is given
III. When the sample standard deviation ONLY is given
IV. When you are testing with small sample sizes, n < 30.
A. I and II C. II and IV
B. II and III D. I and III
For nos. 6-8, refer to the problem below:
Milky Milk is sold in packets with an advertised mean weight of
0.5 kg. The standard deviation is known to be 0.11 kg. A consumer
group wishes to check the accuracy of the advertised mean and takes a
sample of 36 packets, finding an average weight of 0.47 kg.

6. What test statistic should be used?


A. t-test C. population mean
B. z-test D. standard deviation

7. What is the sample size?
A. 0.15 B. 0.48 C. 0.5 D. 36

8. What is the computed value?


A. -1.636 B. -1.488 C. 0.833 D. 5.551
For nos. 9-10, refer to the problem below:
Given: 𝐻𝑜 : μ = 7.25 𝐻𝑎 : μ < 7.25
A study conducted among 15 respondents has a sample mean of 8.1 and a
standard deviation of 1.18. Use 𝛼 = 0.01.
9. What test statistic should be used?
A. t-test C. population mean
B. z-test D. standard deviation

10. What is the computed value?


A. -2.790 B. -2.368 C. 2.368 D. 2.790

11. What sample size is best suited to the t-test?
A. cannot be determined C. smaller than 30
B. exactly 30 D. equal or larger than 30

12. Which test statistic will be used if the sample size is 37?


A. t-test C. population mean
B. z-test D. standard deviation

13. Which statistical method can you use when you have a normal
distribution of data?
A. t-test only C. either t-test or z-test
B. z-test only D. neither t-test nor z-test

For nos. 14-15, refer to the problem below:


A tire manufacturer claims that its tires have a mean life of 40,000
km. A random sample of 46 of these tires is tested and the sample mean is
38,000 km. Assume that the population standard deviation is 2,000 km
and the lives of tires are approximately normally distributed.

14. What test statistic should be used?


A. t-test C. population mean
B. z-test D. standard deviation

15. What is the computed value at 5% level of significance?


A. 6.782 B. 3.033 C. -6.782 D. -3.033

Additional Activities

Directions: Answer the following:

1. Assume that the cholesterol levels in a certain population have mean µ =
150 and standard deviation σ = 12. The cholesterol levels for a random
sample of n = 40 individuals are measured and the sample mean 𝑥̅ is
determined. What is the computed value at 𝛼 = 0.01 if 𝑥̅ = 147?

2. The maximum heart rate of a person at the age of 20 is 200 beats per
minute. Conduct a survey among your neighbors aged 15 to 20. Collect
data from a sample of 10, then compute the test statistic using 𝛼 = 0.01.

Statistics and
Probability
Quarter 2 - Module 7:
Drawing Conclusion About
Population Mean Based on
Test Statistic Value and
Critical Region

What I Need to Know

So far, you have learned how to formulate null and alternative
hypotheses, identify the appropriate test statistic, look for critical values,
identify the critical region, and compute the value of the test statistic.

In this module, you will learn how to interpret the result based on the
computed value of the t-test or z-test. Perform each activity independently. If
you find any difficulty in answering the exercises, you may consult your
peers or ask for the assistance of your teacher.

After going through this module, you are expected to:


1. recall and apply steps in hypothesis testing; and
2. draw conclusion about the population mean based on the test statistic
value and the rejection region.
Before you proceed to the lesson, make sure to answer first the questions
on the next page (What I Know).

What I Know

Directions: Choose the best answer to the given questions or statements.


Write the letter of your choice on a separate sheet of paper.
1. What is the critical value in a one-tailed test with a 5% level of significance
and 18 degrees of freedom?
A. 1.734 B. 2.101 C. 2.567 D. 2.898
2. If the absolute value of the computed test statistic is greater than the
critical value, then we ___________________________________.
A. retain the null hypothesis C. reject the alternative hypothesis
B. reject the null hypothesis D. do not reject the null hypothesis
3. When we fail to reject the null hypothesis, which of the following
statements is true?
A. The conclusion is guaranteed.
B. The conclusion is not guaranteed.
C. There is enough evidence to back up the claim.
D. There is not enough evidence to reject the claim.

4. If the t-computed value is 1.093 and the critical value is 1.699, what will
be the decision?
A. Reject both hypotheses. C. Do not reject the null hypothesis.
B. Reject the null hypothesis. D. Support the alternative hypothesis.
5. On the given figure below, the t-computed value is 2.130. What
conclusion can be drawn?

A. Reject the null hypothesis.


B. Fail to reject the null hypothesis.
C. Reject only the alternative hypothesis.
D. Reject both the null and alternative hypotheses.

6. What does it mean when the null hypothesis is rejected?
A. The null hypothesis is incorrect.
B. The alternative hypothesis is correct.
C. There is sufficient evidence to support the null hypothesis.
D. There is sufficient evidence to disprove the null hypothesis.
7. If the z-computed value is 2.505 and the critical value is 2.011, what will
be the decision?
A. Reject both hypotheses. C. Support the null hypothesis.
B. Reject the null hypothesis. D. Do not reject the null hypothesis.

8. From the given figure below, the z-computed value is 1.375. What
conclusion can be drawn?

A. Reject the null hypothesis.


B. Fail to reject the null hypothesis.
C. Reject only the alternative hypothesis.
D. Reject both the null and alternative hypotheses.
9. In a right-tailed test, if the critical value is greater than the computed
value, then we ____________________.
A. reject both the null and alternative hypotheses
B. do not reject both the null and alternative hypotheses
C. reject the null hypothesis and support the alternative hypothesis
D. fail to reject the null hypothesis, and the alternative hypothesis is not
supported
10. A drink vending machine is adjusted so that, on average, it dispenses
200ml of fruit juice with a standard deviation of 13ml into a plastic cup.
However, the machine tends to go out of adjustment and periodic checks
are made to determine the average amount of fruit juice being dispensed.
The operator thinks that the amount dispensed is less than 200ml. So, to
verify, a sample of 25 drinks is taken to test the adjustment of the
machine and a mean of 195 is obtained. For α = 5%, an appropriate
decision rule would be:
A. rejecting the null hypothesis
B. not rejecting the null hypothesis

C. rejecting both the null and alternative hypotheses
D. supporting both the null and alternative hypotheses

11. Find the critical value(s) of a two-tailed z-test with 𝛼 = 0.05.


A. z = -1.64 B. z = ±0.06 C. z = 1.64 D. z = ±1.96
12. What should you do if the computed z-value lies in the critical region?
A. Reject the null hypothesis.
B. Do not reject the null hypothesis.
C. Reject both the null and alternative hypotheses.
D. Support both the null and alternative hypotheses.
13. The computed t-value is 1.9 and the critical value is 1.690. What
conclusion can be drawn?
A. Reject both the null and alternative hypotheses.
B. Support both the null and alternative hypotheses.
C. Support the null hypothesis and reject the alternative hypothesis.
D. Reject the null hypothesis and support the alternative hypothesis.
For nos. 14-15, refer to the given statement:
In a right-tailed test with α = 0.01, the z-computed value is 1.682.

14. What is the critical value?


A. 1.28 B. 1.645 C. 1.960 D. 2.326
15. What is your decision?
A. Reject both the null and alternative hypotheses.
B. Support both the null and alternative hypotheses.
C. Reject the null hypothesis and support the alternative hypothesis.
D. Do not reject the null hypothesis; hence the alternative hypothesis is
unsupported.

Lesson 7
Drawing Conclusion About Population Mean Based on
Test Statistic Value and Critical Region
The final step in hypothesis testing is to interpret the results or draw
conclusions from the computed value. In this module, you will decide
whether or not to reject the null hypothesis.

Let us first recall the different terms related to hypothesis testing by


answering the following activity.

What’s In

Fact or Bluff?
Directions: Write FACT if the statement is true and BLUFF if not. Then,
answer the guide questions that follow.
___________ 1. The notations 𝜇 and 𝜎 are sample values.
___________ 2. The alternative hypothesis is a statement that there is no
significant difference between the two given properties.
___________ 3. The hypotheses 𝐻𝑜 : 𝜇 = 21.5 and 𝐻𝑎 : 𝜇 > 21.5 indicate a one-tailed
test since the alternative hypothesis shows a direction.
___________ 4. The rejection region for a hypothesis test is also called the
critical region.
___________ 5. The two types of significance test are one-tailed and two-
tailed test.
___________ 6. The level of significance refers to the degree of significance in
which we reject or fail to reject the null hypothesis.
___________ 7. We don’t need to set the level of the significance because we
can get 100% accuracy level in testing hypothesis.
___________ 8. A two-tailed test shows that the null hypothesis may be
rejected when the test value is on the critical region on either
side of the distribution.
___________ 9. Hypothesis testing is basically testing an assumption that we
make about the population parameter.

__________ 10. In a two-tailed test, the null hypothesis should not be rejected
when the test value is on either of the two critical regions.

Guide Questions:
1. Were you able to answer the questions correctly? If yes, you may proceed
to the next question. If not, you may go back to the previous discussion
so that you can recall the different terms related to hypothesis testing.

2. Why is the test statistic important in hypothesis testing? Explain briefly.

3. After obtaining the computed value of your statistic, how will you
interpret the result?

What’s New

Which Is Greater?
Directions: Write the symbol for greater than (>), less than (<), or equal to (=)
between each pair of numbers. Then, answer the questions that follow.

1. 32.01______ 32.10 6. 1.894 ______ 1.98


2. 4.5______ 4.50 7. - 2.26 ______ -2.3
3. 1.25 ______ 1.241 8. -1.45 ______ 1.25
4. - 3.3______ 3.3 9. 1.87 ______ -1.87
5. 2.25______ 2.2 10. 2.33 ______ 2.5

Were you able to write the correct symbols? If not, which part was
confusing? Why do you think so?

You must know how to use these symbols in preparation for this
lesson.

What Is It

After obtaining the computed value of the test statistic, compare it
with the critical value. You will use the following tables of z- and t-
critical values.

Table 1: z – Critical Value

                      Level of Significance
Type of Test       𝜶 = 1%        𝜶 = 2.5%      𝜶 = 5%        𝜶 = 10%
one-tailed test    𝑐 = ±2.326    𝑐 = ±1.960    𝑐 = ±1.645    𝑐 = ±1.28
two-tailed test    𝑐 = ±2.575    𝑐 = ±2.326    𝑐 = ±1.960    𝑐 = ±1.645
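
Critical z-values such as those in Table 1 come from the standard normal distribution. The sketch below is an illustration only and uses Python's standard library: `z_critical` is a hypothetical helper name, and it returns the positive critical value, so a left-tailed test would use its negative.

```python
from statistics import NormalDist


def z_critical(alpha, tails):
    """Positive critical z-value for a one-tailed (tails=1) or two-tailed (tails=2) test."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # Put alpha in one tail, or alpha / 2 in each of the two tails.
    return NormalDist().inv_cdf(1 - alpha / tails)


for alpha in (0.01, 0.05, 0.10):
    one_tailed = round(z_critical(alpha, 1), 3)
    two_tailed = round(z_critical(alpha, 2), 3)
    print(alpha, one_tailed, two_tailed)
```

Computed this way, a few entries differ slightly from the table because of rounding: the two-tailed value at 1% is about 2.576 (tabulated as 2.575) and the one-tailed value at 10% is about 1.282 (tabulated as 1.28). For a two-tailed test at 2.5%, the same calculation gives about 2.241; the tabulated 2.326 corresponds to a two-tailed level of 2%.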

Table 2: t – Critical Value

𝜶 for one-tailed test 0.05 0.025 0.01 0.005


𝜶 for two-tailed test 0.10 0.05 0.025 0.01
df = (n – 1)
1 6.314 12.706 31.821 63.657
2 2.920 4.303 6.965 9.925
3 2.353 3.182 4.541 5.841
4 2.132 2.776 3.747 4.604
5 2.015 2.571 3.365 4.032
6 1.943 2.447 3.143 3.707
7 1.895 2.365 2.998 3.499
8 1.860 2.306 2.896 3.355
9 1.833 2.262 2.821 3.250
10 1.812 2.228 2.764 3.169
11 1.796 2.201 2.718 3.106

12 1.782 2.179 2.681 3.055
13 1.771 2.160 2.650 3.012
14 1.761 2.145 2.624 2.977
15 1.753 2.131 2.602 2.947
16 1.746 2.120 2.583 2.921
17 1.740 2.110 2.567 2.898
18 1.734 2.101 2.552 2.878
19 1.729 2.093 2.539 2.861
20 1.725 2.086 2.528 2.845
21 1.721 2.080 2.518 2.831
22 1.717 2.074 2.508 2.819
23 1.714 2.069 2.500 2.807
24 1.711 2.064 2.492 2.797
25 1.708 2.060 2.485 2.787
26 1.706 2.056 2.479 2.779
27 1.703 2.052 2.473 2.771
28 1.701 2.048 2.467 2.763
29 1.699 2.045 2.462 2.756
30 1.697 2.042 2.457 2.750
31 1.695 2.040 2.453 2.744
32 1.694 2.037 2.449 2.738
33 1.692 2.035 2.445 2.733
34 1.691 2.032 2.441 2.728
35 1.690 2.030 2.438 2.724
36 1.688 2.028 2.434 2.719
37 1.687 2.026 2.431 2.715
38 1.686 2.024 2.429 2.712
39 1.685 2.023 2.426 2.708
40 1.684 2.021 2.423 2.704
42 1.682 2.018 2.418 2.698
44 1.680 2.015 2.414 2.692
46 1.679 2.013 2.410 2.687
48 1.677 2.011 2.407 2.682
50 1.676 2.009 2.403 2.678
60 1.671 2.000 2.390 2.660
Infinity 1.645 1.960 2.326 2.576

In general, if the absolute value of the computed value is greater
than the absolute value of the critical value, we reject the null hypothesis
and support the alternative hypothesis. But if the absolute value of the
computed value is less than the absolute value of the critical value, we do
not reject or we fail to reject the null hypothesis and the alternative
hypothesis is not supported.
In a right-tailed test, if the computed value is greater than the
critical value, we reject the null hypothesis and support the alternative
hypothesis. But if the computed value is less than the critical value, we do
not reject or we fail to reject the null hypothesis and the alternative
hypothesis is not supported.
In a left-tailed test, if the computed value is less than the critical
value, we reject the null hypothesis and support the alternative
hypothesis. But if the computed value is greater than the critical value, we
do not reject or we fail to reject the null hypothesis and the alternative
hypothesis is not supported.
Rejecting the null hypothesis does not prove that it is incorrect or that
the alternative hypothesis is correct. It means only that the collected data
provide sufficient evidence to disprove the null hypothesis; hence, we reject it.
Similarly, a failure to reject the null hypothesis does not mean that it
is true, only that the test did not show it to be false. There is insufficient
evidence to disprove the null hypothesis; hence, we do not reject it.
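
The three decision rules above can be collected into one small function. This is a sketch under stated assumptions: the name `decide` and the tail labels "right", "left", and "two" are hypothetical, the critical value may be passed with either sign, and a computed value exactly equal to the critical value is treated as "fail to reject" because the rules above do not cover that case.

```python
def decide(computed, critical, tail):
    """Apply the decision rule for a right-tailed, left-tailed, or two-tailed test."""
    critical = abs(critical)  # work with the magnitude; the tail sets the sign
    if tail == "right":
        reject = computed > critical
    elif tail == "left":
        reject = computed < -critical
    elif tail == "two":
        reject = abs(computed) > critical
    else:
        raise ValueError("tail must be 'right', 'left', or 'two'")
    return "reject H0" if reject else "fail to reject H0"


# Hypothetical values, chosen only to exercise each rule
print(decide(2.10, 1.645, "right"))   # reject H0
print(decide(-1.20, -1.645, "left"))  # fail to reject H0
print(decide(-2.70, 1.960, "two"))    # reject H0
```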

Study the examples below.


Example 1: Compute the value of the test statistic given the following
information. Use 𝛼 = 0.05. Interpret the result.
𝐻𝑜 : 𝜇 = 70      𝑥̅ = 71.5     𝜇 = 70
𝐻𝑎 : 𝜇 > 70      𝜎 = 8        𝑛 = 100
Solution: It is a one-tailed test, since the alternative hypothesis states a
direction (it uses the symbol >). Since σ is known and n ≥ 30, we will use the
z-test. The level of significance is 0.05. From Table 1, the z-critical
value is 1.645. Thus, we have:

𝑧 = (𝑥̅ − 𝜇) / (𝜎 / √𝑛)
𝑧 = (71.5 − 70) / (8 / √100)
𝑧 = 1.5 / (8 / 10)
𝑧 = 1.5 / 0.8
𝑧 = 1.875

[Figure: normal curve with the non-rejection region to the left of the critical value 1.645 and the rejection region to its right]

Decision:

The computed z-value is 1.875, which is greater than the critical value of
1.645. Therefore, we reject the null hypothesis and support the alternative
hypothesis.
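
The steps of Example 1 can be traced end to end in a few lines. This is a sketch rather than the module's own method: the variable names are assumptions, and the critical value is computed from the standard normal distribution instead of being read from Table 1.

```python
import math
from statistics import NormalDist

# Given values from Example 1
x_bar, mu, sigma, n, alpha = 71.5, 70, 8, 100, 0.05

z = (x_bar - mu) / (sigma / math.sqrt(n))
z_crit = NormalDist().inv_cdf(1 - alpha)   # right-tailed: all of alpha in the upper tail

decision = "reject H0" if z > z_crit else "fail to reject H0"
print(round(z, 3), round(z_crit, 3), decision)   # 1.875 1.645 reject H0
```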

Example 2: Compute the value of the test statistic given the following
information. Use 𝛼 = 0.01. Interpret the result.
𝐻𝑜 : 𝜇 = 127     𝑥̅ = 124.5    𝜇 = 127
𝐻𝑎 : 𝜇 < 127     𝑠 = 5        𝑛 = 12

Solution: It is a left-tailed test, since the alternative hypothesis states a
direction (it uses the symbol <). Since σ is unknown and n < 30, we will use
the t-test. The degree of freedom (df = n − 1) is 11 and 𝛼 = 0.01. Therefore,
the t-critical value from Table 2 is −2.718. Thus, we have:

𝑡 = (𝑥̅ − 𝜇) / (𝑠 / √𝑛)
𝑡 = (124.5 − 127) / (5 / √12)
𝑡 = −2.5 / (5 / 3.46)
𝑡 = −2.5 / 1.44
𝑡 = −1.736

[Figure: t-distribution curve with the rejection region to the left of the critical value −2.718 and the non-rejection region to its right]

Decision:

The computed t-value is greater than the t-critical value at 𝛼 = 0.01
(i.e., −1.736 > −2.718). Since this is a left-tailed test, the computed value
does not fall in the rejection region, so we fail to reject the null hypothesis.
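
Example 2 can also be checked with a table lookup. The sketch below copies three rows of Table 2 into a dictionary; the dictionary name and layout are assumptions made for illustration. With full precision the computed value is about −1.732 rather than the hand-rounded −1.736, and the decision is the same.

```python
import math

# Selected rows of Table 2: df -> {one-tailed alpha: critical t}
T_CRITICAL_ONE_TAILED = {
    10: {0.05: 1.812, 0.025: 2.228, 0.01: 2.764, 0.005: 3.169},
    11: {0.05: 1.796, 0.025: 2.201, 0.01: 2.718, 0.005: 3.106},
    12: {0.05: 1.782, 0.025: 2.179, 0.01: 2.681, 0.005: 3.055},
}

# Given values from Example 2
x_bar, mu, s, n, alpha = 124.5, 127, 5, 12, 0.01

df = n - 1
t = (x_bar - mu) / (s / math.sqrt(n))
t_crit = -T_CRITICAL_ONE_TAILED[df][alpha]   # negative for a left-tailed test

decision = "reject H0" if t < t_crit else "fail to reject H0"
print(df, round(t, 3), t_crit, decision)   # 11 -1.732 -2.718 fail to reject H0
```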

Example 3: The government claims that the monthly expenses of a Filipino
family with four members amount to P10,000. A sample of 26 families has mean
monthly expenses of P10,900 and a standard deviation of P1,250. Is there
enough evidence to reject the government’s claim at 𝛼 = 2.5%?

Solution: Let us first identify the given values:

𝐻𝑜 : 𝜇 = P10,000     𝑥̅ = P10,900     𝑠 = P1,250
𝐻𝑎 : 𝜇 ≠ P10,000     𝜇 = P10,000     𝑛 = 26

It is a two-tailed test, since the alternative hypothesis does not state a
direction (it uses the symbol ≠). Since σ is unknown and n < 30, we will use
the t-test. The degree of freedom (df = n − 1) is 25 and 𝛼 = 2.5%. Therefore,
the t-critical values from Table 2 are ±2.485. Thus, we have:

𝑡 = (𝑥̅ − 𝜇) / (𝑠 / √𝑛)
𝑡 = (10,900 − 10,000) / (1,250 / √26)
𝑡 = 900 / (1,250 / 5.10)
𝑡 = 900 / 245.10
𝑡 = 3.671

[Figure: t-distribution curve with rejection regions to the left of −2.485 and to the right of 2.485, and the non-rejection region between them]

Decision:

The absolute value of the computed t-value is greater than the absolute value
of the critical t-value at 𝛼 = 0.025 (i.e., |3.671| > |2.485|). Therefore, we
reject the null hypothesis.

Conclusion:

We can conclude that there is enough evidence to reject the claim of the
government that the monthly expenses of a Filipino family with four members
amount to P10,000.

What’s More

Activity 1: Rejected or Not Rejected?


Directions: Based on the given, decide whether the null hypothesis is
rejected or not.

z- or t-computed value z- or t-critical value


1. 2.310 1.960
2. 1.240 2.131
3. 2.960 2.896
4. 2.431 1.943
5. 1.523 1.721

Activity 2: Find Me
Directions: Complete the table below. Use Table 1: z-Critical Value and
Table 2: t-Critical Value. The first item is done for you.

     Type of Test   𝜶       Sample Size   Computed Value   Critical Value   Decision
1.   one-tailed     0.05    n = 17        2.015            1.746            Reject the null hypothesis.
2.   ________       0.01    n ≥ 30        1.361            ±2.575           ________
3.   two-tailed     0.05    n = 27        3.026            ________         ________
4.   one-tailed     0.1     ________      2.318            2.552            Do not reject the null hypothesis.
5.   one-tailed     ______  n ≥ 30        1.008            ±1.960           ________

Activity 3: Am I Rejected or Not?


Directions: Color the emoticon RED if the null hypothesis is not rejected
and BLUE if it is rejected. (Note: Use the table for the z- and t-critical
values.)
1. A one-tailed test with 5% level of significance has z-computed
value of 1.120.
2. The level of significance is 1% with z-computed value of 2.780
using two-tailed test.
3. The z-computed value of a two-tailed test is -1.740 with 2.5%
level of significance.
4. The z-computed value is 2.037 with 0.05 level of significance
of a one-tailed test.
5. The t-computed value of a one-tailed test is 2.784 with 5%
level of significance with 23 samples.
6. A two-tailed test with 1% level of significance has t-computed
value of 1.129 with sample size of 16.
7. The level of significance is 5% with t-value of 1.458 using
two-tailed test and n = 20.
8. The computed value is -1.023 with 0.05 level of significance
of a two-tailed test.
9. The sample size is 11. The t-computed value of a one-tailed
test is 2.374 with 5% level of significance.
10. A one-tailed test with 1% level of significance has z-computed
value of 2.455.

Activity 4: Interpret Me
Directions: Draw a conclusion from the given information.

1. 𝐻𝑜 : 𝜇 = 80 𝐻𝑎 : 𝜇 ≠ 80
The sample mean is 83, the sample size is 39, and the standard deviation
is 5. Use 𝛼 = 0.05.

2. 𝐻𝑜 : 𝜇 = 7.5 𝐻𝑎 : 𝜇 > 7.5
The sample mean is 8.3 and the sample size is 52. The population follows
a normal distribution with standard deviation of 3.17. Use 𝛼 = 0.01.

3. 𝐻𝑜 : 𝜇 = 10 𝐻𝑎 : 𝜇 > 10
The sample mean is 15, the sample standard deviation is 6.1, and the
sample size is 9. Use 𝛼 = 0.05.

4. 𝐻𝑜 : 𝜇 = 116.12 𝐻𝑎 : 𝜇 > 116.12


The population follows a normal distribution with standard deviation of
7.18. The sample mean is 118.7 and the sample size is 21. Use 𝛼 = 0.01.

5. 𝐻𝑜 : 𝜇 = 215 𝐻𝑎 : 𝜇 ≠ 215
The population is approximately normal. The sample mean is 219.3, the
sample standard deviation is 13.12, and the sample size is 22. Use
𝛼 = 0.05.

What I Have Learned

Directions: Fill in each blank with the correct word or phrase.

If the computed test statistic is in the critical region, then we
(1)___________ the null hypothesis.
In general, if the absolute value of the computed test statistic (i.e., z-
value or t-value) is greater than the absolute value of the critical value, we
(2)______________ the null hypothesis and support the alternative
hypothesis. But if the absolute value of the computed test statistic is less
than the absolute value of the critical value, we (3)_______________ the null
hypothesis and the alternative hypothesis is not supported.
In a right-tailed test, if the computed value is (4)____________ the
critical value, we reject the null hypothesis and support the alternative
hypothesis. But if the computed value (5)________________ the critical value,
we fail to reject the null hypothesis and the alternative hypothesis is not
supported.
In a left-tailed test, if the computed value is less than the critical
value, we (6)_____________________the null hypothesis and support the
alternative hypothesis. But if the computed value is greater than the critical
value, we (7)_________________ the null hypothesis and the alternative
hypothesis is not supported.

What I Can Do

Directions: Answer the given problem.

The Guidance Counselor of your school claims that Grade 11
students spend an average of 11.28 hours a week doing performance
tasks, with a standard deviation of 1.64 hours. Your adviser thinks that
students spend more time doing performance tasks, so he decided to conduct
his own research. He used a sample of 46 Grade 11 students and
obtained a mean of 11.83 hours. Is there enough evidence at the 0.05 level of
significance that the students spend 11.28 hours a week doing
performance tasks?

Assessment

Directions: Choose the best answer to the given questions or statements.


Write the letter of your choice on a separate sheet of paper.
1. What is the critical value in a two-tailed test with a 10% level of
significance and 18 degrees of freedom?
A. 2.575 B. 2.326 C. 1.960 D. 1.734
2. If the computed value is greater than the critical value, then we
______________.
A. retain the null hypothesis C. support the null hypothesis
B. reject the null hypothesis D. fail to reject the null hypothesis
3. What does it mean when the null hypothesis is rejected?
A. The null hypothesis is incorrect.
B. The alternative hypothesis is correct.
C. There is sufficient evidence to support the null hypothesis.
D. There is sufficient evidence to disprove the null hypothesis.
4. If the t-computed value is 2.115 and the critical value is 2.423, what will
be the decision?
A. Reject the null hypothesis.
B. Do not reject the null hypothesis.
C. Reject both the null and alternative hypotheses.
D. Support both the null and alternative hypotheses.

5. On the given figure below, the t-computed value is 1.217. What


conclusion can be drawn?
A. Reject the null hypothesis.
B. Fail to reject the null hypothesis.
C. Reject both the null and alternative hypotheses.
D. Support both the null and alternative hypotheses.
6. When we fail to reject the null hypothesis, which of the following
statements is true?
A. The conclusion is guaranteed.
B. The conclusion is not guaranteed.
C. There is sufficient evidence to back up the claim.
D. There is not sufficient evidence suggesting that the claim is false.

7. If the z-computed value is 1.253 and the critical value is 1.645, what will
be the decision?
A. Reject the null hypothesis.
B. Do not reject the null hypothesis.
C. Reject both the null and alternative hypotheses.
D. Support both the null and alternative hypotheses.
8. On the given figure below, the z-computed value is 2.431. What
conclusion can be drawn?
A. Reject the null hypothesis.
B. Fail to reject the null hypothesis.
C. Reject both the null and alternative hypotheses.
D. Support both the null and alternative hypotheses.
9. In a right-tailed test, if the critical value is greater than the computed
value, then we __________________________________.
A. reject the null hypothesis
B. fail to reject the null hypothesis
C. reject both the null and alternative hypotheses
D. support both the null and alternative hypotheses
10. A drink vending machine is adjusted so that, on average, it dispenses
200ml of fruit juice with a standard deviation of 13ml into a plastic cup.
However, the machine tends to go out of adjustment and periodic checks
are made to determine the average amount of fruit juice being dispensed.
The operator thinks that the amount dispensed is less than 200 ml. So to
verify, a sample of 25 drinks is taken to test the adjustment of the
machine and a mean of 195 is obtained. For α = 5%, an appropriate
decision rule would be _________________________________.
A. retain the null hypothesis C. support the null hypothesis
B. reject the null hypothesis D. fail to reject the null hypothesis

11. Find the critical value of a right-tailed z-test with 𝛼 = 10%.


A. z = 1.28 B. z = 1.645 C. z = 1.96 D. z = 2.326
12. What should you do if the computed t-value lies in the critical region?
A. Reject the null hypothesis.
B. Fail to reject the null hypothesis.
C. Reject both the null and alternative hypotheses.
D. Support both the null and alternative hypotheses.
13. The z-computed value is 2.113 and the critical value is 1.645. What
conclusion can be drawn?
A. Reject the null hypothesis.
B. Fail to reject the null hypothesis.

C. Reject both the null and alternative hypotheses.
D. Support both the null and alternative hypotheses.
For nos. 14-15, refer to the given statement:

In a two-tailed test with 𝛼 = 0.025, the z-computed value is 2.014.

14. What are the critical values?


A. ±2.575 B. ±2.326 C. ±1.960 D. ±1.645
15. What is your decision?
A. Reject the null hypothesis.
B. Do not reject the null hypothesis.
C. Reject both the null and alternative hypotheses.
D. Support both the null and alternative hypotheses.

Additional Activities

Directions: Answer the following.


1. When do you reject the null hypothesis?
2. What is your basis in rejecting the null hypothesis?
3. What conclusion can you derive if you reject the null hypothesis?
4. If you fail to reject the null hypothesis, does it mean that there is not
enough evidence to back up the decision? Why?
5. Nowadays, people tend to buy products online. The shipping
department manager claims that the average order shipped by their
company weighs 1.89 kg. The general manager wants to verify if this claim
is true, so he randomly selects a sample of 25 orders. What can the
general manager conclude at the 0.01 level of significance if the sample
has a mean weight of 2.07 kg with a standard deviation of 0.72 kg?

137
References
Textbooks

Albacea, Zita VJ., Mark John V. Ayaay, Isidoro P. David, and Imelda E. De
Mesa. Teaching Guide for Senior High School: Statistics and Probability.
Quezon City: Commision on Higher Education, 2016.

Caraan, Avelino Jr S. Introduction to Statistics & Probability: Modular


Approach. Mandaluyong City: Jose Rizal University Press, 2011.
Chan Shio, Christian Paul, and Maria Angeli Reyes. Statistics and
Probability for Senior High School. Quezon City: C & E Publishing Inc.,
2017.
De Guzman, Danilo. Statistics and Probability. Quezon City: C & E
Publishing Inc., 2017.
Jaggia, Sanjiv, and Alison Kelly. Business Statistics: Communicating with
Numbers. 2nd ed. New York: McGraw-Hill Education, 2016.
Sirug, Winston S. Statistics and Probability for Senior High School CORE
Subject A Comprehensive Approach K to 12 Curriculum Compliant.
Manila: Mindshapers Co., Inc., 2017.

Online Resources
HackMath.net. “Normal Distribution Calculator.” Accessed May 22, 2020.
https://www.hackmath.net/en/calculator/normaldistribution?mean=0&sd=1&area=above&above=1.645&below=&ll=&ul=&outsideLL=&outsideUL=&draw=Calculate

Statistics and
Probability
Quarter 2 – Module 8:
Solving Problems Involving Test
of Hypothesis on Population
Mean

What I Need to Know

In the previous module, you studied how to construct hypotheses
based on the assumptions made. You learned how to determine the
appropriate test statistic to use and compute its value in a given situation,
as well as how to identify the critical value and draw the critical region.
In this module, you will apply your knowledge and skills in solving
problems in hypothesis testing. Eventually, you will decide whether to
reject the null hypothesis or not.
After going through this module, you are expected to:
1. identify the steps in hypothesis testing; and
2. solve problems involving test of hypothesis on the population mean.
Before you proceed to the lesson, make sure to answer first the
questions on the next page (What I Know).

What I Know

Directions: Choose the best answer to the given questions or statements.
Write the letter of your choice on a separate sheet of paper.
1. Which of the following will produce a correct decision?
A. rejecting a false hypothesis
B. rejecting a true null hypothesis
C. failure to reject a false hypothesis
D. failure to reject a true null hypothesis
2. If a result is said to be significant at 5% level, what does it mean?
A. The null hypothesis is 5% true.
B. The null hypothesis is 5% incorrect.
C. We fail to reject the false null hypothesis 5% of the time.
D. There is a 5% probability that a true null hypothesis is rejected.
3. Which value separates the critical region from the non-critical region in a
normal curve when testing a hypothesis?
A. t-value C. critical value
B. z-value D. computed value

4. What should be the decision if the computed z-value lies in the critical
region?
A. Reject the null hypothesis.
B. Reject the alternative hypothesis.
C. Do not reject the null hypothesis.
D. Do not reject the alternative hypothesis.
5. The mean height of women is greater than 64" (inches). Which of the
following represents the null and alternative hypotheses?
A. H0: μ > 64" C. H0: μ < 64"
Hₐ: μ ≠ 64" Hₐ: μ > 64"
B. H0: μ > 64" D. H0: p = 64"
Hₐ: μ ≠ 64" Hₐ: p > 64"
6. What is the last step in the hypothesis testing procedure?
A. Draw conclusion.
B. Choose the level of significance.
C. State the null and alternative hypotheses.
D. Determine the test statistic and compute it.
7. A one sample t-test is conducted on Ho: μ = 81.6. The sample has a
sample mean = 84.1, s = 3.1, n = 25, and α = .01. State your null and
alternative hypotheses.
A. H0: μ = 81.6 C. H0: μ < 81.6
Hₐ: μ ≠ 81.6 Hₐ: μ > 81.6
B. H0: μ = 81.6 D. H0: p = 64"
Hₐ: μ < 81.6 Hₐ: p > 81.6
8. Perform a hypothesis test on the null hypothesis μ = 6.9. A
random sample of 25 items is selected. The sample mean is 7.1 and the
sample standard deviation is 2.4. Assume that the population is normally
distributed and use α = .01. What conclusion can be drawn?
A. There is enough evidence to reject the claim.
B. There is enough evidence to support the claim.
C. There is not enough evidence to reject the claim.
D. There is not enough evidence to support the claim.
9. In a right-tailed test, what will you do if the critical value is greater than
the computed value?
A. Reject the null hypothesis.
B. Reject the alternative hypothesis.
C. Do not reject the null hypothesis.
D. Fail to reject the alternative hypothesis.

10. When the null hypothesis is rejected, which of the following statements is
true?
A. The null hypothesis is incorrect.
B. The alternative hypothesis is true.
C. There is enough evidence against the null hypothesis.
D. There is a very small probability that the given null hypothesis is true.
11. What does it mean when we fail to reject the null hypothesis?
A. The conclusion is not significant.
B. The null hypothesis is definitely correct.
C. There is enough evidence to back up the null hypothesis.
D. There is insufficient evidence to disagree with the null hypothesis.
12. If the t-computed value is 1.093 and the critical value is 1.699, what will
be the decision?
A. Reject the null hypothesis.
B. Support the null hypothesis.
C. Do not reject the null hypothesis.
D. Support the alternative hypothesis.
13. What is the first step in the hypothesis testing procedure?
A. Draw conclusion.
B. Choose the level of significance.
C. State the null and alternative hypotheses.
D. Determine the test statistic and compute it.
14. What will you do if the computed value is greater than the critical value?
A. Reject the null hypothesis.
B. Support the null hypothesis.
C. Do not reject the null hypothesis.
D. Support the alternative hypothesis.
15. If the computed z-value is 1.130 and the critical value is 1.96, what
conclusion can be drawn?
A. Fail to reject the null hypothesis.
B. Reject both the null and alternative hypotheses.
C. Reject the null hypothesis in favor of the alternative hypothesis.
D. Fail to reject both the null hypothesis and alternative hypothesis.

How did you find this pre-test? Did you encounter both familiar and
unfamiliar terms? Kindly compare your answers with the Answer Key in the
last part of this module.
If you got a perfect score or 100%, skip this module and proceed to
the next one. But if you missed even a single point, please continue with
this module as it will enrich your knowledge in hypothesis testing.

Lesson 8
Solving Problems Involving Test of Hypothesis on the Population Mean
Hypothesis testing is a method of testing a claim or hypothesis about
a parameter in a population using data measured in a sample. In this
method, we test a hypothesis by determining the likelihood that the
observed sample statistic would have been obtained if the hypothesis
regarding the population parameter were true.

In this module, you will apply your knowledge in solving problems on
hypothesis testing. To do that, recall the different terms related to
hypothesis testing by answering the activity below.

What’s In

Find the Word… That’s the Word!

Directions: Find the words related to hypothesis testing. The letters
forming each word may be arranged horizontally, vertically, or diagonally.
Make sure to identify each of them.

A H Y P O T H E S I S N A V Q N T
L C W A A O A N S A I D Q A U O Y
T A N R D N S U Z D R E A R K I P
E R T A G U V P T T H I R I G R E
R S A M P L E M E A N S S A T Y I
N I R E Q L U S S I O N E N I N E
A G T T E S T I T L S W A C T W R
T N S E P A J K K W L E T E K A R
I I O R L S K L O Y O R O Q S F O
V F O N E T A I L E D T E S T G R
E I R C E I L F Y G Q U A X P H S
N C C R I T I C A L R E G I O N T
S A M P L E S I Z E I L I Z L U Y
I N A U D L R W O E L P Q P S E U
L C P O P U L A T I O N M E A N D
W E L E V E L C S E N E R X Y L J

Since you already know the different terms related to hypothesis
testing, you are now ready to solve problems.
In decision making, what are the factors that you need to consider?
Do you think of the consequences of your actions?
Statistics can help us in making decisions. Part of the process is
forming reliable conclusions, and decision making starts with testing
a hypothesis. Let us enhance your decision-making skills by answering
the next activity.

What’s New

Would You Rather!

Directions: Choose only one, then justify your choice.


Would you rather …
1. be a girl or a boy?
2. have more siblings or be the only child?
3. go to college or get a job?
4. come to school or hang out with your friends?
5. go without Facebook or junk food for the rest of your life?
6. have many good friends or one very best friend?

Every day, we are faced with all sorts of decisions. Sometimes the
decisions are small, like what to wear or what to eat. But sometimes the
decisions are bigger, like what course you are going to take or which
university you are going to enroll in for college. The test of hypothesis
will aid you in the decision-making process so you can make the right
choices for better results.

What Is It

In testing a hypothesis on the population mean, follow the steps below:


1. State the null hypothesis 𝐻𝑜 and the alternative hypothesis 𝐻𝑎 .
2. Determine the test statistic that will be used to conduct the hypothesis
test. Then, calculate its value.
3. Find the critical value for the test and draw the critical region.
4. Decide and draw a conclusion based on the comparison of the calculated
value of the test statistic and the critical value of the test.
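Step 3 can also be checked without a printed table when the z-test is used. The sketch below is only an illustration, assuming Python 3.8 or later; the function name z_critical and the tail labels "right", "left", and "two" are hypothetical and are not part of this module. It uses the standard normal distribution in Python's statistics library to reproduce z-critical values.

```python
from statistics import NormalDist


def z_critical(alpha, tail):
    """Return the z-critical value(s) for a level of significance alpha.

    tail is "right", "left", or "two" (labels assumed for this sketch).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        return std_normal.inv_cdf(alpha)
    if tail == "two":
        # alpha is split equally between the two tails
        upper = std_normal.inv_cdf(1 - alpha / 2)
        return (-upper, upper)
    raise ValueError("tail must be 'right', 'left', or 'two'")


print(round(z_critical(0.05, "right"), 3))   # 1.645
print(round(z_critical(0.05, "left"), 3))    # -1.645
print(round(z_critical(0.05, "two")[1], 3))  # 1.96
```

For the t-test, the critical value depends on the degrees of freedom, so this sketch does not cover it; the t-value table is still needed.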

In general, if the absolute value of the computed value is greater than
the absolute value of the critical value, we reject the null hypothesis and
support the alternative hypothesis. But if the absolute value of the
computed value is less than the absolute value of the critical value, we fail
to reject the null hypothesis and the alternative hypothesis is not
supported. This rule applies directly to a two-tailed test. In a one-tailed
test, the computed value must also fall on the side stated in the
alternative hypothesis, as described next.

In a right-tailed test, if the computed value is greater than the
critical value, we reject the null hypothesis and support the alternative
hypothesis. But if the computed value is less than the critical value, we
fail to reject the null hypothesis and the alternative hypothesis is not
supported.

In a left-tailed test, if the computed value is less than the critical
value, we reject the null hypothesis and support the alternative
hypothesis. But if the computed value is greater than the critical value,
we fail to reject the null hypothesis and the alternative hypothesis is
not supported.
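The decision rules above can be summarized in a few lines of code. This is a minimal sketch, not an official procedure of this module; the function name decide and the tail labels are hypothetical. The three sample calls use the computed and critical values that appear in Examples 1 to 3 of this lesson.

```python
def decide(computed, critical, tail):
    """Apply the decision rule for a z-test or t-test.

    computed: computed value of the test statistic
    critical: critical value from the table (its sign is ignored)
    tail: "right", "left", or "two" (labels assumed for this sketch)
    """
    critical = abs(critical)
    if tail == "right":
        reject = computed > critical
    elif tail == "left":
        reject = computed < -critical
    elif tail == "two":
        reject = abs(computed) > critical
    else:
        raise ValueError("tail must be 'right', 'left', or 'two'")
    return "Reject Ho" if reject else "Fail to reject Ho"


print(decide(1.361, 1.645, "right"))   # Fail to reject Ho
print(decide(1.974, 1.711, "two"))     # Reject Ho
print(decide(-1.688, 1.645, "left"))   # Reject Ho
```

Passing the tail explicitly keeps the sign check separate for one-tailed tests, so a large computed value on the wrong side is not mistaken for evidence against the null hypothesis.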

Study the given examples below.

Example 1: According to a study conducted by the Grade 12 students, ₱155
is the average monthly expense for cell phone loads of high school students
in their province. A Statistics student claims that this amount has increased
since January of this year. Do you think his claim is acceptable if a random
sample of 50 students has an average monthly expense of ₱165 for cell
phone loads? Use a 5% level of significance and assume that the population
standard deviation is ₱52.
Solution:
Given: 𝑥̅ = 165 𝜇 = 155 𝜎 = 52 𝑛 = 50 𝛼 = 0.05
Step 1: State the null and alternative hypotheses.
𝐻𝑜 : 𝜇 = 155 𝐻𝑎 : 𝜇 > 155
Step 2: Determine the test statistic, then compute its value.
Since the population mean is being tested, the population standard
deviation 𝜎 is known, and 𝑛 > 30, the appropriate test statistic is the z-test.

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
$$z = \frac{165 - 155}{52 / \sqrt{50}}$$
$$z = \frac{10}{7.35}$$
$$z = 1.361$$

Step 3: Find the critical value and draw the critical region. Use the z-critical
value table.
The alternative hypothesis is directional. Hence, the one-tailed test
(right-tailed test) shall be used. From the z-value table at 0.05 level of
significance, the critical value is 1.645.

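[Figure: normal curve for a right-tailed test. The rejection region lies to the right of the critical value 1.645; the computed value 1.361 falls in the non-rejection region.]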

Step 4: Draw a conclusion.


The z-computed value is 1.361 and it lies within the non-rejection
region, so we fail to reject the null hypothesis. Therefore, there is not enough
evidence to support the claim that the average monthly expense for cell
phone loads is more than ₱155. This result is not significant at the 𝛼 = 0.05 level.
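The arithmetic in Example 1 can be checked with a short script. This is a minimal sketch assuming Python 3.8 or later; the variable names are chosen for illustration only. Because the script does not round the denominator to 7.35, it gives z of about 1.360 rather than 1.361. The decision is the same.

```python
from math import sqrt
from statistics import NormalDist

# Given values from Example 1
x_bar, mu_0, sigma, n, alpha = 165, 155, 52, 50, 0.05

# Step 2: z-test statistic (population standard deviation known)
standard_error = sigma / sqrt(n)
z = (x_bar - mu_0) / standard_error

# Step 3: right-tailed critical value from the standard normal curve
z_crit = NormalDist().inv_cdf(1 - alpha)

# Step 4: decision for a right-tailed test
reject = z > z_crit

print(f"z = {z:.3f}, critical value = {z_crit:.3f}")
print("Reject Ho" if reject else "Fail to reject Ho")
```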
Example 2: Blood glucose levels for obese teenagers have a mean of 120. A
researcher thinks that a diet high in raw cornstarch will have a positive or
negative effect on blood glucose levels. A sample of 25 patients who have
tried the raw cornstarch diet has a mean glucose level of 135 with a
standard deviation of 38. Test the hypothesis at 𝛼 = 0.10 that the raw
cornstarch had an effect.
Solution:
Given: 𝑥̅ = 135 𝜇 = 120 𝑠 = 38 𝑛 = 25 𝛼 = 0.10 𝑑𝑓 = 24
Step 1: State the null and alternative hypotheses.
𝐻𝑜 : 𝜇 = 120 𝐻𝑎 : 𝜇 ≠ 120
Step 2: Determine the test statistic, then compute its value.
Since it is the population mean being tested, the population standard
deviation is unknown, and 𝑛 < 30, the appropriate test statistic is the t-test.

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
$$t = \frac{135 - 120}{38 / \sqrt{25}}$$
$$t = \frac{15}{7.6}$$
$$t = 1.974$$

Step 3: Find the critical value and draw the critical region.
The alternative hypothesis is non-directional. Hence, the two-tailed
test shall be used. From the t-value table at 0.10 level of significance
(0.05 in each tail) with df = 24, the critical values are ±1.711.
[Figure: curve for a two-tailed test. The rejection regions lie to the left of -1.711 and to the right of 1.711; the non-rejection region lies between them.]

Step 4: Draw a conclusion.


Since the t-computed value is 1.974 which is greater than the critical
value of 1.711, we reject the null hypothesis and support the alternative
hypothesis. We can conclude that there is enough evidence to support the
claim that the raw cornstarch had an effect on blood glucose levels.
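The computation in Example 2 can be verified the same way. The sketch below is an illustration only; the variable names are assumptions, and the critical value 1.711 is typed in from the t-value table used in the example because Python's standard library has no t-distribution.

```python
from math import sqrt

# Given values from Example 2
x_bar, mu_0, s, n = 135, 120, 38, 25
t_crit = 1.711  # from the t-value table, two-tailed, alpha = 0.10, df = 24

# Step 2: t-test statistic (population standard deviation unknown, n < 30)
df = n - 1
t = (x_bar - mu_0) / (s / sqrt(n))

# Step 4: two-tailed decision compares absolute values
reject = abs(t) > t_crit

print(f"t = {t:.3f} with df = {df}")  # t = 1.974 with df = 24
print("Reject Ho" if reject else "Fail to reject Ho")
```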
Example 3: The average IQ of Senior High School students is 99 with a
standard deviation of 15. A researcher believes that the average IQ of Senior
High School students is lower. A random sample of 40 students was tested
and got an average of 95. Is there enough evidence to suggest that the
average IQ is lower? Test the hypothesis at 0.05 level of significance.
Solution:
Given: 𝑥̅ = 95 𝜇 = 99 𝜎 = 15 𝑛 = 40 𝛼 = 0.05
Step 1: State the null and alternative hypotheses.
𝐻𝑜 : 𝜇 = 99 𝐻𝑎 : 𝜇 < 99
Step 2: Determine the test statistic, then compute its value.
Since the population mean is being tested, the population standard
deviation 𝜎 is known, and 𝑛 > 30, the appropriate test statistic is the z-test.

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
$$z = \frac{95 - 99}{15 / \sqrt{40}}$$
$$z = \frac{-4}{2.37}$$
$$z = -1.688$$

Step 3: Find the critical value and draw the critical region. Use the z-critical
value table. The alternative hypothesis is directional. Hence, the one-tailed
test (left-tailed test) shall be used. From the z-value table at 0.05 level of
significance, the critical value is -1.645.

[Figure: normal curve for a left-tailed test. The rejection region lies to the left of the critical value -1.645.]

Step 4: Draw a conclusion.


The z-computed value is -1.688 and it lies within the rejection region,
so we reject the null hypothesis. Therefore, there is enough evidence to
support the claim that the IQ level of Senior High School students is lower
than 99. This result is significant at 𝛼 = 0.05 level.
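As an optional cross-check of Example 3, the area to the left of the computed z-value can be compared with the level of significance. This is a minimal sketch assuming Python 3.8 or later, and it is not part of the critical value approach used in this module; the variable names are chosen for illustration. Without rounding the denominator, z is about -1.687 instead of -1.688, and the decision is unchanged.

```python
from math import sqrt
from statistics import NormalDist

# Given values from Example 3
x_bar, mu_0, sigma, n, alpha = 95, 99, 15, 40, 0.05

# z-test statistic
z = (x_bar - mu_0) / (sigma / sqrt(n))

# Left-tail area under the standard normal curve up to z
left_tail_area = NormalDist().cdf(z)

# Reject Ho when the tail area is smaller than alpha
reject = left_tail_area < alpha

print(f"z = {z:.3f}, left-tail area = {left_tail_area:.4f}")
print("Reject Ho" if reject else "Fail to reject Ho")
```

A tail area below 0.05 says the same thing as the computed value lying beyond the critical value -1.645.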

What’s More

Activity 1: Complete Me!

Directions: Fill in the blanks/boxes.

A researcher reports that the average IQ level of students in Philippine
Science High School (PSHS) is 110. A sample of 20 students has a mean IQ
level of 106 with a standard deviation of 9. At 5% level of significance, test
the claim that the IQ level of students in PSHS is 110.
Solution:
Given: 𝑥̅ = ___ 𝜇 = 110 𝑠 = ___ 𝑛 = ___ α = ___ df = ___
Step 1: ____________________________________________________
𝐻𝑜 : 𝜇 = 110 𝐻𝑎 : 𝜇 ≠ 110
Step 2: Determine the test statistic, then compute its value.
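Since n < 30, we will use_______________________.
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
$$t = \frac{\underline{\hspace{2cm}}}{9 / \sqrt{20}}$$
$$t = \frac{\underline{\hspace{1cm}}}{\underline{\hspace{1cm}}}$$
𝑡 = ____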
Step 3: ____________________________________________________

From the t-value table at 0.05 level of significance, the critical value is
_______________.

[Figure: curve for a two-tailed test. The acceptance (non-rejection) region lies between -2.093 and 2.093; the rejection regions lie beyond these critical values.]

Step 4: Draw a conclusion.

Since it is a two-tailed test and the absolute value of the t-computed value
is _______, which is ________ than the critical value of ______, we _______
the null hypothesis. We
therefore conclude that _________________________________________________.

Activity 2: Follow Me!


Directions: Follow the steps in testing hypothesis to answer the following
problems.
1. Mapalad Integrated High School determined students’ Body Mass Index
(BMI) at the opening of classes. It has been recorded that the average
height of female students is 154.2 centimeters with a standard deviation
of 9 centimeters. The researcher conducted her own study and she
randomly selected 40 female students. In her study, she got an average of
156.7 centimeters. Is there a reason to believe the claims of the school?
Use 5% level of significance in testing the hypothesis.
2. The manager of a certain TV station claimed that the average rating of
people watching their noontime teleserye in Manila is 62.5. A researcher
randomly selected 25 people and asked them their favorite noontime
teleserye. He computed the mean and obtained 67.8 with standard
deviation of 15.9. Is there a reason to believe that the manager is correct?
Use 0.01 as the level of significance.
3. According to the report of National Economic Development Authority
(NEDA) last year, a Filipino household spends an average of ₱333 a day.
You took a random sample of 20 households and determined the amount
of their allotted budget each day revealing a mean of ₱420 and standard
deviation of ₱120. Using 0.01 level of significance, can it be concluded
that the average amount spent per day by a Filipino household has
increased? Assume normality over the population.

4. One of the psychological tests conducted by the guidance counselors in a
public school is the Survey of Study Habits and Attitudes (SSHA), which
measures students’ attitudes toward studying. The mean score of Senior
High students is 135 with a standard deviation of 35. Makisig suspects that
older students have better attitudes toward school. He randomly selects
50 Grade 12 students who are at least 18 years old and gives them the SSHA.
The test result has a mean score of 144.8 points. Is there a reason to
believe that the claim of the guidance counselors is correct? Assume that
the population of scores is normally distributed. Carry out a
significance test at 5% level.
5. According to the World Health Organization’s statistics published in
2018, the average life span of a person in the Philippines is 67 years. A
random sample of 25 obituary notices in the Philippine Daily Inquirer has
a mean of 60 years with a standard deviation of 19 years. If
the life span in the Philippines is normally distributed, does this
information indicate that the population mean life span of Filipinos is less
than 67 years? Use 5% level of significance.

What I Have Learned

Directions: Answer the following.


1. What are the steps in testing hypothesis for the population mean?
2. In a right-tailed test, what will you do if the computed value is greater
than the critical value?
3. In a left-tailed test, what will you do if the computed value is less than
the critical value?
4. What does it mean if you reject the null hypothesis?
5. What does it mean if you fail to reject the null hypothesis?

What I Can Do

On a long bond paper, create an infographic chart about solving problems in
hypothesis testing. Be creative!

Grading Rubric for Infographic Chart


Indicator: Content
Excellent (5 points): All information is detailed, accurate, relevant, and properly cited.
Good (4 points): Information is detailed, accurate, relevant, and properly cited.
Fair (3 points): Most information is detailed, accurate, relevant, and properly cited.
Needs Improvement (2 points): Little information is detailed, accurate, relevant, and properly cited.

Indicator: Infographic Design
Excellent (5 points): Layout is aesthetically pleasing.
Good (4 points): Layout is clear.
Fair (3 points): Layout is generally clear.
Needs Improvement (2 points): Layout may be somewhat unclear.

Indicator: Creativity
Excellent (5 points): Additional elements such as pictures are incorporated to enhance the infographic.
Good (4 points): Additional elements are used but do not enhance the infographic.
Fair (3 points): No additional elements are used.
Needs Improvement (2 points): Additional elements are used but there is no relevance to the content of the infographic.

Assessment

Directions: Choose the best answer to the given questions or statements.
Write the letter of your choice on a separate sheet of paper.
1. The null hypothesis is rejected. What does it mean?
A. The null hypothesis is incorrect.
B. The alternative hypothesis is true.
C. There is enough evidence against the null hypothesis.
D. There is a very small probability that the null hypothesis is true.

2. If the t-computed value is 2.430 and the critical value is 2.011, what will
be the decision?
A. Reject the null hypothesis. C. Reject the alternative hypothesis.
B. Support the null hypothesis. D. Do not reject the null hypothesis.
3. What is the third step in the hypothesis testing procedure?
A. Draw conclusion.
B. State the null and alternative hypotheses.
C. Determine the test statistic and compute it.
D. Find the critical value for the test; then draw the critical region.

4. In a left-tailed test, what will you do if the critical value is less than the
computed value?
A. Reject the null hypothesis.
B. Do not reject the null hypothesis.
C. Reject the alternative hypothesis.
D. Do not reject the alternative hypothesis.
5. The t-computed value is 1.875 and the critical value is 2.080. What
conclusion can be drawn?
A. Reject the null hypothesis.
B. Fail to reject the null hypothesis.
C. Reject the alternative hypothesis.
D. Fail to reject the alternative hypothesis.
6. What does it mean if a result is said to be significant at 1% level?
A. The null hypothesis is 99% true.
B. The null hypothesis is 99% wrong.
C. We fail to reject the false null hypothesis 1% of the time.
D. There is a 1% probability that a true null hypothesis is rejected.
7. Which value separates the acceptance region from the rejection region
in a normal curve when testing a hypothesis?
A. t-value C. critical value
B. z-value D. computed value
8. What should you do if the computed z-value lies in the critical region?
A. Reject the null hypothesis.
B. Reject the alternative hypothesis.
C. Do not reject the null hypothesis.
D. Do not reject the alternative hypothesis.
9. The mean height of women is less than 64" (inches). Which of the following
represents the null and alternative hypotheses?
A. H0: μ > 64" C. H0: μ < 64"
Hₐ: μ < 64" Hₐ: μ ≠ 64"
B. H0: μ = 64" D. H0: p = 64"
Hₐ: μ ≠ 64" Hₐ: p > 64"
10. In the hypothesis testing procedure, drawing conclusion should always be
the __________ step.
A. first B. second C. third D. last
11. A one sample t-test is conducted on Ho: μ = 81.6. The sample has a mean
of 84.1, s = 3.1, n = 25, and α = .01. What conclusion can be drawn?
A. Reject Ho. C. Fail to reject Ho.
B. Reject Ha. D. Fail to reject Ha.
12. Perform a hypothesis test where the null hypothesis is μ = 6.9. A
random sample of 16 items is selected. The sample mean is 7.1 and the
sample standard deviation is 2.4. Assume that the population is
normally distributed and use α = 0.05. What conclusion can be drawn?
A. There is enough evidence to reject the claim.
B. There is enough evidence to support the claim.
C. There is not enough evidence to reject the claim.
D. There is not enough evidence to support the claim.
13. If the computed t-value is 2.130 while the critical value is 2.086, what
conclusion can be drawn?
A. Reject both the null and alternative hypotheses.
B. Fail to reject the null and alternative hypotheses.
C. Reject the null hypothesis in favor of the alternative hypothesis.
D. Fail to reject the null hypothesis; the alternative hypothesis is not supported.
14. After formulating the hypotheses, what is the next step in the hypothesis
testing procedure?
A. Draw conclusion.
B. Choose the level of significance.
C. Determine the test statistic and compute it.
D. Find the critical value and draw the critical region.
15. Find the critical value(s) for a two-tailed test with α = 0.05.
A. z = -1.65 B. z = ±0.06 C. z = 1.65 D. z = ±1.96

Additional Activities

In this activity, complete the 1-4-3 chart by writing down what is being
asked.

1 – 4 – 3 LIST
One (1) thing I really love about this topic:
1.
Four (4) important reasons why I love this topic:
1.
2.
3.
4.
Three (3) things I still need to understand about this topic:
1.
2.
3.

References
Textbooks

Albacea, Zita VJ., Mark John V. Ayaay, Isidoro P. David, and Imelda E. De
Mesa. Teaching Guide for Senior High School: Statistics and Probability.
Quezon City: Commission on Higher Education, 2016.
Caraan, Avelino Jr S. Introduction to Statistics & Probability: Modular
Approach. Mandaluyong City: Jose Rizal University Press, 2011.
Chan Shio, Christian Paul, and Maria Angeli Reyes. Statistics and Probability
for Senior High School. Quezon City: C & E Publishing Inc., 2017.
De Guzman, Danilo. Statistics and Probability. Quezon City: C & E
Publishing Inc., 2017.
Jaggia, Sanjiv, and Alison Kelly. Business Statistics: Communicating with
Numbers. 2nd ed. New York: McGraw-Hill Education, 2016.
Sirug, Winston S. Statistics and Probability for Senior High School CORE
Subject A Comprehensive Approach K to 12 Curriculum Compliant. Manila:
Mindshapers Co., Inc., 2017.

Online Resources
Encourage Play. “A Simple and Fun Game in Making Decisions.” Accessed
May 22, 2020.
https://www.encourageplay.com/blog/a-simple-and-fun-game-to-practice-making-decisions
HackMath.net. “Calculators.” Accessed May 22, 2020.
https://www.hackmath.net/en/calculator/normaldistribution?mean=0&sd=1&area=above&above=1.645&below=&ll=&ul=&outsideLL=&outsideUL=&draw=Calculate
Lenhart, Kira. “Country Infographic Rubric.” Accessed June 17, 2020.
https://i.pinimg.com/originals/0b/04/12/0b041201f9fcc8c8ef4decc428765529.png

Statistics and
Probability
Quarter 2 – Module 9:
Formulating Appropriate
Null and Alternative Hypotheses
on a Population Proportion

What I Need to Know

In the previous lessons, you have studied how to formulate appropriate null
and alternative hypotheses concerning population means. Also, you’ve
learned how to draw correct conclusions after solving given problems based
on the test statistic and the rejection region.

In this module, you will briefly review population proportions
and other related concepts with their equivalent symbols, such as the test
statistic, rejection region, p-value, and level of significance.
After going through this module, you are expected to:

1. recall and identify the symbols used in formulating hypotheses;


2. formulate the appropriate null and alternative hypotheses concerning
population proportions; and
3. identify whether the given hypothesis test is a single-tailed or a two-
tailed test.

What I Know

Directions: Choose the best answer to the given questions or statements.


Write the letter of your choice on a separate sheet of paper.

1. A certain telephone company with a target of 5,000 items found that 300
out of the 500 items they randomly chose and tested failed to meet the
quality control guidelines. The company would like to test if more than
25% of the target might be out of assurance. What is the appropriate
alternative hypothesis at α = 0.05 level?

A. Ha: p < 0.25 C. Ha: p = 0.25


B. Ha: p > 0.25 D. Ha: p ≠ 0.25
2. In problem no. 1, what is the correct null hypothesis?

A. Ho: p < 0.25 C. Ho: p = 0.25


B. Ho: p > 0.25 D. Ho: p ≠ 0.25
3. A survey conducted last year by the barangay health workers showed
that 20% of the students smoked. This year, a new survey is conducted
on 150 students selected randomly from the same school. It was found
out that 35 of them smoke. Test whether the proportion of students who
smoke has decreased at α = 0.01 level. Formulate the correct alternative
hypothesis.
A. Ho: p ≠ 0.20 C. Ha: p = 0.20
B. Ho: p < 0.20 D. Ha: p < 0.20

4. In a public junior high school, a study found out that 40% of Grade 7
students are overweight. Is this lower for grade level age if a sample of
100 students was randomly chosen at 0.05 level of significance? What is
the appropriate alternative hypothesis?

A. Ho: p = 0.40 C. Ha: p < 0.40


B. Ha: p ≠ 0.40 D. Ho: p < 0.40

5. Ships arriving in Manila Port are inspected by customs officials for
contaminated cargo. Assume that at a certain port, 15% of the ships
arriving in the previous year contained cargo that was contaminated. A
random selection of 50 ships in the current year included 5 that had
contaminated cargo. Does the data suggest that the proportion of ships
arriving in the port with contaminated cargo has increased in the
current year? Use α = 0.01 level. Formulate the alternative hypothesis in
sentence form.

A. The proportion of ships arriving into the port this year with
contaminated cargo is equal to 0.15.
B. The proportion of ships arriving into the port this year with
contaminated cargo is less than 0.15.
C. The proportion of ships arriving into the port this year with
contaminated cargo is not equal to 0.15.
D. The proportion of ships arriving into the port this year with
contaminated cargo is greater than 0.15.
6. Which of the following words suggests a right-tailed test?

A. smaller C. increased
B. different D. unequal

7. A mayor is concerned about the percentage of city residents who express
disapproval of his job performance. His political committee pays for a
newspaper ad, hoping to keep his disapproval rating below 21%. They
will use a follow-up poll to determine effectiveness. What is the correct
null hypothesis?

A. Ho: μ > 21 C. Ho: p < 0.21


B. Ho: p > 0.20 D. Ho: p > 0.21

8. In problem no. 7, what is the appropriate alternative hypothesis?

A. Ha: p < 0.21 C. Ha: μ < 0.21

B. Ha: p > 0.21 D. Ha: p < 0.20

9. Which of the following alternative hypotheses in symbols involving
population proportions illustrates a two-tailed test (non-directional)?

A. Ha: p < .35 C. Ha: p ≠ .35


B. Ha: p > .35 D. Ha: p = .35

10. Which of the following alternative hypotheses in sentence form is NOT a
one-tailed test (directional)?
A. The proportion of female students who enrolled is greater than 10%.
B. The proportion of male teachers who have a master’s degree is lower
than 5%.
C. The proportion of tourists who visited the park is not
equal to 50%.
D. The proportion of student athletes who joined the competition has
increased by 25%.
11. The proportion of patients with heart diseases is higher among smokers
than non-smokers. Which is true about this statement?
A. It is a null hypothesis that is directional.
B. It is a null hypothesis that is non-directional.
C. It is an alternative hypothesis that is directional.
D. It is an alternative hypothesis that is non-directional.

12. In the given statement below, what is the appropriate alternative
hypothesis to be used in symbols?
In a university, the proportion of graduates majoring in Mathematics is
more than 10% of the entire population.

A. Ha: p < .10 C. Ha: p ≠ .10


B. Ha: p > .10 D. Ha: p > .10

13. Which of the following statements is incorrect about the alternative
hypothesis on a population proportion?
A. It is represented by Ha.
B. It is used as a basis to determine the location of the p-value.
C. It is the competing claim that the parameter is equal to a specific
value.
D. It is a claim that the proportion is less than, greater than, or not
equal to the hypothesized proportion po.
14. What is the use of the value of p in formulating the null hypotheses on a
population proportion?
A. It is the value only used for one-tailed test.
B. It is the basis for the computation of z-test.
C. It is the same as the hypothesized proportion.
D. It is a probability that tells if the null hypothesis is true.

15. Which of the following pairs of null and alternative hypotheses is
correctly written in symbols?
A. Ho: p = .10 C. Ho: p < .30
Ha: p < .20 Ha: p = .30

B. Ho: p ≤ .40 D. Ho: p ≥ .50


Ha: p > .40 Ha: p > .50

How did you find this pre-test? Did you encounter both familiar and
unfamiliar terms? Kindly compare your answers with the Answer Key in the
last part of this module.
If you got a perfect score or 100%, skip this module and proceed to
the next one. But if you missed even a single point, please continue with
this module as it will enrich your knowledge in formulating hypotheses on
the population proportion.

Lesson 9
Formulating Appropriate Null and Alternative Hypotheses on a Population Proportion
In the previous modules, you learned all the steps in testing a
hypothesis concerning the population mean, based on the sample mean, using
the critical value approach. You also applied those concepts in solving
real-life problems. Of course, the problems presented were limited to testing
hypotheses concerning the population mean.
This time, you will learn how to test a hypothesis involving another
parameter, the population proportion. Are the steps in testing a hypothesis
on a population proportion the same as the steps you have just learned? Is it
easier to test a hypothesis concerning a population proportion than one
concerning a population mean? These are some of the questions you may
answer yourself as you go along with the next modules.
As you know, the first step in hypothesis testing is to formulate
the null and the alternative hypotheses. This is also true if you are testing
a hypothesis concerning a population proportion. But prior to that, you must
fully understand the given situation and identify what values are given in
the problem. It is important to correctly identify the different symbols
involved and their corresponding values found in the given problem.
Recall the different symbols used in hypothesis testing by answering
the following activity.

What’s In

Review Activity: Match Them Up!

Directions: Match the given terms/phrases in Column A with the correct
symbols in Column B. Write your chosen symbols on the boxes.
A B

1. Alternative Hypothesis 𝑛

2. Sample Proportion 𝛼

3. Null Hypothesis 𝑝̂

4. Population Proportion 𝐻𝑎

5. Sample Size 𝑞

6. Value of 1 − 𝑝 𝐻0

7. Level of Significance 𝑝

Answer the following questions:

1. How did you find the activity?

2. Were you able to recall the different necessary symbols used in testing
hypothesis correctly?

3. Did you encounter both familiar and unfamiliar symbols?

4. What is the importance of those symbols?

What’s New

Activity 2: Synonyms Match

Directions: Classify the given words by grouping the relevant words
together. Place them on the table below.

lesser     different   bigger
changed    increase    greater
higher     smaller     unequal
lower      more        decrease
fewer      larger

In further discussions of this lesson, you will encounter some of the
words listed above in formulating null and alternative hypotheses on a
population proportion. They will be of great help to you in answering
problems correctly.

What Is It

Once you know that you are dealing with a population proportion, you can
conduct the hypothesis test. Start with the first step of a hypothesis
test, which is to determine the hypotheses. To formulate null and
alternative hypotheses concerning population proportions, you can write
them in sentence form or you can use symbols. Here, you will use the
symbol p for the population proportion.

Remember that the hypotheses are claims about the population
proportion, p. The null hypothesis states that the proportion is equal to a
specific value or the hypothesized proportion, po. On the other hand, the
alternative hypothesis is the competing claim that the population proportion
is less than, greater than, or not equal to po.
As a reminder, the null hypothesis is always a statement of equality.
The alternative hypothesis is always a statement of inequality, using the
symbols <, >, or ≠. Moreover, the hypotheses are stated in such a way that
they are mutually exclusive. That is, if one is true, the other must be false;
and vice versa.

If you are going to write the null hypothesis in sentence form, you
will usually use “is” or “is equal to”. In symbols, you are going to use:

Ho: p = po

Meanwhile, to formulate the alternative hypothesis in sentence form or in
symbols, just remember the following:

- When testing for population proportions, there are three (3) possible
  alternative hypotheses. They are based on the wording of the question
  instructing you what to hypothesize. (See illustrative examples below.)

Alternative Hypotheses      Clues/Words Used
(Symbols to Be Used)

a. Ha: p < po               smaller, less, decreased, fewer, lower
b. Ha: p > po               larger, greater, more, increased
c. Ha: p ≠ po               different, not equal to, changed

where: p  = population proportion
       po = hypothesized proportion

In the symbols shown above, letters a and b are used in a one-tailed or
one-sided test (directional), while letter c is used in a two-tailed test
(non-directional).
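
To see how the clue words translate into symbols, here is a minimal Python
sketch. It is only an illustration: the names CUE_WORDS, TAIL,
alternative_symbol, and formulate are made up for this example, and the
word lists simply copy the clue words given in this lesson. Real problems
may use other wordings, so treat it as a study aid rather than a rule.

```python
# Clue words copied from the lesson; all names here are hypothetical.
CUE_WORDS = {
    "<": ("smaller", "less", "decreased", "fewer", "lower"),
    ">": ("larger", "greater", "more", "increased"),
    "≠": ("different", "not equal to", "changed"),
}

TAIL = {
    "<": "left-tailed (directional)",
    ">": "right-tailed (directional)",
    "≠": "two-tailed (non-directional)",
}


def alternative_symbol(cue):
    """Return the Ha symbol matching a clue word, or None if it is not listed."""
    cue = cue.strip().lower()
    for symbol, words in CUE_WORDS.items():
        if cue in words:
            return symbol
    return None


def formulate(cue, p0):
    """Build Ho, Ha, and the type of test for a hypothesized proportion p0."""
    symbol = alternative_symbol(cue)
    if symbol is None:
        raise ValueError(f"no rule for clue word: {cue!r}")
    return f"Ho: p = {p0:.2f}", f"Ha: p {symbol} {p0:.2f}", TAIL[symbol]


print(formulate("different", 0.40))
print(formulate("lower", 0.60))
print(formulate("larger", 0.40))
```

The null hypothesis is always written with the equal sign here, and only
the alternative hypothesis changes with the clue word.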
As you might recall, the differences between one-tailed test
(directional) and two-tailed test (non-directional) were already explained to
you in the previous modules. And for the purpose of this lesson, the table
below shows the differences between one-tailed test and two-tailed test.

One-Tailed                                  Two-Tailed
- Alternative hypothesis contains the       - Alternative hypothesis contains the
  greater than (>) or less than (<)           inequality (≠) symbol.
  symbol.
- It is directional (either right-tailed    - It has no direction.
  or left-tailed).

The next table below shows the null and alternative hypotheses stated
together with the types of hypothesis tests.

                         Two-Tailed Test   Right-Tailed Test   Left-Tailed Test

Null Hypothesis          Ho: p = po        Ho: p = po or       Ho: p = po or
                                           Ho: p ≤ po          Ho: p ≥ po
Alternative Hypothesis   Ha: p ≠ po        Ha: p > po          Ha: p < po

Illustrative Examples:
Example 1. It has been claimed that 40% of students in a particular senior
high school dislike Mathematics. When a survey was conducted by a
researcher, it showed that 145 of 800 students dislike Mathematics. Test if
the claim was different at α = 0.05 level.
Null Hypothesis (Ho):
In this example, the hypothesized proportion is 40% or 0.40. Hence,
the null hypothesis will be,
The proportion of students who dislike Mathematics is 40%.
In symbols, you can write,
Ho: p = 0.40
Alternative Hypothesis (Ha):
Our cue word here is “different” which means “not the same” or “not
equal”. Therefore the alternative hypothesis is,
The proportion of students who dislike Mathematics is not equal
to 40%.
In symbols, you can write,
Ha: p ≠ 0.40

Since the word “different” is used in the given problem,
the symbol to be used in the alternative hypothesis is “≠”.

Note: This is a two-tailed test or non-directional.

Example 2. A certain senior high school plans to open STEM (Science,
Technology, Engineering, and Mathematics) as an academic track only if
60% of the students in their junior high school will enroll in the following
academic year. A survey conducted among a random sample of students
revealed that 450 out of 1000 students will enroll. Is the expected enrollment
significantly lower than the desired enrollment? Test at the α = 0.05 level.

Null Hypothesis (Ho):
The hypothesized proportion here is 60%; therefore, the null
hypothesis will be,
The proportion of students who will enroll in the STEM track is 60%.
In symbols, it can be written as,
Ho: p = 0.60
Alternative Hypothesis (Ha):
Your hint in formulating the alternative hypothesis in this example is
the phrase “lower than” which means “less than”. So, your alternative
hypothesis will be,
The proportion of students who will enroll in the STEM track is lower
than 60%.
which can be written as,
Ha: p < 0.60

Since the word “lower” is used in the given problem,
the symbol to be used in the alternative hypothesis is “<”.

Note: This is a one-tailed test or directional.

Example 3. It has been claimed that 40% of qualified applicants passed
a particular job interview. When a researcher from a certain company
conducted a survey, it showed that 90 of 145 applicants passed the job
interview. Test if the proportion is larger than claimed at the α = 0.05
level.
Null Hypothesis (Ho):
40% is the hypothesized proportion; hence, you have the null
hypothesis stated as
The proportion of qualified applicants who passed a particular job
interview is 40%.
And it can be written in symbols as
Ho: p = 0.40

Alternative Hypothesis (Ha):
The word “larger” is synonymous with “greater”; hence, your alternative
hypothesis will be,
The proportion of qualified applicants who passed a particular job
interview is larger than 40%.
Or in symbols
Ha: p > 0.40

Since the word “larger” is used in the given problem,
the symbol to be used in the alternative hypothesis is “>”.

Note: This is a one-tailed test or directional.
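
The three illustrative examples can be double-checked with a short Python
sketch. It is a hedged illustration only: the function name summarize and
the layout of the examples dictionary are assumptions made for this sketch.
The sample proportion x/n is computed as a preview of the next module; it
is not needed to formulate the hypotheses themselves.

```python
# Data taken from Examples 1 to 3:
# (hypothesized proportion, symbol used in Ha, successes x, sample size n)
examples = {
    "Example 1": (0.40, "≠", 145, 800),
    "Example 2": (0.60, "<", 450, 1000),
    "Example 3": (0.40, ">", 90, 145),
}


def summarize(p0, symbol, x, n):
    """Return Ho, Ha, and the sample proportion x/n rounded to 3 decimals."""
    return {
        "Ho": f"Ho: p = {p0:.2f}",
        "Ha": f"Ha: p {symbol} {p0:.2f}",
        "p_hat": round(x / n, 3),
    }


for name, data in examples.items():
    print(name, summarize(*data))
```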

What’s More

Activity 3: Please Correct Me If I’m Wrong!


Directions: On the first blank before each number, draw a happy face if
the pair of hypotheses is correct and a sad face if the pair is incorrect. If
incorrect, write the correct ALTERNATIVE hypothesis in symbols on the
second blank.
_____, __________1. In a public market, 65% of the vendors preferred to use
plastic over paper bags. After the local ordinance was signed, 92 out of 120
randomly selected vendors preferred plastic over paper bags. Does this
indicate that vendors in that public market have less preference in using
paper bags? Use 0.05 level of significance.
Ho : p = 0.65
Ha : p < 0.65
_____, __________2. The school principal in a certain private junior high
school claimed that 35% of all students are in favor of the new PE uniform.
A research teacher asked his students to verify the claim. With this, 271 out
of 400 randomly selected students agreed to the new PE uniform.
Using α = 0.10 level, is there enough evidence to conclude that the
percentage of students who are in favor of the new PE uniform is different
from 35%?

Ho : p = 0.35
Ha : p > 0.35
_____, __________3. A study found that 5% of the senior high school
students in a certain school are working students. In a random sample of
300 students, a researcher found that 35 are working. Is there a percentage
increase in the number of senior high school students who are working? Use
α = 0.01 level.
Ho : p = 0.05
Ha : p ≠ 0.05
_____, __________4. Before the national elections, 75% of the voters in a
certain town said that they preferred older senatorial candidates running for
senatorial positions than younger candidates. After a certain survey was
conducted, 910 out of 1,300 randomly selected voters preferred older
senatorial candidates. Does this claim indicate that voters in that town have
a greater interest in older candidates than in younger ones? Use α = 0.05.
Ho : p = 0.75
Ha : p > 0.75
_____, __________5. A researcher claimed that 55% of elementary students
would rather play than read books during break time. Another researcher
was assigned to verify the claim. He randomly selected 300 students. Two
hundred seventy-four (274) of them said they would rather play during
break time than read books.
At 0.10 level, is there enough evidence to conclude that the percentage
of elementary students who would rather play than read books has changed
from 55%?
Ho : p = 0.55
Ha : p < 0.55

Activity 4: Use It in a Sentence!


Directions: Using the given problems in Activity 3, write the appropriate
null and alternative hypotheses in sentence form. Write your answers on the
blank provided.
i. Ho _______________________________________________________________________________________________
Ha _______________________________________________________________________________________________

ii. Ho _______________________________________________________________________________________________
Ha _______________________________________________________________________________________________

iii. Ho _______________________________________________________________________________________________
Ha _______________________________________________________________________________________________

iv. Ho _______________________________________________________________________________________________
Ha _______________________________________________________________________________________________

v. Ho _______________________________________________________________________________________________
Ha _______________________________________________________________________________________________

Activity 5: Tell me the Tail


Directions: In each problem below, give the null and alternative hypotheses
and identify whether it is right-tailed, left-tailed or two-tailed test.

1. A sample of 800 items produced on a new machine showed that 48 of
them are defective. The factory will get rid of the machine if the data
indicate that the proportion of defective items is significantly more
than 5%. At a significance level of 10%, does the factory get rid of the
machine or not?

2. A drug manufacturer claims that fewer than 10% of patients who take
its new drug for treating certain pneumonia will experience nausea.
In a random sample of 250 patients, 23 experienced nausea. Perform
a significance test at the 5% significance level to test this claim.

3. In a group of 375 Senior High School students, 40 were left-handed. Is
this significantly different from the proportion of all Senior High
School students who are left-handed, which is 12%?

4. In a random survey of 1000 households in Unlad Province, it is found
that 29% of the households have at least one member with a college
degree. Does this finding contradict the statement that the proportion
of all such households in Unlad Province is 35 percent? Test at α = .05
significance level.

5. In a random sample of 400 electronic gadgets, 14 were found to be
defective. The manufacturer wants to claim that less than 5% of all of
their gadgets are defective. Test this claim at the 0.01 significance level.

What I Have Learned

Complete the following statements.

1. Formulating the appropriate null hypothesis on a given population
proportion in symbols is written as _________________________.

2. To formulate alternative hypothesis concerning population proportion,
there are three possible alternative hypotheses and they are based on the
wording of the question instructing you what to hypothesize.
a. A problem with the expressions “smaller”, “less”, “decreased”, “fewer”,
or “lower” is written in symbols as ________________.
b. A problem with the expressions “larger”, “greater”, “more”, or
“increased” is written in symbols as __________________.
c. A problem with the expressions “different”, “not equal to”, or “changed”
is written in symbols as ___________________.
3. A hypothesis test can be directional or _____________________________.

4. A directional or one-tailed test can be_________________ or _______________.

5. The null and alternative hypotheses are _________________. That means, if
one is true, the other must be false; and vice versa.

What I Can Do

Activity: Research, Create, Then Formulate!


Directions: Conduct simple research to obtain some data in your
community. You may do this through an interview. For example, your topic
may be about population, health, number of households, accident rates,
employment, etc. Using the given words below, choose only three and
construct your own word problems. Then, formulate the appropriate null
and alternative hypotheses.
lesser different larger
changed increase greater
higher smaller unequal
lower more decrease

Assessment

Directions: Choose the best answer to the given questions or statements.
Write the letter of your choice on a separate sheet of paper.

1. Which of the following is the correct symbol for hypothesized proportion?
A. po B. p
C. Hp D. Ho

2. What is the correct alternative hypothesis in sentence form of the
statement: Ha : p ≠ 0.21?
A. The proportion of males in a certain barangay is 21%.
B. The proportion of males in a certain barangay is fewer by 21%.
C. The proportion of males in a certain barangay is different from 21%.
D. The proportion of males in a certain barangay has increased by 21%.

3. Which of the following words is used in a null hypothesis written in
sentence form?
A. equal B. fewer
C. higher D. lower

4. A researcher will use a one-tailed test on his research study concerning
population proportions. One-tailed test is also called _____________.
A. one-sided B. one-region
C. directional D. non-directional

5. Which of the following statements is true about alternative hypothesis on
a population proportion?
A. It is represented by Ha.
B. It is used as a basis to determine the location of the extreme values.
C. It is the competing claim that the parameter is equal to a specific
value.
D. It is a claim that the proportion is equal to the hypothesized
proportion.

6. Before the opening of classes, parents were asked to answer a survey.
Those in favor of alternative learning mode were 1,900 out of 5,000
randomly selected parents. Using α = 0.10 level, is there enough evidence
to conclude that the percentage of parents who are in favor of alternative
learning mode is less than 38%? What is the appropriate alternative
hypothesis?

A. Ha : p > .38 C. Ha : p < .38
B. Ha : p = .38 D. Ha : p ≠ .38

7. In problem no. 6, formulate the correct null hypothesis.
A. The proportion of parents who are in favor of alternative learning
mode is 38%.
B. The proportion of parents who are in favor of alternative learning
mode is not equal to 38%.
C. The proportion of parents who are in favor of alternative learning
mode has increased by 38%.
D. The proportion of parents who are in favor of alternative learning
mode is more than 38%.

8. A survey conducted last year by certain barangay officials showed that
40% of the residents owned a private car. This year a new survey is
conducted on 375 residents selected randomly from the same barangay.
It was found out that 190 of them own a car. Test if the claim is fewer at
α = 0.01 level. Formulate the appropriate null hypothesis.
A. Ho : p = .40 C. Ho : p < .40
B. Ha : p = .40 D. Ha : p < .40

9. In problem no. 8, what is the correct alternative hypothesis in sentence
form?
A. The proportion of residents who owned a private car is 40%.
B. The proportion of residents who owned a private car is higher than
40%.
C. The proportion of residents who owned a private car has changed to
40%.
D. The proportion of residents who owned a private car is fewer than
40%.

10. In a certain senior high school, a study found that 972 out of 1,100
Grade 12 students use smartphones. Using α = 0.10 level, is there
enough evidence to conclude that the percentage of students who are
using smartphones is different from 25%? What is the appropriate
alternative hypothesis?
A. Ha: p > .25 C. Ha: p = .25
B. Ha: p < .25 D. Ha: p ≠ .25

11. In problem no. 10, what type of test is used?
A. directional C. one-tailed test
B. one-sided test D. non-directional

12. The president of a certain food chain claims that 70% of his 20,000
customers are very satisfied with the service they receive. In order to
test the claim, a survey was conducted among 100 customers randomly
chosen. Among them, 90% said that they are satisfied. Is there enough
evidence to say that 70% of the customers are satisfied at 0.05 level of
significance? Formulate the null hypothesis.
A. Ho: p ≠ 0.70 C. Ho: p < 0.70
B. Ho: p = 0.70 D. Ho: p > 0.70

13. In problem no. 12, what is its alternative hypothesis?
A. Ha: p ≠ 0.70 C. Ha: p = 0.70
B. Ha: p < 0.70 D. Ha: p > 0.70

14. Suppose a TV network claims that at least 6% of their 15,000
employees are living in a condominium unit in the city. A survey was
conducted randomly among 500 employees. Assume a significance level
of 0.05. What is the appropriate alternative hypothesis of this study?
A. Ha: p ≠ 0.06 C. Ha: p < 0.06
B. Ha: p > 0.06 D. Ho: p = 0.06

15. In problem no. 14, what specific type of test is applied?
A. directional C. one-tailed test
B. left-tailed test D. non-directional

Additional Activities

Directions: Given below are alternative hypotheses (in sentence form or in
symbols) on a population proportion. Determine if each is a one-tailed test
or a two-tailed test.
1. Ha : p ≠ 0.12 ______________________________
2. Ha : p < 0.58 ______________________________
3. Ha : p > 0.27 ______________________________
4. The proportion of qualified students in an entrance examination for a
certain college admission is greater than 10%.
______________________________

5. The proportion of voters has decreased by 8% during this year’s
election.
_____________________________

References

Textbooks

Albacea, Zita VJ., Mark John V. Ayaay, Isidoro P. David, and Imelda E. De
Mesa. Teaching Guide for Senior High School: Statistics and Probability.
Quezon City: Commission on Higher Education, 2016.

Caraan, Avelino Jr S. Introduction to Statistics & Probability: Modular
Approach. Mandaluyong City: Jose Rizal University Press, 2011.

De Guzman, Danilo. Statistics and Probability. Quezon City: C & E
Publishing Inc., 2017.

Punzalan, Joyce Raymond B. Senior High School Statistics and Probability.
Malaysia: Oxford Publishing, 2018.

Sirug, Winston S. Statistics and Probability for Senior High School CORE
Subject A Comprehensive Approach K to 12 Curriculum Compliant.
Manila: Mindshapers Co., Inc., 2017.

Online Resources

Minitab.com. “About the Null and Alternative Hypotheses.” Accessed
February 4, 2019.
https://support.minitab.com/en-us/minitab/18/help-and-how-to/statistics/basic-statistics/supporting-topics/basics/null-and-alternative-hypotheses/

Minitab.com. “What Are Type I and Type II Errors?” Accessed February 4,
2019.
https://support.minitab.com/en-us/minitab/18/help-and-how-to/statistics/basic-statistics/supporting-topics/basics/type-i-and-type-ii-error/

Zaiontz, Charles. “Null and Alternative Hypothesis.” Accessed February 2,
2018.
http://www.real-statistics.com/hypothesis-testing/null-hypothesis/

Statistics and
Probability
Quarter 2 – Module 10:
Identifying Appropriate Test Statistic Involving Population
Proportion

What I Need to Know

As you may recall, the Central Limit Theorem states that if the sample
size is sufficiently large, then the mean of a random sample from a
population has a sampling distribution that is approximately normal, even
when the original population is not normally distributed. This means that
regardless of the shape of the original distribution, the sampling distribution
of the mean approaches a normal distribution as long as the sample is large
enough. Remember that the Central Limit Theorem is not limited to sample
means only. It can also be applied to sample proportions.
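
A small simulation can make this idea concrete. The Python sketch below is
only an illustration under stated assumptions: the function name
simulate_sample_proportions and the values p = 0.30, n = 100, and 2000
repetitions are chosen for this example and do not come from the module.
It draws many random samples, records each sample proportion, and compares
their spread with the well-known value sqrt(pq/n), where q = 1 − p.

```python
import random
import statistics


def simulate_sample_proportions(p, n, repetitions, seed=0):
    """Draw `repetitions` samples of size n and return each sample proportion."""
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    proportions = []
    for _ in range(repetitions):
        # Each trial is a "success" with probability p.
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return proportions


p, n = 0.30, 100  # assumed values, for illustration only
p_hats = simulate_sample_proportions(p, n, repetitions=2000)

print("mean of the sample proportions:", round(statistics.mean(p_hats), 3))
print("standard deviation observed:  ", round(statistics.stdev(p_hats), 3))
print("sqrt(pq/n) for comparison:    ", round((p * (1 - p) / n) ** 0.5, 3))
```

With a sample this large, the sample proportions should cluster around p,
which is the behavior the Central Limit Theorem describes.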
This module deals with identifying the appropriate form of the test
statistic involving a population proportion when the Central Limit Theorem
is to be used. However, the activities are limited to determining the
population proportion and sample proportion as preparation for solving for
the appropriate test statistic.
After going through this module, you are expected to:
1. define population proportion and sample proportion;
2. determine the value of the population proportion and sample
proportion;
3. identify the appropriate form of the test statistic when the Central
Limit Theorem is to be used; and
4. relate population proportion in real-life situations.

What I Know

Directions: Choose the best answer to the given questions or statements.
Write the letter of your choice on a separate sheet of paper.
1. It is a part of the population with a particular trait expressed in decimal,
percent, or fraction.
A. Sample C. Sample Proportion
B. Population D. Population Proportion
2. It is the symbol used to represent the proportion of the samples.
A. x B. p C. 𝑝̂ D. n

3. What is the sample proportion if n = 550 and x = 308?
A. 0.35 B. 0.50 C. 0.56 D. 0.65

4. A researcher claims that 4% of all helmets have manufacturing flaws that
could potentially cause injury to a motorcycle rider. A sample of 100
helmets revealed that 5 contain such defects. Is the sample large enough
to use the Central Limit Theorem?
A. No, because 5/100 <5 C. Yes, because 100>30
B. No, because 100(0.04)<5 D. Yes, because 100(0.96)>5
5. In testing hypothesis involving population proportion, when do we say
that the sample size (n) is sufficiently large?
A. if n>30 B. if n<100 C. if np>5, nq>5 D. if n/p<5, n/q<5
6. What do you call the proportion of individuals in a sample sharing a
certain trait?
A. sample C. sample proportion
B. population D. population proportion
7. Twenty-eight percent (28%) of all Masagana High School students believe
that Monday will be a rainy day. You take a sample of 50 students and
find that 15 of them believe Monday will be a rainy day. What does 50
represent?
A. x B. p C. 𝑝̂ D. n

8. What would be the sample in the following situation?
A restaurant wants to know if customers buy dessert when they
eat out. As people leave the restaurant one evening, 20 people are
randomly surveyed. Eight people say they usually order dessert when
they eat out. The restaurant concludes that most customers do not order
dessert.

A. 8 customers C. all customers
B. 20 customers D. all customers who do not order dessert
9. Which assumption/s must be considered in testing hypothesis involving
proportion?
I. The conditions for binomial experiment are met.
II. The expression np>5 and nq>5 are both satisfied.
III.The sample size must be greater than or equal to 30.
A. I and II B. I and III C. II and III D. III only

10. What would be the population in this situation?
Surveyors in a mall choose shoppers to ask about products they prefer.
A. the surveyors
B. the products they sell
C. all shoppers in the mall
D. the shoppers who were asked about their preferences
11. In testing hypothesis involving population proportion, which of the
following is appropriate to use?
A. t-test B. z-test C. p-test D. chi square
12. If the value of p is 0.45, what is the value of q?
A. 0.45 B. 0.46 C. 0.54 D. 0.55
13. Before the Mayweather vs. Pacquiao’s Fight of the Century, 75% of people
in Manila said that they prefer boxing over basketball. After the fight, out
of 150 randomly chosen people in Manila, 105 said they prefer boxing
over basketball. Which statement represents the probability of failure, q?
A. They prefer boxing over basketball.
B. They prefer basketball over boxing.
C. They don’t prefer boxing over basketball.
D. They don’t prefer both boxing and basketball.

14. In problem no. 13, what is the value of the sample proportion, 𝑝̂ ?
A. 0.60 B. 0.67 C. 0.70 D. 0.75
15. In a learning study, 1,200 respondents were asked if they can assimilate
concepts while watching television wherein 586 said YES. What is the
proportion of those who said yes?
A. 0.40 B. 0.49 C. 0.51 D. 0.58

How did you find this pre-test? Did you encounter both familiar and
unfamiliar terms? Kindly compare your answers with the Answer Key in the
last part of this module.
If you got a perfect score or 100%, skip this module and proceed to
the next one. But if you missed even a single point, please continue with
this module as it will enrich your knowledge in hypothesis testing involving
population proportion.

Lesson 10: Identifying Appropriate Test Statistic Involving Population
Proportion

There are certain situations when the data to be analyzed involve a
population proportion or percentage. The following are some examples
that show this condition.

- A politician wants to know the percentage of his constituents who
  approve of his policy on educational programs and reforms.
- A manufacturing company is interested in determining the proportion of
  defective products in the assembly line.
- A set of randomly selected employees were asked to determine the
  percentage of their incomes spent on food per month.
- In a sample of 50 students, there are 15 part-timers. (This situation
  shows proportion.)
- Fifty percent (50%) of the restaurants in the sample generate more than a
  third of their weekly sales of juices.

Notice that the cases above all involve a percentage or proportion. In the
previous modules, you learned how to test a hypothesis concerning the
population mean and sample mean. This time, you will learn how to test a
hypothesis involving a population proportion. To be able to do so, the
z-test statistic for a population proportion will be applied, particularly
when the Central Limit Theorem is to be used. However, as mentioned before,
this lesson will just serve as a preparation. Further details on computing
the test statistic involving a proportion will be discussed in Module 12.
Go over the lessons and have fun in working with the activities.

What’s In

It is important that you become acquainted with the different terms that
you will encounter throughout this module. The activity below will help
you check your understanding and become familiar with these terms.

Please observe honesty and perseverance at all times.

Activity 1: Remind Me Please…

Directions: Identify the word/s being described by the statements in the
box. Copy the letters of your answer on the corresponding columns on the
table below. Then, answer the questions that follow.

A - Sample          E - Sample Proportion
R - Population      T - Population Proportion

1. It is an entire group of people, objects, or events which all have
at least one characteristic in common and must be defined
specifically and unambiguously.
2. It refers to any part of a population regardless of whether it is a
representative or not.
3. It refers to a part of a population with a particular attribute,
expressed as a fraction, decimal, or percentage of the whole
population.
4. It is the proportion of individuals in a sample sharing a certain
trait.

1 2 3 4

Answer the following questions. Write your answer on a separate sheet of
paper.

1. Find a word that begins with p and is synonymous to your answer in
the table above.
2. Which One Doesn’t Belong? Identify the group of words that are
most likely NOT synonymous to proportion.

A. Ratio, Fraction, Percentage
B. Part, Section, Calculation
C. Mean, Average, Calculation
D. Rate, Percentage, Measurement

Notes to the Teachers
It is encouraged that the learners be asked to provide a
separate activity notebook where they will write their answers to
all assessments and activities in Modules 9-14 in which topics
are all interrelated. Through this, learners’ progress can easily be
monitored and parts of the lesson where intervention is needed
can be identified.

Did you get a perfect score in Activity 1? Challenge yourself and
get the same score in Activity 2.
A bonus of 5 points awaits you if
you get 2 consecutive perfect scores.
Be focused always. Good luck!

What’s New

Now to start this lesson, accomplish the activity below. Do not forget
to keep your answers because we will be using them in our discussions.

Activity 2: Rainbow Connection

In Matapat City, 10% of the residents are senior citizens. A survey was
conducted among 500 randomly selected senior citizen residents to
determine if they have cell phones. Out of 500 respondents, 421 answered
that they own a cell phone.

Directions: Based on the situation above, match the questions in Column A
to their corresponding answers in Column B. Copy and use colored pens in
connecting the dots.

Column A
1. What is the survey all about?
2. What percent of the residents in the city are senior citizens?
3. How many senior citizen residents are actually surveyed?
4. How many senior citizen residents own a cell phone?
5. What variable describes the population in the situation?
6. What variable serves as the sample?

Column B
- 500 senior citizen residents
- 421 senior citizen residents
- residents in the city
- senior citizen residents in the city
- senior citizen residents who own a cell phone
- 10% of the residents
- 84.2% senior citizen residents

What Is It

Dealing with various problems or situations oftentimes leads to
confusion. In this section, take note that problems involving proportions,
unlike in population mean and sample mean, never use terms such as
“average” and “mean” but “percentage” instead. Let us first define what
population proportion is.
Population Proportion and Sample Proportion
Population proportion (p) is a part of the population with a particular
attribute or trait expressed as a fraction, decimal, or percentage of the whole
population. In symbol:

$$p = \frac{\text{number of members in the population with a particular attribute}}{\text{number of members in the population}}$$

p = ____ %

Notice that in Matapat City, 10% (a percentage is used) of all the
residents are senior citizens. Therefore, the percentage of senior citizen
residents represents the population proportion, which makes
p = 10% = 0.10.
Similarly, among these senior citizens, what percentage owns a cell
phone? That illustrates the sample proportion, in symbol $\hat{p}$ (read as “p
hat”), which is computed as follows:

$$\hat{p} = \frac{\text{no. of senior citizen residents with cell phone}}{\text{no. of senior citizen residents}} = \frac{421}{500} = 0.842 \approx 0.84$$
Sometimes, the sample proportion ($\hat{p}$) is stated directly, such as:
- “20% of the respondents” = 0.20
- “5% of the defective bulbs” = 0.05
- “50% of the Grade 12 students” = 0.50

To change percent to
decimal, see examples
below:
1. 12% = 0.12
2. 5% = 0.05
3. 12.5% = 0.125

On the other hand, there are cases where we still need to calculate $\hat{p}$.
Examples of these kinds are:
- “70 out of 200 residents are married.”
- “150 out of 500 listeners are interviewed.”
- “10 out of 1000 bulbs are defective.”
In these cases, we need to solve for the value of the sample proportion
$\hat{p}$ (read as “p hat”).
Sample proportion ($\hat{p}$) is the ratio of the number of elements in the
sample possessing the characteristic of interest to the number of
elements in the sample, n. It is computed by the formula:

$$\hat{p} = \frac{\text{random variable for the number of successes in } n \text{ samples}}{\text{number of trials or the size of the sample}} = \frac{x}{n}$$

where: $\hat{p}$ is the proportion of the number of successes in n samples
and is read as “p hat”;
x represents the number of “successes” in n samples; and
n represents the size of the sample.

The example below will help you understand better how we can easily
estimate the value of the sample proportion.
Remember that in a situation
describing a population
proportion/sample proportion, the
words “mean” or “average” are not
used.

Illustrative Example:
For a class project, a Grade 12 STEM student wants to estimate the
percentage of students in his school who are registered voters. From 45%
Grade 12 students, he surveys 500 students and finds that 200 are
registered voters. Determine the value of p and compute for the sample
proportion.
Solution:
The population proportion is the rate or percent given for the entire
group of Grade 12 students. Therefore:
Population Proportion, p = 45% = 0.45
To find the sample proportion ($\hat{p}$), identify the following:
Surveyed Grade 12 students = n = 500
Registered Grade 12 students = x = 200

Therefore, the sample proportion will be computed as follows:

$$\hat{p} = \frac{\text{number of registered Grade 12 students}}{\text{number of Grade 12 students surveyed}} = \frac{200}{500} = 0.4$$
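The computation above can also be checked with a few lines of Python. This is only a minimal sketch: the function name `sample_proportion` is our own choice for illustration and does not come from the module.

```python
def sample_proportion(x: int, n: int) -> float:
    """Return the sample proportion p-hat = x / n.

    x is the number of "successes" and n is the sample size.
    """
    if n <= 0:
        raise ValueError("sample size n must be positive")
    if not 0 <= x <= n:
        raise ValueError("number of successes x must be between 0 and n")
    return x / n


# Registered-voter example: 200 registered voters out of 500 surveyed students.
p_hat_voters = sample_proportion(200, 500)

# Matapat City example: 421 cell phone owners out of 500 respondents.
p_hat_phones = sample_proportion(421, 500)

print(p_hat_voters)            # 0.4
print(round(p_hat_phones, 2))  # 0.84
```

The two input checks simply guard against impossible values, such as more successes than respondents.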

Using the Central Limit Theorem in Testing Population Proportion


When testing situations involving proportion, a percentage, or a
probability, the following assumptions must be considered:
1. The conditions for binomial experiment are met. That is, there is a fixed
number of independent trials with constant probabilities and each trial
has two outcomes that we usually classify as “success” (p) and
“failure” (q). The sum of p and q is 1. Hence, we can write p + q = 1 or
q = 1 – p.

2. The conditions np ≥ 5 and nq ≥ 5 are both satisfied so that the
binomial distribution of the number of successes can be approximated by a
normal distribution with 𝜇 = 𝑛𝑝 and 𝜎 = √(𝑛𝑝𝑞). (However, the specific
number varies from source to source; some authors use 10 instead of 5,
depending on how good an approximation one wants.)
Likewise, the second assumption serves as the basis for determining
whether the sample size is sufficiently large. Remember that this time, the
condition for a large sample is not that n be “at least 30”; instead, n must
satisfy the second assumption. For a sufficiently large sample, the Central
Limit Theorem (CLT) can be used. Bear in mind that, by the CLT, if the sample
size is sufficiently large (n ≥ 30 for means), then the mean of a random
sample from a population has a sampling distribution that is approximately
normal, even when the original distribution is not normally distributed.
Now, let us check the assumptions from the previous situation:
1. It is evident that the responses have only two outcomes: “registered
voter” (success) or “not registered voter” (failure). Therefore, the first
assumption is met.
2. To be able to satisfy the second condition, we find the hypothesized
value of the population proportion p = 0.45 while n = 500. To get q, q
= 1 – p which makes q = 1 – 0.45 = 0.55.

Through substitution, we see that the second assumption is also met, since:
np ≥ 5 and nq ≥ 5
500 (0.45) ≥ 5 and 500 (0.55) ≥ 5
225 ≥ 5 and 275 ≥ 5
Since we have shown that np ≥ 5 and nq ≥ 5, all conditions are met
where the sample size is truly large enough to use CLT. In this condition,
the test statistic to be used is the z-test statistic for proportions denoted by
Zcom or the computed z-value.
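The check of the second assumption can be sketched in Python as follows. The function name `is_large_enough` and the `threshold` parameter are our own assumptions for illustration; the default of 5 follows this module, and a stricter source might use 10.

```python
def is_large_enough(n: int, p: float, threshold: float = 5.0) -> bool:
    """Check the conditions np >= threshold and nq >= threshold, where q = 1 - p."""
    q = 1.0 - p
    return n * p >= threshold and n * q >= threshold


# Registered-voter example: n = 500, p = 0.45, so np = 225 and nq = 275.
print(is_large_enough(500, 0.45))  # True

# A hypothetical small sample (not from the module): n = 40, p = 0.05, so np = 2.
print(is_large_enough(40, 0.05))   # False
```

Note that both products must pass the test; a sample can fail because either p or q is very small.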
The z-Test Statistic for Population Proportion
Recall the z-score formula:

$$z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}}$$

With np ≥ 5 and nq ≥ 5, and with the standard deviation of the sample
proportion being $\sqrt{\frac{pq}{n}}$, substitute
$\hat{p}$ for $\bar{x}$,
p for $\mu_{\bar{x}}$, and
$\sqrt{\frac{pq}{n}}$ for $\sigma_{\bar{x}}$.
Therefore, the formula for the value of the z-test statistic for population
proportion would be:

$$z_{com} = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n}}} \quad \text{or} \quad z_{com} = \frac{\hat{p} - p}{\sqrt{\frac{p(1-p)}{n}}}$$

where:
$z_{com}$ is the z-test statistic for proportion.
$\hat{p}$ is the sample proportion ($\frac{x}{n}$).
p is the hypothesized value of the population proportion.
n is the sample size or the number of observations in the
sample.
q is equal to 1 – p.

Remember this formula because you are going to use this in Module
12 where the actual computation for the test statistic involving population
proportion will be held.
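As a preview, the formula can be written as a short Python function. This is a sketch only: the names `z_test_statistic` and `p0` (for the hypothesized population proportion) are assumptions made for illustration, and the values plugged in are those of the registered-voter example from this lesson.

```python
import math


def z_test_statistic(x: int, n: int, p0: float) -> float:
    """Compute z_com = (p_hat - p0) / sqrt(p0 * q0 / n) for a proportion."""
    p_hat = x / n                             # sample proportion
    q0 = 1.0 - p0                             # q = 1 - p
    standard_error = math.sqrt(p0 * q0 / n)   # standard deviation of p_hat
    return (p_hat - p0) / standard_error


# Registered-voter example: x = 200, n = 500, hypothesized p = 0.45.
z_com = z_test_statistic(x=200, n=500, p0=0.45)
print(round(z_com, 2))
```

The denominator uses the hypothesized p, not the sample proportion, which matches the formula above.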

What’s More

Activity 3: I Can
Directions: In each item, complete the set of solutions.
1. The iCare Center for Internet & Society at Kaliwanagan Province
recently conducted a study analyzing the privacy management habits
of 80% of teen internet users. In a group of 50 teens, 13 are reported to
have more than 500 friends on Facebook. Determine the value of p
and the sample proportion $\hat{p}$.

Solution:
p = ___ %          $\hat{p} = \frac{x}{n}$
  = ____           $\hat{p}$ = _____
                   $\hat{p}$ = ______

2. A student polls his school to see if students in Matapat Integrated
High School are for or against the new legislation regarding the
prohibition of the use of cell phones in the classroom. From 65% of the
students in the school, he surveys 600 students and finds that 480
are against the new legislation. Determine the value of p and the
sample proportion $\hat{p}$.

Solution:
p = ___ %          $\hat{p} = \frac{x}{n}$
  = ____           $\hat{p}$ = _____
                   $\hat{p}$ = ______

3. A survey of 2500 women between the ages of 15 and 50 in Kalinisan
City found that 28% of those surveyed relied on the pill for birth
control. Research shows that 25% of the population use the
pill for birth control. Determine the value of p and the sample
proportion $\hat{p}$.

Solution:
p = ___ %          $\hat{p} = \frac{x}{n}$
  = ____           $\hat{p}$ = ______

4. A poll taken prior to election day finds that 45% of registered voters
intend to vote for Mayumi Caliwanagan as barangay chairperson of
Brgy. Kapatagan. A concerned citizen's survey found that 380 out of 700
registered voters favored Mayumi. Determine the value of p and
the sample proportion $\hat{p}$.

Solution:
p = ___ %          $\hat{p} = \frac{x}{n}$
  = ____           $\hat{p}$ = _____
                   $\hat{p}$ = ______

5. A survey of the pet owners in Green Village is taken, and 40% of those
surveyed say they have dogs as pets for the protection of self/family.
A group of 180 pet owners is interviewed, and 100 say that they
have dogs for the protection of self/family. Determine the value of p and
the sample proportion $\hat{p}$.

Solution:
p = ___ %          $\hat{p} = \frac{x}{n}$
  = ____           $\hat{p}$ = _____
                   $\hat{p}$ = ______

What I Have Learned

Directions: Copy and complete the statements below.


1. A part of the population with a particular attribute or trait expressed as a
fraction, decimal, or percentage of the whole population is known as
__________________ of which in symbol is ___________.
2. A part of the sample or the proportion of individuals in a sample sharing
a certain trait is known as __________ and is written in symbol as ______.
3. To be able to find 𝑝̂ , divide the number of _________________ in the
samples by the number of _____________ or _______________ of the
samples.
4. The symbol ______ represents the successes of the samples.
5. The size of the sample is symbolized as ______.
6. The test statistic used in testing hypothesis involving population
proportion is ______________ whose formula is _________________.
7. CLT means _____________________________________________.
8. The two assumptions in testing situations involving population
proportion are that the conditions for _______________________ are met;
9. and that _______________________ are both satisfied.
10. Sample size is considered to be sufficiently large if _________.

What I Can Do

Activity 4.1: Fast Break


Directions: Given the following, compute the value of the sample
proportion 𝑝̂ as fast as you can. (Round answers to the nearest hundredth.)

     Number of Successes (x)   Number of Samples (n)   Value of Sample Proportion (𝑝̂)
1    520                       850
2    168                       480
3    248                       620
4    150                       540
5    425                       930

Activity 4.2. Puzzle
Direction: Identify what each statement below describes. Write your answers in
the puzzle box. Copy the box.
POPROP Puzzle
Across:
1. Sample _______
3. Symbolized as q
4. Opposite of ‘Failure’
6. Central _______ Theorem
Down:
1. _________ Proportion
2. An experiment with 2 outcomes only
5. Same as percentage
7. It is symbolized as n
8. Test statistic for population proportion
9. The symbol 𝑝̂ is read as ___

[Figure: POPROP puzzle grid]

Activity 4.3. On My Own


A. Directions: Check whether the sample in each problem is sufficiently
large to use the Central Limit Theorem in normal approximation.
1. A Public Information Survey investigated whether the majority of 40%
of adults supported a tax increase to help fund the local school system.
Out of this, a random sample of 300 showed that 113 agreed with the
tax increase.

2. It is believed that in the coming election, 65% of the voters in the
Province of Kaunlaran will vote for the administrative candidate for
governor. Out of 1,170 randomly selected voters, 640 indicated that
they would vote for the administrative candidate.

3. Suppose that in the past, 42% of all adults favored capital


punishment. Do we have reason to believe that this proportion has
increased if in a random sample of 150 adults, 80 favored capital
punishment?

4. Professors from an organization for private colleges and universities


reported that more than 6% of professors attended a national
convention in the past year. To test this claim, a researcher surveyed
80 professors and found that 5 attended a national convention in the
past year.
5. An insurance industry report indicated that 30% of those persons
involved in minor traffic accidents this year have been involved in at
least one traffic accident in the last five years. Believing it was too
large, an advisory group decided to investigate this claim. A sample of
200 traffic accidents this year showed that 56 persons were also
involved in another accident in the last five years.

6. A researcher claims that 75% of college students would rather spend


their extra money for internet access loads than food. Another
researcher would like to verify this claim. She randomly selected 400
students. Among them, 296 said that they would rather use their extra
money for internet access loads than food.

7. Malakas made a claim that 95% of college male students in their


school join triathlon. His friend, Baste, finds this hard to believe and
decided to check the validity of such claim, so he took a random
sample and found out that 75 out of 90 had joined the race.

8. A politician claims that he will receive 60% of the votes in the


upcoming election. In a random sample of 500 voters, there are 175
who will surely vote for him.

9. A social worker reports that 4% of workers in a factory are below 21


years of age. Of the 120 employees surveyed, 8 said they are below 21
years old.

10. A certified public accountant (CPA) claims that more than 25% of all
accountants advertise. A sample of 112 accountants in Metro Manila
showed that 40 use some form of advertising.

Assessment

Directions: Choose the best answer to the given questions or statements.


Write the letter of your choice on a separate sheet of paper.

1. It is a part of the sample with a particular trait expressed in decimal,


percent, or fraction.
A. sample C. sample proportion
B. population D. population proportion

2. It is the symbol used to represent the size of the samples.


A. n B. $\hat{p}$ C. p D. x

3. What is the sample proportion if n = 740 and x = 259?


A. 0.35 B. 0.50 C. 0.56 D. 0.65

4. In a study about household income conducted in a small town, it was


found out that 7% of all families in the town earn less than P4,000 per
month. Out of 64 families who were randomly selected, 10 families earn
less than P4,000 per month. Is the sample large enough to use the
Central Limit Theorem?
A. No, because 10/64 <5 C. Yes, because 64>30
B. No, because 64(0.07)<5 D. Yes, because 64(0.93)>5

5. In testing hypothesis involving population proportion, when do we say


that the sample size (n) is sufficiently large to use the Central Limit
Theorem?
A. if n>30 B. if n<100 C. if np>5, nq>5 D. if n/p<5, n/q<5
6. Compute for the value of 𝑝̂ if n = 740 and x = 259.
A. 0.35 B. 0.40 C. 0.52 D. 2.86
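7. Which formula for the test statistic is appropriate if the Central Limit
Theorem is used in testing hypothesis on population proportion?
A. $z_{com} = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}}$
B. $z_{com} = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n}}}$
C. $z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}} / \sqrt{n}}$
D. $z = \frac{\hat{p} - p}{\sigma}$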

8. An insurance company reported that 30% of those persons involved in


minor traffic accidents this year have been involved in at least one traffic
accident in the last three years. Believing it was too large, an advisory
group decided to investigate this claim. A sample of 200 traffic accidents
this year showed that 56 persons were also involved in another accident
in the last three years. Determine the value of the population proportion.
A. 0.28 B. 0.30 C. 3.57 D. 60.0

9. In problem no. 8, what is the value of 𝑝̂ ?
A. 0.28 B. 0.35 C. 3.57 D. 60.0
10. In a certain senior high school, it is estimated that approximately 15% of
the students ride bicycles in going to school. In a random sample of 90
senior high students, 19 are found to ride bicycles in coming to class.
What is the value of the population proportion?
A. 0.15 B. 0.21 C. 0.29 D. 4.74

11. Which of the following test statistic is appropriate to use in testing


hypothesis involving population proportion?
A. t-test B. z-test C. p-test D. c-test
12. Before a nationwide election, a polling place was trying to see who would
win. Which choice best represents the sample?
A. a selection of male voters C. a selection of voters over age 50
B. a selection of female voters D. a selection of voters of different ages
13. A computer store surveys its clients who purchased laptop computers to
ask what software the store should include in its computers. Identify the
population.
A. clients on the store
B. computer manufacturers
C. clients who are interested in software
D. clients who purchased laptop computers

14. In Karangalan State University, 78% of all students are receiving


financial assistance or recipients of scholarship programs. The school
paper selects a random sample of 100 students and 72% of the
respondents say they are receiving some sort of financial support. Which
of the following is true?
A. 100 represents the 72% of the students.
B. 78% is the population proportion and 100 is the sample proportion.
C. 78% is the sample proportion and 78% is the population proportion.
D. 78% is the population proportion and 72% is the sample proportion.
15. In a learning study, 1,200 respondents were asked if they can assimilate
concepts while watching television wherein 586 said YES. What is the
population proportion of those who said NO?
A. 0.48 B. 0.49 C. 0.51 D. 0.58

If you got a perfect score, you deserve 5 bonus points.
Congratulations!

Additional Activities

Activity 5: Think and Express


Directions: Carefully analyze and answer the following questions.
A. 1. Give 3 examples showing proportions.
2. Why is proportion considered a binomial variable?
B. Think of an opportunity that once knocked on your door but you did
not value. How did you feel about it?

“The opportunity to live a better life is in direct proportion to your


willingness to change.”
~Raphael ‘Doctah’ Love~

Statistics and
Probability
Quarter 2 – Module 11:

Identifying Appropriate
Rejection Region Involving
Population Proportion

What I Need to Know

Recall that the normal curve arises from a probability distribution.
Because the total area under the curve is 1, it serves as a mathematical model
in hypothesis testing. The areas under the curve are the probability values
that we will need in deciding whether or not to reject the null hypothesis.

This module will help you identify the appropriate rejection region for a
given level of significance when the Central Limit Theorem is to be used.

After going through this module, you are expected to:


1. determine the critical value using the given level of significance;
2. transform the alternative hypothesis from statement into symbols; and
3. illustrate and identify the rejection region under the normal curve.

What I Know

Directions: Choose the best answer to the given questions or statements.


Write the letter of your choice on a separate sheet of paper.
1. Which of the following is a mathematical model used in decision-making?
A. z-statistic C. normal curve
B. proportion D. graphical representation
2. What is the alternative hypothesis in the statement below?
“Less than 35% of the students are fluent Filipino language speakers.”
A. p > 35 B. p < 35 C. p > 0.35 D. p < 0.35

3. Which of the following situations is non-directional?


A. More than 45% of the barangay population is male residents.
B. The proportion of ADHD students in the school decreased by 10%.
C. The principal claims that 30% of Grade 4 students are in favor of
staying in the playground after classes.
D. There is enough evidence to conclude that the percentage of students
who are in favor of the new uniform is different from 85%.
4. Which of the following is the critical value if the level of significance is 5%
and tailed to the right?
A. 0.125 B. 0.165 C. 1.645 D. 1.960

5. It is the range of the values of the test value indicating that there is
significant difference and that the null hypothesis (Ho) should be rejected.
A. critical value C. level of significance
B. rejection region D. non-rejection region

6. It is the basis for the critical or rejection region dictated by the alternative
hypothesis.
A. critical value C. acceptance region
B. rejection region D. level of significance

7. Which of the following terms does NOT describe a right-tailed test?


A. more B. improve C. changed D. increased

8. The z-score value in the critical region means that you should _______.
A. reject the null hypothesis
B. not reject the null hypothesis
C. reject the alternative hypothesis
D. not reject the alternative hypothesis

9. Which is the alternative hypothesis (in notation) of the following statement?


“Less than 4% of senior high school students take Statistics.”
A. Ha: p > 0.4 C. Ha: p > 0.04
B. Ha: p < 0.4 D. Ha: p < 0.04

10. Which of the following is not an option for an alternative hypothesis?


A. Ha: p > k B. Ha: p = k C. Ha: p < k D. Ha: p ≠ k

11. Which of the following shows a non-directional test?


A. p = k B. p > k C. p ≠ 0.4 D. p < 0.4

12. What is the critical value(s) for a two-tailed test with α = 0.05?
A. z = -1.64 B. z = ±0.06 C. z =1.64 D. z = ± 1.96
For nos. 13-15, refer to the situation below.
During the previous year, 55% of the people believed that there was
an improvement in the country’s economy. This year, only 280 out of 500
randomly selected people believe that there is an improvement in the
country’s economy. Using 0.05 level of significance, answer the following
questions to determine if this indicates a decrease in the number of people
who believe that there is an improvement in the country’s economy.

13. Which of the following represents the alternative hypothesis?


A. Ha: p > 0.55 B. Ha: p ≠ 0.55 C. Ha: p < 0.55 D. Ha: p = 0.55

14. Which of the following is the critical value?


A. –2.325 B. –1.960 C. –1.645 D. –1.285

15. Which of the following shows the appropriate rejection region?

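A. [Figure: normal curve with the critical value marked at -2.325]
B. [Figure: normal curve with the critical value marked at -1.645]
C. [Figure: normal curve with the critical value marked at -1.285]
D. [Figure: normal curve with the critical value marked at -1.960]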

How did you find this pre-test? Did you encounter both familiar and
unfamiliar terms? Kindly compare your answers in the Answer Key on the last
part of this module.
If you got a perfect score or 100%, skip this module and proceed to the
next one. But if you missed even a single point, please continue with this
module as it will enrich your knowledge in hypothesis testing involving
population proportion.

Lesson 11: Identifying Appropriate
Rejection Region Involving
Population Proportion
One part of hypothesis testing is determining whether the result of an
experiment is statistically significant. To do this, the rejection region, or
critical region, is employed. Every rejection region can be drawn on a
probability distribution; here, it is drawn on the normal curve. A rejection
region can be either one-tailed or two-tailed. More specifically, a one-tailed
rejection region can be left-tailed or right-tailed. Now, how can we determine
the rejection regions? Let us find out!

What’s In

Activity 1: Where Do I Belong?

Directions: The following are the common terms or phrases used to describe
whether the alternative hypothesis (Ha) is directional or non-directional such
as right-tailed, left-tailed, or two-tailed. Copy the table and write each under
the group where it should belong.

higher lower changed different


better worsened more increased
varied less affects improve
effective greater than less than influences
favored not equal to smaller decreased
not the same as

Left –Tailed Two – Tailed Right – Tailed

1
2
3
4
5
6
7
8
9
10

Notes to the Teacher


This lesson is simply about finding the critical value
and illustrating the rejection region. Making conclusions
whether to accept or reject is included in the scope of
Module 13. If the learner did not pass the assessment,
please give them more practice activities on how to
determine the critical value and illustrate the rejection
region. Otherwise, the learner may not be able to make a
conclusion.

What’s New

Activity 2: Tail Me Now

Directions: In each of the following statements, formulate the alternative
hypothesis, Ha. Then, determine if it describes a two-tailed, right-tailed, or
left-tailed test. The first one is done for you as an example.

1. The hypothesis that less than 20% of the population is right-handed


Ha: p < 0.20 ; left – tailed

2. The hypothesis that the proportion of ADHD students in the school is


not 0.40
_________ ; __________

3. The hypothesis that more than 45% of the barangay population is male
residents
_________ ; __________

4. The claim that less than 35% of the students are fluent Filipino language
speakers
_________ ; __________

5. The principal’s claim that 30% of Grade 4 students stay in the


playground after classes
_________ ; __________

6. The hypothesis that there is enough evidence to conclude that the


percentage of students who are in favor of the new uniform is different from 85%
_________ ; __________

What Is It

There are two ways to test a hypothesis: the p-value approach and the
critical value approach. Here, we will consider the rejection region under
the critical value approach. The critical value is the boundary used to decide
whether or not to reject the null hypothesis. It is determined by the alpha
( α ) level and symbolized by z or ztab.

This is the first statement in Activity 2: “The hypothesis that less than
20% of the population are right-handed” wherein Ha: p < 0.20 and it indicates a
left-tailed rejection region. Illustrating it in the normal curve, we will come up
with the picture below:

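[Figure: normal curve with the left tail shaded as the rejection region (α)
and the unshaded remainder as the non-rejection region. The boundary between
them, ztab, is the critical value.]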
The illustration above helps you visualize how the statement would
look when drawn on the normal curve. Notice that the line at
ztab separates the curve into two regions. The shaded part is the rejection
region, while the unshaded part is the non-rejection region or the acceptance
region/area. Therefore, it is important that we determine the value of ztab, or
the critical value. Now, let us proceed!
Let us now describe the following important terms that we will be
needing in our discussion.

Critical Value, ztab


- separates the rejection region from the acceptance region
- derived from the level of significance and expressed as the standard z-
values
- symbolized as ztab
We can use the table of critical values for the commonly used levels of
significance presented in the previous modules.

                        Level of Significance
Test Type            𝛼 = 0.01   𝛼 = 0.025   𝛼 = 0.05   𝛼 = 0.10
left-tailed test     −2.33      −1.96       −1.645     −1.28
right-tailed test    2.33       1.96        1.645      1.28
two-tailed test      ±2.575     ±2.33       ±1.96      ±1.645
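If a table is not at hand, critical values can also be computed from the standard normal distribution. The sketch below uses `NormalDist` from Python's standard `statistics` module; the function name `critical_value` and the tail labels `"left"`, `"right"`, and `"two"` are our own assumptions for illustration.

```python
from statistics import NormalDist


def critical_value(alpha: float, tail: str) -> float:
    """Return the z critical value for a given level of significance.

    For a two-tailed test, the positive value is returned; the rejection
    region lies beyond plus or minus that value, with alpha / 2 in each tail.
    """
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return standard_normal.inv_cdf(alpha)
    if tail == "right":
        return standard_normal.inv_cdf(1 - alpha)
    if tail == "two":
        return standard_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("tail must be 'left', 'right', or 'two'")


for alpha in (0.01, 0.025, 0.05, 0.10):
    print(
        alpha,
        round(critical_value(alpha, "left"), 3),
        round(critical_value(alpha, "right"), 3),
        round(critical_value(alpha, "two"), 3),
    )
```

The computed values agree with the table to rounding, with one cell worth double-checking: for a two-tailed test at α = 0.025, each tail holds 0.0125, and the computed value is about ±2.24 rather than ±2.33.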

Level of Significance, 𝜶 (Greek letter, alpha)


- refers to the degree of significance in which we reject or do not reject the
null hypothesis
- the basis for the critical or the rejection region dictated by the
alternative hypothesis
The following are the common values of statistical significance:
 0.01 highly significant
 0.05 statistically significant
 0.10 significant

For instance, if we use 0.05


level of significance, then the size of
the rejection region is 0.05 or 5%.
For α = .01, then the size of the
rejection region is 1%, and 10% for
0.10.

Rejection Region
- the range of the values of the test value which indicates that there is a
significant difference and that the null hypothesis (Ho) should be
rejected
Non-Rejection Region
- the range of the values of the test value which indicates that the
difference was statistically insignificant and that we failed to reject the
null hypothesis (Ho)
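The two regions can be expressed as a simple comparison between a computed z-value and the critical value. The sketch below is only a preview, since drawing conclusions belongs to a later module. The function name `in_rejection_region`, the tail labels, the choice to treat a value exactly on the boundary as not rejected, and the sample z-values are all assumptions for illustration.

```python
from statistics import NormalDist


def in_rejection_region(z_com: float, alpha: float, tail: str) -> bool:
    """Tell whether a computed z-value falls in the rejection region."""
    standard_normal = NormalDist()
    if tail == "left":
        # Rejection region: values below the negative critical value.
        return z_com < standard_normal.inv_cdf(alpha)
    if tail == "right":
        # Rejection region: values above the positive critical value.
        return z_com > standard_normal.inv_cdf(1 - alpha)
    if tail == "two":
        # Rejection region: both tails, each with area alpha / 2.
        return abs(z_com) > standard_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("tail must be 'left', 'right', or 'two'")


# Hypothetical computed z-values, chosen only to show both outcomes.
print(in_rejection_region(-2.0, 0.05, "left"))  # True: -2.0 is below -1.645
print(in_rejection_region(-1.0, 0.05, "left"))  # False: -1.0 is above -1.645
print(in_rejection_region(2.0, 0.01, "two"))    # False: 2.0 is inside ±2.575
```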

Illustrative Example 1:
A sample of 100 students is randomly selected from Pinagpala High
School and 18 of them said they are left-handed. Test the hypothesis that less
than 20% of the students are left-handed by using 𝛼 = 0.05 as the level of
significance.
What to do:
a. Identify the level of significance.
b. Formulate the alternative hypothesis, Ha.
c. Determine the critical value, ztab.
d. Illustrate the rejection region in the normal curve.

Solution:
a. The level of significance is 𝛼 = 0.05.
b. The alternative hypothesis is Ha: p < 0.20.
It is one-directional, or left-tailed, as determined by the term “less than”.
c. To determine the critical value using the table, we consider the
intersection of the row for the left-tailed test and the column for 𝛼 = 0.05.
Hence, the table tells us that the critical value is –1.645.
d. Illustrating it under the normal curve gives:

[Figure: normal curve with the horizontal axis marked from -3 to 3. The left
tail below the critical value -1.645 is shaded as the rejection region
(𝛼 = 0.05); the rest of the curve is the non-rejection region.]

From here, you will decide whether the null hypothesis will be
rejected or not, although that part will be discussed in the next module.

Illustrative Example 2:
The claim is made that 40% of tax filers use computer software to file
their taxes. In a sample of 50 tax filers, 14 used computer software to file their
taxes. Suppose Ha: p < 0.40 at α = 0.025, where p is the population proportion who
use computer software to file their taxes. Determine the critical value, ztab,
and illustrate the rejection region in the normal curve.

Solution:

At the α = 0.025 level of significance, with Ha: p < 0.40, the table of
critical values shows that the critical value is ztab = –1.96.

Illustrating the rejection region, we have:

[Figure: normal curve with the left tail below ztab = –1.96 shaded as the
rejection region (α = 0.025); the rest of the curve is the non-rejection region.]

Illustrative Example 3:
In Kalinga Special Education School, a sample of 144 students was
chosen and among them, 48 are diagnosed with Attention Deficit Hyperactivity
Disorder (ADHD). At 𝛼 = 0.01, test the hypothesis that the proportion of ADHD
students in the school is not 0.40.
What to do:
a. Identify the level of significance.
b. Formulate the alternative hypothesis, Ha: p ≠ po.
c. Determine the critical value.
d. Illustrate the rejection region in the normal curve.

When a statement does not specify any cue word that describes direction, then
it is non-directional or two-tailed.

Solution:
a. The level of significance is 𝛼 = 0.01.
b. The alternative hypothesis is p ≠ 0.40 due to the expression “is not 0.40”.
This explains why it is non-directional or two-tailed.
c. To determine the critical value using the table, we consider the intersection
of the row for the two-tailed test and the column for 𝛼 = 0.01. Hence, the
table tells us that the critical value is ±2.575.
d. Illustrating the rejection region in the normal curve gives:

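[Figure: normal curve with both tails shaded as rejection regions, each with
area 𝛼/2 = 0.01/2 = 0.005, beyond the critical values ztab = -2.575 and
ztab = 2.575. The unshaded middle is the acceptance region.]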

What’s More

Activity 3.1: Be Critical!

A. Directions: Determine the critical value and illustrate the rejection region
under the normal curve by using the given information.

1. Ha: p ≠ 0.52
𝛼 = 0.05

Critical Value: _________

2. Ha: p > 0.35


α = 0.01

Critical Value: _________

3. Ha: p < 0.70


α = 0.10

Critical Value: _________

4. Ha: p > 0.65


α = 0.10

Critical Value: _________

5. Ha: p ≠ 0.46
α = 0.05

Critical Value: _______

Activity 3.2. Be Quick!

Direction. The following are the different critical values under the
various level of significance and tails. By using their respective codes,
tell the direction of their tail and the corresponding level of significance.
Your answer will be a combination of codes, tail and α. Set time limit.

L: Left-tailed          1: α = 0.01
R: Right-tailed         25: α = 0.025
T: Two-tailed           5: α = 0.05
                        10: α = 0.10

No.    Ztab      Answer
Ex.    ±2.33     T25
1      1.96
2      1.28
3      −1.645
4      2.33
5      ±1.645
6      1.645
7      −2.33
8      ±1.96
9      ±2.575
10     −1.96

What I Have Learned

Direction:

A. Fill in each blank with the correct word or phrase to complete the
statement.
1. The range of the values of the test value which indicates that there is a
significant difference and that the null hypothesis (Ho) should be
rejected is called ____________.
2. The basis for the critical or the rejection region dictated by the
alternative hypothesis is called ________________.
3. The _______________ separates the rejection region from the non-rejection
region.
4. The _______________ is the range of the values of the test value which
indicates that the difference was statistically insignificant and that we
failed to reject the null hypothesis (Ho).
5. The __________ is the symbol used to represent the critical value.
B. Carefully read and answer the following questions
1. Is it true that if the rejection region is two-tailed, α needs to be divided
by 2 to be able to identify the rejection region?

2. The computed value should be negative if the rejection region is right-
tailed. Is it true? Explain.
3. A 0.01 level of significance means that the size of the rejection region is
10%. Is this correct? Why?
4. If a problem does not indicate any term of direction, it is non-directional
or two-tailed. Is it true or false?
5. In a right-tailed test, what is the critical value at α = 0.10?

What I Can Do

Activity 4.1: The Mystery Word

Directions: The following are the steps in creating the rejection region in
testing hypothesis for population proportion. Arrange them in their best order
by writing the codes indicated in each. What is the mystery word?

How to Create the Rejection Region

Steps 1 2 3 4 5 6 7 8
Answer

Activity 4.2: Borderline

Directions: Carefully read and analyze the following situations. Identify the
information being asked. Then, determine the critical value and shade the
area of the rejection region under the normal curve.

1. Suppose that in the past, 40% of all adults favored capital punishment. Do
we have reason to believe that this proportion has increased if in a random
sample of 150 adults, 80 favored capital punishment? Use a 0.05 level of
significance.
Ha: _______
α = _______

Critical Value: _________

2. Professors from a professional organization for private colleges and


universities reported that more than 16% of professors attended a national
convention in the past year. To test this claim, a researcher surveyed 200
professors and found that 50 attended a national convention in the past
year. At 𝛼 = 0.10, test if the figure in the claim is correct.
Ha: _______

α = _______

Critical Value: _________

3. Malakas claimed that at least 5% of male college students in their
school join triathlons. His friend, Mayumi, finds this hard to believe and
decided to check the validity of the claim, so she took a random sample.
At α = 0.01, does Mayumi have enough evidence to reject the claim of
Malakas if there were 60 racers in her sample of 300 students?
Ha: _______
α = _______

Critical Value: _________

4. A pharmaceutical company is in the first phase of testing a vaccine for a
new virus. They formed a group of 100 people who have the disease and
were given the vaccine. It was found that 65 recovered from the
disease. At a significance level of 0.01, determine the critical value and
illustrate the rejection region.

Ha: _______
α = _______
Critical Value: _________

5. A sample poll of 300 voters from Town A showed that 56% were in favor of a
given candidate. At a significance level of 0.10, determine the critical value
and illustrate the rejection region.

Ha: _______
α = _______
Critical Value: _________

Assessment

Time to sum up what you’ve learned today. Good luck!

Directions: Choose the best answer to the given questions or statements.


Write the letter of your choice on a separate sheet of paper.

1. When the confidence level is 99%, 𝛼 is ___.


A. 0.01 B. 0.05 C. 0.10 D. 0.50
2. Which is the alternative hypothesis for the following statement?
“More than 65% of the students are fluent Filipino language speakers.”
A. p > 65 B. p < 65 C. p > 0.65 D. p < 0.65

3. Which of the following situations is directional?


A. A teacher wants to know if listening to popular music affects the
performance of the pupils.
B. The principal claims that more than 30% of Grade 4 students are in
favor of staying in the playground after classes.
C. The owner of a factory that sells a particular bottled juice drink claims
that the content of his product is 250ml.
D. There is enough evidence to conclude that the percentage of students
who are in favor of the new uniform is different from 85%.

4. Which of the following is the critical value if the level of significance is 0.01
tailed to the right?
A. 2.330 B. 2.325 C. 2.320 D. 2.315
5. It is the range of the values of the test value which indicates that there is
significant difference and that the null hypothesis (Ho) should be rejected.
A. critical value C. level of significance
B. rejection region D. non-rejection or acceptance region
6. What graphical model is appropriate for decision-making?
A. bell shape C. normal curve
B. test statistic D. graphical representation
7. It separates the rejection region from the acceptance region.
A. critical value C. acceptance region
B. rejection region D. level of significance

8. Which of the following terms does NOT describe a left-tailed rejection


region?
A. lower B. worsened C. decreased D. influences

9. Which of the following terms does NOT describe a non-directional rejection


region?

A. at most B. effective C. different D. not the same as

10. A farmer believes that using organic fertilizer on his plants will yield a
greater income. His income increased by 20% from last year. State the
alternative hypothesis in symbols.
A. Ha: p < 0.02 C. Ha: p > 0.20
B. Ha: p > 0.02 D. Ha: p < 0.20

11. Which of the following represents the critical value?


A. CV B. Ccom C. Ztab D. Zcom

12. What is the critical value(s) for a left-tailed test with α = 0.01 level of
significance?
A. –2.325 B. –1.960 C. –1.645 D. –1.285

For nos. 13-15, refer to the situation below.


A researcher claims that 75% of college students would rather spend
their extra money for internet access loads than food. Another researcher
would like to verify this claim. She randomly selected 400 students.
Among them, 296 said that they would rather use their extra money for
internet access loads than food. Answer the following questions to
determine if there is enough evidence to conclude that the percentage of
students who would want to spend their extra money in internet access
loads than food is different from 75% at 0.05 level of significance.

13. Which of the following represents the alternative hypothesis?


A. Ha: p > 0.75 C. Ha: p ≠ 0.75
B. Ha: p < 0.75 D. Ha: p = 0.75
14. Which of the following is the critical value?
A. ± 2.325 B. ± 1.960 C. ± 1.645 D. ± 1.245
15. Which of the following shows the appropriate rejection region?

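A. [Normal curve marked at −2.325 and 2.325]
B. [Normal curve marked at −1.960 and 1.960]
C. [Normal curve marked at −1.285 and 1.285]
D. [Normal curve marked at −1.645 and 1.645]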

Additional Activities

Activity 4. Think and Reflect


1. In your own words, describe the following:
a) critical value
b) rejection region

2. What do you think will your conclusion be if the computed test statistic
(Zcom) is found outside the rejection region?

3. Explain: A rejection is a chance to consider if there are things we can


possibly work on.

"Rejection is merely a redirection; a course correction to your destiny."


~ Bryant H. McGill

Statistics and
Probability
Quarter 2 – Module 12:
Computing Test Statistic Value
Involving Population Proportion

What I Need to Know

One of the processes in hypothesis testing is the calculation of the test
statistic. It is the value used in determining the probability needed in
decision-making. The conclusion we make depends on the computed test
statistic.

Many hypothesis testing situations involve proportions. In fact, a
hypothesis test involving a population proportion can be considered a
binomial experiment. This means that each trial has only two possible
outcomes and the probability of success or failure does not change from
trial to trial, since the outcome of each trial is independent. As you may
recall, the Central Limit Theorem is not limited to sample means. It can
also be applied to sample proportions. When it applies, the z-test statistic
for population proportion is used.

This module deals with the computation of the test statistic
value for population proportion.

After going through this module, you are expected to:


1. describe the z-test statistic of proportion;
2. compute the z-value for population proportion; and
3. solve problems involving the z-value for population proportion.

What I Know

Directions: Choose the best answer to the given questions or statements.


Write the letter of your choice on a separate sheet of paper.
1. A randomly selected sample of 500 senior high students was surveyed if
they spend more than 3 hours playing Mobile Legends. Thirty percent
(30% or 0.30) of the 500 students surveyed said they do. Which one of
the following statements about the number 0.30 is correct?
A. It is a margin of error. C. It is a population proportion.
B. It is a sample proportion. D. It is a randomly chosen number.

2. Which of the following is NOT included in using the statistic z-test?


A. The situation contains count data.
B. The situation contains the mean or average.
C. The situation contains the population proportion.
D. The situation has only two possible outcomes: success or failure.
3. In testing the proportion, which of the following assumptions should be
proven true?
A. np > 5 and nq > 5 C. np < 5 and nq < 5
B. np ≥ 5 and nq ≥ 5 D. np ≤ 5 and nq ≤ 5

4. If p = 0.37, what is the value of q?


A. 0.37 B. 0.53 C. 0.63 D. 0.73
For nos. 5-7, refer to the problem below.
A school administrator claims that less than 50% of the students are
dissatisfied with the food served in the school canteen. The claim used a
sample data obtained from a survey of 500 students of the school wherein
54% indicated their dissatisfaction with the food served in the school
canteen.

5. What is the value of the hypothesized population proportion?


A. 0.46 B. 0.50 C. 0.54 D. 500
6. What is the value of the sample proportion?
A. 0.46 B. 0.50 C. 0.54 D. 500

7. What is the z-value?


A. 0.0005 B. 0.0224 C. 1.7857 D. 1.8021

8. When should you NOT use z-test?
A. when you are testing for a mean
B. when you are given the population standard deviation
C. when you are ONLY given the sample standard deviation
D. when you are testing a proportion/percentage of a population
9. When performing a test about population proportion, what test statistic
would you need to use?
A. t-test B. z-test C. chi-square D. standard deviation
10. Considering the pandemic, a survey is held to 1,000 randomly chosen
students in which more than 80% are in favor of holding online classes.
What is the value of the sample proportion 𝑝̂ ?
A. 0.013 B. 0.160 C. 0.640 D. 0.800
11. Which of the following is NOT included in the computation of the z-test for
population proportion?
A. n B. p C. 𝑝̂ D. 𝜇

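12. What is the formula to find the z-test for population proportion?
A. $z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{p(1-p)}{n}}}$     C. $z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{p(1-p)}{\mu}}}$
B. $z = \dfrac{\hat{p} - \mu}{\sqrt{\dfrac{p(1-p)}{n}}}$     D. $z = \dfrac{p - \hat{p}}{\sqrt{\dfrac{p(1-p)}{n}}}$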

13. The record of patients in Kalinga Community Hospital shows that 45 of


100 patients have high cholesterol level of 240mg/dl and above. Using a
one-tailed test with α = 5%, can we conclude that 30% of the patients
have high cholesterol level? In this problem, which test should be used?
A. t-test C. chi-square test
B. p-test D. z-test for proportion
For nos. 14-15, refer to the problem below.
Supporters of a new ballot proposition want to know whether it is likely to pass by
obtaining more than 50% of the vote. A poll is taken and 571 out of 1,000
people support the proposition.

14. What test statistic should be applied in the problem?


A. mean B. t-test C. z-test D. z-test for proportion

15. What is the value of the test statistic?


A. 2.37 B. 3.46 C. 4.10 D. 4.49

Lesson 12
Computing Test Statistic Value Involving Population Proportion

In the previous modules, you have learned how to test hypotheses
involving means or averages. In this one, you will learn how to conduct tests
involving count data, percentages, or population proportion. Inferences
involving proportions are made in the context of probability of “success” (p)
or “failure” (q) for a binomial distribution. When testing about a proportion,
a percentage, or a probability, there are some assumptions to be considered.
Once these assumptions are met, then the z-test statistics for proportions
can be applied.

What’s In

Activity 1. Say Something!


Directions: Copy the table below. In your own words, briefly describe each
of the following terms.

1. α Level of Significance

2. Ztab Critical Value

3. 𝑝̂ Sample Proportion

4. p Population Proportion

5. n Size of Samples

6. Rejection Region

7. Z test for Population Proportion

Notes to the Teacher
In this module, all situations presented involve sample
sizes which are large enough so that CLT or Central Limit
Theorem can be applied. Therefore, there is no need for the
learners to present or prove the two assumptions presented and
discussed in the lesson. Instead, the lesson will mainly focus on
calculating for the value of z-test statistics involving population
proportion.

What’s New

Here, you will use the concepts that you have learned in module 10.

Activity 2. Home Sweet Home!

Directions: Look for the word by carefully reading and answering the guide
questions that follow. Choose your answer from the list. Copy the answer
box and write the letter that corresponds in your answer.

A recent survey done by the Philippine Housing Authority found
that 35% of the population owns their homes. In a random sample of
240 heads of households, 78 responded that they own their homes.

A = 0.35 U = 78 L = 240 E = 0.325 V = 35%

Guide Questions.
1. What part of the whole population own their homes?
2. What is the value of p?
3. What is the size of the sample, n?
4. How many owned their homes, x?
5. Compute the value of 𝑝̂.
1 2 3 4 5

What Is It

Notice that the situation cited above did not use or mention
words like “mean” or “average” but “percentage” instead. It also
used count data. Problems such as this involve population proportion.
Inferences involving proportions are made in the context of probability of
“success”, p, in a binomial distribution.

From the situation that we presented in the above activity, the
respondents have only two possible options for their responses and those
are the following:

Option 1 They own their house. “success” or p


Option 2 They do not own their house. “failure” or q

To show that the sample size is large enough for the Central Limit
Theorem to apply, we need to satisfy two assumptions. First, the responses
have only two possible outcomes: “owned” (success) or “not owned”
(failure). Therefore, the condition for a binomial experiment is met.
Second, we need np ≥ 5 and nq ≥ 5. Here, the hypothesized value of the
population proportion is p = 0.35 and n = 240. Since q = 1 – p, we get
q = 1 – 0.35 = 0.65.

Through substitution, we can show that the second condition is also
met, since:
np ≥ 5 and nq ≥ 5
240 (0.35) ≥ 5 and 240 (0.65) ≥ 5
84 ≥ 5 and 156 ≥ 5

Since we have shown that np ≥ 5 and nq ≥ 5, all conditions are met
and the sample size is large enough to use the Central Limit Theorem. Under
these conditions, the test statistic to be used is the z-test statistic for
proportions, denoted by Zcom or the computed z-value.
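The substitution above can also be checked with a few lines of code. This is a minimal sketch in Python; the function name `large_enough` and the `threshold` parameter are illustrative names chosen for this example, not terms from the module.

```python
def large_enough(n, p, threshold=5):
    """Check the condition np >= 5 and nq >= 5, where q = 1 - p."""
    q = 1 - p
    return n * p >= threshold and n * q >= threshold


# Data from the housing survey: n = 240 and hypothesized p = 0.35.
n, p = 240, 0.35
print(large_enough(n, p))   # both np (about 84) and nq (about 156) exceed 5
print(large_enough(10, p))  # np is only about 3.5 here, so the check fails
```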

Again, the problems presented here contain
sample sizes that are large enough to consider the
Central Limit Theorem or CLT. Thus, in solving
these problems, there is no need to show these
assumptions.

Z – Test Statistic for Population Proportion
Remember that the formula for the value of z-test statistic for
population proportion would be:

$$Z_{com} = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n}}} \quad \text{or} \quad Z_{com} = \frac{\hat{p} - p}{\sqrt{\frac{p(1-p)}{n}}}$$

where:
Zcom is the z-test statistic for proportion.
𝑝̂ is the sample proportion (x/n).
p is the hypothesized value of the population proportion.
n is the sample size or the number of observations in the
sample.
q is equal to 1 – p.

We will use this formula in the examples that follow.
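For readers who want to check their hand computations, the formula can be written as a short function. This is a minimal sketch in Python; the names `z_com` and `standard_error` are chosen for illustration only.

```python
from math import sqrt


def z_com(p_hat, p, n):
    """Compute the z-test statistic for a population proportion.

    p_hat: sample proportion (x / n)
    p:     hypothesized population proportion
    n:     sample size
    """
    q = 1 - p
    standard_error = sqrt(p * q / n)  # the denominator of the formula
    return (p_hat - p) / standard_error


# Housing survey data: x = 78 out of n = 240, with hypothesized p = 0.35.
print(round(z_com(78 / 240, 0.35, 240), 3))
```

A negative result means the sample proportion is below the hypothesized proportion, and a positive result means it is above.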


Illustrative Example 1:
Let us now determine the z-value in the situation presented
previously. To be able to solve it, we need to identify first the values of the
following:
Zcom = ?
𝑝̂ = x/n = 78/240 = 0.325
p = 35% = 0.35
n = 240
q = 1 – p = 1 – 0.35 = 0.65

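Then, substitute these values in the formula:

$$
\begin{aligned}
Z_{com} &= \frac{\hat{p} - p}{\sqrt{\frac{p(1-p)}{n}}} \\
&= \frac{0.325 - 0.35}{\sqrt{\frac{0.35(0.65)}{240}}} \\
&= \frac{-0.025}{\sqrt{\frac{0.2275}{240}}} \\
&= \frac{-0.025}{\sqrt{0.0009479}} \\
&= \frac{-0.025}{0.03079}
\end{aligned}
$$

Therefore, the computed z-value is Zcom = −0.812.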


If you are still a bit confused, here is another example.

Illustrative Example 2:
Determine the value of Zcom given the following information:
p = 0.42
Sample Size: n = 150
Sample Proportion: 𝑝̂ = 0.45

Solution:

To start your solution, identify first the values of the following:

Zcom = ?
𝑝̂ = 0.45
p = 0.42
n = 150
q = 1 – p = 1 – 0.42 = 0.58

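Then, substitute these values in the formula:

$$
\begin{aligned}
Z_{com} &= \frac{\hat{p} - p}{\sqrt{\frac{p(1-p)}{n}}} \\
&= \frac{0.45 - 0.42}{\sqrt{\frac{0.42(0.58)}{150}}} \\
&= \frac{0.03}{\sqrt{\frac{0.2436}{150}}} \\
&= \frac{0.03}{\sqrt{0.001624}} \\
&= \frac{0.03}{0.0403}
\end{aligned}
$$

Zcom = 0.7444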
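The intermediate steps of this example can be reproduced one line at a time. This is a minimal sketch in Python; the variable names (`numerator`, `variance_term`, `standard_error`) are illustrative and simply mirror the steps of the hand solution.

```python
from math import sqrt

# Given data from Illustrative Example 2.
p_hat, p, n = 0.45, 0.42, 150

q = 1 - p                             # q = 1 - p
numerator = p_hat - p                 # difference of the two proportions
variance_term = p * q / n             # quantity under the square root
standard_error = sqrt(variance_term)  # denominator of the formula
z = numerator / standard_error

print(round(numerator, 2), round(variance_term, 6),
      round(standard_error, 4), round(z, 4))
```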

Illustrative Example 3:
The claim is made that 40% of tax filers use computer software to file
their taxes. In a sample of 50, 14 used computer software to file their taxes.
We want to test Ho: p = 0.4 versus Ha: p > 0.4 at α = 0.05, where p is the
population proportion who use computer software to file their taxes. The
test can be done using the binomial distribution or using the normal
approximation to the binomial distribution. Determine first the value of Zcom.

Solution:
First, determine the value of the following:
Zcom = ?
𝑝̂ = x/n = 14/50 = 0.28
p = 40% = 0.40
n = 50
q = 1 – p = 1 – 0.40 = 0.60

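Then, substitute these values in the formula:

$$
\begin{aligned}
Z_{com} &= \frac{\hat{p} - p}{\sqrt{\frac{p(1-p)}{n}}} \\
&= \frac{0.28 - 0.40}{\sqrt{\frac{0.40(0.60)}{50}}} \\
&= \frac{-0.12}{\sqrt{\frac{0.24}{50}}} \\
&= \frac{-0.12}{\sqrt{0.0048}} \\
&= \frac{-0.12}{0.069}
\end{aligned}
$$

Therefore, the computed z-value is Zcom = –1.739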
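The final digits of a computed z-value depend on how early the denominator is rounded. The sketch below, written in Python with illustrative variable names, compares dividing by the denominator rounded to three decimal places (0.069) with dividing by the unrounded square root. It is offered only as a check on rounding, not as a change to the method.

```python
from math import sqrt

# Given data from Illustrative Example 3.
x, n, p = 14, 50, 0.40
p_hat = x / n
standard_error = sqrt(p * (1 - p) / n)

# Dividing by the denominator rounded to three decimal places.
z_rounded = (p_hat - p) / round(standard_error, 3)

# Dividing by the unrounded denominator.
z_full = (p_hat - p) / standard_error

print(round(z_rounded, 3), round(z_full, 3))
```

The two results differ in the third decimal place (about −1.739 versus about −1.732), so it helps to keep at least four decimal places in intermediate steps.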

What’s More

Activity 3. Fancy Meeting You


Directions: Tell if each part of the numbered solution is right or wrong. If
wrong, encircle the part of the solution that is incorrect. Then, replace it
with the correct answer. Total Points: 18

Problem A:
An insurance industry report indicated that 30% of those persons
involved in minor traffic accidents this year have been involved in at least
one traffic accident in the last five years. Believing it was too large, an
advisory group decided to investigate this claim. A sample of 200 traffic
accidents this year showed that 56 persons were also involved in another
accident in the last five years. Determine the value of zcom.
Solution:
First, prepare the data that will be used in the formula.

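Zcom = ?
p = percentage of those who were involved in an accident
during the last five years
= 30%
p = 0.03

$$\hat{p} = \frac{\text{number of persons involved in another accident in the last five years}}{\text{number of persons involved in accidents in the last five years}} = \frac{56}{200} = 0.28 \qquad (1)$$

n = 200
q = 1 – p = 1 – 0.30 = 0.70
Substitute the values in the formula.

$$
\begin{aligned}
Z_{com} &= \frac{\hat{p} - p}{\sqrt{\frac{p(1-p)}{n}}} \\
&= \frac{0.28 - 0.30}{\sqrt{\frac{0.30(0.70)}{200}}} \qquad (2) \\
&= \frac{-0.20}{\sqrt{\frac{0.2010}{200}}} \qquad (3) \\
&= \frac{-0.20}{\sqrt{0.00105}} \qquad (4) \\
&= \frac{-0.02}{0.0324} \qquad (5) \\
Z_{com} &= 0.6173 \qquad (6)
\end{aligned}
$$
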
Problem B:
In the website of Sweet Choco, it was stated that an ideal bag of
chocolates contains 24% white chocolates. Suppose we counted the number
of white chocolates in 40 chocolate sachet packs and the proportion from
the sample is found to be 23.04%, what is the value of z?
Solution:
As preparation for our solution, let us again identify first the needed
data for the formula.
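Zcom = ?
p = 24% = 0.24
𝑝̂ = 23.04% = .02304     (1)
n = 40
q = 1 – p = 1 – 0.24 = 78
Then, substitute these values in the formula.

$$
\begin{aligned}
Z_{com} &= \frac{\hat{p} - p}{\sqrt{\frac{p(1-p)}{n}}} \\
&= \frac{0.2304 - 0.24}{\sqrt{\frac{0.24(0.76)}{40}}} \qquad (2) \\
&= \frac{-0.0096}{\sqrt{\frac{0.1724}{40}}} \qquad (3) \\
&= \frac{-0.0096}{\sqrt{0.00456}} \qquad (4) \\
&= \frac{-0.0069}{0.0675} \qquad (5) \\
Z_{com} &= 0.1422 \qquad (6)
\end{aligned}
$$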

Problem C:

A national survey asked the following question of 2500 registered
voters: “Is the character of a candidate for president important to you when
deciding for whom to vote?” Two thousand of the responses were yes. The
survey revealed that 90% of all registered voters believe the character
of the president is important when deciding for whom to vote. Compute
the test statistic Zcom.

Solution:
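Zcom = ?
p = 90% = 0.90
𝑝̂ = 2000/2500 = .80     (1)
n = 2500
q = 1 – p = 1 – 0.90 = 0.10
Substituting these values in the formula,

$$
\begin{aligned}
Z_{com} &= \frac{\hat{p} - p}{\sqrt{\frac{p(1-p)}{n}}} \\
&= \frac{0.80 - 0.90}{\sqrt{\frac{0.90(0.10)}{2500}}} \qquad (2) \\
&= \frac{-0.01}{\sqrt{\frac{0.09}{2500}}} \qquad (3) \\
&= \frac{-0.10}{\sqrt{0.00036}} \qquad (4) \\
&= \frac{-0.10}{0.06} \qquad (5) \\
Z_{com} &= -16.67 \qquad (6)
\end{aligned}
$$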

Did you get a very satisfying score? I hope you did.
By this time, you should be able to compute the test statistic,
or z-value, with ease.
Please try to answer the next activity
with your utmost confidence.

What I Have Learned

Activity 4. Missing Piece


Direction: Complete the following statements.
1. The test statistic to be used in testing hypothesis involving population
proportion is called ______________________.
2. The formula to find the value of z-value for population proportion is
________________________.
3. The symbol for the computed z-value is ________.
4. The symbol 𝑝̂ represents _____________.
5. The hypothesized value of the population proportion is denoted by
_______.
6. The sample size or the number of observations in the sample is
symbolized by _______.
7. To be able to find q, subtract _________ from _________ or simply
_________.
8. If p = 0.56, then q = ______.
9. The Central Limit Theorem (CLT) can be used when the sample is
sufficiently large. The sample is large enough if it satisfies the
condition that _________________________.
10. The standard deviation for population is used in z-test. Is it true or false?
Explain.

What I Can Do

Activity 5. Finding Zcom


A. Direction: Determine the value of Zcom given the following information.
1. p = 0.35
sample size = 180
sample proportion = 0.40
2. p = 0.36
sample size = 250
sample proportion = 29%
3. p = 0.65
sample size = 200
sample proportion = 78%

B. Directions: Carefully analyze the following situations. Then, fill in the
missing data and solve for Zcom.
1. A politician claims that he will receive 60% of the votes in the
upcoming election. In a random sample of 500 voters, there are 175
who will surely vote for him.
Identify the following:
p = _____
𝑝̂ = _____
n = _____
q = 1 – p = ______
Solution:
$Z_{com} = \dfrac{\hat{p} - p}{\sqrt{\dfrac{p(1-p)}{n}}}$
Zcom = ?

2. A social worker reports that 30% of workers in a factory are below 25
years of age. Of the 120 employees surveyed, 38 said they are 15 years
old.
Identify the needed data:
p = ______
𝑝̂ = ______
n = ______
q = 1 – p = ______
Solution:
$Z_{com} = \dfrac{\hat{p} - p}{\sqrt{\dfrac{p(1-p)}{n}}}$
Zcom = ?

3. Health-care coverage for employees varies with company size. It is
reported that 30% of all companies with fewer than 10 employees
provide health benefits for their employees. A sample of 50 companies
with fewer than 10 employees is selected. It is found that 19 of the 50
companies surveyed provide health benefits for their employees.
Identify the needed data:
p = _____
𝑝̂ = _____
n = _____
q = 1 – p = ______
Solution:
$Z_{com} = \dfrac{\hat{p} - p}{\sqrt{\dfrac{p(1-p)}{n}}}$
Zcom = ?

4. A survey of 2500 women between the ages of 15 and 50 found that
28% of those surveyed relied on the pill for birth control. The research
shows that 25% of them are using the pill for birth control.
Identify the needed data:
p = _____
𝑝̂ = _____
n = _____
q = 1 – p = ______
Solution:
$Z_{com} = \dfrac{\hat{p} - p}{\sqrt{\dfrac{p(1-p)}{n}}}$
Zcom = ?

5. A group of online shoppers were surveyed and 70% said that they
spent around P1000 to P1500 every month on the internet shopping.
From 400 respondents, 235 said that they consumed P1000 – P1500
in online shopping.
Identify the needed data:
p = _____
𝑝̂ = _____
n = _____
q = 1 – p = ______
Solution:
$Z_{com} = \dfrac{\hat{p} - p}{\sqrt{\dfrac{p(1-p)}{n}}}$
Zcom = ?

6. A poll taken just prior to election day finds that 389 of 700 registered
voters intend to vote for Kris P. Bacon for mayor of a certain city. The
poll resulted that 50% of all voters intend to vote for Kris.
Identify the needed data:
p = _____
𝑝̂ = _____
n = _____
q = 1 – p = ______
Solution:
$Z_{com} = \dfrac{\hat{p} - p}{\sqrt{\dfrac{p(1-p)}{n}}}$
Zcom = ?

Assessment

Directions: Choose the best answer to the given questions or statements.


Write the letter of your choice on a separate sheet of paper.

1. Which of the following is NOT necessary in solving the statistic z-test for
population proportion?
A. Mean C. size of the sample
B. sample proportion D. population proportion

2. The following are to be considered in using the Central Limit Theorem.


Which one should NOT be included?
A. the critical value
B. np ≥ 5 and nq ≥ 5
C. The sample size is large enough.
D. The situation should only have two outcomes: “success” or “failure”.

3. If p = 0.63, what is the value of q?


A. 0.37 B. 0.42 C. 0.58 D. 37.0
4. Mr. Makisig asserts that fewer than 5% of the bulbs that he sells are
defective. Suppose 300 bulbs are randomly selected. Each is tested and
10 defective bulbs are found. What is the value of the sample proportion?
A. 0.013 B. 0.023 C. 0.033 D. 0.043
5. In problem no. 4, what is the value of z?
A. – 1.20 B. – 1.25 C. – 1.30 D. – 1.35

6. A randomly selected sample of 500 senior high students was surveyed


whether they spend more than 3 hours playing Mobile Legends. Thirty
percent (30% or 0.30) of the 500 students surveyed said they do. Which
one of the following statements about the number 500 is correct?
A. It is the size of samples.
B. It is the sample proportion.
C. It is the population proportion.
D. It is the randomly chosen number.
7. If p = 0.46, what is the value of q?
A. 0.46 B. 0.53 C. 0.54 D. 0.64
For nos. 8-10, refer to the problem below.
A study conducted in a certain company last year revealed that 25%
of the employees prefer drinking milk tea to coffee during break time. The
company has decided to give free milk tea during break time. In a recent
study, out of 100 randomly sampled employees, 28% said that they would
rather drink milk tea than coffee.
8. What is the value of the hypothesized population proportion?
A. 0.25 B. 0.28 C. 0.75 D. 100
9. What is the value of the sample proportion?
A. 0.25 B. 0.28 C. 0.75 D. 100
10. What is the z-value?
A. 0.5682 B. 0.6065 C. 0.6928 D. 0.7713
11. Which of the following should NOT be considered in using z-test for
proportion?
A. when you are testing for the mean
B. when you are given the sample proportion
C. when you are testing a proportion/percentage of a population
D. when each sample point can result in just two possible outcomes:
success or failure

12. Which of the following is NOT included in the computation of the z-test
for population proportion?
A. N B. 𝜎 C. 𝑝̂ D. 𝑝
For nos. 13-15, refer to the problem below.
In the recent city triathlon, the sponsors have encouraged more
women to participate in the event. A sample is chosen randomly and among
70 runners, 32 are women. The sponsors would somewhat like to be 90%
certain that at least 40% of the participants are women.

13. What test statistic should be applied in the problem?


A. Mean B. p-test C. t-test D. z-test for proportion
14. Which of the following is the value of 𝑝̂ ?
A. 0.90 B. 0.46 C. 0.40 D. 0.32
15. What is the value of the test statistic?
A. 0.973 B. 0.819 C. – 0.973 D. – 0.819

Did you pass the assessment?

CONGRATULATIONS if you did. If not, please go back and review the
lesson; otherwise, the next modules will be more difficult.

Additional Activities

Activity 6. You Complete Me


A. Directions: Carefully analyze the following and solve for the value of z-
test statistic, zcom. Write your complete solutions.
1. A school principal claims that 40% of Grade 3 pupils stay in the
playground after their classes. A survey among 500 Grade 3 pupils
revealed that 150 of them stay in the playground after their classes.
2. A certified public accountant (CPA) claims that more than 30% of all
accountants advertise. A sample of 112 accountants in Metro Manila
showed that 40 use some form of advertising.
3. The GSIS states that 80% of its claims are settled within a month. A
consumer group selected a random sample of 240 of the company’s
claims to test this statement. It is found that 200 of the claims were
settled within a month.

B. Think and Reflect


1. Think of anything that you did today where you demonstrated or
practiced zeal or eagerness. How did you feel about it?

“Through zeal, knowledge is gotten; through lack of zeal, knowledge is lost.”


- Buddha

Statistics and
Probability
Quarter 2 - Module 13:
Drawing Conclusions About Population
Proportion Based on Test Statistic
Value and Rejection Region

What I Need to Know

In conducting a study, the last part of the process is drawing conclusions,
and it should be done correctly and carefully. In doing so, you need to learn how to
consider the necessary data as your basis and follow the required steps.

In the previous lessons, you learned how to compute the test
statistic concerning population proportions as well as how to determine the
rejection or non-rejection region by illustrating it on a curve.

After going through this module, you are expected to:

1. compute the test statistic for a population proportion;
2. differentiate the critical value approach from the p-value approach of hypothesis
testing; and
3. draw conclusions on population proportions based on the test statistic
and the rejection region.

What I Know

Directions: Choose the best answer to the given questions or statements.


Write the letter of your choice on a separate sheet of paper.

1. What do you call the part of the sample or the proportion of individuals in a
sample sharing a certain trait?
A. sample mean
B. sample variance
C. sample proportion
D. sample standard deviation

2. Which of the statements is NOT true about rejection region?


A. This is also the critical region.
B. It tells the researcher if a certain theory is probably true.
C. This is the range of values of the test value where the null hypothesis
should be rejected.
D. This is the range of values of the test value where the null hypothesis
should fail to be rejected.

3. Which of the following is usually expressed as a fraction, decimal, or percentage
of the whole population which has a certain trait or characteristic?
A. sample mean
B. population mean
C. sample proportion
D. population proportion

4. Which of the following symbols is NOT used in computing the z-value?

A. 𝑝̂ B. n C. p D.

5. What is the first step in drawing your conclusions?


A. Identify the correct decision.
B. Compute the test statistic.
C. Determine the level of significance.
D. Formulate the null and alternative hypothesis.

6. An insurance industry report indicated that 30% of those persons involved in
minor traffic accidents this year have been involved in at least one traffic
accident in the last five years. Believing it was too large, an advisory group
decided to investigate this claim. A sample of 200 traffic accidents this year
showed that 56 persons were also involved in another accident in the last five
years.

What is the value of p in the given problem?


A. 200
B. 56
C. 0.70
D. 0.30

7. When the computed z-value (zcom) is 3.16 at α = 0.05 level of significance, which
of the following will be the correct decision?
A. Reject the null hypothesis.
B. Fail to reject the null hypothesis.
C. Reject the alternative hypothesis.
D. Accept the alternative hypothesis.

8. Which of the following is an acceptable reason to conclude that there is not
enough evidence to reject the null hypothesis?
A. The computed z-value falls on the rejection region.
B. The computed z-value does not fall on the rejection region.
C. The computed z-value is greater than the critical value (if there is a
positive sign).
D. The computed z-value is less than the critical value (if there is a negative
sign).

9. What is the relationship between a Type I error and the null hypothesis (Ho)?
A. A Type I error corresponds to rejecting the null hypothesis when it is
true.
B. A Type I error corresponds to rejecting the null hypothesis when it is false.
C. A Type I error corresponds to failing to reject the null hypothesis when it
is false.
D. A Type I error corresponds to failing to reject the null hypothesis when it
is true.

10. What distribution do we use when testing claims about population proportions?
A. F
B. Z
C. t
D. chi

11. Researchers used the data and results below to test the claim that
more than 50% of adults support the tax increase.
n = 200 𝑝̂ = 0.565
Ho : p = 0.5 Ha : p > 0.5
z ≈ 1.84 P-value of approximately 0.033

What should be the correct conclusion?

A. At the α = 0.01 significance level, you should conclude that less than 50% of
adults support the tax increase.
B. At the α = 0.05 significance level, you should conclude that less than 50% of
adults support the tax increase.
C. At the α = 0.05 significance level, you should conclude that more than 50% of
adults support the tax increase.
D. At the α = 0.01 significance level, you should conclude that more than 50% of
adults support the tax increase.

For numbers 12-15, refer to the following:


Ho : The proportion of barangays segregating wastes into biodegradable
and non-biodegradable is 45%. (Ho : p = 0.45)

Ha : The proportion of barangays segregating wastes into biodegradable
and non-biodegradable has changed from 45%. ______________

α = 0.05 level

Computed z-value: zcom = 2.37


Critical z-value: ztab = 1.96

12. What is the correct alternative hypothesis (Ha) in symbols?


A. Ha : p < 0.45
B. Ha : p > 0.45
C. Ha : p = 0.45
D. Ha : p ≠ 0.45

13. The given problem is a ____________________________.
A. one-tailed test
B. one-sided test
C. non-directional
D. cannot be determined

14. What is the correct decision based on the given results?


A. There is no possible decision.
B. Reject the null hypothesis.
C. Fail to reject the null hypothesis.
D. Change the alternative hypothesis.

15. What is the phrase that best completes the conclusion below?

Therefore, at the 0.05 level of significance, ____________________
to conclude that the proportion of barangays segregating wastes into
biodegradable and non-biodegradable has changed from 45%.

A. there was a problem
B. there were missing data
C. there was enough evidence
D. there was not enough evidence

Lesson 13
Drawing Conclusions About Population Proportion Based on
Test Statistic Value and Rejection Region

What’s In

Activity 1: Do You Love Math?

DIRECTIONS: Determine the value of the sample proportion (𝑝̂) using the given sample
size (n) and the number of elements or observed values (X). Each
value corresponds to a letter in the legend below. After you solve for 𝑝̂
(rounded to two decimal places), write its value and corresponding letter
on the blanks to decode the secret message. Use the formula below.

FORMULA:
$\hat{p} = \dfrac{X}{n}$

1. n = 100 ; X = 48 𝑝̂ = ____ _____
2. n = 225 ; X = 214 𝑝̂ = ____ _____
3. n = 450 ; X = 356 𝑝̂ = ____ _____
4. n = 1,000 ; X = 772 𝑝̂ = ____ _____
5. n = 1,330 ; X = 988 𝑝̂ = ____ _____
6. n = 2,020 ; X = 1,915 𝑝̂ = ____ _____
7. n = 2,500 ; X = 2,301 𝑝̂ = ____ _____
8. n = 3,000 ; X = 2,650 𝑝̂ = ____ _____
9. n = 3,800 ; X = 3,316 𝑝̂ = ____ _____
10. n = 10,000 ; X = 8,900 𝑝̂ = ____ _____
LEGEND:
E – 0.95 W - 0.48 M – 0.92
A – 0.88 L – 0.79 O – 0.77
H – 0.89 T – 0.87 V – 0.74

Guide Questions:

1. How did you find the activity?
2. Did you find it easy to decode the secret message?
3. What is the range of values of your answers?
4. What is meant by 𝑝̂?
5. How did you get the value of 𝑝̂ in each item?

What’s New

Activity 2: What’s the Decision?


Directions: Using the given conditions, write your decision whether to reject or fail
to reject the null hypothesis.
1. The P-value is greater than α = 0.01. ____________
2. The computed value does not fall in the rejection region. ____________
3. There is enough evidence to support the claim that there is an
increase in the population proportion at the alpha level of significance.
_________
4. The test statistic falls in the critical region. _________
5. ________

What Is It

In drawing conclusions, there are two different approaches that you may
apply: the critical value approach (which compares the computed z-value with a
critical value) and the P-value approach.

CRITICAL VALUE APPROACH

In applying the first approach, which uses the critical value (taught in the
previous modules), you need to consider the following:

a. Null and Alternative Hypotheses;
b. Level of Significance (α);
c. Computed Test Statistic and Critical Value (including the rejection region);
and
d. Decision (whether to reject or fail to reject the null hypothesis (Ho)).

Determine if the test statistic falls in the rejection region. If it
does, reject the null hypothesis. If it does not, do not reject the null
hypothesis.

• If the computed z-statistic (zcom) is greater than a positive tabular value
(ztab) or less than a negative tabular value, reject the null hypothesis (Ho).
• If the computed z-statistic (zcom) falls in the rejection region, reject the
null hypothesis (Ho).
• If the computed z-statistic (zcom) does not fall in the rejection region,
fail to reject the null hypothesis (Ho).
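
The decision rule above can be sketched in a few lines of Python. This is only an illustration, not part of the module's procedure: the function name `critical_value_decision`, the tail labels `"left"`, `"right"`, and `"two"`, and the sample z-values are all hypothetical, and the critical value is taken from the standard normal distribution in Python's `statistics` module instead of a printed table.

```python
from statistics import NormalDist

def critical_value_decision(z_com, alpha, tail):
    """Return 'reject' or 'fail to reject' for a z-test on a proportion.

    tail is 'left' (Ha: p < p0), 'right' (Ha: p > p0), or 'two' (Ha: p != p0).
    These labels are illustrative assumptions, not notation from the module.
    """
    std_normal = NormalDist()  # standard normal curve: mean 0, sd 1
    if tail == "left":
        z_tab = std_normal.inv_cdf(alpha)          # negative critical value
        in_rejection_region = z_com < z_tab
    elif tail == "right":
        z_tab = std_normal.inv_cdf(1 - alpha)      # positive critical value
        in_rejection_region = z_com > z_tab
    elif tail == "two":
        z_tab = std_normal.inv_cdf(1 - alpha / 2)  # alpha is split between two tails
        in_rejection_region = abs(z_com) > z_tab
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return "reject" if in_rejection_region else "fail to reject"

# Hypothetical computed z-values, for illustration only.
print(critical_value_decision(1.50, 0.05, "right"))
print(critical_value_decision(-2.50, 0.05, "left"))
```

For a right-tailed test at α = 0.05 the critical value is about 1.645, so a computed value of 1.50 stays outside the rejection region, while -2.50 lies beyond -1.645 in a left-tailed test.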

Illustrative Example:

Example 1

a. Ho : p = 0.85
Ha : p < 0.85
b. Level of Significance: α = 0.01
c. Computed Test Statistic:

Given: X = 325 p = 0.85 n = 400

$\hat{p} = \dfrac{X}{n} = \dfrac{325}{400} \approx 0.81$

$z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{p(1-p)}{n}}} = \dfrac{0.81 - 0.85}{\sqrt{\dfrac{0.85(1 - 0.85)}{400}}} = -2.24$
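
As a rough check of the arithmetic in Example 1, the test statistic can be computed as follows. The helper name `z_statistic` is hypothetical. Note that 325/400 is exactly 0.8125; the example works with 0.81, so the sketch shows both versions.

```python
import math

def z_statistic(p_hat, p0, n):
    """z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n) for one population proportion."""
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error

x, n, p0 = 325, 400, 0.85

z_with_rounded_p_hat = z_statistic(0.81, p0, n)   # p_hat as used in the example
z_with_exact_p_hat = z_statistic(x / n, p0, n)    # p_hat = 0.8125, unrounded

print(round(z_with_rounded_p_hat, 2))
print(round(z_with_exact_p_hat, 2))
```

With the unrounded sample proportion the statistic is closer to zero, so in either case it does not go beyond the critical value of -2.326 and the decision in this example is unchanged.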

The alternative hypothesis is directional. Hence, a one-tailed test is used.

Using the Areas Under the Normal Curve Table, the critical value is
-2.326 at the α = 0.01 level. The value is negative because the alternative
hypothesis points to the left tail (Ha : p < 0.85), so the rejection region
consists of z-values less than -2.326.

d. DECISION: Since the computed test statistic zcom = -2.24 is greater than
-2.326, it does not fall in the rejection region. Fail to reject the null
hypothesis (Ho).

CONCLUSION: Therefore, at the 0.01 level of significance, there is not enough
evidence to conclude that there is a decrease in the proportion of students who
prefer male rather than female candidates.

P-VALUE APPROACH

What is P-value?

In the critical value approach, a test statistic is compared with a critical value.
In the p-value approach (short for probability value), however, probabilities or
areas are compared. The P-value is the probability, computed on the assumption
that the null hypothesis is true, of obtaining a test statistic at least as
extreme as the one observed. It measures how consistent the sample statistic is
with the null hypothesis. High P-values mean that the sample results are
consistent with a true null hypothesis, while low P-values mean that they are
not. If the P-value is small enough, we can conclude that the sample is
incompatible with the null hypothesis and therefore reject the null hypothesis
for the entire population.

The P-value approach uses the following basic procedure:

1. State the null hypothesis Ho and the alternative hypothesis Ha.
2. Set the level of significance α.
3. Calculate the test statistic.
4. Calculate the p-value.
5. Make a decision. Check whether to reject the null hypothesis by
comparing the p-value to α.
• If the p-value < α, then reject Ho. Otherwise, do not reject Ho.
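
Steps 4 and 5 can be sketched in Python as shown here. This is a minimal illustration under stated assumptions: the function names `p_value` and `p_value_decision`, the tail labels, and the z-value 2.10 are hypothetical, and the area under the normal curve comes from `statistics.NormalDist` instead of a printed table.

```python
from statistics import NormalDist

def p_value(z_com, tail):
    """Area under the standard normal curve at least as extreme as z_com."""
    cdf = NormalDist().cdf               # cumulative area to the left of a z-value
    if tail == "left":                   # Ha: p < p0
        return cdf(z_com)
    if tail == "right":                  # Ha: p > p0
        return 1 - cdf(z_com)
    if tail == "two":                    # Ha: p != p0, both tails counted
        return 2 * (1 - cdf(abs(z_com)))
    raise ValueError("tail must be 'left', 'right', or 'two'")

def p_value_decision(z_com, alpha, tail):
    """Reject Ho only when the p-value is smaller than alpha."""
    return "reject" if p_value(z_com, tail) < alpha else "fail to reject"

# Hypothetical computed z-value, for illustration only.
print(round(p_value(2.10, "right"), 4))
print(p_value_decision(2.10, 0.05, "right"))
print(p_value_decision(2.10, 0.01, "right"))
```

The same computed z-value can lead to different decisions at different levels of significance, which is why α must be fixed in step 2 before the p-value is calculated.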

Illustrative Example:
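Given:
Ho: p = 0.5 α = 0.05 n = 25,468
Ha: p > 0.5 𝑝̂ = 0.5172

Solution:

Using the formula:

$z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{p(1-p)}{n}}} = \dfrac{0.5172 - 0.5}{\sqrt{\dfrac{(0.5)(0.5)}{25468}}} = 5.49$

The p-value is the area under the standard normal curve to the right of z = 5.49:

$P = P(Z \ge 5.49) = 0.0000\ldots \approx 0$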
CONCLUSION: Because the p-value is smaller than the significance
level α=0.05, we can reject the null hypothesis. Again, we
would say that there is sufficient/enough evidence to
conclude that boys are more common than girls in the
entire population at α=0.05 level.

As should always be the case, the two approaches (critical value approach
and p-value approach) lead to the same conclusion.
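
The agreement between the two approaches can be checked numerically for the illustrative example (𝑝̂ = 0.5172, p = 0.5, n = 25,468, α = 0.05). The sketch below is an illustration only; the variable names are hypothetical, and the right-tailed critical value is computed with `statistics.NormalDist` instead of being read from a table.

```python
import math
from statistics import NormalDist

p_hat, p0, n, alpha = 0.5172, 0.5, 25468, 0.05

# Test statistic for one population proportion.
z_com = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Critical value approach: right-tailed test, so compare with a positive z_tab.
z_tab = NormalDist().inv_cdf(1 - alpha)
reject_by_critical_value = z_com > z_tab

# P-value approach: area to the right of z_com, compared with alpha.
p_val = 1 - NormalDist().cdf(z_com)
reject_by_p_value = p_val < alpha

print(round(z_com, 2), round(z_tab, 3))
print(reject_by_critical_value, reject_by_p_value)
```

Both flags come out the same because a computed z beyond the critical value always has a tail area smaller than α, and the reverse also holds.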

OTHER ILLUSTRATIVE EXAMPLES USING TWO-TAILED TEST

Example 1
Given:
a. n = 50
b. α = 0.01 significance level
c. Ho : The proportion of students that want to go to the zoo is 85%.
(Ho : p = 0.85)
Ha : The proportion of students that want to go to the zoo is not 85%.
(Ha : p ≠ 0.85)
d. p-value = 0.7554

DECISION/CONCLUSION: Because the p-value (0.7554) is greater than α (0.01), we
fail to reject the null hypothesis. There is insufficient evidence to suggest
that the proportion of students that want to go to the zoo is not 85%.

Example 2

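Given:
a. n = 150
b. α = 0.1 significance level
c. Ho : The proportion of households that have three or more cell phones is
30%. (Ho : p = 0.3)
Ha : The proportion of households that have three or more cell phones is
different from 30%. (Ha : p ≠ 0.3)

d. 𝑝̂ = 0.287
e. zcom = -0.347 (negative because 𝑝̂ = 0.287 is below p = 0.3)

[Figure: standard normal curve centered at 0 with critical values -1.645 and 1.645; zcom = -0.347 lies between them, in the non-rejection region.]

DECISION/CONCLUSION: Since zcom = -0.347 lies between the critical values -1.645
and 1.645, it does not fall in the rejection region. Fail to reject the null
hypothesis (Ho). There is insufficient evidence supporting that the proportion
of households with three or more cell phones is different from 30%.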

NOTE:
Conclusions are answers in sentence form which include: 1) whether there is
enough evidence or not (based on the decision); 2) the level of significance; and 3)
whether the original claim is supported or rejected.
Conclusions are based on the original claim which may be the null or
alternative hypothesis. The decisions are always based on the null hypothesis.

Wording of the conclusion, by decision and by original claim:

Decision: Reject H0 ("SUFFICIENT")
   Original claim is H0 ("REJECT"): There is sufficient evidence at the alpha
   level of significance to reject the claim that (insert original claim here).
   Original claim is Ha ("SUPPORT"): There is sufficient evidence at the alpha
   level of significance to support the claim that (insert original claim here).

Decision: Fail to reject H0 ("INSUFFICIENT")
   Original claim is H0 ("REJECT"): There is insufficient evidence at the alpha
   level of significance to reject the claim that (insert original claim here).
   Original claim is Ha ("SUPPORT"): There is insufficient evidence at the alpha
   level of significance to support the claim that (insert original claim here).
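
The four wordings above follow one pattern, which the short sketch below assembles from the decision and the original claim. The function name `conclusion` and its parameters are hypothetical and serve only to show how the pattern works.

```python
def conclusion(reject_h0, claim_is_h0, alpha, claim):
    """Build the conclusion sentence from the decision and the original claim.

    reject_h0   -- True if the decision is to reject the null hypothesis
    claim_is_h0 -- True if the original claim is the null hypothesis
    """
    # The decision fixes SUFFICIENT vs. INSUFFICIENT evidence.
    evidence = "sufficient" if reject_h0 else "insufficient"
    # The original claim fixes REJECT (claim is H0) vs. SUPPORT (claim is Ha).
    action = "reject" if claim_is_h0 else "support"
    return (f"There is {evidence} evidence at the {alpha} level of "
            f"significance to {action} the claim that {claim}.")

# Hypothetical claim, for illustration only.
print(conclusion(False, False, 0.01,
                 "the proportion of students who walk to school has increased"))
```

The decision alone determines the word "sufficient" or "insufficient", while the original claim alone determines "reject" or "support".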

NOTE:

If the null hypothesis isn’t rejected, this doesn’t necessarily mean that it’s
true. It simply means that there is not enough evidence to justify rejecting it.

Ideally, the hypothesis-testing procedure leads to not rejecting H0 when H0 is
true and to rejecting H0 when H0 is false. Unfortunately, since hypothesis tests
are based on sample information, the possibility of errors must be considered.
A Type I error corresponds to rejecting H0 when H0 is actually true, while a Type II
error corresponds to failing to reject H0 when H0 is false.
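
A small simulation can make the idea of a Type I error concrete. The sketch below is an illustration under assumed values (p = 0.5, n = 200, α = 0.05, a right-tailed test, 2,000 simulated samples, and a fixed random seed), none of which come from the module. It draws samples from a population where H0 is actually true and counts how often the test rejects H0 anyway.

```python
import math
import random
from statistics import NormalDist

def type_one_error_rate(p0=0.5, n=200, alpha=0.05, trials=2000, seed=1):
    """Estimate how often a right-tailed z-test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)                      # fixed seed for repeatability
    z_tab = NormalDist().inv_cdf(1 - alpha)        # right-tailed critical value
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    rejections = 0
    for _ in range(trials):
        # Draw one sample of size n from a population whose true proportion is p0.
        x = sum(rng.random() < p0 for _ in range(n))
        z_com = (x / n - p0) / standard_error
        if z_com > z_tab:                          # H0 is true, yet it is rejected
            rejections += 1
    return rejections / trials

rate = type_one_error_rate()
print(rate)
```

The estimated rate should land near α, since the level of significance is the probability of a Type I error that the researcher is willing to accept.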

Notes to the Teacher
Students should be aware of the p-value approach
since many statistical packages give the p-value but
not the critical value. One advantage of the p-value is
that we can immediately know at what level the
test becomes significant. For example, with a p-value of
0.03, the null hypothesis would be rejected at the 0.05
level of significance (since 0.03 < 0.05), but it would
fail to be rejected at the 0.01 level (since 0.03 > 0.01).
Remember, we must decide first on the
level of significance before calculating the test
statistic and finding the p-value.
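
The comparison of a single p-value with several levels of significance can be sketched as follows. The function name `decisions_by_alpha` is hypothetical; the rule applied is the one stated earlier in this lesson (reject Ho when the p-value < α).

```python
def decisions_by_alpha(p_val, alphas=(0.10, 0.05, 0.01)):
    """Apply the rule 'reject Ho if p-value < alpha' at several alpha levels."""
    return {
        alpha: ("reject Ho" if p_val < alpha else "fail to reject Ho")
        for alpha in alphas
    }

decisions = decisions_by_alpha(0.03)
for alpha, decision in decisions.items():
    print(f"alpha = {alpha}: {decision}")
```

A p-value of 0.03 is smaller than 0.10 and 0.05 but larger than 0.01, so the test is significant at the first two levels only.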

What’s More

Activity 3: Fill It Up!


Directions: Compute the test statistic. Fill in the blank with the word REJECT if
the decision is to reject the null hypothesis. Otherwise, write FAIL TO
REJECT. Then, draw your own conclusions by completing the
statement.

1. In a public senior high school, a survey conducted last year by a Health
Officer showed that 12% of the students drink alcohol. This year, a new
survey was conducted on 500 randomly selected students from the same school.
It was found that 97 of them drink alcohol. Test whether the proportion has
increased at the α = 0.01 level.
a. Ho : p = 0.12
Ha : p > 0.12
b. Level of Significance: α = 0.01
c. Computed Test Statistic: zcom = ______
d. Critical Value: 2.326
e. DECISION: Since the computed test statistic zcom = ____ falls in the
rejection region, _________________ the null hypothesis (Ho).

CONCLUSION: Therefore, we conclude that at 0.01 level of significance,


______________________ evidence to claim that ______________________________
___________________________________________________________________________
__________________________________________________________________________.

2. A study states that 28% of college degrees are from engineering courses.
A researcher doesn’t believe that this is correct. A sample of 1,000 graduates
was used, and it was found that 295 have finished engineering courses.
Test whether the proportion has increased at the α = 0.10 level.
a. Ho : p = 0.28
Ha : p > 0.28
b. Level of Significance: α = 0.10
c. Computed Test Statistic: zcom = ____
d. Critical Value: 1.282

e. DECISION: Since the computed test statistic zcom = ____ does not fall in
the rejection region, _________________ the null hypothesis (Ho).

CONCLUSION: Therefore, we conclude that at 0.10 level of significance,


_______________________________________ evidence to conclude that
________________________________________________________________________
_______________________________________________________________________.

Activity 4: Decide Now, Conclude Later!


Directions: Using the given hypotheses, computed z-value, and level of
significance, make your own decision and conclusion. Then, complete
the statement by filling in the blank with the appropriate word/s.
The first one was done for you as a guide.

Guide:
Given:

Ho: The proportion of students who are overweight is 25%. (Ho : p = .25)

Ha: The proportion of students who are overweight is less than 25%.
(Ha : p < .25)

α = 0.05; Critical Value: -1.645

Computed z-statistic: zcom = -2.24

DECISION: Reject the null hypothesis (Ho).
Since the computed z-statistic -2.24 falls in the rejection region, reject
the null hypothesis (Ho).

CONCLUSION: Therefore, we conclude at the 0.05 level of significance that there
is enough evidence to support the claim that less than 25% of the students are
overweight.

Problem 1

Given:
Ho: The proportion of employees in a shoe factory who smoke cigarettes is
30%. (Ho : p = .30)

Ha : The proportion of employees in a shoe factory who smoke cigarettes is
greater than 30%. (Ha : p > .30)
α = 0.01

Computed z-statistic: zcom = 2.56 and Critical z-value: ztab = 2.326

DECISION: Since the computed test statistic zcom = 2.56 ________________ in
the rejection region, _________________ the null hypothesis (Ho).

CONCLUSION: Therefore, we conclude that at 0.01 level of significance,


_________________________________________ evidence to conclude that
_________________________________________________________________________
_________________________________________________________________________.

What I Have Learned

Directions: Complete the following statements. In items 2, 5, and 6, choose
the word/s in the parentheses that best complete/s the statement.
1. _______________ are statements which answer whether there is enough evidence
or not (based on the decision), what the level of significance is, and whether the
original claim is supported or rejected.
2. After computing the test statistic in order to draw the conclusion, just
remember the following:
a. If the computed z-statistic (zcom) is greater than a positive tabular value
(ztab) or less than a negative tabular value,
__________ (fail to reject/reject) the null hypothesis (Ho).
b. If the computed z-statistic (zcom) falls in the rejection region,
_____________ (fail to reject/reject) the null hypothesis (Ho).
c. If the computed z-statistic (zcom) does not fall in the rejection region,
________________ (fail to reject/reject) the null hypothesis (Ho).
3. The decision is always based on the __________________ hypothesis.

4. The two approaches to draw conclusions are ___________________________ and


______________________________________.

5. If the p-value < α, then _______________ (fail to reject/reject) Ho.

6. If the p-value > α, then _______________ (fail to reject/reject) Ho.

What I Can Do

Directions: Read job vacancy posts in the classified ads section of a newspaper.
Then, draw conclusions about the type of people who will apply for
each job. Write your conclusions based on facts and include the
newspaper clippings where you got the information. You will be
graded using the rubric below.
RUBRIC

CATEGORY: Focus and Support for Topic
4: There is a clear and well-focused topic which is relevant and supported with details/facts.
3: The idea is clear but the supporting details/facts are not complete.
2: The idea is quite clear and not supported with needed details/facts.
1: The idea is not clear and not supported with needed details/facts.

CATEGORY: Conclusion
4: The conclusion is correct and strong.
3: The conclusion is correct but quite weak.
2: The conclusion is incorrect but portrays a strong point.
1: The conclusion is weak and incorrect.

CATEGORY: Grammar & Spelling
4: There are no errors in grammar or spelling.
3: There are 1-3 errors in grammar or spelling.
2: There are 4-6 errors in grammar or spelling.
1: There are more than 6 errors in grammar or spelling.

Assessment

Directions: Choose the best answer to the given questions or statements. Write
the letter of your choice on a separate sheet of paper.

1. Which of the following is an approach in drawing conclusions wherein a test


statistic is compared with a critical value?
A. critical value approach
B. sampling approach
C. two-way approach
D. p-value approach

2. Which is the correct decision for the given values/results below?


Ho: p = 0.13 Ha: p < 0.13
α = 0.05 zcom = -2.688 p-value = 0.0036
A. There is no possible decision.
B. Reject the null hypothesis.
C. Fail to reject the null hypothesis.
D. Change the alternative hypothesis.

For numbers 3 to 7, refer to the given problem below.


A state university wants to increase its retention rate of 4% for
graduating students from the previous year. After implementing several new
programs during the last two years, the university reevaluated its retention rate
using a random sample of 352 students and found the retention rate at 5%.

Which is the correct pair of hypotheses?


A. Ho: p = 0.04; Ha: p > 0.04
B. Ho: p = 0.04; Ha: p < 0.04
C. Ho: p = 0.04; Ha: p ≠ 0.04
D. Ho: p = 0.04; Ha: p ≥ 0.04

3. What is the value of z?


A. -1.07
B. 0.96
C. 1.07
D. 2.59

4. What is the p-value?


A. 0.8577
B. 0.2846
C. 0.2215
D. 0.1685

5. What is the correct decision?
A. There is no possible decision.
B. Reject the null hypothesis.
C. Fail to reject the null hypothesis.
D. Change the alternative hypothesis.

6. What should be the conclusion based on the computed test statistic?


A. This data shows that less than 4% of the students are retained. There
is enough evidence.
B. This data shows that more than 4% of the students are retained.
There is enough evidence.
C. This data does not show that less than 4% of students are retained.
There is not enough evidence.
D. This data does not show that more than 4% of students are retained.
There is not enough evidence.

7. What is meant by an increase of 5% retention rate?
A. There is an average of 18 learners who were retained in the grade level.
B. There is an average of 17 learners who were retained in the grade level.
C. There is an average of 334 learners who were retained in the grade
level.
D. There is an average of 335 learners who were retained in the grade
level.

For numbers 8-11, refer to the given problem below.

Suppose a study found that 68% of the population owns a home. In a
random sample of 150 households, 92 own a home. Use α = 0.01 to
determine whether there is a decrease in the proportion of the population
that owns a home.

8. Find the z-score.

A. -1.75
B. -0.08
C. 0.08
D. 1.75

9. Which value is closest to the p-value?


A. 0.02
B. 0.04
C. 0.06
D. 0.08

10. What is the correct decision?


A. It cannot be concluded.
B. Reject the null hypothesis.
C. Fail to reject the null hypothesis.

D. Accept both null and alternative hypotheses.

11. Is there enough evidence to reject the claim?


A. There is enough evidence to reject the claim that 68% of the
population owns a home.
B. There is enough evidence to reject the claim that 32% of the
population owns a home.
C. There is not enough evidence to reject the claim that 32% of the
population owns a home.
D. There is not enough evidence to reject the claim that 68% of the
population owns a home.

For numbers 12-15, refer to the given problem below.


Suppose that the percentage of female physicians is 27%. In a survey
of physicians, 45 of 120 are women. Is there sufficient evidence at α = 0.01
to claim that the proportion of women physicians is greater than 27%?

12. Choose the correct hypotheses.


A. H0: p = 0.27; Ha: p > 0.27
B. H0: p = 0.27; Ha: p < 0.27
C. H0: p = 0.27; Ha: p ≠ 0.27
D. H0: p > 0.27; Ha: p = 0.27

13. What is the value of z?


A. -2.59
B. -0.005
C. 0.005
D. 2.59

14. Which is closest to the p-value?


A. 0.0005
B. 0.005
C. 0.05
D. 0.5

15. What is the correct decision and conclusion?


A. Change the alternative hypothesis.
B. There are no possible decision and conclusion.
C. Reject the null hypothesis because there is enough evidence to
support the claim that the proportion of women physicians is greater
than 27%.
D. Fail to reject the null hypothesis because there is not enough evidence
to support the claim that the proportion of women physicians is
greater than 27%.

Additional Activities

Directions: Read and analyze the following statements. Write ACCEPT if the
statement is correct and write REJECT if it is incorrect. Write your
answer on a sheet of paper.

____________1. The claim being assessed in a hypothesis test is the null hypothesis.

____________2. Critical value is the probability that the null hypothesis is true given
the observed results.

____________3. In a research report, the results of a hypothesis test include the
expression "z=3.15, p<0.01". This means that the test failed to reject
the null hypothesis at α = 0.01.

____________4. When p-value is greater than alpha (0.05 used), we fail to reject Ho.

____________5. If a hypothesis test leads to a decision failing to reject the null


hypothesis, a Type II error may have been made.

References

Books

Albacea, Zita VJ., Mark John V. Ayaay, Isidoro P. David, and Imelda E. De Mesa.
Teaching Guide for Senior High School: Statistics and Probability. Quezon City:
Commission on Higher Education, 2016.

Caraan, Avelino Jr S. Introduction to Statistics & Probability: Modular Approach.


Mandaluyong City: Jose Rizal University Press, 2011.

De Guzman, Danilo. Statistics and Probability. Quezon City: C & E Publishing Inc.,
2017.
Punzalan, Joyce Raymond B. Senior High School Statistics and Probability.
Malaysia: Oxford Publishing, 2018.
Sirug, Winston S. Statistics and Probability for Senior High School CORE Subject A
Comprehensive Approach K to 12 Curriculum Compliant. Manila: Mindshapers
Co., Inc., 2017.

Online Resources

Minitab.com. “About the Null and Alternative Hypotheses.” Accessed February 4, 2019.
https://support.minitab.com/en-us/minitab/18/help-and-how-to/statistics/basic-statistics/supporting-topics/basics/null-and-alternative-hypotheses/
Minitab.com. “What Are Type I and Type II Errors?” Accessed February 4, 2019.
https://support.minitab.com/en-us/minitab/18/help-and-how-to/statistics/basic-statistics/supporting-topics/basics/type-i-and-type-ii-error/
Zaiontz, Charles. “Null and Alternative Hypothesis.” Accessed February 2, 2018.
http://www.real-statistics.com/hypothesis-testing/null-hypothesis/
https://www.britannica.com/science/statistics/Hypothesis-testing
https://www.dummies.com/education/math/business-statistics/draw-conclusions-about-a-population-using-confidence-intervals-and-hypothesis-testing/
https://online.stat.psu.edu/statprogram/reviews/statistical-concepts/proportions
https://people.richland.edu/james/lecture/m170/ch09-int.html
https://www.khanacademy.org/math/ap-statistics/tests-significance-ap/one-sample-z-test-proportion/v/comparing-p-value-to-significance-level-example

Statistics and
Probability
Quarter 2 – Module 14:
Solving Problems Involving Test
of Hypothesis on Population
Proportion

What I Need to Know

In real life, whenever we are confronted with problems, our decision-making
skill is tested. Before we decide, certain considerations must be weighed and
the given conditions must be analyzed. Someone can be an expert problem solver if
s/he is able to apply the learned concepts in a particular situation. Although
problem solving has steps, someone may have his/her own way or technique of
solving a problem.

Meanwhile, in statistical analysis, there are steps that need to be followed in
solving problems involving test of hypothesis on population proportion. The
objective is for us to make a correct decision about the null hypothesis, that
is, to determine whether we can confidently say that the change in our data is
real and not attributable to chance.

After going through this module, you are expected to:

1. enumerate the steps in solving problems involving test of hypothesis on


population proportion; and

2. solve problems involving test of hypothesis on the population proportion.

What I Know

Directions: Choose the best answer to the given questions or statements.


Write the letter of your choice on a separate sheet of paper.

1. Suppose that in a study of the number of students who prefer using cell
phones rather than reading books, it was found that 85% of the students
preferred not to read. In the following year, the same study was conducted,
and 120 out of 150 randomly selected students had the same preference.
It was found out that there was an increase in number.
Test the claim at α = 0.01.

Which of the following would be an appropriate alternative hypothesis?


A. The sample proportion is less than 0.85.
B. The sample proportion is no less than 0.85.
C. The population proportion is less than 0.85.
D. The population proportion is no less than 0.85.

2. In problem no. 1, which of the following would be the null hypothesis?
A. The sample proportion is 0.85.
B. The population proportion is 0.85.
C. The sample proportion is not equal to 0.85.
D. The population proportion is not equal to 0.85.

3. A Type I error is committed when ___________________________________.


A. we reject a null hypothesis that is true
B. we reject a null hypothesis that is false
C. we don't reject a null hypothesis that is true
D. we don't reject a null hypothesis that is false

4. In testing hypotheses, which of the following would be a strong evidence against


the null hypothesis?
A. using small number of samples
B. using a high level of significance
C. obtaining data with a small p-value
D. obtaining data with a low test statistic

5. What is the critical value (in a test about proportions) for a left-tailed test with
α = 0.05 and n ≥ 30?
A. ztab = -2.33
B. ztab = -1.96
C. ztab = -1.645
D. ztab = 2.58

6. Suppose the P-value for a hypothesis test is 0.0304. Using α = 0.05, what is the
appropriate conclusion?
A. Reject the null hypothesis.
B. Fail to reject the null hypothesis.
C. Reject the alternative hypothesis.
D. Accept the alternative hypothesis.

7. When P-value is less than the alpha, we ___________________.


A. reject Ho
B. reject Ha
C. accept Ha
D. fail to reject Ho

8. Tina wants to know if the proportion of people who buy suman is affected at all
by her open microphone reading. If p=0.8 before her reading, what is the
appropriate set of hypotheses?
A. Ho: p = 0.8
Ha: p > 0.8

B. Ho: p = 0.8
Ha: p < 0.8

C. Ho: p ≠ 0.8
Ha: p = 0.8

D. Ho: p = 0.8
Ha: p ≠ 0.8

9. In a research report, the results of a hypothesis test include the expression


"z=3.15, p < 0.01". This means that the test should _______________________.
A. reject the null hypothesis
B. reject the alternative hypothesis
C. fail to reject the null hypothesis
D. fail to reject the alternative hypothesis

10. In problem no. 9, what is the level of significance used?


A. α = 0.5
B. α = 0.1
C. α = 0.05
D. α = 0.01

For nos. 11-15, refer to the given problem below.


It was claimed that in a certain year, 55% of Filipinos believed that
there was an improvement in the Philippine economy. Suppose that in the
following year, 290 out of 500 randomly selected people believed that
there was an improvement in our country’s economy. Does this indicate an
increase in the proportion of Filipinos who believed that there was an
improvement in our economy? Use the 0.05 level of significance.

11. What is the appropriate alternative hypothesis to be used?


A. Ha : p < po C. Ha : p ≠ po
B. Ha : p > po D. Ha : p = po

12. What is the value of α?


A. 0.55 B. 0.50 C. 0.05 D. 0.01

13. What is the value of 𝑝̂?
A. 0.55 B. 0.58 C. 0.65 D. 0.725

14. What is the critical z-value to be used?


A. 1.645
B. 2.00
C. 2.58
D. 2.96

15. Which of the following is the best decision and conclusion based on the results
of the test statistic? The computed z-statistic or zcom is 1.35.

A. Since the computed test statistic zcom = 1.35 does not fall in the rejection
region, do not reject the null hypothesis. Therefore, we conclude that at
0.05 level of significance, there was not enough evidence that the number of
people who believed that there was an improvement in our economy has
increased.
B. Since the computed test statistic z = 1.35 does not fall in the rejection
region, reject the null hypothesis. Therefore, we conclude that at 0.05 level
of significance, there was not enough evidence that the number of people
who believed that there was an improvement in our economy has increased.
C. Since the computed test statistic z = 1.35 falls on the rejection region, do
not reject the null hypothesis. Therefore, we conclude that at 0.05 level of
significance, there was enough evidence that the number of people who
believed that there was an improvement in our economy has increased.
D. Since the computed test statistic z = 1.35 does not fall in the rejection
region, do not reject the null hypothesis. Therefore, we conclude that at
0.05 level of significance, there was enough evidence that the number of
people who believed that there was an improvement in our economy has
increased.

Lesson 14
Solving Problems Involving Test of Hypothesis on Population Proportion

What’s In

Activity 1: Give Your Best!

Directions: Read, analyze, and identify the given in the following problems
involving population proportions.

1. It has been claimed that 30% of students in a particular senior high school
dislike Mathematics. When a survey was conducted by a researcher, it showed
that 150 of 1,000 students dislike Mathematics. Test whether the population
proportion differs from the claimed 30% at the α = 0.01 level.

Given:
a. Ho : _______________(symbols)
___________________________________________(statement)
b. Ha : _______________(symbols)
___________________________________________(statement)
c. Level of Significance = __________
d. n = ________
e. X = ________
f. 𝑝̂ = ________

2. In a public senior high school, a survey conducted last year by the barangay
health workers showed that 10% of the students drink alcohol. This year, a new
survey was conducted randomly on 320 students from the same school and it
was found out that 28 of them drink alcohol. Determine if the claim that there is
a decrease on the proportion of senior high school students who drink alcohol is
true. Use α = 0.05.

Given:
a. Ho : _______________(symbols)
___________________________________________(statement)
b. Ha : _______________(symbols)
___________________________________________(statement)
c. Level of Significance = __________
d. n = ________
e. X= _______
f. 𝑝̂ = ________

What’s New

Directions: Below is a problem with its solutions/answers already given. Arrange
the steps by writing numbers 1-5 based on your understanding of
the proper order of solving problems on population proportions.

PROBLEM:

A research study was conducted to determine the number of students who
watch news on national television during weekdays. The percentage of those
watching was 15%. The next school year, the same study was conducted among
randomly selected students. It was found out that the number of students
watching news was lower than the previous year. Test the claim at α = 0.05.

________ DECISION: Does not fall in the rejection region; fail to reject the Ho
________ computed z-statistic: zcom = -1.15 and critical z-value: -1.645

________ Ho: The proportion of students who watch news on national TV
during weekdays is 15%. (Ho : p = 0.15)
Ha: The proportion of students who watch news on national TV
during weekdays is lower than 15%. (Ha : p < 0.15)

________ CONCLUSION: Therefore, we conclude that at 0.05 level of
significance, there was insufficient evidence to claim that the
proportion of students who watch news on national TV during
weekdays is lower than 15%.

_______ α = 0.05 level of significance

What Is It

Just like in puzzles, you need to think of different ways on how you will be
able to solve it. Same with solving problems involving test of hypotheses on
population proportions, you need to follow important steps in order to arrive at the
correct answer.

Here are the five (5) steps in solving problems for a test of hypothesis on the
population proportion.

STEP 1. HYPOTHESES: State the null and alternative hypotheses (either in
sentence/statement form or in symbols).

Ho : p = po
Ha : p < po or Ha : p > po or Ha : p ≠ po

STEP 2. LEVEL OF SIGNIFICANCE (α): Choose a level of significance, such as
α = 0.01.

STEP 3. TEST STATISTIC: Calculate the appropriate test statistic.

Remember:

A test statistic is a random variable calculated from a sample. You can
use it to determine whether to reject the null hypothesis or not.
The test statistic compares your data with what is expected under the null
hypothesis, and it is used to calculate the p-value.
A test statistic measures the degree of agreement between a sample of
data and the null hypothesis. Its observed value changes randomly from one
random sample to another. A test statistic contains information
about the data that is relevant to deciding whether to reject the null
hypothesis or not.
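
The computation of a test statistic for a proportion can be sketched in Python. This is only a minimal illustration, assuming the usual one-sample z-statistic for a proportion, z = (p̂ – p) / √(p(1 – p)/n). The function names and the sample numbers (60 out of 100, p = 0.5) are hypothetical and are not part of the module.

```python
from math import sqrt


def sample_proportion(x, n):
    """Return p-hat = x / n, the share of sampled units with the trait."""
    return x / n


def z_statistic(p_hat, p0, n):
    """Return z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)."""
    standard_error = sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error


# Hypothetical data (an assumption for illustration only):
# 60 of 100 sampled units have the trait; the hypothesized proportion is 0.5.
p_hat = sample_proportion(60, 100)
z = z_statistic(p_hat, 0.5, 100)
print("p_hat =", round(p_hat, 2))
print("z =", round(z, 2))
```

The standard error uses the hypothesized proportion p, not p̂, because the test statistic is computed under the assumption that the null hypothesis is true.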

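For a test on a population proportion, compute the sample proportion and the
z-statistic as follows:

$$\hat{p} = \frac{x}{n}$$

$$z = \frac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} \quad \text{or} \quad z = \frac{\hat{p} - p}{\sqrt{\dfrac{p(1 - p)}{n}}}$$

where: x = number of sample units that possess the characteristic of interest
p = population proportion
q = 1 – p
p̂ = sample proportion
n = sample size

STEP 4. CRITICAL VALUE/P-VALUE: Determine the critical value or p-value.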


Remember:
The critical value is the point compared with the test statistic, while the
p-value is compared with the level of significance, in order to make the
final decision on whether to reject the null hypothesis or not.
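
As a rough sketch, critical values and p-values can be obtained from the standard normal distribution instead of a printed table. The example below uses `NormalDist` from Python's standard `statistics` module; the helper names `critical_value` and `p_value` and the tail labels are assumptions made for this illustration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def critical_value(alpha, tail):
    """Return the critical z-value for tail in {"left", "right", "two"}."""
    if tail == "left":
        return STANDARD_NORMAL.inv_cdf(alpha)
    if tail == "right":
        return STANDARD_NORMAL.inv_cdf(1 - alpha)
    if tail == "two":
        # Positive cutoff; the rejection region is |z| beyond this value.
        return STANDARD_NORMAL.inv_cdf(1 - alpha / 2)
    raise ValueError("tail must be 'left', 'right', or 'two'")


def p_value(z, tail):
    """Return the p-value of an observed z for the given kind of test."""
    if tail == "left":
        return STANDARD_NORMAL.cdf(z)
    if tail == "right":
        return 1 - STANDARD_NORMAL.cdf(z)
    if tail == "two":
        return 2 * (1 - STANDARD_NORMAL.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two'")


print(round(critical_value(0.05, "two"), 2))    # 1.96
print(round(critical_value(0.05, "right"), 3))  # 1.645
```

In the p-value approach, Ho is rejected when the p-value is less than or equal to α; in the critical value approach, Ho is rejected when the test statistic falls beyond the critical value.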

STEP 5. DECISION/CONCLUSION:

• The decision will be either to reject or fail to reject the null
hypothesis (Ho).

• Draw your conclusion about the population proportion based on
the test statistic value and the rejection region.

• If the computed z-statistic (zcom) lies beyond the tabular/critical
value (ztab) in the direction of the alternative hypothesis, reject the
null hypothesis (Ho). Taking ztab as the positive critical value, this means
zcom > ztab for a right-tailed test, zcom < –ztab for a left-tailed test,
or |zcom| > ztab for a two-tailed test.
• If the computed z-statistic (zcom) falls in the rejection region,
reject the null hypothesis (Ho).
• If the computed z-statistic (zcom) does not fall in the rejection
region, fail to reject the null hypothesis (Ho).

NOTE:

(These conditions were already mentioned in the previous module on drawing
conclusions on population proportions.)

To solve problems involving population proportions, just follow the
5-step procedure mentioned above.

Illustrative Examples

Example 1: Every year, the assigned teachers determine the Body Mass Index
(BMI) of students. In a certain public junior high school, a study claims
that 10% of Grade 7 students are underweight. A sample of
780 Grade 7 students was randomly chosen, and it was found
that 125 of them are underweight. Is the proportion of underweight
Grade 7 students different from the claimed 10%? Use 0.05 level of significance.

SOLUTION:

STEP 1: State the null and alternative hypotheses.


Ho : p = 0.10
Ha : p ≠ 0.10

STEP 2: Choose a level of significance. α = 0.05

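STEP 3: Compute the test statistic.

Given: X = 125 p = 0.10 n = 780

$$\hat{p} = \frac{X}{n} = \frac{125}{780} \approx 0.1603 \approx 0.16$$

$$z = \frac{\hat{p} - p}{\sqrt{\dfrac{p(1 - p)}{n}}} = \frac{0.1603 - 0.10}{\sqrt{\dfrac{0.10(1 - 0.10)}{780}}} = \frac{0.0603}{0.01074}$$

zcom ≈ 5.61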

STEP 4: Determine the critical value.

NOTE: Since the alternative hypothesis is non-directional, the two-tailed
test shall be used. Divide α by 2, then subtract the quotient
from 0.5.

$$\frac{\alpha}{2} = \frac{0.05}{2} = 0.025$$

Therefore, 0.5 – 0.025 = 0.475.

NOTE: Using the Areas Under the Normal Curve Table, the critical values
$z_{\alpha/2}$ at 0.05 level of significance are ±1.96.

[Figure: standard normal curve with a rejection region of area α/2 = 0.025 in each tail, beyond z = –1.96 and z = 1.96]

STEP 5: Make a decision whether to reject or fail to reject the null
hypothesis. Draw a conclusion.
DECISION: Since the computed test statistic zcom = 5.61 is greater than the
critical value 1.96, it falls in the rejection region; reject the null
hypothesis.
CONCLUSION: Therefore, we conclude that at 0.05 level of significance,
there is enough evidence that the percentage of Grade 7
students who are underweight is different from 10%.
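
The arithmetic in Example 1 can be checked with a few lines of Python. This is a sketch under the assumptions already stated in the example (X = 125, n = 780, p = 0.10, α = 0.05, two-tailed test); the variable names are illustrative only, and the critical value is taken from `statistics.NormalDist` rather than from the printed table.

```python
from math import sqrt
from statistics import NormalDist

# Values given in Example 1
x, n, p0, alpha = 125, 780, 0.10, 0.05

p_hat = x / n                                    # sample proportion
z_com = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # computed z-statistic
z_tab = NormalDist().inv_cdf(1 - alpha / 2)      # two-tailed critical value

decision = "reject Ho" if abs(z_com) > z_tab else "fail to reject Ho"

print(f"p_hat = {p_hat:.4f}")
print(f"z_com = {z_com:.2f}, critical values = +/-{z_tab:.2f}")
print(decision)
```

Keeping p̂ unrounded until the last step avoids small rounding differences in the computed z-statistic.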

What’s More

Make Your Own


Directions: Using the given set of values/parts of the test of hypothesis involving
population proportions, construct your own word problem about the
specified topic in each number.

1. α = 0.05
Ho: p = 0.6
Ha: p ≠ 0.6

TOPIC: Numeracy rate of a certain high school

2. α = 0.05
p = 0.45
right-tailed test

TOPIC: Number of tourists at a certain landmark in the Philippines

3. Population proportion is 0.85.
non-directional test

TOPIC: Spread of virus/bacteria

What I Have Learned

In order to solve problems involving test of hypotheses on population
proportion, the five (5) steps are:

1. _____________________________________________________________________________

2. _____________________________________________________________________________
3. _____________________________________________________________________________

4. _____________________________________________________________________________

5.______________________________________________________________________________

What I Can Do

A. Give three (3) best experiences in your life wherein you think you made the
right decisions. Share some things, ideas, or techniques that you considered
before finally deciding. You are going to present your answers through a collage
in a short bond paper. (Use recyclable materials like old magazines, newspaper,
etc.)

B. In a 5-sentence paragraph, give reasons why you should be a wise decision


maker and why you should have good problem-solving skills.

Assessment

Directions: Choose the best answer to the given questions or statements.


Write the letter of your choice on a separate sheet of paper.

1. A hypothesis test is done in which the alternative hypothesis is that more than
10% of the population is left-handed. The calculated p-value for the test is 0.25.
Which statement is correct?

A. We can conclude that exactly 25% of the population is left-handed.


B. We can conclude that more than 10% of the population is left-handed.
C. We can conclude that more than 25% of the population is left-handed.
D. We cannot conclude that more than 10% of the population is left-handed.

2. In a nationwide survey, 1,500 adults were asked about attitudes toward


“alternative medicine” such as acupuncture, massage therapy, etc. Among the
1,500 respondents, 660 said they would use alternative medicine if the
traditional medicine did not produce the results they wanted. The researcher
wants to determine if these data provide enough evidence to suggest that less
than half of all adults would use alternative medicine if traditional medicine
didn’t produce the desired results. The level of significance used was 5%.

Which is the correct conclusion for this test?


A. Since p-value = 0.001 < 0.05, I reject Ho. There is enough evidence to
suggest that the proportion is less than half.
B. Since p-value = 0.001 < 0.05, I fail to reject Ho. There is not enough evidence
to suggest that the proportion is less than half.
C. Since p-value = 0.001 > 0.05, I reject Ho. There is enough evidence to
suggest that the proportion is less than half.
D. Since p-value = 0.001 < 0.05, I fail to reject Ho. There is enough evidence to
suggest that the proportion is less than half.

3. A potato chip producer and a supplier of potatoes agree that each shipment
must meet certain quality standards. If the producer is convinced that more
than 8% of the potatoes in the shipment have blemishes, the truck will be sent
away and another one would have to be sent. In a recent shipment, an SRS of
80 potatoes was selected and 7 had blemishes. Use α = 0.01.

Which is the correct decision for this test?


A. Since p-value = 0.4024 < .01, I reject H0.
B. Since p-value = 0.4024 > .01, I reject H0.
C. Since p-value = 0.4024 < .01, I fail to reject H0.
D. Since p-value = 0.4024 > .01, I fail to reject H0.

4. In problem no. 3, what will be the correct conclusion regarding the claim?
A. There is no sufficient evidence that more than 8% of the potatoes in the
shipment have blemishes. Therefore, the truck should be returned.
B. There is sufficient evidence that more than 8% of the potatoes in the
shipment have blemishes. Therefore, the truck should be returned.
C. There is no sufficient evidence that more than 8% of the potatoes in the
shipment have blemishes. Therefore, the truck should not be returned.
D. There is sufficient evidence that more than 8% of the potatoes in the
shipment have blemishes. Therefore, the truck should not be returned.

For nos. 5 to 9, refer to the given problem below.

A public high school wants to increase its reading comprehension rate of


9% for Grade 7 students from the previous year. After planning and
implementing new reading programs during the last three years, the school re-
evaluated its reading comprehension rate using a random sample of 156
students and found the reading comprehension rate at 10%. Test the claim at
10% level.

5. What is the level of significance of the given problem?


A. α = 0.01
B. α = 0.05
C. α = 0.1
D. α = 0.5

6. What is the null hypothesis?


A. Ho : p > 0.09
B. Ho : p < 0.09
C. Ho : p = 0.09
D. Ho : p ≠ 0.09

7. What is the alternative hypothesis?


A. Ha : p > 0.09
B. Ha : p = 0.09
C. Ha : p < 0.09
D. Ha : p ≥ 0.09

8. If the computed p-value is greater than the given α, which is the correct
decision?
A. Reject the null hypothesis.
B. Fail to reject the null hypothesis.
C. Reject both null and alternative hypotheses.
D. Fail to reject both null and alternative hypotheses.

9. From the correct decision in no. 8, what should be your conclusion?
A. There is a missing data.
B. There is an error in the claim.
C. There is sufficient evidence to claim that the reading comprehension rate is
higher during the current year than the previous year.

D. There is no sufficient evidence to claim that the reading comprehension rate


is higher during the current year than the previous year.

10. Which of the following will NOT result in a decision to reject the null
hypothesis?
A. The z-score is located at the rejection region.
B. The p-value is equal to the level of significance.
C. The test statistic is smaller or larger than the critical value.
D. The p-value is greater than the level of significance.

11. Why do you need to set the level of significance in solving problems for test of
hypothesis?
A. to determine the test statistic
B. to identify the margin of error
C. to easily compute the critical value
D. to make the probability of making a Type I error small

12. Which is true about using critical value approach and P-value approach?
A. They are used only for proportions.
B. They will give you different decisions.
C. They are used as alternative solutions.
D. They both have the same results used for drawing conclusions.

For nos. 13 to 15, refer to the given problem below.

The mayor of a town saw an article claiming that the national


unemployment rate is 8%. He wondered if this holds true in their town, so a
sample of 200 residents was taken. The sample included 22 unemployed
residents and 0.05 level of significance was used.

13. Formulate the pair of hypotheses.


A. Ho : p = 0.08
Ha : p ≠ 0.08

B. Ho : p = 0.08
Ha : p < 0.08

C. Ho : p = 0.08
Ha : p > 0.08

D. Ho : p = 0.08
Ha : p ≥ 0.08

14. This test is a _____________________.
A. left-tailed test
B. one-tailed test
C. right-tailed test
D. two-tailed test

15. What is the level of significance (α) in the given problem?


A. 0.01
B. 0.05
C. 0.1
D. 0.5

Additional Activities

Finding the Errors


Directions: Read and analyze the given problems below. One of the data/
concepts/values is incorrect. Find the error, then write the correct
version. Write your answers on the blanks provided.

1. One thousand five hundred (1,500) randomly selected pine trees were tested for
traces of the Bark Beetle infestation. It was found that 153 of them showed
such traces. Test the hypothesis that more than 10% of the pine trees have
been infested. (Use 5% level of significance.)
α = 0.5
Ho : p = 0.10
Ha : p > 0.10
zcom = 1.645
ERROR: ___________________
CORRECTED: _____________
2. A sample of 100 students were randomly selected from Pinagpala High School
and 18 of them said they are left-handed. Test the hypothesis that less than
20% of the students are left-handed by using 𝛼 = 0.05 as the level of
significance.

Ho: p = 0.20
Ha: p ≠ 0.20.
zcom = 1.96
𝛼 = 0.05.
It is a one-directional or left-tailed test.
ERROR: __________________
CORRECTED: ____________
3. Newborn babies are more likely to be boys than girls. A random sample found
13,173 boys were born among 25,468 newborn children. The sample
proportion of boys was 0.5172. Is this sample evidence that the birth of boys is
more common than the birth of girls in the entire population?
Ho : p = 0.5
Ha: p > 0.5
Zcom = 5.49
It is a non-directional or two-tailed test.
ERROR: ___________________
CORRECTED: _____________

4. Traditionally, about 70% of students in a Statistics course at ECC are
successful. If only 15 students in a class of 28 randomly selected students are
successful, is there enough evidence at 5% level of significance to say that
students of a particular instructor are successful at a rate of less than 70%?

Ho : p = 0.70
Ha: p < 0.70
P-value = 0.0289
Since P-value < α, we fail to reject the null hypothesis (Ho).
ERROR: ____________________
CORRECTED: ______________

5. For a class project, a Grade 12 STEM student wants to estimate the percentage
of students who are registered voters in his school. From 45% Grade 12
students, he surveys 500 students and finds that 200 are registered voters.
Test the claim at α = 0.05 if there is enough evidence proving that there is a
change in the percentage of registered voters.

α = 0.05
Ho : p ≠ 0.45
Ha: p ≠ 0.45
It is a non-directional test.
ERROR: ____________________
CORRECTION: _____________

Statistics and
Probability
Quarter 2 – Module 15:
Illustrating the Nature of
Bivariate Data

What I Need to Know

Making sound decisions is a very important skill that needs to be
developed among individuals. Some people even claim that a person's life is the
product of every decision he or she makes. Thus, the data and variables involved
should be carefully examined and studied before making decisions. In this
ADM module, you will be introduced to the different types of data that we
usually encounter in real life.

After going through this module, you are expected to:


1. describe the nature of bivariate data;
2. differentiate bivariate data from univariate data; and
3. determine the variables involved in the given bivariate data.

Are you ready now to study bivariate data using your ADM module? Good
luck and may you find it helpful.

What I Know

Directions: Choose the best answer to the given questions or statements.


Write the letter of your choice on a separate sheet of paper.

1. Data that involve two variables are called______.


A. univariate data
B. bivariate data
C. trivariate data
D. multivariate data

2. Which of the following is the statistical procedure used to describe the


relationship of the variables of bivariate data?
A. measures of variation
B. correlation analysis
C. descriptive statistics
D. measures of central tendency

3. Determine the variables involved in the given situation: Cardo
surveyed the daily allowance and the arm span of his 10
classmates and he found out that there is no correlation between the
variables involved.
A. height and arm span of students
B. weight and height of students
C. daily allowance and height of students
D. daily allowance and arm span of students

4. What do you call those data that involve one variable?


A. bivariate
B. multivariate
C. trivariate
D. univariate

5. “A MAPEH teacher wanted to determine the students’ Body Mass
Index (BMI).” What variables does the teacher need?
A. weight of the students
B. height of the students
C. height and weight of the students
D. height and allowance of the students

6. “A teacher computed that the mean percentage score of Grade 8-Integrity
on their 50-item test in Mathematics is 72.50.” What type of
data is illustrated in the situation?
A. bivariate
B. multivariate
C. trivariate
D. univariate

7. A pre-service teacher concluded that based on his study, the number


of minutes a student spends in browsing Facebook is significantly
related to his scores in a set of tests. How many variables are involved
in the study?
A. one
B. two
C. three
D. four

8. From question number 7, what type of data is presented?


A. bivariate
B. multivariate
C. trivariate
D. univariate

9. “Chester’s average grade from his 9 subjects is 92.38.” Which of the


following words will make you decide that the data presented is
univariate?
A. subjects
B. grade
C. average
D. 92.38

10. “A nutritionist advised his patient that the more protein he
consumes, the more weight he will gain.” What are the variables
presented in the given statement?
A. weight and height
B. weight and calorie intake
C. protein consumption and weight
D. protein consumption and visceral fat gain

11. Which of the following is not used to describe data that fall under
univariate category?
A. mean
B. mode
C. correlation analysis
D. measure of dispersion

12. A grade 10 student realized from his Araling Panlipunan subject


that the price of a certain good is inversely proportional to its
supply. What type of data is being presented?
A. multivariate
B. univariate
C. trivariate
D. bivariate

13. According to the records of the World Health Organization (WHO) on
COVID-19 cases around the world, infected persons aged 60 years
and above and those who have comorbidities, or the presence of more
than one disorder, have high chances of dying from the virus.
What type of data did the conclusion come from?
A. bivariate
B. multivariate
C. trivariate
D. univariate

14. From an experiment conducted by a group of researchers, they found
out that students who perform well in Mathematics also perform
well in English based on their test scores. What are
the variables involved in the study?
A. tests in Mathematics and English
B. insufficient information to determine
C. scores in Mathematics and English tests
D. questions on the tests in Mathematics and English

15. “Rommel got the following grades on his 9 subjects: six 90s, one 92,
and two 89s. Without computing the average, he estimated that his
general average would be around 90.” Based on the given situation,
what is/are the variable/s?
A. 9 subjects
B. grades on his 9 subjects
C. average on his 9 subjects
D. his general average and his 9 subjects

Lesson 15
Illustrating the Nature of Bivariate Data
A variable is an attribute or characteristic that may take more than
one value and that can be either measured or classified. The height and weight
of students, number of hours students spend in studying at home, and daily
allowance of students are examples of variables. From such variables,
information is collected and analyzed. If we are given bivariate data, the
degree of association between the two variables can be determined.

In this lesson, we will deal with the nature of variables and data
collected.

What’s In

Where Am I Now?
Directions: Identify the variables involved in the following situations.

Situation | Variable/s Involved
Example: Luffy measured the height of his 10 classmates and determined their average height. | Height
1. Zorro surveyed his cousins’ shoe sizes and weight. |
2. Nami conducted a survey to determine the number of household members in their barangay. |
3. Sanji interviewed 10 students about their daily money allowance and weight. |
4. Teacher Kim recorded his students’ scores from IQ and math tests. |
5. Karina recorded her daily profit in selling cassava cake. |

From the activity, answer the following questions.

1. Are there situations that involve one variable? two variables?


____________________________________________________________________
2. Do you think there are situations that could involve more than two
variables?
____________________________________________________________________

3. If a situation involves two variables, is it necessary for the variables to


be related?
____________________________________________________________________

Notes to the Teacher


Check the student’s level of readiness for the next
topic. If s/he did not answer most of the items and the
guide questions correctly, you may provide another review
activity on identifying variable/s in a given situation.

What’s New

Math Analogy!
Directions: Examine the following sets of words or phrases. Look at the first
pair and examine how the two concepts relate to each other. Then, select the
best word/phrase that would complete the second pair to show the same
relationship.

1. one-wheeled bike: unicycle:: horse with a horn:_________


A. griffin B. merlion C. Pegasus D. unicorn

2. two-wheeled bike: ______:: single-variable data: univariate data


A. bicycle B. jeepney C. motorboat D. tricycle

3. three- wheeled vehicle: tricycle:: two variables: __________


A. bivariate B. multivariate C. non-variate D. univariate

4. bivariate: correlation analysis:: univariate: _______________


A. t-test
B. z-test
C. Pearson r
D. mean, mode, median

5. height of students: univariate:: IQ scores and test scores:


_________
A. bivariate
B. multivariate
C. non-variate
D. univariate

Guide Questions:

1. How is the word “variable” related to the given activity?


_____________________________________________________________________

2. How are you going to describe bivariate data?


_____________________________________________________________________

3. Is there a method or way to determine whether a relationship exists


between the variables in a bivariate data? How?
_____________________________________________________________________

What Is It

Data that involve one variable are called univariate data. Univariate
data are often described using the measures of central tendency (mean or
average, mode, and median), measures of variation, or other descriptive
statistics. Here are examples of univariate data:

Examples | Variable involved
The Department of Health (DOH) recorded the number of infected COVID-19 cases from April 14 to May 21, 2020 in the Philippines. | number of infected cases
The World Health Organization (WHO) summarized the number of COVID-19 recoveries around the world. | number of COVID-19 recoveries
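
To illustrate how univariate data are described, the sketch below computes measures of central tendency and one measure of variation with Python's standard `statistics` module. The list of heights is hypothetical data made up for this example; it does not come from the module.

```python
from statistics import mean, median, mode, stdev

# Hypothetical heights (in cm) of 10 students.
# Only one variable is recorded, so this is univariate data.
heights = [150, 152, 155, 155, 158, 160, 160, 160, 163, 167]

print("mean:", mean(heights))
print("median:", median(heights))
print("mode:", mode(heights))
print("sample standard deviation:", round(stdev(heights), 2))
```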

Data that involve two variables are called bivariate data. The
statistical procedure used to determine and describe the relationship
between two variables is called correlation analysis.

Examples | Variables involved
In Tayabas City public market, a consumer observed that the fewer the supply of vegetables, the higher the price gets. | supply and price of vegetables
The Quezon provincial government emphasized that limiting the number of household members going outside to purchase essential goods will help decrease the rate of COVID-19 infection in the province. | number of household members going outside and rate of COVID-19 infection
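
Correlation analysis on bivariate data can be sketched with the Pearson correlation coefficient r. The example below is a minimal illustration only: the function `pearson_r` is written out by hand, and the supply and price figures are hypothetical numbers chosen to mimic the vegetable example, not data from the module.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of paired data."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical paired observations (two variables, so bivariate data):
# weekly supply of a vegetable in kg and its price in pesos per kg.
supply = [100, 80, 60, 40, 20]
price = [30, 35, 42, 50, 60]

r = pearson_r(supply, price)
print("r =", round(r, 2))
```

A value of r close to –1 suggests a strong negative relationship (price rises as supply falls), a value close to +1 suggests a strong positive relationship, and a value near 0 suggests little or no linear relationship.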

What’s More

Activity 1.1

Directions: Determine the number of variables involved in the following


situations.

Situation | Number of Variables Involved
1. Mr. Gonzales will donate face masks to the people in his barangay. He asked a health worker to survey the number of family members living in each house in his barangay. |
2. To properly compensate employees, the administrative aide records the number of hours the employees work and their respective take-home pay. |
3. A school nurse finds out the number of hours of sleep of 20 students and their weight in kilograms. |
4. A doctor’s secretary records the number of minutes a patient spends for a medical check-up. |
5. A nursing student investigates the number of hours of sleep of 20 patients and their red blood cell count. |

Activity 1.2

Directions: Identify the variable/s in each situation below.

Situation Variable/s

1. Jake, a STEM student, was tasked to conduct a


survey on the number of hours students spend
in playing online games like Mobile Legends.
2. Reid, an Accountancy and Business
Management student, wanted to determine his
classmates’ average daily allowance and their
weight in kilograms.
3. Mea recorded the height of her 8 classmates.

4. Robin asked the height of his friends and their


mothers.
5. Jacent interviewed 5 of her students on the
number of hours they spend in studying a
lesson and their grade in Mathematics 11.
6. An ABM student surveyed his teachers’
monthly salary and their weight.
7. A Grade 7 student interviewed 10 teachers
about their number of years in service.
8. A student determined his classmates’ weight
and their mothers’ weight.
9. A school nurse recorded the age and the blood
pressure of the teachers.
10. A HUMSS student surveyed Grade 8 students
on the number of hours spent in using
Facebook.

Activity 1.3: Univariate or Bivariate?

Directions: Determine whether the following situations involve univariate


or bivariate data.

1. A secretary recorded the daily number of patients a doctor has for a


month during the General Community Quarantine.
2. A researcher observed the number of minutes it takes for students to
answer a worded problem in Math and the number of hours they spend
in studying the subject for a grading period.
3. A researcher records the number of infected COVID-19 patients and the
number of days they spent in the hospital before recovering from the
disease.
4. A housewife finds out that their average electric consumption during the
quarantine period costs P 1,230.00.
5. A group of researchers found out that long hours spent by students in
browsing the Facebook application have a negative effect on their academic
grades.

Activity 1.4:

Directions: Determine the variables in the following situations and identify
whether they involve univariate or bivariate data.

Situation | Variable/s | Univariate or Bivariate
1. A security guard of a supermarket estimates that, on average, the number of customers entering the supermarket’s premises is 85. | |
2. A student researcher concluded that the number of hours of sleep is highly related to the blood count of the students. | |
3. A mother asked her daughters to minimize their electric consumption so their monthly electric bill will not be high. | |
4. A nutritionist advised her patient that few hours of sleep results in unhealthy weight gain. | |
5. A school teacher finds out that, on average, only 30% in each class has internet access in their homes. | |

What I Have Learned

Directions: Complete the following statements. Write the answers in your
notebook.

1. Univariate data consist of only __________ variable.
2. Data that involve two variables are called ____________.
3. The statistical treatments used to describe univariate data are
measures of variation and measures of ____________________ tendency.
4. The statistical analysis that can be used in bivariate data is
_______________.
5. If the data given in an experiment can only be described by the
measures of central tendency and variation, then the type of data given
is ______________.

What I Can Do

Directions: Create/cite three (3) examples of situations observable in your
community that involve bivariate data. Then, answer the questions below.

1. What are the variables present in your examples?
2. Describe the relationship of the variables involved in your examples.

Assessment

Directions: Choose the best answer to the given questions or statements.
Write the letter of your choice on a separate sheet of paper.

1. What do you call a set of data that involves 2 variables?
A. univariate
B. bivariate
C. trivariate
D. multivariate

2. Which of the following situations involves bivariate data?
A. Joan recorded the daily allowance of her 50 classmates.
B. Kassandra recorded the number of minutes 25 gym enthusiasts
spend doing their work-out routines.
C. Zoe estimated that the average number of students with internet
connection in a class of 50 students is 17.
D. Cedrick surveys the purchasing power and the number of hours
spent for overtime work of 50 employees of a certain company.

3. In a Zumba class, an instructor recorded the number of minutes
spent by the 15 participants and the number of calories they burned
within the month. What are the variables presented in the situation
above?
A. number of minutes spent and the burned calories
B. number of sessions attended and the burned calories
C. burned calories and number of days spent in the session
D. number of minutes spent and number of days present during
the class

4. A health enthusiast finds out that the volume of water intake of an
individual has an inverse effect on the accumulation of fats in his
body. Does the situation presented involve bivariate data?
A. No, because there are four variables involved.
B. Yes, because there are two variables involved.
C. No, because there are three variables involved.
D. No, because there is only one variable involved.

5. Determine the variables involved in the situation below:
“Asta’s goal for the summer vacation is to have a healthy and fit body.
He recorded the number of minutes he spends daily in doing
abdominal exercises and his weight for 30 days. He found out that the
longer he does abdominal exercise, the more weight he loses.”
A. Asta’s weight
B. Number of days spent and weight
C. Number of minutes doing abdominal exercise and weight
D. Number of minutes doing abdominal exercise and weight loss

6. “A teacher computed that the mean percentage score of his advisory
class in their Achievement Test in Mathematics is 81.70.” What type of
data is illustrated in the situation?
A. bivariate
B. multivariate
C. trivariate
D. univariate

7. A teacher concluded in his study that the scores obtained by his 50
students in Mathematics and Science examinations are positively
related. How many variables are involved in the study?
A. one
B. two
C. three
D. four

8. Zorro is a hardworking student who supports his studies by selling
cooking oil. Zorro’s average income in selling cooking oil for the past
10 days is Php340. Which of the following will make you decide that
the data is univariate?
A. 10 days
B. Php340
C. cooking oil
D. average income

9. A nutritionist advised his patient that few hours of sleep makes a
person heavier according to studies. What are the variables
presented?
A. hours of sleep
B. weight and calorie intake
C. hours of sleep and weight
D. protein consumption and visceral fat gain

10. Which of the following can be used to describe data that fall under
univariate category?
A. scatter plot
B. scatter diagram
C. correlation analysis
D. measure of central tendency

11. A Grade 11 student learned from his Economics subject that when
the supply of a product is limited, its price gets higher than the
average price. On the other hand, if there is an increase in supply, its
price gets lower. What type of data is being presented?
A. bivariate
B. multivariate
C. trivariate
D. univariate

12. “A teacher found out that 80% to 90% of the students in class
decided to enroll for the incoming school year despite the threat of
COVID-19 infections in their city.” What type of data is presented
above?
A. bivariate
B. univariate
C. multivariate
D. cannot be determined due to lack of data

13. From an experiment conducted by a group of researchers, they found
out that those students who perform well in English may not perform
well in Mathematics based on the results of their test scores. What
are the variables involved in the study?
A. tests in Mathematics and English
B. scores in Mathematics and English tests
C. scores in the test and the test questions
D. test questions in Mathematics and English

14. “Carla got the following grades in her 8 subjects: three 87s, one 90,
two 89s, and two 85s. Without computing the average, she estimated
that her general average would be around 87.” Based on the given
situation, what type of data is presented?
A. bivariate
B. multivariate
C. trivariate
D. univariate

15. “A student asked his 30 classmates their Body Mass Index (BMI) and
the number of glasses of water they drink daily. He found out that
those students who consume 8-12 glasses of water daily have normal
BMI.” What type of data is presented in the situation above?
A. bivariate
B. multivariate
C. trivariate
D. univariate

Additional Activities

Directions: Complete the statements below.

Data collected from surveys, studies, and the like can involve one,
two, or more variables. These quantitative variables are anything
measurable like the height of students, weight, test scores, and many
more. If data involve only one variable, they are called (1) __________ data,
while if data involve two variables, they are called (2) _________ data.

Data that involve one variable are usually described using the
measures of central tendency, namely (3) ___________, median, and mode.
This type of data can also be described using the measures of dispersion.

Data that involve two variables are usually described through the use
of (4) __________ analysis and graphs like a scatter plot or scatter diagram.

Statistics and
Probability
Quarter 2 – Module 16:
Constructing a Scatter Plot

What I Need to Know

In the previous module, you were able to differentiate bivariate data from
univariate data as well as identify the variable/s present in given
situations. Illustrating bivariate data and identifying the variables involved are
important, especially in the research and studies that you will later
encounter in different fields. In this ADM module, we will deal specifically with
bivariate data and how the variables presented are related as you construct a diagram
called a “scatter plot”.

After going through this module, you are expected to:


1. illustrate a scatter plot; and
2. construct a scatter plot based on the given data.

Are you ready now to study how to construct a scatter plot using your ADM
module? Good luck and may you find it helpful.

What I Know

Directions: Choose the best answer to the given questions or statements. Write the
letter of your choice on a separate sheet of paper.

1. Which of the following is a graphical representation that shows how the
points collected from bivariate data are scattered on the Cartesian plane?
A. pictograph
B. scatter plot
C. bar graph
D. line graph

2. In constructing a scatter plot, which of the variables should be plotted
horizontally?
A. consistent variable
B. dependent variable
C. independent variable
D. depends on the given

3. Which geometric figure is used to show the relationship of the variables in a
scatter plot?
A. bar
B. comma
C. line
D. points/dots

4. The following are other names for a scatter plot EXCEPT:


A. xy plot
B. line graph
C. scatter grams
D. scatter diagram

5. Joana read an article from Parenting.Firstcry.com that the height of the
offspring of a couple is inherited from their father. She surveyed the heights
of her 10 classmates and their fathers to construct a scatter plot. Which
variable should be plotted on the x-axis?
A. height of their fathers
B. height of their mothers
C. height of her classmates
D. cannot be determined

6. In constructing a scatter plot, which of the variables should assume the
values of y-coordinate?
A. consistent variable
B. dependent variable
C. independent variable
D. cannot be determined

7. According to Margaret Renkl of Women’s Health, the weight of an individual
is more likely inherited from his/her mother. If data on the weight of
individuals and their mothers are collected and a scatter plot is constructed,
which of the variables will assume the values of the abscissa?
A. not enough data
B. weight of the mother
C. weight of the father
D. weight of the individual

8. Which of the following charts can BEST be used to show the correlation
between two variables?
A. bar graph
B. line graph
C. pictograph
D. scatter plot

9. What do we consider in describing the relationship of the variables in a
scatter plot?
A. shape, trend, and variance
B. mean, mode, and median
C. form, trend, and variation
D. form, direction, and standard deviation

10. The trend of the points in a scatter plot suggests the ______ of the
correlation of the variables.
A. direction
B. shape
C. strength
D. variation

11. The closeness of the points around the line in a scatter plot suggests the
_________ of correlation of the variables.
A. form
B. mean
C. shape
D. variation and strength

12. If the points on a scatter plot follow a trend of a line, then there is _______
correlation between the variables involved.
A. curvilinear
B. form
C. linear
D. non-linear

13. Luffy finds out that the IQ score of a child is inherited from his/her mother.
What is the independent variable in the situation above?
A. IQ score
B. IQ score of the father
C. IQ score of the child
D. IQ score of the mother

14. Which of the following shows the scatter plot of the data below?

Order of period of the subject   1  2  3  4  5  6  7  8
Grades                           85 84 87 88 90 81 80 85

A. C.

B. D.

15. From number 14, what is the variable that will assume the values of
x-coordinate?
A. average grade
B. grades
C. order of the subject
D. cannot be determined

Lesson 16
Constructing a Scatter Plot

From your previous lessons in Mathematics, you learned to plot points on
a Cartesian plane or xy-plane.

Points on the Cartesian plane are represented by an ordered pair (x, y). Starting
from 0 (the origin), move left or right along the x-axis by the x-coordinate.
From where you stop, move upward or downward by the y-coordinate, and then mark
the point where you end.

Example: Plot point A with coordinates (2, 3).

Step 1: Locate 2 on the x-axis.

Step 2: From 2 on the x-axis, move upward 3 units since the y-coordinate is 3.

Step 3: Mark the point where you stop. That point is where
the x- and y-coordinates meet. Name it point A.
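
The three steps above can also be sketched in Python. This is only an illustration: the function name plotting_steps is hypothetical and is not part of this module, and the function simply describes the moves in words.

```python
def plotting_steps(x, y):
    """List the moves for plotting the ordered pair (x, y), starting at the origin.

    Hypothetical helper for illustration only.
    """
    steps = []
    # Step 1: move along the x-axis (right if x is positive, left if negative).
    if x > 0:
        steps.append(f"From the origin, move {x} unit(s) to the right.")
    elif x < 0:
        steps.append(f"From the origin, move {-x} unit(s) to the left.")
    else:
        steps.append("Stay at 0 on the x-axis.")
    # Step 2: from where you stopped, move upward or downward by y.
    if y > 0:
        steps.append(f"Move {y} unit(s) upward.")
    elif y < 0:
        steps.append(f"Move {-y} unit(s) downward.")
    else:
        steps.append("Stay on the x-axis.")
    # Step 3: mark the point where you stopped.
    steps.append(f"Mark the point ({x}, {y}).")
    return steps


# Point A from the example above.
for step in plotting_steps(2, 3):
    print(step)
```

Calling the function with a negative coordinate, such as plotting_steps(-2, -2), gives the moves to the left and downward instead.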

What’s In

Where Am I Now?
Directions: Plot the following points in a Cartesian plane or xy-plane.

1. S (3, 2)
2. C (1, 4)
3. A (-2, 3)
4. T (-3, 3)
5. E (-2, -2)
6. R (2, -4)
7. P (2, 2)
8. L (3, -1)
9. O (0, 3)

Notes to the Teacher


Check the student’s level of readiness for the next topic. If
s/he did not answer most of the items and the guide questions,
you may provide another review activity about plotting of points
in a Cartesian plane.

What’s New

Directions: Refer to the Cartesian plane below. Determine the coordinates of the
points to decode the answer to the question given. Then, write the
name of each point beside its corresponding coordinates.

Question: I am a graphical representation that shows the relationship of the
variables of bivariate data. What am I?

From the activity, what do you call the graph showing the relationship of the
variables of bivariate data?

What Is It

A scatter plot, scatter graph, scatter diagram, or scatter gram is a
graphical representation that shows the relationship or the correlation of two
variables of bivariate data.
A scatter plot shows how points collected from a set of bivariate data are
scattered on a Cartesian plane. It gives a good visual picture of how two variables
are related or associated with one another in terms of the form, trend, and variation of
the correlation. The form of the points in the scatter plot determines the shape of the
correlation of the variables. The trend determines the direction of the points, that is,
whether the variables have positive, negative, or no correlation. The variation or strength of
correlation is based on the closeness of the points to a trend line, and it determines
whether the variables have no, weak, moderate, strong, or perfect correlation.
In constructing a scatter plot, you should know how to plot points on a
Cartesian plane. The independent variable will assume the values of x or abscissa
while the dependent variable will assume the values of y or ordinate.

Example 1:

The given numbers are the age of a person in years and his/her
corresponding weight.

Age of a person (x)   11 12 13 14 15 16 17 18 19 20
Weight (y)            40 42 38 35 45 51 48 48 50 47

Since the weight of an individual depends on his/her age, the independent
variable is the age of the person, which is plotted horizontally. The dependent
variable is the weight of the person, which is plotted vertically as shown in the
scatter plot below.
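
As a minimal sketch of the same idea in Python, the table in Example 1 can be turned into ordered pairs (x, y) and drawn as a rough text plot. The function names to_points and text_scatter and the 5-unit grouping of the y-values are assumptions made for illustration; they are not part of this module, and the text plot is only an approximation of a real graph.

```python
# Data from Example 1: age in years (x) and weight (y).
ages = [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
weights = [40, 42, 38, 35, 45, 51, 48, 48, 50, 47]


def to_points(x_values, y_values):
    """Pair each x-value with its y-value to form ordered pairs (x, y)."""
    if len(x_values) != len(y_values):
        raise ValueError("each x-value needs exactly one y-value")
    return list(zip(x_values, y_values))


def text_scatter(points, y_step=5):
    """Draw a rough text scatter plot.

    Each column is one x-value; each row groups y-values into bands of
    width y_step (an illustrative choice, not a rule from the module).
    """
    xs = sorted({x for x, _ in points})
    y_low = min(y for _, y in points) // y_step * y_step
    y_high = max(y for _, y in points) // y_step * y_step
    rows = []
    for band in range(y_high, y_low - 1, -y_step):
        marks = ""
        for x in xs:
            # Mark the cell if some point has this x and a y inside the band.
            hit = any(px == x and band <= py < band + y_step for px, py in points)
            marks += " * " if hit else " . "
        rows.append(f"{band:>3} |{marks}")
    rows.append("    +" + "---" * len(xs))
    rows.append("     " + "".join(f"{x:>3}" for x in xs))
    return "\n".join(rows)


points = to_points(ages, weights)
print(points)
print(text_scatter(points))
```

The independent variable (age) runs along the bottom row and the dependent variable (weight) along the left side, matching the x-axis and y-axis assignment described in the example.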

Example 2:

A Math teacher conducted a study regarding the performance of Grade 11
students in General Mathematics. Their average grades were taken at different
periods. The data are given below.

Order of period of the subject   1  2  3  4  5  6  7  8
Average grades                   86 88 84 82 82 81 80 79

From the data given, the independent variable is the order of the subject and
the dependent variable is the average grade. From this, order of the subject will be
plotted on the x-axis and grades will be plotted on the y-axis as illustrated below.

Example 3:
A researcher asked for the weight of 10 students together with the weight of
their mother (biological) and created a scatter plot as presented below.

Weight of mother 65 69 74 78 59 81 76 80 81 75
Weight of student 52 55 62 63 47 66 63 69 68 65

In the given data, the independent variable is the weight of the mother while the
dependent variable is the weight of the student. The scatter plot is presented below.
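
The direction (trend) of the points can also be checked with a number. One common choice is Pearson's correlation coefficient r, which is not introduced in this lesson and is used here only as an assumption for illustration. The sketch below applies it to the data of Example 3; the function names pearson_r and direction are hypothetical.

```python
from math import sqrt

# Data from Example 3: weight of mother (x) and weight of student (y).
mother = [65, 69, 74, 78, 59, 81, 76, 80, 81, 75]
student = [52, 55, 62, 63, 47, 66, 63, 69, 68, 65]


def pearson_r(xs, ys):
    """Compute Pearson's correlation coefficient r for paired data."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least 2 pairs")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products and squares of the deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable never changes")
    return sxy / sqrt(sxx * syy)


def direction(r):
    """Describe the trend from the sign of r (illustrative wording)."""
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "no linear trend"


r = pearson_r(mother, student)
print(round(r, 2), direction(r))
```

For this data r is positive and close to 1, which agrees with points that rise from left to right and lie near a line. Running the same functions on the data of Example 2 gives a negative r, since the average grades generally go down as the period order goes up.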

What’s More

Activity 1.1: Construct the scatter plot of the following data by plotting the
points.

1. Sakura interviewed 9 of her classmates on their average daily allowance in pesos
and their weight in kilograms. The results are given below. Construct a scatter
plot of the given data.

Daily allowance 35 55 60 65 45 55 70 70 77
Weight 40 38 45 43 60 41 63 57 60

2. A gym instructor believes that health should be maintained by anyone
regardless of age. That’s why he recorded the ages of his customers and the
number of minutes they spend in exercising. The data are as follows:

Age                 16 18 20 24 26 30 45
Number of minutes   50 65 70 35 45 60 70

[Blank grid for plotting: x-axis Age (0 to 50); y-axis Number of minutes (0 to 80)]
3. A researcher interviewed 10 students about their height and the height of their
father. The results are as follows:

Height of the father (x)    71 69 67 68 68 66 70 72 65 60
Height of the student (y)   71 69 69 65 66 63 68 70 60 58

[Blank grid for plotting: x-axis Height of father (58 to 74); y-axis Height of the student (0 to 80)]

Activity 1.2: Identify the variables (dependent or independent) in the given
situations. Then, construct the scatter plot.

1. Matalino High School is known for students who excel in Math. A researcher
recorded the IQ of the students and their scores on a 50-item Math test as the
focus of his study.

IQ scores     85 87 93 95 87 97 105 110 115 120
Test scores   21 23 30 34 31 35  40  42  45  48

[Blank grid for plotting: horizontal axis (0 to 140), label: ______________; vertical axis (0 to 60), label: ______________]

2. ABM 11 students believe in the value of thriftiness. That’s why they conducted
research on selected elementary learners regarding the amount they save from
their daily allowance and their corresponding weight as follows.

Amount saved in peso   10  8 15 20  5  3  5 25 10 15
Weight in kilogram     38 40 37 36 42 41 39 35 36 37

[Blank grid for plotting: horizontal axis (0 to 30), label: ______________; vertical axis (34 to 43), label: ______________]

3. The data below are aptitude test scores (x) and scores in a long test in
Mathematics (y) of 15 students.

x 38 35 30 28 25 24 20 18 16 15 12 10 8 7 5
y 25 20 17 15 12 15 18 10 12 10 10 10 7 6 5

[Blank grid for plotting: horizontal axis (0 to 40), label: ______________; vertical axis (0 to 30), label: ______________]

Activity 1.3: Based on the scatter plot, determine the data and supply them
on the table below.

1. [Scatter plot: x-axis No. of minutes spent in studying (0 to 140); y-axis Scores in Math test (0 to 50)]

Table:

Number of minutes spent in studying
Scores in Math test

2. [Scatter plot: x-axis Number of minutes spent in playing ML (0 to 100); y-axis Grades (76 to 87)]

Table:

Number of minutes spent in playing ML
Grades

3. [Scatter plot: x-axis No. of produced lumpia (0 to 200); y-axis Price of lumpia (0 to 40)]

Table:

Number of produced lumpia
Price of lumpia

Activity 1.4

Directions: Construct the scatter plot based on the given data. Create the scatter
plot on a separate sheet of graphing paper.

1.
Age of COVID-19 recovered patients          26 28 32 36 40 43 45 22 50 54 34 65
Number of days they spent in the hospital    8  8  9 10 12 12 10  8 14 16 10 18

2.
Number of glasses of water drunk by students per day         10  8 12 11 15  7 12 13  9  7  5 12
Number of glasses of soft drinks drunk by students per day    2  4  2  1  0  4  2  3  4  5  4  2

What I Have Learned

Directions: Complete the following statements. Write the answers in your
notebook.

1. A scatter plot is used to show the relationship or _____________ of two
variables.
2. The __________ variable assumes the values of the x-coordinates.
3. The trend of the points in a scatter plot determines the __________ of the
correlation of the variables.
4. The strength of the correlation of the variables is determined by the
_________ of the points on the trend line.
5. The dependent variable assumes the values plotted along the _______ -axis.

What I Can Do

Ask 15 of your classmates for their arm span and their height. Make a
scatter plot of the data collected on a separate sheet of paper. Your output will be
graded according to the rubric below:
Neatness of the Output 10 points
Accuracy of the Scatter Plot 25 points
Presentation of Data 15 points
Total: 50 points

Assessment

Directions: Choose the best answer to the given questions or statements. Write the
letter of your choice on a separate sheet of paper.
1. It is a graphical representation that shows the correlation of two variables using
points or dots.
A. bar graph
B. line graph
C. pictograph
D. scatter plot

2. The direction of the points on a scatter plot determines the ______ of the
correlation of the variables.
A. form
B. strength
C. trend
D. variation

3. Complete the statement:

The _______ variable is for the x-axis while the ________ variable is for the y-axis.

A. consistent, inconsistent
B. independent, dependent
C. inconsistent, consistent
D. dependent, independent

4. Which of the following tables corresponds to the scatter plot below?

[Scatter plot: horizontal axis 0 to 12; vertical axis 74 to 92]

A. x 2 3 4 5 6 7 8 9 10
y 82 78 81 80 79 85 90 76 75

B. x 2 3 4 5 6 7 8 9 10
y 80 78 81 82 79 83 80 76 75

C.
x 2 3 4 5 6 7 8 9 10
y 80 78 81 80 79 85 90 76 80

D.
x 2 3 4 5 6 7 8 9 10
y 84 78 81 80 79 85 90 76 82

5. Which of the following shows the scatter plot of the data below?

Order of period of the subject   1  2  3  4  5  6  7  8
Grades                           85 84 87 88 90 81 80 85

A. C.

B. D.

6. In constructing a scatter plot, which of the variables should assume the values
of x-coordinate?
A. consistent variable
B. dependent variable
C. independent variable
D. cannot be determined

7. The direction of the points around the line in a scatter plot suggests the
_________ correlation of the variables.
A. form
B. shape
C. trend
D. variation/strength

8. According to a study, the height of an individual is more likely inherited from
his/her father. If data on the height of individuals and their fathers are collected
to construct a scatter plot, which of the variables will be plotted as the
y-coordinate?
A. height of the father
B. height of the individual
C. height of the mother
D. cannot be determined

9. The form of the points in a scatter plot suggests the ______ of the correlation of
the variables whether linear or non-linear.
A. direction
B. shape
C. strength
D. variation

10. The relationship of the variables in a scatter plot is described in terms of its
______________.
A. shape, trend, and variance
B. mean, mode, and median
C. form, trend, and variation
D. form, direction, and standard deviation

11. Nami studied the results of the IQ test scores of children and the recorded IQ
scores of their parents. She noticed that the IQ of a child is more likely related
to the mother than that of the father. If a scatter plot will be constructed, what
is the independent variable in the situation above?
A. IQ score
B. IQ score of the father
C. IQ score of the child
D. IQ score of the mother

12. Which scatter plot is appropriate for the given data below?

x 14 13 17 16 13 15 16
y 160 154 164 162 158 165 170

A.

B.

C.

D.

13. Which of the tables contain the correct data based on the scatter plot below?

A.
x 1 2 3 4 5 6 7
y 92 90 87 90 92 91 88

B.
x 1 2 3 4 5 6 7
y 90 92 85 88 90 90 88

C.
x 1 2 3 4 5 6 7
y 90 92 87 88 91 90 88

D.
x 1 2 3 4 5 6 7
y 92 90 88 87 91 90 88

14. Choose the data that best complete the table based on the scatter plot below.

x 30 40 50 50 60 60 80
y

A.
y 90 92 88 90 87 90 88

B.
y 92 90 88 91 90 90 87

C.
y 91 92 90 88 92 90 88

D.
y 89 92 90 88 91 92 88

15. Mary Joy read from an article that the height of the offspring of a couple is
inherited from the father. She surveyed the height of her 10 classmates and
their fathers to construct a scatter plot. Which variable should be plotted on the
x-axis?
A. height of her classmates
B. height of their mothers
C. height of their fathers
D. cannot be determined

Additional Activities

Form a group consisting of three (3) members. Conduct a survey involving
two variables with at least 15 respondents. Put the data in a table identifying the
independent and dependent variables, then construct a scatter plot. As part of your
advance study, determine the shape, trend, and variation of the variables involved.
Your output will be presented to the class and will be graded using the rubric
below:

Organization of Data 10 points

Accuracy of the Scatter Plot 15 points

Presentation of the Output 15 points


Cooperation Among the Members 10 points
Total 50 points

Statistics and
Probability
Quarter 2 – Module 17:
Describing the Shape (Form), Trend
(Direction), and Variation (Strength)
Based on a Scatter Plot

What I Need to Know

Using a scatter plot, the relationship of the variables involved can be
visualized. The scatter plot shows the shape, trend, and variation of the
variables involved. In this ADM module, you will learn how to describe the
relationship of the variables of bivariate data using a scatter plot.

After going through this module, you are expected to:


1. describe the relationship of variables in terms of shape (form) of the scatter
plot;
2. describe the relationship of variables in terms of trend (direction) of the
scatter plot; and
3. describe the relationship of variables in terms of variation (strength of
association) based on the scatter plot.

Are you ready now to study bivariate data using your ADM module? Good luck and
may you find it helpful.

What I Know

Directions: Choose the best answer to the given questions or statements. Write the
letter of your choice on a separate sheet of paper.

1. If the points on the scatter graph follow a trend of rising from right to left,
how will you describe the correlation of the variables involved?
A. moderate C. positive
B. negative D. zero

2. Joan noticed that the value of one variable corresponds to either low or high
value of the second variable of a set of bivariate data.
What conclusion can you draw from the direction of correlation?
A. The variables have positive correlation.
B. The variables have negative correlation.
C. The variables have moderate correlation.
D. The variables have zero correlation or negligible correlation.

3. Shanks observed that the points on the scatter plot are close AROUND the
trend line. What conclusion can he draw based on the scatter plot?
A. The variables have no correlation.
B. The variables have strong correlation.
C. The variables have perfect correlation.
D. The variables have moderate correlation

4. Zorro noticed that in constructing his scatter plot, the high values of one
variable correspond to low values of the second variable. What conclusion
can you draw from his data?
A. There is zero correlation between the variables.
B. There is a perfect correlation between the variables.
C. There is a positive correlation between the variables.
D. There is a negative correlation between the variables.

5. If the points on the scatter graph rise from left to right, then the variables
involved have a ______ correlation.
A. moderate C. positive
B. negative D. zero

6. Joan noticed that the high value of one variable corresponds to low value of
the second variable or low value of the first corresponds to high value of
second variable. What conclusion can you draw from the direction of
correlation?

A. The variables have zero correlation.
B. The variables have positive correlation.
C. The variables have negative correlation.
D. The variables have moderate correlation.

7. Given the scatter plot below, describe the variation of correlation of the
variables involved.
[Scatter plot: horizontal axis 0 to 60; vertical axis 80 to 95]

A. The variables have strong correlation.


B. The variables have perfect correlation.
C. The variables have moderate correlation.
D. The variables have no correlation or negligible correlation.

8. Zorro noticed that there is a direct relationship between the variables he
collected. What conclusion can you draw from his data?
A. There is zero correlation between the variables.
B. There is a perfect correlation between the variables.
C. There is a positive correlation between the variables.
D. There is a negative correlation between the variables.

9. The strength of the correlation is associated with the ______ of the points to
the trend line on a scatter plot.
A. closeness C. form
B. direction D. number

10. If the points on the scatter plot fall almost on the trend line, then the
variables are said to have _______ correlation.
A. negative C. positive
B. perfect D. strong

11. Noah noticed that the points on the scatter plot follow a trend of rising from
right to left. He also noticed that the points are scattered moderately from
the trend line. What is the correlation of the variables involved?
A. strong negative C. weak negative
B. strong positive D. weak positive

12. Estimate the strength of correlation
of the scatter plot on the right
A. strong negative
B. strong positive
C. weak negative
D. weak positive

13. What conclusion can you draw from the scatter plot below?

A. The variables have perfect correlation.


B. The variables are not related or associated.
C. The variables are moderately and negatively related.
D. The variables involved are strongly and positively related.

14. Complete the statement: “Variables have _____ positive correlation if the
points fall closely to the trend line.”
A. negligible
B. moderate
C. strong
D. weak

15. Zorro noticed that in constructing his scatter plot, the points are scattered
and do not follow any direction. What conclusion can be drawn on the
situation?
A. The variables have perfect correlation.
B. The variables are not related or associated.
C. The variables are moderately and negatively related.
D. The variables involved are strongly and positively related

Lesson 17
Form (Shape), Trend (Direction), and Variation (Strength) of
Scatter Plot

What do you think happens to the height of a person as the person grows older?
Does the person keep getting taller with age, or is there a period when the person
gets a bit shorter after growth has stopped? Likewise, will your monthly electric bill
get higher if you keep increasing your monthly electric consumption? One way to
describe the relationship between such variables is to graph their scatter plot and
analyze the shape, trend, and variation of the points on it.

What’s In

Where am I Now?
Directions: Create a scatter plot for each of the following situations.

1. A researcher believes that the nutrition of the learners has something to do with
their IQ. That is why she conducted a study and recorded the IQ scores of the
learners and their weights.

Weight in kg. 23 45 25 37 50 35 40 45 37 35
IQ scores 85 87 93 95 87 97 105 110 115 120

[Blank grid for the scatter plot: IQ scores (vertical axis, 0 to 140) against weight (horizontal axis, 0 to 70)]

2. An ABM student interviewed 10 students regarding the amount they save
from their allowance and their weight. Data are shown on the table below.

Amount saved in pesos 10 8 15 20 5 3 5 25 10 15
Weight in kg. 38 40 37 36 42 41 39 35 36 37

[Blank grid for the scatter plot: weight (vertical axis, 34 to 43) against amount saved in pesos (horizontal axis, 0 to 30)]

Notes to the Teacher


Check the student’s level of readiness for the next topic. If
s/he did not answer most of the items, you may provide another
review activity about scatter plot.

What’s New

Activity 1. Directions: Study the scatter plot on each situation below and answer
the guide questions.

Situation 1: Teacher Koro recorded the IQ scores of his 10 students and their
average scores in Mathematics. He also constructed a scatter plot of his collected
data as shown below.

Test scores 23 20 33 34 25 34 36 39 45 49
IQ scores 85 87 93 95 87 97 105 110 115 130

[Scatter plot: IQ scores (vertical axis, 0 to 140) against test scores (horizontal axis, 0 to 60)]

Situation 2: Enrique plotted the age and height of 10 individuals in the graph
below.

Age 15 18 23 29 35 40 45 56 65 70
Height 146 150 152 155 157 154 153 148 146 145

[Scatter plot: height (vertical axis, 144 to 158) against age (horizontal axis, 0 to 80)]

Guide Questions:

1. What is the difference between the scatter plots in Situations 1 and 2?
__________________________________________________________________________
2. Which scatter plot has points that follow the trend of a line?
___________________________________________________________________________
3. Which scatter plot has points that follow the trend of a curve?
___________________________________________________________________________
4. Based on the two situations, how would you describe the relationship
of the variables based on the form or shape of the scatter plot?
___________________________________________________________________________
___________________________________________________________________________

The relationship of two variables is called correlation. If the variables
involved have a linear correlation, it can be further described or explained
depending on the form, direction, and strength of the points on the scatter plot.

Do You Remember?

The concept of slope was already discussed in a previous lesson in Mathematics.
Let's see how much you remember from your Grade 8 Algebra.

[Figures: sample lines with a positive slope, a negative slope, and a zero slope]

A line has a positive slope if it rises from left to right. A line has a
negative slope if it falls from left to right (that is, it rises from right to left).
A line with a zero slope is parallel to the x-axis.

Using the same concept, the variables have a positive correlation if the
points on the scatter plot follow a trend of rising from the left to the right
portion of the graph. The variables have a negative correlation if the points on
the scatter plot follow a trend of rising from right to left (falling from left to
right). Finally, the variables have no or negligible correlation if the points are
scattered with no trend or direction of rise.
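As a rough illustration of the slope idea above, the sketch below reads the direction of a trend from the sign of the slope of the least-squares line through the points. The function name `trend_direction` and the use of a least-squares line are assumptions made for this example only; the module itself judges the trend by looking at the scatter plot. The data are the test scores and IQ scores from Situation 1 and the savings and weights from the second item in What's In.

```python
def trend_direction(xs, ys):
    """Return 'positive', 'negative', or 'none' from the sign of the
    least-squares slope (an illustrative shortcut, not the module's method)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    # The least-squares slope is (n*Sxy - Sx*Sy) / (n*Sxx - Sx^2).
    # Its denominator is positive whenever the x-values are not all equal,
    # so the sign of the numerator alone gives the direction of the trend.
    numerator = n * sum_xy - sum_x * sum_y
    if numerator > 0:
        return "positive"
    if numerator < 0:
        return "negative"
    return "none"


test_scores = [23, 20, 33, 34, 25, 34, 36, 39, 45, 49]
iq_scores = [85, 87, 93, 95, 87, 97, 105, 110, 115, 130]
savings = [10, 8, 15, 20, 5, 3, 5, 25, 10, 15]
weights = [38, 40, 37, 36, 42, 41, 39, 35, 36, 37]

print(trend_direction(test_scores, iq_scores))  # positive
print(trend_direction(savings, weights))        # negative
```

With real data the slope is rarely exactly zero, so in practice a slope close to zero, together with widely scattered points, is what suggests no or negligible correlation.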

Activity 2: Positive, Negative, or Zero?

Directions: Determine the trend of correlation based on the scatter plots below.

Scatter Plot | Trend (Positive, Negative, or No/Negligible Correlation)
1.

2.

3.

4.

5.

6.
[Scatter plot: vertical axis from 0 to 70, horizontal axis from 0 to 150]

7.

8.

Aside from the form and trend of the points on a scatter plot, the correlation of
the variables can also be described by the closeness of the points on the scatter plot.
This is called the variation, or simply the strength, of the correlation of the variables.

Activity 3: Stop, Look, and Observe!

Directions: Observe the closeness and the direction of the points on each scatter
plot. Then, answer the questions that follow by writing the letter that corresponds
to your answer.

a. b. c.

d. e. f.

1. Which of the scatter plots above show positive correlation?
A. scatter plot b only C. scatter plots b and f
B. scatter plot f only D. scatter plots b and e
2. Which of the graphs show negative correlation?
A. scatter plot a only C. scatter plots a and c only
B. scatter plot c only D. scatter plots a, c, and d
3. Which scatter plot shows negative correlation with points almost
falling to form a line?
A. scatter plot b C. scatter plot f
B. scatter plot g D. scatter plot d
4. Which of the scatter plots shows a positive correlation with points
widely spread apart?
A. scatter plot a C. scatter plot c
B. scatter plot b D. scatter plot f

5. The scatter plot that shows no correlation or negligible correlation
is _______.
A. scatter plot a C. scatter plot e
B. scatter plot c D. scatter plot f
6. Which of these scatter plots shows negative correlation with points
close to one another?
A. scatter plot c C. scatter plot d
B. scatter plot d D. scatter plot e
7. Which of the scatter plots shows a negative correlation with points
moderately dispersed?
A. scatter plot a C. scatter plot c
B. scatter plot b D. scatter plot d
8. The graph with positive correlation with points moderately spread
apart is ______.
A. scatter plot b C. scatter plot d
B. scatter plot c D. scatter plot f
9. The graph that shows points following no direction and correlation
is ______.
A. scatter plot a C. scatter plot d
B. scatter plot b D. scatter plot e

What Is It

The correlation of the variables can be described in terms of the form (shape),
trend (direction), and variation (strength) of the scatter plot. The form of correlation
is determined by the shape of the points on the scatter plot and is categorized as
linear or curvilinear. The form is linear if the points on the scatter plot follow
the trend of a straight line. The form is curvilinear (non-linear) if the points follow
the trend of a curved line. Sample scatter plots showing the curvilinear form of
correlation are given below.

The correlation of variables can also be described in terms of its trend or
direction. The trend of correlation can be positive, negative, or zero/negligible
depending on the direction of the points. The trend of correlation is summarized in
the table that follows.

Trend: Positive Correlation
Graph: [scatter plot]
Direction of the points: The points follow a trend rising from left to right.
Description: A positive correlation exists when high values of one variable
correspond to high values of another variable or low values of one variable
correspond to low values of another variable.

Trend: Negative Correlation
Graph: [scatter plot]
Direction of the points: The points follow a trend rising from right to left.
Description: A negative correlation exists when high values of one variable
correspond to low values of another variable or low values of one variable
correspond to high values of another variable.

Trend: No Correlation/Negligible Correlation
Graph: [scatter plot]
Direction of the points: The points are neither rising from left to right nor
right to left.
Description: A negligible correlation exists when high values of one variable
correspond to either high or low values of another variable.

The closeness of the points around the trend line determines the variation or
strength of the correlation between the variables involved. The closer the points are
to the trend line, the stronger the correlation of the variables is. The strength of
correlation between two variables can be perfect, strong, moderate, weak, or
no/negligible. To summarize the strength of correlation, refer to the table below.

Correlation: Strong Positive Correlation
Scatter plot: [scatter plot]
Description: This correlation exists when almost all of the points are on the line
or the points are closely scattered on the trend line that rises from left to right.

Correlation: Weak Positive Correlation
Scatter plot: [scatter plot]
Description: Compared to strong positive correlation, the points in this correlation
are scattered a bit far from the trend line that rises from left to right.

Correlation: No Correlation or Negligible Correlation
Scatter plot: [scatter plot]
Description: The points in this correlation do not follow any trend line. The points
are just scattered around the Cartesian plane.

Correlation: Weak Negative Correlation
Scatter plot: [scatter plot]
Description: The points in this correlation are scattered a bit far from the trend
line that rises from right to left.

Correlation: Moderate Negative Correlation
Scatter plot: [scatter plot]
Description: This correlation exists when the points are moderately scattered around
a trend line that rises from right to left.

Correlation: Strong Negative Correlation
Scatter plot: [scatter plot]
Description: This correlation exists when almost all of the points are on the line
or the points are closely scattered on the trend line that rises from right to left.

Two variables can also have perfect positive or perfect negative correlation.
In a scatter plot, variables with perfect correlation will show points that all fall
exactly on a straight line.

What’s More
Activity 1.1: Forms of a Scatter Plot
Directions: Determine whether the form of the given scatter plot is linear or
curvilinear.

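Scatter Plot | Form (Linear or Curvilinear)
1. [Scatter plot: vertical axis 0 to 140, horizontal axis 0 to 60] | ________
2. [Scatter plot: vertical axis 0 to 50, horizontal axis 0 to 80] | ________
3. [Scatter plot: vertical axis 0 to 40, horizontal axis 0 to 80] | ________
4. [Scatter plot: vertical axis 0 to 80, horizontal axis 0 to 80] | ________
5. [Scatter plot: vertical axis 0 to 60, horizontal axis 0 to 80] | ________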

Activity 1.2: Trend of Correlation

Directions: Examine the given variables below and determine the trend of
correlation as to positive, negative, or no/negligible correlation.

Variable 1 | Variable 2 | Trend of Correlation
1. IQ scores | Test scores in an exam | ________
2. Age of a car | Price of the car | ________
3. Number of hours spent in studying | Height of students | ________
4. Number of students enrolled in a course | Number of teachers needed | ________
5. Number of workers hired to paint a building | Number of hours to finish the job | ________
6. Number of snakes in a farm | Number of rats in a farm | ________
7. Electric consumption | Monthly electric bill | ________
8. Height of a person | Weight of a person | ________
9. Speed of a car | Distance travelled | ________
10. Salary of an employee | Number of overtime rendered | ________

Activity 1.3: Let’s Estimate!

Directions: Estimate the variation (strength) of correlation of the following scatter
plots.

1. 2. 3.

_____________________ ____________________
4. 5. 6.

_____________________ _____________________ __________________

7. 8. 9.

_____________________ _____________________ __________________

Activity 1.4: Matchy-Matchy!
Directions: Match Column A with Column B by choosing the letter of the
description under Column B pertaining to the corresponding strength
of correlation listed under Column A. To decode the Word of the Day
below, arrange the letters of your answers accordingly from 1-8.

COLUMN A
_____ 1. Weak Positive Correlation
_____ 2. No Correlation
_____ 3. Perfect Negative Correlation
_____ 4. Strong Positive Correlation
_____ 5. Perfect Positive Correlation
_____ 6. Strong Negative Correlation
_____ 7. No/Negligible Correlation
_____ 8. Weak Negative Correlation

COLUMN B
R. points fall in the trend line that rises from right to left
G. points are closely scattered around the trend line that rises from right to left
T. points are scattered around the Cartesian plane
H. points fall far from the trend line that rises from right to left
S. scattered a bit far from the trend line that rises from left to right
E. points are closely scattered to the trend line that rises from left to right
N. points fall in the trend line that rises from left to right
A. points fall out of the trend line that rises from left to right

Word of the Day: _________________________

What I Have Learned

Directions: Complete the following statements. Write the answers in your
notebook.

The correlation of variables can be determined by studying its scatter plot. The
scatter plot can be described through its form, also known as (1)____. The form of
a scatter plot is either linear or (2)____.
In terms of the (3)____ of correlation, it can be categorized as positive or negative
depending on the behavior of the points. The variables can also have no
correlation.
The strength of correlation, also known as (4)____, is determined by the closeness of
the points to the trend line. If the points are plotted and they all fall on a line,
then there is (5)____ correlation.

What I Can Do

Directions: Identify a pair of variables that falls under each of the following
strengths and directions of correlation. Explain your answer.

Strength of Correlation | Variables Involved
1. Strong Positive Correlation | ________
2. Moderate Positive Correlation | ________
3. Strong Negative Correlation | ________

Assessment

Directions: Choose the best answer to the given questions or statements. Write the
letter of your choice on a separate sheet of paper.

1. If the points on the scatter graph rise from left to right, then the variables
involved have a ______ correlation.
A. moderate C. positive
B. negligible D. zero

2. The strength of the correlation is associated with the ______ of the points to
the trend on a scatter plot.
A. closeness C. form
B. direction D. number

3. Noah noticed that the points on a scatter plot follow a trend rising from right
to left. He also noticed that the points are plotted closely around the trend
line. What is the correlation of the variables involved?
A. strong negative C. weak negative
B. strong positive D. weak positive

4. What conclusion can you draw from the scatter plot below?

A. The variables have perfect correlation.


B. The variables are not related or associated.
C. The variables are moderately and negatively related.
D. The variables involved are strongly and positively related.

5. If the points on the scatter plot fall almost in line, then the variables are said
to have ____ correlation.
A. negative C. positive
B. perfect D. strong

6. Joan noticed that the high value of one variable corresponds to high value of
the second variable or low value of the first corresponds to low value of
second variable. What conclusion can you draw from the direction of
correlation?
A. The variables have zero correlation.

B. The variables have positive correlation.
C. The variables have negative correlation.
D. The variables have moderate correlation.

7. Which of the statements best describes the scatter plot below?

A. The variables have weak and negative linear correlation.


B. The variables have weak and positive linear correlation.
C. The variables have strong and negative linear correlation.
D. The variables have strong and positive linear correlation.

8. Sanji noticed that there is an inverse relationship between the variables he
collected. What conclusion can you draw from his data?
A. There is zero correlation between the variables.
B. There is a perfect correlation between the variables.
C. There is a positive correlation between the variables.
D. There is a negative correlation between the variables.

9. What can you say about the relationship of the variables shown on the
scatter plot below?

A. The variables have a perfect negative correlation.


B. The variables have a perfect positive correlation.
C. The variables have a strong negative correlation.
D. The variables have a strong positive correlation.

10. If the points on the scatter plot fall almost on the trend line, rising from
right to left, then the variables are said to have _______ correlation.
A. perfect negative C. perfect positive
B. strong negative D. strong positive

11. Noelle noticed that the points on a scatter plot follow a trend of rising from
left to right. Noelle also noticed that the points are moderately scattered from
the trend line. What is the correlation of the variables involved?
A. strong negative correlation C. weak negative correlation
B. strong positive correlation D. weak positive correlation

12. Estimate the strength of correlation of the scatter plot on the right.

A. strong negative
B. strong positive
C. weak negative
D. weak positive

13. What conclusion can you draw from the scatter plot below?

A. The variables have perfect correlation.


B. The variables are not related or associated.
C. The variables are moderately and negatively related.
D. The variables involved are strongly and positively related.

14. Complete the statement: “Variables have _____ and ______correlation if the
points rise from left to right falling closely to the trend line.”
A. negative, perfect C. positive, strong
B. negative, strong D. positive, perfect

15. Robin constructed a scatter plot based on the data she collected. What
conclusion can she draw about the relationship of the variables based on the
scatter plot?

A. The variables have strong and negative correlation.


B. The variables are moderately and negatively related.
C. The variables involved are strongly and positively related.
D. The variables are not related or associated with one another.

Additional Activities

Directions: Create a scatter plot based on the given data. Then, determine the
form, trend, and variation of the scatter plot.

1. Age 12 14 15 16 18 19 20 23 24 25
   Weight (in kg) 40 43 48 47 47 49 52 55 50 58

2. Father's height (cm) 166 170 158 178 162 156 175 180 175 183
   Son's height (cm) 160 162 150 165 157 156 170 175 172 180


Statistics and
Probability
Quarter 2 – Module 18:
Calculating the Pearson’s
Sample Correlation Coefficient

What I Need to Know

This module was designed and written with you in mind. It is here to help you
master computing Pearson’s sample correlation coefficient r. The scope of this
module permits its use in many different learning situations. The language used
recognizes the diverse vocabulary level of students. The concepts are arranged to
follow the standard sequence of the learning area.

After going through this module, you are expected to:


1. define Pearson’s sample correlation coefficient r;
2. state the formula for Pearson’s sample correlation coefficient r;
3. compute the Pearson’s sample correlation coefficient r; and
4. apply and solve real-life problems using Pearson’s sample correlation
coefficient.

Are you ready now to study about the calculation of Pearson’s sample correlation
coefficient using your ADM module? Good luck and may you find it helpful.

What I Know

Directions: Choose the best answer to the given questions or statements. Write the
letter of your choice on a separate sheet of paper.

1. Which of the following is a statistical method that measures the strength of
the linear relationship between two variables?
A. z - value
B. scatterplot
C. testing hypothesis
D. Pearson’s sample correlation coefficient

2. Which of the following is the formula for Pearson’s sample correlation
coefficient r?
A. $r = \dfrac{n(\sum xy) - (\sum x)(\sum y)}{\sqrt{[n(\sum x^2) - (\sum x)^2][n(\sum y^2) - (\sum y)^2]}}$
B. $r = \dfrac{(\sum xy) - (\sum x)(\sum y)}{\sqrt{[(\sum x^2) - (\sum x)^2][(\sum y^2) - (\sum y)^2]}}$
C. $r = \dfrac{n(\sum xy) + (\sum x)(\sum y)}{\sqrt{[n(\sum x^2) + (\sum x)^2][n(\sum y^2) + (\sum y)^2]}}$
D. $r = \dfrac{(\sum xy) + (\sum x)(\sum y)}{\sqrt{[(\sum x^2) + (\sum x)^2][(\sum y^2) + (\sum y)^2]}}$

3. In the Pearson r, what does n represent?


A. sum of x-values
B. sum of square x-values
C. number of paired values
D. sum of the products of paired values x and y

4. Which of the following is the first step in computing Pearson’s sample


correlation coefficient r ?
A. Complete the table.
B. Construct the table.
C. Get the sum of all entries in all columns.
D. Substitute all the values obtained by all summations.

5. Which of the following values CANNOT represent a correlation coefficient r ?


A. -1
B. 0
C. 0.25
D. 1.001

6. Based on the bivariate data below, which among the choices is the correctly
constructed table?
X 2 8 11 9
Y 13 20 22 5

A.
X Y XY X² Y²
13 2
20 8
22 11
5 9

B.
X Y X² Y²
2 13
8 20
11 22
9 5

C.
X Y XY X² Y²
2 13
8 20
11 22
9 5

D.
X Y XY X³ Y³
2 13
8 20
11 22
9 5

7. In the bivariate data below, which among the choices is the correct completed
table?
X 1 2 3
Y 18 13 8

A.
X Y XY X² Y²
1 18 18 1 324
2 13 26 4 169
3 8 24 9 64
6 39 68 14 557

B.
X Y XY X² Y²
1 18 18 324 1
2 13 26 169 4
3 8 24 64 9
6 39 68 557 14

C.
X Y XY X² Y²
1 18 1 324 18
2 13 4 169 26
3 8 9 64 24
6 39 14 557 68

D.
X Y XY X² Y²
1 18 1 18 324
2 13 4 26 169
3 8 9 24 64
6 39 14 68 557

8. Using the summation values below, what is the value of Pearson r?
$n = 4,\ \sum X = 10,\ \sum Y = 15,\ \sum XY = 39,\ \sum X^2 = 30,\ \sum Y^2 = 65$
A. -0.02
B. 0
C. 0.23
D. 1

9. Using the summation values below, what is the value of Pearson r?
$n = 3,\ \sum X = 6,\ \sum Y = 39,\ \sum XY = 68,\ \sum X^2 = 14,\ \sum Y^2 = 557$
A. -1
B. -0.74
C. 0
D. 0.39

For numbers 10-12, refer to the following bivariate data:

X 1 2 3
Y 5 9 8

10. Which of the following is the CORRECT completed table for the bivariate
data?

A.
X Y XY X² Y²
1 5 1 25 5
2 9 4 81 18
3 8 9 64 24
6 22 14 170 47

B.
X Y XY X² Y²
1 5 1 5 25
2 9 4 18 81
3 8 9 24 64
6 22 14 47 170

C.
X Y XY X² Y²
1 5 5 1 25
2 9 18 4 81
3 8 24 9 64
6 22 47 14 170

D.
X Y XY X² Y²
1 5 5 25 1
2 9 18 81 4
3 8 24 64 9
6 22 47 170 14

11. When you substitute all the summation ( ∑ ) values in the formula for
Pearson r, which among the choices is its best representation?
A. $r = \dfrac{4(47) - (6)(22)}{\sqrt{[3(14) - 22^2][3(170) - 6^2]}}$
B. $r = \dfrac{4(47) - (6)(22)}{\sqrt{[3(14) - 6^2][3(170) - 22^2]}}$
C. $r = \dfrac{3(47) - (6)(22)}{\sqrt{[3(14) - 22^2][3(170) - 6^2]}}$
D. $r = \dfrac{3(47) - (6)(22)}{\sqrt{[3(14) - 6^2][3(170) - 22^2]}}$

12. What is the value of r ?


A. 0.93
B. 0.72
C. 0.16
D. -0.16

For numbers 13-15, refer to the bivariate data below:

X 3 2 0 1 3
Y 10 24 21 15 28

13. Which of the following is the CORRECT completed table for the bivariate
data?

A.
X Y XY X² Y²
10 3 30 9 100
24 2 48 4 576
21 0 0 0 441
15 1 15 1 225
28 3 84 9 784
98 9 177 23 2126

B.
X Y X² Y² XY²
3 10 30 9 100
2 24 48 4 576
0 21 0 0 441
1 15 15 1 225
3 28 84 9 784
9 98 177 23 2126

C.
X Y XY X² Y²
10 3 30 100 9
24 2 48 576 4
21 0 0 441 0
15 1 15 225 1
28 3 84 784 9
98 9 177 2126 23

D.
X Y XY X² Y²
3 10 30 9 100
2 24 48 4 576
0 21 0 0 441
1 15 15 1 225
3 28 84 9 784
9 98 177 23 2126

14. When you substitute all the summation ( ∑ ) values in the formula for
Pearson r, which among the choices is its best representation?
A. $r = \dfrac{5(177) - (9)(98)}{\sqrt{[5(23) - 9^2][5(2126) - 98^2]}}$
B. $r = \dfrac{5(177) + (9)(98)}{\sqrt{[5(23) - 9^2][5(2126) - 98^2]}}$
C. $r = \dfrac{5(2126) - (9)(98)}{\sqrt{[5(23) - 9^2][5(2126) - 98^2]}}$
D. $r = \dfrac{5(2126) + (9)(98)}{\sqrt{[5(23) + 9^2][5(2126) + 98^2]}}$

15. What is the value of r?


A. 0
B. 0.02
C. 0.16
D. 0.61

Lesson 18: Calculating the Pearson’s Sample Correlation Coefficient

In the previous lesson, we learned about bivariate data and pairs of variables
that are related to each other. We also learned how to construct the scatter plots of
these bivariate data and determine the strength and direction of their association or
relationship based on how the points are scattered. In this module, you will focus on
the correlation of bivariate data. Check your readiness for this lesson by answering
the following exercises.

What’s In

Directions: Identify the trend and strength of correlation of the scatter plots below.
Choose your answer from the choices inside the box.

perfect positive correlation perfect negative correlation


strong positive correlation strong negative correlation
weak positive correlation weak negative correlation
no or negligible correlation

1. 3.

2. 4.

5.

How can we determine if there is a correlation between two variables, X and
Y? By observing the scatter plot, you can tell if the correlation is positive, negative,
or non-existent. If the points on the scatter plot closely resemble a straight line, then
the correlation may be positive or negative depending on the trend of the line. It has
a positive correlation if the line is increasing or rising from left to right. It has a
negative correlation if the line is decreasing or trending downward from left to
right. Meanwhile, the variables have no or negligible correlation if the points are
scattered randomly on the scatter plot.

We can only estimate the direction and strength of the relationship between
variables using a scatter plot. Is there a way to get the exact direction and strength
of the relationship between variables? Just like any other measurement, correlation
between two variables can be represented by a single number. This number can
determine exactly whether the relationship is negative or positive. It can also tell
exactly the degree or strength of the relationship. Let’s try the next activity.

What’s New

The following tables show the bivariate data x and y. Without constructing a
scatter plot, tell whether they have positive, negative, or no/negligible correlation.
Then, briefly explain your answer.

1. ____________________________
x 1 2 3 4 5 6 ____________________________
y 5 10 10 15 25 30 ____________________________

2. ____________________________
x 1 3 11 10 6 9 ____________________________
y 14 6 12 11 10 9 ____________________________

3. ____________________________
x 10 8 6 4 2 1 ____________________________
y 16 19 26 24 29 36 ____________________________

Guide Questions:

1. How do you assess the bivariate data to determine the trend of its
correlation?
___________________________________________________________________________
___________________________________________________________________________

2. Do you think it is easy to determine the trend of its correlation? Why or
why not?
___________________________________________________________________________
___________________________________________________________________________

3. Is there a way to get the exact number that will represent its correlation?
___________________________________________________________________________
___________________________________________________________________________

The scatter plot helps us visualize the relationship of the variables in
bivariate data. However, only the trend of the correlation can be exactly determined.
We can only estimate the degree of the association, that is, whether the variables
have a weak, moderate, or strong relationship. Meanwhile, there is a statistical method
that can be used to evaluate the strength of the relationship between two quantitative
variables.

What Is It

The Pearson’s sample correlation coefficient (also known as Pearson r),
denoted by r, is a statistic that measures the strength of the linear relationship
between two variables. To find r, the following formula is used:

$$r = \frac{n(\sum XY) - (\sum X)(\sum Y)}{\sqrt{[n(\sum X^2) - (\sum X)^2][n(\sum Y^2) - (\sum Y)^2]}}$$

Here n is the number of paired values of X and Y.

The correlation coefficient (r) is a number between -1 and 1 that describes
both the strength and the direction of correlation. In symbols, we write -1 ≤ r ≤ 1.
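The formula can also be written as a short program. The sketch below is only an illustration: the function name `pearson_r` and the two small data sets are assumptions made for this example and do not come from the module. It computes the five sums that appear in the formula and shows that a perfectly rising set of points gives r = 1 while a perfectly falling set gives r = -1, the two ends of the range -1 ≤ r ≤ 1.

```python
import math


def pearson_r(xs, ys):
    """Pearson's sample correlation coefficient r for paired values."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same number of values")
    n = len(xs)  # number of paired values
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    # The denominator is zero if all x-values (or all y-values) are equal;
    # r is undefined in that case and this sketch does not handle it.
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


rising = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])   # points rise along a line
falling = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])  # points fall along a line
print(rising, falling)  # 1.0 -1.0
```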

Illustrative Example:
Teachers of Pag-asa National High School instilled among their students the
value of time management and excellence in everything they do. The table below
shows the time in hours spent in studying (X) by six Grade 11 students and their
scores in a test (Y). Solve for the Pearson’s sample correlation coefficient r.

X 1 2 3 4 5 6
Y 5 10 10 15 25 30

The next section will guide you on how to compute the Pearson’s sample
correlation coefficient r.

STEPS AND SOLUTION

Step 1. Construct a table with the columns X, Y, XY, X², and Y², and enter the
given values of X and Y.

X Y XY X² Y²
1 5
2 10
3 10
4 15
5 25
6 30

Step 2. Complete the table.
a. Multiply the entries in the X and Y columns. Put them under the XY column.
b. Square all the entries in the X column. Put them under the X² column.
c. Square all the entries in the Y column. Put them under the Y² column.

X Y XY X² Y²
1 5 5 1 25
2 10 20 4 100
3 10 30 9 100
4 15 60 16 225
5 25 125 25 625
6 30 180 36 900

Step 3. Get the sum of the entries in each column.
a. Get the sum of all entries in the X column. This is ∑X.
b. Get the sum of all entries in the Y column. This is ∑Y.
c. Get the sum of all entries in the XY column. This is ∑XY.
d. Get the sum of all entries in the X² column. This is ∑X².
e. Get the sum of all entries in the Y² column. This is ∑Y².

X Y XY X² Y²
1 5 5 1 25
2 10 20 4 100
3 10 30 9 100
4 15 60 16 225
5 25 125 25 625
6 30 180 36 900
∑X = 21 ∑Y = 95 ∑XY = 420 ∑X² = 91 ∑Y² = 1,975

Step 4. Substitute the values obtained from Step 3 in the formula. Here n = 6
because there are six (6) pairs of values. You may use your calculator here!

$$r = \frac{n(\sum XY) - (\sum X)(\sum Y)}{\sqrt{[n(\sum X^2) - (\sum X)^2][n(\sum Y^2) - (\sum Y)^2]}}$$

$$= \frac{6(420) - (21)(95)}{\sqrt{[6(91) - (21)^2][6(1{,}975) - (95)^2]}}$$

$$= \frac{2{,}520 - 1{,}995}{\sqrt{[546 - 441][11{,}850 - 9{,}025]}}$$

$$= \frac{525}{\sqrt{[105][2{,}825]}}$$

$$= \frac{525}{\sqrt{296{,}625}}$$

$$r \approx 0.96395 \text{ or } 0.96$$

The value of r is a positive number. Therefore, we can say accurately that there
is a positive correlation between the hours spent in studying and the scores in
the test.

Note: For consistency of our answers, round your final answer to two
decimal places.
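The four steps above can also be checked with a short program. The sketch below follows the same order as the solution: it builds the table rows, adds up each column, and substitutes the sums into the formula. The variable names (`hours`, `scores`, `rows`) are assumptions made for this illustration; the data are the hours and test scores from the illustrative example.

```python
import math

hours = [1, 2, 3, 4, 5, 6]         # X: hours spent in studying
scores = [5, 10, 10, 15, 25, 30]   # Y: scores in the test

# Steps 1 and 2: build the table rows (X, Y, XY, X^2, Y^2).
rows = [(x, y, x * y, x * x, y * y) for x, y in zip(hours, scores)]

# Step 3: get the sum of each column.
sum_x, sum_y, sum_xy, sum_x2, sum_y2 = (sum(column) for column in zip(*rows))

# Step 4: substitute the sums into the formula.
n = len(rows)  # number of paired values
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

for row in rows:
    print(*row)
print("sums:", sum_x, sum_y, sum_xy, sum_x2, sum_y2)  # sums: 21 95 420 91 1975
print("r =", round(r, 2))                              # r = 0.96
```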

In the next module, we will interpret the strength indicated by the computed value
of r, and we will solve more real-life problems using Pearson r. In the meantime,
let’s focus on computing the Pearson’s sample correlation coefficient r.

Let’s try to answer all the activities that follow.

Notes to the Teacher


In the next part, the value of r will be interpreted
given a scale. The numbers in the scale are expressed
in two decimal places. Tell students that for consistency
of answers, they should round the value of r into two
decimal places except for a very small number which is
nearly zero. During computation, rounding of partial
answer may be done in three to four decimal places.
Also, the symbol ≈ will be used for r to indicate that the
numbers were rounded off.
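To illustrate the rounding advice in this note, the sketch below recomputes the final step of the illustrative example in two ways: once with the unrounded square root and once with the square root rounded to three decimal places first. The variable names are assumptions made for this example. Both routes give the same value of r after rounding to two decimal places.

```python
import math

numerator = 525
radicand = 296625  # 105 x 2,825 from the illustrative example

# Route 1: keep full calculator precision until the end.
exact_r = numerator / math.sqrt(radicand)

# Route 2: round the partial answer (the square root) to three decimal places.
rounded_root = round(math.sqrt(radicand), 3)
partial_r = numerator / rounded_root

print(round(exact_r, 2), round(partial_r, 2))  # 0.96 0.96
```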

What’s More

Activity 1.1 Let Me Guide You!


In this activity, you will be guided on how to compute the Pearson’s sample
correlation coefficient r. First, fill in the blank parts of the table with the correct
values of each cell. After completing the table, get the sum of each column. Then,
substitute the values obtained in the given formula. Finally, perform the indicated
operations to calculate the value of r.

X 1 3 4 5 7
Y 35 20 15 10 15

X | Y | XY | X² | Y²
1 | 35 | 35 | ____ | ____
3 | 20 | ____ | 9 | 400
4 | 15 | 60 | ____ | 225
5 | 10 | ____ | 25 | ____
7 | 15 | 105 | ____ | 225
∑X = 20 | ∑Y = ____ | ∑XY = 310 | ∑X² = ____ | ∑Y² = 2,175

n = 5 (since there are 5 pairs of values)

Be careful in substituting the values; make sure they are correct. Always double check.

$$r = \frac{n(\sum xy) - (\sum x)(\sum y)}{\sqrt{[n(\sum x^2) - (\sum x)^2][n(\sum y^2) - (\sum y)^2]}}$$

$$= \frac{5(310) - (20)(\_\_\_\_)}{\sqrt{[5(\_\_\_\_) - (20)^2][5(2{,}175) - (\_\_\_\_)^2]}}$$

$$= \frac{1{,}550 - \_\_\_\_}{\sqrt{[\_\_\_\_ - 400][10{,}875 - \_\_\_\_]}}$$

$$= \frac{-\_\_\_\_}{\sqrt{[\_\_\_\_][\_\_\_\_]}}$$

$$= \frac{-\_\_\_\_}{\sqrt{\_\_\_\_}}$$

$$= \frac{-\_\_\_\_}{\_\_\_\_}$$

$$r \approx -0.81$$

Activity 1.2 Complete Me!
Directions: Complete the table below. Then, fill in the blanks in the formula to
arrive at the computed Pearson r.

X Y XY X2 Y2
15 5 225
23 3
11 8 64
9 10 100
15 8 64
20 20 400
∑ 𝑋= ∑ 𝑌= ∑ 𝑋𝑌= ∑ 𝑋2 = ∑ 𝑌2 =
_____ _____ 842 1,581 _____

$$r = \frac{n(\sum xy) - (\sum x)(\sum y)}{\sqrt{[n(\sum x^2) - (\sum x)^2][n(\sum y^2) - (\sum y)^2]}}$$

n = ____

$$r = \frac{\_\_\_(842) - (\_\_\_)(\_\_\_)}{\sqrt{[\_\_\_(1{,}581) - (\_\_\_)^2][\_\_\_(\_\_\_) - (\_\_\_)^2]}}$$

r ≈ 0.03

Activity 1.3 Let Me Guide You Because…


Directions: The title of the activity is incomplete. To reveal the real message,
follow the given directions. Substitute the given sums into the
formula of Pearson’s sample correlation coefficient. Then, compute the
value of r. Choose your answer from the LETTER BOX below. Write the
letter that corresponds to your answer in the DECODING AREA. (Show
your solution.)

1. n = 5, ∑X = 17, ∑Y = 85, ∑XY = 375, ∑X² = 75, ∑Y² = 1,875

2. n = 8, ∑X = 72, ∑Y = 105, ∑XY = 1,020, ∑X² = 816, ∑Y² = 1,725

3. n = 6, ∑X = 22, ∑Y = 34, ∑XY = 79, ∑X² = 734, ∑Y² = 364

Letter Box
| Y | T | I | L | R |
|---|---|---|---|---|
| 1 | 0.73 | -0.14 | 0.31 | 0 |

DECODE…
Let me guide you because…
3 2 1
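
When only the sums are given, as in this activity, the substitution step can be sketched in code as well. The following Python sketch is an illustration only; the function name `pearson_r_from_sums` and the sample sums are assumptions made for this example and do not come from the items above.

```python
from math import sqrt

def pearson_r_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Substitute the given sums into the Pearson r formula.

    The name and argument order are hypothetical, chosen for this sketch.
    """
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Made-up sums from the pairs (1, 2), (2, 4), (3, 5), (4, 4), (5, 5);
# these are NOT taken from the activity items.
r = pearson_r_from_sums(n=5, sum_x=15, sum_y=20, sum_xy=66, sum_x2=55, sum_y2=86)
print(round(r, 2))  # 0.77
```

Keeping the numerator and the denominator as separate steps mirrors the hand computation, which makes it easier to locate a substitution error.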

Activity 1.4 You Can Do It!
In Mapalad Integrated High School, a guidance counselor believes that
aptitude score is related to performance. The following sample data obtained from
six students show their aptitude and performance scores. Compute the Pearson r.
Show your solution.

| Aptitude Score (X) | Quarterly Assessment Score (Y) |
|---|---|
| 8 | 14 |
| 15 | 5 |
| 11 | 8 |
| 7 | 12 |
| 5 | 2 |
| 10 | 11 |

What I Have Learned

Answer the following questions below:


1. What do you call a statistical method that measures the strength of correlation
between two variables?
2. To find Pearson r, what is the formula to be used?
3. Briefly discuss the steps in computing the Pearson’s sample correlation
coefficient r.

What I Can Do

Ask 10 of your classmates about their previous grades in Mathematics and
Science. Create a table for the data obtained from the survey and solve for
Pearson’s sample correlation coefficient r between the grades in Mathematics and
Science.

| Names | Previous Grade in Mathematics | Previous Grade in Science |
|---|---|---|
| 1. | | |
| 2. | | |
| 3. | | |
| 4. | | |
| 5. | | |
| 6. | | |
| 7. | | |
| 8. | | |
| 9. | | |
| 10. | | |

Assessment

Directions: Choose the best answer to the given questions or statements. Write the
letter of your choice on a separate sheet of paper.

1. Which of the following is used to measure the strength of the association between
bivariate data?
A. z – value
B. diagram
C. Pearson - b
D. Pearson’s sample correlation coefficient

2. Which of the following is the Pearson r formula?


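A. $r = \frac{(\sum xy) - (\sum x)(\sum y)}{\sqrt{[(\sum x^2) - (\sum x)^2][(\sum y^2) - (\sum y)^2]}}$

B. $r = \frac{n(\sum xy) + (\sum x)(\sum y)}{\sqrt{[n(\sum x^2) + (\sum x)^2][n(\sum y^2) + (\sum y)^2]}}$

C. $r = \frac{n(\sum xy) - (\sum x)(\sum y)}{\sqrt{[n(\sum x^2) - (\sum x)^2][n(\sum y^2) - (\sum y)^2]}}$

D. $r = \frac{(\sum xy) + (\sum x)(\sum y)}{\sqrt{[(\sum x^2) + (\sum x)^2][(\sum y^2) + (\sum y)^2]}}$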

3. In the formula of Pearson r, what is the meaning of ∑ 𝑥𝑦 ?


A. sum of x-values
B. sum of square x-values
C. sum of the square of paired values x and y
D. sum of the products of paired values x and y
4. In computing Pearson r, which of the following is the next step after obtaining
the sum of all entries in all columns in the table?
A. Construct a table.
B. Complete the table.
C. Simplify and compute for the value of r.
D. Substitute all the sum and n in the formula.
5. Which of the following is the range of the correlation coefficient (r)?
A. 0 ≤ r ≤ 1
B. 1 ≤ r ≤ -1
C. -1 < r < 1
D. -1 ≤ r ≤ 1

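6. In the given bivariate data, which among the choices is the correctly constructed table?

| X | -1 | 0 | 1 | 2 |
|---|---|---|---|---|
| Y | 10 | 13 | 9 | 15 |

A.
| X | Y | XY | X² | Y² |
|---|---|---|---|---|
| -1 | 10 | | | |
| 0 | 13 | | | |
| 1 | 9 | | | |
| 2 | 15 | | | |

B.
| X | Y | XY | X² | Y² |
|---|---|---|---|---|
| -1 | 15 | | | |
| 0 | 9 | | | |
| 1 | 13 | | | |
| 2 | 10 | | | |

C.
| X | Y | XY | X² | Y² |
|---|---|---|---|---|
| 10 | -1 | | | |
| 13 | 0 | | | |
| 9 | 1 | | | |
| 15 | 2 | | | |

D.
| X | Y | XY | X³ | Y³ |
|---|---|---|---|---|
| -1 | 15 | | | |
| 0 | 9 | | | |
| 1 | 13 | | | |
| 2 | 10 | | | |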

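7. In the bivariate data below, which among the choices is the correct completed table?

| X | 2 | 4 | 6 |
|---|---|---|---|
| Y | 1 | 3 | 5 |

A.
| X | Y | XY | X² | Y² |
|---|---|---|---|---|
| 2 | 1 | 2 | 4 | 1 |
| 4 | 3 | 12 | 16 | 9 |
| 6 | 5 | 30 | 36 | 25 |
| 12 | 9 | 35 | 56 | 44 |

B.
| X | Y | XY | X² | Y² |
|---|---|---|---|---|
| 2 | 1 | 2 | 1 | 4 |
| 4 | 3 | 12 | 9 | 16 |
| 6 | 5 | 30 | 25 | 36 |
| 12 | 9 | 44 | 35 | 56 |

C.
| X | Y | XY | X² | Y² |
|---|---|---|---|---|
| 2 | 1 | 2 | 4 | 1 |
| 4 | 3 | 12 | 16 | 9 |
| 6 | 5 | 30 | 36 | 25 |
| 12 | 9 | 44 | 56 | 35 |

D.
| X | Y | X² | Y² | XY |
|---|---|---|---|---|
| 2 | 1 | 2 | 4 | 1 |
| 4 | 3 | 12 | 16 | 9 |
| 6 | 5 | 30 | 36 | 25 |
| 12 | 9 | 44 | 56 | 35 |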

8. Using the given summation values below, what is the value of Pearson r?
n = 3, ∑X = 6, ∑Y = 30, ∑XY = 60, ∑X² = 14, ∑Y² = 450
A. -0.06
B. 0
C. 0.11
D. 1

9. Using the given summation values below, what is the value of Pearson r?
n = 5, ∑X = 10, ∑Y = 15, ∑XY = 90, ∑X² = 60, ∑Y² = 135
A. -1
B. 0
C. 0.99
D. 1
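For numbers 10-12, refer to the following bivariate data:

| X | 1 | 2 | 3 |
|---|---|---|---|
| Y | 10 | 8 | 9 |

10. Which of the following is the CORRECT completed table for the bivariate data?

A.
| X | Y | XY | X² | Y² |
|---|---|---|---|---|
| 1 | 10 | 100 | 10 | 1 |
| 2 | 8 | 64 | 16 | 4 |
| 3 | 9 | 81 | 27 | 9 |
| 6 | 27 | 245 | 53 | 14 |

B.
| X | Y | XY | X² | Y² |
|---|---|---|---|---|
| 1 | 10 | 10 | 1 | 100 |
| 2 | 8 | 16 | 4 | 64 |
| 3 | 9 | 27 | 9 | 81 |
| 6 | 27 | 53 | 14 | 245 |

C.
| X | Y | XY | X² | Y² |
|---|---|---|---|---|
| 1 | 10 | 100 | 10 | 1 |
| 2 | 8 | 64 | 16 | 4 |
| 3 | 9 | 81 | 27 | 9 |
| 6 | 27 | 245 | 53 | 14 |

D.
| X | Y | XY | X² | Y² |
|---|---|---|---|---|
| 1 | 10 | 10 | 1 | 100 |
| 2 | 8 | 16 | 4 | 64 |
| 3 | 9 | 27 | 9 | 81 |
| 6 | 27 | 53 | 245 | 14 |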

11. When you substitute all the summation ( ∑ ) values in the formula for
Pearson r, which among the choices is its best representation?
A. $r = \frac{3(53) - (6)(27)}{\sqrt{[3(14) - 6^2][3(245) - 27^2]}}$

B. $r = \frac{3(53) - (6)(27)}{\sqrt{[3(14) + 6^2][3(245) + 27^2]}}$

C. $r = \frac{3(27) - (6)(27)}{\sqrt{[3(14) - 6^2][3(245) - 27^2]}}$

D. $r = \frac{3(53) - (6)(27)}{\sqrt{[3(6) - 14^2][3(27) - 245^2]}}$

12. What is the value of r?
A. 0.95
B. 0.75
C. -0.25
D. -0.5

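For numbers 13-15, refer to the following bivariate data:

| X | -2 | 0 | 3 | 4 | 1 |
|---|---|---|---|---|---|
| Y | 8 | 5 | 8 | 13 | 20 |

13. Which of the following is the CORRECT completed table for the bivariate data?

A.
| X | Y | XY | X² | Y² |
|---|---|---|---|---|
| -2 | 8 | -16 | 4 | 64 |
| 0 | 5 | 0 | 0 | 25 |
| 3 | 8 | 24 | 9 | 64 |
| 4 | 13 | 52 | 16 | 169 |
| 1 | 20 | 20 | 1 | 400 |
| 6 | 54 | 80 | 722 | 30 |

B.
| X | Y | XY | X² | Y² |
|---|---|---|---|---|
| -2 | 8 | 64 | 4 | -16 |
| 0 | 5 | 25 | 0 | 0 |
| 3 | 8 | 64 | 9 | 24 |
| 4 | 13 | 169 | 16 | 52 |
| 1 | 20 | 400 | 1 | 20 |
| 6 | 54 | 722 | 30 | 80 |

C.
| X | Y | XY | X² | Y² |
|---|---|---|---|---|
| -2 | 8 | -16 | 4 | 64 |
| 0 | 5 | 0 | 0 | 25 |
| 3 | 8 | 24 | 9 | 64 |
| 4 | 13 | 52 | 16 | 169 |
| 1 | 20 | 20 | 1 | 400 |
| 6 | 54 | 80 | 30 | 722 |

D.
| X | Y | XY | X² | Y² |
|---|---|---|---|---|
| -2 | 8 | -16 | 64 | 4 |
| 0 | 5 | 0 | 25 | 0 |
| 3 | 8 | 24 | 64 | 9 |
| 4 | 13 | 52 | 169 | 16 |
| 1 | 20 | 20 | 400 | 1 |
| 6 | 54 | 80 | 722 | 30 |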

14. When you substitute all the summation ( ∑ ) values in the formula for
Pearson r, which among the choices is its best representation?
A. $r = \frac{5(80) - (6)(54)}{\sqrt{[5(30) - 6^2][5(722) - 54^2]}}$

B. $r = \frac{5(80) - (6)(54)}{\sqrt{[5(30) + 6^2][5(722) + 54^2]}}$

C. $r = \frac{5(80) - (6)(54)}{\sqrt{[5(6) - 30^2][5(54) - 722^2]}}$

D. $r = \frac{(6)(54) - 5(80)}{\sqrt{[5(30) - 6^2][5(722) - 54^2]}}$

15. What is the value of r?


A. -0.27
B. 0.27
C. 0.48
D. 0.84

Additional Activities

An ice cream vendor records the maximum daily temperature and the
number of ice creams he sells each day. An eight-day result is shown in the
table below.

| Maximum Temperature (°C) | 26 | 28 | 24 | 28 | 23 | 24 | 27 | 32 |
|---|---|---|---|---|---|---|---|---|
| Number of Ice Creams Sold | 21 | 38 | 42 | 47 | 29 | 19 | 52 | 56 |

Follow the directions below:

1. Display the data in a scatter plot and identify the trend of correlation.
2. Compute the Pearson’s sample correlation coefficient r.
3. Interpret the result of the data.

Statistics and
Probability
Quarter 2 – Module 19:
Solving Problems Involving
Correlation Analysis

What I Need to Know

In the previous lessons, you familiarized yourself with the concepts of
correlation in bivariate data. You learned how to construct a scatter plot and
identify the form, direction, and strength of correlation. You also learned
how to compute Pearson’s sample correlation coefficient. In this module, you will
first review how to calculate the correlation coefficient r. Then, the succeeding
activities will help you master solving problems involving correlation analysis,
especially interpreting Pearson’s r.

After going through this module, you are expected to:


1. compute the Pearson’s sample correlation coefficient r ;
2. interpret the strength of correlation between two variables based on the
computed correlation coefficient; and
3. solve real-life problems using Pearson’s sample correlation coefficient.

What I Know

Directions: Choose the best answer to the given questions or statements. Write the
letter of your choice on a separate sheet of paper.

1. What is used to determine the existence, strength, and direction of
relationship between bivariate data?
A. correlation
B. regression
C. hypothesis
D. interpretation

2. Miguel needs to analyze the strength of the relationship between two variables.
What is the correct statistical method he needs to conduct?
A. z-test
B. Pearson’s r
C. Pearson’s co
D. regression analysis

3. Which of the following Pearson coefficients indicates the strongest
positive correlation?
A. r = -0.93
B. r = -0.81
C. r = 0.58
D. r = 0.75

4. Given the scatter plot below, describe the strength of correlation of the
variables involved.
A. The variables have no
correlation.
B. The variables have perfect
positive correlation.
C. The variables have strong
positive correlation.
D. The variables have weak
positive correlation.

5. In question number 4, which of the following values of r represents the
scatter plot?
A. 1
B. 0.98
C. 0.49
D. -0.99

6. Which of the following interpretations best describes the given scatter plot?
A. As the values of x increase, the values of y increase.
B. As the values of x increase, the values of y remain the same.
C. As the values of x decrease, the values of y decrease.
D. As the values of x increase, the values of y decrease.

7. A recent study conducted in a school found that as a student’s number of
absences increases, academic performance tends to decline. Which of the
following values of r is appropriate to represent the correlation of the
variables in the given statement?
A. -0.48
B. 0
C. 0.15
D. 1

8. In a survey, the correlation coefficient r between drinking coffee and the
number of hours you stay awake was found to be 0.87. Which of the following
statements best describes the result?
A. As you drink more coffee, the number of hours you stay awake stays
the same.
B. As you drink more coffee, the number of hours you stay awake
increases.
C. As you drink more coffee, the number of hours you stay awake
decreases.
D. As you drink less coffee, the number of hours you stay awake tends to
increase.

9. Researchers found that the correlation coefficient r between the number
of hours people spend watching television and their body weight was nearly 0.
Which of the following best describes the result?
A. The longer the time of watching television, the more a person gains
weight.
B. The longer the time of watching television, the more a person loses
weight.
C. The lesser the time of watching television, the more a person gains
weight.
D. There is no relationship between the number of hours watching
television and weight of a person.

For numbers 10-12, refer to the following situation:

A trainer wants to find out the relationship between the person’s height and
pulse rate. The table below shows the gathered data from 7 people.

| Height in inches (X) | 62 | 64 | 71 | 64 | 75 | 67 | 66 |
|---|---|---|---|---|---|---|---|
| Pulse rate per minute (Y) | 103 | 90 | 83 | 71 | 85 | 87 | 69 |

10. What is the computed Pearson’s sample correlation coefficient?
A. -0.75
B. -0.28
C. -0.17
D. 0.123

11. What is the strength of correlation?


A. no correlation
B. strong negative
C. weak negative
D. weak positive

12. Based on the findings, which of the following best describes the result?
A. The taller a person, the higher his pulse rate.
B. The taller a person, the lower is his pulse rate.
C. The shorter a person, the lower is his pulse rate.
D. There is no correlation between a person’s height and his pulse rate.

For numbers 13-15, refer to the situation below:


Mr. Antonio always instilled among his students the value of striving hard for
excellence. That’s why he wants to determine if a relationship exists between the
number of hours students spend in studying and their final grade. The table below
indicates the final exam grade and the number of hours spent in studying for a
Mathematics exam.

| Study hours (X) | 2 | 4 | 3 | 1 | 5 | 2 |
|---|---|---|---|---|---|---|
| Final exam grade (Y) | 76 | 89 | 83 | 69 | 91 | 74 |

13. What is the computed Pearson’s sample correlation coefficient?


A. -1
B. -0.98
C. 0.98
D. 1

14. What is the strength of correlation?


A. perfect negative
B. perfect positive
C. strong positive
D. weak positive

15. Based on the findings, which of the following best describes the result?
A. The more time a student spends studying, the higher the final exam
grade.
B. A student spending less time studying tends to have a higher final
exam grade.
C. The more time a student spends studying, the lower the final exam
grade.
D. No correlation exists between the hours spent studying and
the final exam grade of the students.

Lesson 19
Solving Problems Involving Correlation Analysis

Correlation analysis is one of the most important statistical tools that you may
consider employing in conducting your research studies. In this module, you will
learn how to solve problems involving correlation analysis. Remember your previous
lessons on describing the form, direction, and strength of association between two
variables and on calculating Pearson’s r. The skills you learned in the previous
modules will help you understand the concepts presented here. You will dig deeper
into the concept of correlation analysis by solving problems and interpreting the
results in the context of the problems presented.

First, recall how to calculate Pearson’s sample correlation coefficient by
answering the first activity.

What’s In

In correlation analysis, we calculate a sample correlation coefficient,
specifically Pearson’s sample correlation coefficient (Pearson’s r). Given the
bivariate data below, compute Pearson’s r.

| Student | Number of Absences | Final Average Grade |
|---|---|---|
| 1 | 16 | 77 |
| 2 | 2 | 86 |
| 3 | 14 | 75 |
| 4 | 9 | 87 |
| 5 | 8 | 85 |
| 6 | 4 | 86 |
| 7 | 2 | 89 |

What is the value of Pearson’s r? What does this value mean in terms of the
students’ number of absences and their final average grade? Can you interpret the
value of Pearson’s r? To give you more background on problems involving correlation
analysis, let’s have another activity.

What’s New

Directions: Determine the trend of correlation of the situations below. Then,
estimate and interpret the r value. Choose your answers from the
choices in the box below.

| Bivariate Data | Direction/Trend of Correlation | Estimated r Value | Degree/Strength of Correlation |
|---|---|---|---|
| 1. age of a child and his clothing size | | | |
| 2. volume of alcohol intake and level of safety in driving a car | | | |
| 3. weight of a person and his/her skill level in memorizing a poem | | | |
| 4. hours spent working out at the gym and the volume of body fats | | | |
| 5. score in Quarterly Assessment in Mathematics and number of hours spent studying Mathematics | | | |

Choices:

| Direction/Trend | Estimated r | Degree/Strength of Correlation |
|---|---|---|
| positive | 0.0001 | strong |
| negative | -0.75 | no/negligible |
| no/negligible | 0.98 | weak |
| | | perfect |

Guide Questions:

1. How did you find the activity?


______________________________________________________________________________
______________________________________________________________________________

2. How did you come up with the trend of the correlation in each item?
______________________________________________________________________________
______________________________________________________________________________

3. Did you find it easy to pick from the choices for the estimated r value? Why?
______________________________________________________________________________
______________________________________________________________________________

4. What helped you decide the degree/strength of correlation?

______________________________________________________________________________
______________________________________________________________________________

5. Knowing that r value ranges from -1 to 1, make a scale for the degree/strength of
correlation as to no/negligible, weak, strong, and perfect. What are your
considerations in deciding the boundaries for each category?
______________________________________________________________________________
______________________________________________________________________________

What Is It

Correlation is used to determine the existence, strength, and direction of the
relationship between two variables. The correlation coefficient r is a number
between -1 and 1 that describes both the strength and the direction of correlation.
In symbols, we write -1 ≤ r ≤ 1.

In the first activity in What’s In, you solved for the value of r and identified its
trend. In What’s New activity, you identified the trends, estimated the values of r,
and based on the values, you chose the correct descriptions of the strength of the
correlation. So now, we will interpret r value by looking at the scale that gives both
strength and direction of correlation.

Using the correlation scale, we can determine the strength of the correlation
coefficient r. For example, you have r = 0.63 which means that there is a “strong
positive correlation” between the two variables. To interpret, we can simply state it
this way: “As x values increase, y values also increase and vice versa.”

In interpreting the linear relationship between two variables, refer to the value
of r and the correlation scale. We can state our interpretation in different ways.
To solve problems involving correlation analysis, you must know how
to calculate the value of r and interpret this value using the scale. Since computing
the value of r is a necessary skill, you may go back to the previous lesson if you feel
that you haven’t mastered it yet. Otherwise, proceed to the following examples of
interpreting r.

Scenario: Filipino employees are known for being persistent and hardworking. That
is why they truly value every single cent of their salary. Here are some
situations showing the relationship between the salary and spending of a
Filipino employee.

Situation 1: In a survey, the correlation coefficient r between the salary and
spending of employees was found to be 0.97.
Interpretation: There is a “strong positive correlation” between salary and spending
of employees.

Situation 2: In another survey, the correlation coefficient r between the salary and
spending of employees was found to be 0.38.
Interpretation: There is a “weak positive correlation” between salary and spending
of employees.

Situation 3: In another survey, the correlation coefficient r between the salary and
spending of employees was found to be -0.81.
Interpretation: There is a “strong negative correlation” between salary and spending
of employees.

Situation 4: In another survey, the correlation coefficient r between the salary and
spending of employees was found to be -0.19.
Interpretation: There is a “weak negative correlation” between salary and spending
of employees.

For more examples, see the table below:

| Bivariate Data | Computed Pearson’s r | Interpretation |
|---|---|---|
| Temperature and the number of hot chocolate products sold | -0.781 | There is a strong negative correlation between the temperature and the number of hot chocolate products sold. |
| Amount of coffee intake and number of hours you stay awake | 0.426 | There is a weak positive correlation between the amount of coffee intake and number of hours you stay awake. |
| Height and salary of teachers | 0 | There is no correlation between the height and salary of teachers. |
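
The interpretation rule used in these examples can be written as a small decision procedure. The Python sketch below is an illustration under stated assumptions: the function name `describe_correlation` is hypothetical, the cutoffs of 0.5 for "strong" and exactly 1 for "perfect" are read off the examples in this lesson, and treating values that round to 0.00 as "no or negligible" and counting exactly 0.5 as strong are assumptions, since the scale itself is not reproduced in the text.

```python
def describe_correlation(r):
    """Describe the strength and direction of a correlation coefficient r.

    Hypothetical helper; cutoffs are inferred from the lesson's examples.
    """
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    magnitude = abs(r)
    # Assumption: values that round to 0.00 are treated as negligible.
    if magnitude < 0.005:
        return "no or negligible correlation"
    direction = "positive" if r > 0 else "negative"
    if magnitude == 1:
        strength = "perfect"
    elif magnitude >= 0.5:   # assumption: exactly 0.5 counts as strong
        strength = "strong"
    else:
        strength = "weak"
    return f"{strength} {direction} correlation"

# Values taken from the situations and the table in this lesson
for r in (0.97, 0.38, -0.81, -0.19, -0.781, 0.426, 0):
    print(r, "->", describe_correlation(r))
```

A different textbook may use other boundaries, so the cutoffs should be adjusted to whichever scale a class has agreed to use.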

If data are in a scatter plot, we can determine the strength of correlation and
value of r by estimating it. Refer to the given examples below:

| Scatter Plot | Estimated Strength of the Correlation | Estimated Value of r |
|---|---|---|
| [scatter plot] | Strong Positive Correlation | The value of r should be in the range between 0.5 and 1. We can say 0.8 or 0.75 as long as it is within the range in the correlation scale. |
| [scatter plot] | Weak Negative Correlation | The value of r should be in the range between 0 and -0.5. We can say -0.39 as long as it is within the range in the correlation scale. |

The closeness of the points around the trend line determines the strength
of the correlation. The closer the points to the trend line, the stronger the
correlation of the variables is.

This is comparable to Filipino family ties, right? The closeness of the family
members leads to a stronger family relationship.
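
The link between closeness to the trend line and the strength of correlation can be seen numerically. The following Python sketch uses made-up data, not data from this module: two sets of points are built around the same line y = 2x, one with small deviations and one with large deviations. The helper name `pearson_r` and all values are assumptions for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from the column-sum formula (hypothetical helper name)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

x = [1, 2, 3, 4, 5, 6, 7, 8]

# Made-up deviations from the trend line y = 2x
small_deviation = [0.1, -0.1, 0.2, -0.2, 0.1, -0.1, 0.2, -0.2]
large_deviation = [3, -3, 6, -6, 3, -3, 6, -6]

y_close = [2 * xi + d for xi, d in zip(x, small_deviation)]  # points hug the line
y_far = [2 * xi + d for xi, d in zip(x, large_deviation)]    # points are scattered

r_close = pearson_r(x, y_close)
r_far = pearson_r(x, y_far)
print(r_close > r_far)  # True: the closer points give the larger r
```

Both data sets follow the same upward trend, so the only thing that changes the value of r here is how far the points stray from the line.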

Notes to the Teacher


Other authors use different scales for interpreting
correlations, which may confuse students. Tell
them that we will use the presented scale throughout this
course.

What’s More

Activity 1.1 Is My Value Enough? Then, Why?


Directions: Using the correlation scale, identify the strength of correlation in each
value of r.

| Values of r | Strength of Correlation |
|---|---|
| 0.16 | weak positive correlation |
| -1 | 1. |
| -0.94 | 2. |
| 0.78 | 3. |
| 0.43 | 4. |
| -0.19 | 5. |
| 0 | 6. |
| 1 | 7. |

Activity 1.2 Who Is He?


He was the proponent of Pearson’s sample correlation
coefficient during the latter half of the nineteenth century while
conducting a series of studies on individual differences with Sir
Francis Galton. It is now referred to as Pearson's r.

Directions: Based on the scatter plot, ENCIRCLE its corresponding strength of
correlation and choose the value of r from the LETTER BOX below. Write
on the DECODE section the letter that corresponds to your answer. The
letters arranged accordingly will spell out the FIRST NAME of Pearson.

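| Item | Scatter Plot | Strength of Correlation | Value of r |
|---|---|---|---|
| 1. | [scatter plot] | Perfect Positive Correlation / Perfect Negative Correlation | |
| 2. | [scatter plot] | Strong Negative Correlation / No Correlation | |
| 3. | [scatter plot] | Strong Positive Correlation / Weak Positive Correlation | |
| 4. | [scatter plot] | Strong Negative Correlation / Weak Negative Correlation | |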

LETTER BOX
| R | S | A | I | L | K |
|---|---|---|---|---|---|
| 0.26 | -1 | 0 | -0.27 | -0.83 | 1 |

DECODE…
PEARSON
1 2 3 4

Activity 1.3 R You Interpreting?

Directions: The following are the results of several studies with their
correlation coefficients. Make an interpretation for each survey. Item
number 1 is already given as an example.

1. In a survey, the correlation coefficient r between engine size and fuel
consumption was found to be -0.9.

Interpretation: There is a strong negative correlation between the engine size
and fuel consumption.

2. In another survey, the correlation coefficient r between engine size and fuel
consumption was found to be -0.261.

Interpretation: ___________________________________________________________
___________________________________________________________

3. In a survey, the correlation between the number of hours per week students
spent studying and their performance in an exam was found to be 0.72.

Interpretation: ___________________________________________________________
___________________________________________________________

4. In another survey, the correlation between the number of hours per week
students spent studying and their performance in an exam was found to be
0.483.

Interpretation: ____________________________________________________________
____________________________________________________________

5. In a survey, the correlation between educational attainment and amount of
income was found to be 0.88.

Interpretation: ____________________________________________________________
____________________________________________________________

6. In a survey, the correlation between educational attainment and amount of
income was found to be -0.4.

Interpretation: ____________________________________________________________
____________________________________________________________

Activity 1.4 Read, Solve, Analyze, and Then Interpret

Directions: Read the following situations. Using each data, calculate the Pearson’s
sample correlation coefficient. After obtaining Pearson’s r, analyze and
interpret the result. (Show your solution.)

1. The table shows the data obtained from six students of Mapalad Integrated
High School in a study about the number of hours a student exercises each
week and the score s/he gets in a test.

| Student | Hours (X) | Score (Y) |
|---|---|---|
| A | 1 | 25 |
| B | 2 | 5 |
| C | 3 | 20 |
| D | 4 | 40 |
| E | 5 | 25 |
| F | 6 | 9 |

2. A group of Senior High School students is conducting collaborative research
to determine whether there is a correlation between the age of
tricycles (in years) in a certain city and their mileage (in kilometers per liter).
The data are shown below.

| Age of a tricycle, in yrs (X) | 0.5 | 1 | 1.5 | 2 | 3 | 4 |
|---|---|---|---|---|---|---|
| Mileage, in km/liter (Y) | 16 | 14 | 10 | 12 | 10 | 12 |

Notes to the Teacher
Tell your students to be cautious in interpreting
correlations. It is easy to conclude that because two variables
have a strong correlation, one variable causes the change in the
other. However, correlation does not imply causation
(a cause-and-effect relationship).
Also, the size of the sample affects the computed
correlation. Generally, we calculate the sample correlation to
infer the true correlation of our population. The larger the
sample, the more reliable the obtained correlation is. That’s why
a significance test should be done to assess the sample
correlation’s reliability.
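
The note does not say which significance test is meant. One common choice, assumed here, is the t-test for a correlation coefficient, with test statistic t = r√(n − 2) / √(1 − r²) and n − 2 degrees of freedom. The Python sketch below only computes this statistic; the function name `correlation_t_statistic` and the sample sizes are illustrative assumptions.

```python
from math import sqrt

def correlation_t_statistic(r, n):
    """t statistic for testing whether a sample correlation differs from zero.

    Hypothetical helper; assumes the usual t-test with n - 2 degrees of freedom.
    """
    if n <= 2:
        raise ValueError("at least 3 pairs of values are needed")
    if abs(r) >= 1:
        raise ValueError("the statistic is undefined for a perfect correlation")
    return r * sqrt((n - 2) / (1 - r ** 2))

# Same correlation (r = 0.63), two made-up sample sizes
t_small_sample = correlation_t_statistic(0.63, 10)
t_large_sample = correlation_t_statistic(0.63, 30)
print(t_large_sample > t_small_sample)  # True
```

For the same value of r, the larger sample gives the larger test statistic, which reflects the point that a correlation obtained from more pairs of values is more reliable.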

What I Have Learned

Directions: Fill in the blanks to complete the statements.


1. ________________ is used to determine the existence, strength, and direction of
relationship between two variables.
2. A statistical method to know the correlation between bivariate data is called
_____________________________.
3. Correlation coefficient r is a number between ___ and ___ that describes both
the strength and the direction of correlation. In symbol, we write_____________.

Fill in the boxes with the corresponding strength of correlation.

Briefly explain how to interpret the value of r.


____________________________________________________________________________________________
____________________________________________________________________________________________
____________________________________________________________________________________________

What I Can Do

The following are data on the height of a father and his eldest son, in inches:

| Height of the father (X) | 70 | 66 | 69 | 65 | 68 | 72 |
|---|---|---|---|---|---|---|
| Height of the eldest son (Y) | 69 | 68 | 65 | 61 | 68 | 70 |

Do the data support the hypothesis that height is hereditary? Explain.

Support your explanation with statistical computations.

Assessment

Directions: Choose the best answer to the given questions or statements. Write the
letter of your choice on a separate sheet of paper.

1. Which of the following statements about correlation is NOT correct?
A. Every correlation has strength only.
B. Correlation is a number from -1 to 1.
C. A correlation can either be positive or negative.
D. Correlation determines the direction of the relationship between two variables.

2. Miguel needs to analyze the strength of the relationship between two variables.
What is the correct statistical method he needs to conduct?
A. z-test
B. Pearson co
C. Pearson alpha
D. Pearson’s sample correlation coefficient

3. Which of the following Pearson coefficients indicates the weakest
negative correlation?
A. 0.45
B. -0.1
C. -0.45
D. -0.98

4. Given the scatter plot below, describe the strength of correlation of the
variables involved.
A. The variables have perfect positive
correlation.
B. The variables have strong positive
correlation.
C. The variables have strong negative
correlation.
D. The variables have weak negative
correlation.

5. In question number 4, which of the following values of r represents the
scatter plot?
A. 0.001
B. 0
C. -0.47
D. -0.85

6. Which of the following interpretations best describes the given scatter plot?
A. As x values increase,
y values increase.
B. As x values increase,
y values decrease.
C. As x values increase, y values
remain the same.
D. As x values decrease,
y values increase.

7. There is a survey indicating that as weather gets colder, air conditioning costs
decrease. Which of the following values of r is appropriate to the result of the
survey?
A. -0.93
B. 0
C. 0.19
D. 0.87

8. The correlation coefficient r between the number of bags of popcorn and the
number of sodas sold at each performance of the circus over one week was
found to be 0.62. Which conclusion can be drawn from the result?
A. There is no correlation between popcorn sales and soda sales.
B. There is a strong positive correlation between popcorn sales and soda
sales.
C. There is a weak positive correlation between popcorn sales and soda
sales.
D. There is a strong negative correlation between popcorn sales and soda
sales.

9. An ornithologist, a person who studies every aspect of birds, found out that the
correlation coefficient r between wing length and tail length of 12 different
species of bird was 0.43. Which conclusion can be drawn from the result?
A. A bird with longer wing length has shorter tail length.
B. A bird with shorter wing length has longer tail length.
C. A bird with longer wing length also has longer tail length.
D. There is no correlation between bird’s wing length and tail length.

For numbers 10-12, refer to the following situation:

The law of demand is an economic principle that explains the correlation
between the price of a good and its demand. The table below shows the price of a
certain good and its quantity of demand.

| Price in Peso | 11 | 12 | 13 | 16 | 18 | 19 | 20 |
|---|---|---|---|---|---|---|---|
| Demand | 38 | 31 | 26 | 23 | 20 | 20 | 17 |

10. What is the computed Pearson’s sample correlation coefficient?


A. 0.94
B. 0.04
C. -0.84
D. -0.94

11. What is the strength of correlation?


A. weak positive
B. strong positive
C. strong negative
D. perfect negative

12. Based on the findings, which of the following best describes the result?
A. As price of good increases, the quantity of demand increases.
B. As price of good decreases, the quantity of demand decreases.
C. As price of good increases, the quantity of demand decreases.
D. As price of good increases, the quantity of demand remains the same.

For numbers 13-15, refer to the situation below:

A Mathematics teacher is interested in finding out if critical and scientific
thinking exists among students who are good in Mathematics and Science. Thus, he
conducted a study and gathered the scores of his respondents in Math and Science.
The following data have been obtained:

| Score in Mathematics (X) | Score in Science (Y) |
|---|---|
| 12 | 13 |
| 10 | 9 |
| 5 | 8 |
| 7 | 8 |
| 11 | 14 |
| 6 | 7 |

13. What is the computed Pearson’s sample correlation coefficient?


A. 0.80
B. 0.78
C. 0.52
D. 0.23

14. What is the strength of correlation?
A. strong negative
B. perfect positive
C. strong positive
D. weak positive

15. Based on the findings, which of the following best describes the result?
A. A student who is good in Mathematics is also good in Science.
B. A student who is good in Mathematics is not good in Science.
C. A student who is not good in Mathematics is good in Science.
D. There is no correlation between the performance of the students in
Mathematics and Science.

Additional Activities

Solve the following problems.

1. Compute the correlation coefficient of the following bivariate data. Then, give a
conclusion based on the results.

| Number of years owned (X) | Selling price (Y) |
|---|---|
| 1 | 23 |
| 2 | 20 |
| 3 | 17 |
| 4 | 14 |
| 5 | 11 |

2. As shown in the table below, a person’s heart rate during exercise changes as he
gets older. Compute the Pearson’s r, then interpret the result.

Age (years)    Heart rate (beats per minute)
20             135
22             134
24             132
25             132
27             129
30             130
35             130
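
A hand computation of Pearson's r can be checked with a short script. The sketch below is not part of the module. It assumes the usual computational formula, r = [nΣxy − (Σx)(Σy)] / sqrt{[nΣx² − (Σx)²][nΣy² − (Σy)²]}, and uses a hypothetical function name (pearson_r) and made-up data, so substitute the X and Y values from the tables above to check your own answers.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return Pearson's sample correlation coefficient for paired data."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same number of values")
    n = len(xs)
    if n < 2:
        raise ValueError("at least two pairs are needed")

    # Sums used by the computational formula.
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when either variable is constant")
    return numerator / denominator


# Hypothetical data (not from the module): hours studied and quiz score.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
r = pearson_r(hours, scores)
print(round(r, 2))
```

The sign of r gives the direction of the relationship, and its distance from zero gives the strength.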

Statistics and Probability
Quarter 2 – Module 20:
Identifying Dependent and Independent Variables

What I Need to Know

This module was designed and written with you in mind. In this module, you
will be introduced to the two types of variables: dependent and independent.
Furthermore, you will learn how to distinguish the two.

After going through this module, you are expected to:


1. define dependent and independent variables; and
2. identify the dependent and independent variables in a sentence or problem.

Are you ready now to study using your ADM module? Good luck and may
you find it helpful.

What I Know

Directions: Choose the best answer to the given questions or statements.
Write the letter of your choice on a separate sheet of paper.

1. Which of the following statements is FALSE about bivariate data?
A. It involves one variable.
B. It involves two variables.
C. It deals with causes or relationships.
D. Its two variables are dependent and independent.

2. Which of the following variables depends on other factors, is measured, and
is presumed to be the “effect”?
A. bivariate
B. constant
C. dependent
D. independent

3. Which of the following variables describes something that is stable and
unaffected by the other variables you are trying to measure and is presumed to
be the “cause”?
A. bivariate
B. constant
C. dependent
D. independent

4. What variable is something that is influenced and affected?


A. bivariate
B. constant
C. dependent
D. independent

5. Marco believes that health is wealth. That is why as part of his routine, he
jogs every morning. The number of calories he burns during a jog depends
on the distance he jogs. In the given situation, which are the variables?
A. Marco
B. number of calories
C. distance Marco jogs
D. number of calories Marco burns and distance Marco jogs

6. Which of the following concepts is TRUE about dependent and independent
variables?
A. The dependent variable is the one being studied and measured.
B. The dependent variable is the one you do not expect to change.
C. The dependent variable causes a change in the independent variable.
D. The dependent variable is the variable the experimenter changes or
controls.

7. In graphing the scatter plot, where do you plot the independent variable?
A. origin
B. x-axis
C. y-axis
D. z-axis

8. Andrew knows that through education, he can achieve his dreams. That is
why to support his schooling, Andrew is doing a part-time job. For each hour
of working, he earns Php40. What is the independent variable?
A. Php 40
B. part-time job
C. amount earned
D. hours spent in doing part-time job

9. Jeny and Camille are preparing relief goods to serve at an evacuation center.
The more relief goods they prepare, the more people they will be able to
serve. Which of the following is the dependent variable?
A. Jeny and Camille
B. evacuation center
C. number of relief goods prepared
D. number of people they will be able to serve

10. Suppose a study found that spending time with friends or family decreases
the amount of stress someone is feeling and allows them to perform better
on tests. In the given statement, which represents the dependent variable?
A. total number of tests
B. spending time with friends or family
C. spending stress with friends or family
D. amount of stress someone is feeling

11. Which of the following sentence structures is CORRECT in featuring
dependent and independent variables?
A. Body weight depends on the amount of food intake.
B. The amount of food intake depends on body weight.
C. The number of hours you work out depends on body weight.
D. The number of hours you work out depends on food intake.

12. Which of the following sentence structures is CORRECT in featuring
dependent and independent variables?
A. How far you can drive depends on the weight of the car.
B. The amount of gas you have depends on your driver’s license.
C. The amount of gas you have depends on how far you can drive.
D. How far you can drive depends on the amount of gas you have.

13. If the dependent variable is the plant’s height, which of the following
CANNOT possibly be the independent variable?
A. amount of water provided to the plant
B. type of fertilizer provided to the plant
C. exposure of plant to classical music
D. amount of sunlight received by the plant

14. If the independent variable is the number of minutes spent on watching
Math YouTube videos, which of the following is the best possible dependent
variable?
A. number of likes
B. number of views
C. number of subscribers
D. academic performance in Math

15. If the dependent variable is the speed of goldfish growth, which of the
following can possibly be the independent variable?
A. type of food
B. type of water
C. size of aquarium
D. all of the above

Lesson 20
Dependent and Independent Variables

It is important to understand variables because they are what a study measures
and interprets from the given data. A study may consider several variables of
many types and levels, but in this lesson you will identify the independent and
dependent variables in given situations. Check your readiness for this lesson by
answering the following exercises.

What’s In

Recall the Variable!


Directions: Encircle the variables in each situation below and determine whether
the situation involves univariate or bivariate data.

Statements                                                            Univariate or Bivariate
1. A researcher looked into the salary/income and civil status
   of government employees.                                           ____________________
2. The veterinarian listed the weight of the newborn puppies.         ____________________
3. The school nurse recorded the age and the blood pressure
   of the teachers.                                                   ____________________
4. A vendor keeps track of how much ice candy they sell
   every day versus the daily temperature.                            ____________________
5. A STEM student surveyed Grade 10 students on the number of
   hours spent using a cell phone and their previous grade.           ____________________

Guide Questions:
1. How were you able to differentiate univariate from bivariate data?
___________________________________________________________

2. Which among the statements above may deal with a relationship between
variables?
___________________________________________________________

When we examine bivariate data, the two variables could depend on each other,
and one variable could influence the other. With this in mind, try the next
activity.

What’s New

Are You in a Relationship?

Directions: Match the pictures in Column A to Column B. Then, answer the
questions that follow.

Column A Column B

____ 1. A.

____ 2. B.

____ 3. C.

____ 4. D.

____ 5. E.

Guide Questions:

1. How did you match each picture in Column A to Column B? What are the things
you considered?
______________________________________________________________________

2. In general, what do the pictures in Column A and Column B represent?


______________________________________________________________________

3. Suppose you didn’t eat breakfast, what could possibly happen to you?
______________________________________________________________________

4. If you didn’t pass an examination, what could possibly be the reason?


______________________________________________________________________

5. Which situation do you think should happen first before another situation
happens? Explain your answer.
_______________________________________________________________________

In the activity, you were able to identify situations that involve a
relationship between two variables, a skill that will help you understand the
next lesson. Your language skills in dealing with cause-and-effect
relationships will also help. Let’s find out as you go through the lesson.

Notes to the Teacher


Check the student’s level of readiness for the next topic. If s/he
did not answer most of the items and the guide questions, you may
provide another activity involving situations with related variables.

What Is It

In the previous activity, you became familiar with bivariate data. Bivariate
data always involve two variables. One of these variables is the dependent variable
and the other one is the independent variable.

What is the difference between the two variables?

A dependent variable depends on other variables or factors. It is something
that is influenced and affected. It is also associated with the word effect or outcome.

An independent variable affects the dependent variable. It is usually something
you have control over, which you can choose and manipulate, although in some
cases you may not be able to manipulate it. It is commonly known as the cause,
or the reason behind changes.

For example, a researcher wants to determine the effects of social media use
on the academic performance of students in Mathematics.

The two variables in the study are the use of social media and academic
performance. Academic performance depends on the use of social media, or we
can say that academic performance is affected by the use of social media.
Therefore, the independent variable here is the use of social media and the
dependent variable is academic performance.

INDEPENDENT VARIABLE        affects -->        DEPENDENT VARIABLE
use of social media                            academic performance
                         <-- depends on

The independent variable happens first, and y (the dependent variable) is the
result of x (the independent variable).
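
One way to keep the two roles straight is to store each observation as an (x, y) pair and to phrase the relationship as "y depends on x." The sketch below is only an illustration: the class name Observation, the helper depends_on_sentence, and the data values are all hypothetical and do not come from the module.

```python
from typing import NamedTuple


class Observation(NamedTuple):
    """One bivariate observation; the field names x and y are illustrative."""
    x: float  # independent variable: the presumed cause, which happens first
    y: float  # dependent variable: the presumed effect, or outcome


def depends_on_sentence(dependent: str, independent: str) -> str:
    """Build a 'Y depends on X' sentence from the two variable names."""
    return f"{dependent[0].upper()}{dependent[1:]} depends on {independent}."


# Hypothetical values: hours of social media use per day and Math grade.
data = [Observation(1, 90), Observation(3, 85), Observation(5, 78)]

print(depends_on_sentence("academic performance", "the use of social media"))
for obs in data:
    print(f"x = {obs.x}, y = {obs.y}")
```

If swapping the two names in the sentence produces a statement that no longer makes sense, the roles have been assigned correctly.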

Let’s have more examples.
Identify the dependent and independent variables in the next statements.

• Taking a nap in the afternoon makes people more relaxed and less
irritable for the rest of the day. How relaxed and how much less irritable
you feel depends on the number of hours you nap in the afternoon.

INDEPENDENT VARIABLE                   affects -->        DEPENDENT VARIABLE
hours taking a nap in the afternoon                       more relaxed and less irritable
                                    <-- depends on

• Research Title: Math App (Math Tricks) for Improving Fundamental
Basic Skills and Retention of Grade 11 Students of Mapalad Integrated
High School

Improving your fundamental basic skills and retention depends on how
frequently you practice with or use the Math app called Math Tricks.

INDEPENDENT VARIABLE                        affects -->        DEPENDENT VARIABLE
frequency of using Math app (Math Tricks)                      improving fundamental basic skills
                                                               and retention
                                         <-- depends on

What’s More

Activity 1.1 Let’s Connect to Correct!


Directions: Connect the dependent and independent variables to form a correct
sentence structure. The first one is given as an example.

Independent Variable                     Dependent Variable             Correct Sentence
number of hours studying for a test      test score                     Test score depends on the number of hours studying for a test.
kilowatts used in a household            electricity bill               1. ______________________________
person’s running speed                   time it takes to run a mile    2. ______________________________
genes they inherit from their parents    height of a person             3. ______________________________

Activity 1.2 It Depends on…


Directions: In the table below, you have pairs of variables not yet categorized
as dependent or independent. Connect the two variables using
depends on to form a correct sentence.

Variables                                                                                          Dependent variable depends on independent variable.
number of cigarettes a person smokes    chance of developing cancer                                1. ______________________________
taking care of our natural resources    global warming                                             2. ______________________________
chance of dropping out                  tardiness, cutting classes, and absenteeism of a student   3. ______________________________
educational attainment                  opportunities for high-paying jobs                         4. ______________________________
total calories and fat                  number of junk foods you eat                               5. ______________________________

Activity 1.3 Identify Me!


Directions: Underline the independent variable and encircle the dependent variable.

1. effect of temperature on plant pigmentation

2. effects of fertilizer brand on plant growth

3. effect of brightness of light on a moth being attracted to the light

4. test scores of students and time spent on studying

5. relationship between income and educational attainment of young adults

Activity 1.4 Put Me in the Box!


Directions: Identify the independent and dependent variables in each question
stated below.
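Questions                                                               Independent Variable    Dependent Variable
1. How does logical thinking develop critical thinking?                 ____________________    ____________________
2. What are the effects of Koreanovelas on the Filipino value
   system?                                                              ____________________    ____________________
3. In what way does collaborative learning increase communicative
   competence?                                                          ____________________    ____________________
4. To what extent does texting decrease students’ grammatical
   competence?                                                          ____________________    ____________________
5. What corrupt practices trigger one’s resignation?                    ____________________    ____________________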

In graphing the scatter plot, the independent variable always goes
on the x-axis (horizontal axis) while the dependent variable goes on the
y-axis (vertical axis).
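
The axis convention can be tried out with a small text-only scatter plot. The sketch below is an illustration rather than part of the module: the function name ascii_scatter, the grid size, and the data are all assumptions. It places the independent variable from left to right (x-axis) and the dependent variable from bottom to top (y-axis).

```python
def ascii_scatter(xs, ys, width=20, height=8):
    """Return a text scatter plot with x running left to right
    (independent variable) and y running bottom to top (dependent variable)."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]

    for x, y in zip(xs, ys):
        # Scale each value to a column (x) or row (y) index on the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1)) if x_max > x_min else 0
        row = round((y - y_min) / (y_max - y_min) * (height - 1)) if y_max > y_min else 0
        # Row 0 of the grid is the top line, so flip the row index.
        grid[height - 1 - row][col] = "*"

    lines = ["|" + "".join(row_cells) for row_cells in grid]  # "|" marks the y-axis
    lines.append("+" + "-" * width)                           # "-" marks the x-axis
    return "\n".join(lines)


# Hypothetical data: hours studied (independent) and test score (dependent).
hours_studied = [1, 2, 3, 4, 5]
test_scores = [55, 60, 70, 72, 85]
print(ascii_scatter(hours_studied, test_scores))
```

Swapping the two lists in the call would put the dependent variable on the horizontal axis, which is the arrangement the convention above is meant to avoid.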
What I Have Learned

Directions: Complete the following statements. In nos. 3-4, choose the expression
in the parentheses that best completes each sentence.

1. ________________ data always involve two variables.

2. Bivariate data have one ____________ variable and one _________________ variable.

3. The dependent variable _________ (depends on, affects) the other variable.

4. The independent variable ___________ (depends on, affects) the other variable.

5. The _____________ variable is associated with the words “outcome” or “effect”.

6. The _____________ variable is linked to the “cause” or the “reason” behind the
changes.

What I Can Do

Let’s Search in the Research!


Directions: Identify the independent and dependent variables in the given research
titles.

Research Title                                      Independent Variable             Dependent Variable
Example:
Math Minute and Peer Tutoring in Raising the        Math Minute and peer tutoring    student’s mastery in solving
Level of Student’s Mastery in Solving Basic                                          basic operations of integers
Operations of Integers
1. Organizational Commitment and Teaching
   Performance of Elementary Teachers in Rizal      ____________________             ____________________
2. Conceptual, Interpersonal, and Technical
   Skills of Bank Managers: Their Relationship
   to Operational Efficiency                        ____________________             ____________________
3. Increasing Mathematics Achievement Through
   Contextualized and Localized Materials           ____________________             ____________________
4. Impact of Blended Learning on Student
   Achievement in Social Studies                    ____________________             ____________________
5. Effectiveness of Exposing Students to
   Classical Music on Reading Comprehension         ____________________             ____________________

Assessment

Directions: Choose the best answer to the given questions or statements. Write
the letter of your choice on a separate sheet of paper.
1. What do you call data that involve two variables?
A. bivariate
B. constant
C. dependent
D. independent

2. Which of the following variables describes something that is already there
and is fixed, one that you would like to evaluate with respect to how it affects
something else?
A. bivariate
B. constant
C. dependent
D. independent

3. Which of the following variables may change or vary and can be considered
as outcome?
A. bivariate
B. constant
C. dependent
D. independent

4. Changes in the ________ variable cause changes in the _________ variable.


A. constant; independent
B. dependent; independent
C. independent; dependent
D. independent; independent

5. As a person’s exposure to sunlight increases, his or her chance of developing
cancer increases. In the given findings, which are the variables?
A. person
B. increases
C. exposure to sunlight
D. exposure to sunlight and chance of developing cancer

6. Which of the following concepts is NOT TRUE about dependent and
independent variables?
A. The dependent variable causes change in the independent variable.
B. The independent variable causes change in the dependent variable.
C. The dependent variable is affected by the independent variable.
D. The dependent variable depends on the independent variable.

7. In graphing the scatter plot, where do you plot the dependent variable?
A. origin
B. x-axis
C. y-axis
D. z-axis

8. You want to determine which type of fertilizer helps plants grow the fastest,
so you add a different brand of fertilizer to each plant and see how tall they
grow. What is the independent variable?
A. plants
B. type of fertilizer
C. how plants grow
D. fruit of the plant

9. Students took a test after they studied either in silence or with the television
turned on. What is the dependent variable?
A. silence
B. subject
C. score on test
D. television turned on

10. A team from the Department of Public Works and Highways (DPWH) is
painting lines on the freeway. The number of hours needed to finish the
project depends on the length of road that needs to be lined. In the given
statement, which represents the independent variable?
A. road
B. length of the road that needs to be lined
C. number of hours needed to finish the project
D. Department of Public Works and Highways

11. What would be the independent variable in an experiment testing which
paper airplane design goes the farthest?
A. the paper used
B. the paper airplane design
C. how hard the plane is thrown
D. the distance of each plane’s flight

12. According to a study, the more time people spend using social media, the
less able they are to express themselves in conversation. In the given
statement, which represents the dependent variable?
A. more time for themselves
B. using social media in conversation
C. more time people spend in using social media
D. less able to express themselves in conversation

13. Which of the following sentence structures is CORRECT in featuring
dependent and independent variables?
A. How grass grows depends on the type of soil used.
B. The type of soil used depends on how grass grows.
C. The type of soil used depends on the type of grass.
D. Grass doesn’t grow in any type of soil used.

14. Which of the following sentence structures is INCORRECT in featuring
dependent and independent variables?
A. The noise level in a classroom is affected by memory retention.
B. The more grams of salt a person consumes, the higher his/her blood
pressure is.
C. If a person eats carrots daily, there will be an improvement on his/her
vision.
D. Eating breakfast in the morning increases academic performance in
Mathematics.

15. If the dependent variable is “increasing the ability to learn in school”, what is
the best possible independent variable?
A. travelling every week
B. taking selfies three times a day
C. eating breakfast in the morning
D. watching television for 12 hours

Additional Activity

Directions: Give five (5) research proposal titles that involve bivariate data. Identify
the dependent and independent variables in the title.

RESEARCH PROPOSAL TITLES

1. ______________________________________________________________________
______________________________________________________________________
2. ______________________________________________________________________
______________________________________________________________________
3. ______________________________________________________________________
______________________________________________________________________
4. ______________________________________________________________________
______________________________________________________________________
5. ______________________________________________________________________
______________________________________________________________________

For inquiries or feedback, please write or call:

Department of Education - Bureau of Learning Resources (DepEd-BLR)

Ground Floor, Bonifacio Bldg., DepEd Complex


Meralco Avenue, Pasig City, Philippines 1600

Telefax: (632) 8634-1072; 8634-1054; 8631-4985

Email Address: [email protected] * [email protected]
