Quarter 4 – Module 6:
Computing Test Statistic on Population Mean
There are two specific test statistics used for hypothesis testing concerning means: z-test and t-test.
If the sample size is large, where 𝑛 ≥ 30 and the population standard deviation (𝜎) is known, use z-test.
Example 1: Compute the z-value given the following information. Use a one-tailed test and 0.05 level of significance.
𝑥̅ = 71.5 𝜇 = 70 𝜎 = 8 𝑛 = 100
Solution: Since σ is known and n ≥ 30, we will use z-test. Thus, we have:
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{71.5 - 70}{8 / \sqrt{100}} = \frac{1.5}{0.8}$$
$$z = 1.875$$
Therefore, the computed z-value is 1.875.
Example 2: In the first semester of the school year, a random sample of 200 students got a mean score of 81.72 with
a population standard deviation of 15 in Statistics and Probability test. The population mean is 79.83. Use 0.05 level
of significance.
Solution: To answer the problem, let us first identify the given. We have:
𝑥̅ = 81.72 𝜇 = 79.83 𝜎 = 15 𝑛 = 200
Since σ is known and n ≥ 30, we will use z-test.
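The z-value for Example 2 can be checked directly from the given values. The sketch below assumes the usual one-sample formula z = (x̄ − μ) / (σ / √n); the function name `z_statistic` is only illustrative and does not come from the module.

```python
import math

def z_statistic(sample_mean, pop_mean, pop_sd, n):
    """One-sample z statistic: (sample mean - population mean) / (sigma / sqrt(n))."""
    standard_error = pop_sd / math.sqrt(n)
    return (sample_mean - pop_mean) / standard_error

# Example 2: sample mean 81.72, population mean 79.83, sigma 15, n 200
z = z_statistic(81.72, 79.83, 15, 200)
print(round(z, 3))  # about 1.782
```

With these values the computed z-value comes out near 1.782.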
According to the Central Limit Theorem, the sample standard deviation (𝑠) may be used as an estimate of the population standard deviation (𝜎) when the value of 𝜎 is unknown and the sample is large.
$$z = \frac{-5}{30 / 12.25}$$
Simplify.
$$z = \frac{-5}{2.45} = -2.041$$
Therefore, the computed z-value is -2.041.
Solution: Since σ is unknown and n < 30, we will use t-test. Thus, we have:
Example 5: The government claims that the monthly expenses of a Filipino family with four members is P10,000. A
sample of 26 families has mean monthly expenses of P10,900 and a standard deviation of P1,250. Is there enough evidence
to reject the government’s claim at 𝛼 = 0.01?
Level of Significance
Type of Test
𝜶 = 1% 𝜶 = 2.5% 𝜶 = 5% 𝜶 = 10%
df = (n – 1)
In general, if the absolute value of the computed value is greater than the absolute value of the critical
value, we reject the null hypothesis and support the alternative hypothesis. But if the absolute value of the computed
value is less than the absolute value of the critical value, we do not reject or we fail to reject the null hypothesis
and the alternative hypothesis is not supported.
In a right-tailed test, if the computed value is greater than the critical value, we reject the null hypothesis
and support the alternative hypothesis. But if the computed value is less than the critical value, we do not reject or
we fail to reject the null hypothesis and the alternative hypothesis is not supported.
In a left-tailed test, if the computed value is less than the critical value, we reject the null hypothesis
and support the alternative hypothesis. But if the computed value is greater than the critical value, we do not reject
or we fail to reject the null hypothesis and the alternative hypothesis is not supported.
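The three decision rules above can be sketched as one small function. This is only an illustration: the name `decide` and the tail labels `"two"`, `"right"`, and `"left"` are assumptions made for this sketch, and the sample calls reuse computed and critical values from worked examples in this module.

```python
def decide(computed, critical, tail):
    """Apply the critical value approach for the given type of test.

    tail: "two" (non-directional), "right", or "left".
    """
    if tail == "two":
        reject = abs(computed) > abs(critical)
    elif tail == "right":
        reject = computed > critical
    elif tail == "left":
        reject = computed < critical
    else:
        raise ValueError("tail must be 'two', 'right', or 'left'")
    return "reject H0" if reject else "fail to reject H0"

print(decide(1.875, 1.645, "right"))   # reject H0
print(decide(-1.736, -2.718, "left"))  # fail to reject H0
print(decide(3.671, 2.485, "two"))     # reject H0
```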
Rejecting the null hypothesis doesn’t mean that it is incorrect or that the alternative hypothesis is correct. The
collected data provide sufficient evidence against the null hypothesis, hence we reject it.
Similarly, a failure to reject the null hypothesis does not mean that it is true, only that the test did not prove
it to be false. There is insufficient evidence against the null hypothesis; hence we do not reject it.
Example 1: Compute the value of the test statistic given the following information. Use 𝛼 = 0.05.
Solution: It is a one-tailed test, since the direction of the test is stated (the alternative hypothesis
uses the symbol >). Since σ is known and n ≥ 30, we will use z-test. The level of significance is 0.05. From Table 1, the
z-critical value is 1.645. Thus, we have:
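$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{71.5 - 70}{8 / \sqrt{100}} = \frac{1.5}{8 / 10} = \frac{1.5}{0.8}$$
$$z = 1.875$$
[Figure: normal curve with the non-rejection region to the left of the critical value 1.645 and the rejection region to its right.]
Decision: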
The computed z-value is 1.875 which is greater than the critical value of 1.645. Therefore, we reject the null hypothesis
and support the alternative hypothesis.
Example 2: Compute for its value given the following information. Use 𝛼 =
0.01. Interpret the result.
𝐻𝑜: 𝜇 = 127 𝑥̅ = 124.5 𝜇 = 127
𝐻𝑎:𝜇 < 127 𝑠=5 𝑛 = 12
Solution: It is a left-tailed test, since the direction of the test is stated (the alternative hypothesis
uses the symbol <). Since σ is unknown and n < 30, we will use t-test. The degree of freedom (df = n - 1) is 11 and 𝛼 =
0.01. Therefore, the t-critical value from Table 2 is -2.718. Thus, we have:
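$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{124.5 - 127}{5 / \sqrt{12}} = \frac{-2.5}{5 / 3.46} = \frac{-2.5}{1.44}$$
$$t = -1.736$$
[Figure: t-distribution curve on an axis from -6 to 5, with the rejection region to the left of the critical value -2.718 and the acceptance or non-rejection region to its right.]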
Decision:
The computed t-value is greater than the t-critical value at 𝛼 = 0.01 (i.e., −1.736 > −2.718). Since we have a left-tailed test,
our conclusion is that we fail to reject the null hypothesis.
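As a cross-check of this example, the t-value can be recomputed without rounding the intermediate steps. The sketch below assumes the one-sample formula t = (x̄ − μ) / (s / √n); the helper name `t_statistic` is illustrative only. Without rounding, the result is about −1.732; the module's −1.736 comes from rounding the denominator to 1.44. The decision is the same either way.

```python
import math

def t_statistic(sample_mean, pop_mean, sample_sd, n):
    """One-sample t statistic: (sample mean - hypothesized mean) / (s / sqrt(n))."""
    return (sample_mean - pop_mean) / (sample_sd / math.sqrt(n))

t = t_statistic(124.5, 127, 5, 12)
t_critical = -2.718  # left-tailed critical value taken from the module (df = 11, alpha = 0.01)

print(round(t, 3))  # about -1.732
# Left-tailed rule: reject H0 only if the computed value is below the critical value.
print("reject H0" if t < t_critical else "fail to reject H0")
```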
Example 3: The government claims that P10,000 is the monthly expenses of a Filipino family with four members. A
sample of 26 families has mean monthly expenses of P10,900 and a standard deviation of P1,250. Is there enough
evidence to reject the government’s claim at 𝛼 = 2.5%?
It is a two-tailed test, since the problem does not state a direction. Since σ is unknown and n < 30,
we will use t-test. The degree of freedom (df = n - 1) is 25 and 𝛼 = 2.5%. Therefore, the t-critical value from Table 2 is
2.485. Thus, we have:
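$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{10\,900 - 10\,000}{1\,250 / \sqrt{26}} = \frac{900}{1\,250 / 5.10} = \frac{900}{245.10}$$
$$t = 3.671$$
[Figure: t-distribution curve on an axis from -5 to 5, with rejection regions to the left of -2.485 and to the right of 2.485, and the non-rejection region between them.]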
Decision:
The absolute value of the computed t-value is greater than the absolute value of the critical t-value at 𝛼 = 0.025 (i.e., |3.671| >
|2.485|). Therefore, we reject the null hypothesis.
Conclusion:
We can conclude that there is enough evidence to reject the claim of the government that P10,000 is the monthly expenses
of a Filipino family with four members.
2. Determine the test statistic that will be used to conduct the hypothesis test. Then, calculate its value.
3. Find the critical value for the test and draw the critical region.
4. Decide and draw a conclusion based on the comparison of the calculated value of the test statistic and the critical
value of the test.
In general, if the absolute value of the computed value is greater than the absolute value of the critical value,
we reject the null hypothesis and support the alternative hypothesis. But if the absolute value of the computed value
is less than the absolute value of the critical value, we fail to reject the null hypothesis and the alternative hypothesis
is not supported.
In a right-tailed test, if the computed value is greater than the critical value, we reject the null hypothesis
and support the alternative hypothesis. But if the computed value is less than the critical value, we fail to reject
the null hypothesis and the alternative hypothesis is not supported.
In a left-tailed test, if the computed value is less than the critical value, we reject the null hypothesis
and support the alternative hypothesis. But if the computed value is greater than the critical value, we fail to
reject the null hypothesis and the alternative hypothesis is not supported.
Example 1: According to a study conducted by the Grade 12 students, ₱155 is the average monthly expense for cell
phone loads of high school students in their province. A Statistics student claims that this amount has increased since
January of this year. Do you think his claim is acceptable if a random sample of 50 students has an average monthly
expense of ₱165 for cell phone loads? Use the 5% level of significance and assume that the population standard deviation is
₱52.
Solution:
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{165 - 155}{52 / \sqrt{50}}$$
$$z = 1.361$$
Step 3: Find the critical value and draw the critical region. Use the z-critical value table.
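The alternative hypothesis is directional. Hence, the one-tailed test (right-tailed test) shall be used. From the z-value
table at 0.05 level of significance, the critical value is 1.645.
[Figure: normal curve with the computed value 1.361 inside the non-rejection region, to the left of the critical value 1.645; the rejection region lies to the right of 1.645.]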
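The computation and comparison for this example can be sketched as follows. The variable names are illustrative, and the critical value is obtained from the standard library's `statistics.NormalDist` as an assumed stand-in for the printed z-table. Without intermediate rounding the statistic is about 1.360; the module's 1.361 reflects rounding along the way.

```python
import math
from statistics import NormalDist

x_bar, mu, sigma, n, alpha = 165, 155, 52, 50, 0.05

# Step 2: compute the test statistic.
z = (x_bar - mu) / (sigma / math.sqrt(n))

# Step 3: right-tailed critical value (area alpha in the upper tail).
z_critical = NormalDist().inv_cdf(1 - alpha)

# Step 4: compare the computed value with the critical value.
decision = "reject H0" if z > z_critical else "fail to reject H0"
print(round(z, 2), round(z_critical, 3), decision)
```

Under these values the computed z lies below the critical value, so the comparison points to failing to reject the null hypothesis.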
Example 2: Blood glucose levels for obese teenagers have a mean of 120. A researcher thinks that a diet high in raw
cornstarch will have a positive or negative effect on blood glucose levels. A sample of 25 patients who have tried the raw
cornstarch diet has a mean glucose level of 135 with a standard deviation of 38. Test the hypothesis at 𝛼 = 0.10 that the
raw cornstarch had an effect.
Solution:
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{135 - 120}{38 / \sqrt{25}} = \frac{15}{7.6}$$
$$t = 1.974$$
Given: 𝑥̅ = 95 𝜇 = 99 𝜎 = 15 𝑛 = 40 𝛼 = 0.05
Step 1: State the null and alternative hypotheses.
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{95 - 99}{15 / \sqrt{40}}$$
$$z = -1.688$$
Step 3: Find the critical value and draw the critical region. Use the z-critical value table. The alternative hypothesis is
directional. Hence, the one-tailed test (left-tailed test) shall be used. From the z-value table at 0.05 level of significance, the
critical value is -1.645.
[Figure: normal curve with the rejection region to the left of the critical value -1.645 and the non-rejection region to its right.]
Once you already know that you are dealing with a population proportion, you can conduct the hypothesis
test. You can start with the first step of a hypothesis test which is to determine the hypotheses. In order to formulate
null and alternative hypotheses concerning population proportions, you can write them in sentence form or you can
use different symbols. Here, you will use the symbol p for the population proportion.
Remember that the hypotheses are claims about the population proportion, p. The null hypothesis states that
the proportion is equal to a specific value or the hypothesized proportion, po. On the other hand, the alternative
hypothesis is the competing claim that the population proportion is less than, greater than, or not equal to po.
As a reminder, the null hypothesis is always a statement of equality. The alternative hypothesis is always a
statement of inequality, using the symbols <, >, or ≠. Moreover, the hypotheses are stated in such a way that they are
mutually exclusive. That is, if one is true, the other must be false; and vice versa.
If you are going to write the null hypothesis in sentence form, you will usually use “is” or “is equal to”. In
symbols, you are going to use:
HO : p = po
Meanwhile, to formulate alternative hypothesis in sentence form or in symbols, you will just remember the
following:
➢ When testing for population proportions, there are three (3) possible alternative hypotheses. They are based on the
wording of the question instructing you what to hypothesize. (See illustrative examples below.)
a. Ha : p < po    smaller, less, decreased, fewer, lower
b. Ha : p > po    larger, greater, more, increased
c. Ha : p ≠ po    different, not equal, not the same
In the given symbols as shown above, letters a and b are used in one-tailed or one-sided tests (directional)
while letter c is used for a two-tailed test (non-directional).
One-Tailed: The alternative hypothesis contains the greater than (>) or less than (<) symbol. It is directional (either right-tailed or left-tailed).
Two-Tailed: The alternative hypothesis contains the inequality (≠) symbol. It has no direction.
The next table below shows the null and alternative hypotheses stated together with the types of hypothesis
tests.
                        Two-Tailed Test    Right-Tailed Test    Left-Tailed Test
Alternative Hypothesis  Ha: p ≠ po         Ha: p > po           Ha: p < po
Illustrative Examples:
Example 1. It has been claimed that 40% of students in a particular senior high school dislike Mathematics. When a
survey was conducted by a researcher, it showed that 145 of 800 students dislike Mathematics. Test if the claim was
different at α = 0.05 level.
In this example, the hypothesized proportion is 40% or 0.40. Hence, the null hypothesis will be,
The proportion of students who dislike Mathematics is 40%.
In symbols, you can write,
Ho: p = 0.40
Our cue word here is “different” which means “not the same” or “not equal”. Therefore the alternative hypothesis
is,
The proportion of students who dislike Mathematics is not equal to 40%.
In symbols, you can write,
Ha: p ≠ 0.40
Since the word “different” is used in the given problem, the symbol to be used in alternative
hypothesis is “ ≠ ”.
Your hint in formulating the alternative hypothesis in this example is the phrase “lower than” which means
“less than”. So, your alternative hypothesis will be,
The proportion of students who will enroll on STEM track is lower than 60%.
which can be written as,
Ha: p < 0.60
Since the word “lower” is used in the given problem, the symbol to be used in
alternative hypothesis is “<”.
Example 3. It has been claimed that 40% of qualified applicants passed in a particular job interview. When a survey
was conducted by a researcher of a certain company, it showed that 90 of 145 applicants passed the job interview. Test
if the claim was larger at α = 0.05 level.
40% is the hypothesized proportion; hence you have the null hypothesis stated as
The proportion of qualified applicants in a particular job interview is 40%.
And it can be written in symbols as
Ho: p = 0.40
The word “larger” is synonymous with “greater”; hence your alternative hypothesis will be,
The proportion of qualified applicants in a particular job interview was larger than 40%.
Or in symbols
Ha: p > 0.40
Since the word “larger” is used in the given problem, the symbol to be used in
alternative hypothesis is “>”.
Dealing with various problems or situations oftentimes leads to confusion. In this section, take note that
problems involving proportions, unlike in population mean and sample mean, never use terms such as “average”
and “mean” but “percentage” instead. Let us first define what population proportion is.
Population proportion (p) is a part of the population with a particular attribute or trait expressed as a
fraction, decimal, or percentage of the whole population. In symbol:
p= ____ %
Notice that in Matapat City, 10% (percentage is used) of all residents are senior citizens. Therefore,
the percentage of senior citizen residents represents the population proportion or percentage, which makes
p = 10% = 0.10.
Similarly, among these senior citizens, what percentage owns a cell phone? That illustrates the sample
proportion, in symbol 𝒑̂ (read as “p hat”) which is computed as follows:
𝒑̂ = 0.84
To change percent to
decimal, see examples
below:
1. 12% = 0.12
2. 5% = 0.05
3. 12.5% = 0.125
On the other hand, there are cases where we still need to calculate 𝒑̂. Examples of these kinds are:
In this case, we need to solve for the value of the sample proportion
𝒑̂ (read as “p hat”).
Sample proportion (𝒑̂) is the ratio of the number of elements in the sample possessing the
characteristic of interest (X) over the number of elements in the sample, or n. It is computed by the formula:
$$\hat{p} = \frac{X}{n}$$
The example below will help you understand better how we can easily estimate the value of the sample
proportion.
Remember that in a situation describing a population proportion/sample proportion, the
words “mean” or “average” are not used.
Illustrative Example:
For a class project, a Grade 12 STEM student wants to estimate the percentage of students in his school
who are registered voters. From 45% Grade 12 students, he surveys 500 students and finds that 200 are registered
voters. Determine the value of p and compute for the sample proportion.
Solution:
The population proportion is the rate or percent given for the entire group of Grade 12 students. Therefore:
p = 45% = 0.45
Sample Proportion,
$$\hat{p} = \frac{X}{n} = \frac{200}{500} = 0.4$$
When testing situations involving proportion, a percentage, or a probability, the following assumptions
must be considered:
1. The conditions for binomial experiment are met. That is, there is a fixed number of independent trials with
constant probabilities and each trial has two outcomes that we usually classify as “success” (p) and
“failure” (q). The sum of p and q is 1. Hence, we can write p + q = 1 or q = 1 – p.
2. The conditions np ≥ 5 and nq ≥ 5 are both satisfied so that the binomial distribution of sample proportion
can be approximated by a normal distribution with 𝜇 = 𝑛𝑝 and 𝜎 = √(𝑛𝑝𝑞). (However, the specific number
varies from source to source; some authors use 10 instead of 5 depending on how good an approximation
one wants.)
Likewise, the second assumption serves as the basis to determine whether the sample size from the
population proportion is sufficiently large or not. Remember that this time, the condition for a large sample is
not that n be “at least 30” but that it should satisfy the second assumption. For a sufficiently large sample, the
Central Limit Theorem (CLT) can be used. Bear in mind that if the sample size is sufficiently large, then the mean
of the random sample from a population has a sampling distribution that is approximately normal, even when
the original distribution is not normally distributed.
1. It is evident that the responses have only two outcomes: “registered voter” (success) or “not registered
voter” (failure). Therefore, the first assumption is met.
2. To be able to satisfy the second condition, we find the hypothesized value of the population proportion p
= 0.45 while n = 500. To get q, q = 1 – p which makes q = 1 – 0.45 = 0.55.
Through substitution, it shows that the second assumption is also met, since:
np ≥ 5 and nq ≥ 5
500 (0.45) ≥ 5 and 500 (0.55) ≥ 5
225 ≥ 5 and 275 ≥ 5
Since we have shown that np ≥ 5 and nq ≥ 5, all conditions are met where the sample size is truly large
enough to use CLT. In this condition, the test statistic to be used is the z-test statistic for proportions denoted by
Zcom or the computed z-value.
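The check of the second assumption can be sketched in a few lines. The function name `normal_approximation_ok` and its `threshold` parameter are assumptions made for this illustration; the threshold defaults to 5 as in this module, and some authors use 10 instead.

```python
def normal_approximation_ok(n, p, threshold=5):
    """Return True when both n*p and n*q meet the threshold, where q = 1 - p."""
    q = 1 - p
    return n * p >= threshold and n * q >= threshold

# Registered-voter example: hypothesized p = 0.45, n = 500
n, p = 500, 0.45
print(n * p, n * (1 - p))             # compare each product with 5
print(normal_approximation_ok(n, p))  # True

# A small sample with a small proportion fails the check (20 * 0.1 = 2).
print(normal_approximation_ok(20, 0.1))  # False
```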
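Here, $\sqrt{pq/n}$ takes the place of $\sigma_{\bar{x}}$.
Therefore, the formula for the value of z-test statistic for population proportion would be:
$$Z_{com} = \frac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} \quad \text{or} \quad Z_{com} = \frac{\hat{p} - p}{\sqrt{\dfrac{p(1 - p)}{n}}}$$
where:
𝒑̂ = sample proportion
p = hypothesized population proportion
q = 1 – p
n = sample size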
Remember this formula because you are going to use this in Module 12 where the actual computation
for the test statistic involving population proportion will be held.
There are two ways to test the hypothesis: with a p-value approach and with a critical value approach. Here,
we will consider the rejection region with the critical value approach. The critical value enables us to reject or not the
null hypothesis. Also, it is calculated through alpha ( α ) levels and symbolized by Z or Ztab.
This is the first statement in Activity 2: “The hypothesis that less than 20% of the population are right-handed”
wherein Ha: p < 0.20 and it indicates a left-tailed rejection region. Illustrating it in the normal curve, we will come up
with the picture below:
[Figure: normal curve with the shaded rejection region (α) to the left of the critical value Ztab and the non-rejection region to its right.]
The illustration above is for you to visualize how the statement would
look like when put into the normal curve. Notice that the line represented by ztab separates the curve into two regions.
The shaded part is the rejection region while the non-shaded part is the non-rejection region or the acceptance
region/area. Therefore, it is important that we determine the value of ztab or the critical value. Now, let us proceed!
Let us now describe the following important terms that we will be needing in our discussion.
Critical Value
- derived from the level of significance and expressed as a standard z-value
- symbolized as ztab
We can use the table of critical values for the commonly used levels of significance presented in the previous
modules.
Level of Significance
Test Type
𝛼 = 0.01 𝛼 = 0.025 𝛼 = 0.05 𝛼 = 0.10
- the basis for the critical or the rejection region dictated by the alternative hypothesis
Rejection Region
- the range of the values of the test value which indicates that there is a significant difference and that the null
hypothesis (Ho) should be rejected
Non-Rejection Region
- the range of the values of the test value which indicates that the difference was statistically insignificant and that
we failed to reject the null hypothesis (Ho)
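The critical values for the common levels of significance can also be computed rather than read from a table. The sketch below uses `statistics.NormalDist` from the Python standard library; the function name `z_critical` and the tail labels are assumptions for this illustration. Printed tables round these values, so small differences in the last digit (for example 2.576 against the 2.575 used in this module) are expected.

```python
from statistics import NormalDist

def z_critical(alpha, tail):
    """Critical z-value for a left-tailed, right-tailed, or two-tailed test."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return std_normal.inv_cdf(alpha)
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)
    if tail == "two":
        # The magnitude is returned; the critical values are plus and minus this number.
        return std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("tail must be 'left', 'right', or 'two'")

for alpha in (0.01, 0.025, 0.05, 0.10):
    print(alpha, round(z_critical(alpha, "right"), 3), round(z_critical(alpha, "two"), 3))
```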
Illustrative Example1:
A sample of 100 students is randomly selected from Pinagpala High School and 18 of them said they are left-
handed. Test the hypothesis that less than 20% of the students are left-handed by using 𝛼 = 0.05 as the level of
significance.
What to do:
Solution:
a. The level of significance is 𝛼 = 0.05.
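[Figure: normal curve on an axis from -3 to 3, with the rejection region (𝛼 = 0.05) to the left of the critical value -1.645 and the non-rejection region to its right.]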
From here, you will decide whether the null hypothesis will be rejected or not, although that part will be discussed
in the next module.
Illustrative Example 2:
The claim is made that 40% of tax filers use computer software to file their taxes. In a sample of 50 tax filers,
14 used computer software to file their taxes. Let Ha: p < 0.40 at α = 0.025, where p is the population proportion who
use computer software to file their taxes. Determine the critical value, Ztab, and illustrate the rejection region in the
normal curve.
Solution:
At α = 0.025 level of significance, with p < 0.40, by referring to the table of the Level of Significance, it shows
that the critical value or Ztab = –1.96.
[Figure: normal curve with the rejection region (α = 0.025) to the left of Ztab = -1.96 and the non-rejection region to its right.]
Illustrative Example 3:
In Kalinga Special Education School, a sample of 144 students was chosen and among them, 48 are
diagnosed with Attention Deficit Hyperactivity Disorder (ADHD). At 𝛼 = 0.01, test the hypothesis that the proportion of
ADHD students in the school is not 0.40.
What to do:
a. Identify the level of significance.
b. Formulate the alternative hypothesis, Ha: p ≠ po.
c. Determine the critical value.
d. Illustrate the rejection region in the normal curve.
Note: When a statement does not specify any cue word that describes direction, then it is non-directional or two-tailed.
Solution:
a. The level of significance is 𝛼 = 0.01.
b. The alternative hypothesis is p ≠ 0.40 due to the expression “is not 0.40 ”.
This explains why it is non-directional or two-tailed.
c. To determine the critical value using the table, we consider the intersection
of the row for the two-tailed test and the column for 𝛼 = 0.01. Hence, the
table tells us that the critical value is ±2.575.
d. Illustrating the rejection region in the normal curve gives:
[Figure: normal curve with rejection regions of area 𝛼/2 = 0.01/2 = 0.005 in each tail, bounded by Ztab = -2.575 and Ztab = 2.575, and the acceptance region between them.]
Notice that the previously cited situation did not use or mention words like “mean” or “average” but
“percentage” instead. Also, it utilized count data. Problems such as this involve population proportion. Inferences
involving proportions are made in the context of probability of “success”, p, in a binomial distribution.
From the situation that we presented in the above activity, the respondents have only two possible
options for their responses and those are the following:
To show that the sample size is large enough for the Central Limit Theorem to apply, we need to satisfy
the two assumptions. It is evident that the responses have only two possible outcomes: “owned” (success) or “not
owned” (failure). Therefore, the condition for binomial experiment is met. Also, to be able to satisfy the condition
that np ≥ 5 and nq ≥ 5, we find that the hypothesized value of the population proportion is p = 0.35 while n =
240. To get q, q = 1 – p, which makes q = 1 – 0.35 = 0.65.
Through substitution, we can show that the second condition is also met, since:
np ≥ 5 and nq ≥ 5
240 (0.35) ≥ 5 and 240 (0.65) ≥ 5
84 ≥ 5 and 156 ≥ 5
Since we have shown that np ≥ 5 and nq ≥ 5, all conditions are met where the sample size is large enough
to use Central Limit Theorem. In this condition, the test statistic to be used is the z-test statistic for proportions
denoted by Zcom or the computed z-value.
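$$Z_{com} = \frac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} \quad \text{or} \quad Z_{com} = \frac{\hat{p} - p}{\sqrt{\dfrac{p(1 - p)}{n}}}$$
where:
𝒑̂ = sample proportion, p = hypothesized population proportion, q = 1 – p, n = sample size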
Illustrative Example1:
Let us now determine the z-value in the situation presented previously. To be able to solve it, we need to
identify first the values of the following:
Zcom = ?
78
Illustrative Example 2:
Determine the value of Zcom given the following information:
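Population Proportion: p = 0.42
Sample Size: n = 150
Sample Proportion: 𝒑̂ = 0.45
Solution:
Zcom = ?
𝒑̂ = 0.45
p = 0.42
n = 150
q = 1 – p = 1 – 0.42 = 0.58
$$Z_{com} = \frac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \frac{0.45 - 0.42}{\sqrt{\dfrac{(0.42)(0.58)}{150}}}$$
Zcom = 0.7444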
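The substitution in this example can be verified with a short script. This is a minimal sketch assuming the formula Zcom = (p̂ − p) / √(pq/n); the function name `z_proportion` is hypothetical.

```python
import math

def z_proportion(p_hat, p0, n):
    """z-test statistic for a proportion: (p_hat - p0) / sqrt(p0 * q0 / n)."""
    q0 = 1 - p0
    return (p_hat - p0) / math.sqrt(p0 * q0 / n)

z_com = z_proportion(0.45, 0.42, 150)
print(round(z_com, 4))  # about 0.7444
```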
Illustrative Example 3:
The claim is made that 40% of tax filers use computer software to file their taxes. In a sample of 50, 14
used computer software to file their taxes. We want to test Ho: p = 0.4 versus Ha: p > 0.4 at α = 0.05, where p is the
population proportion who use computer software to file their taxes. The test may be carried out using the binomial
distribution or using the normal approximation to the binomial distribution. Determine first the value of zcom.
Solution:
First, determine the value of the following:
Zcom = ?
𝒑̂ = 14/50 = 0.28
p = 40% = 0.40
n = 50
q = 1 – p = 1 – 0.40 = 0.60
$$Z_{com} = \frac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}}$$
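For this example the sample proportion is not given directly, so it has to be derived from the counts first. The sketch below starts from the counts stated in the problem and assumes the same Zcom formula used earlier in this module; the variable names are illustrative.

```python
import math

x, n = 14, 50   # filers in the sample who used software, and the sample size
p0 = 0.40       # hypothesized population proportion from Ho

p_hat = x / n                                        # sample proportion
z_com = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)  # z-test statistic for proportions

print(p_hat, round(z_com, 3))  # 0.28 and about -1.732
```

With these inputs the computed value comes out near −1.732.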
In drawing conclusions, there are two different approaches that you may apply: the critical z-approach
(computed z-value) and the P-value approach.
In applying the first approach which is determining the critical value (which you were already taught in the previous
modules), you need to consider the following:
a. Null and Alternative Hypotheses;
b. Level of Significance (α);
c. Computed Test Statistic, Critical Value (including rejection region); and
d. Decision (whether to reject or fail to reject the null hypothesis (Ho).
Determine if the test statistic falls in the rejection region. If it does, reject the null hypothesis. If it does
not, do not reject the null hypothesis.
❖ If the computed z-statistic (zcom) is greater than the tabular value (ztab) in a right-tailed test, or less than the tabular value (ztab) in a left-tailed test, reject the null hypothesis (Ho).
❖ If the computed z-statistic (zcom) falls in the rejection region, reject the null hypothesis (Ho).
❖ If the computed z-statistic (zcom) does not fall in the rejection region, fail to reject the null
hypothesis (Ho).
Illustrative Example:
Example 1
a. Ho : p = 0.85
Ha : p < 0.85
b. Level of Significance: α = 0.01
c. Computed Test Statistic:
$$\hat{p} = \frac{X}{n}$$
𝒑̂ = 0.81
$$z = \frac{\hat{p} - p}{\sqrt{\dfrac{p(1 - p)}{n}}}$$
z = -2.24
d. DECISION: Since the computed test statistic (zcom) z = -2.24 does not fall in the rejection region, fail to reject
the null hypothesis (Ho).
CONCLUSION: Therefore, at 0.01 level of significance, there is not enough evidence to conclude that there is a
decrease in the number of students who prefer male rather than female candidates.
P-VALUE APPROACH
What is P-value?
In the critical value approach, a test statistic is compared with a critical value. However, in the P-value approach (short
for probability value), probabilities or areas are compared. The P-value measures the consistency of the sample statistics with
the null hypothesis. High P-values mean that sample results are consistent with a true null hypothesis while low P-
values are not consistent. If the P-value is small enough (smaller than the level of significance α), we can conclude that the
sample is incompatible with the null hypothesis. Therefore, we can reject the null hypothesis for the entire population.
Illustrative Example:
Given:
Ho: p = 0.5    α = 0.05    n = 25,468
Ha: p > 0.5
Solution:
$$z = \frac{\hat{p} - p}{\sqrt{\dfrac{p(1 - p)}{n}}}$$
z = 5.49
CONCLUSION: Because the p-value is smaller than the significance level α=0.05, we can reject the null
hypothesis. Again, we would say that there is sufficient/enough evidence to conclude
that boys are more common than girls in the entire population at α=0.05 level.
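The P-value that goes with a computed z can be obtained from the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the function name `p_value` and the tail labels are assumptions for this illustration, and the z-value 5.49 is the one from the example above.

```python
from statistics import NormalDist

def p_value(z, tail):
    """Area in the tail(s) of the standard normal curve beyond the computed z."""
    std_normal = NormalDist()
    if tail == "right":
        return 1 - std_normal.cdf(z)
    if tail == "left":
        return std_normal.cdf(z)
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'right', 'left', or 'two'")

alpha = 0.05
p = p_value(5.49, "right")  # Ha: p > 0.5 is right-tailed
print(p < alpha)            # True, so the null hypothesis is rejected
```

A z-value this far into the right tail leaves an area very close to zero, which is why the P-value falls below α.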
As should always be the case, the two approaches (critical value approach and p-value approach) lead to the
same conclusion.
Example 1
Given:
a. n = 50
b. α = 0.01 significance level
c. H0 : The proportion of students that want to go to the zoo is 85%.
(H0: p = 0.85)
Ha: The proportion of students that want to go to the zoo is not 85%.
(Ha: p ≠ 0.85 )
d. p-value = 0.7554
DECISION/CONCLUSION: Because the p-value is greater than α (0.7554 > 0.01), we fail to reject the null hypothesis. There is insufficient evidence to suggest
that the proportion of students that want to go to the zoo is not 85%.
Example 2
Given:
a. n = 150
b. α = 0.1 significance level
c. Ho : The proportion of households that have three or more cell phones is
30%. (Ho : p = 0.3)
Ha : The proportion of households that have three or more cell phones is different from 30%. (Ha : p ≠ 0.3)
d. 𝒑̂ = 0.287
e. Zcom = 0.347
NOTE:
Conclusions are answers in sentence form which include: 1) whether there is enough evidence or not (based on
the decision); 2) the level of significance; and 3) whether the original claim is supported or rejected.
Conclusions are based on the original claim which may be the null or alternative hypothesis. The decisions are
always based on the null hypothesis.
Original Claim: H0 ("REJECT") or Ha ("SUPPORT")
Decision: Fail to reject H0 ("INSUFFICIENT")
When the original claim is H0: There is insufficient evidence at the alpha level of significance to reject the claim that (insert original claim here).
When the original claim is Ha: There is insufficient evidence at the alpha level of significance to support the claim that (insert original claim here).
NOTE:
If the null hypothesis isn’t rejected, this doesn’t necessarily mean that it’s
true. It simply means that there is not enough evidence to justify rejecting it.
The hypothesis-testing procedure leads to the acceptance of H0 when H0 is true and the rejection of H0 when H0
is false. Unfortunately, since hypothesis tests are based on sample information, the possibility of errors must be
considered. A Type I error corresponds to rejecting H0 when H0 is actually true, while a Type II error corresponds to
accepting H0 when H0 is false.
Just as with puzzles, you need to think of different ways to solve a problem. The same is true of solving
problems involving tests of hypotheses on population proportions: you need to follow important steps in order to arrive at
the correct answer.
Here are the five (5) steps in solving problems for a test of hypothesis on the population proportion.
STEP 1. HYPOTHESES: State the null and alternative hypotheses (either in sentence/statement form or in
symbols).
H o : p = po H a : p < po or Ha : p > po or H a : p ≠ po
Remember:
Test statistic is a random variable calculated from a sample. You can use test statistics to determine
whether to reject the null hypothesis or not. The test statistic compares your data with what is expected under
the null hypothesis. The test statistic is used to calculate the p-value.
A test statistic measures the degree of agreement between a sample of data and the null hypothesis.
Its observed value changes randomly from one random sample to a different sample. A test statistic contains
information about the data relevant to deciding whether to reject the null hypothesis or not.
$$\hat{p} = \frac{x}{n}$$
$$z = \frac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} \quad \text{or} \quad z = \frac{\hat{p} - p}{\sqrt{\dfrac{p(1 - p)}{n}}}$$
STEP 5. DECISION/CONCLUSION:
➢ The decision will be either to reject or fail to reject the null hypothesis (Ho).
➢ Draw your conclusion about the population proportion based on the test statistic value and the
rejection region.
❖ If the computed z-statistic (zcom) is greater than the tabular/critical value (ztab) in a right-tailed test, or less than it in a left-tailed test, reject the null
hypothesis (Ho).
❖ If the computed z-statistic(zcom) falls in the rejection region, reject the null hypothesis (Ho).
❖ If the computed z-statistic(zcom) does not fall in the rejection region, fail to reject the null
hypothesis (Ho).
NOTE:
(These conditions were already mentioned in the previous module on drawing conclusions on population
proportions.)
To solve problems involving population proportions, just follow the 5-step procedure
mentioned above.
Illustrative Examples
Example 1: Every year, the assigned teachers determine the Body Mass Index (BMI) of students. In a certain public
junior high school, a study finds that 10% of Grade 7 students observed are underweight. A sample of
780 Grade 7 students were randomly chosen and it was found out that 125 of them are underweight.
Is this claim different for their grade level age? Use 0.05 level of significance.
SOLUTION:
𝒑̂ = 0.16
$$z = \frac{\hat{p} - p}{\sqrt{\dfrac{p(1 - p)}{n}}}$$
STEP 4: Determine the critical value.
NOTE: Since the alternative hypothesis is non-directional, the two-tailed test shall be used. Divide α by 2, then subtract
the quotient from 0.5.
$$\frac{\alpha}{2} = \frac{0.05}{2} = 0.025$$
[Figure: normal curve with rejection regions of area 𝛼/2 = 0.025 in each tail, bounded by the critical values ±Z𝛼/2.]
NOTE: Using the Areas Under the Normal Curve Table, the critical values at 0.05 level of significance are ± 1.96.
STEP 5: Make a decision whether to reject or fail to reject the null hypothesis. Draw a conclusion.
DECISION: With 𝒑̂ = 0.16, p = 0.10, and n = 780, the computed test statistic is zcom = (0.16 − 0.10) / √((0.10)(0.90)/780) ≈ 5.59. Since this is greater than the critical value 1.96, it falls in the rejection region;
reject the null hypothesis.
CONCLUSION: Therefore, we conclude that at 0.05 level of significance, there is enough evidence that the percentage of
Grade 7 students who are underweight is different from 10%.
Data that involve one variable are called univariate data. Univariate data are often described using the measures of central tendency
(mean or average, mode, and median), variations, or other descriptive statistics. Here are examples of univariate data:
Data that involve two variables are called bivariate data. The statistical procedure used to determine and describe the relationship
between two variables is called correlation analysis.
Situation: In Tayabas City public market, a consumer observed that the fewer the supply of vegetables, the higher
the price gets.
Variables: supply and price of vegetables
Situation: The Quezon provincial government gave emphasis that limiting the number of household members going
outside to purchase essential goods will help decrease the rate of COVID-19 infection in the province.
Variables: number of household members and rate of COVID-19 infection
A scatter plot shows how points collected from a set of bivariate data are scattered on a Cartesian plane. It gives a good visual picture
of how two variables are related or associated with one another in terms of the form, trend, and variation of correlation. The form of the points
in the scatter plot determines the shape of the correlation of the variables. The trend determines the direction of the points, that is, whether the
variables have positive, negative, or no correlation. The variation or strength of correlation is based on the closeness of the points to a
trend line, and it determines whether the variables have no, weak, moderate, strong, or perfect correlation.
In constructing a scatter plot, you should know how to plot points in a
Cartesian plane. The independent variable will assume the values of x or abscissa while the dependent variable will assume the values
of y or ordinate.
Example 1:
The given numbers are the age of a person in years and his/her corresponding weight.
Age of a person (x)   11  12  13  14  15  16  17  18  19  20
Weight (y)            40  42  38  35  45  51  48  48  50  47
Since the weight of an individual depends on his/her age, the independent variable is the age of the person which is plotted
horizontally. The dependent variable is the weight of the person, which is plotted vertically as shown in the scatter plot below.
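To make the plotting step concrete, the data in Example 1 can be turned into ordered pairs and drawn as a rough text-based scatter plot. This is only a sketch that uses plain Python in place of graphing paper or plotting software; the variable names and the text layout are assumptions made for illustration.

```python
ages = [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]      # x, independent variable
weights = [40, 42, 38, 35, 45, 51, 48, 48, 50, 47]   # y, dependent variable

points = list(zip(ages, weights))  # ordered pairs (x, y)

# Build one text row per weight value, from the largest down to the smallest,
# and mark a column with "*" wherever a point has that age and weight.
rows = []
for w in range(max(weights), min(weights) - 1, -1):
    marks = "".join(" * " if (a, w) in points else "   " for a in ages)
    rows.append(f"{w:>3} |{marks}")

print("\n".join(rows))
print("    +" + "---" * len(ages))                 # horizontal axis
print("     " + "".join(f"{a:>3}" for a in ages))  # age labels
```

Each "*" stands for one ordered pair, with age read along the bottom and weight read along the left side.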
Example 2:
A Math teacher conducted a study regarding the performance of Grade 11 students in General Mathematics. Their average
grades were taken at different times or periods. The data are given below.
From the data given, the independent variable is the order of the subject and the dependent variable is the average grade.
From this, order of the subject will be plotted on the x-axis and grades will be plotted on the y-axis as illustrated below.
Example 3:
A researcher recorded the weights of 10 students together with the weights of their biological mothers and created a scatter plot as
presented below.
Weight of mother 65 69 74 78 59 81 76 80 81 75
Weight of student 52 55 62 63 47 66 63 69 68 65
From the given data, the independent variable is the weight of the mother while the dependent variable is the weight of the student. The scatter
plot is presented below.
Statistics and Probability
Quarter 4 – Module 17:
Describing the Shape (Form), Trend
(Direction), and Variation (Strength) Based on a Scatter Plot
The correlation of the variables can be described in terms of the form (shape), trend (direction), and variation (strength) shown in a
scatter plot. The form of correlation is determined by the shape of the points on a scatter plot and is categorized as linear or curvilinear.
The form of correlation is linear if the points on the scatter plot follow a straight-line trend. The form is curvilinear (non-linear) if the
points follow a curved-line trend. Sample scatter plots showing curvilinear form of correlation are given below.
The correlation of variables can also be described in terms of its trend or direction. The trend of correlation can be positive,
negative, or zero/negligible depending on the direction of the points. The trend of correlation is summarized in the table that follows.
Positive Correlation
Description: The points follow a trend rising from left to right.
Interpretation: A positive correlation exists when high values of one variable correspond to high values of another
variable or low values of one variable correspond to low values of another variable.
Negative Correlation
Description: The points follow a trend rising from right to left (falling from left to right).
Interpretation: A negative correlation exists when high values of one variable correspond to low values of another
variable or low values of one variable correspond to high values of another variable.
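The direction of a trend can also be checked numerically. The sketch below is an assumption of this illustration rather than the module's visual method: it takes the sign of the sum of products of deviations from the means, which is positive when high values pair with high values and negative when high values pair with low values. The function name trend is hypothetical, and the data are the mother and student weights from Example 3 of the scatter plot lesson.

```python
def trend(xs, ys):
    """Classify the direction of a linear trend from paired data."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the two means
    s = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "zero"


mother = [65, 69, 74, 78, 59, 81, 76, 80, 81, 75]
student = [52, 55, 62, 63, 47, 66, 63, 69, 68, 65]
print(trend(mother, student))
```

With real data the sum is rarely exactly zero, so a value close to zero is better read as a negligible correlation than as a definite direction.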
The closeness of the points to the trend line determines the variation or strength of the correlation between the variables
involved. The closer the points are to the trend line, the stronger the correlation of the variables is. The strength of correlation between two
variables can be perfect, strong, weak, or no/negligible correlation. To summarize the strength of correlation, refer to the table below.
Correlation Scatter Plot Description
Pearson’s sample correlation coefficient (also known as Pearson r), denoted by r, is a test statistic that measures the
strength of the linear relationship between two variables. To find r, the following formula is used:
$$r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}}$$
The correlation coefficient (r) is a number between -1 and 1 that describes both the strength and the direction of
correlation. In symbols, we write -1 ≤ r ≤ 1.
Illustrative Example:
Teachers of Pag-asa National High School instilled among their students the value of time management and excellence in
everything they do. The table below shows the time in hours spent in studying (X) by six Grade 11 students and their scores in a test
(Y). Solve for the Pearson’s sample correlation coefficient r.
X 1 2 3 4 5 6
Y 5 10 10 15 25 30
The next section will guide you on how to compute the Pearson product moment correlation r.
STEPS AND SOLUTION
1. Construct a table as shown below.
X    Y    XY    X²    Y²
1    5
2    10
3    10
4    15
5    25
6    30
2. Complete the table.
a. Multiply entries in the X and Y columns. Put them under the XY column.
b. Square the entries in the X column. Put them under the X² column.
c. Square the entries in the Y column. Put them under the Y² column.
X    Y    XY    X²    Y²
1    5    5     1     25
2    10   20    4     100
3    10   30    9     100
4    15   60    16    225
5    25   125   25    625
6    30   180   36    900
3. Get the sum of each column.
a. Get the sum of all entries in the X column. This is ∑ 𝑿 = 21.
b. Get the sum of all entries in the Y column. This is ∑ 𝒀 = 95.
c. Get the sum of all entries in the XY column. This is ∑ 𝑿𝒀 = 420.
d. Get the sum of all entries in the X² column. This is ∑ 𝑿𝟐 = 91.
e. Get the sum of all entries in the Y² column. This is ∑ 𝒀𝟐 = 1,975.
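Substituting the sums into the formula for r, with n = 6:
$$r = \frac{6(420) - (21)(95)}{\sqrt{[6(91) - (21)^2][6(1{,}975) - (95)^2]}} = \frac{525}{\sqrt{(105)(2{,}825)}}$$
r ≈ 0.96395 or 0.96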
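The same computation can be reproduced in Python as a check on the arithmetic. This is a minimal sketch that follows the column sums from the illustrative example; the variable names are assumptions made for the example.

```python
from math import sqrt

X = [1, 2, 3, 4, 5, 6]        # hours spent studying
Y = [5, 10, 10, 15, 25, 30]   # test scores

n = len(X)
sum_x = sum(X)                              # 21
sum_y = sum(Y)                              # 95
sum_xy = sum(x * y for x, y in zip(X, Y))   # 420
sum_x2 = sum(x * x for x in X)              # 91
sum_y2 = sum(y * y for y in Y)              # 1975

numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(round(r, 5))  # 0.96395
```

Because r is positive and close to 1, the points for hours of study and test scores lie close to a trend line that rises from left to right.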