Chapter 5.test of Hyp
LESSON 5-1
Introduction to Hypothesis Testing
The null hypothesis (H0) is a statement of “zero” difference. Here, H0 assumes that
there is no difference between two means (the population mean and the sample mean) or
variables being compared.
On the other hand, the alternative hypothesis (Ha) is a statement that assumes there is a
significant difference between the two means or variables under test or investigation.
As discussed in the previous lesson, if the difference between the means or variables being
compared lies in the right tail or the left tail of the curve, the test is a one-tailed test. This
indicates that the data obtained in sampling are either higher or lower than the idealized model data.
The school’s record management claims that the average score of the incoming
freshmen during the admission test is 73. The teacher wishes to find out if the claim is true.
She tests whether or not there is a significant difference between the batch average and the
mean scores of students in her class. What is the appropriate hypothesis to be used in this
case?
Solution:
Let the population mean be µ, the average score of the incoming freshmen, and the
sample mean be x̄, the average score of the students in her class.
The teacher may select the following statements as her null and alternative
hypotheses:
H0: The average score of the incoming freshmen has no significant difference
with the mean score of her students; or µ = x̄.
Ha: The average score of the incoming freshmen has a significant difference with
the mean score of her students; or µ ≠ x̄.
The mean profit of a company is 1.02 million a month with a standard deviation of
0.2 million. The newly appointed company manager utilized a proposed system of
operation in 5 randomly selected branches, and the mean profit in one month is
1.2 million. If the manager wishes to find out if the proposed system of operation he
utilized is more effective than the existing one, what is the appropriate hypothesis to be
tested in this problem?
Solution:
The manager may select the following statements as his null and alternative
hypothesis:
H0: µ = x̄; or
There is no significant difference between the mean profit under the existing
system of operation and the mean profit under the proposed system.
Ha: µ < x̄; or
The mean profit under the existing system is less than that of the proposed
system.
In hypothesis testing, it is necessary to choose the right type of test in order to arrive at a
reliable conclusion. As a rule of thumb, we follow the conditions below.
1. If Ha is of the form µ ≠ x̄ or µ ≠ c where c is a constant, then we use a two-tailed test.
2. If Ha is of the form µ < x̄ or µ < c where c is a constant, then we use a left-tailed test,
which is a one-tailed test.
3. If Ha is of the form µ > x̄ or µ > c where c is a constant, then we use a right-tailed test,
which is a one-tailed test.
Based on the rules above, we should use a two-tailed test for illustrative example 5-1 and
a left-tailed test for illustrative example 5-2.
= 80 1 = 8 = 20
= 90 2 = 10 = 10
The following are notations and definitions we will encounter in hypothesis testing:
Notation   Definition
α          Significance level (1%, 5%, etc.); probability of a Type I error, or the
           probability of rejecting H0 when H0 is true
β          Probability of a Type II error, or the probability of not rejecting H0
           when H0 is false
1 – α      Confidence level (99%, 95%, etc.); the complement of the significance level α
n          Sample size
f(x)       Probability function, values of which are restricted between 0 and 1
F(x)       Probability distribution function
df         Number of degrees of freedom
H0         Null hypothesis
Ha         Alternative hypothesis
c          The z- or t-value set as the critical value
Φ(x)       Probability distribution function (standardized probability)
μ          Population mean
σ²         Population variance
σ          Population standard deviation
x̄          Sample mean
s²         Sample variance
s          Sample standard deviation
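A table of outcomes in decision making is shown below. Each outcome comes with the
corresponding probability of its occurrence.
Decision (H0: μ = μo)   H0 is true                     H0 is false
Reject H0               Type I error, P = α            Correct decision, P = 1 – β
Do not reject H0        Correct decision, P = 1 – α    Type II error, P = β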
In this section, we will only focus on Type I error as the computation for the Type II
error is beyond the scope of this book.
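[Figure: two-tailed test at the 95% level. The acceptance region for H0 covers the middle 95% of the curve; the rejection regions are the 2.5% tails beyond the critical values c at z = –1.96 and z = 1.96.]
[Figure: right-tailed test at the 95% level. The acceptance region for H0 covers 95% of the curve; the rejection region is the 5% right tail beyond the critical value c at z = 1.645.]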
The mean score of all grade 11 students during a departmental examination in calculus
is claimed to be 65 with a standard deviation of 9. The 40 students of section Peter have a
mean score of 68. The teacher wishes to find out if the scores of the grade 11 students are
significantly higher than the scores of the students in section Peter. Assuming the scores are
normally distributed,
Solution:
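$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{68 - 65}{9/\sqrt{40}} = 2.11$
[Figure: normal curve marked at z = 2.11, with the labels "Type II error", 0.4826 (48.26%), and "Type I error", α = 0.0174 or 1.74%.]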
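The arithmetic in this solution can be checked numerically. The following is a minimal Python sketch, not part of the original lesson: the function name `one_sample_z` is hypothetical, and the z-score is rounded to two decimal places before the tail area is taken, which mirrors a z-table lookup.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(sample_mean, pop_mean, pop_sd, n):
    """Return z = (sample mean - population mean) / (sigma / sqrt(n))."""
    return (sample_mean - pop_mean) / (pop_sd / sqrt(n))

# Values from the example: mu = 65, sigma = 9, n = 40, sample mean = 68
z = one_sample_z(68, 65, 9, 40)
z_rounded = round(z, 2)                       # rounded as in a z-table lookup

# Area to the right of the rounded z under the standard normal curve
right_tail = 1 - NormalDist().cdf(z_rounded)

print(f"z = {z_rounded}")
print(f"right-tail area = {right_tail:.4f}")
```

The right-tail area is the quantity the lesson reports as the Type I error for the computed z.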
The mean age of the registered voters in a certain municipality is 35. A random sample
of 35 registered voters from the same community has a mean age of 37 and the variance is
known to be 36. If the field reporter wants to prove that the recorded mean age is not 35, what are
the appropriate null and alternative hypotheses? What is the probability of getting a Type I
error? Can we conclude that the mean age is not 35 given a confidence level of 95%?
Solution:
H0: There is no significant difference between the mean age of the population and
the sample group of registered voters; that is, H0: µ = x̄
Ha: There is a significant difference between the mean age of the population and the
mean age of the sample group of registered voters; that is, Ha: µ ≠ x̄
We will utilize the two-tailed test since the aim is to prove that the mean age is not
35 and could either be lower or higher than 35.
$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{37 - 35}{6/\sqrt{35}} = 1.97$, where $\sigma = \sqrt{36} = 6$.
The graphs below show the comparison between the alpha error (α) of the
computed value of z and the confidence interval at a confidence level of 95%.
[Figure: standard normal curve from z = –3 to z = 3. The 95% confidence interval lies between the critical values z = –1.96 and z = 1.96, with 0.025 (2.5%) in each tail. The computed z = 1.97 has a tail area of 2.44% and falls in the rejection region.]
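As a rough check of this example, the sketch below recomputes the z-score from the given variance and compares it with the two-tailed critical value. It is an illustrative Python snippet with hypothetical variable names, and it obtains the critical value from the standard library instead of a printed table.

```python
from math import sqrt
from statistics import NormalDist

# Values from the example: mu = 35, sample mean = 37, variance = 36, n = 35
mu, xbar, variance, n = 35, 37, 36, 35
sigma = sqrt(variance)                      # standard deviation = 6

z = (xbar - mu) / (sigma / sqrt(n))

# Tail area beyond z, using z rounded to two decimals as in a z-table
tail_area = 1 - NormalDist().cdf(round(z, 2))

# Two-tailed test at a 95% confidence level: 2.5% in each tail
critical = NormalDist().inv_cdf(0.975)      # approximately 1.96
reject_h0 = abs(z) > critical

print(f"z = {z:.2f}, tail area = {tail_area:.4f}")
print(f"critical value = {critical:.2f}, reject H0: {reject_h0}")
```

Because the computed z lies only slightly beyond the critical value, the decision here is sensitive to rounding, which is worth keeping in mind when reading the graph.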
Write the letter of your answer. Write ”X” if your answer is not among the choices.
1. What is the proposed explanation, assertion, or assumption about a population
parameter?
a. assumption c. argument
b. hypothesis d. proposition
For numbers 6–10: Given: µ = 300, x̄ = 290, σ = 25, n = 35, and H0: µ = 300.
8. Assuming the true parameter has a confidence level of 95%, the error is ____.
a. beta error c. Type I error
b. Type II error d. confidence error
The z-test
The z-test is a test used to investigate large sample sizes (n > 30) and assess
whether two means or proportions differ significantly. The data are assumed to come
from a normal population whose variance is known.
Note that H0 takes any of the following forms: µ = x̄ or µ1 = µ2. Then form an Ha
which asserts an inequality between the values of µ and x̄ or µ1 and µ2.
Hence, Ha may take the two-tailed form: µ ≠ x̄ or µ1 ≠ µ2; or a one-tailed test form,
either a left-tailed test: µ < x̄ or µ1 < µ2; or a right-tailed test: µ > x̄ or µ1 > µ2.
Step 2 Determine the appropriate test to be used. We use the z-test if n > 30 and the
population standard deviation is given. On the other hand, we use the
t-test if n ≤ 30 (to be discussed in the next chapter).
Step 3 Determine the critical value c using the table below.
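                Level of significance
Test type       α = 1%        α = 2.5%      α = 5%        α = 10%
Left-tailed     z = –2.33     z = –1.96     z = –1.645    z = –1.28
Two-tailed      z = ±2.576    z = ±2.24     z = ±1.96     z = ±1.645
Right-tailed    z = 2.33      z = 1.96      z = 1.645     z = 1.28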
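Critical values for Step 3 can also be computed directly instead of being read from a table. The sketch below is one possible way to do this in Python; the helper `z_critical` and its `tail` argument are hypothetical names introduced only for illustration.

```python
from statistics import NormalDist

def z_critical(alpha, tail):
    """Critical z-value for a given significance level and test type.

    tail is "left", "right", or "two" (names chosen for this sketch).
    For a two-tailed test the positive value is returned; the rejection
    region is |z| greater than this value.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return std_normal.inv_cdf(alpha)
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)
    if tail == "two":
        return std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("tail must be 'left', 'right', or 'two'")

for alpha in (0.01, 0.025, 0.05, 0.10):
    print(
        f"alpha = {alpha:<5}",
        f"left = {z_critical(alpha, 'left'):.3f}",
        f"right = {z_critical(alpha, 'right'):.3f}",
        f"two-tailed = +/-{z_critical(alpha, 'two'):.3f}",
    )
```

The printed values carry more decimal places than a typical textbook table, so small differences from tabulated values such as 2.33 are due to rounding.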
A manufacturer of cellular phone batteries claims that when fully charged, the mean life
of his products lasts for 26 hours with a standard deviation of 5 hours. Mr. DG, a regular
distributor, randomly picked and tested 35 of the batteries. His test showed that the average
life of his sample is 24.3 hours. Is there a significant difference between the average life of all
Solution:
$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{24.3 - 26}{5/\sqrt{35}} = -2.01$, so $|z| = 2.01$
Step 5 Since |z| > c, that is, the computed z-score is in the rejection region
of H0, we reject H0.
Conclusion: There is a significant difference between the population mean and
the sample mean; or
When fully charged, the mean life of the sample batteries is
significantly different from that of the rest of the batteries.
Test the hypothesis in illustrative example 5-5 using a one-tailed test and µ = 25.5.
Solution:
Given: population mean µ = 25.5
sample mean x̄ = 24.3
sample size n = 35
standard deviation σ = 5
level of significance α = 5% (one-tailed)
$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{24.3 - 25.5}{5/\sqrt{35}} = -1.42$
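The two battery examples can be compared side by side with a short script. This is a minimal sketch under stated assumptions: the helper names `z_stat` and `rejects_h0` are hypothetical, the two-tailed test is assumed to use α = 5%, and the one-tailed test is taken to be left-tailed because the computed z is negative.

```python
from math import sqrt
from statistics import NormalDist

def z_stat(xbar, mu, sigma, n):
    """One-sample z-statistic."""
    return (xbar - mu) / (sigma / sqrt(n))

def rejects_h0(z, alpha, tail):
    """Return True if z falls in the rejection region."""
    std_normal = NormalDist()
    if tail == "two":
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    if tail == "left":
        return z < std_normal.inv_cdf(alpha)
    if tail == "right":
        return z > std_normal.inv_cdf(1 - alpha)
    raise ValueError("tail must be 'two', 'left', or 'right'")

# Two-tailed test against the claimed mean of 26 hours (alpha = 5% assumed)
z_two = z_stat(24.3, 26, 5, 35)
reject_two = rejects_h0(z_two, 0.05, "two")

# One-tailed (left-tailed) test against mu = 25.5 at alpha = 5%
z_one = z_stat(24.3, 25.5, 5, 35)
reject_one = rejects_h0(z_one, 0.05, "left")

print(f"two-tailed: z = {z_two:.2f}, reject H0: {reject_two}")
print(f"one-tailed: z = {z_one:.2f}, reject H0: {reject_one}")
```

The comparison shows how changing the hypothesized mean changes the decision even though the sample is the same.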
The z-test calculator, which works like scientific and probability calculators, can be
downloaded for free. In this chapter, we will use one example of this software, the
“in-silico” calculator, which may be accessed through http://in-silico.net/tools/statistics/ztest.
Test the difference between the two means given in illustrative example 5-5 using the
z-test calculator.
Step 3 The interpretation appears at the lower part of the box at the right side. This
means that we have to reject the null hypothesis in favor of the alternative
hypothesis. That is, there is a significant difference between the population
mean and the sample mean.
What is the test statistic to be used and the reasons for its selection?
The data were gathered from the result of testing the effectiveness of two different
strategies in increasing the mean sales of a product. Can we conclude that there is a
significant difference between the two strategies based on the mean sales? Test the
hypothesis using a two-tailed test at α = 10% with the given data for each strategy.
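[Figure: standard normal curve from z = –3 to z = 3 for a two-tailed test at the 90% level. The non-rejection region lies between the critical values c = –1.645 and c = 1.645, with a 5% rejection region in each tail; the computed values z = –4.68 and z = 4.68 fall in the rejection regions.]
$z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \dfrac{155{,}000 - 149{,}000}{\sqrt{\dfrac{7{,}000^2}{45} + \dfrac{5{,}000^2}{45}}} = 4.68$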
Step 5 Since the computed value of z is higher than the critical value c, it falls in the
rejection region, which means we reject H0.
Conclusion: There is a significant difference between the mean sales of the two
strategies.
Now we test the hypothesis of the given data in illustrative example 5-8 using a z-test
calculator.
Enter the data: x̄1 = 155,000, s1 = 7,000, n1 = 45, x̄2 = 149,000, s2 = 5,000, and n2 = 45.
Choose “Both” then click “Calculate.”
Test the hypothesis of the given data in illustrative example 5-8, assuming n1 = 40 so that
n1 ≠ n2. Use a one-tailed test at α = 1%.
Strategy A: x̄1 = 155,000   s1 = 7,000   n1 = 40 areas
Strategy B: x̄2 = 149,000   s2 = 5,000   n2 = 45 areas
Solution:
Step 1 H0: There is no significant difference between the two strategies based
on the mean sales of the product; or µ1 = µ2.
Ha: The mean sales of the product under strategy A are higher than the
mean sales of the product under strategy B; or µ1 > µ2.
Step 2 We use the z-test since n1 > 30 and n2 > 30.
Step 3 From the table of z-scores where α = 0.01 in a one-tailed test, c is 2.33.
Given the previous set of data, we test the hypothesis using a z-test calculator.
Notice that there is a slight difference in the values of z because of the rounding-off of
the value in step 4.
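The two-sample z-statistic for the strategy data can be reproduced with a few lines of code. The sketch below is illustrative only: the function name `two_sample_z` is hypothetical, and the unpooled standard error (each variance divided by its own sample size) is assumed to be the form intended in these examples.

```python
from math import sqrt

def two_sample_z(xbar1, s1, n1, xbar2, s2, n2):
    """z-statistic for the difference of two means (unpooled standard error)."""
    standard_error = sqrt(s1**2 / n1 + s2**2 / n2)
    return (xbar1 - xbar2) / standard_error

# Equal sample sizes, two-tailed test at alpha = 10% (critical value 1.645)
z_equal = two_sample_z(155_000, 7_000, 45, 149_000, 5_000, 45)

# Unequal sample sizes (n1 = 40), one-tailed test at alpha = 1% (critical value 2.33)
z_unequal = two_sample_z(155_000, 7_000, 40, 149_000, 5_000, 45)

print(f"n1 = n2 = 45:     z = {z_equal:.2f}, reject H0: {abs(z_equal) > 1.645}")
print(f"n1 = 40, n2 = 45: z = {z_unequal:.2f}, reject H0: {z_unequal > 2.33}")
```

Carrying full precision through the computation, as the script does, is one reason a calculator result can differ slightly from a hand computation that rounds intermediate values.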
A hog raiser in a certain province uses two methods of pig-farming: intensive pig
farming, where pigs are housed indoors in group-housing or straw-lined sheds; and
extensive pig farming, where pigs are allowed to wander around the farm or fence. Test
the hypothesis that the mean weight of pigs in intensive farming is higher than
that in extensive farming, based on the mean weights of the pigs in the sample with
the data shown below. Use a one-tailed test at α = 1%.
and
Letting
The records of patients in the hospital show that 35 of 100 patients have a high
cholesterol level of 240 mg/dl and above. Can we conclude that 30% of the patients have
high cholesterol level? Use a one-tailed test with α = 5%.
Solution:
Step 1 H0: There is no significant difference between the population and sample
proportions, that is, p0 = p̂.
Ha: The population proportion is higher than the sample proportion; that
is, p0 > p̂.
Step 2 We use the z-test with α = 5%, one-tailed test.
$z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}} = \dfrac{0.35 - 0.30}{\sqrt{0.30(0.70)/100}} = 1.09$
Step 5 Since z < c, the computed value does not fall in the rejection region, which
means we do not reject H0.
Conclusion: There is no significant difference between the population and sample
proportions; or H0: p0 = p̂.
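The one-proportion computation can be verified with a short script. This is a minimal sketch with a hypothetical function name, `one_proportion_z`; it assumes the standard error is based on the hypothesized proportion p0, and it uses the one-tailed critical value 1.645 for α = 5%.

```python
from math import sqrt

def one_proportion_z(successes, n, p0):
    """z-statistic comparing a sample proportion with a hypothesized p0."""
    p_hat = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error

# 35 of 100 patients have high cholesterol; hypothesized proportion is 30%
z = one_proportion_z(35, 100, 0.30)
critical = 1.645                      # right-tailed test at alpha = 5%
reject_h0 = z > critical

print(f"z = {z:.2f}, critical value = {critical}, reject H0: {reject_h0}")
```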
Click “One-proportion Z-test.” Enter the necessary data. Then click “Right” for a
right-tailed test. Click “Calculate.”
In a study on the effect of cigarette smoking on human health, it was found that
35 of 55 cigarette smokers aged 40 years old and above suffer from cigarette-related
diseases, and 28 of 50 smokers aged below 40 years suffer from the same
diseases. Is it safe to conclude that the proportion of smokers aged 40 years old and above
who suffer from cigarette-related diseases is significantly higher than the proportion of
smokers aged below 40 years who suffer from them? Test the hypothesis at α = 0.01.
Solution:
The following are the given for each age group:
Step 1 H0: The proportion of smokers aged 40 years old and above does not
differ from the proportion of smokers aged below 40 years old who suffer
from cigarette-related diseases. That is, p1 = p2.
Ha: The proportion of smokers aged 40 years old and above is significantly
higher than the proportion of smokers aged below 40 years old who
suffer from cigarette-related diseases. That is, p1 > p2.
Step 5 The computed z-value is less than the critical value c = 2.33, so it does not
fall in the rejection region, which means we do not reject H0.
Conclusion: The proportion of smokers aged 40 years old and above does not
significantly differ from the proportion of smokers aged below
40 years old who suffer from cigarette-related diseases; that is,
p1 = p2.
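The comparison of the two age groups can be sketched in code as follows. This is an illustration under a stated assumption: it uses the pooled two-proportion z-statistic, which is a common choice when H0 states p1 = p2, and the function name `two_proportion_z` is hypothetical.

```python
from math import sqrt

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z-statistic (assumed form for this sketch)."""
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)            # combined proportion under H0
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error

# 35 of 55 smokers aged 40 and above; 28 of 50 smokers aged below 40
z = two_proportion_z(35, 55, 28, 50)
critical = 2.33                               # one-tailed test at alpha = 0.01
reject_h0 = z > critical

print(f"z = {z:.2f}, critical value = {critical}, reject H0: {reject_h0}")
```

Under this assumption the statistic stays well below the critical value, which agrees with the decision stated in Step 5.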
1. D.G Co. has average gross sales of 37 million per week from its products in all
of its outlets, with a standard deviation of 6 million. An area manager found out
that the average gross sales from the 32 outlets under his jurisdiction is 34.8 million
per week. Is the mean sales of the products of all of the outlets significantly different
from the mean sales of the outlets in the area of the manager? Use α = 5%. Perform the
hypothesis testing for both one-tailed and two-tailed tests.
2. A teacher wants to find out if the webcasting method of teaching social sciences
subjects is more effective than the existing standard method of teaching. Two classes of
approximately equal intelligence were selected. From one class, the teacher randomly
selected a group of students as experimental group for webcasting methods. From
another class, the teacher then selected another group of students as sample for existing
standard method of teaching. After one trimester, the same tests were given to the
students. Their mean scores are shown in the following table:
Existing standard method   x̄ = 78.5   s = 9   n = 38
Test the hypothesis whether or not the webcasting method of teaching social sciences
subjects is more effective than the existing standard method of teaching. Use α = 0.05.
The t-test
The t-test is used when n < 30 and only the sample standard deviation is given
as a basis for estimating the population standard deviation. If the sample size
is small, the sampling distribution of the sample mean, standardized with the sample
standard deviation, is no longer well approximated by the standard normal distribution.
Thus, we use Student’s t-distribution to compensate for having less information on
which to base our conclusions.
The t-test table, or Student’s t-distribution table, is shown below, where α is expressed
as a decimal.
The formulas used for the t-test are similar to those of the z-test.
1. One-sample t-test:
$t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$, where df = n – 1
2. Two-sample t-test:
• Two sample means with n1 = n2 or with unequal variance
, where df = n1 + n2 – 2
, where df = 2n – 2
, where df = n1 + n2 – 2
, where
df = n1 + n2 – 2
How to interpret the result:
• If the computed value of t exceeds the critical or tabular value, then we reject H0.
• If the computed value of t is less than the critical or tabular value, then we fail to
reject H0.
The mean serum level measured in 12 patients twenty-four hours after they received
a newly recommended antibiotic was 1.2 mg/dl with a standard deviation of 0.4 mg/dl.
If the mean serum level in the general population is 1.0 mg/dl, test whether or not the
mean serum level in the sample group is significantly different from that of the general
population. Use α = 5%.
Ha: There is a significant difference between the mean serum level of
the sample group and that of the general population; or µ ≠ x̄.
Step 2 We use the t-test since n < 30 and only the sample standard deviation is
given. Furthermore, we know that we should conduct a two-tailed test
since the alternative hypothesis is a statement of inequality.
Step 3 Since n = 12, we have df = n – 1 = 12 – 1 = 11. It is also given that α = 5%
or 0.05. Locating the intersection of α = 0.05 for a two-tailed test and df = 11,
we get c = 2.201.
Step 5 Since our computed value of 1.73 is less than the critical value of 2.201,
then we fail to reject H0.
Conclusion: There is no significant difference between the mean serum level of the
sample group and that of the population.
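The serum-level example can be checked with a one-sample t computation. The following is a minimal Python sketch with a hypothetical function name, `one_sample_t`; the critical value 2.201 is taken from Step 3 of the solution and is not computed by the code.

```python
from math import sqrt

def one_sample_t(xbar, mu, s, n):
    """One-sample t-statistic and its degrees of freedom."""
    t = (xbar - mu) / (s / sqrt(n))
    return t, n - 1

# Sample of 12 patients: mean 1.2 mg/dl, standard deviation 0.4 mg/dl;
# general population mean 1.0 mg/dl
t, df = one_sample_t(1.2, 1.0, 0.4, 12)

critical = 2.201                      # two-tailed, alpha = 0.05, df = 11 (from Step 3)
reject_h0 = abs(t) > critical

print(f"t = {t:.2f}, df = {df}, reject H0: {reject_h0}")
```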
Step 1 Click the t-test at the right side of the page which is one of the options under
“Statistical Calculators.”
Step 2 Choose “One-Sample t-test” at the top.
Ten randomly selected gold mines produced 21, 19, 20, 22, 24, 21, 19, 22, 22, and 20
barrels of gold per day. Is this enough evidence to conclude that the gold mines are not
producing an average of 22.5 barrels of gold per day? Use = 1%.
Solution:
Step 1 H0: µ = 22.5
Ha: µ ≠ 22.5
Step 2 We use the t-test since n < 30 and only the standard deviation of the
sample can be obtained. Also, we know that this is a two-tailed test for
the same reason as that of the previous example.
Step 3 Since n = 10, we have df = n – 1 = 10 – 1 = 9. It is also given that α = 1%.
Thus, we have c = 3.250.
Step 4 Since we do not have given values for x̄ and s, we have to compute
them using the given data. We should get x̄ = 21 and s = 1.56. Computing
the value of the t-statistic, we get:
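The sample mean, sample standard deviation, and t-statistic for the gold mine data can be obtained directly from the raw values. The sketch below is an illustrative Python computation with hypothetical variable names; it uses the sample standard deviation (divisor n – 1) and the critical value 3.250 quoted in Step 3.

```python
from math import sqrt
from statistics import mean, stdev

# Daily production of the ten randomly selected mines
production = [21, 19, 20, 22, 24, 21, 19, 22, 22, 20]
mu = 22.5                             # hypothesized population mean

n = len(production)
xbar = mean(production)
s = stdev(production)                 # sample standard deviation (n - 1 divisor)

t = (xbar - mu) / (s / sqrt(n))

critical = 3.250                      # two-tailed, alpha = 0.01, df = 9 (from Step 3)
reject_h0 = abs(t) > critical

print(f"mean = {xbar}, s = {s:.2f}, t = {t:.2f}")
print(f"reject H0: {reject_h0}")
```

Since the script keeps the unrounded standard deviation, its t-value can differ in the second decimal place from a hand computation that uses s = 1.56.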
A teacher wishes to find out if the E-learning teaching method is more effective than the
traditional lecture method. For each teaching method, 15 students of approximately
equal intelligence were selected to be part of the study. After two months of applying
the two methods to the selected students, a 30-item test was given to them to assess their
performance. The scores of the students are shown in the table below.
Students 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Proposed 30 28 29 20 18 19 16 27 22 24 26 28 30 29 18
Existing 25 27 20 30 16 21 15 25 28 21 19 17 18 13 19
Is the E-learning teaching method more effective than the traditional method? Use
α = 0.05 and use a one-tailed test.
Solution:
To find the mean and standard deviation of the scores in each teaching method, do
the following steps:
Step 1 To open statistical functions, press (MODE, 2). This means press “MODE”
then press 2.
Step 2 To enter the first set of data (which are the scores of the students under
the E-learning method), type in each score then press “M+.” For example,
press (30, M+). Perform this until all scores are encoded.
Step 3 To compute for the mean, press (SHIFT, 2) then (1, =). The result must be
24.27.
Step 4 To compute for the standard deviation, press (SHIFT, 2) then (3, =). The
result must be 4.98.
Step 5 For data clearing and setting your calculator to normal mode, press
(SHIFT, MODE) then (3, =).
Apply the same steps to find the mean and standard deviation for the second set of
data (which are the scores of the students under the traditional lecture method).
Method     n         x̄            s
Proposed   n1 = 15   x̄1 = 24.27   s1 = 4.98
Existing   n2 = 15   x̄2 = 20.93   s2 = 5.05
Solution:
Since n1 = n2 = 15, and there are two independent samples, we have the following test
statistics:
Step 1 State the null and alternative hypothesis.
Ha: The E-learning method is more effective than the traditional method
of teaching; or µ1 > µ2.
Step 2 It is given that α = 5%. The number of degrees of freedom is:
df = n1 + n2 – 2 = 15 + 15 – 2 = 28.
Step 3 The critical value for this is 1.701.
Step 4 The t-test formula to be used if n1 = n2 is:
Since the computed value of 1.83 is higher than the critical value of 1.701,
we reject H0.
Conclusion: The E-learning method is more effective than the traditional
method of teaching; or µ1 > µ2.
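The statistics in this example can be recomputed from the raw scores. The sketch below is a minimal Python illustration that assumes the pooled-variance form of the two-sample t-statistic, which may not be written exactly as the formula used in the lesson; variable names are hypothetical, and the critical value 1.701 is the one quoted in Step 3.

```python
from math import sqrt
from statistics import mean, variance

proposed = [30, 28, 29, 20, 18, 19, 16, 27, 22, 24, 26, 28, 30, 29, 18]
existing = [25, 27, 20, 30, 16, 21, 15, 25, 28, 21, 19, 17, 18, 13, 19]

n1, n2 = len(proposed), len(existing)
xbar1, xbar2 = mean(proposed), mean(existing)
var1, var2 = variance(proposed), variance(existing)   # sample variances

# Pooled variance, weighting each sample variance by its degrees of freedom
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
t = (xbar1 - xbar2) / sqrt(pooled_var * (1 / n1 + 1 / n2))
df = n1 + n2 - 2

critical = 1.701                      # one-tailed, alpha = 0.05, df = 28 (from Step 3)
reject_h0 = t > critical

print(f"means: {xbar1:.2f}, {xbar2:.2f}")
print(f"standard deviations: {sqrt(var1):.2f}, {sqrt(var2):.2f}")
print(f"t = {t:.2f}, df = {df}, reject H0: {reject_h0}")
```

With unrounded inputs the statistic comes out at about 1.82, close to the 1.83 reported above; the small gap is consistent with rounding of intermediate values, and the decision is the same.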
A teacher wishes to find out if the E-learning teaching method is more effective
than the traditional lecture method. For the E-learning teaching method, 15 students
of approximately equal intelligence were selected to be part of the study while for the
traditional lecture method, 14 students were chosen. After two months of conducting the
two methods to the students, a 30-item test was given to them to assess their performance.
The scores of the students are shown in the table below.
Students 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Proposed 30 28 29 20 18 19 16 27 22 24 26 28 30 29 18
Existing 25 27 20 30 16 21 15 25 28 21 19 17 18 13
Test the hypothesis if there is no significant difference between the mean scores of
students in the two methods of teaching at α = 0.01.
Method     n         x̄            s
Proposed   n1 = 15   x̄1 = 24.27   s1 = 4.98
Existing   n2 = 14   x̄2 = 21.07   s2 = 5.21
Solution:
Since n1 = 15 and n2 = 14 and there are two independent samples, we have the
following test statistics:
Step 1 H0: There is no significant difference between the mean scores of the two
groups of students in the study; or µ1 = µ2.
Ha: There is a significant difference between the mean scores of the two
groups of students in the study; or µ1 ≠ µ2.
Step 2 It is given that α = 1%. The number of degrees of freedom is:
df = n1 + n2 – 2 = 15 + 14 – 2 = 27.
Step 3 The critical value for a two-tailed test at α = 0.01 and df = 27 is 2.771.
Since the computed value of 1.69 is less than the critical value of 2.771, we
do not reject H0.
Conclusion: There is no significant difference between the mean scores of the
two groups of students in the study; or µ1 = µ2.
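When the sample sizes differ, the t-statistic can be computed from the summary statistics alone. The following sketch assumes the pooled-variance formula with df = n1 + n2 – 2; the function name `pooled_t_from_summary` is hypothetical and introduced only for this illustration.

```python
from math import sqrt

def pooled_t_from_summary(xbar1, s1, n1, xbar2, s2, n2):
    """Pooled-variance two-sample t-statistic from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (xbar1 - xbar2) / standard_error, df

# Proposed method: n1 = 15, mean 24.27, s1 = 4.98
# Existing method: n2 = 14, mean 21.07, s2 = 5.21
t, df = pooled_t_from_summary(24.27, 4.98, 15, 21.07, 5.21, 14)

print(f"t = {t:.2f}, df = {df}")
# Compare |t| with the tabulated two-tailed critical value for this df and alpha.
```

The resulting statistic is then compared with the critical value from the t-table, as in the solution above.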
The sample uric acid levels of selected basketball and tennis players of the same age and
gender in a certain university were tested. The result is shown below. Is there
a difference in the mean uric acid levels between basketball and tennis players? Use
α = 0.05.
Group                n    x̄     s
Basketball players   15   4.5   1.0
Tennis players       15   3.4   1.5
Before performing the t-test, make sure that the “Analysis ToolPak” is installed. If not,
select “Add-ons,” “Analysis ToolPak,” then “OK.”
Step 4 Select cells A1 to A6 for Variable 1 Range and cells B1 to B6 for Variable
2 Range.
Set α = 5% (0.05) as the default. Click in the Output Range box and select cell A8.
Step 5 Click “OK” and the t-test table will appear as shown below.
Consider the absolute value of t-stat from the table above, which is 1.263.
Before performing the t-test, make sure that the “Analysis ToolPak” is installed or
activated. If not, do the following:
2. Click “Options.”
4. In the “Manage” box at the bottom, select “Excel Add-ins,” and then click “Go.”
5. In the Add-Ins dialog box, select the “Analysis ToolPak” check box, and then click
“OK.”
Step 1 In the “Data” Tab, click “Data Analysis” located at the right side of the
ribbon.
Step 2 Select “t-test: Two-Sample Assuming Unequal Variances,” then click “OK.”
Perform hypothesis testing in illustrative example 5-15 using Microsoft Excel.
Number of branches n1 = 6 n2 = 6
3. Oil wells in a large field produce an average of 33.5 barrels per day.
Fifteen randomly selected oil wells produce an average of 30 barrels of
crude oil per day with a standard deviation of 3.5 barrels. Is this enough
evidence to conclude that the oil wells are not producing an average of
33.5 barrels of crude oil per day? Test at α = 0.1. Use a two-tailed test.
                            mean       standard deviation   sample
Students with 0–1 gadgets   x̄1 = 83    s1 = 10              n1 = 12