Module XI Testing Hypothesis
HYPOTHESIS TESTING
Types of Hypotheses:
1. The null hypothesis states that there is no difference between a parameter and a specific
value, that there is no difference between two parameters, or that there is no relationship
between variables. Stating the hypothesis this way reflects the neutrality and objectivity
that must be present in any research undertaking.
2. The alternative hypothesis is a statistical hypothesis that states a specific difference
between a parameter and a specific value or states that there is a difference between two
parameters. It is the opposite of the null hypothesis: it asserts the existence of a
difference or a relationship.
The following examples illustrate how to formulate the different types of hypotheses
by knowing the title of any research study.
1. Title: The NSAT Scores and Academic Achievement of the Students in Private and
Public Schools.
Ho: There is no significant relationship between the NSAT performance and academic
achievement among the four learning areas of private schools, public schools and the
combination of private and public schools.
Hi: There is a significant relationship between the NSAT performance and academic
achievement among the four learning areas of private schools, public schools and the
combination of private and public schools.
Hi: The NSAT performance among the four learning areas of private schools, public schools
and the combination of private and public schools is better than their academic
achievement.
Ho: The competencies of nurses from government hospitals are equal to the competencies of
nurses from private hospitals.
Hi: The competencies of nurses from government hospitals are not equal to the competencies
of nurses from private hospitals.
Hi: Nurses from government hospitals are less competent than nurses from private hospitals.
Types of Error:
In a hypothesis test, a type I error occurs when the null hypothesis is rejected when it
is in fact true; that is, Ho is wrongly rejected.
The probability of a type I error can be precisely computed as,
P (type I error) = significance level = α
If we do not reject the null hypothesis, it may still be false (a type II error), because the
sample may not be big enough to detect that the null hypothesis is false (especially if
the true value is very close to the hypothesized value).
In a hypothesis test, a type II error occurs when the null hypothesis, Ho, is not rejected
when it is in fact false.
A type II error is frequently due to sample sizes being too small.
The probability of a type II error is symbolized by β and written:
P (type II error) = β (but is generally unknown).
The following table gives a summary of possible results of any hypothesis test:

                              Decision
Truth            Reject Ho          Don't reject Ho
Ho is true       Type I Error       Right Decision
H1 is true       Right Decision     Type II Error
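
The four outcomes in the table can be illustrated with a small simulation. This is only a sketch under stated assumptions: the hypothesized mean of 50, the population standard deviation of 4, the sample size of 30 and the "true" mean of 51 are hypothetical values chosen for illustration, and the test used is a two-sided z-test with known standard deviation. When Ho is true, the share of rejections should be close to α; when Ho is false, the share of non-rejections estimates β.

```python
import random
from statistics import NormalDist, fmean


def rejects_h0(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test with known sigma: True if Ho (mean == mu0) is rejected."""
    z = (fmean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha


def rejection_rate(true_mean, mu0=50.0, sigma=4.0, n=30, trials=4000, seed=1):
    """Share of simulated samples (hypothetical settings) in which Ho is rejected."""
    rng = random.Random(seed)
    rejected = sum(
        rejects_h0([rng.gauss(true_mean, sigma) for _ in range(n)], mu0, sigma)
        for _ in range(trials)
    )
    return rejected / trials


# Ho is true (true mean equals 50): every rejection is a type I error.
type_1_rate = rejection_rate(true_mean=50.0)
# Ho is false (true mean is 51): every non-rejection is a type II error.
type_2_rate = 1 - rejection_rate(true_mean=51.0)

print(f"Estimated P(type I error)  = {type_1_rate:.3f}")
print(f"Estimated P(type II error) = {type_2_rate:.3f}")
```

With a true mean this close to the hypothesized one and only 30 observations, the estimated β is large, which matches the remark that type II errors are frequently due to small samples.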
Significance Level
One-tailed Test
A one-sided test is a statistical hypothesis test in which the values for which we can
reject the null hypothesis, Ho, are located entirely in one tail of the probability distribution.
In other words, the critical region for a one-sided test is the set of values less than the
critical value of the test, or the set of values greater than the critical value of the test.
A one-sided test is also referred to as a one-tailed test of significance.
Example: Suppose we wanted to test a manufacturer’s claim that there are, on average, 50
matches in a box. We could set up the following hypotheses:
Ho: μ = 50
against
H1: μ < 50 or H1: μ > 50
Either of these two alternative hypotheses would lead to a one-sided test. Presumably,
we would want to test the null hypothesis against the first alternative hypothesis since it
would be useful to know if there are likely to be fewer than 50 matches, on average, in a box
(no one would complain if they got the correct number of matches in a box or more).
Two-tailed Test
A two-sided test is a statistical hypothesis test in which the values for which we can
reject the null hypothesis, Ho, are located in both tails of the probability distribution.
In other words, the critical region for a two-sided test is the set of values less than a
first critical value of the test and the set of values greater than a second critical value of the
test.
A two-sided test is also referred to as a two-tailed test of significance.
Example: Suppose we wanted to test a manufacturer’s claim that there are, on average, 50
matches in a box. We could set up the following hypotheses:
Ho: μ = 50
against
H1: μ ≠ 50
That is, nothing specific can be said about the average number of matches in a box;
only that, if we could reject the null hypothesis in our test, we would know that the average
number of matches in a box is likely to be less than or greater than 50.
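
The location of the critical region for each kind of test can be sketched in a few lines. The sketch assumes the test statistic follows a standard normal distribution (a large-sample z-test) and a 5% significance level; the function name `critical_values` and the labels "less", "greater" and "two-sided" are illustrative choices, not part of the module.

```python
from statistics import NormalDist


def critical_values(alpha, alternative):
    """Return the z critical value(s) that bound the rejection region.

    alternative is "less" (H1: mu < mu0), "greater" (H1: mu > mu0)
    or "two-sided" (H1: mu != mu0).
    """
    z = NormalDist()  # standard normal distribution
    if alternative == "less":
        return (z.inv_cdf(alpha),)  # reject when z falls below this value
    if alternative == "greater":
        return (z.inv_cdf(1 - alpha),)  # reject when z falls above this value
    if alternative == "two-sided":
        # alpha is split equally between the two tails
        return (z.inv_cdf(alpha / 2), z.inv_cdf(1 - alpha / 2))
    raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")


for alt in ("less", "greater", "two-sided"):
    print(alt, [round(c, 3) for c in critical_values(0.05, alt)])
```

For the matchbox example, the one-sided alternative μ < 50 rejects only for z below about −1.645, while the two-sided alternative μ ≠ 50 rejects for z below about −1.96 or above about 1.96.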
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} \qquad \text{(small sample)}$$
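b. Two sample mean test. A sample mean with another sample mean
$$t = \frac{\bar{x} - \bar{y}}{\sqrt{\dfrac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2}} \, \sqrt{\dfrac{1}{n_x} + \dfrac{1}{n_y}}}, \qquad df = n_x + n_y - 2$$
where s_x and s_y are the standard deviations of the two samples and n_x and n_y are the
sample sizes.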
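
The two-sample statistic can be sketched as a short function. It assumes the standard pooled-variance form of the test (equal population variances), which is consistent with df = nx + ny − 2; the summary figures in the usage line are hypothetical and serve only to show the arithmetic.

```python
import math


def pooled_t(mean_x, mean_y, s_x, s_y, n_x, n_y):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    df = n_x + n_y - 2
    # Weighted average of the two sample variances
    pooled_var = ((n_x - 1) * s_x ** 2 + (n_y - 1) * s_y ** 2) / df
    std_error = math.sqrt(pooled_var) * math.sqrt(1 / n_x + 1 / n_y)
    return (mean_x - mean_y) / std_error, df


# Hypothetical summary data: two samples of 10 with equal standard deviations
t, df = pooled_t(mean_x=52.0, mean_y=48.0, s_x=5.0, s_y=5.0, n_x=10, n_y=10)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The resulting t is then compared with the critical value of the t distribution with nx + ny − 2 degrees of freedom.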
Example 1: Bottles of ketchup are filled automatically by a machine which must be adjusted
periodically to increase or decrease the average content per bottle. Each bottle is
supposed to contain 18 oz. It is important to detect an average content significantly
above or below 18 oz so that the machine can be adjusted: too much ketchup per bottle
would be unprofitable, while too little would be a poor business practice and open the
company up to lawsuits about invalid labeling. We select a random sample of 32
bottles filled by the machine and compute their average weight to be 18.34 oz with a
standard deviation of 0.7334 oz. Should we adjust the machine? Use a comfort level
(significance level) of 5%.
Solution: We can see right away that the average weight of our sample, 18.34 oz, is
indeed different from the 18 oz it is supposed to be, but the question is whether the
difference is statistically significant. In our particular case we want to know whether
the machine is "off" and be sure to allow at most a 5% chance of an error in our
conclusion. After all, if we did conclude the difference is significant we would have to
adjust the ketchup machine, which is an expensive procedure that we don't want to
perform unnecessarily.
Our statistical test for the mean will provide the answer:
The Null Hypothesis is: the mean is equal to 18, i.e. "everything is fine"
The Alternative Hypothesis is: mean is not equal to 18 (2-tail test), i.e. we should
adjust the machine
The Test Statistics: sample mean is 18.34, standard deviation is 0.7334, sample size is
32. Compute the z-score as follows:
$$z = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{18.34 - 18}{0.7334 / \sqrt{32}} = 2.622$$
p = 2*(1 - P(z < 2.622)) = 0.0087
The probability is smaller than our comfort level α = 0.05 (or 5%). Equivalently, z = 2.622
exceeds the two-tailed critical value 1.96. So we reject the null hypothesis.
In other words, we conclude that the difference is statistically significant and that
therefore the alternative hypothesis is (likely) true. Therefore, we will adjust the ketchup
filling machine. Note that while we feel comfortable rejecting the null hypothesis (and
adjusting the machine), this decision could still be incorrect: if the machine really were
filling 18 oz on average, a sample at least this extreme would occur with probability about
0.9%.
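
The arithmetic of the ketchup example can be checked with a short sketch. It assumes, as the solution does, that a sample of 32 is large enough to use the normal distribution with the sample standard deviation; the function name `one_sample_z` is an illustrative choice.

```python
from statistics import NormalDist


def one_sample_z(sample_mean, mu0, s, n):
    """Return the z statistic and the two-tailed p-value for Ho: mean == mu0."""
    z = (sample_mean - mu0) / (s / n ** 0.5)
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed


# Figures from the ketchup example: mean 18.34, Ho mean 18, s = 0.7334, n = 32
z, p = one_sample_z(sample_mean=18.34, mu0=18, s=0.7334, n=32)
alpha = 0.05
print(f"z = {z:.3f}, p = {p:.4f}, reject Ho: {p < alpha}")
```

The p-value is doubled because the alternative hypothesis is two-tailed: a sample mean far below 18 oz would count against Ho just as much as one far above it.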
Example 2: A New York Times article noted that the mean life span for 35 male
symphony conductors was 73.4 years, in contrast to the mean of 69.5 years for the
general population. Assuming the 35 males have life spans with a standard deviation of
8.7 years, use a 0.05 significance level to test the claim that male symphony conductors
have a mean life span that is different from 69.5 years.
Solution: The test statistic is
$$z = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{73.4 - 69.5}{8.7 / \sqrt{35}} = 2.652$$
We want to compare this z-score to the critical values and critical region for a standard
normal distribution, so we find the critical values first. The null hypothesis says μ = 69.5
and the claim is that the mean is different from 69.5, so this is a two-tailed test: you would
be suspicious if your sample mean was too low or too high. At the 0.05 significance level
the critical values are z = −1.96 and z = 1.96.
Your sample has a z-score of z = 2.652, which is greater than 1.96, so it is clearly in the
critical region.
Since our test statistic is in the critical region, we reject the null hypothesis.
Now we need to state our conclusion in non-technical terms. The sample data support the
claim that male symphony conductors have an average life span different from 69.5 years.
Notice how carefully that conclusion is stated: it does not say "it proves" (in fact
statistics can never prove anything completely); it says the data support the claim.
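
The critical-region decision for the conductor example can be reproduced as follows. This is a minimal sketch that takes the figures from the problem statement and assumes the normal distribution applies; the variable names are illustrative.

```python
from statistics import NormalDist

# Figures from the conductor example
n, sample_mean, mu0, s, alpha = 35, 73.4, 69.5, 8.7, 0.05

z = (sample_mean - mu0) / (s / n ** 0.5)
# Two-tailed test: alpha is split between both tails of the distribution
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
in_critical_region = abs(z) > z_crit

print(f"z = {z:.3f}, critical values = ±{z_crit:.2f}, reject Ho: {in_critical_region}")
```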
Example 3: The expense of moving the storage yard for Consolidated Delivery Service is
justified only if it can be shown that the mean travel distance is less than 214 miles. In
trial runs of 12 delivery trucks, the mean and standard deviation are found to be 198
miles and 42 miles respectively. At the 0.01 level of significance, test the claim that
the mean is less than 214 miles.
Solution: Claim: μ < 214, so we test Ho: μ = 214 against H1: μ < 214 (a one-tailed test).
The test statistic to use if the sample size is less than 30 and you do not know the
population standard deviation is the t statistic:
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{198 - 214}{42 / \sqrt{12}} = -1.320$$
You will notice that this statistic is essentially the same as the z statistic; what is
different is that you compare against the t distribution and not the normal. Since there are
n − 1 = 11 degrees of freedom, use the 0.01 ONE TAIL column of the t table to find the
critical value, which is t = −2.718.
Because −1.320 is greater than −2.718, you are NOT in the critical region, so you should
fail to reject the null hypothesis.
Conclusion: There is not sufficient evidence to support the claim that the mean distance is
less than 214 miles.
Example 4: Suppose GAP, the clothing store, wants to introduce their line of clothing for
women to another country. But their clothing sizes are based on the assumption that
the average height of a woman is 162 cm. To determine whether they can simply ship
the clothes to the new country they select 5 women at random in the target country
and determine their heights (in cm) as follows: 149, 165, 150, 158, 153. Should they
adjust their line of clothing or ship the clothes without change? Make sure to decide at
the 0.05 level.
Solution:
Null Hypothesis: mean height in new country is the same as in old country, i.e. μ = 162
Alt. Hypothesis: mean height in new country is different from old country, i.e. μ not
equal to 162 (either too small or too tall would be bad for GAP)
Test Statistics: we can compute the sample mean = 155 and the sample standard
deviation = 6.60 while the sample size is clearly n = 5. Therefore
$$t = \frac{155 - 162}{6.60 / \sqrt{5}} = -2.37$$
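Rejection Region: the two-tailed p-value for t = 2.37 with 4 degrees of freedom is p = 0.077.
Thus, if the null hypothesis were true, a sample mean at least this far from 162 cm would
occur with probability 7.7%. That is larger than 0.05, so we fail to reject the null
hypothesis: the sample does not give sufficient evidence that the mean height differs from
162 cm.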
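
The two small-sample examples (Examples 3 and 4) can be checked without a t table. The sketch below uses only the Python standard library, so the t distribution's cumulative probability is approximated by numerically integrating its density with Simpson's rule; the helper names `t_pdf`, `t_cdf` and `t_statistic` are illustrative, and a statistics package would normally supply these functions directly.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x), by Simpson's rule on [0, |x|] plus symmetry about 0."""
    h = abs(x) / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(abs(x), df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_statistic(sample_mean, mu0, s, n):
    return (sample_mean - mu0) / (s / math.sqrt(n))


# Example 3: delivery trucks, left-tailed test at the 0.01 level, df = 11
t3 = t_statistic(198, 214, 42, 12)
p3 = t_cdf(t3, df=11)
print(f"Example 3: t = {t3:.3f}, one-tailed p = {p3:.3f}, reject Ho: {p3 < 0.01}")

# Example 4: heights, two-tailed test at the 0.05 level, df = 4
heights = [149, 165, 150, 158, 153]
t4 = t_statistic(mean(heights), 162, stdev(heights), len(heights))
p4 = 2 * t_cdf(-abs(t4), df=len(heights) - 1)
print(f"Example 4: t = {t4:.2f}, two-tailed p = {p4:.3f}, reject Ho: {p4 < 0.05}")
```

In both examples the p-value is larger than the significance level, which agrees with the decision reached by comparing t with the critical value.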
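Here is the test statistic for testing a claim about a population proportion:
$$z = \frac{\hat{p} - p}{\sqrt{pq / n}}$$
where p̂ is the sample proportion, p is the proportion stated in the null hypothesis,
q = 1 − p, and n is the sample size.
You compare this value to critical values found using the normal distribution.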
Example 1: In 1990, 5.8% of job applicants who were tested for drugs failed the test. At the
0.01 level of significance test the claim that the failure rate is now lower if a
random sample of 1520 current job applicants results in 58 failures.
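Solution: Claim: p < 0.058, so we test Ho: p = 0.058 against H1: p < 0.058 (a left-tailed
test). The sample proportion is 58/1520 ≈ 0.0382, which gives
$$z = \frac{58/1520 - 0.058}{\sqrt{(0.058)(0.942) / 1520}} = -3.31$$
At the 0.01 level of significance the critical value is z = −2.33.
Since the test statistic falls in the critical region we reject the null hypothesis. So the
sample data support the claim that the failure rate is now lower than 5.8%.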
Example 2: 1500 randomly selected pine trees were tested for traces of the Bark Beetle
infestation. It was found that 153 of the trees showed such traces. Test the
hypothesis that more than 10% of the pine trees have been infested. (Use a 5% level
of significance)
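
Both proportion examples can be worked with the same few lines. This is a sketch that assumes the usual normal approximation for a sample proportion; the function name `proportion_z` is an illustrative choice, and the decision printed for Example 2 is simply what the computation gives, since the module does not show a worked solution for it.

```python
from statistics import NormalDist


def proportion_z(successes, n, p0):
    """z statistic for Ho: population proportion == p0."""
    p_hat = successes / n
    return (p_hat - p0) / (p0 * (1 - p0) / n) ** 0.5


std_normal = NormalDist()

# Example 1: drug tests, H1: p < 0.058 (left-tailed), alpha = 0.01
z1 = proportion_z(58, 1520, 0.058)
crit1 = std_normal.inv_cdf(0.01)
reject1 = z1 < crit1
print(f"Example 1: z = {z1:.2f}, critical value = {crit1:.2f}, reject Ho: {reject1}")

# Example 2: pine trees, H1: p > 0.10 (right-tailed), alpha = 0.05
z2 = proportion_z(153, 1500, 0.10)
crit2 = std_normal.inv_cdf(0.95)
reject2 = z2 > crit2
print(f"Example 2: z = {z2:.2f}, critical value = {crit2:.3f}, reject Ho: {reject2}")
```

Under these assumptions Example 1 falls in the critical region, while in Example 2 the sample proportion of 10.2% is too close to 10% for the test to support the claim that more than 10% of the trees are infested.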
Directions: Answer the following exercises on yellow pad paper. Please do not write
back to back. Minimize erasures. DO NOT COMBINE EXERCISES 11.1, 11.2, 11.3 AND
11.4. ANSWER EACH SEPARATELY.
State the null and alternative hypotheses to be used in testing the following claims and
determine generally where the critical region is located. Choose between a directional and
a non-directional test.
1. No more than 20% of the faculty at the university contributed to the University day.
2. On the average, children attend schools within 4.2 kilometers of their homes.
3. The proportion of voters favoring the incumbent in the upcoming election is 0.65.
4. There is no significant difference between the IQ’s of male and female students.
5. The average age of CapSU Poblacion faculty is 38.5 years.
6. The average electric bill for residents of Mambusao is P2,000.00.
7. The average cost of a mobile phone is P5,050.
8. The average monthly income of Mambusao vendors is P10,500.00.
9. The average lifespan of cellphone batteries produced this month is 500 hours.
10. The mean IQ of television programming executives is 125.
1. A report in LTO stated that the average age of taxis in the Philippines is 9 years. An
operation manager of a large taxi company selects a sample of 40 taxis and finds the
average age of taxis is 8.2 years. The σ of the population is 2.3 years. At α = 0.05, can it
be concluded that the average age of the taxis in his company is less than the national
average?
2. The average college tuition fee for all private institutions last year was P36,400. A
random sample of costs this year for 45 institutions of higher learning indicated that the
sample mean was P37,900 and the sample standard deviation was P5,600. At the 0.01 level
of significance, is there sufficient evidence to conclude that the cost has increased?
3. The production manager of a fruit canning factory begins to suspect, as a result of
observing the machine operators, that the 16-ounce cans of fruit may be filled slightly
beyond the required weight. He takes a random sample of 80 packed cans and finds that
the mean weight is 16.08 ounces with a standard deviation of 0.04 ounces. At the 1% level
of significance, can the production manager conclude that the fruit cans were being
overfilled?
4. The treasurer of a municipality claims that the average net worth of families living in this
municipality is P590,000. A random sample of 50 families selected from this area
produced a mean net worth of P720,000 with a standard deviation of P65,000. Using the 5%
significance level, can we conclude that the claim is true?
5. A fruit juice franchise company has a policy of opening new fruit juice stands only in
those areas that have a mean household income of at least P30,500 a month. The company
is currently considering an area in which to open a new fruit juice stand. The company’s
research department took a sample of 25 households from this area and found that the
mean monthly income of these households is P32,600. Using the 5% significance level, would
you conclude that the company should open a fruit juice stand in the area?
Exercise 11.3 Test Concerning Proportion
1. The school registrar estimates that the dropout rate of high school freshmen in Mindanao
is 20%. Last year, 45 freshmen from a random sample of 250 Mindanao high school
freshmen withdrew. At α = 0.01, is there enough evidence to reject the registrar’s claim?
2. A Certified Public Accountant claims that more than 30% of all accountants advertise. A
sample of 300 accountants in Metro Manila showed that 84 had used some form of
advertising. At 0.05 level of significance, is there enough evidence to support the claim?
3. The GSIS states that 80% of its claims are settled within a month. A consumer group
selected a random sample of 180 of the GSIS’s claims to test this statement. If it is
found that 150 of the claims were settled within a month, does the group have sufficient
reason to support its contention that less than 80% of the claims are settled within a
month at the 0.05 level of significance?
4. The mathematics department has had a consistent failure rate of 12%. In one experimental
semester, more active counseling plus rigorous enforcement of prerequisites results in
338 failures among 2200 mathematics students. At the 0.05 significance level, test the
department’s claim that the failure rate has been lowered.
5. Elsie made a claim that at least 5% of male college students drive racing cars. Her
friend Sofia finds this hard to believe and decides to check the validity of Elsie’s claim, so
she took a random sample. At α = 0.05, does Sofia have sufficient evidence to reject Elsie’s
claim if there were 19 racing cars in her sample of 250 cars?
1. A survey found that the average hotel room rate in Luzon is P2,300 and the average
room rate in Mindanao is P2,150. Assume that the data were obtained from two
samples of 75 hotels in Luzon and 80 in Mindanao and the standard deviations were
P250 and P215, respectively. At α = 0.05 can it be concluded that there is a
significant difference in the rates?
3. As an aid for improving employees’ working habits, eight employees were randomly
selected to attend a seminar-workshop on the importance of work. The table shows
the workload completed per week before and after the seminar-workshop. At
α = 0.05, did attending the seminar-workshop increase the performance level of the
employees?
Before   10   12    8    7   11   10   13    9
After     9   14   11   12   15   13   12   14
4. The average credit card debt for a recent year was P43,100. Three years earlier the
average credit card debt was P41,890. Assume sample sizes of 50 were used and the
standard deviations of both samples are P7,810. Is there enough evidence to believe
that the average credit card debt has increased? Use α = 0.05.
5. A survey was designed to compare the smoking habits of married men with those of
married women. A random sample of 240 men revealed that 180 smoked. A random
sample of 220 women indicated that 54 smoked. At the 0.01 significance level does
the evidence show that a higher proportion of men smoke?