Chapter 3 Hypothesis Testing (Students - Notes)
Introduction
Consider a person who has been indicted for committing a crime and is being tried in court. Based on the available
evidence, the judge or jury will make one of two possible decisions:
The person is not guilty
The person is guilty
At the outset of the trial the person is presumed not guilty. The prosecution must then prove
(gather evidence) that the person has committed the crime and is, hence, guilty.
Two Hypotheses
In statistical terms:
Null hypothesis, $H_0$: The person is not guilty.
Alternative hypothesis, $H_1$: The person is guilty.
Null hypothesis:
is usually the hypothesis that is assumed to be true to begin with
states that a given claim (or statement) about a population parameter is true.
Definitions
A statistical hypothesis is a conjecture about a population parameter. This conjecture may or
may not be true.
A null hypothesis is a claim (or statement) about a population parameter that is assumed to be
true until it is declared false.
An alternative hypothesis is a claim about a population parameter that will be true if the null
hypothesis is false.
                        Actual Situation
Decision                $H_0$ is true               $H_0$ is false
Do not reject $H_0$     Correct decision            Type II or $\beta$ error
Reject $H_0$            Type I or $\alpha$ error    Correct decision
STA408 Chapter 3: Hypothesis Testing
Tails of a test
A two-tailed test has the rejection region in both tails of the distribution curve.
A left-tailed test has the rejection region in the left tail of the distribution curve.
A right-tailed test has the rejection region in the right tail of the distribution curve.
Example 1
State the null and alternative hypotheses for each of the following statements. Determine whether each is a case
of a two-tailed, a right-tailed or a left-tailed test.
(a) Test if the mean number of hours spent working per week by college students who hold jobs is
different from 20 hours.
(b) 10 hours per
month.
(c) Test if the mean credit card debt of college seniors is less than RM 1000.
Step 2: Assuming that $H_0$ is true, state the distribution for the hypothesis test (optional).
Step 3: Determine the rejection region (or critical region) according to the given significance level, $\alpha$ (this can
be based on the $p$-value or the critical value).
Step 4: State the $p$-value or test statistic from the Minitab output. Otherwise calculate the test statistic
using the formula given,
Example 2
A telephone company provides long-distance telephone service. According to the company's
records, the average length of all long-distance calls placed through this company in 2009 was 12.44
minutes. The company wants to check whether the mean length of its current long-distance
calls is different from 12.44 minutes. A sample of 150 such calls placed through this company produced
a mean length of 13.71 minutes. The standard deviation of all such calls is 2.65 minutes. Using the 2%
significance level, can you conclude that the mean length of all current long-distance calls is different
from 12.44 minutes?
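As a rough check of the arithmetic in Example 2, the two-tailed $z$ test can be sketched in Python using only the standard library. The variable names are illustrative, and the sketch assumes the $z$ test applies because the population standard deviation is given.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from Example 2
mu0 = 12.44     # hypothesised mean length (minutes)
xbar = 13.71    # sample mean (minutes)
sigma = 2.65    # population standard deviation (minutes)
n = 150         # sample size
alpha = 0.02    # significance level

std_normal = NormalDist()  # standard normal distribution

# Test statistic: z = (xbar - mu0) / (sigma / sqrt(n))
z = (xbar - mu0) / (sigma / sqrt(n))

# Two-tailed test: alpha is split equally between the two tails
z_crit = std_normal.inv_cdf(1 - alpha / 2)
p_value = 2 * (1 - std_normal.cdf(abs(z)))

reject_h0 = abs(z) > z_crit
print(f"z = {z:.2f}, critical values = +/-{z_crit:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Here $z \approx 5.87$ lies far beyond the critical values of about $\pm 2.33$, so under these assumptions $H_0$ is rejected at the 2% level.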
Example 3
a sample of 36 pairs of shoes from a catalogue and finds the following cost (rounded to the nearest RM).
Is there enough evidence to support the resea ? Assume .
Example 4
The Minitab output for the data in Example 3 is as given below. Use the $p$-value to test if the average cost of
at the 10% significance level.
One-Sample Z: cost
Example 5
A psychologist claims that the mean age at which children start walking is 12.5 months. Carol wanted to
check if this claim is true. She took a random sample of 18 children and found that the mean age at which
these children started walking was 12.9 months with a standard deviation of 0.80 month. It is known that
the ages at which all children start walking are approximately normally distributed. Test whether the mean
age at which all children start walking is greater than 12.5 months. What will your conclusion be if the
significance level is 1%?
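The calculation for Example 5 can be sketched as follows. Because the population standard deviation is unknown and the sample is small, a $t$ test with $n - 1 = 17$ degrees of freedom is assumed. The critical value 2.567 is read from a standard $t$ table (one tail, $\alpha = 0.01$, 17 degrees of freedom) rather than computed, and the variable names are illustrative.

```python
from math import sqrt

# Values taken from Example 5
mu0 = 12.5    # claimed mean age (months)
xbar = 12.9   # sample mean (months)
s = 0.80      # sample standard deviation (months)
n = 18        # sample size

# Test statistic: t = (xbar - mu0) / (s / sqrt(n)), with df = n - 1
t = (xbar - mu0) / (s / sqrt(n))
df = n - 1

# Assumption: critical value looked up in a t table for a right-tailed test
# with alpha = 0.01 and df = 17
t_crit = 2.567

reject_h0 = t > t_crit
print(f"t = {t:.3f}, df = {df}, critical value = {t_crit}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

With $t \approx 2.121 < 2.567$, the sketch does not reject $H_0$ at the 1% level.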
Example 6
An educator claims that the average salary of substitute teachers in school is more than RM55 per day. A
random sample of eight schools is selected, and the daily salaries (in RM) are shown. Is there enough
?
60 56 60 55 70 55 60 55
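For the data in Example 6, the sample mean, sample standard deviation and $t$ statistic can be computed as in the sketch below. It assumes a one-sample $t$ test against RM55; the significance level is not restated here, so the sketch stops at the test statistic.

```python
from math import sqrt
from statistics import mean, stdev

# Daily salaries (RM) from Example 6
salaries = [60, 56, 60, 55, 70, 55, 60, 55]
mu0 = 55  # claimed mean daily salary (RM)

n = len(salaries)
xbar = mean(salaries)   # sample mean
s = stdev(salaries)     # sample standard deviation (divisor n - 1)

# Test statistic: t = (xbar - mu0) / (s / sqrt(n)), with df = n - 1
t = (xbar - mu0) / (s / sqrt(n))
df = n - 1

print(f"mean = {xbar:.3f}, s = {s:.3f}, t = {t:.3f}, df = {df}")
```

The resulting $t$ is then compared with the right-tailed critical value for 7 degrees of freedom at the chosen significance level.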
Example 7
The Minitab output for the data in Example 6 is as given below. Use the $p$-value to test if the mean
daily salary of substitute teachers in school is more than RM55 per day.
One-Sample T: salary
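Test statistic
In general, a test statistic for two population means is computed as follows:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}}$$
where the standard error $s_{\bar{x}_1 - \bar{x}_2}$ and the degrees of freedom depend on the assumption about the variances.
Variances $\sigma_1^2$ and $\sigma_2^2$ unknown and equal:
$$s_{\bar{x}_1 - \bar{x}_2} = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}, \qquad s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} \qquad \text{and} \qquad df = n_1 + n_2 - 2$$
Variances $\sigma_1^2$ and $\sigma_2^2$ unknown and unequal:
$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \qquad \text{and} \qquad df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}} \text{ (rounded down)}$$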
Example 9
A sample of 14 cans of Brand I diet soda gave a mean number of calories of 23 per can with a standard
deviation of 3 calories. Another sample of 16 cans of Brand II diet soda gave the mean number of calories
of 25 per can with a standard deviation of 4 calories. At the 1% significance level, can you conclude that
the mean numbers of calories per can are different for these two brands of diet soda? Assume that the
calories per can of diet soda are normally distributed for each of the two brands and that the standard
deviations for the two populations are equal.
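A minimal sketch of the pooled two-sample $t$ test for Example 9 follows. It assumes equal population standard deviations, as the example states. The critical value 2.763 is read from a standard $t$ table (two-tailed, $\alpha = 0.01$, 28 degrees of freedom), and the variable names are illustrative.

```python
from math import sqrt

# Summary statistics from Example 9
n1, xbar1, s1 = 14, 23, 3   # Brand I
n2, xbar2, s2 = 16, 25, 4   # Brand II

# Pooled standard deviation (equal-variance assumption)
sp = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

# Standard error of the difference, then the test statistic under H0: mu1 - mu2 = 0
se = sp * sqrt(1 / n1 + 1 / n2)
t = (xbar1 - xbar2) / se
df = n1 + n2 - 2

# Assumption: t-table value for area 0.005 in each tail with df = 28
t_crit = 2.763

reject_h0 = abs(t) > t_crit
print(f"sp = {sp:.4f}, t = {t:.3f}, df = {df}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Since $|t| \approx 1.531 < 2.763$, the sketch does not reject $H_0$ at the 1% level.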
Example 10
A company recently opened two supermarkets in two different areas. The management wants to know if
the mean sales per day for these two supermarkets are different. The Minitab output for two-sample
T-test and CI of RM) for Supermarket A and B
respectively are presented below.
Estimate for difference: -9.49
99% CI for difference: (-15.06, -3.92)
T-Test of difference = 0 (vs ≠): T-Value = -4.85 P-Value = 0.000 DF = 20
Both use Pooled StDev = 4.5735
Assume the daily sales of the two supermarkets are both normally distributed.
a) Show that the test statistic, $t = -4.85$.
b) State the null and alternative hypotheses for the above test.
c) Based on the output, what is the assumption for the variances of the daily sales between the two
supermarkets? Explain your answer.
d) Using the $p$-value, do the data provide sufficient evidence to indicate that there is a significant
difference in daily sales between the two supermarkets at the 1% significance level?
e) State the 99% confidence interval of the mean difference in daily sales between Supermarket A and
Supermarket B? Does the interval further verify the conclusion in (d)? Support your answer.
Example 11
Refer to Example 9. Test at the 1% significance level whether the mean number of calories per can of diet soda
is different for these two brands. Assume that the calories per can of diet soda are normally distributed
for each of these two brands and that the standard deviations for the two populations are not equal.
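For comparison with the pooled test, Example 11 can be sketched without the equal-variance assumption. The degrees of freedom use the usual unequal-variance approximation, rounded down. The critical value 2.771 is read from a standard $t$ table (two-tailed, $\alpha = 0.01$, 27 degrees of freedom), and the variable names are illustrative.

```python
from math import floor, sqrt

# Summary statistics from Example 9, reused in Example 11
n1, xbar1, s1 = 14, 23, 3   # Brand I
n2, xbar2, s2 = 16, 25, 4   # Brand II

v1 = s1**2 / n1   # estimated variance of the first sample mean
v2 = s2**2 / n2   # estimated variance of the second sample mean

# Standard error and test statistic without pooling
se = sqrt(v1 + v2)
t = (xbar1 - xbar2) / se

# Approximate degrees of freedom, rounded down to an integer
df = floor((v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1)))

# Assumption: t-table value for area 0.005 in each tail with df = 27
t_crit = 2.771

reject_h0 = abs(t) > t_crit
print(f"t = {t:.3f}, df = {df}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

The conclusion matches the pooled test: $|t| \approx 1.560 < 2.771$, so $H_0$ is not rejected.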
Example 12
Below is the Minitab output for two-sample T-test and CI for the 50 randomly selected 30-year fixed-rate
mortgages and 45 randomly selected 20-year fixed-rate mortgages granted in a week.
Estimate for difference: 0.1352
99% lower bound for difference: -0.0794
T-Test of difference = 0 (vs >): T-Value = 1.49 P-Value = 0.070 DF = 92
Example 13
the average sales of its employees. The company sent six of its salespersons to attend this course. The
table below gives the 1-week sales of these salespersons before and after they attended the course.
Before   12   18   25    9   14   16
After    18   24   24   14   19   20
Using the 1% significance level, can you conclude that the mean weekly sales for all salespersons increases
as a result of attending this course? Assume that the population of paired differences has a normal
distribution.
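The paired-samples calculation for Example 13 can be sketched as below. The differences are taken as Before minus After, so an increase in sales corresponds to a left-tailed test; this direction is a choice made for the sketch. The critical value 3.365 is read from a standard $t$ table (one tail, $\alpha = 0.01$, 5 degrees of freedom).

```python
from math import sqrt
from statistics import mean, stdev

# 1-week sales from Example 13
before = [12, 18, 25, 9, 14, 16]
after = [18, 24, 24, 14, 19, 20]

# Paired differences d = Before - After
d = [b - a for b, a in zip(before, after)]

n = len(d)
d_bar = mean(d)   # mean of the paired differences
s_d = stdev(d)    # standard deviation of the paired differences

# Test statistic under H0: mu_d = 0, with df = n - 1
t = d_bar / (s_d / sqrt(n))
df = n - 1

# Assumption: t-table value for area 0.01 in one tail with df = 5
t_crit = 3.365

reject_h0 = t < -t_crit   # left-tailed test
print(f"d_bar = {d_bar:.4f}, s_d = {s_d:.4f}, t = {t:.3f}, df = {df}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Here $t \approx -3.867 < -3.365$, so $H_0$ is rejected at the 1% level.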
Example 14
Below is the Minitab output of Example 13.
Paired T-Test and CI: Before, After
Note: Whether the hypothesis is tested using the test statistic or the $p$-value, the decision and
conclusion of the test remain the same.
In Example 13, the test was conducted using the test statistic and $H_0$ was rejected.
Similarly in Example 14, $H_0$ is rejected because the $p$-value $< \alpha$.
Test statistic
$$\chi^2 = \frac{(n - 1)s^2}{\sigma^2} \qquad \text{with} \qquad df = n - 1$$
Example 15
An instructor wishes to see whether the variation in scores of the 23 students in her class is less than the
variance of the population. The variance of the class is 198. Is there enough evidence to support the claim
that the variation of the students is less than the population variance ( ) at ? Assume
that the scores are normally distributed.
Example 16
A hospital administrator believes that the standard deviation of the number of people using outpatient
surgery per day is greater than 8. A random sample of 15 days is selected. The data are shown. At
, is there enough evidence to
distributed.
25 30 5 15 18 42 16 9
10 12 12 38 8 14 27
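The chi-square test statistic for the data in Example 16 can be computed as in the sketch below, assuming the hypothesised standard deviation of 8 stated in the example. The sketch stops at the test statistic, which can be compared with the value reported in the Minitab output of Example 17.

```python
from statistics import variance

# Number of people using outpatient surgery per day (Example 16)
patients = [25, 30, 5, 15, 18, 42, 16, 9,
            10, 12, 12, 38, 8, 14, 27]
sigma0 = 8   # hypothesised population standard deviation

n = len(patients)
s2 = variance(patients)   # sample variance (divisor n - 1)

# Test statistic: chi-square = (n - 1) * s^2 / sigma0^2, with df = n - 1
chi2 = (n - 1) * s2 / sigma0**2
df = n - 1

print(f"s^2 = {s2:.3f}, chi-square = {chi2:.2f}, df = {df}")
```

The result, about 27.45 with 14 degrees of freedom, agrees with the Chi-Square statistic in the Minitab output.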
Example 17
Below is the Minitab output for the data in Example 16. Using -
.
Test and CI for One Variance: No_of_patient
Method
Statistics
                             Lower Bound   Lower Bound
Variable        Method       for StDev     for Variance
No_of_patient   Chi-Square   9.1           83
                Bonett       8.7           76
Tests
                             Test
Variable        Method       Statistic   DF   P-Value
No_of_patient   Chi-Square   27.45       14   0.017
                Bonett                        0.046
Test Statistic
$$F = \frac{s_1^2}{s_2^2}$$
where $df_1 = n_1 - 1$ and $df_2 = n_2 - 1$.
Note:
The larger variance should always be placed in the numerator of the formula regardless of the
subscripts.
For a two-tailed test, the $\alpha$ value must be divided by 2 and the critical value placed on the right side
of the curve.
Example 18
The standard deviation of the average waiting time to see a doctor for non-life-threatening problems in
the emergency room at an urban hospital is 32 minutes. At a second hospital, the standard deviation is
28 minutes. If a sample of 13 patients was used in the first case and 19 in the second case, is there enough
evidence to conclude at the 0.01 significance level that the standard deviation of the waiting times in the
first hospital is greater than the standard deviation of the waiting times in the second hospital? Assume
that the waiting times at both hospitals are approximately normally distributed.
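A minimal sketch of the $F$ test for Example 18 is given below, with the larger variance in the numerator. The critical value of about 3.37 is taken from a standard $F$ table ($\alpha = 0.01$, 12 and 18 degrees of freedom) and should be checked against the table in use; the variable names are illustrative.

```python
# Summary statistics from Example 18
s1, n1 = 32, 13   # first hospital: standard deviation (minutes), sample size
s2, n2 = 28, 19   # second hospital: standard deviation (minutes), sample size

# Test statistic: ratio of sample variances, larger variance in the numerator
F = s1**2 / s2**2
df1 = n1 - 1   # numerator degrees of freedom
df2 = n2 - 1   # denominator degrees of freedom

# Assumption: right-tail F-table value for alpha = 0.01 with (12, 18) df
F_crit = 3.37

reject_h0 = F > F_crit
print(f"F = {F:.2f}, df1 = {df1}, df2 = {df2}, critical value = {F_crit}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

With $F \approx 1.31$, well below the critical value, the sketch does not reject $H_0$ at the 0.01 level.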
Example 19
A medical researcher wishes to see whether the variance of the heart rates (in beats per minute) of
smokers is different from the variance of heart rates of people who do not smoke. Two samples are
selected, and the data are as shown. Using , is there enough evidence to support the claim?
Assume that both variables are normally distributed.
Smokers Non-smokers
Example 20
The CEO of an airport hypothesizes that the variance in the number of passengers for Malaysian Airports
is greater than the variance in the number of passengers for foreign airports. The data is summarised in
the Minitab output below.
Test and CI for Two Variances: Malaysian_Airport, Foreign_Airport
Method
F method was used. This method is accurate for normal data only.
Statistics
                                               90% Lower Bound
Variable            N    StDev    Variance     for Variances
Malaysian_Airport   6   15.697    246.382      133.376
Foreign_Airport     4    9.791     95.873       46.009
Tests
                      Test
Method   DF1   DF2    Statistic   P-Value
F          5     3         2.57     0.234
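The test statistic in the Minitab output for Example 20 can be reproduced from the reported variances, as in this short sketch. The variable names are illustrative; the variances and sample sizes are copied from the output.

```python
# Sample variances and sizes reported in the Minitab output of Example 20
var_malaysian, n_malaysian = 246.382, 6
var_foreign, n_foreign = 95.873, 4

# F statistic: larger variance in the numerator
F = var_malaysian / var_foreign
df1 = n_malaysian - 1   # numerator degrees of freedom
df2 = n_foreign - 1     # denominator degrees of freedom

print(f"F = {F:.2f}, DF1 = {df1}, DF2 = {df2}")
```

This gives $F \approx 2.57$ with 5 and 3 degrees of freedom, matching the output. The reported $p$-value of 0.234 is then compared with the chosen significance level.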