Chapter 3 Hypothesis Testing (Students - Notes)

Uploaded by

nrbtrsyialia

STA408: Statistics for Science and Engineering

Chapter 3: Hypothesis Testing

Introduction
Consider a person who has been indicted for committing a crime and is being tried in court. Based on the available
evidence, the judge or jury will make one of two possible decisions:
- The person is not guilty
- The person is guilty
At the outset of the trial the person is presumed not guilty. It is then up to the prosecution to prove
(gather evidence) that the person has committed the crime and hence is guilty.

Two Hypotheses
In statistical terms, the two possible decisions correspond to two hypotheses:
- Null hypothesis, $H_0$: The person is not guilty.
- Alternative hypothesis, $H_1$: The person is guilty.

The null hypothesis:
- is usually the hypothesis that is assumed to be true to begin with;
- states that a given claim (or statement) about a population parameter is true.

Definitions
A statistical hypothesis is a conjecture about a population parameter. This conjecture may or
may not be true.
A null hypothesis is a claim (or statement) about a population parameter that is assumed to be
true until it is declared false.
An alternative hypothesis is a claim about a population parameter that will be true if the null
hypothesis is false.

Rejection and non-rejection regions

Four possible outcomes of a test of hypothesis

                                      Actual Situation
Decision                 $H_0$ is true               $H_0$ is false
Do not reject $H_0$      Correct decision            Type II or $\beta$ error
Reject $H_0$             Type I or $\alpha$ error    Correct decision
STA408 Chapter 3: Hypothesis Testing

Two types of errors


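A Type I error occurs when a true null hypothesis is rejected.

The value of $\alpha$ represents the probability of committing a Type I error; it is the significance level of the test.

A Type II error occurs when a false null hypothesis is not rejected.

The value of $\beta$ represents the probability of committing a Type II error, and $1 - \beta$ represents the power of the test.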

Tails of a test
A two-tailed test has the rejection region in both tails of the distribution curve.
A left-tailed test has the rejection region in the left tail of the distribution curve.
A right-tailed test has the rejection region in the right tail of the distribution curve.

Signs in $H_0$ and $H_1$ and tails of a test

                                             Two-Tailed Test    Left-Tailed Test    Right-Tailed Test
Sign in the null hypothesis, $H_0$           $=$                $=$ or $\geq$       $=$ or $\leq$
Sign in the alternative hypothesis, $H_1$    $\neq$             $<$                 $>$
Rejection region                             In both tails      In the left tail    In the right tail

Hypothesis Testing Common Phrases

$>$                 $<$                             $=$                     $\neq$
Is greater than     Is less than                    Is equal to             Is not equal to
Is above            Is below                        Is the same as          Is different from
Is higher than      Is lower than                   Has not changed from    Has changed from
Is longer than      Is shorter than                 Is the same as          Is not the same as
Is bigger than      Is smaller than
Is increased        Is decreased or reduced from

Example 1
State the null and alternative hypotheses for each of the following statements. Determine whether each is a case
of a two-tailed, a right-tailed or a left-tailed test.
(a) Test if the mean number of hours spent working per week by college students who hold jobs is
different from 20 hours.
(b) 10 hours per
month.
(c) Test if the mean credit card debt of college seniors is less than RM 1000.
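
As an illustrative sketch only, parts (a) and (c) could be written as below, assuming $\mu$ denotes the population mean in each part (hours per week in (a), RM in (c)). The phrase "different from" points to a two-tailed test and "less than" to a left-tailed test.

```latex
\begin{align*}
\text{(a)}\quad & H_0: \mu = 20,   & H_1: \mu \neq 20  & \quad \text{(two-tailed test)} \\
\text{(c)}\quad & H_0: \mu = 1000, & H_1: \mu < 1000   & \quad \text{(left-tailed test)}
\end{align*}
```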


3.1 Hypothesis Test of One Population Mean


Two approaches for hypothesis tests
1. The $p$-value approach
Calculate the probability, or $p$-value, for the observed value of the sample statistic (the
$p$-value is obtained from the Minitab output).
Compare the $p$-value with the significance level, $\alpha$, and make a decision.
Decision making:
- Reject $H_0$ if $p$-value $< \alpha$.
- Do not reject $H_0$ if $p$-value $\geq \alpha$.

2. The critical value approach


Find the critical value(s) from a table (normal distribution or $t$ distribution table).
Find the value of the test statistic for the observed value of the sample (the value is either
calculated or obtained from the Minitab output).
Compare the test statistic (either from the Minitab output or calculated, i.e., $z$ or $t$) with
the critical value(s) and make a decision.
Decision making

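Type of Distribution               Type of Test         Decision                                                                    Test statistic

$z$ distribution                   Two-tailed test      Reject $H_0$ if $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$.                  $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$
- $\sigma$ known                   Right-tailed test    Reject $H_0$ if $z > z_{\alpha}$.
- normal                           Left-tailed test     Reject $H_0$ if $z < -z_{\alpha}$.
- large sample size
  (approximated by CLT)

$t$ distribution                   Two-tailed test      Reject $H_0$ if $t < -t_{\alpha/2,\,n-1}$ or $t > t_{\alpha/2,\,n-1}$.      $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
- $\sigma$ unknown                 Right-tailed test    Reject $H_0$ if $t > t_{\alpha,\,n-1}$.
- normally distributed with        Left-tailed test     Reject $H_0$ if $t < -t_{\alpha,\,n-1}$.
  small sample size
- large sample size
  (approximated by CLT)

Here $\mu_0$ denotes the value of $\mu$ stated in $H_0$.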

Steps to perform a test of hypothesis:


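Step 1: State the null and alternative hypotheses.

                          Two-Tailed Test          Right-Tailed Test      Left-Tailed Test
Null Hypothesis           $H_0: \mu = \mu_0$       $H_0: \mu = \mu_0$     $H_0: \mu = \mu_0$
Alternative Hypothesis    $H_1: \mu \neq \mu_0$    $H_1: \mu > \mu_0$     $H_1: \mu < \mu_0$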


Step 2: Assuming that $H_0$ is true, state the distribution for the hypothesis test (optional).
Step 3: Determine the rejection region (or critical region) according to the given significance level, $\alpha$ (this can
be based on the $p$-value or the critical value).
Step 4: State the $p$-value or test statistic from the Minitab output. Otherwise, calculate the test statistic
using the appropriate formula:

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \quad \text{or} \quad t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

Step 5: Make a comparison:

- Compare the $p$-value with the significance level, $\alpha$; or
- Compare the test statistic (either from the Minitab output or calculated, i.e., $z$ or $t$) with
the critical value(s).
Step 6: Make a decision
Step 7: Draw a conclusion.

Example 2
A telephone company provides long-distance telephone service. According to the company's
records, the average length of all long-distance calls placed through this company in 2009 was 12.44
minutes. The company's management wants to check whether the mean length of current long-distance
calls is different from 12.44 minutes. A sample of 150 such calls placed through this company produced
a mean length of 13.71 minutes. The standard deviation of all such calls is 2.65 minutes. Using the 2%
significance level, can you conclude that the mean length of all current long-distance calls is different
from 12.44 minutes?
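
The seven steps can be sketched in Python for this example. This is a minimal illustration rather than the notes' own solution: it assumes a two-tailed $z$ test (the population standard deviation is given and the sample is large) and uses only the standard library; the variable names are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: H0: mu = 12.44 versus H1: mu != 12.44 (two-tailed test).
mu_0 = 12.44   # hypothesized mean length (minutes)
x_bar = 13.71  # sample mean (minutes)
sigma = 2.65   # population standard deviation (minutes)
n = 150        # sample size
alpha = 0.02   # significance level

# Steps 2-3: sigma is known and n is large, so the z distribution is used.
# A two-tailed test places alpha/2 in each tail.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Step 4: test statistic.
z = (x_bar - mu_0) / (sigma / sqrt(n))

# Two-tailed p-value: twice the area beyond |z|.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Steps 5-6: compare the test statistic with the critical values and decide.
reject_h0 = abs(z) > z_crit

print(f"z = {z:.2f}, critical values = +/-{z_crit:.2f}, reject H0: {reject_h0}")
```

Under these assumptions the test statistic falls far in the right rejection region, so $H_0$ would be rejected and Step 7 would conclude that the mean length of current calls differs from 12.44 minutes.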


Example 3

a sample of 36 pairs of shoes from a catalogue and finds the following cost (rounded to the nearest RM).
Is there enough evidence to support the resea ? Assume .

60 50 120 110 75 110 70 40 90


65 60 85 75 80 75 80 90 45
55 70 85 85 90 90 80 50 80
85 60 70 55 95 60 45 95 70


Example 4
The Minitab output for the data in Example 3 is given below. Use the $p$-value to test if the average cost of
at 10% significance level.

One-Sample Z: cost

The assumed standard deviation = 18.89

Variable N Mean StDev SE Mean 90% Upper Bound Z P


cost 36 75.00 19.16 3.20 79.10 -1.56 0.059
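
The reported $p$-value can be checked from the $Z$ value in the output. The sketch below assumes a left-tailed test, which is what an upper confidence bound in Minitab output usually corresponds to; that reading is an assumption, since the hypotheses are not legible in the example text.

```python
from statistics import NormalDist

alpha = 0.10  # significance level stated in the example
z = -1.56     # Z value reported in the Minitab output

# Assumed left-tailed test: the p-value is the area to the left of z
# under the standard normal curve.
p_value = NormalDist().cdf(z)

# p-value approach: reject H0 when the p-value is below alpha.
reject_h0 = p_value < alpha

print(f"p-value = {p_value:.3f}, reject H0: {reject_h0}")
```

The computed area agrees with the value P = 0.059 in the output, and because it is smaller than 0.10, $H_0$ would be rejected at the 10% significance level.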

Example 5
A psychologist claims that the mean age at which children start walking is 12.5 months. Carol wanted to
check if this claim is true. She took a random sample of 18 children and found that the mean age at which
these children started walking was 12.9 months with a standard deviation of 0.80 month. It is known that
the ages at which all children start walking are approximately normally distributed. Test that the mean
age at which all children start walking is older than 12.5 months. What will your conclusion be if the
significance level is 1%?
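
A possible working for this example is sketched below, assuming a right-tailed $t$ test with $n - 1 = 17$ degrees of freedom. Python's standard library has no $t$ distribution, so the critical value is entered by hand from a $t$ table; that constant is an input to the sketch, not something the code computes.

```python
from math import sqrt

mu_0 = 12.5   # claimed mean age (months)
x_bar = 12.9  # sample mean (months)
s = 0.80      # sample standard deviation (months)
n = 18        # sample size

# Test statistic for H0: mu = 12.5 versus H1: mu > 12.5.
t = (x_bar - mu_0) / (s / sqrt(n))

# Critical value read from a t table: df = 17, area 0.01 in the right tail.
t_crit = 2.567

# Right-tailed test: reject H0 only if t exceeds the critical value.
reject_h0 = t > t_crit

print(f"t = {t:.3f}, critical value = {t_crit}, reject H0: {reject_h0}")
```

With these inputs the statistic stays below the critical value, so $H_0$ would not be rejected at the 1% significance level.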


Example 6
An educator claims that the average salary of substitute teachers in school is more than RM55 per day. A
random sample of eight schools is selected, and the daily salaries (in RM) are shown. Is there enough
?

60 56 60 55 70 55 60 55
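
The sample statistics and the $t$ test statistic for these data can be computed as in the sketch below, assuming $H_0: \mu = 55$ against $H_1: \mu > 55$. The significance level is not legible in the example text, so the sketch stops at the test statistic and makes no decision.

```python
from math import sqrt
from statistics import mean, stdev

# Daily salaries (RM) from the sample of eight schools.
salaries = [60, 56, 60, 55, 70, 55, 60, 55]
mu_0 = 55  # hypothesized mean daily salary (RM)

n = len(salaries)
x_bar = mean(salaries)  # sample mean
s = stdev(salaries)     # sample standard deviation (n - 1 divisor)

# Standard error of the mean and the t test statistic, df = n - 1.
se = s / sqrt(n)
t = (x_bar - mu_0) / se
df = n - 1

print(f"mean = {x_bar:.2f}, s = {s:.2f}, SE = {se:.2f}, t = {t:.2f}, df = {df}")
```

These values can be compared with the Mean, StDev, SE Mean and T columns of the Minitab output for the same data.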

Example 7
The Minitab output for the data in Example 6 is given below. Use the $p$-value to test if the mean
daily salary of substitute teachers in schools is more than RM55 per day.

One-Sample T: salary

Variable N Mean StDev SE Mean 95% Lower Bound T P


salary 8 58.88 5.08 1.80 55.47 2.16 0.034


3.2 Hypothesis Test of Two Population Means


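Hypothesis testing about $\mu_1 - \mu_2$
                          Two-Tailed Test                 Right-Tailed Test            Left-Tailed Test
Null Hypothesis           $H_0: \mu_1 - \mu_2 = 0$        $H_0: \mu_1 - \mu_2 = 0$     $H_0: \mu_1 - \mu_2 = 0$
Alternative Hypothesis    $H_1: \mu_1 - \mu_2 \neq 0$     $H_1: \mu_1 - \mu_2 > 0$     $H_1: \mu_1 - \mu_2 < 0$

Test statistic
In general, a test statistic for two population means is computed as follows:

Null hypothesis $H_0: \mu_1 - \mu_2 = 0$

Variances $\sigma_1^2$ and $\sigma_2^2$ known:

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

Variances $\sigma_1^2$ and $\sigma_2^2$ unknown but equal:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

where

$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} \quad \text{and} \quad df = n_1 + n_2 - 2$$

Variances $\sigma_1^2$ and $\sigma_2^2$ unknown and not equal:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \quad \text{and} \quad df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$

Null hypothesis $H_0: \mu_d = 0$ (paired samples)

$$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}} \quad \text{and} \quad df = n - 1$$

where $n$ is the number of pairs.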


Example 8 (Example 14 of Chapter 2)


A survey of low- and middle-income households shows that consumers aged 65 years and older had an
average credit card debt of RM 10,235 and consumers in the 50- to 64-year age group had an average credit
card debt of RM 9,342 at the time of the survey. Suppose that these averages were based on random
samples of 1200 and 1400 people for the two groups, respectively. Further, assume that the population
standard deviations for the two groups were RM 2,800 and RM 2,500, respectively. Let $\mu_1$ and $\mu_2$ be the
respective population means for the two groups, people aged 65 years and older and people in the 50- to
64-year age group. Test at the 5% significance level whether the population mean credit card debts for the
two groups are different.
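
One way to carry out this test is sketched below, assuming a two-tailed two-sample $z$ test because both population standard deviations are given. Only the standard library is used and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Group 1: aged 65 and older; group 2: aged 50 to 64.
x_bar_1, sigma_1, n_1 = 10235, 2800, 1200
x_bar_2, sigma_2, n_2 = 9342, 2500, 1400
alpha = 0.05

# Standard error of the difference between the two sample means.
se = sqrt(sigma_1**2 / n_1 + sigma_2**2 / n_2)

# Test statistic under H0: mu_1 - mu_2 = 0.
z = (x_bar_1 - x_bar_2) / se

# Two-tailed critical value: alpha/2 in each tail.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject_h0 = abs(z) > z_crit

print(f"z = {z:.2f}, critical values = +/-{z_crit:.2f}, reject H0: {reject_h0}")
```

Under these assumptions the statistic lies well beyond the critical value, so $H_0$ would be rejected: the mean credit card debts of the two groups appear to differ.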

Example 9
A sample of 14 cans of Brand I diet soda gave a mean number of calories of 23 per can with a standard
deviation of 3 calories. Another sample of 16 cans of Brand II diet soda gave the mean number of calories
of 25 per can with a standard deviation of 4 calories. At the 1% significance level, can you conclude that
the mean numbers of calories per can are different for these two brands of diet soda? Assume that the
calories per can of diet soda are normally distributed for each of the two brands and that the standard
deviations for the two populations are equal.
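
A sketch of the pooled-variance $t$ test for this example follows, assuming a two-tailed test with $df = n_1 + n_2 - 2$. The critical value is typed in from a $t$ table because the standard library has no $t$ distribution.

```python
from math import sqrt

# Summary statistics: Brand I (sample 1) and Brand II (sample 2).
n_1, x_bar_1, s_1 = 14, 23, 3
n_2, x_bar_2, s_2 = 16, 25, 4

# Pooled standard deviation, valid when the population variances are equal.
df = n_1 + n_2 - 2
s_p = sqrt(((n_1 - 1) * s_1**2 + (n_2 - 1) * s_2**2) / df)

# Test statistic under H0: mu_1 - mu_2 = 0.
t = (x_bar_1 - x_bar_2) / (s_p * sqrt(1 / n_1 + 1 / n_2))

# Critical value read from a t table: df = 28, area 0.005 in each tail.
t_crit = 2.763
reject_h0 = abs(t) > t_crit

print(f"s_p = {s_p:.3f}, t = {t:.3f}, df = {df}, reject H0: {reject_h0}")
```

Here $|t|$ is smaller than the critical value, so under these assumptions $H_0$ would not be rejected at the 1% significance level.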


Example 10
A company recently opened two supermarkets in two different areas. The management wants to know if
the mean sales per day for these two supermarkets are different. The Minitab output for two-sample
T-test and CI of RM) for Supermarket A and B
respectively are presented below.

Two-Sample T-Test and CI: Supermarket_A, Supermarket_B

Two-sample T for Supermarket_A vs Supermarket_B

N Mean StDev SE Mean


Supermarket_A 10 51.16 4.58 1.4
Supermarket_B 12 60.65 4.57 1.3

Difference = μ (Supermarket_A) - μ (Supermarket_B)
Estimate for difference: -9.49
99% CI for difference: (-15.06, -3.92)
T-Test of difference = 0 (vs ≠): T-Value = -4.85 P-Value = 0.000 DF = 20
Both use Pooled StDev = 4.5735

Assume the daily sales of the two supermarkets are both normally distributed.
a) Show that the test statistic, $t = -4.85$.
b) State the null and alternative hypotheses for the above test.
c) Based on the output, what is the assumption for the variances of the daily sales between the two
supermarkets? Explain your answer.
d) Using the $p$-value, do the data provide sufficient evidence to indicate that there is a significant
difference in daily sales between the two supermarkets at the 1% significance level?
e) State the 99% confidence interval of the mean difference in daily sales between Supermarket A and
Supermarket B. Does the interval further verify the conclusion in (d)? Support your answer.
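
Part (a) can be checked numerically from the figures in the output. The sketch below assumes the pooled-variance formula, which is suggested by the line "Both use Pooled StDev", and takes the estimated difference and pooled standard deviation directly from the Minitab output.

```python
from math import sqrt

n_a, n_b = 10, 12  # sample sizes for Supermarket A and Supermarket B
diff = -9.49       # estimate for difference (mean of A minus mean of B)
s_p = 4.5735       # pooled standard deviation from the output

# Pooled-variance t statistic under H0: mu_A - mu_B = 0.
t = diff / (s_p * sqrt(1 / n_a + 1 / n_b))

# Degrees of freedom for the pooled test.
df = n_a + n_b - 2

print(f"t = {t:.2f}, df = {df}")
```

Both results can be compared with the T-Value and DF entries in the output.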


Example 11
Refer to Example 9. Test at the 1% significance level whether the mean numbers of calories per can of diet soda
are different for these two brands. Assume that the calories per can of diet soda are normally distributed
for each of these two brands and that the standard deviations for the two populations are not equal.
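
When the population standard deviations are not assumed equal, the unpooled statistic with approximate degrees of freedom can be used. The sketch below assumes the degrees of freedom are rounded down to a whole number and that the critical value is read from a $t$ table; both are choices made for this illustration.

```python
from math import floor, sqrt

# Summary statistics from Example 9: Brand I (sample 1), Brand II (sample 2).
n_1, x_bar_1, s_1 = 14, 23, 3
n_2, x_bar_2, s_2 = 16, 25, 4

# Per-sample variance of the mean.
v_1 = s_1**2 / n_1
v_2 = s_2**2 / n_2

# Test statistic under H0: mu_1 - mu_2 = 0 (variances not pooled).
t = (x_bar_1 - x_bar_2) / sqrt(v_1 + v_2)

# Approximate degrees of freedom, rounded down.
df_exact = (v_1 + v_2) ** 2 / (v_1**2 / (n_1 - 1) + v_2**2 / (n_2 - 1))
df = floor(df_exact)

# Critical value read from a t table: df = 27, area 0.005 in each tail.
t_crit = 2.771
reject_h0 = abs(t) > t_crit

print(f"t = {t:.3f}, df = {df_exact:.2f} -> {df}, reject H0: {reject_h0}")
```

The decision is the same as with the pooled test on these data: $H_0$ would not be rejected at the 1% significance level.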

Example 12
Below is the Minitab output for two-sample T-test and CI for the 50 randomly selected 30-year fixed-rate
mortgages and 45 randomly selected 20-year fixed-rate mortgages granted in a week.

Two-Sample T-Test and CI: 30_year, 20_year

Two-sample T for 30_year vs 20_year

N Mean StDev SE Mean


30_year 50 5.434 0.467 0.066
20_year 45 5.298 0.417 0.062

Difference = μ (30_year) - μ (20_year)
Estimate for difference: 0.1352
99% lower bound for difference: -0.0794
T-Test of difference = 0 (vs >): T-Value = 1.49 P-Value = 0.070 DF = 92

a) Show that the test statistic, $t = 1.49$.


b) Test at 5% significance level whether the average rate on all 30-year fixed-rate mortgages was
higher than the average rate on all 20-year fixed-rate mortgages granted in that week.
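
Both parts can be sketched from the output. The code assumes the unpooled (unequal variances) formula, since the output does not report a pooled standard deviation, and it uses the unrounded estimate for difference printed by Minitab instead of subtracting the rounded means.

```python
from math import floor, sqrt

n_1, s_1 = 50, 0.467  # 30-year mortgages
n_2, s_2 = 45, 0.417  # 20-year mortgages
diff = 0.1352         # estimate for difference from the output

# Per-sample variance of the mean.
v_1 = s_1**2 / n_1
v_2 = s_2**2 / n_2

# (a) Test statistic and approximate degrees of freedom (rounded down).
t = diff / sqrt(v_1 + v_2)
df_exact = (v_1 + v_2) ** 2 / (v_1**2 / (n_1 - 1) + v_2**2 / (n_2 - 1))
df = floor(df_exact)

# (b) p-value approach using the P-Value reported in the output.
alpha = 0.05
p_value = 0.070
reject_h0 = p_value < alpha

print(f"t = {t:.2f}, df = {df}, reject H0: {reject_h0}")
```

Because the reported $p$-value is larger than 0.05, $H_0$ would not be rejected under these assumptions, so the data do not show that the average 30-year rate was higher.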


Example 13

the average sales of its employees. The company sent six of its salespersons to attend this course. The
table below gives the 1-week sales of these salespersons before and after they attended the course.

Before 12 18 25 9 14 16
After 18 24 24 14 19 20

Using the 1% significance level, can you conclude that the mean weekly sales for all salespersons increase
as a result of attending this course? Assume that the population of paired differences has a normal
distribution.
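
A sketch of the paired $t$ test for these data follows. It assumes the differences are taken as Before minus After, so that an increase in sales corresponds to $H_1: \mu_d < 0$ (a left-tailed test); the critical value is entered from a $t$ table.

```python
from math import sqrt
from statistics import mean, stdev

before = [12, 18, 25, 9, 14, 16]
after = [18, 24, 24, 14, 19, 20]

# Paired differences, d = Before - After.
d = [b - a for b, a in zip(before, after)]
n = len(d)

d_bar = mean(d)  # mean of the paired differences
s_d = stdev(d)   # standard deviation of the paired differences

# Test statistic under H0: mu_d = 0, with df = n - 1.
t = d_bar / (s_d / sqrt(n))

# Left-tailed critical value from a t table: df = 5, area 0.01.
t_crit = -3.365
reject_h0 = t < t_crit

print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.2f}, t = {t:.2f}, reject H0: {reject_h0}")
```

The statistic falls in the left rejection region, so under these assumptions $H_0$ would be rejected at the 1% significance level.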

Example 14
Below is the Minitab output of Example 13.
Paired T-Test and CI: Before, After

Paired T for Before - After

N Mean StDev SE Mean


Before 6 15.67 5.54 2.26
After 6 19.83 3.82 1.56
Difference 6 -4.17 2.64 1.08

1% upper bound for mean difference: -7.79


T-Test of mean difference = 0 (vs < 0): T-Value = -3.87 P-Value = 0.006

Note: Whether the hypothesis is tested using the test statistic or the $p$-value, the decision and
conclusion of the test remain the same.
In Example 13, the test was conducted using the test statistic and $H_0$ was rejected.
Similarly in Example 14, $H_0$ is rejected because $p$-value $= 0.006 < \alpha = 0.01$.


3.3 Test for a Variance or Standard Deviation

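                          Two-Tailed Test                    Right-Tailed Test                Left-Tailed Test
Null Hypothesis           $H_0: \sigma^2 = \sigma_0^2$       $H_0: \sigma^2 = \sigma_0^2$     $H_0: \sigma^2 = \sigma_0^2$
Alternative Hypothesis    $H_1: \sigma^2 \neq \sigma_0^2$    $H_1: \sigma^2 > \sigma_0^2$     $H_1: \sigma^2 < \sigma_0^2$

Test statistic

$$\chi^2 = \frac{(n - 1)s^2}{\sigma_0^2}$$

where the degrees of freedom is $n - 1$.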

Assumptions for the Chi-square Test for a Single Variance


The sample must be randomly selected from a population.
The population must be normally distributed for the variable under study.
The observations must be independent of one another.

Example 15
An instructor wishes to see whether the variation in scores of the 23 students in her class is less than the
variance of the population. The variance of the class is 198. Is there enough evidence to support the claim
that the variation of the students is less than the population variance ( ) at ? Assume
that the scores are normally distributed.


Example 16
A hospital administrator believes that the standard deviation of the number of people using outpatient
surgery per day is greater than 8. A random sample of 15 days is selected. The data are shown. At
, is there enough evidence to
distributed.
25 30 5 15 18 42 16 9
10 12 12 38 8 14 27

Example 17
Below is the Minitab output for the data in Example 16. Using -
.
Test and CI for One Variance: No_of_patient

Method

The chi-square method is only for the normal distribution.


The Bonett method is for any continuous distribution.

Statistics

Variable N StDev Variance


No_of_patient 15 11.2 125

90% One-Sided Confidence Intervals

Lower
Bound
for Lower Bound
Variable Method StDev for Variance
No_of_patient Chi-Square 9.1 83
Bonett 8.7 76

Tests
Test
Variable Method Statistic DF P-Value
No_of_patient Chi-Square 27.45 14 0.017
Bonett 0.046


3.4 Testing the Difference Between Two Variances

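                          Two-Tailed Test                      Right-Tailed Test                  Left-Tailed Test
Null Hypothesis           $H_0: \sigma_1^2 = \sigma_2^2$       $H_0: \sigma_1^2 = \sigma_2^2$     $H_0: \sigma_1^2 = \sigma_2^2$
Alternative Hypothesis    $H_1: \sigma_1^2 \neq \sigma_2^2$    $H_1: \sigma_1^2 > \sigma_2^2$     $H_1: \sigma_1^2 < \sigma_2^2$

Test Statistic

$$F = \frac{s_1^2}{s_2^2}$$

where $df_1 = n_1 - 1$ (numerator) and $df_2 = n_2 - 1$ (denominator).
Note:
- The larger variance should always be placed in the numerator of the formula regardless of the
subscripts.
- For a two-tailed test, the $\alpha$ value must be divided by 2 and the critical value placed on the right side
of the curve.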

Example 18
The standard deviation of the average waiting time to see a doctor for non-life-threatening problems in
the emergency room at an urban hospital is 32 minutes. At a second hospital, the standard deviation is
28 minutes. If a sample of 13 patients was used in the first case and 19 in the second case, is there enough
evidence to conclude at the 0.01 significance level that the standard deviation of the waiting times in the
first hospital is greater than the standard deviation of the waiting times in the second hospital? Assume
that the waiting times at both hospitals are approximately normally distributed.
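
A sketch of the $F$ test for this example follows, assuming a right-tailed test with the first hospital's variance in the numerator. The critical value is typed in from an $F$ table, since the standard library has no $F$ distribution.

```python
# Sample standard deviations (minutes) and sample sizes.
s_1, n_1 = 32, 13  # first hospital
s_2, n_2 = 28, 19  # second hospital

# Test statistic: ratio of the sample variances, larger variance on top.
f_stat = s_1**2 / s_2**2

# Numerator and denominator degrees of freedom.
df_num = n_1 - 1
df_den = n_2 - 1

# Critical value read from an F table: alpha = 0.01, df = (12, 18).
f_crit = 3.37
reject_h0 = f_stat > f_crit

print(f"F = {f_stat:.2f}, df = ({df_num}, {df_den}), reject H0: {reject_h0}")
```

The statistic is well below the critical value, so under these assumptions there is not enough evidence at the 0.01 level that the first hospital's standard deviation is greater.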


Example 19
A medical researcher wishes to see whether the variance of the heart rates (in beats per minute) of
smokers is different from the variance of heart rates of people who do not smoke. Two samples are
selected, and the data are as shown. Using , is there enough evidence to support the claim?
Assume that both variables are normally distributed.
Smokers Non-smokers

(a) Using , is there enough evidence to support the claim?


(b) The average heart rates of smokers and non-smokers are 86 and 80 beats per minute, respectively.
Based on the conclusion in (a), construct a 95% confidence interval for the difference in the average
heart rates of smokers and non-smokers.


Example 20
The CEO of an airport hypothesizes that the variance in the number of passengers for Malaysian Airports
is greater than the variance in the number of passengers for foreign airports. The data is summarised in
the Minitab output below.
Test and CI for Two Variances: Malaysian_Airport, Foreign_Airport

Method

Null hypothesis Variance(Malaysian_Airport) / Variance(Foreign_Airport) = 1


Alternative hypothesis Variance(Malaysian_Airport) / Variance(Foreign_Airport) > 1
Significance level

F method was used. This method is accurate for normal data only.

Statistics

90% Lower
Bound for
Variable N StDev Variance Variances
Malaysian_Airport 6 15.697 246.382 133.376
Foreign_Airport 4 9.791 95.873 46.009

Ratio of standard deviations = 1.603


Ratio of variances = 2.570

90% One-Sided Confidence Intervals

Lower Bound Lower Bound


for StDev for Variance
Method Ratio Ratio
F 0.696 0.484

Tests
Test
Method DF1 DF2 Statistic P-Value
F 5 3 2.57 0.234

Assume the variable is normally distributed.


a) Show that the test statistic, $F = 2.57$.
b) At , is there enough evidence to support the hypothesis? Use the $p$-value.
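
Part (a) can be checked from the Statistics table in the output. The sketch below takes the two sample variances and standard deviations as printed, and computes only the ratios and degrees of freedom; the significance level for part (b) is not legible in the text, so no decision is made here.

```python
# Values copied from the Statistics table of the Minitab output.
n_m, sd_m, var_m = 6, 15.697, 246.382  # Malaysian_Airport
n_f, sd_f, var_f = 4, 9.791, 95.873    # Foreign_Airport

# Test statistic: ratio of the sample variances.
f_stat = var_m / var_f

# Ratio of the standard deviations, reported separately in the output.
sd_ratio = sd_m / sd_f

# Numerator and denominator degrees of freedom.
df_1 = n_m - 1
df_2 = n_f - 1

print(f"F = {f_stat:.2f}, ratio of StDev = {sd_ratio:.3f}, DF1 = {df_1}, DF2 = {df_2}")
```

For part (b), the P-Value of 0.234 from the output would be compared with the given significance level, rejecting $H_0$ only if the $p$-value is smaller.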
