Hypothesis Testing
Sutikno
Department of Statistics
Faculty of Sciences and Data Analytics
Institut Teknologi Sepuluh Nopember
Hypothesis Testing
[Figure: "I believe the population mean age is 50" (hypothesis). A random sample drawn from the population has mean X̄ = 20. That is not close to 50, so: reject the hypothesis!]
© 2011 Pearson Education, Inc
What’s a Hypothesis?
A statistical hypothesis is a statement about the numerical value of a population parameter.
Example: "I believe the mean GPA of this class is 3.5!"
Null Hypothesis
The null hypothesis, denoted H0, represents the
hypothesis that will be accepted unless the data
provide convincing evidence that it is false.
This usually represents the “status quo” or some
claim about the population parameter that the
researcher wants to test.
Alternative Hypothesis
The alternative (research) hypothesis,
denoted Ha, represents the hypothesis that
will be accepted only if the data provide
convincing evidence of its truth. This usually
represents the values of a population
parameter that the researcher wants to
gather evidence to support.
Identifying Hypotheses
Example problem: Test that the population
mean is not 3
Steps:
• State the question statistically (µ ≠ 3)
• State the opposite statistically (µ = 3)
— Must be mutually exclusive & exhaustive
• Select the alternative hypothesis (µ ≠ 3)
— Has the ≠, <, or > sign
• State the null hypothesis (µ = 3)
What Are the Hypotheses?
Is the population average amount of TV viewing
12 hours?
• State the question statistically: µ = 12
• State the opposite statistically: µ ≠ 12
• Select the alternative hypothesis: Ha: µ ≠ 12
• State the null hypothesis: H0: µ = 12
What Are the Hypotheses?
Is the population average amount of TV viewing
different from 12 hours?
• State the question statistically: µ ≠ 12
• State the opposite statistically: µ = 12
• Select the alternative hypothesis: Ha: µ ≠ 12
• State the null hypothesis: H0: µ = 12
What Are the Hypotheses?
Is the average cost per hat less than or equal to
$20?
• State the question statistically: µ ≤ 20
• State the opposite statistically: µ > 20
• Select the alternative hypothesis: Ha: µ > 20
• State the null hypothesis: H0: µ ≤ 20
What Are the Hypotheses?
Is the average amount spent in the bookstore
greater than $25?
• State the question statistically: µ > 25
• State the opposite statistically: µ ≤ 25
• Select the alternative hypothesis: Ha: µ > 25
• State the null hypothesis: H0: µ ≤ 25
Test Statistic
The test statistic is a sample statistic,
computed from information provided in the
sample, that the researcher uses to decide
between the null and alternative hypotheses.
Test Statistic - Example
[Figure: the sampling distribution of x̄, assuming µ = 2,400.]
The chance of observing x̄ more than 1.645 standard
deviations above 2,400 is only .05 if in fact the true
mean µ is 2,400.
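The .05 figure can be checked numerically. The following is a minimal sketch that assumes the sampling distribution of x̄ is normal, so that the standardized sample mean follows the standard normal distribution. It uses only Python's standard library, and the variable names are illustrative.

```python
from statistics import NormalDist

# Standard normal distribution for the standardized sample mean (assumed normal).
std_normal = NormalDist(mu=0.0, sigma=1.0)

# P(z > 1.645): chance that x-bar lands more than 1.645 standard
# deviations (of x-bar) above the hypothesized mean when H0 is true.
upper_tail_area = 1.0 - std_normal.cdf(1.645)
print(round(upper_tail_area, 3))  # prints 0.05
```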
Type I Error
A Type I error occurs if the researcher
rejects the null hypothesis in favor of the
alternative hypothesis when, in fact, H0 is
true. The probability of committing a
Type I error is denoted by α.
Rejection Region
The rejection region of a statistical test is the
set of possible values of the test statistic for
which the researcher will reject H0 in favor of
Ha.
Type II Error
A Type II error occurs if the researcher accepts
the null hypothesis when, in fact, H0 is false.
The probability of committing a Type II error is
denoted by β.
Conclusions and Consequences
for a Test of Hypothesis
                               True State of Nature
Conclusion                     H0 True                         Ha True
Accept H0 (Assume H0 True)     Correct decision                Type II error (probability β)
Reject H0 (Assume Ha True)     Type I error (probability α)    Correct decision
Elements of a Test of Hypothesis
1. Null hypothesis (H0): A theory about the specific
values of one or more population parameters. The
theory generally represents the status quo, which
we adopt until it is proven false.
2. Alternative (research) hypothesis (Ha): A theory
that contradicts the null hypothesis. The theory
generally represents that which we will adopt only
when sufficient evidence exists to establish its
truth.
Elements of a Test of Hypothesis
3. Test statistic: A sample statistic used to decide
whether to reject the null hypothesis.
4. Rejection region: The numerical values of the test
statistic for which the null hypothesis will be
rejected. The rejection region is chosen so that
the probability is α that it will contain the test
statistic when the null hypothesis is true, thereby
leading to a Type I error. The value of α is usually
chosen to be small (e.g., .01, .05, or .10) and is
referred to as the level of significance of the test.
Elements of a Test of Hypothesis
5. Assumptions: Clear statement(s) of any
assumptions made about the population(s) being
sampled.
6. Experiment and calculation of test statistic:
Performance of the sampling experiment and
determination of the numerical value of the test
statistic.
Elements of a Test of Hypothesis
7. Conclusion:
a. If the numerical value of the test statistic falls in
the rejection region, we reject the null hypothesis
and conclude that the alternative hypothesis is
true. We know that the hypothesis-testing process
will lead to this conclusion incorrectly (Type I
error) only 100α% of the time when H0 is true.
Elements of a Test of Hypothesis
7. Conclusion:
b. If the test statistic does not fall in the rejection
region, we do not reject H0. Thus, we reserve
judgment about which hypothesis is true. We do
not conclude that the null hypothesis is true
because we do not (in general) know the
probability that our test procedure will lead to
an incorrect acceptance of H0 (Type II error).
Determining the
Target Parameter
Parameter   Key Words or Phrases                      Type of Data
µ           Mean; average                             Quantitative
p           Proportion; percentage; fraction; rate    Qualitative
σ²          Variance; variability; spread             Quantitative
One-Tailed Test
A one-tailed test of hypothesis is one in which
the alternative hypothesis is directional and
includes the symbol “ < ” or “ >.”
Two-Tailed Test
A two-tailed test of hypothesis is one in which
the alternative hypothesis does not specify
departure from H0 in a particular direction and is
written with the symbol “ ≠.”
Rejection Region
(One-Tail Test)
[Figure: sampling distribution of the sample statistic, centered at the H0 value. The rejection region, with area α, lies in one tail beyond the critical value. The fail-to-reject region has area 1 – α (the level of confidence).]
Rejection Regions
(Two-Tailed Test)
[Figure: sampling distribution of the sample statistic, centered at the H0 value. There are two rejection regions, one in each tail beyond a critical value, each with area α/2. The fail-to-reject region between the critical values has area 1 – α (the level of confidence).]
Rejection Regions
                      Alternative Hypotheses
            Lower-Tailed    Upper-Tailed    Two-Tailed
α = .10     z < –1.28       z > 1.28        z < –1.645 or z > 1.645
α = .05     z < –1.645      z > 1.645       z < –1.96 or z > 1.96
α = .01     z < –2.33       z > 2.33        z < –2.575 or z > 2.575
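The critical values in this table can be reproduced from the inverse of the standard normal distribution function. The sketch below is one possible way to do it with Python's standard library; the function name `z_critical` and its `tail` labels are illustrative choices, not part of the slides. The computed values agree with the table up to the table's rounding.

```python
from statistics import NormalDist

def z_critical(alpha, tail):
    """Return the critical z value(s) for a 'lower', 'upper', or 'two' tailed test."""
    std_normal = NormalDist()
    if tail == "lower":
        return (std_normal.inv_cdf(alpha),)
    if tail == "upper":
        return (std_normal.inv_cdf(1.0 - alpha),)
    if tail == "two":
        z = std_normal.inv_cdf(1.0 - alpha / 2.0)
        return (-z, z)
    raise ValueError("tail must be 'lower', 'upper', or 'two'")

# Print one row per significance level: lower, upper, and two-tailed borders.
for alpha in (0.10, 0.05, 0.01):
    row = {tail: [round(v, 3) for v in z_critical(alpha, tail)]
           for tail in ("lower", "upper", "two")}
    print(alpha, row)
```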
Large-Sample Test of Hypothesis
about µ
One-Tailed Test
H0: µ = µ0
Ha: µ < µ0 (or Ha: µ > µ0)
Two-Tailed Test
H0: µ = µ0
Ha: µ ≠ µ0
Test Statistic (both cases):
z = (x̄ – µ0) / σx̄ ≈ (x̄ – µ0) / (s/√n)
Large-Sample Test of Hypothesis
about µ
One-Tailed Test
Rejection region:
z < –zα
(or z > zα when Ha: µ > µ0)
where zα is chosen so that
P(z > zα) = α
Large-Sample Test of Hypothesis
about µ
Two-Tailed Test
Rejection region:
|z| > zα/2
where zα/2 is chosen so that
P(z > zα/2) = α/2
Note: µ0 is the symbol for the numerical value assigned
to µ under the null hypothesis.
Conditions Required for a Valid
Large-Sample Hypothesis Test
for µ
1. A random sample is selected from the target
population.
2. The sample size n is large (i.e., n ≥ 30). (Due to the
Central Limit Theorem, this condition guarantees
that the test statistic will be approximately normal
regardless of the shape of the underlying probability
distribution of the population.)
Two-Tailed z Test Thinking
Challenge
You’re a Q/C inspector. You want to find out if
a new machine is making electrical cords to
customer specification: average breaking
strength of 70 lb. with σ = 3.5 lb. You take a
sample of 36 cords & compute a sample mean
of 69.7 lb. At the .05 level of significance, is
there evidence that the machine is not
meeting the average breaking strength?
Two-Tailed z Test Solution*
• H0: µ = 70
• Ha: µ ≠ 70
• α = .05
• n = 36
• Critical Value(s): z = ±1.96 (reject H0 if z < –1.96 or z > 1.96; area .025 in each tail)
Test Statistic:
z = (x̄ – µ0) / (σ/√n) = (69.7 – 70) / (3.5/√36) = –.51
Decision:
Do not reject at α = .05
Conclusion:
No evidence average is not 70
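The arithmetic of this solution can be sketched in a few lines of Python. This is an illustrative sketch rather than a prescribed procedure: the function name `z_test_mean` is an assumption, and the test is taken to be two-tailed with a known σ, as in the cord example.

```python
from math import sqrt
from statistics import NormalDist

def z_test_mean(x_bar, mu0, sigma, n, alpha=0.05):
    """Two-tailed z test for a mean; returns (z, critical value, reject H0?)."""
    z = (x_bar - mu0) / (sigma / sqrt(n))              # test statistic
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # upper alpha/2 point
    return z, z_crit, abs(z) > z_crit

# Breaking-strength example: x-bar = 69.7, mu0 = 70, sigma = 3.5, n = 36.
z, z_crit, reject = z_test_mean(x_bar=69.7, mu0=70.0, sigma=3.5, n=36)
print(round(z, 2), round(z_crit, 2), reject)  # prints -0.51 1.96 False
```

Because |z| = .51 is well inside ±1.96, the sketch reaches the same "do not reject" decision as the slide.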
p-Value
The observed significance level, or p-value, for a
specific statistical test is the probability
(assuming H0 is true) of observing a value of the
test statistic that is at least as contradictory to
the null hypothesis, and supportive of the
alternative hypothesis, as the actual one
computed from the sample data.
p-Value
• Probability of obtaining a test statistic more
extreme (≤ or ≥) than the actual sample value,
given H0 is true
• Called observed level of significance
• Smallest value of α for which H0 can be
rejected
• Used to make rejection decision
• If p-value ≥ α, do not reject H0
• If p-value < α, reject H0
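For a z statistic, the p-value is a tail area under the standard normal curve, and which tail depends on the alternative hypothesis. The following is a minimal sketch; the function name `p_value` and the labels "less", "greater", and "two-sided" are illustrative assumptions. The two-sided call uses z = –.51 from the breaking-strength example, (69.7 – 70) / (3.5/√36).

```python
from statistics import NormalDist

def p_value(z, alternative):
    """p-value of an observed z statistic for the given alternative hypothesis."""
    std_normal = NormalDist()
    if alternative == "less":        # Ha: parameter < hypothesized value
        return std_normal.cdf(z)
    if alternative == "greater":     # Ha: parameter > hypothesized value
        return 1.0 - std_normal.cdf(z)
    if alternative == "two-sided":   # Ha: parameter != hypothesized value
        return 2.0 * (1.0 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")

alpha = 0.05
p = p_value(-0.51, "two-sided")
# Decision rule from the slide: reject H0 only when the p-value is below alpha.
print("reject H0" if p < alpha else "do not reject H0")  # prints do not reject H0
```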
Small-Sample Test of Hypothesis
about µ
One-Tailed Test
H0: µ = µ0
Ha: µ < µ0 (or Ha: µ > µ0)
Test statistic: t = (x̄ – µ0) / (s/√n)
Rejection region: t < –tα
(or t > tα when Ha: µ > µ0)
where tα and tα/2 are based on (n – 1) degrees of
freedom
Small-Sample Test of Hypothesis
about µ
Two-Tailed Test
H0: µ = µ0
Ha: µ ≠ µ0
Test statistic: t = (x̄ – µ0) / (s/√n)
Rejection region: |t| > tα/2
Conditions Required for a Valid
Small-Sample Hypothesis Test
for µ
1. A random sample is selected from the target
population.
2. The population from which the sample is
selected has a distribution that is
approximately normal.
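A small-sample t test can be sketched as follows. The five measurements are hypothetical numbers made up for illustration, not data from the slides, and the critical value 2.776 is the standard t-table entry for t.025 with 4 degrees of freedom. The function name `t_statistic` is likewise an assumption.

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic(sample, mu0):
    """t = (x-bar - mu0) / (s / sqrt(n)), with s the sample standard deviation."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / sqrt(n))

# Hypothetical sample (illustration only), tested against H0: mu = 70, two-tailed.
sample = [69.1, 70.4, 68.7, 69.9, 69.4]
t = t_statistic(sample, mu0=70.0)

t_crit = 2.776                 # t-table value for alpha/2 = .025, df = n - 1 = 4
reject = abs(t) > t_crit       # two-tailed rejection region |t| > t_crit
print(round(t, 3), reject)
```

The test is only valid under the conditions listed above, in particular an approximately normal population.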
Test of a Mean
Population variance known:
z_test = (X̄ – µ0) / (σ/√n)
Population variance unknown:
z_test = (X̄ – µ0) / (s/√n) (large sample)
t_test = (X̄ – µ0) / (s/√n) (small sample)
Large-Sample Test of Hypothesis
about a Population Proportion
Large-Sample Test of Hypothesis
about p
One-Tailed Test
H0: p = p0
Ha: p < p0 (or Ha: p > p0)
Test statistic: z = (p̂ – p0) / σp̂, where σp̂ = √(p0 q0 / n) and q0 = 1 – p0
Rejection region:
z < –zα (or z > zα when Ha: p > p0)
Note: p0 is the symbol for the numerical value of p
assigned in the null hypothesis
Large-Sample Test of Hypothesis
about p
Two-Tailed Test
H0: p = p0
Ha: p ≠ p0
Test statistic: z = (p̂ – p0) / σp̂, where σp̂ = √(p0 q0 / n) and q0 = 1 – p0
Rejection region: |z| > zα/2
Note: p0 is the symbol for the numerical value of p
assigned in the null hypothesis
Conditions Required for a Valid
Large-Sample Hypothesis Test
for p
1. A random sample is selected from a binomial
population.
2. The sample size n is large. (This condition will
be satisfied if both np0 ≥ 15 and nq0 ≥ 15.)
One-Proportion z Test
Example
The present packaging system
produces 10% defective
cereal boxes. Using a new
system, a random sample of
200 boxes had 11 defects.
Does the new system produce
fewer defects? Test at the .05
level of significance.
One-Proportion z Test Solution
• H0: p = .10
• Ha: p < .10
• α = .05
• n = 200
• Critical Value(s): z = –1.645 (reject H0 if z < –1.645; area .05 in the lower tail)
Test Statistic:
z = (p̂ – p0) / √(p0 q0 / n) = (11/200 – .10) / √((.10)(.90)/200) = –2.12
Decision:
Reject at α = .05
Conclusion:
There is evidence new system < 10% defective
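The same calculation can be written as a short Python sketch. The function name `one_proportion_z` is illustrative; the sample-size check mirrors the np0 ≥ 15 and nq0 ≥ 15 condition stated earlier for the large-sample test.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z(successes, n, p0):
    """Large-sample z statistic for H0: p = p0."""
    q0 = 1.0 - p0
    if n * p0 < 15 or n * q0 < 15:
        raise ValueError("sample too small for the normal approximation")
    p_hat = successes / n
    return (p_hat - p0) / sqrt(p0 * q0 / n)

# Cereal-box example: 11 defects in 200 boxes, H0: p = .10, Ha: p < .10.
z = one_proportion_z(successes=11, n=200, p0=0.10)
z_crit = NormalDist().inv_cdf(0.05)    # lower-tailed critical value at alpha = .05
reject = z < z_crit
print(round(z, 2), round(z_crit, 3), reject)  # prints -2.12 -1.645 True
```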
Variance
Although many practical problems involve
inferences about a population mean (or
proportion), it is sometimes of interest to make
an inference about a population variance, σ².
Test of a Hypothesis about σ²
One-Tailed Test
H0: σ² = σ0²
Ha: σ² < σ0² (or Ha: σ² > σ0²)
Test statistic:
χ² = (n – 1)s² / σ0²
Rejection region: χ² < χ²_(1–α)
(or χ² > χ²_α when Ha: σ² > σ0²)
where σ0² is the hypothesized variance and the
distribution of χ² is based on (n – 1) degrees of
freedom.
Test of a Hypothesis about σ²
Two-Tailed Test
H0: σ² = σ0²
Ha: σ² ≠ σ0²
Test statistic:
χ² = (n – 1)s² / σ0²
Rejection region:
χ² < χ²_(1–α/2) or χ² > χ²_(α/2)
where σ0² is the hypothesized variance and the
distribution of χ² is based on (n – 1) degrees of
freedom.
Conditions Required for a
Valid Hypothesis Test for σ²
1. A random sample is selected from the target
population.
2. The population from which the sample is
selected has a distribution that is
approximately normal.
Several χ² Probability Distributions
Part of Table VI: Critical
Values of Chi Square
Chi-Square (χ²) Test Example
Is the variation in boxes of
cereal equal to a standard
deviation of 15 grams (that is,
a variance of 15² = 225)?
A random sample of 25 boxes
had a standard deviation of
17.7 grams. Test at the .05
level of significance.
Chi-Square Test (χ²) Solution
• H0: σ² = 15² = 225
• Ha: σ² ≠ 15² = 225
• α = .05
• df = 25 – 1 = 24
• Critical Value(s): χ² = 12.401 and χ² = 39.364 (area α/2 = .025 in each tail)
Test Statistic:
χ² = (n – 1)s² / σ0² = (25 – 1)(17.7)² / 15² = 33.42
Decision:
Do not reject at α = .05
Conclusion:
There is no evidence σ² is not 15² (= 225)
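The test statistic can be verified with a short sketch. It takes 15 grams as the hypothesized standard deviation, which is what the slide's arithmetic (dividing by 15²) implies, and it takes the two critical values directly from the slide rather than computing them. The function name `chi_square_statistic` is illustrative.

```python
def chi_square_statistic(n, s, sigma0):
    """chi-square = (n - 1) * s^2 / sigma0^2 for a test about a variance."""
    return (n - 1) * s ** 2 / sigma0 ** 2

# Cereal example: n = 25 boxes, sample standard deviation 17.7 g,
# hypothesized standard deviation 15 g (assumed reading of the slide).
chi2 = chi_square_statistic(n=25, s=17.7, sigma0=15.0)

# Table critical values for 24 df with .025 in each tail, as given on the slide.
lower_crit, upper_crit = 12.401, 39.364
reject = chi2 < lower_crit or chi2 > upper_crit
print(round(chi2, 2), reject)  # prints 33.42 False
```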
Type II Error
The Type II error probability β is calculated
assuming that the null hypothesis is false
because it is defined as the probability of
accepting H0 when it is false. The situation
corresponding to accepting the null hypothesis,
and thereby risking a Type II error, is not
generally as controllable. For that reason, we
adopted a policy of nonrejection of H0 when the
test statistic does not fall in the rejection region,
rather than risking an error of unknown
magnitude.
Steps for Calculating β for a
Large-Sample Test about µ
1. Calculate the value(s) of x̄ corresponding to
the border(s) of the rejection region. There
will be one border value for a one-tailed test
and two for a two-tailed test. The formula is
one of the following, corresponding to a test
with level of significance α:
Upper-tailed test: x̄0 = µ0 + zα σx̄ ≈ µ0 + zα (s/√n)
Steps for Calculating β for a
Large-Sample Test about µ
Lower-tailed test: x̄0 = µ0 – zα σx̄ ≈ µ0 – zα (s/√n)
Two-tailed test: x̄0,L = µ0 – zα/2 σx̄ ≈ µ0 – zα/2 (s/√n)
x̄0,U = µ0 + zα/2 σx̄ ≈ µ0 + zα/2 (s/√n)
Steps for Calculating β for a
Large-Sample Test about µ
2. Specify the value of µa in the alternative
hypothesis for which the value of β is to be
calculated. Then convert the border value(s)
of x̄0 to z-value(s) using the alternative
distribution with mean µa. The general
formula for the z-value is
z = (x̄0 – µa) / σx̄
Steps for Calculating β for a
Large-Sample Test about µ
3. Sketch the alternative distribution (centered
at µa) and shade the area in the acceptance
(nonrejection) region. Use the z-statistic(s)
and Table IV in Appendix B to find the shaded
area, which is β.
Power of Test
• Probability of rejecting false H0
• Correct decision
• Equal to 1 – β
• Used in determining test adequacy
• Affected by
• True value of population parameter
• Significance level α
• Standard deviation σ & sample size n
Two-Tailed z Test Example
Does an average box of
cereal contain 368 grams of
cereal? A random sample of
25 boxes had x̄ = 372.5. The
company has specified σ to
be 15 grams. Test at the .05
level of significance.
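Finding Power
Step 5
Hypothesis: H0: µ0 ≥ 368, Ha: µ0 < 368
α = .05, n = 25, σx̄ = σ/√n = 15/√25
Border of the rejection region:
x̄L = µ0 – zα (σ/√n) = 368 – 1.645 (15/√25) = 363.065
Reject H0 if x̄ < 363.065; otherwise do not reject H0.
"True" situation: µa = 360 (Ha)
[Figure: two sampling distributions of x̄ are drawn, one centered at µ0 = 368 and one centered at µa = 360, with the border 363.065 marked on both.]
From the z table, the area to the right of 363.065 under the distribution centered at µa = 360 is β = .154, so the power is 1 – β = .846.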
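The three steps for calculating β can be sketched for this lower-tailed example. The function name `beta_lower_tailed` is an illustrative assumption, and the sketch uses an exact normal quantile instead of a rounded table value, so its result matches the slide's β = .154 only up to table rounding.

```python
from math import sqrt
from statistics import NormalDist

def beta_lower_tailed(mu0, mu_a, sigma, n, alpha):
    """Type II error probability for a lower-tailed z test of H0: mu >= mu0."""
    std_normal = NormalDist()
    se = sigma / sqrt(n)                                  # standard deviation of x-bar
    # Step 1: border of the rejection region on the x-bar scale.
    x_border = mu0 - std_normal.inv_cdf(1.0 - alpha) * se
    # Step 2: convert the border to a z-value under the alternative mean.
    z = (x_border - mu_a) / se
    # Step 3: beta is the area of the nonrejection region (x-bar > border).
    return 1.0 - std_normal.cdf(z)

beta = beta_lower_tailed(mu0=368, mu_a=360, sigma=15, n=25, alpha=0.05)
power = 1.0 - beta
print(round(beta, 3), round(power, 3))
```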
Properties of β
and Power
1. For fixed n and α,
the value of β
decreases, and the
power increases as
the distance
between the
specified null value
µ0 and the specified
alternative value µa
increases.
Properties
of β and
Power
2. For fixed n and
values of µ0 and
µa, the value of β
increases, and
the power
decreases as the
value of α is
decreased.
Properties of β and Power
3. For fixed α and values of µ0 and µa, the value of β
decreases, and the power increases as the sample
size n is increased.
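These three properties can be illustrated numerically. The sketch below assumes a lower-tailed z test with known σ and reuses the cereal example's numbers (µ0 = 368, µa = 360, σ = 15, n = 25, α = .05) as a baseline. The function name `power_lower_tailed` and the alternative settings tried are illustrative choices.

```python
from math import sqrt
from statistics import NormalDist

def power_lower_tailed(mu0, mu_a, sigma, n, alpha):
    """Power (1 - beta) of a lower-tailed z test of H0: mu >= mu0 when mu = mu_a."""
    std_normal = NormalDist()
    se = sigma / sqrt(n)
    x_border = mu0 - std_normal.inv_cdf(1.0 - alpha) * se   # reject if x-bar < border
    return std_normal.cdf((x_border - mu_a) / se)

base = power_lower_tailed(368, 360, 15, 25, 0.05)

# Property 1: a larger distance between mu0 and mu_a gives more power.
farther_alternative = power_lower_tailed(368, 355, 15, 25, 0.05)
# Property 2: a smaller alpha gives less power (larger beta).
smaller_alpha = power_lower_tailed(368, 360, 15, 25, 0.01)
# Property 3: a larger sample size gives more power.
larger_sample = power_lower_tailed(368, 360, 15, 100, 0.05)

print(farther_alternative > base, smaller_alpha < base, larger_sample > base)
# prints True True True
```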
Example 1
Based on the Susenas 2006 data:
• Test whether the mean monthly household expenditure
on food and non-food items in East Java is greater
than Rp 850,000.
• Is it true that the proportion of households in East Java
that own a mobile phone is 75%?
• Is it true that the proportion of household members in
East Java who receive business credit is 5%?
Use a 5% significance level for all of the problems
above.
TESTING TWO POPULATIONS
INDEPENDENT RANDOM SAMPLES FROM TWO
POPULATIONS (1)
INDEPENDENT RANDOM SAMPLES FROM TWO
POPULATIONS (2)
SMALL SAMPLES: NORMAL
POPULATIONS (1)
SMALL SAMPLES: NORMAL
POPULATIONS (2)
MATCHED PAIRS COMPARISONS (1)
MATCHED PAIRS COMPARISONS (2)
COMPARING TWO POPULATION
PROPORTIONS (1)
Example 2
• Is there a difference in mean expenditure on food
and non-food items between rural and urban
households?
• Is it true that the proportion of urban households that
own a mobile phone is higher than that of rural
households?
• Is it true that the proportion of household members
receiving business credit is higher in urban
households than in rural households?
Example 3