
Hypothesis Test

The document discusses hypothesis testing. It defines the null and alternative hypotheses and explains that the null hypothesis is accepted unless there is convincing evidence against it from the data. It provides examples of stating hypotheses for different scenarios and explains the components of a hypothesis test, including the test statistic, rejection region/p-value, and conclusion. It discusses type I and type II errors and how significance levels are used to make decisions about whether to reject the null hypothesis.

Uploaded by

windi aprilia
Copyright: © All Rights Reserved

HYPOTHESIS TEST

INTRODUCTION

A statistical hypothesis is a statement about the numerical value of a population parameter.

The null hypothesis, denoted H0, represents the
hypothesis that will be accepted unless the data provide
convincing evidence that it is false.

The alternative (research) hypothesis, denoted Ha or H1, represents the hypothesis that will be accepted only if the data provide convincing evidence of its truth. It usually represents the values of a population parameter that the researcher wants to gather evidence to support.

Null hypothesis
• No difference; nothing happens.
• A statistical hypothesis that contains a statement of equality, =.
• Denoted H0, read "H null."

Alternative hypothesis
• A difference; something happens.
• A statement of inequality such as >, ≠, or <.
• Denoted Ha, read "H a," or H1, read "H one."

The null and alternative hypotheses are complementary statements.
Example

The mayor of a small city claims that the average income in his city is $35,000. State the hypotheses.

H0: μ = 35,000 (mayor is correct) versus
Ha: μ ≠ 35,000 (mayor is wrong)
Example

A cereal company advertises that the mean weight of the contents of its 20-ounce cereal boxes is at least 20 ounces. We think the boxes contain less cereal.

Solution:
H0: μ ≥ 20 ounces (null hypothesis; the claim)
Ha: μ < 20 ounces (alternative hypothesis)
Components of Hypothesis Test

1. The hypothesis
Null Hypothesis (H0):
Alternative Hypothesis (H1):
2. The test statistic
3. The rejection region or the p-value:
A rule that tells us for which values of the test statistic, or for which p-values, the null hypothesis should be rejected.
4. Conclusion:
Accept (fail to reject) H0
Reject H0
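Decision/Conclusion Rule
The decision depends on the significance level, α: the maximum risk you are willing to tolerate of making a mistake if you decide to reject H0. Usually, the significance level is α = .01 or α = .05.

Decision

• Based on a confidence interval: reject H0 if the hypothesized value lies outside the 100(1 − α)% confidence interval (two-tailed test).
• Based on critical values: reject H0 if t > t table or z > z table (compared in absolute value for a two-tailed test).
• Based on p-values: reject H0 if p < α.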
Possible Results

                 Actual situation
Decision         H0 is true             H1 is true
Reject H0        α (Type I error)       1 − β (power)
Accept H0        1 − α                  β (Type II error)

Define:
α = P(Type I error) = P(reject H0 when H0 is true)
β =P(Type II error) = P(accept H0 when H1 is true)
Type I and Type II Errors

When we reject or do not reject H0, we risk making one of two possible errors:
Type I error: Reject H0 when it is actually true.
Type II error: Retain H0 when it is actually false.

α and β have an inverse relationship: for a fixed sample size, reducing the probability of one error makes the probability of the other go up.
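The inverse relationship between α and β can be illustrated numerically. The sketch below is not from the slides: it assumes a right-tailed z test with known σ, and the values μ0 = 100, μ1 = 103, σ = 10, n = 25 are hypothetical numbers chosen only for illustration. The helper name `type_ii_error` is likewise an assumption.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(alpha, mu0, mu1, sigma, n):
    """beta for a right-tailed z test of H0: mu = mu0 when the true mean is mu1 > mu0."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)       # reject H0 when z > z_crit
    shift = (mu1 - mu0) / (sigma / sqrt(n))      # distance of mu1 from mu0 in standard errors
    return std_normal.cdf(z_crit - shift)        # P(fail to reject H0 | mu = mu1)

# Hypothetical setting: as alpha shrinks, beta grows (sample size held fixed).
for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(alpha, mu0=100, mu1=103, sigma=10, n=25)
    print(f"alpha = {alpha:.2f}  beta = {beta:.3f}  power = {1 - beta:.3f}")
```

Increasing n in this sketch lowers β for every α, which is why the trade-off is stated for a fixed sample size.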
Level of Significance

In a hypothesis test, the level of significance is your maximum allowable probability of making a type I error. It is denoted by α, the lowercase Greek letter alpha. Hypothesis tests are based on α.
The probability of making a type II error is denoted by β,
the lowercase Greek letter beta.
By setting the level of significance at a small value,
you are saying that you want the probability of
rejecting a true null hypothesis to be small.
Commonly used levels of significance:
α = 0.10 α = 0.05 α = 0.01
Using p value to make decision

Decision Rule Based on P-value


To use a P-value to make a conclusion in a hypothesis test,
compare the P-value with α.
1. If p < α, then reject H0.
2. If p ≥ α, then fail to reject H0.
Finding the P-value

After determining the test statistic and the test statistic's corresponding area, do one of the following to find the P-value.
a. For a left-tailed test, P = (Area in left tail).
b. For a right-tailed test, P = (Area in right tail).
c. For a two-tailed test, P = 2(Area in tail of test statistic).

Example:
The test statistic for a right-tailed test is z = 1.56. Find the P-value.

The area to the right of z = 1.56 is 1 − 0.9406 = 0.0594, so the P-value is 0.0594.
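The three tail rules can be sketched in a few lines of Python. This is a minimal illustration that uses the standard normal CDF from the standard library in place of a printed z table; the function name `p_value_from_z` and the `tail` labels are assumptions made for this sketch.

```python
from statistics import NormalDist

def p_value_from_z(z, tail):
    """P-value of a z statistic; tail is 'left', 'right', or 'two'."""
    cdf = NormalDist().cdf                 # standard normal CDF (area to the left)
    if tail == "left":
        return cdf(z)                      # area in the left tail
    if tail == "right":
        return 1 - cdf(z)                  # area in the right tail
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))       # twice the area in the tail of the statistic
    raise ValueError("tail must be 'left', 'right', or 'two'")

print(round(p_value_from_z(1.56, "right"), 4))   # 0.0594
```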
Finding the P-value

Example:
The P-value for a hypothesis test is P = 0.0594. What is your decision if the level of significance is
a.) 0.05,
b.) 0.1?

a.) Because 0.0594 > 0.05, you fail to reject the null hypothesis.
b.) Because 0.0594 < 0.1, you reject the null hypothesis.
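The same P-value can lead to different decisions at different significance levels. A minimal sketch of the decision rule, with a hypothetical helper named `decide`:

```python
def decide(p_value, alpha):
    """Apply the P-value decision rule: reject H0 only when p < alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

for alpha in (0.05, 0.1):
    print(f"alpha = {alpha}: {decide(0.0594, alpha)}")
```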
Types of Hypothesis Tests

• Test for a population mean
• Test for the difference between two population means (independent or dependent samples)
• Test for a population proportion
• Test for the difference between two population proportions
1. Hypothesis Test for a Population Mean, μ

The hypotheses are:

Two-tailed:   H0: μ = μ0 versus H1: μ ≠ μ0
One-tailed:   H0: μ = μ0 versus H1: μ < μ0
One-tailed:   H0: μ = μ0 versus H1: μ > μ0

Assumptions

• The samples are randomly selected.
• The data come from a normally distributed population.

Test Statistic

For a large sample:
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \approx \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

For a small sample:
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
which has a t distribution with n − 1 degrees of freedom.
Example 1

The daily yield for a chemical plant has averaged 880 tons for several years. The quality control manager wants to know if this average has changed. She randomly selects 50 days and records an average yield of 871 tons with a standard deviation of 21 tons. (Choose a significance level, say α = .01.)
Solution

H0: μ = 880
Ha: μ ≠ 880

Test statistic:
$$z \approx \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{871 - 880}{21/\sqrt{50}} = -3.03$$

Solution (using the critical value)

What is the critical value of z that cuts off exactly α/2 = .01/2 = .005 in each tail of the z distribution?

Rejection region: Reject H0 if z > 2.575 or z < −2.575. If the test statistic falls in the rejection region, its p-value will be less than α = .01.

For our example, z = −3.03 falls in the rejection region and H0 is rejected at the 1% significance level.

Solution (using the p-value)

$$p\text{-value} = P(z > 3.03) + P(z < -3.03) = 2P(z < -3.03) = 2(.0012) = .0024$$

Since .0024 < α = .01, H0 is rejected.

The 99% CI for μ is (863.35, 878.65). It does not contain 880, which agrees with rejecting H0.
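The numbers in this example can be checked with a short script. This is a sketch, not part of the original slides: it uses the standard library's normal distribution in place of the z table, so the critical value comes out as about 2.576 rather than the tabled 2.575, and the variable names are assumptions.

```python
from math import sqrt
from statistics import NormalDist

# Data from the chemical plant example
xbar, mu0, s, n, alpha = 871, 880, 21, 50, 0.01

std_normal = NormalDist()
se = s / sqrt(n)                               # estimated standard error of the mean
z = (xbar - mu0) / se                          # large-sample test statistic
z_crit = std_normal.inv_cdf(1 - alpha / 2)     # two-tailed critical value
p_value = 2 * std_normal.cdf(-abs(z))          # two-tailed p-value
ci = (xbar - z_crit * se, xbar + z_crit * se)  # 99% confidence interval for mu

print(f"z = {z:.2f}, critical value = {z_crit:.3f}")
print(f"p-value = {p_value:.4f}")
print(f"99% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
print("reject H0" if abs(z) > z_crit else "fail to reject H0")
```

All three routes (critical value, p-value, confidence interval) give the same decision, as they must for a two-tailed test at the same α.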
Example 2

A local telephone company claims that the average length of a phone call is 8 minutes. In a random sample of 58 phone calls, the sample mean was 7.8 minutes and the standard deviation was 0.5 minutes. Is there enough evidence to support this claim at α = 0.05?

H0: μ = 8 (Claim)   Ha: μ ≠ 8

The level of significance is α = 0.05. The test is two-tailed with an area of 0.025 in each tail, so the critical values are −z0 = −1.96 and z0 = 1.96.

Solution

The test statistic is
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{7.8 - 8}{0.5/\sqrt{58}} \approx -3.05.$$

The test statistic falls in the rejection region, so H0 is rejected.

At the 5% level of significance, there is enough evidence to reject the claim that the average length of a phone call is 8 minutes.
Example 3

A homeowner randomly samples 64 homes similar to her own and finds that the average selling price is $252,000 with a standard deviation of $15,000. Is this sufficient evidence to conclude that the average selling price is greater than $250,000? Use α = .01.

The hypotheses are H0: μ = 250,000 versus Ha: μ > 250,000.

Test statistic:
$$z \approx \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{252{,}000 - 250{,}000}{15{,}000/\sqrt{64}} = 1.07$$

Example 3 (Decision Based on Critical Value)

• What is the critical value of z that cuts off exactly α = .01 in the right tail of the z distribution?
• Rejection region: Reject H0 if z > 2.33. If the test statistic falls in the rejection region, its p-value will be less than α = .01.
• For our example, z = 1.07 does not fall in the rejection region and H0 is not rejected. There is not enough evidence to indicate that μ is greater than $250,000.

Example 3 (Decision Based on p-Value)

$$p\text{-value} = P(z > 1.07) = 1 - .8577 = .1423$$

Since the p-value is greater than α = .01, H0 is not rejected. There is insufficient evidence to indicate that μ is greater than $250,000.
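For a one-tailed test the whole of α sits in a single tail, which changes the critical value. A minimal sketch for the upper-tailed case follows; the function name `right_tailed_z_test` is an assumption, and the critical value is computed from the standard normal distribution rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist

def right_tailed_z_test(xbar, mu0, s, n, alpha):
    """Large-sample test of H0: mu = mu0 against Ha: mu > mu0."""
    z = (xbar - mu0) / (s / sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha)   # all of alpha in the right tail
    return z, z_crit, z > z_crit

z, z_crit, reject = right_tailed_z_test(252_000, 250_000, 15_000, 64, 0.01)
print(f"z = {z:.2f}, critical value = {z_crit:.2f}")
print("reject H0" if reject else "fail to reject H0")
```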
Example 4

A manufacturer claims that its rechargeable batteries have an average life greater than 1,000 charges. A random sample of 10 batteries has a mean life of 1002 charges and a standard deviation of 14. Is there enough evidence to support this claim at α = 0.01?

H0: μ ≤ 1000   Ha: μ > 1000 (Claim)

The level of significance is α = 0.01.
The degrees of freedom are d.f. = n − 1 = 10 − 1 = 9.
The test statistic is
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{1002 - 1000}{14/\sqrt{10}} \approx 0.45$$

Solution

Using the d.f. = 9 row of the t table, you can determine that 0.25 < p-value < 0.5, which is greater than the 0.01 significance level. H0 is therefore not rejected.

At the 1% level of significance, there is not enough evidence to support the claim that the rechargeable battery has an average life greater than 1,000 charges.
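A t table only brackets the p-value. As a rough check, the right-tail area can be approximated numerically. The sketch below is an illustration under stated assumptions, not the slides' method: it integrates the Student t density with Simpson's rule using only the standard library, and the helper names `t_pdf` and `t_right_tail` are hypothetical.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_right_tail(t, df, steps=2000):
    """P(T > t) for t >= 0: 0.5 minus Simpson's rule over [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

xbar, mu0, s, n, alpha = 1002, 1000, 14, 10, 0.01
t = (xbar - mu0) / (s / sqrt(n))
p_value = t_right_tail(t, df=n - 1)

print(f"t = {t:.2f}, p-value = {p_value:.3f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

The computed p-value falls between 0.25 and 0.5, matching the bracket read from the d.f. = 9 row of the table.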
2. Difference Between Two Population Means
(Case of Two Independent Samples)

• Independent random samples of size n1 and n2 are drawn from population 1 and population 2 with means μ1 and μ2 and variances σ1² and σ2².

The hypotheses are:

Two-tailed:                H0: μ1 − μ2 = Δ0 versus H1: μ1 − μ2 ≠ Δ0
One-tailed (lower-tail):   H0: μ1 − μ2 = Δ0 versus H1: μ1 − μ2 < Δ0
One-tailed (upper-tail):   H0: μ1 − μ2 = Δ0 versus H1: μ1 − μ2 > Δ0

Assumptions

• The samples are random and independent of each other.
• The data must be continuous.
• The distribution is normal.

Test Statistic

For a large sample:
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - \Delta_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

Test Statistic: Small Sample

Equal variances:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - \Delta_0}{\sqrt{s^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$
where s² is the pooled sample variance, $s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$, and t has n1 + n2 − 2 degrees of freedom.

Unequal variances:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - \Delta_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

How to check whether the equal-variance assumption is reasonable?

Rule of Thumb
• If (larger s²)/(smaller s²) ≤ 3, assume that the variances are equal.
• If (larger s²)/(smaller s²) > 3, do NOT assume that the variances are equal.
Example 1

• Is there a difference in the average daily intakes of dairy products for men versus women? Use α = .05.

Avg Daily Intakes    Men    Women
Sample size          50     50
Sample mean          756    762
Sample Std Dev       35     30

Solution

H0: μ1 − μ2 = 0 (same)
Ha: μ1 − μ2 ≠ 0 (different)

Test statistic:
$$z \approx \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{(756 - 762) - 0}{\sqrt{\dfrac{35^2}{50} + \dfrac{30^2}{50}}} = -.92$$
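The large-sample statistic for this example can be reproduced as follows. The slide stops at the test statistic, so the decision step shown here is an assumption: it applies the usual two-tailed rule at α = .05 with a critical value computed from the standard normal distribution. The function name `two_sample_z` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(x1, x2, s1, s2, n1, n2, delta0=0.0):
    """Large-sample z statistic for H0: mu1 - mu2 = delta0 (independent samples)."""
    se = sqrt(s1**2 / n1 + s2**2 / n2)   # standard error of the difference in means
    return ((x1 - x2) - delta0) / se

alpha = 0.05
z = two_sample_z(756, 762, 35, 30, 50, 50)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value

print(f"z = {z:.2f}, critical value = {z_crit:.2f}")
print("reject H0" if abs(z) > z_crit else "fail to reject H0")
```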
Example 2

Two training procedures are compared by measuring the time that it takes trainees to assemble a device. A different group of trainees is taught using each method. Is there a difference between the two methods? Use α = 0.01.

Time to Assemble    Method 1    Method 2
Sample size         10          12
Sample mean         35          31
Sample Std Dev      4.9         4.5

Solution

Hypotheses:
H0: μ1 − μ2 = 0
Ha: μ1 − μ2 ≠ 0

Checking equality of variances:
$$\frac{\text{larger } s^2}{\text{smaller } s^2} = \frac{4.9^2}{4.5^2} = \frac{24.01}{20.25} = 1.186 \le 3$$

Calculate the pooled variance:
$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{9(4.9^2) + 11(4.5^2)}{20} = 21.942$$

Test statistic:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{s^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{35 - 31}{\sqrt{21.942\left(\dfrac{1}{10} + \dfrac{1}{12}\right)}} = 1.99$$

Critical value:
$$t_{\alpha/2;\,n_1 + n_2 - 2} = t_{0.005;\,20} = 2.845$$

Decision?
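One way to work through the decision is to script the three steps: variance check, pooled variance, and test statistic. This is a sketch under the equal-variance assumption; the helper names are hypothetical, and the critical value 2.845 is taken from the slide rather than computed.

```python
from math import sqrt

def variance_ratio(s1, s2):
    """Larger sample variance divided by the smaller one (rule-of-thumb check)."""
    v1, v2 = s1**2, s2**2
    return max(v1, v2) / min(v1, v2)

def pooled_variance(s1, s2, n1, n2):
    """Pooled estimate of the common variance."""
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)

def pooled_t(x1, x2, s1, s2, n1, n2, delta0=0.0):
    """Small-sample t statistic assuming equal population variances."""
    s2_pooled = pooled_variance(s1, s2, n1, n2)
    return ((x1 - x2) - delta0) / sqrt(s2_pooled * (1 / n1 + 1 / n2))

ratio = variance_ratio(4.9, 4.5)
s2_pooled = pooled_variance(4.9, 4.5, 10, 12)
t = pooled_t(35, 31, 4.9, 4.5, 10, 12)
t_crit = 2.845   # t table value for alpha/2 = 0.005 and 20 d.f., as quoted on the slide

print(f"variance ratio = {ratio:.3f} (equal variances assumed if <= 3)")
print(f"pooled variance = {s2_pooled:.3f}, t = {t:.2f}")
print("reject H0" if abs(t) > t_crit else "fail to reject H0")
```

With these numbers |t| is smaller than the critical value, so under this sketch H0 would not be rejected at α = 0.01.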
Paired-t
(Case of Two Dependent Samples)

Two samples are independent if the sample selected from one population is not related to the sample selected from the second population. Two samples are dependent if each member of one sample corresponds to a member of the other sample. Dependent samples are also called paired samples or matched samples.

Independent and Dependent Samples
Example:
Classify each pair of samples as independent or dependent.

Sample 1: The blood pressure of 20 students before consuming Panadol
Sample 2: The blood pressure of the same 20 students after consuming Panadol
These samples are dependent because the blood pressure readings before and after consuming Panadol are obtained from the same persons.

Sample 1: The prices of 15 new trucks
Sample 2: The prices of 20 used sedans
These samples are independent because it is not possible to pair the new trucks with the used sedans. The data represent prices for different vehicles.
t test for Difference between Means

To perform a two-sample hypothesis test with dependent samples, the difference between each data pair is first found:
d = x1 − x2 (difference between entries for a data pair)

The test statistic is the mean of these differences:
$$\bar{d} = \frac{\sum d}{n}$$
(mean of the differences between paired data entries in the dependent samples)

Three conditions are required to conduct the test.

1. The samples must be randomly selected.
2. The samples must be dependent (paired).
3. Both populations must be normally distributed.

If these conditions are met, then the sampling distribution for $\bar{d}$ is approximated by a t-distribution with n − 1 degrees of freedom, where n is the number of data pairs.

The following symbols are used for the t-test for μd.

Symbol: Description
n: The number of pairs of data
d: The difference between entries for a data pair, d = x1 − x2
μd: The hypothesized mean of the differences of paired data in the population
$\bar{d}$: The mean of the differences between the paired data entries in the dependent samples, $\bar{d} = \frac{\sum d}{n}$
sd: The standard deviation of the differences between the paired data entries in the dependent samples,
$$s_d = \sqrt{\frac{n\left(\sum d^2\right) - \left(\sum d\right)^2}{n(n - 1)}}$$
The Test Statistic

To test H0: μ1 − μ2 = μ0 we test H0: μd = μ0 using the test statistic
$$t = \frac{\bar{d} - \mu_0}{s_d/\sqrt{n}}$$
where n = number of pairs, and $\bar{d}$ and $s_d$ are the mean and standard deviation of the differences, $d_i$. Use the p-value or a rejection region based on a t-distribution with df = n − 1.
Example

A reading center claims that students will perform better on a standardized reading test after going through the reading course offered by their center. The table shows the reading scores of 6 students before and after the course. At α = 0.05, is there enough evidence to conclude that the students' scores after the course are better than the scores before the course?

Student           1     2     3     4     5     6
Score (before)    85    96    70    76    81    78
Score (after)     88    85    89    86    92    89

H0: μd ≤ 0
Ha: μd > 0 (Claim)

Solution

α = 0.05, d.f. = 6 − 1 = 5, and the critical value is tα = 2.015.

d = (score after) − (score before)

Student           1     2     3     4     5     6     Sum
Score (before)    85    96    70    76    81    78
Score (after)     88    85    89    86    92    89
d                 3     −11   19    10    11    11    Σd = 43
d²                9     121   361   100   121   121   Σd² = 833

$$\bar{d} = \frac{\sum d}{n} = \frac{43}{6} \approx 7.167$$

$$s_d = \sqrt{\frac{n\left(\sum d^2\right) - \left(\sum d\right)^2}{n(n - 1)}} = \sqrt{\frac{6(833) - 1849}{6(5)}} \approx \sqrt{104.967} \approx 10.245$$

The test statistic is
$$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \frac{7.167 - 0}{10.245/\sqrt{6}} \approx 1.714.$$

Since t ≈ 1.714 < tα = 2.015, fail to reject H0.
There is not enough evidence at the 5% level to support the claim that the students' scores after the course are better than the scores before the course.
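The paired calculation can be reproduced directly from the score table. In this sketch the standard library's `stdev` (the n − 1 sample standard deviation) stands in for the shortcut formula for sd; the two are algebraically equivalent. The critical value 2.015 is the t table value quoted in the example, and the variable names are assumptions.

```python
from math import sqrt
from statistics import mean, stdev

before = [85, 96, 70, 76, 81, 78]
after = [88, 85, 89, 86, 92, 89]

# d = (score after) - (score before) for each student
d = [a - b for a, b in zip(after, before)]
n = len(d)

d_bar = mean(d)                       # mean of the differences
s_d = stdev(d)                        # sample standard deviation of the differences
t = (d_bar - 0) / (s_d / sqrt(n))     # hypothesized mean difference is 0
t_crit = 2.015                        # right-tail critical value, alpha = 0.05, d.f. = 5

print(f"d = {d}")
print(f"d_bar = {d_bar:.3f}, s_d = {s_d:.3f}, t = {t:.3f}")
print("reject H0" if t > t_crit else "fail to reject H0")
```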
