Hypo Test

This document discusses hypothesis testing, including the concepts of the null and alternative hypotheses, examples of correctly formulating hypotheses, and how to conduct hypothesis tests for proportions. It explains that the sample proportion may differ from the hypothesized value under the null hypothesis due to sampling variability, and outlines how to set critical values and rejection regions to test hypotheses at various significance levels.

Uploaded by

Ashutosh Kumar
Hypothesis Testing

Karthik Sriram, IIM Ahmedabad


Objective

For confidence interval estimation, we were interested in a range of possible values for a population proportion or mean.

Unlike confidence interval estimation, hypothesis testing has a specific objective: to answer questions such as

• Whether market share has increased beyond 20%

• Whether a medicine will have the desired effect or not


Null vs Alternative Hypothesis
• Null Hypothesis, denoted H0:
status quo (e.g. “market share is 20%”)
or
default assumption
or
conservative belief (e.g. “Drug does not work”)

• Alternative Hypothesis, denoted H1 or Ha:
usually the opposite of the Null
that for which we need empirical evidence

Idea: Unless data evidence strongly favors the Alternative, we will not reject the Null
Example 1
A pharma company wants to demonstrate to the regulator that, on consuming its
new drug for 30 days, the average reduction in triglyceride level (𝜇) is more than 40
units. Which one of the following is the correct formulation of null and alternative?
a) H0: 𝜇 ≥ 40, H1: 𝜇 < 40
b) H0: 𝜇 ≤ 40, H1: 𝜇 > 40
c) H0: 𝜇 = 40, H1: 𝜇 ≠ 40
d) H0: 𝜇 ≠ 40, H1: 𝜇 = 40
Example 2
Currently the proportion (𝑝) of dissatisfied employees in a company is believed to
be more than 30%. Recently, several new HR incentive schemes were introduced.
The company wanted to test if the proportion of dissatisfied employees has now
decreased. Based on a random sample of 100 employees it was found that 𝑝̂ = .28
of employees are dissatisfied. Which one of the following is the correct formulation
of null and alternative?
a) H0: 𝑝 ≥ 0.3, H1: 𝑝 < 0.3
b) H0: 𝑝 ≥ 0.28, H1: 𝑝 < 0.28
c) H0: 𝑝 ≤ 0.3, H1: 𝑝 > 0.3
d) H0: 𝑝 ≤ 0.28, H1: 𝑝 > 0.28
e) H0: 𝑝̂ ≥ 0.3, H1: 𝑝̂ < 0.3
f) H0: 𝑝 = 0.3, H1: 𝑝 ≠ 0.3
g) H0: 𝑝̂ ≤ 0.3, H1: 𝑝̂ > 0.3
Example 3
A production process for manufacturing has been running smoothly for many months. The
machine is designed to manufacture screws of expected length (𝜇) of 2cm. If at any time
there is evidence to suggest the average has deviated from intended design, the machine
has to be stopped, checked for issues and recalibrated if need be, which involves significant
cost. A testing process is put in place to check this, based on a random sample of 35
screws manufactured in each batch and computing the average length (𝑥̅) in the sample.
Which one of the following is the correct formulation of null and alternative?
a) H0: 𝑥̅ = 2, H1: 𝑥̅ ≠ 2
b) H0: 𝜇 = 2, H1: 𝜇 ≠ 2
c) H0: 𝜇 ≠ 2, H1: 𝜇 = 2
d) H0: 𝜇 ≥ 2, H1: 𝜇 < 2
e) H0: 𝑥̅ ≠ 2, H1: 𝑥̅ = 2
f) H0: 𝜇 ≤ 2, H1: 𝜇 > 2
Hypothesis Testing for Proportions
2-sided test at 5% significance level
Suppose we want to test H0: 𝑝 = 1/3 vs. H1: 𝑝 ≠ 1/3

To test this, a sample of n=40 will be taken and 𝑝̂ will be computed after the sample is obtained.
If H0 is TRUE, is it necessary that the 𝑝̂ that results from the sample be the same as 1/3?
Not necessarily.

So, instead of checking whether 𝑝̂ = 1/3, we will give a buffer and check if either
𝑝̂ < 1/3 − 𝑐 or 𝑝̂ > 1/3 + 𝑐
Specifically, for a 5% test we choose c to ensure a 5% chance of rejecting H0 when it is true.

[Figure: possible values of 𝑝̂ on a line centered at 1/3; reject H0 if 𝑝̂ falls below 1/3 − 𝑐 (prob=2.5%) or above 1/3 + 𝑐 (prob=2.5%)]
In fact, by the CLT, if H0 is true, i.e. if 𝑝 = 1/3, then (approx.) with 95% probability 𝑝̂ will be within
$\frac{1}{3} \pm 1.96\sqrt{\frac{\frac{1}{3}\left(1-\frac{1}{3}\right)}{n}}$, with n = 40.
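The 95% range for 𝑝̂ under H0 can be evaluated numerically. The following is a minimal sketch using only the Python standard library; the function name `null_band` is a hypothetical helper introduced here, and the values p0 = 1/3 and n = 40 come from the slide.

```python
from math import sqrt
from statistics import NormalDist

def null_band(p0, n, level=0.95):
    """Range that contains p-hat with probability `level` if H0: p = p0 is true (CLT approximation)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # 97.5th percentile of N(0,1), about 1.96, when level = 0.95
    se = sqrt(p0 * (1 - p0) / n)               # standard error of p-hat computed under H0
    return p0 - z * se, p0 + z * se

low, high = null_band(1 / 3, 40)
print(f"If H0 is true, p-hat falls between {low:.3f} and {high:.3f} about 95% of the time")
```

A sample proportion outside this range is what the 2-sided rule treats as "too far" from 1/3.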
Test rule: 2-sided test at 5% significance level
So, to test H0: 𝑝 = 1/3 vs. H1: 𝑝 ≠ 1/3

the testing rule at 5% level of significance is as follows

Reject H0 if 𝑝̂ is outside: $\frac{1}{3} \pm 1.96\sqrt{\frac{\frac{1}{3}\left(1-\frac{1}{3}\right)}{n}}$

(Equivalently) Reject H0 if $|z| = \left|\frac{\hat{p}-\frac{1}{3}}{\sqrt{\frac{\frac{1}{3}\left(1-\frac{1}{3}\right)}{n}}}\right| > 1.96$ [note: 1.96 = 97.5th percentile of N(0,1)]

Heuristic:
We will reject H0 if the sample proportion turns out to be too far from the hypothesized value
of 1/3
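As a rough illustration of the 2-sided rule, the sketch below computes the z statistic and the decision. The function name `two_sided_test` and the sample proportions 0.50 and 0.40 are hypothetical; only p0 = 1/3, n = 40 and the 1.96 cut-off come from the slides.

```python
from math import sqrt

def two_sided_test(p_hat, p0, n, z_crit=1.96):
    """Return (z, reject) for H0: p = p0 vs. H1: p != p0."""
    se = sqrt(p0 * (1 - p0) / n)   # standard error under H0
    z = (p_hat - p0) / se          # standardized distance of p-hat from p0
    return z, abs(z) > z_crit      # reject when |z| exceeds the cut-off

# Hypothetical samples of size 40: 20 successes, then 16 successes
z_far, reject_far = two_sided_test(p_hat=0.50, p0=1 / 3, n=40)
z_near, reject_near = two_sided_test(p_hat=0.40, p0=1 / 3, n=40)
print(round(z_far, 3), reject_far)
print(round(z_near, 3), reject_near)
```

Both forms of the rule (𝑝̂ outside the band, or |z| > 1.96) give the same decision because z is just 𝑝̂ re-expressed in standard-error units.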
Upper (or Right)-tail test at 5% significance level
H0: 𝑝 ≤ 1/3 vs. H1: 𝑝 > 1/3

Heuristic: We will reject H0 if the sample proportion turns out to be too far to the right of the hypothesized
value of 1/3

So, the testing rule at 5% level of significance is as follows

Reject H0 if $\hat{p} > \frac{1}{3} + 1.645\sqrt{\frac{\frac{1}{3}\left(1-\frac{1}{3}\right)}{n}}$

(Equivalently) Reject H0 if $z = \frac{\hat{p}-\frac{1}{3}}{\sqrt{\frac{\frac{1}{3}\left(1-\frac{1}{3}\right)}{n}}} > 1.645$

Note: 1.645 = 95th percentile of N(0,1)

[Figure: possible values of 𝑝̂; reject H0 if 𝑝̂ falls above 1/3 + 𝑐 (prob=5%)]
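The upper-tail cut-off can be evaluated for the n = 40 sample used earlier in the slides. This is a sketch with a hypothetical helper name, `upper_tail_cutoff`; it relies on the same normal approximation as the rule above.

```python
from math import sqrt
from statistics import NormalDist

def upper_tail_cutoff(p0, n, alpha=0.05):
    """Cut-off c such that P(p-hat > c) is about alpha when p = p0."""
    z = NormalDist().inv_cdf(1 - alpha)        # 95th percentile of N(0,1), about 1.645, when alpha = 0.05
    return p0 + z * sqrt(p0 * (1 - p0) / n)

c = upper_tail_cutoff(1 / 3, 40)
print(f"Reject H0 if p-hat > {c:.3f}")
```

With n = 40 this reproduces the cut-off of about .456 that the later "Types of Errors" slide uses.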
Lower (or Left)-tail test at 5% significance level
H0: 𝑝 ≥ 1/3 vs. H1: 𝑝 < 1/3

Heuristic: We will reject H0 if the sample proportion turns out to be too far to the left of the hypothesized
value of 1/3

So, the testing rule at 5% level of significance is as follows

Reject H0 if $\hat{p} < \frac{1}{3} - 1.645\sqrt{\frac{\frac{1}{3}\left(1-\frac{1}{3}\right)}{n}}$

(Equivalently) Reject H0 if $z = \frac{\hat{p}-\frac{1}{3}}{\sqrt{\frac{\frac{1}{3}\left(1-\frac{1}{3}\right)}{n}}} < -1.645$

Note: -1.645 = 5th percentile of N(0,1)

[Figure: possible values of 𝑝̂; reject H0 if 𝑝̂ falls below 1/3 − 𝑐 (prob=5%)]
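For a concrete number, the lower-tail cut-off can be worked out by hand, assuming the same sample size n = 40 as in the 2-sided slides (the lower-tail slide itself does not restate n):

```latex
\[
\frac{1}{3} - 1.645\sqrt{\frac{\frac{1}{3}\left(1-\frac{1}{3}\right)}{40}}
\approx 0.3333 - 1.645 \times 0.0745
\approx 0.211
\]
```

Under that assumption, H0 would be rejected at the 5% level only if 𝑝̂ falls below roughly 0.211.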
Use Case 1: Complaints on utility services
A government department handled complaints from citizens related to a utility
service. There was a concern that many complaints were not resolved satisfactorily
in time. The head of the department introduced a new process wherein she held
the senior officers of different functions accountable for the complaints, requiring
them to give a weekly update on the same, while promising all support in terms of
resources required for their efficient functioning. Her goal was to reduce the
percentage of complaints not resolved satisfactorily in time to less than 10%. To see
whether the changes have helped achieve her goal, she conducted a survey on 200
randomly chosen consumers who had filed complaints post the changes. 19 of
them reported that their complaints were not resolved satisfactorily in time. Is
there statistical evidence at 5% level to support that the goal of the department
head has been achieved?
[Handwritten annotation, partly legible: 𝑝 = proportion of unresolved complaints; H0: 𝑝 ≥ 0.1 vs. H1: 𝑝 < 0.1; cut-off for 𝑝̂ = 0.0651; observed value of 𝑝̂ = 0.0950]
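One way to work through Use Case 1 is sketched below. It assumes the formulation H0: 𝑝 ≥ 0.10 vs. H1: 𝑝 < 0.10 (the goal "less than 10%" is the claim needing evidence) and the normal approximation from the lower-tail rule; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, unresolved = 200, 19          # survey size and number of unresolved complaints (from the case)
p0, alpha = 0.10, 0.05           # hypothesized boundary value and significance level

p_hat = unresolved / n                            # observed sample proportion
se = sqrt(p0 * (1 - p0) / n)                      # standard error under H0
cutoff = p0 + NormalDist().inv_cdf(alpha) * se    # inv_cdf(0.05) is about -1.645
z = (p_hat - p0) / se                             # test statistic
p_value = NormalDist().cdf(z)                     # lower-tail P-value

reject = p_hat < cutoff
print(f"p-hat = {p_hat:.4f}, cut-off = {cutoff:.4f}, z = {z:.3f}, P-value = {p_value:.3f}")
print("Reject H0" if reject else "Do not reject H0")
```

Under these assumptions the observed proportion is below 10% but not below the cut-off, so the sample does not provide evidence at the 5% level that the goal has been achieved.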
Use Case 2: Free Dinner Incentive
A large company decided to provide free dinner to all those employees who had to
stay back late due to work beyond 8:30 pm. They had consistently seen that about
30% of employees stayed back beyond this time, and it would be a nice
gesture of thanks. To engage a caterer, it was important for the company to
commit and pay upfront for the volume of food required, which was of course
directly dependent on the proportion “p” who would stay back beyond 8:30 pm.
It was important for the company to monitor this proportion. While the proportion
was believed to be stable, if at all it changed significantly, it would result in either
over- or under-supply of required food. The company decided to take a sample of
300 employees every week to test this. What should be the null and alternative
hypotheses, and what is the testing rule based on 𝑝̂ at the 1% level?
If in a given week the sample proportion turned out to be 0.27, what would you
conclude from the test?
[Handwritten annotation, partly legible: H0: 𝑝 = 0.3 vs. H1: 𝑝 ≠ 0.3; observed 𝑝̂ = 0.27; critical value 2.57; don't reject H0]
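A possible worked version of Use Case 2 follows. It assumes a 2-sided formulation, H0: 𝑝 = 0.30 vs. H1: 𝑝 ≠ 0.30, since both over- and under-supply matter, and uses the normal approximation; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p0, n, alpha = 0.30, 300, 0.01   # values taken from the case

z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # 99.5th percentile of N(0,1), about 2.576
se = sqrt(p0 * (1 - p0) / n)                   # standard error under H0
low, high = p0 - z_crit * se, p0 + z_crit * se # reject H0 if p-hat falls outside (low, high)

p_hat = 0.27                                   # sample proportion observed in the given week
z = (p_hat - p0) / se
reject = abs(z) > z_crit
print(f"Reject H0 if p-hat < {low:.4f} or p-hat > {high:.4f}")
print(f"z = {z:.3f}:", "reject H0" if reject else "do not reject H0")
```

With these assumptions, 0.27 lies inside the band, so the weekly sample gives no evidence at the 1% level that the proportion has moved away from 30%.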

P-value
The test rule (in terms of the test statistic) is usually stated as
“reject H0 if the test statistic is beyond some cut-off”

Note that the probability beyond the cut-off, when H0 is true, = alpha (e.g. 5%)

P-value = probability, when H0 is true, of a test statistic beyond the observed test statistic value


Then, the test rule (in terms of P-value) can be stated as
“reject H0 if P-value < alpha”

P-value
The test rule can be stated either

in terms of the test statistic:
2-sided test: “Reject H0 if |𝑝̂ − 1/3| > c” [c chosen to ensure level=alpha]
Upper tail test: “Reject H0 if 𝑝̂ > c” [c chosen to ensure level=alpha]

OR
equivalently in terms of P-value as
“Reject H0 if P-value < alpha”
P-value for Lower tail test, e.g. H1: 𝑝 < 1/3
Note: in the picture, the test statistic is taken to be $\frac{\hat{p} - 1/3}{SE}$

Picture courtesy: https://online.stat.psu.edu/stat462/node/253/

With P-value RULE is always: Reject H0 if P-value < alpha


P-value for two tail test, e.g. H1: 𝑝 ≠ 1/3
Note: in the picture, the test statistic is taken to be $\frac{\hat{p} - 1/3}{SE}$

Picture courtesy: https://online.stat.psu.edu/stat462/node/253/

With P-value RULE is always: Reject H0 if P-value < alpha


P-value for Upper tail test, e.g. H1: 𝑝 > 1/3
Note: in the picture, the test statistic is taken to be $\frac{\hat{p} - 1/3}{SE}$

Picture courtesy: https://online.stat.psu.edu/stat462/node/253/

With P-value RULE is always: Reject H0 if P-value < alpha
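The three P-value definitions can be collected into one small function. This is a sketch under the normal approximation, where z is the standardized statistic (𝑝̂ − 1/3)/SE; the function name `p_value` and the tail labels are hypothetical.

```python
from statistics import NormalDist

def p_value(z, tail):
    """P-value of a standardized test statistic z under N(0,1)."""
    phi = NormalDist().cdf
    if tail == "lower":        # H1: p < p0, area to the left of z
        return phi(z)
    if tail == "upper":        # H1: p > p0, area to the right of z
        return 1 - phi(z)
    if tail == "two-sided":    # H1: p != p0, area beyond |z| in both tails
        return 2 * (1 - phi(abs(z)))
    raise ValueError("tail must be 'lower', 'upper' or 'two-sided'")

alpha = 0.05
for z, tail in [(-1.2, "lower"), (2.1, "upper"), (2.1, "two-sided")]:
    p = p_value(z, tail)
    print(f"z = {z:+.1f}, {tail}: P-value = {p:.4f},", "reject H0" if p < alpha else "do not reject H0")
```

A statistic sitting exactly on a cut-off (1.96 for the 2-sided test, ±1.645 for the one-sided tests) has a P-value of about 5%, which is why the P-value rule and the cut-off rule agree.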


Types of Errors
• Suppose we are testing H0: 𝑝 ≤ 1/3 vs. H1: 𝑝 > 1/3 at 5% level
• Test rule: Reject H0 if 𝑝̂ > c, where $c = \frac{1}{3} + 1.645\sqrt{\frac{\frac{1}{3}\left(1-\frac{1}{3}\right)}{n}} = .456$ (with n = 40)

                     Possible Test Decision
Actual               don’t reject H0     reject H0
H0 is true           Correct             Type I Error
H1 is true           Type II Error       Correct

• alpha = Significance Level = P(Type I Error) = P(Rejecting H0 when it is true)
is computed by supposing H0 is TRUE (i.e. p=1/3)
• beta = P(Type II Error) = P(Not rejecting H0 when H1 is true)
is computed by supposing H1 is TRUE.
• power = 1 − beta
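To make alpha, beta and power concrete, the sketch below evaluates them for the rule "reject H0 if 𝑝̂ > c" with n = 40. The alternative value p1 = 0.4 is an assumption chosen for illustration (beta depends on which alternative is supposed true), and the normal approximation is used throughout.

```python
from math import sqrt
from statistics import NormalDist

nd = NormalDist()
p0, n = 1 / 3, 40
p1 = 0.4                                   # assumed alternative value, for illustration only

se0 = sqrt(p0 * (1 - p0) / n)              # standard error of p-hat if H0 (p = 1/3) is true
se1 = sqrt(p1 * (1 - p1) / n)              # standard error of p-hat if p = p1 is true
c = p0 + nd.inv_cdf(0.95) * se0            # cut-off of the 5% upper-tail test, about .456

alpha = 1 - nd.cdf((c - p0) / se0)         # P(reject H0) computed supposing p = 1/3
beta = nd.cdf((c - p1) / se1)              # P(do not reject H0) computed supposing p = p1
power = 1 - beta

print(f"c = {c:.3f}, alpha = {alpha:.3f}, beta = {beta:.3f}, power = {power:.3f}")
```

With only 40 observations the test has low power against an alternative this close to 1/3: most samples drawn with p = 0.4 would still fall below the cut-off.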
Power and Level: Right tailed test
Suppose we are testing H0: 𝑝 ≤ 1/3 vs. H1: 𝑝 > 1/3 at 5% level

Test rule: Reject H0 if 𝑝̂ > c

[Figure: test-statistic distribution if H0 is true and test-statistic distribution if H1 is true]

Picture courtesy: https://medium.com/almabetter/hypothesis-testing-602fcb022a70


Interpretation
Suppose the test rule is: “reject H0 if 𝑝̂ > .456”

Our rejection of H0 depends on the value of 𝑝̂ we get in our sample.

Alpha = Significance Level measures “how likely we are to reject H0 when H0 is true”

Beta = “how likely we are to not reject H0 when H1 is true”

Power (=1-Beta) measures: “how likely we are to reject H0 (in favor of H1) when H1 is true”

We would want power to be high (same as saying beta should be low)


How to increase power at a given alpha
• Suppose level=alpha=5% is fixed

• The test rule is: “reject H0 if 𝑝̂ > c”

• Power is usually computed at a “contextually meaningful alternative”, e.g. if p=0.4 is a value at which the company becomes unprofitable.

To increase power to a desired level, say 90%, one needs to increase the sample size. Note that c is then also readjusted to ensure level=5%
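The effect of sample size on power can be sketched as follows. The example assumes the running upper-tail test of H0: 𝑝 ≤ 1/3 at the 5% level and the alternative p = 0.4 mentioned above; the helper names `power_upper` and `min_sample_size` are hypothetical, and the normal approximation is used.

```python
from math import sqrt
from statistics import NormalDist

def power_upper(p0, p1, n, alpha=0.05):
    """Approximate power of the upper-tail test of H0: p <= p0 when the true proportion is p1."""
    nd = NormalDist()
    c = p0 + nd.inv_cdf(1 - alpha) * sqrt(p0 * (1 - p0) / n)   # cut-off is readjusted for each n
    return 1 - nd.cdf((c - p1) / sqrt(p1 * (1 - p1) / n))

def min_sample_size(p0, p1, target, alpha=0.05):
    """Smallest n whose approximate power at p1 reaches `target` (simple linear search)."""
    n = 1
    while power_upper(p0, p1, n, alpha) < target:
        n += 1
    return n

for n in (40, 100, 200, 400):
    print(n, round(power_upper(1 / 3, 0.4, n), 3))

n_needed = min_sample_size(1 / 3, 0.4, target=0.90)
print("n needed for 90% power:", n_needed)
```

The search recomputes c at every n, which mirrors the note that the cut-off moves as the sample grows so that the level stays at 5%.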
Use Case 2: Free Dinner Incentive
A large company decided to provide free dinner to all those employees who had to stay
back late due to work beyond 8:30 pm. They had consistently seen that about 30% of
employees stayed back beyond this time, and it would be a nice gesture of thanks. To
engage a caterer, it was important for the company to commit and pay upfront for the
volume of food required, which was of course directly dependent on the proportion “p”
who would stay back beyond 8:30 pm. It was important for the company to monitor this
proportion. While the proportion was believed to be stable, if at all it changed significantly,
it would result in either over- or under-supply of required food. In particular, any
proportion below 0.25 would be strictly undesirable, as it would lead to a great loss of
food as well as money. The company decided to take a sample of 300 employees to test
this. What should be the null and alternative hypotheses, and what is the testing rule based on
𝑝̂ at the 1% level?

If the company wanted the power of the test (at p=0.25) to be 90%, what should be the
sample size?
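A rough sample-size calculation for this question is sketched below. It assumes the 2-sided formulation H0: 𝑝 = 0.3 vs. H1: 𝑝 ≠ 0.3 at the 1% level, the normal approximation, and that the chance of rejecting in the upper tail when p = 0.25 is negligible; the symbols $p_0 = 0.3$, $p_1 = 0.25$, $z_{0.995} \approx 2.576$ and $z_{0.90} \approx 1.282$ are notation introduced here.

```latex
\[
n \;\ge\; \left(\frac{z_{0.995}\sqrt{p_0(1-p_0)} + z_{0.90}\sqrt{p_1(1-p_1)}}{p_0 - p_1}\right)^{2}
  = \left(\frac{2.576 \times 0.4583 + 1.282 \times 0.4330}{0.05}\right)^{2}
  \approx 1205
\]
```

Under these assumptions the weekly sample would need to be about four times the planned 300 employees. A lower-tail formulation would replace $z_{0.995}$ with $z_{0.99}$ and give a somewhat smaller n.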

Concepts of Hypothesis Testing
• Null and Alternative Hypothesis
• Type I Error and Significance Level
• Type II Error and Power

• A test can be carried out by comparing the test statistic computed from the data with a critical value
OR
alternatively by using the P-value
Hypothesis Testing for Mean
2-sided test at 5% significance level
So, to test H0: 𝜇 = 52 vs. H1: 𝜇 ≠ 52

the testing rule at 5% level of significance is as follows

Test statistic = t-statistic = $\frac{\bar{x} - 52}{s/\sqrt{n}}$, where 𝑥̅ is the sample mean, s the sample standard deviation and n the sample size

Reject H0 if |t-statistic| > 97.5th percentile of the t-distribution with (n-1) d.o.f

Heuristic:
Reject H0 if the sample mean turns out to be too far from the hypothesized value
Right-sided test at 5% significance level
So, to test H0: 𝜇 ≤ 52 vs. H1: 𝜇 > 52

the testing rule at 5% level of significance is as follows

Test statistic = t-statistic = $\frac{\bar{x} - 52}{s/\sqrt{n}}$

Reject H0 if t-statistic > 95th percentile of the t-distribution with (n-1) d.o.f

Heuristic:
Reject H0 if the sample mean turns out to be too far to the right of the
hypothesized value
Left-sided test at 5% significance level
So, to test H0: 𝜇 ≥ 52 vs. H1: 𝜇 < 52

the testing rule at 5% level of significance is as follows

Test statistic = t-statistic = $\frac{\bar{x} - 52}{s/\sqrt{n}}$

Reject H0 if t-statistic < 5th percentile of the t-distribution with (n-1) d.o.f

Heuristic:
Reject H0 if the sample mean turns out to be too far to the left of the
hypothesized value
Use Case 3
The manager of a restaurant has introduced an improved healthy recipe for her
signature dish “Veg-e’-khaas”. She believes that the customers will like the new
style better. She invites 30 randomly chosen customers for a special tasting event.
She serves each of them a small portion of the dish based on the old recipe and a
small portion of the dish based on the new recipe. Then the customers are asked to
rate the difference on a scale of -5 to +5, with 0 being “no difference”, -5 being
“strongly favour old dish” and +5 being “strongly favour new dish”. Can we
conclude that the new dish is preferred, based on a 5% level test?
customer    1    2    3    4    5    6    7    8    9   10
rating      5    0    0   -2   -3    3    1    4    2   -3

customer   11   12   13   14   15   16   17   18   19   20
rating      5   -2   -1    0    1   -3    2    0    2   -1

customer   21   22   23   24   25   26   27   28   29   30
rating     -1    4    4    3    1    0    4   -3    1    2

Sample mean = 0.8333
Sample SD = 2.4786
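The summary statistics and the t-statistic for Use Case 3 can be checked with the Python standard library. This sketch assumes the right-sided formulation H0: 𝜇 ≤ 0 vs. H1: 𝜇 > 0 for the mean rating 𝜇; the critical value 1.699 (95th percentile of the t-distribution with 29 d.o.f.) is taken from a standard t-table rather than computed, because the standard library has no t-distribution.

```python
from math import sqrt
from statistics import mean, stdev

# Ratings of the 30 customers, copied from the table in the case
ratings = [5, 0, 0, -2, -3, 3, 1, 4, 2, -3,
           5, -2, -1, 0, 1, -3, 2, 0, 2, -1,
           -1, 4, 4, 3, 1, 0, 4, -3, 1, 2]

n = len(ratings)
x_bar = mean(ratings)                    # sample mean
s = stdev(ratings)                       # sample standard deviation (n - 1 in the denominator)
t_stat = (x_bar - 0) / (s / sqrt(n))     # hypothesized value is 0, i.e. "no difference"

T_CRIT = 1.699                           # assumed t-table value: 95th percentile, 29 d.o.f.
reject = t_stat > T_CRIT
print(f"mean = {x_bar:.4f}, SD = {s:.4f}, t = {t_stat:.3f}")
print("Reject H0" if reject else "Do not reject H0")
```

Under these assumptions the t-statistic exceeds the table value, so the 5% level test supports the claim that the new dish is preferred, though not by a wide margin.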
Use Case 4
Suppose Ebay is interested in demonstrating, for marketing purposes, that people pay
less on Ebay on average for a particular product than on regular online
purchases. It is found that Amazon charged a price of USD 46.99 for this product. A
sample of 52 Ebay auction prices during the same period for the same product
were recorded. Their average was USD 44.17 with a standard deviation of USD 4.15.
a) Formulate the Hypothesis to test
b) What is the 1% critical value for the t-statistic ?
c) What is the p-value? Is there enough evidence for Ebay’s claim at 1% level of
significance?
d) Suppose we had wanted to do a two-sided test, what would have been the test
rule for alpha= .01?
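A possible outline of the calculation, assuming the left-sided formulation H0: 𝜇 ≥ 46.99 vs. H1: 𝜇 < 46.99 for the mean Ebay price 𝜇 (the symbol $\mu_0 = 46.99$ is notation introduced here):

```latex
\[
t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}
  = \frac{44.17 - 46.99}{4.15/\sqrt{52}}
  \approx \frac{-2.82}{0.5755}
  \approx -4.90
\]
```

With 51 d.o.f., a standard t-table gives a 1% lower-tail critical value of roughly −2.40 (an approximate table value, worth verifying). Since −4.90 lies far below it, the P-value is well under 0.01 and H0 would be rejected under this formulation. For a two-sided test at alpha = .01 the rule would instead be to reject H0 if |t| exceeds the 99.5th percentile of the same t-distribution, roughly 2.68.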
Further reference (not in syllabus)

• 2-sample test: difference in means, difference in proportions

• Chi-square test: test of independence


• Multi-sample (ANOVA)

Each involves a test statistic and a P-value, and can be implemented by referring to a textbook for the formulas.

Rule: If P-value< alpha, reject H0.
