IE241 Hypothesis Testing

This document provides an introduction to hypothesis testing. It defines statistical hypotheses as assumptions about probability distributions. Hypothesis testing involves setting up a null hypothesis (Ho) and an alternative hypothesis (Ha) and using a procedure to decide whether to reject the null hypothesis based on sample data. There are two types of possible errors: Type 1 errors, where Ho is incorrectly rejected, and Type 2 errors, where Ho is incorrectly not rejected. The document outlines how to set significance levels (α) to control the probability of Type 1 errors and choose tests that minimize the probability of Type 2 errors (β). It provides an example of testing whether two brands of light bulbs have different lifetimes on average. The document also discusses simple versus composite hypotheses and the power of a test.


IE241: Introduction to

Hypothesis Testing

We said before that estimation of parameters was one of the two major areas of statistics. Now let's turn to the second major area of statistics, hypothesis testing.
What is a statistical hypothesis? A statistical hypothesis is an assumption about f(X) if X is continuous or p(X) if X is discrete.
A test of a statistical hypothesis is a procedure for deciding whether or not to reject the hypothesis.

Let's look at an example.


A buyer of light bulbs bought 50 bulbs of
each of two brands. When he tested
them, Brand A had an average life of 1208
hours with a standard deviation of 94
hours. Brand B had a mean life of 1282
hours with a standard deviation of 80
hours. Are brands A and B really different
in quality?

We set up two hypotheses.
The first, called the null hypothesis Ho, is the hypothesis of no difference.
Ho: μA = μB
The second, called the alternative hypothesis Ha, is the hypothesis that there is a difference.
Ha: μA ≠ μB

On the basis of the sample of 50 from each of the two populations of light bulbs, we shall either reject or not reject the hypothesis of no difference.
In statistics, we always test the null hypothesis. The alternative hypothesis is the default winner if the null hypothesis is rejected.

We never really accept the null hypothesis; we simply fail to reject it on the basis of the evidence in hand.
Now we need a procedure to test the null hypothesis. A test of a statistical hypothesis is a procedure for deciding whether or not to reject the null hypothesis.
There are two possible decisions, reject or not reject. This means there are also two kinds of error we could make.

The two types of error are shown in the table below.

                                True state
Decision            Ho true             Ho false
Reject Ho           Type 1 error        Correct decision
Do not reject Ho    Correct decision    Type 2 error
If we reject Ho when Ho is in fact true, then we make a Type 1 error. The probability of a Type 1 error is α.
If we do not reject Ho when Ho is really false, then we make a Type 2 error. The probability of a Type 2 error is β.

Now we need a decision rule that will make the probability of the two types of error very small. The problem is that the rule cannot make both of them small simultaneously.
Because in science we have to take the conservative route and never claim that we have found a new result unless we are really convinced that it is true, we choose a very small α, the probability of Type 1 error.

Then among all possible decision rules given α, we choose the one that makes β as small as possible.
The decision rule consists of a test statistic and a critical region where the test statistic may fall. For means from a normal population, the test statistic is

t = (X̄A − X̄B) / s_diff

where the denominator

s_diff = √(s²A/nA + s²B/nB)

is the standard deviation of the difference between two independent means.
The critical region is a tail of the distribution of the test statistic. If the test statistic falls in the critical region, Ho is rejected.
Now, how much of the tail should be in the critical region? That depends on just how small you want α to be. The usual choice is α = .05, but in some very critical cases, α is set at .01. Here we have just a non-critical choice of light bulbs, so we'll choose α = .05. This means that the critical region has probability .025 in each tail of the t distribution.
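The link between α = .05 and the two-tailed cutoff of 1.96 can be checked with the standard normal CDF (a sketch using only the standard library; for samples larger than 30 the t distribution is close to normal):

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF, built from the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Two-tailed test: .025 in each tail, cutoff at z = 1.96
alpha = 2.0 * (1.0 - phi(1.96))
print(round(alpha, 3))  # → 0.05
```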

For a t distribution with .025 in each tail, the critical value is t = 1.96, the same as z because the sample size is greater than 30. The critical region then is |t| > 1.96.
In our light bulb example, the test statistic is

t = (1282 − 1208) / √(94²/50 + 80²/50) = 74 / 17.5 = 4.23

Now 4.23 is much greater than 1.96, so we reject the null hypothesis of no difference and declare that the average life of the B bulbs is longer than that of the A bulbs. Because α = .05, we have 95% confidence in the decision we made.
We cannot say that there is a 95% probability that we are right, because we are either right or wrong and we don't know which.
But there is such a small probability that t will land in the critical region if Ho is true that if it does get there, we choose to believe that Ho is not true.
If we had chosen α = .01, the critical value of t would be 2.58, and because 4.23 is greater than 2.58, we would still reject Ho. This time it would be with 99% confidence.
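The light bulb computation can be reproduced directly from the sample statistics (a sketch, not part of the original slides; carrying full precision gives 4.24 rather than the 4.23 obtained by rounding s_diff to 17.5 first):

```python
from math import sqrt

# Sample statistics from the light bulb example
mean_a, sd_a, n_a = 1208, 94, 50
mean_b, sd_b, n_b = 1282, 80, 50

# Standard deviation of the difference between two independent means
s_diff = sqrt(sd_a**2 / n_a + sd_b**2 / n_b)

# Test statistic
t = (mean_b - mean_a) / s_diff

print(round(s_diff, 1), round(t, 2))  # → 17.5 4.24
# |t| > 1.96, so Ho is rejected at alpha = .05
```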

How do we know that the test we used is the best test possible?
We have controlled the probability of Type 1 error. But what is the probability of Type 2 error in this test? Does this test minimize it subject to the chosen value of α?

To answer this question, we need to consider the concept of test power. The power of a statistical test is the probability of rejecting Ho when Ho is really false. Thus power = 1 − β.
Clearly, if the test maximizes power, it minimizes the probability of Type 2 error, β. If a test maximizes power for a given α, it is called an admissible testing strategy.

Before going further, we need to distinguish between two types of hypotheses.
A simple hypothesis is one where the value of the parameter under Ho is a specified constant and the value of the parameter under Ha is a different specified constant.
For example, if you test
Ho: θ = 0 vs Ha: θ = 10
then you have a simple hypothesis test. Here you have a particular value for Ho and a different particular value for Ha.

For testing one simple hypothesis Ha against the simple hypothesis Ho, a ground-breaking result called the Neyman-Pearson lemma provides the most powerful test. The statistic

λ = L(θa) / L(θ0)

is a likelihood ratio with the Ha parameter MLE in the numerator and the Ho parameter MLE in the denominator. Clearly, any value of λ > 1 would favor the alternative hypothesis, while values less than 1 would favor the null hypothesis.

Consider the following example of a test of two simple hypotheses.
A coin is either fair or has P(H) = 2/3. Under Ho, P(H) = 1/2 and under Ha, P(H) = 2/3.
The coin will be tossed 3 times and a decision will be made between the two hypotheses. Thus X = number of heads = 0, 1, 2, or 3. Now let's look at how the decision will be made.

First, let's look at the probability of Type 1 error, α. In the table below, under Ho P(H) = 1/2 and under Ha P(H) = 2/3.

X    P(X|Ho)    P(X|Ha)
0    1/8        1/27
1    3/8        6/27
2    3/8        12/27
3    1/8        8/27
Now what should the critical region be?
Under Ho, if X = 0 is the critical region, α = 1/8. Under Ho, if X = 3 is the critical region, α = 1/8. So if either of these two values is chosen as the critical region, the probability of Type 1 error is the same.
Now what if Ha is true? If X = 0 is chosen as the critical region, the value of β = 26/27, because that is the probability that X ≠ 0. On the other hand, if X = 3 is chosen as the critical region, the value of β = 19/27, because that is the probability that X ≠ 3.
Clearly, the better choice for the critical region is X = 3, because that is the region that minimizes β for fixed α. So this critical region provides the more powerful test.
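The α and β values for the two candidate regions can be verified exactly with rational arithmetic (a sketch of the bookkeeping, not part of the original slides):

```python
from fractions import Fraction
from math import comb

def pmf(n, p, x):
    """Binomial probability of x heads in n tosses with P(H) = p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n = 3
p0 = Fraction(1, 2)   # P(H) under Ho
pa = Fraction(2, 3)   # P(H) under Ha

# Candidate critical regions: reject Ho when X = 0, or when X = 3
alpha_x0 = pmf(n, p0, 0)       # both give the same Type 1 error, 1/8
alpha_x3 = pmf(n, p0, 3)

beta_x0 = 1 - pmf(n, pa, 0)    # probability X lands outside the region under Ha
beta_x3 = 1 - pmf(n, pa, 3)    # smaller, so X = 3 is the better region

print(alpha_x0, alpha_x3, beta_x0, beta_x3)  # → 1/8 1/8 26/27 19/27
```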

In discrete variable problems like this, it may not be possible to choose a critical region of the desired α. In this illustration, you simply cannot find a critical region where α = .05 or .01.
This is seldom a problem in real-life experimentation, because n is usually sufficiently large that there is a wide variety of choices for critical regions.

This problem, used to illustrate the general method for selecting the best test, was easy to discuss because there was only a single alternative to Ho.
Most problems involve more than a single alternative. Such hypotheses are called composite hypotheses.

Examples of composite hypotheses:
Ho: θ = θ0 vs Ha: θ ≠ θ0
which is a two-sided Ha.
A one-sided Ha can be written as
Ho: θ = θ0 vs Ha: θ > θ0
or
Ho: θ = θ0 vs Ha: θ < θ0
All of these hypotheses are composite because they include more than one value for Ha. And unfortunately, the size of β here depends on the particular alternative value of θ being considered.

In the composite case, it is necessary to compare Type 2 errors for all possible alternative values under Ha. So now the size of the Type 2 error is a function of the alternative parameter value θ.
So β(θ) is the probability that the sample point will fall in the noncritical region when θ is the true value of the parameter.

Because it is more convenient to work with the critical region, the power function 1 − β(θ) is usually used.
The power function is the probability that the sample point will fall in the critical region when θ is the true value of the parameter.
As an illustration of these points, consider the following continuous example.

Let X = the time that elapses between two successive trippings of a Geiger counter in studying cosmic radiation. It is assumed that the density function is

f(x; θ) = θe^(−θx)

where θ is a parameter which depends on experimental conditions.
Under Ho, θ = 2. Now a physicist believes that θ < 2. So under Ha, θ < 2.

Now one choice for the critical region is X ≥ 1, for which

α = ∫₁^∞ 2e^(−2x) dx = e^(−2) ≈ .135

Another choice is the left tail, X ≤ .07, for which α ≈ .135 as well. That is,

α = ∫₀^·⁰⁷ 2e^(−2x) dx ≈ .135
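Both tail probabilities follow from the closed-form integral of the exponential density, so they can be checked without numerical integration (a sketch; the .07 cutoff is itself rounded, which is why the left-tail value comes out slightly under .135):

```python
from math import exp

# f(x; theta) = theta * e^(-theta x); under Ho, theta = 2.

# Right-tail region X >= 1:
#   alpha = integral from 1 to infinity of 2 e^(-2x) dx = e^(-2)
alpha_right = exp(-2)

# Left-tail region X <= .07:
#   alpha = integral from 0 to .07 of 2 e^(-2x) dx = 1 - e^(-.14)
alpha_left = 1 - exp(-0.14)

print(round(alpha_right, 3), round(alpha_left, 3))  # → 0.135 0.131
```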

Now let's examine the power functions for the two competing critical regions.
For the critical region X > 1,

1 − β(θ) = ∫₁^∞ θe^(−θx) dx = e^(−θ)

and for the critical region X < .07,

1 − β(θ) = ∫₀^·⁰⁷ θe^(−θx) dx = 1 − e^(−.07θ)

The graphs of these two functions are called the power curves for the two critical regions.
Note that the power curve for the X > 1 region is always higher than the power curve for the X < .07 region before they cross at θ = 2. Since the alternative values in this problem are all θ < 2, clearly the region X > 1 is superior.
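The comparison of the two power curves can be checked numerically over the alternative range θ < 2 (a sketch evaluating the two closed-form power functions derived above):

```python
from math import exp

def power_right(theta):
    """Power of the region X > 1: integral from 1 to infinity of theta e^(-theta x) dx."""
    return exp(-theta)

def power_left(theta):
    """Power of the region X < .07: integral from 0 to .07 of theta e^(-theta x) dx."""
    return 1 - exp(-0.07 * theta)

# All alternatives in this problem have theta < 2;
# the right-tail region dominates everywhere in that range.
for theta in (0.5, 1.0, 1.5):
    assert power_right(theta) > power_left(theta)

print(round(power_right(1.0), 3), round(power_left(1.0), 3))  # → 0.368 0.068
```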
