Stat 154-Statistical Methods II (Updated)

Uploaded by

benefourb
STATISTICAL METHODS II
STAT 154

VINCENT K. DEDU
DEPARTMENT OF STATISTICS AND ACTUARIAL SCIENCE

SPECIAL PROBABILITY DISTRIBUTIONS
THE BINOMIAL DISTRIBUTION
For a situation to be described using a binomial model:
a) A finite number, n, of trials is carried out
b) The trials are independent
c) The outcome of each trial is described as either a success or a failure
d) The probability p of a successful outcome is the same for each trial
The discrete random variable X is the number of successful outcomes in n trials.
If the above conditions are satisfied, X is said to follow a binomial distribution.
This is written as

X ~ B(n, p) or X ~ Bin(n, p)
Note:
The number of trials n and the probability of success p are both needed to describe the distribution completely. They are known as the parameters of the binomial distribution.
Let P(failure) = q; then q = 1 - p.
If X ~ B(n, p), the probability of obtaining r successes in n trials is P(X = r), where
\[ P(X = r) = {}^{n}C_{r}\, p^{r} q^{n-r} \quad \text{for } r = 0, 1, 2, \ldots, n \]
and
\[ {}^{n}C_{r} = \binom{n}{r} = \frac{n!}{(n-r)!\, r!} \]
EXAMPLE
At A-life Supermarket, 60% of customers pay by credit card. Find the probability that in a randomly selected sample of ten customers

a) exactly two pay by credit card

b) more than 7 pay by credit card
SOLUTION
X is the number of customers in a sample of ten who pay by credit card. Consider paying by credit card as a success.
A binomial model can be used with n = 10 and p = 0.6, so

X ~ B(10, 0.6)
\[ P(X = r) = {}^{n}C_{r}\, p^{r} q^{n-r} \]
a) \[ P(X = 2) = {}^{10}C_{2}\, p^{2} q^{8} = 45 \times 0.6^{2} \times 0.4^{8} = 0.011 \]
b) \[ P(X > 7) = P(X = 8) + P(X = 9) + P(X = 10) = 0.17 \]
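The two answers above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `binomial_pmf` is an illustrative choice, not part of the course material.

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p): nCr * p^r * q^(n-r) with q = 1 - p."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 10, 0.6  # ten customers, each pays by credit card with probability 0.6

# a) exactly two pay by credit card
p_exactly_two = binomial_pmf(2, n, p)

# b) more than seven pay by credit card: P(X=8) + P(X=9) + P(X=10)
p_more_than_seven = sum(binomial_pmf(r, n, p) for r in range(8, n + 1))

print(round(p_exactly_two, 3), round(p_more_than_seven, 2))
```

Summing the probability function over the required values of r is the same step as in the hand calculation for part b).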
Example:
A random variable X is B(7,0.2).
Find to 3dp
a) P( X 3)

b) P (1  X 4)

c) P( X  1)
a) $P(X = 3) = {}^{7}C_{3} \times 0.2^{3} \times 0.8^{4} = 0.115$ (3 dp)
EXPECTATION AND VARIANCE OF
BINOMIAL DISTRIBUTION
If the random variable X is such that X ~ B(n, p),

then E(X) = np and Var(X) = np(1 - p) = npq.
Example:

The random variable X ~ B(4, 0.8).

a) Construct the probability distribution for X and find the expectation and variance.

b) Verify that E(X) = np and Var(X) = npq.

The random variable X ~ B(4, 0.8).
Find:
i. The expectation
ii. The variance

SOLUTION
i. E(X) = np = 4(0.8) = 3.2
ii. Var(X) = np(1 - p) = npq = 4(0.8)(0.2) = 0.64
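Part a) of this example asks for the full probability distribution, which then lets E(X) and Var(X) be computed from first principles and compared with np and npq. A minimal sketch, assuming only the standard library (the variable names are illustrative):

```python
from math import comb

n, p = 4, 0.8
q = 1 - p

# Probability distribution of X ~ B(4, 0.8): P(X = r) for r = 0, ..., 4
distribution = {r: comb(n, r) * p**r * q**(n - r) for r in range(n + 1)}

# Expectation and variance from the definition
mean = sum(r * prob for r, prob in distribution.items())
variance = sum(r**2 * prob for r, prob in distribution.items()) - mean**2

for r, prob in distribution.items():
    print(r, round(prob, 4))
print(round(mean, 4), round(variance, 4))  # compare with np and npq
```

The values computed from the table agree with the shortcut formulas np = 3.2 and npq = 0.64.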
Five independent trials of an experiment are carried out. The probability of a successful outcome is p and the probability of failure is 1 - p = q.
Write out the probability distribution of X, where X is the number of successful outcomes in five trials. Comment on your answer.
X ~ B(5, p), X = 0, 1, 2, 3, 4, 5

\[
\begin{aligned}
P(X = 0) &= {}^{5}C_{0}\, q^{5} p^{0} = q^{5} \\
P(X = 1) &= {}^{5}C_{1}\, q^{4} p^{1} = 5q^{4}p \\
P(X = 2) &= {}^{5}C_{2}\, q^{3} p^{2} = 10q^{3}p^{2} \\
P(X = 3) &= {}^{5}C_{3}\, q^{2} p^{3} = 10q^{2}p^{3} \\
P(X = 4) &= {}^{5}C_{4}\, q^{1} p^{4} = 5qp^{4} \\
P(X = 5) &= {}^{5}C_{5}\, q^{0} p^{5} = p^{5}
\end{aligned}
\]
\[ q^{5} + 5q^{4}p + 10q^{3}p^{2} + 10q^{2}p^{3} + 5qp^{4} + p^{5} = (q + p)^{5} \]
\[ (q + p)^{5} = 1 \quad \text{since } q + p = 1 \]

This confirms that the probabilities sum to 1.
THE POISSON DISTRIBUTION
Consider these random variables:
- The number of emergency calls received by an ambulance control in an hour
- The number of vehicles approaching a motorway toll bridge in a five-minute interval
- The number of flaws in a meter length of material
- The number of white corpuscles on a slide
- The number of calls in a radio phone-in session
Assuming that each of the above occurs randomly, they are all examples of variables that can be modeled using a Poisson distribution.

CONDITIONS FOR A POISSON MODEL

- Events occur singly and at random in a given interval of time or space
- λ, the mean number of occurrences in the given interval, is known and is finite
The variable X is the number of occurrences in the given interval.
If the above conditions are satisfied, X is said to follow a Poisson distribution, written

X ~ Po(λ), where

\[ P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!} \quad \text{for } x = 0, 1, 2, \ldots \]
EXAMPLE
A student finds that the average number of amoebas in 10 ml of pond water from a particular pond is four. Assuming that the number of amoebas follows a Poisson distribution, find the probability that in a 10 ml sample
a) There are exactly five amoebas
b) There are no amoebas
c) There are fewer than three amoebas
SOLUTION

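X follows a Poisson distribution with mean 4:

X ~ Po(4)

\[ P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}, \quad \text{where } \lambda = 4 \]
a) \[ P(X = 5) = \frac{e^{-4}\, 4^{5}}{5!} = 0.156 \]
b) \[ P(X = 0) = \frac{e^{-4}\, 4^{0}}{0!} = 0.0183 \]
c) \[
\begin{aligned}
P(X < 3) &= P(X = 0) + P(X = 1) + P(X = 2) \\
&= \frac{e^{-4}\, 4^{0}}{0!} + \frac{e^{-4}\, 4^{1}}{1!} + \frac{e^{-4}\, 4^{2}}{2!} \\
&= e^{-4}(1 + 4 + 8) \\
&= 13e^{-4} \\
&= 0.238
\end{aligned}
\]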
UNIT INTERVAL

Care must be taken to specify the interval being considered.
In the above example the mean number of amoebas in 10 ml of pond water from a particular pond is 4, so the number in 10 ml is distributed Po(4).
Now suppose you want to find a probability relating to the number of amoebas in 5 ml of water from the same pond. The mean number of amoebas in 5 ml is 2, so the number in 5 ml is distributed Po(2).
Similarly, the number of amoebas in 1 ml of pond water is distributed Po(0.4).

10 ml → mean 4
5 ml → mean 2
1 ml → mean 0.4
EXAMPLE
On average the school photocopier breaks down eight times during the school week (Monday to Friday). Assuming that the number of breakdowns can be modeled by a Poisson distribution, find the probability that it breaks down
a) five times in a given week
b) once on Monday
c) eight times in a fortnight
SOLUTION
a) X is the number of breakdowns in a week, where X ~ Po(8)
\[ P(X = 5) = \frac{e^{-8}\, 8^{5}}{5!} = 0.0916 \]
b) Let Y be the number of breakdowns in a day. The mean number of breakdowns in a day is 8/5 = 1.6, so Y ~ Po(1.6)
\[ P(Y = 1) = 1.6\, e^{-1.6} = 0.323 \]
c) Let F be the number of breakdowns in a fortnight. The mean number of breakdowns in a fortnight is 2 × 8 = 16, so F ~ Po(16)
\[ P(F = 8) = \frac{e^{-16}\, 16^{8}}{8!} = 0.0120 \quad (3\ \text{sf}) \]
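The key step in this solution is rescaling the mean to match the interval in question. A minimal sketch of that idea, assuming a five-day school week; the helper names `poisson_pmf` and `mean_for` are illustrative only.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return exp(-lam) * lam**x / factorial(x)

MEAN_PER_WEEK = 8    # breakdowns per school week
DAYS_PER_WEEK = 5    # Monday to Friday

def mean_for(days):
    """Scale the weekly mean to an interval of `days` school days."""
    return MEAN_PER_WEEK * days / DAYS_PER_WEEK

p_week = poisson_pmf(5, mean_for(5))         # a) five times in a week, Po(8)
p_monday = poisson_pmf(1, mean_for(1))       # b) once on Monday, Po(1.6)
p_fortnight = poisson_pmf(8, mean_for(10))   # c) eight times in a fortnight, Po(16)

print(round(p_week, 4), round(p_monday, 3), round(p_fortnight, 4))
```

Only the parameter changes between the three parts; the probability function itself stays the same.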
MEAN AND VARIANCE OF THE POISSON DISTRIBUTION

λ is the only parameter of the Poisson distribution.

If X ~ Po(λ), then
E(X) = λ and Var(X) = λ
Examples
If X follows a Poisson distribution with standard deviation 1.5, find P(X ≥ 3).
Solution
If X ~ Po(λ), then Var(X) = λ.
Var(X) = SD² = 1.5² = 2.25,
so λ = 2.25 and X ~ Po(2.25).
\[
\begin{aligned}
P(X \ge 3) &= 1 - P(X < 3) \\
&= 1 - [P(X = 0) + P(X = 1) + P(X = 2)] \\
&= 1 - e^{-2.25}\left(1 + 2.25 + \frac{2.25^{2}}{2!}\right) \\
&= 1 - 0.6093 \\
&= 0.391
\end{aligned}
\]
USING THE POISSON DISTRIBUTION AS AN
APPROXIMATION TO THE BINOMIAL
DISTRIBUTION
When n is large (n > 50) and p is small (p < 0.1), the binomial distribution
X ~ B(n, p)
can be approximated using a Poisson distribution with the same mean, i.e.

X ~ Po(np)
The approximation gets better as n gets larger and p gets smaller, i.e. as n → ∞ and p → 0.
EXAMPLE
Eggs are packed into boxes of 500. On
average 0.7% of the eggs are found to be
broken when the eggs are unpacked.
Find correct to 2 significant figures, the
probability that in a box of 500 eggs;
a) Exactly three are broken
b) At least two are broken
SOLUTION
Let X be the number of broken eggs in a box of 500, so X ~ B(500, 0.007) and
E(X) = np = 500 × 0.007 = 3.5
Since n > 50 and p < 0.1, we can use a Poisson approximation:
X ~ Po(3.5)
a) \[ P(X = 3) = \frac{e^{-3.5}\, 3.5^{3}}{3!} = 0.22 \]
b) \[
\begin{aligned}
P(X \ge 2) &= 1 - P(X < 2) \\
&= 1 - [P(X = 0) + P(X = 1)] \\
&= 1 - e^{-3.5}(1 + 3.5) \\
&= 0.86
\end{aligned}
\]
ASSIGNMENT 2
The random variable X is
B(100,0.03). Find the following
probabilities
i)P(x=0) ii)P(x=2) iii)P(x=4) ,
using
a) The Binomial distribution
b) A Poisson approximation
c) Comment on your results in a and b
NORMAL DISTRIBUTION
Carl Friedrich Gauss (1777-1855)
The normal distribution is one of the most important distributions in statistics. Many quantities in life follow a normal distribution.
E.g.
• Ages of students in a class
• Marks obtained by students in an exam
• Heights of Social Sc. I students
A normally distributed random variable X is continuous. Its pdf f(x) depends on two parameters, the mean (μ) and the standard deviation (σ).
To describe the distribution we write:

X ~ N(μ, σ²)
The normal distribution curve has the following features:
• It is bell-shaped
• It is symmetric about the mean
• It extends from -∞ to +∞
• The total area under the curve is 1

\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \mu)^{2}}{2\sigma^{2}}}, \quad -\infty < x < \infty \]
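[Figure: normal curve for X ~ N(0, 1), with μ = 0 and σ = 1; horizontal axis from -3 to 3]
[Figure: normal curve for X ~ N(4, 1/4), with μ = 4 and σ = 1/2; horizontal axis from 2 to 6]
[Figure: normal curve for X ~ N(50, 4), with μ = 50 and σ = 2; horizontal axis from 48 to 52]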
FINDING THE PROBABILITIES
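Consider a continuous random variable X with pdf f(x).
The probability that X lies between a and b is
\[ P(a < X < b) = \int_{a}^{b} f(x)\, dx \]
which is the area under the curve between a and b.
[Figure: normal curve with the area between a and b shaded]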
THE STANDARD NORMAL VARIABLE Z
In order to use the same set of tables for all possible values of μ and σ², the variable X is standardized so that the mean is 0 and the standard deviation is 1. The standardized normal variable is called Z, and Z ~ N(0, 1).
To illustrate how the variable X is standardized to the variable Z, consider X distributed normally with mean 50 and variance 4, i.e.
X ~ N(50, 4), μ = 50, σ² = 4 or σ = 2

Use:
\[ Z = \frac{X - \mu}{\sigma}, \quad \text{where } Z \sim N(0, 1) \]
E.g. find P(X < 56) if X ~ N(50, 4).
We standardize as follows:
\[ P(X < 56) = P\left(\frac{X - \mu}{\sigma} < \frac{56 - 50}{2}\right) = P(Z < 3) \]
USING STANDARD NORMAL TABLES
P(Z < z) = Φ(z), the area under the curve up to z.
[Figure: standard normal curve with the area Φ(z) shaded to the left of z]
E.g.
Find a) P(Z < 0.16)
P(Z < 0.16) = Φ(0.16) = 0.5636
b) P(Z < 0.34) = 0.6331
c) P(Z < 0.43) = 0.6664
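Question:
Find a) P(Z < 0.85) b) P(Z > 0.85)
Solution:
a) [Figure: standard normal curve with the area to the left of 0.85 shaded, Φ(0.85)]
P(Z < 0.85) = 0.8023
b) [Figure: standard normal curve with the area to the right of 0.85 shaded]
P(Z > 0.85) = 1 - Φ(0.85)
= 1 - 0.8023
= 0.1977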
NEGATIVE VALUES
P(Z < -a) = Φ(-a) = 1 - Φ(a)
P(Z > -a) = P(Z < a) = Φ(a)
[Figure: standard normal curve marked at -a, 0 and a, showing the symmetry of the two tails]
Example
Z ~ N(0, 1)
Use the standard normal table to find:
a) P(Z < 1.37)
b) P(Z > -1.37)
c) P(Z > 1.37)
d) P(Z < -1.37)

a) P(Z < 1.37) = Φ(1.37)
= 0.9147
= 0.91 (2 s.f.)

b) P(Z > -1.37) = P(Z < 1.37) = Φ(1.37)
= 0.9147
= 0.91 (2 s.f.)

c) P(Z > 1.37) = 1 - P(Z < 1.37)
= 1 - Φ(1.37)
= 1 - 0.9147
= 0.0853

d) P(Z < -1.37) = P(Z > 1.37)
= 1 - Φ(1.37)
= 1 - 0.9147
= 0.0853
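For P(a < Z < b):
[Figure: standard normal curve with the area between a and b shaded]

P(a < Z < b) = Φ(b) - Φ(a)
Find P(0.345 < Z < 1.751)
= P(Z < 1.751) - P(Z < 0.345)
= Φ(1.751) - Φ(0.345)
= 0.9599 - 0.6368 (table values, with z rounded to two decimal places)
= 0.3231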
Show that;

Example: μ = 150, σ = 10

X ~ N(150, 10²)
Find P(X < 165).
\[ Z = \frac{X - \mu}{\sigma} = \frac{165 - 150}{10} = 1.5 \]
P(X < 165) = P(Z < 1.5)
= Φ(1.5)
= 0.9332
P(Z < 0.58) = 0.7190
Example:
P(Z < a) = 0.7190
Find a.
Reverse:
If Z ~ N(0, 1), find the value of a for which
a) P(Z < a) = 0.9693
Φ(a) = 0.9693
a = Φ⁻¹(0.9693)
= 1.87
Find the values of the ff.
EXAMPLE
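• The time taken by a milkman to deliver milk to the High Street is normally distributed with mean 12 minutes and standard deviation 2 minutes. He delivers milk every day. Find the probability that he takes
I. Longer than 17 minutes
II. Less than 10 minutes
III. Between 9 and 13 minutes
SOLUTION
• X ~ N(12, 2²), where Z = (X - 12)/2

I. P(X > 17) = P(Z > 2.5) = 1 - Φ(2.5) = 1 - 0.9938 = 0.0062
SOLUTION CONT’D
II. P(X < 10) = P(Z < -1) = 1 - Φ(1) = 1 - 0.8413 = 0.1587
SOLUTION CONT’D
III. P(9 < X < 13) = P(-1.5 < Z < 0.5) = Φ(0.5) - [1 - Φ(1.5)] = 0.6915 - 0.0668 = 0.6247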
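The three parts of the milkman example can be worked directly on the X scale, without standardizing by hand. A minimal sketch with `statistics.NormalDist`, taking the mean of 12 minutes and standard deviation of 2 minutes from the question (variable names are illustrative):

```python
from statistics import NormalDist

# Delivery time in minutes: X ~ N(12, 2^2)
delivery_time = NormalDist(mu=12, sigma=2)

p_longer_than_17 = 1 - delivery_time.cdf(17)                        # I.   P(X > 17)
p_less_than_10 = delivery_time.cdf(10)                              # II.  P(X < 10)
p_between_9_and_13 = delivery_time.cdf(13) - delivery_time.cdf(9)   # III. P(9 < X < 13)

# Standardizing gives the same answer: Z = (X - 12) / 2
z_check = 1 - NormalDist().cdf((17 - 12) / 2)

print(round(p_longer_than_17, 4), round(p_less_than_10, 4),
      round(p_between_9_and_13, 4))
```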
THE CENTRAL LIMIT THEOREM
If a sample of any size is taken from a population with a normal distribution with mean μ and standard deviation σ, the distribution of the means of samples of size n will be normal with mean
\[ \mu_{\bar{x}} = \mu \]
and standard deviation
\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \]
The (1 - α)100% confidence interval for a proportion p is given by:
\[ \hat{p} \pm Z_{\alpha/2}\, Se(\hat{p}) = \hat{p} \pm Z_{\alpha/2}\sqrt{\frac{p(1 - p)}{n}} \approx \hat{p} \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \]
where $\hat{p} = x/n$ is the point estimator for p, which should not be close to 0 or 1, $n \ge 30$, and both $n\hat{p}$ and $n\hat{p}(1 - \hat{p})$ are at least 5. If these conditions are not met, the interval estimate becomes unreliable and is not recommended. The sample size needed to estimate p with a specified maximum error of estimation E and confidence coefficient (1 - α) is obtained as follows:
\[ E = Z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}, \quad \text{from which we have} \quad n = \frac{Z_{\alpha/2}^{2}\, \hat{p}(1 - \hat{p})}{E^{2}} \]

If there is no prior knowledge of p, $\hat{p}(1 - \hat{p})$ is replaced by its maximum value, 0.25, and the maximum possible error then becomes
\[ E = Z_{\alpha/2}\sqrt{\frac{0.25}{n}} \]
from which we obtain
\[ n = \frac{0.25\, Z_{\alpha/2}^{2}}{E^{2}} \]
SAMPLING DISTRIBUTIONS
[Figure: several samples drawn from one population]

The sampling distribution consists of the values of the sample means.
A sampling distribution is the
probability distribution of a sample
statistic that is formed when samples
of size n are repeatedly taken from a
population. If the sample statistic is the
sample mean, then the distribution is
the sampling distribution of sample
means.
THE CENTRAL LIMIT THEOREM
If X ~ N(μ, σ²), then
\[ \bar{X} \sim N\left(\mu, \frac{\sigma^{2}}{n}\right) \]
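Consider the population 2, 4, 6. The mean is μ = 4.

Consider the samples:

Sample X     Sample mean X̄
(2, 4)       3
(2, 6)       4
(4, 6)       5
Mean of the sample means = 12/3 = 4
\[ \text{i.e. } \mu_{\bar{x}} = \mu, \quad Var(\bar{X}) = \frac{\sigma^{2}}{n} \quad \text{or} \quad SD_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \]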
THE CENTRAL LIMIT THEOREM
Theorem (CLT): If a random sample of n observations is drawn from a population with mean μ and variance σ², then, when n is sufficiently large, the sampling distribution of X̄ will be approximately normal with mean
\[ \mu_{\bar{x}} = \mu \]
and standard error
\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \]
POINT ESTIMATES
\[ \bar{X} = \frac{\sum X}{n}, \qquad \sigma^{2} = \frac{\sum (X - \bar{X})^{2}}{N} \]
INTERVAL ESTIMATION

P(? < X < ?) = 0.95
α = 0.05 = 5%
1 - α = 0.95 = 95%
INTERVAL ESTIMATION (CONFIDENCE INTERVAL)
We wish to find L1 and L2 such that

P(L1 < X < L2) = 1 - α
α = probability that the interval will not contain X

The interval [L1, L2] is the (1 - α)100% confidence interval.
1 - α = confidence coefficient
CONFIDENCE INTERVAL
THERE ARE TWO CASES

1. When n is large (n ≥ 30)

2. When n is small (n < 30)

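CONFIDENCE INTERVAL FOR μ (THE POPULATION MEAN)
• Of a normal population
• With known variance
• n large or small
If X ~ N(μ, σ²), then
\[ \bar{X} \sim N\left(\mu, \frac{\sigma^{2}}{n}\right) \]
\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}, \quad \text{where } Z \sim N(0, 1) \]
P(a < Z < b) = 95%
[Figure: standard normal curve with 95% central area and 2.5% in each tail]

P(-1.96 < Z < 1.96) = 95%
\[ -1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96 \]
\[ -1.96\frac{\sigma}{\sqrt{n}} < \bar{X} - \mu < 1.96\frac{\sigma}{\sqrt{n}} \]
\[ \bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}} \qquad (1) \]
So a 95% confidence interval for μ is (1).
For a (1 - α)100% confidence interval:
\[ \bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]
\[ \text{i.e. } \bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]
\[ E = Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \quad \text{(maximum error)} \]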
Example:
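Suppose X̄ = 28.5, n = 32, σ = 2. Find the 90% confidence interval.
Solution:
[Figure: standard normal curve with central area 0.9 and 0.05 in each tail, cut off at -1.64 and 1.64]

Z_{0.05} = 1.64
\[ \bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]
\[ 28.5 - 1.64\frac{2}{\sqrt{32}} < \mu < 28.5 + 1.64\frac{2}{\sqrt{32}} \]
27.92 < μ < 29.08 is the C.I.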
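The interval above follows a fixed recipe: look up the critical value, compute the maximum error, then add and subtract it from the sample mean. A minimal sketch of that recipe, assuming σ is known; the function name `z_interval` is illustrative, and the critical value is taken from `statistics.NormalDist` rather than from a printed table.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence):
    """Confidence interval for the mean when sigma is known:
    xbar +/- z_(alpha/2) * sigma / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z_(alpha/2)
    error = z * sigma / sqrt(n)              # maximum error E
    return xbar - error, xbar + error

low, high = z_interval(xbar=28.5, sigma=2, n=32, confidence=0.90)
print(round(low, 2), round(high, 2))
```

The computed critical value is about 1.645 rather than the table value 1.64, which does not change the interval to two decimal places here.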


CONFIDENCE INTERVAL FOR POPULATION MEANS AND PROPORTIONS
The confidence interval for μ is constructed for two cases:
• large sample size
• small sample size.

i. When n is large (n ≥ 30) and σ² is known, the C.I. for μ is given by
\[ \bar{X} \pm Z_{\alpha/2}\, Se(\bar{x}) = \bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]
where σ is estimated by the sample standard deviation s if it is unknown. The sample size n required for estimating μ may be obtained as follows. The maximum error of estimation is
\[ E = Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]
from which we obtain the sample size n:
\[ n = \frac{Z_{\alpha/2}^{2}\, \sigma^{2}}{E^{2}} \]
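The sample-size formula above can be turned into a small calculation. The sketch below is illustrative only: the function name `sample_size_for_mean` and the input values σ = 2 and E = 0.5 are hypothetical and are not taken from the notes. The result is rounded up because n must be a whole number large enough to keep the error within E.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, max_error, confidence):
    """Smallest n with z_(alpha/2) * sigma / sqrt(n) <= max_error,
    i.e. n = z^2 * sigma^2 / E^2 rounded up."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil((z * sigma / max_error) ** 2)

# Hypothetical inputs: sigma = 2, maximum error 0.5, 95% confidence
n_required = sample_size_for_mean(sigma=2, max_error=0.5, confidence=0.95)
print(n_required)
```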
QUESTION 1

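Construct a 99% confidence interval given that X̄ = 25.4, σ = 4.3, n = 50.
SOLUTION
Because n = 50 (thus n > 30), we have
\[ \bar{X} \pm E, \quad \text{where } E = Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]
α = 1% = 0.01, so α/2 = 0.005.
Hence
\[ E = Z_{0.005}\frac{4.3}{\sqrt{50}} \]
But Z_{0.005} = 2.57, so

E = 2.57 × 0.608 = 1.56
25.4 ± 1.56

(23.84, 26.96)
is the 99% confidence interval for the population mean.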
QUESTION 2

Construct a 95% confidence interval given that X̄ = 28.5, σ = 2, n = 32.
ANSWER

(27.807, 29.193)
QUESTION 3

A sample of size n = 100 produced a sample mean of X̄ = 16. Assuming the population standard deviation is σ = 3, compute a 95% confidence interval for the population mean.
ANSWER

[15.412, 16.588]
WHEN “n” IS SMALL (n < 30)

When n is small (n < 30), the sampling distribution becomes the t-distribution and the confidence interval becomes:
\[ \bar{X} \pm t_{\alpha/2,\,(n-1)}\, Se(\bar{x}) = \bar{X} \pm t_{\alpha/2,\,(n-1)}\frac{s}{\sqrt{n}} \]
where $t_{\alpha/2,\,(n-1)}$ is the critical value obtained from the t-distribution with (n - 1) degrees of freedom and s is the sample standard deviation.
QUESTION 4
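Given n = 10
Standard deviation (s) = 11.1
Sample mean = 121.2
Construct a 95% confidence interval for the population mean.
SOLUTION
\[ \bar{X} \pm E, \quad E = t_{\alpha/2,\,(n-1)}\frac{s}{\sqrt{n}} \]
From the t-distribution table, t_{0.025, 9} = 2.262
\[ E = 2.262 \times \frac{11.1}{\sqrt{10}} = 7.94 \]
121.2 ± 7.94

(113.26, 129.14)

is the 95% confidence interval for the population mean.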
QUESTION 5
A random sample of 29 households was
selected as part of a study on electricity
usage, and the number of kilowatt-hours
(kWh) was recorded for each household in
the sample for the March quarter of 2006.
The average usage was found to be 375kWh.
It was found that the standard deviation of
the usage was 81kWh.
Assuming the standard deviation is
unchanged and that the usage is normally
distributed, provide an expression for
calculating a 99% confidence interval for the
mean usage in the March quarter of 2006.
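SOLUTION

σ = 81 (estimated from the sample), n = 29, X̄ = 375

Is n large or small?
n is small (n < 30), so the t-distribution with n - 1 = 28 degrees of freedom is used:
\[ \bar{X} \pm t_{0.005,\,28}\frac{s}{\sqrt{n}} = 375 \pm 2.763 \times \frac{81}{\sqrt{29}} = 375 \pm 41.56 \]
Answer:

(333.44, 416.56)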
Question 6.
An industrial designer wants to
determine the average amount of
time it takes an adult to assemble an
“easy to assemble” toy. A sample of
16 times yielded an average time of
19.92 minutes, with a sample
standard deviation of 5.73 minutes.
Assuming normality of assembly
times, provide a 95% confidence
interval for the mean assembly time.
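Answer

s = 5.73, n = 16, X̄ = 19.92 and t_{0.025, 15} = 2.131
\[ 19.92 \pm 2.131 \times \frac{5.73}{\sqrt{16}} = 19.92 \pm 3.05 \]
(16.87, 22.97) is the confidence interval.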
FOR PROPORTION (P)
\[ \hat{p} \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \]
CONFIDENCE INTERVALS FOR DIFFERENCE
BETWEEN TWO PARAMETERS
Determining an interval estimate for the difference between two population means (μ1 - μ2) or two population proportions (p1 - p2) is a means of comparing two population parameters. The (1 - α)100% confidence interval for (μ1 - μ2), like that for the single parameter μ, has two cases depending on the sample sizes.

1 parameter     2 parameters
μ               μ1 - μ2
p               p1 - p2
• If the sample sizes are large, we have
\[ (\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2}\, Se(\bar{x}_1 - \bar{x}_2) = (\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2}\sqrt{\frac{\sigma_1^{2}}{n_1} + \frac{\sigma_2^{2}}{n_2}} \]
where $n_1 \ge 30$, $n_2 \ge 30$, and $\sigma_1^{2}$ and $\sigma_2^{2}$ are estimated using $s_1^{2}$ and $s_2^{2}$ respectively if they are unknown.

If $n_1 = n_2 = n$, then the error of estimation will be given by
\[ E = Z_{\alpha/2}\sqrt{\frac{\sigma_1^{2}}{n} + \frac{\sigma_2^{2}}{n}} = Z_{\alpha/2}\sqrt{\frac{\sigma_1^{2} + \sigma_2^{2}}{n}} \]
from which we have
\[ n = \frac{Z_{\alpha/2}^{2}\,(\sigma_1^{2} + \sigma_2^{2})}{E^{2}} \]
• If the sample sizes are small ($n_1 < 30$, $n_2 < 30$), the two population variances are assumed to be equal and the interval is
\[ (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,(n_1 + n_2 - 2)}\, S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \]
\[ \text{and } S_p^{2} = \frac{(n_1 - 1)s_1^{2} + (n_2 - 1)s_2^{2}}{n_1 + n_2 - 2} \]
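The confidence interval for (p1 - p2) is given by
\[ (\hat{p}_1 - \hat{p}_2) \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}} \]
where $n_1$ and $n_2$ are large, and $\hat{p}_1 = x_1/n_1$ and $\hat{p}_2 = x_2/n_2$.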
EXAMPLE
a. A mid-semester examination in Statistics was given to 55 students randomly selected from class A and also to another 40 students randomly selected from class B. The mean scores obtained from both samples and the standard deviations are as shown below.

Class A    n_A = 55    x̄_A = 66    s_A = 10
Class B    n_B = 40    x̄_B = 62    s_B = 8

Construct a 95% confidence interval for the difference in the mean scores.
b. A random sample of size, was selected from a
population where and Another random
sample size, was selected from a different
population with and

If the difference between the two mean scores


is to be estimated,
i. Find an estimate for the common variance of
the populations, and
ii. Provide a 95% confidence interval for the
difference.
SOLUTION
a) The 95% confidence interval for the difference (μ_A - μ_B) between the scores, where both n_A and n_B ≥ 30, is
\[ (\bar{x}_A - \bar{x}_B) \pm Z_{\alpha/2}\sqrt{\frac{s_A^{2}}{n_A} + \frac{s_B^{2}}{n_B}} \]
\[ (66 - 62) \pm (1.96)\sqrt{\frac{10^{2}}{55} + \frac{8^{2}}{40}} \]
4 ± 1.96(1.85)
4 ± 3.626
or 0.374 < (μ_A - μ_B) < 7.626
(μ_A - μ_B) ∈ (0.374, 7.626)
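The large-sample interval for the difference of two means can be sketched in code as follows. The function name `difference_interval` is illustrative, and the sample standard deviations stand in for the unknown population values, as the notes allow for large samples.

```python
from math import sqrt
from statistics import NormalDist

def difference_interval(x1, s1, n1, x2, s2, n2, confidence):
    """Large-sample confidence interval for mu1 - mu2:
    (x1 - x2) +/- z_(alpha/2) * sqrt(s1^2/n1 + s2^2/n2)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sqrt(s1**2 / n1 + s2**2 / n2)
    error = z * standard_error
    difference = x1 - x2
    return difference - error, difference + error

# Class A: n = 55, mean 66, s = 10; Class B: n = 40, mean 62, s = 8
low, high = difference_interval(66, 10, 55, 62, 8, 40, confidence=0.95)
print(round(low, 3), round(high, 3))
```

The output differs from (0.374, 7.626) in the third decimal place because the hand calculation rounds the standard error to 1.85 before multiplying.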
CONFIDENCE INTERVAL FOR
POPULATION VARIANCE
CHI-SQUARE DISTRIBUTION
\[ \chi^{2} = \frac{(n - 1)s^{2}}{\sigma^{2}} \]

n = sample size, s² = sample variance, σ² = population variance
PROPERTIES OF THE CHI-SQUARE
DISTRIBUTION
χ² distribution

1. Not symmetrical
2. Values are non-negative
3. As the degrees of freedom go up, the distribution becomes more symmetrical but never becomes exactly symmetrical
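FINDING THE CONFIDENCE INTERVAL
For n = 12 (11 degrees of freedom) and α = 0.05, each tail has area α/2 = 0.025, so the area to the right of the left critical value is 1 - 0.025 = 0.975.

χ²_L = 3.816 and χ²_R = 21.92

CONFIDENCE INTERVAL FOR VARIANCE
\[ \frac{(n - 1)s^{2}}{\chi^{2}_{R}} < \sigma^{2} < \frac{(n - 1)s^{2}}{\chi^{2}_{L}} \]
CONFIDENCE INTERVAL FOR STANDARD DEVIATION
\[ \sqrt{\frac{(n - 1)s^{2}}{\chi^{2}_{R}}} < \sigma < \sqrt{\frac{(n - 1)s^{2}}{\chi^{2}_{L}}} \]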
EXAMPLE 1
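n = 7, s² = 315.6
Find a 95% confidence interval for σ².

SOLUTION
\[ \text{C.I.} = \left[\frac{(n - 1)s^{2}}{\chi^{2}_{R}} \; ; \; \frac{(n - 1)s^{2}}{\chi^{2}_{L}}\right] \]
C.I. = [131.04 ; 1530.66]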
EXAMPLE 2
A sample of 7 boxes of a certain type of cereal
with a nominal weight of 750grams had the
following weights;

775, 780, 781, 795, 803, 810, 823

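Find a 95% confidence interval for σ².

SOLUTION
n = 7, s² = 315.6 (computed from the sample)
\[ \text{C.I.} = \left[\frac{(n - 1)s^{2}}{\chi^{2}_{R}} \; ; \; \frac{(n - 1)s^{2}}{\chi^{2}_{L}}\right] \]
C.I. = [131.04 ; 1530.66]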


EXAMPLE 3
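n = 8, s² = 5.3
Find a 95% confidence interval for σ².

Solution
\[ \text{C.I.} = \left[\frac{(n - 1)s^{2}}{\chi^{2}_{R}} \; ; \; \frac{(n - 1)s^{2}}{\chi^{2}_{L}}\right] = \left[\frac{7(5.3)}{16.013} \; ; \; \frac{7(5.3)}{1.690}\right] \]
C.I. = [2.32 ; 21.95]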
Hypothesis Testing
Introduction:
In some practical problems of statistical inference we may be required to take decisions concerning the parameters of the population instead of finding estimates for them. The following are some situations requiring such decisions:

(a) A health professional may claim that a drug is effective in 90% of the cases in which it is administered.

(b)A mean life span of a type of an electric bulb is at


(c) An accused, in a criminal trial, is always assumed to be innocent until proven otherwise.

(d) An educationist may claim that two methods of teaching are equally effective.

(e) An educational programme will result in improved communication between parents and children.

(f) A medical researcher may have a hypothesis that a new drug is more effective than another one in curing a disease.

• The above statements can be subjected to statistical verification using sample observations.

• Thus, by means of hypothesis testing, we are able to determine whether or not the statements are consistent with the available data or evidence.
(a) Definitions:

(i) A hypothesis is a statement, assertion or conjecture about the nature of one or more situations (or populations) to be studied.

(ii) Hypothesis testing is a statistical procedure that uses random sample data to determine whether a statement about a population should be rejected or not. Hypothesis tests involving population parameters are called parametric or classical tests.
(b) Types of Hypothesis:
In testing the validity of a hypothesis we usually propose two types of hypotheses, namely:

i) The null hypothesis, denoted H0, which is the tentative statement assumed to be true.

ii) The alternative hypothesis, denoted H1, which contradicts the null hypothesis. It is accepted only when sufficient evidence exists to establish its truth.
(c) Formulation of H0 and H1
When we wish to establish the validity of a statement about a population using the evidence obtained from random sample data, the negation of the statement is what we take as the null hypothesis. The statement itself constitutes the alternative hypothesis. In some applications it is not obvious how H0 and H1 should be formulated. The following guidelines for developing hypotheses in three types of situations are suggested.
(i) Testing a research hypothesis: This is formulated as the alternative hypothesis.

(ii) Testing the validity of a claim: This generally corresponds to the “innocent until proven guilty” analogy. The claim made is chosen as the null hypothesis while the challenge to the claim is taken as the alternative hypothesis.

(iii) Testing in decision-making situations: This occurs when the decision-maker must choose between two courses of action, one associated with H0 and the other with H1. For example, if the decision involves the population parameter θ, we should have the two hypotheses formulated as:
H0: θ = θ0 against H1: θ ≠ θ0
where θ0 is a particular value of θ, and an instant action is taken if H0 is rejected.
Hypothesis Testing of Population Means

For single means

Forms of Tests:
In general, a hypothesis test involving a parameter, say θ, takes one of the following forms:
(i) One-tailed test to the right, formulated as:
H0: θ = θ0
H1: θ > θ0
(ii) One-tailed test to the left, formulated as:
H0: θ = θ0
H1: θ < θ0
(iii) Two-tailed test, formulated as:
H0: θ = θ0
H1: θ ≠ θ0
Test Statistic
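• It is a formula that leads to the rejection or the acceptance of the null hypothesis.
The test statistic is
\[ Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \quad \text{for } n \ge 30 \]
\[ t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} \quad \text{for } n < 30 \]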
DECISION RULE
• Z is compared with the critical value $Z_{\alpha/2}$ or $Z_{\alpha}$

• t is compared with the critical value $t_{\alpha/2,\,(n-1)}$ or $t_{\alpha,\,(n-1)}$

Reject H0 if $|Z| > Z_{\alpha/2}$ or $|t| > t_{\alpha/2,\,(n-1)}$
AN OUTLINE FOR HYPOTHESIS TESTING

• State H0 and H1
• Choose the level of significance α
• Select a test statistic
• Find the critical region
• Compute the value of the statistic
• Draw conclusions (Reject or Accept)
Example
A car manufacturer claims that the average weekly income of owners of his car is $180. An investigator takes a random sample of 200 such car owners and finds that they have an average weekly income of $184.26 with a standard deviation of $24.12. On the basis of the sample, do you agree with the manufacturer’s claim? (Test at the 5% significance level.)
Solution
H0: μ = 180
H1: μ ≠ 180
α = 0.05
\[ Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{184.26 - 180}{24.12/\sqrt{200}} = 2.50 \]
Z_{α/2} = 1.96

Since 2.50 is outside the acceptance region, we reject H0 and conclude that the average weekly income of the car owners is not $180.
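The outline for hypothesis testing can be followed step by step in code for this example. A minimal sketch of the two-tailed Z test, using the sample standard deviation in place of σ as the solution does; the function name `two_tailed_z_test` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_z_test(xbar, mu0, sd, n, alpha):
    """Return the test statistic, the critical value z_(alpha/2),
    and whether H0: mu = mu0 is rejected in favour of H1: mu != mu0."""
    z = (xbar - mu0) / (sd / sqrt(n))
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return z, critical, abs(z) > critical

# Sample of 200 car owners: mean 184.26, standard deviation 24.12
z, critical, reject = two_tailed_z_test(184.26, 180, 24.12, 200, alpha=0.05)
print(round(z, 2), round(critical, 2), reject)
```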
Errors in Hypothesis
Testing:
When H0 is tested against H1 using randomly selected sample data, two possible errors may be committed.

These are the Type I and Type II errors, which come about as a result of the decisions taken. They are illustrated in the table below:

Decision      Actual situation
              H0 is true                  H0 is false (H1 is true)
Accept H0     Correct decision (1 - α)    Type II error (β)
Reject H0     Type I error (α)            Correct decision (1 - β)

(i) A Type I error is committed when the null hypothesis H0 is rejected when in fact it is true. The probability of committing this error is α, which is also referred to as the level of significance and indicates the size of the critical region.

(ii) A Type II error is committed when H0 is accepted when in fact it is false. The probability of committing this error is β.
P-value
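For a given level of significance α, the null hypothesis

i. is rejected if p-value < α

ii. is not rejected if p-value > α

• Let α = 0.05

p-value < 0.01: highly significant (HS), reject H0
0.01 ≤ p-value < 0.05: significant (S), reject H0
p-value ≥ 0.05: not significant (NS), accept H0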
(g) P-Value:
The p-value is the smallest level of significance for which the observed data would call for rejection of H0 in favour of H1. The p-value gives additional insight into the strength of the decision taken. A very small p-value, such as 0.0001, indicates that the observed data would be very unlikely if H0 were true, so there is strong evidence against H0. On the other hand, a high p-value such as 0.2 means that H0 is not rejected: the data provide little evidence that it is false. The p-value is often referred to as the observed level of significance. For a given level of significance α, the null hypothesis H0
i. is rejected if p-value < α
ii. is not rejected if p-value > α
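As an illustration of the rule above, the two-tailed p-value for a Z statistic is the area in both tails beyond the observed value. The sketch below reuses Z = 2.50 from the car-owner example; the function name `two_tailed_p_value` is illustrative, and the choice of a two-tailed test matches the hypotheses used in that example.

```python
from statistics import NormalDist

def two_tailed_p_value(z):
    """Area in both tails of N(0, 1) beyond |z|: 2 * (1 - Phi(|z|))."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05
p_value = two_tailed_p_value(2.50)
reject_h0 = p_value < alpha

print(round(p_value, 4), reject_h0)
```

The decision agrees with the critical-value approach: the p-value is below 0.05, so H0 is rejected at the 5% level, although it would not be rejected at the 1% level.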
END
