
ESTIMATION OF PARAMETER

STATISTICAL INFERENCE

Statistical inference is inference regarding the unknown aspects of the distribution of a population, based on samples taken from it. The unknown aspects may be the form of the distribution, the values of the parameters involved, or both.
Statistical inference is broadly classified into two branches:
1. Estimation of parameters
2. Testing of hypotheses
Estimation deals with methods of determining numbers that may be taken as the values of the unknown parameters, as well as with constructing intervals that contain the unknown parameters with a specified probability, based on samples taken from the population.

Testing of hypotheses deals with methods for deciding whether to accept or reject a hypothesis about the population, based on samples taken from it.

Point estimate
Any statistic (a function of the sample) suggested as an estimate of an unknown parameter is called a point estimate of that parameter.
E.g. for a normal population, the sample standard deviation is a point estimate of the population standard deviation.

The statistic suggested as the estimate is called the estimator, and its value in any particular case is called the estimate.

Desirable properties of a good estimate

1. Unbiasedness
2. Consistency
3. Efficiency
4. Sufficiency
Unbiasedness
Let $t$ be the statistic suggested as an estimate of a parameter $\theta$. Then $t$ is said to be an unbiased estimate of $\theta$ if $E(t) = \theta$.
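
As a quick illustration, here is a minimal simulation sketch (not part of the original notes; it assumes a normal population and uses the sample mean and the divide-by-n sample variance as the estimators):

```python
import numpy as np

# Sketch: the sample mean is unbiased for the population mean, while the
# divide-by-n sample variance is biased for the population variance.
rng = np.random.default_rng(0)
mu, sigma, n, reps = 5.0, 2.0, 10, 100_000

samples = rng.normal(mu, sigma, size=(reps, n))
mean_est = samples.mean(axis=1)          # sample mean of each sample
var_n = samples.var(axis=1, ddof=0)      # divide-by-n sample variance

print(mean_est.mean())   # ≈ 5.0,  E(x̄) = μ, so x̄ is unbiased
print(var_n.mean())      # ≈ 3.6,  E(s²) = (n-1)σ²/n = 0.9 · 4, so s² is biased
```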

Consistency
Let $t_n$ be a statistic based on a sample of size $n$. $t_n$ is said to be a consistent estimate of a parameter $\theta$ if, for any two positive numbers $\epsilon$ and $\eta$, an $N$ can be found such that, for $n \ge N$,
$$P(|t_n - \theta| < \epsilon) > 1 - \eta$$
Or, equivalently, $t_n$ is said to be a consistent estimator of $\theta$ if
$$P(|t_n - \theta| < \epsilon) \to 1 \ \text{as} \ n \to \infty$$
Note 1: Consistency is a large sample property and unbiasedness is a small
sample property.
Note 2: A consistent estimate need not be unbiased for small samples, but it
becomes unbiased as 𝑛 → ∞
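
A short simulation sketch of the definition above (assuming a normal population with the sample mean as $t_n$; this example is not from the original notes): the defining probability approaches 1 as $n$ grows.

```python
import numpy as np

# Sketch: P(|x̄_n - μ| < ε) approaches 1 as n grows, i.e. x̄_n is consistent for μ.
rng = np.random.default_rng(1)
mu, sigma, eps, reps = 0.0, 1.0, 0.1, 5_000

for n in (10, 100, 1000):
    xbar = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)
    print(n, np.mean(np.abs(xbar - mu) < eps))   # roughly 0.25, 0.68, 0.998
```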

Sufficient condition for consistency

If $E(t_n) = \theta$ or $E(t_n) \to \theta$, and $V(t_n) \to 0$ as $n \to \infty$, then $t_n$ is a consistent estimate of $\theta$.
Proof:
By Tchebycheff's inequality,
$$P\big(|t_n - \theta| < k\sqrt{V(t_n)}\big) > 1 - \frac{1}{k^2}$$
Let $\epsilon > 0$ and $\eta > 0$ be two given numbers. Since $V(t_n) \to 0$, we can find an $N$ such that, for $n \ge N$,
$$\frac{V(t_n)}{\epsilon^2} < \eta$$
Taking $k = \epsilon / \sqrt{V(t_n)}$, so that $k\sqrt{V(t_n)} = \epsilon$ and $1/k^2 = V(t_n)/\epsilon^2$,
$$P(|t_n - \theta| < \epsilon) > 1 - \frac{V(t_n)}{\epsilon^2} > 1 - \eta \quad \text{for } n \ge N$$
$\therefore$ $t_n$ is a consistent estimate of $\theta$.

Result: If T is an unbiased estimator of $\theta$, then $T^2$ is a biased estimator of $\theta^2$; but if T is a consistent estimator of $\theta$, then $T^2$ is also a consistent estimator of $\theta^2$.

Proof:
Part 1
Given that T is an unbiased estimator of $\theta$,
$$E(T) = \theta$$
$$V(T) = E\big(T - E(T)\big)^2 = E(T - \theta)^2 = E(T^2) - 2\theta E(T) + \theta^2 = E(T^2) - 2\theta^2 + \theta^2 = E(T^2) - \theta^2$$
Since $V(T) \ge 0$, and in fact $V(T) > 0$ unless T is a constant, $E(T^2) - \theta^2 > 0$, i.e. $E(T^2) > \theta^2$.
$\therefore$ $T^2$ is not an unbiased estimate of $\theta^2$.
Part 2
Since T is a consistent estimator of $\theta$, for $n \ge n_0$,
$$P(|T - \theta| \le \epsilon) \ge 1 - \eta$$
i.e.
$$P(-\epsilon \le T - \theta \le \epsilon) \ge 1 - \eta$$
$$P(\theta - \epsilon \le T \le \theta + \epsilon) \ge 1 - \eta$$
Squaring (assuming $\theta - \epsilon \ge 0$),
$$P\big((\theta - \epsilon)^2 \le T^2 \le (\theta + \epsilon)^2\big) \ge 1 - \eta$$
$$P(\theta^2 - 2\theta\epsilon + \epsilon^2 \le T^2 \le \theta^2 + 2\theta\epsilon + \epsilon^2) \ge 1 - \eta$$
$$P(\epsilon^2 - 2\theta\epsilon \le T^2 - \theta^2 \le \epsilon^2 + 2\theta\epsilon) \ge 1 - \eta$$
Put $\epsilon_1 = \epsilon^2 + 2\theta\epsilon$; since $-\epsilon_1 \le \epsilon^2 - 2\theta\epsilon$,
$$P(-\epsilon_1 \le T^2 - \theta^2 \le \epsilon_1) \ge 1 - \eta$$
$$P(|T^2 - \theta^2| \le \epsilon_1) \ge 1 - \eta$$
and $\epsilon_1 \to 0$ as $\epsilon \to 0$. So $T^2$ is a consistent estimator of $\theta^2$.
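
A numerical sketch of this result (assuming $T = \bar{x}$ as the unbiased estimator of $\theta = \mu$; not from the original notes): $E(\bar{x}^2) = \mu^2 + \sigma^2/n > \mu^2$, so $\bar{x}^2$ is biased for $\mu^2$, but the bias vanishes as $n$ grows.

```python
import numpy as np

# Sketch: E(x̄²) = μ² + σ²/n, so x̄² overestimates μ² on average,
# but the bias σ²/n shrinks to 0 as n increases (consistency of x̄² for μ²).
rng = np.random.default_rng(2)
mu, sigma, reps = 3.0, 2.0, 20_000

for n in (5, 50, 500):
    xbar = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)
    print(n, (xbar**2).mean())   # ≈ 9 + 4/n: about 9.8, 9.08, 9.008
```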
Efficiency
Let $t_1$ and $t_2$ be two unbiased estimates of a parameter $\theta$. Then $t_1$ is said to be more efficient than $t_2$ if $V(t_1) < V(t_2)$.
Note: As the variance decreases, the efficiency of the estimate increases. The ratio $\dfrac{V(t_2)}{V(t_1)}$ is called the relative efficiency of $t_1$ with respect to $t_2$.
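
A small sketch comparing two unbiased estimates of a normal mean (sample mean vs. sample median; the example is an assumption, not from the original notes):

```python
import numpy as np

# Sketch: for normal data both the sample mean and the sample median are
# unbiased for μ, but the mean has smaller variance; V(median) ≈ (π/2)·σ²/n.
rng = np.random.default_rng(3)
n, reps = 100, 50_000
x = rng.normal(0.0, 1.0, size=(reps, n))

v_mean = x.mean(axis=1).var()
v_median = np.median(x, axis=1).var()
print(v_mean, v_median, v_median / v_mean)   # ≈ 0.010, 0.0157, ratio ≈ 1.57
```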

Sufficiency

An estimate of a parameter $\theta$ is called a sufficient estimate if it contains all the information about $\theta$ contained in the sample.

Neyman’s condition for sufficiency

Let $x_1, x_2, \dots, x_n$ be a sample from a population with p.d.f. $f(x, \theta)$. The joint p.d.f. of the sample (usually called the likelihood of the sample) is
$$L(x_1, x_2, \dots, x_n; \theta) = f(x_1, \theta)\, f(x_2, \theta) \cdots f(x_n, \theta)$$
T is a sufficient estimate of $\theta$ if and only if it is possible to write
$$L(x_1, x_2, \dots, x_n; \theta) = L_1(t, \theta)\, L_2(x_1, x_2, \dots, x_n)$$
where $L_1(t, \theta)$ is a function of $t$ and $\theta$ alone and $L_2$ is independent of $\theta$.
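
For concreteness, a standard worked example of this factorization (assuming a Bernoulli population; this example is not from the original notes):

```latex
% For f(x, \theta) = \theta^x (1-\theta)^{1-x}, \ x \in \{0, 1\}:
\[
L(x_1, \dots, x_n; \theta)
  = \prod_{i=1}^{n} \theta^{x_i} (1-\theta)^{1-x_i}
  = \underbrace{\theta^{t} (1-\theta)^{\,n - t}}_{L_1(t,\,\theta)}
    \cdot \underbrace{1}_{L_2(x_1, \dots, x_n)},
\qquad t = \sum_{i=1}^{n} x_i,
\]
% so by Neyman's condition, t = \sum x_i (equivalently the sample proportion)
% is a sufficient estimate of \theta.
```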

Completeness

Let X be a random variable with p.d.f. $f(x, \theta)$, and let $u(x)$ be a measurable function of $x$. If $E[u(X)] = 0$ for every admissible $\theta$ implies $u(x) = 0$, except on a set of points with probability zero, then the family of density functions $f(x, \theta)$ is said to be complete.
Methods of Estimation

• Method of maximum likelihood


Definition: Let $f(x, \theta_1, \theta_2, \dots, \theta_k)$ be the p.d.f. of the population, where $\theta_1, \theta_2, \dots, \theta_k$ are the parameters. Let $x_1, x_2, \dots, x_n$ be a random sample taken from the population. The likelihood function of the sample is
$$L(x_1, x_2, \dots, x_n; \theta_1, \theta_2, \dots, \theta_k) = f(x_1, \theta_1, \dots, \theta_k)\, f(x_2, \theta_1, \dots, \theta_k) \cdots f(x_n, \theta_1, \dots, \theta_k)$$
Those values of $\theta_1, \theta_2, \dots, \theta_k$ which maximize the likelihood function are called the maximum likelihood estimates of $\theta_1, \theta_2, \dots, \theta_k$.

Method:
• If there is only one parameter ($k = 1$), solving $\dfrac{dL}{d\theta} = 0$ with $\dfrac{d^2L}{d\theta^2} < 0$ leads to the M.L. estimate of $\theta$.
• If there is more than one parameter ($k > 1$), the equations $\dfrac{\partial L}{\partial \theta_1} = 0,\ \dfrac{\partial L}{\partial \theta_2} = 0,\ \dots,\ \dfrac{\partial L}{\partial \theta_k} = 0$ together give the M.L. estimates.

Since those values of the parameters which maximize $L$ also maximize $\log L$, the equations $\dfrac{\partial \log L}{\partial \theta_1} = 0,\ \dfrac{\partial \log L}{\partial \theta_2} = 0,\ \dots,\ \dfrac{\partial \log L}{\partial \theta_k} = 0$ together give the M.L. estimates, and are usually easier to solve.
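
A minimal numerical sketch of the method (assuming an exponential population $f(x, \theta) = \theta e^{-\theta x}$; the distribution, sample size, and true value are illustrative assumptions, not from the original notes):

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import expon

# Sketch: maximise log L(θ) = n·log θ - θ·Σxᵢ numerically and compare with the
# closed-form M.L. estimate θ̂ = 1/x̄ obtained from d(log L)/dθ = 0.
rng = np.random.default_rng(4)
theta_true = 2.0
x = expon.rvs(scale=1.0 / theta_true, size=500, random_state=rng)

def neg_log_likelihood(theta):
    return -(len(x) * np.log(theta) - theta * x.sum())

res = minimize_scalar(neg_log_likelihood, bounds=(1e-6, 50.0), method="bounded")
print(res.x, 1.0 / x.mean())   # the two estimates agree and are close to 2
```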

• Method of moments

Let $f(x, \theta_1, \theta_2, \dots, \theta_k)$ be the p.d.f. of the population and let $x_1, x_2, \dots, x_n$ be a sample taken from it. In this method we take the first $k$ moments of the population, equate them to the corresponding moments of the sample, and take the values of $\theta_1, \theta_2, \dots, \theta_k$ obtained as the solutions of these equations as their estimates.
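
A short sketch of the method of moments (assuming a Gamma population with mean $\alpha\beta$ and variance $\alpha\beta^2$; this example is not from the original notes):

```python
import numpy as np
from scipy.stats import gamma

# Sketch: equate the first two moments of Gamma(α, β) to the sample moments
# and solve: αβ = m1, αβ² = m2  ⇒  α̂ = m1²/m2, β̂ = m2/m1.
rng = np.random.default_rng(5)
alpha_true, beta_true = 3.0, 2.0
x = gamma.rvs(a=alpha_true, scale=beta_true, size=2000, random_state=rng)

m1 = x.mean()              # first sample moment (mean)
m2 = x.var()               # second central sample moment (variance)
alpha_hat = m1**2 / m2
beta_hat = m2 / m1
print(alpha_hat, beta_hat)   # ≈ 3 and ≈ 2
```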
Desirable properties of Maximum likelihood estimate

• M.L. estimates are asymptotically unbiased
• M.L. estimates are consistent
• M.L. estimates are most efficient
• M.L. estimates are sufficient if sufficient estimates exist
• M.L. estimates are asymptotically normally distributed

Cramer-Rao Inequality

Let $f(x, \theta)$ be the p.d.f. of a random variable X with only one parameter $\theta$. Let $x_1, x_2, \dots, x_n$ be a random sample taken from the population and let $t$ be an unbiased estimator of $\theta$. If

a) the range of variation of X is independent of $\theta$, and
b) differentiation under the integral sign (or the summation sign, in the discrete case) is valid for $f(x, \theta)$,

then
$$V(t) \ge \frac{1}{E\left[\left(\dfrac{\partial \log L}{\partial \theta}\right)^2\right]}$$
where $V(t)$ is the sampling variance of $t$ and $L$ is the likelihood function of the sample.

Note:
1) If $t$ is not an unbiased estimate of $\theta$, but $E(t) = \psi(\theta)$, where $\psi(\theta)$ is a function of $\theta$, the inequality becomes
$$V(t) \ge \frac{[\psi'(\theta)]^2}{E\left[\left(\dfrac{\partial \log L}{\partial \theta}\right)^2\right]}$$
2) $E\left[\left(\dfrac{\partial \log L}{\partial \theta}\right)^2\right] = -E\left[\dfrac{\partial^2 \log L}{\partial \theta^2}\right]$
3) The inequality becomes an equality when $\dfrac{\partial \log L}{\partial \theta} = A(t - \theta)$, where $A$ is independent of the observations but may be a function of $\theta$. If this condition is satisfied, $t$ is an unbiased minimum-variance estimator of $\theta$ and $\dfrac{1}{A}$ is the minimum value of the variance of $t$.

Method of minimum variance

Let $f(x, \theta)$ be the p.d.f. of the population with one parameter $\theta$ and $x_1, x_2, \dots, x_n$ a random sample. Let $L(x_1, x_2, \dots, x_n; \theta)$ be the likelihood function. If $\dfrac{\partial \log L}{\partial \theta}$ can be put in the form $k(t - \theta)$, where $k$ is either a constant or a function of $\theta$ and $t$ is a function of the observations only, then $t$ is the minimum-variance unbiased estimator of $\theta$.
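
A standard worked example of this method, and of the equality condition in Note 3 above (estimating the mean of $N(\mu, \sigma)$ with $\sigma$ known; the example itself is not from the original notes):

```latex
\[
\log L = -\tfrac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2,
\qquad
\frac{\partial \log L}{\partial \mu}
  = \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i - \mu)
  = \frac{n}{\sigma^2}\,(\bar{x} - \mu).
\]
% This is of the form k(t - θ) with t = x̄ and k = n/σ², so x̄ is the
% minimum-variance unbiased estimator of μ, and V(x̄) = 1/k = σ²/n attains
% the Cramer-Rao bound.
```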

Interval estimation

Confidence interval
Let $t_1$ and $t_2$ be two statistics such that the probability that the interval $(t_1, t_2)$ contains the true value of the unknown parameter has a preassigned value $\alpha$, called the confidence coefficient of the interval. Such an interval is called a confidence interval with confidence coefficient $\alpha$.

1. Confidence interval for the mean of a normal population


a) C.I for 𝝁 when 𝝈 is known
Let $\bar{x}$ be the mean of a sample of size $n$ taken from $N(\mu, \sigma)$. We know that
$$t = \frac{(\bar{x} - \mu)\sqrt{n}}{\sigma} \sim N(0, 1)$$
We can find $t_{\alpha/2}$ such that
$$P(|t| \le t_{\alpha/2}) = \alpha$$
$$P(-t_{\alpha/2} \le t \le t_{\alpha/2}) = \alpha$$
$$P\left(-t_{\alpha/2} \le \frac{(\bar{x} - \mu)\sqrt{n}}{\sigma} \le t_{\alpha/2}\right) = \alpha$$
$$P\left(-t_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \bar{x} - \mu \le t_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = \alpha$$
$$P\left(-t_{\alpha/2}\frac{\sigma}{\sqrt{n}} - \bar{x} \le -\mu \le t_{\alpha/2}\frac{\sigma}{\sqrt{n}} - \bar{x}\right) = \alpha$$
$$P\left(\bar{x} + t_{\alpha/2}\frac{\sigma}{\sqrt{n}} \ge \mu \ge \bar{x} - t_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = \alpha$$
So $\left(\bar{x} - t_{\alpha/2}\dfrac{\sigma}{\sqrt{n}},\ \bar{x} + t_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}\right)$ is the required interval.
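
A minimal computational sketch of this interval (the data, $\sigma$, and $\alpha = 0.95$ are illustrative assumptions, not from the original notes):

```python
import numpy as np
from scipy.stats import norm

# Sketch: C.I. for μ with σ known. Here α is the confidence coefficient
# (e.g. 0.95), so t_{α/2} is the z-value with P(|Z| ≤ z) = α.
rng = np.random.default_rng(6)
mu_true, sigma, n = 10.0, 3.0, 40
x = rng.normal(mu_true, sigma, size=n)

alpha = 0.95
z = norm.ppf(0.5 + alpha / 2)            # ≈ 1.96
half_width = z * sigma / np.sqrt(n)
print(x.mean() - half_width, x.mean() + half_width)   # covers μ with probability α
```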

b) C.I for 𝝁 when 𝝈 is unknown

$$t = \frac{(\bar{x} - \mu)\sqrt{n - 1}}{s} \sim t_{n-1}$$
where $s$ is the sample standard deviation (with divisor $n$). We can find $t_{\alpha/2}$ such that
$$P(|t| \le t_{\alpha/2}) = \alpha$$
$$P(-t_{\alpha/2} \le t \le t_{\alpha/2}) = \alpha$$
$$P\left(-t_{\alpha/2} \le \frac{(\bar{x} - \mu)\sqrt{n - 1}}{s} \le t_{\alpha/2}\right) = \alpha$$
$$P\left(\bar{x} - t_{\alpha/2}\frac{s}{\sqrt{n - 1}} \le \mu \le \bar{x} + t_{\alpha/2}\frac{s}{\sqrt{n - 1}}\right) = \alpha$$
So $\left(\bar{x} - t_{\alpha/2}\dfrac{s}{\sqrt{n - 1}},\ \bar{x} + t_{\alpha/2}\dfrac{s}{\sqrt{n - 1}}\right)$ is the required interval, where $t_{\alpha/2}$ is now taken from the $t$-distribution with $n - 1$ degrees of freedom.
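
The corresponding computational sketch (the data and $\alpha = 0.95$ are illustrative assumptions):

```python
import numpy as np
from scipy.stats import t as t_dist

# Sketch: C.I. for μ with σ unknown, following the notes' convention that
# s is the sample s.d. with divisor n (hence the √(n-1) in the half-width).
rng = np.random.default_rng(7)
x = rng.normal(10.0, 3.0, size=20)
n, alpha = len(x), 0.95

s = x.std(ddof=0)                                  # divisor n, as in the notes
t_crit = t_dist.ppf(0.5 + alpha / 2, df=n - 1)
half_width = t_crit * s / np.sqrt(n - 1)
print(x.mean() - half_width, x.mean() + half_width)
```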
2. C.I. for the variance of a normal population
Let $s^2$ be the variance of a sample of size $n$ taken from a normal population $N(\mu, \sigma)$. We know that
$$u = \frac{n s^2}{\sigma^2} \sim \chi^2_{n-1}$$
Let $u_1, u_2$ be two numbers such that $P(u_1 \le u \le u_2) = \alpha$. Then
$$P\left(u_1 \le \frac{n s^2}{\sigma^2} \le u_2\right) = \alpha$$
$$P\left(\frac{u_1}{n s^2} \le \frac{1}{\sigma^2} \le \frac{u_2}{n s^2}\right) = \alpha$$
$$P\left(\frac{n s^2}{u_1} \ge \sigma^2 \ge \frac{n s^2}{u_2}\right) = \alpha$$
So $\left(\dfrac{n s^2}{u_2},\ \dfrac{n s^2}{u_1}\right)$ is a confidence interval for $\sigma^2$ with confidence coefficient $\alpha$.
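
A computational sketch (the equal-tail choice of $u_1, u_2$ and the data are illustrative assumptions):

```python
import numpy as np
from scipy.stats import chi2

# Sketch: C.I. for σ² with confidence coefficient α, taking u1 and u2 as the
# equal-tail chi-square quantiles with n-1 degrees of freedom.
rng = np.random.default_rng(8)
x = rng.normal(5.0, 2.0, size=30)
n, alpha = len(x), 0.95

s2 = x.var(ddof=0)                          # sample variance with divisor n
u1 = chi2.ppf((1 - alpha) / 2, df=n - 1)
u2 = chi2.ppf(1 - (1 - alpha) / 2, df=n - 1)
print(n * s2 / u2, n * s2 / u1)             # interval for σ²; true σ² here is 4
```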

3. C.I. for the proportion of a binomial population

Consider a sample of size $n$ from a binomial population with parameters $N$ and $p$, where $N$ is assumed to be known. Let $\hat{p}$ be the sample proportion. We know that, approximately for large $n$,
$$t = \frac{\hat{p} - p}{\sqrt{\dfrac{p(1 - p)}{n}}} \sim N(0, 1)$$
Since $p$ is unknown, as an approximation we take
$$t = \frac{\hat{p} - p}{\sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n}}}$$

We can find $t_{\alpha/2}$ such that
$$P(|t| \le t_{\alpha/2}) = \alpha$$
$$P(-t_{\alpha/2} \le t \le t_{\alpha/2}) = \alpha$$
$$P\left(-t_{\alpha/2} \le \frac{\hat{p} - p}{\sqrt{\hat{p}(1 - \hat{p})/n}} \le t_{\alpha/2}\right) = \alpha$$
$$P\left(-t_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \le \hat{p} - p \le t_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}\right) = \alpha$$
$$P\left(\hat{p} - t_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \le p \le \hat{p} + t_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}\right) = \alpha$$
So $\left(\hat{p} - t_{\alpha/2}\sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n}},\ \hat{p} + t_{\alpha/2}\sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n}}\right)$ is the confidence interval for $p$ with confidence coefficient $\alpha$.
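
A computational sketch of this interval (sample size, true proportion, and $\alpha = 0.95$ are illustrative assumptions):

```python
import numpy as np
from scipy.stats import norm

# Sketch: approximate C.I. for a binomial proportion p using the normal
# approximation with the estimated standard error √(p̂(1-p̂)/n).
rng = np.random.default_rng(9)
n, p_true, alpha = 400, 0.3, 0.95
x = rng.binomial(1, p_true, size=n)        # n Bernoulli (0/1) observations

p_hat = x.mean()
z = norm.ppf(0.5 + alpha / 2)
half_width = z * np.sqrt(p_hat * (1 - p_hat) / n)
print(p_hat - half_width, p_hat + half_width)   # covers 0.3 with probability ≈ α
```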
