Topic 3 Theory of Estimation
1. Sufficiency
One of the most important objectives in the primary stage of statistical analysis is to summarize the
observed data to a form most suitable for decision making. This primary data processing generally
reduces the size of the original set of the sample values to a relatively small number of statistics.
When sample data are summarized by a small number of statistics, it is desirable that no information relevant to the decision procedure is lost in the process; however, this is not always possible.
The theory of sufficient statistics provides us with the necessary criteria for identifying and constructing
sufficiently informative non-trivial statistics.
Suppose that x₁, x₂, x₃, …, xₙ form a random sample from a distribution having p.d.f f(x; θ), where θ is a real number belonging to Ω. Let T be a statistic and t any particular value of T. Then T is said to be a sufficient statistic for the parameter θ if there exists a determination of the conditional joint distribution of x₁, x₂, x₃, …, xₙ given T = t which does not depend on θ.
Therefore, for each value of t there will be a family of possible conditional distributions corresponding to the different possible values of θ ∈ Ω. If it happens that, for each possible value of t, the conditional distribution of x₁, x₂, x₃, …, xₙ given T = t is the same for all values of θ and therefore does not actually depend on the value of θ, then T is a sufficient statistic for the parameter θ.
Example 1
Suppose that x₁, x₂, x₃, …, xₙ form a random sample from a Poisson distribution with mean λ. Show that the statistic

$$T = \sum_{i=1}^{n} x_i$$

is a sufficient statistic for λ.
Solution
Let A denote the event {X₁ = x₁, X₂ = x₂, …, Xₙ = xₙ}, where each xᵢ has p.m.f

$$f(x_i;\lambda) = \frac{e^{-\lambda}\lambda^{x_i}}{x_i!}, \qquad x_i = 0, 1, 2, \dots$$

and let B = {T = t}, t = 0, 1, 2, …. When t = Σxᵢ we have A ⊆ B, so

$$L(x_1,\dots,x_n,t;\lambda) = \Pr(A \cap B) = \Pr(A) = \prod_{i=1}^{n} \frac{e^{-\lambda}\lambda^{x_i}}{x_i!} = \frac{e^{-n\lambda}\lambda^{\sum x_i}}{\prod_{i=1}^{n} x_i!} = \frac{e^{-n\lambda}\lambda^{t}}{\prod_{i=1}^{n} x_i!}$$

The marginal p.m.f of T is

$$g(t;\lambda) = \Pr(T=t) = \frac{e^{-n\lambda}(n\lambda)^{t}}{t!}, \qquad t = 0, 1, 2, \dots$$
Recall that T = Σᵢ₌₁ⁿ xᵢ ~ Poisson(nλ), since T is the sum of n independent Poisson random variables each having mean λ. Hence the conditional distribution of x₁, x₂, x₃, …, xₙ given T = t is

$$g(x_1,\dots,x_n \mid t) = \frac{e^{-n\lambda}\lambda^{t}\big/\prod_{i=1}^{n} x_i!}{e^{-n\lambda}(n\lambda)^{t}\big/t!} = \frac{t!}{\prod_{i=1}^{n} x_i!}\left(\frac{1}{n}\right)^{t}$$

on the set

$$\left\{(x_1, x_2, \dots, x_n) : \sum_{i=1}^{n} x_i = t,\; x_i = 0, 1, 2, \dots\right\}$$

This is the multinomial distribution with parameters t and pᵢ = 1/n, i = 1, 2, 3, …, n. Since this conditional distribution is independent of the parameter λ, we conclude that T = Σᵢ₌₁ⁿ xᵢ is sufficient for λ.
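This conclusion can be checked numerically. The sketch below (my illustration, not part of the notes, with an arbitrarily chosen sample and values of λ) computes the conditional probability Pr(x₁, …, xₙ | T = t) directly and compares it with the multinomial p.m.f t!/(∏xᵢ! · nᵗ); the two agree for every λ.

```python
# Numerical sketch: for an i.i.d. Poisson(lam) sample, Pr(x | T = t) equals
# the multinomial pmf  t! / (prod x_i! * n^t)  for EVERY value of lam.
from math import exp, factorial

def poisson_pmf(x, lam):
    return exp(-lam) * lam**x / factorial(x)

def conditional_pmf(xs, lam):
    """Pr(sample = xs | T = sum(xs)) under i.i.d. Poisson(lam)."""
    n, t = len(xs), sum(xs)
    joint = 1.0
    for x in xs:
        joint *= poisson_pmf(x, lam)        # Pr(A) = product of marginals
    pr_t = poisson_pmf(t, n * lam)          # T ~ Poisson(n*lam)
    return joint / pr_t

def multinomial_pmf(xs):
    """Multinomial pmf with t = sum(xs) trials and p_i = 1/n."""
    n, t = len(xs), sum(xs)
    coef = factorial(t)
    for x in xs:
        coef //= factorial(x)
    return coef / n**t

xs = (2, 0, 3)                              # arbitrary sample: n = 3, t = 5
for lam in (0.5, 1.0, 4.0):                 # same answer for every lam
    assert abs(conditional_pmf(xs, lam) - multinomial_pmf(xs)) < 1e-12
```

The dependence on λ cancels exactly between the joint p.m.f and the p.m.f of T, which is precisely what sufficiency asserts.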
Example 2
Suppose that x₁, x₂, x₃, …, xₙ form a random sample from a Bernoulli distribution with parameter θ. Show that the statistic T = Σᵢ₌₁ⁿ xᵢ is sufficient for θ.
Solution
$$f(x;\theta) = \begin{cases} \theta^{x}(1-\theta)^{1-x}, & x = 0, 1 \\ 0, & \text{elsewhere} \end{cases}$$

$$\Pr(X \mid T=t) = \frac{\Pr\left[(x_1, x_2, \dots, x_n) \cap (T=t)\right]}{\Pr(T=t)} = \frac{\Pr(x_1, x_2, \dots, x_n)}{\Pr(T=t)} = \frac{\theta^{\sum x_i}(1-\theta)^{n-\sum x_i}}{\Pr(T=t)}$$

But Pr(T = t) = Pr(Σᵢ₌₁ⁿ xᵢ = t). Note that the distribution of Σᵢ₌₁ⁿ xᵢ is binomial with parameters n and θ:

$$x_i \sim \mathrm{Bernoulli}(\theta) \implies \sum_{i=1}^{n} x_i \sim \mathrm{Bin}(n,\theta)$$

This implies that

$$\Pr(T=t) = \binom{n}{t}\theta^{t}(1-\theta)^{n-t}$$

Therefore

$$\Pr(X \mid T=t) = \frac{\theta^{t}(1-\theta)^{n-t}}{\binom{n}{t}\theta^{t}(1-\theta)^{n-t}} = \frac{1}{\binom{n}{t}}$$

This is independent of θ, hence T is a sufficient statistic for θ.
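The result 1/C(n, t) can be verified exhaustively for a small sample size. The sketch below (illustrative; n and the θ values are arbitrary choices of mine) enumerates every 0/1 sample and checks the conditional probability for several θ.

```python
# Sketch: for a Bernoulli(theta) sample, Pr(x | T = t) = 1 / C(n, t),
# independent of theta.  Checked over all 2^n binary samples.
from itertools import product
from math import comb

def conditional_pmf(xs, theta):
    n, t = len(xs), sum(xs)
    joint = theta**t * (1 - theta)**(n - t)               # Pr(x_1,...,x_n)
    pr_t = comb(n, t) * theta**t * (1 - theta)**(n - t)   # T ~ Bin(n, theta)
    return joint / pr_t

n = 4
for theta in (0.2, 0.5, 0.9):
    for xs in product((0, 1), repeat=n):                  # all 2^n samples
        assert abs(conditional_pmf(xs, theta) - 1 / comb(n, sum(xs))) < 1e-12
```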
Example 3
Consider again the Bernoulli p.d.f

$$f(x;\theta) = \begin{cases} \theta^{x}(1-\theta)^{1-x}, & x = 0, 1 \\ 0, & \text{elsewhere} \end{cases}$$

Solution
Let us now consider the particular case n = 3 with x₁ = 1, x₂ = 1, x₃ = 0, so that t = 2. Then

$$g(1,1,0 \mid T=2) = \frac{\Pr(x_1=1, x_2=1, x_3=0)}{\Pr(T=2)} = \frac{f(1,1,0)}{f(1,1,0)+f(1,0,1)+f(0,1,1)}$$

since T = 2 occurs for exactly the three arrangements (1,1,0), (1,0,1) and (0,1,1). This ratio equals θ²(1−θ)/[3θ²(1−θ)] = 1/3, which does not depend on θ.
NB:
The calculations in the previous example suggest the following general conclusions:
$$\frac{L(x;\theta)}{g(t;\theta)} = \frac{\prod_{i=1}^{n} f(x_i;\theta)}{g(t;\theta)} \qquad (*)$$

We say that T is a sufficient statistic for θ if and only if the ratio in (*) does not depend on θ. Although for distributions of the continuous type we cannot use the same argument, it is still true that if the ratio (*) does not depend on θ, then the conditional distribution of x₁, x₂, x₃, …, xₙ given T = t does not depend on θ. We therefore take the following alternative definition of a sufficient statistic T for a parameter θ.

Let x₁, x₂, x₃, …, xₙ form a random sample of size n from a distribution that has p.d.f f(x; θ). Let T = U(x₁, x₂, x₃, …, xₙ) be a statistic whose p.d.f is g(t; θ). Then T is a sufficient statistic for θ if and only if the ratio (*) does not depend on θ for every fixed value t of T.
Exercise
Example
Let Y₁ < Y₂ < Y₃ < … < Yₙ denote the order statistics of a random sample x₁, x₂, x₃, …, xₙ from the distribution that has p.d.f

$$f(x;\theta) = \begin{cases} e^{-(x-\theta)}, & \theta < x < \infty \\ 0, & \text{elsewhere} \end{cases}$$

Show that Y₁ is a sufficient statistic for θ.
Solution
The c.d.f of Y₁ is

$$G_1(y_1) = \Pr(Y_1 \le y_1) = 1 - \Pr(Y_1 > y_1) = 1 - \prod_{i=1}^{n}\int_{y_1}^{\infty} e^{-(x-\theta)}\,dx = 1 - \prod_{i=1}^{n} e^{-(y_1-\theta)} = 1 - e^{-n(y_1-\theta)}$$

so its p.d.f is

$$g_1(y_1) = \frac{d}{dy_1}\left(1 - e^{-n(y_1-\theta)}\right) = n e^{-n(y_1-\theta)}$$

Now the joint p.d.f of x₁, x₂, x₃, …, xₙ is

$$L(X;\theta) = \prod_{i=1}^{n} e^{-(x_i-\theta)} = e^{n\theta - \sum x_i} = e^{n\theta}\, e^{-\sum x_i}$$

Hence

$$\frac{L(X;\theta)}{g_1(y_1;\theta)} = \frac{e^{n\theta}\, e^{-\sum x_i}}{n e^{-n(y_1-\theta)}} = \frac{e^{-\sum x_i}}{n e^{-n y_1}}$$

which is independent of θ, so Y₁ is a sufficient statistic for θ.
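The cancellation of θ in the ratio above is easy to confirm numerically. The following sketch (my illustration; the sample values and θ values are arbitrary, with every θ below min(xᵢ) so the densities are positive) evaluates the ratio L(X; θ)/g₁(y₁; θ) at several θ.

```python
# Sketch: for the shifted exponential f(x; theta) = e^{-(x - theta)}, x > theta,
# the ratio  L(X; theta) / g1(y1; theta)  does not depend on theta.
from math import exp

def ratio(xs, theta):
    n = len(xs)
    y1 = min(xs)                      # observed value of Y1
    L = exp(n * theta - sum(xs))      # joint pdf: e^{n*theta - sum x_i}
    g1 = n * exp(-n * (y1 - theta))   # pdf of Y1 evaluated at y1
    return L / g1

xs = [2.3, 4.1, 1.7, 3.0]
vals = [ratio(xs, theta) for theta in (0.1, 0.9, 1.5)]   # all theta < min(xs)
assert max(vals) - min(vals) < 1e-12                      # constant in theta
```

Analytically the ratio collapses to e^{−Σxᵢ + n·y₁}/n, which contains no θ.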
FACTORISATION CRITERION
We now introduce a simple method of finding a sufficient statistic that is applicable in many problems. Let x₁, x₂, x₃, …, xₙ form a random sample for which the p.d.f is f(x; θ), where the value of θ is unknown and belongs to a given parameter space Ω. A statistic T = U(x₁, x₂, x₃, …, xₙ) is sufficient for θ if and only if the joint p.d.f L(X; θ) of x₁, x₂, x₃, …, xₙ can be factored, for all values of X = (x₁, x₂, x₃, …, xₙ) and all θ ∈ Ω, as

$$L(X;\theta) = h(x)\, g(t;\theta)$$

where h(x) does not depend on θ and g(t; θ) depends on X only through the value of t = U(x₁, …, xₙ).
Exercise
Example 1
Suppose that x₁, x₂, x₃, …, xₙ form a random sample from a Bernoulli distribution for which the probability of success θ is unknown, 0 ≤ θ ≤ 1. Show that t = Σᵢ₌₁ⁿ xᵢ is sufficient for θ.
Solution
$$L(X;\theta) = \prod_{i=1}^{n} \theta^{x_i}(1-\theta)^{1-x_i} = \theta^{\sum x_i}(1-\theta)^{n-\sum x_i} = \theta^{t}(1-\theta)^{n-t}, \qquad t = \sum_{i=1}^{n} x_i$$

We can see that L(X; θ) has been expressed as the product of h(x) = 1, which does not depend on θ, and g(t; θ) = θᵗ(1−θ)ⁿ⁻ᵗ, which depends on θ but depends on the observed vector X only through the value of t. It then follows that t = Σᵢ₌₁ⁿ xᵢ is a sufficient statistic for θ.
i=1
Example 2
Suppose that x₁, x₂, x₃, …, xₙ form a random sample from a normal distribution with mean μ and variance σ², where μ is unknown and σ² is known. Show that T = Σᵢ₌₁ⁿ xᵢ is a sufficient statistic for μ.
Solution
$$L(X;\mu) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{\frac{-(x_i-\mu)^2}{2\sigma^2}\right\}$$

$$= (2\pi\sigma^2)^{-n/2} \exp\left\{\frac{-1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\mu)^2\right\}$$

$$= (2\pi\sigma^2)^{-n/2} \exp\left\{\frac{-1}{2\sigma^2}\sum_{i=1}^{n}\left(x_i^2 - 2x_i\mu + \mu^2\right)\right\}$$

$$= (2\pi\sigma^2)^{-n/2} \exp\left\{\frac{-1}{2\sigma^2}\sum_{i=1}^{n}x_i^2\right\} \exp\left\{\frac{\mu}{\sigma^2}\sum_{i=1}^{n}x_i\right\} \exp\left\{\frac{-n\mu^2}{2\sigma^2}\right\}$$

$$= (2\pi\sigma^2)^{-n/2} \exp\left\{\frac{-1}{2\sigma^2}\sum_{i=1}^{n}x_i^2\right\} \exp\left\{\frac{\mu}{\sigma^2}\sum_{i=1}^{n}x_i - \frac{n\mu^2}{2\sigma^2}\right\}$$
We see that L(X; μ) has been expressed as the product of the function

$$h(x) = (2\pi\sigma^2)^{-n/2} \exp\left\{\frac{-1}{2\sigma^2}\sum_{i=1}^{n}x_i^2\right\}$$

which does not depend on μ, and

$$g(t;\mu) = \exp\left\{\frac{\mu}{\sigma^2}\sum_{i=1}^{n}x_i - \frac{n\mu^2}{2\sigma^2}\right\} = \exp\left\{\frac{\mu t}{\sigma^2} - \frac{n\mu^2}{2\sigma^2}\right\}$$

which depends on X only through the value of t. It follows by the factorization criterion that T = Σᵢ₌₁ⁿ xᵢ is a sufficient statistic for μ.
i=1
NB:
Since Σᵢ₌₁ⁿ xᵢ = n x̄, we can equivalently say that g(t; μ) above depends on x₁, x₂, x₃, …, xₙ only through the value of x̄. Therefore, in our example, the statistic x̄ = (1/n) Σᵢ₌₁ⁿ xᵢ is also a sufficient statistic for μ. In fact, any one-to-one function of x̄ will be a sufficient statistic for μ.
Example 3
Suppose that x₁, x₂, x₃, …, xₙ form a random sample from the distribution with p.d.f f(x; θ) = θx^(θ−1), 0 < x < 1, where θ > 0 is unknown. Show that t = ∏ᵢ₌₁ⁿ xᵢ is sufficient for θ.
Solution

$$L(X;\theta) = \prod_{i=1}^{n} \theta x_i^{\theta-1} = \theta^{n} \exp\left\{(\theta-1)\sum_{i=1}^{n} \ln x_i\right\} = \theta^{n} \exp\left\{(\theta-1)\ln \prod_{i=1}^{n} x_i\right\}$$

Hence L(X; θ) has been expressed as the product of h(x) = 1 and g(t; θ) = θⁿ exp{(θ−1) ln t}, so t = ∏ᵢ₌₁ⁿ xᵢ is sufficient for θ.
Since the conditional distribution of the random sample x₁, x₂, x₃, …, xₙ given T = t does not depend on θ, the conditional distribution of any other statistic, say S = u(x₁, x₂, x₃, …, xₙ), given T = t also does not depend on θ.
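Because the likelihood θⁿ exp{(θ−1)Σ ln xᵢ} involves the data only through t = ∏xᵢ, two samples with equal products must give identical likelihoods at every θ. The sketch below (my illustration; the sample values are arbitrary points in (0, 1)) confirms this.

```python
# Sketch: the likelihood  theta^n * exp((theta-1) * sum(log x_i))  depends on
# the sample only through t = prod(x_i); equal products => equal likelihoods.
import math

def likelihood(xs, theta):
    n = len(xs)
    return theta**n * math.exp((theta - 1) * sum(math.log(x) for x in xs))

a = [0.2, 0.5, 0.9]      # product = 0.09
b = [0.3, 0.6, 0.5]      # different values, same product = 0.09
for theta in (0.5, 1.0, 3.0):
    assert abs(likelihood(a, theta) - likelihood(b, theta)) < 1e-12
```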
Proof

$$L(X;\theta) = h(x)\, g(t;\theta)$$

Let Z = aT + b with a ≠ 0, so that T = (Z − b)/a. Then

$$L(X;\theta) = g\left(\frac{Z-b}{a};\theta\right) h(x)$$

Since the first factor on the right-hand side of this equation is a function of Z and θ, while the second factor does not depend on θ, the factorization criterion implies that Z = aT + b is also a sufficient statistic for θ. Conversely,

$$L(X;\theta) = g(Z;\theta)\, h(x) = g(aT+b;\theta)\, h(x)$$

which implies that T is a sufficient statistic for θ.
From this result we deduce that sufficient statistics for θ are not unique: any one-to-one function of a sufficient statistic is itself sufficient.
Example 4
Let x₁, x₂, x₃, …, xₙ denote a random sample from a Poisson distribution with mean λ > 0. Then the joint p.d.f of x₁, x₂, x₃, …, xₙ is

$$L(X;\lambda) = \prod_{i=1}^{n} \frac{e^{-\lambda}\lambda^{x_i}}{x_i!} = \frac{e^{-n\lambda}\lambda^{\sum x_i}}{\prod_{i=1}^{n} x_i!}$$

By the factorization criterion, T = Σᵢ₌₁ⁿ xᵢ is a sufficient statistic for λ, where

$$h(x) = \frac{1}{\prod_{i=1}^{n} x_i!} \qquad \text{and} \qquad g(t;\lambda) = e^{-n\lambda}\lambda^{t}$$
Any function of T with a single-valued inverse will also be sufficient for λ. For example, x̄ = T/n, T², log T and 1/T are all sufficient statistics for λ.