Statistical Estimation
7.7: Properties of Estimators II
(From “Probability & Statistics with Applications to Computing” by Alex Tsun)
We’ll discuss even more desirable properties of estimators. Last time we talked about bias, variance, and
MSE. Bias measured whether or not, in expectation, our estimator was equal to the true value of θ. MSE
measured the expected squared difference between our estimator and the true value of θ. If our estimator
was unbiased, then the MSE of our estimator was precisely the variance.
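As a quick sanity check on that last claim (this sketch is mine, not from the text), the following NumPy snippet estimates the bias, variance, and MSE of an estimator by simulation, using $\hat{\theta} = 2\bar{x}$ for $\text{Unif}(0, \theta)$ (which reappears in the next example) as a concrete stand-in; the particular $\theta$, sample size, and trial count are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
theta, n, trials = 5.0, 50, 100_000

# Each row is one simulated dataset of size n from Unif(0, theta);
# the estimator is 2 times the sample mean of each row.
samples = rng.uniform(0, theta, size=(trials, n))
theta_hat = 2 * samples.mean(axis=1)

bias = theta_hat.mean() - theta
variance = theta_hat.var()
mse = np.mean((theta_hat - theta) ** 2)

print(f"bias ≈ {bias:.4f}, variance ≈ {variance:.4f}, MSE ≈ {mse:.4f}")
print(f"bias^2 + variance ≈ {bias ** 2 + variance:.4f} (matches MSE, and bias ≈ 0)")
```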
7.7.1 Consistency
Example(s)
Recall that, if $x_1, \dots, x_n$ are iid realizations from (continuous) $\text{Unif}(0, \theta)$, then
$$\hat{\theta}_n = \hat{\theta}_{n,\text{MoM}} = 2 \cdot \frac{1}{n} \sum_{i=1}^n x_i$$
Show that $\hat{\theta}_n$ is a consistent estimator of $\theta$; that is, show that for any $\varepsilon > 0$, $P(|\hat{\theta}_n - \theta| > \varepsilon) \to 0$ as $n \to \infty$.
Solution
Since $\hat{\theta}_n$ is unbiased, we have that
$$P\left(|\hat{\theta}_n - \theta| > \varepsilon\right) = P\left(|\hat{\theta}_n - E[\hat{\theta}_n]| > \varepsilon\right)$$
because we can replace $\theta$ with the expected value of the estimator. Now, we can apply Chebyshev's inequality (6.1) to see that
$$P\left(|\hat{\theta}_n - E[\hat{\theta}_n]| > \varepsilon\right) \le \frac{\text{Var}(\hat{\theta}_n)}{\varepsilon^2}$$
Now, we can take the $2^2$ out of the estimator's expression, and we are left only with the variance of the sample mean, which is always just $\frac{\sigma^2}{n} = \frac{\text{Var}(x_i)}{n}$:
$$P\left(|\hat{\theta}_n - E[\hat{\theta}_n]| > \varepsilon\right) \le \frac{\text{Var}(\hat{\theta}_n)}{\varepsilon^2} = \frac{2^2 \, \text{Var}\left(\frac{1}{n}\sum_{i=1}^n x_i\right)}{\varepsilon^2} = \frac{4 \cdot \text{Var}(x_i)/n}{\varepsilon^2}$$
Since $\text{Var}(x_i)$ is a fixed constant, this bound goes to $0$ as $n \to \infty$, so $P(|\hat{\theta}_n - \theta| > \varepsilon) \to 0$ and $\hat{\theta}_n$ is consistent.
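To make this concrete, here is a small simulation sketch (mine, not from the text) that compares the empirical probability $P(|\hat{\theta}_n - \theta| > \varepsilon)$ for the MoM estimator against the Chebyshev bound $4 \cdot \text{Var}(x_i)/(n\varepsilon^2)$ derived above, using the fact that $\text{Var}(x_i) = \theta^2/12$ for $\text{Unif}(0, \theta)$. The values of $\theta$, $\varepsilon$, and the sample sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
theta, eps, trials = 5.0, 0.5, 20_000
var_xi = theta ** 2 / 12          # Var(x_i) for Unif(0, theta)

for n in [10, 50, 200, 1000]:
    x = rng.uniform(0, theta, size=(trials, n))
    theta_hat = 2 * x.mean(axis=1)                    # MoM estimate per trial
    empirical = np.mean(np.abs(theta_hat - theta) > eps)
    chebyshev = 4 * var_xi / (n * eps ** 2)           # bound from the derivation
    # For small n the bound can exceed 1, in which case it is trivially true.
    print(f"n={n:>5}: empirical ≈ {empirical:.4f}   Chebyshev bound = {chebyshev:.4f}")
```

Both columns shrink toward $0$ as $n$ grows, which is exactly the consistency statement.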
Example(s)
Recall that, if $x_1, \dots, x_n$ are iid realizations from (continuous) $\text{Unif}(0, \theta)$, then the maximum likelihood estimator is
$$\hat{\theta}_n = \hat{\theta}_{n,\text{MLE}} = \max\{x_1, \dots, x_n\}$$
Show that $\hat{\theta}_n$ is a consistent estimator of $\theta$.
Solution
In this case, we unfortunately cannot use Chebyshev's inequality, because the maximum likelihood estimator is not unbiased. The CDF of $\hat{\theta}_n$ is
$$F_{\hat{\theta}_n}(t) = P\left(\hat{\theta}_n \le t\right)$$
which is the probability that each individual sample is at most $t$, because only in that case will the max be at most $t$; since the samples are independent, we can say
$$P\left(\hat{\theta}_n \le t\right) = P(X_1 \le t)\, P(X_2 \le t) \cdots P(X_n \le t)$$
This is just the CDF of $X_i$ raised to the $n$-th power, where the CDF of $\text{Unif}(0, \theta)$ is just $\frac{t}{\theta}$ on $[0, \theta]$ (see the distribution sheet):
$$F_{\hat{\theta}_n}(t) = F_X(t)^n = \begin{cases} 0, & t < 0 \\ \left(\frac{t}{\theta}\right)^n, & 0 \le t \le \theta \\ 1, & t > \theta \end{cases}$$
There are two ways the absolute difference from before can be greater than $\varepsilon$:
$$P\left(|\hat{\theta}_n - \theta| > \varepsilon\right) = P\left(\hat{\theta}_n > \theta + \varepsilon\right) + P\left(\hat{\theta}_n < \theta - \varepsilon\right)$$
The first term is $0$, because there's no way our estimator is greater than $\theta + \varepsilon$: it's never even greater than $\theta$ by definition (the samples are between $0$ and $\theta$, so there's no way the max of the samples exceeds $\theta$). So now we can just use the CDF on the remaining term and plug in for $t$:
$$P\left(\hat{\theta}_n > \theta + \varepsilon\right) + P\left(\hat{\theta}_n < \theta - \varepsilon\right) = P\left(\hat{\theta}_n < \theta - \varepsilon\right) = \begin{cases} \left(\frac{\theta - \varepsilon}{\theta}\right)^n, & \varepsilon < \theta \\ 0, & \varepsilon \ge \theta \end{cases}$$
We can assume that $\varepsilon < \theta$, because we really only care about very small $\varepsilon$, so we have that
$$P\left(|\hat{\theta}_n - \theta| > \varepsilon\right) = \left(\frac{\theta - \varepsilon}{\theta}\right)^n$$
Thus, when we take the limit as $n$ approaches infinity, the quantity in parentheses is a number less than $1$ raised to the $n$-th power, so it goes to $0$:
$$\lim_{n \to \infty} P\left(|\hat{\theta}_n - \theta| > \varepsilon\right) = 0$$
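The closed form $\left(\frac{\theta - \varepsilon}{\theta}\right)^n$ is easy to check by simulation. Here is a minimal sketch (mine, not from the text) that estimates $P(|\hat{\theta}_n - \theta| > \varepsilon)$ for the max estimator by Monte Carlo and compares it to the formula; the choices of $\theta$, $\varepsilon$, and $n$ are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
theta, eps, trials = 5.0, 0.5, 100_000

for n in [5, 10, 20, 50]:
    x = rng.uniform(0, theta, size=(trials, n))
    mle = x.max(axis=1)                               # MLE: max of each sample
    empirical = np.mean(np.abs(mle - theta) > eps)
    exact = ((theta - eps) / theta) ** n              # the derived closed form
    print(f"n={n:>3}: simulated ≈ {empirical:.4f}   exact = {exact:.4f}")
```

The two columns agree up to Monte Carlo noise, and both decay to $0$ as $n$ grows.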
Now we’ve seen that, even though the MLE and MoM estimators of θ given iid samples from Unif(0, θ) are
different, they are both consistent! That means, as n → ∞, they will both converge to the true parameter
θ. This is clearly a good property of an estimator.
You may be wondering, what’s the difference between consistency and unbiasedness? I, for one, was very
confused about the difference for a while as well. There is, in fact, a subtle difference, which we’ll see by
comparing estimators for θ in the continuous Unif(0, θ) distribution.
1. For instance, an unbiased and consistent estimator was the MoM for the uniform distribution: $\hat{\theta}_{n,\text{MoM}} = 2\bar{x}$. We proved it was unbiased in 7.6, meaning it is correct in expectation. It converges to the true parameter (consistent) since its variance goes to $0$.
2. However, if you ignore all the samples except the first one and just multiply it by $2$, $\hat{\theta} = 2X_1$, it is unbiased (as $E[2X_1] = 2 \cdot \frac{\theta}{2} = \theta$), but it's not consistent; our estimator doesn't get better with more samples because we're not using all $n$ of them. Consistency requires that as we get more samples, we approach the true parameter.
3. Biased but consistent, on the other hand, was the MLE estimator. We showed its expectation was $\frac{n}{n+1}\theta$, which is actually "asymptotically unbiased" since $E[\hat{\theta}_{n,\text{MLE}}] = \frac{n}{n+1}\theta \to \theta$ as $n \to \infty$. It does get better and better as $n \to \infty$.
4. Neither unbiased nor consistent would be just some random expression, such as $\hat{\theta} = \frac{1}{X_1^2}$.

A short simulation comparing these four estimators appears below.
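The following sketch (my own, not from the book) simulates all four estimators on $\text{Unif}(0, \theta)$ data at increasing sample sizes and prints the Monte Carlo mean and standard deviation of each. The first concentrates on $\theta$, the second stays centered on $\theta$ but never concentrates, the third concentrates on $\theta$ from below, and the fourth does neither (its expectation is not even finite, so its printed mean is huge and unstable).

```python
import numpy as np

rng = np.random.default_rng(3)
theta, trials = 5.0, 20_000

# Each estimator maps a (trials, n) matrix of Unif(0, theta) samples
# to one estimate per row (per simulated dataset).
estimators = {
    "2 * sample mean (unbiased, consistent)": lambda x: 2 * x.mean(axis=1),
    "2 * X1 (unbiased, not consistent)": lambda x: 2 * x[:, 0],
    "max (biased, consistent)": lambda x: x.max(axis=1),
    "1 / X1^2 (neither)": lambda x: 1 / x[:, 0] ** 2,
}

for n in [10, 100, 1000]:
    x = rng.uniform(0, theta, size=(trials, n))
    print(f"n = {n}")
    for name, estimator in estimators.items():
        est = estimator(x)
        print(f"  {name:<40} mean ≈ {est.mean():10.3f}   sd ≈ {est.std():10.3f}")
```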
7.7.3 Efficiency
To talk about our last topic, efficiency, we first have to define Fisher information. Efficiency says that our estimator has as low a variance as possible. This property, combined with consistency and unbiasedness, means that our estimator is on target (unbiased), converges to the true parameter (consistent), and does so as fast as possible (efficient).
Let $\mathbf{x} = (x_1, \dots, x_n)$ be iid realizations from probability mass function $p_X(t \mid \theta)$ (if $X$ is discrete), or from density function $f_X(t \mid \theta)$ (if $X$ is continuous), where $\theta$ is a parameter (or vector of parameters).
The Fisher information of the parameter $\theta$ is defined to be:
$$I(\theta) = E\left[\left(\frac{\partial \ln L(\mathbf{x} \mid \theta)}{\partial \theta}\right)^2\right] = -E\left[\frac{\partial^2 \ln L(\mathbf{x} \mid \theta)}{\partial \theta^2}\right]$$
where $L(\mathbf{x} \mid \theta)$ denotes the likelihood of the data given parameter $\theta$ (defined in 7.1). From Wikipedia, it "is a way of measuring the amount of information that an observable random variable $X$ carries about an unknown parameter $\theta$ upon which the probability of $X$ depends".
That written definition is definitely a mouthful, but if you stop and parse it, you'll see it's not too bad to compute. We always take the second derivative of the log-likelihood to confirm that our MLE was a maximizer; now all you have to do is take its expectation (and flip the sign) to get the Fisher information. There's no way, though, that I can interpret the negative expected value of the second derivative of the log-likelihood; it's just too gross and messy.
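Even if the quantity is hard to interpret, it is easy to check numerically. The sketch below (mine, not from the text) Monte Carlo estimates both forms of the Fisher information for the $\text{Poi}(\theta)$ model used later in this section, where $\frac{\partial}{\partial \theta} \ln L(\mathbf{x} \mid \theta) = \sum_{i=1}^n \frac{x_i}{\theta} - n$ and $\frac{\partial^2}{\partial \theta^2} \ln L(\mathbf{x} \mid \theta) = -\sum_{i=1}^n \frac{x_i}{\theta^2}$, and compares them to the closed form $n/\theta$ derived below.

```python
import numpy as np

rng = np.random.default_rng(4)
theta, n, trials = 3.0, 20, 200_000

# Each row is one Poisson(theta) dataset of size n.
x = rng.poisson(theta, size=(trials, n))

# For Poi(theta): d/dθ ln L = Σ x_i/θ - n,  d²/dθ² ln L = -Σ x_i/θ².
score = x.sum(axis=1) / theta - n
second_derivative = -x.sum(axis=1) / theta ** 2

print("E[(d lnL/dθ)^2]  ≈", np.mean(score ** 2))
print("-E[d² lnL/dθ²]   ≈", -np.mean(second_derivative))
print("closed form n/θ  =", n / theta)
```

All three numbers should agree up to Monte Carlo noise.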
Why did we define that nasty Fisher information? (Actually, it's much worse when $\theta$ is a vector instead of a single number, as the second derivative becomes a matrix of second partial derivatives.) It would be great if the mean squared error of an estimator $\hat{\theta}$ were as low as possible. The Cramer-Rao Lower Bound actually gives a lower bound on the variance of any unbiased estimator $\hat{\theta}$ for $\theta$. That is, if $\hat{\theta}$ is any unbiased estimator for $\theta$, there is a minimum possible variance (and variance = MSE for unbiased estimators). If your estimator achieves this lowest possible variance, it is said to be efficient. This is also a highly desirable property of estimators. The bound is called the Cramer-Rao Lower Bound.
Definition 7.7.3: Cramer-Rao Lower Bound (CRLB)
Let $\mathbf{x} = (x_1, \dots, x_n)$ be iid realizations from probability mass function $p_X(t \mid \theta)$ (if $X$ is discrete), or from density function $f_X(t \mid \theta)$ (if $X$ is continuous), where $\theta$ is a parameter (or vector of parameters).
If θ̂ is an unbiased estimator for θ, then
$$\text{MSE}(\hat{\theta}, \theta) = \text{Var}(\hat{\theta}) \ge \frac{1}{I(\theta)}$$
where $I(\theta)$ is the Fisher information defined earlier. What this is saying is: for any unbiased estimator $\hat{\theta}$ for $\theta$, the variance (= MSE) is at least $\frac{1}{I(\theta)}$. If we achieve this lower bound, meaning our variance is exactly equal to $\frac{1}{I(\theta)}$, then we have the best variance possible for our estimate. That is, we have the minimum variance unbiased estimator (MVUE) for $\theta$.
Since we want to find the lowest variance possible, we can look at this through the frame of finding the
estimator’s efficiency.
$$e(\hat{\theta}, \theta) = \frac{I(\theta)^{-1}}{\text{Var}(\hat{\theta})} \le 1$$
This will always be between 0 and 1: if your variance is equal to the CRLB, the efficiency equals 1, and a larger variance results in a smaller efficiency. We want our efficiency to be as high as possible (1).
An unbiased estimator is said to be efficient if it achieves the CRLB - meaning e(θ̂, θ) = 1. That is,
it could not possibly have a lower variance. Again, the CRLB is not guaranteed for biased estimators.
That was super complicated - let’s see how to verify the MLE of Poi(θ) is efficient. It looks scary - but it’s
just messy algebra!
Example(s)
Recall that, if $x_1, \dots, x_n$ are iid realizations from $X \sim \text{Poi}(\theta)$ (recall $E[X] = \text{Var}(X) = \theta$), then
$$\hat{\theta} = \hat{\theta}_{\text{MLE}} = \hat{\theta}_{\text{MoM}} = \frac{1}{n}\sum_{i=1}^n x_i$$
Is θ̂ efficient?
Solution
First, you have to check that it's unbiased, as the CRLB only holds for unbiased estimators...
$$E[\hat{\theta}] = E\left[\frac{1}{n}\sum_{i=1}^n x_i\right] = E[x_i] = \theta$$
...which it is! Otherwise, we wouldn't be able to use this bound. We also need to compute the variance. The variance of the sample mean (the estimator) is just $\frac{\sigma^2}{n}$, and the variance of a Poisson is just $\theta$:
$$\text{Var}(\hat{\theta}) = \text{Var}\left(\frac{1}{n}\sum_{i=1}^n x_i\right) = \frac{\text{Var}(x_i)}{n} = \frac{\theta}{n}$$
Then, we’re going to compute that weird Fisher Information, which gives us the CRLB, and see if our
variance matches. Remember, we take the second derivative of the log-likelihood, which we did earlier in 7.2
6 Probability & Statistics with Applications to Computing 7.7
Then, we need to take the expected value of this. It turns out, with some algebra, you get − nθ .
$$E\left[\frac{\partial^2 \ln L(\mathbf{x} \mid \theta)}{\partial \theta^2}\right] = E\left[-\sum_{i=1}^n \frac{x_i}{\theta^2}\right] = -\frac{1}{\theta^2}\sum_{i=1}^n E[x_i] = -\frac{1}{\theta^2}\, n\theta = -\frac{n}{\theta}$$
Our Fisher information was the negative expected value of the second derivative of the log-likelihood, so we just flip the sign to get $\frac{n}{\theta}$:
$$I(\theta) = -E\left[\frac{\partial^2 \ln L(\mathbf{x} \mid \theta)}{\partial \theta^2}\right] = \frac{n}{\theta}$$
Finally, our efficiency is the inverse of the Fisher Information over the variance:
$$e(\hat{\theta}, \theta) = \frac{I(\theta)^{-1}}{\text{Var}(\hat{\theta})} = \frac{(n/\theta)^{-1}}{\theta/n} = \frac{\theta/n}{\theta/n} = 1$$
Thus, we’ve shown that, since our efficiency is 1, our estimator is efficient. That is, it has the best pos-
sible variance among all unbiased estimators of θ. This, again, is a really good property that we want to have.
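As a final numerical sanity check (this sketch is mine, not part of the text), we can estimate $\text{Var}(\hat{\theta})$ for the Poisson sample mean by simulation, compute the CRLB $1/I(\theta) = \theta/n$ from the Fisher information above, and form the efficiency ratio, which should come out near 1. The choices of $\theta$, $n$, and the number of trials are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(5)
theta, n, trials = 3.0, 40, 200_000

# Sample mean of each simulated Poisson(theta) dataset of size n.
theta_hat = rng.poisson(theta, size=(trials, n)).mean(axis=1)

var_hat = theta_hat.var()       # empirical Var(θ̂)
crlb = theta / n                # 1/I(θ), since I(θ) = n/θ
efficiency = crlb / var_hat     # e(θ̂, θ) = I(θ)^{-1} / Var(θ̂)

print(f"empirical Var(θ̂) ≈ {var_hat:.5f}")
print(f"CRLB θ/n          = {crlb:.5f}")
print(f"efficiency        ≈ {efficiency:.3f}   (should be ≈ 1)")
```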
To reiterate, this means we cannot possibly do better in terms of mean squared error: our bias is 0, and our variance is as low as it can possibly go. The sample mean is unequivocally the best estimator of the parameter of a Poisson distribution in terms of efficiency, bias, and MSE (it also happens to be consistent, so it has a lot of good properties).
As you can see, showing efficiency is just a bunch of tedious calculations!