
Lecture 7

Maximum Likelihood Estimation.

1 MLE

Let f1(·|θ) with θ ∈ Θ be a parametric family. Let X = (X1, ..., Xn) be a random sample from the distribution f1(·|θ0) with θ0 ∈ Θ. Then the joint pdf is f(x|θ) = ∏_{i=1}^n f1(xi|θ), where x = (x1, ..., xn). The log-likelihood is ℓ(θ|x) = ∑_{i=1}^n log f1(xi|θ). The maximum likelihood estimator is, by definition,

θ̂_ML = arg max_{θ∈Θ} ℓ(θ|x).

The FOC is

(1/n) ∑_{i=1}^n ∂ℓ1(θ̂_ML|xi)/∂θ = 0.

Note that the first information equality is E[∂ℓ1(θ0|Xi)/∂θ] = 0. Thus the MLE is the method of moments estimator corresponding to the first information equality, so we can expect the MLE to be consistent. Indeed, the theorem below gives the consistency result for the MLE:

Theorem 1 (MLE consistency). In the setting above, assume that (1) θ0 is identifiable, i.e. for any θ ≠ θ0 there exists x such that f(x|θ) ≠ f(x|θ0), (2) the support of f(·|θ) does not depend on θ, and (3) θ0 is an interior point of the parameter space Θ. Then θ̂_ML →p θ0.
The proof of MLE consistency will be given in 14.382 and 14.385. Roughly, the proof shows that the function g(θ) = E_{θ0}[ℓ1(θ|Xi)] (here Xi ∼ f1(·|θ0)) is maximized at θ = θ0 and that the random process (1/n)ℓ(θ|X) converges to g(θ) uniformly in probability. It then argues that the maximizer θ̂_ML of this process converges to θ0.
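For intuition, here is a quick sketch of why g(θ) is maximized at θ0 (assuming the expectations below exist and using the support condition (2)). By Jensen's inequality,

g(θ) − g(θ0) = E_{θ0}[log(f1(Xi|θ)/f1(Xi|θ0))] ≤ log E_{θ0}[f1(Xi|θ)/f1(Xi|θ0)] = log ∫ f1(x|θ) dx = log 1 = 0,

and the identifiability condition (1) is what rules out maximizers other than θ0.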


Once we know that the estimator is consistent, we can think about the asymptotic distribution of the

estimator. The next theorem gives the asymptotic distribution of MLE:

Theorem 2 (MLE asymptotic normality). In the setting above, assume that conditions (1)-(3) in the MLE consistency theorem hold. In addition, assume that (4) f1(x|θ) is thrice differentiable with respect to θ and we can interchange integration with respect to x and differentiation with respect to θ, and (5) |∂³ log f1(x|θ)/∂θ³| ≤ M(x) with E[M(Xi)] < ∞. Then

√n(θ̂_ML − θ0) ⇒ N(0, I1(θ0)⁻¹).

Proof. This is a sketch of the proof, as it omits an important step. By definition, ∂ℓ(θ̂_ML|x)/∂θ = 0. By Taylor's theorem with a remainder, there is some random variable θ̃ with value between θ0 and θ̂_ML such that

∂ℓ(θ̂_ML|X)/∂θ = ∂ℓ(θ0|X)/∂θ + (∂²ℓ(θ̃|X)/∂θ²)(θ̂_ML − θ0).

So,

√n(θ̂_ML − θ0) = [−(1/√n) ∂ℓ(θ0|X)/∂θ] / [(1/n) ∂²ℓ(θ̃|X)/∂θ²].

Since θ̂_ML →p θ0 and θ̃ is between θ0 and θ̂_ML, θ̃ →p θ0 as well. From θ̃ →p θ0, one can prove that

(1/n) ∂²ℓ(θ̃|X)/∂θ² − (1/n) ∂²ℓ(θ0|X)/∂θ² = op(1).

We will not discuss this result here since it requires knowledge of the concept of asymptotic equicontinuity

which we do not cover in this class. You will learn it in 14.385. Note, however, that this result does not

follow from the Continuous mapping theorem since we have a sequence of random functions ℓ(θ|X) instead

of just one non-random function. Suppose we believe in this result. Then, by the Law of large numbers,

(1/n) ∂²ℓ(θ0|X)/∂θ² = (1/n) ∑_{i=1}^n ∂² log f1(Xi|θ0)/∂θ² →p E[∂² log f1(Xi|θ0)/∂θ²] = −I1(θ0).

Next, by the first information equality, E[∂ log f1(Xi|θ0)/∂θ] = 0, while Var[∂ log f1(Xi|θ0)/∂θ] = I1(θ0). Thus, by the central limit theorem,

(1/√n) ∂ℓ(θ0|X)/∂θ = (1/√n) ∑_{i=1}^n ∂ log f1(Xi|θ0)/∂θ ⇒ N(0, I1(θ0)).

Finally, by Slutsky's theorem,

√n(θ̂_ML − θ0) ⇒ N(0, I1(θ0)⁻¹).

One interpretation of the MLE asymptotics is that the MLE is asymptotically efficient (it hits the Cramer-Rao bound in very large samples).

Example. Let X1, ..., Xn be a random sample from a distribution with pdf f(x|λ) = λ exp(−λx), x ≥ 0. This distribution is called exponential. Its log-likelihood for one draw is ℓ1(λ|xi) = log λ − λxi. So ∂ℓ1(λ|xi)/∂λ = 1/λ − xi and ∂²ℓ1(λ|xi)/∂λ² = −1/λ². So the Fisher information is I1(λ) = 1/λ². Let us find the MLE for λ. The log-likelihood for the whole sample is ℓ(λ|x) = n log λ − λ ∑_{i=1}^n xi. The FOC is n/λ̂_ML − ∑_{i=1}^n xi = 0. So λ̂_ML = 1/X̄n. Its asymptotic distribution is given by √n(λ̂_ML − λ) ⇒ N(0, λ²).
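This approximation is easy to check by simulation. Below is a minimal Python sketch (the values of λ, n, and the number of replications are arbitrary illustrative choices) that compares the simulated variance of √n(λ̂_ML − λ) with the theoretical value λ²:

import numpy as np

rng = np.random.default_rng(0)
lam, n, B = 2.0, 500, 5000                      # true lambda, sample size, replications

# MLE in the exponential model is 1 / sample mean
lam_hat = np.array([1.0 / rng.exponential(scale=1.0 / lam, size=n).mean() for _ in range(B)])

z = np.sqrt(n) * (lam_hat - lam)                # should be approximately N(0, lambda^2)
print("simulated variance:", z.var())
print("theoretical variance lambda^2:", lam**2)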

2 Inference using MLE
We will have a longer discussion about how to estimate the asymptotic variance of the MLE, I1(θ0)⁻¹, later when we discuss asymptotic tests. Right now I want to mention several suggestions.

First of all, if I1(θ) is a continuous function of θ (which is needed for the asymptotic results), then given that θ̂_ML is consistent for θ0, the quantity I1(θ̂_ML)⁻¹ is consistent for I1(θ0)⁻¹.

Second, by the definition of Fisher information, it equals the expectation of either the negative second derivative of the log-likelihood or the squared score. Instead of taking the expectation, one may approximate it by averaging. For example,

Î = −(1/n) ∑_{i=1}^n ∂²ℓ1(θ̂_ML|Xi)/∂θ²

will be a consistent estimator of the Fisher information.
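For the exponential example above, both the plug-in and the averaging ideas take only a few lines of Python; this is only a sketch, where x is assumed to be a NumPy array containing the sample:

import numpy as np

def fisher_estimates(x):
    lam_hat = 1.0 / x.mean()                                 # MLE of lambda
    I_plugin = 1.0 / lam_hat**2                              # plug-in: I1(lam_hat) = 1 / lam_hat^2
    I_hessian = np.mean(np.full_like(x, 1.0 / lam_hat**2))   # average of -d2 l1 / d lam2 (constant in x here)
    I_score = np.mean((1.0 / lam_hat - x)**2)                # average of squared scores at lam_hat
    return I_plugin, I_hessian, I_score

In this particular model the second derivative of ℓ1 does not depend on the data, so the plug-in and Hessian-based averages coincide exactly, while the squared-score average differs slightly in finite samples.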

The third idea to be used in this context is the parametric bootstrap. Assume θ̂_ML is the MLE we obtained from our sample of size n. For b = 1, ..., B do the following:

• Simulate a sample X*_b = (X*_{1b}, ..., X*_{nb}) as i.i.d. draws from f1(·|θ̂_ML) (that is, assuming that θ̂_ML is the true parameter value).

• Find the MLE using sample X*_b; denote it θ̂*_b.

Calculate the sample variance of (θ̂*_1, ..., θ̂*_B); it gives the bootstrap approximation to (n I1(θ0))⁻¹. You may also do bootstrap bias correction using a similar procedure.
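As an illustration, here is a minimal Python sketch of this procedure for the exponential example (x is assumed to be the observed sample as a NumPy array, and B is an arbitrary number of replications):

import numpy as np

def parametric_bootstrap_var(x, B=1000, seed=0):
    rng = np.random.default_rng(seed)
    n = x.size
    lam_hat = 1.0 / x.mean()                                   # MLE from the original sample
    lam_star = np.empty(B)
    for b in range(B):
        x_star = rng.exponential(scale=1.0 / lam_hat, size=n)  # i.i.d. draws from f1(.|lam_hat)
        lam_star[b] = 1.0 / x_star.mean()                      # MLE on the bootstrap sample
    return lam_star.var(ddof=1)                                # approximates (n I1(lambda_0))^{-1} = lambda_0^2 / n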

3 When MLE asymptotic theory fails us...

Example. A word of caution: for asymptotic normality of the MLE, we should have common support. Let us see what might happen otherwise. Let X1, ..., Xn be a random sample from U[0, θ]. Then θ̂_ML = X(n). So √n(θ̂_ML − θ) is always nonpositive, and hence it does not converge to a mean-zero normal distribution. In fact, E[X(n)] = (n/(n+1))θ and V(X(n)) = θ²n/((n+1)²(n+2)) ≈ θ²/n². On the other hand, if the theorem worked, we would have V(X(n)) ≈ 1/(n I(θ)). The MLE happens to be super-consistent here, meaning it converges to the true value at a faster speed than the regular parametric speed of 1/√n.
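A quick simulation illustrates the 1/n² rate of the variance (the value of θ, the sample sizes, and the number of replications are arbitrary illustrative choices):

import numpy as np

rng = np.random.default_rng(0)
theta, B = 1.0, 20000

for n in (50, 100, 200):
    theta_hat = rng.uniform(0.0, theta, size=(B, n)).max(axis=1)   # MLE = X_(n)
    print(n, theta_hat.var(), theta**2 / n**2)                     # simulated variance vs theta^2 / n^2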

Example. Now let us consider what might happen if the true parameter value θ0 were on the boundary of Θ. Let X1, ..., Xn be a random sample from the distribution N(µ, 1) with µ ≥ 0. As an exercise, check that µ̂_ML = X̄n if X̄n ≥ 0 and µ̂_ML = 0 otherwise. Suppose that µ0 = 0. Then √n(µ̂_ML − µ0) is always nonnegative, so it does not converge to a mean-zero normal distribution.
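A small simulation (again with arbitrary illustrative choices of n and the number of replications) shows what the limit looks like at the boundary: roughly half of the mass of √n(µ̂_ML − µ0) sits exactly at zero, and the rest behaves like the positive part of a standard normal:

import numpy as np

rng = np.random.default_rng(0)
n, B = 200, 20000
xbar = rng.normal(0.0, 1.0, size=(B, n)).mean(axis=1)
mu_hat = np.maximum(xbar, 0.0)                        # MLE under the constraint mu >= 0
z = np.sqrt(n) * mu_hat
print("fraction exactly zero:", np.mean(z == 0.0))    # about 0.5
print("mean of the positive part:", z[z > 0].mean())  # about E[Z | Z > 0] ~ 0.8 for Z ~ N(0, 1)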

Example. Finally, note that it is implicitly assumed both in the consistency and asymptotic normality theorems that the parameter space Θ is fixed, i.e. independent of n. In particular, the number of parameters should not depend on n. Indeed, let

Xi = (X1i, X2i)′ ∼ N((µi, µi)′, σ²I2), where I2 is the 2×2 identity matrix,

for i = 1, ..., n, and let X1, ..., Xn be mutually independent. One can show that if the sample size n increases to infinity, the MLE for σ² is inconsistent in this case, though a consistent estimator for σ² exists.

What is interesting, though we will not show it here, is that the bootstrap does not help in these cases; that is, the bootstrap approximation to the distribution of θ̂_ML is not close to the true finite-sample distribution of θ̂_ML.
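To see the inconsistency in this two-observations-per-mean example, here is a minimal simulation sketch (with arbitrary σ², n, and µi). Profiling out each µi gives an MLE for σ² based on within-pair differences, which converges to σ²/2 rather than σ², while the rescaled version is consistent:

import numpy as np

rng = np.random.default_rng(0)
sigma2, n = 1.0, 100000
mu = rng.normal(size=n)                              # arbitrary individual means mu_i
x1 = mu + rng.normal(scale=np.sqrt(sigma2), size=n)
x2 = mu + rng.normal(scale=np.sqrt(sigma2), size=n)

sigma2_mle = np.mean((x1 - x2) ** 2) / 4.0           # MLE after profiling out mu_i; tends to sigma2 / 2
sigma2_fix = np.mean((x1 - x2) ** 2) / 2.0           # bias-corrected, consistent estimator
print(sigma2_mle, sigma2_fix)                        # roughly 0.5 and 1.0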

4 Pseudo-MLE

Let us have a sample X = (X1, ..., Xn), i.i.d. from some distribution. We do not know what distribution it is; say it has pdf g(·). But we wrongly assumed a specific parametric family, that is, we assumed Xi ∼ f1(xi|θ). What would happen if we do MLE? It turns out that the MLE will be estimating a pseudo-true parameter value θ0 which minimizes, in some sense, the distance between g(·) and the family f1(·|θ). In particular,

θ0 = arg max_θ ∫ log[f1(xi|θ)] g(xi) dxi = arg max_θ E[log f1(Xi|θ)].

Parameter θ0 may or may not be of interest. Under some regularity conditions θ̂_ML →p θ0, and the logic of the proof of the asymptotic normality theorem mostly goes through. However, the information equality would fail. Define

Σ1 = E[(∂ log f1(Xi|θ0)/∂θ)²],

Σ2 = −E[∂² log f1(Xi|θ0)/∂θ²],
where the expectations in both cases are taken assuming that Xi ∼ g(·). If g is not in the parametric family, then in general Σ1 ≠ Σ2. But using the logic of the proof, we can show that

√n(θ̂_ML − θ0) ⇒ N(0, Σ2⁻¹Σ1Σ2⁻¹).

This asymptotic variance Σ2⁻¹Σ1Σ2⁻¹ is often called White's variance, after White's (1980) paper, and the corresponding standard errors are called White's standard errors.
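For the exponential model used earlier, the sandwich variance can be estimated from sample analogues of Σ1 and Σ2 evaluated at the pseudo-MLE. The following is only a sketch, with x assumed to be a NumPy array of data drawn from some unknown g:

import numpy as np

def sandwich_variance(x):
    lam_hat = 1.0 / x.mean()                      # pseudo-MLE in the exponential model
    score = 1.0 / lam_hat - x                     # d log f1 / d lam evaluated at lam_hat
    sigma1_hat = np.mean(score ** 2)              # sample analogue of Sigma_1
    sigma2_hat = 1.0 / lam_hat ** 2               # sample analogue of Sigma_2 (-d2 log f1 / d lam2 is constant here)
    return sigma1_hat / sigma2_hat ** 2 / x.size  # estimate of Var(lam_hat): Sigma2^{-1} Sigma1 Sigma2^{-1} / n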

MIT OpenCourseWare
https://ocw.mit.edu

14.381 Statistical Method in Economics
Fall 2018

For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms
