Module 4
In the previous modules, we have talked about properties of good estimators. Next, we will discuss
some common methods of finding point estimators of θ. The two most popular frequentist methods of
finding estimators are:
(A) method of moments (MoM), and
(B) maximum likelihood (ML) estimation.
We will now discuss these methods in detail.
Example 1. Let X_1, . . . , X_n be a random sample from Normal(µ, σ²). The MoM estimators of µ and σ² are X̄ and S², respectively.
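The moment equations in Example 1 can be solved explicitly; the following is a short sketch of the standard derivation, writing M_1' and M_2' for the first two sample moments.
\[
M_1' = \bar{X}_n = \mu, \qquad
M_2' = \frac{1}{n}\sum_{i=1}^{n} X_i^2 = \mu^2 + \sigma^2
\;\Longrightarrow\;
\hat{\mu} = \bar{X}_n, \qquad
\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n} X_i^2 - \bar{X}_n^2
             = \frac{1}{n}\sum_{i=1}^{n}\bigl(X_i - \bar{X}_n\bigr)^2 .
\]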
Remark 1. By the Weak (Strong) Law of Large Numbers, the r-th sample moment M_r' = (1/n) Σ_{i=1}^{n} X_i^r converges in probability (almost surely) to the r-th population moment µ_r' = E(X^r). Therefore, if one is interested in the population moments, then MoM provides consistent (strongly consistent) estimators.
Remark 2. However, MoM may yield estimators with sub-optimal sampling properties and, in some cases, even absurd estimators.
Example 2. Let X_1, . . . , X_n be a random sample from Uniform(α, β). The MoM estimators of α and β are T_1(X) and T_2(X), respectively, where
\[
T_1(X) = \bar{X}_n - \sqrt{\frac{3\sum_{i=1}^{n}(X_i - \bar{X}_n)^2}{n}},
\qquad
T_2(X) = \bar{X}_n + \sqrt{\frac{3\sum_{i=1}^{n}(X_i - \bar{X}_n)^2}{n}}.
\]
Observe that neither estimator is a function of the minimal sufficient statistic (X_(1), X_(n)).
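As a quick numerical illustration of Example 2 (the true values of α, β and the sample size below are arbitrary choices for the sketch, not taken from the notes), one can draw a Uniform(α, β) sample and compute T_1(X) and T_2(X) directly:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative true values (not from the notes)
alpha, beta, n = 2.0, 5.0, 500
x = rng.uniform(alpha, beta, size=n)

xbar = x.mean()
m2 = np.mean((x - xbar) ** 2)        # second central sample moment (1/n convention)

# MoM estimates from Example 2: xbar -/+ sqrt(3 * m2)
t1 = xbar - np.sqrt(3.0 * m2)
t2 = xbar + np.sqrt(3.0 * m2)
print(t1, t2)                        # should be close to (alpha, beta) = (2, 5)
```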
Definition 2 (Maximum Likelihood Estimate, MLE). Given a realization x, let θ̂ be the value in Θ that maximizes the likelihood function L(θ | x) with respect to θ; then θ̂ is called the MLE of the parameter θ.
Note that the maximizer θ̂ is nothing but a function of the realization x. Thus we can treat the maximizer of the likelihood function as a statistic or estimator of θ. This estimator is called the Maximum Likelihood (ML) estimator. Notationally, we write θ̂ = θ̂_ML(X).
Example 3. Suppose there are n tosses of a coin, and we know neither the value of n nor the probability of a head (p). However, we know that n is between 3 and 5, and that one of the sides of the coin is twice as heavy as the other (i.e., either p = 2(1 − p) or (1 − p) = 2p). Then what is the MLE of θ = (n, p), given that we observe x heads, x = 1, . . . , 5?
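Because the parameter space in Example 3 is finite (n ∈ {3, 4, 5} and p ∈ {1/3, 2/3}), the MLE can be found by simply evaluating the binomial likelihood at every admissible pair; the sketch below (the function name mle_coin is ours, purely for illustration) does exactly that for each observed x.

```python
from math import comb

def mle_coin(x):
    """Brute-force MLE of theta = (n, p) in Example 3, given x observed heads,
    with n restricted to {3, 4, 5} and p restricted to {1/3, 2/3}."""
    # Only pairs with n >= x have positive likelihood.
    candidates = [(n, p) for n in (3, 4, 5) for p in (1 / 3, 2 / 3) if x <= n]
    # Binomial likelihood L(n, p | x) = C(n, x) * p^x * (1 - p)^(n - x)
    likelihood = lambda n, p: comb(n, x) * p ** x * (1 - p) ** (n - x)
    return max(candidates, key=lambda pair: likelihood(*pair))

for x in range(1, 6):
    print(x, mle_coin(x))
```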
Remark 3. If the likelihood function is differentiable with respect to θ, then one may take the differentiation approach to finding the MLE. In the case of a function L(θ) of several variables θ = (θ_1, . . . , θ_k), if the function is twice continuously differentiable with respect to each θ_j, then a critical point of L(θ | x) can be obtained by equating ∂L(θ | x)/∂θ = 0. Then, to verify that the critical point is a maximizer, one can check whether the Hessian matrix ∂²L(θ | x)/(∂θ ∂θ') is negative definite at the critical point.
Remark 4. It is often convenient to work with the log-likelihood function instead of the likelihood function. As the logarithm is a monotone function, the maximizers of the likelihood and the log-likelihood are the same. The log-likelihood is generally denoted by l(θ; x).
Example 4. Let X_1, . . . , X_n be a random sample from Normal(µ, σ²). Then the MLEs of µ and σ² are X̄_n and S_n², respectively.
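Example 4 can be checked with the differentiation approach of Remark 3; a sketch of the score equations is given below, assuming, as in the usual convention, that S_n² denotes (1/n) Σ_{i=1}^{n} (X_i − X̄_n)².
\[
l(\mu, \sigma^2 \mid x) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2 ,
\]
\[
\frac{\partial l}{\partial \mu} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i - \mu) = 0
\;\Longrightarrow\; \hat{\mu} = \bar{x}_n ,
\qquad
\frac{\partial l}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^{n}(x_i - \mu)^2 = 0
\;\Longrightarrow\; \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x}_n)^2 ,
\]
and the Hessian of l can be verified to be negative definite at (\hat{\mu}, \hat{\sigma}^2).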
Example 5. Let X_1, . . . , X_n be a random sample from Uniform(α, β). Then the MLEs of α and β are X_(1) and X_(n), respectively, since the likelihood L(α, β | x) = (β − α)^{−n} 1{α ≤ x_(1) ≤ x_(n) ≤ β} is maximized by making the interval [α, β] as short as possible while still containing all the observations.
In the above two examples, we have seen that the MLE is a function of a sufficient statistic. This phenomenon is true in general, as stated by the following theorem.
Theorem 1 (Properties of MLE: 1). Let X_1, . . . , X_n be a random sample from some distribution with pdf (or pmf) f_θ, θ ∈ Θ ⊆ R^k, and let T(X) be a sufficient statistic for θ. Then an MLE, if it exists and is unique, is a function of T. If an MLE exists but is not unique, then one can find an MLE which is a function of T. [Proof]
Remark 5. A maximum likelihood estimate may not exist. See the following example.
Example 6. Let X_1, X_2 be a random sample from Bernoulli(θ), θ ∈ (0, 1). Suppose the realization (0, 0) is observed. Then the MLE does not exist: the likelihood L(θ | 0, 0) = (1 − θ)² increases as θ decreases towards 0, but its supremum is not attained on the open interval (0, 1).
Remark 6. Even if MLE exists, it may not be unique. See the following example.
Remark 7. The method of maximum likelihood estimation may produce an absurd (not meaningful)
estimator. In the example
In spite of all the above shortcomings, MLE is by far the most popular and reasonable frequentist method of estimation. The reason is that the MLE possesses a number of desirable properties. We discuss some of them below.
Theorem 2 (Properties of MLE: 2). Suppose the regularity conditions of the CRLB (see Theorem 5 of Module 3) are satisfied, the log-likelihood is twice differentiable, and there exists an unbiased estimator θ̂⋆ of θ whose variance attains the CRLB. Suppose further that the likelihood equation has a unique maximizer θ̂_ML(X). Then θ̂⋆ = θ̂_ML(X). [Proof]
Corollary 1. Theorem 2 implies that if the CRLB is attained by any estimator, then it must be an
MLE. However, the converse is not true, i.e., the variance of an MLE may not attain the CRLB.
2.1 Invariance Property
Let η := Ψ(θ) be any function of θ, and suppose we are interested in the optimal value of η given a sample X_1, . . . , X_n. Let H = {η = Ψ(θ) : θ ∈ Θ} be the set of all possible values of η, and for each η ∈ H, let A_η = {θ ∈ Θ : Ψ(θ) = η}. We are interested in obtaining the value of η for which the likelihood is maximized, i.e., the η for which A_η contains θ̂_ML. If we denote that optimal η by η̂_ML, then
\[
\hat{\eta}_{ML} := \arg\max_{\eta \in H} \sup_{\theta \in A_\eta} L(\theta \mid X)
                 = \arg\max_{\eta \in H} L^\star(\eta \mid X).
\]
The function L^⋆(η | X) = sup_{θ ∈ A_η} L(θ | X) is called the induced likelihood of η, and the maximizer of the induced likelihood is called the MLE of η. The following theorem states that η̂_ML = Ψ(θ̂_ML) for any function Ψ.
Theorem 3 (Properties of MLE: 3, Invariance Property). Let {f_X(·; θ) : θ ∈ Θ} be a family of PDFs (PMFs), and let L(θ | X) be the likelihood function. Suppose Θ ⊆ R^k, k ≥ 1. Let Ψ : Θ → Λ be a mapping of Θ onto Λ, where Λ ⊆ R^p (1 ≤ p ≤ k). If θ̂_ML(X) is an MLE of θ, then Ψ(θ̂_ML(X)) is an MLE of Ψ(θ). [Proof]
Example 8. Let X1 , . . . , Xn be a random sample from Gamma(1, θ) distribution, θ > 0. Find an MLE
of θ.
Example 9. Let X_1, . . . , X_n be a random sample from the Poisson(θ) distribution, θ > 0. Find an MLE of P(X = 0) = exp{−θ}.
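As a numerical sketch of Example 9 (the true θ and the sample size below are arbitrary illustrative choices): since the MLE of θ from a Poisson(θ) sample is X̄_n, the invariance property (Theorem 3) gives exp{−X̄_n} as the MLE of exp{−θ}; the code compares this with a direct grid maximization of the likelihood.

```python
import numpy as np

rng = np.random.default_rng(1)
theta_true, n = 2.0, 400            # illustrative values
x = rng.poisson(theta_true, size=n)

# MLE of theta for a Poisson sample is the sample mean ...
theta_hat = x.mean()
# ... so, by invariance (Theorem 3), the MLE of exp(-theta) is:
eta_hat = np.exp(-theta_hat)

# Direct check: maximize the log-likelihood over a fine grid of theta
grid = np.linspace(0.01, 10, 20000)
loglik = np.sum(x) * np.log(grid) - n * grid   # up to an additive constant
eta_direct = np.exp(-grid[np.argmax(loglik)])

print(eta_hat, eta_direct)          # the two values should (nearly) agree
```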
Theorem 4 (Properties of MLE: 4, Asymptotic Normality). Let X_1, . . . , X_n be a random sample from f_X(·; θ_0), where θ_0 ∈ Θ ⊆ R^k denotes the true value of the parameter. Under appropriate regularity conditions,
\[
I(\theta_0)^{1/2}\,\sqrt{n}\,\bigl\{\hat{\theta}_{ML}(X) - \theta_0\bigr\} \xrightarrow{D} N(0, I_k),
\quad \text{where} \quad
I(\theta_0) = E_{\theta_0}\!\bigl[S(X; \theta_0)\,S(X; \theta_0)'\bigr]
            = -E_{\theta_0}\!\left[\frac{\partial S(X; \theta_0)}{\partial \theta'}\right].
\]
Here S(X; θ) = ∂ log f_X(·; θ)/∂θ is the score function based on one sample. [Without Proof]
Corollary 2. Theorem 4 implies that, under appropriate regularity conditions, the MLE is a consistent estimator of θ, i.e., θ̂_ML(X) → θ in probability as n → ∞. [Proof]
Further, from the definition of asymptotic efficiency, it is also clear that the MLE is an asymptotically efficient estimator.
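The following simulation sketch illustrates Theorem 4 and Corollary 2 for an Exponential model with rate θ (so the MLE is 1/X̄_n and I(θ) = 1/θ²); the true rate, the sample size, and the number of replications are arbitrary choices for the illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
theta0 = 1.5                        # illustrative true rate of an Exponential(theta) model
reps, n = 5000, 200

# MLE of the rate theta from an Exponential sample x_1, ..., x_n is 1 / xbar
samples = rng.exponential(scale=1.0 / theta0, size=(reps, n))
theta_hat = 1.0 / samples.mean(axis=1)

# Consistency (Corollary 2): the MLEs concentrate around theta0
print("mean of MLEs:", theta_hat.mean())

# Asymptotic normality (Theorem 4): with I(theta0) = 1 / theta0**2,
# z = sqrt(n) * I(theta0)^{1/2} * (theta_hat - theta0) should be ~ N(0, 1)
z = np.sqrt(n) * (theta_hat - theta0) / theta0
print("mean, sd of z:", z.mean(), z.std())   # roughly 0 and 1
```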