
Mathematical Statistics (MA212M)

Lecture Slides
Lecture 24
Maximum Likelihood Estimator (MLE)

Proposed by R. Fisher in 1912.


One of the most popular methods of estimation.
Let us start with an example (next slide).
Example

Example 107: Suppose a box contains some red balls and some black balls. It is
known that the ratio of black balls to red balls is either 1:1 or 1:2.
We want to find out which it is. We may proceed as follows:
Randomly draw two balls from the box.
Let X be the number of black balls among the two drawn balls.
X ∼ Bin(2, p), where p ∈ {1/2, 1/3}.

The problem boils down to estimating the value of p.


Example (cont.)
Now consider the following table, where the entries are P_p(X = x)
for each possible value of x and p.

           x = 0    x = 1    x = 2
p = 1/2     1/4      1/2      1/4
p = 1/3     4/9      4/9      1/9

From the first column, we see that for x = 0, P(X = 0) is
maximized at p = 1/3. Hence, if we observe x = 0 (that is, no
black balls in the sample), it is plausible to take p = 1/3, and the
maximum likelihood estimate (MLE) of p is 1/3.
From the second column, we see that for x = 1, P(X = 1) is
maximized at p = 1/2.
From the third column, we see that for x = 2, P(X = 2) is
maximized at p = 1/2.
Example (cont.)

Hence the maximum likelihood estimator of p is

  p̂ = 1/3  if x = 0
  p̂ = 1/2  if x = 1, 2.

Remark: If x = 0 occurs, it is more likely that there are fewer black
balls, and hence the estimated ratio turns out to be 1:2. For other
values of x, it is 1:1.
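A minimal computational sketch of this table argument (not from the slides; it assumes Python with SciPy, and the variable names are illustrative): for each observed x, pick whichever candidate value of p gives the larger probability of that observation.

```python
# Illustrative sketch: the MLE of p in Example 107 by comparing the
# Binomial(2, p) pmf over the two candidate values of p.
from scipy.stats import binom

candidates = [1/2, 1/3]          # the two possible values of p
for x in [0, 1, 2]:              # each possible observed count of black balls
    # choose the candidate p that maximizes P_p(X = x)
    p_hat = max(candidates, key=lambda p: binom.pmf(x, 2, p))
    print(f"x = {x}: MLE of p is {p_hat}")
```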
MLE (cont.)

Def: Let X = (X1, . . . , Xn) be a RS from a population with
PMF/PDF f(x; θ). The function

  L(θ, x) = f_θ(x) = ∏_{i=1}^{n} f(xi; θ),

considered as a function of θ ∈ Θ for any fixed x ∈ 𝒳 (here 𝒳 is the
support of the RS), is called the likelihood function.
Def: For a sample point x ∈ 𝒳, let θ̂(x) be a value in Θ at which
L(θ, x) attains its maximum as a function of θ, with x held fixed.
Then the maximum likelihood estimator of the parameter θ based on a
RS X is θ̂(X).
MLE (cont.)

Remark: The MLE always lies in the parameter space.
Remark: The problem of finding the MLE boils down to finding the
maximum of a function, namely the likelihood function.
Remark: Sometimes it is easier to work with l(θ, x) = ln L(θ, x)
than with L(θ, x). Note that ln(·) is a strictly increasing function on
the positive real line, so L and l have the same maximizers. However,
this device does not help in every example.
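In practice the maximization in the definition can also be carried out numerically. A minimal sketch (not part of the slides; it assumes Python with NumPy/SciPy and uses a simulated sample from the N(µ, 1) model of Example 109 below):

```python
# Illustrative sketch: maximizing a log-likelihood numerically.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
x = rng.normal(loc=1.5, scale=1.0, size=100)    # simulated random sample

def neg_log_likelihood(mu):
    # -l(mu, x) for N(mu, 1), with additive constants dropped
    return 0.5 * np.sum((x - mu) ** 2)

res = minimize_scalar(neg_log_likelihood)
print(res.x, x.mean())   # the numerical maximizer agrees with the sample mean
```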
Examples
Example 108: X1, X2, . . . , Xn i.i.d. ∼ P(λ), λ > 0. For λ > 0, the
log-likelihood function is

  l(λ, x) = ln L(λ, x) = −nλ + n x̄ ln λ − Σ_{i=1}^{n} ln(xi!).

dl/dλ = 0 =⇒ λ = x̄. Also d²l/dλ² < 0 for all λ > 0. Hence l(λ, x) is
maximized at λ = x̄, and the MLE of λ is λ̂ = X̄.
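A quick numerical check of this example (illustrative only; it assumes NumPy/SciPy, and the simulated data and grid are hypothetical):

```python
# Illustrative check: the Poisson log-likelihood is maximized near the sample mean.
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(1)
x = rng.poisson(lam=3.0, size=200)              # simulated Poisson sample

grid = np.linspace(0.1, 10.0, 2000)             # candidate values of lambda
loglik = np.array([poisson.logpmf(x, lam).sum() for lam in grid])
print(grid[loglik.argmax()], x.mean())          # both close to the maximizer x-bar
```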


Remark: From now on, we will write L(θ) instead of L(θ, x).
Similarly, we will use l(θ).
Example 109: X1, X2, . . . , Xn i.i.d. ∼ N(µ, 1), µ ∈ R. The MLE of µ is
µ̂ = X̄.
Examples
Example 110: X1, X2, . . . , Xn i.i.d. ∼ N(µ, σ²), where µ ∈ R and
σ > 0. The log-likelihood function is

  l(µ, σ²) = −(n/2) ln(2π) − (n/2) ln(σ²) − (1/(2σ²)) Σ_{i=1}^{n} (xi − µ)².

Now ∂l/∂µ = 0 and ∂l/∂σ² = 0 imply that µ = x̄ and
σ² = (1/n) Σ_{i=1}^{n} (xi − x̄)². Find the Hessian matrix and show that it is
negative definite. Hence the MLEs of µ and σ² are µ̂ = X̄ and
σ̂² = (1/n) Σ_{i=1}^{n} (Xi − X̄)², respectively.
Example 111: X1, X2, . . . , Xn i.i.d. ∼ N(0, σ²), where σ > 0. The MLE of
σ² is σ̂² = (1/n) Σ_{i=1}^{n} Xi².

Remark: Note that the estimators of σ² are different in the last two
examples. This shows that the MLE incorporates whatever information
is available about the parameters (here, the knowledge that µ = 0).
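An illustrative numerical check of Example 110 (not from the slides; it assumes NumPy/SciPy, uses simulated data, and optimizes over log σ to keep σ > 0):

```python
# Illustrative sketch: numerically maximizing the N(mu, sigma^2) log-likelihood
# and comparing with the closed-form MLEs.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(2)
x = rng.normal(loc=2.0, scale=1.5, size=500)    # simulated sample

def neg_loglik(params):
    mu, log_sigma = params                      # parametrize by log(sigma) so sigma > 0
    return -norm.logpdf(x, loc=mu, scale=np.exp(log_sigma)).sum()

res = minimize(neg_loglik, x0=[0.0, 0.0])
mu_hat, sigma2_hat = res.x[0], np.exp(res.x[1]) ** 2
print(mu_hat, x.mean())                          # should agree
print(sigma2_hat, ((x - x.mean()) ** 2).mean())  # should agree (note 1/n, not 1/(n-1))
```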
Examples
Example 112: X1, X2, . . . , Xn i.i.d. ∼ N(µ, 1), µ ≤ 0. For µ ≤ 0, the
log-likelihood function is

  l(µ) = C − (1/2) Σ_{i=1}^{n} (xi − µ)²,

where C is a constant (it does not depend on the unknown parameter).
Now dl/dµ = n(x̄ − µ). Clearly, for x̄ > 0, dl/dµ = 0 does not possess a
solution in the parameter space. However, dl/dµ > 0 for all µ ≤ 0.
Hence, for x̄ > 0, l(µ) is an increasing function and it takes its
maximum value at µ = 0.
On the other hand, if x̄ ≤ 0, then dl/dµ = 0 possesses a solution,
namely µ = x̄. The second derivative condition can be easily checked.
Hence, the MLE of µ is

  µ̂ = X̄  if X̄ ≤ 0
  µ̂ = 0   otherwise.
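A small numerical illustration of this boundary behaviour (hypothetical data; assumes NumPy/SciPy):

```python
# Illustrative sketch: the constrained MLE for N(mu, 1) with mu <= 0 is
# min(x-bar, 0); checked here by bounded numerical optimization.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(3)
x = rng.normal(loc=0.4, scale=1.0, size=50)     # true mean outside the parameter space

def neg_loglik(mu):
    # negative log-likelihood of N(mu, 1), constants dropped
    return 0.5 * np.sum((x - mu) ** 2)

# maximize over mu <= 0 (a large finite lower bound stands in for -infinity)
res = minimize_scalar(neg_loglik, bounds=(-100.0, 0.0), method="bounded")
print(res.x, min(x.mean(), 0.0))   # numerical maximizer vs. closed-form MLE min(x-bar, 0)
```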
Examples

Example 113: Let X1 be a sample of size one from Bernoulli(1/(1 + e^θ)),
where θ ≥ 0.
In this case L(θ, 0) = e^θ/(1 + e^θ) and L(θ, 1) = 1/(1 + e^θ). Clearly
the MLE does not exist for x = 0, as L(θ, 0) is an increasing function
of θ (its supremum is not attained on θ ≥ 0). On the other hand, the
MLE exists for x = 1, and it is θ̂ = 0.
Remark: The MLE may not exist.
Remark: The log-likelihood was not used here.
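A short grid evaluation illustrating Example 113 (not from the slides; assumes NumPy; the grid is hypothetical):

```python
# Illustrative sketch: the two likelihoods of Example 113 over theta >= 0.
import numpy as np

theta = np.linspace(0.0, 10.0, 1000)
L0 = np.exp(theta) / (1.0 + np.exp(theta))   # L(theta, 0): strictly increasing, sup not attained
L1 = 1.0 / (1.0 + np.exp(theta))             # L(theta, 1): strictly decreasing

print(L0.argmax() == len(theta) - 1)   # "maximum" only at the grid edge -> no MLE for x = 0
print(theta[L1.argmax()])              # 0.0, the MLE for x = 1
```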
Examples

Example 114: X1, X2, . . . , Xn i.i.d. ∼ U(0, θ), θ > 0. The likelihood
function is

  L(θ) = 1/θⁿ  if 0 < x1, . . . , xn ≤ θ  (and 0 otherwise).

Clearly L(θ) is a decreasing function of θ for θ ≥ max{x1, x2, . . . , xn} = x(n),
and it takes its maximum value at θ = x(n). Hence the MLE of θ is
θ̂ = X(n).
Remark: When the support depends on unknown parameters, take proper
care in finding the MLE, if it exists.
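An illustrative check of Example 114 (assumes NumPy; the data and grid are hypothetical):

```python
# Illustrative sketch: for U(0, theta) the likelihood is 0 for theta < max(x)
# and 1/theta^n (decreasing) afterwards, so the MLE is max(x).
import numpy as np

rng = np.random.default_rng(4)
x = rng.uniform(0.0, 5.0, size=100)          # simulated sample, true theta = 5

def likelihood(theta):
    return 0.0 if theta < x.max() else theta ** (-len(x))

grid = np.linspace(0.1, 10.0, 5000)
L = np.array([likelihood(t) for t in grid])
print(grid[L.argmax()], x.max())             # the grid maximizer sits at (just above) x_(n)
```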
Examples
Example 115: X1, X2, . . . , Xn i.i.d. ∼ U(θ − 1/2, θ + 1/2), θ ∈ R.
The likelihood function is

  L(θ) = 1  if θ − 1/2 ≤ x1, . . . , xn ≤ θ + 1/2,
       i.e., if x(n) − 1/2 ≤ θ ≤ x(1) + 1/2  (and 0 otherwise),

where x(n) = max{x1, . . . , xn} and x(1) = min{x1, . . . , xn}.
As X(n) − X(1) ≤ 1 with probability one, [x(n) − 1/2, x(1) + 1/2] is a
non-empty interval. Also, L(θ) is maximized at every point in the interval.
Hence any point in the interval [X(n) − 1/2, X(1) + 1/2] is an MLE of θ.
In particular, an MLE of θ is θ̂ = α(X(n) − 1/2) + (1 − α)(X(1) + 1/2) for
any value of α ∈ [0, 1].


Remark: This example shows that the MLE may not be unique.
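A small illustration of this non-uniqueness (assumes NumPy; data simulated):

```python
# Illustrative sketch: for U(theta - 1/2, theta + 1/2), every theta in
# [x_(n) - 1/2, x_(1) + 1/2] attains the same maximal likelihood value.
import numpy as np

rng = np.random.default_rng(5)
x = rng.uniform(1.5, 2.5, size=20)            # sample with true theta = 2

def likelihood(theta):
    return 1.0 if (x.max() - 0.5 <= theta <= x.min() + 0.5) else 0.0

lo, hi = x.max() - 0.5, x.min() + 0.5
print(lo, hi)                                                        # the whole interval of MLEs
print(likelihood(lo), likelihood((lo + hi) / 2), likelihood(hi))     # all equal to 1
```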
Invariance Property of MLE

Theorem (without proof): If θ̂ is the MLE of θ, then for any function
τ(·) defined on Θ, the MLE of τ(θ) is τ(θ̂).
Example 116: X1, X2, . . . , Xn i.i.d. ∼ P(λ), λ > 0. To find the MLE of
P(X1 = 0), we can proceed as follows:
Note that P(X1 = 0) = e^(−λ), and we know that the MLE of λ is X̄.
Hence the MLE of P(X1 = 0) is e^(−X̄).
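An illustrative numerical check of the invariance property for Example 116 (not from the slides; assumes NumPy/SciPy and simulated data):

```python
# Illustrative sketch: maximizing the likelihood directly over tau = exp(-lambda)
# gives (numerically) the same answer as plugging the MLE of lambda into tau.
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(6)
x = rng.poisson(lam=2.0, size=300)

taus = np.linspace(0.001, 0.999, 2000)          # candidate values of tau = P(X1 = 0)
loglik = np.array([poisson.logpmf(x, -np.log(t)).sum() for t in taus])

print(taus[loglik.argmax()])                    # MLE of tau by direct maximization
print(np.exp(-x.mean()))                        # tau(lambda-hat) = exp(-x-bar): essentially equal
```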
Remark: Note that the invariance property does not hold for the MME
(method of moments estimator).
Remark: The proof of the above theorem is straightforward for a strictly
monotone function τ(·). However, the proof is a little involved for a
general function.
