
Maximum Likelihood Estimator

Mina Sami Ayad


Outline

1. Maximum Likelihood Estimation.

2. Important Notes.

3. Maximum Likelihood Estimation (Normal Distribution).
   1. Simple Linear Regression.
   2. Multiple Linear Regression.

4. Evaluation of MLE (Normal Distribution).
   1. Introduction to Further Matrices.
   2. Trinity Tests.
1. Maximum Likelihood Estimation

2. Important Notes

2. Important Notes

Probability Density Function:

$$f(y_i) = \frac{1}{\sigma\sqrt{2\pi}}\,\exp\!\left(-\frac{1}{2\sigma^{2}}\big(y_i - \hat{\alpha} - \hat{\beta}X_i\big)^{2}\right)$$

or, written in terms of the residual $\varepsilon_i = y_i - \hat{\alpha} - \hat{\beta}X_i$:

$$f(\varepsilon_i) = \frac{1}{\sigma\sqrt{2\pi}}\,\exp\!\left(-\frac{1}{2\sigma^{2}}\,\varepsilon_i^{2}\right)$$
3. Maximum Likelihood Estimator (Normal Distribution)

ML Steps:

1. Assume a distribution relevant to the data (Normal).

2. Define the PDF of the distribution.

3. Form the joint probability to define the likelihood function $L(\theta)$.

4. Convert the likelihood function to the log-likelihood $l(\theta)$.

5. Maximize the function (a numerical sketch of these steps follows below).
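The five steps can be illustrated numerically. Below is a minimal Python sketch, assuming a small simulated dataset and a hypothetical `neg_log_likelihood` helper; it maximizes the log-likelihood with a general-purpose optimizer rather than the closed-form solutions derived on the next slides.

```python
import numpy as np
from scipy.optimize import minimize

# Steps 1-2: assume y_i = alpha + beta*x_i + eps_i with eps_i ~ N(0, sigma^2),
# so each observation has the normal PDF shown earlier (simulated data here).
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 2.0 + 0.5 * x + rng.normal(0, 1.5, size=100)

# Steps 3-4: joint probability of the sample, taken in logs -> l(theta)
def neg_log_likelihood(theta):
    alpha, beta, sigma = theta
    if sigma <= 0:                       # keep the optimizer in the valid region
        return np.inf
    resid = y - alpha - beta * x
    return -(-len(y) * np.log(sigma)
             - len(y) * np.log(np.sqrt(2.0 * np.pi))
             - (resid @ resid) / (2.0 * sigma ** 2))

# Step 5: maximize l(theta) (equivalently, minimize its negative)
result = minimize(neg_log_likelihood, x0=[0.0, 0.0, 1.0], method="Nelder-Mead")
print(result.x)   # approximately [alpha_hat, beta_hat, sigma_hat]
```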
3. Maximum Likelihood Estimator (Normal Distribution)
1. Simple Form

ML Steps:

1. Normal Distribution

2. PDF of the distribution:

$$f(y_i) = \frac{1}{\sigma\sqrt{2\pi}}\,\exp\!\left(-\frac{1}{2\sigma^{2}}\big(y_i - \hat{\alpha} - \hat{\beta}X_i\big)^{2}\right)$$
3. Maximum Likelihood Estimator (Normal Distribution)
1. Simple Form

ML Steps:

3. Insert the joint probability to define the likelihood function:

$$L(\theta) = \prod_{i=1}^{N} \frac{1}{\sigma\sqrt{2\pi}}\,\exp\!\left(-\frac{1}{2\sigma^{2}}\big(y_i - \hat{\alpha} - \hat{\beta}X_i\big)^{2}\right)$$

$$L(\theta) = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^{N} \exp\!\left(-\frac{1}{2\sigma^{2}}\sum_{i}\big(y_i - \hat{\alpha} - \hat{\beta}X_i\big)^{2}\right)$$
3. Maximum Likelihood Estimator (Normal Distribution)
1. Simple Form

4. Take the log-likelihood $l(\theta)$:

$$l(\theta) = -N\ln\sigma - N\ln\sqrt{2\pi} - \frac{1}{2\sigma^{2}}\sum_{i}\big(Y_i - \hat{\alpha} - \hat{\beta}X_i\big)^{2}$$

5. Maximize the function. Setting the partial derivatives to zero gives:

$$\hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X}$$

$$\hat{\beta} = \frac{\operatorname{Cov}(X, Y)}{\operatorname{Var}(X)}$$

$$\hat{\sigma}^{2} = \frac{\sum_{i}\hat{\varepsilon}_i^{\,2}}{N}$$
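As a check, these closed-form estimators can be computed directly; a short sketch, assuming the simulated `x` and `y` arrays from the earlier sketch.

```python
import numpy as np

# Closed-form MLE for the simple linear model; ddof=0 matches the ML
# convention of dividing by N rather than N - 1.
beta_hat   = np.cov(x, y, ddof=0)[0, 1] / np.var(x)   # Cov(X, Y) / Var(X)
alpha_hat  = y.mean() - beta_hat * x.mean()           # Y_bar - beta_hat * X_bar
resid      = y - alpha_hat - beta_hat * x
sigma2_hat = (resid @ resid) / len(y)                 # sum(eps_i^2) / N
```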
3. Maximum Likelihood Estimator (Normal Distribution)
2. Matrix Form

4. Take the log-likelihood $l(\theta)$:

$$l(\theta) = -N\ln\sigma - N\ln\sqrt{2\pi} - \frac{1}{2\sigma^{2}}\,(Y - X\hat{\beta})'(Y - X\hat{\beta})$$

5. Maximize the function:

$$\hat{\beta} = (X'X)^{-1}X'Y$$

$$\hat{\sigma}^{2} = \frac{\hat{\varepsilon}'\hat{\varepsilon}}{N}$$
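The same estimates in matrix form, again assuming the simulated `x` and `y` from the first sketch; `X`, `b_hat`, `eps`, and `sigma2_hat` are illustrative names reused in later sketches.

```python
import numpy as np

X = np.column_stack([np.ones_like(x), x])   # design matrix with an intercept column
b_hat = np.linalg.solve(X.T @ X, X.T @ y)   # (X'X)^{-1} X'Y, solved without an explicit inverse
eps = y - X @ b_hat                         # residual vector
sigma2_hat = (eps @ eps) / len(y)           # eps'eps / N
```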
4. Evaluation of MLE (Normal Distribution)
1. Introduction to Further Matrices.

Three matrices/vectors are important to derive in the MLE context:

• Score (Gradient) Vector — the vector of partial derivatives:

$$\nabla_{G} = \frac{\partial l}{\partial \theta}$$

• Hessian Matrix — the matrix of second partial derivatives:

$$H = \frac{\partial^{2} l}{\partial \theta\,\partial \theta'}$$

• Fisher Information:

$$I = -E\big(H(\theta)\big)$$
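Before deriving these objects analytically, they can be approximated numerically as a sanity check. A rough sketch, assuming the `neg_log_likelihood`, `alpha_hat`, `beta_hat`, and `sigma2_hat` names from the earlier sketches.

```python
import numpy as np
from scipy.optimize import approx_fprime

log_lik   = lambda th: -neg_log_likelihood(th)
theta_hat = np.array([alpha_hat, beta_hat, np.sqrt(sigma2_hat)])

# Score (gradient) vector: should be numerically close to zero at the MLE
score_num = approx_fprime(theta_hat, log_lik, 1e-6)

# Hessian: central differences of the numerical score, one parameter at a time
h, k = 1e-5, len(theta_hat)
hess_num = np.empty((k, k))
for j in range(k):
    step = np.zeros(k)
    step[j] = h
    hess_num[:, j] = (approx_fprime(theta_hat + step, log_lik, 1e-6)
                      - approx_fprime(theta_hat - step, log_lik, 1e-6)) / (2 * h)
```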
4. Evaluation of MLE (Normal Distribution)
1. Introduction to Further Matrices.

(1) Gradient Vector:

$$l(\theta) = -N\ln\sigma - N\ln\sqrt{2\pi} - \frac{1}{2\sigma^{2}}\,(Y - X\hat{\beta})'(Y - X\hat{\beta})$$

$$\text{Gradient Vector} = \frac{\partial l}{\partial \theta} = \begin{bmatrix} \dfrac{\partial l}{\partial \hat{\beta}} \\[2ex] \dfrac{\partial l}{\partial \hat{\sigma}} \end{bmatrix}$$
4. Evaluation of MLE (Normal Distribution)
1. Introduction to Further Matrices.

$$l(\theta) = -N\ln\sigma - N\ln\sqrt{2\pi} - \frac{1}{2\sigma^{2}}\,(Y - X\hat{\beta})'(Y - X\hat{\beta})$$

Expanding the quadratic form:

$$l(\theta) = -N\ln\sigma - N\ln\sqrt{2\pi} - \frac{1}{2\sigma^{2}}\,\big(Y'Y + \hat{\beta}'X'X\hat{\beta} - 2\hat{\beta}'X'Y\big)$$

$$\frac{\partial l}{\partial \hat{\beta}} = -\frac{1}{\sigma^{2}}\big(X'X\hat{\beta} - X'Y\big) \quad \rightarrow (1)$$
4. Evaluation of MLE (Normal Distribution)
1. Introduction to Further Matrices.

$$l(\theta) = -N\ln\sigma - N\ln\sqrt{2\pi} - \frac{1}{2\sigma^{2}}\,\varepsilon'\varepsilon$$

$$\frac{\partial l}{\partial \hat{\sigma}} = -\frac{N}{\sigma} + \frac{\varepsilon'\varepsilon}{\sigma^{3}} \quad \rightarrow (2)$$

Stacking (1) and (2) gives the gradient (score) vector:

$$\nabla_{G} = \begin{bmatrix} -\dfrac{1}{\sigma^{2}}\big(X'X\hat{\beta} - X'Y\big) \\[2ex] -\dfrac{N}{\sigma} + \dfrac{\varepsilon'\varepsilon}{\sigma^{3}} \end{bmatrix}$$
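A minimal sketch of this analytic score, assuming the `X`, `y`, `b_hat`, and `sigma2_hat` names from the matrix-form sketch.

```python
import numpy as np

def score(beta, sigma, X, y):
    """Gradient of l(theta) with respect to (beta, sigma): equations (1) and (2)."""
    eps = y - X @ beta
    d_beta  = -(X.T @ X @ beta - X.T @ y) / sigma**2     # equation (1)
    d_sigma = -len(y) / sigma + (eps @ eps) / sigma**3   # equation (2)
    return np.append(d_beta, d_sigma)

# At the ML estimates the score should be numerically zero
print(score(b_hat, np.sqrt(sigma2_hat), X, y))
```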
4. Evaluation of MLE (Normal Distribution)
1. Introduction to Further Matrices.

(2) Hessian Matrix:

$$H = \begin{bmatrix} \dfrac{\partial^{2} l}{\partial \beta\,\partial \beta'} & \dfrac{\partial^{2} l}{\partial \beta\,\partial \sigma} \\[2ex] \dfrac{\partial^{2} l}{\partial \sigma\,\partial \beta} & \dfrac{\partial^{2} l}{\partial \sigma\,\partial \sigma'} \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$
4. Evaluation of MLE (Normal Distribution)
1. Introduction to Further Matrices.

(2) Hessian Matrix:

$$\frac{\partial l}{\partial \hat{\beta}} = -\frac{1}{\sigma^{2}}\big(X'X\hat{\beta} - X'Y\big) \quad \rightarrow (1)$$

$$a = \frac{\partial^{2} l}{\partial \beta\,\partial \beta'} = -\frac{X'X}{\sigma^{2}}$$
4. Evaluation of MLE (Normal Distribution)
1. Introduction to Further Matrices.

(2) Hessian Matrix:

$$\frac{\partial l}{\partial \hat{\beta}} = -\frac{1}{\sigma^{2}}\big(X'X\hat{\beta} - X'Y\big) \quad \rightarrow (1)$$

$$b = c = \frac{\partial^{2} l}{\partial \beta\,\partial \sigma} = \frac{2}{\sigma^{3}}\big(X'X\hat{\beta} - X'Y\big) = -\frac{2}{\sigma^{3}}\,X'\big(Y - X\hat{\beta}\big) = -\frac{2}{\sigma^{3}}\,X'\varepsilon$$
4. Evaluation of MLE (Normal Distribution)
1. Introduction to Further Matrices.

(2) Hessian Matrix:

$$\frac{\partial l}{\partial \hat{\sigma}} = -\frac{N}{\sigma} + \frac{\varepsilon'\varepsilon}{\sigma^{3}} \quad \rightarrow (2)$$

Evaluated at the MLE, where $\varepsilon'\varepsilon = N\sigma^{2}$:

$$d = \frac{\partial^{2} l}{\partial \sigma\,\partial \sigma'} = \frac{N}{\sigma^{2}} - 3\,\frac{\varepsilon'\varepsilon}{\sigma^{4}} = \frac{N}{\sigma^{2}} - \frac{3\sigma^{2}N}{\sigma^{4}} = \frac{N}{\sigma^{2}} - \frac{3N}{\sigma^{2}} = -\frac{2N}{\sigma^{2}}$$
4. Evaluation of MLE (Normal Distribution)
1. Introduction to Further Matrices.

Collecting the blocks, the Hessian is:

$$H = \begin{bmatrix} -\dfrac{X'X}{\sigma^{2}} & -\dfrac{2}{\sigma^{3}}X'\varepsilon \\[2ex] -\dfrac{2}{\sigma^{3}}\varepsilon'X & -\dfrac{2N}{\sigma^{2}} \end{bmatrix}$$
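A sketch assembling the Hessian from these blocks, reusing the hypothetical `X`, `y`, `b_hat`, and `sigma2_hat` names from the earlier sketches.

```python
import numpy as np

def hessian(beta, sigma, X, y):
    """Hessian of l(theta), built from the blocks a, b = c, d derived above."""
    eps = y - X @ beta
    a = -(X.T @ X) / sigma**2                              # d2l / dbeta dbeta'
    b = -(2.0 / sigma**3) * (X.T @ eps)                    # d2l / dbeta dsigma
    d = len(y) / sigma**2 - 3.0 * (eps @ eps) / sigma**4   # d2l / dsigma dsigma'
    top    = np.column_stack([a, b])                       # [ a  b ]
    bottom = np.append(b, d)                               # [ c  d ], with c = b'
    return np.vstack([top, bottom])

H = hessian(b_hat, np.sqrt(sigma2_hat), X, y)   # the d block equals -2N/sigma^2 at the MLE
```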
4. Evaluation of MLE (Normal Distribution)
1. Introduction to Further Matrices.

(3) Fisher Information Matrix (the off-diagonal blocks vanish because $E(X'\varepsilon) = 0$):

$$I = -E\big(H(\theta)\big) = \begin{bmatrix} \dfrac{X'X}{\sigma^{2}} & 0 \\[2ex] 0 & \dfrac{2N}{\sigma^{2}} \end{bmatrix}$$
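The Fisher information and the implied asymptotic variances, again as a sketch under the naming assumptions of the earlier snippets.

```python
import numpy as np

def fisher_information(sigma, X):
    """I = -E[H(theta)]: E[X'eps] = 0 removes the off-diagonal block and
    E[eps'eps] = N*sigma^2 gives the 2N/sigma^2 term."""
    k = X.shape[1]
    I = np.zeros((k + 1, k + 1))
    I[:k, :k] = (X.T @ X) / sigma**2
    I[k, k] = 2 * X.shape[0] / sigma**2
    return I

# The inverse information is the asymptotic variance of the MLE; its upper-left
# block is Var(beta_hat) = sigma^2 (X'X)^{-1}.
V = np.linalg.inv(fisher_information(np.sqrt(sigma2_hat), X))
```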
4. Evaluation of MLE (Normal Distribution)
2. Trinity Tests
(1) Likelihood Ratio Test:

$$H_0:\; l(\theta^{*}) = l(\theta_i)$$

$\theta^{*}$: the parameters at the MLE (the maximum of the log-likelihood).
$\theta_i$: the restricted (hypothesised) parameter values being tested.

$$LR = -2\ln\frac{L(\theta_i)}{L(\theta^{*})} = 2\,\big[\,l(\theta^{*}) - l(\theta_i)\,\big] \sim \chi^{2}_{r}$$

$r$: number of restrictions.

Decision: if $LR > \chi^{2}_{r}$, we reject $H_0$.
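A sketch of the LR test for the hypothetical restriction that the slope is zero (r = 1), reusing `neg_log_likelihood`, `alpha_hat`, `beta_hat`, and `sigma2_hat` from the earlier sketches.

```python
import numpy as np
from scipy.stats import chi2

# Unrestricted log-likelihood l(theta*), evaluated at the full MLE
l_star = -neg_log_likelihood([alpha_hat, beta_hat, np.sqrt(sigma2_hat)])

# Restricted log-likelihood l(theta_i): slope forced to zero, so the restricted
# MLE of alpha is y_bar and the restricted sigma_hat is the SD of y (ddof = 0)
l_restricted = -neg_log_likelihood([y.mean(), 0.0, np.std(y)])

LR = 2 * (l_star - l_restricted)
print(LR, LR > chi2.ppf(0.95, df=1))   # reject H0 if LR exceeds the chi2 critical value
```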
4. Evaluation of MLE (Normal Distribution)
2. Trinity Tests

(2) Wald Test:

$$H_0:\; \theta^{*} = \theta_i$$

$$W = (\theta^{*} - \theta_i)'\, I(\theta)\, (\theta^{*} - \theta_i), \qquad I(\theta) = V(\theta^{*})^{-1}$$

In Stata, $\theta$ is composed only of $\hat{\alpha}$ and $\hat{\beta}$, so:

$$W = (\theta^{*} - \theta_i)'\, V(\theta^{*})^{-1}\, (\theta^{*} - \theta_i)$$
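A sketch of the Wald statistic for the coefficient vector only (intercept and slope), testing a hypothetical value of theta_i; it reuses `X`, `b_hat`, and `sigma2_hat` from the matrix-form sketch.

```python
import numpy as np
from scipy.stats import chi2

theta_star = b_hat                              # unrestricted MLE of (alpha, beta)
theta_i = np.array([2.0, 0.5])                  # hypothesised values (the simulation's true ones)
V_beta = sigma2_hat * np.linalg.inv(X.T @ X)    # V(theta*) = sigma^2 (X'X)^{-1}

W = (theta_star - theta_i) @ np.linalg.inv(V_beta) @ (theta_star - theta_i)
print(W, W > chi2.ppf(0.95, df=2))              # r = 2 restrictions here
```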
4. Evaluation of MLE (Normal Distribution)
2. Trinity Tests

(3) Lagrange Multiplier (Score) Test:

$$LM = \nabla_{\theta}'\, I(\theta)^{-1}\, \nabla_{\theta} = \nabla_{\theta}'\, V(\theta^{*})\, \nabla_{\theta}$$

It evaluates the MLE through the gradient (score) vector, computed at the restricted estimates.

This test will be developed in the next topics.
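A sketch of the LM test for the same hypothetical restriction (slope = 0), using the form above with the score and Fisher information evaluated at the restricted estimates; it reuses the `score`, `fisher_information`, `X`, and `y` names from the earlier sketches.

```python
import numpy as np
from scipy.stats import chi2

beta_r  = np.array([y.mean(), 0.0])   # restricted MLE: slope fixed at zero
sigma_r = np.std(y)                   # restricted sigma_hat (ddof = 0)

g  = score(beta_r, sigma_r, X, y)     # score vector at the restricted estimates
I0 = fisher_information(sigma_r, X)   # Fisher information at the restricted estimates
LM = g @ np.linalg.inv(I0) @ g
print(LM, LM > chi2.ppf(0.95, df=1))  # reject H0 if LM exceeds the chi2 critical value
```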