
Basics of parameter estimation: MLE and MAP


Estimating the Bias of a Coin
Problem: Assume we can flip a coin with bias θ several times. Estimate θ, the probability that the coin comes up heads on a given flip.
Each flip yields a Boolean value for X, with X ~ Bernoulli(θ): the random variable X follows a Bernoulli distribution with parameter θ and has only two outcomes, typically represented as 0 (failure/tails) and 1 (success/heads).
Bernoulli random variable: P(X = 1) = θ; P(X = 0) = 1 − θ
MLE for Bernoulli Variables
● Note that if the data D consists of just one coin flip, then P(D|θ) = θ if that flip results in X = 1, and P(D|θ) = 1 − θ if the result is instead X = 0.
● Furthermore, if we observe a set of i.i.d. coin flips such as D = {1,1,0,1,0}, then we can easily calculate P(D|θ) by multiplying together the probabilities of each individual coin flip:
● P(D = {1,1,0,1,0} | θ) = θ · θ · (1−θ) · θ · (1−θ) = θ^3 · (1−θ)^2
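As a quick numeric check, here is a short Python sketch (our own illustration, not from the slides) that evaluates P(D|θ) on a grid and confirms the maximum sits at θ = 3/5:

# Evaluate the Bernoulli likelihood P(D | theta) on a grid of theta values.
def likelihood(theta, data):
    # Product of per-flip probabilities for i.i.d. Bernoulli flips.
    p = 1.0
    for x in data:
        p *= theta if x == 1 else (1 - theta)
    return p

data = [1, 1, 0, 1, 0]                   # D = {1,1,0,1,0}
grid = [i / 100 for i in range(1, 100)]  # theta in (0, 1)
print(max(grid, key=lambda t: likelihood(t, data)))  # 0.6 = 3 heads / 5 flips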
MLE
● Maximizing P(D|θ) with respect to θ is equivalent to maximizing its logarithm, ln P(D|θ), with respect to θ, because ln(x) increases monotonically with x.
● It often simplifies the mathematics to maximize ln P(D|θ) rather than P(D|θ).
● Set the derivative to 0: because ln P(D|θ) is a concave function of θ, the value of θ where this derivative is zero is the value that maximizes ln P(D|θ).
MLE
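Carrying this out for the Bernoulli case: if D contains α1 ones (heads) and α0 zeros (tails), then

ln P(D|θ) = α1 ln θ + α0 ln(1−θ)
d/dθ ln P(D|θ) = α1/θ − α0/(1−θ) = 0

which gives θ_MLE = α1/(α1 + α0), the observed fraction of heads.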
MLE - Examples
1. Suppose that X is a discrete random variable with the following probability mass function, where 0 ≤ θ ≤ 1 is a parameter:
P(X = 0) = 2θ/3   P(X = 1) = θ/3   P(X = 2) = 2(1−θ)/3   P(X = 3) = (1−θ)/3
The following 10 independent observations were taken from such a distribution: (3, 0, 2, 1, 3, 2, 1, 0, 2, 1). What is the maximum likelihood estimate of θ?
MLE - Examples
Solution:

Since the sample is (3,0,2,1,3,2,1,0,2,1), the likelihood is
L(θ) = P(X = 3) P(X = 0) P(X = 2) P(X = 1) P(X = 3)
× P(X = 2) P(X = 1) P(X = 0) P(X = 2) P(X = 1)
Substituting from the probability mass function given above, we have
L(θ) = (2θ/3)^2 · (θ/3)^3 · (2(1−θ)/3)^3 · ((1−θ)/3)^2 = c · θ^5 · (1−θ)^5, where c = 2^5/3^10 is a constant.


MLE - Examples
Solution:

Taking the logarithm:
ln L(θ) = ln c + 2 ln θ + 3 ln θ + 3 ln(1−θ) + 2 ln(1−θ) = ln c + 5 ln θ + 5 ln(1−θ)
Setting the derivative to 0 and solving:
d/dθ ln L(θ) = d/dθ (5 ln θ + 5 ln(1−θ)) = 5/θ − 5/(1−θ)
Equating to 0 we get 1/θ = 1/(1−θ), so θ = 1/2. The maximum likelihood estimate of θ is 0.5.
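A quick numeric confirmation in Python (our own sketch, not part of the original solution):

# Likelihood of the sample (3,0,2,1,3,2,1,0,2,1): two 0s, three 1s,
# three 2s, and two 3s under the pmf above.
def L(theta):
    return (2 * theta / 3) ** 2 * (theta / 3) ** 3 \
         * (2 * (1 - theta) / 3) ** 3 * ((1 - theta) / 3) ** 2

grid = [i / 1000 for i in range(1, 1000)]
print(max(grid, key=L))  # 0.5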
MLE - Examples
● Suppose X1, X2, …, Xn are i.i.d. random variables with density function f(x | σ). What is the MLE for σ?
MLE - Examples
● Maximum likelihood for continuous distributions
Suppose that the lifetime of light bulbs is modeled by an exponential distribution with (unknown) parameter λ. We test 5 bulbs and find they have lifetimes of 2, 3, 1, 3, and 4 years, respectively. What is the MLE for λ?
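A sketch of the solution, using the exponential density f(x|λ) = λe^(−λx):

L(λ) = λ^5 e^(−λ(2+3+1+3+4)) = λ^5 e^(−13λ)
ln L(λ) = 5 ln λ − 13λ

Setting d/dλ ln L(λ) = 5/λ − 13 = 0 gives λ_MLE = 5/13 ≈ 0.385.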
MLE - Examples
● Maximum likelihood to estimate multiple parameters
Suppose the data x1, x2, …, xn is drawn i.i.d. from a N(μ, σ²) distribution, where μ and σ² are unknown. Find the maximum likelihood estimate for the pair (μ, σ²).
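For reference, the standard result of this maximization:

μ_MLE = (1/n) Σ xᵢ (the sample mean)
σ²_MLE = (1/n) Σ (xᵢ − μ_MLE)² (the sample variance with divisor n, not n−1)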

Maximum A Posteriori (MAP) Estimation
● MLE is powerful when you have enough data. However, it doesn't work well when the observed data set is small.
● If we flip the coin 50 times, observing 24 heads and 26 tails, then we will estimate θ_MLE = 0.48.
● If we observe only 3 flips of the coin, we might observe 1 head and 2 tails, and the estimate is θ_MLE = 0.33.
● If we have prior knowledge about the coin, e.g. that θ is close to 0.5, then we might respond by still believing the probability is closer to 0.5 than to 0.33.
● This leads to Maximum A Posteriori (MAP) estimation.
Maximum A Posteriori (MAP) Estimation
● Maximum likelihood estimation (MLE) says that we should find the parameter θ that maximizes the likelihood ("probability") of seeing the data.
● But MAP allows us to incorporate prior knowledge
into our estimate
● We can determine the MAP estimation by using Bayes
theorem to calculate the posterior probability of each
candidate.
Maximum A Posteriori (MAP) Estimation
● Prior Knowledge: E.g., I know that the coin is “close”
to 50-50.
Maximum A Posteriori (MAP) Estimation

It chooses the value that is most probable given the observed data and prior belief.
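Formally, by Bayes' theorem:

θ_MAP = argmax_θ P(θ|D) = argmax_θ P(D|θ) P(θ) / P(D)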
Maximum A Posteriori Estimation

As P(D) does not depend on θ, we can simplify this by ignoring the denominator:
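θ_MAP = argmax_θ P(D|θ) P(θ)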
Maximum A Posteriori (Example)
1. Suppose our samples are x = (0, 0, 1, 1, 0), drawn from Bernoulli(θ), where θ is unknown. Assume θ is unrestricted; that is, θ ∈ (0, 1). What is the MLE for θ?
L(x | θ) = θ^2 (1 − θ)^3
θ_MLE = argmax_{θ∈[0,1]} θ^2 (1 − θ)^3
= (number of heads) / (number of heads + number of tails)
= 2/5
2. Suppose we impose the restriction that θ ∈ {0.2, 0.5, 0.7}. What is the MLE for θ?
L(x | 0.2) = 0.2^2 · 0.8^3 = 0.02048
L(x | 0.5) = 0.5^2 · 0.5^3 = 0.03125
L(x | 0.7) = 0.7^2 · 0.3^3 = 0.01323
θ_MLE = argmax_{θ∈{0.2, 0.5, 0.7}} L(x | θ) = 0.5
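The restricted case is easy to verify numerically; a minimal Python sketch (ours):

# x = (0, 0, 1, 1, 0): two heads (1s) and three tails (0s).
def L(theta):
    return theta ** 2 * (1 - theta) ** 3

for theta in (0.2, 0.5, 0.7):
    print(theta, L(theta))           # 0.02048, 0.03125, 0.01323
print(max((0.2, 0.5, 0.7), key=L))   # 0.5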
Maximum A Posteriori (Example)
Consider a medical diagnosis problem in which there are two alternative hypotheses: (1) that the patient has cancer, and (2) that the patient does not. The available data is from a particular laboratory test with two possible outcomes: positive and negative. We have prior knowledge that over the entire population only 0.008 of people have this disease. Furthermore, the lab test is only an imperfect indicator of the disease. The test returns a correct positive result in only 98% of the cases in which the disease is actually present, and a correct negative result in only 97% of the cases in which the disease is not present. In the other cases, the test returns the opposite result.
Maximum A Posteriori (Example)
The above situation can be summarized by the following probabilities:

P(positive|cancer) = 0.98 P(negative|cancer) = 0.02
P(positive|not cancer) = 0.03 P(negative|not cancer) = 0.97
P(cancer) = 0.008 P(not cancer) = 0.992

Suppose we now observe a new patient for whom the lab test returns a positive result. Should we diagnose the patient as having cancer or not? To decide, calculate the maximum a posteriori estimate.


Maximum A Posteriori (Example)
Maximum a posteriori estimation: since P(positive) is common to both hypotheses, we compare the unnormalized posteriors:

P(cancer | positive) ∝ P(positive | cancer) · P(cancer) = 0.98 · 0.008 = 0.00784
P(not cancer | positive) ∝ P(positive | not cancer) · P(not cancer) = 0.03 · 0.992 = 0.02976

Thus the MAP estimate diagnoses the patient as not having cancer.
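The same comparison in Python, with a follow-up step beyond the slide's comparison: normalizing the two numerators gives the actual posterior probability (a sketch, ours):

# Unnormalized posteriors: the numerator of Bayes' theorem for each hypothesis.
post_cancer = 0.98 * 0.008    # P(positive|cancer) * P(cancer) = 0.00784
post_healthy = 0.03 * 0.992   # P(positive|not cancer) * P(not cancer) = 0.02976

print(post_cancer < post_healthy)  # True -> MAP hypothesis is "not cancer"

# Normalizing shows a positive test still leaves only ~21% probability of cancer.
print(post_cancer / (post_cancer + post_healthy))  # ~0.2085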
Maximum A Posteriori (Example)
● Consider the same example of flipping a coin. Now let's try to construct a MAP estimate of θ for the same Bernoulli experiment. Obviously, we now need a prior belief distribution for the parameter θ to be estimated.
● Our prior belief in possible values for θ must reflect the following constraints:
○ The prior for θ must be zero outside the [0, 1] interval.
○ Within the [0, 1] interval, we are free to specify our beliefs in
any way we wish.
○ In most cases, we would want to choose a distribution for the
prior beliefs that peaks somewhere in the [0, 1] interval.
Maximum A Posteriori (Example)
● The following beta distribution can express our prior beliefs
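One standard form, with β1, β0 > 0 read as prior pseudo-counts of heads and tails respectively:

P(θ) = θ^(β1−1) (1−θ)^(β0−1) / B(β0, β1), for θ ∈ [0, 1],

where B(β0, β1) is the Beta function that normalizes the density.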

The MAP estimate for θ

Ignore the denominator as B(β0,β1) is independent of θ
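θ_MAP = argmax_θ P(D|θ) P(θ) ∝ θ^(α1) (1−θ)^(α0) · θ^(β1−1) (1−θ)^(β0−1) = θ^(α1+β1−1) (1−θ)^(α0+β0−1),

where α1 and α0 count the observed heads and tails.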


Maximum A Posteriori (Example)
● Solve for the value of θ that maximizes the expression.
● It is identical in form to the maximum likelihood function.
● We can therefore reuse the derivation of θ_MLE.

Here you can see that both P(θ) and P(θ|D) follow a Beta distribution, i.e. P(θ|D) ~ Beta(β0 + α0, β1 + α1): the prior and the posterior are in the same family.
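The mode of this Beta posterior gives the closed form (under the heads/tails convention above):

θ_MAP = (α1 + β1 − 1) / (α1 + β1 + α0 + β0 − 2)

With β1 = β0 = 1 (a uniform prior) this reduces to the MLE α1/(α1 + α0).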
Maximum A Posteriori (Example)

If the posterior distribution is in the same family as the prior distribution, then we say that the prior distribution is the conjugate prior for the likelihood function.

Here the Beta distribution is the conjugate prior for the Bernoulli likelihood.
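A minimal Python sketch of the conjugate update (our own illustration; the function names are hypothetical). With a Beta(5, 5) prior encoding "close to fair" and the earlier 1-head, 2-tails sample, the MAP estimate lands near 0.5 instead of at the MLE 1/3:

# Posterior pseudo-counts: Beta(b1 + heads, b0 + tails).
def beta_bernoulli_update(b1, b0, flips):
    heads = sum(flips)
    return b1 + heads, b0 + (len(flips) - heads)

def beta_mode(a, b):
    # MAP estimate = mode of Beta(a, b); defined for a, b > 1.
    return (a - 1) / (a + b - 2)

b1, b0 = beta_bernoulli_update(5, 5, [1, 0, 0])  # -> (6, 7)
print(beta_mode(b1, b0))                         # 5/11 ~ 0.4545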


Priors

Examples
A gamma distribution with parameters α, β has the following density function, where Γ(t) is the gamma function.
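Assuming the rate parameterization (either convention is common), the density is

f(x | α, β) = (β^α / Γ(α)) x^(α−1) e^(−βx), for x > 0.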

Using the Gamma distribution as a prior, show that the Gamma distribution is a conjugate prior for the Exponential likelihood. Also, find the maximum a posteriori estimator for the parameter of the Exponential distribution as a function of α and β.
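A sketch, assuming observed lifetimes x1, …, xn drawn i.i.d. from Exponential(λ), so the likelihood is λ^n e^(−λ Σxᵢ):

P(λ | x) ∝ λ^n e^(−λ Σxᵢ) · λ^(α−1) e^(−βλ) = λ^(n+α−1) e^(−(β+Σxᵢ)λ)

This is the kernel of Gamma(α + n, β + Σxᵢ), so the posterior stays in the Gamma family: the Gamma prior is conjugate for the Exponential likelihood. Taking the mode gives λ_MAP = (α + n − 1) / (β + Σxᵢ); with no data (n = 0) this is the prior mode (α − 1)/β.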
