Chapter 3 (PR)
Maximum-Likelihood and
Bayesian Parameter Estimation
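The collection of training examples is composed of c data sets $\mathcal{D}_1, \mathcal{D}_2, \ldots, \mathcal{D}_c$
Each example in $\mathcal{D}_j$ is drawn according to the class-conditional pdf, i.e. $p(\mathbf{x}|\omega_j)$
Examples in $\mathcal{D}_j$ are i.i.d. random variables, i.e. independent and identically distributed
To show the dependence of $p(\mathbf{x}|\omega_j)$ on the parameter vector $\boldsymbol{\theta}_j$ explicitly: $p(\mathbf{x}|\omega_j, \boldsymbol{\theta}_j)$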
Case II: $p(\mathbf{x}|\omega_j)$ doesn't have parametric form (Ch. 4)
Pattern Recognition Soochow, Fall Semester 3
Estimation Under Parametric Form
Parametric class-conditional pdf: $p(\mathbf{x}|\omega_j, \boldsymbol{\theta}_j)$
Assumption I: Maximum-Likelihood (ML) estimation
View parameters as quantities whose values are fixed but unknown
Estimate parameter values by maximizing the likelihood (probability) of observing the actual training examples
Assumption II: Bayesian estimation
View parameters as random variables having some known prior distribution
Observation of the actual training examples transforms parameters' prior distribution into posterior distribution (via Bayes theorem)
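Work with each category separately and therefore simplify the notations by dropping subscripts w.r.t. categories without loss of generality: $\mathcal{D}_j \rightarrow \mathcal{D}$, $\boldsymbol{\theta}_j \rightarrow \boldsymbol{\theta}$
With $\mathcal{D} = \{\mathbf{x}_1, \ldots, \mathbf{x}_n\}$ i.i.d., the likelihood is $p(\mathcal{D}|\boldsymbol{\theta}) = \prod_{k=1}^{n} p(\mathbf{x}_k|\boldsymbol{\theta})$
$l(\boldsymbol{\theta}) = \ln p(\mathcal{D}|\boldsymbol{\theta}) = \sum_{k=1}^{n} \ln p(\mathbf{x}_k|\boldsymbol{\theta})$ is named the log-likelihood function
The ML estimate is $\hat{\boldsymbol{\theta}} = \arg\max_{\boldsymbol{\theta}} l(\boldsymbol{\theta})$, with necessary condition $\nabla_{\boldsymbol{\theta}} l = \mathbf{0}$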
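A minimal sketch of the log-likelihood as a sum of per-example log densities, assuming a univariate Gaussian with known variance. The toy samples, the fixed variance of 1.0, and the search grid are hypothetical choices for illustration, not values from the slides.

```python
import math


def gaussian_log_pdf(x, mu, sigma2):
    """Log of the univariate normal density N(mu, sigma2) evaluated at x."""
    return -0.5 * math.log(2.0 * math.pi * sigma2) - (x - mu) ** 2 / (2.0 * sigma2)


def log_likelihood(samples, mu, sigma2):
    """l(theta) = sum_k ln p(x_k | theta) for i.i.d. samples."""
    return sum(gaussian_log_pdf(x, mu, sigma2) for x in samples)


# Hypothetical toy data (assumption, not from the slides).
samples = [1.0, 2.0, 3.0, 4.0]

# Scan candidate means on a grid, with the variance fixed to 1.0 (assumption).
grid = [i / 100.0 for i in range(0, 501)]
best_mu = max(grid, key=lambda m: log_likelihood(samples, m, 1.0))
```

On this toy data the grid maximizer coincides with the arithmetic mean of the samples, which is what the gradient condition predicts for a Gaussian with unknown mean.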
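Suppose $\boldsymbol{\Sigma}$ is known (the Gaussian case with unknown $\boldsymbol{\mu}$, so $\boldsymbol{\theta} = \boldsymbol{\mu}$)
$\ln p(\mathbf{x}_k|\boldsymbol{\mu}) = -\frac{1}{2}\ln\left[(2\pi)^d |\boldsymbol{\Sigma}|\right] - \frac{1}{2}(\mathbf{x}_k - \boldsymbol{\mu})^t \boldsymbol{\Sigma}^{-1} (\mathbf{x}_k - \boldsymbol{\mu})$
$\sum_{k=1}^{n} \boldsymbol{\Sigma}^{-1}(\mathbf{x}_k - \hat{\boldsymbol{\mu}}) = \mathbf{0}$ (necessary condition for ML estimate $\hat{\boldsymbol{\mu}}$)
Multiply $\boldsymbol{\Sigma}$ on both sides: $\hat{\boldsymbol{\mu}} = \frac{1}{n}\sum_{k=1}^{n} \mathbf{x}_k$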
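Consider univariate case: $\boldsymbol{\theta} = (\theta_1, \theta_2)^t$ with $\theta_1 = \mu$ and $\theta_2 = \sigma^2$
$\ln p(x_k|\boldsymbol{\theta}) = -\frac{1}{2}\ln 2\pi\theta_2 - \frac{1}{2\theta_2}(x_k - \theta_1)^2$
$\sum_{k=1}^{n} \frac{1}{\hat{\theta}_2}(x_k - \hat{\theta}_1) = 0$ and $-\sum_{k=1}^{n} \frac{1}{\hat{\theta}_2} + \sum_{k=1}^{n} \frac{(x_k - \hat{\theta}_1)^2}{\hat{\theta}_2^2} = 0$ (necessary condition for ML estimate $\hat{\theta}_1$ and $\hat{\theta}_2$)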
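ML estimate in univariate case: $\hat{\mu} = \frac{1}{n}\sum_{k=1}^{n} x_k$, $\hat{\sigma}^2 = \frac{1}{n}\sum_{k=1}^{n} (x_k - \hat{\mu})^2$
ML estimate in multivariate case:
$\hat{\boldsymbol{\mu}} = \frac{1}{n}\sum_{k=1}^{n} \mathbf{x}_k$ (arithmetic average of n vectors)
$\hat{\boldsymbol{\Sigma}} = \frac{1}{n}\sum_{k=1}^{n} (\mathbf{x}_k - \hat{\boldsymbol{\mu}})(\mathbf{x}_k - \hat{\boldsymbol{\mu}})^t$ (arithmetic average of n matrices)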
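The two arithmetic averages can be sketched in plain Python: the ML mean is the average of the n sample vectors, and the ML covariance is the average of the n outer-product matrices of the centered samples (dividing by n, not n - 1). The function names and the three toy 2-D samples are hypothetical.

```python
def ml_mean(samples):
    """ML estimate of the mean: arithmetic average of n vectors."""
    n = len(samples)
    d = len(samples[0])
    return [sum(x[i] for x in samples) / n for i in range(d)]


def ml_covariance(samples):
    """ML estimate of the covariance: arithmetic average of n outer products."""
    n = len(samples)
    d = len(samples[0])
    mu = ml_mean(samples)
    cov = [[0.0] * d for _ in range(d)]
    for x in samples:
        diff = [x[i] - mu[i] for i in range(d)]
        for i in range(d):
            for j in range(d):
                # Accumulate (x - mu)(x - mu)^t / n, i.e. the biased 1/n average.
                cov[i][j] += diff[i] * diff[j] / n
    return cov


# Hypothetical toy data (assumption, not from the slides).
samples = [[1.0, 2.0], [3.0, 6.0], [5.0, 4.0]]
mu_hat = ml_mean(samples)
sigma_hat = ml_covariance(samples)
```

Because the average divides by n, this covariance estimate is biased; the unbiased sample covariance would divide by n - 1 instead.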
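In this case (parameters viewed as random variables), we can no longer make a single ML estimate $\hat{\boldsymbol{\theta}}_j$ and then infer $P(\omega_j|\mathbf{x})$ based on $P(\omega_j)$ and $p(\mathbf{x}|\omega_j, \hat{\boldsymbol{\theta}}_j)$
How can we proceed under this situation? Fully exploit training examples!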
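$P(\omega_i|\mathbf{x}, \mathcal{D}) = \frac{p(\mathbf{x}|\omega_i, \mathcal{D})\, P(\omega_i|\mathcal{D})}{\sum_{j=1}^{c} p(\mathbf{x}|\omega_j, \mathcal{D})\, P(\omega_j|\mathcal{D})}$ (Eq.22 [pp.91])
Two assumptions: $P(\omega_i|\mathcal{D}) = P(\omega_i)$ and $p(\mathbf{x}|\omega_i, \mathcal{D}) = p(\mathbf{x}|\omega_i, \mathcal{D}_i)$
$P(\omega_i|\mathbf{x}, \mathcal{D}) = \frac{p(\mathbf{x}|\omega_i, \mathcal{D}_i)\, P(\omega_i)}{\sum_{j=1}^{c} p(\mathbf{x}|\omega_j, \mathcal{D}_j)\, P(\omega_j)}$ (Eq.23 [pp.91])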
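A small sketch of the prediction step, assuming Eq.23 has the usual Bayes-rule form in which each class posterior is the class-conditional density times the class prior, normalized over all classes. The function name and the density and prior values are hypothetical placeholders.

```python
def class_posteriors(densities, priors):
    """Normalize p(x | w_i, D_i) * P(w_i) over all classes.

    densities[i] is the class-conditional density of class i at the test point,
    priors[i] is the prior probability of class i.
    """
    joint = [p * q for p, q in zip(densities, priors)]
    evidence = sum(joint)  # denominator: sum over all classes
    return [j / evidence for j in joint]


# Hypothetical values for two classes at a single test point (assumption).
posteriors = class_posteriors([0.2, 0.05], [0.5, 0.5])
predicted_class = max(range(len(posteriors)), key=lambda i: posteriors[i])
```

The predicted class is the index with the largest posterior; with equal priors this reduces to picking the class with the largest density at the test point.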
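Treat each class independently: simplify the class-conditional pdf notation $p(\mathbf{x}|\omega_j, \mathcal{D}_j)$ as $p(\mathbf{x}|\mathcal{D})$
$p(\mathbf{x}|\mathcal{D}) = \int p(\mathbf{x}, \boldsymbol{\theta}|\mathcal{D})\, d\boldsymbol{\theta}$ ($\boldsymbol{\theta}$: random variables w.r.t. parametric form)
$= \int p(\mathbf{x}|\boldsymbol{\theta}, \mathcal{D})\, p(\boldsymbol{\theta}|\mathcal{D})\, d\boldsymbol{\theta}$
$= \int p(\mathbf{x}|\boldsymbol{\theta})\, p(\boldsymbol{\theta}|\mathcal{D})\, d\boldsymbol{\theta}$ ($\mathbf{x}$ is independent of $\mathcal{D}$ given $\boldsymbol{\theta}$)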
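The integral over the parameter can be approximated on a grid. The sketch below assumes a univariate Gaussian with unknown mean and known variance, a standard normal prior on the mean, and a simple Riemann sum. The grid range, step, prior, and toy samples are all illustrative assumptions.

```python
import math


def normal_pdf(x, mu, sigma2):
    """Univariate normal density N(mu, sigma2) evaluated at x."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)


def grid_posterior(samples, thetas, prior, sigma2):
    """p(theta | D) proportional to prod_k p(x_k | theta) * p(theta), normalized on the grid."""
    step = thetas[1] - thetas[0]
    unnorm = []
    for t, p in zip(thetas, prior):
        lik = 1.0
        for x in samples:
            lik *= normal_pdf(x, t, sigma2)
        unnorm.append(lik * p)
    z = sum(unnorm) * step  # Riemann-sum approximation of the normalizer
    return [u / z for u in unnorm]


def predictive(x, thetas, posterior, sigma2):
    """p(x | D) approximated as sum over the grid of p(x | theta) p(theta | D) dtheta."""
    step = thetas[1] - thetas[0]
    return sum(normal_pdf(x, t, sigma2) * p for t, p in zip(thetas, posterior)) * step


# Hypothetical setup (assumptions): grid on [-10, 10], N(0, 1) prior, sigma2 = 1.
thetas = [-10.0 + 0.01 * i for i in range(2001)]
prior = [normal_pdf(t, 0.0, 1.0) for t in thetas]
posterior = grid_posterior([1.0, 2.0, 3.0], thetas, prior, 1.0)
density_at_x = predictive(1.5, thetas, posterior, 1.0)
```

A grid is only practical for one or two parameters; it is used here because it mirrors the integral term by term without relying on a closed form.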
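The general procedure:
Phase I: prior pdf $p(\boldsymbol{\theta})$ + training set $\mathcal{D}$ + parametric form $p(\mathbf{x}|\boldsymbol{\theta})$ → posterior pdf $p(\boldsymbol{\theta}|\mathcal{D})$ (Bayes formula)
Phase II: posterior pdf $p(\boldsymbol{\theta}|\mathcal{D})$ + parametric form $p(\mathbf{x}|\boldsymbol{\theta})$ → class-conditional pdf $p(\mathbf{x}|\mathcal{D})$ (law of total probability)
Phase III: prediction (Eq.22 [pp.91])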
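Gaussian parametric form: $p(x|\mu) \sim N(\mu, \sigma^2)$, with $\sigma^2$ known and $\mu$ unknown
Prior pdf still takes Gaussian form: $p(\mu) \sim N(\mu_0, \sigma_0^2)$ (other forms of prior pdf could be assumed as well)
What would $p(\mu|\mathcal{D})$ look like in this case?
$p(\mu|\mathcal{D}) = \frac{p(\mathcal{D}|\mu)\, p(\mu)}{\int p(\mathcal{D}|\mu)\, p(\mu)\, d\mu} = \alpha\, p(\mathcal{D}|\mu)\, p(\mu)$ ($\alpha$ is a constant not related to $\mu$)
$= \alpha \prod_{k=1}^{n} p(x_k|\mu)\, p(\mu)$ (examples in $\mathcal{D}$ are i.i.d.)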
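Writing $p(\mu|\mathcal{D}) \sim N(\mu_n, \sigma_n^2)$ and equating the coefficients in both forms:
$\frac{1}{\sigma_n^2} = \frac{n}{\sigma^2} + \frac{1}{\sigma_0^2}$ and $\frac{\mu_n}{\sigma_n^2} = \frac{n}{\sigma^2}\hat{\mu}_n + \frac{\mu_0}{\sigma_0^2}$, where $\hat{\mu}_n = \frac{1}{n}\sum_{k=1}^{n} x_k$
$\mu_n = \frac{n\sigma_0^2}{n\sigma_0^2 + \sigma^2}\hat{\mu}_n + \frac{\sigma^2}{n\sigma_0^2 + \sigma^2}\mu_0$, $\sigma_n^2 = \frac{\sigma_0^2 \sigma^2}{n\sigma_0^2 + \sigma^2}$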
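$p(x|\mathcal{D}) = \int p(x|\mu)\, p(\mu|\mathcal{D})\, d\mu$ (Eq.36 [pp.95])
$p(x|\mathcal{D})$ is an exponential function of a quadratic function of $x$, so $p(x|\mathcal{D})$ is a normal pdf as well: $p(x|\mathcal{D}) \sim N(\mu_n, \sigma^2 + \sigma_n^2)$, with $\mu_n$ and $\sigma_n^2$ the mean and variance of the posterior $p(\mu|\mathcal{D})$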
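For the univariate Gaussian case with known variance and a Gaussian prior on the mean, the posterior and predictive parameters have a closed form. The sketch below uses the standard textbook result: posterior mean `mu_n`, posterior variance `sigma_n_sq`, and predictive variance `sigma_sq + sigma_n_sq`. Function names and the toy inputs are hypothetical.

```python
def gaussian_posterior(samples, mu0, sigma0_sq, sigma_sq):
    """Posterior N(mu_n, sigma_n_sq) of the mean, given prior N(mu0, sigma0_sq)
    and known observation variance sigma_sq."""
    n = len(samples)
    sample_mean = sum(samples) / n
    denom = n * sigma0_sq + sigma_sq
    # Weighted combination of the sample mean and the prior mean.
    mu_n = (n * sigma0_sq / denom) * sample_mean + (sigma_sq / denom) * mu0
    sigma_n_sq = sigma0_sq * sigma_sq / denom
    return mu_n, sigma_n_sq


def predictive_params(samples, mu0, sigma0_sq, sigma_sq):
    """Parameters of p(x | D) = N(mu_n, sigma_sq + sigma_n_sq)."""
    mu_n, sigma_n_sq = gaussian_posterior(samples, mu0, sigma0_sq, sigma_sq)
    return mu_n, sigma_sq + sigma_n_sq


# Hypothetical toy inputs (assumption): three samples, N(0, 1) prior, sigma_sq = 1.
mu_n, sigma_n_sq = gaussian_posterior([1.0, 2.0, 3.0], 0.0, 1.0, 1.0)
pred_mean, pred_var = predictive_params([1.0, 2.0, 3.0], 0.0, 1.0, 1.0)
```

As n grows, the weight on the sample mean approaches 1 and `sigma_n_sq` shrinks toward 0, so the posterior concentrates around the sample mean and the predictive variance approaches the known observation variance.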
Bayesian estimation
No parametric form for class‐conditional pdf
Summary (Cont.)
Maximum likelihood estimation
Settings: parameters as fixed but unknown values
The objective function: Log‐likelihood function
Necessary condition for ML estimation: the gradient of the objective function should be the zero vector
The Gaussian case
Unknown $\boldsymbol{\mu}$
Unknown $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$
Bayesian estimation
The general procedure
Phase I: prior pdf → posterior pdf (for $\boldsymbol{\theta}$)
Phase II: posterior pdf (for $\boldsymbol{\theta}$) → class-conditional pdf (for $\mathbf{x}$)
Phase III: prediction (Eq.22 [pp.91])
The Gaussian case
Unknown $\boldsymbol{\mu}$: univariate and multivariate