Estimation 4
Ajit K Chaturvedi
Any function $g(\mathbf{Y})$ of the observations taking values in $\mathbb{R}^m$ can be viewed as an estimator of $X$.
Among such estimators, the maximum likelihood (ML) estimator $\hat{X}_{ML}(\mathbf{Y})$ maximizes the likelihood function $f_{\mathbf{Y}|X}(\mathbf{y} \mid x)$, i.e.,
\[ \hat{X}_{ML}(\mathbf{y}) = \arg\max_{x} f_{\mathbf{Y}|X}(\mathbf{y} \mid x) . \]
Although the MAP estimate was obtained by using a different approach, it is closely related to the ML estimate.
The marginal density $f_{\mathbf{Y}}(\mathbf{y})$ in the denominator of the expression for the conditional density $f_{X|\mathbf{Y}}(x \mid \mathbf{y})$ does not affect the maximization, so we can rewrite the MAP estimate as
\[ \hat{X}_{MAP}(\mathbf{y}) = \arg\max_{x} f_{\mathbf{Y}|X}(\mathbf{y} \mid x) \, f_X(x) = \arg\max_{x} \left[ \ln f_{\mathbf{Y}|X}(\mathbf{y} \mid x) + \ln f_X(x) \right] . \]
We see that the only difference between the MAP and ML estimates is that the objective function maximized by the MAP estimate is formed by adding to the log-likelihood function a term $\ln f_X(x)$ representing the a priori information about $X$.
Therefore, when $X$ admits a uniform distribution, $f_X(x)$ is constant, and hence the two estimates coincide.
This indicates that the dichotomy between the random and nonrandom formulations of parameter estimation is not completely strict, and it is often possible to switch from one viewpoint to the other.
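As a quick numerical check of this point, the following sketch compares the ML and MAP estimates on a grid for a scalar Gaussian observation model with a uniform prior; the model, parameter values, and variable names are illustrative choices, not part of the original example.

import numpy as np

# Scalar Gaussian observation model: Y ~ N(x, sigma^2), one observation y.
rng = np.random.default_rng(0)
sigma = 1.0
x_true = 2.0
y = x_true + sigma * rng.standard_normal()

# Parameter grid and a uniform (constant) prior over it.
x_grid = np.linspace(-5.0, 5.0, 2001)
log_likelihood = -0.5 * (y - x_grid) ** 2 / sigma ** 2
log_prior = np.zeros_like(x_grid)  # constant: ln f_X(x) contributes nothing to the argmax

x_ml = x_grid[np.argmax(log_likelihood)]
x_map = x_grid[np.argmax(log_likelihood + log_prior)]
print(x_ml, x_map)  # the two estimates coincide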
Example 4.4: Signal with unknown amplitude
The observations are $\mathbf{Y} \sim \mathcal{N}(A\mathbf{s}, \sigma^2 \mathbf{I}_N)$, where the signal $\mathbf{s}$ is known and the amplitude $A$ is unknown.
$\hat{A}_{ML}$ is obtained by maximizing
\[ \ln f_{\mathbf{Y}|A}(\mathbf{y} \mid A) = -\frac{1}{2\sigma^2} \, \|\mathbf{y} - A\mathbf{s}\|_2^2 + c , \]
which gives
\[ \hat{A}_{ML}(\mathbf{y}) = \frac{\mathbf{s}^T \mathbf{y}}{\|\mathbf{s}\|_2^2} . \]
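A minimal numerical sketch of this estimator; the signal shape, amplitude, and noise level below are arbitrary choices for illustration.

import numpy as np

rng = np.random.default_rng(1)
N = 200
s = np.sin(2 * np.pi * 0.05 * np.arange(N))  # known signal (illustrative choice)
A_true = 3.0
sigma = 0.5
y = A_true * s + sigma * rng.standard_normal(N)  # Y ~ N(A s, sigma^2 I_N)

# ML estimate of the amplitude: s^T y / ||s||_2^2
A_ml = s @ y / (s @ s)
print(A_ml)  # close to A_true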
Bias
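Recall that, for a nonrandom parameter $x$, the bias of an estimator $\hat{X}(\mathbf{Y})$ is defined as
\[ b(x) = E\left[ \hat{X}(\mathbf{Y}) \right] - x , \]
and the estimator is said to be unbiased if $b(x) = 0$ for all $x$.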
Example 4.6: ML estimates of the mean and variance of
i.i.d. Gaussian random variables
The joint density of $N$ i.i.d. $\mathcal{N}(m, v)$ observations is
\[ f_{\mathbf{Y}}(\mathbf{y} \mid m, v) = \frac{1}{(2\pi v)^{N/2}} \exp\left( -\frac{1}{2v} \sum_{k=1}^{N} (y_k - m)^2 \right) . \]
Case 1: m unknown, v known.
In this case the observation vector density is denoted as $f_{\mathbf{Y}}(\mathbf{y} \mid m)$, and we observe it has the form considered in Example 4.4, with $A = m$ and
\[ \mathbf{s} = \mathbf{u} \triangleq \begin{bmatrix} 1 & 1 & \cdots & 1 \end{bmatrix}^T . \]
This leads to
\[ \hat{m}_{ML}(\mathbf{Y}) = \frac{1}{N} \sum_{k=1}^{N} Y_k , \]
i.e., the sample mean of the observations.
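The step from Example 4.4 to this result is immediate: with $\mathbf{s} = \mathbf{u}$ we have $\mathbf{u}^T \mathbf{y} = \sum_{k=1}^{N} y_k$ and $\|\mathbf{u}\|_2^2 = N$, so
\[ \hat{m}_{ML}(\mathbf{y}) = \frac{\mathbf{u}^T \mathbf{y}}{\|\mathbf{u}\|_2^2} = \frac{1}{N} \sum_{k=1}^{N} y_k . \]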
Case 2: m known, v unknown.
In this case the observation vector density is denoted as $f_{\mathbf{Y}}(\mathbf{y} \mid v)$, and the log-likelihood is given by
\[ \ln f_{\mathbf{Y}}(\mathbf{y} \mid v) = -\frac{N}{2} \ln(2\pi v) - \frac{1}{2v} \sum_{k=1}^{N} (y_k - m)^2 . \]
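Setting the derivative with respect to $v$ to zero gives the ML estimate of the variance:
\[ \frac{\partial}{\partial v} \ln f_{\mathbf{Y}}(\mathbf{y} \mid v) = -\frac{N}{2v} + \frac{1}{2v^2} \sum_{k=1}^{N} (y_k - m)^2 = 0 \quad\Longrightarrow\quad \hat{v}_{ML}(\mathbf{y}) = \frac{1}{N} \sum_{k=1}^{N} (y_k - m)^2 . \]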
By observing that
\[ E\left[ \hat{v}_{ML}(\mathbf{Y}) \right] = \frac{1}{N} \sum_{k=1}^{N} E\left[ (Y_k - m)^2 \right] = v , \]
we conclude that $\hat{v}_{ML}$ is an unbiased estimator of $v$ when $m$ is known.
Case 3: m and v unknown.
In this case, we need the derivatives of the log-likelihood function
\[ L(\mathbf{y} \mid m, v) \triangleq \ln f_{\mathbf{Y}}(\mathbf{y} \mid m, v) = -\frac{N}{2} \ln(2\pi v) - \frac{1}{2v} \sum_{k=1}^{N} (y_k - m)^2 \]
with respect to both $m$ and $v$.
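The derivative with respect to $m$ is
\[ \frac{\partial}{\partial m} L(\mathbf{y} \mid m, v) = \frac{1}{v} \sum_{k=1}^{N} (y_k - m) , \]
which vanishes at the sample mean $\hat{m}_{ML}(\mathbf{y}) = \frac{1}{N} \sum_{k=1}^{N} y_k$, exactly as in Case 1.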
Further,
\[ \frac{\partial}{\partial v} L(\mathbf{y} \mid m, v) = -\frac{N}{2v} + \frac{1}{2v^2} \sum_{k=1}^{N} (y_k - m)^2 . \]
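Setting both derivatives to zero and solving jointly yields
\[ \hat{m}_{ML}(\mathbf{Y}) = \frac{1}{N} \sum_{k=1}^{N} Y_k , \qquad \hat{v}_{ML}(\mathbf{Y}) = \frac{1}{N} \sum_{k=1}^{N} \left( Y_k - \hat{m}_{ML}(\mathbf{Y}) \right)^2 . \]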
This variance estimate can be viewed as being obtained by replacing the unknown mean $m$ in the earlier estimator by the sample mean $\hat{m}_{ML}$.
Even though this is a reasonable choice, it affects the properties of the resulting estimator, and in particular its bias.
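As a quick numerical illustration of this effect, the Monte Carlo sketch below (with arbitrarily chosen sample size and parameter values) compares the average of $\hat{v}_{ML}$ over many trials with the true variance $v$.

import numpy as np

rng = np.random.default_rng(2)
N = 10            # small sample size makes the bias visible
m_true, v_true = 1.0, 4.0
trials = 100_000

y = m_true + np.sqrt(v_true) * rng.standard_normal((trials, N))
m_hat = y.mean(axis=1)                            # sample mean, one per trial
v_hat = ((y - m_hat[:, None]) ** 2).mean(axis=1)  # ML variance estimate, one per trial

print(v_hat.mean())          # noticeably below v_true = 4.0
print((N - 1) / N * v_true)  # 3.6, the expected value of the ML variance estimate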