Estimation 2
Ajit K Chaturvedi
Here the posterior density of the vector X given the observations Y is evaluated by applying Bayes' rule:
$$f_{X|Y}(x \mid y) = \frac{f_{Y|X}(y \mid x)\, f_X(x)}{f_Y(y)}$$
Since $f_Y(y) \geq 0$, the expected cost will be minimized if the term in brackets is minimized for each y. This gives
$$\hat{X}(y) = \arg\min_{\hat{x} \in \mathbb{R}^m} \int C(x, \hat{x})\, f_{X|Y}(x \mid y)\, dx$$
MSE
For
$$C(x, \hat{x}) = \|x - \hat{x}\|_2^2,$$
the minimum mean-square error (MMSE) estimate $\hat{X}_{MSE}(y)$ minimizes the conditional mean-square error given by
$$J(\hat{x} \mid y) = \int \|x - \hat{x}\|_2^2\, f_{X|Y}(x \mid y)\, dx$$
The gradient of $J$ with respect to $\hat{x}$ is
$$\nabla_{\hat{x}} J = \begin{bmatrix} \dfrac{\partial J}{\partial \hat{x}_1} & \cdots & \dfrac{\partial J}{\partial \hat{x}_i} & \cdots & \dfrac{\partial J}{\partial \hat{x}_m} \end{bmatrix}^T.$$
Therefore,
$$\nabla_{\hat{x}} J(\hat{x} \mid y) = 2 \int (\hat{x} - x)\, f_{X|Y}(x \mid y)\, dx = 0$$
We get
$$\hat{x} \int f_{X|Y}(x \mid y)\, dx = \int x\, f_{X|Y}(x \mid y)\, dx$$
Since the conditional density integrates to one, this leads to
$$\hat{X}_{MSE}(y) = \int x\, f_{X|Y}(x \mid y)\, dx$$
which is equal to
$$E[X \mid Y = y]$$
Thus the minimum mean-square error estimate $\hat{X}_{MSE}(Y)$ is just the conditional mean of X given Y.
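To make this concrete, here is a minimal numerical sketch, assuming a scalar X and a hypothetical Gaussian posterior N(2, 1.5²), that checks the brute-force minimizer of $J(\hat{x} \mid y)$ against the conditional mean:

```python
import numpy as np

# Hypothetical posterior for illustration: X | Y = y ~ N(2.0, 1.5^2).
x = np.linspace(-10, 14, 4001)                 # integration grid
f = np.exp(-0.5 * ((x - 2.0) / 1.5) ** 2)
f /= np.trapz(f, x)                            # normalize the density

cond_mean = np.trapz(x * f, x)                 # E[X | Y = y]

# Brute-force minimization of J(xhat | y) = integral of (x - xhat)^2 f(x|y) dx.
candidates = np.linspace(-5, 9, 2001)
J = [np.trapz((x - xh) ** 2 * f, x) for xh in candidates]
xhat_mse = candidates[np.argmin(J)]

print(cond_mean, xhat_mse)   # both ~ 2.0: the minimizer is the conditional mean
```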
MMSE
Recall
$$J(\hat{x} \mid y) = \int \|x - \hat{x}\|_2^2\, f_{X|Y}(x \mid y)\, dx$$
Averaging with respect to Y, the minimum mean-square error (MMSE) can be expressed as
$$\text{MMSE} = E\left[\|X - \hat{X}_{MSE}(Y)\|_2^2\right] = \int J(\hat{X}_{MSE}(y) \mid y)\, f_Y(y)\, dy$$
MAE Estimate
For
$$C(x, \hat{x}) = \|x - \hat{x}\|_1,$$
the minimum mean absolute error (MMAE) estimate $\hat{X}_{MAE}(y)$ minimizes the objective function
$$J(\hat{x} \mid y) = \int \|x - \hat{x}\|_1\, f_{X|Y}(x \mid y)\, dx.$$
Setting the derivative with respect to each component $\hat{x}_i$ to zero shows that the minimizer satisfies
$$\int_{-\infty}^{\hat{x}_i} f_{X_i|Y}(x_i \mid y)\, dx_i = \int_{\hat{x}_i}^{\infty} f_{X_i|Y}(x_i \mid y)\, dx_i$$
so each component of $\hat{X}_{MAE}(y)$ is the median of the corresponding conditional marginal density.
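A minimal sketch of this median condition, reusing the hypothetical scalar posterior from the MMSE sketch above: the MAE estimate sits where the posterior CDF crosses 1/2, which matches the brute-force minimizer of the absolute-error objective.

```python
import numpy as np

# Hypothetical posterior for illustration: X | Y = y ~ N(2.0, 1.5^2).
x = np.linspace(-10, 14, 4001)
f = np.exp(-0.5 * ((x - 2.0) / 1.5) ** 2)
f /= np.trapz(f, x)

cdf = np.cumsum(f) * (x[1] - x[0])          # numerical CDF of f_{X|Y}
xhat_mae = x[np.searchsorted(cdf, 0.5)]     # first grid point with CDF >= 1/2

# Cross-check: minimize J(xhat | y) = integral of |x - xhat| f(x|y) dx.
candidates = np.linspace(-5, 9, 2001)
J = [np.trapz(np.abs(x - xh) * f, x) for xh in candidates]
print(xhat_mae, candidates[np.argmin(J)])   # both ~ 2.0 (median = mean here)
```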
MAP Estimate
The estimator corresponding to the uniform (hit-or-miss) cost
$$C(x, \hat{x}) = L_\epsilon(x - \hat{x}), \qquad L_\epsilon(e) = \begin{cases} 0 & \|e\| \leq \epsilon \\ 1 & \text{otherwise,} \end{cases}$$
minimizes the objective function
$$J(\hat{x} \mid y) = \int L_\epsilon(x - \hat{x})\, f_{X|Y}(x \mid y)\, dx = 1 - \int_{\|x - \hat{x}\| \leq \epsilon} f_{X|Y}(x \mid y)\, dx$$
For small $\epsilon$, minimizing $J$ is therefore equivalent to maximizing the posterior density at $\hat{x}$.
Therefore,
$$\hat{X}_{MAP}(y) = \arg\max_{x \in \mathbb{R}^m} f_{X|Y}(x \mid y)$$
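A minimal sketch of the MAP rule under the same hypothetical posterior; since the argmax is unchanged by positive scaling, the posterior need not even be normalized:

```python
import numpy as np

# Hypothetical posterior for illustration: X | Y = y ~ N(2.0, 1.5^2).
x = np.linspace(-10, 14, 4001)
f = np.exp(-0.5 * ((x - 2.0) / 1.5) ** 2)   # unnormalized is fine for argmax

xhat_map = x[np.argmax(f)]
print(xhat_map)   # ~ 2.0; for a Gaussian posterior, mean = median = mode
```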
Example 4.1: Jointly Gaussian Random Vectors
Let X and Y be jointly Gaussian random vectors with mean vector
$$m = \begin{bmatrix} m_X \\ m_Y \end{bmatrix} = \begin{bmatrix} E[X] \\ E[Y] \end{bmatrix}$$
and covariance matrix
$$K = \begin{bmatrix} K_X & K_{XY} \\ K_{YX} & K_Y \end{bmatrix} = E\left[ \begin{bmatrix} X - m_X \\ Y - m_Y \end{bmatrix} \begin{bmatrix} (X - m_X)^T & (Y - m_Y)^T \end{bmatrix} \right]$$
The conditional density of X given Y is also Gaussian.
It is given by
$$f_{X|Y}(x \mid y) = \frac{1}{(2\pi)^{m/2} |K_{X|Y}|^{1/2}} \exp\left( -\frac{1}{2} (x - m_{X|Y})^T K_{X|Y}^{-1} (x - m_{X|Y}) \right)$$
Here
$$m_{X|Y} = m_X + K_{XY} K_Y^{-1} (y - m_Y)$$
(recall that in the scalar case $E[x \mid y] = \eta_x + r\,\frac{\sigma_x}{\sigma_y}(y - \eta_y)$, where $r$ is the correlation coefficient)
and
$$K_{X|Y} = K_X - K_{XY} K_Y^{-1} K_{YX}$$
$m_{X|Y}$ and $K_{X|Y}$ denote the conditional mean vector and the conditional covariance matrix of X given Y.
This can be written compactly as
$$X \mid Y = y \sim N(m_{X|Y}, K_{X|Y})$$
Then
$$\hat{X}_{MSE}(Y) = m_{X|Y} = m_X + K_{XY} K_Y^{-1} (Y - m_Y)$$
It can be seen that the estimate depends linearly on the observation vector Y, while the conditional error covariance matrix $K_{X|Y}$ does not depend on Y at all. Moreover, the error covariance matrix $K_E$ is the same as $K_{X|Y}$, i.e.,
$$K_E = K_X - K_{XY} K_Y^{-1} K_{YX}$$
Since the median of a Gaussian distribution equals its mean, and since the maximum of a Gaussian density is achieved at its mean, we also have
$$\hat{X}_{MAE}(Y) = \hat{X}_{MAP}(Y) = m_{X|Y}$$
Thus, in the Gaussian case the MSE, MAE and MAP estimates coincide.
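The sketch below evaluates these formulas for some assumed, purely illustrative moments $m_X$, $m_Y$, $K_X$, $K_{XY}$, $K_Y$; it shows the estimate is linear in the observation while the error covariance is a constant matrix:

```python
import numpy as np

# Assumed (hypothetical) joint Gaussian moments: m = 2, observation dim = 1.
m_X = np.array([1.0, 0.0])
m_Y = np.array([2.0])
K_X = np.array([[2.0, 0.5], [0.5, 1.0]])
K_XY = np.array([[0.8], [0.3]])
K_Y = np.array([[1.5]])

def x_hat(y):
    """MMSE/MAE/MAP estimate m_{X|Y}: affine in the observation y."""
    return m_X + K_XY @ np.linalg.solve(K_Y, y - m_Y)

# Error covariance K_E = K_{X|Y} = K_X - K_XY K_Y^{-1} K_YX: independent of y.
K_E = K_X - K_XY @ np.linalg.solve(K_Y, K_XY.T)

print(x_hat(np.array([2.5])))   # shifts linearly as y varies
print(K_E)                      # fixed conditional covariance
```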
However, this last property does not hold in general as can be
seen from the following example.
Example 4.2: Exponential observation of an exponential parameter
Assume that Y is an exponential random variable with parameter X, so that
$$f_{Y|X}(y \mid x) = \begin{cases} x \exp(-xy) & \text{for } y \geq 0 \\ 0 & \text{otherwise} \end{cases}$$
Suppose further that X is itself exponential with parameter a, i.e., $f_X(x) = a \exp(-ax)\, u(x)$. Applying Bayes' rule, the posterior density is
$$f_{X|Y}(x \mid y) = \frac{f_{X,Y}(x, y)}{f_Y(y)} = (y + a)^2\, x \exp(-(y + a)x)\, u(x)$$
We find the estimate using integration by parts:
$$\hat{X}_{MSE}(y) = \int_{-\infty}^{\infty} x\, f_{X|Y}(x \mid y)\, dx = \frac{2}{y + a}$$
To obtain the MAE estimate, we need to solve
$$\frac{1}{2} = \int_{-\infty}^{\hat{x}} f_{X|Y}(x \mid y)\, dx = (y + a)^2 \int_0^{\hat{x}} x \exp(-(y + a)x)\, dx = 1 - [1 + (y + a)\hat{x}] \exp(-(y + a)\hat{x})$$
so $\hat{X}_{MAE}(y)$ satisfies $[1 + (y + a)\hat{x}] \exp(-(y + a)\hat{x}) = 1/2$, which must be solved numerically.
For the MAP estimate, we set the derivative of the posterior density to zero:
$$\frac{\partial}{\partial x} f_{X|Y}(x \mid y) = (y + a)^2 \exp(-(y + a)x)\, [1 - (y + a)x] = 0$$
which yields
$$\hat{X}_{MAP}(y) = \frac{1}{y + a}$$
So in this example, the MAP, MAE and MSE estimates take
different values.
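A quick numerical check of this example, with assumed values y = 1 and a = 1; the closed-form MSE and MAP estimates come straight from the formulas above, and the MAE equation is solved with SciPy's brentq root finder:

```python
import numpy as np
from scipy.optimize import brentq

# Assumed values for illustration; posterior is (y+a)^2 x exp(-(y+a)x) u(x).
y, a = 1.0, 1.0
lam = y + a

x_mse = 2.0 / lam   # conditional mean:  2/(y+a)
x_map = 1.0 / lam   # posterior mode:    1/(y+a)

# Conditional median: solve [1 + lam*x] exp(-lam*x) = 1/2 numerically.
x_mae = brentq(lambda x: (1.0 + lam * x) * np.exp(-lam * x) - 0.5, 0.0, 10.0)

print(x_mse, x_mae, x_map)   # ~ 1.0, 0.84, 0.5 -- three different estimates
```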