
Estimation of Nonrandom Parameters

Ajit K Chaturvedi



4.4 Estimation of Nonrandom Parameters

Earlier we assumed that some a priori information was available for the parameter vector X in the form of the density f_X(x) or the mean vector m_X and covariance matrix K_X.
An alternative perspective consists of viewing the parameter vector X as unknown but nonrandom.
In this approach, the only available information is the measurement vector Y ∈ R^n and the observation model specified by the density f_{Y|X}(y | x).
The density f_{Y|X}(y | x), viewed as a function of x, is called the likelihood function.
It indicates how likely we are to observe Y = y when the parameter vector is x.

Any function g(Y) of the observations taking values in R^m can be viewed as an estimator of X.
Among such estimators, the maximum likelihood estimator X̂_ML(Y) maximizes the likelihood function f_{Y|X}(y | x), i.e.,

    X̂_ML(y) = arg max_{x ∈ R^m} f_{Y|X}(y | x).

Since the logarithm is a monotone function, i.e. z_1 < z_2 if and only if ln(z_1) < ln(z_2), we can equivalently obtain the ML estimate by maximizing the log-likelihood function:

    X̂_ML(y) = arg max_{x ∈ R^m} ln f_{Y|X}(y | x).
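
This definition can be checked numerically: for i.i.d. Gaussian observations with known variance, maximizing the log-likelihood over the unknown mean should reproduce the familiar sample mean derived later in these notes. Below is a minimal Python sketch, for illustration only; the data and parameter values are synthetic and hypothetical, and SciPy's generic optimizer stands in for the analytical maximization.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

rng = np.random.default_rng(0)
y = rng.normal(loc=2.0, scale=1.0, size=100)   # synthetic data: true mean 2, known variance 1

def neg_log_likelihood(x):
    # -ln f_{Y|X}(y | x) for i.i.d. N(x, 1) observations
    return -np.sum(norm.logpdf(y, loc=x, scale=1.0))

result = minimize_scalar(neg_log_likelihood)   # numerical maximization of the log-likelihood
print("numerical ML estimate:", result.x)
print("sample mean          :", y.mean())      # the two should agree closely
```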

Although the MAP estimate was obtained by using a different
approach, it is closely related to the ML estimate.
The marginal density f_Y(y) in the denominator of the expression for the conditional density f_{X|Y} does not affect the maximization, so we can rewrite the MAP estimate as

    X̂_MAP(y) = arg max_{x ∈ R^m} f_{Y,X}(y, x)
              = arg max_{x ∈ R^m} [ ln f_{Y|X}(y | x) + ln f_X(x) ].

We see that the only difference between the MAP and ML estimates is that the objective function maximized by the MAP estimate is formed by adding to the log-likelihood function a term ln f_X(x) representing the a priori information about X.
Therefore, when X admits a uniform distribution, f_X(x) is constant, and hence the two estimates coincide.
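
To illustrate the relation numerically, the following sketch (synthetic data and a hypothetical Gaussian prior, not taken from the notes) maximizes the log-likelihood alone and then the log-likelihood plus log-prior. With few observations the MAP estimate is pulled toward the prior mean; with a flat prior the two would coincide, as stated above.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

rng = np.random.default_rng(1)
y = rng.normal(loc=2.0, scale=1.0, size=10)    # few observations, known noise variance 1

def neg_log_likelihood(x):
    return -np.sum(norm.logpdf(y, loc=x, scale=1.0))

def neg_log_posterior(x):
    # MAP objective = log-likelihood + log-prior; hypothetical N(0, 4) prior on X
    return neg_log_likelihood(x) - norm.logpdf(x, loc=0.0, scale=2.0)

x_ml  = minimize_scalar(neg_log_likelihood).x
x_map = minimize_scalar(neg_log_posterior).x
print(x_ml, x_map)   # MAP is pulled toward the prior mean 0
```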
This indicates that the dichotomy between the random and
nonrandom formulations of parameter estimation is not
completely strict, and it is often possible to switch from one
viewpoint to the other.

Example 4.4: Signal with unknown amplitude

We observe an N-dimensional Gaussian vector

    Y ∼ N(As, σ² I_N).

Here the noise variance σ² and the signal s ∈ R^N are known, but the amplitude A is unknown.
The density of Y can be expressed as

    f_{Y|A}(y | A) = (1 / (2πσ²)^{N/2}) exp( − ∥y − As∥₂² / (2σ²) ).

Â_ML is obtained by maximizing

    ln f_{Y|A}(y | A) = − ∥y − As∥₂² / (2σ²) + c.

Here c does not depend on A or y, so Â_ML is obtained by minimizing ∥y − As∥₂² over A, which gives

    Â_ML(y) = sᵀy / ∥s∥₂².
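
A quick numerical sanity check of this closed form (synthetic signal and noise, hypothetical values): the formula sᵀy/∥s∥₂² should match an ordinary least-squares fit of y on s, since both minimize ∥y − As∥₂².

```python
import numpy as np

rng = np.random.default_rng(2)
N, A_true, sigma = 50, 1.5, 0.3
s = rng.normal(size=N)                    # known signal shape (hypothetical values)
y = A_true * s + sigma * rng.normal(size=N)

# Closed-form ML estimate from the slide: A_hat = s^T y / ||s||_2^2
A_hat = s @ y / (s @ s)

# Cross-check: the same value minimizes ||y - A s||_2^2 (ordinary least squares)
A_ls, *_ = np.linalg.lstsq(s.reshape(-1, 1), y, rcond=None)
print(A_hat, A_ls[0])                     # should agree to numerical precision
```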

Bias

The bias b(x) of an estimator X̂(Y) is the expectation of its error, i.e.,

    b(x) = x − E[X̂(Y)].

An estimator is said to be unbiased if its bias is zero, i.e. if b(x) = 0.
This property just indicates that when the estimator is averaged over a large number of realizations, it gives the correct parameter vector value.
This property is rather weak since it does not ensure that, for a single realization, the estimator X̂(Y) is close to the true parameter vector.
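
As a rough illustration, the bias can be approximated by Monte Carlo: average the estimator over many simulated realizations and compare with the true parameter. The sketch below does this for the amplitude estimator of Example 4.4; the data and parameter values are synthetic and hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
N, A_true, sigma = 50, 1.5, 0.3
s = rng.normal(size=N)                       # known signal shape (hypothetical values)

# Monte Carlo approximation of b(A) = A - E[A_hat(Y)] for the estimator of Example 4.4
estimates = []
for _ in range(20000):
    y = A_true * s + sigma * rng.normal(size=N)
    estimates.append(s @ y / (s @ s))

print("empirical bias:", A_true - np.mean(estimates))   # close to 0: unbiased
```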

Example 4.6: ML estimates of the mean and variance of i.i.d. Gaussian random variables

Assume that we observe a sequence {Y_k, 1 ≤ k ≤ N} of i.i.d. N(m, v) random variables.
The probability density of the vector

    Y = [Y_1  Y_2  ...  Y_N]ᵀ

is

    f_Y(y | m, v) = (1 / (2πv)^{N/2}) exp( −(1/(2v)) Σ_{k=1}^{N} (y_k − m)² ).

Depending on whether m and/or v is unknown, we consider three estimation cases.

Case 1: m unknown, v known.
In this case the observation vector density is denoted as f_Y(y | m), and we observe that it has the form considered in Example 4.4, with A = m and

    s = u ≜ [1  1  ...  1]ᵀ.

This leads to

    m̂_ML(Y) = (1/N) Σ_{k=1}^{N} Y_k,

i.e. the sampled mean of the observations.
We have

    E[m̂_ML(Y)] = (1/N) Σ_{k=1}^{N} E[Y_k] = m,

so the estimator is unbiased.
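
A small check (synthetic data, hypothetical values): substituting s = u into the closed-form estimator of Example 4.4 indeed returns the sampled mean.

```python
import numpy as np

rng = np.random.default_rng(4)
N, m_true, v_known = 25, 3.0, 2.0            # hypothetical values
y = rng.normal(m_true, np.sqrt(v_known), size=N)

u = np.ones(N)                               # s = u = [1 1 ... 1]^T
m_hat = u @ y / (u @ u)                      # Example 4.4 formula with A = m, s = u
print(m_hat, y.mean())                       # identical: the ML estimate is the sampled mean
```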

Case 2: m known, v unknown.
In this case the observation vector density is denoted as f_Y(y | v), and the log-likelihood is given by

    ln f_Y(y | v) = −(N/2) ln(2πv) − (1/(2v)) Σ_{k=1}^{N} (y_k − m)².

Taking its derivative with respect to v yields

    (∂/∂v) ln f_Y(y | v) = −N/(2v) + (1/(2v²)) Σ_{k=1}^{N} (y_k − m)².

Setting this derivative equal to zero, we obtain the ML estimate

    v̂_ML(Y) = (1/N) Σ_{k=1}^{N} (Y_k − m)²,

i.e. the sampled variance of the observations.
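
A short simulation sketch (synthetic data, hypothetical values) evaluates v̂_ML for one realization and averages it over many realizations; the average lands close to the true variance, consistent with the unbiasedness argument a few slides below.

```python
import numpy as np

rng = np.random.default_rng(5)
N, m_known, v_true = 40, 3.0, 2.0            # hypothetical values
Y = rng.normal(m_known, np.sqrt(v_true), size=(100000, N))

v_hat = np.mean((Y - m_known) ** 2, axis=1)  # v_ML per realization (mean is known)
print(v_hat[0])                              # one realization's estimate
print(v_hat.mean())                          # average over realizations ≈ v_true
```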


To ensure that the estimate represents a maximum of the log-likelihood function, we need to verify that the second derivative

    (∂²/∂v²) ln f_Y(y | v) = N/(2v²) − (1/v³) Σ_{k=1}^{N} (y_k − m)²

is negative at this point.
This is the case since

    (∂²/∂v²) ln f_Y(y | v) |_{v = v̂_ML} = − N / (2 v̂_ML²) < 0.

By observing that

    E[v̂_ML(Y)] = (1/N) Σ_{k=1}^{N} E[(Y_k − m)²] = v,

we conclude that the ML estimate is unbiased.

Case 3: m and v unknown.
In this case, the derivatives of the log-likelihood function

    L(y | m, v) ≜ ln f_Y(y | m, v)
                = −(N/2) ln(2πv) − (1/(2v)) Σ_{k=1}^{N} (y_k − m)²

with respect to m and v are given by

    (∂/∂m) L(y | m, v) = (1/v) Σ_{k=1}^{N} (y_k − m),
Further,

    (∂/∂v) L(y | m, v) = −N/(2v) + (1/(2v²)) Σ_{k=1}^{N} (y_k − m)².

Setting these two derivatives equal to zero, we need to solve the resulting coupled equations.
We find that the ML estimate of m is the same as in the case where the variance is known.
But the ML estimate of v is now given by

    v̂_ML(Y) = (1/N) Σ_{k=1}^{N} (Y_k − m̂_ML)².

This estimate can be viewed as obtained by replacing the unknown mean m in the earlier estimator by the sampled mean m̂_ML.
Even though this is a reasonable choice, it affects the properties of the resulting estimator, and in particular its bias.
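
A brief simulation (synthetic data, hypothetical values) makes the effect on the bias visible: averaging v̂_ML over many realizations gives a value below the true variance, consistent with the standard factor (N − 1)/N for this estimator.

```python
import numpy as np

rng = np.random.default_rng(6)
N, m_true, v_true = 10, 3.0, 2.0                       # hypothetical values
Y = rng.normal(m_true, np.sqrt(v_true), size=(200000, N))

m_hat = Y.mean(axis=1, keepdims=True)                  # sampled mean per realization
v_hat = np.mean((Y - m_hat) ** 2, axis=1)              # ML variance estimate per realization

print(v_hat.mean())                                    # ≈ (N-1)/N * v_true, i.e. biased low
print((N - 1) / N * v_true)
```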
