[Fig. 10.9.1 UWT/DWT decompositions and wavelet coefficients of housing data (decomposition levels 5-7 versus time and UWT index).]

10.3 Prove the downsampling replication property (10.4.11) by working backwards, that is, start from the Fourier transform expression and show that

    (1/L) Σ_{m=0}^{L−1} X(f − m f_s^down) = Σ_k s(k) x(k) e^{−2πjfk/f_s} = Σ_n x(nL) e^{−2πjfnL/f_s} = Y_down(f)

where s(k) is the periodic "sampling function" with the following representations:

    s(k) = (1/L) Σ_{m=0}^{L−1} e^{−2πjkm/L} = (1/L) · (1 − e^{−2πjk}) / (1 − e^{−2πjk/L}) = Σ_n δ(k − nL)

Moreover, show that the above representations are nothing but the inverse L-point DFT of the DFT of one period of the periodic pulse train:

    s(k) = [ . . . , 1, 0, 0, . . . , 0, 1, 0, 0, . . . , 0, 1, 0, 0, . . . , 0, . . . ] = Σ_n δ(k − nL)

where each group of L−1 zeros separates successive ones.

10.4 Show that the solution to the optimization problem (10.7.7) is the soft-thresholding rule of Eq. (10.7.8).

10.5 Study the "Tikhonov regularizer" wavelet thresholding function:

    d_thr = f(d, λ, a) = d · |d|^a / (|d|^a + λ^a) ,   a > 0 ,  λ > 0

11
Wiener Filtering

The problem of estimating one signal from another is one of the most important in signal processing. In many applications, the desired signal is not available or observable directly. Instead, the observable signal is a degraded or distorted version of the original signal. The signal estimation problem is to recover, in the best way possible, the desired signal from its degraded replica.

We mention some typical examples: (1) The desired signal may be corrupted by strong additive noise, such as weak evoked brain potentials measured against the strong background of ongoing EEGs; or weak radar returns from a target in the presence of strong clutter. (2) An antenna array designed to be sensitive towards a particular "look" direction may be vulnerable to strong jammers from other directions due to sidelobe leakage; the signal processing task here is to null the jammers while at the same time maintaining the sensitivity of the array towards the desired look direction. (3) A signal transmitted over a communications channel can suffer phase and amplitude distortions and can be subject to additive channel noise; the problem is to recover the transmitted signal from the distorted received signal. (4) A Doppler radar processor tracking a moving target must take into account dynamical noise—such as small purely random accelerations—affecting the dynamics of the target, as well as measurement errors. (5) An image recorded by an imaging system is subject to distortions such as blurring due to motion or to the finite aperture of the system, or other geometric distortions; the problem here is to undo the distortions introduced by the imaging system and restore the original image. A related problem, of interest in medical image processing, is that of reconstructing an image from its projections. (6) In remote sensing and inverse scattering applications, the basic problem is, again, to infer one signal from another; for example, to infer the temperature profile of the atmosphere from measurements of the spectral distribution of infrared energy; or to deduce the structure of a dielectric medium, such as the ionosphere, by studying its response to electromagnetic wave scattering; or, in oil exploration to infer the layered structure of the earth by measuring its response to an impulsive input near its surface.

In this chapter, we pose the signal estimation problem and discuss some of the criteria used in the design of signal estimation algorithms.

We do not present a complete discussion of all methods of signal recovery and estimation that have been invented for applications as diverse as those mentioned above.
Our emphasis is on traditional linear least-squares estimation methods, not only because they are widely used, but also because they have served as the motivating force for the development of other estimation techniques and as the yardstick for evaluating them.

We develop the theoretical solution of the Wiener filter both in the stationary and nonstationary cases, and discuss its connection to the orthogonal projection, Gram-Schmidt constructions, and correlation canceling ideas of Chap. 1. By means of an example, we introduce Kalman filtering concepts and discuss their connection to Wiener filtering and to signal modeling. Practical implementations of the Wiener filter are discussed in Chapters 12 and 16. Other signal recovery methods for deconvolution applications that are based on alternative design criteria are briefly discussed in Chap. 12, where we also discuss some interesting connections between Wiener filtering/linear prediction methods and inverse scattering methods.

11.1 Linear and Nonlinear Estimation of Signals

The resulting estimate x̂n will be a function of the observations yn. If the optimal processor is linear, such as a linear filter, then the estimate x̂n will be a linear function of the observations. We are going to concentrate mainly on linear processors. However, we would like to point out that, depending on the estimation criterion, there are cases where the estimate x̂n may turn out to be a nonlinear function of the yn's.

We discuss briefly four major estimation criteria for designing such optimal processors. They are:

(1) The maximum a posteriori (MAP) criterion.
(2) The maximum likelihood (ML) criterion.
(3) The mean square (MS) criterion.
(4) The linear mean-square (LMS) criterion.†

To explain the various estimation criteria, let us assume that the desired signal xn is to be estimated over a finite time interval na ≤ n ≤ nb. Without loss of generality, we may assume that the observed signal yn is also available over the same interval. Define the vectors

    x = [xna, xna+1, . . . , xnb]^T ,    y = [yna, yna+1, . . . , ynb]^T

For each value of n, we seek the functional dependence

    x̂n = x̂n(y)

of x̂n on the given observation vector y that provides the best estimate of xn.

1. The criterion for the MAP estimate is to maximize the a posteriori conditional density of xn given that y already occurred; namely,

       p(xn|y) = maximum                                          (11.1.1)

   This criterion selects x̂n as though the already collected observations y were the most likely to occur.

2. The ML criterion selects x̂n to maximize the conditional density of the observations given xn, that is, the likelihood

       p(y|xn) = maximum                                          (11.1.2)

3. The MS criterion minimizes the mean-square estimation error

       E = E[en²] = min ,   where   en = xn − x̂n                  (11.1.3)

   that is, the best choice of the functional dependence x̂n = x̂n(y) is sought that minimizes this expression. We know from our results of Sec. 1.4 that the required solution is the corresponding conditional mean

       x̂n = E[xn|y] = MS estimate                                 (11.1.4)

4. Finally, the LMS criterion requires the estimate to be a linear function of the observations,

       x̂n = Σ_{i=na}^{nb} h(n, i) yi                              (11.1.5)

   For each n, the weights h(n, i), na ≤ i ≤ nb are selected to minimize the mean-square estimation error

       E = E[en²] = E[(xn − x̂n)²] = minimum                       (11.1.6)

† Note that the acronym LMS is also used in the context of adaptive filtering, for least mean-square.
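As a small numerical illustration of criterion 4, the following sketch (synthetic jointly gaussian data with hypothetical sizes; each row plays the role of one realization of xn and its observation vector y for a fixed n) estimates the weights of Eq. (11.1.5) by minimizing the sample version of Eq. (11.1.6):

```python
import numpy as np

rng = np.random.default_rng(6)
N, M = 50_000, 4                         # hypothetical: N realizations, M observations each

# Synthetic example: scalar x_n and an M-vector y correlated with it (jointly gaussian).
x = rng.standard_normal(N)
A = rng.standard_normal(M)               # arbitrary mixing vector
y = np.outer(x, A) + 0.5 * rng.standard_normal((N, M))

# LMS weights h(n, i) of Eq. (11.1.5): least squares minimizes the sample form of Eq. (11.1.6).
h, *_ = np.linalg.lstsq(y, x, rcond=None)
xhat = y @ h
print("mean-square error of LMS estimate:", np.mean((x - xhat) ** 2))

# For comparison, the best estimate that uses only the single observation y[:, 0].
g = (x @ y[:, 0]) / (y[:, 0] @ y[:, 0])
print("mean-square error using y[:,0] only:", np.mean((x - g * y[:, 0]) ** 2))
```

Using the full observation vector always does at least as well as any single observation, which is the point of the linear combination in Eq. (11.1.5).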
With the exception of the LMS estimate, all other estimates x̂n(y) are, in general, nonlinear functions of y.

Example 11.1.1: If both xn and y are zero-mean and jointly gaussian, then Examples 1.4.1 and 1.4.2 imply that the MS and LMS estimates of xn are the same. Furthermore, since p(xn|y) is gaussian it will be symmetric about its maximum, which occurs at its mean, that is, at E[xn|y]. Therefore, the MAP estimate of xn is equal to the MS estimate. In conclusion, for zero-mean jointly gaussian xn and y, the three estimates MAP, MS, and LMS coincide.

Example 11.1.2: To see the nonlinear character and the differences among the various estimates, consider the following example: A discrete-amplitude, constant-in-time signal x can take on the three values

    x = −1 ,   x = 0 ,   x = 1

each with probability of 1/3. This signal is placed on a known carrier waveform cn and transmitted over a noisy channel. The received samples are of the form

    yn = cn x + vn ,   n = 1, 2, . . . , M

where vn are zero-mean white gaussian noise samples of variance σv², assumed to be independent of x. The above set of measurements can be written in an obvious vector notation

    y = c x + v

(a) Determine the conditional densities p(y|x) and p(x|y).
(b) Determine and compare the four alternative estimates MAP, ML, MS, and LMS.

Solution: To compute p(y|x), note that if x is given, then the only randomness left in y arises from the noise term v. Since vn are uncorrelated and gaussian, they will be independent; therefore,

    p(y|x) = p(v) = Π_{n=1}^{M} p(vn) = (2πσv²)^{−M/2} exp[ −(1/(2σv²)) Σ_{n=1}^{M} vn² ]
           = (2πσv²)^{−M/2} exp[ −(1/(2σv²)) v^T v ] = (2πσv²)^{−M/2} exp[ −(1/(2σv²)) (y − c x)^T (y − c x) ]

Using Bayes' rule we find p(x|y) = p(y|x)p(x)/p(y). Since

    p(x) = (1/3)[ δ(x − 1) + δ(x) + δ(x + 1) ]

we find

    p(x|y) = (1/A)[ p(y|1)δ(x − 1) + p(y|0)δ(x) + p(y|−1)δ(x + 1) ]

where the constant A is

    A = 3p(y) = 3∫ p(y|x)p(x)dx = p(y|1) + p(y|0) + p(y|−1)

To find the MAP estimate of x, the quantity p(x|y) must be maximized with respect to x. Since the expression for p(x|y) forces x to be one of the three values +1, 0, −1, it follows that the maximum among the three coefficients p(y|1), p(y|0), p(y|−1) will determine the value of x. Thus, for a given y we select that x that

    p(y|x) = maximum of { p(y|1), p(y|0), p(y|−1) }

Using the gaussian nature of p(y|x), we find equivalently

    (y − c x)^T(y − c x) = minimum of { (y − c)^T(y − c), y^T y, (y + c)^T(y + c) }

Subtracting y^T y from both sides, dividing by c^T c, and denoting

    ȳ = c^T y / (c^T c)

we find the equivalent equation

    x² − 2xȳ = minimum of { 1 − 2ȳ, 0, 1 + 2ȳ }

and in particular, applying these for +1, 0, −1, we find

    x̂_MAP = {  1 ,   if ȳ > 1/2
               0 ,   if −1/2 < ȳ < 1/2
              −1 ,   if ȳ < −1/2  }

To determine the ML estimate, we must maximize p(y|x) with respect to x. The ML estimate does not require knowledge of the a priori probability density p(x) of x. Therefore, differentiating p(y|x) with respect to x and setting the derivative to zero gives

    (∂/∂x) p(y|x) = 0   or   (∂/∂x) ln p(y|x) = 0   or   (∂/∂x) (y − c x)^T(y − c x) = 0

which gives

    x̂_ML = c^T y / (c^T c) = ȳ

The MS estimate is obtained by computing the conditional mean

    E[x|y] = ∫ x p(x|y) dx = ∫ x (1/A)[ p(y|1)δ(x − 1) + p(y|0)δ(x) + p(y|−1)δ(x + 1) ] dx
           = (1/A)[ p(y|1) − p(y|−1) ] ,   or,

    x̂_MS = [ p(y|1) − p(y|−1) ] / [ p(y|1) + p(y|0) + p(y|−1) ]

Canceling some common factors from the numerator and denominator leads to a simpler expression in terms of ȳ.
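As a concrete check of Example 11.1.2, the following sketch (assumed values M = 8 and σv = 1, and a randomly chosen carrier c, all hypothetical) simulates one transmission and evaluates the MAP, ML, and MS estimates directly from the formulas above.

```python
import numpy as np

rng = np.random.default_rng(0)
M, sigma_v = 8, 1.0                      # assumed example parameters (hypothetical)
c = rng.standard_normal(M)               # known carrier waveform c_n

def estimates(y, c, sigma_v):
    """MAP, ML, and MS estimates of x from y = c*x + v (Example 11.1.2)."""
    ybar = c @ y / (c @ c)
    # log p(y|x) up to a common constant, for x = +1, 0, -1
    logp = {x: -np.sum((y - c * x) ** 2) / (2 * sigma_v**2) for x in (1, 0, -1)}
    x_map = max(logp, key=logp.get)                       # maximize p(y|x) over {+1, 0, -1}
    x_ml = ybar                                           # x_ML = c^T y / c^T c
    w = {x: np.exp(v - max(logp.values())) for x, v in logp.items()}  # proportional to p(y|x)
    x_ms = (w[1] - w[-1]) / (w[1] + w[0] + w[-1])         # conditional mean E[x|y]
    return x_map, x_ml, x_ms

x_true = rng.choice([-1, 0, 1])
y = c * x_true + sigma_v * rng.standard_normal(M)
print("true x =", x_true, "  (MAP, ML, MS) =", estimates(y, c, sigma_v))
```

Running it a few times shows the MAP output snapping to one of the three levels while the MS output varies smoothly with ȳ, which is the nonlinearity discussed next.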
All four estimates have been expressed in terms of ȳ. Note that the ML estimate is linear but has a different slope than the LMS estimate. The nonlinearity of the various estimates is best seen in the following figure:

[Figure: the four estimates plotted as functions of ȳ.]

11.2 Orthogonality and Normal Equations

Since some of the observations are to the future of xn, the linear operation is not causal. This does not present a problem if the sequence yn is already available and stored in memory.

The optimal filtering problem, on the other hand, requires the linear operation (11.1.5) to be causal, that is, only those observations that are in the present and past of the current sample xn must be used in making up the estimate x̂n. This requires that the matrix of optimal weights h(n, i) be lower triangular, that is,

    h(n, i) = 0 ,   for i > n

Thus, in reference to the figure below, only the shaded portion of the observation interval is used at the current time instant:

[Figure: observation interval na ≤ i ≤ nb with the causal portion na ≤ i ≤ n shaded.]

as channel equalization and echo cancellation; we also discuss two alternative adaptive implementations—the so-called "gradient lattice," and the "recursive least-squares." Finally, the linear prediction problem is a special case of the optimal filtering problem with the additional stipulation that observations only up to time instant n − D must be used in obtaining the current estimate x̂n; this is equivalent to the problem of predicting D units of time into the future. The range of observations used in this case is shown below:

    x̂n = Σ_{i=na}^{n−D} h(n, i) yi

Of special interest to us will be the case of one-step prediction, corresponding to the choice D = 1. This is depicted below:

    x̂n = Σ_{i=na}^{n−1} h(n, i) yi

If we demand that the prediction be based only on the past M samples (from the current sample), we obtain the FIR version of the prediction problem, referred to as linear prediction based on the past M samples, which is depicted below:

    x̂n = Σ_{i=n−M}^{n−1} h(n, i) yi = Σ_{m=1}^{M} h(n, n − m) yn−m

Next, we set up the orthogonality and normal equations for the optimal weights. We begin with the smoothing problem. The estimation error is in this case

    en = xn − x̂n = xn − Σ_{i=na}^{nb} h(n, i) yi                                    (11.2.1)

Differentiating the mean-square estimation error (11.1.6) with respect to each weight h(n, i), na ≤ i ≤ nb, and setting the derivative to zero, we obtain the orthogonality equations that are enough to determine the weights:

    ∂E/∂h(n, i) = 2E[ en ∂en/∂h(n, i) ] = −2E[en yi] = 0 ,   for na ≤ i ≤ nb ,   or,

    Rey(n, i) = E[en yi] = 0    (orthogonality equations)                            (11.2.2)

that is, the estimation error en must be orthogonal to each observation yi used in making up the estimate x̂n. The orthogonality equations provide exactly as many equations as there are unknown weights.

Inserting Eq. (11.2.1) for en, the orthogonality equations may be written in an equivalent form, known as the normal equations

    E[ (xn − Σ_{k=na}^{nb} h(n, k) yk) yi ] = 0 ,   or,

    E[xn yi] = Σ_{k=na}^{nb} h(n, k) E[yk yi]    (normal equations)                   (11.2.3)

These determine the optimal weights at the current time instant n. In the vector notation of Sec. 11.1, we write Eq. (11.2.3) as

    E[x y^T] = H E[y y^T]

where H is the matrix of weights h(n, i). The optimal H and the estimate are then

    x̂ = Hy = E[x y^T] E[y y^T]^{−1} y

This is identical to the correlation canceler of Sec. 1.4. The orthogonality equations (11.2.2) are precisely the correlation cancellation conditions. Extracting the nth row of this matrix equation, we find an explicit expression for the nth estimate x̂n

    x̂n = E[xn y^T] E[y y^T]^{−1} y

which is recognized as the projection of the random variable xn onto the subspace spanned by the available observations; namely, Y = {yna, yna+1, . . . , ynb}. This is a general result: The minimum mean-square linear estimate x̂n is the projection of xn onto the subspace spanned by all the observations that are used to make up that estimate. This result is a direct consequence of the quadratic minimization criterion (11.1.6) and the orthogonal projection theorem discussed in Sec. 1.6.

Using the methods of Sec. 1.4, the minimized estimation error at time instant n is easily computed by

    En = E[en en] = E[en xn] = E[ (xn − Σ_{i=na}^{nb} h(n, i) yi) xn ]
       = E[xn²] − Σ_{i=na}^{nb} h(n, i) E[yi xn] = E[xn²] − E[xn y^T] E[y y^T]^{−1} E[y xn]

which corresponds to the diagonal entries of the covariance matrix of the estimation error e:

    Ree = E[e e^T] = E[x x^T] − E[x y^T] E[y y^T]^{−1} E[y x^T]

The optimum filtering problem is somewhat more complicated because of the causality condition. In this case, the estimate at time n is given by

    x̂n = Σ_{i=na}^{n} h(n, i) yi                                                      (11.2.4)

Inserting this into the minimization criterion (11.1.6) and differentiating with respect to h(n, i) for na ≤ i ≤ n, we find again the orthogonality conditions

    Rey(n, i) = E[en yi] = 0 ,   for na ≤ i ≤ n                                        (11.2.5)

where the most important difference from Eq. (11.2.2) is the restriction on the range of i, that is, en is decorrelated only from the present and past values of yi. Again, the estimation error en is orthogonal to each observation yi that is being used to make up
the estimate. The orthogonality equations can be converted into the normal equations as follows:

    E[en yi] = E[ (xn − Σ_{k=na}^{n} h(n, k) yk) yi ] = 0 ,   or,

    E[xn yi] = Σ_{k=na}^{n} h(n, k) E[yk yi] ,   for na ≤ i ≤ n ,   or,                 (11.2.6)

    Rxy(n, i) = Σ_{k=na}^{n} h(n, k) Ryy(k, i) ,   for na ≤ i ≤ n                        (11.2.7)

Such equations are generally known as Wiener-Hopf equations. Introducing the vector of observations up to the current time n, namely,

    yn = [yna, yna+1, . . . , yn]^T

we may write Eq. (11.2.6) in vector form as

    E[xn yn^T] = [ h(n, na), h(n, na+1), . . . , h(n, n) ] E[yn yn^T]

which can be solved for the vector of weights

    [ h(n, na), h(n, na+1), . . . , h(n, n) ] = E[xn yn^T] E[yn yn^T]^{−1}

and for the estimate x̂n:

    x̂n = E[xn yn^T] E[yn yn^T]^{−1} yn                                                   (11.2.8)

Again, x̂n is recognized as the projection of xn onto the space spanned by the observations that are used in making up the estimate; namely, Yn = {yna, yna+1, . . . , yn}. This solution of Eqs. (11.2.5) and (11.2.7) will be discussed in more detail in Sec. 11.8, using covariance factorization methods.

11.3 Stationary Wiener Filter

The first simplifying assumption is stationarity, so that the cross-correlation and autocorrelation appearing in Eq. (11.2.7) become functions of the differences of their arguments. The second assumption is to take the initial time na to be the infinite past, na = −∞, that is, the observation interval is Yn = {yi, −∞ < i ≤ n}.

The assumption of stationarity can be used as follows: Suppose we have the solution h(n, i) of Eq. (11.2.7) for the best weights to estimate xn, and wish to determine the best weights h(n + d, i), na ≤ i ≤ n + d for estimating the sample xn+d at the future time n + d. Then, the new weights will satisfy the same equations as (11.2.7) with the changes

    Rxy(n + d, i) = Σ_{k=na}^{n+d} h(n + d, k) Ryy(k, i) ,   for na ≤ i ≤ n + d           (11.3.1)

Making a change of variables i → i + d and k → k + d, we rewrite Eq. (11.3.1) as

    Rxy(n + d, i + d) = Σ_{k=na−d}^{n} h(n + d, k + d) Ryy(k + d, i + d) ,   for na − d ≤ i ≤ n      (11.3.2)

Now, if we assume stationarity, Eqs. (11.2.7) and (11.3.2) become

    Rxy(n − i) = Σ_{k=na}^{n} h(n, k) Ryy(k − i) ,   for na ≤ i ≤ n
                                                                                          (11.3.3)
    Rxy(n − i) = Σ_{k=na−d}^{n} h(n + d, k + d) Ryy(k − i) ,   for na − d ≤ i ≤ n

If it were not for the differences in the ranges of i and k, these two equations would be the same. But this is exactly what happens when we make the second assumption that na = −∞. Therefore, by uniqueness of the solution, we find in this case

    h(n + d, k + d) = h(n, k)

and since d is arbitrary, it follows that h(n, k) must be a function of the difference of its arguments, that is,

    h(n, k) = h(n − k)                                                                    (11.3.4)

Thus, the optimal linear processor becomes a shift-invariant causal linear filter and the estimate is given by

    x̂n = Σ_{i=−∞}^{n} h(n − i) yi = Σ_{i=0}^{∞} h(i) yn−i                                  (11.3.5)

and Eq. (11.3.3) becomes in this case

    Rxy(n − i) = Σ_{k=−∞}^{n} h(n − k) Ryy(k − i) ,   for −∞ < i ≤ n

or, replacing n − i by n and n − k by k,

    Rxy(n) = Σ_{k=0}^{∞} h(k) Ryy(n − k) ,   for n ≥ 0                                      (11.3.6)

and written in matrix form

    ⎡ Ryy(0)  Ryy(1)  Ryy(2)  Ryy(3)  ··· ⎤ ⎡ h(0) ⎤   ⎡ Rxy(0) ⎤
    ⎢ Ryy(1)  Ryy(0)  Ryy(1)  Ryy(2)  ··· ⎥ ⎢ h(1) ⎥   ⎢ Rxy(1) ⎥
    ⎢ Ryy(2)  Ryy(1)  Ryy(0)  Ryy(1)  ··· ⎥ ⎢ h(2) ⎥ = ⎢ Rxy(2) ⎥                           (11.3.7)
    ⎢ Ryy(3)  Ryy(2)  Ryy(1)  Ryy(0)  ··· ⎥ ⎢ h(3) ⎥   ⎢ Rxy(3) ⎥
    ⎣   ⋮       ⋮       ⋮       ⋮         ⎦ ⎣  ⋮   ⎦   ⎣   ⋮    ⎦

These are the discrete-time Wiener-Hopf equations. Were it not for the restriction n ≥ 0 (which reflects the requirement of causality), they could be solved easily by z-transform methods. As written above, they require methods of spectral factorization for their solution.
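As a rough numerical illustration (a sketch only, not one of the spectral-factorization methods discussed next), one can truncate the semi-infinite system (11.3.7) to a finite order and solve it directly; the correlation lags below are hypothetical stand-ins.

```python
import numpy as np
from scipy.linalg import solve_toeplitz

# Hypothetical correlation lags Ryy(k), Rxy(k) for k = 0..M, for illustration only.
M = 4
Ryy = np.array([2.0, 1.2, 0.72, 0.432, 0.2592])   # first column/row of the Toeplitz matrix
Rxy = np.array([1.0, 0.6, 0.36, 0.216, 0.1296])

# Truncated version of Eq. (11.3.7): solve the (M+1)x(M+1) Toeplitz system for h(0..M).
h = solve_toeplitz((Ryy, Ryy), Rxy)
print("truncated Wiener weights h =", h)

# Sanity check against a dense solve of the same system.
T = np.array([[Ryy[abs(i - j)] for j in range(M + 1)] for i in range(M + 1)])
print("max residual:", np.max(np.abs(T @ h - Rxy)))
```

For reasonably behaved spectra, increasing M makes this truncation approach the causal solution; the methods that follow construct it exactly.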
Before we discuss such methods, we mention in passing the continuous-time version of the Wiener-Hopf equation:

    Rxy(t) = ∫_0^∞ Ryy(t − t′) h(t′) dt′ ,   t ≥ 0

We also consider the FIR Wiener filtering problem in the stationary case. The observation interval in this case is Yn = {yi, n − M ≤ i ≤ n}. Using the same arguments as above we have h(n, i) = h(n − i), and the estimate x̂n is obtained by an ordinary FIR linear filter

    x̂n = Σ_{i=n−M}^{n} h(n − i) yi = h(0)yn + h(1)yn−1 + ··· + h(M)yn−M                      (11.3.8)

where the (M+1) filter weights h(0), h(1), . . . , h(M) are obtained by the (M+1)×(M+1) matrix version of the Wiener-Hopf normal equations:

    ⎡ Ryy(0)   Ryy(1)    Ryy(2)    ···  Ryy(M)   ⎤ ⎡ h(0) ⎤   ⎡ Rxy(0) ⎤
    ⎢ Ryy(1)   Ryy(0)    Ryy(1)    ···  Ryy(M−1) ⎥ ⎢ h(1) ⎥   ⎢ Rxy(1) ⎥
    ⎢ Ryy(2)   Ryy(1)    Ryy(0)    ···  Ryy(M−2) ⎥ ⎢ h(2) ⎥ = ⎢ Rxy(2) ⎥                      (11.3.9)
    ⎢   ⋮        ⋮         ⋮               ⋮     ⎥ ⎢  ⋮   ⎥   ⎢   ⋮    ⎥
    ⎣ Ryy(M)   Ryy(M−1)  Ryy(M−2)  ···  Ryy(0)   ⎦ ⎣ h(M) ⎦   ⎣ Rxy(M) ⎦

Exploiting the Toeplitz property of the matrix Ryy, the above matrix equation can be solved efficiently using Levinson's algorithm. This will be discussed in Chap. 12. In Chap. 16, we will consider adaptive implementations of the FIR Wiener filter which produce the optimal filter weights adaptively without requiring prior knowledge of the autocorrelation and cross-correlation matrices Ryy and Rxy and without requiring any matrix inversion.

Rx2y = 0, then Rxy = Rx1y. Therefore, the solution of Eq. (11.3.7) for the best weights to estimate xn is also the solution for the best weights to estimate x1(n). The filter may also be thought of as the optimal signal separator of the two signal components x1(n) and x2(n).

11.4 Construction of the Wiener Filter by Prewhitening

The normal equations (11.3.6) would have a trivial solution if the sequence yn were a white-noise sequence with delta-function autocorrelation. Thus, the solution procedure is first to whiten the sequence yn and then solve the normal equations. To this end, let yn have a signal model, as guaranteed by the spectral factorization theorem

    Syy(z) = σε² B(z) B(z^{−1})                                                               (11.4.1)

where εn is the driving white noise, and B(z) a minimal-phase filter. The problem of estimating xn in terms of the sequence yn becomes equivalent to the problem of estimating xn in terms of the white-noise sequence εn.

If we could determine the combined filter

    F(z) = B(z) H(z)

we would then solve for the desired Wiener filter H(z)

    H(z) = F(z) / B(z)                                                                        (11.4.2)
where [Sxε(z)]_+ denotes the causal part of the double-sided z-transform Sxε(z). Generally, the causal part of a z-transform

    G(z) = Σ_{n=−∞}^{∞} gn z^{−n} = Σ_{n=−∞}^{−1} gn z^{−n} + Σ_{n=0}^{∞} gn z^{−n}

is defined as

    [G(z)]_+ = Σ_{n=0}^{∞} gn z^{−n}

The causal instruction in Eq. (11.4.5) was necessary since the above solution for fn was valid only for n ≥ 0. Since yn is the output of the filter B(z) driven by εn, it follows that

    Sxy(z) = Sxε(z) B(z^{−1})   or   Sxε(z) = Sxy(z) / B(z^{−1})

Combining Eqs. (11.4.2) and (11.4.5), we finally find

    H(z) = (1 / (σε² B(z))) [ Sxy(z) / B(z^{−1}) ]_+     (Wiener filter)                      (11.4.6)

Thus, the construction of the optimal filter first requires the spectral factorization of Syy(z) to obtain B(z), and then use of the above formula. This is the optimal realizable Wiener filter based on the infinite past. If the causal instruction is ignored, one obtains the optimal unrealizable Wiener filter

    Hunreal(z) = Sxy(z) / (σε² B(z) B(z^{−1})) = Sxy(z) / Syy(z)                               (11.4.7)

The minimum value of the mean-square estimation error can be conveniently expressed by a contour integral, as follows

    E = E[en²] = E[en(xn − x̂n)] = E[en xn] − E[en x̂n] = E[en xn] = Rex(0)
      = ∮_{u.c.} Sex(z) dz/(2πjz) = ∮_{u.c.} [ Sxx(z) − Sx̂x(z) ] dz/(2πjz) ,   or,

    E = ∮_{u.c.} [ Sxx(z) − H(z) Syx(z) ] dz/(2πjz)                                            (11.4.8)

11.5 Wiener Filter Example

This example, in addition to illustrating the above ideas, will also serve as a short introduction to Kalman filtering. It is desired to estimate the signal xn on the basis of the noisy observations

    yn = xn + vn

where vn is white noise of unit variance, σv² = 1, uncorrelated with xn. The signal xn is a first order Markov process, having a signal model

    xn+1 = 0.6 xn + wn

where wn is white noise of variance σw² = 0.82. Enough information is given above to determine the required power spectral densities Sxy(z) and Syy(z). First, we note that the signal generator transfer function for xn is

    M(z) = 1 / (z − 0.6)

so that

    Sxx(z) = σw² M(z) M(z^{−1}) = 0.82 / ((z − 0.6)(z^{−1} − 0.6)) = 0.82 / ((1 − 0.6z^{−1})(1 − 0.6z))

Then, we find

    Sxy(z) = Sx(x+v)(z) = Sxx(z) + Sxv(z) = Sxx(z) = 0.82 / ((1 − 0.6z^{−1})(1 − 0.6z))

    Syy(z) = S(x+v)(x+v)(z) = Sxx(z) + Sxv(z) + Svx(z) + Svv(z) = Sxx(z) + Svv(z)
           = 0.82 / ((1 − 0.6z^{−1})(1 − 0.6z)) + 1 = (0.82 + (1 − 0.6z^{−1})(1 − 0.6z)) / ((1 − 0.6z^{−1})(1 − 0.6z))
           = 2(1 − 0.3z^{−1})(1 − 0.3z) / ((1 − 0.6z^{−1})(1 − 0.6z)) = 2 · (1 − 0.3z^{−1})/(1 − 0.6z^{−1}) · (1 − 0.3z)/(1 − 0.6z)
           = σε² B(z) B(z^{−1})

Then, according to Eq. (11.4.6), we must compute the causal part of

    G(z) = Sxy(z) / B(z^{−1}) = [ 0.82 / ((1 − 0.6z^{−1})(1 − 0.6z)) ] / [ (1 − 0.3z)/(1 − 0.6z) ] = 0.82 / ((1 − 0.6z^{−1})(1 − 0.3z))

This may be done by partial fraction expansion, but the fastest way is to use the contour inversion formula to compute gk for k ≥ 0, and then resum the z-transform:

    gk = ∮_{u.c.} G(z) z^k dz/(2πjz) = ∮_{u.c.} 0.82 z^k / ((1 − 0.3z)(z − 0.6)) dz/(2πj)
       = (residue at z = 0.6) = 0.82 (0.6)^k / (1 − (0.3)(0.6)) = (0.6)^k ,   k ≥ 0

Resumming, we find the causal part

    [G(z)]_+ = Σ_{k=0}^{∞} gk z^{−k} = 1 / (1 − 0.6z^{−1})

Finally, the optimum Wiener estimation filter is

    H(z) = (1 / (σε² B(z))) [ Sxy(z) / B(z^{−1}) ]_+ = [G(z)]_+ / (σε² B(z)) = 0.5 / (1 − 0.3z^{−1})           (11.5.1)
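A short simulation sketch of this example (randomly generated data; filter applied with scipy.signal.lfilter) confirms the numbers: the output of the filter (11.5.1) tracks xn and the mean-square error comes out near the value 0.5 computed next.

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(1)
N = 200_000

# Signal model of Sec. 11.5: AR(1) signal with Sxx(z) = 0.82/((1-0.6 z^-1)(1-0.6 z)), plus unit-variance noise
w = rng.normal(scale=np.sqrt(0.82), size=N)
v = rng.normal(scale=1.0, size=N)
x = lfilter([1.0], [1.0, -0.6], w)
y = x + v

# Wiener filter H(z) = 0.5/(1 - 0.3 z^-1), Eq. (11.5.1)
xhat = lfilter([0.5], [1.0, -0.3], y)

print("estimation error E[e^2] ~", np.mean((x - xhat) ** 2))   # should be close to 0.5
print("no-processing error     ~", np.mean((x - y) ** 2))      # equals sigma_v^2 = 1
```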
The filter (11.5.1) can be realized as the difference equation

    x̂n = 0.3 x̂n−1 + 0.5 yn                                                                     (11.5.2)

The estimation error is also easily computed using the contour formula of Eq. (11.4.8):

    E = E[en²] = σe² = ∮_{u.c.} [ Sxx(z) − H(z) Syx(z) ] dz/(2πjz) = 0.5

To appreciate the improvement afforded by filtering, this error must be compared with the error in case no processing is made and yn is itself taken to represent a noisy estimate of xn. The estimation error in the latter case is yn − xn = vn, so that σv² = 1. Thus, the gain afforded by processing is

    σe²/σv² = 0.5   or   3 dB

11.6 Wiener Filter as Kalman Filter

We would like to cast this example in a Kalman filter form. The difference equation Eq. (11.5.2) for the Wiener filter seems to have the "wrong" state transition matrix; namely, 0.3 instead of 0.6, which is the state matrix for the state model of xn. However, it is not accidental that the Wiener filter difference equation may be rewritten in the alternative form

    x̂n = 0.6 x̂n−1 + 0.5 (yn − 0.6 x̂n−1)

The quantity x̂n is the best estimate of xn, at time n, based on all the observations up to that time, that is, Yn = {yi, −∞ < i ≤ n}. To simplify the subsequent notation, we denote it by x̂n/n. It is the projection of xn on the space Yn. Similarly, x̂n−1 denotes the best estimate of xn−1, based on the observations up to time n − 1, that is, Yn−1 = {yi, −∞ < i ≤ n − 1}. The above filtering equation is written in this notation as

    x̂n/n = 0.6 x̂n−1/n−1 + 0.5 (yn − 0.6 x̂n−1/n−1)                                               (11.6.1)

It allows the computation of the current best estimate x̂n/n, in terms of the previous best estimate x̂n−1/n−1 and the new observation yn that becomes available at the current time instant n.

The various terms of Eq. (11.6.1) have nice interpretations: Suppose that the best estimate x̂n−1/n−1 of the previous sample xn−1 is available. Even before the next observation yn comes in, we may use this estimate to make a reasonable prediction as to what the next best estimate ought to be. Since we know the system dynamics of xn, we may try to "boost" x̂n−1/n−1 to the next time instant n according to the system dynamics, that is, we take

    x̂n/n−1 = 0.6 x̂n−1/n−1 = prediction of xn on the basis of Yn−1                                (11.6.2)

Since yn = xn + vn, we may use this prediction of xn to make a prediction of the next measurement yn, that is, we take

    ŷn/n−1 = x̂n/n−1 = prediction of yn on the basis of Yn−1                                       (11.6.3)

If this prediction were perfect, and if the next observation yn were noise free, then this would be the value that we would observe. Since we actually observe yn, the observation or innovations residual will be

    αn = yn − ŷn/n−1                                                                              (11.6.4)

This quantity represents that part of yn that cannot be predicted on the basis of the previous observations Yn−1. It represents the truly new information contained in the observation yn. Actually, if we are making the best prediction possible, then the most we can expect of our prediction is to make the innovations residual a white-noise (uncorrelated) signal, that is, what remains after we make the best possible prediction should be unpredictable. According to the general discussion of the relationship between signal models and linear prediction given in Sec. 1.17, it follows that if ŷn/n−1 is the best predictor of yn then αn must be the whitening sequence that drives the signal model of yn. We shall verify this fact shortly. This establishes an intimate connection between the Wiener/Kalman filtering problem and the signal modeling problem. If we overestimate the observation yn the innovation residual will be negative; and if we underestimate it, the residual will be positive. In either case, we would like to correct our tentative estimate in the right direction. This may be accomplished by

    x̂n/n = x̂n/n−1 + G(yn − ŷn/n−1) = 0.6 x̂n−1/n−1 + G(yn − 0.6 x̂n−1/n−1)                           (11.6.5)

where the gain G, known as the Kalman gain, should be a positive quantity. The prediction/correction procedure defined by Eqs. (11.6.2) through (11.6.5) is known as the Kalman filter. It should be clear that any value for the gain G will provide an estimate, even if suboptimal, of xn. Our solution for the Wiener filter has precisely the above structure with a gain G = 0.5. This value is optimal for the given example. It is a very instructive exercise to show this in two ways: First, with G arbitrary, the estimation filter of Eq. (11.6.5) has transfer function

    H(z) = G / (1 − 0.6(1 − G) z^{−1})

Insert this expression into the mean-square estimation error E = E[en²], where en = xn − x̂n/n, and minimize it with respect to the parameter G. This should give G = 0.5.

Alternatively, G should be such as to render the innovations residual (11.6.4) a white noise signal. In requiring this, it is useful to use the spectral factorization model for yn, that is, the fact that yn is the output of B(z) when driven by the white noise signal εn. Working with z-transforms, we have:

    α(z) = Y(z) − 0.6 z^{−1} X̂(z) = Y(z) − 0.6 z^{−1} H(z) Y(z)
         = [ 1 − 0.6 z^{−1} G/(1 − 0.6(1 − G)z^{−1}) ] Y(z) = [ (1 − 0.6z^{−1}) / (1 − 0.6(1 − G)z^{−1}) ] Y(z)
         = [ (1 − 0.6z^{−1}) / (1 − 0.6(1 − G)z^{−1}) ] · [ (1 − 0.3z^{−1}) / (1 − 0.6z^{−1}) ] ε(z) = [ (1 − 0.3z^{−1}) / (1 − 0.6(1 − G)z^{−1}) ] ε(z)

Since εn is white, it follows that the transfer function relationship between αn and εn must be trivial; otherwise, there will be sequential correlations present in αn. Thus,
we must have 0.6(1 − G) = 0.3, or G = 0.5; and in this case, αn = εn. It is also possible to set 0.6(1 − G) = 1/0.3, but this would correspond to an unstable filter.

We have obtained a most interesting result; namely, that when the Wiener filtering problem is recast into its Kalman filter form given by Eq. (11.6.1), then the innovations residual αn, which is computable on line with the estimate x̂n/n, is identical to the whitening sequence εn of the signal model of yn. In other words, the Kalman filter can be thought of as the whitening filter for the observation signal yn.

To appreciate further the connection between Wiener and Kalman filters and between Kalman filters and the whitening filters of signal models, we consider a generalized version of the above example and cast it in standard Kalman filter notation.

It is desired to estimate xn from yn. The signal model for xn is taken to be the first-order autoregressive model

    xn+1 = a xn + wn     (state model)                                                           (11.6.6)

where we used the filtering equation X1(z) = zX(z). The spectral density of yn can be factored as follows:

    Syy(z) = c² Sxx(z) + Svv(z) = c² Q / ((1 − a z^{−1})(1 − a z)) + R
           = [ c² Q + R(1 − a z^{−1})(1 − a z) ] / ((1 − a z^{−1})(1 − a z)) ≡ σε² (1 − f z^{−1})(1 − f z) / ((1 − a z^{−1})(1 − a z))

where f and σε² satisfy the equations

    f σε² = a R                                                                                   (11.6.9)

    (1 + f²) σε² = c² Q + (1 + a²) R                                                               (11.6.10)

and f has magnitude less than one. Thus, the corresponding signal model for yn is

which is the precise justification of Eq. (11.6.2). The difference equations of the two filters are

    x̂n+1/n = f x̂n/n−1 + K yn
                                                                                                   (11.6.17)
    x̂n/n = f x̂n−1/n−1 + G yn

Using the results of Problem 1.50, we may express all the quantities f, σε², K, and G in terms of a single positive quantity P which satisfies the algebraic Riccati equation:

    Q = P − P R a² / (R + c² P)                                                                    (11.6.18)

Then, we find the interrelationships

    K = aG = a c P / (R + c² P) ,    σε² = R + c² P ,    f = a − cK = R a / (R + c² P)              (11.6.19)

It is left as an exercise to show that the minimized mean-square estimation errors are given in terms of P by

A realization of the estimation filter based on (11.6.20) is shown below:

    ŷn/n−1 = \widehat{c xn + vn} = c x̂n/n−1 + v̂n/n−1 = c x̂n/n−1

where the term v̂n/n−1 was dropped. This term represents the estimate of vn on the basis of the past ys; that is, Yn−1. Since vn is white and also uncorrelated with xn, it follows that it will be uncorrelated with all past ys; therefore, v̂n/n−1 = 0. The second way to show that ŷn/n−1 is the best prediction of yn is to show that the innovations residual

    αn = yn − ŷn/n−1 = yn − c x̂n/n−1                                                               (11.6.23)

is a white-noise sequence and coincides with the whitening sequence εn of yn. Indeed, working in the z-domain and using Eq. (11.6.17) and the signal model of yn we find

    α(z) = Y(z) − c z^{−1} X̂1(z) = Y(z) − c z^{−1} H1(z) Y(z)
         = [ 1 − c z^{−1} K/(1 − f z^{−1}) ] Y(z) = [ (1 − (f + cK) z^{−1}) / (1 − f z^{−1}) ] Y(z)
         = [ (1 − a z^{−1}) / (1 − f z^{−1}) ] Y(z) = [ 1/B(z) ] Y(z) = ε(z)

Fig. 11.6.1 shows 100 samples of the observed signal yn together with the desired signal xn. The signal yn processed through the Wiener filter H(z) defined by the above parameters is shown in Fig. 11.6.2 together with xn. The tracking properties of the filter are evident from the graph. It should be emphasized that this is the best one can do by means of ordinary causal linear filtering.

11.7 Construction of the Wiener Filter by the Gapped Function

The gapped function is defined as the cross-correlation between the estimation error en and the observation sequence yn, namely,

    g(k) = Rey(k) = E[en yn−k] ,   for −∞ < k < ∞                                                   (11.7.1)
[Figure: filtered estimate x̂n/n and desired signal xn versus n (time samples, 0-100).]

This definition is motivated by the orthogonality equations which state that the prediction error en must be orthogonal to all of the available observations; namely, Yn = {yi, −∞ < i ≤ n} = {yn−k, k ≥ 0}. That is, for the optimal set of filter weights we must have

    g(k) = Rey(k) = E[en yn−k] = 0 ,   for k ≥ 0                                                    (11.7.2)

and g(k) develops a right-hand side gap. On the other hand, g(k) may be written in

    G(z) = Sxy(z) − H(z) Syy(z) = Sxy(z) − H(z) σε² B(z) B(z^{−1})

The z-transform B(z^{−1}) is anticausal and, because of the gap conditions, so is the ratio G(z)/B(z^{−1}). Therefore, taking causal parts of both sides and noting that the product H(z)B(z) is already causal, we find

    0 = [ Sxy(z) / B(z^{−1}) ]_+ − σε² H(z) B(z)

11.8 Construction of the Wiener Filter by Covariance Factorization

In this section, we present a generalization of the gapped-function method to the more general non-stationary and/or finite-past Wiener filter. This is defined by the Wiener-Hopf equations (11.2.7), which are equivalent to the orthogonality equations (11.2.5). The latter are the non-stationary versions of the gapped function of the previous section. The best way to proceed is to cast Eqs. (11.2.5) in matrix form as follows: Without loss of generality we may take the starting point na = 0. The final point nb is left arbitrary. Introduce the vectors

    x = [x0, x1, . . . , xnb]^T ,    y = [y0, y1, . . . , ynb]^T
and the corresponding correlation matrices

    Rxy = E[x y^T] ,    Ryy = E[y y^T]

The filtering equation (11.2.4) may be written in vector form as

    x̂ = H y                                                                                        (11.8.1)

where H is the matrix of optimal weights {h(n, i)}. The causality of the filtering operation (11.8.1) requires H to be lower-triangular. The minimization problem becomes equivalent to the problem of minimizing the mean-square estimation error subject to the constraint that H be lower-triangular. The minimization conditions are the normal equations (11.2.5) which, in this matrix notation, state that the matrix Rey has no lower-triangular (causal) part; or, equivalently, that Rey is strictly upper-triangular (i.e., even the main diagonal of Rey is zero), therefore

This is the most general solution of the Wiener filtering problem [18, 19]. It includes the results of the stationary case, as a special case. Indeed, if all the signals are stationary, then the matrices Rxy, B, and B^T become Toeplitz and have a z-transform associated with them as discussed in Problem 1.51. Using the results of that problem, it is easily seen that Eq. (11.8.7) is the time-domain equivalent of Eq. (11.4.6).

The prewhitening approach of Sec. 11.4 can also be understood in the present matrix framework. Making the change of variables

    y = B ε

we find that Rxy = E[x y^T] = E[x ε^T] B^T = Rxε B^T, and therefore, Rxy B^{−T} = Rxε and the filter H becomes H = [Rxε]_+ Rε^{−1} B^{−1}. The corresponding estimate is then

    x̂ = H y = H B ε = F ε ,   where   F = H B = [Rxε]_+ Rε^{−1}                                     (11.8.8)
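The matrix construction just described can be prototyped in a few lines. The sketch below (hypothetical covariance matrices, numpy only) factors Ryy = B Rε B^T with B unit lower triangular and Rε diagonal, forms H = [Rxε]_+ Rε^{−1} B^{−1}, and checks that H is lower triangular and that Rey = Rxy − H Ryy is strictly upper triangular, as the orthogonality conditions require.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 6

# Hypothetical joint covariances of x = [x_0..x_5], y = [y_0..y_5] (for illustration only).
A = rng.standard_normal((2 * N, 2 * N))
R = A @ A.T                      # a valid joint covariance matrix
Rxx, Rxy, Ryy = R[:N, :N], R[:N, N:], R[N:, N:]

# Innovations factorization  Ryy = B Reps B^T,  B unit lower triangular, Reps diagonal.
L = np.linalg.cholesky(Ryy)
d = np.diag(L)
B = L / d                        # unit lower triangular
Reps = np.diag(d**2)

# Causal (lower-triangular) part of Rxeps = Rxy B^{-T}, then H = [Rxeps]_+ Reps^{-1} B^{-1}.
Rxeps = Rxy @ np.linalg.inv(B).T
H = np.tril(Rxeps) @ np.linalg.inv(Reps) @ np.linalg.inv(B)

Rey = Rxy - H @ Ryy              # error/observation cross-covariance
print("H lower triangular?        ", np.allclose(H, np.tril(H)))
print("Rey strictly upper triang.?", np.allclose(np.tril(Rey), 0.0))
```

The diagonal matrix Rε plays the role that σε² plays in the stationary formula (11.4.6).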
11.9 The Kalman Filter

The Kalman filter discussion of Sec. 11.6 and its equivalence to the Wiener filter was based on the asymptotic Kalman filter for which the observations were available from the infinite past to the present, namely, {yi, −∞ < i ≤ n}. In Sec. 11.8, we solved the most general Wiener filtering problem based on the finite past for which the observation space was

    Yn = {y0, y1, . . . , yn}                                                                        (11.9.1)

Here, we recast these results in a time-recursive form and obtain the time-varying Kalman filter for estimating xn based on the finite observation subspace Yn. We also discuss its asymptotic properties for large n and show that it converges to the steady-state Kalman filter of Sec. 11.6.

Our discussion is based on Eq. (11.8.9), which is essentially the starting point in Kalman's original derivation [852]. To make Eq. (11.8.9) truly recursive, we must have a means of recursively computing the required gain Gn from one time instant to the next. As in Sec. 11.8, we denote by x̂n/n and x̂n/n−1 the optimal estimates of xn based on the observation subspaces Yn and Yn−1, defined in Eq. (11.9.1), with the initial condition x̂0/−1 = 0. Iterating the state and measurement models (11.6.6) and (11.6.7) starting at n = 0, we obtain the following two results, previously derived for the steady-state case

    x̂n+1/n = a x̂n/n ,    ŷn/n−1 = c x̂n/n−1                                                           (11.9.2)

The proof of both is based on the linearity property of estimates; for example,

    x̂n+1/n = \widehat{a xn + wn} = a x̂n/n + ŵn/n = a x̂n/n

where ŵn/n was set to zero because wn does not depend on any of the observations Yn. This is seen as follows. The iteration of the state equation (11.6.6) leads to the expression xn = aⁿ x0 + aⁿ⁻¹ w0 + aⁿ⁻² w1 + ··· + a wn−2 + wn−1. It follows from this and Eq. (11.6.7) that the observation subspace Yn will depend only on

    {x0, w0, w1, . . . , wn−1, v0, v1, . . . , vn}

Making the additional assumption that x0 is uncorrelated with wn it follows that wn will be uncorrelated with all random variables in the above set, and thus, with Yn. The second part of Eq. (11.9.2) is shown by similar arguments. Next, we develop the recursions for the gain Gn. Using Eq. (11.8.9), the estimation and prediction errors may be related as follows

    en/n = xn − x̂n/n = xn − x̂n/n−1 − Gn εn = en/n−1 − Gn εn

Taking the correlation of both sides with xn we find

    E[en/n xn] = E[en/n−1 xn] − Gn E[εn xn]                                                            (11.9.3)

Using the orthogonality properties E[en/n x̂n/n] = 0 and E[en/n−1 x̂n/n−1] = 0, which follow from the optimality of the two estimates x̂n/n and x̂n/n−1, we can write the mean-square estimation and prediction errors as

    Pn/n = E[e²n/n] = E[en/n xn] ,    Pn/n−1 = E[e²n/n−1] = E[en/n−1 xn]                                (11.9.4)

The innovations residual εn can be expressed in terms of the prediction error en/n−1 as

    εn = yn − ŷn/n−1 = (c xn + vn) − c x̂n/n−1 = c en/n−1 + vn

Using the fact that en/n−1 depends only on xn and Yn−1, it follows that the two terms in the right-hand side are uncorrelated with each other. Thus,

    E[εn²] = c² E[e²n/n−1] + E[vn²] = c² Pn/n−1 + R                                                     (11.9.5)

also

    E[εn xn] = c E[en/n−1 xn] + E[vn xn] = c Pn/n−1                                                      (11.9.6)

Therefore, the gain Gn is computable by

    Gn = E[εn xn] / E[εn²] = c Pn/n−1 / (R + c² Pn/n−1)                                                   (11.9.7)

Using Eqs. (11.9.4), (11.9.6), and (11.9.7) into Eq. (11.9.3), we obtain

    Pn/n = Pn/n−1 − Gn c Pn/n−1 = Pn/n−1 − c² P²n/n−1 / (R + c² Pn/n−1) = R Pn/n−1 / (R + c² Pn/n−1)       (11.9.8)

The subtracted term in (11.9.8) represents the improvement in estimating xn using x̂n/n over using x̂n/n−1. Equations (11.9.3), (11.9.7), and (11.9.8) admit a nice geometrical interpretation [867]. The two right-hand side terms in εn = c en/n−1 + vn are orthogonal and can be represented by the orthogonal triangle

[Figure: orthogonal-triangle interpretation of εn = c en/n−1 + vn.]

where the prediction error en/n−1 has been scaled up by the factor c. Thus, Eq. (11.9.5) is the statement of the Pythagorean theorem for this triangle. Next, write the equation en/n = en/n−1 − Gn εn as

    en/n−1 = en/n + Gn εn

Because en/n is orthogonal to all the observations in Yn and εn is a linear combination of the same observations, it follows that the two terms in the right-hand side will be orthogonal. Thus, en/n−1 may be resolved in two orthogonal parts, one being in the direction of εn. This is represented by the smaller orthogonal triangle in the previous diagram. Clearly, the length of the side en/n is minimized at right angles at point A. It follows from the similarity of the two orthogonal triangles that

    Gn √(E[εn²]) / √(E[e²n/n−1]) = c √(E[e²n/n−1]) / √(E[εn²])

which is equivalent to Eq. (11.9.7). Finally, the Pythagorean theorem applied to the smaller triangle implies E[e²n/n−1] = E[e²n/n] + Gn² E[εn²], which is equivalent to Eq. (11.9.8).
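These identities are easy to spot-check numerically. The sketch below (arbitrary made-up values of c, R, and Pn/n−1) draws independent samples playing the roles of en/n−1, vn, and x̂n/n−1 and compares the sample moments against Eqs. (11.9.5) through (11.9.8).

```python
import numpy as np

rng = np.random.default_rng(3)
c, R, P_pred = 0.8, 1.0, 2.5        # made-up values of c, R, and P_{n/n-1}
N = 1_000_000

e_pred = rng.normal(scale=np.sqrt(P_pred), size=N)   # e_{n/n-1}
v = rng.normal(scale=np.sqrt(R), size=N)             # v_n, independent of e_{n/n-1}
xhat = rng.normal(size=N)                            # stand-in for x̂_{n/n-1}, independent of e_pred
x = xhat + e_pred                                    # x_n = x̂_{n/n-1} + e_{n/n-1}
eps = c * e_pred + v                                 # innovations residual

G = np.mean(eps * x) / np.mean(eps**2)               # sample version of Eq. (11.9.7)
print("E[eps^2] :", np.mean(eps**2), " vs ", c**2 * P_pred + R)        # Eq. (11.9.5)
print("E[eps x] :", np.mean(eps * x), " vs ", c * P_pred)              # Eq. (11.9.6)
print("gain G   :", G, " vs ", c * P_pred / (R + c**2 * P_pred))       # Eq. (11.9.7)
print("P_{n/n}  :", np.mean((e_pred - G * eps) ** 2),
      " vs ", R * P_pred / (R + c**2 * P_pred))                        # Eq. (11.9.8)
```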
To obtain a truly recursive scheme, we need next to find a relationship between Pn/n and the next prediction error Pn+1/n. It is found as follows. From the state model (11.6.6) and (11.9.2), we have

    en+1/n = xn+1 − x̂n+1/n = (a xn + wn) − a x̂n/n = a en/n + wn

Because en/n depends only on xn and Yn, it follows that the two terms in the right-hand side will be uncorrelated. Therefore, E[e²n+1/n] = a² E[e²n/n] + E[wn²], or,

    Pn+1/n = a² Pn/n + Q                                                                             (11.9.9)

The first term corresponds to the propagation of the estimate x̂n/n forward in time according to the system dynamics; the second term represents the worsening of the estimate due to the presence of the dynamical noise wn. The Kalman filter algorithm is now complete. It is summarized below:

0. Initialize by x̂0/−1 = 0 and P0/−1 = E[x0²].
1. At time n, x̂n/n−1, Pn/n−1, and the new measurement yn are available.
2. Compute ŷn/n−1 = c x̂n/n−1, εn = yn − ŷn/n−1, and the gain Gn using Eq. (11.9.7).
3. Correct the predicted estimate x̂n/n = x̂n/n−1 + Gn εn and compute its mean-square error Pn/n, using Eq. (11.9.8).
4. Predict the next estimate x̂n+1/n = a x̂n/n, and compute the mean-square prediction error Pn+1/n, using Eq. (11.9.9).
5. Go to the next time instant, n → n + 1.

The optimal predictor x̂n/n−1 satisfies the Kalman filtering equation

    x̂n+1/n = a x̂n/n = a(x̂n/n−1 + Gn εn) = a x̂n/n−1 + a Gn (yn − c x̂n/n−1) ,   or,

    x̂n+1/n = fn x̂n/n−1 + Kn yn                                                                        (11.9.10)

where we defined

    Kn = a Gn ,   fn = a − c Kn                                                                        (11.9.11)

These are the time-varying analogs of Eqs. (11.6.17) and (11.6.19). Equations (11.9.8) and (11.9.9) may be combined into one updating equation for Pn/n−1, known as the discrete Riccati difference equation

    Pn+1/n = a² R Pn/n−1 / (R + c² Pn/n−1) + Q                                                          (11.9.12)

Since the model parameters a, c, Q, R are constant, we expect the solution of the Riccati equation (11.9.12) to converge, for large n, to some steady-state value Pn/n−1 → P. In this limit, the Riccati difference equation (11.9.12) tends to the steady-state algebraic Riccati equation (11.6.18), which determines the limiting value P. The Kalman filter parameters will converge to the limiting values fn → f, Kn → K, and Gn → G given by Eq. (11.6.19).

It is possible to solve Eq. (11.9.12) in closed form and explicitly demonstrate these convergence properties. Using the techniques of [871,872], we obtain

    Pn/n−1 = P + f^{2n} E0 / (1 + Sn E0) ,   for n = 0, 1, 2, . . . ,                                    (11.9.13)

where E0 = P0/−1 − P and

    Sn = B (1 − f^{2n}) / (1 − f²) ,    B = c² / (R + c² P)

We have already mentioned (see Problem 1.50) that the stability of the signal model and the positivity of the asymptotic solution P imply the minimum phase condition |f| < 1. Thus, the second term of Eq. (11.9.13) converges to zero exponentially with a time constant determined by f.

Example 11.9.1: Determine the closed form solutions of the time-varying Kalman filter for the state and measurement models:

    xn+1 = xn + wn ,    yn = xn + vn

with Q = 0.5 and R = 1. Thus, a = 1 and c = 1. The Riccati equations are

    Pn+1/n = Pn/n−1 / (1 + Pn/n−1) + 0.5 ,    P = P / (1 + P) + 0.5

The solution of the algebraic Riccati equation is P = 1. This implies that f = aR/(R + c²P) = 0.5. To illustrate the solution (11.9.13), we take the initial condition to be zero, P0/−1 = 0. We find B = c²/(R + c²P) = 0.5 and

    Sn = (2/3)(1 − (0.5)^{2n})
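The five-step algorithm above is only a few lines of code. The sketch below runs it on simulated data from the models of Example 11.9.1 and prints the last few values of Pn+1/n, fn, and Gn, which should approach the steady-state values P = 1, f = 0.5, and G = 0.5 of Sec. 11.6.

```python
import numpy as np

rng = np.random.default_rng(4)
a, c, Q, R = 1.0, 1.0, 0.5, 1.0          # Example 11.9.1 parameters
N = 30

# Simulate the state and measurement models x_{n+1} = a x_n + w_n,  y_n = c x_n + v_n
x = np.zeros(N)
for n in range(N - 1):
    x[n + 1] = a * x[n] + rng.normal(scale=np.sqrt(Q))
y = c * x + rng.normal(scale=np.sqrt(R), size=N)

# Time-varying Kalman filter, steps 0-5 of the algorithm above
xpred, Ppred = 0.0, 0.0                   # x̂_{0/-1} = 0, P_{0/-1} = E[x_0^2] = 0 here
for n in range(N):
    G = c * Ppred / (R + c**2 * Ppred)                # gain, Eq. (11.9.7)
    xfilt = xpred + G * (y[n] - c * xpred)            # correction
    Pfilt = R * Ppred / (R + c**2 * Ppred)            # Eq. (11.9.8)
    xpred = a * xfilt                                 # prediction
    Ppred = a**2 * Pfilt + Q                          # Eq. (11.9.9)
    if n >= N - 3:
        K = a * G
        print(f"n={n:2d}  P(n+1/n)={Ppred:.4f}  f={a - c*K:.4f}  G={G:.4f}")
```

The printed Pn+1/n values also follow the closed-form solution (11.9.13) with E0 = −1.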
11.10 Problems

11.1 Let x = [xna, . . . , xnb]^T and y = [yna, . . . , ynb]^T be the desired and available signal vectors. The relationship between x and y is assumed to be linear of the form

    y = Cx + v

(b) Repeat for the optimal linear predictor of order M = 2 for predicting xn on the basis of the past two samples yn−1 and yn−2.

11.5 A stationary random signal x(n) has autocorrelation function Rxx(k) = σx² a^{|k|}, for all k. Consider a time interval [na, nb]. The random signal x(n) is known only at the end-points of that interval; that is, the only available observations are

    y(na) = x(na) ,    y(nb) = x(nb)

Determine the optimal estimate of x(n) based on just these two samples in the form

    x̂(n) = h(n, na) y(na) + h(n, nb) y(nb)

for the following values of n: (a) na ≤ n ≤ nb, (b) n ≤ na, (c) n ≥ nb.

11.6 A stationary random signal xn is to be estimated on the basis of the noisy observations

    E = E[e²n/n−1] = (Q + K² R) / (1 − f²) = (Q + K² R) / (1 − (a − cK)²)                              (P.2)

(c) To select the optimal value of the Kalman gain K, differentiate E with respect to K and set the derivative to zero. Show that the resulting equation for K can be expressed in the form

    K = c a P / (R + c² P)
where P stands for the minimized value of E; that is, P = Emin.

(d) Inserting this expression for K back into the expression (P.2) for E, show that the quantity P must satisfy the algebraic Riccati equation

    Q = P − a² R P / (R + c² P)

Thus, the resulting estimator filter is identical to the optimal one-step prediction filter discussed in Sec. 11.6.

11.9 Show that Eq. (P.2) of Problem 11.8 can be derived without using z-transforms, by using only stationarity, as suggested below: Using the state and measurement model equations and Eq. (P.1), show that the estimation error en/n−1 satisfies the difference equation

    en+1/n = f en/n−1 + wn − K vn

Then, invoking stationarity, derive Eq. (P.2). Using similar methods, show that the mean-square estimation error is given by

    E[e²n/n] = R P / (R + c² P)

where en/n = xn − x̂n/n is the estimation error of the optimal filter (11.6.13).

11.10 Consider the general example of Sec. 11.6. It was shown there that the innovations residual was the same as the whitening sequence εn driving the signal model of yn

    εn = yn − ŷn/n−1 = yn − c x̂n/n−1

for n < i. To do this, introduce a set of Lagrange multipliers Λni for n < i, one for each constraint equation, and incorporate them into an effective performance index

    J = E[e e^T] + Λ H^T + H Λ^T = min

where the matrix Λ is strictly upper-triangular. Show that this formulation of the minimization problem yields exactly the same solution as Eq. (11.8.7).

11.13 Exponential Moving Average as Wiener Filter. The single EMA filter for estimating the local level of a signal that we discussed in Chap. 6 admits a nice Wiener-Kalman filtering interpretation. Consider the noisy random walk signal model,

    xn+1 = xn + wn
                                                                                                       (11.10.1)
    yn = xn + vn

where wn, vn are mutually uncorrelated, zero-mean, white noise signals of variances Q = σw² and R = σv². Based on the material in Section 12.6, show that the optimum Wiener/Kalman filter for predicting xn from yn is equivalent to the exponential smoother, that is, show that it is given by,

    x̂n+1/n = f x̂n/n−1 + (1 − f) yn                                                                     (11.10.2)

so that the forgetting-factor parameter λ of EMA is identified as the closed-loop parameter f of the Kalman filter, and show further that f is given in terms of Q, R as follows,

    1 − f = ( √(Q² + 4QR) − Q ) / (2R)
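As a quick check of this statement (not a proof), the sketch below picks arbitrary Q and R, applies the steady-state Riccati and gain formulas of Sec. 11.6 with a = c = 1, and confirms that the resulting closed-loop parameter f matches the formula above.

```python
import numpy as np

Q, R = 0.3, 2.0                      # arbitrary noise variances for the check
a = c = 1.0                          # random-walk-plus-noise model of Eq. (11.10.1)

# Steady state: algebraic Riccati equation Q = P - a^2 R P/(R + c^2 P)
# reduces to P^2 - Q P - Q R = 0 for a = c = 1.
P = (Q + np.sqrt(Q**2 + 4 * Q * R)) / 2
K = a * c * P / (R + c**2 * P)       # Kalman gain, Eq. (11.6.19)
f = a - c * K                        # closed-loop parameter

print("1 - f (from Kalman) :", 1 - f)
print("1 - f (EMA formula) :", (np.sqrt(Q**2 + 4 * Q * R) - Q) / (2 * R))
```

The two printed numbers agree, so the exponential smoother gain 1 − f is the steady-state Kalman gain for this model.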
[Figure: observations yn, signal xn, and prediction x̂n/n−1 versus time samples n (0-300).]

12
Linear Prediction
12.1 Pure Prediction and Signal Modeling
In Sec. 1.17, we discussed the connection between linear prediction and signal modeling.
Here, we rederive the same results by considering the linear prediction problem as a
special case of the Wiener filtering problem, given by Eq. (11.4.6). Our aim is to cast
the results in a form that will suggest a practical way to solve the prediction problem
and hence also the modeling problem. Consider a stationary signal yn having a signal
model

    Syy(z) = σε² B(z) B(z^{−1})                                                                         (12.1.1)
as guaranteed by the spectral factorization theorem. Let Ryy (k) denote the autocorre-
lation of yn :
Ryy (k)= E[yn+k yn ]
The linear prediction problem is to predict the current value yn on the basis of all the
past values Yn−1 = {yi , −∞ < i ≤ n − 1}. If we define the delayed signal y1 (n)= yn−1 ,
then the linear prediction problem is equivalent to the optimal Wiener filtering problem
of estimating yn from the related signal y1 (n). The optimal estimation filter H(z) is
given by Eq. (11.4.6), where we must identify xn and yn with yn and y1 (n) of the present
notation. Using the filtering equation Y1 (z)= z−1 Y(z), we find that yn and y1 (n) have
the same spectral factor B(z), and that Syy1(z) = z Syy(z). Inserting these into Eq. (11.4.6), we find for the optimal prediction filter

    H(z) = (1 / B(z)) [ z B(z) ]_+                                                                       (12.1.2)
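For instance (a minimal sketch, assuming an AR(1) model yn = a yn−1 + εn so that B(z) = 1/(1 − a z^{−1})), Eq. (12.1.2) gives [zB(z)]_+ = aB(z) and hence H(z) = a, that is, the best one-step prediction is ŷn = a yn−1. The code below verifies this numerically by least-squares fitting a two-tap predictor.

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(5)
a, N = 0.7, 100_000                          # assumed AR(1) parameter

eps = rng.standard_normal(N)
y = lfilter([1.0], [1.0, -a], eps)           # y_n = a*y_{n-1} + eps_n

# Least-squares one-step predictor on the past two samples: y_n ~ h1*y_{n-1} + h2*y_{n-2}
Y = np.column_stack([y[1:-1], y[:-2]])
h, *_ = np.linalg.lstsq(Y, y[2:], rcond=None)
print("fitted predictor taps:", h)           # expected close to [a, 0]
```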