ECE 6123 Advanced Signal Processing Linear Prediction: 1 AR Models and Wiener Prediction
Fall 2017
\[
R w = r \tag{2}
\]
where
\[
r \equiv E\left\{ \begin{pmatrix} u[n-1] \\ u[n-2] \\ \vdots \\ u[n-M] \end{pmatrix} u[n]^* \right\}
= \begin{pmatrix} r[-1] \\ r[-2] \\ \vdots \\ r[-M] \end{pmatrix} \tag{3}
\]
Why should we want to make such a prediction when all we need to do is wait
one sample to see the real value $u[n]$? The answer is that if $u[n]$ itself were what
we wanted, we would do just that: wait for it. The reason to pose this as a
prediction problem is that the resulting structure is important and useful.
Equation (2) should look familiar. In fact, to repeat from the first lecture,
we have the autoregressive (AR) model:
\[
u[n] = \nu[n] - \sum_{k=1}^{M} a_k^* u[n-k] \tag{4}
\]
\[
u[n] = \nu[n] - a^H u_{n-1} \tag{5}
\]
where the input ν[n] is assumed to be white (and usually but not necessarily
Gaussian) with power $\sigma_\nu^2$. In the introduction we found that we could recover
the AR model (i.e., the $a_k$'s) from knowledge of the autocorrelation lags (the
$r[m]$'s) via
\[
R a = -r \tag{6}
\]
which are the "Yule-Walker" equations. That is: $w = -a$. And that does
make a good amount of sense: according to (4), using $\hat{u}[n] = w^H u_{n-1}$
eliminates all the "randomness" in $u[n]$ except that from $\nu[n]$; and $\nu[n]$
can't be removed.
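To make this concrete, here is a minimal numpy sketch (not from the notes; the variable names are my own): it simulates a real-valued AR(2) process of the form (4), estimates the correlation lags by time averaging, and recovers the $a_k$'s by solving (6).

```python
import numpy as np

# Minimal sketch: simulate a real AR(2) process u[n] = nu[n] - a1*u[n-1] - a2*u[n-2],
# estimate the lags r[0..M], and recover the a_k's from the Yule-Walker equations Ra = -r.
rng = np.random.default_rng(0)
a_true = np.array([-1.2, 0.5])                   # a_1, a_2 (a stable choice)
N = 200_000
nu = rng.standard_normal(N)                      # white input nu[n]
u = np.zeros(N)
for n in range(2, N):
    u[n] = nu[n] - a_true[0] * u[n - 1] - a_true[1] * u[n - 2]

M = 2
# real-valued data, so r[-m] = r[m]; estimate by time averaging
r = np.array([np.mean(u[m:] * u[:N - m]) for m in range(M + 1)])

R = np.array([[r[abs(i - j)] for j in range(M)] for i in range(M)])   # R_M (Toeplitz)
a_hat = np.linalg.solve(R, -r[1:M + 1])          # Yule-Walker: R a = -r
w_hat = -a_hat                                   # one-step Wiener predictor weights
print(a_hat)                                     # should be close to a_true
```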
where we have stressed the dimension of the matrix and vector in the sub-
scripts. It is probably useful to note the full matrix in long form
\[
\begin{pmatrix}
r[0] & r[1] & r[2] & \dots & r[M] \\
r[-1] & r[0] & r[1] & \dots & r[M-1] \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
r[-M] & r[-(M-1)] & r[-(M-2)] & \dots & r[0]
\end{pmatrix}
\begin{pmatrix} 1 \\ -w \end{pmatrix}
=
\begin{pmatrix} P_M \\ 0 \end{pmatrix} \tag{10}
\]
or
\[
R_{M+1} \begin{pmatrix} 1 \\ -w \end{pmatrix} = \begin{pmatrix} P_M \\ 0 \end{pmatrix} \tag{11}
\]
to see the naturalness of this concatenation.
Let us re-define¹ the AR vector $a$ as
\[
a \equiv \begin{pmatrix} 1 \\ a_1 \\ a_2 \\ \vdots \\ a_M \end{pmatrix}
= \begin{pmatrix} 1 \\ -w \end{pmatrix}
= \begin{pmatrix} 1 \\ a_{\text{old}} \end{pmatrix} \tag{12}
\]
¹ Sorry, but this has to be done.
where $a_{\text{old}}$ is as in (2) and (4). Then
\[
R_{M+1} \, a_M = \begin{pmatrix} P_M \\ 0 \end{pmatrix} \tag{13}
\]
is a way to re-write (11). Here the subscript on a denotes the order of the
predictor – it is a vector of length M + 1.
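Continuing that sketch (and reusing `r`, `a_hat`, and `M` from it), one can build $R_{M+1}$ from the estimated lags and check (13) numerically:

```python
# Augmented Yule-Walker check of (13): R_{M+1} a_M ~ [P_M, 0, ..., 0]^T.
R_aug = np.array([[r[abs(i - j)] for j in range(M + 1)] for i in range(M + 1)])
a_M = np.concatenate(([1.0], a_hat))     # a_M = [1, a_1, ..., a_M]
lhs = R_aug @ a_M
P_M = lhs[0]                             # prediction-error power P_M
print(lhs)                               # ~ [P_M, 0, ..., 0]
```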
and that does indeed mean “backwards and conjugated”. We can concate-
nate this and the Wiener error equation as
\[
\begin{pmatrix} R_M & r^{B*} \\ (r^B)^T & r[0] \end{pmatrix}
\begin{pmatrix} -g \\ 1 \end{pmatrix}
=
\begin{pmatrix} 0 \\ P_M^{\text{backwards}} \end{pmatrix} \tag{15}
\]
where $P_M^{\text{backwards}}$ is the Wiener error for the backward prediction. Now let's
write the Wiener solution another way, using $\hat{u}[n-M] = (g^B)^H u_n^B$, which
is identical in effect to the "normal" ordering. Now we have
\[
g^{B*} = w \tag{19}
\]
\[
\begin{pmatrix} -g \\ 1 \end{pmatrix} = a^{B*} \tag{20}
\]
\[
P_M^{\text{backwards}} = P_M \tag{21}
\]
That is: the AR process “looks the same” whether viewed in forward time or
reverse time. That’s a cute point, but the main by-product of this analysis
is that we can write
\[
\begin{pmatrix} R_M & r^{B*} \\ (r^B)^T & r[0] \end{pmatrix}
\begin{pmatrix} -w^{B*} \\ 1 \end{pmatrix}
=
\begin{pmatrix} 0 \\ P_M \end{pmatrix} \tag{22}
\]
or
\[
R_{M+1} \, a_M^{B*} = \begin{pmatrix} 0 \\ P_M \end{pmatrix} \tag{23}
\]
as an alternate way to write the augmented Yule-Walker equations.
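Still within the same sketch (real-valued data, so the ${}^{B*}$ operation amounts to flipping the coefficient vector), the backward form (23) can be checked by simply reversing $a_M$:

```python
# Backward augmented Yule-Walker check of (23): R_{M+1} a_M^{B*} ~ [0, ..., 0, P_M]^T.
# For real data a_M^{B*} is just the reversed vector a_M[::-1].
print(R_aug @ a_M[::-1])                 # ~ [0, ..., 0, P_M]
```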
where we are noting the order as m rather than the true (or at least we
assume it’s true) model order M , since we will start with m = 0 and work
up to m = M . This will be an inductive development, so we need to show
that the structure replicates. The structure might be suggested by (13) and
(23); but we need to show that it works.
We multiply (24) by $R_{m+1}$; we want this product to be consistent with
(13). We have
\[
R_{m+1} a_m =
\begin{pmatrix} R_m & r_m^{B*} \\ (r_m^B)^T & r[0] \end{pmatrix}
\begin{pmatrix} a_{m-1} \\ 0 \end{pmatrix}
+ \Gamma_m
\begin{pmatrix} r[0] & r_m^H \\ r_m & R_m \end{pmatrix}
\begin{pmatrix} 0 \\ a_{m-1}^{B*} \end{pmatrix} \tag{25}
\]
\[
= \begin{pmatrix} R_m a_{m-1} \\ (r_m^B)^T a_{m-1} \end{pmatrix}
+ \Gamma_m
\begin{pmatrix} r_m^H a_{m-1}^{B*} \\ R_m a_{m-1}^{B*} \end{pmatrix} \tag{26}
\]
\[
= \begin{pmatrix} P_{m-1} \\ 0 \\ \Delta_{m-1} \end{pmatrix}
+ \Gamma_m
\begin{pmatrix} \Delta_{m-1}^* \\ 0 \\ P_{m-1} \end{pmatrix} \tag{27}
\]
where
\[
\Delta_{m-1} \equiv (r_m^B)^T a_{m-1} \tag{28}
\]
So, how do we make (27) into (13)? Easy: choose
\[
\Gamma_m = -\frac{\Delta_{m-1}}{P_{m-1}} \tag{29}
\]
as desired, and
\[
P_m = P_{m-1}\left(1 - |\Gamma_m|^2\right) \tag{31}
\]
The LD algorithm consists of starting with $P_0 = r[0]$ and $a_0 = 1$, and iterating
on m: (28), (29), (30) then (31). Notice that the missing RHS of (13) – that
is, $P_m$ – is created as needed.
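Here is a minimal sketch of that iteration (real-valued case for brevity; the function name and interface are my own, not from the notes):

```python
import numpy as np

# Levinson-Durbin: start with P_0 = r[0] and a_0 = [1], then for m = 1..M apply
# (28), (29), the order update of a_m, and (31).  `r` holds the lags r[0..M].
def levinson_durbin(r, M):
    a = np.array([1.0])                    # a_0
    P = r[0]                               # P_0
    gammas = []
    for m in range(1, M + 1):
        # Delta_{m-1} = (r^B)^T a_{m-1}: dot of [r[m], r[m-1], ..., r[1]] with a_{m-1}
        delta = np.dot(r[m:0:-1], a)
        gamma = -delta / P                 # reflection coefficient Gamma_m, per (29)
        # order update: a_m = [a_{m-1}, 0] + Gamma_m * [0, reverse(a_{m-1})]
        a = np.concatenate((a, [0.0])) + gamma * np.concatenate(([0.0], a[::-1]))
        P = P * (1.0 - gamma ** 2)         # per (31)
        gammas.append(gamma)
    return a, P, gammas

# a_LD, P_LD, gammas = levinson_durbin(r, M)   # should match a_hat and P_M from above
```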
where (32) is a restatement of the AR model, (33) is the same
in Wiener filtering notation, where $f_m[n]$ denotes the prediction error for the
$m$th-order predictor, (34) uses the new formulation for $a_m$, and (35) is the
z-transform of (33). This is basically presented to suggest what is meant by
$f_m[n]$ and $H_{f,m}(z)$. We do the same thing for "backward" prediction errors
$b_m[n]$ and $H_{b,m}(z)$. We'll use
\[
b_m[n] = u[n-m] - \sum_{k=1}^{m} g_k^* u[n+1-k] \tag{36}
\]
\[
b_m[n] = u[n-m] - \sum_{k=1}^{m} w_k u[n-m+k] \tag{37}
\]
\[
b_m[n] = \sum_{k=0}^{m} a_{m,k} u[n-m+k] \tag{38}
\]
\[
B_m(z) = H_{b,m}(z) U(z) \tag{39}
\]
and it should be noted closely that $b_m[n]$ refers to the "error" in predicting
$u[n-m]$.
We write
\[
H_{f,m}(z) = \sum_{k=0}^{m} a_{m,k}^* z^{-k} \tag{40}
\]
\[
= \sum_{k=0}^{m-1} a_{m-1,k}^* z^{-k} + \Gamma_m^* \sum_{k=0}^{m-1} a_{m-1,m-k-1} z^{-k-1} \tag{41}
\]
\[
= H_{f,m-1}(z) + \Gamma_m^* z^{-1} H_{b,m-1}(z) \tag{42}
\]
Using the time-reversal relation between the forward and backward PEFs, the same
order update gives
\[
H_{b,m}(z) = z^{-m} \left(H_{f,m-1}(1/z^*)\right)^* + \Gamma_m \, z \, z^{-m} \left(H_{b,m-1}(1/z^*)\right)^* \tag{46}
\]
\[
= z^{-1} H_{b,m-1}(z) + \Gamma_m \, z \, z^{-m} \, z^{m-1} H_{f,m-1}(z) \tag{47}
\]
\[
= \Gamma_m H_{f,m-1}(z) + z^{-1} H_{b,m-1}(z) \tag{48}
\]
Then we have
[Figure: $u[n]$ passed through $H_{f,m}(z)$ to give $f_m[n]$ and through $H_{b,m}(z)$ to give $b_m[n]$]

being equivalent to

[Figure: a single lattice stage built from $H_{f,m-1}(z)$ and $H_{b,m-1}(z)$, with cross-coupled branches of gain $\Gamma_m^*$ (into the forward path) and $\Gamma_m$ (into the backward path) and a delay $z^{-1}$ in the backward path]

[Figure: a five-stage lattice with reflection coefficients $\Gamma_1, \dots, \Gamma_5$, driven by $u[n]$ and producing the backward errors $b_1[n], \dots, b_5[n]$]
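In the time domain, the figures correspond to the lattice order updates $f_m[n] = f_{m-1}[n] + \Gamma_m^* b_{m-1}[n-1]$ and $b_m[n] = b_{m-1}[n-1] + \Gamma_m f_{m-1}[n]$, with $f_0[n] = b_0[n] = u[n]$. Here is a small sketch (real-valued case; names my own) that runs a signal through such a lattice given the reflection coefficients:

```python
import numpy as np

# Lattice prediction-error filter: propagate forward/backward errors stage by stage.
def lattice_errors(u, gammas):
    f = np.asarray(u, dtype=float).copy()            # f_0[n]
    b = f.copy()                                     # b_0[n]
    for gamma in gammas:
        b_del = np.concatenate(([0.0], b[:-1]))      # b_{m-1}[n-1]
        # tuple assignment: both right-hand sides use the *old* f and b_del
        f, b = f + gamma * b_del, b_del + gamma * f
    return f, b                                      # f_M[n], b_M[n]

# f_M, b_M = lattice_errors(u, gammas)   # f_M should be (nearly) white with power ~ P_M
```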
3.2 Orthogonality
This is pretty simple once you remember that the p.o.o. governs all this
optimal filtering. Let's assume that $i < j$ and remember that $b_i[n]$ is a
linear function of $\{u[n-i], u[n-i+1], \dots, u[n]\}$. By the p.o.o., $b_j[n]$ is
orthogonal to $\{u[n-j+1], u[n-j+2], \dots, u[n]\}$, and hence it is orthogonal
to the subset $\{u[n-i], u[n-i+1], \dots, u[n]\}$ (which is contained in it since
$i < j$). Case closed: $b_i[n]$ and $b_j[n]$ are orthogonal,
\[
E\{b_i[n] b_j[n]^*\} = 0 \tag{49}
\]
for all $i \neq j$ – and this expectation is by definition $P_i$ for the case $i = j$.
Now let’s write this out in full:
\[
b_1[n] = u[n-1] + a_{1,1} u[n] \tag{51}
\]
\[
b_2[n] = u[n-2] + a_{2,1} u[n-1] + a_{2,2} u[n] \tag{52}
\]
\[
\vdots
\]
\[
b_m[n] = u[n-m] + a_{m,1} u[n-m+1] + \dots + a_{m,m} u[n] \tag{53}
\]
\[
D = L R L^H \tag{55}
\]
where $D$ is a diagonal matrix with $(i,i)$th element $P_i$. We can also write
(55) as
\[
R = L^{-1} D L^{-H} \tag{56}
\]
indicating that the PEFs and the corresponding error powers actually form the
LDU decomposition of the correlation matrix of the data.
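As an illustration (a sketch assuming the convention that row $i$ of $L$ holds the order-$i$ PEF coefficients reversed, so that $L$ maps $[u[n], u[n-1], \dots, u[n-M]]^T$ to $[b_0[n], \dots, b_M[n]]^T$; the exact convention depends on how $L$ is defined above), one can build $L$ from the Levinson-Durbin output and check that $L R L^H$ is diagonal:

```python
import numpy as np

# Build the lower-triangular matrix of (backward) PEF coefficients.
def pef_matrix(r, M):
    L = np.zeros((M + 1, M + 1))
    for i in range(M + 1):
        a_i, _, _ = levinson_durbin(r, i)      # order-i PEF coefficients [1, a_{i,1}, ..., a_{i,i}]
        L[i, :i + 1] = a_i[::-1]               # row i: b_i[n] = sum_k a_{i,k} u[n-i+k]
    return L

# With R_aug the (M+1)x(M+1) Toeplitz matrix of r[0..M] built earlier:
# L = pef_matrix(r, M);  D = L @ R_aug @ L.T    # ~ diag(P_0, ..., P_M)
```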
3.3 Stability
It’s obvious that the PEFs are stable – they’re FIR. But one reason to create
a PEF structure is to be able to recreate the corresponding AR model. Since
that involves the reciprocal of the PEF, we need to know if the zeros of the
PEF are inside the unit circle. If not, the AR model is unstable and trying
to use one would be hopeless.
We repeat (42) to get
\[
H_{f,m}(z) = H_{f,m-1}(z) + \Gamma_m^* z^{-m} \left(H_{f,m-1}(1/z^*)\right)^* \tag{59}
\]
We convert this to the DTFT (discrete-time Fourier transform) as
Since (60) comprises the sum of two complex vectors, the first one of magnitude
$|H_{f,m-1}(e^{j\omega})|$ and the second one, since $|\Gamma_m| < 1$, of magnitude less
than $|H_{f,m-1}(e^{j\omega})|$, we can see that as $\omega$ travels from zero to $2\pi$ the total
phase change of $H_{f,m}(e^{j\omega})$ must be the same as that of $H_{f,m-1}(e^{j\omega})$ – the second
term in the sum can have no effect. As such, $H_{f,m}(e^{j\omega})$ begins and ends its
phase at the same point $\forall m$.
We turn now to a generic FIR model
\[
H(z) = \prod_{i=1}^{m} \left(1 - z_i z^{-1}\right) \tag{61}
\]
\[
= z^{-m} \prod_{i=1}^{m} (z - z_i) \tag{62}
\]
\[
H(e^{j\omega}) = e^{-jm\omega} \prod_{i=1}^{m} \left(e^{j\omega} - z_i\right) \tag{63}
\]
It is easy to see that a NASC (necessary and sufficient condition) for all zeros to be inside the unit circle is that
the total phase change as $\omega$ travels from zero to $2\pi$ must be zero: each factor
$(e^{j\omega} - z_i)$ contributes $2\pi$ of phase change when $z_i$ is inside the unit circle and
zero when it is outside, while $e^{-jm\omega}$ contributes $-2\pi m$. That is
what we have, hence the FPEF is indeed stable – the FPEF is minimum-phase.
It can be shown that all zeros are outside the unit circle for the BPEF
(it is maximum-phase).
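A quick numerical confirmation (reusing `r`, `M`, and `levinson_durbin` from the earlier sketches) is to compute the zeros of the two PEF polynomials directly:

```python
import numpy as np

# Zeros of the forward PEF should lie inside the unit circle, those of the
# backward PEF (reversed coefficients, real case) outside.
a_M, P_M, gammas = levinson_durbin(r, M)
zeros_fpef = np.roots(a_M)            # roots of a_{M,0} z^M + a_{M,1} z^{M-1} + ... + a_{M,M}
zeros_bpef = np.roots(a_M[::-1])      # reversed coefficients: the BPEF
print(np.abs(zeros_fpef))             # all < 1 (minimum-phase)
print(np.abs(zeros_bpef))             # all > 1 (maximum-phase)
```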