
LINEAR FIR ADAPTIVE FILTERING (II)
Stochastic Gradient-based Algorithms: (Least-Mean-Square [LMS])

Dr. Yogananda Isukapalli

• Why the LMS Adaptive Filter?
• The steepest-descent algorithm has been used to obtain an iterative solution to the fixed normal equations.
• We need to design a filter that is responsive to changes in the input signal environment; that is, we need a structure that adapts directly to the input data.
• It is possible to construct a steepest-descent-type algorithm by replacing the fixed auto- and cross-correlation quantities by their time-dependent equivalents:

w(n+1) = w(n) + \mu\,[p(n) - R(n)\,w(n)]
We note two main problems with this approach:

A. It is very computationally intensive and expensive, since we have to compute R(n) and p(n) at each time instant n.

B. It may not be possible to compute R(n) and p(n) if only a single realization of the process is available.

The LMS algorithm is based on estimating the gradient of the mean-squared error by the gradient of the instantaneous value of the squared error. The true gradient

\nabla J(n) = \frac{\partial J}{\partial w(n)} = \frac{\partial E\{e^2(n)\}}{\partial w(n)}

is replaced by the instantaneous estimate

\hat{\nabla} J(n) = \frac{\partial \hat{J}}{\partial w(n)} = \frac{\partial e^2(n)}{\partial w(n)}
Notes on Previous Figure:

Fig 5.1a: The adaptive transversal filter consists of a transversal filter around which the LMS algorithm is built; note also the mechanism for performing the adaptive control process on the tap weights of the transversal filter.

Fig 5.1b: The tap-input vector is u(n); M-1 is the number of delay elements, and w(n) is the M x 1 tap-weight vector.

Fig 5.1c: Note the correction \delta\hat{w}_k(n) applied to the tap weight \hat{w}_k(n) at time n+1. \mu is called the adaptation constant or step-size parameter.
The Least-Mean-Square (LMS) Algorithm

From the steepest-descent algorithm, reproduce:

\nabla J(n) = -2p + 2R\,w(n)                                   ..... (1)
w(n+1) = w(n) + \frac{1}{2}\mu\,[-\nabla J(n)]                 ..... (2)

Define the instantaneous estimates:

\hat{R}(n) = u(n)\,u^H(n)                                      ..... (3)
\hat{p}(n) = u(n)\,d^*(n)                                      ..... (4)

By using (3) and (4) in (1), we obtain the instantaneous estimate of the gradient vector:

\hat{\nabla} J(n) = -2u(n)\,d^*(n) + 2u(n)\,u^H(n)\,\hat{w}(n) ..... (5)

Putting equation (5) in equation (2):

\hat{w}(n+1) = \hat{w}(n) + \mu\,u(n)\,[d^*(n) - u^H(n)\,\hat{w}(n)]   ..... (6)

LMS Algorithm: Three Basic Steps

1. Filter output:
   y(n) = \hat{w}^H(n)\,u(n)                          ..... (7a)

2. Estimation error:
   e(n) = d(n) - y(n)                                 ..... (7b)

3. Tap-weight adaptation:
   \hat{w}(n+1) = \hat{w}(n) + \mu\,u(n)\,e^*(n)      ..... (7c)
   (the term \mu\,u(n)\,e^*(n) is the correction term)

The algorithm is initialized with:
   \hat{w}(0) = 0
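As an illustration, a minimal NumPy sketch of the three-step recursion (7a)-(7c) is given below. The function name lms() and the signals u and d are placeholders introduced only for this example; the update follows the complex-valued form above, initialized with \hat{w}(0) = 0.

import numpy as np

def lms(u, d, M, mu):
    """Complex LMS: returns the tap-weight history and the error sequence.

    u : input signal (1-D array), d : desired response,
    M : number of taps, mu : step-size parameter.
    """
    N = len(u)
    w = np.zeros(M, dtype=complex)           # w_hat(0) = 0
    e = np.zeros(N, dtype=complex)
    W = np.zeros((N, M), dtype=complex)
    for n in range(M - 1, N):
        u_n = u[n - M + 1:n + 1][::-1]       # tap-input vector [u(n), ..., u(n-M+1)]
        y = np.vdot(w, u_n)                  # (7a) y(n) = w^H(n) u(n)
        e[n] = d[n] - y                      # (7b) estimation error
        w = w + mu * u_n * np.conj(e[n])     # (7c) tap-weight adaptation
        W[n] = w
    return W, e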
Notes on the LMS algorithm from the above figure:

1.) The algorithm is multivariable, since \hat{w}(n) has dimension greater than 1.
2.) It is nonlinear, since the outer feedback loop depends on the tap-input vector.
3.) \hat{w}(n) becomes a random vector for n > 0.
Stability Analysis of the LMS Algorithm

As a rule of thumb, we have to choose \mu so that the following two forms of convergence are satisfied:

Convergence in the mean:
E[\hat{w}(n)] \to w_0  as  n \to \infty

Convergence in the mean square:
J(n) \to J(\infty)  as  n \to \infty,
where J(\infty) > J_{min} and J(\infty) is finite.
J_{min} corresponds to the Wiener solution.

Fundamental assumptions for the analysis of the LMS algorithm:

1. The tap-input vectors u(1), u(2), ..., u(n) constitute a sequence of statistically independent vectors.
2. At time n, the tap-input vector u(n) is statistically independent of all previous samples of the desired response d(1), d(2), ..., d(n-1).
3. At time n, the desired response d(n) is dependent on u(n), but statistically independent of all previous samples of the desired response.
4. u(n) and d(n) consist of mutually Gaussian-distributed random variables for all n.

With the assumption of mutually Gaussian-distributed random variables in (4), assumptions (1) and (2) are equivalent to the conditions of uncorrelatedness:

E[u(n)\,u^H(k)] = 0,   k = 0, 1, ..., n-1
E[u(n)\,d^*(k)] = 0,   k = 0, 1, ..., n-1

The assumptions of the independence theory [(1)-(3)] are more general when (4) is not assumed.

Average Tap-Weight Analysis

Define the weight-error vector:
\epsilon(n) = \hat{w}(n) - w_0
where \hat{w}(n) is the estimate produced by the LMS algorithm and w_0 is the optimum Wiener solution.

Start from:
\hat{w}(n+1) = \hat{w}(n) + \mu\,u(n)\,[d^*(n) - u^H(n)\,\hat{w}(n)]

Subtracting w_0 from both sides:
\hat{w}(n+1) - w_0 = \hat{w}(n) - w_0 + \mu\,u(n)\,[d^*(n) - u^H(n)\,\hat{w}(n)]
\epsilon(n+1) = \epsilon(n) + \mu\,u(n)\,[e_0^*(n) - u^H(n)\,\epsilon(n)]
\epsilon(n+1) = [I - \mu\,u(n)\,u^H(n)]\,\epsilon(n) + \mu\,u(n)\,e_0^*(n)

since the estimation error can be written as
e(n) = d(n) - \hat{w}^H(n)\,u(n)
     = d(n) - w_0^H\,u(n) - \epsilon^H(n)\,u(n)
     = e_0(n) - \epsilon^H(n)\,u(n)
where e_0(n) is the estimation error produced by the optimum Wiener solution.

Now, by taking the mathematical expectation:
E[\epsilon(n+1)] = E\{[I - \mu\,u(n)\,u^H(n)]\,\epsilon(n)\} + \mu\,E[u(n)\,e_0^*(n)]

Since \epsilon(n) and u(n) are independent, the first term becomes:

E\{[I - \mu\,u(n)\,u^H(n)]\,\epsilon(n)\}
  = [I - \mu\,E[u(n)\,u^H(n)]]\,E[\epsilon(n)]
  = [I - \mu R]\,E[\epsilon(n)]

where
R = E[u(n)\,u^H(n)]

From the orthogonality principle:
E[u(n)\,e_0^*(n)] = 0

Therefore:
E[\epsilon(n+1)] = [I - \mu R]\,E[\epsilon(n)]      ..... (A)
Compare equation (A) with the recursion obtained in the case of the steepest-descent algorithm:
c(n+1) = [I - \mu R]\,c(n)

We therefore conclude from (A) that the mean of \epsilon(n) converges to zero as n \to \infty, provided:
0 < \mu < \frac{2}{\lambda_{max}}                   ..... (B)

Conclusion: if \mu is set as in (B), then
E[\hat{w}(n)] \to w_0  as  n \to \infty,
that is, the LMS algorithm is convergent in the mean.
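A small numerical check of the bound (B), assuming a stationary input whose correlation matrix R is estimated by time-averaging over tap-input vectors; the function name max_step_size() and the signals used are placeholders for this sketch.

import numpy as np

def max_step_size(u, M):
    """Estimate R = E[u(n) u^H(n)] from the data and return the bound 2/lambda_max of (B)."""
    N = len(u)
    U = np.array([u[n - M + 1:n + 1][::-1] for n in range(M - 1, N)])
    R = (U.T @ U.conj()) / U.shape[0]        # time-average estimate of R
    lam_max = np.linalg.eigvalsh(R).max()    # largest eigenvalue of the Hermitian estimate
    return 2.0 / lam_max

# Example: white unit-variance input, M = 4 taps; any mu below the returned bound satisfies (B)
rng = np.random.default_rng(0)
print(max_step_size(rng.standard_normal(10_000), 4))   # close to 2 for white noise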
Mean-Squared Error Analysis

Start from:
e(n) = d(n) - \hat{w}^H(n)\,u(n)
     = d(n) - w_0^H\,u(n) - \epsilon^H(n)\,u(n)
     = e_0(n) - \epsilon^H(n)\,u(n)

Now:
J(n) = E[\,|e(n)|^2\,]
     = E[(e_0(n) - \epsilon^H(n)\,u(n))(e_0^*(n) - u^H(n)\,\epsilon(n))]
     = J_{min} + E[\epsilon^H(n)\,u(n)\,u^H(n)\,\epsilon(n)]

where J_{min} is the minimum mean-squared error produced by the optimum Wiener filter, and the cross terms vanish under the orthogonality principle and the independence assumptions.
The second term can be manipulated to:

E[\epsilon^H(n)\,u(n)\,u^H(n)\,\epsilon(n)]
  = tr\{E[u(n)\,u^H(n)\,\epsilon(n)\,\epsilon^H(n)]\}
  = tr\{R\,K(n)\}

so that
J(n) = J_{min} + tr\{R\,K(n)\}                       ..... (C)
where R is the correlation matrix and K(n) = E[\epsilon(n)\,\epsilon^H(n)] is the weight-error correlation matrix.

Conclusion: since tr\{R\,K(n)\} is non-negative for all n (both R and K(n) are positive semidefinite), the LMS algorithm always produces a mean-squared error J(n) that is in excess of the minimum mean-squared error J_{min}.

Rewriting equation (C),
J_{ex}(n) = J(n) - J_{min} = tr\{R\,K(n)\}
is the excess mean-squared error.

By transforming to the eigen-coordinate system,
Q^H\,R\,Q = \Lambda
and defining
Q^H\,K(n)\,Q = X(n),
we can write
tr[R\,K(n)] = tr[\Lambda\,X(n)]
J_{ex}(n) = tr[\Lambda\,X(n)]                        ..... (D)

In the diagonal coordinate system:
J_{ex}(n) = \sum_{i=1}^{M} \lambda_i\,x_i(n)         ..... (E)
where x_i(n), i = 1, 2, ..., M, are the diagonal elements of X(n).
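A quick numerical check of (D)-(E), using arbitrary Hermitian positive semidefinite matrices as stand-ins for the correlation matrix R and the weight-error correlation matrix K(n); the matrices below are placeholders chosen only for the sketch.

import numpy as np

rng = np.random.default_rng(0)
M = 4
A = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
C = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
R = A @ A.conj().T            # stand-in correlation matrix (Hermitian, PSD)
K = C @ C.conj().T            # stand-in weight-error correlation matrix

lam, Q = np.linalg.eigh(R)    # R = Q diag(lam) Q^H
X = Q.conj().T @ K @ Q        # X(n) = Q^H K(n) Q
x = np.real(np.diag(X))

print(np.trace(R @ K).real)                # tr{R K(n)}
print(np.trace(np.diag(lam) @ X).real)     # tr{Lambda X(n)}
print(np.sum(lam * x))                     # sum_i lambda_i x_i(n); all three agree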

Equation (E) can be rewritten as:
J_{ex}(n) = \lambda^T x(n)
where x(n) satisfies the recursion
x(n+1) = B\,x(n) + \mu^2 J_{min}\,\lambda            ..... (F)
and B is the M x M matrix with elements

b_{ij} = (1 - \mu\lambda_i)^2 + \mu^2\lambda_i^2,    i = j
b_{ij} = \mu^2\lambda_i\lambda_j,                    i \ne j

The matrix B is real, symmetric and positive definite. Solving the first-order vector difference equation (F) gives:

x(n) = B^n x(0) + \mu^2 J_{min} \sum_{i=0}^{n-1} B^i\,\lambda        ..... (G)

Using
\sum_{i=0}^{n-1} B^i = (I - B)^{-1}(I - B^n),
equation (G) becomes:

x(n) = B^n [x(0) - \mu^2 J_{min}\,(I - B)^{-1}\lambda] + \mu^2 J_{min}\,(I - B)^{-1}\lambda

where the first term is the transient component and the second is the steady-state component.
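A short sketch that checks this closed-form solution against the recursion (F); the eigenvalues, step size, and J_min below are placeholder values chosen only for the example.

import numpy as np

lam = np.array([0.2, 0.5, 1.0, 2.0])          # example eigenvalues of R (placeholders)
mu, J_min, M = 0.05, 0.1, len(lam)

# Matrix B of (F): b_ii = (1 - mu*lam_i)^2 + mu^2*lam_i^2, b_ij = mu^2*lam_i*lam_j
B = mu**2 * np.outer(lam, lam)
np.fill_diagonal(B, (1 - mu * lam)**2 + mu**2 * lam**2)

n_steps = 50
x = np.zeros(M)                               # x(0) taken as zero for simplicity
for _ in range(n_steps):                      # iterate x(n+1) = B x(n) + mu^2 J_min lam
    x = B @ x + mu**2 * J_min * lam

x_inf = mu**2 * J_min * np.linalg.solve(np.eye(M) - B, lam)      # steady-state component
x_closed = np.linalg.matrix_power(B, n_steps) @ (np.zeros(M) - x_inf) + x_inf
print(np.allclose(x, x_closed))               # True: recursion and closed form agree
print("J_ex(n) =", lam @ x)                   # excess MSE, equation (E)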
Since B is symmetric, we can apply an orthogonal similarity transformation:
G^T\,B\,G = C
where G is an orthogonal matrix:
G^T\,G = I
so that
B^n = G\,C^n\,G^T
with C a diagonal matrix with entries c_i, i = 1, 2, ..., M, and g_i the eigenvectors of B associated with the eigenvalues c_i.

We can then manipulate the equation for x(n) to:
x(n) = \sum_{i=1}^{M} c_i^n\,g_i\,g_i^T\,[x(0) - x(\infty)] + x(\infty)      ..... (H)

where, for stability,
0 < c_i < 1 for all i
and
x(\infty) = \mu^2 J_{min}\,(I - B)^{-1}\,\lambda
and:
G\,C^n\,G^T = \sum_{i=1}^{M} c_i^n\,g_i\,g_i^T

Now the excess mean-squared error:
J_{ex}(n) = \lambda^T x(n)
          = \sum_{i=1}^{M} c_i^n\,\lambda^T g_i\,g_i^T\,[x(0) - x(\infty)] + J_{ex}(\infty)      ..... (I)
where
J_{ex}(\infty) = \lambda^T x(\infty) = \sum_{j=1}^{M} \lambda_j\,x_j(\infty)
After some more algebra, we may write an expression for the steady-state excess mean-squared error:

J_{ex}(\infty) = J_{min}\,\frac{\sum_{i=1}^{M} \dfrac{\mu\lambda_i}{2 - \mu\lambda_i}}{1 - \sum_{i=1}^{M} \dfrac{\mu\lambda_i}{2 - \mu\lambda_i}}      ..... (J)

Now, since
J_{ex}(n) = J(n) - J_{min} = tr\{R\,K(n)\},
putting equation (J) in (I) gives the time evolution of the mean-squared error for the LMS algorithm:

J(n) = \sum_{i=1}^{M} r_i\,c_i^n + \frac{J_{min}}{1 - \sum_{i=1}^{M} \dfrac{\mu\lambda_i}{2 - \mu\lambda_i}}      ..... (K)

where:
r_i = \lambda^T g_i\,g_i^T\,[x(0) - x(\infty)]
x_i(0) = E[\,q_i^H\,\epsilon(0)\,\epsilon^H(0)\,q_i\,],   i = 1, 2, ..., M
\epsilon(0) = \hat{w}(0) - w_0
and the q_i are the eigenvectors of R.
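As an illustration of the learning curve J(n) described by (K), the sketch below runs the real-valued LMS recursion (7a)-(7c) on a simple synthetic system-identification problem and averages e^2(n) over independent realizations; the plant coefficients h, the noise level, and all signal names are assumptions made only for this example.

import numpy as np

rng = np.random.default_rng(1)
M, mu, N, runs = 4, 0.02, 2000, 200
h = np.array([1.0, 0.5, -0.3, 0.1])            # unknown plant to be identified (placeholder)
J = np.zeros(N)                                # ensemble-averaged squared error

for _ in range(runs):
    u = rng.standard_normal(N)                                  # white input
    d = np.convolve(u, h)[:N] + 0.05 * rng.standard_normal(N)   # noisy desired response
    w = np.zeros(M)
    for n in range(M - 1, N):
        u_n = u[n - M + 1:n + 1][::-1]         # tap-input vector
        e = d[n] - w @ u_n                     # estimation error (real data)
        w = w + mu * u_n * e                   # LMS update
        J[n] += e**2 / runs

# J[n] decays from its initial value toward the steady-state level J(inf) of (K)
print("early J(n):", J[M:M + 3], " late J(n):", J[-3:])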
Conclusions from equation (K):

a.) The first term in (K),
\sum_{i=1}^{M} r_i\,c_i^n,
represents the transient component of J(n), where the r_i are constants and the c_i are the eigenvalues of B; they are real positive numbers, since B is a real, symmetric, positive definite matrix.

b.) J(n) \to J(\infty) if, and only if, \mu satisfies the two conditions:
0 < \mu < \frac{2}{\lambda_{max}}
\sum_{i=1}^{M} \frac{\mu\lambda_i}{2 - \mu\lambda_i} < 1
where \lambda_i, i = 1, 2, ..., M, are the eigenvalues of R. Under these conditions the LMS algorithm is convergent in the mean square.
c.) The mean-squared error produced by the LMS algorithm has the final value:
J(\infty) = \frac{J_{min}}{1 - \sum_{i=1}^{M} \dfrac{\mu\lambda_i}{2 - \mu\lambda_i}}

d.) The misadjustment is defined as:
\mathcal{M} = \frac{J_{ex}(\infty)}{J_{min}} = \frac{\sum_{i=1}^{M} \dfrac{\mu\lambda_i}{2 - \mu\lambda_i}}{1 - \sum_{i=1}^{M} \dfrac{\mu\lambda_i}{2 - \mu\lambda_i}}

(i) If \mu is small compared to 2/\lambda_{max}:
\mathcal{M} \approx \frac{\mu}{2} \sum_{i=1}^{M} \lambda_i

(ii) By defining the average eigenvalue
\lambda_{av} = \frac{1}{M} \sum_{i=1}^{M} \lambda_i,
we can define the average convergence time constant for the LMS algorithm:
(\tau)_{mse,av} \approx \frac{1}{2\mu\lambda_{av}}
\mathcal{M} = \frac{\mu M \lambda_{av}}{2} \approx \frac{M}{4\,\tau_{mse,av}}

Note that we have conflicting requirements: if \mu is reduced to reduce \mathcal{M}, then (\tau)_{mse,av} is increased; conversely, if \mu is increased to reduce (\tau)_{mse,av}, then \mathcal{M} is increased. The choice of \mu therefore becomes an important compromise.
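A small helper that evaluates these design figures for a given set of eigenvalues and step size, illustrating the compromise; the function name lms_design_figures() and the eigenvalues used are placeholders for this sketch.

import numpy as np

def lms_design_figures(lam, mu):
    """Return (exact misadjustment, small-mu approximation, average MSE time constant)."""
    lam = np.asarray(lam, dtype=float)
    s = np.sum(mu * lam / (2 - mu * lam))
    misadj_exact = s / (1 - s)                 # misadjustment from (d)
    misadj_small_mu = 0.5 * mu * lam.sum()     # small-mu approximation (i)
    tau_mse_av = 1.0 / (2 * mu * lam.mean())   # average time constant (ii)
    return misadj_exact, misadj_small_mu, tau_mse_av

# Two candidate step sizes: the smaller mu gives lower misadjustment but a longer time constant
for mu in (0.01, 0.05):
    print(mu, lms_design_figures([0.2, 0.5, 1.0, 2.0], mu))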

For real-valued data, it should be noted that the condition for convergence of the real LMS algorithm in the mean square is given by:
\sum_{i=1}^{M} \frac{\mu\lambda_i}{2 - \mu\lambda_i} < 1
