LMS Algorithm
Contents
1 Problem formulation
  1.1 Relationship to the Wiener filter
  1.2 Definition of symbols
2 Idea
3 Derivation
4 Simplifications
5 LMS algorithm summary
6 Convergence and stability in the mean
7 Normalised least mean squares filter (NLMS)
  7.1 Optimal learning rate
  7.2 Proof
8 See also
9 References
10 External links
Problem formulation
The FIR least mean squares filter is related to the Wiener filter, but minimizing the error criterion of the former does not rely on cross-correlations or auto-correlations. Its solution converges to the Wiener filter solution. Most linear adaptive filtering problems can be formulated using the block diagram of an adaptive filter: an unknown system $\mathbf{h}(n)$ is to be identified, and the adaptive filter attempts to adapt the filter $\hat{\mathbf{h}}(n)$ to make it as close as possible to $\mathbf{h}(n)$, while using only the observable signals $x(n)$, $d(n)$ and $e(n)$; but $y(n)$, $\nu(n)$ and $\mathbf{h}(n)$ are not directly observable. Its solution is closely related to the Wiener filter.
Definition of symbols
$n$ is the number of the current input sample
$p$ is the filter order
$\{\cdot\}^H$ denotes the Hermitian transpose (conjugate transpose)
$\mathbf{x}(n) = \left[x(n), x(n-1), \dots, x(n-p+1)\right]^T$
$\mathbf{h}(n) = \left[h_0(n), h_1(n), \dots, h_{p-1}(n)\right]^T$, $\mathbf{h}(n) \in \mathbb{C}^p$
$y(n) = \mathbf{h}^H(n) \cdot \mathbf{x}(n)$
$d(n) = y(n) + \nu(n)$, where $\nu(n)$ is additive noise
$\hat{\mathbf{h}}(n)$ is the estimated filter; interpret it as the estimate of the filter coefficients after $n$ samples
$e(n) = d(n) - \hat{y}(n) = d(n) - \hat{\mathbf{h}}^H(n) \cdot \mathbf{x}(n)$
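As a concrete illustration of this signal model, the sketch below (not part of the original article; the system h_true, the noise level and the sizes are invented for illustration, and NumPy is assumed) generates an input $x(n)$, passes it through an unknown FIR system to obtain $y(n)$, and adds noise $\nu(n)$ to form the observable desired signal $d(n)$.

import numpy as np

rng = np.random.default_rng(0)

p = 4                                       # filter order (illustrative)
N = 1000                                    # number of samples (illustrative)
h_true = np.array([0.5, -0.3, 0.2, 0.1])    # unknown system h(n) (made up)

x = rng.standard_normal(N)                  # observable input x(n)
y = np.convolve(x, h_true)[:N]              # system output y(n), not observable
v = 0.01 * rng.standard_normal(N)           # noise nu(n), not observable
d = y + v                                   # observable desired signal d(n)

Only $x(n)$ and $d(n)$ are available to the adaptive filter; the sketches in later sections reuse this kind of setup.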
Idea
The basic idea behind the LMS filter is to approach the optimum filter weights by updating the filter weights in a manner that converges to the optimum. The algorithm starts by assuming small weights (zero in most cases) and, at each step, the weights are updated using the gradient of the mean square error. That is, if the MSE gradient is positive, the error would keep increasing if the same weight were used for further iterations, which means the weights need to be reduced. In the same way, if the gradient is negative, the weights need to be increased. The basic weight update equation is

$W_{n+1} = W_n - \mu \, \nabla \varepsilon[n],$

where $\varepsilon$ represents the mean-square error and $\mu$ is the step size. The negative sign indicates that the weights are changed in the direction opposite to the gradient slope.

The mean-square error as a function of the filter weights is a quadratic function, which means it has only one extremum, which minimises the mean-square error; this is the optimal weight. The LMS thus approaches the optimal weights by descending along the mean-square-error vs. filter-weight curve.
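To make the update idea concrete, here is a minimal sketch of gradient descent on a quadratic MSE surface for a single real weight (the toy model $d(n) = 0.8\,x(n)$, the step size and all names are invented for illustration; NumPy assumed).

import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(5000)
w_opt = 0.8                          # optimal weight of the toy model (made up)
d = w_opt * x

w, mu = 0.0, 0.1                     # start from zero weight, small step size
for _ in range(50):
    e = d - w * x                    # error with the current weight
    mse_grad = -2.0 * np.mean(x * e) # gradient of the sample MSE w.r.t. w
    w = w - mu * mse_grad            # step opposite to the gradient
print(w)                             # approaches w_opt = 0.8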
Derivation
The idea behind LMS filters is to use steepest descent to find filter weights $\hat{\mathbf{h}}(n)$ which minimize a cost function. We start by defining the cost function as

$C(n) = E\left\{ \left| e(n) \right|^2 \right\},$

where $e(n)$ is the error at the current sample $n$ and $E\{\cdot\}$ denotes the expected value. This cost function is the mean square error, and it is minimized by the LMS. Applying steepest descent means taking the partial derivatives with respect to the individual entries of the filter coefficient (weight) vector:

$\nabla_{\hat{\mathbf{h}}^H} C(n) = \nabla_{\hat{\mathbf{h}}^H} E\left\{ e(n) \, e^*(n) \right\} = 2 E\left\{ \nabla_{\hat{\mathbf{h}}^H}\!\left( e(n) \right) e^*(n) \right\},$

where $\nabla$ is the gradient operator. Since $e(n) = d(n) - \hat{\mathbf{h}}^H(n) \cdot \mathbf{x}(n)$,

$\nabla_{\hat{\mathbf{h}}^H}\!\left( e(n) \right) = -\mathbf{x}(n),$

and therefore

$\nabla C(n) = -2 E\left\{ \mathbf{x}(n) \, e^*(n) \right\}.$

Now, $\nabla C(n)$ is a vector which points towards the steepest ascent of the cost function. To find the minimum of the cost function we need to take a step in the opposite direction of $\nabla C(n)$. In mathematical terms,

$\hat{\mathbf{h}}(n+1) = \hat{\mathbf{h}}(n) - \frac{\mu}{2} \nabla C(n) = \hat{\mathbf{h}}(n) + \mu \, E\left\{ \mathbf{x}(n) \, e^*(n) \right\},$

where $\frac{\mu}{2}$ is the step size (adaptation constant). That means we have found a sequential update algorithm which minimizes the cost function. Unfortunately, this algorithm is not realizable until we know $E\left\{ \mathbf{x}(n) \, e^*(n) \right\}$.

Generally, the expectation above is not computed. Instead, to run the LMS in an online (updating after each new sample is received) environment, we use an instantaneous estimate of that expectation. See below.
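As a sanity check on the gradient expression $\nabla C(n) = -2E\{\mathbf{x}(n)e^*(n)\}$, the following sketch (real-valued toy signals, so $e^* = e$; all names and values are invented, NumPy assumed) compares that analytic gradient with a numerical gradient of the sample mean-square cost.

import numpy as np

rng = np.random.default_rng(2)
p, N = 3, 100000
h_true = np.array([0.4, 0.25, -0.1])            # unknown system (made up)
x = rng.standard_normal(N)
d = np.convolve(x, h_true)[:N]

def cost(h_hat):
    """Sample estimate of the cost C = E{|e(n)|^2} for fixed weights h_hat."""
    e = d - np.convolve(x, h_hat)[:N]
    return np.mean(e**2)

h0 = np.array([0.1, 0.0, 0.3])                  # arbitrary trial weights (made up)
e0 = d - np.convolve(x, h0)[:N]

# Analytic gradient -2 E{x(n) e(n)}, one component per tap delay k.
grad_analytic = np.array([-2.0 * np.sum(x[:N - k] * e0[k:]) / N for k in range(p)])

# Numerical gradient of the same sample cost, via central differences.
eps = 1e-5
grad_numeric = np.array([
    (cost(h0 + eps * np.eye(p)[k]) - cost(h0 - eps * np.eye(p)[k])) / (2 * eps)
    for k in range(p)
])
print(grad_analytic, grad_numeric)              # the two should agree closely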
Simplifications
For most systems the expectation function $E\left\{ \mathbf{x}(n) \, e^*(n) \right\}$ must be approximated. This can be done with the following unbiased estimator

$\hat{E}\left\{ \mathbf{x}(n) \, e^*(n) \right\} = \frac{1}{N} \sum_{i=0}^{N-1} \mathbf{x}(n-i) \, e^*(n-i),$

where $N$ indicates the number of samples we use for that estimate. The simplest case is $N = 1$:

$\hat{E}\left\{ \mathbf{x}(n) \, e^*(n) \right\} = \mathbf{x}(n) \, e^*(n).$

For that simple case the update algorithm follows as

$\hat{\mathbf{h}}(n+1) = \hat{\mathbf{h}}(n) + \mu \, \mathbf{x}(n) \, e^*(n).$

Indeed, this constitutes the update algorithm for the LMS filter.
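A direct implementation of this update for real-valued signals (so the conjugate can be dropped) might look like the sketch below; the function name lms_filter and the test signals are illustrative assumptions, not part of the original article.

import numpy as np

def lms_filter(x, d, p, mu):
    """Run the LMS update h(n+1) = h(n) + mu * e(n) * x(n) on real signals."""
    h_hat = np.zeros(p)                   # filter estimate, initialised to zero
    e = np.zeros(len(x))                  # error signal e(n)
    for n in range(p - 1, len(x)):
        x_vec = x[n - p + 1:n + 1][::-1]  # [x(n), x(n-1), ..., x(n-p+1)]
        e[n] = d[n] - h_hat @ x_vec       # a-priori error
        h_hat = h_hat + mu * e[n] * x_vec
    return h_hat, e

# Illustrative use: identify a made-up unknown system from noisy observations.
rng = np.random.default_rng(3)
h_true = np.array([0.5, -0.3, 0.2, 0.1])
x = rng.standard_normal(5000)
d = np.convolve(x, h_true)[:len(x)] + 0.01 * rng.standard_normal(len(x))
h_hat, e = lms_filter(x, d, p=4, mu=0.05)
print(h_hat)                              # should end up close to h_true

With unit-variance white input and $p = 4$, $\mathrm{tr}[\mathbf{R}] \approx 4$, so the step size 0.05 used here sits well below the practical bound discussed further down.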
LMS algorithm summary

The LMS algorithm for a $p$th order filter can be summarized as

Parameters: $p$ = filter order, $\mu$ = step size
Initialization: $\hat{\mathbf{h}}(0) = \mathbf{0}$
Computation: For $n = 0, 1, 2, \ldots$
  $\mathbf{x}(n) = \left[x(n), x(n-1), \ldots, x(n-p+1)\right]^T$
  $e(n) = d(n) - \hat{\mathbf{h}}^H(n) \cdot \mathbf{x}(n)$
  $\hat{\mathbf{h}}(n+1) = \hat{\mathbf{h}}(n) + \mu \, e^*(n) \, \mathbf{x}(n)$

Convergence and stability in the mean

As the LMS algorithm does not use the exact values of the expectations, the weights never reach the optimal weights in the absolute sense, but convergence in the mean is possible if the step size $\mu$ is chosen properly. The algorithm is stable in the mean provided

$0 < \mu < \frac{2}{\lambda_{\max}},$

where $\lambda_{\max}$ is the greatest eigenvalue of the autocorrelation matrix $\mathbf{R} = E\left\{ \mathbf{x}(n) \, \mathbf{x}^H(n) \right\}$. If this condition is not fulfilled, the algorithm becomes unstable and $\hat{\mathbf{h}}(n)$ diverges.
Maximum convergence speed is achieved when

$\mu = \frac{2}{\lambda_{\max} + \lambda_{\min}},$

where $\lambda_{\min}$ is the smallest eigenvalue of $\mathbf{R}$. Given that $\mu$ is less than or equal to this optimum, the convergence speed is determined by $\lambda_{\min}$, with a larger value yielding faster convergence. This means that faster convergence can be achieved when $\lambda_{\max}$ is close to $\lambda_{\min}$, that is, the maximum achievable convergence speed depends on the eigenvalue spread of $\mathbf{R}$.
A white noise signal has autocorrelation matrix $\mathbf{R} = \sigma^2 \mathbf{I}$, where $\sigma^2$ is the variance of the signal. In this case all eigenvalues are equal, and the eigenvalue spread is the minimum over all possible matrices. The common interpretation of this result is therefore that the LMS converges quickly for white input signals, and slowly for colored input signals, such as processes with low-pass or high-pass characteristics.
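This behaviour can be checked numerically; the sketch below (using an invented first-order low-pass coloring and NumPy) compares the eigenvalue spread of the sample autocorrelation matrix for white and colored inputs.

import numpy as np

rng = np.random.default_rng(4)
p, N = 8, 200000

def eigenvalue_spread(x, p):
    """lambda_max / lambda_min of the order-p sample autocorrelation matrix R."""
    r = np.array([np.mean(x[k:] * x[:len(x) - k]) for k in range(p)])
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])  # Toeplitz R
    lam = np.linalg.eigvalsh(R)            # eigenvalues in ascending order
    return lam[-1] / lam[0]

white = rng.standard_normal(N)
colored = np.convolve(white, [1.0, 0.9])[:N]   # crude low-pass coloring (made up)

print(eigenvalue_spread(white, p))     # close to 1: fast LMS convergence expected
print(eigenvalue_spread(colored, p))   # much larger: slow LMS convergence expected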
It is important to note that the above upper bound on $\mu$ only enforces stability in the mean, but the coefficients of $\hat{\mathbf{h}}(n)$ can still grow infinitely large, i.e. divergence of the coefficients is still possible. A more practical bound is

$0 < \mu < \frac{2}{\mathrm{tr}\left[\mathbf{R}\right]},$

where $\mathrm{tr}[\mathbf{R}]$ denotes the trace of $\mathbf{R}$. This bound guarantees that the coefficients of $\hat{\mathbf{h}}(n)$ do not diverge (in practice, the value of $\mu$ should not be chosen close to this upper bound, since it is somewhat optimistic due to approximations and assumptions made in the derivation of the bound).
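Because each diagonal entry of $\mathbf{R}$ equals the input power $E\{|x(n)|^2\}$, $\mathrm{tr}[\mathbf{R}]$ is simply $p$ times that power, so the practical bound can be estimated directly from data; the sketch below does this for an invented input signal (NumPy assumed).

import numpy as np

rng = np.random.default_rng(5)
p = 8
x = 2.0 * rng.standard_normal(100000)   # input with variance 4 (made up)

input_power = np.mean(x**2)             # estimate of E{|x(n)|^2}, R's diagonal entry
mu_max = 2.0 / (p * input_power)        # practical bound 2 / tr[R]
print(mu_max)                           # step sizes should be chosen well below this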
Normalised least mean squares filter (NLMS)

The main drawback of the "pure" LMS algorithm is that it is sensitive to the scaling of its input $\mathbf{x}(n)$. This makes it very hard (if not impossible) to choose a learning rate $\mu$ that guarantees stability of the algorithm (Haykin 2002). The normalised least mean squares filter (NLMS) is a variant of the LMS algorithm that solves this problem by normalising with the power of the input. The NLMS algorithm can be summarised as:

Parameters: $p$ = filter order, $\mu$ = step size
Initialization: $\hat{\mathbf{h}}(0) = \mathbf{0}$
Computation: For $n = 0, 1, 2, \ldots$
  $\mathbf{x}(n) = \left[x(n), x(n-1), \ldots, x(n-p+1)\right]^T$
  $e(n) = d(n) - \hat{\mathbf{h}}^H(n) \cdot \mathbf{x}(n)$
  $\hat{\mathbf{h}}(n+1) = \hat{\mathbf{h}}(n) + \dfrac{\mu \, e^*(n) \, \mathbf{x}(n)}{\mathbf{x}^H(n) \, \mathbf{x}(n)}$
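A sketch of the NLMS recursion for real-valued signals follows; the function name nlms_filter, the small regularisation constant eps added to the denominator (to avoid division by zero, not part of the summary above), and the test signals are illustrative assumptions.

import numpy as np

def nlms_filter(x, d, p, mu, eps=1e-8):
    """NLMS update: h(n+1) = h(n) + mu * e(n) * x(n) / (x(n)^T x(n) + eps)."""
    h_hat = np.zeros(p)
    e = np.zeros(len(x))
    for n in range(p - 1, len(x)):
        x_vec = x[n - p + 1:n + 1][::-1]       # [x(n), x(n-1), ..., x(n-p+1)]
        e[n] = d[n] - h_hat @ x_vec
        h_hat = h_hat + mu * e[n] * x_vec / (x_vec @ x_vec + eps)
    return h_hat, e

# Illustrative use with a strongly scaled input.
rng = np.random.default_rng(6)
h_true = np.array([0.5, -0.3, 0.2, 0.1])
x = 100.0 * rng.standard_normal(5000)
d = np.convolve(x, h_true)[:len(x)] + rng.standard_normal(len(x))
h_hat, e = nlms_filter(x, d, p=4, mu=0.5)
print(h_hat)                                   # still close to h_true

Note that the input here is scaled by a factor of 100; the normalisation by $\mathbf{x}^H(n)\mathbf{x}(n)$ is what keeps the same step size usable regardless of that scaling.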
Optimal learning rate

It can be shown that if there is no interference ($\nu(n) = 0$), then the optimal learning rate for the NLMS algorithm is $\mu_{opt} = 1$ and is independent of the input $\mathbf{x}(n)$ and of the real (unknown) impulse response $\mathbf{h}(n)$. In the general case with interference ($\nu(n) \neq 0$), the optimal learning rate is

$\mu_{opt} = \frac{E\left[\left| y(n) - \hat{y}(n) \right|^2\right]}{E\left[\left| e(n) \right|^2\right]}.$

This result assumes that the signals $\nu(n)$ and $\mathbf{x}(n)$ are uncorrelated to each other, which is generally the case in practice.

Proof

Let the filter misalignment be defined as $\Lambda(n) = \left| \mathbf{h}(n) - \hat{\mathbf{h}}(n) \right|^2$. The optimal learning rate is obtained by writing the expected misalignment of the next sample, $E\left[\Lambda(n+1)\right]$, in terms of the NLMS update, minimising it with respect to $\mu$, and using the assumption that $\nu(n)$ and $\mathbf{x}(n)$ are uncorrelated, which yields the expression for $\mu_{opt}$ given above.
See also
References
Monson H. Hayes: Statistical Digital Signal Processing and Modeling, Wiley, 1996, ISBN 0-471-59431-8
Simon Haykin: Adaptive Filter Theory, Prentice Hall, 2002, ISBN 0-13-048434-2
Simon S. Haykin, Bernard Widrow (Editor): Least-Mean-Square Adaptive Filters, Wiley, 2003, ISBN 0-471-21570-8
Bernard Widrow, Samuel D. Stearns: Adaptive Signal Processing, Prentice Hall, 1985, ISBN 0-13-004029-0
Weifeng Liu, Jose Principe and Simon Haykin: Kernel Adaptive Filtering: A Comprehensive Introduction, John Wiley, 2010, ISBN 0-470-44753-2
Paulo S.R. Diniz: Adaptive Filtering: Algorithms and Practical Implementation, Kluwer Academic Publishers, 1997, ISBN 0-7923-9912-9
External links