Adaptive Filters
The design of such filters is the domain of optimal filtering, which originated with the
pioneering work of Wiener and was extended and enhanced by Kalman, Bucy and others.
Filters used for direct filtering can be either fixed or adaptive.
1. Fixed Filters - The design of fixed filters requires a priori knowledge of both the signal
and the interference, i.e. if we know the signal and interference beforehand, we can design a
filter that passes the frequencies contained in the signal and rejects the frequency band
occupied by the interference.
2. Adaptive Filters - Adaptive filters, on the other hand, have the ability to adjust their
impulse response to filter out the correlated signal in the input. They require little or no a
priori knowledge of the signal and interference characteristics. (If the signal is narrowband
and the interference broadband, which is usually the case, or vice versa, no a priori
information is needed; otherwise they require a signal (desired response) that is correlated in
some sense with the signal to be estimated.) Moreover, adaptive filters have the capability of
adaptively tracking the signal under non-stationary conditions.
Adaptive interference cancelling is an alternative method of estimating
signals corrupted by additive noise or interference. The method uses a "primary" input
containing the corrupted signal and a "reference" input containing interference correlated in
some unknown way with the primary interference. The reference input is adaptively filtered
and subtracted from the primary input to obtain the signal estimate. Adaptive filtering before
subtraction allows the treatment of inputs that are deterministic or stochastic, stationary or
time variable. Wiener solutions are developed to describe asymptotic adaptive performance
and output signal-to-interference ratio for stationary stochastic inputs, including single and
multiple reference inputs. These solutions show that when the reference input is free of signal
and certain other conditions are met, interference in the primary input can be essentially
eliminated without signal distortion. It is further shown that in treating periodic interference
the adaptive interference canceller acts as a notch filter with narrow bandwidth, infinite null,
and the capability of tracking the exact frequency of the interference; in this case the
canceller behaves as a linear, time-invariant system, with the adaptive filter converging on a
dynamic rather than a static solution. Experimental results are presented that illustrate the
usefulness of the adaptive interference cancelling technique in a variety of practical
applications. These applications include the cancelling of various forms of periodic
interference in electrocardiography, the cancelling of periodic interference in speech signals,
and the cancelling of broad-band
interference in the side-lobes of an antenna array. In further experiments it is shown that a
sine wave and Gaussian interference can be separated by using a reference input that is a
delayed version of the primary input. Suggested applications include the elimination of tape
hum or turntable rumble during the playback of recorded broad-band signals and the
automatic detection of very-low-level periodic signals masked by broad-band interference.
The purpose of adaptive interference cancellation is to improve the signal-to-interference
ratio (SIR) of a signal by removing interference from the signal that you receive. A typical
example of this application is the communication between a pilot, who is inside a jet aircraft,
and a ground control tower. A jet engine can produce interference of over 140 dB, but normal
human speech is below 50 dB. If you are in the ground control tower, you might have
difficulty hearing the pilot's speech clearly. In this situation, you can use adaptive filters to
remove the jet engine interference while retaining the pilot's speech.
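As a rough illustration of what these numbers imply (treating the two dB figures simply as signal and interference power levels at the microphone), the signal-to-interference ratio at the input would be about
$$\mathrm{SIR}_{in} \approx 50\ \mathrm{dB} - 140\ \mathrm{dB} = -90\ \mathrm{dB}$$
so a very large amount of interference power must be removed before the speech becomes usable.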
The following figure shows an interference cancellation system.
The adaptive interference cancellation algorithm has been applied to the removal of high-level
white interference from audio signals. Simulations and actual acoustically recorded signals
have been processed successfully, with excellent agreement between the results obtained
from simulations and the results obtained with acoustically produced data. A study of the
filter length required in order to achieve a desired interference reduction level in a hard-
walled room is presented. The performance of the algorithm in this application is described
and required modifications are suggested. A multichannel processing scheme is presented
which allows the adaptive filter to converge at independent rates in different frequency bands.
This is shown to be of particular use when the interference is not white. Careful
implementation of the scheme allows the problem to be broken into several smaller ones
which can be handled by independent processors, thus allowing longer filter lengths to be
processed in real time. We have chosen this topic because, in fixed-coefficient transfer
function filters, the coefficients are fixed and cannot be changed in response to environmental
changes, whereas adaptive filters can adapt to such changes.
An adaptive filter is a system with a linear filter that has a transfer function controlled by
variable parameters and a means to adjust those parameters according to an optimization
algorithm. Because of the complexity of the optimization algorithms, most adaptive filters
are digital filters. Adaptive filters are required for some applications because some
parameters of the desired processing operation (for instance, the locations of reflective
surfaces in a reverberant space) are not known in advance or are changing. The closed loop
adaptive filter uses feedback in the form of an error signal to refine its transfer function.
Generally speaking, the closed loop adaptive process involves the use of a cost function,
which is a criterion for optimum performance of the filter, to feed an algorithm that
determines how to modify the filter transfer function to minimize the cost on the next
iteration. The most common cost function is the mean square of the error signal.
As the power of digital signal processors has increased, adaptive filters have become much
more common and are now routinely used in devices such as mobile phones and other
communication devices, camcorders and digital cameras, and medical monitoring equipment.
An adaptive filtering task is defined by four aspects:
1. The signals being processed by the filter
2. The structure that defines how the output signal of the filter is computed from its input
signal
3. The parameters within this structure that can be iteratively changed to alter the filter's
input-output relationship
4. The adaptive algorithm that describes how the parameters are adjusted from one time
instant to the next
The figure shows a block diagram in which a sample from a digital input signal x(n) is fed
into a device, called an adaptive filter, that computes a corresponding output signal sample
y(n) at time n. For the moment, the structure of the adaptive filter is not important, except for
the fact that it contains adjustable parameters whose values affect how y(n) is computed. The
output signal is compared to a second signal d(n), called the desired response signal, by
subtracting the two samples at time n. This difference signal, given by
$$e(n) = d(n) - y(n) \qquad (1)$$
is known as the error signal. The error signal is fed into a procedure which alters or adapts
the parameters of the filter from time n to time (n + 1) in a well-defined manner. This process
of adaptation is represented by the oblique arrow that pierces the adaptive filter block in the
figure. As the time index n is incremented, it is hoped that the output of the adaptive filter
becomes a better and better match to the desired response signal through this adaptation
process, such that the magnitude of e(n) decreases over time. In this context, what is meant
by "better" is specified by the form of the adaptive algorithm used to adjust the parameters of
the adaptive filter. In the adaptive filtering task, adaptation refers to the method by which the
parameters of the system are changed from time index n to time index (n + 1). The number
and types of parameters within this system depend on the computational structure chosen for
the system. We now discuss different filter structures that have been proven useful for
adaptive filtering tasks.
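To make this generic structure concrete, the following Python sketch (with illustrative names; the filter structure is a simple FIR transversal filter, and the update rule is left to the caller) shows how x(n), y(n), d(n) and e(n) interact in a sample-by-sample adaptation loop:

```python
import numpy as np

def adaptive_filter_loop(x, d, update, num_taps=8):
    """Generic adaptive filtering loop: compute y(n) from the current
    parameters, form e(n) = d(n) - y(n), then adapt the parameters.
    `update` is any rule w_new = update(w, x_vec, e) chosen by the caller."""
    w = np.zeros(num_taps)                      # adjustable parameters
    y = np.zeros(len(x))
    e = np.zeros(len(x))
    for n in range(len(x)):
        # most recent num_taps input samples, zero-padded before n = 0
        x_vec = np.array([x[n - k] if n - k >= 0 else 0.0
                          for k in range(num_taps)])
        y[n] = np.dot(w, x_vec)                 # output from current parameters
        e[n] = d[n] - y[n]                      # error signal, eq. (1)
        w = update(w, x_vec, e[n])              # adapt parameters for time n+1
    return y, e, w

# Example usage with a simple gradient-style update (step size mu assumed):
# mu = 0.05
# y, e, w = adaptive_filter_loop(x, d, lambda w, xv, err: w + mu * err * xv)
```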
Adaptive Filter: Interference Mitigation techniques
Interference is an unwanted nuisance or disturbance during communication. Noise arises
from many factors such as interference, delay, and overlapping. Noise problems in the
environment have gained attention due to the tremendous growth of technology, which has
led to noisy engines, heavy machinery, high electromagnetic radiation devices and other
noise sources. Noise cancellation with the help of adaptive filters has been employed in a
variety of practical applications, such as the cancelling of various forms of periodic
interference in electrocardiography, the cancelling of periodic interference in speech signals,
and the cancelling of broad-band interference in the side-lobes of an antenna array. In a sound
or speech signal, noise is very problematic because it makes it difficult to understand the
information. Speech is a very basic way for humans to convey information to one another;
with a bandwidth of only about 4 kHz, speech can convey information together with the
emotion of a human voice. The speech signal has certain properties: it is a one-dimensional
signal, with time as its independent variable; it is random in nature; and it is non-stationary,
i.e. its frequency spectrum is not constant in time. Although human beings have an audible
frequency range of 20 Hz to 20 kHz, human speech has significant frequency components
only up to about 4 kHz.
The most common problem in speech processing is the effect of interference noise in speech
signals. In most practical applications, adaptive filters are used and preferred over fixed
digital filters because adaptive filters have the ability to adjust their own parameters
automatically, and their design requires little or no a priori knowledge of the signal or noise
characteristics. In this paper we use an adaptive filter for noise cancellation. The general
configuration of an adaptive filter system is shown in Fig. 1. It has two inputs: the primary
input d(n), which represents the desired signal corrupted with undesired noise, and the
reference signal x(n), which is the undesired noise to be filtered out of the system. The goal
of an adaptive filtering system is to reduce the noise portion and to obtain the uncorrupted
desired signal. In order to achieve this, a reference of the noise signal is needed; this is called
the reference signal x(n). However, the reference signal is typically not exactly the same as
the noise portion of the primary input; it differs in amplitude, phase or time. Therefore the
reference signal cannot simply be subtracted from the primary signal to obtain the desired
portion at the output.
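To make the primary/reference configuration concrete, here is a small simulation sketch (all signal names, the acoustic path and the parameter values are illustrative assumptions, not taken from the text): the primary input d(n) is a tone plus noise that has passed through an unknown path, the reference x(n) is the noise alone, and an adaptive FIR filter driven by an LMS-style update (described in the next section) estimates the noise component so that the error output approximates the clean signal.

```python
import numpy as np

rng = np.random.default_rng(0)
fs, n_samples = 8000, 4000
t = np.arange(n_samples) / fs

signal = np.sin(2 * np.pi * 440 * t)              # stand-in for the desired signal
noise = rng.standard_normal(n_samples)            # reference noise x(n)
path = np.array([0.6, 0.3, -0.2])                 # assumed unknown noise path
noise_at_primary = np.convolve(noise, path)[:n_samples]

d = signal + noise_at_primary                     # primary input d(n)
x = noise                                         # reference input x(n)

num_taps, mu = 8, 0.01
w = np.zeros(num_taps)
e = np.zeros(n_samples)                           # e(n) is the recovered signal
for n in range(num_taps, n_samples):
    x_vec = x[n - num_taps + 1:n + 1][::-1]       # recent reference samples
    y = w @ x_vec                                 # estimate of the noise in d(n)
    e[n] = d[n] - y
    w = w + mu * e[n] * x_vec                     # adapt towards the noise path

print("noise power before cancelling:", np.mean(noise_at_primary ** 2))
print("residual noise power after  :", np.mean((e[1000:] - signal[1000:]) ** 2))
```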
There are many techniques to study under adaptive filtering. Here we briefly consider three
algorithms: the least mean square (LMS) algorithm (together with its normalized form), the
recursive least squares (RLS) algorithm, and the minimum mean square error (MMSE)
approach. The analysis of these three algorithms and their results are presented below.
3.1) Least Mean Square (LMS) Algorithm:-
The least mean square (LMS) algorithm is used to update the weight vector coefficients of
the adaptive (variable) filter by using the mean square error.
Figure: LMS adaptive filter - the variable filter w_n maps the input x(i) to y(i), which is
subtracted from the desired signal d(i) to give the error e(n) that drives the update algorithm.
The main aim of the LMS algorithm is to reduce the mean square error
$$\xi(n) = \sum_{i=0}^{n} e^2(i) \qquad (1)$$
The LMS algorithm is a stochastic gradient descent method in which the filter is adapted
based only on the error at the current time. Gradient descent is a first-order optimization
algorithm; it is used to find a local minimum of a function.
The basic idea behind the LMS filter is to approach the optimum filter weights ($R^{-1}P$)
by updating the filter weights in a manner that converges to the optimum, where
R = autocorrelation matrix of the input
P = cross-correlation vector between the input and the desired signal
For stationary processes, the gradient descent adaptive filter converges to the solution of the
Wiener-Hopf equation when
$$0 < \mu < \frac{2}{\lambda_{\max}} \qquad (2)$$
where
µ = step size
$\lambda_{\max}$ = largest eigenvalue of the autocorrelation matrix R
The mean square error is
$$\xi(n) = E\{|e(n)|^2\} \qquad (3)$$
From the figure,
$$e(n) = d(n) - \hat{y}(n) = d(n) - w_n^T x(n) \qquad (4)$$
Setting the gradient of the mean square error with respect to w to zero,
$$\nabla_w \xi = -2P + 2wR = 0$$
$$2wR = 2P$$
$$wR = P$$
$$w_{\mathrm{optim}} = R^{-1}P$$
The algorithm starts by assuming small weights (zero in most cases), and at each step the
weights are updated by finding the gradient of the mean square error.
Figure: error surface - mean square error plotted against the weight vector.
The error function is a quadratic function of the weight vector. If the mean square error
gradient is positive, it implies that the error would keep increasing if the same weights are
used for further iterations, which means we need to reduce the weights. In the same way, if
the gradient is negative, we need to increase the weights. The basic weight update equation is
therefore
$$w_{n+1} = w_n - \frac{\mu}{2}\,\nabla\xi(n) \qquad (6)$$
where
µ = step size
ξ(n) = mean square error
∇ = gradient operator
The negative sign indicates that we need to change the weights in the direction opposite to
that of the gradient slope.
Derivation:-
Figure: LMS filter structure - the input x(n) is processed by the coefficients
{w0, w1, w2, ..., wn}, producing the output ŷ(n), which is subtracted from the desired signal
d(n) to form the error e(n).
The idea behind the LMS filter is to use gradient descent to find the filter weights w(n) which
minimize a cost function. The cost function is given by
$$C(n) = E\{|e(n)|^2\} \qquad (7a)$$
which is the mean square error
$$\xi(n) = E\{|e(n)|^2\} \qquad (7b)$$
where e(n) is the error at the current sample n and E{·} denotes the expected value. Applying
gradient descent means taking the partial derivatives with respect to the individual entries of
the filter coefficient vector,
$$\nabla_{w^H}\,\xi(n) = \nabla_{w^H}\,E\{e(n)\,e^*(n)\}$$
~9~
And one multiplication is needed to form the product µe(n). Finally, (p+1) multiplications and
p additions are necessary to calculate the output y(n) of the adaptive filter. Thus a total of
(2p+3) multiplications and (2p+2) additions per output point are required.
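As a worked instance of this count (the filter order is chosen arbitrarily): for p = 10, each output sample costs 2(10) + 3 = 23 multiplications and 2(10) + 2 = 22 additions, so the per-sample cost of the LMS algorithm grows only linearly with the filter order.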
Table:-
The LMS algorithm for a pth-order FIR adaptive filter:-
Parameters: p = filter order, µ = step size
Initialization: $w_0 = 0$
Computation: For n = 0, 1, 2, 3, ...
$$y(n) = w_n^H\,x(n)$$
$$e(n) = d(n) - y(n)$$
$$w_{n+1} = w_n + \mu\,x(n)\,e^*(n)$$
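A minimal NumPy sketch of this table (variable names are my own; x and d are assumed to be given sequences, and complex conjugation is used so the update matches $w_{n+1} = w_n + \mu\,x(n)\,e^*(n)$):

```python
import numpy as np

def lms(x, d, p, mu):
    """pth-order FIR LMS filter following the table above:
    y(n) = w^H x(n), e(n) = d(n) - y(n), w <- w + mu * x(n) * conj(e(n))."""
    w = np.zeros(p + 1, dtype=complex)              # initialization: w_0 = 0
    y = np.zeros(len(x), dtype=complex)
    e = np.zeros(len(x), dtype=complex)
    for n in range(len(x)):
        # data vector x(n) = [x(n), x(n-1), ..., x(n-p)]^T, zero before n = 0
        x_vec = np.array([x[n - k] if n - k >= 0 else 0
                          for k in range(p + 1)], dtype=complex)
        y[n] = np.vdot(w, x_vec)                    # w^H x(n)
        e[n] = d[n] - y[n]
        w = w + mu * x_vec * np.conj(e[n])          # coefficient update
    return y, e, w
```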
Property 3 - The mean square error ξ(n) converges to a steady-state value of
$$\xi(\infty) = \xi_{\min} + \xi_{ex}(\infty) = \xi_{\min}\;\frac{1}{1 - \sum_{k=0}^{p}\dfrac{\mu\lambda_k}{2-\mu\lambda_k}}$$
and the LMS algorithm is said to converge in the mean square if and only if the step size
satisfies the following two conditions:
$$0 < \mu < \frac{2}{\lambda_{\max}}, \qquad \sum_{k=0}^{p}\frac{\mu\lambda_k}{2-\mu\lambda_k} < 1$$
The first is the condition required for the LMS algorithm to converge in the mean, and the
second guarantees that ξ(∞) is positive. On further solving we obtain
$$\xi_{ex}(\infty) = \xi_{\min}\;\frac{\sum_{k=0}^{p}\dfrac{\mu\lambda_k}{2-\mu\lambda_k}}{1 - \sum_{k=0}^{p}\dfrac{\mu\lambda_k}{2-\mu\lambda_k}}$$
If $\mu\lambda_k \ll 2$ for each k, then
$$\sum_{k=0}^{p}\frac{\mu\lambda_k}{2-\mu\lambda_k} \approx \frac{\mu}{2}\sum_{k=0}^{p}\lambda_k = \frac{\mu}{2}\,\mathrm{tr}(R_x)$$
If, in addition, $\mu \ll 2/\lambda_{\max}$, it also follows that
$$\xi(\infty) \approx \xi_{\min}\;\frac{1}{1 - \frac{\mu}{2}\,\mathrm{tr}(R_x)}$$
$$\xi_{ex}(\infty) \approx \xi_{\min}\;\frac{\frac{\mu}{2}\,\mathrm{tr}(R_x)}{1 - \frac{\mu}{2}\,\mathrm{tr}(R_x)} \approx \frac{\mu}{2}\,\xi_{\min}\,\mathrm{tr}(R_x)$$
Thus, for small µ, the excess mean square error is proportional to the step size µ. Adaptive
filters are often described in terms of their misadjustment, which is a normalized excess
mean square error.
Definition - The misadjustment M is the ratio of the steady-state excess mean square error to
the minimum mean square error,
$$M = \frac{\xi_{ex}(\infty)}{\xi_{\min}}$$
For the LMS algorithm this gives
$$M \approx \frac{\frac{\mu}{2}\,\mathrm{tr}(R_x)}{1 - \frac{\mu}{2}\,\mathrm{tr}(R_x)} \approx \frac{\mu}{2}\,\mathrm{tr}(R_x)$$
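As a quick numerical illustration (the values of µ and tr(R_x) here are hypothetical, chosen only to show the formula in use), with µ = 0.01 and tr(R_x) = 10:
$$M \approx \frac{0.005 \times 10}{1 - 0.005 \times 10} = \frac{0.05}{0.95} \approx 0.053$$
so the steady-state excess mean square error would be roughly 5% of $\xi_{\min}$.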
Normalized LMS –
A difficulty in the LMS algorithm is the selection of the step size µ. For stationary processes,
the LMS algorithm converges in the mean if $0 < \mu < 2/\lambda_{\max}$, and converges in
the mean square if $0 < \mu < 2/\mathrm{tr}(R_x)$; however, since $R_x$ is generally
unknown, either $\lambda_{\max}$ or $R_x$ must be estimated in order to use these bounds.
One way around this difficulty is to use the fact that, for stationary processes,
$\mathrm{tr}(R_x) = (p+1)\,E\{|x(n)|^2\}$.
Therefore the condition for mean square convergence may be replaced with
$$0 < \mu < \frac{2}{(p+1)\,E\{|x(n)|^2\}}$$
where the power $E\{|x(n)|^2\}$ may be estimated by
$$\hat{E}\{|x(n)|^2\} = \frac{1}{p+1}\sum_{k=0}^{p}|x(n-k)|^2$$
which leads to the following bound on the step size for mean square convergence:
$$0 < \mu < \frac{2}{x^H(n)\,x(n)}$$
A convenient way to incorporate this bound into the LMS adaptive filter is to use a step size
of the form
$$\mu(n) = \frac{\beta}{x^H(n)\,x(n)} = \frac{\beta}{\|x(n)\|^2}$$
where β is a normalized step size with 0 < β < 2. Replacing µ in the LMS weight vector
update equation with µ(n) leads to the normalized LMS algorithm (NLMS), which is given by
$$w_{n+1} = w_n + \beta\,\frac{x(n)}{\|x(n)\|^2}\,e^*(n)$$
Note that the effect of the normalization by $\|x(n)\|^2$ is to alter the magnitude, but not the
direction, of the estimated gradient vector. Therefore, with the appropriate set of statistical
assumptions, it may be shown that the normalized LMS algorithm converges in the mean
square if 0 < β < 2.
In the LMS algorithm, the correction that is applied to $w_n$ is proportional to the input
vector x(n). Therefore, when x(n) is large, the LMS algorithm experiences a problem with
gradient noise amplification. With the normalization of the LMS step size by $\|x(n)\|^2$ in
the NLMS algorithm, this amplification problem is diminished. Although the NLMS
algorithm bypasses the problem of gradient amplification, we are now faced with a similar
problem that occurs when $\|x(n)\|$ becomes too small. An alternative, therefore, is to use
the following modification to the NLMS algorithm:
$$w_{n+1} = w_n + \beta\,\frac{x(n)}{\epsilon + \|x(n)\|^2}\,e^*(n)$$
where ε is a small positive constant.
The extra computation introduced by this modification involves only two operations, one
addition and one subtraction, since the input norm $\|x(n)\|^2$ can be updated recursively
from one sample to the next.
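A sketch of the ε-NLMS recursion described above, in the same style as the LMS sketch earlier (the defaults chosen for β and ε are illustrative assumptions):

```python
import numpy as np

def nlms(x, d, p, beta=0.5, eps=1e-8):
    """Normalized LMS: the step size is divided by (eps + ||x(n)||^2), so large
    inputs do not amplify gradient noise and small inputs do not blow up."""
    w = np.zeros(p + 1, dtype=complex)
    e = np.zeros(len(x), dtype=complex)
    for n in range(len(x)):
        x_vec = np.array([x[n - k] if n - k >= 0 else 0
                          for k in range(p + 1)], dtype=complex)
        y = np.vdot(w, x_vec)                          # w^H x(n)
        e[n] = d[n] - y
        norm = eps + np.vdot(x_vec, x_vec).real        # eps + ||x(n)||^2
        w = w + (beta / norm) * x_vec * np.conj(e[n])  # normalized update
    return e, w
```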
3.2) Recursive Least Squares (RLS) Algorithm:-
The recursive least squares (RLS) adaptive filter is an algorithm which recursively finds the
filter coefficients that minimize a weighted linear least squares cost function relating to the
input signal. RLS aims to reduce the weighted least squares error
$$\mathcal{E}(n) = \sum_{i=0}^{n} \lambda^{\,n-i}\,|e(i)|^2 \qquad (1)$$
The least squares error requires no statistical information about x(n) or d(n); the input
signals are treated as deterministic, and the algorithm exhibits extremely fast convergence.
Minimizing this weighted least squares error leads to an efficient recursive algorithm known
as recursive least squares.
Figure: FIR filter structure - input x(i) is processed by the coefficient vector
[w_n(0), w_n(1), ..., w_n(p)] to produce the output y(i).
Let us consider the design of an FIR adaptive Wiener filter and find the filter coefficient
vector
$$w_n = \big[w_n(0),\, w_n(1),\, \ldots,\, w_n(p)\big]^T$$
In the weighted error, the weights $\lambda^{\,n-i}$ for $i = 0, 1, \ldots, n$ take the values
$\lambda^{n}, \lambda^{n-1}, \ldots, \lambda, 1$, where $0 < \lambda \le 1$ is the exponential
weighting (forgetting) factor, so the most recent errors are weighted most heavily.
Figure: adaptive filter block diagram - the variable filter w_n produces y(i) from x(i); the
error e(i) = d(i) − y(i) drives the update algorithm.
The error is
$$e(i) = d(i) - \hat{y}(i) = d(i) - w_n^T x(i) \qquad (2)$$
where d(i) is the desired signal and $\hat{y}(i)$ is the filter output at time i using the latest set
of filter coefficients $w_n(k)$.
Differentiating equation (2) with respect to $w_n^*(k)$ gives
$$\frac{\partial e^*(i)}{\partial w_n^*(k)} = \frac{\partial}{\partial w_n^*(k)}\big(d(i) - w_n^T x(i)\big)^* = -\,x^*(i-k) \qquad (3)$$
The least squares error is minimized by setting the derivative of eq. (1) with respect to
$w_n^*(k)$ equal to zero:
$$\frac{\partial \mathcal{E}(n)}{\partial w_n^*(k)} = \sum_{i=0}^{n}\lambda^{\,n-i}\,\frac{\partial}{\partial w_n^*(k)}\big[e(i)\,e^*(i)\big] = \sum_{i=0}^{n}\lambda^{\,n-i}\,e(i)\,\frac{\partial e^*(i)}{\partial w_n^*(k)} = 0$$
Substituting the value of $\partial e^*(i)/\partial w_n^*(k)$ from eq. (3),
$$\sum_{i=0}^{n}\lambda^{\,n-i}\,e(i)\,x^*(i-k) = 0 \qquad (4)$$
where k = 0, 1, 2, ..., p.
Note that
$$y(n) = w_n^T x(n)$$
Substituting $e(i) = d(i) - \sum_{l=0}^{p} w_n(l)\,x(i-l)$ into eq. (4) gives
$$\sum_{i=0}^{n}\lambda^{\,n-i}\Big\{d(i) - \sum_{l=0}^{p} w_n(l)\,x(i-l)\Big\}\,x^*(i-k) = 0$$
which may be rearranged as
$$\sum_{i=0}^{n}\lambda^{\,n-i}\,d(i)\,x^*(i-k) \;-\; \sum_{l=0}^{p} w_n(l)\sum_{i=0}^{n}\lambda^{\,n-i}\,x(i-l)\,x^*(i-k) = 0 \qquad (5)$$
In matrix form these are the deterministic normal equations
$$R_x(n)\,w_n = r_{dx}(n) \qquad (6)$$
where $R_x(n)$ is the exponentially weighted deterministic autocorrelation matrix of x(n),
and $r_{dx}(n)$ is the deterministic cross-correlation between d(n) and x(n).
$$r_{dx}(n) = \sum_{i=0}^{n}\lambda^{\,n-i}\,x^*(i)\,d(i) = \sum_{i=0}^{n-1}\lambda^{\,n-i}\,x^*(i)\,d(i) + x^*(n)\,d(n) = \lambda\sum_{i=0}^{n-1}\lambda^{\,n-1-i}\,x^*(i)\,d(i) + x^*(n)\,d(n)$$
so that
$$r_{dx}(n) = \lambda\,r_{dx}(n-1) + x^*(n)\,d(n)$$
Having derived the equations that define the optimum filter coefficients, we now evaluate the
minimum squared error from eqn. (1):
$$\mathcal{E}(n) = \sum_{i=0}^{n}\lambda^{\,n-i}\,|e(i)|^2 = \sum_{i=0}^{n}\lambda^{\,n-i}\,e(i)\,e^*(i) = \sum_{i=0}^{n}\lambda^{\,n-i}\,e(i)\Big[d(i) - \sum_{l=0}^{p}w_n(l)\,x(i-l)\Big]^*$$
$$= \sum_{i=0}^{n}\lambda^{\,n-i}\,e(i)\,d^*(i) \;-\; \sum_{l=0}^{p}w_n^*(l)\sum_{i=0}^{n}\lambda^{\,n-i}\,e(i)\,x^*(i-l)$$
From eqn. (4), the second term is zero, and the minimum error is
$$\{\mathcal{E}(n)\}_{\min} = \sum_{i=0}^{n}\lambda^{\,n-i}\,e(i)\,d^*(i) = \sum_{i=0}^{n}\lambda^{\,n-i}\Big\{d(i) - \sum_{l=0}^{p}w_n(l)\,x(i-l)\Big\}\,d^*(i)$$
$$= \sum_{i=0}^{n}\lambda^{\,n-i}\,|d(i)|^2 \;-\; \sum_{l=0}^{p}w_n(l)\sum_{i=0}^{n}\lambda^{\,n-i}\,d^*(i)\,x(i-l) = \|d(n)\|^2 - r_{dx}^H(n)\,w_n$$
where $\|d(n)\|^2$ is the weighted norm of the vector
$$d(n) = \big[d(n),\, d(n-1),\, \ldots,\, d(0)\big]^T$$
$$\|d(n)\|^2 = \sum_{i=0}^{n}\lambda^{\,n-i}\,|d(i)|^2$$
and the elements of $r_{dx}^H(n)$ are
$$\big[r_{dx}^H(n)\big]_l = \sum_{i=0}^{n}\lambda^{\,n-i}\,x(i-l)\,d^*(i)$$
Since $R_x(n)$ and $r_{dx}(n)$ both depend on n, solving the deterministic normal equations
directly for each value of n is impractical; instead, we derive a recursive solution of the form
$$w_n = R_x^{-1}(n)\,r_{dx}(n)$$
This recursion is derived by first expressing $r_{dx}(n)$ in terms of $r_{dx}(n-1)$, and then
deriving a recursion that allows us to evaluate $R_x^{-1}(n)$ in terms of $R_x^{-1}(n-1)$ and
the new data vector.
First, the inverse autocorrelation matrix $R_x^{-1}(n)$ is found using the Woodbury identity,
$$\big(A + u\,v^H\big)^{-1} = A^{-1} - \frac{A^{-1}\,u\,v^H\,A^{-1}}{1 + v^H\,A^{-1}\,u}$$
We know that $\{x^*(n)\}^H = x^T(n)$ and that
$$R_x(n) = \lambda\,R_x(n-1) + x^*(n)\,x^T(n)$$
So, defining $P(n) = R_x^{-1}(n)$ and applying the Woodbury identity to the recursion for
$R_x(n)$ gives
$$P(n) = \frac{1}{\lambda}\Big[P(n-1) - g(n)\,x^T(n)\,P(n-1)\Big]$$
where the gain vector is
$$g(n) = \frac{\lambda^{-1}\,P(n-1)\,x^*(n)}{1 + \lambda^{-1}\,x^T(n)\,P(n-1)\,x^*(n)} \qquad (12)$$
which also satisfies
$$g(n) = P(n)\,x^*(n)$$
Thus the gain vector is the solution to the linear equation $R_x(n)\,g(n) = x^*(n)$; comparing
with eq. (6), $w_n$ is replaced by g(n) and $r_{dx}(n)$ by $x^*(n)$.
To complete the recursion, we must derive the time-update equation for the coefficient vector
$w_n$, starting from
$$w_n = P(n)\,r_{dx}(n)$$
$$w_n = P(n)\,r_{dx}(n) = P(n)\big[\lambda\,r_{dx}(n-1) + d(n)\,x^*(n)\big] = \lambda\,P(n)\,r_{dx}(n-1) + d(n)\,P(n)\,x^*(n)$$
but
$$g(n) = P(n)\,x^*(n)$$
so the coefficient update can be written in terms of the a priori error
$\alpha(n) = d(n) - w_{n-1}^T x(n)$ as follows.
Parameter:-
p = filter order, λ = exponential weighting factor, δ = small positive constant
Initialization:-
$w_0 = 0$, $P(0) = \delta^{-1} I$
Computation:- For n = 1, 2, 3, ...
$$w_n = w_{n-1} + g(n)\,\alpha(n)$$
least square sense ) and only a small correction needs to be applied the coefficient .
Where (n) is large the current set of filter coefficient are not performing must be applied to
the updated coefficient. The evolution of the gain vector g (n) and the inverse autocorrelation
matrix p (n), it is necessary to compute the product
This is known as filtered information vector and it is used to the calculate of both g (n) and p
(n).
First the initialization of the RLS algorithm since the RLS algorithm involves the recursive
updating of the vector wn and the inverse autocorrelation matrix p(n) initial condition for both
these terms are required.
One approach is to build the autocorrelation matrix and cross-correlation vector from data
prior to n = 0:
$$P(0) = \Big[\sum_{i=-p}^{0}\lambda^{-i}\,x^*(i)\,x^T(i)\Big]^{-1}, \qquad r_{dx}(0) = \sum_{i=-p}^{0}\lambda^{-i}\,d(i)\,x^*(i)$$
and then initialize
$$w_0 = P(0)\,r_{dx}(0)$$
The advantage of this approach is that optimality is preserved at each step, since the RLS
algorithm is initialized at n = 0 with the vector $w_0$ that minimizes the weighted least
squares error $\mathcal{E}(0)$.
Disadvantage:-
This approach requires the direct inversion of $R_x(0)$, which takes on the order of
$(p+1)^3$ operations, and there is a delay of (p+1) samples before any updates are performed.
Another approach that may be used is to initialize the autocorrelation matrix as
$$R_x(0) = \delta I, \qquad P(0) = \delta^{-1} I$$
where δ is a small positive constant. The disadvantage of this approach is that it introduces a
bias into the least squares solution.
The RLS algorithm requires on the order of $p^2$ operations per update: the evaluation of
z(n) requires $(p+1)^2$ multiplications, computing the gain vector g(n) requires 2(p+1)
multiplications, finding the a priori error α(n) requires another (p+1) multiplications, and the
update of the inverse autocorrelation matrix P(n) requires $2(p+1)^2$ multiplications, for a
total of $3(p+1)^2 + 3(p+1)$ multiplications per output point.
Therefore RLS involves an increase in computational complexity over the LMS algorithm,
but the RLS algorithm converges faster than the LMS algorithm and is less sensitive to the
eigenvalue spread of the input autocorrelation matrix.
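The pieces above assemble into the following sketch of the exponentially weighted RLS recursion (the function name, argument names and default values of λ and δ are my own; the update order is z(n), g(n), α(n), w_n, P(n)):

```python
import numpy as np

def rls(x, d, p, lam=0.99, delta=0.01):
    """Exponentially weighted RLS for a pth-order FIR filter.
    Per sample: z = P x*, g = z / (lam + x^T z), alpha = d - w^T x,
    w <- w + alpha * g, P <- (P - g z^H) / lam."""
    w = np.zeros(p + 1, dtype=complex)
    P = np.eye(p + 1, dtype=complex) / delta        # P(0) = delta^{-1} I
    e = np.zeros(len(x), dtype=complex)
    for n in range(len(x)):
        x_vec = np.array([x[n - k] if n - k >= 0 else 0
                          for k in range(p + 1)], dtype=complex)
        z = P @ np.conj(x_vec)                      # filtered information vector
        g = z / (lam + x_vec @ z)                   # gain vector g(n)
        alpha = d[n] - w @ x_vec                    # a priori error alpha(n)
        e[n] = alpha
        w = w + alpha * g                           # coefficient update
        P = (P - np.outer(g, np.conj(z))) / lam     # inverse autocorrelation update
    return e, w
```

Counting the matrix-vector products in this loop reproduces the roughly $3(p+1)^2$ multiplications per sample quoted above, compared with the (2p+3) multiplications of the LMS algorithm.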
FLOW CHART OF RLS ALGORITHM:-
Figure: flow chart of the RLS algorithm - start; perform the RLS update for the current
sample; if n = N, end; otherwise continue with the next sample.
3.3) Minimum Mean Square Error (MMSE) Algorithm:-
A recurring theme in much of communication, control and signal processing is that of
making systematic estimates, predictions or decisions about some set of quantities, based on
information obtained from measurements of other quantities. This process is commonly
referred to as inference. Typically, inferring the desired information from the measurements
involves incorporating models that represent our prior knowledge or beliefs about how the
measurements relate to the quantities of interest. Inference about continuous random
variables, and ultimately about random processes, is the topic here. One key step is the
introduction of an error criterion that measures, in a probabilistic sense, the error between the
desired quantity and our estimate of it. We focus primarily on choosing the estimate to
minimize the expected or mean value of the square of the error, referred to as the minimum
mean-square-error (MMSE) criterion. We consider the MMSE estimate without imposing
any constraint on the form that the estimator takes; the estimate may also be restricted to a
linear combination of the measurements, a form of estimation referred to as linear minimum
mean-square-error (LMMSE) estimation. Inference problems also arise for discrete random
quantities, which may be numerically specified or non-numerical. In the latter case
especially, the various possible outcomes associated with the random quantity are often
termed hypotheses, and the inference task in this setting is referred to as hypothesis testing,
i.e., the task of deciding which hypothesis applies, given measurements or observations. The
MMSE criterion may not be meaningful in such hypothesis testing problems, but we can for
instance aim to minimize the probability of an incorrect inference regarding which
hypothesis actually applies. The squared-error cost of an estimate $\hat{x}$ of x, and its
expected value, are
$$C(x,\hat{x}) = \|x - \hat{x}\|^2$$
$$E\big(\|x - \hat{x}\|^2\big) = \int \|x - \hat{x}\|^2\, p_x(x)\, dx$$
Minimizing the MSE
Let us find the minimum mean-square error (MMSE) estimate of x; we need to solve
$$\min_{\hat{x}}\; E\big(\|x-\hat{x}\|^2\big)$$
We have
$$E\big(\|x-\hat{x}\|^2\big) = E\big((x-\hat{x})^T(x-\hat{x})\big) = E\big(x^T x - 2\hat{x}^T x + \hat{x}^T\hat{x}\big) = E\|x\|^2 - 2\hat{x}^T E x + \hat{x}^T\hat{x}$$
Differentiating with respect to $\hat{x}$ gives the optimal estimate
$$\hat{x}_{mmse} = Ex$$
Since
$$E\big(\|x-\hat{x}_{mmse}\|^2\big) = E\big(\|x-Ex\|^2\big)$$
we can interpret this via the MVD: for any random variable z we have
$$E\|z\|^2 = E\|z - Ez\|^2 + \|Ez\|^2$$
Applying this to z = x − x̂ gives
$$E\big(\|x-\hat{x}\|^2\big) = E\big(\|x-Ex\|^2\big) + \|Ex - \hat{x}\|^2$$
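As a simple check of this result (an example of my own, not from the text): if x is a scalar random variable uniformly distributed on [0, 1], then
$$\hat{x}_{mmse} = Ex = \tfrac{1}{2}, \qquad E\big((x - \hat{x}_{mmse})^2\big) = \mathrm{var}(x) = \tfrac{1}{12}$$
and any other constant estimate $\hat{x}$ incurs the additional penalty $(Ex - \hat{x})^2$, exactly as the decomposition above predicts.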
4.3 Comparison between LMS and RLS Algorithms:-
Table: output Signal-to-Interference Ratio (SIR) comparison of the LMS and RLS algorithms
(see the simulation sketch below).
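Since the comparison table itself did not survive, here is a small simulation sketch that can reproduce this kind of comparison (the system-identification setup and all parameter values are assumptions of mine; it reuses the lms and rls functions sketched in the earlier sections):

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples, p = 2000, 3
h_true = np.array([0.8, -0.4, 0.2, 0.1])       # unknown system to identify

x = rng.standard_normal(n_samples)
d = np.convolve(x, h_true)[:n_samples] + 0.01 * rng.standard_normal(n_samples)

_, e_lms, _ = lms(x, d, p, mu=0.01)             # from the LMS sketch above
e_rls, _ = rls(x, d, p, lam=0.99, delta=0.01)   # from the RLS sketch above

# RLS typically converges faster and reaches a comparable or lower error
# than LMS on the same data, at a higher cost per sample.
print("LMS steady-state MSE:", np.mean(np.abs(e_lms[-500:]) ** 2))
print("RLS steady-state MSE:", np.mean(np.abs(e_rls[-500:]) ** 2))
```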