HST.582J / 6.555J / 16.456J Biomedical Signal and Image Processing
http://ocw.mit.edu
Spring 2007
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
Harvard-MIT Division of Health Sciences and Technology
HST.582J: Biomedical Signal and Image Processing, Spring 2007
Course Director: Dr. Julie Greenberg
Chapter 12 - RANDOM SIGNALS AND LINEAR SYSTEMS
© Bertrand Delgutte 1999
Introduction
In Chapter 2, we saw that the impulse response completely characterizes a linear, time-invariant
system because the response to an arbitrary, but known, input can be computed by convolving
the input with the impulse response. The impulse response plays a key role for random signals
as it does for deterministic signals, with the important difference that it is used for computing
time averages of the output from averages of the input. Specifically, we will show that knowing
the impulse response suffices to derive the mean and autocorrelation function of the output
from the mean and autocorrelation function of the input. That autocorrelation functions are
involved is to be expected, since we showed in Chapter 11 that these functions naturally arise
when processing random signals by linear filters.
In Chapter 3, we introduced Fourier analysis for deterministic signals, and showed that this
concept leads to simplifications in the analysis and design of linear, time-invariant systems.
Frequency-domain techniques are as powerful for stationary random signals as they are for
deterministic signals. They lead to the concepts of power spectrum and Wiener filters, which
have numerous applications to system identification and signal detection in noise.
12.1 Response of LTI systems to random signals
Our goal in this section is to derive general formulas for the mean and autocorrelation of the
response of a linear system to a stationary random signal, given both the system impulse response
and the mean and autocorrelation function of the input.
12.1.1 Mean of y[n]
Let x[n] be a random signal used as input to an LTI system with impulse response h[n]. The
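mean of the output y[n] is:
$$\langle y[n] \rangle = \langle x[n] * h[n] \rangle = \Big\langle \sum_{m=-\infty}^{\infty} h[m]\, x[n-m] \Big\rangle_n = \sum_{m=-\infty}^{\infty} h[m]\, \langle x[n-m] \rangle_n \tag{12.1}$$
$$\langle y[n] \rangle = \langle x[n] \rangle \sum_{n=-\infty}^{\infty} h[n] \tag{12.2a}$$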
Cite as: Bertrand Delgutte. Course materials for HST.582J / 6.555J / 16.456J, Biomedical Signal and Image Processing,
Spring 2007. MIT OpenCourseWare (http://ocw.mit.edu), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
As a special case of (12.2), if x[n] has zero mean and the system is stable, the output also has
zero mean.
Further simplification can be obtained by introducing the frequency response H(f) and using
the initial value theorem:
< y[n] > = H(0) < x[n] > (12.2b)
This formula has an intuitive interpretation: The DC component (mean) of the output is the
DC component of the input multiplied by the frequency response evaluated at DC.
12.1.2 Crosscorrelation function between x[n] and y[n]
To obtain the autocorrelation function of the output, it is easier to first derive the crosscorrelation
function between input and output.
$$R_{xy}[k] = \langle x[n]\, y[n+k] \rangle_n = \Big\langle x[n] \sum_{m=-\infty}^{\infty} h[m]\, x[n+k-m] \Big\rangle_n = \sum_{m=-\infty}^{\infty} h[m]\, R_x[k-m] \tag{12.3}$$
This gives the simple result:
$$R_{xy}[k] = h[k] * R_x[k] \tag{12.4}$$
Thus, the crosscorrelation function between the input and the output is the convolution of the
autocorrelation function of the input with the impulse response of the filter.
As an important special case, if the input w[n] is zero-mean, white noise with variance $\sigma_w^2$, the
crosscorrelation function between input and output is
$$R_{wy}[k] = \sigma_w^2\, h[k] \tag{12.5}$$
This result is the basis for a widely-used method of system identification: In order to measure
the impulse response of an unknown LTI system, a white signal is used as input to the system,
and the crosscorrelation function between input and output is computed, giving an estimate of
the impulse response. It can be shown that this method will also work if the white noise is not
the only input (provided that the other inputs are uncorrelated with the white noise), and if the
linear system is followed by a memoryless nonlinearity (Price's theorem). The mathematician
Norbert Wiener has further shown that, in principle, white noise inputs can be used for the
identification of a wider class of nonlinear systems, but such methods are difficult to implement
because small measurement errors or computational inaccuracies can greatly affect the results.
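The identification method based on (12.5) can be illustrated with a short simulation. The following is only a sketch under stated assumptions: the "unknown" system `h_true`, the record length, and the helper name `estimate_impulse_response` are hypothetical choices made for illustration, and the time averages are approximated by sample means over a finite record.

```python
import numpy as np

def estimate_impulse_response(w, y, n_lags):
    """Estimate h[k], k = 0..n_lags-1, using R_wy[k] = sigma_w^2 h[k] (12.5)."""
    n = len(w)
    sigma2 = np.mean(w ** 2)  # sample estimate of the input variance
    # Sample estimate of R_wy[k] = <w[n] y[n+k]> for each non-negative lag k
    r_wy = np.array([np.dot(w[: n - k], y[k:]) / (n - k) for k in range(n_lags)])
    return r_wy / sigma2

rng = np.random.default_rng(0)               # fixed seed for repeatability
h_true = np.array([1.0, 0.5, -0.25, 0.1])    # hypothetical "unknown" system
w = rng.standard_normal(200_000)             # zero-mean, white input
y = np.convolve(w, h_true)[: len(w)]         # response of the system to w[n]

h_est = estimate_impulse_response(w, y, n_lags=6)
print(np.round(h_est, 2))
```

The first four estimated coefficients should be close to `h_true` and the remaining ones close to zero; the residual error shrinks as the record length grows.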
12.1.3 Autocorrelation function of y[n]
We can now give a general formula for the autocorrelation function of the output of an LTI
system:
$$R_y[k] = \langle y[n]\, y[n-k] \rangle_n = \Big\langle y[n] \sum_{m=-\infty}^{\infty} h[m]\, x[n-k-m] \Big\rangle_n = \sum_{m=-\infty}^{\infty} h[m]\, R_{xy}[m+k] \tag{12.6}$$
Again, we obtain a simple result:
$$R_y[k] = h[-k] * R_{xy}[k] \tag{12.7}$$
Combining (12.4) and (12.7) yields:
$$R_y[k] = h[-k] * (h[k] * R_x[k]) = (h[k] * h[-k]) * R_x[k] = \tilde{R}_h[k] * R_x[k] \tag{12.8a}$$
with
$$\tilde{R}_h[k] \triangleq h[k] * h[-k] = \sum_{n=-\infty}^{\infty} h[n]\, h[n+k] \tag{12.8b}$$
$\tilde{R}_h[k]$ is called the deterministic autocorrelation function of h[k]. It resembles a
true autocorrelation function, with the important difference that the autocorrelation function of
a random signal is an average over time, applicable to signals that have a finite mean power (but
an infinite energy), while a deterministic autocorrelation function is a sum over time, applicable
to signals that have a finite energy, but zero mean power. Deterministic autocorrelation functions
have similar properties to those of true autocorrelation functions: They are even, and have a
maximum at the origin.
Because $y[n] - \mu_y$ is the response of the filter to the centered signal $x[n] - \mu_x$, (12.8a) also holds
for the autocovariance function, which is the autocorrelation function of the centered signal:
$$C_y[k] = \tilde{R}_h[k] * C_x[k] \tag{12.9}$$
12.1.4 Example
For example, consider a first-order FIR filter that approximates a differentiator:
$$y[n] = x[n] - x[n-1]$$
The deterministic autocorrelation function can be determined by inspection as:
$$\tilde{R}_h[k] = 2\,\delta[k] - \delta[k-1] - \delta[k+1]$$
From (12.8a), the autocorrelation of the output is given as a function of the input as:
$$R_y[k] = 2\,R_x[k] - R_x[k-1] - R_x[k+1]$$
12.1.5 White noise inputs
The case of white-noise inputs is again of special interest. For these inputs, $C_w[k] = \sigma_w^2\, \delta[k]$,
so that the autocovariance function of the output signal is given by:
$$C_y[k] = \sigma_w^2\, \tilde{R}_h[k] \tag{12.10}$$
and, in particular, the variance of the output signal is:
$$\sigma_y^2 = C_y[0] = \sigma_w^2 \sum_{n=-\infty}^{\infty} h[n]^2 = \sigma_w^2\, E_h \tag{12.11}$$
Thus, the variance of the output is equal to the variance of the white input multiplied by the
energy in the impulse response. This simple relationship only holds for white inputs: In general,
the variance of a filtered random signal depends not only on the variance of the input, but also
on the values of its covariance for non-zero lags.
12.1.6 Example 1: First-order autoregressive filter
Consider now a first-order lowpass filter with unit-sample response $a^n u[n]$ excited by zero-mean,
white noise. The deterministic autocorrelation function of the filter is:
$$\tilde{R}_h[k] = \sum_{n=0}^{\infty} a^n\, a^{n+|k|} = a^{|k|} \sum_{n=0}^{\infty} a^{2n} = \frac{a^{|k|}}{1 - a^2} \tag{12.12}$$
If the variance of a zero-mean, white input is $\sigma_w^2$, then the autocorrelation function of the output
is:
$$R_y[k] = \sigma_w^2\, \frac{a^{|k|}}{1 - a^2} \tag{12.13}$$
and in particular the variance is
$$\sigma_y^2 = \frac{\sigma_w^2}{1 - a^2} \tag{12.14}$$
$R_y[k]$ is shown in Fig. 12.1A for several values of the parameter a. As a increases from 0 to 1,
the autocorrelation spreads to larger lags, indicating that the memory of the system increases.
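The closed form (12.12) can be checked numerically. The sketch below assumes a hypothetical value a = 0.8 and truncates the unit-sample response to 200 terms (an approximation, since the true response is infinite); it compares the deterministic autocorrelation computed with `np.correlate` against $a^{|k|}/(1 - a^2)$ for lags up to ±10.

```python
import numpy as np

a = 0.8                          # hypothetical filter parameter, |a| < 1
n_terms = 200                    # truncation length; a**200 is negligible
h = a ** np.arange(n_terms)      # truncated unit-sample response a^n u[n]

# Deterministic autocorrelation sum_n h[n] h[n+k] for k = -(L-1) .. (L-1)
r_h = np.correlate(h, h, mode="full")
lags = np.arange(-(n_terms - 1), n_terms)

closed_form = a ** np.abs(lags) / (1.0 - a ** 2)   # right side of (12.12)
center = n_terms - 1                               # index of lag k = 0
window = slice(center - 10, center + 11)           # lags -10 .. 10
max_err = np.max(np.abs(r_h[window] - closed_form[window]))
print(max_err)
```

Multiplying `r_h` by the input variance gives the output autocorrelation (12.13); repeating the computation for several values of a shows the spreading to larger lags described above.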
12.1.7 Example 2: N-point moving average
We introduce another example that will be useful in Chapter 13 for estimating parameters of
random signals from finite data: the rectangular, or boxcar filter of length N. The impulse
response of the filter is:
$$\Pi_N[n] = \begin{cases} 1 & \text{if } 0 \le n \le N-1 \\ 0 & \text{otherwise} \end{cases} \tag{12.15}$$
It can be verified that its deterministic autocorrelation function is a triangular function of width
2N − 1 and height N:
$$\Lambda_N[k] \triangleq \Pi_N[k] * \Pi_N[-k] = \begin{cases} N - |k| & \text{if } |k| < N \\ 0 & \text{otherwise} \end{cases} \tag{12.16}$$
If a random signal x[n] with covariance function $C_x[k]$ is used as input to this rectangular filter,
the variance of the output y[n] will be:
$$\sigma_y^2 = C_y[0] = \left. \left( C_x[k] * \Lambda_N[k] \right) \right|_{k=0} = \sum_{k=-(N-1)}^{N-1} (N - |k|)\, C_x[k] \tag{12.17}$$
Thus, $\sigma_y^2$ depends on the values of $C_x[k]$ for $-(N-1) \le k \le N-1$. This is a general result
for all FIR filters of length N.
In the special case of a white noise input with variance $\sigma_w^2$, this expression simplifies to:
$$\sigma_y^2 = \sigma_w^2\, \Lambda_N[0] = N \sigma_w^2 \tag{12.18}$$
Thus, adding N consecutive samples of a white signal increases the variance by a factor of N.
The rectangular filter also has the effect of introducing correlation between N − 1 successive
samples, i.e. $C_y[k] \ne 0$ for |k| < N. Figure 12.1C shows $C_y[k]$ for different values of N.
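The weighted sum (12.17) is easy to evaluate directly. The sketch below is illustrative only: the function names and the parameter values (a white input with variance 2, and an AR(1) input with a = 0.5 driven by unit-variance noise) are assumptions chosen for the example, not values from the text.

```python
import numpy as np

def boxcar_output_variance(c_x, n_taps):
    """Output variance of the N-point sum, per (12.17).

    c_x is a function returning the input autocovariance C_x[k] for an array of lags.
    """
    lags = np.arange(-(n_taps - 1), n_taps)
    weights = n_taps - np.abs(lags)            # triangular weights N - |k|
    return float(np.sum(weights * c_x(lags)))

def white_cov(lags, sigma2=2.0):
    """C_w[k] = sigma_w^2 delta[k] for a white input (sigma2 is an assumed value)."""
    return sigma2 * (lags == 0)

def ar1_cov(lags, a=0.5):
    """Covariance a^|k| / (1 - a^2) of an AR(1) input, from (12.13) with unit-variance drive."""
    return a ** np.abs(lags) / (1.0 - a ** 2)

var_white = boxcar_output_variance(white_cov, 8)   # should equal N * sigma_w^2, as in (12.18)
var_ar1 = boxcar_output_variance(ar1_cov, 2)       # equals 2 C_x[0] + 2 C_x[1]
print(var_white, var_ar1)
```

For the correlated AR(1) input the result differs from N times the input variance, which illustrates why the simple relation (12.18) holds only for white inputs.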
12.1.8 Generalization to two inputs and two outputs
The preceding results can be generalized to the case of two correlated inputs processed through
two linear systems. Specifically, assume that the random signals $x_1[n]$ and $x_2[n]$ are processed
through the filters $h_1[n]$ and $h_2[n]$, respectively, to give the two outputs $y_1[n]$ and $y_2[n]$. Further
assume that the crosscorrelation function $R_{x_1 x_2}[k]$ is known. We will derive an expression for
$R_{y_1 y_2}[k]$:
$$R_{y_1 y_2}[k] = \langle y_1[n]\, y_2[n+k] \rangle_n = \Big\langle \sum_m h_1[m]\, x_1[n-m]\, y_2[n+k] \Big\rangle_n$$
$$R_{y_1 y_2}[k] = \sum_m h_1[m]\, R_{x_1 y_2}[k+m] = h_1[-k] * R_{x_1 y_2}[k] \tag{12.19}$$
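To complete the proof, we need to derive an expression for $R_{x_1 y_2}[k]$:
$$R_{x_1 y_2}[k] = \langle x_1[n]\, y_2[n+k] \rangle_n = \Big\langle x_1[n] \sum_m h_2[m]\, x_2[n+k-m] \Big\rangle_n$$
$$R_{x_1 y_2}[k] = \sum_m h_2[m]\, R_{x_1 x_2}[k-m] = h_2[k] * R_{x_1 x_2}[k] \tag{12.20a}$$
Substituting (12.20a) into (12.19) yields:
$$R_{y_1 y_2}[k] = h_1[-k] * h_2[k] * R_{x_1 x_2}[k] = \tilde{R}_{h_1 h_2}[k] * R_{x_1 x_2}[k] \tag{12.21a}$$
where
$$\tilde{R}_{h_1 h_2}[k] \triangleq h_1[-k] * h_2[k] = \sum_n h_1[n]\, h_2[n+k] \tag{12.21b}$$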
is the deterministic crosscorrelation function of $h_1[n]$ and $h_2[n]$. Three cases are of special
interest:
1. If the two inputs are uncorrelated, i.e. if $R_{x_1 x_2}[k] = 0$ for all k, then the outputs are also
uncorrelated.
2. On the other hand, if the two inputs are identical, i.e. if $x_1[n] = x_2[n] = x[n]$, then
$$R_{y_1 y_2}[k] = \tilde{R}_{h_1 h_2}[k] * R_x[k]$$
In general, $R_{y_1 y_2}[k] \ne 0$. Thus, the outputs of two filters excited by the same signal are,
in general, correlated. The only exception is if $\tilde{R}_{h_1 h_2}[k] = 0$ for all k, a condition best
expressed in the frequency domain.
3. If in addition $h_1[n] = h_2[n] = h[n]$, so that $y_1[n] = y_2[n] = y[n]$, then (12.21) reduces
to (12.8), as expected.
12.2 The power spectrum
12.2.1 Definition and properties
We have just derived a general expression for the autocorrelation function of the response y[n]
of a linear filter h[n] to a random input x[n]:
$$R_y[k] = R_x[k] * (h[k] * h[-k]) = R_x[k] * \tilde{R}_h[k] \tag{12.22}$$
Because this equation includes a double convolution, it can be simplified by introducing the
discrete-time Fourier transform $S_x(f)$ of the autocorrelation function:
$$S_x(f) \triangleq \sum_{k=-\infty}^{\infty} R_x[k]\, e^{-j 2\pi f k} \tag{12.23a}$$
$S_x(f)$ is called the power spectrum of the random signal x[n]. The autocorrelation function can
be recovered from the power spectrum by means of an inverse DTFT:
$$R_x[k] = \int_{-1/2}^{1/2} S_x(f)\, e^{j 2\pi f k}\, df \tag{12.23b}$$
Because $R_x[k]$ is always real and even, its transform $S_x(f)$ is also real and even:
$$R_x[k] = R_x[-k] = R_x^*[k] \;\longleftrightarrow\; S_x(f) = S_x(-f) = S_x^*(f) \tag{12.24}$$
Substituting (12.23a) into (12.22), and making use of the convolution theorem, leads to a simple
expression for the power spectrum $S_y(f)$ of the output signal as a function of the power spectrum
of the input:
$$S_y(f) = S_x(f)\, H(f)\, H^*(f) = S_x(f)\, |H(f)|^2 \tag{12.25}$$
Thus, the power spectrum of the output of an LTI system is the power spectrum of the input
multiplied by the magnitude squared of the frequency response. This simple result has many
important applications.
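The equivalence between the time-domain relation (12.22) and the frequency-domain relation (12.25) can be verified numerically for finite-length sequences. In the sketch below, the FIR filter `h`, the three-point input autocorrelation `r_x`, and the helper `dtft` are all hypothetical choices made for illustration.

```python
import numpy as np

def dtft(seq, first_lag, f):
    """DTFT of a finite sequence whose first sample sits at index first_lag."""
    k = first_lag + np.arange(len(seq))
    return np.exp(-2j * np.pi * np.outer(f, k)) @ seq

f = np.linspace(-0.5, 0.5, 101)
h = np.array([1.0, -0.6, 0.2])            # hypothetical FIR filter, n = 0, 1, 2
r_x = np.array([0.5, 1.0, 0.5])           # assumed R_x[k] for k = -1, 0, 1

r_h = np.correlate(h, h, mode="full")     # deterministic autocorrelation, lags -2 .. 2
r_y = np.convolve(r_h, r_x)               # output autocorrelation (12.22), lags -3 .. 3

s_x = dtft(r_x, -1, f).real               # input power spectrum, 1 + cos(2 pi f)
s_y_time = dtft(r_y, -3, f).real          # transform of the output autocorrelation
s_y_freq = s_x * np.abs(dtft(h, 0, f)) ** 2   # right side of (12.25)
print(np.max(np.abs(s_y_time - s_y_freq)))
```

The two spectra agree to within rounding error, and both are nonnegative at every frequency on the grid.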
12.2.2 Physical interpretation
The power spectrum of a random signal represents the contribution of each frequency component
of the signal to the total power in the signal. To see this, we first note that, applying the initial
value theorem to (12.23b), the signal power is equal to the area under the power spectrum:
$$P_x = R_x[0] = \int_{-1/2}^{1/2} S_x(f)\, df \tag{12.26}$$
Suppose now that x[n] is input to a narrow bandpass filter B(f) with center frequency $f_0$ and
bandwidth $\Delta f$:
$$|B(f)| = \begin{cases} 1 & \text{for } f_0 - \frac{\Delta f}{2} < f < f_0 + \frac{\Delta f}{2} \pmod 1 \\ 0 & \text{otherwise} \end{cases} \tag{12.27}$$
The power in the output signal y[n] is:
$$P_y = \int_{-1/2}^{1/2} S_y(f)\, df = \int_{-1/2}^{1/2} S_x(f)\, |B(f)|^2\, df \tag{12.28}$$
Because the filter is very narrow, this integral is approximately the value of the integrand for
$f = f_0$ multiplied by the width of the integration interval:
$$P_y \approx S_x(f_0)\, |B(f_0)|^2\, \Delta f \approx S_x(f_0)\, \Delta f \tag{12.29}$$
Thus, the power in the bandpass-filtered signal is equal to the power spectrum of the input at
the center of the passband multiplied by the filter bandwidth. This shows that $S_x(f_0)$ represents
the contribution of frequency components near $f_0$ to the power of x[n]. Because the power at the
output of any bandpass filter is always nonnegative, the power spectrum must be nonnegative
for all frequencies:
$$S_x(f) \ge 0 \quad \text{for all } f \tag{12.30}$$
Writing this constraint for f = 0 gives an important property of autocorrelation functions:
$$\sum_{k=-\infty}^{\infty} R_x[k] = S_x(0) \ge 0 \tag{12.31}$$
In general, because the positivity condition holds not only for f = 0, but for all frequencies, it
strongly constrains the set of possible autocorrelation functions.
12.2.3 Example 1: Sine wave
The autocorrelation function of a sine wave s[n] with amplitude A and frequency $f_0$ is a cosine
wave:
$$R_s[k] = \frac{A^2}{2} \cos 2\pi f_0 k \tag{12.32}$$
Therefore, the power spectrum consists of impulses at frequencies $\pm f_0$ modulo 1:
$$S_s(f) = \frac{A^2}{4} \left[ \delta(f - f_0) + \delta(f + f_0) \right] \tag{12.33}$$
If the sine wave is input to a filter with frequency response H(f), the power spectrum of the
output will be
$$S_y(f) = |H(f_0)|^2\, \frac{A^2}{4} \left[ \delta(f - f_0) + \delta(f + f_0) \right] \tag{12.34}$$
As expected, this is the power spectrum of a sine wave with amplitude $A\,|H(f_0)|$. In general, the
presence of impulses in the power spectrum implies that the signal contains periodic components.
A special case is the DC component, which appears as an impulse at the origin in the power
spectrum. It is generally desirable to remove these periodic components, including the DC,
before estimating the power spectrum of a random signal.
12.2.4 Example 2: White noise
The autocorrelation function of zero-mean, white noise w[n] is an impulse at the origin:
$$R_w[k] = \sigma_w^2\, \delta[k] \tag{12.35}$$
Therefore, its power spectrum is a constant equal to the variance:
$$S_w(f) = \sigma_w^2 \tag{12.36}$$
i.e., white noise has equal power at all frequencies. This result justifies the term white noise by
analogy with white light, which contains all visible frequencies of the electromagnetic spectrum.
From (12.25), if white noise is used as input to a filter with frequency response H(f), the power
spectrum of the output y[n] is equal to the magnitude squared of the frequency response within
a multiplicative factor:
$$S_y(f) = \sigma_w^2\, |H(f)|^2 \tag{12.37}$$
This result has two important interpretations.
1. An arbitrary random signal with power spectrum $S_x(f)$ which does not contain impulses
can be considered as the response of a filter with frequency response $|H(f)| = \sqrt{S_x(f)}$ to
zero-mean, white noise with unit variance. This point of view is useful because it replaces
the problem of estimating a power spectrum by the simpler one of estimating parameters
of a linear filter. For example, in autoregressive spectral estimation, an all-pole filter model
is fitted to random data. This technique is described in Chapter 8.
2. To generate noise with an arbitrary power spectrum, it suffices to pass white noise through
a filter whose frequency response is the square root of the desired spectrum. This is always
possible because, as shown above, power spectra are nonnegative.
12.2.5 Example 3: Differentiator
Let y[n] be the response of the first-order differentiator to zero-mean, white noise, i.e. y[n] =
w[n] − w[n − 1]. We have previously derived the autocorrelation function:
$$R_y[k] = \sigma_w^2\, \left( 2\,\delta[k] - \delta[k-1] - \delta[k+1] \right)$$
Therefore, the power spectrum is
$$S_y(f) = 2 \sigma_w^2\, (1 - \cos 2\pi f)$$
Appropriately, $S_y(f) = 0$ for f = 0 and increases monotonically to reach $4\sigma_w^2$ for f = 1/2.
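The step from the autocorrelation function to the power spectrum of the differentiator output can be written out explicitly. The following worked derivation applies the definition (12.23a) to the three nonzero lags and, as a cross-check, recovers the same result from (12.37) with $H(f) = 1 - e^{-j 2\pi f}$:

```latex
\begin{aligned}
S_y(f) &= \sum_{k=-1}^{1} R_y[k]\, e^{-j 2\pi f k}
        = \sigma_w^2 \left( 2 - e^{-j 2\pi f} - e^{j 2\pi f} \right) \\
       &= 2 \sigma_w^2 \left( 1 - \cos 2\pi f \right)
        = 4 \sigma_w^2 \sin^2 \pi f , \\
\sigma_w^2\, |H(f)|^2 &= \sigma_w^2 \left( 1 - e^{-j 2\pi f} \right) \left( 1 - e^{j 2\pi f} \right)
        = 2 \sigma_w^2 \left( 1 - \cos 2\pi f \right) .
\end{aligned}
```

The form $4 \sigma_w^2 \sin^2 \pi f$ makes the zero at f = 0 and the maximum at f = 1/2 apparent.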
12.2.6 Example 4: First-order autoregressive process
A first-order autoregressive process is obtained by passing zero-mean, white noise through a
first-order, recursive lowpass filter:
$$y[n] = a\, y[n-1] + w[n] \tag{12.38}$$
The filter frequency response is:
$$H(f) = \frac{1}{1 - a\, e^{-j 2\pi f}} \tag{12.39}$$
Applying (12.37) gives the power spectrum of the output:
$$S_y(f) = \sigma_w^2\, |H(f)|^2 = \frac{\sigma_w^2}{1 - 2 a \cos 2\pi f + a^2} \tag{12.40}$$
Taking the inverse Fourier transform of $S_y(f)$ gives the autocorrelation function that we derived
previously:
$$R_y[k] = \sigma_w^2\, \frac{a^{|k|}}{1 - a^2} \tag{12.13}$$
Fig. 12.1B shows $S_y(f)$ for different values of a. As a approaches 1, and the effective duration
of the impulse response (and autocorrelation function) increases, the power spectrum becomes
increasingly centered around the origin.
12.2.7 Example 5: Rectangular filter
Our last example is the output of a rectangular filter of length N to a zero-mean, white noise
input. The autocorrelation function has a triangular shape:
$$R_y[k] = \sigma_w^2\, \Lambda_N[k], \tag{12.41}$$
where $\Lambda_N[k]$ is the triangular function defined in (12.16). Therefore, the power spectrum is the square of a periodic
sinc function:
$$S_y(f) = \sigma_w^2\, \frac{\sin^2 \pi N f}{\sin^2 \pi f} \tag{12.42}$$
The main lobe has its first zeros at f = ±1/N, and its height is $N^2 \sigma_w^2$. Fig. 12.1D shows $S_y(f)$ for different
values of N. As for the autoregressive process, increasing the filter length makes the output
power spectrum increasingly lowpass.
12.2.8 Physical units of the power spectrum
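For continuous-time signals, the power spectrum is defined by the CTFT of the autocorrelation
function $R_x(\tau)$:
$$S_x(F) = \int_{-\infty}^{\infty} R_x(\tau)\, e^{-j 2\pi F \tau}\, d\tau \tag{12.43a}$$
$$R_x(\tau) = \int_{-\infty}^{\infty} S_x(F)\, e^{j 2\pi F \tau}\, dF \tag{12.43b}$$
If the autocorrelation function is sampled at intervals of $T_s$, the integral in (12.43a) can be
approximated by a sum:
$$S_x(F) \approx T_s \sum_{k=-\infty}^{\infty} R_x(k T_s)\, e^{-j 2\pi F k T_s} \tag{12.44}$$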
Thus, a more physical definition of the power spectrum of a discrete random signal x[n] would
be
$$\tilde{S}_x(F) \triangleq T_s \sum_{k=-\infty}^{\infty} R_x[k]\, e^{-j 2\pi F k T_s} \tag{12.45a}$$
which is related to the power spectrum $S_x(f)$ of (12.23a) by
$$\tilde{S}_x(F) = T_s\, S_x(F/F_s) \tag{12.46}$$
where $F_s = 1/T_s$ is the sampling frequency. The power spectrum defined in (12.45a) has units of
Volt²/Hz if x[n] is expressed in Volts. The autocorrelation function (in Volt²) can be recovered
from $\tilde{S}_x(F)$ by:
$$R_x[k] = \int_{-F_s/2}^{F_s/2} \tilde{S}_x(F)\, e^{j 2\pi F k T_s}\, dF \tag{12.45b}$$
Definition (12.45) can be used as an alternative to (12.23) in situations when physical units are
important. Because this chapter is primarily concerned with mathematical properties, we use
Definition (12.23).
12.2.9 The periodogram - Wiener-Khinchin theorem - optional
Can the power spectrum be directly derived from a random signal without introducing the
autocorrelation function as an intermediate? Specifically, assume that we have N samples of a
random signal {x[n], n = 0, ..., N − 1}. Because the power spectrum represents the contribution
of each frequency band to the total power, one might expect that it would be approximately
proportional to the magnitude squared of the Fourier transform of the data:
$$\hat{S}_x(f) = C\, |X_N(f)|^2 = C \left| \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi f n} \right|^2 \tag{12.47}$$
The proportionality constant C can be determined from dimensionality considerations. Specifically,
we want the integral of the power spectrum over all frequencies to be equal to the mean
power:
$$\int_{-1/2}^{1/2} \hat{S}_x(f)\, df = P_x \tag{12.48}$$
If N is sufficiently large, one has from Parseval's theorem
$$P_x \approx \frac{1}{N} \sum_{n=0}^{N-1} x[n]^2 = \frac{1}{N} \int_{-1/2}^{1/2} |X_N(f)|^2\, df \tag{12.49}$$
Comparing with (12.47), it is clear that the proportionality constant C must equal 1/N for the
power to be conserved. We are thus led to define the periodogram $\hat{S}_x(f)$ as an estimate of the
power spectrum from the data {x[n], n = 0, ..., N − 1}:
$$\hat{S}_x(f) = \frac{1}{N}\, |X_N(f)|^2 = \frac{1}{N} \left| \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi f n} \right|^2 \tag{12.50}$$
One might hope that, as N goes to infinity, the periodogram [...]
$$\cdots \sum_{k=0}^{M-1} \frac{1}{N}\, |X_k(f)|^2, \tag{12.52}$$
where $X_k(f)$ is the DTFT of the k-th segment $x_k[n]$. This identity between the power spectrum
and the limit of the average periodogram is known as the Wiener-Khinchin theorem. Averaging
periodograms keeps the degrees of freedom (N) in the periodogram at only a fraction of the
number of data points (N M), so that each frequency sample of the periodogram is effectively
an average of a large number (M) of data points. In Chapter 13, we introduce techniques for
reliably estimating the power spectrum from finite data records.
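The effect of averaging periodograms can be illustrated with simulated white noise, whose power spectrum is flat and equal to the variance (12.36). The sketch below is not the estimator developed in Chapter 13; it simply averages the periodograms (12.50) of non-overlapping segments, evaluated on the DFT frequency grid. The function name, segment length (N = 64), number of segments (M = 500), and variance are assumed values chosen for illustration.

```python
import numpy as np

def averaged_periodogram(x, seg_len):
    """Average the periodograms (12.50) of consecutive non-overlapping segments."""
    n_seg = len(x) // seg_len
    segments = x[: n_seg * seg_len].reshape(n_seg, seg_len)
    # |X_k(f)|^2 / N on the DFT grid f = m/N, one row per segment
    periodograms = np.abs(np.fft.fft(segments, axis=1)) ** 2 / seg_len
    return periodograms.mean(axis=0)

rng = np.random.default_rng(1)             # fixed seed for repeatability
sigma2 = 4.0                               # assumed white-noise variance
x = np.sqrt(sigma2) * rng.standard_normal(64 * 500)

single = averaged_periodogram(x[:64], 64)  # M = 1: one raw periodogram
averaged = averaged_periodogram(x, 64)     # M = 500 segments
print(single.std(), averaged.std())
```

The raw periodogram fluctuates across frequency by an amount comparable to the spectrum itself, while the averaged estimate stays close to the flat level `sigma2`.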
12.3 The cross spectrum
12.3.1 Definition
We have shown above that the crosscorrelation function between the input and the output of a
linear filter is expressed by
$$R_{xy}[k] = h[k] * R_x[k] \tag{12.53}$$
This convolution can be simplified by introducing the DTFT of the cross-correlation function
$$S_{xy}(f) \triangleq \sum_{k=-\infty}^{\infty} R_{xy}[k]\, e^{-j 2\pi f k} \tag{12.54}$$
$S_{xy}(f)$ is called the cross-spectrum of the signals x[n] and y[n]. Unlike the power spectrum, it
is in general neither positive nor even real-valued. The order of the two signals x and y is
important because
$$R_{yx}[k] = R_{xy}[-k] \;\longleftrightarrow\; S_{yx}(f) = S_{xy}(-f) = S_{xy}^*(f) \tag{12.55}$$
Taking the Fourier transform of (12.53) yields a simple relation between the cross-spectrum and
the power spectrum of the input:
$$S_{xy}(f) = H(f)\, S_x(f) \tag{12.56}$$
This relation is particularly simple if the input is zero-mean, white noise w[n] with variance $\sigma_w^2$:
$$S_{wy}(f) = \sigma_w^2\, H(f) \tag{12.57}$$
Thus, in order to measure the frequency response of an unknown system, it suffices to estimate
the cross-spectrum between a white noise input and the filter response. This technique is often
used in system identification.
12.3.2 Physical interpretation
As for the power spectrum, the cross spectrum evaluated at frequency $f_0$ has a simple physical
interpretation if we introduce the ideal bandpass filter:
$$|B(f)| = \begin{cases} 1 & \text{for } f_0 - \frac{\Delta f}{2} < f < f_0 + \frac{\Delta f}{2} \pmod 1 \\ 0 & \text{otherwise} \end{cases} \tag{12.58}$$
Specifically, consider the arrangement of Fig. 12.2: u[n] is the response of B(f) to x[n], and v[n]
the response of B(f) to y[n]. Note that both u[n] and v[n] are complex because the impulse
response b[n] of the one-sided filter B(f) is complex. We will show that
$$\langle u[n]\, v^*[n] \rangle \approx S_{xy}(f_0)\, \Delta f \tag{12.59}$$
where * denotes the complex conjugate. Specializing Equation (12.21) with $x_1[n] = x[n]$,
$x_2[n] = y[n]$, $h_1[n] = h_2[n] = b[n]$, $y_1[n] = u[n]$, and $y_2[n] = v[n]$, we obtain (see footnote 1)
$$R_{uv}[k] \triangleq \langle u[n]\, v^*[n+k] \rangle = b^*[-k] * b[k] * R_{xy}[k] \tag{12.60a}$$
In the frequency domain, this becomes:
$$S_{uv}(f) = |B(f)|^2\, S_{xy}(f) \tag{12.60b}$$
Applying the initial value theorem to the above expression gives:
$$\langle u[n]\, v^*[n] \rangle = R_{uv}[0] = \int_{-1/2}^{1/2} |B(f)|^2\, S_{xy}(f)\, df \tag{12.61}$$
For small $\Delta f$, the integral on the right side becomes approximately $S_{xy}(f_0)\, \Delta f$, completing
the proof of (12.59). Thus, the cross spectrum evaluated at $f_0$ can be interpreted as the mean
product of the frequency components of x[n] and y[n] centered at $f_0$.
12.3.3 Generalization to two inputs and two outputs
We have derived an expression for the crosscorrelation function between the outputs of two
filters excited by two correlated inputs:
$$R_{y_1 y_2}[k] = h_1[-k] * h_2[k] * R_{x_1 x_2}[k] \tag{12.21a}$$
Footnote 1: For complex signals such as u[n] and v[n], a complex conjugate as in (12.60a) must be introduced in the
definitions of the cross- and autocorrelation functions to ensure that the power is positive.
This formula takes a simple form in the frequency domain:
$$S_{y_1 y_2}(f) = H_1^*(f)\, H_2(f)\, S_{x_1 x_2}(f) \tag{12.62}$$
This result implies that the filter outputs can be uncorrelated (i.e. $S_{y_1 y_2}(f) = 0$) even though
the inputs are correlated, as long as $|H_1(f)|\, |H_2(f)| = 0$ for all f, meaning that the filter
frequency responses do not overlap. This is the case, for example, for a lowpass filter and a
highpass filter having the same cutoff frequency.
12.4 Wiener filters
12.4.1 Linear least-squares estimation
In the preceding section, we have assumed that h[n] and x[n] were known, and derived an
expression for the cross spectrum $S_{xy}(f)$. Many applications lead to a different problem that was
first solved by the mathematician Norbert Wiener in the 1940s: Given two random signals x[n]
and y[n], what is the filter h[n] that does the best job of producing y[n] from x[n]? This problem
has important applications in both system modeling and signal conditioning. Specifically, let
$\hat{y}[n]$ be the estimate of y[n] obtained by processing x[n] through the filter h[n]. By "best" filter,
we mean the one that minimizes the mean power in the error signal $e[n] \triangleq y[n] - \hat{y}[n]$:
$$P_e \triangleq \langle e[n]^2 \rangle = \langle (y[n] - \hat{y}[n])^2 \rangle = \Big\langle \Big( y[n] - \sum_k h[k]\, x[n-k] \Big)^2 \Big\rangle \tag{12.63}$$
Because the relation between the data x[n] and the estimate $\hat{y}[n]$ is a linear one, this is a linear,
least-squares estimation problem. Making use of the linearity of time averages, the mean-square
estimation error $P_e$ can be expanded into the expression:
$$P_e = P_y - 2 \sum_k h[k]\, R_{xy}[k] + \sum_k \sum_l h[k]\, h[l]\, R_x[k-l] \tag{12.64}$$
The power in the error signal is a quadratic function of the filter coefficients h[k]. Therefore it
has a single minimum which can be determined by setting to zero the partial derivatives of $P_e$
with respect to the h[k]:
$$\frac{\partial P_e}{\partial h[k]} = 0 \tag{12.65}$$
This yields the system of linear equations:
$$R_{xy}[k] = \sum_l h[l]\, R_x[k-l] \tag{12.66}$$
It is easily verified that, if (12.66) holds, the prediction error can be written as:
$$P_e = P_y - \sum_k h[k]\, R_{xy}[k] = P_y - P_{\hat{y}} \tag{12.67a}$$
Thus, the power in the desired signal y[n] is the sum of the power in the estimate $\hat{y}[n]$ and the
power in the error signal e[n]. In fact it can be shown more generally that:
$$R_e[k] = R_y[k] - R_{\hat{y}}[k] \tag{12.67b}$$
For any two random signals u[n] and v[n], one has:
$$R_{u+v}[k] = R_u[k] + R_v[k] + R_{uv}[k] + R_{vu}[k] \tag{12.68}$$
Taking $u[n] = \hat{y}[n]$ and v[n] = e[n], (12.67b) implies that $R_{\hat{y}e}[k] = 0$ for all k, i.e. that the
error signal e[n] is uncorrelated with the estimate $\hat{y}[n]$. Because $\hat{y}[n]$ is a weighted sum of input
samples x[n − k], this also means that the error is uncorrelated with the observations x[n − k],
i.e. that $R_{xe}[k] = 0$ for all k. The result that the error is uncorrelated with the
observations is a general property of linear, least-squares estimation which can be used to derive
the system of equations (12.66).
12.4.2 Non-causal Wiener filter
In general, solving the system of equations (12.66) requires knowledge of $R_{xy}[k]$ and $R_x[k]$ for
all k. The exact solution depends on constraints on the filter h[n]. For example, if h[n] is
constrained to be a causal, FIR filter of length N, i.e. if it is zero outside of the interval
[0, N − 1], (12.66) reduces to a system of N linear equations with N unknowns that can be
solved by standard techniques. This is what we do for the special case y[n] = x[n + 1] when
deriving the Yule-Walker equations for linear prediction in Chapter 8. There is another case in
which the solution to (12.66) is easy to find: When there are no constraints on the filter, i.e.
when (12.66) is to be solved for $-\infty < k < \infty$. In this case, the right side of (12.66) is the
convolution $R_x[k] * h[k]$, so that a solution can be obtained by means of the Fourier transform:
$$H(f) = \frac{S_{xy}(f)}{S_x(f)} \tag{12.69}$$
H(f) is called the (non-causal) discrete-time Wiener filter.
Note that (12.69) is the same as (12.56). This means that, if y[n] were exactly derived from
x[n] by a filtering operation, the filter that provides the least-squares estimate of y[n] from x[n]
would be the actual one.
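For the causal FIR case, (12.66) becomes an N-by-N Toeplitz system that can be solved numerically. The sketch below assumes the correlation functions are known exactly: the input is taken to be an AR(1) process with a = 0.5, and y[n] is produced from x[n] by a hypothetical three-tap filter `g`, so that $R_{xy}[k]$ follows from (12.4). The function name `fir_wiener` is an illustrative choice, and a general-purpose solver is used rather than a specialized Toeplitz algorithm.

```python
import numpy as np

def fir_wiener(r_x, r_xy, n_taps):
    """Solve (12.66) for a causal FIR filter h[0..N-1].

    r_x and r_xy are functions returning R_x[k] and R_xy[k] for integer arrays k.
    """
    idx = np.arange(n_taps)
    toeplitz_rx = r_x(idx[:, None] - idx[None, :])   # matrix entries R_x[k - l]
    return np.linalg.solve(toeplitz_rx, r_xy(idx))

a = 0.5                                   # assumed AR(1) parameter of the input
g = np.array([0.7, -0.3, 0.1])            # hypothetical filter relating y[n] to x[n]

def r_x(k):
    """Input autocorrelation a^|k| / (1 - a^2), from (12.13) with unit-variance drive."""
    return a ** np.abs(k) / (1.0 - a ** 2)

def r_xy(k):
    """Crosscorrelation R_xy[k] = sum_m g[m] R_x[k - m], from (12.4)."""
    return sum(g[m] * r_x(k - m) for m in range(len(g)))

h = fir_wiener(r_x, r_xy, n_taps=5)
print(np.round(h, 6))
```

Because y[n] is exactly a filtered version of x[n] in this setup, the solution reproduces the coefficients of `g` followed by zeros, consistent with the remark above that the least-squares filter would be the actual one.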
12.4.3 Application to filtering of additive noise
A very common problem in signal processing is to filter a noisy signal in order to estimate a
desired signal. Specifically, suppose that the noisy signal x[n] is the sum of the desired signal
y[n] plus a disturbance d[n] that is uncorrelated with y[n]:
$$x[n] = y[n] + d[n] \tag{12.70}$$
One has:
$$R_{xy}[k] = R_y[k] + R_{dy}[k] = R_y[k] \tag{12.71}$$
because y[n] and d[n] are uncorrelated. Therefore:
$$S_{xy}(f) = S_y(f) \tag{12.72}$$
Similarly, one has:
$$R_x[k] = R_y[k] + R_{yd}[k] + R_{dy}[k] + R_d[k] = R_y[k] + R_d[k] \tag{12.73}$$
so that:
$$S_x(f) = S_y(f) + S_d(f) \tag{12.74}$$
The optimal filter for estimating y[n] from the noisy signal x[n] is
$$H(f) = \frac{S_{xy}(f)}{S_x(f)} = \frac{S_y(f)}{S_y(f) + S_d(f)} \tag{12.75}$$
As expected, $H(f) \approx 1$ for frequencies where the signal-to-noise ratio $S_y(f)/S_d(f)$ is large,
while $H(f) \approx 0$ when the signal-to-noise ratio is small. Also note that, because power spectra
are real and even, H(f) is also real and even, which means that h[n] is symmetric with respect
to the origin, and therefore non-causal. In applications that require causality, a causal filter
could be obtained by approximating h[n] by a finite-impulse response filter, then delaying the
impulse response of the FIR filter by half its length.
12.4.4 Example
Let the disturbance be zero-mean, white noise with variance $\sigma_w^2$, and the signal a first-order
autoregressive process with parameter a. The signal spectrum is given by (12.40), here with a
unit-variance driving noise:
$$S_y(f) = \frac{1}{1 - 2 a \cos 2\pi f + a^2}$$
The Wiener filter is thus:
$$H(f) = \frac{S_y(f)}{S_y(f) + \sigma_w^2} = \frac{1}{1 + \sigma_w^2\, (1 - 2 a \cos 2\pi f + a^2)}$$
Fig. 12.3A shows the spectra of the signal and the noise for $\sigma_w^2 = 10$ and a = 0.9, while
Fig. 12.3B shows the Wiener filter H(f).
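The Wiener filter of this example can be evaluated numerically for the parameter values quoted for Fig. 12.3 (noise variance 10, a = 0.9). The sketch below only tabulates the formula on a coarse frequency grid; the function name `wiener_gain` and the grid are illustrative choices.

```python
import numpy as np

def wiener_gain(f, a, noise_var):
    """H(f) = S_y / (S_y + sigma_w^2), with S_y(f) = 1 / (1 - 2 a cos(2 pi f) + a^2)."""
    s_y = 1.0 / (1.0 - 2.0 * a * np.cos(2.0 * np.pi * f) + a ** 2)
    return s_y / (s_y + noise_var)

f = np.linspace(0.0, 0.5, 6)                    # frequencies 0, 0.1, ..., 0.5
gain = wiener_gain(f, a=0.9, noise_var=10.0)
for fi, gi in zip(f, gain):
    print(f"f = {fi:.1f}  H(f) = {gi:.4f}")
```

The gain is close to 1 near f = 0, where the lowpass signal dominates the noise, and falls toward 0 at high frequencies, where the noise dominates.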
12.4.5 Applications of Wiener filters
Wiener filters have two main applications: system identification and signal conditioning. In
system identification, the goal is to model the unknown system that produces a known output
y[n] from a known input x[n]. This can be achieved in either of two ways. In direct system
identification (Fig. 12.4A), the unknown system and the Wiener filter are placed in parallel, in
the sense that both receive the same input x[n]. The goal is to find the filter H(f) such that its
response $\hat{y}[n]$ to x[n] best estimates the output y[n] of the unknown filter. On the other hand, in
inverse system identification (Fig. 12.4B), the unknown filter and the Wiener filter are placed
in series: the output x[n] of the unknown system is used as input to the Wiener filter, and the
goal is to make the output $\hat{y}[n]$ of the Wiener filter best estimate the input y[n] to the unknown
system. Thus, if y[n] and x[n] are related by a filter G(f), the Wiener filter H(f) would ideally
be 1/G(f). This is the approach taken in linear prediction discussed in Chapter 8.
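Direct system identification can be illustrated with a short simulation. This is a sketch under stated assumptions, not the method used for the figures: the "unknown" system g, the noise level, and the segment sizes are hypothetical, and the spectra are estimated by simple averaging of unwindowed DFT segments.

```python
import numpy as np

rng = np.random.default_rng(0)
n_seg, seg_len = 400, 256                  # number and length of segments (assumed)
g = np.array([0.5, 1.0, -0.3, 0.1])        # hypothetical "unknown" FIR system

x = rng.standard_normal(n_seg * seg_len)   # known white input
# Known output: response of the unknown system plus a little measurement noise.
y = np.convolve(x, g)[:x.size] + 0.1 * rng.standard_normal(x.size)

# Segment-wise DFTs of input and output.
X = np.fft.rfft(x.reshape(n_seg, seg_len), axis=1)
Y = np.fft.rfft(y.reshape(n_seg, seg_len), axis=1)

# Averaged spectral estimates. conj(X) * Y matches the convention
# R_xy[k] = h[k] * R_x[k], so that S_xy(f) = H(f) S_x(f).
S_x = np.mean(np.abs(X) ** 2, axis=0)
S_xy = np.mean(np.conj(X) * Y, axis=0)

H_est = S_xy / S_x                         # Wiener filter estimate
H_true = np.fft.rfft(g, seg_len)           # frequency response of the system
err = np.max(np.abs(H_est - H_true))
print(err)
```

The estimate converges to the frequency response of the unknown system as more segments are averaged; the residual error comes from the added noise and from edge effects at the segment boundaries.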
In conditioning applications of Wiener filters (Fig. 12.4C), the goal is either to cancel out the
noise from a noisy signal, or to detect a signal in additive noise. In both cases, the signal to be
estimated y[n] is assumed to be the sum of two uncorrelated components u[n] and v[n]. The
signal to be filtered x[n] is related to v[n] by an unknown system (and therefore x[n] and v[n]
are correlated), but x[n] and u[n] are uncorrelated. In detection applications, v[n] is the signal
and u[n] the noise, so that the output $\hat{y}[n]$ of the Wiener filter is effectively an estimate of the
signal v[n] from the observations x[n]. The error signal $e[n] = y[n] - \hat{y}[n]$ is then an estimate
of the noise u[n]. On the other hand, in cancellation applications, u[n] is the signal, and v[n]
(and therefore x[n]) is noise. Thus, the output $\hat{y}[n]$ of the Wiener filter is an estimate of the
noise v[n] from x[n], while the error signal $e[n] = y[n] - \hat{y}[n]$ is an estimate of the signal u[n].
This technique can be used, for example, to cancel 60-Hz components from recordings of the
electrocardiogram.
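The cancellation configuration can be sketched with a two-tap least-squares FIR filter, a time-domain stand-in for the Wiener filter. Everything specific here is an assumption made for illustration: the sampling rate, the synthetic smoothed-noise signal standing in for the electrocardiogram, and the amplitude and phase of the 60-Hz interference.

```python
import numpy as np

rng = np.random.default_rng(1)
fs = 1000.0                                # sampling rate in Hz (assumed)
n = np.arange(5000)

# u[n]: synthetic stand-in for the signal of interest (not a real ECG).
u = np.convolve(rng.standard_normal(n.size), np.ones(20) / 20, mode="same")
# v[n]: 60-Hz interference present in the recording.
v = 0.8 * np.sin(2 * np.pi * 60 * n / fs + 0.7)
# x[n]: reference correlated with v[n] but not with u[n].
x = np.cos(2 * np.pi * 60 * n / fs)
y = u + v                                  # recorded signal

# Regressors x[n] and x[n-1]; the first sample is dropped so both are defined.
X_mat = np.column_stack((x[1:], x[:-1]))
h, *_ = np.linalg.lstsq(X_mat, y[1:], rcond=None)

y_hat = X_mat @ h                          # estimate of the interference v[n]
e = y[1:] - y_hat                          # error signal: estimate of u[n]

residual = np.mean((e - u[1:]) ** 2)       # interference left after cancellation
print(residual, np.mean(v ** 2))
```

Two taps are enough in this sketch because any sinusoid at the reference frequency is a linear combination of the reference and its one-sample delay; a broadband reference would need a longer filter.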
12.5 Matched filters
The Wiener filter (12.75) gives the optimum linear estimate of a desired random signal y[n]
corrupted by additive noise d[n]. To implement this filter, the desired signal y[n] does not have
to be known exactly; only its power spectrum is needed. A different kind of optimum filter,
the matched filter, is used in applications where the desired signal is known exactly. This is the
case, for example, in radar and sonar applications, where the echo closely resembles the emitted
pulse, except for a delay and a scale factor. We will first derive the matched filter for the case
of additive white noise, then treat the general case of noise with an arbitrary, but known, power
spectrum.
12.5.1 White-noise case
Our goal is to detect a known signal s[n] in additive white noise w[n]. We further assume that
s[n] has finite energy, so that it is well localized in time. Specifically, let x[n] be the sum of s[n]
and w[n]. We would like to design a digital filter h[n] that optimizes our chances of detecting
the known signal, in the sense that the signal-to-noise ratio at the output of the filter would be
maximized for a particular time $n_0$, the time when the signal is detected. The filter output
y[n] can be written as:
$$y[n] = h[n] * x[n] = h[n] * s[n] + h[n] * w[n] \tag{12.76}$$
This is the sum of a term $y_s[n] = h[n] * s[n]$ due to the signal and a term $y_w[n] = h[n] * w[n]$
due to the noise. We want to maximize the following signal-to-noise ratio:
$$\mathrm{SNR} = \frac{\text{Power in } s[n] * h[n] \text{ at time } n_0}{\text{Mean power in } w[n] * h[n]} = \frac{y_s[n_0]^2}{P_{y_w}} \tag{12.77}$$
From (12.11), the mean power due to the white noise input is:
$$P_{y_w} = \sigma_w^2 \sum_{n=-\infty}^{\infty} h[n]^2 \tag{12.78}$$
The power due to the signal at time $n_0$ is:
$$y_s[n_0]^2 = \left( \sum_{n=-\infty}^{\infty} h[n]\, s[n_0 - n] \right)^2 \tag{12.79}$$
From the Cauchy-Schwarz inequality, one has:
$$\left( \sum_{n=-\infty}^{\infty} h[n]\, s[n_0 - n] \right)^2 \le \sum_{n=-\infty}^{\infty} h[n]^2 \sum_{n=-\infty}^{\infty} s[n_0 - n]^2 \tag{12.80}$$
Therefore, the signal-to-noise ratio in (12.77) is bounded by:
$$\mathrm{SNR} = \frac{y_s[n_0]^2}{P_{y_w}} \le \frac{\sum_{n=-\infty}^{\infty} s[n]^2}{\sigma_w^2} = \frac{E_s}{\sigma_w^2} \tag{12.81}$$
Equality is obtained if and only if the two signals are proportional to each other:
$$h[n] = C\, s[n_0 - n] \tag{12.82}$$
Thus, the impulse response of the filter that optimizes the signal-to-noise ratio is proportional
to the signal reversed in time and delayed by $n_0$. Such a filter is called a matched filter. If s[n]
is of finite duration, h[n] is an FIR filter, and the delay $n_0$ must be at least as long as the signal
in order to make the filter causal. If causality is not an issue, $n_0$ can be set to zero.
The matched filter can also be expressed in the frequency domain:
$$H(f) = C\, S^*(f)\, e^{-j2\pi f n_0} \tag{12.83}$$
The signal-to-noise ratio at the matched filter output is the ratio of signal energy $E_s$ to the
noise power $\sigma_w^2$. Figure 12.5 illustrates the improvement in signal-to-noise ratio achieved by a
matched filter for nerve impulses buried in white noise.
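The white-noise result can be verified with a small numerical sketch. The pulse s[n], the noise level, and the position of the pulse in the record are hypothetical values chosen for illustration; the constant C is set to 1.

```python
import numpy as np

rng = np.random.default_rng(2)
s = np.array([0.0, 1.0, 3.0, -2.0, -1.0, 0.5])   # hypothetical known pulse
sigma_w = 0.5                                    # noise standard deviation (assumed)
n0 = s.size - 1                                  # smallest delay giving a causal filter

h = s[::-1].copy()                               # h[n] = s[n0 - n], (12.82) with C = 1
E_s = np.sum(s ** 2)                             # signal energy
snr_bound = E_s / sigma_w ** 2                   # upper bound (12.81)

def output_snr(h, s, sigma_w, n0):
    """Signal-to-noise ratio (12.77) at time n0 for filter h."""
    y_s_n0 = np.convolve(h, s)[n0]               # signal term of (12.76) at n0
    P_yw = sigma_w ** 2 * np.sum(h ** 2)         # noise power (12.78)
    return y_s_n0 ** 2 / P_yw

snr_matched = output_snr(h, s, sigma_w, n0)
snr_other = output_snr(rng.standard_normal(s.size), s, sigma_w, n0)

# Detection: bury the pulse in white noise and locate the peak of the output.
start = 40                                       # pulse onset in the record (assumed)
x = sigma_w * rng.standard_normal(200)
x[start:start + s.size] += s
y = np.convolve(x, h)
peak = int(np.argmax(y))                         # expected at start + n0
print(snr_matched, snr_bound, snr_other, peak)
```

The matched filter attains the bound $E_s/\sigma_w^2$ exactly, while an arbitrary filter of the same length falls below it, as the Cauchy-Schwarz inequality requires.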
12.5.2 General case
The concept of matched filter can be easily extended to the general case of detecting a known
signal in additive noise with arbitrary power spectrum. The basic idea is to introduce a whitening
filter that transforms the additive noise into white noise, then apply the results of the preceding
section for the white noise case.
Specifically, let x[n] = s[n] + d[n] be the sum of the known signal s[n] and noise d[n] with
power spectrum $S_d(f)$. A whitening filter $H_1(f)$ transforms d[n] into white noise w[n]. From
(12.25), $H_1(f)$ must be the inverse square root of $S_d(f)$ (within a phase factor):
$$H_1(f) = \frac{e^{j\Phi(f)}}{\sqrt{S_d(f)}}, \tag{12.84}$$
where $\Phi(f)$ is an arbitrary phase function. For example, $\Phi(f)$ could be chosen to make $h_1[n]$
causal. The white signal $w[n] = d[n] * h_1[n]$ is called the innovation of d[n].
The response $x_1[n]$ of $H_1(f)$ to x[n] can be written as
$$x_1[n] = h_1[n] * x[n] = h_1[n] * s[n] + h_1[n] * d[n] = s_1[n] + w[n], \tag{12.85}$$
where $s_1[n] = s[n] * h_1[n]$. To maximize the SNR with input x[n], it suffices to find a filter
$H_2(f)$ that maximizes the SNR with input $x_1[n]$. From (12.83), the optimum (matched) filter
$H_2(f)$ for detecting $s_1[n]$ in white noise w[n] is:
$$H_2(f) = C\, S_1^*(f)\, e^{-j2\pi f n_0}, \tag{12.86a}$$
with
$$S_1(f) = S(f)\, H_1(f) \tag{12.86b}$$
The overall matched filter H(f) is the cascade of the whitening filter $H_1(f)$ and the matched
filter $H_2(f)$ for white noise:
$$H(f) = H_1(f)\, H_2(f) = C\, S^*(f)\, |H_1(f)|^2\, e^{-j2\pi f n_0} = C\, \frac{S^*(f)}{S_d(f)}\, e^{-j2\pi f n_0} \tag{12.87}$$
Note that the arbitrary phase factor $\Phi(f)$ cancels out when $H_1(f)$ is multiplied by its complex
conjugate.
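The cascade argument can be checked on a discrete frequency grid. This sketch makes several assumptions that are not in the text: the DTFT is approximated by an N-point DFT, the noise spectrum is that of a first-order autoregressive process with parameter 0.8, the signal is a hypothetical eight-sample pulse, and C = 1. The phase function is drawn at random to show that it cancels; because it is not constrained to be odd, the corresponding $h_1[n]$ would be complex here.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 512                                   # DFT grid size (assumed)
f = np.arange(N) / N                      # normalized frequency
n0 = 32                                   # detection time (assumed)

s = np.zeros(N)
s[:8] = [1.0, 2.0, 3.0, 2.0, -1.0, -2.0, -1.0, 0.5]   # hypothetical known signal
S = np.fft.fft(s)

a = 0.8                                   # assumed AR(1) noise parameter
S_d = 1.0 / (1.0 - 2.0 * a * np.cos(2 * np.pi * f) + a ** 2)   # noise spectrum

delay = np.exp(-2j * np.pi * f * n0)      # e^{-j 2 pi f n0}

phi = rng.uniform(-np.pi, np.pi, N)       # arbitrary phase function
H1 = np.exp(1j * phi) / np.sqrt(S_d)      # whitening filter (12.84)
S1 = S * H1                               # whitened signal spectrum (12.86b)
H2 = np.conj(S1) * delay                  # matched filter for white noise (12.86a)

H_cascade = H1 * H2
H_direct = np.conj(S) / S_d * delay       # overall matched filter (12.87)

def output_snr(H):
    """Output SNR at time n0 for filter H in noise with spectrum S_d."""
    y_s_n0 = np.mean(H * S * np.conj(delay))      # inverse DFT evaluated at n0
    P_noise = np.mean(np.abs(H) ** 2 * S_d)       # mean output noise power
    return np.abs(y_s_n0) ** 2 / P_noise

snr_matched = output_snr(H_direct)
snr_white_design = output_snr(np.conj(S) * delay)  # filter (12.83), ignoring noise color
print(snr_matched, snr_white_design)
```

The filter designed for white noise remains usable in colored noise, but its output signal-to-noise ratio cannot exceed that of the filter (12.87), which accounts for the noise spectrum.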
It is interesting to contrast the matched filter (12.87) with the Wiener filter (12.75). Both
filters depend on the ratio of a signal spectrum to the noise power spectrum $S_d(f)$, so that
filter attenuation will be large in frequency regions where the signal-to-noise ratio is low. The
two filters differ in that, for the Wiener filter, the signal power spectrum $S_s(f)$ appears in the
numerator, while for the matched filter it is the DTFT conjugate $S^*(f)$ [...]
$$\sum_{n=-\infty}^{\infty} h[n]$$
$$R_{xy}[k] = h[k] * R_x[k]$$
$$R_y[k] = R_x[k] * (h[k] * h[-k])$$
These relations are particularly simple for white noise, for which samples at different times are
uncorrelated.
These formulas can be further simplified by introducing the power spectrum $S_x(f)$, which is the
Fourier transform of the autocorrelation function:
$$S_x(f) = \sum_{k=-\infty}^{\infty} R_x[k]\, e^{-j2\pi f k}$$
The power spectrum for a particular frequency $f_0$ represents the contribution of the frequency
band centered at $f_0$ to the total power in the signal. The power spectrum of the output of a
linear filter is equal to the power spectrum of the input multiplied by the magnitude squared of
the frequency response:
$$S_y(f) = |H(f)|^2\, S_x(f)$$
This property is useful for analyzing the responses of linear systems to random signals, and for
generating random signals with arbitrary spectral characteristics.
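As an illustration of generating a random signal with a prescribed spectrum, the sketch below filters unit-variance white noise and compares an averaged periodogram of the output with $|H(f)|^2 S_x(f)$. The FIR shaping filter and the segment sizes are hypothetical, and the unwindowed averaged periodogram is only one simple choice of spectral estimator.

```python
import numpy as np

rng = np.random.default_rng(4)
n_seg, seg_len = 2000, 128                 # number and length of segments (assumed)
h = np.array([1.0, 0.8, 0.4, 0.2])         # hypothetical FIR shaping filter

x = rng.standard_normal(n_seg * seg_len)   # white noise with S_x(f) = 1
y = np.convolve(x, h)[:x.size]             # shaped random signal

# Averaged periodogram of the output.
Y = np.fft.rfft(y.reshape(n_seg, seg_len), axis=1)
S_y_est = np.mean(np.abs(Y) ** 2, axis=0) / seg_len

# Prediction: |H(f)|^2 times the flat input spectrum.
S_y_theory = np.abs(np.fft.rfft(h, seg_len)) ** 2

rel_err = np.max(np.abs(S_y_est - S_y_theory)) / np.max(S_y_theory)
print(rel_err)
```

The agreement improves as the number of averaged segments grows, since the variance of each periodogram bin decreases in proportion to the number of segments.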
The cross spectrum $S_{xy}(f)$ of two random signals x[n] and y[n] is the Fourier transform of
the crosscorrelation function $R_{xy}[k]$. If y[n] is the response of a linear filter to x[n], the cross
spectrum is the product of the power spectrum of the input by the filter frequency response.
Conversely, given two signals x[n] and y[n], the frequency response of the linear filter that best
characterizes the relation between the two signals in a least-squares sense is:
$$H(f) = \frac{S_{xy}(f)}{S_x(f)}$$
This Wiener filter has many applications to signal conditioning and system identification.
While the Wiener filter is useful for estimating a random signal in additive noise, the matched
filter is used to detect a known signal in noise. The matched filter takes a particularly simple
form for white noise, in which case its impulse response is the signal waveform reversed in time.
12.7 Further reading
Papoulis and Pillai: Chapters 9, 13
Figure 12.1:
Figure 12.2: A. Magnitude of the frequency response of the ideal bandpass filter B(f). B. Physical
interpretation of the power spectrum. C. Physical interpretation of the cross-spectrum.
Figure 12.3:
Figure 12.4: Applications of Wiener filters. A. Direct system identification. B. Inverse system
identification. C. Noise cancellation and signal detection. In each case, the Wiener filter is
indicated by H(f), and the unknown system by a question mark.
Figure 12.5: A. Nerve impulses. B. The same impulses corrupted by white noise. C. Result of
processing the corrupted impulses by a matched filter.