Signals, Linear Systems, and Convolution
Good general references for this material are Oppenheim, Willsky, and Young (1983), and Oppenheim and Schafer (1989).
In each of the above examples there is an input and an output, each of which is a time-varying signal. We will treat a signal as a time-varying function, x(t). For each time t, the signal has some value x(t), usually called “x of t.” Sometimes we will alternatively use x(t) to refer to the entire signal x, thinking of t as a free variable.
In practice, x(t) will usually be represented as a finite-length sequence of numbers, x[n], in which n can take integer values between 0 and N − 1, and where N is the length of the sequence. This discrete-time sequence is indexed by integers, so we take x[n] to mean “the nth number in sequence x,” usually called “x of n” for short.
The individual numbers in a sequence x[n] are called samples of the signal x(t). The word
“sample” comes from the fact that the sequence is a discretely-sampled version of the continuous
signal. Imagine, for example, that you are measuring membrane potential (or just about anything
else, for that matter) as it varies over time. You will obtain a sequence of measurements sampled
at evenly spaced time intervals. Although the membrane potential varies continuously over time,
you will work just with the sequence of discrete-time measurements.
It is often mathematically convenient to work with continuous-time signals. But in practice,
you usually end up with discrete-time sequences because: (1) discrete-time samples are the only
things that can be measured and recorded when doing a real experiment; and (2) finite-length,
discrete-time sequences are the only things that can be stored and computed with computers.
In what follows, we will express most of the mathematics in the continuous-time domain. But
the examples will, by necessity, use discrete-time sequences.
Pulse and impulse signals. The unit impulse signal, written δ(t), is one at t = 0, and zero everywhere else:

$$\delta(t) = \begin{cases} 1 & \text{if } t = 0 \\ 0 & \text{otherwise} \end{cases}$$

The impulse signal will play a very important role in what follows.

One very useful way to think of the impulse signal is as a limiting case of the pulse signal, δ_Δ(t):

$$\delta_\Delta(t) = \begin{cases} 1/\Delta & \text{if } 0 < t < \Delta \\ 0 & \text{otherwise} \end{cases}$$

The impulse signal is equal to the pulse signal when the pulse gets infinitely short:

$$\delta(t) = \lim_{\Delta \to 0} \delta_\Delta(t).$$
Unit step signal. The unit step signal, written u(t), is zero for all times less than zero, and one for all times greater than or equal to zero:

$$u(t) = \begin{cases} 0 & \text{if } t < 0 \\ 1 & \text{if } t \geq 0 \end{cases}$$
Summation and integration. The Greek capital sigma, Σ, is used as a shorthand notation for adding up a set of numbers, typically having some variable take on a specified set of values. Thus:

$$\sum_{i=1}^{5} i = 1 + 2 + 3 + 4 + 5.$$
The Σ notation is particularly helpful in dealing with sums over discrete-time sequences:

$$\sum_{n=1}^{3} x[n] = x[1] + x[2] + x[3].$$
Arithmetic with signals. It is often useful to apply the ordinary operations of arithmetic to signals. Thus we can write the product of signals x and y as z = xy, meaning the signal made up of the products of the corresponding elements:

$$z[n] = x[n]\, y[n].$$

[Figure 1: A continuous signal and its pulse (“staircase”) approximation, drawn as the sum of a series of scaled and shifted pulses.]
Representing signals with impulses. Any signal can be expressed as a sum of scaled and shifted unit impulses. We begin with the pulse or “staircase” approximation x̃(t) to a continuous signal x(t), as illustrated in Fig. 1. Conceptually, this is trivial: for each discrete sample of the original signal, we make a pulse signal. Then we add up all these pulse signals to make up the approximate signal. Each of these pulse signals can in turn be represented as a standard pulse scaled by the appropriate value and shifted to the appropriate place. In mathematical notation:

$$\tilde{x}(t) = \sum_{k=-\infty}^{\infty} x(k\Delta)\, \delta_\Delta(t - k\Delta)\, \Delta.$$
As we let Δ approach zero, the approximation x̃(t) becomes better and better, and in the limit it equals x(t). Therefore,

$$x(t) = \lim_{\Delta \to 0} \sum_{k=-\infty}^{\infty} x(k\Delta)\, \delta_\Delta(t - k\Delta)\, \Delta.$$
Also, as Δ → 0, the summation approaches an integral, and the pulse approaches the unit impulse:

$$x(t) = \int_{-\infty}^{\infty} x(s)\, \delta(t - s)\, ds. \tag{1}$$
In other words, we can represent any signal as an infinite sum of shifted and scaled unit impulses. A
digital compact disc, for example, stores whole complex pieces of music as lots of simple numbers
representing very short impulses, and then the CD player adds all the impulses back together one
after another to recreate the complex musical waveform.
This no doubt seems like a lot of trouble to go to, just to get back the same signal that we
originally started with, but in fact, we will very shortly be able to use Eq. 1 to perform a marvelous
trick.
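The discrete-time version of this representation is easy to verify numerically. Here is a minimal sketch (plain NumPy; the example sequence is an arbitrary choice of mine) that rebuilds a signal from scaled, shifted unit impulses:

    import numpy as np

    def impulse(N, k):
        # Discrete unit impulse delta[n - k]: one at sample k, zero elsewhere.
        d = np.zeros(N)
        d[k] = 1.0
        return d

    x = np.array([2.0, -1.0, 0.5, 3.0])   # an arbitrary example sequence
    N = len(x)

    # Eq. 1 in discrete time: x[n] = sum over k of x[k] * delta[n - k]
    rebuilt = sum(x[k] * impulse(N, k) for k in range(N))
    assert np.allclose(rebuilt, x)        # recovers the original exactly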
Linear Systems
A system or transform maps an input signal x(t) into an output signal y(t):

$$y(t) = T[x(t)],$$

where T denotes the transform, a function from input signals to output signals.
Systems come in a wide variety of types. One important class is known as linear systems. To
see whether a system is linear, we need to test whether it obeys certain rules that all linear systems
obey. The two basic tests of linearity are homogeneity and additivity.
Homogeneity. If we increase the strength of the input to a linear system, say we double it, then the output is also doubled. For example, if the current injected into a passive neural membrane is doubled, the resulting membrane potential fluctuations will double as well. This is called the scalar rule or sometimes the homogeneity of linear systems.
Additivity. Suppose we measure how the membrane potential fluctuates over time in response to a complicated time-series of injected current x₁(t). Next, we present a second (different) complicated time-series x₂(t). The second stimulus also generates fluctuations in the membrane potential, which we measure and write down. Then we present the sum of the two currents x₁(t) + x₂(t) and see what happens. Since the system is linear, the measured membrane potential fluctuations will be just the sum of the fluctuations to each of the two currents presented separately.
Superposition. Systems that satisfy both homogeneity and additivity are considered to be linear systems. These two rules, taken together, are often referred to as the principle of superposition. Mathematically, the principle of superposition is expressed as:

$$T[\alpha x_1(t) + \beta x_2(t)] = \alpha\, T[x_1(t)] + \beta\, T[x_2(t)].$$

Homogeneity is a special case in which one of the signals is absent. Additivity is a special case in which α = β = 1.
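To make the test concrete, here is a small numerical sketch (the two example systems are my own choices, not systems from the text): a three-point moving average obeys superposition, while pointwise squaring does not.

    import numpy as np

    def moving_average(x):
        # A linear system: each output sample averages three adjacent inputs.
        return np.convolve(x, np.ones(3) / 3.0, mode="same")

    def squarer(x):
        # A nonlinear system: pointwise squaring.
        return x ** 2

    rng = np.random.default_rng(0)
    x1, x2 = rng.standard_normal(100), rng.standard_normal(100)
    a, b = 2.0, -0.7

    for T in (moving_average, squarer):
        lhs = T(a * x1 + b * x2)                 # response to the combined input
        rhs = a * T(x1) + b * T(x2)              # combination of the responses
        print(T.__name__, np.allclose(lhs, rhs))   # True, then False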
Shift-invariance. Suppose that we inject a pulse of current and measure the membrane po-
tential fluctuations. Then we stimulate again with a similar pulse at a different point in time, and
again we measure the membrane potential fluctuations. If we haven’t damaged the membrane with
the first impulse then we should expect that the response to the second pulse will be the same as
the response to the first pulse. The only difference between them will be that the second pulse has
occurred later in time, that is, it is shifted in time. When the responses to the identical stimulus
presented shifted in time are the same, except for the corresponding shift in time, then we have
a special kind of linear system called a shift-invariant linear system. Just as not all systems are
linear, not all linear systems are shift-invariant.
In mathematical language, a system T is shift-invariant if and only if a time-shifted input produces a correspondingly time-shifted output:

$$y(t) = T[x(t)] \quad \Rightarrow \quad y(t - s) = T[x(t - s)],$$

for any time shift s.
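The same kind of numerical spot-check works for shift-invariance. The sketch below uses circular shifts (np.roll) to sidestep edge effects; that circular treatment is a convenience of this toy test, not part of the definition:

    import numpy as np

    def moving_average(x):
        # Circular three-point average: shift-invariant under circular shifts.
        return (np.roll(x, 1) + x + np.roll(x, -1)) / 3.0

    rng = np.random.default_rng(1)
    x = rng.standard_normal(64)
    s = 5                                  # shift, in samples

    shifted_output = np.roll(moving_average(x), s)
    output_of_shifted = moving_average(np.roll(x, s))
    print(np.allclose(shifted_output, output_of_shifted))   # True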
Convolution
Homogeneity, additivity, and shift invariance may, at first, sound a bit abstract but they are very
useful. To characterize a shift-invariant linear system, we need to measure only one thing: the way
the system responds to a unit impulse. This response is called the impulse response function of
the system. Once we’ve measured this function, we can (in principle) predict how the system will
respond to any other possible stimulus.
[Figure 2: Impulses and impulse responses. A unit impulse input produces the impulse response; a stimulus conceived as a sum of scaled and shifted impulses produces the corresponding sum of scaled and shifted impulse responses.]
The way we use the impulse response function is illustrated in Fig. 2. We conceive of the input
stimulus, in this case a sinusoid, as if it were the sum of a set of impulses (Eq. 1). We know the
responses we would get if each impulse was presented separately (i.e., scaled and shifted copies of
the impulse response). We simply add together all of the (scaled and shifted) impulse responses to
predict how the system will respond to the complete stimulus.
Now we will repeat all this in mathematical notation. Our goal is to show that the response (e.g.,
membrane potential fluctuation) of a shift-invariant linear system (e.g., passive neural membrane)
can be written as a sum of scaled and shifted copies of the system’s impulse response function.
The convolution integral. Begin by using Eq. 1 to replace the input signal x(t) by its representation in terms of impulses:

$$y(t) = T[x(t)] = T\left[\int_{-\infty}^{\infty} x(s)\, \delta(t - s)\, ds\right] = T\left[\lim_{\Delta \to 0} \sum_{k=-\infty}^{\infty} x(k\Delta)\, \delta_\Delta(t - k\Delta)\, \Delta\right].$$
Using additivity,

$$y(t) = \lim_{\Delta \to 0} \sum_{k=-\infty}^{\infty} T[x(k\Delta)\, \delta_\Delta(t - k\Delta)\, \Delta].$$
Taking the limit,

$$y(t) = \int_{-\infty}^{\infty} T[x(s)\, \delta(t - s)\, ds].$$
[Figure 3: Convolution as a series of weighted sums, illustrated for an impulse input (… 0 0 0 1 0 0 0 0 0 …) and a step input (… 0 0 0 1 1 1 1 1 1 …), with past, present, and future input values marked.]
Using homogeneity,

$$y(t) = \int_{-\infty}^{\infty} x(s)\, T[\delta(t - s)]\, ds.$$

Now let h(t) be the response of T to the unshifted unit impulse, i.e., h(t) = T[δ(t)]. Then by using shift-invariance,

$$y(t) = \int_{-\infty}^{\infty} x(s)\, h(t - s)\, ds. \tag{4}$$
Notice what this last equation means. For any shift-invariant linear system T, once we know its impulse response h(t) (that is, its response to a unit impulse), we can forget about T entirely, and just add up scaled and shifted copies of h(t) to calculate the response of T to any input whatsoever. Thus any shift-invariant linear system is completely characterized by its impulse response h(t).
The way of combining two signals specified by Eq. 4 is known as convolution. It is such a widespread and useful formula that it has its own shorthand notation, ∗. For any two signals x and y, there will be another signal z obtained by convolving x with y:

$$z(t) = x * y = \int_{-\infty}^{\infty} x(s)\, y(t - s)\, ds.$$
Convolution as a series of weighted sums. While superposition and convolution may sound a little abstract, there is an equivalent statement that will make it concrete: a system is a shift-invariant, linear system if and only if the responses are a weighted sum of the inputs. Figure 3
shows an example: the output at each point in time is computed simply as a weighted sum of the
inputs at recently past times. The choice of weighting function determines the behavior of the
system. Not surprisingly, the weighting function is very closely related to the impulse response of
the system. In particular, the impulse response and the weighting function are time-reversed copies
of one another, as demonstrated in the top part of the figure.
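In discrete time, Eq. 4 becomes the weighted sum y[n] = Σₖ x[k] h[n − k]. The sketch below computes that double sum directly and checks it against NumPy's built-in convolution (the signals are arbitrary examples of mine):

    import numpy as np

    def convolve_direct(x, h):
        # y[n] = sum over k of x[k] * h[n - k]: a weighted sum for each n.
        y = np.zeros(len(x) + len(h) - 1)
        for n in range(len(y)):
            for k in range(len(x)):
                if 0 <= n - k < len(h):
                    y[n] += x[k] * h[n - k]
        return y

    x = np.array([1.0, 2.0, 3.0, 4.0])     # arbitrary input
    h = np.array([0.5, 0.3, 0.2])          # arbitrary impulse response
    assert np.allclose(convolve_direct(x, h), np.convolve(x, h))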
Properties of convolution. The following things are true for convolution in general, as you
should be able to verify for yourself with some algebraic manipulation:
x ∗ y = y ∗ x                      (commutative)
(x ∗ y) ∗ z = x ∗ (y ∗ z)          (associative)
(x ∗ z) + (y ∗ z) = (x + y) ∗ z    (distributive)
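A quick numerical spot-check of all three properties (random test signals; a sanity check, not a proof):

    import numpy as np

    rng = np.random.default_rng(2)
    x, y, z = (rng.standard_normal(8) for _ in range(3))

    print(np.allclose(np.convolve(x, y), np.convolve(y, x)))          # commutative
    print(np.allclose(np.convolve(np.convolve(x, y), z),
                      np.convolve(x, np.convolve(y, z))))             # associative
    print(np.allclose(np.convolve(x, z) + np.convolve(y, z),
                      np.convolve(x + y, z)))                         # distributive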
Frequency Response
Sinusoidal signals. Sinusoidal signals have a special relationship to shift-invariant linear systems. A sinusoid is a regular, repeating curve that oscillates above and below zero. The sinusoid has a value of zero at time zero. The cosinusoid is a shifted version of the sinusoid; it has a value of one at time zero.
The sine wave repeats itself regularly, and the distance from one peak of the wave to the next is called the wavelength or period of the sinusoid, generally indicated by the Greek letter λ. The inverse of wavelength is frequency: the number of peaks in the signal that arrive per second. The units for the frequency of a sine wave are hertz, named after a famous 19th-century physicist who was a student of Helmholtz. The longer the wavelength, the lower the frequency; knowing one, we can infer the other. Apart from frequency, sinusoids also have various amplitudes, which measure the height of the wave: how high it gets at the peak and how low it gets at the trough. Thus, we can describe a sine wave completely by its frequency and by its amplitude.
The mathematical expression of a sinusoidal signal is:

$$A \sin(2\pi\omega t),$$

where A is the amplitude and ω is the frequency (in Hz). As the value of the amplitude, A, increases, the height of the sinusoid increases. As the frequency, ω, increases, the spacing between the peaks becomes smaller.
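For example, a sampled sinusoid with a chosen amplitude and frequency (the sampling rate and duration below are arbitrary choices):

    import numpy as np

    A, omega = 2.0, 4.0                # amplitude; frequency in Hz
    fs, dur = 100.0, 1.0               # sampling rate (Hz) and duration (sec)
    t = np.arange(0.0, dur, 1.0 / fs)  # evenly spaced sample times
    x = A * np.sin(2 * np.pi * omega * t)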
Fourier Transform. Just as we can express any signal as the sum of a series of shifted and
scaled impulses, so too we can express any signal as the sum of a series of (shifted and scaled)
sinusoids at different frequencies. This is called the Fourier expansion of the signal. An example
is shown in Fig. 4.
The equation describing the Fourier expansion works as follows:

$$x(t) = \int_{0}^{\infty} A_\omega \sin(2\pi\omega t + \phi_\omega)\, d\omega, \tag{5}$$

where ω is the frequency of each sinusoid, and A_ω and φ_ω are the amplitude and phase, respectively, of each sinusoid. You can go both ways. If you know the coefficients, A_ω and φ_ω, you can use this formula to reconstruct the original signal x(t). If you know the signal, you can compute the coefficients by a method called the Fourier transform, a way of decomposing a complex stimulus into its component sinusoids (see Appendix II).
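The discrete analogue of this expansion can be explored with the fast Fourier transform. The sketch below (my construction, using NumPy's FFT conventions rather than the continuous integral) recovers amplitudes and phases from a sampled signal and rebuilds the signal as an explicit sum of scaled, shifted cosinusoids:

    import numpy as np

    fs, N = 64, 64
    t = np.arange(N) / fs
    x = 1.5 * np.sin(2 * np.pi * 4 * t) + 0.5 * np.sin(2 * np.pi * 8 * t)

    X = np.fft.rfft(x)                       # complex coefficients
    freqs = np.fft.rfftfreq(N, d=1.0 / fs)   # frequency of each coefficient
    amp, phase = np.abs(X), np.angle(X)

    # Rebuild the signal as an explicit sum of scaled, shifted cosinusoids.
    rebuilt = np.full(N, X[0].real / N)      # the constant (DC) term
    for k in range(1, len(X)):
        scale = 1.0 if k == N // 2 else 2.0  # the Nyquist bin is not doubled
        rebuilt += scale * amp[k] / N * np.cos(2 * np.pi * freqs[k] * t + phase[k])

    assert np.allclose(rebuilt, x)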
[Figure 4: Fourier series approximations. Four panels show a signal being approximated by sums of increasing numbers of sinusoidal components.]
Shift-invariant linear systems and sinusoids. The Fourier decomposition is important because, if we know the response of the system to sinusoids at many different frequencies, then we can use the same kind of trick we used with impulses and the impulse response function. First, we measure the system’s response to sinusoids of all different frequencies.
Next, we take our input (e.g., time-varying current) and use the Fourier transform to compute the
values of the Fourier coefficients. At this point the input has been broken down as the sum of
its component sinusoids. Finally, we can predict the system’s response (e.g., membrane potential
fluctuations) simply by adding the responses for all the component sinusoids.
Why bother with sinusoids when we were doing just fine with impulses? The reason is that sinusoids have a very special relationship to shift-invariant linear systems. When we use a sinusoid as the input to a shift-invariant linear system, the system’s response is always a (shifted and scaled) copy of the input sinusoid. That is, when the input is x(t) = sin(2πωt), the output is always of the form

$$y(t) = A_\omega \sin(2\pi\omega t + \phi_\omega),$$

with the same frequency as the input. Here, φ_ω determines the amount of shift and A_ω determines the amount of scaling. Thus, measuring the response of a shift-invariant linear system to a sinusoid entails measuring only two numbers: the shift and the scale. This makes the job of measuring the response to sinusoids at many different frequencies quite practical.

[Figure 5: The frequency description of a shift-invariant linear system. Sinusoidal inputs produce scaled and shifted sinusoidal outputs; the scaling and the shifting are each plotted as a function of frequency.]
Often then, when scientists characterize the response of a system they will not tell you the
impulse response. Rather, they will give you the frequency response, the values of the shift and
scale for each of the possible input frequencies (Fig. 5). This frequency response representation
of how the shift-invariant linear system behaves is equivalent to providing you with the impulse
response function (in fact, the two are Fourier transforms of one another). We can use either to
compute the response to any input. This is the main point of all this stuff: a simple, fast, economical
way to measure the responsiveness of complex systems. If you know the coefficients of response
for sine waves at all possible frequencies, then you can determine how the system will respond to
any possible input.
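Here is a sketch of that measurement procedure: probe a toy shift-invariant linear system (a circular three-point average, my stand-in for a membrane or any other filter) with sinusoids and recover the scale and shift at each frequency:

    import numpy as np

    def system(x):
        # A toy shift-invariant linear system: circular three-point average.
        return (np.roll(x, 1) + x + np.roll(x, -1)) / 3.0

    fs, N = 64, 64
    t = np.arange(N) / fs

    for freq in (2, 8, 16):                            # probe frequencies, Hz
        y = system(np.sin(2 * np.pi * freq * t))
        # Project the output onto sin and cos at the probe frequency.
        a = 2.0 / N * np.sum(y * np.sin(2 * np.pi * freq * t))
        b = 2.0 / N * np.sum(y * np.cos(2 * np.pi * freq * t))
        scale, shift = np.hypot(a, b), np.arctan2(b, a)
        print(f"{freq:2d} Hz: scale {scale:.3f}, shift {shift:.3f} rad")

The printed scale shrinks as the probe frequency rises, anticipating the low-pass behavior discussed next.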
Linear filters. Shift-invariant linear systems are often referred to as linear filters because they
typically attenuate (filter out) some frequency components while keeping others intact.
For example, since a passive neural membrane is a shift-invariant linear system, we know that injecting sinusoidally modulated current yields membrane potential fluctuations that are sinusoidal, with the same frequency (sinusoid in, sinusoid out). The amplitude and phase of the output sinusoid depend on the choice of frequency relative to the properties of the membrane. The membrane essentially averages the input current over a period of time. For very low frequencies (slowly varying current), this averaging is irrelevant and the membrane potential fluctuations follow the injected current. For high frequencies, however, even a large-amplitude sinusoidal current modulation will yield essentially no membrane potential fluctuations.
[Figure 6: Linear systems logic. Measure the sinusoidal responses to obtain the frequency description of the system.]
The membrane is called a low-pass filter: it lets low frequencies pass, but because of its time-averaging behavior, it attenuates high frequencies.
Figure 7 shows an example of a band-pass filter. When the frequency of a sinusoidal input
matches the periodicity of the linear system’s weighting function the output sinusoid has a large
amplitude. When the frequency of the input is either too high or too low, the output sinusoid is
attenuated.
Shift-invariant linear systems are often depicted with block diagrams, like those shown in Fig. 8. Fig. 8A depicts a simple linear filter with frequency response ĥ(ω). The equations that go with the block diagram are:

$$\hat{y}(\omega) = \hat{h}(\omega)\, \hat{x}(\omega),$$

that is,

$$\text{amplitude}[\hat{y}(\omega)] = \text{amplitude}[\hat{h}(\omega)] \cdot \text{amplitude}[\hat{x}(\omega)]$$
Figure 7: Illustration of an idealized retinal ganglion-cell receptive field that acts like a bandpass filter (redrawn from Wandell, 1995); response amplitude is plotted against spatial frequency (cpd) on log-log axes. This linear on-center neuron responds best to an intermediate spatial frequency whose bright bars fall on the center and whose dark bars fall over the opposing surround. When the spatial frequency is low, the center and surround oppose one another because both are stimulated by a bright bar, thus diminishing the response. When the spatial frequency is high, bright and dark bars fall within and are averaged by the center (likewise in the surround), again diminishing the response.
Figure 8: Block diagrams of linear filters. A: A linear filter with frequency response ĥ(ω) maps input x̂(ω) to output ŷ(ω). B: A bank of linear filters h₁(ω), …, h_N(ω) that all receive the same input x̂(ω). C: A linear feedback system with feedback filter f̂(ω).
$$\text{phase}[\hat{y}(\omega)] = \text{phase}[\hat{h}(\omega)] + \text{phase}[\hat{x}(\omega)].$$

For an input sinusoid of frequency ω, the output is a sinusoid of the same frequency, scaled by amplitude[ĥ(ω)] and shifted by phase[ĥ(ω)].
Fig. 8B depicts a bank of linear filters that all receive the same input signal. For example,
they might be spatially-oriented linear neurons (like V1 simple cells) with different orientation
preferences.
Fig. 8C depicts a linear feedback system. The equation corresponding to this diagram is:

$$y(t) = x(t) + f(t) * y(t). \tag{6}$$

Note that because of the feedback, the output y(t) appears on both sides of the equation. The frequency response of the feedback filter is denoted by f̂(ω), but the behavior of the entire linear system can be expressed as:

$$\hat{y}(\omega) = \hat{x}(\omega) + \hat{f}(\omega)\, \hat{y}(\omega).$$
Solving for ŷ(ω) in this expression gives:

$$\hat{y}(\omega) = \hat{h}(\omega)\, \hat{x}(\omega) = \frac{\hat{x}(\omega)}{1 - \hat{f}(\omega)},$$

where ĥ(ω) = 1/[1 − f̂(ω)] is the effective frequency response of the entire linear feedback system. Using a linear feedback filter with frequency response f̂(ω) is equivalent to using a linear feedforward filter with frequency response ĥ(ω).
There is one additional subtle, but important, point about this linear feedback system. A system is called causal or nonanticipatory if the output at any time depends only on values of the input at the present time and in the past. For example, the systems y(t) = x(t − 1) and y(t) = x²(t) are causal, but the system y(t) = x(t + 1) is not causal. Note that not all causal systems are linear and that not all linear systems are causal (look for examples of each in the previous sentence).
For Eq. 6 to make sense, the filter f(t) must be causal, so that the output at time t depends on the input at time t plus a convolution with past outputs. For example, if f(t) = ½δ(t − 1), then:

$$y(t) = x(t) + \tfrac{1}{2}\, y(t - 1).$$
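A sketch of this example in discrete time (the length and test signal are arbitrary): run the feedback recursion y[n] = x[n] + ½y[n − 1] directly, and compare it with feedforward convolution against the impulse response h[n] = (½)ⁿ that the recursion implies:

    import numpy as np

    N = 32
    rng = np.random.default_rng(3)
    x = rng.standard_normal(N)

    # Feedback form: y[n] = x[n] + 0.5 * y[n - 1]
    y_feedback = np.zeros(N)
    for n in range(N):
        y_feedback[n] = x[n] + (0.5 * y_feedback[n - 1] if n > 0 else 0.0)

    # Equivalent feedforward form: convolve with the (truncated) impulse
    # response h[n] = 0.5 ** n that the recursion produces.
    h = 0.5 ** np.arange(N)
    y_feedforward = np.convolve(x, h)[:N]

    assert np.allclose(y_feedback, y_feedforward)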
Differential Equations as Linear Systems
Consider the differential equation

$$\frac{dy}{dt}(t) = -y(t) + x(t). \tag{7}$$
The equation for a passive neural membrane, for example, can be expressed in this form. There are two operations in this equation: differentiation and addition. Since both are linear operations, this is an equation for a linear system with input x(t) and output y(t). The output y(t) appears on both sides of the equation, so it is a linear feedback system. Since the present output depends on the full past history, it is an infinite impulse response system (an IIR filter).
Taking the Fourier transform of both sides, the differential equation can be expressed using multiplication in the Fourier domain:

$$2\pi j\omega\, \hat{y}(\omega) = -\hat{y}(\omega) + \hat{x}(\omega), \quad\text{so}\quad \hat{y}(\omega) = \frac{\hat{x}(\omega)}{1 + 2\pi j\omega}.$$
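A minimal simulation confirms the low-pass behavior. The forward-Euler discretization and the particular test frequencies below are my own choices:

    import numpy as np

    dt, T = 0.001, 40.0
    t = np.arange(0.0, T, dt)

    def simulate(x):
        # Forward-Euler integration of dy/dt = -y(t) + x(t).
        y = np.zeros_like(x)
        for n in range(1, len(x)):
            y[n] = y[n - 1] + dt * (-y[n - 1] + x[n - 1])
        return y

    for freq in (0.05, 5.0):                       # slow and fast inputs, Hz
        y = simulate(np.sin(2 * np.pi * freq * t))
        gain = y[len(y) // 2:].max()               # steady-state amplitude
        predicted = 1.0 / abs(1 + 2j * np.pi * freq)   # |1 / (1 + 2*pi*j*omega)|
        print(f"{freq} Hz: simulated {gain:.3f}, predicted {predicted:.3f}")

The slow input passes through almost unattenuated; the fast input is strongly attenuated, matching the frequency response above.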
Fourier transforms involve complex numbers, so we need to do a quick review. A complex number z = a + jb has two parts, a real part a and an imaginary part jb, where j is the square root of −1. A complex number can also be expressed using complex exponential notation and Euler’s equation:

$$z = a + jb = A e^{j\phi} = A \cos(\phi) + jA \sin(\phi),$$

where

$$a = A \cos(\phi), \quad b = A \sin(\phi), \quad A = \sqrt{a^2 + b^2}, \quad \phi = \tan^{-1}(b/a).$$

In this notation, the product of two complex numbers is

$$A_1 e^{j\phi_1} \cdot A_2 e^{j\phi_2} = A_1 A_2\, e^{j(\phi_1 + \phi_2)},$$

so that the amplitudes multiply and the phases add.
If you were instead to do the multiplication using real-and-imaginary (a + jb) notation, you would get four terms that you could write using sin and cos notation, but to simplify the result you would have to use all those trig identities that you forgot after graduating from high school. That is why complex exponential notation is so widespread.
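A two-line check with Python's built-in complex arithmetic:

    import cmath

    z1 = cmath.rect(2.0, 0.3)      # amplitude 2.0, phase 0.3 rad
    z2 = cmath.rect(1.5, 1.1)      # amplitude 1.5, phase 1.1 rad

    amp, phase = cmath.polar(z1 * z2)
    print(amp, phase)              # 3.0, 1.4: amplitudes multiply, phases add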
[Figure 9: Two sinusoidal signals x(t), at 4 Hz and 8 Hz, plotted over one second of time, alongside the amplitudes of their Fourier transforms |X(f)|: each amplitude plot is a unit-height impulse at the signal’s frequency.]
Any signal can be written as a sum of shifted and scaled sinusoids, as was expressed in Eq. 5. That equation is usually written using complex exponential notation:

$$x(t) = \int_{-\infty}^{\infty} \hat{x}(\omega)\, e^{2\pi j\omega t}\, d\omega. \tag{8}$$
The complex exponential notation, remember, is just a shorthand for sinusoids and cosinusoids, but it is mathematically more convenient. The x̂(ω) are the Fourier transform coefficients for each frequency component ω. These coefficients are complex numbers and can be expressed either in terms of their real (cosine) and imaginary (sine) parts or in terms of their amplitude and phase.
A second equation tells you how to compute the Fourier transform coefficients, x̂(ω), from the input signal:

$$\hat{x}(\omega) = \mathcal{F}\{x(t)\} = \int_{-\infty}^{\infty} x(t)\, e^{-2\pi j\omega t}\, dt. \tag{9}$$
These two equations are inverses of one another. Eq. 9 is used to compute the Fourier transform
coefficients from the input signal, and then Eq. 8 is used to reconstruct the input signal from the
Fourier coefficients.
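In discrete time, Eqs. 8 and 9 correspond (up to the DFT's scale conventions, which differ from the continuous equations) to NumPy's inverse and forward FFT:

    import numpy as np

    rng = np.random.default_rng(4)
    x = rng.standard_normal(128)

    X = np.fft.fft(x)              # forward transform (analogue of Eq. 9)
    x_back = np.fft.ifft(X)        # inverse transform (analogue of Eq. 8)

    assert np.allclose(x_back.real, x)       # the two are inverses
    print(np.abs(X[1]), np.angle(X[1]))      # amplitude and phase of one bin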
The equations for the Fourier transform are rather complex (no pun intended). The best way to
get an intuition for the frequency domain is to look at a few examples. Figure 9 plots sinusoidal
signals of two different frequencies, along with their Fourier transform amplitudes. A sinusoidal
signal contains only one frequency component, hence the frequency plots contain impulses. Both
sinusoids are modulated between plus and minus one, so the impulses in the frequency plots have
unit amplitude. The only difference between the two sinusoids is that one has 4 cycles per second
and the other has 8 cycles per second. Hence the impulses in the frequency plots are located at
4 Hz and 8 Hz, respectively.
Figure 10 shows the Fourier transforms of a sinusoid and a cosinusoid. We can express the
Fourier transform coefficients either in terms of their real and imaginary parts, or in terms of their
amplitude and phase. Both representations are plotted in the figure. Sines and cosines of the same
frequency have identical amplitude plots, but the phases are different.
Figure 10: Fourier transforms of sine and cosine signals, shown as real part, imaginary part, amplitude, and phase over frequencies from −8 to 8 Hz. The amplitudes are the same, but the phases are different (0 for the cosine, π/2 for the sine).
Do not be put off by the negative frequencies in the plots. The equations for the Fourier transform and its inverse include both positive and negative frequencies. This is really just a mathematical convenience: the information in the negative frequencies is redundant with that in the positive frequencies. Since cos(−f) = cos(f), the negative frequency components in the real part of the frequency domain will always be the same as the corresponding positive frequency components. Since sin(−f) = −sin(f), the negative frequency components in the imaginary part of the frequency domain will always be minus one times the corresponding positive frequency components. Often, people plot only the positive frequency components, as was done in Fig. 9, since the negative frequency components provide no additional information. Sometimes, people plot only the amplitude. In that case, however, information (the phase) is missing.
There are a few facts about the Fourier transform that often come in handy. The first is that the Fourier transform is itself a linear system, which you can check for yourself by making sure that Eq. 9 obeys both homogeneity and additivity. This is important because it makes it easy to write the Fourier transforms of lots of things. For example, the Fourier transform of the sum of two signals is the sum of the two Fourier transforms:

$$\mathcal{F}\{x(t) + y(t)\} = \mathcal{F}\{x(t)\} + \mathcal{F}\{y(t)\} = \hat{x}(\omega) + \hat{y}(\omega),$$

where I have used F{·} as a shorthand notation for “the Fourier transform of.” The linearity of the Fourier transform was one of the tricks that made it easy to write the transforms of both sides of Eq. 6.
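This is easy to confirm numerically with the discrete transform (random test signals):

    import numpy as np

    rng = np.random.default_rng(5)
    x, y = rng.standard_normal(64), rng.standard_normal(64)

    print(np.allclose(np.fft.fft(x + y), np.fft.fft(x) + np.fft.fft(y)))   # True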
A second fact, known as the convolution property of the Fourier transform, is that the Fourier transform of a convolution equals the product of the two Fourier transforms:

$$\mathcal{F}\{h(t) * x(t)\} = \mathcal{F}\{h(t)\}\, \mathcal{F}\{x(t)\} = \hat{h}(\omega)\, \hat{x}(\omega).$$

This property was also used to write the Fourier transform of Eq. 6. Indeed, this property is central to much of the discussion in this handout. Above, I emphasized that for a shift-invariant linear system (i.e., convolution), the system’s responses are always given by shifting and scaling the frequency components of the input signal. This fact is expressed mathematically by the convolution property above, where x̂(ω) are the frequency components of the input and ĥ(ω) is the frequency response, the (complex-valued) scale factors that shift and scale each frequency component.
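A numerical check of the convolution property using the DFT; the zero-padding is needed because, without it, the DFT product corresponds to circular rather than ordinary convolution:

    import numpy as np

    rng = np.random.default_rng(6)
    x, h = rng.standard_normal(32), rng.standard_normal(16)

    # Zero-pad to the full output length so the DFT product matches
    # ordinary (linear) rather than circular convolution.
    L = len(x) + len(h) - 1
    lhs = np.fft.fft(np.convolve(h, x), n=L)
    rhs = np.fft.fft(h, n=L) * np.fft.fft(x, n=L)
    assert np.allclose(lhs, rhs)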
The convolution property of the Fourier transform is so important that I feel compelled to write out a derivation of it. We start with the definition of convolution:

$$y(t) = h(t) * x(t) = \int_{-\infty}^{\infty} x(s)\, h(t - s)\, ds.$$
By the definition of the Fourier transform,

$$\hat{y}(\omega) = \int_{-\infty}^{\infty} \left[ \int_{-\infty}^{\infty} x(s)\, h(t - s)\, ds \right] e^{-2\pi j\omega t}\, dt.$$
Switching the order of integration,

$$\hat{y}(\omega) = \int_{-\infty}^{\infty} x(s) \left[ \int_{-\infty}^{\infty} h(t - s)\, e^{-2\pi j\omega t}\, dt \right] ds.$$
Letting τ = t − s,

$$\hat{y}(\omega) = \int_{-\infty}^{\infty} x(s) \left[ \int_{-\infty}^{\infty} h(\tau)\, e^{-2\pi j\omega(\tau + s)}\, d\tau \right] ds
= \left[ \int_{-\infty}^{\infty} x(s)\, e^{-2\pi j\omega s}\, ds \right] \left[ \int_{-\infty}^{\infty} h(\tau)\, e^{-2\pi j\omega\tau}\, d\tau \right]
= \hat{x}(\omega)\, \hat{h}(\omega).$$
A third property of the Fourier transform, known as the differentiation property, is expressed as:

$$\mathcal{F}\left\{ \frac{dx}{dt} \right\} = 2\pi j\omega\, \mathcal{F}\{x\}.$$
This property was used to write the Fourier transform of Eq. 7. It is also very important and I
would feel compelled to write a derivation of it as well, but I am running out of energy, so you will
have to do it yourself.
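A numerical spot-check (not the derivation) is quick with the DFT, for a signal that is smooth and periodic on the sampling grid:

    import numpy as np

    fs, N = 128, 128
    t = np.arange(N) / fs
    x = np.sin(2 * np.pi * 3 * t)                  # 3 Hz, periodic on the grid
    dx_exact = 2 * np.pi * 3 * np.cos(2 * np.pi * 3 * t)

    freqs = np.fft.fftfreq(N, d=1.0 / fs)          # frequencies in Hz
    dx_fft = np.fft.ifft(2j * np.pi * freqs * np.fft.fft(x)).real
    assert np.allclose(dx_fft, dx_exact)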
References
A V Oppenheim and R W Schafer. Discrete-Time Signal Processing. Prentice-Hall, Englewood
Cliffs, New Jersey, 1989.
A V Oppenheim, A S Willsky, and I T Young. Signals and Systems. Prentice-Hall, Englewood
Cliffs, New Jersey, 1983.
B A Wandell. Foundations of Vision. Sinauer, Sunderland, MA, 1995.