
ECE431 Digital Signal Processing

Bruce Francis
Course notes, Version 1.04, September 2009

Preface
These notes follow the topics in the text. Some sections of these notes are complete in the sense of
being self-contained. Others are not; they are to be supplemented with the lecture notes and the
text. These incomplete sections are marked frag, meaning fragment.
Even the complete sections frequently ask you to fill things in, such as completing examples.
Usually, we do this fill-in in class.
The chapter numbering doesn't coincide with that in the text, but the order is the same.
The main topics in this course are these: the sampling theorem and its applications; processing
continuous-time signals using discrete-time components; digital filters and multirate digital filters
for block-by-block processing; the discrete Fourier transform and its applications and the FFT;
spectral analysis of signals, that is, determining the Fourier transform of a signal. The course
begins with the basic tools of discrete-time signals and systems, namely, Fourier transforms and z
transforms. The main chapters in the text are 2-4 and 7-10.
The treatment in these notes is on the same level as the text, neither more nor less difficult.
But the notes use matrix algebra whereas the text does not. DSP systems are entirely linear (except
for quantizers), so it's only natural to use linear algebra. For example, here's the definition of the
discrete Fourier transform of the N samples {x[0], . . . , x[N-1]}:

    X[k] = \sum_{n=0}^{N-1} x[n] e^{-j2\pi kn/N}, \qquad k = 0, 1, \ldots, N-1.    (1)

If you stack the samples x[0], . . . , x[N-1] into a vector x and the DFT coefficients X[0], . . . , X[N-1]
into a vector X, then equation (1) is simply

    X = Fx,    (2)

where F is an N × N matrix, called the Fourier matrix. Obviously, (2) is simpler in appearance than
(1), but more importantly (2) is ready for computations: X = F*x is a one-line MATLAB command,
whereas (1) requires two for loops to implement, one for n and one for k. I hope you find the
treatment in these notes simple. You may have to review matrix algebra; don't worry, it's easy to
pick up.
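As a concrete illustration, here is a minimal MATLAB sketch (mine, not from the notes) that builds F and checks the one-line computation (2) against the two-loop implementation of (1):

    % Build the N x N Fourier matrix: F(k+1,n+1) = e^{-j 2 pi k n / N}
    N = 8;
    n = 0:N-1;
    F = exp(-1j*2*pi/N * (n') * n);
    x = randn(N,1);
    X = F*x;                       % equation (2): one line
    X2 = zeros(N,1);               % equation (1): two for loops
    for k = 0:N-1
        for m = 0:N-1
            X2(k+1) = X2(k+1) + x(m+1)*exp(-1j*2*pi*k*m/N);
        end
    end
    max(abs(X - X2))               % essentially zero
    % fft(x) computes the same DFT via the FFT algorithm of Chapter 8.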
There are several computer applications for solving numerical problems in this course. The most
widely used is MATLAB, but it's expensive. I like Scilab, which is free. Others are Mathematica
(expensive) and Octave (free).

Contents

Preface

The Most Important Things

1 Introductory Example: Digital Storage of Music

2 Review of Continuous Time
  2.1 Elementary Signals
  2.2 Transforms
  2.3 Laplace vs Fourier
  2.4 LTI Systems

3 Discrete-time Signals and Systems
  3.1 Elementary Signals
  3.2 z Transforms
  3.3 Fourier Transforms
  3.4 System Concepts
  3.5 Matrix Representations
  3.6 Difference Equations
  3.7 FIR, IIR, and Block Diagrams

4 The Sampling Theorem
  4.1 The Underlying Idea
  4.2 Sampling a Discrete-time Signal
  4.3 Sampling a Continuous-time Signal
  4.4 Ideal, Non-causal Reconstruction
  4.5 Summary of Formulas
  4.6 Causal Reconstruction
  4.7 Discrete-time Processing of Continuous-time Signals
  4.8 Discrete-time Random Signals (frag)
  4.9 A/D and D/A Converters

5 Multirate DSP
  5.1 Components of a Multirate System
  5.2 Block Filtering
  5.3 Changing the Sampling Rate
  5.4 Subband Coding
  5.5 MPEG Audio Compression
  5.6 Efficient Implementation

6 Filter Design (frag)

7 The DFT
  7.1 Definition of DFT
  7.2 Circular Convolution
  7.3 Vectors and Matrices
  7.4 Circular Convolution via the Circulant Matrix
  7.5 Ordinary Convolution via DFT
  7.6 Digital Filter Implementation via DFT
  7.7 Summary

8 The FFT
  8.1 The Algorithm
  8.2 Complexity

9 Spectral Analysis (frag)
  9.1 Using the DFT

A Proof of the Sampling Theorem using Fourier Series

B The Fourier Chart

Review Problems

The Most Important Things


Here's a summary of the most important things we'll cover. Read this list several times during the
term. At the end, you should understand all the points.
1. DSP is a signals and systems subject. Every component is linear: C/D, D/C, H(z), the downsampler
↓L, the upsampler ↑M, the DFT. The only exception is the quantizer, Q, which is nonlinear and
memoryless. What makes DSP a richer subject than your previous signals and systems courses are two
things: 1) a DSP system usually has both continuous-time and discrete-time signals co-existing; 2) many
DSP systems are multirate, with down- and up-samplers; this makes them not time-invariant.
2. And because almost everything is linear, linear mathematical tools are central to the subject.
In particular, we emphasize vectors, matrices, and linear algebra. A discrete-time signal can
be represented as a vector (of infinite dimension), and as such the set of all signals forms a
vector space. A discrete-time linear system has a matrix representation. Time invariance and
causality are then simple properties of the matrix, namely, constant along diagonals and lower
triangular, respectively.
3. As with all signals and systems subjects, the frequency domain is where we get the most
insight. Thus, Fourier transforms and z transforms are the most important tools.
The z transform can be inverted by partial fraction and power series expansion or by the
residue theorem.
The region of convergence (ROC) of a z transform can be a disk, the exterior of a disk, an
annulus, the punctured plane, or the whole plane. A system is causal iff the ROC of the
transfer function is the exterior of a disk and the transfer function is proper (deg num ≤ deg
den). The system is stable iff the ROC contains the unit circle.
If x[n] is absolutely summable, the Fourier transform converges absolutely and is continuous.
If x[n] is only square summable, the Fourier transform is guaranteed to converge only in the
mean-square sense. If x[n] is a sinusoid, its Fourier transform is an impulse. Other periodic
signals must be handled carefully too.
4. The sampling theorem: a signal bandlimited to f0 Hz can be reconstructed from its periodically
sampled values provided the sampling frequency fs satisfies fs > 2f0. The reason for
the factor of 2 is that the band of the signal is really (−f0, f0), which has width 2f0.
You should be able to state the discrete-time sampling theorem in a similar way.
The sampling theorem provides a perfect reconstruction system: C/D followed by D/C. The
input to C/D is perfectly reconstructed at the output of D/C. The reconstruction system
can be derived via impulse-train modulation.

5. Applications of the sampling theorem:


A signal sampled at one frequency can be converted entirely in discrete time to the same
signal sampled at another frequency provided the two frequencies are rationally related and
the signal is bandlimited to less than half both sampling frequencies.
A continuous-time filter can be implemented by C/D + digital filter + D/C provided the
input is bandlimited.
6. There are a variety of digital filter design methods. A popular one is by windowing: Truncating
the ideal impulse response and multiplying by a window function.
Digital filters are normally implemented in the time domain via convolution. If the input
is long, it can be broken up into blocks (segments); this then necessitates an overlap-add
combiner. The block processing involves multiplication by a finite Toeplitz matrix; this can
be sped up by converting to a circulant matrix, which can then be diagonalized via the FFT.
7. MPEG audio coding is a form of subband coding. It is implemented by a multirate filter bank.
8. Real A/D converters are based on oversampling, sigma-delta modulation, low-bit quantizers,
and noise shaping. This is because a high SQNR can be achieved. The analysis of these
systems is difficult and common practice is to approximate the quantizers by additive white
noise.
9. The DFT is the simplest of all the Fourier transforms. The DFT is an orthogonal transform
of a vector of sample values. In practice, the DFT is computed via X = F x, where F is
the DFT matrix. Use of the DFT is widespread because of the fast algorithm of Cooley and
Tukey.
10. The DFT is also used for spectral analysis of, for example, biological signals, such as an
electrocardiogram. This involves windowing too.

Chapter 1

Introductory Example: Digital Storage of Music
An interesting application of DSP is the digital storage of music on, say, a CD. Storage onto
a hard disk or flash memory is similar. This chapter, meant to be an introduction to the course,
discusses how this is done. Notation is introduced that we'll use throughout.
1. Imagine sitting at a piano and striking A above middle C. What happens? A hammer strikes
the string for that note, the string vibrates at 440 Hz, air molecules move, a sound wave travels
to your eardrum, your eardrum vibrates, an electrical signal carries that information to your
brain, and you hear that note. If you strike the A key an octave higher, the same thing
happens but at 2 × 440 Hz. An octave higher, at 2² × 440 Hz; an octave higher, at 2³ × 440
Hz. And then the keyboard ends. But imagine you could continue to listen to tones of higher
and higher frequency. Since the eardrum has mass, eventually the amplitude of oscillation of
the eardrum would be so small that no signal would be sent to the brain and you wouldn't
hear the tone. In this sense, the human auditory system is bandlimited, to about 20 kHz
(though not exactly a brickwall filter). And it is for this reason that digital storage of music is possible.
2. Now we turn to recording a piece of piano music onto a CD. We need a microphone and
a computer. Let's say the piano piece lasts 60 minutes. We can represent the music as a
continuous-time signal xc(t), with t in seconds and 0 ≤ t ≤ 60². Physically, xc(t) represents
the pressure that the sound wave applies to the microphone. The output of the microphone
is a voltage signal, say wc(t). We want to store wc(t), 0 ≤ t ≤ 60², on the hard disk of
the computer. Because the ear is bandlimited to 20 kHz, we can lowpass filter wc(t) to this
frequency. Let's suppose we do this with an analog filter and that the output is vc(t).
According to the sampling theorem, which we will study in detail, the signal vc(t) can be
sampled at frequency fs > 2 × 20 kHz without any loss of information. The CD standard is
fs = 44.1 kHz. This produces the sample values

    v[n] = v_c(nT), \quad T = 1/(44.1 \times 10^3), \quad 0 \le n \le 60^2 \times 44.1 \times 10^3.
Notice the notation: square brackets, [n], around the discrete-time variable. The signal v[n] is
still an analog voltage, so to be stored in a finite-capacity memory it has to be quantized, and
this is normally done using 16 bits of resolution. So the number of bits needed to be stored is

    60^2 \times 44.1 \times 10^3 \times 16 = 2.54016 \times 10^9,

i.e., 2.5 Gigabits.
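A quick MATLAB check of this arithmetic (variable names are mine):

    fs   = 44.1e3;             % sampling frequency, Hz
    dur  = 60*60;              % 60 minutes in seconds (= 60^2)
    bits = 16;                 % quantizer resolution
    total_bits = dur*fs*bits   % 2.54016e9, i.e., about 2.5 Gigabits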
The block diagram representation of the storage process is this:

    x_c(t) → [microphone] → w_c(t) → [lowpass filter] → v_c(t) → [C/D] → v[n] → [Q] → y[n]

The block C/D is the ideal continuous-to-discrete transformation, namely, periodic sampling.
Then Q represents the quantizer. We'll study these blocks in detail. The output signal y[n] is
burned onto the CD. (In the notes' block diagrams the arrow convention is: a continuous arrow
for a continuous-time signal and a dashed arrow for a discrete-time signal.)
3. The reverse process is to take y[n] and produce vc(t); neither wc(t) nor xc(t) can be
reconstructed. The sampling theorem gives a formula for vc(t) in terms of v[n]. But this relationship
is not causal: it requires forward interpolation. Since the CD is spinning on playback, it is
customary to generate vc(t) causally. The standard way is zero-order hold followed by smoothing.
ZOH is what it says: hold every y[n] for exactly T seconds. So the playback block diagram
is this:

    y[n] → [ZOH] → [lowpass filter] → v̂_c(t) → [speaker] → x̂_c(t)

The hat on v̂c(t) indicates that the signal is some sort of approximation of vc(t); likewise for
x̂c(t).
As we will see, in practice the sampling rate is a multiple of 44.1 kHz. This is called oversampling.

Chapter 2

Review of Continuous Time


2.1 Elementary Signals

1. Let us begin by remembering the most important signal for an electrical engineer, the sinusoid.
When we write sin(θ), the angle θ is always in radians. So in the signal sin(ωt), ωt has units
of radians, and hence the units of ω are radians/second, assuming t is in seconds. We could
also write ω = 2πf, where f is in Hz. We usually write sin(ωt) instead of sin(2πft).
More useful than sin(ωt) or cos(ωt) is the complex sinusoid, e^{jωt}. By Euler's formula,

    e^{j\omega t} = \cos(\omega t) + j\sin(\omega t).

Thus

    \cos(\omega t) = \frac{1}{2}\left(e^{j\omega t} + e^{-j\omega t}\right), \qquad \sin(\omega t) = \frac{1}{2j}\left(e^{j\omega t} - e^{-j\omega t}\right).

And why are sinusoids so important? You learned the answer in electric circuits: If you excite
an electric circuit by a sinusoid (voltage or current), in steady state all voltages and currents
are sinusoids too, of the same frequency. We abbreviate this by saying sinusoid in produces
sinusoid out.
2. The unit step is denoted by u(t).
3. The impulse δ(t) is not a real function; in mathematics it's called a distribution. The idea
behind δ is the sifting formula. Let f(t) be a signal that is smooth near t = 0. The sifting
formula is

    \int_{-\infty}^{\infty} f(t)\,\delta(t)\,dt = f(0).

We pretend there is a function δ(t) satisfying this equation for every f(t), and we proceed
from there. For example, by change of variables,

    \int_{-\infty}^{\infty} f(t)\,\delta(t-\tau)\,dt = \int_{-\infty}^{\infty} f(t+\tau)\,\delta(t)\,dt = f(\tau).



Also,

    \int_{-\infty}^{\infty} f(t)\,\delta(at)\,dt = \frac{1}{|a|}\int_{-\infty}^{\infty} f(t/a)\,\delta(t)\,dt = \frac{1}{|a|}f(0),

and so δ(at) = (1/|a|)δ(t). The product δ(t)² is not defined, but u' = δ in the sense that, if
lim_{t→∞} f(t) = 0, then

    \int_{-\infty}^{\infty} f(t)\,u'(t)\,dt = f(0);

this follows by integration by parts.

2.2 Transforms

1. Let x(t) be a signal defined either for all t or just for t ≥ 0. The one-sided Laplace transform
(LT) of x(t) is

    X(s) = \int_0^{\infty} x(t)\, e^{-st}\, dt.

The region of convergence (ROC) is a right half-plane, that is, the real part of s has to be
large enough. Within the ROC, X(s) has no poles.
2. The Fourier transform (FT) of x(t) is

    X(j\omega) = \int_{-\infty}^{\infty} x(t)\, e^{-j\omega t}\, dt.

The inversion formula is

    x(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} X(j\omega)\, e^{j\omega t}\, d\omega.

If x(t) is absolutely integrable, that is,

    \int_{-\infty}^{\infty} |x(t)|\, dt < \infty,

then X(jω) is a continuous function of ω. An example is x(t) = e^{-|t|} cos t. If x(t) is only
square-integrable, that is,

    \int_{-\infty}^{\infty} |x(t)|^2\, dt < \infty,

then X(jω) may not be a continuous function of ω. An example is x(t) = sin(t)/t.

The constant signal 1(t) that equals 1 for all t is neither absolutely integrable nor square-integrable.
Its FT is defined to be 2πδ(ω). We are convinced of this by the inversion formula,
which is an instance of sifting:

    1 = \frac{1}{2\pi} \int_{-\infty}^{\infty} 2\pi\delta(\omega)\, e^{j\omega t}\, d\omega.


The forward FT equation is

    2\pi\delta(\omega) = \int_{-\infty}^{\infty} e^{-j\omega t}\, dt,

and it has no meaning in the sense of ordinary functions. Likewise, the sinusoidal signal e^{jω₀t}
is neither absolutely integrable nor square-integrable. Its FT is defined to be 2πδ(ω − ω₀).

In general, Fourier transforms where either x(t) or X(jω) is not a function (i.e., has an
impulse) must be treated with care to ensure the result is correct.

2.3 Laplace vs Fourier

Here's an interesting question: suppose you have an LTI system with impulse response function
h(t). Let's assume h(t) = 0 for t < 0. Let H(s) be the Laplace transform of h(t) and then substitute
jω for s in H(s), so you now have H(jω). Is H(jω) the Fourier transform of h(t)? The answer is,
not necessarily.
To see this, let us temporarily, for this section only, denote the Laplace transform of h(t) by
H_LT(s) and the Fourier transform by H_FT(jω). The Laplace transform H_LT(s) exists provided h(t)
satisfies some conditions; for example, if it grows without bound, the growth is at most exponential.
As we said before, the ROC is a right half-plane.
Let us look at two examples:
1. The unit step function is

    u(t) = \begin{cases} 0, & t < 0 \\ 1, & t \ge 0. \end{cases}

The Laplace transform and ROC are

    U_{LT}(s) = \frac{1}{s}, \qquad \text{ROC: } \operatorname{Re} s > 0.

Thus the ROC is the open right half-plane, and U_LT(s) has a pole on the boundary of that
region, namely at s = 0.
2. Consider the causal signal h(t) = e^{-t}u(t). Then

    H_{LT}(s) = \frac{1}{s+1}, \qquad \text{ROC: } \operatorname{Re} s > -1.

Here the ROC contains the imaginary axis.


We turn now to the FT:
Z
HF T (j) =
h(t)ejt dt.
0

Again, the lower limit could be but it doesnt matter, because h(t) is causal. Let us do the two
examples from above:


1. The Fourier transform of the unit step is

    U_{FT}(j\omega) = \pi\delta(\omega) + \frac{1}{j\omega}.

Thus the Fourier transform is a distribution, having an impulse. Notice that the Fourier
transform and the Laplace transform are not equivalent:

    U_{FT}(j\omega) \ne U_{LT}(j\omega).

Indeed, setting s = jω in U_LT(s), when the imaginary axis is not in the ROC, requires some
justification.
2. For h(t) = e^{-t}u(t),

    H_{FT}(j\omega) = \frac{1}{j\omega + 1}.

In this case the Fourier transform is a function. Moreover, the Fourier and Laplace transforms
are equivalent:

    H_{FT}(j\omega) = H_{LT}(j\omega).

In general, the Fourier and Laplace transforms are equivalent, that is, H_FT(jω) = H_LT(jω),
when the ROC of the Laplace transform includes the imaginary axis.
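A numerical sanity check of this equivalence for example 2, sketched in MATLAB (mine; the test frequency ω = 3 is arbitrary):

    t = 0:1e-4:50;                       % long enough that e^{-t} has died out
    h = exp(-t);                         % h(t) = e^{-t} u(t)
    w = 3;
    H_num = trapz(t, h.*exp(-1j*w*t));   % numerical Fourier integral
    H_formula = 1/(1j*w + 1);            % H_LT(s) evaluated at s = jw
    abs(H_num - H_formula)               % small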

2.4 LTI Systems

1. Now we turn to the basic notions of a system and its properties. A system has an input x(t)
and an output y(t). We will assume these are real-valued functions of the time variable t and
that t runs over all time, −∞ < t < ∞. An alternative is 0 ≤ t < ∞. There may also be
initial conditions if the system arises from a differential equation, but we won't review that
topic because it's not central in communication systems.
2. The system is linear if it has two properties: if x1 produces y1 and x2 produces y2, then
x1 + x2 produces y1 + y2; and if x produces y, then cx produces cy for any real number c. Which
of the following systems are linear?
(a) y(t) = tx(t)
(b) y(t) = t²x(t)
(c) y(t) = \int_0^t x(\tau)\, d\tau
(d) y(t) = x(t − 2)
(e) y(t) = e^{x(t)}
(f) y(t) = ax(t) + b
(g) y(t) = x(−t)


3. Many (probably most) linear systems have a model like this:

    y(t) = \int_{-\infty}^{\infty} h(t, \tau)\, x(\tau)\, d\tau.

The function h(t, τ) is the impulse response; h(t, t0) is the output at time t when the input
is an impulse at time t0. The linear system is causal if y(t) depends on x(τ) only for τ ≤ t,
that is, h(t, τ) = 0 for τ > t. The linear system is time-invariant if it has this property: if
x(t) produces y(t), then for every T, x(t − T) produces y(t − T). Equivalently, h(t, τ) depends
only on t − τ. Then we write h(t − τ) instead of h(t, τ), and we have the convolution equation

    y(t) = \int_{-\infty}^{\infty} h(t-\tau)\, x(\tau)\, d\tau,

which is also written y(t) = h(t) ∗ x(t), although more properly y(t) = (h ∗ x)(t). Thus a linear
time-invariant (LTI) system is modeled by a convolution equation. If it is causal, then h(t) = 0
for t < 0 and

    y(t) = \int_{-\infty}^{t} h(t-\tau)\, x(\tau)\, d\tau.
4. The transfer function of an LTI causal system is the Laplace transform H(s) of h(t). The
frequency-response function of an LTI system is the Fourier transform H(jω) of h(t).
5. A signal x(t) is said to be bounded if (obviously) there is a bound B such that |x(t)| ≤ B
for all t. Examples: sinusoids are bounded, but a ramp x(t) = tu(t) is not. A linear system
with input x(t) and output y(t) is said to be stable if every bounded input x(t) produces a
bounded output y(t). Examples: an RLC circuit is stable, but an LC circuit is not. It is a
theorem that the LTI system y(t) = h(t) ∗ x(t) is stable if and only if the impulse response
function is absolutely integrable, i.e.,

    \int_{-\infty}^{\infty} |h(t)|\, dt < \infty.

This is valid even for non-causal systems. Finally, if a causal LTI system has a transfer function
H(s), then the system is stable if and only if the ROC of H(s) includes the imaginary axis,
that is, all the poles of H(s) have negative real part.

Chapter 3

Discrete-time Signals and Systems


Here we blend the topics of Chapters 2 and 3 in the text.

3.1 Elementary Signals

1. In discrete time, the time variable is denoted n and is an integer; depending on the application,
it could range over all positive and negative integers, −∞ < n < ∞, or just the non-negative
integers, n ≥ 0, or just a finite number of integers, 0 ≤ n ≤ N−1. A discrete-time signal is
written x[n], with square brackets. The integer n does not stand for real time, but rather for
sample number. For example, consider the sinusoid x(t) = sin(10t). Suppose starting at
t = 0 we sample it every 0.2 seconds. Then we have the samples x[n] = x(0.2n) = sin(2n), n =
0, 1, 2, . . . . You can see that n indicates the sample number, and the samples x[0], x[1], x[2], . . .
occur at the real times 0, 0.2, 0.4, . . . .
So sinusoids in discrete time look like x[n] = sin(ωn) or x[n] = e^{jωn}.
2. The unit step is u[n] and the unit impulse is δ[n]. The latter equals 1 when n = 0 and zero
otherwise.

3.2 z Transforms

1. The z transform is to discrete time as the Laplace transform is to continuous time. The z
transform of x[n] is

    X(z) = \sum_{n=-\infty}^{\infty} x[n]\, z^{-n}.

The region of convergence (ROC) is in general an annulus (ring, doughnut). The function
X(z) has no poles inside the ROC.
2. Example: Let u[n] denote the unit step in discrete time. Thus u[n] = 1 for n ≥ 0 and
u[n] = 0 for n negative. The series

    U(z) = \sum_{n=-\infty}^{\infty} u[n]\, z^{-n} = 1 + \frac{1}{z} + \frac{1}{z^2} + \cdots

converges for |1/z| < 1, i.e., |z| > 1. Thus the ROC is the exterior of the unit disk. For z in
this region, the series converges to

    \frac{1}{1 - (1/z)} = \frac{z}{z-1}.

Thus there's a pole at z = 1, on the boundary of the ROC.
3. Example: Let x[n] = u[−n]. Then

    X(z) = 1 + z + z^2 + \cdots.

The ROC is the disk |z| < 1, inside which

    X(z) = \frac{1}{1-z}.

Thus there's a pole at z = 1, on the boundary of the ROC.

4. For the next example, it's convenient to have a formula for the finite series 1 + a + a² + ⋯ + a^{N−1}.
Since

    (1 + a + a^2 + \cdots + a^{N-1})(1-a) = 1 - a^N,

we have

    1 + a + a^2 + \cdots + a^{N-1} = \frac{1-a^N}{1-a}.

Notice on the right-hand side the numerator and denominator are both zero at a = 1. Thus
the roots of the polynomial 1 + a + a² + ⋯ + a^{N−1} are the points a such that a^N = 1, excluding
the point a = 1. That is, the points

    e^{j2\pi k/N}, \qquad k = 1, \ldots, N-1.
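A quick MATLAB check of this root claim, for N = 10 (a sketch; roots and angle are built-in functions):

    N = 10;
    r = roots(ones(1,N));        % coefficients of a^{N-1} + ... + a + 1
    sort(angle(r))               % angles 2*pi*k/N, k = 1,...,N-1 (mod 2*pi)
    max(abs(abs(r) - 1))         % all roots lie on the unit circle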

5. Example: Let x[n] = u[n] − u[n−10]. Thus

    X(z) = 1 + \frac{1}{z} + \cdots + \frac{1}{z^9} = \frac{1 - (1/z^{10})}{1 - (1/z)} = \frac{z^{10} - 1}{z^9 (z-1)}.

The ROC is z ≠ 0, the extreme form of an annulus: 0 < |z| < ∞. There are 9 poles at z = 0
and 9 zeros on the unit circle, at

    z = e^{j2\pi k/10}, \qquad k = 1, \ldots, 9.

6. Example: Take the continuous-time sinusoid x(t) = cos(t)u(t) and sample it at 1 sample per
second: x[n] = cos(n)u[n]. (u is the unit step.) Find X(z) and the ROC. Locate the poles
and zeros.
7. The signal x[n] = cos(n) does not have a z transform. The reason is that |z| has to be large
enough, namely |z| > 1, for \sum_{n=0}^{\infty} x[n]z^{-n} to converge, but also small enough, namely |z| < 1,
for \sum_{n=-\infty}^{-1} x[n]z^{-n} to converge. The regions |z| > 1 and |z| < 1 have empty intersection.



8. Consider the following picture:

    [figure: left, a decaying exponential x(t), zero for t < 0; right, the s-plane (axes Re s, Im s) with the ROC shaded as a right half-plane including the imaginary axis]

The graph on the left is of a decaying exponential signal in continuous time; x(t) = 0 for
t < 0. The figure on the right is the s-plane. Shown is the ROC as a shaded right half-plane
that includes the imaginary axis. Extend to all other cases. For signals that are nonzero in
negative time, you'll have to use the two-sided Laplace transform:

    X(s) = \int_{-\infty}^{\infty} x(t)\, e^{-st}\, dt.

For example, here's another pair:

    [figure: left, a signal that blows up in positive time and decays in negative time; right, the s-plane with the ROC shaded as a vertical strip in the right half-plane]

The signal is blowing up in positive time but decaying in negative time; the ROC is a vertical
strip in the right half-plane.
9. Do all the analogous pictures for discrete time. The ROC may be a disk, an annulus, etc.
10. What about inversion? How do we get x[n] if we have X(z)? One way is to expand X(z) as
a series and then read off x[n] from the coefficients. That is, if we can expand

    X(z) = \cdots + a_{-2} z^2 + a_{-1} z + a_0 + a_1 \frac{1}{z} + a_2 \frac{1}{z^2} + \cdots,

then x[n] = a_n. Such an expansion is called a Laurent series. Example: X(z) = 1/(z − 1). If
the ROC is |z| < 1, then the series expansion must be

    X(z) = -\frac{1}{1-z} = -(1 + z + z^2 + \cdots).

Read off x[n]. But if the ROC is |z| > 1, then

    X(z) = \frac{1}{z}\,\frac{1}{1 - z^{-1}} = \frac{1}{z}(1 + z^{-1} + z^{-2} + \cdots).

Read off x[n].


Another example:
X(z) =

z1
.
(z 2)(z 3)

There are three possible ROCs. You have to know which one. Then you do a partial fraction
expansion, like this:
X(z) =

1
2
+
.
z2 z3

Then do the appropriate Laurent series for both terms. Then read off the coefficients. You
finish.
11. There's another approach to inversion, based on complex function theory, in particular,
Cauchy's theorem and its corollary, the residue theorem. This is an advanced topic but
it's mentioned in case you have taken a course on complex variables.

Lemma 1 Let C be a circle centred at the origin and contained within the ROC of X(z).
Then x[n] equals the sum of the residues of X(z)z^{n−1} at all poles encircled by C.
Example:

    X(z) = \frac{2z+1}{z(z-2)}.

This has two poles, at z = 0 and z = 2. By partial fraction expansion we can write

    X(z) = \frac{c_1}{z} + \frac{c_2}{z-2}

for some constants c1 and c2. The constant c1 is called the residue of X(z) at the pole z = 0;
likewise c2 is the residue of X(z) at the pole z = 2. For this example c1 = −1/2 and c2 = 5/2.
The ROC could be either |z| > 2 or 0 < |z| < 2. Imagine a circle C in the complex plane,
centred at the origin, and contained within the ROC. So for the former ROC, the radius of
the circle is greater than 2; for the latter ROC, less than 2. First

    X(z) = \frac{2z+1}{z(z-2)}, \qquad \text{ROC: } |z| > 2.

We have

    X(z)\, z^{n-1} = \frac{z^{n-2}(2z+1)}{z-2}.

The radius of C is greater than 2. Inside C, X(z)z^{n−1} has a pole at z = 2 and also a pole at
z = 0 if n < 2. For n = 0 we have

    x[0] = \text{sum of the residues of } \frac{2z+1}{z^2(z-2)}.

Now

    \frac{2z+1}{z^2(z-2)} = \frac{c_1}{z^2} + \frac{c_2}{z} + \frac{c_3}{z-2}

and the sum of the residues is c2 + c3, which equals 0. (The residue at the pole z = 0 is c2,
not c1.) For n = 1,

    x[1] = \text{sum of the residues of } \frac{2z+1}{z(z-2)} = 2.

Then for n ≥ 2, there's only a pole at z = 2:

    x[n] = \text{the residue of } \frac{z^{n-2}(2z+1)}{z-2} = 5 \cdot 2^{n-2} = \frac{5}{4}\, 2^n.

Now let's turn to the other ROC:

    X(z) = \frac{2z+1}{z(z-2)}, \qquad \text{ROC: } 0 < |z| < 2.

Then inside C there's at most a pole at z = 0, so

    x[n] = \text{the residue of } \frac{z^{n-2}(2z+1)}{z-2} \text{ at } z = 0.

For n ≥ 2 there is no pole and so x[n] = 0. For n = 1,

    x[1] = \text{the residue of } \frac{z^{-1}(2z+1)}{z-2} = -\frac{1}{2}.

And so on.
Finally, here's a formula for residues: if X(z) has a simple pole at z = p, then its residue there is

    \lim_{z \to p}\, (z-p)\, X(z).

If the pole has multiplicity m, then the residue is

    \lim_{z \to p} \frac{1}{(m-1)!} \frac{d^{m-1}}{dz^{m-1}} \left[ (z-p)^m X(z) \right].
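As a sanity check of the causal case (ROC |z| > 2): dividing the numerator and denominator of X(z) by z² gives (2z⁻¹ + z⁻²)/(1 − 2z⁻¹), which MATLAB's filter can simulate; a sketch (mine):

    b = [0 2 1];                   % numerator in powers of z^{-1}
    a = [1 -2];                    % denominator in powers of z^{-1}
    n = 0:8;
    d = [1 zeros(1,8)];            % unit impulse
    x = filter(b, a, d)            % 0, 2, 5, 10, 20, ...
    [x(3:end); (5/4)*2.^n(3:end)]  % rows agree: x[n] = (5/4)2^n for n >= 2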

12. The convolution theorem says that convolution in the time domain corresponds to multiplication
in the transform domain: if y[n] = h[n] ∗ x[n], then Y(z) = H(z)X(z). The proof is
quite direct and is similar to the analogous result in continuous time.

3.3 Fourier Transforms

1. The Fourier transform (FT) of x[n] is

    X(e^{j\omega}) = \sum_{n=-\infty}^{\infty} x[n]\, e^{-j\omega n}.

This is the analysis equation: getting the sinusoidal components of the signal. The inversion
formula (IFT) is

    x[n] = \frac{1}{2\pi} \int_{-\pi}^{\pi} X(e^{j\omega})\, e^{j\omega n}\, d\omega.

This is the synthesis equation: combining the sinusoidal components into the signal.
2. Given a signal x[n], let's say we have its z transform X(z) and ROC, and we have its Fourier
transform X(e^{jω}). So it seems the FT is just the z transform with z replaced by e^{jω}, a point
on the unit circle. Is this correct? Yes, if the ROC of the z transform includes the unit circle.
For example, it's not true for the unit step, u[n]. Even worse, the sinusoid cos n has a FT but
not a z transform.
3. The meaning of the FT formulas is different depending on the properties of the signal; this
is important. There are three different cases, as we now discuss.
4. Case 1: If x[n] is absolutely summable, that is,

    \sum_{n=-\infty}^{\infty} |x[n]| < \infty,

then X(e^{jω}) is a continuous function of ω. Example:

    x[n] = \delta[n] + \delta[n-1] + \delta[n-2].

This signal has a z transform and the ROC contains the unit circle. Therefore all we need to
do is find the z transform of x[n] and then set z = e^{jω} to get the FT.
5. Case 2: If x[n] is only square-summable, that is,

    \sum_{n=-\infty}^{\infty} x[n]^2 < \infty,

then X(e^{jω}) may not be a continuous function of ω. Example: let's start with the FT:

    X(e^{j\omega}) = \begin{cases} 1, & |\omega| \le \pi/2 \\ 0, & \pi/2 < |\omega| \le \pi. \end{cases}

Then by the inversion formula

    x[n] = \frac{1}{2\pi} \int_{-\pi}^{\pi} X(e^{j\omega})\, e^{j\omega n}\, d\omega = \frac{1}{2\pi} \int_{-\pi/2}^{\pi/2} e^{j\omega n}\, d\omega.

The indefinite integral equals

    \frac{1}{2\pi}\frac{1}{jn}\, e^{j\omega n}.

Thus

    x[n] = \frac{1}{2\pi jn}\left( e^{j\pi n/2} - e^{-j\pi n/2} \right) = \frac{1}{\pi n}\sin(\pi n/2).

This is not absolutely summable, because 1/n does not converge to zero fast enough as n goes
to ∞. So in this example, X(e^{jω}) and x[n] are both well defined. The anomaly concerns the
analysis equation

    X(e^{j\omega}) = \sum_{n=-\infty}^{\infty} x[n]\, e^{-j\omega n}.

In what sense does the right-hand side converge to the discontinuous function on the left? It's
like this. Fix N and define

    X_N(e^{j\omega}) = \sum_{n=-N}^{N} x[n]\, e^{-j\omega n}.

Then the question we just asked is, in what sense does X_N(e^{jω}) converge to X(e^{jω})? In the
integral-squared sense:

    \lim_{N \to \infty} \int_{-\pi}^{\pi} \left| X_N(e^{j\omega}) - X(e^{j\omega}) \right|^2 d\omega = 0.
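Here is a small MATLAB sketch (mine, not from the text) of this mean-square convergence; the integrated squared error shrinks as N grows even though the overshoot near the discontinuity persists:

    w = linspace(-pi, pi, 4001);
    X = double(abs(w) <= pi/2);                % the discontinuous target
    for N = [8 32 128]
        n = [-N:-1 1:N];                       % skip n = 0; x[0] = 1/2
        XN = 1/2 + (sin(pi*n/2)./(pi*n)) * exp(-1j*n'*w);
        err = trapz(w, abs(XN - X).^2)         % decreases with N
    end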

6. There's a beautiful theory for the functions in Case 2. It goes like this. Let x[n] and y[n] be
square-summable and let X(e^{jω}) and Y(e^{jω}) be their FTs. Define

    \langle x, y \rangle = \sum_n x[n]\, \overline{y[n]}.

This is called an inner product. It's exactly like the dot product of two vectors. Also, define
the inner product of the two FTs:

    \langle X, Y \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} X(e^{j\omega})\, \overline{Y(e^{j\omega})}\, d\omega.

The overbar denotes complex conjugate. So we have an inner product in the time domain and
an inner product in the frequency domain. In fact, they're equal for every x and y. This is
called Parseval's Theorem. In the special case where x[n] = y[n], we have

    \sum_n |x[n]|^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} \left| X(e^{j\omega}) \right|^2 d\omega.
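A numerical illustration (a sketch of mine, using a finite-length signal so the DTFT can be evaluated on a dense frequency grid):

    x = randn(1, 6);
    w = linspace(-pi, pi, 20001);
    X = x * exp(-1j*(0:5)'*w);             % DTFT samples of x
    lhs = sum(x.^2)                        % time-domain energy
    rhs = trapz(w, abs(X).^2)/(2*pi)       % agrees with lhs (Parseval)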


7. Case 3: The sinusoidal signal e^{jω₀n} is neither absolutely summable nor square-summable. Its
FT is defined to be 2πδ(ω − ω₀). The FT equation

    2\pi\delta(\omega - \omega_0) = \sum_{n=-\infty}^{\infty} e^{j\omega_0 n}\, e^{-j\omega n}

has no meaning in the sense of ordinary functions. However, the inversion formula

    e^{j\omega_0 n} = \frac{1}{2\pi} \int_{-\pi}^{\pi} 2\pi\delta(\omega - \omega_0)\, e^{j\omega n}\, d\omega

is consistent with the sifting property of the impulse. Another example:

    x[n] = \sum_k \delta[n - 3k],

which is an upsampled impulse, the output of ↑3 for the input δ[n].


8. You can prove these properties of the DTFT:
(a) X(e^{jω}) is periodic, of period 2π. We restrict the angle ω to [−π, π).
(b) If x[n] is real, then X(e^{jω}) has even magnitude and odd phase.
9. The convolution theorem holds for Fourier transforms too. Do this example:

    x[n] = \sum_k \delta[n-3k], \qquad h[n] = (\delta[n] + \delta[n-1])/2.

Find y[n] = h[n] ∗ x[n].


10. This is an aside: consider a causal signal x[n] that has a FT. Causal means x[n] = 0 for
n < 0. It turns out the real part of X(e^{jω}) is uniquely determined by the imaginary part, and
vice versa. This isn't important in our course, but it is in electromagnetics, in the form of the
Kramers-Kronig relations.

3.4 System Concepts

1. A system has an input x[n] and an output y[n]. There may also be initial conditions if the
system arises from a difference equation. The system is linear if it has two properties: if
x1 produces y1 and x2 produces y2, then x1 + x2 produces y1 + y2; and if x produces y, then cx
produces cy for any real number c.
Which of the following systems are linear?
(a) y[n] = nx[n]
(b) y[n] = n²x[n]
(c) y[n] = \sum_{k=0}^{n} x[k]
(d) y[n] = \sum_{k=n-2}^{n+2} x[k]
(e) y[n] = x[n−2]

(f) y[n] = e^{x[n-2]}
(g) y[n] = ax[n] + b
(h) y[n] = x[−n]
(i) y[n] = x[n] + 3u[n+1]
2. Most linear systems have a model like this:

    y[n] = \sum_{m=-\infty}^{\infty} h[n, m]\, x[m].

The function h[n, m] is the impulse response; h[n, m] is the output at time n when the
input is a pulse at time m, i.e., δ[n − m]. The linear system is causal if y[n] depends on x[m]
only for m ≤ n, that is, h[n, m] = 0 for m > n.
3. The linear system is time-invariant if it has this property: if x[n] produces y[n], then for
every k, x[n − k] produces y[n − k]. Equivalently, h[n, m] depends only on n − m. Then we
write h[n − m] instead of h[n, m] and we have the convolution equation

    y[n] = \sum_{m=-\infty}^{\infty} h[n-m]\, x[m].

Thus a linear time-invariant (LTI) system is modeled by a convolution equation. This is
frequently abbreviated y[n] = h[n] ∗ x[n], although it's not quite correct because it
suggests ∗ is an operator that takes the two inputs h[n], x[n]. If we were fussy we'd write
y[n] = (h ∗ x)[n]. If the system is causal, then h[n] = 0 for n < 0 and

    y[n] = \sum_{m=-\infty}^{n} h[n-m]\, x[m].

4. The transfer function of an LTI system is the z transform H(z) of h[n]. Convolution in
the time domain is equivalent to multiplication in the frequency domain: y[n] = h[n] ∗ x[n]
is equivalent to Y(z) = H(z)X(z). The frequency-response function of an LTI system is
the Fourier transform H(e^{jω}) of h[n].

5. For an LTI system with transfer function H(z), when is the system causal? It's easy to answer
when H(z) is rational, the ratio of two polynomials, H(z) = N(z)/D(z).

Lemma 2 The LTI system with transfer function H(z) = N(z)/D(z) is causal iff the ROC
is the exterior of a disk and

    degree D(z) ≥ degree N(z).

Proof Necessity: assume causality. Then h[n] = 0 for n < 0 and so from the definition of
H(z),

    H(z) = h[0] + h[1]\frac{1}{z} + h[2]\frac{1}{z^2} + \cdots.

Thus the ROC has the form |z| > r. Also,

    \frac{N(z)}{D(z)} = h[0] + h[1]\frac{1}{z} + h[2]\frac{1}{z^2} + \cdots.

Letting |z| → ∞, we get the degree inequality.
Sufficiency: since the ROC is the exterior of a disk, H(z) must have the form

    H(z) = \sum_{n=n_0}^{\infty} h[n]\, z^{-n}

for some finite n0, that is, h[n] can be nonzero for only finitely many negative values of n.
From the degree inequality, n0 must be ≥ 0. Thus the system is causal. ∎

Examples: which of these could be causal:

    \frac{1}{z}, \qquad z, \qquad \frac{z^2}{z-1}, \qquad \frac{z}{z^2-1}.
6. A signal x[n] is said to be bounded if there is a bound B such that |x[n]| ≤ B for all n.
Examples: sinusoids are bounded, but a ramp x[n] = nu[n] is not. A linear system with input
x[n] and output y[n] is said to be stable if every bounded input x[n] produces a bounded
output y[n] (text, same section). Here's the time-domain test for stability of an LTI system:

Lemma 3 The LTI system y[n] = h[n] ∗ x[n] is stable iff the impulse response function is
absolutely summable, i.e.,

    \sum_{n=-\infty}^{\infty} |h[n]| < \infty.

Here's the frequency-domain test for stability of an LTI system:

Lemma 4 The LTI system with transfer function H(z) is stable iff the unit circle is contained
in the ROC.

Here's the frequency-domain test for stability of a causal LTI system:

Lemma 5 The causal LTI system with transfer function H(z) is stable iff all poles are in
|z| < 1.
Examples: which of these could be stable:

    \frac{1}{z}, \qquad z, \qquad \frac{1}{(2z-1)(z-2)}.

Why must H(z) = 1/(z − 1) be unstable?

7. LTI and FIR ⇒ stable.
8. Consider an LTI system with impulse response h[n]. Let's discuss "sinusoid in implies sinusoid
out." Suppose the input is the sinusoid x[n] = e^{jωn}. Is there an output? We have to be careful,
because the expression x[n] = e^{jωn} suggests x[n] was applied starting at n = −∞, that is, it
suggests the system is in steady state. Will there be a steady state if the input is a sinusoid?
Yes, if the system is stable. So let's start again: consider an LTI, stable system with impulse
response h[n] and suppose the input is the sinusoid x[n] = e^{jωn}. Then the output is as follows:

    y[n] = h[n] * e^{j\omega n} = \sum_m h[n-m]\, e^{j\omega m} = \sum_m h[m]\, e^{j\omega(n-m)} = e^{j\omega n} \sum_m h[m]\, e^{-j\omega m} = e^{j\omega n} H(e^{j\omega}).

So we see that the steady-state output is a sinusoid too, of the same frequency. By the way,
the equation

    h[n] * e^{j\omega n} = H(e^{j\omega})\, e^{j\omega n}

has the same form as the equation

    Ax = \lambda x,

where A is a square matrix, x is a nonzero vector, and λ is a complex number. We say λ is
an eigenvalue of A and x is an eigenvector. Likewise, in the preceding equation, we say that
sinusoids are eigenfunctions of LTI systems and Fourier transforms H(e^{jω}) are eigenvalues.
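Here is a MATLAB sketch (mine) of the eigenfunction property; the FIR impulse response and frequency below are arbitrary test values. After the delay line fills, the output is exactly H(e^{jω₀})e^{jω₀n}:

    h = [1 0.5 0.25];                    % an arbitrary FIR impulse response
    w0 = 0.7;                            % an arbitrary frequency
    n = 0:49;
    x = exp(1j*w0*n);                    % complex sinusoid input
    y = filter(h, 1, x);
    H = sum(h .* exp(-1j*w0*(0:2)));     % H(e^{j w0})
    max(abs(y(3:end) - H*x(3:end)))      % zero once the transient has passed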
9. As examples, find H(e^{jω}) in the following cases:
(a) y[n] = x[n−1]
(b) y[n] = x[−n]
(c) y[n] = x[2n]
(d) y[n] = x[n+1] + x[n] + x[n−1]

3.5 Matrix Representations

Now we introduce linear algebra into the picture. We'll see how a discrete-time linear system can
be represented by a matrix.


Vectors and Matrices

For those students who haven't used matrices since first-year linear algebra, let us review the
elements.
1. A vector is an ordered list of numbers, real or complex. We usually write the list as an array
with one column. Example:

    x = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}.

But we may occasionally write it as an n-tuple: x = (1, 2, 3).
2. An array of the form

    A = \begin{bmatrix} 1 & 2 & 0 \\ 1 & 0 & 8 \\ 0 & 3 & 4 \\ 0 & 6 & -5 \end{bmatrix}

is a 4 × 3 matrix. It has 4 rows and 3 columns.

3. If x is a column vector, the multiplication Ax is defined if x has dimension 3 (it has 3 rows).
Let y = Ax. Then y has dimension 4. Its first component is the dot product of the first row
of A with x:

    y[0] = \begin{bmatrix} 1 & 2 & 0 \end{bmatrix} \begin{bmatrix} x[0] \\ x[1] \\ x[2] \end{bmatrix} = x[0] + 2x[1].

Its second component is the dot product of the second row of A with x:

    y[1] = \begin{bmatrix} 1 & 0 & 8 \end{bmatrix} \begin{bmatrix} x[0] \\ x[1] \\ x[2] \end{bmatrix} = x[0] + 8x[2].

Thus the equation y = Ax, for this A, is a concise way of writing the 4 equations

    y[0] = x[0] + 2x[1]
    y[1] = x[0] + 8x[2]
    y[2] = 3x[1] + 4x[2]
    y[3] = 6x[1] - 5x[2].

Each component of y is a linear combination of the components of x.


4. To multiply two matrices A and B, the number of columns of A must equal the number of
rows of B. When this is true, the first column of C = AB equals, by definition, Ab1, where b1
is the first column of B. Likewise for the other columns of C. In general, matrix multiplication
is not commutative, that is, AB ≠ BA in general.
5. Matrices represent linear transformations.
Example: consider the linear transformation that takes a vector in the plane and rotates it
clockwise by π/2. There is a matrix A satisfying y = Ax, where x is the given vector in the
plane and y is the rotated vector. To find A, first let x be the unit vector on the horizontal
axis, i.e.,

    x = \begin{bmatrix} 1 \\ 0 \end{bmatrix}.

Rotating by π/2 gives the vector

    y = \begin{bmatrix} 0 \\ -1 \end{bmatrix}.

Thus

    \begin{bmatrix} 0 \\ -1 \end{bmatrix} = A \begin{bmatrix} 1 \\ 0 \end{bmatrix},

so the first column of A equals

    \begin{bmatrix} 0 \\ -1 \end{bmatrix}.

Likewise, the second column is the second basis vector

    \begin{bmatrix} 0 \\ 1 \end{bmatrix}

rotated by π/2, namely,

    \begin{bmatrix} 1 \\ 0 \end{bmatrix}.

Thus

    A = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}.

A Signal is a Vector
1. Consider a discrete-time signal x[n] and let us suppose the time variable n runs over the range
0, 1, 2, . . . , that is, the non-negative integers. We could write the signal as the text does in
(2.1),
x = {x[n]},


or we could write the components out as an ordered list,

    x = (x[0], x[1], x[2], \ldots).

Another way to write the signal is to stack the components up to form a column vector:

    x = \begin{bmatrix} x[0] \\ x[1] \\ x[2] \\ \vdots \end{bmatrix}.

Of course, this vector has an infinite number of components, that is, it's an infinite-dimensional
vector. For example, here are the unit impulse and unit step:

    \delta = \begin{bmatrix} 1 \\ 0 \\ 0 \\ \vdots \end{bmatrix}, \qquad u = \begin{bmatrix} 1 \\ 1 \\ 1 \\ \vdots \end{bmatrix}.

2. Signals form a vector space, that is, you can add two of them and you can multiply a signal
by a scalar real number. Of course, to add them you add their components:

    x = \begin{bmatrix} x[0] \\ x[1] \\ x[2] \\ \vdots \end{bmatrix}, \quad y = \begin{bmatrix} y[0] \\ y[1] \\ y[2] \\ \vdots \end{bmatrix} \implies x + y = \begin{bmatrix} x[0]+y[0] \\ x[1]+y[1] \\ x[2]+y[2] \\ \vdots \end{bmatrix}.

Likewise with scalar multiplication:

    x = \begin{bmatrix} x[0] \\ x[1] \\ x[2] \\ \vdots \end{bmatrix} \implies cx = \begin{bmatrix} cx[0] \\ cx[1] \\ cx[2] \\ \vdots \end{bmatrix}.

3. If the time variable n ranges over all integers, both negative and positive, i.e., −∞ < n < ∞,
then the components of a signal x are

    \ldots, x[-2], x[-1], x[0], x[1], x[2], x[3], \ldots

In writing such a signal as a column vector, we need a marker that divides time into n < 0
and n ≥ 0. We use a horizontal line:

    x = \begin{bmatrix} \vdots \\ x[-2] \\ x[-1] \\ \hline x[0] \\ x[1] \\ x[2] \\ \vdots \end{bmatrix}.

So for example

    \delta = \begin{bmatrix} \vdots \\ 0 \\ 0 \\ \hline 1 \\ 0 \\ 0 \\ \vdots \end{bmatrix}, \qquad u = \begin{bmatrix} \vdots \\ 0 \\ 0 \\ \hline 1 \\ 1 \\ 1 \\ \vdots \end{bmatrix}.

A Linear System is a Matrix


1. Consider the system whose output is the sum of the current and past input:

    y[n] = x[n] + x[n-1].    (3.1)

Let us suppose the time set is n ≥ 0. To determine y[0] from (3.1), we have to assign a value
to x[−1]; let us take 0. Then as n ranges over 0, 1, 2, . . . , (3.1) generates the equations

    y[0] = x[0]
    y[1] = x[1] + x[0]
    y[2] = x[2] + x[1]

etc. These equations can be assembled into one vector-matrix equation:

    \begin{bmatrix} y[0] \\ y[1] \\ y[2] \\ y[3] \\ \vdots \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 & \cdots \\ 1 & 1 & 0 & 0 & \cdots \\ 0 & 1 & 1 & 0 & \cdots \\ 0 & 0 & 1 & 1 & \cdots \\ \vdots & & & & \ddots \end{bmatrix} \begin{bmatrix} x[0] \\ x[1] \\ x[2] \\ x[3] \\ \vdots \end{bmatrix}.

Let us denote the matrix by H, so that the preceding equation is y = Hx. Thus the notation
is: boldface capital letter for the matrix representation of a system. To be consistent with the
time set, we choose to number the rows as 0, 1, 2, etc. Likewise the columns. Think of the
equation y = Hx like this: to get the output at time, say, n = 2, take the dot product of row
number 2 of the matrix H with the input vector x. For the above example,

    y[2] = \begin{bmatrix} 0 & 1 & 1 & 0 & \cdots \end{bmatrix} \begin{bmatrix} x[0] \\ x[1] \\ x[2] \\ x[3] \\ \vdots \end{bmatrix} = x[2] + x[1].

2. Going beyond the example, let us consider a general case y = Hx, still with the time set
n ≥ 0. Then the matrix H has components that it is convenient to write as h[m, n], the
component in row m, column n. Then the equation y = Hx in component form is

    y[m] = \sum_{n=0}^{\infty} h[m, n]\, x[n], \qquad m \ge 0,

i.e., the dot product of row m of H with x.
3. Some other examples are as follows. A memoryless pure gain:

    y[n] = ax[n], \qquad \begin{bmatrix} y[0] \\ y[1] \\ y[2] \\ y[3] \\ \vdots \end{bmatrix} = \begin{bmatrix} a & 0 & 0 & 0 & \cdots \\ 0 & a & 0 & 0 & \cdots \\ 0 & 0 & a & 0 & \cdots \\ 0 & 0 & 0 & a & \cdots \\ \vdots & & & & \ddots \end{bmatrix} \begin{bmatrix} x[0] \\ x[1] \\ x[2] \\ x[3] \\ \vdots \end{bmatrix}.

The unit time delay:

    y[n] = x[n-1], \qquad \begin{bmatrix} y[0] \\ y[1] \\ y[2] \\ y[3] \\ \vdots \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 & 0 & \cdots \\ 1 & 0 & 0 & 0 & \cdots \\ 0 & 1 & 0 & 0 & \cdots \\ 0 & 0 & 1 & 0 & \cdots \\ \vdots & & & & \ddots \end{bmatrix} \begin{bmatrix} x[0] \\ x[1] \\ x[2] \\ x[3] \\ \vdots \end{bmatrix}.

The unit time advance:

    y[n] = x[n+1], \qquad \begin{bmatrix} y[0] \\ y[1] \\ y[2] \\ y[3] \\ \vdots \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & 0 & \cdots \\ 0 & 0 & 1 & 0 & \cdots \\ 0 & 0 & 0 & 1 & \cdots \\ 0 & 0 & 0 & 0 & \cdots \\ \vdots & & & & \ddots \end{bmatrix} \begin{bmatrix} x[0] \\ x[1] \\ x[2] \\ x[3] \\ \vdots \end{bmatrix}.

The downsampler, also called the compressor (Example 2.9):

    y[n] = x[2n], \qquad \begin{bmatrix} y[0] \\ y[1] \\ y[2] \\ y[3] \\ \vdots \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & \cdots \\ 0 & 0 & 1 & 0 & 0 & \cdots \\ 0 & 0 & 0 & 0 & 1 & \cdots \\ 0 & 0 & 0 & 0 & 0 & \cdots \\ \vdots & & & & & \ddots \end{bmatrix} \begin{bmatrix} x[0] \\ x[1] \\ x[2] \\ x[3] \\ \vdots \end{bmatrix}.
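As a sketch (mine), here are finite sections of these matrices acting on a short signal; the truncation to N = 6 samples is only for display:

    N = 6; x = (1:N)';
    D = [zeros(1,N); eye(N-1), zeros(N-1,1)];  % unit delay: y[n] = x[n-1]
    A = [zeros(N-1,1), eye(N-1); zeros(1,N)];  % unit advance: y[n] = x[n+1]
    S = zeros(N/2, N);
    S(sub2ind(size(S), 1:N/2, 1:2:N)) = 1;     % downsampler rows pick x[0], x[2], x[4]
    D*x, A*x, S*x                              % delayed, advanced, compressed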

4. When the time set is −∞ < n < ∞, we have to place markers in the matrix that divide time
into n < 0 and n ≥ 0. We do it like this, shown for the unit time delay y[n] = x[n−1]:

    \begin{bmatrix} \vdots \\ y[-2] \\ y[-1] \\ \hline y[0] \\ y[1] \\ y[2] \\ y[3] \\ \vdots \end{bmatrix} = \begin{bmatrix} \ddots & & & & & & \\ \cdots & 0 & 0 & 0 & 0 & 0 & \cdots \\ \cdots & 1 & 0 & 0 & 0 & 0 & \cdots \\ \hline \cdots & 0 & 1 & 0 & 0 & 0 & \cdots \\ \cdots & 0 & 0 & 1 & 0 & 0 & \cdots \\ \cdots & 0 & 0 & 0 & 1 & 0 & \cdots \\ & & & & & & \ddots \end{bmatrix} \begin{bmatrix} \vdots \\ x[-2] \\ x[-1] \\ \hline x[0] \\ x[1] \\ x[2] \\ x[3] \\ \vdots \end{bmatrix}.

We continue to write this equation as y = Hx. We number the rows with respect to the
horizontal line: the rows are numbered 0, 1, 2, etc. going down from the horizontal line, and
−1, −2, etc. going up from it. Likewise, the columns are numbered 0, 1, 2, etc. going right
from the vertical line and −1, −2, etc. going left. Then the matrix H has components that it
is convenient to write as h[m, n], the component in row m, column n. Again, we have

    y[m] = \sum_{n=-\infty}^{\infty} h[m, n]\, x[n].    (3.2)

Notice that if x[n] = δ[n], then y[m] = h[m, 0], so h[m, 0] is the output at time m when an
impulse is applied at time 0. More generally, h[m, n] is the output at time m when an
impulse is applied at time n.
The preceding matrix is important and we give it the special symbol U, for unit delay:

    U = \begin{bmatrix} \ddots & & & & & \\ \cdots & 0 & 0 & 0 & 0 & \cdots \\ \cdots & 1 & 0 & 0 & 0 & \cdots \\ \hline \cdots & 0 & 1 & 0 & 0 & \cdots \\ \cdots & 0 & 0 & 1 & 0 & \cdots \\ & & & & & \ddots \end{bmatrix}.

5. To recap: A linear discrete-time system can be represented by the equation y = Hx.


6. Now we look at two properties the linear system may have, causality and time invariance,
and we see how those properties are reflected in the matrix H. We do this for the time set
−∞ < n < ∞. First, causality. Causality means that y[n] depends only on x[n], x[n−1], . . . .
This is true iff the equation

    y[m] = \sum_{n=-\infty}^{\infty} h[m, n]\, x[n]

takes the form

    y[m] = \sum_{n=-\infty}^{m} h[m, n]\, x[n],

and this is equivalent to

    h[m, n] = 0, \qquad n > m.

Equivalently, H is lower triangular. For example, the unit time advance is not causal.
Time invariance is this property: if the input is delayed by 1 time unit, so is the output.

Lemma 6 The following conditions are equivalent:
(a) System (3.2) is LTI.
(b) For every n, m: h[m−1, n−1] = h[m, n].
(c) H is constant along diagonals.
(d) H and U commute.

For example, if the impulse δ[n] at time 0 is applied, the output is y[m] = h[m, 0]; if the
impulse δ[n−1] at time 1 is applied, the output is y[m] = h[m, 1]. For the system to be LTI,
the second output must be the shifted first output:

    h[m, 1] = h[m-1, 0].
7. Let's run through the examples:
memoryless pure gain: time invariant, causal
unit time delay: time invariant, causal
unit time advance: time invariant, not causal
downsampler: not time invariant, not causal
8. Consider an LTI system. The input-output equation in component form is (3.2). Since
h[m−k, n−k] = h[m, n], by setting k = n we see that h[m, n] depends only on the difference
m − n, and not on m and n individually. We may therefore redefine h[m, n] to be h[m−n], the
output at time m when an impulse is applied at time n. So finally, the function h[n] is the
impulse-response function, the output at time n when an impulse is applied at time 0.
To recap, an LTI system can be modeled by the equation

    y[n] = \sum_{m=-\infty}^{\infty} h[n-m]\, x[m], \qquad -\infty < n < \infty.

This is the familiar convolution equation, usually written

    y[n] = h[n] * x[n].
The matrix form is y = Hx, or in more detail

    \begin{bmatrix} \vdots \\ y[-2] \\ y[-1] \\ \hline y[0] \\ y[1] \\ y[2] \\ y[3] \\ \vdots \end{bmatrix} = \begin{bmatrix} \ddots & & & & & & \\ \cdots & h[0] & h[-1] & h[-2] & h[-3] & h[-4] & \cdots \\ \cdots & h[1] & h[0] & h[-1] & h[-2] & h[-3] & \cdots \\ \hline \cdots & h[2] & h[1] & h[0] & h[-1] & h[-2] & \cdots \\ \cdots & h[3] & h[2] & h[1] & h[0] & h[-1] & \cdots \\ \cdots & h[4] & h[3] & h[2] & h[1] & h[0] & \cdots \\ & & & & & & \ddots \end{bmatrix} \begin{bmatrix} \vdots \\ x[-2] \\ x[-1] \\ \hline x[0] \\ x[1] \\ x[2] \\ x[3] \\ \vdots \end{bmatrix}.

Notice that H is constant along its diagonals, h[0] on the main diagonal, etc. Such a matrix
is called a Toeplitz matrix.
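For a causal FIR h[n] and the time set n ≥ 0, a finite section of H is a lower-triangular Toeplitz matrix, and y = Hx reproduces convolution; a MATLAB sketch (mine):

    h = [1 2 3];                     % impulse response h[0..2]
    x = [4 5 6 7];                   % input x[0..3]
    H = toeplitz([h zeros(1,3)], [h(1) zeros(1,3)]);  % 6x4 lower triangular
    [H*x' conv(h, x)']               % identical columns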
9. Among the following systems, for every linear system find H.
(a) y[n] = nx[n]
(b) y[n] = n²x[n]
(c) y[n] = \sum_{k=0}^{n} x[k]
(d) y[n] = \sum_{k=n-2}^{n+2} x[k]
(e) y[n] = x[n−2]
(f) y[n] = e^{x[n-2]}
(g) y[n] = ax[n] + b
(h) y[n] = x[−n]
(i) y[n] = x[n] + 3u[n+1]
10. Here's a method to find H if you know the system is linear. Apply the input δ[n] and figure
out what y[n] is: that's column 0 of H. Apply the input δ[n−1] and figure out what y[n] is:
that's column 1 of H. And so on.
11. Finally, interconnections of systems. If two systems H1 and H2 are connected in series, the
resulting matrix is the product of the two matrices. Of course, there's usually a difference
between H1H2 and H2H1. You get H1H2 if you connect the output of H2
to the input of H1. If the two systems are LTI, then the order doesn't matter. If two systems
H1 and H2 are connected in parallel, that is, a common input is applied and the outputs
added, then the resulting matrix is the sum of the two matrices.

3.6 Difference Equations

The purpose of this section is to review and unify the different ways we have of describing an LTI
system (matrix representation, convolution equation, transfer function) and to introduce a fourth
way, the difference equation. It is the difference equation that is actually implementable in hardware
or software.
1. Let's start with the simple time delay,

    y[n] = x[n-1].

This is in fact the difference equation representation of the system. There is no ambiguity
in the equation, that is, there's just one way to compute the output from the input, and the
system is causal. The matrix representation is y = Hx,

    H = \begin{bmatrix} \ddots & & & & & \\ \cdots & 0 & 0 & 0 & 0 & \cdots \\ \cdots & 1 & 0 & 0 & 0 & \cdots \\ \hline \cdots & 0 & 1 & 0 & 0 & \cdots \\ \cdots & 0 & 0 & 1 & 0 & \cdots \\ & & & & & \ddots \end{bmatrix},

the convolution equation is

    y[n] = h[n] * x[n], \qquad h[n] = \delta[n-1],

and the transfer function is

    H(z) = \frac{1}{z}, \qquad \text{ROC: } z \ne 0.

Notice that for the function 1/z there can be only one ROC.
2. The next example is

    y[n] = x[n+1] + x[n] + x[n-1].

Again, the output is uniquely defined in terms of the input by this equation. The system is
not causal. The matrix representation is y = Hx,

    H = \begin{bmatrix} \ddots & & & & & \\ \cdots & 1 & 1 & 0 & 0 & \cdots \\ \cdots & 1 & 1 & 1 & 0 & \cdots \\ \hline \cdots & 0 & 1 & 1 & 1 & \cdots \\ \cdots & 0 & 0 & 1 & 1 & \cdots \\ & & & & & \ddots \end{bmatrix},

the convolution equation is

    y[n] = h[n] * x[n], \qquad h[n] = \delta[n+1] + \delta[n] + \delta[n-1],

and the transfer function is

    H(z) = z + 1 + \frac{1}{z}, \qquad \text{ROC: } z \ne 0.

Again, for the function z + 1 + 1/z there can be only one ROC.

3. The next example is

    y[n] = y[n+1] + x[n].

How are we to view this? To compute the output y[n] given the input it seems we have to
have stored the future output y[n+1]. But suppose we re-arrange:

    y[n+1] = y[n] - x[n],

and then replace n by n−1:

    y[n] = y[n-1] - x[n-1].

Now it seems we don't need the future output in the computation of y[n]; we need to have
stored the past value y[n−1]. We conclude that the difference equation could represent either
of two systems, one causal and one not. The situation is clarified by taking z transforms of
the original equation:

    Y(z) = zY(z) + X(z) \implies H(z) = \frac{1}{1-z}.

So for this H(z) there are two possible ROCs: |z| < 1 and |z| > 1. The system is not causal
for the former, and is causal for the latter.
4. Here's a more elaborate example. Consider the causal linear system modeled by

    y[n] = \sum_{k=-\infty}^{n} x[k].

We have

    y[n] = x[n] + \sum_{k=-\infty}^{n-1} x[k],

and thus the input and output satisfy the difference equation

    y[n] = x[n] + y[n-1].

Now consider

    y[n] = -\sum_{k=n+1}^{\infty} x[k].

This system is anticausal in the sense that the output at any time depends entirely on future
inputs. We have

    y[n] = -x[n+1] - \sum_{k=n+2}^{\infty} x[k],

and thus the input and output satisfy the difference equation

    y[n] = -x[n+1] + y[n+1].

Since n is just a dummy time variable, we can replace n by n−1 in the latter equation to get
the difference equation

    y[n-1] = -x[n] + y[n],

or equivalently

    y[n] = x[n] + y[n-1].

So the two systems have identical difference equations. You can work out the transfer functions
in the two cases.
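A sketch (mine) of the causal version, computed with filter and checked against the running sum:

    x = [1 4 2 8 5 7];
    y = filter(1, [1 -1], x)   % y[n] = x[n] + y[n-1], zero initial condition
    cumsum(x)                  % the same running sum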

5. Consider an LTI system with transfer function H(z) and assume H(z) is rational, the ratio
of polynomials:

    H(z) = \frac{B(z)}{A(z)}.

Suppose the numerator and denominator are completely general:

    B(z) = b_M z^M + \cdots + b_0, \qquad A(z) = a_N z^N + \cdots + a_0.

Then the input and output satisfy the equation

    A(z)Y(z) = B(z)X(z),

which in the time domain is

    a_N y[n+N] + \cdots + a_0 y[n] = b_M x[n+M] + \cdots + b_0 x[n].

This is the difference equation model of the system. How to implement it, that is, how
to compute the output y[n] for each n, depends on the ROC of H(z). For example, suppose
the system is causal. Then the ROC is the exterior of a disk, N ≥ M, and we can write the
difference equation (assuming a_N ≠ 0) as

    y[n] = \frac{1}{a_N}\left( -a_{N-1} y[n-1] - \cdots - a_0 y[n-N] + b_M x[n-(N-M)] + \cdots + b_0 x[n-N] \right).

Thus y[n] can be computed from stored past values of y and x, and also the current input,
x[n], if N = M.
Implementation in the general case, where H(z) is not causal, is more complicated. We may
come back to that later. Or maybe you can figure it out.
6. In conclusion:
(a) An LTI system is completely specified by the matrix H.
(b) It is completely specified by the convolution equation, equivalently, the impulse response
h[n].
(c) The transfer function H(z) needs the ROC to completely specify the system.
(d) The difference equation needs details about what is stored in memory to produce an
update rule and therefore to completely specify the system.

3.7 FIR, IIR, and Block Diagrams

The purpose of this section is to introduce a simple way to derive a block diagram implementation
of an LTI system. The material is essentially the direct form realization of Chapter 6, but it's
simplified by the use of matrices.


1. Before we come to block diagrams, a few words about how systems are implemented.

   How to implement a continuous-time system G(s): It can be done by an analog integrated
   circuit. From G(s) draw the block diagram in terms of (1/s); then implement each integrator
   via an op amp. The disadvantage of an analog circuit is its sensitivity to resistor values. These
   may vary by 10% or more, resulting in large, undesirable bandwidth variations. In addition,
   resistors are hard to build on integrated circuits (they take up a lot of room).

   Recently, many active filters with resistors and capacitors have been replaced with a special
   kind of filter called a switched-capacitor filter. The switched-capacitor filter allows very
   sophisticated, accurate, and tuneable analog circuits to be manufactured without using resistors.
   Instead, equivalent resistances are made to depend on ratios of capacitor values (which can be
   set accurately), not absolute values (which vary between manufacturing runs). Switched-capacitor
   filters operate in discrete time: the inputs are sampled and held. Such a filter is therefore an
   approximation to the original analog system.

   As for a discrete-time system with a transfer function H(z), there's no op amp for (1/z). Instead,
   it could again be implemented by a switched-capacitor filter. But more relevantly, it could be
   implemented in software in a DSP. Block diagrams are then useful for implementing an H(z):
   typically one starts with the transfer function, derives the block diagram, and then programs
   the DSP unit.
2. FIR Systems. FIR stands for finite impulse response; the more proper term is finite-duration
   impulse response. The definition is this: An LTI system with impulse response function h[n] is
   FIR if h[n] is nonzero for only finitely many values of n. Equivalently, H(z) has the form

       H(z) = (polynomial in z) + (polynomial in z^{−1}).

   Examples are

   (a) H(z) = 1 + z^{−1} + z^{−2},   ROC: z ≠ 0
   (b) H(z) = z^2 + z + 1,   ROC: all z
   (c) H(z) = z^2 + z + 1 + z^{−1} + z^{−2},   ROC: z ≠ 0.

   Notice that for an FIR transfer function, we don't have to be told the ROC: It can only be
   one thing, since the only poles can be at z = 0; so the ROC is either all z or all nonzero z.
   It follows that every FIR system is stable. Proof 1: h[n] is absolutely summable since there
   are only finitely many nonzero values. Proof 2: The unit circle is in the ROC. FIR filters are
   widely used in practice for that reason: They are automatically stable. There are practical
   filters in current use that have order up to 4000 (the degree of the polynomial H(z)).

   A special case is a causal FIR system. Then H(z) is a polynomial in z^{−1}; there are no terms
   of the form z, z^2, .... Of the three examples above, only the first is causal.
3. Now we turn to the block diagram realization of a causal FIR system. Consider a first-order
   example: the input x[n] feeds a unit-delay block labeled z^{−1}, and x[n] together with the
   delayed signal x[n − 1] feed a box labeled A whose output is y[n]. Here

       H(z) = a_0 + a_1 z^{−1},   y[n] = a_0 x[n] + a_1 x[n − 1],   A = [a_0  a_1].

   The transfer function is first order, meaning H(z) is a polynomial in z^{−1} of degree 1. The
   box labeled z^{−1} stands for a unit delay system: the output of the box is x[n − 1], the delayed
   input. The box labeled A is a linear combiner, having two inputs, x[n] and x[n − 1], and one
   output, y[n]. We think of the inputs as a vector:

       [ x[n]
         x[n − 1] ].

   Then A must be a 1 × 2 matrix because there's one output and two inputs:

       y[n] = [a_0  a_1] [ x[n]
                           x[n − 1] ] = a_0 x[n] + a_1 x[n − 1].
4. Here's a second example, with a 3-input linear combiner: x[n] passes through two unit delays in
   cascade, and x[n], x[n − 1], x[n − 2] feed the combiner A, which outputs y[n]. For this example

       H(z) = 2 − 3z^{−1} + 4z^{−2},   A = [2  −3  4].
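   As a minimal sketch (MATLAB, with our own toy input), the combiner view is literally a
   matrix-vector product at each time step:

A = [2 -3 4];
x = randn(1, 100);
y = zeros(1, 100);
for n = 3:100
    y(n) = A * [x(n); x(n-1); x(n-2)];   % combiner times delayed-input vector
end
% Apart from the first two samples (zero initial history here),
% this agrees with filter(A, 1, x).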
5. IIR Systems

   IIR stands for infinite impulse response, that is, infinite-duration impulse response. These
   are the LTI systems that are not FIR. So h[n] is nonzero for infinitely many n. Examples are

       z/(z − 1),   z^2/(2z − 1).

   Note that the ROC is not determined by H(z).
   As an example, let's find a block diagram for

       H(z) = (z^2 + 2z + 1)/(z^3 − z − 3).

   This is causal if we take the ROC to be |z| > r, where r is the maximum magnitude of the
   poles. The first step is to rewrite the transfer function in terms of z^{−1} by dividing numerator
   and denominator by z^3:

       H(z) = (z^{−1} + 2z^{−2} + z^{−3})/(1 − z^{−2} − 3z^{−3}).

   Now separate the numerator and denominator:

       H(z) = (z^{−1} + 2z^{−2} + z^{−3}) · 1/(1 − z^{−2} − 3z^{−3}).

   Thus we have

       Y(z) = (z^{−1} + 2z^{−2} + z^{−3}) · 1/(1 − z^{−2} − 3z^{−3}) · X(z).

   Next, define the intermediate variable

       V(z) = 1/(1 − z^{−2} − 3z^{−3}) · X(z),                          (3.3)

   so that

       Y(z) = (z^{−1} + 2z^{−2} + z^{−3}) V(z).

   Back in the time domain, this is

       y[n] = v[n − 1] + 2v[n − 2] + v[n − 3].

   Equation (3.3) can be written

       (1 − z^{−2} − 3z^{−3}) V(z) = X(z),

   or

       V(z) = z^{−2} V(z) + 3z^{−3} V(z) + X(z).

   Back in the time domain, this is

       v[n] = v[n − 2] + 3v[n − 3] + x[n].

   So, finally, we have that the system is modeled by the two equations

       y[n] = v[n − 1] + 2v[n − 2] + v[n − 3]
       v[n] = v[n − 2] + 3v[n − 3] + x[n].

   These can be written in matrix/vector form:

       y[n] = [1  2  1] [ v[n − 1]
                          v[n − 2]
                          v[n − 3] ]

       v[n] = [1  3  1  0] [ x[n]
                             v[n − 3]
                             v[n − 2]
                             v[n − 1] ].

   We arrive at the following block diagram with two linear combiners, B for the numerator of
   the transfer function and A for the denominator: the combiner A forms v[n] from x[n] and the
   delayed values v[n − 1], v[n − 2], v[n − 3] produced by a chain of three z^{−1} blocks, and the
   combiner B forms y[n] from those same delayed values. The matrices are

       A = [1  3  1  0],   B = [1  2  1].

   Notice that there are feedback loops through A. As you know from control systems, feedback
   systems can be unstable. Indeed, IIR systems can be unstable. A causal IIR transfer function
   H(z) is unstable if it has a pole on or outside the unit circle.

   In summary, suppose H(z) is IIR with rational transfer function, having numerator degree
   m and denominator degree n, with n ≥ m. Assume the ROC is |z| > r, where r is the
   maximum magnitude of the poles. Thus the system is causal. Then H(z) has a block diagram
   similar to the previous one, where the number of delay blocks z^{−1} is n.
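   Here is a minimal simulation sketch of this realization (MATLAB; the test input and run length
   are our own choices). Note the example system is in fact unstable (it has a real pole near 1.67),
   so the signals grow; we therefore run it briefly and compare relative error:

A = [1 3 1 0]; B = [1 2 1];
x = randn(1, 30);
v = zeros(1, 3);                          % delay line [v[n-1] v[n-2] v[n-3]]
y = zeros(1, 30);
for n = 1:30
    vn   = A * [x(n); v(3); v(2); v(1)];  % v[n] = x[n] + 3v[n-3] + v[n-2]
    y(n) = B * [v(1); v(2); v(3)];        % y[n] = v[n-1] + 2v[n-2] + v[n-3]
    v = [vn, v(1:2)];                     % shift the delay line
end
yref = filter([0 1 2 1], [1 0 -1 -3], x); % same H(z), written in powers of z^{-1}
% norm(y - yref)/norm(yref) is ~ 0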

Main Points of Chapter

1. A linear system can be modeled as either y = Hx or

       y[n] = Σ_m h[n, m] x[m].

   The second equation is merely this: the output at time n equals the dot product of the nth
   row of H and x.
2. Causality and time invariance can be characterized by either model.
3. The LTI model is y[n] = h[n] ∗ x[n].
4. The LTI system is stable iff h[n] is absolutely summable.
5. If the system is LTI and stable, then sinusoid in implies sinusoid out.
6. The theory of z transforms and associated ROCs; inversion.
7. The DTFT for the three cases; inversion.
8. Convolution in the time domain implies multiplication in the frequency domain.

Chapter 4

The Sampling Theorem

4.1 The Underlying Idea

1. Here's an example of a kind of sampling theorem. The signal is a vector x in R^6:

       x = (x_0, x_1, x_2, x_3, x_4, x_5).

   Suppose you know that the signal lives in the subspace B defined by the three equations

       x_1 + 2x_2 + x_4 = 0
       2x_0 − x_1 + x_5 = 0
       x_0 − 3x_4 + x_5 = 0.

   You sample the signal in this way: You measure that

       x_0 = 1,   x_2 = 2,   x_4 = 3.

   You want to find the missing components of x. It's easy, because you now have 6 linear
   equations for the 6 variables. As long as the equations are linearly independent, they uniquely
   determine the 6 variables. Let us write the 6 equations in the form Ax = b:

       A = [ 0  1  2  0  1  0
             2 −1  0  0  0  1
             1  0  0  0 −3  1
             1  0  0  0  0  0
             0  0  1  0  0  0
             0  0  0  0  1  0 ],   b = (0, 0, 0, 1, 2, 3).

   You can solve by Gaussian elimination, or x = A^{−1} b, or however you like.
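   Here's a minimal numerical sketch of this kind of solve (MATLAB). To keep the sketch
   self-checking we generate our own random subspace constraint rather than hard-coding the
   coefficients above:

rng(0);
B = randn(3, 6);                 % subspace condition: B*x = 0
D = [1 0 0 0 0 0
     0 0 1 0 0 0
     0 0 0 0 1 0];               % measure x0, x2, x4
xtrue = null(B) * randn(3, 1);   % a signal satisfying B*xtrue = 0
y = D * xtrue;                   % the three measurements
x = [B; D] \ [zeros(3,1); y];    % the stacked 6-by-6 solve
norm(x - xtrue)                  % ~ 0 when the six equations are independent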
2. Let us rephrase the problem in a slightly different way. The given subspace condition is
   Bx = 0, where B is the matrix

       B = [ 0  1  2  0  1  0
             2 −1  0  0  0  1
             1  0  0  0 −3  1 ].

   You measure y = Dx, where

       y = [ 1        D = [ 1  0  0  0  0  0
             2              0  0  1  0  0  0
             3 ],           0  0  0  0  1  0 ].

   We use the symbol D because later it's going to be the downsampler. The problem is this:
   Given B, D, y and the equations Bx = 0, Dx = y, compute x. The equation to solve is

       [ B ]       [ 0 ]
       [ D ] x  =  [ y ],

   where the zero on the right-hand side is the 3 × 1 zero vector.

3. For completeness, a subspace B is a nonempty subset of a vector space that has the properties
   1) if x, y are in B, so is x + y, 2) if x is in B and c is a real number, then cx is in B. The
   subspaces of R^3 are the lines and planes through the origin, together with the origin and
   finally R^3 itself. In the preceding example, the subspace B is the nullspace of B. (Nullspace,
   in case you forgot, is the solution space of the equation Bx = 0.)

4.2 Sampling a Discrete-time Signal

1. Now we extend the problem as follows. The signal x[n] is defined for all −∞ < n < ∞. It is
   known to lie in the subspace B_{π/2} of signals bandlimited like this: X(e^{jω}) = 0, |ω| ≥ π/2.
   The signal x[n] is downsampled by 2, y[n] = x[2n], and the components y[n] are measured.
   The goal is to compute all the other sample values of x[n]. The setup is this: x[n] enters a
   downsample-by-2 block and y[n] comes out.

   We want to find the input given the output, i.e., we want a system with input y and output
   x. So again we're given a projection of x[n], namely, all the components with even indices,
   and our prior knowledge is that x ∈ B_{π/2}.
2. Let's pause to consider what kind of system a downsampler is. The vectors x and y are related
   like this: y[0] = x[0], y[1] = x[2], y[−1] = x[−2], and so on, so each row of the matrix has a
   single 1, in the position that picks off the appropriate even-indexed component:

       [ ⋮    ]   [ ⋱                ] [ ⋮    ]
       [ y[0] ] = [ ⋯  1  0  0  0  ⋯ ] [ x[0] ]
       [ y[1] ]   [ ⋯  0  0  1  0  ⋯ ] [ x[1] ]
       [ ⋮    ]   [                ⋱ ] [ ⋮    ]

   Since the matrix is not lower triangular, the downsampler is not causal. This interpretation
   is due to the fact that the discrete-time variable n merely counts sample numbers, not real
   time. However, if we refer x[n] and y[n] to real time t, the plot of y[n] against t shows each
   retained sample occurring at the same instant as the corresponding sample of x[n]: a sample
   of y is released only after the sample of x it copies has arrived. In this sense the downsampler
   is causal.


3. The key to how to recover the input from the output lies in the frequency domain. We have

       X(z) = ⋯ + x[−1]z + x[0] + x[1]z^{−1} + ⋯

   and therefore

       X(−z) = ⋯ − x[−1]z + x[0] − x[1]z^{−1} + ⋯ .

   Adding gives

       X(z) + X(−z) = 2Y(z^2).

   From this it follows that

       Y(z^2) = (1/2)[X(z) + X(−z)],   Y(e^{j2ω}) = (1/2)[X(e^{jω}) + X(e^{j(ω−π)})].

   Let us symbolically sketch the Fourier transform of x[n] as a triangle of height 1, centered at
   ω = 0 and supported on |ω| < π/2.

   Note that this plot isn't meant to be realistic: X(e^{jω}) is complex-valued, for one thing. The
   plot merely shows the support of X(e^{jω}), i.e., the frequency range where X(e^{jω}) is nonzero.
   Remember: X(e^{jω}) is a periodic function of ω, of period 2π. Thus the graph is duplicated
   periodically to the left and right. Then the graph of X(e^{j(ω−π)}) is that of X(e^{jω}) shifted to
   the right by π: a triangle of height 1 centered at ω = π. Adding the two and dividing by 2 gives
   Y(e^{j2ω}): triangles of height 1/2 centered at ω = 0 and ω = π. Therefore the FT Y(e^{jω})
   looks like a single stretched triangle of height 1/2 filling |ω| < π. Thus in the frequency domain
   the effect of the downsampler is to stretch out the Fourier transform X(e^{jω}) and scale it by
   1/2, but there is no aliasing. That is, the triangle of X is preserved uncorrupted in Y.
4. The following system reconstructs x[n] from y[n]: y[n] is upsampled by 2 to give w[n], which
   is passed through an ideal lowpass filter H(e^{jω}) with gain 2 and cutoff π/2, giving the output,
   denoted x_r[n] (r for reconstructed). It will turn out that x_r = x if x ∈ B_{π/2}.

   Let's follow the Fourier transforms from y to x_r. We have

       w[2n] = y[n],   w[2n + 1] = 0,   W(z) = Y(z^2),   W(e^{jω}) = Y(e^{j2ω}).

   Thus the graph of W(e^{jω}) consists of triangles of height 1/2 centered at ω = 0 and ω = π.
   The ideal lowpass filter passes only the low-frequency triangle, amplifying by 2, so X_r(e^{jω})
   is the original height-1 triangle. Thus we have shown that x_r = x if x ∈ B_{π/2}.


5. Finally, we need to see how to implement the reconstruction, that is, the time-domain formula
   for x_r[n] as a function of y[n]. Let h[n] denote the impulse response function of the filter.
   Then

       h[n] = (1/2π) ∫_{−π/2}^{π/2} 2 e^{jωn} dω.

   So

       h[n] = (1/jπn) [e^{j(π/2)n} − e^{−j(π/2)n}] = (1/jπn) · 2j sin(πn/2) = sin(πn/2)/(πn/2).

   This is the familiar sinc function. Plot the graph of h[n].

   The convolution equation is

       x_r[n] = h[n] ∗ w[n] = Σ_m w[m] h[n − m].

   Since w[m] = 0 for m odd,

       x_r[n] = Σ_m w[2m] h[n − 2m],

   and since w[2m] = y[m],

       x_r[n] = Σ_m y[m] h[n − 2m].

   Equivalently,

       x_r[n] = Σ_m x[2m] h[n − 2m].

   Look carefully at the right-hand side: If n is even, say n = 2k, then the right-hand side
   becomes

       Σ_m x[2m] h[2k − 2m] = x[2k],

   because h[0] = 1 and h[n] = 0 at every other even n. Whereas if n is odd, all terms are
   required for the reconstruction of x[n].

   Notice that the reconstruction operation is noncausal: To reconstruct the odd values of x[n],
   all past and future even values are required.
6. As an example, take x[n] = e^{j(π/6)n}. This is a sinusoid of frequency π/6 radians/sample
   and therefore x ∈ B_{π/2}. You finish this: Find the signals y[n], w[n], x_r[n]. Repeat with
   x[n] = e^{j(5π/6)n}.

7. Let's review what we have done in terms of linear algebra. The input signal was x[n]. It
   was downsampled to y[n]. Thus the vector y is related to the vector x via a matrix D (for
   downsampler): y = Dx. Likewise, the reconstructor inputs y[n] and outputs x_r[n], so there
   is another matrix R such that x_r = Ry. Putting the two systems in series gives x_r = RDx.
   What we have seen is that, if x ∈ B_{π/2}, then RDx = x. Note that in general RDx ≠ x if
   x ∉ B_{π/2}. Finally, as a little exercise, show that DRy = y for every signal y.
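   A finite-length numerical sketch of RDx = x (MATLAB; ours, with the ideal h[n] truncated,
   so the check is only approximate and is evaluated away from the truncation edges):

n = -200:200;  m = -100:100;
x = exp(1j*(pi/6)*n);                 % bandlimited: frequency pi/6 < pi/2
y = exp(1j*(pi/6)*(2*m));             % y[m] = x[2m]
xr = zeros(size(n));
for k = 1:length(n)
    arg = n(k) - 2*m;
    h = sin(pi*arg/2) ./ (pi*arg/2);
    h(arg == 0) = 1;                  % h[0] = 1
    xr(k) = sum(y .* h);              % xr[n] = sum_m y[m] h[n-2m]
end
max(abs(xr(101:301) - x(101:301)))    % small (order 1e-2 here, truncation error only)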

4.3 Sampling a Continuous-time Signal

1. The notation is this: T is the sampling period in seconds, Ω_s = 2π/T is the sampling frequency
   in rad/s, f_s = 1/T is the sampling frequency in samples/s or Hz, and Ω_N = π/T is the Nyquist
   frequency, half the sampling frequency.
2. The definition of C/D, the continuous-to-discrete transformation, i.e., the sampling operator,
   is

       x[n] = x_c(nT).

   Thus the input is a CT signal and the output a DT signal; in block diagrams, x_c(t) enters a
   box labeled C/D and x[n] leaves it. The assumption on the input will be x_c ∈ B_{Ω_N}, that is,

       X_c(jΩ) = 0,   |Ω| ≥ Ω_N.

   In words, the input is bandlimited and the sampling frequency is greater than twice the highest
   frequency in the input. Under this condition, x_c(t) can be recovered from the sample values
   x[n]. But note that the reconstruction is non-causal. That is, for any given t not a sampling
   instant, to reconstruct x_c(t) requires all the infinitely many sample values, past and future.

3. The first step in this development is to derive the relationship between the input and output
   in the frequency domain.

   Proposition 1.

       X(e^{jΩT}) = (1/T) Σ_k X_c(jΩ − jkΩ_s).

   This formula doesn't assume anything about the input. Schematically: sketch the graph of
   X_c(jΩ) for an input x_c(t) that is not bandlimited, then apply the formula in the proposition
   and draw, say, the two terms k = 0 and k = 1, namely (1/T)X_c(jΩ) and (1/T)X_c(jΩ − jΩ_s).
   The terms overlap: there is aliasing, in the sense that higher frequency components of x_c(t)
   become baseband frequency terms under sampling. On the other hand, if x_c ∈ B_{Ω_N}, the
   terms don't overlap, and therefore

       X(e^{jΩT}) = (1/T) X_c(jΩ),   |Ω| < Ω_N,

   or equivalently

       X(e^{jω}) = (1/T) X_c(jω/T),   |ω| < π.

4. Here's a concrete example of aliasing. Take x_c(t) = e^{j2πt/3}, of frequency 2π/3 rad/s. Let
   the sampling period be T = 2 seconds. Then Ω_N = π/2, so the condition x_c ∈ B_{Ω_N} is not
   satisfied. The sampled signal is

       x[n] = x_c(t)|_{t=nT} = e^{j4πn/3}.

   Thus x[n] is a sinusoid of frequency 4π/3 radians per sample. But

       4π/3 = −2π/3 modulo 2π.

   Therefore

       x[n] = e^{−j2πn/3} = e^{−jπt/3}|_{t=nT}.

   And so the sinusoid x_c(t) of frequency 2π/3 rad/s has been aliased by sampling down to the
   lower frequency sinusoid e^{−jπt/3}, of frequency π/3 rad/s.
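   A quick numerical check of this aliasing (MATLAB):

T = 2; n = 0:9; t = n*T;
x1 = exp(1j*2*pi*t/3);   % the original sinusoid, frequency 2*pi/3 rad/s
x2 = exp(-1j*pi*t/3);    % the aliased sinusoid, frequency pi/3 rad/s
max(abs(x1 - x2))        % = 0 up to rounding: the samples are identical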
5. The proof of Proposition 1 involves a fiction, the idea of impulse-train modulation: the signal

       s(t) = Σ_n δ(t − nT)

   is a train of impulses at the sampling instants. It multiplies x_c(t) to produce

       x_s(t) = Σ_n x[n] δ(t − nT).

   Think of s(t) as a carrier wave and x_c(t) as the signal. The advantage of this setup is
   that all three signals are continuous-time.

   Now there are two steps. The first connects the outputs of the two systems.

   Step 1: X_s(jΩ) = X(e^{jΩT}).

   Proof.

       X_s(jΩ) = ∫ x_s(t) e^{−jΩt} dt
                = ∫ x_c(t) s(t) e^{−jΩt} dt
                = Σ_m ∫ x_c(t) δ(t − mT) e^{−jΩt} dt
                = Σ_m x_c(mT) e^{−jΩmT}
                = Σ_m x[m] e^{−jΩmT}
                = X(e^{jΩT}).

   The second step connects the input and output of the impulse-train modulation system.

   Step 2: X_s(jΩ) = (1/T) Σ_k X_c(jΩ − jkΩ_s).

   Proof. The signal s(t) is periodic of period T. We're going to expand s(t) in a Fourier series
   (this is not rigorous, since operations with impulses have to be justified). The Fourier series
   is

       s(t) = (1/T) Σ_m e^{jmΩ_s t}.

   Then

       X_s(jΩ) = ∫ x_s(t) e^{−jΩt} dt
                = ∫ x_c(t) s(t) e^{−jΩt} dt
                = ∫ x_c(t) (1/T) Σ_m e^{jmΩ_s t} e^{−jΩt} dt
                = (1/T) Σ_m ∫ x_c(t) e^{jmΩ_s t} e^{−jΩt} dt
                = (1/T) Σ_m ∫ x_c(t) e^{−j(Ω−mΩ_s)t} dt
                = (1/T) Σ_m X_c(jΩ − jmΩ_s).

   Finally, putting the two steps together proves the proposition.


6. Another example: x_c(t) = e^{jΩ_0 t}, with Ω_s > 2Ω_0. Then

       X_c(jΩ) = 2π δ(Ω − Ω_0)

       X_s(jΩ) = (2π/T) δ(Ω − Ω_0),   |Ω| < π/T

       X(e^{jΩT}) = (2π/T) δ(Ω − Ω_0),   |Ω| < π/T

       X(e^{jω}) = (2π/T) δ(ω/T − Ω_0) = 2π δ(ω − Ω_0 T),   |ω| < π.

   Thus x[n] is a sinusoid of frequency Ω_0 T.

4.4 Ideal, Non-causal Reconstruction

This part is easier than the preceding part.

1. Theorem 1. If x_c(t) is bandlimited to less than Ω_N, then x_c(t) can be reconstructed from the
   samples via the non-causal procedure

       x_c(t) = Σ_n x[n] · sin[(π/T)(t − nT)] / [(π/T)(t − nT)].          (4.1)

   Another way to write the equation is

       x_c(t) = Σ_n x[n] h_r(t − nT),   h_r(t) = sin(πt/T)/(πt/T).

   Notice that h_r(t) is a sinc function and therefore is the impulse response of an ideal lowpass
   filter. The block diagram we use for this reconstruction is a box labeled D/C, with input x[n]
   and output x_c(t).
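   Here's a small sketch of formula (4.1) in MATLAB (our own signal and truncation; since only
   finitely many samples are used, the reconstruction is approximate near the edges):

T = 0.1; n = -50:50;
x = cos(2*pi*2*(n*T));                  % a 2 Hz cosine; 1/(2T) = 5 Hz, so no aliasing
t = linspace(-2, 2, 1000);
xc = zeros(size(t));
for k = 1:length(n)
    arg = pi*(t - n(k)*T)/T;
    hr = sin(arg)./arg;  hr(arg == 0) = 1;   % the sinc kernel h_r(t - nT)
    xc = xc + x(k)*hr;
end
% xc now approximates cos(2*pi*2*t) away from the truncation edges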

2. The proof of the theorem uses the impulse-train modulation fiction: form x_s(t) = x_c(t)s(t),
   then pass x_s(t) through an ideal lowpass filter with cutoff π/T and gain T. Thus the signal
   x_c(t) is reconstructed by lowpass filtering x_s(t).

3. Here's an example that shows that x_c(t) ∈ B_{Ω_N} is not a necessary condition for reconstruction,
   only sufficient. Suppose we know that x_c(t) is a sinusoid of frequency in the interval [10 kHz,
   20 kHz). To have x_c(t) ∈ B_{Ω_N}, we need the sampling frequency to be at least 40 kHz. But let's
   sample at 20 kHz instead. It's left to you to show that x_c(t) is reconstructed by a suitable
   D/C. Hint: A bandpass filter.

4. The appendix contains a proof of the sampling theorem that doesn't use impulse-train modulation.

4.5 Summary of Formulas

Discrete time. Downsampling by 2 (x[n] in, y[n] out):

    y[n] = x[2n],   Y(z^2) = (1/2)[X(z) + X(−z)],   Y(e^{j2ω}) = (1/2)[X(e^{jω}) + X(e^{j(ω−π)})].

Reconstruction (y[n] upsampled by 2 to w[n], then filtered by H(e^{jω}) with gain 2 and cutoff π/2
to give x_r[n]):

    w[2n] = y[n],   w[2n + 1] = 0,   W(z) = Y(z^2),   W(e^{jω}) = Y(e^{j2ω})

    x_r[n] = h[n] ∗ w[n] = Σ_m y[m] h[n − 2m],   X_r(e^{jω}) = H(e^{jω}) W(e^{jω}) = H(e^{jω}) Y(e^{j2ω}).

If x[n] is bandlimited to less than π/2, then H(e^{jω}) X(e^{j(ω−π)}) = 0 and so

    X_r(e^{jω}) = H(e^{jω}) Y(e^{j2ω}) = H(e^{jω}) (1/2)[X(e^{jω}) + X(e^{j(ω−π)})] = X(e^{jω}).

Thus x_r[n] = x[n].

Continuous time. T = sampling period, Ω_N = π/T = Nyquist frequency, Ω_s = 2π/T = sampling
frequency in rad/s, f_s = 1/T = sampling frequency in samples/s or Hz.

C/D (x_c(t) in, x[n] out):

    x[n] = x_c(nT),   X(e^{jΩT}) = (1/T) Σ_m X_c(jΩ − jmΩ_s).

D/C (x[n] in, x_r(t) out):

    x_r(t) = Σ_n x[n] h(t − nT),   X_r(jΩ) = H(jΩ) X(e^{jΩT}),   H(jΩ): gain T, cutoff π/T.

If x_c(t) is bandlimited to less than π/T, then H(jΩ) X_c(jΩ − jmΩ_s) = 0 for m ≠ 0 and so

    X_r(jΩ) = H(jΩ) X(e^{jΩT}) = H(jΩ) (1/T) Σ_m X_c(jΩ − jmΩ_s) = H(jΩ) (1/T) X_c(jΩ) = X_c(jΩ).

Thus x_r(t) = x_c(t).

4.6 Causal Reconstruction

1. As we just saw, the non-causal reconstruction of x_c(t) from the samples x[n] is D/C,

       x_r(t) = Σ_n x[n] h(t − nT),

   where h(t) is the sinc function sin(πt/T)/(πt/T).

2. A causal way to approximately interpolate is a zero-order hold followed by smoothing: x[n]
   enters a ZOH, whose output v_c(t) is then passed through a smoothing filter G(s) with output
   y_c(t). The output of the ZOH is

       v_c(t) = x[k],   kT ≤ t < (k + 1)T,

   or, in terms of the unit step,

       v_c(t) = Σ_k x[k] [u(t − kT) − u(t − (k + 1)T)].

   This signal is then filtered by G(s). FINISH

4.7 Discrete-time Processing of Continuous-time Signals

1. Electroencephalography (EEG) is the recording of electrical activity along the scalp produced
   by the firing of neurons within the brain. In conventional EEG, electrodes are placed on the
   scalp. Each electrode is connected to one input of a differential amplifier (one amplifier per
   pair of electrodes); a common system reference electrode is connected to the other input of
   each differential amplifier. These amplifiers amplify the voltage between the active electrode
   and the reference. In analog EEG, the signal is then filtered, and the EEG signal is output as
   the deflection of pens as paper passes underneath. Most EEG systems these days, however,
   are digital, and the amplified signal is digitized via an analog-to-digital converter, after being
   passed through an anti-aliasing filter. Analog-to-digital sampling typically occurs at 256–512
   Hz in clinical scalp EEG; sampling rates of up to 20 kHz are used in some research applications.
2. With this application in mind, let us suppose we want to process a continuous-time signal
   x(t), and suppose we have designed an appropriate continuous-time filter H(jΩ) mapping x(t)
   to y(t). If x(t) is bandlimited, we could instead sample it and then do discrete-time filtering.
   What's the advantage of the latter? A digital filter is easy to tune, whereas tuning an analog
   filter requires replacement of resistors and/or capacitors. So let us consider the problem of
   designing a discrete-time filter H_d(e^{jω}) so that the following system is equivalent to the
   previous one:

       x(t) → C/D → H_d(e^{jω}) → D/C → y(t).

   That means that the two outputs are identical when the same x(t) is applied to both systems.
   So how are we going to get H_d(e^{jω}) from H(jΩ)? We have to assume the sampling rate has
   already been selected to avoid aliasing. The solution is easy. Starting with the continuous-time
   system, do C/D and D/C at the input:

       x(t) → C/D → D/C → H(jΩ) → y(t).

   Of course the new system is equivalent to the continuous-time one, since the input to H(jΩ)
   is still x(t). Next, we do the same at the output:

       x(t) → C/D → D/C → H(jΩ) → C/D → D/C → y(t).

   Again, this system is input-output equivalent. Now H_d(e^{jω}) is revealed: it is the frequency
   response of the middle three blocks, D/C followed by H(jΩ) followed by C/D. And Bob's your
   uncle.
3. It's probably not obvious that the system D/C followed by H(jΩ) followed by C/D is even
   LTI and that it therefore has a frequency response that we can call H_d(e^{jω}). So let's prove
   that. The three components are indeed linear. So we need to prove only time invariance.
   Define the signals as follows:

       x[n] → D/C → v(t) → H(jΩ) → w(t) → C/D → y[n].

   Let the input x[n] generate the output y[n]. To prove time invariance we have to show that
   x[n − 1] produces y[n − 1] (the system commutes with the unit delay). Now the definition of
   v(t) is

       v(t) = Σ_n x[n] h_r(t − nT).

   Thus if x[n] is shifted to x[n − 1], then v(t) becomes

       Σ_n x[n − 1] h_r(t − nT) = Σ_m x[m] h_r(t − T − mT) = v(t − T),

   that is, v(t) is shifted to v(t − T): it's delayed by T seconds. Since H(jΩ) is the frequency
   response of an LTI system, w(t) becomes w(t − T). Finally, the definition of y[n] is w(nT).
   So y[n] becomes w(t − T) sampled at t = nT, namely,

       w(nT − T) = w((n − 1)T) = y[n − 1].
4. Now that we know H_d(e^{jω}) exists, we can get a formula for it. Recalling the ideal lowpass
   filter H_r(jΩ) with gain T and cutoff π/T, we have

       V(jΩ) = H_r(jΩ) X(e^{jΩT}).

   Thus

       W(jΩ) = H(jΩ) H_r(jΩ) X(e^{jΩT}).

   Thus w(t) is bandlimited to less than Ω_N and hence

       Y(e^{jΩT}) = (1/T) W(jΩ),   |Ω| < Ω_N.

   So,

       Y(e^{jΩT}) = (1/T) H(jΩ) H_r(jΩ) X(e^{jΩT}),   |Ω| < Ω_N.

   But the gain of H_r cancels the 1/T and so

       Y(e^{jΩT}) = H(jΩ) X(e^{jΩT}),   |Ω| < Ω_N.

   Converting to ω gives

       Y(e^{jω}) = H(jω/T) X(e^{jω}),   |ω| < π.

   And so finally

       H_d(e^{jω}) = H(jω/T),   |ω| < π.

5. In summary, the following two systems are input-output equivalent for x(t) ∈ B_{Ω_N}:

       x(t) → H(jΩ) → y(t)

       x(t) → C/D → H(jω/T) → D/C → y(t).

6. Example: a half-sample delay, H(s) = e^{−sT/2}. Then

       H_d(e^{jω}) = H(jω/T) = e^{−jω/2}.

   Is there a transfer function H_d(z)? To see, we have to get the impulse response h_d[n] and then
   see if it has a z transform. Over to you.

7. Example: A differentiator, H(s) = s. Complete this example as far as you can: Find h_d[n],
   think about how to implement the filter, try truncating h_d[n] to be zero for |n| > N, plot the
   resulting frequency response.
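   For the differentiator, here is a sketch of the truncation experiment (MATLAB; freqz is from
   the Signal Processing Toolbox). Taking T = 1, working out the inverse DTFT of jω gives
   h_d[n] = (−1)^n/n for n ≠ 0 and h_d[0] = 0; check this yourself:

N = 20; n = -N:N;
hd = ((-1).^n) ./ n;
hd(n == 0) = 0;                 % truncated ideal differentiator
[H, w] = freqz(hd, 1, 512);     % treat hd as an FIR filter
plot(w, abs(H), w, w, '--');    % |H| versus the ideal magnitude |j*w| = w
% The truncated response oscillates around the ideal line (Gibbs ripples).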
8. We have seen in this section how to implement a continuous-time filter via discrete-time signal
   processing. We arrived at this block diagram:

       x(t) → C/D → H_d(e^{jω}) → D/C → y(t).

   Is there a frequency-response function from x(t) to y(t)? No: the system is not time invariant.
   But if x ∈ B_{Ω_N}, then there is the relationship

       Y(jΩ) = H(jΩ) X(jΩ).

   This function H(jΩ) is called the effective frequency response, and it is obtained by taking
   the formula

       H_d(e^{jω}) = H(jω/T),   |ω| < π

   and turning it around to

       H(jΩ) = H_d(e^{jΩT}),   |Ω| < Ω_N.

4.8 Discrete-time Random Signals (frag)

For the next topic, analog-to-digital converters, we need some background on random signals. The
reference is Section 2.10 and the appendix in the text. This is a very brief summary. A further
reference is A. Leon-Garcia, Probability, Statistics, and Random Processes for Electrical
Engineering, Prentice-Hall.
1. We need these concepts:
(a) discrete and continuous random variable (RV)
(b) distribution function and density function
(c) mean and variance
(d) jointly defined RVs
(e) joint distribution function, joint density function
(f) independent RVs
(g) correlation, uncorrelated RVs, cross correlation function
2. An example will help you understand the idea of correlation. Suppose one of my lectures is
   recorded by my speaking into a microphone. Thus a voltage as a function of time is produced.
   Let x denote the voltage 5 minutes after I start and y the voltage 5 minutes after that. Suppose
   we model x and y as jointly distributed RVs, that is, we construct a joint distribution. To
   simplify, suppose the mean of x is subtracted from x, so E(x) = 0. Likewise for y. Let us try
   to estimate y given x, and let us restrict the estimate to a linear function of x. That is, we
   estimate y via ax, where a is a real scalar. What is the best value of a? The error is y − ax
   and its variance is E(y − ax)^2. So the best a minimizes this variance. If you solve

       (d/da) E(y − ax)^2 = 0

   for a, you'll get

       a = E(xy)/E(x^2).

   We call E(xy) the cross-correlation of x and y. It's a measure of how linearly
   dependent y is on x. If x, y are uncorrelated, i.e., E(xy) = 0, and we measure x and try to
   estimate y by a linear estimate, we'll get 0 for the estimate; this is the same as the best
   estimate of y if we don't measure x first. In this sense y is not linearly correlated with x.
3. Yadda Yadda
4. Signal-to-error ratio of a quantizer. Consider an N-bit quantizer with resolution Δ. There are
   2^N levels. Suppose the input is a sinusoid A sin(Ωt), sampled well above double its frequency.
   Suppose the amplitude A is such that the sinusoid is full scale. This means 2A = 2^N Δ, i.e.,
   A = 2^{N−1} Δ. Now the mean-square value of A sin Ωt is

       (1/2π) ∫_0^{2π} A^2 sin^2(θ) dθ = A^2/2.

   Since A = 2^{N−1} Δ, the mean-square value of the input is Δ^2 2^{2N−3}.

   On the other hand, the quantizer error is, say, uniformly distributed over [−Δ/2, Δ/2]. The
   variance of this RV is Δ^2/12. We define the signal-to-error ratio to be the ratio of the input
   mean-square value to the variance of the error, in dB:

       10 log_10 [Δ^2 2^{2N−3} / (Δ^2/12)] = 10 log_10 (2^{2N−3} · 12) ≈ 1.76 + 6.02N.

   Thus each bit adds about 6 dB in accuracy.
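   A quick numeric check of the 1.76 + 6.02N rule (MATLAB; the uniform mid-tread quantizer
   here is our own construction):

Nbits = 12; Delta = 1;
A = 2^(Nbits-1)*Delta;                    % full-scale amplitude
t = linspace(0, 2*pi, 1e6);
x = A*sin(t);
xq = Delta*round(x/Delta);                % N-bit uniform quantizer
10*log10(mean(x.^2)/mean((x - xq).^2))    % about 74 dB = 1.76 + 6.02*12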

4.9 A/D and D/A Converters

This section complements Sections 4.8 and 4.9 in the text. The reference for these notes is the
book Understanding Data Converters, by R. Schreier and G. Temes.

1. Digital signal processing usually means processing analog signals, the most familiar application
   being audio signal processing. So a DSP processor has to interface with the analog world,
   typically as in Figure 4.1.

   [Figure 4.1: Converters. Analog input → ADC → DSP core → DAC → analog output.]

   ADC means analog-to-digital converter and DAC means digital-to-analog converter, each a
   type of data converter.

   There are two types of data converters: Nyquist-rate and oversampled. In a Nyquist-rate
   converter the discrete-time clock is at twice the bandwidth of the analog input signal.

2. Here's an example of a Nyquist-rate DAC. To simplify the figure, we consider converting a
   3-bit number into an analog voltage. The input is the binary number b_1 b_2 b_3, where b_i = 0 or
   1, and the output is the voltage V_out:

       V_out = V_ref (b_1 2^{−1} + b_2 2^{−2} + b_3 2^{−3}).

   A simple resistor string will do the job; see Figure 4.2. In the figure, b_1 = 1, b_2 = 1, b_3 = 0,
   that is, the input is the binary number 110, and the output voltage is V_out = (6/8)V_ref. The
   possible outputs are (n/8)V_ref, n = 0, ..., 7, thus giving a 3-bit DAC. The switches would be
   operated at the Nyquist rate. This DAC is memoryless: nothing is stored in memory.
   The problem with this Nyquist-rate DAC is we can't get enough accuracy. Consider an N-bit
   DAC like the one in the picture.

   [Figure 4.2: Simple DAC. A string of resistors from V_ref to ground, with switches controlled
   by b_1, b_2, b_3 selecting the tap that produces V_out.]

   Suppose the resistors are not perfect but have some manufacturing tolerance, so that their
   values in ohms are R(1 ± ε), where ε is, say, 10^{−3}, that is, 0.1%. Suppose we want to convert
   the N-bit binary number 10⋯0 to analog without any error. In the resistor ladder, we need to
   generate half the voltage V_ref by selecting the bottom half of the resistors. There are 2^N
   resistors, and the voltage divider rule gives that our goal is

       (sum of lower-half resistances)/(sum of all resistances) = 1/2.

   We will make the smallest error possible of 1 bit when the ratio on the left equals 1/2 + 2^{−N}
   instead of 1/2, that is, it corresponds to 100⋯01 instead of 100⋯0. Suppose the lower-half
   resistors are high at R(1 + ε) ohms and the upper half are low at R(1 − ε). Then the equation
   becomes

       (1/2)2^N R(1 + ε) / [(1/2)2^N R(1 + ε) + (1/2)2^N R(1 − ε)] = 1/2 + 1/2^N.

   This simplifies to

       (1 + ε)/2 = 1/2 + 1/2^N,

   or equivalently, since ε = 10^{−3},

       10^{−3} = 2^{−(N−1)}.

   Thus N = 1 + 3 log_2(10) ≈ 11. To recap, the converter may make a 1-bit error if N = 11 and
   the resistors have an accuracy of 0.1%. Thus the number of obtainable bits without error is
   N = 10. But digital audio, for example, needs at least 16 or 18 bits to sound right.

3. For the reason just given, oversampled converters are used, with an oversampling factor of
   from 8 to 512. Before we turn to oversampled ADCs, let's define what a Nyquist-rate ADC
   is. It looks like that in Figure 4.3. The block Q is a memoryless quantizer, the opposite of
   what's shown in Figure 4.2.

   [Figure 4.3: Nyquist-rate ADC. x_c(t) → C/D → x[n] → Q → y[n].]


4. The oversampling converter we're going to study is called a ΔΣ converter, meaning it has a
   difference operator (Δ) and an integrator (Σ). Keep in mind that the sampling rate is much
   larger than twice the bandwidth of the analog input.

   We begin with the continuous-time circuit in Figure 4.4; real ΔΣ converters aren't this simple,
   but it's easy to understand how this one works. The circuit has an integrating op-amp and a
   feedback loop.

   [Figure 4.4: A simple ΔΣ converter. The input x_c(t) and the fed-back signal v_c(t) enter an
   op-amp integrator (input resistors R, feedback capacitor C) whose output y_c(t) drives a
   clocked comparator; the comparator output is v_c(t).]

   The clocked comparator does a sample-and-hold at the oversampling rate, followed by a 1-bit
   quantizer. The circuit equations are

       v_c = sgn(y_c),   x_c/R = C (dy_c/dt) + v_c/R.

   The signum function (sgn) equals ±1 depending on the sign of the input. The second equation
   is an application of KCL at the input node to the op-amp, a node at zero potential. Thus

       dy_c/dt = (1/RC)(x_c − v_c).

   A sample-and-hold is a sampler followed by a zero-order hold (ZOH). This leads to the block
   diagram in Figure 4.5. The zero-order hold holds the discrete-time values v[n] at a constant
   value over the time interval nT ≤ t < (n + 1)T.

   [Figure 4.5: Block diagram of the ΔΣ converter. x_c(t) and v_c(t) are differenced and integrated
   (1/RCs) to give y_c(t); C/D produces y[n]; the 1-bit quantizer gives v[n]; a ZOH closes the
   loop with v_c(t).]

   In this block diagram the sampling frequency is at the oversampled rate. The key point of this
   architecture is that the R and C values do not have to be very precise, because the integrator
   is going to help: it has infinite DC gain no matter what R and C are (hooray for feedback
   loops).

   Since f_s is much larger than the bandwidth of x_c(t), we can approximate the preceding block
   diagram by the one in Figure 4.6, and this can be simplified to Figure 4.7.

   [Figure 4.6: Almost equivalent block diagram, with the C/D and ZOH blocks moved to the
   input.]

   [Figure 4.7: Simplified block diagram. x_c(t) → C/D → x[n]; the loop contains a discrete-time
   integrator k/(z − 1), k = T/RC, followed by the 1-bit quantizer Q producing v[n], which is
   fed back.]

   This is a discrete-time integrator in a feedback loop with a 1-bit quantizer. The discrete-time
   clock is at the oversampling rate.
5. How can we get good analog-to-digital conversion with a quantizer with only 1 bit? The
reason is that this converter has memory: The output v[n] at time n depends on all the past
values of x[n]. The average of v[n] over a long window equals the average of x[n] over the
same window:

   Proof. Let's say the window is from n = 0 to n = N. We have

       y[n + 1] = y[n] + k(x[n] − v[n]).

   This yields

       y[N] = y[0] + k Σ_{n=0}^{N−1} (x[n] − v[n]),

   which in turn leads to

       (y[N] − y[0])/N = k (avg_{0≤n≤N−1} x[n] − avg_{0≤n≤N−1} v[n]).

   Assuming the loop is stable, so that y[n] is bounded, the left-hand side is small for N large,
   and therefore

       avg_{0≤n≤N−1} x[n] ≈ avg_{0≤n≤N−1} v[n].

   Thus x[n] can be recovered from v[n] by lowpass filtering, i.e., averaging. Then x_c(t) can be
   recovered using a DAC, which can be a zero-order hold followed by a lowpass filter.
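   Here is a minimal simulation of the loop of Figure 4.7 with k = 1 (MATLAB; the input
   amplitude and the averaging window are our own choices):

Ns = 8192; M = 64;
ns = 0:Ns-1;
x = 0.5*sin(2*pi*ns/2048);          % slow input, |x| < 1
y = 0; v = zeros(1, Ns);
for i = 1:Ns
    v(i) = sign(y);                 % 1-bit quantizer (sign(0) = 0 at start-up, harmless)
    y = y + (x(i) - v(i));          % integrator update, k = 1
end
xhat = filter(ones(1, M)/M, 1, v);  % crude moving-average decoder
% plot(ns, x, ns, xhat): xhat tracks x, up to the averaging delay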
6. To recap, a ΔΣ converter is based on oversampling at a high rate; it can have a 1-bit
   quantizer; and it's relatively insensitive to circuit component values.

7. The quality of an ADC is measured by a quantity called the signal-to-quantizer-noise ratio
   (SQNR). For this topic we need to approximate the quantizer in Figure 4.7. Define the
   signal

       e[n] = v[n] − y[n] = Q(y[n]) − y[n].

   That is, e[n] is the quantizer output minus the quantizer input. We now set k = 1, i.e.,
   RC = T, and re-draw the block diagram as though e[n] were a noise input; see Figure 4.8.

   [Figure 4.8: Quantizer noise. x_c(t) → C/D → x[n] enters the loop with integrator 1/(z − 1);
   e[n] is added where the quantizer used to be, giving v[n].]

   The transfer functions are these: From x[n] to v[n] it is z^{−1}, and from e[n] to v[n] it is
   1 − z^{−1}.

   We now model e[n] as a random signal that is white, that is, independent from one time to
   another (e[n], e[m] are independent for n ≠ m), and identically distributed at every time (the
   density function of e[n] is the same for all n). The quantizer output, v[n], is ±1, depending
   on the sign of y[n]. Let us assume |y[n]| ≤ 2. Then the value of e[n] is confined to the interval
   [−1, 1]. Let us therefore assume e[n] is uniformly distributed over this range. Then its mean
   is 0 and its variance is 1/3. Thus the autocorrelation function of e[n] is φ_e[n] = (1/3)δ[n].
   You can compute the power spectrum of v[n] in response only to e[n] to be

       Φ_v(e^{jω}) = (4/3) sin^2(ω/2).

   Noteworthy is that this spectrum is zero at DC. Since v[n] is going to be lowpass filtered (to
   recover x[n]), the effect of the noise e[n] is going to be small, depending on the oversampling
   factor.

   To see this, let us be more explicit about frequency ranges. Let Ω_0 denote the bandwidth of
   the analog signal x_c(t). The sampling frequency for a Nyquist-rate converter would be 2Ω_0
   (or a bit larger). Let M be the oversampling factor (from 8 to 512). Then the sampling
   frequency for the ΔΣ converter we're treating is Ω_s = M · 2Ω_0, and thus the Nyquist frequency
   for this sampling frequency is MΩ_0. In discrete time, where ω = ΩT, the bandwidth of x[n]
   is Ω_0 T = π/M. Thus on the ω axis from 0 to π, the inband range is from 0 to π/M, and the
   inband power of v[n] due to the quantization noise e[n] is

       (1/2π) ∫_{−π/M}^{π/M} Φ_v(e^{jω}) dω.

   For M large, this evaluates to (const)/M^3. Now SQNR is the ratio of the power of x_c(t) to the
   power just computed (the inband power of v[n] due to e[n]). Fixing x_c(t) as, say, a full-scale
   sinusoid, we have

       SQNR = (const.) M^3.

   SQNR is normally written in dB, for which we take 10 log_10 of the above:

       SQNR (dB) = (const.) + 30 log_10 M.

   In particular, if M is doubled, the SQNR increases by 9 dB. A typical value for SQNR is 90
   dB.

   Besides increasing the oversampling factor, SQNR can also be enhanced by designing other
   feedback loops instead of the one in Figure 4.8.

Chapter 5

Multirate DSP

Multirate systems have at least two different data rates. The ones we study have downsamplers
and/or upsamplers. This makes them time-varying from the input to the output.

5.1 Components of a Multirate System

We begin our study of multirate systems by reviewing the basic components, namely downsamplers
and upsamplers. We'll use our matrix representations of systems.
1. The downsample-by-2 system is denoted by a block labeled either D or ↓2, with input x[n] and
   output y[n]. The output is defined by y[n] = x[2n] and the matrix is

       D = [ ⋱
             ⋯  1  0  0  0  0  ⋯
             ⋯  0  0  1  0  0  ⋯
             ⋯  0  0  0  0  1  ⋯
                            ⋱ ],

   each row having a single 1, shifted right by two columns per row.

2. The upsample-by-2 system is denoted by a block labeled either E or ↑2, with input x[n] and
   output y[n]. Going forward in time from n = 0, the output y[n] is obtained by inserting a 0
   after every input sample; likewise, going backward in time from n = 0, a zero is inserted after
   every input sample. Thus

       x = (..., x[−2], x[−1] | x[0], x[1], x[2], x[3], ...)

   becomes

       y = (..., x[−2], 0, x[−1], 0 | x[0], 0, x[1], 0, x[2], 0, x[3], ...).

   The matrix has a single 1 in each column, shifted down by two rows per column:

       E = [ ⋱
             ⋯  1  0  0  ⋯
             ⋯  0  0  0  ⋯
             ⋯  0  1  0  ⋯
             ⋯  0  0  0  ⋯
             ⋯  0  0  1  ⋯
                      ⋱ ].

   Notice that the matrix E is the transpose of the matrix D, that is, E = Dᵀ. So we don't
   need the symbol E any more; we'll use Dᵀ.
3. It's a simple fact that the output of DDᵀ equals the input: If you upsample and then downsample,
   you haven't done anything. So we write DDᵀ = I.

4. On the other hand, the system DᵀD (downsample by 2, then upsample by 2) zeros out the
   odd-indexed samples. The matrix is

       DᵀD = [ ⋱
               ⋯  1  0  0  0  0  ⋯
               ⋯  0  0  0  0  0  ⋯
               ⋯  0  0  1  0  0  ⋯
               ⋯  0  0  0  0  0  ⋯
               ⋯  0  0  0  0  1  ⋯
                              ⋱ ].

   So this system is time-varying. However, it has the special property that if the input is delayed
   by two sample times, then the output is too. That is to say,

       DᵀD U² = U² DᵀD.

   When this happens, we say the system DᵀD is periodic of period 2, or 2-periodic.
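   These identities are easy to check numerically on finite truncations (MATLAB; length-8 signals,
   our own small test):

N = 8;
U = diag(ones(N-1,1), -1);       % unit delay: (U*x)(n) = x(n-1)
D = zeros(N/2, N);
D(:, 1:2:end) = eye(N/2);        % downsample by 2
norm(D*D' - eye(N/2))            % = 0: upsample-then-downsample is the identity
P = D'*D;                        % downsample then upsample (zeros the odd samples)
norm(U^2*P - P*U^2)              % = 0: the 2-periodicity property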
5. A filter bank is a filter with either several inputs or several outputs, or both. An example is a
   system where the input x feeds two LTI filters G_0 and G_1, with outputs y_0 = G_0 x and
   y_1 = G_1 x. It has one input and two outputs. Alternatively, we can write the two equations
   as one:

       [ y_0 ]   [ G_0 ]
       [ y_1 ] = [ G_1 ] x.

   The companion system, with two inputs x_0, x_1 and one output y = G_0 x_0 + G_1 x_1, can be
   written

       y = [ G_0  G_1 ] [ x_0
                          x_1 ].
6. Warning: The matrix for a filter bank may not be what you think it is. Suppose we have
   matrices for two FIR systems G, H, for example,

       G = [ ⋱                          H = [ ⋱
             g[0]  0    0    0                h[0]  0    0    0
             g[1] g[0]  0    0                h[1] h[0]  0    0
              0   g[1] g[0]  0                 0   h[1] h[0]  0
              0    0   g[1] g[0]               0    0   h[1] h[0]
              0    0    0   g[1]               0    0    0   h[1]
                              ⋱ ],                              ⋱ ],

   and then we form the filter bank with y_0 = Gx and y_1 = Hx. What is the matrix for the
   result? The natural way to get it is to define the output to be the vector y = (y_0, y_1) but
   with components interleaved in time:

       y = (..., y[0], y[1], y[2], y[3], ...) = (..., y_0[0], y_1[0], y_0[1], y_1[1], ...).

   This gives

       y = [ ⋱
             g[1] g[0]  0    0
             h[1] h[0]  0    0
              0   g[1] g[0]  0
              0   h[1] h[0]  0
              0    0   g[1] g[0]
                           ⋱ ] x.

   Thus the rows are those of G and H, interleaved.

7. Now let us look at this system: the input x[n] goes into two branches. In the upper branch,
   x[n] is downsampled by 2 (giving x_0[n]) and then upsampled by 2. In the lower branch, x[n]
   is first advanced one step (the block Uᵀ, i.e., z), downsampled by 2 (giving x_1[n]), upsampled
   by 2, and then delayed one step (the block U, i.e., z^{−1}). The two branch outputs are added
   to give y[n].

   The output of Dᵀ in the upper branch is

       (..., x[−2], 0 | x[0], 0, x[2], 0, x[4], ...).

   In the lower branch the input to D is the advanced signal

       (..., x[−2], x[−1], x[0] | x[1], x[2], x[3], x[4], ...),

   and thus the output of Dᵀ is

       (..., 0, x[−1], 0 | x[1], 0, x[3], 0, ...).

   Then the output of U is

       (..., 0, x[−1] | 0, x[1], 0, x[3], 0, ...).

   Adding gives

       (..., x[−2], x[−1] | x[0], x[1], x[2], x[3], ...),

   which equals x. Thus the output equals the input. We say this system has the perfect
   reconstruction property. The system from x to y is

       T = DᵀD + U DᵀD Uᵀ.

   Since the system has the perfect reconstruction property,

       DᵀD + U DᵀD Uᵀ = I.                                              (5.1)

   Finally, it is convenient to use matrix notation again and define the filter bank from x to
   (x_0, x_1):

       [ x_0 ]   [ D    ]
       [ x_1 ] = [ D Uᵀ ] x.                                            (5.2)

   The output of this filter bank is the input partitioned into blocks of length 2. That is, the
   output at time n is

       [ x[2n]
         x[2n + 1] ].

   For n = 0, 1, 2, ..., this gives

       [ x[0]     [ x[2]     [ x[4]
         x[1] ],    x[3] ],    x[5] ],  ....

   Likewise, the matrix from (x_0, x_1) to y is

       [ Dᵀ   U Dᵀ ]

   and the output is the merging, or interleaving, of the two inputs, as we saw above.

8. All these building blocks extend to a general integer N: there's a downsample-by-N, an
   upsample-by-N, and so on.

5.2 Block Filtering

1. There are signal processing algorithms that require block processing. An example is JPEG
   compression of an image. A black-and-white image is stored as a matrix of gray-level values,
   one per pixel. The JPEG algorithm involves partitioning the matrix into blocks of size 8 × 8
   and doing a discrete cosine transform of each block, followed by quantizing and coding.

   In this section we see how to implement an ordinary digital filter by dividing the input into
   segments or blocks; this is block convolution. There's no real advantage to this by itself, but
   later, in the DFT chapter, we'll see a remarkable way of speeding up the filtering process using
   this method of block processing.
2. Block filtering involves a scheme called overlap-add. It is illustrated pictorially (in continuous
   time for readability) as follows. The goal is to filter the signal x shown at the top left of the
   figure. The signal is broken into two segments, x_0 and x_1, shown below x. But x_0 is actually
   longer than the true segment of x because it is appended with zeros; the actual length of x_0 is
   shown by a double-headed arrow. Likewise, x_1 is the second segment of x, again appended with
   zeros. To the right of x_0 is the result of filtering x_0, denoted y_0; likewise for y_1. Notice
   that these two signals overlap in time: y_1 starts before y_0 has finished. These two signals are
   added together to get y, shown below them. The method works correctly provided the appended
   zero segments are long enough. We'll see how much zero-padding is necessary in the discrete-time
   example to follow.
3. Now we'll derive this overlap-add procedure in discrete time. We'll use our matrix method, and
   you'll see how slick the derivation is. It may look a bit complicated, but it's actually simple
   algebraic manipulation. We're going to use very small integers for ease of writing. Consider a
   first-order FIR filter H(z) = h[0] + h[1]z^{−1}. We want to filter a signal x[n] by partitioning it
   into blocks of length 3. That is, we want to implement the convolution y = h ∗ x but where
   x[n] is partitioned into blocks.

   We have H = h[0]I + h[1]U, that is,

       H = [ ⋱
             h[1] h[0]  0    0
              0   h[1] h[0]  0
              0    0   h[1] h[0]
                             ⋱ ],

   where I is the identity matrix and U the unit-delay matrix (ones on the first subdiagonal).
   Thus

       y = Hx = (h[0] I + h[1] U) x.

   It simplifies matters to do the two cases

       y_0 = Ix,   y_1 = Ux

   separately, and then take y = h[0] y_0 + h[1] y_1.

4. First, we do the case y_1 = Ux. To make blocks of length 3, we need the downsampler by 3;
   denote it by D. Then, as in (5.2), the blocking system is

       [ D
         D Uᵀ
         D Uᵀ² ]

   and as in (5.1) we have

       DᵀD + U DᵀD Uᵀ + U² DᵀD Uᵀ² = I.

   Thus

       y_1 = U (DᵀD + U DᵀD Uᵀ + U² DᵀD Uᵀ²) x.

   Write this as follows:

       y_1 = [ Dᵀ  U Dᵀ  U² Dᵀ  U³ Dᵀ ] [ 0
                                          D
                                          D Uᵀ
                                          D Uᵀ² ] x.

   The factor on the right can itself be factored to pull out the blocking system:

       [ 0 0 0    [ D
         1 0 0      D Uᵀ
         0 1 0      D Uᵀ² ].
         0 0 1 ]

   This product can be rewritten as (here's where we append a zero):

       [ 0 0 0 0    [ D
         1 0 0 0      D Uᵀ
         0 1 0 0      D Uᵀ²
         0 0 1 0 ]    0     ].

   Putting the factors together, we get

       y_1 = [ Dᵀ  U Dᵀ  U² Dᵀ  U³ Dᵀ ] [ 0 0 0 0    [ D
                                           1 0 0 0      D Uᵀ
                                           0 1 0 0      D Uᵀ²
                                           0 0 1 0 ]    0     ] x.

   Let's read this from right to left: Start with the input x. Partition it into blocks of length 3,
   and append (pad) with a zero. This gives blocks of length 4. Next, multiply each block by
   the 4 × 4 Toeplitz matrix

       T = [ 0 0 0 0
             1 0 0 0
             0 1 0 0
             0 0 1 0 ].

   Finally, merge.

   Here's the block diagram implementation of the method: delay chains split x[n] into the three
   block components, a zero is appended to each block, each length-4 block is multiplied by T,
   and the merge interleaves the results (with overlap) to form y[n]. You can see the blocking,
   the padding with one zero, the block-processing by T, and the merging.

   The case y_0 = Ix is exactly the same except the Toeplitz matrix is the identity matrix.

   Finally, putting the two together, we get exactly the same except the Toeplitz matrix is

       T = [ h[0]  0    0    0
             h[1] h[0]  0    0
              0   h[1] h[0]  0
              0    0   h[1] h[0] ].

   In summary, the block filtering formula for H(z) = h[0] + h[1]z^{−1} is

       y = [ Dᵀ  U Dᵀ  U² Dᵀ  U³ Dᵀ ]  T  [ D
                                            D Uᵀ
                                            D Uᵀ²
                                            0     ] x,                    (5.3)

   where T acts on each padded block.
5. To aid in understanding what we have done, and also to write code to implement the system,
   we bring in matrix representations. To simplify a bit, we'll do the previous example but where
   the input block length is 2, not 3. Then the downsampler is by 2 and the factorization (5.3)
   simplifies to

       y = [ Dᵀ  U Dᵀ  U² Dᵀ ]  [ h[0]  0    0      [ D
                                  h[1] h[0]  0        D Uᵀ ] x.           (5.4)
                                   0   h[1] h[0] ]

   First, the blocking system

       [ D
         D Uᵀ ],

   where D is downsample by 2. Its outputs are v_0[n] = x[2n] and v_1[n] = x[2n + 1], so the
   output is a vector:

       v[n] = [ v_0[n]
                v_1[n] ].

   Let us take the vector corresponding to v[n] to be

       v = (..., v[−1], v[0], v[1], ...),

   with the two components of each v[n] listed in time order. Then you can easily check that the
   matrix producing v from x is the identity matrix:

       [ ⋱
         ⋯ 1 0 0 ⋯
         ⋯ 0 1 0 ⋯
         ⋯ 0 0 1 ⋯
               ⋱ ].                                                      (5.5)

   In a similar way, the matrix of the far-right system in (5.4), i.e., block and pad, is

       [ ⋱
         ⋯ 1 0 0 0 ⋯
         ⋯ 0 1 0 0 ⋯
         ⋯ 0 0 0 0 ⋯
         ⋯ 0 0 1 0 ⋯
         ⋯ 0 0 0 1 ⋯
         ⋯ 0 0 0 0 ⋯
                 ⋱ ].                                                    (5.6)

   We can see the pattern better by partitioning this matrix into blocks: it is block diagonal,
   with the 3 × 2 submatrix

       [ 1 0
         0 1
         0 0 ]

   on the diagonal (each length-2 block of x is kept and one zero is appended).

   Then the middle system on the right-hand side of (5.4) is the block-diagonal matrix with the
   3 × 3 Toeplitz matrix along the diagonal:

       [ ⋱
         h[0]  0    0
         h[1] h[0]  0
          0   h[1] h[0]
                      h[0]  0    0
                      h[1] h[0]  0
                       0   h[1] h[0]
                                   ⋱ ].
   Finally, continuing with the downsampler by 2, let's look at the merge system

       [ Dᵀ  U Dᵀ  U² Dᵀ ].                                              (5.7)

   The matrix for this system is the transpose of the matrix for

       [ D
         D Uᵀ
         D Uᵀ² ].

   The latter is (check this) the matrix whose successive row-triples read off x[2n], x[2n + 1],
   x[2n + 2]; note that x[2n + 2] is read twice, once as the last entry of block n and once as the
   first entry of block n + 1, so the blocks overlap. Transposing, and partitioning the matrix for
   (5.7) into 2 × 3 blocks, you can see the pattern: it has

       [ 1 0 0
         0 1 0 ]

   on the main block diagonal and

       [ 0 0 1
         0 0 0 ]                                                         (5.8)

   on the first block diagonal below that. The third entry of each input block overlaps, and adds
   onto, the first entry of the next output pair.

6. Continuing, as an example computation, suppose we want to multiply the two polynomials

       H(z) = 2 − z^{−1},   X(z) = 1 + 2z^{−1} − z^{−2} + 4z^{−3} + z^{−4}.

   The computation is easy by hand:

       Y(z) = H(z)X(z) = 2 + 3z^{−1} − 4z^{−2} + 9z^{−3} − 2z^{−4} − z^{−5}.          (5.9)

   To use the matrices above, we have to regard x[n] as a signal defined for all n:

       x = (..., 0 | 1, 2, −1, 4, 1, 0, ...).

   Multiplying by matrix (5.6), i.e., blocking by 2 and padding each block with one zero, gives

       (..., 0, 0, 0 | 1, 2, 0, −1, 4, 0, 1, 0, 0, 0, ...).

   Then multiplying by the block-diagonal Toeplitz matrix with

       T = [ 2  0  0
            −1  2  0
             0 −1  2 ]

   on the diagonal gives

       (..., 0, 0, 0 | 2, 3, −2, −2, 9, −4, 2, −1, 0, 0, ...).

   Finally, multiplying by (5.8) overlap-adds the block tails:

       y = (..., 0 | 2, 3, −4, 9, −2, −1, 0, ...).

   This matches up with Y(z) in (5.9), so the answer is right!


7. Here's a Scilab program to do the computation of the example but for a long sinusoidal input.
   The input block length is 3. The only tricky part is the overlap-add; you can see how it's
   done in the for loop. MATLAB code is almost identical, with conv instead of convol.

// Block filtering
// N is the number of input samples
N=3*10^4;
// input x[n] and filter h[n]
x=sin((1:N));
h=[2 -1];
// computation by direct convolution
y=convol(h,x);
y=y';                            // make it a column vector
y_direct=y(1:N);
// computation by block filtering
// redefine notation
h0=h(1);
h1=h(2);
// define some matrices
// B appends a zero to a length-3 block
B=[1 0 0;0 1 0;0 0 1;0 0 0];
// K is for overlap-add (B' keeps a block's first 3 samples, K' carries
// the previous block's padded tail into the current block)
K=[0 0 0;0 0 0;0 0 0;1 0 0];
// T is the Toeplitz matrix
T=[h0 0 0 0;h1 h0 0 0;0 h1 h0 0;0 0 h1 h0];
// the signal w[n] is the output of T
w_block_old=[0;0;0;0];
// initialize y[n]
y=0;
for i=1:(N/3)
    x_block=[x(3*i-2);x(3*i-1);x(3*i)];   // column block of length 3
    x_block_pad=B*x_block;                // pad to length 4
    w_block=T*x_block_pad;                // filter the block
    y_block=B'*w_block+K'*w_block_old;    // overlap-add the old tail
    y=[y;y_block];
    w_block_old=w_block;
end
y_block_filtering=y(2:(N+1));
// check that the two y's are equal
norm(y_direct-y_block_filtering)


8. What is the block diagram in the general case, where the filter has length M and the input
   block length is N ≥ M? The downsampling integer is N; M − 1 zeros are padded to each input
   block; and T is (N + M − 1) × (N + M − 1).

9. As a fun exercise, write a MATLAB or Scilab script to filter a signal with 10^6 samples in
   blocks of length 1000, where the filter has 64 coefficients, h[0], ..., h[63]. (One possible
   solution is sketched below.)
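   One possible solution sketch for the exercise (MATLAB; here conv does the per-block Toeplitz
   multiplication, and the overlap-add handles the M − 1 tail samples):

Nb = 1000; Mf = 64;
h = randn(1, Mf);                      % stand-in coefficients h[0..63]
x = sin(1:1e6);
y = zeros(1, length(x) + Mf - 1);
for i = 1:length(x)/Nb
    xb = x((i-1)*Nb+1 : i*Nb);         % one input block
    yb = conv(h, xb);                  % = T acting on the zero-padded block
    idx = (i-1)*Nb+1 : (i-1)*Nb+Nb+Mf-1;
    y(idx) = y(idx) + yb;              % overlap-add the Mf-1 tail samples
end
% Check: with yfull = conv(h, x), norm(y(1:1e6) - yfull(1:1e6)) is ~ 0.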
10. Let's summarize symbolically what we did in this section: We started with a causal FIR filter
    H and we factored it as H = MTB, where B breaks the input into blocks and appends a
    sufficient number of zeros; T filters each block one by one; and M merges the filtered blocks
    using overlap-add.

5.3 Changing the Sampling Rate

1. The sampling rate used in audio signal processing depends on the application. For example,
   for CDs it is 44.1 kHz, for studio recordings it is 48 kHz, and for digital broadcast it is 32 kHz.
   Suppose we want to change the sampling rate. That is, a signal x_c(t) has been sampled at,
   say, 48 kHz, producing x[n], and we want to get y[n], which is x_c(t) sampled at 44.1 kHz.
   Assuming no aliasing, we could reconstruct x_c(t) from x[n] and then re-sample at 44.1 kHz.
   But there's a better way, using only discrete-time operations. For that we need general
   frequency-domain formulas for down- and upsampling.

2. Downsampling by M. Suppose a signal x[n] is downsampled by M to produce v[n]. How are
   these two signals related in the frequency domain? The formula is nicer in terms of the z
   transform.

   Let's do M = 3 and then write down the general case. So v[n] = x[3n]. We want V(z) in
   terms of X(z). Let W denote the complex number e^{−j2π/3}. We have

       X(z) = Σ_n x[n] z^{−n}
       X(Wz) = Σ_n x[n] (Wz)^{−n} = Σ_n x[n] W^{−n} z^{−n}

       X(W²z) = Σ_n x[n] (W²z)^{−n} = Σ_n x[n] W^{−2n} z^{−n}.

   Add:

       X(z) + X(Wz) + X(W²z) = Σ_n (1 + W^{−n} + W^{−2n}) x[n] z^{−n}.

   Now look at the term

       C_n = 1 + W^{−n} + W^{−2n}.

   It is the sum of three complex numbers on the unit circle. It is claimed that C_n = 0 if n is
   not a multiple of 3; otherwise C_n = 3. You prove this.

   Thus we have

       X(z) + X(Wz) + X(W²z) = Σ_{n a multiple of 3} 3 x[n] z^{−n}

   and so

       X(z) + X(Wz) + X(W²z) = Σ_m 3 x[3m] z^{−3m} = 3V(z³).

   So the formula we're looking for is

       V(z³) = (1/3)[X(z) + X(Wz) + X(W²z)].

   To get the FT, set z = e^{jω}:

       V(e^{j3ω}) = (1/3)[X(e^{jω}) + X(e^{j(ω−2π/3)}) + X(e^{j(ω−4π/3)})].

   Another form for this is

       V(e^{jω}) = (1/3)[X(e^{jω/3}) + X(e^{j(ω−2π)/3}) + X(e^{j(ω−4π)/3})].

   For a general integer M, the formulas are these:

       v[n] = x[Mn],   W = e^{−j2π/M}

       V(z^M) = (1/M)[X(z) + X(Wz) + ⋯ + X(W^{M−1}z)]

       V(e^{jMω}) = (1/M)[X(e^{jω}) + X(e^{j(ω−2π/M)}) + ⋯ + X(e^{j(ω−2π(M−1)/M)})]

       V(e^{jω}) = (1/M)[X(e^{jω/M}) + X(e^{j(ω−2π)/M}) + ⋯ + X(e^{j(ω−2π(M−1))/M})].

3. Upsampling by L. Suppose a signal x[n] is upsampled by an integer L to v[n]. The formulas
   are simpler:

       V(z) = X(z^L),   V(e^{jω}) = X(e^{jωL}).

4. Now we can develop the sample-rate changer. Let f_1 and f_2 be two sampling rates in Hz. Let
   x_c(t) be a signal bandlimited to less than both f_1/2 and f_2/2 in Hz. Assume f_1 and f_2 are
   rationally related, that is, there are two positive integers L, M such that

       f_2 = (L/M) f_1.

   Then the following system converts x_c(t) sampled at f_1 Hz to x_c(t) sampled at f_2 Hz:

       x_c(t) → C/D at rate f_1 → x[n] → ↑L → v[n] → LP (gain L, cutoff π/L) → w[n] → ↓M → y[n] at rate f_2.

   Sketch the Fourier transforms of all the signals to convince yourself this works.
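   Here's a small numerical sketch of the rate changer (MATLAB) with easy factors L = 3, M = 2,
   and a truncated windowed sinc standing in for the ideal lowpass filter, so the result is only
   approximate:

L = 3; M = 2;
n = 0:999;
x = sin(0.2*n);                                  % input samples at rate f1
v = zeros(1, L*length(x)); v(1:L:end) = x;       % upsample by L
k = -60:60;
h = L * sin(pi*k/L) ./ (pi*k);  h(k == 0) = 1;   % gain L, cutoff pi/L (truncated)
w = conv(v, h, 'same');                          % lowpass filtering
y = w(1:M:end);                                  % downsample by M
% y approximates sin(0.2*(M/L)*n2), n2 = 0, 1, ..., at the new rate f2 = (L/M) f1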
5. For example, let's do the case we started with: f_1 = 48 kHz, f_2 = 44.1 kHz. Thus

       f_2 = (441/480) f_1 = (147/160) f_1.

   So L = 147, M = 160.

   Question: Why couldn't we downsample by 160 first and then upsample by 147; why do we
   have to upsample first?

5.4 Subband Coding

This section is taken from a research paper:

T. Chen and B. A. Francis, "Design of multirate filter banks by H-infinity optimization,"
IEEE Trans. Signal Processing, Dec. 1995, pp. 2822–2830.

1. One motivation for this section is compression, reducing the number of bits in the digital
   representation of a signal. Here's a chart showing compression descriptors and where subband
   coding, the technique we'll study, lies:

       Compression methods: waveform based (lossy or lossless) versus model based;
       waveform-based methods divide into time domain and frequency domain;
       frequency-domain methods divide into filter based (subband coding) and
       transform based (DCT, wavelet, etc.).

   Examples of compression are zip applied to a file, JPEG (still image), and MPEG (video/audio).
   For example, consider video DVD. The frame rate is 30 frames/second with, say, 720 × 480
   pixels/frame and 8 bits/pixel. The total number of bits/second in a video signal is therefore

       30 × 720 × 480 × 8 ≈ 80 × 10^6 = 80 Mbps.

   A DVD can hold 4.7 GB, or 4.7 × 10^9 × 8 bits. Therefore, without compression a DVD can
   store only

       (4.7 × 10^9 × 8)/(80 × 10^6) = 470 seconds = 7.8 minutes.
2. The system we want to study is the multirate filter bank: the input x[n] feeds two analysis
   filters H_0 and H_1; each filter output is downsampled by 2, giving x_0[n] and x_1[n]; each of
   these is upsampled by 2 and passed through a synthesis filter, F_0 and F_1 respectively; and
   the two results are added to give x̂[n].

   The four subsystems H_0, H_1, F_0, F_1 are LTI filters. Here's how this system works. The
   input x[n] is a signal that we want to compress. In subband coding we decompose the signal
   into two components, shown as x_0[n] and x_1[n]; then we quantize these two components. The
   quantizers are not shown, but they would be just before the two blocks labeled Dᵀ. Typically
   H_0 is a lowpass filter and H_1 is a highpass filter. If these filters were perfect, their outputs
   would be bandlimited to, respectively, [0, π/2] and [π/2, π] (in |ω|). These two bandlimited
   signals could be subsampled by D without loss of information. Then the two signals x_0[n] and
   x_1[n] would contain exactly the same information as x[n]. Therefore, x[n] could be reconstructed
   from these two signals, and we would have x̂[n] = x[n] if the filters F_0, F_1 were designed
   perfectly. The compression comes about by quantizing the two signals x_0[n] and x_1[n]. Why
   is this better than just quantizing x[n] directly? Suppose x[n] is an audio signal. The ear does
   not have the same gain over all frequencies; there are frequency ranges over which hearing
   quality is reduced. Over these frequency ranges fewer bits are needed. The figure shows just
   two channels. The MPEG audio coder has 32 channels.
3. The preceding discussion raises two issues: how to design the four filters, and how many bits
   are required in the quantizers. We'll leave the latter question to a later section of the course;
   we continue with the first question. The system from x to x̂ is

       T = F_0 Dᵀ D H_0 + F_1 Dᵀ D H_1.

   This system is 2-periodic: U²T = TU². In 1988 Vaidyanathan and Mitra solved the problem
   of alias cancellation, getting conditions on the filters so that T is time-invariant. Then
   researchers studied the problem of perfect reconstruction, designing filters so that for some m

       x̂[n] = x[n − m],

   that is, the reconstructed signal is a delayed version of the original signal. Delay in signal
   processing is usually not harmful. In this problem the ideal system from x to x̂ is

       T_d = U^m,   T_d(z) = z^{−m}.

   The conventional design method for this problem is to treat separately the problems of aliasing,
   magnitude distortion, and phase distortion.

4. An alternative approach is to design the so-called analysis filters H_0, H_1 for good frequency
   separation, and then to design the synthesis filters F_0, F_1 to make x̂ as close to x as possible.
   This leads to an error system where our system is compared to the ideal: the error is

       e = [T_d − (F_0 Dᵀ D H_0 + F_1 Dᵀ D H_1)] x,   T_d = U^m,   T_d(z) = z^{−m}.

   Now we need a performance criterion that measures how big the error e[n] is. It is common
   to take the quadratic norm:

       ‖e‖ = (Σ_n e[n]²)^{1/2}.

   This leads to the problem of minimizing the error ‖e‖ by designing the filters F_0, F_1. But this
   isn't well posed yet, because ‖e‖ depends not only on the filters but also on the input x[n].
   One sensible approach is to allow x[n] to be arbitrary except scaled so that ‖x‖ = 1. Then
   the worst-case error is

       J = max_{‖x‖=1} ‖e‖.

   Then we could design the synthesis filters to minimize this performance measure. So here's
   the problem now:

   Given (causal, stable) analysis filters H_0(z) and H_1(z) and given a tolerable time delay m;

   Design (causal, stable) synthesis filters F_0(z) and F_1(z) to minimize J.

   Developing the solution to this would take too long.

83

CHAPTER 5. MULTIRATE DSP

5. This is a typical example of a subband filter bank for audio coding. Leakage between frequency
bands is undesirable, that is, ideally the analysis filters should not overlap. Moreover, analysis
filters with narrow transition widths and high stopband attenuation are sought. Johnston
filters (named after the engineer of that name) are widely used, especially for speech coding.
They are linear phase, alias free, FIR, and optimized for out-of-band leakage and amplitude
distortion; they are computationally efficient and perceptually lossless. We will compare the
performance of such a system with one where the analysis filters are Johnston filters and the
synthesis filters are designed by the optimization method outlined above.
First, the standard method. Filter H0 (z) was chosen to be Johnstons filter codename 32D;
it is a 32-order FIR filter with passband edge 0.25 and stopband edge 0.293. Here and in the
following, frequency is f normalized to the interval [0, 0.5]. Heres the magnitude plot of H0 :
$!

!$!

|H0 (f )|

!%!

!&!

!'!

!#!

!)!

!(!
!

!"!#

!"$

!"$#

!"%

!"%#

!"&

!"&#

!"'

!"'#

!"#

Theres a standard way of getting the other three filters:


H1 (z) = H0 (z),

F0 (z) = 2H0 (z),

F1 (z) = 2H0 (z).

The overall delay was m = 31. The performance is measured in terms of signal-to-noise ratio,
SN R, defined as follows:
SN R = 10 log10

encoder input signal energy


noise signal energy

That is,
P

x[n]2
SN R = 10 log10 P
(x[n] x
[n])2

Equivalently,

SN R = 20 log10

kxk
.
kx x
k

For the standard filters above, the average SN R over a wide range of x is 49.8 dB.
Now for the optimization approach. The SN R is 58.0 dB, considerably better. The optimal
filters are order 41, IIR.

84

CHAPTER 5. MULTIRATE DSP

5.5

MPEG Audio Compression

This section was written by two students, Alp Kucukelbir and Mark Grover.
(a) MPEG stands for the Moving Picture Expert Group. The group was formed in
1988. They created standards, which are simply specications without any architectural
design. MPEG-1 is for Video CD and MP3 Audio; MPEG-2 is for Digital TV; MPEG-3
is for HDTV (not very popular); and MPEG-4 is for multimedia for xed and mobile web
applications (e.g. DRM). Each of the above standards has layers. For this section we
primarily constrain ourselves to MPEG-1 Layer II Audio Encoding.
(b) MPEG audio uses a form of lossy compression. This means that the perfect reconstruction of the encoded signal is (in general) not possible. Instead, the goal is to design
encoding schemes to reproduce the original signal to a perceivably correct level. In
MPEG audio, the assumption is that a human will listen to the decoded audio signal,
allowing the encoding scheme to exploit the many non-linearities of the human ear.
(c) The following block diagram shows the MPEG-1 Layer II audio coder in simplified form.
The input is the audio signal sampled above the Nyquist rate.

32

BPF

32

HPF

Q
Q

bitstream
formulation

LPF

coding

32

Q
psychoacoustic
model

The filter bank divides the 0-20 kHz audible spectrum into 32 equal width bins (with
some overlap). Only three are shown. This resolution of the audible spectrum is too
coarse to perform any special operations. The filters are implemented as 512 tap FIR
lters. This provides a good amount of control over the properties of the individual lters.
The filter bank is critically sampled.
The human ear is a nonlinear hearing device. In particular it exhibits, what we call,
critical bands. The human ear tends to blur sounds within a critical band. These bands
can be as thin as 100 Hz in the lower frequencies, to as high as 4 kHz in the higher
frequencies. So the main idea of the psychoacoustic model is to build a noise mask as
a function of frequency for the audio signal to be encoded. Using such a mask, the
psychoacoustic model can determine how many bits to use for quantization purposes,
which brings us to the next stage.
The quantization stage will vary with each cycle of the encoding process. Such quantizers are said to be adaptive. As the psychoacoustic model determines the optimal bit
distribution, the quantization for each sub-band will be performed at a specic resolution.
Then come the Error-Correcting Codes and general coding schemes.

85

CHAPTER 5. MULTIRATE DSP

The most important thing to realize in this stage is that the quantization bit distribution
information is also packaged along with the encoded audio samples. This is crucial, as it
allows for the blind decoding of the encoded audio signal on the decoder endi.e., the
decoder has no knowledge of the psychoacoustic model and its inner operation scheme.

5.6

Efficient Implementation

1. Consider the sample-rate changer system:

v[n]

x[n]

w[n]
H(z)

y[n]
3

Suppose the input x[n] is length N . Then the degree of X(z) equals N 1. Lets assume N is
much larger than 1, say 106 . Then the length of x[n] and the degree of X(z) are equal to the
accuracy we need. Likewise, let the impulse response of the FIR filter H(z) be of length M and
assume M is much greater than 1, say M = 103 . The length of v[n] is 2N . The convolution
of the filter is equivalent to the computation W (z) = H(z)V (z), where the degrees of V, H, W
are respectively 2N, M, 2N + M . Each coefficient of W (z) requires M multiplies and M adds,
for a total of 2M ops, and there are 2N + M coefficients. Thus the total number of ops to
compute the output is
2M (2N + M ).
Lets suppose N is much larger than M . Then the number of ops is 4M N . But we should
be able to reduce that because, for example, 2/3 of the values of w[n] are discarded by the
downsampler.
2. To reduce the number of computations, several useful facts are required. The first is that the
following two systems are input-output equivalent:

z 1

z 2

That is, if you downsample and delay by 1 time step, its equivalent to delaying by 2 time
steps and then downsampling. In terms of our matrices,
UD = DU2 .
This extends to this equivalence:

H(z)

H(z 2 )

86

CHAPTER 5. MULTIRATE DSP


3. The upsampler goes the other way:

H(z)

H(z 2 )

4. The preceding equivalences extend in the obvious ways for a general integer N .
5. This fact is sometimes handy: L and M commute if and only if M and L have no common
divisor except 1. For example, M = 2, L = 3:

6. Next, let G(z) be a z transform. Thus


X
G(z) =
g[n]z n .
n

We can write the summation as the sum over all even n plus the sum over all odd n:
X
X
G(z) =
g[2n]z 2n +
g[2n 1]z (2n1) .
n

The second term can be written


X
X
g[2n 1]z (2n1) = z
g[2n 1]z 2n .
n

Thus we have that G(z) can be written in the form


G(z) = G0 (z 2 ) + zG1 (z 2 ).
Likewise, for any positive integer N we can write
G(z) = G0 (z N ) + zG1 (z N ) + + z N 1 GN 1 (z N ).
7. Let us use these tricks in getting a more efficient implementation of the system we started
with:

H(z)

First, factor H(z) as


H(z) = H0 (z 3 ) + zH1 (z 3 ) + z 2 H2 (z 2 ).
Each of these three factors has length M/3. Now redraw the diagram:

87

CHAPTER 5. MULTIRATE DSP

H0 (z 3 )
z

H1 (z 3 )

z2

H2 (z 3 )

Manipulate to get
2

H0 (z)

H1 (z)

z2

H2 (z)

Now write z = z 2 z 3 and z 2 = z 4 z 3 and then manipulate to get


3

H0 (z)

z 1

zH1 (z)

z 2

z 2 H2 (z)

Lets pause here and count the number of ops for this system. Each of the three filters is
length M/3 and the length of their inputs is 2N/3. So for each filter the number of ops is,
again assuming N is much greater than M ,
2

M
2N

.
3
3

And since there are three filters, the number of ops for the filters is 4M N/3. Then we have to
account for the two summations, but that is a multiple of N alone. So we have reduced the
number of ops from 4M N to 4M N/3.
This can be improvedyou do it.

Chapter 6

Filter Design (frag)


This is the subject of a lab, so these notes are brief.
1. The goal is to design an FIR filter that approximates a given desired frequency response. The
desired filter is denoted hd [n] (impulse response) or Hd (ej ) (frequency response), and the
designed FIR filter is denoted h[n] and H(ej ).
The simplest idea is to truncate hd [n], that is, to take some positive integer N and set
h[n] = hd [n] for N n N , h[n] = 0 for |n| > N . This filter is certainly FIR. This
method is equivalent to setting
h[n] = wre [n]hd [n],
where wre [n] is the window function defined by wre [n] = 1 for N n N , wre [n] = 0
for |n| > N . This function is called a rectangular window function since the graph of wre [n]
versus n is rectangular (if you take the convex hull). The integer N is a design parameter,
but the only one for this window.
2. We take as an example the lowpass filter with cutoff frequency /3. Thus hd [n] is the sinc
function
hd [n] =

sin(n/3)
.
n

Here is the Scilab code:


// window filter design
// rectangular window
N=100;
n=(-N):(-1);
h_d_left=sin((%pi/3)*n)./(%pi*n);
n=1:(N);
h_d_right=sin((%pi/3)*n)./(%pi*n);
h_d=[h_d_left 1/3 h_d_right];
88

89

CHAPTER 6. FILTER DESIGN (FRAG)

clf
nump=poly(h_d,z,coeff);
den=[1 zeros(1,2*N)];
denp=poly(den,z,coeff);
H_d=freq(nump,denp,exp(%i*(0:(%pi/255):%pi)));
omega=0:255;
omega=omega*0.5/255;
plot2d(omega,20*log10(abs(H_d)))
//plot2d(omega,abs(H_d))
Here is a graph of the magnitude of the frequency response |H(ej2f )| versus the normalized
frequency f for the two cases N = 10 and N = 10. The vertical axis is on a linear scale:

|H(ej2f )|
N = 10

N = 100

f
For N = 10 the magnitude is nowhere close to being 1 on the passband. For N = 100, the
magnitude is better, somewhat flat from DC until getting near the desired cutoff frequency.
But then theres a big ripple just before /3. If N is increased further and further, there will
always be a large rippleit will get narrower, but not less in height. This oscillation in the
truncated impulse response is called Gibbs phenomenon. Interesting, isnt it. Our design has
a roadblock because of a mathematical fact.
Here are the same graphs except the vertical scale is dB:

90

CHAPTER 6. FILTER DESIGN (FRAG)

|H(ej2f )|
N = 10
N = 100

f
The way out of the mathematical obstacle is to design a better window. There is some
interesting theory that leads the way. One common window is called the Kaiser window. Here
are graphs that compare the rectangular window and the Kaiser window; both have N = 100.

|H(ej2f )|

rectangular

Kaiser

91

CHAPTER 6. FILTER DESIGN (FRAG)

|H(ej2f )|
rectangular
Kaiser

f
The Kaiser window design is much better: Theres no Gibbs oscillation, and the stopband
attenuation is 100 dB. The only characteristic that is better for the rectangular window is
that the cutoff slope is steeper.

Chapter 7

The DFT
Because every signal stored and processed in a computer must be finite in size, we need a Fourier
method to deal with finite-lenght signals, that is, vectors. The discrete Fourier transform is that
method, and consequently is of immense practical importance.

7.1

Definition of DFT

1. Youve studied the Fourier transform (FT) for continuous-time signals, x(t) defined for <
t < , and for discrete-time signals, x[n] defined for < n < . The discrete Fourier
transform (DFT) is much easier than either of these. It is a transform of a finite number of
sample values, x[n] defined for n = 0, 1, . . . , N 1. But it is based on the fundamental Fourier
idearepresenting a signal as a sum of sinusoidal signals of different frequencies.
2. Fix the integer N . The points
1, ej(2/N ) , ej(2/N )2 , . . . , ej(2/N )(N 1)
lie in the complex plane and go around the unit circle counter-clockwise starting at the point
1 on the positive real axis. They define a length-N signal w1 [n], i.e.,
w1 [n] = ej(2/N )n ,

n = 0, 1, . . . , N 1.

We say this signal is a sinusoid of frequency 2/N radians, since w1 [n] = ejn , = 2/N .
Likewise
w2 [n] = ej(4/N )n ,

n = 0, 1, . . . , N 1

is a sinusoid of twice the frequency, 4/N radians.. Continuing in this way we get N sinusoids,
wk [n] = ej(2k/N )n , k = 0, 1, . . . , N 1
of frequency 2k/N . The case k = 0 is the DC signal of 1, 1, . . . , 1.
Since these sinusoids each have N components, we can think of them as vectors in the complex
Euclidean space CN . For complex vectors there is an inner product
hv, yi =

N
1
X

v[n]y[n],

n=0

92

93

CHAPTER 7. THE DFT


and the sinusoids w0 , . . . , wN 1 are orthogonal with respect to this inner product:
hwk , wm i = 0, k 6= m; hwk , wk i = N.

Since they are orthogonal, they form a basis and every vector in CN can be written uniquely
as a linear combination of these N sinusoids.
3. Now we turn to a signal x[n] defined for n = 0, 1, . . . , N 1 and that is real-valued because
thats almost always the case
Pin applications. As we just saw, x[n] can be written as an
orthogonal expansion x[n] = k ck wk [n]. Take inner products of both sides with wm [n]:
hx, wm i = N cm .

This gives the coefficients to be


ck =

1
hx, wk i.
N

Finally, define X[k] to be hx, wk i. So the interpretation of X[k] is that it equals the component
of x in the direction of the sinusoid wk .
4. To recap, we have the equations
X
X[k] =
x[n]wk [n]
n

x[n] =

1 X
X[k]wk [n].
N
k

The complex numbers X[k] are the DFT of x[n]. Putting in the explicit sinusoids gives
X
X[k] =
x[n]ej2kn/N
n

x[n] =

1 X
X[k]ej2kn/N .
N
k

Note that these formulas are similar to the discrete-time Fourier transform (DTFT) formulas:
X
X(ej ) =
x[n]ejn
n

x[n] =

1
2

X(ej )ejn d.

In DTFT the frequency variable takes on a continuum of values, all those in the interval
from to ; in DFT there are only N frequencies, 0, 2/N, 4/N, . . . , 2(N 1)/N .
5. As an example, take N = 4. It is convenient to write a signal x[n] as an ordered list of the
values x[0], x[1], x[2], x[3], like this: x = (1, 2, 3, 4). The four sinusoids are
w0 = (1, 1, 1, 1), w1 = (1, j, 1, j), w2 = (1, 1, 1, 1), w3 = (1, j, 1, j).

CHAPTER 7. THE DFT

94

Then, the DFT of x = (1, 2, 3, 4) is


X[0] = 10, X[1] = 2 + 2j, X[2] = 2, X[3] = 2 2j.
For example,
X[0] =

3
X

x[n]w0 [n] = 10

3
X

x[n]w1 [n] = 1 2j 3 + 4j = 2 + 2j,

n]0

X[1] =

n]0

and so on.
6. It is customary to define the complex number W = ej2/N and then to write the equations
in this form:
X
X[k] =
x[n]W kn
(7.1)
n

x[n] =

1 X
kn
X[k]W .
N

(7.2)

Equation (7.1) is called the analysis equation, defining the sinusoidal components of x[n],
and (7.2) the synthesis equation, putting x[n] back together again from its sinusoidal components.
7. Let us continue with a finite duration signal x[n], n = 0, . . . , N 1, but let us extend this
signal by saying it equals 0 for n < 0 and n > N 1. Then it is defined for all n, < n < .
Its z transform, X(z), is a polynomial in z 1 :
X(z) = x[0] + x[1]z 1 + + x[N 1]z (N 1) .
At this point it is more convenient to regard z 1 as the independent variable instead of z.
So let us define the variable = 1/z. Even though we changed the independent variable, its
convenient to use the same symbol for the transform:
X() = x[0] + x[1] + + x[N 1]N 1 .
If we substitute W for , we see that X(W ) equals the DFT coefficient X[1]. More generally
X[k] = X()|=W k .
Thus we have an important conclusion: The DFT equals the z transform (or transform)
sampled in the frequency domain. Note that the points = W 0 , W 1 , W 2 , . . . , W N 1 are
uniformly spaced on the unit circle.
8. As an example, let N = 3 and x = (1, 1, 3). Then the transform is X() = 1 + 32 ,
and the DFT is
X = (1 1 + 3, 1 W + 3W 2 , 1 W 2 + 3W 4 ).

95

CHAPTER 7. THE DFT

7.2

Circular Convolution

1. You know that for discrete-time signals, convolution in the time domain corresponds to multiplication of the DTFTs, or of the transforms. That is,
y[n] = h[n] x[n] Y () = H()X().
It seems natural to consider the product of DFTs too. So let h[n] and x[n] be two finiteduration signals, of the same length N . Let y[n] denote the signal whose DFT equals the
pointwise product of the DFTs of h[n] and x[n], i.e., Y [k] = H[k]X[k]. We say y[n] is the
N x[n]. Thus
circular convolution of h[n] and x[n] and the notation is y[n] = h[n]
N x[n] Y [k] = H[k]X[k].
y[n] = h[n]

In this section we study this operation.


2. We first look in the frequency domain, that is, in terms of the transforms of h[n] and x[n].
As an example, let N = 4 and
h = (2, 1, 1, 3), x = (0, 0, 1, 0).
The transforms are
H() = 2 + 2 + 33 , X() = 2 .
The signal y[n], which we dont yet know, also has a transform, Y (), and this too is a
polynomial of degree 3. Let us try to find Y (); then we will be able immediately to write
down y[n] from the coefficients of Y ().
As we have seen before, the DFT Y [k] is related to the transform Y () via
Y [k] = Y (W k ),

k = 0, 1, 2, 3

where
W = ej2/4 .
Likewise for H and X. Therefore from the equation Y [k] = H[k]X[k] we have
Y (W k ) = H(W k )X(W k ),

k = 0, . . . , 3.

This is equivalent to saying that W k are roots of the polynomial H()X() Y (). Notice
that W k , k = 0, 1, 2, 3, are the four roots of the polynomial D() = 1 4 . Thus it follows
that the polynomial D() is a factor of the polynomial H()X() Y (). This is equivalent
to saying there exists a polynomial Q() satisfying
H()X() Y () = D()Q().
This yields the key equation
Y ()
H()X()
=
+ Q().
D()
D()

96

CHAPTER 7. THE DFT

This says that, when HX is divided by D, the remainder is Y and the quotient is Q. In our
example we have
1 + 3 + 22 3
(2 + 2 + 33 )2
=
+ (1 3).
1 4
1 4
Thus
Y () = 1 + 3 + 22 3 .
Reading off the coefficients gives the solution to our problem:
y = (1, 3, 2, 1).
3. The preceding example generalizes. The conclusion is this:
N x[n] Y () = the remainder of the division
y[n] = h[n]

H()X()
.
1 N

4. It takes a little more work to see the time-domain operation that produces y[n]. Lets continue
working through the example. We had N = 4 and
h = (2, 1, 1, 3), x = (0, 0, 1, 0).
We also had the key equation
H()X()
Y ()
=
+ Q(),
4
1
1 4
which can be written
H()

X()
Y ()
=
+ Q().
4
1
1 4

This provokes the question, What signal has the transform X()/(1 4 )? The answer is,
The periodic extension of x[n].
5. Let us derive this fact. Continue x[n] forward in time to get the periodic signal x
[n], of period
N = 4:
x
= (0, 0, 1, 0, 0, 0, 1, 0, . . . ).
Write the definition of the transform of x
[n]:

X()
= x[0] + + x[3]3 + x[0]4 + .
The first 4 coefficients are repeated, so



X()
= x[0] + + x[3]3 + 4 x[0] + + x[3]3 + .

Use the definition of the transform of x[n]:

X()
= X() + 4 X() + 8 X() + .

97

CHAPTER 7. THE DFT


Simplify:

X()
= (1 + 4 + 8 + )X().
And again:

X()
=

1
X().
1 4

This is what we wanted to show.


6. Now let us return to our key equation:
H()

X()
Y ()
=
+ Q().
D()
D()

Write it as

H()X()
= Y () + Q().
Conversion to the time domain gives
h[n] x
[n] = y[n] + q[n].
In this example,
h[n] x
[n] = (0, 0, 2, 1, 1, 3, 2, 1, 1, 3, . . . )

= (1, 3, 2, 1, 1, 3, 2, 1, . . . ) + (1, 3, 0, 0, . . . ).

The first term is the periodic part, y[n], the second the transient part, q[n]. Taking the first
period of y[n] gives
y = (1, 3, 2, 1),
and were done.
7. The conclusion is that circular convolution is related to ordinary convolution like this:
N x[n] y
y[n] = h[n]
[n] = h[n] x
[n] + a transient.

8. What we are doing equivalently is putting the periodic input x


[n] into the LTI system with
impulse response h[n]; in fact this system is FIR. The input is applied starting at time n = 0,
and this is the reason for the transient, q[n]. If we were to apply the periodic input starting
at time n = , then there would be no transient and we would have simply
y[n] = h[n] x
[n].
In this case the original x[n] is periodically extended both forward and backward in time, y[n]
is simply the first period of the periodic signal h[n] x
[n], and therefore
y[n] =

m=

h[m]
x[n m],

0nN 1

98

CHAPTER 7. THE DFT


or, since h[m] = 0 outside [0, N 1],
y[n] =

N
1
X
m=0

h[m]
x[n m],

0 n N 1.

(7.3)

N x[n] and the operation is called circular convolution.


In the text this is written y[n] = h[n]
The text also uses the formula
N
1
X
y[n] =
h[m]x[((n m))N ], 0 n N 1,

m=0

but lets use (7.3).

x[n]
Finally, we could alternatively have convolved h[n]
and x[n]. That h[n] x
[n] = h[n]
follows immediately from
HX
= H X.

= HX
D

7.3

Vectors and Matrices

1. To derive the DFT in vector-matrix form, we can regard x[n] as a vector with components
x[0], . . . , x[N 1]. We can write x either as an N -tuple or as a column vector:

x[0]

..
x = (x[0], . . . , x[N 1]) or x =
.
.
x[N 1]

Equations (7.1), (7.2) are less than transparent because of the summations and indices. Let
us define the N N DFT matrix, F . Its first row is
W kn , k = 0, n = 0, 1, . . . , N 1.

Its second row is


W kn , k = 1, n = 0, 1, . . . , N 1.
And so on. For example, for N = 2


1 1
, W = 1
F =
1 W
and for N = 3

1 1
1
F = 1 W W 2 , W = ej2/3 .
1 W2 W4

Then defining the vector X = (X[0], . . . , X[N 1]), we have simply


X = F x, x = F 1 X.

These are respectively the analysis and synthesis DFT equations. Isnt that much more
concise?
2. You can check that the inverse F 1 equals (1/N )F , where F equals the complex conjugate
transpose of F .

99

CHAPTER 7. THE DFT

7.4

Circular Convolution via the Circulant Matrix

1. Let us review the definition of circular convolution in the form of an algorithm:


Given h, x;
Compute H = F h, X = F x.
Multiply H, X pointwise to get Y .
Compute y = F 1 Y .
Pointwise multiplication of H and X is equivalent to multiplying the vector X by the diagonal
matrix D whose diagonal elements are H[0], H[1], . . . , H[N 1]:

H[0]
0
0

H[1]

D = diag(H[0], H[1], . . . , H[N 1]) =


.
..

.
Thus the algorithm can be written

H[N 1]

Given h, x;
H = F h, X = F x
D = diag(H[0], H[1], . . . , H[N 1])
Y = DX
y = F 1 Y
Thus
y = F 1 DF x.
The matrix C = F 1 DF
show that

h[0] h[2]
C = h[1] h[0]
h[2] h[1]

is called the circulant matrix generated by h. For N = 3 you can

h[1]
h[2] .
h[0]

And for general N it can be proved that the first column of C equals h[0], h[1], . . . , h[N 1];
the second column is obtained from the first by putting the last element first and shifting all
the others down one; and so on.
N x[n] is equivalent to the vector
2. In summary, the circular convolution equation y[n] = h[n]
equation y = Cx, where C is the circulant matrix generated by h.

3. As an example, we had N = 4 and


Then

h = (2, 1, 1, 3), x = (0, 0, 1, 0).

2
3
1 1
1

2
3
1
, x =
C=
1 1

2
3
3
1 1
2

0
1
3
0
, y = Cx =
2
1
0
1

100

CHAPTER 7. THE DFT

7.5

Ordinary Convolution via DFT

1. Consider a causal LTI system with impulse response h[n]. Suppose x[n] is an input and y[n]
an output, both starting at n = 0. Then of course they satisfy the convolution equation
y[n] = h[n] x[n].
Let us look at only the first N components of the input and output:
y[0] = h[0]x[0]
y[1] = h[1]x[0] + h[0]x[1]
etc.
y[N 1] = h[N 1]x[0] + + h[0]x[N 1].
Bringing in vector notation again, we have
y = T x,
where T is the associated N N matrix

h[0]
0 0
0

h[1]
h[0]
0

T =
..

.
h[N 1]

h[0]

A matrix like this that is constant along the diagonals is called a Toeplitz matrix.
In summary, for the convolution equation y[n] = h[n] x[n], the first N components of x[n]
and y[n] are related by the vector equation y = T x, where T is the Toeplitz matrix.
2. As an example, consider the
related by

1 0
y[0]
1 1
y[1]

y[2] = 1 0 1

2
0 0
y[3]
0 0
y[4]

filter H(z) = (1 + z 1 )/2. The first 5 inputs and outputs are


0
0
1
1
0

0
0
0
1
1

0
0
0
0
1

x[0]
x[1]
x[2]
x[3]
x[4]

3. Now we shall see how to convert ordinary convolution, i.e., multiplication by the Toeplitz
matrix generated by h, into circular convolution, i.e., multiplication by the circulant matrix
generated by h. The trick is to append h and x with zeros.
Suppose x[n] has length 3 and h[n] has length 4. The goal is to compute their ordinary
convolution, y[n], of length 6. This is the same as computing the product of the transforms: Y () = H()X(). Define the vector x = (x[0], x[1], x[2]) and similarly for h, y. The
dimensions of the vectors x, h, y are 3, 4, 6. In general
degree Y () = degree H() + degree X(),

101

CHAPTER 7. THE DFT


and so
dim(y) 1 = dim(x) 1 + dim(h) 1,
that is,
dim(y) = dim(x) + dim(h) 1.

Append x and h with zeros so they are of dimension 6continue to call the vectors x and
h. In general, dim(x) 1 zeros must be appended to h and dim(h) 1 to x. Computing
the ordinary convolution is equivalent to the multiplication y = T x, where T is the Toeplitz
matrix generated by h:

y[0]
x[0]
h[0] 0
0
0
0
0
y[1]
x[1]
h[1] h[0] 0
0
0
0

y[2]
x[2]
h[2] h[1] h[0] 0
0
0

.
y = T x, y =
, x = 0 , T = h[3] h[2] h[1] h[0] 0
0
y[3]

y[4]
0
0 h[3] h[2] h[1] h[0] 0
y[5]
0
0
0 h[3] h[2] h[1] h[0]
Now let C denote the circulant

h[0] 0
0
h[1] h[0] 0

h[2] h[1] h[0]


C=
h[3] h[2] h[1]

0 h[3] h[2]
0
0 h[3]

matrix generated by h:

h[3] h[2] h[1]


0 h[3] h[2]

0
0 h[3]
.
h[0] 0
0

h[1] h[0] 0
h[2] h[1] h[0]

Because T and C have the same first three columns and the last three rows of x are zero, it
follows that T x = Cx. Thus,
y = T x = Cx.
Having converted the computation y = T x to the computation y = Cx, we can now use the
fast DFT computation: y = F 1 DF x. Note that x is the original vector appended with 3
zeros; F is the 6-point DFT matrix; and D is the diagonal of the 6-point DFT of the original
h appended with 2 zeros.
4. Lets summarize. Heres the fast algorithm to compute ordinary convolution:
Given h, x;
Append h with dim(x) 1 zeros and x with dim(h) 1 zeros.

Compute their DFTs H = F h, X = F x.

Form D = diag(H[0], H[1], . . . , H[N 1]).


Compute Y = DX

Compute y = F 1 Y .

102

CHAPTER 7. THE DFT

7.6

Digital Filter Implementation via DFT

In Section 5.2 we saw how to implement a digital filter by breaking the input into segments or
blocks. Its natural to speed this up by doing the block processing by the DFT method of the
preceding section. We illustrate by the example in Section 5.2.
1. We do an example. The filter is H(z) = 2 z 1 , of order 2, and we choose input blocks of
length 3. The block diagram was

x[n]

y[n]

z 1

z 2

z 3

z
3
0

where T is the Toeplitz matrix

h[0] 0
0
0
h[1] h[0] 0
0
T =
0 h[1] h[0] 0
0
0 h[1] h[0]

2
0
0
1
2
0
=
0 1
2
0
0 1

0
0
.
0
2

Since the input blocks are padded with one zero, we can replace T by the circulant matrix

2
0
0 1
1
2
0
0
.
C=
0 1
2
0
0
0 1
2

Then we can replace C by F 1 DF , where F is the 4 4 DFT matrix and D is the diagonal
matrix
diag(2, 1, 0, 0).

2. Whether youll actually get a speedup depends on the hardware and software. Heres a graph
taken from the book Digital Signal Processing: A Practical Guide for Engineers and Scientists,
by S. Smith.

CHAPTER 7. THE DFT

103

That text offers the following advice: The important idea to remember: filters of order less
than 60 can be implemented faster with standard convolution, and the execution time is proportional to the order. Higher order filters can be implemented faster with FFT convolution.
With FFT convolution, the order can be made as large as you like, with very little penalty
in execution time. For instance, a 16,000 order filtler only requires about twice as long to
execute as a 64 order.
The speed of the convolution also dictates the precision of the calculation. This is because
the round-off error in the output signal depends on the total number of calculations, which is
directly proportional to the computation time. If the output signal is calculated faster, it will
also be calculated more precisely. For instance, imagine convolving a signal with a 1000 order
filter, with single precision floating point. Using standard convolution, the typical round-off
noise can be expected to be about 1 part in 20,000. In comparison, FFT convolution can be
expected to be an order of magnitude faster, and an order of magnitude more precise (i.e., 1
part in 200,000).
Keep FFT convolution tucked away for when you have a large amount of data to process and
need an extremely high order filter. Think in terms of a million sample signal and a thousand
order filter. Anything less wont justify the extra programming effort.
I myself asked around and was unable to find a commercial digital filter implemented via the
DFT method. So I think theyre rare. The reason seems to be that high-order filters are
required mostly for low speed applications, like voice processing. So speedup isnt required.

7.7

Summary

(a) The DFT is the Fourier representation of a finite-length signal. The DFT of x[n] is
denoted X[k].
(b) Multiplication of two DFTs is called, in the time domain, circular convolution. The
N x[n]:
notation is y[n] = h[n]
N x[n] Y [k] = H[k]X[k].
y[n] = h[n]

(c) Via the FFT, there is a fast algorithm for circular convolution:
Given h[n], x[n];
Compute their DFTs, H[k], X[k].
Multiply the DFTs to get Y [k] = H[k]X[k].

CHAPTER 7. THE DFT

104

Compute the inverse DFT to get y[n].


(d) If we represent length-N discrete-time signals x[n] as vectors x = (x[0], x[1], . . . , x[N 1]),
then there is a matrix F , called the DFT matrix, such that the DFT vector can be
computed via X = F x.
(e) Furthermore, linear systems can be represented by matrices; ordinary convolution y[n] =
h[n]x[n] is equivalent to y = T x, where T is the Toeplitz matrix generated by h; circular
N x[n] is equivalent to y = Cx, where C is the circulant matrix
convolution y[n] = h[n]
generated by h.
(f) Ordinary convolution can be converted to circular convolution by padding vectors with
zeros. Therefore, there is a fast algorithm for ordinary convolution:
Given h, x;
Append dim(x) 1 zeros to h and dim(h) 1 to x.
Compute their DFTs, H = F h, X = F x.
Multiply the DFTs to get Y .
Compute the inverse DFT to get y = F 1 Y .
(g) This fast algorithm extends to the implementation of FIR filters.

Chapter 8

The FFT
8.1

The Algorithm

1. FFT refers to the fast Fourier transform algorithm to compute the DFT. The FFT algorithm as currently used was presented in a famous paper by Cooley and Tukey in 1965,
entitled An algorithm for the machine calculation of complex Fourier series. Its a very
simple idea that uses the fact that the complex number W = ej2/N satisfies W N = 1, and
so the sequence {W n } is periodic with period N .
2. Recall that the vector-matrix form of the DFT is
X = F x,
where x is an N -dimensional column vector of sample values,

x[0]

..
x=
,
.
x[N 1]

X is the N -dimensional column vector of DFT coefficients, and F is the N N DFT matrix.
If we number the rows and columns of F from 0 to N 1, then the element in row k, column
n is W kn , where W is the complex number ej2/N . For example, for N = 2


1 1
F =
, W = 1
1 W
and for N = 3

1 1
1
F = 1 W W 2 , W = ej2/3 .
1 W2 W4

3. The DFT computational problem is, given the vector x, to compute the vector X = F x.
Whats the complexity of performing the multiplication F x when F is N N ? Computing
the first component of F x, namely X[0], is taking the dot product of the first row of F with
x; this has N multiplies followed by N 1 adds, for a total of 2N 1 arithmetic operations.
105

CHAPTER 8. THE FFT

106

Doing this for all N rows therefore requires N (2N 1) arithmetic operations. We say the
complexity is N (2N 1). This complexity is for the naive procedure where we dont take
advantage of the special structure of F . The FFT algorithm does take advantage, as we shall
see.
4. As an example, let

1
1
Fx =
1
1

us begin with the case N = 4. We have

1
1
1
x[0]

W W2 W3
x[1] , W = ej2/4 .
2
4
6
W W W x[2]
W3 W6 W9
x[3]

It is convenient to group together the components of x at the even times and the odd times,
that is, {x[0], x[2]} and {x[1], x[3]}. To keep the product to be the same, we must interchange
columns 2 and 3 of the matrix:

1 1
1
1
x[0]
1 W 2 W W 3 x[2]

Fx =
1 W 4 W 2 W 6 x[1] .
1 W6 W3 W9
x[3]

Now let us partition the matrix into 2 2 blocks, and the vector into two subvectors, like this:

1
1
x[0]
1 1
1 W 2 W W 3 x[2]

Fx =
(8.1)
1 W 4 W 2 W 6 x[1] .
1 W6 W3 W9
x[3]
The matrix has four blocks. The (1, 1)-block


1 1
1 W2

is exactly the DFT matrix for N = 2. This is because the square of the 4-point complex
number W equals the 2-point complex number W :


ej2/4

2

= ej2/2 .

Let us denote the 2-point DFT matrix by F2 . Then the (1, 2)-block in (8.1) is D2 F2 , where
D2 is the diagonal matrix


1 0
D2 =
.
0 W
Since W 4 = 1, the (2, 1)-block in (8.1) is

 

1 W4
1 1
=
= F2 .
1 W6
1 W2

107

CHAPTER 8. THE FFT


Finally, the (2, 2)-block is

 



W2 W6
W2 0
1 0
1 1
=
= D2 F2 ,
W3 W9
0 W2
0 W
1 W2
since W 2 = 1. Thus (8.1) becomes


 x[0]
x[2]
F2 D2 F2

Fx =
F2 D2 F2 x[1]
x[3]

Or, defining the even and odd parts of x,






x[0]
x[1]
xe =
, xo =
.
x[2]
x[3]
we have
Fx =

F2 D2 F2
F2 D2 F2



xe
xo

F2 xe + D2 F2 xo
F2 xe D2 F2 xo

5. Lets summarize the formula for the case of a general even number N . Denote the N -point
complex number W by WN and the N -point DFT matrix by FN . Define the even and odd
parts of x:
xe = (x[0], x[2], . . . ), xo = (x[1], x[3], . . . ).
Also, define the diagonal matrix
DN/2 = diag(1, WN , WN2 , . . . ).
Then
FN x =

FN/2 xe + DN/2 FN/2 xo


FN/2 xe DN/2 FN/2 xo

(8.2)

6. Equation (8.2) suggests a recursive procedure for performing the computation FN x when
N is a power of 2, N = 2M . The procedure is this:
(a) Form xe and xo of dimension N/2.
(b) Do the two N/2-point DFTs FN/2 xe , FN/2 xo .
(c) Form the matrix DN/2 .
(d) Compute DN/2 FN/2 xo and DN/2 FN/2 xe .
(e) Add FN/2 xe + DN/2 FN/2 xo and subtract FN/2 xe DN/2 FN/2 xo .
(f) Stack up the latter two vectors to get X.

CHAPTER 8. THE FFT

8.2

108

Complexity

1. Let us study the complexity of this recursive procedure. Let c[N ] denote the number of
arithmetic operations to perform this procedure. We assume Step 1 requires no operations.
Step 2 requires c[N/2] for each N/2-point DFT, for a total of 2c[N/2]. We assume Step
3 requires no operations. Since DN/2 is diagonal, the product DN/2 FN/2 xo requires N/2
multiplies; the same for DN/2 FN/2 xe , for a total of N for Step 4. Finally, Step 5 requires N/2
adds and N/2 subtracts, for a total of N operations. Summing them all up, we get
c[N ] = 2c[N/2] + 2N.
Equivalently,
 


c 2M = 2c 2M 1 + 2M +1 .

This is a recursive equation for the function c. The initial condition for the equation is when
M = 1. Then, N = 2 and the complexity is given by the formula N (2N 1), which for N = 2
equals 6. The leads to the initial value problem
 


c 2M = 2c 2M 1 + 2M +1 , c[2] = 6.
This can be solved by z transforms and the solution for M 1 is
 
c 2M = 2M (2M + 1),

or, in terms of N ,

c[N ] = N (2 log2 (N ) + 1).


2. In summary, the direct procedure to compute X = F x requires N (2N 1) arithmetic operations, while the FFT algorithm requires N (2 log2 (N ) + 1). This table compares these two
functions:
N N (2N 1) N (2 log2 (N ) + 1)
2
6
6
4
28
20
8
120
56
16
496
144
32
2016
352
64
8128
832
32640
1920
128
256
130816
4352
The advantage of the FFT algorithm is obvious.

Chapter 9

Spectral Analysis (frag)


Spectral analysis is the process of analyzing a signal based on its frequency spectrum, that is, what
sinusoids it contains and how dominant are they. For example, we might be looking for abnormal
rhythms in the heart rate. This amounts to determining the Fourier transform of a signal.

9.1

Using the DFT

1. Of course in practice we have only a finite number of samples of the signal to be analyzed.
Heres the setup:

w[n]
sc (t)

LPF

xc (t)

C/D

x[n]

DFT

V [k]

v[n]

The signal to be analyzed is sc (t). It is lowpass filtered before sampling to avoid aliasing. The
sampled signal is x[n]. Taking a finite number of samples is equivalent to multiplying it by a
finite-duration window function, w[n]. The window function could be rectangular, but need
not be. So v[n] consists of the finite number of samples. The DFT produces V [k], which is
then analyzed in a computer. The problem is to estimate Sc (j) given V [k]. Assuming the
anti-aliasing filter is well designed, equivalently we want to estimate Xc (j).
2. Assuming no aliasing, we have
X(ejT ) =

1
Xc (j),
T

||

,
T

or equivalently
X(ej ) =

1
Xc (j/T ),
T

|| .

Therefore there is a one-to-one correspondence between X(ej ) and Xc (j) over the frequency
range of interest. Therefore the problem reduces to estimating X(ej ) given V [k].
109

CHAPTER 9. SPECTRAL ANALYSIS (FRAG)

110

3. Let us assume that the length of w[n] equals the number N of points in the DFT: No zeros
are padded to v[n] nor are any values of v[n] discarded. Then the mapping from X(ej ) to
V [k] depends on N and the window function w[n].
Example: w[n]

Appendix A

Proof of the Sampling Theorem using


Fourier Series
This section is include for your possible interest. You are not required to know it.
1. Claude Shannon was without question a genius, one of the great engineers of the 20th century.
He invented the subject called Information Theory, which tells us what the capacity of a
communication channel is. Shannon worked for many years at Bell Labs, where he was known
as a unicycle rider and juggler.
2. Lets state Shannons theorem like this: If xc (t) BN , then xc (t) is uniquely determined by
the sampled values xc (nT ).
3. To simplify notation, let us suppose T = 1, so that N = . Let us see how to construct the
Fourier transform Xc (j) from the sample values xc (n). By the inverse Fourier transform,
Z
1
xc (t) =
X(j)ejt d.
2
Therefore the sampled values satisfy
Z
1
x[n] = xc (n) =
X(j)ejn d.
2
This has the form of a dot product
x[n] = hX, Wn i,

(A.1)

where
Wn (j) =

ejn || <
0,
|| > .

These functions are orthogonal in the dot product: If n 6= m,


Z
1
hWm , Wn i =
ejm ejn d = 0.
2
111

APPENDIX A. PROOF OF THE SAMPLING THEOREM USING FOURIER SERIES

112

Also, they have unit length in the sense that


hWm , Wn i = 1.
Therefore this family, {Wn }, forms what is called an orthonormal set. It turns out (we wont
prove this) that the family is a basis in the sense that every transform bandlimited to can
be written as a linear combination of these basis functions:
X
Xc =
cm Wm .
m

This is actually a Fourier series expansion. Take the dot product of both sides with Wn :
X
hXc , Wn i = h
cm Wm , Wn i.
m

The left-hand side equals x[n] from (A.1). Use linearity of the inner product on the right-hand
side:
X
x[n] =
cm hWm , Wn i.
m

But the right-hand side equals cn because the functions Wn are orthonormal. Thus the
expansion coefficients equal the sample values, and we therefore have
X
Xc =
x[n]Wn .
n

So in the time domain we have


X
xc (t) =
x[n]wn (t),
n

where wn (t) is the inverse transform of Wn (j), which equals


wn (t) =

sin((t n))
.
(t n)

In summary, the sinc functions {wn } form an orthonormal basis in the time domain for signals
bandlimited to , and the sample values x[n] are the coefficients of xc (t) in the Fourier series
expansion of xc (t) in this basis.

Appendix B

The Fourier Chart


So now you have seen a lot of Fourier stuff: F series, discrete-time F transform, continuous-time F
transform, discrete F transform. Heres a summary of three Fourier transform formulas:

Analysis
xc (t)

Xc (j)

CTFT

tR
Xc (j) =

Synthesis
Xc (j)

R
xc (t)e

jt

1
xc (t) =
2

dt

X(ej )

x[n]
nZ

DTFT

X(e ) =

X[k]

X[k] =

N
1
!

X[k]

k {0, . . . , N 1}

x[n]e

Xc (j)ejt d

x[n]

1
x[n] =
2

jn

DFT
n {0, . . . , N 1}

xc (t)

DTIFT

x[n]

X(ej )

(, )
x[n]e

CTIFT

X(ej )ejn d

x[n]
IDFT

N 1
1 !
x[n] =
X[k]ej(2n/N )k
N

j(2k/N )n

n=0

k=0

The acronyms are easy to figure out: CTFT is continuous-time Fourier transform, etc. The
arrow convention is that a continuous arrow signifies that the independent variable t, n, , ranges
over a continuum of values; a dotted arrow signifies that the independent variable ranges over a
discrete set of values.1
The set R of all real numbers is a continuum. Likewise, the interval (, ) is a continuum, since there is a
one-to-one correspondence between (, ) and R. (Think of a function that provides this correspondence.) On the
other hand, the set of integers Z is not a continuumthere is no one-to-one correspondence between Z and R. The
1

113

APPENDIX B. THE FOURIER CHART

114

integers can be counted, whereas the real numbers cannot. We say that the set Z is discrete. Also, a finite set such
as {0, 1, 2, . . . , N 1} is said to be discrete. These ideas are related to the mathematical concept of cardinality.

Review Problems
1. A signal x[n] is upsampled by 3 and then passed through H(z) = 1 2z, producing y[n]. Find
the matrix H from x to y.
2. Consider the discrete-time sinusoid x[n] = ej2n/3 . Suppose it is downsampled by the integer
2, producing y[n] = ej4n/3 . Draw the Fourier transforms of x and y, and verify the formula
1
1
Y (ej ) = X(ej/2 ) + X(ej((/2)) ).
2
2
Repeat for x[n] = ejn/3 .
3. (a) A signal x[n] is upsampled by 2, then downsampled by 2, producing the signal y[n]. Is
the system from x[n] to y[n] LTI? If so, find its transfer function.
(b) A signal x[n] is downsampled by 2, then upsampled by 2, producing the signal y[n]. Is
the system from x[n] to y[n] LTI? If so, find its transfer function.
(c) A signal x[n] is upsampled by 2, then passed through the system with transfer function
1 1/z, then downsampled by 2, producing the signal y[n]. Is the system from x[n] to
y[n] LTI? If so, find its transfer function.
4. This problem uses handy notation from the book by Proakis and Manolakis. A discrete-time
signal is written as an ordered list, with an uparrow pointing to the value at time n = 0.
(a) Find the z transform of
x[n] = (. . . , 1, 1, 1, 1, 1, 1, 0, 0, 0, . . . ).

What are its poles and zeros?


(b) Repeat for
x[n] = (. . . , 1, 1, 1, 1, 1, 1, 0, 0, 0, . . . ).

5. Consider the system with transfer function


H(z) =

1 z3
.
1 z4

For all possible regions of convergence of H(z), find the impulse response h[n] and say if the
system is stable.
115

116

APPENDIX B. THE FOURIER CHART


6. Derive the z transform and ROC of
n
2 , n even and negative
0, n odd and negative
x[n] =

0,
n 0.

7. From the function X(z) you derived in the previous problem, derive x[2] and x[3] by the
inversion formula. (You should of course get back to the original x[n].)

8. Consider a causal FIR filter with impulse response h[n].


the input is white noise with
P Assume
variance 1. Show that the variance of the output is
h[n]2 .
9. The sinusoid xc (t) = ej2100t is sampled at 80 Hz. Draw the Fourier transform of the sampled
signal.

10. A signal x[n] is passed through D/C, then through the system with transfer function esT /2 ,
then through C/D, producing the signal y[n]. Is the system from x[n] to y[n] LTI? If so, find
its transfer function.
11. The signal xc (t) is sampled at 10 kHz, producing x[n]. Sketch the Fourier transform of x[n] if
the Fourier transform of xc (t) looks like this:
X(j)
1

2(104 )

2(104 )

12. Find Y (z) in terms of X(z):

x[n]

z 1

y[n]

13. The following system arises from a modulator with a quantizer:

e[n]
x[n]

y[n]
H(z)

117

APPENDIX B. THE FOURIER CHART

The filter is causal with transfer function H(z) = 1/(z 1). Find the difference equation for
y[n], with both x[n] and e[n] as inputs.
14. Continuing, now suppose the filter is causal with transfer function H(z) = 3/(2z 1), the
input x[n] is zero, and e[n] is white noise with zero mean and unit variance. Find the variance
of y[n].
15. Derive how to go from
xc (t)

C/D

ZOH

y[n]
1 yc (t)
C/D
RCs

v[n]

ZOH

vc (t)

to

xc (t)

C/D

x[n]

k y[n]
Q
z1

v[n]

k = T /RC
16. Derive the relationship between the power spectra of the input and output for L and M .
17. What is the DFT of (1, 0, 1, 0, 1, 0)?
18. Suppose x[n] and X[k] are a DFT pair, and x[n] is real-valued. Define Y [k] to be X[k], the
complex conjugate of X[k]. How is y[n] related to x[n]? That is, taking complex conjugate in
the frequency domain corresponds to what operation in the time domain?
19. Suppose x[n] is defined for n = 0, 1, 2; let x denote the column vector with components
x[0], x[1], x[2]. Append (pad) x[n] with three zeros to obtain y[n], defined for n = 0, 1, . . . , 5.
Let Y [k] be the 6-point DFT of y[n] and let Y denote the column vector with components
Y [0], . . . , Y [5]. Find the matrix G such that Y = Gx.
20. The convolution of two finite-duration signals can be computed using the DFT. Using this
method, find y[n] = h[n] x[n] where H(z) = 2, X(z) = 1 2/z.
21. Find the circular convolution of
h[n] = (1, 2, 2),

x[n] = (2, 0, 1)

by these two methods: 1) dividing H()X() by a certain polynomial; 2) multiplying the


vector x = (x[0], x[1], x[2]) by a circulant matrix.

APPENDIX B. THE FOURIER CHART

118

22. Consider a finite length signal x[n], n = 0, . . . , 19. Let X[k] denote its DFT. Develop a way
to compute X[16] by computing four 5-point DFTs.
23. Suppose x[n] is an 8-sample vector of the form
x[n] = (a, b, a, b, a, b, a, b),
that is, it is periodic of period 2. What is the general form of its DFT X[k]?
24. Consider two finite-duration signals, x[n] of length 3 and h[n] of length 4. The goal is to
compute their linear convolution, y[n], of length 6. Define the vector x = (x[0], x[1], x[2])
and similarly for h, y. The dimensions of the vectors x, h, y are 3, 4, 6. Pad x and h with
zeros so they are of dimension 6continue to call the vectors x and h. Computing the
linear convolution is equivalent to the multiplication y = T x, where T is the Toeplitz matrix
generated by h. Now let C denote the circulant matrix generated by h. Prove that T x = Cx.
From this, show that y = T x can be computed like this:
y = F 1 DF x.
What is the matrix D?
25. Let P denote the circulant matrix with first row (0, 1, 0). Show that if C is a 3 3 matrix, it
is circulant if and only if C and P commute.
26. Text, Problem 8.15, but use this method: Let C denote the circulant matrix generated by
x1 . Then y = Cx2 . If C is invertible, then of course the whole vector x2 (not just a) can be
determined from y. Is C invertible?
27. Find the finite-duration signal y[n] such that
Y (z) = H(z)X(z), H(z) = z 1 2z 2 , X(z) = 2 + 2z 1 + 3z 2 ,
by converting the problem to the form y = Cx, where C is a circulant matrix and x, y are
vectors.
28. This problem asks you to implement the digital filter y[n] = h[n] x[n] via the DFT. The
input x[n] is defined for all n, < n < , and the filter transfer function is
H(z) = 1 + 2z 1 z 3 .
The input blocks are to be length 2. Draw the block diagram of the DFT implementation.
Explicitly show the entries of each matrix in the diagram.
29. Repeat the previous part but with the noncausal filter
H(z) = z + 1 + 2z 1 .
30. Consider Problem 8.9 (a). It can be solved like this: The zT of x[0], . . . , x[19] is
1
1
X(z) = x[0] + x[1] + + x[19] 19 .
z
z

.
G! , A "

.
110 211/12
119

APPENDIX B. THE FOURIER CHART

Thus, a 1-octave pitch shift upwards corresponds to a doubling of the frequencies of the
notes in the original
octave.
j4/5
Let W = e

. The exercise is to evaluate X(z) at z = W . The least positive integer k such

There are
several
files x[n]
on ainto
computer.
use5. the
that
W k = 1formats
is k = 5. for
Thisstoring
promptssound
us to break
subvectors Well
of period
That.wav
is, weformat,
partition the
vector
x intocan
fourread.
subvectors,
each of dimension
5:
which MATLAB
and
Scilab
In MATLAB
and Scilab


x[0] . . . x[4] x[5] . . . x[9] x[10] . . . x[14] x[15] . . . x[19]


x == wavread(file.wav)
[y, f s, bits]


y0 y1 y2 y3

loads the file, returning the sampled .data in variable y, the sampling rate (Hz) in variable
f s, and Then
the number of bits used to encode the samples in bits. Also, in MATLAB
= x =

1
X(z) = Y0 (z) + 5
wavwrite(y,file.wav)
z




1
1
Y1 (z) + 5 Y2 (z) + 5 Y3 (z)
.
z
z

saves theThus
vector y as a .wav file. In Scilab the function name is savewave. A .wav file can
be played (output to the speakers) by your media player.
X(W ) = Y0 (W ) + Y1 (W ) + Y2 (W ) + Y3 (W ).

Download the files tone.wav, scale.wav, chord.wav from the course website. Load them
Each term Yi (W ) can be computed by a 5-point DFT.
into MATLAB
or Scilab. Play tone. You should hear a pure tone. Write a MATLAB or
You
do
part
Scilab program to (b).
generate the same note. Hint: Display the DFT of the signal to see the
frequency.
31. A signal x[n] is the input to a zero-order-hold, producing the signal vc (t). Thus vc (t) = x[n]
for nT t < (n + 1)T . Then vc (t) is the input to the integrator with transfer function Hc (s),

6. Continuing,
play scale.
should
a series
of five
to generate
with output
wc (t). You
Finally,
wc (t) ishear
the input
to C/D
withtones.
samplingWrite
periodaTprogram
, and the output
the sameis result.
y[n]. Find Y (z) in terms of X(z).
32. Consider
continuous-time
signalhear
that aequals
0 for
t < 0, andof
forsome
t 0 is
by
7. Continuing,
playthechord.
You should
chord
consisting
ofgiven
the tones
played si



that ideally when applied
multaneously. xUsing
the
Kaiser
windowtechnique,
design a filter
53.12t
sin 1600t +
+ e45.97t sin 2600t +
.
c (t) = 3e
4
2
to chord will output the lowest tone alone.

8.

thatf h[n]
is not causal and it is not FIR. Truncat
Let x[n] denote xc (t) sampled 9.
at Continuing,
the sampling notice
frequency
s = 12 kHz. Suppose we have the
n that
>
and
2N
Then
itfor
is oflength
+ 1 but
Considerfirst
theNcontinuous-time
signal
equals
0<for
t sample
<. 0,
and
t x
0 4N
is given
by its still not
= 128 samples. From
x[n]
we2N
wish
to n
compute
values
c (t) at a higher
assigning
!
" at this
sampling rate, namely, 4 times!the old rate: 48
Let v[n] denote
xc (t) sampled
" kHz.45.97t
h[n
53.12t
g[n]
=
].
higher rate.
the impulse
h[n]+of the +
ideal
filter2600t
that produces
the
output
xc (t)Find
= 3e
sinresponse
1600t
e lowpasssin
+
. 2N
4
2
v[n] as follows:

Then g[n] is causal and FIR. Its length is 4N + 1 but we want 4N


Let x[n] denote xc (t) sampled at theupsampled
samplingsignal.
frequency
fs =
12] kHz.
Suppose
have
So drop
g[4N
and rename
the we
filter
g[n].the
On the s
and wish
g[n] to
y[n]compute
versus real
time t.values
How doofthey
compare?
first N = 128 samples. From x[n] we
sample
xc (t)
at a higher
v[n]

4
h[n]
sampling rate, namely, 4 times the old rate: 48 kHz. Let v[n] denote xc (t) sampled at this
10. Implement the filter g[n] by an equivalent DFT procedure:

higher rate. Find the impulse response h[n] of the ideal lowpass filter that produces the
output v[n] as follows:
33. Continuing, notice that h[n] is not causal and it is not FIR. Truncate h[n] to be zero for
n > 2N and n < 2N . Then it is length 4N + 1 but its still not causal. Shift it by assigning
g[n] = h[n 2N ].

Then g[n] is causal and FIR. Its length is 4N +1 but we want 4N , the length of the upsampled
signal. So drop g[4N ] and rename the filter g[n]. On the same graph plot x[n] and g[n] y[n]
versus real time t. How do they compare?
34. Continuing, implement the filter g[n] by an equivalent DFT procedure:

11. On the same graph plot x[n] and v[n] versus real time t. You shou
delayed version of x[n].

120

APPENDIX B. THE FOURIER CHART

matrix G
x[n]

v[n]

4
4N -point DFT

4N -point IDFT

On the same graph plot x[n] and v[n] versus real time t.
35. Consider the two systems

w(t)
y[n]
xc (t)

v(t)

C/D

w[n]
xc (t)

C/D

y[n]

The input xc (t) is bandlimited, w(t) is a rectangular window, and w[n] = w(nT ). Are the two
systems input/output equivalent? Is v(t) bandlimited? Is W (j) a sinc function? Is W (ej )
a sinc function?

You might also like