A Digital Signal Processing Primer


Solutions Manual

A Digital Signal
Processing Primer
with Applications to Digital Audio
and Computer Music

Ken Steiglitz
Eugene Higgins Professor of Computer Science Emeritus
Princeton University

DOVER PUBLICATIONS, INC.


Mineola, New York
Copyright
Copyright © 1996, 2020 by Ken Steiglitz
All rights reserved.

International Standard Book Number


ISBN-13: 978-0-486-84583-8
ISBN-10: 0-486-84583-4

www.doverpublications.com

Note to the Reader


Here are solutions to all the problems in my book, A Digital Signal Processing Primer,
with the exception of a few open-ended projects.
As in the book, references to figures and equations are to the current chapter unless
stated otherwise. The figures in this solutions manual are labeled Pn, where n is the
problem number. There was no need to number the equations.
I am indebted to Dan Trueman for checking these solutions and suggesting clarifications
and improvements. When Dan did that much appreciated work he was a graduate student
in the music department at Princeton; he is now professor in that same department. The
inevitable errors that remain are, of course, my own doing.
Chapter 1

Tuning Forks, Phasors

1. One simply defined solution is the set of polynomial signals of fixed degree. More
precisely, the following set of signals is closed under time shift:

$$ S_\omega = \{\, a_n t^n + a_{n-1} t^{n-1} + \cdots + a_0 \,\} $$

for a fixed degree $n$, all coefficients $a_i$, and all $t$. Thus, for example, the set of linear signals
is closed under time shift.
We can also use the set of finite sums of one-sided real exponential signals:
$$ S_\omega = \left\{ \sum_{i=1}^{n} a_i e^{\sigma_i t} \right\} $$

for all real numbers $a_i$, $\sigma_i$, and time $t \ge 0$.


A really grand example is the class of all finite sums of exponentially weighted sinusoidal
signals:

$$ S_\omega = \left\{ \sum_{i=1}^{n} a_i t^{k_i} e^{\sigma_i t} \sin(\omega_i t + \phi_i) \right\} $$

for all real numbers $a_i$, $\sigma_i$, $\omega_i$, $\phi_i$, time $t \ge 0$, and integer powers $k_i$. This is the set of all
one-sided signals with rational Laplace transforms (when t is continuous) or z-transforms
(when t is discrete). These are discussed in Section 9 of Chapter 9. Obvious variations can
be obtained by restricting the time constants σi or the frequencies ωi to finite sets.
A wiseacre student may suggest the class of all signals, a perfectly good answer. The
empty set is also correct, vacuously.
2. The class of signals Sω defined in Problem 1 is not closed under multiplication. For
example,
$$ 2 \sin^2(\omega t) = 1 - \cos(2\omega t) $$
which includes a sinusoid at frequency $2\omega$ and is therefore outside the class $S_\omega$.
Neither is the class closed under division, as can be seen from the fact that $1/\sin(\omega t)$
is unbounded at $t = 0$ and therefore cannot be a sinusoid.


3. By induction. Consider the sum of n elements of the class. The basis of the induction is
the case n = 2, which is already established. To carry out the induction, assume that the
class is closed under the addition of n − 1 elements, and consider the sum of n elements.
The sum of the first n − 1 elements is in the class by the induction hypothesis, and this sum
plus the nth element is in the class by the n = 2 case.
4. If a countably infinite sum of members from the class Sω converges, it also belongs to
the class. Just consider the real part of

$$ \sum_{k=0}^{\infty} A_k e^{j(\omega t + \phi_k)} = e^{j\omega t} \sum_{k=0}^{\infty} A_k e^{j\phi_k} $$

The same is true for an uncountably infinite sum.


If the class of functions is allowed to contain finite sums of phasors with all frequencies
that are integer multiples of some constant ω, then the class is no longer closed under
addition of a countably infinite number of terms. As we’ll see in Chapter 7, the Fourier
series for a square wave provides a counterexample — it’s an infinite sum of members of
this class that converges to a function outside the class.
5. The two tuning forks may differ in pitch, and you may be able to hear a beat. I have
three tuning forks designated as A 440 Hz, and the first two do not produce a beat when
sounded together. The third, however, beats with a period of about 2 sec against each of the first two.
The pitch of the third is therefore about 0.5 Hz higher or lower than that of the first two.
The higher-pitched clang tones mentioned in the Notes are more likely to differ enough
in pitch to produce a beat. If the two forks are held apart, you may be able to tell there
are two sources by moving your head.
7. This simple experiment is an elegant demonstration of destructive interference. As the
tuning fork is rotated 360°, four distinct nulls are observed, located symmetrically in the
plane perpendicular to the axis of the tines, as shown in Fig. P7. An intuitive explanation
is as follows (see, for example, J. Askill, Physics of Musical Sounds, Van Nostrand, New
York, N.Y., 1979, p. 29): The tines vibrate in the horizontal direction, moving towards one
another, and then away from each other. This radiates sound waves most strongly in the
direction aligned with the x-axis (left-right in the figure), with a strength that diminishes as
one approaches the direction aligned with the y-axis (up-down in the figure). The motion
of the tines also radiates sound in the y direction, by alternately compressing
and rarefying the air between the tines. This y-axis radiation is 180° out of phase with the
x-axis radiation, and also diminishes in strength as one approaches the x-axis. At some
angle roughly halfway between 0° and 90°, the two radiation waves cancel destructively.
Mathematically, the radial velocity in polar coordinates $(r, \theta)$ can be derived from the
velocity potential of two dipoles (see, for example, H. Lamb, The Dynamical Theory of
Sound, Dover, New York, N.Y., 1960, p. 231). The x-axis radial velocity is of the form

$$ A_x f(r)\, e^{j(\omega t - 2\pi r/\lambda)} \cos^2\theta $$

where $\omega$ is the fundamental frequency of the tuning fork in radians per sec, $\lambda = 2\pi c/\omega$ is
the wavelength in m, and $c$ is the speed of sound in air in m/sec. The y-axis radial velocity
is approximately of the same form, with a different amplitude, say $A_y$, and rotated 90°:

$$ A_y f(r)\, e^{j(\omega t - 2\pi r/\lambda)} \sin^2\theta $$

The nulls in the total therefore occur when

$$ \tan^2\theta = A_x/A_y $$

Tuning forks seem to be designed to have $A_x \approx A_y$, so the nulls occur along axes at about
45° and 135°.

Fig. P7 Nulls in the radiation from a tuning fork. Shown is a plane perpendicular to the tines.
(The figure marks the tines, the radiation in the x direction and in the y direction, and four nulls.)
8. Three Lissajous figures are shown in Fig. P8, for the frequency ratios 4:1, 31:7, and
99:97. The simpler the ratio, the simpler the figure.
9. I generated this sound real-time in front of a class, and the result stopped me in my
tracks. It’s a good example of the advantage of thinking of sinusoids as phasors.
If we view the generated signal as a rotating phasor, its instantaneous frequency is the
rate of change of its phase angle:
  
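$$ \omega = \frac{d}{dt}\left[\left(\omega_1 + (\omega_2 - \omega_1)\frac{t}{T}\right) t\right] = \omega_1 + (\omega_2 - \omega_1)\frac{2t}{T} $$

The instantaneous frequency therefore starts at $\omega_1$ and increases linearly to $2\omega_2 - \omega_1$ at
$t = T$, not $\omega_2$. If we continue at the fixed frequency $\omega_2$, the instantaneous frequency will
drop suddenly to $\omega_2$.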
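The overshoot can be checked numerically. The following is only a sketch: the function names `sweep_phase` and `sweep_inst_freq` are made up for illustration, and the derivative is estimated by a central difference rather than computed analytically.

```c
/* Phase of the naive sweep: the ramp w1 + (w2 - w1) t/T multiplied by t. */
double sweep_phase(double t, double w1, double w2, double T)
{
    return (w1 + (w2 - w1) * t / T) * t;
}

/* Instantaneous frequency, estimated as a central difference of the phase. */
double sweep_inst_freq(double t, double w1, double w2, double T)
{
    const double h = 1e-6 * T;  /* small step relative to the sweep length */
    return (sweep_phase(t + h, w1, w2, T) - sweep_phase(t - h, w1, w2, T))
           / (2.0 * h);
}
```

With w1 = 100, w2 = 200, and T = 2, calling `sweep_inst_freq(2.0, 100.0, 200.0, 2.0)` should give a value close to 300, which is 2·w2 − w1 rather than w2.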
Fig. P8 Some Lissajous figures. The frequency ratios are (top) 4:1, (middle) 31:7, and
(bottom) 99:97.

10. Denote by $S$ the length of the vector labeled SUM in Fig. 9.3, and note that the vector
$a_2$ makes an angle $\delta t$ with the real axis. Then the law of cosines yields

$$ S^2 = a_1^2 + a_2^2 - 2 a_1 a_2 \cos(\pi - \delta t) = a_1^2 + a_2^2 + 2 a_1 a_2 \cos(\delta t) $$

The length thus varies between $|a_1 - a_2|$ and $|a_1 + a_2|$, as we expect.

Alternatively, the squared magnitude of Eq. 9.4 is the sum of the real part squared and
the imaginary part squared,

$$ S^2 = (a_1 + a_2 \cos(\delta t))^2 + (a_2 \sin(\delta t))^2 $$

which simplifies to the same thing.


11. The actual waveform, Eq. 9.1, is the real part of the complex waveform, Eq. 9.2, and
touches the envelope exactly when the complex waveform is real-valued — that is, the real
part is equal to the magnitude when and only when the imaginary part is zero. Looking
at it geometrically, this occurs when the tip of the complex vector crosses the real axis.
The position of the tip in the complex plane is a continuous function of time, and moves
between the upper half plane and the lower half plane. Hence, there is at least one instant
of time when it lies precisely on the real line.
12. The actual waveform touches the envelope exactly when the imaginary part of the
complex waveform, Eq. 9.2, is zero:

$$ a_1 \sin(\omega\tau) + a_2 \sin((\omega + \delta)\tau) = 0 $$

I don’t know of any way to solve this equation analytically for τ in the general case, and I
don’t think there is any.
13. The angle of the complex signal in Eq. 9.2 is

$$ \Theta(t) = \arctan\left[ \frac{a_1 \sin(\omega t) + a_2 \sin((\omega + \delta)t)}{a_1 \cos(\omega t) + a_2 \cos((\omega + \delta)t)} \right] \text{ radians} $$

Differentiating and simplifying yields the following expression for the instantaneous frequency:

$$ \frac{d\Theta(t)}{dt} = \omega + \delta a_2 \left[ \frac{a_2 + a_1 \cos(\delta t)}{a_1^2 + a_2^2 + 2 a_1 a_2 \cos(\delta t)} \right] \text{ radians per sec} $$

Assume without loss of generality that $a_1 \ge a_2 \ge 0$ and $\delta \ge 0$. Then differentiating
this expression for instantaneous frequency shows that the maxima and minima occur when
$\cos(\delta t) = \pm 1$ and are

$$ \omega + \delta\, \frac{a_2}{a_1 + a_2} \text{ radians per sec} $$

and

$$ \omega - \delta\, \frac{a_2}{a_1 - a_2} \text{ radians per sec} $$

respectively. The range is the difference, which when normalized by $\omega$ can be written

$$ \frac{\Delta\omega}{\omega} = 2\, \frac{\delta}{\omega} \left[ \frac{a_1/a_2}{(a_1/a_2)^2 - 1} \right] $$

This is proportional to the fractional difference in frequencies between the two beating
sinusoids, and becomes infinite as their amplitudes approach each other.
14. The main point of this question is to get the student to think about the ambiguity in
quadrant when using a one-argument arctan function. For example, arctan(1) can be either
45° or 225°. The following C program solves this problem by using the two-argument C math
library function atan2():

/* from rectangular to polar form */

#include <stdio.h>
#include <math.h>

int main(void) {
    double x, y, R, theta, pi;

    printf("rectangular to polar\n");
    pi = 4.*atan2(1., 1.);              /* atan2(1, 1) is pi/4 */
    printf("Please enter x, y for x + jy\n");
    if (scanf("%lf %lf", &x, &y) != 2)  /* stop on malformed input */
        return 1;
    printf("%lf + j%lf =\n", x, y);
    R = sqrt(x*x + y*y);                /* magnitude */
    theta = (180./pi)*atan2(y, x);      /* angle in degrees, correct quadrant */
    printf(" %lf at %lf degrees\n", R, theta);
    return 0;
} /* main */

This problem doesn’t come up in the conversion from polar to rectangular coordinates:

/* from polar to rectangular form */

#include <stdio.h>
#include <math.h>

int main(void) {
    double x, y, R, theta, pi;

    pi = 4.*atan2(1., 1.);              /* atan2(1, 1) is pi/4 */
    printf("polar to rectangular\n");
    printf("Please enter R, theta for R at theta");
    printf(", theta in degrees\n");
    if (scanf("%lf %lf", &R, &theta) != 2)  /* stop on malformed input */
        return 1;
    printf("%lf at %lf =\n", R, theta);
    x = R*cos(theta*pi/180.);           /* convert degrees to radians first */
    y = R*sin(theta*pi/180.);
    printf(" %lf + j%lf\n", x, y);
    return 0;
} /* main */

15. Write the complex exponential as a power series formally, and separate the real and
imaginary terms:
$$ e^{j\theta} = \sum_{k=0}^{\infty} \frac{(j\theta)^k}{k!} = \sum_{k=0,2,4,\ldots} \frac{(-1)^{k/2}\, \theta^k}{k!} + j \sum_{k=1,3,5,\ldots} \frac{(-1)^{(k-1)/2}\, \theta^k}{k!} $$

The sums on the right are the power series for cos θ and sin θ, respectively.
16. The frequency of the clang tone varies from fork to fork. Lord Rayleigh (The Theory of
Sound, see the Notes) reports that the forks examined by Helmholtz had clang tones with
frequencies from 5.8 to 6.6 times that of the main pitch. It’s easy to verify by ear that the
clang tone dies out more rapidly than the fork’s main pitch.
17. Convert to Hz by dividing by 2π; then take the reciprocal to find the period 2π/0.02 =
314 sec. This checks with measurement on the figure.
18. Take the simple case when two sine waves of different frequencies and equal amplitudes
are used. The result is

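$$
\begin{aligned}
(\sin(\omega_1 t) + \sin(\omega_2 t))^2
 &= \sin^2(\omega_1 t) + \sin^2(\omega_2 t) + 2 \sin(\omega_1 t) \sin(\omega_2 t) \\
 &= 1 - \tfrac{1}{2} \cos(2\omega_1 t) - \tfrac{1}{2} \cos(2\omega_2 t) \\
 &\quad + \cos((\omega_1 - \omega_2)t) - \cos((\omega_1 + \omega_2)t)
\end{aligned}
$$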

using standard trigonometric identities. Thus, the positive frequencies present are ω =
0, 2ω1 , 2ω2 , and | ω1 ± ω2 |, with relative proportions in this case equal to 1, 1/2, 1/2, 1, and
1, respectively.
In the general case when the two sinusoids have arbitrary amplitudes and phase angles,
only the same frequencies as above are possible. Carrying out the more general calculation,

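$$
\begin{aligned}
(A_1 \sin(\omega_1 t + \phi_1) &+ A_2 \sin(\omega_2 t + \phi_2))^2 \\
 &= A_1^2 \sin^2(\omega_1 t + \phi_1) + A_2^2 \sin^2(\omega_2 t + \phi_2) + 2 A_1 A_2 \sin(\omega_1 t + \phi_1) \sin(\omega_2 t + \phi_2) \\
 &= \frac{A_1^2}{2} + \frac{A_2^2}{2} - \frac{A_1^2}{2} \cos(2\omega_1 t + 2\phi_1) - \frac{A_2^2}{2} \cos(2\omega_2 t + 2\phi_2) \\
 &\quad + A_1 A_2 \cos((\omega_1 - \omega_2)t + \phi_1 - \phi_2) - A_1 A_2 \cos((\omega_1 + \omega_2)t + \phi_1 + \phi_2)
\end{aligned}
$$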
Chapter 2

Strings, Pipes, the Wave Equation

1. The E4 string on my guitar (highest pitch) is 27.75 in long (0.70485 m), and its pitch
as an open string is 329.6 Hz. The speed of sound on the string is therefore c = 2L/T =
2Lf = 464.6 m/sec. The E2 string, two octaves below, yields one-fourth this value, 116.2
m/sec. Thus the speed of sound on the strings of a guitar spans the speed of sound in air,
about 345 m/sec.
2. By Eq. 2.5, the square of the velocity of sound on a string is P/ρ, where P is the tension
and ρ is the mass per unit length. Increasing the tension therefore increases the velocity of
sound on the string. The fundamental pitch is proportional to the velocity of sound on the
string, so the pitch also increases, as every guitarist knows.
3. Notice first that γ is dimensionless, being defined in Eq. 6.4 as a ratio of fractional
changes. From Eq. 6.14,
$$ \gamma = \frac{c^2 \rho_0}{p_0} $$
and from the CRC Handbook of Chemistry and Physics, 75th edition, (D. R. Lide, ed.),
CRC Press, Cleveland, Ohio, 1994-1995,

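$$ c = 340.29\ \mathrm{m\,s^{-1}}, \qquad \rho_0 = 1.2250\ \mathrm{kg\,m^{-3}}, \qquad p_0 = 1.01325 \times 10^5\ \mathrm{N\,m^{-2}}\ (= \mathrm{kg\,m^{-1}\,s^{-2}}) $$

for sea-level dry air at 15 °C. The calculation yields the value $\gamma = 1.4000$. In his 1894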
edition of Theory of Sound (see the Notes to Chapter 2), Lord Rayleigh calculates the value
γ = 1.410 from thermodynamic considerations, independent of the speed of sound.
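The arithmetic behind the value 1.4000 is short enough to spell out. The helper name `adiabatic_gamma` below is hypothetical; it simply evaluates c²ρ0/p0 with all quantities in SI units.

```c
/* gamma = c^2 * rho0 / p0, with c in m/s, rho0 in kg/m^3, and p0 in N/m^2. */
double adiabatic_gamma(double c, double rho0, double p0)
{
    return c * c * rho0 / p0;
}
```

Calling `adiabatic_gamma(340.29, 1.2250, 1.01325e5)` gives roughly 1.39997, which rounds to the 1.4000 quoted above.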
4. From Eq. 3.2, the general solution for the local deviation of the position of an air
molecule is
$$ \xi(x, t) = f(t - x/c) + g(t + x/c) $$


We have already seen in Section 7 that the tube being open at x = 0 leads to Eq. 7.4,

$$ \xi(x, t) = f(t - x/c) + f(t + x/c) $$

Now the condition that the tube is open at $x = L$ is enforced by

$$ \left. \frac{\partial \xi}{\partial x} \right|_{x=L} = 0 $$

which leads to the same periodicity condition as in the case of the vibrating string fixed at
two points:

$$ f(t) = f(t + 2L/c) $$

Therefore, define the fundamental frequency $\omega_0 = \pi c/L$ radians per sec and substitute the
proposed solution

$$ \xi(x, t) = e^{jk\omega_0 t}\, \Xi(x) $$

where $k$ is an integer, in the wave equation, Eq. 6.13, yielding

$$ \frac{d^2 \Xi}{dx^2} = -\left( \frac{k\pi}{L} \right)^2 \Xi(x) $$

Therefore,

$$ \Xi(x) = \cos(k\pi x/L + \phi) $$

where we need to find the phase angle $\phi$ from the boundary conditions. Imposing the
conditions $\Xi'(0) = \Xi'(L) = 0$ yields

$$ \sin(\phi) = \sin(k\pi + \phi) = 0 $$

so we can take $\phi = 0$, and solutions are of the form

$$ e^{jk\omega_0 t} \cos(k\pi x/L) $$

where $k$ is any integer.


Of course any linear combination of solutions will also be a solution, so the grand
combination is

$$ \sum_{k=1}^{\infty} c_k\, e^{jk\omega_0 t} \cos(k\pi x/L) $$

in analogy to Eq. 8.1 for a string fixed at two points. In the case of a tube open at both
ends, the nodes and antinodes are interchanged, but the same frequencies are present.
5. From Problem 4, the fundamental frequency for the case when the straw is open at both
ends is ω0 = πc/L radians per sec. When we close one end of the straw, the fundamental
frequency becomes ω0 = πc/(2L) radians per sec (from Section 7), so the pitch is halved.
6 and 7. I recommend Benade’s discussion of a glockenspiel bar in his book, Fundamentals
of Musical Acoustics, referenced in the Notes to Chapter 3. In the simplest mode a bar
flexes between a U-shape and an inverted U-shape. The next higher mode has two humps
instead of one, looking like a ∼. The higher modes are more complicated, and involve
twisting motion of the bar.
Benade goes on to describe the vibration of rectangular plates, guitar plates, and drum-
heads. All these are two-dimensional boundary-value problems, and the modes of vibration
can be astoundingly complex. For a real treat, see Mary Désirée Waller’s Chladni Figures:
A Study in Symmetry, G. Bell & Sons, London, 1961. She reproduced and extended the
experiments of Ernst Florens Friedrich Chladni (1756–1827), who visualized the vibration
of plates by sprinkling them with fine sand and bowing them. Waller had been asked by
an itinerant ice-cream vendor why dry ice made his bicycle bell ring, and refined a method
for exciting vibrations in plates that uses dry ice instead of a bow. She went on to devote
a large part of her life to the exploration of what are called Chladni figures, and her book
is what she herself calls “essentially a picture book” of her experimental results.
8. Putting your finger at the center of a string should suppress all harmonics except those
that already have a node there. Figure 5.1 shows that the odd harmonics should then be
suppressed, and the frequencies remaining should be the even multiples of ω0 . Thus the
pitch of the string is effectively doubled, which makes sense because it can now be thought
of as two strings, each having half the original length.
Benade discusses the experiment at the end of Chapter 8 of his book (see the Notes
to Chapter 3), suggesting the “corner of a soft foam sponge” instead of a finger to absorb
the vibrations of the odd harmonics. Benade suggests many similar experiments involving
plucking, striking, and damping different vibrating systems at different points. It’s an
excellent way to gain insight about vibrational modes and their excitations.
Chapter 3

Sampling and Quantizing


1. One bel is 10 decibels, an amplitude ratio of √10 ≈ 3.16, or a power ratio of 10.
2. (a) The signal is the rotating image of the hubcap pattern, and the sampling mechanism
is the periodic shutter of the motion-picture camera, often 24 frames per second.
(b) In this case the signal is a spatial one — the periodic stripes in the tweed material.
The spatial sampling is provided by the raster scan lines of the TV camera. The result is
similar to the interference patterns between stripes of different spacings that are overlaid
at different angles. When the TV anchor squirms, the change in perspective changes the
apparent spatial frequency of the tweed, and the pattern shifts in complicated ways. The
resulting spatial aliasing patterns are sometimes called moiré patterns.
(c) Some people have told me this effect occurs, but I’ve never observed it myself. The
signal is clear — it’s the same as in Part (a). If the effect is real, the sampling must occur
somewhere in the observer’s eye-brain vision system. This would give us a clue about how
vision works, and would be very interesting indeed.
The reason I’ve specified the sun in this part is that artificial lighting is often powered by
AC current, which can provide a 60 Hz variation in intensity. This is often quite noticeable
in fluorescent lighting, and the effect is used to calibrate phonograph turntables. The idea
is to draw radial stripes on a paper disk, which is then put on the moving turntable. The
stripes are spaced so that they appear stationary in 60 Hz lighting when the turntable is
moving at exactly the correct speed.
3. From Eq. 1.5, the frequency 330 Hz gets aliased to f0 − fs = 330 − 300 = 30 Hz in the
baseband. This checks in Fig. 1.1, where the period of the aliased, sampled waveform is 10
sampling periods (1/30 sec) at the sampling frequency of 300 Hz.
4. Figure P4 shows the first three terms in the Fourier series of a square wave. The period
in this example is 225 sec.
5. Figure P5 shows four sampling periods in three repetition periods of the square wave.
There are 40,000/30,000 = 4/3 samples per period, and the sampling pattern repeats every
three periods of the square wave.

Fig. P4 The first three terms in the Fourier series of a square wave. (Axes: signal value
versus time, with time ticks at 0, 225, and 450.)

Fig. P5 Sampling a 30 kHz square wave at 40 kHz. (The figure marks the period of the
square wave and the sampling period.)

6. The 79th harmonic occurs at 55.3 kHz, which at a 40 kHz sampling rate aliases to 15.3
kHz in the baseband.
7. Let fs = 1/T Hz and f0 = 1/P Hz be the sampling frequency and repetition frequency
of the original waveform, respectively. Then the frequencies present in the sampled signal
are, by Eq. 1.5,
kf0 + nfs

where k and n are arbitrary integers.


If f0 is an integer multiple of fs , all signal frequencies are aliased after sampling to the
zero frequency, and the conditions of the problem are satisfied trivially — no new frequencies
appear in the baseband. If f0 = fs /2, the Nyquist frequency, all frequencies are aliased to
zero or the Nyquist frequency, and again the conditions of the problem are satisfied trivially.
If f0 is neither an integer multiple of fs nor equal to fs /2, we can assume that f0 < fs /2,
for if not, the frequency f0 will itself certainly be aliased into the baseband [−fs /2, fs /2].
Assume next that $r = f_s/f_0 > 2$ is not an integer, and let $r = A + \epsilon$, where $A$ is an
integer and $0 < \epsilon < 1$. The frequencies present in the sampled signal are

$$ f_0 \left( k + n \frac{f_s}{f_0} \right) = f_0 (k + nA + n\epsilon) $$

Choosing $n = 1$ and $k = -A$ shows that the frequency $\epsilon f_0$ is present after sampling, so the
conditions of the problem are not satisfied.
In the final case $r = f_s/f_0 > 2$ is an integer, say $A$. The frequencies present in the
sampled signal are

$$ f_0 \left( k + n \frac{f_s}{f_0} \right) = f_0 (k + nA) $$

so no new frequencies result from sampling.
To summarize, sampling introduces no new frequencies only when
• f0 is an integer multiple of fs ;

• f0 is equal to fs /2;

• fs is an integer multiple of f0 .
The third case is the only interesting one, and represents the situation when the sampling
pattern is identical from period to period of the original waveform.
8. Shifting the square wave by T /4 results in a Fourier series with only cosines because
the waveform becomes even. Only odd harmonics are present because the waveform is odd
about the center of each half-period. Another way to see this is to replace t by t + T /4 in
the Fourier series in Eq. 2.2.
(Erratum in first printing: The fifth paragraph in Section 2 should begin, “Next, observe
that our square wave is even about the center of each half-period. That is, if we shift the
signal so that it’s centered at the center of a half-period, say at t = T /4, the resulting signal
is even. Now sine waves at the odd harmonics have this property, but sine waves at the
even harmonics are odd about the quarter-period point.”)
Shifting by T /2 gets us back to a sine series with only odd harmonics, with the same
symmetry properties as the original. In fact, the shift just inverts the sign of the signal.
9. As observed in the text, the sampling pattern drifts by 1/7 of a sample each period
of the square wave, so it repeats only after 7 periods. The sampled waveform therefore
has a period of 7T = 1/100 sec, which corresponds to a fundamental frequency of 100 Hz.
In general, the new period is the least common multiple of the period of the signal and
the sampling period. This is equivalent to saying that the new frequency is the greatest
common divisor of the fundamental frequency of the signal and the sampling frequency.
The spectral components in Fig. 2.2 would therefore be 100 Hz apart if all harmonics of
the original signal were present. These are all frequencies of the form $700k + 40000n$ Hz for
all integers $k$ and $n$. But in fact the square wave allows only odd $k$, so we get all frequencies
of the form $700(2m + 1) + 40000n = 700 + 1400m + 40000n$ Hz for all integers $m$ and $n$.
This yields a spacing that is the greatest common divisor of 1400 Hz and 40000 Hz, or 200
Hz. The actual frequencies present after sampling are all the odd multiples of 100 Hz.
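The two greatest common divisors used in this problem can be checked with Euclid's algorithm. The function name `gcd_hz` is made up for this sketch, and the frequencies are assumed to be whole numbers of Hz.

```c
/* Greatest common divisor by Euclid's algorithm (inputs in whole Hz). */
unsigned long gcd_hz(unsigned long a, unsigned long b)
{
    while (b != 0) {
        unsigned long r = a % b;  /* remainder becomes the new divisor */
        a = b;
        b = r;
    }
    return a;
}
```

Here `gcd_hz(700, 40000)` is 100, the fundamental of the sampled waveform, and `gcd_hz(1400, 40000)` is 200, the spacing between the components actually present.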
10. A periodic waveform becomes not periodic after sampling if and only if the signal’s
fundamental period and the sampling period have no least common multiple, or, in other
words, when their ratio is irrational.
12. (a) First use $\cos(x + y) = \cos x \cos y - \sin x \sin y$ for $x = 2\theta$ and $y = \theta$:

$$ \cos(3\theta) = \cos(2\theta) \cos\theta - \sin(2\theta) \sin\theta $$

Then use $\sin x \sin y = (1/2)(\cos(x - y) - \cos(x + y))$ for the second term to get

$$ \cos(3\theta) = \cos(2\theta) \cos\theta - (1/2) \cos\theta + (1/2) \cos(3\theta) $$

Collecting the two terms in $\cos(3\theta)$ and replacing $\cos(2\theta)$ by $2\cos^2\theta - 1$ results in

$$ \cos(3\theta) = 4 \cos^3\theta - 3 \cos\theta $$

(b) Repeating the strategy above for cos(nθ) leads to

$$ \cos(n\theta) = 2 \cos\theta \cos((n - 1)\theta) - \cos((n - 2)\theta) $$

which shows that $\cos(n\theta)$ is a polynomial in $\cos\theta$ by induction.

Letting $x = \cos\theta$, the Chebyshev polynomial of order $n$ is defined by

$$ T_n(x) = \cos[n \arccos x] $$

We have shown above that

$$ T_1(x) = x, \qquad T_2(x) = 2x^2 - 1, \qquad T_3(x) = 4x^3 - 3x $$

and, in general,

$$ T_n(x) = 2x\, T_{n-1}(x) - T_{n-2}(x) $$

(c) To illustrate the general method, suppose we want to generate the signal

$$ A \cos(\omega t) + B \cos(2\omega t) + C \cos(3\omega t) $$

Letting $x = \cos(\omega t)$, this becomes

$$ A x + B(2x^2 - 1) + C(4x^3 - 3x) = 4Cx^3 + 2Bx^2 + (A - 3C)x - B $$

This is a fixed polynomial in x, so we can generate the desired signal by first generating
x = cos(ωt) and then passing it through this instantaneous nonlinearity.
Chapter 4

Feedforward Filters

1. Figure P1 shows sketches of the three cases. Notice that the samples in Case (a) can
be obtained by taking alternate samples in Case (b), and the samples in Case (b) can be
obtained by taking alternate samples in Case (c). The answers are not unique, but depend
on the relative phase between the sinusoid and the sampling.


Fig. P1 Samples of a sinusoid with frequency (a) zero; (b) the Nyquist frequency; and (c) half
the Nyquist frequency.

2. Reverse the sign of $a_1$, and use the filter

$$ y_t = x_t - 0.99\, x_{t-\tau} $$

with the same delay as in the text example, $\tau = 167$ µsec. The magnitude transfer function
is given by Eq. 2.7 with $a_1 = -0.99$, and the notches now occur when $\omega\tau$ is an even multiple
of $\pi$, that is, when $2\pi f \tau = 2\pi k$ for integer $k$, or at $f = 0, 1/\tau, 2/\tau, \ldots$ Hz.
(Erratum in first printing: The left-hand side of Eq. 2.7 should be $|H(\omega)|$.)
3. In the general case, the transfer function corresponding to the filter equation

$$ y_t = x_t + x_{t-1} + x_{t-2} + \cdots + x_{t-n+1} $$

is

$$ H(z) = 1 + z^{-1} + z^{-2} + \cdots + z^{-(n-1)} $$

This is a geometric series, with closed form

$$ H(z) = \frac{1 - z^{-n}}{1 - z^{-1}} $$

For the magnitude response replace $z$ by $e^{j\omega}$:

$$ H(\omega) = \frac{1 - e^{-jn\omega}}{1 - e^{-j\omega}} $$

Multiply the numerator by $e^{jn\omega/2}$, the denominator by $e^{j\omega/2}$, and take the magnitude:

$$ |H(\omega)| = \left| \frac{e^{jn\omega/2} - e^{-jn\omega/2}}{e^{j\omega/2} - e^{-j\omega/2}} \right| = \left| \frac{\sin(n\omega/2)}{\sin(\omega/2)} \right| $$

The nulls occur at the zeros of the numerator that are not zeros of the denominator, which
correspond to the frequencies $\omega = 2\pi k/n$ radians per sample, for integer $k$ not equal to
integer multiples of $n$, $k = 1, 2, \ldots, n - 1, n + 1, \ldots$. The equivalent frequencies in terms of
the sampling rate $f_s$ are $f_s/n, 2f_s/n, \ldots, (n - 1)f_s/n, (n + 1)f_s/n, \ldots$. Major peaks occur
precisely at integer multiples of the sampling frequency, where the zeros in the numerator
and denominator cancel out. The other peaks occur close to (but not precisely) midway
between nulls.
This is the frequency content of a rectangular window of length n, and is plotted for
n = 8 and 64 in Fig. 3.1 of Chapter 10.
4. First write wt and wt−1 in terms of x:

$$ w_t = a_0 x_t + a_1 x_{t-1}, \qquad w_{t-1} = a_0 x_{t-1} + a_1 x_{t-2} $$

Then substitute these values in

$$ y_t = b_0 w_t + b_1 w_{t-1} $$

and collect terms to get

$$ y_t = a_0 b_0 x_t + (a_0 b_1 + a_1 b_0) x_{t-1} + a_1 b_1 x_{t-2} $$

as required.
5. When we multiply polynomials, as in Eq. 5.11, we first use the distributive law, which
justifies operations of the form α(β + γ) = αβ + αγ. This holds true for signals when α is
multiplication by a constant, because multiplying the sum of two signals by a constant is
equivalent to multiplying each by the constant and then summing the signals. It’s also clear
that it holds true when the α represents a shift by any number of samples. Commutativity
also holds; that is, the order of shifting and multiplication can be interchanged arbitrarily.
Finally, we use the law $z^{m+n} = z^m z^n$ for integer shifts $m$ and $n$. This is true because
shifting by m samples and then by n samples is equivalent to shifting by m + n samples.
6. Let the transfer function be $A(z) = \sum_i a_i z^{-i}$ and let $z_0$ be a complex zero. Then the
transfer function evaluated at the conjugate point $z_0^*$ is

$$ A(z_0^*) = \sum_i a_i (z_0^*)^{-i} = \sum_i a_i \left( z_0^{-i} \right)^* $$

using the fact that complex conjugation and exponentiation commute. Taking the conjugate
of this yields

$$ (A(z_0^*))^* = \sum_i a_i^* z_0^{-i} = \sum_i a_i z_0^{-i} = 0 $$

using the assumption that the $a_i$ are real.
Because the ai are real, replacing ω by −ω in the transfer function has the effect of
complex-conjugating it. In other words, H(−ω) = H ∗ (ω). Therefore, the magnitude is an
even function of ω and the phase is an odd function of ω.
7. First calculate the quadratic factor that accounts for the zeros at $z = e^{\pm j\pi/6}$:

$$ (z - e^{j\pi/6})(z - e^{-j\pi/6}) = z^2 - 2\cos(\pi/6)\, z + 1 = z^2 - \sqrt{3}\, z + 1 $$

We want zeros at $z = 1$ and $z = e^{\pm j\pi/6}$, so the transfer function is

$$ z^{-3} (z - 1)(z^2 - \sqrt{3}\, z + 1) = 1 - (1 + \sqrt{3}) z^{-1} + (1 + \sqrt{3}) z^{-2} - z^{-3} $$

where we have introduced a delay of three samples to make the filter realizable.
Figure P7 shows the magnitude response of this filter, and the notches at the frequencies
zero and π/6 radians per sample show clearly. The peak magnitude between those two
frequencies is −25.45 dB and occurs at 0.298 radians per sample.
When the signal $x(t) = 1 + \cos(\pi t/6)$ is passed through this filter, the result is the
signal

$$ y(0) = 2, \qquad y(1) = -3.598076, \qquad y(2) = 1.866025 $$

and

$$ y(t) = 0 \quad \text{for } t \ge 3 $$

and yes, the output does become exactly zero.

Fig. P7 The magnitude response of the filter in Problem 7. (Axes: magnitude response in dB,
from −80 to 20, versus frequency in fractions of the sampling rate, from 0 to 0.5.)


8. Assume the transfer function is of the following form:

$$ H(z) = a z^m + \cdots + a z^n $$

where the exponents are in ascending order. The problem stipulates that the coefficients
are symmetric about their center. Multiply by $z^{-(m+n)/2}$:

$$ z^{-(m+n)/2} H(z) = a z^{(m-n)/2} + \cdots + a z^{-(m-n)/2} $$

This makes the exponents odd about the center, so when we let $z = e^{j\omega}$ to get the frequency
response, pairs of terms symmetrically placed about the center combine to form cosines,
and the right-hand side becomes a real-valued cosine series:

$$ e^{-j(m+n)\omega/2} H(\omega) = \text{real-valued cosine series} $$

This shows that the phase of $H(\omega)$ is linear. When $m = 0$ and $n = N - 1$, the resultant
delay is $(N - 1)/2$ samples.
9. Cases (a) and (b) are narrowband, having concentrated energy at DC and 1/100 the
Nyquist frequency, respectively. In general, if a signal is narrowband about ω0 , the transfer
function frequency response can be approximated by

$$ H(\omega) \approx |H(\omega_0)|\, e^{j\phi(\omega_0)} $$

where $\phi(\omega)$ is the phase response. Thus, if we normalize by $|H(\omega_0)|$, the effect of the filter
should be very close to an ideal delay. Figure P9 shows the results for these three cases,
using the filter normalized at zero frequency, and therefore having the transfer function
$0.5 + 0.5 z^{-1}$. The first two cases illustrate the delay of about one-half sample.
10. Multiply out the right-hand side:

$$ (z - e^{j\pi/3})(z - e^{-j\pi/3}) = z^2 - 2\cos(\pi/3)\, z + 1 = z^2 - z + 1 $$

11. The transfer function is $1 - R^L z^{-L}$ and its magnitude has minima and maxima when
$z^L = +1$ and $-1$, respectively. The values achieved at minima and maxima are therefore
$1 - R^L$ and $1 + R^L$, respectively.

Fig. P9 The results of filtering three signals with a simple feedforward filter. In Cases (a) and
(b) the signals are narrowband and the result is close to a pure delay of one-half sample; in Case
(c) the signal is not narrowband, and the result has no such easy interpretation. (Each of the
three panels plots signal value against time in samples.)
Chapter 5

Feedback Filters

1. Expand the exponential in a power series:

\[ e^{-B/2} = 1 - B/2 + (1/2)(B/2)^2 - \cdots \approx 1 - B/2 \]

The origin of the exponential form is that monument of electrical engineering: the tuned
RLC circuit. It’s the precise counterpart of the digital reson discussed in this chapter. In
a classic treatment of circuit theory, D. F. Tuttle, Jr. (Circuits, McGraw-Hill, New York,
N.Y., 1977) devotes an entire chapter of 116 pages just to this circuit. It turns out that the
impulse response is a damped sinusoid (no surprise), and the damping factor is

\[ e^{-(\Delta\omega/2)t} \]

where $\Delta\omega$ is the bandwidth in radians per sec. To correspond to the digital reson, set this
equal to $R^t$, yielding
\[ R = e^{-\Delta\omega/2} \]
as desired.
2. Start with Eq. 5.1, the inverse-square of the magnitude response:

\[ 1/|H(\phi)|^2 = \left| \left(e^{j\phi} - Re^{j\theta}\right)\left(e^{j\phi} - Re^{-j\theta}\right) \right|^2 \]

Multiply out the binomials and write this as the sum of the squares of the real and imaginary
parts:

\[ \left[\cos(2\phi) - R\cos(\phi+\theta) - R\cos(\phi-\theta) + R^2\right]^2 + \left[\sin(2\phi) - R\sin(\phi+\theta) - R\sin(\phi-\theta)\right]^2 \]

Replace $\cos(\phi \pm \theta)$ and $\sin(\phi \pm \theta)$, using the usual identities:

\[ \left[\cos(2\phi) - 2R\cos\phi\cos\theta + R^2\right]^2 + \left[\sin(2\phi) - 2R\sin\phi\cos\theta\right]^2 \]

Expand the squares and collect terms, using the identity $\cos\phi\cos(2\phi) + \sin\phi\sin(2\phi) = \cos\phi$:

\[ 1 + 4R^2\cos^2\theta + R^4 - 4R\cos\phi\cos\theta + 2R^2\cos(2\phi) - 4R^3\cos\phi\cos\theta \]

Replace $\cos(2\phi)$ by $2\cos^2\phi - 1$ and use the resulting $-2R^2$ to form the perfect square
$(1 - R^2)^2$:

\[ \begin{aligned}
& (1 - R^2)^2 + 4R^2\cos^2\theta - 4R\cos\phi\cos\theta + 4R^2\cos^2\phi - 4R^3\cos\phi\cos\theta \\
&\quad = (1 - R^2)^2 + 4R^2\cos^2\theta - 4R(R^2 + 1)\cos\phi\cos\theta + 4R^2\cos^2\phi
\end{aligned} \]

which is Eq. 5.2.

Differentiate with respect to $\cos\phi$ and set to zero:

\[ -4R(R^2 + 1)\cos\theta + 8R^2\cos\phi = 0 \]

or
\[ \cos\phi = \frac{1 + R^2}{2R}\cos\theta \]
which is Eq. 5.3.
3. Solving Part (c) first will help us with Parts (a) and (b). Let’s assume that we are
dealing with a case when $\theta_0$ is close to $\theta$, so let $\theta_0 = \theta + \delta$, where $\delta$ is small. Approximate
$\cos\theta_0$ by its first-order Taylor series:

\[ \cos\theta_0 = \cos\theta - \delta\sin\theta + \text{higher-order terms in } \delta \]

and substitute in Eq. 5.3 derived in Problem 2:

\[ \cos\theta - \delta\sin\theta = \frac{1 + R^2}{2R}\cos\theta \]

Rearranging gives us an approximation for $\delta$:

\[ \delta = -\frac{(1 - R)^2}{2R\tan\theta} \]

From this we see that when $\theta$ is between 0 and $\pi/2$, $\delta$ is negative, so $\theta_0$ is shifted to lower
frequencies. When $\theta$ is between $\pi/2$ and $\pi$, $\delta$ is positive, so $\theta_0$ is shifted upward, towards
the Nyquist frequency. In other words, when $\theta$ is in the baseband, $\theta_0$ is always shifted away
from the center frequency, $\pi/2$, which is half the Nyquist frequency. (We can also see this
directly from Eq. 5.3, because the factor $(1 + R^2)/2R$ is greater than 1 when $R$ is less than
1, as we’ll prove in the solution to Problem 5.)
This approximate formula for $\delta$ also tells us that the discrepancy between $\theta_0$ and $\theta$ is
greatest when $R$ is small (bandwidth large) and $\theta$ is near 0 or $\pi$ radians per sample.

4. Setting $\phi = \theta$ in Eq. 5.2 gives us

\[ \begin{aligned}
1/|H(\theta)|^2 &= (1 - R^2)^2 + 8R^2\cos^2\theta - 4R(R^2 + 1)\cos^2\theta \\
&= (1 - R)^2\left(1 + R^2 - 2R\cos(2\theta)\right)
\end{aligned} \]

as required.
To compare this with the gain factor normalizing at the peak resonant frequency, divide
this by the square of the gain factor in Eq. 5.5:

\[ \frac{(1 - R)^2\left(1 + R^2 - 2R\cos(2\theta)\right)}{(1 - R^2)^2\sin^2\theta} \]

Replacing $(1 - R^2)^2$ by $(1 - R)^2(1 + R)^2$ in the denominator enables us to cancel factors
$(1 - R)^2$, and then letting $R \to 1$ shows that this ratio approaches one.
5. Assume that $0 < R < 1$, and note that

\[ (1 - R)^2 = 1 - 2R + R^2 > 0 \]

from which it follows that

\[ \frac{1 + R^2}{2R} > 1 \]
Thus, if we solve for cos θ in Eq. 5.3, it will be determined by a factor less than one times
cos ψ, and there will always be a solution for θ. On the other hand, there are ranges of
R and cos θ where the equation yields cos ψ > 1, and this means there is no true peak
frequency ψ.
6. In computer music, we might want to make a note by using a reson to filter wideband
noise (from a random number generator, for example). If the bandwidth of the reson is a
small fraction of an octave, the resulting note will have a definite pitch. A good measure
of how well the pitch will be perceived is the bandwidth of the reson relative to its center
frequency — in other words, the normalized bandwidth Q = ∆ω/ω0 , where ∆ω is the
bandwidth and ω0 is the center frequency, both in radians per sample. To use a reson
in this way over a range of pitches we usually hold the Q constant, which means making
the bandwidth proportional to the center frequency. Constant-Q bandpass filters of all
sorts are commonly used in many other applications, such as audio equalizers and spectrum
analyzers.
On the other hand, if we are using a reson (or any bandpass filter) to analyze or syn-
thesize in a narrow range, it may not be worth the trouble to vary the bandwidth as the
center frequency changes. It’s also very convenient in many cases to have the bandwidths
of a filter bank equal, rather than being geometrically scaled with center frequency. The
output energy of each filter in the filter bank is then proportional to the energy of the signal
at that frequency, and there is no need to scale.
Finally, there are many situations where we naturally want to vary the bandwidth of a
note produced with a reson, or vary the selectivity of a reson used for analysis.
7. (a) The transfer function of reson_R is

\[ H(z) = \frac{1 - Rz^{-2}}{1 - 2R\cos\theta\, z^{-1} + R^2 z^{-2}} \]

(b) The corresponding input-output equation is

\[ y_t = x_t - Rx_{t-2} + 2R\cos\theta\, y_{t-1} - R^2 y_{t-2} \]

(c) To find the magnitude response at the frequency corresponding to $\theta$ radians per sample,
write $H(z)$ explicitly in terms of its poles:

\[ H(z) = \frac{1 - Rz^{-2}}{\left(1 - Re^{j\theta}z^{-1}\right)\left(1 - Re^{-j\theta}z^{-1}\right)} \]

and set $z = e^{j\theta}$:

\[ H(\omega) = \frac{1 - Re^{-2j\theta}}{(1 - R)\left(1 - Re^{-2j\theta}\right)} = \frac{1}{1 - R} \]

8. (a) The transfer function of reson_z is given in Eq. 7.1:

\[ H(z) = \frac{1 - z^{-2}}{1 - 2R\cos\theta\, z^{-1} + R^2 z^{-2}} \]

Let $y = \cos\psi$ and $x = \cos\theta$, for convenience. Then the magnitude-square of the numerator
is (after some algebra)

\[ |N(\omega)|^2 = 4\left(1 - y^2\right) \]

and, after further simplification, the magnitude-square of the denominator is

\[ |D(\omega)|^2 = (1 - R^2)^2 + 4R^2(x^2 + y^2) - 4Rxy(1 + R^2) \]

Setting $d(|D|^2/|N|^2)/dy = 0$ yields (again, after some algebra) the quadratic equation in $y$

\[ y^2 - y\left[\frac{(1 + R^2)^2 + 4R^2x^2}{2Rx(1 + R^2)}\right] + 1 = 0 \]

which factors into

\[ \left(y - \frac{1 + R^2}{2Rx}\right)\left(y - \frac{2Rx}{1 + R^2}\right) = 0 \]

The two roots are reciprocal, and the first is always greater than 1 in magnitude, so choose
the second, which results in

\[ \cos\psi = \frac{2R}{1 + R^2}\cos\theta \]

Isn’t it interesting that the ratio $\cos\psi/\cos\theta$ is the reciprocal of that for reson (cf. Eq. 5.3)?

(b) The gain of reson_z at its true peak frequency can now be obtained by substituting
$y = 2Rx/(1 + R^2)$ into the square-magnitude of the transfer function. Using the expressions
for $|N|^2$ and $|D|^2$, we get (more algebra!)

\[ \frac{|N|^2}{|D|^2} = \frac{4}{(1 - R^2)^2} \]

which is independent of $\theta$, as advertised.

(c) Reson_z not only has a peak gain independent of center frequency but also requires one
fewer multiplication per sample.
9. A feedback filter can turn out to have an impulse response of finite duration if all the
poles are canceled by zeros. Here’s the transfer function for a simple example:

\[ H(z) = \frac{(1 - z^{-1})(1 - 0.5z^{-1})}{1 - 0.5z^{-1}} = \frac{1 - 1.5z^{-1} + 0.5z^{-2}}{1 - 0.5z^{-1}} \]

which corresponds to the input-output equation

\[ y_t = x_t - 1.5x_{t-1} + 0.5x_{t-2} + 0.5y_{t-1} \]

which is a feedback filter. But the impulse response is zero for $t \ge 2$. This becomes
clear when you cancel the pole and zero at $z = 0.5$; the transfer function is also simply
$H(z) = 1 - z^{-1}$.
The only way the impulse response can be of infinite duration without feedback is to
have an infinite number of feedforward terms in a feedforward filter.
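Running the feedback equation of this example on a unit impulse shows the cancellation directly. A minimal sketch follows; the function name impulse_sample is hypothetical.

```c
/* Sample n of the impulse response of
   y_t = x_t - 1.5 x_{t-1} + 0.5 x_{t-2} + 0.5 y_{t-1}. */
double impulse_sample(int n)
{
    double x_prev1 = 0.0, x_prev2 = 0.0, y_prev = 0.0;
    double y = 0.0;
    for (int t = 0; t <= n; t++) {
        double x = (t == 0) ? 1.0 : 0.0;   /* unit impulse */
        y = x - 1.5 * x_prev1 + 0.5 * x_prev2 + 0.5 * y_prev;
        x_prev2 = x_prev1;
        x_prev1 = x;
        y_prev = y;
    }
    return y;
}
```

The first two samples are 1 and -1, matching $1 - z^{-1}$, and every later sample is zero even though the recursion uses the fed-back term.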
10. What’s missing from the proof is showing that the cumulative effect due to a single pole
and all the input samples at an arbitrary time is bounded. To do this we need to assume
something about the input. Here’s a standard argument.
Suppose the magnitude of the input signal $x_k$ is bounded by some constant $X_{\max}$, and
consider the response $y_t$ at an arbitrary time $t$ due to a pole $p$ of a filter:

\[ y_t = \sum_{k=0}^{t} x_k\, p^{t-k} \]

This comes directly from Eq. 2.6, with each impulse $\delta_{t-k}$ replaced by the response to that
impulse, $p^{t-k}$. Take the magnitude of $y_t$ and use the fact that the magnitude of a sum is
no larger than the sum of magnitudes:

\[ |y_t| = \left|\sum_{k=0}^{t} x_k\, p^{t-k}\right| \le \sum_{k=0}^{t} |x_k|\, |p|^{t-k} \]

Next, use the assumption that $|x_k| \le X_{\max}$, and bring $X_{\max}$ outside the summation:

\[ |y_t| \le X_{\max} \sum_{k=0}^{t} |p|^{t-k} \le \frac{X_{\max}}{1 - |p|} \]

where we assume $|p| < 1$ and bound the finite geometric series by the infinite geometric
series. This shows that the output signal is bounded in magnitude when the pole $p$ is less
than 1 in magnitude, assuming also that the input signal is bounded in magnitude, which
fills the gap in the proof.
Chapter 6

Comb and String Filters

1. Let
\[ x_t = \begin{cases} R^t & \text{if } t = 0 \bmod L \\ 0 & \text{otherwise} \end{cases} \]
for all $t$, both negative and nonnegative. The output of the inverse comb filter, by Eq. 1.1,
is
\[ y_t = x_t - R^L x_{t-L} = 0 \]
for all $t$, so the output $w_t$ of the following comb filter is also zero for all $t$. Thus, in this
example $w_t \ne x_t$.
The apparent difficulty in using transfer functions and z-transforms here stems from the
fact that the z-transform of a two-sided signal like this particular $x_t$ has two parts, one for
negative $t$ and one for nonnegative $t$, and the two transforms have no common region of
convergence. This jumps way ahead, beyond the scope of the book, in fact.
3. By Eq. 3.4, the lowpass filter in the feedback loop has magnitude response
$|\cos(\omega/2)|$, and the loop has a total delay of $L + 1/2$ samples. After $k$ samples, a signal makes $k/(L + 1/2)$ trips
around the feedback loop, and therefore a phasor component at frequency $\omega$ radians per
sample has its magnitude multiplied by

\[ |\cos(\omega/2)|^{k/(L+1/2)} \]

assuming $R = 1$. The magnitude of the contribution of every frequency is decreased with
each round-trip, and so the overall plucked-string filter is stable.
4. (a) To find the time constant $k$ in samples for general $R$, set

\[ |R\cos(\omega/2)|^{k/(L+1/2)} = e^{-1} \]

and solve for $k$ by taking the natural logarithm of both sides:

\[ k \approx \frac{L + 1/2}{-\ln\left(R\,|\cos(\omega/2)|\right)} \text{ samples} \]

(b) If the frequency of a resonance is $f$ Hz, the frequency in radians per sample is $\omega =
(f/f_s)2\pi$, where $f_s$ is the sampling frequency in Hz. The fundamental frequency of the
plucked-string filter is $f_0 = f_s/(L + 1/2)$ Hz, so the time constant estimate in Part (a) can
be written
\[ k \approx \frac{f_s/f_0}{-\ln\left(R\,|\cos(\pi f/f_s)|\right)} \text{ samples} \]
Let $\tau_0 = 1/f_0$, the fundamental period in sec; $T_s = 1/f_s$, the sampling interval in sec;
and $f = nf_0$, so we are dealing with the $n$th partial. Then the time constant estimate
becomes
\[ \tau = kT_s \approx \frac{\tau_0}{-\ln\left(R\,|\cos\left(\pi n/(L + 1/2)\right)|\right)} \text{ sec} \]

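5. The filter equation is

\[ y_t = a(x_t - y_{t-1}) + x_{t-1} \]

and the corresponding signal flowgraph is shown in Fig. P5.

[Figure P5: signal flowgraph from $x_t$ to $y_t$ with two unit delays ($z^{-1}$), two summing nodes, and one multiplier $a$.]

Fig. P5 Signal flowgraph for an allpass filter that uses one multiplication.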
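The one-multiplication form of the filter equation can be sketched in a few lines. The struct Allpass1 and both function names are hypothetical. As a sanity check, the energy of the impulse response of an allpass filter should equal 1 (by Parseval's theorem, since the magnitude response is 1 everywhere), and the second function measures that.

```c
/* One-multiply first-order allpass: y_t = a (x_t - y_{t-1}) + x_{t-1}. */
typedef struct { double a, x_prev, y_prev; } Allpass1;

double allpass1_tick(Allpass1 *f, double x)
{
    double y = f->a * (x - f->y_prev) + f->x_prev;  /* the single multiplication */
    f->x_prev = x;
    f->y_prev = y;
    return y;
}

/* Energy of the first n samples of the impulse response;
   tends to 1 as n grows when |a| < 1. */
double allpass1_impulse_energy(double a, int n)
{
    Allpass1 f = { a, 0.0, 0.0 };
    double e = 0.0;
    for (int t = 0; t < n; t++) {
        double y = allpass1_tick(&f, t == 0 ? 1.0 : 0.0);
        e += y * y;
    }
    return e;
}
```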
6. The low-frequency delay $\delta$ of an allpass filter is, by Eq. 7.9, $(1 - a)/(1 + a)$. When this
is negative, either $a > 1$ or $a < -1$, so $|a| > 1$ and the filter is unstable.
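7. Letting $x = \omega_0/2$, $y = \omega_0\delta/2$, and taking the tangent of both sides of Eq. 8.1 yields
\[ \tan y = \left(\frac{1 - a}{1 + a}\right)\tan x \]
Solve for $a$:
\[ \begin{aligned}
a &= \frac{\tan x - \tan y}{\tan x + \tan y} \\
&= \frac{\sin x\cos y - \sin y\cos x}{\sin x\cos y + \sin y\cos x} \\
&= \frac{\sin(x - y)}{\sin(x + y)}
\end{aligned} \]
which is Eq. 8.2.
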
8. The proof is perfectly analogous to the one given in Section 7 for the one-pole, one-zero
case. Multiply numerator and denominator of the transfer function by $z^{n/2}$:

\[ H(z) = \frac{a_0 z^{-n/2} + a_1 z^{-n/2+1} + \cdots + a_n z^{n/2}}{a_0 z^{n/2} + a_1 z^{n/2-1} + \cdots + a_n z^{-n/2}} \]

When $z = e^{j\omega}$, the numerator is the complex conjugate of the denominator, so the magnitude
response can be written in the form

\[ |H(\omega)| = \left|\frac{re^{j\psi(\omega)}}{re^{-j\psi(\omega)}}\right| = 1 \]

which shows that the filter is allpass.


9. (a) To show that the second-order allpass filter is stable when $\mu > 1$, we need to show
that in that range, its poles, the roots of

\[ z^2 + bz + a = 0 \]

are inside the unit circle in the z-plane, where

\[ b = 2\,\frac{2 - \mu}{1 + \mu} \quad\text{and}\quad a = \frac{(2 - \mu)(1 - \mu)}{(2 + \mu)(1 + \mu)} \]

The discriminant of this quadratic equation is

\[ \frac{b^2}{4} - a = \frac{3(2 - \mu)}{(1 + \mu)^2(2 + \mu)} \]

Therefore, the roots occur in a complex pair when $\mu > 2$. In this case the coefficient $a$ is
the squared radius $\rho^2$ of the complex roots, and

\[ \rho^2 = \left(\frac{\mu - 2}{\mu + 2}\right)\left(\frac{\mu - 1}{\mu + 1}\right) < 1 \quad\text{for } \mu \ge 2 \]

When $\mu = 2$, the roots are double at $z = 0$. We have left to consider the case $1 < \mu < 2$,
when the roots are real and opposite in sign, since the discriminant is positive and the
product of roots, $a$, is negative. The center of gravity of the roots is
\[ -\frac{b}{2} = \frac{\mu - 2}{\mu + 1} \]

which moves from $z = -1/2$ to $z = 0$ as $\mu$ moves from 1 to 2. The roots are obtained by
adding and subtracting to $-b/2$ the square root of the discriminant:
\[ \sqrt{\frac{b^2}{4} - a} = \frac{\sqrt{3}}{1 + \mu}\sqrt{\frac{2 - \mu}{2 + \mu}} \]

When $\mu = 1$, this square root takes the value 1/2, and decreases monotonically to 0 as $\mu$
increases to 2. Thus the roots satisfy
\[ -1 < -\frac{b}{2} \pm \sqrt{\frac{b^2}{4} - a} < 1 \quad\text{for } 1 < \mu < 2 \]
which is what we set out to show.
(b) Figure P9 shows a comparison of computed delay for the first- and second-order allpass
filters. The second-order filter provides a uniformly better approximation to constant delay
at all frequencies. Whether it’s worth using in the plucked-string instrument depends on
how important it is to have accurate approximations to the pitch of the higher harmonics.

[Figure P9: delay in samples (0 to 2.0) plotted against frequency in fractions of the sampling rate (0 to 0.5).]

Fig. P9 Computed delay vs. frequency for the first- and second-order allpass filters. The bottom
ten curves show the actual delay in the first-order case when the desired delay = 0.1, 0.2, . . . , 1.0,
and the top ten curves show the actual delay in the second-order case when the desired delay
= 1.1, 1.2, . . . , 2.0.

10. The transfer function of an allpass filter with the specified poles and zeros can be
written
\[ H(z) = \left[\frac{z^{-1} + ae^{-j\omega_c}}{1 + ae^{j\omega_c}z^{-1}}\right]\left[\frac{z^{-1} + ae^{j\omega_c}}{1 + ae^{-j\omega_c}z^{-1}}\right] \]

For the poles and zeros to occur at angles $\pm\omega_c$, choose $a < 0$. The first factor then
corresponds to a pole and zero at angle $\omega_c$, and the second factor corresponds to a pole and
zero at angle $-\omega_c$. Notice that the result in Problem 8 confirms that this is allpass because
it can be put in the form
\[ H(z) = \frac{z^{-2} + (2a\cos\omega_c)z^{-1} + a^2}{1 + (2a\cos\omega_c)z^{-1} + a^2z^{-2}} \]
To examine the phase response in the range of frequencies near $\omega_c$, let $\omega = \omega_c + \epsilon$ and
$z = e^{j\omega}$ in the first form above:
\[ H(\omega) = \left[\frac{e^{-j\epsilon} + a}{1 + ae^{-j\epsilon}}\right] G(\omega) \]
The factor $G(\omega)$ corresponds to the pole and zero at $-\omega_c$ and is allpass. Let’s assume
that $\omega_c$ is sufficiently far from zero frequency so that $G(\omega)$ is essentially constant in the
neighborhood of $\omega_c$. The factor $G(\omega)$ then contributes a constant phase shift to the total
result, say the angle $\phi_0$. The first factor is identical in form to Eq. 7.2, and therefore
contributes the phase given by Eqs. 7.7 and 7.8. The total phase of the two-pole, two-zero
allpass at frequency $\omega_c + \epsilon$ is therefore
\[ -2\arctan\left[\frac{1 - a}{1 + a}\tan(\epsilon/2)\right] + \phi_0 \approx -\frac{1 - a}{1 + a}\,\epsilon + \phi_0 \]
which is what we wanted to show.
11. As the signal circulates in the feedback loop, its low-frequency components are decreased
relative to its high-frequency components, the opposite of what happens when the filter is
lowpass. Thus, the spectrum of the output signal evolves in the opposite way: the high
frequencies are sustained for a long time and the low frequencies decay fast. To me it sounds
metallic, somewhat reminiscent of the clang tone of a tuning fork.
12. (a) When $b = 1$ the filter equation reduces to
\[ y_t = \tfrac{1}{2}\, y_{t-L} + \tfrac{1}{2}\, y_{t-L-1} \]
which corresponds to the transfer function
\[ H(z) = \frac{1}{1 - \frac{1}{2}\left(1 + z^{-1}\right)z^{-L}} \]

This differs from Eq. 5.3 for the plucked-string filter only in having the lowpass filter in the
feedback path instead of the feedforward path, and having the original comb pole radius
$R = 1$ (see the signal flowgraph in Fig. P12).
(b) Setting $b = 0$ reverses the sign of the feedback terms, thus shifting the poles so that only
odd harmonics are present. This sounds tube-like by analogy with the vibrating column of
air in a half-open tube (see Section 2.7).

[Figure P12: signal flowgraph in which the input $X_t$ enters a summing node that produces the output $Y_t$; the feedback path contains the blocks $(1 + z^{-1})/2$ and $z^{-L}$.]

Fig. P12 Signal flowgraph for the case b = 1 in the Karplus-Strong drum-like instrument.

(c) When $b = 1/2$, the filter equation is

\[ y_t = \tfrac{1}{2}\, y_{t-L} + \tfrac{1}{2}\, y_{t-L-1} \]
Following the approximate analysis of Karplus and Strong, square this:
\[ y_t^2 = \tfrac{1}{4}\, y_{t-L}^2 + \tfrac{1}{4}\, y_{t-L-1}^2 + \tfrac{1}{2}\, y_{t-L}\, y_{t-L-1} \]
and take the expected value term by term. The expected value of the last term on the
right-hand side is approximately zero, since $y_{t-L}$ and $y_{t-L-1}$ are close to being uncorrelated.
Denoting the expected value of the squared signal by $Y_t = E[y_t^2]$, we get
\[ Y_t \approx \tfrac{1}{4}\, Y_{t-L} + \tfrac{1}{4}\, Y_{t-L-1} \]
The two terms on the right-hand side are approximately equal, assuming that the amplitude
of the filter output decays reasonably slowly. We can therefore write the following
approximate difference equation for $Y_t$:
\[ Y_t \approx \tfrac{1}{2}\, Y_{t-L} \]
which implies that
\[ Y_t \approx Y_0\, 2^{-t/L} \]
showing, heuristically at least, that the mean-square value of the filter’s impulse response
does decay exponentially.
13. Plucking a string at a particular point suppresses excitation of the harmonics that have
nodes there, the opposite effect of putting your finger there while the string is vibrating
(see Problem 8 of Chapter 2). Thus, as Jaffe and Smith point out, plucking a string 1/kth
of the way along its length suppresses every kth harmonic. This corresponds to filtering the
excitation signal with the inverse comb filter with equation
\[ y_t = x_t - x_{t-N/k} \]
which has the transfer function
\[ 1 - z^{-N/k} \]
and therefore has zeros at integer multiples of $k2\pi/N$ radians per sample.
Chapter 7

Periodic Sounds

1. Method 1: In the two-dimensional case we can prove the result using complex numbers.
This is a good way to give the class some practice. Let the vector $v = v_x + jv_y$ and let
$w = w_x + jw_y$. Then
\[ \langle v, w \rangle = \mathrm{Re}\{vw^*\} = v_x w_x + v_y w_y \]
But if we write $v$ and $w$ in polar form as $R_v e^{j\theta_v}$ and $R_w e^{j\theta_w}$, respectively,

\[ \langle v, w \rangle = \mathrm{Re}\{vw^*\} = R_v R_w \cos(\theta_v - \theta_w) \]

which is the desired result.

Method 2: Here’s a more geometric approach, using the law of cosines. Define the vector
$c = v - w$; that is, $c$ is the vector with tail at the end of $w$ and head at the end of $v$. Denote
the lengths of $v$, $w$, and $c$ by $R_v$, $R_w$, and $R_c$, respectively. Then the law of cosines states
that
\[ R_c^2 = R_v^2 + R_w^2 - 2R_v R_w \cos(\theta_v - \theta_w) \]
so
\[ R_v R_w \cos(\theta_v - \theta_w) = \frac{1}{2}\left(R_v^2 + R_w^2 - R_c^2\right) \]
Using

\[ R_v^2 = v_x^2 + v_y^2 \]

\[ R_w^2 = w_x^2 + w_y^2 \]

and
\[ R_c^2 = (v_x - w_x)^2 + (v_y - w_y)^2 \]
yields
\[ R_v R_w \cos(\theta_v - \theta_w) = v_x w_x + v_y w_y \]
as required.

2. Method 2 in the previous solution works in any number of dimensions with very little
change. In three dimensions, for example, define c in the same way, apply the law of cosines
in the plane determined by v and w, and use

\[ R_c^2 = (v_x - w_x)^2 + (v_y - w_y)^2 + (v_z - w_z)^2 \]

where z is the third dimension.


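3. First note that $c_n = 0$ when $n = 0$, because the two integrals in Eq. 3.1 cancel out.
Evaluating the integrals in Eq. 3.1 when $n \ne 0$ yields
\[ \begin{aligned}
c_n &= \frac{1}{T}\left[\int_0^{T/2} e^{-jn\omega_0 t}\,dt - \int_{T/2}^{T} e^{-jn\omega_0 t}\,dt\right] \\
&= \frac{1}{T}\left[\left.\frac{e^{-jn\omega_0 t}}{-jn\omega_0}\right|_0^{T/2} - \left.\frac{e^{-jn\omega_0 t}}{-jn\omega_0}\right|_{T/2}^{T}\right] \\
&= \frac{1}{-jn2\pi}\left[e^{-jn\pi} - 1 - e^{-jn2\pi} + e^{-jn\pi}\right]
\end{aligned} \]
where we used the fact that $\omega_0 T = 2\pi$. This is zero when $n \ne 0$ is even. When $n$ is odd,
the bracketed expression becomes $-4$, and so
\[ c_n = \frac{-4}{-jn2\pi} = -\frac{2j}{n\pi} \]
verifying Eq. 3.2.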
4. In general, we can state: Any symmetry of the waveform must be respected by every
nonzero term in its Fourier series. This is a consequence of the orthogonality of the basis
sinusoids. No basis element can be expressed as a linear combination of other basis elements,
and therefore no term that destroys a symmetry can be canceled out by a sum of others in
the series.
As another example, the triangle wave in Fig. 4.1 is even, and odd about every quarter-
period point. Therefore, its Fourier series contains only cosine terms, and only the ones
that are odd about the quarter-period points, the odd harmonics.
See also Section 2 and Problem 8 of Chapter 3 for similar observations. That symmetry
implies that certain coefficients are zero can also be argued directly from their explicit
formula in terms of the waveform, the projection in Eq. 2.5.
5. The $n$th coefficient is, from Eq. 2.5,
\[ c_n = \frac{1}{T}\int_0^{\alpha} e^{-jn\omega_s t}\,dt \]

When $n = 0$, $c_n = \alpha/T$. When $n \ne 0$,

\[ c_n = \left.\frac{e^{-jn\omega_s t}}{-jn\omega_s T}\right|_0^{\alpha} = \frac{1 - e^{-jn\omega_s\alpha}}{jn\omega_s T} = \frac{1 - e^{-jn\omega_s\alpha}}{jn2\pi} \]

where we used the definition $\omega_s = 2\pi/T$ in the last step. The magnitude of $c_n$ for $n > 0$ is
therefore
\[ |c_n| = \frac{\left|e^{jn\omega_s\alpha/2} - e^{-jn\omega_s\alpha/2}\right|}{2\pi n} = \frac{|\sin(n\omega_s\alpha/2)|}{\pi n} = \frac{|\sin(n\pi\alpha/T)|}{\pi n} \]

In the case $\alpha = T/2$, $c_0 = 1/2$, the average value of the waveform; $c_n = 0$ when $n$ is
even; and $c_n = -j/(\pi n)$ when $n$ is odd. This checks the Fourier series for the square wave,
Eq. 3.2, since this case corresponds to the square wave in Fig. 3.1 multiplied by 1/2 and
shifted upwards by adding 1/2.
6. Figure P6 shows the cases α/T = 0.5, 0.25, and 0.1. What is actually shown is
sin (nπα/T ) /(πn), which does not include a complex exponential factor, but which re-
tains the sign. This makes it easier to see how the coefficients behave. For large α/T , the
coefficients become small fast; and for small α/T , the coefficients decay slowly. This is an
example of a general phenomenon we note at several points: A narrow waveform has a
broad spectrum, and vice-versa.
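7. When $\omega t = m2\pi$, an integer multiple of $2\pi$, the sum in Eq. 7.1 becomes simply the sum
of the constant 1, and is therefore equal to $2N + 1$.
When $\omega t \ne m2\pi$,
\[ \begin{aligned}
\sum_{n=-N}^{N} e^{jn\omega t} &= e^{-jN\omega t}\sum_{n=0}^{2N} e^{jn\omega t} = e^{-jN\omega t}\left[\frac{1 - e^{j(2N+1)\omega t}}{1 - e^{j\omega t}}\right] \\
&= \frac{e^{-j(2N+1)\omega t/2} - e^{j(2N+1)\omega t/2}}{e^{-j\omega t/2} - e^{j\omega t/2}} = \frac{\sin((2N + 1)\omega t/2)}{\sin(\omega t/2)}
\end{aligned} \]

as desired.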
8. In this case, $P$ is even, we choose $N = P/2$, and $\omega = 2\pi/P = \pi/N$. Write the left-hand
side of Eq. 7.2 as
\[ 1 + 2\sum_{n=1}^{P/2-1}\cos(n2\pi t/P) + 2(-1)^t \]

Then subtract $(-1)^t$ from both sides, so Eq. 7.2 becomes

\[ 1 + 2\sum_{n=1}^{P/2-1}\cos(n2\pi t/P) + (-1)^t = \begin{cases} \dfrac{\sin((P+1)\pi t/P)}{\sin(\pi t/P)} - (-1)^t & \text{if } t \ne 0 \bmod P \\[1ex] P + 1 - (-1)^t & \text{if } t = 0 \bmod P \end{cases} \]

Expanding the sine in the numerator yields

\[ \begin{aligned}
\frac{\sin((P + 1)\pi t/P)}{\sin(\pi t/P)} &= \frac{\sin(\pi t)\cos(\pi t/P) + \cos(\pi t)\sin(\pi t/P)}{\sin(\pi t/P)} \\
&= \cos(\pi t) = (-1)^t
\end{aligned} \]

and using the fact that $t$ is even in the second case, we finally get
\[ 1 + 2\sum_{n=1}^{P/2-1}\cos(n2\pi t/P) + (-1)^t = \begin{cases} 0 & \text{if } t \ne 0 \bmod P \\ P & \text{if } t = 0 \bmod P \end{cases} \]

which is Eq. 7.4.

[Figure P6: three panels plotting Fourier coefficient against harmonic number, for alpha/T = 0.5, 0.25, and 0.1.]

Fig. P6 The Fourier coefficients of the rectangular wave in Problem 7, for α/T = 0.1, 0.25,
and 0.5. A complex exponential factor is not included, but the sign is retained.


9. Dr. Godfrey Winham implemented a buzz generator in FORTRAN in the late sixties.
Here is Prof. Paul Lansky’s C translation, which he still uses for linear-predictive coding
synthesis. It’s in the form of a function and requires a 1024-point sine table f.

#define ABS(x) ((x < 0) ? (-x) : (x))
#define EPS .1e-06
float buzz(amp,si,hn,f,phs)
float amp,si,hn,*f,*phs;
{
    register j,k;
    float q,d,h2n,h2np1;
    j = *phs;
    k = (j+1) % 1024;
    h2n = 2. * hn;
    h2np1 = h2n + 1.;
    q = (int)((*phs - (float)j) * h2np1)/h2np1;
    d = *(f+j);
    d += (*(f+k)-d)*q;
    if(ABS(d) < EPS) q = amp;
    else {
        k = (long)(h2np1 * *phs) % 1024;
        q = amp * (*(f+k)/d - 1.)/h2n;
    }
    *phs += si;
    while(*phs >= 1024.)
        *phs -= 1024.;
    return(q);
}

As mentioned in the problem statement, the critical point is the choice of the parameter
EPS. When the magnitude of the denominator in Eq. 9.1 is smaller than this value, the
division is skipped and the computed value is set equal to amp.
10. Taking the conjugate of Eq. 1.10, the inner product in the complex case, shows that

\[ \langle v, w \rangle^* = \langle w, v \rangle \]

By the way, sometimes the inner product is defined with the first term conjugated instead
of the second.
Chapter 8

The Discrete Fourier Transform and FFT

1. Let $q = m - n \ne 0$, and write the left-hand side of Eq. 2.4 as the closed form of a
geometric series:
\[ \sum_{t=0}^{N-1} e^{jtq2\pi/N} = \frac{1 - e^{jq2\pi}}{1 - e^{jq2\pi/N}} \]
The denominator of the right-hand side is not zero, because $m$ and $n$ range between 0 and
$N - 1$, and therefore $-N < q < N$. But the numerator is zero, which establishes the
orthogonality result.
2. To avoid the duplicate use of indices, use the dummy variable $s$ instead of $t$ in Eq. 2.6,
and substitute in Eq. 2.5:
\[ x_t = \frac{1}{N}\sum_{k=0}^{N-1}\sum_{s=0}^{N-1} x_s\, e^{jk(t-s)2\pi/N} \]

Interchanging the order of the summations then yields

\[ x_t = \frac{1}{N}\sum_{s=0}^{N-1}\sum_{k=0}^{N-1} x_s\, e^{jk(t-s)2\pi/N} \]

The result in Problem 1 shows that the summation on $k$ equals $N$ if $s = t$ and equals 0
otherwise, thus getting us back to $x_t$, as we wanted.
3. The fact that the inverse DFT exists shows that the matrix $F$ is nonsingular. From Eq.
2.5, the matrix representation of the inverse DFT is
\[ \left[F^{-1}\right]_{kt} = \frac{1}{N}\, e^{j2\pi kt/N} \]
where this notation indicates the element in row $k$ and column $t$ of the matrix $F^{-1}$.

I should have warned the reader that calculating the determinant of $F$ is not a simple
problem. Finding its magnitude is relatively easy, but finding its phase angle is tricky.
Here’s how to find the magnitude of the determinant of $F$. The square of the determinant
of $F$ is the determinant of $R = F^2$, and $R$ has the very simple form derived in Problem 15.
It’s $N$ times a permutation matrix, and so

\[ |\det(R)| = N^N \]

Therefore,
\[ |\det(F)| = N^{N/2} \]
As mentioned above, it turns out that finding the phase angle of $\det(F)$ is much harder
than finding its magnitude. I’ll give the method that, as far as I know, uses the most
elementary techniques. It’s from the paper “Is Computing with the Finite Fourier Transform
Pure or Applied Mathematics?” by L. Auslander and R. Tolimieri, Bulletin of the Amer.
Math. Soc., vol. 1, no. 6, Nov. 1979, pp. 847-897, and I thank Prof. Bradley Dickinson for
pointing out this derivation.
The $k$th row of $F$ is a fixed vector of numbers raised to the $k$th power, $0 \le k < N$, and
that qualifies $\det(F)$ as a Vandermonde determinant, which has the following closed form:
\[ \det(F) = \prod_{0 \le k < t < N} \left(W^t - W^k\right) \]

where $W = e^{-j2\pi/N}$. (See, for example, G. Birkhoff and S. Mac Lane, A Survey of Modern
Algebra, third edition, Macmillan, New York, N.Y., 1965.) Letting $U = e^{-j\pi/N}$, this can be
written
\[ \begin{aligned}
\det(F) &= \prod_{0 \le k < t < N} U^{t+k}\left(U^{t-k} - U^{-(t-k)}\right) \\
&= \prod_{0 \le k < t < N} U^{t+k} \prod_{0 \le k < t < N} (-2j)\sin((t - k)\pi/N)
\end{aligned} \]

The first product can be evaluated using closed forms for the sums in the exponent (see,
for example, L. B. W. Jolley, Summation of Series, second edition, Dover, New York, N.Y.,
1961):
\[ \sum_{0 \le k < t < N} (t + k) = \sum_{t=1}^{N-1}\sum_{k=0}^{t-1} (t + k) = 2N\left(\frac{N - 1}{2}\right)^2 \]

Because $t > k$, the sines in the second product are all positive. Putting all the powers of
$-j$ together yields
\[ \det(F) = (-j)^{(3N-2)(N-1)/2}\, N^{N/2} \]
where we’ve used the magnitude derived above. Finally, use the facts that $j^2 = -1$ and
$j^4 = 1$ to rewrite this in the beautiful form

\[ \det(F) = j^{2+3+\cdots+N}\, N^{N/2} \]

It turns out that the eigenvalues of $F$ are, aside from a factor of $N^{1/2}$, $1, j^2, j^3, \ldots, j^N$,
and had we known that to begin with, the determinant would have followed immediately as
their product. But finding the eigenvalues is even harder than finding the determinant (see,
for example, M. L. Mehta, “Eigenvalues and Eigenvectors of the Finite Fourier Transform,”
J. Math. Phys., vol. 28, no. 4, April 1987, pp. 781-785).
4. (a) The signal $x(t)$ is bandlimited as well as periodic and so can be written as the finite
Fourier series
\[ x(t) = \sum_{k=-N/2}^{N/2} c_k\, e^{jk2\pi t/T} \]

The sum incorporates frequencies between $-1/(2T_s)$ and $1/(2T_s)$ Hz, as specified in the
problem statement. To put this in a form closer to a DFT, replace $k$ by $m - N/2$, and
evaluate at the times $t = nT_s$ for $n = 0, \ldots, N$:
\[ x(nT_s) = e^{-jn\pi}\sum_{m=0}^{N} c_{m-N/2}\, e^{jmn2\pi/N}, \quad\text{for } n = 0, \ldots, N \]

This is an (N + 1)-point DFT and therefore relates the first (N + 1) samples of x(t) to
its Fourier coefficients ck in the usual way. Notice, however, that we must always choose
x(0) = x(N Ts ), because x(t) is assumed to be periodic with period T = N Ts . This puts
one linear constraint on the ck , and can be attributed intuitively to the fact that we are
including frequencies that are actually on the boundary 1/(2Ts ) Hz.
(b) If x(t) is not bandlimited, frequencies above 1/(2Ts ) Hz will be aliased to their images
in the baseband, the frequencies between −1/(2Ts ) and 1/(2Ts ) Hz. This will be reflected in
the values of the Fourier coefficients, which will then become aliased versions of the original
infinite set.
5. The number of stages is the largest integer not less than $\log_2 N$, denoted by $\lceil \log_2 N \rceil$, the
ceiling of $\log_2 N$.
6. I adapted the following C program from the FORTRAN FFT program of Cooley, Lewis,
and Welch, referenced in the Notes. It was used to generate Figs. 9.1 and 9.2.

/* in-place complex fft; adapted from the FORTRAN of
   Cooley, Lewis, and Welch; from Rabiner & Gold (1975) */

#include <stdio.h>
#include <math.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif
#define dB(x) (((fabs(x))>(0.0))?(20.*log10(fabs(x))):(-1000.))
#define verbose 0               /* 2 = print input and output */
#define N 1024                  /* length of transform */
#define M 10                    /* log2 N */

int main(void) {
    FILE *fdata;
    int i, j, k, L;             /* indexes */
    int LE, LE1, ip;
    int NV2, NM1;
    double ar[N], ai[N];        /* array of points */
    double t;                   /* temp */
    double Ur, Ui, Wr, Wi, Tr, Ti;
    double Ur_old;
    double freq_test;
    double mag;

    fdata = fopen("data", "w");
    if (fdata == NULL)
        return 1;               /* could not open the output file */

    NV2 = N/2;
    NM1 = N-1;

    /* generate original signal: a complex phasor at bin freq_test */
    freq_test = 133.;
    for (i=0; i<N; i++) {
        ar[i] = cos(freq_test*2.*M_PI*((double)i/(double)N));
        ai[i] = sin(freq_test*2.*M_PI*((double)i/(double)N));
    }

    if (verbose >= 2) {
        printf("input:\n");
        printf("real: ");
        for (i=0; i<N; i++)
            printf("%6.4lf ", ar[i]);
        printf("\nimag: ");
        for (i=0; i<N; i++)
            printf("%6.4lf ", ai[i]);
        printf("\n\n");
    }

    /* shuffle into bit-reversed order; i and j are 1-based, as in FORTRAN */
    j = 1;
    for (i=1; i<=NM1; i++) {
        if (i<j) {              /* swap a[i] and a[j] */
            t = ar[j-1];
            ar[j-1] = ar[i-1];
            ar[i-1] = t;
            t = ai[j-1];
            ai[j-1] = ai[i-1];
            ai[i-1] = t;
        }
        k = NV2;                /* bit-reversed counter */
        while (k<j) {
            j -= k;
            k /= 2;
        }
        j += k;
    }

    if (verbose >= 2) {
        printf("shuffled input:\n");
        printf("real: ");
        for (i=0; i<N; i++)
            printf("%6.4lf ", ar[i]);
        printf("\nimag: ");
        for (i=0; i<N; i++)
            printf("%6.4lf ", ai[i]);
        printf("\n\n");
    }

    LE = 1;
    for (L=1; L<=M; L++) {      /* stage L */
        LE1 = LE;               /* LE1 = LE/2 */
        LE *= 2;                /* LE = 2^L */
        Ur = 1.0;
        Ui = 0.;
        Wr = cos(M_PI/(double)LE1);
        Wi = -sin(M_PI/(double)LE1);
        /* Cooley, Lewis, and Welch have "+" here */
        for (j=1; j<=LE1; j++) {
            for (i=j; i<=N; i+=LE) {    /* butterfly */
                ip = i+LE1;
                Tr = ar[ip-1]*Ur-ai[ip-1]*Ui;
                Ti = ar[ip-1]*Ui+ai[ip-1]*Ur;
                ar[ip-1] = ar[i-1] - Tr;
                ai[ip-1] = ai[i-1] - Ti;
                ar[i-1] = ar[i-1] + Tr;
                ai[i-1] = ai[i-1] + Ti;
            }                   /* end of butterfly */
            /* advance the twiddle factor U by one step of W */
            Ur_old = Ur;
            Ur = Ur_old*Wr-Ui*Wi;
            Ui = Ur_old*Wi+Ui*Wr;
        }                       /* end of j loop */
    }                           /* end of stage L */

    if (verbose >= 2) {
        printf("output:\n");
        printf("real: ");
        for (i=0; i<N; i++)
            printf("%6.4lf ", ar[i]);
        printf("\nimag: ");
        for (i=0; i<N; i++)
            printf("%6.4lf ", ai[i]);
        printf("\n");
    }

    /* write frequency (fraction of sampling rate), magnitude, and dB */
    for (i=0; i<N; i++) {
        mag = sqrt(ar[i]*ar[i]+ai[i]*ai[i]);
        fprintf(fdata,"%6.4lf %le %lf\n",(double)i/(double)N, mag, dB(mag));
    }

    fclose(fdata);
    return 0;
} /* main */

10. Let F (N ) be the number of function calls in the recursive implementation of the
decimation-in-time FFT, and assume N is an integer power of 2. When the algorithm is
called with parameter N , it does two calls of the same algorithm with parameter N/2.
Therefore,
F (N ) = 2F (N/2) + 2
Repeatedly substituting for F (N/2) results in the telescoped series

F (N ) = 2 (2 (2 (· · · + 1) + 1) + 1)

where the nesting goes down until it reaches F (1) = 0. The sum therefore ranges from 2 to
N , doubling each time:

F (N ) = 2 + 4 + 8 + · · · + N = 2(1 + 2 + 4 + · · · + N/2) = 2(N − 1)

where we’ve used the fact that 1 + 2 + 4 + · · · + N/2 is the binary expansion of (N − 1).
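The recurrence can also be checked mechanically. The following is a minimal C sketch, not part of the original solution; the function name `fft_call_count` is hypothetical, and it simply evaluates F(N) = 2F(N/2) + 2 with F(1) = 0 so the result can be compared with 2(N − 1).

```c
#include <assert.h>

/* Hypothetical helper: number of function calls made below one call of
   size n, following F(N) = 2 F(N/2) + 2 with F(1) = 0. */
long fft_call_count(long n)
{
    assert(n >= 1 && (n & (n - 1)) == 0);   /* n must be a power of 2 */
    if (n == 1)
        return 0;                           /* a length-1 transform makes no calls */
    return 2 * fft_call_count(n / 2) + 2;   /* two calls, plus the calls they make */
}
```

Calling `fft_call_count(N)` for any power of 2 should agree with the closed form 2(N − 1).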
11. Here’s an explicitly recursive decimation-in-time FFT in C. Note that it can be made
more efficient by computing the complex exponentials immediately before the butterfly by
repeated multiplication, as in the solution to Problem 6, rather than from scratch.

/* fft, implemented as direct recursion */

#include <stdio.h>
#include <strings.h>
#include <math.h>
#define M_PI 3.14159265358979323846
#define N 4096 /* length of transform */
#define dB(x) (((fabs(x))>(0.0))?(20.*log10(fabs(x))):(-1000.))
double ar[N], ai[N]; /* array of points to be transformed */

main() {
FILE *fopen(), *fdata;
#define verbose 0 /* 2 = print input and output */
int i; /* index */
double freq_test;
double mag;
fdata = fopen("data", "w");

/* generate original signal */


freq_test = (double)N/(double)8;
for (i=0; i<N; i++)
{ ar[i] = cos(freq_test*2.*M_PI*((double)i/(double)N));
ai[i] = sin(freq_test*2.*M_PI*((double)i/(double)N)); }

if(verbose >= 2)
{ printf("input:\n");
printf("real: ");
for (i=0; i<N; i++)
printf("%6.4lf ", ar[i]);
printf("\nimag: ");
for (i=0; i<N; i++)
printf("%6.4lf ", ai[i]);
printf("\n\n"); }

fft(0, N-1);

if (verbose>=2)
{ printf("output:\n");
printf("real: ");
for(i=0; i<N; i++)
printf("%6.4lf ", ar[i]);
printf("\nimag: ");
for(i=0; i<N; i++)
printf("%6.4lf ", ai[i]);
printf("\n"); }

for(i=0; i<N; i++)


{ mag = sqrt(ar[i]*ar[i]+ai[i]*ai[i]);
fprintf(fdata,"%6.4lf %lf %lf\n",(double)i/(double)N, mag, dB(mag));}

} /* main */

fft(k, m) /* fft of array from k to m */


int k, m;
{
double tr[N], ti[N]; /* temp, for shuffle */
double Wr, Wi, Tr, Ti;
int i, n, n2;

if(verbose>=2)
printf("k= %d m= %d\n", k, m);

n = m-k+1; /* number of points this transform */


if(n==1) return;
n2 = n/2;

if(verbose>=2){
printf("before shuffle:\n");
for(i=k; i<=m; i++)
printf("ar= %le ai= %le\n", ar[i], ai[i]); }

for(i=k; i<k+n2; i++){ /* put even points in first half */


tr[i] = ar[k+2*(i-k)]; /* odd points in second half */
ti[i] = ai[k+2*(i-k)];
tr[i+n2] = ar[k+2*(i-k)+1];
ti[i+n2] = ai[k+2*(i-k)+1]; }

for(i=k; i<=m; i++){


ar[i] = tr[i];
ai[i] = ti[i]; }

if(verbose>=2){
printf("after shuffle:\n");
for(i=k; i<=m; i++)
printf("ar= %le ai= %le\n", ar[i], ai[i]); }

fft(k, k+n2-1);
fft(k+n2, m);

for(i=k; i<k+n2; i++){


Wr = cos(2.*M_PI*(double)(i-k)/(double)n);
Wi = -sin(2.*M_PI*(double)(i-k)/(double)n);
Tr = ar[i+n2]*Wr-ai[i+n2]*Wi;
Ti = ar[i+n2]*Wi+ai[i+n2]*Wr;

tr[i] = ar[i] + Tr; /* butterfly */


ti[i] = ai[i] + Ti;
tr[i+n2] = ar[i] - Tr;
ti[i+n2] = ai[i] - Ti;
ar[i] = tr[i];
ai[i] = ti[i];
ar[i+n2] = tr[i+n2];
ai[i+n2] = ti[i+n2]; }
}

13. This bug is a beginner’s mistake. The variable Ur is changed by the first statement
and therefore the value used in the second statement is not what it should be.
14. All the numerical values needed are contained in a quarter of a period, so we need to
store only N/8 numbers. Of course, some work is required to reduce the argument to the
proper point within the first quarter, and to take the reduction into account in the sign of
the result.
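15. (a) The element in row k and column s of the product matrix $R = F^2$ is

$$[R]_{ks} = \sum_{t=0}^{N-1} [F]_{kt}[F]_{ts} = \sum_{t=0}^{N-1} e^{-jt(k+s)2\pi/N} = \begin{cases} N & \text{if } k + s = 0 \bmod N \\ 0 & \text{otherwise} \end{cases}$$

Thus, the matrix R looks like

$$N \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \end{bmatrix}$$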

(b) When forming the product $R^2$, the kth row times the kth column yields $N^2$, and the
kth row times any other column yields zero. Thus, $R^2 = N^2 I$, where I is the identity
matrix. Multiplying this by $R^{-1}$ shows that $R^{-1} = (1/N^2)R$. We can also conclude that
$F^4 = R^2 = N^2 I$.
(Erratum in first printing: As we see, the desired result is $R^2 = N^2 I$.)
(c) Left-multiply the definition $F^2 = R$ by $F^{-1}$, and then right-multiply by $R^{-1}$. The
result expresses the inverse DFT as $F^{-1} = FR^{-1}$. Thus the inverse FFT can be computed
by first multiplying by the matrix $R^{-1}$, and then applying the forward FFT. From Part
(b), multiplying by $R^{-1}$ is equivalent to scaling by $1/N^2$ and then multiplying by R; and
multiplying by R is equivalent to a simple permutation of the elements of the array to be
transformed.
17. Evaluating the DFT, Eq. 2.6, for the phasor at frequency $(m + 1/2)2\pi/N$ radians per
sample results in

$$X_k = \sum_{t=0}^{N-1} e^{jt(m-k+1/2)2\pi/N}$$

By symmetry, the minimum amplitude occurs at the points closest to the point on the
frequency circle opposite the phasor frequency, the frequency points $k = m + N/2$ and
$k = m + N/2 + 1$. It doesn’t matter which one we look at, so consider the transform at the
first,

$$X_{m+N/2} = \sum_{t=0}^{N-1} e^{jt(-N/2+1/2)2\pi/N} = \sum_{t=0}^{N-1} e^{-jt((N-1)/N)\pi}$$

This is the usual finite geometric series, with closed form

$$\frac{1 - e^{-j\pi N} e^{j\pi}}{1 - e^{-j\pi} e^{j\pi/N}} = \frac{2}{1 + e^{j\pi/N}}$$

assuming that N is a power of 2, and hence even. For large N, the complex exponential in
the denominator is very close to 1, making the value of $X_{m+N/2}$ very close to 1, which is
what we wanted to show.
Chapter 9

The z-Transform and Convolution

1. At low frequencies,

$$z^{-1} = e^{-j\omega} \approx 1 - j\omega$$

and the frequency response is

$$\frac{1}{1 - z^{-1}} \approx \frac{1}{j\omega}$$

This matches the variation of the frequency content of a square wave, whose nth harmonic
is proportional to 1/n (see Eq. 3.5 in Chapter 7, for example).
On the other hand, near the Nyquist frequency, where z = −1,

$$\frac{1}{1 - z^{-1}} \approx \frac{1}{2}$$

and this does not bear any relationship to the frequency content of the square wave.
2. The z-transform of the signal $g_k$ is

$$\mathcal{G}(z) = \sum_{\text{all } k} g_k z^{-k} = \sum_{k \text{ even}} f_{k/2}\, z^{-k} = \sum_{\text{all } k} f_k z^{-2k} = \mathcal{F}(z^2)$$

where F(z) is the z-transform of the original signal fk . When z travels around the frequency
circle once, the spectrum of the signal fk traces out the frequency content of the signal gk
twice. We have effectively squeezed two copies of the spectrum of fk into the baseband.
The way this idea is used in a CD player for oversampled d-to-a conversion is explained in
Section 1 of Chapter 14.
3. The z-transform of $f_k$ is

$$\mathcal{F}(z) = \sum_{k=0}^{\infty} z^{-k}/k! = e^{1/z}$$

since this is precisely the power series for the exponential function.


4. The z-transform of the signal $y_k$ is the transfer function of the feedback digital filter
described by Eq. 6.1 in Chapter 5,

$$\mathcal{Y}(z) = \frac{1}{1 - (2R\cos\theta)z^{-1} + R^2 z^{-2}}$$

as in Eq. 4.2 of Chapter 5.
5. I wrote a simple double-precision C program that implements reson with R = 1 and a
unit impulse initial condition. The reson’s output is compared to the theoretical impulse
response, as given in Problem 4 above, computed using the library sine function. Figure
P5 shows the difference between the reson output and the theoretical response for the two
values θ = 0.0625 and 0.06, in fractions of the sampling rate. The error grows in magnitude
in both cases, but remains centered on zero in the first case and drifts away from zero in
the second case.
There are two causes of the error. First, there is the buildup of roundoff error. Second,
there is the discrepancy between the pole angle that is actually achieved in the reson with
the computed filter coefficient 2 cos(2πθ), and the frequency that is used to compute the
theoretical value. In the first example, it appears that the filter achieves a frequency very
close to the direct calculation, and no drift is observable between the filter output and the
directly computed sinusoid. In the second example, there is a drift, and in fact it dominates
the error.
The figure shows the error for a million samples, which corresponds to more than 22
seconds at the standard audio sampling rate of 44,100 Hz. The maximum error in both
cases is less in magnitude than one part in $10^9$, and seems to grow no faster than linearly.
Thus the method is a practical way to generate sinusoids, at least for times on the order of
a minute at audio frequencies.
6. Below is a vertical slice down a typical column in the long division method applied to
the simple z-transform $1/(1 + az^{-1} + bz^{-2})$:

$$\begin{array}{c} y_t \\ \hline \cdots\; x_t\; \cdots \\ \vdots \\ b\,y_{t-2} \\ \hline x_t - b\,y_{t-2} \\ a\,y_{t-1} \\ \hline x_t - a\,y_{t-1} - b\,y_{t-2} \end{array}$$

In the general case, the tth numerator term will appear where $x_t$ appears in this column,
and successive denominator terms will accumulate to add to the final line, which is precisely
the output of the digital filter in the impulse-response method.
7. I’ll do the work in algebraic rather than numerical form. Write Eq. 8.9 as

$$\frac{z^3}{(z - Re^{j\theta})(z - Re^{-j\theta})(z - 1)} = \frac{Az}{z - 1} + \frac{Bz}{z - Re^{j\theta}} + \frac{B^* z}{z - Re^{-j\theta}}$$
[Figure P5: two plots of error versus time in samples, from 0 to 1e+06. Top panel: theta = 0.0625 sampl. rate. Bottom panel: theta = 0.06 sampl. rate.]

Fig. P5 Error between the impulse response of a reson with poles on the unit circle and the
corresponding sinusoid directly computed. The top and bottom graphs show the cases with
frequencies 0.0625 and 0.06 times the sampling rate, respectively.

Multiplying by (z − 1) and setting z = 1 confirms the result in the text,

$$A = \frac{1}{1 - 2R\cos\theta + R^2}$$

Multiplying by $(z - Re^{j\theta})$ and setting $z = Re^{j\theta}$ results in

$$B = \frac{jRe^{2j\theta}(1 - Re^{-j\theta})}{2\sin\theta\,(R^2 - 2R\cos\theta + 1)}$$

From the first equation, the oscillatory part of the step response is therefore

$$BR^t e^{j\theta t} + B^* R^t e^{-j\theta t} = 2R^t\,\Re\left\{Be^{j\theta t}\right\}$$

The total step response then becomes, after some straightforward algebra,

$$\frac{1}{R^2 - 2R\cos\theta + 1}\left[1 + \frac{R^{t+1}}{\sin\theta}\left(R\sin((t+1)\theta) - \sin((t+2)\theta)\right)\right]$$

From this, the first two values of the step response are 1 and 1 + 2R cos θ, which checks
against long division of the full z-transform, Eq. 8.1 times $1/(1 - z^{-1})$.
8. The z-transform of a unit step signal is

$$\sum_{k=0}^{\infty} z^{-k} = \frac{1}{1 - z^{-1}}$$

Differentiating both sides yields

$$\sum_{k=1}^{\infty} (-k)z^{-k-1} = \frac{-z^{-2}}{(1 - z^{-1})^2}$$

Multiplying by −z and extending the summation to k = 0, which doesn’t affect the result,
gives the desired z-transform, verifying Eq. 9.5:

$$\sum_{k=0}^{\infty} kz^{-k} = \frac{z^{-1}}{(1 - z^{-1})^2}$$

Repeated differentiating can be used to derive the z-transform of the signal $x_k = k^n$,
which will have an (n + 1)st-order pole at z = 1.
9. Differentiate the z-transform of the signal $y_k = k$, k ≥ 0, Eq. 9.5:

$$\sum_{k=0}^{\infty} (-k^2)z^{-k-1} = \frac{-z^{-2} - z^{-3}}{(1 - z^{-1})^3}$$

Multiplying both sides by −z then gives the answer,

$$\sum_{k=0}^{\infty} k^2 z^{-k} = \frac{z^{-1} + z^{-2}}{(1 - z^{-1})^3}$$

10. The idea is to simplify the algebra by setting R = 1 in a known result, deriving a
new transform by differentiating, and then putting R back. The z-transform of the signal
sin(kθ), k ≥ 0 is, setting R = 1 in the result in Table 5.1,

$$\sum_{k=0}^{\infty} \sin(k\theta)z^{-k} = \frac{(\sin\theta)z^{-1}}{1 - (2\cos\theta)z^{-1} + z^{-2}}$$

Differentiating yields, after the usual simplifications,

$$\sum_{k=0}^{\infty} k\sin(k\theta)z^{-k} = \frac{(\sin\theta)(z^{-1} - z^{-3})}{(1 - (2\cos\theta)z^{-1} + z^{-2})^2}$$

We can now weight the time signal at time t by $R^t$, and replace $z^{-1}$ by $Rz^{-1}$ in its transform,
yielding the result

$$\sum_{k=0}^{\infty} kR^k\sin(k\theta)z^{-k} = \frac{(\sin\theta)(Rz^{-1} - R^3 z^{-3})}{(1 - (2\cos\theta)Rz^{-1} + R^2 z^{-2})^2}$$

11. We want to find the coefficients A and B in the partial fraction expansion

$$\mathcal{G}(z) = \frac{A}{1 - pz^{-1}} + \frac{B}{(1 - pz^{-1})^2} + \text{terms for other poles}$$

Since $\mathcal{G}(z)$ is assumed to have a double pole at z = p, write it in the form

$$\mathcal{G}(z) = \frac{\mathcal{H}(z)}{(1 - pz^{-1})^2} + \text{terms for other poles}$$

and rewrite the partial fraction expansion as

$$\frac{z^2\mathcal{H}(z)}{(z - p)^2} = \frac{Az}{z - p} + \frac{Bz^2}{(z - p)^2} + \text{terms for other poles}$$

Multiplying by $(z - p)^2$ and setting z = p yields

$$B = \mathcal{H}(p)$$

To find the coefficient A, multiply the original partial fraction expansion by $(1 - pz^{-1})^2$,

$$\mathcal{H}(z) = A(1 - pz^{-1}) + B + (1 - pz^{-1})^2\,[\text{terms for other poles}]$$

Differentiating with respect to z and setting z = p now gives us

$$A = p\,\mathcal{H}'(p)$$

12. The form we use for partial fraction expansions, including the double-pole version
in Problem 11, can represent ratios of polynomials in $z^{-1}$ in which the numerator has
degree (in $z^{-1}$) strictly smaller than the denominator. When this isn’t the case, divide the
denominator into the numerator, using the usual long-division method, until the remainder
is a power of $z^{-1}$ times a ratio that is in this form. Then expand this ratio in the usual
way. As a simple example,

$$\frac{1 + z^{-1} + z^{-2}}{1 - z^{-1}} = 1 + 2z^{-1} + z^{-2}\,\frac{3}{1 - z^{-1}}$$

13. The result we’re after is the sum of the two partial fraction terms obtained in Problem
11,

$$\frac{p\,\mathcal{H}'(p)}{1 - pz^{-1}} + \frac{\mathcal{H}(p)}{(1 - pz^{-1})^2} = \frac{p\,\mathcal{H}'(p)(1 - pz^{-1}) + \mathcal{H}(p)}{(1 - pz^{-1})^2}$$

where $\mathcal{H}(z)$ is the original ratio of polynomials without an assumed double pole at z = p.
To derive this in the way suggested, we’ll assume there are poles at $z = p_1$ and $z = p_2$
instead of the double pole at z = p, and we’ll let $p_2 \to p_1 = p$. With distinct poles, the two
terms above are replaced by

$$\frac{\mathcal{H}(p_1)}{(1 - p_2 p_1^{-1})(1 - p_1 z^{-1})} + \frac{\mathcal{H}(p_2)}{(1 - p_1 p_2^{-1})(1 - p_2 z^{-1})}$$

Put this over the common denominator $(1 - p_1 z^{-1})(1 - p_2 z^{-1})$. The numerator then becomes

$$\frac{p_2\mathcal{H}(p_2) - p_1\mathcal{H}(p_1)}{p_2 - p_1} - z^{-1}p_1 p_2\left[\frac{\mathcal{H}(p_2) - \mathcal{H}(p_1)}{p_2 - p_1}\right]$$

A bit more simplification leads to

$$p_1\left[\frac{\mathcal{H}(p_2) - \mathcal{H}(p_1)}{p_2 - p_1}\right] + \mathcal{H}(p_2) - z^{-1}p_1 p_2\left[\frac{\mathcal{H}(p_2) - \mathcal{H}(p_1)}{p_2 - p_1}\right]$$

Letting $p_2 \to p_1 = p$ now leads to the desired form, derived at the beginning of this problem.
14. Standard power series are a rich source of such z-transforms. For example,

$$\sum_{k=0,2,4,\ldots} \frac{(-1)^{k/2}}{k!}\,z^{-k} = \cos(1/z)$$

or

$$1 + \frac{1}{2}z^{-2} + \frac{1\cdot 3}{2\cdot 4}z^{-4} + \cdots = \frac{1}{\sqrt{1 - z^{-2}}}$$

which is from L. B. W. Jolley, Summation of Series, second edition, Dover, New York, N.Y.,
1961.
15. (a) The number of pairs of rabbits at the beginning of Month t is just the sum of those
alive at the beginning of Month t − 1, who survive, and those alive at the beginning of
Month t − 2, who give birth to new pairs of rabbits. Thus

rt = rt−1 + rt−2

(b) Interpret rt as the output of a digital filter with the update equation

rt = xt + rt−1 + rt−2

where xt is an input signal equal to the unit pulse. The appropriate initial conditions with
this input signal are r−2 = r−1 = 0, which yields r0 = 1, r1 = 1, r2 = 2, and so on, as
desired. This is a feedback filter, and is unstable, since its output grows without bound
with this bounded input.
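(c) The input signal $x_t$ is a unit pulse, so $\mathcal{X}(z) = 1$. The update equation then gives us the
z-transform

$$\mathcal{R}(z) = \frac{1}{1 - z^{-1} - z^{-2}}$$

(d) The poles are the roots of the denominator,

$$p_{1,2} = \frac{1}{2} \pm \frac{1}{2}\sqrt{5}$$

The plus sign yields a root outside the unit circle, as we expect for an unstable signal. The
partial fraction expansion is

$$\mathcal{R}(z) = \frac{1}{1 - z^{-1} - z^{-2}} = \frac{A}{1 - p_1 z^{-1}} + \frac{B}{1 - p_2 z^{-1}}$$

where

$$A = \frac{p_1}{p_1 - p_2} = \frac{1}{\sqrt{5}}\left(\frac{1}{2} + \frac{1}{2}\sqrt{5}\right)$$

and

$$B = \frac{p_2}{p_2 - p_1} = -\frac{1}{\sqrt{5}}\left(\frac{1}{2} - \frac{1}{2}\sqrt{5}\right)$$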
(e) The inverse z-transform of this partial fraction expansion gives us the following explicit
expression for $r_t$:

$$r_t = Ap_1^t + Bp_2^t = \frac{1}{2^{t+1}\sqrt{5}}\left[\left(1 + \sqrt{5}\right)^{t+1} - \left(1 - \sqrt{5}\right)^{t+1}\right]$$

(f) Assuming no leap year, we want $r_{365}$, the number of pairs of rabbits at the beginning of
Day 365, since we started on Day 0. The second term in the expression for $r_t$ is negligible
compared to the first, so

$$r_{365} \approx \frac{1}{\sqrt{5}}\left(\frac{1 + \sqrt{5}}{2}\right)^{366} = 1.380 \times 10^{76}$$

a fair number of pairs of rabbits.


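16. Substitute the definitions of X(ω) and Y(ω) in the definition of $\langle X, Y\rangle$:

$$\begin{aligned} \langle X, Y\rangle &= \frac{1}{2\pi}\int_{-\pi}^{\pi} X(\omega)Y^*(\omega)\,d\omega \\ &= \frac{1}{2\pi}\int_{-\pi}^{\pi} \sum_{k=-\infty}^{\infty} x_k e^{-jk\omega} \sum_{m=-\infty}^{\infty} y_m^* e^{jm\omega}\,d\omega \\ &= \sum_{k=-\infty}^{\infty}\,\sum_{m=-\infty}^{\infty} x_k y_m^*\,\frac{1}{2\pi}\int_{-\pi}^{\pi} e^{-j(k-m)\omega}\,d\omega \end{aligned}$$

The integral is equal to one if k = m and is zero otherwise, so we get

$$\langle X, Y\rangle = \sum_{k=-\infty}^{\infty} x_k y_k^* = \langle x, y\rangle$$

which we set out to prove.

To verify this for a simple example, choose $x_0 = 1$, $x_1 = 1$, $y_0 = 1$, $y_1 = -1$, and all
other sample values zero. Then $\mathcal{X}(z) = 1 + z^{-1}$, $\mathcal{Y}(z) = 1 - z^{-1}$, and

$$\begin{aligned} \frac{1}{2\pi}\int_{-\pi}^{\pi} X(\omega)Y^*(\omega)\,d\omega &= \frac{1}{2\pi}\int_{-\pi}^{\pi} (1 + e^{-j\omega})(1 - e^{j\omega})\,d\omega \\ &= \frac{1}{2\pi}\int_{-\pi}^{\pi} -2j\sin\omega\,d\omega \\ &= 0 \end{aligned}$$

which checks the time domain inner product

$$\langle x, y\rangle = \sum_{k=-\infty}^{\infty} x_k y_k^* = 0$$
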
When the two signals are equal, Parseval’s theorem states that the integral of the square
of a signal’s magnitude transform is equal to the total energy in the signal.
Chapter 10

Using the FFT

1. We want to show by induction that

$$\sum_{t=0}^{n-1} x^t = \frac{1 - x^n}{1 - x}$$

Therefore, assume this formula holds for all n ≤ N, and consider the case n = N + 1:

$$\sum_{t=0}^{N} x^t = \sum_{t=0}^{N-1} x^t + x^N = \frac{1 - x^N}{1 - x} + x^N = \frac{1 - x^{N+1}}{1 - x}$$

which is what we wish to show.


2. L’Hôpital’s rule states that the limit of the indeterminate ratio can be computed by
taking the limit of the ratio of derivatives, if that limit exists. Thus, to find the limit of

$$\frac{1 - x^n}{1 - x}$$

as x → 1, consider the limit of the ratio of derivatives:

$$\lim_{x\to 1} \frac{-nx^{n-1}}{-1} = n$$

3. Let’s drop the magnitude signs in Eq. 2.4, so that we can deal with the differentiable
function S(θ − ω), where we define

$$S(\phi) = \frac{\sin(n\phi/2)}{\sin(\phi/2)}$$

as in Eq. 5.6. This is valid because we know from the plot in Fig. 2.1 that this function
is positive in the vicinity of φ = 0. We can try to verify that S(φ) has a peak precisely at
the origin by differentiating and setting φ = 0. This results in the indeterminate form 0/0.
Applying L’Hôpital’s rule results in the same indeterminate form. Applying L’Hôpital’s rule
for a second time finally shows that the derivative is zero at φ = 0; this point corresponds
to ω = θ. To be rigorous, we should then check that the second derivative is negative at
φ = 0, but the shape of the curve is clear from the plot in Fig. 2.1.
Here’s an easier argument: The function S(φ) is an even function about the origin, and
is differentiable everywhere. Therefore, if we consider a neighborhood ±ε about the origin,
the function must either increase symmetrically on both sides, or decrease symmetrically
on both sides. That is, the function must have either a local maximum or minimum at that
point. Again, the fact that it has a local maximum can be shown easily by plotting the
function in a small neighborhood of the origin.
4. For the z-transform of a finite stretch of a cosine wave, write the cosine as the sum of
two phasors, and use the closed form in Problem 1:

$$\begin{aligned} \sum_{t=0}^{n-1} \cos(\theta t)z^{-t} &= \frac{1}{2}\sum_{t=0}^{n-1}\left(e^{j\theta t} + e^{-j\theta t}\right)z^{-t} \\ &= \frac{1}{2}\,\frac{1 - e^{jn\theta}z^{-n}}{1 - e^{j\theta}z^{-1}} + \frac{1}{2}\,\frac{1 - e^{-jn\theta}z^{-n}}{1 - e^{-j\theta}z^{-1}} \end{aligned}$$

In general, the frequency content does not peak precisely at the point ω = θ. The
situation is analogous to that of the two-pole reson: the contribution from the negative-frequency
component distorts the contribution from the positive-frequency component. To
see this algebraically, find the frequency content by letting $z = e^{j\omega}$:

$$\begin{aligned} \sum_{t=0}^{n-1} \cos(\theta t)e^{-j\omega t} &= \frac{1}{2}e^{-j(\omega-\theta)(n-1)/2}S(\omega - \theta) + \frac{1}{2}e^{-j(\omega+\theta)(n-1)/2}S(\omega + \theta) \\ &= \frac{1}{2}e^{-j\omega(n-1)/2}\left[e^{j\theta(n-1)/2}S(\omega - \theta) + e^{-j\theta(n-1)/2}S(\omega + \theta)\right] \end{aligned}$$

where S(φ) is defined as in Problem 3. Take the squared magnitude of this, which eliminates
the complex exponential in front. Then replace the complex exponentials that are left by
sines and cosines, and use the sum of the squares of the real and imaginary parts:

$$\left|\sum_{t=0}^{n-1} \cos(\theta t)e^{-j\omega t}\right|^2 = \frac{1}{4}S^2(\omega - \theta) + \frac{1}{4}S^2(\omega + \theta) + \frac{1}{2}\cos(\theta(n-1))\,S(\omega - \theta)S(\omega + \theta)$$

We know from Problem 3 that the $S^2(\omega - \theta)$ term peaks at ω = θ, but what’s left in general
doesn’t.

5. The closed form for a finite geometric series helps us once again. We normalize by the
sum of the window weights, so we need to find

$$\sum_{t=0}^{n-1} h_t = \sum_{t=0}^{n-1}\left(0.54 - 0.46\cos(2\pi t/(n-1))\right) = 0.54n - 0.46\sum_{t=0}^{n-1}\cos(2\pi t/(n-1))$$

Writing the cosine as the real part of a phasor, the sum of cosines can be written

$$\sum_{t=0}^{n-1}\cos(2\pi t/(n-1)) = \Re\left\{\sum_{t=0}^{n-1} e^{j2\pi t/(n-1)}\right\} = \Re\left\{\frac{1 - e^{j2\pi n/(n-1)}}{1 - e^{j2\pi/(n-1)}}\right\}$$

Multiplying numerator and denominator by the conjugate of the denominator makes the
denominator real, and allows us to finish the job of taking the real part:

$$\sum_{t=0}^{n-1}\cos(2\pi t/(n-1)) = \frac{2 - \cos(2\pi n/(n-1)) - \cos(2\pi/(n-1))}{2 - 2\cos(2\pi/(n-1))} = 1$$

which completes the calculation, showing that the sum of the weights of a Hamming window
is 0.54n − 0.46.
You can also derive this result by evaluating $\mathcal{H}(1) = H(0) = \sum_{t=0}^{n-1} h_t$ in Eq. 5.7. The
shifted functions $S(\pm 2\pi/(n-1))$ evaluate to −1.
General comment on Problems 6–8: Notice that I use the symmetric definition of
a Hamming window in Oppenheim and Schafer, Digital Signal Processing, Prentice-Hall,
Englewood Cliffs, N.J., 1975. The window weights have the value 0.08 at each end. This
is slightly different from the definition in T. Saramäki’s Chapter 4 of S. K. Mitra and J. F.
Kaiser (eds.), Handbook for Digital Signal Processing, John Wiley, New York, N.Y., 1993,
where the argument of the cosine is an integer multiple of 2π/n, instead of the 2π/(n − 1) I
use in Eq. 5.1. I adopt similarly symmetric definitions for the other windows, in Problems
6–8. This has a very small effect on the frequency content, noticeable in Fig. P6–8 only near
the Nyquist frequency for the Hann and Blackman windows. To me, asymmetric window
weights are difficult to defend philosophically, but to each his own; as Saramäki points out,
the definitions “differ slightly in the literature.”
6. The derivation of the closed form for the frequency content of the Hann window is
identical to that for the Hamming window, except that both 0.54 and 0.46 are replaced by
0.5. The result is analogous to Eq. 5.7:

$$H(\omega) = e^{-j(n-1)\omega/2}\left[0.5\,S(\omega) + 0.25\,S\!\left(\omega - \frac{2\pi}{n-1}\right) + 0.25\,S\!\left(\omega + \frac{2\pi}{n-1}\right)\right]$$


The sum of weights, for normalization, is 0.5n − 0.5, using the same methods as in Problem
5.
Figure P6–8 shows the frequency content of this window for n = 257, together with the
corresponding plots for the windows in Problems 7 and 8. Figure P6–8 (close-up) shows
close-ups of the frequency content of these windows in the low-frequency region.
The Hann window’s central lobe has a width essentially identical to that of the Hamming
window, but its dominant side lobe is about 15 dB higher (for this value of n). In return,
the Hann window rolls off much faster at high frequencies. It’s evident from the plots that
the Hamming window offers an amazingly good approximation to equiripple behavior, given
its simplicity.

7. The closed form for the frequency content of the Blackman window can be derived in the
same way as that for the Hamming and Hann windows, but in this case two more shifted
versions of the function S(ω) appear. The complex exponential in front induces a factor
of −1 for the versions shifted by 2π/(n − 1), as before, but a factor of +1 for the versions
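shifted by 4π/(n − 1). The result is

$$\begin{aligned} H(\omega) = e^{-j(n-1)\omega/2}\Big[\, &0.42\,S(\omega) + 0.25\,S\!\left(\omega - \frac{2\pi}{n-1}\right) + 0.25\,S\!\left(\omega + \frac{2\pi}{n-1}\right) \\ &+ 0.04\,S\!\left(\omega - \frac{4\pi}{n-1}\right) + 0.04\,S\!\left(\omega + \frac{4\pi}{n-1}\right) \Big] \end{aligned}$$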

For normalization, the sum of weights for the Blackman window is 0.42n − 0.42, cal-
culated in the same way as for the Hamming and Hann windows. The sum of cosines
of the form cos(4π/(n − 1)) turns out to be the same as the sum of cosines of the form
cos(2π/(n − 1)); both are simply one.

8. The Bartlett window doesn’t have cosines in its definition, so the method used for
finding the frequency content of the Hamming, Hann, and Blackman windows doesn’t work.
Instead, we’ll calculate its z-transform directly. For convenience, we’ll assume that n is odd
for the Bartlett window, which means that the triangular waveform peaks at the single
central point, t = (n − 1)/2.
To find the z-transform, think of the triangle wave as the superposition of three linear
functions: a ramp up with slope +1 starting at t = 0; a ramp down starting at t = (n − 1)/2
with slope −2; and a ramp up with slope +1 starting at t = n − 1. For t ≤ n − 1 these three
ramps produce the triangle, and for t > n − 1 these three ramps sum to precisely zero.
The z-transform of the first component ramp is, by Eq. 9.5 of Chapter 9,

$$\mathcal{Y}(z) = \frac{z^{-1}}{(1 - z^{-1})^2}$$
[Figure P6–8: three panels, from top to bottom "Hann and Hamming windows", "Blackman window", and "Bartlett window"; frequency content in dB (0 to −200) versus frequency in fractions of sampling rate (0 to 0.5).]

Fig. P6–8 The frequency content of the Hann, Blackman, and Bartlett windows, from top to
bottom, for n = 257 points. For comparison, the frequency content of the Hamming window is
plotted along with that for the Hann window.

[Figure P6–8 (close-ups): the same three panels; frequency content in dB (0 to −100) versus frequency in fractions of sampling rate (0 to 0.05).]

Fig. P6–8 (close-ups) The central lobe behavior of the Hann, Blackman, and Bartlett windows,
from top to bottom, for n = 257 points. These are close-ups of the low-frequency region in Fig.
P6–8.

The sum of the z-transforms of the three components just described is therefore

$$\begin{aligned} \mathcal{H}(z) &= \mathcal{Y}(z) - 2z^{-(n-1)/2}\,\mathcal{Y}(z) + z^{-(n-1)}\,\mathcal{Y}(z) \\ &= \frac{z^{-1} - 2z^{-(n+1)/2} + z^{-n}}{(1 - z^{-1})^2} \\ &= z^{-1}\left(\frac{1 - z^{-(n-1)/2}}{1 - z^{-1}}\right)^2 \end{aligned}$$

This is, aside from the delay factor of $z^{-1}$, the square of the z-transform of a rectangular
window of length (n − 1)/2 points. Thus the frequency content of the Bartlett window is

$$|H(\omega)| = \left(\frac{\sin((n-1)\omega/4)}{\sin(\omega/2)}\right)^2$$

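To normalize the window, just divide by H(0), which is $((n-1)/2)^2$ for the unit-slope ramps
used here, or (n − 1)/2 if the triangle is scaled to peak at one.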


Of course, this is no accident: The frequency content of the Bartlett window is the square
of the frequency content of an appropriate rectangular window because the convolution of
a rectangle with itself is a triangle. See Problem 10 of Chapter 11.
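The claim that the triangle is a rectangle convolved with itself can be checked sample by sample. The sketch below is illustrative, with a hypothetical function name; it assumes odd n and the unit-slope triangle built from the three ramps above, and compares it with the self-convolution of a length-(n − 1)/2 rectangle delayed by one sample (the $z^{-1}$ factor).

```c
#include <assert.h>

/* Returns 1 if the unit-slope triangle of odd length n equals the
   self-convolution of a rectangle of length (n-1)/2, delayed by one sample. */
int bartlett_is_rect_squared(int n)
{
    int len = (n - 1) / 2;
    int t, k;
    assert(n >= 3 && n % 2 == 1);
    for (t = 0; t < n; t++) {
        int tri = (t <= len) ? t : (n - 1 - t);  /* sum of the three ramps */
        int conv = 0;
        /* (rect * rect)[t-1]: count k with 0 <= k < len and 0 <= t-1-k < len */
        for (k = 0; k < len; k++)
            if (t - 1 - k >= 0 && t - 1 - k < len)
                conv++;
        if (tri != conv)
            return 0;
    }
    return 1;
}
```

For instance, `bartlett_is_rect_squared(257)` tests the window length used in Fig. P6–8.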
9. Multiplying every other sample of a signal $x_t$ by −1 is equivalent to multiplying it by
$(-1)^t$. The z-transform of the resulting signal is therefore

$$\sum_{t=-\infty}^{\infty} (-1)^t x_t z^{-t} = \sum_{t=-\infty}^{\infty} x_t(-z)^{-t} = \mathcal{X}(-z)$$

where X (z) is the z-transform of the original signal. The effect is thus to replace z by −z,
which flips the frequency circle so that the points 0 and −1 in the z-plane trade places.
In other words, the frequency content of the signal is now in reverse order as we trace out
frequencies from 0 to the Nyquist frequency.
Another way of looking at this is that the signal is multiplied by $(-1)^t = e^{j\pi t}$, which
is a phasor. Multiplying by this phasor shifts the frequency axis by π radians per sample,
and therefore slides the negative frequencies up to the positive range, where they appear in
reverse order.
10. The rectangular window has a relatively narrow central lobe, but poor side-lobe rejec-
tion. The signal in this case has a dominant pitch and is narrowband. Therefore, the pitch
of the signal sometimes lines up well with the window central lobe and sometimes does not.
As the signal sweeps in frequency, its pitch passes in and out of the central-lobe region.
When the pitch lines up well with the central lobe, the energy is concentrated at the correct
pitch. When the pitch falls outside the central lobe, the side lobes pick up a relatively large
amount of energy.
To exaggerate the effect further, the spectrogram program has a kind of automatic
gain control, which normalizes the plot so the overall intensity remains constant as time
progresses. So when the central-lobe signal fades out, the side-lobe signal is boosted, further
smearing the plot.
11. The first 9 periods in Fig. 7.6 take about 132 samples, which, at a sampling rate of
22,050 Hz, corresponds to a frequency of 22050/(132/9) = 1503 Hz. The final 8 periods take
about 63 samples, which corresponds to a frequency of 22050/(63/8) = 2800 Hz. This is
reasonably close to a doubling of frequency, and checks reasonably well with the spectrogram
in Fig. 7.4.
Chapter 11

Aliasing and Imaging

1. Denote the continuous-time signal by x(t) and the window by w(t). Windowing the
signal first results in the signal x(t)w(t); sampling then results in x(kTs )w(kTs ), where k
is the sample number and Ts is the sampling interval, as usual. Sampling the signal first
yields the digital signal with samples x(kTs ), which after windowing by the window with
samples of w(t) yields x(kTs )w(kTs ), the same result.
2. The domains omitted correspond to periodic signals, when the time domain is continuous
and finite (circular), and the corresponding transform domain is discrete and infinite. This
is the left-right reversal of the second row in Fig. 1.1.
This case is crucial for the development of signal processing theory — Chapter 7 is
devoted to it. When time and frequency are reversed, it corresponds to digital signals and
their z-transforms. But it doesn’t come up directly in much practical signal processing
because continuous-time signals are usually processed in a continuous, virtually infinite,
stream. When a finite section is processed, it is almost always sampled. The only case I
can think of when a finite segment of a continuous-time signal is processed is the tape loop.
3. There are six pairs of transforms, one for each entry in Fig. 1.1, and each relating a
convolution to its product-form transform or inverse transform. The method of deriving
these pairs is the same in all six cases:

• Write the conjectured form of the convolution.

• Replace the signals or transforms in the convolution by their transforms or inverse
transforms.

• Move the summation or integration corresponding to the convolution to the inside.
That is, do it first. That produces a δ-function, either discrete-time or continuous-time.

• Perform one of the two remaining summations or integrations, using the δ-function.
The result is in the form of a transform or inverse transform, and is the desired result.


This may sound complicated, but it’s actually very natural. We’ll start at the upper left in
Fig. 1.1, and proceed from left to right and top to bottom.
Continuous time, infinite extent, time-domain convolution: The convolution between the
two signals f(t) and g(t) is

$$f(t) \otimes g(t) = \int_{-\infty}^{\infty} f(\tau)g(t - \tau)\,d\tau$$

Replacing the signals by their inverse transforms, F(φ) and G(ω), respectively, yields

$$f(t) \otimes g(t) = \int_{-\infty}^{\infty} \frac{1}{2\pi}\int_{-\infty}^{\infty} F(\phi)e^{j\phi\tau}\,d\phi\;\frac{1}{2\pi}\int_{-\infty}^{\infty} G(\omega)e^{j\omega(t-\tau)}\,d\omega\,d\tau$$

Performing the outside integration first results in the integral

$$\int_{-\infty}^{\infty} e^{j(\phi-\omega)\tau}\,d\tau = 2\pi\delta(\phi - \omega)$$

as promised. To derive this, note that the forward Fourier transform of δ(t) is one. Therefore,
the inverse Fourier transform of one is

$$\delta(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} 1\cdot e^{j\omega t}\,d\omega$$

The next integration, on φ, gives

$$\int_{-\infty}^{\infty} F(\phi)\delta(\phi - \omega)\,d\phi = F(\omega)$$

and what’s left is now

$$f(t) \otimes g(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} F(\omega)G(\omega)e^{j\omega t}\,d\omega$$

The right-hand side is the inverse Fourier transform of F(ω)G(ω), so we have the desired
transform pair

$$f(t) \otimes g(t) \leftrightarrow F(\omega)\cdot G(\omega)$$

Since the method is essentially the same in all cases, I’ll give only the results below.
Continuous time, infinite extent, frequency-domain convolution:

$$F(\omega) \otimes G(\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty} F(\phi)G(\omega - \phi)\,d\phi \leftrightarrow f(t)\cdot g(t)$$

Discrete time, infinite extent, time-domain convolution:

$$f_t \otimes g_t = \sum_{k=-\infty}^{\infty} f_k g_{t-k} \leftrightarrow F(\omega)\cdot G(\omega)$$

Discrete time, infinite extent, frequency-domain convolution:

$$F(\omega) \otimes G(\omega) = \frac{1}{2\pi}\int_{-\pi}^{\pi} F(\phi)G(\omega - \phi)\,d\phi \leftrightarrow f_t\cdot g_t$$

Discrete time, finite extent, time-domain convolution:

$$f_t \otimes g_t = \sum_{k=0}^{N-1} f_k g_{t-k} \leftrightarrow F_k\cdot G_k$$

Discrete time, finite extent, frequency-domain convolution:

$$F_k \otimes G_k = \frac{1}{N}\sum_{m=0}^{N-1} F_m G_{k-m} \leftrightarrow f_t\cdot g_t$$

4. For convenience, denote the constant U(−1) by the constant a. We can assume that
a ≠ 0, because otherwise the property stated would imply the trivial result that U(k) is
zero for all k. Setting k = 0 in the property yields

$$aU(0) = a$$

which tells us that U(0) = 1. Replacing k by k + 1 in the property and dividing by a gives
us the recurrence relation

$$U(k + 1) = a^{-1}U(k)$$

which shows that $U(k) = a^{-k}U(0) = a^{-k}$, which is the desired result with $c = a^{-1}$, for
k ≥ 0. The same argument works in the opposite direction to give the result for k ≤ 0.
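5. I’ll do the z-transform case; the other cases are similar. Suppose the (real) signal $x_k$ is odd. Then $x_0 = 0$ and the z-transform can be written
\[
X(z) = \sum_{k=-\infty}^{-1} x_k z^{-k} + \sum_{k=1}^{\infty} x_k z^{-k}
\]
Now use the fact that $x_k = -x_{-k}$ and let $z = e^{j\omega}$ to rewrite this as
\[
\begin{aligned}
X(\omega) &= -\sum_{k=-\infty}^{-1} x_{-k}\, e^{-jk\omega} + \sum_{k=1}^{\infty} x_k\, e^{-jk\omega} \\
&= \sum_{k=1}^{\infty} x_k \left( e^{-jk\omega} - e^{jk\omega} \right) \\
&= -2j \sum_{k=1}^{\infty} x_k \sin(k\omega)
\end{aligned}
\]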

which is imaginary-valued.
6. (This derivation follows my Chapter 1 of S. K. Mitra and J. F. Kaiser (eds.), Handbook
for Digital Signal Processing, John Wiley, New York, N.Y., 1993.) Start with the digital
signal $x_t$, in general infinite in extent, with z-transform X(z). We sample its frequency
content at frequencies k2π/N, for k = 0, 1, 2, . . . , N − 1, obtaining
\[
X(k2\pi/N) = \sum_{n=-\infty}^{\infty} x_n\, e^{-jkn2\pi/N}, \quad \text{for } k = 0, 1, \ldots, N-1
\]

The question comes down to this: what finite-duration signal has this DFT? Thus, we
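compute the inverse DFT of this, which we call the signal $\tilde{x}_t$:
\[
\begin{aligned}
\tilde{x}_t &= \frac{1}{N} \sum_{k=0}^{N-1} X(k2\pi/N)\, e^{jkt2\pi/N} \\
&= \frac{1}{N} \sum_{k=0}^{N-1} e^{jkt2\pi/N} \sum_{n=-\infty}^{\infty} x_n\, e^{-jkn2\pi/N}
\end{aligned}
\]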

Interchanging the order of summations results in


\[
\tilde{x}_t = \frac{1}{N} \sum_{n=-\infty}^{\infty} x_n \sum_{k=0}^{N-1} e^{jk(t-n)2\pi/N}
\]

The inside summation is our old friend the δ-function, but this time we need to take into
account the fact that the index n extends over an infinite range. The resulting function of
t − n is therefore
\[
\delta(t-n) = \begin{cases} N & \text{if } n = t \bmod N \\ 0 & \text{otherwise} \end{cases}
\]
Hence, the sum picks up $x_t$, plus all values of x that are displaced by integer multiples of
N. The aliased time function that results from sampling the z-transform is therefore
\[
\tilde{x}_t = \sum_{m=-\infty}^{\infty} x_{t+mN}
\]

7. The usual application of the DFT almost always starts with a finite-duration segment of
a digital signal. The signal can therefore be considered zero outside this range, and there
is no time aliasing.
8. The aliasing operation on the transform can be visualized as follows. Wrap a plot of
the signal’s frequency content around a cylinder with circumference 2π radians per sample.
Then add up the values of the frequency content that occur at each point around the
cylinder. The points that get added then correspond to the sum of values displaced by all
integer multiples of 2π.

9. Using the impulse response in Eq. 6.3, we want to evaluate


\[
\lim_{t \to 0} \frac{\sin(\pi t/T)}{\pi t/T}
\]
L’Hôpital’s rule prescribes taking the derivative of the numerator and the denominator, and
evaluating the ratio at t = 0:
\[
\left. \frac{(\pi/T)\cos(\pi t/T)}{\pi/T} \right|_{t=0} = 1
\]

10. (a) The impulse response connects the value zero at t = −T to the value one at t = 0
with a straight line; then it connects the value one at t = 0 to the value zero at t = T
with a straight line. At all other points it connects the value zero to zero, and hence is zero
outside the range −T ≤ t ≤ T .
(b) Consider the convolution of the following rectangular pulse with itself:

\[
h(t) = \begin{cases} 1/T & \text{if } -T/2 \le t \le T/2 \\ 0 & \text{otherwise} \end{cases}
\]

When the two instances of the rectangle are separated by a distance greater than T , they
don’t overlap, and the convolution is zero. As they slide past each other, the overlapping
area increases linearly from the point t = −T until it reaches a maximum at t = 0, when
they coincide. Then the overlapping area decreases symmetrically to zero at t = T . The
maximum overlap is the area of the product when the two instances of the rectangle coincide,
which is $T \cdot (1/T)^2 = 1/T$. The convolution is therefore the triangle described in Part (a),
normalized so its area is one.
(c) The rectangular pulse in Part (b) is the impulse response of a zero-order hold, but
centered at the time origin. Its frequency response is therefore, from Eq. 4.4,
\[
H(\omega) = \frac{\sin(\omega T/2)}{\omega T/2}
\]
The frequency response of the linear-point-connecting hold circuit is the square of this:
\[
H^2(\omega) = \left[ \frac{\sin(\omega T/2)}{\omega T/2} \right]^2
\]
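
As a rough numerical check of Part (b), the sketch below approximates the convolution of the rectangular pulse with itself by a midpoint-rule sum. It is only an illustration: the function names and the step count are assumptions, not anything from the book. The result should be close to 1/T at t = 0, close to 1/(2T) at t = T/2, and zero for |t| > T.

```c
#include <assert.h>

/* Rectangular pulse of width T and height 1/T, centered at the origin. */
double rect_pulse(double t, double T)
{
    return (t >= -T / 2 && t <= T / 2) ? 1.0 / T : 0.0;
}

/* Midpoint-rule approximation of the convolution of the pulse with
   itself at time t, using `steps` subintervals over [-T, T]. */
double rect_self_convolution(double t, double T, int steps)
{
    assert(T > 0.0 && steps > 0);
    double dtau = 2.0 * T / steps;
    double sum = 0.0;
    for (int i = 0; i < steps; i++) {
        double tau = -T + (i + 0.5) * dtau;
        sum += rect_pulse(tau, T) * rect_pulse(t - tau, T) * dtau;
    }
    return sum;
}
```

With T = 2 and 4000 steps, the value at t = 0 should be near 0.5 and the value at t = 1 near 0.25, tracing out the triangle.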

11. Equation 8.3 shows that when alternate samples are thrown away, the new spectrum is
\[
\frac{1}{2} \left( X(\omega) + X(\omega - \omega_s) \right)
\]
where X(ω) is the spectrum of the original signal, and the original sampling rate is 2ωs .
Therefore, frequency ω is confounded with the frequency that differs from it by half the

sampling frequency, in this case, half of 2ωs . For example, the Nyquist frequency appears
identical to zero frequency if every other sample is discarded — which makes sense, since
the samples of a sinusoid at the Nyquist frequency are of the form $(-1)^t$.
12. When only every kth sample is used, the new spectrum is
\[
\frac{1}{k} \sum_{i=0}^{k-1} X(\omega - i\omega_s)
\]

where the original spectrum is X(ω), and the original sampling rate is kωs . This follows in
exactly the same way as the result for k = 2, using the argument between Eqs. 8.1 and 8.3.
If Fig. 8.1 is redrawn for the case when the original sampling frequency is kωs , the
lowpass filter used to prepare for the sampling-rate reduction should still have a cutoff
frequency at ωs /2. This is 1/k times the Nyquist frequency at the point when the filtering
takes place. For example, when k = 2, the filter is half-band.
Chapter 12

Designing Feedforward Filters

1. The story is told nicely in D. M. Burton, The History of Mathematics, an Introduction,
Allyn and Bacon, Inc., Boston, Mass., 1985. I’ll skip the best part: the rivalry and
infighting among the mathematicians of the time.
The problem can be formulated as follows: Express the roots of the general polynomial
of given degree n using only addition, multiplication, subtraction, division, and extraction
of radicals. About 1530, Nicolo Tartaglia provided a solution for the cubic case (n = 3).
The solution for the quartic case (n = 4) was found by Ludovico Ferrari, about ten years
later. The quintic case (n = 5) was the natural next target, but attempts for the next 250
years failed.
Finally, in 1799, Paolo Ruffini published a flawed but basically sound proof that the
solution in the general quintic case is impossible. Niels Henrik Abel provided a rigorous
proof in 1824, and the result is known as the Abel-Ruffini theorem.
2. When the feedforward filter length n is even, the transfer function doesn’t have a center
term, and the derivation of the cosine form needs to be modified a bit. Let’s do the case
when n = 4. The transfer function is

\[
H(z) = a_0 + a_1 z^{-1} + a_2 z^{-2} + a_3 z^{-3}
\]
Factor out a power of z corresponding to the average of the first and last exponents:
\[
H(z) = z^{-3/2} \left[ a_0 z^{3/2} + a_1 z^{1/2} + a_2 z^{-1/2} + a_3 z^{-3/2} \right]
\]
Set $z = e^{j\omega}$ to get the frequency response,
\[
H(\omega) = e^{-3j\omega/2} \left[ a_0 e^{3j\omega/2} + a_1 e^{j\omega/2} + a_2 e^{-j\omega/2} + a_3 e^{-3j\omega/2} \right]
\]
Finally, assume that the coefficients are symmetric, so that $a_0 = a_3$ and $a_1 = a_2$:
\[
H(\omega) = e^{-3j\omega/2} \left[ 2a_0 \cos(3\omega/2) + 2a_1 \cos(\omega/2) \right]
\]


The general form for even n, corresponding to Eq. 2.9 for odd n, is easy to write from
this simple example as

\[
\hat{H}(\omega) = e^{((2m-1)/2)j\omega} H(\omega) = c_1 \cos(\omega/2) + c_2 \cos(3\omega/2) + \cdots + c_m \cos((2m-1)\omega/2)
\]

where now m = n/2, the number of free coefficients.


Feedforward filters with even-symmetry coefficients are often classified as Type I when
n is odd, and Type II when n is even. (See T. Saramäki’s Chapter 4 of S. K. Mitra and
J. F. Kaiser (eds.), Handbook for Digital Signal Processing, John Wiley, New York, N.Y.,
1993.)
3. When the coefficients of a feedforward filter are odd about their center, instead of even,
the terms in the frequency response pair as

\[
a_q e^{jr\omega} - a_q e^{-jr\omega} = 2 a_q\, j \sin(r\omega)
\]

producing a sine series instead of a cosine series. These filters are called Type III and Type
IV, according to whether n is odd or even, respectively.
4. Except for the DC component (an average value of 1/2), the desired frequency response
is odd about the half-band point, ω = π/2 radians per sample, or 0.25 in fractions of the
sampling rate (see Fig. 5.1). For the odd-length, even-symmetry filters considered in this
chapter, the frequency response is of the form given in Eq. 2.9:

\[
\hat{H} = c_0 + c_1 \cos\omega + c_2 \cos(2\omega) + \cdots + c_m \cos(m\omega)
\]

The cosines in this series with arguments 2ω, 4ω, . . . are even about the half-band point,
and so must be absent in the optimal filter design. Thus, c2 = c4 = · · · = 0. In the
METEOR design for the example in Fig. 5.1, which used double precision, the corresponding
coefficients turned out to be less than about $10^{-15}$ in magnitude.
5. This argument is actually very similar to the one in the text using the frequency response
in Fig. 5.3, and is equally heuristic. To make it rigorous amounts to developing part of
linear programming theory.
We want to find a feasible point whose minimum distance to a constraint is as great as
possible. Start at an arbitrary point within the feasibility region shown in Fig. 4.1. Find
out which constraint is closest, and move away from it, until another constraint becomes
equally close. Then try to find a direction that will move us farther from both those
constraints, and that will increase the distance from both constraints equally. If such a
direction is found, move away from the first two constraints, until we become equally close
to a third constraint. Continue this process until we are equally close to a maximum number
of closest constraints. The constraints that we are closest to correspond to the equal ripples
in the frequency response. The more coefficients in the filter, and hence the higher the
dimensionality of the feasible region in Fig. 4.1, the greater the number of such equal
ripples we can find.

6. The measure 20 log10 [(1 + δ)/(1 − δ)] is the ratio of the maximum to the minimum value
of magnitude response in the passband, in dB. In other words, it is the “variation” in the
passband, expressed in dB — which is a natural way to think about deviation from ideal.
When δ is small, it’s small and positive. In fact, for small δ, it is proportional to δ. To see
this, expand (1 + δ)/(1 − δ) in a power series:
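\[
\frac{1+\delta}{1-\delta} = 1 + 2\delta + \text{higher-order terms}
\]
and note that
\[
20 \log_{10} \frac{1+\delta}{1-\delta} \approx 20 \log_{10}(1 + 2\delta) = 20\, \frac{\log_e(1 + 2\delta)}{\log_e(10)} \approx 17.37\,\delta
\]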

for small δ, where we used the fact that loge (1 + x) ≈ x. On the other hand, when δ is
small, the measure 20 log10 δ is negative and large in magnitude.
7. There are 18 ripples in Fig. 5.1, counting the bandedges, two more than m = (n+1)/2 =
16.
8. Filter response curves like the one in Fig. 5.2 are typically plotted by evaluating the
response on a grid of equally spaced points, and those grid points will miss the zeros in the
stopband by small amounts. But the logarithms of the smallest numbers near each zero,
and hence their decibel measures, will in general vary greatly. For example, we may get the
value $10^{-4}$ nearest one zero, and $10^{-5}$ nearest another. These will show up on the plot as
−80 dB and −100 dB, respectively: far apart in dB, even though both numbers are close
to zero.
9. Consider only odd n in this solution; analogous statements can be made for the even
n case. The critical property can be thought of as monotonicity : if F (n) is “infeasible,”
then F (k) is also “infeasible” for all k < n. Thus, if n∗ is the smallest n for which F (n) is
“feasible,” then it is “feasible” for all larger n, as well as “infeasible” for all smaller n. This
ensures that if we find that F (n) is “infeasible,” we can eliminate all smaller n; and if we
find that F (n) is “feasible” we can eliminate all larger n.
The binary search algorithm works by keeping track of two values of n, say left and
right, for which F (left) is “infeasible,” and F (right) is “feasible.” The midpoint is tested,
repeatedly, until left and right are consecutive (consecutive odd numbers in this case).
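
The binary search described above can be sketched in C as follows. This is a minimal illustration under stated assumptions: the feasibility test is passed in as a function pointer, and `toy_is_feasible` is a made-up monotone stand-in for an actual design run, not anything METEOR provides.

```c
#include <assert.h>

typedef int (*feasible_fn)(int n);   /* nonzero means F(n) is "feasible" */

/* Hypothetical stand-in for a real design run: feasible once n >= 31. */
int toy_is_feasible(int n)
{
    return n >= 31;
}

/* Given odd lengths left < right with F(left) "infeasible" and F(right)
   "feasible", return the smallest odd n for which F(n) is "feasible".
   The search works on the index j, where n = 2j + 1. */
int smallest_feasible_odd_length(int left, int right, feasible_fn is_feasible)
{
    assert(left > 0 && left % 2 == 1 && right % 2 == 1 && left < right);
    int lo = (left - 1) / 2;    /* index of an infeasible length */
    int hi = (right - 1) / 2;   /* index of a feasible length */
    while (hi - lo > 1) {
        int mid = lo + (hi - lo) / 2;
        if (is_feasible(2 * mid + 1))
            hi = mid;           /* eliminate all larger n */
        else
            lo = mid;           /* eliminate all smaller n */
    }
    return 2 * hi + 1;          /* lo and hi are now consecutive */
}
```

Monotonicity is what makes each test eliminate half of the remaining candidates, so the number of design runs grows only logarithmically with right − left.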
10. The following is a fragment of C code that uses a circular array to do feedforward
filtering. The pointer variable now is wrapped around after it’s incremented when it exceeds
L-1, as explained in Section 9. The only additional work to allow for the circularity of the
buffer is the calculation of index, which retrieves the past values of the stored input samples.
It’s wrapped around when it falls below zero.

    for(i=0; i<L; i++)               /* initialize array */
        array[i] = 0.;
    now = L-1;                       /* initialize pointer */
    for(t=0; t<M; t++){              /* time loop */
        new_sample = x[t];           /* new point arrives */
        now++;
        if(now > L-1) now = now-L;   /* keep in range */
        array[now] = new_sample;
        y[t] = 0;
        for(i=0; i<L; i++){
            index = now-i;           /* location of input x[t-i] */
            if(index<0) index += L;  /* wrap around */
            y[t] += a[i]*array[index];
        }
    }

The circular array is initialized with zero values, reflecting an assumption that input values
x[t] are zero for −L + 1 ≤ t < 0. When these input values are unknown, the first L − 1
values of the output are not really meaningful. When the values are actually known, they
can be used instead of zero to initialize the array.
11. Suppose the L coefficients are symmetric about their center, and that L is even, so that
\[
a_i = a_{L-1-i}, \quad \text{for } i = 0, 1, \ldots, L/2 - 1
\]
(When L is odd, there is a center coefficient, and it changes the indexing slightly.) The L/2
pairs of terms in the filtering equation of the form
\[
a_i x_{t-i} + a_{L-1-i}\, x_{t-(L-1-i)}
\]
can therefore be combined as
\[
a_i \left[ x_{t-i} + x_{t-(L-1-i)} \right]
\]
thus saving L/2 multiplications per sample. (When L is odd, this saves (L − 1)/2 multiplications
per sample, which is why I say “almost halve” in the problem statement.)
The following is the corresponding piece of C code. The only difference between this
and the code in the general, asymmetric, case is that past input values are extracted from
the circular array two at a time, from the locations index1 and index2. The
loop accumulating the output value uses a[i] for i = 0, 1, . . . , L/2 − 1, and the values of a[i]
for i = L/2, L/2 + 1, . . . , L − 1 are known by symmetry.

    for(i=0; i<L; i++)               /* initialize array */
        array[i] = 0.;
    now = L-1;                       /* initialize pointer */
    for(t=0; t<M; t++){              /* time loop */
        new_sample = x[t];           /* new point arrives */
        now++;
        if(now > L-1) now = now-L;   /* keep in range */
        array[now] = new_sample;
        y[t] = 0;
        for(i=0; i<L/2; i++){        /* only the first L/2 a's are used */
            index1 = now-i;          /* find indexes for the two inputs */
            if(index1<0) index1 += L;
            index2 = now-(L-1-i);
            if(index2<0) index2 += L;
            y[t] += a[i]*(array[index1]+array[index2]);
        }
    }
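
To confirm that the folded loop computes the same output as the general one, the two fragments can be wrapped as functions and run on the same input. The wrappers below are a sketch only: the function names and the fixed capacity MAX_TAPS are assumptions made for this example, and L is taken to be even.

```c
#include <assert.h>

#define MAX_TAPS 64   /* arbitrary capacity chosen for this sketch */

/* General feedforward filter using a circular array (Problem 10). */
void fir_general(const double *a, int L, const double *x, double *y, int M)
{
    double array[MAX_TAPS] = {0.0};   /* past inputs, initially zero */
    int now = L - 1;
    assert(L > 0 && L <= MAX_TAPS);
    for (int t = 0; t < M; t++) {
        now++;
        if (now > L - 1) now -= L;    /* keep in range */
        array[now] = x[t];
        y[t] = 0.0;
        for (int i = 0; i < L; i++) {
            int index = now - i;
            if (index < 0) index += L;
            y[t] += a[i] * array[index];
        }
    }
}

/* Symmetric filter with even L: only a[0..L/2-1] are read. */
void fir_symmetric(const double *a, int L, const double *x, double *y, int M)
{
    double array[MAX_TAPS] = {0.0};
    int now = L - 1;
    assert(L > 0 && L <= MAX_TAPS && L % 2 == 0);
    for (int t = 0; t < M; t++) {
        now++;
        if (now > L - 1) now -= L;
        array[now] = x[t];
        y[t] = 0.0;
        for (int i = 0; i < L / 2; i++) {
            int index1 = now - i;            /* input x[t-i] */
            if (index1 < 0) index1 += L;
            int index2 = now - (L - 1 - i);  /* input x[t-(L-1-i)] */
            if (index2 < 0) index2 += L;
            y[t] += a[i] * (array[index1] + array[index2]);
        }
    }
}
```

Feeding both functions a unit impulse should reproduce the coefficients a[0], ..., a[L−1] in the output, which checks the indexing as well as the symmetry.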
Chapter 13

Designing Feedback Filters

1. If all the coefficients but one, say c, are kept constant, the frequency response of a
feedforward filter is of the form
\[
c f(\omega) + g(\omega)
\]
and the problem asks for the behavior of
\[
M(c) = \max_{\omega} |c f(\omega) + g(\omega)|
\]
as c is varied. When c = 0, the value of M is just
\[
M(0) = \max_{\omega} |g(\omega)| = g_{\max}
\]
Letting $f_{\max} = \max_{\omega} |f(\omega)|$, we can also say that the value of M is bounded:
\[
M(c) \le |c|\, f_{\max} + g_{\max}
\]
Thus, the curve M(c) is upper bounded by the vee shape formed by two straight
lines that intersect at the point c = 0 and $M = g_{\max}$. Intuitively, as c is moved in either
direction from zero, the maximizing value of ω shifts gradually from being dominated by
g(ω) to being dominated by f(ω). We therefore expect the curve to be reasonably well
behaved, as well as bounded.
Figure P1(a) shows an example in which the coefficient $c_{13}$ is varied in the feedforward
half-band filter in Section 5 of Chapter 12. This is an equiripple design, which may explain
the fact that the minimum of the maximum frequency response occurs when $c_{13}$ = 0.03156,
the original value in the design. It seems that if the maximum value could be reduced by
changing $c_{13}$, then the design algorithm would have reduced the ripple by doing so. The
segments of the curve are very close to being straight lines, but are not exactly straight.
When we plot the error between the frequency response and some fixed, prespecified
response, the same arguments apply, except the prespecified response must be included in
the definition of the curve M (c), where it gets absorbed into the function g(ω).


[Figure P1(a): plot of the maximum of the frequency response (vertical axis, 1.00 to 1.35) versus the 13th filter coefficient, c13 (horizontal axis, −0.25 to 0.25).]

Fig. P1(a) The maximum value of the frequency response vs. the coefficient c13 , for the
feedforward half-band filter example in Section 5 of Chapter 12.

[Figure P1(b): plot of the maximum of the frequency response (vertical axis, logarithmic, 10^-1 to 10^3) versus the denominator coefficient, b3 (horizontal axis, −5 to 10).]

Fig. P1(b) The maximum value of the frequency response vs. the third coefficient, b3 , in the
denominator of the feedback transfer function example in Section 8 of Chapter 9 — the step
response of a reson.

By way of contrast, Fig. P1(b) shows the result of varying the third denominator
coefficient in the z-transform in Section 8 of Chapter 9. This is the z-transform of the
output of a reson when a step input is applied, but can also be interpreted as a filter
transfer function. The curve of maximum frequency response vs. b3 is much more wildly
behaved than the one in the feedforward case, and actually has three points at which it
becomes infinite. These points correspond to poles crossing the unit circle. These two
figures illustrate why the design problem for feedforward filters is inherently easier than the
design problem for feedback filters.
2. Multiplying the numerator and denominator of the transfer function in Eq. 2.1 by z + 1
shows that
\[
H(z) = \frac{1}{2} + \frac{1}{2} z^{-1}
\]
which is the simplest kind of feedforward filter considered in Chapter 4, as in Eq. 5.1 of
that chapter.
3. Because the transfer function is symmetric in z and $z^{-1}$, every zero $z_0$ inside the unit
circle has a corresponding zero $z_0^{-1}$ outside the unit circle. Choosing the latter instead of
the former is therefore equivalent to multiplying the transfer function by
\[
\frac{z - z_0^{-1}}{z - z_0}
\]

which we know is allpass, from Eq. 6.5 in Chapter 6. Choosing some of the zeros outside
the unit circle, instead of inside, therefore has no effect on the magnitude of the resulting
filter transfer function. But it does introduce extra negative phase, and therefore additional
delay, which is why the design with all the zeros inside the unit circle is called minimum
phase.
4. Setting B(1) = 1 in Eq. 3.10 yields the following condition for the gain constant of the
Butterworth filter:
\[
A = 2^{-N} (1 + p_0^2) \cdots (1 + p_{N/2-1}^2)
\]

where the pole parameters pi are determined by Eqs. 3.5 and 3.8.
Similarly, setting B(1) = 1 in Eq. 4.7 yields

\[
A = 2^{-N} (1 - z_0) \cdots (1 - z_{N-1})
\]

for the case with a general cutoff frequency. This also yields a real value for A, because the
poles zi occur in complex-conjugate pairs.
5. We have just the parameter N to adjust, so in general we can’t hope to achieve two
specified values simultaneously. Furthermore, N is an integer, so we can’t even hope to
meet one of the specifications precisely. What we can do is require that the magnitude
response of the filter be at least as good as that implied by the specified points. Thus, if the

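specified values are $A_1$ in the passband, at $\omega_1 < \pi/2$, and $A_2$ in the stopband, at $\omega_2 > \pi/2$,
we require that
\[
|B(\omega_1)|^2 = \frac{1}{1 + \tan^{2N}(\omega_1/2)} \ge A_1^2
\]
and
\[
|B(\omega_2)|^2 = \frac{1}{1 + \tan^{2N}(\omega_2/2)} \le A_2^2
\]
Rearranging, we get the conditions
\[
\tan^{2N}(\omega_1/2) \le \frac{1}{A_1^2} - 1
\]
and
\[
\tan^{2N}(\omega_2/2) \ge \frac{1}{A_2^2} - 1
\]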
Because ω1 < π/2 — that is, ω1 is in the passband — it follows that tan(ω1 /2) < 1.
Similarly, tan(ω2 /2) > 1 because ω2 is in the stopband. It is therefore possible to satisfy
both of these conditions by choosing N sufficiently large. The smallest such N is the logical
design choice. The conditions can be made explicit by taking logs, giving

\[
N \ge \frac{1}{2} \cdot \frac{\log(1/A_1^2 - 1)}{\log(\tan(\omega_1/2))}
\]
\[
N \ge \frac{1}{2} \cdot \frac{\log(1/A_2^2 - 1)}{\log(\tan(\omega_2/2))}
\]
Note that the first inequality gets turned around because we divide by a negative quantity.
And since we’re taking the ratios of logs, it doesn’t matter what their base is.
6. The condition in Eq. 3.4 for the poles in the F -plane can be written

\[
F^{2N} = (-1)^{N+1} = e^{j[(N+1)\pi + k2\pi]} \quad \text{for all integer } k
\]
Taking the 2Nth root of both sides shows that the poles are therefore of unit magnitude at
angles
\[
\theta_k = \frac{\pi}{2} + \frac{2k+1}{4N}\, 2\pi
\]
for any 2N consecutive values of the integer k, checking Eq. 3.5 and Fig. 3.1.
7. Equation 3.5 works for N odd as well as N even; the 2N pole angles in the F-plane are
\[
\theta_k = \frac{\pi}{2} + \frac{2k+1}{4N}\, 2\pi, \quad \text{for } k = 0, 1, \ldots, 2N-1
\]
When N is odd, there is a real pole at F = −1. It corresponds to the index k = (N − 1)/2,
when $\theta_{(N-1)/2} = \pi$.

The pole at F = −1 becomes a pole at z = 0, by Eq. 3.7. The transfer function when
N is odd therefore has this left-over real pole when the complex pairs are combined, and
Eq. 3.10 becomes
\[
B(z) = \frac{A (z+1)^N}{(z^2 + p_0^2) \cdots (z^2 + p_{(N-3)/2}^2)\, z}
\]

To check, this accounts for (N − 1)/2 pairs of complex poles, plus the real pole, a total of
N poles. The real pole is indexed by k = (N − 1)/2.
As mentioned above, the real pole ends up at z = 0 when the filter is half-band. When
the cutoff frequency is ωc radians per sample, Eq. 4.5 tells us that the real pole ends up at

\[
z_{(N-1)/2} = \frac{1 - \tan(\omega_c/2)}{1 + \tan(\omega_c/2)}
\]

In the example illustrated in Fig. 4.1, ωc = 2π/10, and this formula gives the location of
the real pole at z = 0.5095. That’s where it would be in that figure if N were odd.
In the case when N is odd, the cascade implementation in Fig. 7.1 needs a first-order
section to account for this lone real pole.
8. To convert a lowpass Butterworth filter to highpass, we replace z by −z. This is equivalent to
changing the sign of the coefficients ci in Eq. 7.6, but leaving the coefficients di unaltered.
(Of course, the zero locations should also be changed from z = −1 to z = +1.)
9. Assume for this problem that the numerator and denominator of the transfer function
are both of degree N = n = m − 1 in z −1 , and that N is even. Assume further that the
filter is otherwise completely arbitrary.
Implementing the filter in cascade form, as in Fig. 7.1, requires N/2 sections, and
each section requires 4 multiplications and 4 additions. There’s one more multiplication to
account for the scale factor A, so the total is 2N + 1 multiplications and 2N additions.
The direct form in Eq. 8.2 requires one multiplication for each numerator and denom-
inator coefficient and one fewer addition, so the total is 2N + 1 multiplications and 2N
additions — exactly the same as the cascade form.
10. The magnitude frequency response of the analog Butterworth filter is, from Eq. 6.1,
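\[
\frac{1}{\sqrt{1 + \Omega^{2N}}}
\]
Use the power series
\[
(1 + x)^{-1/2} = 1 - \frac{x}{2} + \text{higher-order terms}
\]
to expand this in a power series around the point Ω = 0:
\[
\frac{1}{\sqrt{1 + \Omega^{2N}}} = 1 - \frac{\Omega^{2N}}{2} + \text{higher-order terms}
\]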

The first 2N − 1 derivatives of this vanish at Ω = 0, and thus the magnitude frequency
response is very flat. In fact, it is as flat as possible, given that the transfer function has N
poles. This follows because this process of expanding the response always produces a power
series of the form
\[
1 + c_1 \Omega^k + \text{higher-order terms}
\]
where $c_1$ is a constant and k ≤ 2N.
The same result holds in the digital filter case, because

\[
\Omega = c_2\, \omega + \text{higher-order terms}
\]

using Eq. 6.8 and the fact that tan x = x + higher-order terms.
11. Different pairings of zeros and poles and different orderings of the sections in a cascade
implementation can affect the roundoff error. They can also affect the maximum magnitude
of signals in the sections, and this can affect the wordlength required in fixed-point
implementations.
12. The Laplace transform of the unit step signal is, using Eq. 6.3,
\[
X(s) = \int_0^{\infty} e^{-st}\, dt = \left. \frac{e^{-st}}{-s} \right|_{t=0}^{\infty} = \frac{1}{s}
\]
where we assume $\Re\{s\} > 0$, so that the value of the integrated function vanishes as t → ∞.
(It’s necessary to establish a region of convergence for the Laplace transform, and for the
z-transform as well, but I simplified the discussion in the book by ignoring this issue.) The
bilinear transformation yields the z-transform
\[
X\!\left( \frac{z-1}{z+1} \right) = \frac{z+1}{z-1} = \frac{1 + z^{-1}}{1 - z^{-1}}
\]
On the other hand, the sampled unit step signal has the z-transform
\[
\frac{1}{1 - z^{-1}}
\]
from Eq. 5.1 in Chapter 9.
The Laplace transforms of an analog signal and its sampled version are related by the
aliasing formula, Eq. 3.2 of Chapter 11.
13. To avoid dealing with special cases, assume in this problem that a straight line is a
special case of a circle with infinite radius. Then there is a well-known mathematical result
that includes the one posed in this problem: A bilinear transformation maps any circle to a
circle. The bilinear transformation in Eq. 4.5 therefore maps the poles of the Butterworth
filter, which lie on the unit circle in the F -plane, to a circle in the z-plane.
First, observe that a bilinear transformation can be decomposed into successive appli-
cations of the operations of

• multiplication by a constant scale factor;

• shift by a constant z0 in the complex plane;

• inversion — replacing the complex variable z by 1/z.

For example,
\[
\frac{z-1}{z+1} = 1 - \frac{2}{z+1}
\]
so this bilinear transformation can be decomposed into a shift by 1, inversion,
multiplication by −2, and a final shift by 1.
The first two operations clearly map any circle to a circle. The crux of the problem is to
show that inversion does so also. Start with the general circle with center at the complex
number $z_0$ and radius R:
\[
|z - z_0|^2 = R^2
\]
Replacing z by 1/w, where w = 1/z, yields
\[
|1 - w z_0|^2 = R^2 |w|^2
\]
Letting
\[
w = w_r + j w_i \quad \text{and} \quad z_0 = z_r + j z_i
\]
this becomes
\[
|1 - w_r z_r + w_i z_i - j(w_r z_i + w_i z_r)|^2 = R^2 (w_r^2 + w_i^2)
\]
Expanding the magnitude squared as the sum of the real and imaginary parts squared, and
collecting terms, results in
\[
w_r^2 - 2 w_r \frac{z_r}{|z_0|^2 - R^2} + w_i^2 + 2 w_i \frac{z_i}{|z_0|^2 - R^2} = \frac{1}{R^2 - |z_0|^2}
\]
and completing the square finally produces
\[
\left( w_r - \frac{z_r}{|z_0|^2 - R^2} \right)^2 + \left( w_i + \frac{z_i}{|z_0|^2 - R^2} \right)^2 = \left( \frac{R}{|z_0|^2 - R^2} \right)^2
\]
which is the equation of a circle.
To get this result we divided by $R^2 - |z_0|^2$. When this is zero, that is, when the original
circle passes through the origin, we can stop earlier in this derivation with the equation of
a straight line:
\[
2 w_r z_r - 2 w_i z_i = 1
\]
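
A quick numerical check of the inversion step: take a point z on a circle with center z0 and radius R, compute w = 1/z, and confirm that w lies on a circle centered at the conjugate of z0 divided by |z0|² − R², with radius R/||z0|² − R²|. The C sketch below is illustrative only; the `cplx` type and function names are assumptions made for this example, and it requires that the original circle not pass through the origin.

```c
#include <assert.h>

typedef struct { double re, im; } cplx;   /* minimal complex type for this sketch */

/* Inversion w = 1/z. */
cplx invert(cplx z)
{
    double d = z.re * z.re + z.im * z.im;
    assert(d > 0.0);
    cplx w = { z.re / d, -z.im / d };
    return w;
}

/* For z = z0 + R*(c + js) with c*c + s*s = 1, return the squared distance
   from 1/z to the predicted center minus the predicted squared radius.
   The result should be zero up to roundoff. */
double inversion_residual(cplx z0, double R, double c, double s)
{
    double D = z0.re * z0.re + z0.im * z0.im - R * R;
    assert(D != 0.0);                      /* circle must miss the origin */
    cplx z = { z0.re + R * c, z0.im + R * s };
    cplx w = invert(z);
    double dr = w.re - z0.re / D;          /* center real part:  zr / D */
    double di = w.im + z0.im / D;          /* center imag part: -zi / D */
    return dr * dr + di * di - (R / D) * (R / D);
}
```

With z0 = 3 + j and R = 2, for instance, the residual should vanish to within roundoff for any unit vector (c, s), such as (0.6, 0.8).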

14. Go back and look at the pole-zero plot for the elliptic filter used as an example in
Section 8 of Chapter 5, Fig. 8.1 of that chapter. Observe that there is a pole for every
peak in the passband and a zero for every notch in the stopbands. Taking into account
that the frequency response is plotted for only half the baseband, corresponding to half the

unit circle, it’s clear that we should multiply these counts by two to get the total number
of zeros and poles. Note that this works only when the poles and zeros are sufficiently close
to the unit circle.
The frequency response of the elliptic filter in Fig. 6.2 has five peaks in the passband
(the last one is scrunched up), and five notches in the stopband, checking the fact that this
is a 10-pole, 10-zero design.
15. The transformation is s → 1/s, and you can see this in a number of ways.
The voltage across an inductor is L di/dt, where i is the current. In the Laplace transform
domain, this corresponds to an impedance of sL. Similarly, the impedance of a capacitor
is 1/(sC). Interchanging inductors and capacitors is therefore equivalent to replacing s by
1/s, ignoring a scale factor.
If you’re using the bilinear transformation in Eq. 6.7 to design a filter,
\[
s = \frac{1}{\tan(\omega_c/2)} \cdot \frac{z-1}{z+1}
\]

replacing z by −z is mathematically equivalent to the transformation


\[
s \to \frac{1}{\tan^2(\omega_c/2)} \cdot \frac{1}{s}
\]

which, again, is inversion with a scale factor.


Intuitively, any transformation that interchanges the zero and infinite frequencies, and
reverses the order of the frequencies in between, will act as a lowpass-to-highpass transfor-
mation. Inversion is the most natural, for the reasons given above.
Chapter 14

Audio and Musical Applications

1. It’s just as easy to work this problem for the general case of an n-coefficient feedforward
filter, and a signal padded with L − 1 zeros between the original sample values.
Suppose the quarter-band feedforward filter has the transfer function
\[
\sum_{k=0}^{n-1} a_k z^{-k}
\]

At t = 0, the filtering operation uses only the terms involving $a_0, a_L, \ldots, a_{mL}$, where mL
denotes the last nonzero coefficient with index 0 mod L. At t = 1, the filter uses only
$a_1, a_{L+1}, \ldots, a_{mL+1}$, where now mL + 1 denotes the last nonzero coefficient with index
1 mod L. In general, if t = i mod L, the filter uses only coefficients $a_i, a_{L+i}, \ldots, a_{mL+i}$,
where mL + i denotes the last nonzero coefficient with index i mod L.
For example, consider the case L = 3 and n = 8. When t = 0, 3, 6, . . ., the filter uses
$a_0$, $a_3$, and $a_6$; when t = 1, 4, 7, . . ., the filter uses $a_1$, $a_4$, and $a_7$; and when t = 2, 5, 8, . . .,
the filter uses $a_2$ and $a_5$. (There isn’t any $a_8$.)
The filtering operation can be viewed as a time-varying one: the filter coefficients
change cyclically, depending on the value of t mod L. The filtering operations are
\[
y_t = \sum_{k = i,\, L+i,\, \ldots,\, mL+i} a_k\, x_{t-k} \quad \text{for } t = i \bmod L
\]
Thus, the filter does only 1/Lth the work that it would with general input.
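
The saving can be demonstrated with a short C sketch that filters a zero-padded signal two ways: once using every coefficient, and once using only the coefficients with index i mod L. The function names are hypothetical, and the input is assumed to be zero before t = 0.

```c
#include <assert.h>

/* Insert L-1 zeros between the samples of x[0..len-1]; xs has len*L entries. */
void zero_stuff(const double *x, int len, int L, double *xs)
{
    assert(len > 0 && L > 0);
    for (int i = 0; i < len * L; i++)
        xs[i] = (i % L == 0) ? x[i / L] : 0.0;
}

/* Output at time t using all n coefficients. */
double filter_full(const double *a, int n, const double *xs, int t)
{
    double y = 0.0;
    for (int k = 0; k < n && k <= t; k++)
        y += a[k] * xs[t - k];
    return y;
}

/* Output at time t using only coefficients with index (t mod L) mod L,
   the only ones that multiply nonzero samples of a zero-padded input. */
double filter_polyphase(const double *a, int n, const double *xs, int t, int L)
{
    assert(L > 0 && t >= 0);
    double y = 0.0;
    for (int k = t % L; k < n && k <= t; k += L)
        y += a[k] * xs[t - k];
    return y;
}
```

With L = 3 and n = 8, as in the example above, the inner loop of `filter_polyphase` touches at most three coefficients per output sample instead of eight, and the two functions return identical values.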
2. Here’s the input file to METEOR that produced Fig. 1.3:

96 96 smallest and largest length


c
neither left nor right: maximize distance from constraints
500 number of grid points


limit spec
+ upper limit
geometric interpolation
not hugged spec
0.00000E+00 0.11338E+00 band edges
0.70795E+00 1.00000E+00 bounds
limit spec
- lower limit
geometric interpolation
not hugged spec
0.00000E+00 0.11338E+00 band edges
0.50119E+00 0.70795E+00 bounds
limit spec
+ upper limit
geometric interpolation
not hugged spec
0.13605E+00 5.00000E-01 band edges
0.31623E-02 0.31623E-01 bounds
limit spec
- lower limit
geometric interpolation
not hugged spec
0.13605E+00 5.00000E-01 band edges
-0.31623E-02 -0.31623E-01 bounds
end

The input-preparation program FORMAT generates the following condensed version,
which is a bit easier to read:

       type   sense  edge1    edge2    bound1    bound2    hugged?  interp
    1  limit  +      0.00000  0.11338   0.70795   1.00000  n        g
    2  limit  -      0.00000  0.11338   0.50119   0.70795  n        g
    3  limit  +      0.13605  0.50000   0.00316   0.03162  n        g
    4  limit  -      0.13605  0.50000  -0.00316  -0.03162  n        g
    OPTIMIZING, fixed length= 96
    COSINE model (symmetric coeffs.)
    501 grid points

There are four constraints of the limit type: an upper and lower limit in the passband
and in the stopband. The bandedges edge1 and edge2 are the left and right edges of each
band, in fractions of the sampling rate. The limiting values of the magnitude response are
interpolated geometrically from the left to the right edge, which means linearly on a dB
scale. Note that the specifications at the left and right edges of the stopband are −50 dB
and −30 dB, respectively (bounds of 0.00316 and 0.03162). Finally, not hugged means the
design is moved as far as possible from a specification.

[Figure P2: plot of the frequency response in dB (vertical axis, 0 to −80 dB) versus frequency, in fractions of the sampling rate (horizontal axis, 0 to 0.50).]

Fig. P2 Magnitude response of a quarter-band feedforward digital filter used to prepare for
oversampled d-to-a conversion in a CD player. The solid line is the response of the
infinite-precision design, and the dashed line is the response of the 12-bit version.
I’ll interpret rounding off the coefficients to 12 bits as follows:

• multiply each coefficient by $2^{12}$;

• cast the result to type int (in C);

• divide by $2^{12}$.

Think of this as shifting the binary point of the binary representation 12 places to the
right, chopping off the fractional part, and moving the binary point back. The coefficients
are all less than one in magnitude, so this just selects the first 12 binary bits.
Figure P2 shows the resulting magnitude response as a dashed line, along with the
original (essentially) infinite-precision design as a solid line. As you might expect, the
effects of using only 12 bits in the coefficients are most apparent where the magnitude
response is smallest. But the result still meets the specifications.
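
The three steps used above to round the coefficients to 12 bits can be written as a one-line C function. This is a sketch with a hypothetical function name; note that the cast to int truncates toward zero, so negative coefficients are chopped in magnitude as well.

```c
#include <assert.h>

/* Keep the first 12 binary bits of a coefficient with magnitude below one:
   scale by 2^12 = 4096, truncate with an int cast, and scale back. */
double truncate_to_12_bits(double coeff)
{
    assert(coeff > -1.0 && coeff < 1.0);
    int scaled = (int)(coeff * 4096.0);   /* cast chops the fractional part */
    return scaled / 4096.0;
}
```

For example, 0.03156 scales to about 129.27, which is chopped to 129, giving 129/4096, or roughly 0.031494. The error of this scheme is always less than 1/4096 in magnitude.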

3. Figure P3 illustrates the effect of a zero-order hold after oversampled d-to-a conversion.
This would be Fig. 1.1(e). It is obtained by multiplying the signal spectrum after digital
filtering at the new sampling rate by the frequency response of a zero-order hold, from Fig.
4.5 in Chapter 11. Two effects are important. First, the signal spectrum droops in the
baseband, because of the decreasing magnitude response of the zero-order hold. Second,
the signal spectrum image at the sampling frequency is multiplied by the region of the
zero-order hold magnitude response that has a zero, producing two lobes, as in the example
in Fig. 5.1 of Chapter 11. Note that this spurious image signal is centered at four times
the original sampling rate, beyond the range of hearing.

[Figure P3: panel (e), signal spectrum versus frequency, with labels for the droop, the
Nyquist frequency, the sampling frequency, and the image.]

Fig. P3 The effect of a zero-order hold after oversampled d-to-a conversion, a candidate for
Fig. 1.1(e).
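
To get a feeling for the size of the droop, the sketch below evaluates the zero-order hold magnitude response, $|\sin(\pi f/f_s)/(\pi f/f_s)|$, at the top of the audio band. The numbers are assumptions made for illustration: a 44.1 kHz CD sampling rate, four-times oversampling, and 20 kHz as the band edge. The function name zoh_gain_db is hypothetical.

```python
import math


def zoh_gain_db(f, fs):
    """Magnitude response of a zero-order hold, in dB, at frequency f
    when the hold runs at sampling rate fs (both in Hz)."""
    x = math.pi * f / fs
    if x == 0.0:
        return 0.0
    return 20.0 * math.log10(abs(math.sin(x) / x))


CD_RATE = 44_100.0     # assumed original sampling rate, Hz
OVERSAMPLING = 4       # assumed oversampling factor
BAND_EDGE = 20_000.0   # assumed top of the audio band, Hz

droop_plain = zoh_gain_db(BAND_EDGE, CD_RATE)
droop_oversampled = zoh_gain_db(BAND_EDGE, OVERSAMPLING * CD_RATE)

if __name__ == "__main__":
    print("droop at 20 kHz, no oversampling: %.2f dB" % droop_plain)
    print("droop at 20 kHz, 4x oversampling: %.2f dB" % droop_oversampled)
```

Under these assumptions the droop at the band edge is about 3 dB without oversampling and a small fraction of a dB with it.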

4. The analog signal after oversampled d-to-a conversion must be amplified by an analog
amplifier. Too much energy in that signal can saturate the electronics in that amplifier,
even if it’s not in the audible range.
5. The specification in the stopband of the pre-d-to-a filter slopes up on the ground that
it is more important to do a good job at the lower edge of the stopband than at the upper
edge, because the lower edge of the stopband is closer to the audible region.
6. It’s no accident that the lowpass comb for reverb and the plucked-string filter have the
same structure. They both arise from the same physical situation: Sound circulates and
decays in a frequency-dependent way, faster at high frequencies than at low frequencies.
In reverb, the frequency-dependent decay is caused by absorption of sound in air and on
reflection; in the plucked string, it’s caused by radiation. Of course, the time scales of the
two phenomena are different.
Prof. Perry Cook pointed out to me that the one-pole feedback filter has a lowpass
frequency response that is in some sense a more natural model for the physics in both
applications. Certainly, the feedback filter works fine in the plucked-string filter, provided
you adjust the loop gain to keep the filter stable. He also pointed out, however, that the
frequency-independent delay of one-half a sampling period makes it easier to predict the
pitch of the plucked-string filter when the feedforward filter is used.
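
The point about delay can be checked numerically. The sketch below compares the phase delay, $-\arg H(\omega)/\omega$, of the two-point averaging feedforward filter with that of a one-pole feedback filter. The particular transfer functions, the feedback coefficient a = 0.5, and the function names are assumptions chosen for illustration.

```python
import cmath


def phase_delay(h, omega):
    """Phase delay in samples of frequency response value h at omega."""
    return -cmath.phase(h) / omega


def feedforward(omega):
    """Two-point average, H(z) = (1 + z^-1)/2, on the unit circle."""
    return 0.5 * (1 + cmath.exp(-1j * omega))


def feedback(omega, a=0.5):
    """One-pole lowpass, H(z) = (1 - a)/(1 - a z^-1); a is an assumed value."""
    return (1 - a) / (1 - a * cmath.exp(-1j * omega))


if __name__ == "__main__":
    for omega in (0.1, 1.0, 2.0):
        print(omega,
              phase_delay(feedforward(omega), omega),
              phase_delay(feedback(omega), omega))
```

The feedforward column is one-half sample at every frequency, while the feedback column changes with frequency, which is what makes the pitch harder to predict in that case.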
7. The property of being allpass can be thought of as follows: A transfer function G(z) is
allpass if it maps the unit circle in the z-plane to the unit circle in the G-plane. This is
equivalent to the magnitude response being unity. (In the more general case of constant
magnitude response, just normalize to one.)
Using this definition, it’s easy to see that an allpass function of an allpass function is
also allpass. Composing two allpass functions maps the unit circle to the unit circle, and
then again to the unit circle. Now $z^m$ is an allpass function, because its magnitude on the
unit circle is one. Therefore, if $G(z)$ is allpass, so is $G(z^m)$. This is equivalent to replacing
unit delays by delays of $m$ samples.
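
Here is a quick numerical check of this argument, offered as a sketch. It uses a first-order allpass section, $G(z) = (a + z^{-1})/(1 + az^{-1})$, with an assumed coefficient a = 0.6, and measures how far $|G(z^m)|$ strays from one around the unit circle. The function names are hypothetical.

```python
import cmath
import math


def allpass(z, a=0.6):
    """First-order allpass G(z) = (a + 1/z) / (1 + a/z); a is an assumed value."""
    return (a + 1 / z) / (1 + a / z)


def max_deviation(m, n_points=512):
    """Largest | |G(z^m)| - 1 | over n_points equally spaced on the unit circle."""
    worst = 0.0
    for i in range(n_points):
        z = cmath.exp(2j * math.pi * i / n_points)
        worst = max(worst, abs(abs(allpass(z ** m)) - 1.0))
    return worst


if __name__ == "__main__":
    for m in (1, 2, 5):
        print(m, max_deviation(m))
```

The deviation stays at the level of floating-point roundoff for every m, as the composition argument predicts.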
8. Amplitude modulation results in a strictly bandlimited spectrum, which makes it easy to
pack AM signals close together in the medium-wave band. The spacing of 10 kHz, however,
means that the audio bandwidth is at most 5 kHz, and is somewhat less in practice because
of the need for guard bands. This small bandwidth accounts for the low quality of AM
audio, but leaves room for a hundred or so channels in the medium-wave band, which is
about 1 MHz wide.
As it turns out, FM is more resistant to noise than AM for radio, and for this reason is
better suited to the high-quality demands of commercial music broadcasting. It also spreads
the spectrum around the carrier much more than AM, especially for large modulation
indexes, as discussed in Section 4. It is therefore used with much larger bandwidth than
commercial AM. To make this possible, the FM commercial band is in the 100 MHz region,
and is about 20 MHz wide, allowing about 100 channels, 200 kHz apart, 20 times the AM
spacing.
9. To be more realistic than Section 4, we’ll assume that the carrier is real-valued, so it
contains frequencies $\pm\omega_0$. The FM signal therefore contains all frequencies of the form

\[ \pm\omega_0 \pm k\omega_m \quad \text{for all integer } k \]

where $\omega_m$ is the frequency of the modulating sinusoid. We’re assuming that
$\omega_0/\omega_m = N_1/N_2$, with $N_1$ and $N_2$ relatively prime, and this can be rewritten

\[ \frac{\omega_m}{N_2}\,(\pm N_1 \pm kN_2) \quad \text{for all integer } k \]

All these frequencies are integer multiples of $\omega_m/N_2$, which we’ll call the fundamental
frequency. We now consider the cases $N_2 = 1, 2, 3$ in turn.
When $N_2 = 1$, the fundamental frequency is the modulation frequency, $\omega_m$, and all
harmonics of the fundamental are present.
When $N_2 = 2$, the fundamental frequency is $\omega_m/2$. $N_1$ must be odd, since $N_1$ and $N_2$
are assumed to be relatively prime. Therefore, the FM signal contains all frequencies of the
form

\[ \frac{\omega_m}{2}\,(\pm N_1 \pm 2k) \]

Thus, all the odd harmonics of the fundamental are present, and all the even harmonics are
missing.
When $N_2 = 3$, the fundamental frequency is $\omega_m/3$. In this case, $N_1 \not\equiv 0 \pmod 3$, again
because $N_1$ and $N_2$ are relatively prime. This means that $N_1$ is of the form $3q + x$, where
$q$ is an integer, and $x = 1$ or 2. It follows that $-N_1 \equiv -x \pmod 3$, and is therefore 1 if
$N_1 \bmod 3 = 2$, and 2 if $N_1 \bmod 3 = 1$. The FM signal now contains all frequencies of the
form

\[ \frac{\omega_m}{3}\,(\pm N_1 \pm 3k) \]

Therefore the FM signal contains all multiples of the fundamental except the ones that are
0 mod 3.
The results in this solution are from Chowning’s paper, cited in the Notes. Note that
when $N_2$ is even, the FM spectrum contains only odd-numbered harmonics of the
fundamental, as Chowning points out, but not necessarily all the odd-numbered harmonics.
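
The case analysis for rational ratios can be checked by brute force. The sketch below lists which harmonics of the fundamental appear, by enumerating $|\pm N_1 + kN_2|$ over a range of k (negative frequencies are folded onto positive ones). The function name harmonics_present and the cutoffs k_max and h_max are illustrative assumptions. Whether a listed component actually has noticeable amplitude depends on the modulation index, which this sketch ignores.

```python
def harmonics_present(n1, n2, k_max=50, h_max=12):
    """Harmonic numbers h (1..h_max) of the fundamental w_m/n2 that occur
    among the frequencies (w_m/n2) * (+-n1 + k*n2), |k| <= k_max."""
    present = set()
    for k in range(-k_max, k_max + 1):
        for sign in (+1, -1):
            h = abs(sign * n1 + k * n2)  # fold negative frequencies
            if 1 <= h <= h_max:
                present.add(h)
    return sorted(present)


if __name__ == "__main__":
    for n1, n2 in ((1, 1), (3, 2), (4, 3), (1, 6)):
        print((n1, n2), harmonics_present(n1, n2))
```

With $N_1/N_2 = 3/2$ only odd harmonics are listed, with $4/3$ the multiples of 3 are missing, and with $1/6$ the odd harmonics 3 and 9 are missing as well, which illustrates the remark about even $N_2$.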
When the ratio $\omega_0/\omega_m$ is an irrational number, the spectral components present in the
FM signal are not all integer multiples of a common fundamental frequency. Consider the
example when $\omega_0/\omega_m = \sqrt{2}/7$. Then the frequencies present are

\[ \pm\omega_0 \pm k\omega_m = \omega_m \left( \pm\frac{\sqrt{2}}{7} \pm k \right) \quad \text{for all integer } k \]

10. I’ll use the notation LP[u] to denote the result of passing the signal u through the
lowpass filter. Denote the (real-valued) input and output signals in Fig. 3.3 by x and y,
respectively, and denote the output of the lowpass filter in the top branch by w. Then the
output of the lowpass filter in the bottom branch is w∗ , and

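\begin{align*}
y &= 2\,\Re\left\{ e^{j\omega_0 t}\, w \right\} \\
  &= 2\,\Re\left\{ e^{j\omega_0 t}\, \mathrm{LP}\left[ x\cos(\omega_0 t) - jx\sin(\omega_0 t) \right] \right\} \\
  &= 2\cos(\omega_0 t)\,\mathrm{LP}\left[ x\cos(\omega_0 t) \right] + 2\sin(\omega_0 t)\,\mathrm{LP}\left[ x\sin(\omega_0 t) \right]
\end{align*}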

which is just twice the output of the figure shown in this problem.
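
The identity can also be confirmed numerically. In the sketch below the lowpass filter is an assumed 8-point moving average (any linear filter with real coefficients would do), and the test signal, the value of $\omega_0$, and all names are made up for illustration. It computes y once from the complex form and once from the real cosine and sine branches.

```python
import cmath
import math


def lowpass(u, n_taps=8):
    """Assumed stand-in for LP[.]: a causal moving average with real taps."""
    out = []
    for t in range(len(u)):
        window = u[max(0, t - n_taps + 1): t + 1]
        out.append(sum(window) / n_taps)
    return out


N = 200
W0 = 0.3  # assumed carrier frequency, radians per sample
x = [math.sin(0.05 * t) + 0.5 * math.cos(0.31 * t) for t in range(N)]

# Complex form: y = 2 Re{ e^{j w0 t} LP[x e^{-j w0 t}] }
w = lowpass([x[t] * cmath.exp(-1j * W0 * t) for t in range(N)])
y_complex = [2 * (cmath.exp(1j * W0 * t) * w[t]).real for t in range(N)]

# Real form: 2 cos(w0 t) LP[x cos(w0 t)] + 2 sin(w0 t) LP[x sin(w0 t)]
lp_cos = lowpass([x[t] * math.cos(W0 * t) for t in range(N)])
lp_sin = lowpass([x[t] * math.sin(W0 * t) for t in range(N)])
y_real = [2 * math.cos(W0 * t) * lp_cos[t] + 2 * math.sin(W0 * t) * lp_sin[t]
          for t in range(N)]

max_err = max(abs(a - b) for a, b in zip(y_complex, y_real))

if __name__ == "__main__":
    print("largest difference between the two forms:", max_err)
```

The two forms agree to within roundoff, because the filter is linear and its coefficients are real.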
11. A typical picture shows objects that take up more area than a picture element (a pixel),
and so the value at a particular pixel is highly correlated with the values at neighboring
pixels. This means the most straightforward pixel-by-pixel representation of a picture is
redundant, and implies that we can compress the image.
A consequence of this redundancy is that the value directly beside or below a given
pixel is highly correlated with it, and this can be exploited for run-length and line-to-
line compression when we are using a scanline representation. Similarly, in television and
movies, there is a high correlation between the value of a pixel at time t and its value at
time t + 1, because objects tend to remain stationary or move slowly. This correlation can
be exploited for frame-to-frame compression.
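
As a small illustration of how correlation between neighboring pixels turns into compression, here is a sketch of run-length coding for a single scanline. It is one simple scheme among many, and the function name and the (value, count) output format are assumptions made for this example.

```python
def run_length_encode(scanline):
    """Encode a sequence of pixel values as (value, run length) pairs."""
    runs = []
    for value in scanline:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # same value as neighbor: extend the run
        else:
            runs.append([value, 1])   # value changed: start a new run
    return [tuple(run) for run in runs]


if __name__ == "__main__":
    line = [0, 0, 0, 0, 255, 255, 255, 0, 0]
    print(run_length_encode(line))
```

The more a scanline consists of long runs of equal values, the fewer pairs are needed to represent it.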
12. From audio bit-rate considerations, we have an hour of sound at 1.41 Mbit/sec, as
estimated in Section 1, which yields about 5 Gbit. I state that the actual bit rate onto and
off a CD is about three times this, so there should be a total of about 15 Gbit on a CD.
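
The arithmetic behind these estimates is sketched below. It assumes the standard CD audio parameters (44.1 kHz sampling, 16 bits per sample, two channels), which is where the 1.41 Mbit/sec figure comes from; the factor of three is the one quoted above. The constant names are illustrative.

```python
SAMPLE_RATE = 44_100      # samples per second per channel (assumed CD standard)
BITS_PER_SAMPLE = 16      # assumed CD standard
CHANNELS = 2              # stereo
SECONDS = 3600            # one hour of sound
OVERHEAD_FACTOR = 3       # actual bit rate is about three times the audio rate

audio_rate = SAMPLE_RATE * BITS_PER_SAMPLE * CHANNELS   # bits per second
audio_bits = audio_rate * SECONDS                       # audio bits in an hour
disc_bits = OVERHEAD_FACTOR * audio_bits                # total bits on the disc

if __name__ == "__main__":
    print("audio bit rate: %.2f Mbit/sec" % (audio_rate / 1e6))
    print("audio bits per hour: %.1f Gbit" % (audio_bits / 1e9))
    print("total bits on disc: %.1f Gbit" % (disc_bits / 1e9))
```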
