1992 - Meyer - Wavelets Algorithms and Applications

This chapter introduces key concepts in signal and wavelet analysis. It discusses the goals of signal processing, including representing both stationary and non-stationary signals. The chapter describes early time-scale wavelets developed by Grossmann and Morlet, as well as time-frequency wavelets. It also discusses optimal representations and terminologies used in subsequent chapters, which will cover the history and applications of wavelet algorithms.

Wavelets

Algorithms & Applications

Yves Meyer

CEREMADE
and
Institut Universitaire de France

Translated and Revised by
Robert D. Ryan

Office of Naval Research
European Office

Society for Industrial and Applied Mathematics

Philadelphia 1993

Library of Congress Cataloging-in-Publication Data

Meyer, Yves.
[Ondelettes. English]
Wavelets : algorithms and applications / Yves Meyer ; translated
by Robert D. Ryan.
p. cm.
"Translation based on lectures originally presented by the author
for the Spanish Institute in Madrid, Spain, February 1991"-CIP
galley.
ISBN 0-89871-309-9
1. Wavelets. I. Title.
QA403.3.M4913 1993
515'.2433-dc20
93-15100

All rights reserved. Printed in the United States of America. No part of this book may be reproduced,
stored, or transmitted in any manner without the written permission of the Publisher. For information,
write the Society for Industrial and Applied Mathematics, 3600 University City Science Center,
Philadelphia, Pennsylvania 19104-2688.

SIAM is a registered trademark.

Contents

Translator's Foreword
Preface

CHAPTER 1. Signals and Wavelets
1.1 What Is a Signal?
1.2 The Goals of Signal and Image Processing
1.3 Stationary Signals, Transient Signals, and Adaptive Coding Algorithms
1.4 Grossmann-Morlet Time-Scale Wavelets
1.5 Time-Frequency Wavelets from Gabor to Malvar
1.6 Optimal Algorithms in Signal Processing
1.7 Optimal Representation according to Marr
1.8 Terminology
1.9 Reader's Guide

CHAPTER 2. Wavelets from a Historical Perspective
2.1 Introduction
2.2 From Fourier (1807) to Haar (1909), Frequency Analysis Becomes Scale Analysis
2.3 New Directions of the 1930s: Lévy and Brownian Motion
2.4 New Directions of the 1930s: Littlewood and Paley
2.5 New Directions of the 1930s: The Franklin System
2.6 New Directions of the 1930s: The Wavelets of Lusin
2.7 Atomic Decompositions, from 1960 to 1980
2.8 Strömberg's Wavelets
2.9 A First Synthesis: Wavelet Analysis
2.10 The Advent of Signal Processing
2.11 Conclusions

CHAPTER 3. Quadrature Mirror Filters
3.1 Introduction
3.2 Subband Coding: The Case of Ideal Filters
3.3 Quadrature Mirror Filters
3.4 The Trend and Fluctuation
3.5 The Time-Scale Algorithm of Mallat and the Time-Frequency Algorithm of Galand
3.6 Trends and Fluctuations with Orthonormal Wavelet Bases
3.7 Convergence to Wavelets
3.8 The Wavelets of Daubechies
3.9 Conclusions

CHAPTER 4. Pyramid Algorithms for Numerical Image Processing
4.1 Introduction
4.2 The Pyramid Algorithms of Burt and Adelson
4.3 Examples of Pyramid Algorithms
4.4 Pyramid Algorithms and Image Compression
4.5 Pyramid Algorithms and Multiresolution Analysis
4.6 The Orthogonal Pyramids and Wavelets
4.7 Bi-orthogonal Wavelets

CHAPTER 5. Time-Frequency Analysis for Signal Processing
5.1 Introduction
5.2 The Time-Frequency Plane
5.3 The Wigner-Ville Transform
5.4 The Computation of Certain Wigner-Ville Transforms
5.5 The Wigner-Ville Transform and Pseudodifferential Calculus
5.6 The Wigner-Ville Transform and Instantaneous Frequency
5.7 The Wigner-Ville Transform of Asymptotic Signals
5.8 Return to the Problem of Optimal Decomposition in Time-Frequency Atoms

CHAPTER 6. Time-Frequency Algorithms Using Malvar's Wavelets
6.1 Introduction
6.2 Malvar Wavelets: A Historical Perspective
6.3 Windows with Variable Lengths
6.4 Malvar Wavelets and Time-Scale Wavelets
6.5 Adaptive Segmentation and the Split-and-Merge Algorithm
6.6 The Entropy of a Vector with Respect to an Orthonormal Basis
6.7 The Algorithm for Finding the Optimal Malvar Basis
6.8 An Example Where This Algorithm Works
6.9 The Discrete Case

CHAPTER 7. Time-Frequency Analysis and Wavelet Packets
7.1 Heuristic Considerations
7.2 The Definition of Basic Wavelet Packets
7.3 General Wavelet Packets
7.4 Splitting Algorithms
7.5 Conclusions

CHAPTER 8. Computer Vision and Human Vision
8.1 Marr's Program
8.2 The Theory of Zero-Crossings
8.3 A Counterexample to Marr's Conjecture
8.4 Mallat's Conjecture
8.5 The Two-Dimensional Version of Mallat's Algorithm
8.6 Conclusions

CHAPTER 9. Wavelets and Fractals
9.1 Introduction
9.2 The Weierstrass Function
9.3 The Determination of Regular Points in a Fractal Background
9.4 Study of the Riemann Function
9.5 Conclusions

CHAPTER 10. Wavelets and Turbulence
10.1 Introduction
10.2 The Statistical Theory of Turbulence and Fourier Analysis
10.3 Verification of the Hypothesis of Parisi and Frisch
10.4 Farge's Experiments
10.5 Numerical Approaches to Turbulence

CHAPTER 11. Wavelets and the Study of Distant Galaxies
11.1 Introduction
11.2 The New Telescopes
11.3 The Hierarchical Organization of the Galaxies and the Creation of the Universe
11.4 The Multifractal Approach to the Universe
11.5 The Advent of Wavelets

Index

Translator's Foreword

The question most often asked by those who heard I was translating this book was, "How did you
get the job?" Well, I asked for it. I mentioned to Professor Meyer in March 1992 that I had heard
that he had written a new book. He said, yes, and that the book was based on notes from lectures he
had given at the Spanish Institute in Madrid. He added that the book was being translated into
Spanish and that SIAM was interested in publishing an English edition, for which they would need a
translator. I volunteered to do the job; what you have is the result.
In addition to translating the text, I have tried to "work through" much of the mathematics to
correct typos. I have also added a line of explanation here and there where it seemed appropriate,
and sections of text and references have been updated. These revisions are not highlighted in any
particular way (i.e., there are no translator's notes except at one reference), but rather incorporated
into the text. These changes were made with Professor Meyer's blessing. Of course, there is
always the possibility that in the process of updating the manuscript I have introduced other errors;
for these I take full responsibility.
The great fun of this project has been the chance to work with Yves Meyer and other members
of "team wavelet." Professor Meyer improved both my French and my mathematics, and his
enthusiasm and appreciation for my efforts kept things moving. Direct help also came from John
Benedetto, Marie Farge, Patrick Flandrin, Stéphane Jaffard, and Hamid Krim. Alex Grossmann
gave moral support by assuring me that it was an important project. My sincere thanks to all of
these people. The work was done while I was a Liaison Scientist for the Office of Naval Research
European Office, where my primary job was to report on mathematics in Europe. It was through
this work that I first made contact with the French wavelet community, and I thank the Office of
Naval Research for that opportunity. This was essentially a weekend-and-evening project, and
hence a family project. In this context, I thank my son, Michael J. Ryan, who kept house and
produced great dinners while I kept the electrons moving.
Robert D. Ryan
November 1992
London, UK

Preface

The "theory of wavelets" stands at the intersection of the frontiers of mathematics, scientific
computing, and signal processing. Its goal is to provide a coherent set of concepts, methods, and
algorithms that are adapted to a variety of nonstationary signals and that are also suitable for
numerical signal processing.
This book results from a series of lectures that Mr. Miguel Artola Gallego, Director of the
Spanish Institute, invited me to give on wavelets and their applications. I have tried to fulfill, in the
following pages, the objective the Spanish Institute set for me: to present to a scientific audience
coming from different disciplines, the prospects that wavelets offer for signal and image processing.
A description of the different algorithms used today under the name "wavelets" (Chapters 2-7)
will be followed by an analysis of several applications of these methods: to numerical image
processing (Chapter 8), to fractals (Chapter 9), to turbulence (Chapter 10), and to astronomy
(Chapter 11). This will take me out of my scientific domain; as a result, the last two chapters are
merely resumes of the original articles on which they are based.
I wish to thank the Spanish Institute for its generous hospitality as well as its Director for his
warm welcome. Additionally, I note the excellent organization by Mr. Pedro Corpas.
My thanks go also to my Spanish friends and colleagues who took the time to attend these
lectures.

CHAPTER 1

Signals and Wavelets

The purpose of this first chapter is to give the reader a fairly clear idea about
the scientific content of the following chapters. All of the "themes" that will be
developed in this study, using the inevitable mathematical formalism, already
appear in this "overture." It is written with a concern for simplicity and clarity,
while avoiding as much as possible the use of formulas and symbols.
Signal and image processing always leads to a collection of techniques or
procedures. But like all other scientific disciplines, signal and image processing assumes certain preliminary scientific conventions. We have sought, in this
first chapter, to describe the intellectual architecture underlying the algorithmic
constructions that will be presented in the following chapters.
1.1. What is a signal?

Signal processing has become an essential part of contemporary scientific and
technological activity. Signal processing is used in telecommunications (telephone and television), in the transmission and analysis of satellite images, and
in medical imaging (echography, tomography, and nuclear magnetic resonance),
all of which involve the analysis and interpretation of complex time series. A
record of stock price fluctuations is a signal, as is a record of temperature readings
that permit the analysis of climatic variations and the study of global warming.
This list is by no means exhaustive.
Does there exist a precise definition of a signal that is appropriate for the
field of scientific activity called "signal processing"? A needlessly broad definition could include the sequence of letters, spaces, and punctuation marks
appearing in Montaigne's Essays, but the tools we present do not apply to such
a signal. However, the structuralist analysis done by Roland Barthes on literary
texts shares some amusing similarities with the multiresolution analysis that we
describe in Chapter 4.
The signals we study will always be series of numbers and not series of letters,
words, or phrases. These numbers come from measurements, which are typically
made using some recording method. The signals ultimately appear as functions
of time. This is true for one-dimensional signals. The case of two-dimensional
signals will be examined in a moment.
The objectives of signal processing are to analyze accurately, code efficiently,
transmit rapidly, and then to reconstruct carefully at the receiver the delicate
oscillations or fluctuations of this function of time. This is important because
all of the information contained in the signal is effectively present and hidden in
the complicated arabesques appearing in its graphical representation.
These remarks apply to speech: A speech signal originates as subtle time
variations of air pressure and becomes a curve whose complex graphical characteristics
are an "adapted copy" of the voice.
It is equally important to consider two-dimensional signals, which is to say,
images. Here again, image processing is done on the numerical representation of
the image. For a black and white image, the numerical representation is created
by replacing the x and y coordinates of an image point with those of the closest
point on a sufficiently fine grid. The value f(x, y) of the "gray scale" is then
replaced with an average coefficient, which is then assigned to the corresponding
grid point.
The image thus becomes a large, typically square, matrix. Image processing
is done on this matrix.
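
As an illustration of this sampling step, here is a minimal Python sketch. It assumes the image is given as a gray-scale function f(x, y) on the unit square and that the "average coefficient" of a grid cell is the mean of a few evaluations inside that cell; the function name `sample_to_grid` and the `oversample` parameter are hypothetical choices made for this example only.

```python
def sample_to_grid(f, n, oversample=4):
    """Return an n x n matrix of average gray levels of f over the unit square.

    Each grid cell is averaged over oversample x oversample evaluation points
    (an assumption; the text only speaks of "an average coefficient").
    """
    h = 1.0 / n
    matrix = []
    for i in range(n):              # row index, y direction
        row = []
        for j in range(n):          # column index, x direction
            total = 0.0
            for p in range(oversample):
                for q in range(oversample):
                    x = (j + (q + 0.5) / oversample) * h
                    y = (i + (p + 0.5) / oversample) * h
                    total += f(x, y)
            row.append(total / oversample ** 2)
        matrix.append(row)
    return matrix


# Example gray scale: a horizontal ramp from black (0) to white (1).
image = sample_to_grid(lambda x, y: x, n=4)
```

Every later operation would then act on `image`, a square matrix of numbers, rather than on the function f itself.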

These matrices are enormous, and as soon as one deals with a sequence of
images, the volume of numerical data that must be processed becomes immense.
Is it possible to reduce this volume by considering the "hidden laws" or correlations that exist among the different pieces of numerical information representing
the image? This question leads us naturally to define the goals of the scientific
discipline called "signal processing."
1.2. The goals of signal and image processing.

Experts in signal processing are called on to describe, for a given class of signals,
algorithms that lead to the construction of microprocessors and that allow certain
operations and tasks to be done automatically. These tasks may be: analysis
and diagnostics, coding, quantization and compression, transmission or storage,
and synthesis and reconstruction.
We will use several examples to illustrate the nature of these operations and
the difficulties they present. It will become clear that no "universal algorithm"
is appropriate for the extreme diversity of the situations encountered. Thus, a
large part of this work is devoted to constructing coding or analysis algorithms
that can be adapted to the signals that one processes.
Our first example is the study of climatic variations and global warming.
This example was discussed by Professor Jacques-Louis Lions at the Spanish
Institute in 1990, and the following thoughts were inspired by his talks [4].
In this example, one has fairly precise temperature measurements from different
points in the northern hemisphere that were taken over the last two centuries,
and one tries to discover if industrial activity has caused global warming. The
extreme difficulty of the problem arises from the existence of significant natural
temperature fluctuations. Moreover, these fluctuations and the corresponding
climatic changes have always existed, as we learn from paleoclimatology [6].
To specify a diagnostic, it is essential to analyze, and then to erase, these
natural fluctuations (which play the role of noise) in order to have access to the
"artificial" heating of the planet resulting from human activity. The diagnostic often depends on extracting a small number of significant parameters from
a signal whose complexity and size are overwhelming. Thus the analysis and
the diagnostic rely naturally on data compression. If this compression is done
inappropriately, it can falsify the diagnostic.
Data compression also occurs in the problem of transmission. Indeed, transmission channels have a limited capacity, and it is therefore important to reduce
as much as possible the abundance of raw information so that it fits within the
channel's "bit allocation."
One thinks, for example, of the digital telephone (Chapter 3) and the 64
Kbit/sec standard, which limits, without appeal, the quantity of information
that can be transmitted in one second.
A more surprising example appears in neurophysiology. The optic nerve's capacity to transmit visual information is clearly less than the volume of information collected by all the retinal cells. Thus, there must be "low-level processing"
of information before it transits the optic nerve. David Marr has developed a theory that allows us to understand the purpose and performance of this low-level
processing. We present this theory in Chapter 8.
We now consider problems posed by coding and quantization. Different coding algorithms will be presented and studied in this work: subband coding,
transform coding, and coding by zero-crossings. In each case, coding involves
methods to transform the recorded numerical signal into another representation
that is, depending on the nature of the signals studied, more convenient for
some task or further processing. Quantization is associated with coding. The
"exact" numerical values given by coding are replaced with nearby values that
are compatible with the bit allocation dictated by the transmission capacity.
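
The replacement of "exact" values by nearby admissible values can be sketched as follows. This is only a minimal illustration, assuming a uniform quantizer on a fixed interval; the function `quantize`, its parameters, and the sample coefficients are hypothetical and are not taken from the text.

```python
def quantize(values, bits, lo=-1.0, hi=1.0):
    """Replace each value by the center of one of 2**bits equal cells of [lo, hi].

    A uniform quantizer is assumed here purely for illustration.
    """
    levels = 2 ** bits
    step = (hi - lo) / levels
    out = []
    for v in values:
        k = int((v - lo) / step)            # index of the cell containing v
        k = max(0, min(levels - 1, k))      # clamp values outside [lo, hi)
        out.append(lo + (k + 0.5) * step)   # representative value of the cell
    return out


coeffs = [0.73, -0.12, 0.05, -0.98]         # "exact" values produced by coding
coarse = quantize(coeffs, bits=3)           # 3 bits allocated per coefficient
noise = [q - c for q, c in zip(coarse, coeffs)]
```

With 3 bits the cells have width 0.25, so each entry of `noise` is at most 0.125 in absolute value; allocating fewer bits widens the cells and increases this error.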
Quantization is an unavoidable step in signal and image processing. Unfortunately, it introduces systematic errors, known as "quantization noise." The
coding algorithms that are used (taking into account the nature of the signals)
ought to reduce the effects of quantization noise when decoding takes place. One
of the advantages of quadrature mirror filters is that they "trap" this quantization noise inside well-defined frequency channels. These filters will be studied in
Chapter 3.
The problems encountered in archiving data (as well as problems of transmission and reconstruction) are illustrated by the FBI's task of storing the American
population's fingerprints. Different image-compression algorithms were tested,
and a variant of the algorithm described in Chapter 6 gave the best results. This
established a standard for fingerprint compression and reconstruction.
The last group of operations consists of decoding, synthesis, and restoration.
Synthesis and decoding are the inverse operations of coding and quantization.
The task is to reconstruct an image or audible signal at the receiver from the
series of 0s and 1s that have traveled over the transmission channel. One thinks
of decoding an encoded message, as in cryptography, and this analogy is correct
because one cannot reconstruct an image or signal without knowing the coding
algorithm.

Signal restoration is similar to the restoration of old paintings. It amounts to
ridding the signal of artifacts and errors (which we call noise), and to enhancing
certain aspects of the signal that have undergone attenuation, deterioration, or
degradation.

1.3. Stationary signals, transient signals, and adaptive coding algorithms.

We have just defined a set of tasks, or operations, to be performed on signals
or images. These tasks form a coherent collection. The purpose of this book is
to describe certain coding algorithms that have, during the last few years, been
shown to be particularly effective for analyzing signals having a fractal structure
or for compression and storage. We will also describe certain "meta-algorithms"
that allow one to choose the coding algorithm best suited to a given signal.
To better approach this problem of choosing an adaptive algorithm, we briefly
classify signals by distinguishing stationary signals, quasi-stationary signals, and
transient signals.
A signal is stationary if its properties are statistically invariant over time. A
well-known stationary signal is white noise, which, in its sampled form, appears
as a series of independent drawings. A stationary signal can exhibit unexpected
events, but we know in advance the probabilities of these events. These are the
statistically predictable unknowns.
The ideal tool for studying stationary signals is the Fourier transform. In
other words, stationary signals decompose canonically into linear combinations
of waves (sines and cosines). In the same way, signals that are not stationary
decompose into linear combinations of wavelets.
The study of nonstationary signals, where transient events appear that cannot
be predicted (even statistically with knowledge of the past), necessitates techniques different from Fourier analysis. These techniques, which are specific to
the nonstationarity of the signal, include wavelets of the "time-frequency" type
and wavelets of the "time-scale" type. "Time-frequency" wavelets are suited,
most specifically, to the analysis of quasi-stationary signals, while "time-scale"
wavelets are adapted to signals having a fractal structure.
Before defining "time-frequency" wavelets and "time-scale" wavelets, we indicate their common points. They belong to a more general class of algorithms
that are encountered as often in mathematics as in speech processing. Mathematicians speak of "atomic decompositions," while speech specialists speak of
"decompositions in time-frequency atoms"; the scientific reality is the same in
both cases.
An "atomic decomposition" consists in extracting the simple constituents
that make up a complicated mixture. However, contrary to what happens in
chemistry, the "atoms" that are discovered in a signal will depend on the point
of view adopted for the analysis. These "atoms" will be "time-frequency atoms" when we study quasi-stationary signals, but they could, in other situations, be
replaced by "time-scale wavelets" or "Grossmann-Morlet wavelets."

These "atoms" or "wavelets" have no more physical existence than the number system used to multiply the mass of the earth by that of the moon. Each
number system has an internal logical coherence, but no scientific law asserts
that multiplication must of necessity be done in base 10 rather than base 2. On
the other hand, the number system used by the Romans is certainly excluded
because it is not suitable for multiplication.
Having different algorithms that allow us to code a signal by decomposing it
into "time-frequency atoms" presents us with a similar situation. The decision to
use one or the other of these algorithms will be made by considering their performance. How well they perform must be judged in terms of one of the anticipated
goals of signal processing. An algorithm that is optimal for compression can be
disastrous for analysis: A standard energetic criterion for the compression could
cause details that are important for the analysis to be systematically neglected.
These thoughts will be developed and clarified in sections 1.6 and 1.7. It is time to
define wavelets, which we do in the next two sections.

1.4. Grossmann-Morlet time-scale wavelets.

"Time-scale" analysis (which should be called "space-scale" in the image case,
but which we prefer to call "multiresolution analysis") involves using a vast
range of scales for signal analysis. This notion of scale, which clearly refers to
cartography, implies that the signal (or image) is replaced, at a given scale, by
the best possible approximation that can be drawn at that scale. By "traveling"
from the large scales toward the fine scales, one "zooms in" and arrives at more
and more exact representations of the given signal.
The analysis is then done by calculating the change from one scale to the
next. These are the details that allow one, by correcting a rather crude approximation, to move toward a better quality representation. This algorithmic
scheme is called "multiresolution analysis" and is developed in Chapters 3 and
4. Multiresolution analysis is equivalent to an atomic decomposition where the
atoms are Grossmann-Morlet wavelets.
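
The idea of a crude approximation that is corrected by details can be sketched in a few lines of Python. The sketch assumes the simplest possible rule, pairwise averages and half-differences (the Haar case); this choice, and the function names `coarsen` and `refine`, are illustrative assumptions and not the general scheme developed later in the book.

```python
def coarsen(signal):
    """Go from a fine scale to the next coarser one.

    Returns the coarse approximation (pairwise means) and the details
    (pairwise half-differences) needed to undo the step.
    """
    half = len(signal) // 2
    approx = [(signal[2 * k] + signal[2 * k + 1]) / 2 for k in range(half)]
    detail = [(signal[2 * k] - signal[2 * k + 1]) / 2 for k in range(half)]
    return approx, detail


def refine(approx, detail):
    """Correct the coarse approximation with the details to recover the fine scale."""
    fine = []
    for a, d in zip(approx, detail):
        fine.extend([a + d, a - d])
    return fine


signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, detail = coarsen(signal)      # crude view plus the corrections
restored = refine(approx, detail)     # "zooming in" again
```

Applying `coarsen` repeatedly to `approx` would give the approximations at larger and larger scales, with one list of details recorded at each step.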


We define these wavelets by starting with a function $\psi(t)$ of the real variable
$t$. This function is called a "mother wavelet" provided it is well localized and
oscillating. (By oscillating it resembles a wave, but by being localized it is a
wavelet.) The localization condition is expressed in the usual way as decreasing
rapidly to zero when $|t|$ tends to infinity. The second condition suggests that
$\psi(t)$ vibrates like a wave: Here we require that the integral of $\psi(t)$ be zero and
that the same hold true for the first $m$ moments of $\psi$. This is expressed as

(1.1)    $0 = \int_{-\infty}^{\infty} \psi(t)\,dt = \cdots = \int_{-\infty}^{\infty} t^{m-1}\psi(t)\,dt.$

The "mother wavelet," $\psi(t)$, generates the other wavelets, $\psi_{(a,b)}(t)$, $a > 0$,
$b \in \mathbb{R}$, of the family by change of scale (the scale of $\psi(t)$ is conventionally 1,
and that of $\psi_{(a,b)}(t)$ is $a > 0$) and translation in time (the function $\psi(t)$ is
conventionally centered around 0, and $\psi_{(a,b)}(t)$ is then centered around $b$).

Thus we have

(1.2)    $\psi_{(a,b)}(t) = \frac{1}{\sqrt{a}}\,\psi\!\left(\frac{t-b}{a}\right), \qquad a > 0, \quad b \in \mathbb{R}.$
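
To make the definitions (1.1) and (1.2) concrete, the following Python sketch checks them numerically for one particular mother wavelet. The choice $\psi(t) = (1 - t^2)e^{-t^2/2}$ (often called the Mexican hat) is an assumption made for illustration; it is not prescribed by the text. The helper `integrate` is a plain midpoint rule on a truncated interval.

```python
import math


def psi(t):
    """Illustrative mother wavelet (Mexican hat); an assumed example."""
    return (1.0 - t * t) * math.exp(-t * t / 2.0)


def psi_ab(t, a, b):
    """Wavelet of scale a > 0 centered around b, following (1.2)."""
    return psi((t - b) / a) / math.sqrt(a)


def integrate(g, lo=-20.0, hi=20.0, n=40000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


moment0 = integrate(psi)                       # integral of psi
moment1 = integrate(lambda t: t * psi(t))      # first moment of psi
energy = integrate(lambda t: psi(t) ** 2)
energy_ab = integrate(lambda t: psi_ab(t, 2.0, 3.0) ** 2)
```

For this choice `moment0` and `moment1` vanish up to rounding, so (1.1) holds with m = 2. The factor $1/\sqrt{a}$ in (1.2) leaves the energy unchanged, which is why `energy` and `energy_ab` agree.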

Alex Grossmann and Jean Morlet have shown that, if $\psi(t)$ is real-valued, this
collection can be used as if it were an orthonormal basis. This means that any
signal of finite energy can be represented as a linear combination of wavelets
$\psi_{(a,b)}(t)$ and that the coefficients of this combination are, up to a normalizing
factor, the scalar products $\int_{-\infty}^{\infty} f(t)\,\psi_{(a,b)}(t)\,dt$.
These scalar products measure, in a certain sense, the fluctuations of the
signal $f(t)$ around the point $b$, at the scale given by $a > 0$.
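
The following Python sketch illustrates how such a scalar product reacts to a local fluctuation. It assumes the same illustrative mother wavelet $\psi(t) = (1 - t^2)e^{-t^2/2}$ as a stand-in, and a hypothetical test signal made of a straight line plus a jump at t = 2; neither comes from the text.

```python
import math


def psi(t):
    """Illustrative mother wavelet (Mexican hat); an assumed example."""
    return (1.0 - t * t) * math.exp(-t * t / 2.0)


def coefficient(f, a, b, half_width=8.0, n=4000):
    """Midpoint-rule approximation of the scalar product of f with psi_(a,b)."""
    lo = b - half_width * a
    hi = b + half_width * a
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        t = lo + (k + 0.5) * h
        total += f(t) * psi((t - b) / a) / math.sqrt(a)
    return total * h


def f(t):
    """Hypothetical signal: a linear trend with a unit jump at t = 2."""
    return 0.5 * t + (1.0 if t >= 2.0 else 0.0)


smooth = coefficient(f, a=0.1, b=-3.0)     # far from the jump
near_jump = coefficient(f, a=0.1, b=2.1)   # window overlapping the jump
```

Because this wavelet has vanishing integral and first moment, the linear trend contributes nothing: `smooth` is essentially zero, while `near_jump` is clearly nonzero and signals a fluctuation near b = 2.1 at scale a = 0.1.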
It required uncommon scientific intuition to assert, as Grossmann and Morlet
did, that this new method of time-scale analysis was suitable for the analysis and
synthesis of transient signals. Signal processing experts were annoyed by the
intrusion of these two poachers on their preserve and made fun of their claims.
This polemic died out after only a few years. In fact, the argument should
never have arisen because the methods of time-scale or multiresolution analysis
had existed for five or six years under various disguises: in signal analysis (under
the name of quadrature mirror filters) and in image analysis (under the name of
pyramid algorithms).
The first to report on this was Stéphane Mallat. He constructed a guide
that allowed the same signal analysis method to be recognized under very different presentations: wavelets, pyramid algorithms, quadrature mirror filters,
Littlewood-Paley analysis, David Marr's zero-crossings...
Mallat's brilliant synthesis has been the source of many new developments.
One of the most notable of these is Ingrid Daubechies's discovery of orthonormal wavelet bases having preselected regularity and compact support. The
only previously known case was the Haar system (1909), which is not regular. Thus 80 years separated Alfred Haar's work and its natural extension by
Daubechies [1].

1.5. Time-frequency wavelets from Gabor to Malvar.

Dennis Gabor was the first to introduce time-frequency wavelets or Gabor
wavelets. He had the idea to divide a wave (whose mathematical representation is $\cos(\omega t - \varphi)$) into segments and then to throw away all but one of these
segments. This left a piece of a wave, or a wavelet, which had a beginning and
an end.
To use a musical analogy, a wave corresponds to a note (a re minor, for
example) that has been emitted since the origin of time and sounds indefinitely,
without attenuation, until the end of time. A wavelet then corresponds to the
same re minor that is struck at a certain moment, say, on a piano, and is later
muffled by the pedal. In other words, a Gabor wavelet has (at least) three pieces
of information: a beginning, an end, and a specific frequency in between.

Difficulties appeared when it was necessary to decompose a signal using Gabor wavelets. As long as one does only continuous decompositions (using all
frequencies and all time), Gabor wavelets can be used as if they formed an orthonormal
basis. But the corresponding discrete algorithms do not exist, or they
require so much tinkering that they become too complicated.
It is only very recently, by abandoning Gabor's approach, that two separate
groups have discovered time-frequency wavelets having good algorithmic
qualities. These special time-frequency wavelets, called Malvar wavelets, are
particularly well suited for coding speech and music. Moreover, they are equally
useful to the FBI for storing fingerprints.
The decomposition of a signal in an orthonormal basis of Malvar wavelets
imitates writing music using a musical score. But this comparison is misleading
because a piece of music can be written in only one way, whereas there exists
a nondenumerable infinity of orthonormal bases of Malvar wavelets. Choosing
one among these is equivalent to segmenting the given signal and then doing
a traditional Fourier analysis on the delimited pieces. What is the best way
to choose this segmentation? This question leads us naturally to the following
section.
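
The phrase "segmenting the given signal and then doing a traditional Fourier analysis on the delimited pieces" can be sketched as follows. This is a deliberately simplified illustration: it cuts the signal with sharp boundaries and uses a plain discrete Fourier transform on each piece, which is an assumption of this sketch and not the construction of Malvar wavelets themselves. The function names and the two-note test signal are hypothetical.

```python
import cmath
import math


def dft_magnitudes(x):
    """Magnitudes of the discrete Fourier transform of x for bins 0..len(x)//2."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def dominant_bins(signal, boundaries):
    """For each segment [boundaries[i], boundaries[i+1]) return its strongest bin."""
    result = []
    for lo, hi in zip(boundaries, boundaries[1:]):
        mags = dft_magnitudes(signal[lo:hi])
        result.append(max(range(len(mags)), key=mags.__getitem__))
    return result


# Two "notes" played one after the other: 4 cycles per 64 samples, then 10.
signal = [math.cos(2 * math.pi * 4 * t / 64) for t in range(64)]
signal += [math.cos(2 * math.pi * 10 * t / 64) for t in range(64)]

notes = dominant_bins(signal, [0, 64, 128])   # one segment per note
```

With the boundary placed between the two notes, each segment shows a single clean frequency. A different choice of `boundaries` would mix the two notes inside one segment and blur the analysis, which is why the choice of segmentation matters.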

1.6. Optimal algorithms in signal processing.

Which wavelet to choose? I have often posed this question at meetings on
wavelets and their applications held since 1985. But this question needs to be
sharpened. What freedom of choice is at our disposal? What are the objectives
of the choices we make? Can we make better use of the choices offered to us by
considering the anticipated goals? These are several of the questions we will try
to answer.
The goal we have in mind is aptly illustrated by a remark by Benoit Mandelbrot made in an interview on "France-Culture": "The world around us is very
complicated. The tools at our disposal to describe it are very weak."
It is notable that Mandelbrot used the word "describe" and not "explain"
or "interpret." We are going to follow him in this, ostensibly, very modest
approach. This is our answer to the problem about the objectives of the choices:
Wavelets, whether they are of the time-scale or time-frequency type, will not help
us to explain scientific facts, but they will serve to describe the reality around us,
whether or not it is scientific.
Our task is to optimize the description. This means that we must make the
best use of the resources allocated to us (for example, the number of available
bits) to obtain the most precise possible description.
To resolve this problem, we must first indicate how the quality of the description will be judged. Most often, the criteria used are academic and do not
correspond at all to the user's point of view.
For example, in image processing, all the calculations for judging the quality
of the description use the quadratic mean value of gray levels. It is clear, however,
that our eye has a much more selective sensitivity. Thus, in the last analysis, we
should submit the performance of an "optimal algorithm" to the users because
the average approximation criterion that leads to this algorithm will often be
inadequate.
The case of speech (telephonic communication) or music is similar. After
systematic research that optimizes the reception quality (quality calculated with
an average), it would be advisable to experiment by taking into account the
judgments of telephone users and musicians.
Thus we see a two-state program developmg. Nevertheless, the only stage we
describe in the following pages is the "objective search" for an optimal algorithm,
even though its optimality is defined in terms of a debatable energy criterion.
Rather than formulate ad hoc algorithms for each signal or each class of
signals, we are going to construct, once and for all, a vast collection called a
library of algorithms. We will also construct a "meta-algorithm" whose function
will be to find the particular algorithm in the library that best serves the given
signal in view of the criterion for the quality of the description.
It is hardly an exaggeration to say that we will introduce almost as many
analysis algorithms as there are signals. For example, for a signal recorded on
$2^{10} = 1024$ points, the number of algorithms at our disposal will be of the order
$2^{1024}$. This is the number of all the signals defined on our 1024 points that take
only two values, 0 and 1.
Thus we will use a "very large library" to describe the signals, but we exclude
the "library of Babel." This would contain all the books, or all the signals in
our case. And as everyone knows, the search for a specific book in the library of
Babel is an insurmountable task. The "ideal library" must be sufficiently rich
to suit all transient signals, but the "books" must be easily accessible.
While a single algorithm (Fourier analysis) is appropriate for all stationary signals, the transient signals are so rich and complex that a single analysis
method (whether of time-scale or time-frequency) cannot serve them all.
If we stay in the relatively narrow environment of Grossmann-Morlet
wavelets, also called time-scale algorithms, we have only two ways to adapt
the algorithm to the signal being studied: We can choose one or another analyzing wavelet, and we can use either the continuous or the discrete version of the
wavelets.
For example, we can require the analyzing wavelet $\psi$ to be an analytic signal,
which means that its Fourier transform $\hat{\psi}(\xi)$ is zero for negative frequencies. In
this case, all the wavelets $\psi_{(a,b)}$, $a > 0$, $b \in \mathbb{R}$, generated by $\psi$ will still have
this property, and their linear combinations given by the algorithm will be the
analytic signal $F$ associated with the real signal $f$.
Likewise, we can follow Daubechies and, for a given $r \ge 1$, choose for $\psi(x)$ a
real-valued function in the class $C^r$ with compact support such that the collection
$2^{j/2}\psi(2^j x - k)$, $j, k \in \mathbb{Z}$, is an orthonormal basis for $L^2(\mathbb{R})$. In this discrete
version of the algorithm, $a = 2^{-j}$ and $b = k2^{-j}$, $j, k \in \mathbb{Z}$.
In spite of this, the choices that can be made from the set of time-scale
wavelets remain limited. The search for optimal algorithms leads us to some


remarkable algorithmic adventures, where time-scale wavelets and time-frequency
wavelets are in competition, and where they are also compared to intermediate
algorithms that mix the two extreme forms of analysis.
These considerations are developed in Chapters 6 and 7, and the question I
asked myself six years ago (what wavelet to choose?) seems passé to me today.
The choices that we can and must consider no longer involve only the analyzing
instrument (the wavelet); they also involve the methodology employed (time-scale, time-frequency, or intermediate algorithms).
Today, the competing algorithms (time-scale and time-frequency) are included in a whole universe of intermediate algorithms. An entropy criterion
permits us to choose the algorithm that optimizes the description of the given
signal within the given bit-allocation.
Each algorithm is presented in terms of a particular orthogonal basis. We
can compare searching for the optimal algorithm to searching for the best point
of view, or best perspective, to look at a statue in a museum. Each point of view reveals certain parts of the statue and obscures others. We change our point
of view to find the best one by going around the statue. In effect, we make a
rotation: we change the orthonormal basis of reference to find the optimal basis.
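The idea of choosing, within a library of orthonormal bases, the one that best describes a given signal can be sketched numerically. The example below is only an illustration under stated assumptions: the library contains just two bases (the standard basis and a discrete Haar basis), and the cost is the Shannon entropy of the normalized squared coefficients, which is one common choice and not necessarily the criterion developed in the later chapters. The names `haar_matrix`, `entropy_cost`, and `best_basis` are hypothetical.

```python
import numpy as np

def haar_matrix(n):
    """Orthonormal discrete Haar basis of R^n (n a power of 2), one vector per row."""
    if n == 1:
        return np.array([[1.0]])
    h = haar_matrix(n // 2)
    top = np.kron(h, [1.0, 1.0])                    # coarser scales, stretched by 2
    bottom = np.kron(np.eye(n // 2), [1.0, -1.0])   # finest-scale differences
    return np.vstack([top, bottom]) / np.sqrt(2.0)

def entropy_cost(coeffs):
    """Shannon entropy of the energy distribution (assumed cost functional)."""
    p = coeffs**2 / np.sum(coeffs**2)
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def best_basis(signal, library):
    """Return the name of the basis with the smallest entropy, and all costs."""
    costs = {name: entropy_cost(basis @ signal) for name, basis in library.items()}
    return min(costs, key=costs.get), costs

n = 64
library = {"standard": np.eye(n), "haar": haar_matrix(n)}
spike = np.zeros(n); spike[10] = 1.0        # well described sample by sample
step = np.ones(n); step[n // 2:] = -1.0     # well described by one Haar vector
print(best_basis(spike, library)[0], best_basis(step, library)[0])
```

A low entropy means that a few coefficients carry most of the energy, which is the sense in which a basis "describes" the signal economically here.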
These reflections lead us quite naturally to the scientific thoughts of David
Marr, which we present in the next section.

1.7. Optimal representation, according to Marr.

David Marr was fascinated and intrigued by the complex relations that exist
between the choice of a representation of a signal and the nature of the operations
or transformations that such a representation permits. He wrote [5, pp. 20-21]:
A representation is a formal system for making explicit certain entities or
types of information, together with a specification of how the system does this.
And I shall call the result of using a representation to describe a given entity a
description of the entity in that representation.
For example, the Arabic, Roman and binary numerical systems are all formal
systems for representing numbers. The Arabic representation consists of a string
of symbols drawn from the set {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} and the rule for
constructing the description of a particular integer n is that one decomposes
n into a sum of multiples of powers of 10... A musical score provides a way
of representing a symphony; the alphabet allows the construction of a written
representation of words...
A representation, therefore, is not a foreign idea at all: we all use representations all the time. However, the notion that one can capture some aspect of
reality by making a description of it using a symbol and that to do so can be
useful seems to me a fascinating and powerful idea. But even the simple examples we have discussed introduce some rather general and important issues that
arise whenever one chooses to use one particular representation.
For example, if one chooses the Arabic numerical representation, it is easy to
discover whether a number is a power of 10 but difficult to discover whether it is
a power of 2. If one chooses the binary representation, the situation is reversed.


Thus, there is a trade-off; any particular representation makes certain information explicit at the expense of information that is pushed into the background and
may be quite hard to recover.¹
This issue is important, because how information is presented can greatly
affect how easy it is to do different things with it. This is evident even from
our numbers example: it is easy to add, to subtract and even to multiply if the
Arabic or binary representations are used, but it is not at all easy to do these
things, especially multiplication, with Roman numerals. This is a key reason
why the Roman culture failed to develop mathematics in the way the earlier
Arabic cultures had...
There is an essential difference between Marr's considerations and the algorithms that we develop in the first seven chapters. The difference is that the choice
of the best representation, according to Marr, is tied to an objective goal. For
the problem posed by vision, the goal is to extract the contours, recognize the
edges of objects, delimit them, and understand their three-dimensional organization. In the algorithms we develop, the only criterion is internal to the algorithm
and consists in reducing the amount of data. We have not yet studied situations
where one must also take into consideration an external criterion. In spite of
this difference, Marr's point of view is very close to the one we will present.

1.8. Terminology.
The elementary constituents used for signal analysis and synthesis will be called,
depending on the circumstances, wavelets, time-frequency atoms, or wavelet
packets (Chapter 7).
The wavelets used will be either Grossmann-Morlet wavelets of the form

$\frac{1}{\sqrt{a}}\,\psi\left(\frac{t-b}{a}\right), \quad a > 0, \quad b \in \mathbb{R},$

the wavelets of Daubechies that have the form

$2^{j/2}\psi(2^j t - k), \quad j, k \in \mathbb{Z},$

or Gabor-Malvar wavelets of the form

$w(t - l)\cos\left[\pi\left(k + \tfrac{1}{2}\right)(t - l)\right], \quad k \in \mathbb{N}, \quad l \in \mathbb{Z}.$

In the first two cases, we will speak of time-scale algorithms; in the last case,
we will speak of time-frequency algorithms. Later we will mix the two points of
view and subject the Gabor-Malvar wavelets to dyadic dilations to construct the
Daubechies wavelets. One thus encounters generalized time-frequency atoms.
We will use only two "very large libraries." The first consists of orthonormal
bases whose elements are wavelet packets. In the second, the wavelet packets are
replaced by the generalized time-frequency atoms that we have just described.
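The three families above differ only in how a fixed function is translated, dilated, or modulated, which a few lines of code make concrete. This is a minimal sketch, not a construction from the text: the "Mexican hat" used for $\psi$ is an arbitrary stand-in chosen because it has a closed form, the window $w$ is left to the caller, and all function names are hypothetical.

```python
import numpy as np

def mexican_hat(t):
    """Stand-in analyzing wavelet (an assumption; any admissible psi would do)."""
    return (1.0 - t**2) * np.exp(-t**2 / 2.0)

def grossmann_morlet(psi, a, b):
    """t -> a^(-1/2) psi((t - b)/a), with a > 0 and b real."""
    return lambda t: psi((t - b) / a) / np.sqrt(a)

def dyadic(psi, j, k):
    """t -> 2^(j/2) psi(2^j t - k), with j, k integers."""
    return lambda t: 2.0**(j / 2.0) * psi(2.0**j * t - k)

def gabor_malvar(w, k, l):
    """t -> w(t - l) cos[pi (k + 1/2)(t - l)] for a caller-supplied window w."""
    return lambda t: w(t - l) * np.cos(np.pi * (k + 0.5) * (t - l))

t = np.linspace(-60.0, 60.0, 240001)
dt = t[1] - t[0]
energy = lambda y: np.sum(y**2) * dt       # Riemann-sum approximation of the L^2 energy

# The factor a^(-1/2) keeps the energy independent of the scale a.
print(energy(mexican_hat(t)), energy(grossmann_morlet(mexican_hat, 3.0, 1.5)(t)))
```

The dyadic family is the special case $a = 2^{-j}$, $b = k2^{-j}$ of the first one, so `dyadic(psi, j, k)` and `grossmann_morlet(psi, 2.0**-j, k * 2.0**-j)` return the same function.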
¹ The italics are ours.

1.9. Reader's guide.

In Chapters 2 through 7, we present the time-scale algorithms (Chapters 3 and
4) and time-frequency algorithms (Chapters 5, 6, and 7). Chapter 2 has a special
status. We have tried to retrace the path that led from Fourier analysis (Fourier
1807) to wavelet analysis (Calderón 1960 and Strömberg 1981) and to the very
core of contemporary mathematics.
Quadrature mirror filters are studied in Chapter 3 in connection with problems posed by the digital telephone.
The pyramid algorithms described in Chapter 4 concern numerical image
processing. The orthogonal pyramids use precisely the quadrature mirror filters
of Chapter 3. The pyramid algorithms lead either to orthogonal wavelets or to
bi-orthogonal wavelets.
From Chapter 5 on, we will study time-frequency algorithms. The Wigner-Ville transform enables "the signal to be displayed in the time-frequency plane."
After indicating the main properties of the Wigner-Ville transform, we must
point out that it does not provide an algorithm that allows us to decompose a
signal into time-frequency atoms that are the analogues of musical notes.
The two algorithms that do provide access to these "atomic decompositions"
are presented in Chapters 6 (Malvar wavelets) and 7 (wavelet packets).
The first seven chapters form a coherent unit. This is not the case for the last
four chapters; each of them treats a special application of wavelets and time-scale methods. Chapter 8 deals with the Marr-Mallat theory. This concerns the
possibility of coding an image using the zero-crossings of its wavelet transform.
Using the Grossmann-Morlet analysis one can determine the fractal exponents, as a function of position, of a multifractal object. This is one of the most
remarkable applications of the Grossmann-Morlet wavelets, and it is presented
in Chapter 9.
The last two chapters cover two very dynamic subjects: turbulence and the
multifractal approach to turbulence (Chapter 10) and the hierarchical organization of the galaxies and the structure of the Universe (Chapter 11). In both
cases, results are still meager. But these avenues of research are fascinating, and
the researchers involved have such good reputations that one can reasonably
expect important progress before long.

Bibliography
[1] I. Daubechies, Ten Lectures on Wavelets, Society for Industrial and Applied Mathematics, Philadelphia, PA, 1992.
[2] D. Gabor, Theory of communication, J. IEE, 93 (1946), pp. 429-457.
[3] C. Gasquet and P. Witomski, Analyse de Fourier et applications. Filtrage, Calcul numérique, Ondelettes, Masson, Paris, 1990.
[4] J. L. Lions, La planète Terre: rôle des mathématiques et des super ordinateurs, cours à l'Institut d'Espagne, 1990.
[5] D. Marr, Vision, W. H. Freeman and Co., New York, 1982.
[6] R. Vautard and M. Ghil, Singular spectrum analysis in nonlinear dynamics, with applications to paleoclimatic time series, Physica D, 35 (1989), pp. 395-424.

CHAPTER 2

Wavelets From a Historical Perspective

2.1. Introduction.

The application of wavelets to signal and image processing is only a few years
old. But in looking back over the history of mathematics, we will uncover at least
seven different origins of wavelet analysis. Most of this work was done around
the 1930s, and, at that time, the separate efforts did not appear to be parts of a
coherent theory. In particular, neither the word "wavelet" nor the corresponding
concept appeared in this research done a half-century ago. Only today do we
know that this work prefigured the theory of wavelets.
It is important to describe these seven sources in some detail. Each of them
corresponds to a specific point of view and a particular technique, which, only
now, are we able to view from a common scientific perspective. What's more,
these specific techniques were rediscovered several years ago by physicists and
mathematicians working on wavelets. For example, the Littlewood-Paley analysis (section 2.4), which dates from 1930, underlies Mallat's work on image processing.
Matthias Holschneider used, without knowing it, Lusin's technique (1930) to
clarify the fractal structure of Riemann's function (sections 2.6 and 9.4). Grossmann
and Morlet rediscovered Alberto Calderón's identity (1960) 20 years later. And,
to spare no one, the author of these lines was not the first to construct a regular, well-localized orthonormal wavelet basis having the algorithmic structure
of Haar's system (1909): J. O. Strömberg had done the same thing five years
earlier.
Does this mean that everything had already been written and that "team wavelet" researchers appropriated, while making a great show of it, results discovered by others? This judgment must be qualified for two reasons. In the first
place, the physicists from "team wavelet" were not aware of Calderón's work; it
was in completely good faith that they presented as a revolutionary innovation
results that were about 20 years old. "Team wavelet" researchers can be accused
of ignorance but not of plagiarism. But above all, by rediscovering these known
scientific facts, the investigators gave them new life and authority. Our debt to
Grossmann and Morlet is not so much for having rediscovered the identity of
Calderón as it is for having related it to processing nonstationary signals. This
bold synthesis certainly encountered resistance, and Calderón himself found this
use of his work incongruous.


The role of Morlet and Grossmann in the wavelet saga can be compared to
that of Mandelbrot in the story of fractals. It is true that before Mandelbrot
there were Pierre Fatou and Gaston Julia, and certain of Mandelbrot's discoveries
repeat their work. But Mandelbrot showed us the possibility of interpreting
the world around us with the help of a new concept, that of fractals. None
of the mathematicians working on Hausdorff dimension were ready, from their
training or experience, for such a leap into the unknown. Wavelets have gone
the same way, and one of our objectives is to construct the bridge that relates
signal processing to the different mathematical efforts that developed outside the
"theory of wavelets."
2.2. From Fourier (1807) to Haar (1909), frequency analysis becomes scale analysis.

Let's go back to the origins, that is to Joseph Fourier. As everyone knows, he asserted in 1807 that any $2\pi$-periodic function $f(x)$ is the sum $a_0 + \sum_{k=1}^{\infty}(a_k \cos kx + b_k \sin kx)$ of its "Fourier series." The coefficients $a_0$, $a_k$, and $b_k$ ($k \ge 1$) are calculated by

$a_0 = \frac{1}{2\pi}\int_0^{2\pi} f(x)\,dx$

and by

$a_k = \frac{1}{\pi}\int_0^{2\pi} f(x)\cos kx\,dx, \qquad b_k = \frac{1}{\pi}\int_0^{2\pi} f(x)\sin kx\,dx.$
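The formulas for $a_0$, $a_k$, and $b_k$ can be checked numerically by replacing each integral with an average over equispaced samples. The sketch below assumes a smooth $2\pi$-periodic $f$ given as a Python function; `fourier_coefficients` and the sample count `m` are names introduced here for illustration only.

```python
import numpy as np

def fourier_coefficients(f, kmax, m=4096):
    """Approximate a_0 and a_k, b_k (1 <= k <= kmax) from m equispaced samples."""
    x = 2.0 * np.pi * np.arange(m) / m          # nodes on [0, 2*pi)
    fx = f(x)
    a0 = fx.mean()                              # (1/2pi) * integral of f
    k = np.arange(1, kmax + 1)
    # (1/pi) * integral of f(x) cos(kx) dx is twice the sample mean of f cos(kx)
    a = 2.0 * (fx[None, :] * np.cos(np.outer(k, x))).mean(axis=1)
    b = 2.0 * (fx[None, :] * np.sin(np.outer(k, x))).mean(axis=1)
    return a0, a, b

f = lambda x: 3.0 + 2.0 * np.cos(2 * x) - 5.0 * np.sin(3 * x)
a0, a, b = fourier_coefficients(f, 4)
print(a0, a.round(12), b.round(12))
```

For a trigonometric polynomial of degree less than `m / 2`, the equispaced average reproduces the integrals up to rounding, so the coefficients of the test function are recovered exactly.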

When Fourier announced his surprising results, neither the notion of function
nor that of integral had yet received a precise definition. We can even say
that Fourier's statement played an essential role in the evolution of the ideas
mathematicians have had about these concepts.
Before Fourier's work, power series were used to represent and manipulate
functions, and the most general functions that could be constructed were endowed with very special properties indeed. Furthermore, these properties were
unconsciously associated with the notion of function itself. By passing from a
representation of the form

(2.1)    $a_0 + a_1 x + a_2 x^2 + \cdots$

to one of the form

(2.2)    $a_0 + (a_1 \cos x + b_1 \sin x) + (a_2 \cos 2x + b_2 \sin 2x) + \cdots$

Fourier discovered, without knowing it, a new functional universe.
However, in 1873, Paul Du Bois-Reymond constructed a continuous, $2\pi$-periodic function of the real variable $x$, whose Fourier series diverged at a given
point. If Fourier's assertion were true, it could not be so in the naive sense
imagined by Fourier.
At that time, three new avenues were opened to mathematicians, and all
three have led to important results:


(a) They could modify the notion of function and find one that is adapted,
in a certain sense, to Fourier series;
(b) They could modify the definition of convergence of Fourier series; or
(c) They could find other orthogonal systems for which the phenomenon,
discovered by Du Bois-Reymond in the case of the trigonometric system, cannot
happen.
The functional concept best suited to Fourier series was created by Henri
Lebesgue. It involves the space $L^2[0, 2\pi]$ of (classes of) functions that are square-integrable on the interval $[0, 2\pi]$. The sequence

$\frac{1}{\sqrt{2\pi}},\ \frac{1}{\sqrt{\pi}}\cos x,\ \frac{1}{\sqrt{\pi}}\sin x,\ \frac{1}{\sqrt{\pi}}\cos 2x,\ \frac{1}{\sqrt{\pi}}\sin 2x,\ \ldots$

is an orthonormal basis for this space. Furthermore, the coefficients of the decomposition in this orthonormal basis form a square-summable series, and this
expresses the conservation of energy: The quadratic mean value of the developed function $f(x)$ is (up to a normalization factor) the sum of the squares of
the coefficients. Finally, the Fourier series of $f$ converges to $f$ in the sense of the
quadratic mean.
The second way that was followed to avoid the difficulty raised by Du Bois-Reymond was to modify the notion of convergence. The partial sums $S_n(x)$ are
replaced by the Cesàro sums $\sigma_n = \frac{1}{n}(S_0 + \cdots + S_{n-1})$, and everything falls into
place.
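The stabilizing effect of Cesàro averaging is easy to observe numerically. The sketch below does not reproduce Du Bois-Reymond's continuous counterexample; as a simpler stand-in (an assumption made here for illustration) it uses the square wave sign(sin x), whose partial sums overshoot near the jump while the Cesàro means never exceed the bounds of the function. The function names are hypothetical.

```python
import numpy as np

def square_wave_partial_sums(x, n_terms):
    """Rows S_0, ..., S_{n_terms-1}: Fourier partial sums of sign(sin x)."""
    sums = np.zeros((n_terms, x.size))
    current = np.zeros_like(x)
    for n in range(1, n_terms):
        if n % 2 == 1:                          # only odd harmonics are present
            current = current + 4.0 / (np.pi * n) * np.sin(n * x)
        sums[n] = current
    return sums

def cesaro_mean(sums):
    """sigma_n = (S_0 + ... + S_{n-1}) / n, where n is the number of rows."""
    return sums.mean(axis=0)

x = np.linspace(0.0, 2.0 * np.pi, 20001)
sums = square_wave_partial_sums(x, 200)
print(sums[-1].max(), cesaro_mean(sums).max())
```

The last partial sum peaks well above 1 (the Gibbs overshoot), whereas the Cesàro mean stays at or below 1, because it is an average of the function against a nonnegative kernel.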
The third route leads to wavelets. This was followed by Haar, who
asked himself this question: "Does there exist another orthonormal system
$h_0(x), h_1(x), \ldots, h_n(x), \ldots$ of functions defined on $[0,1]$ such that for any function $f(x)$, continuous on $[0,1]$, the series

(2.3)    $\langle f, h_0\rangle h_0(x) + \langle f, h_1\rangle h_1(x) + \cdots + \langle f, h_n\rangle h_n(x) + \cdots$

converges to $f(x)$ uniformly on $[0,1]$?"
Here we have written $\langle u, v\rangle = \int_0^1 u(x)\bar{v}(x)\,dx$, where $\bar{v}$ is the complex conjugate of $v$, and chosen the interval $[0,1]$ for convenience to fix our ideas.
As we will see, this problem has an infinite number of solutions. In 1909,
Haar discovered the simplest solution and, at the same time, opened one of the
routes leading to wavelets.
Haar begins with the function $h(x)$ that is equal to 1 on $[0,1/2)$, $-1$ on
$[1/2,1)$, and 0 outside the interval $[0,1)$. For $n \ge 1$, he writes $n = 2^j + k$,
$j \ge 0$, $0 \le k < 2^j$, and defines $h_n(x) = 2^{j/2}h(2^j x - k)$. The support of $h_n(x)$
is the dyadic interval $I_n = [k2^{-j}, (k+1)2^{-j})$, which is included in $[0,1)$ when
$0 \le k < 2^j$. To complete the set, define $h_0(x) = 1$ on $[0,1)$. Then the sequence
$h_0(x), h_1(x), \ldots, h_n(x), \ldots$ is an orthonormal basis (also called a Hilbert basis)
for $L^2[0,1]$.
The uniform approximation of $f(x)$ by the sequence $S_n(f)(x) =
\langle f, h_0\rangle h_0(x) + \cdots + \langle f, h_n\rangle h_n(x)$ is nothing more than the classical approximation
of a continuous function by step functions whose values are the mean values of
$f(x)$ on the appropriate dyadic intervals.
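The indexing $n = 2^j + k$ and the "mean value" interpretation of the partial sums can both be verified on a fine dyadic grid. In this sketch the inner products are computed by the midpoint rule, which is exact here for the Haar functions because they are constant on the grid cells; the function names and the grid size `m` are assumptions introduced for the example.

```python
import numpy as np

def haar_mother(x):
    """h(x): 1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    x = np.asarray(x, dtype=float)
    return (np.where((x >= 0) & (x < 0.5), 1.0, 0.0)
            - np.where((x >= 0.5) & (x < 1.0), 1.0, 0.0))

def haar(n, x):
    """h_n(x) = 2^(j/2) h(2^j x - k) for n = 2^j + k, and h_0 = 1 on [0, 1)."""
    x = np.asarray(x, dtype=float)
    if n == 0:
        return np.where((x >= 0) & (x < 1.0), 1.0, 0.0)
    j = n.bit_length() - 1                  # largest j with 2^j <= n
    k = n - 2**j
    return 2.0**(j / 2.0) * haar_mother(2.0**j * x - k)

def haar_partial_sum(f, n_max, m=2**12):
    """S_{n_max}(f) sampled at the midpoints of m equal cells of [0, 1)."""
    x = (np.arange(m) + 0.5) / m
    fx = f(x)
    s = np.zeros(m)
    for n in range(n_max + 1):
        hn = haar(n, x)
        s += np.mean(fx * hn) * hn          # <f, h_n> h_n(x)
    return x, s

xs, s = haar_partial_sum(lambda u: u**2, 7)
print(np.unique(s.round(12)))               # eight plateau values, one per dyadic eighth
```

With `n_max = 7` the sum uses $h_0, \ldots, h_7$, and the result is the step function whose value on each interval of length 1/8 is the mean of $f$ there.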


We can criticize the Haar construction on a couple of points. On one hand, the
"atoms" hn ( x) used to construct the continuous function f (x) are not themselves
continuous functions, and thus there is a lack of coherence.
But there is a more serious criticism. Suppose that, instead of being continuous on the interval $[0,1]$, $f(x)$ is a function of the class $C^1$, which means $f(x)$
is continuous and has a continuous derivative. Then the approximation of $f(x)$
by step functions would be completely inappropriate. In this case, a suitable
approximation would be the one created from the graph of $f(x)$ by inscribing
polygonal lines.
The Haar construction is suitable only for continuous functions, functions
square-integrable on $[0,1]$ or, more generally, functions whose index of regularity
is near 0. We will see a little later what this means.
These two defects of the Haar system and the idea of approximating the
graph of $f(x)$ with inscribed polygonal lines led Faber and Schauder to replace
the functions $h_n(x)$ of the Haar system by their primitives. This research began
in 1910 and continued until 1920.
Define the "triangle function" $\Delta(x)$ by $\Delta(x) = 0$ if $x \notin [0,1]$, $\Delta(x) = 2x$ if
$0 \le x \le 1/2$, and $\Delta(x) = 2(1-x)$ if $1/2 \le x \le 1$. Then Faber and Schauder
considered the sequence $\Delta_n(x)$, $n \ge 1$, defined by

(2.4)    $\Delta_n(x) = \Delta(2^j x - k)$ for $n = 2^j + k$, $j \ge 0$, $0 \le k < 2^j$.

The support of $\Delta_n(x)$ is the dyadic interval $I_n = [k2^{-j}, (k+1)2^{-j}]$, and $\Delta_n(x)$
is the primitive of $h_n(x)$ multiplied by $2 \cdot 2^{j/2}$ and zero outside $I_n$.
For $n = 0$, we set $\Delta_0(x) = x$, and we add the constant 1 to complete the
set of functions. Then the sequence $1, \Delta_0(x), \ldots, \Delta_n(x), \ldots$ is a Schauder basis
for the Banach space $E$ of continuous functions on $[0,1]$. This means that every
continuous function on $[0,1]$ can be written as

(2.5)    $f(x) = a + bx + \sum_{n=1}^{\infty} \alpha_n \Delta_n(x)$

and that the series has the following properties: the convergence is uniform on
$[0,1]$ and, as a consequence, the coefficients are unique.
We note that the Haar system is not a Schauder basis of $E$ because a Schauder
basis of a Banach space must be made up of vectors of that space, and the
functions $h_n$ are not continuous.
The calculation of the coefficients in (2.5) is immediate. We have successively
$f(0) = a$ and $f(1) = a + b$, which gives $a$ and $b$. This allows us to consider a
function $f(x) - a - bx$, which is zero at 0 and 1. Once this reduction is made, we
have $f(1/2) = \alpha_1$, which allows us to consider a function equal to zero at 0, 1/2,
and 1. The calculation continues with $f(1/4) = \alpha_2$ and $f(3/4) = \alpha_3$ and so on.
If we do not wish to reduce $f(x)$ this way, the coefficients $\alpha_n$ can be computed
directly by the formula

(2.6)    $\alpha_n = f\left((k + \tfrac{1}{2})2^{-j}\right) - \tfrac{1}{2}\left[f(k2^{-j}) + f((k+1)2^{-j})\right]$, where $n = 2^j + k$.

We can give a further interpretation to (2.5). If, instead of being continuous,
$f(x)$ was a function in the class $C^1$, then we could differentiate (2.5) term by
term and obtain the expansion of $f'(x)$ in the Haar basis. If $f(x)$ is in class
$C^1$, the series (2.5) converges uniformly to $f(x)$ and the series differentiated
term by term converges uniformly to $f'(x)$. Does this mean that the functions
$\Delta_n(x)$, $n \ge 0$, with the added function 1, constitute a Schauder basis for the
Banach space $C^1[0,1]$? As before, this is not the case because the functions $\Delta_n(x)$ do not belong to the space in question.
Following Hölder, we define the space $C^r[0,1]$, for $0 < r < 1$, by the relation
$|f(x) - f(y)| \le C|x - y|^r$ for a certain constant $C$ and for all $x, y \in [0,1]$. Then it
is clear from (2.6) that $|\alpha_n| \le C2^{-(j+1)r}$ if $f$ belongs to $C^r$. Since $2^j \le n < 2^{j+1}$,
we can also write $|\alpha_n| \le Cn^{-r}$, $n \ge 1$. The converse, although much less evident,
is nevertheless true when $0 < r < 1$. It is not true if $r = 1$.
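The step from the coefficient formula to the decay estimate can be written out as follows. The symbols $\ell$, $m$, and $\rho$ for the left endpoint, midpoint, and right endpoint of $I_n$ are introduced here only for this derivation, with $\alpha_n = f(m) - \tfrac12[f(\ell) + f(\rho)]$ and $n = 2^j + k$.

```latex
\begin{align*}
\alpha_n &= \tfrac12\bigl[f(m) - f(\ell)\bigr] + \tfrac12\bigl[f(m) - f(\rho)\bigr],
  \qquad \ell = k2^{-j},\quad m = (k+\tfrac12)2^{-j},\quad \rho = (k+1)2^{-j},\\
|\alpha_n| &\le \tfrac12\,C\,|m-\ell|^{r} + \tfrac12\,C\,|m-\rho|^{r}
  = C\,2^{-(j+1)r}
  \le C\,n^{-r}
  \qquad\text{since } |m-\ell| = |m-\rho| = 2^{-(j+1)} \text{ and } n < 2^{j+1}.
\end{align*}
```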
Contemporary physicists are very interested in the Hölder spaces $C^r$ because
they occur naturally in the study of fractal structures. In fact, the physicists
require more. They wish to calculate a Hölder exponent $r$ that varies from one
point to another. Here is the definition of pointwise Hölder exponents. We
say that $f(x)$ satisfies a Hölder condition of exponent $r$, $0 < r < 1$, at $x_0$ if
$|f(x) - f(x_0)| \le C|x - x_0|^r$. Then we look for the largest possible value of $r$,
which is denoted $r(x_0)$ if it exists.
Contemporary science deals with numerous physical phenomena having
multifractal structures. This means that the "fractal exponents" $r(x_0)$ vary
from point to point. In this case, the Hausdorff dimension of the set of points
$x_0$ where $r(x_0) = \alpha$ is a function $d(\alpha)$ whose graph has a distinctive form.
An example from mathematics is the celebrated function attributed to Bernhard Riemann, $\sum_{n=1}^{\infty}(1/n^2)\sin(n^2 x)$. This example illustrates the point that the
Fourier series of a function provides no directly accessible information about
the function's multifractal structure. By using the wavelets of Lusin (which we
present in section 2.6), it is possible to elucidate the multifractal structure of Riemann's
function. This was done by Matthias Holschneider and Philippe Tchamitchian.
We describe their work in Chapter 9.
A second example is the signal coming from fully developed turbulence. The
multifractal structure of this signal has been studied with great care by Alain
Arneodo and his collaborators. We present this example in Chapter 10.
Conceivably, the pointwise Hölder exponents could be computed by going
back to the definition. However, the example of the Riemann function shows
that such an approach is too crude to yield practical results. This approach offers
no way to take into consideration the inevitable noise (assumed to be Gaussian)
that alters a signal. The Schauder basis presents the same difficulties because the
calculation of the coefficients $\alpha_n$ (according to (2.6)) calls directly upon explicit
values of the signal.
Today, we are fortunate to have much more subtle ways to attack this
problem. Specifically, the pointwise Hölder exponents are now determined
using wavelet analysis. The wavelet coefficients replace those given by formula (2.6). They are less sensitive to noise because they measure, at different
scales, the average fluctuations of the signal. These methods will be described
in Chapter 9.

2.3. New directions of the 1930s: Lévy and Brownian motion.

Brownian motion is a random signal. We will limit our discussion to the one-dimensional case. We thus write $X(t,\omega)$ for the Brownian motion: $t$ denotes
time, $\omega$ belongs to a probability space $\Omega$, and $X(t,\omega)$ is regarded as a real-valued
function of time depending on the parameter $\omega$.
To obtain a realization of Brownian motion, we choose a particular orthonormal basis $Z_i(t)$, $i \in I$, for the usual Hilbert space $L^2(\mathbb{R})$. Then we know that
the derivative (in the sense of distributions) $\frac{d}{dt}X(t,\omega)$ is written as

(2.7)    $\frac{d}{dt}X(t,\omega) = \sum_{i \in I} g_i(\omega)Z_i(t),$

where the $g_i(\omega)$, $i \in I$, are independent identically distributed Gaussian random
variables with zero mean.
Then the problem is to choose the best possible representation of Brownian
motion. It is certainly advisable, as in all signal processing problems, to have in
mind the desired end result.
If we wish to examine the spectral properties of Brownian motion, we are
led to select the Fourier representation. The real line is cut into intervals $[2l\pi,
2(l+1)\pi]$, $l \in \mathbb{Z}$, and the trigonometric system is used on each of the intervals. In
its real form, this trigonometric system is
$\frac{1}{\sqrt{2\pi}}$, $\frac{1}{\sqrt{\pi}}\cos kx$, and
$\frac{1}{\sqrt{\pi}}\sin kx$, $k \ge 1$.
If we wish to highlight the multifractal structure of Brownian motion, Fourier
analysis is inadequate. On the other hand, the analysis using the Schauder basis
immediately reveals the Hölder regularity $C^r$, $r < 1/2$, of the Brownian motion
trajectories.
We start with the Haar basis for $L^2(\mathbb{R})$ composed of the functions $h_n(t - l)$,
$n \ge 0$, $l \in \mathbb{Z}$, and expand the white noise $\frac{d}{dt}X(t,\omega)$ in this orthonormal basis.
By taking primitives, we obtain the development of Brownian motion in the
Schauder basis.
To fix our ideas, we restrict the discussion to Brownian motion on the interval
$[0,1]$. For this, $l = 0$, and

(2.8)    $X(t,\omega) = a_0(\omega) + t\,b_0(\omega) + \frac{1}{2}\sum_{n=1}^{\infty} 2^{-j/2}g_n(\omega)\Delta_n(t),$

where the $g_n(\omega)$ are independent Gaussian random variables with mean zero and
the same distribution.
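The Schauder expansion of Brownian motion lends itself to direct simulation. The sketch below makes two assumptions that are not stated in the text: the path starts at $X(0) = 0$ (so $a_0 = 0$) and $b_0$ is a standard Gaussian variable independent of the $g_n$, which are also taken with unit variance. Under these assumptions the truncated series has variance exactly $t$ at the dyadic points of the finest level used, which gives a deterministic check of the normalization $\frac{1}{2}2^{-j/2}$. The function names are hypothetical.

```python
import numpy as np

def triangle(x):
    """Delta(x): 2x on [0, 1/2], 2(1 - x) on [1/2, 1], 0 outside [0, 1]."""
    x = np.asarray(x, dtype=float)
    return np.clip(1.0 - np.abs(2.0 * x - 1.0), 0.0, None)

def schauder_terms(t, levels):
    """Rows (1/2) 2^(-j/2) Delta_n(t) for n = 2^j + k, 0 <= j < levels."""
    rows = []
    for j in range(levels):
        for k in range(2**j):
            rows.append(0.5 * 2.0**(-j / 2.0) * triangle(2.0**j * t - k))
    return np.array(rows)

def brownian_path(t, levels, rng):
    """One sample path of the truncated series, with a_0 = 0 and b_0 ~ N(0, 1) (assumed)."""
    terms = schauder_terms(t, levels)
    b0 = rng.standard_normal()
    g = rng.standard_normal(terms.shape[0])
    return b0 * t + g @ terms

def variance_profile(t, levels):
    """Variance of the truncated series at each t under the same assumptions."""
    return t**2 + np.sum(schauder_terms(t, levels)**2, axis=0)

t = np.arange(2**6 + 1) / 2**6
path = brownian_path(t, levels=6, rng=np.random.default_rng(0))
print(path[:4], np.max(np.abs(variance_profile(t, 6) - t)))
```

Each additional level refines the path on a grid twice as fine without changing the values already fixed at coarser dyadic points, which is the multiresolution character of the construction.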
To verify that the function $X(t,\omega)$ belongs to the Hölder space $C^r$ for almost
all $\omega \in \Omega$, it is sufficient to show that $2^{-j/2}|g_n(\omega)| \le C(\omega)2^{-jr}$. If, for almost
all $\omega \in \Omega$, one had $\sup_{n \ge 0}|g_n(\omega)| < \infty$, then the trajectories of the Brownian
motion would almost surely belong to the space $C^{1/2}$. But this is not the case,
and instead we have $\sup_{n \ge 2}|g_n(\omega)|/\sqrt{\log n} < \infty$ for almost all $\omega \in \Omega$. Then
the criterion for Hölder regularity gives

(2.9)    $|X(t+h,\omega) - X(t,\omega)| \le C(\omega)\sqrt{|h|\log(1/|h|)},$

where $C(\omega) < \infty$ for almost all $\omega \in \Omega$.

We see, from this theorem of Paul Levy, the superiority of the Schauder basis
over the Fourier basis for studying local regularity properties.
The determination of the fractal exponents requires some extensions. These
extensions that allow the study of small, complicated details form part of "multiresolution analysis," which we define in the following chapters. These ideas are
already incorporated in the definition of the Schauder basis itself, in particular,
within the mapping $x \mapsto 2^j x - k$. They are clearly absent in the trigonometric
system.
Patrick Flandrin [5] has extended this work to the case of fractional Brownian motion, as it was proposed by Benoit Mandelbrot and John W. Van Ness
to model certain noise. Albert Benassi, Stéphane Jaffard, and Daniel Roux
[3] have generalized these ideas to certain Gaussian-Markov fields. All of this
work demonstrates that multiresolution methods are adapted to the analysis and
synthesis of these processes.

2.4. New directions of the 1930s: Littlewood and Paley.

We have shown, with the example of Brownian motion, that the trigonometric
system does not provide direct and easy access to local regularity properties and
that these properties are clearly apparent when examined with other representations.
Similar difficulties are encountered when we try to localize the energy of a
function. To be more precise, the integral $\frac{1}{2\pi}\int_0^{2\pi}|f(x)|^2\,dx$, which is the mean
value of the energy, is given directly by the sum of the squares of the Fourier
coefficients. However, it is often important to know if the energy is concentrated
around a few points or if it is distributed over the whole interval $[0,2\pi]$. This
determination can be made by calculating $\frac{1}{2\pi}\int_0^{2\pi}|f(x)|^4\,dx$, or more generally,
$\int_0^{2\pi}|f(x)|^p\,dx$ for $2 < p < \infty$. When the energy is concentrated around a few
points, this integral will be much larger than the mean value of the energy, while
it will be the same order of magnitude when the energy is evenly distributed. We
write $\|f\|_p = \left(\frac{1}{2\pi}\int_0^{2\pi}|f(x)|^p\,dx\right)^{1/p}$ and, for obvious reasons of homogeneity, we
compare the norms $\|f\|_p$ to determine if the energy is concentrated or dispersed.
But if $p$ is different from 2, we can neither calculate nor even estimate these
norms $\|f\|_p$ by examining the Fourier coefficients of $f$.
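A short numerical comparison shows how the norms $\|f\|_p$ separate concentrated from dispersed energy. In this sketch both test signals are normalized to have $\|f\|_2 = 1$; the particular signals (a pure cosine and a narrow Gaussian bump) and the name `lp_norm` are choices made here for illustration.

```python
import numpy as np

def lp_norm(fx, p):
    """Normalized L^p norm on [0, 2*pi]: (mean of |f|^p)^(1/p) from equispaced samples."""
    return float(np.mean(np.abs(fx)**p) ** (1.0 / p))

x = 2.0 * np.pi * np.arange(4096) / 4096
spread = np.sqrt(2.0) * np.cos(5.0 * x)            # energy evenly distributed
bump = np.exp(-((x - np.pi) / 0.05)**2)            # energy concentrated near x = pi
concentrated = bump / lp_norm(bump, 2)             # rescaled so that the L^2 norm is 1

for name, fx in [("spread", spread), ("concentrated", concentrated)]:
    print(name, lp_norm(fx, 2), lp_norm(fx, 4))
```

Both signals have the same mean energy, yet the $L^4$ norm of the concentrated one is several times larger, which is exactly the information that the squares of the Fourier coefficients alone do not reveal.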
The information needed for this calculation is hidden in the Fourier series of
$f$; to reveal it, it is necessary to subject the series to manipulations that were
discovered by Littlewood and Paley as long ago as 1930.
Littlewood and Paley define the "dyadic blocks" $\Delta_j f(x)$ by

(2.10)    $\Delta_j f(x) = \sum_{2^j \le k < 2^{j+1}} (a_k\cos kx + b_k\sin kx),$

where $a_0 + \sum_{k=1}^{\infty}(a_k\cos kx + b_k\sin kx)$ denotes the Fourier series of $f$. Then

$f(x) = a_0 + \sum_{j=0}^{\infty}\Delta_j f(x),$


and the fundamental result of Littlewood and Paley is that there exist, for
$1 < p < \infty$, two constants $C_p \ge c_p > 0$ such that

(2.11)    $c_p\|f\|_p \le \left\|\left(|a_0|^2 + \sum_{j=0}^{\infty}|\Delta_j f(x)|^2\right)^{1/2}\right\|_p \le C_p\|f\|_p.$

If $p = 2$, $C_p = c_p = 1$, and there is equality in (2.11).
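For a sampled periodic signal, the dyadic blocks can be obtained by grouping discrete Fourier coefficients into the bands $2^j \le k < 2^{j+1}$. The sketch below uses NumPy's real FFT on equispaced samples and checks the two facts stated above: the blocks add up to $f - a_0$, and for $p = 2$ the square function has the same norm as $f$. The function name `dyadic_blocks` and the test signal are assumptions made for the example.

```python
import numpy as np

def dyadic_blocks(fx):
    """Return (a0, blocks), where blocks[j] samples the dyadic block Delta_j f."""
    m = fx.size
    coeffs = np.fft.rfft(fx)                  # frequencies 0, 1, ..., m // 2
    a0 = coeffs[0].real / m
    blocks = []
    j = 0
    while 2**j < coeffs.size:
        band = np.zeros_like(coeffs)
        lo, hi = 2**j, min(2**(j + 1), coeffs.size)
        band[lo:hi] = coeffs[lo:hi]           # keep only 2^j <= k < 2^(j+1)
        blocks.append(np.fft.irfft(band, n=m))
        j += 1
    return a0, blocks

x = 2.0 * np.pi * np.arange(256) / 256
fx = np.exp(np.sin(x)) + 0.3 * np.cos(7.0 * x)
a0, blocks = dyadic_blocks(fx)

square_function = np.sqrt(a0**2 + sum(b**2 for b in blocks))
print(np.sqrt(np.mean(fx**2)), np.sqrt(np.mean(square_function**2)))
```

For $p \ne 2$ the two norms are no longer equal, only comparable, and the constants $c_p$ and $C_p$ are not computed here.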


Up to this point, wavelets have not yet appeared. The path that leads from
the work of Littlewood and Paley to wavelet analysis passes through the research
done by Antoni Zygmund's group at the University of Chicago. Zygmund and
the mathematicians around him sought to extend to $n$-dimensional Euclidean
space the results obtained in the one-dimensional periodic case by Littlewood
and Paley.
It was at this point that the "mother wavelet" $\psi(x)$ appeared. It is an
infinitely differentiable, rapidly decreasing function of $x$, defined on the Euclidean
space $\mathbb{R}^n$, whose Fourier transform $\hat{\psi}(\xi)$ satisfies the following four conditions:

(2.12)    $\hat{\psi}(\xi) = 1$ if $1 + \alpha \le |\xi| \le 2 - 2\alpha$,

where $\alpha$ is, by hypothesis, chosen in the interval $(0, 1/3]$,

(2.13)    $\hat{\psi}(\xi) = 0$ if $|\xi| \le 1 - \alpha$ or $|\xi| \ge 2 + 2\alpha$,

(2.14)    $\hat{\psi}(\xi)$ is infinitely differentiable on $\mathbb{R}^n$,

and

(2.15)    $\sum_{j=-\infty}^{\infty}|\hat{\psi}(2^{-j}\xi)|^2 = 1$ for all $\xi \ne 0$.

Condition (2.15) does not amount to much. It is sufficient to verify it for
$1 - \alpha \le |\xi| \le 2 - 2\alpha$, and then only two cases arise: if $1 - \alpha \le |\xi| \le 1 + \alpha$, (2.15) reduces
to $|\hat\psi(\xi)|^2 + |\hat\psi(2\xi)|^2 = 1$, while if $1 + \alpha \le |\xi| \le 2 - 2\alpha$, (2.15) is automatically
satisfied since one term is equal to 1 and all the others are zero.
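
The two-case argument above can be checked numerically. The following is only a sketch in dimension 1: the particular profile (a sine ramp on the rising edge, the matching cosine ramp on the falling edge, built on a smooth step function) and the names `smooth_step`, `psi_hat`, `dyadic_sum`, and `ALPHA` are assumptions made for illustration. The text requires only conditions (2.12)-(2.15) and does not prescribe this profile.

```python
import math

ALPHA = 0.25  # illustrative choice; any value in (0, 1/3] is allowed by (2.12)


def smooth_step(t):
    """Infinitely differentiable step: 0 for t <= 0, 1 for t >= 1."""
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    g = math.exp(-1.0 / t)
    h = math.exp(-1.0 / (1.0 - t))
    return g / (g + h)


def psi_hat(xi, alpha=ALPHA):
    """One possible profile satisfying (2.12)-(2.15) in dimension 1."""
    r = abs(xi)
    if r <= 1 - alpha or r >= 2 + 2 * alpha:
        return 0.0                                   # condition (2.13)
    if r < 1 + alpha:                                # rising edge
        t = (r - (1 - alpha)) / (2 * alpha)
        return math.sin(0.5 * math.pi * smooth_step(t))
    if r <= 2 - 2 * alpha:
        return 1.0                                   # condition (2.12)
    t = (r - (2 - 2 * alpha)) / (4 * alpha)          # falling edge
    return math.cos(0.5 * math.pi * smooth_step(t))


def dyadic_sum(xi, alpha=ALPHA, jmax=60):
    """Truncated left-hand side of (2.15); only a few terms are nonzero."""
    return sum(psi_hat(2.0 ** (-j) * xi, alpha) ** 2
               for j in range(-jmax, jmax + 1))
```

On the rising edge the argument of the sine at $\xi$ equals the argument of the cosine at $2\xi$, so the two nonzero terms add up to $\sin^2 + \cos^2 = 1$, which is exactly the first case described above.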
Condition (2.15) implies that the analysis of Littlewood-Paley-Stein (whose
definition will be given in a moment) conserves energy. This same condition
(2.15) is satisfied by all the orthonormal wavelet bases of the form $2^{j/2}\psi(2^j x - k)$,
$j, k \in \mathbb{Z}$. It also anticipates similar conditions shared by the quadrature mirror
filters (Chapter 3) and the Malvar wavelets (Chapter 6).
The theory for $\mathbb{R}^n$ proceeds by setting $\psi_j(x) = 2^{nj}\psi(2^j x)$ and replacing the
dyadic blocks of Littlewood and Paley with the convolution products $\Delta_j(f) =
f * \psi_j$. The "Littlewood-Paley-Stein function" is defined by

$g(x) = \left( \sum_{j=-\infty}^{\infty} |\Delta_j f(x)|^2 \right)^{1/2}.$

If f(x) belongs to $L^2(\mathbb{R}^n)$, the same is true for g(x), and $\|f\|_2 = \|g\|_2$ (the
conservation of energy).
If $1 < p < \infty$, there exist two constants $C_p \ge c_p > 0$ such that, for all
functions f belonging to $L^p(\mathbb{R}^n)$,

(2.16)  $c_p \|f\|_p \le \|g\|_p \le C_p \|f\|_p,$

where

$\|f\|_p = \left( \int_{\mathbb{R}^n} |f(x)|^p\,dx \right)^{1/p}.$

This double inequality (2.16) does not hold in the limiting case where
$\alpha = 0$ [4].
The Littlewood-Paley-Stein function g(x) provides a method for analyzing
f(x) in which a major role is played by the ability to vary arbitrarily the scales
used in the analysis; by the same token, the notion of frequency plays a minor
role. The dilations of size $2^j$ are present in the definition of the operators $\Delta_j$.
Nevertheless, conditions (2.12) and (2.13) endow these operators with a frequential content. The sequence of operators $\Delta_j$, $j \in \mathbb{Z}$, constitutes a bank of band-pass
filters, oriented on frequency intervals covering approximately an octave.
Thanks to the work of Marr and Mallat (which we describe in Chapter 8) the
Littlewood-Paley analysis provides an effective algorithm for numerical image
processing.

2.5. New directions of the 1930s: The Franklin system.

In 1927, Philip Franklin, who was a professor at the Massachusetts Institute
of Technology (MIT), had the idea to create an orthonormal basis from the
Schauder basis by using the Gram-Schmidt process. This gives a sequence $f_n(x)$
with $f_{-1}(x) = 1$, $f_0(x) = 2\sqrt{3}(x - 1/2)$, ..., which is an orthonormal basis for
$L^2[0,1]$. This sequence $(f_n)_{n \ge -1}$ is called the Franklin system and satisfies

(2.17)  $\int_0^1 f_n(x)\,dx = \int_0^1 x f_n(x)\,dx = 0$ for $n \ge 1$.
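
The construction can be imitated on a finite dyadic grid. The sketch below is an illustration under stated assumptions, not Franklin's own computation: it takes the Schauder functions in the order 1, x, then the tent functions $\Lambda(2^j x - k)$ for $j < J$ (the tent `tent` peaking at 1/2, the cut-off level `J`, and all function names are choices made here), represents each piecewise-linear function by its values at the grid nodes, and uses the exact $L^2[0,1]$ inner product of piecewise-linear functions.

```python
import math

J = 4                      # finest dyadic level used in this sketch (assumption)
N = 2 ** J                 # number of grid intervals on [0, 1]
H = 1.0 / N
GRID = [i * H for i in range(N + 1)]


def inner(u, v):
    """Exact L2[0,1] inner product of two piecewise-linear functions (nodal values)."""
    total = 0.0
    for i in range(N):
        total += H / 6.0 * (2 * u[i] * v[i] + u[i] * v[i + 1]
                            + u[i + 1] * v[i] + 2 * u[i + 1] * v[i + 1])
    return total


def tent(t):
    """Triangle function supported on [0, 1] with peak value 1 at t = 1/2."""
    return max(0.0, 1.0 - abs(2.0 * t - 1.0))


def schauder_basis(levels):
    """1, x, then the tents at dyadic levels j = 0, ..., levels - 1."""
    basis = [[1.0 for _ in GRID], [x for x in GRID]]
    for j in range(levels):
        for k in range(2 ** j):
            basis.append([tent(2 ** j * x - k) for x in GRID])
    return basis


def gram_schmidt(vectors):
    """Modified Gram-Schmidt with respect to the inner product above."""
    ortho = []
    for v in vectors:
        w = list(v)
        for q in ortho:
            c = inner(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(inner(w, w))
        ortho.append([wi / norm for wi in w])
    return ortho


# franklin[0] plays the role of f_{-1}, franklin[1] of f_0, franklin[n + 1] of f_n.
franklin = gram_schmidt(schauder_basis(J))
```

With this indexing, `franklin[1]` reproduces the nodal values of $2\sqrt{3}(x - 1/2)$, and every later function is orthogonal to both 1 and x, which is the content of (2.17).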

The Franklin system has advantages of both the Haar basis and the Schauder
basis. It can be used to decompose any function f in $L^2[0,1]$, which the Schauder
basis does not allow, and it can be used to characterize the spaces $C^r$, $0 <
r < 1$, by $|\langle f, f_n \rangle| \le C n^{-1/2 - r}$, which the Haar basis does not allow. Thus
the Franklin system works as well in relatively regular situations as it does in
relatively irregular situations.
The weakness of the Franklin basis is that it no longer has a simple algorithmic structure. The functions of the Franklin basis, unlike those of the Haar basis
or those of the Schauder basis, are not derived from a fixed function $\psi$ by integer
translations and dyadic dilations. This defect caused the Franklin system to be
abandoned and forgotten for almost 40 years.


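Fortunately, the Franklin system has survived this disgrace. Zbigniew Ciesielski revived it in 1963 by showing the existence of an exponent $\gamma > 0$ and a
constant $C > 0$ such that, for $n = 2^j + k$, $0 \le k < 2^j$,

(2.18)  $|f_n(x)| \le C\, 2^{j/2} \exp(-\gamma |2^j x - k|),$

(2.19)  $\left| \frac{d}{dx} f_n(x) \right| \le C\, 2^{3j/2} \exp(-\gamma |2^j x - k|).$

Thus, everything works as if $f_n(x) = 2^{j/2}\psi(2^j x - k)$ and $\psi(x)$ were a Lipschitz
function with exponential decay.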
Today we have an asymptotic estimate for the functions $f_n(x)$. This estimate shows that, in a certain sense, the orthonormal wavelet basis discovered
by Stromberg in 1980 was living hidden inside the Franklin system. We have, in
fact, for $n = 2^j + k$, $0 \le k < 2^j$,

(2.20)  $f_n(x) = 2^{j/2}\psi(2^j x - k) + r_n(x),$

where, for a certain constant C,

(2.21)  $\|r_n\|_2 \le C (2 - \sqrt{3})^{d(n)}, \quad d(n) = \inf(k, 2^j - k).$

The function $\psi(x)$, which was discovered in 1980 by Stromberg, is completely
explicit. It has the following three properties:

(2.22)  $\psi(x)$ is continuous on the whole real line, is linear on the intervals $[1, 2], [2, 3], \ldots, [l, l + 1], \ldots$, and similarly on the intervals $[1/2, 1], [0, 1/2], [-1/2, 0], \ldots, [-\frac{l+1}{2}, -\frac{l}{2}], \ldots,$

(2.23)  $|\psi(x)| \le C (2 - \sqrt{3})^{|x|},$

and

(2.24)  $2^{j/2}\psi(2^j x - k)$, $j, k \in \mathbb{Z}$, is an orthonormal basis for $L^2(\mathbb{R})$.

Note that $(2 - \sqrt{3}) < 1$, and hence (2.23) means that $\psi$ decreases rapidly at
infinity.

2.6. New directions in the 1930s: The wavelets of Lusin.

The interpretation of Lusin's work in terms of the theory of wavelets would
probably astonish its author. But it is certainly the best reading, the one that
gives the greatest beauty to Lusin's work.
We begin by introducing the object of Lusin's study, namely, the Hardy
spaces $H^p(\mathbb{R})$, where $1 \le p \le \infty$. Let P denote the open upper half-plane
defined by $z = x + iy$ and $y > 0$. Then a function $f(x + iy)$ belongs to $H^p(\mathbb{R})$ if
it is holomorphic in the half-plane P and if

(2.25)  $\sup_{y > 0} \left( \int_{-\infty}^{\infty} |f(x + iy)|^p\,dx \right)^{1/p} < \infty.$

When this condition is satisfied, the upper bound, taken over $y > 0$, is also
the limit as y tends to 0. Furthermore, $f(x + iy)$ converges to a function denoted
by $f(x)$ when y tends to 0, where convergence is in the sense of the $L^p$-norm.
The space $H^p(\mathbb{R})$ can thus be identified with a closed subspace of $L^p(\mathbb{R})$, which
explains the notation.
The Hardy spaces play a fundamental role in signal processing. One associates with a real signal $f(t)$, $-\infty < t < \infty$, of finite energy, the analytic signal
$F(t)$ for which $f(t)$, $-\infty < t < \infty$, is the real part. By hypothesis, the energy
of f is $\int_{-\infty}^{\infty} |f(t)|^2\,dt$, and we require that $F(t)$ have finite energy as well. This
implies that F belongs to the Hardy space $H^2(\mathbb{R})$. Then $F(t) = f(t) + ig(t)$,
and the function $g(t)$ is the Hilbert transform of $f(t)$. For further information
about analytic signals, the reader may refer to [9, pp. 118-119]. One may also
consult the remarkable exposition by Jean Ville [10].
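
A discrete, periodic analogue of the passage from f to F can be sketched as follows. This is an illustration only: the text works on the whole real line, whereas the sketch replaces the Fourier transform by a finite DFT on an even number of samples, and the helper names `dft`, `idft`, and `analytic_signal` are assumptions introduced here. The idea is the same: suppress the negative frequencies and double the positive ones.

```python
import cmath
import math


def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * q * t / n) for t in range(n))
            for q in range(n)]


def idft(X):
    n = len(X)
    return [sum(X[q] * cmath.exp(2j * math.pi * q * t / n) for q in range(n)) / n
            for t in range(n)]


def analytic_signal(f):
    """Discrete stand-in for F = f + i g; len(f) is assumed to be even."""
    n = len(f)
    half = n // 2
    # Keep the zero and Nyquist bins, double positive bins, drop negative bins.
    weights = [1.0] + [2.0] * (half - 1) + [1.0] + [0.0] * (half - 1)
    return idft([w * c for w, c in zip(weights, dft(f))])


N = 64
f = [math.cos(2 * math.pi * 3 * t / N) for t in range(N)]
F = analytic_signal(f)
g = [z.imag for z in F]   # discrete Hilbert transform of f
```

For this cosine the imaginary part `g` is the corresponding sine, so that F is the complex exponential of the same frequency.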
Read in the light of the theory of wavelets, Lusin's work concerns the analysis
and synthesis of functions in $H^p(\mathbb{R})$ using "atoms" or "basis elements," which are
the elementary functions of $H^p(\mathbb{R})$. These are, in fact, the functions $(z - \bar\zeta)^{-2}$,
where the parameter $\zeta$ belongs to P.
Thus one wishes to obtain effective and robust representations of the functions
$f(z)$ in $H^p(\mathbb{R})$ of the form

(2.26)  $f(z) = \iint_P (z - \bar\zeta)^{-2} \alpha(\zeta)\,du\,dv,$

where $\zeta = u + iv$ and where $\alpha(\zeta)$ plays the role of the coefficients. These
coefficients should be simple to calculate, and their order of magnitude should
provide an estimate of the norm of f in $H^p(\mathbb{R})$.
The synthesis is obtained by the following rule. We start with an arbitrary
measurable function $\alpha(\zeta)$, subject only to the following condition introduced by
Lusin: The quadratic functional $A(x)$ must be such that $\int_{-\infty}^{\infty} (A(x))^p\,dx$ is finite.
This quadratic functional is defined by

(2.27)  $A(x) = \left( \iint_{\Gamma(x)} |\alpha(u + iv)|^2 v^{-2}\,du\,dv \right)^{1/2},$

where $\Gamma(x) = \{(u, v) \in \mathbb{R}^2 : v > |u - x|\}$.

Note that this condition involves only the modulus of the coefficients $\alpha(\zeta)$.
If the integral $\int_{-\infty}^{\infty} (A(x))^p\,dx$ is finite, then necessarily

$f(z) = \iint_P (z - \bar\zeta)^{-2} \alpha(\zeta)\,du\,dv$

belongs to $H^p(\mathbb{R})$, and if $1 \le p < \infty$,

(2.28)  $\|f\|_p \le C(p) \left( \int_{-\infty}^{\infty} (A(x))^p\,dx \right)^{1/p}.$

The left member of (2.28) is the norm of f in $H^p(\mathbb{R})$, as defined by (2.25). This
estimate, however, is sometimes very crude. If, for example, $f(z) = (z + i)^{-2}$,
one is led to choose the Dirac measure at the point i for $\alpha(\zeta)$, and the second
member of (2.28) is infinite. This paradox arises because the representation

(2.29)  $f(z) = \iint_P (z - \bar\zeta)^{-2} \alpha(\zeta)\,du\,dv$

is not unique.
To obtain a unique decomposition, which we call the natural decomposition,
we restrict the choice to $\alpha(\zeta) = \frac{2i}{\pi} v f'(u + iv)$. When we do this, the two norms
$\|f\|_p$ and $\|A\|_p$ become equivalent if $1 \le p < \infty$. Today this choice of natural
coefficients has an interesting explanation. This interpretation, which depends on
the contemporary formalism of wavelet theory, is given in the following section.
2.7. Atomic decompositions, from 1960 to 1980.

Guido Weiss, in collaboration with Ronald R. Coifman, was the first to interpret,
as we have just done, Lusin's theory in terms of "atoms" and "atomic decompositions." The "atoms" are the "simplest elements" of a function space, and the
objective of the theory is to find, for the usual function spaces, (1) the atoms
and (2) the "assembly rules" that allow the reconstruction of all the elements of
the function space using these atoms.
In the case of the holomorphic Hardy spaces of the last section, the atoms
were the functions $(z - \bar\zeta)^{-2}$, $\zeta \in P$, and the assembly rules were given by the
condition on Lusin's area function $A(x)$.
For the spaces $L^p[0, 2\pi]$, $1 < p < \infty$, the "atoms" cannot be the functions
$\cos kx$ and $\sin kx$, $k \ge 1$, because this choice does not lead to assembly rules that
are sufficiently simple and explicit to be useful in practice. Marcinkiewicz showed
in 1938 that the simplest atomic decomposition for the spaces $L^p[0,1]$, $1 < p <
\infty$, is given by the Haar system. The Franklin basis would have served as well
and, from the scientific perspective given by wavelet theory, the Franklin basis
and Littlewood-Paley analysis are naturally related.
One of the approaches to "atomic decompositions" is given by Calderon's
identity. To explain Calderon's identity, we start with a function $\psi(x)$ belonging
to $L^2(\mathbb{R}^n)$. Later in this history, Grossmann and Morlet called this function an
analyzing wavelet. Its Fourier transform $\hat\psi(\xi)$ is subject to the condition that

(2.30)  $\int_0^{\infty} |\hat\psi(t\xi)|^2 \frac{dt}{t} = 1$ for almost all $\xi \in \mathbb{R}^n$.

If $\psi(x)$ belongs to $L^1(\mathbb{R}^n)$, condition (2.30) implies

(2.31)  $\int_{\mathbb{R}^n} \psi(x)\,dx = 0.$

We write $\tilde\psi(x) = \bar\psi(-x)$, $\psi_t(x) = t^{-n}\psi(x/t)$, and $\tilde\psi_t(x) = t^{-n}\tilde\psi(x/t)$. Let $Q_t$
denote the operator defined as convolution with $\psi_t$; its adjoint $Q_t^*$ is the operator
defined as convolution with $\tilde\psi_t$.
Calderon's identity is a decomposition of the identity operator, written as

(2.32)  $I = \int_0^{\infty} Q_t Q_t^* \frac{dt}{t}.$

Grossmann and Morlet rediscovered this identity in 1980, 20 years after the
work of Calderon. However, with this rediscovery, they gave it a different interpretation by relating it to "the coherent states of quantum mechanics" [6].
They defined wavelets (generated from the analyzing wavelet $\psi$) by

(2.33)  $\psi_{(a,b)}(x) = a^{-n/2}\,\psi\!\left(\frac{x - b}{a}\right), \quad a > 0, \quad b \in \mathbb{R}^n.$

In the analysis and synthesis of an arbitrary function $f(x)$ belonging to
$L^2(\mathbb{R}^n)$, these wavelets $\psi_{(a,b)}$ are going to play the role of an orthonormal basis.
The wavelet coefficients $W(a, b)$ are defined by

(2.34)  $W(a, b) = \langle f, \psi_{(a,b)} \rangle,$

where $\langle u, v \rangle = \int u(x)\bar{v}(x)\,dx$.

The function $f(x)$ is analyzed by (2.34). The synthesis of $f(x)$ is given by

(2.35)  $f(x) = \int_0^{\infty} \int_{\mathbb{R}^n} W(a, b)\,\psi_{(a,b)}(x)\,db\,\frac{da}{a^{n+1}}.$

This is a linear combination of the original wavelets using the coefficients given
by the analysis.
We return to the specific case of the Hardy spaces $H^p(\mathbb{R})$ for $1 \le p < \infty$.
The analyzing wavelet $\psi(x) = \frac{1}{\pi}(x + i)^{-2}$ chosen by Lusin is the restriction to
the real axis of the function $\frac{1}{\pi}(z + i)^{-2}$; it is holomorphic in P and belongs to
all of the Hardy spaces. The Fourier transform of $\psi$ is $\hat\psi(\xi) = -2\xi e^{-\xi}$ for $\xi \ge 0$
and $\hat\psi(\xi) = 0$ if $\xi \le 0$.
Condition (2.30) is not satisfied but, on the other hand, we have

(2.36)  $\int_0^{\infty} |\hat\psi(t\xi)|^2 \frac{dt}{t} = 1$ if $\xi > 0$, and $= 0$ if $\xi \le 0$.

Condition (2.36) implies that the wavelets $\psi_{(a,b)}$ generate $H^2(\mathbb{R})$ instead of
$L^2(\mathbb{R})$ when $a > 0$, $b \in \mathbb{R}$.
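
Condition (2.36) can be checked by hand, since $\int_0^\infty 4 t \xi^2 e^{-2t\xi}\,dt = 1$ for $\xi > 0$, and also numerically. The sketch below is only a sanity check: the Simpson rule, the truncation point `40 / xi`, and the function names are choices made here for illustration, while the formula for `psi_hat` is the one stated above.

```python
import math


def psi_hat(xi):
    """Fourier transform of psi(x) = (1/pi) (x + i)^(-2), as given in the text."""
    return -2.0 * xi * math.exp(-xi) if xi > 0.0 else 0.0


def calderon_integral(xi, steps=20000):
    """Simpson approximation of the integral of |psi_hat(t xi)|^2 dt / t over (0, inf)."""
    if xi <= 0.0:
        return 0.0                      # psi_hat vanishes on the negative half-line
    upper = 40.0 / xi                   # beyond this point the integrand is negligible
    h = upper / steps                   # steps is assumed to be even

    def integrand(t):
        # The integrand tends to 0 as t tends to 0, so t = 0 is set to 0 explicitly.
        return psi_hat(t * xi) ** 2 / t if t > 0.0 else 0.0

    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0
```

The value does not depend on $\xi > 0$, which reflects the invariance of the measure $dt/t$ under dilations.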

The wavelet coefficients of a function $f(x)$ belonging to the Hardy space
$H^2(\mathbb{R})$ are then

(2.37)  $W(a, b) = \langle f, \psi_{(a,b)} \rangle = \frac{1}{\pi} \int_{-\infty}^{\infty} f(x) \frac{a\sqrt{a}}{(x - b - ia)^2}\,dx,$

which is equal to $2i a\sqrt{a} f'(b + ia)$ since $f(z)$ is holomorphic in P.


Thus the representation (2.35) of a function in the Hardy space $H^2(\mathbb{R})$ coincides with the "natural representation" that we defined in the preceding section.
Note that $\psi(x) = \frac{1}{x + i}$, although it belongs to all the Hardy spaces, cannot
be used as an analyzing wavelet because

$\int_0^{\infty} |\hat\psi(t\xi)|^2 \frac{dt}{t} = +\infty$ if $\xi > 0$.

2.8. Stromberg's wavelets.

The real version of the holomorphic Hardy space $H^1(\mathbb{R})$ is denoted by $\mathcal{H}^1(\mathbb{R})$
and is composed of real-valued functions $u(x)$ such that $u(x) + iv(x)$ belongs to
$H^1(\mathbb{R})$, where $v(x)$ is also real-valued. In other words, $u(x)$ belongs to $\mathcal{H}^1(\mathbb{R})$ if
and only if u and its Hilbert transform $\tilde{u}$ belong to $L^1(\mathbb{R})$.
Research on "atomic decompositions" for the functions in the Hardy space
$\mathcal{H}^1(\mathbb{R})$ takes two completely different approaches: one involves the atomic decomposition of Coifman and Weiss, and the other concerns the search for unconditional bases for the space $\mathcal{H}^1$. Here is an outline of these theories.
Coifman and Weiss showed that the most general function f of $\mathcal{H}^1$ can
be written as $f(x) = \sum_0^{\infty} \lambda_k a_k(x)$, where the coefficients $\lambda_k$ are such that
$\sum_0^{\infty} |\lambda_k| < \infty$ and where the $a_k(x)$ are atoms of $\mathcal{H}^1$. This means that for
each $a_k(x)$, there exists an interval $I_k$ such that $a_k(x) = 0$ outside of $I_k$,
$|a_k(x)| \le 1/|I_k|$ ($|I_k|$ is the length of $I_k$), and $\int_{I_k} a_k(x)\,dx = 0$. These three
conditions imply that the norms of the $a_k$ in $\mathcal{H}^1$ are bounded by a fixed constant
$C_0$. The price to pay for this extraordinary decomposition is that it is not given
by a linear algorithm.
Finding an unconditional basis means constructing, once and for all, a sequence of functions $b_k(x)$ of $\mathcal{H}^1$ that are linearly independent, in a very strong
sense, and that allow all functions f of $\mathcal{H}^1$ to be decomposed as

(2.38)  $f(x) = \sum_0^{\infty} \beta_k b_k(x),$

where the scalars $\beta_k$ are defined explicitly by the formulas

(2.39)  $\beta_k = \int_{-\infty}^{\infty} f(x) g_k(x)\,dx.$

Here the $g_k$ are specific functions in the dual of $\mathcal{H}^1$, which is to say they
are specific BMO functions.

The strong independence property is this: There exists a constant C such
that if two sequences of coefficients $\beta_k$ and $\lambda_k$ verify $|\beta_k| \le |\lambda_k|$ for all k, then

(2.40)  $\left\| \sum \beta_k b_k \right\| \le C \left\| \sum \lambda_k b_k \right\|,$

where $\|\cdot\|$ is the norm of the function space $\mathcal{H}^1$.


Wojtaszczyk proved that the Franklin system $f_0(x), f_1(x), \ldots, f_n(x), \ldots$,
without the function 1, is an unconditional basis for the subspace of $\mathcal{H}^1(\mathbb{R})$
composed of functions that vanish outside the interval [0,1].
Stromberg showed that the orthonormal wavelet basis defined by (2.24) is,
in fact, an unconditional basis for the space $\mathcal{H}^1(\mathbb{R})$.
Does there exist a relation between these two types of atomic decompositions?
To construct the decomposition of Coifman and Weiss using an unconditional
basis, it is necessary, first of all, to use a wavelet with compact support in place
of the one used by Stromberg.
The discovery of orthonormal bases of the form $2^{j/2}\psi(2^j x - k)$, $j, k \in \mathbb{Z}$, where
$\psi(x)$ is in the class $C^1$ and has compact support, is due to Ingrid Daubechies
and dates from 1987. This will be developed a little later in 2.9.
To obtain the Coifman and Weiss decomposition of a function f in $\mathcal{H}^1$,
we first decompose it in Daubechies's orthonormal basis, $\psi_{j,k}$. The series obtained is regrouped into blocks. These blocks are a little like the dyadic blocks
of Littlewood and Paley; however, this time, they are defined by considering
the moduli of the coefficients $\alpha_{j,k}$ of this series. The interested reader is referred
to [8].
2.9. A first synthesis: Wavelet analysis.

Thanks to the historical perspective that we have today, we can relate the
Littlewood-Paley decomposition (1930), the version of Franklin's basis (1927) given
by Stromberg (1980), and Calderon's identity (1960).

This first synthesis will be followed by a more inclusive synthesis that encompasses the techniques of numerical signal and image processing. This second
synthesis will lead to Daubechies's orthonormal bases.
This first synthesis is based on the definition of the word "wavelet" and
on the concept of "wavelet transform." We will see that the success of this
synthesis depends on a certain lack of specificity in the original definition. At
this time, mathematicians had not created a general formalism covering all of
the examples we presented above. A physicist and an engineer, Grossmann and
Morlet, provided a definition and a way of thinking based on physical intuition
that was flexible enough to cover all these cases. Starting with the Grossmann-Morlet definition, we will present two other definitions and indicate how they
are related.
The first definition of a wavelet, which is due to Grossmann and Morlet, is
quite broad. A wavelet is a function $\psi$ in $L^2(\mathbb{R})$ whose Fourier transform $\hat\psi(\xi)$
satisfies the condition $\int_0^{\infty} |\hat\psi(t\xi)|^2 \frac{dt}{t} = 1$ almost everywhere.

The second definition of a wavelet is adapted to the Littlewood-Paley-Stein
theory. A wavelet is a function $\psi$ in $L^2(\mathbb{R}^n)$ whose Fourier transform $\hat\psi(\xi)$
satisfies the condition $\sum_{-\infty}^{\infty} |\hat\psi(2^{-j}\xi)|^2 = 1$ almost everywhere. If $\psi$ is a wavelet
in this sense, then $\psi / \sqrt{\log 2}$ satisfies the Grossmann-Morlet condition, since
$\int_0^{\infty} |\hat\psi(t\xi)|^2 \frac{dt}{t} = \int_1^2 \sum_{-\infty}^{\infty} |\hat\psi(2^{-j} s \xi)|^2 \frac{ds}{s} = \log 2$.
The third definition refers to the work of Franklin and Stromberg. A wavelet
is a function $\psi$ in $L^2(\mathbb{R})$ such that $2^{j/2}\psi(2^j x - k)$, $j, k \in \mathbb{Z}$, is an orthonormal
basis for $L^2(\mathbb{R})$. Such a wavelet $\psi$ necessarily verifies the second condition.
This shows that in going from the first to the third definition we are adding
more conditions and thus narrowing the scope of "wavelet." The same is true
for the wavelet analysis of a function. In the general Grossmann-Morlet theory (which is identical to Calderon's theory) the wavelet analysis of a function
f yields a function $W(a, b)$ of n + 1 variables $a > 0$ and $b \in \mathbb{R}^n$. This function is defined by (2.34): $W(a, b) = \langle f, \psi_{(a,b)} \rangle$, where $\psi_{(a,b)}(x) = a^{-n/2}\psi\left(\frac{x - b}{a}\right)$,
$a > 0$, $b \in \mathbb{R}^n$.
In the Littlewood-Paley theory, a is replaced by $2^{-j}$, while b is denoted by x.
In other words, if $\Gamma$ is the multiplicative group $\{2^{-j}, j \in \mathbb{Z}\}$, then the Littlewood-Paley analysis is obtained by restricting the Grossmann-Morlet analysis to
$\Gamma \times \mathbb{R}^n$.
In the Franklin-Stromberg theory, a is replaced by $2^{-j}$ and b is replaced
by $k2^{-j}$, where $j, k \in \mathbb{Z}$. In other words, the analysis of f in the Franklin-Stromberg basis is obtained by restricting the Littlewood-Paley analysis to the
"hyperbolic lattice" S in $(0, \infty) \times \mathbb{R}$ consisting of the points $(2^{-j}, k2^{-j})$, $j, k \in \mathbb{Z}$.
The logical relations among these wavelet analyses are easy to verify.
We start with the Grossmann-Morlet analysis, that is, the Calderon identity.
This is written $I = \int_0^{\infty} Q_t Q_t^* \frac{dt}{t}$, where $Q_t(f) = f * \psi_t$. This becomes $I =
\sum_{-\infty}^{\infty} \Delta_j \Delta_j^*$ in the Littlewood-Paley theory, and if $t = 2^{-j}$, one has $Q_t(f) =
\Delta_j(f)$. Replacing t by $2^{-j}$ and the integral $\int_0^{\infty} u(t) \frac{dt}{t}$ by the sum $\sum_{-\infty}^{\infty} u(2^{-j})$
is completely classic.
To relate Littlewood-Paley analysis to the analysis that is obtained using the
orthogonal wavelets of Franklin and Stromberg, we write $\psi_j(x) = 2^j \psi(2^j x)$, and
$\tilde\psi_j(x) = 2^j \tilde\psi(2^j x)$, where $\tilde\psi(x) = \bar\psi(-x)$. We let $\Delta_j(f)$ denote the convolution
product $f * \psi_j$ and $\Delta_j^* : L^2(\mathbb{R}) \to L^2(\mathbb{R})$ the adjoint of the operator $\Delta_j : L^2(\mathbb{R}) \to
L^2(\mathbb{R})$. Then $\Delta_j^*(f) = f * \tilde\psi_j$. The coefficients $\alpha(j, k)$ of the decomposition of f
in Stromberg's orthonormal basis are then given by

(2.41)  $\alpha(j, k) = 2^{j/2} \int f(x)\,\bar\psi(2^j x - k)\,dx = 2^{-j/2} (f * \tilde\psi_j)(k2^{-j}) = 2^{-j/2} \Delta_j^* f(k2^{-j}).$

Thus the coefficients are obtained by restricting $\Delta_j^*(f)$ to the hyperbolic
lattice S.
In all three cases, wavelet analysis is followed by a synthesis that reconstructs
$f(x)$ from its wavelet transform. In the case of Grossmann-Morlet wavelets, this
synthesis is given by the identity (2.35), which we rewrite here:

(2.42)  $f(x) = \int_0^{\infty} \int_{\mathbb{R}^n} W(a, b)\,\psi_{(a,b)}(x)\,db\,\frac{da}{a^{n+1}}.$

In the case of the Littlewood-Paley analysis, the integral $\int_0^{\infty} u(a) \frac{da}{a}$ is replaced,
as we have already mentioned, by the sum $\sum_{-\infty}^{\infty} u(2^{-j})$ and (2.42) becomes

(2.43)  $f(x) = \sum_{-\infty}^{\infty} \int_{\mathbb{R}^n} (\Delta_j^* f)(b)\,\psi_j(x - b)\,db.$

Finally, for the one-dimensional case and for Stromberg's orthogonal wavelets,
the last integral becomes a sum, and (2.43) becomes

(2.44)  $f(x) = \sum_{j=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} \alpha(j, k)\,\psi_{j,k}(x).$

All of the preceding arguments remain at a fairly superficial level since the
hypothesis on $\psi$ enables us to analyze only the space $L^2$ of square-summable
functions. This is the setting in which Grossmann and Morlet wrote their theoretical work. But this is evidently a sort of regression, for we have just shown
that, across a century of mathematical history, wavelet analysis was created
specifically to analyze function spaces other than $L^2$. Fourier analysis serves
admirably for $L^2$.
If we want wavelets to be useful for the analysis of these other function spaces,
it is necessary to impose conditions on them in addition to those we have already
given. Up until now we have required only that the analysis preserve energy or,
which is the same thing, that the synthesis give an exact reconstruction.
These new conditions are

(2.45)  the regularity of the wavelet $\psi$,

(2.46)  the decay at infinity of $\psi$, and

(2.47)  the number of vanishing moments.

We can, for example, impose the conditions that $\psi$ belong to the Schwartz
class and that all of its moments vanish. We can also require (which will be the
case for Daubechies's wavelets) that $\psi(x)$ have m continuous derivatives, that it
have compact support, and that its first r - 1 moments vanish.
The properties of the Stromberg wavelet are intermediate: It has exponential
decay, as does its first derivative, and

$\int_{-\infty}^{\infty} \psi(x)\,dx = \int_{-\infty}^{\infty} x\psi(x)\,dx = 0.$

2.10. The advent of signal processing.

If history had stopped with this first synthesis, the Daubechies orthonormal
bases, which improve the rudimentary Haar basis, would never have been discovered.
A new start was made in 1985 by Stephane Mallat, a specialist in numerical
image processing. Mallat discovered some close relations among
(a) the quadrature mirror filters, which were invented by Croisier, Esteban,
and Galand for the digital telephone,
(b) the pyramid algorithms of Burt and Adelson, which are used in the context of numerical image processing, and
(c) the orthonormal wavelet bases discovered by Stromberg and his successors.

These relations will be explained in the next two chapters. Using "Mallat's
program," Daubechies was able to complete Haar's work. For each integer r,
Daubechies constructs an orthonormal basis for $L^2(\mathbb{R})$ of the form

(2.48)  $2^{j/2}\psi_r(2^j x - k), \quad j, k \in \mathbb{Z},$

having the following properties:

(2.49)  the support of $\psi_r$ is the interval $[0, 2r + 1]$,

(2.50)  $\int_{-\infty}^{\infty} \psi_r(x)\,dx = \cdots = \int_{-\infty}^{\infty} x^r \psi_r(x)\,dx = 0,$

(2.51)  $\psi_r(x)$ has $\gamma r$ continuous derivatives,

where the positive constant $\gamma$ is about 1/5. When r = 0, this reduces to the
Haar system.

Daubechies's wavelets provide a much more effective analysis and synthesis
than that obtained with the Haar system. If the function being analyzed has
m continuous derivatives, where $0 \le m \le r + 1$, then the coefficients $\alpha(j, k)$
from its decomposition in the Daubechies basis will be of the order of magnitude $2^{-(m + 1/2)j}$, while it would be of the order $2^{-3j/2}$ with the Haar system.
This means that as soon as the analyzed function is regular, the coefficients one
keeps (those exceeding the machine precision) will be much less numerous than
in the case of the Haar system. Thus one speaks of signal "compression." Furthermore, this property has a purely local aspect because Daubechies's wavelets
have compact support.
Synthesis using Daubechies's wavelets also gives better results than the Haar
system. In the latter case, a regular function is approximated by functions that
have strong discontinuities. This is very annoying for image processing, as the
reader can verify by referring to the image of Abraham Lincoln on page 74 of
Marr's book [7]. These remarkable qualities of Daubechies's bases explain their
undisputed success.
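
The order $2^{-3j/2}$ quoted for the Haar system can be observed on the simplest regular function, $f(x) = x$. The sketch below is an illustration under assumptions made here: the Haar function is taken as +1 on [0, 1/2) and -1 on [1/2, 1), the coefficient is computed exactly from an antiderivative of f, and the names `haar_coefficient` and `antiderivative_of_x` are hypothetical.

```python
def haar_coefficient(antiderivative, j, k):
    """Integral of f(x) 2^(j/2) h(2^j x - k) dx, with h the Haar function,
    computed from an antiderivative of f."""
    width = 2.0 ** (-j)
    left = k * width
    middle = left + width / 2.0
    right = left + width
    F = antiderivative
    # +1 on the left half of the dyadic interval, -1 on the right half.
    return 2.0 ** (j / 2.0) * ((F(middle) - F(left)) - (F(right) - F(middle)))


def antiderivative_of_x(x):
    return 0.5 * x * x


for j in range(1, 9):
    c = haar_coefficient(antiderivative_of_x, j, 0)
    # The rescaled coefficient c * 2^(3j/2) does not depend on j.
    print(j, c, c * 2.0 ** (1.5 * j))
```

For this f the rescaled value equals -1/4 at every scale and position, so the coefficients decay exactly like $2^{-3j/2}$ and no faster, however smooth f is. This is the limitation of the Haar system that the higher vanishing moments of Daubechies's wavelets remove.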

2.11. Conclusions.

The status of "wavelet analysis" within mathematics is rather unique. Indeed,
mathematicians have been working on wavelets, which were called "atomic decompositions," for a fairly long time. Their goal was to provide direct and
easy access to the various function spaces. But during all of this period, which
stretches from 1909 to 1980, from Haar to Stromberg, there was very little scientific interchange among mathematicians (of the "Chicago School"), physicists,
and experts in signal processing. Not knowing about the mathematical developments and faced with the pressure of specific needs within their disciplines, the
last two groups were led to rediscover wavelets.
For example, Marr did not know about Calderon's work on wavelets (dating
from 1960) when he announced the hypothesis that we analyze in detail in Chapter 8. Similarly, G. Battle and P. Federbush [2] were not aware of Stromberg's
basis when they needed it to do renormalization computations in quantum field
theory.
In the numerous fields of science and technology where wavelets appeared
at the end of the 1970s, they were handcrafted by the scientists and engineers
themselves. Their use has never resulted from proselytism by mathematicians.
Today the boundaries between mathematics and signal and image processing
have faded, and mathematics has benefited from the rediscovery of wavelets by
experts from other disciplines. The detour through signal and image processing
was the most direct path leading from the Haar basis to Daubechies's wavelets.
Bibliography
[1] G. BATTLE, A block spin construction of ondelettes. Part II: The QFT connection, Comm. Math. Phys., 114 (1988), pp. 93-102.
[2] G. BATTLE AND P. FEDERBUSH, Ondelettes and phase cell cluster expansions, a vindication, Comm. Math. Phys., 109 (1987), pp. 417-419.
[3] A. BENASSI, S. JAFFARD, AND D. ROUX, Analyse multi-échelle des champs gaussiens markoviens d'ordre p indexés par (0,1), C.R. Acad. Sci. Paris, Sér. I (1991), pp. 403-406.
[4] C. FEFFERMAN, The multiplier problem for the ball, Ann. of Math., 94 (1971), pp. 330-336.
[5] P. FLANDRIN, Wavelet analysis and synthesis of fractional Brownian motion, preprint, Ecole Normale Supérieure de Lyon, Lab. de Physique (URA 1325 CNRS), Lyon, France, 1991.
[6] A. GROSSMANN AND J. MORLET, Decomposition of Hardy functions into square integrable wavelets of constant shape, SIAM J. Math. Anal., 15 (1984), pp. 723-736.
[7] D. MARR, Vision, Freeman and Co., New York, 1982.
[8] Y. MEYER, Ondelettes, Hermann, New York, 1990.
[9] A. PAPOULIS, Signal Analysis, 4th edition, McGraw-Hill, New York, 1988.
[10] J. VILLE, Théorie et applications de la notion de signal analytique, Câbles et Transmission, Laboratoire de Télécommunications de la Société Alsacienne de Construction Mécanique, 2ème A., No. 1 (1948).

CHAPTER 3

Quadrature Mirror Filters

3.1. Introduction.

It was with keen pleasure that I recently reread Claude Galand's thesis entitled
"Codage en sous-bandes: théorie et applications à la compression numérique du
signal de parole." This thesis was defended on 25 March 1983 at the Signals and
Systems Laboratory of the University of Nice.
In his thesis, Galand carefully describes the quadrature mirror filters (which
he invented in collaboration with Esteban and Croisier) and their anticipated applications. He also posed some very important problems that would
lead to the discovery of "wavelet packets" (Chapter 7) and "Malvar's wavelets"
(Chapter 6).
Galand's work was motivated by the possibility of improving the digital telephone, a technology that involves transmitting speech signals as sequences of 0's
and 1's. However, as Galand remarked, these techniques extend far beyond the
digital telephone; facsimile, video, databases, etc., all travel over telephone lines.
At present, the bit allocation used for telephone transmission is the well-known 64 kilobits per second. Galand sought, by using coding methods tailored
to speech signals, to transmit speech well below this standard.
To validate the method he proposed, Galand compared it to two other techniques for coding sampled speech: predictive coding and transform coding.
Linear prediction coding amounts to looking for the correlations among successive values of the sampled signal. These correlations are likely to occur on
intervals of the order of 20 to 30 milliseconds. This leads one to cut the sampled
signal $x(n)$ into blocks defined by $1 \le n \le N$, $N + 1 \le n \le 2N$, etc., and then
to seek, for each block, coefficients $a_k$, $1 \le k \le p$, that minimize the quadratic
mean $\frac{1}{N} \sum_{n=1}^{N} |e(n)|^2$ of the prediction errors defined by

$e(n) = x(n) - \sum_{k=1}^{p} a_k x(n - k).$

In general, p is much smaller than N. To transmit the block $x(n)$, $1 \le n \le N$, it
suffices to transmit the first p values $x(1), \ldots, x(p)$, the p coefficients $a_1, \ldots, a_p$,
and the prediction errors $e(n)$. The method is efficient if most of the prediction errors are near 0. When they fall below a certain threshold, they are not
transmitted, and significant compression can result.
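
The least-squares problem described above can be sketched for a single block as follows. This is a minimal illustration, not Galand's coder: the errors are summed over $p + 1 \le n \le N$ only (so that every $x(n - k)$ lies inside the block, an assumption made here), the normal equations are solved by plain Gaussian elimination, and the test block and all function names are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(a[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (a[r][n] - tail) / a[r][r]
    return solution


def linear_predictor(block, p):
    """Least-squares coefficients a_1..a_p and prediction errors e(n) for one block."""
    rows = range(p, len(block))          # zero-based positions of x(p+1), ..., x(N)
    gram = [[sum(block[n - i] * block[n - k] for n in rows)
             for k in range(1, p + 1)] for i in range(1, p + 1)]
    rhs = [sum(block[n] * block[n - i] for n in rows) for i in range(1, p + 1)]
    coeffs = solve_linear_system(gram, rhs)
    errors = [block[n] - sum(coeffs[k - 1] * block[n - k] for k in range(1, p + 1))
              for n in rows]
    return coeffs, errors


# Hypothetical block: a damped oscillation obeying x(n) = 1.5 x(n-1) - 0.7 x(n-2).
block = [1.0, 0.5]
for _ in range(30):
    block.append(1.5 * block[-1] - 0.7 * block[-2])
coeffs, errors = linear_predictor(block, p=2)
```

For this block the recovered coefficients are the recurrence coefficients themselves and all prediction errors vanish up to rounding, so only the first two samples and the two coefficients would need to be transmitted. Real speech is of course only approximately predictable.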

Transform coding consists in cutting the sampled signal into successive blocks
of length N, as we have just done, and then using a unitary transformation A to
transform each block (denoted by X) into another block (denoted by Y). The
block Y is then quantized, with the hope that, for a suitable linear transformation, the Y blocks will have a simpler structure than the X blocks. Subband
coding will be presented in the next section.
For a stationary Gaussian signal, the theoretical limits of the minimal distortion that can be obtained by the three methods are the same. But, as Galand
showed, this assumes, in the case of subband coding, that the width of the frequency channels tends to 0 and that their number tends to $\infty$. We will show that
these conditions cannot be satisfied (rigorously) for subband coding. To provide
an approximate solution, Galand proposed a treelike arrangement of quadrature
mirror filters. This construction leads precisely to the wavelet packets, which we
discuss in detail in Chapter 7.
In the cases of linear prediction coding and transform coding, the theoretical
limits of the minimal distortion are calculated as the lengths of the blocks tend
to infinity, while conserving the stationarity hypothesis.
If the three types of coding yield asymptotically the same quality of compression, why introduce subband coding? Galand saw two advantages: the simplicity
of the algorithm and the possibility that subband coding would reduce the unpleasant effects of quantization noise as perceived at the receiver. By quantizing
inside each subband, the signal would tend to mask the quantization noise, and
it would be less apparent.
The same argument has been repeated by Adelson, Hingorani, and Simoncelli
[1] for numerical image processing. The use of pyramid algorithms and wavelets
allows aspects of the human visual system to be cleverly taken into account so
that the signal masks the noise. The perceptual quality of the reconstructed
image is improved even though the theoretical compression calculations do not
distinguish this method from the others.

3.2. Subband coding: The case of ideal filters.

We follow Galand's example and begin with a deceptively simple case. For a
fixed m ~ 2, let I denote an interval of length 2r./ m \\"ithin (0,27r], and let l~
denote the Hilbert space of sequence (ck)kEZ verifying L:lckl2 < oo and such
that /(8) = L~oockeu.e is zero outside the interval I. This subspace will be
called a frequency channel.
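If $(c_k)_{k\in\mathbb{Z}}$ is a sequence belonging to $l^2_I$, the subsequence $(c_{mk})_{k\in\mathbb{Z}}$ provides
an optimal, compact representation of the original because

$$\frac{1}{m} f(\theta) = \frac{1}{m}\Big( f(\theta) + f\big(\theta + \tfrac{2\pi}{m}\big) + \cdots + f\big(\theta + \tfrac{2\pi(m-1)}{m}\big) \Big) = \sum_{-\infty}^{\infty} c_{km}\, e^{ikm\theta}$$

if $f(\theta)$ vanishes outside $I$ and if $\theta \in I$.

Thus we have $\sum_{-\infty}^{\infty} |c_{km}|^2 = \frac{1}{m} \sum_{-\infty}^{\infty} |c_k|^2$, and this relation expresses the
redundancy contained in the original sequence $(c_k)$, which is strongly correlated.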


This means that the original sequence contains $m$ times the numerical data
needed to reconstruct $f(\theta)$ on $I$, knowing that $f(\theta)$ is 0 outside of $I$.
The ideal scheme for subband coding consists in first filtering the incoming signal into $m$ frequency channels associated with the intervals $[2\pi l/m, 2\pi(l+1)/m]$, $0 \leq l \leq m-1$, and then subsampling the corresponding outputs, retaining only one point in $m$. This operation, which consists in restricting
a sequence defined on $\mathbb{Z}$ to $m\mathbb{Z}$, is called decimation and is denoted by $m \downarrow 1$.
Thus the ideal subband coding scheme is illustrated as follows:

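[Figure: analysis scheme. The input $x(n)$ passes through the filters $F_1, \ldots, F_m$, and each output is decimated by $m \downarrow 1$ to give $y_1(nm), \ldots, y_m(nm)$.]

The scheme for reconstructing the original signal is the dual of the analysis
scheme. We begin by extending the sequences $y_1(nm), \ldots, y_m(nm)$ by inserting
0's at all integers that are not multiples of $m$. Next we filter this "absurd
decision" by again using the filters $F_1, \ldots, F_m$. The output returns $(x(n))$.

[Figure: synthesis scheme. The sequences $y_1(nm), \ldots, y_m(nm)$ are extended with 0's, filtered by $F_1, \ldots, F_m$, and combined to return $(x(n))$.]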
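The ideal scheme can be tried numerically on a periodic signal, where the discrete Fourier transform plays the role of $f(\theta)$. The following is only a sketch under stated assumptions: the signal is treated as periodic with a length that is a multiple of $m$, the ideal filters are implemented as 0/1 masks on the FFT bins, and a gain of $m$ is applied at synthesis to undo the factor $1/m$ that appears in the relation above. The function name `ideal_subband_roundtrip` is hypothetical.

```python
import numpy as np

def ideal_subband_roundtrip(x, m):
    """Split x into m ideal frequency channels, decimate each, then rebuild x."""
    n = len(x)
    assert n % m == 0, "the length must be a multiple of m"
    spectrum = np.fft.fft(x)
    width = n // m
    rebuilt = np.zeros(n, dtype=complex)
    ratios = []
    for l in range(m):
        mask = np.zeros(n)
        mask[l * width:(l + 1) * width] = 1.0       # ideal filter for channel l
        channel = np.fft.ifft(spectrum * mask)       # signal inside channel l
        y = channel[::m]                             # decimation: one point in m
        ratios.append(np.sum(np.abs(y) ** 2) / np.sum(np.abs(channel) ** 2))
        padded = np.zeros(n, dtype=complex)
        padded[::m] = y                              # insert 0's off the multiples of m
        # Filter again with the same ideal filter; the gain m is an assumed normalization.
        rebuilt += m * np.fft.ifft(np.fft.fft(padded) * mask)
    return rebuilt, ratios

rng = np.random.default_rng(0)
x = rng.standard_normal(64)
rebuilt, ratios = ideal_subband_roundtrip(x, 4)
```

In this setting each decimated channel keeps the fraction $1/m$ of the channel's energy, and the sum of the $m$ synthesis branches returns the input.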
One can, as Galand did, hope for the best, and try to replace the index
functions of the intervals $[2\pi l/m, 2\pi(l+1)/m]$ with more regular functions of
the form $w(mx - 2\pi l)$, $0 \leq l \leq m-1$. If $w(x) = w_m(x)$ were a finite
trigonometric sum, then the filters $F_1, \ldots, F_m$ would have finite length, which is
essential for applications.
But the Balian-Low theorem (Chapter 6) tells us that such a function $w(x)$
cannot be constructed if we demand that it be regular and well localized (uniformly in $m$). Consequently, it is not possible to realize the ideal subband coding
scheme just described if we require that the filters $F_1, \ldots, F_m$ have finite length
and, at the same time, provide good frequency definition.

3.3.

Quadrature mirror filters.

Faced with the impossibility of realizing subband coding using $m$ bands covering
the frequency space regularly and having finite-length filters (whose length must
be $Cm$, as required by the Heisenberg uncertainty principle), Galand limited
himself to the case $m = 2$. He then had the idea to effect a finer frequency tiling
by suitably iterating the two-band process. We will see in Chapter 7 that this
arborescent scheme leads directly to "wavelet packets," but we will also see that
these "wavelet packets" do not have the desired spectral properties.
Subband coding using two frequency channels works perfectly. We are going
to describe it in detail.
The input signals $x(n)$ have already been sampled and are defined on the
integers $n \in \mathbb{Z}$; they are arbitrary sequences with finite energy:

$$\sum_{-\infty}^{\infty} |x(n)|^2 < \infty.$$

We denote by $D : l^2(\mathbb{Z}) \to l^2(2\mathbb{Z})$ the decimation operator (also denoted by
$2 \downarrow 1$), which consists in retaining only the terms with even index in the sequence
$(x(n))_{n\in\mathbb{Z}}$. The adjoint operator

$$E = D^* : l^2(2\mathbb{Z}) \to l^2(\mathbb{Z})$$

is the crudest possible extension operator. It consists, starting with a sequence
$(x(2n))_{n\in\mathbb{Z}}$, in constructing the sequence defined on $\mathbb{Z}$ obtained by inserting 0's
at the odd indices. Thus we get the sequence

$$(\ldots, x(-2), 0, x(0), 0, x(2), 0, x(4), 0, \ldots).$$
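The statement that $E$ is the adjoint of $D$ means $\langle DX, Y \rangle = \langle X, EY \rangle$. A minimal sketch on finite arrays, assuming the array index 0 stands for the integer 0 so that "even index" means positions 0, 2, 4, ...; the function names `decimate` and `extend` are illustrative, not from the text.

```python
import numpy as np

def decimate(x):
    """D: keep only the terms with even index."""
    return x[::2]

def extend(y):
    """E = D*: put y on the even indices and insert 0's at the odd indices."""
    out = np.zeros(2 * len(y), dtype=y.dtype)
    out[::2] = y
    return out

rng = np.random.default_rng(1)
x = rng.standard_normal(10)
y = rng.standard_normal(5)
lhs = np.dot(decimate(x), y)   # <D x, y>
rhs = np.dot(x, extend(y))     # <x, E y>
```

Note that `decimate(extend(y))` returns `y`, while `extend(decimate(x))` only keeps the even-indexed half of `x`: $DE$ is the identity, but $ED$ is a projection.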


To simplify the notation we write $X$ in place of $(x(n))_{n\in\mathbb{Z}}$. These input signals
$X$ are first filtered using two filters $F_0$ and $F_1$.
We will see later that $F_0$ must be a low-pass filter (in a sense that will be
made precise) and that $F_1$ will then be a high-pass filter.
The outputs $X_0 = F_0(X)$ and $X_1 = F_1(X)$ are two signals $(x_0(n))_{n\in\mathbb{Z}}$ and
$(x_1(n))_{n\in\mathbb{Z}}$ with finite energy.
$X_0$ and $X_1$ are subsampled with the decimation operator $D = 2 \downarrow 1$. Then
we have $Y_0 = D(X_0) = (x_0(2n))_{n\in\mathbb{Z}}$ and $Y_1 = D(X_1) = (x_1(2n))_{n\in\mathbb{Z}}$.
Write

$$\|Y_0\| = \Big( \sum_{-\infty}^{\infty} |x_0(2n)|^2 \Big)^{1/2} \quad\text{and}\quad \|Y_1\| = \Big( \sum_{-\infty}^{\infty} |x_1(2n)|^2 \Big)^{1/2}.$$

The two filters $F_0$ and $F_1$ are called quadrature mirror filters if, for all signals
$X$ of finite energy, one has

(3.1) $\|Y_0\|^2 + \|Y_1\|^2 = \|X\|^2.$

Denote by $T_0$ the operator $DF_0 : l^2(\mathbb{Z}) \to l^2(2\mathbb{Z})$ and, similarly, let $T_1$ denote
the operator $DF_1 : l^2(\mathbb{Z}) \to l^2(2\mathbb{Z})$.


Then (3.1) is clearly equivalent to

(3.2) $I = T_0^* T_0 + T_1^* T_1.$

In (3.2), $I : l^2(\mathbb{Z}) \to l^2(\mathbb{Z})$ is the identity operator. What is much less evident is
that the vectors $T_0^* T_0(X)$ and $T_1^* T_1(X)$ are always orthogonal, as the following
theorem asserts.
THEOREM 3.1. Let $F_0(\theta)$ and $F_1(\theta)$ denote the transfer functions of the
filters $F_0$ and $F_1$. Then the following two properties are equivalent to each other
and to (3.1).

(3.3) The matrix $\dfrac{1}{\sqrt{2}} \begin{pmatrix} F_0(\theta) & F_1(\theta) \\ F_0(\theta+\pi) & F_1(\theta+\pi) \end{pmatrix}$ is unitary for almost all $\theta \in [0, 2\pi)$.

(3.4) The operator $(T_0, T_1) : l^2(\mathbb{Z}) \to l^2(2\mathbb{Z}) \times l^2(2\mathbb{Z})$ is an isometric isomorphism.

Recall that the sequence of Fourier coefficients of the $2\pi$-periodic function
$F_0(\theta)$ is the impulse response of the filter $F_0$, and the same is true for $F_1(\theta)$
and $F_1$.
Here are some comments on these different properties.
Condition (3.2) is called the perfect reconstruction property: The input signal
$X$ is the sum of two orthogonal signals $T_0^* T_0(X)$ and $T_1^* T_1(X)$, where the signals
$T_0(X)$ and $T_1(X)$ were given by the analysis. The operators $T_0^* = F_0^* E$ and
$T_1^* = F_1^* E$ are applied to two sequences sampled on the even integers. These are
first extended in the crudest way, which is by replacing the missing values with
0's. Next, this seemingly nonsensical step is corrected by passing the sequences
through the filters $F_0^*$ and $F_1^*$, which are the adjoints of $F_0$ and $F_1$. The correct
result is read at the output. The complete scheme, analysis and synthesis, is
now classic:

[Figure: the complete two-channel scheme, analysis followed by synthesis.]

Condition (3.4) means that quadrature mirror filters constitute orthogonal
transformations of a particular type, while (3.3) allows us to construct quadrature mirror filters that have finite impulse response.
To see this, we start with a trigonometric polynomial $m_0(\theta) = \alpha_0 + \alpha_1 e^{i\theta} + \cdots + \alpha_N e^{iN\theta}$
such that $|m_0(\theta)|^2 + |m_0(\theta+\pi)|^2 = 1$ for all $\theta$. (We will see further
on how to construct these polynomials.) Next, we write $F_0(\theta) = \sqrt{2}\, m_0(\theta)$ and
$F_1(\theta) = \sqrt{2}\, e^{i\theta}\, \overline{m_0(\theta+\pi)}$. Then it follows directly that (3.3) is satisfied.
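Condition (3.3) is easy to test numerically for a given trigonometric polynomial. The sketch below assumes the standard four-tap Daubechies coefficients as the $\alpha_k$ (up to the factor $1/\sqrt{2}$), takes $F_1(\theta) = \sqrt{2}\, e^{i\theta}\, \overline{m_0(\theta + \pi)}$ with the complex conjugate written out explicitly, and measures how far the matrix of (3.3) is from unitary on a sample of angles. The names `m0`, `F0`, `F1`, and `qmf_matrix` are illustrative.

```python
import numpy as np

s3 = np.sqrt(3.0)
# Assumed example: four-tap Daubechies coefficients, normalized so that sum(h) = sqrt(2).
h = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))

def m0(theta):
    """Trigonometric polynomial with |m0(t)|^2 + |m0(t + pi)|^2 = 1."""
    k = np.arange(len(h))
    return np.sum(h * np.exp(1j * k * theta)) / np.sqrt(2.0)

def F0(theta):
    return np.sqrt(2.0) * m0(theta)

def F1(theta):
    return np.sqrt(2.0) * np.exp(1j * theta) * np.conj(m0(theta + np.pi))

def qmf_matrix(theta):
    """The matrix of condition (3.3), including the factor 1/sqrt(2)."""
    return np.array([[F0(theta), F1(theta)],
                     [F0(theta + np.pi), F1(theta + np.pi)]]) / np.sqrt(2.0)

thetas = np.linspace(0.0, 2 * np.pi, 50, endpoint=False)
max_defect = max(
    np.linalg.norm(qmf_matrix(t).conj().T @ qmf_matrix(t) - np.eye(2))
    for t in thetas
)
```

Here `max_defect` is the largest deviation of $U^*U$ from the identity over the sampled angles; it should sit at the level of rounding error.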
The following five examples illustrate the definition of quadrature mirror
filters.
The first example is essentially a counterexample because it is never used, for
a reason that will become clear later. It consists in bypassing the operators $F_0$
and $F_1$. More precisely, $T_0(X)$ is the restriction of the sequence $X = (x(n))_{n\in\mathbb{Z}}$
to the even integers, while $T_1(X)$ is the restriction of this sequence to the
odd integers. Condition (3.1) is then trivially satisfied, and the unitary matrix (3.3) is

$$\frac{1}{\sqrt{2}} \begin{pmatrix} 1 & e^{i\theta} \\ 1 & -e^{i\theta} \end{pmatrix}.$$

The second example is more interesting. The filter operators are defined by

$$F_0((x(n))) = \frac{1}{\sqrt{2}}\big(x(n) + x(n+1)\big) \quad\text{and}\quad F_1((x(n))) = \frac{1}{\sqrt{2}}\big(x(n) - x(n+1)\big).$$

The orthonormal wavelet basis associated with this choice will be the Haar
system.
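For these two filters, properties (3.1) and (3.2) can be checked directly. This is a minimal sketch, assuming a finite signal of even length whose first sample has index 0; the helper names `haar_analysis` and `haar_synthesis` are illustrative.

```python
import numpy as np

def haar_analysis(x):
    """Return T0(X) and T1(X): filter with F0, F1, then keep the even indices."""
    even, odd = x[0::2], x[1::2]
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)

def haar_synthesis(y0, y1):
    """Return T0*(y0) + T1*(y1): extend with 0's, then apply the adjoint filters."""
    x = np.empty(2 * len(y0))
    x[0::2] = (y0 + y1) / np.sqrt(2.0)
    x[1::2] = (y0 - y1) / np.sqrt(2.0)
    return x

rng = np.random.default_rng(2)
x = rng.standard_normal(16)
y0, y1 = haar_analysis(x)
trend = haar_synthesis(y0, np.zeros_like(y1))          # T0* T0 X
fluctuation = haar_synthesis(np.zeros_like(y0), y1)    # T1* T1 X
```

The two pieces `trend` and `fluctuation` add up to `x` and are orthogonal to each other, which is the content of the perfect reconstruction property for this filter pair.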

The third example recaptures the ideal filters presented at the beginning of
the chapter. The $2\pi$-periodic function $m_0(\theta)$ is 1 on $[0, \pi)$ and 0 on $[\pi, 2\pi)$, and
$m_1(\theta) = 1 - m_0(\theta)$. Next, as above, define $F_0(\theta) = \sqrt{2}\, m_0(\theta)$ and $F_1(\theta) = \sqrt{2}\, m_1(\theta)$.
In the fourth example, $m_0(\theta)$ becomes the characteristic function of
$[-\pi/2, \pi/2)$ when it is restricted to $[-\pi, \pi)$, and $m_1(\theta) = 1 - m_0(\theta)$.
The last example is a smooth modification of the preceding one. Assume
that $0 < \alpha < \pi/2$. We ask that $m_0(\theta)$ be $2\pi$-periodic, equal to 1 on the interval
$[-\frac{\pi}{2} + \alpha, \frac{\pi}{2} - \alpha]$, equal to 0 on $[\frac{\pi}{2} + \alpha, \frac{3\pi}{2} - \alpha]$, even, and infinitely differentiable.
In addition, we impose the condition

(3.5) $|m_0(\theta)|^2 + |m_0(\theta + \pi)|^2 = 1.$

Then write $m_1(\theta) = e^{-i\theta} m_0(\theta + \pi)$, $F_0(\theta) = \sqrt{2}\, m_0(\theta)$, $F_1(\theta) = \sqrt{2}\, m_1(\theta)$, and
we obtain two quadrature mirror filters.

3.4. The trend and fluctuation.


Returning to Theorem 3.1, let $H$ denote the Hilbert space of all sequences
$(x(n))_{n\in\mathbb{Z}}$ satisfying $\sum_{-\infty}^{\infty} |x(n)|^2 < \infty$, and recall the operators $T_0$ and $T_1$. Write
$H_0$ and $H_1$ for the two subspaces $T_0^* T_0(H)$ and $T_1^* T_1(H)$. We know that if $F_0$
and $F_1$ are two quadrature mirror filters, then $H$ will be the direct orthogonal
sum of $H_0$ and $H_1$.
Write $m_0(\theta) = \frac{1}{\sqrt{2}} F_0(\theta)$ and assume that $m_0(\theta)$ is 0 at $\theta = \pi$ and that this 0
has order $q \geq 1$. Then $|m_0(\theta)|^2 = 1 + O(|\theta|^{2q})$ and $m_1(\theta) = O(|\theta|^q)$ when $\theta$ tends
to 0. Under these conditions, we say that $F_0$ is a low-pass filter and that $F_1$ is
a high-pass filter, even though this terminology may not always be justified.


If these conditions are satisfied, the trend and the fluctuation around this
trend of a signal $X$ are defined, respectively, by $X_0 = T_0^* T_0(X)$ and $X_1 = T_1^* T_1(X)$.
The trend $X_0$ is twice as regular as $X$ and, as such, can be subsampled by
keeping only one point in two.
But this subsampling is furnished here by construction. Indeed, $T_0^* : l^2(2\mathbb{Z}) \to l^2(\mathbb{Z})$
is a partial isometry, and $T_0(X)$ constitutes the subsampling of the trend
$X_0$. In the same way, $T_1(X)$ is the subsampling of the fluctuation $X_1$.

3.5.

The time-scale algorithm of Mallat and the time-frequency algorithm of Galand.

It is amazing to reread Galand's thesis in the light of present understanding.
Indeed, Galand's goal was to obtain finer and finer frequency resolutions by
appropriately iterating the quadrature mirror filters. This is possible, however,
only in the case of the ideal filters in our third example, and these ideal filters
are unusable because they have an infinite impulse response. In spite of this
criticism, we will return to Galand's point of view in Chapter 7, and it will lead
us to wavelet packets.
Galand was thus looking for time-frequency algorithms, but his fundamental
discovery, the quadrature mirror filters, was diverted from that end by Mallat, who
intentionally used quadrature mirror filters to construct, using a hierarchical
scheme, time-scale algorithms.
Mallat considers an increasing sequence $\Gamma_j = 2^{-j}\mathbb{Z}$ of nested grids that go
from the "fine grid" $\Gamma_N$ ($N \geq 1$) to the "coarse grid" $\Gamma_0$. The signal to be
analyzed has been sampled on the fine grid (we will come back to the sampling
technique when studying the convergence problem), and our starting point is
thus a sequence $f = f_0$ belonging to $l^2(\Gamma_N)$.
In addition, two quadrature mirror filters $F_0$ and $F_1$ are given. (We will
see later what conditions they must satisfy.) These same filters will be used
throughout the discussion.
We process the signal $f$ by decomposing it into its trend and fluctuation.
The trend is sampled on the next grid $\Gamma_{N-1}$; it represents a new signal that
is decomposed again into trend and fluctuation. The fluctuations are never
analyzed in this hierarchical scheme, and the algorithm follows a "herringbone"
pattern.

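[Figure: the "herringbone" scheme. Starting from $f = f_0$ on the fine grid $\Gamma_N$, each stage splits the current trend into a fluctuation ($r_1, r_2, \ldots, r_N$) and a coarser trend ($f_1, f_2, \ldots, f_N$), ending on the coarse grid $\Gamma_0$.]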

The input signal $f \in l^2(\Gamma_N)$ is finally represented by the sequence $r_1, \ldots, r_N$
of fluctuations and by the last trend $f_N \in l^2(\Gamma_0)$. The transformation that maps $f$
onto the sequence $(r_1, \ldots, r_N, f_N)$ is clearly orthogonal, and the inverse is immediately
calculated, based on the perfect reconstruction property of the quadrature
mirror filters.
The significance of Mallat's algorithm stems from the following observation:
For appropriate choice of the filters $F_0$ and $F_1$, there are numerous cases where
the fluctuations $r_1, \ldots, r_N$ are, at different steps, extremely small. Coding the
signal thus comes down to coding the last trend $f_N$ as well as those coefficients
of the fluctuations that are above the threshold fixed by the quantization. Notice
that the last trend contains $2^N$ times less data than the input signal and that
the gain is appreciable.
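The "herringbone" scheme can be sketched in a few lines when the quadrature mirror filters are the Haar pair of the second example. This is only an illustration under that assumption (the algorithm itself works with any admissible pair); the signal length is assumed divisible by $2^N$, and the function names are hypothetical.

```python
import numpy as np

def haar_step(f):
    """One stage: split f into its (subsampled) trend and fluctuation."""
    even, odd = f[0::2], f[1::2]
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)

def haar_unstep(trend, fluct):
    """Inverse of haar_step, using perfect reconstruction."""
    f = np.empty(2 * len(trend))
    f[0::2] = (trend + fluct) / np.sqrt(2.0)
    f[1::2] = (trend - fluct) / np.sqrt(2.0)
    return f

def herringbone(f, stages):
    """Return ([r_1, ..., r_N], f_N); only the trends are decomposed again."""
    fluctuations = []
    trend = np.asarray(f, dtype=float)
    for _ in range(stages):
        trend, r = haar_step(trend)
        fluctuations.append(r)
    return fluctuations, trend

def inverse_herringbone(fluctuations, trend):
    for r in reversed(fluctuations):
        trend = haar_unstep(trend, r)
    return trend

t = np.linspace(0.0, 1.0, 256, endpoint=False)
signal = np.sin(2 * np.pi * t)
fluctuations, last_trend = herringbone(signal, 5)
rebuilt = inverse_herringbone(fluctuations, last_trend)
energy_out = sum(np.sum(r ** 2) for r in fluctuations) + np.sum(last_trend ** 2)
```

With 5 stages on 256 samples the last trend has 8 coefficients, that is, $2^5$ times less data than the input, and the total energy of the coefficients equals that of the signal.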

3.6.

Trends and fluctuations with orthonormal wavelet bases.

We propose to describe the asymptotic behavior of Mallat's algorithm when the
number of stages $N$ tends to infinity. In order to do this, it is first necessary
to present the continuous version of this algorithm. This involves orthonormal
wavelet bases in the following "complete form," which means that we have a
wavelet plus a multiresolution analysis. This will be explained.
We begin with a function $\varphi(x)$ belonging to $L^2(\mathbb{R})$ that has the following
property:

(3.6) $\varphi(x - k)$, $k \in \mathbb{Z}$, is an orthonormal sequence in $L^2(\mathbb{R})$.

Let $V_0$ denote the closed linear subspace of $L^2(\mathbb{R})$ generated by this sequence.
More generally, define $V_j$ in terms of $V_0$ by simply changing scale, that is to say,

(3.7) $f(x) \in V_0 \iff f(2^j x) \in V_j$ for functions $f \in L^2(\mathbb{R})$.

The other hypotheses are these: The $V_j$, $j \in \mathbb{Z}$, form a nested sequence; their
intersection $\bigcap_{-\infty}^{\infty} V_j$ reduces to $\{0\}$; and their union $\bigcup_{-\infty}^{\infty} V_j$ is dense in $L^2(\mathbb{R})$.
We then write $\varphi_{j,k}(x) = 2^{j/2} \varphi(2^j x - k)$, $j, k \in \mathbb{Z}$, and define the trend $f_j$ at
scale $2^{-j}$ of a function $f \in L^2(\mathbb{R})$ by

$$f_j(x) = \sum_k \langle f, \varphi_{j,k} \rangle\, \varphi_{j,k}(x).$$

The fluctuations (or details, in the case of an image) are denoted by $d_j(x)$
and defined by $d_j(x) = f_{j+1}(x) - f_j(x)$.
.
To analyze these details further, we let $W_j$ denote the orthogonal complement
of $V_j$ in $V_{j+1}$, so that $V_{j+1} = V_j \oplus W_j$. Then there exists at least one function
$\psi$ belonging to $W_0$ such that $\psi(x - k)$, $k \in \mathbb{Z}$, is an orthonormal basis of $W_0$.
This function $\psi(x)$, called the mother wavelet, has the following properties:

(3.8) $\psi_{j,k}(x) = 2^{j/2} \psi(2^j x - k)$, $j, k \in \mathbb{Z}$,

is an orthonormal basis for $L^2(\mathbb{R})$ and, more precisely, for all $j \in \mathbb{Z}$, we have

(3.9) $d_j(x) = \sum_k \langle f, \psi_{j,k} \rangle\, \psi_{j,k}(x).$

The details, at a given scale, are thus linear combinations of the "elementary
fluctuations," which are the wavelets related to that scale.
This analysis technique will be discussed again in the next chapter under the
name "multiresolution analysis."
Given two functions $\varphi(x)$ and $\psi(x)$, called, respectively, "the father and
mother wavelet," it is possible to define two quadrature mirror filters $F_0$ and $F_1$
by way of the operators $T_0 = DF_0$ and $T_1 = DF_1$. This is done by relating the
approximation of the function space $L^2(\mathbb{R})$ that is given by the nested sequence
of subspaces $V_j$ to the approximation of the real line $\mathbb{R}$ that is given by the nested
sequence of grids $\Gamma_j = 2^{-j}\mathbb{Z}$.
To do this, we consider that the function $\varphi_{j,k}(x) = 2^{j/2} \varphi(2^j x - k)$ is centered
around the point $k 2^{-j}$, which would be the case if $\varphi(x)$ was an even function.
We associate the point $k 2^{-j}$ with the function $\varphi_{j,k}$. This gives a correspondence
between $\Gamma_j$ and the orthonormal basis $\{\varphi_{j,k}, k \in \mathbb{Z}\}$ of $V_j$. At the same time,
$l^2(\Gamma_j)$ is identified isometrically with $V_j$.
To define the operator $T_0 : l^2(\Gamma_{j+1}) \to l^2(\Gamma_j)$, it is sufficient to define its
adjoint $T_0^* : l^2(\Gamma_j) \to l^2(\Gamma_{j+1})$. This adjoint $T_0^*$ is a partial isometry. It is
constructed by starting with the isometric injection $V_j \subset V_{j+1}$ and by identifying
$V_j$ with $l^2(\Gamma_j)$ and $V_{j+1}$ with $l^2(\Gamma_{j+1})$, as explained above.
The orthonormal basis $\psi_{j,k}$ of $W_j$ allows us to identify $W_j$ with $l^2(\Gamma_j)$ in the
same way. The isometric injection of $W_j \subset V_{j+1}$, interpreted with this identification, becomes the partial isometry

$$T_1^* : l^2(\Gamma_j) \to l^2(\Gamma_{j+1}).$$

Finally, the couple $(\varphi, \psi)$ is represented by the couple $(T_0, T_1)$ or, which
amounts to the same thing, by the pair $(F_0, F_1)$ of the two quadrature mirror
filters. This crucial observation is due to Mallat.
Mallat also posed the converse problem: Given two quadrature mirror filters
$F_0$ and $F_1$, is it possible to associate with them two functions $\varphi$ and $\psi$ having
properties (3.6), (3.7), and (3.8)?
Although the converse is incorrect in general, it is correct in numerous cases,
and this led to the construction of Daubechies's wavelets.
Our first and third examples of quadrature mirror filters show that the converse is generally false. There are no functions $\varphi$ and $\psi$ behind these numerical
algorithms.
The second example is related to the Haar system. The function $\varphi(x)$ is the
index function of $[0, 1)$, and $V_j$ is composed of step functions that are constant
on each interval $[k 2^{-j}, (k+1) 2^{-j})$, $k \in \mathbb{Z}$.
The fourth example leads to "Shannon's wavelets." The function $\varphi(x)$ is the
cardinal sine defined by $\dfrac{\sin \pi x}{\pi x}$.
Finally, the last example is more interesting, because both of the functions
$\varphi(x)$ and $\psi(x)$ belong to the Schwartz class $S(\mathbb{R})$, which consists of the infinitely
differentiable functions that decrease rapidly at infinity.
In the next section, we are going to arrive at the functional analysis (that is,
the continuous case) by passing to the limit in the discrete algorithms. To do
this, we will give sufficient conditions on $m_0(\theta)$ to construct a multiresolution
analysis starting with two quadrature mirror filters.

3.7. Convergence to wavelets.


In order to "restrict" a very irregular function $f(x)$ belonging to $L^2(\mathbb{R})$ to a
grid $\Gamma = h\mathbb{Z}$, $h > 0$, it is necessary to filter the function before sampling. This
filtering ought to be done according to specific rules so that, in the event $f(x)$ is
more regular than anticipated, $f$ can be reconstructed from the sampled version
by interpolation.
Today we know exactly what needs to be done, and the technique of sampling
is a direct consequence of Shannon's work.
We filter $f$ by forming the convolution $f * g_h$, where $g_h(x) = h^{-1} g(h^{-1} x)$ and
where $g(x)$ and its Fourier transform $\hat{g}(\xi)$ are chosen so that

(3.10) $g(x)$ is in the class $C^r$ and $g(x), g'(x), \ldots, g^{(r)}(x)$ all decrease rapidly at infinity,

(3.11) $\int_{-\infty}^{\infty} g(x)\, dx = 1,$

(3.12) $\hat{g}(2k\pi) = 0$ if $k \in \mathbb{Z}$ and $k \neq 0$.

One can then restrict the filtered signal to the grid $h\mathbb{Z}$.
We assume that these conditions are satisfied throughout the discussion, and
we begin by reconsidering Mallat's "herringbone" algorithm. Start by fixing $f$
in $L^2(\mathbb{R})$, and sample $f$ on the grid $\Gamma_N$ using the preconditioning filter $f * g_N$,
where $g_N(x) = 2^N g(2^N x)$.
We wish to study the asymptotic behavior of Mallat's algorithm as $N$ tends
to infinity. The limit we are looking for is defined as follows: Fix the index $j$ of
the grid $\Gamma_j$. (Starting with $\Gamma_0$, we will look at $\Gamma_1$, $\Gamma_2$, etc.) Then we seek the
(simple) limits of the sequences

when $N$ tends to $+\infty$, $j$ and $k$ being fixed. Refer to the "herringbone" scheme
for the definitions of $f_N, r_N, r_{N-1}, \ldots$. Here are the results [6].
THEOREM 3.2. Assume that the impulse responses of the quadrature mirror
filters $F_0$ and $F_1$ decrease rapidly at infinity and that the transfer function $F_0(\theta)$
of $F_0$ satisfies $F_0(0) = \sqrt{2}$, $F_0(\theta) \neq 0$ if $-\frac{\pi}{2} \leq \theta \leq \frac{\pi}{2}$. Then Mallat's "herringbone"
algorithm, applied to $f * g_N$ as indicated above, converges to the analysis
of $f$ in an orthonormal wavelet basis.

$$\lim_{N\to\infty} f_N(k) = \int_{-\infty}^{\infty} f(x)\, \varphi(x - k)\, dx, \qquad \lim_{N\to\infty} r_N(k) = \int_{-\infty}^{\infty} f(x)\, \psi(x - k)\, dx, \ \ldots$$

The functions $\varphi(x)$ and $\psi(x)$ are, respectively, "the father and mother" of
the orthonormal wavelet basis, as explained in the preceding section.
A very beautiful application of this result (to which we return in Chapter 4)
is the construction of the celebrated bases of Daubechies.

3.8.

The wavelets of Daubechies.

These wavelets depend on an integer $N \geq 1$ that defines the support of the
functions $\varphi(x)$ and $\psi(x)$, namely, $[0, 2N-1]$, as well as the Hölder regularity of these functions: $\varphi(x)$ and $\psi(x)$ belong to $C^r$, where $r = r(N)$ and
$\lim_{N\to\infty} N^{-1} r(N) = \gamma > 0$. The value of $\gamma$ is about 1/5; this implies that, if a
wavelet $\psi(x)$ is to have 10 continuous derivatives, the length of its support must
be about 50.
The functions $\varphi(x)$ and $\psi(x)$, which ought to be written as $\varphi_N$ and $\psi_N$, are,
respectively, the father and mother of the orthonormal wavelet basis.
To construct this orthonormal basis, Daubechies applies the method of the
last section. One starts with the trigonometric sum

$$P_N(t) = 1 - c_N \int_0^t (\sin u)^{2N-1}\, du = \sum_{|k| \leq 2N-1} \gamma_k e^{ikt}$$

with the constant $c_N > 0$ chosen so that $P_N(\pi) = 0$. There exists (at least one)
finite trigonometric sum $m_0(t) = \frac{1}{\sqrt{2}} \sum_0^{2N-1} h_k e^{-ikt}$ such that $|m_0(t)|^2 = P_N(t)$
and $m_0(0) = 1$. The coefficients $h_k$ constitute the impulse response of the filter
$F_0$ and are real.
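The relation $|m_0(t)|^2 = P_N(t)$ can be checked numerically in the simplest nontrivial case $N = 2$. The sketch below assumes the usual four-tap Daubechies coefficients for $h_k$ and computes the integral in $P_N$ with a plain trapezoid rule (so agreement is only up to quadrature error); the function names are illustrative.

```python
import numpy as np

N = 2
s3 = np.sqrt(3.0)
# Assumed coefficients h_0, ..., h_3 for N = 2 (sum equals sqrt(2)).
h = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))

def integral_sin_power(t, power, steps=20000):
    """Trapezoid-rule approximation of the integral of (sin u)**power over [0, t]."""
    u = np.linspace(0.0, t, steps + 1)
    values = np.sin(u) ** power
    return (t / steps) * (np.sum(values) - 0.5 * (values[0] + values[-1]))

# c_N is fixed by the requirement P_N(pi) = 0.
c_N = 1.0 / integral_sin_power(np.pi, 2 * N - 1)

def P(t):
    return 1.0 - c_N * integral_sin_power(t, 2 * N - 1)

def m0(t):
    k = np.arange(len(h))
    return np.sum(h * np.exp(-1j * k * t)) / np.sqrt(2.0)

ts = np.linspace(0.0, np.pi, 9)
gap = max(abs(abs(m0(t)) ** 2 - P(t)) for t in ts)
```

For $N = 2$ the integral of $\sin^3 u$ over $[0, \pi]$ is $4/3$, so $c_2 = 3/4$, and `gap` measures the largest discrepancy between $|m_0|^2$ and $P_2$ on the sampled points.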
Under these conditions we know from Theorem 3.2 that the functions $\varphi$ and $\psi$
exist and that they form a multiresolution analysis. We now use these results
to construct $\varphi$ and $\psi$ explicitly. The function $\varphi$, which we seek to construct,
must be a solution of the functional equation

(3.13) $\varphi(x) = \sqrt{2} \sum_{0}^{2N-1} h_k\, \varphi(2x - k)$ and $\int_{-\infty}^{\infty} \varphi(x)\, dx = 1.$

This functional equation follows from the inclusion $V_0 \subset V_1$ and the fact that
$\varphi(x - k)$, $k \in \mathbb{Z}$, is an orthonormal basis for $V_0$, and $\sqrt{2}\, \varphi(2x - k)$, $k \in \mathbb{Z}$, is an
orthonormal basis for $V_1$.

This functional equation leads to

(3.14) $\hat{\varphi}(\xi) = m_0(\xi/2)\, m_0(\xi/4) \cdots m_0(\xi/2^j) \cdots,$

and the principal difficulty in this construction is to show that one has $\hat{\varphi}(\xi) = O(|\xi|^{-m})$
at infinity, where $m = m(N)$ tends to infinity with $N$.
On the other hand, it is almost obvious that the support of $\varphi(x)$ is in
$[0, 2N-1]$.
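One way to see the functional equation (3.13) at work is to iterate it, starting from the index function of $[0, 1)$. This "cascade" iteration is a standard numerical device and is not the construction used in the text, which goes through the infinite product (3.14). The sketch assumes the four-tap Daubechies coefficients ($N = 2$); each pass replaces a step function $\varphi_j$ by $\sqrt{2} \sum_k h_k \varphi_j(2x - k)$, and the array holds its values on a grid of mesh $2^{-j}$. The helper names are hypothetical.

```python
import numpy as np

s3 = np.sqrt(3.0)
h = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))

def upsample(taps, step):
    """Spread the taps `step` samples apart, with 0's in between."""
    out = np.zeros((len(taps) - 1) * step + 1)
    out[::step] = taps
    return out

def cascade(taps, levels):
    """Step-function approximation of phi after `levels` passes, mesh 2**-levels."""
    values = np.array([1.0])                      # index function of [0, 1)
    for j in range(levels):
        values = np.sqrt(2.0) * np.convolve(values, upsample(taps, 2 ** j))
    return values

J = 8
phi = cascade(h, J)
mesh = 2.0 ** (-J)
integral = mesh * np.sum(phi)          # approximates the integral of phi
support_length = len(phi) * mesh       # stays below 2N - 1 = 3

def shift_product(k):
    """Inner product of the approximation with its translate by the integer k >= 0."""
    s = k * 2 ** J
    return mesh * np.dot(phi[s:], phi[:len(phi) - s])
```

The integral stays equal to 1 at every pass, the support never leaves $[0, 3]$, and the integer translates remain orthonormal, in line with (3.6).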
The fact that $\varphi(x - k)$, $k \in \mathbb{Z}$, is an orthonormal sequence is a direct consequence of Theorem 3.2 and the fact that $m_0(t) \neq 0$ on $[-\pi/2, \pi/2]$.
To determine the Fourier transform $\hat{\psi}(\xi)$ of the wavelet $\psi(x)$, we write
$m_1(t) = e^{i(1-2N)t}\, \overline{m_0(t + \pi)}$. Then

(3.15) $\hat{\psi}(\xi) = m_1(\xi/2)\, \hat{\varphi}(\xi/2) = m_1(\xi/2)\, m_0(\xi/4)\, m_0(\xi/8) \cdots m_0(\xi/2^j) \cdots$

and the support of $\psi(x)$ is the interval $[0, 2N-1]$.
If $N = 1$, $\varphi(x)$ is the index function of $[0, 1)$, while $\psi(x) = 1$ on $[0, 1/2)$, $-1$
on $[1/2, 1)$ and 0 elsewhere. The orthonormal basis $2^{j/2} \psi(2^j x - k)$, $j, k \in \mathbb{Z}$, is
then the Haar system.

3.9.

Conclusions.

The functions $\psi_N$ used by Daubechies to construct the orthonormal bases named
for her are new "special functions." These "special functions" never appeared in
previous work, and their only definition is provided by (3.14) and (3.15). This
means that the detour by way of quadrature mirror filters and the corresponding
transfer functions was indispensable. In other words, it never would have been
possible to discover Daubechies's wavelets by trying to solve directly the existence
problem: Is there, for each integer $r \geq 0$, a function $\psi(x)$ of class $C^r$ such that
$2^{j/2} \psi(2^j x - k)$, $j, k \in \mathbb{Z}$, is an orthonormal basis of $L^2(\mathbb{R})$?
The quadrature mirror filters will be used again in the algorithms for numerical image processing, which we describe in the next chapter.

Bibliography

[1] E. H. ADELSON, R. HINGORANI, AND E. SIMONCELLI, Orthogonal pyramid transforms for image coding, SPIE, Visual Communications and Image Processing II, Cambridge, MA, October 27-29, 1987, Vol. 845, pp. 50-58.
[2] I. DAUBECHIES, Orthonormal bases of compactly supported wavelets, Comm. Pure Appl. Math., 41 (1988), pp. 909-996.
[3] D. ESTEBAN AND C. GALAND, Application of quadrature mirror filters to split band voice coding systems, International Conference on Acoustics, Speech, and Signal Processing, Washington, DC, May 1977, pp. 191-195.
[4] C. GALAND, Codage en sous-bandes: théorie et applications à la compression numérique du signal de parole, Thesis, University of Nice, Nice, France, March 1983.
[5] S. MALLAT, A theory for multiresolution signal decomposition: The wavelet representation, IEEE Trans. Pattern Anal. Machine Intell., 11 (1989), pp. 674-693.
[6] Y. MEYER AND F. PAIVA, Convergence de l'algorithme de Mallat, preprint, CEREMADE, University of Paris-Dauphine, Paris, France.

CHAPTER 4

Pyramid Algorithms for Numerical Image Processing

4.1.

Introduction.

Experts in image processing agree on the following point: An image contains
important information in a wide range of scales, and this information is often
independent from one scale to another.
Marr wrote in Vision [9, p. 51]: "Although the basic elements in our image are
the intensity changes, the physical world imposes on these raw intensity changes
a wide variety of spatial organizations, roughly independently at different scales."
And we read on page 54: "Intensity changes occur at different scales in an image,
and so their optimal detection requires the use of operators of different sizes."
In [2], Adelson, Hingorani, and Simoncelli used the same language: "Images
contain information at all scales."
Cartography illustrates this concept very well. Maps contain different information at different scales. For example, it is impossible to plan a trip to visit
the Roman churches at Charente and Poitou using the map of France found on
a globe of the earth. Indeed, the villages where these churches are found do not
appear on the global representation.
Cartographers have developed conventions for dealing with geographic information by partitioning it into independent categories that correspond to the
different scales of a department, a region, a country, a continent, and the whole
globe. These categories are not entirely independent, and the more important
features existing at a given scale are repeated at the next larger scale. Thus, it
is sufficient to specify the relations between information given at two adjacent
scales in order to define unambiguously the embedding of the different representations at different scales. Naturally, these embedding relations (such as which
department belongs to which province, and which province belongs, in turn, to
which country, and so on) are available to us from our knowledge of geography; however, they could be discovered by merely examining the maps.
We can see from this example the fundamental idea of representing an image
by a tree. In the cartographic case, the trunk would be the map of the world.
By traveling toward the branches, the twigs, and the leaves, we reach successive
maps that cover smaller regions and give more details, details that do not appear
at lower levels.

To interpret this cartographic representation using the pyramid algorithm, it
will be necessary to reverse the roles of top and bottom, since the pyramid algorithm progresses from "fine to coarse." In cartography, usage and certain conventions determine which details are deleted in going from one scale to another
and which "coherent structures" (see Chapter 10) persist across a succession of
scales.
In this chapter, we are going to describe the pyramid algorithms of Burt and
Adelson, as well as two important modifications derived from them. The purpose
of these algorithms is to provide an automatic process, in the context of digital
imagery, to calculate the image at scale $2^{j+1}$ from the image at scale $2^j$. If the
original image corresponds to a fine grid with 1024 x 1024 points, the pyramid
algorithm first yields a 512 x 512 image, then one 256 x 256, next a 128 x 128
image, and so on until reaching the absurd (in practice) 1 x 1 limit. The interest
in pyramid algorithms derives from their iterative structure, which uses results
from a given scale, $2^j$, to move to the next scale, $2^{j+1}$.
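The iterative structure can be illustrated with the crudest possible "fine to coarse" rule. The sketch below is not Burt and Adelson's filter: as a stand-in, it assumes a square image whose side is a power of two and replaces each 2 x 2 block by its average, so that every level is computed from the previous one without returning to the original image. The function names are hypothetical.

```python
import numpy as np

def reduce_once(image):
    """Halve each dimension by averaging 2x2 blocks (assumed stand-in low-pass filter)."""
    rows, cols = image.shape
    return image.reshape(rows // 2, 2, cols // 2, 2).mean(axis=(1, 3))

def build_pyramid(image):
    """Return the list of images from the fine grid down to the 1 x 1 limit."""
    levels = [image]
    while levels[-1].shape[0] > 1:
        levels.append(reduce_once(levels[-1]))   # uses only the previous level
    return levels

rng = np.random.default_rng(3)
image = rng.random((64, 64))
pyramid = build_pyramid(image)
shapes = [level.shape for level in pyramid]
```

Starting from 64 x 64, the sizes are 64, 32, 16, 8, 4, 2, 1; with this particular averaging rule the final 1 x 1 image is simply the mean of the original.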
Returning to our cartographic example, suppose that we already have maps
of the French Departments at a scale of 1 to 200,000. Then it is of no value
to refer to the new satellite images in order to construct a map of France at a
scale of 1 to 2,000,000. The information needed to make this new map is already
contained in the maps of the Departments. The point is that one uses judiciously
the work already done without going back to the raw data. We have just outlined
the general philosophy of the pyramid algorithms without, however, describing
the algorithms that are used to change scale. How, starting with a very precise
representation of the Brittany coast at a scale of 1 to 200,000, can we arrive
at a more schematic description at a scale of 1 to 2,000,000 without smoothing
or softening too much the myriad details and roughness that characterize the
Brittany coastline?
The pyramid algorithms of Burt and Adelson and their variants (orthogonal
and bi-orthogonal pyramids) deal with this type of problem. In all cases, this
will involve calculating (at each scale) an approximation $f_j$ of a given image by
using an iterative algorithm to go from one scale to the next.

4.2.

The pyramid algorithms of Burt and Adelson.

For the rest of the discussion, $\Gamma_j = 2^{-j}\mathbb{Z}^2$ will denote the sequence of nested
grids used for image processing. It often happens that the image is bounded by
the unit square, in which case we will speak of a 512 x 512 image to indicate
that $j = 9$; similarly, a 1024 x 1024 image will correspond to $j = 10$.
At this point, we are working with images that are already digitized and
appear as numerical functions. The raw image that provided these digital images will be, for us, a function $f(x, y)$. This function can be very irregular,
either because of noise or because of discontinuities in the image. For example,
discontinuities can be due to the edges of objects in the image.
The sampled images $f_j$ are defined on the corresponding grids $\Gamma_j = 2^{-j}\mathbb{Z}^2$.
These sampled images are obtained from the original physical image, that is,
$f(x, y)$, by the restriction operators $R_j : L^2(\mathbb{R}^2) \to l^2(\Gamma_j)$. These operators $R_j$
will be defined in the following pages. They are the same type as those used in
numerical analysis to discretize an irregular function or distribution.
The fundamental discovery of Burt and Adelson is the existence of restriction
operators $R_j$ with the property that, for all initial images $f$, the sampled images
$f_j = R_j(f)$ are related to each other by extremely simple causality relations.
These causality relations, of the type "fine to coarse," allow $f_{j-1}$ to be calculated directly from $f_j$ without having to go back to the original physical image
$f(x, y)$.
In order to define the restriction operators $R_j$, we first consider the case of a
grid given by $x = hk$, $y = hl$, where $h > 0$ is the sampling step and $(k, l) \in \mathbb{Z} \times \mathbb{Z}$.
Very irregular functions should not be sampled directly and, therefore, the
image must be smoothed before it is discretized. This leads to the following
classic scheme

where $F$ is a low-pass filter prior to the sampling $E$.


To determine the characteristics of the filter $F$, first consider the special case,
$f(x, y) = \cos(mx + ny + \varphi)$, $m, n \in \mathbb{N}$. To sample this function correctly on $h\mathbb{Z}^2$,
the Nyquist condition must be satisfied. This means that $h$ must be less than
$\min\{\pi/m, \pi/n\}$ if we wish to be able to reconstruct $f(x, y)$. Looked at from
the other side, the Nyquist condition says that sampling on $h\mathbb{Z}^2$ will lose all
information at frequencies higher than $\pi/h$. For the case at hand, the Nyquist
condition comes down to suppressing, through the action of the filter $F$, all the
frequencies in $f(x, y)$ that are greater than $\pi/h$. This is done by smoothing the
signal through convolution with $g\left(\frac{x}{h}, \frac{y}{h}\right)$, where $g(x, y)$ is a sufficiently regular
function concentrated around zero.
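The loss of information when the Nyquist condition fails can be seen in one dimension. This is a small illustrative sketch with assumed values $m = 3$ and two sampling steps: with $h = \pi/2 > \pi/m$, the samples of $\cos(mx)$ coincide with those of the lower frequency $2\pi/h - m = 1$, while with $h = \pi/4 < \pi/m$ the two sample sequences differ.

```python
import numpy as np

m = 3
h_bad = np.pi / 2      # violates h < pi/m
h_good = np.pi / 4     # satisfies h < pi/m
k = np.arange(-8, 9)

samples_bad = np.cos(m * k * h_bad)
alias_bad = np.cos(1 * k * h_bad)      # frequency 2*pi/h_bad - m = 1
samples_good = np.cos(m * k * h_good)
alias_good = np.cos(1 * k * h_good)
```

On the coarse grid the two cosines are indistinguishable, which is why frequencies above $\pi/h$ must be removed by the filter $F$ before sampling.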
The filtering/sampling scheme maps the physical image $f(x, y)$ onto a numerical image defined by

(4.1) $c(k, l) = \dfrac{1}{h^2} \iint g\left(k - \dfrac{x}{h},\, l - \dfrac{y}{h}\right) f(x, y)\, dx\, dy.$

By writing $\varphi(x, y) = g(-x, -y)$ and $\varphi_h(x, y) = \frac{1}{h^2} \varphi\left(\frac{x}{h}, \frac{y}{h}\right)$, we have

(4.2) $c(k, l) = \langle f, \varphi_h(\cdot - kh, \cdot - lh) \rangle$

where $\langle u, v \rangle = \iint u(x, y)\, v(x, y)\, dx\, dy$ and where $\cdot$ denotes the (dummy) variable
of integration.
The extension operator enables us to extend a sequence $c(k, l)$ defined on $h\mathbb{Z}^2$
to a regular function on $\mathbb{R}^2$. In this sense, it is inverse to the filtering/sampling
operation. We define the extension operator to be the adjoint of the restriction
operator so it is given by

(4.3) $P_h(c)(x, y) = \sum_k \sum_l c(k, l)\, \varphi\left(\dfrac{x}{h} - k,\, \dfrac{y}{h} - l\right).$

This is an interpolation operator.

The simplest examples are given by the spline functions. We consider the one-dimensional case to simplify the notation. If we let $T$ be the "triangle" function
$T(x) = \sup(1 - |x|, 0)$, then (4.3) yields the usual piecewise linear interpolation
of a discrete sequence. A second choice is given by $\varphi = T * T$, which is the basic
cubic spline.
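The piecewise linear case is easy to reproduce. The sketch below assumes the one-dimensional form $\sum_k c(k)\, T(x/h - k)$ of the extension operator, with nodes at $kh$ for $k = 0, \ldots, \text{len}(c) - 1$, and compares it with NumPy's own linear interpolation; the function names are illustrative.

```python
import numpy as np

def triangle(x):
    """The "triangle" function T(x) = sup(1 - |x|, 0)."""
    return np.maximum(1.0 - np.abs(x), 0.0)

def extend_1d(c, h, x):
    """Evaluate sum_k c[k] * T(x/h - k) at the points x (assumed 1-D extension)."""
    k = np.arange(len(c))
    return np.sum(c[None, :] * triangle(x[:, None] / h - k[None, :]), axis=1)

h = 0.5
c = np.array([0.0, 2.0, 1.0, 3.0])
x = np.linspace(0.0, 1.5, 31)
values = extend_1d(c, h, x)
reference = np.interp(x, h * np.arange(len(c)), c)
```

Between the first and last nodes the two results agree, since at most two translated triangles are nonzero at any point and their weights sum to 1.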
Returning to the general case, we ask that the operator $P_h R_h$, composed of
the restriction operator followed by the extension operator, have the property
that, for all functions $f \in L^2(\mathbb{R}^2)$,

(4.4)  $P_h R_h(f) \to f$ in the quadratic mean, when h tends to 0.

By assuming, for example, that $\varphi(x)$ is a continuous function that decreases
rapidly at infinity, it is easy to show that (4.4) is equivalent to the condition
$P_h R_h(1) = 1$, where 1 represents the function identically equal to 1. And this is
equivalent to

(4.5)  $|\hat{\varphi}(0, 0)| = 1, \quad \hat{\varphi}(2k\pi, 2l\pi) = 0$ if $(0, 0) \neq (k, l) \in \mathbb{Z}^2$.

In what follows, we assume that $\iint \varphi(x, y)\, dx\, dy = 1$, after possibly multiplying
$\varphi$ by a constant of modulus 1.
We return to the fundamental problem posed by Burt and Adelson. Thus
we consider the nested sequence $\Gamma_j = 2^{-j}\mathbb{Z}^2$. These grids become finer when j
tends toward $+\infty$ and coarser when j tends toward $-\infty$.
We begin with a function $\varphi(x, y)$ that is continuous on $\mathbb{R}^2$ and decreases
rapidly at infinity. We also assume, as above, that $\hat{\varphi}(0, 0) = 1$. Denote by $R_j$
and $P_j$ the restriction and extension operators associated with this choice of $\varphi$
and the grid $\Gamma_j$.
Burt and Adelson's basic idea is that, for certain choices of the function $\varphi$, the
different "sampled images" $R_j(f) = f_j$ derived from the same "physical image"
f are necessarily related by extremely simple "causality relations." The dynamic
of these relations is from "fine to coarse," which means that a function defined on
a fine grid is mapped to one on a coarse grid. To make these causality relations
explicit, we denote by $T_j$ the operators that will eventually be defined by these
relations, that is, by $T_j(f_j) = f_{j-1}$ where $f_j = R_j(f)$ and $f_{j-1} = R_{j-1}(f)$. We
can summarize all this with the two conditions:

(4.6)  $T_j : l^2(\Gamma_j) \to l^2(\Gamma_{j-1}),$

(4.7)  $R_{j-1} = T_j R_j.$
One might naively think that the operator $T_j$ can be defined by inverting the
operator $R_j$ in (4.7). But the operator $R_j$ is a smoothing operator, and its
inverse is not defined. In terms of images, it is not possible to go from a blurred
image back to the original image.
Thus we cannot solve (4.7) by elementary algebra. On the other hand, once
$R_j$ is restricted to an appropriate closed subspace $V_j$ of $L^2(\mathbb{R}^2)$, $R_j : V_j \to l^2(\Gamma_j)$
becomes, in certain cases, an isomorphism. Then we can solve (4.7) directly.


Burt and Adelson asked how to determine the functions $\varphi$ such that (4.6)
and (4.7) are satisfied. Stated this way, the problem is very difficult, for most
of the usual choices of smoothing functions do not have these properties. To
resolve this difficulty, Burt and Adelson proceeded the other way around; that
is to say, they sought to construct $\varphi$ from the operators $T_j$.
For this it is necessary to derive some consequences of (4.7). The first is
that the operator $T_0 : l^2(\mathbb{Z}^2) \to l^2(2\mathbb{Z}^2)$ can be written as $T_0 = D F_0$, where
$F_0 : l^2(\mathbb{Z}^2) \to l^2(\mathbb{Z}^2)$ is a filter operator and where $D : l^2(\mathbb{Z}^2) \to l^2(2\mathbb{Z}^2)$ restricts
a function defined on $\mathbb{Z}^2$ to $2\mathbb{Z}^2$. D is the decimation operator, which we have
already encountered in Chapter 3. The fact that $T_0$ has this form is a consequence
of the fact that $T_0$ commutes with all even translations.
Thus, if $X = (x(k))_{k \in \mathbb{Z}^2}$, we can write

(4.8)  $T_0(X)(2k) = \sum_{l \in \mathbb{Z}^2} w(2k - l)\, x(l),$

where $w(k)$ is the impulse response of the filter $F_0$. For convenience, we assume
(as Burt and Adelson did) that $w(k)$ is real.
If we apply $T_0$ to $x(k) = \int f(x)\, \bar{\varphi}(x - k)\, dx$, then from (4.7) we get
$\frac{1}{4} \int f(x)\, \bar{\varphi}\left(\frac{x}{2} - k\right) dx$. Condition (4.8) is thus equivalent to

(4.9)  $\varphi(x, y) = 4 \sum_k \sum_l w(k, l)\, \varphi(2x + k, 2y + l).$

Note that the meaning of k changed in (4.9): We had $k \in \mathbb{Z}^2$ in the preceding
formulas, while k and l now belong to $\mathbb{Z}$.
For obvious reasons, Burt and Adelson were particularly interested in filters
with finite length. This means that $w(k, l)$ is zero if $|k| > N$ and $|l| > N$ for
some N.
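By taking Fourier transforms of both sides, (4.9) becomes

(4.10)  $\hat{\varphi}(\xi, \eta) = m_0\left(\frac{\xi}{2}, \frac{\eta}{2}\right) \hat{\varphi}\left(\frac{\xi}{2}, \frac{\eta}{2}\right),$

where

(4.11)  $m_0(\xi, \eta) = \sum_k \sum_l w(k, l)\, e^{i(k\xi + l\eta)}.$

Recall that $\hat{\varphi}(0, 0) = 1$. By iterating (4.10) and passing to the limit, it becomes

(4.12)  $\hat{\varphi}(\xi, \eta) = m_0\left(\frac{\xi}{2}, \frac{\eta}{2}\right) m_0\left(\frac{\xi}{4}, \frac{\eta}{4}\right) \cdots m_0\left(\frac{\xi}{2^j}, \frac{\eta}{2^j}\right) \cdots$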

The second consequence that we derive from (4.7) is that all of these conditions (for different j) are in fact equivalent. This can be seen by making the
change of variables $x \mapsto 2^j x$ and $y \mapsto 2^j y$ in (4.9), and integrating both sides
against $f(x, y)$. We then have

(4.13)  $R_{j-1}(f)(2^{-j+1} k) = \sum_{l \in \mathbb{Z}^2} w(2k - l)\, R_j(f)(2^{-j} l).$

In other words, under our assumptions, the operators $T_j$ are defined by

(4.14)  $T_j(X)(2^{-j+1} k) = \sum_{l \in \mathbb{Z}^2} w(2k - l)\, x(2^{-j} l)$

when $X = (x(2^{-j} l))$ belongs to $l^2(\Gamma_j)$. The point is that the sequence $w(k)$, $k \in \mathbb{Z}^2$,
is the same for all the operators $T_j$. Working backwards, Burt and Adelson
began with a finite sequence of coefficients $w(k, l)$ such that $\sum_k \sum_l w(k, l) = 1$.
They defined $m_0(\xi, \eta)$ by (4.11) and then $\hat{\varphi}$ by (4.12). Then the first question to
arise is whether the right-hand side of (4.12) defines a square-integrable function.
If this is the case, we call this function $\hat{\varphi}$, we define the restriction operators $R_j$
in terms of the inverse Fourier transform $\varphi$ of this function, and we define the transition
operators $T_j$ by (4.14). Then $R_{j-1} = T_j R_j$ for all $j \in \mathbb{Z}$.

4.3. Examples of pyramid algorithms.


Before continuing the presentation of the Burt and Adelson algorithms, we give
examples of functions $\varphi$ that illustrate both the existence and the nonexistence
of the transition operators. Conversely, we give examples of sequences $w(k, l)$
illustrating the existence and nonexistence of the associated function $\varphi$.
We begin with two examples where the transition operators do not exist.
Suppose that $\varphi(x, y) = \frac{1}{\pi} e^{-x^2 - y^2}$. Then there are no transition operators because (4.10) implies that $m_0(\xi, \eta) = \exp\left(-\frac{3}{4}(\xi^2 + \eta^2)\right)$, which is clearly not
$2\pi$-periodic in $\xi$ and $\eta$. In the same way, the transition operators do not exist if $\varphi(x, y) = \exp(-|x| - |y|)$. One senses, justifiably, that the existence of
transition operators is exceptional.
Here, however, is an example where the operators do exist. To simplify the
discussion, this example (the spline functions) is presented in dimension one.
Let $m \geq 0$ be an integer, and let $\varphi(x)$ be the convolution product $\chi * \chi * \cdots * \chi$.
Here there are m products and m + 1 terms, and $\chi$ is the characteristic function
of the interval $[0, 1]$. Then $\hat{\varphi}(\xi) = \left((e^{-i\xi} - 1)/(-i\xi)\right)^{m+1}$, and (4.10) is satisfied
with $m_0(\xi) = \left((1 + e^{-i\xi})/2\right)^{m+1}$, which is indeed $2\pi$-periodic.
We now proceed the other way: we start with a sequence of transition operators $(T_j)$, that is, with a sequence $w(k, l)$, $k, l \in \mathbb{Z}$, and we propose to reconstruct
$\varphi$. All the examples that we consider are constructed with separable sequences
$w(k, l)$, that is, sequences of the form $w(k)w(l)$. The associated function $\varphi(x, y)$
will then necessarily be of the form $\varphi(x)\varphi(y)$.
For the first example take $w(k) = 0$ if $k \neq 0$ and $w(0) = 1$. In this case
the function $\varphi$ defined by (4.12) is the Dirac measure at 0 and the restriction
operators $R_j : L^2(\mathbb{R}^2) \to l^2(\Gamma_j)$ are no longer defined.
In the second example take $w(k) = 0$ except for $k = \pm 1$, and $w(\pm 1) = 1/2$.
From this we can deduce that $m_0(\xi) = \cos \xi$ and $\varphi(x) = 1/2$ on the interval
$[-1, 1]$ and 0 elsewhere. This choice for $\varphi$, which is (for the moment) perfectly
reasonable, will be excluded when we introduce the concept of multiresolution
analysis.
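The second example can be checked numerically in dimension one. The sketch below truncates the infinite product (4.12) after a fixed number of factors and compares it with $\sin \xi / \xi$, the Fourier transform of the function equal to 1/2 on $[-1, 1]$. The function name, the truncation depth, and the frequency grid are illustrative assumptions.

```python
import numpy as np

def phi_hat(m0, xi, n_factors=40):
    """Truncated product m0(xi/2) * m0(xi/4) * ... * m0(xi / 2**n_factors)."""
    xi = np.asarray(xi, dtype=float)
    result = np.ones_like(xi, dtype=complex)
    for j in range(1, n_factors + 1):
        result *= m0(xi / 2.0**j)
    return result

# Second example: w(1) = w(-1) = 1/2, so m0(xi) = cos(xi).
xi = np.linspace(0.1, 20.0, 200)
approx = phi_hat(np.cos, xi).real

# Fourier transform of the function equal to 1/2 on [-1, 1] and 0 elsewhere.
exact = np.sin(xi) / xi
max_error = np.max(np.abs(approx - exact))
```

The same helper accepts any one-dimensional transfer function, so it can also be used to explore filters for which the product does not converge to a square-integrable function.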
Burt and Adelson proposed a very original function for w, and this will be
our third example. Take $w(0) = 0.6$, $w(\pm 1) = 0.25$, $w(\pm 2) = -0.05$, and
$w(k) = 0$ for $|k| \geq 3$. The corresponding function $\varphi(x)$ is continuous, its support
is $[-2, 2]$, and it resembles $C \exp(-c|x|)$, $C > 0$, $c > 0$, on this interval. This
is why the corresponding algorithm is called a Laplacian pyramid. We shall see
this example again when we introduce bi-orthogonal wavelets at the end of the
chapter.
The purpose of our last example is to show that the existence of the function
$\varphi$, defined by (4.12), is not a stable property, even in the simplest cases. In
fact, we limit our discussion to functions $w(k)$ that are zero except at $k = 0$ and
$k = -1$, and here $w(0) = p$, $w(-1) = q$, $0 < p < 1$, $0 < q < 1$, $p + q = 1$. Then
the choice $p = q = 1/2$ leads to a function $\varphi$ that is the characteristic function of
$[0, 1]$. All other choices imply that the mathematical object on the right side of
(4.12) is the Fourier transform of a probability measure $\mu$ that is singular with
respect to the Lebesgue measure. The support of this probability measure $\mu$ is
the interval $[0, 1]$. This measure is defined by the following property: If I is a
dyadic interval in $[0, 1]$ and if $I'$ is the left half of I and $I''$ is the right half of
I, then $\mu(I') = p\mu(I)$ and $\mu(I'') = q\mu(I)$.
We drop for the moment the problem of choosing an optimal filter $w(k)$,
$k \in \mathbb{Z}^2$. Indeed, such a choice must take into consideration the overall objective.
Burt and Adelson's objective was image compression. We are going to present
their compression algorithm in the next section. After that we will return to the
problem of choosing $w(k)$.

4.4. Pyramid algorithms and image compression.

Image compression is one of the uses of the pyramid algorithms. Burt and
Adelson's algorithm, which we describe in this section, will later be compared
with other algorithms (orthogonal pyramids and bi-orthogonal wavelets) that
perform better. All of the pyramid algorithms act on images that are already
sampled and never on the original physical image. In other words, the function
$\varphi$ we have tried to construct using the sequence $w(k)$ is never used. Then why
have we worked to learn its properties? The answer is given at the end of section 4.6,
but we can indicate the idea here. We want, on reaching the summit of the
pyramid, to have a moderate, softened image and not one that is chaotic and
noisy; this is precisely where the regularity of the function $\varphi$ comes into play.
The Burt and Adelson pyramid algorithms use only the transition operators
$T_j : l^2(\Gamma_j) \to l^2(\Gamma_{j-1})$. All of these operators are the same, except for a change
of scale; therefore, we are going to assume that $j = 0$.
The discussion of the algorithm begins with the definition of the trend and
the fluctuations around this trend for a sequence f belonging to $l^2(\Gamma_0)$. This
trend cannot be $T_0(f)$ because it "lives in a different universe" and cannot be
compared to f. To define the trend, it is necessary to leave the coarse grid $2\Gamma_0$,
where $T_0(f)$ is defined, and return to the fine grid $\Gamma_0$ where f is defined. This
is done by using the adjoint operator $T_0^* : l^2(2\Gamma_0) \to l^2(\Gamma_0)$, and the trend of f
is defined by $T_0^* T_0(f)$.
We clearly want the trend of a very regular function to coincide with that
function. This leads to the requirements that $T_0^* T_0(1) = 1$ and, more generally,
that $T_0^* T_0(P) = P$ for all polynomials P of degree less than or equal to N, for
some fixed integer N. This condition is equivalent to the following: The function
$m_0(\xi, \eta)$, defined by (4.11), must vanish, along with all of its derivatives of order
less than or equal to N, at the points $(\varepsilon_1 \pi, \varepsilon_2 \pi)$, $\varepsilon_1, \varepsilon_2 \in \{0, 1\}$, with the exception
of the origin. At the origin, one must have

$|m_0(\xi, \eta)|^2 = 1 + O\left((|\xi| + |\eta|)^{N+1}\right).$
The price to pay is naturally the length of the filter $w(k, l)$ that must be used
to satisfy these conditions. This length is proportional to N. Another interesting
observation is that the conditions we have just imposed on $m_0(\xi, \eta)$ imply, by
(4.12), that $\hat{\varphi}(2k\pi, 2l\pi) = 0$ if $(k, l) \in \mathbb{Z}^2$ and $(k, l) \neq (0, 0)$. But this last
condition is the same as (4.5), which, as we know, is necessary and sufficient to
have $P_h R_h(f) \to f$ in the norm of $L^2(\mathbb{R}^2)$ when h tends to zero. Since $T_0$ is the
discrete analogue of the restriction operator $R_h$, whereas $T_0^*$ corresponds to the
extension operator $P_h$, $T_0^* T_0$ is the "discrete approximation" operator analogous
to the continuous approximation operator $P_h R_h$.
The fluctuation around the trend is $f - T_0^* T_0(f)$ when f belongs to $l^2(\Gamma_0)$.
This fluctuation is zero whenever f is a polynomial of degree no greater than N,
and one can easily deduce from this that the fluctuation will be very weak in all
areas where the image is very regular. As we will see, this last property is the
key to the success of the Burt and Adelson algorithm.
The trend, and the fluctuation around the trend, of a sequence f belonging
to $l^2(\Gamma_j)$ are defined by a simple change of scale. The trend is $T_j^* T_j(f)$ and the
fluctuation is $f - T_j^* T_j(f)$.
If the sequence in question, f, is the restriction to the grid $\Gamma_j$ of a function
F that is very regular in some (open) region $\Omega$, then

(4.15)  $|f - T_j^* T_j(f)| \leq C\, 2^{-(N+1)j}$

at all the points of this region. Thus the Burt and Adelson algorithm becomes
more effective as N increases.
To define the coding and compression algorithm of Burt and Adelson, we
begin with the "fine grid" $\Gamma_m = 2^{-m}\mathbb{Z}^2$ and a numerical image $f_m$ sampled on
this fine grid. This numerical image is, in fact, the restriction to $\Gamma_m = 2^{-m}\mathbb{Z}^2$ of
a physical image $f \in L^2(\mathbb{R}^2)$. This means that $f_m$ is the restriction in the usual
sense of the convolution product $f * g_m$ where $g_m(x, y) = 4^m g(2^m x, 2^m y)$. The
properties of the function g were indicated in section 4.2; in addition, we shall assume
that the integral of g is equal to 1. However, it is not necessary to "return to the
physical image" f to use the algorithm.
Burt and Adelson replace $f_m$ by the couple (trend, fluctuation). But the
trend, which is given by $T_m^* T_m(f_m)$, is completely defined by $T_m(f_m)$. This
amounts to saying that the trend is sufficiently regular that it can be coded by retaining one pixel in four. This coding is given by $T_m(f_m)$. In summary, Burt and
Adelson code $f_m$ with the couple $[T_m(f_m),\, f_m - T_m^* T_m(f_m)]$. Then the fluctuation, denoted by $r_m$, is not processed further. They write $f_{m-1} = T_m(f_m)$ and iterate the procedure. $f_{m-1}$ is coded by $(f_{m-2}, r_{m-1})$, where $f_{m-2} = T_{m-1}(f_{m-1})$
and $r_{m-1} = f_{m-1} - T_{m-1}^* T_{m-1}(f_{m-1})$.

These substitutions are interesting for two reasons:

(4.16)  Going from $f_j$ to $f_{j-1}$ reduces the data one must deal with by a factor of four. Indeed, $f_j$ is defined on $\Gamma_j$ and $f_{j-1}$ on $\Gamma_{j-1}$, which has one-fourth as many points.

(4.17)  In many cases, the fluctuations are small enough that they can be replaced by zero.

Condition (4.17) is satisfied in all the regions where the image exhibits a certain
regularity.
If we suppose that the starting image $f_m$ is bounded on a square of side 1,
then the algorithm is stopped on reaching the summit of the pyramid, which is
the grid $\Gamma_0$.
The image $f_m$ is coded by the sequence $(f_0, r_1, \ldots, r_m)$, where $f_0$, defined on
$\Gamma_0$, is a scalar and where the $r_j = (1 - T_j^* T_j) f_j$, $1 \leq j \leq m$, are the different
fluctuations.
The diagram below gives a schematic description of the algorithm. The
columns correspond to the different grids $\Gamma_j$.

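[Diagram: the trends $f_m, f_{m-1}, \ldots, f_0$ on the grids $\Gamma_m, \Gamma_{m-1}, \ldots, \Gamma_0$ are produced successively by the operators $T_m, T_{m-1}, \ldots, T_1$, and at each step the fluctuation $r_m, r_{m-1}, \ldots, r_1$ is set aside.]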

Running the algorithm the other way, which is reconstructing $f_m$ from the
code, is very easy. Begin with $f_0$ and $r_1$, and compute $f_1 = T_1^* f_0 + r_1$. In the
same way, reconstruct $f_2$ by $f_2 = T_2^* f_1 + r_2$, and continue until $f_m$ is recovered.
As we will see a little later, this algorithm is advantageous only if most
of the fluctuations are zero. Otherwise it is clearly disadvantageous because
information is wasted.
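The coding and reconstruction steps can be sketched in dimension one with the filter $w(0) = 0.6$, $w(\pm 1) = 0.25$, $w(\pm 2) = -0.05$ from section 4.3. This is only an illustration under stated assumptions: the signal is treated as periodic, the adjoint is scaled by 2 (the one-dimensional counterpart of a normalization in which constants are preserved), and the names `reduce`, `expand`, `encode`, and `decode` are hypothetical.

```python
import numpy as np

# Filter from the third example of section 4.3 (one-dimensional).
W = {-2: -0.05, -1: 0.25, 0: 0.6, 1: 0.25, 2: -0.05}

def reduce(f):
    """Transition operator: c[k] = sum_l w(2k - l) f[l], periodic boundary (assumption)."""
    n = len(f)
    c = np.zeros(n // 2)
    for k in range(n // 2):
        for offset, weight in W.items():
            # offset = 2k - l, hence l = 2k - offset
            c[k] += weight * f[(2 * k - offset) % n]
    return c

def expand(c):
    """Adjoint of reduce, scaled by 2 (assumption) so that constants are preserved."""
    n = 2 * len(c)
    f = np.zeros(n)
    for k in range(len(c)):
        for offset, weight in W.items():
            f[(2 * k - offset) % n] += 2.0 * weight * c[k]
    return f

def encode(f, levels):
    """Return the coarsest trend and the fluctuations, finest level first."""
    fluctuations = []
    for _ in range(levels):
        coarse = reduce(f)
        fluctuations.append(f - expand(coarse))  # fluctuation around the trend
        f = coarse
    return f, fluctuations

def decode(coarse, fluctuations):
    """Invert encode: expand the trend and add the fluctuation, coarse to fine."""
    f = coarse
    for r in reversed(fluctuations):
        f = expand(f) + r
    return f

rng = np.random.default_rng(0)
signal = rng.standard_normal(64)
trend, flucts = encode(signal, levels=4)
rebuilt = decode(trend, flucts)
stored = len(trend) + sum(len(r) for r in flucts)
```

Reconstruction is exact whatever the filter, because each fluctuation is defined as the difference between a level and its expanded trend. The cost is visible in `stored`, which exceeds the length of the original signal.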

4.5. Pyramid algorithms and multiresolution analysis.

Before leaving the first version of Burt and Adelson's algorithms, we are going to
describe a continuous version of it. The interplay between the discrete algorithms
and their continuous versions, which is implicit in the work of Burt and Adelson,
was made explicit by Mallat and the author.
We consider the general case because dimension two plays no particular role
in the following definition.
A multiresolution analysis of $L^2(\mathbb{R}^n)$ is an increasing sequence of closed subspaces $(V_j)_{j \in \mathbb{Z}}$ of $L^2(\mathbb{R}^n)$ having the following three properties:

(4.18)  $\bigcap_{-\infty}^{\infty} V_j = \{0\}, \quad \bigcup_{-\infty}^{\infty} V_j$ is dense in $L^2(\mathbb{R}^n)$,

(4.19)  for all functions $f \in L^2(\mathbb{R}^n)$ and all integers j, $f(x) \in V_0$ is equivalent to $f(2^j x) \in V_j$,

(4.20)  there exists a function $\varphi(x)$, belonging to $V_0$, such that the sequence $\varphi(x - k)$, $k \in \mathbb{Z}^n$, is a Riesz basis for $V_0$.

Recall that if H is a Hilbert space, a Riesz basis $(e_j)_{j \in J}$ of H is, by definition,
an isomorphic image $T : H \to H$ of a Hilbert basis $(f_j)_{j \in J}$ of H. (Note that T
is not necessarily an isometry.) Then each vector $x \in H$ is decomposed uniquely
in a series

(4.21)  $x = \sum_{j \in J} \alpha_j e_j$, where $\sum |\alpha_j|^2 < \infty$.

Furthermore, $\alpha_j = \langle x, e_j^* \rangle$, where $e_j^* = (T^*)^{-1}(f_j)$ is the dual basis of $e_j$; this
dual basis is itself a Riesz basis.
We are going to measure the regularity of a multiresolution analysis; this
will, in fact, be the regularity of the functions belonging to $V_0$. To measure this
regularity, we introduce an integer r, which can take the values 0, 1, 2, ..., and
even $+\infty$.
We say that the multiresolution analysis $(V_j)_{j \in \mathbb{Z}}$ is r-regular if it is possible
to choose the function $\varphi(x)$ in (4.20) so that, for all integers $m \geq 0$ and all
$x \in \mathbb{R}^n$,

(4.22)  $|\partial^\alpha \varphi(x)| \leq C_m (1 + |x|)^{-m},$

where $\alpha = (\alpha_1, \ldots, \alpha_n)$ is a multi-index satisfying $\alpha_1 + \cdots + \alpha_n \leq r$ and where
$\partial^\alpha = (\partial/\partial x_1)^{\alpha_1} \cdots (\partial/\partial x_n)^{\alpha_n}$.
We return to the two-dimensional case. Here, a multiresolution analysis is, in
a certain sense, a particular case of the pyramid algorithm. To see this, suppose
that the function $\varphi$, which is defined by (4.12), has the following additional property: There exist two constants $C_2 \geq C_1 > 0$ such that for all scalar sequences
$(a_k)_{k \in \mathbb{Z}^2}$

(4.23)  $C_1 \left(\sum_k |a_k|^2\right)^{1/2} \leq \left\| \sum_k a_k \varphi(x - k) \right\|_2 \leq C_2 \left(\sum_k |a_k|^2\right)^{1/2}.$

To simplify the notation, we have denoted the vector $(k_1, k_2) \in \mathbb{Z}^2$ by k and
similarly $x = (x_1, x_2) \in \mathbb{R}^2$.
If this is the case, let $V_0$ denote the closed linear subspace of $L^2(\mathbb{R}^2)$ generated
by the functions $\varphi(x - k)$, $k \in \mathbb{Z}^2$. Then it is easy to verify that the conditions
in (4.18) hold and that $V_j \subset V_{j+1}$ when the $V_j$ are defined by (4.19).

The pyramid algorithms associated with multiresolution analyses are the only
ones that we will study in the following sections. They have some quite remarkable properties. For example, the restriction operator $R_j$ is then an isomorphism
between $V_j$ and $l^2(\Gamma_j)$. In this case, the equation $R_{j-1} = T_j R_j$ can be solved
directly; it is sufficient, in fact, to restrict the two sides to the space $V_j$ in
order to invert $R_j$.
Not all pyramid algorithms are related to a multiresolution analysis. A counterexample is given by one of the pyramid algorithms that we presented in section 4.3.
In this example the function $\varphi(x)$ is 1/2 on $[-1, 1]$ and 0 elsewhere. Thus

(4.24)  $\|\varphi(x) - \varphi(x - 1) + \varphi(x - 2) - \cdots + (-1)^N \varphi(x - N)\|_2 = \frac{\sqrt{2}}{2},$

whereas, according to (4.23), it should be of the order of $\sqrt{N}$.

4.6. The orthogonal pyramids and wavelets.

Shortly after the discovery of quadrature mirror filters by Esteban and Galand,
Woods and O'Neil [13] had the idea to apply this technique to image processing.
They thus obtained the first example of an orthogonal pyramid. We are going
to set aside, for the moment, the specific construction carried out by Woods and
O'Neil using separable filters. Instead we will present the notion of an orthogonal
pyramid in complete generality. Then we will return to the particular case
where the quadrature mirror filters appear in the construction of an orthogonal
pyramid.
The Burt and Adelson algorithm is flawed because it replaces information
coded on $N^2$ pixels by new information whose description requires $\frac{4}{3} N^2$ pixels.
This criticism, which we will analyze in a moment, is not justified because in
many examples of real images, most of the values of the gray levels on the $\frac{4}{3} N^2$
pixels are in fact zero, and the unfavorable pixel count, where $N^2$ becomes
$\frac{4}{3} N^2$, occurs very rarely.

Let us examine, however, why the information has been wasted or, more
precisely, where the inefficient coding occurs. At the start, the image f has been
coded on $N^2$ pixels. Next, we replace this by the couple $[T_0(f),\, (1 - T_0^* T_0)(f)]$,
which is composed of the coding for the trend and the complete description of
the fluctuations around the trend. The description of $T_0(f)$ requires $\frac{1}{4} N^2$ pixels,
whereas the description of $f - T_0^* T_0(f)$ continues to require $N^2$ pixels. In all, we
use $N^2 + \frac{1}{4} N^2$ pixels. At the next step, the pixel count becomes $N^2 + \frac{1}{4} N^2 + \frac{1}{16} N^2$,
and so on. At the end, we will have used $N^2 + \frac{1}{4} N^2 + \frac{1}{16} N^2 + \cdots + 1$, or
approximately $\frac{4}{3} N^2$ pixels. The "wasted" pixels appear because the fluctuations
$f - T_j^* T_j(f)$ have not been coded efficiently.
The orthogonal pyramids are a particular class of pyramid algorithms that
code the fluctuations with $\frac{3}{4} N^2$ pixels. With this scheme there is no waste.
When the original image f is replaced by coding the trend and the fluctuation,
the required pixels are $\frac{1}{4} N^2$ and $\frac{3}{4} N^2$, respectively, and the volume of data
remains constant.
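The two pixel counts can be compared with a short calculation. The sketch below assumes a square image whose side is a power of two and a pyramid carried down to a single pixel; the function names and the image size are illustrative.

```python
def burt_adelson_pixels(n, levels):
    """Pixels stored by the Burt and Adelson scheme for an n-by-n image.

    Every level keeps a full-size fluctuation; only the last trend is kept.
    """
    fluctuations = sum((n // 2**j) ** 2 for j in range(levels))
    coarsest_trend = (n // 2**levels) ** 2
    return fluctuations + coarsest_trend

def orthogonal_pixels(n, levels):
    """Pixels stored by an orthogonal pyramid: three quarters of each level."""
    fluctuations = sum(3 * (n // 2 ** (j + 1)) ** 2 for j in range(levels))
    coarsest_trend = (n // 2**levels) ** 2
    return fluctuations + coarsest_trend

n, levels = 512, 9  # illustrative: 512 = 2**9, so the summit is one pixel
laplacian_count = burt_adelson_pixels(n, levels)
orthogonal_count = orthogonal_pixels(n, levels)
ratio = laplacian_count / n**2  # close to 4/3
```

For this size the first count is $(4^{10} - 1)/3$, while the orthogonal count equals the original $512^2$ pixels exactly.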


We say that a pyramid algorithm is orthogonal if, in the sense of the usual
scalar product of $l^2(\Gamma_0)$, the trend $T_0^* T_0(f)$ and the fluctuations $f - T_0^* T_0(f)$
around this trend are orthogonal for each image f defined on the grid $\Gamma_0$.
Write $H = l^2(\Gamma_0)$, $H_0 = T_0^* T_0(H)$, and $H_1 = (1 - T_0^* T_0)H$. If the pyramid
algorithm is orthogonal, then $H = H_0 \oplus H_1$. Since the dimension of $H_0$ is a
quarter that of H, the dimension of $H_1$ is $\frac{3}{4} \dim H$, as announced.
An equivalent definition of "orthogonal pyramids" requires the adjoint $T_0^*$ of
the operator $T_0$ to be a partial isometry. (Recall that $T_0^*$ is defined on $l^2(\Gamma_{-1})$
with values in $l^2(\Gamma_0)$.) This takes us back to one of the characteristic properties,
in dimension one, of the "low-pass filter" $T_0$ in a pair of quadrature mirror
filters, $(T_0, T_1)$. And this observation prompts us, in dimension two, to look for
the corresponding second filter, $T_1$. We will see in a moment that three filters
are necessary in this case.
But before this, we show how to construct some pyramid algorithms. We
return to the "transfer function" $m_0(\xi, \eta)$ defined by (4.11). The pyramid algorithm is orthogonal if and only if

(4.25)  $|m_0(\xi, \eta)|^2 + |m_0(\xi + \pi, \eta)|^2 + |m_0(\xi, \eta + \pi)|^2 + |m_0(\xi + \pi, \eta + \pi)|^2 = 1.$
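Condition (4.25) is easy to test numerically for separable filters. The sketch below evaluates its left-hand side on a grid of frequencies for two filters taken from section 4.3: the two-tap filter with $p = q = 1/2$, used on each axis, and the Burt and Adelson filter. The function names and the frequency grid are illustrative assumptions.

```python
import numpy as np

def m0(weights, xi, eta):
    """Transfer function (4.11) for a separable filter w(k, l) = w(k) w(l)."""
    def one_d(t):
        return sum(w * np.exp(1j * k * t) for k, w in weights.items())
    return one_d(xi) * one_d(eta)

def orthogonality_sum(weights, xi, eta):
    """Left-hand side of (4.25)."""
    shifts = [(0.0, 0.0), (np.pi, 0.0), (0.0, np.pi), (np.pi, np.pi)]
    return sum(np.abs(m0(weights, xi + a, eta + b)) ** 2 for a, b in shifts)

grid = np.linspace(-np.pi, np.pi, 33)
xi, eta = np.meshgrid(grid, grid)

two_tap = {0: 0.5, -1: 0.5}  # p = q = 1/2
burt_adelson = {-2: -0.05, -1: 0.25, 0: 0.6, 1: 0.25, 2: -0.05}

two_tap_sum = orthogonality_sum(two_tap, xi, eta)
burt_adelson_sum = orthogonality_sum(burt_adelson, xi, eta)
```

For the two-tap filter the sum is identically 1, since $|m_0(\xi)|^2 = \cos^2(\xi/2)$ on each axis. For the Burt and Adelson filter the sum equals 1 at the origin but not everywhere, so that pyramid is not orthogonal.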

This condition is completely analogous to the one on the transfer function $m_0(\xi)$
in the case of two quadrature mirror filters (Chapter 3, section 3.3).
Continuing this comparison, we consider the function $\varphi(x, y)$ in $L^2(\mathbb{R}^2) \cap L^1(\mathbb{R}^2)$
defined by (4.12) and normalized by $\iint \varphi(x, y)\, dx\, dy = 1$. We might
expect that the sequence $\varphi(x - k, y - l)$, $k, l \in \mathbb{Z}$, is orthonormal, and this is true
in most cases. However, the proof involves a delicate limit process, passing from
the discrete to the continuous, and certain orthogonal pyramids do not lead to
orthogonal sequences.
This difficulty already appeared in dimension one for the quadrature mirror
filters. The condition we assume here on $m_0(\xi, \eta)$, which is sufficient to allow
passage from the "discrete to the continuous," is completely analogous to the
condition we used in dimension one.
It is sufficient to assume that $m_0(\xi, \eta) \neq 0$ if $-\frac{\pi}{2} \leq \xi \leq \frac{\pi}{2}$ and $-\frac{\pi}{2} \leq \eta \leq \frac{\pi}{2}$.
Then $\varphi(x - k, y - l)$, $k, l \in \mathbb{Z}$, is an orthonormal basis of a closed subspace $V_0$
of $L^2(\mathbb{R}^2)$. At the same time, $2^j \varphi(2^j x - k, 2^j y - l)$, $k, l \in \mathbb{Z}$, is an orthonormal
basis for the subspace $V_j$; the extension operator $P_j : l^2(\Gamma_j) \to V_j$ is an isometric
isomorphism; and the restriction operator $R_j : L^2(\mathbb{R}^2) \to l^2(\Gamma_j)$ is decomposed
into the orthogonal projection operator from $L^2(\mathbb{R}^2)$ onto $V_j$ followed by the
inverse isomorphism $P_j^{-1} : V_j \to l^2(\Gamma_j)$.
This allows us to define explicitly the transition operators $T_j : l^2(\Gamma_j) \to l^2(\Gamma_{j-1})$,
which, we recall, were defined implicitly by $T_j R_j = R_{j-1}$. Use
the operator $P_j$ to identify $l^2(\Gamma_j)$ with $V_j$, and similarly use $P_{j-1}$ to identify
$l^2(\Gamma_{j-1})$ with $V_{j-1}$. Having made these identifications, the transition operator
$T_j : l^2(\Gamma_j) \to l^2(\Gamma_{j-1})$ corresponds quite simply to the orthogonal projection of
$V_j$ on $V_{j-1}$, which is $P_{j-1} T_j P_j^{-1}$ in our notation.

We next define $W_j$ to be the orthogonal complement of $V_j$ in $V_{j+1}$. Thus
we have $V_{j+1} = V_j \oplus W_j$. It is easy to verify, using once again the "isometric
interpretations" given by $P_j : l^2(\Gamma_j) \to V_j$ and $P_{j+1} : l^2(\Gamma_{j+1}) \to V_{j+1}$, that this
orthogonal decomposition corresponds precisely to the orthogonal decomposition
of a function into its trend and fluctuation, and this latter decomposition is the
definition of orthogonal pyramids.
We come now to the two-dimensional generalization of the quadrature mirror
filters. In dimension two, we consider four operators $T_0$, $R_1$, $R_2$, and $R_3$. All four
are defined on $l^2(\mathbb{Z}^2)$ with values in $l^2(2\mathbb{Z}^2)$. We ask that these four operators
commute with the even translations $\tau \in 2\mathbb{Z}^2$ and that

(4.26)  $\|f\|^2 = \|T_0(f)\|^2 + \|R_1(f)\|^2 + \|R_2(f)\|^2 + \|R_3(f)\|^2,$

for all f belonging to $l^2(\mathbb{Z}^2)$. The left term is of course computed in $l^2(\mathbb{Z}^2)$,
whereas each term on the right is computed in $l^2(2\mathbb{Z}^2)$.
One of the important results in the theory of orthogonal pyramids is the
existence and constructability of these operators $R_1$, $R_2$, and $R_3$. Furthermore,
if the impulse response $w(k, l)$ of $T_0$ decreases rapidly at infinity, the operators
$R_1$, $R_2$, and $R_3$ can be constructed to have this same property.
Once $R_1$, $R_2$, and $R_3$ are constructed, we can construct the corresponding
wavelets $\psi_1$, $\psi_2$, and $\psi_3$. Assuming that $m_0(\xi, \eta) \neq 0$ if $|\xi| \leq \frac{\pi}{2}$ and $|\eta| \leq \frac{\pi}{2}$,
these three wavelets are defined by

(4.27)  $\psi_j(x, y) = 4 \sum_k \sum_l w_j(k, l)\, \varphi(2x + k, 2y + l), \quad j = 1, 2,$ or 3,

where $w_j(k, l)$ denotes the impulse response of $R_j$.
Thus, under quite general conditions, the orthogonal pyramids lead to orthonormal wavelet bases, and this development proceeds by way of the two-dimensional generalization of quadrature mirror filters.
We move on to the two-dimensional generalization of Mallat's algorithm. The
"exact reconstruction" identity,

(4.28)  $I = T_0^* T_0 + R_1^* R_1 + R_2^* R_2 + R_3^* R_3,$

is deduced immediately from (4.26). Identity (4.28) provides a particularly elegant solution to the problem of coding the fluctuation $f - T_0^* T_0(f)$. This fluctuation is exactly

$R_1^* R_1(f) + R_2^* R_2(f) + R_3^* R_3(f).$

The three operators $R_1^*$, $R_2^*$, and $R_3^*$ are partial isometries, and this allows us to
code the fluctuation $f - T_0^* T_0(f)$ with the three sequences $R_1(f)$, $R_2(f)$, $R_3(f)$.
These three sequences belong to $l^2(2\Gamma_0)$ when $f \in l^2(\Gamma_0)$, and thus the coding
of each of them uses only one pixel in four. Hence, three-fourths of the pixels
are used to code the fluctuation, whereas one-fourth is used to code the trend.
Consequently there are no longer any wasted pixels.
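The simplest instance of four such operators is the two-dimensional Haar case, where each operator acts on disjoint 2-by-2 blocks. The sketch below is an illustration only: the factor 1/2 is a normalization chosen here so that the energy identity holds with unweighted sums, and the names `analyze` and `synthesize` are hypothetical.

```python
import numpy as np

def analyze(f):
    """Split an image into a trend and three fluctuation channels (2-D Haar)."""
    a, b = f[0::2, 0::2], f[0::2, 1::2]
    c, d = f[1::2, 0::2], f[1::2, 1::2]
    trend = (a + b + c + d) / 2.0
    r1 = (a - b + c - d) / 2.0
    r2 = (a + b - c - d) / 2.0
    r3 = (a - b - c + d) / 2.0
    return trend, r1, r2, r3

def synthesize(trend, r1, r2, r3):
    """Apply the four adjoint operators and add the results."""
    f = np.empty((2 * trend.shape[0], 2 * trend.shape[1]))
    f[0::2, 0::2] = (trend + r1 + r2 + r3) / 2.0
    f[0::2, 1::2] = (trend - r1 + r2 - r3) / 2.0
    f[1::2, 0::2] = (trend + r1 - r2 - r3) / 2.0
    f[1::2, 1::2] = (trend - r1 - r2 + r3) / 2.0
    return f

rng = np.random.default_rng(1)
image = rng.standard_normal((8, 8))
channels = analyze(image)
rebuilt = synthesize(*channels)

energy_in = np.sum(image**2)
energy_out = sum(np.sum(ch**2) for ch in channels)
```

Each of the four channels holds one value per 2-by-2 block, so the four together use exactly as many numbers as the original image.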
We can now return to the algorithm and give it a much more precise formulation. This is illustrated by the following scheme. The horizontal arrows
correspond to coding the trends, whereas the oblique arrows represent coding
the three fluctuations.

[Diagram: horizontal arrows lead from $f_m$ through the successive trends down to $f_0$; at each step, oblique arrows lead to the three fluctuation sequences.]
The wavelets appear in the asymptotic limit of this scheme. This limit is
taken on the number of steps m, which must tend to infinity. We start with a
function $f(x, y)$ belonging to $L^2(\mathbb{R}^2)$. We "restrict" f to the fine grid $\Gamma_m$ using
the classic scheme. This means we have a fixed regular function g, which decreases
rapidly at infinity and whose Fourier transform $\hat{g}(\xi, \eta)$ satisfies $\hat{g}(0, 0) = 1$
and $\hat{g}(2k\pi, 2l\pi) = 0$ if $(k, l) \neq (0, 0)$. We write $g_m(x, y) = 4^m g(2^m x, 2^m y)$. Finally,
$f_m$ is the restriction to the (fine) grid $\Gamma_m$ of the (filtered) image $f * g_m$.
If we still assume that $m_0(\xi, \eta)$ does not vanish on $[-\frac{\pi}{2}, \frac{\pi}{2}] \times [-\frac{\pi}{2}, \frac{\pi}{2}]$ and
that the pyramid is orthogonal as defined by (4.25), then Mallat's algorithm
converges as the number of steps tends to infinity. The limit of this process is
an algorithm, namely, the decomposition of the original image $f(x, y)$ in
the orthonormal basis composed of the following four sequences: $\varphi(x - k, y - l)$,
$2^j \psi_1(2^j x - k, 2^j y - l)$, $2^j \psi_2(2^j x - k, 2^j y - l)$, and $2^j \psi_3(2^j x - k, 2^j y - l)$, where
$k, l \in \mathbb{Z}$ and $j \in \mathbb{N}$.
This means that if we fix the index j of the grid $\Gamma_j$, and if we examine the
"outputs" of Mallat's algorithm that are defined on this grid, then the limits of
these outputs are, respectively,

$2^j \iint f(x, y)\, \varphi(2^j x - k, 2^j y - l)\, dx\, dy,$

$2^j \iint f(x, y)\, \psi_1(2^j x - k, 2^j y - l)\, dx\, dy,$

$2^j \iint f(x, y)\, \psi_2(2^j x - k, 2^j y - l)\, dx\, dy,$

$2^j \iint f(x, y)\, \psi_3(2^j x - k, 2^j y - l)\, dx\, dy.$
Albert Cohen established this result under very general hypotheses [4]; one of
the most convenient is that $m_0(\xi, \eta) \neq 0$ if $|\xi| \leq \frac{\pi}{2}$ and $|\eta| \leq \frac{\pi}{2}$.
The beauty of this theory leads one to think that it provides the correct
response to the image-processing problem. Indeed, the image is decomposed by
this analysis into information that is independent from one scale to another.
This agrees with the general philosophy expressed in the introduction. These
independent packets of information are represented by the trend in $V_0$ and the
fluctuations $f_j \in W_j$, whose orthogonal sum is equal to f. The characteristic scale
of $W_j$ is $2^{-j}$. Furthermore, each $f_j \in W_j$ is itself decomposed into orthogonal
components according to the basis $2^j \psi(2^j x - k, 2^j y - l)$, $(k, l) \in \mathbb{Z}^2$, $\psi = \psi_1$, $\psi_2$,
or $\psi_3$.
The Haar system is the simplest example of orthogonal wavelets, and it has
been used for image processing for a long time. However, it has the disadvantage that, following quantization, it introduces rather harsh "edge effects," thus
producing unpleasant images.
This prompts us to say a few words about the quantization problem. From
the point of view of the mathematician, all orthonormal bases allow the signal to
be reconstructed exactly. This is not the point of view of the numerical analyst
or image specialist. In practice, the coefficients from the decomposition must
be quantized, whether we like it or not. These approximations arise from the
machine accuracy or are imposed by a desire to compress the data. If it is true
that

$f(x, y) = \sum_{j \in J} \alpha_j e_j(x, y),$

what happens to the left-hand side when the $\alpha_j$ are replaced by coefficients $\tilde{\alpha}_j$ satisfying $|\tilde{\alpha}_j - \alpha_j| \leq \varepsilon$, where $\varepsilon > 0$ is related to the machine precision? Anything can
happen if we are not using a robust orthonormal basis, and we know today that
many orthogonal wavelet bases are infinitely more robust than the trigonometric
basis.
In spite of this, orthogonal wavelets (and the corresponding pyramid algorithms) have not completely satisfied the experts in image processing. The criticized defect is the lack of symmetry. The function $\varphi(x)$ ought to be even, while
the function $\psi(x)$ ought to be symmetric in the sense that $\psi(1 - x) = \psi(x)$.
These properties are satisfied by certain orthogonal wavelets, but they do not
hold for wavelets with compact support. The Haar system is the only exception.
This lack of symmetry is reflected in visible defects, again following quantization. These visible defects do not appear when one uses symmetric bi-orthogonal
wavelets having compact support. We introduce these wavelets in the next
section.


4.7. Bi-orthogonal wavelets.

Following the pioneering work of Philippe Tchamitchian [11], Albert Cohen,
Ingrid Daubechies, and Jean-Christophe Feauveau studied a remarkable generalization of the notion of orthonormal wavelet bases, namely, bi-orthogonal
systems of wavelets. We begin with the one-dimensional case.
In place of an orthonormal basis of the form $2^{j/2} \psi(2^j x - k)$, $j, k \in \mathbb{Z}$, we use
two Riesz bases, each the dual of the other, denoted by $\psi_{j,k}$ and $\tilde{\psi}_{j,k}$. The first
is used for synthesis, and the second is used for analysis. This means that for all
$f(x)$ belonging to $L^2(\mathbb{R})$,

(4.29)  $f(x) = \sum_{j=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} \alpha_{j,k}\, \psi_{j,k}(x),$

where the coefficients are defined by

(4.30)  $\alpha_{j,k} = \int_{-\infty}^{\infty} f(x)\, \tilde{\psi}_{j,k}(x)\, dx.$

Here $\psi_{j,k}(x) = 2^{j/2} \psi(2^j x - k)$ and $\tilde{\psi}_{j,k}(x) = 2^{j/2} \tilde{\psi}(2^j x - k)$.

Up to this point we have only weakened the definition of the orthonormal
wavelet bases. We have gained nothing. But now we are going to make considerably stronger demands by requiring $\psi(x)$ to be an essentially explicit function.
For example, we can require that $\psi(x)$ be the following function:

Then the general theory of Cohen, Daubechies, and Feauveau tells us that, for this choice of $\psi(x)$, $2^{j/2}\psi(2^j x - k)$, $j, k \in \mathbb{Z}$, is a Riesz basis for $L^2(\mathbb{R})$ and the dual basis has the same structure, which is given by $2^{j/2}\tilde\psi(2^j x - k)$, $j, k \in \mathbb{Z}$. However, in this particular case, the dual wavelet is not a continuous function. To repair this we need to take a more general approach.
We assume that $\psi$ belongs to a set of functions that are continuous, have compact support, are linear on each interval $[k/2, (k + 1)/2]$, $k \in \mathbb{Z}$, and are symmetric with respect to $1/2$ (in the sense that $\psi(1 - x) = \psi(x)$). Then the Cohen-Daubechies-Feauveau theory states that we can choose a $\psi$ from this set so that the dual wavelet $\tilde\psi$ is a function in the class $C^r$ and has compact support.
Here is how $\psi$ and $\tilde\psi$ are constructed. Start with the triangle function $\varphi(x) = \sup(1 - |x|, 0)$, which was mentioned in section 4.2. Write $m_0(\xi) = (\cos \xi/2)^2$; then, by construction, $\hat\varphi(\xi) = m_0(\xi/2)\hat\varphi(\xi/2)$. Next consider $g_N(\xi) = c_N \int_\xi^\pi (\sin t)^{2N+1}\,dt$, where $c_N > 0$ is adjusted so that $g_N(0) = 1$.
If $\tilde m_0(\xi)$ is defined by $m_0(\xi)\tilde m_0(\xi) = g_N(\xi)$, then

(4.31)    $m_0(\xi)\tilde m_0(\xi) + m_0(\xi + \pi)\tilde m_0(\xi + \pi) = 1.$

(In the construction of the Daubechies wavelets, one imposed the condition $|m_0(\xi)|^2 = g_N(\xi)$.)
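
Since $m_0\tilde m_0 = g_N$, identity (4.31) amounts to $g_N(\xi) + g_N(\xi + \pi) = 1$, and this can be checked numerically. The sketch below assumes the reading $g_N(\xi) = c_N\int_\xi^\pi(\sin t)^{2N+1}dt$ with $g_N(0) = 1$; the helper names `trapezoid` and `g_N`, the choice $N = 2$, and the sample points are illustrative choices, not taken from the text.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoid rule, written out to avoid NumPy version differences."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def g_N(xi, N, samples=20001):
    """g_N(xi) = c_N * integral from xi to pi of (sin t)^(2N+1) dt, with g_N(0) = 1."""
    t_full = np.linspace(0.0, np.pi, samples)
    c_N = 1.0 / trapezoid(np.sin(t_full) ** (2 * N + 1), t_full)
    # For xi > pi the grid runs backwards, which gives the integral its correct sign.
    t = np.linspace(xi, np.pi, samples)
    return c_N * trapezoid(np.sin(t) ** (2 * N + 1), t)

N = 2
xis = np.linspace(-3.0, 3.0, 13)

# With m0 * m0_tilde = g_N, identity (4.31) reads g_N(xi) + g_N(xi + pi) = 1.
residuals = [g_N(x, N) + g_N(x + np.pi, N) - 1.0 for x in xis]
max_residual = max(abs(r) for r in residuals)
```

The residual stays at the level of the quadrature error, which reflects the fact that the odd power $2N+1$ makes the two integrals complementary.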
Define $\tilde\varphi \in L^2(\mathbb{R})$ by its Fourier transform

(4.32)    $\hat{\tilde\varphi}(\xi) = \prod_{j=1}^{\infty} \tilde m_0(2^{-j}\xi).$

Identity (4.31) is equivalent to

(4.33)    $\int \tilde\varphi(x)\varphi(x - k)\,dx = 0$ if $k \neq 0$, and $= 1$ if $k = 0$.

The function $\tilde\varphi(x)$ is even, its support is the interval $[-2N, 2N]$, and $\tilde\varphi(x)$ is in the Hölder space $C^r$ for all sufficiently large $N$.
It is clear that

but (4.33) implies that the inverse inequality also holds.


Thus one can consider the closed subspace $\tilde V_0 \subset L^2(\mathbb{R})$ for which $\tilde\varphi(x - k)$, $k \in \mathbb{Z}$, is a Riesz basis. If the subspaces $\tilde V_j$ are defined by (4.19), this sequence forms a multiresolution analysis of $L^2(\mathbb{R})$.
In the same way, let $V_0$ be the closed subspace of $L^2(\mathbb{R})$ for which $\varphi(x - k)$, $k \in \mathbb{Z}$, is a Riesz basis and construct the $V_j$ again by (4.19).
The two multiresolutions $(V_j)$ and $(\tilde V_j)$ are the duals of each other. This duality is used to define the subspaces $W_j$ and $\tilde W_j$: $f$ belongs to $W_j$ if $f$ belongs to $V_{j+1}$ and if $\int_{-\infty}^{\infty} f(x)u(x)\,dx = 0$ for all $u \in \tilde V_j$.
The wavelets $\psi$ and $\tilde\psi$ will be constructed so that $\psi(x - k)$, $k \in \mathbb{Z}$, is a Riesz basis for $W_0$ and, similarly, $\tilde\psi(x - k)$, $k \in \mathbb{Z}$, is a Riesz basis for $\tilde W_0$. For this, we define

and define the Fourier transforms $\hat\psi$ and $\hat{\tilde\psi}$ of $\psi$ and $\tilde\psi$ by

$\hat\psi(\xi) = m_1(\xi/2)\hat\varphi(\xi/2).$

We write $\psi_{j,k}(x) = 2^{j/2}\psi(2^j x - k)$ and define $\tilde\psi_{j,k}(x)$ similarly.


The only properties that are not clear are that the family $\psi_{j,k}$, $j, k \in \mathbb{Z}$, is a Riesz basis for $L^2(\mathbb{R})$ and that the same is true for the $\tilde\psi_{j,k}$. These Riesz bases are the duals of each other. Furthermore, the function $\psi(x)$ is as simple as possible: it is explicit; it is continuous; it has compact support; it is linear on each interval $[k/2, (k + 1)/2]$; and the values $\psi(k/2)$ are explicit rational numbers. Finally, we have $\psi(1 - x) = \psi(x)$, and the symmetry, which Daubechies's wavelets lack, is reestablished.
In dimension two, we use the wavelets $\varphi(x)\psi(y)$, $\varphi(y)\psi(x)$, and $\psi(x)\psi(y)$, as in the orthogonal case. Then the dual wavelets are $\tilde\varphi(x)\tilde\psi(y)$, $\tilde\varphi(y)\tilde\psi(x)$, and $\tilde\psi(x)\tilde\psi(y)$.
By using these bi-orthogonal wavelets and an efficient method of vector quantization for coding the wavelet coefficients, Barlaud [1] has achieved a compression ratio of order 100.


Bibliography
[1] M. ANTONINI, M. BARLAUD, I. DAUBECHIES, AND P. MATHIEU, Image coding using vector quantization in the wavelet transform domain, IEEE International Conference on Acoustics, Speech, and Signal Processing, Albuquerque, NM, April 1990, pp. 2297-2300.
[2] E. H. ADELSON, R. HINGORANI, AND E. SIMONCELLI, Orthogonal pyramid transforms for image coding, SPIE, Visual Communications and Image Processing II, Cambridge, MA, October 27-29, Vol. 845, pp. 50-58.
[3] P. J. BURT AND E. H. ADELSON, The Laplacian pyramid as a compact image code, IEEE Trans. Comm., COM-31 (1983), pp. 532-540.
[4] A. COHEN, Ondelettes, analyses multirésolutions et traitement numérique du signal, Thesis, CEREMADE, University of Paris-Dauphine, Paris, France, September 1990.
[5] J. C. FEAUVEAU, Analyse multirésolution par ondelettes non orthogonales et bancs de filtres numériques, Thesis, University of Paris-South, Paris, France, January 1990.
[6] J. FROMENT, Traitement d'images et applications de la transformée en ondelettes, Thesis, CEREMADE, University of Paris-Dauphine, Paris, France, November 1990.
[7] J. FROMENT AND J. M. MOREL, Analyse multiéchelle, vision stéréo et ondelettes, in Les ondelettes en 1989, P. G. Lemarié, ed., Lecture Notes in Mathematics 1438, Springer-Verlag, Berlin, New York, 1990, pp. 51-80.
[8] S. MALLAT, A theory for multiresolution signal decomposition: The wavelet representation, IEEE Trans. Pattern Anal. Machine Intelligence, 11 (1989), pp. 674-693.
[9] D. MARR, Vision, A Computational Investigation into the Human Representation and Processing of Visual Information, W. H. Freeman and Co., New York, 1982.
[10] Y. MEYER, Ondelettes, Hermann, Paris, 1990.
[11] P. TCHAMITCHIAN, Biorthogonalité et théorie des opérateurs, Revista Matemática Iberoamericana, 3 (1987), pp. 163-189.
[12] M. VETTERLI, Wavelets and filter banks: Relationships and new results, IEEE Conference on Acoustics, Speech and Signal Processing, Albuquerque, NM, April 1990, pp. 1723-1726.
[13] J. W. WOODS AND S. O'NEIL, Subband coding of images, IEEE Trans. Acous. Speech Signal Process., 34 (1986), pp. 1276-1288.

CHAPTER 5

Time-Frequency Analysis for Signal Processing

5.1. Introduction.

Dennis Gabor (1946) and Jean Ville (1947) both addressed the problem of developing a mixed signal representation in terms of a double sequence of elementary signals, each of which occupies a certain domain in the time-frequency plane. In the following sections we will define what is meant by "time-frequency plane" and "mixed representation," and we will suggest several choices for "elementary signals."
Roger Balian [1] tackled the same problem and expressed the motivation for his work in these terms:
One is interested, in communication theory, in representing an oscillating signal as a superposition of elementary wavelets, each of which has a rather well defined frequency and position in time. Indeed, useful information is often conveyed by both the emitted frequencies and the signal's temporal structure (music is a typical example). The representation of a signal as a function of time provides a poor indication of the spectrum of frequencies in play, while, on the other hand, its Fourier analysis masks the point of emission and the duration of each of the signal's elements. An appropriate representation ought to combine the advantages of these two complementary descriptions; at the same time, it should be discrete so that it is better adapted to communication theory.
Similar criticism of the usual Fourier analysis, as applied to acoustic signals, is found in the celebrated work of Ville:
If we consider a passage [of music] containing several measures (which is the least that is needed) and if a note, la for example, appears once in the passage, harmonic analysis will give us the corresponding frequency with a certain amplitude and a certain phase, without localizing the la in time. But it is obvious that there are moments during the passage when one does not hear the la. The [Fourier] representation is nevertheless mathematically correct because the phases of the notes near the la are arranged so as to destroy this note through interference when it is not heard and to reinforce it, also through interference, when it is heard; but if there is in this idea a cleverness that speaks well for mathematical analysis, one must not ignore the fact that it is also a distortion of reality: indeed when the la is not heard, the true reason is that the la is not emitted.

Thus it is desirable to look for a mixed definition of a signal of the sort advocated by Gabor: at each instant, a certain number of frequencies are present, giving volume and timbre to the sound as it is heard; each frequency is associated with a certain partition of time that defines the intervals during which the corresponding note is emitted. One is thus led to define an instantaneous spectrum as a function of time, which describes the structure of the signal at a given instant; the spectrum of the signal (in the usual sense of the term), which gives the frequency structure of the signal based on its total duration, is then obtained by putting together all of the instantaneous spectrums in a precise way by integrating them with respect to time. In a similar way, one is led to a distribution of frequencies with respect to time; by integrating these distributions, one reconstructs the signal...
Ville thus proposed to unfold the signal in the time-frequency plane in such a way that this development would lead to a mixed representation in time-frequency atoms. The choice of these time-frequency atoms would be guided by an energy distribution of the signal in the time-frequency plane.
The time-frequency atoms proposed by Gabor are constructed from the function $g(t) = \pi^{-1/4}\exp(-t^2/2)$ and are defined by

(5.1)    $w(t) = h^{-1/2}\exp(i\omega t)\,g((t - t_0)/h).$

The parameters $\omega$ and $t_0$ are arbitrary real numbers, whereas $h$ is positive. The meaning of these three parameters is the following: $\omega$ is the average frequency of $w(t)$, $h > 0$ is the duration of $w(t)$, and $t_0 - h$, $t_0 + h$ are the start and finish of the "note" $w(t)$. Naturally, all this depends on the convention used to define the "pass band" of $g(t)$.
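
To make the roles of the three parameters concrete, here is a minimal numerical sketch of the atom (5.1). The function name `gabor_atom`, the sampling grid, and the particular parameter values are assumptions chosen for illustration. The sketch checks that the atom has unit energy, that it is centered at $t_0$, and that its root-mean-square duration is $h/\sqrt{2}$, which is one possible convention for the "duration" mentioned above.

```python
import numpy as np

def gabor_atom(t, omega, t0, h):
    """Gabor time-frequency atom w(t) = h^(-1/2) exp(i omega t) g((t - t0) / h)."""
    g = np.pi ** -0.25 * np.exp(-0.5 * ((t - t0) / h) ** 2)
    return h ** -0.5 * np.exp(1j * omega * t) * g

# Illustrative grid and parameters (not from the text).
t = np.linspace(-40.0, 40.0, 80001)
dt = t[1] - t[0]
w = gabor_atom(t, omega=5.0, t0=2.0, h=3.0)

density = np.abs(w) ** 2
energy = np.sum(density) * dt                        # should be 1
time_center = np.sum(t * density) * dt               # should be t0
time_spread = np.sqrt(np.sum((t - time_center) ** 2 * density) * dt)  # h / sqrt(2)
```

The frequency $\omega$ does not affect any of these three quantities; it only shifts the atom along the frequency axis.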
The essential problem is to describe an algorithm that allows a given signal to be decomposed, in an optimal way, as a linear combination of judiciously chosen time-frequency atoms.
The set of all time-frequency atoms (with $\omega$ and $t_0$ varying arbitrarily in the time-frequency plane and $h > 0$ covering the whole scale axis) is a collection of elementary signals that is much too large to provide a unique representation of a signal as a linear combination of time-frequency atoms. Each signal admits an infinite number of representations, and this leads us to choose the best among them according to some criterion. This criterion might be the one suggested by Ville: The decomposition of a signal in time-frequency atoms is related to a synthesis, and this synthesis ought logically to be done in accordance with an analysis. The analysis proposed by Ville will be described in the following sections. However, Ville did not explain how the results of the analysis would lead to an effective synthesis.
A similar program (the definition of time-frequency atoms, analysis, and synthesis) was proposed by Jean-Sylvain Lienard [3]:
We consider the speech signal to be composed of elementary waveforms, wf (windowed sinusoids), each one defined by a small number of parameters. A waveform model (wfm) is a sinusoidal signal multiplied by a windowing function. It is not to be confused with the signal segment, wf, that it is supposed to approximate. Its total duration can be decomposed into attack (before the maximum of the envelope), and decay. In order to minimize spectral ripples, the envelope should present no 1st or 2nd order discontinuity. The initial discontinuity is removed through the use of an attack function (raised sinusoids) such that the total envelope is null at the origin, and maximum after a short time. Although exponential damping is natural in the physical world, we choose to model the decaying part of the wf with another raised sinusoid. Actually we see the wf as a perceptual unit, and not necessarily as the response of a formant filter to a voicing impulse...

Lienard's "time-frequency atoms" are thus different from those used by Gabor. They are, however, based on analogous principles. We have $w(t) = A(t)\cos(\omega t + \varphi)$, where $\omega$ represents the average frequency of the emitted "note" and where the envelope $A(t)$ incorporates the attack and decay. The principal difference is that, in the "atoms" of Lienard, the duration of the attack and that of the decay are independent. Thus Lienard's "atoms" depend on four independent parameters, and the optimal representation of a speech signal as a linear combination of time-frequency atoms is more difficult to obtain. Some empirical methods exist, and they lead to wonderful results for synthesizing the singing voice. I had the chance to hear the Queen of the Night's grand aria from Mozart's Magic Flute interpreted by "time-frequency atoms." This was not a copy of the human voice; it involved the creation of a purely numerical (superhuman) voice. This was commissioned by Pierre Boulez, the director of the Institut de Recherches Coordonnées Acoustique-Musique.


5.2. The time-frequency plane.

The time-frequency plane serves the acoustician the way music paper serves the musician. To continue this metaphor, the signal analysis that we seek to effect should be compared to an exercise called a "musical dictation," which consists in writing down the notes on hearing a passage of music.

5.3. The Wigner-Ville transform.

We begin by presenting the point of view of Ville. We will then indicate how to interpret the results in terms of the theory of pseudodifferential operators as expressed in Hermann Weyl's formalism. This will bring us back to work done by the physicist Eugene P. Wigner (1932).
Ville, searching for an "instantaneous spectrum," wanted to display the energy of a signal in the time-frequency plane and to obtain an energy density $W(t,\xi)$ having (at least) the following properties:

(5.2)    $\int_{-\infty}^{\infty} W(t,\xi)\,\frac{d\xi}{2\pi} = |f(t)|^2,$

(5.3)    $\int_{-\infty}^{\infty} W(t,\xi)\,dt = |\hat f(\xi)|^2,$

where $\hat f(\xi)$ denotes the Fourier transform of $f$. These two properties reflect the program that we presented in the introduction: at each instant $t$, the function $W(t,\xi)$ gives an instantaneous Fourier analysis of the signal $f(t)$, and (5.2) is the Plancherel formula. The same remark holds for (5.3): $|\hat f(\xi)|^2$ comes from the contributions of all instants $t$, and one hopes that $|\hat f(\xi)|^2$ is more precisely analyzed by using these "individual contributions."
Properties (5.2) and (5.3) are clearly not sufficient to define $W(t,\xi) = W_f(t,\xi)$. We impose two other conditions, namely, "Moyal's formula"

(5.4)    $\iint W_f(t,\xi)\,W_g(t,\xi)\,dt\,\frac{d\xi}{2\pi} = \left|\int f(t)\overline{g(t)}\,dt\right|^2,$

which plays the role of Parseval's identity, and the requirement that if

(5.5)    $f(t) = h^{-1/2}\exp(i\omega t)\,g((t - t_0)/h)$

then

(5.6)    $W_f(t,\xi) = 2\exp\left(-\frac{(t - t_0)^2}{h^2} - h^2(\xi - \omega)^2\right).$

Let's stop a moment to examine (5.6). The second member is a function of $(t,\xi)$ that is localized on the rectangle of the time-frequency plane defined by $|t - t_0| \le h$, $|\xi - \omega| \le 1/h$. This localization corresponds exactly to the frequency content of the time-frequency atom $f(t)$. Up to a normalization factor, the second member of (5.6) is the solution to the localization problem in the time-frequency plane that we want for our time-frequency atom.
We now come to the general definition of the Wigner-Ville transform of a signal $f(t)$. First assume that the energy $\int_{-\infty}^{\infty}|f(t)|^2\,dt$ is finite. Then define

(5.7)    $W(t,\xi) = \int_{-\infty}^{\infty} f\left(t + \frac{\tau}{2}\right)\overline{f\left(t - \frac{\tau}{2}\right)}\,e^{-i\xi\tau}\,d\tau.$

We notice immediately that $W(t,\xi)$ is real and continuous in both variables. It is easy to verify that $W(t,\xi)$ has the properties indicated in (5.2) to (5.6).
If $f(t) = f(-t)$, then $W(0,0) = 2\int_{-\infty}^{\infty}|f(\tau)|^2\,d\tau > W(t,\xi)$ for all other pairs $(t,\xi)$.
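
Definition (5.7) can be evaluated directly by quadrature. The following is a minimal sketch, assuming the normalization of (5.7) written as $W(t,\xi) = \int f(t+\tau/2)\overline{f(t-\tau/2)}e^{-i\xi\tau}d\tau$; the function names `wigner_ville` and `g`, the $\tau$-grid, and the sample points are illustrative choices. It compares the numerical transform of the normalized Gaussian with the closed form $2\exp(-t^2 - \xi^2)$, that is, (5.6) with $h = 1$, $\omega = 0$, $t_0 = 0$.

```python
import numpy as np

def wigner_ville(f, t, xi, tau):
    """Riemann-sum approximation of the integral (5.7) at a single point (t, xi)."""
    dtau = tau[1] - tau[0]
    integrand = f(t + tau / 2) * np.conj(f(t - tau / 2)) * np.exp(-1j * xi * tau)
    return np.sum(integrand) * dtau

def g(t):
    """Normalized Gaussian g(t) = pi^(-1/4) exp(-t^2 / 2)."""
    return np.pi ** -0.25 * np.exp(-0.5 * np.asarray(t, dtype=float) ** 2)

tau = np.linspace(-30.0, 30.0, 6001)
points = [(0.0, 0.0), (0.5, -1.0), (1.2, 0.7)]

values = [wigner_ville(g, t, xi, tau) for t, xi in points]
expected = [2.0 * np.exp(-t ** 2 - xi ** 2) for t, xi in points]
```

The imaginary parts vanish up to rounding, in agreement with the remark that the transform is real, and the value at the origin is $2\int|g|^2 = 2$.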
5.4. The computation of certain Wigner-Ville transforms.

We begin by treating the case of signals with finite energy. If $W(t,\xi)$ is the Wigner-Ville transform of $f(t)$, then

$W(t, \xi - \omega)$ is the transform of $e^{i\omega t}f(t)$,

$W(t - t_0, \xi)$ is the transform of $f(t - t_0)$,

and

$W\left(\frac{t}{a}, a\xi\right)$ is the transform of $\frac{1}{\sqrt{a}}f\left(\frac{t}{a}\right)$, $a > 0$.

Knowing that the transform of $g(t) = \pi^{-1/4}\exp(-t^2/2)$ is $2\exp(-t^2 - \xi^2)$, we deduce immediately that the Wigner-Ville transform of

$\frac{1}{\sqrt{h}}\,e^{i\omega t}\,g\left(\frac{t - t_0}{h}\right)$

is

$2\exp\left(-\frac{(t - t_0)^2}{h^2} - h^2(\xi - \omega)^2\right).$

Here are some other useful observations. The Wigner-Ville transformation of a function characterizes the function up to multiplication by a constant of modulus 1. We will prove this in the next section when we establish the connection between the Wigner-Ville transform and the pseudodifferential calculus.
The Wigner-Ville transform of $f(-t)$ is $W(-t, -\xi)$ when the transform of $f(t)$ is $W(t,\xi)$. Multiplying $f(t)$ by a real or complex constant $\lambda$ results in the transform $W(t,\xi)$ being multiplied by $|\lambda|^2$. Thus we need to consider only the case where $\int_{-\infty}^{\infty}|f(t)|^2\,dt = 1$ when we are working with signals of finite energy.
Not all functions $W(t,\xi)$ of the two variables $t$ and $\xi$ are the Wigner-Ville transform of some signal $f(t)$.

We consider a positive-definite quadratic form

$Q(t,\xi) = p\xi^2 + 2r\xi t + qt^2 \quad (p > 0,\ q > 0,\ pq > r^2)$

and ask when $2\exp(-Q(t,\xi))$ is the Wigner-Ville transform of a signal $f(t)$. In view of the preceding remarks, we must have $f(-t) = f(t)$, and we assume that $\int_{-\infty}^{\infty}|f(t)|^2\,dt = 1$. But then $\iint W(t,\xi)\,dt\,d\xi = 2\pi$, which implies that $pq - r^2 = 1$.
We will show that this necessary condition is also sufficient for $2\exp(-Q(t,\xi))$ to be the Wigner-Ville transform of a signal.
For this, we observe that if $W(t,\xi)$ is the Wigner-Ville transform of $f(t)$, then the transform of $f(t)e^{i\alpha t^2/2}$, where $\alpha$ is a real constant, is $W(t, \xi - \alpha t)$.
Our quadratic form $p\xi^2 + 2r\xi t + qt^2$ can also be written as $Q(t,\xi) = p(\xi + (r/p)t)^2 + (t^2/p)$ since $pq - r^2 = 1$. The Wigner-Ville transform of $p^{-1/4}g(t/\sqrt{p})$, where $g(t) = \pi^{-1/4}e^{-t^2/2}$, is $2\exp(-(t^2/p) - p\xi^2)$, and thus that of $p^{-1/4}g(t/\sqrt{p})\exp(-i(r/2p)t^2)$ is $2\exp(-Q(t,\xi))$. Our problem is solved.
More generally $2\exp(-Q(t - t_0, \xi - \omega))$ is the Wigner-Ville transform of the signal

(5.8)    $f(t) = p^{-1/4}\,g\left(\frac{t - t_0}{\sqrt{p}}\right)\exp\left(-i\frac{r}{2p}(t - t_0)^2\right)\exp(i\omega t),$

which is called a "chirp." The quadratic form $Q(t,\xi) = p\xi^2 + 2r\xi t + qt^2$ is subject to the condition $r^2 - pq = -1$.
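
The chirp computation can be checked numerically. The sketch below assumes the convention $W(t,\xi) = \int f(t+\tau/2)\overline{f(t-\tau/2)}e^{-i\xi\tau}d\tau$ of (5.7); under that convention the quadratic phase that pairs with the cross term $+2r\xi t$ in $Q$ is $\exp(-i(r/2p)(t-t_0)^2)$, since multiplying by $e^{i\alpha t^2/2}$ replaces $\xi$ by $\xi - \alpha t$. The values of `p`, `r`, `t0`, `omega` and the helper names are illustrative assumptions.

```python
import numpy as np

def wigner_ville(f, t, xi, tau):
    """Riemann-sum approximation of the Wigner-Ville integral at one point (t, xi)."""
    dtau = tau[1] - tau[0]
    integrand = f(t + tau / 2) * np.conj(f(t - tau / 2)) * np.exp(-1j * xi * tau)
    return np.sum(integrand) * dtau

def g(t):
    return np.pi ** -0.25 * np.exp(-0.5 * np.asarray(t, dtype=float) ** 2)

# Illustrative parameters; q is fixed by the constraint p*q - r^2 = 1.
p, r = 2.0, 1.0
q = (1.0 + r ** 2) / p
t0, omega = 0.8, 3.0

def chirp(t):
    """Gaussian chirp with quadratic phase -(r / 2p)(t - t0)^2 and carrier omega."""
    s = np.asarray(t, dtype=float) - t0
    return (p ** -0.25 * g(s / np.sqrt(p))
            * np.exp(-1j * r / (2.0 * p) * s ** 2)
            * np.exp(1j * omega * np.asarray(t, dtype=float)))

def Q(t, xi):
    return p * xi ** 2 + 2.0 * r * xi * t + q * t ** 2

tau = np.linspace(-40.0, 40.0, 8001)
points = [(0.8, 3.0), (1.5, 2.2), (-0.3, 3.9)]

numeric = [wigner_ville(chirp, t, xi, tau).real for t, xi in points]
closed_form = [2.0 * np.exp(-Q(t - t0, xi - omega)) for t, xi in points]
```

The level curves of the transform are ellipses tilted according to the sign of `r`, which is the time-frequency signature of a linearly varying frequency.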
Here is an important identity involving the Wigner-Ville transform. We have

In other words, if $W(t,\omega)$ is the Wigner-Ville transform of $f(t)$, then $W(\omega, -t)$ is that of $\frac{1}{\sqrt{2\pi}}\hat f(\xi)$. Here again $\hat f$ denotes the Fourier transform of $f$.
Another very useful fact is that the Wigner-Ville transform of an arbitrary function is always real, but it is not always positive. This second remark is the source of a great deal of difficulty in the interpretation of $W(t,\xi)$.
Ville interpreted the Wigner-Ville transform $W(t,\xi)$ of a normalized signal $f(t)$ as a probability density in the time-frequency plane. If this probability density were concentrated in several well-delimited rectangles in the time-frequency plane, this would lead to a decomposition of the signal in terms of the corresponding time-frequency atoms.
This program has not led to an effective algorithm. The reason for this failure is that if $f(t)$ is the sum of two time-frequency atoms $e^{i\omega_1 t}g(t - t_1) + e^{i\omega_2 t}g(t - t_2)$, then

$W(t,\xi) = W_1(t,\xi) + W_2(t,\xi) + W_3(t,\xi) + W_4(t,\xi),$

where the terms $W_1(t,\xi)$ and $W_2(t,\xi)$ are the "square" terms already calculated and where $W_3$ and $W_4$ are two "cross" terms. But these cross terms do not tend to zero if $\omega_2 - \omega_1$ or if $t_2 - t_1$ tends to infinity. In fact,

$W_3(t,\xi) + W_4(t,\xi) = 4\exp\left(-(t - t_3)^2 - (\xi - \omega_3)^2\right)\cos\left((\omega_1 - \omega_2)t + (t_2 - t_1)(\xi - \omega_3)\right),$

where $t_3 = \frac{1}{2}(t_1 + t_2)$ and $\omega_3 = \frac{1}{2}(\omega_1 + \omega_2)$. These "cross" terms are thus artifacts that are localized in the time-frequency plane midway between the corresponding square terms.
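
The persistence of the cross terms is easy to observe numerically. The sketch below assumes the convention $W(t,\xi) = \int f(t+\tau/2)\overline{f(t-\tau/2)}e^{-i\xi\tau}d\tau$ and uses a closed form for the sum of the two cross terms that was derived here by completing the square under that convention, namely $4\exp(-(t-t_3)^2-(\xi-\omega_3)^2)\cos((\omega_1-\omega_2)t+(t_2-t_1)(\xi-\omega_3))$; it should be read as a derived check rather than a quotation. The atom positions and frequencies are illustrative.

```python
import numpy as np

def wigner_ville(f, t, xi, tau):
    """Riemann-sum approximation of the Wigner-Ville integral at one point (t, xi)."""
    dtau = tau[1] - tau[0]
    integrand = f(t + tau / 2) * np.conj(f(t - tau / 2)) * np.exp(-1j * xi * tau)
    return np.sum(integrand) * dtau

def g(t):
    return np.pi ** -0.25 * np.exp(-0.5 * np.asarray(t, dtype=float) ** 2)

# Two well-separated atoms (illustrative values).
t1, t2 = -5.0, 5.0
w1, w2 = 2.0, 6.0
t3, w3 = 0.5 * (t1 + t2), 0.5 * (w1 + w2)

def two_atoms(t):
    t = np.asarray(t, dtype=float)
    return np.exp(1j * w1 * t) * g(t - t1) + np.exp(1j * w2 * t) * g(t - t2)

def square_terms(t, xi):
    return (2.0 * np.exp(-(t - t1) ** 2 - (xi - w1) ** 2)
            + 2.0 * np.exp(-(t - t2) ** 2 - (xi - w2) ** 2))

def cross_terms(t, xi):
    # Closed form derived under the stated convention (an assumption of this sketch).
    return (4.0 * np.exp(-(t - t3) ** 2 - (xi - w3) ** 2)
            * np.cos((w1 - w2) * t + (t2 - t1) * (xi - w3)))

tau = np.linspace(-60.0, 60.0, 12001)
points = [(t3, w3), (0.3, 4.2), (-0.4, 3.5)]

numeric = [wigner_ville(two_atoms, t, xi, tau).real for t, xi in points]
predicted = [square_terms(t, xi) + cross_terms(t, xi) for t, xi in points]
midpoint_value = numeric[0]
```

At the midpoint $(t_3, \omega_3)$, where neither atom has any appreciable energy, the transform takes a value close to 4, twice the peak value of each individual atom, and this amplitude does not depend on how far apart the atoms are.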
The fact that the Wigner-Ville transform is not, in general, positive and the fact that its localization in the time-frequency plane does not necessarily imply the presence of time-frequency atoms are two independent properties. This can be seen by considering the signal $f(t) = e^{-t}$ for $t \ge 0$, and $f(t) = 0$ elsewhere. Then $W(t,\xi) = 2e^{-2t}\,\frac{\sin(2t\xi)}{\xi}$ if $t \ge 0$ and $W(t,\xi) = 0$ otherwise.
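
For the reader who wishes to check this example, the computation can be written out as follows, assuming the convention of (5.7). For $t \ge 0$ the product $f(t+\tau/2)\overline{f(t-\tau/2)}$ vanishes unless $|\tau| \le 2t$, where it equals $e^{-2t}$; the sample point $(t,\xi) = (1,2)$ is an illustrative choice showing a negative value.

```latex
\begin{aligned}
W(t,\xi) &= \int_{-\infty}^{\infty} f\!\left(t+\tfrac{\tau}{2}\right)
            \overline{f\!\left(t-\tfrac{\tau}{2}\right)}\,e^{-i\xi\tau}\,d\tau
          = e^{-2t}\int_{-2t}^{2t} e^{-i\xi\tau}\,d\tau
          = 2e^{-2t}\,\frac{\sin(2t\xi)}{\xi}, \qquad t \ge 0, \\[4pt]
W(1,2) &= e^{-2}\sin 4 \approx -0.102 < 0 .
\end{aligned}
```

The transform is supported exactly on the half-plane $t \ge 0$ where the signal lives, yet it changes sign there, which separates the localization question from the positivity question.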
There exists, however, a simple way to make the Wigner-Ville transform positive. It suffices to smooth it appropriately. Indeed, if $f_1$ and $f_2$ are two arbitrary functions (with finite energy), then

(5.10)    $\iint W_{f_1}(t - u, \xi - v)\,W_{f_2}(u, v)\,du\,dv = 2\pi\left|\int f_1(s)\overline{f_2(t - s)}\,e^{-i\xi s}\,ds\right|^2.$

If, in particular, $f_2(t) = \frac{1}{\sqrt{h}}\,g\left(\frac{t}{h}\right)$, where $g(t)$ is the normalized Gaussian and where $h > 0$ is arbitrary, then we have $W_{f_2}(u,v) = 2\exp(-(u^2/h^2) - h^2v^2)$ and the smoothing function is a Gaussian kernel. The mean value one obtains is the square of the modulus of the scalar product of $f_1$ and the time-frequency atom $\frac{1}{\sqrt{h}}\,g\left(\frac{s - t}{h}\right)e^{i\xi s}$, centered at $t$, with width $h$ and average frequency $\xi$.
But the Wigner-Ville transform $W(t,\xi)$ of a signal $f$ can also be smoothed by using a kernel of the form $\frac{1}{\pi}\exp(-Q(t,\xi))$, where $Q(t,\xi)$ is one of the quadratic forms previously studied. One obtains a positive contribution that is the square of the modulus of the scalar product of $f$ with a "chirp."
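
The effect of Gaussian smoothing can be observed on the one-sided exponential $f(t) = e^{-t}$, $t \ge 0$. The sketch below rests on several assumptions that are not in the text: the closed form $W(t,\xi) = 2e^{-2t}\sin(2t\xi)/\xi$ for $t \ge 0$ (derived under the convention of (5.7)), the window width $h = 1$, the integration grids, and the test point $(1, 2)$. It compares the smoothed transform with $2\pi$ times the squared modulus of the scalar product of $f$ with the Gabor atom centered at $(t,\xi)$.

```python
import numpy as np

def g(t):
    return np.pi ** -0.25 * np.exp(-0.5 * np.asarray(t, dtype=float) ** 2)

def wv_one_sided_exp(t, xi):
    """Closed-form transform of f(t) = exp(-t) for t >= 0 (zero for t < 0)."""
    tp = np.maximum(np.asarray(t, dtype=float), 0.0)
    # sin(2 t xi) / xi = 2 t sinc(2 t xi / pi); np.sinc handles xi = 0.
    return 4.0 * tp * np.exp(-2.0 * tp) * np.sinc(2.0 * tp * xi / np.pi)

u = np.linspace(-6.0, 6.0, 1201)
v = np.linspace(-6.0, 6.0, 1201)
du, dv = u[1] - u[0], v[1] - v[0]
U, V = np.meshgrid(u, v, indexing="ij")
gauss_kernel = 2.0 * np.exp(-U ** 2 - V ** 2)   # transform of g itself (h = 1)

def smoothed_wv(t, xi):
    """Left side of the smoothing identity, by a two-dimensional Riemann sum."""
    return float(np.sum(wv_one_sided_exp(t - U, xi - V) * gauss_kernel) * du * dv)

def atom_energy(t, xi):
    """2*pi*|<f, atom>|^2 with the atom g(s - t) exp(i xi s), by the trapezoid rule."""
    s = np.linspace(0.0, 40.0, 40001)
    ds = s[1] - s[0]
    integrand = np.exp(-s) * g(t - s) * np.exp(-1j * xi * s)
    inner = (np.sum(integrand) - 0.5 * (integrand[0] + integrand[-1])) * ds
    return 2.0 * np.pi * abs(inner) ** 2

t_test, xi_test = 1.0, 2.0
raw = float(wv_one_sided_exp(t_test, xi_test))   # negative at this point
smooth = smoothed_wv(t_test, xi_test)            # positive after smoothing
reference = atom_energy(t_test, xi_test)
```

The price of positivity is resolution: the smoothed quantity is a Gaussian-windowed Fourier analysis of the signal, and it can no longer resolve details smaller than the window.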

5.5. The Wigner-Ville transform and pseudodifferential calculus.

The following considerations allow us to relate the Wigner-Ville transform to quantum mechanics and the work of Wigner. We are going to forget signal-processing problems for the moment and go directly to dimension $n$. The analogue of the time-frequency plane is the phase space $\mathbb{R}^n \times \mathbb{R}^n$ whose elements are pairs $(x,\xi)$, where $x$ is a position and $\xi$ is a frequency.
We start with a "symbol" $\sigma(x,\xi)$ defined on phase space. Certain technical hypotheses have to be made about this symbol to ensure convergence of the following integral when $f$ belongs to a reasonable class of test functions, and we will deal with this in a moment.
Following the formalism of Weyl, we associate with the symbol $\sigma(x,\xi)$ the pseudodifferential operator $\sigma(x,D)$ defined by

(5.11)    $(2\pi)^n\,\sigma(x,D)[f](x) = \iint \sigma\left(\frac{x + y}{2}, \xi\right)e^{i(x - y)\cdot\xi}f(y)\,dy\,d\xi,$

where the integral is over $\mathbb{R}^n \times \mathbb{R}^n$. Define the kernel $K(x,y)$ associated with the symbol $\sigma(x,\xi)$ by

(5.12)    $(2\pi)^n K(x,y) = \int \sigma\left(\frac{x + y}{2}, \xi\right)e^{i(x - y)\cdot\xi}\,d\xi = (2\pi)^n L\left(\frac{x + y}{2}, x - y\right).$

This says that the symbol $\sigma(x,\xi)$ is the partial Fourier transform, in the variable $u$, of the function $L(x,u)$ and that the kernel $K(x,y)$ that interests us is $L\left(\frac{x + y}{2}, x - y\right)$. We can also write, in the inverse sense, $L(x,y) = K\left(x + \frac{y}{2}, x - \frac{y}{2}\right)$, and this allows us to recover the symbol $\sigma(x,\xi)$ by writing

(5.13)    $\sigma(x,\xi) = \int K\left(x + \frac{y}{2}, x - \frac{y}{2}\right)e^{-iy\cdot\xi}\,dy.$

Thus we are led to hypotheses about the symbols that are the reflections, through the partial Fourier transform, of hypotheses that we may wish to make about the kernels. If we admit all the kernel distributions $K(x,y)$ belonging to $S'(\mathbb{R}^n \times \mathbb{R}^n)$, then there will be no restrictions on $\sigma(x,\xi)$ other than the condition that

$\sigma(x,\xi) \in S'(\mathbb{R}^n \times \mathbb{R}^n).$

An immediate consequence of (5.13) is this: If $\sigma(x,\xi)$ is the symbol for the operator $T$, then $\overline{\sigma}(x,\xi)$ is the symbol for the adjoint operator $T^*$.
Finally, we consider a function $f$ belonging to $L^2(\mathbb{R}^n)$ and satisfying $\|f\|_2 = 1$. Let $P_f$ denote the orthogonal projection operator that maps $L^2(\mathbb{R}^n)$ onto the span of $f$. Then the kernel $K(x,y)$ of $P_f$ is $f(x)\overline{f(y)}$ and the corresponding symbol is

(5.14)    $\sigma(x,\xi) = \int f\left(x + \frac{y}{2}\right)\overline{f\left(x - \frac{y}{2}\right)}\,e^{-iy\cdot\xi}\,dy.$

Returning to dimension one, we have the following result: The Wigner-Ville transform of the function $f$ is the Weyl symbol of the orthogonal projection operator onto that function $f$.
From this it is clear that the Wigner-Ville transform of $f$ characterizes $f$, up to multiplication by a constant of modulus 1.

5.6. The Wigner-Ville transform and instantaneous frequency.


In Ville's fundamental work (which has essentially been the source for this chapter), he makes a careful distinction between the instantaneous frequency of a signal (assumed to be real) and the instantaneous spectrum of frequencies given by the Wigner-Ville transform.
More precisely, let $f(t)$ be a real signal with finite energy. Ville writes $f(t) = \mathrm{Re}\,F(t)$, where $F(t)$ is the corresponding analytic signal: $F(t)$ is the restriction to the real axis of a function $F(z)$ that is holomorphic in the upper half-plane $\mathrm{Im}\,z > 0$ and belongs to the Hardy space $H^2(\mathbb{R})$.
Ville then writes $F(t) = A(t)e^{i\varphi(t)}$, where $A(t)$ is the modulus of $F(t)$ and $\varphi(t)$ is its argument. He defines the instantaneous frequency of $f(t)$ by $\frac{d}{dt}\varphi(t)$.
This definition requires the function $f(t)$ to have additional regularity properties. Otherwise $\varphi(t)$ could be as irregular as an arbitrary bounded, measurable function, and the instantaneous frequency would then be a very singular object. This also raises a problem about the continuity of $\varphi(t)$ so as not to introduce Dirac measures in $\frac{d}{dt}\varphi(t)$.
We will not deal with these difficulties, and we assume that Ville's formal definitions make sense. This, of course, clearly limits the class of analyzed signals.
Following Ville, we define the instantaneous spectrum of $f(t)$ as the Wigner-Ville transform $W(t,\xi)$ of the analytic signal $F(t)$.
An easy calculation shows that, for all real or complex-valued functions $u(t)$, one has

$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \xi\,u(t + \tau/2)\,\overline{u}(t - \tau/2)\,e^{-i\tau\xi}\,d\xi\,d\tau = -\pi i\left(u'(t)\overline{u}(t) - u(t)\overline{u}'(t)\right).$

Applying this identity to $u(t) = F(t)$, it becomes

$\frac{1}{2\pi}\int_{-\infty}^{\infty} \xi\,W(t,\xi)\,d\xi = \varphi'(t)|F(t)|^2 = \varphi'(t)\left(\frac{1}{2\pi}\int_{-\infty}^{\infty} W(t,\xi)\,d\xi\right).$

If $W(t,\xi)$ is positive or zero, then $\frac{1}{2\pi}W(t,\xi)$ will be a probability density (when $\int_{-\infty}^{\infty}|F(t)|^2\,dt = 1$) and the instantaneous frequency will be the average of the frequency $\xi$ computed with respect to the instantaneous spectrum.
Similarly, we can try to compute the analogue of the variance of the variable $\xi$ with respect to the instantaneous spectrum. This is $\int_{-\infty}^{\infty}(\xi - \varphi'(t))^2\,W(t,\xi)\,d\xi$. The calculation is completely general and does not rely on the assumption that $F(t) = A(t)e^{i\varphi(t)}$ is an analytic function. We obtain

(5.15)    $\int_{-\infty}^{\infty}(\xi - \varphi'(t))^2\,W(t,\xi)\,d\xi = \pi\left(A'(t)^2 - A(t)A''(t)\right).$

If, in particular, $A(t) = 1$, the second member is zero, and $W(t,\xi)$ cannot be positive or zero unless it is concentrated on the curve $\xi = \varphi'(t)$, which represents the graph of the instantaneous frequency. Since $2\pi A^2(t) = \int_{-\infty}^{\infty} W(t,\xi)\,d\xi$, the "variance" of $\xi$ is equal to $-\frac{1}{2}(d^2/dt^2)(\log A(t))$.
Here are two examples of the calculation of the instantaneous frequency.
Suppose first that the original signal $f(t)$ is real, equal to 1 on the interval $[-T, T]$, and 0 outside this interval. The corresponding analytic signal is then

$F(t) = f(t) + \frac{i}{\pi}\log\left|\frac{t + T}{t - T}\right|.$

The phase $\varphi(t)$ of $F(t)$ is continuous on the whole real line, odd, equal to $\frac{\pi}{2}$ if $t \ge T$ (and thus $-\frac{\pi}{2}$ if $t \le -T$), and strictly increasing on $[-T, T]$. The instantaneous frequency is 0 outside the interval $[-T, T]$, strictly positive on $(-T, T)$, equal to $\frac{2}{\pi T}$ at 0, and increases from $\frac{2}{\pi T}$ to $+\infty$ as $t$ traverses the interval $[0, T)$.

As one could have guessed, the instantaneous Fourier analysis proposed by Ville is not even a local property. This means that knowing the signal in an arbitrarily large interval centered at $t_0$ is not sufficient to calculate the instantaneous frequency at $t_0$. The operation responsible for this anomaly is the calculation of the analytic signal $F(t)$ associated with $f(t)$; as everyone knows, the kernel of the Hilbert transform decreases slowly at infinity.
This discussion shows that the signals to which the Ville theory applies are necessarily academic signals (whose algorithmic structure does not change over time) or asymptotic signals whose behavior on a short time interval is equivalent to that of a normal signal over a much longer duration.
The second example of a calculation of the instantaneous frequency is for the signal $f(t) = \cos t^2$, which is a chirp of infinite duration. The calculation of the corresponding analytic signal is interesting because it exhibits two different asymptotic behaviors depending on whether $t$ tends to $+\infty$ or $-\infty$. Indeed, the Fourier transform of this analytic signal $F(t)$ is 0 for $\xi < 0$ and is equal to $\sqrt{\pi}\left(e^{i\pi/4}e^{-i\xi^2/4} + e^{-i\pi/4}e^{i\xi^2/4}\right)$ for $\xi > 0$. It follows that $F(t)$ is asymptotically equal to $e^{it^2}$ when $t$ tends to $+\infty$ and to $e^{-it^2}$ when $t$ tends to $-\infty$. The instantaneous frequency of $\cos t^2$ is thus equal to $2t + \varepsilon(t)$ when $t$ tends to $+\infty$ and to $-2t + \varepsilon(t)$ when $t$ tends to $-\infty$. In both cases, $\varepsilon(t)$ tends to 0 when $|t|$ tends to $+\infty$.
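
Ville's definition can be tried out on sampled data. The sketch below is not the computation in the text: it replaces the infinite chirp $\cos t^2$ by a Gaussian-windowed version (so that the signal has finite energy and fits in a finite record), and it builds the analytic signal with a discrete Fourier transform by suppressing negative frequencies. The function name `analytic_signal`, the window, and the sampling step are assumptions made for the illustration.

```python
import numpy as np

def analytic_signal(x):
    """FFT-based analytic signal: zero the negative frequencies, double the positive ones."""
    n = x.size
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

dt = 0.005
t = np.arange(0.0, 20.0, dt)

# Gaussian-windowed chirp: a finite-energy stand-in for cos(t^2) with t > 0.
envelope = np.exp(-((t - 10.0) ** 2) / 8.0)
f = envelope * np.cos(t ** 2)

F = analytic_signal(f)                  # F(t) = A(t) exp(i phi(t))
phase = np.unwrap(np.angle(F))          # phi(t), made continuous
inst_freq = np.gradient(phase, dt)      # Ville's instantaneous frequency d(phi)/dt

# Compare with the asymptotic law 2t where the envelope is not negligible.
mask = (t > 8.0) & (t < 12.0)
max_error = np.max(np.abs(inst_freq[mask] - 2.0 * t[mask]))
```

On this well-behaved example the instantaneous frequency follows the law $2t$ closely. The nonlocal character discussed above shows up as soon as the signal is truncated abruptly, because the analytic signal then depends on values far from the point of interest.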

5.7. The Wigner-Ville transform of asymptotic signals.

As we have already seen in 5.5, the Wigner-Ville transform can be generalized to the case where, instead of being a signal of finite energy, $f(t)$ is an arbitrary tempered distribution. We limit our discussion to three examples where $f(t) = e^{i\varphi(t)}$. We begin with the particular case where $\varphi(t) = \omega t$. ($f(t)$ is an analytic signal only when $\omega \ge 0$, but the calculations that follow do not depend on this type of hypothesis.)
The Wigner-Ville transform of $f(t) = e^{i\omega t}$ is $2\pi\delta_0(\xi - \omega)$, where $\delta_0(\xi)$ is the Dirac measure at the origin. Then the instantaneous frequency given by the Wigner-Ville transform is simply $\omega$.
Next, if $f(t) = e^{i\alpha t^2/2}$ for $\alpha$ real, the Wigner-Ville transform of $f(t)$ is $2\pi\delta_0(\xi - \alpha t)$. This is a distribution (in fact, a measure) supported by the line $\xi = \alpha t$. The corresponding "instantaneous frequency" is $\alpha t$, and both members of (5.15) are 0 in this case. In fact, this statement is not correct because $e^{i\alpha t^2/2}$ is not an analytic signal. However, as we have already observed, $e^{i\alpha t^2/2}$ is asymptotic to an analytic signal when $\alpha$ is strictly positive and when $t$ tends to $+\infty$.
Finally we come to the case where $f(t) = e^{i\alpha t^3}$ with $\alpha > 0$. The Wigner-Ville transform of this function is easily calculated and is

$\int \exp\left[i\left(\alpha\tau^3/4 + 3\alpha\tau t^2 - \xi\tau\right)\right]d\tau = 2\pi\left(\frac{4}{3\alpha}\right)^{1/3} A\left(\left(\frac{4}{3\alpha}\right)^{1/3}(3\alpha t^2 - \xi)\right),$

where

$A(\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{i((s^3/3) + \omega s)}\,ds$

is the Airy function. Here again the Wigner-Ville transform of the function $e^{i\alpha t^3}$ is "essentially" concentrated around the curve $\xi = 3\alpha t^2$, which is the graph of the instantaneous frequencies. All of this must be put in quotation marks since $e^{i\alpha t^3}$ is not an analytic signal. But the signal is asymptotically analytic because the Airy function decreases exponentially when $\omega$ tends to $+\infty$.
5.8. Return to the problem of optimal decomposition in time-frequency atoms.

As we indicated in the introduction to this chapter, the analysis of the energy distribution of a signal in the time-frequency plane was, for Ville, a precondition for his search for optimal decompositions in time-frequency atoms.
Consider the example of the signal $f(t) = e^{i\alpha t^2/2}$. In the time-frequency plane, its energy is concentrated on the line $\xi = \alpha t$. If we try to decompose this function in time-frequency atoms of the kind advocated by Gabor, this comes down to covering the line $\xi = \alpha t$ in an optimal way with "Heisenberg boxes" of area 1. We can think of these squares as leading to Gabor wavelets of the form $e^{i\alpha kt}g(t - k)$. Finally, all of this leads to approximating $f(t) = e^{i\alpha t^2/2}$ with the sum of the series $\sum_{k=-\infty}^{\infty} e^{-i\alpha k^2/2}e^{i\alpha kt}g(t - k)$, and this is a poor approximation.
The other major shortcoming of the Wigner-Ville transform is that it is not always positive. One might have thought, in light of the example of the signal $f(t) = e^{i\omega_1 t}g(t - t_1) + e^{i\omega_2 t}g(t - t_2)$, that the places where the Wigner-Ville transform is positive correspond to the time-frequency atoms and that the places where it oscillates were simply artifacts. But this is not the case, as we see from the example of the signal $f(t) = e^{-t}$ for $t \ge 0$; $f(t) = 0$ otherwise.
We are forced to conclude that the Wigner-Ville transform yields only imperfect information about the distribution of energy in the time-frequency plane. There does not exist an algorithm that allows us to find an atomic decomposition of a signal by using the Wigner-Ville transform.

Bibliography

[1] R. BALIAN, Un principe d'incertitude fort en théorie du signal ou en mécanique quantique, C. R. Acad. Sci. Paris, Sér. II, 292 (1981), pp. 1357-1361.
[2] P. FLANDRIN, Some aspects of non-stationary signal processing with emphasis on time-frequency and time-scale methods, in Wavelets, J. M. Combes, A. Grossmann, and Ph. Tchamitchian, eds., Springer-Verlag, Berlin, 1989, pp. 68-98.
[3] J. S. LIÉNARD, Speech analysis and reconstruction using short-time, elementary waveforms, ICASSP 87.
[4] J. VILLE, Théorie et applications de la notion de signal analytique, Câbles et Transmissions, Laboratoire de Télécommunications de la Société Alsacienne de Construction Mécanique, 2ème A., No. 1 (1948).

CHAPTER 6

Time-Frequency Algorithms Using Malvar Wavelets

6.1.

Introduction.

This chapter is the logical sequel of the preceding one. We will introduce algorithms that allow us to decompose a given signal $s(t)$ into a linear combination
of time-frequency atoms. The time-frequency atoms that we use are denoted by
$f_R(t)$ and are coded by the Heisenberg rectangles $R$ (with sides parallel to the
axes and with area 1 or $2\pi$ depending on the normalization). If $R = [a, b] \times [\alpha, \beta]$,
we require that the function $f_R$ be essentially supported on the interval $[a, b]$ and
that its Fourier transform $\hat{f}_R$ be essentially supported on $[\alpha, \beta]$. We also ask that
the algorithmic structure of $f_R(t)$ be simple and explicit to facilitate numerical
processing in real time. The decomposition

(6.1)    $s(t) = \sum_{j=0}^{\infty} \alpha_j f_{R_j}(t)$

cannot be unique, and we take advantage of this flexibility by looking for optimal
decompositions, which here means they contain the fewest possible terms.
The point of view of Ville (and of numerous other signal-processing experts)
is that it is first necessary to understand the physics of the process and that "the
algorithms will follow." Unfortunately, algorithms associated with the Wigner-Ville transform have never followed, they have never existed, and the Wigner-Ville transform is an analytic technique that does not lead to a synthesis or to
the transmission of information.
Our approach here is to favor synthesis and transmission over analysis. The
time-frequency atoms that we use are completely explicit: They are either
Malvar wavelets or wavelet packets, and we will immediately write down the
"atomic decompositions" of the type (6.1). This means that the synthesis will be
direct, whereas the analysis will consist in choosing, with the use of an entropy
criterion, the most effective synthesis, which is the one that leads to optimal
compression. Thus the analysis proceeds according to algorithmic criteria and
not according to physics, and it is not at all clear that this approach leads to
a signal analysis that reveals physical properties having a real meaning. For
example, Marie Farge had the idea to apply the algorithm to images of simulated two-dimensional turbulence. The algorithm extracted, in order, coherent
structures, vorticity filaments, and so on down the scale. This is amazing because the algorithm is not based on an analysis that takes into consideration the
underlying physics of fully developed turbulence.
After these general remarks, it is time to specify the algorithms. There are
two options: Malvar wavelets and wavelet packets. With the first option, the
signal is segmented adaptively and optimally, and then the segments are analyzed
using classical Fourier analysis. The second option, "wavelet packets," reverses
the order of these operations and first filters the signal adaptively; the analysis
in the time variable is then imposed by the algorithm.
Ville (1947) proposed two types of analysis. He wrote: "We can either: first
cut the signal into slices (in time) with a switch; then pass these different slices
through a system of filters to analyze them. Or we can: first filter different
frequency bands; then cut these bands into slices (in time) to study their energy
variations." The first approach leads us to "Malvar wavelets" and the second to
"wavelet packets."

6.2.

Malvar wavelets: A historical perspective.

The discovery of Malvar wavelets (Henrique Malvar, 1987) falls within the general framework of windowed Fourier analysis. The window is denoted by $w(t)$,
and it allows the signal $s(t)$ to be cut into "slices" that are regularly spaced in
time: $w(t - bl)s(t)$, $l = 0, \pm 1, \pm 2, \ldots$ The parameter $b > 0$ is the average length of
these "slices." Next, following Ville, one does a Fourier analysis on these slices,
which reduces to calculating the coefficients $\int e^{-i\alpha k t} w(t - bl) s(t)\, dt$, where $\alpha > 0$
must be related to $b$ and where $k = 0, \pm 1, \pm 2, \ldots$ This is thus the same as taking
the scalar products of the signal $s(t)$ with the "wavelets"

$w_{k,l}(t) = e^{i\alpha k t} w(t - bl)$.

This analysis technique was proposed by Gabor (1946), in which case the window $w(t)$
was the Gaussian. The "Gabor wavelets" lead to serious algorithmic difficulties
and, more generally, Low and Balian showed in the early 1980s that if $w(t)$
is sufficiently regular and well localized, then the functions $w_{k,l}$, $k, l \in \mathbb{Z}$, can
never be an orthonormal basis for $L^2(\mathbb{R})$. More precisely, if the two integrals
$\int_{-\infty}^{\infty} (1 + |t|)^2 |w(t)|^2\, dt$ and $\int_{-\infty}^{\infty} (1 + |\xi|)^2 |\hat{w}(\xi)|^2\, d\xi$ are both finite, the functions
$w_{k,l}$, $k, l \in \mathbb{Z}$, cannot be an orthonormal basis of $L^2(\mathbb{R})$.
The crude window defined by $w(t) = 1$ on the interval $[0, 2\pi)$ and $w(t) = 0$
elsewhere escapes this criterion. By choosing $\alpha = 1$ and $b = 2\pi$, the windowed
analysis consists in restricting the signal to each interval $[2\pi l, 2\pi(l+1))$ and using
Fourier series to analyze each of the corresponding functions. But the functions
obtained by this crude segmentation are not $2\pi$-periodic, and the Fourier analysis
will highlight this lack of periodicity and interpret it as a discontinuity or an
abrupt variation in the signal.
One way to attenuate these numerical artifacts, without, however, eliminating them, is to use the Discrete Cosine Transform (DCT). We describe the
continuous version of this transform.
On each interval $[2\pi l, 2\pi(l+1))$, we analyze the signal $s(t)$ using the orthonormal basis composed of the functions $\frac{1}{\sqrt{2\pi}}$ and $\frac{1}{\sqrt{\pi}} \cos(\frac{k}{2} t)$, $k = 1, 2, 3, \ldots$ If $s(t)$
is a very regular function, the numerical artifacts produced by the segmentation
are reduced from an order of magnitude $1/k$ to order $1/k^2$.
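The gain from $1/k$ to $1/k^2$ can be observed numerically. The following sketch is an illustration only: the test signal $s(t) = t$ on $[0, 2\pi)$, the midpoint-rule integrator, and all function names are assumptions chosen for this example and do not come from the text.

```python
import math

def integrate(func, a, b, n=20000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    step = (b - a) / n
    return step * sum(func(a + (i + 0.5) * step) for i in range(n))

def fourier_coefficient(s, k):
    """Modulus of the k-th Fourier-series coefficient of s on [0, 2*pi)."""
    re = integrate(lambda t: s(t) * math.cos(k * t), 0.0, 2 * math.pi)
    im = integrate(lambda t: s(t) * math.sin(k * t), 0.0, 2 * math.pi)
    return math.hypot(re, im) / (2 * math.pi)

def cosine_coefficient(s, k):
    """Modulus of the coefficient of s on (1/sqrt(pi)) * cos(k*t/2), k >= 1."""
    val = integrate(lambda t: s(t) * math.cos(k * t / 2), 0.0, 2 * math.pi)
    return abs(val) / math.sqrt(math.pi)

def ramp(t):
    # Regular on [0, 2*pi) but s(0) != s(2*pi), so not 2*pi-periodic.
    return t

if __name__ == "__main__":
    for k in (1, 3, 5, 9, 17):
        print(k, fourier_coefficient(ramp, k), cosine_coefficient(ramp, k))
```

For this particular signal the Fourier coefficients have modulus $1/k$, while the cosine coefficients with odd $k$ have modulus $8/(\sqrt{\pi}\, k^2)$ and those with even $k$ vanish.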
The physicist Kenneth Wilson was the first to have the idea that one could get
around the problem presented in the Balian-Low theorem by imitating the DCT
and using a segmentation created with very regular windows. Wilson alternated
the DCT with the Discrete Sine Transform (DST) according to whether $l$ is even
or odd; $l$ denotes the position of the interval, and the DST uses the orthonormal
basis of functions $\frac{1}{\sqrt{\pi}} \sin(\frac{k}{2} t)$, $k = 1, 2, 3, \ldots$
Wilson's construction has been the point of departure for numerous efforts,
the most notable of which is due to Ingrid Daubechies, Stéphane Jaffard, and
Jean-Lin Journé. They used a window $w(t)$ having the property that both it and
its Fourier transform decay exponentially, and they constructed $w(t)$ so that the
functions $u_{k,l}$, $k = 1, 2, 3, \ldots$, $l \in \mathbb{Z}$, and $u_{0,l}$, $l \in 2\mathbb{Z}$, defined by

(6.2)    $u_{k,l}(t) = \sqrt{2}\, w(t - 2l\pi) \cos(\frac{k}{2} t)$,    $l \in 2\mathbb{Z}$, $k = 1, 2, \ldots$,

(6.3)    $u_{0,l}(t) = w(t - 2l\pi)$,    $l \in 2\mathbb{Z}$, $k = 0$,

and

(6.4)    $u_{k,l}(t) = \sqrt{2}\, w(t - 2l\pi) \sin(\frac{k}{2} t)$,    $l \in 2\mathbb{Z} + 1$, $k = 1, 2, \ldots$,

constitute an orthonormal basis for $L^2(\mathbb{R})$.


Malvar did not know about this work. He discovered a family of orthonormal
bases $u_{k,l}(t)$, which have exactly the same algorithmic structure described by
(6.2), (6.3), and (6.4), but where the choice of the window $w(t)$ is simpler and
more explicit. In fact, Malvar had only these hypotheses:

(6.5)    $w(t) = 0$ if $t \le -\pi$ or $t \ge 3\pi$,

(6.6)    $0 \le w(t) \le 1$ and $w(2\pi - t) = w(t)$,

(6.7)    $w^2(t) + w^2(-t) = 1$ if $-\pi \le t \le \pi$.

Then the construction is the same, and the sequence $u_{k,l}$ defined by (6.2), (6.3),
and (6.4) is an orthonormal basis for $L^2(\mathbb{R})$. In Malvar's construction, the window $w(t)$ can be very regular (infinitely differentiable, for example), but the
Fourier transform of $w$ cannot have exponential decay. Condition (6.5) prevents
it, and this condition plays an essential role in the demonstrations.
Although similar, the solutions found by Daubechies, Jaffard, and Journé, on
the one hand, and by Malvar, on the other, are not linked by a logical connection.
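To make Malvar's hypotheses concrete, here is a minimal sketch of one window that satisfies them. The sine profile and the name `malvar_window` are assumptions made for illustration; the text only requires that the window vanish outside $[-\pi, 3\pi]$, be symmetric about $\pi$, and have squares that add up to 1 across the overlap. This particular profile is continuous but not infinitely differentiable.

```python
import math

def malvar_window(t):
    """One admissible window: a sine arch supported on [-pi, 3*pi]."""
    if t <= -math.pi or t >= 3 * math.pi:
        return 0.0                      # vanishes outside the support
    # Rises from 0 to 1 on [-pi, pi], then falls symmetrically on [pi, 3*pi].
    return math.sin((t + math.pi) / 4)

def check_hypotheses(samples=400):
    """Largest violation of the symmetry and sum-of-squares conditions on a grid."""
    worst = 0.0
    for i in range(samples + 1):
        t = -math.pi + 2 * math.pi * i / samples          # t in [-pi, pi]
        symmetry = abs(malvar_window(2 * math.pi - t) - malvar_window(t))
        squares = abs(malvar_window(t) ** 2 + malvar_window(-t) ** 2 - 1.0)
        worst = max(worst, symmetry, squares)
    return worst
```

Replacing the sine arch by a smoother profile with the same symmetries gives the infinitely differentiable windows mentioned above.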
6.3.

Windows with variable lengths.

Coifman and the author tackled the problem of modifying the preceding constructions to obtain windows with variable lengths that could be defined arbitrarily. The construction by Daubechies, Jaffard, and Journé does not extend to
this context, while that of Malvar generalizes, without the slightest difficulty, to
the case of arbitrary windows.


We begin with an arbitrary partition of the real line into adjacent intervals
$[a_j, a_{j+1}]$, where $\cdots < a_{-1} < a_0 < a_1 < a_2 < \cdots$, $\lim_{j \to +\infty} a_j = +\infty$, and
$\lim_{j \to -\infty} a_j = -\infty$. Write $l_j = a_{j+1} - a_j$ and let $\alpha_j > 0$ be positive numbers
that are small enough so that $l_j \ge \alpha_j + \alpha_{j+1}$ for all $j \in \mathbb{Z}$.
The windows $w_j(t)$ that we use will essentially be the characteristic functions
of the intervals $[a_j, a_{j+1}]$; the role played by the disjoint intervals $(a_j - \alpha_j, a_j + \alpha_j)$
is to allow the windows to overlap, which is necessary if we want the windows to
be regular.
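More precisely, we impose the following conditions:

(6.8)    $0 \le w_j(t) \le 1$ for all $t \in \mathbb{R}$,

(6.9)    $w_j(t) = 1$ if $a_j + \alpha_j \le t \le a_{j+1} - \alpha_{j+1}$,

(6.10)    $w_j(t) = 0$ if $t \le a_j - \alpha_j$ or $t \ge a_{j+1} + \alpha_{j+1}$,

(6.11)    $w_j^2(a_j + \tau) + w_j^2(a_j - \tau) = 1$ if $|\tau| \le \alpha_j$,

(6.12)    $w_{j-1}(a_j + \tau) = w_j(a_j - \tau)$ if $|\tau| \le \alpha_j$.

Clearly, these conditions allow the windows $w_j(t)$ to be infinitely differentiable.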


[Figure: two adjacent windows $w_{j-1}(t)$ and $w_j(t)$; they overlap on an interval of length $2\alpha_j$ around $a_j$, and $w_j(t)$ overlaps the next window on an interval of length $2\alpha_{j+1}$ around $a_{j+1}$.]

It is clear that $\sum_{j=-\infty}^{\infty} w_j^2(t) = 1$ identically on the whole real line.
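The identity for the sum of the squares can be checked on a small example. The sketch below assumes a sine-shaped attack and decay (one simple profile compatible with the window conditions, not the only one) and a hypothetical finite partition; `window`, `sum_of_squares`, `POINTS`, and `ALPHAS` are names invented for this illustration.

```python
import math

def window(t, a_left, a_right, alpha_left, alpha_right):
    """Window for [a_left, a_right] with sine-shaped attack and decay."""
    if t <= a_left - alpha_left or t >= a_right + alpha_right:
        return 0.0                                   # outside the support
    if t < a_left + alpha_left:                      # attack around a_left
        return math.sin(math.pi / 4 * (1 + (t - a_left) / alpha_left))
    if t <= a_right - alpha_right:                   # stationary part
        return 1.0
    # decay around a_right, mirror image of the next window's attack
    return math.sin(math.pi / 4 * (1 - (t - a_right) / alpha_right))

def sum_of_squares(t, points, alphas):
    """Sum of w_j(t)^2 over the windows of a finite partition."""
    total = 0.0
    for j in range(len(points) - 1):
        w = window(t, points[j], points[j + 1], alphas[j], alphas[j + 1])
        total += w * w
    return total

# Hypothetical partition with unequal lengths; each l_j >= alpha_j + alpha_{j+1}.
POINTS = [0.0, 1.0, 2.5, 3.0, 5.0]
ALPHAS = [0.3, 0.4, 0.2, 0.25, 0.5]
```

Because the partition here is finite, the sum equals 1 only between `POINTS[0] + ALPHAS[0]` and `POINTS[-1] - ALPHAS[-1]`; beyond these points the neighbouring windows of an infinite partition would be needed.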


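Finally, we come to the Malvar wavelets. They appear in two distinct forms.
The first is given by

(6.13)    $u_{j,k}(t) = \sqrt{\frac{2}{l_j}}\, w_j(t) \cos\left[\frac{\pi}{l_j}\left(k + \frac{1}{2}\right)(t - a_j)\right]$

with $k = 0, 1, 2, \ldots$ and $j \in \mathbb{Z}$.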


The second form consists in alternating the cosines and sines according to
whether $j$ is even or odd. Thus we have three distinct expressions for the second
form:

(6.14)    $u_{j,k}(t) = \sqrt{\frac{2}{l_j}}\, w_j(t) \cos\left[\frac{k\pi}{l_j}(t - a_j)\right]$ if $j \in 2\mathbb{Z}$ and $k = 1, 2, \ldots$,

(6.15)    $u_{j,0}(t) = \sqrt{\frac{1}{l_j}}\, w_j(t)$ if $j \in 2\mathbb{Z}$ and $k = 0$,

and

(6.16)    $u_{j,k}(t) = \sqrt{\frac{2}{l_j}}\, w_j(t) \sin\left[\frac{k\pi}{l_j}(t - a_j)\right]$ if $j \in 2\mathbb{Z} + 1$ and $k = 1, 2, \ldots$

The functions $u_{j,k}(t)$, $j \in \mathbb{Z}$, $k = 0, 1, 2, \ldots$, given by (6.13) are an orthonormal basis for $L^2(\mathbb{R})$, and the same is true for the functions defined by (6.14),
(6.15), and (6.16).
Here are two graphs of Malvar wavelets:

[Figure: graphs of two Malvar wavelets.]

Note the extreme similarity between the Malvar wavelets and the time-frequency atoms proposed by Liénard. In particular, the Malvar wavelets are
constructed with an attack (whose duration is $2\alpha_j$), a stationary period (which
lasts $l_j - \alpha_j - \alpha_{j+1}$), and then a decay (which lasts $2\alpha_{j+1}$). The ability to
choose, arbitrarily and independently, the duration of the attack, then that of
the stationary section, and finally the duration of the relaxation is precisely what
differentiates the Malvar wavelets from the preceding constructions (Gabor or
Daubechies-Jaffard-Journé).
Of course, it is important to make good use of the freedom of choice at our
disposal. We will see how to do this in the following sections.

6.4.

Malvar wavelets and time-scale wavelets.


In 1985, Pierre-Gilles Lemarié and I constructed a function $\psi(t)$ belonging to the
Schwartz class $\mathcal{S}(\mathbb{R})$ such that $2^{j/2}\psi(2^j t - k)$, $j, k \in \mathbb{Z}$, is an orthonormal basis
for $L^2(\mathbb{R})$. In addition, the Fourier transform of $\psi$ is zero outside the intervals
$[-\frac{8\pi}{3}, -\frac{2\pi}{3}]$ and $[\frac{2\pi}{3}, \frac{8\pi}{3}]$. We will see that these wavelets $2^{j/2}\psi(2^j t - k)$, $j, k \in \mathbb{Z}$,
constitute a particular case of the general Malvar construction.


This is quite surprising because the Lemarie-Meyer wavelets constitute a
"time-scale" algorithm, whereas the Malvar wavelets are a "time-frequency" algorithm. There is thus an incompatibility.
In fact, it is by analyzing the Fourier transform $\hat{f}$ of an (arbitrary) function $f$
in an appropriate Malvar basis that we arrive at the analysis by Lemarie-Meyer
wavelets.
We begin with the following observation: The Malvar wavelets allow us to
analyze functions defined on a half-line, and this is contrary to what can be done
with the Daubechies-Jaffard-Journé wavelets. The segmentation of $(0, \infty)$ that
we use is the "natural" division into dyadic intervals $[2^j, 2^{j+1}]$, $j \in \mathbb{Z}$. Then
it is natural to choose the windows $w_j(x)$, associated with these intervals, to
be of the form $w_j(x) = w(2^{-j}x)$. Thus the whole construction rests on the
precise choice for the function $w(x)$. For this, we make the following choices
in accordance with conditions (6.8)-(6.12): $w(x)$ is zero outside the interval
$[2/3, 8/3]$, $w(2x) = w(2 - x)$ for $2/3 \le x \le 4/3$, and $w^2(x) + w^2(2 - x) = 1$ on
the same interval. Then $a_j = 2^j$, $\alpha_j = \frac{1}{3} 2^j$, and $l_j = 2^j = \alpha_j + \alpha_{j+1}$.

[Figure: the windows $w(2x)$ and $w(x)$; the window $w(x)$ vanishes beyond $x = 8/3$.]

The Malvar wavelets of type (6.13) are then

(6.17)    $u_{j,k}(x) = \sqrt{2}\, 2^{-j/2} \cos[\pi(k + 1/2) 2^{-j} x]\, w(2^{-j} x)$.

We can just as easily replace all of the cosines in (6.13) by sines and thus
obtain a second orthonormal basis for $L^2[0, \infty)$ of the form

(6.18)    $v_{j,k}(x) = \sqrt{2}\, 2^{-j/2} \sin[\pi(k + 1/2) 2^{-j} x]\, w(2^{-j} x)$.

We next extend $w(x)$ to the whole real line by making it an even function:
$w(-x) = w(x)$. This gives a natural even extension for the functions $u_{j,k}$ and
an odd extension for the $v_{j,k}$.
Finally, the complete collection of extended functions

(6.19)    $\left\{ \frac{1}{\sqrt{2}} u_{j,k}(x),\ \frac{1}{\sqrt{2}} v_{j,k}(x);\ j \in \mathbb{Z},\ k = 0, 1, \ldots \right\}$

is an orthonormal basis for $L^2(\mathbb{R})$.


It follows that the set of functions $\frac{1}{2}(u_{j,k} + i v_{j,k})$, $\frac{1}{2}(u_{j,k} - i v_{j,k})$ is also an
orthonormal basis for $L^2(\mathbb{R})$.
Next, we observe that

(6.20)    $\frac{1}{2}(u_{j,k} + i v_{j,k})(x) = 2^{-(j+1)/2} w(2^{-j}x) \exp(i\pi(k + 1/2) 2^{-j} x)$

and that, by letting $k' = -1 - k$, we have

(6.21)    $\frac{1}{2}(u_{j,k} - i v_{j,k})(x) = 2^{-(j+1)/2} w(2^{-j}x) \exp(i\pi(k' + 1/2) 2^{-j} x)$.

The conclusion is that the sequence

(6.22)    $2^{-(j+1)/2} w(2^{-j}x) \exp(i\pi(k + 1/2) 2^{-j} x)$,    $j, k \in \mathbb{Z}$,

is an orthonormal basis for $L^2(\mathbb{R})$.

Denote the Fourier transform of the function $\frac{1}{\sqrt{2}} w(x) e^{i\pi x/2}$ by $\psi(t)$. The
function $\psi(t)$ is real and satisfies $\psi(\pi - t) = \psi(t)$. Then the sequence

(6.23)    $\frac{1}{\sqrt{2\pi}}\, 2^{j/2}\, \psi(2^j t - k\pi)$,    $j, k \in \mathbb{Z}$,

is an orthonormal basis for $L^2(\mathbb{R})$. To regain the usual form, $2^{j/2}\psi(2^j t - k)$,
$j, k \in \mathbb{Z}$, it suffices to replace $t$ by $\pi t$.
It is clearly possible to require that $w(x)$ be an infinitely differentiable function, in which case $\psi(t)$ will be a function in the Schwartz class $\mathcal{S}(\mathbb{R})$.
We recall the program of Ville. There were two possible approaches: either
segment the signal appropriately and follov. this by Fourier analysis, or pass the
signal through a bank of filters and then study the individual outputs of the
filter banks.
Here we have selected the second approach. The filter bank that we use is
defined by the transfer functions $w(2^{-j}x)$, where $w(x)$ is the even window used
above.
6.5.

Adaptive segmentation and the split-and-merge algorithm.

We are not going to create a segmentation ex nihilo, but we will modify an
existing segmentation to produce a new one. The modification operation is
described in this section. A segmentation is modified by adjusting the partition
$(a_j)$ that defines the segmentation, and this is done by iterating the following
elementary modifications: An elementary modification consists in suppressing
a point $a_j$ of the partition, which means that the two intervals $[a_{j-1}, a_j]$ and
$[a_j, a_{j+1}]$ are combined into a single interval, namely, $[a_{j-1}, a_{j+1}]$. The other
intervals remain unchanged. This operation is called "merging." The inverse
operation consists in adding an extra point $\alpha$ between the points $a_j$ and $a_{j+1}$,
which results in replacing the interval $[a_j, a_{j+1}]$ by the two intervals $[a_j, \alpha]$ and
$[\alpha, a_{j+1}]$. This inverse operation is called "splitting."
A split-and-merge algorithm provides a criterion to decide when and where
to use one or the other of these elementary operations.
We are going to examine the effect of these operations on a Malvar basis.
We will show that an elementary operation induces an elementary modification
of the basis that is very easy to calculate. The following remark is the point of
departure for this discussion.
For each fixed $j$, let $W_j$ denote the closed subspace of $L^2(\mathbb{R})$ generated by
the functions $u_{j,k}(t)$, $k = 0, 1, 2, \ldots$, described by (6.13). Then $f(t)$ belongs to
$W_j$ if and only if $f(t) = w_j(t)q(t)$, where $q(t)$ belongs to $L^2(a_j - \alpha_j, a_{j+1} + \alpha_{j+1})$
and satisfies the following two conditions: $q(a_j + \tau) = q(a_j - \tau)$ if $|\tau| \le \alpha_j$ and
$q(a_{j+1} + \tau) = -q(a_{j+1} - \tau)$ if $|\tau| \le \alpha_{j+1}$. There are no conditions that need to
be satisfied on the interval $[a_j + \alpha_j, a_{j+1} - \alpha_{j+1}]$.
To use the suggestive language of Ronald Coifman and Guido Weiss, $W_j$ is
a "dipole" with a positive polarity at $a_j$ and a negative polarity at $a_{j+1}$. The
intervals $[a_j - \alpha_j, a_j + \alpha_j]$ where the polarities interact are pairwise disjoint.
From here, the merging algorithm is trivial. Removing the point $a_j$ of the
partition amounts to replacing the two subspaces $W_{j-1}$ and $W_j$ by their direct
orthogonal sum $W_{j-1} \oplus W_j$ without disturbing any of the other spaces $W_{j'}$,
$j' \ne j - 1$ and $j' \ne j$. But this, in turn, comes down to replacing the two windows
$w_{j-1}$ and $w_j$ by the new window $\tilde{w}_j$ defined by $\tilde{w}_j(t) = (w_{j-1}^2(t) + w_j^2(t))^{1/2}$.
The two lengths $l_{j-1}$ and $l_j$ are replaced by $\tilde{l}_j = l_{j-1} + l_j$, which changes the
fundamental frequency in (6.13).
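The merging rule for windows can be checked numerically. In the sketch below the sine-shaped window profile and the names `window` and `merged_window` are assumptions made for illustration; the point is only that the square root of the sum of the squares of two adjacent windows coincides with a single window built directly on the merged interval with the outer overlap widths unchanged.

```python
import math

def window(t, a_left, a_right, alpha_left, alpha_right):
    """Window for [a_left, a_right] with sine-shaped attack and decay."""
    if t <= a_left - alpha_left or t >= a_right + alpha_right:
        return 0.0
    if t < a_left + alpha_left:
        return math.sin(math.pi / 4 * (1 + (t - a_left) / alpha_left))
    if t <= a_right - alpha_right:
        return 1.0
    return math.sin(math.pi / 4 * (1 - (t - a_right) / alpha_right))

def merged_window(t, a_prev, a_mid, a_next, alpha_prev, alpha_mid, alpha_next):
    """sqrt(w_{j-1}^2 + w_j^2): the window left after removing the point a_mid."""
    w_prev = window(t, a_prev, a_mid, alpha_prev, alpha_mid)
    w_curr = window(t, a_mid, a_next, alpha_mid, alpha_next)
    return math.sqrt(w_prev ** 2 + w_curr ** 2)

def merge_defect(a=(0.0, 1.0, 3.0), alpha=(0.25, 0.5, 0.4), samples=2000):
    """Largest gap between the merged window and the direct window on [a0, a2]."""
    worst = 0.0
    for i in range(samples + 1):
        t = -1.0 + 5.0 * i / samples
        direct = window(t, a[0], a[2], alpha[0], alpha[2])
        merged = merged_window(t, a[0], a[1], a[2], alpha[0], alpha[1], alpha[2])
        worst = max(worst, abs(direct - merged))
    return worst
```

The overlap width at the removed point plays no role in the result, which is why merging never changes the remaining numbers $\alpha_j$.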
To fix our ideas, we start with a segmentation with intervals of length 1,
$a_j = j$, and choose $w_j(t) = w(t - j)$ with $\alpha_j = 1/3$. Next, we look at which
windows can appear as a result of the merging algorithm, which consists in
removing intermediate points in the partition. The resulting windows and the
corresponding wavelets will look like centipedes (see the figure of the second
Malvar wavelet). The localization of these centipedes in the time-frequency plane
is not optimal. This is because, in using the merging algorithm, we never change
the values of the numbers $\alpha_j$. In our example, we always keep $\alpha_j = 1/3$. This
being the case, the algorithm allows us to replace only the partition consisting
of all the integer intervals of length 1 by a partition that has arbitrary integer
intervals $[a_j, a_{j+1}]$. The wavelets that appear are given by (6.13), and they are
approximately localized in the time-frequency plane around the rectangles

$R(j, k) = [a_j, a_{j+1}] \times \left[\frac{k\pi}{l_j}, \frac{(k+1)\pi}{l_j}\right]$.

We must now provide the criterion that allows us to decide when to use the
dynamic split-and-merge algorithm. This means that we need to establish a
numerical value to measure what is gained or lost by adding or deleting a point
in the subdivision. This is the purpose of the next section.

6.6.

The entropy of a vector with respect to an orthonormal basis.

Let $H$ denote a Hilbert space and let $(e_j)_{j \in J}$ be an orthonormal basis for $H$. Let
$x$ be a vector of $H$ of length 1. We write $x = \sum_{j \in J} \alpha_j e_j$; the entropy of $x$ relative
to the basis $e_j$ measures the number of significant terms in this decomposition.
This entropy is defined by $\exp(-\sum_{j \in J} |\alpha_j|^2 \log |\alpha_j|^2)$.
If we have a collection $(e_j^{\omega})_{j \in J}$ of orthonormal bases where $\omega$ ranges over a
set $\Omega$, we will choose for the analysis of $x$ the particular basis (indexed by $\omega_0$)
that yields the minimum entropy.
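The following sketch shows why this quantity can be read as a count of significant terms. The function name `entropy` and the convention $0 \log 0 = 0$ are assumptions made for the example.

```python
import math

def entropy(coefficients):
    """exp(-sum |a_j|^2 log |a_j|^2) for the coefficients of a unit vector."""
    total = 0.0
    for a in coefficients:
        p = abs(a) ** 2
        if p > 0.0:                 # convention: 0 * log 0 = 0
            total -= p * math.log(p)
    return math.exp(total)

if __name__ == "__main__":
    print(entropy([1.0, 0.0, 0.0]))      # a single basis vector
    print(entropy([0.5, 0.5, 0.5, 0.5])) # energy spread evenly over four terms
```

A vector carried by one basis element has entropy 1, and a unit vector spread evenly over $N$ basis elements has entropy $N$; the minimum-entropy basis is therefore the one in which the vector looks most concentrated.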
This point of view poses three problems:

(6.24)    Does an optimal basis exist?

(6.25)    A diagnostic on the transmitted signal can be made
impracticable by a compression algorithm whose
only objective is efficiency.

(6.26)    The underlying energy criterion (the square
of the norm in the Hilbert space $H$) can cause
certain information in the signal to be given
low priority, and this information can subsequently
disappear in the compression, even though
it may be crucial for the diagnostic.

In image analysis, all of the algorithms are based on calculations that are
done from an energy function, which is defined as the quadratic mean value of
the gray levels; the algorithm used to search for an optimal basis for compression
does not escape this difficulty. The search for a Hilbert norm that is adapted
exactly to the structure of the image is still an unsolved problem.
A concrete example of where these difficulties arise is a program to transmit
radiographic images within large hospitals over the ordinary telephone lines. The
compression needed for transmission and the quality of the received image that
is required for the doctor to make a diagnosis are clearly antagonistic objectives.
The interested reader can refer to [3], where this problem is discussed.
6.7.

The algorithm for finding the optimal Malvar basis.

We will examine in detail the particular case where the Hilbert space $H$ is the
space of signals $f(t)$ with finite energy, which is defined by $\int_{-\infty}^{\infty} |f(t)|^2\, dt$. The
quality of the compression will be measured only by this criterion.
The algorithm looks for "the best basis," which is the one that optimizes
compression based on the reduction of transmitted data. The search is done by
comparing the scores of a whole family of orthonormal bases of $L^2(\mathbb{R})$. These
are Malvar bases, and they are obtained from segmentations of the real line into
dyadic intervals. These intervals are systematically constructed in a scheme that
moves from "fine" to "coarse." This means that we start with a segmentation by
intervals of length $2^{-q}$, where $q \ge 0$ is large enough to capture the finest details
appearing in the signal.


The process consists in removing, if necessary, certain points in the segmentation and in replacing, at the same time, two contiguous dyadic intervals $I'$ and
$I''$ (appearing in the former segmentation) with the dyadic interval $I = I' \cup I''$.
For example, $[2, 3]$ and $[3, 4]$ can become $[2, 4]$ with the disappearance of 3, but
$[3, 4]$ and $[4, 5]$ will never become $[3, 5]$. For the point 4 to disappear, it would
be necessary to wait for the eventual merging of the intervals $[0, 4]$ and $[4, 8]$.
By an obvious change of scale, we may assume that $q = 0$. The starting point
is thus the segmentation where the "fine grid" is $\mathbb{Z}$. The intervals $[a_j, a_{j+1}]$ of
section 6.5 are now $[j, j + 1]$, and the first orthonormal basis to participate in the
competition will be

(6.27)    $u_{j,k}(t) = \sqrt{2} \cos\left[\pi\left(k + \frac{1}{2}\right)(t - j)\right] w(t - j)$,

where $j \in \mathbb{Z}$, $k = 0, 1, 2, \ldots$, as in (6.13).
The other orthonormal bases that participate in the competition will all be
obtained from this first one by merging. The regrouping algorithm that merges
two orthonormal bases into one was described in detail in section 6.5. We will limit its
application to situations where $[a_{j-1}, a_j]$ and $[a_j, a_{j+1}]$ are the left ($I'$) and
right ($I''$) halves of a dyadic interval $I$.
Each partition of the real line into dyadic intervals of length greater than or
equal to 1 thus defines canonically one of the orthonormal bases that are allowed
to participate in the competition.
One reaches all of these partitions by iterating those elementary operations
that combine the left and right halves of a dyadic interval and by traversing this
tree structure, starting from the "fine grid."
We denote by $\mathcal{I}$ the collection of all the dyadic intervals $I$ of length $|I| \ge 1$,
and if $I = [a_j, a_{j+1}]$ is one of these dyadic intervals, we denote by $w_I$ the window
that was denoted by $w_j$ in section 6.5. In the same way, we denote by $W_I$ the closed
subspace of $L^2(\mathbb{R})$ that was denoted by $W_j$ and by $u_I^{(k)}$ the orthonormal sequence
defined by (6.13), which is now an orthonormal basis for $W_I$.
If $I'$ and $I''$ are, respectively, the left and right halves of the dyadic interval
$I$, then we have $W_I = W_{I'} \oplus W_{I''}$ and this direct sum is orthogonal.
The signal $f(t)$ that we wish to analyze optimally is normalized by
$\int_{-\infty}^{\infty} |f(t)|^2\, dt = 1$. To simplify the following discussion, we assume in addition that $f(t)$ is zero outside the interval $[1, T]$ for some sufficiently large $T$.
Then $f$ belongs to $W_L$ if $L = [0, 2^l]$ and $l$ is large enough.
It is easy to show that if $m$ tends to $+\infty$, the entropy of $f$ in the orthonormal
basis $u_I^{(k)}$ of $W_I$, $I = [0, 2^m]$, also tends to infinity. Thus there exists some value
of $m$ after which $u_I^{(k)}$ no longer enters into the competition. In other words, the
dyadic partitions that come into play will, in fact, be the partitions of $L = [0, 2^m]$
(for sufficiently large $m$) into dyadic intervals $I$ of length $|I| \ge 1$. The number
of partitions is thus finite, but it can be incredibly large (the order of magnitude
being $2^{(2^m)}$). It remains to find a fast algorithm to search for the "best basis."
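The growth of the number of admissible partitions can be made explicit. The sketch below is not taken from the text: it relies on the observation that a dyadic partition of $[0, 2^m]$ is either the whole interval or a pair of dyadic partitions of its two halves, which gives the recursion $N_m = N_{m-1}^2 + 1$ with $N_0 = 1$. The function name is hypothetical.

```python
def count_dyadic_partitions(m):
    """Number of partitions of [0, 2**m] into dyadic intervals of length >= 1."""
    count = 1                          # m = 0: only the interval [0, 1] itself
    for _ in range(m):
        # Either keep the whole interval, or split it and partition both halves.
        count = count * count + 1
    return count

if __name__ == "__main__":
    for m in range(7):
        print(m, count_dyadic_partitions(m))
```

The count roughly squares at each step, so it grows doubly exponentially in $m$, which rules out an exhaustive comparison of all partitions.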
This is the algorithm that we are now going to describe. If $I$ belongs to $\mathcal{I}$,
then $c_I^{(k)} = \int f(t)\, u_I^{(k)}(t)\, dt$,

(6.28)    $\varepsilon(I) = -\sum_{k=0}^{\infty} |c_I^{(k)}|^2 \log |c_I^{(k)}|^2$,

and

(6.29)    $\varepsilon^*(I) = \inf \sum_p \varepsilon(J_p)$,

where the lower bound is taken over all the partitions $(J_p)$ of the interval $I$ into
dyadic intervals $J_p$ belonging to $\mathcal{I}$. If $I = [j, j + 1]$, then clearly $\varepsilon^*(I) = \varepsilon(I)$.
The problem that we must solve is thus clearly reduced to finding the optimal
partition $(J_p)$ when $I = L = [0, 2^m]$, the largest of the dyadic intervals involved in
the competition. The calculation of $\varepsilon^*(L)$ and the determination of the optimal
partition cannot be done directly because the number of cases to be considered
is too large. We will calculate $\varepsilon^*(I)$ for $|I| = 2^n$ by induction on $n$. For $n = 0$,
we must calculate $\varepsilon^*(I) = \varepsilon(I)$ for all intervals $I = [j, j + 1]$ in $[0, T]$. Next we
proceed by induction on $n$, assuming that we have calculated $\varepsilon^*(I)$ for $|I| = 2^n$
and that we have determined the corresponding covering $(J_p)$.
Suppose that $|I| = 2^{n+1}$ and let $I'$ and $I''$ be the left and right halves of $I$.
There are two cases:

(6.30)    If $\varepsilon(I) \le \varepsilon^*(I') + \varepsilon^*(I'')$, keep $I$ and forget all the preceding
information about $I'$ and $I''$; define $\varepsilon^*(I) = \varepsilon(I)$ and the
partition of $I$ is the trivial partition (consisting of only $I$).

(6.31)    If $\varepsilon(I) > \varepsilon^*(I') + \varepsilon^*(I'')$, set $\varepsilon^*(I) = \varepsilon^*(I') + \varepsilon^*(I'')$ and the
partition of $I$ is obtained by combining the partitions of
$I'$ and $I''$ that were used to calculate $\varepsilon^*(I')$ and $\varepsilon^*(I'')$.

Arriving at the "summit of the pyramid," that is to say, at $L$, we have, at
the same time, found the minimal entropy and the optimal partition of $L$, which
leads to the optimal basis.
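The induction can be sketched as a bottom-up pass over the dyadic tree. In the code below, `cost(start, length)` stands for the quantity $\varepsilon(I)$ of the interval $I = [\text{start}, \text{start} + \text{length}]$; how it is computed from the signal is left open, and the function name, the tuple encoding of intervals, and the toy cost table are all assumptions made for this illustration.

```python
def best_partition(cost, m):
    """Return (minimal total cost, optimal dyadic partition) for L = [0, 2**m]."""
    size = 2 ** m
    # Level n = 0: every unit interval is its own best partition.
    best = {(j, 1): (cost(j, 1), [(j, 1)]) for j in range(size)}
    length = 1
    while length < size:
        parent = 2 * length
        for start in range(0, size, parent):
            left_cost, left_part = best[(start, length)]
            right_cost, right_part = best[(start + length, length)]
            whole = cost(start, parent)
            if whole <= left_cost + right_cost:
                # Keep I and forget what was known about its halves.
                best[(start, parent)] = (whole, [(start, parent)])
            else:
                # Keep the halves and concatenate their partitions.
                best[(start, parent)] = (left_cost + right_cost,
                                         left_part + right_part)
        length = parent
    return best[(0, size)]

# Hypothetical cost table for m = 2, keyed by (start, length).
TOY_COSTS = {(0, 1): 1.0, (1, 1): 1.0, (2, 1): 3.0, (3, 1): 3.0,
             (0, 2): 5.0, (2, 2): 2.0, (0, 4): 10.0}

if __name__ == "__main__":
    print(best_partition(lambda s, n: TOY_COSTS[(s, n)], 2))
```

Each dyadic interval is examined once, so the work is proportional to the number of intervals in the tree (about $2^{m+1}$) rather than to the number of partitions.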
6.8.

An example where this algorithm works.

Consider a signal of the form $f(t) = g(t) + h^{-1/2} e^{i\omega t} g((t - t_0)/h)$, where $g(t) = e^{-t^2/2}$,
$0 < h < 1$, and where $\omega$ is a real number, which can be arbitrarily large. We will
be concerned with the limiting situation where $h$ is very small.
If this signal $f(t)$ is analyzed using the Malvar wavelets associated with a
regularly segmented grid ($a_j = j\alpha$), then the entropy of the decomposition is
necessarily greater than $C \log(1/h)$. Indeed, if the grid mesh is of order 1, the
term $h^{-1/2} e^{i\omega t} g((t - t_0)/h)$ is very poorly represented, whereas if the mesh is of
order $h$, the term $g(t)$ is very poorly represented.
We will show that the entropy of a decomposition of $f(t)$ can decrease to $C$
(a constant) by using the adaptive segmentation of the last section.
We assume, to fix ideas, that $h = 2^{-q}$ and that the initial grid is $2^{-q}\mathbb{Z}$.
The optimal partition in dyadic intervals is then formed from the sequence of
nested dyadic intervals $J_q \subset J_{q-1} \subset \cdots \subset J_0$ containing $t_0$ and having lengths
$2^{-q}$, $2 \cdot 2^{-q}$, ..., 1. To each $J_n$ we associate the two contiguous intervals of the
same length to the left and right of $J_n$. The extremities of the dyadic intervals
thus defined constitute the optimal segmentation for $f(t)$.
It is then easy to show that the entropy of $f(t)$ in the Malvar basis corresponding to this segmentation does not exceed a certain constant $C$.
The adaptive segmentation has allowed us to "zoom in" on the singularity
of $f(t)$, which is located at $t_0$. Thus, in this example, the optimal segmentation
algorithm has provided an interesting analysis of the signal $f(t)$.
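The segmentation described in this example is easy to generate. The sketch below only lists the endpoints of the nested dyadic intervals around $t_0$ and of their left and right neighbours; it does not compute any entropy, and the function name and the sample values of $t_0$ and $q$ are hypothetical.

```python
import math

def zoom_segmentation(t0, q):
    """Endpoints of the dyadic intervals J_0, ..., J_q around t0 and their neighbours."""
    points = set()
    for n in range(q + 1):
        length = 2.0 ** (-n)
        # J_n = [left, left + length] is the dyadic interval of length 2**-n containing t0.
        left = math.floor(t0 / length) * length
        for k in (-1, 0, 1, 2):
            # Endpoints of the left neighbour, of J_n, and of the right neighbour.
            points.add(left + k * length)
    return sorted(points)

if __name__ == "__main__":
    print(zoom_segmentation(0.3, 2))
```

The spacing of the points shrinks geometrically toward $t_0$, so the number of intervals grows only linearly in $q$ while the finest ones have length $2^{-q} = h$.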

6.9.

The discrete case.

We replace the real line $\mathbb{R}$ by the grid $h\mathbb{Z}$, where $h > 0$ is the sampling step.
Thus the signal $f$ is given by a sampling denoted $f(hk)$, $k = 0, \pm 1, \pm 2, \ldots$, but
we will not discuss here the technique used to arrive at this sampling. We will
systematically forget $h$ in all that follows.
A partition of $\mathbb{Z}$ will be defined by the intervals $[a_j, a_{j+1}]$, where $a_j - \frac{1}{2}$ is an
integer, in such a way that $a_j$ does not belong to $\mathbb{Z}$. This point of view has also
been adopted in the context of the DCT.
We denote the number of points belonging to $[a_j, a_{j+1}] \cap \mathbb{Z}$ by $l_j = a_{j+1} - a_j$,
and we let the numbers $\alpha_j > 0$ be small enough so that $\alpha_j + \alpha_{j+1} \le l_j$.
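The windows $w_j(t)$ will be subject to exactly the same conditions as in the
continuous case. This means that

(6.32)    $w_j(t) = 0$ outside the interval $[a_j - \alpha_j, a_{j+1} + \alpha_{j+1}]$,

(6.33)    $w_j(t) = 1$ on the interval $[a_j + \alpha_j, a_{j+1} - \alpha_{j+1}]$,

(6.34)    $0 \le w_j(t) \le 1$ and $w_{j-1}(a_j + \tau) = w_j(a_j - \tau)$ if $|\tau| \le \alpha_j$,

(6.35)    $w_j^2(a_j + \tau) + w_j^2(a_j - \tau) = 1$ if $|\tau| \le \alpha_j$.

Then the double sequence

$u_{j,k}(t) = \sqrt{\frac{2}{l_j}}\, w_j(t) \cos\left[\frac{\pi}{l_j}\left(k + \frac{1}{2}\right)(t - a_j)\right]$,    $0 \le k \le l_j - 1$, $j \in \mathbb{Z}$,

is an orthonormal basis for $l^2(\mathbb{Z})$.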
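Orthonormality in the discrete setting can be tested directly on a few blocks. The sketch below makes several assumptions that are not in the text: the discrete wavelets are taken to be $\sqrt{2/l_j}\, w_j(t) \cos[\frac{\pi}{l_j}(k + \frac{1}{2})(t - a_j)]$ sampled at the integers, the partition is uniform with $a_j = jl + \frac{1}{2}$, the overlap $\alpha$ is the same at every point, and the window has a sine-shaped attack and decay. All names are hypothetical.

```python
import math

def window(t, a_left, a_right, alpha):
    """Sine-profile window with the same overlap alpha at both ends."""
    if t <= a_left - alpha or t >= a_right + alpha:
        return 0.0
    if t < a_left + alpha:
        return math.sin(math.pi / 4 * (1 + (t - a_left) / alpha))
    if t <= a_right - alpha:
        return 1.0
    return math.sin(math.pi / 4 * (1 - (t - a_right) / alpha))

def malvar_vector(j, k, length, alpha, support):
    """Samples of u_{j,k} at the integers in `support`, with a_j = j*length + 1/2."""
    a_j = j * length + 0.5
    scale = math.sqrt(2.0 / length)
    return [scale * window(t, a_j, a_j + length, alpha)
            * math.cos(math.pi / length * (k + 0.5) * (t - a_j))
            for t in support]

def gram_error(length=4, alpha=1.5, blocks=3):
    """Largest deviation of the Gram matrix of the u_{j,k} from the identity."""
    support = range(-length, (blocks + 1) * length + 1)
    vectors = [malvar_vector(j, k, length, alpha, support)
               for j in range(blocks) for k in range(length)]
    worst = 0.0
    for p, u in enumerate(vectors):
        for q, v in enumerate(vectors):
            dot = sum(x * y for x, y in zip(u, v))
            worst = max(worst, abs(dot - (1.0 if p == q else 0.0)))
    return worst

if __name__ == "__main__":
    print(gram_error())
```

Placing the points $a_j$ at half-integers makes the integer grid symmetric about each $a_j$, which is what lets the even and odd symmetries of the cosines cancel the overlaps between neighbouring windows.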


Furthermore, nothing prevents us from considering a finite interval of integers
and replacing $l^2(\mathbb{Z})$ by $l^2\{1, \ldots, N\}$. Start with $a_0 = \frac{1}{2}$ and end with $a_{j_0+1} = N + \frac{1}{2}$.
We require that $w_0(t)$ be equal to 1 on $[1/2, a_1 - \alpha_1]$, and there is no other
constraint on this interval. Similarly, $w_{j_0}(t) = 1$ on $[a_{j_0} + \alpha_{j_0}, a_{j_0+1}]$
with no other constraint on the interval.
The Malvar wavelets thus exist in very different algorithmic settings, and it
is this that makes them superior to other analytic techniques (Gabor wavelets,
Grossmann-Morlet wavelets, etc.).


Bibliography

[1] R. COIFMAN AND Y. MEYER, Remarques sur l'analyse de Fourier à fenêtre, C. R. Acad. Sci. Paris Sér. I (1991), pp. 259-261.
[2] I. DAUBECHIES, The wavelet transform, time-frequency localization and signal analysis, IEEE Trans. Inform. Theory, 36 (1990), pp. 961-1005.
[3] S. E. ELNAHAS, KOU-HU TZOU, J. R. COX, R. L. HILL, AND R. G. JOST, Progressive coding and transmission of digital diagnostic pictures, IEEE Trans. Medical Imaging, MI-5, 2 (1986), pp. 73-83.
[4] H. S. MALVAR, Lapped transforms for efficient transform/subband coding, IEEE Trans. Acoust., Speech, Signal Process., 38 (1990), pp. 969-978.
[5] ——, Fast algorithm for modulated lapped transform, Electron. Lett., 27 (1991), pp. 775-776.
[6] ——, Signal processing with lapped transforms, Artech House, Norwood, MA, 1991.
[7] H. S. MALVAR AND D. H. STAELIN, Reduction of blocking effects in image coding with a lapped orthogonal transform, Proc. ICASSP 88, New York, April 1988, pp. 781-784.
[8] ——, The LOT: Transform coding without blocking effects, IEEE Trans. Acoust., Speech, Signal Process., 37 (1989), pp. 553-559.

CHAPTER 7

Time-Frequency Analysis and Wavelet Packets

7 .1.

Heuristic considerations.

A time-frequency analysis of a signal lets us express the signal as a linear combination of time-frequency atoms. These time-frequency atoms are essentially
characterized by
(a) an arbitrary duration, $t_2 - t_1$; $t_1$ is the instant when the signal is first
heard (if it is a speech signal, for example), and $t_2$ is the instant when it ceases
to be heard, and
(b) an arbitrary frequency, which is the average frequency; this average frequency is the frequency of the emitted note in the case of a musical signal, while
the frequency spectrum given by Fourier analysis takes into consideration the
parasitic frequencies created by the note's attack and decay.
We also think of a time-frequency atom as occupying a symbolic region in
the time-frequency plane. This symbolic region is a rectangle with area 211" (the
Heisenberg uncertainty principle).

[Figure: the symbolic rectangle of a time-frequency atom in the time-frequency plane, centered at the frequency $\omega_0$.]

The most famous example of time-frequency atoms is that of the Gabor wavelets. For these we have $f_R(t) = e^{i\omega_0 t} g_h(t - t_0)$, where $t_0 = \frac{1}{2}(t_1 + t_2)$ is the center of the time-frequency atom and $g_h(t) = h^{-1/2} g(t/h)$, $g(t) = \pi^{-1/4} e^{-t^2/2}$.
To say that the time-frequency atom $f_R(t)$ occupies the symbolic region $R$ of the time-frequency plane means that $f_R(t)$ is essentially supported by the interval $[t_1, t_2]$ and that the Fourier transform $\hat{f}_R(\xi)$ of $f_R(t)$ is essentially supported by the interval $[\omega_0 - \pi/h, \omega_0 + \pi/h]$. It is well known that there does not exist a function with compact support whose Fourier transform also has compact support. This leads one to consider the following, less stringent conditions:

(7.1)  $\int_{-\infty}^{\infty} (t - t_0)^2 \, |f_R(t)|^2 \, dt \le C^2 h^2,$

(7.2)  $\int_{-\infty}^{\infty} (\xi - \omega_0)^2 \, |\hat{f}_R(\xi)|^2 \, d\xi \le 2\pi C^2 h^{-2}.$

The time-frequency atom that optimizes this criterion (that is, for which the constant $C$ is the smallest possible) is precisely the Gabor wavelet, and the Gabor wavelet owes its success to this optimal localization in the time-frequency plane.
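As a check on this claim, the two integrals can be evaluated in closed form for the Gabor wavelet. The computation below is only a sketch; it assumes the Fourier transform convention $\hat{f}(\xi) = \int f(t) e^{-it\xi} dt$, which is the one compatible with the factor $2\pi$ in (7.2). Under that assumption, both conditions hold with equality for $C^2 = 1/2$, and the product of the two integrals equals $\pi/2$, the lower bound given by the Heisenberg inequality for a function of norm 1.

```latex
\begin{align*}
|f_R(t)|^2 &= \frac{1}{h\sqrt{\pi}}\, e^{-(t-t_0)^2/h^2},
 & \int_{-\infty}^{\infty} (t-t_0)^2 \, |f_R(t)|^2 \, dt &= \frac{h^2}{2}, \\
|\hat{f}_R(\xi)|^2 &= 2h\sqrt{\pi}\, e^{-h^2(\xi-\omega_0)^2},
 & \int_{-\infty}^{\infty} (\xi-\omega_0)^2 \, |\hat{f}_R(\xi)|^2 \, d\xi &= \frac{\pi}{h^2}
   = 2\pi \cdot \frac{1}{2} \cdot h^{-2}.
\end{align*}
```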
We will see, on the other hand, that the Gabor wavelets have a very disagreeable property, which makes them unsuitable for time-frequency signal analysis.

If the time-frequency atoms $f_R(t)$ were actually concentrated on rectangles $R$ in the time-frequency plane, they would enjoy the following property: If $R_1$ and $R_2$ are disjoint rectangles in the time-frequency plane, then

(7.3)  $\int_{-\infty}^{\infty} f_{R_1}(t) \, \overline{f_{R_2}(t)} \, dt = 0.$

We indicate the "proof" of this property. If $R_1$ and $R_2$ are disjoint, then either the horizontal sides of the rectangles are disjoint or the vertical sides are disjoint. In the first case, the supports of $f_{R_1}(t)$ and $f_{R_2}(t)$ (in $t$) are disjoint, and the integral (7.3) is zero. In the second case, the supports of the Fourier transforms $\hat{f}_{R_1}(\xi)$ and $\hat{f}_{R_2}(\xi)$ (in $\xi$) are disjoint, and the integral (7.3) is still zero, as we see by applying Parseval's identity. We know, in fact, that this cannot happen, and if $f_{R_1}(t)$ and $f_{R_2}(t)$ are Gabor wavelets, the integral (7.3) is never zero. But this integral is small if $R_1$ and $R_2$ are "remote," that is, if the rectangles $mR_1$ and $mR_2$ are disjoint. Here $m \ge 1$ is an integer, and $mR$ is the rectangle that has the same center as $R$ and whose sides are $m$ times the length of the sides of $R$. If $m$ is large, remoteness becomes a very strong condition.
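The behavior of the integral (7.3) for Gabor wavelets can be observed numerically. The following is a minimal sketch, assuming the Gabor wavelet $e^{i\omega_0 t} h^{-1/2} g((t - t_0)/h)$ with $g(t) = \pi^{-1/4} e^{-t^2/2}$; the helper names `gabor` and `inner`, the integration window, and the three sample atoms are choices made here for illustration. For two atoms with the same $h$, the modulus of the integral works out to $\exp(-(\Delta t)^2/4h^2 - h^2 (\Delta\omega)^2/4)$, which is small for remote atoms but never zero.

```python
import cmath
import math

def gabor(t, t0, w0, h):
    """Gabor wavelet exp(i*w0*t) * h**-0.5 * g((t - t0)/h), g(t) = pi**-0.25 * exp(-t*t/2)."""
    u = (t - t0) / h
    return cmath.exp(1j * w0 * t) * math.pi ** -0.25 * math.exp(-0.5 * u * u) / math.sqrt(h)

def inner(a, b, lo=-40.0, hi=40.0, n=16000):
    """Midpoint Riemann sum for the integral of f_a(t) * conj(f_b(t)) over [lo, hi]."""
    dt = (hi - lo) / n
    total = 0j
    for i in range(n):
        t = lo + (i + 0.5) * dt
        total += gabor(t, *a) * gabor(t, *b).conjugate()
    return total * dt

# Each atom is (t0, w0, h); the values are illustrative.
atom = (0.0, 5.0, 1.0)
near = (2.0, 5.0, 1.0)                  # shifted by 2 in time
far = (6.0, 9.0, 1.0)                   # shifted by 6 in time and 4 in frequency

overlap_near = abs(inner(atom, near))   # close to exp(-1)
overlap_far = abs(inner(atom, far))     # close to exp(-13): tiny, but not zero
```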
Eric Séré [5] has shown that remoteness of the rectangles $R_0, R_1, R_2, \ldots$ does not imply that the corresponding Gabor wavelets $f_{R_0}(t), f_{R_1}(t), f_{R_2}(t), \ldots$ are well separated from each other. More precisely, for every $m$ (no matter how large), there exist rectangles $R_0, R_1, \ldots$ in the time-frequency plane such that the rectangles $mR_j$ are pairwise disjoint and coefficients $\alpha_0, \alpha_1, \ldots$ such that

(7.4)  $\sum_{0}^{\infty} |\alpha_j|^2 = 1$

and

(7.5)  $\Big\| \sum_{0}^{\infty} \alpha_j f_{R_j}(t) \Big\|_2 = +\infty.$


Thus remoteness of the rectangles in the time-frequency plane does not even imply that the corresponding Gabor wavelets are almost orthogonal, and consequently the apparent heuristic simplicity of the time-frequency plane is completely misleading.
This catastrophic phenomenon results from the arbitrariness of the $h > 0$ that are used in the definition of the time-frequency atoms. The rectangles $R_0, R_1, \ldots$ in Séré's result have arbitrarily large eccentricity. When $h = 1$ all is well, and the corresponding situation has been studied extensively. This is then a form of windowed Fourier analysis where the sliding window is a Gaussian [4].
Once we abandon the Gabor wavelets we have two options: Malvar wavelets and wavelet packets. We will briefly indicate the advantages and disadvantages of these two options.

If we use Malvar wavelets, then by their nature, the duration of the attack or of the decay is not related to the duration of the stationary part. We can, for example, have a Malvar wavelet for which the durations of the attack and of the decay are of order 1, while the stationary part lasts $T \gg 1$. If $\omega_0$ is the frequency corresponding to this stationary part, then the Fourier transform of the Malvar wavelet will be, at best, of the form $\frac{\sin T(\xi - \omega_0)}{\xi - \omega_0}$, and it cannot satisfy (7.2) since $h$ is of the order of magnitude of $T$. On the other hand, the Malvar wavelets are constructed to be exactly orthogonal.
The implication of our observations is that the orthogonality of the Malvar wavelets has been won at the price of their localization in the time-frequency plane, a localization that no longer guarantees the "minimal conditions" (7.1) and (7.2).
One last remark about the Malvar wavelets is obvious but significant: Although they are given by a very simple formula, they are not obtained by translation, change of scale, and modulation (multiplication by $e^{i\omega t}$) of a fixed function $g(t)$.
The option we propose in the following pages is that of wavelet packets. The advantages of this option are the following:
(a) Daubechies's orthogonal wavelets (Chapter 3) are a particular case of wavelet packets.
(b) Wavelet packets are organized naturally into collections, and each collection is an orthonormal basis for $L^2(\mathbb{R})$.
(c) Thus one can compare the advantages and disadvantages of the various possible decompositions of a given signal in these orthonormal bases and select the optimal collection of wavelet packets for representing the given signal.
(d) Wavelet packets are described by a very simple algorithm $2^{j/2} w_n(2^j x - k)$, where $j, k \in \mathbb{Z}$, $n = 0, 1, 2, \ldots$, and where the supports of the $w_n(x)$ are in the same fixed interval $[0, L]$.
The integer $n$ plays the role of a frequency, and it can be compared to the integer $k = 0, 1, 2, \ldots$ that occurs in the definition of the Malvar wavelets.
The price to pay for these advantages is the same one that comes up when we use Malvar wavelets. Indeed, if, to facilitate intuition, we associate the rectangle $R$ in the time-frequency plane defined by $k 2^{-j} \le t < (k + 1) 2^{-j}$ and $n 2^j \le \xi < (n + 1) 2^j$ with the wavelet packet $2^{j/2} w_n(2^j t - k)$, then this choice does not meet the conditions (7.1) and (7.2). Furthermore, we cannot do better by assigning a frequency different from $n$ to $w_n(x)$, for although $\|w_n\|_2 = 1$,

(7.6)  $\limsup_{n \to \infty} \left\{ \inf_{\omega \in \mathbb{R}} \int_{-\infty}^{\infty} (\xi - \omega)^2 \, |\hat{w}_n(\xi)|^2 \, d\xi \right\} = +\infty.$
The frequency localization of wavelet packets is relatively poor, except for certain values of $n$, and hence the "lim sup" in (7.6).

7.2. The definition of basic wavelet packets.

We begin by defining a special sequence of functions $w_n(x)$, $n = 0, 1, 2, \ldots$, supported by the interval $[0, 2N - 1]$, where $N \ge 1$ is fixed at the outset. If $N = 1$, these functions $w_n(x)$ constitute the Walsh system, which, we recall, is a well-known orthonormal basis for $L^2[0, 1]$. If $N \ge 2$, the functions $w_n(x)$ are no longer supported by $[0, 1]$; however, the double sequence

(7.7)  $w_n(x - k), \quad n = 0, 1, 2, \ldots, \quad k \in \mathbb{Z},$

will be an orthonormal basis for $L^2(\mathbb{R})$. This orthonormal basis will allow us to do an orthogonal windowed Fourier analysis. Thus, for the moment, this construction is similar to the Malvar wavelets. The difference occurs when the dilations enter (the changes of variable of the form $x \to 2^j x$).
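We start with an integer $N \ge 1$ and consider two finite trigonometric sums

(7.8)  $m_0(\xi) = \frac{1}{\sqrt{2}} \sum_{0}^{2N-1} h_k e^{-ik\xi}, \qquad m_1(\xi) = \frac{1}{\sqrt{2}} \sum_{0}^{2N-1} g_k e^{-ik\xi}$

satisfying the following conditions:

(7.9)  $g_k = (-1)^{k+1} h_{2N-1-k}$  or  $m_1(\xi) = e^{-i(2N-1)\xi} \, \overline{m_0(\xi + \pi)},$

(7.10)  $m_0(0) = 1,$

and, finally,

(7.11)  $|m_0(\xi)|^2 + |m_0(\xi + \pi)|^2 = 1.$

One choice is given by

(7.12)  $|m_0(\xi)|^2 = 1 - c_N \int_0^{\xi} (\sin t)^{2N-1} \, dt,$

where $c_N = \left( \int_0^{\pi} (\sin t)^{2N-1} \, dt \right)^{-1}$, but other choices are possible [2].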


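As a first example take $m_0(\xi) = \frac{1}{2}(e^{-i\xi} + 1)$ and $m_1(\xi) = \frac{1}{2}(e^{-i\xi} - 1)$. Condition (7.11) reduces to $\cos^2(\xi/2) + \sin^2(\xi/2) = 1$.
A second choice is given by $\sqrt{2}\, h_0 = \frac{1}{4}(1 + \sqrt{3})$, $\sqrt{2}\, h_1 = \frac{1}{4}(3 + \sqrt{3})$, $\sqrt{2}\, h_2 = \frac{1}{4}(3 - \sqrt{3})$, $\sqrt{2}\, h_3 = \frac{1}{4}(1 - \sqrt{3})$.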
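The conditions on the filters can be checked numerically for this second choice. The sketch below assumes the four-coefficient filter with $\sqrt{2}\, h_k = \frac{1}{4}(1 + \sqrt{3}), \frac{1}{4}(3 + \sqrt{3}), \frac{1}{4}(3 - \sqrt{3}), \frac{1}{4}(1 - \sqrt{3})$, builds $g_k = (-1)^{k+1} h_{2N-1-k}$, and evaluates on a grid the normalization $m_0(0) = 1$, the identity $|m_0(\xi)|^2 + |m_0(\xi + \pi)|^2 = 1$, and the relation $m_1(\xi) = e^{-i(2N-1)\xi}\, \overline{m_0(\xi + \pi)}$. The function names and the grid size are illustrative choices.

```python
import cmath
import math

SQRT3 = math.sqrt(3.0)
# sqrt(2) * h_k = (1+sqrt3)/4, (3+sqrt3)/4, (3-sqrt3)/4, (1-sqrt3)/4  (here N = 2)
h = [c / (4.0 * math.sqrt(2.0)) for c in (1 + SQRT3, 3 + SQRT3, 3 - SQRT3, 1 - SQRT3)]
L = len(h)  # L = 2N
# g_k = (-1)^(k+1) * h_{2N-1-k}
g = [(-1) ** (k + 1) * h[L - 1 - k] for k in range(L)]

def m(coeffs, xi):
    """Trigonometric sum (1/sqrt 2) * sum_k c_k * exp(-i*k*xi)."""
    return sum(c * cmath.exp(-1j * k * xi) for k, c in enumerate(coeffs)) / math.sqrt(2.0)

def qmf_defect(xi):
    """Distance of |m0(xi)|^2 + |m0(xi+pi)|^2 from 1."""
    return abs(abs(m(h, xi)) ** 2 + abs(m(h, xi + math.pi)) ** 2 - 1.0)

def conjugate_defect(xi):
    """Distance between m1(xi) and exp(-i(2N-1)xi) * conj(m0(xi+pi))."""
    return abs(m(g, xi) - cmath.exp(-1j * (L - 1) * xi) * m(h, xi + math.pi).conjugate())

grid = [2.0 * math.pi * j / 64 for j in range(64)]
max_qmf = max(qmf_defect(xi) for xi in grid)
max_conj = max(conjugate_defect(xi) for xi in grid)
```

Both defects vanish up to rounding error, which is what one expects from a filter of this type.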

The wavelet packets $w_n(x)$, $n = 0, 1, 2, \ldots$, are then defined by induction on $n$ using the two identities

(7.13)  $w_{2n}(x) = \sqrt{2} \sum_{0}^{2N-1} h_k w_n(2x - k),$

(7.14)  $w_{2n+1}(x) = \sqrt{2} \sum_{0}^{2N-1} g_k w_n(2x - k),$

and by the condition $w_0(x) \in L^1(\mathbb{R})$, $\int_{-\infty}^{\infty} w_0(x) \, dx = 1$.


We explain the roles of these two identities. Identity (7.13), with $n = 0$, is

(7.15)  $w_0(x) = \sqrt{2} \sum_{0}^{2N-1} h_k w_0(2x - k),$

and the function $\varphi(x) = w_0(x)$ appears as a fixed point of the operator $T : L^1(\mathbb{R}) \to L^1(\mathbb{R})$ defined by

(7.16)  $Tf(x) = \sqrt{2} \sum_{0}^{2N-1} h_k f(2x - k),$

which becomes

(7.17)  $(Tf)^{\wedge}(\xi) = m_0(\xi/2) \, \hat{f}(\xi/2)$

by taking the Fourier transform.

When $f$ is normalized by $\int_{-\infty}^{\infty} f(x) \, dx = 1$, the fixed point is unique, and it is given by

(7.18)  $\hat{\varphi}(\xi) = m_0(\xi/2) \, m_0(\xi/4) \cdots m_0(\xi/2^j) \cdots$

[Figure: the first iterates $f_0$, $f_1$, $f_2$ of the scheme $f_{j+1} = T(f_j)$, with the coefficients $h_0, \ldots, h_3$ marked.]

This diagram illustrates the iterative scheme for constructing $\varphi$ using the characteristic function of $[0, 1]$ for the initial value $f_0$. Here, $f_{j+1} = T(f_j)$, and we have drawn the first few functions $f_0$, $f_1$, and $f_2$. The coefficients $h_0$, $h_1$, $h_2$, and $h_3$ are proportional to $\frac{1}{4}(1 + \sqrt{3}), \ldots, \frac{1}{4}(1 - \sqrt{3})$, as in the second example mentioned above. Then the sequence $f_j$ converges uniformly to the fixed point $\varphi(x)$.
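The iteration $f_{j+1} = T(f_j)$ is easy to carry out on step functions. The sketch below is one possible implementation, not the book's: it assumes the four-coefficient filter of the second example, represents $f_j$ by its values on the cells of width $2^{-j}$ in $[0, 3)$, and starts from the characteristic function of $[0, 1)$. The names `apply_T` and `cascade` are illustrative. Two invariants are worth watching: the integral of $f_j$ stays equal to 1, and so does its energy, because the integer translates of every iterate remain orthonormal.

```python
import math

SQRT3 = math.sqrt(3.0)
h = [c / (4.0 * math.sqrt(2.0)) for c in (1 + SQRT3, 3 + SQRT3, 3 - SQRT3, 1 - SQRT3)]
SUPPORT = len(h) - 1  # the iterates live on [0, 2N-1] = [0, 3]

def apply_T(values, level):
    """One step f -> Tf, with Tf(x) = sqrt(2) * sum_k h_k * f(2x - k).

    values[i] is the value of f on [i / 2**level, (i + 1) / 2**level);
    the result is given on the grid of step 2**-(level + 1).
    """
    step = 2 ** level  # a shift by k in x corresponds to k * step cells of f
    out = []
    for i in range(SUPPORT * 2 ** (level + 1)):
        total = 0.0
        for k, hk in enumerate(h):
            idx = i - k * step
            if 0 <= idx < len(values):
                total += hk * values[idx]
        out.append(math.sqrt(2.0) * total)
    return out

def cascade(n_steps):
    """Return f_{n_steps}, starting from f_0 = characteristic function of [0, 1)."""
    values = [1.0, 0.0, 0.0]  # f_0 on the unit grid of [0, 3)
    for level in range(n_steps):
        values = apply_T(values, level)
    return values

phi = cascade(8)                               # step-function approximation of phi
integral = sum(phi) / 2 ** 8                   # remains equal to 1
energy = sum(v * v for v in phi) / 2 ** 8      # remains equal to 1
```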


Once the function $\varphi$ is constructed, we use (7.14) with $n = 0$, which gives us $\psi = w_1$. The function $\psi(x)$ is the "mother" wavelet in the construction of orthonormal wavelet bases, and $\varphi(x)$ is the "father" wavelet. Next, we use (7.13) and (7.14) with $n = 1$ and obtain $w_2(x)$ and $w_3(x)$. By repeating this process we generate, two at a time, all of the wavelet packets.
The support of $\varphi(x)$ is exactly the interval $[0, 2N - 1]$ (see [2]), and it is easy to show that the supports of $w_n(x)$, $n \in \mathbb{N}$, are included in $[0, 2N - 1]$.
A very important result from this construction of wavelet packets is that the double sequence

(7.19)  $w_n(x - k), \quad n = 0, 1, 2, \ldots, \quad k \in \mathbb{Z},$

is an orthonormal basis for $L^2(\mathbb{R})$.
To be more precise, the subsequence derived from (7.19) by taking $2^j \le n < 2^{j+1}$ is an orthonormal basis for the orthogonal complement $W_j$ of $V_j$ in $V_{j+1}$. Recall that, in the language of multiresolution analysis, $V_j$ is the closed subspace of $L^2(\mathbb{R})$ spanned by the orthonormal basis $2^{j/2} \varphi(2^j x - k)$, $k \in \mathbb{Z}$, and similarly, $2^{j/2} \psi(2^j x - k)$, $k \in \mathbb{Z}$, is an orthonormal basis for $W_j$. Thus, the construction of wavelet packets appears as a change of orthonormal basis inside each $W_j$.
An interesting observation concerns the case where the filter has length 1 and $h_0 = h_1 = \frac{1}{\sqrt{2}}$, $g_1 = -g_0 = \frac{1}{\sqrt{2}}$. Let us recall the definition of the Walsh system. We let $r(x)$ denote the periodic function with period 1 that equals 1 on the interval $[0, 1/2)$ and $-1$ on the interval $[1/2, 1)$. To define the Walsh system $W_n(x)$, $n \in \mathbb{N}$, let $\chi(x)$ denote the characteristic function of the interval $[0, 1)$ and, for $n = \varepsilon_0 + 2\varepsilon_1 + \cdots + 2^j \varepsilon_j$, where $\varepsilon_i = 0$ or 1, write

$W_n(x) = [r(x)]^{\varepsilon_0} \, [r(2x)]^{\varepsilon_1} \cdots [r(2^j x)]^{\varepsilon_j} \, \chi(x).$

It is helpful to observe that $r^0 = 1$ and that $r^1 = r$. Then we can immediately verify that

$W_{2n}(x) = W_n(2x) + W_n(2x - 1)$

and that

$W_{2n+1}(x) = W_n(2x) - W_n(2x - 1).$

This shows that, in the case of filters of length 1, the construction of wavelet packets leads to the Walsh system.
The Walsh system $W_n(x)$, $n \in \mathbb{N}$, is an orthonormal basis for $L^2[0, 1]$, and it follows immediately that the double sequence $W_n(x - k)$, $n \in \mathbb{N}$, $k \in \mathbb{Z}$, is an orthonormal basis for $L^2(\mathbb{R})$.
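The two descriptions of the Walsh system can be compared on a computer. The sketch below samples the functions on $2^J$ cells of $[0, 1)$ (the value $J = 4$ and the function names are illustrative choices), builds $W_n$ once from the product of the functions $r(2^i x)$ and once from the recursion $W_{2n}(x) = W_n(2x) + W_n(2x - 1)$, $W_{2n+1}(x) = W_n(2x) - W_n(2x - 1)$, and checks that the first $2^J$ functions are orthonormal.

```python
J = 4                    # functions are sampled on 2**J cells of [0, 1)
CELLS = 2 ** J

def rademacher(x):
    """r(x): 1-periodic, equal to +1 on [0, 1/2) and -1 on [1/2, 1)."""
    return 1 if (x % 1.0) < 0.5 else -1

def walsh_product(n):
    """W_n from the product of r(2**i * x) over the binary digits of n, at cell midpoints."""
    samples = []
    for c in range(CELLS):
        x = (c + 0.5) / CELLS
        value, bits, scale = 1, n, 1
        while bits:
            if bits & 1:
                value *= rademacher(scale * x)
            bits >>= 1
            scale *= 2
        samples.append(value)
    return samples

def walsh_recursive(n):
    """W_n from W_2n(x) = W_n(2x) + W_n(2x-1) and W_2n+1(x) = W_n(2x) - W_n(2x-1)."""
    if n == 0:
        return [1] * CELLS
    parent = walsh_recursive(n // 2)
    compressed = parent[::2]             # samples of W_{n//2}(2x) on [0, 1/2)
    sign = 1 if n % 2 == 0 else -1
    return compressed + [sign * v for v in compressed]

def inner(u, v):
    """Inner product in L^2[0, 1] of two step functions given by their samples."""
    return sum(a * b for a, b in zip(u, v)) / CELLS

same = all(walsh_product(n) == walsh_recursive(n) for n in range(CELLS))
orthonormal = all(
    inner(walsh_product(m), walsh_product(n)) == (1.0 if m == n else 0.0)
    for m in range(CELLS) for n in range(CELLS)
)
```

The sampling is exact only for $n < 2^J$, since $W_n$ is then constant on each cell.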
In the general case of basic wavelet packets (filters longer than 1), the supports of $w_n(x - k)$ and $w_n(x - k')$ are not necessarily disjoint when $k \ne k'$, so that the orthogonality of the double sequence $w_n(x - k)$, $n \in \mathbb{N}$, $k \in \mathbb{Z}$, is more subtle.

7.3. General wavelet packets.

The basic wavelet packets are the functions $w_n(x)$, $n = 0, 1, 2, \ldots$ (which are derived from a filter $\{h_k\}$), and the sequence $w_n(x - k)$, $n \in \mathbb{N}$, $k \in \mathbb{Z}$, is an orthonormal basis for $L^2(\mathbb{R})$. This orthonormal basis is analogous to the Walsh system but, for filters longer than 1, it is more regular. That is, the frequency localization of the functions $w_n(x)$ is better than the frequency localization of the functions in the Walsh system. Nevertheless, this frequency localization does not yield an estimate of the type

(7.20)  $\inf_{\omega \in \mathbb{R}} \int_{-\infty}^{\infty} (\xi - \omega)^2 \, |\hat{w}_n(\xi)|^2 \, d\xi \le C$

uniformly in $n$ (see [2]).
The general wavelet packets are the functions

(7.21)  $2^{j/2} w_n(2^j x - k), \quad n \in \mathbb{N}, \quad j, k \in \mathbb{Z}.$

These are much too numerous to form an orthonormal basis. In fact, we can extract several different orthonormal bases from the collection (7.21). The choice $j = 0$, $n \in \mathbb{N}$, $k \in \mathbb{Z}$, leads to the orthonormal basis described in the previous section, while the choice $n = 1$, $j, k \in \mathbb{Z}$, leads to the orthonormal wavelet basis (Chapter 3).
We associate with each of the wavelet packets (7.21) the "frequency interval" $I(j, n)$ defined by $2^j n \le \xi < 2^j (n + 1)$. The following result describes certain sets of wavelet packets that make up orthonormal bases for $L^2(\mathbb{R})$.
THEOREM 7.1. Let $E$ be a set of pairs $(j, n)$, $j \in \mathbb{Z}$, $n \in \mathbb{N}$, such that the corresponding frequency intervals $I(j, n)$ constitute a partition of $[0, \infty)$, up to a countable set. Then the subsequence

(7.22)  $2^{j/2} w_n(2^j x - k), \quad (j, n) \in E, \quad k \in \mathbb{Z},$

is an orthonormal basis for $L^2(\mathbb{R})$.
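The hypothesis of the theorem is a purely combinatorial condition on the intervals $[2^j n, 2^j (n + 1))$, and it can be tested mechanically on a finite piece of the frequency axis. The sketch below is an illustration only: the theorem itself concerns infinite sets $E$, whereas the hypothetical helper `tiles` checks that finitely many intervals cover a bounded range $[lo, hi)$ without gap or overlap. Exact fractions are used so that negative $j$ causes no rounding.

```python
from fractions import Fraction

def interval(j, n):
    """Frequency interval I(j, n) = [2^j * n, 2^j * (n + 1)) with exact endpoints."""
    scale = Fraction(2) ** j
    return scale * n, scale * (n + 1)

def tiles(pairs, lo, hi):
    """True if the intervals I(j, n), (j, n) in pairs, tile [lo, hi) with no gap or overlap."""
    cursor = Fraction(lo)
    for left, right in sorted(interval(j, n) for j, n in pairs):
        if left != cursor:
            return False
        cursor = right
    return cursor == Fraction(hi)

# j = 0 and n = 0..7: the intervals [n, n+1) of the basic wavelet packets.
basic = [(0, n) for n in range(8)]
# n = 1 and j = -3..2: the dyadic bands [2^j, 2^(j+1)) of the wavelet basis.
wavelet = [(j, 1) for j in range(-3, 3)]
# A mixed choice: [0,1), [1,2), [2,4), [4,8).
mixed = [(0, 0), (0, 1), (1, 1), (2, 1)]
```

For instance, `tiles(basic, 0, 8)` and `tiles(mixed, 0, 8)` hold, while `tiles([(0, 0), (1, 0)], 0, 2)` fails because $[0, 1)$ and $[0, 2)$ overlap.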

Notice that choosing $E$ is choosing a partition of the frequency axis. This partitioning is "active," whereas the corresponding sampling with respect to the variable $x$ (or $t$) is passive and is dictated by Shannon's theorem.
Going back to Ville, wavelet packets lead to a signal analysis technique where the process is "first filter different frequency bands; then cut these bands into slices (in time) to study their energy variations."
Similarly, we refer to the methodology developed by Lienard: "The proposed analysis process contains the following steps: filtering with a zero-phase filterbank, and modeling the output signals into successive waveforms (channel-level modeling)."
When we have at our disposal a "library" of orthonormal bases, each of which can be used to analyze a given signal of finite energy, we are necessarily faced with the problem of knowing which basis to choose. We settle this problem with the same approach that we used for the Malvar wavelets: The optimal choice is given by the entropy criterion that we have already used in the preceding chapter. This entropy criterion provides an adaptive filtering of the given signal.

7.4. Splitting algorithms.

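Let $(\alpha_k)$ and $(\beta_k)$ be two sequences of coefficients, indexed by $k \in \mathbb{Z}$, and satisfying the following conditions: $\sum |\alpha_k|^2 < \infty$, $\sum |\beta_k|^2 < \infty$ and, by defining $m_0(\theta) = \sum \alpha_k e^{-ik\theta}$ and $m_1(\theta) = \sum \beta_k e^{-ik\theta}$, the matrix

$U(\theta) = \begin{pmatrix} m_0(\theta) & m_1(\theta) \\ m_0(\theta + \pi) & m_1(\theta + \pi) \end{pmatrix}$

is unitary.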

Consider a Hilbert space $H$ with an orthonormal basis $(e_k)_{k \in \mathbb{Z}}$ and define the sequence $f_k$, $k \in \mathbb{Z}$, of vectors in $H$ by

(7.23)  $f_{2k} = \sqrt{2} \sum_{l=-\infty}^{\infty} \alpha_{2k-l} \, e_l, \qquad f_{2k+1} = \sqrt{2} \sum_{l=-\infty}^{\infty} \beta_{2k-l} \, e_l.$

Then the sequence $(f_k)$, indexed by $k \in \mathbb{Z}$, is also an orthonormal basis for the Hilbert space $H$.
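A finite-dimensional experiment makes this statement concrete. The sketch below rests on two assumptions that are not in the text: the basis is finite, $e_0, \ldots, e_{M-1}$ with $M$ even, and the index $l$ in (7.23) is read modulo $M$. The coefficients are taken from the four-coefficient filter discussed earlier, with $\sqrt{2}\, \alpha_k = h_k$ and $\sqrt{2}\, \beta_k = g_k$. The functions `split` and `gram_defect` are illustrative helpers; the second one measures how far a family of vectors is from being orthonormal.

```python
import math

SQRT3 = math.sqrt(3.0)
h = [c / (4.0 * math.sqrt(2.0)) for c in (1 + SQRT3, 3 + SQRT3, 3 - SQRT3, 1 - SQRT3)]
g = [(-1) ** (k + 1) * h[len(h) - 1 - k] for k in range(len(h))]

def split(basis, low=h, high=g):
    """Apply (7.23) once, with sqrt(2)*alpha_k = low[k] and sqrt(2)*beta_k = high[k].

    basis is a list of M orthonormal vectors (M even); indices are taken modulo M.
    Returns the families (f_2k) and (f_2k+1), k = 0, ..., M/2 - 1.
    """
    M, dim = len(basis), len(basis[0])

    def combine(filt, k):
        vec = [0.0] * dim
        for m, c in enumerate(filt):          # m = 2k - l, hence l = 2k - m
            e = basis[(2 * k - m) % M]
            vec = [v + c * x for v, x in zip(vec, e)]
        return vec

    even = [combine(low, k) for k in range(M // 2)]
    odd = [combine(high, k) for k in range(M // 2)]
    return even, odd

def gram_defect(vectors):
    """Largest deviation of the Gram matrix of the vectors from the identity."""
    worst = 0.0
    for i, u in enumerate(vectors):
        for j, v in enumerate(vectors):
            dot = sum(a * b for a, b in zip(u, v))
            worst = max(worst, abs(dot - (1.0 if i == j else 0.0)))
    return worst

M = 8
canonical = [[1.0 if i == l else 0.0 for i in range(M)] for l in range(M)]
H0, H1 = split(canonical)      # orthonormal bases of H_0 and H_1
H00, H01 = split(H0)           # the same operation repeated on H_0
```

In this experiment the eight vectors of `H0 + H1` are orthonormal, and so are those of `H00 + H01 + H1`, which corresponds to splitting only the first subspace a second time.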
Next, let $H_0$ be the closed subspace of $H$ generated by the vectors $f_{2k}$, which we denote by $e_k^{(0)}$; similarly $H_1$ will be generated by $f_{2k+1} = e_k^{(1)}$, $k \in \mathbb{Z}$. Nothing prevents us from repeating on $(H_0, e_k^{(0)})$ the operation we have done on $(H, e_k)$ and from iterating these decompositions, while keeping the same coefficients $(\alpha_k)$ and $(\beta_k)$ at each step.
An elementary example helps us to understand the nature of this splitting algorithm. The initial Hilbert space is $L^2[0, 2\pi]$ with the usual orthonormal basis $e_k = \frac{1}{\sqrt{2\pi}} e^{ik\theta}$, $k \in \mathbb{Z}$. The ($2\pi$-periodic) functions $m_0(\theta)$ and $m_1(\theta)$ are (when restricted to $[0, 2\pi)$) the characteristic functions of $[0, \pi)$ and $[\pi, 2\pi)$. Then the vectors $f_{2k}$, $k \in \mathbb{Z}$, are $\frac{1}{\sqrt{\pi}} e^{2ik\theta} m_0(\theta)$ and constitute a Fourier basis for the interval $[0, \pi)$, while the vectors $f_{2k+1}$, $k \in \mathbb{Z}$, constitute a Fourier basis for the interval $[\pi, 2\pi)$. Finally, the subspace $H_0$ of $H$ is composed of the functions supported on the interval $[0, \pi]$, while $H_1$ is composed of the functions supported on $[\pi, 2\pi]$.
Iterating the splitting algorithm leads to subspaces that are naturally denoted by $H_{(\varepsilon_1, \ldots, \varepsilon_j)}$, $\varepsilon_1 = 0$ or 1, ..., $\varepsilon_j = 0$ or 1, or even by $H_I$, where $I$ denotes the dyadic interval of length $2^{-j}$ and origin $\varepsilon_1/2 + \cdots + \varepsilon_j/2^j$. In fact, in the example we have just studied, $H_I$ is exactly the subspace of $L^2[0, 2\pi]$ consisting of the functions that vanish outside the interval $2\pi I$.
This example has guided the intuition of scientists working in signal processing. Assuming that the signal is sampled on $\mathbb{Z}$, they have considered the situation where $(\alpha_k)$ and $(\beta_k)$ are two finite sequences and where $m_0(\theta)$ resembles the transfer function of a low-pass filter while $m_1(\theta)$ resembles that of a high-pass filter. One requires, at least, that $m_0(0) = 1$ and that $m_0(\theta)$ does not
vanish in the interval 1- i .
By analogy with the preceding example, these scientists were led to believe that the iterative scheme, which we have called the splitting algorithm, would provide a finer and finer frequency definition, as one wanders through the maze of "channels" in the following figure.

[Figure: the binary tree of "channels" generated by the splitting algorithm. $H$ splits into $H_0$ and $H_1$; these split into $H_{(0,0)}$, $H_{(0,1)}$, $H_{(1,0)}$, $H_{(1,1)}$; the next levels contain $H_{(0,0,0)}, \ldots, H_{(1,1,1)}$ and $H_{(0,0,0,0)}, \ldots, H_{(1,1,1,1)}$.]

The initial Hilbert space $H$ is the direct sum of various combinations of these subspaces. In particular, $H$ is the direct sum of all the subspaces at the same "splitting level": at the first level there are 2 subspaces, at the next level there are 4, then 8, 16, and so on.
To give a better understanding of the construction of wavelet packets and the exact nature of the splitting algorithm, consider the case where the initial Hilbert space is the space $V_j$, $j \ge 1$ (in the language of multiresolution analysis), with the orthonormal basis $2^{j/2} \varphi(2^j x - k)$, $k \in \mathbb{Z}$. Next, suppose that the splitting algorithm has operated $j$ times. Then we arrive exactly at the sequence of functions $w_n(x - k)$, $k \in \mathbb{Z}$, $0 \le n < 2^j$. More precisely, the integer $n = \varepsilon_0 + 2\varepsilon_1 + \cdots + 2^{j-1} \varepsilon_{j-1}$ is the index of the "frequency channel" $H_{(\varepsilon_0, \varepsilon_1, \ldots, \varepsilon_{j-1})}$.
The frequency localization of wavelet packets does not conform to the intuition of the scientists who introduced these algorithms, and the only case where there is a precise relation between the integer $n$ and a frequency in the sense of Fourier analysis is the case where $m_0(\theta)$ and $m_1(\theta)$ are the transfer functions of "ideal filters."

7.5. Conclusions.

It remains for us to indicate how to use wavelet packets. We begin by selecting, for use throughout the discussion, two sequences $h_k$, $g_k$, $0 \le k \le 2N - 1$, that satisfy the conditions for constructing wavelet packets. The choice of these sequences results from a compromise between the length $(2N - 1)$ of the filters and the quality of the frequency resolution. Once the filters are selected, we set in motion the algorithm for constructing the wavelet packets. From this process we obtain a huge collection of orthonormal bases for $L^2(\mathbb{R})$.
It is then a question of determining, for a given signal, the optimal basis. And again, the optimal basis is the one (among all those in the wavelet packet library) that gives the most compact decomposition of the signal.
We determine this optimal basis by using a "fine-to-coarse" type strategy and the method of merging. Thus we start from the finest frequency channels $H_I$, which are associated with the dyadic intervals $I$ of length $|I| = 2^{-m}$. The integer $m$ is taken to be as large as possible, consistent with the chosen precision. The algorithm proceeds by making the following decision: It combines the left and right halves, $I'$ and $I''$, of a dyadic interval $I$ whenever the orthonormal basis of $H_I$ yields a more compact representation than that obtained by using the two orthonormal bases of $H_{I'}$ and $H_{I''}$.
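The merging decision can be sketched in a few lines for a discrete signal. The code below is an illustration under several assumptions of our own: the signal has length $2^m$, each split uses sums and differences of adjacent pairs (the simplest filter, standing in for a general pair $h_k$, $g_k$), and compactness is measured by the entropy $-\sum p \log p$ with $p = c^2 / \|x\|^2$, which is one common form of entropy criterion. The names `haar_split`, `cost`, and `best_basis` are hypothetical. A node is written $(j, n)$, where $j$ is the splitting level and $n$ the position of the channel at that level.

```python
import math

def haar_split(c):
    """Split the coefficients of a node into its two children (adjacent sums and differences)."""
    s = 1.0 / math.sqrt(2.0)
    low = [s * (c[2 * i] + c[2 * i + 1]) for i in range(len(c) // 2)]
    high = [s * (c[2 * i + 1] - c[2 * i]) for i in range(len(c) // 2)]
    return low, high

def cost(coeffs, energy):
    """Entropy -sum p*log(p) with p = c^2 / energy; smaller means more compact."""
    total = 0.0
    for c in coeffs:
        p = c * c / energy
        if p > 0.0:
            total -= p * math.log(p)
    return total

def best_basis(signal, depth):
    """Return (cost, nodes) for the basis selected by fine-to-coarse merging."""
    energy = sum(v * v for v in signal)
    levels = [[list(signal)]]            # levels[j][n]: coefficients of node (j, n)
    for _ in range(depth):
        nxt = []
        for node in levels[-1]:
            nxt.extend(haar_split(node))
        levels.append(nxt)
    # Start from the finest channels, then decide level by level whether to merge.
    best = [(cost(c, energy), [(depth, n)]) for n, c in enumerate(levels[depth])]
    for j in range(depth - 1, -1, -1):
        merged = []
        for n, c in enumerate(levels[j]):
            own = cost(c, energy)
            left, right = best[2 * n], best[2 * n + 1]
            if own <= left[0] + right[0]:
                merged.append((own, [(j, n)]))      # the parent is at least as compact
            else:
                merged.append((left[0] + right[0], left[1] + right[1]))
        best = merged
    return best[0]

flat_cost, flat_nodes = best_basis([1.0] * 8, 3)             # constant signal
spike_cost, spike_nodes = best_basis([1.0] + [0.0] * 7, 3)   # single impulse
```

For the constant signal the search keeps the finest low-frequency channels, where a single coefficient carries all the energy; for the impulse it merges everything back to the root, that is, to the canonical basis.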
The discrete version of wavelet packets can also be used and is immediately available. This is obtained by starting with the Hilbert space $H = l^2(\mathbb{Z})$ of signals sampled on $\mathbb{Z}$ and the canonical orthonormal basis $(e_k)_{k \in \mathbb{Z}}$: $e_k$ is 1 at $k$ and 0 elsewhere. Here there is perfect resolution in position but no resolution in the frequency variable. Next, we systematically apply the splitting algorithm to improve the frequency definition until reaching the spaces $H_I$ associated with the dyadic intervals $I$ of length $|I| = 2^{-m}$. Finally, we apply the algorithm to choose the best basis (§6.7).
Wavelet packets offer a technique that is dual to the one given by the Malvar wavelets. In the case of wavelet packets, we effect an adaptive filtering, whereas the Malvar wavelets are associated with an adaptive segmentation.

Bibliography
[1] R. Coifman, Adapted multiresolution analysis, computation, signal processing and operator theory, ICM 90, Kyoto, Japan, Springer-Verlag.
[2] R. Coifman, Y. Meyer, and V. Wickerhauser, Size properties of wavelet packets, in Wavelets and Their Applications, M. B. Ruskai, G. Beylkin, R. Coifman, I. Daubechies, S. Mallat, Y. Meyer, L. Raphael, eds., Jones & Bartlett, Boston, MA, 1992, pp. 453-470.
[3] R. Coifman, Y. Meyer, S. Quake, and V. Wickerhauser, Signal processing and compression with wavelet packets, preprint, Yale University, New Haven, CT, 1990.
[4] I. Daubechies, The wavelet transform, time-frequency localization and signal analysis, IEEE Trans. Inform. Theory, 36 (1990), pp. 961-1005.
[5] E. Séré, Thesis, University of Paris-Dauphine, CEREMADE, Paris, France, 1992.

CHAPTER 8

Computer Vision and Human Vision

We propose, in this chapter, to describe and comment on a small part of David Marr's work. We limit our discussion to Marr's analysis of the "low-level" processing of luminous information by the retinal cells. Marr hypothesized that the coding of this luminous information was done by using the zero-crossings of the wavelet transform. This hypothesis leads us to state the famous "Marr conjecture" and then to state its precise form as conjectured by Mallat. This precise form yields a remarkably effective algorithm. We will see, however, that the Mallat conjecture is incorrect, and this poses some fascinating new problems.

8.1. Marr's program.

Vision, A Computational Investigation into the Human Representation and Processing of Visual Information appeared in 1982, and it is somewhat analogous to Descartes's Discours de la méthode. Exactly as Descartes did, Marr takes us into his confidence and speaks to us as if we were one of his friends or colleagues from his laboratory. Marr confides to us his intellectual progress and tells us about his doubts, his hopes, and his enthusiasms. He gives a lively description of the theories he has struggled with and rejected, and he explains his own research program to the reader with a sense of jubilation.

We recall that the goal of Marvin Minsky's group at the Massachusetts Institute of Technology (MIT) artificial intelligence laboratory was to solve the problem of artificial vision for robots. The challenge was to construct a robot endowed with a perception of its environment that enabled it to perform specific tasks. It turned out that the first attempts to construct a robot capable of understanding its surroundings were completely unsuccessful.
These surprising setbacks showed that the problem of artificial vision was much more difficult than it seemed. The idea then occurred to imitate, within the limits imposed by the technology of robotics, certain solutions found in nature. Marr, who was an expert on the human visual system, was invited to join the MIT group and leave Cambridge, England, for Cambridge, Massachusetts.
According to Marr, the disappointments of the robotics scientists were due to having skipped a step. They had tried to go directly from the statement of the problem to its solution without having at hand the basic scientific understanding that is necessary to construct effective algorithms.

Marr's first premise is that there exists a science of vision, that it must be developed, and that once there has been sufficient progress, the problems posed by vision in robots can be solved.
Marr's second premise is that the science of human vision is no different from the science of robotic (or computer) vision.
Marr's third premise is that it is as vain to imitate nature in the case of vision as it would have been to construct an airplane by imitating the form of birds and the structure of their feathers. On the other hand, he notes that the laws of aerodynamics explain the flight of birds and enable us to build airplanes.
Thus it is important, as much for human vision as for computer vision, to establish scientific foundations rather than blindly seeking solutions.
To develop this basic science, one must carefully define the scope of inquiry. In the case of human vision, one must clearly exclude everything that depends on training, culture, taste, etc., for instance, the ability to distinguish the canvas of a master from that of an imitator. One retains only the mechanical or involuntary aspects of vision, that is, those aspects that enable us to move around, to drive a car, and so on.
We thus limit the following discussion to low-level vision: This is the aspect of vision that enables us to recreate the three-dimensional organization of the physical world around us from the luminous excitations that stimulate the retina.
The notion that low-level vision functions according to universal scientific algorithms seemed, to some, an implausible idea, and it encountered two kinds of opposition. In the first place, neurophysiologists had discovered certain cells having specific visual functions. But Marr objected to this reductionist approach to the problems of vision, and he offered two criticisms on this subject:
(a) After several very stimulating discoveries, neurophysiologists had not made sufficient progress to enable them to explain the action of the human visual system based on a collection of ad hoc cells.
(b) It would be absurd to look for the cell that lets you immediately recognize your grandmother.
On another front, Marr objected to attempts by psychologists to relate the performance of the human visual system to a learning process: We recognize the familiar objects of our environment by dint of having seen and touched them simultaneously. Bela Julesz had made a fundamental discovery that eliminated this as a working hypothesis.
Julesz made a systematic study of the response of the human visual system when it was presented with completely artificial images (synthetic images having no significance), which were computer-generated, random-dot stereograms. If these synthetic images presented a certain "formal structure" that stimulated stereovision reflexes, the eye deduced, in several milliseconds and without the slightest hesitation, a three-dimensional organization of the image. This organization "in relief" is clearly only a mirage in which the mechanism of stereovision finds itself trapped. This mechanism acts with the same speed, the same quality, and the same precision as if it were a matter of recognizing familiar objects. Thus familiarity with the objects that one sees plays no role in the primary mechanisms of vision. Marr set out to understand the algorithmic architecture of these mechanisms.
This venture can be compared to that of the 17th century physiologists who studied the human body by comparing it with a complex and subtle machine, an assembly of bones, joints, and nerves whose functioning could be explained, calculated, and predicted by the same laws that applied to winches and pulleys. A century and a half later, Claude Bernard made a similar connection between the organic functioning of the human body and progress in the nascent field of organic chemistry. The synthesis of urea (Wöhler, 1828) again reduced the gap between the chemistry of life and organic chemistry.
In their scientific approach, these researchers relied on solid, well-founded knowledge, which came either from mechanics or from chemistry. They then tried to effect a technology transfer and to apply results acquired in the study of matter to life science.
But what Marr set out to do was much more difficult because the relevant knowledge base, namely, an understanding of robotics, was too tenuous to serve as the nucleus for an explanation of the human visual system.
Marr asserts that the problems posed by human vision or by computer vision are of the same kind and that they are part of a coherent and rigorous theory, an articulate and logical doctrine.
It is advisable, at the outset, to set aside any consideration of whether the results will ultimately be implemented with copper wires or nerve cells and to limit the investigation to the four properties of human vision that we wish to imitate or reproduce in robots. These are
(a) the recognition of contours of objects, the contours that delimit objects and structure the environment into distinct objects,
(b) the sense of the third dimension from two-dimensional retinal images and the ability to arrive at a three-dimensional organization of the physical world,
(c) the extraction of relief from shadows, and
(d) the perception of movement in an animated scene.
How is it scientifically possible to define the contours of objects from the
variations of their light intensity?
How is it possible to sense depth?
How is movement sensed? How do we recognize that an object bas moved
by examining a succession of_i~ages?
M.arr opened a very active
of contemporary scientific research by giving
each of these problems a precise algorithmic formulation and by furnishing parts
of thE' solution in the form of algorithms.
Marrs working hypothesis is that human vision and computer 'ision face the
samE' problems. Thus thE' algorithmic solutions can and must be tested 111.ithin
t he framework of robotics technology and artificial vision.
In case of success. it is necessary to investigate whether these algorithms art>
physiologically realistic. For example, Marr did not believe that neuronal circuits
used iterative loops. which are an essential aspect. of thE' existing algorithms.
Thi!' discussion raise$ th<' basic- problem of knowing the nature of thr reprr
.~rntatr cm (th!' terminology i~< dut w Marr) on 111.hich th<' alf!;orithm~ act. Marr

area

CHAPTEk 8

104

uses a simple comparison to help us underStand the implications of a representation. If the problem at band was adding integers, then the representation of
the integers could be given in the Roman system, in the decimal system, or in
the binary system. These three systems provide three represent4tions of the integers. But the algorithms used for addition will be different in the three cases,
and they wiU vary greatly in difficulty. This shows that the choice of this or that
repre$ento.tion involves significant cooaequences.
Marr was truly fascinated by the esaential and subtle role played by a representation. He wrote: "A representation, therefore, is not a foreign idea at all- we
use representations all the time. However, the notion that one can capture somt"
aspect of reality by making a description of it using a symbol and that to do so
can be useful seems to me a fascinating and powerful idea. . . r

8.2. The theory of zero-crossings.

Marr felt that image processing in the human visual system has a complex hierarchical structure, involving several layers of processing. The "low-level processing" furnishes a representation that is used by later stages of visual information processing. Based on a very precise analysis of the functioning of the ganglion cells, Marr was led to this hypothesis: The basic representation ("the raw primal sketch") furnished by the retinal system is a succession of sketches at different scales and these scales are in geometric progression. These sketches are made with lines, and these lines are the famous zero-crossings that Marr uses in the following argument:
TM firat of the three stages described above concerns the detection of intensity
changes. The two idea.s undering their detection are ( 1) that intensity ch4nges
occur at different scales in an image, and 40 their optimal detection requires the
we of operators of different sizes, and {2) that a sudden intensity change will
give rise to a peak or trough in the first deriootive or, equivalently, to a zerocrossing in the second derivative. These idea.s suggest that in order to detect
intensity changes efficiently, one should seo.rch for a filter that has two salient
charocteristics. First and foremost. it should be a differential operator, taking
either a first or second derivative of the image. Second. it should be capable of
being tuned to act at any desired scale. so that large filters can br used to detect
blurry shtulow edges. and 8mall ones to detect sharply focused fin e details in the
image.
Marr and Hildreth (1980) argued that the most satisfactory operator fulfilling those conditions is the filter $\Delta G$, where $\Delta$ is the Laplacian operator $\partial^2/\partial x^2 + \partial^2/\partial y^2$ and $G$ stands for the two-dimensional Gaussian distribution $G(x, y) = e^{-(x^2+y^2)/2\sigma^2}$, which has standard deviation $\sigma$. $\Delta G$ is a circularly symmetric Mexican-hat-shaped operator whose distribution in two dimensions may be expressed in terms of the radial distance $r$ from the origin by the formula
$$\Delta G(r) = -\frac{2}{\sigma^2}\left(1 - \frac{r^2}{2\sigma^2}\right) e^{-r^2/2\sigma^2}.$$

We observe that $\Delta G(r)$ is, up to a normalizing factor depending on $\sigma$, the dilated copy $\psi(x/\sigma, y/\sigma)$ of a fixed function $\psi(x, y)$, the wavelet that everyone today calls the Marr wavelet; we write $\psi_\sigma$ for this dilated wavelet. If a black and white image is defined by
the gray levels $f(x, y)$, the zero-crossings of Marr's theory are the lines of the
equation $(f * \psi_\sigma)(x, y) = 0$. Since the function $\psi(x, y)$ is even, the convolution
product $(f * \psi_\sigma)(x, y)$ is (up to a proportionality factor) the wavelet coefficient
of $f$, analyzed with the wavelet $\psi$. Thus the zero-crossings are defined by the
vanishing of the wavelet coefficients.
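As a rough one-dimensional illustration of this multiscale zero-crossing idea, the following Python sketch filters a sampled step edge with the second derivative of a Gaussian and lists the sign changes of the result at three scales in geometric progression with ratio 1.75. The unit grid spacing, the base scale 1, the truncation of the kernel at four standard deviations, and all function names are assumptions made for the example; none of them comes from Marr's text.

```python
import math

def marr_kernel_1d(sigma):
    """Samples of the second derivative of a Gaussian of standard deviation sigma."""
    radius = math.ceil(4 * sigma)  # assumed truncation of the kernel
    norm = 1.0 / (sigma * math.sqrt(2 * math.pi))
    return [((k * k) / sigma ** 4 - 1 / sigma ** 2) * norm
            * math.exp(-k * k / (2 * sigma ** 2))
            for k in range(-radius, radius + 1)]

def zero_crossings(signal, sigma):
    """Indices i such that the filtered signal changes sign between i and i + 1."""
    kernel = marr_kernel_1d(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    # Only positions where the kernel fits entirely inside the signal are used.
    filtered = {}
    for i in range(radius, n - radius):
        filtered[i] = sum(kernel[k + radius] * signal[i - k]
                          for k in range(-radius, radius + 1))
    return [i for i in range(radius, n - radius - 1)
            if filtered[i] * filtered[i + 1] < 0]

# A single sharp edge between samples 99 and 100.
signal = [0.0] * 100 + [1.0] * 100
scales = [1.75 ** j for j in range(3)]
crossings = {sigma: zero_crossings(signal, sigma) for sigma in scales}
```

In this toy case every scale reports the same single sign change, located at the edge; for a less regular signal the coarse and fine scales would generally disagree, which is why the whole family of scales is kept.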
It remains to specify the values of $\sigma$. These values, in geometric progression,
were discovered by Campbell, Robson, Wilson, Giese, and Bergen, based on neurophysiological experiments. These experiments led to the values $\sigma_j = (1.75)^j \sigma_0$.
Marr's conjecture is that the original image $f(x, y)$ is completely determined
by the sequence of lines where the functions $(f * \psi_{\sigma_j})(x, y)$ are zero. Interest
in this representation of an image stems from its invariance under translations,
rotations, and dilations (by powers of 1.75).
We quote Marr: "Zero-crossings provide a natural way of moving from an
analogue or continuous representation like the two-dimensional image intensity
values $I(x, y)$ to a discrete, symbolic representation. A fascinating thing about
this representation is that it probably incurs no loss of information. The arguments supporting this are not yet secure . . . "
In the following pages, we propose to study Marr's conjecture. We will show
first of all that it is incorrect for periodic images covering an unbounded area.
This does not exclude the possibility that the conjecture is true for bounded
images, that is, images having finite extent.
We will then examine Mallat's conjecture, which is a version of Marr's conjecture. Mallat's conjecture provides an explicit algorithm for reconstructing
the image. This algorithm works very well in spite of the fact that Mallat's
conjecture is false. The counterexample that we construct is, in a certain sense,
more realistic than the one we present in the case of Marr's conjecture. This
counterexample raises an exciting problem: Why does Mallat's algorithm work
so well?

8.3. A counterexample to Marr's conjecture.

We begin with a counterexample in one dimension. It will then be easy to
transform it into a two-dimensional counterexample. This counterexample has
the property of being periodic in $x$ (or in $x$ and $y$ in the two-dimensional case).
We do not know how to construct other counterexamples.
Consider all the functions $f(x)$ of the real variable $x$, having real values, and
given by the series
$$f(x) = \sin x + \sum_{k=2}^{\infty} a_k \sin kx, \tag{8.1}$$
where
$$\sum_{k=2}^{\infty} k^3 |a_k| < 1. \tag{8.2}$$
We are going to show that all choices of the coefficients $a_k$ lead to the
same zero-crossings. For example, $\sin x$ and $\sin x + a_2 \sin 2x$ with $8|a_2| < 1$ have the same zero-crossings. We verify this assertion by systematically applying the following simple observation: If $u(x)$ and $v(x)$ are two continuous functions of $x$, and if, for
some constant $r \in [0, 1)$, $|v(x)| \le r|u(x)|$ for all $x$, then $u(x) + v(x) = 0$ is
equivalent to $u(x) = 0$.
Returning to (8.1), we define $g_\delta(x) = \frac{1}{\delta\sqrt{2\pi}} e^{-x^2/2\delta^2}$. Then
$$(f * g_\delta)(x) = e^{-\delta^2/2} \sin x + \sum_{k=2}^{\infty} a_k e^{-k^2\delta^2/2} \sin kx.$$
It follows from this that
$$-\frac{d^2}{dx^2}(f * g_\delta)(x) = e^{-\delta^2/2} \sin x + \sum_{k=2}^{\infty} k^2 a_k e^{-k^2\delta^2/2} \sin kx = u(x) + v(x).$$
Observe that $|\sin kx| \le k|\sin x|$, which implies that $|v(x)| \le r|u(x)|$, where
$r = \sum_{k=2}^{\infty} k^3 |a_k| < 1$. Thus the zero-crossings of all the functions $f(x)$ are
$x = m\pi$, $m \in Z$.
If we wish to have $0 \le f(x) \le 1$, it is sufficient to add a suitable constant to
$f(x)$ (defined by (8.1)) and then to renormalize the result by multiplication with
a suitable positive constant. These two operations do not change the positions
of the zero-crossings.
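The argument can be checked numerically. The sketch below evaluates the closed-form expression for $-\frac{d^2}{dx^2}(f * g_\delta)$ on a grid of $(0, 2\pi)$ and compares the sign changes obtained for $f(x) = \sin x$ with those obtained for a perturbed $f$. The particular coefficients $a_2 = 0.1$, $a_3 = 0.005$, the four values of $\delta$, the grid size, and the function names are illustrative assumptions; only the condition (8.2) is taken from the text.

```python
import math

def neg_second_derivative(x, delta, coeffs):
    """-(d^2/dx^2)(f * g_delta)(x) for f(x) = sin x + sum_k a_k sin(kx), in closed form."""
    value = math.exp(-delta ** 2 / 2) * math.sin(x)
    for k, a_k in coeffs.items():
        value += k ** 2 * a_k * math.exp(-(k * delta) ** 2 / 2) * math.sin(k * x)
    return value

def zero_crossings(delta, coeffs, samples=2000):
    """Midpoints of the grid cells of (0, 2*pi) in which the function changes sign."""
    # The half-cell offset keeps the grid away from the exact zeros 0, pi, 2*pi.
    xs = [2 * math.pi * (i + 0.5) / samples for i in range(samples)]
    values = [neg_second_derivative(x, delta, coeffs) for x in xs]
    return [(xs[i] + xs[i + 1]) / 2
            for i in range(samples - 1) if values[i] * values[i + 1] < 0]

plain = {}                      # f(x) = sin x
perturbed = {2: 0.1, 3: 0.005}  # sum of k^3 |a_k| = 0.8 + 0.135 = 0.935 < 1
assert sum(k ** 3 * abs(a) for k, a in perturbed.items()) < 1

scales = [0.25, 0.5, 1.0, 2.0]
same = all(zero_crossings(d, plain) == zero_crossings(d, perturbed) for d in scales)
```

On this grid both signals produce a single sign change in $(0, 2\pi)$, in the cell centered at $\pi$, at every scale tried, so the two signals cannot be told apart from their zero-crossings.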
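A nontrivial two-dimensional counterexample is
$$f(x, y) = \sin x \sin y + \sum_{k=2}^{\infty} a_k \sin kx \sin ky,$$
where the coefficients $a_k$ are assumed small enough for the one-dimensional argument to carry over.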

8.4. Mallat's conjecture.

The existence of these counterexamples and several remarks Marr made in his
book led Stephane Mallat to a more precise version of Marr's conjecture. He
gave it a formulation that was compatible with the progress made in the early
1980s on pyramid algorithms for numerical image processing.
Mallat observed that numerical image processing using certain kinds of pyramid algorithms (quadrature mirror filters) and Marr's approach represented two
particular examples of wavelet analysis of an image.
In fact, one has $\Delta(f * g_\delta) = \delta^{-2} f * \psi_\delta$, where
$$\psi(x, y) = -\frac{1}{\pi}\left(1 - \frac{x^2 + y^2}{2}\right) \exp\left(-\frac{x^2 + y^2}{2}\right)$$
is Marr's wavelet. With this in mind, Mallat took up a promising approach:
to give Marr's conjecture a precise numerical and algorithmic content by taking
advantage of the progress that had been made using quadrature mirror filters.

We start with the one-dimensional case. Mallat replaced the Gaussian
$\frac{1}{\sqrt{2\pi}} e^{-x^2/2}$ with the basic cubic spline $\theta(x)$, whose support is the interval $[-2, 2]$.
Recall that $\theta = T * T$, where $T(x)$ is the triangle function that is equal to $1 - |x|$
if $|x| \le 1$ and 0 if $|x| > 1$.
Let $f(x)$ be the function we wish to analyze by the method of zero-crossings
and write $\theta_\delta(x) = \delta^{-1}\theta(\delta^{-1}x)$. Then the zero-crossings are the values of $x$ where
the second derivative $\frac{d^2}{dx^2}(f * \theta_\delta)$ is zero and changes sign.
To use the pyramid algorithms, Mallat assumes that $\delta = 2^{-j}$, $j \in Z$. Mallat
then proposes to code the signal $f(x)$ with the double sequence $(x_{q,j}, y_{q,j})$, where
(a) $x = x_{q,j}$ is (for $\delta = 2^{-j}$) a zero, with change of sign, of $\frac{d^2}{dx^2}(f * \theta_\delta)(x)$,
and
(b) $y_{q,j} = \frac{d}{dx}(f * \theta_\delta)(x_{q,j})$.
In other words, Mallat considers the values of $x$ (denoted by $x_{q,j}$), where
$\frac{d}{dx}(f * \theta_\delta)$ has an extremum, and he keeps the values of these local extrema in
memory.

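[Figure: a signal $f(x)$ with three marked points $x_1$, $x_2$, $x_3$, plotted above its smoothed first derivative $\frac{d}{dx}(f * \theta_\delta)$ and its smoothed second derivative $\frac{d^2}{dx^2}(f * \theta_\delta)$.]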

Certain of these local extrema correspond to points where the signal
$f(x)$ changes rapidly; this is the case for the points $x_1$ and $x_3$ in the figure.
Other extrema are related to points where the function changes very little.
Mallat had the idea to consider only the first of these and thus to retain only
the local maxima of $|\frac{d}{dx}(f * \theta_\delta)|$. This will not change the critical analysis that
follows.

Coding $f$ with the double sequence $(x_{q,j}, y_{q,j})$ meets two objectives: It is
invariant under translation, and it corresponds to a precise form of Marr's conjecture. One reads on page 68 of his book: "On the other hand, we do have
extra information, namely, the values of the slopes of the curves as they cross
zero, since this corresponds roughly to the contrast of the underlying edge in
the image. An analytic approach to the problem seems to be difficult, but in an
empirical investigation, Nishihara (1981) found encouraging evidence supporting the view that a two-dimensional filtered image can be reconstructed from its
zero-crossings and their slopes."
We are going to show that this conjecture is incorrect. However, this assertion
must be tempered because our counterexample depends on a specific choice for
the function $\theta(x)$. If $\theta(x)$ is the cubic spline, then we have a counterexample.
If, on the other hand, the cubic spline $\theta(x)$ is replaced with the function that is
equal to $1 + \cos x$ if $|x| \le \pi$ and to 0 if $|x| > \pi$ (which is the Tukey window), then,
for all signals $f(x)$ with compact support, reconstruction is theoretically possible
but unstable. In this case, it comes down to determining a function with compact
support from the knowledge of its Fourier transform in the neighborhood of zero,
which is an unstable problem.
To describe the counterexample, we make a change of scale so that the values
of $\delta$ are $2\pi 2^{-j}$ rather than $2^{-j}$, $j \in Z$.
Then we define $f_0(x) = 1 + \cos x$ if $-\pi \le x \le \pi$ and $f_0(x) = 0$ if $|x| \ge \pi$. We
next consider the function $f_1(x)$ defined by
$$f_1(x) = f_0(x) + \sum_{k=0}^{\infty} \alpha_k \bigl(1 + \cos(2k+1)x\bigr) \quad \text{if } -\pi \le x \le \pi$$
and $f_1(x) = 0$ if $|x| \ge \pi$.
We choose the coefficients $\alpha_k$, $k \ge 0$, so that
$$|\alpha_k| k^m \to 0 \quad \text{as } k \to +\infty \quad \text{for all } m \ge 1, \tag{8.3}$$
(8.4) [formula not legible in the source]
$$\sum_{k=0}^{\infty} (-1)^k (2k+1)^{-3} \alpha_k \cos[(2k+1)\delta] = 0 \quad \text{for all } \delta = 2\pi 2^{-j},\ j \in Z, \tag{8.5}$$
and finally a finite set of conditions of the form
(8.6) [formula not legible in the source], $1 \le l \le N$.
Then if $0 < \delta \le \pi/16$, $\frac{d^2}{dx^2}(f_1 * \theta_\delta)$ and $\frac{d^2}{dx^2}(f_0 * \theta_\delta)$ have the same zeros,
namely, at $x = \pm\pi/2$ or at $|x| \ge \pi + 2\delta$. The fact that the derivatives of $f_1 * \theta_\delta$
and $f_0 * \theta_\delta$ are the same at these zero-crossings results from (8.5).

Determining the zero-crossings is also easy if $\delta \ge 8\pi$; they are given by
$x = \pm 2\delta/3$ or at $|x| \ge \pi + 2\delta$. Finally, in the intermediate cases where $-1 \le j \le 4$,
condition (8.6) forces the zero-crossings of $\frac{d^2}{dx^2}(f_1 * \theta_\delta)$ and $\frac{d^2}{dx^2}(f_0 * \theta_\delta)$
to be at the same places and the values of the first derivatives to be equal at
these points.
The counterexample that we have just given looks very much like the one
given in connection with Marr's conjecture. Notice that the modifications added
to $f_0(x)$ involve "high frequencies" and that they are small.
If, however, the function $f(x)$ that we wish to analyze by Mallat's algorithm
is a step function (with an arbitrarily large number of discontinuities), then
Mallat's conjecture is correct. In fact, thanks to the symmetry of the function
$\theta(x)$, the zero-crossings occur (for sufficiently small $\delta > 0$) at the points of
discontinuity, while the values of the first derivatives of the smoothed signal
furnish the jumps in the signal at these discontinuities. In this case, we have
perfect reconstruction of the signal.
All this explains, without doubt, why Mallat's algorithm works in practice
with such excellent precision, no matter what signals are treated. The signals in
question have more in common with step functions than with the subtle functions
described in the counterexamples.
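The step-function case can be made concrete. For a step function, the derivative of the smoothed signal is a sum of copies of $\theta_\delta$ centered at the discontinuities, each weighted by the corresponding jump, so for small $\delta$ each local maximum of its modulus sits at a discontinuity and equals the jump times $\theta(0)/\delta$. The Python sketch below recovers positions and jumps in this way. The three jumps, the scale $\delta = 2^{-3}$, the sampling step, the usual piecewise-polynomial formula for the cubic spline, and the function names are assumptions chosen for the illustration; this is not Mallat's actual reconstruction algorithm.

```python
def cubic_bspline(x):
    """Centered cubic B-spline theta = T * T, supported on [-2, 2], with theta(0) = 2/3."""
    ax = abs(x)
    if ax < 1:
        return 2 / 3 - ax ** 2 + ax ** 3 / 2
    if ax < 2:
        return (2 - ax) ** 3 / 6
    return 0.0

def smoothed_derivative(x, jumps, delta):
    """d/dx (f * theta_delta)(x) for a step function f with the given jumps."""
    return sum(c * cubic_bspline((x - xi) / delta) / delta for xi, c in jumps.items())

def modulus_maxima(jumps, delta, x_max=5.0, step=1 / 64):
    """(position, value) pairs at strict local maxima of |d/dx (f * theta_delta)| on a grid."""
    n = int(round(x_max / step))
    xs = [i * step for i in range(n + 1)]
    d = [smoothed_derivative(x, jumps, delta) for x in xs]
    return [(xs[i], d[i]) for i in range(1, n)
            if abs(d[i]) > abs(d[i - 1]) and abs(d[i]) > abs(d[i + 1])]

# Hypothetical step function: position of each discontinuity -> size of the jump.
jumps = {1.0: 2.0, 2.5: -1.0, 4.0: 0.5}
delta = 2.0 ** -3  # small enough that the bumps of width 4 * delta do not overlap

# Each extremal value equals jump * theta(0) / delta, so the jump can be read off.
recovered = {x: value * delta / cubic_bspline(0.0)
             for x, value in modulus_maxima(jumps, delta)}
```

Here a single fine scale already determines the signal up to an additive constant, which is the situation described in the text.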

8.5. The two-dimensional version of Mallat's algorithm.

We start with a two-dimensional image $f(x, y)$. From this we create the increasingly blurred versions at scales $\delta = 2^{-j}$, $j \in Z$, by taking the various
convolution products $f * \theta_\delta$, where, in two dimensions, $\theta_\delta(x, y) = \theta_\delta(x)\theta_\delta(y) =
\delta^{-2}\theta(x/\delta)\theta(y/\delta)$. The function $\theta(x)$ is the basic cubic spline used in one dimension.
Next we consider the local maxima of the modulus of the gradient of $f * \theta_\delta$.
We keep in memory the positions of these local maxima as well as the gradients at these points. The conjecture is that this data, computed for $\delta = 2^{-j}$,
characterizes the image whose gray levels are given by $f(x, y)$.
We will show that this conjecture is incorrect in this general form. This does
not exclude the possibility of its being true if (1) more restrictive assumptions are
made about the function $f(x, y)$ or (2) the definition of the smoothing operator
is changed.
The counterexample in two dimensions will be $f(x, y) = f_1(x) + f_1(y)$, where
$f_1(x)$ is the counterexample in one dimension. One should observe here that
$$f_1(x) = f_0(x) + \sum_{k=0}^{\infty} \alpha_k \bigl(1 + \cos(2k+1)x\bigr) \quad \text{for } -\pi \le x \le \pi$$
and $\sum_{k=0}^{\infty} (2k+1)^3 |\alpha_k| < \frac{1}{8}$ imply that $\frac{7}{8} f_0(x) \le f_1(x) \le \frac{9}{8} f_0(x)$ since
$$1 + \cos(2k+1)x \le (2k+1)^2 (1 + \cos x).$$
The support of the function $f(x, y)$ is thus the cross $-\pi \le x \le \pi$ or $-\pi \le y \le \pi$
since
$$\tfrac{7}{8}\bigl(f_0(x) + f_0(y)\bigr) \le f(x, y) \le \tfrac{9}{8}\bigl(f_0(x) + f_0(y)\bigr).$$
Finally,
$$f(x, y) * \theta_\delta(x)\theta_\delta(y) = f_1(x) * \theta_\delta(x) + f_1(y) * \theta_\delta(y)$$
and the gradient of this function is the vector
$$\left(\frac{d}{dx}(f_1 * \theta_\delta)(x),\ \frac{d}{dy}(f_1 * \theta_\delta)(y)\right).$$
Its length is $\bigl(|\frac{d}{dx}(f_1 * \theta_\delta)|^2 + |\frac{d}{dy}(f_1 * \theta_\delta)|^2\bigr)^{1/2}$. It has a maximum if and only
if $|\frac{d}{dx}(f_1 * \theta_\delta)|$ and $|\frac{d}{dy}(f_1 * \theta_\delta)|$ are at a maximum. But the set of functions $f_1(x)$
has been constructed so that the positions of the maxima of $|\frac{d}{dx}(f_1 * \theta_\delta)|$ are independent of the choice of $f_1$ and the same is true for the values of $\frac{d}{dx}(f_1 * \theta_\delta)$ at
these points when $\delta = 2\pi 2^{-j}$, $j \in Z$. Thus we have a counterexample in dimension two. Finding a counterexample whose support is a square is an important
unsolved problem.

8.6. Conclusions.

All of this shows that Marr's conjecture is doubtful. However, the counterexample that we have constructed is deficient because it does not have finite support.
Regarding Mallat's conjecture, one must distinguish between the problem
of unique representation and that of stable reconstruction. In our opinion, the
reconstruction is never stable (unless the class of images to which the algorithm
is applied is seriously limited). But it is, in certain cases, a representation that
defines the image uniquely.

Bibliography
[1] S. G. MALLAT, Multifrequency channel decompositions of images and wavelet models, IEEE Trans. on Acoustics, Speech Signal Process., 37 (1989), pp. 2091-2110.
[2] S. G. MALLAT AND W.-L. HWANG, Singularity detection and processing with wavelets, Robotics Report No. 245, Courant Institute of Mathematical Sciences, New York, NY, March 1990.
[3] S. G. MALLAT AND S. ZHONG, Complete signal representation with multiscale edges, Robotics Report No. 219, Courant Institute of Mathematical Sciences, New York, NY, December 1989.

CHAPTER 9

Wavelets and Fractals

9.1. Introduction.

This title, in fact, pertains to the last three chapters of the book. In the next
chapter, wavelets will be used to analyze the multifractal structure of fully developed turbulence. Similarly, in Chapter 11, the astronomer Albert Bijaoui will
propose wavelets as a tool capable of elucidating the multifractal structure of
the hierarchical organization of the galaxies.
In this chapter, we show that Fourier analysis does not easily reveal the
multifractal structure of a signal and that wavelet analysis provides immediate
access to information that is obscured by time-frequency methods. The two
examples we present have played a significant role in the history of mathematics
and, for this reason, this chapter has been written with extra care, by presenting
all the details of the arguments and demonstrations. We invite the reader to
accept our apologies, if he or she is not a mathematician, and to proceed directly
to Chapter 10. We hope that readers who are mathematicians will enjoy reading
this chapter as much as we have enjoyed writing it.

9.2. The Weierstrass function.

We intend to show that the function $\sigma(t) = \sum_0^\infty 2^{-j}\cos(2^j t)$ is nowhere differentiable and that the same is true for the function $\tilde\sigma(t) = \sum_0^\infty 2^{-j}\sin(2^j t)$. These proofs
will use wavelet analysis, which, in this example, appears in general outline as a
form of Littlewood-Paley analysis. The method we follow is due to Geza Freud.
Let $\psi(t)$ denote a function belonging to the holomorphic Hardy space $H^2(R)$.
Assume, in addition, that $|\psi(t)| \le C/|t|^3$ if $|t| \ge 1$ and that the Fourier transform $\hat\psi(\xi)$ of $\psi(t)$ satisfies the following conditions:
$$\hat\psi(\xi) = 0 \text{ if } \xi \le 1/2 \text{ (and in particular on } (-\infty, 0]), \tag{9.1}$$
$$\hat\psi(\xi) = 0 \text{ if } \xi \ge 2, \tag{9.2}$$
$$\hat\psi(1) = 1. \tag{9.3}$$
We write $\psi_j(t) = 2^j\psi(2^j t)$ and denote the convolution operators $f \to f * \psi_j$
by $\Delta_j$, $j = 0, 1, 2, \ldots$. These operators $\Delta_j$, $j = 0, 1, 2, \ldots$, constitute a bank of

filters that are arrayed on octave intervals. Thus the analysis of a real function
using the sequence $\Delta_j$, $j = 0, 1, 2, \ldots$, resembles a Littlewood-Paley analysis
that would be carried out on the analytic signal whose real part is $f$.
Freud's method is based on the following lemma.
LEMMA 9.1. Let $f(t)$ be a bounded, continuous function of the real variable
$t$. Assume that $f$ is differentiable at $t_0$. Then
$$\Delta_j f(t_0) = 2^{-j}\varepsilon_j \quad \text{where } \varepsilon_j \to 0 \text{ as } j \to +\infty.$$
By definition, $\Delta_j f(t_0) = 2^j \int f(t_0 - t)\psi(2^j t)\,dt$. We write $f(t_0 + t) = f(t_0) +
t f'(t_0) + t\varepsilon(t)$, where $\varepsilon(t) \to 0$ with $t$ and $|\varepsilon(t)| \le C$ if $|t| \ge 1$. This gives three
terms for $\Delta_j f(t_0)$. The first two are zero because $\int \psi(t)\,dt = \int t\psi(t)\,dt = 0$. The
third term is $2^j \int t\varepsilon(t)\psi(2^j t)\,dt = 2^{-j} \int \varepsilon(2^{-j}t)\,t\psi(t)\,dt$. But we have $|\varepsilon(2^{-j}t)| \le
C$, $\lim_{j\to\infty} \varepsilon(2^{-j}t) = 0$ (simple convergence), and $\int |t||\psi(t)|\,dt < \infty$. From this
it follows that $\varepsilon_j = \int \varepsilon(2^{-j}t)\,t\psi(t)\,dt \to 0$ as $j \to +\infty$.
Let's return to the two functions $\sigma(t) = \sum_0^\infty 2^{-j}\cos(2^j t)$ and $\tilde\sigma(t) =
\sum_0^\infty 2^{-j}\sin(2^j t)$. Then by direct computation, $(\Delta_j\sigma)(t) = 2^{-j-1}e^{i2^j t}$ and
$(\Delta_j\tilde\sigma)(t) = -i2^{-j-1}e^{i2^j t}$; Lemma 9.1 applies, and the conclusion is that $\sigma$
and $\tilde\sigma$ are nowhere differentiable.
We make the following observation about the choice of the analyzing wavelet
$\psi(t)$. If we had chosen an even function $\psi$, decreasing as $1/|t|^3$ at infinity and
satisfying $\hat\psi(\xi) = 0$ if $|\xi| \le 1/2$ or $|\xi| \ge 2$, this would have led to $(\Delta_j\tilde\sigma)(t) =
2^{-j}\sin(2^j t)$, and we could not have concluded that $\tilde\sigma$ is nondifferentiable when
$t = p2^{-q}\pi$. From this example we see the merit of choosing an analyzing wavelet
that is analytic. The information contained in $\Delta_j f(t)$ is more specific.
However, the choice of the analyzing wavelet loses all importance if we
rephrase Lemma 9.1 in the following, more precise form.
LEMMA 9.2. With the same hypotheses as Lemma 9.1, there exists a function
$\eta(x)$, defined for $x \ge 0$, increasing, zero at $x = 0$, and continuous at 0, such that
$$|\Delta_j f(t_1)| \le 2^{-j}\eta(2^{-j}) + |t_1 - t_0|\,\eta(|t_1 - t_0|) \tag{9.4}$$
for all $j \ge 0$ and all real $t_1$.
The proof is similar to the proof of Lemma 9.1.
We return to the issue of choosing the analyzing wavelet. By making the
"bad choice" of a real, even wavelet, we ended up with $(\Delta_j\tilde\sigma)(t) = 2^{-j}\sin(2^j t)$,
and we were not able, using Lemma 9.1, to reach the desired conclusion. The
result follows from Lemma 9.2, however. For example, for $t_0 = 0$, take $t_1 = \frac{\pi}{2}2^{-j}$
so that $\sin 2^j t_1 = 1$.
For experimental applications, users prefer a formulation such as Lemma 9.1
because there are considerably fewer values to check (one needs only to vary $j$),
whereas the use of Lemma 9.2 requires one to consider (in addition to $j$) all
values of $t_1$. This leads to a much more cumbersome algorithm.
The statement of Lemma 9.2 comes close to being a necessary and sufficient
condition for differentiability at $t_0$. In fact, differentiability at $t_0$ cannot be
characterized in this way, and the sufficient condition that we will present in
section 9.3 is not exactly the converse of Lemma 9.2.
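The difference between the two choices of wavelet can be seen numerically from the closed-form coefficients quoted above. The Python sketch below rescales the coefficients by $2^j$: for the analytic wavelet the rescaled modulus stays equal to 1/2, which is incompatible with Lemma 9.1 at every point, whereas for the real, even wavelet the rescaled coefficient vanishes (up to rounding) at $t_0 = p2^{-q}\pi$ once $j \ge q$ and only a nearby point $t_1$ reveals the irregularity. The values $p = 5$, $q = 3$, the range of $j$, and the function names are assumptions made for the illustration.

```python
import math

def analytic_rate(j, t):
    """2^j |(Delta_j sigma)(t)|, using (Delta_j sigma)(t) = 2^(-j-1) exp(i 2^j t)."""
    coefficient = 2.0 ** (-j - 1) * complex(math.cos(2 ** j * t), math.sin(2 ** j * t))
    return 2 ** j * abs(coefficient)

def even_rate(j, t):
    """2^j (Delta_j sigma~)(t), using (Delta_j sigma~)(t) = 2^(-j) sin(2^j t)."""
    return 2 ** j * (2.0 ** -j * math.sin(2 ** j * t))

p, q = 5, 3                       # hypothetical choice of the point p * 2^(-q) * pi
t0 = p * math.pi * 2.0 ** -q
levels = range(q, 21)

# Analytic wavelet: the rescaled modulus does not tend to 0 at t0 (or anywhere else).
analytic = [analytic_rate(j, t0) for j in levels]

# Even wavelet, evaluated at t0 itself: the rescaled coefficient is numerically zero,
# so Lemma 9.1 alone gives no information at this point.
even_at_t0 = [even_rate(j, t0) for j in levels]

# Even wavelet, Lemma 9.2 with t0 = 0 and t1 = (pi / 2) 2^(-j): the value stays at 1.
even_nearby = [even_rate(j, (math.pi / 2) * 2.0 ** -j) for j in levels]
```

The third list shows the price paid for the even wavelet: the evaluation point has to move with $j$, which is the extra search over $t_1$ mentioned in the text.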

9.3. The determination of regular points in a fractal background.

We propose to determine the points $t_0$ where a function, which is otherwise very
irregular, is differentiable. This is done in terms of a condition on the modulus
of the wavelet coefficients. This condition is a sort of converse to Lemma 9.2.
We first state the result in terms of Daubechies's orthogonal wavelets with
$C^2$ regularity. Next, we will indicate a "corollary of the proof," which will serve
to analyze the differentiability of Riemann's function.
Let $\psi(t)$ be a function in class $C^2$, whose support is the interval $[0, L]$, and
such that $2^{j/2}\psi(2^j t - k) = \psi_{j,k}(t)$, $j, k \in Z$, is an orthonormal basis for $L^2(R)$.
The following lemma is the analogue of Lemma 9.2 in the language of
wavelets.
LEMMA 9.3. If $f(t)$ is a bounded, continuous function and if $f(t)$ is differentiable at $t_0$, then its wavelet coefficients $\alpha(j, k) = \int f(t)\psi_{j,k}(t)\,dt$ satisfy
$$|\alpha(j, k)| \le 2^{-j/2}\bigl[2^{-j}\eta(2^{-j}) + |k2^{-j} - t_0|\,\eta(|k2^{-j} - t_0|)\bigr], \tag{9.5}$$
where $\eta(x)$ is defined for $x \ge 0$, is increasing, and is continuous at 0, and
$\eta(0) = 0$.
The following theorem is a partial converse of Lemma 9.3. On one hand, we
must make an assumption about the global regularity of the function $f(t)$, and
for this we suppose that $f(t)$ belongs to the Hölder space $C^\alpha$ ($0 < \alpha < 1$). On
the other hand, we must strengthen the conditions on $\eta(t)$.
Recall that $f \in C^\alpha$ means that there exists a constant $C$ such that for all $t$
and all $h > 0$
$$|f(t + h) - f(t)| \le Ch^\alpha. \tag{9.6}$$
In terms of wavelet coefficients, this translates into the condition
$$|\alpha(j, k)| \le C2^{-(\alpha + 1/2)j}, \tag{9.7}$$
where here and elsewhere, the constant $C$ may vary from expression to expression.
Note that only the values $h \le 1$ are important in (9.6) since $f(t)$ is assumed
to be bounded; if (9.6) holds for these values, then (9.7) is satisfied for all $j \ge 0$.
THEOREM 9.1. Let $f$ be a function of the real variable $t$ that is in the class
$C^\alpha$ and let $t_0$ be a fixed real number. Assume that the wavelet coefficients $\alpha(j, k)$
of $f$ satisfy, for $j \ge 0$ and all $k \in Z$, the condition
$$|\alpha(j, k)| \le 2^{-j/2}\bigl[2^{-j}\eta(2^{-j}) + |k2^{-j} - t_0|\,\eta(|k2^{-j} - t_0|)\bigr], \tag{9.8}$$
where $\eta(x)$ is defined for $x \ge 0$, is increasing, continuous at 0, and satisfies the
Dini condition
$$\int_0^1 \eta(x)\,\frac{dx}{x} < \infty. \tag{9.9}$$
Then $f$ is differentiable at $t_0$, and $f'(t_0)$ can be computed by differentiating the
wavelet expansion of $f$ term by term.

Theorem 9.1 is not exactly the converse of Lemma 9.3 because we use two
additional conditions: a global regularity hypothesis on $f$ and the Dini condition
(9.9).
The proof of Theorem 9.1 is so nice that we cannot resist the temptation to
present it.
In our notation, both the constant $C$ and the function $\eta$ may vary from place
to place. To be specific, when functions of the form $\eta(Kx) = \eta_K(x)$, $K \ge 0$,
appear in estimates, we replace these (which are finite in number) by the largest
of the $\eta_K$ and rename it $\eta$. Note that all the $\eta_K$ satisfy the same conditions
as $\eta$.

By neglecting the contribution of the "low frequencies," which is a very regular function, we can write $f(t) = \sum_{j \ge 0}\sum_k \alpha(j, k)\psi_{j,k}(t)$. We then have
$$f(t_0 + h) - f(t_0) - h\sum_{j \ge 0}\sum_k \alpha(j, k)\psi'_{j,k}(t_0) = S_1(h) + S_2(h) + S_3(h),$$
where these three sums are taken over three intervals of $j$ values.
Define the integer $j_0$ by $2^{-j_0} \le |h| < 2 \cdot 2^{-j_0}$ and then define $j_1 \ge j_0$ by
$2^{-j_1} \le |h|^{2/\alpha} < 2 \cdot 2^{-j_1}$. Write
$$S_1(h) = \sum_{0 \le j \le j_0}\sum_k \alpha(j, k)\bigl[\psi_{j,k}(t_0 + h) - \psi_{j,k}(t_0) - h\psi'_{j,k}(t_0)\bigr]. \tag{9.10}$$
$S_2(h)$ is the analogous sum taken over $j_0 < j \le j_1$, and $S_3(h)$ involves the values
$j > j_1$.
We use two facts to estimate $|S_1(h)|$. On one hand, the functions $\psi_{j,k}(t)$ are
sufficiently regular so that
$$|\psi_{j,k}(t_0 + h) - \psi_{j,k}(t_0) - h\psi'_{j,k}(t_0)| \le Ch^2 2^{5j/2}, \tag{9.11}$$
and, on the other hand, the support of $\psi_{j,k}$ is the interval $[k2^{-j}, (k + L)2^{-j}]$. Once
$t_0$ and $h$ are fixed, the number of values of $k$ for which $\psi_{j,k}(t_0)$ and $\psi_{j,k}(t_0 + h)$
are not zero is no greater than $3L$. For these values of $k$, (9.8) reduces to
$|\alpha(j, k)| \le C2^{-3j/2}\eta(2^{-j})$ since $|k2^{-j} - t_0| \le C2^{-j}$. Finally,
$$|S_1(h)| \le Ch^2 \sum_{0 \le j \le j_0} 2^j\eta(2^{-j}) = o(h), \tag{9.12}$$
which means that $\frac{1}{h}|S_1(h)| \to 0$ when $h$ tends to zero.


The estimates for $|S_2(h)|$ and $|S_3(h)|$ involve only the localization of the
wavelets. The regularity plays no further role since the wavelets whose support
contain $t_0$ are zero in a neighborhood of $t_0 + h$ and conversely.
We take absolute values inside the sum and use (9.8) while distinguishing
those values of $k$ for which $\psi_{j,k}(t_0) \ne 0$ and those for which $\psi_{j,k}(t_0 + h) \ne 0$. In
the second case, $|k2^{-j} - t_0|$ is of the order of magnitude of $h$.
The contribution $|S_2(h)|$ is thus bounded by a constant times
$$\sum_{j_0 < j \le j_1} 2^{-j}\eta(2^{-j}) + (j_1 - j_0)|h|\,\eta(|h|) + |h|\sum_{j_0 < j \le j_1} \eta(2^{-j}). \tag{9.13}$$

We have assumed that $\int_0^1 \eta(t)\,\frac{dt}{t} < \infty$ and that $\eta(t)$ is an increasing function. From this it follows that $\sum_0^\infty \eta(2^{-j}) < \infty$. Furthermore, $\eta(2^{-j})$ decreases to 0, from which we conclude that $\lim_{j \to +\infty} j\eta(2^{-j}) = 0$ and, hence,
that $(\log\frac{1}{|h|})\,\eta(|h|) \to 0$. But $j_1 - j_0$ is of the order $\log\frac{1}{|h|}$ with the result that
$\lim_{h \to 0} \frac{1}{h}S_2(h) = 0$.
To bound $|S_3(h)|$, we use only the hypothesis on the global regularity of $f$
as expressed in terms of the wavelet coefficients (9.7). We again take absolute
values inside the sum. By considering, as in the second case, the limitations
imposed by the compact support of $\psi$, we deduce that $|S_3(h)| \le Ch^2$, which
implies that $\lim_{h \to 0} \frac{1}{h}S_3(h) = 0$.
This completes the proof of Theorem 9.1; however, this is not the result that
will be used in the next section. We will need a "corollary of the proof" of
Theorem 9.1. The statement of the corollary itself follows.
COROLLARY. Let $\theta(x, y)$, $x \in R$, $0 < y \le 1$, be a measurable function of
$(x, y)$ satisfying, in addition, the following two conditions:
$$y|\theta(x, y)| \le Cy^\alpha \quad \text{for all } x \in R \text{ and all } y \in (0, 1) \tag{9.14}$$
for some constant $C$ and an exponent $\alpha > 0$ and
$$y|\theta(x, y)| \le y\eta(y) + |x - t_0|\,\eta(|x - t_0|), \tag{9.15}$$
where the function $\eta$ satisfies the hypotheses of Theorem 9.1, in particular (9.9).
Suppose that $g(t)$ is a function in the class $C^2$ with compact support. Define
$$f(t) = \int_0^1\int_{-\infty}^{\infty} \theta(x, y)\,g\left(\frac{t - x}{y}\right) dx\,\frac{dy}{y}. \tag{9.16}$$
Then $f(t)$ is differentiable at $t_0$, and $f'(t_0)$ can be computed by differentiating
under the integral sign:
$$f'(t_0) = \int_0^1\int_{-\infty}^{\infty} \theta(x, y)\,g'\left(\frac{t_0 - x}{y}\right) dx\,\frac{dy}{y^2}, \tag{9.17}$$
and the integral (9.17) is absolutely convergent.
The proof of the corollary is parallel to that of Theorem 9.1. Observe that if
we examine the absolute convergence of the integral (9.17) without considering
(9.14), then we come directly to the Dini condition (9.9).
9.4. Study of the Riemann function.

According to historians, Karl Weierstrass mentioned the function $W(t) =
\sum_1^\infty \frac{1}{n^2}\sin(\pi n^2 t)$ in a talk to the Academy of Sciences in Berlin on 18 July 1872
and indicated that Riemann had introduced this function to warn mathematicians that a continuous function need not have a derivative anywhere [5]. This
function has come to be known as "Riemann's function," although there seems
to be no written evidence that connects either Riemann or his students with this
function. (See [4] for a fascinating discussion of the mystery surrounding the
origin of $W(t)$.)

Not being able to prove that $W(t)$ was nowhere differentiable, Weierstrass
considered the more lacunary series $\sigma(t) = \sum_0^\infty b^n\cos(a^n t)$, $0 < b < 1$, and
he showed that if $a$ is an odd integer and if $ab$ is sufficiently large, then $\sigma(t)$
is nowhere differentiable. We have seen that the result is true for $a = 2$ and
$b = 1/2$; in fact, $ab \ge 1$ is sufficient.
In 1916, G. H. Hardy proved that the Riemann function $W(t)$ is not differentiable at $t_0$ in the following three cases:
(9.18) $t_0$ is irrational;
(9.19) $t_0 = p/q$ with $p \equiv 0 \pmod 2$ and $q \equiv 1 \pmod 4$;
(9.20) $t_0 = p/q$ with $p \equiv 1 \pmod 2$ and $q \equiv 2 \pmod 4$.
Serge Lang routinely suggested this "conjecture of Riemann" to his students
and, to the general surprise of the mathematics world, Joseph L. Gerver, one of
Lang's undergraduate students, resolved the problem by proving the following
unexpected result: If $t_0 = p/q$ where $p$ and $q$ are odd, then $W(t)$ is differentiable
at $t_0$ and $W'(t_0) = -1/2$. He then showed that $W(t)$ is differentiable at no
other points, and the problem of the differentiability of the Riemann function
was completely settled.
Following Holschneider and Tchamitchian [8], we restudy the function $W(t)$
by associating with it the corresponding "analytic signal" defined by

$$F(t) = \sum_{1}^{\infty} \frac{1}{n^2} e^{i\pi n^2 t}. \tag{9.21}$$
$F(t)$ can be extended to the upper half-plane $z = x + iy$, $y > 0$, as the bounded,
holomorphic function
$$F(z) = \sum_{1}^{\infty} \frac{1}{n^2} e^{i\pi n^2 z}. \tag{9.22}$$
The most natural wavelet analysis for such holomorphic functions is the analysis advocated by Lusin (2.5). The wavelet transform of $F(z)$ is, up to a normalization, $F'(z) = i\pi\sum_1^\infty \exp(i\pi n^2 z) = \frac{i\pi}{2}(\theta(z) - 1)$, where $\theta(z)$ is the Jacobi
function defined by
$$\theta(z) = \sum_{-\infty}^{\infty} \exp(i\pi n^2 z), \quad z = x + iy,\ y > 0. \tag{9.23}$$

It is easy to study the behavior of the function $\theta(z)$ near the real axis by using
the functional equation satisfied by $\theta(z)$. And everything would go well in the
best of all possible worlds if the analyzing wavelet, $\psi(t) = (t + i)^{-2}$, satisfied
the minimal condition allowing the corollary of Theorem 9.1 to be used, namely,
that $\int_{-\infty}^{\infty} (1 + |t|)|\psi(t)|\,dt < \infty$. But this integral diverges.
Holschneider had the remarkable idea to start with the result of the analysis
obtained with the "bad wavelet," and then to do the synthesis with the "good
wavelet" $g(t)$. Thus he considered a function $g(t)$, in the class $C^2$ and with
compact support, such that


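$$\int_0^\infty \hat g(u) e^{-u}\,du = 1, \tag{9.24}$$
where $\hat g$ denotes the Fourier transform of $g$. We then have, for all $\lambda > 0$,
$$\int_0^\infty\int_{-\infty}^{\infty} g\left(\frac{t - x}{y}\right) e^{i\lambda(x + iy)}\,dx\,\frac{dy}{y} = \frac{1}{\lambda}e^{i\lambda t} \tag{9.25}$$
and
$$\int_0^\infty\int_{-\infty}^{\infty} g\left(\frac{t - x}{y}\right)\bigl(\theta(x + iy) - 1\bigr)\,dx\,\frac{dy}{y} = \frac{2}{\pi}\sum_{1}^{\infty} \frac{1}{n^2}e^{i\pi n^2 t}. \tag{9.26}$$
We can further simplify the identity (9.26) by requiring that the integral of
$g(t)$ be zero, which is to say that it is a Grossmann-Morlet wavelet. Then (9.26)
becomes
$$\int_0^\infty\int_{-\infty}^{\infty} g\left(\frac{t - x}{y}\right)\theta(x + iy)\,dx\,\frac{dy}{y} = \frac{2}{\pi}\sum_{1}^{\infty} \frac{1}{n^2}e^{i\pi n^2 t}. \tag{9.27}$$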

Let $t_0 = \frac{2p+1}{2q+1}$. To demonstrate that the right-hand side of (9.27) is differentiable at $t_0$, we use the corollary of Theorem 9.1.
First of all, the contribution from $y \ge 1$ in (9.27) can be neglected. In fact,
$|\theta(z) - 1| \le Ce^{-\pi y}$ when $y \ge 1$ and $z = x + iy$ so that the integral in $y$ from 1
to infinity yields a function in the class $C^2$ or, more generally, having the same
regularity as $g(t)$.
To establish the differentiability at $t_0$, observe that
$$|\theta(x + iy)| \le Cy^{-1/2} \quad \text{for all real } x \text{ and } 0 < y \le 1 \tag{9.28}$$
and that
$$y|\theta(t_0 + z)| \le C|z|^{3/2} \quad \text{if } z = x + iy,\ 0 < y \le 1. \tag{9.29}$$
We can then apply the corollary of Theorem 9.1 and prove Gerver's theorem.

9.5. Conclusions.

In view of the example of the Riemann function, one would be tempted to conclude that wavelet analysis is better than Fourier analysis for studying fractal
structures. The Riemann function is given explicitly in terms of a Fourier series
and, yet, this exact spectral information yields no direct access to the pointwise regularity of the function. This conclusion, however, is incorrect for several
reasons.
First of all, the simplest and most precise method for studying the Riemann
function has been found by Itatsu [9], and it is not a time-scale method. On the
contrary, Itatsu's method is based on the judicious use of the Poisson summation
formula. It is thus a "time-frequency" method, and it gives more precise results
than those obtained by wavelet analysis.

In the second place, we know today that wavelet analysis is part of a much
larger discipline, namely, 2-microlocal analysis ([2], [10]). 2-microlocal analysis
saw the light of day at about the same time as wavelet analysis. The dictionary
allowing us to go from one type of analysis to the other has been worked out by
Jaffard [10].
Should one use wavelet analysis or 2-microlocal analysis? We refer the reader
to [11], where this problem is carefully studied. My (subjective) point of view
is that 2-microlocal analysis is a more flexible and more precise instrument but
that its sophistication can discourage some scientists. Wavelet analysis offers an
apparent simplicity, which is one of the explanations of its success.
Bibliography
[1] A. ARNEODO, E. BACRY, AND J. F. MUZY, Wavelets and multifractal formalism for singular signals: Applications to turbulence data, preprint, Centre de Recherche Paul Pascal, Pessac, France, 1991.
[2] J. M. BONY, Second microlocalization and propagation of singularities for semi-linear hyperbolic equations, Taniguchi Symp. HERT, Katata, (1984), pp. 11-49.
[3] ------, Interaction des singularités pour l'équation de Klein-Gordon non linéaire, Séminaire Goulaouic-Meyer-Schwartz 1983-1984, École Polytechnique, Centre de Mathématiques, Palaiseau, France.
[4] P. L. BUTZER AND E. L. STARK, "Riemann's example" of a continuous nondifferentiable function in the light of two letters (1865) of Christoffel to Prym, Bull. Soc. Math. Belg., 38 (1986), pp. 45-73.
[5] J. J. DUISTERMAAT, Selfsimilarity of "Riemann's nondifferentiable function", Nieuw Arch. Wisk., 9 (1991), pp. 303-337.
[6] J. GERVER, The differentiability of the Riemann function at certain rational multiples of $\pi$, Amer. J. Math., 92 (1970), pp. 33-55.
[7] ------, More on the differentiability of the Riemann function, Amer. J. Math., 93 (1970), pp. 33-41.
[8] M. HOLSCHNEIDER AND PH. TCHAMITCHIAN, Pointwise analysis of Riemann's "nondifferentiable" function, Inventiones Mathematicae, 105 (1991), pp. 157-176.
[9] S. ITATSU, The differentiability of the Riemann function, Proc. Japan Acad., Ser. A, Math. Sci., 57 (1981), pp. 492-495.
[10] S. JAFFARD, Pointwise smoothness, two-microlocalization and wavelet coefficients, Publicacions Matemàtiques (Publicacions de la Universitat Autònoma de Barcelona), 35 (1991), pp. 155-168.
[11] Y. MEYER, L'analyse par ondelettes d'un objet multifractal: la fonction $\sum_1^\infty \frac{1}{n^2}\sin n^2 t$ de Riemann, Colloquium Mathématique de l'Université de Rennes, Rennes, France, November 1991.
[12] H. QUEFFELEC, Dérivabilité de certaines sommes de séries de Fourier lacunaires, Thesis, University of Orsay, Orsay, France, 1971.

CHAPTER 10

Wavelets and Turbulence

10.1. Introduction.

The study of profound problems is often influenced by the available instruments
and techniques. The example of Galileo's lens comes immediately to mind. We
propose, by taking up an argument due to Marie Farge, to show that it is as
natural to study turbulence using wavelets as it is to explore the night sky with
a telescope.
This is not to imply, by any means, that all the problems in turbulence have
been resolved thanks to wavelets. We merely see things a little more clearly. The
objective of this chapter is to describe the new point of view that wavelets have
brought to the study of fully developed turbulence.
The following lines are, in reality, only "lecture notes," and we encourage
those who wish to know more about this subject to look at the original articles
cited at the end of the chapter.

10.2. The statistical theory of turbulence and Fourier analysis.

The statistical theory of turbulence was introduced more or less simultaneously
by Kolmogorov (1941), Obukhov (1941), Onsager (1945), Heisenberg (1948), and
von Weizsäcker (1948). This work involved applying the statistical tools used
for studying stationary processes to understand the partition of energy, at different
scales, in the solutions of the Navier-Stokes equation.
According to Leray [6], this statistical point of view is justified by the loss of
stability and uniqueness of the solutions for very large Reynolds numbers and
for large values of time. One then speaks of fully developed turbulence.
The intermediate scales (the inertial zone) lie between the smallest scales
(where, through viscosity, the dynamic energy is dissipated in heat) and the
largest scales (where exterior forces supply the energy). In this inertial zone,
the theory of Kolmogorov stipulates that energy is neither produced nor dissipated
but only transferred, without dissipation, from one scale to another and
according to a constant rate.
Another hypothesis is that turbulence is statistically homogeneous (invariant
under translation), isotropic (invariant under rotation), and self-similar. The
velocity components are treated as random variables, in the probabilistic sense,
and the statistical description is derived from the corresponding correlation
functions. The mathematical tool adapted to this statistical approach is the Fourier
transform, and, by associating scale and frequency in the usual way, Kolmogorov
and Obukhov arrived at $\varepsilon^{2/3} |k|^{-11/3}$ for the average spectral
distribution of energy, where $k$ is the vector-variable of the Fourier transform,
which was taken over the space variables $x_1$, $x_2$, and $x_3$.
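As a hedged aside, the exponent $-11/3$ can be related to the more familiar "$-5/3$ law." The relation below is a sketch under the isotropy assumption stated above. The symbols $\Phi(k)$ (three-dimensional spectral density), $E(\kappa)$ (energy per unit wavenumber $\kappa = |k|$), and the reading of $\varepsilon$ as the constant rate of energy transfer are introduced here for illustration only, and multiplicative constants are ignored. Integrating the density over the sphere of radius $\kappa$ gives:

```latex
\Phi(k) \sim \varepsilon^{2/3}\,|k|^{-11/3},
\qquad
E(\kappa) = \int_{|k| = \kappa} \Phi(k)\, d\sigma(k)
          = 4\pi \kappa^{2}\, \Phi(\kappa)
          \sim \varepsilon^{2/3}\, \kappa^{-5/3}.
```

The factor $\kappa^{2}$ coming from the area of the sphere accounts for the difference between the two exponents.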
Various wind-tunnel experiments [Batchelor and Townsend (1940); and
Anselmet, Gagne, Hopfinger, and Antonio (1984)] have shown that the energy
associated with the small scales of a turbulent flow is not distributed uniformly
in space. This observation, that the support of the transfer of energy is spatially
intermittent, has led several authors to hypothesize that this support is fractal
[Mandelbrot (1975); and Frisch, Sulem, and Nelkin (1978)] or multifractal [Parisi
and Frisch (1985)].
Use of the Fourier transform does not elucidate the multifractal structure
of fully developed turbulence, and this observation leads us to an important
application of the wavelet transform.
10.3. Verification of the hypothesis of Parisi and Frisch.

The conjecture of Parisi and Frisch [9] is based on experimental data supplied
by Anselmet, Gagne, Hopfinger, and Antonio. The computation that led to
their conjecture involved evaluating, as a function of the displacement $|\Delta x|$, the
average value of the $p$th power of the change in the vector velocity in a turbulent
fluid, that is, the average value of $|v(x + \Delta x, t) - v(x, t)|^p$. The surprising
result from these measurements was that they obtained a power law in terms
of $|\Delta x|^{\zeta(p)}$, where the exponent $\zeta(p)$ does not depend linearly on $p$. The
interpretation given by Parisi and Frisch is that turbulent flow develops multifractal
singularities when the Reynolds number becomes very large.
The relation between the multifractal structure and the power law is given
by the following heuristic reasoning. To speak of a multifractal structure means
that, for each $h > 0$, there is a set of singular points with Hausdorff dimension
$D(h)$ on which the increase in velocity acts like $|\Delta x|^h$. The contribution of these
"singularities of exponent $h$" to the average value of $|v(x + \Delta x, t) - v(x, t)|^p$ is
of the order of magnitude of the product $|\Delta x|^{ph}\, |\Delta x|^{3 - D(h)}$; the second factor
is the probability that a ball of radius $|\Delta x|$ intersects a fractal set of dimension
$D(h)$.
When $|\Delta x|$ tends to 0, the dominant term is the one with the smallest possible
exponent, which leads to

$\zeta(p) = \inf_{h > 0} \{ ph + 3 - D(h) \}.$

The exponent $\zeta(p)$ is thus given by the Legendre transform of the function $D(h)$,
which is, we recall, the Hausdorff dimension of the set of exceptional points $x$
where $|v(x + \Delta x, t) - v(x, t)|$ is of the order of magnitude of $|\Delta x|^h$. The nonlinear
dependence of $\zeta(p)$ on $p$ thus indicates that the very abrupt variations in velocity
correspond to a multifractal structure.
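The heuristic formula for the scaling exponent can be explored numerically. The following is only a sketch: the parabolic spectrum `D`, its parameters `h0` and `s`, and the function name `zeta` are hypothetical choices made for illustration, not data or notation from the experiments cited. It evaluates the infimum over a fine grid of exponents $h$ and shows that a non-degenerate $D(h)$ yields an exponent that is not linear in $p$.

```python
import numpy as np

def zeta(p, h, D):
    """Approximate inf over h of p*h + 3 - D(h) on a grid of h values."""
    return np.min(p * h + 3.0 - D)

# Hypothetical singularity spectrum (an assumption, not measured data):
# a parabola with maximum value 3 attained at h0.
h0, s = 1.0 / 3.0, 0.02
h = np.linspace(1e-4, 1.0, 200001)
D = 3.0 - (h - h0) ** 2 / (2.0 * s)

ps = np.arange(1, 7)
zetas = np.array([zeta(p, h, D) for p in ps])

# For this particular parabola the infimum can be computed by hand:
# the minimizing h is h0 - s*p, which gives p*h0 - s*p**2/2.
closed_form = ps * h0 - s * ps ** 2 / 2.0

for p, z in zip(ps, zetas):
    print(p, round(float(z), 4))
```

If instead $D(h)$ were finite at a single exponent only (a homogeneous fractal), the same computation would return an exponent that is affine in $p$, which is why the nonlinearity is read as a multifractal signature.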


The wavelet transformation is the ideal tool for analyzing multifractal structures,
and Frisch has sought to verify his conjecture by plunging into the very
heart of the turbulent signal and traveling across the scales to calculate the fractal
exponents.
The turbulent signal, which is the object of the analysis, is, in fact, a function
of time obtained by hot-wire measurements. The turbulent flow comes from the
Modane wind tunnel, and the hot-wire technique provides the measurement at
a given point of the velocity as a function of time. Taylor's hypothesis implies
that, for the particular conditions created in this tunnel, the "segment in time"
is equivalent to a "segment in space" (along the axis of the tunnel). Finally,
the turbulent signal is analyzed with the wavelet transform and, thanks to a
two-dimensional color visualization, one displays a multifractal structure.
This phenomenological approach has been criticized by Everson, Sirovich, and
Sreenivasan [3]. These investigators have been able to show that wavelet analysis
of Brownian motion produces very similar two-dimensional visualizations.
It then became a matter of urgency to move from the qualitative to the
quantitative and to extract the fractal exponents $h$ and the corresponding Hausdorff
dimensions $D(h)$ from the "Gagne signal." This research is being actively pursued
by Alain Arneodo and his collaborators with the objective to confirm the
hypothesis of Frisch and Parisi.
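To give a rough idea of what "extracting a fractal exponent" across scales can look like, here is a toy sketch. It is not the procedure of Arneodo and his collaborators. The test signal $|t - t_0|^h$, the choice of the "Mexican hat" wavelet, the $1/a$ normalization, the range of scales, and all function names are assumptions made for illustration. With this normalization, the wavelet coefficient located at an isolated singularity of exponent $h$ behaves like $a^h$ as the scale $a$ varies, so $h$ can be read off as a slope in log-log coordinates.

```python
import numpy as np

def mexican_hat(u):
    """Second derivative of a Gaussian (up to sign); it has zero mean."""
    return (1.0 - u ** 2) * np.exp(-u ** 2 / 2.0)

def wavelet_coefficient(f, t, a, b):
    """Riemann-sum approximation of (1/a) * integral f(t) psi((t-b)/a) dt."""
    dt = t[1] - t[0]
    return np.sum(f * mexican_hat((t - b) / a)) * dt / a

# Toy signal with one singularity of exponent h_true at t0 (an assumption).
h_true, t0 = 0.6, 0.5
t = np.linspace(0.0, 1.0, 8193)        # t0 falls exactly on a grid point
f = np.abs(t - t0) ** h_true

# Scales small enough that the wavelet stays well inside [0, 1].
scales = np.geomspace(0.005, 0.05, 20)
coeffs = np.array([wavelet_coefficient(f, t, a, t0) for a in scales])

# Slope of log|W(a, t0)| against log a estimates the exponent h.
h_est = np.polyfit(np.log(scales), np.log(np.abs(coeffs)), 1)[0]
print(round(float(h_est), 3))
```

A real turbulent signal has singularities of many exponents intertwined at all locations, so the quantitative problem is considerably harder than this single-point regression suggests.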
10.4. Farge's experiments.

Following Farge, we quote John von Neumann [10]:

The phenomenon of turbulence was discovered physically and is still largely
unexplored by mathematical techniques. At the same time, it is noteworthy
that the physical experimentation which leads to these and similar discoveries is
a quite peculiar form of experimentation; it is very different from what is
characteristic in other parts of physics. Indeed, to a great extent, experimentation
in fluid dynamics is carried out under conditions where the underlying physical
principles are not in doubt, where the quantities to be observed are completely
determined by known equations. The purpose of the experiment is not to verify
a proposed theory but to replace a computation from an unquestioned theory
by direct measurements. Thus wind tunnels are, for example, used at present,
at least in large part, as computing devices of the so-called analogy type (or, to
use a less widely used, but more suggestive, expression proposed by Wiener and
Caldwell: of the measurement type) to integrate the nonlinear partial differential
equations of fluid dynamics.
Thus it was to a considerable extent a somewhat recondite form of computation
which provided, and is still providing, the decisive mathematical ideas in
the field of fluid dynamics. It is an analogy (i.e., measurement) method, to be
sure. It seems clear, however, that digital (in the Wiener-Caldwell terminology:
counting) devices have more flexibility and more accuracy, and could be made
much faster under present conditions. We believe, therefore, it is now time to
concentrate on effecting the transition to such devices, and that this will increase
the power of the approach in question to an unprecedented extent.
One of the chief architects of this experimentation has, without doubt, been
Norman Zabusky, who, after having discovered solitons (in collaboration with
Kruskal), demonstrated the existence of coherent structures within turbulent
flow in two as well as three dimensions. Zabusky comments on his discovery [11]:

In the last decade we have experienced a conceptual shift in our view of
turbulence. For flows with strong velocity shear ... or other organizing
characteristics, many now feel that the spectral or wavenumber-space description has
inhibited fundamental progress. The next "El Dorado" lies in the mathematical
understanding of coherent structures in weakly dissipative fluids: the formation,
evolution and interaction of metastable vortex-like solutions of nonlinear partial
differential equations...

Farge explains to us how and why she was led to use wavelet analysis in her
study of numerical simulations of two-dimensional fully developed turbulence [4]:
The use of the wavelet transform for the study of turbulence owes absolutely
nothing to chance or fashion but comes from a necessity stemming from the
current development of our ideas about turbulence. If, under the influence of
the statistical approach, we had lost the need to study things in physical space,
the advent of supercomputers and the associated means of visualization have
revealed a zoology specific to turbulent flows, namely, the existence of coherent
structures and their elementary interactions, none of which are accounted for by
the statistical theory...
What Farge asks of wavelet analysis (or of any other form of "time-frequency"
analysis) is to decouple the dynamics of the coherent structures from the residual
flow. The residual flow would play only a passive role in an action whose
protagonists would be the coherent structures; these "protagonists" clash or join
forces according to their "sign"...
The difficulty arises because the Navier-Stokes equations are nonlinear;
hence, the interactions between the coherent structures and the residual flow
cannot be eliminated. In other words, the coherent structures differ from solitons
in that they are not particular solutions of the Navier-Stokes equations.
Farge, after having tried several methods to extract the coherent structures
from the residual flow, decided to use Victor Wickerhauser's algorithm, which
provides a decomposition in a basis adapted to the signal. It is quite remarkable
that the Wickerhauser algorithm extracted the coherent structures by giving
them priority over the residual flow. These unexpected results are in the process
of being published.
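To make the phrase "a basis adapted to the signal" slightly more concrete, here is a minimal sketch of a best-basis search over a wavelet-packet tree. It is not Wickerhauser's code, and nothing here reproduces Farge's experiments. The Haar filter pair, the entropy-like cost, the recursion depth, and the names `haar_split`, `cost`, and `best_basis` are all assumptions chosen to keep the example short. At each node the signal block is either kept as it is or split into a "trend" and a "fluctuation" half, whichever is cheaper.

```python
import numpy as np

def haar_split(x):
    """One orthonormal Haar step: returns (averages, differences)."""
    s = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    d = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return s, d

def cost(c):
    """Additive entropy-like cost: -sum c^2 log c^2, with 0 log 0 = 0."""
    e = c[c != 0.0] ** 2
    return float(-np.sum(e * np.log(e)))

def best_basis(x, max_level):
    """Return (cost, coefficient blocks) of the cheapest Haar
    wavelet-packet basis whose depth does not exceed max_level."""
    if max_level == 0 or len(x) < 2:
        return cost(x), [x]
    s, d = haar_split(x)
    cost_s, blocks_s = best_basis(s, max_level - 1)
    cost_d, blocks_d = best_basis(d, max_level - 1)
    if cost_s + cost_d < cost(x):
        return cost_s + cost_d, blocks_s + blocks_d   # splitting is cheaper
    return cost(x), [x]                               # keep this block

# Usage on a unit-energy constant signal (length must be a power of two).
x = np.ones(8) / np.sqrt(8.0)
total, blocks = best_basis(x, max_level=3)
energy = sum(float(np.sum(b ** 2)) for b in blocks)
print(total, energy)
```

Because each Haar step is orthonormal, the selected blocks always carry the same total energy as the input; the search only changes how that energy is concentrated among the coefficients.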

10.5. Numerical approaches to turbulence.

It is not unreasonable to believe that the use of new methods in numerical
analysis will considerably reduce the time needed to compute solutions of the
Navier-Stokes equations.
Gregory Beylkin is developing a systematic program in which wavelet analysis
replaces the more traditional methods of numerical analysis: finite elements,
finite differences, and spectral methods. He intends to apply these methods to
the solution of the Navier-Stokes equation.


Farge poses the following questions in [4]:
1. Is it possible to "project," in the sense of Galerkin's method, the
Navier-Stokes equations onto bases that are adapted to the structure of these equations
and that lead to efficient numerical computation?
2. More generally, can one effectively describe the solutions with a small
number of parameters, as is the objective of the theory of inertial varieties of
Foias and Temam? These inertial varieties contain the "strange attractors,"
which constitute the asymptotic solutions.
3. Can the best basis method of Wickerhauser furnish an effective
parameterization for these inertial varieties?
There are many questions that have yet to receive satisfactory answers, but
these issues are stimulating research projects that will perhaps be decisive.
Bibliography
[1] F. ARGOUL, A. ARNEODO, G. GRASSEAU, Y. GAGNE, E. F. HOPFINGER, AND U.
FRISCH, Wavelet analysis of turbulence reveals the multifractal nature of the
Richardson cascade, Nature, 338 (1989), pp. 51-53.
[2] A. ARNEODO, G. GRASSEAU, AND M. HOLSCHNEIDER, Wavelet transform of
multifractals, Phys. Rev., 61 (1988), pp. 2281-2284.
[3] R. EVERSON, L. SIROVICH, AND K. R. SREENIVASAN, Wavelet analysis of the turbulent
jet, Phys. Lett. A, 145 (1990), pp. 314-324.
[4] M. FARGE, Transformée en ondelettes continue et application à la turbulence, Société
Mathématique de France, Paris, France, May 5, 1990.
[5] A. N. KOLMOGOROV, A refinement of previous hypotheses concerning the local structure
of turbulence in viscous incompressible fluid at a high Reynolds number, J. Fluid
Mech., 13 (1961), pp. 1, 82-85.
[6] J. LERAY, Études de diverses équations intégrales non-linéaires et de quelques problèmes
que pose l'hydrodynamique, J. Math. Pures Appl., (1933), pp. 1-82.
[7] B. MANDELBROT, Intermittent turbulence in self-similar cascades: Divergence of high
moments and dimension of the carrier, J. Fluid Mech., 62 (1975), pp. 331-358.
[8] J. F. MUZY, E. BACRY, AND A. ARNEODO, Wavelets and multifractal formalism for
singular signals: Application to turbulence data, preprint, Centre de Recherche Paul
Pascal, Pessac, France.
[9] G. PARISI AND U. FRISCH, Turbulence and predictability in geophysical fluid dynamics
and climate dynamics, M. Ghil, R. Benzi, and G. Parisi, eds., North-Holland,
Amsterdam, 1985, pp. 71-88.
[10] J. VON NEUMANN, Complete works, 1949. [Translator's note: The quotation is from On
the Principles of Large Scale Computing Machines by Herman H. Goldstine and
John von Neumann. This paper was never published. It contains material given by
von Neumann in a number of lectures, in particular one at a meeting on May 15, 1946,
of the Mathematical Computing Advisory Board, Office of Research and Inventions,
Navy Department, which in 1947 became the Office of Naval Research.]
[11] N. ZABUSKY, Computational synergetics, Physics Today, July (1984), pp. 2-11.

CHAPTER 11

Wavelets and the Study of Distant Galaxies

11.1. Introduction.

Wavelets are being used by Albert Bijaoui and his collaborators to clarify the
hierarchical organization of distant galaxies and, possibly from this, to make
deductions about the formation of these galaxies.
The following lines are again "lecture notes," this time from original articles
by Bijaoui and others, and we encourage the interested reader to consult these
articles. Several are listed in the bibliography.

11.2. The new telescopes.

The difficulties encountered today in the analysis of the distribution of the galaxies
result, paradoxically, from technological progress in the construction of telescopes.
On the one hand, the quality of telescopes has been considerably improved
during the last 50 years. In 1950, astronomers could capture ten million galaxies.
Today, 100 million galaxies can be examined... There is now such a quantity of
information that it is impossible to continue using traditional observations based
on the astronomer's eye and judgment.
On the other hand, the very nature of the images coming from these telescopes
has undergone a revolution. Now, CCDs (Charge Coupled Devices) replace plates
of silver salts, and "chemical" photography is already a thing of the past. The
future will involve electronic reception of photons that come to us from the edge
of the Universe.
CCDs do not provide images in the traditional sense but, rather, data that
are suitable for various methods of processing. Tomorrow's telescopes will be
computers that can be accessed from a distance. These computers will, with
proper software, automatically execute certain image-processing algorithms. The
optimal utilization of the marvelous progress offered by these CCDs necessitates
implementing effective algorithms for processing the received images, and this is
where wavelets enter.

11.3. The hierarchical organization of the galaxies and the creation of the Universe.

This considerable technological progress has allowed us to study the
three-dimensional organization of the galaxies. The results that have been obtained
challenge the imagination. Far from being arbitrarily distributed like points
tossed at random, the galaxies are organized according to extremely complex
geometric configurations, and these configurations contain information. In some
cases, galaxies are distributed in filaments; in other cases, they are spread over
huge surfaces that surround only void.
Galaxies aggregate into clusters, which are themselves organized into
superclusters, although a galaxy can belong to a supercluster without being part of a
cluster.
One of the goals of contemporary astronomy is to analyze, in detail, the
geometry of these hierarchical galactic organizations. This analysis is particularly
important in the case of the most distant galaxies. Indeed, the images that we
receive from these galaxies date from the beginning of the creation of the Universe.
It is possible to imagine that the fascinating geometry of these hierarchical
organizations of distant galaxies preserves traces of the process of "fragmentation"
of the smooth, homogeneous primitive matter that made up the Universe.
11.4. The multifractal approach to the Universe.

A fractal description of the distribution of matter in the Universe has been
proposed by Mandelbrot (1975, 1982) and, more or less explicitly, by many
other scientists. But the nature of the distribution of the galaxies precludes
homogeneous fractal descriptions, whereas a multifractal approach corresponds
with reality [7].
Although it gives no indication of the physical processes of creation, this
multifractal approach would, however, be very useful if it led to phenomenological
predictions about certain aspects of the galaxies' distribution (the frequency of
voids, etc.). But the scientists who were using these "multifractal" approaches
were oriented toward calculating global information and parameters, and they
were neglecting the study of the local fractal structure of the galaxies.

11.5. The advent of wavelets.

We encounter the same situation here as the one that arose in the study of
turbulence. In that case, the passage from global fractal properties to local fractal
properties required the use of wavelets. The hope placed in the analysis of
astronomic images using wavelets is, without doubt, reasonable if one considers
the arguments presented by Marr. Wavelet analysis is used to delimit the
boundaries of objects and, by so doing, to arrive at their three-dimensional
organization. But it is precisely this three-dimensional organization that is sought
by the astrophysicists.
On the other hand, at a more imaginative level, it is tempting to compare the
flow of the Universe over time to that of a fluid and to continue this metaphor
by comparing the galaxies to the active zones of fully developed turbulence.
Wavelet analysis of astronomic images is thus as natural as the wavelet analysis
of turbulence.
We hope, by these few lines, to have stimulated the readers' curiosity and,
in this case, we encourage him or her to go to the original scientific articles by
Bijaoui, Slezak, and others.

Bibliography
[1] PH. BENDJOYA, E. SLEZAK, AND CL. FROESCHLÉ, The wavelet transform, a new tool for
asteroid family determination, Astronom. and Astrophys., to appear.
[2] A. BIJAOUI, Algorithmes de la transformation en ondelettes, Application à l'image
astronomique, Proc. du Cours CEA/EDF/INRIA, 1991, pp. 1-26.
[3] --, Le ciel lointain est-il un mirage? Ciel et Espace, numéro spécial "Du big-bang à
nos jours," June-July-August 1991.
[4] A. BIJAOUI AND M. GIUDICELLI, Optimal image addition with the wavelet transform,
Experiment. Astron., 1 (1991), pp. 347-383.
[5] A. BIJAOUI, E. SLEZAK, AND G. MARS, The wavelet transform: A new way to describe the
Universe, Workshop on the Distribution of Matter in the Universe, Meudon, France,
March 1991.
[6] E. ESCALERA, E. SLEZAK, AND A. MAZURE, New evidence for subclustering in the Coma
cluster using the wavelet analysis, Astronom. and Astrophys., 264 (1992), pp. 379-384.
[7] B. J. T. JONES, V. J. MARTINEZ, E. SAAR, AND J. EINASTO, Multifractal description of the
large-scale structure of the universe, Astrophys. J., 332 (1988), pp. 1-5.
[8] E. SLEZAK, A. BIJAOUI, AND G. MARS, Identification of structures from galaxy counts:
Use of the wavelet transform, Astronom. and Astrophys., 227 (1990), pp. 301-316.

Index

adaptive filterinc. Su wavelet packets


adaptive ~tion. See Malvar wavelets
Adelaoo. See ol.o pyramid aJ&orithms
image Pf'O(*Iing, 34, 4S
analytic aignal, 23. See aLto Wiper-Ville
transfonn
instant.&neoul frequency, 70-71
Aroeodo
Fnacb aud Pariai bypotbeeia, 17, 121
atomic decompo.ltlona
Calder6o ' identity, 24
Hardy~ 23-:U, 26
V'(0,1), 24
wavelet aualysia, 31
Wwu=r-VWe tranafonn, 7S
atoms, 4-6. See aLto tim.&equency atom&

G~Morlet theory, 28
rec:l.i8covered by Gro.mann aDd
Morlet, 13
Chqe Coupled Device~
in~y.125

chirp 8ipaJa,
Cieaielaki,22

~.

72

c:odiJI&. See .Z.o aipal &Dei UD.ce PfOC*Illl&

Burt aDd A del8oc'a alaorithm, 52, 55


bi-onbopmal wavelet~, 61
liDeat pr.1iction c:od.iD&. 33
Malia' ~ aJaorit.hm. 39
Nallat'a ~mace cod.ila& aJ&nrithm, 106
ortbocoaal pyramids, 57
wbbaud CX>diD&, 3, 34-36
tramform~, 3, 33-34
u-de aDd 8uc:tuat.ious, 55, 57-58
~.s, 13,101
Cobeu
t-ort.boppDaa waveleta, 5!HlO
COiro/Waw:e of Mallat'a algorithm, 58
coberat -..-, 25
CoifmaD. See ol.o Weia
Hardy ~. 26-27
Luaiu' theory, 24

Balian
t~frequency reprweutation, 63
Balian-Low theorem, 35,76-77

Battle
~.31

Beousi
Gau.iao-Markov fields, 19
best basis. Su ol.o Malvar waveleu
approach to turbulence, 123

Malvar waveleu, 83-84


wavelet packets, dual to Malva: wavelets,

t~~cyt-.,77

99

Beylkin
waveletS aud numerical analYSis, 122
bi-orthosonal

pyramids. See pyramid algorithms


wavelets, 5H1
Bijaoui
,.,.,.vewt analytis in astronom.,. 111. 12!>
Boulez
Moun's Magic Flut.e, 65
Brownian motion. 18-19
!;urt and Adellon'a algorithm. Se~ pyrAmid
algoridunB.

Coifm&D-W.-, 26-27
com~on. See aLto lipal and imece
~

uymptotic limita of cod.i.rlg acbemes. 34


optimal with Malvar basis. 83
pyramid .J&orithma, 51
Croissier
quadratw--. mirror filters. 30. 3:\
cubic apline:s. 48 . 107-108

Daubechiea
bi-or1.hoc0Dal -veleu, with Cohen &Dd
~u. 5~

extended Haar'a work, 6. SO


tirz>e..frequal~ wavelet, with JaiJanl and
Jo~. 77
-velet.<. ~~ 10. 29-30

Calderon. 13
identity
drrompoaition of tlu> identity. 2t
129

130

INDEX

Daubechlee (continued)
con.t.ruction, 43-44

atatl.ltical theory of turbulence ,


119-120
Ville' critique of, 63
-wlet history, 11

~11187.27

padceu, i l
O.U~Jalf&l'd-JCl!UM

wa..elet.a. 80

decima&loo opentor, 3S-36


d~.

See aipal and im processing


o..can.-. 101
dilfereutlability. See allo Riemann 'a function;
Weiemraa function
criterion in term.1 of wavelets, 112-113
Dini condition, 113- 115
Discrete Coline 'I)ansform. 7C. ~
Dilcrete Sine 'I)ansform. 77
Du Bois-Reymond
diV'el'ltnt Fourier aeries. 14
dual baei.s. See IUeu bas;s
dyadic block&, 19-20
energy criterion. S, 83
energy density. 66. See ol.so W igner- Ville
transform
entropy criterion, 83
optimal algorithm, 9
wavelet packets, 96
Esteban
quadrature mirror filter&, 30. 33
exact reconstruction, 29
extenaioa operator, 3S-36, 47, 52. 56
Faber. 16
F~

""llwlet analysis of turbulence. 75. 119,


121-122
Fau>u. 14
Feauveau
bi-ortbogonal -velets. 59--60
Federbuab
renonnalization, 31
filter bank$. 81
filtering/ aampling, 4;
filters. See ol.so quadrature mirror filterS
convolution, 42
finite impulse response. 3i
finite leogth , BUrt and Adelson s. 49
high-pass. 36. 38. 9i
ideal. 34-35. 38, 44. 9S
low-pass, 36, 38. 4i. 56. 97
linf!erprint storage. 3. i
Flandrin
Brownian motion. I ll
fluctuation. Set> trend and fluct Ul\ton
Founer
1!107 announcement. I~
anal~'Si
B~'tlian

motion. 11insta.ntaneous , 66
2
L !0.I}. 29
Sl&tio~ ~als. !I

_...

inlonnation hidden in dyadic block&, 19


provide no local information , 17
Riemaon'a function, 117
fractal. See olio multifractal
expooeotl, 11, 17
atructure of Riemann's function. 13
at.ructure of t urbulence, 120
Franklin. 21- 22
unconditional basis, 27
frequency channel, 34, 97-98
Freud. 111- 112
Frisch and Parisi
hypothesis. 120-121
Gabor 6, G:H;4 , 76
w"veleu
compared with Malvar wavelets. 86
optimal property, 90
~~ result, 90-91
time-frequency alOin$, 89
Gagne signal, 121
Galand
quadratic mirror filters, 33
aubband coding. 34
Galerkin 's method, 123
Gerver, Riemann's function, 116
global warming, 1- 2
Groamann- Morlet, 13
waveletS (analysis), H
analysis of multifractal objects. II
cboi<les. 8
compared with Malvar wav.:lets. 86
multiresolutioo analysis, 5
related to other wavelets. 2f.-2SI
Haar

s~em.

15-16

atomic decomposition of V'[0.1;. 24


Brownian motion. 18

compared itb Daubechies's wl\velets. 31


compared "'itb Franklin and Schauder
bases. 21
exception to no~-mmetry. 59
extreme point in Daubechies~
construction. 44
Hlll'd,.
Riemannf funct ion. liG
space;.. :!3
real \'ersion . 26
rol<' in Signal processin~;. :!:;
Hausdorff dtmension
fractal expo.n ents. 17
turbulence. 12(1
Hei~~en~rs:

re<:\.allf:lto< (boxes). 73.


linceruunt-~

7~,

P rinciple. 3G. II!J

131

INDEX

Hilbert transform, 23, 26, 72


Hildreth
sero-a'OIIIinp, 104
Hillcoranl
i.mace proceMing, 34, 45
HoldeT exponenta (condition) , 17
bklrtbogonal ...,.velets, 71
Brownian motion, 18
D aubecbiee'1 .....veleta, 43, 113
Holechn~deT

Riemann' function , 13, 17, 116


im~e

numerical repreeentation, 2
image procee!ing. s~ GUO pyramid
aJaor itbms; signal and image
prooessing

bi-ortbogonal .....veleta and symmetry, 59


compression, 51
information at diHeTent acales, 58
judging quality, 7
MarT's Mleas, 104

m.tantaneous &equency, 70

wavelets, 22- 24
Mall&t

formulation of Marr' conjecture, 107-108


cue of perfect~. 109
countere:u.mple, 108-109
"berrincboneM algorithm, 39-40
convergence to .....velets, 42-43
eyntheeis, 6, 30
two-dimenaion&l algorithm. 57
uymptotic behavior, 58
Malvar
optimal buis, 83-85
wavelets
adaptive eegmentation , 81, 99
advautacet and disadvantages, 91
dilcrete cue, 86
re1atiOD with Lemari6-Meyet" wavelets,

Mandelbrot

Jaffard
.....velets,

77, 80
Gaussian-Markov fields, 19
2-microloc:al e.nalysis, 118

Joume. Daubechies-Jaffard-Journ~ wavelets,


77. 80
Jules%
vilion experiments. 102
Julia. 14
Kruslcal

solitons. 122

Lang. 116
Laplacian pyramid. 51
~besgue. 15
~marif..:llleYer wavelets. 80-Sl
~rav
.
stat~ICal theory of turbulence. 119

Un. 11(, 1!l


lib~ of bases. S wavelet libraries

L~n~d

Lu.in

80-85
.
almllar to ...,.ve)et pacltet.&, 92, 96
auited to apeech aod muaic, 7
variable length wiDdowa, 77~

exampl5. 71
not a local property, 72
inst&ntaneoU& epectrum, 71
Ville's ideas, 64, 66
interpOlation operator, 47
ltattu
Riemann's function, 117

Daubechles-Jaffard-Journ~

~mace~.21

rela&ed to other theories, 27-29


underlie~ Mall&t' image Work, 13
Littlewood-Pa.ley-St.elll theory, 20-21 , 28

titm'-frequenC'y atoms. 64~


t itm'-freqUftiC')' methodolog:\. 96
LiOIL. IC'Ctu""' at thC' Spanish Inst itute. 2
Littlf'wood-Pal..y anal~-sis. 1~:!1

Brownian 1110tlon, 19
remark on "Ftane-Cult,u re," 7
1\ory of &ac:Ws, 14
turbuleDce, 120
Man:inldewicz
atomic decomposit ion for V'(0,1], 24

Ma:r
conjecture, 104-105, 110
counterexample. 105-106
Mallat'a formulation, 107- 108
ideas
astronomy, 1.26
optimal repreeentation, 9-10
repreaentation of visual information,
103- 104
adence of villon, 102- 103
VWion, 45, 101
wavelet. 10
merging. 81-62
wavelet packet. 96-99
Minsky. 101
Montaigne's EuoJI&. 1
Morlet. Sec Grossmann- Morlel
mot her wavelet , 5
first appearance in matbematics. 20
joined by fatheT ...avelet, 41. 43
Moyal's formula. 66
multifrac:tal. See ol.fo fractal
approach w structure of thC' Uni~. 11
hypothesi~' of Fnsch and Parisi a bout
turbulence:. 120
structure of Riemann ' f function. 17

INDEX

132

umque repn~~enUtion venw; Stable


reco...truction, 110
zer<Kroeainp, 104-1~
restriction operator, 4&-50, 55-56
!Ueman.a'a function
analyzed, 115-117
fractal atructure, 13, 17
myatery,US
R.ieez basis, S4, SIHiO
Roux
Gauuian- Mvk.ov fields, 19

multireeolutioc analysis, 53-S4


GroeemaDJ>-Morlet analysis, H
one dim~, 40-42
Navier-Stobe equation, 119, 122
Nishihara
u~ algorithm, 108
numerical image pr~. See image
p~

NyquWt. coDdltion, 47
O'Neil. See Woods and O'Neil
optimal algorithms, 7-9. See GUO energy
criterion; entropy criterion
orthogonal pyramids, 56. See GUo pyramid
algorithm
coding is efficient, 55, 57
exact reconstruction, 57
orthonormal wavelet baaes as limits. 5i
Parisi. See Frisch
perfect (exact) reconstruction property. 29. 3i
phy&ical image
filtering and sampling, 46-48
peeudodifferential calculw;, 66-07, 69
pyramid algorithm
bi-ortbogonal wavelets, 59-61
Burt and Adeleon's algorithms, 4&-50,
51- 53
cartographic example, 45
examples, ~51

iDefiicient coding in Burt and


algorithm, 55
orthogonal pyramids. 55-58

Ade~~

quadrature mirror filters. 36. See aLso filters:


pyramid algorithms
Daubecbies's wavelets. 44
examples, 38
Mallat constructed t ime-scale algorithm.
39
orthonormal wavelets. 40-44
perfect reconstruction. 3i, 40
properties. 37
quaotiz&tion noise. 3
two-dimensional generalization. 5i
wavelet packets. 33-34. 36. 39
quantization. 59. See aLso signal and image
proceeaing
noise and subband coding. 34
quasi-station~ signals. 4
rf'COIUtruction. See perfect reconstruction:
quadrature mirror filters
repraentat.ion
Marr'~ idea.s, !l-10. 101
signals in the timf'-frequency plane. 6:i4lt>

sampled images, 46-47


Schauder basis, 18-19, 21
segmentation. See Malvar wavelets
Séré
results on Gabor wavelets, 90-91


Shannon
smoothing and sampling, 42
wavelets, 41
signal and image processing
archiving data, 3
coding, 1-4
compression, 2-4, 30
decoding, 3
influence on mathematics, 30-31
objectives, 1-4
quantization, 2-3
reconstruction (restoration), 1-2
transmission, 1-3
Simoncelli
image processing, 34, 45
Slezak, 127
split-and-merge algorithm, 81-82
splitting algorithm, 97-9~
stationary signals
Fourier analysis, 4, 5
Stein
extension of Littlewood-Paley theory, 20
Strömberg's wavelets, 13, 22, 28-30
subsampling, 35
trend and fluctuations, 38-39
synthesis
favored over analysis with Malvar
wavelets, 75
opposite of analysis, 25
signal processing task, 2-3
Tchamitchian
bi-orthogonal wavelets, 59
multifractal nature of Riemann's function,
17, 116
time-frequency
algorithms, 10
Galand, 39
Malvar, 75
atoms. See also Wigner-Ville transform;
Malvar wavelets; wavelet packets
atomic decompositions, 4
Gabor, 64, 89-90

time-frequency (continued)
Lienard, 64-65
Lienard's compared to Malvar's
wavelets, 79
Mozart's Magic Flute, 65
methods, 111
plane
ideas of Gabor, Balian, Ville, Lienard,
63-65

wavelets, 6-7
time-scale analysis (wavelets). See also
multiresolution analysis
Grossmann-Morlet, 14
Malvar's wavelets versus time-scale wavelets, 80
transfer function, 56
transient signals, 4, 6, 8
transition operators, 48-50
used for coding, 51-53
trend and fluctuation, 39
Mallat's "herringbone" algorithm, 40
pyramid algorithms, 51-52, 55-57
turbulence
Arneodo's wavelet analysis, 121
hypothesis of Frisch and Parisi, 120-121
Marie Farge's ideas, 75, 121-122
statistical theory, 119
Wickerhauser's analysis, 122
2-microlocal analysis, 118
unconditional basis, 26-27
Ville
analytic signals, 23

comments on Fourier analysis, G~
instantaneous frequency, 70-73
thoughts on signal processing, 75-76,
81, 96
vision. See Marr's conjecture; Marr's ideas
von Neumann
computers and turbulence, 121
Walsh system, 92-93
wavelet
analysis
astrophysics, 126-127
compared with Fourier analysis, 112
differentiability, 112-113

Riemann's function, 116-117
signals with fractal structure, 4
status within mathematics, 31
synthesis of earlier theories, 27-31
turbulence, 120-123
Weierstrass function, 111-112

bases versus the trigonometric basis, 59


coefficients
differentiability, 112-113
Hölder condition, 17
zero-crossings, 105
wavelet libraries, 8, 10, 96
wavelet packets
adaptive filtering, 99
advantages and disadvantages, 91
basic wavelet packets, 92-95
compared with Malvar wavelets, 76
discrete case, 99
finding optimal basis, 98-99
frequency localization, 98
general wavelet packets, 95-96
roots in Galand's ideas, 33-34, 36

Weierstrass
function, 111
Riemann's function, 115-116


Weiss. See also Coifman
decompositions of Hardy spaces, 26-27
interprets Lusin's theory, 24
Weyl ~ 6~70. See also
pseudodifferential calculus
Wickerhauser's algorithm, 122-123
Wigner-Ville transform, ~7
asymptotic signals, 72-73
energy density, 77
instantaneous frequency, 70-72
properties, 67-70
synthesis algorithm missing, 75
Wilson's basis, n
windowed Fourier analysis. See Malvar
wavelets; wavelet packets
Wojtaszczyk
Franklin system, 27
Woods and O'Neil, 55

Zabusky
coherent structures in turbulence, 121-122
zero-crossings. See Marr's conjecture
Zygmund, 20
