
WAVELETS FOR KIDS

A Tutorial Introduction

By
Brani Vidakovic and Peter Müller
Duke University

Strictly speaking, wavelets are a topic of pure mathematics; however, in only a few years of existence as a theory of their own, they have shown great potential and applicability in many fields.

There are several excellent monographs and articles about wavelets, and this modest tutorial does not intend to compete with any of them. Rather, it is intended to serve as a very first reading, giving examples interesting for the statistical community. We also give references for further reading, as well as some Mathematica do-it-yourself procedures.

Key words and phrases: Wavelets, Multiresolution analysis (mra), Haar wavelet, Thresholding.

1991 AMS Subject Classification: 42A06, 41A05, 65D05.

Contents

1 What are wavelets?
2 How do the wavelets work?
  2.1 The Haar wavelet
  2.2 Mallat's multiresolution analysis, filters, and direct and inverse wavelet transformation
      2.2.1 Mallat's mra
      2.2.2 The language of signal processing
3 Thresholding methods
  3.1 Different thresholding policies
      3.1.1 Hard thresholding
      3.1.2 Soft thresholding
      3.1.3 Quantile thresholding
      3.1.4 Universal thresholding
4 Example: California earthquakes
5 Wavelet image processing
6 Can you do wavelets?
7 Appendix

1 What are wavelets?


Wavelets are functions that satisfy certain requirements. The very name wavelet comes from the requirement that they should integrate to zero, "waving" above and below the x-axis. The diminutive connotation of wavelet suggests that the function has to be well localized. Other requirements are technical and needed mostly to ensure quick and easy calculation of the direct and inverse wavelet transforms.

There are many kinds of wavelets. One can choose between smooth wavelets, compactly supported wavelets, wavelets with simple mathematical expressions, wavelets with simple associated filters, etc. The simplest is the Haar wavelet, and we discuss it as an introductory example in the next section. Examples of some wavelets (from the family of Daubechies wavelets) are given in Figure 1. Like sines and cosines in Fourier analysis, wavelets are used as basis functions in representing other functions. Once the wavelet (sometimes called the mother wavelet) $\psi(x)$ is fixed, one can make a basis of translations and dilations of the mother wavelet, $\{\psi(\frac{x-b}{a}),\ (a,b) \in R^+ \times R\}$. It is convenient to take special values for $a$ and $b$ in defining the wavelet basis: $a = 2^{-j}$ and $b = k \cdot 2^{-j}$, where $k$ and $j$ are integers. This choice of $a$ and $b$ will give a sparse basis. In addition, this choice naturally connects multiresolution analysis in signal processing with the world of wavelets.
Wavelet novices often ask: why not use the traditional Fourier methods? There are some important differences between Fourier analysis and wavelets. Fourier basis functions are localized in frequency but not in time. Small frequency changes in the Fourier transform will produce changes everywhere in the time domain. Wavelets are local in both frequency/scale (via dilations) and in time (via translations). This localization is an advantage in many cases.

Second, many classes of functions can be represented by wavelets in a more compact way. For example, functions with discontinuities and functions with sharp spikes usually take substantially fewer wavelet basis functions than sine-cosine basis functions to achieve a comparable approximation.

This sparse coding makes wavelets excellent tools in data compression. For example, the FBI has standardized the use of wavelets in digital fingerprint image compression. The compression ratios are on the order of 20:1, and the difference between the original image and the decompressed one can be told only by an expert. There are many more applications of wavelets, some of them very pleasing. Coifman and his Yale team used wavelets to clean noisy sound recordings, including old recordings of Brahms playing his First Hungarian Dance on the piano.

This already hints at how statisticians can benefit from wavelets. Large and noisy data sets can be easily and quickly transformed by the discrete wavelet transform (the counterpart of the discrete Fourier transform). The data are coded by the wavelet coefficients. In addition, the epithet "Fast" for the Fourier transform can, in most cases, be replaced by "Faster" for the wavelets. It is well known that the computational complexity of the FFT is $O(n \log_2 n)$. For the fast wavelet transform (FWT) the computational complexity goes down to $O(n)$.

Figure 1: Wavelets from the Daubechies family




Many data operations can now be done by processing the corresponding wavelet coefficients. For instance, one can do data smoothing by thresholding the wavelet coefficients and then returning the thresholded code to the "time domain." The definition of thresholding and different thresholding methods are given in Section 3.

RAW DATA -> WAVELET DECOMPOSITION -> THRESHOLDING -> WAVELET RECONSTRUCTION -> PROCESSED DATA

Figure 2: Data analysis by wavelets


2 How do the wavelets work?
2.1 The Haar wavelet
To explain how wavelets work, we start with an example. We choose the simplest and the oldest of all wavelets (we are tempted to say: the mother of all wavelets!), the Haar wavelet $\psi(x)$. It is a step function taking values 1 and $-1$ on $[0, \frac12)$ and $[\frac12, 1)$, respectively. The graph of the Haar wavelet is given in Figure 3.

The Haar wavelet has been known for more than eighty years and has been used in various mathematical fields. It is known that any continuous function can be approximated uniformly by Haar functions. (Brownian motion can even be defined by using the Haar wavelet.¹) Dilations and translations of the function $\psi$,
\[ \psi_{jk}(x) = \mathrm{const} \cdot \psi(2^j x - k), \]
define an orthogonal basis in $L^2(R)$ (the space of all square integrable functions). This means that any element in $L^2(R)$ may be represented as a linear combination (possibly infinite) of these basis functions.
The orthogonality of the $\psi_{jk}$ is easy to check. It is apparent that
\[ \int \psi_{jk}\, \psi_{j'k'} = 0 \tag{1} \]
whenever $j = j'$ and $k = k'$ is not satisfied simultaneously.

If $j \neq j'$ (say $j' < j$), then the nonzero values of the finer wavelet $\psi_{jk}$ are contained in a set where the coarser wavelet $\psi_{j'k'}$ is constant; since $\psi_{jk}$ integrates to zero, that makes integral (1) equal to zero.

If $j = j'$ but $k \neq k'$, then the supports are disjoint, so at least one factor in the product $\psi_{j'k'} \cdot \psi_{jk}$ is zero everywhere. Thus the functions $\psi_{jk}$ are orthogonal.


¹ If $\xi_{jk}$ are iid $N(0,1)$ and $S_{jk}(t) = \int_0^t \psi_{jk}(x)\,dx$, then $B_t =_{def} \sum_{j=1}^{\infty} \sum_{k=0}^{2^j-1} \xi_{jk} S_{jk}(t)$ (P. Lévy).


Figure 3: Haar wavelet


The constant that makes this orthogonal basis orthonormal is $2^{j/2}$. Indeed, from the definition of the norm² in $L^2$:
\[ 1 = (\mathrm{const})^2 \int \psi^2(2^j x - k)\,dx = (\mathrm{const})^2\, 2^{-j} \int \psi^2(t)\,dt = (\mathrm{const})^2\, 2^{-j}. \]

The functions $\psi_{10}, \psi_{11}, \psi_{20}, \psi_{21}, \psi_{22}, \psi_{23}$ are depicted in Figure 4. The set $\{\psi_{jk},\ j \in Z,\ k \in Z\}$ defines an orthonormal basis for $L^2$. Alternatively, we will consider orthonormal bases of the form $\{\phi_{j_0 k}, \psi_{jk},\ j \geq j_0,\ k \in Z\}$, where $\phi$ is called the scaling function associated with the wavelet basis $\psi_{jk}$. The set $\{\phi_{j_0 k},\ k \in Z\}$ spans the same subspace as $\{\psi_{jk},\ j < j_0,\ k \in Z\}$. We will later make this statement more formal and define $\phi_{jk}$. For the Haar wavelet basis the scaling function is very simple. It is unity on the interval $[0,1)$, i.e.
\[ \phi(x) = \mathbf{1}(0 \leq x < 1). \]
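For readers who want to experiment as they read, the Haar functions are easy to code. The following Mathematica sketch (our own throwaway definitions, not part of the package in the Appendix; phi, psi and psijk are ad-hoc names) defines $\phi$, $\psi$ and $\psi_{jk}$ and checks orthonormality for two of the functions of Figure 4:

    phi[x_] := Piecewise[{{1, 0 <= x < 1}}]                        (* scaling function *)
    psi[x_] := Piecewise[{{1, 0 <= x < 1/2}, {-1, 1/2 <= x < 1}}]  (* Haar wavelet *)
    psijk[j_, k_, x_] := 2^(j/2) psi[2^j x - k]                    (* normalized dilation/translation *)

    Integrate[psijk[1, 0, x] psijk[2, 1, x], {x, 0, 1}]   (* orthogonality: gives 0 *)
    Integrate[psijk[1, 0, x]^2, {x, 0, 1}]                (* unit norm: gives 1 *)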
The statistician may be interested in wavelet representations of functions generated by data sets.

Let $y = (y_0, y_1, \dots, y_{2^n-1})$ be the data vector of size $2^n$. The data vector can be associated with a piecewise constant function $f$ on $[0,1)$ generated by $y$ as follows:
\[ f(x) = \sum_{k=0}^{2^n-1} y_k \cdot \mathbf{1}(k 2^{-n} \leq x < (k+1) 2^{-n}). \]
The (data) function $f$ is obviously in the $L^2[0,1)$ space, and the wavelet decomposition of $f$ has the form
\[ f(x) = c_{00}\, \phi(x) + \sum_{j=0}^{n-1} \sum_{k=0}^{2^j-1} d_{jk}\, \psi_{jk}(x). \tag{2} \]

The sum with respect to $j$ is finite because $f$ is a step function, and everything can be exactly described by resolutions up to the $(n-1)$-st level. For each level the sum with respect to $k$ is also finite because the domain of $f$ is finite. In particular, no translations of the scaling function $\phi_{00}$ are required.

² $\|f\|^2 =_{def} \langle f, f \rangle = \int f^2$.

Figure 4: Dilations and translations of the Haar wavelet on [0,1]

We fix the data vector $y$ and find the wavelet decomposition (2) explicitly. Let $y = (1, 0, -3, 2, 1, 0, 1, 2)$. The corresponding function $f$ is given in Figure 5. The following matrix equation gives the connection between $y$ and the wavelet coefficients. Note the constants $2^{j/2}$ ($1$, $\sqrt2$ and $2$) multiplying the Haar wavelets on the corresponding resolution levels ($j = 0, 1,$ and $2$).
\[
\begin{bmatrix} 1 \\ 0 \\ -3 \\ 2 \\ 1 \\ 0 \\ 1 \\ 2 \end{bmatrix}
=
\begin{bmatrix}
1 &  1 &  \sqrt{2} &         0 &  2 &  0 &  0 &  0 \\
1 &  1 &  \sqrt{2} &         0 & -2 &  0 &  0 &  0 \\
1 &  1 & -\sqrt{2} &         0 &  0 &  2 &  0 &  0 \\
1 &  1 & -\sqrt{2} &         0 &  0 & -2 &  0 &  0 \\
1 & -1 &         0 &  \sqrt{2} &  0 &  0 &  2 &  0 \\
1 & -1 &         0 &  \sqrt{2} &  0 &  0 & -2 &  0 \\
1 & -1 &         0 & -\sqrt{2} &  0 &  0 &  0 &  2 \\
1 & -1 &         0 & -\sqrt{2} &  0 &  0 &  0 & -2
\end{bmatrix}
\cdot
\begin{bmatrix} c_{00} \\ d_{00} \\ d_{10} \\ d_{11} \\ d_{20} \\ d_{21} \\ d_{22} \\ d_{23} \end{bmatrix}
\]


Figure 5: "Data function" on [0,1)

The solution is
\[
\begin{bmatrix} c_{00} \\ d_{00} \\ d_{10} \\ d_{11} \\ d_{20} \\ d_{21} \\ d_{22} \\ d_{23} \end{bmatrix}
=
\begin{bmatrix} \tfrac12 \\ -\tfrac12 \\ \tfrac{1}{2\sqrt2} \\ -\tfrac{1}{2\sqrt2} \\ \tfrac14 \\ -\tfrac54 \\ \tfrac14 \\ -\tfrac14 \end{bmatrix}.
\]
Thus,
\[ f = \tfrac12 \phi - \tfrac12 \psi_{00} + \tfrac{1}{2\sqrt2} \psi_{10} - \tfrac{1}{2\sqrt2} \psi_{11} + \tfrac14 \psi_{20} - \tfrac54 \psi_{21} + \tfrac14 \psi_{22} - \tfrac14 \psi_{23}. \tag{3} \]
The solution is easy to check. For example, when $x \in [0, \frac18)$,
\[ f(x) = \tfrac12 - \tfrac12 \cdot 1 + \tfrac{1}{2\sqrt2} \cdot \sqrt2 + \tfrac14 \cdot 2 = 1. \]
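The same check can be left to the machine. A quick Mathematica verification (W and y are our ad-hoc names; this snippet is not part of the Appendix package) solves the matrix equation above:

    W = {{1,  1,  Sqrt[2],  0,        2,  0,  0,  0},
         {1,  1,  Sqrt[2],  0,       -2,  0,  0,  0},
         {1,  1, -Sqrt[2],  0,        0,  2,  0,  0},
         {1,  1, -Sqrt[2],  0,        0, -2,  0,  0},
         {1, -1,  0,        Sqrt[2],  0,  0,  2,  0},
         {1, -1,  0,        Sqrt[2],  0,  0, -2,  0},
         {1, -1,  0,       -Sqrt[2],  0,  0,  0,  2},
         {1, -1,  0,       -Sqrt[2],  0,  0,  0, -2}};
    y = {1, 0, -3, 2, 1, 0, 1, 2};
    LinearSolve[W, y]
    (* {1/2, -1/2, 1/(2 Sqrt[2]), -1/(2 Sqrt[2]), 1/4, -5/4, 1/4, -1/4} *)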
2 2
The reader may already have the following question ready: "What will we do for vectors $y$ of much bigger length?" Obviously, solving the matrix equations directly becomes practically impossible.

2.2 Mallat's multiresolution analysis, filters, and direct and inverse wavelet transformation
An obvious disadvantage of the Haar wavelet is that it is not continuous; therefore the choice of the Haar basis for representing smooth functions, for example, is neither natural nor economical.
2.2.1 Mallat's mra

As a more general framework we explain Mallat's Multiresolution Analysis (mra). The mra is a tool for a constructive description of different wavelet bases.
We start with the space $L^2$ of all square integrable functions.³ The mra is an increasing sequence of closed subspaces $\{V_j\}_{j \in Z}$ which approximate $L^2(R)$.

Everything starts with a clever choice of the scaling function $\phi$. Except for the Haar wavelet basis, for which $\phi$ is the characteristic function of the interval $[0,1)$, the scaling function is chosen to satisfy some continuity, smoothness and tail requirements. But, most importantly, the family $\{\phi(x-k),\ k \in Z\}$ forms an orthonormal basis for the reference space $V_0$. The following relations describe the analysis.

mra 1: $\cdots \subset V_{-1} \subset V_0 \subset V_1 \subset \cdots$

The spaces $V_j$ are nested. The space $L^2(R)$ is the closure of the union of all $V_j$; in other words, $\bigcup_{j \in Z} V_j$ is dense in $L^2(R)$. The intersection of all $V_j$ contains only the zero function.

mra 2: $f(x) \in V_j \Leftrightarrow f(2x) \in V_{j+1},\ j \in Z$.

The spaces $V_j$ and $V_{j+1}$ are "similar." If the space $V_j$ is spanned by $\phi_{jk}(x),\ k \in Z$, then the space $V_{j+1}$ is spanned by $\phi_{j+1,k}(x),\ k \in Z$. The space $V_{j+1}$ is generated by the functions $\phi_{j+1,k}(x) = \sqrt2\, \phi_{jk}(2x)$.

We now explain how the wavelets enter the picture. Because $V_0 \subset V_1$, any function in $V_0$ can be written as a linear combination of the basis functions $\sqrt2\, \phi(2x-k)$ from $V_1$. In particular:
\[ \phi(x) = \sum_k h(k)\, \sqrt2\, \phi(2x-k). \tag{4} \]
The coefficients $h(k)$ are defined as $\langle \phi(x), \sqrt2\, \phi(2x-k) \rangle$. Consider now the orthogonal complement $W_j$ of $V_j$ in $V_{j+1}$ (i.e. $V_{j+1} = V_j \oplus W_j$). Define
\[ \psi(x) = \sqrt2 \sum_k (-1)^k h(-k+1)\, \phi(2x-k). \tag{5} \]
It can be shown that $\{\sqrt2\, \psi(2x-k),\ k \in Z\}$ is an orthonormal basis for $W_1$.⁴
³ A function $f$ is in $L^2(S)$ if $\int_S f^2$ is finite.
⁴ This can also be expressed in terms of Fourier transformations as follows. Let $m_0(\omega)$ be the Fourier transform of the sequence $h(n)$, $n \in Z$, i.e. $m_0(\omega) = \sum_n h(n) e^{in\omega}$. In the "frequency domain" the relation (4) reads $\hat\phi(\omega) = m_0(\frac{\omega}{2}) \hat\phi(\frac{\omega}{2})$. If we define $m_1(\omega) = e^{-i\omega} \overline{m_0(\omega + \pi)}$ and $\hat\psi(\omega) = m_1(\frac{\omega}{2}) \hat\phi(\frac{\omega}{2})$, then the function $\psi$ corresponding to $\hat\psi$ is the wavelet associated with the mra.

Again, the similarity property of the mra gives that $\{2^{j/2} \psi(2^j x - k),\ k \in Z\}$ is a basis for $W_j$. Since $\bigcup_{j \in Z} V_j = \bigoplus_{j \in Z} W_j$ is dense in $L^2(R)$, the family $\{\psi_{jk}(x) = 2^{j/2} \psi(2^j x - k),\ j \in Z,\ k \in Z\}$ is a basis for $L^2(R)$.

For a given function $f \in L^2(R)$ one can find $N$ such that $f_N \in V_N$ approximates $f$ up to preassigned precision (in terms of $L^2$ closeness). If $g_i \in W_i$ and $f_i \in V_i$, then
\[ f_N = f_{N-1} + g_{N-1} = \sum_{i=1}^{M} g_{N-i} + f_{N-M}. \tag{6} \]
Equation (6) is the wavelet decomposition of $f$. For example, the data function of Section 2.1 is in $V_n$, if we use the mra corresponding to the Haar wavelet. Note that $f = f_n$ there, and that the coarsest approximation is the constant $f_0 = c_{00}\phi$.
2.2.2 The language of signal processing

We repeat the multiresolution analysis story in the language of signal processing theory. Mallat's multiresolution analysis is connected with so-called "pyramidal" algorithms in signal processing. Also, "quadrature mirror filters" are hidden in Mallat's mra.
Recall from the previous section that
\[ \phi(x) = \sum_{k \in Z} h(k)\, \sqrt2\, \phi(2x-k) \tag{7} \]
and
\[ \psi(x) = \sum_{k \in Z} g(k)\, \sqrt2\, \phi(2x-k). \tag{8} \]
The $l^2$ sequences⁵ $\{h(k),\ k \in Z\}$ and $\{g(k),\ k \in Z\}$ are quadrature mirror filters in the terminology of signal analysis. The connection between $h$ and $g$ is given by
\[ g(n) = (-1)^n h(1-n). \]
The sequence $h(k)$ is known as a low pass or low band filter, while $g(k)$ is known as the high pass or high band filter. The following properties of $h(n)$, $g(n)$ can be proven by using Fourier transforms and orthogonality: $\sum_k h(k) = \sqrt2$, $\sum_k g(k) = 0$.
The most compact way to describe Mallat's mra, as well as to give effective procedures for determining the wavelet coefficients, is the operator representation of filters.
⁵ A sequence $\{a_n\}$ is in the Hilbert space $l^2$ if $\sum_{k \in Z} a_k^2$ is finite.

For a sequence $a = \{a_n\}$ the operators $H$ and $G$ are defined by the following coordinatewise relations:
\[ (Ha)_k = \sum_n h(n-2k)\, a_n, \qquad (Ga)_k = \sum_n g(n-2k)\, a_n. \]
The operators $H$ and $G$ correspond to one step in the wavelet decomposition. The only difference is that the above definitions do not include the $\sqrt2$ factor as in Equations (4) and (5).
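For the Haar filter one decomposition step fits in a few lines of Mathematica. The sketch below (step is our ad-hoc name) wraps the indices periodically; it is essentially what the function Dec of the Appendix does:

    h = {1/Sqrt[2],  1/Sqrt[2]};   (* low pass filter *)
    g = {1/Sqrt[2], -1/Sqrt[2]};   (* high pass filter, g(n) = (-1)^n h(1-n) *)

    (* (Ha)_k = Sum_n h(n - 2k) a_n, with periodic boundary *)
    step[a_List, f_List] :=
      Table[Sum[f[[m]] a[[Mod[2 k + m - 3, Length[a]] + 1]], {m, 1, Length[f]}],
            {k, 1, Length[a]/2}]

    step[{1, 0, -3, 2, 1, 0, 1, 2}, h]  (* {1/Sqrt[2], -1/Sqrt[2], 1/Sqrt[2], 3/Sqrt[2]} *)
    step[{1, 0, -3, 2, 1, 0, 1, 2}, g]  (* {1/Sqrt[2], -5/Sqrt[2], 1/Sqrt[2], -1/Sqrt[2]} *)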
Denote the original signal by $c^{(n)}$. If the signal is of length $2^n$, then $c^{(n)}$ can be represented by the function $f(x) = \sum_k c_k^{(n)} \phi_{nk}(x)$, $f \in V_n$. At each stage of the wavelet transformation we move to a coarser approximation $c^{(j-1)}$ by $c^{(j-1)} = Hc^{(j)}$ and $d^{(j-1)} = Gc^{(j)}$. Here, $d^{(j-1)}$ is the "detail" lost by approximating $c^{(j)}$ by the averaged $c^{(j-1)}$. The discrete wavelet transformation of a sequence $y = c^{(n)}$ of length $2^n$ can then be represented as another sequence of length $2^n$ (notice that the sequence $c^{(j-1)}$ has half the length of $c^{(j)}$):
\[ (d^{(n-1)}, d^{(n-2)}, \dots, d^{(1)}, d^{(0)}, c^{(0)}). \tag{9} \]
Thus the discrete wavelet transformation can be summarized in a single line:
\[ y \longrightarrow (Gy, GHy, GH^2 y, \dots, GH^{n-1} y, H^n y). \]
The reconstruction formula is also simple in terms of $H$ and $G$; we first define the adjoint operators $H^\star$ and $G^\star$ as follows:
\[ (H^\star a)_n = \sum_k h(n-2k)\, a_k, \qquad (G^\star a)_n = \sum_k g(n-2k)\, a_k. \]
Recursive application leads to
\[ (Gy, GHy, GH^2 y, \dots, GH^{n-1} y, H^n y) \longrightarrow y = \sum_{j=0}^{n-1} (H^\star)^j G^\star d^{(j)} + (H^\star)^n c^{(0)}. \]
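The adjoint operators upsample instead of downsampling: each input coordinate spreads over two output coordinates. A sketch (astep is our ad-hoc name, mirroring the function Comp of the Appendix; h and g as in the previous sketch, and the values $c^{(0)} = \sqrt2$, $d^{(0)} = -\sqrt2$ anticipate Figure 6):

    (* (H*a)_n = Sum_k h(n - 2k) a_k ; the output is twice as long as the input *)
    astep[a_List, f_List] :=
      Module[{t = Table[0, {2 Length[a]}]},
        Do[t[[Mod[2 j + i - 3, 2 Length[a]] + 1]] += a[[j]] f[[i]],
           {j, Length[a]}, {i, Length[f]}];
        t]

    (* one reconstruction step: c^(1) from c^(0) and d^(0) *)
    astep[{Sqrt[2]}, h] + astep[{-Sqrt[2]}, g]   (* {0, 2} *)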
Equations (7) and (8), which generate the filter coefficients (sometimes called dilation equations), look very simple for the Haar wavelet:
\[ \phi(x) = \phi(2x) + \phi(2x-1) = \tfrac{1}{\sqrt2}\, \sqrt2\, \phi(2x) + \tfrac{1}{\sqrt2}\, \sqrt2\, \phi(2x-1), \tag{10} \]
\[ \psi(x) = \phi(2x) - \phi(2x-1) = \tfrac{1}{\sqrt2}\, \sqrt2\, \phi(2x) - \tfrac{1}{\sqrt2}\, \sqrt2\, \phi(2x-1). \]
The filter coefficients in (10) are
\[ h(0) = h(1) = \tfrac{1}{\sqrt2}, \qquad g(0) = -g(1) = \tfrac{1}{\sqrt2}. \]

y = c^(3):    1       0      -3       2       1       0       1       2

d^(2):      1/√2   -5/√2    1/√2   -1/√2
c^(2):      1/√2   -1/√2    1/√2    3/√2

d^(1):        1      -1
c^(1):        0       2

d^(0):      -√2
c^(0):       √2

Figure 6: Decomposition procedure



Figure 6 schematically gives the decomposition algorithm applied to our data set. To get the wavelet coefficients as in (3) we multiply the components of $d^{(j)}$, $j = 0, 1, 2$, and $c^{(0)}$ by the factor $2^{-N/2}$. Simply,
\[ d_{jk} = 2^{-N/2} d_k^{(j)}, \quad 0 \leq j < N\ (= 3). \]
It is interesting that in the Haar wavelet case $2^{-3/2} c_0^{(0)} = c_{00} = \frac12$ is the mean of the sample $y$.
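With the package of the Appendix, all of Figure 6 is a single call (a sketch; WT as defined in the Appendix, output shown up to algebraic simplification):

    WT[{1, 0, -3, 2, 1, 0, 1, 2}, {1/Sqrt[2], 1/Sqrt[2]}]
    (* {1/Sqrt[2], -5/Sqrt[2], 1/Sqrt[2], -1/Sqrt[2], 1, -1, -Sqrt[2], Sqrt[2]} *)
    (* i.e. (d^(2), d^(1), d^(0), c^(0)) as in (9); multiplying by 2^(-3/2) gives the d_jk of (3) *)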
Figure 7 schematically gives the reconstruction algorithm for our example.
The careful reader might have already noticed that when the length of the filter is larger than 2, boundary problems occur. (There are no boundary problems with the Haar wavelet!) There are two main ways to handle the boundaries: symmetric and periodic. We should also remark that some problems call for the use of continuous wavelet transforms.

3 Thresholding methods

In wavelet decomposition the filter $H$ is an "averaging" filter, while its mirror counterpart $G$ produces details. The wavelet coefficients correspond to details. When details are small, they might be omitted without substantially affecting the "general picture." Thus the idea of thresholding wavelet coefficients is a way of cleaning out "unimportant" details considered to be noise. We illustrate the idea on our old friend, the data vector $(1, 0, -3, 2, 1, 0, 1, 2)$.
Example: The data vector $(1, 0, -3, 2, 1, 0, 1, 2)$ is transformed into the vector
\[ \left( \tfrac{1}{\sqrt2},\ -\tfrac{5}{\sqrt2},\ \tfrac{1}{\sqrt2},\ -\tfrac{1}{\sqrt2},\ 1,\ -1,\ -\sqrt2,\ \sqrt2 \right). \]
If all coefficients smaller than 0.9 in absolute value (well, our choice) are replaced by zeroes, then the resulting ("thresholded") vector is $(0, -\frac{5}{\sqrt2}, 0, 0, 1, -1, -\sqrt2, \sqrt2)$. The graph of the "smoothed data," after reconstruction, is given in Figure 8.
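With the Appendix package the whole example takes three lines (a sketch; WT and WR as defined in the Appendix, with our threshold 0.9 hard-coded):

    haar = {1/Sqrt[2], 1/Sqrt[2]};
    wt   = WT[{1, 0, -3, 2, 1, 0, 1, 2}, haar];
    thr  = If[Abs[#] < 0.9, 0, #] & /@ wt;    (* keep-or-kill at 0.9 *)
    WR[thr, haar]
    (* {1/2, 1/2, -3, 2, 1/2, 1/2, 3/2, 3/2}, the smoothed sequence of Figure 8 *)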

Wavelet thresholding has important applications in statistics. Donoho and Johnstone (1993) discuss an application to a nonparametric regression problem. Assume we observe an unknown function $f$ with Gaussian noise: $y_i = f(t_i) + \epsilon_i$, $i = 1, \dots, n$, and $\epsilon_i \sim N(0,1)$. The goal is to estimate the unknown function $f$.

c^(0) = (√2)                          H* -> (1, 1)
d^(0) = (-√2)                         G* -> (-1, 1)
                                      sum:  c^(1) = (0, 2)

c^(1) = (0, 2)                        H* -> (0, 0, √2, √2)
d^(1) = (1, -1)                       G* -> (1/√2, -1/√2, -1/√2, 1/√2)
                                      sum:  c^(2) = (1/√2, -1/√2, 1/√2, 3/√2)

c^(2) = (1/√2, -1/√2, 1/√2, 3/√2)     H* -> (1/2, 1/2, -1/2, -1/2, 1/2, 1/2, 3/2, 3/2)
d^(2) = (1/√2, -5/√2, 1/√2, -1/√2)    G* -> (1/2, -1/2, -5/2, 5/2, 1/2, -1/2, -1/2, 1/2)
                                      sum:  c^(3) = (1, 0, -3, 2, 1, 0, 1, 2)

Figure 7: Reconstruction example


Figure 8: "Smoothed" sequence

Donoho and Johnstone propose to start with a wavelet decomposition of the data set, threshold the coefficients, and then use the wavelet reconstruction as the estimate $\hat f$. When using the thresholding rule $\hat d_{jk} = \mathrm{sign}(d_{jk})(|d_{jk}| - \lambda)_+$ with $\lambda = \sqrt{2 \log n}/\sqrt{n}$, the estimate $\hat f$ can be shown to have risk $R(\hat f, f)$ within a factor $2 \log n$ of the minimum risk attained by the (of course unknown) optimal thresholding rule. Here $R$ is given by $R(\hat f, f) = E \sum_i (\hat f(t_i) - f(t_i))^2 / n$. Donoho and Johnstone (1993) show that the (interpolated) function estimate $\hat f$ is, with probability tending to 1 (as $n \to \infty$), at least as smooth as $f$.
Another interesting application of wavelet thresholding arises in density estimation. Assume $X_1, \dots, X_n$ are i.i.d. observations from an unknown probability density function $f(x)$. Donoho, Johnstone, Kerkyacharian, and Picard (1993) define a nonlinear density estimate by thresholding the coefficients in the wavelet decomposition of the empirical p.d.f. If the unknown density is estimated by
\[ \hat f(x) = \sum_{jk} \hat d_{jk}\, \psi_{jk}(x), \]
then, due to the orthonormality of the $\psi_{jk}$'s, the sample estimator of $d_{jk}$ is $\hat d_{jk} = \frac1n \sum_{i=1}^n \psi_{jk}(X_i)$.

Thresholding in this problem reminds us of well known procedures in density estimation by orthogonal series: shrinking and tapering.
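For the Haar wavelet the empirical coefficients are one line on top of the psijk sketch from Section 2.1 (dhat is our ad-hoc name; in practice one would also truncate the levels $j$ and threshold the resulting coefficients):

    dhat[j_, k_, data_List] := Mean[psijk[j, k, #] & /@ data]

    SeedRandom[17];
    sample = RandomReal[{0, 1}, 200];   (* a sample from U(0,1) *)
    dhat[0, 0, sample]                  (* near 0: the uniform density has no detail *)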
In the next subsection we will give a brief tour through some thresholding policies.

3.1 Different thresholding policies

3.1.1 Hard thresholding

The policy of hard thresholding is keep or kill. The absolute values of all wavelet coefficients are compared to a fixed threshold $\lambda$. If the magnitude of the coefficient is less than $\lambda$, the coefficient is replaced by zero:
\[ d^{hard}_{jk} = \begin{cases} 0, & |d_{jk}| < \lambda \\ d_{jk}, & |d_{jk}| \geq \lambda. \end{cases} \]
The function performing hard thresholding is given in Figure 9(a).



Figure 9: Hard and soft thresholding with $\lambda = 1$.


Hard thresholding is used when one is interested in the shortest possible wavelet code. The long sequences of zeroes that usually result in the thresholded wavelet decomposition vector can be coded in an efficient way.
3.1.2 Soft thresholding

Soft thresholding shrinks all the coefficients towards the origin. The formula is
\[ d^{soft}_{jk} = \mathrm{sign}(d_{jk})\, (|d_{jk}| - \lambda)_+. \tag{11} \]
The graph of the function performing soft thresholding is given in Figure 9(b).
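Both policies are one-liners in Mathematica (hard and soft are our ad-hoc names; the threshold $\lambda$ is passed explicitly):

    hard[d_, lam_] := If[Abs[d] < lam, 0, d]          (* keep or kill *)
    soft[d_, lam_] := Sign[d] Max[Abs[d] - lam, 0]    (* shrink towards the origin *)

    hard[1.3, 1]    (* 1.3 *)
    soft[1.3, 1]    (* 0.3 *)
    soft[-0.7, 1]   (* 0 *)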

3.1.3 Quantile thresholding

Let the rule for thresholding be given as
\[ d^{quant}_{jk} = \begin{cases} 0, & |d_{jk}| < \lambda_p \\ d_{jk}, & |d_{jk}| \geq \lambda_p, \end{cases} \]
where $\lambda_p$ is the $p$-quantile of the set of absolute values of all wavelet coefficients. For example, we might want to replace the 30% smallest (in magnitude) wavelet coefficients by zero.
3.1.4 Universal thresholding

Donoho and Johnstone (1992) propose to use the threshold $\lambda = \sigma \sqrt{2 \log n}/\sqrt{n}$ on the wavelet-transformed data set, where $n$ is the sample size and $\sigma$ is the scale of the noise on a standard deviation scale. Universal thresholding can be hard or soft thresholding with the above defined $\lambda$ as the threshold.
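Both thresholds are easy to compute (a sketch with our ad-hoc names lambdaQuant and lambdaUniv; in practice $\sigma$ must be estimated from the data, e.g. from the finest-level coefficients):

    lambdaQuant[coeffs_List, p_] := Quantile[Abs /@ coeffs, p]   (* p-quantile of |d_jk| *)
    lambdaUniv[sigma_, n_] := sigma Sqrt[2 Log[n]]/Sqrt[n]       (* Donoho-Johnstone *)

    (* e.g., soft universal thresholding of a transformed vector wt: *)
    (* soft[#, lambdaUniv[1, Length[wt]]] & /@ wt *)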

4 Example: California earthquakes


A researcher in geology was interested in predicting earthquakes by the level of water in nearby wells. She had a large data set ($8192 = 2^{13}$ measurements) of water levels taken every hour over a period of about one year in a California well. Here is the description of the problem.
The ability of water wells to act as strain meters has been observed for centuries. The Chinese, for example, have records of water flowing from wells prior to earthquakes. Lab studies indicate that a seismic slip occurs along a fault prior to rupture. Recent work has attempted to quantify this response, in an effort to use water wells as sensitive indicators of volumetric strain. If this is possible, water wells could aid in earthquake prediction by sensing precursory earthquake strain.

We have water level records from six wells in southern California, collected over a six year time span. At least 13 moderate size earthquakes (magnitude 4.0-6.0) occurred in close proximity to the wells during this time interval. There is a significant amount of noise in the water level record which must first be filtered out. Environmental factors such as earth tides and atmospheric pressure create noise with frequencies ranging from seasonal to semidiurnal. The amount of rainfall also affects the water level, as do surface loading, pumping, recharge (such as an increase in water level due to irrigation), and sonic booms, to name a few. Once the noise is subtracted from the signal, the record can be analyzed for changes in water level, either an increase or a decrease depending upon whether the aquifer is experiencing a tensile or compressional volume strain, just prior to an earthquake.

A plot of the raw data for hourly measurements over one year ($8192 = 2^{13}$ observations) is given in Figure 10(a). After applying the DAUB #2 wavelet transformation and thresholding by the Donoho-Johnstone "universal" method, we obtained a very clear signal with big jumps at the earthquake time. The cleaned data are given in Figure 10(b). The magnitude of the water level change at the earthquake time did not get distorted, in contrast to the usual smoothing techniques. This is a desirable feature of wavelet methods. Yet, a couple of things should be addressed with more care.

(i) Possible fluctuations important for the earthquake prediction are cleaned as noise. In post-analyzing the data, having information about the earthquake time, one might do time-sensitive thresholding.

(ii) Small spikes on the smoothed signal (Figure 10(b)), as well as "boundary distortions," indicate that the DAUB #2 wavelet is not the most fortunate choice. Compromising between smoothness and the support shortness of the mother wavelet with the help of wavelet banks, one can develop ad-hoc rules for a better mother wavelet (wavelet model) choice.


-53.0
-53.0




-53.1





-53.1

















-53.2



-53.2













acl





-53.3




-53.3









-53.4


-53.4









-53.5


-53.5

0 2000 4000 6000 8000 0 2000 4000 6000 8000

Index Index

(a) Raw data, water level vs. time. (b) After thresholding the wavelet transformation.

Figure 10: Panel (a) shows $n = 8192$ hourly measurements of the water level for a well in an earthquake zone. Notice the wide range of water levels at the time of an earthquake around $t = 2000$.

5 Wavelet image processing


We will explain briefly how wavelets may be useful in matrix data processing. The most remarkable application is, without any doubt, image processing. Any (black and white) image can be approximated by a matrix $A$ in which the entries $a_{ij}$ correspond to intensities of gray in the pixel $(i,j)$. For reasons that will be obvious later, it is assumed that $A$ is a square matrix of dimension $2^n \times 2^n$, $n$ integer.
The process of the image wavelet decomposition goes as follows. The filters $H$ and $G$ are applied to the rows of the matrix $A$. Two resulting matrices are obtained: $H_r A$ and $G_r A$, both of dimension $2^n \times 2^{n-1}$ (the subscript $r$ suggests that the filters are applied to the rows of the matrix $A$). Now the filters $H$ and $G$ are applied to the columns of the matrices $H_r A$ and $G_r A$, and the four resulting matrices $H_c H_r A$, $G_c H_r A$, $H_c G_r A$ and $G_c G_r A$ of dimension $2^{n-1} \times 2^{n-1}$ are obtained. The matrix $H_c H_r A$ is the average, while the matrices $G_c H_r A$, $H_c G_r A$ and $G_c G_r A$ are details (Figure 11).

A -> (H_r A, G_r A) -> (H_c H_r A, G_c H_r A, H_c G_r A, G_c G_r A)

Figure 11: Image wavelet decomposition
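One level of this image decomposition is a few lines on top of the function Dec from the Appendix (dec2D is our ad-hoc name; A is assumed to be a 2^n x 2^n matrix, h and g the low and high pass filters):

    dec2D[A_List, h_List, g_List] :=
      Module[{onRows, onCols, HrA, GrA},
        onRows[M_, f_] := Dec[#, f] & /@ M;                    (* filter every row *)
        onCols[M_, f_] := Transpose[onRows[Transpose[M], f]];  (* filter every column *)
        HrA = onRows[A, h]; GrA = onRows[A, g];
        {onCols[HrA, h], onCols[HrA, g],    (* HcHrA (the average), GcHrA *)
         onCols[GrA, h], onCols[GrA, g]}]   (* HcGrA, GcGrA *)

Iterating dec2D on the first (average) block produces the pyramid described next.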



The process continues with the average matrix $H_c H_r A$ until a single number (the average of the whole original matrix $A$) is obtained. Two examples are given below.
Example 1. This example is borrowed from Nason and Silverman (1993). The top left panel in Figure 12 is a $256 \times 256$ black and white image of John Lennon on a 0-255 gray scale. In the top-right figure each pixel is contaminated by normal $N(0, 60)$ noise. (In S-Plus: le <- lennon + rnorm(256*256, s=60), where lennon is the pixel matrix of the original image.)

The two bottom figures are restored images. The DAUB #4 filter was used for the first figure, while DAUB #10 was used for the second.

Though the quality of the restored images may be criticized, the stunning property of wavelet image analysis shows up in this example. Both restored images use only about 1.8% of the information contained in the "blurred" image. The compression rate is amazing: 527120 bytes go down to 9695 bytes after the universal thresholding.
Example 2. This is an adaptation of a data set of J. Schmert, University of Washington. The word "five" was recorded, and each column in the top-right figure represents a periodogram over a short period of time (adjacent columns have half of their observations in common). The rows represent time. The original $92 \times 64$ matrix was cut to a $64 \times 64$ matrix for obvious reasons. After performing hard thresholding with $\lambda = 0.25$, a compression ratio of 1:2 is achieved. The compressed figures are shown in the two bottom panels of Figure 13.

6 Can you do wavelets?


Yes, you can! There are several packages that support wavelet calculations. The best (noncommercial) package, in our opinion, is Nason and Silverman's The Discrete Wavelet Transform in S. The manual [14] describes how the software is installed and used. The software itself can be ftp-ed⁶ from lib.stat.cmu.edu or hensa.unix.ac.uk. The name of the package is wavethresh.

Carl Taswell ([email protected]) developed a Wavelet Toolbox for Matlab. The latest version is WavBox 4.0, and the software has to be registered with the author.

There are several Mathematica notebooks on wavelet computations. Wickerhauser and Cohen ([email protected]) made theirs available to the public.

To understand how the wavelets work, we reinvented the wheel and developed Mathematica software for the direct and inverse wavelet transformations and thresholding, and applied it to some exemplary data sets. The algorithms are far from being effective; rather, they are educational. The Mathematica notebook is given in the Appendix.

⁶ A new verb, ha!


Figure 12: Wavelet image restoration example




Figure 13: Word FIVE data. The panels in the first row show the original data. The bottom panels show the signal after thresholding.

References

[1] Cipra, B. A. (1993). Wavelet applications come to the fore. SIAM News, November 1993.

[2] Coifman, R., Meyer, Y., and Wickerhauser, V. (1991). Wavelet analysis and signal processing. In: Wavelets and Their Applications (M. B. Ruskai, ed.), Jones and Bartlett Publishers.

[3] Daubechies, I. (1988). Orthonormal bases of compactly supported wavelets. Commun. Pure Appl. Math., 41(7), 909-996.

[4] Daubechies, I. (1992). Ten Lectures on Wavelets. Society for Industrial and Applied Mathematics.

[5] DeVore, R. and Lucier, B. J. (1991). Wavelets. Acta Numerica, 1, 1-56.

[6] Donoho, D. (1992). Wavelet shrinkage and WVD: A 10-minute tour. Presented at the International Conference on Wavelets and Applications, Toulouse, France, June 1992.

[7] Donoho, D., Johnstone, I., Kerkyacharian, G., and Picard, D. (1993). Density estimation by wavelet thresholding. Technical Report, Department of Statistics, Stanford University.

[8] Donoho, D., Johnstone, I., Kerkyacharian, G., and Picard, D. (1993). Wavelet shrinkage: Asymptopia? Technical Report, Department of Statistics, Stanford University.

[9] Donoho, D. and Johnstone, I. (1993). Adapting to unknown smoothness via wavelet shrinkage. Technical Report, Department of Statistics, Stanford University.

[10] Donoho, D. and Johnstone, I. (1992). Minimax estimation via wavelet shrinkage. Technical Report, Department of Statistics, Stanford University.

[11] Grossmann, A. and Morlet, J. (1984). Decomposition of Hardy functions into square integrable wavelets of constant shape. SIAM J. Math. Anal., 15, 723-736.

[12] Johnstone, I. (1993). Minimax-Bayes, asymptotic minimax and sparse wavelet priors. Technical Report, Department of Statistics, Stanford University.

[13] Mallat, S. G. (1989). A theory for multiresolution signal decomposition: the wavelet representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(7), 674-693.

[14] Nason, G. P. and Silverman, B. W. (1993). The discrete wavelet transform in S. Statistics Research Report 93:07, University of Bath, Bath, BA2 7AY, UK.

[15] Press, W. H., Flannery, B. P., Teukolsky, S. A., and Vetterling, W. T. (1993). Numerical Recipes in C, Second Edition. Cambridge University Press.

[16] Strang, G. (1993). Wavelet transforms versus Fourier transforms. Bulletin of the American Mathematical Society, 28, 288-305.

[17] Wang, Z. (1993). Estimating a Hölder continuous function from a noisy sample via shrinkage and truncation of wavelet coefficients. Technical Report 93-9, Purdue University, Department of Statistics.

[18] Wavelets and Their Applications (1991). Edited by Mary Beth Ruskai. Jones and Bartlett Publishers.

[19] Wavelets: A Tutorial in Theory and Applications (1992). Edited by Charles K. Chui. Academic Press, Inc.

7 Appendix
BeginPackage["Waves`"]

(* Author: Brani Vidakovic, ISDS, Duke University.
   The functions Dec and Comp are based on M. V. Wickerhauser's
   Mathematica program. *)

Mirror::usage = "Mirror[filter] gives the quadrature mirror \
filter for the input filter, i.e. the high pass filter g of \
the operator G corresponding to filter."

WT::usage = "WT[vector, filter] performs the direct wavelet \
transformation of the data vector. The wavelet base is chosen \
by filter. The length of vector has to be a power of 2."

WR::usage = "WR[vector, filter] gives the wavelet \
reconstruction algorithm. From the set of wavelet \
coefficients vector the data set is reconstructed. \
The wavelet base is chosen by filter."

Dec::usage = "An auxiliary function needed for the \
direct wavelet transformation. See WT."

Comp::usage = "An auxiliary function needed for the \
inverse wavelet transformation (wavelet reconstruction \
algorithm). See WR."

Begin["`Private`"]

(* reverse the filter and alternate the signs *)
Mirror[filter_List] := Module[{fl = Length[filter]},
  Table[-(-1)^i filter[[fl + 1 - i]], {i, 1, fl}]]

(* one decomposition step: convolve with the filter and downsample
   by 2, with periodic treatment of the boundary *)
Dec[vector_List, filter_List] := Module[
  {vl = Length[vector], fl = Length[filter]},
  Table[
    Sum[filter[[m]] vector[[Mod[2 k + m - 3, vl] + 1]], {m, 1, fl}],
    {k, 1, vl/2}]]

(* one reconstruction step: upsample by 2 and convolve with the filter *)
Comp[vector_List, filter_List] := Module[
  {temp = Table[0, {i, 1, 2 Length[vector]}],
   vl = Length[vector], fl = Length[filter]},
  Do[temp[[Mod[2 j + i - 3, 2 vl] + 1]] += vector[[j]] filter[[i]],
     {j, 1, vl}, {i, 1, fl}];
  temp]

(* direct transform: peel off the details level by level, as in (9) *)
WT[vector_List, filter_List] :=
  Module[{wav = {}, c, d, ve = vector, H = filter, G = Mirror[filter]},
    While[Length[ve] > 1,
      c = Dec[ve, H];
      d = Dec[ve, G];
      wav = Join[wav, d];
      ve = c];
    Join[wav, c]]

(* inverse transform: rebuild c^(j+1) from c^(j) and d^(j) *)
WR[vector_List, filter_List] :=
  Module[{i = 1, vl = Length[vector], c = Take[vector, -1],
          d = Take[RotateRight[vector, 1], -1],
          mirrorf = Mirror[filter], cn, dn, k = 1},
    While[i <= vl/2,
      k += i;
      i = 2 i;
      cn = Comp[c, filter] + Comp[d, mirrorf];
      dn = Take[RotateRight[vector, k], -i];
      c = cn;
      d = dn];
    c]

End[]

EndPackage[]

Institute of Statistics
and Decision Sciences
Duke University
Durham, NC 27708-0251
[email protected]
[email protected]
