Article
The Discrete Wavelet Transform: Wedding the À Trous and Mallat Algorithms
Mark J. Shensa
Abstract-In a general sense this paper represents an effort to clarify the relationship of discrete and continuous wavelet transforms. More narrowly, it focuses on bringing together two separately motivated implementations of the wavelet transform, the algorithme à trous and Mallat's multiresolution decomposition. It is observed that these algorithms are both special cases of a single filter bank structure, the discrete wavelet transform, the behavior of which is governed by one's choice of filters. In fact, the à trous algorithm, originally devised as a computationally efficient implementation, is more properly viewed as a nonorthonormal multiresolution algorithm for which the discrete wavelet transform is exact. Moreover, it is shown that the commonly used Lagrange à trous filters are in one-to-one correspondence with the convolutional squares of the Daubechies filters for orthonormal wavelets of compact support.
A systematic framework for the discrete wavelet transform is provided, and conditions are derived under which it computes the continuous wavelet transform exactly. Suitable filter constraints for finite energy and boundedness of the discrete transform are also derived. Finally, relevant signal processing parameters are examined, and it is remarked that orthonormality is balanced by restrictions on resolution.

I. INTRODUCTION
cessing. The other, the multiresolution approach of Mallat and Meyer, originally used in image processing, employs orthonormal wavelets [6]-[10]. The latter algorithm, apart from its wavelet interpretation, was discovered previously in the form of quadrature mirror filter (QMF) filter banks with perfect reconstruction where it finds application in speech transmission and split-band coding [11]-[13].
A glance at these two algorithms suffices to reveal closely related structures. In fact, apart from the constraints on their filters, the decimated à trous [5] and Mallat algorithms are identical. We are thus led to examine the expanded family of algorithms encompassing both types of filters. In this vein, it is shown that the Lagrange interpolation filters commonly employed by the à trous algorithm are actually the squares (in a convolutional sense) of the Daubechies filters for compact orthonormal wavelets. We also derive conditions under which the discrete implementation computes a continuous wavelet transform exactly and find that they bear an intimate relationship to the à trous constraints.
From a more general viewpoint, the situation is as fol-
SHENSA: DISCRETE WAVELET TRANSFORM 2465
trous, then the DWT coincides with a continuous wavelet transform by a wavelet $\psi(t)$ whose samples $\psi(n)$ form the filter $g$ (i.e., $g_n = \psi(n)$). Even if $f$ is not à trous, the algorithm is exact provided the signal lies in an appropriate subspace; however, in that instance, the sampled wavelet values depend on $f$ as well as $g$. This is the situation in the orthonormal case where, moreover, the filter $g$ is almost completely determined from $f$ through the constraints of orthogonality.
In the remainder of this introduction we define, and briefly motivate, wavelet transforms at various levels of discretization. Section II contains an abbreviated derivation of the à trous algorithm followed by a description of the Mallat algorithm. (The uninitiated reader is particularly referred to [1], [6], [8].) In Section III we define the undecimated DWT, relate it to the decimated transform, and provide algorithms for its computation. Section IV states and proves several theorems which delineate the relationship between the DWT and the continuous wavelet transform. It may be read independently of the algorithms of Section II although the motivation for the constructions may not be clear. Section V defines the Lagrange à trous filters and proves that they are the squares of the Daubechies filters. In Section VI, we formulate the inversion problem and provide filter constraints which ensure finite energy and bounded operators. It concludes with a short examination of the tradeoffs involved in choosing the bandpass filter, emphasizing the differences of the orthonormal and nonorthogonal cases.

A. Transform Definitions
The continuous wavelet transform of a signal $s(t)$ takes the form

Thus, to obtain $F(\omega, b)$, one multiplies the signal by an appropriate window $h$ (such as a Gaussian) centered at time $b$ and then takes the Fourier transform. In mathematical terms, (1.3) is an expansion of the signal in terms of a family of functions $h(t - b)\,e^{j\omega t}$, which are generated from a single function $h(t)$ through translations $b$ in time and translations $\omega$ in frequency. In contrast, the wavelet transform (1.1) is an expansion in functions $\psi((t - b)/a)$ generated by translations $b$ in time and dilations $a$ in time.³ Thus, the continuous wavelet transform resembles a (continuous) bank of short-time Fourier transforms with a different window for each frequency. The significance of this is that, while the basis functions in (1.3) have the same time and frequency resolution (that of $h(t)$ and $\hat h(\omega)$) at all points of the transform plane, those of (1.1) have time resolution (that of $\psi(t/a)$) which decreases with $a$ and frequency resolution (that of $\hat\psi(a\omega)$) which increases with $a$. This property can be a great advantage in signal processing since high frequency signal characteristics are generally highly localized in time whereas slowly varying signals require good low frequency resolution.
As originally proposed by Morlet et al. [17], $\psi$ was a modulated Gaussian
$$\psi(t) = e^{j\nu_0 t}\, e^{-t^2/2} \tag{1.4}$$
and this function is still the prototypical analyzing wavelet for signal processing applications [1]. The window function $\psi(t/a)$ has Fourier transform $\hat\psi(a\omega) = a\,e^{-(\omega - [\nu_0/a])^2 a^2/2}$, which has analysis frequency $\nu_0/a$. We emphasize that $\nu_0$ is simply a parameter which determines the analyzing wavelet; its role should not be confused with that of $a$ even though the scale axis is often expressed in terms of frequency under the transformation $a \to \nu_0/a$. Observe that (1.4) only satisfies the admissibility condi-
$$W(2^i, n) \triangleq \frac{1}{\sqrt{2^i}} \int \bar\psi\!\left(\frac{t - n}{2^i}\right) s(t)\,dt. \tag{1.5a}$$
We remark that finite energy for the wavelet transform is not at all equivalent to finite energy for the wavelet series. It depends on the sampling grid as well as the function $\psi(t)$ [3]. Thus, the admissibility condition (1.2) is not necessarily appropriate in the discrete case and shall be replaced with conditions on the relevant filters in Section VI. In addition, we shall often take $b$ to be a multiple of $a$⁴
$$W(2^i, 2^i n) \triangleq \frac{1}{\sqrt{2^i}} \int \bar\psi\!\left(\frac{t}{2^i} - n\right) s(t)\,dt. \tag{1.5b}$$
A logical step in applying the theory to discrete signals is to discretize the integral in (1.5)

The sample rate has been set equal to one. As indicated by $2^i n$ on the left-hand side, (1.6), as well as (1.5b), are decimated wavelet transforms. Octave $i$ is only output every $2^i$ samples. In this form the resulting algorithms will not be translation invariant ([7]). This is easily seen by substituting $s(k - r)$ for $s(k)$ which produces $w(2^i, 2^i(n - r/2^i))$, an integer translation of $w(2^i, 2^i n)$ only if $r$ is a multiple of $2^i$. However, the invariance, which is lost by decimation, is easily restored by separately filtering the even and odd sequences (see Section III) or by using an equivalent algorithm, also described in Section III. Our major reason for starting with (1.5b) rather than (1.5a) is historical. It delineates the relationship of the DWT to the QMF filter banks and (orthonormal) wavelet structures already found in the literature [6]-[9]. It also simplifies our derivation of the à trous algorithm and readily lends itself to physical interpretation (see Section VI). Note that the original Mallat algorithm [6] was decimated; à trous [4] was not.
Let $g$ be the discrete filter obtained by truncating the sampled wavelet function; i.e., $g_n = \psi(n)$. Then, proceeding from (1.6), we shall be able to arrive at the DWT

⁴Physically, this reflects a need for less frequency sampling of the transform output at lower frequencies (i.e., larger scales $a$). Mathematically, $b = 2^i n$ has its roots in the orthonormal wavelets where it suffices for invertibility of the transform [6]. The general case, however, is much more complex [3]. Too sparse a sampling leads to incompleteness; oversampling results in a redundant set of functions.

circle in Section IV where we show, under quite general conditions, that given filters $f$ and $g$ there exists a function $\psi(t)$ with $\psi(n) = g^\dagger_n$, such that the DWT acting on the sampled signal is exactly the sampled output of the continuous wavelet transform (i.e., of the wavelet series).⁵ In other words, the DWT with filter $g$ defined by $g^\dagger_n \triangleq \psi_A(n)$, which was originally conceived as an approximation of the (continuous) WT for an arbitrary analyzing wavelet $\psi_A(t)$, is exact for another wavelet function $\psi_B(t)$ where $\psi_B(n) = g^\dagger_n$ for all $n$. Of course, if there is sufficient regularity, $\psi_A(t)$ and $\psi_B(t)$ will be close since they coincide on the integers up to the length of $g$.
Before embarking on this voyage, we summarize, and hopefully clarify, the plethora of transforms with a brief analogy to the Fourier transform, Fourier series, the discretized z transform, and the discrete Fourier transform (DFT). The Fourier transform of a continuous signal $s(t)$
$$\hat s(\omega) \triangleq \int_{-\infty}^{\infty} e^{-j\omega t}\, s(t)\,dt \tag{1.8}$$
is a function of the continuous variable $\omega$. Restricting it to a discrete (one-dimensional) grid results in the coefficients of a Fourier series
$$\hat s(2\pi m) = \int_{-\infty}^{\infty} e^{-j2\pi m t}\, s(t)\,dt \tag{1.9}$$
which in turn may be computed approximately by
$$s_z(2\pi m\,\Delta t) = \sum_k e^{-j2\pi m k\,\Delta t}\, s(k\,\Delta t)\,\Delta t \tag{1.10}$$
the z transform of $s_n \triangleq s(n\,\Delta t)$ output at discrete points $e^{-j2\pi m\,\Delta t}$. If $s(t)$ is band limited and sampled at an appropriate rate, $\Delta t = 1/N$, then the above may be computed exactly using the DFT
$$\hat s_m = \frac{1}{N} \sum_k \exp\!\left(\frac{-j2\pi m k}{N}\right) s_k. \tag{1.11}$$
These correspond precisely to $W(a, b)$, $W(2^i, n)$, $w(2^i, n)$, and undecimated $w^i$. With wavelets, however, we have the additional difficulty of dealing with a whole class of functions $\psi(t)$ rather than simply $e^{j\omega t}$. Also complicating things are its two-dimensional structure and the decimated versions, which, due to their $2^i n$ dependency on $i$, play a distinguished role without analogy in the one-dimensional case.

⁵The adjoint filter $g^\dagger_k = \bar g_{-k}$ is used to simplify future notation. It corresponds to the integrand $\bar\psi(-t) * s$ found in (1.1), (1.5), and (1.6).
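The loss of translation invariance under decimation described above can be checked numerically. The following is only a minimal sketch under stated assumptions: the filter `g`, the circular convolution helper, and the decimation step of 4 (standing in for $2^i$ with $i = 2$) are illustrative choices, not taken from the paper.

```python
import numpy as np

def circ_conv(f, s):
    """Circular convolution: out[k] = sum_m f[m] * s[(k - m) mod N]."""
    return sum(fm * np.roll(s, m) for m, fm in enumerate(f))

def decimated_output(s, g, step):
    """Filter with g, then keep every `step`-th sample (the decimation)."""
    return circ_conv(g, s)[::step]

def is_translate(w_shifted, w):
    """True if w_shifted is some circular integer translation of w."""
    return any(np.allclose(w_shifted, np.roll(w, q)) for q in range(len(w)))

rng = np.random.default_rng(0)
s = rng.standard_normal(32)
g = np.array([1.0, -1.0])   # hypothetical bandpass filter, for illustration only
step = 4                    # plays the role of 2**i with i = 2

w = decimated_output(s, g, step)
# Shift by r = 4 (a multiple of the step): the output is simply translated.
shift_ok = is_translate(decimated_output(np.roll(s, 4), g, step), w)
# Shift by r = 1 (not a multiple of the step): the output is not a translate.
shift_bad = is_translate(decimated_output(np.roll(s, 1), g, step), w)
```

With a random input, `shift_ok` is true while `shift_bad` is false, which is the behavior the text attributes to the $2^i n$ sampling grid.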
II. TWO ALGORITHMS
A. Notation
Decimation, which appears as a down arrow in Fig. 1, plays a pivotal role in all DWT algorithms. However, it leads to operators which are not time invariant and present a potential source of confusion. It is thus worthwhile to first establish some formal notation.
Signals and filters in boldface type will be treated as vectors, in which case $*$ indicates discrete convolution and yields a vector. The symbol $\dagger$ will be used for the Hermitian adjoint filter $[f^\dagger]_k = \bar f_{-k}$. Note that this is the complex conjugate reversal and does not imply a conversion of a column vector to a row vector. The above mentioned decimation operator is represented by a matrix
$$\Lambda_{km} \triangleq \delta(2k - m)$$
where $\delta_{km}$ is the Kronecker delta and $\delta(k) \triangleq \delta_{k0}$. Also of significance is the dilation operator
$$D_{km} \triangleq \delta(k - 2m)$$
which dilates a vector by inserting zeros. Observe that $\Lambda$ and $D$ are transposes of each other, and that although they are linear, they are not time invariant; i.e., they are not functions of $k - m$.
Convolution followed by decimation becomes $[\Lambda(f * s)]_k = \sum_m \Lambda_{km}[f * s]_m = [f * s]_{2k} = \sum_m f_{2k-m}\, s_m$. However, a particularly insidious pitfall remains; namely,

which shall occasionally be used in our proofs. A trivial calculation yields $\Lambda F s = \Lambda(f * s)$. The symbol $\dagger$ will also be used for the adjoint of matrices. This is consistent with the above notation where $[F^\dagger]_{mn} = \bar F_{nm} = \bar f_{n-m} = [f^\dagger]_{m-n}$.
We define the Fourier transform $\hat s(\omega)$ of a function $s(t)$ by (1.8) and the z transform (on the unit circle) of a discrete signal $s$ by
$$s_z(\omega) \triangleq \sum_n s_n\, e^{-j\omega n}. \tag{2.5}$$
In the subsequent interplay between continuous and discrete functions one must be careful to distinguish the usage of these two transforms. Ignoring their differences can easily lead to erroneous conclusions. In particular, although the Fourier transform of $s(2t)$ is $\hat s(\omega/2)/2$,

Some comment concerning filter definitions is also appropriate. Usage in the literature is uniform only up to the adjoint. Also, the z transform is sometimes defined with a positive exponential which leads to similar differences in the frequency domain. In keeping with signal processing applications we have chosen (2.5) as above, consistent with the Fourier transform, and we shall define our filters so that adjoints do not appear in convolutions. This produces a minimum of adjoints and greatly simplifies the notation. Unfortunately, it also results in the definition $g^\dagger_n = \psi(n)$ and the introduction of $f^\dagger$ as an interpolation filter whereas $g$ and $f$ would be more natural. Note, also, that our filters are the adjoints of the filters defined in [8], although their z transforms coincide since [8] defines the z transform with a plus sign.

B. The À Trous Algorithm
We take the discretized wavelet series (1.6) as our starting point. The difficulty in implementing (1.6) is that, even for $\psi(t)$ of finite support, as $i$ increases, $\psi(t)$ must be sampled at progressively more points, creating a large computational burden. The solution posed by [4] is to approximate the values at nonintegral points through interpolation via a finite filter $f^\dagger$. The resulting recursion is highly efficient and may be implemented with the filter bank structure of Fig. 1.
The interpolation is perhaps best introduced with an example. Let $f^\dagger$ be the filter (0.5, 1.0, 0.5). Then,
$$\sum_k f^\dagger_{n-2k}\,\psi(k)$$
approximates a sampling of $\psi(t/2)$. With the help of the dilation operator $D$, this may be formalized as a general procedure for dyadic interpolation. The steps are illustrated in Fig. 2. Let $g$ be a filter defined by $g^\dagger_n \triangleq \psi(n)$; i.e.,

First we spread $g^\dagger$ to provide space in which to put the interpolated values. The resulting filter is $Dg^\dagger$. Then we apply a filter $f^\dagger$ which leaves the even points fixed and interpolates to get the odd points. This condition, that $f^\dagger$ be the identity on even points, is sufficiently important to warrant a separate definition, which follows.
Definition 2.1: The low-pass filter $f$ is said to be an à trous filter if it satisfies
$$f_{2k} = \delta_{k0}/\sqrt{2}. \tag{2.9}$$
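The decimation and dilation operators, and the spread-then-interpolate step with the example filter (0.5, 1.0, 0.5), can be sketched in a few lines. This is an illustration only: the matrix size, the test signal, and the linear sample values standing in for $\psi(n)$ are assumptions, and the $\sqrt{2}$ normalization is taken to be absorbed into the filter taps as in the text's example.

```python
import numpy as np

N = 8
k = np.arange(N)
# Decimation matrix: Lam[k, m] = delta(2k - m), shape (N/2, N).
Lam = (2 * k[: N // 2, None] == k[None, :]).astype(float)
# Dilation matrix: D[k, m] = delta(k - 2m), shape (N, N/2).
D = (k[:, None] == 2 * k[None, : N // 2]).astype(float)

s = np.arange(1.0, N + 1)
decimated = Lam @ s        # keeps s[0], s[2], s[4], s[6]
dilated = D @ decimated    # re-inserts zeros at the odd positions

# Dyadic interpolation: spread the samples with zeros, then convolve.
samples = np.array([0.0, 2.0, 4.0, 6.0])   # hypothetical samples psi(n)
spread = np.zeros(2 * len(samples) - 1)
spread[::2] = samples
# Drop the two edge samples so the output lines up with `spread`.
interp = np.convolve([0.5, 1.0, 0.5], spread)[1:-1]
```

The even entries of `interp` reproduce `samples` unchanged and the odd entries are the midpoint averages, which is the "identity on even points" property that Definition 2.1 formalizes.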
Fig. 2. Diagram illustrating the dilation and interpolation of a function $\psi(t)$: $\psi(n/2) = \sqrt{2}\, f^\dagger * (Dg^\dagger)$.
$$\begin{aligned} w(2, 2n) &\approx \sum_{k,m} f^\dagger_{k-2n-2m}\,\overline{g^\dagger_m}\; s_k \\ &= \sum_{k,m'} g_{n-m'}\,\bar f_{2m'-k}\; s_k \\ &= [g * (\Lambda(\bar f * s))]_n \end{aligned} \tag{2.11}$$
which is simply $w^i$ (1.7) with $i = 1$. Continuing inductively by replacing $s$ in (2.11) with $s^{i-1}$, we find $w(2^i, 2^i n) = w^i_n$ for all $i$ where $w^i_n$ is given by (1.7), which can be rewritten for real $f$
$$s^{i+1} = \Lambda(f * s^i) \tag{2.12a}$$
$$w^i = g * s^i. \tag{2.12b}$$
Except for decimation of the output (the undecimated version will be derived in Section III), this is the à trous algorithm described in [4]. Thus, we see that the à trous algorithm is simply a DWT for which the filter $f$ (an interpolator) satisfies condition (2.9) and the filter $g$ is obtained by sampling an a priori wavelet function $\psi(t)$.
Remark 1: The definition (1.6) is not so transparent as it might seem. It is, of course, intended to reflect an approximation to (1.5). From this viewpoint one might well consider a change of variables $t \to t/2^i$ before discretizing (1.5). Such a procedure certainly alleviates the computational problem since it dilates $s(t)$ (that is, samples $s$ at $2^i n$, values which are known) rather than contracting $\psi(t)$. However, unless the original function $s(t)$ was highly oversampled (which begs the computational question), the approximation is poor. More precisely, to accurately approximate $s(t)$, and therefore also (1.5), we must sample at least at the Nyquist rate $r_{nyq}$ for $s$. Then the integral for octave $i$ requires $\psi(t)$ to be sampled at a rate $2^i r_{nyq}$.
Remark 2: The derivation above, as well as that in [4], of the à trous algorithm make no statements regarding the accuracy of the approximation (2.11) or even of the discretization from (1.5) to (1.6). The former is iterated over $i$ and, hence, to succeed must be numerically stable in
of the filter $f$. A major step towards treating this question lies in the results of Section IV, as was outlined at the end of the introduction. Since the algorithm is exact for some $\psi_B(t)$, the question reduces to a) the quality of the approximation $\psi \approx \psi_B$ and b) the effect of this approximation on the wavelet integral (1.1). Inasmuch as $\psi(n) = \psi_B(n)$ for the finite set of integers $n$ used to obtain $g$ from

In keeping with the literature, we have replaced the filter $f$ with the filter $h$, which also serves to indicate that this class of filters is constrained, as detailed below. We remark that none of these filters are à trous filters. The constraints on $h$ and $g$ which ensure an orthonormal multiresolution analysis [6]-[9] are
$$\sum_j \left[\bar h_{2j-n}\, h_{2j-m} + \bar g_{2j-n}\, g_{2j-m}\right] = \delta_{nm} \tag{2.14a}$$
$$\sum_k h_{2n-k}\,\bar g_{2m-k} = 0 \tag{2.14b}$$
$$\sum_n g_n = 0 \tag{2.14c}$$
$$\sum_n h_n = \sqrt{2}. \tag{2.14d}$$
Recalling that $H_{mn} \triangleq h_{m-n}$ and that $\Lambda^\dagger = D$, we may rewrite (2.14a) and (2.14b) as
$$(H^\dagger D)(\Lambda H) + (G^\dagger D)(\Lambda G) = I \tag{2.15}$$
$$(\Lambda H)(G^\dagger D) = 0. \tag{2.16}$$
Furthermore, (2.15) and (2.16) imply (e.g., multiply (2.15) on the left by $\Lambda H$)
$$(\Lambda H)(H^\dagger D) = I, \qquad (\Lambda G)(G^\dagger D) = I. \tag{2.17}$$
Thus, $H^\dagger D$ and $G^\dagger D$ are injections and (2.13) is an orthogonal decomposition of the discrete signal $s^i$. That is, $s^{i-1} = H^\dagger D s^i + G^\dagger D w^i$ with the scalar product $(H^\dagger D s^i) \cdot (G^\dagger D w^i) = 0$. In fact, (2.15) is a paradigm for inverting the transform. These concepts are illustrated in Fig. 4.
Furthermore, from (2.14) it follows that (2.13) represents a decomposition [6], [8], [9] as follows:
There exists a function $\phi(t)$ whose Fourier transform is given by
(2.18)
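The matrix identities (2.15)-(2.17) can be verified numerically for a concrete filter pair. The sketch below uses the two-tap Haar filters and circulant (periodic) matrices of size 8; the periodic boundary handling and the helper `circulant` are assumptions made only to obtain finite matrices, not part of the paper's construction.

```python
import numpy as np

def circulant(h, n):
    """n-by-n matrix with entries H[m, k] = h[(m - k) mod n]."""
    col = np.zeros(n)
    col[: len(h)] = h
    return np.array([[col[(m - k) % n] for k in range(n)] for m in range(n)])

n = 8
h = np.array([1.0, 1.0]) / np.sqrt(2)    # Haar low-pass filter
g = np.array([1.0, -1.0]) / np.sqrt(2)   # Haar bandpass filter
LH = circulant(h, n)[::2]   # Lambda H: keep the even rows of H
LG = circulant(g, n)[::2]   # Lambda G
# For real filters, H^dagger D is the transpose of Lambda H.
recon = LH.T @ LH + LG.T @ LG   # left side of (2.15)
cross = LH @ LG.T               # left side of (2.16)
```

Here `recon` comes out as the identity and `cross` as the zero matrix, so $s^{i-1}$ can be rebuilt as `LH.T @ s_next + LG.T @ d_next` for this filter pair.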
have the property
$$\phi^{i+1}_n(t) = \sum_k [\Lambda H]_{nk}\,\phi^i_k(t). \tag{2.22}$$
Note that the above definitions differ in the sign of $i$ from those of [8].
Finally, define
$$\psi(t) \triangleq \sqrt{2} \sum_k g_{-k}\,\phi(2t - k). \tag{2.23}$$
Then, using (2.14) and the above properties of $\phi$, one can show that the family of wavelets,
(2.24)
are orthonormal ($\int \bar\psi^i_n(t)\,\psi^j_k(t)\,dt = \delta_{ij}\,\delta_{nk}$), and that the $d^i$ are the coefficients of the expansion of $s(t)$ in terms of the $\psi^i_n$.⁶
More precisely, the translates and dilates $\phi^i_n(t)$ of the scaling function form a basis for $L^2(R)$, and
provided
$$s(t) = \sum_{i=-\infty}^{\infty} \sum_n d^i_n\,\psi^i_n(t). \tag{2.27}$$
(2.28b)
where the first component of $h$ is on the left. The wavelets corresponding to (2.28a) are exactly the Haar function mentioned above.

⁶In contrast, the wavelet functions $\psi^i_n(t) \triangleq (1/\sqrt{2^i})\,\psi((t/2^i) - n)$ of (1.5) are not generally orthogonal. It is the filter constraints (2.14a) and (2.14b) that ensure orthonormality. Dropping these two constraints in Section IV, we develop a structure identical to (2.21)-(2.24); however, the constructed $\psi^i_n(t)$ need not be orthogonal.

Some additional remarks relating the two algorithms are in order. The conditions (2.14c) and (2.14d) effectively make $g$ a bandpass filter and $h$ a low-pass filter (e.g., an interpolation filter) with the sum on $g_k$ analogous to the condition $\int \psi(t)\,dt = 0$. Also, $d^{i+1}$ corresponds to $w^i$. The additional decimation appearing in (2.13b) would not appear in a translation invariant version of Mallat's algorithm (cf. Section III). On the other hand, although the discrete filters $g$ play algorithmically identical roles, the $\psi^i_n(t)$ of the à trous algorithm are not the wavelet vectors of a functional expansion. Rather the $\psi^i_n(t)$ are the duals of a set of vectors for which the coefficients of the signal expansion are $w^i_n$. That is, they are the coefficients of an expansion of the form $s(t) = \sum_{i,n} \langle s, \psi^i_n \rangle\, \tilde\psi^i_n(t)$ where $\langle\ \rangle$ indicates the $L^2$ inner product, and $\tilde\psi^i_n(t)$ is the dual basis or frame (see [3], [15]) of $\psi^i_n$. In Mallat's algorithm, since the $\psi^i_n(t)$ are orthonormal, the basis and its dual coincide. Thus, in many senses, the discrete filters $g$ are more fundamental than the wavelets themselves. It is usually the coefficients which are of major interest; the actual wavelets $\psi^i_n(t)$, let alone their duals, are rarely computed.
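Since it is the coefficients that are usually of interest, the decimated recursion can be run directly on a sampled signal without ever forming a wavelet. The following is a minimal sketch assuming Haar filters, periodic (circular) convolution, and a signal length divisible by $2^{\text{octaves}}$; the function names are illustrative and not from the paper.

```python
import numpy as np

def circ_conv(f, s):
    """Circular convolution: out[k] = sum_m f[m] * s[(k - m) mod N]."""
    return sum(fm * np.roll(s, m) for m, fm in enumerate(f))

def decimated_dwt(s, h, g, octaves):
    """Iterate s^{i+1} = Lambda(h * s^i), d^{i+1} = Lambda(g * s^i)."""
    details = []
    for _ in range(octaves):
        details.append(circ_conv(g, s)[::2])   # bandpass branch, decimated
        s = circ_conv(h, s)[::2]               # low-pass branch, decimated
    return s, details

h = np.array([1.0, 1.0]) / np.sqrt(2)    # Haar low-pass filter
g = np.array([1.0, -1.0]) / np.sqrt(2)   # Haar bandpass filter
rng = np.random.default_rng(1)
signal = rng.standard_normal(64)
coarse, details = decimated_dwt(signal, h, g, octaves=3)
energy_out = np.sum(coarse**2) + sum(np.sum(d**2) for d in details)
```

For an orthonormal filter pair such as this one, `energy_out` matches the energy of `signal`; for a nonorthonormal à trous pair the same recursion runs unchanged but no such energy identity should be expected.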
2470 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 40, NO. 10, OCTOBER 1992
that of Fig. 1 (equations (2.12)) follows for arbitrary filters.

A. An Alternative Implementation
A second possibility for the implementation of $\tilde w$ is to use the algorithm in Fig. 1 and proceed directly from (3.7). That is, to output $\tilde w^i_n$, one simply translates the signal by $n$ and then computes $w^i_0$. As an algorithm, this takes the form (a) compute $w^i_0$; (b) translate the signal by one sample; (c) go to step (a). Moreover, in view of (3.6), one need not reperform the entire recursion (i.e., (2.12)) for every time point $n$ in order to obtain $\tilde w^i$. Rather, at each octave, the decimation is replaced by a split into even and odd sequences, each of which is a starting point for the next octave (see Fig. 6). A couple of examples suffice to convince one that if, at octave $i$, $n \bmod 2^i = 0$, then one takes the upper branch; if $n \bmod 2^i = 1$, then one takes the lower branch. A rigorous derivation follows from the formula (cf. [20])
$$\Lambda T^m s = \begin{cases} T^{m/2}\, s_{\text{even}} & m \text{ even} \\ T^{(m-1)/2}\, s_{\text{odd}} & m \text{ odd.} \end{cases} \tag{3.15}$$
We remark that Figs. 5 and 6 are computationally equivalent provided that the algorithm in Fig. 5 is implemented efficiently. The code must be written so as to omit multiplication by the zero elements of filters $D^i f$. (They are mostly zeros for $i > 2$.) Depending on the number of octaves, the computational burden still remains much greater than that of the decimated algorithm (i.e., Fig. 1); however, there is considerable parallelism which may be sufficiently exploited on suitable hardware to produce a real-time implementation [5].

Fig. 6. Diagram of an implementation of the undecimated DWT.

IV. THE DWT AS AN EXACT WAVELET TRANSFORM
Regardless of the filters employed, one can, of course, perform the recursions (2.12) or (2.13) on the sampled signal $s$. Moreover, provided that $f$ (respectively, $h$) is low pass and $g$ bandpass, the procedure may be interpreted physically as a bank of proportional bandwidth filters (cf. [21]-[24] also Section VI). In the present section, we examine the mathematical significance of relaxing the filter constraints (2.9) and (2.14). Our goal will be to relate the more general filter bank to the continuous wavelet transform, thus, in a sense, justifying the term DWT (cf. [25]). In this endeavor, the major questions which we shall address are: for what functions $\psi(t)$ is the recursion (2.12) an exact implementation of (1.6) and for which $\psi(t)$ and $s(t)$ do (1.5) and (1.6) coincide? The general answer is that we are able to construct such a $\psi$ provided the discretized signal lies in the appropriate subspace (cf. (2.26)). A somewhat surprising result is that it is necessary and sufficient for $f$ to be à trous for condition (2.26) to be dropped. Our approach shall be to mimic the construction of orthonormal wavelets outlined in Section II-C.

A. Construction of the Scaling Function $\phi$
We begin with the stipulation of the existence of a scaling function $\phi(t)$ with Fourier transform
$$\hat\phi(\omega) = \prod_{i=1}^{\infty} \frac{1}{\sqrt{2}}\,(f^\dagger)_z\!\left(\frac{\omega}{2^i}\right) \tag{4.1}$$
where $(f^\dagger)_z(\omega)$ is the z transform of $f^\dagger$. To emphasize the nonorthogonality of the corresponding wavelets, we retain the symbol $f$ rather than $h$. Note that this function $\phi$ need not have (and in general does not have) all of the properties of the orthonormal $\phi$ outlined in Section II-C.
For (4.1) to converge to a nonzero function, the factors must approach one. Thus, $(f^\dagger)_z(0)/\sqrt{2} = 1$, which implies
$$\sum_k f_k = \sqrt{2}. \tag{4.2}$$
Even though $\phi$ could be normalized differently by the inclusion of a factor in (4.1), the filter $f$ must necessarily obey the low-pass condition (4.2); i.e., (2.14d). Note also that, under the chosen normalization, $\int \phi(t)\,dt = \hat\phi(0) = 1$. However, without some additional conditions, the relationship of $\phi(t)$ to $\hat\phi(\omega)$ remains somewhat tenuous. Even under pointwise convergence, the limit may be a highly discontinuous, fractal function [9]. Suitable regularity conditions for the inverse Fourier transform of a product of the form (4.1) to converge to a reasonably behaved (e.g., $L^1(R)$, $L^2(R)$, and/or continuous) function may be found in [8] and [26]. The results are summarized in Appendix B.
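The even/odd splitting rule (3.15) used by the alternative implementation of Section III can be checked numerically. This sketch assumes the convention $(T^m s)_k = s_{k+m}$ with circular wrap-around, $s_{\text{even},k} = s_{2k}$ and $s_{\text{odd},k} = s_{2k+1}$; these index conventions are assumptions, since the corresponding definitions are not reproduced in this excerpt.

```python
import numpy as np

def translate(s, m):
    """Circular translation: (T^m s)[k] = s[(k + m) mod N]."""
    return np.roll(s, -m)

def decimate(s):
    """Decimation Lambda: keep the even-indexed samples."""
    return s[::2]

s = np.arange(16.0)
s_even, s_odd = s[0::2], s[1::2]

def split_rule(m):
    """Right-hand side of (3.15) for a translation m >= 0."""
    if m % 2 == 0:
        return translate(s_even, m // 2)
    return translate(s_odd, (m - 1) // 2)

# Compare both sides of (3.15) for every shift in one period.
holds = all(
    np.array_equal(decimate(translate(s, m)), split_rule(m)) for m in range(16)
)
```

Under these conventions `holds` is true for every shift, so decimating a translated signal never requires more than the even and odd subsequences of the untranslated one.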
A time domain representation of $\phi(t)$ in terms of $f$ can be derived from (4.1). Let $\chi$ be the indicator function of $[-1/2, 1/2)$

From (4.1) and Lemma 3.2, it is not difficult to see that
$$\phi(t) = \lim_{i\to\infty} \sum_k [(\Lambda F)^i]_{0k}\,\sqrt{2^i}\,\chi(2^i t - k). \tag{4.4}$$
In fact, the Fourier transform of (4.4) is⁷

which is just (4.1). Note that the existence of the function (4.4) could have been taken as the starting point for our analysis, since our proofs will not make use of (4.1).
We continue our parallel with Section II-C. Define the $\phi^i_n(t)$ by definition 4.1, as follows.
Definition 4.1:

Then, substitution of $2^i t - n$ for $t$ in (4.4) and the use of Lemma 3.1 yield
$$\begin{aligned} \sqrt{2^i}\,\phi(2^i t - n) &= \lim_{j\to\infty} \sum_k [(\Lambda F)^j]_{0k}\,\sqrt{2^{j+i}}\,\chi(2^{j+i} t - 2^j n - k) \\ &= \lim_{j\to\infty} \sum_k [(\Lambda F)^{j-i}]_{0,\,k-2^{j-i}n}\,\sqrt{2^j}\,\chi(2^j t - k) \\ &= \lim_{j\to\infty} \sum_k [(\Lambda F)^{j-i}]_{nk}\,\sqrt{2^j}\,\chi(2^j t - k). \end{aligned} \tag{4.7}$$
On replacing $i$ by $-i$, this becomes
$$\phi^i_n(t) = \lim_{j\to\infty} \sum_k [(\Lambda F)^{j+i}]_{nk}\,\sqrt{2^j}\,\chi(2^j t - k). \tag{4.8}$$
An immediate consequence of (4.8) is
$$\phi^{i+1}_n(t) = \sum_k [\Lambda F]_{nk}\,\phi^i_k(t) \tag{4.9}$$
paralleling (2.22). Thus, we see that, despite their lack of orthogonality, the $\phi^i_n(t)$ have retained most of their structure. Furthermore, if $f$ is a finite filter, then $\phi(t)$ has finite support [8]. More precisely, suppose the coefficients of $f$ are zero outside $[-N_-, N_+]$. Let $q^j(t) \triangleq \sum_k [(\Lambda F)^j]_{0k}\,\sqrt{2^j}\,\chi(2^j t - k)$. Then, $q^j(t)$ converges to $\phi(t)$, and, as in (4.9), we have
(4.10)
We find, for example, that $q^j$ is zero for $t < t_j$ where $t_j = (t_{j-1} - N_-)/2$. With $t_0 = -1/2$, and with a similar calculation for the right half interval, it follows that $\phi(t) = \lim_{j\to\infty} q^j(t)$ is zero outside $[-N_-, N_+]$.

⁷This equality is immediate from Lemma 3.2 and $\hat\chi(\omega) = (2 \sin(\omega/2))/\omega$.

B. Exactness
To avoid confusion and stress their differences, let us first recapitulate some definitions. Four different transforms $W(a, b)$, $W(2^i, 2^i n)$, $w(2^i, 2^i n)$, and $w^i_n$ have been mentioned ((1.1), (1.5), (1.6) and (2.12)). We retain a terminology parallel to Fourier transforms, namely, wavelet transform (WT), wavelet series,⁸ discretized wavelet series, and discrete wavelet transform (DWT), respectively. The first two transforms involve integrals of a continuous signal; the latter two contain sums of sampled signals. The first three utilize a continuous wavelet function $\psi(t)$, the last one employs the discrete filters $g$ and $f$. For consistency, we shall continue our development using decimated transforms; however, the results hold without change for undecimated transforms. This follows immediately, since they coincide for $n = 0$, and undecimated transforms may be obtained at time $n = n_0$, by translating the signal by $n_0$ samples and taking the transform at $n = 0$. (See Section III where definition 3.1 remains valid for $W(2^i, n)$ and $w(2^i, n)$.)

⁸At times, we shall prefer the term sampled WT rather than wavelet series in order to emphasize its role as a restriction of the continuous transform.

Our starting point shall be a signal $s(t)$ and discrete filters $f$ and $g$ with $w^i$ defined by (2.12) or $d^i$ by (2.13), i.e.,
$$s^{i+1} = \Lambda F s^i, \qquad w^i = G s^i, \qquad d^{i+1} = \Lambda w^i. \tag{4.11}$$
Recall that the matrices $F_{mn}$ and $G_{mn}$ are given by $f_{m-n}$ and $g_{m-n}$, respectively. Of course, we must also specify an initialization of the recursion (4.11) for some $i$; for example, for the zeroth octave $s^0$. The obvious choice is
$$s^0_n \triangleq s(n) \tag{4.12a}$$
however, we shall also consider
$$s^0_n \triangleq \sum_k \phi(k - n)\,s(k) \tag{4.12b}$$
which relates to the discretized wavelet series $w(2^i, 2^i n)$, and
$$s^0_n \triangleq \int \phi(t - n)\,s(t)\,dt \tag{4.12c}$$
which corresponds to the sampled WT (wavelet series). For a given $g$, we shall construct a continuous function $\psi(t)$ such that the DWT of (4.11) is an exact implementation of the discretized wavelet series under (4.12b) and of the wavelet transform under (4.12c).
Define $\psi(t)$ by
$$\psi(t) \triangleq \sum_k \phi(t + k)\,\bar g_k = \sum_k \phi(t - k)\,g^\dagger_k. \tag{4.13}$$
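The limit (4.4) suggests a simple way to approximate $\phi$ on a dyadic grid: start from a unit impulse and repeatedly dilate and filter. The sketch below does this for the linear-interpolation filter; it assumes a real, symmetric, odd-length $f$ (so that $f$ and $f^\dagger$ coincide) normalized so that $\sum_k f_k = \sqrt{2}$, and the function name `cascade` is illustrative.

```python
import numpy as np

def cascade(f, levels):
    """Return sqrt(2)^j * [(Lambda F)^j]_{0k} for j = levels (step heights of q^j).

    Assumes a real, symmetric, odd-length filter f centred on its middle tap.
    """
    c = np.array([1.0])
    for _ in range(levels):
        up = np.zeros(2 * len(c) - 1)
        up[::2] = c                          # dilation: insert zeros
        c = np.sqrt(2) * np.convolve(f, up)  # filter and renormalize
    return c

f = np.array([0.5, 1.0, 0.5]) / np.sqrt(2)   # sum(f) = sqrt(2), f_0 = 1/sqrt(2)
levels = 4
q = cascade(f, levels)
t = (np.arange(len(q)) - len(q) // 2) / 2**levels   # grid points k / 2^j
hat = np.maximum(0.0, 1.0 - np.abs(t))               # triangle function
```

For this filter the iterates are samples of the triangle function on $[-1, 1]$, which is supported inside $[-N_-, N_+] = [-1, 1]$ and equals one at $t = 0$ and zero at the other integers.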
... -
- lim [AF],J2 = 6,o. (4.20)
j+m
w ( 2 , 2%) 2 C J.k(m>s(m>
m
= s $i;'(t)s(t) dt
=
C
ink
[G (A ~ 1 ' 1 n k Q i (m>s(m)
[G(A F)'s0],
which, by (4.11) is exactly wk. Furthermore, under
(4.16) = s $;'(t)s(r) dt. (4.21)
T ' , ( t ) s ( t ) dt
u n g r (4.12a) and h trous, or for (4.12b), we have d', =
Ck $ A' (k)s(k), the counterpart of (4.16). It is interesting
= s f; [G(AF)'In,4i(t)s(t) dt
that, in a sense, the decimated wavelet transforms (1.5b) and (1.6) contain superfluous information. That is, they are underdecimated by a factor of two, and, thus, $w^i$ properly belongs to octave $i + 1$.
= [G(A~ ' l n k j 4; (t>s(r) dt
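$$= [G(\Lambda F)^i s^0]_n \tag{4.17}$$
again $w^i_n$.
Finally, let us investigate the significance of the à trous condition; i.e., of the constraint (2.9), $f_{2k} = \delta_{k0}/\sqrt{2}$. We prove the following theorem.
Theorem 4.1: $f$ is an à trous filter $\Leftrightarrow$ $\phi(n) = \delta_{n0}$.
Proof: Letting $i = -1$, $n = 0$, and $t = n$ in (4.9) gives
$$\phi(n) = \sum_k [\Lambda F]_{0k}\,\sqrt{2}\,\phi(2n - k). \tag{4.18}$$
Then,
$$\phi(n) = \delta_{n0} \Rightarrow \sqrt{2}\,f_{2n} = \delta_{n0}. \tag{4.19}$$
Conversely, suppose that $f_{2n} = \delta_{n0}/\sqrt{2}$. Then, since $\chi(-k) = \delta_{k0}$, (4.8) with $i = 0$ and $t = 0$ implies
$$\begin{aligned} \phi(-n) &= \lim_{j\to\infty} [(\Lambda F)^j]_{n0}\,\sqrt{2^j} \\ &= \lim_{j\to\infty} \sum_k [(\Lambda F)^{j-1}]_{nk}\,[\Lambda F]_{k0}\,\sqrt{2}\,\sqrt{2^{j-1}} \\ &= \lim_{j\to\infty} [(\Lambda F)^{j-1}]_{n0}\,\sqrt{2^{j-1}} \end{aligned}$$

C. Summary
Let us summarize the results of this section. We are given discrete filters $f$ and $g$ such that (4.4) is well defined. Define $\psi(t)$ by
$$\psi(t) \triangleq \sum_k \phi(t - k)\,g^\dagger_k \tag{4.22}$$
with corresponding transforms sampled WT (wavelet series)
(4.23)
and discretized wavelet series
$$w(2^i, 2^i n) \triangleq \sum_k \bar\psi^i_n(k)\,s(k). \tag{4.24}$$
Let $\phi^0_n \cdot s$ stand for the scalar product $\sum_k \phi^0_n(k)\,s_k$, and $\phi^0_n(t) \cdot s(t)$ for the $L^2$ scalar product $\int \phi^0_n(t)\,s(t)\,dt$. Then
$$f \text{ is à trous} \Rightarrow \psi(n) = g^\dagger_n. \tag{4.25}$$
For $s$ discrete:
$$s^0 = s \text{ and } f \text{ is à trous} \Rightarrow w(2^i, 2^i n) = w^i_n \tag{4.26a}$$
$$s^0_n = \phi^0_n \cdot s \Rightarrow w(2^i, 2^i n) = w^i_n. \tag{4.26b}$$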
(4.26b) or (4.27) since their filters $h$ are not à trous filters. Furthermore, they do not satisfy (4.25).

V. LAGRANGE INTERPOLATION FILTERS
A reasonable class of à trous interpolators to consider for (2.10) are those which are exact for polynomials $P(t)$ of degree $\le M$ for some $M$, i.e., for which
$$\frac{1}{\sqrt{2}}\,P\!\left(\frac{n}{2}\right) = \sum_k f^\dagger_{n-2k}\,P(k). \tag{5.1}$$
For reasons which will become clear very shortly, we shall call these filters Lagrange à trous filters. Since the à trous filter $f$ satisfies $f_{2k} = \delta_{k0}/\sqrt{2}$, (5.1) is an identity for $n$ even. Let $a$ contain the odd components of $f^\dagger$
$$a_k \triangleq \begin{cases} \sqrt{2}\,f^\dagger_{2k-1} & \text{for } k > 0 \\ \sqrt{2}\,f^\dagger_{2k+1} & \text{for } k < 0 \\ 0 & \text{for } k = 0. \end{cases} \tag{5.2}$$
Then (5.1) is equivalent to
$$P\!\left(\frac{n}{2}\right) = \sum_{k \ne 0} a_k\,P\!\left(\frac{n + \operatorname{sgn}(k)}{2} - k\right) \tag{5.3}$$
for $n$ odd and for all polynomials $P$ of degree $\le M$. (Actually, a single value of $n$ implies (5.3) for all $n$; see (5.4).) We proceed to express the $a_k$ in terms of Lagrange polynomials, and to show that the above conditions are essentially equivalent to $f = h * h^\dagger/\sqrt{2}$ where $h$ is an appropriate Daubechies filter.

A. Construction of $a$
First, we parameterize the family of filters $a$ satisfying (5.3) by $M + 1$, the dimension of the space of polynomials for which it must hold. For such a relationship to exist, one must relate the length of the filter (the number of unknowns) to $M$. To accomplish this, we shall assume that $a$ has exactly the minimum number of coefficients needed to satisfy (5.3). We further assume that $a$ has symmetric support; i.e., there is an $N$ such that $a_k = 0$ for $|k| > N$ and $a_k \ne 0$ for $|k| = N$. This assumption is not unreasonable, at least for symmetric wavelets $\psi(t)$, since there is no reason a priori to distinguish between $t$ and $-t$, and one would even expect $a$ to be symmetric. We shall see, in fact, that the weaker condition of symmetric support joined with the previous constraints implies that $a$ actually is symmetric.

which must hold for all $P(x)$ of degree $\le 2N - 1$.
We pick out the $k$th coefficient by letting $P$ be the Lagrange polynomial
$$L_j^{2N-1}(x) = \frac{\prod_{i \ne j}(x - i)}{\prod_{i \ne j}(j - i)}, \qquad i, j \in [-N + 1, N]. \tag{5.5}$$
Then, $L_j^{2N-1}(k) = \delta_{jk}$, so that replacing $P$ in (5.4) with $L^{2N-1}$, we get
$$a_j = L_{1-j}^{2N-1}\!\left(\tfrac{1}{2}\right) \quad \text{for } j = 1, \ldots, N \tag{5.6a}$$
$$a_{-j} = L_{j}^{2N-1}\!\left(\tfrac{1}{2}\right) \quad \text{for } j = 1, \ldots, N. \tag{5.6b}$$
Inasmuch as the Lagrange polynomials $L^{2N-1}$ form a basis for polynomials of degree less than or equal to $2N - 1$, these $a_k$ are in fact the unique solution to (5.4).
It is also straightforward to see from (5.5) and (5.6) that $a$ is symmetric, i.e., for $j > 0$
$$a_j = \frac{\prod_{i \ne 1-j}(1/2 - i)}{\prod_{i \ne 1-j}(-j + 1 - i)} = \frac{\prod_{i \ne j}(-1/2 + i)}{\prod_{i \ne j}(-j + i)} = \frac{\prod_{i \ne j}(1/2 - i)}{\prod_{i \ne j}(j - i)} = a_{-j}. \tag{5.7}$$
In summary, we have the following theorem.
Theorem 5.1: Let $f$ be an à trous filter, i.e.,
$$f_{2k} = \frac{1}{\sqrt{2}}\,\delta_{0k}. \tag{5.8}$$
Assume, furthermore, that $f$ is real with symmetric support described by $k \in [-2N + 1, 2N - 1]$. Then, $f$ is a Lagrange à trous filter, that is, (5.1) holds for all polynomials $P$ of degree $\le 2N - 1$, if and only if the odd components of $f$ are determined by (5.6). Furthermore, $f$ is necessarily symmetric.

B. Relationship to Daubechies (QMF) Filters
In [8], Daubechies constructs essentially the entire class of finite length filters $h$ which satisfy (2.14) and fulfill suitable regularity conditions on (2.18). Explicitly, they take the form
(5.9)
where $h(z) \triangleq \sum_n h_n z^{-n}$ is the z transform of $h$, and $Q$ is an appropriately constrained polynomial. (In this section
it is convenient to express the z transform as a polynomial which we denote $h(z)$, where $h_z(\omega) = h(e^{j\omega})$.) Her derivation uses a specific g, which up to a phase factor is given by

$$g_n = (-1)^n h(1 - n) \tag{5.10}$$

and, as a consequence, (2.14a) reduces to

$$\Lambda H H^\dagger D = I. \tag{5.11}$$

In other words, $[h * h^\dagger]_{2n} = \delta_{n0}$, which is the à trous condition for $h * h^\dagger/\sqrt{2}$. Finally, if Q is taken to be of minimal degree (which turns out to be N − 1), $|Q|^2$ is unique [8]. In other words, the squares of these filters are characterized completely by satisfying (2.14d), (5.11), and being of the form (5.9) with degree 2N − 1. We proceed to show that the $h * h^\dagger$ equal the Lagrange à trous filters.

Let h be the Daubechies filter of order 2N. Since $h * h^\dagger$ is symmetric, the above conditions are equivalent to

$$b(z) \triangleq h * h^\dagger = 1 + \sum_{k=1}^{N} b_k z^{2k-1} + \sum_{k=1}^{N} b_k z^{-2k+1} \tag{5.12a}$$

and

$$b(z) = 2 \left(\tfrac{1}{2}(1 + z)\right)^N \left(\tfrac{1}{2}(1 + z^{-1})\right)^N Q(z)\, Q(z^{-1}). \tag{5.12b}$$

That is, $\sqrt{2} f = h * h^\dagger$ if and only if f(z) is of the form (5.12a), (5.12b) and is of degree 2N − 1 in z (respectively, in $z^{-1}$).

Next, we show that the $b_k$ coincide with the $a_k$ of (5.3). We multiply (5.12a) by $z^n$ for an arbitrary integer n,

$$z^n b(z) = z^n + \sum_{k=1}^{N} b_k z^{n-1+2k} + \sum_{k=1}^{N} b_k z^{n+1-2k}. \tag{5.13}$$

The 2N zeros at z = −1 in (5.12b) imply that

[...]

$$P_i\left(\frac{n}{2}\right) = \sum_{k>0} b_k P_i\left(\frac{n-1}{2} + k\right) + \sum_{k>0} b_k P_i\left(\frac{n+1}{2} - k\right) \tag{5.17}$$

where i = 1, ..., 2N − 1. The signs work out since n − 1 + 2k − i and n + 1 − 2k − i have the same parity while n − i differs. Define $P_0(x) = 1$. Then, since the polynomials $P_0(x)$ and $P_i(x)$ for i = 1, ..., 2N − 1 form a basis for polynomials of degree ≤ 2N − 1, and since (5.14b) implies (5.17) for i = 0, (5.17) must hold for arbitrary polynomials P(x) of degree ≤ 2N − 1. Replacing $P_i$ by P in (5.17) and setting $a_k = b_{|k|}$ yields (5.3).

Conversely, Theorem 5.1 implies that if a satisfies (5.3) for all polynomials of degree ≤ 2N − 1 and has symmetric support, then it must be symmetric. Clearly (5.17) must also be satisfied. From (5.2) and reversing the above algebra, this is equivalent to (5.12a) and (5.14) with $\sqrt{2} f(z)$ replacing $b(z)$, where f(z) is of degree 2N − 1 in z and also in $z^{-1}$. Letting n = 1 (or 0) in (5.14a), we see that f(z) has 2N roots at z = −1. Since f is symmetric, f(z) must also have the form (5.12b). We conclude that $\sqrt{2} f = h * h^\dagger$ where h is a Daubechies filter. Thus,

Theorem 5.2: There is a one-to-one correspondence between the squares of Daubechies orthonormal wavelet filters h of length 2N and the Lagrange à trous filters f of length 4N − 1 given by $f = h * h^\dagger/\sqrt{2}$.

Note that one can compute the h of length 2N by taking all possible square roots of the Lagrange à trous filters f.⁹ f is easily computed from (5.6), where its even components are given by $f_{2k} = \delta_{0k}/\sqrt{2}$, its odd positive components by $\sqrt{2} f_{2k-1} = b_k$ for k = 1 to N, and its odd negative components by symmetry. Also, the spectrum of f, $\sqrt{2} f(e^{j\omega}) = |h(e^{j\omega})|^2$, presents a convenient method of computing the power spectra of the h's. In another vein, since the h are maximally flat filters (i.e., have the same number of vanishing derivatives at z = 1 and z = −1), Theorem 5.2 shows that a maximally flat filter is a Lagrangian interpolator, a fact which may aid in the design of such filters [14].

⁹During the revision of this paper, it was brought to the author's attention that an implicit relationship between the squared filters and Lagrange interpolation had been independently noted in private conversations between I. Daubechies and Ph. Tchamitchian. Similar observations are to appear in [27].
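As a numerical illustration of Theorems 5.1 and 5.2, the following sketch builds the Lagrange à trous filter from (5.5) and (5.6), checks the polynomial exactness (5.1), and compares the result for N = 2 with the autocorrelation $h * h^\dagger$ of the four-tap Daubechies filter. The function names are hypothetical, the filter is stored as a dictionary of $\sqrt{2} f_n$ values so that the arithmetic stays rational, and the closed-form four-tap coefficients are the commonly quoted ones rather than values taken from this paper.

```python
from fractions import Fraction
from math import sqrt

def lagrange_basis(j, x, nodes):
    """Evaluate the Lagrange polynomial of (5.5) attached to node j at the point x."""
    num = den = Fraction(1)
    for i in nodes:
        if i != j:
            num *= x - i
            den *= j - i
    return num / den

def sqrt2_lagrange_a_trous(N):
    """Return {n: sqrt(2) * f_n} for the Lagrange a trous filter of Theorem 5.1."""
    nodes = range(-N + 1, N + 1)
    half = Fraction(1, 2)
    taps = {0: Fraction(1)}  # even part (5.8): sqrt(2) * f_0 = 1, other even taps vanish
    for j in range(1, N + 1):
        taps[2 * j - 1] = lagrange_basis(1 - j, half, nodes)   # a_j,    (5.6a) with (5.2)
        taps[-(2 * j - 1)] = lagrange_basis(j, half, nodes)    # a_{-j}, (5.6b) with (5.2)
    return taps

def interpolate(P, n, taps):
    """Right-hand side of (5.1): sum over integer k of sqrt(2) * f_{n-2k} * P(k)."""
    return sum(c * P((n - m) // 2) for m, c in taps.items() if (n - m) % 2 == 0)

def autocorrelation(h):
    """Return {m: (h * h^dagger)_m} for a real finite filter h given as a list."""
    L = len(h)
    return {m: sum(h[k] * h[k + abs(m)] for k in range(L - abs(m)))
            for m in range(-L + 1, L)}

# N = 2: seven-tap Lagrange filter versus the squared four-tap Daubechies filter.
taps = sqrt2_lagrange_a_trous(2)
s3, norm = sqrt(3.0), 4 * sqrt(2.0)
d4 = [(1 + s3) / norm, (3 + s3) / norm, (3 - s3) / norm, (1 - s3) / norm]
square = autocorrelation(d4)
for m in sorted(square):
    print(m, float(taps.get(m, 0)), round(square[m], 12))
```

Using exact fractions makes the check of (5.1) an equality rather than a tolerance test; only the comparison with the Daubechies coefficients, which involve $\sqrt{3}$, is done in floating point.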
Any software realization of the wavelet transform only implements a finite number of octaves. Mathematically, this reduces inversion to an algebraic question, one of finding filters which satisfy certain (not overly restrictive) equations. However, other considerations begin to come into play. Exact inversion requires finite filters, and, even then, exceedingly long filters may not be useful. Moreover, the constrained problem is considerably more difficult to solve. An alternative approach, approximation by truncated infinite filters, might be acceptable, but, once again, practical considerations dictate that the filters decay quickly. Similarly, the behavior of the DWT at infinity (i.e., $w^i$ as i goes to infinity) becomes relevant. For example, the condition $\sum_n g_n = 0$, the discrete counterpart of (1.2), is not necessary for inversion of a finite number of stages. However, it is necessary to finite energy and boundedness, which are desirable properties inasmuch as they reflect directly on the numerical stability of the algorithm and/or its inverse (see, for example, [3]).

A. Inversion

To invert either the decimated or undecimated discrete wavelet transform it suffices to invert a single stage (octave); that is, to find $s^i$, given $s^{i+1}$ and the corresponding wavelet coefficients. The equations for inverting the decimated algorithm are exactly analogous to those for the Mallat algorithm pictured in Fig. 4. One seeks two filters p and q which invert a single stage of the decimated DWT in Fig. 1; i.e., such that

$$s^i = PD\, s^{i+1} + QD\, w^{i+1} = (PD)(\Lambda F)\, s^i + (QD)(\Lambda G)\, s^i. \tag{6.1}$$

Equivalently,

$$(PD)(\Lambda F) + (QD)(\Lambda G) = I \tag{6.2}$$

where I is the identity matrix. This type of equation, which in the frequency domain may be separated into two equations comparable to (2.14a) and (2.14b), has been treated extensively (but not exhaustively) in the subband coding literature (cf. [12], [13], or even [8]). The QMF filters of the Mallat algorithm satisfy (6.2) with $g_z(\omega) = f_z(\omega + \pi)$, $p = f^\dagger$, and $q = g^\dagger$ (i.e., (5.10) and (5.11)). A less restricted class is just $p = f^\dagger$ and $q = g^\dagger$. The general class of filters satisfying (6.2), so-called biorthogonal filters, are examined in [28] and [29]. It should be emphasized that for perfect reconstruction in applications all filters must be of finite length (FIR). This does not imply that infinite filters (IIR) implemented by their truncations are not worthy of consideration [28].

Fig. 7. Illustration of a single stage and its inverse for the undecimated algorithm found in Fig. 5.

For the undecimated algorithm the requirements for inversion are much less stringent. In order to invert a stage of the algorithm of Fig. 6, the filters p and q need only satisfy (cf. Fig. 7)

$$s^i = p * s^{i+1} + q * w^i = (p * f + q * g) * s^i. \tag{6.3}$$

That is,

$$p * f + q * g = \delta \tag{6.4}$$

where the Kronecker delta, $\delta \triangleq \delta_{0,n}$, is the identity for convolution. This is a single equation, and consequently less restrictive than (6.2). If the polynomials formed by the z-transforms f(z) and g(z) are relatively prime, one may apply the Euclidean algorithm for the greatest common divisor (in this case, one) to find p and q. It has the advantage that finite f and g lead to finite p and q. Another method is simply to solve the equation in frequency space,

$$p_z(\omega) f_z(\omega) + q_z(\omega) g_z(\omega) = 1. \tag{6.5}$$

There is almost too much flexibility in solving this equation, although it becomes much more restrictive if one demands that the filters be finite or rapidly decreasing. Once again, a popular choice [30] is $f_z \bar{f}_z + g_z \bar{g}_z = 2$, which, for example, can be solved for $g_z$ by taking the square root of $2 - |f_z|^2$ as long as $|f_z(\omega)|^2 \le 2$. (Or, vice versa, it can be solved for $f_z$.) The Daubechies (QMF) filters $h/\sqrt{2}$ and $g/\sqrt{2}$ certainly satisfy this equation so that inversion for the undecimated version of the Mallat algorithm is immediate. Another case, useful in signal processing, is to choose f to be à trous, $p = f^\dagger/2$, and g any filter with nonvanishing spectrum except possibly where $|f_z(\omega)|$ equals $\sqrt{2}$ (see next subsection). Important questions of numerical stability, filter lengths, etc. certainly remain to be answered, but are well beyond the scope of the present paper.

Finally, before departing from this subject, it should be mentioned that inversion of the undecimated case in the form of Fig. 5 also follows from (6.4). The inverting filters are just $D^i p$ and $D^i q$. It is a simple matter to verify that (6.4) implies that

$$(D^i p) * (D^i f) + (D^i q) * (D^i g) = \delta. \tag{6.6}$$

(Inserting zeros in δ just yields δ.)

B. Finite Energy and Boundedness

The discrete wavelet transform is a mapping of sequences $s_n$, n = 1, 2, ... into the space of doubly indexed sequences $w_n^i$, i, n = 1, 2, .... Finite energy for
the signal is simply

$$\sum_n |s_n|^2 < \infty. \tag{6.7}$$

[...]

$da\, db/a^2$ is a constant independent of i so that $|w_n^i|^2$ is discretized in a fashion so as to be both power/hertz and energy/cell.

Let $\|s\|^2 \triangleq \sum_n |s_n|^2$ be the squared norm of s, and define

$$\|s^\infty\|^2 \triangleq \limsup_{i \to \infty} \frac{1}{2^i} \|s^i\|^2 \tag{6.10}$$

which corresponds to the DC energy (i.e., at ω = 0). Finally, define the energy of the DWT by

$$E \triangleq \sum_i \frac{1}{2^i} \|w^i\|^2 + \|s^\infty\|^2 \tag{6.11a}$$

(6.11b)

Energy conservation takes the form of the following Parseval's relationship for discrete wavelets:

Definition 6.1: A particular choice of filters f and g is said to be energy conserving if, for some constant C,

(6.12)

One may also specify conservation for decimated transforms, in which case the $2^i$ is dropped.

Except for some clarifying remarks at the end, we restrict the discussion in the remainder of this subsection to the undecimated DWT. Following the above definitions, we see that the wavelet transform will have finite energy if and only if

[...]

Then, in the time domain, (6.15) and (6.16) imply

[...]

Adding $\|w^{i-1}\|^2/2^{i-1}$ to both sides and repeating for decreasing octaves implies that

[...]

Finally, letting J go to infinity, we get not only (6.13), but also the right inequality of (6.14) with B = 1.

However, the condition (6.16) is much too strong. That is, the transformation g → Cg for a large enough constant C would cause (6.16) to be violated even though C has no effect other than to multiply the total energy by a constant. In fact, the filters f and g produce finite energy transforms if and only if f and Cg yield finite energy. Thus, to have finite energy, it is sufficient to find a C > 0 such that $\max_\omega \left(|f_z(\omega)|^2 + C |g_z(\omega)|^2\right) \le 2$. Such a C exists provided that $|f_z(\omega)|^2 \le 2$ and $(1/2)|g_z(\omega)|^2/(1 - [1/2]|f_z(\omega)|^2)$ is finite; i.e., is less than some finite B = 1/C. A similar argument holds for the lower bound. If

$$\min_\omega \tfrac{1}{2}\left(|f_z(\omega)|^2 + |g_z(\omega)|^2\right) \ge 1 \tag{6.19}$$
then (6.18) holds with the inequality reversed and the left inequality of (6.14) holds with A = 1. Once again, we apply the trick with the constant C and find that, for the inverse to be bounded, it is sufficient that there exist A = 1/C > 0 such that $(1/2)|g_z(\omega)|^2/(1 - [1/2]|f_z(\omega)|^2) \ge A$. In summary,

Theorem 6.1: A sufficient condition for the undecimated DWT and its inverse to satisfy (6.14) (that is, to be bounded) is that, for all ω, $|f_z(\omega)|^2 \le 2$ and

(6.20)

To satisfy (6.20), one must have $|f_z(\omega)|^2 = 2 \Leftrightarrow g_z(\omega) = 0$, and the multiplicities of the corresponding roots must be identical. Note, also, that (6.20) can be used to give an estimate of B/A, the so-called tightness of the frame. Whether these conditions are also necessary remains an open question. One can, however, show from (3.11) and an examination of the power of $w^i$ at ω = 0 that a necessary condition is $g_z(0) = 0$ (equivalently, $\sum_n g_n = 0$). This is the discrete analog of the admissibility condition (1.2). The author conjectures that in the discrete case it is not a sufficient condition. (We remind the reader that even in those cases for which the DWT is exactly the sampled WT, finite energy of the continuous wavelet transform does not imply that of the discrete transform.) We do have, however, the following theorem:

Theorem 6.2: A necessary and sufficient condition for energy conservation (6.12) is that for all ω

$$\tfrac{1}{2}\left(|f_z(\omega)|^2 + |g_z(\omega)|^2\right) = 1. \tag{6.21}$$

[...]

b) Low-pass filter to obtain the lower half of the spectrum ([0, π/2]).
c) Decimate to expand the lower half to [0, π].
d) Go to a).

In somewhat more detail: We first obtain the high frequency information by using g to filter the upper half of the spectrum of $s^i$. The filter output is $w^i$. Then, in preparation for the next octave, $s^i$ is low-pass filtered by f. This retains the, as yet unexamined, low frequency contents and also prevents the upper half of the spectrum from aliasing (i.e., contaminating the low frequency contents) in the dilation which follows. Finally, the operator Λ "spreads" the remaining energy to fill the spectrum, producing octave i + 1. The procedure then repeats itself: $s^{i+1}$ is bandpass filtered to get the spectral contents at frequencies which are, in absolute units, one half the frequencies of the previous octave.

A potential problem is immediately apparent. If the bandwidth of $g_z(\omega)$ is less than π/2, a portion of the signal energy will be discarded; it never appears in $w^i$. One possible remedy is to make $g_z$ sufficiently broad; however, that would limit the resolution. Alternatively, we may introduce so-called voices. That is, we can employ a bank of filters of the type g (see Fig. 9) in order to cover the entire upper half of the spectrum.

We formalize some of these concepts using the modulated Gaussian of (1.4) as an example. With the introduction of an additional parameter β, ψ(t) becomes

$$\psi(t) \triangleq e^{j\nu t} e^{-\beta^2 t^2/2}. \tag{6.22}$$

Its Fourier transform is given by
Fig. 8. Illustration of one stage of the algorithm, octaves i and i + 1 inclusive, viewed from the frequency domain.
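In the time domain, one decimated stage of the kind pictured here (filter with f and g, then keep every second sample) and its inverse in the sense of (6.1) can be sketched as follows. This is only an illustration under stated assumptions: the Haar pair stands in for the QMF filters, with g obtained from (5.10); the signal is treated as periodic to avoid boundary effects; the inverse uses $p = f^\dagger$ and $q = g^\dagger$ as in the QMF case; and the helper names `analyze` and `synthesize` are hypothetical.

```python
from math import sqrt, isclose

def analyze(s, filt):
    """One decimated stage: out[n] = sum_k filt[2n - k] * s[k], with s periodic."""
    M = len(s)
    return [sum(c * s[(2 * n - j) % M] for j, c in filt.items())
            for n in range(M // 2)]

def synthesize(x, filt):
    """Apply P D with p = filt^dagger: insert zeros in x, then correlate with filt."""
    M = 2 * len(x)
    out = [0.0] * M
    for n, xn in enumerate(x):            # (D x)[2n] = x[n], zeros elsewhere
        for j, c in filt.items():         # filt[j] contributes at index m = 2n - j
            out[(2 * n - j) % M] += c * xn
    return out

# Haar low-pass f and the band-pass g_n = (-1)^n f(1 - n) of (5.10); assumed example.
r = 1 / sqrt(2)
f = {0: r, 1: r}
g = {0: r, 1: -r}

s = [4.0, 1.0, -2.0, 3.0, 0.5, 7.0]
s_next = analyze(s, f)                    # low-pass branch, decimated
w_next = analyze(s, g)                    # band-pass branch, decimated
restored = [a + b for a, b in zip(synthesize(s_next, f), synthesize(w_next, g))]

energy_in = sum(v * v for v in s)
energy_out = sum(v * v for v in s_next) + sum(v * v for v in w_next)
print(restored)
print(energy_in, energy_out)
```

For this orthonormal pair the stage is both invertible and energy preserving; with other filter choices the same two helpers can be used to test whether a candidate p and q actually satisfy (6.2) on sample data.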
$$\sum_k [(\Lambda F)^i]_{nk}\, e^{jk\omega} = e^{j 2^i n \omega} \prod_{r=0}^{i-1} f_z(2^r \omega). \tag{A.4}$$

Proof: For i = 1, we have

$$\sum_k [\Lambda F]_{nk}\, e^{jk\omega} = \sum_k f_{2n-k}\, e^{jk\omega} = e^{j 2 n \omega} f_z(\omega). \tag{A.5}$$

Then, by induction,

$$\sum_k [(\Lambda F)^i]_{nk}\, e^{jk\omega} = \sum_k [\Lambda F]_{nk}\, e^{jk(2^{i-1}\omega)} \prod_{r=0}^{i-2} f_z(2^r \omega).$$

APPENDIX B
SUMMARY OF FILTER CONSTRAINTS

The discrete wavelet transform and the decimated discrete wavelet transform $w^i$ (or $d^i$) are defined for arbitrary filters f and g by

[...]

$$\sum_n f_n = \sqrt{2}$$

(i.e., $(1/\sqrt{2}) f_z(0) = 1$).

ii) Energy

$$\tfrac{1}{2} |f_z(\omega)|^2 \le 1.$$

iii) Low pass

$$\frac{1}{\sqrt{2}} f_z(\omega) = (1 + e^{j\omega})^N r(\omega) \tag{B.7}$$

where $|r(\omega)| \le C < 1/2$ ((B.7) implies that $f_z(\pi) = 0$).

iv) Energy: complementary low-pass/high-pass pair

(B.8)

Equation (B.8) implies

a) high pass

$$\sum_n g_n = 0 \tag{B.9}$$