Chap 1
Introduction to Harmonic
Analysis
November 12, 2010
Springer
Berlin Heidelberg New York
Hong Kong London
Milan Paris Tokyo
Contents
General Notation
4 Distributions
4.1 Motivation and Notation
4.1.1 Notation for Functionals
4.2 Convergence and Continuity
4.2.1 Examples of Spaces Defined by Seminorms
4.2.2 Convergence
4.2.3 From Convergence to a Metric
4.2.4 Continuity Equals Boundedness
4.3 Distributions of Various Sorts
5 Measures
5.1 Borel and Radon Measures on R
5.1.1 Convolution and Linear Time-Invariant Systems
References
Index
The latter notation is especially convenient when dealing with $X = L^p(\mathbb{R})$ and
with distributions. In particular, if $1 \le p < \infty$ then the dual space $L^p(\mathbb{R})^*$ is
isomorphic to $L^{p'}(\mathbb{R})$ in the sense that given any $\mu \in L^p(\mathbb{R})^*$ there exists a
unique function $g \in L^{p'}(\mathbb{R})$ such that
$$\langle f, \mu \rangle = \int f(x)\, \overline{g(x)}\, dx = \langle f, g \rangle, \qquad f \in L^p(\mathbb{R}).$$
Note that even though f is only defined almost everywhere, $\hat f(\xi)$ is a
well-defined complex scalar for every $\xi \in \mathbb{R}$ since
$$\int |f(x)\, e^{-2\pi i \xi x}|\, dx = \int |f(x)|\, dx = \|f\|_1 < \infty.$$
$$\mathcal{F} f = \hat f.$$
To avoid notationally ugly formulations, we sometimes write $f^\wedge$ or $(f)^\wedge$ instead
of $\hat f$.
Now we examine some of the properties of the Fourier transform, and
explain why we are so interested in it.
$$\|\mathcal{F}\| = \sup_{\|f\|_1 = 1} \|\hat f\|_\infty \le 1.$$
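Proof. The fact that $\mathcal{F}$ is linear is the reader's first exercise. Given $f \in L^1(\mathbb{R})$,
we have for any $\xi \in \mathbb{R}$ that
$$\begin{aligned}
|\hat f(\xi)| &= \Bigl| \int f(x)\, e^{-2\pi i \xi x}\, dx \Bigr| \\
&\le \int |f(x)\, e^{-2\pi i \xi x}|\, dx \\
&= \int |f(x)|\, dx = \|f\|_1 .
\end{aligned}$$
Hence
$$\|\hat f\|_\infty = \operatorname*{ess\,sup}_{\xi \in \mathbb{R}} |\hat f(\xi)| \le \|f\|_1 .$$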
Thus, the Fourier transform maps the unit ball in $L^1(\mathbb{R})$ into the unit ball
in $L^\infty(\mathbb{R})$. The following exercise shows that we actually have $\|\mathcal{F}\| = 1$, and
that the supremum in the definition of the operator norm of $\mathcal{F}$ is achieved,
i.e., there exists an $f \in L^1(\mathbb{R})$ with $\|f\|_1 = 1$ such that $\|\hat f\|_\infty = 1$. Even so,
the range of $\mathcal{F}$ is a proper subset of $L^\infty(\mathbb{R})$ (see Theorem 1.20).
1.1 Definition and Basic Properties 3
Exercise 1.3. Show that if $f \in L^1(\mathbb{R})$ then we have the useful fact that
$$\hat f(0) = \int f(x)\, dx.$$
Assume for now that $\hat f$ is continuous (we will prove this in Exercise 1.8), and
use this equality to show that if $f \ge 0$ a.e., then $\|\hat f\|_\infty = \|f\|_1$. Conclude that
$\|\mathcal{F}\| = 1$. ♦
Remark 1.4. There are many other standard normalizations for the Fourier
transform. For example, each of
$$\int f(x)\, e^{2\pi i \xi x}\, dx, \qquad \int f(x)\, e^{-i \xi x}\, dx, \qquad \frac{1}{\sqrt{2\pi}} \int f(x)\, e^{-i \xi x}\, dx,$$
is a common choice. Each normalization makes certain formulas notationally
nicer and other formulas more notationally complicated. ♦
$$d(\xi) = \frac{\sin \xi}{\pi \xi}$$
Fig. 1.1. Graph of the sinc function $d_\pi(\xi) = \frac{\sin \pi\xi}{\pi\xi}$.
The sinc function is also known as the cardinal sine, and indeed “sinc” is
a contraction of sinus cardinalis. The “cardinal” nature of the sinc function
is the fact that it is an interpolating function, because sinc(0) = 1 while
sinc(n) = 0 for all integers n ≠ 0. The Dirichlet function is named for Johann
Dirichlet (1805–1859).
Note that the functions d and sinc do not belong to L1 (R), because they
only decay slowly at infinity (on the order of 1/|ξ|). On the other hand, they
do belong to Lp (R) for every 1 < p ≤ ∞.
Exercise 1.7. Let $\chi_{[-T,T]}$ denote the characteristic function of the interval
$[-T, T]$. Note that $\chi_{[-T,T]} \in L^1(\mathbb{R})$. Prove the following statements.
(a) $(\chi_{[-T,T]})^\wedge(\xi) = \dfrac{\sin 2\pi T \xi}{\pi \xi} = d_{2\pi T}(\xi)$.
(b) $\|\chi_{[-T,T]}\|_1 = \|(\chi_{[-T,T]})^\wedge\|_\infty$.
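Part (a) can be sanity-checked numerically before attempting a proof. The following is only a sketch: the helper names, the midpoint rule, and the sample values of T and ξ are illustrative choices, not part of the text. It approximates the integral of $e^{-2\pi i \xi x}$ over $[-T, T]$ and compares it with the closed form $\sin(2\pi T\xi)/(\pi\xi)$.

```python
import cmath
import math


def fourier_transform_indicator(T, xi, n=20000):
    """Midpoint-rule approximation of the integral of e^{-2 pi i xi x} over [-T, T]."""
    h = 2.0 * T / n
    total = 0j
    for k in range(n):
        x = -T + (k + 0.5) * h
        total += cmath.exp(-2j * math.pi * xi * x)
    return total * h


def dirichlet_closed_form(T, xi):
    """Closed form sin(2 pi T xi) / (pi xi), with the limiting value 2T at xi = 0."""
    if xi == 0.0:
        return 2.0 * T
    return math.sin(2.0 * math.pi * T * xi) / (math.pi * xi)


T = 1.5  # illustrative half-width of the interval
for xi in (0.0, 0.3, 1.0, 2.7):
    approx = fourier_transform_indicator(T, xi)
    print(xi, approx.real, dirichlet_closed_form(T, xi))
```

The value at ξ = 0 equals 2T, which is the $L^1$-norm of the characteristic function, consistent with part (b).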
The last two properties of $d_{2\pi T}$ are in fact shared by any function that
is the Fourier transform of an $L^1$ function. The following exercise asks for a
direct proof of the fact that $\hat f$ is uniformly continuous if $f \in L^1(\mathbb{R})$. On the
other hand, Theorem 1.20 below shows that if $f \in L^1(\mathbb{R})$ then $\hat f$ must belong
to $C_0(\mathbb{R})$, which also implies (see Exercise 1.17) that $\hat f$ must be uniformly
continuous.
1.1.2 Motivation
Fig. 1.3. Graph of $\varphi(x) = \cos(2\pi\sqrt{7}\,x)$.
For a given fixed ξ, the function $\hat f(\xi)\, e^{2\pi i \xi x}$ is a pure tone whose amplitude
is the scalar $\hat f(\xi)$. The larger $\hat f(\xi)$ is, the larger the vibrations of our string or
tuning fork, and the louder the perceived sound.
Fig. 1.4. Graph of $\varphi(x) = 2\cos(2\pi\sqrt{7}\,x) + 0.7\cos(2\pi 9x)$.
Fig. 1.5. Graph of 75 superimposed pure tones: $\varphi(x) = \sum_{k=1}^{75} \hat f(\xi_k) \cos(2\pi \xi_k x)$.
correct amplitudes, we create any sound that we like. The pure tones are our
simple “building blocks,” and by combining them we can create any sound
(or signal, or function). Of course, the “superposition” is an integral, not a
finite sum, but still we are combining our very simple special functions $e_\xi$ to
create very complicated functions f via the Inversion Formula.
Once we have a representation of f in terms of our pure tones, we can
act on it. For example, assume that we measure time in seconds, in which
case frequency is measured in cycles per second, usually called hertz. If we
don't like the annoying buzz in our signal f that is due to our 60 hertz
overhead fluorescent lights, we might decide to modify f by creating a new
function h whose Fourier transform is identical to $\hat f$ except that
$\hat h(60) = 0$ (and most likely with some corresponding
smooth modifications of the frequencies close to ξ = 60 as well). Once we
know what we want $\hat h$ to be, the function h that does this is given by the
Inversion Formula as $h(x) = \int \hat h(\xi)\, e^{2\pi i \xi x}\, d\xi$. In engineering jargon, we filter f
to obtain h. We will see later that h can be obtained from f through the
operation of convolution (see Section 1.3.4).
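A discrete analogue of this filtering idea can be sketched as follows. Everything here is an assumption made for illustration: the sampling rate, the two test tones, and the use of a naive discrete Fourier transform in place of the integral transform. The sketch also zeroes single frequency bins abruptly, without the smooth modification of nearby frequencies mentioned above.

```python
import cmath
import math


def dft(samples):
    """Naive discrete Fourier transform (O(n^2)), adequate for a short signal."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def inverse_dft(coeffs):
    """Naive inverse discrete Fourier transform."""
    n = len(coeffs)
    return [sum(coeffs[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


rate = 200   # samples per second (assumed)
n = rate     # one second of signal, so bin k corresponds to k hertz

tone = [math.cos(2 * math.pi * 5 * t / rate) for t in range(n)]         # wanted 5 Hz tone
buzz = [0.5 * math.cos(2 * math.pi * 60 * t / rate) for t in range(n)]  # unwanted 60 Hz buzz
f = [a + b for a, b in zip(tone, buzz)]

f_hat = dft(f)
h_hat = list(f_hat)
h_hat[60] = 0        # remove the +60 Hz component
h_hat[n - 60] = 0    # remove the -60 Hz component, stored at index n - 60
h = [z.real for z in inverse_dft(h_hat)]

# After filtering, only the 5 Hz tone should remain (up to rounding error).
max_error = max(abs(a - b) for a, b in zip(h, tone))
print(max_error)
```

Both the +60 and −60 hertz bins are cleared because a real-valued cosine contributes to each.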
In light of the Inversion Formula, we make the following definition.
Definition 1.10 (Inverse Fourier Transform). The inverse Fourier transform
of $f \in L^1(\mathbb{R})$ is
$$\check f(\xi) = \int f(x)\, e^{2\pi i \xi x}\, dx.$$
Using this notation, the Inversion Formula says that if $f, \hat f \in L^1(\mathbb{R})$ then
$f = (\hat f)^\vee$, and likewise $f = (\check f)^\wedge$ will hold under the same hypotheses.
Remark 1.11. If $f \in L^1(\mathbb{R})$ then $\check f(\xi) = \hat f(-\xi)$. Therefore, every result that
we prove about the Fourier transform has an analogue for the inverse Fourier
transform, simply by making a change of variables. We usually only state
results for the Fourier transform, but it is a good idea for the reader to work
out the corresponding formulas for the inverse Fourier transform (usually, a
sign simply has to be moved from one place to another). Note, for example,
that if $f, \hat f \in L^1(\mathbb{R})$, then the Inversion Formula tells us that
$$(\hat f)^\wedge(\xi) = (\hat f)^\vee(-\xi) = f(-\xi).$$
♦
The musical discussion above may explain some terminology. In our ex-
ample, the function f (x) represented a displacement (of a string or speaker)
that changed with time. Time is represented by the variable x, and thus we
often speak of x as the time variable, and we say that values f (x) describe
the function f in the time domain. On the other hand, $\hat f(\xi)$ represents the
“amount” of frequency ξ present in f, and therefore we often refer to ξ as
the frequency variable, and say that values $\hat f(\xi)$ describe the function f in the
Additional Problems
1.1. Show that the Fourier transform of the one-sided exponential
$f(x) = e^{-x} \chi_{[0,\infty)}(x)$ is
$$\hat f(\xi) = \frac{1}{2\pi i \xi + 1}, \qquad \xi \in \mathbb{R},$$
and the Fourier transform of the two-sided exponential $g(x) = e^{-|x|}$ is
$$\hat g(\xi) = \frac{2}{4\pi^2 \xi^2 + 1}, \qquad \xi \in \mathbb{R}.$$
1.3. Prove that if $f \in L^1(\mathbb{R})$ is real-valued, then $\hat f(\xi) = \overline{\hat f(-\xi)}$. Conclude that
if f is both real and even, then $\hat f$ is both real and even as well.
1.4. Show that if $f \in L^1(\mathbb{R})$ is nonnegative almost everywhere (and is not the
zero function), then $|\hat f(\xi)| < \hat f(0)$ for all ξ ≠ 0.
1.5. (a) The Gamma function for complex numbers z satisfying Re(z) > 0 is
$$\Gamma(z) = \int_0^\infty t^{z-1} e^{-t}\, dt.$$
Show that this is well-defined, i.e., $t^{z-1} e^{-t} \in L^1(0, \infty)$ whenever Re(z) > 0.
Remark: The Gamma function is analytic on Re(z) > 0, and has an analytic
continuation to $\mathbb{C} \setminus \{0, -1, -2, \dots\}$. Also, Γ(n + 1) = n! for n ∈ N.
(b) Show that $f(x) = e^{-e^x} e^x \in L^1(\mathbb{R})$, and $\hat f(\xi) = \Gamma(1 - 2\pi i \xi)$.
Remark: It can be shown that Γ(z) ≠ 0 for every z where it is defined, so
for this f we have $\hat f(\xi) \ne 0$ for every ξ ∈ R.
1.6. The Riemann zeta function for complex numbers s with Re(s) > 1 is
$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}.$$
10 1 The Fourier Transform on L1 (R)
The Riemann zeta function is analytic on Re(s) > 1, and has an analytic
continuation to C \ {1}. The Riemann hypothesis, whose validity is one of the
great open problems in mathematics, states that if ζ(s) = 0 and Re(s) > 0
then Re(s) = 1/2.
(a) Show that
$$(1 - 2^{1-s})\, \zeta(s) = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n^s}, \qquad \mathrm{Re}(s) > 1. \tag{1.3}$$
It is helpful to note that, by a change of variables, $\Gamma(s) = n^s \int_0^\infty x^{s-1} e^{-nx}\, dx$
for Re(s) > 0.
(c) Justify interchanging the summation and integral in equation (1.4) to
obtain
$$\Gamma(s)\, (1 - 2^{1-s})\, \zeta(s) = \int_0^\infty x^{s-1}\, \frac{1}{e^x + 1}\, dx, \qquad \mathrm{Re}(s) > 0.$$
$$f_t(x) = \frac{e^{tx}}{e^{e^x} + 1}.$$
Show that $f_t \in L^1(\mathbb{R})$. Given ξ ∈ R, let s = t − 2πiξ and show that
Here are four operators that will pervade our study of the Fourier transform.
The scaling factor in the definition of dilation has been chosen so that di-
lation preserves the L1 -norm of a function. Translation, modulation, dilation,
and involution are all isometries mapping L1 (R) onto itself.
Exercise 1.13. Prove the following algebraic properties of the Fourier trans-
form of f ∈ L1 (R).
(a) $(T_a f)^\wedge(\xi) = (M_{-a} \hat f)(\xi) = e^{-2\pi i a \xi}\, \hat f(\xi)$, for a ∈ R.
Also derive analogous formulas relating the inverse Fourier transform to trans-
lation, modulation, dilation, and involution. ♦
In this sense, Ta is dual to M−a under the Fourier transform, and likewise
Mη is dual to Tη . Additionally, except for the normalizing scaling factor (which
will be very important to us in Section 1.5!), dilation by λ is dual to dilation
by 1/λ under the Fourier transform.
In order to derive some of the properties of the family of translation op-
erators, it is helpful to know that Cc (R) is dense in Lp (R) when p is finite.
We will prove this using standard real analysis techniques. By making use of
convolutions, we will greatly refine this result in Section 1.5. For example, we
will see that the seemingly “tiny” space Cc∞ (R) is dense in Lp (R) for each
1 ≤ p < ∞, and is dense in C0 (R) with respect to the L∞ -norm.
The tool that we need for this proof is Urysohn’s Lemma, a general topolog-
ical result which states that if A and B are disjoint closed subsets of a normal
topological space X then there exists a continuous function f : X → [0, 1] that
is identically 0 on A and identically 1 on B. We will prove Urysohn’s Lemma
for subsets of Rd (although the same simple proof can be used in any metric
space). The key is the following lemma.
$$\begin{aligned}
&\le |y - x| + |x - a| \\
&< \frac{\varepsilon}{2} + \operatorname{dist}(x, E) + \frac{\varepsilon}{2} \\
&= f(x) + \varepsilon.
\end{aligned}$$
Similarly $f(x) < f(y) + \varepsilon$, so $|f(x) - f(y)| < \varepsilon$ whenever $|x - y| < \varepsilon/2$. □
Theorem 1.15 (Urysohn’s Lemma). If E, F are disjoint closed subsets of
Rd , then there exists a continuous function θ : Rd → R such that
(a) 0 ≤ θ ≤ 1,
(b) θ = 0 on E, and
(c) θ = 1 on F.
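Proof. Because E is closed, if $x \notin E$ then dist(x, E) > 0, and the same holds
for F. Since E and F are disjoint, dist(x, E) + dist(x, F) > 0 for every x. Also,
by Lemma 1.14, dist(x, E) and dist(x, F) are each continuous functions of x.
Therefore the function
$$\theta(x) = \frac{\operatorname{dist}(x, E)}{\operatorname{dist}(x, E) + \operatorname{dist}(x, F)}$$
has the required properties. □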
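The formula for θ is concrete enough to evaluate directly. The sketch below works on the real line only (d = 1) and assumes, purely for illustration, that E and F are finite unions of closed intervals; the function names and the sample sets are hypothetical.

```python
def dist_to_intervals(x, intervals):
    """Distance from x to a finite union of closed intervals [a, b]."""
    return min(max(a - x, 0.0, x - b) for a, b in intervals)


def urysohn(x, E, F):
    """theta(x) = dist(x, E) / (dist(x, E) + dist(x, F)) for disjoint closed E, F."""
    d_e = dist_to_intervals(x, E)
    d_f = dist_to_intervals(x, F)
    # The denominator is positive because E and F are disjoint and closed.
    return d_e / (d_e + d_f)


E = [(-2.0, -1.0), (3.0, 4.0)]   # illustrative closed set E
F = [(0.0, 1.0)]                 # illustrative closed set F, disjoint from E

for x in (-1.5, -0.5, 0.5, 2.0, 3.5):
    print(x, urysohn(x, E, F))
```

Points of E are sent to 0, points of F to 1, and points in the gaps receive intermediate values.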
Theorem 1.16. Cc (Rd ) is dense in Lp (Rd ) for each 1 ≤ p < ∞.
Proof. First consider the function f = χE where E ⊆ Rd is bounded. If we fix
ε > 0, then there exists a bounded open set U ⊇ E such that |U \E| < ε
and a compact set K ⊆ E such that |E\K| < ε. By Urysohn’s Lemma
(Theorem 1.15), we can find a continuous function θ : Rd → R such that
0 ≤ θ ≤ 1, θ = 1 on K, and θ = 0 on Rd \U. Then θ ∈ Cc (Rd ), and we have
$$\|\chi_E - \theta\|_p^p = \int |\chi_E - \theta|^p = \int_{U \setminus K} |\chi_E - \theta|^p \le |U \setminus K| < 2\varepsilon.$$
(b) Show that (1.5) can fail if we only assume that f ∈ Cb (R).
(c) Show that if $1 \le p < \infty$ and $f \in L^p(\mathbb{R})$, then
$$\lim_{a \to 0} \|T_a f - f\|_p = 0. \qquad ♦$$
In another terminology, this says that τ (t) → τ (s) in the strong operator
topology as t → s. For 1 ≤ p < ∞, we therefore say that {Ta }a∈R is a strongly
continuous one-parameter family of operators on Lp (R). The same is true of
{Ta }a∈R on C0 (R) when p = ∞. ♦
Exercise 1.19. Prove that the family {Mη }η∈R of modulation operators is a
strongly continuous family of operators on Lp (R) for 1 ≤ p < ∞. For p = ∞,
show that {Mη }η∈R is a strongly continuous family on C0 (R), but not on
L∞ (R). ♦
Conjecture 1.22 was first made in [HRT96]. While many partial results
related to this conjecture are known, as of the time of writing it is not known
whether Conjecture 1.22 holds in the generality stated. For more details and
background on this conjecture, we refer to the survey paper [Heil06] or Section
11.9 in the author’s text [Heil11a].
Additional Problems
1.7. Let f : R → C be Lebesgue measurable. Prove that $T_a f \xrightarrow{\,m\,} f$ (convergence
in measure) as a → 0 on any compact set, i.e.,
1.10. Let X be a Banach space, and fix $A \in \mathcal{B}(X)$. For n > 0 let $A^n$ denote
the usual nth power of A ($A^n = A \cdots A$, n times), and define $A^0 = I$ (the
identity map on X).
(a) Given x ∈ X, show that the series $e^A(x) = \sum_{k=0}^{\infty} \frac{A^k x}{k!}$ converges
absolutely in X, and show that $e^A$ is a linear operator on X.
(b) Prove that the series $\sum_{k=0}^{\infty} \frac{A^k}{k!}$ converges absolutely in $\mathcal{B}(X)$, and
equals the operator $e^A$ defined in part (a). Conclude that $e^A \in \mathcal{B}(X)$ and
$\|e^A\| \le e^{\|A\|}$.
(c) Prove that if $A, B \in \mathcal{B}(X)$ and AB = BA, then $e^A e^B = e^{A+B} = e^B e^A$.
(d) Let H be a Hilbert space. Show that if $A \in \mathcal{B}(H)$ is self-adjoint, then
$e^{iA}$ is unitary.
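For intuition, parts (b) and (c) can be tried out in the finite-dimensional case, where operators are matrices. This is a sketch under assumptions: X is taken to be $\mathbb{R}^2$ with the sup norm (so the operator norm is the maximum absolute row sum), the series is truncated at a fixed number of terms, and the matrices are illustrative.

```python
import math


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mat_add(A, B):
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def mat_exp(A, terms=40):
    """Partial sum of the series sum_k A^k / k! (truncated; terms is an assumed cutoff)."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]          # A^0 / 0!
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in mat_mul(term, A)]  # A^k / k!
        result = mat_add(result, term)
    return result


def max_row_sum_norm(A):
    """Operator norm of A acting on R^n equipped with the sup norm."""
    return max(sum(abs(a) for a in row) for row in A)


A = [[0.0, 1.0], [-1.0, 0.0]]    # generator of rotations
B = [[2.0, 0.0], [0.0, 2.0]]     # a multiple of the identity, so AB = BA

exp_A = mat_exp(A)
product = mat_mul(exp_A, mat_exp(B))
exp_sum = mat_exp(mat_add(A, B))

print(exp_A[0][0], math.cos(1.0))                          # e^A is rotation by one radian
print(max_row_sum_norm(exp_A), math.exp(max_row_sum_norm(A)))
print(product[0][1], exp_sum[0][1])
```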
where these series converge absolutely both in the pointwise sense and in L1 -
norm. In contrast to Problem 1.10, note that the operators P and M are
unbounded on L1 (R).
1.12. Show that for an appropriate dense subset of functions $f \in L^1(\mathbb{R})$ we have
$$2\pi i M f = \lim_{a \to 0} \frac{f - T_a f}{a} \qquad \text{and} \qquad -2\pi i P f = \lim_{\xi \to 0} \frac{f - M_\xi f}{\xi}, \tag{1.9}$$
where these limits converge both in the pointwise sense and in $L^1$-norm.
Remark: Equation (1.9) says that 2πiM is the infinitesimal generator of
the strongly continuous family of operators $\{T_a\}_{a \in \mathbb{R}}$, and −2πiP is the
infinitesimal generator of the family $\{M_\xi\}_{\xi \in \mathbb{R}}$.
1.3 Convolution
Since L1 (R) is a Banach space, we know that it has many useful properties. In
particular the operations of addition and scalar multiplication are continuous.
However, there are many other operations on L1 (R) that we could consider.
One natural operation is multiplication of functions, but unfortunately L1 (R)
is not closed under pointwise multiplication.
Before proceeding, there are some technical issues related to the definition of
elements of Lp (R) that we need to clarify.
The basic source of difficulty is that an element f of Lp (R) is not a func-
tion but rather denotes an equivalence class of functions that are equal almost
everywhere. Therefore we cannot speak of the “value of f ∈ Lp (R) at a point
x ∈ R,” and consequently concepts such as continuity or support do not ap-
ply in a literal sense to elements of Lp (R). For example, the zero function
0 and the function χQ both belong to the zero element of Lp (R), which is
the equivalence class of functions that are zero a.e., yet 0 is continuous and
compactly supported while χQ is discontinuous and its support is R. Even so,
it is often essential to consider smoothness or support properties of functions,
and we therefore adopt the following conventions when discussing the smooth-
ness or support of elements of Lp (R). More generally, these same issues and
conventions apply to elements of
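$$L^1_{\mathrm{loc}}(\mathbb{R}) = \bigl\{ f : \mathbb{R} \to \mathbb{C} \,:\, f \cdot \chi_K \in L^1(\mathbb{R}) \text{ for every compact } K \subseteq \mathbb{R} \bigr\},$$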
Notation 1.24 (Continuity for Elements of L1loc (R)). We will say that
f ∈ L1loc (R) is continuous if there is a representative of f that is continuous,
i.e., there exists some continuous function f0 such that f is the equivalence
class of all functions that equal f0 almost everywhere.
Conversely, if g is a continuous function such that $\int_K |g(x)|\, dx < \infty$ for
every compact K ⊆ R, then we write $g \in L^1_{\mathrm{loc}}(\mathbb{R})$ with the understanding that this
means that the equivalence class of functions that equal g a.e. is an element
of L1loc (R). In this sense we write statements such as Cc (R) ⊆ Lp (R) even
though Cc (R) is a set of functions while Lp (R) is a set of equivalence classes
of functions. ♦
The reader should verify that if f ∈ L1loc (R) is continuous (in the sense
given in Notation 1.24), then the support of f in the sense of Notation 1.25
coincides with the usual definition of the support of f as the closure in R of
the set {x ∈ R : f(x) ≠ 0}.
Therefore $H(x, y) = f(y)\, g(x - y) \in L^1(\mathbb{R}^2)$, so Fubini's Theorem implies that
the function $(f * g)(x) = \int f(y)\, g(x - y)\, dy$ exists for almost every x and is
an integrable function of x.
(c) Using part (b),
$$\|f * g\|_1 = \int |(f * g)(x)|\, dx \le \iint |f(y)\, g(x - y)|\, dy\, dx = \|f\|_1 \|g\|_1 .$$
$$= \hat f(\xi)\, \hat g(\xi). \qquad □$$
(b) If $1 \le p, q, r \le \infty$ and $\frac{1}{r} = \frac{1}{p} + \frac{1}{q} - 1$ then $L^p(\mathbb{R}) * L^q(\mathbb{R}) \subseteq L^r(\mathbb{R})$, and
we have
Remark 1.31. If q = p′ (in which case r = ∞), then Young's Inequality tells
us that the convolution of $f \in L^p(\mathbb{R})$ with $g \in L^{p'}(\mathbb{R})$ belongs to $L^\infty(\mathbb{R})$. In
fact, it follows from our later Exercise 1.39 that f ∗g is continuous in this case,
and therefore (f ∗ g)(x) is defined for every x. However, for general values of
p, q, r satisfying the hypotheses of Young’s Inequality, we are only able to
conclude that f ∗ g ∈ Lr (R), and hence we usually only have that f ∗ g is
defined pointwise almost everywhere. ♦
The inequalities given in Exercise 1.30 are in the form that we will most
often need in practice, but it is very interesting to note that the implicit
constant 1 on the right-hand side of equation (1.13) is not the optimal constant
in general. Instead, if we define the Babenko–Beckner constant $A_p$ by
$$A_p = \left( \frac{p^{1/p}}{p'^{\,1/p'}} \right)^{1/2}, \tag{1.14}$$
and the constant $A_p A_q A_{r'}$ is typically not 1. The proof that $A_p A_q A_{r'}$ is the
best constant in equation (1.15) is due to Beckner [Bec75] and Brascamp
and Lieb [BrL76]. The Babenko–Beckner constant will make an appearance
again when we consider the Hausdorff–Young Theorem in Chapter 3 (see
Theorem 3.23).
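To see how far the sharp constant can fall below 1, it can simply be evaluated. The sketch below assumes 1 < p, q, r < ∞ so that every conjugate exponent is finite; the function names and the sample exponents are illustrative.

```python
def conjugate_exponent(p):
    """Dual index p' defined by 1/p + 1/p' = 1, for 1 < p < infinity."""
    return p / (p - 1.0)


def babenko_beckner(p):
    """A_p = (p^(1/p) / p'^(1/p'))^(1/2), for 1 < p < infinity."""
    q = conjugate_exponent(p)
    return (p ** (1.0 / p) / q ** (1.0 / q)) ** 0.5


def young_constant(p, q):
    """A_p A_q A_{r'} with 1/r = 1/p + 1/q - 1 (assumes 1 < p, q, r < infinity)."""
    r = 1.0 / (1.0 / p + 1.0 / q - 1.0)
    return babenko_beckner(p) * babenko_beckner(q) * babenko_beckner(conjugate_exponent(r))


print(babenko_beckner(2.0))                         # p = p' = 2
print(babenko_beckner(1.5) * babenko_beckner(3.0))  # A_p times A_{p'}
print(young_constant(4.0 / 3.0, 4.0 / 3.0))         # here r = 2
```

Directly from the definition, $A_2 = 1$ and $A_p A_{p'} = 1$, while for p = q = 4/3 the product $A_p A_q A_{r'}$ is strictly less than 1.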
Theorem 1.28 gives us another way to view filtering (which was discussed
in Section 1.1.2). Given f ∈ L1 (R), we filter f by modifying its frequency
content. That is, we create a new function h from f whose Fourier transform
is
$$\hat h(\xi) = \hat f(\xi)\, \hat g(\xi).$$
The Fourier transform of the function g tells us how to modify the frequency
content of f. Assuming that the Inversion Formula applies, we can recover h
by the formula
$$h(x) = \int \hat f(\xi)\, \hat g(\xi)\, e^{2\pi i \xi x}\, d\xi,$$
which is a superposition of the “pure tones” $e^{2\pi i \xi x}$ with the modified
amplitudes $\hat f(\xi)\, \hat g(\xi)$. Assuming that $g \in L^1(\mathbb{R})$, Theorem 1.28 tells us that we can
also obtain h by convolution: we have h = f ∗ g. Filtering is convolution.
Obviously, there are many details that we are glossing over by assuming
all of the formulas are applicable. Some of these we will address later, e.g.,
is it true that a function $h \in L^1(\mathbb{R})$ is uniquely determined by its Fourier
transform $\hat h$? (Yes, we will show that $f \mapsto \hat f$ is an injective map of $L^1(\mathbb{R})$ into
$C_0(\mathbb{R})$, see Theorem 1.92.) Others we will leave for a course on digital signal
processing. (For example, how do results for L1 (R) relate to the processing of
real-life digital signals whose domain is {1, . . . , n} instead of R?) In any case,
keeping our attention on the real line, let us ask one interesting question. If
our goal is to filter f, then one of the possible filterings should be the identity
operation, i.e., do nothing to the frequency content of f. Is there a g ∈ L1 (R)
such that $f \mapsto f * g$ is the identity operation on $L^1(\mathbb{R})$?
Exercise 1.32. Suppose that there existed a function δ ∈ L1 (R) such that
∀ f ∈ L1 (R), f ∗ δ = f.
Fig. 1.6. The area of the dashed box equals $\int_{x-T}^{x+T} f(y)\, dy$, which is the area under
the graph of f between x − T and x + T.
and dilate it so that it becomes more and more compressed towards the origin,
yet always keeping the total integral the same, by setting
Fig. 1.7. Top: The function $g(x) = \cos(x)/(1 + x^2)$. Bottom: The dilated function
$g_5(x) = 5g(5x)$.
If it is the case that $\int g = 1$ (so $\int g_\lambda = 1$ also), then we will see in
Section 1.5 that, for any $f \in L^1(\mathbb{R})$, the convolution $f * g_\lambda$ converges to f in
$L^1$-norm (and possibly in other senses as well, depending on properties of f
and g). The family $\{g_\lambda\}_{\lambda > 0}$ is an example of what we will call an approximate
identity in Section 1.5.
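The convergence can be watched in a small experiment. The choices below are assumptions made for illustration: f is the characteristic function of [0, 1], g is the characteristic function of [−1/2, 1/2] (so its integral is 1), and the dilation is $g_\lambda(x) = \lambda g(\lambda x)$, matching the form $g_5(x) = 5g(5x)$ shown in Fig. 1.7. For these choices the convolution has a closed form, and the $L^1$ error is estimated with a midpoint rule.

```python
def f(x):
    """Characteristic function of [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def conv_with_box(x, lam):
    """(f * g_lam)(x): lam times the length of [x - 1/(2 lam), x + 1/(2 lam)] meeting [0, 1]."""
    half = 0.5 / lam
    lo = max(x - half, 0.0)
    hi = min(x + half, 1.0)
    return lam * max(hi - lo, 0.0)


def l1_error(lam, n=60000):
    """Midpoint-rule estimate of the L^1 norm of f * g_lam - f over [-1, 2] (valid for lam >= 1)."""
    a, b = -1.0, 2.0
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        x = a + (k + 0.5) * h
        total += abs(conv_with_box(x, lam) - f(x))
    return total * h


for lam in (1.0, 2.0, 4.0, 8.0, 16.0):
    print(lam, l1_error(lam))
```

For this particular pair a direct computation gives an error of 1/(2λ), so the printed values should roughly halve each time λ doubles.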
From this discussion we can see at least an intuitive reason why there is
no identity function for convolution in L1 (R). Consider the functions gλ , each
an integrable function with integral 1 that becomes more and more spike-like
as λ increases. Suppose that we could let λ → ∞ and obtain in the limit
an integrable function δ that, like each function gλ , has integral 1, but is
indeed a spike supported entirely at the origin. Then we would hopefully have
“Let δ be the function on R that has the property that δ(x) = 0 for
all x ≠ 0 and $\int \delta(x)\, dx = 1$. Then
$$\int f(y)\, \delta(x - y)\, dy = f(x).\text{”} \tag{1.16}$$
However, there is no such function δ. Any function that is zero for all x ≠ 0
is zero almost everywhere, and hence is the zero element of $L^1(\mathbb{R})$. If δ(x) = 0
for x ≠ 0, then the Lebesgue integral of δ is $\int \delta(x)\, dx = 0$, not 1, even if we
define δ(0) = ∞. Thus f ∗ δ = 0, not f.
We cannot construct a function δ that has the property that f ∗δ = f for all
f ∈ L1 (R). However, we can construct families {gλ }λ>0 that have the property
that f ∗ gλ converges to f in various senses, and these are the approximate
identities of Section 1.5. We can also construct objects that are not functions
but which are identities for convolution—we will see the δ-distribution in
Chapter 4 and the δ-measure in Chapter 5. In effect, the integral appearing
in equation (1.16) is not a Lebesgue integral but rather is simply a shorthand
for something else, namely, the action of the distribution or measure δ on the
function f.
Thus we can view the convolution of f with g at the point x as the inner
product of f with the function $\tilde g$ translated by x.
Notation 1.33. In equation (1.17), we have used the notation $\langle \cdot, \cdot \rangle$, which in
the context of functions usually denotes the inner product on $L^2(\mathbb{R})$. However,
neither f nor $T_x \tilde g$ need belong to $L^2(\mathbb{R})$, so we are certainly taking some
poetic license in speaking of $\langle f, T_x \tilde g \rangle$ as an inner product of f and $T_x \tilde g$. We
do this because in this volume we so often encounter integrals of the form
$\int f(x)\, \overline{g(x)}\, dx$ and direct generalizations of these integrals that it is extremely
convenient for us to retain the notation $\langle f, g \rangle$ for such an integral whenever it
makes sense. Specifically, if f and g are any measurable functions on R, then
we will write
$$\langle f, g \rangle = \int f(x)\, \overline{g(x)}\, dx$$
$$f, g \in L^1(\mathbb{R}) \implies f \cdot T_x \tilde g \in L^1(\mathbb{R}) \text{ for a.e. } x. \qquad ♦$$
Thus (thanks to Fubini and his theorem), even if we only assume that f
and g are integrable, the “inner product” $(f * g)(x) = \langle f, T_x \tilde g \rangle$ exists for
almost every x.
$$= \|g - T_h g\|_\infty\, \|f\|_1 \to 0 \text{ as } h \to 0,$$
where the convergence follows from the fact that g is uniformly continuous.
Thus $f * g \in C_b(\mathbb{R})$, and in fact f ∗ g is uniformly continuous. Actually, we
can make this argument much more succinct by making use of the fact that
convolution commutes with translation (Exercise 1.29). We need only write:
$$\begin{aligned}
\|f * g - T_h(f * g)\|_\infty &= \|f * g - f * (T_h g)\|_\infty \\
&= \|f * (g - T_h g)\|_\infty \\
&\le \|f\|_1\, \|g - T_h g\|_\infty \to 0 \text{ as } h \to 0.
\end{aligned}$$
To show that $f * g \in C_0(\mathbb{R})$, consider first the case where $g \in C_c(\mathbb{R})$. Then
supp(g) ⊆ [−N, N] for some N > 0. Hence
$$\begin{aligned}
|(f * g)(x)| &\le \int_{x-N}^{x+N} |f(y)|\, |g(x - y)|\, dy \\
&\le \|g\|_\infty \int_{x-N}^{x+N} |f(y)|\, dy \to 0 \text{ as } |x| \to \infty.
\end{aligned}$$
and
f ∈ L∞ (R), g ∈ Cc∞ (R) =⇒ f ∗ g ∈ Cb∞ (R).
Moreover, in either case, if f is also compactly supported then we have f ∗ g ∈
Cc∞ (R). ♦
So from one function in $C_c^\infty(\mathbb{R})$ we can generate many others. But this raises
the question: Are there any functions that are both compactly supported and
infinitely differentiable? It is not at all obvious at first glance whether there
exist any functions with such extreme properties, but the following exercise
constructs some examples (see also Problem 1.20).
Exercise 1.42. Define $f(x) = e^{-1/x^2} \chi_{(0,\infty)}(x)$.
(a) Show that for every n ∈ N, there exists a polynomial $p_n$ of degree 3n such
that
$$f^{(n)}(x) = p_n(x^{-1})\, e^{-x^{-2}} \chi_{(0,\infty)}(x).$$
Conclude that $f \in C_b^\infty(\mathbb{R})$, and, in particular, $f^{(n)}(0) = 0$ for every n ≥ 0.
Note that supp(f) = [0, ∞).
(b) Show that if a < b, then g(x) = f(x − a) f(b − x) belongs to $C_c^\infty(\mathbb{R})$ and
satisfies supp(g) = [a, b] and g(x) > 0 on (a, b). ♦
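The construction in part (b) is easy to evaluate. The sketch below (function names and the interval [−1, 2] are illustrative) computes g pointwise and checks the support and positivity claims at a few sample points; it says nothing about smoothness, which is the real content of the exercise.

```python
import math


def f(x):
    """f(x) = exp(-1/x^2) for x > 0 and 0 otherwise."""
    if x <= 0.0:
        return 0.0
    t = 1.0 / x
    return math.exp(-(t * t))   # for tiny x, t*t may be inf and exp(-inf) is 0.0


def bump(x, a, b):
    """g(x) = f(x - a) f(b - x), supported on [a, b] and positive on (a, b)."""
    return f(x - a) * f(b - x)


a, b = -1.0, 2.0   # illustrative endpoints
for x in (-1.5, -1.0, -0.5, 0.5, 1.5, 2.0, 2.5):
    print(x, bump(x, a, b))
```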
$$\forall\, f, g \in A, \qquad \|f g\| \le \|f\|\, \|g\|.$$
Example 1.45. (a) Cb (R) is a commutative Banach algebra with identity under
the operation of pointwise products of functions, (f g)(x) = f (x)g(x).
(b) C0 (R) is a commutative Banach algebra without identity with respect
to pointwise products.
(c) Since the operator norm is submultiplicative, if X is a Banach space
then the space B(X) of all bounded linear operators mapping X into itself is
a noncommutative Banach algebra with identity with respect to composition
of operators. ♦
We will explore some aspects of the Banach algebra structure of L1 (R) and
its relatives next. First we create another Banach algebra that isometrically
inherits its structure from L1 (R) via the Fourier transform.
and set $\|\hat f\|_A = \|f\|_1$ for $f \in L^1(\mathbb{R})$. If we assume that the Fourier transform
is injective on $L^1(\mathbb{R})$, which we prove later in Theorem 1.92, then $\|\cdot\|_A$ is
well-defined. Given this, prove that $\|\cdot\|_A$ is a norm and that A(R) is a Banach space
with respect to this norm. Prove also that A(R) is a commutative Banach
algebra without identity with respect to the operation of pointwise products
of functions. ♦
The space A(R) is known by a variety of names, including both the Fourier
algebra and the Wiener algebra, although it should be noted that both of these
terms are sometimes used to refer to other spaces.
Our definition of A(R) is implicit—it is the set of all Fourier transforms
of L1 functions, and in fact is isometrically isomorphic to L1 (R). However,
we can ask whether there is an explicit description of A(R)—is it possible
to say that F ∈ A(R) based directly on properties of F ? For example, we
know that A(R) ⊆ C0 (R) by the Riemann–Lebesgue Lemma (Theorem 1.20),
x ∈ A, y ∈ I =⇒ xy ∈ I. ♦
other hand, the analogue of the Fourier transform becomes a very difficult
(and interesting) object to define and study on nonabelian groups, and this
is the topic of the representation theory of locally compact groups. For these
generalizations we direct the reader to texts such as those by Folland [Fol95],
Rudin [Rud62], or Hewitt and Ross [HR79].
Additional Problems
1.13. Let $f(x) = e^{-|x|}$, $g(x) = e^{-x^2}$, and $h(x) = x e^{-x^2}$. Show that
(see Notation 1.25). Further, if f and g are both compactly supported then
so is f ∗ g and, in this case, supp(f ∗ g) ⊆ supp(f ) + supp(g).
1.17. Give an example of f ∈ Lp (R) with 1 < p < ∞ and g ∈ C0 (R) such that
f ∗ g is not defined. Compare this to Theorem 1.36 and Exercise 1.40, which
show that f ∗ g ∈ C0 (R) if either f ∈ L1 (R) and g ∈ C0 (R), or if f ∈ Lp (R)
and g ∈ Cc (R).
To begin, we show that if f has sufficient decay, then $\hat f$ must have a
corresponding amount of smoothness. Although it is an abuse of notation, we
will write $\bigl((-2\pi i x)^k f(x)\bigr)^\wedge$ to denote the Fourier transform of the function
$g(x) = (-2\pi i x)^k f(x)$.
$$\hat f^{(k)} = \frac{d^k}{d\xi^k} \hat f = \bigl((-2\pi i x)^k f(x)\bigr)^\wedge, \qquad k = 0, \dots, m. \quad ♦ \tag{1.21}$$
dξ k
Equation (1.21) does not come out of the blue. We can guess the correct
equation by formally exchanging a derivative and an integral:
1.4 The Duality Between Smoothness and Decay 37
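$$\begin{aligned}
\frac{d}{d\xi} \hat f(\xi) &= \frac{d}{d\xi} \int f(x)\, e^{-2\pi i \xi x}\, dx \\
&= \int f(x)\, \frac{d}{d\xi} e^{-2\pi i \xi x}\, dx \\
&= \int f(x)\, (-2\pi i x)\, e^{-2\pi i \xi x}\, dx \\
&= \bigl(-2\pi i x f(x)\bigr)^\wedge(\xi).
\end{aligned}$$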
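$$\begin{aligned}
\hat f\,'(\xi) &= \lim_{\eta \to 0} \frac{\hat f(\xi + \eta) - \hat f(\xi)}{\eta} \\
&= \lim_{\eta \to 0} \int f(x)\, \frac{e^{-2\pi i (\xi + \eta) x} - e^{-2\pi i \xi x}}{\eta}\, dx
\end{aligned} \tag{1.22}$$
exists. We do this by applying the Lebesgue Dominated Convergence Theorem.
Define $e_x(\xi) = e^{-2\pi i \xi x}$. Then the integrand in equation (1.22) converges
pointwise for almost every x, because
$$\lim_{\eta \to 0} f(x)\, \frac{e_x(\xi + \eta) - e_x(\xi)}{\eta} = f(x)\, e_x'(\xi) = f(x)\, (-2\pi i x)\, e^{-2\pi i \xi x}.$$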
Fig. 1.8. The distance $|e^{i\theta} - 1|$ from $e^{i\theta}$ to 1 is smaller than the arclength θ along
the unit circle from $e^{i\theta}$ to 1.
Further, since we always have $|e^{i\theta} - 1| \le |\theta|$ (see the “proof by picture” in
Figure 1.8 for the case $0 \le \theta \le \frac{\pi}{2}$), we have
Hence the Lebesgue Dominated Convergence Theorem implies that the limit
in equation (1.22) exists, and
$$\begin{aligned}
\hat f\,'(\xi) &= \lim_{\eta \to 0} \int f(x)\, \frac{e^{-2\pi i (\xi + \eta) x} - e^{-2\pi i \xi x}}{\eta}\, dx \\
&= \int f(x)\, (-2\pi i x)\, e^{-2\pi i \xi x}\, dx \\
&= \bigl((-2\pi i x) f(x)\bigr)^\wedge(\xi).
\end{aligned}$$
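The differentiation formula can be checked numerically on a concrete function. The sketch below makes several assumptions for illustration: it uses the Gaussian $f(x) = e^{-\pi x^2}$, relies on the standard fact that this function is its own Fourier transform under the normalization used here, truncates the integral to [−8, 8], and uses a central difference quotient rather than the one-sided quotient of equation (1.22).

```python
import cmath
import math


def transform(func, xi, half_width=8.0, n=16000):
    """Midpoint-rule approximation of the integral of func(x) e^{-2 pi i xi x} over [-half_width, half_width]."""
    h = 2.0 * half_width / n
    total = 0j
    for k in range(n):
        x = -half_width + (k + 0.5) * h
        total += func(x) * cmath.exp(-2j * math.pi * xi * x)
    return total * h


def gaussian(x):
    return math.exp(-math.pi * x * x)


def weighted(x):
    """The function (-2 pi i x) f(x) whose transform should equal the derivative of f-hat."""
    return (-2j * math.pi * x) * gaussian(x)


xi = 0.4     # illustrative frequency
eta = 1e-5   # illustrative step for the difference quotient

difference_quotient = (transform(gaussian, xi + eta) - transform(gaussian, xi - eta)) / (2 * eta)
direct = transform(weighted, xi)
exact = -2.0 * math.pi * xi * math.exp(-math.pi * xi * xi)   # derivative of exp(-pi xi^2)

print(difference_quotient, direct, exact)
```

All three printed values should agree to several digits, with negligible imaginary parts.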
$$M \hat f = (-P f)^\wedge.$$
In order to prove our next main result, we will need to apply both the Fun-
damental Theorem of Calculus and the technique of integration by parts on
an interval [a, b]. Of course, if f is differentiable and f ′ is continuous on [a, b],
then these results can certainly be applied. However, the Fundamental The-
orem of Calculus and integration by parts hold more generally; in fact, they
apply as long as f is an absolutely continuous function. Therefore we pause
to briefly review the definition and basic properties of absolute continuity.
The next fact that we need is the Banach–Zarecki Theorem, which gives a
characterization of absolutely continuous functions (see [BBT97, Thm. 7.11]
for a proof). Bounded variation is defined in Problem 1.30.
Theorem 1.61 (Banach–Zarecki Theorem). Given f : [a, b] → C, the fol-
lowing statements are equivalent.
(a) f ∈ AC[a, b].
(b) f is continuous, f ∈ BV[a, b], and |f (A)| = 0 for every A ⊆ [a, b] with
|A| = 0.
(c) f is continuous and is differentiable a.e., f ′ ∈ L1 [a, b], and |f (A)| = 0 for
every A ⊆ [a, b] with |A| = 0. ♦
Combining these two results, we obtain a useful sufficient condition for
absolute continuity.
Theorem 1.62. If f : [a, b] → C is everywhere differentiable and f ′ ∈ L1 [a, b],
then f ∈ AC[a, b]. ♦
Proof. By splitting into real and imaginary parts, we may assume that f is
real-valued. Suppose that A ⊆ [a, b] and |A| = 0. Then since f is differentiable
at every point of A, we have by Lemma 1.60 that $|f(A)|_e \le \int_A |f'| = 0$.
Theorem 1.61 therefore implies that f is absolutely continuous. □
The hypotheses of Theorem 1.62 can be relaxed somewhat. For example,
if f is differentiable at all but countably many points and f ′ ∈ L1 [a, b], then
f will be absolutely continuous (see [Ben76]). However, the assumptions that
f is differentiable a.e. and f ′ ∈ L1 [a, b] are by themselves not sufficient to
ensure that f is absolutely continuous (the Cantor–Lebesgue function is a
counterexample, see Problem 1.33).
As we can see from Theorem 1.61 (and it is easy to prove directly), every
absolutely continuous function has bounded variation. Although we will not
need to use it, we state a fundamental decomposition for functions of bounded
variation.
Definition 1.63 (Singular Function). A function h is singular if h is dif-
ferentiable at almost every point in its domain and h′ = 0 a.e. ♦
Theorem 1.64. If f ∈ BV[a, b], then f = g + h where g ∈ AC[a, b] and h is
singular on [a, b]. Moreover, g and h are unique up to additive constants, and
we can take
$$g(x) = \int_a^x f', \qquad x \in [a, b]. \quad ♦ \tag{1.23}$$
Now we show how the Fourier transform converts smoothness into decay.
Theorem 1.65 (Smoothness and Decay II). Let $f \in L^1(\mathbb{R})$ and $m \in \mathbb{N}$ be
given. If f is everywhere m-times differentiable and $f, f', \dots, f^{(m)} \in L^1(\mathbb{R})$,
then
\[ \bigl(f^{(k)}\bigr)^{\wedge}(\xi) = (2\pi i\xi)^k \, \hat f(\xi), \qquad k = 0, \dots, m. \]
Consequently,
\[ |\hat f(\xi)| \le \frac{\|f^{(m)}\|_1}{|2\pi\xi|^m}, \qquad \xi \ne 0. \tag{1.24} \]
Proof. Consider the case m = 1. Assume that f is everywhere differentiable
and that f, f ′ ∈ L1 (R). Then f ∈ ACloc (R) by Theorem 1.62, so f is abso-
lutely continuous on every interval [a, b] in R. Consequently the Fundamental
Theorem of Calculus (Theorem 1.59) holds for f, so
\[ f(x) - f(0) = \int_0^x f'(t)\,dt, \qquad x \in \mathbb{R}. \]
$= 2\pi i\xi \, \hat f(\xi).$
42 1 The Fourier Transform on L1 (R)
We extract and reformulate one useful fact that played a role in the proof
of Theorem 1.65.
Exercise 1.66. If $g \in L^1(\mathbb{R})$ and its antiderivative $f(x) = \int_{-\infty}^x g(t)\,dt$ also
belongs to $L^1(\mathbb{R})$, then $\int g = 0$ and $f \in C_0(\mathbb{R})$. ♦
The essential property used in the proof of Theorem 1.65 is absolute con-
tinuity. For example, if we consider m = 1, the assumption in Theorem 1.65
that f is everywhere differentiable and f, f ′ are integrable is made solely so
that we will know that f is absolutely continuous. Absolute continuity is cer-
tainly necessary, as is shown by the example of the reflected Cantor–Lebesgue
function (see Problem 1.33). That particular function f is continuous and com-
pactly supported, is differentiable a.e., and both f and f ′ belong to L1 (R).
However, f is singular so f′ = 0 a.e., and hence $\widehat{f'}$ is identically zero while
$(2\pi i\xi)\,\hat f(\xi)$ is not.
On the other hand, as long as f is absolutely continuous and both f, f ′ are
integrable, the proof of Theorem 1.65 can be carried over, and so we obtain
the following result.
This implies the following variation on Theorem 1.65, dealing with the
Fourier transform of an antiderivative of an integrable function.
Corollary 1.68. If $g \in L^1(\mathbb{R})$ and $f(x) = \int_{-\infty}^x g(t)\,dt$ also belongs to $L^1(\mathbb{R})$,
then $\hat g(\xi) = 2\pi i\xi \, \hat f(\xi)$ for $\xi \in \mathbb{R}$. ♦
Hence the Riemann–Lebesgue Lemma (Theorem 1.20) holds for these par-
ticular functions g. A consequence of Exercise 1.78, which comes later in
Section 1.5, is that the set
\[ \bigl\{ g \in L^1(\mathbb{R}) : g \text{ is everywhere differentiable and } g' \in L^1(\mathbb{R}) \bigr\} \]
is a dense subset of L1 (R). Assuming this fact for the moment, we can use
an extension by density argument to give another proof that the Riemann–
Lebesgue Lemma is valid for all f ∈ L1 (R).
Proof (of Theorem 1.20). Choose any $f \in L^1(\mathbb{R})$ and ε > 0. Then there exists
a differentiable function g with both $g, g' \in L^1(\mathbb{R})$ such that $\|f - g\|_1 < \varepsilon$. By
Theorem 1.65, and equation (1.24) in particular, we have $\hat g \in C_0(\mathbb{R})$, so there
exists an R > 0 such that $|\hat g(\xi)| < \varepsilon$ for all |ξ| > R. Now, for every ξ ∈ R we
have
\[ |\hat f(\xi) - \hat g(\xi)| \le \|\hat f - \hat g\|_\infty \le \|f - g\|_1 < \varepsilon, \]
so $|\hat f(\xi)| < 2\varepsilon$ for all |ξ| > R. Hence $\hat f \in C_0(\mathbb{R})$ as well. ⊓⊔
Additional Problems
1.26. Illustrate the connection between smoothness and decay given in Theo-
rem 1.65 by computing the following Fourier transforms explicitly. Then make
the same comparisons for the functions given in Problem 1.1.
(a) Show that if $f(x) = (\cos 6\pi x)\,\chi_{[-1/2,1/2]}(x)$, then
\[ \hat f(\xi) = \frac{\xi \sin \pi\xi}{\pi\,(9 - \xi^2)}. \]
\[ \hat g(\xi) = \frac{3i \sin \pi\xi}{\pi\,(\xi^2 - 9)}. \]
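Before attempting part (a) by hand, it can be reassuring to compare the stated closed form against a direct numerical evaluation of the Fourier integral. The sketch below is only a sanity check under our own assumptions: it uses the convention $\hat f(\xi) = \int f(x)\,e^{-2\pi i\xi x}\,dx$ from this chapter, a plain midpoint rule, and the hypothetical helper names `fourier_numeric` and `closed_form`.

```python
import cmath
import math

def fourier_numeric(f, xi, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f(x) e^{-2 pi i xi x} over [a, b]."""
    h = (b - a) / n
    total = 0j
    for j in range(n):
        x = a + (j + 0.5) * h
        total += f(x) * cmath.exp(-2j * math.pi * xi * x)
    return total * h

def f(x):
    # f(x) = cos(6 pi x) on [-1/2, 1/2]; the cutoff is handled by the integration limits.
    return math.cos(6 * math.pi * x)

def closed_form(xi):
    # Formula from part (a); valid away from the removable points xi = 3 and xi = -3.
    return xi * math.sin(math.pi * xi) / (math.pi * (9 - xi * xi))

for xi in (0.5, 1.0, 2.5, 10.5):
    print(xi, fourier_numeric(f, xi, -0.5, 0.5).real, closed_form(xi))
```

The two columns should agree to several digits, and the values shrink roughly like 1/|ξ|, which is the slow decay one expects from a function with jump discontinuities.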
1.27. Let $f(x) = x e^{-x}\,\chi_{[0,\infty)}(x)$. Show both directly and by using Theo-
rem 1.65 that $\hat f(\xi) = (1 + 2\pi i\xi)^{-2}$.
1.28. Assuming the hypotheses of Theorem 1.65, improve the conclusion given
in equation (1.24) by showing that $\lim_{|\xi|\to\infty} |\xi^m \hat f(\xi)| = 0$.
1.29. Let X be a normed space, and let $\{f_t\}_{t\in\mathbb{R}}$ be a sequence in X indexed
by a real parameter. Show that $f_t \to f$ as $t \to 0$ if and only if $f_{t_k} \to f$ for
every sequence of real numbers $\{t_k\}_{k\in\mathbb{N}}$ such that $t_k \to 0$.
where the supremum is taken over all partitions Γ of [a, b]. We call V [f ; a, b]
the variation of f over [a, b] and say that f has bounded variation on [a, b]
if V [f ; a, b] < ∞. The set of functions with bounded variation on [a, b] is
BV[a, b]. Prove the following statements.
(a) If f, g ∈ BV[a, b], then αf + βg ∈ BV[a, b] and f g ∈ BV[a, b]. If
|g(x)| ≥ ε > 0 for all x ∈ [a, b] then f /g ∈ BV[a, b].
(b) Set $f(x) = x^2 \sin(1/x)$ and $g(x) = x^2 \sin(1/x^2)$ for $x \ne 0$, and f(0) =
g(0) = 0. Then f and g are differentiable everywhere, f ∈ BV[−1, 1], and
g ∉ BV[−1, 1].
(c) For y ∈ R set y + = max{y, 0} and y − = max{−y, 0}, so y + − y − = y
and y + + y − = |y|.
Given a real-valued f ∈ BV[a, b], define
\[ S_\Gamma^+ = \sum_{i=1}^n \bigl[ f(x_i) - f(x_{i-1}) \bigr]^+ \qquad\text{and}\qquad V^+[f; a, b] = \sup_\Gamma S_\Gamma^+, \]
and
V + [f ; a, x] − V − [f ; a, x] = f (x) − f (a), x ∈ [a, b].
Fig. 1.9. First stages in the construction of the Cantor–Lebesgue function. Left:
The function ϕ1. Right: The function ϕ2.
Fig. 1.10. The reflected Devil’s staircase (Cantor–Lebesgue function).
The properties that a family {kλ }λ>0 will need to possess in order to be an
approximate identity for convolution are listed in the next definition.
1.5 Approximate Identities 47
and show that the family {kλ }λ>0 forms an approximate identity. ♦
The continuity of $\hat k$ therefore implies that for each ξ ∈ R we have
\[ \lim_{\lambda\to\infty} \hat k_\lambda(\xi) = \lim_{\lambda\to\infty} \hat k\Bigl(\frac{\xi}{\lambda}\Bigr) = \hat k(0) = 1 \tag{1.25} \]
(see the illustration in Figure 1.11 using the Gaussian function $g(x) = e^{-\pi x^2}$,
which by Exercise 1.109 has the interesting property that $\hat g(\xi) = e^{-\pi\xi^2}$). Thus
$\hat k_\lambda$ converges pointwise everywhere to the constant function 1. This again
matches our intuition for what the Fourier transform of a “δ-function” would
be if there was one, for if there was a function δ that satisfied both δ(x) = 0
for $x \ne 0$ and $\int \delta = 1$, then $\hat\delta$ would be identically constant:
\[ \hat\delta(\xi) = \int \delta(x)\, e^{-2\pi i\xi x}\,dx = \int \delta(x)\,dx = 1. \]
So, we at least have that $(f * k_\lambda)^{\wedge}(\xi)$ converges pointwise to $\hat f(\xi)$, and this
\[ \forall\, f \in L^1(\mathbb{R}), \qquad \lim_{\lambda\to\infty} \|f - f * k_\lambda\|_1 = 0. \]
Fig. 1.11. Top: The Fourier transform $\hat g(\xi) = e^{-\pi\xi^2}$ of the function $g(x) = e^{-\pi x^2}$.
Middle: The Fourier transform $\hat g_5(\xi) = \hat g(\xi/5)$ of the dilated function $g_5(x) = 5g(5x)$.
Bottom: The Fourier transform $\hat g_{15}(\xi) = \hat g(\xi/15)$ of the dilated function $g_{15}(x) = 15g(15x)$.
$\le \varepsilon K + 2\|f\|_1\,\varepsilon.$
Thus $\|f - f * k_\lambda\|_1 \to 0$ as λ → ∞. ⊓⊔
To illustrate the convergence proved in the preceding theorem, consider
the particular function $\chi = \chi_{[0,1]}$ and a particular approximate identity that
will be of considerable use to us later. This is the Fejér kernel $\{w_\lambda\}_{\lambda>0}$,
which is produced by dilating the Fejér function $w(x) = \bigl(\frac{\sin \pi x}{\pi x}\bigr)^2$ depicted in
Figure 1.14. In Figure 1.12, we see the convolutions $\chi * w$, $\chi * w_5$, and $\chi * w_{25}$. In
addition to the convergence apparent in these figures, note that the convolved
functions appear to be continuous, while χ is discontinuous. This is due to
the smoothing effect of convolution, which was discussed in Section 1.3.7.
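The convolutions of χ with the dilated Fejér function can also be tabulated numerically. The following is a rough sketch under our own assumptions: the helper names `fejer` and `chi_conv_fejer` are hypothetical, the dilation is $w_\lambda(t) = \lambda\,w(\lambda t)$, and the integral $\int_0^1 w_\lambda(x - y)\,dy$ is approximated with a simple midpoint rule.

```python
import math

def fejer(x):
    """Fejer function w(x) = (sin(pi x) / (pi x))^2, extended by w(0) = 1."""
    if x == 0.0:
        return 1.0
    s = math.sin(math.pi * x) / (math.pi * x)
    return s * s

def chi_conv_fejer(x, lam, n=4000):
    """Midpoint-rule value of (chi_[0,1] * w_lam)(x), where w_lam(t) = lam * w(lam * t)."""
    h = 1.0 / n
    return sum(lam * fejer(lam * (x - (j + 0.5) * h)) for j in range(n)) * h

# Sample the convolution at points outside, on the boundary of, and inside [0, 1].
for lam in (1, 5, 25):
    print(lam, [round(chi_conv_fejer(x, lam), 4) for x in (-0.5, 0.0, 0.5, 1.0, 1.5)])
```

As λ grows, the values inside (0, 1) approach 1 and those outside [0, 1] approach 0, while at the jump points 0 and 1 the values approach 1/2, as one would expect from the symmetry of w.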
There are many variations on the theme of Theorem 1.71. To begin, since
Lp ∗ L1 ⊆ Lp , we expect that we may be able to extend to other values of p,
and indeed for p finite we have the following result.
\[ \forall\, f \in C_0(\mathbb{R}), \qquad \lim_{\lambda\to\infty} \|f - f * k_\lambda\|_\infty = 0. \]
\[ \forall \text{ compact } K \subseteq \mathbb{R}, \qquad \lim_{\lambda\to\infty} \bigl\| (f - f * k_\lambda)\,\chi_K \bigr\|_\infty = 0. \]
and the set of Lebesgue points is called the Lebesgue set of f. Every point
of continuity is a Lebesgue point of f. More interestingly, the Lebesgue Dif-
ferentiation Theorem implies that almost every x ∈ R is a Lebesgue point
(Theorem A.30).
Now, we have seen that if f belongs to Lp (R) with p finite then f ∗ kλ → f
in Lp -norm. If we impose some restrictions on k, then we can also show that
we have pointwise convergence of (f ∗ kλ )(x) to f (x) at each Lebesgue point x
of f.
Theorem 1.77. Let k be a bounded, compactly supported function such that
$\int k = 1$, and define $k_\lambda(x) = \lambda k(\lambda x)$. If $f \in L^1(\mathbb{R})$, then $(f * k_\lambda)(x) \to f(x)$
as λ → ∞ for each point x in the Lebesgue set of f. In particular, $f * k_\lambda$
converges to f pointwise a.e.
Proof. By hypothesis, supp(k) ⊆ [−R, R] for some R. If x is a Lebesgue point
of f, then
= 0,
where the limit is zero by definition of Lebesgue point. Finally, since almost
every x is a Lebesgue point of f, we conclude that f ∗ kλ converges to f
pointwise a.e. ⊓⊔
The hypotheses on k in Theorem 1.77 can be relaxed. In particular, com-
pact support of k is not required. For example, Stein and Weiss [SW71, p. 13]
give a more intricate argument that shows that if $k \in L^1(\mathbb{R})$ satisfies $\int k = 1$
and there exists an even function $\phi \in L^1(\mathbb{R})$ that is decreasing and differen-
tiable on (0, ∞) that dominates k in the sense that |k(x)| ≤ φ(x) for all x,
then $(f * k_\lambda)(x) \to f(x)$ for every x in the Lebesgue set of f.
Theorem 1.16 gave us a proof, based on Urysohn’s Lemma, that the space
Cc (R) is dense in L1 (R). It almost seems that we should be able to give a
“simple” proof of this fact by using approximate identities and arguing as
follows. Choose any f ∈ L1 (R). Then we can find a compactly supported g ∈
L1 (R) that is close to f, e.g., if we take R large enough and set g = f χ[−R,R]
then we will have kf − gk1 < ε. If we convolve g with an element kλ of an
approximate identity, then g ∗ kλ will be close to g if λ is large enough, say
kg − g ∗ kλ k1 < ε. Further, if we choose our approximate identity so that kλ is
a “nice” function then g ∗ kλ will inherit this “niceness” as well. For example,
if kλ ∈ Cc (R) then g ∗ kλ ∈ Cc (R), and so we have found an element of Cc (R)
that lies within 2ε of f in L1 -norm.
The flaw in this reasoning is that our proof in Theorem 1.71 that g ∗kλ → g
in L1 -norm relies on the fact that translation is a strongly continuous family
of operators in L1 (R). The proof of this strong continuity of translation is
the content of Exercise 1.17. However, the proof of that exercise (at least the
proof we suggest in the hints) requires us to already know that Cc (R) is dense
in L1 (R). Hence the reasoning of the preceding paragraph is circular.
We could take a different approach, e.g., by first arguing that simple func-
tions are dense in L1 (R) and trying to proceed from there to show that Cc (R)
is dense. But it doesn’t really matter: one way or the other we essentially
have to “get our hands dirty” and show that some particular special subset is
dense. The power of approximate identities comes at the next step—once we
know that one particular set is dense, we can use the spirit of the argument
above (convolution with a “nice” approximate identity) to easily show that
“much nicer” spaces are also dense. We even obtain results that almost seem
too good to be true. For example, we will see that the space Cc∞ (R) consisting
of infinitely differentiable, compactly supported functions is dense in Lp (R) for
every 1 ≤ p < ∞! This fact is not just an abstract surprise, but will be of
great utility to us throughout the remainder of this volume, especially when
we turn to distribution theory in Chapter 4.
Exercise 1.78. (a) Show that Ccm (R) is dense in Lp (R) for each m ≥ 0 and
1 ≤ p < ∞, and also is dense in C0 (R) in L∞ -norm.
(b) Show that Cc∞ (R) is dense in Lp (R) for each 1 ≤ p < ∞, and also is
dense in C0 (R) in L∞ -norm. ♦
After proving the Inversion Formula in Section 1.6, we will also be able
to show that many spaces of functions with nice Fourier transforms are also
dense. For example, in Problem 1.66 we will see that $\{ f \in L^1(\mathbb{R}) : \hat f \in C_c^\infty(\mathbb{R}) \}$
is dense in $L^p(\mathbb{R})$ for 1 ≤ p < ∞.
We proved Urysohn’s Lemma for the setting of the real line in Theorem 1.15.
By using approximate identities, we now prove a much more refined C ∞ -
version of Urysohn’s Lemma for subsets of R.
Proof. Since K is compact and R\U is closed, the distance between these sets
is positive, i.e.,
\[ d = \operatorname{dist}(K, \mathbb{R}\setminus U) = \inf\bigl\{ |x - y| : x \in K,\ y \notin U \bigr\} > 0. \]
Let
\[ V = \Bigl\{ y \in \mathbb{R} : \operatorname{dist}(y, K) < \frac{d}{3} \Bigr\}, \]
and let k be any function in $C_c^\infty(\mathbb{R})$ such that k ≥ 0, $\int k = 1$, and $\operatorname{supp}(k) \subseteq
[-\frac{d}{3}, \frac{d}{3}]$ (for example, dilate the function constructed in Exercise 1.42). Set
$f = k * \chi_V$. Since k and $\chi_V$ are both compactly supported, their convolution
is also compactly supported, and hence it follows from Corollary 1.41 that
$f \in C_c^\infty(\mathbb{R})$. Since
\[ f(x) = \int_V k(x - y)\,dy \le \int k = 1, \]
we have 0 ≤ f ≤ 1 everywhere. Also, if x ∈ K and y ∉ V then $|x - y| \ge \frac{d}{3}$
and so k(x − y) = 0. Therefore for x ∈ K we have
\[ f(x) = \int_V k(x - y)\,dy = \int k(x - y)\,dy = 1. \]
Similarly if x ∉ U then it follows that f(x) = 0. ⊓⊔
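The construction in this proof is concrete enough to try numerically. The sketch below is an illustration under assumptions of our own, not the function of Exercise 1.42: we take the hypothetical sets K = [0, 1] and U = (−1, 2), so that d = 1, use the familiar bump $e^{-1/(1 - (x/r)^2)}$ normalized by a numerically computed mass, and evaluate $f = k * \chi_V$ with a midpoint rule.

```python
import math

def bump(x, r):
    """Unnormalized C-infinity bump supported in [-r, r] (an assumed choice of k)."""
    u = (x / r) ** 2
    if u >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - u))

def integrate(g, a, b, n=2000):
    """Midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (j + 0.5) * h) for j in range(n)) * h

# Hypothetical concrete choice: K = [0, 1], U = (-1, 2), so d = dist(K, R \ U) = 1.
d = 1.0
r = d / 3
v_left, v_right = -d / 3, 1 + d / 3          # V = {y : dist(y, K) < d/3}
mass = integrate(lambda t: bump(t, r), -r, r)  # normalizes k so that its integral is 1

def f(x):
    """f = k * chi_V: the integral of k(x - y) over y in V, with k = bump / mass."""
    lo, hi = max(v_left, x - r), min(v_right, x + r)
    if lo >= hi:
        return 0.0
    return integrate(lambda y: bump(x - y, r), lo, hi) / mass

print([round(f(x), 6) for x in (-1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0)])
```

The printed values equal 1 (up to rounding) on K, vanish outside U, and lie strictly between 0 and 1 in the transition regions, matching the three conclusions of the proof.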
Gibbs’s phenomenon refers to the behavior of the partial sums of the Fourier
series of a periodic function near a jump discontinuity. Although this chapter
focuses on the Fourier transform on R rather than Fourier series on T, there
is an analogous phenomenon for the Fourier transform that we will discuss.
For the formulation and proof of Gibbs’s phenomenon on the torus, we refer
to [DM72].
To illustrate this, let H = χ[0,∞) be the Heaviside function. Although
H does not belong to L1 (R), the important fact for this example is that H
has a jump discontinuity at x = 0. Pointwise convergence of Fourier series
corresponds to convolution with the Dirichlet kernel on the torus (see Sec-
tion 2.2.1, and equation (2.9) in particular). The Dirichlet kernel on the real
line is {dλ }λ>0 , which is obtained by dilating the Dirichlet function
\[ d(\xi) = \frac{\sin \xi}{\pi\xi}. \]
Since $d \notin L^1(\mathbb{R})$, the Dirichlet kernel does not form an approximate identity,
but even so let us consider the pointwise behavior of $H * d_\lambda$ as λ → ∞.
Using the fact that, as an improper Riemann integral, $\int_0^\infty \frac{\sin x}{x}\,dx = \frac{\pi}{2}$ (see
Problem 1.43), we have
\begin{align*}
(H * d_\lambda)(x) &= \int_0^\infty \frac{\sin \lambda(x - y)}{\pi(x - y)}\,dy \\
&= \int_{-\infty}^{\lambda x} \frac{\sin y}{\pi y}\,dy \\
&= \frac{1}{2} + \int_0^{\lambda x} \frac{\sin y}{\pi y}\,dy.
\end{align*}
Since $\frac{\sin y}{\pi y} > 0$ for 0 < y < π, $H * d_\lambda$ is increasing on (0, π/λ). Then it
decreases on (π/λ, 2π/λ), then increases on (2π/λ, 3π/λ), but never back to
the value it had at π/λ. Continuing in this way we see that $H * d_\lambda$ achieves
its maximum at x = π/λ, and this maximum is
\[ (H * d_\lambda)\Bigl(\frac{\pi}{\lambda}\Bigr) = \frac{1}{2} + \int_0^\pi \frac{\sin y}{\pi y}\,dy \approx 1.089\ldots. \]
Note that this maximum is independent of λ. Although (H ∗ dλ )(x) converges
pointwise to H(x) as λ → ∞ for all x 6= 0, this convergence is not uniform.
In particular, the maximum distance between (H ∗ dλ )(x) and H(x) for x > 0
is a constant (approximately 0.089. . . ) that is independent of λ, although its
location at x = π/λ decreases with λ. Figure 1.13 displays a plot of H ∗ dλ
for λ = 16π.
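The overshoot can be checked with a few lines of code. This is a minimal sketch that relies only on the formula $(H * d_\lambda)(x) = \frac12 + \int_0^{\lambda x} \frac{\sin y}{\pi y}\,dy$ derived above; the function names and the midpoint rule are our own choices.

```python
import math

def sinc_over_pi(y):
    """Integrand sin(y) / (pi * y), extended continuously by 1/pi at y = 0."""
    return 1.0 / math.pi if y == 0.0 else math.sin(y) / (math.pi * y)

def heaviside_conv_dirichlet(x, lam, n=20000):
    """(H * d_lam)(x) = 1/2 + integral from 0 to lam * x of sin(y) / (pi * y) dy."""
    upper = lam * x
    h = upper / n
    return 0.5 + sum(sinc_over_pi((j + 0.5) * h) for j in range(n)) * h

# The peak sits at x = pi / lam; its height should not depend on lam.
for lam in (4 * math.pi, 16 * math.pi, 64 * math.pi):
    print(lam, math.pi / lam, heaviside_conv_dirichlet(math.pi / lam, lam))
```

Each row reports the same peak height of roughly 1.089 while the peak location π/λ moves toward the origin, which is the non-uniform convergence described in the text.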
Thus, the smallest closed ideal that contains g is precisely the smallest
closed subspace that contains all translates of g.
Recall that a set $S \subseteq L^1(\mathbb{R})$ is complete in $L^1(\mathbb{R})$ if span(S) is dense in
$L^1(\mathbb{R})$. It therefore follows from Exercise 1.83 that $\{T_a g\}_{a\in\mathbb{R}}$ is complete in
$L^1(\mathbb{R})$ if and only if $I(g) = L^1(\mathbb{R})$. But when does this happen? The next
exercise gives a necessary condition.
Exercise 1.84. Given g ∈ L1 (R), show that
The converse of Exercise 1.84 is also true, but is a much deeper fact that
is part of Wiener’s Tauberian Theorem, which we discuss in Section 2.10. In
contrast, the analogous question in L2 (R) is much simpler, see Problem 3.9.
Additional Problems
1.35. Show that if $\{k_\lambda\}_{\lambda>0}$ is an approximate identity then $\hat k_\lambda(\xi) \to 1$ point-
wise as λ → ∞.
1.36. Let {kλ }λ>0 be an approximate identity and fix f ∈ L1 (R) ∩ L∞ (R).
Show that if f is continuous at some point x ∈ R, then (f ∗ kλ )(x) → f (x) as
λ → ∞.
1.39. Fix g ∈ L1 (R), and consider the ideals g ∗ L1 (R) and I(g) introduced
in Exercise 1.48. Show that we need not have g ∈ g ∗ L1 (R), but we always
have g ∈ I(g).
1.40. Let $k \in L^1(\mathbb{R})$ be any function such that $\int k = 1$ and $xk(x) \in L^1(\mathbb{R})$,
and define an approximate identity by setting $k_\lambda(x) = \lambda k(\lambda x)$. Fix 1 ≤ p < ∞.
(a) Show that $\|P k_\lambda\|_1 \to 0$ as λ → ∞, where P is the position operator.
(b) Show that if $f \in L^p(\mathbb{R})$ then $f * P k_\lambda \to 0$ in $L^p$-norm. Further, if we also
have $xf(x) \in L^p(\mathbb{R})$, then $P(f * k_\lambda) \to P f$ in $L^p$-norm.
1.41. Given 0 < α < 1, let C α (R) be the space of Hölder continuous functions
given in Definition 1.75. Show that
\[ \|f\|_{C^\alpha} = |f(0)| + \sup_{x \ne y} \frac{|f(x) - f(y)|}{|x - y|^\alpha} \]
1.44. This problem will illustrate one of the many different possible gener-
alizations of the results of this section by considering particular weighted Lp
spaces. Given s ∈ R let vs (x) = (1+|x|)s ; we refer to vs as a polynomial weight
(since it has polynomial-like growth if s ≥ 0, or decays like the reciprocal of
a polynomial if s ≤ 0). For this problem fix 1 ≤ p < ∞ and s ∈ R. Then we
define $L^p_s(\mathbb{R})$ to be the space of all measurable functions f : R → C such that
\[ \|f\|_{p,s} = \|f v_s\|_p = \Bigl( \int |f(x)|^p\,(1 + |x|)^{ps}\,dx \Bigr)^{1/p} < \infty. \]
If we identify functions that are equal a.e. then $L^p_s(\mathbb{R})$ is a Banach space.
Prove the following statements.
(a) If s ≥ 0 then $v_s$ is submultiplicative, i.e., $v_s(x + y) \le v_s(x)\,v_s(y)$ for
x, y ∈ R. If s ≤ 0 then $v_s$ is $v_{-s}$-moderate, i.e., $v_s(x + y) \le v_s(x)\,v_{-s}(y)$ for
x, y ∈ R.
(b) For each a ∈ R, the translation operator $T_a$ is a continuous map of
$L^p_s(\mathbb{R})$ into itself, with operator norm $\|T_a\|_{L^p_s \to L^p_s} \le v_{|s|}(a) = (1 + |a|)^{|s|}$.
(c) Translation is strongly continuous on $L^p_s(\mathbb{R})$, i.e., for each $f \in L^p_s(\mathbb{R})$
we have $\lim_{a\to 0} \|f - T_a f\|_{p,s} = 0$.
(d) If $\{k_\lambda\}_{\lambda>0}$ is an approximate identity, then for each $f \in L^p_s(\mathbb{R})$ we
have $\lim_{\lambda\to\infty} \|f - f * k_\lambda\|_{p,s} = 0$.
(e) $C_c^\infty(\mathbb{R})$ is dense in $L^p_s(\mathbb{R})$.
(f) Do parts (b)–(d) still hold if we replace $v_s$ by an arbitrary weight
w : R → (0, ∞)? What properties do we need w to possess?
Definition 1.85 (Fejér Kernel). The Fejér function is the square of the
sinc function $d_\pi$:
\[ w(x) = \Bigl( \frac{\sin \pi x}{\pi x} \Bigr)^2 = d_\pi(x)^2. \]
The Fejér kernel is $\{w_\lambda\}_{\lambda>0}$ where $w_\lambda(x) = \lambda w(\lambda x)$. ♦
1.6 The Inversion Formula 61
Fig. 1.14. Graph of the Fejér function w.
The letter “w” is for “Weiss,” which was Fejér’s surname at birth.
In order to conclude that the Fejér kernel is an approximate identity, we
need to know that the integral of w is 1.
Exercise 1.86. Show that $w \in L^1(\mathbb{R})$ and $\int w = 1$. ♦
Let $\chi = \chi_{[-1/2,1/2]}$. Then we know from Exercise 1.7 that $\hat\chi(\xi) = \frac{\sin \pi\xi}{\pi\xi} =
d_\pi(\xi)$. Hence if we let
\[ W = \chi * \chi, \]
then we have
\[ \hat W = (\hat\chi)^2 = d_\pi^2 = w. \tag{1.27} \]
Also $\check W = w$ since W is even. Since both W and $\hat W = w$ belong to $L^1(\mathbb{R})$, if
we had already proved that the Inversion Formula holds, then we could apply
it to W to conclude that w’s Fourier transform is W, i.e.,
\[ \hat w = (\check W)^{\wedge} = W \qquad\text{and}\qquad \check w = (\hat W)^{\vee} = W. \tag{1.28} \]
We will see eventually that this is true, but first we must prove the Inversion
Formula. And we will prove the Inversion Formula by making use of the Fejér
kernel, although we will not need equation (1.28) to do this. Obviously, this
is a good thing, since otherwise the argument would be circular.
It is not so much the Fejér kernel itself that will be important to us, but
rather some of the properties that it happens to have, including:
(a) w, W ≥ 0,
(b) w(0) = 1,
(c) $W, \hat W \in L^1(\mathbb{R})$,
(d) $\int W = \hat W(0) = 1$.
Even these are not all essential, and the reader can consider what other kernels
we might use instead to prove the Inversion Formula.
Exercise 1.87. Prove that W is the hat function or tent function on the
interval [−1, 1] defined by
Consequently,
\[ W(\xi/\lambda) = \max\Bigl\{ 1 - \frac{|\xi|}{\lambda},\ 0 \Bigr\} \tag{1.29} \]
is the dilated hat function with height 1 supported on [−λ, λ]. In particular,
W(ξ/λ) → 1 pointwise as λ → ∞.
To motivate our first step towards the proof of the Inversion Formula, choose
any $f \in L^1(\mathbb{R})$. Then $f * w_\lambda \in L^1(\mathbb{R})$ and also $(f * w_\lambda)^{\wedge} = \hat f\,\hat w_\lambda$. We don’t
know this yet, but we will show that $\hat w_\lambda(\xi) = W(\xi/\lambda) \in L^1(\mathbb{R})$. Once we
establish this, it follows (since $\hat f$ is bounded) that
\[ (f * w_\lambda)^{\wedge} = \hat f\,\hat w_\lambda \in L^1(\mathbb{R}). \]
Hence, if only we already knew that the Inversion Formula was valid, we could
compute that
\[ (f * w_\lambda)(x) = \bigl( (f * w_\lambda)^{\wedge} \bigr)^{\vee}(x) = (\hat f\,\hat w_\lambda)^{\vee}(x) = \int \hat f(\xi)\,W(\xi/\lambda)\,e^{2\pi i\xi x}\,d\xi. \]
Unfortunately, these calculations are not yet justified since, among other
things, they rely on the Inversion Formula, which has not yet been proved.
However, instead of trying to justify the full Inversion Formula right now, we
begin with a much smaller step: We show directly that the specific equality
$(f * w_\lambda)(x) = \int \hat f(\xi)\,W(\xi/\lambda)\,e^{2\pi i\xi x}\,d\xi$ holds when $f \in L^1(\mathbb{R})$.
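Suppose that $f \in L^1(\mathbb{R})$ is given. Assuming for the moment that the use of
Fubini’s Theorem in the following calculation is justified, we have that
\begin{align*}
(f * w_\lambda)(x) &= \int f(y)\,w_\lambda(x - y)\,dy \\
&= \int f(y) \int W(\xi/\lambda)\,e^{2\pi i\xi(x - y)}\,d\xi\,dy \\
&= \int f(y) \int_{-\lambda}^{\lambda} \Bigl( 1 - \frac{|\xi|}{\lambda} \Bigr) e^{2\pi i\xi(x - y)}\,d\xi\,dy \\
&= \int_{-\lambda}^{\lambda} \Bigl( \int f(y)\,e^{2\pi i\xi(x - y)}\,dy \Bigr) \Bigl( 1 - \frac{|\xi|}{\lambda} \Bigr)\,d\xi \\
&= \int_{-\lambda}^{\lambda} \hat f(\xi)\,\Bigl( 1 - \frac{|\xi|}{\lambda} \Bigr) e^{2\pi i\xi x}\,d\xi.
\end{align*}
Of course, the applicability of Fubini’s Theorem does have to be justified, and
we assign this task to the reader as an exercise. ⊓⊔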
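The second line of this calculation uses the identity $w_\lambda(x) = \int_{-\lambda}^{\lambda} (1 - |\xi|/\lambda)\,e^{2\pi i\xi x}\,d\xi$, which expresses the dilated Fejér function as an integral of the dilated hat function. As a hedged sanity check (the helper names and the midpoint rule are ours, not part of the text), the two sides can be compared numerically:

```python
import cmath
import math

def fejer_dilated(x, lam):
    """w_lam(x) = lam * w(lam * x) with w(x) = (sin(pi x) / (pi x))^2 and w(0) = 1."""
    t = lam * x
    if t == 0.0:
        return lam
    return lam * (math.sin(math.pi * t) / (math.pi * t)) ** 2

def hat_inverse_transform(x, lam, n=20000):
    """Midpoint rule for the integral of (1 - |xi|/lam) e^{2 pi i xi x} over [-lam, lam]."""
    h = 2 * lam / n
    total = 0j
    for j in range(n):
        xi = -lam + (j + 0.5) * h
        total += (1 - abs(xi) / lam) * cmath.exp(2j * math.pi * xi * x)
    return total * h

lam = 3.0
for x in (0.0, 0.1, 0.37, 1.2):
    print(x, fejer_dilated(x, lam), hat_inverse_transform(x, lam).real)
```

The two columns agree to several digits, and the imaginary part of the integral vanishes because the hat function is even.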
Now we obtain the Inversion Formula by taking the limit in equation
(1.30).
for every x.
Further, since 0 ≤ W ≤ 1,
One consequence of the Inversion Formula is that we can now justify our
hope, presented in equation (1.28), that $\hat w = W$. Note that since w and W are
even, their Fourier and inverse Fourier transforms coincide.
Corollary 1.91. $\hat w = (\check W)^{\wedge} = W = (\hat W)^{\vee} = \check w$.
Proof. This follows from the Inversion Formula, the fact that w and W both
belong to $L^1(\mathbb{R})$, and our proof in equation (1.27) that $\hat W = w$. ⊓⊔
\[ f = g \text{ a.e.} \iff \hat f = \hat g \text{ a.e.} \]
In particular,
\[ f = 0 \text{ a.e.} \iff \hat f = 0 \text{ a.e.} \]
Proof. Since the Fourier transform is linear, the first equivalence is a con-
sequence of the second. If f = 0 a.e., then $\hat f = 0$ everywhere by defini-
tion of the Fourier transform. On the other hand, if $\hat f = 0$ a.e., then we
have both $f, \hat f \in L^1(\mathbb{R})$, so the Inversion Formula applies, and we obtain
$f = (\hat f)^{\vee} = 0^{\vee} = 0$. ⊓⊔
Note that we could also appeal to Theorem 1.88 to give another proof of
the Uniqueness Theorem.
1.6.3 Summability
If $\hat f \notin L^1(\mathbb{R})$ then the Inversion Formula need not hold. However, as an
immediate consequence of Theorem 1.88, we have the following approximation
formula for f in terms of $\hat f$ that is valid for all $f \in L^1(\mathbb{R})$, even if $\hat f$ is not
integrable.
Theorem 1.93. If $f \in L^1(\mathbb{R})$, then
\[ \int_{-\lambda}^{\lambda} \hat f(\xi)\,\Bigl( 1 - \frac{|\xi|}{\lambda} \Bigr) e^{2\pi i\xi x}\,d\xi \to f \quad \text{in } L^1\text{-norm as } \lambda \to \infty. \tag{1.32} \]
Proof. Simply note that, by Theorem 1.88, the left-hand side of equation
(1.32) is $f * w_\lambda$. ⊓⊔
Thus, even if $\hat f \notin L^1(\mathbb{R})$, we have an “approximate” interpretation of the
formula $f = (\hat f)^{\vee}$ in the sense that equation (1.32) will hold. This is analogous
to using summability conditions for evaluating divergent series. For example,
consider a formal bi-infinite series of the form $\sum_{k=-\infty}^{\infty} a_k$. Let us say that
this series converges and equals L if the symmetric partial sums
\[ s_N = \sum_{k=-N}^{N} a_k \]
Exercise 1.94. Given a sequence of scalars $a = (a_k)_{k\in\mathbb{Z}}$, let the partial sums
$s_N$ and Cesàro means $\sigma_N$ be as above.
(a) Show that
\[ \sigma_N = \sum_{k=-N}^{N} \Bigl( 1 - \frac{|k|}{N+1} \Bigr) a_k. \]
(b) Show that if the partial sums $s_N$ converge, then the Cesàro means $\sigma_N$
converge to the same limit, i.e.,
\[ \lim_{N\to\infty} \sum_{k=-N}^{N} \Bigl( 1 - \frac{|k|}{N+1} \Bigr) a_k = \lim_{N\to\infty} s_N = \sum_{k=-\infty}^{\infty} a_k. \tag{1.33} \]
(c) Set $a_k = (-1)^k$ for k ≥ 0 and $a_k = 0$ for k < 0. Show that the series $\sum a_k$
is Cesàro summable even though the partial sums do not converge, and
find the limit of the Cesàro means.
(d) Show that if $a_k \ge 0$ for every k then $\sigma_N$ converges if and only if $s_N$
converges. ♦
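Parts (a) and (c) lend themselves to a quick numerical experiment. The sketch below assumes the usual definition of the Cesàro means as averages of the symmetric partial sums, $\sigma_N = (s_0 + \cdots + s_N)/(N+1)$; that definition and all function names are assumptions on our part. It compares the averaged form with the triangular-weight form from part (a) for the sequence in part (c).

```python
def partial_sums(a, N):
    """Symmetric partial sums s_0, ..., s_N of a bi-infinite sequence given as a function a(k)."""
    s = a(0)
    sums = [s]
    for n in range(1, N + 1):
        s += a(n) + a(-n)
        sums.append(s)
    return sums

def cesaro_mean(a, N):
    """sigma_N computed as the average of s_0, ..., s_N (assumed definition)."""
    return sum(partial_sums(a, N)) / (N + 1)

def cesaro_weighted(a, N):
    """sigma_N computed with the triangular weights 1 - |k|/(N+1) from part (a)."""
    return sum((1 - abs(k) / (N + 1)) * a(k) for k in range(-N, N + 1))

def a(k):
    # Sequence from part (c): a_k = (-1)^k for k >= 0 and a_k = 0 for k < 0.
    return (-1) ** k if k >= 0 else 0

print(partial_sums(a, 7))                      # the partial sums keep oscillating
for N in (10, 100, 1000):
    print(N, cesaro_mean(a, N), cesaro_weighted(a, N))
```

The partial sums alternate between two values and never settle, while both forms of the Cesàro mean agree with each other and stabilize as N grows, which is the behavior part (c) asks the reader to prove.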
Comparing Theorem 1.93 to equation (1.33), we see that Theorem 1.93 is
a continuous version of Cesàro summability. Essentially, Theorem 1.93 says
that if $f \in L^1(\mathbb{R})$, then the formal integral
\[ (\hat f)^{\vee}(x) = \int_{-\infty}^{\infty} \hat f(\xi)\,e^{2\pi i\xi x}\,d\xi \]
Exercise 1.96. (a) Show that the space of bandlimited functions in L1 (R),
is dense in L1 (R).
(b) Show that if Ω is fixed, then the space of functions in L1 (R) bandlim-
ited to [−Ω, Ω],
{f ∈ L1 (R) : supp(fb) ⊆ [−Ω, Ω]},
is a closed proper subspace of L1 (R). ♦
As we have seen, the Cesàro means $\sigma_N$ of a formal infinite series may converge
even when the partial sums $s_N$ do not. For the Fourier transform, the analogues
of the Cesàro means are convolutions with the Fejér kernel:
\[ (f * w_\lambda)(x) = \int_{-\lambda}^{\lambda} \hat f(\xi)\,\Bigl( 1 - \frac{|\xi|}{\lambda} \Bigr) e^{2\pi i\xi x}\,d\xi. \]
lemma. The proof of this lemma uses the Second Mean Value Theorem for
integrals. Problem 1.55 gives a short proof of a special case of the Second
Mean Value Theorem, and for the proof of the general case we refer to [Fol99,
Lem. 8.4.1].
Lemma 1.97. If g ∈ BV[0, δ], then
\[ \lim_{\lambda\to\infty} \int_0^\delta g(x)\,d_{2\pi\lambda}(x)\,dx = \frac{g(0+)}{2}. \]
Proof. The Jordan Decomposition Theorem (Problem 1.30) tells us that any
function g ∈ BV[0, δ] can be written as $g = g_1 - g_2$ where $g_1, g_2$ are increas-
ing on [0, δ]. Therefore it suffices to assume that g is increasing. Further, by
replacing g(x) with g(x) − g(0+), we may also assume that g(0+) = 0.
By Problem 1.43,
\[ C = \sup_{a<b} \Bigl| \int_a^b \frac{\sin x}{x}\,dx \Bigr| < \infty. \]
Fix ε > 0. Then there exists an η > 0 such that |g(x)| < ε for all 0 < x < η.
Since g is increasing and $d_{2\pi\lambda}$ is continuous, the Second Mean Value Theorem
for Integrals tells us that there exists some point c ∈ [0, η] such that
\[ \int_0^\eta g(x)\,d_{2\pi\lambda}(x)\,dx = g(0+) \int_0^c d_{2\pi\lambda}(x)\,dx + g(\eta-) \int_c^\eta d_{2\pi\lambda}(x)\,dx. \]
\[ = \frac{f(x+) - f(x-)}{2}. \]
⊓⊔
There are many variations on the theme of duality between smoothness and
decay under the Fourier transform. Some of these were presented in Sec-
tion 1.4. As an application of the Inversion Formula, we now prove another
version, relating decay in frequency to smoothness in time.
Recall that $A(\mathbb{R}) = \{ \hat f : f \in L^1(\mathbb{R}) \}$. Hence if $f \in L^1(\mathbb{R})$ then $\hat f \in A(\mathbb{R})$ by
definition, and to say that $2\pi i\xi\,\hat f(\xi)$ belongs to $A(\mathbb{R})$ means that there exists
a function $g \in L^1(\mathbb{R})$ whose Fourier transform is $\hat g(\xi) = 2\pi i\xi\,\hat f(\xi)$. Note that
this is more than just a decay requirement on $\hat f$, for not only must we have
$2\pi i\xi\,\hat f(\xi) \in C_0(\mathbb{R})$, but we must have that $2\pi i\xi\,\hat f(\xi)$ belongs to the smaller
space $A(\mathbb{R})$.
Proof. First consider the case where we have the additional hypothesis that
$\hat g \in L^1(\mathbb{R})$. Since $2\pi i\xi\,\hat f(\xi) = \hat g(\xi) \in L^1(\mathbb{R})$ and $\hat f$ is continuous, it follows that
$\hat f \in L^1(\mathbb{R})$. Therefore we can apply the inverse Fourier transform analogue
of Theorem 1.55 to the function $\hat f$ and conclude that $(\hat f)^{\vee} \in C_0^1(\mathbb{R})$. Since
the Inversion Formula applies to f, we know that $f = (\hat f)^{\vee}$ is differentiable
everywhere and $f' \in C_0(\mathbb{R})$. Furthermore,
\[ f' = \bigl( (\hat f)^{\vee} \bigr)' = \bigl( 2\pi i\xi\,\hat f(\xi) \bigr)^{\vee} = (\hat g)^{\vee} = g, \]
Additional Problems
1.46. Suppose that $f \in L^1(\mathbb{R})$. Show that if $\hat f$ is even then f is even, and if $\hat f$
is odd then f is odd. Compare Problem 1.2.
1.50. Show that if g ∈ L1 (R), g 6= 0, then {Ta g}a∈R is finitely linearly inde-
pendent, i.e., every finite subset is linearly independent.
1.51. Prove the following variation on the theme “decay in frequency implies
smoothness in time”: If $f \in L^1(\mathbb{R})$ and there exists C > 0 and 0 < α < 1 such
that
\[ \forall\, \xi \in \mathbb{R}, \qquad |\hat f(\xi)| \le \frac{C}{|\xi|^{1+\alpha}}, \]
then f is Hölder continuous with exponent α (see Definition 1.75).
1.52. Show that the Fourier transform maps L1 (R) ∩ A(R) bijectively onto
itself, and that L1 (R) ∩ A(R) ⊆ C0 (R).
Bn = Bn−1 ∗ χ[0,1] .
(d) Prove that there exist scalars $c_{kn}$ such that $B_n$ satisfies the refinement
equation
\[ B_n(x) = \sum_{k=0}^{n+1} c_{kn}\,B_n(2x - k). \]
(e) Prove that there exists a 1-periodic function $m_0$ (in fact, a trigonometric
polynomial) such that $\hat B_n(\xi) = m_0(\xi/2)\,\hat B_n(\xi/2)$.
1.55. This problem will prove a special case of the Second Mean Value Theo-
rem for Integrals. Assume that h is both continuous and nonnegative on [a, b]
and g is monotone increasing on [a, b] with g(a+) ≥ 0. Define
\[ G(x) = g(a+) \int_a^x h(t)\,dt + g(b-) \int_x^b h(t)\,dt, \]
and show that $G(b) \le \int_a^b g(t)\,h(t)\,dt \le G(a)$. Apply the Intermediate Value
Theorem to show there exists a point c ∈ [a, b] such that
\[ \int_a^b g(t)\,h(t)\,dt = G(c) = g(a+) \int_a^c h(t)\,dt + g(b-) \int_c^b h(t)\,dt. \]
Exercise 1.100. Show that if $f \in C_c^2(\mathbb{R})$ then $\hat f \in L^1(\mathbb{R})$. Conclude that
$C_c^2(\mathbb{R}) \subseteq A(\mathbb{R})$, and that $A(\mathbb{R})$ is dense in $C_0(\mathbb{R})$. ♦
The next exercise (taken from [Fol99]) shows that A(R) is a proper subset
of C0 (R), although the argument is implicit in the sense that it does not
construct a specific example of a function in C0 (R)\A(R). The main point is
that if the Fourier transform F was a bounded map of L1 (R) onto C0 (R),
then, since both of these are Banach spaces, the Inverse Mapping Theorem
(Theorem C.14) would imply that F had a bounded inverse. However, the
exercise shows that F −1 is not a bounded map of A(R) (under the L∞ -norm)
back to L1 (R).
Additional Problems
1.56. Define Ac (R) = {F ∈ A(R) : supp(F ) is compact}. Show that Ac (R) is
a dense subspace of A(R) in the norm of A(R). Compare Problem 1.23, which
shows that Ac (R) is an ideal in A(R).
1.57. Let X and Y be Banach spaces. Show that T ∈ B(X, Y ) is surjective if
and only if range(T ) is not meager in Y.
To illustrate its use, we hint that an “easy” solution to the following ex-
ercise can be obtained by using the de la Vallée–Poussin kernel.
Exercise 1.105. Prove that if f ∈ L1 (R), then
Φ′ (ξ) = −2πξΦ(ξ).
(b) Solve the differential equation in part (a), and show that
(c) Show that Φ(0) = 1. Conclude that Φ = φ, and therefore the Gauss kernel
is an approximate identity. ♦
1.8 Some Special Kernels 77
Proof. Choose f ∈ C[a, b], and fix R large enough that [a, b] ⊆ (−R, R). Let g
be any continuous function on R supported in [−R, R] that equals f on [a, b].
Let {φλ }λ>0 be the Gauss kernel. By Exercise 1.73, g ∗ φλ converges uniformly
to g, so we can choose a λ such that
\[ \|g - g * \phi_\lambda\|_\infty < \frac{\varepsilon}{2}. \tag{1.37} \]
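Since the Taylor series for $e^x$ converges uniformly on compact sets, the series
\[ \phi_\lambda(x) = \lambda e^{-\pi\lambda^2 x^2} = \lambda \sum_{n=0}^{\infty} \frac{(-\pi\lambda^2 x^2)^n}{n!} = \sum_{n=0}^{\infty} \frac{(-1)^n \pi^n \lambda^{2n+1}}{n!}\,x^{2n} \]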
Since g and f are equal on [a, b], by combining equations (1.37) and (1.38),
we see that
\[ \sup_{x\in[a,b]} |f(x) - (g * q)(x)| < \varepsilon. \]
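Finally, if we write out g ∗ q and expand using the binomial theorem, we see
that g ∗ q is actually a polynomial of degree at most 2N:
\begin{align*}
(g * q)(x) &= \sum_{n=0}^{N} \frac{(-1)^n \pi^n \lambda^{2n+1}}{n!} \int_{-R}^{R} g(y)\,(x - y)^{2n}\,dy \\
&= \sum_{n=0}^{N} \frac{(-1)^n \pi^n \lambda^{2n+1}}{n!} \sum_{j=0}^{2n} \binom{2n}{j} \Bigl( \int_{-R}^{R} g(y)\,(-y)^{2n-j}\,dy \Bigr) x^j.
\end{align*}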
Additional Problems
1.58. There are many functions that equal their own Fourier transforms. Show
that if $f, \hat f \in L^1(\mathbb{R})$, then $f^{\wedge\wedge\wedge\wedge} = f$, and consequently $g = f + f^{\wedge} + f^{\wedge\wedge} + f^{\wedge\wedge\wedge}$
satisfies $\hat g = g$.
Fig. 1.21. Graph of the real part of the chirp $e^{2\pi i 2x^2}$.
1.59. Let $\phi(x) = e^{-\pi x^2}$ be the Gaussian function.
(a) For r > 0, set $\varphi_r(x) = e^{-\pi r x^2}$. Show that $\hat\varphi_r = r^{-1/2} \varphi_{1/r}$.
(b) Now we extend the definition of $\varphi_r$ to complex parameters. Let c =
a + ib ∈ C be complex, and set
\[ \varphi_c(x) = e^{-\pi c x^2} = e^{-\pi i b x^2}\,e^{-\pi a x^2}. \]
In engineering jargon, $\varphi_c$ is a Gaussian multiplied by a chirp $e^{-\pi i b x^2}$ (and the
resulting function is often also called a Gaussian function). Show that part (a)
extends to complex parameters with positive real part, i.e., if c = a + ib and
a > 0, then $\hat\varphi_c = c^{-1/2} \varphi_{1/c}$. This fact will be useful to us in Section 3.3.
Note: We take a complex square root that extends the square root of
the positive real numbers. For example, since c = a + ib with a > 0, we
1.9 The Schwartz Space 79
can write $c = re^{i\theta}$ with −π/2 < θ < π/2. We take $c^{1/2} = r^{1/2} e^{i\theta/2}$ and
$c^{-1/2} = r^{-1/2} e^{-i\theta/2}$.
Remark: If we regard $e^{2\pi i\xi x}$ as a “pure tone” of constant frequency, then
we can think of $e^{2\pi i\xi x^2}$ as a function whose frequency increases with time, see
the illustration in Figure 1.21. If played through a speaker, such a function
sounds something like a bird’s chirp.
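The identity in Problem 1.59 can be tested numerically before it is proved. The following sketch makes several assumptions of its own: it approximates the Fourier integral by a midpoint rule on a truncated interval, it relies on Python's principal branch for `c ** -0.5` (which agrees with the branch described in the Note when the real part of c is positive), and the names `phi`, `fourier_numeric`, and `predicted` are hypothetical.

```python
import cmath
import math

def phi(x, c):
    """phi_c(x) = exp(-pi c x^2) for a complex parameter c with positive real part."""
    return cmath.exp(-math.pi * c * x * x)

def fourier_numeric(c, xi, half_width=12.0, n=24000):
    """Midpoint rule for the integral of phi_c(x) e^{-2 pi i xi x} over [-half_width, half_width]."""
    h = 2 * half_width / n
    total = 0j
    for j in range(n):
        x = -half_width + (j + 0.5) * h
        total += phi(x, c) * cmath.exp(-2j * math.pi * xi * x)
    return total * h

def predicted(c, xi):
    # Claimed transform c^{-1/2} phi_{1/c}(xi), using the principal branch of the power.
    return c ** -0.5 * phi(xi, 1 / c)

for c, xi in ((2 + 0j, 0.7), (1 + 3j, 0.4), (0.5 - 2j, 1.1)):
    print(c, xi, fourier_numeric(c, xi), predicted(c, xi))
```

The truncation to [−12, 12] is harmless here only because the chosen parameters have real part at least 1/2; a parameter with very small real part would need a wider interval.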
1.60. Suppose that f ∈ L1 (R)∩L∞ (R) has a jump discontinuity at the origin,
i.e.,
\[ f(0+) = \lim_{t\to 0^+} f(t), \qquad f(0-) = \lim_{t\to 0^-} f(t) \]
both exist, but $f(0-) \ne f(0+)$. Let $\{\phi_\lambda\}_{\lambda>0}$ be the Gauss kernel. Prove that
\[ \lim_{\lambda\to\infty} (f * \phi_\lambda)(0) = \frac{f(0+) + f(0-)}{2}. \]
Can other kernels be used to obtain the same result?
1.61. Let $f \in L^1(\mathbb{R})$ be given. Show that if there exist δ, R > 0 such that f
is bounded on [−δ, δ] and $\hat f(\xi) \ge 0$ for |ξ| > R, then $\hat f \in L^1(\mathbb{R})$.
1.62. Suppose that $f \in L^1(\mathbf{R}) \cap L^\infty(\mathbf{R})$ is real and even
(so $\hat{f}$ is real and even as well). Show that if $f$ is not continuous, then
$\hat{f}$ must change sign infinitely often. Thus, a “local” (jump) discontinuity
of $f$ forces a global “reaction” in $\hat{f}$.
On the other hand, show that
$g(x) = i e^{-|x|} \bigl( \chi_{[0,\infty)}(x) - \chi_{(-\infty,0)}(x) \bigr)$ has a single
jump discontinuity and that $\hat{g}$ is real and changes sign only once. Still, the
fact that $g$ has a discontinuity implies that $\hat{g}$ decays slowly.
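For the second half of this problem, a numerical experiment can suggest what to expect. The sketch below assumes the convention $\hat{g}(\xi) = \int g(x)\, e^{-2\pi i \xi x}\, dx$ and compares a quadrature of the Fourier integral with the candidate closed form $4\pi\xi / (1 + 4\pi^2\xi^2)$. That closed form is an assumption obtained by a direct computation, not a formula stated in the text, and the helper names are hypothetical.

```python
import cmath
import math

def integrate(func, a, b, steps=40000):
    """Midpoint rule; works for complex-valued integrands."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))

def g(x):
    """g(x) = i e^{-|x|} (chi_[0,inf)(x) - chi_(-inf,0)(x))."""
    sign = 1.0 if x >= 0 else -1.0
    return 1j * sign * math.exp(-abs(x))

def g_hat(xi, cutoff=40.0):
    """Quadrature of the Fourier integral, split at the jump x = 0."""
    kernel = lambda x: g(x) * cmath.exp(-2j * math.pi * xi * x)
    return integrate(kernel, -cutoff, 0.0) + integrate(kernel, 0.0, cutoff)

def closed_form(xi):
    """Candidate closed form (an assumption to be checked, not from the text)."""
    return 4 * math.pi * xi / (1 + 4 * math.pi ** 2 * xi ** 2)

for xi in (-2.0, -0.3, 0.3, 2.0):
    print(xi, g_hat(xi), closed_form(xi))
```

If the candidate is right, the imaginary parts are negligible, the sign changes only at $\xi = 0$, and the decay is only of order $1/|\xi|$, which is the slow decay the problem refers to.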
Thus, “rapid decay” means decay faster than the reciprocal of any polynomial:
For each $m, n \ge 0$ there exists a constant $C_{mn}$ such that
\[
|f^{(n)}(x)| \le \frac{C_{mn}}{|x|^m}, \qquad x \in \mathbf{R}.
\]
Since $C_c^\infty(\mathbf{R}) \subseteq \mathcal{S}(\mathbf{R})$, we know that
$\mathcal{S}(\mathbf{R})$ is dense in $L^p(\mathbf{R})$ for every $1 \le p < \infty$,
and also is dense in $C_0(\mathbf{R})$ in $L^\infty$-norm. The Gaussian function
$e^{-x^2}$ is an example of a function in $\mathcal{S}(\mathbf{R})$ that is not
compactly supported. On the other hand, while the two-sided exponential $e^{-|x|}$
has rapid decay, it is not a Schwartz-class function since it is not differentiable
at the origin.
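The contrast between these two examples can be seen numerically. The following sketch, with hypothetical helper names and grid parameters, estimates the one-sided slopes at the origin and a grid approximation of $\sup_x |x|^m |f(x)|$ for both functions.

```python
import math

def gaussian(x):
    """The Gaussian e^{-x^2}."""
    return math.exp(-x * x)

def two_sided_exp(x):
    """The two-sided exponential e^{-|x|}."""
    return math.exp(-abs(x))

def one_sided_slopes(func, h=1e-6):
    """Left and right difference quotients of func at the origin."""
    left = (func(0.0) - func(-h)) / h
    right = (func(h) - func(0.0)) / h
    return left, right

def weighted_sup(func, m, cutoff=60.0, samples=60000):
    """Grid estimate of the supremum of |x|^m |func(x)| over [-cutoff, cutoff]."""
    step = 2 * cutoff / samples
    return max(
        abs(-cutoff + k * step) ** m * abs(func(-cutoff + k * step))
        for k in range(samples + 1)
    )

print(one_sided_slopes(gaussian))       # both slopes close to 0
print(one_sided_slopes(two_sided_exp))  # slopes close to +1 and -1
print(weighted_sup(gaussian, 3), weighted_sup(two_sided_exp, 3))
```

Both weighted suprema are finite, which reflects rapid decay of the functions themselves, but the mismatched one-sided slopes of $e^{-|x|}$ at the origin show the failure of differentiability that excludes it from $\mathcal{S}(\mathbf{R})$.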
The Schwartz space is a topological vector space but not a normed vector
space. Instead of being determined by a single norm, the topology on S(R) is
determined by the infinite collection of seminorms
\[
\forall\, m, n \ge 0, \qquad \lim_{k \to \infty} \rho_{mn}(f - f_k) = 0,
\]
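Proof. Suppose that $f \in \mathcal{S}(\mathbf{R})$. Then, by the product rule, we have
for any $m, n \ge 0$ that
\[
D^n\bigl( (-2\pi i x)^m f(x) \bigr)
= \sum_{j=0}^{n} \binom{n}{j}\, D^j (-2\pi i x)^m \, f^{(n-j)}(x) \in L^1(\mathbf{R}).
\]
Hence,
\[
(2\pi i \xi)^n D^m \hat{f}(\xi)
= \Bigl( D^n\bigl( (-2\pi i x)^m f(x) \bigr) \Bigr)^{\wedge}(\xi) \in L^\infty(\mathbf{R}).
\]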
Since this is true for every $m$ and $n$, we conclude that
$\hat{f} \in \mathcal{S}(\mathbf{R})$. Thus the Fourier transform maps
$\mathcal{S}(\mathbf{R})$ into itself, and we know that it is injective by
the Uniqueness Theorem.
On the other hand, we also have $\check{f} \in \mathcal{S}(\mathbf{R})$, and therefore
$(\check{f})^{\wedge} \in \mathcal{S}(\mathbf{R})$. Hence $f = (\check{f})^{\wedge}$ by
the Inversion Formula, so the Fourier transform is surjective. □
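The key identity in the proof can be checked numerically in a simple case. The sketch below takes $m = n = 1$ and the Gaussian $f(x) = e^{-\pi x^2}$. It assumes the convention $\hat{f}(\xi) = \int f(x)\, e^{-2\pi i \xi x}\, dx$ and the known fact that this Gaussian is its own Fourier transform (the case $r = 1$ of Problem 1.59(a)); the helper names and the truncation interval are hypothetical.

```python
import cmath
import math

def integrate(func, a, b, steps=8000):
    """Midpoint rule; works for complex-valued integrands."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))

def f(x):
    """The Gaussian f(x) = exp(-pi x^2)."""
    return math.exp(-math.pi * x * x)

def f_prime(x):
    """Derivative of f."""
    return -2 * math.pi * x * f(x)

def product_rule_term(x):
    """D[(-2 pi i x) f(x)], expanded by the product rule with m = n = 1."""
    return (-2j * math.pi) * f(x) + (-2j * math.pi * x) * f_prime(x)

def rhs(xi):
    """Fourier transform of D[(-2 pi i x) f(x)], computed by quadrature."""
    return integrate(
        lambda x: product_rule_term(x) * cmath.exp(-2j * math.pi * xi * x), -8.0, 8.0
    )

def lhs(xi):
    """(2 pi i xi) D f_hat(xi), using the assumption f_hat(xi) = exp(-pi xi^2)."""
    f_hat_prime = -2 * math.pi * xi * math.exp(-math.pi * xi * xi)
    return (2j * math.pi * xi) * f_hat_prime

for xi in (0.0, 0.4, 1.1):
    print(xi, abs(lhs(xi) - rhs(xi)))
```

The differences should be at the level of rounding error, illustrating how multiplication by powers of $\xi$ and differentiation of $\hat{f}$ correspond to differentiation and multiplication by powers of $x$ on the function side.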
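As a final remark, we note that since $C_c^\infty(\mathbf{R}) \subseteq \mathcal{S}(\mathbf{R})$,
the Fourier transform maps $C_c^\infty(\mathbf{R})$ into $\mathcal{S}(\mathbf{R})$.
However, we will later see the Paley–Wiener Theorem (Theorem 3.49), which
implies that $f$ and $\hat{f}$ cannot both be compactly supported (unless $f = 0$).
Hence the Fourier transform does not map $C_c^\infty(\mathbf{R})$ into itself:
\[
\mathcal{F}\bigl( C_c^\infty(\mathbf{R}) \bigr) \subseteq \mathcal{S}(\mathbf{R})
\quad \text{but} \quad
\mathcal{F}\bigl( C_c^\infty(\mathbf{R}) \bigr) \cap C_c^\infty(\mathbf{R}) = \{0\}.
\]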
Additional Problems
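1.64. Show that if $f \in \mathcal{S}(\mathbf{R})$ and $g \in C_b^\infty(\mathbf{R})$, then
$fg \in \mathcal{S}(\mathbf{R})$. In particular, $\mathcal{S}(\mathbf{R})$ is closed under
pointwise products.
1.65. Show that if $f \in \mathcal{S}(\mathbf{R})$ and $x^m g(x) \in L^1(\mathbf{R})$ for
every $m \ge 0$, then $f * g \in \mathcal{S}(\mathbf{R})$. In particular,
$\mathcal{S}(\mathbf{R})$ is closed under convolution.
1.66. Show that $\{ f \in L^1(\mathbf{R}) : \hat{f} \in C_c^\infty(\mathbf{R}) \}$ is dense
in $L^p(\mathbf{R})$ for each $1 \le p < \infty$.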