Christopher Heil

Introduction to Harmonic
Analysis
November 12, 2010

Springer
Berlin Heidelberg New York
Hong Kong London
Milan Paris Tokyo
Contents

General Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi

1 The Fourier Transform on L1 (R) . . . . . . . . . . . . . . . . . . . . . . . . . . . 1


1.1 Definition and Basic Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.1 The Fourier Transform on L1 (R) . . . . . . . . . . . . . . . . . . . . 1
1.1.2 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2 Translation, Modulation, Dilation, and Involution . . . . . . . . . . . 10
1.2.1 Four Fundamental Operators . . . . . . . . . . . . . . . . . . . . . . . 11
1.2.2 The Riemann–Lebesgue Lemma . . . . . . . . . . . . . . . . . . . . . 13
1.2.3 Position and Momentum . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.2.4 The HRT Conjecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.3 Convolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.3.1 Some Notational Conventions . . . . . . . . . . . . . . . . . . . . . . . 17
1.3.2 Definition and Basic Properties of Convolution . . . . . . . . 18
1.3.3 Young’s Inequality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.3.4 Convolution as Filtering; Lack of an Identity . . . . . . . . . . 21
1.3.5 Convolution as Averaging; Introduction to
Approximate Identities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
1.3.6 Convolution as an Inner Product . . . . . . . . . . . . . . . . . . . . 25
1.3.7 Convolution and Smoothing . . . . . . . . . . . . . . . . . . . . . . . . 26
1.3.8 Convolution and Differentiation . . . . . . . . . . . . . . . . . . . . . 28
1.3.9 Convolution and Banach Algebras . . . . . . . . . . . . . . . . . . . 29
1.3.10 Convolution on General Domains . . . . . . . . . . . . . . . . . . . . 32
1.4 The Duality Between Smoothness and Decay . . . . . . . . . . . . . . . . 36
1.4.1 Decay in Time Implies Smoothness in Frequency . . . . . . 36
1.4.2 A Primer on Absolute Continuity . . . . . . . . . . . . . . . . . . . . 38
1.4.3 Smoothness in Time Implies Decay in Frequency . . . . . . 41
1.4.4 The Riemann–Lebesgue Lemma Revisited . . . . . . . . . . . . 42
1.5 Approximate Identities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
1.5.1 Definition and Existence of Approximate Identities . . . . 46
1.5.2 Approximation in Lp (R) by an Approximate Identity . . 48

1.5.3 Uniform Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52


1.5.4 Pointwise Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
1.5.5 Dense Sets of Nice Functions . . . . . . . . . . . . . . . . . . . . . . . . 54
1.5.6 The C ∞ Urysohn Lemma . . . . . . . . . . . . . . . . . . . . . . . . . . 55
1.5.7 Gibbs’s Phenomenon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
1.5.8 Translation-Invariant Subspaces of L1 (R) . . . . . . . . . . . . . 57
1.6 The Inversion Formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
1.6.1 The Fejér Kernel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
1.6.2 Proof of the Inversion Formula . . . . . . . . . . . . . . . . . . . . . . 62
1.6.3 Summability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
1.6.4 Pointwise Inversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
1.6.5 Decay and Smoothness Revisited . . . . . . . . . . . . . . . . . . . . 69
1.7 The Range of the Fourier Transform . . . . . . . . . . . . . . . . . . . . . . . 72
1.8 Some Special Kernels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
1.9 The Schwartz Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
1.9.1 Definition and Basic Properties . . . . . . . . . . . . . . . . . . . . . 80
1.9.2 Topology and Convergence in the Schwartz Space . . . . . 80
1.9.3 Invariance of the Schwartz Space . . . . . . . . . . . . . . . . . . . . 81

2 Fourier Series and the Abstract Fourier Transform . . . . . . . . 85


2.1 The Abstract Fourier Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
2.1.1 Examples of Locally Compact Abelian Groups . . . . . . . . 85
2.1.2 Characters and the Fourier Transform . . . . . . . . . . . . . . . 87
2.1.3 The Fourier Transform on LCA Groups . . . . . . . . . . . . . . 90
2.2 Fourier Series and Approximate Identities on the Torus . . . . . . 92
2.2.1 Partial Sums and the Dirichlet and Fejér Kernels . . . . . . 96
2.2.2 Approximate Identities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
2.2.3 Cesàro Summability and the Inversion Formula . . . . . . . 101
2.3 Completeness and the L2 -Theory . . . . . . . . . . . . . . . . . . . . . . . . . . 106
2.3.1 Completeness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
2.3.2 The L2 Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
2.3.3 Minimality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
2.4 Weyl’s Equidistribution Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . 112
2.5 Basis Properties of the Trigonometric System . . . . . . . . . . . . . . . . 115
2.5.1 Schauder Bases and the Partial Sum Operators . . . . . . . 116
2.5.2 The Symmetric Partial Sum Operators and Their
Relatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
2.6 The Conjugate Function and Norm Convergence of Fourier
Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
2.7 Pointwise Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
2.8 The Poisson Summation Formula . . . . . . . . . . . . . . . . . . . . . . . . . . 136
2.9 Wiener’s Lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
2.10 Wiener’s Tauberian Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
2.11 Ideals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144

3 The Fourier Transform on L2 (R) . . . . . . . . . . . . . . . . . . . . . . . . . . . 149


3.1 Definition and Basic Properties of the Fourier Transform on
L2 (R) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
3.2 The Hermite Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
3.2.1 Construction of the Hermite Functions . . . . . . . . . . . . . . . 156
3.2.2 Wiener’s Definition of the Fourier Transform on L2 (R) . 159
3.3 The Fourier Transform on Lp (R) . . . . . . . . . . . . . . . . . . . . . . . . . . 161
3.4 The Classical Uncertainty Principle . . . . . . . . . . . . . . . . . . . . . . . . 165
3.4.1 Concentration and the Statement of the Classical
Uncertainty Principle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
3.4.2 Simultaneous Approximation and Smoothness versus
Decay . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
3.4.3 Proof of the Classical Uncertainty Principle . . . . . . . . . . . 171
3.4.4 Additive Form of the Uncertainty Principle . . . . . . . . . . . 172
3.4.5 Operator-Theoretic Proof of the Uncertainty Principle . 174
3.4.6 Hermite Function Proof of the Uncertainty Principle . . . 174
3.4.7 A Local Uncertainty Principle . . . . . . . . . . . . . . . . . . . . . . 175
3.5 The Paley–Wiener Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
3.5.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
3.6 A Primer on Hilbert–Schmidt Operators . . . . . . . . . . . . . . . . . . . . 182
3.7 The Donoho–Stark Uncertainty Principle . . . . . . . . . . . . . . . . . . . 183
3.7.1 Concentration and Essential Support . . . . . . . . . . . . . . . . 184
3.7.2 Products of Time- and Frequency-Limiting Operators . . 185
3.7.3 The Donoho–Stark Uncertainty Principle . . . . . . . . . . . . . 186
3.7.4 Size of the Supports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
3.8 Energy Concentration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
3.8.1 Time-Frequency Limiting Operators . . . . . . . . . . . . . . . . . 189
3.8.2 Restatement of the Problem . . . . . . . . . . . . . . . . . . . . . . . . 191
3.8.3 Time-Frequency Limiting Operators and the Spectral
Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
3.8.4 Double Orthogonality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
3.8.5 Prolate Spheroidal Wave Functions . . . . . . . . . . . . . . . . . . 194
3.8.6 Behavior of the Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . 195
3.8.7 The Space of Band- and Time-Limited Functions . . . . . . 197
3.8.8 The Range of Time-Frequency Concentrations . . . . . . . . 199

4 Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
4.1 Motivation and Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
4.1.1 Notation for Functionals . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
4.2 Convergence and Continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
4.2.1 Examples of Spaces Defined by Seminorms . . . . . . . . . . . 209
4.2.2 Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
4.2.3 From Convergence to a Metric . . . . . . . . . . . . . . . . . . . . . . 213
4.2.4 Continuity Equals Boundedness . . . . . . . . . . . . . . . . . . . . . 214
4.3 Distributions of Various Sorts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216

4.3.1 The Schwartz Class and the Space of Tempered


Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
4.3.2 C ∞ (R) and the Space of Compactly Supported
Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
4.3.3 Convergence in Cc∞ (R) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
4.3.4 The Space of Distributions . . . . . . . . . . . . . . . . . . . . . . . . . 220
4.3.5 Inclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
4.3.6 Convergence of Distributions . . . . . . . . . . . . . . . . . . . . . . . . 224
4.4 Functions as Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
4.4.1 Locally Integrable Functions . . . . . . . . . . . . . . . . . . . . . . . . 228
4.4.2 Functions with Polynomial Growth . . . . . . . . . . . . . . . . . . 230
4.4.3 Compactly Supported Functions . . . . . . . . . . . . . . . . . . . . . 232
4.5 Operations on Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
4.5.1 Translation, Modulation, Dilation, Conjugation, and
Involution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
4.5.2 Products of Distributions with Smooth Functions . . . . . . 236
4.5.3 Convolution of Distributions with Smooth Functions . . . 236
4.5.4 Convolution and Translation-Invariant Operators . . . . . . 241
4.6 Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
4.6.1 Definition and Basic Properties . . . . . . . . . . . . . . . . . . . . . 244
4.6.2 Compactly Supported Distributions . . . . . . . . . . . . . . . . . . 246
4.6.3 Convolution of Distributions . . . . . . . . . . . . . . . . . . . . . . . . 248
4.7 Differentiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
4.7.1 Definition and Basic Properties . . . . . . . . . . . . . . . . . . . . . 249
4.7.2 A Relation between Classical and Distributional
Differentiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
4.7.3 Distributions and Derivatives of Continuous Functions . 255
4.7.4 Distributions Supported at a Point . . . . . . . . . . . . . . . . . . 256
4.8 The Fourier Transform of Tempered Distributions . . . . . . . . . . . 259
4.8.1 Definition of the Distributional Fourier Transform . . . . . 260
4.8.2 Basic Properties of the Fourier Transform . . . . . . . . . . . . 261
4.8.3 The Paley–Wiener Theorem for Tempered Distributions 262
4.8.4 The Distributional Poisson Summation Formula . . . . . . . 264
4.8.5 The Hilbert Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
4.9 Sobolev Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270
4.9.1 Definition and Basic Properties . . . . . . . . . . . . . . . . . . . . . 270
4.9.2 Sobolev Spaces and Distributional Derivatives . . . . . . . . . 272
4.9.3 Duality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273
4.9.4 The Sobolev Embedding Theorem . . . . . . . . . . . . . . . . . . . 274

5 Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
5.1 Borel and Radon Measures on R . . . . . . . . . . . . . . . . . . . . . . . . . . 277
5.1.1 Convolution and Linear Time-Invariant Systems . . . . . . . 277

A Lebesgue Measure and Integration . . . . . . . . . . . . . . . . . . . . . . . . . 281


A.1 Exterior Lebesgue Measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
A.2 Lebesgue Measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282
A.3 Measurable Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
A.4 The Lebesgue Integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286
A.5 The Dominated Convergence Theorem . . . . . . . . . . . . . . . . . . . . . 289
A.6 The Lebesgue Differentiation Theorem . . . . . . . . . . . . . . . . . . . . . 290
A.7 Repeated Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290

B Spaces and Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293


B.1 Metrics and Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293
B.2 Norms and Seminorms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294
B.3 Examples of Banach Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
B.3.1 The ℓp Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
B.3.2 More Sequence Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
B.3.3 The Lp spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298
B.3.4 Some Spaces of Continuous Functions . . . . . . . . . . . . . . . . 300
B.4 Inner Products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301
B.5 Orthogonality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
B.6 Linear Operators on Normed Spaces . . . . . . . . . . . . . . . . . . . . . . . 304
B.6.1 Equivalence of Boundedness and Continuity . . . . . . . . . . 305
B.6.2 Isometries and Isomorphisms . . . . . . . . . . . . . . . . . . . . . . . 306
B.6.3 Orthogonal Projections . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
B.6.4 The Space B(X, Y ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
B.7 The Dual of a Hilbert Space and Notation for Functionals . . . . 308
B.8 The Dual of Lp (E) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
B.9 Adjoints for Operators on Hilbert Spaces . . . . . . . . . . . . . . . . . . . 310
B.10 Self-Adjoint Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311

C Functional Analysis on Banach Spaces . . . . . . . . . . . . . . . . . . . . . 313


C.1 The Hahn–Banach Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
C.2 Reflexivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
C.3 Adjoints of Operators on Banach Spaces . . . . . . . . . . . . . . . . . . . . 316
C.4 The Baire Category Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
C.5 The Uniform Boundedness Principle . . . . . . . . . . . . . . . . . . . . . . . 318
C.6 The Open Mapping Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
C.7 The Closed Graph Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321
C.8 Weak and Weak* Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321

D Convergence, Completeness, and Schauder Bases . . . . . . . . . . 323


D.1 Convergent Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
D.2 Unconditional Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324
D.3 Span and Closed Span . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
D.4 Orthonormal Sequences in Hilbert Spaces . . . . . . . . . . . . . . . . . . . 328
D.5 Hamel Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329

D.6 Schauder Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 330


D.7 Continuity of the Coefficient Functionals . . . . . . . . . . . . . . . . . . . 331
D.8 Minimal and Exact Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333
D.9 Unconditional Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335

E Integral Operators and Hilbert–Schmidt Operators . . . . . . . . 337


E.1 Compact Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
E.2 The Spectral Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339
E.3 Hilbert–Schmidt Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341
E.4 Singular Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341
E.5 Integral Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343
E.6 The Hilbert–Schmidt Kernel Theorem . . . . . . . . . . . . . . . . . . . . . . 345

F Complex Analysis and Interpolation . . . . . . . . . . . . . . . . . . . . . . . 347


F.1 Analytic Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
F.2 Power Series and Taylor Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
F.3 Dirichlet Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 349
F.4 Trigonometric Polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350
F.5 Interpolation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350

Hints for Exercises and Additional Problems . . . . . . . . . . . . . . . . . . 355

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 373

Index of Symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385

Detailed Solutions to Exercises and Additional Problems . . . . . . 395


General Notation

We review some of the notational conventions that will be used throughout
this volume.
We use the symbol ⊓⊔ to denote the end of a proof, and the symbol ♦ to
denote the end of a definition, remark, example, or exercise, or the end of the
statement of a theorem whose proof will be omitted.
Unless otherwise specified, all vector spaces are taken over the complex
field $\mathbb{C}$. In particular, functions whose domain is $\mathbb{R}^d$ (or a subset of $\mathbb{R}^d$) are
generally allowed to take values in the complex plane $\mathbb{C}$.
If $S$ is a subset of a vector space $V$, then the finite linear span of $S$ is
denoted by $\mathrm{span}(S)$. If $V$ is also a metric space, then the closed linear span
$\overline{\mathrm{span}}(S)$ is defined to be the closure of $\mathrm{span}(S)$ in $V$. We say that $S$ is complete
if $\overline{\mathrm{span}}(S) = V$.
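The extended real line is $\mathbb{R} \cup \{-\infty, \infty\} = [-\infty, \infty]$. We use the conventions
that $1/0 = \infty$, $1/\infty = 0$, and $0 \cdot \infty = 0$.
Given a set $X$, the characteristic function of a subset $A \subseteq X$ is the function
$\chi_A : X \to \mathbb{R}$ defined by
\[
\chi_A(x) =
\begin{cases}
1, & \text{if } x \in A, \\
0, & \text{if } x \notin A.
\end{cases}
\]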

The Kronecker delta is
\[
\delta_{ij} =
\begin{cases}
1, & i = j, \\
0, & i \ne j.
\end{cases}
\]

If $1 \le p \le \infty$ is given, then its dual index or dual exponent is the extended
real number $p'$ that satisfies
\[
\frac{1}{p} + \frac{1}{p'} = 1.
\]
Explicitly,
\[
p' = \frac{p}{p-1}.
\]
The dual index lies in the range $1 \le p' \le \infty$, and we have $1' = \infty$, $2' = 2$,
and $\infty' = 1$.
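
The conventions for the dual index can be collected into a small helper. The following is only a minimal sketch: the name `dual_index` is a hypothetical choice made for illustration and is not notation from the text. It applies the explicit formula $p' = p/(p-1)$ together with the conventions $1/0 = \infty$ and $1/\infty = 0$ at the two endpoints.

```python
import math

def dual_index(p):
    """Return the dual index p' with 1/p + 1/p' = 1, for 1 <= p <= infinity."""
    if p < 1:
        raise ValueError("p must satisfy 1 <= p <= infinity")
    if p == 1:
        return math.inf      # 1' = infinity, using the convention 1/infinity = 0
    if math.isinf(p):
        return 1.0           # infinity' = 1
    return p / (p - 1)       # the explicit formula p' = p / (p - 1)
```

For instance, `dual_index(2)` returns `2.0`, matching $2' = 2$.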
Integrals with unspecified limits are taken over either the real line or $\mathbb{R}^d$,
according to context. In particular, if $f : \mathbb{R} \to \mathbb{C}$, then we take
\[
\int f(x)\, dx = \int_{-\infty}^{\infty} f(x)\, dx.
\]
Unless otherwise specified, statements about measure and integration refer to
Lebesgue measure and Lebesgue integration.

Given $f \in L^p(\mathbb{R})$ and $g \in L^{p'}(\mathbb{R})$ we use an inner-product-like notation to
denote the following integral:
\[
\langle f, g \rangle = \int f(x)\, \overline{g(x)}\, dx.
\]
Note that the notation $\langle f, g \rangle$ is linear as a function of $f$, but is antilinear as a
function of $g$. If $p = 2$ then $p' = 2$ and $\langle \cdot, \cdot \rangle$ is the ordinary inner product on
$L^2(\mathbb{R})$.
The dual space of a Banach space $X$ is denoted by $X^*$. Given a continuous
linear functional $\mu \in X^*$, we denote its action on $f \in X$ by either
\[
\mu(f) \quad \text{or} \quad \langle f, \mu \rangle.
\]
The latter notation is especially convenient when dealing with $X = L^p(\mathbb{R})$ and
with distributions. In particular, if $1 \le p < \infty$ then the dual space $L^p(\mathbb{R})^*$ is
isomorphic to $L^{p'}(\mathbb{R})$ in the sense that given any $\mu \in L^p(\mathbb{R})^*$ there exists a
unique function $g \in L^{p'}(\mathbb{R})$ such that
\[
\langle f, \mu \rangle = \int f(x)\, \overline{g(x)}\, dx = \langle f, g \rangle, \qquad f \in L^p(\mathbb{R}).
\]
The notation $\mu(f)$ is taken to be linear both as a function of $f$ and as a function
of $\mu$, while the notation $\langle f, \mu \rangle$ is linear as a function of $f$ but antilinear as a
function of $\mu$.
1 The Fourier Transform on $L^1(\mathbb{R})$

Harmonic analysis revolves around the Fourier transform. As we will discuss
in Section 2.1, the Fourier transform can be defined on any locally compact
abelian (LCA) group. However, for most of this volume we will concentrate on
the setting of the real line $\mathbb{R}$. In many ways, the real line is among the most
“complex” of the LCA groups, as it is neither compact nor discrete. Once we
understand the Fourier transform on $\mathbb{R}$, it is easy to move to other settings,
as we do in Chapter 2.
Even when we restrict our attention to a single underlying group such as
the real line, there are many domains on which we can define the Fourier
transform. In this chapter we focus on what is perhaps the most concrete
domain, $L^1(\mathbb{R})$. In Chapter 3 we will see how the Fourier transform extends to
$L^2(\mathbb{R})$. Chapter 4 takes us far beyond transformations of functions by defining
the Fourier transform on the extraordinarily broad domain of the tempered
distributions. We bring a little more concreteness back into the picture in
Chapter 5 by considering the Fourier transform of bounded Radon measures.
Some basic notation was laid out in the General Notation section. Most of
the other terminology that we will use in this chapter, such as the definitions of
the spaces $L^1(\mathbb{R})$, $C_0(\mathbb{R})$, $C_b(\mathbb{R})$, $C_c^\infty(\mathbb{R})$ and so forth, is given in Appendix B,
which briefly reviews definitions and major results from real analysis, operator
theory, and functional analysis.

1.1 Definition and Basic Properties


1.1.1 The Fourier Transform on $L^1(\mathbb{R})$

We define the Fourier transform on $L^1(\mathbb{R})$ as follows.

Definition 1.1 (Fourier Transform on $L^1(\mathbb{R})$). The Fourier transform of
$f \in L^1(\mathbb{R})$ is the function $\hat{f} : \mathbb{R} \to \mathbb{C}$ defined by
\[
\hat{f}(\xi) = \int f(x)\, e^{-2\pi i \xi x}\, dx = \int_{-\infty}^{\infty} f(x)\, e^{-2\pi i \xi x}\, dx, \qquad \xi \in \mathbb{R}. \tag{1.1}
\]
♦

Note that even though $f$ is only defined almost everywhere, $\hat{f}(\xi)$ is a
well-defined complex scalar for every $\xi \in \mathbb{R}$ since
\[
\int |f(x)\, e^{-2\pi i \xi x}|\, dx = \int |f(x)|\, dx = \|f\|_1 < \infty.
\]
In fact, we will soon see that $\hat{f}$ is a continuous function on $\mathbb{R}$. We want to
understand the properties of this function $\hat{f}$, and the relation between $f$ and $\hat{f}$.
When we wish to emphasize the role of the Fourier transform as an operator
mapping $f$ to $\hat{f}$, we will write
\[
\mathcal{F}f = \hat{f}.
\]
To avoid notationally ugly formulations, we sometimes write $f^\wedge$ or $(f)^\wedge$ instead
of $\hat{f}$.
Now we examine some of the properties of the Fourier transform, and
explain why we are so interested in it.

Lemma 1.2. The Fourier transform $\mathcal{F}$ is a bounded linear mapping of $L^1(\mathbb{R})$
into $L^\infty(\mathbb{R})$, and its operator norm satisfies
\[
\|\mathcal{F}\| = \sup_{\|f\|_1 = 1} \|\hat{f}\|_\infty \le 1.
\]

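Proof. The fact that $\mathcal{F}$ is linear is the reader’s first exercise. Given $f \in L^1(\mathbb{R})$,
we have for any $\xi \in \mathbb{R}$ that
\begin{align*}
|\hat{f}(\xi)| &= \left| \int f(x)\, e^{-2\pi i \xi x}\, dx \right| \\
&\le \int |f(x)\, e^{-2\pi i \xi x}|\, dx \\
&= \int |f(x)|\, dx = \|f\|_1.
\end{align*}
Hence
\[
\|\hat{f}\|_\infty = \operatorname*{ess\,sup}_{\xi \in \mathbb{R}} |\hat{f}(\xi)| \le \|f\|_1.
\]
Consequently $\mathcal{F} : L^1(\mathbb{R}) \to L^\infty(\mathbb{R})$ is bounded, and $\|\mathcal{F}\| \le 1$. ⊓⊔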
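
The bound $|\hat{f}(\xi)| \le \|f\|_1$ can be observed numerically. The sketch below is an illustration under stated assumptions, not part of the proof: the test function $f(x) = e^{-|x|}$, the truncation interval $[-40, 40]$, the midpoint rule, and the helper names `fourier_transform` and `l1_norm` are all our own choices. For this particular $f$ one has $\|f\|_1 = 2$ and $\hat{f}(\xi) = 2/(1 + 4\pi^2\xi^2)$, which gives reference values to compare against.

```python
import cmath
import math

def fourier_transform(f, xi, a=-40.0, b=40.0, n=40000):
    """Midpoint-rule approximation of the integral of f(x) e^{-2 pi i xi x} over [a, b]."""
    h = (b - a) / n
    total = 0j
    for k in range(n):
        x = a + (k + 0.5) * h
        total += f(x) * cmath.exp(-2j * math.pi * xi * x)
    return total * h

def l1_norm(f, a=-40.0, b=40.0, n=40000):
    """Midpoint-rule approximation of the integral of |f(x)| over [a, b]."""
    h = (b - a) / n
    return sum(abs(f(a + (k + 0.5) * h)) for k in range(n)) * h

def f(x):
    # Assumed test function; its L1 norm is 2.
    return math.exp(-abs(x))

norm = l1_norm(f)
values = [fourier_transform(f, xi) for xi in (0.0, 0.25, 1.0, 3.0)]
```

Each entry of `values` has modulus at most `norm`, and the entry for $\xi = 0$ is approximately equal to `norm`, in line with the lemma.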


Thus, the Fourier transform maps the unit ball in $L^1(\mathbb{R})$ into the unit ball
in $L^\infty(\mathbb{R})$. The following exercise shows that we actually have $\|\mathcal{F}\| = 1$, and
that the supremum in the definition of the operator norm of $\mathcal{F}$ is achieved,
i.e., there exists an $f \in L^1(\mathbb{R})$ with $\|f\|_1 = 1$ such that $\|\hat{f}\|_\infty = 1$. Even so,
the range of $\mathcal{F}$ is a proper subset of $L^\infty(\mathbb{R})$ (see Theorem 1.20).

Exercise 1.3. Show that if $f \in L^1(\mathbb{R})$ then we have the useful fact that
\[
\hat{f}(0) = \int f(x)\, dx.
\]
Assume for now that $\hat{f}$ is continuous (we will prove this in Exercise 1.8), and
use this equality to show that if $f \ge 0$ a.e., then $\|\hat{f}\|_\infty = \|f\|_1$. Conclude that
$\|\mathcal{F}\| = 1$. ♦

Remark 1.4. There are many other standard normalizations for the Fourier
transform. For example, each of
\[
\int f(x)\, e^{2\pi i \xi x}\, dx, \qquad \int f(x)\, e^{-i \xi x}\, dx, \qquad \frac{1}{\sqrt{2\pi}} \int f(x)\, e^{-i \xi x}\, dx,
\]
is a common choice. Each normalization makes certain formulas notationally
nicer and other formulas more notationally complicated. ♦

We introduce a convenient notation for the dilation of a function.

Notation 1.5 (Dilation). Given a function $f : \mathbb{R} \to \mathbb{C}$ and given $r > 0$, we
define the $L^1$-normalized dilation of $f$ by $r$ to be the function $f_r : \mathbb{R} \to \mathbb{C}$
given by
\[
f_r(x) = r f(rx). \qquad ♦
\]

The dilation $f_r$ is $L^1$-normalized in the sense that it preserves the $L^1$
norm, i.e., $\|f_r\|_1 = \|f\|_1$. When $r < 1$, the function $f_r$ is a “stretched out”
version of $f$, while when $r > 1$, the function $f_r$ is a “shrunken” version of $f$,
both with a vertical scaling that ensures that the $L^1$-norm is preserved. When
$r > 1$, it may be more descriptive to refer to $f_r$ as a compression of $f$, but it
is customary to use the term dilation to refer to $f_r$ for all values of $r$.
Although there is the possibility of ambiguity between the notation for
the dilation $f_r$ of a function $f$ and an element $f_n$ of a sequence of functions
$\{f_n\}_{n \in \mathbb{N}}$, the meaning will usually be clear from context.
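
The identity $\|f_r\|_1 = \|f\|_1$ follows from the change of variables $y = rx$, and it can also be checked numerically. The following is a minimal sketch under our own assumptions: the test function $f(x) = e^{-x^2}$ (whose $L^1$ norm is $\sqrt{\pi}$), the truncation interval, the midpoint rule, and the helper names `dilate` and `l1_norm` are illustrative choices, not notation from the text.

```python
import math

def dilate(f, r):
    """Return the L1-normalized dilation f_r(x) = r f(r x), for r > 0."""
    if r <= 0:
        raise ValueError("r must be positive")
    return lambda x: r * f(r * x)

def l1_norm(f, a=-60.0, b=60.0, n=60000):
    """Midpoint-rule approximation of the integral of |f(x)| over [a, b]."""
    h = (b - a) / n
    return sum(abs(f(a + (k + 0.5) * h)) for k in range(n)) * h

def gaussian(x):
    # Assumed test function; its L1 norm is sqrt(pi).
    return math.exp(-x * x)

# r < 1 stretches the graph, r > 1 compresses it; the norm should not change.
norms = {r: l1_norm(dilate(gaussian, r)) for r in (0.25, 1.0, 4.0)}
```

All three entries of `norms` agree with $\sqrt{\pi}$ up to quadrature error.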
It is about time that we see a specific example of a Fourier transform.

Definition 1.6 (Dirichlet and Sinc Functions). The Dirichlet function is
\[
d(\xi) = \frac{\sin \xi}{\pi \xi}
\]
(implicitly defined at the origin so as to be continuous on $\mathbb{R}$). The sinc function
is
\[
\mathrm{sinc}(\xi) = \frac{\sin \pi \xi}{\pi \xi} = \pi d(\pi \xi) = d_\pi(\xi). \qquad ♦
\]

Fig. 1.1. Graph of the sinc function $d_\pi(\xi) = \frac{\sin \pi \xi}{\pi \xi}$.

The sinc function is also known as the cardinal sine, and indeed “sinc” is
a contraction of sinus cardinalis. The “cardinal” nature of the sinc function
is the fact that it is an interpolating function, because $\mathrm{sinc}(0) = 1$ while
$\mathrm{sinc}(n) = 0$ for all integers $n \ne 0$. The Dirichlet function is named for Johann
Dirichlet (1805–1859).
Note that the functions $d$ and $\mathrm{sinc}$ do not belong to $L^1(\mathbb{R})$, because they
only decay slowly at infinity (on the order of $1/|\xi|$). On the other hand, they
do belong to $L^p(\mathbb{R})$ for every $1 < p \le \infty$.
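
The interpolation property and the relation $\mathrm{sinc}(\xi) = \pi d(\pi\xi)$ are easy to check in floating point. The sketch below is only an illustration; the function names `dirichlet` and `sinc` are our own, and the values at the origin are the continuous extensions $d(0) = 1/\pi$ and $\mathrm{sinc}(0) = 1$.

```python
import math

def dirichlet(xi):
    """d(xi) = sin(xi) / (pi xi), extended continuously by d(0) = 1/pi."""
    return 1.0 / math.pi if xi == 0 else math.sin(xi) / (math.pi * xi)

def sinc(xi):
    """sinc(xi) = sin(pi xi) / (pi xi), extended continuously by sinc(0) = 1."""
    return 1.0 if xi == 0 else math.sin(math.pi * xi) / (math.pi * xi)

# Values at the nonzero integers vanish up to rounding error.
integer_values = [sinc(n) for n in range(-5, 6) if n != 0]

# Compare sinc(xi) with the dilation d_pi(xi) = pi * d(pi * xi) at a few points.
dilation_gaps = [abs(sinc(x) - math.pi * dirichlet(math.pi * x))
                 for x in (0.0, 0.3, 1.7, -2.4)]
```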

Exercise 1.7. Let $\chi_{[-T,T]}$ denote the characteristic function of the interval
$[-T, T]$. Note that $\chi_{[-T,T]} \in L^1(\mathbb{R})$. Prove the following statements.
(a) $(\chi_{[-T,T]})^\wedge(\xi) = \dfrac{\sin 2\pi T \xi}{\pi \xi} = d_{2\pi T}(\xi)$.
(b) $\|\chi_{[-T,T]}\|_1 = \|(\chi_{[-T,T]})^\wedge\|_\infty$.
(c) $d_{2\pi T}$ is uniformly continuous.
(d) $d_{2\pi T} \in C_0(\mathbb{R})$. ♦
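
Before attempting the proofs, it can help to see statement (a) confirmed numerically. The following sketch is a sanity check only and does not replace the exercise. The value $T = 1.5$, the sample frequencies, the midpoint rule, and the helper names `indicator_transform` and `d_dilated` are assumptions made for illustration.

```python
import cmath
import math

def indicator_transform(T, xi, n=20000):
    """Midpoint-rule approximation of the integral of e^{-2 pi i xi x} over [-T, T]."""
    h = 2.0 * T / n
    total = 0j
    for k in range(n):
        x = -T + (k + 0.5) * h
        total += cmath.exp(-2j * math.pi * xi * x)
    return total * h

def d_dilated(r, xi):
    """d_r(xi) = r d(r xi) = sin(r xi) / (pi xi), with the continuous value r / pi at 0."""
    return r / math.pi if xi == 0 else math.sin(r * xi) / (math.pi * xi)

T = 1.5
errors = [abs(indicator_transform(T, xi) - d_dilated(2 * math.pi * T, xi))
          for xi in (0.0, 0.1, 0.5, 2.0)]
```

The entries of `errors` are at the level of the quadrature error. Note also that `d_dilated(2 * math.pi * T, 0)` equals $2T$, the $L^1$ norm of $\chi_{[-T,T]}$, which is the value relevant to statement (b).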

The last two properties of $d_{2\pi T}$ are in fact shared by any function that
is the Fourier transform of an $L^1$ function. The following exercise asks for a
direct proof of the fact that $\hat{f}$ is uniformly continuous if $f \in L^1(\mathbb{R})$. On the
other hand, Theorem 1.20 below shows that if $f \in L^1(\mathbb{R})$ then $\hat{f}$ must belong
to $C_0(\mathbb{R})$, which also implies (see Exercise 1.17) that $\hat{f}$ must be uniformly
continuous.

Exercise 1.8. Show directly that if $f \in L^1(\mathbb{R})$, then $\hat{f}$ is uniformly
continuous, i.e.,
\[
\lim_{\eta \to 0} \|\hat{f} - T_\eta \hat{f}\|_\infty = \lim_{\eta \to 0} \sup_{\xi \in \mathbb{R}} \bigl| \hat{f}(\xi) - \hat{f}(\xi - \eta) \bigr| = 0. \qquad ♦
\]

Fig. 1.2. Graph of $e_\xi(x) = e^{2\pi i \xi x}$ for $\xi = 2$ and $0 \le x \le 4$.

1.1.2 Motivation

We pause for a moment in our mathematical development of the Fourier
transform to give some motivation for its definition and purpose. One of our main
goals in this chapter is to prove the following Inversion Formula for the Fourier
transform (see Theorem 1.89).

Theorem 1.9 (Inversion Formula). If $f, \hat{f} \in L^1(\mathbb{R})$ then $f$ and $\hat{f}$ are
continuous and
\[
f(x) = \int \hat{f}(\xi)\, e^{2\pi i \xi x}\, d\xi, \tag{1.2}
\]
with equality holding pointwise everywhere. ♦
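
The Inversion Formula can be illustrated numerically on a function for which both $f$ and $\hat{f}$ are integrable. The sketch below uses the Gaussian $f(x) = e^{-\pi x^2}$, which is a standard example that equals its own Fourier transform under the normalization (1.1); that fact is not established in this section and is used here only as a reference value. The grid, the truncation to $[-6, 6]$, and the helper names are our own assumptions.

```python
import cmath
import math

def midpoint_nodes(a, b, n):
    """Return the n midpoint-rule nodes on [a, b] together with the step size."""
    h = (b - a) / n
    return [a + (k + 0.5) * h for k in range(n)], h

def gaussian(x):
    # Assumed test function: f(x) = e^{-pi x^2}.
    return math.exp(-math.pi * x * x)

nodes, h = midpoint_nodes(-6.0, 6.0, 600)

def transform(values, sign, t):
    """Approximate the integral of values(s) e^{sign * 2 pi i s t} ds over the nodes."""
    return h * sum(v * cmath.exp(sign * 2j * math.pi * s * t)
                   for s, v in zip(nodes, values))

f_values = [gaussian(x) for x in nodes]
# Equation (1.1), evaluated at every grid frequency.
f_hat_values = [transform(f_values, -1, xi) for xi in nodes]

def recovered(x):
    """Equation (1.2): rebuild f(x) from the sampled values of its transform."""
    return transform(f_hat_values, +1, x)
```

Evaluating `recovered(x)` at a few points returns values that agree with `gaussian(x)` to within quadrature error, with a negligible imaginary part.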

Thus, assuming the hypothesis that both $f$ and $\hat{f}$ are integrable, if we
know all the values $\hat{f}(\xi)$ of the Fourier transform of $f$, then we can recover $f$
from these values via equation (1.2). To explain the significance of this, picture
the complex exponential function
\[
e_\xi(x) = e^{2\pi i \xi x} = \cos(2\pi \xi x) + i \sin(2\pi \xi x)
\]

as a function of $x$. While $x$ lies in $\mathbb{R}$, the function values $e_\xi(x)$ are complex
numbers that lie on the unit circle $S^1$ in $\mathbb{C}$. As $x$ ranges through the real
line, the values $e_\xi(x) = e^{2\pi i \xi x}$ move around the unit circle $S^1$. If $\xi > 0$, then
as $x$ increases through an interval of length $1/\xi$, the values $e_\xi(x) = e^{2\pi i \xi x}$
move once around $S^1$ in the counter-clockwise direction. Thus $e_\xi$ is periodic
with period $1/\xi$. If $\xi$ is negative, the same is true except that the values
$e_\xi(x) = e^{2\pi i \xi x}$ circle around $S^1$ in the opposite direction. The graph of $e_\xi$ is
\[
\Gamma_\xi = \bigl\{ (x, e^{2\pi i \xi x}) : x \in \mathbb{R} \bigr\} \subseteq \mathbb{R} \times \mathbb{C}.
\]

Identifying $\mathbb{R} \times \mathbb{C}$ with $\mathbb{R} \times \mathbb{R}^2 = \mathbb{R}^3$, the graph $\Gamma_\xi$ is a helix in $\mathbb{R}^3$ coiling
around the $x$-axis, which runs down the center of the helix (see Figure 1.2).
The function $e_\xi$ is periodic with period $1/\xi$, and we therefore say that it has
frequency $\xi$, i.e., the frequency is the reciprocal of the period. The function $e_\xi$
is a “pure tone” or a “pure frequency” in some sense. Its real part is $\cos(2\pi \xi x)$,
a cosine of period $1/\xi$ and frequency $\xi$. Its imaginary part is $\sin(2\pi \xi x)$, a sine
of period $1/\xi$ and frequency $\xi$.
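
The geometric description of $e_\xi$ translates directly into a few numerical checks. The sketch below is illustrative only: the frequency $\xi = 2$ matches Figure 1.2, while the sample points and the function name `e` are our own choices. It confirms that the values lie on the unit circle, that shifting $x$ by the period $1/\xi$ changes nothing, and that a quarter of a period corresponds to a quarter turn in the counter-clockwise direction.

```python
import cmath
import math

def e(xi, x):
    """The complex exponential e_xi(x) = e^{2 pi i xi x}."""
    return cmath.exp(2j * math.pi * xi * x)

xi = 2.0
period = 1.0 / xi
samples = [0.0, 0.1, 0.37, 1.25]

# Every value lies on the unit circle.
moduli = [abs(e(xi, x)) for x in samples]

# Shifting x by one period 1/xi returns the same point of the circle.
shift_errors = [abs(e(xi, x + period) - e(xi, x)) for x in samples]

# For xi > 0, a quarter of a period moves from 1 to i (counter-clockwise).
quarter_turn = e(xi, period / 4)
```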
We turn to music for illustration. Restricting our attention for the mo-
ment to real-valued functions, imagine that the function f (x) represents the
displacement of the center of an ideal vibrating string, or the end of an ideal
tuning fork, or the center of your stereo speaker, from its rest position. The
vibrating string, tuning fork, or speaker creates a pressure wave in the air,
which causes your eardrum to vibrate and your brain to perceive sound. For
the function cos(2πξx), the sound that you would hear is a “pure tone” of fre-
quency ξ. Real strings are of course much more complicated—a piano string
or a violin string each vibrates in a very complicated way, resulting in their
different sounds. But an ideal string or tuning fork whose displacement is ex-
actly given by the function cos(2πξx) would be a pure tone, with no overtones
or other complications (see the illustration in Figure 1.3). The function
$e^{2\pi i \xi x}$ is a complex version of this pure tone.

Fig. 1.3. Graph of $\varphi(x) = \cos(2\pi 7x)$.

For a given fixed $\xi$, the function $\hat f(\xi)\, e^{2\pi i \xi x}$ is a pure tone whose
amplitude is the scalar $\hat f(\xi)$. The larger $|\hat f(\xi)|$ is, the larger the
vibrations of our string or tuning fork, and the louder the perceived sound.
Fig. 1.4. Graph of $\varphi(x) = 2\cos(2\pi 7x) + 0.7\cos(2\pi 9x)$.

Fig. 1.5. Graph of 75 superimposed pure tones: $\varphi(x) = \sum_{k=1}^{75} \hat f(\xi_k) \cos(2\pi \xi_k x)$.

Given two frequencies $\eta$, $\xi$ and amplitudes $\hat f(\eta)$, $\hat f(\xi)$, a function
$\varphi$ of the form
\[ \varphi(x) = \hat f(\eta)\, e^{2\pi i \eta x} + \hat f(\xi)\, e^{2\pi i \xi x} \]
is a superposition of two pure tones (see the illustration in Figure 1.4). With
some caveats, if this function represents the displacement of our string, you
would very likely be able to tell that the sound you hear is a superposition
of two frequencies—the human ear is very well-adapted to sounds of this
type. A superposition of 75 pure tones with randomly chosen frequencies and
amplitudes is shown in Figure 1.5.
The Inversion Formula is an extreme version of such a superposition. It
says that any function $f$ (so long as $f$ and $\hat f$ are integrable) can be represented
as an integral (in effect, a continuous sum) of pure tones $\hat f(\xi)\, e^{2\pi i \xi x}$ over all
possible frequencies $\xi \in \mathbb{R}$. By superimposing all the pure tones with the
correct amplitudes, we can create any sound that we like. The pure tones are our
simple "building blocks," and by combining them we can create any sound
(or signal, or function). Of course, the "superposition" is an integral, not a
finite sum, but still we are combining our very simple special functions $e_\xi$ to
create very complicated functions $f$ via the Inversion Formula.
Once we have a representation of $f$ in terms of our pure tones, we can
act on it. For example, assume that we measure time in seconds, in which
case frequency is measured in hertz. If we don't like the annoying buzz in
our signal $f$ that is due to our 60 hertz overhead fluorescent lights, we might
decide to modify $f$ by creating a new function $h$ whose Fourier transform is
identical to $\hat f$ except that $\hat h(60) = 0$ (and most likely with some corresponding
smooth modifications of the frequencies close to $\xi = 60$ as well). Once we
know what we want $\hat h$ to be, the function $h$ that does this is given by the
Inversion Formula as $h(x) = \int \hat h(\xi)\, e^{2\pi i \xi x}\, d\xi$. In engineering jargon, we filter $f$
to obtain $h$. We will see later that $h$ can be obtained from $f$ through the
operation of convolution (see Section 1.3.4).
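The filtering idea can be tried out on sampled data. The following is a minimal sketch in a discrete setting, not the continuous construction described above: it assumes one second of a signal sampled 256 times, uses a naive discrete Fourier transform in place of the integral transform, and removes a 60 hertz component by zeroing the corresponding coefficients. The names `dft`, `idft`, and `notch` are illustrative choices, and because the test signal is real-valued the sketch zeroes the coefficient at frequency $-60$ as well as at $60$.

```python
import cmath
import math


def dft(samples):
    """Naive discrete Fourier transform, O(n^2); adequate for short signals."""
    n = len(samples)
    return [
        sum(samples[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
        for k in range(n)
    ]


def idft(coeffs):
    """Inverse of dft; a discrete counterpart of the Inversion Formula."""
    n = len(coeffs)
    return [
        sum(coeffs[k] * cmath.exp(2j * math.pi * k * j / n) for k in range(n)) / n
        for j in range(n)
    ]


def notch(samples, freq):
    """Zero the coefficients at +freq and -freq, then transform back."""
    coeffs = dft(samples)
    n = len(coeffs)
    coeffs[freq % n] = 0
    coeffs[-freq % n] = 0  # bin n - freq carries the frequency -freq
    return [z.real for z in idft(coeffs)]


# One second sampled 256 times: a 5 Hz tone plus a quieter 60 Hz "buzz".
N = 256
clean = [math.cos(2 * math.pi * 5 * j / N) for j in range(N)]
signal = [c + 0.5 * math.cos(2 * math.pi * 60 * j / N) for j, c in enumerate(clean)]

filtered = notch(signal, 60)
max_error = max(abs(a - b) for a, b in zip(filtered, clean))
print(max_error)  # should be at rounding-error level
```

Zeroing a single coefficient is the crudest possible filter; the smooth modification of nearby frequencies mentioned in the text is not attempted here.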
In light of the Inversion Formula, we make the following definition.
Definition 1.10 (Inverse Fourier Transform). The inverse Fourier transform
of $f \in L^1(\mathbb{R})$ is

\[ \check f(\xi) = \int f(x)\, e^{2\pi i \xi x}\, dx. \]

When we wish to denote the inverse Fourier transform as an operator acting
on $f$, we write $\mathcal{F}^{-1}(f) = \check f$. ♦

Using this notation, the Inversion Formula says that if $f, \hat f \in L^1(\mathbb{R})$ then
$f = (\hat f)^{\vee}$, and likewise $f = (\check f)^{\wedge}$ will hold under the same hypotheses.
∧ ∨
Remark 1.11. If $f \in L^1(\mathbb{R})$ then $\check f(\xi) = \hat f(-\xi)$. Therefore, every result that
we prove about the Fourier transform has an analogue for the inverse Fourier
transform, simply by making a change of variables. We usually only state
results for the Fourier transform, but it is a good idea for the reader to work
out the corresponding formulas for the inverse Fourier transform (usually, a
sign simply has to be moved from one place to another). Note, for example,
that if $f, \hat f \in L^1(\mathbb{R})$, then the Inversion Formula tells us that
\[ f^{\wedge\wedge}(\xi) = (\hat f)^{\vee}(-\xi) = f(-\xi). \]

The musical discussion above may explain some terminology. In our
example, the function $f(x)$ represented a displacement (of a string or speaker)
that changed with time. Time is represented by the variable $x$, and thus we
often speak of $x$ as the time variable, and we say that values $f(x)$ describe
the function $f$ in the time domain. On the other hand, $\hat f(\xi)$ represents the
"amount" of frequency $\xi$ present in $f$, and therefore we often refer to $\xi$ as
the frequency variable, and say that values $\hat f(\xi)$ describe the function $f$ in the
frequency domain. We use this terminology even if $x$ represents something
other than time and $\xi$ something other than frequency. For example, we often
use this same terminology when we move to higher dimensions, i.e., even if $f$
is defined on $\mathbb{R}^2$ instead of $\mathbb{R}$, we might refer to $\mathbb{R}^2$ as the "time domain." Or,
if $x$ is representing a spatial quantity, we may refer to $\mathbb{R}^2$ as the "spatial
domain." There is no single perfect terminology since there are so many different
contexts in which the Fourier transform can arise.

Additional Problems

1.1. Show that the Fourier transform of the one-sided exponential
$f(x) = e^{-x} \chi_{[0,\infty)}(x)$ is
\[ \hat f(\xi) = \frac{1}{2\pi i \xi + 1}, \qquad \xi \in \mathbb{R}, \]
and the Fourier transform of the two-sided exponential $g(x) = e^{-|x|}$ is
\[ \hat g(\xi) = \frac{2}{4\pi^2 \xi^2 + 1}, \qquad \xi \in \mathbb{R}. \]
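Before attempting the proof, the two formulas in Problem 1.1 can be sanity-checked numerically. The sketch below is only a check, not a solution: it approximates the Fourier integral by a midpoint rule on a truncated interval (the helper name `fourier_transform`, the truncation at 40, and the step count are all assumptions made for illustration) and compares the result with the closed forms at one sample frequency.

```python
import cmath
import math


def fourier_transform(f, xi, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of f(x) e^{-2 pi i xi x} over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0j
    for k in range(steps):
        x = lo + (k + 0.5) * h
        total += f(x) * cmath.exp(-2j * math.pi * xi * x)
    return total * h


def one_sided(x):
    """f(x) = e^{-x} for x >= 0 and 0 otherwise."""
    return math.exp(-x) if x >= 0 else 0.0


def two_sided(x):
    """g(x) = e^{-|x|}."""
    return math.exp(-abs(x))


xi = 0.75
approx_f = fourier_transform(one_sided, xi, 0.0, 40.0)
exact_f = 1 / (2j * math.pi * xi + 1)

approx_g = fourier_transform(two_sided, xi, -40.0, 40.0)
exact_g = 2 / (4 * math.pi ** 2 * xi ** 2 + 1)

print(abs(approx_f - exact_f), abs(approx_g - exact_g))  # both should be small
```

The tails beyond the truncation point are of size about $e^{-40}$, so the discrepancy is dominated by the quadrature error.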

1.2. Prove that if $f \in L^1(\mathbb{R})$ is even, then $\hat f$ is even, and if $f \in L^1(\mathbb{R})$ is
odd, then $\hat f$ is odd. The converse is also true, but is not as easy to prove; see
Problem 1.46.

1.3. Prove that if $f \in L^1(\mathbb{R})$ is real-valued, then $\hat f(\xi) = \overline{\hat f(-\xi)}$. Conclude that
if $f$ is both real and even, then $\hat f$ is both real and even as well.

1.4. Show that if $f \in L^1(\mathbb{R})$ is nonnegative almost everywhere (and is not the
zero function), then $|\hat f(\xi)| < \hat f(0)$ for all $\xi \ne 0$.

1.5. (a) The Gamma function for complex numbers $z$ satisfying $\operatorname{Re}(z) > 0$ is
\[ \Gamma(z) = \int_0^\infty t^{z-1} e^{-t}\, dt. \]
Show that this is well-defined, i.e., $t^{z-1} e^{-t} \in L^1(0,\infty)$ whenever $\operatorname{Re}(z) > 0$.
Remark: The Gamma function is analytic on $\operatorname{Re}(z) > 0$, and has an
analytic continuation to $\mathbb{C} \setminus \{0, -1, -2, \dots\}$. Also, $\Gamma(n+1) = n!$ for $n \in \mathbb{N}$.
(b) Show that $f(x) = e^{-e^x} e^x \in L^1(\mathbb{R})$, and $\hat f(\xi) = \Gamma(1 - 2\pi i \xi)$.
Remark: It can be shown that $\Gamma(z) \ne 0$ for every $z$ where it is defined, so
for this $f$ we have $\hat f(\xi) \ne 0$ for every $\xi \in \mathbb{R}$.

1.6. The Riemann zeta function for complex numbers $s$ with $\operatorname{Re}(s) > 1$ is
\[ \zeta(s) = \sum_{n=1}^\infty \frac{1}{n^s}. \]
The Riemann zeta function is analytic on $\operatorname{Re}(s) > 1$, and has an analytic
continuation to $\mathbb{C} \setminus \{1\}$. The Riemann hypothesis, whose validity is one of the
great open problems in mathematics, states that if $\zeta(s) = 0$ and $\operatorname{Re}(s) > 0$
then $\operatorname{Re}(s) = 1/2$.
(a) Show that
\[ (1 - 2^{1-s})\, \zeta(s) = \sum_{n=1}^\infty \frac{(-1)^{n+1}}{n^s}, \qquad \operatorname{Re}(s) > 1. \tag{1.3} \]

(b) The right-hand side of equation (1.3) is an example of a Dirichlet
series, and it converges and defines an analytic function on $\operatorname{Re}(s) > 0$ (see
Theorem F.6). The left-hand side of equation (1.3) is analytic on $\operatorname{Re}(s) > 0$
except for $s = 1$, and can be defined so that it is analytic at $s = 1$ as well.
Since both sides are analytic on $\operatorname{Re}(s) > 0$ and are equal on $\operatorname{Re}(s) > 1$, it
follows from the properties of analytic functions that equation (1.3) holds for
all $s$ with $\operatorname{Re}(s) > 0$. For $s = 1$ we need to define the left-hand side to take
the value $\sum_{n=1}^\infty \frac{(-1)^{n+1}}{n} = \ln 2$. Assuming these facts, show that
\[ \Gamma(s)\, (1 - 2^{1-s})\, \zeta(s) = \sum_{n=1}^\infty (-1)^{n+1} \int_0^\infty x^{s-1} e^{-nx}\, dx, \qquad \operatorname{Re}(s) > 0. \tag{1.4} \]
It is helpful to note that, by a change of variables, $\Gamma(s) = n^s \int_0^\infty x^{s-1} e^{-nx}\, dx$
for $\operatorname{Re}(s) > 0$.
(c) Justify interchanging the summation and integral in equation (1.4) to
obtain
\[ \Gamma(s)\, (1 - 2^{1-s})\, \zeta(s) = \int_0^\infty x^{s-1}\, \frac{1}{e^x + 1}\, dx, \qquad \operatorname{Re}(s) > 0. \]
(d) Given $t > 0$, define
\[ f_t(x) = \frac{e^{tx}}{e^{e^x} + 1}. \]
Show that $f_t \in L^1(\mathbb{R})$. Given $\xi \in \mathbb{R}$, let $s = t - 2\pi i \xi$ and show that
\[ \hat f_t(\xi) = \Gamma(s)\, (1 - 2^{1-s})\, \zeta(s). \]

1.2 Translation, Modulation, Dilation, and Involution


A common theme of many of the issues that we will consider in this chapter
concerns the dualities that occur between properties of $f$ and those of $\hat f$. In
this section we examine the dualities that occur among several basic operators
under the Fourier transform.

1.2.1 Four Fundamental Operators

Here are four operators that will pervade our study of the Fourier transform.

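Definition 1.12. We define the following operators on functions $f \colon \mathbb{R} \to \mathbb{C}$.

Translation: $(T_a f)(x) = f(x - a)$, $a \in \mathbb{R}$.
Modulation: $(M_\theta f)(x) = e^{2\pi i \theta x} f(x)$, $\theta \in \mathbb{R}$.
Dilation: $f_\lambda(x) = \lambda f(\lambda x)$, $\lambda > 0$.
Involution: $\tilde f(x) = \overline{f(-x)}$. ♦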

The scaling factor in the definition of dilation has been chosen so that di-
lation preserves the L1 -norm of a function. Translation, modulation, dilation,
and involution are all isometries mapping L1 (R) onto itself.

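Exercise 1.13. Prove the following algebraic properties of the Fourier
transform of $f \in L^1(\mathbb{R})$.
(a) $(T_a f)^{\wedge}(\xi) = (M_{-a} \hat f)(\xi) = e^{-2\pi i a \xi} \hat f(\xi)$, for $a \in \mathbb{R}$.
(b) $(M_\eta f)^{\wedge}(\xi) = (T_\eta \hat f)(\xi) = \hat f(\xi - \eta)$, for $\eta \in \mathbb{R}$.
(c) $(f_\lambda)^{\wedge}(\xi) = \lambda\, (\hat f)_{1/\lambda}(\xi) = \hat f(\xi/\lambda)$, for $\lambda > 0$.
(d) $(\bar f)^{\wedge}(\xi) = (\hat f)^{\sim}(\xi) = \overline{\hat f(-\xi)}$.
(e) $(\tilde f)^{\wedge}(\xi) = \overline{\hat f(\xi)}$.

Also derive analogous formulas relating the inverse Fourier transform to
translation, modulation, dilation, and involution. ♦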

In this sense, Ta is dual to M−a under the Fourier transform, and likewise
Mη is dual to Tη . Additionally, except for the normalizing scaling factor (which
will be very important to us in Section 1.5!), dilation by λ is dual to dilation
by 1/λ under the Fourier transform.
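These dualities are easy to test numerically on a function whose transform is known in closed form. The sketch below uses the two-sided exponential $g(x) = e^{-|x|}$, whose transform $\hat g(\xi) = 2/(4\pi^2\xi^2 + 1)$ is quoted in Problem 1.1, and checks the translation, modulation, and dilation rules of Exercise 1.13 at a single frequency. The quadrature helper, the truncation interval, and the particular parameter values are assumptions chosen for illustration.

```python
import cmath
import math


def fourier_transform(f, xi, lo=-30.0, hi=30.0, steps=40_000):
    """Midpoint-rule approximation of the integral of f(x) e^{-2 pi i xi x} dx."""
    h = (hi - lo) / steps
    total = 0j
    for k in range(steps):
        x = lo + (k + 0.5) * h
        total += f(x) * cmath.exp(-2j * math.pi * xi * x)
    return total * h


def g(x):
    """Two-sided exponential e^{-|x|}."""
    return math.exp(-abs(x))


def g_hat(xi):
    """Closed-form transform of g, as stated in Problem 1.1."""
    return 2 / (4 * math.pi ** 2 * xi ** 2 + 1)


a, eta, lam, xi = 0.3, 0.5, 2.0, 0.75

# Translation by a should become multiplication by e^{-2 pi i a xi}.
translated = fourier_transform(lambda x: g(x - a), xi)
err_translation = abs(translated - cmath.exp(-2j * math.pi * a * xi) * g_hat(xi))

# Modulation by eta should become translation by eta.
modulated = fourier_transform(lambda x: cmath.exp(2j * math.pi * eta * x) * g(x), xi)
err_modulation = abs(modulated - g_hat(xi - eta))

# The L^1-normalized dilation lam * g(lam x) should become g_hat(xi / lam).
dilated = fourier_transform(lambda x: lam * g(lam * x), xi)
err_dilation = abs(dilated - g_hat(xi / lam))

print(err_translation, err_modulation, err_dilation)  # all should be small
```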
In order to derive some of the properties of the family of translation
operators, it is helpful to know that $C_c(\mathbb{R})$ is dense in $L^p(\mathbb{R})$ when $p$ is finite.
We will prove this using standard real analysis techniques. By making use of
convolutions, we will greatly refine this result in Section 1.5. For example, we
will see that the seemingly "tiny" space $C_c^\infty(\mathbb{R})$ is dense in $L^p(\mathbb{R})$ for each
$1 \le p < \infty$, and is dense in $C_0(\mathbb{R})$ with respect to the $L^\infty$-norm.
The tool that we need for this proof is Urysohn's Lemma, a general
topological result which states that if $A$ and $B$ are disjoint closed subsets of a normal
topological space $X$ then there exists a continuous function $f \colon X \to [0,1]$ that
is identically 0 on $A$ and identically 1 on $B$. We will prove Urysohn's Lemma
for subsets of $\mathbb{R}^d$ (although the same simple proof can be used in any metric
space). The key is the following lemma.

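Lemma 1.14. If $E \subseteq \mathbb{R}^d$ is nonempty, then
\[ f(x) = \operatorname{dist}(x, E) = \inf\bigl\{ |x - z| : z \in E \bigr\} \]
is uniformly continuous on $\mathbb{R}^d$.
Proof. Fix $\varepsilon > 0$. Choose any $x, y \in \mathbb{R}^d$ with $|x - y| < \varepsilon/2$. By definition, there
exist $a, b \in E$ such that $|x - a| < \operatorname{dist}(x, E) + \varepsilon/2$ and $|y - b| < \operatorname{dist}(y, E) + \varepsilon/2$.
Hence
\begin{align*}
f(y) = \operatorname{dist}(y, E) &\le |y - a| \\
&\le |y - x| + |x - a| \\
&< \frac{\varepsilon}{2} + \operatorname{dist}(x, E) + \frac{\varepsilon}{2} \\
&= f(x) + \varepsilon.
\end{align*}
Similarly (using $b$ in place of $a$) $f(x) < f(y) + \varepsilon$, so $|f(x) - f(y)| < \varepsilon$ whenever
$|x - y| < \varepsilon/2$. ⊓⊔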

Theorem 1.15 (Urysohn’s Lemma). If E, F are disjoint closed subsets of
Rd , then there exists a continuous function θ : Rd → R such that
(a) 0 ≤ θ ≤ 1,
(b) θ = 0 on E, and
(c) θ = 1 on F.
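Proof. Because $E$ is closed, if $x \notin E$ then $\operatorname{dist}(x, E) > 0$, and likewise for $F$.
Since $E$ and $F$ are disjoint, it follows that $\operatorname{dist}(x, E) + \operatorname{dist}(x, F) > 0$ for
every $x$. Also, by Lemma 1.14, $\operatorname{dist}(x, E)$ and $\operatorname{dist}(x, F)$ are each continuous
functions of $x$. Therefore the function
\[ \theta(x) = \frac{\operatorname{dist}(x, E)}{\operatorname{dist}(x, E) + \operatorname{dist}(x, F)} \]
has the required properties. ⊓⊔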
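The formula in this proof is concrete enough to compute. As a small illustration in one dimension, the sketch below takes $E$ and $F$ to be disjoint closed intervals (an assumption made so that the distance function has a simple closed form) and evaluates $\theta$ at a few points; the names `dist_to_interval` and `urysohn` are illustrative.

```python
def dist_to_interval(x, lo, hi):
    """Distance from the point x to the closed interval [lo, hi]."""
    if x < lo:
        return lo - x
    if x > hi:
        return x - hi
    return 0.0


def urysohn(x, E, F):
    """theta(x) = dist(x, E) / (dist(x, E) + dist(x, F)) for disjoint closed intervals E, F."""
    d_e = dist_to_interval(x, *E)
    d_f = dist_to_interval(x, *F)
    # The denominator is positive because E and F are disjoint and closed.
    return d_e / (d_e + d_f)


E = (0.0, 1.0)
F = (2.0, 3.0)

for x in (-1.0, 0.5, 1.25, 1.5, 1.75, 2.5, 4.0):
    print(x, urysohn(x, E, F))
```

On the gap between the two intervals $\theta$ rises linearly from 0 to 1, and outside $[0, 3]$ it takes values strictly between 0 and 1, which is consistent with requirements (a) through (c).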

Theorem 1.16. $C_c(\mathbb{R}^d)$ is dense in $L^p(\mathbb{R}^d)$ for each $1 \le p < \infty$.
Proof. First consider the function $f = \chi_E$ where $E \subseteq \mathbb{R}^d$ is bounded and
measurable. If we fix $\varepsilon > 0$, then there exists a bounded open set $U \supseteq E$ such that
$|U \setminus E| < \varepsilon$ and a compact set $K \subseteq E$ such that $|E \setminus K| < \varepsilon$. By Urysohn's Lemma
(Theorem 1.15), we can find a continuous function $\theta \colon \mathbb{R}^d \to \mathbb{R}$ such that
$0 \le \theta \le 1$, $\theta = 1$ on $K$, and $\theta = 0$ on $\mathbb{R}^d \setminus U$. Then $\theta \in C_c(\mathbb{R}^d)$, and we have
\[ \|\chi_E - \theta\|_p^p = \int |\chi_E - \theta|^p = \int_{U \setminus K} |\chi_E - \theta|^p \le |U \setminus K| < 2\varepsilon. \]
Hence $\chi_E$ can be approximated arbitrarily closely in $L^p$-norm by elements
of $C_c(\mathbb{R}^d)$. By forming finite linear combinations, it follows that every simple
function in $L^p(\mathbb{R}^d)$ that has compact support can be approximated as well as
we like by functions in $C_c(\mathbb{R}^d)$. Since the set of compactly supported simple
functions is dense in $L^p(\mathbb{R}^d)$, we conclude that $C_c(\mathbb{R}^d)$ is dense as well. ⊓⊔

Now we establish some properties of the translation operator $T_a$. The
"easy" way to prove part (c) of the next exercise is to first prove that it holds
for functions in $C_c(\mathbb{R})$, and then use the fact that $C_c(\mathbb{R})$ is dense to extend to
$L^p(\mathbb{R})$.

Exercise 1.17. (a) Prove that if $f \in C_0(\mathbb{R})$ then $f$ is uniformly continuous,
and show that this is equivalent to the statement
\[ \lim_{a \to 0} \|T_a f - f\|_\infty = 0. \tag{1.5} \]
(b) Show that (1.5) can fail if we only assume that $f \in C_b(\mathbb{R})$.
(c) Show that if $1 \le p < \infty$ and $f \in L^p(\mathbb{R})$, then
\[ \lim_{a \to 0} \|T_a f - f\|_p = 0. \] ♦

The function $\omega_p(a) = \|T_a f - f\|_p$ is often called the $L^p$-modulus of
continuity of $f$.

Remark 1.18. If we consider the mapping $\tau \colon \mathbb{R} \to \mathcal{B}(L^p(\mathbb{R}))$ defined by
$\tau(a) = T_a$, then Exercise 1.17 implies that for each $1 \le p < \infty$ we have
\[ \forall\, f \in L^p(\mathbb{R}), \qquad \lim_{t \to s} \|\tau(t) f - \tau(s) f\|_p = 0. \]
In another terminology, this says that $\tau(t) \to \tau(s)$ in the strong operator
topology as $t \to s$. For $1 \le p < \infty$, we therefore say that $\{T_a\}_{a \in \mathbb{R}}$ is a strongly
continuous one-parameter family of operators on $L^p(\mathbb{R})$. The same is true of
$\{T_a\}_{a \in \mathbb{R}}$ on $C_0(\mathbb{R})$ when $p = \infty$. ♦

Exercise 1.19. Prove that the family {Mη }η∈R of modulation operators is a
strongly continuous family of operators on Lp (R) for 1 ≤ p < ∞. For p = ∞,
show that {Mη }η∈R is a strongly continuous family on C0 (R), but not on
L∞ (R). ♦

1.2.2 The Riemann–Lebesgue Lemma

We now use the strong continuity of the family of translation operators to
prove that if $f \in L^1(\mathbb{R})$, then not only is $\hat f$ continuous, but we also have decay
of $\hat f$ at infinity.

Theorem 1.20 (Riemann–Lebesgue Lemma). If $f \in L^1(\mathbb{R})$, then
$\hat f \in C_0(\mathbb{R})$.

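Proof. Choose any $f \in L^1(\mathbb{R})$. We already know that $\hat f$ is continuous, so we
just have to show that it decays at infinity. Since $e^{-\pi i} = -1$, we have for $\xi \ne 0$
that
\begin{align*}
\hat f(\xi) &= \int f(x)\, e^{-2\pi i \xi x}\, dx \tag{1.6} \\
&= -\int f(x)\, e^{-2\pi i \xi x}\, e^{-2\pi i \xi (\frac{1}{2\xi})}\, dx \\
&= -\int f(x)\, e^{-2\pi i \xi (x + \frac{1}{2\xi})}\, dx \\
&= -\int f\Bigl(x - \frac{1}{2\xi}\Bigr)\, e^{-2\pi i \xi x}\, dx. \tag{1.7}
\end{align*}
Averaging equalities (1.6) and (1.7), we obtain
\[ \hat f(\xi) = \frac{1}{2} \int \Bigl( f(x) - f\Bigl(x - \frac{1}{2\xi}\Bigr) \Bigr)\, e^{-2\pi i \xi x}\, dx. \]
The fact that translation is strongly continuous on $L^1(\mathbb{R})$ therefore implies
that
\[ |\hat f(\xi)| \le \frac{1}{2} \int \Bigl| f(x) - f\Bigl(x - \frac{1}{2\xi}\Bigr) \Bigr|\, dx = \frac{1}{2}\, \|f - T_{\frac{1}{2\xi}} f\|_1 \to 0 \]
as $|\xi| \to \infty$. ⊓⊔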

While certainly elegant, and yielding the nice $L^1$-modulus of continuity
estimate
\[ |\hat f(\xi)| \le \frac{1}{2}\, \omega_1\Bigl(\frac{1}{2\xi}\Bigr), \]
the preceding proof does have a less-than-satisfying "magical" feel to it. A
different proof of the Riemann–Lebesgue Lemma is given in Problem 1.8, and
a third proof appears in Section 1.4.
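The modulus-of-continuity estimate can be checked by hand on a simple example. The sketch below takes $f = \chi_{[0,1]}$, for which a direct computation gives $|\hat f(\xi)| = |\sin(\pi\xi)|/(\pi|\xi|)$ and $\|f - T_a f\|_1 = 2\min(|a|, 1)$; both closed forms are worked out for this illustration rather than taken from the text, and the function names are hypothetical.

```python
import math


def fhat_abs(xi):
    """|f^(xi)| for f the indicator of [0, 1]: |sin(pi xi)| / (pi |xi|)."""
    return abs(math.sin(math.pi * xi)) / (math.pi * abs(xi))


def half_modulus(xi):
    """(1/2) ||f - T_a f||_1 with a = 1/(2 xi), using ||f - T_a f||_1 = 2 min(|a|, 1)."""
    a = 1 / (2 * xi)
    return 0.5 * 2 * min(abs(a), 1.0)


for xi in (0.1, 0.5, 1.5, 10.5, 100.5, -7.25):
    # The left column should never exceed the right column.
    print(xi, fhat_abs(xi), half_modulus(xi))
```

For $|\xi| \ge 1/2$ the bound equals $1/(2|\xi|)$, so in this example the estimate captures the true $1/|\xi|$ decay rate of $\hat f$ up to a constant.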

1.2.3 Position and Momentum


Two additional operators that play central roles in harmonic analysis are the
mathematical versions of the position and momentum operators from quantum
mechanics. These are defined as follows.
Definition 1.21 (Position and Momentum). The position and momentum
operators $P$ and $M$ are
\[ P f(x) = x f(x), \qquad \text{and} \qquad M f = \frac{1}{2\pi i}\, f'. \tag{1.8} \] ♦
The position operator is defined on all functions $f \colon \mathbb{R} \to \mathbb{C}$. The momentum
operator is defined on all differentiable functions $f$, although often we
only require that $M f$ be defined almost everywhere and hence only need to
assume that $f$ is differentiable at almost every point.
Unlike the translation, modulation, dilation, and involution operators, the
position and momentum operators do not map $L^p(\mathbb{R})$ boundedly into itself,
even if we restrict to domains where they are well-defined (see Problem 1.9).
Even so, Problems 1.11 and 1.12 show that position and momentum are
fundamentally tied to modulation and translation.

1.2.4 The HRT Conjecture

We close our discussion of translation and modulation by briefly discussing one
of our favorite open mathematical problems.

Conjecture 1.22 (The HRT Conjecture). If $g \in L^2(\mathbb{R})$ is not the zero
function and $\Lambda = \{(p_k, q_k)\}_{k=1}^N$ is any set of finitely many distinct points in
$\mathbb{R}^2$, then
\[ \mathcal{G}(g, \Lambda) = \bigl\{ M_{q_k} T_{p_k} g \bigr\}_{k=1}^N \]
is a linearly independent set of functions in $L^2(\mathbb{R})$. ♦

Conjecture 1.22 was first made in [HRT96]. While many partial results
related to this conjecture are known, as of the time of writing it is not known
whether Conjecture 1.22 holds in the generality stated. For more details and
background on this conjecture, we refer to the survey paper [Heil06] or Section
11.9 in the author’s text [Heil11a].

Additional Problems
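1.7. Let $f \colon \mathbb{R} \to \mathbb{C}$ be Lebesgue measurable. Prove that $T_a f \xrightarrow{\,m\,} f$ (convergence
in measure) as $a \to 0$ on any compact set, i.e.,
\[ \forall \text{ compact } K \subseteq \mathbb{R}, \ \forall\, \varepsilon > 0, \qquad \lim_{a \to 0} \bigl| \{ x \in K : |f(x) - T_a f(x)| > \varepsilon \} \bigr| = 0. \]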

1.8. This problem provides an alternative proof of Theorem 1.20.
(a) Show that $\hat f \in C_0(\mathbb{R})$ for every $f \in S = \operatorname{span}\{\chi_{[a,b]} : a < b \in \mathbb{R}\}$.
(b) Show that $S$ is dense in $L^1(\mathbb{R})$ and use this to prove that $\hat f \in C_0(\mathbb{R})$
for every $f \in L^1(\mathbb{R})$.

1.9. Let $P$, $M$ be the position and momentum operators defined in equation
(1.8), and fix $1 \le p < \infty$. These operators are not defined on all of $L^p(\mathbb{R})$.
Instead, define domains
\begin{align*}
D_P &= \{ f \in L^p(\mathbb{R}) : x f(x) \in L^p(\mathbb{R}) \}, \\
D_M &= \{ f \in L^p(\mathbb{R}) : f \text{ is differentiable and } f' \in L^p(\mathbb{R}) \},
\end{align*}
which are dense subspaces of $L^p(\mathbb{R})$. Restricted to these domains, $P$ maps $D_P$
into $L^p(\mathbb{R})$ and $M$ maps $D_M$ into $L^p(\mathbb{R})$. Show that $P$ and $M$ are unbounded
even when restricted to these domains, i.e.,
\[ \sup_{\substack{f \in D_P, \\ \|f\|_p = 1}} \|P f\|_p = \infty = \sup_{\substack{f \in D_M, \\ \|f\|_p = 1}} \|M f\|_p. \]

1.10. Let $X$ be a Banach space, and fix $A \in \mathcal{B}(X)$. For $n > 0$ let $A^n$ denote
the usual $n$th power of $A$ ($A^n = A \cdots A$, $n$ times), and define $A^0 = I$ (the
identity map on $X$).
(a) Given $x \in X$, show that the series $e^A(x) = \sum_{k=0}^\infty \frac{A^k x}{k!}$ converges
absolutely in $X$, and show that $e^A$ is a linear operator on $X$.
(b) Prove that the series $\sum_{k=0}^\infty \frac{A^k}{k!}$ converges absolutely in $\mathcal{B}(X)$, and
equals the operator $e^A$ defined in part (a). Conclude that $e^A \in \mathcal{B}(X)$ and
$\|e^A\| \le e^{\|A\|}$.
(c) Prove that if $A, B \in \mathcal{B}(X)$ and $AB = BA$, then $e^A e^B = e^{A+B} = e^B e^A$.
(d) Let $H$ be a Hilbert space. Show that if $A \in \mathcal{B}(H)$ is self-adjoint, then
$e^{iA}$ is unitary.

1.11. Show that for an appropriate dense subset of functions $f \in L^1(\mathbb{R})$ we
have
\[ M_\xi f = e^{2\pi i \xi P} f = \sum_{k=0}^\infty \frac{(2\pi i \xi P)^k f}{k!}, \]
and similarly there is a set of $f \in L^1(\mathbb{R})$ such that
\[ T_a f = e^{-2\pi i a M} f = \sum_{k=0}^\infty \frac{(-2\pi i a M)^k f}{k!}, \]
where these series converge absolutely both in the pointwise sense and in
$L^1$-norm. In contrast to Problem 1.10, note that the operators $P$ and $M$ are
unbounded on $L^1(\mathbb{R})$.

1.12. Show that for an appropriate dense subset of functions $f \in L^1(\mathbb{R})$ we have
\[ 2\pi i M f = \lim_{a \to 0} \frac{f - T_a f}{a} \qquad \text{and} \qquad -2\pi i P f = \lim_{\xi \to 0} \frac{f - M_\xi f}{\xi}, \tag{1.9} \]
where these limits converge both in the pointwise sense and in $L^1$-norm.
Remark: Equation (1.9) says that $2\pi i M$ is the infinitesimal generator of
the strongly continuous family of operators $\{T_a\}_{a \in \mathbb{R}}$, and $-2\pi i P$ is the
infinitesimal generator of the family $\{M_\xi\}_{\xi \in \mathbb{R}}$.

1.3 Convolution
Since L1 (R) is a Banach space, we know that it has many useful properties. In
particular the operations of addition and scalar multiplication are continuous.
However, there are many other operations on L1 (R) that we could consider.
One natural operation is multiplication of functions, but unfortunately L1 (R)
is not closed under pointwise multiplication.

Exercise 1.23. Show that f, g ∈ L1 (R) does not imply f g ∈ L1 (R). ♦

In this section we will define a different "multiplication-like" operation
under which $L^1(\mathbb{R})$ is closed. This operation, convolution of functions, will
be one of the most important tools in our further development of harmonic
analysis. Therefore, in this section we set aside the Fourier transform for the
moment, and concentrate on developing the machinery of convolution.

1.3.1 Some Notational Conventions

Before proceeding, there are some technical issues related to the definition of
elements of Lp (R) that we need to clarify.
The basic source of difficulty is that an element $f$ of $L^p(\mathbb{R})$ is not a
function but rather denotes an equivalence class of functions that are equal almost
everywhere. Therefore we cannot speak of the "value of $f \in L^p(\mathbb{R})$ at a point
$x \in \mathbb{R}$," and consequently concepts such as continuity or support do not
apply in a literal sense to elements of $L^p(\mathbb{R})$. For example, the zero function
$0$ and the function $\chi_{\mathbb{Q}}$ both belong to the zero element of $L^p(\mathbb{R})$, which is
the equivalence class of functions that are zero a.e., yet $0$ is continuous and
compactly supported while $\chi_{\mathbb{Q}}$ is discontinuous and its support is $\mathbb{R}$. Even so,
it is often essential to consider smoothness or support properties of functions,
and we therefore adopt the following conventions when discussing the
smoothness or support of elements of $L^p(\mathbb{R})$. More generally, these same issues and
conventions apply to elements of
\[ L^1_{\mathrm{loc}}(\mathbb{R}) = \bigl\{ f \colon \mathbb{R} \to \mathbb{C} : f \cdot \chi_K \in L^1(\mathbb{R}) \text{ for every compact } K \subseteq \mathbb{R} \bigr\}, \]
which is the space of locally integrable functions on $\mathbb{R}$. Note that
$L^p(\mathbb{R}) \subseteq L^1_{\mathrm{loc}}(\mathbb{R})$ for every $1 \le p \le \infty$.

Notation 1.24 (Continuity for Elements of $L^1_{\mathrm{loc}}(\mathbb{R})$). We will say that
$f \in L^1_{\mathrm{loc}}(\mathbb{R})$ is continuous if there is a representative of $f$ that is continuous,
i.e., there exists some continuous function $f_0$ such that $f$ is the equivalence
class of all functions that equal $f_0$ almost everywhere.
Conversely, if $g$ is a continuous function such that $\int_K |g(x)|\, dx < \infty$ for
every compact $K \subseteq \mathbb{R}$, then we write $g \in L^1_{\mathrm{loc}}(\mathbb{R})$ with the understanding that this
means that the equivalence class of functions that equal $g$ a.e. is an element
of $L^1_{\mathrm{loc}}(\mathbb{R})$. In this sense we write statements such as $C_c(\mathbb{R}) \subseteq L^p(\mathbb{R})$ even
though $C_c(\mathbb{R})$ is a set of functions while $L^p(\mathbb{R})$ is a set of equivalence classes
of functions. ♦

Notation 1.25 (Support of Elements of $L^1_{\mathrm{loc}}(\mathbb{R})$). We will say that
$f \in L^1_{\mathrm{loc}}(\mathbb{R})$ has compact support if there is a representative of $f$ that has compact
support. Thus $f$ has compact support if there exists an $N > 0$ such that
$f(x) = 0$ for a.e. $|x| > N$.
In many situations, this definition of compact support is all that we need,
but in some circumstances it is important to discuss the support of $f \in L^1_{\mathrm{loc}}(\mathbb{R})$
explicitly. We define the support of $f \in L^1_{\mathrm{loc}}(\mathbb{R})$ to be
\[ \operatorname{supp}(f) = \bigcap \bigl\{ F \subseteq \mathbb{R} : F \text{ is closed and } f(x) = 0 \text{ for a.e. } x \notin F \bigr\}. \]
In particular, if $F$ is a closed subset of $\mathbb{R}$, then
\[ \operatorname{supp}(f) \subseteq F \iff f(x) = 0 \text{ for a.e. } x \notin F. \]
However, if $T$ is a generic subset of $\mathbb{R}$, then the statements $\operatorname{supp}(f) \subseteq T$ and
$f(x) = 0$ for a.e. $x \notin T$ need not be equivalent.
In the language of Chapter 4, we are taking the support of $f \in L^1_{\mathrm{loc}}(\mathbb{R})$
to be the support of the distribution in $\mathcal{D}'(\mathbb{R})$ that is determined by $f$; see
Section 4.6. ♦

The reader should verify that if $f \in L^1_{\mathrm{loc}}(\mathbb{R})$ is continuous (in the sense
given in Notation 1.24), then the support of $f$ in the sense of Notation 1.25
coincides with the usual definition of the support of $f$ as the closure in $\mathbb{R}$ of
the set $\{x \in \mathbb{R} : f(x) \ne 0\}$.

1.3.2 Definition and Basic Properties of Convolution

Now we can define convolution of functions.

Definition 1.26 (Convolution). Let $f \colon \mathbb{R} \to \mathbb{C}$ and $g \colon \mathbb{R} \to \mathbb{C}$ be Lebesgue
measurable functions. Then the convolution of $f$ with $g$ is the function $f * g$
given by
\[ (f * g)(x) = \int f(y)\, g(x - y)\, dy, \tag{1.10} \]
whenever this integral is well-defined. ♦

For example, suppose that $1 \le p \le \infty$ and let $p'$ be the dual index to $p$.
If $f \in L^p(\mathbb{R})$ and $g \in L^{p'}(\mathbb{R})$, then (as functions of $y$), $f(y)$ and $g(x - y)$
belong to dual spaces, and hence by Hölder's Inequality the integral defining
$(f * g)(x)$ in equation (1.10) exists for every $x$, and furthermore is bounded
as a function of $x$.

Exercise 1.27. Show that if $1 \le p \le \infty$, $f \in L^p(\mathbb{R})$, and $g \in L^{p'}(\mathbb{R})$, then
$f * g \in L^\infty(\mathbb{R})$, and we have
\[ \|f * g\|_\infty \le \|f\|_p\, \|g\|_{p'}. \tag{1.11} \]

We will improve on this exercise (in several ways) below. In particular,
Exercise 1.27 does not give the only hypotheses on $f$ and $g$ which imply
that $f * g$ exists. We will shortly see Young's Inequality, which is a powerful
result that tells us that $f * g$ will belong to a particular Lebesgue space $L^r(\mathbb{R})$
whenever $f \in L^p(\mathbb{R})$, $g \in L^q(\mathbb{R})$, and we have the proper relationship among
$p$, $q$, and $r$ (specifically, $\frac{1}{p} + \frac{1}{q} = 1 + \frac{1}{r}$). Before turning to that general case,
we prove the fundamental fact that $L^1(\mathbb{R})$ is closed under convolution, and
that the Fourier transform interchanges convolution with multiplication.
Theorem 1.28. If $f, g \in L^1(\mathbb{R})$ are given, then the following statements hold.
(a) $f(y)\, g(x - y)$ is Lebesgue measurable on $\mathbb{R}^2$.
(b) For almost every $x \in \mathbb{R}$, $f(y)\, g(x - y)$ is a measurable and integrable
function of $y$, and hence $(f * g)(x)$ is defined for a.e. $x \in \mathbb{R}$.
(c) $f * g \in L^1(\mathbb{R})$, and
\[ \|f * g\|_1 \le \|f\|_1\, \|g\|_1. \]
(d) The Fourier transform of $f * g$ is the product of the Fourier transforms
of $f$ and $g$:
\[ (f * g)^{\wedge}(\xi) = \hat f(\xi)\, \hat g(\xi), \qquad \xi \in \mathbb{R}. \]
Proof. (a) If we set $h(x, y) = f(x)$, then
\[ h^{-1}(a, \infty) = \{(x, y) : h(x, y) > a\} = \{(x, y) : f(x) > a\} = f^{-1}(a, \infty) \times \mathbb{R}, \]
which is a measurable subset of $\mathbb{R}^2$ since $f^{-1}(a, \infty)$ and $\mathbb{R}$ are measurable
subsets of $\mathbb{R}$. Likewise $k(x, y) = g(y)$ is measurable. Since the product of
measurable functions is measurable, we conclude that $F(x, y) = f(x)\, g(y)$
is measurable. Further, $T(x, y) = (y, x - y)$ is a linear transformation, so
$H(x, y) = (F \circ T)(x, y) = F(y, x - y) = f(y)\, g(x - y)$ is measurable.
(b) Using the same notation as in part (a), we have
\begin{align*}
\iint |H(x, y)|\, dx\, dy &= \int \Bigl( \int |g(x - y)|\, dx \Bigr) |f(y)|\, dy \\
&= \int \|g\|_1\, |f(y)|\, dy = \|g\|_1\, \|f\|_1.
\end{align*}
Therefore $H(x, y) = f(y)\, g(x - y) \in L^1(\mathbb{R}^2)$, so Fubini's Theorem implies that
the function $(f * g)(x) = \int f(y)\, g(x - y)\, dy$ exists for almost every $x$ and is
an integrable function of $x$.
(c) Using part (b),
\[ \|f * g\|_1 = \int |(f * g)(x)|\, dx \le \iint |f(y)\, g(x - y)|\, dy\, dx = \|f\|_1\, \|g\|_1. \]
(d) Fubini's Theorem (exercise: justify its use) allows us to interchange
integrals in the following calculation:
\begin{align*}
(f * g)^{\wedge}(\xi) &= \int (f * g)(x)\, e^{-2\pi i \xi x}\, dx \\
&= \iint f(y)\, g(x - y)\, e^{-2\pi i \xi x}\, dy\, dx \\
&= \int f(y)\, e^{-2\pi i \xi y} \Bigl( \int g(x - y)\, e^{-2\pi i \xi (x - y)}\, dx \Bigr) dy \\
&= \int f(y)\, e^{-2\pi i \xi y} \Bigl( \int g(x)\, e^{-2\pi i \xi x}\, dx \Bigr) dy \\
&= \int f(y)\, e^{-2\pi i \xi y}\, \hat g(\xi)\, dy \\
&= \hat f(\xi)\, \hat g(\xi). \qquad ⊓⊔
\end{align*}

In the proof of Theorem 1.28, we carefully addressed the measurability of
$f * g$. We will usually take issues of measurability for granted from now on,
but it is a good idea for the reader to consider wherever appropriate why the
measurability of the functions we encounter is ensured.
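Parts (c) and (d) of Theorem 1.28 have exact finite analogues that can be verified by direct computation. The sketch below is a discrete stand-in, not the theorem itself: it replaces functions on $\mathbb{R}$ by sequences of length $n$, convolution by circular convolution, and the Fourier transform by a naive discrete Fourier transform. All names and the sample data are illustrative.

```python
import cmath
import math


def dft(a):
    """Naive discrete Fourier transform of a finite sequence."""
    n = len(a)
    return [
        sum(a[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
        for k in range(n)
    ]


def circular_convolution(a, b):
    """(a * b)[k] = sum over j of a[j] b[(k - j) mod n]."""
    n = len(a)
    return [sum(a[j] * b[(k - j) % n] for j in range(n)) for k in range(n)]


def norm1(a):
    """Discrete analogue of the L^1-norm."""
    return sum(abs(t) for t in a)


a = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, 0.0, 3.0]
b = [0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.25]

conv = circular_convolution(a, b)

# Analogue of (d): the transform of a convolution is the product of the transforms.
lhs = dft(conv)
rhs = [u * v for u, v in zip(dft(a), dft(b))]
gap = max(abs(x - y) for x, y in zip(lhs, rhs))

# Analogue of (c): the 1-norm is submultiplicative under convolution.
print(gap, norm1(conv), norm1(a) * norm1(b))
```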

Exercise 1.29. Establish the following basic properties of convolution. Given
$f, g, h \in L^1(\mathbb{R})$, prove the following.
(a) Commutativity: $f * g = g * f$.
(b) Associativity: $(f * g) * h = f * (g * h)$.
(c) Distributive laws: $f * (g + h) = f * g + f * h$.
(d) Commutativity with translations: $f * (T_a g) = (T_a f) * g = T_a (f * g)$ for
$a \in \mathbb{R}$.
(e) Behavior under involution: $(f * g)^{\sim} = \tilde f * \tilde g$. ♦

1.3.3 Young’s Inequality

As we have seen, $L^1(\mathbb{R})$ is closed under convolution, which we write in short
as
\[ L^1(\mathbb{R}) * L^1(\mathbb{R}) \subseteq L^1(\mathbb{R}). \]
It is not true that $L^p(\mathbb{R})$ is closed under convolution for $p > 1$. Instead we
have the following fundamental result, known as Young's Inequality (although,
since Young proved many inequalities, it may be advisable to refer to this as
Young's Convolution Inequality).

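Exercise 1.30 (Young's Inequality). Prove the following statements.
(a) If $1 \le p \le \infty$ then $L^p(\mathbb{R}) * L^1(\mathbb{R}) \subseteq L^p(\mathbb{R})$, and we have
\[ \forall\, f \in L^p(\mathbb{R}), \ \forall\, g \in L^1(\mathbb{R}), \qquad \|f * g\|_p \le \|f\|_p\, \|g\|_1. \tag{1.12} \]
(b) If $1 \le p, q, r \le \infty$ and $\frac{1}{r} = \frac{1}{p} + \frac{1}{q} - 1$ then $L^p(\mathbb{R}) * L^q(\mathbb{R}) \subseteq L^r(\mathbb{R})$, and
we have
\[ \forall\, f \in L^p(\mathbb{R}), \ \forall\, g \in L^q(\mathbb{R}), \qquad \|f * g\|_r \le \|f\|_p\, \|g\|_q. \tag{1.13} \] ♦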

Of course, statement (a) in Young's Inequality is a special case of
statement (b), but it is so useful that it is worth stating separately. It is also
instructive on first try to attempt to prove statement (a) rather than the
more general statement (b) in order to see the appropriate technique needed.
There are many ways to prove Young's Inequality, e.g., via Hölder's Inequality
or Minkowski's Integral Inequality (which is stated in Problem 1.25).

Remark 1.31. If $q = p'$ (in which case $r = \infty$), then Young's Inequality tells
us that the convolution of $f \in L^p(\mathbb{R})$ with $g \in L^{p'}(\mathbb{R})$ belongs to $L^\infty(\mathbb{R})$. In
fact, it follows from our later Exercise 1.39 that $f * g$ is continuous in this case,
and therefore $(f * g)(x)$ is defined for every $x$. However, for general values of
$p$, $q$, $r$ satisfying the hypotheses of Young's Inequality, we are only able to
conclude that $f * g \in L^r(\mathbb{R})$, and hence we usually only have that $f * g$ is
defined pointwise almost everywhere. ♦

The inequalities given in Exercise 1.30 are in the form that we will most
often need in practice, but it is very interesting to note that the implicit
constant 1 on the right-hand side of equation (1.13) is not the optimal constant
in general. Instead, if we define the Babenko–Beckner constant $A_p$ by
\[ A_p = \Bigl( \frac{p^{1/p}}{{p'}^{1/p'}} \Bigr)^{1/2}, \tag{1.14} \]
where we take $A_1 = A_\infty = 1$, then the optimal version of equation (1.13) is
\[ \forall\, f \in L^p(\mathbb{R}), \ \forall\, g \in L^q(\mathbb{R}), \qquad \|f * g\|_r \le (A_p A_q A_{r'})\, \|f\|_p\, \|g\|_q, \tag{1.15} \]
and the constant $A_p A_q A_{r'}$ is typically not 1. The proof that $A_p A_q A_{r'}$ is the
best constant in equation (1.15) is due to Beckner [Bec75] and Brascamp
and Lieb [BrL76]. The Babenko–Beckner constant will make an appearance
again when we consider the Hausdorff–Young Theorem in Chapter 3 (see
Theorem 3.23).
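To get a feel for how far the sharp constant is from 1, it can simply be evaluated. The sketch below assumes the standard form of the Babenko–Beckner constant, $A_p = (p^{1/p} / {p'}^{1/p'})^{1/2}$ with $A_1 = A_\infty = 1$, and computes the product $A_p A_q A_{r'}$ for exponents satisfying $\frac{1}{r} = \frac{1}{p} + \frac{1}{q} - 1$. The function names are illustrative.

```python
import math


def dual_index(p):
    """Dual index p' with 1/p + 1/p' = 1, using math.inf for infinity."""
    if p == 1:
        return math.inf
    if p == math.inf:
        return 1.0
    return p / (p - 1)


def babenko_beckner(p):
    """A_p = sqrt(p^(1/p) / p'^(1/p')), with A_1 = A_inf = 1 by convention."""
    if p == 1 or p == math.inf:
        return 1.0
    q = dual_index(p)
    return math.sqrt(p ** (1 / p) / q ** (1 / q))


def young_constant(p, q):
    """Product A_p A_q A_{r'} for 1/r = 1/p + 1/q - 1 (assumes 1/p + 1/q >= 1)."""
    inv_r = 1 / p + 1 / q - 1
    r = math.inf if inv_r == 0 else 1 / inv_r
    return babenko_beckner(p) * babenko_beckner(q) * babenko_beckner(dual_index(r))


for p, q in ((1, 1), (2, 2), (4 / 3, 4 / 3), (1.5, 1.2)):
    print(p, q, young_constant(p, q))
```

For $p = q = 2$ every factor equals 1, so nothing is gained in that case, whereas for $p = q = 4/3$ (so $r = 2$) the product is strictly smaller than 1.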

1.3.4 Convolution as Filtering; Lack of an Identity

Theorem 1.28 gives us another way to view filtering (which was discussed
in Section 1.1.2). Given $f \in L^1(\mathbb{R})$, we filter $f$ by modifying its frequency
content. That is, we create a new function $h$ from $f$ whose Fourier transform
is
\[ \hat h(\xi) = \hat f(\xi)\, \hat g(\xi). \]
The Fourier transform of the function $g$ tells us how to modify the frequency
content of $f$. Assuming that the Inversion Formula applies, we can recover $h$
by the formula
\[ h(x) = \int \hat f(\xi)\, \hat g(\xi)\, e^{2\pi i \xi x}\, d\xi, \]

which is a superposition of the "pure tones" $e^{2\pi i \xi x}$ with the modified
amplitudes $\hat f(\xi)\, \hat g(\xi)$. Assuming that $g \in L^1(\mathbb{R})$, Theorem 1.28 tells us that we can
also obtain $h$ by convolution: we have $h = f * g$. Filtering is convolution.
Obviously, there are many details that we are glossing over by assuming
all of the formulas are applicable. Some of these we will address later, e.g.,
is it true that a function $h \in L^1(\mathbb{R})$ is uniquely determined by its Fourier
transform $\hat h$? (Yes, we will show that $f \mapsto \hat f$ is an injective map of $L^1(\mathbb{R})$ into
$C_0(\mathbb{R})$, see Theorem 1.92.) Others we will leave for a course on digital signal
processing. (For example, how do results for $L^1(\mathbb{R})$ relate to the processing of
real-life digital signals whose domain is $\{1, \dots, n\}$ instead of $\mathbb{R}$?) In any case,
keeping our attention on the real line, let us ask one interesting question. If
our goal is to filter $f$, then one of the possible filterings should be the identity
operation, i.e., do nothing to the frequency content of $f$. Is there a $g \in L^1(\mathbb{R})$
such that $f \mapsto f * g$ is the identity operation on $L^1(\mathbb{R})$?

Exercise 1.32. Suppose that there existed a function $\delta \in L^1(\mathbb{R})$ such that
\[ \forall\, f \in L^1(\mathbb{R}), \qquad f * \delta = f. \]
Show that $\delta$ would satisfy $\hat\delta(\xi) = 1$ for all $\xi$, which contradicts the
Riemann–Lebesgue Lemma. ♦

Consequently, there is no identity element for convolution in $L^1(\mathbb{R})$. This
is problematic, and we will see several alternative ways of addressing this
problem. In Section 1.5, we will construct functions which "approximate" an
identity for convolution as closely as we like, according to a variety of meanings
of approximation. In Chapter 4, we will create a distribution (or "generalized
function") $\delta$ that is not itself a function but instead acts on functions and
is an identity with respect to convolution. In Chapter 5, we will see that
this distribution $\delta$ can also be regarded as a bounded measure on the real
line. Thus, while there is no function $\delta$ that is an identity for convolution, an
identity $\delta$ does exist in several generalized senses.

1.3.5 Convolution as Averaging; Introduction to Approximate Identities

Convolution can also be regarded as a kind of weighted averaging operator.


For example, consider

$$\chi_T = \frac{1}{2T}\,\chi_{[-T,T]}, \qquad T > 0.$$

Given f ∈ L1 (R), we have that

$$(f * \chi_T)(x) = \int f(y)\,\chi_T(x - y)\,dy = \frac{1}{2T}\int_{x-T}^{x+T} f(y)\,dy = \mathrm{Avg}_T f(x),$$

where $\mathrm{Avg}_T f(x)$ is the average of f on the interval [x − T, x + T ] (see Figure 1.6).

Fig. 1.6. The area of the dashed box equals $\int_{x-T}^{x+T} f(y)\,dy$, which is the area under
the graph of f between x − T and x + T.
For a general function g, the mapping f 7→ f ∗ g can be regarded as a kind
of weighted averaging of f, with g weighting some parts of the real line more
than others. There is one technical point to observe in this viewpoint: while
the function $\chi_T = \frac{1}{2T}\,\chi_{[-T,T]}$ used in the preceding illustration is even,
this will not be the case in general. Instead, in thinking of convolution as a
weighted averaging, it is perhaps better to set $g^*(x) = g(-x)$ and write

$$(f * g)(x) = \int f(y)\, g^*(y - x)\, dy = \mathrm{Avg}_{g^*} f(x),$$

the average of f around the point x corresponding to the weighting of the
real line by $g^*(x) = g(-x)$. Alternatively, since convolution is commutative,
we can equally view it as an averaging of g using the weighting corresponding
to $f^*(x) = f(-x)$.
Looking ahead to Section 1.5, let us consider what happens to the convolution
$f * \chi_T = \mathrm{Avg}_T f$ as T → 0. The function $\chi_T = \frac{1}{2T}\,\chi_{[-T,T]}$ becomes a
taller and taller “spike” centered at the origin, with the height of the spike
being chosen so that the integral of χT is always 1. Intuitively, averaging over
smaller and smaller intervals should give values (f ∗ χT )(x) that are closer and
closer to the original value f (x). This intuition is made precise in Lebesgue’s
Differentiation Theorem (Theorem A.30), which implies that if f ∈ L1 (R)
then for almost every x (including all those in the Lebesgue set of f ) we will
have
$$f(x) = \lim_{T \to 0} (f * \chi_T)(x) = \lim_{T \to 0} \mathrm{Avg}_T f(x).$$
Thus f ≈ f ∗ χT when T is small. In this sense, while there is no identity
element for convolution in L1 (R), the function χT is approximately an iden-
tity for convolution, and the approximation becomes better and better the
smaller T becomes.
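The shrinking-average behavior can be observed numerically. The following is a minimal sketch, not part of the text's development: the helper name `avg`, the midpoint rule, and the test function $f(x) = e^{-x^2}$ are all our own illustrative choices.

```python
import math

def avg(f, x, T, n=2000):
    """Approximate Avg_T f(x) = (1/(2T)) * integral of f over [x-T, x+T]
    using the midpoint rule with n subintervals."""
    h = 2 * T / n
    total = sum(f(x - T + (k + 0.5) * h) for k in range(n))
    return total * h / (2 * T)

# Illustrative (assumed) test function and evaluation point.
f = lambda t: math.exp(-t * t)
x = 0.5

# Distance between the average and the point value for shrinking T.
errors = [abs(avg(f, x, T) - f(x)) for T in (1.0, 0.1, 0.01)]
print(errors)
```

For a smooth f such as this one, the printed errors shrink as T decreases, in line with f ≈ f ∗ χT for small T. The sketch says nothing about exceptional points of a general integrable f.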
Moreover, a similar phenomenon happens for the more general averaging
operators $f \mapsto f * g = \mathrm{Avg}_{g^*} f$. We can take any particular function g ∈ L1 (R)
and dilate it so that it becomes more and more compressed towards the origin,
yet always keeping the total integral the same, by setting

$$g_\lambda(x) = \lambda\, g(\lambda x), \qquad \lambda > 0,$$

as is done in Notation 1.5. Compressing g towards the origin corresponds to
letting λ increase towards infinity (as opposed to T → 0 in the discussion of
χT above). Even if g is not compactly supported, it becomes more and more
“spike-like” as λ increases (see the illustration in Figure 1.7).

Fig. 1.7. Top: The function $g(x) = \cos(x)/(1 + x^2)$. Bottom: The dilated function
$g_5(x) = 5g(5x)$.

If it is the case that $\int g = 1$ (so $\int g_\lambda = 1$ also), then we will see in
Section 1.5 that, for any f ∈ L1 (R), the convolution f ∗ gλ converges to f in
L1 -norm (and possibly in other senses as well, depending on properties of f
and g). The family {gλ }λ>0 is an example of what we will call an approximate
identity in Section 1.5.
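A rough numerical illustration of this L1 convergence is sketched below. Everything specific in it is an assumption made for the example: the Gaussian $g(x) = e^{-\pi x^2}$ (which has integral 1), the tent function f, the grid spacing, and the truncation of the real line to [−4, 4]. Integrals are replaced by Riemann sums, so the printed numbers are only approximations of $\|f * g_\lambda - f\|_1$.

```python
import math

def g(x):
    # Gaussian with integral 1 over the real line (assumed choice of g).
    return math.exp(-math.pi * x * x)

def g_lam(x, lam):
    # Dilation g_lambda(x) = lambda * g(lambda * x), which preserves the integral.
    return lam * g(lam * x)

def f(x):
    # Tent function supported on [-1, 1] (assumed choice of f).
    return max(0.0, 1.0 - abs(x))

dx = 0.02
xs = [-4.0 + dx * i for i in range(401)]   # evaluation grid on [-4, 4]
ys = [-1.0 + dx * j for j in range(101)]   # grid covering the support of f

def l1_error(lam):
    """Riemann-sum approximation of the L1 norm of f * g_lambda - f."""
    err = 0.0
    for x in xs:
        conv = sum(f(y) * g_lam(x - y, lam) for y in ys) * dx
        err += abs(conv - f(x)) * dx
    return err

errors = [l1_error(lam) for lam in (1, 2, 4, 8)]
print(errors)
```

The errors decrease as λ grows, which is the behavior the approximate identity results of Section 1.5 make precise.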
From this discussion we can see at least an intuitive reason why there is
no identity function for convolution in L1 (R). Consider the functions gλ , each
an integrable function with integral 1 that becomes more and more spike-like
as λ increases. Suppose that we could let λ → ∞ and obtain in the limit
an integrable function δ that, like each function gλ , has integral 1, but is
indeed a spike supported entirely at the origin. Then we would hopefully have
that $f * \delta = \lim_{\lambda \to \infty} f * g_\lambda = f$, and so δ would be an identity function for
convolution. And indeed, it is not uncommon to see informal wording similar
to the following.

“Let δ be the function on R that has the property that δ(x) = 0 for
all x ≠ 0 and $\int \delta(x)\,dx = 1$. Then

$$\int f(y)\,\delta(x - y)\,dy = f(x).\text{”} \tag{1.16}$$

However, there is no such function δ. Any function that is zero for all x ≠ 0
is zero almost everywhere, and hence is the zero element of L1 (R). If δ(x) = 0
for x ≠ 0, then the Lebesgue integral of δ is $\int \delta(x)\,dx = 0$, not 1, even if we
define δ(0) = ∞. Thus f ∗ δ = 0, not f.
We cannot construct a function δ that has the property that f ∗δ = f for all
f ∈ L1 (R). However, we can construct families {gλ }λ>0 that have the property
that f ∗ gλ converges to f in various senses, and these are the approximate
identities of Section 1.5. We can also construct objects that are not functions
but which are identities for convolution—we will see the δ-distribution in
Chapter 4 and the δ-measure in Chapter 5. In effect, the integral appearing
in equation (1.16) is not a Lebesgue integral but rather is simply a shorthand
for something else, namely, the action of the distribution or measure δ on the
function f.

1.3.6 Convolution as an Inner Product

It is often useful to write a convolution in one of the following forms (the
involution $\tilde{g}(x) = \overline{g(-x)}$ was introduced in Definition 1.12):

$$\begin{aligned}
(f * g)(x) &= \int f(y)\, g(x - y)\, dy \\
&= \int f(y)\, \overline{\tilde{g}(y - x)}\, dy \\
&= \int f(y)\, \overline{T_x \tilde{g}(y)}\, dy = \langle f, T_x \tilde{g} \rangle.
\end{aligned} \tag{1.17}$$

Thus we can view the convolution of f with g at the point x as the inner
product of f with the function $\tilde{g}$ translated by x.

Notation 1.33. In equation (1.17), we have used the notation $\langle \cdot, \cdot \rangle$, which in
the context of functions usually denotes the inner product on L2 (R). However,
neither f nor $T_x \tilde{g}$ need belong to L2 (R), so we are certainly taking some
poetic license in speaking of $\langle f, T_x \tilde{g} \rangle$ as an inner product of f and $T_x \tilde{g}$. We
do this because in this volume we so often encounter integrals of the form
$\int f(x)\, \overline{g(x)}\, dx$ and direct generalizations of these integrals that it is extremely
convenient for us to retain the notation $\langle f, g \rangle$ for such an integral whenever it
makes sense. Specifically, if f and g are any measurable functions on R, then
we will write

$$\langle f, g \rangle = \int f(x)\, \overline{g(x)}\, dx$$

whenever this integral exists. In another language, $\langle \cdot, \cdot \rangle$ is a sesquilinear form
(linear in the first variable, antilinear in the second) that extends the inner
product on L2 (R). Although it is technically an abuse of terminology, we will
often refer to $\langle f, g \rangle$ as the inner product of f with g even when f and g are
not in L2 (R) or another Hilbert space. ♦
In this volume we will encounter sesquilinear forms much more often than
bilinear forms. A sesquilinear form is a function of two variables that is linear
in the first variable but antilinear in the second, while a bilinear form is linear
in both variables. The prefix “sesqui-” means “one and a half.”
Note that in the calculation in equation (1.17), all that we know is that f
and $T_x \tilde{g}$ each belong to L1 (R). Since the product of L1 functions does not
belong to L1 in general, the integral appearing in equation (1.17) is not going to
exist for every f, g, and x. Yet Theorem 1.28 implies the following unexpected
fact.

Exercise 1.34. Show that

$$f, g \in L^1(\mathbb{R}) \implies f \cdot T_x \tilde{g} \in L^1(\mathbb{R}) \quad \text{for a.e. } x.$$ ♦

Thus (thanks to Fubini and his theorem), even if we only assume that f
and g are integrable, the “inner product” $(f * g)(x) = \langle f, T_x \tilde{g} \rangle$ exists for
almost every x.

1.3.7 Convolution and Smoothing

Since convolution is a type of averaging, it tends to be a smoothing operation.


Generally speaking, a convolution f ∗g inherits the “best” properties of both f
and g. The following theorems and exercises will give several illustrations of
this. We begin with an easy but very useful exercise.
Exercise 1.35. Show that

f ∈ Cc (R), g ∈ Cc (R) =⇒ f ∗ g ∈ Cc (R),

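and in this case we have

$$\operatorname{supp}(f * g) \subseteq \operatorname{supp}(f) + \operatorname{supp}(g) = \bigl\{ x + y : x \in \operatorname{supp}(f),\ y \in \operatorname{supp}(g) \bigr\}.$$ ♦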

Next we see an example of how a convolution f ∗ g can inherit smoothness
from either f or g. The proof of this result uses a standard “extension by
density” argument, which is a very useful technique for solving many of the
exercises in this and other sections.

Theorem 1.36. We have


f ∈ L1 (R), g ∈ C0 (R) =⇒ f ∗ g ∈ C0 (R).
Proof. Note that if f ∈ L1 (R) and g ∈ C0 (R), then Exercise 1.27 implies that
f ∗g exists and is bounded. Also, since g ∈ C0 (R), we know that g is uniformly
continuous. Therefore, for x, h ∈ R,
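$$\begin{aligned}
|(f * g)(x) - (f * g)(x - h)|
&= \biggl| \int f(y)\, g(x - y)\, dy - \int f(y)\, g(x - h - y)\, dy \biggr| \\
&\le \int |f(y)|\, |g(x - y) - g(x - h - y)|\, dy \\
&\le \Bigl( \sup_{u \in \mathbb{R}} |g(u) - g(u - h)| \Bigr) \int |f(y)|\, dy \\
&= \|g - T_h g\|_\infty\, \|f\|_1 \to 0 \quad \text{as } h \to 0,
\end{aligned}$$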
where the convergence follows from the fact that g is uniformly continuous.
Thus f ∗ g ∈ Cb (R), and in fact f ∗ g is uniformly continuous. Actually, we
can make this argument much more succinct by making use of the fact that
convolution commutes with translation (Exercise 1.29). We need only write:
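$$\begin{aligned}
\|f * g - T_h (f * g)\|_\infty &= \|f * g - f * (T_h g)\|_\infty \\
&= \|f * (g - T_h g)\|_\infty \\
&\le \|f\|_1\, \|g - T_h g\|_\infty \to 0 \quad \text{as } h \to 0.
\end{aligned}$$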
To show that f ∗ g ∈ C0 (R), consider first the case where g ∈ Cc (R). Then
supp(g) ⊆ [−N, N ] for some N > 0. Hence
$$\begin{aligned}
|(f * g)(x)| &\le \int_{x-N}^{x+N} |f(y)|\, |g(x - y)|\, dy \\
&\le \|g\|_\infty \int_{x-N}^{x+N} |f(y)|\, dy \to 0 \quad \text{as } |x| \to \infty.
\end{aligned}$$

This shows that f ∗ g ∈ C0 (R) whenever g ∈ Cc (R).


Now we extend by density to all of C0 (R). Choose an arbitrary g ∈ C0 (R).
Since Cc (R) is dense in C0 (R), we can find functions gn ∈ Cc (R) such that
gn → g in L∞ -norm. By our previous work we know that f ∗ gn ∈ C0 (R) for
every n, and, by equation (1.11),
$$\|f * g - f * g_n\|_\infty \le \|f\|_1\, \|g - g_n\|_\infty \to 0 \quad \text{as } n \to \infty.$$
Thus f ∗ gn → f ∗ g in L∞ -norm. Since f ∗ gn ∈ C0 (R) for every n and since
C0 (R) is a closed subspace of L∞ (R), we conclude that f ∗ g ∈ C0 (R). ⊓⊔

Remark 1.37. It is important to observe that the function f ∈ L1 (R) in the


statement of Theorem 1.36 is really an equivalence class of functions that
are equal almost everywhere. In this sense f is only defined a.e., yet the
convolution f ∗ g is a continuous function that is defined everywhere. In
particular, changing f on a set of measure zero has no effect on the values
$(f * g)(x) = \int f(y)\, g(x - y)\, dy$ for x ∈ R. ♦
There are other ways to go about proving Theorem 1.36. For example,
the next exercise suggests a slightly different way of giving an extension by
density argument.
Exercise 1.38. Let fn , gn ∈ Cc (R) be such that fn → f in L1 -norm while
gn → g in L∞ -norm, and show that fn ∗ gn → f ∗ g in L∞ -norm. Since
fn ∗ gn ∈ Cc (R) by Exercise 1.35, and since C0 (R) is closed in L∞ -norm, it
follows that f ∗ g ∈ C0 (R). ♦
The following exercise extends the pairing $(L^1, C_0)$ considered in Theorem 1.36
to pairings $(L^p, L^{p'})$ where 1 < p < ∞ (and in so doing improves
on Exercise 1.27). A weaker conclusion also holds for the pairing $(L^1, L^\infty)$.
An extension by density argument similar to that of either Theorem 1.36 or
Exercise 1.38 is useful in proving this next exercise.
Exercise 1.39. (a) Show that if 1 < p < ∞, then

$$f \in L^p(\mathbb{R}),\ g \in L^{p'}(\mathbb{R}) \implies f * g \in C_0(\mathbb{R}).$$

(b) For (p, p′ ) = (1, ∞) or (p, p′ ) = (∞, 1), show that


f ∈ L1 (R), g ∈ L∞ (R) =⇒ f ∗ g ∈ Cb (R),
and f ∗ g is uniformly continuous. Show further that if g is compactly
supported then f ∗ g ∈ C0 (R). However, give an example that shows that
if g is not compactly supported then we need not have f ∗ g ∈ C0 (R), even
if f is compactly supported. ♦

1.3.8 Convolution and Differentiation


Not only is convolution well-behaved with respect to continuity, but we can
extend to higher derivatives.
Exercise 1.40. Given 1 ≤ p < ∞ and m ≥ 0, show that
$$f \in L^p(\mathbb{R}),\ g \in C_c^m(\mathbb{R}) \implies f * g \in C_0^m(\mathbb{R}),$$
and
$$f \in L^\infty(\mathbb{R}),\ g \in C_c^m(\mathbb{R}) \implies f * g \in C_b^m(\mathbb{R}).$$
Further, writing $D^j g = g^{(j)}$ for the jth derivative, show that differentiation
commutes with convolution, i.e.,
$$D^j (f * g) = f * D^j g, \qquad j = 0, \dots, m.$$ ♦

In particular, for any 1 ≤ p ≤ ∞, if f ∈ Lp (R) is compactly supported
and $g \in C_c^m(\mathbb{R})$, then $f * g \in C_c^m(\mathbb{R})$. This gives us an easy mechanism
for generating new elements of $C_c^m(\mathbb{R})$ given any one particular element g.
Moreover, if g is infinitely differentiable, then we can apply Exercise 1.40 for
every m, and as a consequence obtain the following corollary.

Corollary 1.41. If 1 ≤ p < ∞, then

f ∈ Lp (R), g ∈ Cc∞ (R) =⇒ f ∗ g ∈ C0∞ (R),

and
f ∈ L∞ (R), g ∈ Cc∞ (R) =⇒ f ∗ g ∈ Cb∞ (R).
Moreover, in either case, if f is also compactly supported then we have f ∗ g ∈
Cc∞ (R). ♦

So from one function in Cc∞ (R) we can generate many others. But this raises
the question: Are there any functions that are both compactly supported and
infinitely differentiable? It is not at all obvious at first glance whether there
exist any functions with such extreme properties, but the following exercise
constructs some examples (see also Problem 1.20).
Exercise 1.42. Define $f(x) = e^{-1/x^2}\, \chi_{(0,\infty)}(x)$.
(a) Show that for every n ∈ N, there exists a polynomial $p_n$ of degree 3n such
that
$$f^{(n)}(x) = p_n(x^{-1})\, e^{-x^{-2}}\, \chi_{(0,\infty)}(x).$$
Conclude that f ∈ Cb∞ (R), and, in particular, f (n) (0) = 0 for every n ≥ 0.
Note that supp(f ) = [0, ∞).
(b) Show that if a < b, then g(x) = f (x − a) f (b − x) belongs to Cc∞ (R) and
satisfies supp(g) = [a, b] and g(x) > 0 on (a, b). ♦
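The construction in part (b) is easy to experiment with numerically. The sketch below is only an illustration: the function names `f` and `bump`, the endpoints a = −1 and b = 2, and the sample points are our own choices, and floating-point evaluation cannot of course verify smoothness.

```python
import math

def f(x):
    """f(x) = exp(-1/x^2) for x > 0 and f(x) = 0 for x <= 0."""
    if x <= 0:
        return 0.0
    t = 1.0 / x
    return math.exp(-(t * t))

def bump(x, a, b):
    """g(x) = f(x - a) * f(b - x): positive on (a, b), zero elsewhere."""
    return f(x - a) * f(b - x)

a, b = -1.0, 2.0   # assumed endpoints for the illustration

# Sample points strictly inside (a, b), and points at or beyond the endpoints.
inside = [bump(a + (b - a) * k / 10, a, b) for k in range(1, 10)]
outside = [bump(t, a, b) for t in (-3.0, -1.0, 2.0, 5.0)]
print(inside)
print(outside)
```

The sampled values are strictly positive inside (a, b) and exactly zero at the endpoints and beyond, consistent with supp(g) = [a, b].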

1.3.9 Convolution and Banach Algebras

A Banach algebra is a Banach space that is closed under an additional


“multiplication-like” operation. Here is the precise abstract definition.

Definition 1.43 (Algebra). An algebra is a vector space A such that for


each f, g ∈ A there exists a unique product f g ∈ A that satisfies the following
conditions for all f, g, h ∈ A and α ∈ C:
(a) (f g)h = f (gh),
(b) f (g + h) = f g + f h and (f + g)h = f h + gh, and
(c) α(f g) = (αf )g = f (αg).
If f g = gf for all f, g ∈ A then A is commutative. If there exists an
element e ∈ A such that ef = f e = f for every f ∈ A then A is an algebra
with identity. ♦

Definition 1.44 (Banach Algebra). A normed algebra is a normed linear


space A that is an algebra and also satisfies

$$\forall\, f, g \in A, \quad \|f g\| \le \|f\|\, \|g\|.$$

A Banach algebra is a normed algebra that is a Banach space, i.e., it is a


complete normed algebra. ♦

Since $\|f * g\|_1 \le \|f\|_1\, \|g\|_1$ and convolution is commutative, L1 (R) is a


commutative Banach algebra with respect to the operation of convolution.
However, by Exercise 1.32 it has no identity element: There is no function
δ ∈ L1 (R) that satisfies f ∗ δ = f for all f ∈ L1 (R). Here are some other
examples of Banach algebras.

Example 1.45. (a) Cb (R) is a commutative Banach algebra with identity under
the operation of pointwise products of functions, (f g)(x) = f (x)g(x).
(b) C0 (R) is a commutative Banach algebra without identity with respect
to pointwise products.
(c) Since the operator norm is submultiplicative, if X is a Banach space
then the space B(X) of all bounded linear operators mapping X into itself is
a noncommutative Banach algebra with identity with respect to composition
of operators. ♦

We will explore some aspects of the Banach algebra structure of L1 (R) and
its relatives next. First we create another Banach algebra that isometrically
inherits its structure from L1 (R) via the Fourier transform.

Exercise 1.46. Define

$$A(\mathbb{R}) = \bigl\{ \hat{f} : f \in L^1(\mathbb{R}) \bigr\},$$

and set $\|\hat{f}\|_A = \|f\|_1$ for f ∈ L1 (R). If we assume that the Fourier transform
is injective on L1 (R), which we prove later in Theorem 1.92, then $\|\cdot\|_A$ is well-defined.
Given this, prove that $\|\cdot\|_A$ is a norm and that A(R) is a Banach space
with respect to this norm. Prove also that A(R) is a commutative Banach
algebra without identity with respect to the operation of pointwise products
of functions. ♦

The space A(R) is known by a variety of names, including both the Fourier
algebra and the Wiener algebra, although it should be noted that both of these
terms are sometimes used to refer to other spaces.
Our definition of A(R) is implicit—it is the set of all Fourier transforms
of L1 functions, and in fact is isometrically isomorphic to L1 (R). However,
we can ask whether there is an explicit description of A(R)—is it possible
to say that F ∈ A(R) based directly on properties of F ? For example, we
know that A(R) ⊆ C0 (R) by the Riemann–Lebesgue Lemma (Theorem 1.20),

so continuity and decay at infinity of F are necessary conditions for F to


belong to A(R), but are there further conditions that we can place on F that
are both necessary and sufficient for membership in A(R)? No such explicit
characterization is known. We will see later both implicit and explicit proofs
that A(R) is a dense but proper subset of C0 (R), see Exercises 1.100–1.102.
As in abstract ring theory, the concept of an ideal plays an important role
in the theory of Banach algebras. Ideals are the black holes of the algebra,
sucking any product of an algebra element with an ideal element into the ideal.
Since we are mostly interested in the algebra L1 (R) we state the definition
for commutative Banach algebras; in a noncommutative Banach algebra we
would need to distinguish between left and right ideals.

Definition 1.47 (Ideals). A subspace I of a commutative Banach algebra


A is an ideal in A if

x ∈ A, y ∈ I =⇒ xy ∈ I. ♦

In particular, a subspace I of L1 (R) is an ideal in L1 (R) if f ∗ g ∈ I


whenever f ∈ L1 (R) and g ∈ I.

Exercise 1.48. Suppose that g ∈ L1 (R).


(a) Show that g ∗ L1 (R) = {g ∗ f : f ∈ L1 (R)} is an ideal in L1 (R), called the
ideal generated by g. Give an example that shows that g need not belong
to g ∗ L1 (R).
(b) Show that
$$I(g) = \overline{g * L^1(\mathbb{R})}$$
is a closed ideal in L1 (R), called the closed ideal generated by g. Prob-
lem 1.39 will later show that g always belongs to I(g). Assuming this fact
for now, prove that I(g) is the smallest closed ideal that contains g. ♦

An ideal of the form I(g) is also called a principal ideal in L1 (R). Is


every closed ideal in L1 (R) a principal ideal? Atzmon [Atz72] answered this
longstanding question in 1972, showing that there exist closed ideals in L1 (R)
that are not of the form I(g).
Some Banach algebras also have an additional operation that has properties
similar to those of conjugation of complex numbers.

Definition 1.49 (Involution). An involution on a Banach algebra A is a


mapping x 7→ x∗ of A into itself that satisfies the following for all x, y ∈ A
and all scalars α ∈ C:
(a) (x∗ )∗ = x,
(b) (xy)∗ = y ∗ x∗ ,
(c) (x + y)∗ = x∗ + y ∗ , and
(d) (αx)∗ = ᾱx∗ . ♦

For example, if H is a Hilbert space then the adjoint operation is an


involution on B(H), since we have $(AB)^* = B^* A^*$ for all A, B ∈ B(H). In the
language of operator theory, because this involution satisfies $\|A^* A\| = \|A\|^2$
we say that B(H) is a C∗-algebra.
There is also an involution on L1 (R).

Exercise 1.50. Given f ∈ L1 (R), define $\tilde{f}(x) = \overline{f(-x)}$. Show that $f \mapsto \tilde{f}$
defines an involution on L1 (R) with respect to convolution. ♦

1.3.10 Convolution on General Domains


Convolution can be defined for functions with domains other than the real
line. We give two specific examples.
First, consider the discrete analogue of functions, i.e., sequences.
Definition 1.51 (Convolution of Sequences). Let $a = (a_k)_{k \in \mathbb{Z}}$ and $b = (b_k)_{k \in \mathbb{Z}}$
be sequences of complex scalars. Then the convolution of a with b is
the sequence $a * b = \bigl( (a * b)_k \bigr)_{k \in \mathbb{Z}}$ given by

$$(a * b)_k = \sum_{j \in \mathbb{Z}} a_j\, b_{k-j}, \tag{1.18}$$

whenever this series converges. ♦


Exercise 1.52. Wherever it makes sense, derive analogues of the theorems of
this section for convolution of sequences. For example, results dealing with the
continuity or differentiability of convolution of functions will have no analogue
for sequences, but results dealing with integrability will have an analogue in
terms of summability. In particular, prove analogues of Young’s Inequality for
sequences. Show that, unlike L1 (R), there does exist an element δ ∈ ℓ1 (Z)
that is an identity for convolution, and we have
∀ 1 ≤ p ≤ ∞, ∀ f ∈ ℓp (Z), f ∗ δ = f. ♦
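For finitely supported sequences, equation (1.18) is a finite sum and can be tried out directly. The sketch below is a small illustration and not a substitute for the exercise: a sequence is stored as a dictionary from indices to nonzero values (a representation chosen here for convenience), the helper names `convolve` and `norm1` are ours, and the identity candidate is the sequence with $\delta_0 = 1$ and all other entries 0.

```python
def convolve(a, b):
    """Convolution (a * b)_k = sum_j a_j b_{k-j} of finitely supported
    sequences, each stored as a dict {index: value}."""
    out = {}
    for j, aj in a.items():
        for i, bi in b.items():
            # The pair (j, i) contributes to index k = j + i, i.e. i = k - j.
            out[j + i] = out.get(j + i, 0) + aj * bi
    return out

def norm1(a):
    """The l^1 norm of a finitely supported sequence."""
    return sum(abs(v) for v in a.values())

delta = {0: 1}               # delta_0 = 1, all other entries 0
a = {-1: 2, 0: -3, 2: 5}     # example sequences (arbitrary choices)
b = {0: 1, 1: 4}

print(convolve(a, delta))    # same entries as a
print(convolve(a, b))
print(norm1(convolve(a, b)), norm1(a) * norm1(b))
```

In this example a ∗ δ reproduces a, and the ℓ1 norm of a ∗ b (44) does not exceed the product of the ℓ1 norms (50), as the p = 1 case of Young's Inequality for sequences predicts.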
For a second example, consider periodic functions on the real line. To be
precise, we say that a function f : R → C is 1-periodic if
∀ x ∈ R, f (x + 1) = f (x).
We define
$$L^p(\mathbb{T}) = \bigl\{ f : f \text{ is 1-periodic and } f \in L^p[0, 1) \bigr\}.$$
Definition 1.53 (Convolution of Periodic Functions). Let f : R → C
and g : R → C be 1-periodic measurable functions. Then the convolution of f
with g is the function f ∗ g given by

$$(f * g)(x) = \int_0^1 f(y)\, g(x - y)\, dy, \tag{1.19}$$

whenever this integral is well-defined. ♦



Note that f ∗ g, if it exists, will be a 1-periodic function.


Exercise 1.54. Wherever it makes sense, derive analogues of the theorems
of this section for convolution of periodic functions. In particular, prove ana-
logues of Young’s Inequality for periodic functions. Show that there is no
function δ ∈ L1 (T) that is an identity for convolution. ♦
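Equation (1.19) can be approximated by an equally spaced Riemann sum over one period. The sketch below is illustrative only: the helper name `periodic_convolve`, the number of sample points, and the test function c(x) = cos(2πx) are assumptions made for the example. For this particular function a direct calculation gives (c ∗ c)(x) = ½ cos(2πx), which the sketch compares against.

```python
import math

def periodic_convolve(f, g, x, n=64):
    """Riemann-sum approximation of the integral over [0, 1) of
    f(y) * g(x - y) dy, using n equally spaced sample points."""
    return sum(f(k / n) * g(x - k / n) for k in range(n)) / n

# A 1-periodic test function (assumed choice).
c = lambda t: math.cos(2 * math.pi * t)

x = 0.3
approx = periodic_convolve(c, c, x)
exact = 0.5 * c(x)                          # closed form for this example
shifted = periodic_convolve(c, c, x + 1.0)  # value one period later
print(approx, exact, shifted)
```

The approximation agrees with the closed form up to rounding, and the values at x and x + 1 coincide, reflecting the fact that f ∗ g is again 1-periodic.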
There is another way to view 1-periodic functions. If we define T = [0, 1),
then T is an abelian group under the operation ⊕ of addition modulo 1,
x ⊕ y = x + y mod 1,
where a mod 1 is the fractional part of a. Topologically, T is homeomorphic to
the unit circle in R2 or C (thus, “T” for “torus”, though only a 1-dimensional
torus in our setting). Hence we can identify 1-periodic functions on R with
functions on the group T. Writing + for the operation on T instead of ⊕,
convolution of functions on the domain T is again defined by equation (1.19).
Clearly, there must be a broader context here. The three domains for con-
volution that we have considered, namely, R, Z, and T, are all locally compact
abelian groups. That is, not only are they abelian groups, but they are also
endowed with a topology that is locally compact (every point has a neigh-
borhood that is compact). But convolution seems to involve more—we need
the existence of Lebesgue measure on R and T in order to define the integral
in the definition of convolution of functions, and we implicitly need counting
measure on Z to define the series in the definition of the convolution of se-
quences. It is an amazing fact that every locally compact abelian group G has
a positive, regular, translation-invariant Borel measure µ, and this measure is
unique up to scaling by positive constants. This measure µ is called the Haar
measure on G, see [Nac65] or [Fol99]. For R and T this is Lebesgue measure,
and for Z it is counting measure. We are not going to define terms precisely
here; we simply observe that convolution is part of a much broader universe.
Indeed, this is true of this entire volume: Much of what we do for the
Fourier transform on R has analogues that hold for the setting of locally
compact abelian groups. There is a general abstract theory of the Fourier
transform on locally compact abelian groups, but not every result for the
Fourier transform on R has an analogue in that general setting. We explore
this briefly in Sections 2.1 and 2.2. In some ways, our chosen setting of the
real line is among the most “complex” of the locally compact abelian groups,
as R is neither compact (like T) nor discrete (like Z).
And the story does not end with abelian groups. If G is a locally com-
pact group that is not abelian then we have to distinguish between left and
right translations, but still G will have a unique left Haar measure µL (unique
up to scale, left-translation invariant), and a unique right Haar measure µR
(unique up to scale, right-translation invariant). The reader can consider how
to define convolution in this setting, and what properties that convolution
will possess—there will be a difference between “left” and “right” convolu-
tion. Still, convolution carries over without too many complications. On the

other hand, the analogue of the Fourier transform becomes a very difficult
(and interesting) object to define and study on nonabelian groups, and this
is the topic of the representation theory of locally compact groups. For these
generalizations we direct the reader to texts such as those by Folland [Fol95],
Rudin [Rud62], or Hewitt and Ross [HR79].

Additional Problems
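1.13. Let $f(x) = e^{-|x|}$, $g(x) = e^{-x^2}$, and $h(x) = x e^{-x^2}$. Show that

$$\begin{aligned}
(f * f)(x) &= (1 + |x|)\, e^{-|x|}, \\
(g * g)(x) &= \Bigl( \frac{\pi}{2} \Bigr)^{1/2} e^{-x^2/2}, \\
(h * h)(x) &= \frac{1}{4} \Bigl( \frac{\pi}{2} \Bigr)^{1/2} (x^2 - 1)\, e^{-x^2/2}.
\end{aligned}$$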
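Before attempting the calculations in Problem 1.13, it can be reassuring to check the stated formulas numerically. The following sketch does this for f ∗ f and g ∗ g at a few points. The helper name `convolve_at`, the truncation of the real line to [−30, 30], the midpoint rule, and the sample points are our own assumptions, so agreement is only up to discretization error.

```python
import math

def convolve_at(u, v, x, L=30.0, n=60000):
    """Midpoint-rule approximation of (u * v)(x), the integral of
    u(y) v(x - y) dy, truncated to the interval [-L, L]."""
    h = 2 * L / n
    total = 0.0
    for k in range(n):
        y = -L + (k + 0.5) * h
        total += u(y) * v(x - y)
    return total * h

f = lambda t: math.exp(-abs(t))
g = lambda t: math.exp(-t * t)

points = (0.0, 0.7, -2.0)   # arbitrary sample points
err_f = max(abs(convolve_at(f, f, x) - (1 + abs(x)) * math.exp(-abs(x)))
            for x in points)
err_g = max(abs(convolve_at(g, g, x)
                - math.sqrt(math.pi / 2) * math.exp(-x * x / 2))
            for x in points)
print(err_f, err_g)
```

Both discrepancies are small, which is consistent with the first two formulas. The third formula can be tested the same way by passing h(x) = x e^{−x²} to the same helper.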

1.14. Show that if f, g ∈ L1 (R) and f, g ≥ 0 a.e., then $\|f * g\|_1 = \|f\|_1\, \|g\|_1$.
Find a function h ∈ L1 (R) such that $\|h * h\|_1 < \|h\|_1^2$.

1.15. If f, g ∈ L1 (R), then

$$\operatorname{supp}(f * g) \subseteq \overline{\operatorname{supp}(f) + \operatorname{supp}(g)}$$

(see Notation 1.25). Further, if f and g are both compactly supported then
so is f ∗ g and, in this case, supp(f ∗ g) ⊆ supp(f ) + supp(g).

1.16. Let Ap be the Babenko–Beckner constant defined in equation (1.14),


and prove the following facts.
(a) Ap < 1 for 1 < p < 2.
(b) Ap > 1 for 2 < p < ∞.
(c) $A_2 = \lim_{p \to 1^+} A_p = \lim_{p \to \infty} A_p = 1$.

1.17. Give an example of f ∈ Lp (R) with 1 < p < ∞ and g ∈ C0 (R) such that
f ∗ g is not defined. Compare this to Theorem 1.36 and Exercise 1.40, which
show that f ∗ g ∈ C0 (R) if either f ∈ L1 (R) and g ∈ C0 (R), or if f ∈ Lp (R)
and g ∈ Cc (R).

1.18. Prove the following variation on Exercise 1.40: If f ∈ L1 (R) and g ∈


Cbm (R), then f ∗ g ∈ Cbm (R).

1.19. Let 1 ≤ p ≤ ∞ and f ∈ Lp (R) be given. If there exists an h ∈ Lp (R)
such that
$$\lim_{a \to 0} \Bigl\| h - \frac{f - T_a f}{a} \Bigr\|_p = 0$$
then we call h a strong $L^p$ derivative of f and denote it by $h = \partial_p f$. Show that,
in this case, if $g \in L^{p'}(\mathbb{R})$ then f ∗ g is differentiable pointwise everywhere,
and $(f * g)' = \partial_p f * g$.

1.20. Set $f(x) = e^{-1/x}\, \chi_{[0,\infty)}(x)$.
(a) Show that there exists a polynomial $p_n$ of degree 2n such that
$f^{(n)}(x) = p_n(x^{-1})\, e^{-1/x}\, \chi_{[0,\infty)}(x)$.
(b) Let g(x) = f (1 − |x|2 ), and show that g ∈ Cc∞ (R) with supp(g) =
[−1, 1].

1.21. Create explicit examples of functions f ∈ Cc∞ (R) supported in the


interval [−1, 1] that have the following properties.
(a) 0 ≤ f (x) ≤ 1 and f (0) = 1.
(b) $\int f(x)\, dx = 0$.
(c) $\int f(x)\, dx = 1$.
(d) $\int f(x)\, dx = 1$, $\int x f(x)\, dx = 0$.

1.22. Let E ⊆ R be measurable, and show that

$$I(E) = \{ f \in L^1(\mathbb{R}) : \operatorname{supp}(\hat{f}) \subseteq E \}$$

is a closed ideal in L1 (R) (under the operation of convolution).

1.23. Define Ac (R) = {F ∈ A(R) : supp(F ) is compact}. Show that Ac (R) is


an ideal in A(R) (under the operation of pointwise multiplication). Compare
Problem 1.56, which shows that Ac (R) is dense in A(R), and hence is not a
closed ideal in A(R).

1.24. (a) Let E ⊆ R be measurable with 0 < |E| < ∞. By considering


χE ∗ χ−E , show that E − E = {x − y : x, y ∈ E} contains an open interval
(−ε, ε) for some ε > 0.
(b) Consider the equivalence relation x ∼ y ⇐⇒ x − y ∈ Q on R. By
the Axiom of Choice, there exists a set E that contains exactly one element
of each of the equivalence classes of ∼ . Use part (a) to show that E is not
measurable.
(c) Show that every subset of R that has positive exterior Lebesgue mea-
sure contains a nonmeasurable subset.

1.25. Let f (x, y) be a measurable function on Rm+n = Rm × Rn , and fix


1 ≤ p < ∞. Prove Minkowski’s Integral Inequality:
$$\biggl( \int \biggl( \int |f(x, y)|\, dy \biggr)^{p} dx \biggr)^{1/p} \le \int \biggl( \int |f(x, y)|^p\, dx \biggr)^{1/p} dy. \tag{1.20}$$

Remark: This equation may be more revealing if we rewrite it as

$$\biggl\| \int |f(\cdot, y)|\, dy \biggr\|_p \le \int \|f(\cdot, y)\|_p\, dy.$$

Thus, Minkowski’s Integral Inequality is an integral version of the Triangle


Inequality (also known as Minkowski’s Inequality) on Lp (Rm ).

1.4 The Duality Between Smoothness and Decay


A fundamental duality under the Fourier transform is the interchange between
smoothness and decay. We explore this duality in this section.

1.4.1 Decay in Time Implies Smoothness in Frequency

To begin, we show that if f has sufficient decay, then $\hat{f}$ must have a
corresponding amount of smoothness. Although it is an abuse of notation, we
will write $\bigl( (-2\pi i x)^k f(x) \bigr)^{\wedge}$ to denote the Fourier transform of the function
$g(x) = (-2\pi i x)^k f(x)$.

Theorem 1.55 (Smoothness and Decay I). Let f ∈ L1 (R) and m ∈ N be
given. Then
$$x^m f(x) \in L^1(\mathbb{R}) \implies \hat{f} \in C_0^m(\mathbb{R}),$$
i.e., $\hat{f}$ is m-times differentiable and $\hat{f}, \hat{f}', \dots, \hat{f}^{(m)} \in C_0(\mathbb{R})$. Furthermore, in
this case we have that $x^k f(x) \in L^1(\mathbb{R})$ for k = 0, . . . , m, and the kth derivative
of $\hat{f}$ is the Fourier transform of $(-2\pi i x)^k f(x)$:

$$\hat{f}^{(k)} = \frac{d^k}{d\xi^k} \hat{f} = \bigl( (-2\pi i x)^k f(x) \bigr)^{\wedge}, \qquad k = 0, \dots, m. \tag{1.21}$$ ♦

Before proving Theorem 1.55, we remark that the hypothesis $x^m f(x) \in L^1(\mathbb{R})$
is a kind of “average decay” condition. As |x| increases, the value of
$|x^m f(x)|$ becomes increasingly large compared to the value of |f (x)| (except at
points where f (x) = 0), yet still the area under the graph of $|x^m f(x)|$ must
remain finite. However, this does not imply that f decays pointwise at infinity.

Exercise 1.56. Let m ∈ N be fixed. Give an example of a continuous func-


tion f ∈ L1 (R) such that xm f (x) ∈ L1 (R) but limx→±∞ f (x) does not ex-
ist. Show that we can even construct examples where f is continuous and
unbounded. On the other hand, show that for any h > 0 we will have
$\lim_{R \to \pm\infty} \int_R^{R+h} |x^m f(x)|\, dx = 0$. ♦

Equation (1.21) does not come out of the blue. We can guess the correct
equation by formally exchanging a derivative and an integral:
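$$\begin{aligned}
\frac{d}{d\xi} \hat{f}(\xi) &= \frac{d}{d\xi} \int f(x)\, e^{-2\pi i \xi x}\, dx \\
&= \int f(x)\, \frac{d}{d\xi} e^{-2\pi i \xi x}\, dx \\
&= \int f(x)\, (-2\pi i x)\, e^{-2\pi i \xi x}\, dx \\
&= \bigl( -2\pi i x f(x) \bigr)^{\wedge}(\xi).
\end{aligned}$$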

Essentially, the proof of Theorem 1.55 is the justification of this interchange.


Proof (of Theorem 1.55). Consider the case m = 1. Assume that f (x),
xf (x) ∈ L1 (R). We must show that the limit

$$\hat{f}'(\xi) = \lim_{\eta \to 0} \frac{\hat{f}(\xi + \eta) - \hat{f}(\xi)}{\eta} = \lim_{\eta \to 0} \int f(x)\, \frac{e^{-2\pi i (\xi + \eta) x} - e^{-2\pi i \xi x}}{\eta}\, dx \tag{1.22}$$

exists. We do this by applying the Lebesgue Dominated Convergence Theorem.
Define $e_x(\xi) = e^{-2\pi i \xi x}$. Then the integrand in equation (1.22) converges
pointwise for almost every x, because

$$\lim_{\eta \to 0} f(x)\, \frac{e_x(\xi + \eta) - e_x(\xi)}{\eta} = f(x)\, e_x'(\xi) = f(x)\, (-2\pi i x)\, e^{-2\pi i \xi x}.$$

Fig. 1.8. The distance $|e^{i\theta} - 1|$ from $e^{i\theta}$ to 1 is smaller than the arclength θ along
the unit circle from $e^{i\theta}$ to 1.

Further, since we always have $|e^{i\theta} - 1| \le |\theta|$ (see the “proof by picture” in
Figure 1.8 for the case 0 ≤ θ ≤ π/2), we have

$$\biggl| f(x)\, \frac{e^{-2\pi i (\xi + \eta) x} - e^{-2\pi i \xi x}}{\eta} \biggr| = |f(x)|\, |e^{-2\pi i \xi x}|\, \biggl| \frac{e^{-2\pi i \eta x} - 1}{\eta} \biggr| \le 2\pi |x f(x)| \in L^1(\mathbb{R}).$$

Hence the Lebesgue Dominated Convergence Theorem implies that the limit
in equation (1.22) exists, and
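$$\begin{aligned}
\hat{f}'(\xi) &= \lim_{\eta \to 0} \int f(x)\, \frac{e^{-2\pi i (\xi + \eta) x} - e^{-2\pi i \xi x}}{\eta}\, dx \\
&= \int f(x)\, (-2\pi i x)\, e^{-2\pi i \xi x}\, dx \\
&= \bigl( (-2\pi i x) f(x) \bigr)^{\wedge}(\xi).
\end{aligned}$$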

Technically, the Dominated Convergence Theorem applies to sequences in-


dexed by the natural numbers, whereas above we need to let the continuous
variable η converge to zero. However, this can be justified by considering
all possible sequences $\eta_k \to 0$ (see Problem 1.29). In any case, we conclude
that $\hat{f}$ is differentiable. Further, since $\hat{f}'$ is the Fourier transform of the
integrable function (−2πix)f (x), the Riemann–Lebesgue Lemma implies that
$\hat{f}' \in C_0(\mathbb{R})$. Since we also have $\hat{f} \in C_0(\mathbb{R})$, we have shown that $\hat{f} \in C_0^1(\mathbb{R})$.
Exercise: Apply induction to extend to all m > 1. ⊓⊔
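The case k = 1 of equation (1.21) can be sanity-checked numerically for a specific function. The sketch below uses the Gaussian $f(x) = e^{-\pi x^2}$, for which $\hat{f}(\xi) = e^{-\pi \xi^2}$ is a standard fact. The helper name `fourier`, the truncation to [−8, 8], the midpoint rule, the point ξ = 0.4, and the step used in the difference quotient are all assumptions of this illustration.

```python
import cmath
import math

def fourier(u, xi, L=8.0, n=4000):
    """Midpoint-rule approximation of the integral of
    u(x) * exp(-2 pi i xi x) dx, truncated to [-L, L]."""
    h = 2 * L / n
    total = 0j
    for k in range(n):
        x = -L + (k + 0.5) * h
        total += u(x) * cmath.exp(-2j * math.pi * xi * x)
    return total * h

f = lambda x: math.exp(-math.pi * x * x)        # Gaussian test function
xf = lambda x: (-2j * math.pi * x) * f(x)       # the function (-2 pi i x) f(x)

xi = 0.4
step = 1e-5
# Centered difference quotient approximating the derivative of f-hat at xi.
lhs = (fourier(f, xi + step) - fourier(f, xi - step)) / (2 * step)
# Fourier transform of (-2 pi i x) f(x) at xi.
rhs = fourier(xf, xi)
# Closed form: derivative of exp(-pi xi^2).
exact = -2 * math.pi * xi * math.exp(-math.pi * xi * xi)
print(lhs, rhs, exact)
```

The three printed values agree to within the accuracy of the difference quotient, consistent with the statement that the derivative of the transform equals the transform of (−2πix)f (x).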

In operator notation, if we recall the position and momentum operators P
and M defined by
$$P g(x) = x g(x) \quad \text{and} \quad M g = \frac{1}{2\pi i}\, g',$$
then Theorem 1.55 says that if f, P f ∈ L1 (R) then we have

$$M \hat{f} = (-P f)^{\wedge}.$$

In other words, $M \mathcal{F} = -\mathcal{F} P$, and we will shortly see that $P \mathcal{F} = \mathcal{F} M$ as well.
Thus P and M are dual under the Fourier transform.

1.4.2 A Primer on Absolute Continuity

In order to prove our next main result, we will need to apply both the Fun-
damental Theorem of Calculus and the technique of integration by parts on
an interval [a, b]. Of course, if f is differentiable and f ′ is continuous on [a, b],
then these results can certainly be applied. However, the Fundamental The-
orem of Calculus and integration by parts hold more generally; in fact, they
apply as long as f is an absolutely continuous function. Therefore we pause
to briefly review the definition and basic properties of absolute continuity.

Definition 1.57 (Absolute Continuity). A function f : [a, b] → C is absolutely
continuous on [a, b] if for every ε > 0 there exists a δ > 0 such that
for any finite or countably infinite collection of nonoverlapping subintervals
$\{[a_j, b_j]\}_j$ of [a, b], we have

$$\sum_j (b_j - a_j) < \delta \implies \sum_j |f(b_j) - f(a_j)| < \varepsilon.$$

We denote the class of absolutely continuous functions on [a, b] by AC[a, b].
The space of locally absolutely continuous functions on R is

$$AC_{\mathrm{loc}}(\mathbb{R}) = \bigl\{ f : \mathbb{R} \to \mathbb{C} : f \in AC[a, b] \text{ for every } a < b \bigr\}.$$ ♦
The next exercise shows that the antiderivative of an integrable function
is absolutely continuous.
Exercise 1.58. Show that if $f \in L^1[a,b]$, then $g(x) = \int_a^x f(t)\,dt$ belongs to
$AC[a,b]$, and furthermore $g' = f$ a.e. ♦
In fact, much more holds. We refer to texts such as [BBT97], [Fol99],
[WZ77] for proof of the next result.
Theorem 1.59 (Fundamental Theorem of Calculus). If g : [a, b] → C,
then the following statements are equivalent.
(a) g ∈ AC[a, b].
(b) There exists $f \in L^1[a,b]$ such that
\[
g(x) - g(a) = \int_a^x f(t)\,dt, \qquad x \in [a,b].
\]
(c) $g$ is differentiable at almost every $x \in [a,b]$, $g' \in L^1[a,b]$, and
\[
g(x) - g(a) = \int_a^x g'(t)\,dt, \qquad x \in [a,b]. \qquad ♦
\]

Some basic properties of absolutely continuous functions are given in
Problem 1.32. In particular, every Lipschitz function on [a, b] is absolutely
continuous. A function f that is differentiable at every point and has a
bounded derivative f ′ is Lipschitz and hence absolutely continuous, even if
f ′ is not continuous.
It is a more subtle fact that if f is differentiable at every point on [a, b]
and f ′ is merely integrable on [a, b] then f is absolutely continuous. The main
steps needed to prove this are the following two results, which we state without
proof.
First we need a lemma that relates the size of f (E) to the integral of |f ′ |
on E. Since a measurable function need not map measurable sets to measurable
sets, this result must be worded in terms of the exterior Lebesgue measure
of f (E). The subtlety of this result is that E is an arbitrary measurable set,
not just an interval (for proof, see [BBT97, Lem. 7.10]).

Lemma 1.60. Let $f : [a,b] \to \mathbb{R}$ be measurable. If $E \subseteq [a,b]$ is measurable
and $f$ is differentiable at every point of $E$, then the exterior Lebesgue measure
of $f(E)$ satisfies the bound
\[
|f(E)|_e \le \int_E |f'|. \qquad ♦
\]

The next fact that we need is the Banach–Zarecki Theorem, which gives a
characterization of absolutely continuous functions (see [BBT97, Thm. 7.11]
for proof). Bounded variation is defined in Problem 1.30.
Theorem 1.61 (Banach–Zarecki Theorem). Given f : [a, b] → C, the fol-
lowing statements are equivalent.
(a) f ∈ AC[a, b].
(b) f is continuous, f ∈ BV[a, b], and |f (A)| = 0 for every A ⊆ [a, b] with
|A| = 0.
(c) f is continuous and is differentiable a.e., f ′ ∈ L1 [a, b], and |f (A)| = 0 for
every A ⊆ [a, b] with |A| = 0. ♦
Combining these two results, we obtain a useful sufficient condition for
absolute continuity.
Theorem 1.62. If f : [a, b] → C is everywhere differentiable and f ′ ∈ L1 [a, b],
then f ∈ AC[a, b]. ♦
Proof. By splitting into real and imaginary parts, we may assume that $f$ is
real-valued. Since $f$ is everywhere differentiable, it is continuous, and
$f' \in L^1[a,b]$ by hypothesis. Suppose that $A \subseteq [a,b]$ and $|A| = 0$. Then since
$f$ is differentiable at every point of $A$, we have by Lemma 1.60 that
$|f(A)|_e \le \int_A |f'| = 0$. Hence statement (c) of Theorem 1.61 holds, and
therefore $f$ is absolutely continuous. □
The hypotheses of Theorem 1.62 can be relaxed somewhat. For example,
if f is differentiable at all but countably many points and f ′ ∈ L1 [a, b], then
f will be absolutely continuous (see [Ben76]). However, the assumptions that
f is differentiable a.e. and f ′ ∈ L1 [a, b] are by themselves not sufficient to
ensure that f is absolutely continuous (the Cantor–Lebesgue function is a
counterexample, see Problem 1.33).
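
To see concretely how the Cantor–Lebesgue function violates Definition 1.57, the following sketch (an illustration added here, not the construction requested in Problem 1.33) evaluates the function exactly at triadic rationals through its ternary-digit description. It then sums its increments over the $2^n$ closed intervals that remain at stage $n$ of the middle-thirds construction. The function names `cantor`, `stage_intervals`, and `cover_stats` are hypothetical.

```python
from fractions import Fraction


def cantor(x, depth=60):
    """Cantor-Lebesgue function on [0, 1], exact for triadic rationals of small order."""
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x >= 1:
        return Fraction(1)
    value, scale = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        x *= 3
        digit = int(x)          # next ternary digit: 0, 1, or 2
        x -= digit
        if digit == 1:          # x lies in (the closure of) a removed middle third
            return value + scale
        if digit == 2:
            value += scale
        scale /= 2
        if x == 0:              # expansion terminated; remaining digits are 0
            break
    return value


def stage_intervals(n):
    """The 2^n closed intervals left after n steps of the middle-thirds construction."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            refined.append((a, a + third))
            refined.append((b - third, b))
        intervals = refined
    return intervals


def cover_stats(n):
    """Return (total length, total increment of the Cantor function) over stage n."""
    intervals = stage_intervals(n)
    total_length = sum(b - a for a, b in intervals)
    total_increment = sum(cantor(b) - cantor(a) for a, b in intervals)
    return total_length, total_increment


if __name__ == "__main__":
    for n in (1, 3, 6):
        print(n, cover_stats(n))
```

The total length is $(2/3)^n$, which tends to zero, while the total increment stays equal to 1. So for $\varepsilon = 1$ no $\delta$ can satisfy the definition of absolute continuity.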
As we can see from Theorem 1.61 (and it is easy to prove directly), every
absolutely continuous function has bounded variation. Although we will not
need to use it, we state a fundamental decomposition for functions of bounded
variation.
Definition 1.63 (Singular Function). A function h is singular if h is dif-
ferentiable at almost every point in its domain and h′ = 0 a.e. ♦
Theorem 1.64. If $f \in BV[a,b]$, then $f = g + h$ where $g \in AC[a,b]$ and $h$ is
singular on $[a,b]$. Moreover, $g$ and $h$ are unique up to additive constants, and
we can take
\[
g(x) = \int_a^x f', \qquad x \in [a,b]. \qquad ♦ \tag{1.23}
\]

1.4.3 Smoothness in Time Implies Decay in Frequency

Now we show how the Fourier transform converts smoothness into decay.
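Theorem 1.65 (Smoothness and Decay II). Let $f \in L^1(\mathbb{R})$ and $m \in \mathbb{N}$ be
given. If $f$ is everywhere $m$-times differentiable and $f, f', \dots, f^{(m)} \in L^1(\mathbb{R})$,
then
\[
\bigl(f^{(k)}\bigr)^{\wedge}(\xi) = (2\pi i\xi)^k\, \hat{f}(\xi), \qquad k = 0, \dots, m.
\]
Consequently,
\[
|\hat{f}(\xi)| \le \frac{\|f^{(m)}\|_1}{|2\pi\xi|^m}, \qquad \xi \ne 0. \tag{1.24}
\]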
Proof. Consider the case $m = 1$. Assume that $f$ is everywhere differentiable
and that $f, f' \in L^1(\mathbb{R})$. Then $f \in AC_{\mathrm{loc}}(\mathbb{R})$ by Theorem 1.62, so $f$ is
absolutely continuous on every interval $[a,b]$ in $\mathbb{R}$. Consequently the
Fundamental Theorem of Calculus (Theorem 1.59) holds for $f$, so
\[
f(x) - f(0) = \int_0^x f'(t)\,dt, \qquad x \in \mathbb{R}.
\]
Since $f'$ is integrable, it follows that
\[
\lim_{x\to\infty} f(x)
  = f(0) + \lim_{x\to\infty} \int_0^x f'(t)\,dt
  = f(0) + \int_0^\infty f'(t)\,dt
\]
exists. Since $f$ is continuous and integrable, this limit has to be zero:
$\lim_{x\to\infty} f(x) = 0$. Similarly $\lim_{x\to-\infty} f(x) = 0$, so we conclude that
$f \in C_0(\mathbb{R})$.
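Integration by parts is valid for absolutely continuous functions
(Problem 1.32), so for every $a < b$ we have
\[
\int_a^b f'(x)\, e^{-2\pi i\xi x}\,dx
  = e^{-2\pi i\xi b} f(b) - e^{-2\pi i\xi a} f(a)
    + 2\pi i\xi \int_a^b f(x)\, e^{-2\pi i\xi x}\,dx.
\]
Consequently, since $f'$ is integrable,
\[
\begin{aligned}
\widehat{f'}(\xi)
  &= \int_{-\infty}^{\infty} f'(x)\, e^{-2\pi i\xi x}\,dx \\
  &= \lim_{\substack{a\to-\infty \\ b\to\infty}} \int_a^b f'(x)\, e^{-2\pi i\xi x}\,dx \\
  &= \lim_{\substack{a\to-\infty \\ b\to\infty}}
     \Bigl( e^{-2\pi i\xi b} f(b) - e^{-2\pi i\xi a} f(a)
            + 2\pi i\xi \int_a^b f(x)\, e^{-2\pi i\xi x}\,dx \Bigr) \\
  &= 2\pi i\xi \int_{-\infty}^{\infty} f(x)\, e^{-2\pi i\xi x}\,dx \\
  &= 2\pi i\xi\, \hat{f}(\xi).
\end{aligned}
\]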

Finally, for $\xi \ne 0$ we have
\[
|\hat{f}(\xi)|
  = \frac{|\widehat{f'}(\xi)|}{|2\pi i\xi|}
  \le \frac{\|\widehat{f'}\|_\infty}{|2\pi\xi|}
  \le \frac{\|f'\|_1}{|2\pi\xi|}.
\]
Exercise: Use induction to extend to all $m > 1$. □
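
As a quick sanity check of the bound (1.24) with $m = 1$, consider the Gaussian $f(x) = e^{-\pi x^2}$. This worked example is an added illustration; it relies on the fact, cited later in the text from Exercise 1.109, that this Gaussian is its own Fourier transform.

```latex
\[
f'(x) = -2\pi x\, e^{-\pi x^2},
\qquad
\|f'\|_1 = 2 \int_0^\infty 2\pi x\, e^{-\pi x^2}\,dx
         = 2\,\Bigl[-e^{-\pi x^2}\Bigr]_0^\infty
         = 2,
\]
\[
|\hat{f}(\xi)| = e^{-\pi\xi^2}
  \;\le\; \frac{\|f'\|_1}{|2\pi\xi|} = \frac{1}{\pi|\xi|},
\qquad \xi \ne 0.
\]
```

The inequality is far from sharp here: the quantity $\pi|\xi|\, e^{-\pi\xi^2}$ attains its maximum $\sqrt{\pi/(2e)} \approx 0.76$ at $|\xi| = 1/\sqrt{2\pi}$, and the Gaussian decays much faster than $1/|\xi|$, as expected for an infinitely differentiable function.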

We extract and reformulate one useful fact that played a role in the proof
of Theorem 1.65.
Exercise 1.66. If $g \in L^1(\mathbb{R})$ and its antiderivative $f(x) = \int_{-\infty}^x g(t)\,dt$ also
belongs to $L^1(\mathbb{R})$, then $\int g = 0$ and $f \in C_0(\mathbb{R})$. ♦

The essential property used in the proof of Theorem 1.65 is absolute con-
tinuity. For example, if we consider m = 1, the assumption in Theorem 1.65
that f is everywhere differentiable and f, f ′ are integrable is made solely so
that we will know that f is absolutely continuous. Absolute continuity is cer-
tainly necessary, as is shown by the example of the reflected Cantor–Lebesgue
function (see Problem 1.33). That particular function f is continuous and com-
pactly supported, is differentiable a.e., and both f and f ′ belong to L1 (R).
However, $f$ is singular so $f' = 0$ a.e., and hence $\widehat{f'}$ is identically zero while
$(2\pi i\xi)\,\hat{f}(\xi)$ is not.
On the other hand, as long as f is absolutely continuous and both f, f ′ are
integrable, the proof of Theorem 1.65 can be carried over, and so we obtain
the following result.

Theorem 1.67. Suppose $f \in L^1(\mathbb{R}) \cap AC_{\mathrm{loc}}(\mathbb{R})$, which implies in particular
that $f$ is differentiable a.e. If $f' \in L^1(\mathbb{R})$, then $\widehat{f'}(\xi) = 2\pi i\xi\, \hat{f}(\xi)$ for every
$\xi \in \mathbb{R}$. ♦

This implies the following variation on Theorem 1.65, dealing with the
Fourier transform of an antiderivative of an integrable function.
Corollary 1.68. If $g \in L^1(\mathbb{R})$ and $f(x) = \int_{-\infty}^x g(t)\,dt$ also belongs to $L^1(\mathbb{R})$,
then $\hat{g}(\xi) = 2\pi i\xi\, \hat{f}(\xi)$ for $\xi \in \mathbb{R}$. ♦

1.4.4 The Riemann–Lebesgue Lemma Revisited

By Theorem 1.65, if $g$ is an everywhere differentiable function such that $g$
and $g'$ are both integrable, then
\[
|\hat{g}(\xi)| \le \frac{\|g'\|_1}{2\pi|\xi|} \to 0 \quad\text{as } |\xi| \to \infty.
\]
Hence the Riemann–Lebesgue Lemma (Theorem 1.20) holds for these
particular functions $g$. A consequence of Exercise 1.78, which comes later in
Section 1.5, is that the set
\[
\bigl\{ g \in L^1(\mathbb{R}) \,:\, g \text{ is everywhere differentiable and } g' \in L^1(\mathbb{R}) \bigr\}
\]
is a dense subset of $L^1(\mathbb{R})$. Assuming this fact for the moment, we can use
an extension by density argument to give another proof that the
Riemann–Lebesgue Lemma is valid for all $f \in L^1(\mathbb{R})$.

Proof (of Theorem 1.20). Choose any $f \in L^1(\mathbb{R})$ and $\varepsilon > 0$. Then there exists
a differentiable function $g$ with both $g, g' \in L^1(\mathbb{R})$ such that $\|f - g\|_1 < \varepsilon$. By
Theorem 1.65, and equation (1.24) in particular, we have $\hat{g} \in C_0(\mathbb{R})$, so there
exists an $R > 0$ such that $|\hat{g}(\xi)| < \varepsilon$ for all $|\xi| > R$. Now, for every $\xi \in \mathbb{R}$ we
have
\[
|\hat{f}(\xi) - \hat{g}(\xi)| \le \|\hat{f} - \hat{g}\|_\infty \le \|f - g\|_1 < \varepsilon,
\]
so $|\hat{f}(\xi)| < 2\varepsilon$ for all $|\xi| > R$. Hence $\hat{f} \in C_0(\mathbb{R})$ as well. □

Additional Problems

1.26. Illustrate the connection between smoothness and decay given in Theo-
rem 1.65 by computing the following Fourier transforms explicitly. Then make
the same comparisons for the functions given in Problem 1.1.
(a) Show that if $f(x) = (\cos 6\pi x)\, \chi_{[-1/2,1/2]}(x)$, then
\[
\hat{f}(\xi) = \frac{\xi \sin \pi\xi}{\pi\,(9 - \xi^2)}.
\]
Observe that $f$ is discontinuous, which is reflected in the fact that $\hat{f}$ decays
slowly at infinity, on the order of $1/|\xi|$.
(b) Show that if $g(x) = (\sin 6\pi x)\, \chi_{[-1/2,1/2]}(x)$, then
\[
\hat{g}(\xi) = \frac{3i \sin \pi\xi}{\pi\,(\xi^2 - 9)}.
\]
The function $g$ is continuous, and $\hat{g}$ decays on the order of $1/|\xi|^2$.

1.27. Let $f(x) = x e^{-x} \chi_{[0,\infty)}(x)$. Show both directly and by using
Theorem 1.65 that $\hat{f}(\xi) = (1 + 2\pi i\xi)^{-2}$.

1.28. Assuming the hypotheses of Theorem 1.65, improve the conclusion given
in equation (1.24) by showing that $\lim_{|\xi|\to\infty} |\xi^m \hat{f}(\xi)| = 0$.

1.29. Let $X$ be a normed space, and let $\{f_t\}_{t\in\mathbb{R}}$ be a sequence in $X$ indexed
by a real parameter. Show that $f_t \to f$ as $t \to 0$ if and only if $f_{t_k} \to f$ for
every sequence of real numbers $\{t_k\}_{k\in\mathbb{N}}$ such that $t_k \to 0$.

1.30. Fix $f : [a,b] \to \mathbb{C}$. Let $\Gamma = \{a = x_0 < \cdots < x_n = b\}$ denote a finite
partition of $[a,b]$. Set
\[
S_\Gamma = \sum_{j=1}^{n} |f(x_j) - f(x_{j-1})|
\quad\text{and}\quad
V[f; a, b] = \sup_\Gamma S_\Gamma,
\]
where the supremum is taken over all partitions $\Gamma$ of $[a,b]$. We call $V[f; a, b]$
the variation of $f$ over $[a,b]$ and say that $f$ has bounded variation on $[a,b]$
if $V[f; a, b] < \infty$. The set of functions with bounded variation on $[a,b]$ is
$BV[a,b]$. Prove the following statements.
(a) If $f, g \in BV[a,b]$, then $\alpha f + \beta g \in BV[a,b]$ and $fg \in BV[a,b]$. If
$|g(x)| \ge \varepsilon > 0$ for all $x \in [a,b]$ then $f/g \in BV[a,b]$.
(b) Set $f(x) = x^2 \sin(1/x)$ and $g(x) = x^2 \sin(1/x^2)$ for $x \ne 0$, and $f(0) =
g(0) = 0$. Then $f$ and $g$ are differentiable everywhere, $f \in BV[-1,1]$, and
$g \notin BV[-1,1]$.
(c) For $y \in \mathbb{R}$ set $y^+ = \max\{y, 0\}$ and $y^- = \max\{-y, 0\}$, so $y^+ - y^- = y$
and $y^+ + y^- = |y|$.
Given a real-valued $f \in BV[a,b]$, define
\[
S_\Gamma^+ = \sum_{i=1}^{n} \bigl( f(x_i) - f(x_{i-1}) \bigr)^+
\quad\text{and}\quad
V^+[f; a, b] = \sup_\Gamma S_\Gamma^+,
\]
and similarly define $S_\Gamma^-$ and $V^-[f; a, b]$. Prove that
\[
V^+[f; a, x] + V^-[f; a, x] = V[f; a, x], \qquad x \in [a,b],
\]
and
\[
V^+[f; a, x] - V^-[f; a, x] = f(x) - f(a), \qquad x \in [a,b].
\]
(d) Prove the Jordan Decomposition: If $f \in BV[a,b]$ is real-valued then
$f = g - h$ where $g, h$ are monotone increasing on $[a,b]$. Conclude that the left-
and right-hand limits $f(x+) = \lim_{y \searrow x} f(y)$ and $f(x-) = \lim_{y \nearrow x} f(y)$ exist
for all $x \in (a,b)$, as do $f(a+)$ and $f(b-)$.

1.31. A function $f : [a,b] \to \mathbb{C}$ is Lipschitz on $[a,b]$ if there exists a constant
$C > 0$ such that
\[
|f(x) - f(y)| \le C\,|x - y|, \qquad x, y \in [a,b].
\]
The class of Lipschitz functions on $[a,b]$ is denoted by $\mathrm{Lip}[a,b]$.
(a) Show that if $f$ is Lipschitz on $[a,b]$ then $f$ is uniformly continuous, has
bounded variation, and $V[f; a, b] \le C\,(b - a)$.
(b) Exhibit a Lipschitz function that is not differentiable at every point
of $[a,b]$.
(c) Show that if $f$ is differentiable everywhere on $[a,b]$ and $f'$ is bounded
on $[a,b]$, then $f$ is Lipschitz with constant $C = \|f'\|_\infty$. In particular, if $f, f'$
are both continuous on $[a,b]$, then $f$ is Lipschitz.
1.32. Prove the following properties of absolutely continuous functions.
(a) If g ∈ AC[a, b], then g is uniformly continuous on [a, b].
(b) $\mathrm{Lip}[a,b] \subsetneq AC[a,b] \subsetneq BV[a,b]$.
(c) Integration by parts is valid for absolutely continuous functions, i.e.,
if $f, g \in AC[a,b]$, then
\[
\int_a^b f(x)\, g'(x)\,dx = f(b)g(b) - f(a)g(a) - \int_a^b f'(x)\, g(x)\,dx.
\]
(d) $AC[a,b]$ is closed under pointwise products: If $f, g \in AC[a,b]$ then
$fg \in AC[a,b]$.
(e) If $f \in AC[a,b]$ and $f' = 0$ a.e. then $f$ is constant.
(f) If f ∈ ACloc [a, b] and there is a continuous function g such that f ′ =
g a.e. then f is differentiable everywhere and f ′ (x) = g(x) for all x ∈ [a, b].
1.33. Consider the functions ϕ1 , ϕ2 pictured in Figure 1.9. The function ϕ1
takes the constant value 1/2 on the interval (1/3, 2/3) that is removed in the
first stage of the construction of the Cantor middle-thirds set, and is linear
on the remaining intervals. The function ϕ2 also takes the same constant 1/2
on the interval (1/3, 2/3) but additionally is constant with values 1/4 and 3/4
on the two intervals that are removed in the second stage of the construction
of the Cantor set. Continue this process, defining ϕ3 , ϕ4 , . . . , and prove the
following facts.
(a) Each ϕk is monotone increasing on [0, 1].
(b) $|\varphi_{k+1}(x) - \varphi_k(x)| < 2^{-k}$ for every $x \in [0,1]$.
(c) ϕ(x) = limk→∞ ϕk (x) converges uniformly on [0, 1].
The function ϕ constructed in this manner is called the Cantor–Lebesgue
function or, more picturesquely, the Devil’s staircase. If we extend ϕ to R
by reflecting it about the point x = 1 and declaring it to be zero outside of
[0, 2], we obtain the continuous function ϕ pictured in Figure 1.10. Prove the
following facts.
(d) ϕ is continuous and monotone increasing on [0, 1], but ϕ is not Lipschitz.
(e) ϕ is singular on [0, 1], i.e., ϕ is differentiable a.e. and ϕ′ (x) = 0 a.e.
(f) The Fundamental Theorem of Calculus does not apply to $\varphi$:
\[
\varphi(1) - \varphi(0) \ne \int_0^1 \varphi'(x)\,dx.
\]
Conclude that $\varphi$ is not absolutely continuous on $[0,1]$.


Fig. 1.9. First stages in the construction of the Cantor–Lebesgue function. Left:
The function $\varphi_1$. Right: The function $\varphi_2$.

Fig. 1.10. The reflected Devil’s staircase (Cantor–Lebesgue function).

1.5 Approximate Identities

Although $L^1(\mathbb{R})$ is closed under convolution, we have seen that it has no
identity element. In this section we will show that there are functions in $L^1(\mathbb{R})$
that are “almost” identity elements for convolution. We will construct families
of functions $\{k_\lambda\}_{\lambda>0}$ which have the property that $f * k_\lambda$ converges to $f$ in
$L^1(\mathbb{R})$ (and in fact in many other senses, depending on what space $f$ belongs
to). If $k_\lambda$ is a “nice” function, then $f * k_\lambda$ will inherit that niceness, and so we
will have a nice function that is arbitrarily close to $f$. Using this procedure we
will be able to show that many spaces of nice functions are dense in $L^1(\mathbb{R})$,
including $C_c(\mathbb{R})$, $C_c^m(\mathbb{R})$ for each $m \in \mathbb{N}$, and even $C_c^\infty(\mathbb{R})$.

1.5.1 Definition and Existence of Approximate Identities

The properties that a family {kλ }λ>0 will need to possess in order to be an
approximate identity for convolution are listed in the next definition.

Definition 1.69. An approximate identity or a summability kernel is a family
$\{k_\lambda\}_{\lambda>0}$ of functions in $L^1(\mathbb{R})$ satisfying
(a) Normalization: $\int k_\lambda(x)\,dx = 1$ for every $\lambda > 0$,
(b) $L^1$-boundedness: $\sup_\lambda \|k_\lambda\|_1 < \infty$, and
(c) $L^1$-concentration: For every $\delta > 0$,
\[
\lim_{\lambda\to\infty} \int_{|x|\ge\delta} |k_\lambda(x)|\,dx = 0. \qquad ♦
\]

By definition, an approximate identity is a family of integrable functions.
If it is the case that $k_\lambda \ge 0$ for each $\lambda$, then requirement (b) in Definition 1.69
follows from requirement (a). However, in general the elements of an
approximate identity need not be nonnegative functions.
The “easy” way to create an approximate identity is through dilation of a
single function.

Exercise 1.70. Let $k \in L^1(\mathbb{R})$ be any function that satisfies
\[
\int k(x)\,dx = 1.
\]
Define $k_\lambda$ by an $L^1$-normalized dilation:
\[
k_\lambda(x) = \lambda\, k(\lambda x), \qquad \lambda > 0,
\]
and show that the family $\{k_\lambda\}_{\lambda>0}$ forms an approximate identity. ♦
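
Before proving the exercise in general, it can help to watch the three requirements of Definition 1.69 in one concrete case. The sketch below is an illustration only and assumes the Gaussian $k(x) = e^{-\pi x^2}$, for which the tail integral has the closed form $\int_{|x|\ge\delta} k_\lambda = \operatorname{erfc}(\sqrt{\pi}\,\lambda\delta)$. The function names `k_lam`, `total_mass`, and `tail_mass` are hypothetical.

```python
import math


def k_lam(x, lam):
    """L^1-normalized dilation k_lam(x) = lam * k(lam * x) of k(x) = exp(-pi x^2)."""
    return lam * math.exp(-math.pi * (lam * x) ** 2)


def total_mass(lam, cutoff=6.0, n=4000):
    """Trapezoid approximation of the integral of k_lam over [-cutoff/lam, cutoff/lam].

    Outside this window the Gaussian is smaller than exp(-36 pi), so the
    truncation is negligible at double precision.
    """
    a = -cutoff / lam
    dx = 2.0 * cutoff / (lam * n)
    total = 0.5 * (k_lam(a, lam) + k_lam(-a, lam))
    for j in range(1, n):
        total += k_lam(a + j * dx, lam)
    return total * dx


def tail_mass(lam, delta):
    """Integral of |k_lam| over |x| >= delta, via the complementary error function."""
    return math.erfc(math.sqrt(math.pi) * lam * delta)


if __name__ == "__main__":
    for lam in (1.0, 10.0, 100.0):
        # (a) normalization; (b) follows since k_lam >= 0; (c) tail mass for delta = 0.1
        print(lam, total_mass(lam), tail_mass(lam, 0.1))
```

Because this $k$ is nonnegative, $\|k_\lambda\|_1 = \int k_\lambda = 1$, so requirement (b) is automatic. The interesting column is the tail mass, which shrinks rapidly as $\lambda$ grows.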

Note that there is an inherent ambiguity in our notation: We may use
$\{k_\lambda\}_{\lambda>0}$ to indicate a generic family of functions indexed by $\lambda$, or, as
introduced in Notation 1.5, we may use $k_\lambda$ to denote the $L^1$-normalized dilation
of a function $k$. The intended meaning is usually clear from context.
In any case, if we define $k_\lambda$ by dilation, then, as $\lambda$ increases, $k_\lambda$ becomes
more and more similar to our intuition of what a “δ-function” (a function that
is an identity for convolution) would look like (see the illustration in Figure 1.7
and the related discussion in Section 1.3.5). While there is no such identity for
convolution in $L^1(\mathbb{R})$, the collection of functions $\{k_\lambda\}_{\lambda>0}$ in some sense forms
an approximation to this nonexistent δ-function, for requirement (c) implies
that $k_\lambda$ becomes more and more concentrated near the origin as $\lambda$ increases,
with $\int k_\lambda = 1$ for every $\lambda$.
Consider also the appearance of $k_\lambda(x) = \lambda k(\lambda x)$ in the frequency domain.
By Exercise 1.13, we have
\[
\widehat{k_\lambda}(\xi) = \hat{k}\Bigl(\frac{\xi}{\lambda}\Bigr).
\]
Since $k$ is integrable with $\int k = 1$, we know that $\hat{k}$ is continuous, and
\[
\hat{k}(0) = \int k = 1.
\]
The continuity of $\hat{k}$ therefore implies that for each $\xi \in \mathbb{R}$ we have
\[
\lim_{\lambda\to\infty} \widehat{k_\lambda}(\xi)
  = \lim_{\lambda\to\infty} \hat{k}\Bigl(\frac{\xi}{\lambda}\Bigr)
  = \hat{k}(0) = 1 \tag{1.25}
\]
(see the illustration in Figure 1.11 using the Gaussian function $g(x) = e^{-\pi x^2}$,
which by Exercise 1.109 has the interesting property that $\hat{g}(\xi) = e^{-\pi\xi^2}$). Thus
$\widehat{k_\lambda}$ converges pointwise everywhere to the constant function 1. This again
matches our intuition for what the Fourier transform of a “δ-function” would
be if there was one, for if there was a function $\delta$ that satisfied both $\delta(x) = 0$
for $x \ne 0$ and $\int \delta = 1$, then $\hat{\delta}$ would be identically constant:
\[
\hat{\delta}(\xi) = \int \delta(x)\, e^{-2\pi i\xi x}\,dx = \int \delta(x)\,dx = 1.
\]
This contradicts the Riemann–Lebesgue Lemma, so no such function $\delta$ can
exist. However, when we define $\delta$ not as a function but rather as a distribution
in Chapter 4 or as a measure in Chapter 5, we will see that $\hat{\delta}$ does exist and
is precisely the constant function.
In any case, given any $f \in L^1(\mathbb{R})$ and an approximate identity $\{k_\lambda\}_{\lambda>0}$ of
the form $k_\lambda(x) = \lambda k(\lambda x)$, we have that
\[
(f * k_\lambda)^{\wedge}(\xi) = \hat{f}(\xi)\, \widehat{k_\lambda}(\xi) \to \hat{f}(\xi), \qquad \xi \in \mathbb{R}.
\]
So, we at least have that $(f * k_\lambda)^{\wedge}(\xi)$ converges pointwise to $\hat{f}(\xi)$, and this
gives us hope that $f * k_\lambda$ should converge to $f$ in other senses as well. Our
goal in this section is to understand in what sense this hope holds true.

1.5.2 Approximation in Lp (R) by an Approximate Identity

We begin by quantifying the notion that an approximate identity is
approximately an identity for convolution in $L^1(\mathbb{R})$. The proof of this theorem
illustrates the “standard trick” of introducing $k_\lambda$ into an equation by virtue of
the fact that $\int k_\lambda = 1$, and also the division of the integral into small and
large parts in order to make use of the defining properties of an approximate
identity.

Theorem 1.71. Let $\{k_\lambda\}_{\lambda>0}$ be an approximate identity. Then
\[
\forall\, f \in L^1(\mathbb{R}), \qquad \lim_{\lambda\to\infty} \|f - f * k_\lambda\|_1 = 0.
\]
That is, $f * k_\lambda \to f$ in $L^1$-norm as $\lambda \to \infty$.

Fig. 1.11. Top: The Fourier transform $\hat{g}(\xi) = e^{-\pi\xi^2}$ of the function $g(x) = e^{-\pi x^2}$.
Middle: The Fourier transform $\widehat{g_5}(\xi) = \hat{g}(\xi/5)$ of the dilated function $g_5(x) = 5g(5x)$.
Bottom: The Fourier transform $\widehat{g_{15}}(\xi) = \hat{g}(\xi/15)$ of the dilated function
$g_{15}(x) = 15g(15x)$.

Proof. Fix any $f \in L^1(\mathbb{R})$. Since $k_\lambda \in L^1(\mathbb{R})$, we know that $f * k_\lambda \in L^1(\mathbb{R})$,
and we wish to show that it approximates $f$ well in $L^1$-norm. Using the fact
that $\int k_\lambda = 1$, we compute that
\[
\begin{aligned}
\|f - f * k_\lambda\|_1
  &= \int |f(x) - (f * k_\lambda)(x)|\,dx \\
  &= \int \Bigl| \int f(x)\, k_\lambda(t)\,dt - \int f(x - t)\, k_\lambda(t)\,dt \Bigr|\,dx \\
  &\le \iint |f(x) - f(x - t)|\, |k_\lambda(t)|\,dt\,dx \\
  &= \iint |f(x) - f(x - t)|\, |k_\lambda(t)|\,dx\,dt \\
  &= \int |k_\lambda(t)| \int |f(x) - T_t f(x)|\,dx\,dt \\
  &= \int |k_\lambda(t)|\, \|f - T_t f\|_1\,dt,
\end{aligned}
\tag{1.26}
\]
where the interchange in the order of integration is permitted by Tonelli’s
Theorem since the integrands are nonnegative. We want to show that the
quantity in equation (1.26) is small when $\lambda$ is large.
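Choose any $\varepsilon > 0$. Since translation is strongly continuous on $L^1(\mathbb{R})$, there
exists a $\delta > 0$ such that
\[
|t| < \delta \quad\Longrightarrow\quad \|f - T_t f\|_1 < \varepsilon.
\]
Also, by definition of approximate identity, we know that
\[
K = \sup_\lambda \|k_\lambda\|_1 < \infty,
\]
and that there exists some $\lambda_0$ such that
\[
\lambda > \lambda_0 \quad\Longrightarrow\quad \int_{|t|\ge\delta} |k_\lambda(t)|\,dt < \varepsilon.
\]
Therefore, for $\lambda > \lambda_0$ we can continue equation (1.26) as follows:
\[
\begin{aligned}
(1.26)
  &= \int_{|t|<\delta} |k_\lambda(t)|\, \|f - T_t f\|_1\,dt
     + \int_{|t|\ge\delta} |k_\lambda(t)|\, \|f - T_t f\|_1\,dt \\
  &\le \int_{|t|<\delta} |k_\lambda(t)|\, \varepsilon\,dt
     + \int_{|t|\ge\delta} |k_\lambda(t)|\, \bigl( \|f\|_1 + \|T_t f\|_1 \bigr)\,dt \\
  &\le \varepsilon \int |k_\lambda(t)|\,dt
     + 2\|f\|_1 \int_{|t|\ge\delta} |k_\lambda(t)|\,dt \\
  &\le \varepsilon K + 2\|f\|_1\, \varepsilon.
\end{aligned}
\]
Thus $\|f - f * k_\lambda\|_1 \to 0$ as $\lambda \to \infty$. □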

To illustrate the convergence proved in the preceding theorem, consider
the particular function $\chi = \chi_{[0,1]}$ and a particular approximate identity that
will be of considerable use to us later. This is the Fejér kernel $\{w_\lambda\}_{\lambda>0}$,
which is produced by dilating the Fejér function $w(x) = \bigl(\frac{\sin \pi x}{\pi x}\bigr)^2$ depicted in
Figure 1.14. In Figure 1.12, we see the convolutions $\chi * w$, $\chi * w_5$, and $\chi * w_{25}$. In
addition to the convergence apparent in these figures, note that the convolved
functions appear to be continuous, while $\chi$ is discontinuous. This is due to
the smoothing effect of convolution, which was discussed in Section 1.3.7.

Fig. 1.12. Convolution with an approximate identity. Top: $\chi * w$. Middle: $\chi * w_5$.
Bottom: $\chi * w_{25}$.

There are many variations on the theme of Theorem 1.71. To begin, since
$L^p * L^1 \subseteq L^p$, we expect that we may be able to extend to other values of $p$,
and indeed for $p$ finite we have the following result.

Exercise 1.72. Let $\{k_\lambda\}_{\lambda>0}$ be an approximate identity. Prove that if
$1 \le p < \infty$, then
\[
\forall\, f \in L^p(\mathbb{R}), \qquad \lim_{\lambda\to\infty} \|f - f * k_\lambda\|_p = 0.
\]
That is, $f * k_\lambda \to f$ in $L^p$-norm as $\lambda \to \infty$. ♦

As usual, we do need to be careful when $p = \infty$, and we consider this issue
next.

1.5.3 Uniform Convergence

Now we turn to approximation by approximate identities in spaces such as
$L^\infty(\mathbb{R})$, $C_0(\mathbb{R})$, and $C_b(\mathbb{R})$. Considering that the elements $k_\lambda$ of an
approximate identity are integrable functions, if $f$ belongs to $L^\infty(\mathbb{R})$ then $f * k_\lambda$
belongs to $C_b(\mathbb{R})$ by Exercise 1.39. Since a uniform limit of continuous
functions is continuous, it is simply too much to expect that $f * k_\lambda$ will
converge to $f$ in $L^\infty$-norm for all $f \in L^\infty(\mathbb{R})$. However, if we
impose some smoothness on $f$, then we can obtain convergence in senses that
are appropriate to $f$.
If we assume that $f$ belongs to $C_0(\mathbb{R})$, then $f * k_\lambda$ also belongs to $C_0(\mathbb{R})$
by Theorem 1.36. By replacing $L^\infty(\mathbb{R})$ with $C_0(\mathbb{R})$, we therefore obtain for
$p = \infty$ an exact analogue of the convergence result of Exercise 1.72.

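Exercise 1.73. Let $\{k_\lambda\}_{\lambda>0}$ be an approximate identity. Prove that
\[
\forall\, f \in C_0(\mathbb{R}), \qquad \lim_{\lambda\to\infty} \|f - f * k_\lambda\|_\infty = 0.
\]
That is, $f * k_\lambda$ converges uniformly to $f$ as $\lambda \to \infty$. ♦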

Suppose that we relax the decay hypothesis on $f$ in Exercise 1.73, and
assume only that $f \in C_b(\mathbb{R})$ instead of in $C_0(\mathbb{R})$. In this case we do have that
$f * k_\lambda \in C_b(\mathbb{R})$ by Exercise 1.39, so we can hope that $f * k_\lambda$ will still converge
to $f$. This is true, but in general the convergence will be uniform only when
restricted to compact sets.

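Exercise 1.74. Let $\{k_\lambda\}_{\lambda>0}$ be an approximate identity. Prove that if
$f \in C_b(\mathbb{R})$, then $f * k_\lambda$ converges uniformly to $f$ on compact sets, i.e.,
\[
\forall\, \text{compact } K \subseteq \mathbb{R}, \qquad
\lim_{\lambda\to\infty} \bigl\| (f - f * k_\lambda)\, \chi_K \bigr\|_\infty = 0.
\]
Give an example of $f \in C_b(\mathbb{R})$ such that $f * k_\lambda$ does not converge uniformly
to $f$ on $\mathbb{R}$. ♦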

On the other hand, if we impose even a slight amount of “extra smoothness”
on $f \in C_b(\mathbb{R})$, then we can restore the uniform convergence of $f * k_\lambda$
to $f$ on the entire real line. We can quantify this extra smoothness in terms
of Hölder continuity, which is a generalization of Lipschitz continuity.

Definition 1.75. (a) A function $f : \mathbb{R} \to \mathbb{C}$ is Hölder continuous with
exponent $\alpha > 0$ at a point $x$ if there exists a constant $K > 0$ such that
\[
\forall\, y \in \mathbb{R}, \qquad |f(x) - f(y)| \le K\,|x - y|^\alpha.
\]
(b) $f$ is Hölder continuous with exponent $\alpha > 0$ if there exists a constant
$K > 0$ such that
\[
\forall\, x, y \in \mathbb{R}, \qquad |f(x) - f(y)| \le K\,|x - y|^\alpha.
\]
(c) Given $0 < \alpha < 1$, we set
\[
C^\alpha(\mathbb{R}) = \bigl\{ f \in C(\mathbb{R}) \,:\, f \text{ is Hölder continuous with exponent } \alpha \bigr\}. \qquad ♦
\]
The reader should verify that the only functions that are Hölder contin-
uous with exponent α > 1 are the constant functions. Lipschitz continuity
is Hölder continuity with α = 1. Every differentiable function f such that
f ′ is bounded is Lipschitz, but a Lipschitz function need not be differentiable
everywhere (consider f (x) = |x|). Hence C 1 (R), the space of functions that are
differentiable with a continuous derivative, is a proper subspace of the space
of Lipschitz functions. A Lipschitz function f has bounded variation on any
finite interval, and consequently f is differentiable at almost every point and
f ′ is integrable on any finite interval.
The graph of a function that is Hölder continuous with exponent α < 1
typically has a “fractal” appearance. The smaller that we must take α, the
more “jagged” the graph of the function appears. For example, the Cantor–
Lebesgue function is Hölder continuous but not Lipschitz (Problem 1.42).
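Exercise 1.76. Let $\{k_\lambda\}_{\lambda>0}$ be an approximate identity. Prove that if
$f \in C_b(\mathbb{R})$ is Hölder continuous for some exponent $0 < \alpha \le 1$, then $f * k_\lambda$
converges uniformly to $f$ on $\mathbb{R}$, i.e.,
\[
\lim_{\lambda\to\infty} \|f - f * k_\lambda\|_\infty = 0. \qquad ♦
\]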

1.5.4 Pointwise Convergence


If $f$ is a locally integrable function, then $x \in \mathbb{R}$ is a Lebesgue point of $f$ if
\[
\lim_{h\to 0} \frac{1}{2h} \int_{x-h}^{x+h} |f(y) - f(x)|\,dy = 0,
\]

and the set of Lebesgue points is called the Lebesgue set of f. Every point
of continuity is a Lebesgue point of f. More interestingly, the Lebesgue Dif-
ferentiation Theorem implies that almost every x ∈ R is a Lebesgue point
(Theorem A.30).
Now, we have seen that if f belongs to Lp (R) with p finite then f ∗ kλ → f
in Lp -norm. If we impose some restrictions on k, then we can also show that
we have pointwise convergence of (f ∗ kλ )(x) to f (x) at each Lebesgue point x
of f.

Theorem 1.77. Let $k$ be a bounded, compactly supported function such that
$\int k = 1$, and define $k_\lambda(x) = \lambda k(\lambda x)$. If $f \in L^1(\mathbb{R})$, then $(f * k_\lambda)(x) \to f(x)$
as $\lambda \to \infty$ for each point $x$ in the Lebesgue set of $f$. In particular, $f * k_\lambda$
converges to $f$ pointwise a.e.
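Proof. By hypothesis, $\operatorname{supp}(k) \subseteq [-R, R]$ for some $R$. If $x$ is a Lebesgue point
of $f$, then
\[
\begin{aligned}
\lim_{\lambda\to\infty} |f(x) - (f * k_\lambda)(x)|
  &= \lim_{\lambda\to\infty} \Bigl| \int f(x)\, k_\lambda(x - t)\,dt - \int f(t)\, k_\lambda(x - t)\,dt \Bigr| \\
  &\le \lim_{\lambda\to\infty} \lambda \int |f(x) - f(t)|\, |k(\lambda(x - t))|\,dt \\
  &= \lim_{\lambda\to\infty} \frac{2R\lambda}{2R} \int_{x-(R/\lambda)}^{x+(R/\lambda)} |f(x) - f(t)|\, |k(\lambda(x - t))|\,dt \\
  &\le 2R\, \|k\|_\infty \lim_{h\to 0} \frac{1}{2h} \int_{x-h}^{x+h} |f(x) - f(t)|\,dt \\
  &= 0,
\end{aligned}
\]
where we have set $h = R/\lambda$ and the limit is zero by definition of Lebesgue
point. Finally, since almost every $x$ is a Lebesgue point of $f$, we conclude
that $f * k_\lambda$ converges to $f$ pointwise a.e. □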
The hypotheses on $k$ in Theorem 1.77 can be relaxed. In particular,
compact support of $k$ is not required. For example, Stein and Weiss [SW71, p. 13]
give a more intricate argument that shows that if $k \in L^1(\mathbb{R})$ satisfies $\int k = 1$
and there exists an even function $\phi \in L^1(\mathbb{R})$ that is decreasing and
differentiable on $(0, \infty)$ that dominates $k$ in the sense that $|k(x)| \le \phi(x)$ for all $x$,
then $(f * k_\lambda)(x) \to f(x)$ for every $x$ in the Lebesgue set of $f$.

1.5.5 Dense Sets of Nice Functions

Theorem 1.16 gave us a proof, based on Urysohn’s Lemma, that the space
Cc (R) is dense in L1 (R). It almost seems that we should be able to give a
“simple” proof of this fact by using approximate identities and arguing as
follows. Choose any f ∈ L1 (R). Then we can find a compactly supported g ∈
L1 (R) that is close to f, e.g., if we take R large enough and set g = f χ[−R,R]
then we will have kf − gk1 < ε. If we convolve g with an element kλ of an
approximate identity, then g ∗ kλ will be close to g if λ is large enough, say
kg − g ∗ kλ k1 < ε. Further, if we choose our approximate identity so that kλ is
a “nice” function then g ∗ kλ will inherit this “niceness” as well. For example,
if kλ ∈ Cc (R) then g ∗ kλ ∈ Cc (R), and so we have found an element of Cc (R)
that lies within 2ε of f in L1 -norm.

The flaw in this reasoning is that our proof in Theorem 1.71 that g ∗kλ → g
in L1 -norm relies on the fact that translation is a strongly continuous family
of operators in L1 (R). The proof of this strong continuity of translation is
the content of Exercise 1.17. However, the proof of that exercise (at least the
proof we suggest in the hints) requires us to already know that Cc (R) is dense
in L1 (R). Hence the reasoning of the preceding paragraph is circular.
We could take a different approach, e.g., by first arguing that simple func-
tions are dense in L1 (R) and trying to proceed from there to show that Cc (R)
is dense. But it does not really matter: one way or the other, we essentially
have to “get our hands dirty” and show that some particular special subset is
dense. The power of approximate identities comes at the next step—once we
know that one particular set is dense, we can use the spirit of the argument
above (convolution with a “nice” approximate identity) to easily show that
“much nicer” spaces are also dense. We even obtain results that almost seem
too good to be true. For example, we will see that the space Cc∞ (R) consisting
of infinitely differentiable, compactly supported functions is dense in Lp (R) for
every 1 ≤ p < ∞! This fact is not just an abstract surprise, but will be of
great utility to us throughout the remainder of this volume, especially when
we turn to distribution theory in Chapter 4.

Exercise 1.78. (a) Show that $C_c^m(\mathbb{R})$ is dense in $L^p(\mathbb{R})$ for each $m \ge 0$ and
$1 \le p < \infty$, and also is dense in $C_0(\mathbb{R})$ in $L^\infty$-norm.
(b) Show that $C_c^\infty(\mathbb{R})$ is dense in $L^p(\mathbb{R})$ for each $1 \le p < \infty$, and also is
dense in $C_0(\mathbb{R})$ in $L^\infty$-norm. ♦

After proving the Inversion Formula in Section 1.6, we will also be able
to show that many spaces of functions with nice Fourier transforms are also
dense. For example, in Problem 1.66 we will see that
$\{ f \in L^1(\mathbb{R}) : \hat{f} \in C_c^\infty(\mathbb{R}) \}$ is dense in $L^p(\mathbb{R})$ for $1 \le p < \infty$.

1.5.6 The C ∞ Urysohn Lemma

We proved Urysohn’s Lemma for the setting of the real line in Theorem 1.15.
By using approximate identities, we now prove a much more refined C ∞ -
version of Urysohn’s Lemma for subsets of R.

Theorem 1.79 ($C^\infty$ Urysohn Lemma). Let $K \subseteq \mathbb{R}$ be compact, and let
$U \supseteq K$ be an open set. Then there exists $f \in C_c^\infty(\mathbb{R})$ such that $0 \le f \le 1$,
$f = 1$ on $K$, and $\operatorname{supp}(f) \subseteq U$.

Proof. Since $K$ is compact and $\mathbb{R} \setminus U$ is closed, the distance between these sets
is positive, i.e.,
\[
d = \operatorname{dist}(K, \mathbb{R} \setminus U) = \inf \bigl\{ |x - y| \,:\, x \in K,\ y \notin U \bigr\} > 0.
\]
Let
\[
V = \Bigl\{ y \in \mathbb{R} \,:\, \operatorname{dist}(y, K) < \frac{d}{3} \Bigr\},
\]
and let $k$ be any function in $C_c^\infty(\mathbb{R})$ such that $k \ge 0$, $\int k = 1$, and
$\operatorname{supp}(k) \subseteq [-\frac{d}{3}, \frac{d}{3}]$ (for example, dilate the function constructed in
Exercise 1.42). Set $f = k * \chi_V$. Since $k$ and $\chi_V$ are both compactly supported,
their convolution is also compactly supported, and hence it follows from
Corollary 1.41 that $f \in C_c^\infty(\mathbb{R})$. Since
\[
f(x) = \int_V k(x - y)\,dy \le \int k = 1,
\]
we have $0 \le f \le 1$ everywhere. Also, if $x \in K$ and $y \notin V$ then $|x - y| \ge \frac{d}{3}$
and so $k(x - y) = 0$. Therefore for $x \in K$ we have
\[
f(x) = \int_V k(x - y)\,dy = \int k(x - y)\,dy = 1.
\]
Similarly if $x \notin U$ then it follows that $f(x) = 0$. □

1.5.7 Gibbs’s Phenomenon

Gibbs’s phenomenon refers to the behavior of the partial sums of the Fourier
series of a periodic function near a jump discontinuity. Although this chapter
focuses on the Fourier transform on R rather than Fourier series on T, there
is an analogous phenomenon for the Fourier transform that we will discuss.
For the formulation and proof of Gibbs’s phenomenon on the torus, we refer
to [DM72].
To illustrate this, let $H = \chi_{[0,\infty)}$ be the Heaviside function. Although
$H$ does not belong to $L^1(\mathbb{R})$, the important fact for this example is that $H$
has a jump discontinuity at $x = 0$. Pointwise convergence of Fourier series
corresponds to convolution with the Dirichlet kernel on the torus (see
Section 2.2.1, and equation (2.9) in particular). The Dirichlet kernel on the real
line is $\{d_\lambda\}_{\lambda>0}$, which is obtained by dilating the Dirichlet function
\[
d(\xi) = \frac{\sin \xi}{\pi\xi}.
\]
Since $d \notin L^1(\mathbb{R})$, the Dirichlet kernel does not form an approximate identity,
but even so let us consider the pointwise behavior of $H * d_\lambda$ as $\lambda \to \infty$.
Using the fact that, as an improper Riemann integral, $\int_0^\infty \frac{\sin x}{x}\,dx = \frac{\pi}{2}$ (see
Problem 1.43), we have
\[
\begin{aligned}
(H * d_\lambda)(x)
  &= \int_0^\infty \frac{\sin \lambda(x - y)}{\pi(x - y)}\,dy \\
  &= \int_{-\infty}^{\lambda x} \frac{\sin y}{\pi y}\,dy \\
  &= \frac{1}{2} + \int_0^{\lambda x} \frac{\sin y}{\pi y}\,dy.
\end{aligned}
\]
Since $\frac{\sin y}{\pi y} > 0$ for $0 < y < \pi$, $H * d_\lambda$ is increasing on $(0, \pi/\lambda)$. Then it
decreases on $(\pi/\lambda, 2\pi/\lambda)$, then increases on $(2\pi/\lambda, 3\pi/\lambda)$, but never back to
the value it had at $\pi/\lambda$. Continuing in this way we see that $H * d_\lambda$ achieves
its maximum at $x = \pi/\lambda$, and this maximum is
\[
(H * d_\lambda)\Bigl(\frac{\pi}{\lambda}\Bigr) = \frac{1}{2} + \int_0^{\pi} \frac{\sin y}{\pi y}\,dy \approx 1.089\ldots.
\]
Note that this maximum is independent of $\lambda$. Although $(H * d_\lambda)(x)$ converges
pointwise to $H(x)$ as $\lambda \to \infty$ for all $x \ne 0$, this convergence is not uniform.
In particular, the maximum distance between $(H * d_\lambda)(x)$ and $H(x)$ for $x > 0$
is a constant (approximately 0.089. . . ) that is independent of $\lambda$, although its
location $x = \pi/\lambda$ moves toward the origin as $\lambda$ increases. Figure 1.13 displays
a plot of $H * d_\lambda$ for $\lambda = 16\pi$.

Fig. 1.13. Graph of $H * d_\lambda$ for $\lambda = 16\pi$.
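The overshoot value can be checked numerically from the formula $(H * d_\lambda)(x) = \frac12 + \int_0^{\lambda x} \frac{\sin y}{\pi y}\,dy$. The sketch below uses a hand-written composite Simpson rule; the function names are illustrative and not part of the text.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def dirichlet(y):
    """Dirichlet function d(y) = sin(y) / (pi * y), extended by d(0) = 1/pi."""
    return 1.0 / math.pi if y == 0 else math.sin(y) / (math.pi * y)

def heaviside_conv(x, lam):
    """(H * d_lambda)(x) = 1/2 + integral of d over [0, lambda * x]."""
    return 0.5 + simpson(dirichlet, 0.0, lam * x)

# The peak at x = pi / lambda has the same height for every lambda.
for lam in (4.0, 16.0 * math.pi, 200.0):
    peak = heaviside_conv(math.pi / lam, lam)
    print(f"lambda = {lam:8.3f}   (H * d_lambda)(pi/lambda) = {peak:.5f}")
```

Each line reports a peak of about 1.0895, illustrating that the overshoot moves toward the jump as $\lambda$ grows but does not shrink.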

1.5.8 Translation-Invariant Subspaces of L1 (R)


The following characterization of the closed, translation-invariant subspaces
of L1 (R) is due to Norbert Wiener (1894–1964); compare this to the char-
acterization of the closed, translation-invariant subspaces of L2 (R) given in
Problem 3.10.
Definition 1.80 (Translation-Invariant Subspaces). We say that a sub-
set J of L1 (R) is translation-invariant if J is closed under all translations,
i.e., if
f ∈ J, a ∈ R =⇒ Ta f ∈ J. ♦

Remark 1.81. A shift-invariant space $V$ is one that is closed under integer
translations, i.e., if $f \in V$ then $T_k f \in V$ for all integers $k$. These spaces play
important roles in sampling theory and wavelet theory; see [Dau92]. ♦
Exercise 1.82. Let J be a closed subspace of L1 (R), and prove that the
following statements are equivalent.
(a) J is translation-invariant.
(b) J is an ideal in L1 (R) under convolution. ♦
Note that closedness is important here. For example, Cc (R) is a translation-
invariant subspace of L1 (R), but it is not an ideal with respect to convolution.
By using Exercise 1.82, we can give another characterization of the principal
ideal $I(g) = \overline{g * L^1(\mathbb{R})}$ generated by a function $g \in L^1(\mathbb{R})$. By Exercise 1.48
and Problem 1.39, $I(g)$ is the smallest closed ideal in $L^1(\mathbb{R})$ that contains $g$.
Exercise 1.83. Given g ∈ L1 (R), show that I(g) is the closure of the finite
linear span of the translations of g:

$$I(g) = \overline{\operatorname{span}}\,\{T_a g\}_{a \in \mathbb{R}}.$$ ♦

Thus, the smallest closed ideal that contains g is precisely the smallest
closed subspace that contains all translates of g.
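Recall that a set $S \subseteq L^1(\mathbb{R})$ is complete in $L^1(\mathbb{R})$ if $\operatorname{span}(S)$ is dense in
$L^1(\mathbb{R})$. It therefore follows from Exercise 1.83 that $\{T_a g\}_{a \in \mathbb{R}}$ is complete in
$L^1(\mathbb{R})$ if and only if $I(g) = L^1(\mathbb{R})$. But when does this happen? The next
exercise gives a necessary condition.
Exercise 1.84. Given $g \in L^1(\mathbb{R})$, show that

$$\{T_a g\}_{a \in \mathbb{R}} \text{ is complete in } L^1(\mathbb{R}) \implies \hat{g}(\xi) \ne 0 \text{ for all } \xi \in \mathbb{R}.$$ ♦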

The converse of Exercise 1.84 is also true, but is a much deeper fact that
is part of Wiener’s Tauberian Theorem, which we discuss in Section 2.10. In
contrast, the analogous question in L2 (R) is much simpler, see Problem 3.9.

Additional Problems

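1.34. This problem constructs an example of an approximate identity $\{k_\lambda\}_{\lambda>0}$
where $k_\lambda$ need not be a dilation of a single function $k$. For each $\lambda > 0$,
let $k_\lambda \in L^1(\mathbb{R})$ be any function satisfying $k_\lambda \ge 0$, $\|k_\lambda\|_1 = \int k_\lambda = 1$, and
$\operatorname{supp}(k_\lambda) \subseteq [-\frac{1}{\lambda}, \frac{1}{\lambda}]$. Show that $\{k_\lambda\}_{\lambda>0}$ is an approximate identity.

1.35. Show that if $\{k_\lambda\}_{\lambda>0}$ is an approximate identity then $\hat{k}_\lambda(\xi) \to 1$ pointwise
as $\lambda \to \infty$.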
1.36. Let {kλ }λ>0 be an approximate identity and fix f ∈ L1 (R) ∩ L∞ (R).
Show that if f is continuous at some point x ∈ R, then (f ∗ kλ )(x) → f (x) as
λ → ∞.
1.37. Fix $g \in L^1(\mathbb{R})$ and define $L_g : L^1(\mathbb{R}) \to L^1(\mathbb{R})$ by $L_g(f) = f * g$. Show
that the operator norm of $L_g$ is $\|L_g\| = \|g\|_1$. This gives the best constant in
equation (1.15) for the case $p = q = r = 1$.

1.38. Assume $k \in L^1(\mathbb{R})$ is given and we set $r = \int k$ and $k_\lambda(x) = \lambda k(\lambda x)$.
Prove that if $1 \le p < \infty$, then for each $f \in L^p(\mathbb{R})$ we have that $f * k_\lambda \to rf$
in $L^p$-norm as $\lambda \to \infty$. Note that this includes the possibility that $r$ may be
complex or zero.

1.39. Fix g ∈ L1 (R), and consider the ideals g ∗ L1 (R) and I(g) introduced
in Exercise 1.48. Show that we need not have g ∈ g ∗ L1 (R), but we always
have g ∈ I(g).
1.40. Let $k \in L^1(\mathbb{R})$ be any function such that $\int k = 1$ and $xk(x) \in L^1(\mathbb{R})$,
and define an approximate identity by setting $k_\lambda(x) = \lambda k(\lambda x)$. Fix $1 \le p < \infty$.
(a) Show that $\|P k_\lambda\|_1 \to 0$ as $\lambda \to \infty$, where $P$ is the position operator.
(b) Show that if $f \in L^p(\mathbb{R})$ then $f * P k_\lambda \to 0$ in $L^p$-norm. Further, if we also
have $xf(x) \in L^p(\mathbb{R})$, then $P(f * k_\lambda) \to Pf$ in $L^p$-norm.

1.41. Given $0 < \alpha < 1$, let $C^\alpha(\mathbb{R})$ be the space of Hölder continuous functions
given in Definition 1.75. Show that
$$\|f\|_{C^\alpha} = |f(0)| + \sup_{x \ne y} \frac{|f(x) - f(y)|}{|x - y|^\alpha}$$
is a norm on $C^\alpha(\mathbb{R})$, and $C^\alpha(\mathbb{R})$ is complete with respect to this norm.

1.42. Show that the Cantor–Lebesgue function is Hölder continuous precisely
for exponents $\alpha$ in the range $0 < \alpha \le \log_3 2 \approx 0.6309\ldots$.

1.43. (a) Show that $\int_0^\infty \bigl|\frac{\sin x}{x}\bigr|\,dx = \infty$.
(b) If $x > 0$, then $\int_0^\infty e^{-tx}\,dt = \frac{1}{x}$. Combine this with Fubini’s Theorem to
evaluate $\int_0^a \frac{\sin x}{x}\,dx$. Then apply the Lebesgue Dominated Convergence Theorem
to show that $\lim_{a \to \infty} \int_0^a \frac{\sin x}{x}\,dx = \frac{\pi}{2}$.
Thus, even though $\frac{\sin x}{x}$ is not integrable, the improper Riemann integral
$\int_0^\infty \frac{\sin x}{x}\,dx$ exists and equals $\frac{\pi}{2}$ (this integral can also be evaluated by using
contour integration).
(c) Show that $\sup_{a<b} \bigl|\int_a^b \frac{\sin x}{x}\,dx\bigr| < \infty$.

1.44. This problem will illustrate one of the many different possible generalizations
of the results of this section by considering particular weighted $L^p$
spaces. Given $s \in \mathbb{R}$ let $v_s(x) = (1+|x|)^s$; we refer to $v_s$ as a polynomial weight
(since it has polynomial-like growth if $s \ge 0$, or decays like the reciprocal of
a polynomial if $s \le 0$). For this problem fix $1 \le p < \infty$ and $s \in \mathbb{R}$. Then we
define $L^p_s(\mathbb{R})$ to be the space of all measurable functions $f : \mathbb{R} \to \mathbb{C}$ such that
$$\|f\|_{p,s} = \|f v_s\|_p = \Bigl(\int |f(x)|^p\,(1 + |x|)^{ps}\,dx\Bigr)^{1/p} < \infty.$$

If we identify functions that are equal a.e. then $L^p_s(\mathbb{R})$ is a Banach space.
Prove the following statements.
(a) If $s \ge 0$ then $v_s$ is submultiplicative, i.e., $v_s(x + y) \le v_s(x)\,v_s(y)$ for
$x, y \in \mathbb{R}$. If $s \le 0$ then $v_s$ is $v_{-s}$-moderate, i.e., $v_s(x + y) \le v_s(x)\,v_{-s}(y)$ for
$x, y \in \mathbb{R}$.
(b) For each $a \in \mathbb{R}$, the translation operator $T_a$ is a continuous map of
$L^p_s(\mathbb{R})$ into itself, with operator norm $\|T_a\|_{L^p_s \to L^p_s} \le v_{|s|}(a) = (1 + |a|)^{|s|}$.
(c) Translation is strongly continuous on $L^p_s(\mathbb{R})$, i.e., for each $f \in L^p_s(\mathbb{R})$
we have $\lim_{a \to 0} \|f - T_a f\|_{p,s} = 0$.
(d) If $\{k_\lambda\}_{\lambda>0}$ is an approximate identity, then for each $f \in L^p_s(\mathbb{R})$ we
have $\lim_{\lambda \to \infty} \|f - f * k_\lambda\|_{p,s} = 0$.
(e) $C_c^\infty(\mathbb{R})$ is dense in $L^p_s(\mathbb{R})$.
(f) Do parts (b)–(d) still hold if we replace $v_s$ by an arbitrary weight
$w : \mathbb{R} \to (0, \infty)$? What properties do we need $w$ to possess?

1.6 The Inversion Formula


In this section we will prove Theorem 1.9, the Inversion Formula for the Fourier
transform on $L^1(\mathbb{R})$, which states that if $f$ and $\hat{f}$ both belong to $L^1(\mathbb{R})$, then
$$f = (\hat{f})^\vee = (f^\vee)^\wedge.$$
Unfortunately, the Inversion Formula does not apply to every function in
$L^1(\mathbb{R})$. For example, by Exercise 1.7 we have that $\chi = \chi_{[-1/2,1/2]}$ belongs to
$L^1(\mathbb{R})$ but $\hat{\chi}(\xi) = d_\pi(\xi) = \frac{\sin \pi\xi}{\pi\xi}$ is not integrable. Another example is given
in Problem 1.1: The one-sided exponential function $f(x) = e^{-x} \chi_{[0,\infty)}(x)$ is
integrable, but its Fourier transform $\hat{f}(\xi) = \frac{1}{2\pi i \xi + 1}$ is not.

1.6.1 The Fejér Kernel

To prove the Inversion Formula, we will use the machinery of approximate
identities that we developed in Section 1.5. Specifically, we will use a particular
approximate identity, named after Lipót Fejér (1880–1959), that has some
useful special properties.

Definition 1.85 (Fejér Kernel). The Fejér function is the square of the
sinc function $d_\pi$:
$$w(x) = \Bigl(\frac{\sin \pi x}{\pi x}\Bigr)^2 = d_\pi(x)^2.$$
The Fejér kernel is $\{w_\lambda\}_{\lambda>0}$ where $w_\lambda(x) = \lambda w(\lambda x)$. ♦
Fig. 1.14. Graph of the Fejér function $w$.

The letter “w” is for “Weiss,” which was Fejér’s surname at birth.
In order to conclude that the Fejér kernel is an approximate identity, we
need to know that the integral of w is 1.
Exercise 1.86. Show that $w \in L^1(\mathbb{R})$ and $\int w = 1$. ♦

Let $\chi = \chi_{[-1/2,1/2]}$. Then we know from Exercise 1.7 that
$\hat{\chi}(\xi) = \frac{\sin \pi\xi}{\pi\xi} = d_\pi(\xi)$. Hence if we let
$$W = \chi * \chi,$$
then we have
$$\hat{W} = \hat{\chi}^{\,2} = d_\pi^{\,2} = w. \tag{1.27}$$

Also $W^\vee = w$ since $W$ is even. Since both $W$ and $\hat{W} = w$ belong to $L^1(\mathbb{R})$, if
we had already proved that the Inversion Formula holds, then we could apply
it to $W$ to conclude that $w$’s Fourier transform is $W$, i.e.,
$$\hat{w} = (W^\vee)^\wedge = W \quad\text{and}\quad w^\vee = (\hat{W})^\vee = W. \tag{1.28}$$

We will see eventually that this is true, but first we must prove the Inversion
Formula. And we will prove the Inversion Formula by making use of the Fejér
kernel, although we will not need equation (1.28) to do this. Obviously, this
is a good thing, since otherwise the argument would be circular.
It is not so much the Fejér kernel itself that will be important to us, but
rather some of the properties that it happens to have, including:
(a) $w, W \ge 0$,
(b) $w(0) = 1$,
(c) $W, \hat{W} \in L^1(\mathbb{R})$,
(d) $\int W = \hat{W}(0) = 1$.

Even these are not all essential, and the reader can consider what other kernels
we might use instead to prove the Inversion Formula.
Fig. 1.15. Graph of the hat function $W$.

Exercise 1.87. Prove that W is the hat function or tent function on the
interval [−1, 1] defined by

W (ξ) = max{1 − |ξ|, 0}. ♦

Consequently,
$$W(\xi/\lambda) = \max\Bigl\{1 - \frac{|\xi|}{\lambda},\ 0\Bigr\} \tag{1.29}$$
is the dilated hat function with height 1 supported on $[-\lambda, \lambda]$. In particular,
$W(\xi/\lambda) \to 1$ pointwise as $\lambda \to \infty$.

1.6.2 Proof of the Inversion Formula

To motivate our first step towards the proof of the Inversion Formula, choose
any $f \in L^1(\mathbb{R})$. Then $f * w_\lambda \in L^1(\mathbb{R})$ and also $(f * w_\lambda)^\wedge = \hat{f}\,\hat{w}_\lambda$. We don’t
know this yet, but we will show that $\hat{w}_\lambda(\xi) = W(\xi/\lambda) \in L^1(\mathbb{R})$. Once we
establish this, it follows (since $\hat{f}$ is bounded) that
$$(f * w_\lambda)^\wedge = \hat{f}\,\hat{w}_\lambda \in L^1(\mathbb{R}).$$

Hence, if only we already knew that the Inversion Formula was valid, we could
compute that
$$(f * w_\lambda)(x) = \bigl((f * w_\lambda)^\wedge\bigr)^\vee(x) = (\hat{f}\,\hat{w}_\lambda)^\vee(x) = \int \hat{f}(\xi)\,W(\xi/\lambda)\,e^{2\pi i \xi x}\,d\xi.$$

Unfortunately, these calculations are not yet justified since, among other
things, they rely on the Inversion Formula, which has not yet been proved.
However, instead of trying to justify the full Inversion Formula right now, we
begin with a much smaller step: We show directly that the specific equality
$(f * w_\lambda)(x) = \int \hat{f}(\xi)\,W(\xi/\lambda)\,e^{2\pi i \xi x}\,d\xi$ holds when $f \in L^1(\mathbb{R})$.

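Theorem 1.88. If $f \in L^1(\mathbb{R})$, then for each $\lambda > 0$ we have
$$(f * w_\lambda)(x) = \int \hat{f}(\xi)\,W(\xi/\lambda)\,e^{2\pi i \xi x}\,d\xi
= \int_{-\lambda}^{\lambda} \hat{f}(\xi)\Bigl(1 - \frac{|\xi|}{\lambda}\Bigr) e^{2\pi i \xi x}\,d\xi. \tag{1.30}$$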
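Proof. We already know that
$$w(x) = W^\vee(x) = \int W(\xi)\,e^{2\pi i \xi x}\,d\xi,$$
and therefore by making a change of variables it follows that
$$w_\lambda(x) = \lambda w(\lambda x) = \int W(\xi/\lambda)\,e^{2\pi i \xi x}\,d\xi.$$

Suppose that $f \in L^1(\mathbb{R})$ is given. Assuming for the moment that the use of
Fubini’s Theorem in the following calculation is justified, we have that
$$\begin{aligned}
(f * w_\lambda)(x) &= \int f(y)\,w_\lambda(x - y)\,dy \\
&= \int f(y) \int W(\xi/\lambda)\,e^{2\pi i \xi (x - y)}\,d\xi\,dy \\
&= \int f(y) \int_{-\lambda}^{\lambda} \Bigl(1 - \frac{|\xi|}{\lambda}\Bigr) e^{2\pi i \xi (x - y)}\,d\xi\,dy \\
&= \int_{-\lambda}^{\lambda} \Bigl(\int f(y)\,e^{2\pi i \xi (x - y)}\,dy\Bigr) \Bigl(1 - \frac{|\xi|}{\lambda}\Bigr)\,d\xi \\
&= \int_{-\lambda}^{\lambda} \hat{f}(\xi) \Bigl(1 - \frac{|\xi|}{\lambda}\Bigr) e^{2\pi i \xi x}\,d\xi.
\end{aligned}$$
Of course, the applicability of Fubini’s Theorem does have to be justified, and
we assign this task to the reader as an exercise. ⊓⊔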

Now we obtain the Inversion Formula by taking the limit in equation
(1.30).

Theorem 1.89 (Inversion Formula). If $f, \hat{f} \in L^1(\mathbb{R})$, then $f$ and $\hat{f}$ are
continuous, and
$$f(x) = (\hat{f})^\vee(x) = \int \hat{f}(\xi)\,e^{2\pi i \xi x}\,d\xi, \tag{1.31}$$
with equality holding pointwise everywhere, and similarly
$$f(x) = (f^\vee)^\wedge(x) = \int f^\vee(\xi)\,e^{-2\pi i \xi x}\,d\xi$$
for every $x$.
Proof. Since $f \in L^1(\mathbb{R})$, we know that $\hat{f}$ is a continuous function on $\mathbb{R}$. Therefore,
for any fixed $x$, we have the pointwise limit
$$\forall\, \xi \in \mathbb{R}, \quad \lim_{\lambda \to \infty} \hat{f}(\xi)\,W(\xi/\lambda)\,e^{2\pi i \xi x} = \hat{f}(\xi)\,e^{2\pi i \xi x}.$$

Further, since $0 \le W \le 1$,
$$|\hat{f}(\xi)\,W(\xi/\lambda)\,e^{2\pi i \xi x}| \le |\hat{f}(\xi)| \in L^1(\mathbb{R}).$$

Since $f * w_\lambda$ is continuous by Theorem 1.36, it is defined pointwise everywhere.
By Theorem 1.88 and the Lebesgue Dominated Convergence Theorem, we
therefore have for each $x$ that
$$\lim_{\lambda \to \infty} (f * w_\lambda)(x) = \lim_{\lambda \to \infty} \int \hat{f}(\xi)\,W(\xi/\lambda)\,e^{2\pi i \xi x}\,d\xi
= \int \hat{f}(\xi)\,e^{2\pi i \xi x}\,d\xi = (\hat{f})^\vee(x).$$

On the other hand, $f * w_\lambda \to f$ in $L^1$-norm, so there is a subsequence such
that $(f * w_{\lambda_k})(x) \to f(x)$ for almost every $x$. Therefore $(\hat{f})^\vee(x) = f(x)$ a.e.
Since $(\hat{f})^\vee$ is continuous, by redefining $f$ on a set of measure zero we can
assume that equality holds pointwise everywhere. ⊓⊔

Observe that the hypothesis $f, \hat{f} \in L^1(\mathbb{R})$ can be equivalently restated as

$$f, \hat{f} \in L^1(\mathbb{R}) \iff f \in L^1(\mathbb{R}) \cap A(\mathbb{R}).$$

Corollary 1.90. If $f, \hat{f} \in L^1(\mathbb{R})$, then $f, \hat{f} \in C_0(\mathbb{R})$. ♦

One consequence of the Inversion Formula is that we can now justify our
hope, presented in equation (1.28), that $\hat{w} = W$. Note that since $w$ and $W$ are
even, their Fourier and inverse Fourier transforms coincide.

Corollary 1.91. $\hat{w} = (W^\vee)^\wedge = W = (\hat{W})^\vee = w^\vee$.

Proof. This follows from the Inversion Formula, the fact that $w$ and $W$ both
belong to $L^1(\mathbb{R})$, and our proof in equation (1.27) that $\hat{W} = w$. ⊓⊔

An additional consequence is that the Fourier transform is an injective
map on $L^1(\mathbb{R})$.

Theorem 1.92 (Uniqueness Theorem). If $f, g \in L^1(\mathbb{R})$, then

$$f = g \text{ a.e.} \iff \hat{f} = \hat{g} \text{ a.e.}$$

In particular,
$$f = 0 \text{ a.e.} \iff \hat{f} = 0 \text{ a.e.}$$

Proof. Since the Fourier transform is linear, the first equivalence is a consequence
of the second. If $f = 0$ a.e., then $\hat{f} = 0$ everywhere by definition
of the Fourier transform. On the other hand, if $\hat{f} = 0$ a.e., then we
have both $f, \hat{f} \in L^1(\mathbb{R})$, so the Inversion Formula applies, and we obtain
$f = (\hat{f})^\vee = 0^\vee = 0$. ⊓⊔
Note that we could also appeal to Theorem 1.88 to give another proof of
the Uniqueness Theorem.

1.6.3 Summability
If $\hat{f} \notin L^1(\mathbb{R})$ then the Inversion Formula need not hold. However, as an
immediate consequence of Theorem 1.88, we have the following approximation
formula for $f$ in terms of $\hat{f}$ that is valid for all $f \in L^1(\mathbb{R})$, even if $\hat{f}$ is not
integrable.
Theorem 1.93. If $f \in L^1(\mathbb{R})$, then
$$\int_{-\lambda}^{\lambda} \hat{f}(\xi) \Bigl(1 - \frac{|\xi|}{\lambda}\Bigr) e^{2\pi i \xi x}\,d\xi \to f \quad \text{in } L^1\text{-norm as } \lambda \to \infty. \tag{1.32}$$
Proof. Simply note that, by Theorem 1.88, the left-hand side of equation
(1.32) is $f * w_\lambda$. ⊓⊔
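As a numerical illustration, consider the one-sided exponential $f(x) = e^{-x}\chi_{[0,\infty)}(x)$ from Section 1.6, whose transform $\hat f(\xi) = 1/(2\pi i\xi + 1)$ is not integrable. Theorem 1.93 gives convergence in $L^1$-norm; since this $f$ is bounded and continuous at $x = 1$, Problem 1.36 also gives pointwise convergence there. The sketch below evaluates the left-hand side of (1.32) at $x = 1$ with a hand-written Simpson rule. The helper names and the sampling density `per_unit` are illustrative assumptions. Because $f$ is real, $\hat f(-\xi) = \overline{\hat f(\xi)}$, so the integral over $[-\lambda, \lambda]$ equals twice the real part of the integral over $[0, \lambda]$.

```python
import cmath
import math

def f_hat(xi):
    """Fourier transform of f(x) = exp(-x) for x >= 0 (and 0 for x < 0)."""
    return 1.0 / (2j * math.pi * xi + 1.0)

def simpson(func, a, b, n):
    """Composite Simpson rule on [a, b]; n must be even. Works for complex values."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def fejer_mean(x, lam, per_unit=200):
    """Left-hand side of (1.32) at the point x, via symmetry of f_hat for real f."""
    n = 2 * max(1, int(per_unit * lam))  # even number of subintervals
    def integrand(xi):
        return f_hat(xi) * (1.0 - xi / lam) * cmath.exp(2j * math.pi * xi * x)
    return 2.0 * simpson(integrand, 0.0, lam, n).real

target = math.exp(-1.0)
for lam in (5.0, 20.0, 80.0):
    value = fejer_mean(1.0, lam)
    print(f"lambda = {lam:5.1f}   mean = {value:.5f}   f(1) = {target:.5f}")
```

The printed means approach $f(1) = e^{-1}$ as $\lambda$ increases, even though the unweighted integral $\int \hat f(\xi)\,e^{2\pi i \xi x}\,d\xi$ does not converge absolutely.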

Thus, even if $\hat{f} \notin L^1(\mathbb{R})$, we have an “approximate” interpretation of the
formula $f = (\hat{f})^\vee$ in the sense that equation (1.32) will hold. This is analogous
to using summability conditions for evaluating divergent series. For example,
consider a formal¹ bi-infinite series of the form $\sum_{k=-\infty}^{\infty} a_k$. Let us say that
this series converges and equals $L$ if the symmetric partial sums
$$s_N = \sum_{k=-N}^{N} a_k$$
converge to $L$. The Cesàro means or arithmetic means of these partial sums
are
$$\sigma_N = \frac{s_0 + s_1 + \cdots + s_N}{N + 1}.$$
If the series $\sum_{k=-\infty}^{\infty} a_k$ converges (for example, if $a = (a_k)_{k \in \mathbb{Z}}$ is a summable
sequence), then the Cesàro means will converge to the same limit. However,
the Cesàro means may converge even when the partial sums do not; if the
Cesàro means converge then we say that $\sum_{k=-\infty}^{\infty} a_k$ is Cesàro summable.
Further, the form of the Cesàro means given in the next exercise suggests an
interesting analogy with Theorem 1.93.

¹ It is an amusing linguistic footnote that a formal statement in mathematics
means that the statement is entirely informal. In particular, we use the term “formal”
at this point because the symbols $\sum_{k=-\infty}^{\infty} a_k$ denote a completely arbitrary
series, without any requirement that it converge in any sense whatsoever.

Exercise 1.94. Given a sequence of scalars $a = (a_k)_{k \in \mathbb{Z}}$, let the partial sums
$s_N$ and Cesàro means $\sigma_N$ be as above.
(a) Show that
$$\sigma_N = \sum_{k=-N}^{N} \Bigl(1 - \frac{|k|}{N + 1}\Bigr) a_k.$$

(b) Show that if the partial sums $s_N$ converge, then the Cesàro means $\sigma_N$
converge to the same limit, i.e.,
$$\lim_{N \to \infty} \sum_{k=-N}^{N} \Bigl(1 - \frac{|k|}{N + 1}\Bigr) a_k = \lim_{N \to \infty} s_N = \sum_{k=-\infty}^{\infty} a_k. \tag{1.33}$$

(c) Set $a_k = (-1)^k$ for $k \ge 0$ and $a_k = 0$ for $k < 0$. Show that the series $\sum a_k$
is Cesàro summable even though the partial sums do not converge, and
find the limit of the Cesàro means.
(d) Show that if $a_k \ge 0$ for every $k$ then $\sigma_N$ converges if and only if $s_N$
converges. ♦
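Before attempting the proofs, it can help to experiment. The sketch below computes the symmetric partial sums and Cesàro means for the sequence in part (c), and compares the defining average with the weighted form from part (a). It is a numerical exploration only, and the function names are illustrative.

```python
def a(k):
    """The sequence from part (c): a_k = (-1)^k for k >= 0 and a_k = 0 for k < 0."""
    return (-1) ** k if k >= 0 else 0

def partial_sum(N):
    """Symmetric partial sum s_N = sum of a_k over -N <= k <= N."""
    return sum(a(k) for k in range(-N, N + 1))

def cesaro_mean(N):
    """Cesaro mean sigma_N = (s_0 + s_1 + ... + s_N) / (N + 1)."""
    return sum(partial_sum(n) for n in range(N + 1)) / (N + 1)

def weighted_form(N):
    """The weighted sum appearing in part (a)."""
    return sum((1 - abs(k) / (N + 1)) * a(k) for k in range(-N, N + 1))

for N in (5, 6, 50, 51, 500, 501):
    print(N, partial_sum(N), round(cesaro_mean(N), 6), round(weighted_form(N), 6))
```

The partial sums keep oscillating, while the last two columns agree with each other and settle down as $N$ grows.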
Comparing Theorem 1.93 to equation (1.33), we see that Theorem 1.93 is
a continuous version of Cesàro summability. Essentially, Theorem 1.93 says
that if $f \in L^1(\mathbb{R})$, then the formal integral
$$(\hat{f})^\vee(x) = \int_{-\infty}^{\infty} \hat{f}(\xi)\,e^{2\pi i \xi x}\,d\xi$$
is Cesàro summable to $f$ even if $\hat{f}$ is not integrable.

It is possible to make this analogy even more precise. Just as the integral
in equation (1.32) is related to the convolution of $f$ with the Fejér kernel,
specifically,
$$(f * w_\lambda)(x) = (\hat{f}\,\hat{w}_\lambda)^\vee(x) = \int_{-\lambda}^{\lambda} \hat{f}(\xi) \Bigl(1 - \frac{|\xi|}{\lambda}\Bigr) e^{2\pi i \xi x}\,d\xi,$$
it is also true that Cesàro means of an infinite series are related to a discrete
version of the Fejér kernel. Indeed, when comparing the Fourier transform to
Fourier series, the two Fejér kernels play entirely analogous roles. We expand
on this in Section 2.2.
As an application, we use Theorem 1.93 to obtain useful results regard-
ing functions with compactly supported Fourier transforms. Such functions
appear in a variety of contexts. For example, we will see them again in Chap-
ter 3 when we prove the Paley–Wiener Theorem (Section 3.5) and the energy
concentration theorems (Section 3.8).
Definition 1.95 (Bandlimited Function). We say that a function $f$ is
bandlimited if $\operatorname{supp}(\hat{f})$ is compact. ♦

Exercise 1.96. (a) Show that the space of bandlimited functions in $L^1(\mathbb{R})$,

$$\{f \in L^1(\mathbb{R}) : \operatorname{supp}(\hat{f}) \text{ is compact}\},$$

is dense in $L^1(\mathbb{R})$.
(b) Show that if $\Omega$ is fixed, then the space of functions in $L^1(\mathbb{R})$ bandlimited
to $[-\Omega, \Omega]$,
$$\{f \in L^1(\mathbb{R}) : \operatorname{supp}(\hat{f}) \subseteq [-\Omega, \Omega]\},$$
is a closed proper subspace of $L^1(\mathbb{R})$. ♦

1.6.4 Pointwise Inversion

As we have seen, the Cesàro means $\sigma_N$ of a formal infinite series may converge
even when the partial sums $s_N$ do not. For the Fourier transform, the analogues
of the Cesàro means are convolutions with the Fejér kernel:
$$(f * w_\lambda)(x) = \int_{-\lambda}^{\lambda} \hat{f}(\xi) \Bigl(1 - \frac{|\xi|}{\lambda}\Bigr) e^{2\pi i \xi x}\,d\xi.$$

We always have $f * w_\lambda \to f$ in $L^1$-norm when $f \in L^1(\mathbb{R})$.

Similarly, the analogues of the partial sums for the Fourier transform are
convolutions with the Dirichlet kernel. To see this, write $\chi_\lambda = \chi_{[-\lambda,\lambda]}$. Then
$\chi_\lambda^\vee = d_{2\pi\lambda}$, so
$$\begin{aligned}
(f * d_{2\pi\lambda})(x) &= \int f(y)\,\chi_\lambda^\vee(x - y)\,dy \\
&= \int f(y) \int_{-\lambda}^{\lambda} e^{2\pi i \xi (x - y)}\,d\xi\,dy \\
&= \int_{-\lambda}^{\lambda} \Bigl(\int f(y)\,e^{-2\pi i \xi y}\,dy\Bigr) e^{2\pi i \xi x}\,d\xi \\
&= \int_{-\lambda}^{\lambda} \hat{f}(\xi)\,e^{2\pi i \xi x}\,d\xi.
\end{aligned}$$

The interchange in the order of integration in this computation is justified by
Fubini’s Theorem, since $f(y)\,e^{-2\pi i \xi y}\,\chi_\lambda(\xi) \in L^1(\mathbb{R} \times [-\lambda, \lambda])$.
Unfortunately, the Dirichlet kernel is not an approximate identity, and the
question of convergence of $f * d_{2\pi\lambda}$ is much more delicate than the convergence
of $f * w_\lambda$. In general, if we know only that $f \in L^1(\mathbb{R})$ then $f * d_{2\pi\lambda}$ need not
converge to $f$ in $L^1$-norm. However, we will show that if we impose a modest
amount of local regularity on $f$ near a given point $x$ then $(f * d_{2\pi\lambda})(x)$ will
converge for that $x$, with limit $f(x)$ whenever $f$ is continuous at $x$.
The “local regularity” that we need is bounded variation on an interval
around $x$, and the main part of the argument is contained in the following
lemma. The proof of this lemma uses the Second Mean Value Theorem for
integrals. Problem 1.55 gives a short proof of a special case of the Second
Mean Value Theorem, and for the proof of the general case we refer to [Fol99,
Lem. 8.4.1].
Lemma 1.97. If $g \in BV[0, \delta]$, then
$$\lim_{\lambda \to \infty} \int_0^{\delta} g(x)\,d_{2\pi\lambda}(x)\,dx = \frac{g(0+)}{2}.$$
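Proof. The Jordan Decomposition Theorem (Problem 1.30) tells us that any
function $g \in BV[0, \delta]$ can be written as $g = g_1 - g_2$ where $g_1, g_2$ are increasing
on $[0, \delta]$. Therefore it suffices to assume that $g$ is increasing. Further, by
replacing $g(x)$ with $g(x) - g(0+)$, we may also assume that $g(0+) = 0$ (this is
permissible because $\int_0^\delta d_{2\pi\lambda}(x)\,dx = \int_0^{2\pi\lambda\delta} \frac{\sin x}{\pi x}\,dx \to \frac{1}{2}$ by Problem 1.43).
By Problem 1.43,
$$C = \sup_{a<b} \Bigl|\int_a^b \frac{\sin x}{x}\,dx\Bigr| < \infty.$$
Fix $\varepsilon > 0$. Then there exists an $\eta > 0$ such that $|g(x)| < \varepsilon$ for all $0 < x < \eta$.
Since $g$ is increasing and $d_{2\pi\lambda}$ is continuous, the Second Mean Value Theorem
for Integrals tells us that there exists some point $c \in [0, \eta]$ such that
$$\int_0^{\eta} g(x)\,d_{2\pi\lambda}(x)\,dx = g(0+) \int_0^{c} d_{2\pi\lambda}(x)\,dx + g(\eta-) \int_c^{\eta} d_{2\pi\lambda}(x)\,dx.$$

Since $g(0+) = 0$ and $|g(\eta-)| \le \varepsilon$, this implies that
$$\Bigl|\int_0^{\eta} g(x)\,d_{2\pi\lambda}(x)\,dx\Bigr|
= |g(\eta-)|\,\Bigl|\int_c^{\eta} \frac{\sin 2\pi\lambda x}{\pi x}\,dx\Bigr|
= |g(\eta-)|\,\Bigl|\int_{2\pi\lambda c}^{2\pi\lambda\eta} \frac{\sin x}{\pi x}\,dx\Bigr|
\le \varepsilon\,\frac{C}{\pi}. \tag{1.34}$$
Since $\eta > 0$, the function $f(x) = \frac{g(x)}{2\pi i x}\,\chi_{[\eta,\delta]}(x)$ is integrable on $\mathbb{R}$. Applying
the Riemann–Lebesgue Lemma, we therefore have
$$\lim_{\lambda \to \infty} \int_{\eta}^{\delta} g(x)\,d_{2\pi\lambda}(x)\,dx
= \lim_{\lambda \to \infty} \int_{\eta}^{\delta} \frac{g(x)}{2\pi i x}\,\bigl(e^{2\pi i \lambda x} - e^{-2\pi i \lambda x}\bigr)\,dx
= \lim_{\lambda \to \infty} \bigl(\hat{f}(-\lambda) - \hat{f}(\lambda)\bigr) = 0. \tag{1.35}$$

Combining equations (1.34) and (1.35), we see that
$$\limsup_{\lambda \to \infty} \Bigl|\int_0^{\delta} g(x)\,d_{2\pi\lambda}(x)\,dx\Bigr| \le \frac{\varepsilon C}{\pi}.$$
Since $\varepsilon$ is arbitrary, the result follows. ⊓⊔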

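Now we can prove Jordan’s Theorem on the pointwise convergence of the
“partial sums” $f * d_{2\pi\lambda}$. In the statement of this result we perform a
common abuse of notation by combining an almost everywhere requirement
($f \in L^1(\mathbb{R})$) with a pointwise everywhere requirement (bounded variation on
$[x - \delta, x + \delta]$). As usual, this means that there is a function $g$ defined everywhere
such that $f = g$ a.e. and $g \in BV[x - \delta, x + \delta]$.
Theorem 1.98 (Jordan’s Theorem). If $f \in L^1(\mathbb{R})$ and $f \in BV[x - \delta, x + \delta]$,
then
$$\lim_{\lambda \to \infty} (f * d_{2\pi\lambda})(x) = \frac{f(x+) + f(x-)}{2}.$$
Proof. Noting that $x$ is fixed, define
$$g(t) = f(x - t) + f(x + t) \quad\text{and}\quad h(t) = \frac{g(t)}{2\pi i t}\,\chi_{[\delta,\infty)}(t).$$
The hypotheses imply that $g \in BV[0, \delta]$ and $h \in L^1(\mathbb{R})$. Since $d_{2\pi\lambda}$ is even,
applying Lemma 1.97 and the Riemann–Lebesgue Lemma, we obtain
$$\begin{aligned}
\lim_{\lambda \to \infty} (f * d_{2\pi\lambda})(x)
&= \lim_{\lambda \to \infty} \int f(x - t)\,d_{2\pi\lambda}(t)\,dt \\
&= \lim_{\lambda \to \infty} \int_0^{\infty} \bigl(f(x - t) + f(x + t)\bigr)\,d_{2\pi\lambda}(t)\,dt \\
&= \lim_{\lambda \to \infty} \int_0^{\delta} g(t)\,d_{2\pi\lambda}(t)\,dt
 + \lim_{\lambda \to \infty} \int_{\delta}^{\infty} \frac{g(t)}{2\pi i t}\,\bigl(e^{2\pi i \lambda t} - e^{-2\pi i \lambda t}\bigr)\,dt \\
&= \frac{g(0+)}{2} + \lim_{\lambda \to \infty} \bigl(\hat{h}(-\lambda) - \hat{h}(\lambda)\bigr) \\
&= \frac{f(x+) + f(x-)}{2}. \qquad ⊓⊔
\end{aligned}$$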
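Jordan’s Theorem can be illustrated at a jump. For the one-sided exponential $f(x) = e^{-x}\chi_{[0,\infty)}(x)$ we have $f(0+) = 1$ and $f(0-) = 0$, and by the computation at the start of this section $(f * d_{2\pi\lambda})(0) = \int_{-\lambda}^{\lambda} \hat f(\xi)\,d\xi$ with $\hat f(\xi) = 1/(2\pi i\xi + 1)$. A direct calculation (taking the real part of the integrand) gives the closed form $\arctan(2\pi\lambda)/\pi$. The sketch below compares a Simpson-rule evaluation with that closed form; the helper names and the grid density are illustrative assumptions.

```python
import math

def f_hat(xi):
    """Fourier transform of f(x) = exp(-x) for x >= 0 (and 0 for x < 0)."""
    return 1.0 / (2j * math.pi * xi + 1.0)

def simpson(func, a, b, n):
    """Composite Simpson rule on [a, b]; n must be even. Works for complex values."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def dirichlet_mean_at_zero(lam):
    """(f * d_{2 pi lambda})(0), computed as the integral of f_hat over [-lam, lam]."""
    n = 2 * max(1, int(400 * lam))  # even number of subintervals
    return simpson(f_hat, -lam, lam, n).real

for lam in (1.0, 10.0, 50.0):
    numeric = dirichlet_mean_at_zero(lam)
    closed = math.atan(2.0 * math.pi * lam) / math.pi
    print(f"lambda = {lam:5.1f}   numeric = {numeric:.6f}   closed form = {closed:.6f}")
```

Both columns approach $\frac12$, the midpoint of the jump of $f$ at $x = 0$.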

1.6.5 Decay and Smoothness Revisited

There are many variations on the theme of duality between smoothness and
decay under the Fourier transform. Some of these were presented in Section 1.4.
As an application of the Inversion Formula, we now prove another
version, relating decay in frequency to smoothness in time.

Recall that $A(\mathbb{R}) = \{\hat{f} : f \in L^1(\mathbb{R})\}$. Hence if $f \in L^1(\mathbb{R})$ then $\hat{f} \in A(\mathbb{R})$ by
definition, and to say that $2\pi i \xi \hat{f}(\xi)$ belongs to $A(\mathbb{R})$ means that there exists
a function $g \in L^1(\mathbb{R})$ whose Fourier transform is $\hat{g}(\xi) = 2\pi i \xi \hat{f}(\xi)$. Note that
this is more than just a decay requirement on $\hat{f}$, for not only must we have
$2\pi i \xi \hat{f}(\xi) \in C_0(\mathbb{R})$, but we must have that $2\pi i \xi \hat{f}(\xi)$ belongs to the smaller
space $A(\mathbb{R})$.
Theorem 1.99. If $f \in L^1(\mathbb{R})$ and $2\pi i \xi \hat{f}(\xi) \in A(\mathbb{R})$, then $f$ is differentiable
a.e. and $f' = g$ a.e. where $g \in L^1(\mathbb{R})$ is the function that satisfies
$\hat{g}(\xi) = 2\pi i \xi \hat{f}(\xi)$.

Proof. First consider the case where we have the additional hypothesis that
$\hat{g} \in L^1(\mathbb{R})$. Since $2\pi i \xi \hat{f}(\xi) = \hat{g}(\xi) \in L^1(\mathbb{R})$ and $\hat{f}$ is continuous, it follows that
$\hat{f} \in L^1(\mathbb{R})$. Therefore we can apply the inverse Fourier transform analogue
of Theorem 1.55 to the function $\hat{f}$ and conclude that $(\hat{f})^\vee \in C_0^1(\mathbb{R})$. Since
the Inversion Formula applies to $f$, we know that $f = (\hat{f})^\vee$ is differentiable
everywhere and $f' \in C_0(\mathbb{R})$. Furthermore,
$$f' = \bigl((\hat{f})^\vee\bigr)' = \bigl(2\pi i \xi \hat{f}(\xi)\bigr)^\vee = (\hat{g})^\vee = g,$$
with equality holding everywhere.

Now consider the general case, i.e., $f, g \in L^1(\mathbb{R})$ and $\hat{g}(\xi) = 2\pi i \xi \hat{f}(\xi)$.
Let $k$ be any function in $C_c^2(\mathbb{R})$ that satisfies $\int k = 1$. By Theorem 1.65 (and
equation (1.24) in particular), we have $\hat{k} \in L^1(\mathbb{R})$. Define $k_\lambda(x) = \lambda k(\lambda x)$, so
$\{k_\lambda\}_{\lambda>0}$ is an approximate identity. We have $f * k_\lambda, g * k_\lambda \in L^1(\mathbb{R})$, and also,
since $\hat{k}$ is integrable and $\hat{g}$ is bounded, $(g * k_\lambda)^\wedge \in L^1(\mathbb{R})$. Further,
$$(g * k_\lambda)^\wedge(\xi) = \hat{g}(\xi)\,\hat{k}_\lambda(\xi) = 2\pi i \xi\,\hat{f}(\xi)\,\hat{k}_\lambda(\xi) = 2\pi i \xi\,(f * k_\lambda)^\wedge(\xi).$$

The previous case therefore implies that $f * k_\lambda$ is differentiable everywhere
and $(f * k_\lambda)' = g * k_\lambda$. Since $g * k_\lambda \in L^1(\mathbb{R})$, we conclude that $f * k_\lambda$ is absolutely
continuous on every interval $[a, b]$ (see Theorem 1.62). Consequently,
the Fundamental Theorem of Calculus applies, so we have for any $a < b$ that
$$(f * k_\lambda)(b) - (f * k_\lambda)(a) = \int_a^b (g * k_\lambda)(t)\,dt \to \int_a^b g(t)\,dt,$$
the convergence following from the fact that $g * k_\lambda \to g$ in $L^1(\mathbb{R})$ as $\lambda \to \infty$.

On the other hand, by Theorem 1.77,
$$(f * k_\lambda)(x) \to f(x) \quad \text{for a.e. } x.$$

Hence we have for almost every $a < b$ that
$$f(b) - f(a) = \int_a^b g(t)\,dt. \tag{1.36}$$

By redefining $f$ on a set of measure zero, we can assume that equation (1.36)
holds for all $a < b$. Therefore $f$ is absolutely continuous on every interval
$[a, b]$, $f$ is differentiable a.e., and $f' = g$ a.e. (see Theorem 1.59). ⊓⊔


Additional Problems

1.45. Show that if $f, \hat{f} \in L^1(\mathbb{R})$, then $f \in L^p(\mathbb{R})$ for every $1 \le p \le \infty$.

1.46. Suppose that $f \in L^1(\mathbb{R})$. Show that if $\hat{f}$ is even then $f$ is even, and if $\hat{f}$
is odd then $f$ is odd. Compare Problem 1.2.

1.47. Show that if $g \in L^1(\mathbb{R})$ and $f, \hat{f} \in L^1(\mathbb{R})$, then $(fg)^\wedge = \hat{f} * \hat{g}$.

1.48. Show that
$$\int \frac{\sin \pi x}{x}\,e^{-2\pi|x| + \pi i x}\,dx = \frac{\pi}{4}.$$

1.49. Show that the only function in $L^1(\mathbb{R})$ that satisfies $f = f * f$ is $f = 0$ a.e.
(compare Problem 3.5).

1.50. Show that if $g \in L^1(\mathbb{R})$, $g \ne 0$, then $\{T_a g\}_{a \in \mathbb{R}}$ is finitely linearly independent,
i.e., every finite subset is linearly independent.

1.51. Prove the following variation on the theme “decay in frequency implies
smoothness in time”: If $f \in L^1(\mathbb{R})$ and there exist $C > 0$ and $0 < \alpha < 1$ such
that
$$\forall\, \xi \in \mathbb{R}, \quad |\hat{f}(\xi)| \le \frac{C}{|\xi|^{1+\alpha}},$$
then $f$ is Hölder continuous with exponent $\alpha$ (see Definition 1.75).

1.52. Show that the Fourier transform maps $L^1(\mathbb{R}) \cap A(\mathbb{R})$ bijectively onto
itself, and that $L^1(\mathbb{R}) \cap A(\mathbb{R}) \subseteq C_0(\mathbb{R})$.

1.53. For $x, \xi \in \mathbb{R}$, define
$$m_x = \inf\bigl\{\|f\|_1 + \|\hat{f}\|_1 : f \in L^1(\mathbb{R}) \cap A(\mathbb{R}) \text{ and } f(x) = 1\bigr\},$$
$$m_\xi = \inf\bigl\{\|f\|_1 + \|\hat{f}\|_1 : f \in L^1(\mathbb{R}) \cap A(\mathbb{R}) \text{ and } \hat{f}(\xi) = 1\bigr\}.$$

Show that $m_x = m_\xi = 1$ for every $x$ and $\xi$.

1.54. Let $B_0 = \chi_{[0,1]}$, and recursively define the $n$th B-spline $B_n$ by

$$B_n = B_{n-1} * \chi_{[0,1]}.$$

B-splines and more general splines have applications in numerical analysis,
computer graphics, and many other areas.
(a) Find explicit formulas for $B_1$, $B_2$, and $B_2'$. Find an explicit formula for
$\hat{B}_n$, and show that $\hat{B}_n \in L^1(\mathbb{R})$ for all $n > 0$.
(b) Prove that $B_n' = T_1 B_{n-1} - B_{n-1}$ for $n > 1$ ($T_1$ is the translation operator).
(c) Show that $B_n \in C_c^{n-1}(\mathbb{R})$ for $n > 0$, and that $B_n^{(n-1)}$ is piecewise linear.
(d) Prove that there exist scalars $c_k^n$ such that $B_n$ satisfies the refinement
equation
$$B_n(x) = \sum_{k=0}^{n+1} c_k^n\,B_n(2x - k).$$

(e) Prove that there exists a 1-periodic function $m_0$ (in fact, a trigonometric
polynomial) such that $\hat{B}_n(\xi) = m_0(\xi/2)\,\hat{B}_n(\xi/2)$.

1.55. This problem will prove a special case of the Second Mean Value Theorem
for Integrals. Assume that $h$ is both continuous and nonnegative on $[a, b]$
and $g$ is monotone increasing on $[a, b]$ with $g(a+) \ge 0$. Define
$$G(x) = g(a+) \int_a^x h(t)\,dt + g(b-) \int_x^b h(t)\,dt,$$
and show that $G(b) \le \int_a^b g(t)\,h(t)\,dt \le G(a)$. Apply the Intermediate Value
Theorem to show there exists a point $c \in [a, b]$ such that
$$\int_a^b g(t)\,h(t)\,dt = G(c) = g(a+) \int_a^c h(t)\,dt + g(b-) \int_c^b h(t)\,dt.$$

1.7 The Range of the Fourier Transform


We know that $A(\mathbb{R})$, the range of the Fourier transform acting on $L^1(\mathbb{R})$, is a
subspace of $C_0(\mathbb{R})$. We will show that it is a dense, but proper, subspace of
$C_0(\mathbb{R})$.
To begin, the fact that $A(\mathbb{R})$ is dense is a consequence of the smoothness
versus decay dualities of the Fourier transform combined with the Inversion
Formula: any sufficiently smooth function $f \in L^1(\mathbb{R})$ will have a Fourier
transform $\hat{f}$ that decays quickly enough that we must have $\hat{f} \in L^1(\mathbb{R})$.

Exercise 1.100. Show that if $f \in C_c^2(\mathbb{R})$ then $\hat{f} \in L^1(\mathbb{R})$. Conclude that
$C_c^2(\mathbb{R}) \subseteq A(\mathbb{R})$, and that $A(\mathbb{R})$ is dense in $C_0(\mathbb{R})$. ♦

The next exercise (taken from [Fol99]) shows that $A(\mathbb{R})$ is a proper subset
of $C_0(\mathbb{R})$, although the argument is implicit in the sense that it does not
construct a specific example of a function in $C_0(\mathbb{R}) \setminus A(\mathbb{R})$. The main point is
that if the Fourier transform $\mathcal{F}$ were a bounded map of $L^1(\mathbb{R})$ onto $C_0(\mathbb{R})$,
then, since both of these are Banach spaces, the Inverse Mapping Theorem
(Theorem C.14) would imply that $\mathcal{F}$ had a bounded inverse. However, the
exercise shows that $\mathcal{F}^{-1}$ is not a bounded map of $A(\mathbb{R})$ (under the $L^\infty$-norm)
back to $L^1(\mathbb{R})$.

Exercise 1.101. Define $f_k = \chi_{[-1,1]} * \chi_{[-k,k]}$ for $k \in \mathbb{N}$.
(a) Find an explicit formula for $f_k$ and show that $\|f_k\|_\infty = 2$.
(b) Show that $\lim_{k \to \infty} \|\hat{f}_k\|_1 = \infty$.
(c) Show that $\mathcal{F}^{-1} : (A(\mathbb{R}), \|\cdot\|_\infty) \to L^1(\mathbb{R})$ is unbounded, and conclude that
$A(\mathbb{R}) \subsetneq C_0(\mathbb{R})$. ♦
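The growth in part (b) can be observed numerically before it is proved. Since the transform of $\chi_{[-a,a]}$ is $d_{2\pi a}(\xi) = \frac{\sin 2\pi a\xi}{\pi\xi}$, the convolution theorem gives $\hat f_k(\xi) = \frac{\sin(2\pi\xi)\,\sin(2\pi k\xi)}{\pi^2\xi^2}$. The sketch below approximates $\|\hat f_k\|_1$ by a midpoint rule on a truncated interval. The cutoff and grid size are arbitrary illustrative choices, and truncation only lowers the computed value (by at most about $2/(\pi^2 \cdot \text{cutoff})$).

```python
import math

def f_k_hat_abs(xi, k):
    """|f_k_hat(xi)| = |sin(2 pi xi) sin(2 pi k xi)| / (pi xi)^2 for xi != 0."""
    return abs(math.sin(2 * math.pi * xi) * math.sin(2 * math.pi * k * xi)) / (math.pi * xi) ** 2

def l1_norm_truncated(k, cutoff=20.0, n=200000):
    """Midpoint-rule approximation of the integral of |f_k_hat| over [-cutoff, cutoff]."""
    h = cutoff / n
    total = 0.0
    for i in range(n):
        xi = (i + 0.5) * h          # midpoints avoid the removable point xi = 0
        total += f_k_hat_abs(xi, k)
    return 2.0 * total * h          # the integrand is even in xi

for k in (1, 8, 64):
    print(f"k = {k:3d}   truncated L1 norm of f_k_hat = {l1_norm_truncated(k):.4f}")
```

The computed norms increase slowly with $k$ (consistent with logarithmic growth), while $\|f_k\|_\infty = 2$ stays fixed.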
While we now know that A(R) is a proper subset of C0 (R), we do not
yet have any explicit examples of functions in C0 (R)\A(R). The next exercise
will construct such an example (this construction is based on Goldberg’s text
[Gol61, p. 8]).
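Exercise 1.102. (a) Show that if $f \in L^1(\mathbb{R})$ and $f$ is odd, then
$$\sup_{b \ge 1} \Bigl|\int_1^b \frac{\hat{f}(\xi)}{\xi}\,d\xi\Bigr| < \infty.$$

(b) Show that if $f \in L^1(\mathbb{R})$ is odd, $\hat{f}$ is differentiable at $\xi = 0$, and $\hat{f} \ge 0$
on $(0, \infty)$, then $\hat{f}(\xi)/\xi \in L^1(\mathbb{R})$.
(c) Define
$$F(\xi) = \begin{cases} 1/\ln \xi, & \xi > e, \\ \xi/e, & -e \le \xi \le e, \\ -1/\ln(-\xi), & \xi < -e, \end{cases}$$
and show that $F \in C_0(\mathbb{R}) \setminus A(\mathbb{R})$. ♦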
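In fact, there exist compactly supported functions in $C_0(\mathbb{R})$ that do not
belong to $A(\mathbb{R})$. An example is constructed in [Her85], where it is shown that
$$B(\xi) = \begin{cases} \frac{1}{n} \sin(2\pi 4^n \xi), & \frac{1}{2^{n+1}} \le |\xi| \le \frac{1}{2^n}, \\ 0, & \xi \le 0 \text{ or } |\xi| > \frac{1}{2}, \end{cases}$$
($B$ for “butterfly”) belongs to $C_c(\mathbb{R}) \setminus A(\mathbb{R})$.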
Although it is dense in C0 (R), in a topological sense A(R) is only a “small”
part of C0 (R). One consequence of the Open Mapping Theorem is that if
T : X → Y is a bounded linear map on a Banach space X and its range T (X)
is a dense but proper subspace of a Banach space Y, then T (X) is a meager
or first category subset of Y (Problem 1.57). Applying this to the mapping
F : L1 (R) → C0 (R), we conclude that A(R) is only a meager subset of C0 (R).

Additional Problems
1.56. Define Ac (R) = {F ∈ A(R) : supp(F ) is compact}. Show that Ac (R) is
a dense subspace of A(R) in the norm of A(R). Compare Problem 1.23, which
shows that Ac (R) is an ideal in A(R).
1.57. Let X and Y be Banach spaces. Show that T ∈ B(X, Y ) is surjective if
and only if range(T ) is not meager in Y.
Fig. 1.16. Graph of the butterfly function.

1.8 Some Special Kernels


In addition to the Fejér function, there are several very useful special functions
that can be used to generate approximate identities.
First, we modify the Fejér kernel to obtain an approximate identity
$\{v_\lambda\}_{\lambda>0}$ that has the appealing property that $\hat{v}_\lambda = 1$ on $[-\lambda, \lambda]$. As we have
seen, any approximate identity $\{k_\lambda\}_{\lambda>0}$ has the property that $(f * k_\lambda)^\wedge(\xi) =
\hat{f}(\xi)\,\hat{k}_\lambda(\xi)$ converges pointwise to $\hat{f}$. Using the approximate identity $\{v_\lambda\}_{\lambda>0}$
we will have the extra property that $\hat{f}(\xi)$ is actually equal to $\hat{f}(\xi)\,\hat{v}_\lambda(\xi)$ on
the compact set $[-\lambda, \lambda]$.

Definition 1.103 (de la Vallée–Poussin Kernel). Let $w$ be the Fejér function.
Then the de la Vallée–Poussin function is

$$v(x) = 2w_2(x) - w(x) = 4w(2x) - w(x),$$

and the de la Vallée–Poussin kernel is $\{v_\lambda\}_{\lambda>0}$ where $v_\lambda(x) = \lambda v(\lambda x)$. ♦

Charles Jean de la Vallée–Poussin (1866–1962) is perhaps best known for
the fact that he and Jacques Hadamard (1865–1963) independently gave, in
1896, the first proofs of the Prime Number Theorem.
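Exercise 1.104. Show that $\int v = 1$ (so $\{v_\lambda\}_{\lambda>0}$ is an approximate identity),
and show that
$$\hat{v}_\lambda(\xi) = \begin{cases} 1, & |\xi| \le \lambda, \\ \text{linear}, & \text{on } [-2\lambda, -\lambda] \text{ and } [\lambda, 2\lambda], \\ 0, & |\xi| \ge 2\lambda. \end{cases}$$ ♦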
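The shape of $\hat v_\lambda$ can be previewed with a few lines of code. Since $v_\lambda = 2w_{2\lambda} - w_\lambda$ and $\hat w_\lambda(\xi) = W(\xi/\lambda)$ (Corollary 1.91 combined with a dilation), linearity of the Fourier transform gives $\hat v_\lambda(\xi) = 2W(\xi/(2\lambda)) - W(\xi/\lambda)$. The sketch below simply evaluates this expression; it is a sanity check rather than a proof, and the function names are illustrative.

```python
def hat(xi):
    """The hat function W(xi) = max(1 - |xi|, 0)."""
    return max(1.0 - abs(xi), 0.0)

def v_hat(xi, lam=1.0):
    """Transform of the de la Vallee-Poussin kernel: 2 W(xi / (2 lam)) - W(xi / lam)."""
    return 2.0 * hat(xi / (2.0 * lam)) - hat(xi / lam)

lam = 4.0
for xi in (0.0, 2.0, 4.0, 5.0, 6.0, 7.0, 8.0, 10.0):
    print(f"xi = {xi:5.1f}   v_hat = {v_hat(xi, lam):.3f}")
```

With $\lambda = 4$ the values equal 1 for $|\xi| \le 4$, fall linearly to 0 at $|\xi| = 8$, and vanish beyond that, matching the trapezoid described in the exercise.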
Fig. 1.17. Graph of the de la Vallée–Poussin function $v$.

Fig. 1.18. Graph of the Fourier transform $\hat{v}$.

To illustrate its use, we hint that an “easy” solution to the following ex-
ercise can be obtained by using the de la Vallée–Poussin kernel.
Exercise 1.105. Prove that if $f \in L^1(\mathbb{R})$, then

\[ \operatorname{supp}(\widehat{f}) \text{ is compact} \iff \exists\, g \in L^1(\mathbb{R}) \text{ such that } f = f * g. \] ♦


Note that the preceding exercise characterizes those functions f ∈ L1 (R)
such that f belongs to f ∗ L1 (R); compare Exercise 1.48 and Problem 1.39.
Poisson’s kernel, named after Siméon-Denis Poisson (1781–1840), is related
to the two-sided exponential function (see Problem 1.1).
Definition 1.106 (Poisson Kernel). The Poisson function is

\[ p(x) = \frac{1}{\pi (x^2 + 1)}, \]

and the Poisson kernel is $\{p_\lambda\}_{\lambda>0}$ where
$p_\lambda(x) = \lambda p(\lambda x)$. ♦

Exercise 1.107. Show that $\int p = 1$ and $\widehat{p}(\xi) = e^{-2\pi|\xi|}$. ♦
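A numerical check of Exercise 1.107 is easiest in the inverse direction, because $e^{-2\pi|\xi|}$ decays exponentially while $p$ decays only like $1/x^2$. The following Python sketch assumes the convention $\widehat{f}(\xi) = \int f(x)\, e^{-2\pi i \xi x}\, dx$ and relies on the Inversion Formula: if $\widehat{p}(\xi) = e^{-2\pi|\xi|}$, then integrating $e^{-2\pi|\xi|}\, e^{2\pi i \xi x}$ over $\xi$ must return $p(x)$. The function names, cutoff, and step count are illustrative choices, not part of the text.

```python
import math

def poisson(x):
    """Poisson function p(x) = 1 / (pi * (x^2 + 1))."""
    return 1.0 / (math.pi * (x * x + 1.0))

def inverse_transform_of_exp(x, cutoff=8.0, steps=8000):
    """Trapezoid-rule value of the integral of exp(-2*pi*|xi|) * exp(2*pi*i*xi*x).

    By symmetry this equals 2 * integral_0^cutoff exp(-2*pi*xi) * cos(2*pi*xi*x) dxi,
    up to a truncation error of order exp(-2*pi*cutoff).
    """
    h = cutoff / steps

    def integrand(xi):
        return 2.0 * math.exp(-2.0 * math.pi * xi) * math.cos(2.0 * math.pi * xi * x)

    total = 0.5 * (integrand(0.0) + integrand(cutoff))
    total += sum(integrand(k * h) for k in range(1, steps))
    return h * total

for x in (0.0, 0.5, 1.0, 2.0):
    print(x, inverse_transform_of_exp(x), poisson(x))
```

The two printed columns should agree to within the discretization error of the trapezoid rule. The case $x = 0$ also reflects $\int \widehat{p} = p(0) = 1/\pi$.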
A Gaussian function is a function of the form $a e^{-(x-b)^2/(2c^2)}$. In
Chapter 3 we will see that Gaussian functions are the functions that are “best
concentrated” with respect to the Uncertainty Principle for the Fourier trans-
form. For this reason, Gaussian functions are ubiquitous in Fourier analysis.
In particular, we can use an appropriately normalized Gaussian to generate
an approximate identity. Of course, Gaussian functions are named after Carl
Friedrich Gauss (1777–1855).
Fig. 1.19. Graph of the Poisson function $p$.

Fig. 1.20. Graph of the Fourier transform $\widehat{p}$.

Definition 1.108 (Gauss Kernel). Let $\varphi$ denote the Gaussian function

\[ \varphi(x) = e^{-\pi x^2}. \]

Then the Gauss kernel is $\{\varphi_\lambda\}_{\lambda>0}$ where
$\varphi_\lambda(x) = \lambda \varphi(\lambda x)$. ♦


In order to know that $\{\varphi_\lambda\}_{\lambda>0}$ is an approximate
identity, we must show that $\int \varphi = 1$. Fortunately, this is not hard
to prove. Unfortunately, it is more difficult to compute an explicit formula
for the Fourier transform $\widehat{\varphi}$. Aficionados of complex analysis
will recognize that one way to do this is by using contour integration, and we
encourage those readers familiar with this technique to use it to compute
$\widehat{\varphi}$. On the other hand, a real-variable approach to computing
$\widehat{\varphi}$ is given in the next exercise.

Exercise 1.109. Let $\Phi = \widehat{\varphi}$, where $\varphi(x) = e^{-\pi x^2}$.
(a) Use Theorems 1.55 and 1.65 to show that

Φ′ (ξ) = −2πξΦ(ξ).

(b) Solve the differential equation in part (a), and show that

Φ(ξ) = Φ(0) φ(ξ).

(c) Show that Φ(0) = 1. Conclude that Φ = φ, and therefore the Gauss kernel
is an approximate identity. ♦
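The conclusions of Exercise 1.109 can be checked numerically before they are proved. The sketch below assumes the convention $\widehat{f}(\xi) = \int f(x)\, e^{-2\pi i \xi x}\, dx$ and approximates the integral by the trapezoid rule on $[-6, 6]$, where the neglected Gaussian tail is far below rounding error. The helper names, the step size, and the finite-difference parameter are illustrative choices, not part of the exercise.

```python
import cmath
import math

def phi(x):
    """Gaussian function phi(x) = exp(-pi * x^2)."""
    return math.exp(-math.pi * x * x)

def fourier_transform(f, xi, half_width=6.0, steps=2400):
    """Trapezoid approximation of the integral of f(x) * exp(-2*pi*i*xi*x) over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0   # endpoint weights of the trapezoid rule
        total += weight * f(x) * cmath.exp(-2j * math.pi * xi * x)
    return h * total

def ode_residual(xi, delta=1e-4):
    """Size of Phi'(xi) + 2*pi*xi*Phi(xi), with Phi' taken as a central difference."""
    Phi = lambda s: fourier_transform(phi, s)
    derivative = (Phi(xi + delta) - Phi(xi - delta)) / (2.0 * delta)
    return abs(derivative + 2.0 * math.pi * xi * Phi(xi))

print("Phi(0) =", fourier_transform(phi, 0.0))           # part (c): should be close to 1
for xi in (0.3, 1.0, 1.7):
    print(xi, abs(fourier_transform(phi, xi) - phi(xi)))  # Phi = phi
print("ODE residual at 0.7:", ode_residual(0.7))          # part (a)
```

Agreement here is evidence for the identities, not a substitute for the argument via Theorems 1.55 and 1.65.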
Thus the Gaussian function $\varphi$ has the interesting property that
$\widehat{\varphi} = \varphi$,
and hence is a 1-eigenvector for the Fourier transform! Are there any other
such functions? Yes, the Hermite functions are eigenfunctions of the Fourier
transform, and they form an orthogonal basis for L2 (R). We will investigate
the Hermite functions in Section 3.2.
As an application, we will use the Gauss kernel to prove the Weierstrass
Approximation Theorem (the proof we give is taken from Stein and Shakarchi
[SS03a]).

Theorem 1.110 (Weierstrass Approximation Theorem). If $f \in C[a, b]$
and $\varepsilon > 0$, then there exists a polynomial $p$ such that

\[ \|f - p\|_\infty = \sup_{x \in [a,b]} |f(x) - p(x)| < \varepsilon. \]

Proof. Choose f ∈ C[a, b], and fix R large enough that [a, b] ⊆ (−R, R). Let g
be any continuous function on R supported in [−R, R] that equals f on [a, b].
Let $\{\varphi_\lambda\}_{\lambda>0}$ be the Gauss kernel. By Exercise 1.73,
$g * \varphi_\lambda$ converges uniformly to $g$, so we can choose a $\lambda$
such that

\[ \|g - g * \varphi_\lambda\|_\infty < \frac{\varepsilon}{2}. \tag{1.37} \]
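Since the Taylor series for $e^x$ converges uniformly on compact sets, the series

\[
\varphi_\lambda(x) = \lambda e^{-\pi \lambda^2 x^2}
= \lambda \sum_{n=0}^{\infty} \frac{(-\pi \lambda^2 x^2)^n}{n!}
= \sum_{n=0}^{\infty} \frac{(-1)^n \pi^n \lambda^{2n+1}}{n!}\, x^{2n}
\]

converges uniformly on $[-2R, 2R]$. Therefore, there exists an $N$ such that if
we set

\[ q(x) = \sum_{n=0}^{N} \frac{(-1)^n \pi^n \lambda^{2n+1}}{n!}\, x^{2n} \]

then

\[ \sup_{x \in [-2R, 2R]} |\varphi_\lambda(x) - q(x)| < \frac{\varepsilon}{2\|g\|_1}. \]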
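Therefore, for $x \in [-R, R]$ we have (note that $x - y \in [-2R, 2R]$
whenever $x, y \in [-R, R]$)

\begin{align*}
\bigl|(g * \varphi_\lambda)(x) - (g * q)(x)\bigr|
&\le \int_{-R}^{R} |g(y)|\, |\varphi_\lambda(x - y) - q(x - y)|\, dy \\
&\le \int_{-R}^{R} |g(y)|\, \frac{\varepsilon}{2\|g\|_1}\, dy = \frac{\varepsilon}{2}. \tag{1.38}
\end{align*}

Since $g$ and $f$ are equal on $[a, b]$, by combining equations (1.37) and
(1.38) with the Triangle Inequality, we see that

\[ \sup_{x \in [a,b]} |f(x) - (g * q)(x)| < \varepsilon. \]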
Finally, if we write out $g * q$ and expand using the binomial theorem, we see
that $g * q$ is actually a polynomial of degree at most $2N$:

\begin{align*}
(g * q)(x)
&= \sum_{n=0}^{N} \frac{(-1)^n \pi^n \lambda^{2n+1}}{n!} \int_{-R}^{R} g(y)\, (x - y)^{2n}\, dy \\
&= \sum_{n=0}^{N} \frac{(-1)^n \pi^n \lambda^{2n+1}}{n!} \sum_{j=0}^{2n} \binom{2n}{j}
   \biggl( \int_{-R}^{R} g(y)\, (-y)^{2n-j}\, dy \biggr) x^j.
\end{align*}

The integrals appearing above are finite since $g \in C_c(\mathbb{R})$. Thus
$p = g * q$ is the polynomial that we sought. ⊓⊔

Additional Problems

1.58. There are many functions that equal their own Fourier transforms. Show
that if $f, \widehat{f} \in L^1(\mathbb{R})$, then
$f^{\wedge\wedge\wedge\wedge} = f$, and consequently
$g = f + f^{\wedge} + f^{\wedge\wedge} + f^{\wedge\wedge\wedge}$
satisfies $\widehat{g} = g$.

Fig. 1.21. Graph of the real part of the chirp $e^{2\pi i 2 x^2}$.

1.59. Let $\varphi(x) = e^{-\pi x^2}$ be the Gaussian function.
(a) For $r > 0$, set $\phi_r(x) = e^{-\pi r x^2}$. Show that
$\widehat{\phi_r} = r^{-1/2} \phi_{1/r}$.
(b) Now we extend the definition of $\phi_r$ to complex parameters. Let $c =
a + ib \in \mathbb{C}$ be complex, and set

\[ \phi_c(x) = e^{-\pi c x^2} = e^{-\pi i b x^2}\, e^{-\pi a x^2}. \]

In engineering jargon, $\phi_c$ is a Gaussian multiplied by a chirp
$e^{-\pi i b x^2}$ (and the
resulting function is often also called a Gaussian function). Show that part (a)
extends to complex parameters with positive real part, i.e., if $c = a + ib$ and
$a > 0$, then $\widehat{\phi_c} = c^{-1/2} \phi_{1/c}$. This fact will be useful to us in Section 3.3.
Note: We take a complex square root that extends the square root of
the positive real numbers. For example, since $c = a + ib$ with $a > 0$, we
can write $c = r e^{i\theta}$ with $-\pi/2 < \theta < \pi/2$. We take
$c^{1/2} = r^{1/2} e^{i\theta/2}$ and $c^{-1/2} = r^{-1/2} e^{-i\theta/2}$.
Remark: If we regard $e^{2\pi i \xi x}$ as a “pure tone” of constant frequency,
then we can think of $e^{2\pi i \xi x^2}$ as a function whose frequency
increases with time; see the illustration in Figure 1.21. If played through a
speaker, such a function sounds something like a bird’s chirp.
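The identity in Problem 1.59(b), including the branch of the square root described in the Note, can be tested numerically. The sketch below assumes the convention $\widehat{f}(\xi) = \int f(x)\, e^{-2\pi i \xi x}\, dx$ and uses Python's `cmath.sqrt`, which returns the principal square root (nonnegative real part) and so agrees with $c^{1/2} = r^{1/2} e^{i\theta/2}$ for $-\pi/2 < \theta < \pi/2$. The helper names and the sample values of $c$ are illustrative choices.

```python
import cmath
import math

def gaussian(c):
    """Return the function x -> exp(-pi * c * x^2) for a complex parameter c."""
    return lambda x: cmath.exp(-math.pi * c * x * x)

def fourier_transform(f, xi, half_width=6.0, steps=2400):
    """Trapezoid approximation of the integral of f(x) * exp(-2*pi*i*xi*x) over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * f(x) * cmath.exp(-2j * math.pi * xi * x)
    return h * total

def claimed_transform(c, xi):
    """Right-hand side c^(-1/2) * exp(-pi * xi^2 / c), principal branch of the root."""
    return gaussian(1.0 / c)(xi) / cmath.sqrt(c)

def discrepancy(c, xi):
    return abs(fourier_transform(gaussian(c), xi) - claimed_transform(c, xi))

for c in (2.0 + 0j, 1.0 + 2.0j, 0.5 - 3.0j):
    print(c, [discrepancy(c, xi) for xi in (0.0, 0.4, -1.3)])
```

Choosing the other square root would flip the sign of the right-hand side, so the small discrepancies also confirm the choice of branch for these sample parameters.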

1.60. Suppose that $f \in L^1(\mathbb{R}) \cap L^\infty(\mathbb{R})$ has a jump
discontinuity at the origin, i.e.,

\[ f(0+) = \lim_{t \to 0^+} f(t), \qquad f(0-) = \lim_{t \to 0^-} f(t) \]

both exist, but $f(0-) \ne f(0+)$. Let $\{\varphi_\lambda\}_{\lambda>0}$ be the
Gauss kernel. Prove that

\[ \lim_{\lambda \to \infty} (f * \varphi_\lambda)(0) = \frac{f(0+) + f(0-)}{2}. \]
Can other kernels be used to obtain the same result?

1.61. Let $f \in L^1(\mathbb{R})$ be given. Show that if there exist
$\delta, R > 0$ such that $f$ is bounded on $[-\delta, \delta]$ and
$\widehat{f}(\xi) \ge 0$ for $|\xi| > R$, then $\widehat{f} \in L^1(\mathbb{R})$.

1.62. Suppose that $f \in L^1(\mathbb{R}) \cap L^\infty(\mathbb{R})$ is real and
even (so $\widehat{f}$ is real and even as well). Show that if $f$ is not
continuous, then $\widehat{f}$ must change sign infinitely often. Thus, a
“local” (jump) discontinuity of $f$ forces a global “reaction” in $\widehat{f}$.
On the other hand, show that
$g(x) = i e^{-|x|} \bigl( \chi_{[0,\infty)} - \chi_{(-\infty,0)} \bigr)$ has a
single jump discontinuity and that $\widehat{g}$ is real and changes sign only
once. Still, the fact that $g$ has a discontinuity implies that $\widehat{g}$
decays slowly.

1.63. Prove the following version of Bernstein’s Inequality. Suppose that
$f \in L^1(\mathbb{R})$ is differentiable and
$\operatorname{supp}(\widehat{f}) \subseteq [-R, R]$. Show that
$\|f'\|_p \le C R\, \|f\|_p$ for each $1 \le p \le \infty$, where $C$ is a fixed
constant independent of $f$, $R$, and $p$.

1.9 The Schwartz Space


The Schwartz space will play an important role throughout the rest of this
volume. In this section, after giving its definition and basic properties, we will
discuss its special place in harmonic analysis.
The Schwartz space is a space of infinitely differentiable functions that,
together with all their derivatives, “decay rapidly” at infinity. It is named
in honor of Laurent Schwartz (1915–2002), because of its role in the theory
of distributions, which he developed. Distributions will be the focus of our
attention in Chapter 4.

1.9.1 Definition and Basic Properties

The precise definition of the Schwartz space is as follows.

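Definition 1.111 (The Schwartz Space). The Schwartz space is

\[ \mathcal{S}(\mathbb{R}) = \bigl\{ f \in C^\infty(\mathbb{R}) : x^m f^{(n)}(x) \in L^\infty(\mathbb{R}) \text{ for all } m, n \ge 0 \bigr\}. \]

An element of $\mathcal{S}(\mathbb{R})$ is called a Schwartz-class function. ♦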

Thus, “rapid decay” means decay faster than the reciprocal of any polynomial:
For each $m, n \ge 0$ there exists a constant $C_{mn}$ such that

\[ |f^{(n)}(x)| \le \frac{C_{mn}}{|x^m|}, \qquad x \in \mathbb{R}. \]

The constants $C_{mn}$ need not be uniformly bounded in $m$ and $n$.
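To illustrate why the constants cannot be expected to stay bounded, consider the Gaussian $\varphi(x) = e^{-\pi x^2}$ with $n = 0$. A short calculus computation (not taken from the text) shows that $|x|^m \varphi(x)$ is maximized at $x^2 = m/(2\pi)$, so the best constant is $C_{m0} = (m/(2\pi e))^{m/2}$, which tends to infinity with $m$. The Python sketch below compares this closed form with a brute-force maximum over a grid. The function names and grid parameters are illustrative choices.

```python
import math

def phi(x):
    """Gaussian function phi(x) = exp(-pi * x^2)."""
    return math.exp(-math.pi * x * x)

def best_constant(m):
    """Closed form for sup |x|^m * phi(x), attained where x^2 = m / (2*pi)."""
    return (m / (2.0 * math.pi * math.e)) ** (m / 2.0)

def sup_on_grid(m, half_width=10.0, steps=20000):
    """Grid estimate of sup over [0, half_width] of x^m * phi(x) (the function is even)."""
    h = half_width / steps
    return max((k * h) ** m * phi(k * h) for k in range(steps + 1))

for m in (0, 1, 4, 20, 40):
    print(m, best_constant(m), sup_on_grid(m))
```

The constants first decrease and then grow rapidly once $m$ exceeds $2\pi e \approx 17$, even though each individual constant is finite.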


A consequence of rapid decay is that f and every derivative of f are
integrable.

Exercise 1.112. Show that if $f \in \mathcal{S}(\mathbb{R})$ then for every
$m, n \ge 0$ we have
$x^m f^{(n)}(x) \in L^1(\mathbb{R}) \cap C_0(\mathbb{R})$, with

\[ \|x^m f^{(n)}(x)\|_1 \le 2\, \|x^m f^{(n)}(x)\|_\infty + 2\, \|x^{m+2} f^{(n)}(x)\|_\infty. \] ♦

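Since $C_c^\infty(\mathbb{R}) \subseteq \mathcal{S}(\mathbb{R})$, we know that
$\mathcal{S}(\mathbb{R})$ is dense in $L^p(\mathbb{R})$ for every
$1 \le p < \infty$, and also is dense in $C_0(\mathbb{R})$ in $L^\infty$-norm.
The Gaussian function $e^{-x^2}$ is an example of a function in
$\mathcal{S}(\mathbb{R})$ that is not compactly supported.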
On the other hand, while the two-sided exponential e−|x| has rapid decay, it
is not a Schwartz-class function since it is not differentiable at the origin.

1.9.2 Topology and Convergence in the Schwartz Space

The Schwartz space is a topological vector space but not a normed vector
space. Instead of being determined by a single norm, the topology on S(R) is
determined by the infinite collection of seminorms

\[ \rho_{mn}(f) = \|x^m f^{(n)}(x)\|_\infty, \qquad m, n \ge 0. \]

The Schwartz space and related spaces determined by families of seminorms
will be discussed in more detail in Section 4.2. For us, the meaning of
convergence in S(R) is more important than the associated topology, and
convergence simply means simultaneous convergence with respect to each of the
seminorms $\rho_{mn}$.

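Definition 1.113 (Convergence in S(R)). Given
$f_k, f \in \mathcal{S}(\mathbb{R})$, we say that $f_k$ converges to $f$ in
$\mathcal{S}(\mathbb{R})$ if

\[ \forall\, m, n \ge 0, \qquad \lim_{k \to \infty} \rho_{mn}(f - f_k) = 0. \]

Writing out the seminorms explicitly, convergence in $\mathcal{S}(\mathbb{R})$
means that

\[ \forall\, m, n \ge 0, \qquad \lim_{k \to \infty} \bigl\| x^m f^{(n)}(x) - x^m f_k^{(n)}(x) \bigr\|_\infty = 0. \]

We write $f_k \to f$ in $\mathcal{S}(\mathbb{R})$ to denote that $f_k$
converges to $f$ in $\mathcal{S}(\mathbb{R})$. ♦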


Because $\mathcal{S}(\mathbb{R})$ is determined by only countably many
seminorms, it is possible to create a metric on $\mathcal{S}(\mathbb{R})$ that
determines the convergence criterion. In particular, if we define

\[ d(f, g) = \sum_{m, n \ge 0} \frac{1}{2^{m+n}}\, \frac{\rho_{mn}(f - g)}{1 + \rho_{mn}(f - g)}, \]

then $f_k \to f$ in $\mathcal{S}(\mathbb{R})$ if and only if
$d(f, f_k) \to 0$ (see Exercise 4.12). Unfortunately, there is no norm that
induces this metric on $\mathcal{S}(\mathbb{R})$, but the fact that
there is an underlying metric means that we can define convergence in terms
of ordinary sequences indexed by the natural numbers instead of having to
use abstract nets, as is required in generic topological spaces (see [Heil11b,
Chap. 1] for details).
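As a rough numerical illustration of the metric, the following Python sketch estimates $d(f_k, 0)$ for the sequence $f_k = \varphi/k$, where $\varphi(x) = e^{-\pi x^2}$, which converges to $0$ in S(R) because every seminorm scales as $\rho_{mn}(f_k) = \rho_{mn}(\varphi)/k$. Several simplifications are assumptions of this sketch and not part of the text: the sum is truncated to $0 \le m, n \le M$, the suprema are estimated on a finite grid, the derivatives are computed from the recursion $\varphi^{(n)} = P_n \varphi$ with $P_{n+1} = P_n' - 2\pi x P_n$, and all function names are hypothetical.

```python
import math

def gaussian_derivative_polys(n_max):
    """Coefficient lists P_0..P_n_max with phi^(n)(x) = P_n(x) * exp(-pi * x^2)."""
    polys = [[1.0]]
    for _ in range(n_max):
        p = polys[-1]
        deriv = [j * p[j] for j in range(1, len(p))]          # coefficients of P_n'
        shifted = [0.0] + [-2.0 * math.pi * a for a in p]     # coefficients of -2*pi*x*P_n
        deriv += [0.0] * (len(shifted) - len(deriv))          # pad to a common length
        polys.append([d + s for d, s in zip(deriv, shifted)])
    return polys

def gaussian_seminorms(M, half_width=8.0, steps=4000):
    """Grid estimates of rho_mn(phi) = sup |x^m phi^(n)(x)| for 0 <= m, n <= M."""
    grid = [-half_width + 2.0 * half_width * k / steps for k in range(steps + 1)]
    rho = {}
    for n, p in enumerate(gaussian_derivative_polys(M)):
        vals = [abs(sum(a * x ** j for j, a in enumerate(p))) * math.exp(-math.pi * x * x)
                for x in grid]
        for m in range(M + 1):
            rho[(m, n)] = max(abs(x) ** m * v for x, v in zip(grid, vals))
    return rho

def truncated_distance(scale, rho, M):
    """Partial sum of d(scale * phi, 0) over 0 <= m, n <= M."""
    total = 0.0
    for m in range(M + 1):
        for n in range(M + 1):
            t = scale * rho[(m, n)]        # rho_mn(scale * phi) = scale * rho_mn(phi)
            total += 2.0 ** (-(m + n)) * t / (1.0 + t)
    return total

M = 6
rho = gaussian_seminorms(M)
tail_bound = 4.0 - (2.0 - 2.0 ** (-M)) ** 2    # total weight of the omitted terms
dists = [truncated_distance(1.0 / k, rho, M) for k in (1, 10, 100, 1000, 10 ** 6)]
print(dists, "omitted tail is at most", tail_bound)
```

Each term $t/(1+t)$ is bounded by 1, so the omitted terms contribute at most `tail_bound` regardless of how large the seminorms are. This boundedness is what makes the series defining $d$ converge for every pair of Schwartz functions.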
Since we have a metric, we can ask whether S(R) is complete with respect
to this metric. The answer is yes, every Cauchy sequence in S(R) converges
in S(R) (see Exercise 4.14). Thus S(R) is a vector space whose topology is
generated by a metric, and it is complete with respect to that metric. Such a
space is called a Fréchet space.
Since we have defined convergence in $\mathcal{S}(\mathbb{R})$, we also have a
notion of continuity. For example, a functional
$\mu : \mathcal{S}(\mathbb{R}) \to \mathbb{C}$ is continuous if $f_k \to f$ in
$\mathcal{S}(\mathbb{R})$ implies that
$\langle f_k, \mu \rangle \to \langle f, \mu \rangle$, where
$\langle f, \mu \rangle$ denotes the action of the functional $\mu$ on the
vector $f$. The space of all continuous linear functionals on
$\mathcal{S}(\mathbb{R})$, i.e., the dual space of $\mathcal{S}(\mathbb{R})$, is
the space of tempered distributions, denoted $\mathcal{S}'(\mathbb{R})$.
It will be a major center of attention in Chapter 4.

1.9.3 Invariance of the Schwartz Space


Now that we have introduced the Schwartz space, let us explain why it is
so interesting to us. The definition of the Schwartz space incorporates both
smoothness and decay—an element f of S(R) is both infinitely smooth and
decays rapidly at infinity (as do all of its derivatives). The Fourier transform
interchanges smoothness and decay. Thus, since $f \in \mathcal{S}(\mathbb{R})$
has both smoothness and decay, $\widehat{f}$ has both decay and smoothness,
suggesting that we may have $\widehat{f} \in \mathcal{S}(\mathbb{R})$. And in
fact this is true: The Schwartz space is invariant under
the Fourier transform. In order to prove this, we first need the reader to verify
that an application of Theorems 1.55 and 1.65 implies the following relation
between derivatives and products. To simplify notation, we let $D$ denote the
differentiation operator:

\[ Df(x) = f'(x). \tag{1.39} \]

Exercise 1.114. Show that if $x^j f^{(k)}(x) \in L^1(\mathbb{R})$ for
$j = 0, \dots, m$ and $k = 0, \dots, n$, then

\[ \Bigl( D^n \bigl( (-2\pi i x)^m f(x) \bigr) \Bigr)^{\wedge}(\xi) = (2\pi i \xi)^n\, D^m \widehat{f}(\xi). \] ♦

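Theorem 1.115 (Invariance of the Schwartz Space). The Fourier transform is a
bijection of $\mathcal{S}(\mathbb{R})$ onto itself.

Proof. Suppose that $f \in \mathcal{S}(\mathbb{R})$. Then, by the product rule,
we have for any $m, n \ge 0$ that

\[ D^n \bigl( (-2\pi i x)^m f(x) \bigr) = \sum_{j=0}^{n} \binom{n}{j}\, D^j (-2\pi i x)^m\, f^{(n-j)}(x) \in L^1(\mathbb{R}). \]

Hence,

\[ (2\pi i \xi)^n\, D^m \widehat{f}(\xi) = \Bigl( D^n \bigl( (-2\pi i x)^m f(x) \bigr) \Bigr)^{\wedge}(\xi) \in L^\infty(\mathbb{R}). \]

Since this is true for every $m$ and $n$, we conclude that
$\widehat{f} \in \mathcal{S}(\mathbb{R})$. Thus the Fourier transform maps
$\mathcal{S}(\mathbb{R})$ into itself, and we know that it is injective by
the Uniqueness Theorem.
On the other hand, we also have $\check{f} \in \mathcal{S}(\mathbb{R})$, and
therefore $(\check{f})^{\wedge} \in \mathcal{S}(\mathbb{R})$. Hence
$f = (\check{f})^{\wedge}$ by the Inversion Formula, so the Fourier transform
is surjective. ⊓⊔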

Not only is the Fourier transform a bijection, but it is a topological iso-


morphism (or homeomorphism) of S(R) onto itself, i.e., both F and F −1 are
continuous with respect to the topology of the Schwartz space.

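Exercise 1.116 (Continuity of the Fourier Transform on S(R)). Given
any $m, n \ge 0$, show that there exist constants $C_{j\ell} > 0$ such that

\[ \|\xi^n \widehat{f}^{(m)}(\xi)\|_\infty \le \sum_{j=0}^{m} \sum_{\ell=0}^{n} C_{j\ell}\, \|x^j f^{(\ell)}(x)\|_1. \]

Use this to show that
$\mathcal{F} : \mathcal{S}(\mathbb{R}) \to \mathcal{S}(\mathbb{R})$ is
continuous, i.e.,

\[ f_k \to f \text{ in } \mathcal{S}(\mathbb{R}) \implies \widehat{f_k} \to \widehat{f} \text{ in } \mathcal{S}(\mathbb{R}), \]

and similarly
$\mathcal{F}^{-1} : \mathcal{S}(\mathbb{R}) \to \mathcal{S}(\mathbb{R})$ is
continuous. ♦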

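As a final remark, we note that since
$C_c^\infty(\mathbb{R}) \subseteq \mathcal{S}(\mathbb{R})$, the Fourier
transform maps $C_c^\infty(\mathbb{R})$ into $\mathcal{S}(\mathbb{R})$:

\[ f \in C_c^\infty(\mathbb{R}) \implies \widehat{f} \in \mathcal{S}(\mathbb{R}). \]

However, we will later see the Paley–Wiener Theorem (Theorem 3.49), which
implies that $f$ and $\widehat{f}$ cannot both be compactly supported (unless
$f = 0$). Hence the Fourier transform does not map $C_c^\infty(\mathbb{R})$
into itself:

\[ \mathcal{F}\bigl( C_c^\infty(\mathbb{R}) \bigr) \subseteq \mathcal{S}(\mathbb{R}) \quad \text{but} \quad \mathcal{F}\bigl( C_c^\infty(\mathbb{R}) \bigr) \cap C_c^\infty(\mathbb{R}) = \{0\}. \]

Similarly, if $f \in L^1(\mathbb{R})$ and
$\widehat{f} \in C_c^\infty(\mathbb{R})$, then
$\widehat{f} \in \mathcal{S}(\mathbb{R})$, so we know that
$f \in \mathcal{S}(\mathbb{R})$, but $f$ cannot be compactly supported (unless
$f = 0$).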

Additional Problems

1.64. Show that if f ∈ S(R) and g ∈ Cb∞ (R), then f g ∈ S(R). In particular,
S(R) is closed under pointwise products.

1.65. Show that if f ∈ S(R) and xm g(x) ∈ L1 (R) for every m ≥ 0, then
f ∗ g ∈ S(R). In particular, S(R) is closed under convolution.

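1.66. Show that
$\bigl\{ f \in L^1(\mathbb{R}) : \widehat{f} \in C_c^\infty(\mathbb{R}) \bigr\}$
is dense in $L^p(\mathbb{R})$ for each $1 \le p < \infty$.

1.67. Construct a function $f \in \mathcal{S}(\mathbb{R})$ that satisfies
$\lim_{n \to \infty} |f^{(n)}(n)| = \infty$.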
