
Lecture Notes Fourier28

Notes from King's College


LECTURE NOTES

FOURIER ANALYSIS

IGOR WIGMAN

Contents
0. Preliminaries
0.1. Integration (Riemann)
0.2. Complete (metric) vector spaces
0.3. Functions on the circle $S^1$
I. Fourier series on the circle
I.1. Definition of the Fourier series, their simple properties, and some examples
I.2. Uniqueness of Fourier series, and uniform convergence. Decay of Fourier coefficients and smoothness
I.3. Convolutions and Dirichlet kernels. Smoothing effect of convolutions
I.4. “Good kernels”
I.5. Cesàro summability
I.6. Abel summability and Poisson kernel. Steady-state heat equation
I.7. Mean-square convergence of Fourier series
Towards the proof of the mean square convergence Theorem I.34
I.8. Pointwise convergence
II. Some applications of Fourier series
II.1. The isoperimetric inequality
II.2. Weyl’s Equidistribution Theorem
II.3. Vibrating string and wave propagation
II.4. Heat equation on the circle
III. Fourier transform on $\mathbb{R}$
III.1. Overview and Fourier transform as the limiting case of the Fourier series
III.2. The Fourier transform on the Schwartz space and its fundamental properties
III.3. Inverse Fourier transform and the Plancherel formula
III.4. The class $M(\mathbb{R})$ of functions of moderate decrease, and extension of the Fourier transform to $M(\mathbb{R})$. Fourier transform as an operator and extension to $L^2(\mathbb{R})$
IV. Application of the Fourier transform on $\mathbb{R}$: The time-dependent heat equation on $\mathbb{R}$


0. Preliminaries
0.1. Integration (Riemann). Let $-\infty < a < b < +\infty$, and let $I \subseteq \mathbb{R}$ be an interval with endpoints $a$ and $b$, open, closed, or semi-closed. If $f : I \to \mathbb{R}$ (or $f : I \to \mathbb{C}$) is a function, we are interested in defining the integral $S = \int_a^b f(x)\,dx$.

Definition 0.1 (Riemann integration). (1) A partition of $I$ is $\Delta : a = t_0 < t_1 < \dots < t_n = b$, of mesh (or norm)
\[ |\Delta| := \max_{0 \le j \le n-1} |t_{j+1} - t_j|. \]
(2) The Riemann sum of $f$ corresponding to $\Delta$ is
\[ S(f; \Delta) := \sum_{j=0}^{n-1} f(t_j) \cdot (t_{j+1} - t_j). \]
(3) We say that $f$ is integrable on $I$, with value of integral $s := \int_a^b f(x)\,dx$, if for all $\epsilon > 0$ there exists a $\delta > 0$ such that for every partition $\Delta$ of $I$ of mesh size $|\Delta| < \delta$, we have
\[ |S(f; \Delta) - s| < \epsilon. \]
That is,
\[ \int_a^b f(x)\,dx := \lim_{|\Delta| \to 0} S(f; \Delta). \]

The following facts on Riemann integration were established as part of a standard analysis course:
(1) If a function f is not bounded on I, then f is not integrable on I.
(2) Define the upper and lower Riemann sums, respectively:
\[ U(f; \Delta) := \sum_{j=0}^{n-1} \Big( \sup_{t \in [t_j, t_{j+1}]} f(t) \Big) \cdot (t_{j+1} - t_j), \]
\[ L(f; \Delta) := \sum_{j=0}^{n-1} \Big( \inf_{t \in [t_j, t_{j+1}]} f(t) \Big) \cdot (t_{j+1} - t_j). \]
Then $f$ is integrable on $I$ if and only if for every $\epsilon > 0$ there exists a partition $\Delta$ such that
\[ 0 \le U(f; \Delta) - L(f; \Delta) < \epsilon. \]
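The criterion in (2) can be observed numerically. The following is a minimal Python sketch, not part of the original notes: the function name `riemann_sums`, the uniform partition, and the test function $f(x) = x^2$ on $[0, 1]$ are illustrative assumptions. It also assumes $f$ is monotone on each subinterval, so that the supremum and infimum are attained at the endpoints.

```python
def riemann_sums(f, a, b, n):
    """Left Riemann sum and upper/lower sums of f on a uniform partition of [a, b].

    Assumption of this sketch: f is monotone on each subinterval, so the
    sup and inf over [t_j, t_{j+1}] are attained at the endpoints.
    """
    h = (b - a) / n
    left = upper = lower = 0.0
    for j in range(n):
        t0, t1 = a + j * h, a + (j + 1) * h
        endpoint_values = (f(t0), f(t1))
        left += f(t0) * h                  # S(f; Delta) with sample point t_j
        upper += max(endpoint_values) * h  # contribution to U(f; Delta)
        lower += min(endpoint_values) * h  # contribution to L(f; Delta)
    return left, upper, lower


left, upper, lower = riemann_sums(lambda x: x * x, 0.0, 1.0, 1000)
gap = upper - lower  # equals (f(1) - f(0)) * mesh for an increasing f
```

For an increasing $f$ the gap $U - L$ telescopes to $(f(b) - f(a)) \cdot |\Delta|$, so it can be made smaller than any $\epsilon$ by refining the partition.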
Examples 0.2. (1) The Dirichlet function $f : [0, 1] \to \mathbb{R}$:
\[ f(x) = \begin{cases} 1 & x \in \mathbb{Q} \\ -1 & x \notin \mathbb{Q} \end{cases}. \]
The function $f$ is not Riemann integrable (and nowhere continuous), since $U(f; \Delta) \equiv 1$ but $L(f; \Delta) \equiv -1$. However, its absolute value $|f| \equiv 1$ is integrable, a peculiarity of integrability in the Riemann sense.
(2) If $f : [a, b] \to \mathbb{R}$ (or $f : [a, b] \to \mathbb{C}$) is continuous, then $f$ is integrable.
(3) If $f, g$ are integrable, then so is $f \cdot g$, and hence, in particular, $f^2$.

Definition 0.3 (Improper integrals). Let $f : [a, +\infty) \to \mathbb{R}$ be a function, integrable on every $[a, b] \subseteq [a, +\infty)$. Then the improper integral is the limit
\[ \int_a^{+\infty} f(x)\,dx := \lim_{b \to \infty} \int_a^b f(x)\,dx, \]
if it exists.
The integral $\int_a^{+\infty} f(x)\,dx$ might make sense even if $\int_a^{+\infty} |f(x)|\,dx$ does not converge, in which case we say that it is conditionally convergent, such as, for example, $\int_1^{+\infty} \frac{\sin x}{x}\,dx$. Similarly, we define
\[ \int_{-\infty}^{+\infty} f(x)\,dx := \lim_{b \to +\infty} \int_{-b}^{b} f(x)\,dx, \]
if it exists. We end this section by stating that all this theory could be generalized to multivariate functions, defined on subsets of $\mathbb{R}^n$, $n \ge 1$.
0.2. Complete (metric) vector spaces.
Definition 0.4. Let X be a set (e.g. a finite dimensional real vector space X = Rn ). A metric on
X is a function d : X × X → R satisfying the following properties:
(1) Positivity: a. For all x, y ∈ X, d(x, y) ≥ 0. b. For all x, y ∈ X, d(x, y) = 0 if and only if
x = y.
(2) Symmetry: For all x, y ∈ X, d(x, y) = d(y, x).
(3) Triangle inequality: For every $x, y, z \in X$, $d(x, z) \le d(x, y) + d(y, z)$.
If d(·, ·) is a metric on X, we say that the tuple (X, d) is a metric space.
Example 0.5. If $(X, \|\cdot\|)$ is a normed linear space (i.e. $\cdot \mapsto \|\cdot\|$ is a norm on $X$), then it induces the metric $d(x, y) := \|x - y\|$. That is, every normed linear space is a metric space, but there are many interesting metric spaces $X$ that are not linear spaces (and hence not normed linear spaces).
Definition 0.6 (Complete metric spaces). Let (X, d) be a metric space.
(1) A sequence {xn } ⊆ X of elements of X is Cauchy, if for every ϵ > 0 there exists a number
N = N (ϵ) such that for every n, m > N , the inequality d(xn , xm ) < ϵ holds.
(2) A sequence {xn } ⊆ X of elements of X converges to a (unique!) element x ∈ X, if for every
ϵ > 0 there exists N = N (ϵ) so that for every n > N the inequality d(xn , x) < ϵ holds.
(3) The metric space (X, d) is complete, if every Cauchy sequence {xn } ⊆ X is convergent.
Examples 0.7. (1) The space $(\mathbb{R}, |x - y|)$ is complete. In this example the notions of Cauchy and convergent sequences coincide with the classical definitions. The same holds for $(\mathbb{R}^n, \|x - y\|_p)$, $n \ge 1$, $p \ge 1$.
(2) The space $(\mathbb{Q}, |x - y|)$ is not complete. To see this, take the sequence defined recursively by $x_0 = 1$ and
\[ x_{n+1} = \frac{x_n + 2/x_n}{2}, \]
which is Cauchy but not convergent in $\mathbb{Q}$ (left for homework). Alternatively, consider the function $f(x) = \sqrt{1 + x}$ and Taylor expand it at $x = 1$. We obtain a rapidly decaying series, with rational partial sums, divergent in $\mathbb{Q}$. In any case, the conclusion is that $\mathbb{Q}$ contains “holes”.
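The recursion $x_{n+1} = (x_n + 2/x_n)/2$ with $x_0 = 1$ can be followed in exact rational arithmetic. The sketch below is an illustration only (the helper name `babylonian` is ours, not from the notes): every iterate is a rational number, consecutive gaps shrink quickly, and yet no iterate squares to exactly 2.

```python
from fractions import Fraction


def babylonian(steps):
    """Iterates x_{n+1} = (x_n + 2/x_n) / 2 from x_0 = 1 using exact rationals."""
    x = Fraction(1)
    iterates = [x]
    for _ in range(steps):
        x = (x + 2 / x) / 2
        iterates.append(x)
    return iterates


xs = babylonian(5)
# Distances between consecutive terms: evidence of the Cauchy property.
gaps = [abs(xs[n + 1] - xs[n]) for n in range(len(xs) - 1)]
# How far each x_n^2 is from 2; it is never exactly zero for a rational x_n.
defects = [x * x - 2 for x in xs]
```

The first iterates are $1, 3/2, 17/12, 577/408$; the defects $x_n^2 - 2$ tend to $0$ but stay positive for $n \ge 1$, consistent with the limit $\sqrt{2}$ lying outside $\mathbb{Q}$.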
If a metric space $(X, d)$ is not complete, it gives rise to the completion $(\check{X}, \check{d})$ of $X$: it is a unique (up to an isometric bijection) metric space satisfying the following properties:
(1) The space $X$ is embedded in $\check{X}$: $X \subseteq \check{X}$ and $\check{d}$ extends $d$.
(2) $X$ is dense in $\check{X}$, i.e. for every $x \in \check{X}$ there exists a sequence $\{x_n\} \subseteq X$ of elements of $X$ such that $\lim_{n \to \infty} x_n = x$ in $\check{X}$.
(3) $\check{X}$ is complete.
Examples 0.8. (1) R is the completion of Q.
(2) [0, 1] is the completion of (0, 1) ⊆ R.
Given (X, d), one way to construct its completion is by defining X̌ as the collection of all equivalence
classes of Cauchy sequences in X, w.r.t. a suitable equivalence relation.

0.3. Functions on the circle $S^1$. Let $f : \mathbb{R} \to \mathbb{R}$ (or $f : \mathbb{R} \to \mathbb{C}$) be periodic with period $2\pi$. Then $f$ is determined by its restriction $f|_{[-\pi,\pi]}$ (or to any other period, i.e. $f|_{[x_0, x_0+2\pi]}$), and moreover, $f(-\pi) = f(\pi)$. Conversely, a function $f : [-\pi, \pi] \to \mathbb{R}$ (or $f : [-\pi, \pi] \to \mathbb{C}$) with $f(-\pi) = f(\pi)$ uniquely prescribes a periodic function $f : \mathbb{R} \to \mathbb{R}$ with period $2\pi$: for every $x \in \mathbb{R}$, $f(x + 2\pi) = f(x)$.
Alternatively, we may think of such a function $f$ as defined on the circle $S^1 \subseteq \mathbb{R}^2$, by identifying it with the function $F((\cos\theta, \sin\theta)) := f(\theta)$, $\theta \in [-\pi, \pi)$. Moreover, $F$ is continuous if and only if $f$ is continuous (assuming $f(-\pi) = f(\pi)$, as otherwise $\pm\pi$ is a point of discontinuity of $f$).

I. Fourier series on the circle


I.1. Definition of the Fourier series, their simple properties, and some examples. We are interested in decomposing periodic functions $f : \mathbb{R} \to \mathbb{C}$ into some basic ingredients. Let $f$ be $2\pi$-periodic, so that we may consider $f$ as a function $f : [-\pi, \pi] \to \mathbb{C}$ (or, equivalently, $f : S^1 \to \mathbb{C}$ is defined on the unit circle). The functions $\cos(nx)$ and $\sin(nx)$ with $n \in \mathbb{Z}$ are also $2\pi$-periodic, and so is $e^{inx} = \cos(nx) + i\sin(nx)$. Suppose that we can decompose $f$ as the infinite series
\[ f \sim \sum_{n=-\infty}^{\infty} a_n e^{inx} \tag{I.1} \]
for some coefficients $a_n \in \mathbb{C}$, where the meaning of “$\sim$” is not specified at the moment (in particular, no convergence is implied). Manipulating (I.1) formally (changing the order of summation and integration, with no justification), we take $n_0 \in \mathbb{Z}$ and write
\[ \int_{-\pi}^{\pi} f(x) e^{-in_0 x}\,dx = \sum_{n=-\infty}^{\infty} a_n \int_{-\pi}^{\pi} e^{i(n-n_0)x}\,dx = 2\pi a_{n_0}, \]
since, as it is easy to compute explicitly,
\[ \int_{-\pi}^{\pi} e^{inx}\,dx = \begin{cases} 2\pi & n = 0 \\ 0 & n \ne 0 \end{cases}. \tag{I.2} \]
In light of the above, it is reasonable to expect that if an expansion of the type (I.1) exists, then the coefficients are given by
\[ a_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-inx}\,dx. \]

The following definition is then only natural.


Definition I.1. Let $f : [-\pi, \pi] \to \mathbb{C}$ be a (Riemann) integrable function.
(1) For $n \in \mathbb{Z}$ we define the $n$'th Fourier coefficient of $f$ to be
\[ a_n = \hat{f}(n) := \frac{1}{2\pi} \int_{-\pi}^{\pi} f(\theta) e^{-in\theta}\,d\theta. \tag{I.3} \]
It is well-defined, since, as discussed earlier, a product of two integrable functions (in this case, $f$ and $e^{-in\theta}$) is integrable.
(2) The series $\sum_{n=-\infty}^{\infty} a_n e^{in\theta}$ (convergent or not) is called the Fourier series associated to $f$, which is designated by
\[ f \sim \sum_{n=-\infty}^{\infty} a_n e^{in\theta}, \]
and has no other meaning (e.g. the convergence of the Fourier series) than that the $a_n$ are given by (I.3).
Remark I.2. (1) The integral
\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} f(\theta) e^{-in\theta}\,d\theta = \frac{1}{2\pi} \int_{\theta_0}^{\theta_0+2\pi} f(\theta) e^{-in\theta}\,d\theta \]
as in (I.3) is independent of the period (left to homework).
(2) If $f$ is periodic with period $L$ (in place of $2\pi$), e.g. $L = 1$, then it makes sense to define
\[ \hat{f}(n) := \frac{1}{L} \int_{-L/2}^{L/2} f(x) e^{-2\pi i n x/L}\,dx, \]
$n \in \mathbb{Z}$, and $f(x) \sim \sum_{n=-\infty}^{\infty} \hat{f}(n) e^{2\pi i n x/L}$, e.g. by the linear transformation $\theta = \frac{2\pi}{L} x$. We will assume everywhere $L = 2\pi$, unless specified otherwise.
A central question is that of convergence of the Fourier series (to $f$?). That is, we define for $N \in \mathbb{Z}_{>0}$ the $N$'th partial Fourier sum of $f$
\[ (S_N(f))(\theta) := \sum_{n=-N}^{N} \hat{f}(n) e^{in\theta}. \]
Does $S_N(f) \to f$ as $N \to \infty$, and in what sense?


Definition I.3. (1) A trigonometric polynomial is a finite sum $\sum_n c_n e^{in\theta}$, and its degree is the maximal $|n|$ such that $c_n \ne 0$. For example, $S_N(f)$ is always a trigonometric polynomial, of degree at most $N$.
(2) A trigonometric series is any sum $\sum_n c_n e^{in\theta}$, finite or not. For example, any Fourier series of a function is a trigonometric series.
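Example I.4. Let $f(\theta) = \theta$ on $\theta \in [-\pi, \pi]$; as a function on the circle, $f$ is discontinuous at $\theta = \pm\pi$. First, for $n = 0$,
\[ \hat{f}(0) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \theta\,d\theta = 0. \]
For $n \ne 0$, we use integration by parts to compute the integral
\[ \hat{f}(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \theta e^{-in\theta}\,d\theta = \frac{1}{2\pi} \cdot \frac{1}{-in}\,\theta e^{-in\theta} \Big|_{\theta=-\pi}^{\theta=\pi} + \frac{1}{2\pi} \cdot \frac{1}{in} \int_{-\pi}^{\pi} e^{-in\theta}\,d\theta = \frac{1}{2\pi} \cdot \frac{i}{n} \cdot 2\pi \cdot e^{i\pi n} = i \cdot \frac{(-1)^n}{n}, \]
since $\int_{-\pi}^{\pi} e^{-in\theta}\,d\theta = 0$, and $e^{i\pi n} = (-1)^n$.
Hence the Fourier series of $f$ is
\[ f \sim \sum_{n \ne 0} i \cdot \frac{(-1)^n}{n} e^{in\theta} = \sum_{n=1}^{\infty} i \cdot (-1)^n \cdot \frac{1}{n} \cdot \left( e^{in\theta} - e^{-in\theta} \right) = \sum_{n=1}^{\infty} i \cdot (-1)^n \cdot \frac{1}{n} \cdot 2i \sin(n\theta) = 2 \sum_{n=1}^{\infty} (-1)^{n+1} \frac{\sin(n\theta)}{n}, \]
by uniting the summands corresponding to $n$ and $-n$. Does this series converge? To $f$? At this point we can only infer that, by the slow decay of its summands, there is no absolute convergence.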
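The closed form $\hat{f}(n) = i(-1)^n/n$ of Example I.4 can be compared against a direct numerical evaluation of (I.3). The sketch below is illustrative only: `fourier_coefficient` is a hypothetical helper using a midpoint rule, and the sample count is an arbitrary choice.

```python
import cmath
import math


def fourier_coefficient(f, n, samples=20000):
    """Midpoint-rule approximation of (1/2pi) * int_{-pi}^{pi} f(t) e^{-int} dt."""
    h = 2 * math.pi / samples
    total = 0j
    for j in range(samples):
        t = -math.pi + (j + 0.5) * h
        total += f(t) * cmath.exp(-1j * n * t)
    return total * h / (2 * math.pi)


def exact_coefficient(n):
    """Closed form i * (-1)^n / n from Example I.4, valid for n != 0."""
    return 1j * (-1) ** n / n


errors = [
    abs(fourier_coefficient(lambda t: t, n) - exact_coefficient(n))
    for n in (1, 2, 3, -4)
]
zeroth = fourier_coefficient(lambda t: t, 0)
```

The numerical values agree with the closed form up to the quadrature error, and the coefficient for $n = 0$ vanishes because the integrand is odd.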
In general, in cases like $f$ of Example I.4, where the function $f : [-\pi, \pi] \to \mathbb{R}$ is real-valued, we may represent its Fourier series in its real form by grouping the summands corresponding to $n$ and $-n$ in the following fashion.
Claim I.5. If $f$ is real-valued, then for every $n \in \mathbb{Z}$,
\[ \hat{f}(-n) = \overline{\hat{f}(n)}, \tag{I.4} \]
where $\overline{\,\cdot\,}$ is the complex conjugate.
Proof. Since for all $\theta \in [-\pi, \pi]$ we have $\overline{f(\theta)} = f(\theta)$ by assumption, we may write
\[ \hat{f}(-n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(\theta) e^{in\theta}\,d\theta = \overline{\frac{1}{2\pi} \int_{-\pi}^{\pi} f(\theta) e^{-in\theta}\,d\theta} = \overline{\hat{f}(n)}. \]


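The converse, i.e. that if (I.4) is satisfied for all $n \in \mathbb{Z}$ then $f$ is real-valued, is also true under further assumptions on $f$. It will be shown using convergence results later on.
Let us perform the same trick of adding the terms in the Fourier series corresponding to $n$ and $-n$, as in Example I.4. Denoting $a_n := \hat{f}(n)$, for $n \ne 0$, by the above claim, we have
\[ \hat{f}(n)e^{in\theta} + \hat{f}(-n)e^{-in\theta} = a_n e^{in\theta} + \overline{a_n} e^{-in\theta} = a_n e^{in\theta} + \overline{a_n e^{in\theta}} = 2\Re\left(a_n e^{in\theta}\right) = 2\left(\Re(a_n)\cos(n\theta) - \Im(a_n)\sin(n\theta)\right). \tag{I.5} \]
Now $a_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(\theta) e^{-in\theta}\,d\theta$, so that
\[ \Re(a_n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(\theta) \cos(n\theta)\,d\theta, \]
whereas
\[ \Im(a_n) = -\frac{1}{2\pi} \int_{-\pi}^{\pi} f(\theta) \sin(n\theta)\,d\theta. \]
Therefore, (I.5) is
\[ \hat{f}(n)e^{in\theta} + \hat{f}(-n)e^{-in\theta} = b_n \cos(n\theta) + c_n \sin(n\theta), \]
where
\[ b_n = 2\Re(a_n) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(\theta) \cos(n\theta)\,d\theta, \tag{I.6} \]
and
\[ c_n = -2\Im(a_n) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(\theta) \sin(n\theta)\,d\theta, \]
all under the assumption $n \ne 0$.
As for $n = 0$ (that has no partner $-n$ other than itself), we have
\[ a_0 e^{i \cdot 0 \cdot \theta} = a_0 = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(\theta)\,d\theta =: \frac{b_0}{2}, \]
where $b_0$ extends the definition (I.6) to $n = 0$. Here the denominator 2 manifests the lack of an accompanying sine function to $1 = \cos(0 \cdot \theta)$.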
All in all, the real form of the Fourier series of an arbitrary integrable real-valued function $f : [-\pi, \pi] \to \mathbb{R}$ is
\[ f(\theta) \sim \frac{b_0}{2} + \sum_{n=1}^{\infty} \left( b_n \cos(n\theta) + c_n \sin(n\theta) \right). \]
Here
\[ \frac{b_0}{2} + \sum_{n=1}^{\infty} b_n \cos(n\theta) \]
is the “even part” of $f$, whereas
\[ \sum_{n=1}^{\infty} c_n \sin(n\theta) \]
is its “odd part”. In the homework assignment it will be shown that if $f$ is even, then its odd part vanishes, whereas if $f$ is odd, then its even part vanishes. The converse is also true under some further assumptions on $f$; this will be covered later, when the convergence of the Fourier series corresponding to a function is addressed.
Example I.6. Let $f(\theta) = \frac{1}{4}\theta^2$, $\theta \in [-\pi, \pi]$. (In this example, on $\theta \in [\pi, 2\pi]$, we have $f(\theta) = \frac{1}{4}(2\pi - \theta)^2$.) The function $f : S^1 \to \mathbb{R}$ is continuous on the unit circle $S^1$, since $f(-\pi) = f(\pi)$ (hence $f$ is continuous as a function on $\mathbb{R}$, extended by $2\pi$-periodicity). We can compute its Fourier coefficients by appealing to integration by parts (this time, twice), and transform the Fourier series to its real form:
\[ f(\theta) \sim \frac{\pi^2}{12} + \sum_{n=1}^{\infty} \frac{(-1)^n \cos(n\theta)}{n^2}. \]
If, indeed, the Fourier series of $f$ converges to $f$ at every point, we can substitute $\theta = \pi$, so that $\cos(n\pi) = (-1)^n$ and then
\[ \frac{\pi^2}{4} = f(\pi) = \frac{\pi^2}{12} + \sum_{n=1}^{\infty} \frac{(-1)^n \cdot (-1)^n}{n^2} = \frac{\pi^2}{12} + \sum_{n=1}^{\infty} \frac{1}{n^2}, \]
and we can conclude
\[ \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{4} - \frac{\pi^2}{12} = \frac{\pi^2}{6}, \]
a highly nontrivial result on the special value of the Riemann zeta function
\[ \zeta(2) := \sum_{n=1}^{\infty} \frac{1}{n^2}. \]
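As a sanity check of Example I.6 (numerical evidence only, not a proof of convergence), one may evaluate partial sums of the real-form series and of $\sum 1/n^2$. The function name `partial_sum`, the evaluation point $\theta = 1$, and the truncation levels below are illustrative assumptions.

```python
import math


def partial_sum(theta, terms):
    """pi^2/12 + sum_{n=1}^{terms} (-1)^n cos(n*theta) / n^2 from Example I.6."""
    s = math.pi ** 2 / 12
    for n in range(1, terms + 1):
        s += (-1) ** n * math.cos(n * theta) / n ** 2
    return s


# Distance between the truncated series and f(theta) = theta^2 / 4 at theta = 1.
gap_at_1 = abs(partial_sum(1.0, 2000) - 1.0 ** 2 / 4)

# Truncated zeta(2); the neglected tail is of size about 1/100000.
zeta2_estimate = sum(1 / n ** 2 for n in range(1, 100001))
zeta2_gap = abs(zeta2_estimate - math.pi ** 2 / 6)
```

Both gaps are small, in line with the value $\zeta(2) = \pi^2/6$ obtained above under the convergence assumption.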
Example I.7. The Dirichlet kernel of degree $N \ge 1$:
\[ D_N(x) = \sum_{n=-N}^{N} e^{inx}. \tag{I.7} \]
$D_N$ is a degree-$N$ trigonometric polynomial. It is easy to see, by changing the order of sum and integration, and the orthogonality relation (I.2), that
\[ \widehat{D_N}(n) = \begin{cases} 1 & |n| \le N \\ 0 & |n| > N \end{cases}. \]
Thus $D_N \sim D_N$, i.e. $D_N$ is its own Fourier series.
More generally, if
\[ T_N(x) = \sum_{|n| \le N} a_n e^{inx}, \]
$x \in [-\pi, \pi]$, is an arbitrary trigonometric polynomial (of degree $N \ge 1$), then
\[ \widehat{T_N}(n) = \begin{cases} a_n & |n| \le N \\ 0 & \text{otherwise} \end{cases}, \]
and so $T_N \sim T_N$ is its own Fourier series.
Back to $D_N$ as in (I.7). It is possible to find a closed expression for $D_N$ by summing a geometric series. We write:
\begin{align*}
D_N(x) &= 1 + 2\Re\Big( \sum_{n=1}^{N} e^{inx} \Big) = 1 + 2\Re\Big( e^{ix}\,\frac{1 - e^{iNx}}{1 - e^{ix}} \Big) = 1 + 2\Re\Big( e^{ix} e^{i(N/2 - 1/2)x}\,\frac{e^{iNx/2} - e^{-iNx/2}}{e^{ix/2} - e^{-ix/2}} \Big) \\
&= 1 + 2\Re\Big( e^{i\frac{N+1}{2}x}\,\frac{\sin(Nx/2)}{\sin(x/2)} \Big) = 1 + 2\cos\Big( \frac{N+1}{2}x \Big) \frac{\sin(Nx/2)}{\sin(x/2)} \\
&= 1 + \frac{\sin((N + 1/2)x) - \sin(x/2)}{\sin(x/2)} = \frac{\sin((N + 1/2)x)}{\sin(x/2)}. \tag{I.8}
\end{align*}
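The closed form (I.8) can be checked against the defining sum (I.7) at a few sample points. This is a small illustrative sketch; the function names and the chosen values of $N$ and $x$ are ours, and the closed form is only evaluated where $\sin(x/2) \ne 0$.

```python
import cmath
import math


def dirichlet_sum(N, x):
    """D_N(x) as the defining sum of exponentials (I.7); the sum is real."""
    return sum(cmath.exp(1j * n * x) for n in range(-N, N + 1)).real


def dirichlet_closed(N, x):
    """Closed form sin((N + 1/2) x) / sin(x / 2), valid when sin(x/2) != 0."""
    return math.sin((N + 0.5) * x) / math.sin(x / 2)


worst = max(
    abs(dirichlet_sum(N, x) - dirichlet_closed(N, x))
    for N in (1, 5, 20)
    for x in (0.1, 1.0, 2.5, -3.0)
)
value_at_zero = dirichlet_sum(3, 0.0)  # the sum has 2N + 1 terms, each equal to 1
```

At $x = 0$ the defining sum gives $D_N(0) = 2N + 1$, which is also the limit of the closed form as $x \to 0$.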
Recall that
\[ (S_N(f))(\theta) := \sum_{n=-N}^{N} \hat{f}(n) e^{in\theta}, \]
for $f : [-\pi, \pi] \to \mathbb{C}$, integrable. Does $S_N(f) \to f$ as $N \to \infty$? In what sense? The following definition makes precise in what sense a sequence of functions may converge to a limit function.
Definition I.8 (Modes of convergence). Let $\{f_n\}_{n \ge 1}$ be a sequence of (Riemann) integrable functions $f_n : I \to \mathbb{C}$ on $I \subseteq \mathbb{R}$ (e.g. $I = [-\pi, \pi]$), and $f : I \to \mathbb{C}$ integrable.
(1) We say that $f_n \to f$ as $n \to \infty$ pointwise, if for all $x \in I$, $f_n(x) \to f(x)$ as $n \to \infty$ (usual convergence of complex numbers).
(2) We say that $f_n \to f$ uniformly (or in $L^\infty$, pronounced “L infinity”), if for all $\epsilon > 0$ there exists a number $N = N(\epsilon)$ so that for all $n \ge N$ and all $x \in I$,
\[ |f_n(x) - f(x)| < \epsilon. \]
(3) We say that $f_n \xrightarrow[n \to \infty]{L^1} f$ (convergence in $L^1$, alternatively in mean), if
\[ \int_I |f_n - f|\,dx \to 0 \]
as $n \to \infty$.
[Note that the above integral is well-defined by our assumptions on the integrability of $f_n$ and $f$. Also this implicitly defines a metric topology on the space of Riemann integrable functions, called the $L^1$-metric.]
(4) We say that $f_n \xrightarrow[n \to \infty]{L^2} f$ (convergence in $L^2$, alternatively mean square convergence), if
\[ \int_I |f_n - f|^2\,dx \to 0 \]
as $n \to \infty$.
[This implicitly defines a metric topology on the space of Riemann integrable functions, called the $L^2$-metric.]
It is easy to show (this will be left to a homework assignment) that the following implications hold:
\[ L^\infty \overset{\text{easy}}{\Longrightarrow} L^2 \overset{\text{Cauchy-Schwarz}}{\Longrightarrow} L^1, \]
and also
\[ L^\infty \overset{\text{easy}}{\Longrightarrow} \text{pointwise}, \]
whereas all the other implications do not hold, with counter-examples of each left to a homework assignment.
Since one can always change the value of $f$ at a single point without affecting the Fourier coefficients at all (and hence neither the Fourier series), for pointwise convergence of the Fourier series to $f$ it will be essential to impose some stronger assumptions on $f$ than its mere integrability. One immediate such condition to consider is the continuity of $f$. Surprisingly, it was shown by Du Bois-Reymond (1873) that there exists a continuous function on $S^1$ whose Fourier series diverges (at one point). However, further imposing smoothness of $f$ will mend this problem.
I.2. Uniqueness of Fourier series, and uniform convergence. Decay of Fourier coefficients and smoothness. Suppose that for some class of functions indeed
\[ f(\theta) = \sum_{n=-\infty}^{\infty} \hat{f}(n) e^{in\theta} \]
pointwise (i.e. $(S_N(f))(\theta) \to f(\theta)$ pointwise), and $g$ is some other function such that for every $n \in \mathbb{Z}$ we have $\hat{f}(n) = \hat{g}(n)$. If the Fourier series corresponding to $g$ also converges to $g$, then one has
\[ g(\theta) = \sum_{n=-\infty}^{\infty} \hat{g}(n) e^{in\theta} = \sum_{n=-\infty}^{\infty} \hat{f}(n) e^{in\theta} = f(\theta). \]
That is, by the convergence assumption, the function is determined by its Fourier coefficients, and if we have two such functions with the same Fourier coefficients, then we can infer $f \equiv g$.
In particular, we can infer from the above that if we can’t prove the uniqueness of a function $f$ with the given Fourier series, then trying to prove the convergence is hopeless. By taking a difference
\[ (f, g) \mapsto f - g, \]
we reduce to asking whether, if $f : [-\pi, \pi] \to \mathbb{C}$ is an integrable function such that for all $n \in \mathbb{Z}$, $\hat{f}(n) = 0$, then $f \equiv 0$. Of course, this statement can’t be literally true with no further assumptions,
as changing, for example, the values of $f$ at finitely many points has no impact on $\hat{f}(n)$. However, we have the following positive result:
Theorem I.9 (Uniqueness of Fourier series). Let $f : S^1 \to \mathbb{C}$ be an integrable function such that for all $n \in \mathbb{Z}$, $\hat{f}(n) = 0$. Then for all continuity points $\theta_0 \in S^1$ of $f$, one has $f(\theta_0) = 0$. In particular, if $f \in C(S^1)$ (i.e. $f$ is continuous on $S^1$), then $f \equiv 0$.
It is known that any Riemann integrable function $f$ on $S^1$ is continuous at “most” points of $S^1$, in the sense of Lebesgue measure, a notion outside the scope of this course. Before giving a proof of Theorem I.9 we first draw an important corollary on pointwise convergence of the Fourier series.
Corollary I.10. Suppose $f \in C(S^1)$, and, moreover, that
\[ \sum_{n=-\infty}^{\infty} |\hat{f}(n)| < +\infty \tag{I.9} \]
(i.e. the series $\sum_{n=-\infty}^{\infty} \hat{f}(n)$ is absolutely convergent). Then
\[ f = \sum_{n=-\infty}^{\infty} \hat{f}(n) e^{in\theta} \]
uniformly on $S^1$, i.e. $(S_N(f))(\theta) \to f(\theta)$ uniformly.
[In particular, for $f(\theta) = \frac{\theta^2}{4}$ of Example I.6 the conditions of Corollary I.10 are all satisfied, hence our theory yields the value $\zeta(2) = \frac{\pi^2}{6}$.] We will see in a bit what is a sufficient (smoothness) condition to impose on $f$ so that $\sum_{n=-\infty}^{\infty} |\hat{f}(n)| < +\infty$ is satisfied.

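Proof of Corollary I.10. We use the convergence (I.9) to define the function
\[ g(\theta) = \sum_{n=-\infty}^{\infty} \hat{f}(n) e^{in\theta}; \]
by the said absolute convergence, and bearing in mind that
\[ \left| \hat{f}(n) e^{in\theta} \right| = |\hat{f}(n)|, \]
the series converges uniformly, and hence $g$ is continuous (as a uniform limit of continuous functions). Moreover,
\[ \hat{g}(n) = \frac{1}{2\pi} \int_{S^1} g(\theta) e^{-in\theta}\,d\theta = \sum_{m=-\infty}^{\infty} \hat{f}(m) \cdot \frac{1}{2\pi} \int_{S^1} e^{i(m-n)\theta}\,d\theta = \hat{f}(n), \]
since we may exchange the order of summation and integration by the uniform absolute convergence, and the orthogonality (I.2). Hence by the Uniqueness Theorem I.9 (applied to $f - g$), $f \equiv g$, and indeed
\[ f(\theta) = g(\theta) = \sum_{n=-\infty}^{\infty} \hat{f}(n) e^{in\theta} \]
uniformly on $S^1$.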

Proof of the Uniqueness Theorem I.9. First, assume that $f$ is real-valued; we will reduce the general case to this one at the end. We further assume, with no loss of generality, that for $f : [-\pi, \pi] \to \mathbb{R}$, $\theta_0 = 0$, and, by contradiction, that $f(\theta_0) > 0$ (otherwise, work with $-f(\cdot)$).
We observe that, for every trigonometric polynomial
\[ p(\theta) = \sum_{|k| \le N} a_k e^{ik\theta}, \]
one has
\[ \int_{S^1} f(\theta) p(\theta)\,d\theta = \sum_{|k| \le N} a_k \int_{S^1} f(\theta) e^{ik\theta}\,d\theta = 2\pi \sum_{|k| \le N} a_k \hat{f}(-k) = 0, \tag{I.10} \]
which vanishes by assumption. We aim to construct a sequence $\{p_k\}$ of trigonometric polynomials so that, as $k \to \infty$,
\[ \int_{S^1} f(\theta) p_k(\theta)\,d\theta \to +\infty, \tag{I.11} \]
certainly contradicting (I.10).


Since $f$ is assumed to be continuous at $\theta_0 = 0$, there exists a number $0 < \delta < \frac{\pi}{2}$, so that for all $\theta \in S^1$ with $|\theta| < \delta$, one has
\[ f(\theta) > \frac{f(0)}{2} > 0. \]
Now choose the trigonometric polynomial
\[ p_1(\theta) := \epsilon + \cos(\theta), \]
where $\epsilon > 0$ is so small that for all $\delta \le |\theta| \le \pi$, the inequality
\[ |p_1(\theta)| < 1 - \epsilon/2 \tag{I.12} \]
is satisfied (we have $p_1(\theta) = \epsilon + \cos(\theta) \ge -1 + \epsilon$ everywhere, and hence, in light of the cosine decreasing on $[0, \pi]$, it is equivalent to choosing $\epsilon > 0$ so that $p_1(\delta) = \epsilon + \cos(\delta) < 1 - \epsilon/2$, which does hold for sufficiently small positive $\epsilon$, since it holds for $\epsilon = 0$). Now choose $\eta > 0$ so that $\eta < \delta$, and for all $|\theta| < \eta$,
\[ p_1(\theta) \ge 1 + \frac{\epsilon}{2} \tag{I.13} \]
(again, by the decrease of the cosine and $p_1(0) = 1 + \epsilon > 1$, such $\eta$ exists). We finalize our choice of $p_k(\cdot)$ by taking
\[ p_k(\theta) := (p_1(\theta))^k, \tag{I.14} \]
which is again a trigonometric polynomial, being a power of one. Since $f$ is integrable, it is bounded, and thereupon let $B > 0$ be such that for all $\theta \in S^1$,
\[ |f(\theta)| \le B. \tag{I.15} \]
Now we are going to estimate
\[ I_k := \int_{S^1} f(\theta) p_k(\theta)\,d\theta \]
towards showing (I.11). We have
\[ I_k = \int_{|\theta| < \eta} + \int_{\eta \le |\theta| < \delta} + \int_{\delta \le |\theta| \le \pi} =: I_{k,1} + I_{k,2} + I_{k,3}. \tag{I.16} \]
The range of $I_{k,1}$ is where the bulk of the contribution will come from, whereas we are going to show that the contributions of $I_{k,2}$ and $I_{k,3}$ can be ignored ($I_{k,2}$ is non-negative, whereas $I_{k,3}$ is negligible).
In the first range $|\theta| < \eta < \delta$ one has $f(\theta) > \frac{f(0)}{2} > 0$, and $p_k(\theta) \ge (1 + \epsilon/2)^k$ by (I.13) and the definition (I.14) of $p_k$. Hence
\[ I_{k,1} = \int_{|\theta| < \eta} f(\theta) p_k(\theta)\,d\theta \ge 2\eta \cdot \frac{f(0)}{2} \cdot (1 + \epsilon/2)^k \to +\infty. \tag{I.17} \]
For the next range $\eta \le |\theta| < \delta$, we observe that for $|\theta| < \delta$, still $f(\theta) \ge \frac{f(0)}{2} \ge 0$, and $p_1(\theta) = \epsilon + \cos(\theta) > 0$ (this is why we cared to choose $\delta < \frac{\pi}{2}$), and so $p_k(\theta) \ge 0$. Therefore, the integrand $f(\theta) \cdot p_k(\theta) \ge 0$ in this range, and, in particular, the integral
\[ I_{k,2} = \int_{\eta \le |\theta| < \delta} f(\theta) p_k(\theta)\,d\theta \ge 0. \tag{I.18} \]
Next, for the range $\delta \le |\theta| \le \pi$, we use the bound (I.15) for $f$, valid on all of $S^1$, whereas for $p_k$ we use the bound $|p_k(\theta)| \le (1 - \epsilon/2)^k$, which is a by-product of the bound (I.12) on $p_1$, valid in the same range. We then use the triangle inequality to bound the corresponding integral
\[ |I_{k,3}| \le \int_{\delta \le |\theta| \le \pi} |f(\theta)| \cdot |p_k(\theta)|\,d\theta \le 2\pi \cdot B \cdot (1 - \epsilon/2)^k \to 0 \tag{I.19} \]
as $k \to \infty$. Our desired estimate (I.11) then follows upon consolidating (I.17), (I.18) and (I.19), and inserting these into (I.16).
Finally, we reduce the complex-valued case of $f : S^1 \to \mathbb{C}$ to the already resolved real-valued one. Given a function $f(\theta) = u(\theta) + iv(\theta)$, define
\[ g(\theta) := \overline{f(\theta)} = u(\theta) - iv(\theta) \]
with the Fourier coefficients
\[ \hat{g}(n) = \overline{\hat{f}(-n)}, \]
left as a homework exercise. Then, by assumption, for all $n \in \mathbb{Z}$, $\hat{g}(n) = 0$, and so are the Fourier coefficients of
\[ u(\theta) = \frac{f(\theta) + g(\theta)}{2} \]
and
\[ v(\theta) = \frac{f(\theta) - g(\theta)}{2i}. \]
Now we apply the Uniqueness Theorem to the real-valued functions $u$ and $v$ to conclude that, assuming that $\theta_0$ is a continuity point of $f$ (hence of both $u$ and $v$), $u(\theta_0) = v(\theta_0) = 0$, and then $f(\theta_0) = u(\theta_0) + iv(\theta_0) = 0$.

We are now concerned with when the series
\[ \sum_{n=-\infty}^{\infty} |\hat{f}(n)| < +\infty \]
is convergent, i.e. when one of the sufficient conditions (I.9) of Corollary I.10 is satisfied. We will find that the rate of decay of $|\hat{f}(n)|$ as $|n| \to \infty$ is intimately related to the smoothness of $f$. To measure the rate of decay of $|\hat{f}(n)|$ we introduce the following notation.
Notation I.11 (Big O). If $\phi, \psi : \mathbb{Z} \to \mathbb{R}_{>0}$ are two positive functions defined on the integers (i.e., two doubly-infinite sequences of positive numbers), we say that $\phi(n) = O(\psi(n))$ as $|n| \to \infty$ (“$\phi(n)$ is a big O of $\psi(n)$”), if there exists a number $C > 0$ so that for all $n$ with $|n|$ sufficiently large,
\[ \phi(n) \le C \cdot \psi(n). \]
The big O-notation could be generalized to more general index sets (other than $\mathbb{Z}$), and to convergence of the index to a finite value (instead of $|n| \to \infty$).
Example I.12. Recall that for Example I.6, we had
\[ |\hat{f}(n)| = O\left( \frac{1}{n^2} \right), \]
and, in particular, in this case
\[ \sum_{n=-\infty}^{\infty} |\hat{f}(n)| < \infty. \]

Lemma I.13 (“Differentiation is easy”). Let $k \ge 0$, and assume that $f : S^1 \to \mathbb{C}$ satisfies $f \in C^k(S^1)$, that is, $f$ is $k$ times continuously differentiable on $S^1$. Then
\[ \widehat{f^{(k)}}(n) = (in)^k \hat{f}(n), \]
and, moreover, as $|n| \to \infty$,
\[ |\hat{f}(n)| = O\left( \frac{1}{|n|^k} \right). \]
For example, if $f \in C(S^1)$ is continuous (i.e. $k = 0$), then $|\hat{f}(n)| = O(1)$, i.e. the Fourier coefficients are bounded. If $f \in C^2(S^1)$, then
\[ |\hat{f}(n)| = O\left( \frac{1}{n^2} \right), \tag{I.20} \]
the series $\sum_{n=-\infty}^{\infty} |\hat{f}(n)| < \infty$ is convergent, and therefore, by Corollary I.10,
\[ f(\theta) = \sum_{n \in \mathbb{Z}} \hat{f}(n) e^{in\theta}, \]
uniformly on $S^1$. Note that the condition $f \in C^2(S^1)$ is sufficient for the decay (I.20) (and hence for the uniform convergence of the Fourier series to the given function), but not necessary, as it is violated in Example I.6.
We will now prove Lemma I.13 in the case k = 1, the general case following by using mathematical
induction teamed up with the same argument, left to the homework assignment.
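Proof of Lemma I.13, case k = 1. First, if $f$ is continuous, then, by the triangle inequality,
\[ |\hat{f}(n)| = \left| \frac{1}{2\pi} \int_{S^1} f(\theta) e^{-in\theta}\,d\theta \right| \le \frac{1}{2\pi} \int_{S^1} |f(\theta)|\,d\theta = O(1), \]
since $f$ is bounded, which is the case $k = 0$. For $k = 1$, we assume $n \ne 0$, $f \in C^1(S^1)$, and use integration by parts (which is justified by the fact that $f$ is continuously differentiable):
\[ \hat{f}(n) = \frac{1}{2\pi} \int_{S^1} f(\theta) e^{-in\theta}\,d\theta = \frac{1}{-2\pi i n}\, f(\theta) e^{-in\theta} \Big|_{-\pi}^{\pi} + \frac{1}{2\pi i n} \int_{S^1} f'(\theta) e^{-in\theta}\,d\theta. \]
We have that
\[ \frac{1}{-2\pi i n}\, f(\theta) e^{-in\theta} \Big|_{-\pi}^{\pi} = 0, \]
by the $2\pi$-periodicity of $f$ (and of $e^{-in\theta}$), and
\[ \frac{1}{2\pi i n} \int_{S^1} f'(\theta) e^{-in\theta}\,d\theta = \frac{1}{in} \widehat{f'}(n), \]
so that
\[ \hat{f}(n) = \frac{1}{in} \widehat{f'}(n), \tag{I.21} \]
which is equivalent to
\[ in \hat{f}(n) = \widehat{f'}(n). \]
For $n = 0$:
\[ \widehat{f'}(0) = \frac{1}{2\pi} \int_{S^1} f'(\theta)\,d\theta = \frac{1}{2\pi}\, f(\theta) \Big|_{-\pi}^{\pi} = 0, \]
again, by periodicity. Finally, that
\[ |\hat{f}(n)| = O\left( \frac{1}{|n|} \right) \]
as $|n| \to \infty$ follows directly from (I.21), since $|\widehat{f'}(n)| = O(1)$ is bounded, by the already established case $k = 0$.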
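The identity $\widehat{f'}(n) = in\hat{f}(n)$ can be tested numerically on a smooth periodic function. In the sketch below, the test function $f(\theta) = e^{\cos\theta}$, the helper `coeff`, and the number of sample points are illustrative assumptions; the equally spaced Riemann sum is used because it is very accurate for smooth periodic integrands.

```python
import cmath
import math


def coeff(f, n, samples=4096):
    """Equally spaced Riemann sum for the n-th Fourier coefficient of a 2pi-periodic f."""
    h = 2 * math.pi / samples
    total = 0j
    for j in range(samples):
        t = -math.pi + j * h
        total += f(t) * cmath.exp(-1j * n * t)
    return total * h / (2 * math.pi)


def f(t):
    return math.exp(math.cos(t))


def df(t):
    # Derivative of f, computed by hand via the chain rule.
    return -math.sin(t) * math.exp(math.cos(t))


mismatch = max(abs(coeff(df, n) - 1j * n * coeff(f, n)) for n in range(-6, 7))
```

The mismatch is at the level of rounding error, and for $n = 0$ both sides vanish, as in the proof above.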

The statement of Lemma I.13 allows, under suitable conditions on $f$, to infer that if
\[ f \sim \sum_{n=-\infty}^{\infty} \hat{f}(n) e^{in\theta}, \]
then
\[ f^{(k)} \sim \sum_{n=-\infty}^{\infty} (in)^k \hat{f}(n) e^{in\theta}, \]
i.e. to differentiate the Fourier series term by term $k$ times (note that no convergence of the Fourier series of $f^{(k)}$ is asserted). The next result is, in a sense, the converse of Lemma I.13.
Lemma I.14. Conversely, if for some $k \ge 2$,
\[ |\hat{f}(n)| = O\left( \frac{1}{|n|^k} \right), \tag{I.22} \]
then $f \in C^{k-2}(S^1)$.
[Notice that Lemma I.14 “loses two powers” of $n$ relative to Lemma I.13. There are some (very difficult and highly acclaimed) refinements of this, e.g. due to L. Carleson, but, in general, Example I.6 shows that from a decay of $O\left( \frac{1}{n^2} \right)$ for the Fourier coefficients, one cannot even infer that $f \in C^1(S^1)$.]
The proof of Lemma I.14 is a simple corollary of the following theorem from a basic analysis course:
Theorem I.15 (Term by term differentiation). Assume that a series of differentiable functions $f_n : I \to \mathbb{C}$, where $I \subseteq \mathbb{R}$ is an interval (finite or infinite), converges pointwise
\[ \sum_n f_n(x) = f(x) \]
to some function $f : I \to \mathbb{C}$. Assume further that the corresponding series of derivatives
\[ \sum_n f_n'(x) = g(x) \]
converges uniformly to a function $g : I \to \mathbb{C}$. Then $f$ is differentiable, and $f' = g$.


 
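Proof of Lemma I.14. Since, by assumption, $|\hat{f}(n)| = O\left( \frac{1}{|n|^k} \right)$, $k \ge 2$, we have $\sum_{n \in \mathbb{Z}} |\hat{f}(n)| < \infty$, hence it follows from Corollary I.10 that $f(\theta) = \sum_{n=-\infty}^{\infty} \hat{f}(n) e^{in\theta}$ (uniformly on $S^1$). To apply Theorem I.15 one needs to validate that, under our assumptions, for all $0 \le j \le k - 2$, the series
\[ \sum_{n \in \mathbb{Z}} \hat{f}(n) \frac{d^j}{d\theta^j} e^{in\theta} \]
converges uniformly. This is indeed the case, since $\frac{d^j}{d\theta^j} e^{in\theta} = (in)^j e^{in\theta}$, and, accordingly, we have $\left| \hat{f}(n) \cdot \frac{d^j}{d\theta^j} e^{in\theta} \right| = |\hat{f}(n)| \cdot |n|^j = O\left( \frac{1}{|n|^{k-j}} \right)$ with $k - j \ge 2$, which is summable by the assumption (I.22).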

I.3. Convolutions and Dirichlet kernels. Smoothing effect of convolutions.
Definition I.16 (Convolution of functions). Let $f, g : S^1 \to \mathbb{C}$ be two integrable functions, and we will think of $f$ and $g$ as $2\pi$-periodic functions defined on $\mathbb{R}$. We define a third function $h : S^1 \to \mathbb{C}$, the convolution $h = f * g$ of $f$ and $g$, as
\[ (f * g)(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) g(x - y)\,dy, \]
$x \in \mathbb{R}$.
First, for every $x \in \mathbb{R}$, $f(y) \cdot g(x - y)$ is an integrable function of $y$ on $[-\pi, \pi]$, since it is a product of $f$ (which is integrable) with a shifted version $g(x - \cdot)$ of an integrable function; hence $h(x) = (f * g)(x)$ makes sense. Further, $h(\cdot)$ is $2\pi$-periodic, since $g$ is $2\pi$-periodic. Therefore $h : S^1 \to \mathbb{C}$ is well defined as a function on the circle.
Next, assuming that both $f$ and $g$ are continuous, we can transform the variables $x - y = z$ to write
\begin{align*}
(f * g)(x) &= \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) g(x - y)\,dy = -\frac{1}{2\pi} \int_{x+\pi}^{x-\pi} f(x - z) g(z)\,dz \\
&= \frac{1}{2\pi} \int_{x-\pi}^{x+\pi} g(z) f(x - z)\,dz = \frac{1}{2\pi} \int_{-\pi}^{\pi} g(z) f(x - z)\,dz = (g * f)(x),
\end{align*}
where we used the $2\pi$-periodicity of both $f$ and $g$, so that the integral is the same along any period. Therefore, at least for $f, g$ continuous, $f * g = g * f$, with the continuity assumption lifted in a homework assignment (via a “density” argument).
We can view $\frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) g(x - y)\,dy$ as taking a “weighted average” of translations of $g$, where the weights are according to $f$. E.g., if $f \equiv 1$, then
\[ (f * g)(x) \equiv \frac{1}{2\pi} \int_{-\pi}^{\pi} g(y)\,dy \]
is the (normalized) total mass of $g$, i.e. its mean value. Convolving functions has a smoothing effect (on both functions).
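Example I.17. Take a (small) number $\epsilon > 0$, and
\[ f(x) = g(x) = \chi_{[-\epsilon,\epsilon]}(x) = \begin{cases} 1 & x \in [-\epsilon, \epsilon] \\ 0 & \text{otherwise} \end{cases}. \]
The convolution (in home assignment) is the triangle function
\[ (f * g)(x) = \begin{cases} \frac{1}{2\pi}(2\epsilon - |x|) & |x| \le 2\epsilon \\ 0 & \text{otherwise} \end{cases}. \]
This function is continuous, $f * g \in C^0(S^1)$, but not differentiable at $x = 0$ and $x = \pm 2\epsilon$.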
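The triangle formula of Example I.17 can be compared with a direct numerical evaluation of the convolution integral. This is only a sketch under stated assumptions: the value $\epsilon = 0.5$, the helper names, and the midpoint rule are illustrative choices, and `wrap` implements the $2\pi$-periodic extension of the indicator function.

```python
import math


def wrap(t):
    """Maps t to its representative in [-pi, pi), i.e. applies 2pi-periodicity."""
    return (t + math.pi) % (2 * math.pi) - math.pi


def indicator(eps):
    """2pi-periodic extension of the indicator function of [-eps, eps]."""
    return lambda t: 1.0 if abs(wrap(t)) <= eps else 0.0


def convolve(f, g, x, samples=20000):
    """Midpoint-rule approximation of (1/2pi) * int_{-pi}^{pi} f(y) g(x - y) dy."""
    h = 2 * math.pi / samples
    total = 0.0
    for j in range(samples):
        y = -math.pi + (j + 0.5) * h
        total += f(y) * g(x - y)
    return total * h / (2 * math.pi)


def triangle(eps, x):
    """The claimed convolution: (2*eps - |x|) / (2*pi) for |x| <= 2*eps, else 0."""
    return max(2 * eps - abs(x), 0.0) / (2 * math.pi)


eps = 0.5
chi = indicator(eps)
worst = max(
    abs(convolve(chi, chi, x) - triangle(eps, x)) for x in (0.0, 0.3, 0.9, 1.5)
)
```

The two discontinuous indicators produce a continuous, piecewise linear function, a first instance of the smoothing effect mentioned above.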


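Example I.18 (Important example). Given $f : S^1 \to \mathbb{C}$ integrable,
\begin{align*}
(S_N(f))(x) &= \sum_{n=-N}^{N} \hat{f}(n) e^{inx} = \sum_{n=-N}^{N} \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) e^{-iny}\,dy\; e^{inx} = \frac{1}{2\pi} \sum_{n=-N}^{N} \int_{-\pi}^{\pi} f(y) e^{in(x-y)}\,dy \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) \Big[ \sum_{n=-N}^{N} e^{in(x-y)} \Big]\,dy = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) D_N(x - y)\,dy = (f * D_N)(x),
\end{align*}
with $D_N$ the previously discussed Dirichlet kernel.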


Proposition I.19. Let $f, g, h$ be Riemann integrable functions on $S^1$. Then:
(1) $f * (g + h) = f * g + f * h$.
(2) For all $c \in \mathbb{C}$,
\[ (cf) * g = c(f * g) = f * (cg). \]
(3) $f * g = g * f$.
(4) $f * g$ is continuous ($f, g$ are only assumed to be integrable). This is consistent with our interpretation of convolution as having a smoothing effect.
(5) For every $n \in \mathbb{Z}$,
\[ \widehat{f * g}(n) = \hat{f}(n) \cdot \hat{g}(n). \tag{I.23} \]
This is very important in applications (e.g. in computer science or engineering), as it allows for an efficient computation of the convolution on the Fourier side.
It is proposed as a home assignment to validate (I.23) for the case in Example I.17.
As for giving a proof of Proposition I.19: (1)-(2) are trivial, (3) was done for continuous functions and is left as a home assignment in the more general case. Concerning (4)-(5), the general strategy is to prove them first under the additional assumption that $f, g$ are continuous, and then relax this assumption using an approximation result to be formulated below (“density argument”).
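Identity (I.23) can also be probed numerically in the setting of Example I.17 (this is a numerical check, not a substitute for the analytic verification). The sketch assumes $\epsilon = 0.5$; the closed form $\hat{\chi}(n) = \sin(n\epsilon)/(\pi n)$ for $n \ne 0$ and $\hat{\chi}(0) = \epsilon/\pi$ is our own direct computation from (I.3), and all helper names are hypothetical.

```python
import cmath
import math

EPS = 0.5  # illustrative choice of epsilon


def triangle(x):
    """(f * g)(x) from Example I.17, for x in [-pi, pi]."""
    return max(2 * EPS - abs(x), 0.0) / (2 * math.pi)


def chi_hat(n):
    """Fourier coefficient of the indicator of [-EPS, EPS], computed by hand."""
    return EPS / math.pi if n == 0 else math.sin(n * EPS) / (math.pi * n)


def coeff(f, n, samples=20000):
    """Midpoint-rule approximation of the n-th Fourier coefficient (I.3)."""
    h = 2 * math.pi / samples
    total = 0j
    for j in range(samples):
        t = -math.pi + (j + 0.5) * h
        total += f(t) * cmath.exp(-1j * n * t)
    return total * h / (2 * math.pi)


# Left side of (I.23): coefficients of the convolution (the triangle function).
# Right side: product of the coefficients of the two indicator functions.
worst = max(abs(coeff(triangle, n) - chi_hat(n) ** 2) for n in range(-5, 6))
```

The coefficients of the triangle decay like $1/n^2$, twice as fast as those of each indicator, which is the multiplicative property at work.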
Proof of (4) for continuous f, g. First, by Cantor’s Theorem, $g$ is uniformly continuous on $[-\pi, \pi]$, hence, by periodicity, on $\mathbb{R}$. That is, for every $\epsilon > 0$, there exists a number $\delta > 0$ so that for all $y_1, y_2 \in \mathbb{R}$, if $|y_1 - y_2| < \delta$, then
\[ |g(y_1) - g(y_2)| < \epsilon. \]
Now,
\[ (f * g)(x_1) - (f * g)(x_2) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) \left( g(x_1 - y) - g(x_2 - y) \right) dy, \tag{I.24} \]
hence if $|x_1 - x_2| < \delta$, then $|(x_1 - y) - (x_2 - y)| = |x_1 - x_2| < \delta$. Therefore, we may use the triangle inequality on (I.24) to bound
\begin{align*}
|(f * g)(x_1) - (f * g)(x_2)| &\le \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(y)|\,|g(x_1 - y) - g(x_2 - y)|\,dy \le \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(y)| \cdot \epsilon\,dy \\
&= \frac{\epsilon}{2\pi} \int_{-\pi}^{\pi} |f(y)|\,dy \le \frac{\epsilon}{2\pi} \cdot 2\pi \cdot B,
\end{align*}
where $B$ is any bound satisfying $|f(x)| \le B$. Since the number $\frac{\epsilon}{2\pi} \cdot 2\pi \cdot B$ could be made arbitrarily small, this proves that $f * g$ is continuous.

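Proof of (5) assuming f, g are continuous. We may, using the continuity, change the order of integration:
\[
\begin{aligned}
\widehat{f * g}(n) &= \frac{1}{2\pi} \int_{-\pi}^{\pi} (f * g)(x) e^{-inx}\,dx
 = \frac{1}{2\pi} \int_{-\pi}^{\pi} \left( \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) g(x - y)\,dy \right) e^{-inx}\,dx \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) e^{-iny} \left( \frac{1}{2\pi} \int_{-\pi}^{\pi} g(x - y) e^{-in(x-y)}\,dx \right) dy \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) e^{-iny}\,dy \cdot \hat{g}(n) = \hat{f}(n) \cdot \hat{g}(n).
\end{aligned}
\]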


In order to reduce our statement to continuous functions we will need the following standard result
from analysis.
Lemma I.20 (“Approximation Lemma”). Let f be an integrable function on $S^1$, and
\[ \sup_{x \in S^1} |f(x)| \le B. \]
Then there exists a sequence $\{f_k\}_{k \ge 1}$ of continuous functions $f_k : S^1 \to \mathbb{C}$ so that for all $k \ge 1$,
\[ \sup_{x \in S^1} |f_k(x)| \le B, \]
and as $k \to \infty$,
\[ \int_{-\pi}^{\pi} |f(x) - f_k(x)|\,dx \to 0. \]

Proof of (4) for integrable functions. We apply the Approximation Lemma I.20 to obtain sequences
fk , gk respectively, satisfying the postulated properties. We claim that
fk ∗ gk → f ∗ g
uniformly on S 1 , which is sufficient to deduce that f ∗g is continuous (as a uniform limit of a sequence
of continuous functions). We have that
f ∗ g − fk ∗ gk = (f − fk ) ∗ g + fk ∗ (g − gk ),
whence we bound
\[
|((f - f_k) * g)(x)| \le \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(y) - f_k(y)| \cdot |g(x - y)|\,dy \le \frac{1}{2\pi} \sup_{z \in S^1} |g(z)| \cdot \int_{-\pi}^{\pi} |f(y) - f_k(y)|\,dy \to 0
\]
uniformly for $x \in S^1$ as $k \to \infty$, by the postulated property of the sequence $f_k$. Similarly, $f_k * (g - g_k) \to 0$ uniformly, this time using the fact that $\sup_{y \in S^1} |f_k(y)|$ is bounded by a constant independent
of k. All in all, $f_k * g_k \to f * g$ uniformly, which, as mentioned above, is sufficient for the
continuity of $f * g$.

Observations: (1) The multiplicative property (5) of Proposition I.19 is another manifestation
of the smoothing effect of the convolution, as smoothness of f is related to the decay of its Fourier
coefficients fb(n). (2) On the Fourier side the equality SN (f ) = DN ∗ f is trivial in light of the same
property (5) of Proposition I.19.

I.4. “Good kernels”. Good kernels generalize the family $\{p_k\}$ constructed in the proof of the Uniqueness
Theorem I.9, where we isolated the behaviour of a function around the origin by
convolving it with $p_k$, a function highly peaked at the origin. We are going to consider a family
$\{K_n(x)\}_{n=1}^{\infty}$ of functions on the circle, but such a family could also be indexed by a continuous parameter in
place of n. (But for simplicity we will assume n = 1, 2, . . ..)
Definition I.21 (Good kernels). A family of functions $\{K_n\}_{n=1}^{\infty}$, $K_n : S^1 \to \mathbb{C}$, is called a family of
good kernels, if the following hold:
a. Unit (signed) weight: for every $n \ge 1$,
\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} K_n(x)\,dx = 1. \]
b. Control over fluctuations: there exists a number $M > 0$ (independent of n) such that for every
$n \ge 1$ one has
\[ \int_{-\pi}^{\pi} |K_n(x)|\,dx \le M. \]
c. Mass concentration around the origin: for every $\delta > 0$, as $n \to \infty$ one has
\[ \int_{\delta \le |x| \le \pi} |K_n(x)|\,dx \to 0. \]

If $K_n(x) \ge 0$, as we will often encounter, then (b) follows automatically from (a) (with $M = 2\pi$). The functions
$K_n$ approximate, for large n, the “Dirac Delta function”. It makes no sense to ask whether a single
function is a good kernel; the notion applies only to a 1-parameter family of functions, as the
parameter tends to some value that is an accumulation point of the parameter set (in our case, grows
to infinity).
Theorem I.22 (“Good Kernels Theorem”). Let $\{K_n\}$ be a family of good kernels, and $f : S^1 \to \mathbb{C}$
integrable. Then for all continuity points $x \in S^1$ of f we have the limit
\[ \lim_{n \to \infty} (f * K_n)(x) = f(x). \tag{I.25} \]
If, in addition, f is continuous on $S^1$, then the limit (I.25) is uniform w.r.t. $x \in S^1$.


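Intuitively,
\[ (f * K_n)(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x - y) K_n(y)\,dy, \]
and, since $K_n$ concentrates around $y = 0$, with unit mass,
\[
\frac{1}{2\pi} \int_{-\pi}^{\pi} f(x - y) K_n(y)\,dy \approx \frac{1}{2\delta} \int_{-\delta}^{\delta} f(x - y)\,dy \approx \frac{1}{2\delta} \int_{-\delta}^{\delta} f(x)\,dy = \frac{1}{2\delta} \cdot 2\delta f(x) = f(x),
\]
to be made rigorous as follows.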


Proof. As f is continuous at x ∈ S 1 , for every ϵ > 0, there exists a number δ > 0 so that for every
|y| < δ,
|f (x − y) − f (x)| < ϵ.
[If f is continuous on $S^1$, then by Cantor’s Theorem, $\delta$ is independent of x.] We have
\[
(f * K_n)(x) - f(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} K_n(y) f(x - y)\,dy - f(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} K_n(y) \bigl( f(x - y) - f(x) \bigr)\,dy,
\]

by (a) of Good Kernels. Hence, by the triangle inequality,
\[
\begin{aligned}
|(f * K_n)(x) - f(x)| &\le \frac{1}{2\pi} \int_{-\pi}^{\pi} |K_n(y)| \cdot |f(x - y) - f(x)|\,dy \\
&= \frac{1}{2\pi} \int_{|y| < \delta} |K_n(y)| \cdot |f(x - y) - f(x)|\,dy
 + \frac{1}{2\pi} \int_{\delta \le |y| \le \pi} |K_n(y)| \cdot |f(x - y) - f(x)|\,dy \\
&\le \frac{\epsilon}{2\pi} \int_{|y| < \delta} |K_n(y)|\,dy + \frac{1}{2\pi} \cdot 2B \cdot \int_{\delta \le |y| \le \pi} |K_n(y)|\,dy,
\end{aligned} \tag{I.26}
\]
where $B > 0$ is any number so that $|f(\cdot)| \le B$ on $S^1$ (f is bounded, since it is integrable). We then
have
\[ \frac{\epsilon}{2\pi} \int_{|y| < \delta} |K_n(y)|\,dy \le \frac{\epsilon}{2\pi} M \]
by (b) of Good Kernels, which could be made arbitrarily small, and
\[ \frac{1}{2\pi} \cdot 2B \cdot \int_{\delta \le |y| \le \pi} |K_n(y)|\,dy \to 0 \]
as $n \to \infty$, by (c) of Good Kernels. Therefore,
\[ |(f * K_n)(x) - f(x)| < C \cdot \epsilon \]
for n sufficiently large, with some constant $C > 0$ independent of $\epsilon$. That yields the first assertion of Theorem I.22 (on pointwise
convergence). If f is continuous on $S^1$, then, by Cantor’s Theorem, it is uniformly continuous, so
$\delta > 0$ could be chosen independent of x, and so is the bound for the second summand on the r.h.s. of (I.26). It
then yields the uniform convergence in (I.25), i.e. the second assertion of Theorem I.22.

Recall that (Example I.18) the partial Fourier sums are given by
\[ (S_N(f))(x) = (f * D_N)(x), \]
where
\[ D_N(x) = \sum_{n=-N}^{N} e^{inx} = \frac{\sin((N + 1/2)x)}{\sin(x/2)} \]
is the Dirichlet kernel of degree N (see (I.8)). This immediately raises the question whether $\{D_N\}_{N \ge 1}$
is a family of good kernels. For if it were, then the Good Kernels Theorem I.22 would immediately
yield the convergence $(S_N(f))(x) \to f(x)$ as $N \to \infty$ at every continuity point x of f. Let us test
the properties (a)-(c) of good kernels.
a. First,
\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} D_N(x)\,dx = 1, \tag{I.27} \]
seen by using the expression (I.7), with only the integral corresponding to n = 0 not vanishing.
Hence (a) is satisfied.
b. Unfortunately, it is possible to show that
\[ \int_{-\pi}^{\pi} |D_N(x)|\,dx \ge c \log N \tag{I.28} \]
for some $c > 0$, so (b) is violated due to the fluctuations of $D_N$, which increase with N; the proof is relegated to a
guided exercise in a homework assignment.
We conclude that, somewhat disappointingly, $\{D_N\}_{N \ge 1}$ is not a family of good kernels. [In fact,
using the estimate (I.28), it is possible to construct a sequence of continuous functions $f_N : S^1 \to \mathbb{R}$
such that $f_N(0) = 0$ but $|(S_N f_N)(0)| \ge c' \log N$ with some $c' > 0$. In particular, it is possible to
construct a function $f : S^1 \to \mathbb{R}$ so that there exists a point $x_0 \in S^1$ such that $(S_N f)(x_0) \to \infty$, so
that it does not converge to $f(x_0)$.]
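The contrast between (I.27) and (I.28) is easy to observe numerically. The sketch below is an illustration under stated assumptions: the averages $\frac{1}{2\pi}\int_{-\pi}^{\pi}$ are approximated by the midpoint rule with an even number M of nodes (so that $x = 0$ is never a node), and the helper names and the sampled degrees N are hypothetical choices. Note that the code computes the normalised quantity $\frac{1}{2\pi}\int |D_N|$, which differs from the left-hand side of (I.28) by the factor $2\pi$.

```python
import math


def dirichlet(N, x):
    """Closed form of the Dirichlet kernel, valid for x not a multiple of 2*pi."""
    return math.sin((N + 0.5) * x) / math.sin(x / 2)


def mean_over_circle(func, M=20000):
    """Midpoint rule for (1/2pi) * integral of func over [-pi, pi]; M must be even."""
    h = 2 * math.pi / M
    return sum(func(-math.pi + (j + 0.5) * h) for j in range(M)) / M


def unit_weight(N):
    """Approximates (1/2pi) * integral of D_N, which equals 1 by (I.27)."""
    return mean_over_circle(lambda x: dirichlet(N, x))


def l1_mean(N):
    """Approximates (1/2pi) * integral of |D_N|, which grows with N."""
    return mean_over_circle(lambda x: abs(dirichlet(N, x)))


# The normalised L^1 norms for a few degrees; they increase slowly with N.
l1_means = {N: l1_mean(N) for N in (1, 4, 16, 64)}
```

The values in `l1_means` increase with N, consistently with the logarithmic lower bound, while `unit_weight(N)` stays equal to 1 up to rounding.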
I.5. Cesàro summability. We will be able to establish the convergence ($(S_N f)(x) \to f(x)$) in a
different, weaker, sense. Recall that if $\sum_{k=0}^{\infty} c_k$ is a series of (complex) numbers $c_k \in \mathbb{C}$, then we consider the partial sums
\[ s_n := \sum_{k=0}^{n} c_k, \]
and say that $\sum_{k=0}^{\infty} c_k = S$, if
\[ s_n \xrightarrow[n \to \infty]{} S. \]
Instead, we consider the Cesàro means of $\{s_n\}_{n \ge 0}$:
\[ \sigma_N := \frac{\sum_{n=0}^{N-1} s_n}{N} = \frac{s_0 + s_1 + \ldots + s_{N-1}}{N}, \]
also called the N’th Cesàro sums of the series $\sum_{k=0}^{\infty} c_k$.
Definition I.23. We say that the series $\sum_{k=0}^{\infty} c_k$ is Cesàro summable to $\sigma \in \mathbb{C}$, if
\[ \lim_{N \to \infty} \sigma_N = \sigma. \]
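Example I.24. Consider the series
\[ \sum_{k=0}^{\infty} (-1)^k = 1 - 1 + 1 - 1 + \ldots, \]
not convergent in the usual sense. Since
\[ s_n = \begin{cases} 1 & n \text{ even} \\ 0 & n \text{ odd} \end{cases}, \]
we have that
\[ \sum_{n=0}^{N-1} s_n = \frac{N}{2} + O(1), \]
and thus $\sigma_N = \frac{1}{2} + O(1/N)$, and the series $\sum_{k=0}^{\infty} (-1)^k$ is Cesàro summable to $\frac{1}{2}$.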
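The computation in Example I.24 can be reproduced in a few lines. The following Python sketch (the function name `cesaro_means` and the truncation at 1001 terms are assumptions made for this illustration) accumulates the partial sums $s_n$ and their running averages $\sigma_N$.

```python
def cesaro_means(terms):
    """Return [sigma_1, ..., sigma_N] for the series with terms c_0, ..., c_{N-1}."""
    means = []
    partial = 0.0   # current partial sum s_n
    running = 0.0   # s_0 + s_1 + ... + s_n
    for n, c in enumerate(terms):
        partial += c
        running += partial
        means.append(running / (n + 1))  # this is sigma_{n+1}
    return means


# The series 1 - 1 + 1 - 1 + ... truncated after 1001 terms.
grandi = [(-1) ** k for k in range(1001)]
sigma = cesaro_means(grandi)  # sigma[N - 1] holds sigma_N
```

Here `sigma[999]` is $\sigma_{1000} = 1/2$ exactly, and `sigma[1000]` is $\sigma_{1001} = 501/1001$, in agreement with $\sigma_N = \frac{1}{2} + O(1/N)$.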

Fact I.25 (Guided homework assignment). If
\[ \sum_{k=0}^{\infty} c_k = s \]
(in the usual sense of convergence of series), then $\sum_{k=0}^{\infty} c_k$ is Cesàro summable to s. The converse is
false, as shown by Example I.24.

Definition I.26. We say that a series $\sum_{k=0}^{\infty} f_k$ of functions $f_k : S^1 \to \mathbb{C}$ is Cesàro summable to
$f : S^1 \to \mathbb{C}$ at $x \in S^1$ (resp. uniformly Cesàro summable), if the series of numbers $\sum_{k=0}^{\infty} f_k(x)$
is Cesàro summable to $f(x)$ (resp. $\sigma_N(x) \to f(x)$ uniformly on $S^1$, where $\sigma_N(x)$ are the Cesàro sums
of $\sum_{k=0}^{\infty} f_k(x)$).

For $f : S^1 \to \mathbb{C}$ let $(S_n(f))(x)$ be the partial Fourier sums of f, and
\[ (\sigma_N(f))(x) := \frac{\sum_{n=0}^{N-1} (S_n(f))(x)}{N} = \frac{(S_0(f))(x) + \ldots + (S_{N-1}(f))(x)}{N}. \]
We have
\[
\begin{aligned}
(\sigma_N(f))(x) &= \frac{1}{N} \bigl( (S_0(f))(x) + \ldots + (S_{N-1}(f))(x) \bigr) = \frac{1}{N} \bigl( (D_0 * f)(x) + \ldots + (D_{N-1} * f)(x) \bigr) \\
&= \frac{1}{N} \bigl( (D_0 + \ldots + D_{N-1}) * f \bigr)(x) = (F_N * f)(x),
\end{aligned}
\]
where
\[ F_N(x) = \frac{D_0(x) + \ldots + D_{N-1}(x)}{N} \]
is the N’th Fejér Kernel.
Lemma I.27. (1) We have explicitly (guided home assignment)
\[ F_N(x) = \frac{1}{N} \frac{\sin(Nx/2)^2}{\sin(x/2)^2}. \tag{I.29} \]
(2) The Fejér kernels $\{F_N\}_{N \ge 1}$ are good kernels.
Proof that the $\{F_N\}$ are good kernels. a. First, we defined
\[ F_N(x) = \frac{D_0(x) + \ldots + D_{N-1}(x)}{N}. \]
Therefore, on recalling that (I.27) holds, i.e. that every $D_n$ satisfies property (a) of good kernels,
\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} F_N(x)\,dx = \frac{1}{N} \sum_{n=0}^{N-1} \frac{1}{2\pi} \int_{-\pi}^{\pi} D_n(x)\,dx = 1. \]
b. Follows directly from (a), in light of the positivity of $F_N$.
c. Let $0 < \delta < \pi$. We use the trivial inequality $|\sin(Nx/2)| \le 1$ for the numerator in (I.29), whereas
for the denominator in (I.29), we use
\[ |\sin(x/2)| \ge c_\delta > 0 \]
for $\delta \le |x| \le \pi$ (e.g. take $c_\delta = \sin(\delta/2)$, and use that the sine is increasing on $[0, \pi/2]$). Then
\[ \int_{\delta \le |x| \le \pi} |F_N(x)|\,dx \le \frac{1}{N} \int_{\delta \le |x| \le \pi} \frac{dx}{c_\delta^2} \le \frac{2\pi}{N c_\delta^2} \to 0 \]
as $N \to \infty$.
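Both parts of Lemma I.27 lend themselves to a quick numerical check. The sketch below is an illustration only; the function names, the midpoint rule and the sample values of N and $\delta$ are assumptions of this example. It compares the closed form (I.29) with the defining average of Dirichlet kernels, and approximates the tail integral appearing in property (c).

```python
import math


def dirichlet(n, x):
    """Dirichlet kernel D_n(x); the value at x = 0 (mod 2*pi) is 2n + 1."""
    s = math.sin(x / 2)
    if abs(s) < 1e-12:
        return 2 * n + 1
    return math.sin((n + 0.5) * x) / s


def fejer_by_definition(N, x):
    """F_N(x) as the average of D_0(x), ..., D_{N-1}(x)."""
    return sum(dirichlet(n, x) for n in range(N)) / N


def fejer_closed_form(N, x):
    """F_N(x) via (I.29); the value at x = 0 (mod 2*pi) is N."""
    s = math.sin(x / 2)
    if abs(s) < 1e-12:
        return float(N)
    return math.sin(N * x / 2) ** 2 / (N * s ** 2)


def tail_mass(N, delta, M=5000):
    """Midpoint approximation of the integral of F_N over delta <= |x| <= pi."""
    h = (math.pi - delta) / M
    one_side = sum(fejer_closed_form(N, delta + (j + 0.5) * h) for j in range(M)) * h
    return 2 * one_side  # F_N is even
```

For fixed $\delta$ the value `tail_mass(N, delta)` stays below the bound $2\pi / (N c_\delta^2)$ from the proof and decreases as N grows.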

Hence, in light of the Good Kernels Theorem I.22, we have the following result, bearing in mind
that we have already seen that $\sigma_N(f) = F_N * f$:

Theorem I.28 (Fejér’s Theorem). Let f : S 1 → C be an integrable function.


(1) The Fourier series of f is Cesàro summable to f at every point x ∈ S 1 of continuity of f (i.e.
(σN (f ))(x) → f (x)).
(2) If, in addition, f is continuous on S 1 , then the Fourier series of f is uniformly Cesàro summable
to f (i.e. (σN (f ))(x) → f (x) uniformly on S 1 ).

[In particular, we rediscover the Uniqueness Theorem I.9: if f is integrable on S 1 , and for all
n ∈ Z we have fb(n) = 0, then (σN (f ))(x) ≡ 0, so, by Fejér’s Theorem I.28, f (x) = 0 for all points
of continuity x ∈ S 1 .]
[That also shows the converse implications in the statements: If f is continuous, then the following
holds. 1. f : S 1 → R is even, if and only if for all n ≥ 1, cn = 0. 2. f : S 1 → R is odd, if and only if
for all n ≥ 0, bn = 0.]
We have an important corollary that we are going to find very useful later:

Corollary I.29 (Uniform approximation of continuous functions by trigonometric polynomials).


Every continuous function f : S 1 → C can be uniformly approximated by trigonometric polynomials
(finite linear combinations of einx ), i.e., for every ϵ > 0, there exists a trigonometric polynomial P ,
so that for every x ∈ S 1 ,
|f (x) − P (x)| < ϵ
(equivalently, ∥f − P ∥∞ < ϵ).

[Unfortunately, the degree of P will in general grow as $\epsilon \to 0$. Combined with the Approximation Lemma I.20, this also implies
that integrable functions can be approximated by trigonometric polynomials in the $L^1$-sense.]
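To see Fejér's Theorem and Corollary I.29 at work, one may take $P = \sigma_N(f)$ for a concrete continuous function. The sketch below is a hedged illustration: the choice $f(x) = |x|$ on $[-\pi, \pi]$, its Fourier coefficients $\hat{f}(0) = \pi/2$, $\hat{f}(n) = ((-1)^n - 1)/(\pi n^2)$ for $n \ne 0$ (obtained by a direct integration by parts), and all function names are assumptions of this example. It also uses the identity $\sigma_N(f)(x) = \sum_{|n| < N} (1 - |n|/N) \hat{f}(n) e^{inx}$, which follows by exchanging the order of summation in the definition of $\sigma_N$.

```python
import cmath
import math


def coeff_abs(n):
    """Fourier coefficient of f(x) = |x| on [-pi, pi]."""
    if n == 0:
        return math.pi / 2
    return ((-1) ** n - 1) / (math.pi * n * n)


def fejer_mean(N, x):
    """sigma_N(f)(x) = sum over |n| < N of (1 - |n|/N) * fhat(n) * e^{inx}."""
    total = sum(
        (1 - abs(n) / N) * coeff_abs(n) * cmath.exp(1j * n * x)
        for n in range(-N + 1, N)
    )
    return total.real  # f is real and even, so the imaginary part vanishes


def sup_error(N, samples=201):
    """Largest value of |sigma_N(f)(x) - |x|| over a uniform sample of [-pi, pi]."""
    xs = [-math.pi + 2 * math.pi * j / (samples - 1) for j in range(samples)]
    return max(abs(fejer_mean(N, x) - abs(x)) for x in xs)
```

The sampled uniform error `sup_error(N)` decreases as N grows, and it is largest near the corners $x = 0$ and $x = \pm\pi$ of f.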
I.6. Abel summability and Poisson Kernel. Steady-state heat equation. Given a series
$\sum_{k=0}^{\infty} c_k$ of complex numbers, let $A : [0, 1) \to \mathbb{C}$ be defined as
\[ A(r) := \sum_{k=0}^{\infty} c_k r^k \]
(Taylor series on $[0, 1)$ corresponding to Taylor coefficients $\{c_k\}_{k \ge 0}$). We say that $\sum_{k=0}^{\infty} c_k$ is Abel
summable to s, if: for every $r \in [0, 1)$ the series $\sum_{k=0}^{\infty} c_k r^k$ converges and
\[ \lim_{r \to 1^-} A(r) = s. \]
The sums $A(r)$ are the Abel means of the series.



Fact I.30. If $\sum_{k=0}^{\infty} c_k$ is Cesàro summable to s, then also $\sum_{k=0}^{\infty} c_k$ is Abel summable to s. (This in
particular holds if $\sum_{k=0}^{\infty} c_k = s$ in the usual sense of convergence.) The converse fails, see the counterexample
in a homework assignment.

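In the context of Fourier series $f(\theta) \sim \sum_{n=-\infty}^{\infty} a_n e^{in\theta}$, it makes sense to consider symmetric sums,
i.e. take
\[ (A_r(f))(\theta) = \sum_{n=-\infty}^{\infty} r^{|n|} a_n e^{in\theta}, \]
which corresponds to the Abel means of the series $\sum_{n=0}^{\infty} c_n$ with $c_0 = a_0$, and for $n > 0$,
\[ c_n = a_n e^{in\theta} + a_{-n} e^{-in\theta}. \]
Since f is integrable, $\{a_n\}$ is a bounded sequence, so for every $\theta$ the series converges absolutely for $r \in [0, 1)$, and $r \mapsto (A_r(f))(\theta)$ is well defined as a complex-valued
function on $[0, 1)$. Just as with the Cesàro means,
\[ (A_r(f))(\theta) = (f * P_r)(\theta), \]
where
\[ P_r(\theta) = \sum_{n=-\infty}^{\infty} r^{|n|} e^{in\theta} \tag{I.30} \]
is the Poisson Kernel, indexed by the continuous variable r (left to homework assignment).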
Lemma I.31. (1) We have explicitly for every $r \in [0, 1)$:
\[ P_r(\theta) = \frac{1 - r^2}{1 - 2r\cos\theta + r^2}. \tag{I.31} \]
(2) The $\{P_r\}_r$ is a family of good kernels as $r \to 1^-$. [In this case property (c) of good kernels is
$\lim_{r \to 1^-} \int_{\delta \le |\theta| \le \pi} |P_r(\theta)|\,d\theta = 0$.]

Proof. (1) Left to homework assignment.


(2) We are to prove properties (a)-(c) of good kernels.
a. Follows directly from (I.30).
b. Follows from (a) and the positivity of Pr (which itself is a consequence of the explicit expression
(I.31)).
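c. The denominator of $P_r$ in (I.31) is
\[ 1 - 2r\cos\theta + r^2 = (1 - r)^2 + 2r(1 - \cos\theta). \]
Therefore, for every $\delta > 0$, $\delta \le |\theta| \le \pi$, $\frac{1}{2} \le r < 1$ (recall that eventually $r \to 1^-$):
\[ 1 - 2r\cos\theta + r^2 \ge 2r(1 - \cos\theta) \ge 2 \cdot \frac{1}{2} \cdot (1 - \cos\delta) > 0, \]
so
\[
\int_{\delta \le |\theta| \le \pi} |P_r(\theta)|\,d\theta = \int_{\delta \le |\theta| \le \pi} P_r(\theta)\,d\theta \le (1 - r^2) \cdot \int_{\delta \le |\theta| \le \pi} \frac{d\theta}{1 - \cos\delta} \le (1 - r^2) \cdot \frac{2\pi}{1 - \cos\delta} \xrightarrow[r \to 1^-]{} 0.
\]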
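The closed form (I.31) and the concentration property (c) can be checked numerically. The following sketch is an illustration under stated assumptions: the series (I.30) is truncated at $|n| \le 200$, the tail integral is approximated by the midpoint rule, and the function names and sample parameters are hypothetical.

```python
import cmath
import math


def poisson_series(r, theta, terms=200):
    """Truncation of the series (I.30) at |n| <= terms."""
    total = sum(
        r ** abs(n) * cmath.exp(1j * n * theta) for n in range(-terms, terms + 1)
    )
    return total.real  # the terms n and -n are complex conjugates


def poisson_closed_form(r, theta):
    """The closed form (I.31) of the Poisson kernel."""
    return (1 - r * r) / (1 - 2 * r * math.cos(theta) + r * r)


def tail_mass(r, delta, M=5000):
    """Midpoint approximation of the integral of P_r over delta <= |theta| <= pi."""
    h = (math.pi - delta) / M
    one_side = sum(poisson_closed_form(r, delta + (j + 0.5) * h) for j in range(M)) * h
    return 2 * one_side  # P_r is even in theta
```

For $r$ close to 1 the value `tail_mass(r, delta)` is small and respects the bound $(1 - r^2) \cdot 2\pi / (1 - \cos\delta)$ obtained above.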


Lemma I.31 teamed up with the Good Kernels Theorem I.22 yields the following result (which, in
light of Fact I.30 already follows from Fejér’s Theorem I.28):
Theorem I.32 (Abel’s Theorem). The Fourier series of an integrable function f : S 1 → C is Abel
summable to f (x) at every point of continuity x ∈ S 1 of f . If f is continuous on S 1 , then the Fourier
series of f is uniformly Abel summable to f .

As an application of Abel summability, we are going to solve the steady-state heat equation.
Consider R2 to be an (infinite) metal plate, and for (x, y) ∈ R2 and t ≥ 0, let u(x, y, t) be the
temperature at (x, y) at time t. Suppose that u(·, ·, 0) is given, and we are interested in the heat
propagation, i.e. the evolution of u(x, y, t), t > 0. Let S = S(x0 , y0 ; h) for (x0 , y0 ) ∈ R2 be the side-h
square centred at (x0 , y0 ), and assume that h is small (infinitesimal). Then
\[ H(t) := \sigma \cdot \iint_{S} u(x, y, t)\,dx\,dy, \]
with some $\sigma > 0$ constant associated to the metal plate, is the total heat on S.
Newton's law of cooling states that the heat flows from higher to lower temperature, at a rate
proportional to the difference of temperatures at two nearby points, i.e. the gradient. On the one
hand,
\[ \frac{dH}{dt} = \sigma \cdot \iint_{S} \frac{\partial u}{\partial t}(x, y, t)\,dx\,dy \approx \sigma h^2 \cdot \frac{\partial u}{\partial t}(x_0, y_0, t), \tag{I.32} \]
whereas on the other hand, by Newton, we can estimate the amount of heat travelling through each
of the 4 sides of the square, entering or escaping from it:
\[
\frac{dH}{dt} \approx \kappa h \left[ \frac{\partial u}{\partial x}(x_0 + h/2, y_0, t) - \frac{\partial u}{\partial x}(x_0 - h/2, y_0, t) + \frac{\partial u}{\partial y}(x_0, y_0 + h/2, t) - \frac{\partial u}{\partial y}(x_0, y_0 - h/2, t) \right], \tag{I.33}
\]
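with $\kappa > 0$ the conductivity of the material. We can now approximate
\[ \frac{\partial u}{\partial x}(x_0 + h/2, y_0, t) - \frac{\partial u}{\partial x}(x_0 - h/2, y_0, t) \approx h \cdot \frac{\partial^2 u}{\partial x^2}(x_0, y_0, t), \]
and
\[ \frac{\partial u}{\partial y}(x_0, y_0 + h/2, t) - \frac{\partial u}{\partial y}(x_0, y_0 - h/2, t) \approx h \cdot \frac{\partial^2 u}{\partial y^2}(x_0, y_0, t), \]
so that (I.33) reads
\[ \frac{dH}{dt} \approx \kappa h^2 \left( \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} \right). \tag{I.34} \]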
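Comparing (I.32) to (I.34) yields the (time-dependent) heat equation
\[ \frac{\sigma}{\kappa} \frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} =: \Delta u, \]
where
\[ \Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} \]
is the Laplace operator (Laplacian) on $\mathbb{R}^2$. As $t \to \infty$, the system tends to an equilibrium, i.e., u
becomes independent of t, so that $\frac{\partial u}{\partial t} \equiv 0$, and the steady-state heat equation is
\[ \Delta u \equiv 0. \tag{I.35} \]
We say that a function u satisfying (I.35) is a harmonic function.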
Let
\[ D := \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 < 1\} \]
be the unit disc, in polar coordinates
\[ D = \{(r, \theta) : 0 \le r < 1\}, \]
with boundary
\[ S^1 = \partial D = \{(r, \theta) : r = 1\}. \]
The Dirichlet problem is to find a harmonic function $u : D \to \mathbb{C}$ such that $u|_{S^1} \equiv f$, with some
$f : S^1 \to \mathbb{C}$ given, i.e. $u(1, \theta) \equiv f(\theta)$. Now let us assume that the Dirichlet problem admits a
solution of the form
\[ u(r, \theta) = \sum_{n=-\infty}^{\infty} a_n r^{|n|} e^{in\theta}. \]
Substituting $r = 1$ suggests that (assuming that f is integrable on $S^1$) we should take $a_n = \hat{f}(n)$,
whence
\[ u(r, \theta) = (A_r(f))(\theta) = (f * P_r)(\theta) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(\phi) \cdot P_r(\theta - \phi)\,d\phi. \]

Indeed, we have the following result.


Theorem I.33. Let $f : S^1 \to \mathbb{C}$ be integrable, and define
\[ u(r, \theta) := (f * P_r)(\theta). \tag{I.36} \]
(1) The function $u(\cdot, \cdot)$ has two continuous derivatives on D, and $\Delta u \equiv 0$.
(2) For every point $\theta \in S^1$ of continuity of f,
\[ \lim_{r \to 1^-} u(r, \theta) = f(\theta). \]
If f is continuous on $S^1$, then this limit is uniform.
[(3) If f is continuous on $S^1$, then u as above is the unique solution to the steady-state heat
equation on D satisfying (1) and (2).]
Proof. As before, we rewrite (I.36) as
\[ u(r, \theta) = \sum_{n=-\infty}^{\infty} a_n r^{|n|} e^{in\theta}, \]
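where $a_n = \hat{f}(n)$ are the Fourier coefficients of f. Since $|e^{in\theta}| = 1$ and the $\{a_n\}$ are bounded, say by
$|a_n| \le C$, to check that $u \in C^2(D)$, it is sufficient to check that for every $\rho < 1$, the series of the
possible second derivatives (either w.r.t. r or w.r.t. $\theta$, or mixed), majorized by
\[ \sum_{n=-\infty}^{\infty} |a_n| n^2 r^{|n|} \le C \cdot \sum_{n=-\infty}^{\infty} n^2 \rho^{|n|} < +\infty, \]
is uniformly convergent in the disc $r < \rho$, which is indeed the case. To check that $\Delta u = 0$, we can
evaluate the Laplacian in polar coordinates (homework assignment) to be
\[ \Delta u = \frac{\partial^2 u}{\partial r^2} + \frac{1}{r} \cdot \frac{\partial u}{\partial r} + \frac{1}{r^2} \cdot \frac{\partial^2 u}{\partial \theta^2}, \tag{I.37} \]
whence
\[ \frac{\partial^2 u}{\partial r^2} = \sum_{n=-\infty}^{\infty} a_n |n| (|n| - 1) r^{|n|-2} e^{in\theta}, \tag{I.38} \]
\[ \frac{1}{r} \cdot \frac{\partial u}{\partial r} = \sum_{n=-\infty}^{\infty} a_n |n| r^{|n|-2} e^{in\theta}, \tag{I.39} \]
\[ \frac{1}{r^2} \cdot \frac{\partial^2 u}{\partial \theta^2} = \sum_{n=-\infty}^{\infty} a_n (-n^2) r^{|n|-2} e^{in\theta}, \tag{I.40} \]
justified by the said uniform absolute convergence for $r < \rho < 1$.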
Finally, summing up (I.38), (I.39) and (I.40), and substituting these into (I.37), we obtain, using
the elementary identity
|n|(|n| − 1) + |n| − n2 = n2 − |n| + |n| − n2 = 0,
that ∆u ≡ 0 on D, as claimed. That the second assertion of Theorem I.33 holds is then a direct
implication of Abel’s Theorem I.32. The uniqueness of a solution u(·, ·) satisfying (1) and (2) of
Theorem I.33, subject to f continuous on S 1 , is out of the scope of this course.
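The two assertions of Theorem I.33 can be observed numerically for a concrete boundary function. The sketch below is a hedged illustration: it takes $f(\theta) = |\theta|$ on $[-\pi, \pi]$ with Fourier coefficients $\hat{f}(0) = \pi/2$, $\hat{f}(n) = ((-1)^n - 1)/(\pi n^2)$ (an assumption of this example, obtained by integration by parts), truncates the series for u at $|n| \le 2000$, and replaces the Laplacian by the standard five-point finite-difference stencil in Cartesian coordinates. All names and numerical parameters are hypothetical choices.

```python
import cmath
import math


def coeff_abs(n):
    """Fourier coefficient of f(theta) = |theta| on [-pi, pi]."""
    if n == 0:
        return math.pi / 2
    return ((-1) ** n - 1) / (math.pi * n * n)


def u_polar(r, theta, terms=2000):
    """Truncated series u(r, theta) = sum of fhat(n) * r^|n| * e^{i n theta}."""
    total = sum(
        coeff_abs(n) * r ** abs(n) * cmath.exp(1j * n * theta)
        for n in range(-terms, terms + 1)
    )
    return total.real  # f is real, hence so is u


def u_cartesian(x, y):
    """The same function expressed in Cartesian coordinates."""
    return u_polar(math.hypot(x, y), math.atan2(y, x))


def discrete_laplacian(x, y, h=1e-3):
    """Five-point finite-difference approximation of the Laplacian of u at (x, y)."""
    return (
        u_cartesian(x + h, y) + u_cartesian(x - h, y)
        + u_cartesian(x, y + h) + u_cartesian(x, y - h)
        - 4 * u_cartesian(x, y)
    ) / (h * h)
```

At an interior point such as $(0.3, 0.2)$ the discrete Laplacian is close to zero, and for r close to 1 the value `u_polar(r, 1.0)` is close to $f(1) = 1$.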

I.7. Mean-square convergence of Fourier series. Our main goal in this section is to prove the
following remarkable result:
Theorem I.34. Suppose that f is integrable on $S^1$. Then, as $N \to \infty$,
\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(\theta) - (S_N(f))(\theta)|^2\,d\theta \to 0. \]
[The optimal condition on f is $f \in L^2(S^1)$, which is “weaker” than the Riemann integrability of f.]
In what follows we will work with the linear space $\mathcal{R}$ of Riemann integrable functions on $S^1$,
equipped with the inner product
\[ (f, g) \mapsto \langle f, g \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(\theta) \overline{g(\theta)}\,d\theta, \tag{I.41} \]
and first we discuss the relevant linear algebra preliminaries.


Definition I.35. Let V be a linear space (finite or infinite dimensional) over R or C, denoted F
(i.e. in either case F = R or F = C).
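(1) If V is over $\mathbb{C}$, then an inner product is a map
\[ (v, w) \in V^2 \mapsto \langle v, w \rangle \in \mathbb{C}, \]
satisfying the following properties:
(a) Conjugate symmetry: for every $v, w \in V$, $\langle v, w \rangle = \overline{\langle w, v \rangle}$.
(b) Linearity w.r.t. the first argument: for every $v, w, u \in V$ and $a \in \mathbb{C}$, $\langle av, w \rangle = a \langle v, w \rangle$,
and
\[ \langle v + w, u \rangle = \langle v, u \rangle + \langle w, u \rangle. \]
(c) Positive definiteness: for every $v \in V$, $\langle v, v \rangle \in \mathbb{R}_{\ge 0}$, and also $\langle v, v \rangle = 0$, if and only if
$v = 0$. [Positive semi-definite if only “⇐” holds.]
[Caution: For $v, w \in V$, $a \in \mathbb{C}$, one has
\[ \langle v, aw \rangle = \overline{\langle aw, v \rangle} = \overline{a} \cdot \overline{\langle w, v \rangle} = \overline{a} \cdot \langle v, w \rangle, \]
following from (a) and (b). Also,
\[ \langle v, w + u \rangle = \langle v, w \rangle + \langle v, u \rangle \]
is a by-product of (a) and (b).]
(2) If V is over $\mathbb{R}$, then an inner product is $(v, w) \mapsto \langle v, w \rangle \in \mathbb{R}$ satisfying the analogous properties.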
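Examples I.36. (1) $V = \mathbb{R}^n$, $n \ge 1$, $\alpha = (\alpha_i)$, $\beta = (\beta_i)$,
\[ \langle \alpha, \beta \rangle = \sum_{i=1}^{n} \alpha_i \beta_i, \]
an inner product over $\mathbb{R}$.
(2) $V = \mathbb{C}^n$, $n \ge 1$, $\alpha = (\alpha_i)$, $\beta = (\beta_i)$,
\[ \langle \alpha, \beta \rangle := \sum_{i=1}^{n} \alpha_i \cdot \overline{\beta_i}, \]
an inner product over $\mathbb{C}$. However,
\[ \langle \alpha, \beta \rangle := \sum_{i=1}^{n} \alpha_i \cdot \beta_i \]
is not an inner product.
(3)
\[ V = l^2(\mathbb{Z}) = \left\{ (\alpha_i)_{i \in \mathbb{Z}} : \alpha_i \in \mathbb{C}, \ \sum_{i=-\infty}^{\infty} |\alpha_i|^2 < \infty \right\}, \]
\[ \langle \alpha, \beta \rangle := \sum_{i=-\infty}^{\infty} \alpha_i \cdot \overline{\beta_i} \]
is an inner product. It is left to a homework exercise to validate that, equipped with $\langle \cdot, \cdot \rangle$,
$l^2(\mathbb{Z})$ is an inner product vector space.
Given an inner product $\langle \cdot, \cdot \rangle$ on a vector space, one can associate a norm
\[ \|v\| := \langle v, v \rangle^{1/2}. \tag{I.42} \]
For example, if $V = \mathbb{C}^n$, then
\[ \|\alpha\|_2 := \left( \sum_{i=1}^{n} |\alpha_i|^2 \right)^{1/2}. \]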
Conversely, if V is over $\mathbb{R}$, we can recover the original inner product it came from via
\[ \langle v, w \rangle = \frac{1}{4} \left( \|v + w\|^2 - \|v - w\|^2 \right), \tag{I.43} \]
if $\|\cdot\|$ is indeed associated to an inner product via (I.42).
If V is over $\mathbb{C}$, then only
\[ \Re(\langle v, w \rangle) = \frac{1}{4} \left( \|v + w\|^2 - \|v - w\|^2 \right) \]
holds, but it is still possible to recover the inner product (see homework assignment). [However, $\|\cdot\|_1$
is a norm, but, with this norm, (I.43) does not recover any inner product.]
Definition I.37 (Orthogonality). Let V be an inner product vector space. We say that v, w ∈ V
are orthogonal, if ⟨v, w⟩ = 0.
Lemma I.38. Let V be an inner product vector space.
(1) Pythagorean Theorem: If v, w ∈ V are orthogonal, then
∥v + w∥2 = ∥v∥2 + ∥w∥2 .
(2) Cauchy-Schwarz: For every v, w ∈ V one has the inequality
|⟨v, w⟩| ≤ ∥v∥ · ∥w∥.
(3) Triangle inequality: For every v, w ∈ V ,
∥v + w∥ ≤ ∥v∥ + ∥w∥.
Definition I.39 (Hilbert space). An inner product space (V, ⟨·, ·⟩) (with ⟨·, ·⟩ positive definite) is a
Hilbert space if it is complete, i.e. for every sequence {vn } ⊆ V of elements of V , if {vn } is Cauchy,
then it converges to an element v ∈ V. [A sequence {vn } is Cauchy, if for every ϵ > 0 there exists a
number N > 0 such that for all n, m > N , one has ∥vn − vm ∥ < ϵ. A sequence {vn } converges to v, if
for every ϵ, there exists a number N > 0 so that for all n > N , ∥vn − v∥ < ϵ. A convergent sequence
is automatically Cauchy.]
Examples I.40. (1) $V = l^2(\mathbb{Z})$ is a Hilbert space, see homework assignment. Here a sequence
of elements $\{\alpha_n\}_n = \{(\alpha_{n,i})_{i \in \mathbb{Z}}\}_n$ converges to an element $\alpha_0 = (\alpha_{0,i}) \in V$, if
\[ \sum_{i=-\infty}^{\infty} |\alpha_{n,i} - \alpha_{0,i}|^2 \to 0. \]
The convergence $\alpha_n \to \alpha_0$ forces the sequences of individual components of $\alpha_n$ to converge
to the corresponding components of $\alpha_0$: For every $i \in \mathbb{Z}$,
\[ \alpha_{n,i} \to \alpha_{0,i}, \]
that is, usual convergence of sequences of complex numbers. But the converse is false: Take
the unit vectors $e_n = (\delta_{n,i})_{i \in \mathbb{Z}}$ with Kronecker’s delta
\[ \delta_{n,i} = \begin{cases} 1 & i = n \\ 0 & i \ne n \end{cases}, \]
so that for every $i \in \mathbb{Z}$, $e_{n,i} \to 0$, but $e_n \not\to 0$ (indeed $\|e_n\| = 1$ for every n).
(2) $\mathcal{R} = \{f : S^1 \to \mathbb{C} : f \text{ Riemann integrable}\}$, a vector space over $\mathbb{C}$. Define the inner product
\[ \langle f, g \rangle := \frac{1}{2\pi} \int_{-\pi}^{\pi} f(\theta) \cdot \overline{g(\theta)}\,d\theta. \]
This inner product is only positive semi-definite, as $\|f\| = 0$, if and only if f vanishes outside
a thin (“measure zero”) set. This is easy to resolve by passing to equivalence classes, but,
in fact, we will not require it. The more alarming problem is the lack of completeness of $\mathcal{R}$,
as the following example illustrates.
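Example I.41. Let $f : [0, 2\pi] \to \mathbb{R}$ be defined on $[0, 2\pi]$ as
\[ f(\theta) = \begin{cases} 0 & \theta = 0, 2\pi \\ \log(1/\theta) & 0 < \theta < 2\pi \end{cases}, \]
and
\[ f_n(\theta) = \begin{cases} 0 & 0 \le \theta \le 1/n, \ \theta = 2\pi \\ f(\theta) & 1/n < \theta < 2\pi \end{cases}, \]
and define $f_n, f : S^1 \to \mathbb{R}$ by $2\pi$-periodicity. First, f is itself not in $\mathcal{R}$, as it is not bounded. However,
for every $n \ge 1$, $f_n \in \mathcal{R}$, as it is bounded and continuous outside finitely many points. Moreover, $\{f_n\}$
is Cauchy: For $n > m$, upon transforming the variable $x = \log\theta$, we may evaluate
\[
\|f_n - f_m\|^2 = \frac{1}{2\pi} \int_{1/n}^{1/m} (\log(1/\theta))^2\,d\theta = \frac{1}{2\pi} \int_{1/n}^{1/m} (\log\theta)^2\,d\theta = \frac{1}{2\pi} \int_{-\log n}^{-\log m} x^2 e^x\,dx = \frac{1}{2\pi} \int_{\log m}^{\log n} \frac{y^2}{e^y}\,dy \to 0
\]
as $m = \min\{m, n\} \to \infty$, as a tail of a convergent integral.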


Finally, if $\{f_n\}$ converged to an element of $\mathcal{R}$, say $g \in \mathcal{R}$, then for every $\delta > 0$,
\[ \int_{\delta}^{2\pi} |f - g|^2\,d\theta \le 2 \int_{\delta}^{2\pi} |f - f_n|^2\,d\theta + 2 \int_{\delta}^{2\pi} |f_n - g|^2\,d\theta, \]
and, since for $n > \frac{1}{\delta}$, $\int_{\delta}^{2\pi} |f - f_n|^2\,d\theta = 0$ and
\[ \int_{\delta}^{2\pi} |f_n - g|^2\,d\theta \le \int_{0}^{2\pi} |f_n - g|^2\,d\theta \to 0 \]
as $n \to \infty$ by assumption, it follows that for every $\delta > 0$,
\[ \int_{\delta}^{2\pi} |f - g|^2\,d\theta = 0. \]
But that implies that g is unbounded (since otherwise, if g is bounded by B, say, then $\int_{\delta}^{2\pi} |f - g|^2\,d\theta$
would have a positive contribution around 0 where $|f(\theta)| > B$), whence it is not integrable.
To resolve this setback, one uses the aforementioned completion procedure to complete $\mathcal{R}$; the
resulting space $L^2(S^1)$ is a Hilbert space. However, that requires Lebesgue integration, and is outside
the scope of this course. Since we restrict ourselves to Riemann integrable functions, we will
be perfectly content with $\mathcal{R}$, with its disadvantages.
Towards the proof of the mean square convergence Theorem I.34. As it was mentioned,
we will work with the space $\mathcal{R}$ of integrable functions $f : S^1 \to \mathbb{C}$, endowed with the inner product
(I.41), and the corresponding norm
\[ \|f\|^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(\theta)|^2\,d\theta. \]
With this interpretation, our goal is to prove that
\[ \|f - S_N(f)\| \to 0 \]
as $N \to \infty$.
Towards the proof of the convergence in mean square, let for $n \in \mathbb{Z}$ the function (vector in $\mathcal{R}$)
$e_n : S^1 \to \mathbb{C}$ be defined by $e_n(\theta) := e^{in\theta}$. Obviously, $e_n \in \mathcal{R}$ (in fact, the $e_n$ are analytic), and for
every $n, m \in \mathbb{Z}$ we have the orthogonality relations
\[ \langle e_n, e_m \rangle = \delta_{nm} = \begin{cases} 0 & n \ne m \\ 1 & n = m \end{cases}. \]
Therefore, $E := \{e_n\}_{n \in \mathbb{Z}}$ is an orthonormal system of (infinitely many) vectors in $\mathcal{R}$. The meaning
of the mean square convergence is that E is sufficiently “rich”, i.e. spans the entire $\mathcal{R}$. We identify
$\hat{f}(n) = \langle f, e_n \rangle$, and
\[ S_N(f) = \sum_{n=-N}^{N} \langle f, e_n \rangle e_n \]
is the projection of f onto the subspace $\mathcal{R}_N$ of $\mathcal{R}$ spanned by $\{e_n\}_{|n| \le N}$.
Lemma I.42. Let $f : S^1 \to \mathbb{C}$, $f \in \mathcal{R}$, and $N \ge 1$.
(1) We have
\[ \|f\|^2 = \|f - S_N(f)\|^2 + \sum_{|n| \le N} |\hat{f}(n)|^2. \tag{I.44} \]
(2) “Best approximation”: for every trigonometric polynomial $g_N$ of degree $\le N$, we have
\[ \|f - S_N(f)\| \le \|f - g_N\|, \]
i.e., out of all trigonometric polynomials of degree $\le N$, the trigonometric polynomial $S_N(f)$
provides the best approximation for f, in the mean square norm. That is, for every collection
of constants $\{c_n\}_{|n| \le N} \subseteq \mathbb{C}$,
\[ \|f - S_N(f)\| \le \Bigl\| f - \sum_{|n| \le N} c_n e_n \Bigr\|. \]

Proof. Since $\{e_n\}$ are orthonormal,
\[ f - S_N(f) = f - \sum_{|n| \le N} \langle f, e_n \rangle e_n \]
is orthogonal to all $e_n$, $|n| \le N$. Hence $f - S_N(f)$ is orthogonal to every linear combination of $\{e_n\}_{|n| \le N}$,
and, in particular, to $S_N(f)$ itself. Hence
\[ f = (f - S_N(f)) + S_N(f) \tag{I.45} \]
is an orthogonal decomposition of f, as illustrated in Figure 1, and by Pythagoras (Lemma I.38(1)),
\[ \|f\|^2 = \|f - S_N(f)\|^2 + \|S_N(f)\|^2. \tag{I.46} \]
Now, $S_N(f) = \sum_{|n| \le N} \hat{f}(n) \cdot e_n$ is an orthogonal decomposition of $S_N(f)$, hence, again by Pythagoras,
we have
\[ \|S_N(f)\|^2 = \sum_{|n| \le N} |\hat{f}(n)|^2, \tag{I.47} \]
and, upon substituting (I.47) into (I.46), we obtain (I.44), concluding the proof of (1) of Lemma I.42.
To prove (2) of Lemma I.42 we reuse the fact that $f - S_N(f)$ is orthogonal to every trigonometric
polynomial of degree $\le N$, in particular to $S_N(f) - g_N$, and write the orthogonal decomposition
\[ f - g_N = (f - S_N(f)) + (S_N(f) - g_N) \]
Figure 1. The vector SN is the orthogonal projection of f onto the space spanned by {en }|n|≤N .

of $f - g_N$. This implies
\[ \|f - g_N\|^2 = \|f - S_N(f)\|^2 + \|S_N(f) - g_N\|^2 \ge \|f - S_N(f)\|^2, \]
which is (2) of Lemma I.42.
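The identity (I.44) can be verified numerically for a concrete function. The sketch below is an illustration under stated assumptions: it uses the sawtooth $f(\theta) = \theta$ on $(-\pi, \pi)$, whose Fourier coefficients are $\hat{f}(0) = 0$ and $\hat{f}(n) = i(-1)^n/n$ for $n \ne 0$ (obtained by integration by parts, so that $|\hat{f}(n)|^2 = 1/n^2$ and $\|f\|^2 = \pi^2/3$), and it approximates $\|f - S_N(f)\|^2$ by the midpoint rule. The function names and the parameters N and M are hypothetical choices.

```python
import math


def coeff_sq(n):
    """|fhat(n)|^2 for the sawtooth f(theta) = theta: 0 for n = 0, else 1/n^2."""
    return 0.0 if n == 0 else 1.0 / (n * n)


def partial_sum(N, theta):
    """S_N(f)(theta) = 2 * sum_{n=1}^{N} (-1)^(n+1) * sin(n * theta) / n."""
    return 2 * sum((-1) ** (n + 1) * math.sin(n * theta) / n for n in range(1, N + 1))


def mean_square_error(N, M=4000):
    """Midpoint approximation of ||f - S_N(f)||^2 = (1/2pi) * integral of |f - S_N(f)|^2."""
    h = 2 * math.pi / M
    nodes = (-math.pi + (j + 0.5) * h for j in range(M))
    return sum((t - partial_sum(N, t)) ** 2 for t in nodes) / M


N = 20
norm_sq = math.pi ** 2 / 3                              # ||f||^2
energy = sum(coeff_sq(n) for n in range(-N, N + 1))     # sum of |fhat(n)|^2 over |n| <= N
residual = mean_square_error(N)                         # ||f - S_N(f)||^2, approximately
```

Up to the quadrature error, `residual + energy` reproduces `norm_sq`, as predicted by (I.44).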

We are now approaching the proof of the Mean Square Convergence Theorem I.34, asserting that
$\|f - S_N(f)\| \to 0$ as $N \to \infty$. To this end we will need two reminders:
(1) Approximation Lemma I.20: For every $f \in \mathcal{R}$ such that for some $B > 0$, we have $\sup_{\theta \in S^1} |f(\theta)| \le B$,
there exists a sequence $\{f_n\}_{n \ge 1}$ of continuous functions $f_n : S^1 \to \mathbb{C}$ such that for every
$n \ge 1$, $\sup_{\theta \in S^1} |f_n(\theta)| \le B$ and, as $n \to \infty$,
\[ \int_{-\pi}^{\pi} |f_n(\theta) - f(\theta)|\,d\theta \to 0. \]
(2) Uniform approximation of continuous functions by trigonometric polynomials (Corollary I.29):
for every $f : S^1 \to \mathbb{C}$ continuous, and every $\epsilon > 0$, there exists a number $N \gg 1$ sufficiently
large and a trigonometric polynomial $g_N$ of degree $\le N$, such that $\|f - g_N\|_\infty < \epsilon$, and, in
particular, $\|f - g_N\| < \epsilon$.
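Corollary I.43 (Corollary from reminders (1) and (2)). For every $f \in \mathcal{R}$ and $\epsilon > 0$, there exists
a number $N \ge 1$ sufficiently large and
\[ g_N \in \mathcal{R}_N := \{\text{trigonometric polynomials of degree} \le N\}, \]
so that $\|f - g_N\| < \epsilon$.
Proof. Let $B := \sup_{\theta \in S^1} |f(\theta)|$, and apply (1) on f to find a function $g := f_n$, n sufficiently large so that
\[ \int_{-\pi}^{\pi} |g(\theta) - f(\theta)|\,d\theta < \frac{\epsilon^2 \pi}{4B} \]
and $\sup_{\theta \in S^1} |g(\theta)| \le B$. Then
\[
\|f - g\|^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(\theta) - g(\theta)|^2\,d\theta \le \frac{2B}{2\pi} \cdot \int_{-\pi}^{\pi} |f(\theta) - g(\theta)|\,d\theta < \frac{B}{\pi} \cdot \frac{\epsilon^2 \pi}{4B} = \frac{\epsilon^2}{4},
\]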
so that $\|f - g\| < \frac{\epsilon}{2}$.
Now apply (2) on g to obtain a number $N > 0$ sufficiently large and a function $g_N \in \mathcal{R}_N$, so that
$\|g - g_N\| < \frac{\epsilon}{2}$. Thus, by the triangle inequality, given $f \in \mathcal{R}$, we have found a number N sufficiently
large and a function $g_N \in \mathcal{R}_N$, so that $\|f - g_N\| < \epsilon$.

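Proof of Theorem I.34. Given $f \in \mathcal{R}$ and $\epsilon > 0$, apply Corollary I.43 to obtain $N > 0$ sufficiently
large and $g_N \in \mathcal{R}_N$, such that $\|f - g_N\| < \epsilon$. However, by the Best Approximation Lemma I.42(2),
we obtain
\[ \|f - S_N(f)\| \le \|f - g_N\| < \epsilon. \]
Since $g_N$ is also a trigonometric polynomial of degree $\le N'$ for every $N' \ge N$, the same argument gives $\|f - S_{N'}(f)\| < \epsilon$ for all $N' \ge N$,
concluding the proof of Theorem I.34.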

We have the following further results.
Theorem I.44. (1) Parseval’s identity: for every $f \in \mathcal{R}$ one has $\sum_{n \in \mathbb{Z}} |\hat{f}(n)|^2 < \infty$, and
\[ \|f\|^2 = \sum_{n \in \mathbb{Z}} |\hat{f}(n)|^2. \]
Consider the map $\mathcal{F} : \mathcal{R} \to l^2(\mathbb{Z})$ defined as
\[ \mathcal{F}(f) = (\hat{f}(n))_{n \in \mathbb{Z}}. \]
Parseval’s identity asserts that $\mathcal{F}$ is an isometry of linear spaces (i.e. norm preserving).
In fact, $\mathcal{F}$, extended to $L^2(S^1)$, is a bijective isometry; the above construction is one
way of extending $\mathcal{F}$ to $L^2(S^1)$ (“closure of an operator”). It will be seen in a homework
assignment that in this situation $\mathcal{F}$ also preserves the inner product, i.e. for every $f, g \in \mathcal{R}$,
\[ \langle f, g \rangle = \langle \mathcal{F}f, \mathcal{F}g \rangle. \]
(2) Bessel’s inequality: if $\{f_n\}$ is any orthonormal family of functions (finite or infinite), then
\[ \sum_{n \in \mathbb{Z}} |\langle f, f_n \rangle|^2 \le \|f\|^2. \]
Parseval’s identity shows that $\{e_n\}_{n \in \mathbb{Z}}$ is sufficiently rich to span the entire space $\mathcal{R}$. If we
omitted any of the $e_n$, that would result in a mere inequality in place of Parseval’s identity.
Proof. Recall that, by the orthogonality of $f - S_N(f)$ and $S_N(f)$, and the orthogonal decomposition
(I.45), we have the equality (I.44), i.e., that
\[ \|f\|^2 = \|f - S_N(f)\|^2 + \sum_{n=-N}^{N} |\hat{f}(n)|^2. \]
In particular, for every N,
\[ \sum_{n=-N}^{N} |\hat{f}(n)|^2 \le \|f\|^2. \]
Then the series $\sum_{n=-\infty}^{\infty} |\hat{f}(n)|^2$ is convergent, and, since $\|f - S_N(f)\| \to 0$ by Theorem I.34, it follows that
\[ \sum_{n=-\infty}^{\infty} |\hat{f}(n)|^2 = \|f\|^2, \]
which is (1) of Theorem I.44.


(2) of Theorem I.44 follows directly from the orthogonal decomposition
\[ f = \Bigl( f - \sum_{|n| \le N} \langle f, f_n \rangle \cdot f_n \Bigr) + \sum_{n=-N}^{N} \langle f, f_n \rangle \cdot f_n. \]
We observe that, if we denote the orthogonal projection $\tilde{S}_N(f) := \sum_{|n| \le N} \langle f, f_n \rangle \cdot f_n$ of f onto $\mathrm{sp}\{f_n\}_{|n| \le N}$,
then Bessel’s inequality is an equality, if and only if $\|f - \tilde{S}_N(f)\| \to 0$, i.e. $\tilde{S}_N(f) \to f$.

Since the convergence of a series implies that its terms tend to zero, either Parseval’s
identity or Bessel’s inequality implies:
Theorem I.45 (“Riemann-Lebesgue Lemma”). If $f \in \mathcal{R}$, then $\hat{f}(n) \to 0$ as $|n| \to \infty$. Equivalently,
as $n \to \infty$, $\int_{-\pi}^{\pi} f(\theta) \sin(n\theta)\,d\theta \to 0$ and $\int_{-\pi}^{\pi} f(\theta) \cos(n\theta)\,d\theta \to 0$.
[Unlike what we had for smooth functions, Riemann-Lebesgue does not prescribe the rate of decay.]
I.8. Pointwise convergence.
Theorem I.46 (Pointwise Convergence Theorem). Let $f : S^1 \to \mathbb{C}$ be integrable, and $\theta_0 \in S^1$ such
that f is differentiable at $\theta_0$. Then, as $N \to \infty$,
\[ (S_N(f))(\theta_0) \to f(\theta_0). \]
It is possible to construct a continuous function $f : S^1 \to \mathbb{C}$ and $\theta_0 \in S^1$ such that $(S_N(f))(\theta_0)$
diverges as $N \to \infty$ (Fejér). However, necessarily $(S_N(f))(\theta_0) \to f(\theta_0)$ for almost all $\theta_0 \in S^1$ (L.
Carleson, 1966).
Proof. Define the auxiliary function $F : S^1 \to \mathbb{C}$:
\[
F(t) = \begin{cases}
\dfrac{f(\theta_0 - t) - f(\theta_0)}{t} & t \in (-\pi, \pi) \setminus \{0\} \\
-f'(\theta_0) & t = 0 \\
\text{arbitrary, as long as } F \text{ is periodic} & t = \pm\pi
\end{cases}.
\]
Then, for every $\delta > 0$, F is integrable outside of $(-\delta, +\delta)$ (as f is integrable and $1/t$ is continuous there), and, moreover,
F is bounded, since we assumed that f is differentiable at $\theta_0$. Therefore, by HW1 Q1, F is integrable
on $S^1$. We then use $S_N(f) = f * D_N$, and bear in mind that $\frac{1}{2\pi} \int_{-\pi}^{\pi} D_N(t)\,dt = 1$ to write
\[
\begin{aligned}
(S_N(f))(\theta_0) - f(\theta_0) &= \frac{1}{2\pi} \int_{-\pi}^{\pi} f(\theta_0 - t) D_N(t)\,dt - f(\theta_0) = \frac{1}{2\pi} \int_{-\pi}^{\pi} (f(\theta_0 - t) - f(\theta_0)) D_N(t)\,dt \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} F(t) \cdot t D_N(t)\,dt.
\end{aligned} \tag{I.48}
\]
Now recall that $D_N$ is explicitly given by $D_N(t) = \frac{\sin((N + 1/2)t)}{\sin(t/2)}$, so that (I.48) is
\[ (S_N(f))(\theta_0) - f(\theta_0) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{F(t) t}{\sin(t/2)} \cdot \sin((N + 1/2)t)\,dt. \tag{I.49} \]
Since $\frac{t}{\sin(t/2)}$ is continuous on $[-\pi, \pi]$ (with the value 2 at $t = 0$), the function $g(t) := \frac{F(t) t}{\sin(t/2)}$ is integrable, therefore it is then tempting to
apply the Riemann-Lebesgue Lemma (Theorem I.45) on g directly to deduce that the r.h.s. of (I.49)
vanishes as N → ∞. However one is not able to infer the vanishing of the r.h.s. of (I.49) by a direct
application of Theorem I.45, as N + 1/2 is not an integer. Instead, we write
sin((N + 1/2)t) = sin(N t) cos(t/2) + sin(t/2) cos(N t),
and argue that
$$(S_N(f))(\theta_0) - f(\theta_0) = \frac{1}{2\pi}\left[\int_{-\pi}^{\pi} F(t)\,t\cos(Nt)\,dt + \int_{-\pi}^{\pi} F(t)\cdot\frac{t\cos(t/2)}{\sin(t/2)}\sin(Nt)\,dt\right],$$

whence both summands vanish via two separate applications of Theorem I.45.
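The statement of Theorem I.46 can be observed numerically. The following is a minimal sketch, not part of the notes: it assumes the sawtooth $f(\theta) = \theta$ on $(-\pi, \pi)$, whose classical expansion is $2\sum_{n\ge 1} (-1)^{n+1}\sin(n\theta)/n$, and the function name `sawtooth_partial_sum` is made up for the illustration. At the point $\theta_0 = 1$, where $f$ is differentiable, the partial sums approach $f(\theta_0)$; at the jump $\theta = \pi$ every partial sum vanishes.

```python
import math

def sawtooth_partial_sum(theta, N):
    """S_N(f)(theta) for f(theta) = theta on (-pi, pi), extended periodically.

    Uses the classical expansion 2 * sum_{n=1}^{N} (-1)^(n+1) sin(n theta) / n.
    """
    return 2.0 * sum((-1) ** (n + 1) * math.sin(n * theta) / n
                     for n in range(1, N + 1))

theta0 = 1.0  # f is differentiable at this point
errors = {N: abs(sawtooth_partial_sum(theta0, N) - theta0)
          for N in (10, 100, 1000)}

# At the discontinuity theta = pi each term sin(n*pi) vanishes (up to rounding).
jump_value = sawtooth_partial_sum(math.pi, 1000)

for N, err in errors.items():
    print(N, err)
print("value at the jump:", jump_value)
```

The printed errors shrink as $N$ grows, but the theorem itself gives no rate; the rate seen here is specific to this example.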

II. Some applications of Fourier series


II.1. The isoperimetric inequality. According to a legend about the foundation of Carthage
(around 850 B.C.), Queen Dido purchased from a local king the land along the North African coastline
that could be enclosed by the hide of an ox. She sliced the hide into very thin strips, tied
them together, and was able to enclose an area which became the city of Carthage. By a stroke of
genius, Queen Dido decided to endow Carthage with a circular shape, thus maximizing its area. The
celebrated moment of the building of Carthage was immortalized by J. M. W. Turner in his famous painting
“Dido building Carthage, or The Rise of the Carthaginian Empire”, displayed in the National
Gallery (London, UK) alongside C. Lorrain’s “The Embarkation of the Queen of Sheba”, by
which it was heavily influenced (see the Wikipedia article about the storyline surrounding these two significant
paintings). Against this background a field of mathematics was born, dealing with isoperimetric
inequalities of different kinds, the most basic of these resolving the following problem:
Problem II.1. Given a number ℓ > 0, find a planar curve of length ℓ that encloses the largest area.

Figure 2. If the shape is not convex, then it should be easy to improve it by adding
to its area without adding to its perimeter.

If we are to guess what attributes the “optimal” shape should have, then, first, it should be convex.
For if it is not convex, it should be possible to “improve” the shape by adding to the area of the
enclosed oval without increasing its length, as illustrated in Figure 2. Also, along any straight segment in
the boundary of the oval, one could gain area by making the boundary curvier. Therefore, a circle is a reasonable
candidate to maximize the area of the enclosed oval, the perimeter length being given.
The question is whether this intuition could be made into a rigorous theorem.
Before being able to address the given problem by Fourier analysis methods, we will have to give
it a formal mathematical treatment.

Definition II.2 (Parameterized curves). (1) A parameterized curve is a mapping $\gamma : [a, b] \to \mathbb{R}^2$. A curve is the image $\Gamma = \gamma([a, b]) \subseteq \mathbb{R}^2$. To make these susceptible to the methods of
Fourier analysis, we will think of $\gamma$ as $(b - a)$-periodic.
(2) We say that Γ is simple, if Γ does not intersect itself, i.e. for every t1 , t2 ∈ [a, b), if γ(t1 ) =
γ(t2 ), then t1 = t2 . (Note that γ(a) = γ(b) is allowed.)
(3) We say that Γ is closed, if γ(a) = γ(b).
(4) We say that Γ is C 1 , if γ is C 1 , and for all t ∈ [a, b], γ̇(t) ̸= 0 (if Γ is closed, then we impose
the C 1 condition on γ as a periodic function, so that, in particular, γ̇(a) = γ̇(b)).
Definition II.3 (Re-parameterizations of curves). (1) If s : [c, d] → [a, b] is bijective, then we
may re-parameterize γ by considering η(r) := γ(s(r)), r ∈ [c, d]. If for every r ∈ [c, d] one has
s′ (r) > 0, these parametrizations are equivalent, i.e. orientation preserving.
(2) Let $\Gamma$ be a $C^1$ curve, and $\gamma : [a, b] \to \mathbb{R}^2$ an arbitrary $C^1$-parametrization of it. The length of $\Gamma$
is (parametrization invariant)
$$\ell = \ell(\Gamma) = \int_a^b \|\dot\gamma(t)\|\,dt = \int_a^b \left(x'(t)^2 + y'(t)^2\right)^{1/2} dt,$$
where $\gamma(t) = (x(t), y(t))$.


(3) We say that γ is an arc-length parametrization of a curve Γ, if for all t ∈ [a, b], we have
∥γ̇(t)∥ = 1.
If $\gamma : [a, b] \to \mathbb{R}^2$ is an arc-length parametrization of $\Gamma$, then $\ell(\Gamma) = b - a$, and, further, for every
$c \in [a, b]$, the length of the head $\gamma|_{[a,c]}$ is precisely $c - a$. Every $C^1$-smooth curve admits an arc-length
parametrization, and moreover, by further re-parametrization, we may assume $\gamma : [0, \ell] \to \mathbb{R}^2$.
Recall that by Jordan’s Theorem, every simple closed continuous curve $\Gamma$ divides the complement
$\mathbb{R}^2 \setminus \Gamma$ into two connected components: a bounded component $S$, called the interior of $\Gamma$, and an
unbounded one, called the exterior of $\Gamma$. If $\Gamma$ is a simple closed $C^1$ curve, and $\gamma = (x(t), y(t)) : [a, b] \to \mathbb{R}^2$ is its $C^1$ parametrization, then, by Green’s formula, one has
$$A = \mathrm{Area}(S) = \frac{1}{2}\left|\int_\Gamma x\,dy - y\,dx\right| = \frac{1}{2}\left|\int_a^b (x(t)y'(t) - y(t)x'(t))\,dt\right|.$$

Theorem II.4 (Isoperimetric Inequality). Suppose that $\Gamma$ is a simple closed curve in $\mathbb{R}^2$, $C^1$, of length $\ell$, and let $A$ be the area of the interior domain enclosed by $\Gamma$. Then
$$A \le \frac{\ell^2}{4\pi},$$
with equality holding, if and only if $\Gamma$ is a circle.
We now have the following claim, reducing the proof of Theorem II.4 to the particular case ℓ = 2π,
whose proof is relegated to a homework exercise.
Claim II.5. We can assume that ℓ = 2π, i.e. that if Γ is a simple closed C 1 curve of length 2π in
R2 , then it encloses a domain of area
A≤π (II.1)
with equality holding, if and only if Γ is a unit circle.
Proof of Claim II.5. Given a length-$\ell$ parameterized curve $\gamma : [0, \ell] \to \mathbb{R}^2$, consider the re-scaled
curve $\tilde\gamma : [0, 2\pi] \to \mathbb{R}^2$ defined by
$$\tilde\gamma(\cdot) = \frac{2\pi}{\ell}\,\gamma\!\left(\cdot\,\frac{\ell}{2\pi}\right),$$
and study how its length and enclosed area are related to those of $\gamma$. Details are left as a homework
assignment. □
Proof of Theorem II.4 assuming ℓ = 2π. Let γ(s) = (x(s), y(s)) : [0, 2π] → R2 be a (2π-periodic)
arc-length parametrization of Γ, so that
x′ (s)2 + y ′ (s)2 ≡ 1 (II.2)
on $s \in [0, 2\pi]$. Let $a_n = \hat x(n)$ and $b_n = \hat y(n)$ be the Fourier coefficients of $x(\cdot)$ and $y(\cdot)$ respectively,
so that the Fourier coefficients of their derivatives are $\widehat{x'}(n) = ina_n$ and $\widehat{y'}(n) = inb_n$.
An application of Parseval’s identity (Theorem I.44) on $x'(\cdot)$ yields
$$\frac{1}{2\pi}\int_0^{2\pi} x'(s)^2\,ds = \sum_{n=-\infty}^{\infty} \left|\widehat{x'}(n)\right|^2 = \sum_{n=-\infty}^{\infty} n^2|a_n|^2,$$
whereas one on $y'(\cdot)$ yields
$$\frac{1}{2\pi}\int_0^{2\pi} y'(s)^2\,ds = \sum_{n=-\infty}^{\infty} n^2|b_n|^2,$$
so that combining these two together gives:
$$\sum_{n=-\infty}^{\infty} n^2\left(|a_n|^2 + |b_n|^2\right) = \frac{1}{2\pi}\int_0^{2\pi} \left(x'(s)^2 + y'(s)^2\right)ds = 1, \tag{II.3}$$
by the unit speed assumption (II.2).


First we express the interior area $A = \mathrm{Area}(\gamma)$ of $\gamma$ in terms of the Fourier coefficients $a_n$ and $b_n$.
We start from an application of Green’s formula to write
$$A = \frac{1}{2}\left|\int_0^{2\pi} (x(s)y'(s) - y(s)x'(s))\,ds\right|. \tag{II.4}$$

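Note that, since both $x$ and $y$ are real-valued, it follows (in fact, equivalent) that for every $n \in \mathbb{Z}$,
$a_{-n} = \overline{a_n}$ and $b_{-n} = \overline{b_n}$. Recall the inner product $\langle\cdot,\cdot\rangle$ on $\mathcal{R}$, and that, by Parseval’s Theorem I.44(1),
the Fourier map preserves the inner product, i.e. for $f, g \in \mathcal{R}$
$$\frac{1}{2\pi}\int_{-\pi}^{\pi} f(s)\overline{g(s)}\,ds = \sum_{n=-\infty}^{\infty} \hat f(n)\cdot\overline{\hat g(n)}.$$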

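Applying this identity with $f(\cdot) = x(\cdot)$ and $g = y'(\cdot)$ yields the equality
$$\frac{1}{2\pi}\int_0^{2\pi} x(s)y'(s)\,ds = \frac{1}{2\pi}\int_0^{2\pi} x(s)\overline{y'(s)}\,ds = \langle x, y'\rangle = \sum_{n=-\infty}^{\infty} a_n\cdot\overline{inb_n} = -i\sum_{n=-\infty}^{\infty} na_n\cdot\overline{b_n}, \tag{II.5}$$
and, similarly,
$$\frac{1}{2\pi}\int_0^{2\pi} y(s)x'(s)\,ds = \langle y, x'\rangle = \sum_{n=-\infty}^{\infty} b_n\cdot\overline{ina_n} = -i\sum_{n=-\infty}^{\infty} n\overline{a_n}\cdot b_n. \tag{II.6}$$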

Now we insert (II.5) and (II.6) into Green’s formula (II.4) to write
$$A = \pi\left|\sum_{n=-\infty}^{\infty} n\left(a_n\overline{b_n} - \overline{a_n}b_n\right)\right| = 2\pi\left|\sum_{n=-\infty}^{\infty} n\,\Im(a_n\cdot\overline{b_n})\right|,$$
and, since we have
$$\left|\Im(a_n\cdot\overline{b_n})\right| \le |a_n\cdot b_n| \le \frac{|a_n|^2 + |b_n|^2}{2},$$
by an elementary inequality, we may use the triangle inequality to write
$$A \le \pi\sum_{n=-\infty}^{\infty} |n|\cdot\left(|a_n|^2 + |b_n|^2\right) = \pi\sum_{n\ne 0} |n|\cdot\left(|a_n|^2 + |b_n|^2\right) \le \pi\sum_{n\ne 0} n^2\cdot\left(|a_n|^2 + |b_n|^2\right) = \pi, \tag{II.7}$$
thanks to (II.3). This concludes the first statement of Theorem II.4, i.e. that the inequality (II.1)
holds.
Next, we need to classify the curves, for which (II.1) is an equality. To this end, we observe that
for the equality in (II.7) to hold, it is necessary that for all n with |n| ≥ 2, one has an = bn = 0 (as
here the strict inequality |n| < n2 holds). Hence, if A = π, then
x(s) = a−1 e−is + a0 + a1 eis ,
y(s) = b−1 e−is + b0 + b1 eis .
Since both $x, y$ are real-valued, $a_{-1} = \overline{a_1}$, $b_{-1} = \overline{b_1}$, and (II.3) reads $2(|a_1|^2 + |b_1|^2) = 1$, and since, by
(II.7) and the argument above it, we must also have
$$\left|\Im(a_1\cdot\overline{b_1})\right| = |a_1|\cdot|b_1| = \frac{|a_1|^2 + |b_1|^2}{2}, \tag{II.8}$$
it follows that
$$\begin{cases} |a_1|^2 + |b_1|^2 = \frac{1}{2} \\ |a_1|^2\cdot|b_1|^2 = \frac{1}{16}. \end{cases}$$
Therefore, we have $|a_1| = |b_1| = \frac{1}{2}$, and so we can write $a_1 = \frac{1}{2}e^{i\alpha}$, $b_1 = \frac{1}{2}e^{i\beta}$ for some $\alpha, \beta \in [0, 2\pi)$.
In light of the above, we have
$$\begin{cases} x(s) = a_0 + \Re(e^{i(\alpha+s)}) = a_0 + \cos(\alpha + s) \\ y(s) = b_0 + \Re(e^{i(\beta+s)}) = b_0 + \cos(\beta + s). \end{cases}$$
Finally, we claim that $\cos(\beta + s) = \pm\sin(\alpha + s)$, so that $\Gamma$ is a unit circle centred at $(a_0, b_0) \in \mathbb{R}^2$.
To this end, we recall (II.8), so that
$$\frac{1}{4}|\sin(\alpha - \beta)| = \frac{1}{4}\left|\Im(e^{i(\alpha-\beta)})\right| = \left|\Im(a_1\cdot\overline{b_1})\right| = |a_1b_1| = \frac{1}{4}.$$
Hence $|\sin(\alpha - \beta)| = 1$, so that $\alpha - \beta \in \pi\cdot\mathbb{Z} + \frac{\pi}{2}$, i.e., it is an odd multiple of $\frac{\pi}{2}$, and then $\cos(\beta + s) = \pm\sin(\alpha + s)$, which, as mentioned above, finally yields that $\Gamma$ is a unit circle.
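The inequality $A \le \ell^2/(4\pi)$ can be sanity-checked on sampled curves. The sketch below is only an illustration under stated assumptions: curves are replaced by inscribed polygons with 2000 vertices, the area is computed with the shoelace formula (a discrete form of Green's formula), and the helper names `isoperimetric_ratio` and `sample_ellipse` are hypothetical.

```python
import math

def isoperimetric_ratio(points):
    """Return 4*pi*A / l^2 for the closed polygon through `points`."""
    n = len(points)
    twice_area = 0.0
    length = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        twice_area += x0 * y1 - y0 * x1            # discrete x dy - y dx
        length += math.hypot(x1 - x0, y1 - y0)     # polygonal length
    return 4.0 * math.pi * (abs(twice_area) / 2.0) / length ** 2

def sample_ellipse(a, b, n=2000):
    """Vertices on the ellipse with semi-axes a and b (a = b gives a circle)."""
    return [(a * math.cos(2.0 * math.pi * k / n),
             b * math.sin(2.0 * math.pi * k / n)) for k in range(n)]

circle_ratio = isoperimetric_ratio(sample_ellipse(1.0, 1.0))
ellipse_ratio = isoperimetric_ratio(sample_ellipse(2.0, 1.0))
print("circle :", circle_ratio)
print("ellipse:", ellipse_ratio)
```

The ratio $4\pi A/\ell^2$ stays below 1 in both cases and is close to 1 only for the circle, in line with the equality case of Theorem II.4.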

II.2. Weyl’s Equidistribution Theorem.
Motivating question: Let us take a real number α ∈ R \ {0} and consider the sequence
A = Aα = {⟨α · n⟩}n≥1 , (II.9)
where [0, 1) ∋ ⟨x⟩ = x − ⌊x⌋ is the fractional part of x. For example ⟨3.5⟩ = 0.5, ⟨π⟩ = π − 3 =
0.1415 . . ., ⟨−3.5⟩ = (−3.5) − (−4) = 0.5, ⟨−3.4⟩ = 0.6. Implicitly, we defined an equivalence relation
between real numbers, where x ∼ y, if x − y ∈ Z.
Question II.6. (1) Is the set A dense in [0, 1], i.e. can it approximate every number β ∈ [0, 1]
arbitrarily well?
(2) Is A equidistributed in [0, 1), i.e. every interval (a, b) ⊆ [0, 1] gets proportionally many ele-
ments of A to its length b − a?

Definition II.7 (Equidistribution in [0, 1)). Let $A = \{a_n\}_{n\ge 1} \subseteq [0, 1)$ be a sequence. We say that
$A$ is equidistributed in $[0, 1)$, if for every $(a, b) \subseteq [0, 1]$,
$$\lim_{N\to\infty} \frac{|\{1 \le n \le N : a_n \in (a, b)\}|}{N} = b - a. \tag{II.10}$$
Note that in the Definition II.7 above, b = 1 is allowed. Also note that the rate of convergence of
the limit (II.10) is allowed to (and, in general, will) depend on the interval (a, b). It is easy to show
(left to a homework assignment) that every equidistributed sequence is dense in [0, 1].
Examples II.8. (1) Take $\alpha = \frac{1}{3}$ and consider $A$ as in (II.9). Then in this case
$$a_n = \tfrac{1}{3}, \tfrac{2}{3}, 0, \tfrac{1}{3}, \tfrac{2}{3}, 0, \ldots.$$
This sequence is not dense in $[0, 1]$, hence, in particular, not equidistributed. More generally,
if $0 \ne \alpha = \frac{p}{q} \in \mathbb{Q}$ is a rational number with $\gcd(p, q) = 1$, then the sequence $a_n = \langle n\alpha\rangle$ is periodic
with period $q$, so not dense, and, in particular, not equidistributed. What about an irrational
number, e.g. $\alpha = \sqrt{2}$?
(2) Prove that the sequence
$$0, \tfrac{1}{2}, 0, \tfrac{1}{3}, \tfrac{2}{3}, 0, \tfrac{1}{4}, \tfrac{2}{4}, \tfrac{3}{4}, 0, \ldots,$$
i.e. a linear ordering of $\{\frac{m}{n}\}_{n\ge 1,\ 0\le m\le n-1}$, is equidistributed.
(3) The sequence
$$0, \tfrac{1}{2}, 0, \tfrac{1}{4}, \tfrac{2}{4}, \tfrac{3}{4}, 0, \tfrac{1}{8}, \tfrac{2}{8}, \ldots,$$
i.e. a linear ordering of $\{\frac{m}{2^n}\}_{n\ge 1,\ 0\le m\le 2^n-1}$, is not equidistributed (but dense) in $[0, 1]$. Take,
for example, $(a, b) = (\frac{1}{2}, 1)$, and choose $N = \frac{3}{2}\cdot 2^k$, then for $a_n \in (a, b)$ it is necessary to have
$n \le 2^k$, and then
$$\frac{|\{1 \le n \le N : a_n \in (a, b)\}|}{N} \sim \frac{2^{k-1}}{N} \sim \frac{1}{3} < \frac{1}{2}.$$
The reason why in this example the sequence fails to equidistribute is that each cycle
spends a positive proportion ($\frac{1}{2}$) of the time, in contrast to the previous example.
Theorem II.9 (Weyl’s equidistribution). If α is irrational, then the sequence Aα = {⟨n · α⟩}n≥1 is
equidistributed in [0, 1]. In particular, Aα is dense in [0, 1] (“Kronecker’s Theorem”).
[Theorem II.9 could be generalized to vectors α = (α1 , . . . αd ) ∈ Rd , ⟨n · α⟩ ∈ Td , the d-dimensional
torus (“ergodic dynamical system”). The rate of convergence (“quantitative discrepancy”) in (II.10)
and its higher dimensional analogue depends on the Diophantine properties of the number α ∈ R \ Q
or the vector α, i.e. how close these are to being rational.]
Towards the proof of Theorem II.9 we recall the equivalence relation we defined on R, i.e. for
x, y ∈ R, x ∼ y if x − y ∈ Z.
Proof. Let $(a, b) \subseteq [0, 1]$, and $\chi_{(a,b)} : [0, 1] \to \mathbb{R}$ be the characteristic function of $(a, b)$,
$$\chi_{(a,b)}(x) = \begin{cases} 1 & x \in (a, b) \\ 0 & \text{otherwise} \end{cases},$$
and extend $\chi_{(a,b)}$ to a function $\chi_{(a,b)} : \mathbb{R} \to \mathbb{R}$ by 1-periodicity, that is, the extended $\chi_{(a,b)}(\cdot)$ satisfies
$\chi_{(a,b)}(x) = \chi_{(a,b)}(y)$ for all $x \sim y$ with the above defined equivalence relation. Or, to put it differently,
$\chi_{(a,b)}(x) = \chi_{(a,b)}(\langle x\rangle)$ for every $x \in \mathbb{R}$, i.e. $\chi_{(a,b)}(x)$ only depends on the equivalence class of $x$. Then,
under the above conventions,
$$|\{1 \le n \le N : \langle n\alpha\rangle \in (a, b)\}| = \sum_{n=1}^{N} \chi_{(a,b)}(\langle n\alpha\rangle) = \sum_{n=1}^{N} \chi_{(a,b)}(n\alpha),$$

by 1-periodicity of $\chi_{(a,b)}$. Hence we are to prove that, as $N \to \infty$,
$$\frac{1}{N}\sum_{n=1}^{N} f(n\alpha) \to b - a, \tag{II.11}$$
for $f = \chi_{(a,b)}$. Note that
$$\int_0^1 \chi_{(a,b)}(x)\,dx = b - a,$$
hence, in this case, (II.11) is
$$\frac{1}{N}\sum_{n=1}^{N} f(n\alpha) \to \int_0^1 f(x)\,dx. \tag{II.12}$$

More generally, in what follows, our aim is to establish (II.12) for a wide class of 1-periodic functions
$f : \mathbb{R} \to \mathbb{C}$ that would contain all the characteristic functions of intervals. This is done in 4 steps (a
fifth step will be offered in a homework assignment). In the theory of dynamical systems, the l.h.s. of
(II.12) is usually referred to as the “time average” of an observable $f$ along an implicitly defined dynamical
system, whereas the r.h.s. of (II.12) is called the “space average” of that observable; a dynamical system
is ergodic if these two are equal.
Step 1: Let f be the exponential f (x) = ek (x) = e2πikx : R → C for some k ∈ Z, a 1-periodic
function.
If $f \equiv 1$ (i.e. $k = 0$), then for every $N$, the l.h.s. of (II.12) is equal to 1, and so is the r.h.s., so
the limit (II.12) as $N \to \infty$ certainly holds. For $k \ne 0$ the l.h.s. of (II.12) is a geometric sum
that can be evaluated explicitly:
$$\frac{1}{N}\sum_{n=1}^{N} e^{2\pi ik(n\alpha)} = \frac{1}{N}\cdot e^{2\pi ik\alpha}\cdot\frac{1 - e^{2\pi ikN\alpha}}{1 - e^{2\pi ik\alpha}} \to 0 = \int_0^1 e^{2\pi ikx}\,dx,$$
since $|e^{2\pi ik\alpha}| = 1$, $|1 - e^{2\pi ikN\alpha}| \le 2$ by the triangle inequality, and $1 - e^{2\pi ik\alpha} \ne 0$ (as $\alpha \notin \mathbb{Q}$) is a
fixed number independent of $N$. Therefore, (II.12) holds with all $f = e_k$, $k \in \mathbb{Z}$.
Here we exploited the particulars of the sequence ⟨nα⟩, and it will not be used again. From this
point on we are going to use a standard density argument, asserting that if (II.12) holds for the
functions {ek }k∈Z (that are “dense” in some appropriate space of functions, e.g. R), then it also
holds for a much wider class of functions. That is going to yield an important Weyl’s criterion for
equidistribution of sequences in [0, 1] (cf. Theorem II.10).
Step 2: (II.12) holds for all trigonometric polynomials $f = \sum_{k=-K}^{K} a_k e^{2\pi ikx}$.
We observe that both sides of (II.12) are linear w.r.t. f , i.e. if (II.12) holds with both f and g
in place of f , then it also holds with f + g and any (finite) linear combination of functions. Since a
trigonometric polynomial is a (finite) linear combination of {ek }, and (II.12) was readily validated
for all of ek , the conclusion follows.

Step 3: (II.12) holds for all continuous 1-periodic functions f : R → C.


Let $f : \mathbb{R} \to \mathbb{C}$ be a 1-periodic continuous function. We may easily convert it to a $2\pi$-periodic one by
considering the homothety $g(y) := f\left(\frac{y}{2\pi}\right)$, and then apply the Uniform Approximation Corollary I.29
of continuous functions $g : [-\pi, \pi] \to \mathbb{C}$ by trigonometric polynomials. This way, given $\epsilon > 0$, we
obtain a number $K \gg 0$ sufficiently large, and a trigonometric polynomial $T_K(y)$ of degree $\le K$, such
that $\|g - T_K\|_{L^\infty([-\pi,\pi])} < \frac{\epsilon}{3}$. Hence, if we define $P_K(x) := T_K(2\pi x)$, then $\sup_{x\in[0,1]} |f(x) - P_K(x)| < \frac{\epsilon}{3}$,
and by the 1-periodicity of both $f$ and $P_K$, we have
$$\sup_{x\in\mathbb{R}} |f(x) - P_K(x)| < \frac{\epsilon}{3},$$
where $P_K = \sum_{k=-K}^{K} a_k e^{2\pi ikx}$ is a trigonometric polynomial of the type considered in Step 2.
Then, by Step 2, for all $N \gg 0$ sufficiently large,
$$\left|\frac{1}{N}\sum_{n=1}^{N} P_K(n\alpha) - \int_0^1 P_K(x)\,dx\right| < \frac{\epsilon}{3}.$$

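Overall, with the use of the triangle inequality, we have: for $N$ sufficiently large,
$$\left|\frac{1}{N}\sum_{n=1}^{N} f(n\alpha) - \int_0^1 f(x)\,dx\right| \le \frac{1}{N}\sum_{n=1}^{N} |f(n\alpha) - P_K(n\alpha)| + \left|\frac{1}{N}\sum_{n=1}^{N} P_K(n\alpha) - \int_0^1 P_K(x)\,dx\right| + \left|\int_0^1 P_K(x)\,dx - \int_0^1 f(x)\,dx\right| < \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} = \epsilon,$$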

so that (II.12) is valid for the given continuous function f .


Step 4. (II.12) is valid with f = χ(a,b) , 0 ≤ a < b ≤ 1.
It would be great if we could approximate the function $\chi_{(a,b)}$ in a manner similar to Step 3.
Unfortunately, Step 3 required uniform approximation of the function $f$ by continuous functions,
which is impossible in this case, since that would force $\chi_{(a,b)}$ to be continuous (which it fails to be).
Instead, we are going to approximate $\chi_{(a,b)}$ from below and above, which is going to be sufficient for
our cause.
We will assume $(a, b) \ne (0, 1)$, for otherwise the result is trivial, and further, with no loss of
generality, that $b < 1$. Under the above conditions, for $\epsilon > 0$ sufficiently small, there exist two
continuous functions $f_\epsilon^\pm$, 1-periodic, $0 \le f_\epsilon^\pm(x) \le 1$, so that for all $x \in \mathbb{R}$, one has
$$f_\epsilon^-(x) \le \chi_{(a,b)}(x) \le f_\epsilon^+(x),$$
and $f_\epsilon^-(x) = \chi_{(a,b)}(x) = f_\epsilon^+(x)$ for all $x \notin (a - \epsilon, a + \epsilon) \cup (b - \epsilon, b + \epsilon)$, understood in the periodic
sense, as illustrated in Figure 3.
By the defining properties of $f_\epsilon^\pm$, one has
$$b - a - 2\epsilon \le \int_0^1 f_\epsilon^-(x)\,dx \le \int_0^1 f_\epsilon^+(x)\,dx \le b - a + 2\epsilon.$$
Now we take an $\epsilon > 0$, denote $S_N := \frac{1}{N}\sum_{n=1}^{N} \chi_{(a,b)}(n\cdot\alpha)$, and write:
$$\frac{1}{N}\sum_{n=1}^{N} f_\epsilon^-(n\cdot\alpha) \le S_N \le \frac{1}{N}\sum_{n=1}^{N} f_\epsilon^+(n\cdot\alpha). \tag{II.13}$$

Figure 3. The characteristic function χ(a,b) is approximated from below and above
by continuous functions fϵ± .

By Step 3 we know that, as $N \to \infty$,
$$\frac{1}{N}\sum_{n=1}^{N} f_\epsilon^-(n\cdot\alpha) \to \int_0^1 f_\epsilon^-(x)\,dx \ge b - a - 2\epsilon,$$
and in an analogous manner,
$$\frac{1}{N}\sum_{n=1}^{N} f_\epsilon^+(n\cdot\alpha) \to \int_0^1 f_\epsilon^+(x)\,dx \le b - a + 2\epsilon.$$
Hence, taking $\limsup_{N\to\infty}$ of the r.h.s. in (II.13), we obtain:
$$\limsup_{N\to\infty} S_N \le \limsup_{N\to\infty} \frac{1}{N}\sum_{n=1}^{N} f_\epsilon^+(n\cdot\alpha) = \lim_{N\to\infty} \frac{1}{N}\sum_{n=1}^{N} f_\epsilon^+(n\cdot\alpha) = \int_0^1 f_\epsilon^+(x)\,dx \le b - a + 2\epsilon,$$
and, since $\epsilon > 0$ is arbitrary, and $\limsup_{N\to\infty} S_N$ does not depend on $\epsilon$, we can infer that
$$\limsup_{N\to\infty} S_N \le b - a.$$
Now applying $\liminf_{N\to\infty}$ on the l.h.s. of (II.13) and manipulating the resulting expressions in an
analogous manner, we obtain
$$\liminf_{N\to\infty} S_N \ge b - a.$$
This result, coupled with the result on $\limsup_{N\to\infty}$, finally yields the existence of the limit
$$\lim_{N\to\infty} S_N = b - a,$$
which concludes the proof of Step 4, and thus of Theorem II.9.
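The limit (II.10) for $a_n = \langle n\alpha\rangle$ can be watched numerically. The following is a minimal sketch and not part of the notes: the interval endpoints, the cut-off $N = 100000$, and the helper name `fraction_in_interval` are assumptions chosen for the illustration, and floating-point arithmetic stands in for exact fractional parts.

```python
import math

def fraction_in_interval(alpha, a, b, N):
    """Proportion of n in {1, ..., N} whose fractional part <n*alpha> lies in (a, b)."""
    hits = sum(1 for n in range(1, N + 1) if a < (n * alpha) % 1.0 < b)
    return hits / N

N = 100_000
# Irrational alpha: the proportion should be close to b - a = 0.3.
irrational_fraction = fraction_in_interval(math.sqrt(2), 0.2, 0.5, N)
# Rational alpha = 1/3: the orbit only visits (approximately) 0, 1/3, 2/3,
# so the interval (0.4, 0.6) is never hit.
rational_fraction = fraction_in_interval(1 / 3, 0.4, 0.6, N)

print("sqrt(2):", irrational_fraction)
print("1/3    :", rational_fraction)
```

The first proportion approximates the interval length, while the second one stays at 0, which matches Theorem II.9 and Example II.8(1) respectively.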


A few remarks are due.


(1) Let us analyse the proof in hindsight. Only Step 1 used the (explicit) identity of the sequence
$a_n = \langle n\alpha\rangle$, whereas the rest was to extend the class of $f$ satisfying the ergodicity property (II.12),
from $e_k(\cdot) = e^{2\pi ik\cdot}$ to eventually $\chi_{(a,b)}$. What we showed is that any sequence $\{a_n\} \subseteq [0, 1)$ satisfying
the premise of
$$\frac{1}{N}\sum_{n=1}^{N} e_k(a_n) \to \int_0^1 e_k(x)\,dx = \begin{cases} 0 & k \ne 0 \\ 1 & k = 0 \end{cases},$$
is equidistributed (the l.h.s. is called “Weyl’s exponential sum”). It is satisfied automatically for
$k = 0$, and, since the $a_n$ are real, one can deduce that if it is satisfied for $k$, it is also satisfied for $-k$.
Therefore, we obtained the following useful equidistribution criterion:
Therefore, we obtained the following useful equidistribution criterion:
Theorem II.10 (Weyl’s equidistribution criterion). A sequence $\{a_n\}_{n\ge 1} \subseteq \mathbb{R}$ is equidistributed
modulo 1, if and only if for every $k \in \mathbb{Z} \setminus \{0\}$ (alternatively, $k \in \mathbb{Z}_{>0}$) we have
$$\frac{1}{N}\sum_{n=1}^{N} e^{2\pi ika_n} \to 0$$
as $N \to \infty$.
We proved “if”, whereas “only if” is left as a home assignment.
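Weyl's criterion lends itself to a quick numerical experiment. The sketch below is an illustration only: the function name `weyl_sum`, the cut-off $N = 10000$ and the range of frequencies $k$ are assumptions, and a finite computation can of course only suggest, not prove, the limit.

```python
import cmath
import math

def weyl_sum(sequence, k):
    """Absolute value of the normalized Weyl exponential sum (1/N) sum_n e^{2 pi i k a_n}."""
    total = sum(cmath.exp(2j * math.pi * k * a) for a in sequence)
    return abs(total) / len(sequence)

N = 10_000
sqrt2_sequence = [(n * math.sqrt(2)) % 1.0 for n in range(1, N + 1)]
third_sequence = [(n / 3) % 1.0 for n in range(1, N + 1)]

# For alpha = sqrt(2) the sums are small for every tested k != 0.
sqrt2_sums = {k: weyl_sum(sqrt2_sequence, k) for k in range(1, 6)}
# For alpha = 1/3 the frequency k = 3 gives e^{2 pi i n} = 1, so the sum does not decay.
third_sum = weyl_sum(third_sequence, 3)

print(sqrt2_sums)
print(third_sum)
```

The contrast between the two outputs mirrors the geometric-sum computation of Step 1: the denominator $1 - e^{2\pi ik\alpha}$ is non-zero for irrational $\alpha$ but vanishes for $\alpha = 1/3$, $k = 3$.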
(2) One can consider the map $\rho : [0, 1) \cong S^1 \to [0, 1)$ defined by
$$x \mapsto \rho x = \alpha + x \mod 1,$$
endowing $S^1$ with a structure of a dynamical system. The “orbit” of 0 is precisely $0$, $\rho 0 = \alpha$, $\rho^2 0 = 2\alpha \mod 1$, $\ldots$. If $\alpha \in \mathbb{Q}$, this orbit is periodic. Otherwise, it is equidistributed in $S^1$. We can take a
function $f : S^1 \to \mathbb{R}$ (or $f : S^1 \to \mathbb{C}$), called in this context a test function or an observable. In
this context, $\frac{1}{N}\sum_{n=1}^{N} f(\rho^n 0)$ is the time average of the observable $f$ along the trajectory of 0, whereas
$\int_0^1 f(x)\,dx$ is the space average of the observable $f$. The central question of the ergodic theory is
whether and under what conditions on the dynamical system and the observable, these two are
equal, asymptotically as $N \to \infty$. If it is indeed the case for a suitable class of observable functions,
we say that the given dynamical system is ergodic.
(3) In our particular case, it is possible to prove that the ergodic property (II.12) holds for all
Riemann integrable 1-periodic functions f : R → C, proposed as a home assignment, with proof
borrowing from the readily done Step 4, i.e. approximating the Riemann integrable function, from
below and above, by step functions.
(4) There exists a vast number of generalizations for other sequences, dynamical systems in higher
dimensions (e.g. x 7→ x + (α1 , α2 ) on S 1 × S 1 = T2 , the 2-dimensional standard torus) etc.
(5) Finally, it seems appropriate to mention the Three Gaps Theorem, with connections to many
disciplines, including biology, musicology, linguistics and others. For some N ≥ 1, let {αk } be the
ordering of the numbers {n · α}n≤N . Then the collection of gaps {αk+1 − αk } contains at most 3
elements (depending on N ). Further, if it contains precisely 3 elements, then the largest of the three
gaps equals the sum of the other two. The proof of the Three Gaps Theorem is left as a guided home
assignment.
II.3. Vibrating string and wave propagation. In this section we will first derive the wave
equation, and then will solve it using the Fourier analysis. We consider a string of length L > 0
positioned at rest along the x axis between the origin x = 0 and x = L, attached to the axis by
its endpoints. We assume that the string consists of particles moving vertically, and at time t, the
height of the particle x is u(x, t), as shown in Figure 4.
Let ρ > 0 be the (constant) linear density of the given string, i.e. the mass of its snippet between
a and b is ρ · (b − a). We divide the string into N → ∞ equal segments of length h = L/N and

Figure 4. A vibrating string attached by its endpoint.

mass $\rho\cdot h$, and assume that the mass of each segment is concentrated at a particle located at $x_n = n\cdot h$, $1 \le n \le N$, at
height $y_n = u(x_n, t)$. We further assume that each particle only acts (i.e. applies tension force) on its
neighbours.
By Newton’s second law, we have $\rho\cdot h\cdot y_n''(t)$ = force applied on the $n$’th particle, and, by our
simplified assumptions, the only sources of force applied on the $n$’th particle are the tension forces
from the $(n + 1)$’th particle and the $(n - 1)$’th particle, which, as $N \to \infty$, are approximately $\tau\cdot\frac{y_{n+1} - y_n}{h}$
and $\tau\cdot\frac{y_{n-1} - y_n}{h}$ respectively, where $\tau > 0$ is the coefficient of tension. Hence, as $N \to \infty$ (and thereby
$h \to 0+$),
$$\rho h y_n''(t) \approx \frac{\tau}{h}\cdot(y_{n+1}(t) - 2y_n(t) + y_{n-1}(t)),$$
which, upon substituting $y_n = u(x_n, t)$ and $y_n''(t) = \frac{\partial^2}{\partial t^2}u(x_n, t)$, becomes
$$\frac{\rho}{\tau}\frac{\partial^2}{\partial t^2}u(x_n, t) \approx \frac{u(x_n + h, t) - 2u(x_n, t) + u(x_n - h, t)}{h^2}.$$
Since
$$\frac{u(x_n + h, t) - 2u(x_n, t) + u(x_n - h, t)}{h^2} = \frac{\frac{u(x_n + h, t) - u(x_n, t)}{h} - \frac{u(x_n, t) - u(x_n - h, t)}{h}}{h} \xrightarrow[h\to 0+]{} \frac{\partial^2 u}{\partial x^2}(x_n, t),$$
we finally obtain the wave equation
$$\frac{1}{c^2}\frac{\partial^2}{\partial t^2}u(x, t) = \frac{\partial^2}{\partial x^2}u(x, t),$$
with $c = \sqrt{\frac{\tau}{\rho}} > 0$ the propagation velocity. By re-scaling the various variables, we may assume that
$c = 1$, and further, that $L = \pi$ (or any other value), which we will tacitly assume in what follows.

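All in all, we are looking for a function $u(x, t) : [0, \pi] \times \mathbb{R}_{\ge 0} \to \mathbb{R}$ satisfying the wave equation
$$\frac{\partial^2}{\partial t^2}u(x, t) = \frac{\partial^2}{\partial x^2}u(x, t), \tag{II.14}$$
subject to the initial conditions
$$\begin{cases} u(x, 0) \equiv f(x) & \text{initial position} \\ \frac{\partial u}{\partial t}(x, 0) \equiv g(x) & \text{initial velocity} \end{cases}, \tag{II.15}$$
and the boundary conditions $u(0, t) = u(\pi, t) \equiv 0$.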
In an attempt to solve (II.14) we separate variables, i.e. look for its solutions u(·, ·) of the form
u(x, t) = φ(x) · ψ(t), which we substitute into (II.14) to yield the equality φ(x) · ψ ′′ (t) = φ′′ (x) · ψ(t).
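Or, ignoring possible problems occurring in case of division by 0,
$$\frac{\varphi(x)}{\varphi''(x)} = \frac{\psi(t)}{\psi''(t)},$$
which forces
$$\frac{\varphi(x)}{\varphi''(x)} = \frac{\psi(t)}{\psi''(t)} \equiv \lambda \in \mathbb{R}.$$
Therefore,
$$\begin{cases} \varphi(x) - \lambda\cdot\varphi''(x) = 0 \\ \psi(t) - \lambda\cdot\psi''(t) = 0 \end{cases}, \tag{II.16}$$
and the equation for $\psi(t)$ forces $\lambda < 0$, as otherwise the string will not oscillate ($\psi$ will be exponentially
growing).
We may then substitute $\lambda = -\frac{1}{m^2}$ for some $m > 0$, so that (II.16) reads $\varphi'' = -m^2\varphi$ and $\psi'' = -m^2\psi$, and solve (II.16) for $\varphi(\cdot)$ and $\psi(\cdot)$:
$$\begin{cases} \psi(t) = A\cdot\cos(mt) + B\cdot\sin(mt) \\ \varphi(x) = \tilde A\cdot\cos(mx) + \tilde B\cdot\sin(mx) \end{cases},$$
with some $A, B, \tilde A, \tilde B \in \mathbb{R}$, and the meaning of the string being attached by its endpoints is that
$\varphi(0) = \varphi(\pi) = 0$, which, in its turn, forces $\tilde A = 0$ and $m \in \mathbb{Z}$, and thus $m \in \mathbb{Z}_{>0}$.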
We consolidate all of the above to find a particular solution
um (x, t) = (Am cos(mt) + Bm sin(mt)) · sin(mx) (II.17)
to (II.14), not adhering to the initial conditions (II.15), which we will now impose.
By the linearity of the wave equation (II.14), we can superimpose its solutions (II.17) to write
$$u(x, t) = \sum_{m=1}^{\infty} (A_m\cos(mt) + B_m\sin(mt))\cdot\sin(mx),$$
and we will try to impose (II.15) to find sequences of numbers $\{A_m\}$ and $\{B_m\}$ that would satisfy these.
By substituting $t = 0$, we have
$$\sum_{m=1}^{\infty} A_m\sin(mx) \equiv f(x), \tag{II.18}$$
which relates the Am to the Fourier coefficients of f .
Though the Fourier expansion on the l.h.s. of (II.18) involves the sines only (so must be odd), on
an optimistic note we observe that $f : [0, \pi] \to \mathbb{R}$ (i.e. $f$ is prescribed on $[0, \pi]$ only, rather than on a full period of length $2\pi$).
Inspired by this observation, we may extend $f$ (uniquely) to $[-\pi, \pi]$ by oddness, and then the $A_m$
are the Fourier coefficients of the resulting extension (in the real form). Similarly to the above, differentiating in $t$ and substituting $t = 0$,
$$\sum_{m=1}^{\infty} mB_m\sin(mx) \equiv g(x),$$
hence we extend the function $g$ to $[-\pi, \pi]$ by oddness, and the $mB_m$ are the Fourier coefficients of the
extended $g$. In homework there will be two concrete examples of solving the wave equation (one of
them is the “plucked string”, i.e. when $g \equiv 0$).
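The recipe above can be tried out numerically. The sketch below is only an illustration under stated assumptions: the initial position $f(x) = x(\pi - x)$ and zero initial velocity ($g \equiv 0$, so all $B_m = 0$) are made-up example data, the sine coefficients $A_m = \frac{2}{\pi}\int_0^\pi f(x)\sin(mx)\,dx$ of the odd extension are approximated by a midpoint rule, and the series is truncated at $m = 39$. The names `sine_coefficient` and `string_height` are hypothetical.

```python
import math

def sine_coefficient(f, m, samples=5000):
    """A_m = (2/pi) * integral_0^pi f(x) sin(m x) dx, by the midpoint rule."""
    h = math.pi / samples
    total = sum(f((j + 0.5) * h) * math.sin(m * (j + 0.5) * h)
                for j in range(samples))
    return 2.0 / math.pi * total * h

def string_height(coeffs, x, t):
    """Truncated solution u(x, t) = sum_m A_m cos(m t) sin(m x) for g = 0."""
    return sum(A * math.cos(m * t) * math.sin(m * x) for m, A in coeffs.items())

def initial_position(x):
    # Assumed example data: a smooth arch vanishing at both endpoints.
    return x * (math.pi - x)

coeffs = {m: sine_coefficient(initial_position, m) for m in range(1, 40)}

print("A_1 =", coeffs[1], " (exact value 8/pi =", 8.0 / math.pi, ")")
print("u(pi/2, 0) =", string_height(coeffs, math.pi / 2, 0.0))
print("f(pi/2)    =", initial_position(math.pi / 2))
```

Since every mode $\cos(mt)\sin(mx)$ with integer $m$ is $2\pi$-periodic in $t$, the truncated solution returns to its initial shape at time $t = 2\pi$.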

II.4. Heat equation on the circle. Let u(θ, t) be the temperature at time t > 0 at the point
θ ∈ S 1 . We already derived the heat equation for a 2-dimensional setting, and similarly we can
derive one for the 1-dimensional case, like ours:
$$\frac{\partial u}{\partial t} = c\,\frac{\partial^2 u}{\partial\theta^2}, \tag{II.19}$$
where c > 0 is a constant, and we will assume with no loss of generality, that c = 1. The heat
equation is subject to the initial conditions

u(θ, 0) ≡ f (θ)

with a given function f : S 1 → R designating the initial temperature.


We separate variables: $u(\theta, t) = A(\theta)\cdot B(t)$, with $A(\cdot)$ a $2\pi$-periodic function. Then, substituting
into (II.19) yields
$$A(\theta)B'(t) = A''(\theta)B(t),$$
so that (conveniently ignoring the possible problems due to division),
$$\frac{A''(\theta)}{A(\theta)} = \frac{B'(t)}{B(t)} \equiv \lambda$$
for some $\lambda \in \mathbb{R}$. Then $A''(\theta) \equiv \lambda\cdot A(\theta)$ implies, using the periodicity of $A$, that $\lambda = -n^2$ for some
$n \in \mathbb{Z}$, and
$$A(\theta) = A_n e^{in\theta} + B_n e^{-in\theta}.$$
Concerning $B(t)$, we have, by the above, $B'(t) = -n^2B(t)$, which implies that $B(t) = C_n\cdot e^{-n^2t}$.
By the linearity of the heat equation (II.19), we superimpose the solutions $u(\theta, t) = A(\theta)\cdot B(t)$ of
the above type to write
$$u(\theta, t) = \sum_{n=-\infty}^{\infty} a_n e^{-n^2t}\cdot e^{in\theta}. \tag{II.20}$$

Since $u(\theta, 0) \equiv f(\theta)$, it follows that $a_n = \hat f(n)$. We observe that $|a_n| \le \|f\|_\infty$ by the triangle
inequality, hence, for every $t > 0$, the defining series in (II.20) is rapidly absolutely convergent. In particular, $u \in C^2(S^1)$ (in fact, $u \in C^\infty(S^1)$), and $u(\theta, t)$ as in (II.20) solves the heat equation.
Note that by the definition (II.20) of $u(\theta, t)$, for each fixed $t \ge 0$, we have
$$\widehat{u(\cdot, t)}(n) = a_n\cdot e^{-n^2t} = \hat f(n)\cdot e^{-n^2t}.$$
Hence $u(\theta, t) = (f * H_t)(\theta)$, where $H_t(\cdot)$ is the function on $S^1$ such that $\widehat{H_t}(n) = e^{-n^2t}$, i.e.
$$H_t(\theta) = \sum_{n=-\infty}^{\infty} e^{-n^2t}e^{in\theta},$$
the heat kernel on the circle. The $\{H_t(\cdot)\}_{t>0}$ is a family of good kernels as $t \to 0$, hence $u(\theta, t) \to f(\theta)$ as $t \to 0$,
pointwise at continuity points (and uniformly if $f$ is continuous on $S^1$). Moreover, as $t \to 0$,
$$u(\theta, t) \xrightarrow{L^2} f(\theta),$$
with the use of Parseval (left as a homework assignment).
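Formula (II.20) translates directly into a short computation. The following sketch is an illustration, not the notes' method: the initial temperature $f(\theta) = \cos\theta + \frac{1}{2}\cos 3\theta$ is assumed example data, the coefficients $\hat f(n)$ are approximated by an equispaced Riemann sum, and the series is truncated at $|n| \le 10$. For this $f$ the solution is known in closed form, $u(\theta, t) = e^{-t}\cos\theta + \frac{1}{2}e^{-9t}\cos 3\theta$, which gives something to compare against.

```python
import cmath
import math

def fourier_coefficient(f, n, samples=1024):
    """Approximate (1/2pi) * integral_{-pi}^{pi} f(theta) e^{-i n theta} d theta."""
    h = 2.0 * math.pi / samples
    total = 0j
    for j in range(samples):
        theta = -math.pi + j * h
        total += f(theta) * cmath.exp(-1j * n * theta)
    return total / samples  # equals total * h / (2*pi)

def heat_solution(coeffs, theta, t):
    """Truncated series (II.20): sum_n a_n e^{-n^2 t} e^{i n theta} (real part)."""
    return sum(a * math.exp(-n * n * t) * cmath.exp(1j * n * theta)
               for n, a in coeffs.items()).real

def initial_temperature(theta):
    # Assumed example data for the illustration.
    return math.cos(theta) + 0.5 * math.cos(3.0 * theta)

coeffs = {n: fourier_coefficient(initial_temperature, n) for n in range(-10, 11)}
u_later = heat_solution(coeffs, 0.7, 0.1)
u_exact = math.exp(-0.1) * math.cos(0.7) + 0.5 * math.exp(-0.9) * math.cos(2.1)
print(u_later, u_exact)
```

The factor $e^{-n^2t}$ damps the mode $n = 3$ much faster than the mode $n = 1$, which is the smoothing effect behind the rapid convergence of (II.20) for $t > 0$.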



III. Fourier transform on R


III.1. Overview and Fourier transform as the limiting case of the Fourier series. Let $T > 0$
be a parameter, and $f(x) : \left[-\frac{T}{2}, \frac{T}{2}\right] \to \mathbb{C}$ a $T$-periodic function, so that the (re-scaled by homothety
so that $f$ becomes $2\pi$-periodic) Fourier series makes sense for $f$. In what follows we are interested in
what happens to the Fourier series of $f$ as $f$ stops being periodic, i.e. $T \to \infty$. (It is instructive to
think of $f = f_T$ as a restriction to the growing intervals $[-T/2, T/2]$ of a fixed function.) As it was
implied, we re-scale the variable
$$\theta := \frac{2\pi}{T}\cdot x, \tag{III.1}$$
and define the function $g_T : S^1 \to \mathbb{C}$ on the circle by
$$g_T(\theta) := f(x) = f\left(\theta\cdot\frac{T}{2\pi}\right).$$
Then the Fourier coefficients of $g_T$ are given by (with the linear transformation of variable (III.1))
$$\widehat{g_T}(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f\left(\theta\cdot\frac{T}{2\pi}\right)e^{-in\theta}\,d\theta = \frac{1}{T}\int_{-T/2}^{T/2} f(x)\cdot e^{-in(2\pi x/T)}\,dx = \frac{1}{T}\int_{-T/2}^{T/2} f(x)\cdot e^{-2\pi i(n/T)x}\,dx, \tag{III.2}$$
and, under suitable assumptions on $g_T$, we can recover the values of $g_T$ (and hence those of $f(x)$)
from its Fourier coefficients via the Fourier series:
$$f(x) = g_T(\theta) = \sum_{n=-\infty}^{\infty} \widehat{g_T}(n)\cdot e^{in\theta} = \sum_{n=-\infty}^{\infty} \widehat{g_T}(n)\cdot e^{in(2\pi x/T)} = \sum_{n=-\infty}^{\infty} \widehat{g_T}(n)\cdot e^{2\pi i(n/T)x}. \tag{III.3}$$

Since both (III.2) and (III.3) are given in terms of $n/T$ rather than $n$, it is only natural to re-parameterize
the frequency variable $n$, so as to parameterize $\widehat{g_T}(\cdot)$ in terms of $y_n := \frac{n}{T}$. The new
frequency variable $y_n \in \frac{1}{T}\mathbb{Z}$ takes values in a grid of vanishing mesh size $\frac{1}{T}$ (so, asymptotically
for $T \to \infty$, will take values in $\mathbb{R}$). That is, we rewrite (III.2) as
$$\widehat{f_T}(y_n) := T\cdot\widehat{g_T}(T\cdot y_n) = \int_{-T/2}^{T/2} f(x)\cdot e^{-2\pi iy_nx}\,dx, \tag{III.4}$$
and the corresponding reconstruction formula (III.3) as
$$f(x) = \sum_{n=-\infty}^{\infty} \widehat{g_T}(n)\cdot e^{2\pi iy_nx} = \frac{1}{T}\sum_{n=-\infty}^{\infty} \widehat{f_T}(y_n)\cdot e^{2\pi iy_nx}. \tag{III.5}$$

We then observe the striking similarities between the formulas (III.4) and (III.5) for passing from
the time variable $x$ to the frequency variable $y = y_n$ and backwards respectively. The only differences
are the negative and positive signs in the exponent respectively, and that the integral in (III.4) is a
summation in (III.5). However, as $T \to \infty$, $y_n$ becomes a continuous variable, and the summation
(III.5) is a Riemann sum on $\mathbb{R}$ corresponding to the function $\widehat{f_T}(y)$, assuming it is defined for $y \in \mathbb{R}$.
In light of all the above, it would be natural to define the Fourier transform acting on functions
$f : \mathbb{R} \to \mathbb{C}$ as $\mathcal{F} : f(x) \mapsto \hat f(y)$ with
$$\hat f(y) = \int_{-\infty}^{\infty} f(x)e^{-2\pi ixy}\,dx,$$
and the Inverse Fourier transform $\mathcal{F}^* : g(y) \mapsto \check g(x)$ with
$$\check g(x) = \int_{-\infty}^{\infty} g(y)e^{2\pi ixy}\,dy.$$

In what follows we are going to discuss when these definitions make sense, and what are the funda-
mental properties of F and F ∗ , and their inter-relations.
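The integral defining $\hat f$ can be approximated directly for a rapidly decaying function. The sketch below is an illustration under stated assumptions: the test function is the Gaussian $e^{-\pi x^2}$, whose Fourier transform under the normalization above is the well-known $e^{-\pi\xi^2}$; the integral is truncated to $[-10, 10]$ and evaluated by a midpoint rule; the name `fourier_transform` and its parameters are made up for the example.

```python
import cmath
import math

def fourier_transform(f, xi, half_width=10.0, samples=4000):
    """Midpoint-rule approximation of integral f(x) e^{-2 pi i x xi} dx over [-half_width, half_width]."""
    h = 2.0 * half_width / samples
    total = 0j
    for j in range(samples):
        x = -half_width + (j + 0.5) * h
        total += f(x) * cmath.exp(-2j * math.pi * x * xi)
    return total * h

def gaussian(x):
    return math.exp(-math.pi * x * x)

xi = 0.5
approx = fourier_transform(gaussian, xi)
exact = math.exp(-math.pi * xi * xi)
print(approx, exact)
```

Truncating the integral is harmless here because the integrand is negligible outside a bounded window; for slowly decaying functions the same truncation would be the dominant source of error.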
III.2. The Fourier transform on the Schwartz space and its fundamental properties. It
will be seen that the decay of functions at infinity is related to the smoothness of their Fourier
transform, and vice versa. It is therefore understood that the “best” functions to deal with at a first
stage are those smooth functions rapidly decaying at infinity.
Definition III.1. The Schwartz space $\mathcal{S}(\mathbb{R})$ is the space of all functions $f : \mathbb{R} \to \mathbb{C}$, infinitely
differentiable $f \in C^\infty(\mathbb{R})$, decaying at infinity faster than the reciprocal of any polynomial, together with all their
derivatives, i.e. for every $k, \ell \in \mathbb{Z}_{\ge 0}$,
$$\left\| x^\ell\,\frac{d^kf}{dx^k} \right\|_\infty < +\infty.$$
We say that a function f ∈ S(R) is a Schwartz class function.
Before we proceed let us consider a few examples.
Examples III.2. (1) The Gaussian $f(x) = e^{-x^2}$. We claim that $f \in \mathcal{S}(\mathbb{R})$ is a Schwartz class
function. First, obviously, $f \in C^\infty(\mathbb{R})$. For the decay, it is possible to prove by induction
that $f^{(k)}(x) = P_k(x)\cdot e^{-x^2}$, where $P_k(\cdot)$ are polynomials. Since the exponential $e^{x^2}$ grows faster
than any polynomial, the conclusion follows.
(2) $f(x) = e^{-|x|} \notin \mathcal{S}(\mathbb{R})$, since $f \notin C^1(\mathbb{R})$ (and, a fortiori, $f \notin C^\infty(\mathbb{R})$), left to validate as a homework
assignment.
(3) Let us construct a bump function, also called a mollifier: it is a smooth function $f \in C^\infty(\mathbb{R})$ with compact support (denoted $f \in C_0^\infty(\mathbb{R})$), hence $f \in \mathcal{S}(\mathbb{R})$. Take
$$g(x) = \begin{cases} e^{-\frac{1}{2x}} & x > 0 \\ 0 & x \le 0 \end{cases}.$$
Then, as it is easy to check, $g$ is continuous at $x = 0$, and, by induction,
$$g^{(k)}(x) = \begin{cases} \frac{P_k(x)}{x^{2k}}\cdot e^{-\frac{1}{2x}} & x > 0 \\ 0 & x \le 0 \end{cases},$$
with $P_k$ some polynomials, also continuous at $x = 0$. Hence $g \in C^\infty(\mathbb{R})$ (but not compactly
supported). We then enforce the compact support by taking
$$f(x) := g(1 - x)\cdot g(1 + x) = \begin{cases} e^{-\frac{1}{1-x^2}} & x \in (-1, 1) \\ 0 & x \notin (-1, 1) \end{cases},$$
and $f \in C_0^\infty(\mathbb{R})$.
(4) If f (x) ∈ S(R), then f ′ (x) ∈ S(R) and x · f (x) ∈ S(R), proposed as a homework assignment.
We now define the Fourier transform and the Inverse (or Conjugate) Fourier transform on S(R).
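Definition III.3. The Fourier transform on $\mathcal{S}(\mathbb{R})$ is the linear operator $\mathcal{F} : \mathcal{S}(\mathbb{R}) \to \mathcal{S}(\mathbb{R})$,
$f(x) \mapsto \hat f(\xi)$ with
$$\hat f(\xi) = \int_{-\infty}^{\infty} f(x)e^{-2\pi ix\xi}\,dx.$$
We will see below that indeed $\hat f(\cdot) \in \mathcal{S}(\mathbb{R})$. The Inverse Fourier transform (Conjugate Fourier
transform) is $\mathcal{F}^* : \mathcal{S}(\mathbb{R}) \to \mathcal{S}(\mathbb{R})$, $g(\xi) \mapsto \check g(x)$, where
$$\check g(x) = (\mathcal{F}g)(-x).$$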
Our first goal is to validate that for f ∈ S(R), Ff ∈ S(R). To this end we have the following
proposition:
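Proposition III.4. Let $f(\cdot) \in \mathcal{S}(\mathbb{R})$, $\hat f = \mathcal{F} f$, $h \in \mathbb{R}$, and $\delta > 0$. Then:
(1) Translation of a function is mapped to multiplication by an exponential:
\[ f(x + h) \overset{\mathcal{F}}{\mapsto} \hat f(\xi) \cdot e^{2\pi i h \xi}. \]
(2) Dual to (1), left to a homework assignment:
\[ f(x) e^{-2\pi i x h} \overset{\mathcal{F}}{\mapsto} \hat f(\xi + h). \]
(3) Homothety is mapped to inverse (normalized) homothety:
\[ f(\delta x) \overset{\mathcal{F}}{\mapsto} \frac{1}{\delta} \hat f\left(\frac{\xi}{\delta}\right). \]
(4) The derivative operator is mapped to (normalized) multiplication by the independent variable:
\[ f'(x) \overset{\mathcal{F}}{\mapsto} 2\pi i \xi \hat f(\xi). \]
(5) Dual to (4), left to a homework assignment:
\[ (-2\pi i x) f(x) \overset{\mathcal{F}}{\mapsto} \frac{d}{d\xi} \hat f(\xi). \]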

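Proof. Proof of (1):
\[ \widehat{f(\cdot + h)}(\xi) = \int_{-\infty}^{\infty} f(x + h) e^{-2\pi i x \xi}\,dx = \int_{-\infty}^{\infty} f(y) e^{-2\pi i (y - h) \xi}\,dy = e^{2\pi i h \xi} \hat f(\xi). \]

Proof of (3):
\[ \widehat{f(\delta \cdot)}(\xi) = \int_{-\infty}^{\infty} f(\delta x) e^{-2\pi i x \xi}\,dx = \int_{-\infty}^{\infty} f(y) e^{-2\pi i (y/\delta) \xi}\,\frac{dy}{\delta} = \frac{1}{\delta} \int_{-\infty}^{\infty} f(y) e^{-2\pi i y \xi/\delta}\,dy = \frac{1}{\delta} \hat f\left(\frac{\xi}{\delta}\right). \]

Proof of (4): we use integration by parts to write
\[ \widehat{f'}(\xi) = \int_{-\infty}^{\infty} f'(x) e^{-2\pi i x \xi}\,dx = -(-2\pi i \xi) \int_{-\infty}^{\infty} f(x) e^{-2\pi i x \xi}\,dx = (2\pi i \xi) \hat f(\xi), \]
where the contribution of $\pm\infty$ to the integration by parts term vanishes, by the assumption $f \in \mathcal{S}(\mathbb{R})$.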
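Items (1), (3) and (4) can be sanity-checked numerically. The following Python sketch approximates the Fourier integral by the trapezoid rule on a truncated window. The test function $f(x) = (1 + x)e^{-x^2}$, the helper name `fourier_transform`, the window $[-10, 10]$ and the sample values of $\xi$, $h$, $\delta$ are assumptions made for illustration only, not part of the notes.

```python
import cmath
import math


def fourier_transform(f, xi, half_width=10.0, steps=4000):
    """Trapezoid approximation of the integral of f(x) e^{-2 pi i x xi} over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0j
    for j in range(steps + 1):
        x = -half_width + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * f(x) * cmath.exp(-2j * math.pi * x * xi)
    return total * h


def f(x):
    # Illustrative Schwartz function (an assumption of this sketch).
    return (1.0 + x) * math.exp(-x * x)


def f_prime(x):
    # Derivative of f, computed by hand.
    return (1.0 - 2.0 * x - 2.0 * x * x) * math.exp(-x * x)


xi, shift, delta = 0.7, 0.3, 2.0
f_hat = fourier_transform(f, xi)

# (1) translation: f(x + h) -> f_hat(xi) * e^{2 pi i h xi}
err_translation = abs(fourier_transform(lambda x: f(x + shift), xi)
                      - f_hat * cmath.exp(2j * math.pi * shift * xi))
# (3) dilation: f(delta x) -> (1 / delta) * f_hat(xi / delta)
err_dilation = abs(fourier_transform(lambda x: f(delta * x), xi)
                   - fourier_transform(f, xi / delta) / delta)
# (4) derivative: f'(x) -> 2 pi i xi * f_hat(xi)
err_derivative = abs(fourier_transform(f_prime, xi) - 2j * math.pi * xi * f_hat)

print(err_translation, err_dilation, err_derivative)
```

The trapezoid rule is used because, for smooth rapidly decaying integrands, it converges very quickly; all three discrepancies should be at the level of rounding error.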

We are finally able to prove that $\mathcal{F}, \mathcal{F}^* : \mathcal{S}(\mathbb{R}) \to \mathcal{S}(\mathbb{R})$, as we wanted.

Theorem III.5. If $f \in \mathcal{S}(\mathbb{R})$, then $\hat f = \mathcal{F} f \in \mathcal{S}(\mathbb{R})$. (Same for $\mathcal{F}^*$.)


Proof. First, since $|f(x)| \le \frac{C}{1 + |x|^2}$, $|\hat f(\cdot)|$ is bounded by the triangle inequality. Now, we need to prove that for all $\ell, k \ge 0$, $g(\xi) = \xi^{\ell} \cdot \frac{d^k \hat f}{d\xi^k}$ is bounded. The function $g$ is obtained from $\hat f$ by first differentiating $k$ times, then multiplying by $\xi$ for $\ell$ times. Hence, by Proposition III.4, (5) and (4), if we denote $f_1(x) = (-2\pi i x)^k \cdot f(x) \in \mathcal{S}(\mathbb{R})$,
\[ f_1 \overset{\mathcal{F}}{\mapsto} \frac{d^k}{d\xi^k} \hat f(\cdot), \]
and, further,
\[ \frac{(-2\pi i)^k}{(2\pi i)^{\ell}} \frac{d^{\ell}}{dx^{\ell}} \left( x^k f(\cdot) \right) = \frac{1}{(2\pi i)^{\ell}} \frac{d^{\ell} f_1}{dx^{\ell}} \overset{\mathcal{F}}{\mapsto} \xi^{\ell} \hat f_1(\xi) = \xi^{\ell} \frac{d^k}{d\xi^k} \hat f(\xi) = g, \]
which is bounded as an image of a Schwartz function.

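Fourier Inversion formula: our next aim is to prove the inversion formula: for $f \in \mathcal{S}(\mathbb{R})$, $\hat f = \mathcal{F} f$ one has
\[ f(x) = (\mathcal{F}^* \hat f)(x) = \int_{-\infty}^{\infty} \hat f(\xi) e^{2\pi i x \xi}\,d\xi. \tag{III.6} \]

Note that, since $(\mathcal{F}^* g)(x) = (\mathcal{F} g)(-x)$, the inversion formula (III.6) reads
\[ f(x) = (\mathcal{F}^* \mathcal{F} f)(x) = (\mathcal{F}^2 f)(-x). \]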
Hence, if (III.6) is indeed valid, then it will also imply that (F 4 f )(x) = f (x), i.e. the operator
F 4 = Id is the identity operator on S(R). We will move towards this goal, which will be finally
established in section III.3.
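We consider the Gaussian function $f(x) = e^{-\pi x^2}$, and first claim that
\[ A := \hat f(0) = \int_{-\infty}^{\infty} f(x)\,dx = 1. \]

To see that we write, transforming to the polar coordinates:
\[ A^2 = \iint_{\mathbb{R}^2} e^{-\pi (x^2 + y^2)}\,dx\,dy = 2\pi \int_0^{\infty} \rho e^{-\pi \rho^2}\,d\rho = \int_0^{\infty} (2\pi \rho) e^{-\pi \rho^2}\,d\rho = \int_0^{\infty} e^{-\pi \rho^2}\,d(\pi \rho^2) = -e^{-\pi \rho^2} \Big|_0^{\infty} = 1, \]
which is sufficient to imply that $A = 1$ (clearly, $A \ge 0$).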


Claim III.6. The Fourier transform of $f(x) = e^{-\pi x^2}$ satisfies $\hat f(\xi) = f(\xi)$.
This result is closely related to the Central Limit Theorem, and to why the Gaussian distribution appears so often in nature.
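Proof. Let $g = \mathcal{F} f$, i.e. $f \overset{\mathcal{F}}{\mapsto} g$. By Proposition III.4(5), $-2\pi i x f(x) \overset{\mathcal{F}}{\mapsto} g'$, and by (4) of the same proposition, $i f'(x) \overset{\mathcal{F}}{\mapsto} -2\pi \xi g(\xi)$. However, since, as it is easy to check, $i f'(x) = -2\pi i x f(x)$, it forces
\[ -2\pi \xi g(\xi) = \mathcal{F}(i f')(\xi) = \mathcal{F}(-2\pi i x f(x))(\xi) = g'(\xi). \tag{III.7} \]
Now define $G(\xi) := g(\xi) \cdot e^{\pi \xi^2}$. We have
\[ G'(\xi) = g'(\xi) e^{\pi \xi^2} + 2\pi \xi g(\xi) e^{\pi \xi^2} = (g'(\xi) + 2\pi \xi g(\xi)) e^{\pi \xi^2} \equiv 0, \]
by (III.7). Therefore, $G \equiv \text{const}$, and, since $G(0) = g(0) = 1$, $G \equiv 1$, i.e. $g(\xi) = e^{-\pi \xi^2}$.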
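As an illustration of Claim III.6, one can compare a numerical Fourier transform of $e^{-\pi x^2}$ with the function itself. The Python sketch below uses a trapezoid rule on a truncated window; the helper name `fourier_transform`, the window $[-10, 10]$ and the sample frequencies are assumptions of the sketch.

```python
import cmath
import math


def fourier_transform(f, xi, half_width=10.0, steps=4000):
    """Trapezoid approximation of the integral of f(x) e^{-2 pi i x xi} over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0j
    for j in range(steps + 1):
        x = -half_width + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * f(x) * cmath.exp(-2j * math.pi * x * xi)
    return total * h


def gaussian(x):
    # f(x) = e^{-pi x^2}, the Gaussian of Claim III.6.
    return math.exp(-math.pi * x * x)


# A = f_hat(0) is the total mass of the Gaussian; it should be close to 1.
total_mass = fourier_transform(gaussian, 0.0).real

# Largest discrepancy between f_hat(xi) and f(xi) over a few sample frequencies.
max_error = max(abs(fourier_transform(gaussian, xi) - gaussian(xi))
                for xi in (0.0, 0.5, 1.0, 2.0))

print(total_mass, max_error)
```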

A similar role to good kernels (cf. section I.4) will be played by the scaled Gaussians.

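Corollary III.7. For $\delta > 0$ define
\[ K_\delta(x) = \delta^{-1/2} e^{-\pi x^2/\delta} = \delta^{-1/2} K_1(x/\sqrt{\delta}). \tag{III.8} \]
Then $\widehat{K_\delta}(\xi) = e^{-\pi \delta \xi^2}$, and, in particular,
\[ \int_{-\infty}^{\infty} K_\delta(x)\,dx = \widehat{K_\delta}(0) = 1. \]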

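Proof. By (III.8), Proposition III.4(3) and Claim III.6, we have
\[ \widehat{K_\delta}(\xi) = \delta^{-1/2} \cdot \sqrt{\delta}\, K_1(\sqrt{\delta} \cdot \xi) = K_1(\sqrt{\delta} \cdot \xi) = e^{-\pi \delta \xi^2}. \]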

Figure 5. Plots of $K_\delta$ with $\delta = \frac{1}{10}, \frac{1}{2}, 1, 2, 10$. The smaller $\delta$ is, the more concentrated is the kernel at the origin.

Figure 5 depicts the Gaussian kernels $K_\delta$ for $\delta = \frac{1}{10}, \frac{1}{2}, 1, 2, 10$. For a smaller parameter $\delta$, the corresponding kernel is more concentrated at the origin, whereas setting $\delta$ to a higher value results in more mass of $K_\delta$ spreading away from the origin. This is related to the general Heisenberg uncertainty principle, quantifying the relation between how concentrated a function and its Fourier transform could be simultaneously, designating the confidence in the position and the momentum of a particle in quantum mechanics.
The following definition of a family of good kernels is the analogue on R of Definition I.21.
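Definition III.8. A family $\{L_\delta\}_{\delta > 0} \subseteq \mathcal{S}(\mathbb{R})$ is a family of good kernels as $\delta \to 0$, if
(1) For every $\delta > 0$, $\int_{-\infty}^{\infty} L_\delta(x)\,dx = 1$.
(2) There exists a number $M > 0$ so that $\int_{-\infty}^{\infty} |L_\delta(x)|\,dx \le M$ for every $\delta > 0$.
(3) For every $\eta > 0$, $\int_{|x| > \eta} |L_\delta(x)|\,dx \to 0$ as $\delta \to 0$.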

As one can hope, the scaled Gaussian kernels $\{K_\delta\}_{\delta > 0}$ in (III.8) form a family of good kernels.

Claim III.9. The scaled Gaussians $\{K_\delta\}$ form a family of good kernels.

Proof. We have already validated property (1) of good kernels (Corollary III.7), and (2) follows directly from (1) by the positivity of all $K_\delta$. For property (3), we use a linear change of variables to write, for every fixed $\eta > 0$:
\[ \int_{|x| > \eta} |K_\delta(x)|\,dx = \frac{1}{\sqrt{\delta}} \int_{|x| > \eta} e^{-\pi x^2/\delta}\,dx = \int_{|x| > \eta/\sqrt{\delta}} e^{-\pi x^2}\,dx \xrightarrow[\delta \to 0]{} 0, \]
as a tail of a convergent integral.
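The tail in property (3) can be made explicit: substituting $t = \sqrt{\pi}\,x$ in the last integral gives $\int_{|x| > \eta} K_\delta(x)\,dx = \operatorname{erfc}(\eta \sqrt{\pi/\delta})$, where erfc is the complementary error function. The Python sketch below evaluates this; the function names and the sample values $\eta = 0.5$, $\delta \in \{1, 0.1, 0.01\}$ are assumptions chosen for illustration.

```python
import math


def kernel(x, delta):
    """Scaled Gaussian K_delta(x) = delta^{-1/2} exp(-pi x^2 / delta) from (III.8)."""
    return math.exp(-math.pi * x * x / delta) / math.sqrt(delta)


def tail_mass(eta, delta):
    """Integral of K_delta over |x| > eta, in closed form via erfc."""
    return math.erfc(eta * math.sqrt(math.pi / delta))


# The mass outside a fixed window [-eta, eta] shrinks as delta -> 0.
for delta in (1.0, 0.1, 0.01):
    print(delta, tail_mass(0.5, delta))
```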


On S(R) we have a notion of convolution, analogous to Definition I.16.

Definition III.10. For $f, g \in \mathcal{S}(\mathbb{R})$, their convolution is
\[ (f * g)(x) := \int_{-\infty}^{\infty} f(x - y) \cdot g(y)\,dy, \]
the integral converging due to the rapid decay of $f(x - y) \cdot g(y)$ as a function of $y$ (for fixed $x$).

The following corollary is a particular case of the Good Kernels Theorem for functions in $\mathcal{S}(\mathbb{R})$, analogous to Theorem I.22. Its proof, very similar to the one of Theorem I.22, will only be given in sketch for the Gaussian kernels, and is left in full generality as a homework assignment.

Corollary III.11. For every $f \in \mathcal{S}(\mathbb{R})$,
\[ (f * K_\delta)(x) \xrightarrow[\delta \to 0]{} f(x), \]
uniformly w.r.t. $x \in \mathbb{R}$.

Sketch of proof of Corollary III.11. We write
\[ (f * K_\delta)(x) - f(x) = \int_{-\infty}^{\infty} K_\delta(y) \cdot (f(x - y) - f(x))\,dy. \]

We then choose a parameter $\eta > 0$, divide the integral into ranges $\int_{-\infty}^{\infty} = \int_{|y| > \eta} + \int_{|y| < \eta}$, use the triangle inequality, and bound the former integral
\[ \int_{|y| > \eta} |K_\delta(y)| \cdot |f(x - y) - f(x)|\,dy \]
using property (3) of good kernels ($f$ is bounded), whereas the latter integral
\[ \int_{|y| < \eta} |K_\delta(y)| \cdot |f(x - y) - f(x)|\,dy \]
by $|f(x - y) - f(x)|$ uniformly small (since $f'(\cdot)$ is bounded, $f$ is, in particular, Lipschitz).
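The uniform convergence can be observed numerically. The Python sketch below evaluates $(f * K_\delta)(x)$ by the trapezoid rule and records the largest deviation from $f$ over a finite grid, which only approximates the supremum over $\mathbb{R}$. The choice $f(x) = e^{-\pi x^2}$, the integration window, the grid and the values of $\delta$ are all assumptions of the sketch.

```python
import math


def kernel(y, delta):
    """Scaled Gaussian K_delta(y) = delta^{-1/2} exp(-pi y^2 / delta)."""
    return math.exp(-math.pi * y * y / delta) / math.sqrt(delta)


def f(x):
    # Illustrative Schwartz function (an assumption of this sketch).
    return math.exp(-math.pi * x * x)


def convolve_with_kernel(x, delta, half_width=6.0, steps=2400):
    """Trapezoid approximation of (f * K_delta)(x), integrating y over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for j in range(steps + 1):
        y = -half_width + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * f(x - y) * kernel(y, delta)
    return total * h


def grid_sup_error(delta, grid_points=25, x_max=3.0):
    """Largest |(f * K_delta)(x) - f(x)| over an evenly spaced grid on [-x_max, x_max]."""
    xs = [-x_max + 2.0 * x_max * i / (grid_points - 1) for i in range(grid_points)]
    return max(abs(convolve_with_kernel(x, delta) - f(x)) for x in xs)


errors = {delta: grid_sup_error(delta) for delta in (1.0, 0.1, 0.01)}
for delta, err in errors.items():
    print(delta, err)
```

The printed deviations should decrease as $\delta$ decreases, in line with Corollary III.11.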




III.3. Inverse Fourier transform and the Plancherel formula. Our next aim is to prove the
Fourier inversion formula (III.6). Towards this goal we have the following result:
Proposition III.12 (Multiplication formula). For every $f, g \in \mathcal{S}(\mathbb{R})$ the following equality holds:
\[ \int_{-\infty}^{\infty} f(x) \cdot \hat g(x)\,dx = \int_{-\infty}^{\infty} \hat f(y) \cdot g(y)\,dy. \tag{III.9} \]
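Proof. We manipulate:
\[ \int_{-\infty}^{\infty} f(x) \cdot \hat g(x)\,dx = \int_{-\infty}^{\infty} f(x)\,dx \int_{-\infty}^{\infty} g(y) e^{-2\pi i x y}\,dy = \int_{-\infty}^{\infty} g(y)\,dy \int_{-\infty}^{\infty} f(x) e^{-2\pi i x y}\,dx = \int_{-\infty}^{\infty} g(y) \cdot \hat f(y)\,dy, \]
and it is easy to justify the change of the order of integration, due to the smoothness and the rapid decay of all the integrands. □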
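A quick numerical instance of the multiplication formula (III.9): take $f(x) = e^{-\pi x^2}$, so that $\hat f = f$ by Claim III.6, and $g(x) = e^{-\pi \delta x^2}$, whose transform is $\hat g = K_\delta$ by Proposition III.4(3) and Claim III.6. Both sides then equal $(1 + \delta)^{-1/2}$. In the Python sketch below, the helper `integrate`, the window $[-10, 10]$ and the value $\delta = 0.25$ are assumptions made for illustration.

```python
import math


def integrate(func, half_width=10.0, steps=4000):
    """Trapezoid approximation of the integral of func over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for j in range(steps + 1):
        x = -half_width + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * func(x)
    return total * h


def f(x):
    # f(x) = e^{-pi x^2}; its Fourier transform is f itself (Claim III.6).
    return math.exp(-math.pi * x * x)


def g(x, delta):
    # g(x) = e^{-pi delta x^2}.
    return math.exp(-math.pi * delta * x * x)


def g_hat(x, delta):
    # Fourier transform of g, i.e. the scaled Gaussian K_delta.
    return math.exp(-math.pi * x * x / delta) / math.sqrt(delta)


delta = 0.25
lhs = integrate(lambda x: f(x) * g_hat(x, delta))   # integral of f * g_hat
rhs = integrate(lambda y: f(y) * g(y, delta))       # integral of f_hat * g, using f_hat = f
print(lhs, rhs, (1.0 + delta) ** -0.5)
```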

Proof of the Fourier inversion formula (III.6). We first claim that (III.6) holds for $x = 0$, i.e., that
\[ f(0) = \int_{-\infty}^{\infty} \hat f(\xi)\,d\xi, \tag{III.10} \]
and then will be able to deduce the general case via a translation. Recall that for $\delta > 0$ we defined the kernels $K_\delta(x) = \frac{1}{\sqrt{\delta}} \cdot e^{-\pi x^2/\delta}$, and take $g_\delta(x) := e^{-\pi \delta x^2}$, so that $\widehat{g_\delta}(\xi) = K_\delta(\xi)$. An application of the multiplication formula (III.9) with $f$ and $g_\delta$ gives
\[ \int_{-\infty}^{\infty} f(x) \cdot K_\delta(x)\,dx = \int_{-\infty}^{\infty} \hat f(y) \cdot g_\delta(y)\,dy, \tag{III.11} \]
and since $K_\delta(x) = K_\delta(-x)$, the l.h.s. of (III.11) is
\[ \int_{-\infty}^{\infty} f(x) \cdot K_\delta(x)\,dx = \int_{-\infty}^{\infty} f(x) \cdot K_\delta(-x)\,dx = (f * K_\delta)(0) \xrightarrow[\delta \to 0]{} f(0), \tag{III.12} \]
by the Good Kernels Theorem (Corollary III.11).


We now claim that the r.h.s. of (III.11) satisfies
\[ \int_{-\infty}^{\infty} \hat f(y) \cdot g_\delta(y)\,dy \xrightarrow[\delta \to 0]{} \int_{-\infty}^{\infty} \hat f(y)\,dy, \tag{III.13} \]
which is sufficient to yield (III.10), thanks to the already established (III.12). The convergence (III.13) follows immediately from the Dominated Convergence Theorem, in light of the fact that $g_\delta$ is bounded by 1 and converges pointwise to 1 as $\delta \to 0$. Otherwise (as the Dominated Convergence Theorem might be unfamiliar territory), we choose a large parameter $T > 0$, and separate the integral in (III.13):
\[ \int_{-\infty}^{\infty} \hat f(y) \cdot g_\delta(y)\,dy = \int_{-T}^{T} \hat f(y) \cdot g_\delta(y)\,dy + \int_{|y| > T} \hat f(y) \cdot g_\delta(y)\,dy. \]
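Now, since $|g_\delta| \le 1$ and $\hat f \in \mathcal{S}(\mathbb{R})$, the latter integral is $\int_{|y| > T} \hat f(y) \cdot g_\delta(y)\,dy = O\left(\frac{1}{T}\right)$. For the former integral, we write
\[ g_\delta(y) = e^{-\pi \delta y^2} = 1 - O(\delta y^2) = 1 - O(\delta T^2) \]
on the relevant range $|y| < T$, so that, assuming $\|\hat f\|_\infty \le B$, we conclude
\[ \int_{-T}^{T} \hat f(y) \cdot g_\delta(y)\,dy = \int_{-T}^{T} \hat f(y)\,dy + O(B \delta T^3). \]

Consolidating all the bounds above, we may estimate the integral in (III.13) as
\[ \int_{-\infty}^{\infty} \hat f(y) \cdot g_\delta(y)\,dy = \int_{-T}^{T} \hat f(y)\,dy + O(B \delta T^3) + O(1/T) = \int_{-\infty}^{\infty} \hat f(y)\,dy + O(B \delta T^3) + O(1/T) \to \int_{-\infty}^{\infty} \hat f(y)\,dy, \]
since the error terms could be made arbitrarily small, i.e., given $\epsilon > 0$, first choose $T > \frac{2}{\epsilon}$, and then $\delta < \frac{\epsilon}{2 B T^3}$.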

The equality (III.10) is now proved, and we are now going to reduce the case of arbitrary $x$ to $x = 0$, via a simple translation. Namely, we consider $F(y) = F_x(y) := f(x + y)$, $x \in \mathbb{R}$ given. Then, by (III.10) applied to $F(\cdot)$, and by Proposition III.4(1),
\[ f(x) = F(0) = \int_{-\infty}^{\infty} \widehat{F_x}(\xi)\,d\xi = \int_{-\infty}^{\infty} \hat f(\xi) e^{2\pi i x \xi}\,d\xi, \]
which is (III.6).
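The inversion formula (III.6) can be checked numerically by transforming a function on a grid and then integrating the transform back. The Python sketch below does this with two nested trapezoid rules; the test function $f(x) = (1 + x)e^{-\pi x^2}$, the window $[-6, 6]$, the grid size and all names are assumptions of the sketch.

```python
import cmath
import math

HALF_WIDTH = 6.0                      # truncation window [-6, 6] for both x and xi
STEPS = 600
H = 2.0 * HALF_WIDTH / STEPS
NODES = [-HALF_WIDTH + j * H for j in range(STEPS + 1)]
WEIGHTS = [0.5 if j in (0, STEPS) else 1.0 for j in range(STEPS + 1)]


def f(x):
    # Illustrative Schwartz function (an assumption of this sketch).
    return (1.0 + x) * math.exp(-math.pi * x * x)


def transform_at(xi):
    """Trapezoid approximation of f_hat(xi)."""
    return H * sum(w * f(x) * cmath.exp(-2j * math.pi * x * xi)
                   for x, w in zip(NODES, WEIGHTS))


# Samples of f_hat on the same grid, now read as a grid in the frequency variable.
F_HAT = [transform_at(xi) for xi in NODES]


def inverse_at(x):
    """Trapezoid approximation of the integral of f_hat(xi) e^{2 pi i x xi} d xi."""
    return H * sum(w * fh * cmath.exp(2j * math.pi * x * xi)
                   for xi, w, fh in zip(NODES, WEIGHTS, F_HAT))


for x in (-1.0, 0.0, 0.5):
    print(x, f(x), inverse_at(x))
```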

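Recall that $\mathcal{F}, \mathcal{F}^*$ map $\mathcal{S}(\mathbb{R})$ into $\mathcal{S}(\mathbb{R})$. The above result shows that both $\mathcal{F}, \mathcal{F}^*$ are bijective, with $\mathcal{F} \mathcal{F}^* = \mathcal{F}^* \mathcal{F} = \mathrm{Id}$. Moreover, since $(\mathcal{F}^* f)(x) = (\mathcal{F} f)(-x)$, we have that $(\mathcal{F}^2 f)(x) = (\mathcal{F}^* \mathcal{F} f)(-x) = f(-x)$, and it follows that $\mathcal{F}^4 = \mathrm{Id}$.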
Proposition III.13. Let f, g ∈ S(R). Then:
(1) f ∗ g ∈ S(R).
(2) f ∗ g = g ∗ f .
(3) $\widehat{f * g}(\xi) = \hat f(\xi) \cdot \hat g(\xi)$.
Proof. Only the most difficult item (1) will be given a proof; the other ones, being standard manipulations with integrals similar to the corresponding results on the circle, will be relegated to a homework assignment. It will also be shown in a guided homework exercise that for every $\ell \ge 0$, there exists a number $A_\ell > 0$ so that
\[ \sup_{x \in \mathbb{R}} |x|^{\ell} \cdot |g(x - y)| \le A_\ell \cdot (1 + |y|)^{\ell}, \tag{III.14} \]
that we will use in the course of the proof. Using (III.14) we may write:
\[ |x|^{\ell} \cdot |(f * g)(x)| = |x|^{\ell} \cdot \left| \int_{-\infty}^{\infty} f(y) \cdot g(x - y)\,dy \right| \le \int_{-\infty}^{\infty} |f(y)| \cdot (|x|^{\ell} \cdot |g(x - y)|)\,dy \le A_\ell \cdot \int_{-\infty}^{\infty} |f(y)| \cdot (1 + |y|)^{\ell}\,dy < +\infty, \]
independent of $x \in \mathbb{R}$.
Hence $|x|^{\ell} \cdot |(f * g)(x)|$ is bounded, and, since $g \in C^\infty(\mathbb{R})$ is rapidly decaying, we may differentiate under the integral sign: for $k \ge 0$,
\[ \left( \frac{d}{dx} \right)^k (f * g)(x) = \left( f * \frac{d^k g}{dx^k} \right)(x). \]
Hence, by the above,
\[ |x|^{\ell} \left| \left( \frac{d}{dx} \right)^k (f * g)(x) \right| \]
is bounded.

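We endow $\mathcal{S}(\mathbb{R})$, a vector space over $\mathbb{C}$, with an inner product
\[ \langle f, g \rangle = \int_{-\infty}^{\infty} f(x) \cdot \overline{g(x)}\,dx, \]
absolutely convergent by the rapid decay of both $f$ and $g$, and the corresponding norm is defined via
\[ \|f\|^2 = \int_{-\infty}^{\infty} |f(x)|^2\,dx. \]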

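Theorem III.14 (Plancherel). The operator $\mathcal{F} : f \mapsto \hat f$ is an isometry on $\mathcal{S}(\mathbb{R})$, i.e. for every $f \in \mathcal{S}(\mathbb{R})$, $\|f\| = \|\hat f\|$. [Hence $\mathcal{F}$ also preserves the inner product ($\mathcal{F}$ is unitary): for every $f, g \in \mathcal{S}(\mathbb{R})$, $\langle f, g \rangle = \langle \mathcal{F} f, \mathcal{F} g \rangle$.]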
Proof. Given $f \in \mathcal{S}(\mathbb{R})$, define $g(x) := \overline{f(-x)}$. By a homework assignment, the sign negation negates the sign of the Fourier transform, whereas conjugation negates the sign and conjugates the Fourier transform. Therefore, we have that $\hat g(\xi) = \overline{\hat f(\xi)}$. Now we take $h := f * g$, whose Fourier transform is
\[ \hat h(\xi) = \hat f(\xi) \cdot \hat g(\xi) = \hat f(\xi) \cdot \overline{\hat f(\xi)} = |\hat f(\xi)|^2, \]
and write the Fourier inversion formula for $h$ at $x = 0$:
\[ h(0) = \int_{-\infty}^{\infty} \hat h(\xi)\,d\xi = \int_{-\infty}^{\infty} |\hat f(\xi)|^2\,d\xi = \|\mathcal{F} f\|^2. \tag{III.15} \]

On the other hand, by the definition,
\[ h(0) = (f * g)(0) = \int_{-\infty}^{\infty} f(x) \cdot g(-x)\,dx = \int_{-\infty}^{\infty} f(x) \cdot \overline{f(x)}\,dx = \|f\|^2, \tag{III.16} \]
and Plancherel's formula follows upon comparing (III.16) to (III.15).
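Plancherel's identity can be illustrated numerically by comparing $\|f\|^2$ with $\|\hat f\|^2$ for a sample function. The Python sketch below uses $f(x) = (1 + x)e^{-\pi x^2}$, for which a direct computation gives $\|f\|^2 = \frac{1}{\sqrt 2}\left(1 + \frac{1}{4\pi}\right)$. The function, the window $[-6, 6]$, the grid size and all names are assumptions of the sketch.

```python
import cmath
import math

HALF_WIDTH = 6.0                      # truncation window [-6, 6] for both x and xi
STEPS = 600
H = 2.0 * HALF_WIDTH / STEPS
NODES = [-HALF_WIDTH + j * H for j in range(STEPS + 1)]
WEIGHTS = [0.5 if j in (0, STEPS) else 1.0 for j in range(STEPS + 1)]


def f(x):
    # Illustrative Schwartz function (an assumption of this sketch).
    return (1.0 + x) * math.exp(-math.pi * x * x)


def transform_at(xi):
    """Trapezoid approximation of f_hat(xi)."""
    return H * sum(w * f(x) * cmath.exp(-2j * math.pi * x * xi)
                   for x, w in zip(NODES, WEIGHTS))


# Squared L^2 norm of f on the physical side.
norm_f_sq = H * sum(w * abs(f(x)) ** 2 for x, w in zip(NODES, WEIGHTS))

# Squared L^2 norm of f_hat on the frequency side.
norm_f_hat_sq = H * sum(w * abs(transform_at(xi)) ** 2
                        for xi, w in zip(NODES, WEIGHTS))

# Closed form for this particular f, computed by hand.
expected = (1.0 + 1.0 / (4.0 * math.pi)) / math.sqrt(2.0)

print(norm_f_sq, norm_f_hat_sq, expected)
```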



III.4. The class M(R) of functions of moderate decrease, and extension of the Fourier
transform to M(R). Fourier transform as an operator and extension to L2 (R). The
following definition is non-standard, and, seemingly, only given by Stein and Shakarchi.
Definition III.15 (Functions of moderate decrease). We define $\mathcal{M}(\mathbb{R})$ to be the class of functions $f : \mathbb{R} \to \mathbb{C}$ of moderate decrease, continuous except perhaps at finitely many points, s.t. there exists $A > 0$ so that for all $x \in \mathbb{R}$, $|f(x)| \le \frac{A}{1 + x^2}$. [We could use $\frac{1}{1 + |x|^{1 + \epsilon}}$ instead. Other than continuity, no smoothness is imposed on $f$.]

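With the definition of the class $\mathcal{M}(\mathbb{R})$ as given, it is easy to define $\hat f(\xi) = (\mathcal{F} f)(\xi)$ for all $f \in \mathcal{M}(\mathbb{R})$. The main problem is that it might be that $f \in \mathcal{M}(\mathbb{R})$, but $\hat f = \mathcal{F} f \notin \mathcal{M}(\mathbb{R})$, so that $\mathcal{F}^* \hat f$ makes no sense. We could then restrict the definition of $\mathcal{F} f$ to only those $f \in \mathcal{M}(\mathbb{R})$ for which also $\mathcal{F} f \in \mathcal{M}(\mathbb{R})$. The following lemma, already stated on $\mathcal{S}(\mathbb{R})$ and also valid on $\mathcal{M}(\mathbb{R})$, will be proved in a homework assignment.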
Lemma III.16. If $f, g \in \mathcal{M}(\mathbb{R})$, then $f * g \in \mathcal{M}(\mathbb{R})$, and
\[ \widehat{f * g}(\xi) = \hat f(\xi) \cdot \hat g(\xi). \]
From Lemma III.16 one can further deduce the multiplication formula, and subsequently the
Fourier inversion formula and Plancherel’s identity.
Example III.17. Consider $f = \chi_{[-1/2, 1/2]}$, the characteristic function of the interval $[-1/2, 1/2]$. Its Fourier transform was computed as a homework assignment to be $\hat f(\xi) = \frac{\sin(\pi \xi)}{\pi \xi} \notin \mathcal{M}(\mathbb{R})$. However, both $f$ and $\hat f$ are square integrable, i.e. in $L^2(\mathbb{R})$.
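Both statements of Example III.17 can be probed numerically. The Python sketch below compares a midpoint-rule approximation of the Fourier integral of $\chi_{[-1/2, 1/2]}$ with $\frac{\sin(\pi \xi)}{\pi \xi}$, and then evaluates $(1 + \xi^2)|\hat f(\xi)|$ at half-integers, where it grows roughly like $|\xi|/\pi$, so no bound of the form $\frac{A}{1 + \xi^2}$ can hold. The function names, the step count and the sample points are assumptions of the sketch.

```python
import cmath
import math


def sinc_transform(xi):
    """Closed form sin(pi xi) / (pi xi), with the value 1 at xi = 0."""
    if xi == 0.0:
        return 1.0
    return math.sin(math.pi * xi) / (math.pi * xi)


def indicator_transform_numeric(xi, steps=20000):
    """Midpoint-rule approximation of the integral of e^{-2 pi i x xi} over [-1/2, 1/2]."""
    h = 1.0 / steps
    total = 0j
    for j in range(steps):
        x = -0.5 + (j + 0.5) * h
        total += cmath.exp(-2j * math.pi * x * xi)
    return total * h


# Agreement between the numerical integral and the closed form.
max_error = max(abs(indicator_transform_numeric(xi) - sinc_transform(xi))
                for xi in (0.0, 0.3, 1.5, 3.2))

# (1 + xi^2) |f_hat(xi)| at half-integers: unbounded growth, hence f_hat is not in M(R).
weighted = [(1.0 + xi * xi) * abs(sinc_transform(xi)) for xi in (10.5, 100.5, 1000.5)]

print(max_error, weighted)
```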

We then would like to extend the Fourier transform to $L^2(\mathbb{R})$. Recall that $\mathcal{S}(\mathbb{R})$ is an inner product vector space with the inner product
\[ \langle f, g \rangle = \int_{-\infty}^{\infty} f(x) \cdot \overline{g(x)}\,dx. \]

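It is not complete: for instance, one can take a square integrable function outside of $\mathcal{S}(\mathbb{R})$ (e.g. $\chi_{[-1/2, 1/2]}$), and approximate it in norm by its convolutions with rescaled bump functions, which do belong to $\mathcal{S}(\mathbb{R})$. The completion of $\mathcal{S}(\mathbb{R})$ w.r.t. the associated norm coincides with $L^2(\mathbb{R})$, the class of (measurable) functions whose square is Lebesgue integrable (i.e. finite $L^2$-mass), for example containing both $f = \chi_{[-1/2, 1/2]}$ and its Fourier transform $\hat f(\xi) = \frac{\sin(\pi \xi)}{\pi \xi}$. It is then possible to extend $\mathcal{F}$ and $\mathcal{F}^*$ to operators $L^2(\mathbb{R}) \to L^2(\mathbb{R})$, in a way that $\mathcal{F}$ is an isometry (hence unitary), and $\mathcal{F}^* = \mathcal{F}^{-1}$, so that $\mathcal{F}$ is bijective.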
Construction: Let $f \in L^2(\mathbb{R})$, represented as a Cauchy sequence $f = \{f_k\}_{k \ge 1} \subseteq \mathcal{S}(\mathbb{R})$. Now take the image of the sequence $\{g_k = \mathcal{F} f_k\}_{k \ge 1} \subseteq \mathcal{S}(\mathbb{R})$. Then, since $\mathcal{F}$ is an isometry, the sequence $\{g_k\}_{k \ge 1}$ is Cauchy, and thereby is an element $g = \{g_k\}_{k \ge 1}$ of $L^2(\mathbb{R})$. Finally, we set $\mathcal{F} f := g$. It is then important to prove that $\mathcal{F} f$ is well-defined, i.e. that the equivalence class of $g$ is independent of the representation of the equivalence class $f$. Equivalently, if $f \in L^2(\mathbb{R})$, then
\[ \mathcal{F} f := \lim_{R \to \infty} \int_{-R}^{R} f(x) e^{-2\pi i x \xi}\,dx, \]
limit in $L^2$ (not pointwise).

IV. Application of the Fourier transform on R: The time-dependent heat equation on R.
We consider the heat propagation on an infinite rod, i.e. the evolution of the function
u(x, t) : R × R≥0 → C,
where u(x, t) is the temperature at the point x ∈ R at time t ∈ R≥0 . In this case we assume that
the initial temperature is given, i.e. u(·, 0) ≡ f (·), for some given function f : R → C, and we will
be interested to evaluate u(·, t) for t > 0. It will be convenient to assume that f ∈ S(R).
Recall that earlier on, we derived the (time-dependent) heat equation on R2 , addressed in the first
part of this module for S 1 in place of R2 . We interpreted Newton’s Law of Cooling for a function
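$u(x, y, t) : \mathbb{R}^2 \times \mathbb{R}_{\ge 0} \to \mathbb{C}$ to satisfy the differential equation
\[ \frac{\partial u}{\partial t}(x, y, t) = \Delta u(x, y, t), \]
where $\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$ is the Laplacian (Laplace operator). On the circle $S^1$, the corresponding equation for $u(\theta, t)$, $(\theta, t) \in S^1 \times \mathbb{R}_{\ge 0}$ is
\[ \frac{\partial u}{\partial t}(\theta, t) = \frac{\partial^2 u}{\partial \theta^2}(\theta, t), \]
i.e. the second derivative w.r.t. $\theta$ in place of the Laplace operator. In that case (i.e., on the circle), we separated variables, and arrived at the solution, under appropriate conditions on the initial function $f(\theta)$:
\[ u(\theta, t) = (f * H_t)(\theta), \]
with the heat kernel
\[ H_t(\theta) := \sum_{n = -\infty}^{\infty} e^{-n^2 t} e^{i n \theta}. \]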

In our case of $u(x, t) : \mathbb{R} \times \mathbb{R}_{\ge 0} \to \mathbb{C}$, the corresponding heat equation is
\[ \frac{\partial u}{\partial t}(x, t) = \frac{\partial^2 u}{\partial x^2}(x, t), \tag{IV.1} \]
that is derived by similar considerations (Newton's cooling law), subject to the initial conditions
\[ u(\cdot, 0) \equiv f(\cdot), \tag{IV.2} \]
for a given function $f : \mathbb{R} \to \mathbb{C}$. Let us analyse the heat equation (IV.1) by taking the Fourier transform of both sides w.r.t. the spatial variable $x$ (i.e. thinking of the time $t \ge 0$ as constant). We have (formally), recalling that on the Fourier side (with independent variable $\xi$) differentiation acts by multiplying by $2\pi i \xi$,
\[ \frac{\partial \hat u}{\partial t}(\xi, t) = (2\pi i \xi)^2 \cdot \hat u(\xi, t) = -4\pi^2 \xi^2 \hat u(\xi, t), \tag{IV.3} \]
since differentiation w.r.t. $t$ commutes with the Fourier transform w.r.t. $x$ (note these are different variables).
It turns out that the relevant heat kernels for solving the equation (IV.3) are none other than our own Gaussian kernels (up to the correct linear rescaling of the time, so that the choice of the parameters $\delta$ will work)
\[ H_t(x) := K_{\delta(t)}(x) = \frac{1}{\sqrt{4\pi t}} e^{-x^2/(4t)}, \tag{IV.4} \]
with $\delta(t) := 4\pi t$, and the Fourier transform (evaluated in Corollary III.7)
\[ \widehat{H_t}(\xi) = e^{-\pi \delta(t) \xi^2} = e^{-4\pi^2 t \xi^2}. \tag{IV.5} \]
Indeed, for a fixed $\xi \in \mathbb{R}$, as a function of $t \in \mathbb{R}_{\ge 0}$, the equation (IV.3) admits the solution (for $\hat u(\xi, \cdot)$),
\[ \hat u(\xi, t) = A(\xi) \cdot e^{-4\pi^2 \xi^2 t}, \tag{IV.6} \]
and substituting $t = 0$, and the initial conditions (IV.2) recover the value of $A(\cdot)$ via
\[ \hat f(\xi) = \hat u(\xi, 0) = A(\xi), \]
so that (IV.6) reads
\[ \hat u(\xi, t) = \hat f(\xi) \cdot e^{-4\pi^2 \xi^2 t}. \tag{IV.7} \]

We then take the inverse Fourier transform of (IV.7) (alternatively, using the bijectivity of the direct Fourier transform), and use the duality between multiplication of functions on the frequency side and convolution on the physical side, to obtain the solution
\[ u(x, t) = (f * H_t)(x) = (f * K_{\delta(t)})(x). \]
The fact that $\{K_\delta(\cdot)\}_{\delta > 0}$ are good kernels as $\delta \to 0$ will ensure the (uniform) convergence of $u(x, t)$ to $f$ as $t \to 0$ (also implying $\delta \to 0$ by the linear rescaling). That this is indeed so is asserted by the following theorem.
Theorem IV.1. Let $f \in \mathcal{S}(\mathbb{R})$ be given, and for $t > 0$, $x \in \mathbb{R}$ define
\[ u(x, t) := (f * H_t)(x), \tag{IV.8} \]
with $H_t$ the heat kernel given by (IV.4). Then:
(1) The function $u(\cdot, \cdot)$ is $C^2(\mathbb{R} \times \mathbb{R}_{> 0})$, satisfying the heat equation (IV.1).
(2) As $t \to 0$,
\[ u(x, t) \to f(x) \]
uniformly w.r.t. $x \in \mathbb{R}$. Hence $u$ defines a continuous function on
\[ \mathbb{R} \times \mathbb{R}_{\ge 0} = \{(x, t) : x \in \mathbb{R},\ t \ge 0\} \]
(by setting $u(\cdot, 0) := f(\cdot)$).
(3) Further to (2), the $L^2$-convergence holds, i.e.
\[ \int_{-\infty}^{\infty} |u(x, t) - f(x)|^2\,dx \to 0, \tag{IV.9} \]
as $t \to 0$.
Note that the L2 -convergence (IV.9) does not follow from the uniform convergence on R (unlike,
e.g. on S 1 ).
Proof. We first prove (1), i.e. that $u(\cdot, \cdot)$ is sufficiently smooth, and satisfies the heat equation. We take the Fourier transform of the definition (IV.8) of $u(\cdot, \cdot)$ as a function of $x \in \mathbb{R}$ (for $t > 0$ fixed) to yield
\[ \mathcal{F} u = (\mathcal{F} f) \cdot (\mathcal{F} H_t), \]
explicitly, bearing in mind (IV.5),
\[ \hat u(\xi, t) = \hat f(\xi) \cdot e^{-4\pi^2 t \xi^2}. \tag{IV.10} \]
Now, apply the Fourier inversion formula to (IV.10) (w.r.t. $x$):
\[ u(x, t) = \int_{-\infty}^{\infty} \hat f(\xi) e^{-4\pi^2 t \xi^2} e^{2\pi i \xi x}\,d\xi. \tag{IV.11} \]
Now, using the rapid decay of $\hat f(\cdot) \in \mathcal{S}(\mathbb{R})$ (and also of $e^{-4\pi^2 t \xi^2}$), we can differentiate under the integral sign of (IV.11) w.r.t. either $x$ or $t$, and it follows that $u(\cdot, \cdot) \in C^\infty(\mathbb{R} \times \mathbb{R}_{> 0})$ is infinitely differentiable in both its variables, stronger than claimed by (1). By the Leibniz rule, we obtain, thanks to (IV.11) again:
\[ \frac{\partial u}{\partial t} = \int_{-\infty}^{\infty} -4\pi^2 \xi^2 \hat f(\xi) e^{-4\pi^2 t \xi^2} e^{2\pi i \xi x}\,d\xi = \int_{-\infty}^{\infty} (2\pi i)^2 \xi^2 \hat f(\xi) e^{-4\pi^2 t \xi^2} e^{2\pi i \xi x}\,d\xi = \frac{\partial^2 u}{\partial x^2}. \]

The statement (2) of the theorem follows directly from the Good Kernels Theorem applied to the Gaussian kernels $\{K_\delta\}_\delta$ (Corollary III.11), in the precise form
\[ (f * H_t)(x) = (f * K_{\delta(t)})(x) \to f(x) \]
as $t \to 0$, uniformly w.r.t. $x \in \mathbb{R}$, thanks to $\delta(t) = 4\pi t \to 0$ as well. Finally, to prove the $L^2$-convergence (3), we use Plancherel's identity:
\[ \|u(\cdot, t) - f(\cdot)\|_2^2 = \|\hat u(\cdot, t) - \hat f(\cdot)\|_2^2 = \int_{-\infty}^{\infty} |\hat f(\xi)|^2 \cdot \left| e^{-4\pi^2 t \xi^2} - 1 \right|^2 d\xi, \]
by (IV.10).
It then remains to show that the latter integral vanishes as $t \to 0$, very similar to a homework assignment problem, both in terms of the statement and in terms of the proof, except the integral in place of the summation (and a slightly different normalization). Namely, we take a large parameter $T \gg 0$, and write
\[ \int_{-\infty}^{\infty} |\hat f(\xi)|^2 \cdot \left| e^{-4\pi^2 t \xi^2} - 1 \right|^2 d\xi = \int_{-T}^{T} + \int_{|\xi| > T}. \]
In the range $|\xi| > T$ we use the trivial inequality $\left| e^{-4\pi^2 t \xi^2} - 1 \right| \le 1$, whence
\[ \int_{|\xi| > T} |\hat f(\xi)|^2 \cdot \left| e^{-4\pi^2 t \xi^2} - 1 \right|^2 d\xi \le \int_{|\xi| > T} |\hat f(\xi)|^2\,d\xi = O(1/T), \]
since $\hat f(\cdot) \in \mathcal{S}(\mathbb{R})$, whereas for $|\xi| < T$, we have $e^{-4\pi^2 t \xi^2} - 1 = O(t \xi^2) = O(t T^2)$, so that
\[ \int_{-T}^{T} |\hat f(\xi)|^2 \cdot \left| e^{-4\pi^2 t \xi^2} - 1 \right|^2 d\xi = O(B t^2 T^5) = O(t^2 T^5), \]
where $B = \|\hat f\|_\infty^2$ (we allow the constant implicit within the '$O$'-notation to depend on $f$). Finally, it is easy to first choose the parameter $T$ sufficiently large, and then $t > 0$ sufficiently small, so that the total contribution of
\[ O(1/T) + O(t^2 T^5) < \epsilon \]
is arbitrarily small.
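Theorem IV.1 can be illustrated on a case where the solution is explicit. For the initial datum $f = K_1$, i.e. $f(x) = e^{-\pi x^2}$, the convolution theorem together with $\widehat{K_\delta}(\xi) = e^{-\pi \delta \xi^2}$ gives $K_a * K_b = K_{a + b}$, hence $u(x, t) = K_{1 + 4\pi t}(x)$. The Python sketch below checks the heat equation (IV.1) by central finite differences and the convergence $u(\cdot, t) \to f$ on a finite grid. The choice of $f$, the step sizes, the grid and all function names are assumptions of the sketch.

```python
import math


def kernel(x, delta):
    """Scaled Gaussian K_delta(x) = delta^{-1/2} exp(-pi x^2 / delta)."""
    return math.exp(-math.pi * x * x / delta) / math.sqrt(delta)


def initial(x):
    # Initial temperature f = K_1 (an assumption of this sketch).
    return kernel(x, 1.0)


def solution(x, t):
    # u = f * H_t = K_1 * K_{4 pi t} = K_{1 + 4 pi t}, valid for this particular f only.
    return kernel(x, 1.0 + 4.0 * math.pi * t)


def pde_residual(x, t, dx=1e-3, dt=1e-5):
    """Central finite-difference approximation of u_t - u_xx at (x, t)."""
    u_t = (solution(x, t + dt) - solution(x, t - dt)) / (2.0 * dt)
    u_xx = (solution(x + dx, t) - 2.0 * solution(x, t) + solution(x - dx, t)) / (dx * dx)
    return u_t - u_xx


def grid_sup_error(t, grid_points=25, x_max=3.0):
    """Largest |u(x, t) - f(x)| over an evenly spaced grid on [-x_max, x_max]."""
    xs = [-x_max + 2.0 * x_max * i / (grid_points - 1) for i in range(grid_points)]
    return max(abs(solution(x, t) - initial(x)) for x in xs)


residuals = [abs(pde_residual(x, 0.1)) for x in (0.0, 0.5, 1.0)]
sup_errors = [grid_sup_error(t) for t in (0.1, 0.01, 0.001)]
print(residuals, sup_errors)
```

The residuals should be small (limited by the finite-difference step sizes), and the grid deviations from $f$ should shrink as $t$ decreases.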

It is also possible to assert the uniqueness of a solution, by imposing further conditions on u(·, ·)
but it is outside the scope of our module.
