Lecture Notes Fourier28
FOURIER ANALYSIS
IGOR WIGMAN
Contents
0. Preliminaries
0.1. Integration (Riemann)
0.2. Complete (metric) vector spaces
0.3. Functions on the circle $S^1$
I. Fourier series on the circle
I.1. Definition of the Fourier series, their simple properties, and some examples
I.2. Uniqueness of Fourier series, and uniform convergence. Decay of Fourier coefficients and smoothness.
I.3. Convolutions and Dirichlet kernels. Smoothing effect of convolutions.
I.4. “Good kernels”
I.5. Cesàro summability
I.6. Abel summability and Poisson Kernel. Steady-state heat equation
I.7. Mean-square convergence of Fourier series.
Towards the proof of the mean square convergence Theorem I.34
I.8. Pointwise convergence
II. Some applications of Fourier series
II.1. The isoperimetric inequality
II.2. Weyl’s Equidistribution Theorem
II.3. Vibrating string and wave propagation
II.4. Heat equation on the circle
III. Fourier transform on R
III.1. Overview and Fourier transform as the limiting case of the Fourier series
III.2. The Fourier transform on the Schwartz space and its fundamental properties
III.3. Inverse Fourier transform and the Plancherel formula
III.4. The class M(R) of functions of moderate decrease, and extension of the Fourier transform to M(R). Fourier transform as an operator and extension to $L^2(\mathbb{R})$
IV. Application of the Fourier transform on R: The time-dependent heat equation on R.
0. Preliminaries
0.1. Integration (Riemann). Let −∞ < a < b < +∞, then I = [(a, b)] is an interval I ⊆ R, open or closed, or semi-closed. If f : I → R (or f : I → C) is a function, we are interested in defining the integral $S = \int_a^b f(x)\,dx$.
(3) We say that f is integrable on I, with value of integral $s := \int_a^b f(x)\,dx$, if for all ϵ > 0, there exists a δ > 0 such that for every partition ∆ of I of mesh size |∆| < δ, we have
\[ |S(f, \Delta) - s| < \epsilon. \]
That is,
\[ \int_a^b f(x)\,dx := \lim_{|\Delta| \to 0} S(f, \Delta). \]
The following facts on Riemann integration were established as part of a standard analysis course:
(1) If a function f is not bounded on I, then f is not integrable on I.
(2) Define the upper (resp. lower) Riemann sums:
\[ U(f; \Delta) := \sum_{j=0}^{n-1} \Big( \sup_{t \in [t_j, t_{j+1}]} f(t) \Big) \cdot (t_{j+1} - t_j) \]
(resp.
\[ L(f; \Delta) := \sum_{j=0}^{n-1} \Big( \inf_{t \in [t_j, t_{j+1}]} f(t) \Big) \cdot (t_{j+1} - t_j)). \]
Then f is integrable on I, if and only if for every ϵ > 0, there exists a partition ∆ such that
\[ 0 \le U(f, \Delta) - L(f, \Delta) < \epsilon. \]
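The criterion in (2) can be watched numerically. The following is a minimal sketch, not part of the original notes: it assumes a monotone increasing integrand (here the hypothetical choice f(x) = x² on [0, 1]) and a uniform partition, so that the supremum and infimum on each subinterval sit at the right and left endpoints. The helper name `upper_lower_sums` is ours.

```python
def upper_lower_sums(f, a, b, n):
    """Upper and lower Riemann sums of a monotone increasing f on the
    uniform partition of [a, b] into n subintervals.

    For increasing f the sup (inf) over [t_j, t_{j+1}] is attained at the
    right (left) endpoint, so no search for extrema is needed.
    """
    h = (b - a) / n
    upper = sum(f(a + (j + 1) * h) for j in range(n)) * h
    lower = sum(f(a + j * h) for j in range(n)) * h
    return upper, lower


square = lambda x: x * x  # hypothetical test integrand, increasing on [0, 1]

for n in (10, 100, 1000):
    U, L = upper_lower_sums(square, 0.0, 1.0, n)
    # For increasing f, U - L = (f(b) - f(a)) * (b - a) / n, here 1/n.
    print(n, U, L, U - L)
```

For this integrand the gap U − L telescopes to 1/n, so it can be made smaller than any ϵ > 0, in agreement with the integrability of continuous functions.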
Examples 0.2. (1) The Dirichlet function f : [0, 1] → R:
\[ f(x) = \begin{cases} 1 & x \in \mathbb{Q} \\ -1 & x \notin \mathbb{Q} \end{cases}. \]
The function f is not Riemann integrable (and nowhere continuous), since U (f ; ∆) ≡ 1 but
L(f ; ∆) ≡ −1. However, its absolute value |f | ≡ 1 is integrable, a peculiarity of integrability
in the Riemann sense.
(2) If f : [a, b] → R (or f : [a, b] → C) is continuous, then f is integrable.
(3) If f, g are integrable, then so is f · g, and hence, in particular, $f^2$.
Definition 0.3 (Improper integrals). Let f : [a, +∞) → R be a function, integrable on every
[a, b] ⊆ [a, +∞). Then the improper integral is the limit
\[ \int_a^{+\infty} f(x)\,dx := \lim_{b \to \infty} \int_a^b f(x)\,dx, \]
if it exists.
The integral $\int_a^{+\infty} f(x)\,dx$ might make sense, even if $\int_a^{+\infty} |f(x)|\,dx$ does not converge, whence we say that it is conditionally convergent, such as, for example, $\int_1^{+\infty} \frac{\sin x}{x}\,dx$. Similarly, we define
\[ \int_{-\infty}^{+\infty} f(x)\,dx := \lim_{b \to +\infty} \int_{-b}^{b} f(x)\,dx, \]
if it exists. We end this section by stating that all this theory could be generalized to multivariate
functions, defined on subsets of Rn , n ≥ 1.
0.2. Complete (metric) vector spaces.
Definition 0.4. Let X be a set (e.g. a finite dimensional real vector space X = Rn ). A metric on
X is a function d : X × X → R satisfying the following properties:
(1) Positivity: a. For all x, y ∈ X, d(x, y) ≥ 0. b. For all x, y ∈ X, d(x, y) = 0 if and only if
x = y.
(2) Symmetry: For all x, y ∈ X, d(x, y) = d(y, x).
(3) Triangle inequality: For every x, y, z ∈ X, d(x, z) ≤ d(x, y) + d(y, z).
If d(·, ·) is a metric on X, we say that the tuple (X, d) is a metric space.
Example 0.5. If (X, ∥ · ∥) is a normed linear space (i.e. · ↦ ∥ · ∥ is a norm on X), then it induces
the metric d(x, y) := ∥x − y∥. That is every normed linear space is a metric space, but there are a
lot of interesting metric spaces X that are not linear spaces (ditto not normed linear spaces).
Definition 0.6 (Complete metric spaces). Let (X, d) be a metric space.
(1) A sequence {xn } ⊆ X of elements of X is Cauchy, if for every ϵ > 0 there exists a number
N = N (ϵ) such that for every n, m > N , the inequality d(xn , xm ) < ϵ holds.
(2) A sequence {xn } ⊆ X of elements of X converges to a (unique!) element x ∈ X, if for every
ϵ > 0 there exists N = N (ϵ) so that for every n > N the inequality d(xn , x) < ϵ holds.
(3) The metric space (X, d) is complete, if every Cauchy sequence {xn } ⊆ X is convergent.
Examples 0.7. (1) The space (R, |x − y|) is complete. In this example the notions of Cauchy and convergent sequences coincide with the classical definitions. The same holds for $(\mathbb{R}^n, \|x - y\|_p)$, n ≥ 1, p ≥ 1.
(2) The space (Q, |x−y|) is not complete. To see this, we can take the sequence defined recursively by $x_0 = 1$ and $x_{n+1} = \frac{x_n + 2/x_n}{2}$, which is Cauchy but not convergent in Q (left for homework). Otherwise, consider the function $f(x) = \sqrt{1 + x}$ and Taylor expand it at x = 1. We obtain a rapidly decaying series, with rational partial sums, divergent in Q. In any case, the conclusion is that Q contains “holes”.
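As an illustration (a sketch of ours, not from the notes), the recursion $x_{n+1} = (x_n + 2/x_n)/2$ with $x_0 = 1$ can be run in exact rational arithmetic. Every iterate is a rational number, and $x_n^2 - 2$ shrinks rapidly without ever vanishing, which is the numerical shadow of "Cauchy in Q, but with no limit in Q". The function name `babylonian_iterates` is hypothetical.

```python
from fractions import Fraction


def babylonian_iterates(steps):
    """Return [x_0, ..., x_steps] for x_{n+1} = (x_n + 2/x_n)/2, x_0 = 1,
    computed exactly in the rationals."""
    x = Fraction(1)
    iterates = [x]
    for _ in range(steps):
        x = (x + 2 / x) / 2
        iterates.append(x)
    return iterates


xs = babylonian_iterates(5)
for n, x in enumerate(xs):
    # x*x - 2 is an exact rational; it is never zero since 2 has no
    # rational square root, but it decays very quickly.
    print(n, x, float(x * x - 2))
```

The first iterates are 1, 3/2, 17/12, 577/408; consecutive iterates get arbitrarily close to each other, while the would-be limit √2 is missing from Q.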
If a metric space (X, d) is not complete, it gives rise to the completion $(\check{X}, \check{d})$ of X: it is a unique (up to an isometric bijection) metric space satisfying the following properties:
(1) The space X is embedded in $\check{X}$: $X \subseteq \check{X}$ and $\check{d}$ extends d.
(2) X is dense in $\check{X}$, i.e. for every $x \in \check{X}$ there exists a sequence $\{x_n\} \subseteq X$ of elements of X such that $\lim_{n \to \infty} x_n = x$ in $\check{X}$.
(3) X̌ is complete.
Examples 0.8. (1) R is the completion of Q.
(2) [0, 1] is the completion of (0, 1) ⊆ R.
Given (X, d), one way to construct its completion is by defining X̌ as the collection of all equivalence
classes of Cauchy sequences in X, w.r.t. a suitable equivalence relation.
0.3. Functions on the circle $S^1$. Let f : R → R (or f : R → C) be periodic with period 2π. Then f is determined by its restriction $f|_{[-\pi,\pi]}$ (or its restriction to any other period, i.e. $f|_{[x_0, x_0+2\pi]}$), and moreover, f(−π) = f(π). Conversely, given such a function f : [−π, π] → R (or f : [−π, π] → C) with f(−π) = f(π), it uniquely prescribes a periodic function f : R → R with period 2π: for every x ∈ R, f(x + 2π) = f(x).
Alternatively, we may think of such a function f as defined on the circle $S^1 \subseteq \mathbb{R}^2$, by identifying it with the function F((cos θ, sin θ)) := f(θ), θ ∈ [−π, π). Moreover, F is continuous, if and only if f is continuous (assuming f(−π) = f(π), as otherwise ±π is a point of discontinuity of f).
for some coefficients $a_n \in \mathbb{C}$, where the meaning of “∼” is not specified at the moment (in particular, no convergence is implied). Manipulating (I.1) formally (changing the order of summation and integration, with no justification), we take $n_0 \in \mathbb{Z}$ and write
\[ \int_{-\pi}^{\pi} f(x) e^{-i n_0 x}\,dx = \sum_{n=-\infty}^{\infty} a_n \int_{-\pi}^{\pi} e^{i(n - n_0)x}\,dx = 2\pi a_{n_0}. \]
In light of the above, it is reasonable to expect that if an expansion of the type (I.1) exists, then the coefficients are given by
\[ a_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-inx}\,dx. \]
It is well-defined, since, as discussed earlier, a product of two integrable functions (in this case, f and $e^{-in\theta}$) is integrable.
(2) The series $\sum_{n=-\infty}^{\infty} a_n e^{in\theta}$ (convergent or not) is called the Fourier series associated to f, which is designated by
\[ f \sim \sum_{n=-\infty}^{\infty} a_n e^{in\theta}, \]
and has no other meaning (e.g. the convergence of the Fourier series), than that $a_n$ are given by (I.3).
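The coefficient formula $a_n = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)e^{-inx}\,dx$ lends itself to a quick numerical check. The sketch below is ours and rests on assumptions not in the notes: the integral is approximated by the midpoint rule on an equispaced grid, and the test function is a hypothetical trigonometric polynomial whose coefficients are known in advance.

```python
import cmath
import math


def fourier_coefficient(f, n, samples=4096):
    """Approximate a_n = (1/2π) ∫_{-π}^{π} f(x) e^{-inx} dx by the midpoint rule."""
    h = 2 * math.pi / samples
    total = 0j
    for j in range(samples):
        x = -math.pi + (j + 0.5) * h
        total += f(x) * cmath.exp(-1j * n * x)
    return total * h / (2 * math.pi)


# Hypothetical test function: f(x) = 3 + 2 e^{ix} - 5 e^{-2ix},
# so a_0 = 3, a_1 = 2, a_{-2} = -5 and all other coefficients vanish.
f = lambda x: 3 + 2 * cmath.exp(1j * x) - 5 * cmath.exp(-2j * x)

for n in (-2, -1, 0, 1, 2):
    print(n, fourier_coefficient(f, n))
```

The printed values recover the prescribed coefficients up to rounding, which is the orthogonality computation of the formal argument above carried out on a grid.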
Remark I.2. (1) The integral
\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} f(\theta) e^{-in\theta}\,d\theta = \frac{1}{2\pi} \int_{\theta_0}^{\theta_0 + 2\pi} f(\theta) e^{-in\theta}\,d\theta \]
n ∈ Z, and $f(x) \sim \sum_{n=-\infty}^{\infty} \hat{f}(n) e^{2\pi i n x / L}$, e.g. by the linear transformation $\theta = \frac{2\pi}{L} x$. We will assume everywhere L = 2π, unless specified otherwise.
A central question is that of convergence of the Fourier series (to f?). That is, we define for $N \in \mathbb{Z}_{>0}$ the N’th partial Fourier sum of f
\[ (S_N(f))(\theta) := \sum_{n=-N}^{N} \hat{f}(n) e^{in\theta}. \]
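since $\int_{-\pi}^{\pi} e^{-in\theta}\,d\theta = 0$, and $e^{i\pi n} = (-1)^n$.
Hence the Fourier series of f is
\[ f \sim \sum_{n \neq 0} i \cdot \frac{(-1)^n}{n} e^{in\theta} = \sum_{n=1}^{\infty} i \cdot (-1)^n \cdot \frac{1}{n} \cdot \left( e^{in\theta} - e^{-in\theta} \right) = \sum_{n=1}^{\infty} i \cdot (-1)^n \frac{1}{n} \cdot 2i \sin(n\theta) = 2 \sum_{n=1}^{\infty} (-1)^{n+1} \frac{\sin(n\theta)}{n}, \]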
by uniting the summands corresponding to n and −n. Does this series converge? To f ? At this
point we can only infer that, by the slow decay of its summands, there is no absolute convergence.
In general, in cases, like f as in Example I.4, that the function f : [−π, π] → R is real-valued, we
may represent its Fourier series in its real form by grouping the summands corresponding to n and
−n in the following fashion.
Claim I.5. If f is real-valued, then for every n ∈ Z,
□
The converse, i.e. that if (I.4) is satisfied for all n ∈ Z then f is real-valued, is also true under
further assumption on f . It will be shown using convergence results later on.
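Let us perform the same trick of adding the terms in the Fourier series, corresponding to n and −n, as in Example I.4. Denoting $a_n := \hat{f}(n)$, for n ≠ 0, by the above claim, we have
\[ \hat{f}(n) e^{in\theta} + \hat{f}(-n) e^{-in\theta} = a_n e^{in\theta} + \overline{a_n} e^{-in\theta} = a_n e^{in\theta} + \overline{a_n e^{in\theta}} = 2\Re\left( a_n e^{in\theta} \right) = 2 \left( \Re(a_n) \cos(n\theta) - \Im(a_n) \sin(n\theta) \right). \tag{I.5} \]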
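Now $a_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(\theta) e^{-in\theta}\,d\theta$, so that
\[ \Re(a_n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(\theta) \cos(n\theta)\,d\theta \]
whereas
\[ \Im(a_n) = -\frac{1}{2\pi} \int_{-\pi}^{\pi} f(\theta) \sin(n\theta)\,d\theta. \]
Therefore, (I.5) is
\[ \hat{f}(n) e^{in\theta} + \hat{f}(-n) e^{-in\theta} = b_n \cos(n\theta) + c_n \sin(n\theta), \]
where
\[ b_n = 2\Re(a_n) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(\theta) \cos(n\theta)\,d\theta, \tag{I.6} \]
and
\[ c_n = -2\Im(a_n) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(\theta) \sin(n\theta)\,d\theta, \]
all under the assumption n ≠ 0.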
As for n = 0 (that has no partner −n other than itself), we have
\[ a_0 e^{i \cdot 0 \cdot \theta} = a_0 = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(\theta)\,d\theta =: \frac{b_0}{2}, \]
where $b_0$ extends the definition (I.6) to n = 0. Here the denominator 2 manifests the lack of an accompanying sine function to 1 = cos(0 · θ).
All in all, the real form of the Fourier series of an arbitrary integrable real-valued function f : [−π, π] → R is
\[ f(\theta) \sim \frac{b_0}{2} + \sum_{n=1}^{\infty} \left( b_n \cos(n\theta) + c_n \sin(n\theta) \right). \]
Here
\[ \frac{b_0}{2} + \sum_{n=1}^{\infty} b_n \cos(n\theta) \]
is the “even part” of f, whereas
\[ \sum_{n=1}^{\infty} c_n \sin(n\theta) \]
is its “odd part”. In the homework assignment it will be shown that if f is even, then its odd part vanishes, whereas if f is odd, then its even part vanishes. The converse is also true under some further assumption on f, and will be covered later, when the convergence of a Fourier series corresponding to a function is addressed.
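The passage from the complex coefficients $a_n$ to the real coefficients $b_n$, $c_n$ can be checked numerically. This is a sketch of ours: the test function is a hypothetical smooth real-valued 2π-periodic function, and all integrals are approximated by the midpoint rule.

```python
import cmath
import math


def integrate(g, samples=2048):
    """Midpoint rule for the integral of g over [-π, π]."""
    h = 2 * math.pi / samples
    return sum(g(-math.pi + (j + 0.5) * h) for j in range(samples)) * h


# Hypothetical real-valued, 2π-periodic test function.
f = lambda t: math.exp(math.sin(t)) + math.cos(2 * t)


def a(n):
    """Complex Fourier coefficient a_n."""
    return integrate(lambda t: f(t) * cmath.exp(-1j * n * t)) / (2 * math.pi)


def b(n):
    """Cosine coefficient b_n, as in (I.6)."""
    return integrate(lambda t: f(t) * math.cos(n * t)) / math.pi


def c(n):
    """Sine coefficient c_n."""
    return integrate(lambda t: f(t) * math.sin(n * t)) / math.pi


theta = 0.7
for n in (1, 2, 3):
    pair = a(n) * cmath.exp(1j * n * theta) + a(-n) * cmath.exp(-1j * n * theta)
    real_form = b(n) * math.cos(n * theta) + c(n) * math.sin(n * theta)
    print(n, pair, real_form)  # pair is real up to rounding and matches real_form
```

In each row the sum of the n and −n terms has negligible imaginary part and agrees with $b_n \cos(n\theta) + c_n \sin(n\theta)$, as (I.5) predicts for real-valued f.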
Example I.6. $f(\theta) = \frac{1}{4}\theta^2$, θ ∈ [−π, π]. (In this example, on θ ∈ [π, 2π], we have $f(\theta) = \frac{1}{4}(2\pi - \theta)^2$.)
The function $f : S^1 \to \mathbb{R}$ is continuous on the unit circle $S^1$, since f(−π) = f(π) (hence f is continuous as a function on R, extended by 2π-periodicity). We can compute its Fourier coefficients by appealing to integration by parts (this time, twice), and transform the Fourier series to its real form:
\[ f(\theta) \sim \frac{\pi^2}{12} + \sum_{n=1}^{\infty} \frac{(-1)^n \cos(n\theta)}{n^2}. \]
If, indeed, the Fourier series of f converges to f at every point, we can substitute θ = π, so that $\cos(n\pi) = (-1)^n$ and then
\[ \frac{\pi^2}{4} = f(\pi) = \frac{\pi^2}{12} + \sum_{n=1}^{\infty} \frac{(-1)^n \cdot (-1)^n}{n^2} = \frac{\pi^2}{12} + \sum_{n=1}^{\infty} \frac{1}{n^2}, \]
and we can conclude
\[ \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{4} - \frac{\pi^2}{12} = \frac{\pi^2}{6}, \]
a highly nontrivial result on the special value of the Riemann zeta function
\[ \zeta(2) := \sum_{n=1}^{\infty} \frac{1}{n^2}. \]
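The two claims of Example I.6 can be sanity-checked by summing finitely many terms. The sketch below is ours; it only evaluates partial sums and is of course no substitute for the convergence theory developed later in the notes.

```python
import math


def series_at(theta, terms):
    """Partial sum of π²/12 + Σ_{n=1}^{terms} (-1)^n cos(nθ) / n²."""
    total = math.pi ** 2 / 12
    for n in range(1, terms + 1):
        total += (-1) ** n * math.cos(n * theta) / n ** 2
    return total


def zeta2_partial(terms):
    """Partial sum Σ_{n=1}^{terms} 1/n²."""
    return sum(1.0 / n ** 2 for n in range(1, terms + 1))


theta = 1.0
print(series_at(theta, 20000), theta ** 2 / 4)   # series vs. f(θ) = θ²/4
print(zeta2_partial(20000), math.pi ** 2 / 6)    # partial sum vs. π²/6
```

Both partial sums differ from their targets by at most the tail $\sum_{n > N} 1/n^2 \approx 1/N$, consistent with the (at this point still conjectural) pointwise convergence.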
Example I.7. The Dirichlet kernel of degree N ≥ 1:
\[ D_N(x) = \sum_{n=-N}^{N} e^{inx}. \tag{I.7} \]
$D_N$ is a degree-N trigonometric polynomial. It is easy to see, by changing the order of summation and integration, and the orthogonality relation (I.2), that
\[ \widehat{D_N}(n) = \begin{cases} 1 & |n| \le N \\ 0 & |n| > N \end{cases}. \]
Thus $D_N \sim D_N$, i.e. $D_N$ is its own Fourier series.
More generally, if
\[ T_N(x) = \sum_{|n| \le N} a_n e^{inx}, \]
(3) We say that $f_n \xrightarrow[n \to \infty]{L^1} f$ (convergence in $L^1$, alternatively in mean), if
\[ \int_I |f_n - f|\,dx \to 0 \]
as n → ∞.
[Note that the above integral is well-defined by our assumptions on the integrability of $f_n$ and f. Also this implicitly defines a metric topology on the space of Riemann integrable functions, called the $L^1$-metric.]
(4) We say that $f_n \xrightarrow[n \to \infty]{L^2} f$ (convergence in $L^2$, alternatively mean square convergence), if
\[ \int_I |f_n - f|^2\,dx \to 0 \]
as n → ∞.
[This implicitly defines a metric topology on the space of Riemann integrable functions, called the $L^2$-metric.]
It is easy to show (this will be left to a homework assignment) that the following implications hold:
\[ L^\infty \overset{\text{easy}}{\Rightarrow} L^2 \overset{\text{Cauchy-Schwarz}}{\Rightarrow} L^1, \]
and also
\[ L^\infty \overset{\text{easy}}{\Rightarrow} \text{pointwise}, \]
whereas all the other implications do not hold, with counter-examples of each left to a homework assignment.
Since one can always change a value of f at a single point without affecting the Fourier coefficients
at all (and hence neither the Fourier series), for pointwise convergence of the Fourier series to f it
will be essential to impose some stronger assumptions on f other than its mere integrability. One immediate such condition to consider is the continuity of f. Surprisingly, it was shown by Du Bois-Reymond (1873), that there exists a continuous function on $S^1$, such that its Fourier series diverges (at one point). However, further imposing the smoothness of f will mend this problem.
I.2. Uniqueness of Fourier series, and uniform convergence. Decay of Fourier coefficients and smoothness. Suppose that for some class of functions indeed
\[ f(\theta) = \sum_{n=-\infty}^{\infty} \hat{f}(n) e^{in\theta} \]
pointwise (i.e. $S_N(\theta) \to f(\theta)$ pointwise), and g is some other function such that for every n ∈ Z we have $\hat{f}(n) = \hat{g}(n)$. If the Fourier series corresponding to g also converges to g, then one has
\[ g(\theta) = \sum_{n=-\infty}^{\infty} \hat{g}(n) e^{in\theta} = \sum_{n=-\infty}^{\infty} \hat{f}(n) e^{in\theta} = f(\theta). \]
That is, by the convergence assumption, the function is determined by its Fourier coefficients, and if we have two such functions with the same Fourier coefficients, then we can infer f ≡ g.
In particular, we can infer from the above, that if we can’t prove the uniqueness of a function f
with the given Fourier series, then trying to prove the convergence is hopeless. By taking a difference
(f, g) 7→ f − g,
we reduce to proving whether if f : [−π, π] → C is an integrable function such that for all n ∈ Z, $\hat{f}(n) = 0$, then f ≡ 0. Of course, this statement can’t be literally true with no further assumptions, as changing, for example, the values of f at finitely many points has no impact on $\hat{f}(n)$. However, we have the following positive result:
Theorem I.9 (Uniqueness of Fourier series). Let $f : S^1 \to \mathbb{C}$ be an integrable function such that for all n ∈ Z, $\hat{f}(n) = 0$. Then for all continuity points $\theta_0 \in S^1$ of f, one has $f(\theta_0) = 0$. In particular, if $f \in C(S^1)$ (i.e. f is continuous on $S^1$), then f ≡ 0.
It is known that any Riemann integrable function f on $S^1$ is continuous at “most” points of $S^1$, in the sense of Lebesgue measure, which is outside of the scope of this course. Before giving a proof of Theorem I.9 we will first draw an important corollary on pointwise convergence of the Fourier series.
Corollary I.10. Suppose $f \in C(S^1)$, and, moreover, that
\[ \sum_{n=-\infty}^{\infty} |\hat{f}(n)| < +\infty \tag{I.9} \]
(i.e. the series $\sum_{n=-\infty}^{\infty} \hat{f}(n)$ is absolutely convergent). Then
\[ f = \sum_{n=-\infty}^{\infty} \hat{f}(n) e^{in\theta} \]
uniformly on $S^1$, i.e. $(S_N(f))(\theta) \to f(\theta)$ uniformly.
[In particular, for $f(\theta) = \frac{\theta^2}{4}$ of Example I.6 the conditions of Corollary I.10 are all satisfied, hence our theory yields the value $\zeta(2) = \frac{\pi^2}{6}$.] We will see in a bit what is a sufficient (smoothness) condition to impose on f so that $\sum_{n=-\infty}^{\infty} |\hat{f}(n)| < +\infty$ is satisfied.
Proof of Corollary I.10. We use the convergence (I.9) to define the function
\[ g(\theta) = \sum_{n=-\infty}^{\infty} \hat{f}(n) e^{in\theta}; \]
since we may exchange the order of summation and integration by the uniform absolute convergence, and the orthogonality (I.2). Hence by the Uniqueness Theorem I.9 (applied to f − g), f ≡ g, and indeed
\[ f(\theta) = g(\theta) = \sum_{n=-\infty}^{\infty} \hat{f}(n) e^{in\theta} \]
uniformly on $S^1$.
□
Proof of the Uniqueness Theorem I.9. First, assume that f is real-valued, and we will reduce the
general case in the end. We will further assume with no loss of generality, that for f : [−π, π] → R,
θ0 = 0, and, by contradiction, that f (θ0 ) > 0 (otherwise, work with −f (·)).
one has
\[ \int_{S^1} f(\theta) p(\theta)\,d\theta = \sum_{|k| \le N} a_k \int_{S^1} f(\theta) e^{ik\theta}\,d\theta = 2\pi \sum_{|k| \le N} a_k \hat{f}(-k) = 0 \tag{I.10} \]
The range $I_{k,1}$ is where the bulk of the contribution will come from, whereas we are going to show that the contributions of $I_{k,2}$ and $I_{k,3}$ can be ignored ($I_{k,2}$ is non-negative, whereas $I_{k,3}$ is negligible).
In the first range |θ| < η < δ one has $|f(\theta)| > \frac{f(0)}{2} > 0$, and $p_k(\theta) \ge (1 + \epsilon/2)^k$ by (I.13) and the definition (I.14) of $p_k$. Hence
\[ I_{k,1} = \int_{|\theta| < \eta} f(\theta) p_k(\theta)\,d\theta \ge 2\eta \cdot \frac{f(0)}{2} \cdot (1 + \epsilon/2)^k \to +\infty. \tag{I.17} \]
For the next range η < |θ| < δ, we observe that for |θ| < δ, still $f(\theta) \ge \frac{f(0)}{2} \ge 0$, and $p_1(\theta) = \epsilon + \cos(\theta) > 0$ (this is why we cared to choose $\delta < \frac{\pi}{2}$), and so $p_k(\theta) \ge 0$. Therefore, the integrand $f(\theta) \cdot p_k(\theta) \ge 0$ in this range, and, in particular, the integral
\[ I_{k,2} = \int_{\eta < |\theta| < \delta} f(\theta) p_k(\theta)\,d\theta \ge 0. \tag{I.18} \]
Next, for the range δ < |θ| < π, we use the bound (I.15) for f valid on all of $S^1$, whereas for $p_k$ we use the bound $|p_k(\theta)| \le (1 - \epsilon/2)^k$, which is a by-product of the bound (I.12) on $p_1$, valid in the same range. We then use the triangle inequality to bound the corresponding integral
\[ |I_{k,3}| \le \int_{\delta < |\theta| < \pi} |f(\theta)| \cdot |p_k(\theta)|\,d\theta \le 2\pi \cdot B \cdot (1 - \epsilon/2)^k \to 0 \tag{I.19} \]
as k → ∞. Our desired estimate then follows upon consolidating (I.17), (I.18) and (I.19), and inserting these into (I.16).
Finally, we reduce the complex-valued case of $f : S^1 \to \mathbb{C}$ to the already resolved real-valued one. Given a function f(θ) = u(θ) + iv(θ), define
\[ g(\theta) = \bar{f}(\theta) := \overline{f(\theta)} = u(\theta) - iv(\theta) \]
with the Fourier coefficients
\[ \hat{g}(n) = \overline{\hat{f}(-n)}, \]
left as a homework exercise. Then, by assumption, for all n ∈ Z, $\hat{g}(n) = 0$, and so are the Fourier coefficients of
\[ u(\theta) = \frac{f(\theta) + g(\theta)}{2} \]
and
\[ v(\theta) = \frac{f(\theta) - g(\theta)}{2i}. \]
Now we apply the Uniqueness Theorem to the real-valued functions u and v to conclude that, assuming that $\theta_0$ is a continuity point of f (hence of both u and v), $u(\theta_0) = v(\theta_0) = 0$, and then $f(\theta_0) = u(\theta_0) + iv(\theta_0) = 0$.
□
We are now concerned with when the series
\[ \sum_{n=-\infty}^{\infty} |\hat{f}(n)| < +\infty \]
is convergent, i.e. one of the sufficient conditions (I.9) of Corollary I.10 is satisfied. We will find that the rate of decay of $|\hat{f}(n)|$ as |n| → ∞ is intimately related to the smoothness of f. To measure the rate of decay of $|\hat{f}(n)|$ we introduce the following notation.
Notation I.11 (Big O). If $\phi, \psi : \mathbb{Z} \to \mathbb{R}_{>0}$ are two positive functions defined on integers (i.e., two doubly-infinite sequences of positive numbers), we say that ϕ(n) = O(ψ(n)) as |n| → ∞ (“ϕ(n) is a big O of ψ(n)”), if there exists a number C > 0 so that
\[ \phi(n) \le C \cdot \psi(n). \]
The big O-notation could be generalized to more general index sets (other than Z), and convergence of the index to a finite value (instead of |n| → ∞).
Example I.12. Recall that for the Example I.6, we had
\[ |\hat{f}(n)| = O\left( \frac{1}{n^2} \right), \]
and, in particular, in this case
\[ \sum_{n=-\infty}^{\infty} |\hat{f}(n)| < \infty. \]
uniformly on $S^1$. Note that the condition $f \in C^2(S^1)$ is sufficient for the decay (I.20) (and hence for the uniform convergence of the Fourier series to the given function), but not necessary, as it is violated in Example I.6.
We will now prove Lemma I.13 in the case k = 1, the general case following by using mathematical
induction teamed up with the same argument, left to the homework assignment.
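Proof of Lemma I.13, case k = 1. First, if f is continuous, then, by the triangle inequality,
\[ |\hat{f}(n)| = \left| \frac{1}{2\pi} \int_{S^1} f(\theta) e^{-in\theta}\,d\theta \right| \le \frac{1}{2\pi} \int_{S^1} |f(\theta)|\,d\theta = O(1), \]
We have that
\[ \frac{1}{-2\pi i n} f(\theta) e^{-in\theta} \Big|_{-\pi}^{\pi} = 0, \]
by the 2π-periodicity of f (and of $e^{-in\theta}$), and
\[ \frac{1}{2\pi i n} \int_{S^1} f'(\theta) e^{-in\theta}\,d\theta = \frac{1}{in} \widehat{f'}(n), \]
so that
\[ \hat{f}(n) = \frac{1}{in} \widehat{f'}(n), \tag{I.21} \]
which is equivalent to
\[ in \hat{f}(n) = \widehat{f'}(n). \]
For n = 0:
\[ \widehat{f'}(0) = \frac{1}{2\pi} \int_{S^1} f'(\theta)\,d\theta = \frac{1}{2\pi} f(\theta) \Big|_{-\pi}^{\pi} = 0, \]
again, by periodicity. Finally, that
\[ |\hat{f}(n)| = O\left( \frac{1}{|n|} \right) \]
as |n| → ∞ follows directly from (I.21), since $|\widehat{f'}(n)| = O(1)$ is bounded, by the already established case k = 0.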
□
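The identity $in \hat{f}(n) = \widehat{f'}(n)$ from the proof can be observed numerically. The sketch below is ours: it assumes the hypothetical smooth periodic test function $f(\theta) = e^{\cos\theta}$, whose derivative is known in closed form, and approximates the coefficients by the midpoint rule.

```python
import cmath
import math


def coefficient(g, n, samples=512):
    """Midpoint-rule approximation of (1/2π) ∫_{-π}^{π} g(θ) e^{-inθ} dθ."""
    h = 2 * math.pi / samples
    total = 0j
    for j in range(samples):
        t = -math.pi + (j + 0.5) * h
        total += g(t) * cmath.exp(-1j * n * t)
    return total * h / (2 * math.pi)


# Hypothetical smooth 2π-periodic test function and its derivative.
f = lambda t: math.exp(math.cos(t))
f_prime = lambda t: -math.sin(t) * math.exp(math.cos(t))

for n in range(-3, 4):
    # The two printed numbers should agree: coefficient of f' versus (in) * coefficient of f.
    print(n, coefficient(f_prime, n), 1j * n * coefficient(f, n))
```

For n = 0 both columns vanish, matching the separate treatment of that case by periodicity.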
The statement of Lemma I.13 allows, under suitable conditions on f, to infer that if
\[ f \sim \sum_{n=-\infty}^{\infty} \hat{f}(n) e^{in\theta}, \]
then
\[ f^{(k)} \sim \sum_{n=-\infty}^{\infty} (in)^k \hat{f}(n) e^{in\theta}, \]
i.e. to differentiate the Fourier series term by term k times (note that no convergence of the Fourier series of $f^{(k)}$ is asserted). The next result is, in a sense, the converse of Lemma I.13.
Lemma I.14. Conversely, if for some k ≥ 2,
\[ |\hat{f}(n)| = O\left( \frac{1}{|n|^k} \right), \tag{I.22} \]
then $f \in C^{k-2}(S^1)$.
[Notice that Lemma I.14 “loses two powers” of n relative to Lemma I.13. There are some (very difficult and highly acclaimed) refinements of this, e.g. due to L. Carleson, but, in general, Example I.6 shows that from decay of $O\left( \frac{1}{n^2} \right)$ for the Fourier coefficients, one cannot infer even that $f \in C^1(S^1)$.]
The proof of Lemma I.14 is a simple corollary to the following theorem from basic analysis course:
Theorem I.15 (Term by term differentiation). Assume that a series of differentiable functions $f_n : I \to \mathbb{C}$, where I ⊆ R is an interval (finite or infinite), converges pointwise
\[ \sum_n f_n(x) = f(x) \]
x ∈ R.
First, for every x ∈ R, f(y) · g(x − y) is an integrable function on [−π, π] since it is a product of f (which is integrable) with a shifted version g(x − ·) of an integrable function, hence h(x) = (f ∗ g)(x) makes sense. Further, h(·) is 2π-periodic, since g is 2π-periodic. Therefore $h : S^1 \to \mathbb{C}$ is well defined as a function on the circle.
Next, assuming that both f and g are continuous, we can change the variables x − y = z to write
\[ (f * g)(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) g(x - y)\,dy = -\frac{1}{2\pi} \int_{x+\pi}^{x-\pi} f(x - z) g(z)\,dz \]
\[ = \frac{1}{2\pi} \int_{x-\pi}^{x+\pi} g(z) f(x - z)\,dz = \frac{1}{2\pi} \int_{-\pi}^{\pi} g(z) f(x - z)\,dz = (g * f)(x), \]
where we used the 2π-periodicity of both f and g so that the integral is the same along any period. Therefore, at least for f, g continuous, f ∗ g = g ∗ f, with the continuity assumption lifted in a homework assignment (via a “density” argument).
We can view $\frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) g(x - y)\,dy$ as taking a “weighted average” of translations of g, where the weights are according to f. E.g., if f ≡ 1, then
\[ (f * g)(x) \equiv \frac{1}{2\pi} \int_{-\pi}^{\pi} g(y)\,dy \]
is the total mass of g. Convolving functions has a smoothing effect (on both functions).
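Example I.17. Take a (small) number ϵ > 0, and
\[ f(x) = g(x) = \chi_{[-\epsilon,\epsilon]}(x) = \begin{cases} 1 & x \in [-\epsilon, \epsilon] \\ 0 & \text{otherwise} \end{cases}. \]
The convolution (in home assignment) is the triangle function
\[ (f * g)(x) = \begin{cases} \frac{1}{2\pi} (2\epsilon - |x|) & |x| \le 2\epsilon \\ 0 & \text{otherwise} \end{cases}. \]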
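The triangle formula of Example I.17 can be compared against a direct numerical evaluation of the convolution integral. This is a sketch of ours under stated assumptions: ϵ = 0.3 is an arbitrary hypothetical choice (small enough that 2ϵ < π), the integral is approximated by the midpoint rule, and the helper names are not from the notes.

```python
import math


def indicator(x, eps):
    """χ_{[-eps, eps]}, extended 2π-periodically."""
    x = (x + math.pi) % (2 * math.pi) - math.pi  # reduce to [-π, π)
    return 1.0 if abs(x) <= eps else 0.0


def convolve_at(x, eps, samples=20000):
    """(f*g)(x) = (1/2π) ∫_{-π}^{π} f(y) g(x-y) dy with f = g = χ_{[-eps, eps]},
    approximated by the midpoint rule."""
    h = 2 * math.pi / samples
    total = 0.0
    for j in range(samples):
        y = -math.pi + (j + 0.5) * h
        total += indicator(y, eps) * indicator(x - y, eps)
    return total * h / (2 * math.pi)


def triangle(x, eps):
    """Closed form claimed in Example I.17, for |x| ≤ π."""
    return (2 * eps - abs(x)) / (2 * math.pi) if abs(x) <= 2 * eps else 0.0


eps = 0.3
for x in (0.0, 0.1, 0.3, 0.5, 1.0):
    print(x, convolve_at(x, eps), triangle(x, eps))
```

The discontinuous indicator functions produce a continuous, piecewise linear convolution, a first instance of the smoothing effect mentioned above.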
hence if $|x_1 - x_2| < \delta$, then $|(x_1 - y) - (x_2 - y)| = |x_1 - x_2| < \delta$. Therefore, we may use the triangle inequality on (I.24) to bound
\[ |(f * g)(x_1) - (f * g)(x_2)| \le \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(y)| |g(x_1 - y) - g(x_2 - y)|\,dy \le \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(y)| \cdot \epsilon\,dy \]
\[ = \frac{\epsilon}{2\pi} \int_{-\pi}^{\pi} |f(y)|\,dy \le \frac{\epsilon}{2\pi} \cdot 2\pi \cdot B, \]
where B is any bound satisfying |f(x)| ≤ B. Since the number $\frac{\epsilon}{2\pi} \cdot 2\pi \cdot B$ could be made arbitrarily small, that proves that f ∗ g is continuous.
□
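Proof of (5) assuming f, g are continuous. We may, using the continuity, change the order of integration:
\[ \widehat{f * g}(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} (f * g)(x) e^{-inx}\,dx = \frac{1}{2\pi} \int_{-\pi}^{\pi} \left( \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) g(x - y)\,dy \right) e^{-inx}\,dx \]
\[ = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) e^{-iny} \left( \frac{1}{2\pi} \int_{-\pi}^{\pi} g(x - y) e^{-in(x - y)}\,dx \right) dy \]
\[ = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) e^{-iny}\,dy \cdot \hat{g}(n) = \hat{f}(n) \cdot \hat{g}(n). \]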
□
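The multiplicativity $\widehat{f * g}(n) = \hat{f}(n) \hat{g}(n)$ just proved can be tested on a grid. The following sketch is ours; the two smooth 2π-periodic test functions are hypothetical choices, and both the convolution and the coefficients are approximated by the midpoint rule on the same equispaced nodes.

```python
import cmath
import math

M = 256                                   # number of quadrature nodes
H = 2 * math.pi / M
GRID = [-math.pi + (j + 0.5) * H for j in range(M)]


def coefficient(values, n):
    """Midpoint-rule Fourier coefficient from samples of a function on GRID."""
    return sum(v * cmath.exp(-1j * n * x) for v, x in zip(values, GRID)) * H / (2 * math.pi)


def convolve(f, g):
    """Samples on GRID of (f*g)(x) = (1/2π) ∫ f(y) g(x-y) dy (midpoint rule)."""
    return [sum(f(y) * g(x - y) for y in GRID) * H / (2 * math.pi) for x in GRID]


# Hypothetical smooth 2π-periodic test functions.
f = lambda t: math.exp(math.cos(t))
g = lambda t: 1.0 / (2.0 + math.sin(t))

conv = convolve(f, g)
f_vals = [f(x) for x in GRID]
g_vals = [g(x) for x in GRID]
for n in range(-3, 4):
    print(n, coefficient(conv, n), coefficient(f_vals, n) * coefficient(g_vals, n))
```

The two columns agree to many digits, so that convolution on the circle corresponds to pointwise multiplication of Fourier coefficients.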
In order to reduce our statement to continuous functions we will need the following standard result
from analysis.
Lemma I.20 (“Approximation Lemma”). Let f be an integrable function on $S^1$, and
\[ \sup_{x \in S^1} |f(x)| \le B. \]
Then there exists a sequence $\{f_k\}_{k \ge 1}$ of continuous functions $f_k : S^1 \to \mathbb{C}$ so that for all k ≥ 1,
\[ \sup_{x \in S^1} |f_k(x)| \le B, \]
and as k → ∞,
\[ \int_{-\pi}^{\pi} |f(x) - f_k(x)|\,dx \to 0. \]
Proof of (4) for integrable functions. We apply the Approximation Lemma I.20 to obtain sequences $f_k$, $g_k$ respectively, satisfying the postulated properties. We claim that
\[ f_k * g_k \to f * g \]
uniformly on $S^1$, which is sufficient to deduce that f ∗ g is continuous (as a uniform limit of a sequence of continuous functions). We have that
\[ f * g - f_k * g_k = (f - f_k) * g + f_k * (g - g_k), \]
whence we bound
\[ |((f - f_k) * g)(x)| \le \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(y) - f_k(y)| \cdot |g(x - y)|\,dy \le \frac{1}{2\pi} \sup_{z \in S^1} |g(z)| \cdot \int_{-\pi}^{\pi} |f(y) - f_k(y)|\,dy \to 0 \]
I.4. “Good kernels”. Good kernels are a generalization of the $\{p_k\}$ of the proof of the Uniqueness Theorem I.9, which we constructed in order to isolate the behaviour of a function around the origin by convolving it with $p_k$, which is highly peaked at the origin. We are going to construct a family $\{K_n(x)\}_{n=1}^{\infty}$ of functions on the circle, but they could also be indexed by a continuous parameter in place of n. (But for simplicity we will assume n = 1, 2, . . ..)
Definition I.21 (Good kernels). A family of functions $\{K_n\}_{n=1}^{\infty}$, $K_n : S^1 \to \mathbb{C}$ is called a family of good kernels, if the following hold:
a. Unit (signed) weight: for every n ≥ 1,
\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} K_n(x)\,dx = 1. \]
b. Control over fluctuations: there exists a number M > 0 (independent of n) such that for every n ≥ 1 one has
\[ \int_{-\pi}^{\pi} |K_n(x)|\,dx \le M. \]
c. Mass concentration around the origin: for every δ > 0, as n → ∞ one has
\[ \int_{\delta \le |x| \le \pi} |K_n(x)|\,dx \to 0. \]
If Kn (x) ≥ 0, as we will often encounter, then (b) follows automatically from (a). The functions
Kn approximate for large n the “Dirac Delta function”. It makes no sense to ask whether a given
function is a good kernel, but rather when a 1-parameter family of functions is a family of good
kernels, with the parameter tending to some value, that is an accumulation point (in our case, grows
to infinity).
Theorem I.22 (“Good Kernels Theorem”). Let $\{K_n\}$ be a family of good kernels, and $f : S^1 \to \mathbb{C}$ integrable. Then for all continuity points $x \in S^1$ of f we have the limit
\[ \lim_{n \to \infty} (f * K_n)(x) = f(x). \tag{I.25} \]
Intuitively,
\[ (f * K_n)(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x - y) K_n(y)\,dy, \]
and, since $K_n$ concentrates around y = 0, with unit mass,
\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x - y) K_n(y)\,dy \approx \frac{1}{2\delta} \int_{-\delta}^{\delta} f(x - y)\,dy \approx \frac{1}{2\delta} \int_{-\delta}^{\delta} f(x)\,dy = \frac{1}{2\delta} \cdot 2\delta f(x) = f(x), \]
where B > 0 is any number so that |f(·)| ≤ B on $S^1$ (f is bounded, since it is integrable). We then have
\[ \frac{\epsilon}{2\pi} \int_{|y| < \delta} |K_n(y)|\,dy \le \frac{\epsilon}{2\pi} M \]
Recall that (Example I.18) the partial Fourier sums are given by
\[ (S_N(f))(x) = (f * D_N)(x), \]
where
\[ D_N(x) = \sum_{n=-N}^{N} e^{inx} = \frac{\sin((N + 1/2)x)}{\sin(x/2)} \]
is the Dirichlet kernel of degree N (see (I.8)). It immediately raises the question whether $\{D_N\}_{N \ge 1}$ is a family of good kernels. For if it is, then the Good Kernels Theorem I.22 would immediately yield the convergence $(S_N(f))(x) \to f(x)$ as N → ∞ at every continuity point x of f. Let us test the properties (a)-(c) of good kernels.
a. First,
\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} D_N(x)\,dx = 1, \tag{I.27} \]
seen by using the expression (I.7), with only the integral corresponding to n = 0 not vanishing. Hence (a) is satisfied.
b. Unfortunately, it is possible to show that
\[ \int_{-\pi}^{\pi} |D_N(x)|\,dx \ge c \log N \tag{I.28} \]
for some c > 0, so (b) is violated due to the fluctuations of $D_N$, which increase with N; the proof is relegated to a guided exercise in a homework assignment.
We conclude that, somewhat disappointingly, $\{D_N\}_{N \ge 1}$ is not a family of good kernels. [In fact, using the estimate (I.28), it is possible to construct a sequence of continuous functions $f_N : S^1 \to \mathbb{R}$ such that $f_N(0) = 0$ but $|(S_N f_N)(0)| \ge c' \log N$ with some c′ > 0. In particular, it is possible to construct a function $f : S^1 \to \mathbb{R}$ so that there exists a point $x_0 \in S^1$ such that $(S_N f)(x_0) \to \infty$, so that it is not converging to $f(x_0)$.]
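The contrast between properties (a) and (b) for the Dirichlet kernel is visible numerically. The sketch below is ours: it uses the closed form $D_N(x) = \sin((N + 1/2)x)/\sin(x/2)$ and the midpoint rule, with an even number of nodes so that x = 0 is never sampled; the sample count is an arbitrary hypothetical choice.

```python
import math


def dirichlet(N, x):
    """Closed form D_N(x) = sin((N + 1/2)x) / sin(x/2), for x not a multiple of 2π."""
    return math.sin((N + 0.5) * x) / math.sin(x / 2)


def midpoint_integral(g, samples=20000):
    """Midpoint rule for ∫_{-π}^{π} g(x) dx; with an even sample count
    no node coincides with x = 0."""
    h = 2 * math.pi / samples
    return sum(g(-math.pi + (j + 0.5) * h) for j in range(samples)) * h


for N in (1, 10, 100, 1000):
    mean = midpoint_integral(lambda x: dirichlet(N, x)) / (2 * math.pi)
    l1 = midpoint_integral(lambda x: abs(dirichlet(N, x)))
    # mean stays equal to 1 (property (a)), while the L^1 norm keeps growing.
    print(N, round(mean, 6), round(l1, 4), round(l1 / math.log(N + 1), 4))
```

The signed integral is pinned at 1 for every N, whereas the integral of $|D_N|$ increases without bound, roughly proportionally to log N, in line with (I.28).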
I.5. Cesàro summability. We will be able to establish the convergence ($(S_N f)(x) \to f(x)$) in a different, weaker, sense. Recall that if $\sum_{k=0}^{\infty} c_k$ is a series of (complex) numbers $c_k \in \mathbb{C}$, then we consider
\[ s_n := \sum_{k=0}^{n} c_k, \]
and say that $\sum_{k=0}^{\infty} c_k = S$, if
\[ s_n \xrightarrow[n \to \infty]{} S. \]
Instead, we consider the Cesàro means of $\{s_n\}_{n \ge 1}$:
\[ \sigma_N := \frac{\sum_{n=0}^{N-1} s_n}{N} = \frac{s_0 + s_1 + \ldots + s_{N-1}}{N}, \]
also called the N’th Cesàro sums of the series $\sum_{k=0}^{\infty} c_k$.
Definition I.23. We say that the series $\sum_{k=0}^{\infty} c_k$ is Cesàro summable to σ ∈ C, if
\[ \lim_{N \to \infty} \sigma_N = \sigma. \]
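A standard illustration of Definition I.23 (our choice of example, not taken from the notes) is the series 1 − 1 + 1 − 1 + . . ., whose partial sums oscillate between 1 and 0 and hence do not converge, while the Cesàro means settle at 1/2. A minimal sketch:

```python
def cesaro_means(terms):
    """Given c_0, ..., c_{N-1}, return [sigma_1, ..., sigma_N], where
    sigma_m = (s_0 + ... + s_{m-1}) / m and s_n = c_0 + ... + c_n."""
    means = []
    partial = 0.0   # current partial sum s_k
    running = 0.0   # s_0 + ... + s_k
    for k, c in enumerate(terms):
        partial += c
        running += partial
        means.append(running / (k + 1))
    return means


grandi = [(-1) ** k for k in range(1000)]   # 1, -1, 1, -1, ...
means = cesaro_means(grandi)
print(means[:6])     # 1.0, 0.5, 0.666..., 0.5, 0.6, 0.5
print(means[-1])     # sigma_1000
```

Here $\sigma_N$ equals 1/2 for even N and (N + 1)/(2N) for odd N, so the series is Cesàro summable to 1/2 although it diverges in the usual sense.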
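\[ |\sin(x/2)| \ge c_\delta > 0 \]
for δ ≤ |x| ≤ π (e.g. take $c_\delta = \sin(\delta/2)$, and use the fact that the sine is monotone increasing on [0, π/2]). Then
\[ \int_{\delta \le |x| \le \pi} |F_N(x)|\,dx \le \frac{1}{N} \int_{\delta \le |x| \le \pi} \frac{dx}{c_\delta^2} \le \frac{2\pi}{N c_\delta^2} \to 0 \]
as N → ∞.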
□
Hence, in light of the Good Kernels Theorem I.22, we have the following result, bearing in mind that we have already seen that $\sigma_N(f) = F_N * f$:
[In particular, we rediscover the Uniqueness Theorem I.9: if f is integrable on $S^1$, and for all n ∈ Z we have $\hat{f}(n) = 0$, then $(\sigma_N(f))(x) \equiv 0$, so, by Fejér’s Theorem I.28, f(x) = 0 for all points of continuity $x \in S^1$.]
[That also shows the converse implications in the statements: If f is continuous, then the following holds. 1. $f : S^1 \to \mathbb{R}$ is even, if and only if for all n ≥ 1, $c_n = 0$. 2. $f : S^1 \to \mathbb{R}$ is odd, if and only if for all n ≥ 0, $b_n = 0$.]
We have an important corollary that we are going to find very useful later:
[Unfortunately, the degree of P will in general grow as ϵ → 0. This also implies approximation, in the $L^1$-sense, of integrable functions by combining with the Approximation Lemma I.20.]
I.6. Abel summability and Poisson Kernel. Steady-state heat equation. Given a series $\sum_{k=0}^{\infty} c_k$ of complex numbers, let A(r) : [0, 1) → C be defined as
\[ A(r) := \sum_{k=0}^{\infty} c_k r^k \]
(Taylor series on [0, 1) corresponding to Taylor coefficients $\{c_k\}_{k \ge 0}$). We say that $\sum_{k=0}^{\infty} c_k$ is Abel summable to s, if: for every r ∈ [0, 1) the series $\sum_{k=0}^{\infty} c_k r^k$ converges and
\[ \lim_{r \to 1^-} A(r) = s. \]
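To see the definition of Abel summability at work, consider again the series 1 − 1 + 1 − . . . (our example, not from the notes). Here $A(r) = \sum_k (-r)^k = \frac{1}{1+r}$ for r ∈ [0, 1), which tends to 1/2 as r → 1−. The sketch below truncates the power series at a hypothetical cutoff that is harmless for the values of r used.

```python
def abel_mean(coeff, r, terms=50000):
    """Truncated A(r) = Σ_{k < terms} c_k r^k, where c_k = coeff(k).
    The truncation is negligible as long as r**terms is tiny."""
    total = 0.0
    power = 1.0   # r^k
    for k in range(terms):
        total += coeff(k) * power
        power *= r
    return total


grandi = lambda k: (-1) ** k   # coefficients of 1 - 1 + 1 - ...

for r in (0.5, 0.9, 0.99, 0.999):
    print(r, abel_mean(grandi, r), 1 / (1 + r))   # numeric vs. closed form
```

As r increases towards 1 the values approach 1/2, so the series is Abel summable to 1/2, the same value obtained from its Cesàro means.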
□
Lemma I.31 teamed up with the Good Kernels Theorem I.22 yields the following result (which, in
light of Fact I.30 already follows from Fejér’s Theorem I.28):
Theorem I.32 (Abel’s Theorem). The Fourier series of an integrable function $f : S^1 \to \mathbb{C}$ is Abel summable to f(x) at every point of continuity $x \in S^1$ of f. If f is continuous on $S^1$, then the Fourier series of f is uniformly Abel summable to f.
As an application of Abel summability, we are going to solve the steady-state heat equation.
Consider $\mathbb{R}^2$ to be an (infinite) metal plate, and for $(x, y) \in \mathbb{R}^2$ and t ≥ 0, let u(x, y, t) be the temperature at (x, y) at time t. Suppose that u(·, ·, 0) is given, and we are interested in the heat propagation, i.e. the evolution of u(x, y, t), t > 0. Let $S = S(x_0, y_0; h)$ for $(x_0, y_0) \in \mathbb{R}^2$ be the side-h square centred at $(x_0, y_0)$, and assume that h is small (infinitesimal). Then
\[ H(t) := \sigma \cdot \iint_S u(x, y, t)\,dx\,dy \]
with some σ > 0 constant associated to the metal plate, is the total heat on S.
The Newton law of cooling states that the heat flows from higher to lower temperature, at the rate proportional to the difference of temperatures at two nearby points, i.e. the gradient. On the one hand,
\[ \frac{dH}{dt} = \sigma \cdot \iint_S \frac{\partial u}{\partial t}(x, y, t)\,dx\,dy \approx \sigma h^2 \cdot \frac{\partial u}{\partial t}(x_0, y_0, t), \tag{I.32} \]
whereas on the other hand, by Newton, we can estimate the amount of heat travelling through each of the 4 sides of the square, entering or escaping it:
\[ \frac{dH}{dt} \approx \kappa h \left[ \frac{\partial u}{\partial x}(x_0 + h/2, y_0, t) - \frac{\partial u}{\partial x}(x_0 - h/2, y_0, t) + \frac{\partial u}{\partial y}(x_0, y_0 + h/2, t) - \frac{\partial u}{\partial y}(x_0, y_0 - h/2, t) \right], \tag{I.33} \]
with κ > 0 the conductivity of the material. We can now approximate
\[ \frac{\partial u}{\partial x}(x_0 + h/2, y_0, t) - \frac{\partial u}{\partial x}(x_0 - h/2, y_0, t) \approx h \cdot \frac{\partial^2 u}{\partial x^2}(x_0, y_0, t), \]
and
\[ \frac{\partial u}{\partial y}(x_0, y_0 + h/2, t) - \frac{\partial u}{\partial y}(x_0, y_0 - h/2, t) \approx h \cdot \frac{\partial^2 u}{\partial y^2}(x_0, y_0, t), \]
so that (I.33) reads
\[ \frac{dH}{dt} \approx \kappa h^2 \left( \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} \right). \tag{I.34} \]
LECTURE NOTES FOURIER ANALYSIS 25
where an = fb(n) are the Fourier coefficients of f . Since einθ = 1 and the {an } are bounded, say by
|an | ≤ C, to check that u ∈ C 2 (D), it is sufficient to check that for every r < ρ < 1, the series of the
possible second derivatives (either w.r.t. r or w.r.t. θ, or mixed), majorized by
∞
X ∞
X
2 |n|
|an |n r <C· n2 ρ|n| < +∞
n=−∞ n=−∞
is uniformly convergent in the disc r < ρ, which is indeed the case. To check that ∆u = 0, we can
evaluate the Laplacian in the polar coordinate (homework assignment) to be
∂ 2 u 1 ∂u 1 ∂ 2u
∆u = + · + · , (I.37)
∂r2 r ∂r r2 ∂θ2
whence
∞
∂ 2u X
= an |n|(|n| − 1)r|n|−2 einθ , (I.38)
∂r2 n=−∞
∞
1 ∂u X
· = an |n|r|n|−2 einθ , (I.39)
r ∂r n=−∞
∞
1 ∂ 2u X
· = an (−n2 )r|n|−2 einθ , (I.40)
r2 ∂θ2 n=−∞
justified by the said uniform absolute convergence for r < ρ < 1.
Finally, summing up (I.38), (I.39) and (I.40), and substituting these into (I.37), we obtain, using
the elementary identity
\[ |n|(|n| - 1) + |n| - n^2 = n^2 - |n| + |n| - n^2 = 0, \]
that ∆u ≡ 0 on D, as claimed. That the second assertion of Theorem I.33 holds is then a direct
implication of Abel’s Theorem I.32. The uniqueness of a solution u(·, ·) satisfying (1) and (2) of
Theorem I.33, subject to f continuous on S 1 , is out of the scope of this course.
□
I.7. Mean-square convergence of Fourier series. Our main goal in this section is to prove the
following remarkable result:
Theorem I.34. Suppose that f is integrable on S 1 . Then, as N → ∞,
\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(\theta) - (S_N(f))(\theta)|^2\,d\theta \to 0. \]
Conversely, if V is over R, we can recover the original inner product it came from via
\[ \langle v, w \rangle = \frac{1}{4} \left( \|v + w\|^2 - \|v - w\|^2 \right), \tag{I.43} \]
if ∥ · ∥ is indeed associated to an inner product via (I.42).
If V is over C, then only
\[ \Re(\langle v, w \rangle) = \frac{1}{4} \left( \|v + w\|^2 - \|v - w\|^2 \right) \]
holds, but it is still possible to recover the inner product (see homework assignment). [However, ∥ · ∥1
is a norm, but, with this norm, (I.43) does not recover any inner product.]
Definition I.37 (Orthogonality). Let V be an inner product vector space. We say that v, w ∈ V
are orthogonal, if ⟨v, w⟩ = 0.
Lemma I.38. Let V be an inner product vector space.
(1) Pythagorean Theorem: If v, w ∈ V are orthogonal, then
∥v + w∥2 = ∥v∥2 + ∥w∥2 .
(2) Cauchy-Schwarz: For every v, w ∈ V one has the inequality
|⟨v, w⟩| ≤ ∥v∥ · ∥w∥.
(3) Triangle inequality: For every v, w ∈ V ,
∥v + w∥ ≤ ∥v∥ + ∥w∥.
Definition I.39 (Hilbert space). An inner product space (V, ⟨·, ·⟩) (with ⟨·, ·⟩ positive definite) is a
Hilbert space if it is complete, i.e. for every sequence {vn } ⊆ V of elements of V , if {vn } is Cauchy,
then it converges to an element v ∈ V. [A sequence {vn } is Cauchy, if for every ϵ > 0 there exists a
number N > 0 such that for all n, m > N , one has ∥vn − vm ∥ < ϵ. A sequence {vn } converges to v, if
for every ϵ, there exists a number N > 0 so that for all n > N , ∥vn − v∥ < ϵ. A convergent sequence
is automatically Cauchy.]
Examples I.40. (1) V = l2 (Z) is a Hilbert space, see homework assignment. Here a sequence
of elements {αn }n = {(αn,i )i∈Z }n converges to an element α0 = (α0,i ) ∈ V, if
\[ \sum_{i=-\infty}^{\infty} |\alpha_{n,i} - \alpha_{0,i}|^2 \to 0. \]
and, since for $n > \frac{1}{\delta}$, $\int_{\delta}^{2\pi} |f - f_n|^2\,d\theta = 0$ and
\[ \int_{\delta}^{2\pi} |f_n - g|^2\,d\theta \le \int_{0}^{2\pi} |f_n - g|^2\,d\theta \to 0 \]
But that implies that g is unbounded (since otherwise, if g is bounded by B, say, then $\int_{\delta}^{2\pi} |f - g|^2\,d\theta$
would have a positive contribution around 0 where |f(θ)| > B), whence it is not integrable.
To resolve this setback, we use the aforementioned completion procedure to complete R; the
resulting space L2 (S 1 ) is a Hilbert space. However, that requires Lebesgue integration, and is outside
the scope of this course. Since we restrict ourselves to Riemann integrable functions, we will
be perfectly content with R, with its disadvantages.
Towards the proof of the mean square convergence Theorem I.34. As it was mentioned,
we will work with the space R of integrable functions f : S 1 → C, endowed with the inner product
(I.41), and the corresponding norm
\[ \|f\|^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(\theta)|^2\,d\theta. \]
as N → ∞.
Towards the proof of the convergence in mean square, let for n ∈ Z the function (vector in R)
$e_n : S^1 \to \mathbb{C}$ be defined by $e_n(\theta) := e^{in\theta}$. Obviously, $e_n \in R$ (in fact, the $e_n$ are analytic), and for
every n, m ∈ Z we have the orthogonality relations
\[ \langle e_n, e_m \rangle = \delta_{nm} = \begin{cases} 0 & n \ne m \\ 1 & n = m \end{cases}. \]
Therefore, E := {en }n∈Z is an orthonormal system of (infinitely many) vectors in R. The meaning
of the mean square convergence is that E is sufficiently “rich”, i.e. spans the entire R. We identify
$\hat f(n) = \langle f, e_n \rangle$, and
\[ S_N(f) = \sum_{n=-N}^{N} \langle f, e_n \rangle e_n \]
is the projection of f onto the subspace RN of R spanned by {en }|n|≤N .
Lemma I.42. Let f : S 1 → C, f ∈ R, and N ≥ 1.
(1) We have
\[ \|f\|^2 = \|f - S_N(f)\|^2 + \sum_{|n| \le N} |\hat f(n)|^2. \tag{I.44} \]
and, upon substituting (I.46) into (I.47), we obtain (I.44), concluding the proof of (1) of Lemma I.42.
To prove (2) of Lemma I.42 we reuse the fact that f − SN (f ) is orthogonal to every trigonometric
polynomial gN of degree ≤ N , and write the orthogonal decomposition
f − gN = (f − SN (f )) + (SN (f ) − gN )
Figure 1. The vector SN is the orthogonal projection of f onto the space spanned by {en }|n|≤N .
of f − gN . This implies
\[ \|f - g_N\|^2 = \|f - S_N(f)\|^2 + \|S_N(f) - g_N\|^2 \ge \|f - S_N(f)\|^2, \]
which is (2) of Lemma I.42.
□
We are now approaching the proof of the Mean Square Convergence Theorem I.34, asserting that
∥f − SN (f )∥ → 0 as N → ∞. To this end we will need two reminders:
(1) Approximation Lemma I.20: For every f ∈ R such that for some B > 0, we have $\sup_{\theta \in S^1} |f(\theta)| \le B$,
there exists a sequence $\{f_n\}_{n \ge 1}$ of continuous functions $f_n : S^1 \to \mathbb{C}$ such that for every
n ≥ 1, $\sup_{\theta \in S^1} |f_n(\theta)| \le B$ and, as n → ∞,
\[ \int_{-\pi}^{\pi} |f_n(\theta) - f(\theta)|\,d\theta \to 0 \]
so that $\|f - g\| < \frac{\epsilon}{2}$.
Now apply (2) on g to obtain a number N > 0 sufficiently large and a function $g_N \in R_N$, so that
$\|g - g_N\| < \frac{\epsilon}{2}$. Thus, by the triangle inequality, given f ∈ R, we have found a number N sufficiently
large and a function $g_N \in R_N$, so that $\|f - g_N\| < \epsilon$.
□
Proof of Theorem I.34. Given f ∈ R and ϵ > 0, apply Corollary I.43 to obtain N > 0 sufficiently
large and gN ∈ RN , such that ∥f − gN ∥ < ϵ. However, by the Best Approximation Lemma I.42(2),
we obtain
∥f − SN (f )∥ ≤ ∥f − gN ∥ < ϵ,
concluding the proof of Theorem I.34.
□
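The following is a minimal numerical sketch of the mean-square convergence, not part of the original notes. It assumes the hypothetical test function f(θ) = θ on [−π, π), approximates the Fourier coefficients with a midpoint rule, and evaluates ∥f − SN (f )∥² through the identity (I.44). For this particular f one has |fb(n)|² = 1/n² for n ≠ 0, so the exact error is π²/3 − 2(1 + 1/4 + ... + 1/N²), which the code prints for comparison. The helper names are illustrative only.

```python
import cmath
import math

def fourier_coefficient(f, n, samples=2048):
    """Midpoint-rule approximation of (1/2pi) * integral over [-pi, pi] of f(t) e^{-int} dt."""
    h = 2 * math.pi / samples
    total = 0j
    for j in range(samples):
        theta = -math.pi + (j + 0.5) * h
        total += f(theta) * cmath.exp(-1j * n * theta)
    return total * h / (2 * math.pi)

def norm_squared(f, samples=2048):
    """Midpoint-rule approximation of ||f||^2 = (1/2pi) * integral of |f|^2."""
    h = 2 * math.pi / samples
    values = (abs(f(-math.pi + (j + 0.5) * h)) ** 2 for j in range(samples))
    return sum(values) * h / (2 * math.pi)

def mean_square_error(f, N):
    """||f - S_N(f)||^2, computed as ||f||^2 minus the sum of |f^(n)|^2 over |n| <= N, as in (I.44)."""
    captured = sum(abs(fourier_coefficient(f, n)) ** 2 for n in range(-N, N + 1))
    return norm_squared(f) - captured

def sawtooth(theta):
    # Hypothetical test function, chosen only for illustration.
    return theta

errors = {N: mean_square_error(sawtooth, N) for N in (1, 4, 16)}
for N, err in errors.items():
    exact = math.pi ** 2 / 3 - 2 * sum(1 / n ** 2 for n in range(1, N + 1))
    print(N, round(err, 5), round(exact, 5))
```

The printed errors decrease with N but only at rate about 2/N, reflecting the slow decay of the coefficients of a discontinuous function.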
We have the following further results.
Theorem I.44. (1) Parseval’s identity: for every f ∈ R one has $\sum_{n \in \mathbb{Z}} |\hat f(n)|^2 < \infty$, and
\[ \|f\|^2 = \sum_{n \in \mathbb{Z}} |\hat f(n)|^2. \]
Parseval’s identity shows that {en }n∈Z is sufficiently rich to span the entire space R. If we omitted
any of the en , the result would be a mere inequality in place of Parseval’s identity.
Proof. Recall that, by the orthogonality of f − SN (f ) and SN (f ), and the orthogonal decomposition
(I.45), we have the equality (I.44), i.e., that
\[ \|f\|^2 = \|f - S_N(f)\|^2 + \sum_{n=-N}^{N} |\hat f(n)|^2. \]
[Unlike what we had for smooth functions, Riemann-Lebesgue does not prescribe the rate of decay.]
I.8. Pointwise convergence.
Theorem I.46 (Pointwise Convergence Theorem). Let f : S 1 → C be integrable, and θ0 ∈ S 1 such
that f is differentiable at θ0 . Then, as N → ∞,
(SN (f ))(θ0 ) → f (θ0 ).
It is possible to construct a continuous function f : S 1 → C and θ0 ∈ S 1 such that (SN (f ))(θ0 )
diverges as N → ∞ (Fejér). However, necessarily (SN (f ))(θ0 ) → f (θ0 ) for almost all θ0 ∈ S 1 (L.
Carleson, 1966).
Proof. Define the auxiliary function F : S 1 → C:
\[ F(t) = \begin{cases} \dfrac{f(\theta_0 - t) - f(\theta_0)}{t} & t \in (-\pi, \pi) \setminus \{0\} \\ -f'(\theta_0) & t = 0 \\ \text{doesn't matter as long as it is periodic} & t = \pm\pi \end{cases}. \]
Then, for every δ > 0, F is continuous outside of (−δ, +δ), hence is integrable there, and, moreover,
F is bounded, since we assumed that f is differentiable at θ0 . Therefore, by HW1 Q1, F is integrable
on $S^1$. We then use $S_N(f) = f * D_N$, and bear in mind that $\frac{1}{2\pi} \int_{-\pi}^{\pi} D_N(t)\,dt = 1$ to write
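\[ (S_N(f))(\theta_0) - f(\theta_0) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(\theta_0 - t) D_N(t)\,dt - f(\theta_0) = \frac{1}{2\pi} \int_{-\pi}^{\pi} (f(\theta_0 - t) - f(\theta_0)) D_N(t)\,dt = \frac{1}{2\pi} \int_{-\pi}^{\pi} F(t) \cdot t D_N(t)\,dt. \tag{I.48} \]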
Now recall that DN is explicitly given by $D_N(t) = \frac{\sin((N + 1/2)t)}{\sin(t/2)}$, so that (I.48) is
\[ (S_N(f))(\theta_0) - f(\theta_0) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{F(t)t}{\sin(t/2)} \cdot \sin((N + 1/2)t)\,dt. \tag{I.49} \]
Since $\frac{t}{\sin(t/2)}$ is continuous, the function $g(t) := \frac{F(t)t}{\sin(t/2)}$ is integrable, therefore it is then tempting to
apply the Riemann-Lebesgue Lemma (Theorem I.45) on g directly to deduce that the r.h.s. of (I.49)
vanishes as N → ∞. However one is not able to infer the vanishing of the r.h.s. of (I.49) by a direct
application of Theorem I.45, as N + 1/2 is not an integer. Instead, we write
sin((N + 1/2)t) = sin(N t) cos(t/2) + sin(t/2) cos(N t),
and argue that
\[ (S_N(f))(\theta_0) - f(\theta_0) = \frac{1}{2\pi} \left[ \int_{-\pi}^{\pi} F(t)\,t \cos(Nt)\,dt + \int_{-\pi}^{\pi} F(t) \cdot \frac{t \cos(t/2)}{\sin(t/2)} \sin(Nt)\,dt \right], \]
whence both summands vanish via two separate applications of Theorem I.45.
□
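As an illustration of Theorem I.46 (a sketch under stated assumptions, not taken from the notes), consider again the hypothetical function f(θ) = θ on (−π, π), whose partial sums can be written in real form as SN (f )(θ) = 2 Σ (−1)^(n+1) sin(nθ)/n, the sum running over 1 ≤ n ≤ N. The function is differentiable at θ0 = 1, so the theorem predicts convergence there. At the jump θ0 = π the theorem does not apply, and the partial sums all vanish instead of approaching a value of f.

```python
import math

def partial_sum_sawtooth(theta, N):
    """S_N(f)(theta) for f(theta) = theta on (-pi, pi), written in real (sine) form."""
    return 2 * sum((-1) ** (n + 1) * math.sin(n * theta) / n for n in range(1, N + 1))

theta0 = 1.0  # a point where f is differentiable
for N in (10, 100, 1000):
    # The difference S_N(f)(theta0) - f(theta0) shrinks as N grows.
    print(N, partial_sum_sawtooth(theta0, N) - theta0)

# At the discontinuity theta = pi every term sin(n * pi) vanishes (up to rounding).
jump_value = partial_sum_sawtooth(math.pi, 1000)
print(jump_value)
```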
Figure 2. If the shape is not convex, then it should be easy to improve it by adding
to its area without adding to its perimeter.
If we are to guess what attributes the “optimal” shape should have, then, first, it should be convex.
For if it is not convex, it should be possible to “improve” the shape by adding to the area of the
enclosed oval, without increasing its length, as illustrated in Figure 2. Also, any straight segment in
the boundary of the oval could be made curvier so as to add area. Therefore, a circle is a reasonable
candidate to maximize the area of the enclosed oval, given the perimeter length.
The question is whether this intuition could be made into a rigorous theorem.
Before being able to address the given problem by Fourier analysis methods, we will have to give
it a formal mathematical treatment.
and study how its length and enclosed area are related to those of γ. Details are left as a homework
assignment. □
Proof of Theorem II.4 assuming ℓ = 2π. Let γ(s) = (x(s), y(s)) : [0, 2π] → R2 be a (2π-periodic)
arc-length parametrization of Γ, so that
x′ (s)2 + y ′ (s)2 ≡ 1 (II.2)
on s ∈ [0, 2π]. Let $a_n = \hat x(n)$ and $b_n = \hat y(n)$ be the Fourier coefficients of x(·) and y(·) respectively,
so that the Fourier coefficients of their derivatives are $\widehat{x'}(n) = i n a_n$ and $\widehat{y'}(n) = i n b_n$.
An application of Parseval’s identity (Theorem I.44) on x′ (·) yields
\[ \frac{1}{2\pi} \int_0^{2\pi} x'(s)^2\,ds = \sum_{n=-\infty}^{\infty} |\widehat{x'}(n)|^2 = \sum_{n=-\infty}^{\infty} n^2 |a_n|^2, \]
Note that, since both x and y are real-valued, it follows (in fact, it is equivalent) that for every n ∈ Z,
$a_{-n} = \overline{a_n}$ and $b_{-n} = \overline{b_n}$. Recall the inner product ⟨·, ·⟩ on R, and that, by Parseval’s Theorem I.44(1),
the Fourier map preserves the inner product, i.e. for f, g ∈ R
\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} f(s) \overline{g(s)}\,ds = \sum_{n=-\infty}^{\infty} \hat f(n) \cdot \overline{\hat g(n)}. \]
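Applying this identity with f (·) = x(·) and g = y ′ (·) yields the equality
\[ \frac{1}{2\pi} \int_0^{2\pi} x(s) y'(s)\,ds = \frac{1}{2\pi} \int_0^{2\pi} x(s) \overline{y'(s)}\,ds = \langle x, y' \rangle = \sum_{n=-\infty}^{\infty} a_n \cdot \overline{i n b_n} = -i \sum_{n=-\infty}^{\infty} n a_n \cdot \overline{b_n}, \tag{II.5} \]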
and, similarly,
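\[ \frac{1}{2\pi} \int_0^{2\pi} y(s) x'(s)\,ds = \langle y, x' \rangle = \sum_{n=-\infty}^{\infty} b_n \cdot \overline{i n a_n} = -i \sum_{n=-\infty}^{\infty} n \overline{a_n} \cdot b_n. \tag{II.6} \]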
Now we insert (II.5) and (II.6) into Green’s formula (II.4) to write
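\[ A = \pi \left| \sum_{n=-\infty}^{\infty} n (a_n \overline{b_n} - \overline{a_n} b_n) \right| = 2\pi \left| \sum_{n=-\infty}^{\infty} n \Im(a_n \cdot \overline{b_n}) \right|, \]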
thanks to (II.3). This concludes the first statement of Theorem II.4, i.e. that the inequality (II.1)
holds.
Next, we need to classify the curves, for which (II.1) is an equality. To this end, we observe that
for the equality in (II.7) to hold, it is necessary that for all n with |n| ≥ 2, one has an = bn = 0 (as
here the strict inequality $|n| < n^2$ holds). Hence, if A = π, then
\[ x(s) = a_{-1} e^{-is} + a_0 + a_1 e^{is}, \]
\[ y(s) = b_{-1} e^{-is} + b_0 + b_1 e^{is}. \]
Since both x, y are real-valued, $a_{-1} = \overline{a_1}$, $b_{-1} = \overline{b_1}$, and (II.3) reads $2(|a_1|^2 + |b_1|^2) = 1$, and since, by
(II.7) and the argument above it, we must also have
\[ |\Im(a_1 \cdot \overline{b_1})| = |a_1| \cdot |b_1| = \frac{|a_1|^2 + |b_1|^2}{2}, \tag{II.8} \]
it follows that
\[ \begin{cases} |a_1|^2 + |b_1|^2 = \frac{1}{2} \\ |a_1|^2 \cdot |b_1|^2 = \frac{1}{16}. \end{cases} \]
Therefore, we have $|a_1| = |b_1| = \frac{1}{2}$, and so we can write $a_1 = \frac{1}{2} e^{i\alpha}$, $b_1 = \frac{1}{2} e^{i\beta}$ for some α, β ∈ [0, 2π).
In light of the above, we have
\[ \begin{cases} x(s) = a_0 + \Re(e^{i(\alpha + s)}) = a_0 + \cos(\alpha + s) \\ y(s) = b_0 + \Re(e^{i(\beta + s)}) = b_0 + \cos(\beta + s). \end{cases} \]
Finally, we claim that cos(β + s) = ± sin(α + s), so that Γ is a unit circle centred at $(a_0, b_0) \in \mathbb{R}^2$.
To this end, we recall (II.8), so that
\[ \frac{1}{4} |\sin(\alpha - \beta)| = \frac{1}{4} |\Im(e^{i(\alpha - \beta)})| = |\Im(a_1 \cdot \overline{b_1})| = |a_1 b_1| = \frac{1}{4}. \]
Hence $|\sin(\alpha - \beta)| = 1$, then $\alpha - \beta \in \pi \cdot \mathbb{Z} + \frac{\pi}{2}$, i.e., it is an odd multiple of $\frac{\pi}{2}$, and then cos(β + s) =
± sin(α + s), which, as it was mentioned above, finally yields that Γ is a unit circle.
□
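The inequality can be sanity-checked numerically. The sketch below is an illustration rather than part of the proof: it approximates a closed curve by an inscribed polygon, computes its length ℓ and enclosed area A (via the shoelace formula), and evaluates the deficit ℓ² − 4πA, which is the scale-free form of the isoperimetric inequality. The two test curves (a unit circle and an ellipse with semi-axes 2 and 1) and all function names are assumptions made for the example.

```python
import math

def length_and_area(curve, samples=20000):
    """Length and enclosed area of the polygon through curve(t), t = 2*pi*k/samples."""
    pts = [curve(2 * math.pi * k / samples) for k in range(samples)]
    length = 0.0
    twice_area = 0.0
    for k in range(samples):
        (x0, y0), (x1, y1) = pts[k], pts[(k + 1) % samples]
        length += math.hypot(x1 - x0, y1 - y0)
        twice_area += x0 * y1 - x1 * y0  # shoelace term
    return length, abs(twice_area) / 2

def isoperimetric_deficit(curve):
    """length^2 - 4*pi*area; nonnegative, and zero only for circles."""
    length, area = length_and_area(curve)
    return length ** 2 - 4 * math.pi * area

def circle(t):
    return (math.cos(t), math.sin(t))

def ellipse(t):
    return (2 * math.cos(t), math.sin(t))

print(isoperimetric_deficit(circle))   # close to 0
print(isoperimetric_deficit(ellipse))  # clearly positive
```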
II.2. Weyl’s Equidistribution Theorem.
Motivating question: Let us take a real number α ∈ R \ {0} and consider the sequence
A = Aα = {⟨α · n⟩}n≥1 , (II.9)
where [0, 1) ∋ ⟨x⟩ = x − ⌊x⌋ is the fractional part of x. For example ⟨3.5⟩ = 0.5, ⟨π⟩ = π − 3 =
0.1415 . . ., ⟨−3.5⟩ = (−3.5) − (−4) = 0.5, ⟨−3.4⟩ = 0.6. Implicitly, we defined an equivalence relation
between real numbers, where x ∼ y, if x − y ∈ Z.
Question II.6. (1) Is the set A dense in [0, 1], i.e. can it approximate every number β ∈ [0, 1]
arbitrarily well?
(2) Is A equidistributed in [0, 1), i.e. every interval (a, b) ⊆ [0, 1] gets proportionally many ele-
ments of A to its length b − a?
Definition II.7 (Equidistribution in [0, 1)). Let A = {an }n≥1 ⊆ [0, 1) be a sequence. We say that
A is equidistributed in [0, 1), if for every (a, b) ⊆ [0, 1],
\[ \lim_{N \to \infty} \frac{|\{1 \le n \le N : a_n \in (a, b)\}|}{N} = b - a. \tag{II.10} \]
Note that in the Definition II.7 above, b = 1 is allowed. Also note that the rate of convergence of
the limit (II.10) is allowed to (and, in general, will) depend on the interval (a, b). It is easy to show
(left to a homework assignment) that every equidistributed sequence is dense in [0, 1].
Examples II.8. (1) Take $\alpha = \frac{1}{3}$ and consider A as in (II.9). Then in this case
\[ a_n = \tfrac{1}{3}, \tfrac{2}{3}, 0, \tfrac{1}{3}, \tfrac{2}{3}, 0, \ldots. \]
This sequence is not dense in [0, 1], hence, in particular, not equidistributed. More generally,
if $0 \ne \alpha = \frac{p}{q} \in \mathbb{Q}$ is a rational number with (p, q) = 1, then the sequence $a_n = \langle n\alpha \rangle$ is periodic
with period q, so not dense, and, in particular, not equidistributed. What about an irrational
number, e.g. $\alpha = \sqrt{2}$?
(2) Prove that the sequence
\[ 0, \tfrac{1}{2}, 0, \tfrac{1}{3}, \tfrac{2}{3}, 0, \tfrac{1}{4}, \tfrac{2}{4}, \tfrac{3}{4}, 0, \ldots, \]
i.e. a linear ordering of $\{\frac{m}{n}\}_{n \ge 1,\ 0 \le m \le n-1}$, is equidistributed.
(3) The sequence
\[ 0, \tfrac{1}{2}, 0, \tfrac{1}{4}, \tfrac{2}{4}, \tfrac{3}{4}, 0, \tfrac{1}{8}, \tfrac{2}{8}, \ldots, \]
i.e. a linear ordering of $\{\frac{m}{2^n}\}_{n \ge 1,\ 0 \le m \le 2^n - 1}$, is not equidistributed (but dense) in [0, 1]. Take,
for example, $(a, b) = (\frac{1}{2}, 1)$, and choose $N = \frac{3}{2} \cdot 2^k$, then for $a_n \in (a, b)$ it is necessary to have
$n \le 2^k$, and then
\[ \frac{|\{1 \le n \le N : a_n \in (a, b)\}|}{N} \sim \frac{2^{k-1}}{N} \sim \frac{1}{3} < \frac{1}{2}. \]
The reason why in this example the sequence fails to equidistribute is that each cycle
occupies a positive proportion ($\frac{1}{2}$) of the total time elapsed, in contrast to the previous example.
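The count in example (3) can be reproduced directly. The sketch below (illustrative only; the function names and the choice k = 14 are assumptions) lists the sequence cycle by cycle and measures the proportion of terms falling in (1/2, 1), once at N = (3/2) · 2^k and once at the end of a full cycle, N = 2^k − 2. The two proportions are close to 1/3 and 1/2 respectively, so the limit in (II.10) cannot exist for this interval.

```python
def dyadic_sequence(count):
    """First `count` terms of 0, 1/2, 0, 1/4, 2/4, 3/4, 0, 1/8, ... (cycles m / 2**n)."""
    terms = []
    n = 1
    while len(terms) < count:
        terms.extend(m / 2 ** n for m in range(2 ** n))
        n += 1
    return terms[:count]

def proportion_in(terms, a, b):
    """Share of the listed terms lying in the open interval (a, b)."""
    return sum(1 for t in terms if a < t < b) / len(terms)

k = 14
ratio_mid_cycle = proportion_in(dyadic_sequence(3 * 2 ** (k - 1)), 0.5, 1.0)  # N = (3/2) * 2**k
ratio_full_cycle = proportion_in(dyadic_sequence(2 ** k - 2), 0.5, 1.0)       # cycles 1..k-1 complete
print(ratio_mid_cycle, ratio_full_cycle)
```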
Theorem II.9 (Weyl’s equidistribution). If α is irrational, then the sequence Aα = {⟨n · α⟩}n≥1 is
equidistributed in [0, 1]. In particular, Aα is dense in [0, 1] (“Kronecker’s Theorem”).
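Before the proof, the statement can be probed empirically. The following sketch is an illustration under assumed parameters (α = √2, the interval (0.4, 0.6) and N = 100000 are arbitrary choices): it counts how often ⟨nα⟩ lands in the interval and compares with the rational case α = 1/3, where the sequence only visits three points.

```python
import math

def fractional_part(x):
    """<x> = x - floor(x), a value in [0, 1)."""
    return x - math.floor(x)

def empirical_proportion(alpha, a, b, N):
    """Share of n in 1..N for which <n * alpha> lies in the open interval (a, b)."""
    hits = sum(1 for n in range(1, N + 1) if a < fractional_part(n * alpha) < b)
    return hits / N

a, b = 0.4, 0.6
irrational_share = empirical_proportion(math.sqrt(2), a, b, 100_000)
rational_share = empirical_proportion(1 / 3, a, b, 100_000)
print(irrational_share)  # close to b - a = 0.2
print(rational_share)    # the orbit of 1/3 never enters (0.4, 0.6)
```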
[Theorem II.9 could be generalized to vectors α = (α1 , . . . αd ) ∈ Rd , ⟨n · α⟩ ∈ Td , the d-dimensional
torus (“ergodic dynamical system”). The rate of convergence (“quantitative discrepancy”) in (II.10)
and its higher dimensional analogue depends on the Diophantine properties of the number α ∈ R \ Q
or the vector α, i.e. how close these are to being rational.]
Towards the proof of Theorem II.9 we recall the equivalence relation we defined on R, i.e. for
x, y ∈ R, x ∼ y if x − y ∈ Z.
Proof. Let (a, b) ⊆ [0, 1], and χ(a,b) : [0, 1] → R be the characteristic function of (a, b)
\[ \chi_{(a,b)}(x) = \begin{cases} 1 & x \in (a, b) \\ 0 & \text{otherwise} \end{cases}, \]
and extend χ(a,b) to a function χ(a,b) : R → R by 1-periodicity, that is, the extended χ(a,b) (·) satisfies
χ(a,b) (x) = χ(a,b) (y) for all x ∼ y with the above defined equivalence relation. Or, put differently,
χ(a,b) (x) = χ(a,b) (⟨x⟩) for every x ∈ R, i.e. χ(a,b) (x) only depends on the equivalence class of x. Then,
under the above conventions,
\[ |\{1 \le n \le N : \langle n\alpha \rangle \in (a, b)\}| = \sum_{n=1}^{N} \chi_{(a,b)}(\langle n\alpha \rangle) = \sum_{n=1}^{N} \chi_{(a,b)}(n\alpha), \]
More generally, in what follows, our aim is to establish (II.12) for a wide class of 1-periodic functions
f : R → C that would contain all the characteristic functions of intervals. This is done in 4 steps (a
fifth step will be offered in a homework assignment). In the theory of dynamical systems, the l.h.s. of
(II.12) is usually referred to as the “time average” of an observable f along an implicitly defined dynamical
system, whereas the r.h.s. of (II.12) is called the “space average” of that observable; a dynamical system
is ergodic if these two are equal.
Step 1: Let f be the exponential f (x) = ek (x) = e2πikx : R → C for some k ∈ Z, a 1-periodic
function.
If f ≡ 1 (i.e. k = 0), then for every N , the l.h.s. of (II.12) is equal to 1, and so is the r.h.s., so
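the limit (II.12) as N → ∞ certainly holds. For k ̸= 0 the l.h.s. of (II.12) is a geometric sum
that can be evaluated explicitly:
\[ \frac{1}{N} \sum_{n=1}^{N} e^{2\pi i k (n\alpha)} = \frac{1}{N} \cdot e^{2\pi i k \alpha} \cdot \frac{1 - e^{2\pi i k N \alpha}}{1 - e^{2\pi i k \alpha}} \to 0 = \int_0^1 e^{2\pi i k x}\,dx, \]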
Overall, with the use of the triangle inequality, we have: for N sufficiently large,
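\[ \left| \frac{1}{N} \sum_{n=1}^{N} f(n\alpha) - \int_0^1 f(x)\,dx \right| \le \frac{1}{N} \sum_{n=1}^{N} |f(n\alpha) - P_K(n\alpha)| + \left| \frac{1}{N} \sum_{n=1}^{N} P_K(n\alpha) - \int_0^1 P_K(x)\,dx \right| + \left| \int_0^1 P_K(x)\,dx - \int_0^1 f(x)\,dx \right| < \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} = \epsilon, \]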
Figure 3. The characteristic function χ(a,b) is approximated from below and above
by continuous functions fϵ± .
\[ \limsup_{N \to \infty} S_N \le \limsup_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f_\epsilon^+(n \cdot \alpha) = \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f_\epsilon^+(n \cdot \alpha) = \int_0^1 f_\epsilon^+(x)\,dx \le b - a + 2\epsilon, \]
and, since ϵ > 0 is arbitrary, and $\limsup_{N \to \infty} S_N$ does not depend on ϵ, we can infer that
\[ \limsup_{N \to \infty} S_N \le b - a. \]
Now applying $\liminf_{N \to \infty}$ on the l.h.s. of (II.13) and manipulating the resulting expressions in an
analogous manner, we obtain
\[ \liminf_{N \to \infty} S_N \ge b - a. \]
This result, coupled with the result on $\limsup_{N \to \infty}$, finally yields the existence of the limit
\[ \lim_{N \to \infty} S_N = b - a, \]
the premise of
\[ \frac{1}{N} \sum_{n=1}^{N} e_k(a_n) \to \int_0^1 e_k(x)\,dx = \begin{cases} 0 & k \ne 0 \\ 1 & k = 0 \end{cases}, \]
is equidistributed (the l.h.s. is called “Weyl’s exponential sum”). It is satisfied automatically for
k = 0, and, since the $a_n$ are real, one can deduce that if it is satisfied for k, it is also satisfied for −k.
Therefore, we obtained the following useful equidistribution criterion:
Theorem II.10 (Weyl’s equidistribution criterion). A sequence {an }n≥1 ⊆ R is equidistributed
modulo 1, if and only if for every k ∈ Z \ {0} (alternatively, k ∈ Z>0 ) we have
\[ \frac{1}{N} \sum_{n=1}^{N} e^{2\pi i k a_n} \to 0 \]
as N → ∞.
We proved “if”, whereas “only if” is left as a home assignment.
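Weyl's criterion lends itself to a direct numerical experiment. The sketch below is an illustration with assumed parameters (α = √2, N = 20000): it evaluates the normalized exponential sums for the sequence nα and checks them against the bound 1/(N |sin(πkα)|), which follows from summing the geometric series as in Step 1. For contrast, the sequence n/3 has a Weyl sum equal to 1 at k = 3, so the criterion fails for it.

```python
import cmath
import math

def weyl_sum(sequence, k):
    """Normalized exponential sum (1/N) * sum over n of exp(2*pi*i*k*a_n)."""
    return sum(cmath.exp(2j * math.pi * k * a) for a in sequence) / len(sequence)

N = 20_000
alpha = math.sqrt(2)
irrational_rotation = [n * alpha for n in range(1, N + 1)]
rational_rotation = [n / 3 for n in range(1, N + 1)]

for k in range(1, 6):
    magnitude = abs(weyl_sum(irrational_rotation, k))
    bound = 1 / (N * abs(math.sin(math.pi * k * alpha)))  # from the geometric sum
    print(k, magnitude, bound)

print(abs(weyl_sum(rational_rotation, 3)))  # equals 1: no cancellation at k = 3
```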
(2) One can consider the map $\rho : [0, 1) \cong S^1 \to [0, 1)$ defined by
\[ x \mapsto \rho x = \alpha + x \mod 1, \]
endowing $S^1$ with a structure of a dynamical system. The “orbit” of 0 is precisely $0, \rho 0 = \alpha, \rho^2 0 = 2\alpha \mod 1, \ldots$.
If α ∈ Q, this orbit is periodic. Otherwise, it is equidistributed in $S^1$. We can take a
function $f : S^1 \to \mathbb{R}$ (or $f : S^1 \to \mathbb{C}$), called in this context a test function or an observable. In
this context, $\frac{1}{N} \sum_{n=1}^{N} f(\rho^n 0)$ is the time average of the observable f along the trajectory of 0, whereas
$\int_0^1 f(x)\,dx$ is the space average of the observable f . The central question of the ergodic theory is
whether and under what conditions on the dynamical system and the observable, these two are
equal, asymptotically as N → ∞. If it is indeed the case for a suitable class of observable functions,
we say that the given dynamical system is ergodic.
(3) In our particular case, it is possible to prove that the ergodic property (II.12) holds for all
Riemann integrable 1-periodic functions f : R → C, proposed as a home assignment, with proof
borrowing from the readily done Step 4, i.e. approximating the Riemann integrable function, from
below and above, by step functions.
(4) There exists a vast number of generalizations for other sequences, dynamical systems in higher
dimensions (e.g. x 7→ x + (α1 , α2 ) on S 1 × S 1 = T2 , the 2-dimensional standard torus) etc.
(5) Finally, it seems appropriate to mention the Three Gaps Theorem, with connections to many
disciplines, including biology, musicology, linguistics and others. For some N ≥ 1, let {αk } be the
ordering of the numbers {n · α}n≤N . Then the collection of gaps {αk+1 − αk } contains at most 3
elements (depending on N ). Further, if it contains precisely 3 elements, then the largest of the three
gaps equals the sum of the other two. The proof of the Three Gaps Theorem is left as a guided home
assignment.
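The Three Gaps Theorem is easy to observe experimentally. The sketch below is an illustration only, and it assumes that the numbers being ordered are the fractional parts ⟨nα⟩ for 1 ≤ n ≤ N, with α = √2 chosen arbitrarily. Since floating-point gaps of equal length differ by rounding noise, gap lengths are treated as equal when they agree within a small tolerance.

```python
import math

def distinct_gap_count(alpha, N, tol=1e-9):
    """Number of distinct gap lengths between consecutive sorted values <n*alpha>, 1 <= n <= N."""
    points = sorted((n * alpha) % 1.0 for n in range(1, N + 1))
    gaps = sorted(q - p for p, q in zip(points, points[1:]))
    count = 1
    for smaller, larger in zip(gaps, gaps[1:]):
        if larger - smaller > tol:  # a genuinely new length, beyond rounding noise
            count += 1
    return count

gap_counts = {N: distinct_gap_count(math.sqrt(2), N) for N in (10, 100, 1000)}
print(gap_counts)
```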
II.3. Vibrating string and wave propagation. In this section we will first derive the wave
equation, and then will solve it using the Fourier analysis. We consider a string of length L > 0
positioned at rest along the x axis between the origin x = 0 and x = L, attached to the axis by
its endpoints. We assume that the string consists of particles moving vertically, and at time t, the
height of the particle x is u(x, t), as shown in Figure 4.
Let ρ > 0 be the (constant) linear density of the given string, i.e. the mass of its snippet between
a and b is ρ · (b − a). We divide the string into N → ∞ equal segments of length h = L/N and
All in all, we are looking for a function u(x, t) : [0, π] × R≥0 → R satisfying the wave equation
\[ \frac{\partial^2}{\partial t^2} u(x, t) = \frac{\partial^2}{\partial x^2} u(x, t), \tag{II.14} \]
with some $A, B, \tilde A, \tilde B \in \mathbb{R}$, and the meaning of the string being attached by its endpoints is that
φ(0) = φ(π) = 0, which, in its turn forces m ∈ Z≥0 and $\tilde A = 0$, and thus we may assume m ∈ Z>0 .
We consolidate all of the above to find a particular solution
um (x, t) = (Am cos(mt) + Bm sin(mt)) · sin(mx) (II.17)
to (II.14), not adhering to the initial conditions (II.15), which we will now impose.
By the linearity of the wave equation (II.14), we can superimpose its solutions (II.17) to write
\[ u(x, t) = \sum_{m=1}^{\infty} (A_m \cos(mt) + B_m \sin(mt)) \cdot \sin(mx), \]
and we will try to impose (II.15) to find sequences of numbers {Am } and {Bm } that would satisfy these.
By substituting t = 0, we have
\[ \sum_{m=1}^{\infty} A_m \sin(mx) \equiv f(x), \tag{II.18} \]
which relates the Am to the Fourier coefficients of f .
Though the Fourier expansion on the l.h.s. of (II.18) involves the sines only (so must be odd), on
an optimistic note we observe that f : [0, π] → R (i.e. f is π-periodic and not merely 2π-periodic).
Inspired by this observation, we may extend (uniquely) f on [−π, π] by oddness, and then the Am
are the Fourier coefficients of the resulting extension (in the real form). Similarly to the above,
\[ \sum_{m=1}^{\infty} m B_m \sin(mx) \equiv g(x), \]
hence we extend the function g to [−π, π] by oddness, and mBm are the Fourier coefficients of the
extended g. In homework there will be two concrete examples of solving the wave equation (one of
them is the “plucked string”, i.e. when g ≡ 0).
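The recipe above can be sketched in a few lines. The following is an illustrative sketch, not the homework solution: it assumes the hypothetical initial shape f(x) = x(π − x) with zero initial velocity (g ≡ 0, so all Bm vanish), computes Am = (2/π) ∫ f(x) sin(mx) dx over [0, π] (the sine coefficients of the odd extension of f) by a midpoint rule, and truncates the superposition at 40 terms.

```python
import math

def sine_coefficient(f, m, samples=2000):
    """A_m = (2/pi) * integral over [0, pi] of f(x) sin(m x) dx, by the midpoint rule."""
    h = math.pi / samples
    total = sum(f((j + 0.5) * h) * math.sin(m * (j + 0.5) * h) for j in range(samples))
    return (2 / math.pi) * h * total

def string_solution(f, terms=40):
    """u(x, t) = sum of A_m cos(m t) sin(m x), m = 1..terms, i.e. the case g = 0."""
    coefficients = [sine_coefficient(f, m) for m in range(1, terms + 1)]

    def u(x, t):
        return sum(A * math.cos(m * t) * math.sin(m * x)
                   for m, A in enumerate(coefficients, start=1))

    return u

def initial_shape(x):
    # Hypothetical initial position with f(0) = f(pi) = 0.
    return x * (math.pi - x)

u = string_solution(initial_shape)
print(u(1.0, 0.0), initial_shape(1.0))  # the series reproduces f at t = 0
print(u(0.0, 0.7), u(math.pi, 0.7))     # the endpoints stay attached
print(u(1.0, math.pi))                  # half a period later the shape is reflected
```

For this particular f only odd m contribute, so cos(mπ) = −1 for every surviving term and u(x, π) = −f(x).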
II.4. Heat equation on the circle. Let u(θ, t) be the temperature at time t > 0 at the point
θ ∈ S 1 . We already derived the heat equation for a 2-dimensional setting, and similarly we can
derive one for the 1-dimensional case, like ours:
\[ \frac{\partial u}{\partial t} = c \frac{\partial^2 u}{\partial \theta^2}, \tag{II.19} \]
where c > 0 is a constant, and we will assume with no loss of generality, that c = 1. The heat
equation is subject to the initial conditions
u(θ, 0) ≡ f (θ)
for some λ ∈ R. Then A′′ (θ) ≡ λ · A(θ) implies, using the periodicity of A, that λ = −n2 for some
n ∈ Z, and
A(θ) = An einθ + Bn e−inθ .
Concerning B(t), we have, by the above, $B'(t) = -n^2 B(t)$, which implies that $B(t) = C_n \cdot e^{-n^2 t}$.
By the linearity of the heat equation (II.19), we superimpose the solutions u(θ, t) = A(θ) · B(t) of
the above type to write
\[ u(\theta, t) = \sum_{n=-\infty}^{\infty} a_n e^{-n^2 t} \cdot e^{in\theta}. \tag{II.20} \]
Since u(θ, 0) ≡ f (θ), it follows that an = fb(n). We observe that |an | ≤ ∥f ∥∞ by the triangle
inequality, hence the defining series in (II.20) is rapidly absolutely convergent. In particular, u ∈
C 2 (S 1 ) (in fact, u ∈ C ∞ (S 1 )), and u(θ, t) as in (II.20) solves the heat equation.
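A truncated version of (II.20) can be evaluated directly. The sketch below is illustrative and rests on assumptions: the initial temperature f(θ) = |θ| on [−π, π) is a hypothetical choice, the coefficients fb(n) are approximated by a midpoint rule, and the series is cut off at |n| ≤ 30. At t = 0 the truncated series approximately reproduces f, while for large t only the n = 0 term survives and the temperature flattens to the mean value fb(0) = π/2.

```python
import cmath
import math

def fourier_coefficient(f, n, samples=1024):
    """Midpoint-rule approximation of (1/2pi) * integral over [-pi, pi] of f(t) e^{-int} dt."""
    h = 2 * math.pi / samples
    total = 0j
    for j in range(samples):
        theta = -math.pi + (j + 0.5) * h
        total += f(theta) * cmath.exp(-1j * n * theta)
    return total * h / (2 * math.pi)

def heat_solution(f, max_mode=30):
    """u(theta, t) = sum over |n| <= max_mode of f^(n) e^{-n^2 t} e^{i n theta}."""
    coefficients = {n: fourier_coefficient(f, n) for n in range(-max_mode, max_mode + 1)}

    def u(theta, t):
        series = sum(c * math.exp(-n * n * t) * cmath.exp(1j * n * theta)
                     for n, c in coefficients.items())
        return series.real  # f is real-valued, so the imaginary parts cancel

    return u

def initial_temperature(theta):
    # Hypothetical continuous initial datum on the circle.
    return abs(theta)

u = heat_solution(initial_temperature)
print(u(1.0, 0.0))   # close to f(1) = 1
print(u(1.0, 0.5))   # smoothed towards the mean
print(u(1.0, 10.0))  # essentially the mean value pi/2
```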
Note that by the definition (II.20) of u(θ, t), for each fixed t ≥ 0, we have
\[ \widehat{u(\cdot, t)}(n) = a_n \cdot e^{-n^2 t} = \hat f(n) \cdot e^{-n^2 t}. \]
Hence $u(\theta, t) = (f * H_t)(\theta)$, where $H_t(\cdot)$ is the function on $S^1$ such that $\widehat{H_t}(n) = e^{-n^2 t}$, i.e.
\[ H_t(\theta) = \sum_{n=-\infty}^{\infty} e^{-n^2 t} e^{in\theta}, \]
the heat kernel on the circle. The $\{H_t(\cdot)\}_{t > 0}$ is a family of good kernels as t → 0, hence $u(\theta, t) \to f(\theta)$ as t → 0
pointwise at continuity points (and uniformly if f is continuous on $S^1$). Moreover, as t → 0,
\[ u(\theta, t) \xrightarrow{L^2} f(\theta), \]
and, under suitable assumptions on gT , we can recover the values of gT (and hence those of f (x))
from its Fourier coefficients via the Fourier series:
\[ f(x) = g_T(\theta) = \sum_{n=-\infty}^{\infty} \widehat{g_T}(n) \cdot e^{in\theta} = \sum_{n=-\infty}^{\infty} \widehat{g_T}(n) \cdot e^{in(x 2\pi/T)} = \sum_{n=-\infty}^{\infty} \widehat{g_T}(n) \cdot e^{2\pi i (n/T) x}. \tag{III.3} \]
Since both (III.2) and (III.3) are given in terms of n/T rather than n, it is only natural to reparameterize
the frequency variable n, so as to parameterize $\widehat{g_T}(\cdot)$ in terms of $y_n := \frac{n}{T}$. The new
frequency variable $y_n \in \frac{1}{T}\mathbb{Z}$ takes values in a grid of vanishing mesh size $\frac{1}{T}$ (so, asymptotically
for T → ∞, will take values in R). That is, we rewrite (III.2) as
\[ \widehat{f_T}(y_n) := T \cdot \widehat{g_T}(T \cdot y_n) = \int_{-T/2}^{T/2} f(x) \cdot e^{-2\pi i y_n x}\,dx, \tag{III.4} \]
We then observe the striking similarities between the formulas (III.4) and (III.5) for passing from
the time variable x to the frequency variable y = yn and backwards respectively. The only differences
are the negative and positive signs in the exponent respectively, and that the integral in (III.4) is a
summation in (III.5). However, as T → ∞, yn becomes a continuous variable, and the summation
(III.5) is a Riemann sum on R corresponding to the function $\widehat{f_T}(y)$, assuming it is defined for y ∈ R.
In light of all the above, it would be natural to define the Fourier transform acting on functions
f : R → C as $\mathcal{F} : f(x) \mapsto \hat f(y)$ with
\[ \hat f(y) = \int_{-\infty}^{\infty} f(x) e^{-2\pi i x y}\,dx, \]
In what follows we are going to discuss when these definitions make sense, and what are the funda-
mental properties of F and F ∗ , and their inter-relations.
III.2. The Fourier transform on the Schwartz space and its fundamental properties. It
will be seen that the decay of functions at infinity is related to the smoothness of their Fourier
transform, and vice versa. It is therefore understood that the “best” functions to deal with at the first
stage are those smooth functions rapidly decaying at infinity.
Definition III.1. The Schwartz space S(R) is the space of all functions f : R → C, infinitely
differentiable f ∈ C ∞ (R), decaying at infinity faster than any polynomial, together with all their
derivatives, i.e. for every k, ℓ ∈ Z≥0 ,
\[ \left\| x^\ell \frac{d^k f}{dx^k} \right\|_\infty < +\infty. \]
We say that a function f ∈ S(R) is a Schwartz class function.
Before we proceed let us consider a few examples.
Examples III.2. (1) The Gaussian $f(x) = e^{-x^2}$. We claim that f ∈ S(R) is a Schwartz class
function. First, obviously, f ∈ C ∞ (R). For the decay, it is possible to prove by induction
that $f^{(k)}(x) = P_k(x) \cdot e^{-x^2}$, where Pk (·) are polynomials. Since the exponential grows faster
than any polynomial, the conclusion follows.
(2) $f(x) = e^{-|x|} \notin S(\mathbb{R})$, since $f \notin C^1(\mathbb{R})$ (ditto $f \notin C^\infty(\mathbb{R})$), left to validate as a homework
assignment.
(3) Let us construct a bump function, also called a mollifier: it is a smooth function f ∈
C ∞ (R), with compact support, hence f ∈ S(R) (denoted f ∈ C0∞ (R)). Take
\[ g(x) = \begin{cases} e^{-\frac{1}{2x}} & x > 0 \\ 0 & x \le 0 \end{cases}. \]
Then, as it is easy to check, g is continuous at x = 0, and, by induction,
\[ g^{(k)}(x) = \begin{cases} \dfrac{P_k(x)}{x^{2k}} \cdot e^{-\frac{1}{2x}} & x > 0 \\ 0 & x \le 0 \end{cases}, \]
with Pk some polynomials, also continuous at x = 0. Hence g ∈ C ∞ (R) (but not compactly
supported). We then enforce the compact support by taking
\[ f(x) := g(1 - x) \cdot g(1 + x) = \begin{cases} e^{-\frac{1}{1 - x^2}} & x \in (-1, 1) \\ 0 & x \notin (-1, 1) \end{cases}, \]
and $f \in C_0^\infty(\mathbb{R})$.
(4) If f (x) ∈ S(R), then f ′ (x) ∈ S(R) and x · f (x) ∈ S(R), proposed as a homework assignment.
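The bump function of example (3) is simple to evaluate. The sketch below (illustrative; the function names are assumptions) implements g and the product g(1 − x) · g(1 + x), and compares it with the closed form exp(−1/(1 − x²)) on (−1, 1), which follows from 1/(2(1 − x)) + 1/(2(1 + x)) = 1/(1 − x²).

```python
import math

def smooth_cutoff(x):
    """g(x) = exp(-1/(2x)) for x > 0 and 0 otherwise."""
    return math.exp(-1.0 / (2.0 * x)) if x > 0 else 0.0

def bump(x):
    """f(x) = g(1 - x) * g(1 + x), which vanishes outside (-1, 1)."""
    return smooth_cutoff(1.0 - x) * smooth_cutoff(1.0 + x)

def bump_closed_form(x):
    """exp(-1/(1 - x^2)) on (-1, 1), and 0 elsewhere."""
    return math.exp(-1.0 / (1.0 - x * x)) if -1.0 < x < 1.0 else 0.0

for x in (-1.5, -1.0, -0.9, 0.0, 0.3, 0.99, 1.0, 2.0):
    print(x, bump(x), bump_closed_form(x))
```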
We now define the Fourier transform and the Inverse (or Conjugate) Fourier transform on S(R).
Definition III.3. The Fourier transform on S(R) is the linear operator F : S(R) → S(R),
$f(x) \mapsto \hat f(\xi)$ with
\[ \hat f(\xi) = \int_{-\infty}^{\infty} f(x) e^{-2\pi i x \xi}\,dx. \]
We will see below that indeed fb(·) ∈ S(R). The Inverse Fourier transform (Conjugate Fourier
transform) is F ∗ : S(R) → S(R), g(ξ) 7→ ǧ(x), where
ǧ(x) = (Fg)(−x).
Our first goal is to validate that for f ∈ S(R), Ff ∈ S(R). To this end we have the following
proposition:
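Proposition III.4. Let f (·) ∈ S(R), $\hat f = \mathcal{F}f$, h ∈ R, and δ > 0. Then:
(1) Translation of function is mapped to product by exponential:
\[ f(x + h) \overset{\mathcal{F}}{\mapsto} \hat f(\xi) \cdot e^{2\pi i h \xi}. \]
(2) Dual to (1), left to a homework assignment:
\[ f(x) e^{-2\pi i x h} \overset{\mathcal{F}}{\mapsto} \hat f(\xi + h). \]
(3) Homothety is mapped to inverse (normalized) homothety:
\[ f(\delta x) \overset{\mathcal{F}}{\mapsto} \frac{1}{\delta} \hat f\left( \frac{\xi}{\delta} \right). \]
(4) Derivative operator is mapped to (normalized) multiplication by the independent variable:
\[ f'(x) \overset{\mathcal{F}}{\mapsto} 2\pi i \xi \hat f(\xi). \]
(5) Dual to (4) left to a homework assignment:
\[ (-2\pi i x) f(x) \overset{\mathcal{F}}{\mapsto} \frac{d}{d\xi} \hat f(\xi). \]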
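Proof. Proof of (1):
\[ \widehat{f(\cdot + h)}(\xi) = \int_{-\infty}^{\infty} f(x + h) e^{-2\pi i x \xi}\,dx = \int_{-\infty}^{\infty} f(y) e^{-2\pi i (y - h) \xi}\,dy = e^{2\pi i h \xi} \hat f(\xi). \]
Proof of (3):
\[ \widehat{f(\delta \cdot)}(\xi) = \int_{-\infty}^{\infty} f(\delta x) e^{-2\pi i x \xi}\,dx = \int_{-\infty}^{\infty} f(y) e^{-2\pi i y/\delta \cdot \xi}\,\frac{dy}{\delta} = \int_{-\infty}^{\infty} f(y) e^{-2\pi i y \xi/\delta}\,\frac{dy}{\delta} = \frac{1}{\delta} \hat f\left( \frac{\xi}{\delta} \right). \]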
where the contribution of ±∞ to the integration by parts term vanishes, by the assumption f ∈ S(R).
□
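Parts (1) and (3) of Proposition III.4 can be verified numerically for a concrete Schwartz function. The sketch below is an illustration under assumptions: it uses the Gaussian exp(−πx²), approximates the Fourier integral by a Riemann sum on [−8, 8] (where the integrand is negligible at the endpoints), and the values h = 0.7, δ = 2, ξ = 0.4 are arbitrary.

```python
import cmath
import math

def fourier_transform(f, xi, half_width=8.0, step=0.01):
    """Riemann-sum approximation of the integral of f(x) e^{-2 pi i x xi} over [-half_width, half_width]."""
    count = int(round(2 * half_width / step))
    total = 0j
    for j in range(count + 1):
        x = -half_width + j * step
        total += f(x) * cmath.exp(-2j * math.pi * x * xi)
    return total * step

def gaussian(x):
    return math.exp(-math.pi * x * x)

h, delta, xi = 0.7, 2.0, 0.4

# (1): the transform of f(x + h) equals f^(xi) * exp(2 pi i h xi).
translated = fourier_transform(lambda x: gaussian(x + h), xi)
expected_translated = fourier_transform(gaussian, xi) * cmath.exp(2j * math.pi * h * xi)

# (3): the transform of f(delta * x) equals (1/delta) * f^(xi / delta).
dilated = fourier_transform(lambda x: gaussian(delta * x), xi)
expected_dilated = fourier_transform(gaussian, xi / delta) / delta

print(abs(translated - expected_translated), abs(dilated - expected_dilated))
```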
We are finally able to prove that F, F ∗ : S(R) → S(R) as we wanted.
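differentiated k times, then multiplied by ξ for ℓ times. Hence, by Proposition III.4, (5) and (4), if
we denote $f_1(x) = (-2\pi i x)^k \cdot f(x) \in S(\mathbb{R})$,
\[ f_1 \overset{\mathcal{F}}{\mapsto} \frac{d^k}{d\xi^k} \hat f(\cdot), \]
and, further,
\[ \frac{(-2\pi i)^k}{(2\pi i)^\ell} \frac{d^\ell}{dx^\ell} \left( x^k f(\cdot) \right) = \frac{1}{(2\pi i)^\ell} \frac{d^\ell f_1}{dx^\ell} \overset{\mathcal{F}}{\mapsto} \xi^\ell \widehat{f_1}(\xi) = \xi^\ell \frac{d^k}{d\xi^k} \hat f(\xi) = g, \]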
which is bounded as an image of a Schwartz function.
□
Fourier Inversion formula: our next aim is to prove the inversion formula: for f ∈ S(R), fb = Ff
one has
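\[ f(x) = (\mathcal{F}^* \hat f)(x) = \int_{-\infty}^{\infty} \hat f(\xi) e^{2\pi i x \xi}\,d\xi. \tag{III.6} \]
Note that, since $(\mathcal{F}^* g)(x) = (\mathcal{F} g)(-x)$, the inversion formula (III.6) reads
\[ f(x) = (\mathcal{F}^* \mathcal{F} f)(x) = (\mathcal{F}^2 f)(-x). \]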
Hence, if (III.6) is indeed valid, then it will also imply that (F 4 f )(x) = f (x), i.e. the operator
F 4 = Id is the identity operator on S(R). We will move towards this goal, which will be finally
established in section III.3.
We consider the Gaussian function $f(x) = e^{-\pi x^2}$, and first claim that
\[ A := \hat f(0) = \int_{-\infty}^{\infty} f(x)\,dx = 1. \]
Figure 5. Plots of $K_\delta$ with δ = 1/10, 1/2, 1, 2, 10. The lower δ is, the more concentrated
is the kernel at the origin.
Figure 5 depicts the Gaussian kernels $K_\delta$ for δ = 1/10, 1/2, 1, 2, 10. For a lower parameter δ,
the corresponding kernel is more concentrated at the origin, whereas setting δ to a higher value
results in more mass of $K_\delta$ spreading away from the origin. This is related to the general Heisenberg
uncertainty principle, quantifying the relation between how concentrated a function and its Fourier
transform could be simultaneously, designating the confidence in the position and the momentum
of a particle in quantum mechanics.
The following definition of a family of good kernels is the analogue on R of Definition I.21.
Definition III.8. A family {Lδ }δ>0 ⊆ S(R) is a family of good kernels as δ → 0, if
(1) For every δ > 0, $\int_{-\infty}^{\infty} L_\delta(x)\,dx = 1$.
(2) There exists a number M > 0 so that $\int_{-\infty}^{\infty} |L_\delta(x)|\,dx \le M$ for every δ > 0.
(3) For every η > 0, $\int_{|x| > \eta} |L_\delta(x)|\,dx \to 0$ as δ → 0.
As one can hope, the scaled Gaussian kernels {Kδ }δ>0 in (III.8) is a family of good kernels.
Proof. We already validated (1) of good kernels (Corollary III.7), and (2) follows directly from (1) by
the positivity of all $K_\delta$. For (3) of good kernels, we use a linear transformation of variables to write:
\[ \int_{|x| > \eta} |K_\delta(x)|\,dx = \frac{1}{\sqrt{\delta}} \int_{|x| > \eta} e^{-\pi x^2/\delta}\,dx = \int_{|x| > \eta/\sqrt{\delta}} e^{-\pi x^2}\,dx \xrightarrow[\delta \to 0]{} 0, \]
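The three good-kernel properties can be checked numerically for the Gaussian kernels. The sketch below is illustrative: the total mass is approximated by a midpoint rule, and the tail mass uses the standard identity that the integral of exp(−πx²) over |x| > c equals erfc(c√π), combined with the change of variables from the proof. The parameter values are arbitrary choices.

```python
import math

def gaussian_kernel(x, delta):
    """K_delta(x) = delta^(-1/2) * exp(-pi * x^2 / delta)."""
    return math.exp(-math.pi * x * x / delta) / math.sqrt(delta)

def total_mass(delta, half_width=5.0, samples=100_000):
    """Midpoint-rule approximation of the integral of K_delta over [-half_width, half_width]."""
    h = 2 * half_width / samples
    return h * sum(gaussian_kernel(-half_width + (j + 0.5) * h, delta) for j in range(samples))

def tail_mass(eta, delta):
    """Integral of K_delta over |x| > eta, i.e. of exp(-pi x^2) over |x| > eta / sqrt(delta)."""
    return math.erfc(eta * math.sqrt(math.pi / delta))

eta = 0.5
for delta in (1.0, 0.1, 0.01):
    # The mass stays 1 while the tail beyond eta collapses as delta decreases.
    print(delta, total_mass(delta), tail_mass(eta, delta))
```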
The following corollary is a particular case of the Good Kernels Theorem for functions on S(R),
analogous to Theorem I.22. Its proof, very similar to the one of Theorem I.22, will only be given in
sketch for the Gaussian kernels, and is left in full generality as a homework assignment.
\[ (f * K_\delta)(x) \xrightarrow[\delta \to 0]{} f(x), \]
uniformly w.r.t. x ∈ R.
We then choose a parameter η > 0, divide the integral into ranges $\int_{-\infty}^{\infty} = \int_{|y| > \eta} + \int_{|y| < \eta}$, use the triangle
inequality, and bound the former integral
\[ \int_{|y| > \eta} |K_\delta(y)| \cdot |f(x - y) - f(x)|\,dy \]
using property (3) of good kernels (f is bounded), whereas the latter integral
\[ \int_{|y| < \eta} |K_\delta(y)| \cdot |f(x - y) - f(x)|\,dy \]
III.3. Inverse Fourier transform and the Plancherel formula. Our next aim is to prove the
Fourier inversion formula (III.6). Towards this goal we have the following result:
Proposition III.12 (Multiplication formula). For every f, g ∈ S(R) the following equality holds:
\[ \int_{-\infty}^{\infty} f(x) \cdot \hat g(x)\,dx = \int_{-\infty}^{\infty} \hat f(y) \cdot g(y)\,dy. \tag{III.9} \]
Proof. We manipulate:
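\[ \int_{-\infty}^{\infty} f(x) \cdot \hat g(x)\,dx = \int_{-\infty}^{\infty} f(x)\,dx \int_{-\infty}^{\infty} g(y) e^{-2\pi i x y}\,dy = \int_{-\infty}^{\infty} g(y)\,dy \int_{-\infty}^{\infty} f(x) e^{-2\pi i x y}\,dx = \int_{-\infty}^{\infty} g(y) \cdot \hat f(y)\,dy, \]
and it is easy to justify the said manipulation, due to the smoothness and the rapid decay of all the
integrands. □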
Proof of the Fourier inversion formula (III.6). We first claim that (III.6) holds for x = 0, i.e., that
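\[ f(0) = \int_{-\infty}^{\infty} \hat f(\xi)\,d\xi, \tag{III.10} \]
and then will be able to deduce the general case via a translation. Recall that for δ > 0 we defined
the kernels $K_\delta(x) = \frac{1}{\sqrt{\delta}} \cdot e^{-\pi x^2/\delta}$, and take $g_\delta(x) := e^{-\pi \delta x^2}$, so that $\widehat{g_\delta}(\xi) = K_\delta(\xi)$. An application of
the multiplication formula (III.9) with f and $g_\delta$ gives
\[ \int_{-\infty}^{\infty} f(x) \cdot K_\delta(x)\,dx = \int_{-\infty}^{\infty} \hat f(y) \cdot g_\delta(y)\,dy, \tag{III.11} \]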
which is sufficient to yield (III.10), thanks to the already established (III.12). The convergence
(III.13) follows immediately from the Dominated Convergence Theorem, in light of the fact that $g_\delta$ is
bounded by 1. Otherwise (as the Dominated Convergence Theorem might be unfamiliar territory),
we choose a large parameter $T > 0$, and split the integral in (III.13):
\[
\int_{-\infty}^{\infty} \widehat{f}(y) \cdot g_\delta(y)\,dy = \int_{-T}^{T} \widehat{f}(y) \cdot g_\delta(y)\,dy + \int_{|y|>T} \widehat{f}(y) \cdot g_\delta(y)\,dy.
\]
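Now, since $|g_\delta| \le 1$ and $\widehat{f} \in \mathcal{S}(\mathbb{R})$, the latter integral is $\int_{|y|>T} \widehat{f}(y) \cdot g_\delta(y)\,dy = O(1/T)$. For the
former integral, we write
\[
g_\delta(y) = e^{-\pi\delta y^2} = 1 - O(\delta y^2) = 1 - O(\delta T^2)
\]
on the relevant range $|y| < T$, so that, assuming $\|\widehat{f}\|_\infty \le B$, we conclude
\[
\int_{-T}^{T} \widehat{f}(y) \cdot g_\delta(y)\,dy = \int_{-T}^{T} \widehat{f}(y)\,dy + O(B\delta T^3).
\]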
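Consolidating all the bounds above, we may estimate the integral (III.13) as
\[
\int_{-\infty}^{\infty} \widehat{f}(y) \cdot g_\delta(y)\,dy = \int_{-T}^{T} \widehat{f}(y)\,dy + O(B\delta T^3) + O(1/T) = \int_{-\infty}^{\infty} \widehat{f}(y)\,dy + O(B\delta T^3) + O(1/T) \to \int_{-\infty}^{\infty} \widehat{f}(y)\,dy,
\]
since the error terms could be made arbitrarily small, i.e., given $\epsilon > 0$, first choose $T > 2/\epsilon$, and then
$\delta < \frac{\epsilon}{2BT^3}$.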
The equality (III.10) is now proved, and we now reduce the case of arbitrary $x$ to
$x = 0$ via a simple translation. Namely, given $x \in \mathbb{R}$, we consider $F(y) = F_x(y) := f(x + y)$. Then,
by (III.10) applied to $F(\cdot)$,
\[
f(x) = F(0) = \int_{-\infty}^{\infty} \widehat{F_x}(\xi)\,d\xi = \int_{-\infty}^{\infty} \widehat{f}(\xi) e^{2\pi i x \xi}\,d\xi,
\]
which is (III.6).
□
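The inversion formula (III.6) can be illustrated numerically away from $x = 0$. The following is a sketch under stated assumptions, not part of the original notes: we take the shifted Gaussian $f(x) = e^{-\pi(x-a)^2}$, whose Fourier transform is $\widehat{f}(\xi) = e^{-2\pi i a \xi} e^{-\pi\xi^2}$ (by the translation property with the convention $\widehat{f}(\xi) = \int f(x) e^{-2\pi i x \xi}\,dx$). The shift `a = 0.3`, the truncation `T = 8` and the helper `inverse_transform` are hypothetical choices.

```python
import cmath
import math

def inverse_transform(f_hat, x, T=8.0, n=16000):
    """Midpoint-rule approximation of the integral over [-T, T] of
    f_hat(xi) * exp(2*pi*i*x*xi) d(xi)."""
    h = 2.0 * T / n
    total = 0j
    for i in range(n):
        xi = -T + (i + 0.5) * h
        total += f_hat(xi) * cmath.exp(2j * math.pi * x * xi)
    return total * h

a = 0.3  # hypothetical shift, for illustration only

def f(x):
    return math.exp(-math.pi * (x - a) ** 2)

def f_hat(xi):
    # Translation by a multiplies the transform of exp(-pi*x^2) by exp(-2*pi*i*a*xi).
    return cmath.exp(-2j * math.pi * a * xi) * math.exp(-math.pi * xi * xi)

points = (-1.0, 0.0, 0.3, 1.2)
errors = [abs(inverse_transform(f_hat, x) - f(x)) for x in points]
print(errors)
```

The reported errors are at the level of rounding, since the midpoint rule is extremely accurate for rapidly decaying smooth integrands.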
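Recall that $\mathcal{F}, \mathcal{F}^* : \mathcal{S}(\mathbb{R}) \to \mathcal{S}(\mathbb{R})$ map $\mathcal{S}(\mathbb{R})$ into itself. The above result shows that both
$\mathcal{F}, \mathcal{F}^*$ are bijective, with $\mathcal{F}\mathcal{F}^* = \mathcal{F}^*\mathcal{F} = \mathrm{Id}$. Moreover, since $(\mathcal{F}^* f)(x) = (\mathcal{F} f)(-x)$, we have $(\mathcal{F}^2 f)(x) =
(\mathcal{F}^* \mathcal{F} f)(-x) = f(-x)$, and it follows that $\mathcal{F}^4 = \mathrm{Id}$.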
Proposition III.13. Let $f, g \in \mathcal{S}(\mathbb{R})$. Then:
(1) $f * g \in \mathcal{S}(\mathbb{R})$.
(2) $f * g = g * f$.
(3) $\widehat{f * g}(\xi) = \widehat{f}(\xi) \cdot \widehat{g}(\xi)$.
Proof. Only (1), the most difficult part, will be proved here; the other parts, being standard manipulations
of integrals similar to the corresponding results on the circle, are relegated to a homework
assignment. It will also be shown in a guided homework exercise that for every $\ell \ge 0$, there exists a
number $A_\ell > 0$ so that
\[
\sup_{x \in \mathbb{R}} |x|^\ell \cdot |g(x - y)| \le A_\ell \cdot (1 + |y|)^\ell, \tag{III.14}
\]
which we will use in the course of the proof. Using (III.14) we may write:
\[
|x|^\ell \cdot |(f * g)(x)| = |x|^\ell \cdot \left| \int_{-\infty}^{\infty} f(y) \cdot g(x - y)\,dy \right| \le \int_{-\infty}^{\infty} |f(y)| \cdot \left( |x|^\ell \cdot |g(x - y)| \right) dy \le A_\ell \cdot \int_{-\infty}^{\infty} |f(y)| \cdot (1 + |y|)^\ell\,dy < +\infty,
\]
independent of $x \in \mathbb{R}$.
Hence $|x|^\ell \cdot |(f * g)(x)|$ is bounded, and, since $g \in C^\infty(\mathbb{R})$ is rapidly decaying, we may differentiate
under the integral sign: for $k \ge 0$,
\[
\left( \frac{d}{dx} \right)^k (f * g)(x) = \left( f * \frac{d^k g}{dx^k} \right)(x).
\]
Hence, by the above,
\[
|x|^\ell \left( \frac{d}{dx} \right)^k (f * g)(x)
\]
is bounded.
□
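Parts (2) and (3) of Proposition III.13 can be illustrated on the Gaussian kernels. Since $\widehat{K_\delta}(\xi) = e^{-\pi\delta\xi^2}$, part (3) together with Fourier inversion gives $K_a * K_b = K_{a+b}$. The sketch below checks this numerically; the widths `a`, `b`, the truncation `L` and the helper `convolve_at` are assumptions chosen for the example.

```python
import math

def K(delta, x):
    """Gaussian kernel K_delta(x) = delta**-0.5 * exp(-pi*x**2/delta)."""
    return math.exp(-math.pi * x * x / delta) / math.sqrt(delta)

def convolve_at(f, g, x, L=12.0, n=24000):
    """Midpoint-rule approximation of (f * g)(x), integrating y over [-L, L]."""
    h = 2.0 * L / n
    total = 0.0
    for i in range(n):
        y = -L + (i + 0.5) * h
        total += f(y) * g(x - y)
    return total * h

a, b = 0.7, 1.6                       # hypothetical widths
points = (-2.0, -0.5, 0.0, 1.0, 3.0)  # hypothetical sample points

def Ka(y):
    return K(a, y)

def Kb(y):
    return K(b, y)

# Part (3): K_a * K_b should coincide with K_{a+b}.
errors = [abs(convolve_at(Ka, Kb, x) - K(a + b, x)) for x in points]
# Part (2): the convolution is commutative.
swap_errors = [abs(convolve_at(Ka, Kb, x) - convolve_at(Kb, Ka, x)) for x in points]
print(max(errors), max(swap_errors))
```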
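We endow $\mathcal{S}(\mathbb{R})$, a vector space over $\mathbb{C}$, with an inner product
\[
\langle f, g \rangle = \int_{-\infty}^{\infty} f(x) \cdot \overline{g(x)}\,dx,
\]
absolutely convergent by the rapid decay of both $f$ and $g$, and the corresponding norm is defined via
\[
\|f\|^2 = \int_{-\infty}^{\infty} |f(x)|^2\,dx.
\]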
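Theorem III.14 (Plancherel). The operator $\mathcal{F} : f \mapsto \widehat{f}$ is an isometry on $\mathcal{S}(\mathbb{R})$, i.e. for every
$f \in \mathcal{S}(\mathbb{R})$, $\|f\| = \|\widehat{f}\|$. [Hence $\mathcal{F}$ also preserves the inner product ($\mathcal{F}$ is unitary): for every
$f, g \in \mathcal{S}(\mathbb{R})$, $\langle f, g \rangle = \langle \mathcal{F}f, \mathcal{F}g \rangle$.]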
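Proof. Given $f \in \mathcal{S}(\mathbb{R})$, define $g(x) := \overline{f(-x)}$. By a homework assignment, negating the argument of a function negates
the argument of its Fourier transform, whereas conjugating a function negates the argument of its Fourier transform and conjugates it.
Therefore, we have that $\widehat{g}(\xi) = \overline{\widehat{f}(\xi)}$. Now we take $h := f * g$, whose Fourier transform is
\[
\widehat{h}(\xi) = \widehat{f}(\xi) \cdot \widehat{g}(\xi) = \widehat{f}(\xi) \cdot \overline{\widehat{f}(\xi)} = |\widehat{f}(\xi)|^2,
\]
and write the Fourier inversion formula for $h$ at $x = 0$:
\[
h(0) = \int_{-\infty}^{\infty} \widehat{h}(\xi)\,d\xi = \int_{-\infty}^{\infty} |\widehat{f}(\xi)|^2\,d\xi = \|\mathcal{F}f\|^2. \tag{III.15}
\]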
smoothness is imposed on f .]
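With the definition of the class $\mathcal{M}(\mathbb{R})$ as given, it is easy to define $\widehat{f}(\xi) = (\mathcal{F}f)(\xi)$ for all $f \in \mathcal{M}(\mathbb{R})$.
The main problem is that it might be that $f \in \mathcal{M}(\mathbb{R})$, but $\widehat{f} = \mathcal{F}f \notin \mathcal{M}(\mathbb{R})$, so that $\mathcal{F}^* \widehat{f}$ makes
no sense. We could then restrict the definition of $\mathcal{F}f$ to only those $f \in \mathcal{M}(\mathbb{R})$ for which also $\mathcal{F}f \in \mathcal{M}(\mathbb{R})$.
The following lemma, already stated on $\mathcal{S}(\mathbb{R})$, is also valid on $\mathcal{M}(\mathbb{R})$ and will be proved in a homework
assignment.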
Lemma III.16. If $f, g \in \mathcal{M}(\mathbb{R})$, then $f * g \in \mathcal{M}(\mathbb{R})$, and
\[
\widehat{f * g}(\xi) = \widehat{f}(\xi) \cdot \widehat{g}(\xi).
\]
From Lemma III.16 one can further deduce the multiplication formula, and subsequently the
Fourier inversion formula and Plancherel’s identity.
Example III.17. Consider $f = \chi_{[-1/2,1/2]}$, the characteristic function of the interval $[-1/2, 1/2]$. Its
Fourier transform was computed as a homework assignment to be $\widehat{f}(\xi) = \frac{\sin(\pi\xi)}{\pi\xi} \notin \mathcal{M}(\mathbb{R})$. However,
both $f$ and $\widehat{f}$ are square integrable, i.e. in $L^2(\mathbb{R})$.
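For this example we have $\|f\|^2 = \int_{-1/2}^{1/2} 1\,dx = 1$, so an isometric extension of $\mathcal{F}$ to $L^2(\mathbb{R})$ would require $\int_{-\infty}^{\infty} \left( \frac{\sin(\pi\xi)}{\pi\xi} \right)^2 d\xi = 1$ as well. The following sketch, which is an illustration and not a proof, approximates the truncated integrals over $[-R, R]$; the helper names, the grid density and the values of `R` are assumptions. The slow $O(1/R)$ approach to 1 reflects the slow decay of $\widehat{f}$.

```python
import math

def sinc_pi(xi):
    """sin(pi*xi)/(pi*xi), with the removable singularity at 0 filled in."""
    if xi == 0.0:
        return 1.0
    return math.sin(math.pi * xi) / (math.pi * xi)

def truncated_energy(R, steps_per_unit=200):
    """Midpoint-rule approximation of the integral of sinc_pi(xi)**2 over [-R, R]."""
    n = int(2 * R * steps_per_unit)
    h = 2.0 * R / n
    return h * sum(sinc_pi(-R + (i + 0.5) * h) ** 2 for i in range(n))

# Hypothetical truncation radii; the energies should increase towards ||f||^2 = 1.
energies = {R: truncated_energy(R) for R in (10, 100, 400)}
for R, value in energies.items():
    print(f"R = {R:<4} truncated energy = {value:.6f}")
```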
We then would like to extend the Fourier transform to $L^2(\mathbb{R})$. Recall that $\mathcal{S}(\mathbb{R})$ is an inner product
vector space with the inner product
\[
\langle f, g \rangle = \int_{-\infty}^{\infty} f(x) \cdot \overline{g(x)}\,dx.
\]
It is not complete, as one can take any not identically vanishing function outside of $\mathcal{S}(\mathbb{R})$, and convolve
it with a bump function. The completion of $\mathcal{S}(\mathbb{R})$ w.r.t. the associated norm coincides with $L^2(\mathbb{R})$,
the class of (measurable) functions whose square is Lebesgue integrable (i.e. finite $L^2$-mass), for
example containing both $f = \chi_{[-1/2,1/2]}$ and its Fourier transform $\widehat{f}(\xi) = \frac{\sin(\pi\xi)}{\pi\xi}$. It is then possible
to extend $\mathcal{F}$ and $\mathcal{F}^*$ to operators $L^2(\mathbb{R}) \to L^2(\mathbb{R})$, in a way that $\mathcal{F}$ is an isometry (hence unitary),
and $\mathcal{F}^* = \mathcal{F}^{-1}$, so that $\mathcal{F}$ is bijective.
Construction: Let $f \in L^2(\mathbb{R})$, represented as a Cauchy sequence $f = \{f_k\}_{k \ge 1} \subseteq \mathcal{S}(\mathbb{R})$. Now take the
image of the sequence $\{g_k = \mathcal{F}f_k\}_{k \ge 1} \subseteq \mathcal{S}(\mathbb{R})$. Then, since $\mathcal{F}$ is an isometry, the sequence $\{g_k\}_{k \ge 1}$
is Cauchy, and thereby is an element $g = \{g_k\}_{k \ge 1}$ of $L^2(\mathbb{R})$. Finally, we set $\mathcal{F}f := g$. It is
then important to prove that $\mathcal{F}f$ is well-defined, i.e. that the equivalence class of $g$ is independent of
the representation of the equivalence class $f$. Equivalently, if $f \in L^2(\mathbb{R})$, then
\[
\mathcal{F}f := \lim_{R \to \infty} \int_{-R}^{R} f(x) e^{-2\pi i x \xi}\,dx,
\]
We then take the inverse Fourier transform of (IV.7) (alternatively, use the bijectivity of the direct Fourier
transform), and use the duality between multiplication of functions in the time domain and
convolution in the frequency domain, and vice versa, to arrive at the solution
\[
u(x, t) = (f * H_t)(x) = (f * K_{\delta(t)})(x).
\]
The fact that {Kδ (·)}δ>0 are good kernels as δ → 0 will ensure the (uniform) convergence of u(x, t)
to f as t → 0 (also implying δ → 0 by the linear rescaling). That this is indeed so is asserted by the
following theorem.
Theorem IV.1. Let $f \in \mathcal{S}(\mathbb{R})$ be given, and for $t > 0$, $x \in \mathbb{R}$ define
\[
u(x, t) := (f * H_t)(x), \tag{IV.8}
\]
with $H_t$ the heat kernel given by (IV.4). Then:
(1) The function $u(\cdot, \cdot)$ is in $C^2(\mathbb{R} \times \mathbb{R}_{>0})$ and satisfies the heat equation (IV.1).
(2) As $t \to 0$,
\[
u(x, t) \to f(x)
\]
uniformly w.r.t. $x \in \mathbb{R}$. Hence $u$ defines a continuous function on the closure
\[
\overline{\mathbb{R} \times \mathbb{R}_{>0}} = \{(x, t) : x \in \mathbb{R},\ t \ge 0\}
\]
(by setting $u(\cdot, 0) := f(\cdot)$).
(3) Further to (2), the $L^2$-convergence holds, i.e.
\[
\int_{-\infty}^{\infty} |u(x, t) - f(x)|^2\,dx \to 0, \tag{IV.9}
\]
as $t \to 0$.
Note that the $L^2$-convergence (IV.9) does not follow from the uniform convergence on $\mathbb{R}$ (unlike,
e.g. on $S^1$).
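Proof. We first prove (1), i.e. that $u(\cdot, \cdot)$ is sufficiently smooth, and satisfies the heat equation. We
take the Fourier transform of the definition (IV.8) of $u(\cdot, \cdot)$ as a function of $x \in \mathbb{R}$ (for $t > 0$ fixed)
to yield
\[
\mathcal{F}u = (\mathcal{F}f) \cdot (\mathcal{F}H_t),
\]
explicitly, bearing in mind (IV.5),
\[
\widehat{u}(\xi, t) = \widehat{f}(\xi) \cdot e^{-4\pi^2 t \xi^2}. \tag{IV.10}
\]
Now, apply the Fourier inversion formula to (IV.10) (w.r.t. $x$):
\[
u(x, t) = \int_{-\infty}^{\infty} \widehat{f}(\xi) e^{-4\pi^2 t \xi^2} e^{2\pi i \xi x}\,d\xi. \tag{IV.11}
\]
Now, using the rapid decay of $\widehat{f}(\cdot) \in \mathcal{S}(\mathbb{R})$ (and also of $e^{-4\pi^2 t \xi^2}$), we can differentiate under the
integral sign of (IV.11) w.r.t. either $x$ or $t$, and it follows that $u(\cdot, \cdot) \in C^\infty(\mathbb{R} \times \mathbb{R}_{>0})$ is an infinitely
differentiable function of both its variables, which is stronger than claimed by (1). By the Leibniz rule, we obtain,
thanks to (IV.11) again:
\[
\frac{\partial u}{\partial t} = \int_{-\infty}^{\infty} -4\pi^2 \xi^2 \widehat{f}(\xi) e^{-4\pi^2 t \xi^2} e^{2\pi i \xi x}\,d\xi = \int_{-\infty}^{\infty} (2\pi i)^2 \xi^2 \widehat{f}(\xi) e^{-4\pi^2 t \xi^2} e^{2\pi i \xi x}\,d\xi = \frac{\partial^2 u}{\partial x^2}.
\]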
The statement (2) of the theorem follows directly from the Good Kernel theorem applied to the
Gaussian kernels $\{K_\delta\}_\delta$, which was proved during lectures in the precise form
\[
(f * H_t)(x) = (f * K_{\delta(t)})(x) \to f(x),
\]
uniformly w.r.t. $x \in \mathbb{R}$, thanks to $\delta(t) = 4\pi t \to 0$ as well. Finally, to prove the $L^2$-convergence (3),
we use Plancherel's identity:
\[
\|u(\cdot, t) - f(\cdot)\|_2^2 = \|\widehat{u}(\cdot, t) - \widehat{f}(\cdot)\|_2^2 = \int_{-\infty}^{\infty} |\widehat{f}(\xi)|^2 \cdot \left| e^{-4\pi^2 t \xi^2} - 1 \right|^2 d\xi,
\]
by (IV.10).
It then remains to show that the latter integral vanishes as $t \to 0$, very similar to a homework
assignment problem, both in terms of the statement and in terms of the proof, except for the integral in
place of the summation (and a slightly different normalization). Namely, we take a large parameter
$T \gg 0$, and write
\[
\int_{-\infty}^{\infty} |\widehat{f}(\xi)|^2 \cdot \left| e^{-4\pi^2 t \xi^2} - 1 \right|^2 d\xi = \int_{-T}^{T} + \int_{|\xi|>T}.
\]
In the range $|\xi| > T$ we use the trivial inequality $\left| e^{-4\pi^2 t \xi^2} - 1 \right| \le 1$, whence
\[
\int_{|\xi|>T} |\widehat{f}(\xi)|^2 \cdot \left| e^{-4\pi^2 t \xi^2} - 1 \right|^2 d\xi \le \int_{|\xi|>T} |\widehat{f}(\xi)|^2\,d\xi = O(1/T),
\]
since $\widehat{f}(\cdot) \in \mathcal{S}(\mathbb{R})$, whereas for $|\xi| < T$, we have $e^{-4\pi^2 t \xi^2} - 1 = O(t\xi^2) = O(tT^2)$, so that
\[
\int_{-T}^{T} |\widehat{f}(\xi)|^2 \cdot \left| e^{-4\pi^2 t \xi^2} - 1 \right|^2 d\xi = O(Bt^2T^5) = O(t^2T^5),
\]
where $B = \|\widehat{f}\|_\infty^2$ (we allow the constant implicit within the 'O'-notation to depend on $f$). Finally,
it is easy to first choose the parameter $T$ sufficiently large, and then $t > 0$ sufficiently small, so that
the total contribution of
\[
O(1/T) + O(t^2T^5) < \epsilon
\]
is arbitrarily small.
□
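Theorem IV.1 can be illustrated on an explicit example. Assuming, as in the proof above, that $H_t = K_{\delta(t)}$ with $\delta(t) = 4\pi t$, and taking the hypothetical initial datum $f(x) = e^{-\pi x^2} = K_1(x)$, the convolution rule $K_a * K_b = K_{a+b}$ (a consequence of $\widehat{K_\delta}(\xi) = e^{-\pi\delta\xi^2}$) gives the closed form $u(x, t) = K_{1+4\pi t}(x)$. The sketch below checks the heat equation by central differences and the uniform convergence on a grid; the step size, the grid and the sample points are assumptions chosen for the example.

```python
import math

def u(x, t):
    """Closed form of (f * H_t)(x) for f(x) = exp(-pi*x**2): the Gaussian K_{1+4*pi*t}."""
    s = 1.0 + 4.0 * math.pi * t
    return math.exp(-math.pi * x * x / s) / math.sqrt(s)

def heat_residual(x, t, h=1e-3):
    """Central-difference estimate of u_t - u_xx at (x, t)."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2.0 * h)
    u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / (h * h)
    return u_t - u_xx

def sup_error(t, L=6.0, n=1200):
    """Maximum of |u(x, t) - f(x)| over a uniform grid on [-L, L]; note u(x, 0) = f(x)."""
    grid = (-L + 2.0 * L * i / n for i in range(n + 1))
    return max(abs(u(x, t) - u(x, 0.0)) for x in grid)

# Part (1): the residual of the heat equation is small (finite-difference error only).
residuals = [abs(heat_residual(x, 0.2)) for x in (-1.0, 0.0, 0.5, 2.0)]
# Part (2): the sup-norm distance to f shrinks as t -> 0.
sup_errors = [sup_error(t) for t in (1e-1, 1e-2, 1e-3)]
print(residuals)
print(sup_errors)
```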
It is also possible to assert the uniqueness of a solution by imposing further conditions on u(·, ·),
but this is outside the scope of our module.