Manfred Einsiedler
Thomas Ward
Functional
Analysis,
Spectral Theory,
and Applications
Graduate Texts in Mathematics 276
Series Editors:
Sheldon Axler
San Francisco State University, San Francisco, CA, USA
Kenneth Ribet
University of California, Berkeley, CA, USA
Advisory Board:
Graduate Texts in Mathematics bridge the gap between passive study and creative
understanding, offering graduate-level introductions to advanced topics in mathe-
matics. The volumes are carefully written as teaching aids and highlight character-
istic features of the theory. Although these books are frequently used as textbooks
in graduate courses, they are also suitable for individual study.
Mathematics Subject Classification (2010): 46-01, 47-01, 11N05, 20F69, 22B05, 35J25, 35P10, 35P20,
37A99, 47A60
Believe us, we also asked ourselves what could be the rationale for ‘Yet an-
other book on functional analysis’.(1) Little indeed can justify this beyond
our own enjoyment of the beauty and power of the topics introduced here.
Functional analysis might be described as a part of mathematics where
analysis, topology, measure theory, linear algebra, and algebra come together
to create a rich and fascinating theory. The applications of this theory are
then equally spread throughout mathematics (and beyond).
We follow some fairly conventional journeys, and have of course been in-
fluenced by other books, most notably that of Lax [59]. While developing the
theory we include reminders of the various areas that we build on (in the
appendices and throughout the text) but we also reach some fairly advanced
and diverse applications of the material usually called functional analysis that
often do not find their place in a course on that topic.
The assembled material probably cannot be covered in a year-long course,
but has grown out of several such introductory courses taught at the Eid-
genössische Technische Hochschule Zürich by the first named author, with
a slightly different emphasis on each occasion. Both the student and (espe-
cially) the lecturer should be brave enough to jump over topics and pick the
material of most interest, but we hope that the student will eventually be
sufficiently interested to find out what happens in the material that was not
covered initially. The motivation for the topics discussed may be found in
Chapter 1.
Notation and Conventions
The symbols N = {1, 2, . . . }, N0 = N ∪ {0}, and Z denote the natural
numbers, non-negative integers and integers; Q, R, C denote the rational
numbers, real numbers and complex numbers. The real and imaginary parts
of a complex number are denoted by x = ℜ(x + iy) and y = ℑ(x + iy).
For functions $f, g$ defined on a set $X$ we write $f = O(g)$ or $f \ll g$ if there
is a constant $A > 0$ with $\|f(x)\| \le A\|g(x)\|$ for all $x \in X$. When the implied
constant $A$ depends on a set of parameters $\mathcal{A}$, we write $f = O_{\mathcal{A}}(g)$ or $f \ll_{\mathcal{A}} g$
(but we may also forget the index if the set of parameters will not vary at
all in the discussion). A sequence $a_1, a_2, \ldots$ in any space will be denoted $(a_n)$
(or $(a_n)_n$ if we wish to emphasize the index variable of the sequence). For
two $\mathbb{C}$-valued functions $f, g$ defined on $X \setminus \{x_0\}$ for a topological space $X$
containing $x_0$ we write $f = o(g)$ as $x \to x_0$ if $\lim_{x \to x_0} \frac{f(x)}{g(x)} = 0$. This definition
includes the case of sequences by letting X = N ∪ {∞} and x0 = ∞ with
the topology of the one-point compactification. Additional specific notation
introduced throughout the text is collected in an index of notation on p. 600.
Prerequisites
We will assume throughout that the reader is familiar with linear algebra
and quite frequently that she is also familiar with finite-dimensional real
analysis and complex analysis in one variable. Further background and con-
ventions in topology and measure theory are collected in two appendices, but
let us note that throughout compact and locally compact spaces are implicitly
assumed to be Hausdorff.
Organisation
There are 402 exercises in the text, 221 of these with hints in an appendix,
all of which contribute to the reader’s understanding of the material. A small
number are essential to the development (of the ideas in the section or of later
theories); these are denoted ‘Essential Exercise’ to highlight their significance.
We indicate the dependencies between the various chapters in the Leitfaden
overleaf and in the guide to the chapters that follows it.
Acknowledgements
We are thankful for various discussions with Menny Aka, Uri Bader,
Michael Björklund, Marc Burger, Elon Lindenstrauss, Shahar Mozes, René
Rühr, Akshay Venkatesh, and Benjamin Weiss on some of the topics presen-
ted here. We also thank Emmanuel Kowalski for making available his notes
on spectral theory and allowing us to raid them. We are grateful to sev-
eral people for their comments on drafts of sections, including Menny Aka,
Manuel Cavegn, Rex Cheung, Anthony Flatters, Maxim Gerspach, Tommaso
Goldhirsch, Thomas Hille, Guido Lob, Manuel Lüthi, Clemens Macho, Alex
Maier, Andrea Riva, René Rühr, Lukas Ruosch, Georg Schildbach, Samuel
Stark, Andreas Wieser, Philipp Wirth, and Gao Yunting. Special thanks are
due to Roland Prohaska, who proofread the whole volume in four months.
Needless to say, despite these many helpful eyes, some typographical and
other errors will remain — these are of course solely the responsibility of the
authors.
The second named author also thanks Grete for her repeated hospitality
which significantly aided this book’s completion, and thanks Saskia and Toby
for doing their utmost to prevent it.
Manfred Einsiedler, Zürich
Thomas Ward, Leeds
2nd April 2017
Leitfaden
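[Figure: Leitfaden diagram of chapter dependencies. Labels recovered from the
figure: Banach Spaces; Completeness; Sobolev Spaces & Dirichlet Problem; Dual
Spaces; Compact Operators & Weyl’s Law; Weak* Compactness & Locally Convex
Spaces; Spectral Theorems; Pontryagin Duality. Chapter numbers recovered:
2, 4, 5, 6, 7, 8, 12, 13, 14. The arrows between the nodes are not recoverable.]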
Guide to Chapters
Chapter 1 is mostly motivational in character and can be skipped for the
theoretical discussions later.
Chapter 4 has a somewhat odd role in this volume. On the one hand
it presents quite central theorems for functional analysis that also influence
many of the definitions later in the volume, but on the other hand, by chance,
the theorems are not crucial for our later discussions.
The dotted arrows in the Leitfaden indicate partial dependencies. Chapter 6
consists of two parts; the discussion of compact groups depends only on
Chapter 3 while the material on Laplace eigenfunctions also builds on mater-
ial from Chapter 5. The discussion of the adjoint operator and its properties
in Chapter 6 is crucial for the spectral theory in Chapters 11, 12, and 13.
Moreover, one section in Chapter 8 builds on and finishes our discussion of
Sobolev spaces in Chapters 5 and 6. Finally, some of Chapter 11 needs the
discussion of Haar measures in the first section of Chapter 10.
With these comments and the Leitfaden it should be easy to design many
different courses of different lengths focused around the topic of Functional
Analysis.
Contents
1 Motivation
1.1 From Even and Odd Functions to Group Representations
1.2 Partial Differential Equations and the Laplace Operator
1.2.1 The Heat Equation
1.2.2 The Wave Equation
1.2.3 The Mantegna Fresco
1.3 What is Spectral Theory?
1.4 The Prime Number Theorem
1.5 Further Topics
Notes
References
Notation
General Index
Chapter 1
Motivation
†
We start by discussing some seemingly disparate topics that are all intim-
ately linked to notions from functional analysis. Some of the topics have
been important motivations for the development of the theory that came
to be called functional analysis in the first place. We hope that the variety
of topics discussed in Sections 1.1–1.4 and those mentioned in Section 1.5
will help to give some insight into the central role of functional analysis in
mathematics.
Exercise 1.1. Is the decomposition of a function into odd and even parts in (1.1) unique?
That is, if $f = e + o$ with $e$ even and $o$ odd, is $e(x) = \frac{f(x) + f(-x)}{2}$?
As one might guess, behind the definition of even and odd functions and
the decomposition in (1.1) is the group $\mathbb{Z}/2\mathbb{Z} = \{0, 1\}$ acting on $\mathbb{R}$ via the
map $x \mapsto (-1)^{\ell} x$ for $\ell \in \mathbb{Z}/2\mathbb{Z}$. Here we are using $\ell \in \mathbb{Z}$ as a shorthand for
the coset $\ell + 2\mathbb{Z} \in \mathbb{Z}/2\mathbb{Z}$.
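As a concrete illustration, here is a small worked sketch. It assumes that (1.1) is the decomposition $f = e + o$ with the even part given by the formula in Exercise 1.1 and the odd part by the analogous formula with a minus sign; the notation $\ell \cdot f$ for the induced action on functions is introduced only for this example.

```latex
% Even and odd parts of f(x) = exp(x), assuming (1.1) reads f = e + o
\begin{align*}
e(x) &= \frac{\exp(x) + \exp(-x)}{2} = \cosh x, \\
o(x) &= \frac{\exp(x) - \exp(-x)}{2} = \sinh x, \\
% hypothetical notation: (\ell \cdot f)(x) = f((-1)^{\ell} x) for \ell in Z/2Z
(1 \cdot e)(x) &= e(-x) = e(x), \qquad (1 \cdot o)(x) = o(-x) = -o(x).
\end{align*}
```

In this example the even part is fixed by the non-trivial element of $\mathbb{Z}/2\mathbb{Z}$, while the odd part changes sign.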
† Chapter 1 is atypical for this book. The reader may, and the lecturer should, skip this
Notice that each of the characters in (1) and (2) corresponds to exactly
one type of function in the decompositions of functions discussed
above. Generalizing this correspondence, we turn to (3) and say that a
complex-valued function $f : \mathbb{R}^2 \to \mathbb{C}$ has weight $n$ (is of type $n$) if it
satisfies $f(k(2\pi\phi)v) = \chi_n(\phi) f(v)$ for all $\phi \in \mathbb{T}$ and $v \in \mathbb{R}^2$.
One might now guess (and we will see in Chapter 3 that this is indeed
the case) that any reasonable function $f : \mathbb{R}^2 \to \mathbb{C}$ can be written as a
linear combination
\[ f = \sum_{n \in \mathbb{Z}} f_n \tag{1.3} \]
The right-hand side of (1.4) is called the Fourier series of $f$. We will see later
that it is relatively straightforward (at least in the abstract sense) to find the
Fourier coefficients $c_n$ via the identity
\[ c_n = \int_{\mathbb{T}} f(x) \overline{\chi_n(x)} \, dx \]
for all $n \in \mathbb{Z}$.
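To make the coefficient formula concrete, here is a minimal worked example. It rests on two assumptions that are not visible in this excerpt: that $\chi_n(x) = e^{2\pi i n x}$, and that $\mathbb{T} = \mathbb{R}/\mathbb{Z}$ is identified with $[0, 1)$ carrying Lebesgue measure, so that the conjugate character appears under the integral.

```latex
% Assumption: \chi_n(x) = e^{2\pi i n x} on T = R/Z, identified with [0,1)
\begin{align*}
f(x) &= \cos(2\pi x) = \tfrac12 \chi_1(x) + \tfrac12 \chi_{-1}(x), \\
c_n &= \int_0^1 \cos(2\pi x)\, e^{-2\pi i n x} \, dx
     = \tfrac12 \int_0^1 \bigl( \chi_{1-n}(x) + \chi_{-1-n}(x) \bigr) \, dx
     = \begin{cases} \tfrac12 & \text{if } n = \pm 1, \\ 0 & \text{otherwise.} \end{cases}
\end{align*}
```

The last step uses that $\int_0^1 \chi_m(x) \, dx$ equals $1$ for $m = 0$ and $0$ for every other integer $m$.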
Fourier series arise naturally in many day-to-day applications. A string
or a wind instrument playing a note is producing a periodic pressure wave
with a certain frequency. The tone humans hear usually corresponds to this
frequency, which is called the fundamental in music theory. There are also
higher frequencies, usually integer multiples of the fundamental frequency,
appearing in the wave. These frequencies are called harmonics, and the ratios
between the Fourier coefficients of the harmonics and the Fourier coefficient
of the fundamental make the distinctive sound of different instruments (for
example, the flute and the clarinet) when playing the same fundamental note.
Returning to our discussion of symmetries for functions on $\mathbb{R}^2$ we will show
similarly that for a reasonable function $f : \mathbb{R}^2 \to \mathbb{C}$ the function
\[ f_n(v) = \int_{\mathbb{T}} \overline{\chi_n(\phi)} f(k(2\pi\phi)v) \, d\phi \]
for $n \in \mathbb{Z}$ has weight $n$ (compare this with (1.2)), and that (1.3) holds.
Exercise 1.4. Show that if the function $f_n(v) = \int_{\mathbb{T}} \overline{\chi_n(\phi)} f(k(2\pi\phi)v) \, d\phi$ is well-defined
(say the integral exists for almost every $v \in \mathbb{R}^2$), then it has weight $n$.
uniqueness of solutions to certain initial value problems and its proof will not
be surprised by this connection. We refer to Section 2.4 for more on this.
However, here we would like to discuss two particular partial differential
equations. As we will see later, the mathematical background needed for
this, most of which comes from functional analysis, is much more interesting
(meaning difficult) than that needed for ordinary differential equations. One
of the objectives of this book is to make the informal discussion in this section
more formal and rigorous. We will cover this topic in Chapters 5 and 6 (apart
from a technical point, which we resolve in Section 8.2).
In both of the partial differential equations that we will discuss, we will
need to express the difference between the value of a function at a point and
its values in a neighbourhood of the point. To make the resulting equations
more amenable for study one uses an infinitesimal version of this difference,
which brings into the picture(2) the Laplace operator $\Delta$ (also sometimes
denoted by $\nabla^2$) defined by
\[ \Delta f = \frac{\partial^2 f}{\partial x_1^2} + \cdots + \frac{\partial^2 f}{\partial x_d^2} \tag{1.5} \]
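Two quick computations may help to fix the definition. The functions $f$ and $g$ below are chosen only for illustration and do not appear in the text.

```latex
% Illustrative functions (our choice): f(x) = ||x||^2 and g(x) = x_1^2 - x_2^2
\begin{align*}
f(x) &= x_1^2 + \cdots + x_d^2, &
\frac{\partial^2 f}{\partial x_j^2} &= 2 \ \text{ for each } j, &
\Delta f &= 2d, \\
g(x) &= x_1^2 - x_2^2 \quad (d \ge 2), &
\frac{\partial^2 g}{\partial x_1^2} &= 2, \ \ \frac{\partial^2 g}{\partial x_2^2} = -2, &
\Delta g &= 0.
\end{align*}
```

So $g$ is an example of a function annihilated by $\Delta$, while $\Delta f$ is a non-zero constant.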
Next notice that $\int_{B_r} y_i^2 \, dy = \int_{B_r} y_j^2 \, dy$ for all $1 \le i, j \le d$ and
\[ \int_{B_r} \|y\|^2 \, dy = \int_{B_1} r^2 \|z\|^2 \, r^d \, dz \]
Thus
\[ c = \frac{C}{2d \operatorname{vol}(B_1)} = \frac{\frac{1}{d+2} \operatorname{vol}(\mathbb{S}^{d-1})}{2d \, \frac{1}{d} \operatorname{vol}(\mathbb{S}^{d-1})} = \frac{1}{2(d+2)}. \]
where
\[ \Delta_x u = \Delta u = \frac{\partial^2 u}{\partial x_1^2} + \cdots + \frac{\partial^2 u}{\partial x_d^2} \]
is the Laplace operator with respect to the space variables x1 , . . . , xd only.
Equation (1.7) is called the heat equation. If we take the physical interpreta-
tion of this equation for granted, then we can use it to give heuristic explan-
ations of some of the mathematical phenomena that arise.
Suppose first that we prescribe a time-independent temperature distribu-
tion at the boundary ∂U of the medium U , and then wait until the system
has settled into thermal equilibrium. Experience (that is, physical intuition)
suggests that in the long run (as time goes to infinity) the temperature distri-
bution inside U will reach a stable (time-independent) configuration. That is,
for any prescribed boundary value b : ∂U → R we expect the heat equation
on U to have a time-independent solution. More formally, we expect there to
be a function u : U → R satisfying
\[ \Delta u = 0, \qquad u|_{\partial U} = b. \tag{1.8} \]
The boundary value problem (1.8) is the Dirichlet boundary value problem,
the partial differential equation ∆u = 0 is called the Laplace equation, and its
solutions are called harmonic functions. Proving what the physical intuition
suggests, namely that the Dirichlet boundary value problem does indeed have
a (smooth) solution, will take us into the theory of Sobolev spaces. We will
prove the existence of smooth solutions for the Dirichlet boundary value
problem in Chapter 5 (and Section 8.2).
Leaving the Dirichlet problem to one side for now, we continue with the
heat equation. Motivated by the methods of linear ordinary differential equa-
tions and their initial value problems, we would like to know how we can find
other solutions to the partial differential equation while ignoring the bound-
ary values. A simple kind of solution to seek would be those with separated
variables, that is solutions of the form
u(x, t) = F (x)G(t)
and so (we may as well choose all physical constants to make c = 1) the
quotient
\[ \frac{G'(t)}{G(t)} = \frac{\Delta F(x)}{F(x)} \]
is independent of x and of t, and therefore is a constant (as this is not really
a proof, we will not worry about the division by a quantity that may vanish).
In summary, u(x, t) = F (x)G(t) solves the equation
\[ \frac{\partial u}{\partial t} = \Delta_x u \]
if $G(t) = e^{\lambda t}$ and $\Delta F = \lambda F$ for some constant $\lambda$, which one can quickly check
(rigorously). Ignoring for the moment the values of F on the boundary ∂U , it
is easy to find functions with ∆F = λF for any λ ∈ R by using suitable expo-
nential and trigonometric functions. However, these simple-minded solutions
turn out not to be particularly useful. Only those special functions F : U → R
with
\[ \Delta F = \lambda F \ \text{ inside } U, \qquad F|_{\partial U} = 0 \]
turn out to be useful in the general case. However, it is not clear that such
functions even exist, nor for which values of λ they may exist.
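The quick check mentioned above can be written out as follows, together with a one-dimensional instance. The choice $d = 1$, $U = (0, 1)$ and $F(x) = \sin(\pi n x)$ is ours and is meant only as a sketch of what such special functions look like in the simplest case.

```latex
\begin{align*}
% General check: u(x,t) = F(x) e^{\lambda t} with \Delta F = \lambda F
\frac{\partial}{\partial t} \bigl( F(x) e^{\lambda t} \bigr)
  &= \lambda F(x) e^{\lambda t}
   = (\Delta F)(x) \, e^{\lambda t}
   = \Delta_x \bigl( F(x) e^{\lambda t} \bigr), \\
% Illustrative case (our choice): d = 1, U = (0,1), F(x) = \sin(\pi n x)
F''(x) &= -\pi^2 n^2 \sin(\pi n x) = \lambda F(x)
  \ \text{ with } \lambda = -\pi^2 n^2, \qquad F(0) = F(1) = 0, \\
u(x, t) &= e^{-\pi^2 n^2 t} \sin(\pi n x).
\end{align*}
```

In this instance the boundary condition forces $\lambda$ to be one of the values $-\pi^2 n^2$ with $n = 1, 2, \ldots$, and the corresponding solution decays in time.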
Suppose now that the following non-trivial result — the existence of a
basis of eigenfunctions — (which we will be able to prove in many special
cases in Chapter 6) is known for the region U ⊆ Rd .
of the region for all t > 0. We conclude by mentioning that the claim above
will follow from the study of the spectral theory of an operator, but the
definition of the operator involved will be somewhat indirect.
The wave equation describes how an elastic membrane moves. We let u(x, t)
be the vertical position of the membrane at time t above the point with
coordinate x. As the membrane has mass (and hence inertia) our assumption
is that the vertical acceleration — a second derivative of position with respect
to time t — of the membrane at time t above x will be proportional to the
difference between the position of the membrane at that point and at nearby
points. Hence we call
\[ \frac{\partial^2 u}{\partial t^2} = c \, \Delta_x u \tag{1.10} \]
the wave equation. As in the case of the heat equation, we may as well choose
physical units to arrange that c = 1.
Once more we may argue from physical intuition that the Dirichlet bound-
ary problem for the wave equation always has a solution. Consider a wire loop
above the boundary ∂U (notice that even at this vague level we are imposing
some smoothness: our physical image of a wire loop may be very distorted
but will certainly be piecewise smooth) and imagine a soap film whose edge
is the wire. Then, after some initial oscillations,† we expect the soap film to
stabilize, giving a solution to the Dirichlet boundary value problem in (1.8)
defined by the shape of the wire.
In this context, what is the meaning of eigenfunctions of the Laplace oper-
ator that vanish on the boundary? To see this, imagine a drum whose skin has
the shape U so that the vibrating membrane is fixed along the boundary ∂U ,
which is simply a flat loop. Suppose now that F : U → R satisfies
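\[ \Delta F = \lambda F \ \text{ in } U, \qquad F|_{\partial U} = 0 \]
for some $\lambda < 0$. Then we see that $u(x, t) = F(x) \cos(\sqrt{-\lambda}\, t)$ satisfies
\[ \frac{\partial^2}{\partial t^2} u(x, t) = F(x) \underbrace{\bigl( -(\sqrt{-\lambda})^2 \bigr)}_{= \lambda} \cos(\sqrt{-\lambda}\, t) = \lambda F(x) \cos(\sqrt{-\lambda}\, t) = \Delta_x \bigl( F \cos(\sqrt{-\lambda}\, t) \bigr) \]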
and hence solves the wave equation. In other words, if we start the drum
at time t = 0 with the prescribed shape given by the function F , then the
†In the real world there would also be a friction term, and the model for this is a modified
wave equation (which we will not discuss further).
Exercise 1.7. For a circular string vibrating in one dimension — the wave equation over T
— the basis of eigenfunctions claim is precisely the claim that every nice function can be
represented by its Fourier series. Assuming that this holds, show the basis of eigenfunctions
claim for the domain U = (0, 1) ⊆ R. This relates to the wave equation for the clamped
vibrating string on [0, 1], that is, to the boundary conditions y(0) = y(1) = 0. (In fact the
eigenfunctions are given by $x \mapsto \sin(\pi n x)$ with $n = 1, 2, \ldots$; no rigorous proof is expected,
but explore the connection.)
An illustration of how some of the ideas discussed above link together will
be seen in Section 6.4.3, where we discuss eigenfunctions of the Laplacian
on a disk. Here the circular symmetry is exploited as in Section 1.1, and the
eigenfunctions may be used to decompose functions.
A remarkable application of these ideas was made to the problem of recon-
structing a bombed fresco by Andrea Mantegna in a church in Padua. The
damage resulted in the fresco being broken into approximately 88,000 small
pieces which needed to be reassembled using a black and white photograph;
we refer to Fornasier and Toniolo [35] for the detailed description of how cir-
cular harmonics were used to render the computation required practicable.
The partially reconstructed coloured image was then used to build a coloured
image of the entire fresco.
† This preferred frequency for certain physical objects is part of the phenomenon of
resonance, and the design of large structures like buildings or bridges tries to prevent
resonances that may lead to reinforcement of oscillations by wind, for example.
As we will see later, the topics considered in Sections 1.1 and 1.2 are connected
to spectral theory.
The goal of spectral theory, at its broadest, might be described as an
attempt to ‘classify’ all linear operators. We will restrict our attention to
Hilbert spaces, which is natural for two reasons. Firstly, it is much easier
than the general case of operators on Banach spaces. Secondly, many of the
most important applications belong to this simpler setting of operators on
Hilbert spaces.
In finite-dimensional linear algebra the classification problem for linear
operators is successfully solved by the theory of eigenvalues, eigenspaces,
minimal and characteristic polynomials, which leads to a canonical normal
form (the Jordan normal form) for any linear operator $\mathbb{C}^n \to \mathbb{C}^n$ for $n \ge 1$.
We will not be able to get such a general theory if the Hilbert space H is
infinite-dimensional, but it turns out that many operators of great interest
have properties which, in the finite-dimensional case, ensure an even simpler
description. They may belong to any of the special classes of operators defined
on a Hilbert space by means of the adjoint operation $T \mapsto T^*$: self-adjoint
operators, unitary operators, or normal operators. For these, if $\dim H = n$ and
we work over $\mathbb{C}$, then there is an orthonormal basis $(e_1, \ldots, e_n)$ of eigenvectors
of $T$ with corresponding eigenvalues $(\lambda_1, \ldots, \lambda_n)$ so that
\[ T \Bigl( \sum_{j=1}^{n} \alpha_j e_j \Bigr) = \sum_{j=1}^{n} \alpha_j \lambda_j e_j. \tag{1.11} \]
In other words, the map $\phi \bigl( \sum_{j=1}^{n} \alpha_j e_j \bigr) = (\alpha_1, \ldots, \alpha_n)$ is an isometry
from $H$ to $\mathbb{C}^n$ and we may rephrase (1.11) to become
\[ T_1 = \phi \circ T \circ \phi^{-1} \tag{1.12} \]
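A small instance may make (1.11) and (1.12) more tangible. The matrix below is our own choice, not taken from the text; composing $\phi$, $T$ and $\phi^{-1}$ as in (1.12) shows directly that the resulting map on $\mathbb{C}^2$ multiplies each coordinate by the corresponding eigenvalue.

```latex
% Illustrative self-adjoint operator on H = C^2 (our choice of matrix)
\[
T = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \qquad
e_1 = \tfrac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad
e_2 = \tfrac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \qquad
T e_1 = 3 e_1, \qquad T e_2 = e_2,
\]
% With \phi(\alpha_1 e_1 + \alpha_2 e_2) = (\alpha_1, \alpha_2) as in the text:
\[
(\phi \circ T \circ \phi^{-1})(\alpha_1, \alpha_2)
  = \phi( 3 \alpha_1 e_1 + \alpha_2 e_2 )
  = (3 \alpha_1, \alpha_2).
\]
```

Here $e_1, e_2$ are orthonormal, and in the coordinates given by $\phi$ the operator acts as the diagonal matrix with entries $3$ and $1$.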
Part of the inherent beauty of mathematics comes from the interplay between
simple problems and the sophisticated theories that are sometimes required
to solve these problems. The natural numbers are among the simplest math-
ematical objects, but number theory tends to use techniques from much of
mathematics to study basic properties of N. Additively N is quite simple,
but multiplicatively N is much more complex as it is generated by the prime
numbers 2, 3, 5, . . . . Perhaps because of mathematics’ omnipresence across all
of the sciences, its absolute (internal) truth, and the pre-eminent role played
by the natural numbers, Gauss is alleged to have said “mathematics is the
queen of the sciences and number theory is the queen of mathematics”.
For modern number theory functional analysis is one of many essential
tools. While we will not be able to really justify this statement without de-
voting a significant proportion of this volume to number theory, we do at-
tempt a partial justification by giving a proof of the prime number theorem
in Chapter 14. Prime numbers have been a source of inspiration for math-
ematicians certainly since Euclid proved (approx. 300 BCE) that there are
infinitely many prime numbers. One of many mysteries concerning the prime
numbers is their distribution or location within the natural numbers.
Exercise 1.8. Writing $p_1, p_2, p_3, \ldots$ for the primes $2, 3, 5, \ldots$ and $\pi(x) = |\{n \mid p_n \le x\}|$
for the number of primes less than or equal to $x$, recall Euclid’s argument using the fact
that $p_1 p_2 \cdots p_n + 1$ has a prime divisor not in $\{p_1, \ldots, p_n\}$ to show that there are infinitely
many primes. Use this to show that $p_n < 2^{2^n}$ and deduce that
In this chapter we start the more formal treatment of functional analysis, giv-
ing the fundamental definitions and introducing some of the basic examples
and their properties. We also discuss some theorems and constructions that
may be considered part of topology or measure theory, to put them into the
context of the theory developed here.
Example 2.2. The following are examples of normed real vector spaces, in
which we write $v = (v_1, \ldots, v_d)^t$ for elements of $\mathbb{R}^d$.
(1) $\mathbb{R}^d$ with the Euclidean norm $\|v\| = \|v\|_2 = \sqrt{|v_1|^2 + \cdots + |v_d|^2}$.
(2) $\mathbb{R}^d$ with $\|v\| = \|v\|_\infty = \max_{1 \le i \le d} |v_i|$.
(3) $\mathbb{R}^d$ with $\|v\| = \|v\|_1 = |v_1| + \cdots + |v_d|$.
(4) $\mathbb{R}^d$ with the norm defined by $\|v\|_B = \inf\{\alpha > 0 \mid \frac{1}{\alpha} v \in B\}$, where $B$ is
a non-empty, open, centrally symmetric (that is, with $B = -B$), convex,
bounded (with respect to the Euclidean norm) subset of $\mathbb{R}^d$.
(5) Let $X$ be any topological space (for example, a metric space; see
Appendix A), and let $C_b(X) = \{f : X \to \mathbb{R} \mid f \text{ is continuous and bounded}\}$
with the uniform or supremum norm
\[ \|f\| = \|f\|_\infty = \sup_{x \in X} |f(x)|. \]
Notice that if $X$ is compact, then $C_b(X)$ coincides with $C(X)$, the space of
continuous functions $X \to \mathbb{R}$. We note that our definition of compactness
(see Definition A.18) contains the assumption that $X$ is Hausdorff.
(6) A special case of (5) makes $C([0, 1])$, and so also the subspace
into a normed vector space. A different norm on $C^1([0, 1])$ may be
obtained by setting $\|f\|_{C^1([0,1])} = \max\{\|f\|_\infty, \|f'\|_\infty\}$.
(7) Finally, consider the vector space of real polynomials
\[ \mathbb{R}[x] = \Bigl\{ f = \sum_{k=0}^{N} c_f(k) x^k \Bigm| N \in \mathbb{N},\ c_f(k) \in \mathbb{R} \Bigr\} \]
Throughout the text we will use notions from topology (see Appendix A
for a summary).
Lemma 2.4 (Associated metric). Suppose that $(V, \|\cdot\|)$ is a normed vector
space. Then for every $v, w \in V$ we have
\[ \bigl| \|v\| - \|w\| \bigr| \le \|v - w\|. \tag{2.1} \]
The norm is continuous at $v \in V$ if for every $\varepsilon > 0$ there exists some $\delta > 0$
such that $d(u, v) < \delta$ implies $\bigl| \|u\| - \|v\| \bigr| < \varepsilon$. By (2.1), we may choose $\delta = \varepsilon$
to see this.
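For completeness, one standard way to obtain the inequality (2.1) from the triangle inequality and homogeneity is sketched below; this is offered as a short derivation under the usual norm axioms and is not necessarily the argument given in the text.

```latex
\begin{align*}
\|v\| &= \|(v - w) + w\| \le \|v - w\| + \|w\|, \\
\|w\| &= \|(w - v) + v\| \le \|w - v\| + \|v\| = \|v - w\| + \|v\|.
\end{align*}
% Hence both \|v\| - \|w\| and \|w\| - \|v\| are bounded above by \|v - w\|.
```

The equality $\|w - v\| = \|v - w\|$ in the second line is homogeneity with the scalar $-1$.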
Notice that the triangle inequality makes addition continuous. Indeed, if
we write
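\[ B_\varepsilon^{\|\cdot\|}(v) = \{w \in V \mid \|w - v\| < \varepsilon\} \]
for the ball of radius $\varepsilon$ around $v \in V$, then we have
\[ B_{\varepsilon/2}^{\|\cdot\|}(v_1) + B_{\varepsilon/2}^{\|\cdot\|}(v_2) \subseteq B_\varepsilon^{\|\cdot\|}(v_1 + v_2) \tag{2.2} \]
for every $\varepsilon > 0$. This means that $(v, w) \mapsto v + w$ is continuous at $(v_1, v_2)$ and,
since $v_1, v_2 \in V$ were arbitrary, shows that addition is continuous.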
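The inclusion (2.2) is a one-line consequence of the triangle inequality; written out, with $w_1, w_2$ denoting arbitrary points of the two smaller balls (names introduced here for illustration):

```latex
% Take w_1 with \|w_1 - v_1\| < \varepsilon/2 and w_2 with \|w_2 - v_2\| < \varepsilon/2
\[
\|(w_1 + w_2) - (v_1 + v_2)\|
  \le \|w_1 - v_1\| + \|w_2 - v_2\|
  < \tfrac{\varepsilon}{2} + \tfrac{\varepsilon}{2} = \varepsilon .
\]
```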
Scalar multiplication is also continuous. To see this fix a scalar α and a
vector v ∈ V , and notice that
Lemma 2.5 (Equivalence of norms). Two norms $\|\cdot\|$ and $\|\cdot\|'$ on the same
vector space induce the same topology if and only if there exists a (Lipschitz)
constant $c \ge 1$ such that
\[ \tfrac{1}{c} \|v\|' \le \|v\| \le c \|v\|' \tag{2.4} \]
and
\[ B_\varepsilon^{\|\cdot\|}(v) = \{w \in V \mid \|w - v\| < \varepsilon\} \]
with respect to the two norms satisfy
\[ B_{\frac{1}{c}\varepsilon}^{\|\cdot\|'}(v) \subseteq B_\varepsilon^{\|\cdot\|}(v) \subseteq B_{c\varepsilon}^{\|\cdot\|'}(v). \]
This implies that the topologies have the same notion of neighbourhood, and
so are identical.
Suppose now that the two topologies are the same, so that $B_1^{\|\cdot\|}$ is a
neighbourhood of $0$ in this topology. Then there must be some $\varepsilon > 0$ with
\[ B_\varepsilon^{\|\cdot\|'} \subseteq B_1^{\|\cdot\|}. \]
Equivalently, $\|v\|' < \varepsilon$ implies that $\|v\| < 1$. For any $v \in V \setminus \{0\}$, if $w = \frac{\varepsilon}{2\|v\|'} v$
then
\[ \|w\|' = \frac{\varepsilon}{2\|v\|'} \|v\|' < \varepsilon \]
and so
\[ \frac{\varepsilon}{2\|v\|'} \|v\| = \|w\| < 1. \]
This implies that $\|v\| \le \frac{2}{\varepsilon} \|v\|'$ for all $v \in V$, giving the second inequality
in (2.4). Reversing the roles of $\|\cdot\|$ and $\|\cdot\|'$ gives the first inequality, and
choosing $c$ to be the larger of the two choices produced for $c$ gives the lemma.
The phenomenon seen in the proof of Lemma 2.5, where a property on
all of V is determined by the local behaviour at 0, is something that will
occur frequently. For $\mathbb{R}^d$ the notion of equivalence of norms has the following
property.
Proposition 2.6 (Equivalence in finite dimensions). Any two norms
on $\mathbb{R}^d$ are equivalent, for any $d \ge 1$.
As we will see in the proof, this is related to the compactness of the closed
unit ball in $\mathbb{R}^d$.
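For the three norms of Example 2.2(1) to (3) the equivalence can be made explicit without any compactness argument. The chain below uses only elementary estimates and the Cauchy-Schwarz inequality; the vector $v = (3, -4)$ is a sample chosen for illustration.

```latex
\[
\|v\|_\infty \le \|v\|_2 \le \|v\|_1 \le \sqrt{d}\, \|v\|_2 \le d\, \|v\|_\infty
\qquad \text{for all } v \in \mathbb{R}^d .
\]
% Sample vector (our choice): v = (3, -4) in R^2, so d = 2
\[
\|v\|_\infty = 4 \le \|v\|_2 = 5 \le \|v\|_1 = 7 \le \sqrt{2} \cdot 5 \le 2 \cdot 4 = 8 .
\]
```

In the sense of Lemma 2.5, any pair of these three norms is therefore equivalent with the constant $c = d$.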
Proof of Proposition 2.6. Let $\|\cdot\|_1$ be the norm on $\mathbb{R}^d$ from
Example 2.2(3), and let $\|\cdot\|'$ be an arbitrary norm on $\mathbb{R}^d$. It is enough to
show that these two norms are equivalent. Write $e_1, \ldots, e_d$ for the standard
basis of $\mathbb{R}^d$, and let $M = \max_{1 \le i \le d} \|e_i\|'$. Then
\[ \|v\|' = \Bigl\| \sum_{i=1}^{d} v_i e_i \Bigr\|' \le \sum_{i=1}^{d} |v_i| \|e_i\|' \le M \|v\|_1, \tag{2.5} \]
\[ S_1 = \{v \in \mathbb{R}^d \mid \|v\|_1 = 1\} \]
Exercise 2.8. Show that no two of the norms on $\mathbb{R}[x]$ from Example 2.2(7) are equivalent.
However, some of the pairs of norms do satisfy an inequality of the form $\|f\| \le c \|f\|'$ for
some fixed $c > 0$ and any $f \in \mathbb{R}[x]$. Find those that do and identify the smallest relevant
constant $c$ in each case.
Exercise 2.9. Let $V, W$ be normed vector spaces. Show that $V \times W$ with its canonical
inherited vector space structure can be made into a normed vector space using either of
the norms
\[ \|(v, w)\|_p = \bigl( \|v\|_V^p + \|w\|_W^p \bigr)^{1/p} \]
for some $p \in [1, \infty)$, or
\[ \|(v, w)\|_\infty = \max\{\|v\|_V, \|w\|_W\}. \]
Show that all of these norms are equivalent, and that they induce the product topology.
The next exercise also shows why we are careful in setting up the theory
of normed spaces instead of just declaring that everything is a generalization
of the finite-dimensional theory.
Exercise 2.10. We define the space $\ell^1(\mathbb{N}) = \{(x_n) \mid \sum_{n=1}^{\infty} |x_n| < \infty\}$ to be the space of
all absolutely summable sequences.
• Show that $\|x\|_1 = \sum_{n=1}^{\infty} |x_n|$ for $x \in \ell^1(\mathbb{N})$ defines a norm, and that the sub-
V1 + V2 = {v1 + v2 | v1 ∈ V1 , v2 ∈ V2 }
is not closed.
†
The following weakening of Definition 2.1 is often useful.
Definition 2.11. A non-negative function $\|\cdot\| : V \to \mathbb{R}_{\ge 0}$ on a vector
space $V$ is called a semi-norm (or a pseudo-norm) if $\|\cdot\|$ satisfies the
homogeneity property and the triangle inequality of a norm.
Thus a semi-norm is allowed to have a non-trivial subset (which will be a
subspace, see below) on which it vanishes. A semi-norm gives rise to a
pseudo-metric, which in turn gives rise to a topology on $V$. The resulting topology is
Hausdorff if and only if the original semi-norm is a norm in the usual sense.
Indeed, if $v \in V$ has $\|v\| = 0$, then $v$ will belong to every neighbourhood of $0$
in the topology defined by $\|\cdot\|$.
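A very small example, not taken from the text, may help to see how a semi-norm can fail to be a norm. The function $p$ on $\mathbb{R}^2$ below is our own choice.

```latex
% Illustrative semi-norm on R^2 (our choice): p(x, y) = |x|
\begin{align*}
p(\alpha (x, y)) &= |\alpha x| = |\alpha| \, p(x, y), \\
p\bigl( (x, y) + (x', y') \bigr) &= |x + x'| \le |x| + |x'| = p(x, y) + p(x', y'), \\
\{ v \in \mathbb{R}^2 \mid p(v) = 0 \} &= \{ (0, y) \mid y \in \mathbb{R} \} .
\end{align*}
```

Here $p(0, 1) = 0$ although $(0, 1) \neq 0$, so the point $(0, 1)$ lies in every neighbourhood of $0$ for the induced topology, which is therefore not Hausdorff.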
Example 2.12. Let (X, B, µ) be a measure space (see Appendix B), and define
and this is not a norm (unless the measure space (X, µ) has special properties;
see Exercise 2.13).
Exercise 2.13. Characterize those measure spaces $(X, B, \mu)$ on which the semi-norm from
Example 2.12 on the space $\mathcal{L}^1_\mu(X)$ of Lebesgue integrable functions is a norm.
\[ V_0 = \{v \in V \mid \|v\| = 0\} \]
so $V_0$ is a subspace.
By the argument used in Lemma 2.4, we see that the semi-norm $\|\cdot\|$
is continuous with respect to the induced topology. It follows that the
pre-image $V_0 = (\|\cdot\|)^{-1}(\{0\})$ is also closed.
Returning to Example 2.12, recall that for $f \in \mathcal{L}^1_\mu(X)$, $\|f\|_1 = 0$ is
equivalent to the statement that $f = 0$ almost everywhere with respect to $\mu$.
Thus the usual equivalence class of a function $f$ is precisely the coset $f + V_0$
defined by $f$ with respect to the kernel $V_0 \subseteq \mathcal{L}^1_\mu(X)$ of the semi-norm. We
define, as is standard, the quotient space
\[ L^1_\mu(X) = \mathcal{L}^1_\mu(X) / V_0 \]
and note that the semi-norm $\|\cdot\|_1$ on $\mathcal{L}^1_\mu(X)$ gives rise to a norm, also
denoted $\|\cdot\|_1$, on $L^1_\mu(X)$. For an introduction to the function spaces $L^p_\mu(X)$
for $p \in [1, \infty)$ we refer to Appendix B.3. Where the measure is clear from
the context or has a standard choice (for example, the Lebesgue measure
on $[0, 1]$), it is omitted from the notation. This construction is a special case
of the following.
Lemma 2.15 (Quotient norm). For any vector space $V$ equipped with a
semi-norm $\|\cdot\|$, and any closed subspace $W \subseteq V$, the expression
\[ \|v + W\|_{V/W} = \inf_{w \in W} \|v + w\| \]
Proof. This is simply a matter of chasing the definitions through the
statements. Let $v_1, v_2 \in V$ and $\varepsilon > 0$ be given. Then there exist $w_1, w_2 \in W$
with
\[ \|v_i + w_i\| \le \|v_i + W\|_{V/W} + \varepsilon \]
for $i = 1, 2$. Hence
\begin{align*}
\|v_1 + v_2 + W\|_{V/W} &\le \|v_1 + v_2 + w_1 + w_2\| \\
&\le \|v_1 + w_1\| + \|v_2 + w_2\| \\
&\le \|v_1 + W\|_{V/W} + \|v_2 + W\|_{V/W} + 2\varepsilon,
\end{align*}
and so the triangle inequality holds for $\|\cdot\|_{V/W}$. Similarly, for any scalar $\alpha$,
\[ \|\alpha v_1 + \alpha w_1\| = |\alpha| \|v_1 + w_1\| \le |\alpha| \bigl( \|v_1 + W\|_{V/W} + \varepsilon \bigr), \]
which gives
\[ \|\alpha v_1 + W\|_{V/W} \le |\alpha| \|v_1 + W\|_{V/W}. \]
If $\alpha = 0$ then this is clearly an equality, and if $\alpha \neq 0$ then we may apply the
above to $\alpha v_1$ and the scalar $\alpha^{-1}$ to give
\[ \|v + W\|_{V/W} = 0. \]
Then for every $\varepsilon > 0$ there exists some $w \in W$ with $\|v - w\| < \varepsilon$. However,
this shows that $v$ belongs to the closure $\overline{W}$ of $W$. By assumption $W = \overline{W}$ is
closed, so that $v \in W$ and $v + W = W$ is the zero element in the quotient
space $V/W$.
For $W = V_0$ we have $\|v + w\| = \|v + w\| + \|-w\| \ge \|v\|$ for every $v \in V$
and $w \in V_0$, which gives the final claim.
Notice that we cannot expect the infimum in Lemma 2.15 to be a minimum
in general (see, for example, Exercise 2.16).
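By way of contrast, here is a simple case in which the infimum is attained. The space and subspace are our own choice, intended only as a sketch of how the quotient norm is computed.

```latex
% Illustrative choice: V = R^2 with the Euclidean norm, W = { (t, 0) : t in R }
\[
\|(a, b) + W\|_{V/W}
  = \inf_{t \in \mathbb{R}} \|(a + t, b)\|_2
  = \inf_{t \in \mathbb{R}} \sqrt{(a + t)^2 + b^2}
  = |b| ,
\]
% the infimum being attained at t = -a.
```

In this example the quotient $V/W$ with its quotient norm can be identified with $\mathbb{R}$ with the absolute value, via $(a, b) + W \mapsto b$.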
Exercise 2.16. Let $(C([-1, 1]), \|\cdot\|_\infty)$ be the normed vector space defined as in
Example 2.2(5). Define
\[ W = \Bigl\{ f \in C([-1, 1]) \Bigm| \int_{-1}^{0} f(x) \, dx = \int_{0}^{1} f(x) \, dx = 0 \Bigr\}. \]
Show that $W$ is a closed subspace. Now let $f(x) = x$, calculate $\|f\|_{C([-1,1])/W}$, and show
that the infimum is not achieved.
†
The following strengthening of the triangle inequality has interesting con-
sequences.
† The results in Section 2.1.3 are interesting, but will not be needed later.
for all v, v0 ∈ V .
Exercise 2.19. Show that the supremum norm $\|\cdot\|_\infty$ on $\mathbb{R}^2$ is not strictly convex. Give an
example to show that an isometry between normed spaces need not be affine, by considering
maps of the form $x \mapsto (x, f(x))$ from $(\mathbb{R}, |\cdot|)$ to $(\mathbb{R}^2, \|\cdot\|_\infty)$ for a suitably chosen function $f$.
for all $v_1, v_2 \in V$ and satisfies $M(0) = 0$, then $M$ is linear. To see this, pick $v$
in $V$ and apply (2.6) to the pairs $v$ and $0$, then to $\tfrac12 v$ and $0$, and inductively
to $\tfrac{1}{2^k} v$ and $0$ to prove that $M(\tfrac{1}{2^k} v) = \tfrac{1}{2^k} M(v)$ for all $k \in \mathbb{N}$ and $v \in V$. Next
apply (2.6) to $2v$ and $0$, and inductively to $(\ell + 1)v$ and $(\ell - 1)v$ to prove
that $M(\ell v) = \ell M(v)$ for all $\ell \in \mathbb{N}$ and $v \in V$. Finally, apply (2.6) to $v$ and $-v$
to see that $M(-v) = -M(v)$ for all $v \in V$. This gives
\[ M\bigl(\tfrac{n}{2^k} v\bigr) = \tfrac{n}{2^k} M(v) \]
that
\[ \|v_1 - z\| = \|z - v_2\| = \tfrac12 \|v_1 - v_2\|, \]
and hence (since M is an isometry)
and
for all v ∈ V , which also implies that z itself is the only point fixed of ψz .
Now fix $v_1, v_2 \in V$ and write $z = \frac{v_1 + v_2}{2}$ as before for the midpoint. Let $B$
be the group of all bijective isometries $V \to V$ that fix $v_1$ and $v_2$, and define
\[ \lambda = \sup\{\|g(z) - z\| \mid g \in B\}. \]
\[ 2\|g(z) - z\| = \|g'(z) - z\| \le \lambda, \]
Exercise 2.21. Show that the vertex set of a graph consisting of vertices v1 , v2 , v3 , vc
and three edges connecting one central vertex vc to the remaining three vertices v1 , v2 , v3 ,
endowed with the combinatorial distance given by d(vj , vk ) = 2δjk for j, k ∈ {1, 2, 3}
and d(vj , vc ) = 1 for j = 1, 2, 3 admits no isometric embedding into any Banach space
with a strictly sub-additive norm.
d(xm , xn ) < ε
for any m, n > N . The metric space is called complete if every Cauchy se-
quence converges to an element of X.
Once again there are many familiar examples of Banach spaces. As we will
see there is often an almost canonical choice of norm k · kV which makes a
linear space V into a Banach space (V, k · kV ). It is clear that this property of
a norm does not define it uniquely. In fact, any equivalent norm would induce
the same topology, the same notion of Cauchy sequence, and therefore also
make V into a Banach space.
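To illustrate why equivalent norms share the same Cauchy sequences, the following minimal sketch checks the chain $\|v\|_\infty \le \|v\|_2 \le \|v\|_1 \le d\,\|v\|_\infty$ on random vectors in $\mathbb{R}^d$. The choice of these three norms and all function names are assumptions made for illustration only; a sequence that is Cauchy for one of them is then Cauchy for the others, up to the constant $d$.

```python
import math
import random


def norm_inf(v):
    return max(abs(x) for x in v)


def norm_1(v):
    return sum(abs(x) for x in v)


def norm_2(v):
    return math.sqrt(sum(x * x for x in v))


def check_equivalence(d, trials=1000, seed=0):
    """Return True if the comparison chain holds on all sampled vectors."""
    rng = random.Random(seed)
    slack = 1e-12  # tolerance for floating-point rounding
    for _ in range(trials):
        v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        a, b, c = norm_inf(v), norm_2(v), norm_1(v)
        if not (a <= b + slack and b <= c + slack and c <= d * a + slack):
            return False
    return True


print(check_equivalence(3), check_equivalence(10))
```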
Example 2.24. We start with a small number of examples, and postpone the
proof that these are indeed Banach spaces to Section 2.2.1.
(1) The Euclidean space Rd with any of the norms from Example 2.2(1)–(4)
from Section 2.1.1 forms a Banach space.
(2) Let $X$ be any set. Then $B(X) = \{f : X \to \mathbb{R} \mid f \text{ is bounded}\}$, equipped
with the norm
\[ \|f\|_\infty = \sup_{x \in X} |f(x)|, \]
ℓ∞ = ℓ∞ (N) = B(N).
is a closed subspace of $C_b(X)$ and hence a Banach space. The notion of the
limit of $f(x)$ as $x \to \infty$ used here is defined as follows: $\lim_{x\to\infty} f(x) = A$
if and only if for every $\varepsilon > 0$ there exists some compact set $K \subseteq X$
with $|f(x) - A| < \varepsilon$ for all $x \in X \setminus K$. If $X = \mathbb{N}$ (with the discrete topology),
one often writes $c_0 = c_0(\mathbb{N}) = C_0(\mathbb{N})$ for this subspace of $\ell^\infty(\mathbb{N})$.
(5) The space $C^1([0, 1])$ of continuously differentiable functions on $[0, 1]$ with
the norm
\[ \|f\|_{C^1([0,1])} = \max\{\|f\|_\infty, \|f'\|_\infty\} \]
is a Banach space.
(6) Let $U \subseteq \mathbb{R}^d$ be non-empty and open, and fix $k \ge 1$. Then the space $C_b^k(U)$
of functions $U \to \mathbb{R}$ for which all partial derivatives up to order $k$ exist
and are continuous and bounded on $U$, equipped with the norm
is a Banach space, where $\partial_\alpha$ for $\alpha \in \mathbb{N}_0^d$ stands for the partial differential
operator defined by
\[ \partial_\alpha f = \frac{\partial^{\|\alpha\|_1}}{\partial x_1^{\alpha_1} \cdots \partial x_d^{\alpha_d}} f \]
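\[ L^\infty_\mu(X) = \mathscr{L}^\infty(X)/W_\mu(X), \]
where
and $L^\infty_\mu(X)$ is equipped with the essential supremum norm defined by
\[ \|f\|_{\mathrm{esssup}} = \operatorname*{esssup}_{x \in X} |f(x)| = \inf\bigl\{\alpha \ge 0 \mid \mu(\{x \mid |f(x)| > \alpha\}) = 0\bigr\}. \tag{2.9} \]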
We will generally follow the convention that the essential supremum norm
of f is also denoted for simplicity by kf k∞ . All of these ℓp and Lp spaces also
have natural complex-valued analogues.
As is customary, we will quickly stop being too careful about the distinction
between an element of $\mathscr{L}^\infty(X)$ and the equivalence class defined
by it in $L^\infty_\mu(X)$. For example, $|f|(x) = |f(x)|$ for all $x \in X$ really depends
on $f \in \mathscr{L}^\infty(X)$ and not just on the equivalence class, but (as we will see
later in the proof of completeness) the norm defined in (2.9) is independent
of the representative chosen for a given equivalence class.
Exercise 2.25. Show that a product of two normed vector spaces V × W is complete with
respect to one of the norms from Exercise 2.9 if and only if both V and W are complete
with respect to their own norms. Thus the product of two Banach spaces is a Banach space.
In this subsection we will explain why the examples from Example 2.24 are
indeed Banach spaces. Depending on the background of the reader, parts of
this section may be skipped. In each case it is proving completeness that
really takes up what effort is required. The following principle will be used
several times.
Example 2.24(1). If two norms are equivalent then they define the same
notion of convergence and of Cauchy sequence. Thus it is enough to consider
$\mathbb{R}^d$ with the norm $\|\cdot\|_\infty$ by Proposition 2.6. Now a Cauchy sequence $(v_n)$
in $\mathbb{R}^d$ has the property that each component sequence $(v_n^{(i)})$ for a fixed $i$,
where $v_n = (v_n^{(1)}, \dots, v_n^{(d)})^t$ for all $n$, is itself a Cauchy sequence in $\mathbb{R}$. Since $\mathbb{R}$
is complete, there exists a limit $v^{(i)} = \lim_{n\to\infty} v_n^{(i)}$ for each $i$. These limits
together define a vector $v = (v^{(1)}, \dots, v^{(d)})^t$ and it is easy to see that $v$ is the
limit of $(v_n)$ in $\mathbb{R}^d$.
Example 2.24(2). Let $X$ be any set and let $(f_n)$ be a Cauchy sequence
in $B(X)$ with respect to $\|\cdot\|_\infty$. Then for any fixed $x \in X$ the sequence $(f_n(x))$
is a Cauchy sequence in $\mathbb{R}$, which therefore has a limit $f(x)$. This defines a
function $f : X \to \mathbb{R}$. We need to show that $f \in B(X)$ and $f_n \to f$ as $n \to \infty$
with respect to $\|\cdot\|_\infty$. Since $(f_n)$ is Cauchy, for any $\varepsilon > 0$ there is some $N(\varepsilon)$
with $\|f_m - f_n\|_\infty < \varepsilon$ for all $m, n \ge N(\varepsilon)$, and so $|f_m(x) - f_n(x)| < \varepsilon$ for
any $x \in X$ and $m, n \ge N(\varepsilon)$. Now let $m \to \infty$ to see that $|f(x) - f_n(x)| \le \varepsilon$
for all $n \ge N(\varepsilon)$. Setting $\varepsilon = 1$ and $n = N(1)$ gives $|f(x)| \le 1 + \|f_{N(1)}\|_\infty$ for
any $x \in X$, showing that $f \in B(X)$. For any $\varepsilon > 0$, we obtain $\|f - f_n\|_\infty \le \varepsilon$
for all $n \ge N(\varepsilon)$ and hence that $f = \lim_{n\to\infty} f_n \in B(X)$, as required. If $X$
has cardinality $d$ then this example reduces to the previous one.
Example 2.24(3). By definition, Cb (X) is a subspace of B(X), and we use
the same norm on both spaces. Thus, if (fn ) is a Cauchy sequence in Cb (X)
then, by (2), there exists a limit f = limn→∞ fn ∈ B(X). It remains to show
that f ∈ Cb (X) — that is, to show that Cb (X) is a closed subspace of B(X).
This is a familiar argument from real analysis. Given any ε > 0 there exists
some n with kfn − f k∞ < ε. Since fn ∈ Cb (X) is continuous at x, there is a
neighbourhood U ⊆ X of x with |fn (y) − fn (x)| < ε for all y ∈ U . Therefore,
\[ |f(y) - f(x)| \le \underbrace{|f(y) - f_n(y)|}_{\le \|f - f_n\|_\infty < \varepsilon} + \underbrace{|f_n(y) - f_n(x)|}_{< \varepsilon} + \underbrace{|f_n(x) - f(x)|}_{\le \|f - f_n\|_\infty < \varepsilon} < 3\varepsilon \]
for all y ∈ U . As the existence of such a neighbourhood holds for all ε > 0
and x ∈ X, we see that f ∈ Cb (X) as required.
Example 2.24(4). Once again C0 (X) ⊆ Cb (X) and we use the same norm.
So if (fn ) is a Cauchy sequence in C0 (X), then f = limn→∞ fn ∈ Cb (X)
exists by (3). We only need to show that $f \in C_0(X)$. For this, let $\varepsilon > 0$ and
choose $n \in \mathbb{N}$ with $\|f_n - f\|_\infty < \varepsilon$. Since $f_n \in C_0(X)$, there exists some
compact set $K \subseteq X$ with $|f_n(x)| < \varepsilon$ for all $x \in X \setminus K$. Thus
and (fn ) is a Cauchy sequence with respect to the norm k · k∞ . Thus (3)
applies and shows that fn converges uniformly to some f ∈ C([0, 1]). The
same argument applies to the sequence (fn′ ) of derivatives, showing that (fn′ )
converges uniformly to some g ∈ C([0, 1]). All that remains is to verify that
f ′ = g. (2.10)
as n → ∞, as required.
Exercise 2.27. Generalize Example 2.24(6) to give a Banach space over C in two different
ways as follows.
(a) Let U ⊆ Rd be open and consider C-valued bounded differentiable functions with
bounded continuous derivative (here there is little difference from the real case).
(b) Let U ⊆ C (or in Cd ) be open, and consider the space of bounded complex differentiable
functions with bounded derivative.
For Examples 2.24(7) and (8) regarding integrable functions and bounded
measurable functions, we will use two lemmas that we formulate more gen-
erally.
The usual definitions of convergence and absolute convergence of series
extend easily to normed vector spaces as follows. A series $\sum_{n=1}^\infty v_n$ converges
if the sequence of partial sums $(s_N)_{N \ge 1}$ converges, where $s_N = \sum_{n=1}^N v_n$
for all $N \ge 1$, and converges absolutely if the real-valued series $\sum_{n=1}^\infty \|v_n\|$
converges.
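As a concrete instance of these definitions, the following minimal sketch computes partial sums in $(\mathbb{R}^2, \|\cdot\|_2)$ for a hypothetical series with terms $v_n = ((-1)^n/n^2,\ 2^{-n})$; the series and all names are illustrative assumptions. The series of norms is dominated by $\sum (1/n^2 + 2^{-n})$, so it converges absolutely, and the partial sums $s_N$ settle down to $(-\pi^2/12,\ 1)$.

```python
import math


def term(n):
    """n-th term v_n of the example series in R^2."""
    return ((-1) ** n / n ** 2, 0.5 ** n)


def norm2(v):
    """Euclidean norm on R^2."""
    return math.hypot(v[0], v[1])


def partial_sum(N):
    """s_N = v_1 + ... + v_N."""
    s0, s1 = 0.0, 0.0
    for n in range(1, N + 1):
        t = term(n)
        s0 += t[0]
        s1 += t[1]
    return (s0, s1)


def norm_partial_sum(N):
    """Partial sum of the real series of norms ||v_1|| + ... + ||v_N||."""
    return sum(norm2(term(n)) for n in range(1, N + 1))


for N in (10, 100, 1000):
    print(N, partial_sum(N), norm_partial_sum(N))
```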
and the last sum can be made arbitrarily small by requiring n to be sufficiently
large.
Assume now for the converse that (V, k·k) is a normed vector space in which
every absolutely convergent series is convergent, and let (vn ) be a Cauchy
sequence in V . In order to render the Cauchy property more uniform, we
choose a subsequence of (vn ) as follows. For each k > 1 there exists some Nk
such that
\[ \|v_m - v_n\| < \frac{1}{2^k} \]
for all $m, n \ge N_k$. Using these numbers we define inductively an increasing
sequence $(n_k)$ by $n_1 = N_1$ and $n_k = \max\{n_{k-1} + 1, N_k\}$ for $k \ge 2$. The
corresponding subsequence $(v_{n_k})_{k \ge 1}$ satisfies $\|v_{n_{k+1}} - v_{n_k}\| < \frac{1}{2^k}$. Now define
\[ w_k = v_{n_{k+1}} - v_{n_k} \]
2.2 Banach Spaces 33
for all $k \ge 1$, so that $\sum_{k=1}^\infty \|w_k\| < \sum_{k=1}^\infty \frac{1}{2^k} = 1$ converges, and hence the
infinite sum $\sum_{k=1}^\infty w_k = w \in V$ converges by our assumption on the normed
space $(V, \|\cdot\|)$. For the $\ell$th partial sum of this series we obtain
\[ \sum_{k=1}^{\ell} w_k = v_{n_{\ell+1}} - v_{n_1}, \]
converges.
We refer to Appendix B for basic properties of $\|\cdot\|_p$ on $L^p_\mu(X)$, and in
particular for the triangle inequality. Moreover, in the proof below and in
the remainder of the book we will need the monotone convergence and the
dominated convergence theorems (Theorems B.7 and B.8, respectively).
Example 2.24(7). Let $(f_n)$ be a sequence in $L^p_\mu(X)$ with
\[ M = \sum_{n=1}^\infty \|f_n\|_p < \infty. \]
By Lemma 2.28 it is enough to show that $\sum_{n=1}^\infty f_n$ converges in $L^p_\mu(X)$. For
this, define a sequence of functions $(g_n)$ by
\[ g_n(x) = \sum_{k=1}^n |f_k(x)|. \]
Clearly $g_n(x) \nearrow g(x)$ for some measurable function $g : X \to [0, \infty]$. Note
that
\[ \int |g_n|^p\,d\mu = \|g_n\|_p^p \le \Bigl( \sum_{k=1}^n \|f_k\|_p \Bigr)^p \le M^p \]
f : X → R.
Strictly speaking we have only defined f on the complement of a null set, but
we simplify the notation by ignoring this distinction here.
Since we also have $|f(x)| \le g(x)$ for all $x$, we have $f \in L^p_\mu(X)$. It remains
to show that
\[ \Bigl\| \sum_{k=1}^n f_k - f \Bigr\|_p \longrightarrow 0 \tag{2.14} \]
as $n \to \infty$. For this, notice first that $\bigl|\sum_{k=1}^n f_k - f\bigr|^p \le (2g)^p$ and by definition
of $f$ we also have $\bigl|\sum_{k=1}^n f_k - f\bigr|^p \to 0$ as $n \to \infty$ and almost everywhere,
so that we may apply dominated convergence to the sequence of integrals
defined by
\[ \Bigl\| \sum_{k=1}^n f_k - f \Bigr\|_p^p = \int_X \Bigl| \sum_{k=1}^n f_k - f \Bigr|^p d\mu. \]
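as given in Example 2.24(8). For this, assume first that $\alpha > \|f\|_{\mathrm{esssup}}$ so
that $N_\alpha = \{x \in X \mid |f(x)| > \alpha\}$ is a $\mu$-null set, and hence $g_\alpha = -f 1_{N_\alpha} \in W_\mu$.
It follows that
\[ \|f\|_{\mathscr{L}^\infty/W_\mu} \le \|f + g_\alpha\|_\infty \le \alpha. \]
Since this holds for any $\alpha > \|f\|_{\mathrm{esssup}}$ it follows that
\[ \|f\|_{\mathscr{L}^\infty/W_\mu} \le \|f\|_{\mathrm{esssup}}. \]
If, on the other hand, $\alpha > \|f\|_{\mathscr{L}^\infty/W_\mu}$, then there exists some $g \in W_\mu$ with
\[ \|f + g\|_\infty < \alpha, \]
and so
\[ \{x \in X \mid |f(x)| > \alpha\} \subseteq \{x \in X \mid g(x) \neq 0\} \]
is a null set. Varying $\alpha$ once more, we see that $\|f\|_{\mathrm{esssup}} \le \|f\|_{\mathscr{L}^\infty/W_\mu}$.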
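The argument can be traced on a toy example. The sketch below assumes a measure on a finite set given by point masses (some of them zero); this setting and all names are hypothetical and chosen only for illustration. On such a space the essential supremum is the largest value of $|f|$ on points of positive mass, and cutting $f$ off on the null set $N_\alpha$ produces a representative with supremum norm at most $\alpha$.

```python
def ess_sup(f, mu):
    """Essential supremum of |f| for point masses mu on a finite set."""
    return max((abs(f[x]) for x in f if mu[x] > 0), default=0.0)


def sup_norm(f):
    """Ordinary supremum norm of f on the finite set."""
    return max(abs(value) for value in f.values())


def exceptional_set(f, alpha):
    """N_alpha = {x : |f(x)| > alpha}."""
    return {x for x in f if abs(f[x]) > alpha}


def cut_off(f, alpha):
    """f + g_alpha with g_alpha = -f * 1_{N_alpha}."""
    bad = exceptional_set(f, alpha)
    return {x: (0.0 if x in bad else f[x]) for x in f}


f = {"a": 1.0, "b": -5.0, "c": 2.0}
mu = {"a": 1.0, "b": 0.0, "c": 0.5}   # the point "b" carries no mass
alpha = 2.5                            # any alpha above ess_sup(f, mu)
print(ess_sup(f, mu), sup_norm(f), sup_norm(cut_off(f, alpha)))
```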
Exercise 2.30. Show that in the definition of $\|\cdot\|_{\mathrm{esssup}}$ and of $\|\cdot\|_{\mathscr{L}^\infty/W_\mu}$ (from the
proof that Example 2.24(8) is a Banach space on p. 35) the infima are actually minima
and hence that for $f \in L^\infty_\mu(X)$ we have $|f(x)| \le \|f\|_{\mathrm{esssup}}$ $\mu$-almost everywhere.
Even though we have seen several examples of Banach spaces above, there
are many natural normed vector spaces that are not Banach spaces. For
example, R[x] is not a Banach space with respect to any of the five norms
discussed in Example 2.2(7) (see also Exercise 2.62). As a result it is useful
to know that any normed vector space has a completion (whose uniqueness
properties we will discuss in Corollary 2.60).
B = W/W0
and
\[ \|b\|_B = \lim_{n\to\infty} \|v_n\| \]
where $b = (v_n) + W_0$.
It follows from our discussion concerning quotient norms (see Lemma 2.15)
that (B, k · kB ) is a normed vector space. Moreover, B contains an isometric
copy of V (that is, there is an isometry V → B), since an element v ∈ V can
be identified with the equivalence class of the constant sequence
φ(v) = (v, v, . . . ) + W0 ,
by definition.
We claim that (the image of) V is dense in B. Given an equivalence
class b = (v1 , v2 , . . . ) + W0 ∈ B of a Cauchy sequence (vn ), for every ε > 0
there exists some N with kvm − vn k < ε for m, n > N . Then
Using this for any ε > 0 shows that the image of V is dense in B.
It remains to show that $B$ is complete with respect to $\|\cdot\|_B$. For this,
assume that $(b_n)_{n \ge 1}$ is a Cauchy sequence in $B$. Since the image of $V$ is
dense in $B$ we can find a sequence $(v_n)$ of vectors in $V$ with
\[ \|b_n - \phi(v_n)\|_B < \frac{1}{n} \]
for each $n \in \mathbb{N}$. Then for every $\varepsilon > 0$ there exists some $N(\varepsilon)$ with
\[ \|b_m - b_n\|_B < \varepsilon \]
and $\frac{1}{m}, \frac{1}{n} < \varepsilon$ for $m, n \ge N(\varepsilon)$, so that
\[ b = (v_1, v_2, \dots) + W_0 \in B \tag{2.15} \]
\[ \|b - b_m\|_B \le \|b - \phi(v_m)\|_B + \|\phi(v_m) - b_m\|_B < \underbrace{\lim_{n\to\infty} \|v_n - v_m\|}_{< 3\varepsilon} + \frac{1}{m} < 4\varepsilon \]
for $m \ge N(\varepsilon)$. Thus $b \in B$ defined by (2.15) is the limit of $(b_n)$ and so $B$ is
a Banach space.
Exercise 2.33. Let Cc (R) be the vector space of continuous functions f : R → R with
Supp(f ) = {x ∈ R | f (x) 6= 0}
compact, with the norm $\|\cdot\|_\infty$. Show that this space is not complete, and find a Banach
space containing $C_c(\mathbb{R})$ as a dense subspace so that the induced norm obtained by restriction
is $\|\cdot\|_\infty$. Can you do the same for the norm $\|f\|_\Psi = \|f\Psi\|_\infty$, where $\Psi : \mathbb{R} \to \mathbb{R}_{>0}$ is
a fixed continuous function (for example, $\Psi(x) = e^{x^2}$)?
Exercise 2.34. Generalize Theorem 2.32 to metric spaces as follows. If (X, d) is a metric
space, then a completion of (X, d) is a pair consisting of a complete metric space (X ∗ , d∗ )
and an isometry φ : X → X ∗ with the property that φ(X) is dense in X ∗ . Prove that any
metric space has a completion.
\[ \|v_n\| \le 1 \tag{2.16} \]
\[ \|v_{k+1} - v_n\| \ge \|v_{k+1} + W\|_{V/W} = \frac{1}{\|v + w\|} \|v + W\|_{V/W} \ge \frac{d}{2d} = \frac{1}{2} \]
with the $p$-norm $\|(x_n)\|_p = \bigl( \sum_{n=1}^\infty |x_n|^p \bigr)^{1/p}$.
for all x, y ∈ X and f ∈ K. The key uniformity here is that a single δ may
be used for all the functions f ∈ K.
\[ |f(x) - f(y)| \le \underbrace{|f(x) - f_i(x)|}_{< \varepsilon} + \underbrace{|f_i(x) - f_i(y)|}_{< \varepsilon \text{ by (2.19)}} + \underbrace{|f_i(y) - f(y)|}_{< \varepsilon} < 3\varepsilon, \]
showing equicontinuity.
Proving compactness: Now suppose that K ⊆ C(X) is closed, bounded,
and equicontinuous. To show that K is compact, let (fn ) be an arbitrary
sequence in K. It will be enough to exhibit a Cauchy subsequence of (fn ),
since by Example 2.24(3) such a subsequence will converge in C(X), and by
our assumption that K is closed the limit will be in K.
2.3 The Space of Continuous Functions 41
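\[ \|f_n\|_\infty \le M \]
\[ {} + \underbrace{|f_{n_k}(y) - f_{n_\ell}(y)|}_{< \varepsilon \text{ by (2.22)}} + \underbrace{|f_{n_\ell}(y) - f_{n_\ell}(x)|}_{< \varepsilon \text{ by (2.21)}} < 3\varepsilon. \]
Thus $\|f_{n_k} - f_{n_\ell}\| < 3\varepsilon$ for all $k, \ell \ge N(\varepsilon)$, showing that the subsequence is
Cauchy as required.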
Exercise 2.39. (a) Prove the Arzela–Ascoli theorem for any compact space (that is,
without assuming that the space is a metric space). To do this, define a subset K of C(X)
to be equicontinuous if for every ε > 0 and every x ∈ X there exists a neighbourhood U
of x with |f (y) − f (x)| < ε for all f ∈ K.
(b) Extend the Arzela–Ascoli theorem to the space C0 (X) of continuous functions vanishing
at infinity with the uniform norm kf k∞ = supx∈X |f (x)|, where X is a locally compact
metric (or just locally compact) space.
for any f, g ∈ A.
Proof. It is easy to check that the algebra operations are continuous with
respect to k · k∞ . (See (2.2)–(2.3) for the vector space operations and gen-
eralize the argument to include the product operation for functions; see also
Section 2.4.2 for a more general discussion containing this case.) Therefore the closure $\overline{A}$
of $A$ is also an algebra. Recall that
\[ \sqrt{1 + u} = (1 + u)^{1/2} = \sum_{n=0}^\infty \binom{1/2}{n} u^n \]
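Studying the coefficients more closely gives $\sum_{n=0}^\infty \bigl|\binom{1/2}{n}\bigr| < \infty$, and the reader
who knows this may set $\varepsilon = 0$ and simplify the argument accordingly, but
we will not use this. Suppose that $f \in A$, $M = \|f\|_\infty$, and $\varepsilon > 0$. Then the
function
\[ g_\varepsilon = \frac{1}{M^2 + \varepsilon} (f^2 + \varepsilon) \]
is in $A$ and takes on values in $[\varepsilon/(M^2 + \varepsilon), 1]$, and so
\[ \sum_{n=0}^\infty \Bigl|\binom{1/2}{n}\Bigr| \|g_\varepsilon - 1\|_\infty^n < \infty, \]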
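converges with respect to $\|\cdot\|_\infty$ by Example 2.24(3) and Lemma 2.28. We
deduce that $\sqrt{f^2 + \varepsilon} \in \overline{A}$. Now
\[ 0 \le \sqrt{f^2 + \varepsilon} - |f| = \sqrt{f^2 + \varepsilon} - \sqrt{f^2} = \frac{f^2 + \varepsilon - f^2}{\sqrt{f^2 + \varepsilon} + \sqrt{f^2}} \le \frac{\varepsilon}{\sqrt{\varepsilon}}. \]
In particular, $\bigl\| |f| - \sqrt{f^2 + \varepsilon} \bigr\|_\infty \le \sqrt{\varepsilon}$, and so the fact that $\sqrt{f^2 + \varepsilon} \in \overline{A}$
for all $\varepsilon > 0$ implies that $|f| \in \overline{A}$. The identities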
\[ \max\{f, g\} = \tfrac12 (f + g) + \tfrac12 |f - g| \]
and
\[ \min\{f, g\} = \tfrac12 (f + g) - \tfrac12 |f - g| \]
give the other parts of the lemma.
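The two ingredients of this proof are easy to test pointwise. The following minimal sketch (function names are hypothetical) checks the lattice identities for $\max$ and $\min$ on real numbers and measures the gap $\sqrt{t^2 + \varepsilon} - |t|$, which should lie in $[0, \sqrt{\varepsilon}]$.

```python
import math


def max_via_abs(a, b):
    """max{a, b} written as (a + b)/2 + |a - b|/2."""
    return 0.5 * (a + b) + 0.5 * abs(a - b)


def min_via_abs(a, b):
    """min{a, b} written as (a + b)/2 - |a - b|/2."""
    return 0.5 * (a + b) - 0.5 * abs(a - b)


def smoothing_gap(t, eps):
    """sqrt(t^2 + eps) - |t|, the error of the smooth substitute for |t|."""
    return math.sqrt(t * t + eps) - abs(t)


def largest_gap(points, eps):
    """Largest smoothing gap over a finite set of sample points."""
    return max(smoothing_gap(t, eps) for t in points)


eps = 1e-4
grid = [k / 100.0 for k in range(-200, 201)]  # sample points in [-2, 2]
print(largest_gap(grid, eps), math.sqrt(eps))
print(max_via_abs(1.5, -2.25), min_via_abs(1.5, -2.25))
```

The largest gap occurs at $t = 0$, where it equals $\sqrt{\varepsilon}$, so the bound in the proof cannot be improved uniformly.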
max{f1 , . . . , fn }, min{f1 , . . . , fn } ∈ A.
We will use this property for a given $f \in C_{\mathbb{R}}(X)$ and $\varepsilon > 0$ to find a function
$f_\varepsilon \in A$ with $\|f - f_\varepsilon\|_\infty < \varepsilon$. This then implies that $\overline{A} = C_{\mathbb{R}}(X)$. The
construction has three steps.
First step: correct value at two points. Let $x_0, x \in X$ be (not necessarily
distinct) points. Then there exists some $h_{x_0,x} \in A$ with
\[ h_{x_0,x}(x_0) = f(x_0), \qquad h_{x_0,x}(x) = f(x). \tag{2.23} \]
for all y ∈ X. That is, gx0 is chosen to have the correct value at x0 for the
objective of approximating f , and to be not much smaller than f at every
other point, as illustrated in Figure 2.1.
Fig. 2.1: The function $g_{x_0}$ is constructed by finding $x_1, \dots, x_n$ (in this case, $x_0$
and $x$) with the property that $g_{x_0} = \max\{h_{x_0,x_1}, \dots, h_{x_0,x_n}\} > f - \varepsilon$.
We will construct gx0 as a maximum after finding a finite subcover for the
following open cover of X. For any x ∈ X (including x0 ) there exists an open
neighbourhood Ox of x with
We define
gx0 = max{hx0 ,x1 , . . . , hx0 ,xn } ∈ A,
and notice that gx0 satisfies
by (2.25).
Fig. 2.2: The function $f_\varepsilon = \min\{g_{x_1}, \dots, g_{x_m}\}$ is constructed with $\|f - f_\varepsilon\|_\infty < \varepsilon$.
We define
\[ f_\varepsilon = \min\{g_{x_1}, \dots, g_{x_m}\} \in A, \]
and claim that $\|f - f_\varepsilon\|_\infty \le \varepsilon$, as illustrated in Figure 2.2. For every $y \in X$
we have
\[ g_{x_i}(y) > f(y) - \varepsilon \]
by the property of $g_{x_i}$ in (2.24), and so $f_\varepsilon(y) > f(y) - \varepsilon$. By (2.28) every $y \in X$
lies in some $U_{x_i}$ and so (2.27) implies that
AR = A ∩ CR (X).
\[ u = \frac{f + \overline{f}}{2}, \quad v = \frac{f - \overline{f}}{2i} \in A_{\mathbb{R}} \]
by our assumption on A. Thus AR also contains a function that separates x
and y. By the real case, AR is dense in CR (X), so by splitting an arbitrary
function in CC (X) into real and imaginary parts and approximating each of
these with elements of AR ⊆ A the theorem is proved.
Exercise 2.43. (a) Let X be as in Theorem 2.40. Show that the second requirement
(Constants) on A ⊆ C(X) could also be replaced by the requirement
• (Nowhere vanishing) for every $x \in X$ there is a function $f \in A$ with $f(x) \neq 0$.
(b) Let X be a locally compact space. Extend the Stone–Weierstrass theorem to C0 (X) by
considering a sub-algebra A ⊆ C0 (X) that separates points, is closed under conjugation,
and vanishes nowhere.
Show that $\|\cdot\|_K$ and $\|\cdot\|_L$ are inequivalent norms on $\mathbb{R}[x]$ if $K \neq L$ are two different
infinite compact subsets.
Exercise 2.45. Let X and Y be two compact spaces. Prove that the linear hull of all
functions of the form (x, y) ∈ X × Y → f (x)g(y) for f ∈ C(X) and g ∈ C(Y ) is dense
in C(X × Y ).
Proof. The space X is separable (this may be seen, for example, from the
proof of Theorem 2.38) so we may choose a countable dense set {xn | n ∈ N}
in X. We now define fn (x) = d(x, xn ) for all x ∈ X and n > 1, and claim
that these functions separate points in $X$. That is, if $x \neq y$ then there exists
some $n$ for which $f_n(x) \neq f_n(y)$. To see this, notice that by density there is
some $n$ with $d(x, x_n) = f_n(x) < \tfrac12 d(x, y)$, which implies that
†
As an application of the discussion above, and in particular of the Stone–
Weierstrass theorem, we now describe the notion of equidistribution. A se-
quence (xn )n of elements of a metric space X is dense if for every x ∈ X
there is a subsequence (xnk )k that converges to x. A much finer property
is given by equidistribution, which roughly speaking corresponds to the se-
quence spending the right proportion of time in any given part of the space.
In this section we will define and discuss this notion carefully for X = [0, 1].
A sequence $(x_n)_{n \ge 1}$ of points in $[0, 1]$ is said to be equidistributed or uniformly
distributed if any one of the following equivalent conditions is satisfied:
(1) $\frac{1}{K} \bigl|\{k \in \mathbb{N} \mid 1 \le k \le K,\ x_k \in [a, b]\}\bigr| \to b - a$ as $K \to \infty$ for $0 \le a < b \le 1$.
(2) $\frac{1}{K} \sum_{k=1}^K f(x_k) \longrightarrow \int_0^1 f(x)\,dx$ as $K \to \infty$ for any $f \in C([0, 1])$ (that is,
any continuous function).
(3) $\frac{1}{K} \sum_{k=1}^K f(x_k) \longrightarrow \int_0^1 f(x)\,dx$ as $K \to \infty$ for any $f \in R([0, 1])$ (that is,
any Riemann-integrable function).
(4) $\frac{1}{K} \sum_{k=1}^K \chi_n(x_k) \longrightarrow \int_0^1 \chi_n(x)\,dx = \begin{cases} 0 & \text{if } n \neq 0 \\ 1 & \text{if } n = 0 \end{cases}$ as $K \to \infty$ for any $n$
in $\mathbb{Z}$, where $\chi_n(x) = e^{2\pi i n x}$ for all $x \in [0, 1]$.
We will now sketch some of the implications between these equivalent state-
ments (see Exercise 2.48) and will return to the topic of equidistribution in
Chapter 8 from a more general point of view.
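Conditions (1) and (4) can be compared numerically. The sketch below uses the sequence $x_k = k\alpha \bmod 1$ with $\alpha = \sqrt{2}$, a standard example that is not taken from this section; it and the function names are assumptions made for illustration. For this sequence the averages in (4) are geometric sums, bounded in modulus by $1/(K\,|\sin(\pi n \alpha)|)$ for $n \neq 0$, so they should be visibly small already for moderate $K$.

```python
import cmath
import math


def rotation_orbit(alpha, K):
    """The points x_k = k * alpha mod 1 for k = 1, ..., K."""
    return [(k * alpha) % 1.0 for k in range(1, K + 1)]


def character_average(xs, n):
    """(1/K) * sum of chi_n(x_k), where chi_n(x) = exp(2 pi i n x)."""
    total = sum(cmath.exp(2j * math.pi * n * x) for x in xs)
    return total / len(xs)


def interval_frequency(xs, a, b):
    """Proportion of the points x_k lying in [a, b]."""
    return sum(1 for x in xs if a <= x <= b) / len(xs)


xs = rotation_orbit(math.sqrt(2.0), 20000)
for n in (0, 1, 2, 3):
    print(n, abs(character_average(xs, n)))        # condition (4)
print(interval_frequency(xs, 0.2, 0.5), 0.5 - 0.2)  # condition (1)
```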
Almost a proof of (4) =⇒ (2). Consider the algebra of trigonometric
polynomials
† The results of this section will not be needed in this form later, so may be skipped.
\[ A = \Bigl\{ \sum_{n=-N}^{N} c_n \chi_n \Bigm| c_n \in \mathbb{C},\ N \in \mathbb{N} \Bigr\}. \]
and
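\[ \Bigl| \frac{1}{K} \sum_{k=1}^K f(x_k) - \frac{1}{K} \sum_{k=1}^K g(x_k) \Bigr| < \varepsilon \]
It follows that
\[ \Bigl| \frac{1}{K} \sum_{k=1}^K f(x_k) - \int_0^1 f(x)\,dx \Bigr| < 3\varepsilon, \]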
which is not quite the claim in (2) since C(T) and C([0, 1]) differ slightly.
Indeed, any function f : T → C gives rise to a function f : R → C via the
diagram
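[Diagram: the quotient map $\mathbb{R} \to \mathbb{T}$ followed by $f : \mathbb{T} \to \mathbb{C}$ gives the induced function $\mathbb{R} \to \mathbb{C}$.]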
which we can restrict to [0, 1], defining an element g ∈ C([0, 1]). If f : T → C
is continuous then so is g, but g will always satisfy g(0) = g(1). On the other
hand, if g ∈ C([0, 1]) is a function satisfying g(0) = g(1) then one can define
a continuous function f : T → C by f (t + Z) = g(t) for t ∈ [0, 1], and obtain
the result for such g.
The extension to general continuous functions on [0, 1] can be handled by
the same method as in the proof that (2) implies (1) below, where we will
only assume (2) for all f ∈ C(T).
Exercise 2.47. Show that A as in the previous proof does indeed satisfy all the assump-
tions of Theorem 2.40,
Proof of (2) =⇒ (1). Suppose first that 0 < a < b < 1 and write 1[a,b]
for the characteristic function of the interval [a, b]. Fix ε > 0 and choose
continuous functions f− , f+ : [0, 1] → R with
(a) $0 \le f_-(x) \le 1_{[a,b]}(x) \le f_+(x) \le 1$ for all $x \in [0, 1]$,
(b) $\int_0^1 (f_+ - f_-)\,dx < \varepsilon$, and
(c) $f_+(0) = f_+(1) = f_-(0) = f_-(1) = 0$.
For example, the functions f+ and f− could be chosen to be piecewise linear,
as illustrated in Figure 2.3. In this case the shaded region can easily be
chosen to have total area bounded above by ε, as required in (b). By (c), the
functions f− and f+ also define continuous functions on T.
Fig. 2.3: The function $1_{[a,b]}$ and the approximations $f_-$ (drawn using dots) and $f_+$
(using dashes).
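Since
\[ \frac{1}{K} \sum_{k=1}^K f_-(x_k) \le \frac{1}{K} \sum_{k=1}^K 1_{[a,b]}(x_k) \le \frac{1}{K} \sum_{k=1}^K f_+(x_k) \]
for all $K \ge 1$, and the left-hand side converges to $\int_0^1 f_-(x)\,dx$ while the
right-hand side converges to $\int_0^1 f_+(x)\,dx$ as $K \to \infty$, we obtain
\[ (b - a) - \varepsilon \le \liminf_{K\to\infty} \frac{1}{K} \sum_{k=1}^K 1_{[a,b]}(x_k) \le \limsup_{K\to\infty} \frac{1}{K} \sum_{k=1}^K 1_{[a,b]}(x_k) \le (b - a) + \varepsilon, \]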
which implies the claim in (1) for 0 < a < b < 1. The formula in (1) holds
trivially if f ≡ 1, so we also get
\[ \frac{1}{K} \sum_{k=1}^K \bigl( 1_{[0,a)}(x_k) + 1_{(b,1]}(x_k) \bigr) \longrightarrow 1 - (b - a) \]
and
\[ \int_0^1 (f_+ - f_-)\,dx < 3\varepsilon, \]
and the formula in (1) already holds for f− and f+ . As before, this implies
the claim for 1[0,b] . The case of 0 < a < b = 1 is similar.
Exercise 2.48. Prove the remaining implications to show that the four characterizations
of equidistribution at the start of this section are indeed equivalent.
as K → ∞.
f = ℜf + iℑf,
as $n \to \infty$, by dominated convergence.
Thus it is sufficient to show that any simple function $f = \sum_{i=1}^N a_i 1_{B_i}$
(where $a_i \in \mathbb{R}$ and $B_i \in \mathcal{B}(X)$ have $\mu(B_i) < \infty$ for $i = 1, \dots, N$) can be
approximated by elements of C(X). This in turn will follow if we can show
that the characteristic function of any Borel set can be approximated by
elements of C(X) in the k · kp norm.
Defining a σ-algebra. Having made these initial reductions, we can now
turn to the heart of the argument (still assuming that X is compact). We
define the family
\[ \mathcal{A} = \bigl\{ B \in \mathcal{B} \mid 1_B \in \overline{C(X)}^{\|\cdot\|_p} \bigr\} \]
Open Subsets. Let $O \subseteq X$ be open. Define the closed set $A = X \setminus O$ and
the distance function
\[ d(x, A) = \inf_{y \in A} d(x, y). \tag{2.29} \]
† The reader may be familiar with this result for the Lebesgue measure (for example),
and this case is sufficient for much of the material that will follow. Thus she may skip the
general proof and return to it at a later stage if needed.
which implies that $d(x_1, A) \le d(x_1, x_2) + d(x_2, A)$, and hence (2.30) by the
symmetry between $x_1$ and $x_2$. This shows the continuity of $x \mapsto d(x, A)$
and so it follows that the function defined by $f_n(x) = \min\{1, n\,d(x, A)\}$ lies
in $C(X)$. Moreover, if $x \in A = X \setminus O$ then $f_n(x) = 0 = 1_O(x)$, while if $x \in O$
then $d(x, A) > 0$ and $f_n(x) \nearrow 1 = 1_O(x)$. Thus $f_n \nearrow 1_O$ as $n \to \infty$ on $X$,
and so
\[ \|f_n - 1_O\|_p = \Bigl( \int_O |f_n - 1_O|^p\,d\mu \Bigr)^{1/p} \longrightarrow 0 \]
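The convergence $f_n \nearrow 1_O$ can be made quantitative in a simple case. The sketch below assumes $X = [0, 1]$ with Lebesgue measure and $O = (a, b)$; these choices, the midpoint-rule quadrature, and all names are assumptions for illustration. Here $f_n$ is a trapezoid, and for $n \ge 2/(b - a)$ a direct computation gives $\|f_n - 1_O\|_p^p = 2/(n(p + 1))$, which tends to $0$.

```python
def dist_to_complement(x, a, b):
    """d(x, A) for A = [0, 1] minus the open interval (a, b)."""
    if a < x < b:
        return min(x - a, b - x)
    return 0.0


def f_n(x, n, a, b):
    """f_n(x) = min{1, n * d(x, A)}."""
    return min(1.0, n * dist_to_complement(x, a, b))


def lp_distance(n, p, a, b, grid=20000):
    """Midpoint-rule approximation of ||f_n - 1_O||_p on [0, 1]."""
    h = 1.0 / grid
    total = 0.0
    for j in range(grid):
        x = (j + 0.5) * h
        indicator = 1.0 if a < x < b else 0.0
        total += abs(f_n(x, n, a, b) - indicator) ** p * h
    return total ** (1.0 / p)


for n in (10, 40, 160):
    print(n, lp_distance(n, 2, 0.25, 0.75), (2.0 / (3.0 * n)) ** 0.5)
```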
which implies
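Thus
\[ \Bigl\| 1_{\bigcup_{k=1}^\infty A_k} - 1_{\bigcup_{k=1}^\ell A_k} \Bigr\|_p = \mu\Bigl( \bigcup_{k=1}^\infty A_k \setminus \bigcup_{k=1}^\ell A_k \Bigr)^{1/p} < \varepsilon. \]
However, since $\bigcup_{k=1}^\ell A_k \in \mathcal{A}$ for any $\ell \ge 1$, we already know that there exists
an $f \in C(X)$ with
\[ \Bigl\| f - 1_{\bigcup_{k=1}^\ell A_k} \Bigr\|_p < \varepsilon, \]
and so
\[ \Bigl\| f - 1_{\bigcup_{k=1}^\infty A_k} \Bigr\|_p < 2\varepsilon. \]
Since $\varepsilon > 0$ was arbitrary, we deduce that $\bigcup_{k=1}^\infty A_k \in \mathcal{A}$.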
Concluding the compact case. By the arguments above, A is a σ-algebra
containing all the open subsets of X. By definition, A ⊆ B and so A = B by
definition of the Borel σ-algebra B. As explained above, this implies that every
simple function, and so also every function, in Lpµ (X) can be approximated
by continuous functions.
Extending to the locally compact case. Let us now extend the above
to the general case where $X$ is locally compact σ-compact metric and $\mu$ is
locally finite. By Lemma A.22 we find a sequence $(X_m)$ of compact subsets
of $X$ with $X_m \subseteq X_{m+1}^o$ for all $m \ge 1$, and with $X = \bigcup_{m=1}^\infty X_m$.
Given some $f \in L^p_\mu(X)$ we first note that the sequence $f_m = f 1_{X_m}$ converges
to $f$ with respect to $\|\cdot\|_p$ as $m \to \infty$ (by dominated convergence).
Given some $\varepsilon > 0$ we choose $m$ such that $\|f - f_m\|_p < \varepsilon$.
Next we apply the compact case above and hence we find some $g \in C(X_m)$
with $\|g - f_m\|_{L^p(X_m, \mu)} < \varepsilon$. Applying Tietze's extension theorem (Proposition A.29)
we can extend $g$ to an element $g \in C_c(X_{m+1}^o) \subseteq C_c(X)$. Using
again the distance function in (2.29) with $A = X_m$ we define the sequence
\[ g_n(x) = \bigl( 1 - \min\{1, n\,d(x, X_m)\} \bigr) g(x) \in C_c(X). \]
where the first expression on the right is less than $\varepsilon^p$ by construction of $g$ and
the second expression converges to $0$ by dominated convergence as $n \to \infty$.
Therefore, there exists some n > 1 such that kgn − fm kp < 2ε. Combining
this with the choice of fm above, we obtain kf − gn kp < 3ε and gn ∈ Cc (X),
as desired.
2.4 Bounded Operators and Functionals 55
is finite.
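since $v \in B^V_{\varepsilon/\|L\|_{\mathrm{op}}} \setminus \{0\}$ implies that
\[ \|Lv\|_W = \underbrace{\|v\|_V}_{< \varepsilon/\|L\|_{\mathrm{op}}} \underbrace{\bigl\| L\bigl( \|v\|_V^{-1} v \bigr) \bigr\|_W}_{\le \|L\|_{\mathrm{op}}} < \varepsilon. \]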
is continuous. Use this to shorten the argument in the proof for Example 2.24(5) on p. 30.
Show also that the operator D : C 1 ([0, 1]) → C([0, 1]) defined as the derivative D(f ) = f ′
is not continuous if we use the norm k · k∞ on both spaces.
Notice that the definition of the operator norm immediately gives the
general inequality
\[ \|Lv\|_W \le \|L\|_{\mathrm{op}} \|v\|_V, \]
for all v ∈ V , and the operator norm may be characterized as being the
smallest number C with the property that
for all v ∈ V . We will use both these statements frequently in the sequel
without comment.
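Both statements can be seen in a finite-dimensional example. The sketch below treats a matrix as an operator on $(\mathbb{R}^d, \|\cdot\|_\infty)$, where the operator norm is known to equal the maximal absolute row sum; this setting and the function names are assumptions for illustration. Random sampling never exceeds that constant, and a suitable sign vector attains it, so no smaller constant works.

```python
import random


def sup_norm(v):
    return max(abs(x) for x in v)


def apply_matrix(L, v):
    """Matrix-vector product with L given as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in L]


def op_norm_sup(L):
    """Operator norm on (R^d, sup norm): the maximal absolute row sum."""
    return max(sum(abs(a) for a in row) for row in L)


def sampled_ratio(L, trials=2000, seed=0):
    """Largest observed ||Lv|| / ||v|| over random non-zero vectors v."""
    rng = random.Random(seed)
    best = 0.0
    for _ in range(trials):
        v = [rng.uniform(-1.0, 1.0) for _ in L[0]]
        if sup_norm(v) > 0:
            best = max(best, sup_norm(apply_matrix(L, v)) / sup_norm(v))
    return best


def extremal_vector(L):
    """A sign vector of sup norm 1 at which the operator norm is attained."""
    row = max(L, key=lambda r: sum(abs(a) for a in r))
    return [1.0 if a >= 0 else -1.0 for a in row]


L = [[1.0, -2.0], [3.0, 4.0]]
print(op_norm_sup(L), sampled_ratio(L), sup_norm(apply_matrix(L, extremal_vector(L))))
```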
Essential Exercise 2.56. Prove that the operator norm of a bounded op-
erator L : V → W between two normed vector spaces is the smallest con-
stant C > 0 such that (2.31) holds for all v ∈ V .
and so $\|\alpha L_1 + L_2\|_{\mathrm{op}} \le |\alpha| \|L_1\|_{\mathrm{op}} + \|L_2\|_{\mathrm{op}}$. That is, the operator norm
satisfies the triangle inequality and one half of the homogeneity property.
The reverse inequality for homogeneity of the operator norm follows easily
by considering the case $\alpha = 0$ and $\alpha \neq 0$ separately, as in the proof of
Lemma 2.15. Strict positivity is clear, so we have shown that B(V, W ) is a
normed vector space with the operator norm.
Now suppose that W is a Banach space and that (Ln ) is a Cauchy sequence
in B(V, W ). We claim that
which (for fixed v) may be made as small as we please for m, n large by the
Cauchy property for the sequence (Ln ).
To see that the limit L is a bounded operator one has to show that it is
linear (which we leave as an exercise) and that it is bounded. For the latter,
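assume that $v \in V$ has $\|v\|_V \le 1$ and choose $N(\varepsilon)$ as in the Cauchy property
for $(L_n)$ with $\|L_m - L_n\|_{\mathrm{op}} \le \varepsilon$ for $m, n \ge N(\varepsilon)$. Continuity of the norm now
gives
\[ \|L_n(v) - L(v)\|_W = \lim_{m\to\infty} \|L_n(v) - L_m(v)\|_W \le \varepsilon \]
for $n \ge N(\varepsilon)$. Taking the supremum over $v$ with $\|v\|_V \le 1$, we get
\[ \|L - L_n\|_{\mathrm{op}} \le \varepsilon, \]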
\[ \|S \circ R\| \le \|S\| \|R\|. \]
Exercise 2.58. Compute the operator norm of the continuous map $f \mapsto f$ when viewed:
(a) as a map from the Banach space $C^1([0, 1])$ to $C([0, 1])$ (and where the former is equipped
with the norm $\|f\|_{C^1([0,1])} = \max\{\|f\|_\infty, \|f'\|_\infty\}$ for $f \in C^1([0, 1])$); and
(b) as a map $C([0, 1]) \to L^1_m([0, 1])$, where $m$ denotes Lebesgue measure on $[0, 1]$.
(c) Compute the operator norm of the composition of the maps from (a) and from (b).
(d) Now restrict the maps in (a), (b) and (c) to the subspace of functions f with f (0) = 0,
and compute the operator norms again.
The following result is both quite easy and extremely useful for the theory
to come.
Proposition 2.59 (Unique extension to completion). Let V be a normed
vector space, let V0 ⊆ V be a dense subspace, and assume that L0 : V0 → W is
a bounded operator into a Banach space W . Then L0 has a unique bounded
extension L : V → W , that is a bounded linear map L : V → W which
satisfies L|V0 = L0 . Moreover, kLkB(V,W ) = kL0 kB(V0 ,W ) .
We implicitly assume here that a subspace V0 ⊆ V is equipped with the
restriction of the norm on V to V0 . This is important to remember in applic-
ations where the subspace may have other natural norms defined on it.
Proof of Proposition 2.59. For any v ∈ V there is a sequence (vn ) in V0
with vn → v as n → ∞. In particular, this implies that (vn ) is a Cauchy
sequence in V0 , and since
L0 : V0 → W
is bounded (and so Lipschitz), it follows that (L0 (vn )) is a Cauchy sequence
in W . If (vn′ ) is another sequence in V0 with vn′ → v as n → ∞ then it is
clear that vn − vn′ → 0 as n → ∞ and so
L0 (vn ) − L0 (vn′ ) −→ 0
because W is a Banach space. Notice that by density and the desired continu-
ity of the extension, this is the only possible definition of a bounded operator
that extends L0 . One can quickly check that L is a linear map from V to W .
Moreover, if v ∈ V and (vn ) is a sequence in V0 with vn → v as n → ∞, then
showing that L is bounded, with kLk 6 kL0 k. On the other hand L|V0 = L0 ,
so kLk > kL0 k.
Fig. 2.4: The two given completions $\phi_1, \phi_2$ and the maps $\psi_1, \psi_2$ to be constructed.
\[ \phi_2 \circ \phi_1^{-1} : \phi_1(V) \longrightarrow \phi_2(V) \subseteq B_2, \qquad \phi_1(v) \longmapsto \phi_2(v) \]
Exercise 2.61. Let D = {z ∈ C | |z| < 1} ⊆ C be the open unit disk, and parameterize
the circle of radius r ∈ (0, 1) by the map γr : [0, 1] → C defined by γr (t) = re2πit . Let V
be the space of functions f ∈ C(D) holomorphic on D, and fix p ∈ [1, ∞).
(a) Equip $V$ with the norm
\[ \|f\|_{H^p(D)} = \sup_{r \in (0,1)} \Bigl( \int_0^1 |f(\gamma_r(t))|^p\,dt \Bigr)^{1/p}. \]
Show that the linear map $E_z : f \mapsto f(z)$ is continuous with respect to $\|\cdot\|_{H^p(D)}$ for
all $z \in D$. Also show that if $O \subseteq D$ is open with compact closure $\overline{O} \subseteq D$, then
\[ V \ni f \longmapsto f|_O \in C(O) \]
is a bounded operator with respect to k·kH p (D) and k·k∞ on C(O). In particular, conclude
that there exists a canonical injective map from the completion H p (D) of V , known as a
Hardy space, into the space of holomorphic functions on D.
(b) Equip V with the norm
kf kAp (D) = kf kLp (D) ,
and repeat the problems from (a) to obtain the(7) Bergman space Ap (D).
Exercise 2.62. For each of the five norms on R[x] given in Example 2.2(7), find a Banach
space containing R[x] for which the induced norm obtained by restriction coincides with
the given norm on R[x].
Let X be a locally compact metric space, and let µ be a finite Borel measure.
Then
\[ \mu \colon f \longmapsto \int f \, d\mu \]
In fact, a more precise statement holds, but this takes a little more work.
Lemma 2.63 (Operator norm of integration). Suppose that µ is a Borel
measure on a locally compact σ-compact metric space X and g is a function
in L¹_µ(X). Then the norm of the functional on C0(X) defined in (2.32) is
precisely ‖g‖_{L¹_µ}.
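Proof†. Let
\[ h(x) = \overline{\arg(g(x))} = \begin{cases} \overline{g(x)}/|g(x)| & \text{if } g(x) \neq 0, \\ 0 & \text{if } g(x) = 0. \end{cases} \]
Clearly h ∈ L^∞_µ(X) and
\[ \int h g \, d\mu = \int |g| \, d\mu = \|g\|_{L^1_\mu}. \]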
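Thus
\[ \|g \, d\mu\|_{\mathrm{op}} \ge \Bigl| \int_X f_\varepsilon g \, d\mu \Bigr| \ge \Bigl| \int_K f_\varepsilon g \, d\mu \Bigr| - \Bigl| \int_{X \setminus K} f_\varepsilon g \, d\mu \Bigr| \ge \int_K |g| \, d\mu - \int_{X \setminus K} |g| \, d\mu \ge \|g\|_{L^1_\mu} - 2\varepsilon. \]
\[ \|g\|_{L^1_\mu} \le \|g \, d\mu\|_{\mathrm{op}}, \]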
Recall that a ring or an algebra does not need to have a unit; if a non-
trivial ring A has a unit 1A satisfying 1A a = a1A = a for all a ∈ A then it is
called unital.
The additional axiom on the norm makes the product operation continuous
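by the following argument. Fix ε ∈ (0, 1) and x, y ∈ A. Then ‖x′ − x‖ < ε < 1
and ‖y′ − y‖ < ε together imply that
\[ \|x'y' - xy\| \le \|x'\| \, \|y' - y\| + \|x' - x\| \, \|y\| < \varepsilon \bigl( \|x\| + \|y\| + 1 \bigr). \]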
Since ε ∈ (0, 1) was arbitrary, this shows the continuity of the product map
at (x, y) ∈ A × A.
We want to briefly indicate how even the simplest differential equations can
lead directly to the study of integral operators, which may be analyzed using
tools introduced above (and in Chapter 6).
Consider first the differential equation
2.5 Ordinary Differential Equations 63
f ′′ (x) + f (x) = 0,
giving
f (x) = A sin x + B cos x (2.37)
for constants A and B. Then one moves on to the problem of finding one
particular solution fp to the equation
so f′(0) = 0. Finally,
\[ f''(x) = -\cos x + \cos(x - x)\, g(x) - \int_0^x \sin(x - t)\, g(t) \, dt = -f(x) + g(x), \]
Now define k(x, t) = sin(x − t)σ(t) so that (2.41) takes the form
f = u + K(f ), (2.42)
(I − K)f = u
where I is the identity map. The solution f is then given by applying the
inverse operator (I − K)−1 to u, which we may calculate (in this particular
case) using an operator form of the geometric series (in this context the
geometric series is usually called a von Neumann series),
\[ (I - K)^{-1} = \sum_{n=0}^{\infty} K^n, \]
and hence
\[ f = \sum_{n=0}^{\infty} K^n u. \]
defines a bounded linear operator K : C([0, 1]) → C([0, 1]) with ‖K‖ ≤ ‖k‖_∞,
and more generally with ‖K^n‖ ≤ ‖k‖_∞^n / n! for n ≥ 1. In particular, the
geometric series
\[ (I - K)^{-1} = \sum_{n=0}^{\infty} K^n \]
converges in B(C([0, 1])). It follows that the integral equation (I − K)f = u
has a unique solution for any u ∈ C([0, 1]). For u(x) = cos x, σ ∈ C([0, 1])
and k(x, t) = sin(x − t)σ(t) with x, t ∈ [0, 1], this solution belongs to C²([0, 1]),
and solves the initial value problem
\[ \begin{cases} f'' + f = \sigma f \\ f(0) = 1, \quad f'(0) = 0 \end{cases} \tag{2.43} \]
on [0, 1].
for all t ∈ [0, 1]. Multiplying by f(t) and integrating from 0 to x shows that
\[ |K^n(f)(x)| \le \|k\|_\infty^n \, \|f\|_\infty \, \frac{x^n}{n!}. \tag{2.44} \]
Then
\[ |K^{n+1}(f)(x)| \le \int_0^x |k(x,t)| \, |K^n(f)(t)| \, dt \le \|k\|_\infty^{n+1} \|f\|_\infty \int_0^x \frac{t^n}{n!} \, dt = \|k\|_\infty^{n+1} \|f\|_\infty \frac{x^{n+1}}{(n+1)!} \]
for all x ∈ [0, 1]. By induction on n, it follows that (2.44) holds for all n ≥ 1.
Hence ‖K^n‖ ≤ ‖k‖_∞^n / n! for all n ≥ 1, as claimed.
By Lemma 2.54, B(C([0, 1])) is a Banach space. It follows by Lemma 2.28
that the absolutely convergent series Σ_{n=0}^∞ K^n also converges in B(C([0, 1])).
However,
\[ (I - K) \sum_{n=0}^{\infty} K^n = \sum_{n=0}^{\infty} K^n (I - K) = \sum_{n=0}^{\infty} K^n - \sum_{n=1}^{\infty} K^n = I, \]
so the sum Σ_{n=0}^∞ K^n is the inverse of I − K and, for any u ∈ C([0, 1]), the
equation (I − K)f = u has the unique solution f = (I − K)^{−1}u. In the
case u(x) = cos x, k(x, t) = sin(x − t)σ(t) for x, t ∈ [0, 1] and σ ∈ C([0, 1]),
the calculation after (2.39) shows that the solution f belongs to C²([0, 1])
and solves (2.40) with the initial values f(0) = 1 and f′(0) = 0.
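As a numerical illustration of the von Neumann series (a sketch only, not part of the argument), the following Python fragment discretizes the Volterra operator K with kernel k(x, t) = sin(x − t)σ(t) by the trapezoid rule and sums the first terms of Σ K^n u for u(x) = cos x. The choice σ ≡ 2, the grid size, the number of terms, and the names apply_volterra and neumann_solve are assumptions made for this example. With σ ≡ 2 the initial value problem reads f″ = f, f(0) = 1, f′(0) = 0, whose solution is cosh x, so the output can be compared with a known function.

```python
import math

def apply_volterra(f, xs, sigma):
    """Trapezoid approximation of (Kf)(x_i) = int_0^{x_i} sin(x_i - t) sigma(t) f(t) dt."""
    h = xs[1] - xs[0]
    out = [0.0] * len(xs)
    for i in range(1, len(xs)):
        vals = [math.sin(xs[i] - xs[j]) * sigma(xs[j]) * f[j] for j in range(i + 1)]
        # Trapezoid rule: the two end points carry weight h/2.
        out[i] = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
    return out

def neumann_solve(u, xs, sigma, terms=30):
    """Partial sum u + Ku + ... + K^terms u of the von Neumann series."""
    total = list(u)
    term = list(u)
    for _ in range(terms):
        term = apply_volterra(term, xs, sigma)
        total = [a + b for a, b in zip(total, term)]
    return total

N = 200                                   # number of grid intervals (assumed)
xs = [i / N for i in range(N + 1)]
u = [math.cos(x) for x in xs]
f = neumann_solve(u, xs, sigma=lambda t: 2.0)

# Compare with the exact solution cosh x of f'' = f, f(0) = 1, f'(0) = 0.
max_err = max(abs(fi - math.cosh(x)) for fi, x in zip(f, xs))
print(max_err)
```

The factorial decay ‖K^n‖ ≤ ‖k‖_∞^n / n! is the reason a modest number of terms suffices here; the remaining error comes from the quadrature rule.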
We now make two more small changes to the initial value problem (2.35)
and (2.36). Fix a parameter λ > 0 and consider instead the Sturm–Liouville
equation
\[ f'' + \lambda^2 f = g, \tag{2.45} \]
with the boundary conditions
f (0) = f (1) = 0.
These boundary conditions (made at the two end points of [0, 1]) replace the
initial value conditions in (2.36), and that change has a surprisingly deep
impact on the resulting equation.
As we recall below the space of functions satisfying (2.45) is, if non-empty,
a two-dimensional affine subspace of functions, so that the additional bound-
ary conditions might lead to a unique solution.
We may proceed just as before. The functions of the form
\[ f_p'' + \lambda^2 f_p = g \]
(ignoring the boundary conditions). After this, one would use the solutions
to the homogeneous differential equation to satisfy the boundary conditions.
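Explicitly, given f_p we can calculate the vector
\[ \begin{pmatrix} f_p(0) \\ f_p(1) \end{pmatrix} \tag{2.46} \]
and
\[ \begin{pmatrix} \sin(\lambda 0) \\ \sin(\lambda 1) \end{pmatrix} = \begin{pmatrix} 0 \\ \sin \lambda \end{pmatrix}. \]
If
\[ \det \begin{pmatrix} 1 & 0 \\ \cos \lambda & \sin \lambda \end{pmatrix} = \sin \lambda \]
is non-zero, then this is always possible and we find a unique solution to the
boundary value problem. However, if λ ∈ πZ then sin λ = 0 and we may be
unlucky with the value of the vector (2.46): if the vectors
\[ \begin{pmatrix} f_p(0) \\ f_p(1) \end{pmatrix}, \qquad \begin{pmatrix} 1 \\ \cos \lambda \end{pmatrix} \]
are linearly independent, then there will not be a solution to the boundary
value problem.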
This obstruction to being able to find a solution to the boundary value
problem may be phrased in terms of another integral operator.
and
f ′′ (s) = sh(s) − (s − 1)h(s) = h(s),
so f is a solution of the boundary value problem (2.47).
To see the converse, notice that the boundary value problem has a solution
(by the argument above). However, our previous discussion of the boundary
value problem associated to the Sturm–Liouville equation (2.45) (which needs
to be modified for the case λ = 0) shows that in this case the solution is
unique. Thus the equivalence of (2.47) and f = Kh is established.
Exercise 2.69. Modify the argument for the Sturm–Liouville equation for the case λ = 0,
and show that the solution is always unique.
s_n = −(πn)² K(s_n).
by saying that this differential equation always has a unique solution for any g
unless λ = πn corresponds to one of the eigenvalues µ_n = −(πn)^{−2} = −λ^{−2}
of K.
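The eigenvalue relation can be checked numerically. The sketch below assumes that K is the integral operator on [0, 1] with kernel G(s, t) = t(s − 1) for t ≤ s and G(s, t) = s(t − 1) for t ≥ s, and that s_n(x) = sin(πnx). Both are assumptions of this example: the kernel is the one consistent with the computation f″(s) = sh(s) − (s − 1)h(s) = h(s) and the boundary conditions f(0) = f(1) = 0. The names green and apply_K and the grid size are hypothetical.

```python
import math

def green(s, t):
    # Assumed kernel: continuous, symmetric, and zero when s is 0 or 1.
    return t * (s - 1.0) if t <= s else s * (t - 1.0)

def apply_K(h_vals, xs):
    """Trapezoid approximation of (Kh)(s) = int_0^1 G(s, t) h(t) dt on the grid xs."""
    step = xs[1] - xs[0]
    out = []
    for s in xs:
        vals = [green(s, t) * hv for t, hv in zip(xs, h_vals)]
        out.append(step * (sum(vals) - 0.5 * (vals[0] + vals[-1])))
    return out

M = 400                                   # number of grid intervals (assumed)
xs = [j / M for j in range(M + 1)]
n = 3
s_n = [math.sin(math.pi * n * x) for x in xs]
K_s_n = apply_K(s_n, xs)

# Expected: K(s_n) = mu_n * s_n with mu_n = -(pi n)^(-2).
mu_n = -1.0 / (math.pi * n) ** 2
residual = max(abs(k - mu_n * s) for k, s in zip(K_s_n, s_n))
print(residual)
```

The residual is limited only by the quadrature error, which is of order the square of the grid spacing.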
Exercise 2.71. In this exercise we generalize the connection between the Sturm–Liouville
boundary value problem and integral operators. Let a < b be real numbers, and assume
that p ∈ C 1 ([a, b]) and q ∈ C([a, b]) are real-valued functions with p > 0 and q > 0. We
define the second-order differential operator
B1 (f ) = α1 f (a) + α2 f ′ (a) = 0,
B2 (f ) = β1 f (b) + β2 f ′ (b) = 0.
L(f ) = 0
† That is, the functions f1 , f2 form a basis of the vector space of all solutions.
The material in this chapter represents the basic language and some of the
main examples of functional analysis. Let us mention briefly some directions
in which the theory continues.
• In Chapter 3 and the following chapters we will start to see why we
insisted on completeness in the definition of Banach spaces.
• We have seen the definition of dual spaces, but have not yet found a
description of any dual space. This will be corrected in the next chapter
and more generally in Chapter 7, where we will describe the dual spaces
of many of the Banach spaces that we discussed here.
• How can one construct a generalized limit notion that assigns to every
bounded sequence a limit, and still has many of the expected properties?
One such property is linearity (but notice, for example, that lim sup is
not a linear function on the space of bounded sequences). Another such
property is translation-invariance with respect to the underlying group
(for a sequence in the normal sense, this group would be Z). After we
construct this so-called Banach limit, we ask which groups have similar
notions of generalized limits. We will discuss these topics in Sections 7.2
and 10.2.
• Clearly there is some hidden notion of convergence of measures to the
Lebesgue measure in Section 2.3.3. In order to formulate this precisely, we
will need to define an appropriate topology on a space of measures. This
topology will be called the weak* topology (read as ‘weak star’ topology;
see Chapter 8), and as we will show the space of probability measures on
a compact metric space is itself a compact metric space in this topology.
This result helps to provide a coherent setting for many equidistribution
results.
• Some natural spaces (examples include Cc (X) and C ∞ ([0, 1])) do not fit
into the framework of Banach spaces, but do fit into the more general
context of locally convex spaces. These will be introduced in Chapter 8.
• Convexity will also turn out to be fundamental for many discussions in
functional analysis. One of the goals in Chapter 8 will be to analyze how
the extreme points of a convex compact set determine the set.
• Banach algebras will be discussed in greater detail in Chapter 11, which
lays the foundations for the more advanced spectral theory in Chapters 12
and 13.
The reader is advised to continue with the next chapter (or at least the
first three or four sections of it), after which she may select parts of the text.
Chapter 3
Hilbert Spaces, Fourier Series, and
Unitary Representations
gives a norm on V .
for all v, w ∈ V, where equality holds if and only if v and w are linearly
dependent. Moreover, the function ‖·‖ defined in (3.1) is a norm on V, so
that every inner product space is also a normed vector space.
If ⟨·, ·⟩ is only assumed to be a semi-inner product on V, then the induced
function ‖v‖ = √⟨v, v⟩ for v ∈ V is a semi-norm and the inequality in (3.2)
also holds in that case.
Definition 3.3. A Hilbert space is an inner product space (H, ⟨·, ·⟩) which
is complete with respect to the norm ‖·‖ induced by the inner product as
in (3.1).
for all v, w ∈ V .
• The relationship with linear functionals: for fixed w ∈ V the map φ_w
defined by φ_w(v) = ⟨v, w⟩ is a linear functional with norm ‖φ_w‖ = ‖w‖.
• The relationship with geometry: the vector (⟨v, w⟩/‖w‖²) w (appearing as tw in
the proof of Proposition 3.2 above) is the orthogonal projection of v onto
the subspace spanned by w. Moreover, if ⟨v, w⟩ = 0 then we recover
Pythagoras’ theorem in the form ‖v + w‖² = ‖v‖² + ‖w‖².
These are easy to check. For the first, expand the left-hand side to obtain
\begin{align*}
\|v + w\|^2 + \|v - w\|^2 &= \langle v + w, v + w \rangle + \langle v - w, v - w \rangle \\
&= \|v\|^2 + 2\Re \langle v, w \rangle + \|w\|^2 + \|v\|^2 - 2\Re \langle v, w \rangle + \|w\|^2.
\end{align*}
The second claim is a consequence of the linearity of the inner product, the
Cauchy–Schwarz inequality and the definition of the operator norm. The
two final claims follow by expanding ⟨v − (⟨v, w⟩/‖w‖²) w, w⟩, respectively the square
norm ‖v + w‖² = ⟨v + w, v + w⟩.
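These identities can be tested on concrete vectors. The following Python sketch uses the standard inner product on C³, taken to be linear in the first argument; the helper names and the sample vectors v and w are chosen for illustration only.

```python
def inner(v, w):
    """Standard inner product on C^d, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(v, w))

def norm(v):
    return inner(v, v).real ** 0.5        # <v, v> is real and non-negative

def add(v, w):
    return [a + b for a, b in zip(v, w)]

def sub(v, w):
    return [a - b for a, b in zip(v, w)]

def scale(c, v):
    return [c * a for a in v]

v = [1 + 2j, -1j, 3 + 0j]
w = [2 - 1j, 4 + 0j, 1j]

# Parallelogram identity: ||v+w||^2 + ||v-w||^2 = 2||v||^2 + 2||w||^2.
parallelogram_gap = abs(norm(add(v, w)) ** 2 + norm(sub(v, w)) ** 2
                        - 2 * norm(v) ** 2 - 2 * norm(w) ** 2)

# Orthogonal projection of v onto the span of w, and the remainder.
proj = scale(inner(v, w) / norm(w) ** 2, w)
rest = sub(v, proj)
orthogonality_gap = abs(inner(rest, w))

# Pythagoras for the orthogonal pair (proj, rest), whose sum is v.
pythagoras_gap = abs(norm(v) ** 2 - norm(proj) ** 2 - norm(rest) ** 2)
print(parallelogram_gap, orthogonality_gap, pythagoras_gap)
```

All three gaps vanish up to rounding error, which reflects the expansions carried out above.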
Exercise 3.5. (a) Show that any real inner product space satisfies the polarization identity
\[ \langle x, y \rangle = \tfrac{1}{4} \bigl( \|x + y\|^2 - \|x - y\|^2 \bigr) \]
Example 3.6. We have already seen several Hilbert spaces without making
explicit the underlying inner product.
(1) R^d (or C^d) with ⟨v, w⟩ = Σ_{i=1}^d v_i w̄_i (also written v · w), giving the 2-norm
\[ \|v\|_2 = \Bigl( \sum_{i=1}^{d} |v_i|^2 \Bigr)^{1/2}. \]
3.1 Hilbert Spaces 75
Notice that in Example 3.6(2) and (3), the spaces are themselves defined
as the set of sequences or functions with finite 2-norm. We recall how this
implies that the inner product is well-defined and note that (3) contains (2)
as a special case.
Lemma 3.7. If (X, B, µ) is a measure space and f, g ∈ L2µ (X), then the
right-hand side of (3.5) is well-defined.
From the equality case of the Cauchy–Schwarz inequality, which is itself used
in the proof of the triangle inequality, it follows quickly that a norm in an
inner product space is strictly sub-additive (see Definition 2.17). Thus the
Mazur–Ulam theorem concerning isometries (Theorem 2.20) applies in par-
ticular to Hilbert spaces.
While the emphasis in this section is on Hilbert spaces, we will isolate a
more abstract convexity property which is precisely what is needed for several
proofs in this section.
Exercise 3.10. (a) Show that the norm in a Hilbert space is strictly sub-additive (see
Definition 2.17).
(b) Show that the norm in a uniformly convex vector space (as defined below) is strictly
sub-additive.
Fig. 3.1: If x and y are not close to each other, then the mid-point (x + y)/2 is uniformly
closer to zero (independent of the choice of x and y).
The following exercise shows that the unique existence of the best approx-
imation is by no means guaranteed.
Exercise 3.14. (1) Let K ⊆ V be a non-empty compact subset of a normed vector
space (V, ‖·‖) or let V be finite-dimensional and K closed. Show the existence of a best
approximation of any v0 ∈ V within K.
(2) Let V = R² equipped with the norm ‖·‖_∞ and let K be the closed unit ball. Find a
point v0 ∈ V that has more than one best approximation within K. Describe the points
that have exactly one best approximation within K.
(3) Let V = ℓ¹_R(N) equipped with the norm ‖·‖₁. Let
\[ K = \Bigl\{ (x_n) \in \ell^1_{\mathbb{R}}(\mathbb{N}) \Bigm| x_n \ge 0 \text{ and } \sum_{n=1}^{\infty} a_n x_n = 1 \Bigr\}, \]
where (a_n) is a fixed sequence in (0, 1) with lim_{n→∞} a_n = 1. Show that K is closed and
convex. Let v0 = 0 and show that there is no best approximation of v0 within K. Conclude
in particular from this that the closed unit ball in ℓ¹_R(N) is not compact and that ℓ¹_R(N) is
not uniformly convex.
Exercise 3.15. (8) Let (X, B, µ) be a measure space with L^p_µ(X), for p ∈ [1, ∞], the associated
function spaces.
(a) Show that a^p + b^p ≤ (a² + b²)^{p/2} for any p ∈ [2, ∞) and a, b > 0.
(b) Show that
\[ \Bigl\| \frac{f + g}{2} \Bigr\|_p^p + \Bigl\| \frac{f - g}{2} \Bigr\|_p^p \le \tfrac{1}{2} \|f\|_p^p + \tfrac{1}{2} \|g\|_p^p \]
for any p ∈ [2, ∞) and f, g in L^p_µ(X).
(c) Deduce from (b) that L^p_µ(X) is uniformly convex for p ∈ [2, ∞).
(d) Show that L¹_µ(X) and L^∞_µ(X) are in general not uniformly convex.
Proof of Theorem 3.13. By translating both the set K and the point v0
by −v0 , we may assume without loss of generality that v0 = 0. We define
with
\[ a = \frac{\frac{1}{2s_m}}{\frac{1}{2s_m} + \frac{1}{2s_n}} > 0, \qquad b = \frac{\frac{1}{2s_n}}{\frac{1}{2s_m} + \frac{1}{2s_n}} > 0, \]
Let η be as in Definition 3.11 and fix ε > 0. Choose N = N(ε) large enough
to ensure that m > N implies that
\[ \frac{1}{s_m} > 1 - \eta(\varepsilon). \]
Then m, n > N implies that
\[ \frac{1}{2s_m} + \frac{1}{2s_n} > 1 - \eta(\varepsilon), \]
which together with the definition of uniform convexity gives
\[ 1 - \eta\bigl( \|x_m - x_n\| \bigr) \ge \Bigl\| \frac{x_m + x_n}{2} \Bigr\| > 1 - \eta(\varepsilon). \]
\[ \|x_m - x_n\| < \varepsilon \]
H = Y ⊕ Y ⊥,
showing (3.6).
It is clear from the definitions that Y ⊆ (Y⊥)⊥. If v ∈ (Y⊥)⊥ then we
may write v = y + z for some y ∈ Y and z ∈ Y⊥ by the first part of the
proof. However, 0 = ⟨v, z⟩ = ‖z‖² implies that v = y and so Y = (Y⊥)⊥.
An immediate consequence of Corollary 3.17 is the following.
PY : H −→ Y
h 7−→ y
ℓ : H → R (or C)
B(x, y) = hT x, yi
Exercise 3.22. Use Corollary 3.19 to show that if H is a Hilbert space, then H∗ is also a
Hilbert space, and exhibit a natural isometric isomorphism between H and H∗∗ .
or
f(a) = ⟨f, k_a⟩_{A²(D)}
respectively. The function D × D ∋ (a, w) ↦ k_a(w) is called a reproducing kernel. Determine
k_a explicitly in both cases.
(1) V1 and V2 are closed subspaces of V and the map π|V1 is a homeomorphism
(where V /V2 is equipped with the quotient norm from Lemma 2.15);
(2) the map φ is a homeomorphism (where V1 × V2 is equipped with any of the norms
from Exercise 2.9);
(3) P is a bounded operator.
If any of these equivalent conditions hold, then the subspaces are called topologically com-
plemented.
(1) Show that (ℓ∞ )∗ contains a countable subset A with the property that if x ∈ ℓ∞
has a(x) = 0 for all a ∈ A then x = 0, and deduce that the same holds for any space
isomorphic to a closed subspace of ℓ∞ .
Using the following steps, show that V = ℓ∞ /c0 does not have the property in (1) and
hence that c0 cannot be complemented by Exercise 3.27.
(2) Use an enumeration of Q to construct for each i ∈ I = R∖Q a sequence
\[ x^{(i)} = \bigl( x^{(i)}_n \bigr) \in \ell^\infty \]
\[ \operatorname{Supp}(x^{(i)}) = \{ n \in \mathbb{N} \mid x^{(i)}_n = 1 \} \]
(4) Deduce from (3) that for any continuous linear functional f ∈ V∗ and n ∈ N the
set {i ∈ I | |f(x^{(i)})| > 1/n} is finite. Conclude that for any countable subset A of V∗ there
is some i ∈ I with the property that a(x^{(i)}) = 0 for all a ∈ A.
We will show in this section how the results from Section 3.1.2 can be used in
measure theory. Before stating the main result, we recall some definitions. A
measure ν is absolutely continuous with respect to another measure µ, writ-
ten ν ≪ µ, if any measurable set N with µ(N ) = 0 must also satisfy ν(N ) = 0.
Two measures µ and ν are singular with respect to each other, written ν ⊥ µ,
if there exist disjoint measurable sets Xµ , Xν ⊆ X with X = Xµ ⊔ Xν and
with ν(Xµ ) = 0 = µ(Xν ). Finally, recall that a measure µ is σ-finite if there
is a decomposition of X into measurable sets,
\[ X = \bigsqcup_{i=1}^{\infty} X_i, \]
Proof. Suppose that µ and ν are both finite measures (the general case can
be reduced to this case by using the assumption that µ and ν are both σ-finite;
see Exercise 3.31).
We define a new measure m = µ + ν and will work with the real Hilbert
space H = L²_m(X). On this Hilbert space we define a linear functional φ by
\[ \phi(g) = \int g \, d\nu \]
for all g ∈ L²_m(X). We claim that k takes values in [0, 1] almost surely with
respect to m. Indeed, for any B ∈ B we have 0 ≤ ν(B) ≤ m(B), so (using
the fact that g = 1_B)
\[ 0 \le \int_B k \, dm \le m(B). \]
This holds by construction for all simple functions g, and hence for all non-
negative measurable functions by monotone convergence. Now define
for two finite Borel measures µ and ν on a locally compact metric space X (which may or
may not be mutually singular).
Exercise 3.33. Let (X, B) be a measurable space and denote the space of signed measures
on X (as defined in Section B.5) by M(X).
(a) Given a signed measure dν = g dµ with a finite measure µ and g ∈ L¹_µ(X), define ‖ν‖
to be ‖g‖_{L¹(µ)}. Show that this yields a well-defined norm on M(X).
(b) Show that M(X) is a Banach space with respect to this norm.
such that
\[ \int_A f \, d\mu = \int_A E_\mu(f \mid \mathcal{A}) \, d\mu \tag{3.9} \]
for all A ∈ 𝒜.
(b) Show that (3.9) uniquely characterizes E_µ(f | 𝒜) ∈ L¹_µ(X, 𝒜) as an equivalence class.
(c) Show that f ∈ L¹_µ(X, B) and g ∈ L^∞_µ(X, 𝒜) implies that E_µ(fg | 𝒜) = gE_µ(f | 𝒜).
(d) Show that ‖E_µ(f | 𝒜)‖₁ ≤ ‖f‖₁ for f ∈ L¹_µ(X, B).
for all m, n > 1. In other words, we require that all the vectors have length
one, and are mutually orthogonal.
As one might expect, this notion is fundamental for Hilbert spaces, and
gives rise to the following satisfying abstract result, which as we will see lays
the ground for Fourier analysis.
Proposition 3.36 (The closed linear hull of an orthonormal list).
Let H be a Hilbert space. Then the closed linear hull of an orthonormal
list (x_n) is given by
\[ \overline{\langle \{x_n\} \rangle} = \Bigl\{ \sum_n a_n x_n \Bigm| \text{the sum converges in } H \Bigr\}, \]
where the sum v = Σ_n a_n x_n converges in H if and only if Σ_n |a_n|² < ∞. In
that case we also have ‖v‖ = (Σ_n |a_n|²)^{1/2} and ⟨v, x_m⟩ = a_m for m ≥ 1.
Hence the linear map φ that sends the sequence (a_n) with Σ_n |a_n|² < ∞
to Σ_n a_n x_n in the closed linear hull of {x_n} is a unitary isomorphism of Hilbert spaces.
We note that the series Σ_n a_n x_n need not be absolutely convergent,
since ℓ²(N) ⊋ ℓ¹(N).
Proof of Proposition 3.36. Suppose first that (x1 , . . . , xN ) is a finite
orthonormal list. Then we may define a map φ from KN (with K being the
3.2 Orthonormal Bases and Gram–Schmidt 87
field of scalars R or C) to H by setting φ((a_n)) = Σ_{n=1}^N a_n x_n. Using the
assumption of orthonormality it follows that
\[ \|\phi((a_n))\|^2 = \sum_{m,n} \langle a_m x_m, a_n x_n \rangle = \sum_n |a_n|^2 = \|(a_n)\|_2^2 \tag{3.10} \]
and
\[ \langle \phi((a_n)), x_j \rangle = \sum_n \langle a_n x_n, x_j \rangle = a_j \tag{3.11} \]
where the sum is actually finite by definition of the space cc (N), so that prop-
erties (3.10) and (3.11) clearly still hold in this case. We note that φ(cc (N))
is the linear hull of the set {xn }.
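Now notice that c_c(N) ⊆ ℓ²(N) is dense. Indeed, if a = (a_n) ∈ ℓ²(N) and
we define
\[ a^{(N)}_n = \begin{cases} a_n & \text{if } n \le N; \\ 0 & \text{if } n > N \end{cases} \]
for N ∈ N, then
\[ \bigl\| (a_n) - (a^{(N)}_n) \bigr\|_2^2 = \sum_{n=N+1}^{\infty} |a_n|^2 \longrightarrow 0 \]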
By continuity of the norm and the inner product, properties (3.10) and (3.11)
extend to all of ℓ²(N). By (3.10), φ is an isometry from ℓ²(N) onto its image,
so the image is complete and therefore closed in H. Since φ(c_c(N)) = ⟨{x_n}⟩,
it follows that φ(ℓ²(N)) is the closed linear hull of {x_n}. Finally (3.12) shows
that the series Σ_{n=1}^∞ a_n x_n converges if (a_n) ∈ ℓ²(N), and if Σ_{n=1}^∞ a_n x_n
converges, then (3.10) (applied to the partial sums) also implies that Σ_{n=1}^∞ |a_n|²
converges.
The argument in the proof above can also be used for orthogonal subspaces
as in the following exercise.
Essential Exercise 3.37. (a) Let (H_n) be a finite or countable list of Hilbert
spaces. Then we define the direct Hilbert space sum
\[ \bigoplus_n H_n = \Bigl\{ (v_n) \Bigm| v_n \in H_n \text{ and } \sum_n \|v_n\|^2 < \infty \Bigr\} \]
defines an inner product on the direct sum, making it into a Hilbert space.
(b) Let H be a Hilbert space and (H_n) a finite or countable list of mutually
orthogonal closed subspaces of H. Show that there is a canonical isometric
isomorphism
\[ \phi \colon \bigoplus_n H_n \longrightarrow \overline{\Bigl\langle \bigcup_n H_n \Bigr\rangle} \]
analogous to Proposition 3.36, and describe the inverse map of φ using
orthogonal projections.
Definition 3.38. A list of orthonormal vectors in a Hilbert space H is said
to be complete (or to be an orthonormal basis) if its closed linear hull is H.
We note that strictly speaking this notion of a Schauder basis of an infinite-
dimensional Hilbert space does not coincide with the notion of a basis in the
sense of linear algebra as we allow (in contrast to the standard definition)
infinite converging sums to represent arbitrary vectors as linear combinations
of the basis vectors. We invite the reader to compare our discussion here with
the proof of the existence of a basis for an infinite-dimensional vector space
(relying on the axiom of choice and often called a Hamel basis), and hope
that the reader agrees with us that the notion of an orthonormal basis of a
Hilbert space is much more natural than the notion of a Hamel basis in our
context. More importantly, the notion of an orthonormal basis will prove to
be much more useful in the following discussions.
Theorem 3.39 (Gram–Schmidt). Every separable Hilbert space H has an
orthonormal basis. If H is n-dimensional, then H is isomorphic to Rn or Cn .
If H is not finite-dimensional, then H is isomorphic to ℓ2 (N).
Here isomorphic means isomorphic as Hilbert spaces, so there is a linear
bijection between the spaces that preserves the inner product. The proof
of Theorem 3.39 is simply an interpretation of the familiar Gram–Schmidt
orthonormalization procedure.
Proof of Theorem 3.39. Let {y1 , y2 , . . . } ⊆ H be a dense countable subset.
We are going to use the vectors {yn } to construct an orthonormal list of
vectors which has the same linear hull. This is built up from the simple
geometrical observation that if a vector v does not lie in the linear span of a
finite set of vectors, then something from the linear span may be added to v
to produce a non-zero vector orthogonal to the linear span.
We may assume that y_1 ≠ 0 and define x_1 = y_1/‖y_1‖. Suppose now that
we have already constructed orthonormal vectors x_1, …, x_n by using the
vectors y_1, …, y_k with k ≥ n in such a way that
V_n = ⟨x_1, …, x_n⟩ = ⟨y_1, …, y_k⟩.
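The orthonormalization step can be sketched in Python for finitely many vectors in R^d. This is a minimal illustration only: the restriction to real vectors, the numerical tolerance tol used to decide whether a vector already lies in the span of the previous ones, and the names inner, norm and gram_schmidt are assumptions of the example and not part of the proof.

```python
def inner(v, w):
    """Standard inner product on R^d."""
    return sum(a * b for a, b in zip(v, w))

def norm(v):
    return inner(v, v) ** 0.5

def gram_schmidt(ys, tol=1e-10):
    """Return an orthonormal list with the same linear hull as ys."""
    xs = []
    for y in ys:
        r = list(y)
        # Subtract the components of y along the vectors found so far.
        for x in xs:
            c = inner(r, x)
            r = [ri - c * xi for ri, xi in zip(r, x)]
        nr = norm(r)
        # If the remainder is (numerically) zero, y adds nothing new.
        if nr > tol:
            xs.append([ri / nr for ri in r])
    return xs

ys = [[1.0, 1.0, 0.0], [2.0, 2.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
xs = gram_schmidt(ys)

# Largest deviation of the Gram matrix of xs from the identity matrix.
gram_gap = max(abs(inner(xs[i], xs[j]) - (1.0 if i == j else 0.0))
               for i in range(len(xs)) for j in range(len(xs)))
print(len(xs), gram_gap)
```

In this sample the second vector is a multiple of the first and is skipped, which mirrors the fact that in the construction the number k of vectors y used may exceed the number n of vectors x produced.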
Exercise 3.40. Give a direct proof that the closed unit ball in an infinite-dimensional
Hilbert space is not compact by using the material from this section.
Exercise 3.41. Recall the Hardy and Bergman spaces H²(D) and A²(D) on the unit
disk D = B_1^C from Exercise 2.61. Describe the spaces H²(D) and A²(D) in terms of the
sequence of Taylor coefficients (a_n) of the Taylor expansion f(z) = Σ_{n≥0} a_n z^n of elements
of the space.
Corollary 3.42. Let (x_n) be a countable orthonormal list in a Hilbert space H
and let (a_n) ∈ ℓ²(N). Then Σ_{n=1}^∞ a_n x_n converges unconditionally, meaning
that for any permutation σ : N → N we have Σ_{m=1}^∞ a_{σ(m)} x_{σ(m)} = Σ_{n=1}^∞ a_n x_n.
In particular, it makes sense to speak of a countable orthonormal basis even
if we do not specify an enumeration of the basis.
\[ \sum_{m=1}^{\infty} |a_{\sigma(m)}|^2 = \sum_{n=1}^{\infty} |a_n|^2 \]
the same applies to w = Σ_{m=1}^∞ a_{σ(m)} x_{σ(m)}. Also by Proposition 3.36 we
have ⟨v, x_n⟩ = a_n and ⟨w, x_{σ(m)}⟩ = a_{σ(m)} for all m, n ∈ N. As σ is a permutation
we see that ⟨v − w, x_n⟩ = 0 for all n ∈ N. As v − w belongs to the closed
linear hull of {x_n}, it follows that v = w, as claimed.
Suppose now B ⊆ H is a countable set consisting of mutually ortho-
gonal unit vectors with dense linear hull. Then we may choose an enumer-
ation B = {xn | n ∈ N} and obtain an orthonormal basis in the sense of
Definition 3.38. By the above the properties of the orthonormal basis and
also the coordinates hv, xi of v ∈ H associated to a given element of x ∈ B
remain unchanged if a different enumeration is being used.
While the motivation generated by natural examples and the notational con-
venience of thinking of countable collections as sequences incline one strongly
to the separable case, there is no reason to restrict attention completely to
separable Hilbert spaces.
Example 3.43. Let I be a set, equipped with the discrete topology and the
counting measure λcount defined on the σ-algebra P(I) of all subsets of I.
Then ℓ²(I) = L²(I, P(I), λ_count) is a Hilbert space, and it comprises all functions
a : I → R (or C) for which the support Supp(a) = {i ∈ I | a_i ≠ 0} is
finite or countable, and for which Σ_{i∈I} |a_i|² = Σ_{i∈Supp(a)} |a_i|² < ∞.
F = {(I, x_•) | the function x_• : I → H has orthonormal image},
with partial order defined by (I, x_•) ≼ (J, y_•) if I ⊆ J and x_• = y_•|_I. In this
partially ordered set every totally ordered subset (or chain) has an upper
bound, which can be found by simply taking the union of the index sets and
the natural extension of the partially defined functions to the union.
† In order to ensure that this definition does indeed define a set, we could add the
requirement that I is a subset of H, and let x_• be the identity.
3.3 Fourier Series on Compact Abelian Groups 91
It follows that there exists a maximal element (I, x_•) of this partially
ordered set by Zorn’s lemma. Using this, define an isometry φ : ℓ²(I) → H
by
\[ a \longmapsto \sum_{i \in \operatorname{Supp}(a)} a_i x_i \]
first on the subset of all elements a ∈ ℓ2 (I) with | Supp(a)| < ∞, and then,
by applying the automatic extension to the closure (Proposition 2.59), on
all of ℓ2 (I). This again defines an isomorphism from ℓ2 (I) to the complete,
and hence closed, subspace Y = φ(ℓ2 (I)) ⊆ H. We claim that Y = H, for
otherwise there would exist some x ∈ Y ⊥ of norm one by the orthogonal
decomposition of Hilbert spaces (Corollary 3.17), and using this element x
we can define a new element of F which is strictly bigger than the maximal
element (I, x_•) in the partial order. This contradiction shows the claim, and
hence proves the theorem.
However, we will also see in some examples below that these are often easy
to prove if the group is given concretely.
Theorem (Existence of Haar measure†(10) ). Every locally compact σ-
compact metric group G has a left Haar measure mG , satisfying (and, up to
positive multiples, characterized by) the properties:
• mG (K) < ∞ for any compact set K ⊆ G;
• mG (O) > 0 for any non-empty open set O ⊆ G; and
• mG (gB) = mG (B) for all measurable B ⊆ G and g ∈ G.
We will usually be dealing with σ-compact metrizable groups, which sim-
plifies the measure theory needed, but the existence of Haar measure only
requires the group to be locally compact. For G = Td , which as a measur-
able space can be identified with [0, 1)d , the Haar measure is simply the d-
dimensional Lebesgue measure restricted to [0, 1)d .
Exercise 3.46. Show that the Lebesgue measure on [0, 1)d considered as a measure on Td
satisfies all the properties of the Haar measure.
χ(x + Z) = e^{2πix}
† This will be proved in Section 10.1.
‡ This will be established in Section 12.8, and holds more generally for locally compact
abelian groups.
χ_j(x + Z^d) = e^{2πi x_j}
for x = (x_1, …, x_d)^t ∈ R^d, separate points since if x ≠ y we must have
some j ∈ {1, …, d} with x_j ≠ y_j, and then χ_j(x) ≠ χ_j(y).
In some discussions about characters we will parameterize the collection
of all characters using some index set. For example, we will see shortly that
the characters on T^d are parameterized by elements n ∈ Z^d in a natural way
if we define for n ∈ Z^d the character χ_n on T^d by
χ_n(x + Z^d) = e^{2πi n·x}
where n·x denotes the usual inner product on R^d. We will write x ∈ T^d as a
shorthand for the element x + Z^d ∈ T^d, and whenever convenient we identify x ∈ T^d
with x ∈ [0, 1)^d.
Assuming the existence of a Haar measure and the completeness of char-
acters as above for a compact metric abelian group G, we will now describe
the theory of Fourier series on G. This will give a complete description
of L²(G) = L²_{m_G}(G) where m_G is the Haar measure on G. For convenience
we normalize mG to satisfy mG (G) = 1.
where the sum, which runs over all the characters of G, is convergent(11) with
respect to ‖·‖₂, the equality is meant as elements of L²(G), the coefficients
are given by a_χ = ⟨f, χ⟩, and they satisfy
\[ \sum_{\chi} |a_\chi|^2 = \|f\|_2^2. \]
In fact, in (3.15) we used the defining invariance property of the Haar measure
extended to integrals as in (3.13) and the fact that a character is in particular
a homomorphism. However, we have chosen g with χ(g) 6= 1 so (3.15) gives
\[ \int_G \chi \, dm = 0. \]
shows that the embedding C(G) → L2 (G), which we know has dense image
by Proposition 2.51, is continuous. It follows that there can be only countably
many distinct characters, since an uncountable collection would give rise to
an uncountable collection of disjoint open balls of radius ½√2, contradicting
separability.
In order to show completeness we will use the completeness of characters
from p. 92. Define the complex linear hull A = hχ | χ a character on Gi, and
notice that A is an algebra since the product of two characters is another
character. Also notice that A is closed under conjugation, since
Exercise 3.48. Let G ⊆ Td be a closed subgroup. Show that any character χ on G is the
restriction of a character of the form χn for some n ∈ Zd (by using the arguments from
the proof of Theorem 3.47).
Exercise 3.49. Find all the characters on G = Z/q Z and prove Theorem 3.47 directly for
this case.
formula).
(c) Prove that
\[ \|\hat{f}\|_\infty = \max_{\chi \in \hat{G}} |\hat{f}(\chi)| \le \frac{1}{|G|} \sum_{g \in G} |f(g)| = \|f\|_1. \]
(d) Use the Cauchy–Schwarz inequality, Parseval’s formula, and the inequality from (c) to
deduce the following uncertainty principle(12): |Supp f| · |Supp f̂| ≥ |G| for f ∈ L²(G)∖{0}.
Exercise 3.51. (a) Find all the characters on G = (Z/N Z)N (endowed with the product
topology). Show the existence of a Haar measure and the completeness of characters
from p. 92 for this case.
(b) Now set N = 2 and notice that G = (Z/2Z)N is, as a measure space, isomorphic to (0, 1)
with the Lebesgue measure (by using the binary expansion of real numbers). Interpret the
characters of G as maps on (0, 1) to obtain the orthonormal basis known as the Walsh
system.
Exercise 3.52. Let p ∈ N be a prime number. Describe all the characters on the compact
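group of p-adic integers G = Z_p, defined by
\[ G = \varprojlim_{n \to \infty} \mathbb{Z}/(p^n \mathbb{Z}) = \Bigl\{ (z_n) \in \prod_{n=1}^{\infty} \mathbb{Z}/(p^n \mathbb{Z}) \Bigm| z_n \equiv z_{n+1} \bmod p^n \mathbb{Z} \Bigr\}. \]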
Show the existence of a Haar measure and the completeness of characters from p. 92 for
this case.
The following exercise shows how large the class of metric compact abelian
groups really is.
Exercise 3.53. Let Γ be a countable abelian group and use it to define
(a) Show that G is a metric compact abelian group in the induced topology from the
product topology on TΓ .
(b) Use the theorem on completeness of the characters from p. 92 to show that the group
of characters on G is isomorphic to Γ .
important that we will treat it in greater detail here. Along the way, we will
give a proof for Fourier series on the torus that will be independent of the
theorem regarding Fourier series on general groups (Theorem 3.47).
For this section, we will define a character on Td to be a function of the
form
χ_n(x) = e^{2πi n·x} = e^{2πi(n_1 x_1 + ··· + n_d x_d)}
for all x ∈ T^d, for some n ∈ Z^d. We will see in Corollary 3.67 that these are
indeed all the characters on T^d in the sense of Section 3.3 (also see Exercise 3.48).
We note that the complex conjugate of χ_n(x) equals χ_{−n}(x) = χ_n(−x) for n ∈ Z^d and x ∈ T^d.
A trigonometric polynomial is a finite linear combination
\[
p = \sum_{n\in F} a_n\chi_n
\]
for $n \in \mathbb{Z}^d$. Moreover,
\[
\|f\|_2^2 = \sum_{n\in\mathbb{Z}^d} |a_n|^2. \tag{3.17}
\]
Exercise 3.55. (a) Phrase Theorem 3.47 for d = 1 using the function χ0 = 1 and the
functions x 7→ cos(2πnx) and x 7→ sin(2πnx) for n > 1.
(b) For every n > 1 choose dn such that fn (x) = dn sin(πnx) has norm one in L2 ((0, 1)).
Show that (fn )n>1 forms an orthonormal basis of L2 ((0, 1)). Notice that each fn satisfies
the boundary conditions fn (0) = fn (1) = 0, which are called the Dirichlet boundary
conditions.
(c) For every n > 0 choose dn such that gn (x) = dn cos(πnx) has norm one in L2 ((0, 1)).
Show that $(g_n)_{n\ge 0}$ forms an orthonormal basis of $L^2((0,1))$. Note that every $g_n$ satisfies
$g_n'(0) = g_n'(1) = 0$, which are called the Neumann boundary conditions.
(d) Find an orthonormal basis (hn )n>1 of L2 ((0, 1)) that consists of smooth functions
satisfying the mixed boundary conditions hn (0) = h′n (1) = 0 for all n > 1.
3.4 Fourier Series on Td 97
Exercise 3.56. (a) Rephrase Theorem 3.47 for Td for real-valued functions using sine and
cosine functions.
(b) Find an orthonormal basis of L2 ((0, 1)d ) satisfying the Dirichlet boundary conditions
(that is, the basis should consist of smooth functions that vanish on the boundary of [0, 1]d ).
The reader may wonder in what sense the absolute convergence is meant,
and the answer is in all of them: With respect to k · k2 , pointwise at every
point, and with respect to k · k∞ .
In order to prove these results (independently from the previous section)
we will need to discuss convolution.
(1) $f * g = g * f$;
(2) $f * \chi_n = \Bigl(\int f(t)\overline{\chi_n(t)}\,dt\Bigr)\chi_n$; and
(3) $\langle \chi_m, \chi_n\rangle = \delta_{m,n}$.
Proof. The first formula follows by a simple substitution (see Exercise 3.46):
\[
f * g(x) = \int_{\mathbb{T}^d} f(t)\,g(\underbrace{x - t}_{=u})\,dt = \int_{\mathbb{T}^d} f(x-u)\,g(u)\,du = g * f(x).
\]
\[
\chi_m(t)\overline{\chi_n(t)} = \chi_{m-n}(t) = e^{2\pi i((m_1-n_1)t_1+\cdots+(m_d-n_d)t_d)},
\]
Since shifting functions preserves their integrals and their p-norms, we deduce
that d(x, y) < δ implies that
\[
\|f^x - f^y\|_p \le \|f^x - F^x\|_p + \|F^x - F^y\|_p + \|F^y - f^y\|_p < 3\varepsilon,
\]
by the Hölder inequality, whenever d(x, y) < δ (strictly speaking the func-
tion (g̃)x (t) = g(x − t) with g̃(t) = g(−t) appears in the definition of f ∗ g(x),
but using kg̃ x − g̃ y kp = kg x − g y kp this does not make much of a difference).
As ε > 0 was arbitrary we see that f ∗ g is continuous.
Let us assume first that d = 1. By Lemma 3.59(2) the nth term in the Fourier
series of f is given by an (f )χn = f ∗ χn with an (f ) = hf, χn i for every n ∈ Z.
Thus the partial sums of the Fourier series satisfy
\[
\sum_{n=-N}^{N} a_n(f)\chi_n = f * \Bigl(\sum_{n=-N}^{N} \chi_n\Bigr).
\]
and satisfies
\[
\int_{\mathbb{T}} D_N(x)\,dx = 1.
\]
Proof. The case x = 0 and the integral calculation follow immediately from
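the definitions. To check the formula for $x \neq 0$ we notice that the Dirichlet
kernel is a geometric series and use the relation $\sin\phi = \frac{e^{i\phi} - e^{-i\phi}}{2i}$:
\[
\begin{aligned}
D_N(x) &= \sum_{n=-N}^{N} \bigl(e^{2\pi i x}\bigr)^n = e^{-2\pi i N x}\Bigl(1 + \cdots + \bigl(e^{2\pi i x}\bigr)^{2N}\Bigr) \\
&= e^{-2\pi i N x}\,\frac{\bigl(e^{2\pi i x}\bigr)^{2N+1} - 1}{e^{2\pi i x} - 1} = \frac{e^{2\pi i (N+1)x} - e^{-2\pi i N x}}{e^{2\pi i x} - 1} \\
&= \frac{e^{2\pi i (N+\frac12)x} - e^{-2\pi i (N+\frac12)x}}{e^{\pi i x} - e^{-\pi i x}} = \frac{\sin\bigl((N+\tfrac12)2\pi x\bigr)}{\sin(\pi x)}.
\end{aligned}
\]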
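The identity just derived is easy to test numerically. The following Python sketch compares the defining sum of the Dirichlet kernel with the closed form at a sample point and approximates the integral over the torus by a Riemann sum with equally spaced nodes; the function names and the number of nodes are illustrative choices of ours.

```python
import cmath
import math

def dirichlet_sum(N, x):
    """D_N(x) as the finite sum of exp(2*pi*i*n*x) over n = -N, ..., N."""
    return sum(cmath.exp(2j * math.pi * n * x) for n in range(-N, N + 1)).real

def dirichlet_closed(N, x):
    """Closed form sin((N + 1/2) * 2*pi*x) / sin(pi*x), for x not an integer."""
    return math.sin((N + 0.5) * 2 * math.pi * x) / math.sin(math.pi * x)

def integral_over_torus(N, samples=64):
    """Riemann sum of D_N over [0, 1) with equally spaced nodes.
    Requires samples > N, in which case the sum is exact up to rounding
    because D_N is a trigonometric polynomial of degree N."""
    return sum(dirichlet_sum(N, j / samples) for j in range(samples)) / samples

print(dirichlet_sum(5, 0.3), dirichlet_closed(5, 0.3))
print(dirichlet_sum(5, 0.0))      # value at x = 0 is 2N + 1
print(integral_over_torus(5))     # should be close to 1
```

The equally spaced Riemann sum is used because it integrates each character with small nonzero frequency to zero exactly, so no quadrature error is introduced for the kernel itself.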
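Now we split the range of integration into the interval $[\delta, 1-\delta]$ and its complement:
\[
\begin{aligned}
|f * F_M(x) - f(x)| \le{}& \int_{\delta}^{1-\delta} F_M(t)\,\underbrace{|f(x-t) - f(x)|}_{\le 2\|f\|_\infty}\,dt \\
&+ \int_{-\delta}^{\delta} F_M(t)\,\underbrace{|f(x-t) - f(x)|}_{<\varepsilon}\,dt.
\end{aligned}
\]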
Exercise 3.66. Analyze where the above proof fails if we replace the Fejér kernel by the
Dirichlet kernel.
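for $\varepsilon, \delta > 0$ and large enough $M$ (how large depending on $\varepsilon$ and $\delta$). Next notice
that $f * \widetilde{F}_M$ is a trigonometric polynomial for any $f \in C(\mathbb{T}^d)$. The argument
is now similar to the case $d = 1$: we again show that the sequence $(\widetilde{F}_M)$ is an
approximate identity. Given $f \in C(\mathbb{T}^d)$ and $\varepsilon > 0$ we can choose $\delta > 0$ such
that $|f(x-t) - f(x)| < \varepsilon$ for $x \in \mathbb{T}^d$ and $t \in [-\delta,\delta]^d$. This implies that
\[
\begin{aligned}
\bigl|f * \widetilde{F}_M(x) - f(x)\bigr| &\le \int_{\mathbb{T}^d} |f(x-t) - f(x)|\,\widetilde{F}_M(t)\,dt \\
&\le \int_{[-\delta,\delta]^d} \underbrace{|f(x-t) - f(x)|}_{<\varepsilon}\,\widetilde{F}_M(t)\,dt + \int_{\mathbb{T}^d\setminus[-\delta,\delta]^d} 2\|f\|_\infty\,\widetilde{F}_M(t)\,dt \\
&< \varepsilon + 2\|f\|_\infty\,\varepsilon,
\end{aligned}
\]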
as required. As in the case d = 1 this implies that the set of characters forms
an orthonormal basis, and the theorem follows.
Let us briefly describe how the definition of a character in Section 3.3
relates to the characters χn for n ∈ Zd that we used here (see also Exer-
cise 3.48).
is a polynomial.
(b) State, prove, and use an appropriate approximate identity property of the sequence (Ln )
to show that Ln ∗ f (x) converges uniformly on [0, 1] to f as n → ∞.
We now turn to the interplay between Fourier series and differentiation, with
the goal of proving Theorem 3.57. As we will see, this relationship will be a
simple but important consequence of integration by parts.
Suppose that f ∈ C 1 (Td ) and j ∈ {1, . . . , d}. Notice that
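\[
\begin{aligned}
\int_{\mathbb{T}} \partial_j f(x)\overline{\chi_n(x)}\,dx_j &= \Bigl[f(x)\overline{\chi_n(x)}\Bigr]_{x_j=0}^{x_j=1} - \int_{\mathbb{T}} f(x)\,\partial_j\overline{\chi_n(x)}\,dx_j \\
&= -\int_{\mathbb{T}} f(x)\,\partial_j\overline{\chi_n(x)}\,dx_j \\
&= 2\pi i n_j \int_{\mathbb{T}} f(x)\overline{\chi_n(x)}\,dx_j.
\end{aligned} \tag{3.19}
\]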
Integrating over the remaining variables, we see that the Fourier coefficients
of f and of the partial derivative ∂j f satisfy the relation
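\[
\begin{aligned}
a_n(\partial_j f) &= \int_{\mathbb{T}^d} \partial_j f(x)\overline{\chi_n(x)}\,dx = 2\pi i n_j \int_{\mathbb{T}^d} f(x)\overline{\chi_n(x)}\,dx \quad \text{(by (3.19))} \\
&= 2\pi i n_j\, a_n(f).
\end{aligned} \tag{3.20}
\]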
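The relation (3.20) can be checked numerically in the case d = 1. The Python sketch below uses the smooth periodic test function f(x) = exp(sin(2πx)), which is our own choice and not taken from the text, and approximates Fourier coefficients by Riemann sums with equally spaced nodes (which converge very quickly for smooth periodic functions).

```python
import cmath
import math

def fourier_coefficient(func, n, samples=256):
    """Approximate a_n(func) = integral of func(x) * exp(-2*pi*i*n*x) over [0, 1)
    by a Riemann sum with equally spaced nodes."""
    return sum(
        func(j / samples) * cmath.exp(-2j * math.pi * n * j / samples)
        for j in range(samples)
    ) / samples

def f(x):
    """Hypothetical smooth 1-periodic test function."""
    return math.exp(math.sin(2 * math.pi * x))

def df(x):
    """Derivative of f, computed by the chain rule."""
    return 2 * math.pi * math.cos(2 * math.pi * x) * f(x)

for n in range(-3, 4):
    lhs = fourier_coefficient(df, n)                      # a_n(f')
    rhs = 2j * math.pi * n * fourier_coefficient(f, n)    # 2*pi*i*n*a_n(f)
    print(n, abs(lhs - rhs))
```

The printed differences should be at the level of rounding error, in line with (3.20).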
Proof of Theorem 3.57. The formula (3.18) follows from (3.20) by induc-
tion on k. To prove the last claim of the theorem, we will show that
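\[
\sum_{n\in\mathbb{Z}^d} |a_n(f)| \ll_d \sqrt{\|f\|_2^2 + \|\partial_1^k f\|_2^2 + \cdots + \|\partial_d^k f\|_2^2} \tag{3.21}
\]
for $f \in C^k(\mathbb{T}^d)$ and $k > \frac{d}{2}$. Assuming (3.21) for the moment, we see that
\[
\sum_{n\in\mathbb{Z}^d} a_n(f)\chi_n
\]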
where we have used (3.18) in the last step. Therefore we can simplify the sum
under the square root in (3.21) to give the estimate
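\[
\begin{aligned}
\|f\|_2^2 + \|\partial_{e_1}^k f\|_2^2 + \cdots + \|\partial_{e_d}^k f\|_2^2 &= \sum_{n\in\mathbb{Z}^d} \Bigl(1 + (2\pi)^{2k}\sum_{j=1}^{d} n_j^{2k}\Bigr)|a_n(f)|^2 \\
&\gg_d \sum_{n\in\mathbb{Z}^d} \bigl(1 + \|n\|_2^{2k}\bigr)|a_n(f)|^2
\end{aligned}
\]
since $\|n\|_2 \le \sqrt{d}\,\max_{1\le j\le d}|n_j|$ and hence $\|n\|_2^{2k} \ll_d \sum_{j=1}^{d}|n_j|^{2k}$. We claim
that
\[
\Bigl(\bigl(1 + \|n\|_2^{2k}\bigr)^{-1/2}\Bigr)_{n\in\mathbb{Z}^d} \in \ell^2(\mathbb{Z}^d) \tag{3.22}
\]
for $k > \frac{d}{2}$.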
From this claim the inequality (3.21) follows quickly by the
Cauchy–Schwarz inequality:
106 3 Hilbert Spaces, Fourier Series, and Unitary Representations
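\[
\begin{aligned}
\sum_{n\in\mathbb{Z}^d} |a_n(f)| &= \sum_{n\in\mathbb{Z}^d} \bigl(1 + \|n\|_2^{2k}\bigr)^{-1/2}\bigl(1 + \|n\|_2^{2k}\bigr)^{1/2}|a_n(f)| \\
&\le \Bigl\|\Bigl(\bigl(1 + \|n\|_2^{2k}\bigr)^{-1/2}\Bigr)_{n\in\mathbb{Z}^d}\Bigr\|_{\ell^2(\mathbb{Z}^d)} \sqrt{\sum_{n\in\mathbb{Z}^d} \bigl(1 + \|n\|_2^{2k}\bigr)|a_n(f)|^2} \\
&\ll_d \sqrt{\|f\|_2^2 + \|\partial_{e_1}^k f\|_2^2 + \cdots + \|\partial_{e_d}^k f\|_2^2}.
\end{aligned}
\]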
we split up the sum. Firstly, by running through the possibilities of the signs of
the nj , it is sufficient to show convergence for n1 , . . . , nd > 0. Secondly, using
the symmetry of the summands with respect to permutation of the variables,
we may restrict the sum to those n ∈ Zd for which n2 , n3 , . . . , nd 6 n1 , and
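we may also assume that $n_1 \ge 1$. Now $1 + \|n\|_2^{2k} \ge n_1^{2k}$, so
\[
\begin{aligned}
\sum_{n\in\mathbb{Z}^d} \frac{1}{1 + \|n\|_2^{2k}} &\ll_d \sum_{n_1=1}^{\infty}\ \sum_{n_2,\dots,n_d=0}^{n_1} \frac{1}{n_1^{2k}} \\
&= \sum_{n_1=1}^{\infty} \frac{(n_1+1)^{d-1}}{n_1^{2k}} \ll_d \sum_{n_1=1}^{\infty} \frac{1}{n_1^{2k+1-d}},
\end{aligned}
\]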
and the last sum converges if 2k > d. This implies the claim above, the
inequality (3.21), and hence the theorem.
The above is already sufficient to answer another claim from Section 1.5.
We state a special case of the inheritance of smoothness below, and return
again to this topic in Chapter 5.
Exercise 3.69. Let f be a real-valued function defined on an open subset U ⊆ R2 . Suppose
that f is continuous and that ∂14 f , ∂24 f exist and are continuous. Show that ∂1 ∂2 f exists
and is continuous.
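\[
\begin{aligned}
\cdot : G \times X &\longrightarrow X \\
(g, x) &\longmapsto g\cdot x
\end{aligned}
\]
with $g\cdot(h\cdot x) = (gh)\cdot x$ for $g, h \in G$ and $x \in X$, and $e\cdot x = x$ for all $x \in X$,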
where e ∈ G is the identity element.
In this definition we have used multiplicative notation for the group oper-
ation in G, but as usual if G is abelian we will often use additive notation.
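\[
\begin{aligned}
\pi_{g_1}(\pi_{g_2} f)(x) &= \pi_{g_2} f(g_1^{-1}\cdot x) = f\bigl(g_2^{-1}\cdot(g_1^{-1}\cdot x)\bigr) \\
&= f\bigl((g_1g_2)^{-1}\cdot x\bigr) = \pi_{g_1g_2} f(x)
\end{aligned}
\]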
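The role of the inverse in the formula for π_g can be made concrete on a small non-abelian example. The following Python sketch lets the symmetric group on three points act on functions on {0, 1, 2}; encoding permutations and functions as tuples is our own illustrative choice. It checks the identity above for all pairs of group elements, and also shows that the naive rule f(g·x) without the inverse does not give the same identity.

```python
from itertools import permutations

def compose(g, h):
    """Product g h of permutations given as tuples: (g h)(x) = g(h(x))."""
    return tuple(g[h[x]] for x in range(len(h)))

def inverse(g):
    """Inverse permutation of g."""
    inv = [0] * len(g)
    for x, y in enumerate(g):
        inv[y] = x
    return tuple(inv)

def act_on_function(g, f):
    """(pi_g f)(x) = f(g^{-1} . x), with f stored as a tuple of values."""
    g_inv = inverse(g)
    return tuple(f[g_inv[x]] for x in range(len(f)))

def act_without_inverse(g, f):
    """Naive rule x -> f(g . x), shown only for comparison."""
    return tuple(f[g[x]] for x in range(len(f)))

def respects_products(action, points, f):
    """Check action(g1, action(g2, f)) == action(g1 g2, f) for all g1, g2."""
    group = list(permutations(range(points)))
    return all(
        action(g1, action(g2, f)) == action(compose(g1, g2), f)
        for g1 in group
        for g2 in group
    )

sample_function = (10, 20, 30)
print(respects_products(act_on_function, 3, sample_function))
print(respects_products(act_without_inverse, 3, sample_function))
```

For an abelian group both rules would pass the check, which is why the distinction only becomes visible in a non-commutative example.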
\[
K = U\cdot\operatorname{Supp} F \subseteq X
\]
\[
(g, x) \longmapsto F(g^{-1}\cdot x)
\]
is continuous and so is uniformly continuous on the set U × K. Hence there
exists a δ > 0 for which
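\[
d(g, e) < \delta \implies g \in U \text{ and } \bigl|F(g^{-1}\cdot y) - F(y)\bigr| < \varepsilon/\sqrt[p]{\mu(K)}
\]
for all $y \in K$. Note also that $F(g^{-1}\cdot y) \neq 0$ for $y \in X$ and $g \in U$ implies
$g^{-1}\cdot y \in \operatorname{Supp} F$ and hence $y \in K$. Thus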
† That is, a measure µ with the property that at every point there is an open neighbourhood
of the point with finite µ-measure.
3.5 Group Actions and Representations 109
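\[
\begin{aligned}
\|\pi_g F - F\|_p^p &= \int_X \bigl|F(g^{-1}\cdot x) - F(x)\bigr|^p\,d\mu(x) \\
&= \int_K \underbrace{\bigl|F(g^{-1}\cdot x) - F(x)\bigr|^p}_{<\varepsilon^p/\mu(K)}\,d\mu(x) < \varepsilon^p
\end{aligned}
\]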
for all g with d(g, e) < δ. Together with (3.25), this gives
kπg f − f kp < 3ε
Fubini's theorem shows that $\phi * f(x) = \int_G f(g^{-1}\cdot x)\phi(g)\,dm_G(g)$ depends
measurably on $x \in X$. Integrating over $X$ with respect to $\mu$ gives
\[
\begin{aligned}
\int_X (\phi * f)^p\,d\mu &\le \int_X \int_G f(g^{-1}\cdot x)^p\,\phi(g)\,dm_G(g)\,d\mu(x) \\
&= \|f\|_p^p \int_G \phi(g)\,dm_G(g) = \|f\|_p^p,
\end{aligned}
\]
\[
G \times X \ni (g, x) \longmapsto f(g^{-1}\cdot x)^p\,\phi(g)
\]
and used the fact that $\pi_g$ preserves the $p$-norm (see Lemma 3.74). Taking
the $p$th root, we obtain $\|\phi * f\|_p \le \|f\|_p$ in the case considered.
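If $f \in L^p_\mu(X)$ and $\phi \in L^1(G)$, then we can apply the above to $\tilde f = |f|$
and $\tilde\phi = \|\phi\|_1^{-1}|\phi|$. Since
\[
\begin{aligned}
|\phi * f|(x) &= \Bigl|\int_G \phi(g) f(g^{-1}\cdot x)\,dm_G(g)\Bigr| \\
&\le \|\phi\|_1 \int_G \tilde\phi(g)\tilde f(g^{-1}\cdot x)\,dm_G(g) = \|\phi\|_1\,\tilde\phi * \tilde f(x),
\end{aligned}
\]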
χ(g) hv, wi = hχ(g)v, wi = hπg v, wi = hv, π−g wi = hv, η(−g)wi = η(g) hv, wi .
has weight χ.
Fix some choice of sample points for ξ, ζ, and η and note that xPi ∈ Pi
and zPi ∩Qj ∈ Pi ∩ Qj ⊆ Pi implies d(xPi , zPi ∩Qj ) < δ for every i and j. This
gives
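\[
\begin{aligned}
\|R(f,\xi) - R(f,\eta)\| &= \Bigl\|\sum_{i=1}^{k}\sideset{}{'}\sum_{j} \bigl(f(x_{P_i}) - f(z_{P_i\cap Q_j})\bigr)\mu(P_i\cap Q_j)\Bigr\| \\
&\le \sum_{i=1}^{k}\sideset{}{'}\sum_{j} \bigl\|f(x_{P_i}) - f(z_{P_i\cap Q_j})\bigr\|\,\mu(P_i\cap Q_j) \\
&\le \sum_{i=1}^{k}\sideset{}{'}\sum_{j} \varepsilon\,\mu(P_i\cap Q_j) = \varepsilon\,\mu(X),
\end{aligned}
\]
where we write $\sum_j'$ for the sum over those $j \in \{1,\dots,\ell\}$ with $P_i\cap Q_j \neq \emptyset$.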
The same holds for the Riemann sums $R(f,\zeta)$ and $R(f,\eta)$, which implies
that
\[
\|R(f,\xi) - R(f,\zeta)\| \le 2\mu(X)\varepsilon.
\]
This implies the lemma: If (ξn ) is a sequence satisfying (3.27), then for
every ε > 0 there exists some N such that ξ = ξm , ζ = ξn satisfy the above
discussions whenever m, n > N . This implies that (R(f, ξn )) is a Cauchy
sequence. If (ζn ) is another such sequence, we may mix the two sequences
of partitions into another sequence (by, for example, setting η2n−1 = ξn
and η2n = ζn for all n ∈ N) satisfying (3.27). Since the Riemann sums for
this sequence also form a Cauchy sequence, we see that the limit is indeed
independent of the choice of the sequence of partitions and the choice of the
sample points.
Note that in the context of Theorem 3.80 we may set X = G, the meas-
ure µ = mG , V = H, and f (g) = χ(g)πg (v) and use Proposition 3.81 to
obtain a definition of χ ∗π v.
Proof. To see this we only have to show that the right-hand side of (3.28)
defines a continuous functional on H, for then the Fréchet–Riesz representa-
tion theorem (Corollary 3.19) implies the claimed existence and uniqueness.
By the Cauchy–Schwarz inequality we have
\[
\Bigl|\int_X \langle v, f(x)\rangle\,d\mu(x)\Bigr| \le \int_X |\langle v, f(x)\rangle|\,d\mu(x) \le \|v\| \int_X \|f(x)\|\,d\mu(x),
\]
so the hypotheses show that the integral converges, and hence the map is
well-defined. Moreover, for any scalar α and v, w ∈ H we have by linearity of
the inner product that
\[
\int_X \langle \alpha v + w, f(x)\rangle\,d\mu(x) = \alpha\int_X \langle v, f(x)\rangle\,d\mu(x) + \int_X \langle w, f(x)\rangle\,d\mu(x),
\]
Lemma 3.84. Let $(X, d)$ be a compact metric space and let $\mu$ be a finite Borel
measure on $X$. Let $\mathcal{H}$ be a Hilbert space, and let $f : X \to \mathcal{H}$ be a continuous
map. Then $\text{R-}\!\int_X f\,d\mu = \int_X f\,d\mu$, that is, the strong and weak integrals agree.
where Fn is the simple function with values Fn (x) = hw, f (xP )i for all x ∈ P
and P ∈ ξn . Note that Fn (x) → hw, f (x)i as n → ∞ by continuity of f and
the assumption (3.27) on (ξn ). Letting n → ∞ we can apply the definition of
the strong integral and dominated convergence to obtain
\[
\Bigl\langle w, \text{R-}\!\int_X f\,d\mu \Bigr\rangle = \int_X \langle w, f(x)\rangle\,d\mu(x).
\]
As this holds for any w ∈ H, the lemma follows from the construction of the
weak integral in Proposition 3.83.
In the context of unitary representations of a group G on a Hilbert space H,
we can use the notions of integration above to define convolution with meas-
ures or with L1 functions as follows.
If G has a left Haar measure mG and φ ∈ L1 (G) = L1mG (G), then for v ∈ H
we define
\[
\phi *_\pi v = \int_G \phi(g)\pi_g v\,dm_G(g).
\]
We are now ready to prove Theorem 3.80 which, apart from the generalized
context, is a simple extension of the theorem regarding Fourier series on
compact abelian groups (Theorem 3.47). We will use the assumptions of the
theorem in this section without further remark.
Proof. If v ∈ Hχ then
\[
\chi *_\pi v = \int_G \overline{\chi(g)}\,\pi_g v\,dm_G(g) = \int_G \overline{\chi(g)}\chi(g)\,v\,dm_G(g) = v,
\]
Proof of Theorem 3.80. Let H′ be the closed linear hull of Hχ for all
characters χ of G. By Lemma 3.79 the various weight spaces are mutually
orthogonal, and (as already noted) closed, which gives us the following de-
scription of their closed linear hull
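\[
\mathcal{H}' = \bigoplus_{\chi} \mathcal{H}_\chi = \Bigl\{ \sum_{\chi} v_\chi \Bigm| v_\chi \in \mathcal{H}_\chi \text{ and } \sum_{\chi} \|v_\chi\|^2 < \infty \Bigr\},
\]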
see Exercise 3.37. If H′ = H then the theorem follows from Exercise 3.86(a)
(which specializes Proposition 3.83) and Lemma 3.88. In fact, (3.26) can then
be shown as follows: if $v = \sum_\chi v_\chi$ is an element of $\mathcal{H}'$ with $v_\chi \in \mathcal{H}_\chi$ for every
character $\chi$ of $G$, then continuity of convolution implies
\[
\chi *_\pi v = \chi *_\pi \Bigl(\sum_{\eta} v_\eta\Bigr) = \sum_{\eta} \chi *_\pi v_\eta = v_\chi
\]
• $F \ge 0$;
• $\int_G F\,dm_G = 1$; and
• $\int_{B_\delta^G} F\,dm_G > 1 - \varepsilon$.
The careful reader might notice that the approximation may not be real-valued
nor have integral one. To deal with this issue we let $F_1$ be the first
$\varepsilon'$-approximation and define $F_2$ to be the function $\frac12(F_1 + \overline{F_1}) + \varepsilon'$, which is a
positive $2\varepsilon'$-approximation since $f \ge 0$. Now define $F = \bigl(\int F_2\,dm_G\bigr)^{-1}F_2$.
Since $\int f\,dm_G = 1$, we have $\bigl|\int F_2\,dm_G - 1\bigr| < 2\varepsilon'$ and so $F$ is an
$\varepsilon$-approximation if $\varepsilon'$ is sufficiently small.
Then F ∗π v ∈ H′ is a finite linear combination of elements from weight
spaces by Lemma 3.87. However, we also claim that
kF ∗π v − vk 6 ε (1 + 2kvk) . (3.29)
6 kwkε (1 + 2kvk),
Corollary 3.89. Every function $f \in L^2(\mathbb{R}^2)$ can be written uniquely as a
sum $\sum_{n\in\mathbb{Z}} f_n$ that converges with respect to $\|\cdot\|_2$, where
\[
f_n(x) = \int_{\mathbb{T}} \chi_n(\phi)\,f(k(2\pi\phi)x)\,d\phi
\]
and fn has weight n with respect to the rotation action of T on R2 for all
integers n ∈ Z.
Exercise 3.90. Let f ∈ C ∞ (R2 ) ∩ L2 (R2 ) (or, more generally, f ∈ C ∞ (R2 )). Show that
the decomposition of f given by Corollary 3.89 converges uniformly on compact subsets
of R2 to f . How much smoothness is needed to arrive at this uniform convergence?
3.5.6 Convolution
for all g ∈ G. The integral defining f1 ∗ f2 (g) exists for mG -almost every g
in G, and defines an element f1 ∗ f2 ∈ L1 (G) with kf1 ∗ f2 k1 6 kf1 k1 kf2 k1 . In
other words, the convolution makes L1 (G) into a separable Banach algebra.
Suppose in addition π is a unitary representation of G on the Hilbert space H.
Then
f1 ∗π (f2 ∗π v) = (f1 ∗ f2 ) ∗π v,
where ∗π is defined in Definition 3.85. In other words, H is a module for the
Banach algebra L1 (G).
= f1 ∗ (f2 ∗ f3 )(g),
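Using the substitution $k = hg$ in the inner integral and exchanging the order
of integration we obtain
\[
\begin{aligned}
\langle w, f_1 *_\pi (f_2 *_\pi v)\rangle &= \iint \bigl\langle w, f_1(h) f_2(h^{-1}k)\pi_k v\bigr\rangle\,dm_G(k)\,dm_G(h) \\
&= \int \langle w, f_1 * f_2(k)\,\pi_k v\rangle\,dm_G(k) = \langle w, (f_1 * f_2) *_\pi v\rangle.
\end{aligned}
\]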
By the definition of the weak integral in Proposition 3.83, this implies the
proposition.
Exercise 3.94. Show that there is a continuous injective algebra homomorphism from
the Banach algebra ℓ1 (Z) (which may be thought of as L1 (Z) with respect to the counting
measure on Z and convolution) to C(T), where multiplication is pointwise multiplication.
• Hilbert spaces are at the heart of many developments. We will start to see
this in the context of Sobolev spaces and the Laplace differential operator
in Chapters 5 and 6.
• Another case where the Hilbert space splits into eigenspaces will be con-
sidered in Chapter 6.
• The spectral theory of a single unitary operator (equivalently, a unitary
representation of the group G = Z) is actually more delicate than the
case considered above (see Exercise 6.1, for example), where we showed
that the Hilbert space splits into a sum of generalized eigenspaces (the
weight spaces). We will treat the case of a single unitary operator only
in Chapter 9 (which will build on the material in Chapters 7 and 8).
• The topic of Fourier series on Td leads naturally to the study of the
Fourier integral on Rd (see Section 9.2). The concepts of Fourier series and
Fourier integrals on Td and Rd , respectively, find a common generalization
in the theory of Pontryagin duality (see Section 12.8).
• The case of unitary representations for compact abelian groups considered
in this chapter was quite straightforward and is only the beginning of the
important theory of unitary representations of locally compact groups.
For locally compact abelian groups this is strongly related to Pontryagin
duality; see Sections 11.4 and 12.8. For compact groups the main theorem
in this direction is the Peter–Weyl theorem [85] (which is covered in
Folland [32]). For many other groups that are neither abelian nor compact
this topic is also important and can have many interesting surprises.
• One such surprise may be the so-called property (T) that was introduced
by Každan in 1967 and has become important in many parts of math-
ematics since then. Building on the material in Chapter 9 we will study
this notion in Section 10.3.
• We have seen in this chapter that the notion of a left Haar measure leads
to many interesting concepts. For a concretely given group it is often
not difficult to find its left Haar measure. In Chapter 10 (which relies
on Chapter 7) we will prove the existence of the left Haar measure in
general.
The reader may continue with Chapter 4, 5, 6, or 7 (with some of the
material of Chapter 6 building on Chapter 5).
Chapter 4
Uniform Boundedness and the Open
Mapping Theorem
The reader in a hurry may also first prove the Baire category theorem
(Theorem 4.12) and derive Theorem 4.1 relatively quickly from it (see Ex-
ercise 4.16). We refrain from doing this here as it might help her to see the
argument behind the Baire category theorem once here in the concrete ap-
plication and once in the general case.
Proof of Theorem 4.1. Assume first that there is an open ball Bε (x0 ) on
which
{Tα x | α ∈ A}
is uniformly bounded: that is, there is a constant K such that
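\[
\|T_\alpha y\| \le 2\,\frac{K + \|T_\alpha x_0\|}{\varepsilon}\,\|y\| \le 2\,\frac{K + K'}{\varepsilon}\,\|y\|,
\]
where $K' = \sup_\alpha \|T_\alpha x_0\| \le K < \infty$. It follows that
\[
\|T_\alpha\| \le \frac{4K}{\varepsilon}
\]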
for every α ∈ A, as required.
To finish the proof we have to show that there is a ball on which prop-
erty (4.1) holds. This is proved by contradiction. Assume that there is no ball
on which (4.1) holds. Fix an arbitrary open ball B0 . By assumption there is
a point x1 ∈ B0 such that
kTα1 x1 k > 1
for some index α1 ∈ A. Since each Tα is continuous, there is a ball Bε1 (x1 )
with kTα1 yk > 1 for all y ∈ Bε1 (x1 ). Assume without loss of generality
that Bε1 (x1 ) ⊆ B0 and ε1 < 1. By assumption, in this new ball the fam-
ily {Tα x | α ∈ A} is not bounded, so there is a point x2 ∈ Bε1 (x1 ) with
kTα2 x2 k > 2
for all n > 1. Now the sequence (xn ) is clearly Cauchy (since xm ∈ Bεn (xn )
for all m > n, and so d(xm , xn ) < εn < 1/n), and therefore converges to
some z ∈ X. By construction, z ∈ Bεn (xn ) and kTαn zk > n for all n > 1,
which contradicts the hypothesis that the set {Tα z | α ∈ A} is bounded.
Corresponding to the operator norm defined in Lemma 2.52 there is of
course a notion of convergence in the space B(X, Y ) of bounded linear op-
erators from X to Y . A sequence (Tn ) in B(X, Y ) is uniformly convergent
to T ∈ B(X, Y ) if kTn − T k → 0 as n → ∞ (so uniform convergence of a
sequence of operators is simply convergence in the operator norm).
4.1 Uniform Boundedness 123
Corollary 4.3. Let X be a Banach space, and Y any normed vector space.
If a sequence (Tn ) in B(X, Y ) is strongly convergent, then there exists an
operator T ∈ B(X, Y ) such that (Tn ) is strongly convergent to T .
Exercise 4.5. Phrase the definition of a unitary representation of a metric group (Defin-
ition 3.73) using the notion of strong convergence for operators.
Exercise 4.6. Let cc (N) ⊆ ℓ∞ (N) be the space of sequences with finite support equipped
with the supremum norm. Define T : cc (N) → cc (N) by
for all (x1 , x2 , x3 , . . .) ∈ cc (N). Show that T is not bounded. Construct a sequence of
bounded linear operators Tk on cc (N) with Tk x → T x as k → ∞ for all x in cc (N).
\[
s_n(x) = \sum_{m=-n}^{n} a_m e^{2\pi i m x}.
\]
Recall that one of the basic goals of Fourier analysis is to clarify the relationship
between the sequence of partial sums $(s_n)$ and the function $f$; that is,
to understand in what sense the function $s_n$ approximates $f$ for large $n$
(if it does at all). We now ask if the sequence of functions $(s_n)$ converges
uniformly or pointwise to $f$ for $f \in C(\mathbb{T})$.
Recall from Definition 3.61 that the Dirichlet kernel Dn is defined by
\[
D_n(x) = \sum_{k=-n}^{n} e^{2\pi i k x} = \frac{\sin\bigl((n+\frac12)2\pi x\bigr)}{\sin(\pi x)}
\]
is bounded, with
\[
\|T_n\| = \int_{\mathbb{T}} |D_n(x)|\,dx.
\]
This is a very special case of the general argument in Lemma 2.63, but we
include it for the case at hand as this is easier to prove.
Proof. For any function f ∈ C(T) we have
\[
|T_n f| \le \int_{\mathbb{T}} |f(x)||D_n(x)|\,dx \le \|f\|_\infty \int_{\mathbb{T}} |D_n(x)|\,dx,
\]
so
\[
\|T_n\| \le \int_{\mathbb{T}} |D_n(x)|\,dx.
\]
Fix δ > 0. Since Dn is analytic it can only have finitely many sign changes
in [0, 1]. Therefore, we may find a continuous (this could even be chosen to
be piecewise-linear, for example) function fn with kfn k∞ 6 1 that differs
from $\operatorname{sign}(D_n(x))$ only on a finite union of intervals whose total length is less
than $\frac{1}{\|D_n\|_\infty}\delta$. The triangle inequality for integrals now gives
\[
\int_{\mathbb{T}} f_n(x)D_n(x)\,dx \ge \int_{\mathbb{T}} |D_n(x)|\,dx - 2\delta,
\]
as n → ∞.
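Now $|\sin t| \ge \frac12$ for all $t \in \pi\mathbb{Z} + [\frac{\pi}{6}, \frac{5\pi}{6}]$. In particular, it follows that if
\[
(2n+1)\pi x \in \bigcup_{k=0}^{2n} \bigl[(k+\tfrac16)\pi, (k+\tfrac56)\pi\bigr]
\]
then
\[
|\sin((2n+1)\pi x)| \ge \tfrac12.
\]
On the $k$th of these intervals we have $x \le (k+\frac56)/(2n+1)$ and hence $|\sin(\pi x)| \le \pi x \le \pi(k+\frac56)/(2n+1)$.
Together this gives
\[
\begin{aligned}
\int_{\mathbb{T}} \Bigl|\frac{\sin\bigl((n+\frac12)2\pi x\bigr)}{\sin(\pi x)}\Bigr|\,dx &\ge \sum_{k=0}^{2n} \int_{(k+\frac16)/(2n+1)}^{(k+\frac56)/(2n+1)} \frac12\cdot\frac{1}{\pi(k+\frac56)/(2n+1)}\,dx \\
&= \frac{1}{2\pi}\sum_{k=0}^{2n} \frac{2n+1}{k+\frac56}\cdot\frac{\frac46}{2n+1} = \frac{1}{3\pi}\sum_{k=0}^{2n}\frac{1}{k+\frac56} \longrightarrow \infty
\end{aligned}
\]
as $n \to \infty$.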
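The divergence of these integrals is slow, and it may be instructive to see it numerically. The Python sketch below approximates the integral of |D_n| over the torus by the midpoint rule and also evaluates the harmonic-type sum over k = 0, ..., 2n of 1/(k + 5/6) that drives the lower bound; the function names and the number of quadrature nodes per oscillation are our own assumptions, chosen only for illustration.

```python
import math

def dirichlet_kernel(n, x):
    """Closed form of D_n(x) for x not an integer."""
    return math.sin((n + 0.5) * 2 * math.pi * x) / math.sin(math.pi * x)

def lebesgue_constant(n, samples_per_oscillation=200):
    """Midpoint-rule approximation of the integral of |D_n| over [0, 1].
    Midpoints never hit the zeros of sin(pi*x), so the closed form is safe."""
    samples = samples_per_oscillation * (2 * n + 1)
    return sum(
        abs(dirichlet_kernel(n, (j + 0.5) / samples)) for j in range(samples)
    ) / samples

def harmonic_type_sum(n):
    """Sum of 1 / (k + 5/6) for k = 0, ..., 2n, which grows like log(n)."""
    return sum(1.0 / (k + 5.0 / 6.0) for k in range(2 * n + 1))

for n in (1, 10, 100):
    print(n, lebesgue_constant(n), harmonic_type_sum(n))
```

The computed values grow roughly like a multiple of log n, which is consistent with the unboundedness used in the argument but of course does not replace the proof.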
Proof. As noted before Lemma 4.7, we have Tn f = sn (0) for all f ∈ C(T).
Moreover, for a fixed f ∈ C(T), if the Fourier series of f converges at 0, then
the family {Tn f | n > 1} is bounded (since each element is just a partial
sum of a convergent series). Thus if the Fourier series of f converges at 0
for all f ∈ C(T), then for each f ∈ C(T) the set {Tn f | n > 1} is bounded.
By Theorem 4.1, this implies that the set {kTn k | n > 1} is bounded, which
contradicts Lemmas 4.7 and 4.8.
It follows that there must be some f ∈ C(T) whose Fourier series does not
converge at 0 (and in fact the partial sums must be unbounded).
In principle the proofs of Theorem 4.1 and Theorem 4.9 allow one to con-
struct the function f as in Theorem 4.9 more concretely, at least as the
limit of a Cauchy sequence of explicit continuous functions. Comparing The-
orem 4.9 with the absolute convergence claim in Theorem 3.57 and the result
126 4 Uniform Boundedness and the Open Mapping Theorem
regarding the Fejér kernel in Proposition 3.65, we see that this limit func-
tion is not continuously differentiable and that the Fourier series of f at 0
is an oscillating function with the property that the Cesàro averages of the
diverging sequence (sn (0)) actually converge to f (0).
Recall that a continuous map has the property that the pre-image of any
open set is open, but in general the image of an open set is not open. We now
show that bounded linear maps between Banach spaces on the other hand
have the following special property.
Theorem 4.10 (Open mapping theorem). Let X and Y be Banach
spaces, and let T be a bounded linear map from X onto Y . Then T maps
open sets in X onto open sets in Y .
The assumption that T maps X onto Y is essential. Consider, for example,
the projection (x, y) 7→ (x, 0) from R2 to R2 to see this.
The proof of Theorem 4.10 uses the Baire category theorem,(13) which
states that a complete non-empty metric space cannot be written as a count-
able union of nowhere dense subsets.
for any open ball $B_\varepsilon(x)$ since $G_n$ is dense. Therefore, $X_n$ is nowhere dense
for each $n \ge 1$. By Theorem 4.12 the complement of $\bigcup_{n=1}^{\infty} X_n$ is dense, and
this is precisely $\bigcap_{n=1}^{\infty} G_n$, by construction.
Exercise 4.16. Prove the Banach–Steinhaus theorem (Theorem 4.1) using the Baire cat-
egory theorem (Theorem 4.12).
Let us mention that the notion of a dense Gδ -set is the topological ver-
sion of being a ‘large’ set, while a set is measure-theoretically ‘large’ if its
complement is a null set. Both notions of being large share similar features,
and in particular a countable intersection of large sets in either sense is also
large.(14) However, these two notions are quite different. Example 4.17 shows
how to construct topologically large sets that are measure-theoretically small,
and vice-versa.
Example 4.17. For every ε > 0 there exists an open set Oε ⊆ R which con-
tains Q and has Lebesgue measure less than ε. This may be found, for ex-
ample, by listing the elements of Q as {x1 , x2 , . . . } and setting
\[
O_\varepsilon = \bigcup_{k\ge 1} B_{\varepsilon/2^{k+2}}(x_k).
\]
Then $G = \bigcap_{n\ge 1} O_{1/n}$ is a dense $G_\delta$ and a null set, and its complement $\mathbb{R}\setminus G$
is meagre and of full measure.
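The measure estimate in Example 4.17 can be illustrated with exact rational arithmetic. The Python sketch below takes a finite initial segment of one particular enumeration of the rationals in [0, 1] (ordered by denominator, which is a hypothetical choice of ours; the example lists all of Q), forms the corresponding intervals, and computes the length of their union by merging overlapping intervals.

```python
from fractions import Fraction
from math import gcd

def first_rationals(count):
    """First `count` rationals in [0, 1], ordered by denominator and then
    numerator (one possible enumeration, chosen for illustration)."""
    found = []
    q = 1
    while len(found) < count:
        for p in range(q + 1):
            if gcd(p, q) == 1:
                found.append(Fraction(p, q))
                if len(found) == count:
                    break
        q += 1
    return found

def cover_intervals(points, eps):
    """Intervals B_{eps / 2^(k+2)}(x_k) for k = 1, 2, ..., as (left, right) pairs."""
    return [
        (x - eps / 2 ** (k + 2), x + eps / 2 ** (k + 2))
        for k, x in enumerate(points, start=1)
    ]

def union_length(intervals):
    """Total length of a finite union of intervals, merging overlaps."""
    total = Fraction(0)
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total

eps = Fraction(1, 10)
points = first_rationals(50)
print(float(union_length(cover_intervals(points, eps))), float(eps / 2))
```

Since the k-th interval has length ε/2^(k+1), the lengths sum to at most ε/2 no matter how many points are used, which is why the bound survives the passage to all of Q.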
The Baire category theorem can be used to show the existence of elements
of a complete metric space with certain properties. If the set of elements of
a complete space which do not satisfy the property can be obtained as a
countable union of nowhere dense sets, there must be elements that satisfy
the property (indeed, there exists a dense set of such elements).
Exercise 4.18. (a) Assume that X and Y are metric spaces. Show that for any f : X → Y
the set {x ∈ X | f is continuous at x} is a Gδ -set.
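(b) Show that the map $f : \mathbb{R} \to \mathbb{R}$ defined by
\[
f(x) = \begin{cases} \frac{1}{q} & \text{if } x = \frac{p}{q} \in \mathbb{Q}; \\ 0 & \text{if } x \in \mathbb{R}\setminus\mathbb{Q} \end{cases}
\]
is continuous at each irrational point and is discontinuous at each rational point, where we
assume that $\frac{p}{q}$ is written in lowest terms and has $q \ge 1$.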
(c) Use (a) to show that no function could have the reverse properties of the function
in (b).
Exercise 4.19. Show that $\bigl\{ f \in L^1((0,1)) \bigm| \|f|_{(a,b)}\|_\infty = \infty \text{ whenever } 0 \le a < b \le 1 \bigr\}$
contains a dense $G_\delta$-subset of $L^1((0,1))$.
Exercise 4.20. Show that the set of functions in C([0, 1]) that are nowhere differentiable
contains a dense Gδ -set.
Recall that we write BrX and BrY for the open balls of radius r and centre 0
in X and Y , respectively.
4.2 The Open Mapping and Closed Graph Theorems 129
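and similarly $B_{2\varepsilon}^X = B_\varepsilon^X - B_\varepsilon^X$. Therefore,
\[
B_{2\delta}^Y \subseteq \overline{T B_\varepsilon^X} - \overline{T B_\varepsilon^X} \subseteq \overline{T B_{2\varepsilon}^X}
\]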
\[
y - Tx_1 \in B_{\delta_2}^Y,
\]
the inclusion (4.5) with $n = 2$ implies that there exists a point $x_2 \in B_{\varepsilon_2}^X$ such
that $\|y - Tx_1 - Tx_2\| < \delta_3$. Continuing, we obtain a sequence $(x_n)$ in $X$ such
that $\|x_n\| < \varepsilon_n$ for all $n$, and
\[
\Bigl\| y - T\Bigl(\sum_{k=1}^{n} x_k\Bigr) \Bigr\| < \delta_{n+1}. \tag{4.6}
\]
x + BεX ⊆ O.
kxk(1) 6 Kkxk(2) ,
for all x ∈ X, then the two norms are equivalent. That is, there is another
constant K ′ > 0 with
kxk(2) 6 K ′ kxk(1)
for all x ∈ X.
GT = {(x, T x) | x ∈ DT } ⊆ X × Y.
Notice as usual that this notion becomes trivial in finite dimensions in the
following sense. If X and Y are finite-dimensional, then the graph of T is
simply some linear subspace, which is automatically closed. Also it is easy
to see that a continuous operator has a closed graph. The next theorem —
the converse — is called the closed graph theorem. Notice that this converse
is not a purely topological fact. For instance, the set consisting of the graph
of the hyperbola xy = 1 and the origin is the closed graph of a discontinuous
function f : R → R.
Proof. Fix the norm k(x, y)k = kxkX + kykY on X × Y . The graph GT is,
by hypothesis, a closed subspace of X × Y , so GT is itself a Banach space.
Consider the projection P : GT → X defined by P (x, T x) = x. Then P is
clearly bounded, linear, and bijective. It follows by Proposition 4.25 that P −1
is a bounded linear operator from X to GT , so
for all x ∈ X, for some constant K. It follows that kxkX + kT xkY 6 KkxkX
for all x ∈ X, so T is bounded, and hence continuous by Lemma 2.52.
Proof. Notice that the hypotheses in the statement do not require that the
map is continuous, but simply ask that the range lies in L2µ (X). However,
if (fn , gfn ) has fn → f and gfn → ψ as n → ∞ in L2µ (X), (that is, a
sequence in the graph that converges to (f, ψ)), then we can extract a sub-
sequence along which both convergences hold µ-almost everywhere. Along
this subsequence gfn converges almost everywhere to gf and to ψ, so that
gf = ψ ∈ L2µ (X),
and hence (f, ψ) also lies in the graph of T . It follows that T is closed, and
hence continuous by Theorem 4.28.
Knowing now that T is bounded, there is a constant C = kT k > 0 such
that kgf k2 6 Ckf k2 for any f ∈ L2µ (X). Let
Using the theory of Fourier series developed in Section 3.4, we will now de-
velop the notion of Sobolev spaces and prove the Sobolev embedding theorem.
Sobolev spaces combine familiar notions of smoothness (that is, differentiabil-
ity) with bounds on Lp norms. We will set p = 2 and so will have all the tools
of Hilbert spaces at our disposal, but the theory can be extended to all p > 1.
The Sobolev embedding theorem and elliptic regularity for the Laplace op-
erator will allow us to prove in Section 5.3 the existence of solutions to the
Dirichlet boundary value problem introduced in Section 1.2.
Definition 5.1. Let $k \ge 0$ be an integer. We (initially) define the ($L^2$) Sobolev
space $H^k(\mathbb{T}^d)$ to be the closure of $C^\infty(\mathbb{T}^d)$ inside
\[
V = \bigoplus_{\|\alpha\|_1 \le k} L^2(\mathbb{T}^d),
\]
where the direct sum runs over all multi-indices $\alpha \in \mathbb{N}_0^d$ with $\|\alpha\|_1 \le k$ and a
function $f \in C^\infty(\mathbb{T}^d)$ is identified with the tuple $\phi_k(f) = (\partial_\alpha f)_{\|\alpha\|_1 \le k} \in V$.
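\[
H^1(\mathbb{T}^d) = \overline{\phi_1(C^\infty(\mathbb{T}^d))}
\]
is the closure of $\phi_1(C^\infty(\mathbb{T}^d))$ in $\bigl(L^2(\mathbb{T}^d)\bigr)^{d+1}$, where we used the
embedding $\phi_1 : f \mapsto (f, \partial_1 f, \dots, \partial_d f) \in V$. So, by our definition, elements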
of H 1 (Td ) are (d + 1)-tuples of functions on Td . In order to be able to
think of these as single functions on Td (which is how we will think of So-
bolev spaces), notice that the last d terms of the (d+1)-tuple are uniquely
determined by the first term. This is clear for φ1 (f ) with f ∈ C ∞ (Td ),
but also remains true in the closure H 1 (Td ), as we show next.
Lemma 5.2 (Fourier series of weak derivatives). Suppose that the vector
$(f, f_1, \dots, f_d)$ belongs to $H^1(\mathbb{T}^d)$ and the Fourier series of $f$ is given by
\[
f = \sum_{n\in\mathbb{Z}^d} c_n\chi_n.
\]
Then
\[
f_j = \sum_{n\in\mathbb{Z}^d} 2\pi i n_j c_n \chi_n. \tag{5.1}
\]
Proof. For f ∈ L2 (Td ) and n ∈ Zd , write an (f ) for the nth Fourier coeffi-
cient. We start with the formula
\[
a_n(\partial_j f) = \langle \partial_j f, \chi_n\rangle = 2\pi i n_j \langle f, \chi_n\rangle = 2\pi i n_j a_n(f)
\]
for all n ∈ Zd and all f ∈ C ∞ (Td ), see (3.18). Using continuity of the inner
product and the definition of H 1 (Td ), this formula automatically extends to
all (f, f1 , . . . , fd ) ∈ H 1 (Td ). Expanding fj into its Fourier series (see The-
orem 3.54) gives the lemma.
The lemma now shows in full generality that the first component f of any
element (f, f1 , . . . , fd ) ∈ H 1 (Td ) determines all the other components. Thus
we can identify an element of H 1 (Td ) with the associated element f ∈ L2 (Td ),
and will write f ∈ H 1 (Td ) and ∂ j f = fj ∈ L2 (Td ) for j = 1, . . . , d. We will
also call the other components ∂ j f weak derivatives (this will be further
justified in Section 5.2), as these generalize the notion of partial derivative
for smooth functions. However, the norm associated to f ∈ H 1 (Td ) is
5.1 Sobolev Spaces and Embedding on the Torus 137
\[
\|f\|_{H^1} = \sqrt{\|f\|_2^2 + \sum_{j=1}^{d} \|\partial_j f\|_2^2}.
\]
with norm less than or equal to one. Finally, (3.18) holds similarly for all f
in H k (Td ) and for all α ∈ Nd0r{(0, . . . , 0)} with kαk1 6 k.
Proof. For the first claim consider the map
\[
\pi_{k,\ell} : \bigoplus_{\|\alpha\|_1 \le k} L^2(\mathbb{T}^d) \longrightarrow \bigoplus_{\|\alpha\|_1 \le \ell} L^2(\mathbb{T}^d)
\]
and notice that πk,ℓ (φk (f )) = φℓ (f ) for all f ∈ C ∞ (Td ). Therefore the ex-
tended map ık,ℓ is simply the restriction of this projection to H k (Td ), and so
has norm less than or equal to one. Using constant functions we see that the
norm of ık,ℓ is equal to one. Injectivity will follow from the last claim of the
proposition.
For the second claim, regarding the operator $\boldsymbol{\partial}_j : H^k(\mathbb{T}^d) \to H^{k-1}(\mathbb{T}^d)$, we modify the argument above as follows. Consider the projection map
\[ \pi_j : \bigoplus_{\|\alpha\|_1 \le k} L^2(\mathbb{T}^d) \longrightarrow \bigoplus_{\|\alpha\|_1 \le k-1} L^2(\mathbb{T}^d) \]
which clearly has norm one. Figure 5.1 illustrates the difference between the projection $\pi_{k,\ell}$ and the projection $\pi_j$ in a simple example. For $f \in C^\infty(\mathbb{T}^d)$ we see that $\pi_j(\phi_k(f)) = \phi_{k-1}(\partial_{e_j} f)$, which (as above) shows that the restriction of $\pi_j$ to $H^k(\mathbb{T}^d)$ is the desired operator $\boldsymbol{\partial}_j : H^k(\mathbb{T}^d) \to H^{k-1}(\mathbb{T}^d)$.
The final claim of the proposition follows from the description of the Fourier series of the weak derivative in Lemma 5.2 for $k = 1$ and induction.
Now justified by Proposition 5.3, we identify an element f = (fα )kαk1 6k
in H k (Td ) with its first component f0 in L2 (Td ). The other components are
138 5 Sobolev Spaces and Dirichlet’s Boundary Problem
for all multi-indices $\alpha$ with $\|\alpha\|_1 \le k$. In this notation our norm becomes
\[ \|f\|_{H^k} = \sqrt{\sum_{\|\alpha\|_1 \le k} \|\boldsymbol{\partial}_\alpha f\|_2^2}. \]
As we have seen in the discussion above, each of the spaces $H^k(\mathbb{T}^d)$ consists of certain $L^2$ functions on $\mathbb{T}^d$. For $k = 0$ we have $H^0(\mathbb{T}^d) = L^2(\mathbb{T}^d)$. A natural question for $k \ge 1$ is to ask which functions in $L^2(\mathbb{T}^d)$ lie in $H^k(\mathbb{T}^d)$. Using Fourier series we can give a formal answer to this, and this will have interesting and important consequences which will be discussed below. Another consequence of this lemma is that it makes it meaningful to define $H^k$ for $k \in \mathbb{R}$ by using the convergence property in the lemma as a definition (we will not pursue this further).
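Lemma 5.4 (Characterizing $H^k(\mathbb{T}^d)$ by the Fourier series). Let $k \ge 0$ be an integer and let $f = \sum_{n \in \mathbb{Z}^d} c_n \chi_n \in L^2(\mathbb{T}^d)$. Then $f \in H^k(\mathbb{T}^d)$ if and only if
\[ \sum_{n \in \mathbb{Z}^d} |c_n|^2 \|n\|_2^{2k} < \infty. \tag{5.2} \]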
for all $\alpha$ with $\|\alpha\|_1 \le k$. We apply this to $\alpha = ke_1, ke_2, \dots, ke_d$ and see that
\[ \sum_{j=1}^d \sum_{n \in \mathbb{Z}^d} n_j^{2k} |c_n|^2 < \infty. \]
Using the bound $\|n\|_2^{2k} \ll_d \sum_{j=1}^d n_j^{2k}$ for all $n \in \mathbb{Z}^d$, we get (5.2) as required.
Conversely, assume (5.2). Then for any $\alpha \in \mathbb{N}_0^d \setminus \{(0, \dots, 0)\}$ with $\|\alpha\|_1 \le k$ we have $|n^\alpha| \le \|n\|_2^k$, and so
\[ \sum_{n \in \mathbb{Z}^d} |(2\pi i n)^\alpha c_n|^2 < \infty. \]
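The summability criterion (5.2) is easy to experiment with when $d = 1$ and $k \ge 1$. The sketch below uses two hypothetical coefficient families, $c_n = |n|^{-2}$ and $c_n = |n|^{-1}$ for $n \ne 0$ (both square-summable, so both define elements of $L^2(\mathbb{T})$), and compares the partial sums of $\sum |c_n|^2 |n|^{2k}$ for $k = 1$. The function names and the cutoff are assumptions made for illustration; the term $n = 0$ is omitted because its weight vanishes for $k \ge 1$.

```python
import math

def weighted_partial_sum(coeff, k, cutoff):
    """Partial sum of |c_n|^2 * |n|^(2k) over 0 < |n| <= cutoff (case d = 1, k >= 1)."""
    return sum(abs(coeff(n)) ** 2 * abs(n) ** (2 * k)
               for n in range(-cutoff, cutoff + 1) if n != 0)

def fast_decay(n):
    # Hypothetical coefficients c_n = |n|^(-2): the weighted sum for k = 1 converges.
    return 1.0 / (n * n)

def slow_decay(n):
    # Hypothetical coefficients c_n = |n|^(-1): the weighted sum for k = 1 diverges.
    return 1.0 / abs(n)

# Limit of the weighted sums for fast_decay and k = 1, namely 2 * zeta(2).
FAST_DECAY_LIMIT = math.pi ** 2 / 3
```

For `fast_decay` the partial sums stabilize near `FAST_DECAY_LIMIT`, while for `slow_decay` they grow linearly in the cutoff, so only the first family satisfies the criterion for $k = 1$.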
The following theorem shows (in a more constructive manner than the previous exercise) how special the elements of the subset $H^k(\mathbb{T}^d)$ within $L^2(\mathbb{T}^d)$ become once $k$ is sufficiently large. If $k > \frac{d}{2}$, then any element of $H^k(\mathbb{T}^d)$ agrees almost surely (and will be identified) with a continuous function. Increasing $k$ further also gives some differentiability of this continuous function. The proof will show that most of the work has already been done.
Proof of Theorem 5.6. Let us start with the case ℓ = 0. In this case we
already know that
\[ \|f\|_\infty \ll_d \sqrt{\|f\|_2^2 + \|\partial_{e_1}^k f\|_2^2 + \cdots + \|\partial_{e_d}^k f\|_2^2} \]
for f ∈ C ∞ (Td ) by Theorem 3.57. However, the square root on the right-hand
side is bounded above by kf kH k , which shows that the inclusion map
\[ \imath : \bigl(C^\infty(\mathbb{T}^d), \|\cdot\|_{H^k}\bigr) \longrightarrow \bigl(C(\mathbb{T}^d), \|\cdot\|_\infty\bigr) \]
really does select a continuous representative. For this, notice that the com-
position of the inclusion maps
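\[ C^\infty(\mathbb{T}^d) \xrightarrow{\;\imath\;} C(\mathbb{T}^d) \longrightarrow L^2(\mathbb{T}^d) \]
\[ \|f\|_{C^\ell} = \max_{\|\gamma\|_1 \le \ell} \|\partial_\gamma f\|_\infty. \]
\[ \|f\|_{C^\ell} \ll_d \|f\|_{H^k} \]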
for f ∈ C ∞ (Td ), and the inclusion C ∞ (Td ) → C ℓ (Td ) once again gives rise
to a bounded operator ıℓ : H k (Td ) → C ℓ (Td ). Composing with the inclusion
map from C ℓ (Td ) to L2 (Td ) we again see that f ∈ H k (Td ) agrees almost
everywhere with ıℓ (f ) ∈ C ℓ (Td ).
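For $d = 1$, $k = 1$ and $\ell = 0$ the sup-norm estimate can be illustrated directly on Fourier coefficients. The sketch below uses the standard Cauchy-Schwarz argument $\sup|f| \le \sum_n |c_n| \le C \|f\|_{H^1}$ with $C^2 = \sum_n (1 + (2\pi n)^2)^{-1}$; this is offered as an illustration under the assumption that $f$ is a trigonometric polynomial, and is not claimed to be the argument of Theorem 3.57. The function name and the dictionary format are hypothetical.

```python
import math

def sup_norm_bound(coeffs):
    """coeffs maps integer frequencies n to Fourier coefficients c_n (finitely many).

    Returns (sum of |c_n|, H^1 norm of f, Cauchy-Schwarz constant), where the
    constant is computed only over the frequencies that actually occur."""
    abs_sum = sum(abs(c) for c in coeffs.values())
    # ||f||_{H^1}^2 = sum over n of (1 + (2*pi*n)^2) * |c_n|^2 by Parseval and (5.1).
    h1_norm = math.sqrt(sum((1 + (2 * math.pi * n) ** 2) * abs(c) ** 2
                            for n, c in coeffs.items()))
    const = math.sqrt(sum(1 / (1 + (2 * math.pi * n) ** 2) for n in coeffs))
    return abs_sum, h1_norm, const
```

Since $|f(x)| \le \sum_n |c_n|$ for every $x$, the inequality `abs_sum <= const * h1_norm` bounds the sup norm by a multiple of the $H^1$ norm, and the constant stays bounded as more frequencies are added because $\sum_n (1 + (2\pi n)^2)^{-1}$ converges.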
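Definition 5.7. Let $d \ge 1$ and $k \ge 0$ be integers, and let $U \subseteq \mathbb{R}^d$ be an open subset. Then the ($L^2$) Sobolev space $H^k(U)$ is defined† to be the closure of
\[ \bigl\{ (\partial_\alpha f)_\alpha \mid f \in C^\infty(U),\ \partial_\alpha f \in L^2(U) \text{ for } \|\alpha\|_1 \le k \bigr\} \tag{5.3} \]
inside $\bigoplus_{\|\alpha\|_1 \le k} L^2(U)$, where as before we take the direct sum over all $\alpha \in \mathbb{N}_0^d$ with $\|\alpha\|_1 \le k$.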
Even though the closure H k (U ) contains many new functions that are not
in C ∞ (U ), those new elements
still have some of the properties of the elements in the subspace (5.3) used
to define H k (U ). In fact, as we will show below, f = f0 determines all the
other components fα of the vector (5.4), and these are derivatives of f in the
following weaker sense (which, as we will see, turns integration by parts into
the definition of a derivative).
for all φ ∈ Cc∞ (U ). In the case α = ej , we will call this the weak jth partial
derivative and write g = ∂ j f .
for $x \in (-1, 1)$. Then $f$ has weak $e_1$-partial derivative $g$. In fact, for $\phi$ in $C_c^\infty((-1, 1))$ we have
\[ \int_{-1}^{1} f \phi' \,dx = \int_0^1 x \phi'(x) \,dx = \bigl[x \phi(x)\bigr]_0^1 - \int_0^1 \phi \,dx = 0 - \int_{-1}^{1} g \phi(x) \,dx, \]
as required.
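The integration-by-parts identity in this example can be tested numerically. The sketch below assumes, as the displayed computation suggests, that $f(x) = x$ for $x > 0$ and $f(x) = 0$ otherwise, with $g$ the indicator function of $(0, 1)$; the test function `phi` (a bump function times a linear factor) and the midpoint quadrature are hypothetical choices made only for illustration.

```python
import math

def bump(x):
    # Standard smooth bump supported in [-1, 1].
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1 else 0.0

def phi(x):
    # Hypothetical test function in C_c^infty((-1, 1)), deliberately not symmetric.
    return (1 + 0.5 * x) * bump(x)

def dphi(x):
    # Classical derivative of phi, using bump'(x) = bump(x) * (-2x) / (1 - x^2)^2.
    if abs(x) >= 1:
        return 0.0
    dbump = bump(x) * (-2.0 * x) / (1.0 - x * x) ** 2
    return 0.5 * bump(x) + (1 + 0.5 * x) * dbump

def f(x):
    return x if x > 0 else 0.0

def g(x):
    return 1.0 if x > 0 else 0.0

def midpoint(func, a, b, steps=20000):
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

def weak_derivative_defect():
    """|integral of f*phi' + integral of g*phi| over (-1, 1), split at 0 where f has a kink."""
    lhs = midpoint(lambda x: f(x) * dphi(x), -1, 0) + midpoint(lambda x: f(x) * dphi(x), 0, 1)
    rhs = -(midpoint(lambda x: g(x) * phi(x), -1, 0) + midpoint(lambda x: g(x) * phi(x), 0, 1))
    return abs(lhs - rhs)
```

The defect is zero up to quadrature error, which is the defining property of the weak derivative for this particular test function.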
† In the literature another notation that is used is W k,2 . The more general case of W k,p
is defined similarly using Lp (U ) instead of L2 (U ).
for any y ∈ U , where the boundary terms vanish since φ ∈ Cc∞ (U ). Integ-
rating over the remaining variables shows that ∂ej f is indeed also a weak ej -
partial derivative. By induction on kαk1 , this implies that
Using this identification, the subspace (5.3) will from now on be referred to
as C ∞ (U ) ∩ H k (U ).
\[ \boldsymbol{\partial}_\alpha : H^k(U) \longrightarrow H^{k - \|\alpha\|_1}(U), \qquad f \longmapsto \boldsymbol{\partial}_\alpha f \]
We will see later that elements of $H_0^k(U)$ ‘vanish in the square-mean norm sense’ at $\partial U$ if $k \ge 1$.
Let us add the following remark to Definition 5.7. We defined H k (U ) to
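consist of those $f \in L^2(U)$ that have weak $\alpha$-partial derivatives $\boldsymbol{\partial}_\alpha f \in L^2(U)$ for all $\alpha$ with $\|\alpha\|_1 \le k$ such that the vector
\[ (\boldsymbol{\partial}_\alpha f)_{\|\alpha\|_1 \le k} \in \bigoplus_{\|\alpha\|_1 \le k} L^2(U) \]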
C ∞ (U ) ∩ H k (U ).
One may ask whether this approximation statement can be proved instead
of assumed.
Exercise 5.15. Show that $f \in L^2(\mathbb{T}^d)$ belongs to $H^k(\mathbb{T}^d)$ if and only if there exists, for every $\alpha \in \mathbb{N}_0^d$ with $\|\alpha\|_1 \le k$, a weak $\alpha$-partial derivative $\boldsymbol{\partial}_\alpha f \in L^2(\mathbb{T}^d)$. Here the weak partial derivative is defined in terms of smooth test functions $\phi \in C^\infty(\mathbb{T}^d)$.
The analogue of Exercise 5.15 also holds(15) for certain open subsets U
of Rd , but we will not use this possible alternative definition of the Sobolev
spaces here (and will return to this question in Section 8.2.2). We note that
due to the boundary of U this equivalence is a bit harder to prove. For this
proof, but more importantly also for the material that follows in this chapter,
we need some more background concerning smooth functions on Rd , which
we outline in the following series of exercises.
is smooth on R.
Exercise 5.19. Suppose U ⊆ Rd is open, bounded, and star-shaped with centre 0 in the
sense that U ⊆ λU for all λ > 1 (see, for example, Figure 5.2). Let f, f1 , . . . , fd be in L2 (U )
and suppose fj is the weak ej -partial derivative of f for j = 1, . . . , d. Show that f ∈ H 1 (U ).
5.2.1 Examples
We illustrate the theory above with some simple examples, which will be
justified below.
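where
\[ \sigma(y, x, s) = \begin{cases} 1 & \text{if } x < s < y, \\ -1 & \text{if } y < s < x, \text{ and} \\ 0 & \text{if $s$ is not between $x$ and $y$.} \end{cases} \]
Applying Fubini’s theorem we get
\[ f(y) = \int_0^1 f(x) \,dx + \int_0^1 f'(s) k(y, s) \,ds, \tag{5.6} \]
where
\[ k(y, s) = \int_0^1 \sigma(y, x, s) \,dx = \begin{cases} s & \text{if } s < y, \\ s - 1 & \text{if } s > y. \end{cases} \]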
Hence (5.6) expresses the value of $f$ at $y \in U$ as the sum $\langle f, 1_U \rangle + \langle f', k(y, \cdot) \rangle$, which is clearly continuous on $H^1((0, 1))$, and since $\|k(y, \cdot)\|_{L^2} \le 1$ we also have $|f(y)| \le 2\|f\|_{H^1}$. Moreover, we may use (5.6) for $y = 0$ and $y = 1$ as a definition of $f(0)$ and $f(1)$, and then
\[ |f(y_1) - f(y_2)| = \left| \int_0^1 f'(s) \bigl(k(y_1, s) - k(y_2, s)\bigr) \,ds \right| \le \|f'\|_2 \|k(y_1, \cdot) - k(y_2, \cdot)\|_2 = \|f'\|_2 \sqrt{|y_1 - y_2|} \tag{5.7} \]
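Both the representation (5.6) and the identity $\|k(y_1, \cdot) - k(y_2, \cdot)\|_2^2 = |y_1 - y_2|$ used in (5.7) can be checked numerically. In the sketch below the smooth function `f`, its derivative `df`, and the midpoint quadrature are hypothetical choices for illustration; the integrals involving $k(y, \cdot)$ are split at $s = y$, where the kernel jumps.

```python
import math

def k(y, s):
    # Kernel from (5.6): s for s < y, and s - 1 for s > y.
    return s if s < y else s - 1.0

def midpoint(func, a, b, steps=4000):
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

def f(x):
    return math.sin(3 * x) + x * x

def df(x):
    return 3 * math.cos(3 * x) + 2 * x

def reconstruct(y):
    """Right-hand side of (5.6): the mean of f plus the integral of f'(s) * k(y, s)."""
    mean = midpoint(f, 0.0, 1.0)
    correction = (midpoint(lambda s: df(s) * k(y, s), 0.0, y)
                  + midpoint(lambda s: df(s) * k(y, s), y, 1.0))
    return mean + correction

def kernel_distance_squared(y1, y2, steps=10000):
    """Squared L^2 distance between k(y1, .) and k(y2, .) on (0, 1)."""
    return midpoint(lambda s: (k(y1, s) - k(y2, s)) ** 2, 0.0, 1.0, steps)
```

For instance `reconstruct(0.25)` agrees with `f(0.25)` up to quadrature error, and `kernel_distance_squared(0.2, 0.7)` is close to `0.5`, since the difference of the two kernels is $\pm 1$ between $y_1$ and $y_2$ and zero elsewhere.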
Exercise 5.25. Extend Example 5.21 by showing that f (x) = loglog kxk defines an
element of H 1 (B1/2 ).
Exercise 5.26. Let $U = B_1^{\mathbb{R}^d}$, and let $f_\alpha(x) = \|x\|^\alpha$ for $x \in U$. For which values of $\alpha$ do we have $f_\alpha \in H^k(U)$?
Example 5.28. Let U = (0, 1)d , and S = (0, 1)d−1 and write
Sy = S × {y} ⊆ U
for y ∈ [0, 1]. For every y ∈ [0, 1] there is a natural restriction operator
to L2 (Sy ), called the trace on Sy ,
\[ H^1(U) \ni f \longmapsto f|_{S_y} \in L^2(S_y), \]
Moreover, if we identify the space L2 (Sy ) with L2 (S) for all y ∈ [0, 1] (by
simply identifying Sy with S via the projection Sy ∋ (x, y) 7→ x ∈ S), then
we also have
\[ \bigl\| f|_{S_{y_1}} - f|_{S_{y_2}} \bigr\|_{L^2(S)} \le \|f\|_{H^1(U)} \sqrt{|y_1 - y_2|}. \]
Exercise 5.29. Prove the statements of Example 5.28 by the following steps:
(a) Fix some f ∈ C ∞ (U ) ∩ H 1 (U ) and apply Fubini’s theorem to see that the restriction
of f to {x} × (0, 1) belongs to H 1 ((0, 1)) for almost every x ∈ (0, 1)d−1 . Now apply
Example 5.20 (or more precisely (5.6)) to show that
\[ f(x, y) = \int_0^1 f(x, s) \,ds + \int_0^y s \,\partial_2 f(x, s) \,ds + \int_y^1 (s - 1) \,\partial_2 f(x, s) \,ds. \]
Notice that this also gives a definition for the trace in the cases y = 0 and y = 1. Use this
to estimate the L2 norm of the restriction of f to Sy .
(b) For the last statement show that for any f ∈ C ∞ (U ) ∩ H 1 (U ),
\[ |f(x, y_1) - f(x, y_2)| \le \sqrt{|y_1 - y_2|} \cdot \|\partial_2 f\|_{L^2(\{x\} \times (0, 1))}. \]
and the image of φ [0, 1]d−1 × {0} for a smooth map φ : [0, 1]d → U .
We now consider a general open set U ⊆ Rd and define the trace for ele-
ments of H01 (U ). For the statement that such functions vanish in the square-
mean sense at ∂U we want to assume that U has a sufficiently regular bound-
ary in the following sense (this may feel familiar after recalling the implicit
function theorem).
This includes examples like $U = B_r(x)$, but excludes $U = (0, 1)^d$ if $k \ge 1$ and $d \ge 2$. Notice that an open set with a $C^k$-smooth boundary need not be connected, simply connected, or bounded. Also note that the rotation within Definition 5.31 does not affect whether a function belongs to $H^k(U)$. In fact, since a rotation $R$ preserves the $H^k$ norm, a convergent sequence $(f_n)$ in $C^\infty(U) \cap H^k(U)$ is mapped to another convergent sequence $(f_n \circ R)$ in $H^k(R^{-1} U)$.
Exercise 5.32. Let U ⊆ Rd be a bounded open set, and let Φ be a diffeomorphism (a
rotation, for example) defined on a neighbourhood of U. Let k > 0 be an integer. Show
that H k (U ) ∋ f 7→ f ◦ Φ ∈ H k (Φ−1 (U )) is an isomorphism (and in the case of a rotation,
an isometry) between H k (U ) and H k (Φ−1 (U )).
which satisfies
\[ \bigl\| f|_{\operatorname{Graph}(\phi)} \bigr\|_{L^2(\operatorname{Graph}(\phi))} \le \sqrt{\delta_\phi} \,\|\boldsymbol{\partial}_d f\|_{L^2(U)}, \]
explains the earlier claim that f ∈ H01 (U ) vanishes in the square-mean sense
at ∂U .
We also note that if we set $\phi(x) = y_0$ to be constant we obtain that the $L^2$ norm of the restriction of $f$ to $U \cap (\mathbb{R}^{d-1} \times \{y_0\})$ is bounded by a multiple of $\|\boldsymbol{\partial}_y f\|_2$. Integrating the square of this inequality over $y_0$ we obtain
\[ \|f\|_2 \ll_U \|\boldsymbol{\partial}_y f\|_2 \tag{5.8} \]
for all $f \in H_0^1(U)$, where the implicit constant depends on the bounded open set $U$.
Proof of Proposition 5.33. Recall that we write x for the first (d − 1)
coordinates and y for the last coordinate. For f ∈ Cc∞ (U ) we have
\[ f(x, \phi(x)) = -\int_{\phi(x)}^{\phi(x) + \delta_\phi} \partial_d f(x, s) \,ds \tag{5.9} \]
Using automatic extension to the closure (Proposition 2.59), this implies the
proposition.
We now extend the Sobolev embedding theorem (Theorem 5.6) to open sub-
sets U ⊆ Rd (but, for now, leave open the question regarding the behaviour
of f at ∂U ).
The proof of Theorem 5.34 will be (apart from Exercise 5.18) the first example of a technique that we will use frequently: If a given statement is already known to hold on $\mathbb{T}^d$ (where we can use Fourier series to prove it), then one can sometimes obtain the same statement for open subsets of $\mathbb{R}^d$ by moving the functions or the problem to $\mathbb{T}^d$. For this the following lemma and the notation $\mathbb{T}_R^d = \mathbb{R}^d / (2R\mathbb{Z}^d)$ for $R > 0$ will be useful. We also define $H^k(\mathbb{T}_R^d)$ in the same way as we defined $H^k(\mathbb{T}^d)$ except that we will use the fundamental domain $[-R, R)^d$ and the restriction of the Lebesgue measure to it to define the $L^2$ norm and the derived Sobolev norms. Of course the theorems of the previous section also hold in that context (possibly with different multiplicative constants).
Lemma 5.36 (Transferring regularity). Let $U \subseteq \mathbb{R}^d$ be open, $k \ge 1$, and $\chi \in C_c^\infty(V)$ for some open $V \subseteq U$. Then $M_\chi : H^k(U) \to H_0^k(V)$ defined by $M_\chi(f) = \chi f$ is a bounded operator. Let $R > 0$ and assume now that $U \subseteq B_R$. For a function $f$ on $U$ we define $P(f)$ on $\mathbb{T}_R^d$ by first extending $f$ to $[-R, R)^d$ by setting it to be zero outside of $U$ and then identifying $[-R, R)^d$ with $\mathbb{T}_R^d$. Then $P : H_0^k(U) \to H^k(\mathbb{T}_R^d)$ is a linear isometry. Finally, $f \in H^k(U)$, $\chi \in C_c^\infty(U)$ and $P(\chi f) \in H^\ell(\mathbb{T}_R^d)$ for some $\ell \ge k$ implies that $\chi f \in H_0^\ell(U)$.
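\[ \|\partial_\alpha(\chi f)\|_{L^2(V)} \ll \sup_{\|\beta\|_1 \le \|\alpha\|_1} \|\partial_\beta \chi\|_\infty \sup_{\|\gamma\|_1 \le \|\alpha\|_1} \|\partial_\gamma f\|_{L^2(U)} \]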
for all $\alpha$ with $\|\alpha\|_1 \le k$, which leads to $\|M_\chi(f)\|_{H^k(V)} \ll_\chi \|f\|_{H^k(U)}$. From this it follows that the operator $M_\chi$ is a bounded operator from $H^k(U)$
5.2 Sobolev Spaces on Open Sets 151
into $H_0^k(V)$ (and that the weak partial derivatives $\boldsymbol{\partial}_\alpha(\chi f)$ for $f \in H^k(U)$ are obtained by the same Leibniz rule as the partial derivatives $\partial_\alpha(\chi f)$ for $f$ in $C^\infty(U)$).
For the second statement of the lemma notice first that
P (Cc∞ (U )) ⊆ C ∞ (TdR )
(which would not be true for C ∞ (U ) ∩ H k (U )). Since the norm on H k (TdR )
is defined by integration over the Lebesgue measure on (−R, R)d , and since
U ⊆ (−R, R)d
is well-defined. Arguing just as in the first part of the proof, we get that
multiplication by ψ defines a bounded operator from H ℓ (TdR ) to H0ℓ (U ). Ap-
plying this map to g = P (χf ) ∈ H ℓ (TdR ) we get ψP (χf ) = χf ∈ H0ℓ (U ).
The existence of functions in Cc∞ (U ) that are equal to one on large subsets
of U (as used in the above proof) will frequently be useful.
Essential Exercise 5.37 (Smooth approximate characteristic func-
tions). Let K ⊆ U be a compact subset of an open subset U ⊆ Rd . Find a
smooth function ψ ∈ Cc∞ (U ) with ψ|K ≡ 1.
Proof of Theorem 5.34. Let x0 ∈ U and ε > 0 be such that
\[ V = B_{2\varepsilon}^{\mathbb{R}^d}(x_0) \subseteq U. \]
\[ \|f\|_{K, \infty} \ll_{K, U} \|f\|_{H^k(U)} \]
for $f \in H^k(U)$ and $k > \frac{d}{2}$.
In this section we will combine the discussion of Sobolev spaces from Sec-
tion 5.2, the Fréchet–Riesz representation theorem (Corollary 3.19), and a
simple orthogonality relation to solve the Dirichlet boundary value problem
\[ \begin{cases} \Delta u = 0 \\ u|_{\partial U} = b \end{cases} \tag{5.11} \]
\[ \Delta g = \frac{\partial^2 g}{\partial x_1^2} + \cdots + \frac{\partial^2 g}{\partial x_d^2} = \partial_1^2 g + \cdots + \partial_d^2 g \]
is the Laplacian of $g$.
We note that (with the exception of Lemma 5.48 and its proof) we restrict our attention in this section to real-valued functions. In the following we will also write $\langle \cdot, \cdot \rangle_{L^2(U)}$ to denote the inner product on $L^2(U)$, and similarly for other Hilbert spaces, to emphasize the difference between the various inner products used, especially for the semi-inner product $\langle \cdot, \cdot \rangle_1$ as introduced in
5.3 Dirichlet’s Boundary Value Problem and Elliptic Regularity 153
the next lemma. Recall from Definition 3.1 that a semi-inner product satisfies
positivity instead of strict positivity.
for u, v ∈ H 1 (U ).
since the boundary terms vanish. Integrating over the remaining variables and summing over all $j = 1, \dots, d$, we get $\langle g, \phi \rangle_1 = \langle -\Delta g, \phi \rangle_{L^2(U)} = 0$ by the assumption on $g$.
Motivated by Lemma 5.40, the approach is to decompose a function $f$ in $C^1(U)$ as $f = g + v$, where $v \in H_0^1(U)$ and $g$ is ‘orthogonal to’ $H_0^1(U)$ with respect to the semi-inner product $\langle \cdot, \cdot \rangle_1$. As harmonic functions have this orthogonality property by Lemma 5.40, there is some hope that $g$ will be harmonic and indeed it will turn out to be. Moreover, $v$ will vanish at $\partial U$ in the square-mean sense and so $f|_{\partial U} = g|_{\partial U}$ at least in the square-mean sense.
As we wish to use the semi-inner product from (5.12) in the definition of
the orthogonal complement, we will have to discuss properties of this semi-
inner product. We will then show that g is smooth and harmonic, and it is this
step that relies on a general phenomenon called elliptic regularity, the Laplace
operator being an example of an elliptic differential operator. We will show in
Section 5.3.3, for d = 2, that g extends continuously to the boundary ∂U and
agrees with f |∂U there. Finally, we will discuss in Section 8.2.2 the behaviour
at a smooth boundary in any dimension.
Lemma 5.41 (Semi-inner product). The semi-inner product $\langle \cdot, \cdot \rangle_1$ restricted to $H_0^1(U)$ is an inner product, and the norm defined by this inner product is equivalent to $\|\cdot\|_{H^1(U)}$. The semi-norm $\|\cdot\|_1$ induced by $\langle \cdot, \cdot \rangle_1$ on $C^\infty(U) \cap H^1(U)$ has as its kernel the subspace of all locally constant functions.
Here the kernel is the subspace of all functions $f$ with $\langle f, f \rangle_1 = \|f\|_1^2 = 0$.
A function f on U is called locally constant if for every x ∈ U there is a
neighbourhood V of x such that f |V is constant. If U is connected, then any
locally constant function is constant.
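Proof of Lemma 5.41. Let $f \in H_0^1(U)$. We have $\|f\|_{L^2(U)} \ll \|\boldsymbol{\partial}_{x_d} f\|_{L^2(U)}$ by (5.8). Thus
\[ \sqrt{\langle f, f \rangle_1} \le \|f\|_{H^1(U)} = \sqrt{\|f\|_{L^2(U)}^2 + \sum_{j=1}^d \|\boldsymbol{\partial}_j f\|_{L^2(U)}^2} \ll \sqrt{\langle f, f \rangle_1} \]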
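\begin{align*}
\langle \Delta\phi, f - v \rangle_{L^2(U)} &= \sum_{j=1}^d \langle \partial_j^2 \phi, f - v \rangle_{L^2(U)} && \text{(by definition of } \Delta\text{)} \\
&= -\sum_{j=1}^d \langle \partial_j \phi, \partial_j f - \boldsymbol{\partial}_j v \rangle_{L^2(U)} && \text{(by Lemma 5.10)}
\end{align*}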
In this section we will upgrade the conclusion from the previous section to
show that the weakly harmonic function g is actually smooth and harmonic.
The principle at work here is much more general, and is called elliptic reg-
ularity. We will again rely on Fourier series in the argument, and this will
only give the result in the interior of U and not at the boundary ∂U . For this
reason, it is natural to start with functions that have little structure on ∂U ,
as in the following definition.
Roughly speaking, the theorem says that if ∆g exists, then the Sobolev-
regularity of g must be two more than that of ∆g. In other words, any
non-smoothness of g will be visible also in ∆g, or there is no cancellation
of singularities when ∆g is calculated from g. This remarkable result has
many striking consequences, a few of which we list here.
Corollary 5.46. If $g \in H^1_{\mathrm{loc}}(U)$ has $\Delta g = u \in C^\infty(U)$ (or $g$ is weakly harmonic in the sense that $\Delta g = 0$), then $g \in C^\infty(U)$ satisfies $\Delta g = u$ (respectively is harmonic).
Proof. Since $u \in H^k_{\mathrm{loc}}(U)$ for all $k \in \mathbb{N}_0$, Theorem 5.45 implies that
\[ g \in H^{k+2}_{\mathrm{loc}}(U) \]
for all $k \ge 0$. Hence $\chi g \in H^{k+2}(U)$ for all $k \ge 0$ and all functions $\chi \in C_c^\infty(U)$. By the Sobolev embedding theorem for open subsets (Theorem 5.34), this implies that $\chi g \in C^\infty(U)$ for all $\chi \in C_c^\infty(U)$. Choosing $\chi \in C_c^\infty(U)$ equal to 1 on a neighbourhood of a given $x \in U$ shows that $g \in C^\infty(U)$, since it is $C^\infty$ in a neighbourhood of each point. Finally, integration by parts gives
Proof. By assumption, $\Delta g = \lambda g \in H^1_{\mathrm{loc}}(U)$ and so by Theorem 5.45 we also have $g \in H^3_{\mathrm{loc}}(U)$. However, this shows that $\Delta g = \lambda g \in H^3_{\mathrm{loc}}(U)$ and Theorem 5.45 may be applied again to see that $g \in H^5_{\mathrm{loc}}(U)$, and so on. It follows that $g \in H^k_{\mathrm{loc}}(U)$ for all $k \ge 0$, and arguing as in the proof of Corollary 5.46 we see that $g \in C^\infty(U)$.
We will prove Theorem 5.45 in two steps: firstly we deal with the case
of functions on Td (which turns out to be easy because of Fourier series),
and secondly we show how to transfer the theorem from Td to open subsets
of U . Morally the second step (the transfer) should be the easy step as we are
discussing the ‘Laplace operator’ on both of these spaces. However, some care
is necessary as ∆ has different meanings on Td and on U since the spaces
of allowed test functions in the definition of ∆ are C ∞ (Td ) and Cc∞ (U ),
respectively.
is the Fourier series of u. This follows from Fourier series on the torus (The-
orem 3.54) since the characters χn are eigenfunctions of the Laplace operator:
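and so
\[ \langle u, \chi_n \rangle_{L^2(\mathbb{T}^d)} = \langle g, \Delta\chi_n \rangle_{L^2(\mathbb{T}^d)} = -(2\pi)^2 \|n\|_2^2 \underbrace{\langle g, \chi_n \rangle_{L^2(\mathbb{T}^d)}}_{= c_n}. \]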
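The relation between the Fourier coefficients of $g$ and of $u = \Delta g$ amounts to multiplying by $-(2\pi)^2 \|n\|_2^2$, and this can be sketched in a few lines. The helper names below are hypothetical; the finite-difference check is done for $d = 1$ on the real part of a character, and the last function inverts the multiplier for $n \ne 0$ (the coefficient at $n = 0$ is not determined by $u$).

```python
import math

def laplace_multiplier(n):
    """Eigenvalue of the Laplacian on the character chi_n, where n is a tuple of integers."""
    return -(2 * math.pi) ** 2 * sum(nj * nj for nj in n)

def real_character(x, n=3):
    # Real part of chi_n on the one-dimensional torus.
    return math.cos(2 * math.pi * n * x)

def second_difference(func, x, h=1e-4):
    """Central finite-difference approximation to the second derivative of func at x."""
    return (func(x + h) - 2 * func(x) + func(x - h)) / (h * h)

def poisson_coefficient(u_n, n):
    """Fourier coefficient c_n of g when Delta g = u and u has coefficient u_n (n != 0)."""
    if not any(n):
        raise ValueError("the coefficient at n = 0 is not determined by u")
    return u_n / laplace_multiplier(n)
```

For example `second_difference(real_character, 0.1)` is close to `laplace_multiplier((3,)) * real_character(0.1)`, and dividing by the multiplier shows directly why $\sum |c_n|^2 \|n\|_2^{2(k+2)}$ is finite whenever the corresponding sum for $u$ with exponent $2k$ is finite.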
rely only on the definitions and are arguably a bit tedious, we postpone their
proofs until after the proof of Theorem 5.45.
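\[ \langle P(\psi u_1), \phi \rangle_{L^2(\mathbb{T}_R^d)} = \langle \psi u_1, \phi \rangle_{L^2(U)} = \langle u_1, \psi\phi \rangle_{L^2(U)} = \langle \chi g, \Delta(\psi\phi) \rangle_{L^2(U)}. \]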
Since ψ is one and its derivatives are zero at any point of Supp χ, we can
remove ψ on the right-hand side. One may wonder why ψ is introduced in
the first place, since it is only brought in so that it can be removed again.
The answer lies in the definition of ∆ which depends crucially on a choice of
test functions. In particular, ∆ is defined differently on Td and on U — we
use ψ to bridge between these two definitions.
We now obtain
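Proof of Lemma 5.50. By assumption $u \in H^k_{\mathrm{loc}}(U)$ and so $\chi u \in H^k(U)$ by definition of $H^k_{\mathrm{loc}}(U)$ (Definition 5.43). Similarly, $g \in H^\ell_{\mathrm{loc}}(U)$ and so $(\Delta\chi) g$ lies in $H^\ell(U)$. Finally, by assumption, $g \in H^\ell_{\mathrm{loc}}(U)$ with $\ell \ge 1$ and so $\boldsymbol{\partial}_j g$ lies in $H^{\ell-1}_{\mathrm{loc}}(U)$ by Lemma 5.49, which gives $(\partial_j \chi)(\boldsymbol{\partial}_j g) \in H^{\ell-1}(U)$. Therefore,
\[ \chi u + (\Delta\chi) g + 2 \sum_{j=1}^d (\partial_j \chi)(\boldsymbol{\partial}_j g) \in H^{\min\{k, \ell-1\}}(U) \]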
and it remains to show that this function is equal to ∆(gχ) weakly. For this,
recall that ∆g = u, let φ ∈ Cc∞ (U ), and calculate
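\begin{align*}
\Bigl\langle \chi u + (\Delta\chi) g + 2 \sum_{j=1}^d (\partial_j \chi)(\boldsymbol{\partial}_j g), \phi \Bigr\rangle
&= \langle u, \chi\phi \rangle + \langle g, (\Delta\chi)\phi \rangle + 2 \sum_{j=1}^d \langle \boldsymbol{\partial}_j g, (\partial_j \chi)\phi \rangle \\
&= \langle g, \Delta(\chi\phi) \rangle + \langle g, (\Delta\chi)\phi \rangle - 2 \sum_{j=1}^d \langle g, \partial_j((\partial_j \chi)\phi) \rangle \\
&= \Bigl\langle g, (\Delta\chi)\phi + 2 \sum_{j=1}^d (\partial_j \chi)(\partial_j \phi) + \chi\Delta\phi + (\Delta\chi)\phi - 2 \sum_{j=1}^d (\partial_j^2 \chi)\phi - 2 \sum_{j=1}^d (\partial_j \chi)(\partial_j \phi) \Bigr\rangle
\end{align*}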
weakly.
For $k \ge 0$ and a bounded open set $U \subseteq \mathbb{R}^d$ the function space $C^k(\overline{U})$ consists of all continuous functions $f$ on $\overline{U}$ with $f|_U \in C^k(U)$ such that the partial derivatives extend continuously to the closure $\overline{U}$. If $U$ has $C^k$-smooth boundary, then
the function space C k (∂U ) is defined using the assumption that ∂U has the
structure of a manifold; the local charts allow smoothness properties to be
where $n = \frac{x}{\|x\|_2}$ is the normalized outward normal vector to the sphere of radius $\|x\|_2$ at $x$ and we also write $\sigma$ for the area measure on $\partial B_\varepsilon$. For the right-hand side, we calculate for $f$ as above
Therefore
\[ \int_{\partial B_r} f \cdot n \,d\sigma = \int_{\partial B_\varepsilon} f \cdot n \,d\sigma. \tag{5.16} \]
Using this and the analogous formula for $\|x\| = \varepsilon$ allows us to write (5.16) as
\[ \frac{1}{r^{d-1}} \int_{r\mathbb{S}^{d-1}} \phi \,d\sigma = \frac{1}{\varepsilon^{d-1}} \int_{\varepsilon\mathbb{S}^{d-1}} \phi(x) \,d\sigma. \]
as ε → 0, by continuity of φ.
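The fact that normalized spherical averages do not depend on the radius can be observed numerically in the case $d = 2$. The sketch below assumes, as the surrounding computation suggests, that the function being averaged is harmonic; the particular function `harmonic`, the centre, and the equally spaced sampling of the circle are hypothetical choices for illustration.

```python
import math

def circle_average(func, center, radius, samples=720):
    """Average of func over the circle of the given radius around center (d = 2)."""
    cx, cy = center
    total = 0.0
    for m in range(samples):
        t = 2 * math.pi * m / samples
        total += func(cx + radius * math.cos(t), cy + radius * math.sin(t))
    return total / samples

def harmonic(x, y):
    # e^x cos(y) and x^2 - y^2 are both harmonic on the plane, hence so is their sum.
    return math.exp(x) * math.cos(y) + x * x - y * y

def not_harmonic(x, y):
    # The Laplacian of x^2 + y^2 equals 4, so the averages grow with the radius.
    return x * x + y * y
```

For `harmonic` the average over every circle around `(0.3, -0.2)` agrees with the value at the centre, whereas for `not_harmonic` the average over the circle of radius $r$ around the origin is $r^2$ rather than the central value $0$.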
Exercise 5.54. Use Proposition 5.53 to prove that any bounded harmonic function on Rd
is constant.
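Proof. To prove (5.17) for $z^{(0)} \in \partial U$ we use the assumption that $U$ has $C^1$-smooth boundary and rotate the coordinate system so that $B_\delta(z^{(0)}) \cap U$ can be described as in Definition 5.31 (see also Exercise 5.32). By a further rotation and by shrinking $\delta$ if necessary we may also assume that $|\phi'(x)| < 1$ for all $x \in B_\delta(x_0)$ with the notation $z^{(0)} = (x_0, y_0)$ and $\phi \in C^1(B_\varepsilon(x_0))$ as in Definition 5.31. We claim that for $\varepsilon \in (0, \frac{1}{4}\delta)$, $x_1 \in B_\delta(x_0)$, and $y_1 = \phi(x_1)$ we have
\[ \int_{x_1 - \varepsilon}^{x_1 + \varepsilon} \int_{y_1 - \varepsilon}^{y_1 + \varepsilon} |v| \,dx \,dy \le 2\varepsilon^2 \sqrt{\int_{x_1 - \varepsilon}^{x_1 + \varepsilon} \int_{y_1 - \varepsilon}^{y_1 + \varepsilon} |\boldsymbol{\partial}_2 v|^2 \,dx \,dy}, \tag{5.18} \]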
Now integrate the absolute value of v(x, y) with respect to x and y to get
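\begin{align*}
\int_{x_1 - \varepsilon}^{x_1 + \varepsilon} \int_{y_1 - \varepsilon}^{y_1 + \varepsilon} |v(x, y)| \,dy \,dx &\le \int_{x_1 - \varepsilon}^{x_1 + \varepsilon} \int_{y_1 - \varepsilon}^{y_1 + \varepsilon} \int_{y}^{y_1 + \varepsilon} |\partial_2 v(x, s)| \,ds \,dy \,dx \\
&= \int_{x_1 - \varepsilon}^{x_1 + \varepsilon} \int_{y_1 - \varepsilon}^{y_1 + \varepsilon} |\partial_2 v(x, s)| \,|s - y_1 + \varepsilon| \,ds \,dx,
\end{align*}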
Fig. 5.4: The point $z^{(1)} = (x_1, y_1) \in \partial U$ and the $\varepsilon$-box $Q_\varepsilon$, containing the $\varepsilon$-ball.
Exercise 5.56. Describe what prevents the proof of Lemma 5.55 from extending to higher
dimensions by trying to emulate the calculations involved.
• In Section 8.2.2 we will return to the topic of elliptic regularity one more
time and will present an argument that also gives the result at the bound-
ary of U .
Chapter 6
Compact Self-Adjoint Operators and
Laplace Eigenfunctions
of the image of the unit ball is compact in W . We will sometimes write K(V, W )
for the space of compact operators, and if V = W we will write K(V ) for the
space of compact operators from V to V .
We will see in Example 6.5 that $L(B_1^V)$ is in general not closed, even if $L$ is a compact operator. Since compact sets are bounded, every compact operator is also bounded, but the converse does not hold. For example, the identity operator $V \to V$ on an infinite-dimensional normed vector space is not a compact operator by Proposition 2.35. As noted above, if $L : V \to W$ is a bounded operator and $L(V)$ is finite-dimensional, then $L$ is a compact operator. We will see many more examples after we prove a few basic properties of compact operators.
Lemma 6.3 (Composition). Let $V_1, V_2, V_3$ be normed vector spaces, and let $L_1 : V_1 \to V_2$ and $L_2 : V_2 \to V_3$ be bounded operators. If $L_1$ or $L_2$ is a compact operator, then so is $L_2 \circ L_1$.
Proof. Suppose that $L_1$ is compact. Then $L_2(L_1(B_1^{V_1})) \subseteq L_2\bigl(\overline{L_1(B_1^{V_1})}\bigr)$ and the latter is compact since $L_1(B_1^{V_1})$ has compact closure and $L_2$ is continuous. It follows that $L_2(L_1(B_1^{V_1}))$ is contained in a compact subset of $V_3$, so its closure is compact, and therefore $L_2 \circ L_1$ is a compact operator.
If $L_2$ is compact, then $L_2 \circ L_1(B_1^{V_1}) \subseteq L_2\bigl(\|L_1\|_{\mathrm{op}} B_1^{V_2}\bigr) = \|L_1\|_{\mathrm{op}} L_2(B_1^{V_2})$, which has compact closure, and so $L_2 \circ L_1$ is again compact.
Exercise 6.4. Let V and W be two normed vector spaces. Show that
This example shows that it is necessary to take the closure of the image and not just the image of the closed ball: for example, the function $f$ defined by $f(x) = |x - \frac{1}{2}|$ belongs to $\overline{L(B_1^{C^1([0,1])})}$ but not to $L\bigl(\overline{B_1^{C^1([0,1])}}\bigr)$.
(b) For f ∈ C([0, 1]) and x ∈ [0, 1], define
Z x
T (f )(x) = f (t) dt.
0
Then T : C([0, 1]) → C([0, 1]) is compact, since T : C([0, 1]) → C 1 ([0, 1]) is
bounded and the inclusion C 1 ([0, 1]) → C([0, 1]) is compact by (a).
Exercise 6.6. For which $p > 1$ is the operator sending $f \in L^p([0, 1])$ to the function in $C([0, 1])$ defined by $x \mapsto \int_0^x f(t) \,dt$ a compact operator?
Lemma 6.7 improves the claim from Exercise 6.4 in that the two-sided
ideal K(V ) in B(V ) is even closed for any Banach space V (see also Exer-
cise 6.8).
Proof of Lemma 6.7. Let $M = \overline{L(B_1^V)} \subseteq W$. Since $W$ is assumed to be a Banach space, $M$ is complete. It remains to show that $M$ is totally bounded (see Section A.4 for the notion and for the equivalence to compactness). Let $\varepsilon > 0$ and choose $L_n$ with $\|L_n - L\| < \varepsilon$. Since $L_n$ is compact, we know that $\overline{L_n(B_1^V)}$ is compact and hence is totally bounded. It follows that there exist elements $w_1, \dots, w_m \in \overline{L_n(B_1^V)}$ with
\[ \overline{L_n(B_1^V)} \subseteq \bigcup_{i=1}^m B_\varepsilon^W(w_i). \]
For each $w_i$ there exists some $v_i \in B_1^V$ with $\|w_i - L_n(v_i)\| < \varepsilon$.
If now $v \in B_1^V$, then for some $i \in \{1, \dots, m\}$ we have
170 6 Compact Self-Adjoint Operators and Laplace Eigenfunctions
It follows that
\[ L(B_1^V) \subseteq \bigcup_{i=1}^m B_{4\varepsilon}^W(L(v_i)), \]
which implies that the points $L(v_i)$ for $i = 1, \dots, m$ are $5\varepsilon$-dense in the set $M = \overline{L(B_1^V)}$. As $\varepsilon$ was arbitrary, $M$ is therefore totally bounded, so $M$ is a compact set and hence $L$ is a compact operator.
Exercise 6.8. Continuing the discussion from Exercise 6.4, show that B(V )/ K(V ) be-
comes a Banach algebra — the Calkin algebra — by defining (A + K(V ))(B + K(V )) to
be AB + K(V ) for all A, B ∈ B(V ) and using the quotient norm k · kB(V )/ K(V ) .
We explore here briefly the realm of integral operators and show that many
(but not all) are in fact compact operators.
Lemma 6.10 (Integral operators defined by continuous kernels). Assume that $(X, d_X)$ and $(Y, d_Y)$ are compact metric spaces. Let $\mu$ be a finite Borel measure on $X$, and let $k$ be a function in $C(X \times Y)$. Then the operator $K : L^2_\mu(X) \longrightarrow C(Y)$ defined by
\[ K(f)(y) = \int_X f(x) k(x, y) \,d\mu(x) \]
is a compact operator.
Proof. We first need to show that $K$ is well-defined. To see this, notice that
\[ \int_X |f(x)| \,|k(x, y)| \,d\mu(x) \le \|f\|_2 \|k(\cdot, y)\|_2 \le \|f\|_2 \|k\|_\infty \mu(X)^{1/2}, \]
We now must show that $K(f)$ is continuous, and in doing so we will obtain equicontinuity of the image of the unit ball, which together with (6.1) and the Arzela–Ascoli theorem will give the compactness of $K$. Since $X \times Y$ is compact, $k$ is uniformly continuous, and so for any $\varepsilon > 0$ there is a $\delta > 0$ for which $d_Y(y_1, y_2) < \delta$ implies that $|k(x, y_1) - k(x, y_2)| < \varepsilon$ for all $x \in X$. Therefore
\[ |K(f)(y_1) - K(f)(y_2)| \le \int_X |f(x)| \,|k(x, y_1) - k(x, y_2)| \,d\mu(x) \le \varepsilon \mu(X)^{1/2} \|f\|_2 \]
Exercise 6.12. Assume in addition that X, Y are compact metric spaces and µ, ν are
finite measures on the Borel σ-algebras of X and Y , respectively. Deduce Proposition 6.11
in this case as a corollary of Lemma 6.10.
\[ \|K\|_{\mathrm{op}} \le \|k\|_{L^2_{\mu \times \nu}}. \]
A1 × B1 , . . . , Am × Bm
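are all disjoint and have finite $\mu \times \nu$-measure. Let us write $X = \bigcup_{n=1}^\infty X_n$ and $Y = \bigcup_{n=1}^\infty Y_n$ with $X_1 \subseteq X_2 \subseteq \cdots$, $Y_1 \subseteq Y_2 \subseteq \cdots$ and with $\mu(X_n) < \infty$ and $\nu(Y_n) < \infty$ for all $n \ge 1$. Then
\[ \mathcal{A} = \{ D \in \mathcal{B}_X \otimes \mathcal{B}_Y \mid \text{the claim above holds for } D \cap (X_n \times Y_n) \text{ for all } n \ge 1 \} \]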
Exercise 6.13. Prove that the collection A in the proof of Proposition 6.11 is a σ-algebra.
Exercise 6.14. Let g ∈ L2 (Td ). Show that L2 (Td ) ∋ f 7→ f ∗g ∈ C(Td ) defines a compact
operator from (L2 (Td ), k · k2 ) to (C(Td ), k · k∞ ).
6.1 Compact Operators 173
Not all integral operators are compact, as shown by the Holmgren oper-
ators.
and
\[ \sup_{y \in Y} \int_X |k(x, y)| \,d\mu(x) < \infty. \]
Proof. The proof that the integral in (6.4) makes sense for $\nu$-almost every $y$ in $Y$, and defines an element in $L^2_\nu$, is less straightforward than the proof of Proposition 6.11, and uses the Fréchet–Riesz representation theorem (Corollary 3.19). Suppose that $f \in L^2_\mu \setminus \{0\}$ and $g \in L^2_\nu \setminus \{0\}$, and consider the integral
\[ I = \int_{X \times Y} |f(x) k(x, y) g(y)| \,d\mu{\times}\nu(x, y). \]
Notice that for any real numbers $a, b > 0$ and $c > 0$, we always have
\[ ab \le ab + \Bigl( \sqrt{\tfrac{c}{2}} \,a - \sqrt{\tfrac{1}{2c}} \,b \Bigr)^2 = \frac{c a^2}{2} + \frac{b^2}{2c}. \]
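The elementary inequality above, together with the completed square behind it, can be sanity-checked on random inputs. This is only a sketch: the function names, the sampling ranges and the seed are arbitrary choices made for illustration.

```python
import random

def young_gap(a, b, c):
    """Right-hand side minus left-hand side of ab <= c*a^2/2 + b^2/(2c)."""
    return c * a * a / 2 + b * b / (2 * c) - a * b

def completed_square(a, b, c):
    """The square (sqrt(c/2)*a - sqrt(1/(2c))*b)^2, which should equal young_gap(a, b, c)."""
    return ((c / 2) ** 0.5 * a - (1 / (2 * c)) ** 0.5 * b) ** 2

def min_gap(trials=1000, seed=0):
    """Smallest gap found over randomly sampled a, b in [0, 10] and c in [0.1, 10]."""
    rng = random.Random(seed)
    return min(young_gap(rng.uniform(0, 10), rng.uniform(0, 10), rng.uniform(0.1, 10))
               for _ in range(trials))
```

The gap vanishes exactly when $ca = b$, which is why the freedom in choosing $c$ is useful when the inequality is applied.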
and obtain
\[ \int_{X \times Y} |f(x) k(x, y) g(y)| \,d\mu{\times}\nu(x, y) \le \sqrt{s_X s_Y} \,\|f\|_{L^2_\mu} \|g\|_{L^2_\nu}. \]
Show that the corresponding Holmgren operator K as defined in Proposition 6.15 is not a
compact operator on L2λ (R).
for $v_1 \in H_1$, $v_2, v_2' \in H_2$ and any scalar $\alpha$. By (6.7) we have $\|A^*\|_{\mathrm{op}} \le \|A\|_{\mathrm{op}}$, so $A^*$ is bounded. Taking conjugates in (6.6) implies that $A^{**} = A$, so $\|A\|_{\mathrm{op}} = \|A^*\|_{\mathrm{op}}$.
Exercise 6.18. Show that $\operatorname{im}(T)^\perp = \ker(T^*)$ and $\ker(T)^\perp = \overline{\operatorname{im}(T^*)}$ for a bounded linear operator $T$ between Hilbert spaces.
The next exercise is not simply another example. It turns out to really be
the basis of the powerful spectral theory of normal bounded operators as well
as self-adjoint unbounded operators.
Essential Exercise 6.25. Let (X, B, µ) be a measure space, H = L2µ (X),
let g : X → C be a measurable function, and let Mg be the multiplication
operator Mg : f 7→ gf for f ∈ H.
(a) What properties of g ensure that Mg : H → H is well-defined and
bounded? What is kMg kop ?
(b) When is $M_g$ a bounded self-adjoint operator? That is, what property of $g$ is equivalent to $\langle M_g f_1, f_2 \rangle = \langle f_1, M_g f_2 \rangle$ holding for all $f_1, f_2 \in H$? What property of $g$ is equivalent to $M_g$ being unitary?
(c) When does $M_g$ have $\lambda \in \mathbb{C}$ as an eigenvalue?
(d) Suppose that $X = \mathbb{R}$ and let $g(x) = x$, and assume that $\mu$ is an arbitrary finite compactly supported Borel measure on $\mathbb{R}$. Characterize in terms of $\mu$ the property that $M_g$ can be diagonalized. That is, characterize the property that $H$ has an orthonormal basis $\{e_n \mid n \in \mathbb{N}\}$ and a sequence of scalars $(\lambda_n)$ such that $M_g\bigl(\sum_{n=1}^\infty x_n e_n\bigr) = \sum_{n=1}^\infty \lambda_n x_n e_n$ for every $(x_n) \in \ell^2(\mathbb{N})$.
Exercise 6.26. Let $H = \mathbb{C}^n$ be a finite-dimensional Hilbert space with respect to the usual inner product. Show that the linear operator defined by a matrix $A = (a_{i,j})$ is self-adjoint if and only if $A$ is equal to its own conjugate transpose (that is, $a_{i,j} = \overline{a_{j,i}}$ for all $i, j$). Such matrices are also called Hermitian.
for all f1 , f2 ∈ L2µ (X) by Fubini’s theorem. Hence Theorem 6.27 applies, but
in this case it is a priori not at all clear how one could find the eigenvalues
or eigenvectors for the operator.
Example 6.28. Notice that the integral operator from Section 2.5.2 defined by the kernel
\[ G(s, t) = \begin{cases} s(t - 1) & \text{for } 0 \le s \le t \le 1; \\ t(s - 1) & \text{for } 0 \le t \le s \le 1 \end{cases} \]
satisfies the conditions above, and so the eigenfunctions found in Section 2.5.2
coincide with the eigenvectors which must exist by Theorem 6.27.
In fact as we saw in Section 3.4 (see Exercise 3.55(b) and its hint on p. 566) the functions $s_1, s_2, \dots$ form an orthonormal basis of $L^2([0, 1])$ which makes $K$ a diagonalizable operator. These notions also explain the argument from Section 2.5.2 quite clearly: If $g = \sum_{n=1}^\infty d_n s_n$ and we are looking for $f = \sum_{n=1}^\infty c_n s_n$ with $(I + \lambda^2 K) f = g$, then $(1 + \lambda^2 \mu_n) c_n = d_n$ for all $n \in \mathbb{N}$, which can be solved for $c_n$ unless $\lambda^2 = -\mu_n^{-1}$ and $d_n \ne 0$.
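The coefficientwise solution of $(I + \lambda^2 K) f = g$ can be sketched as follows. The eigenfunctions and eigenvalues are those of Section 2.5.2 and are not restated here, so the code works under the assumption that $K$ acts on $\sin(n\pi x)$ with eigenvalue $\mu_n = -1/(n\pi)^2$; this assumption is checked numerically for the kernel $G$ of Example 6.28 rather than quoted. All function names are hypothetical.

```python
import math

def G(s, t):
    # Kernel from Example 6.28.
    return s * (t - 1) if s <= t else t * (s - 1)

def apply_K(func, s, steps=2000):
    """Midpoint approximation of the integral of G(s, t) * func(t) over t in [0, 1],
    split at t = s where the kernel has a kink."""
    def piece(a, b):
        h = (b - a) / steps
        return h * sum(G(s, a + (i + 0.5) * h) * func(a + (i + 0.5) * h)
                       for i in range(steps))
    return piece(0.0, s) + piece(s, 1.0)

def mu(n):
    # Assumed eigenvalue of K on sin(n*pi*x); verified numerically, not taken from the text.
    return -1.0 / (n * math.pi) ** 2

def solve_coefficients(d, lam):
    """Given d_1, d_2, ... return c_n with (1 + lam^2 * mu_n) * c_n = d_n."""
    result = []
    for n, d_n in enumerate(d, start=1):
        denom = 1 + lam ** 2 * mu(n)
        if abs(denom) < 1e-12:
            if d_n != 0:
                raise ValueError("no solution: lam^2 = -1/mu_n and d_n != 0")
            result.append(0.0)  # c_n is arbitrary in this case; 0 is one valid choice
        else:
            result.append(d_n / denom)
    return result
```

With `lam = math.pi` the first denominator vanishes, so `solve_coefficients([1.0], math.pi)` raises an error while `solve_coefficients([0.0, 1.0], math.pi)` succeeds, matching the solvability condition stated above.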
Exercise 6.29. Let $K$ be the Hilbert–Schmidt integral operator on $L^2_\mu(X)$ defined by a kernel $k \in L^2_{\mu \times \mu}(X \times X)$ with k(x, y) = k(y, x) as above. Prove that the generalized Fredholm integral equation of the second kind $f = \lambda K(f) + \phi$ has a solution for any function $\phi \in L^2_\mu(X)$ if and only if $\lambda\lambda_n \ne 1$ for all $n$, where $(\lambda_n)$ is the sequence of eigenvalues of $K$ on $L^2_\mu(X)$.
since the two inner products appearing are of the form $\langle Au, u\rangle$ and thus
satisfy $|\langle Au, u\rangle| \le s(A)\|u\|^2$. Now we apply the parallelogram identity (3.4)
to obtain
$$4\|Ax\|^2 \le 2 s(A)\Bigl(\lambda^2\|x\|^2 + \frac{1}{\lambda^2}\|Ax\|^2\Bigr).$$
Assuming that $\|Ax\| \neq 0$, we set $\lambda^2 = \frac{\|Ax\|}{\|x\|}$ and get
$$4\|Ax\|^2 \le 2 s(A)\Bigl(\frac{\|Ax\|}{\|x\|}\|x\|^2 + \frac{\|x\|}{\|Ax\|}\|Ax\|^2\Bigr) = 4 s(A)\|Ax\|\|x\|,$$
and so $\|Ax\| \le s(A)\|x\|$ for all $x \in H$. This shows that $\|A\| \le s(A)$.
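The inequality $\|A\| \le s(A)$ can be observed numerically in the simplest case. The sketch below assumes that $s(A)$ denotes the supremum of $|\langle Au, u\rangle|$ over unit vectors u, and it uses a hypothetical real symmetric 2×2 matrix sampled over unit vectors on a grid; none of the names comes from the text.

```python
import math


def quadratic_sup_and_norm(a, steps=3600):
    """For a real symmetric 2x2 matrix a, approximate
    s(A) = sup |<Au, u>| and ||A|| = sup ||Au|| over unit vectors u
    by sampling u = (cos t, sin t) on a uniform grid."""
    s_val, op_norm = 0.0, 0.0
    for j in range(steps):
        t = 2.0 * math.pi * j / steps
        u = (math.cos(t), math.sin(t))
        au = (a[0][0] * u[0] + a[0][1] * u[1],
              a[1][0] * u[0] + a[1][1] * u[1])
        s_val = max(s_val, abs(au[0] * u[0] + au[1] * u[1]))
        op_norm = max(op_norm, math.hypot(au[0], au[1]))
    return s_val, op_norm


A = [[1.0, 2.0], [2.0, -3.0]]   # hypothetical sample matrix
s_A, norm_A = quadratic_sup_and_norm(A)
print(s_A, norm_A)
```

Both sampled quantities are close to $1 + 2\sqrt{2}$, the largest absolute value of an eigenvalue of this sample matrix, so here the inequality holds with equality up to the grid error.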
We are now ready to prove the existence of an eigenvector.
†
In the following we let A be a compact self-adjoint operator on a separable
infinite-dimensional Hilbert space H (or a Hermitian matrix in Matn,n (C)).
Applying Theorem 6.27 we find a (finite or countable) sequence of positive
eigenvalues $\varphi_1(A) \ge \varphi_2(A) \ge \cdots > 0$ and a (finite or countable) sequence of
negative eigenvalues $\nu_1(A) \le \nu_2(A) \le \cdots < 0$, with corresponding orthonormal
eigenvectors $v_1, v_2, \ldots$ and $w_1, w_2, \ldots$, respectively, so that
$$A = \sum_j \varphi_j(A)\, v_j \otimes v_j^* + \sum_j \nu_j(A)\, w_j \otimes w_j^*,$$
the former case we obtain the numerical range (0, ϕ1 ] or [0, ϕ1 ] (and hence
set ν1 = 0) and in the latter case we obtain [ν1 , 0) or [ν1 , 0] (and hence
set ϕ1 = 0). In particular,
where we set νk (A) = 0 if there are fewer than k negative eigenvalues. Similarly
where we set ϕk (A) = 0 if there are fewer than k positive eigenvalues. Formulate and prove
the result also for Hermitian matrices.
Exercise 6.35. Deduce from Exercise 6.34 the Weyl monotonicity principle (17) as follows.
For compact self-adjoint operators A and B write $A \le B$ if $\langle Av, v\rangle \le \langle Bv, v\rangle$ for all v. Show
that if $A \le B$ then $\nu_j(A) \le \nu_j(B)$ and $\varphi_j(A) \le \varphi_j(B)$ (where we set $\nu_j = 0$ and $\varphi_j = 0$
if there are not sufficient eigenvalues of the necessary sign) for all j. Formulate and prove
the result also for Hermitian matrices.
6.3 Trace-Class Operators 183
Exercise 6.36. Use Exercise 6.34 to prove Cauchy’s interlacing theorem as follows.
Let $A \in \mathrm{Mat}_{n,n}(\mathbb{C})$ be a Hermitian matrix. A matrix $B \in \mathrm{Mat}_{m,m}(\mathbb{C})$ with $m \le n$ is called
a compression of A if there is an orthogonal projection Q from $\mathbb{C}^n$ onto an m-dimensional
subspace with $QAQ^* = B$. Show that $\lambda_j(A) \le \lambda_j(B) \le \lambda_{n-m+j}(A)$ for $1 \le j \le m$ in this
case.
†
The trace is undoubtedly one of the fundamental functions on the space of
matrices. Recall that for any $n \ge 1$ the trace is defined by $\operatorname{tr}(A) = \sum_{k=1}^{n} A_{kk}$
for all $A = (A_{jk}) \in \mathrm{Mat}_{n,n}(\mathbb{C})$ and that it satisfies $\operatorname{tr}(AB) = \operatorname{tr}(BA)$ for A
and B in $\mathrm{Mat}_{n,n}(\mathbb{C})$, so that
$$\operatorname{tr}(SAS^{-1}) = \operatorname{tr}(A) \qquad (6.15)$$
for any $A \in \mathrm{Mat}_{n,n}(\mathbb{C})$ and $S \in \mathrm{GL}_n(\mathbb{C})$. The identity (6.15) means that
the trace is well-defined on the space of linear maps of a finite-dimensional
vector space (specifically, independent of the choice of basis). Using a Hilbert
space structure on $\mathbb{C}^n$ and fixing an orthonormal basis $v_1, \ldots, v_n$ we note
that $\langle Av_j, v_k\rangle$ is the coefficient of $v_k$ when expressing $Av_j$ in terms of the
orthonormal basis for $j, k = 1, \ldots, n$. Hence
$$\operatorname{tr}(A) = \sum_{j=1}^{n} \langle Av_j, v_j\rangle.$$
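The basis independence of the trace formula can be illustrated with a small computation. In the following sketch the matrix A and the second orthonormal basis are hypothetical sample data, and the helper names are assumptions made for the illustration.

```python
import math


def inner(x, y):
    """Usual inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def apply(a, x):
    """Matrix-vector product a x."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def trace_in_basis(a, basis):
    """Sum of <A v_j, v_j> over an orthonormal basis (v_j)."""
    return sum(inner(apply(a, v), v) for v in basis)


# Hypothetical sample matrix (not from the text).
A = [[1 + 2j, 3 + 0j],
     [-1j, 4 - 1j]]
standard = [[1, 0], [0, 1]]
r = 1.0 / math.sqrt(2.0)
other = [[r, r * 1j], [r, -r * 1j]]   # a second orthonormal basis of C^2

print(trace_in_basis(A, standard), trace_in_basis(A, other))
```

Both sums reproduce the sum of the diagonal entries of A, in line with the basis independence expressed by (6.15).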
is finite, where the supremum is taken over all integers N > 0 and over any
two finite lists of orthonormal vectors (v1 , . . . , vN ) and (w1 , . . . , wN ) of the
same length N .
† In this section we present an important class of compact operators. However, it is not
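$$\|A + B\|_{\mathrm{tc}} = \sup_{(v_n),(w_n)} \sum_{n=1}^{N} |\langle (A+B)v_n, w_n\rangle|
\le \sup_{(v_n),(w_n)} \sum_{n=1}^{N} \bigl(|\langle Av_n, w_n\rangle| + |\langle Bv_n, w_n\rangle|\bigr) \le \|A\|_{\mathrm{tc}} + \|B\|_{\mathrm{tc}}.$$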
$$\begin{aligned}
w_1' &= c_1 w_1'' \\
w_2' &= c_2\bigl(w_2'' - \langle w_2'', w_1'\rangle w_1'\bigr) \\
w_3' &= c_3\bigl(w_3'' - \langle w_3'', w_1'\rangle w_1' - \langle w_3'', w_2'\rangle w_2'\bigr) \\
&\;\;\vdots \\
w_m' &= c_m\bigl(w_m'' - \langle w_m'', w_1'\rangle w_1' - \cdots - \langle w_m'', w_{m-1}'\rangle w_{m-1}'\bigr),
\end{aligned}$$
where the constants $c_1, \ldots, c_m > 0$ are chosen to normalize the vectors to have
unit length. As $w_1, \ldots, w_m$ are orthogonal and due to (6.16), a simple induction
on $j = 1, \ldots, m$ shows that $c_j$ exists for all large enough N, that $c_j \to 1$
and also $w_j' \to w_j$ as $N \to \infty$.
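The Gram–Schmidt recursion displayed above can be sketched directly in code. The function below follows the displayed formulas (each new vector is corrected by its components along the previously constructed ones and then normalized); the input vectors are hypothetical, chosen to be close to an orthonormal list so that the output stays close to the input, as in the argument above.

```python
import math


def inner(x, y):
    """Usual inner product, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def gram_schmidt(vectors):
    """Orthonormalize a linearly independent list, following
    w'_j = c_j (w''_j - sum_i <w''_j, w'_i> w'_i)."""
    basis = []
    for w in vectors:
        u = list(w)
        for b in basis:
            coeff = inner(w, b)
            u = [ui - coeff * bi for ui, bi in zip(u, b)]
        norm = math.sqrt(inner(u, u).real)
        # c_j = 1 / norm; this fails if the vectors are linearly dependent.
        basis.append([ui / norm for ui in u])
    return basis


# Hypothetical nearly orthonormal input (not from the text).
W = [[1.0, 0.01, 0.0],
     [0.02, 1.0, 0.01],
     [0.0, -0.01, 1.0]]
B = gram_schmidt(W)
print(B)
```

For nearly orthonormal input the normalizing constants are close to 1 and each output vector is a small perturbation of the corresponding input vector.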
Proof. Fix some $\varepsilon > 0$ and some $N > 0$ and suppose the claim in the
lemma does not hold for N. Then there exist orthonormal vectors $w_1, \ldots, w_m$
in $\langle v_1, \ldots, v_N\rangle^{\perp}$ such that
$$\sum_{k=1}^{m} |\langle Aw_k, w_k\rangle| > \varepsilon. \qquad (6.17)$$
Now apply Lemma 6.40 to the orthonormal vectors $w_1, \ldots, w_m$ and the
orthonormal basis $v_{N+1}, v_{N+2}, \ldots$ of the Hilbert space $\langle v_1, \ldots, v_N\rangle^{\perp}$ to find
some $N' > N$ large enough and a very good orthonormal approximation
$w_1', \ldots, w_m'$ to $w_1, \ldots, w_m$ inside $\langle v_{N+1}, v_{N+2}, \ldots, v_{N'}\rangle$. In particular,
we may suppose that (6.17) also holds for $w_1', \ldots, w_m'$.
We now apply the argument above infinitely often to achieve a contradiction
to the hypothesis that $A \in \mathrm{TC}(H)$. Indeed, set $N_0 = 0$ to find
some $N_1 > N_0$ and orthonormal vectors $w_{1,1}, \ldots, w_{1,m_1}$ in $\langle v_1, \ldots, v_{N_1}\rangle$
so that (6.17) also holds for $w_{1,1}, \ldots, w_{1,m_1}$. Assuming we have already
found $N_0 < N_1 < \cdots < N_\ell$ and orthonormal vectors $w_{j,1}, \ldots, w_{j,m_j}$
in $\langle v_{N_{j-1}+1}, \ldots, v_{N_j}\rangle$ with the same estimate for all $j = 1, \ldots, \ell$, we may
apply the same argument to find $w_{\ell+1,1}, \ldots, w_{\ell+1,m_{\ell+1}}$ in $\langle v_{N_\ell+1}, \ldots, v_{N_{\ell+1}}\rangle$
with the same properties. However, the bound
$$\ell\varepsilon < \sum_{j=1}^{\ell} \sum_{k=1}^{m_j} |\langle Aw_{j,k}, w_{j,k}\rangle| \le \|A\|_{\mathrm{tc}}$$
shows that the construction above has to stop, proving the lemma.
$$V = \langle v_1, \ldots, v_N, w_1, \ldots, w_N\rangle,$$
and extend $v_1, \ldots, v_N$ with vectors $v_{N+1}', \ldots, v_M'$ to an orthonormal basis
of V. Similarly, we may find an orthonormal basis $w_1, \ldots, w_N, w_{N+1}', \ldots, w_M'$
of V. Define a linear map $A_V : V \to V$ by sending $v \in V$ to $\pi_V(Av)$ where $\pi_V$
is the orthogonal projection $H \to V$. Note that $\langle A_V v, w\rangle = \langle Av, w\rangle$ for any
two $v, w \in V$. By the tail estimate in Lemma 6.41 we have
$$\sum_{k=N+1}^{M} |\langle Av_k', v_k'\rangle| \le \varepsilon$$
and
$$\sum_{k=N+1}^{M} |\langle Aw_k', w_k'\rangle| \le \varepsilon.$$
Before proving this, notice the following property of the trace-class norm.
As the supremum is taken over $(v_n)_{n=1,\ldots,N}$ and $(w_n)_{n=1,\ldots,N}$ separately, we
could multiply each $w_n$ by an appropriate scalar $\alpha_n$ with $|\alpha_n| = 1$ to ensure
that
$$\langle Av_n, \alpha_n w_n\rangle \ge 0.$$
Therefore we may also write
$$\|A\|_{\mathrm{tc}} = \sup_{(v_n),(w_n)} \Bigl|\sum_{n=1}^{N} \langle Av_n, w_n\rangle\Bigr|,$$
and if $A \in \mathrm{TC}(H)$ then for every $\varepsilon > 0$ there exist finite orthonormal
lists $(v_n)_{n=1,\ldots,N}$ and $(w_n)_{n=1,\ldots,N}$ with
$$\sum_{n=1}^{N} \langle Av_n, w_n\rangle > \|A\|_{\mathrm{tc}} - \varepsilon.$$
Proof. From the fact that $e_{n+1} \perp \mathbb{C}^n$ (by the identification between $\mathbb{C}^n$
and $\mathbb{C}^n \times \{0\}$) and the definition of the trace-class norms, we have
and so
$$|d| \le \|A_1\|_{\mathrm{tc}} - \|A_0\|_{\mathrm{tc}} \le \varepsilon^2 \|A_1\|_{\mathrm{tc}} \le \varepsilon \|A_1\|_{\mathrm{tc}} \qquad (6.18)$$
by the hypotheses. The lemma will follow from Pythagoras’ theorem after we
have shown the more delicate estimate
and
$$w = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$$
are both unit vectors, and we may apply the definition of the trace-class
norm $\|A_1\|_{\mathrm{tc}}$ to just these two vectors and obtain
$$|\langle A_1 v, w\rangle| = \bigl|a\sqrt{1 - \varepsilon^2} + b\varepsilon\theta\bigr| \le \|A_1\|_{\mathrm{tc}}.$$
and the comment after the statement of Proposition 6.42 above, there exist
orthonormal bases $v_1, \ldots, v_n \in \mathbb{C}^n$ and $w_1, \ldots, w_n \in \mathbb{C}^n$ with
$$\|A_0\|_{\mathrm{tc}} = \sum_{j=1}^{n} \langle A_0 v_j, w_j\rangle.$$
for any choice of orthonormal basis of $\mathbb{C}^n$ and $j = 1, \ldots, n$. Now let the
orthonormal basis $v_1', \ldots, v_n'$ of $\mathbb{C}^n$ be chosen so that $Ub = \|b\| v_n'$. We extend U
to a unitary operator on $\mathbb{C}^{n+1}$ by setting $Ue_{n+1} = e_{n+1}$, and note that
and
$$U^{-1}v_1', \ldots, U^{-1}v_{n-1}', U^{-1}v_n'.$$
Using the definition of the trace-class norm $\|A_1\|_{\mathrm{tc}}$ we get
$$\|A_1\|_{\mathrm{tc}} \ge \sum_{j=1}^{n-1} \langle UA_1 v_j', v_j'\rangle + \sqrt{1 - \varepsilon^2}\,\langle UA_1 v_n', v_n'\rangle + \varepsilon\|b\|\langle v_n, v_n\rangle
\ge \sqrt{1 - \varepsilon^2}\,\|A_0\|_{\mathrm{tc}} + \varepsilon\|b\|,$$
where we have used (6.23) and (6.22) in the last step. This is the analogue
to (6.20) with $|a|$ replaced by $\|A_0\|_{\mathrm{tc}}$. Together with (6.21) we obtain (6.19)
in the general case.
as in the definition of $\|\cdot\|_{\mathrm{tc}}$ and using the comment after Proposition 6.42. We
define $A_\varepsilon$ by setting it equal to A on $\langle v_1, \ldots, v_n\rangle$ and to 0 on $\langle v_1, \ldots, v_n\rangle^{\perp}$.
For any vector $v' \in \langle v_1, \ldots, v_n\rangle$ and $v'' \in \langle v_1, \ldots, v_n\rangle^{\perp}$ we have
$$(A - A_\varepsilon)(v' + v'') = Av'',$$
$$v'' = v_{n+1} \in \langle v_1, \ldots, v_n\rangle^{\perp}$$
for $j, k = 1, \ldots, n + 1$. Then
$$A_1 = \begin{pmatrix} A_0 & b \\ c^t & d \end{pmatrix} \in \mathrm{Mat}_{n+1,n+1}(\mathbb{C})$$
Since $\langle U^* A_1 e_j, e_j\rangle = \bigl\langle Av_j, \sum_{k=1}^{n+1} u_{kj} w_k\bigr\rangle$ and the vectors $\sum_{k=1}^{n+1} u_{kj} w_k \in H$
are orthonormal for $j = 1, \ldots, n + 1$, the inequality $\|A_1\|_{\mathrm{tc}} \le \|A\|_{\mathrm{tc}}$ follows.
Combining the estimate $\|A_1\|_{\mathrm{tc}} \le \|A\|_{\mathrm{tc}}$ with (6.26) and Lemma 6.43, we
get (6.25) for any $v'' \in \langle v_1, \ldots, v_n\rangle^{\perp}$ with $\|v''\| = 1$.
It follows that A is the limit of $A_\varepsilon$ defined as above as $\varepsilon \searrow 0$ (with respect
to the operator norm), and so Lemma 6.7 implies the proposition.
The results above regarding the trace and the trace-class are satisfying, but
the concepts would not be important without non-trivial examples of trace-
class operators. We next discuss the relationship with the class of self-adjoint
(compact) operators, which gives us many examples.
We say that a self-adjoint operator A on a Hilbert space H is positive
if $\langle Av, v\rangle \ge 0$ for all $v \in H$.
Using Lemma 6.40 we can find some $N \ge 1$ and orthonormal approximations
of $(x_k)$ and $(y_k)$ within $V = \langle v_1, \ldots, v_N\rangle$. Letting $N \to \infty$ later on,
it suffices to show (6.27) for the approximations within V and we will use
the same letters to denote the approximations. We extend the orthonormal
lists $(x_k)$ and $(y_k)$ to orthonormal bases of V. Using the comment after
Proposition 6.42 we may adjust the $y_k$ once more and assume without loss of
generality that $\langle Ax_k, y_k\rangle \ge 0$ for $k = 1, \ldots, N$ without changing the value of
the left-hand side in (6.27). We also define a unitary operator $U : V \to V$
satisfying $U^* x_k = y_k$ for $k = 1, \ldots, N$. In other words, we wish to estimate
$$\sum_{k=1}^{K} \langle Ax_k, y_k\rangle \le \sum_{k=1}^{N} \langle Ax_k, y_k\rangle = \sum_{k=1}^{N} \langle Ax_k, U^* x_k\rangle = \operatorname{tr}(UA_V), \qquad (6.28)$$
This, together with (6.28), implies (6.27), first for the approximations of $(x_k)$
and $(y_k)$ in V, and then using Lemma 6.40 and letting $N \to \infty$ as indicated
earlier for any two lists of orthonormal vectors in H. Hence $\|A\|_{\mathrm{tc}} \le S$ and
the proposition follows.
Exercise 6.45. Let A be a compact operator. Show that A has a polar decomposition of
the form A = QP where ker(A) = ker(Q) = ker(P ), Q|(ker(A))⊥ is an isometry, and P is
positive, self-adjoint, and compact. Show that P is trace-class if and only if A is.
is trace-class. Then
$$\operatorname{tr}(K) = \int_X k(x, x)\, d\mu(x).$$
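As a plausibility check of the trace formula, one can compare both sides for the kernel G of Example 6.28 on [0, 1] with Lebesgue measure. The sketch assumes, without verifying the hypotheses of the proposition here, that the formula applies to this kernel and that its eigenvalues are $-1/(n\pi)^2$ for $n \ge 1$; the function names are illustrative.

```python
import math


def diagonal_integral(m=1000):
    """Midpoint rule for the integral of G(s, s) = s (s - 1) over [0, 1]."""
    h = 1.0 / m
    return h * sum(((j + 0.5) * h) * ((j + 0.5) * h - 1.0) for j in range(m))


def eigenvalue_sum(n_terms):
    """Partial sum of the assumed eigenvalues -1/(n pi)^2."""
    return sum(-1.0 / (n * math.pi) ** 2 for n in range(1, n_terms + 1))


print(diagonal_integral(), eigenvalue_sum(100000))
```

Both quantities are close to $-1/6$: the integral of $s(s-1)$ over [0, 1] is $-1/6$, and the eigenvalue sum converges to $-\pi^{-2}\sum_{n \ge 1} n^{-2} = -1/6$.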
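$$\operatorname{tr}(K) = \sum_{n=1}^{\infty} \langle Kv_n, v_n\rangle = \lim_{\ell\to\infty} \sum_{n=1}^{n(\ell)} \langle Kv_n, v_n\rangle = \lim_{\ell\to\infty} \sum_{P \in \xi_\ell} \langle Kw_P, w_P\rangle$$
since $\sum_{n=1}^{n(\ell)} \langle Kv_n, v_n\rangle = \operatorname{tr}(\pi_{W_\ell} K|_{W_\ell})$, where $\pi_{W_\ell}$ denotes the orthogonal
projection onto $W_\ell$, and this trace can also be computed in the orthonormal
basis $\{w_P \mid P \in \xi_\ell\}$. Now we may use the definition of K to see that
$$\sum_{P \in \xi_\ell} \langle Kw_P, w_P\rangle = \sum_{P \in \xi_\ell} \frac{1}{\mu(P)} \int_P K(1_P)\, d\mu = \sum_{P \in \xi_\ell} \frac{1}{\mu(P)} \int_{P \times P} k(x, y)\, d\mu{\times}\mu(x, y).$$
Now fix $\varepsilon > 0$ and use uniform continuity of k to find an $\ell$ sufficiently large
to ensure that
$$|k(x, y) - k(x, x)| < \varepsilon$$
whenever $x, y \in P$ for some $P \in \xi_\ell$. We may also suppose that $\ell$ is large
enough to have
$$\Bigl|\operatorname{tr}(K) - \sum_{P \in \xi_\ell} \langle Kw_P, w_P\rangle\Bigr| < \varepsilon.$$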
Show that h·, ·iC is a complex inner product making HC into a complex Hilbert space.
(b) We used Lemma 6.38 (concerning complex Hilbert spaces) twice in this section. Use (a)
to show that the results of this section also hold in the case of a real Hilbert space.
$$T \ni t \longmapsto A_t \in \mathrm{TC}(H)$$
The next two exercises give a tool for showing that certain Hilbert–Schmidt
integral operators (as in Proposition 6.48) are trace-class.
Exercise 6.53. Let H be a Hilbert space with an orthonormal basis $(e_n)$. Define the
Hilbert–Schmidt norm
$$\|A\|_{\mathrm{HS}}^2 = \sum_{j,k} |\langle Ae_j, e_k\rangle|^2$$
and the space of Hilbert–Schmidt operators $\mathrm{HS}(H) = \{A \in B(H) \mid \|A\|_{\mathrm{HS}} < \infty\}$.
(a) Show that $A \in \mathrm{HS}(H)$ if and only if $A^* \in \mathrm{HS}(H)$, and that $\|A^*\|_{\mathrm{HS}} = \|A\|_{\mathrm{HS}}$ for
all $A \in \mathrm{HS}(H)$.
(b) Show that the definition of the Hilbert–Schmidt norm is independent of the choice of
orthonormal basis.
(c) Show that $\mathrm{HS}(H)$ forms a two-sided ideal in $B(H)$. That is, for any $A \in \mathrm{HS}(H)$
and $B \in B(H)$ we have $AB \in \mathrm{HS}(H)$ and $BA \in \mathrm{HS}(H)$.
(d) Find an inner product on $\mathrm{HS}(H)$ which induces the norm $\|\cdot\|_{\mathrm{HS}}$, and show that $\mathrm{HS}(H)$
is a Hilbert space with this inner product.
(e) Show that $\mathrm{HS}(H)$ is also a Banach algebra, meaning that $\|AB\|_{\mathrm{HS}} \le \|A\|_{\mathrm{HS}} \|B\|_{\mathrm{HS}}$.
(f) Show that $\mathrm{HS}(H)$ is a closed subspace of $B(H)$ if and only if H is finite-dimensional.
(g) Show that every Hilbert–Schmidt operator is compact.
(h) Assume now that $H = L^2((0, 1))$. For every $k \in L^2((0, 1)^2)$ we define the associated
Hilbert–Schmidt integral operator as in Proposition 6.11. Show that the space of
Hilbert–Schmidt integral operators corresponds exactly to $\mathrm{HS}(H)$. In particular, show that for any
operator $A \in \mathrm{HS}(H)$ the corresponding kernel $k_A$ is given by
$$k_A(x, y) = \sum_{i,j} \langle Ae_i, e_j\rangle\, e_i(x) e_j(y).$$
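Part (b) of Exercise 6.53 can be explored numerically in finite dimensions before attempting a proof. The sketch below evaluates the defining double sum in two orthonormal bases of $\mathbb{C}^2$; the matrix, the bases, and the helper names are hypothetical sample choices.

```python
import math


def inner(x, y):
    """Usual inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def apply(a, x):
    """Matrix-vector product a x."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def hs_norm_sq(a, basis):
    """Sum over j, k of |<A e_j, e_k>|^2 for an orthonormal basis (e_n)."""
    return sum(abs(inner(apply(a, ej), ek)) ** 2 for ej in basis for ek in basis)


# Hypothetical sample matrix (not from the text).
A = [[1 + 2j, 3 + 0j],
     [-1j, 4 - 1j]]
standard = [[1, 0], [0, 1]]
r = 1.0 / math.sqrt(2.0)
other = [[r, r * 1j], [r, -r * 1j]]   # a second orthonormal basis of C^2

print(hs_norm_sq(A, standard), hs_norm_sq(A, other))
```

In both bases the value equals the sum of the squared absolute values of the matrix entries, which is 32 for this sample.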
Exercise 6.55. (18) Let H be a Hilbert space with respect to the inner product $\langle\cdot,\cdot\rangle_H$,
and write $\|\cdot\|_H$ for the induced norm on H. Let $\langle\cdot,\cdot\rangle_0$ be a semi-inner product on H, and
write $\|\cdot\|_0$ for the induced semi-norm on H. Assume that $\|\cdot\|_0 \le \|\cdot\|_H$.
(a) Show that there exists a unique positive bounded self-adjoint operator A such that
The relative trace of $\|\cdot\|_0$ with respect to $\|\cdot\|_H$ is defined as the trace of A (which might
be infinity).
(b) Let $k > \frac{d}{2}$, $H = H^k(U)$ for some open subset $U \subseteq \mathbb{R}^d$, and $\langle f, g\rangle_0 = f(x)\overline{g(x)}$ for some
fixed $x \in U$. Show that A as in (a) has finite trace (and so $\|\cdot\|_0$ has finite relative trace
with respect to $\|\cdot\|_H$).
(c) Let $\mu$ be a compactly supported measure on U. Combine (b) with Exercise 6.52 to show
that the semi-norm $\|f\|_{L^2(\mu)} = \bigl(\int |f|^2\, d\mu\bigr)^{1/2}$ for $f \in H^k(U)$ has finite relative trace with
respect to $\|\cdot\|_H$.
We will prove in this section the claim from Section 1.2 that for any open
bounded subset U ⊆ Rd there is a basis of L2 (U ) consisting of eigenfunctions
of the Laplace operator such that these functions also vanish (in the square-
mean sense) at the boundary of U .
In the proof we will first go back to the case of the d-dimensional torus,
even though (or actually precisely because) we already have an orthonormal
basis consisting of eigenfunctions of the Laplacian in this setting, namely
the characters. In Section 6.4.2 we will define a right inverse of ∆ defined
on L2 (U ) for an open subset U of Rd — a setting in which we do not know
the eigenfunctions of the Laplacian. Finally, we will ask in Section 6.4.4 about
the growth rate of the eigenvalues and prove Weyl’s law for Jordan measurable
open domains. We start by stating the main theorem, which will be proved
in Section 6.4.2.
6.4 Eigenfunctions for the Laplace Operator 197
We already used the fact that the characters on Td are eigenfunctions of the
Laplace operator on Td in the proof of elliptic regularity (Lemma 5.48). Ob-
taining a compact self-adjoint right inverse to ∆ is quite easy on the torus Td .
Exercise 6.57. Define $L^2_0(\mathbb{T}^d) = \bigl\{f \in L^2(\mathbb{T}^d) \mid \int_{\mathbb{T}^d} f\, dx = 0\bigr\}$, and prove that there exists
a compact self-adjoint operator $S : L^2_0(\mathbb{T}^d) \to L^2_0(\mathbb{T}^d)$ with the property that $\Delta Sf = f$
for all $f \in L^2_0(\mathbb{T}^d)$.
For the discussion on an open subset we will need the following lemma.
is compact.
This implies a uniformity claim for the convergence of the Fourier series of
all $f \in K$. Indeed, for any $N \ge 1$ we have
$$\sum_{\|n\|_2 > N} |a_n|^2 = N^{-2} \sum_{\|n\|_2 > N} N^2 |a_n|^2 \le N^{-2} \sum_{\|n\|_2 > N} (1 + \|n\|_2^2) |a_n|^2 \ll N^{-2},$$
and so we see that the above tail sum goes to zero uniformly for all $f \in K$
as $N \to \infty$.
To see that K is totally bounded we fix some $\varepsilon > 0$ and choose N such
that the above statement becomes
$$\sum_{\|n\|_2 > N} |a_n|^2 < \varepsilon^2/4$$
for all $f = \sum_n a_n \chi_n \in K$. Next take a finite $\varepsilon/2$-dense subset of the
finite-dimensional compact set
$$\Bigl\{ f = \sum_{\|n\|_2 \le N} a_n \chi_n \;\Big|\; \|f\|_2 \le 1 \text{ and } \|\partial_j f\|_2 \le 1 \text{ for } j = 1, \ldots, d \Bigr\}.$$
Exercise 6.59. Consider the map ık,ℓ : H k (Td ) → H ℓ (Td ) for k > ℓ > 0.
(a) Characterize those k and ℓ for which the map ık,ℓ is compact.
(b) Characterize those k for which the map ık,0 ı∗k,0 is Hilbert–Schmidt class.
(c) Characterize those k for which the map ık,0 ı∗k,0 is trace-class.
The following provides the link between the Laplace operator and our dis-
cussion of compact self-adjoint operators in Theorem 6.27. The compactness
claim is a special case of Rellich’s Theorem.
Proposition 6.60 (Self-adjoint compact right inverse). Let U ⊆ Rd
be a bounded and open subset. Using Lemma 5.41 we equip H01 (U ) with the
inner product h·, ·i1 . Then the map ı = ı1,0 : H01 (U ) −→ H 0 (U ) = L2 (U ) has
the property that ∆(ıı∗ f ) ∈ L2 (U ) exists for all f ∈ L2 (U ) and equals −f . In
other words, ∆ ◦ (−ıı∗ ) = I is the identity on L2 (U ). Finally, S = −ıı∗ is a
compact self-adjoint operator L2 (U ) −→ L2 (U ).
$$H^1_0(U) \xrightarrow{\;P\;} H^1(\mathbb{T}^d_R) \xrightarrow{\;\imath_{1,0}\;} H^0(\mathbb{T}^d_R) = L^2(\mathbb{T}^d_R) \xrightarrow{\;\cdot|_U\;} L^2(U),$$
(b) Let U = {(x1 , x2 ) ∈ (0, 1) × (0, 1) | x1 + x2 < 1}. Find an orthonormal basis of L2 (U )
consisting of eigenfunctions of the Laplace operator and satisfying the Dirichlet boundary
value conditions.
Exercise 6.62. Assume that $d \ge 2$ (or that d = 2 for simplicity). Let $U \subseteq \mathbb{R}^d$ be open
and $K \subseteq U$ a compact subset. Let $f \in H^1_0(U)$ be an eigenfunction of $\Delta$ (and of S as in
the proof of Theorem 6.56) such that $\Delta f = \lambda f$ for some $\lambda < -1$. Show that
$$\|f\|_{K,\infty} \ll_{K,U} |\lambda|^{\frac{d}{4} + \frac{1}{2}} \|f\|_2.$$
(19)
We now describe a concrete case of Theorem 6.56. As mentioned earlier,
a concrete description of the Laplace eigenfunction is generally impossible
unless the domain has special features. Thus a natural case beyond the open
rectangle considered in Exercise 6.61 is to set U to be the open unit disc $B_1^{\mathbb{R}^2}$.
For a given eigenfunction $f \in H^1_0(U)$ of $\Delta$ and some rotation matrix
$$k(\phi) = \begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix}$$
for $\phi \in [0, 2\pi)$ as in Section 1.1 we may consider the function $f^k(x) = f(kx)$.
A simple calculation (which may be carried out using Proposition 1.5) shows
that $f^k$ is also an eigenfunction of $\Delta$ on U with the same eigenvalue as f.
Since the eigenspace of $H^1_0$ functions of $\Delta$ for a given eigenvalue is
finite-dimensional, it follows that we can find for any given eigenvalue a basis
of the eigenspace with the property that every basis vector also has some
weight $n \in \mathbb{Z}$ for the action of K on U (cf. Corollary 3.89).
Fixing the weight $n \in \mathbb{Z}$ and the eigenvalue $\lambda < 0$, the partial differential
equation $\Delta f = \lambda f$ has a convenient reformulation. In fact, a calculation
reveals that the Laplace operator $\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$ has the representation
$$\Delta = \frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2}$$
in polar coordinates (we will also write f for the eigenfunction in polar
coordinates), and if f has weight n then $f(r, \theta) = F(r)e^{in\theta}$ for a function F
on [0, 1]. Since f is smooth on U, F is smooth on (0, 1). Since f vanishes
on $\partial U$ we have $F(1) = 0$. Moreover, if $n \neq 0$ we must also have $F(0) = 0$
(check this). Finally, the partial differential equation $\Delta f = \lambda f$ now becomes
the ordinary differential equation $\frac{d^2F}{dr^2} + \frac{1}{r}\frac{dF}{dr} - \frac{n^2}{r^2}F = \lambda F$, or, equivalently,
$$r^2\frac{d^2F}{dr^2} + r\frac{dF}{dr} + \bigl(|\lambda| r^2 - n^2\bigr)F = 0, \qquad (6.30)$$
with the conditions on $F(0)$ and $F(1)$ as explained above. The differential
equation
$$x^2 J_n'' + x J_n' + (x^2 - n^2) J_n = 0 \qquad (6.31)$$
on $(0, \infty)$ is known as Bessel’s equation and the solutions are called the Bessel
functions, one of a class of special functions introduced by the astronomer
Bessel in 1817 in connection with the problem of three bodies moving under
mutual gravitational attraction. The two equations (6.30) and (6.31) are
essentially equivalent by setting $x = |\lambda|^{1/2} r$ and $F(r) = J_n(|\lambda|^{1/2} r)$.
Since (6.31) is a linear second-order differential equation there are two
linearly independent real solutions for each $\lambda$ and n. The function $J_n$ is
characterized up to a scalar multiple by the condition that $\lim_{x\to 0} J_n(x)$
exists (see Exercise 6.63(b)). Bessel found the integral representation
$$J_n(x) = \frac{1}{\pi} \int_0^{\pi} \cos(x \sin t - nt)\, dt \qquad (6.32)$$
of the function $J_n$ (we refer to Whittaker and Watson [113] for a general
treatment of special functions). We will not develop this theory further, but
refer to Figure 6.1–6.2 for a visualization of the resulting functions; in
modelling the behaviour of a drum the time variable is also needed, and so these
illustrations may be thought of as snapshots of an oscillating drum skin (as
alluded to in Section 1.2.2).
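The integral representation (6.32) is easy to evaluate numerically, which gives a quick way to experiment with Bessel's equation (6.31). The following sketch uses the trapezoidal rule, which is very accurate here because the integrand extends to a smooth even periodic function; the function names, the step sizes, and the finite-difference check are illustrative choices, not part of the text.

```python
import math


def bessel_j(n, x, m=400):
    """Approximate J_n(x) via (6.32) using the trapezoidal rule with m subintervals."""
    h = math.pi / m
    g = lambda t: math.cos(x * math.sin(t) - n * t)
    total = 0.5 * (g(0.0) + g(math.pi)) + sum(g(j * h) for j in range(1, m))
    return total * h / math.pi


def bessel_residual(n, x, d=1e-3):
    """Central finite-difference residual of Bessel's equation (6.31) at x."""
    j0, jp, jm = bessel_j(n, x), bessel_j(n, x + d), bessel_j(n, x - d)
    first = (jp - jm) / (2.0 * d)
    second = (jp - 2.0 * j0 + jm) / (d * d)
    return x * x * second + x * first + (x * x - n * n) * j0


print(bessel_j(0, 1.0), bessel_j(0, 2.404825557695773), bessel_residual(2, 1.5))
```

The second printed value is numerically zero, since 2.4048... is the first positive zero of $J_0$. By the substitution $x = |\lambda|^{1/2} r$ and the boundary condition $F(1) = 0$, such zeros correspond to eigenvalues $\lambda = -x^2$ on the unit disc (compare Exercise 6.63(d)).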
Exercise 6.63. Make the discussion of this section complete by the following steps.
(a) Prove that Jn as defined in (6.32) satisfies the differential equation (6.31).
(b) Show that the equation (6.31) has a solution $Y_n$ with $Y_n(x) \to -\infty$ as $x \to 0$ given by
$$Y_n(x) = \frac{1}{\pi} \int_0^{\pi} \sin(x \sin t - nt)\, dt - \frac{1}{\pi} \int_0^{\infty} \bigl(e^{nt} + (-1)^n e^{-nt}\bigr) e^{-x \sinh t}\, dt.$$
(The solutions Jn and Yn are referred to as Bessel functions of the first and second kind,
respectively.)
(c) Show that for every n ∈ Z there is an eigenfunction of weight n.
(d) Show that for every n ∈ Z the eigenvalues (and eigenfunctions) of weight n correspond
to the zeros of Jn .
$$\lim_{T\to\infty} \frac{N(T)}{T^{d/2}} = (2\pi)^{-d} \omega_d\, m(U), \qquad (6.33)$$
where m is the Lebesgue measure on $\mathbb{R}^d$ and $\omega_d = m(B_1^{\mathbb{R}^d})$ is the volume of
the unit ball in $\mathbb{R}^d$.
In 1966 M. Kac [50] asked ‘Can one hear the shape of a drum?’ As we
explained in Section 1.2.2, the eigenvalues of the Laplacian on an open set U
relate directly to the frequencies at which a membrane with the shape U
would vibrate. Thus the notes one hears from a drum with shape U are
precisely related to the eigenvalues of the Laplacian and the question raised
by Kac asks whether the list of eigenvalues determines U (up to isometric
motions of Rd ). One of the consequences of Theorem 6.64 is that the size of
the drum certainly can be heard in this sense. Kac’s question was answered
in the negative.(21)
Our (by now well-established) approach is to first show the result for the
torus, and we will then apply a technique known as Dirichlet–Neumann brack-
eting to extend the proof to the general case.
Proposition 6.65. Let R > 0 and U = TdR = Rd /(2RZd ) or U = (0, R)d .
Then Weyl’s law holds for the eigenvalues of the Laplacian on U .
Proof. In both cases, write $(\lambda_n)$ for the eigenvalues and $(f_n)$ for the
associated eigenfunctions in $H^1(\mathbb{T}^d_R)$ resp. $H^1_0(U)$. In the case of $\mathbb{T}^d_R$ we know that
the basis of eigenfunctions is given by $(\chi_n)$, where
$$\chi_n(x) = e^{2\pi i (2R)^{-1}(n_1 x_1 + \cdots + n_d x_d)}$$
$$\lim_{S\to\infty} \frac{\bigl|\mathbb{Z}^d \cap B_S^{\mathbb{R}^d}\bigr|}{S^d} = \omega_d.$$
$$\lim_{T\to\infty} \frac{N(T)}{(T^{1/2}\, 2R/2\pi)^d} = \omega_d,$$
We now extend the result to $U = (0, R)^d$. For this let $n \in \mathbb{N}_0^d$ and note
that the characters $(\chi_m)$ on $\mathbb{T}^d_R$ for $m = (\pm n_1, \ldots, \pm n_d)$ all have the same
eigenvalue $-4\pi^2 (2R)^{-2} \|n\|_2^2$ for the Laplacian. Taking linear combinations of
the characters we obtain the eigenfunctions
and so
$$\lim_{T\to\infty} \frac{N_U(T)}{T^{d/2}} = \lim_{T\to\infty} \frac{N_{\mathbb{T}^d_R}(T)}{2^d\, T^{d/2}} = (2\pi)^{-d} \omega_d\, m(U).$$
For the proof of the claim we apply the discussion of even and odd functions
from Section 1.1 in d dimensions. For this we identify $L^2((0, R)^d)$ with the
subspace of functions in $L^2((-R, R)^d) = L^2(\mathbb{T}^d_R)$ that are odd with respect
to all coordinates. More precisely, given $f \in L^2((0, R)^d)$ we define $\tilde{f}|_U = f$
and
$$\tilde{f}(\varepsilon_1 t_1, \varepsilon_2 t_2, \ldots, \varepsilon_d t_d) = \varepsilon_1 \cdots \varepsilon_d\, \tilde{f}(t_1, \ldots, t_d)$$
for $\varepsilon_1, \ldots, \varepsilon_d \in \{\pm 1\}$ and $(t_1, \ldots, t_d) \in U$ (and the same formula then holds
for all $t \in (-R, R)^d$). Expand $\tilde{f}$ into eigenfunctions of the form (6.34) for
all $n \in \mathbb{N}^d$. If g is one of these, then either g is the product only of sine
functions or it is even with respect to one or more of the variables; assume
that it is even with respect to $x_k$. Using the substitution $x_k \to -x_k$ in the
inner product we obtain
and so $\langle \tilde{f}, g\rangle_{L^2((-R,R)^d)} = 0$. This shows that $\tilde{f}$ is expressed using products
of sine functions only. We also note that for any $f, g \in L^2((0, R)^d)$ we have
$$\bigl\langle \tilde{f}, \tilde{g}\bigr\rangle_{L^2((-R,R)^d)} = 2^d \langle f, g\rangle_{L^2((0,R)^d)}$$
which may be seen by splitting $(-R, R)^d$ into $2^d$ smaller cubes and
substituting $y_j = \pm x_j$ for $j = 1, \ldots, d$ and $x \in (0, R)^d$ on each one of them. It
follows that the functions of the form $x \mapsto \sin(\pi R^{-1} n_1 x_1) \cdots \sin(\pi R^{-1} n_d x_d)$
for $n \in \mathbb{N}^d$ are an orthogonal basis of $L^2(U)$. As these functions also vanish
on $\partial U$ it follows that they belong to $H^1_0(U)$ (see Exercise 6.66), proving the
proposition. The cautious reader may notice that we have only found an
orthonormal basis of $L^2(U)$ in $H^1_0(U)$ consisting of eigenfunctions of $\Delta$ as in
Theorem 6.56. However, as $\Delta$ is not a well-defined operator it is not clear
whether this basis is the same as the one in Theorem 6.56. This is resolved
in Lemma 6.67(a).
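Weyl's law for the cube can be observed by direct counting. The sketch below treats the case d = 2 and uses the product-of-sines eigenfunctions found above, whose eigenvalues are $-\pi^2 R^{-2}(n_1^2 + n_2^2)$ with $n_1, n_2 \ge 1$, together with $\omega_2 = \pi$; the function names and the choice of T are illustrative.

```python
import math


def dirichlet_count_square(T, R=1.0):
    """N(T) for U = (0, R)^2: the number of (n1, n2) with n1, n2 >= 1
    and pi^2 R^-2 (n1^2 + n2^2) <= T."""
    s2 = T * R * R / math.pi ** 2            # bound on n1^2 + n2^2
    count = 0
    n1 = 1
    while n1 * n1 + 1 <= s2:
        # number of integers n2 >= 1 with n2^2 <= s2 - n1^2
        count += math.isqrt(int(s2 - n1 * n1))
        n1 += 1
    return count


def weyl_prediction_square(T, R=1.0):
    """(2 pi)^-2 * omega_2 * m(U) * T^(d/2) for d = 2, omega_2 = pi, m(U) = R^2."""
    return (2.0 * math.pi) ** -2 * math.pi * R * R * T


T = 1.0e6
print(dirichlet_count_square(T), weyl_prediction_square(T))
```

For this T the ratio of the two numbers is within one percent of 1. The count lies slightly below the prediction because lattice points on the coordinate axes are excluded, a boundary effect of lower order in T.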
Essential Exercise 6.66. (a) Show that the function $x \mapsto \sin(\pi R^{-1} n x)$ lies
in $H^1_0((0, R))$ for all $n \ge 1$.
(b) Formulate and show the analogous result for $U = (0, R)^d$.
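for all $g \in H^1_0(U)$. If this holds for a non-trivial f then $\lambda < 0$ and we
have $S(f) = \lambda^{-1} f$ (where $S = -\imath\imath^*$ is as in Proposition 6.60). In particular,
the eigenspaces inside $H^1_0(U)$ of $\Delta$ coincide with those of S.
(b) If $f_1, f_2, \ldots \in H^1_0(U) \cap C^\infty(U)$ are eigenfunctions of $\Delta$ with
eigenvalues $0 > \lambda_1 \ge \lambda_2 \ge \cdots$ that form an orthonormal basis of $L^2(U)$, then
$$|\lambda_1|^{-\frac{1}{2}} f_1,\ |\lambda_2|^{-\frac{1}{2}} f_2,\ \ldots \qquad (6.36)$$
Proof. For the proof of (a), suppose that $f \in H^1_0(U) \cap C^\infty(U)$ satisfies the
equation $\Delta f = \lambda f$. Let $\phi \in C_c^\infty(U)$. Then
$$\langle f, \phi\rangle_1 = \sum_j \langle \partial_j f, \partial_j \phi\rangle_{L^2(U)} = -\langle \Delta f, \phi\rangle_{L^2(U)} = -\lambda \langle f, \phi\rangle_{L^2(U)}.$$
$$\langle \imath^*(-\lambda f), g\rangle_1 = \langle -\lambda f, \imath(g)\rangle_{L^2(U)} = \langle -\lambda f, g\rangle_{L^2(U)} = \langle f, g\rangle_1$$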
Proof. Fix some $T > 0$. Suppose that $|\lambda_1|, \ldots, |\lambda_n| \le T$ and $|\lambda_k| > T$
for all $k > n$ (so that $n = N(T)$). Define $V_0 = \langle f_1, \ldots, f_n\rangle$. Applying
Lemma 6.67(a) to each $f_k$ for $k = 1, \ldots, n$ and $g = \sum_{\ell=1}^{n} a_\ell f_\ell$ we get
$$\|g\|_1^2 = \Bigl\langle \sum_{k=1}^{n} a_k f_k, \sum_{\ell=1}^{n} a_\ell f_\ell \Bigr\rangle_1 = \sum_{k=1}^{n} a_k |\lambda_k| \Bigl\langle f_k, \sum_{\ell=1}^{n} a_\ell f_\ell \Bigr\rangle_{L^2(U)}
= \sum_{k=1}^{n} |\lambda_k| |a_k|^2 \le T \Bigl\| \sum_{k=1}^{n} a_k f_k \Bigr\|_{L^2(U)}^2$$
and we see that V does not satisfy the requirement in (6.37). Hence any
subspace V as in (6.37) would satisfy $\dim V \le \dim V_0 = N(T)$ and the
lemma follows.
Proof of Theorem 6.64. Notice first that Lemma 6.68 implies for disjoint
open subsets U1 and U2 of a bounded open set U the sub-additivity
where we write NU ′ (T ) for the counting function for an open and bounded
domain U ′ ⊆ Rd . Indeed, on extending functions to be zero outside Uj we may
write H01 (Uj ) ⊆ H01 (U ) for j = 1, 2 as in Exercise 5.27, and, once embedded,
we have H01 (U1 ) ⊥ H01 (U2 ), with respect to both h·, ·i1 and h·, ·iL2 (U) , since U1
and U2 are disjoint, so that we can take the direct sum of the subspaces
realising the maximum for U1 and U2 appearing in Lemma 6.68. We note that,
although it is tempting to try, it is not possible to derive the estimate (6.38)
Fig. 6.4: Approximating U by two pixelated versions of U , one from inside and
one from outside.
$$\liminf_{T\to\infty} \frac{N_U(T)}{T^{d/2}} \ge \lim_{T\to\infty} \frac{N_{I_1}(T) + \cdots + N_{I_k}(T)}{T^{d/2}}
= (2\pi)^{-d} \omega_d\, m(I_1 \sqcup \cdots \sqcup I_k) \ge (2\pi)^{-d} \omega_d\, (m(U) - \varepsilon)$$
by Proposition 6.65.
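On the other hand, we may add extra cubes to $O_1 \sqcup \cdots \sqcup O_\ell$ to obtain
$$O_1 \sqcup \cdots \sqcup O_\ell \sqcup E_1 \sqcup \cdots \sqcup E_n = [-R, R]^d$$
$$\limsup_{T\to\infty} \Bigl( \frac{N_U(T)}{T^{d/2}} + \sum_{j=1}^{n} \frac{N_{E_j}(T)}{T^{d/2}} \Bigr) \le \limsup_{T\to\infty} \frac{N_{O_1^o \sqcup \cdots \sqcup O_\ell^o}(T) + \sum_{j=1}^{n} N_{E_j}(T)}{T^{d/2}}
\le \lim_{T\to\infty} \frac{N_{(-R,R)^d}(T)}{T^{d/2}} = (2\pi)^{-d} \omega_d\, m((-R, R)^d).$$
$$\lim_{T\to\infty} \frac{N_{E_j}(T)}{T^{d/2}} = (2\pi)^{-d} \omega_d\, m(E_j)$$
it follows that
$$\limsup_{T\to\infty} \frac{N_U(T)}{T^{d/2}} \le (2\pi)^{-d} \omega_d\, m(O_1 \sqcup \cdots \sqcup O_\ell) \le (2\pi)^{-d} \omega_d\, (m(U) + \varepsilon).$$
Since $\varepsilon > 0$ was arbitrary, this and the reverse bound for the limit infimum
above prove the theorem.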
Exercise 6.69. Let $U \subseteq \mathbb{R}^d$ be open, bounded, and Jordan measurable. Show that
$$\lim_{n\to\infty} \frac{|\lambda_n|}{n^{2/d}} = (2\pi)^2 (\omega_d\, m(U))^{-2/d},$$
where $\lambda_1 \ge \lambda_2 \ge \cdots$ is the ordered list of eigenvalues of $\Delta$ on U.
Exercise 6.70. Assume that $d \ge 2$ (or, for simplicity, that d = 2), let $U \subseteq \mathbb{R}^d$ be an open,
bounded, Jordan measurable set, and let $f \in C_c^\infty(U)$. Show that the series $\sum_{n=1}^{\infty} \langle f, f_n\rangle f_n$,
with $(f_n)$ as in Theorem 6.56 ordered as in Exercise 6.69, converges pointwise on U and
uniformly on any compact subset of U.
• In Section 8.2.2 we return one more time to the topic of Sobolev spaces
and study elliptic regularity up to and including the boundary of U . For
further reading in that direction, we refer to Evans [30].
• The spectral theory of compact self-adjoint operators proven here is only
the starting point. We discuss spectral theory again in Chapter 9 for
unitary operators, in Chapter 11 from a general perspective as a prepar-
ation for Chapter 12 for bounded normal operators, and in Chapter 13
for unbounded self-adjoint operators.
The reader should continue with Chapter 7 and Chapter 8 as these give
important results for the chapters that follow.
Chapter 7
Dual Spaces
Let X be a real (or complex) normed vector space. A bounded linear operator
from X into the normed space R (or C) is a (continuous) linear functional
on X. Recall that the space of all continuous linear functionals is denoted X ∗
or B(X, R) and it is called the dual or conjugate space of X. Lemma 2.54
shows that X ∗ is a Banach space with respect to the operator norm.
In Section 7.1 we prove the Hahn–Banach theorem, a fundamental tool for
constructing linear functionals with prescribed properties. We also discuss
several further consequences of the Hahn–Banach theorem concerned with
the relationship between X and X ∗ . In Section 7.2 we discuss applications
of these results. Finally, in Sections 7.3 and 7.4 we will identify the duals of
many important Banach spaces, leading to examples and counter-examples
to the property of reflexivity.
One of the most important questions one may ask of X ∗ is the following: are
there ‘enough’ elements in X ∗ ? For example, are there enough elements to
separate points? This is answered in great generality using the Hahn–Banach
theorem (Theorem 7.3 below); see Corollary 7.4.
Even though in the main applications of the Hahn–Banach lemma the func-
tion p below is simply a norm, we will also see applications of this stronger
form of the lemma (with the stated weaker assumptions on the function p)
in Section 7.4 and Section 8.6.1.
and
p(λx1 ) = λp(x1 )
for all $\lambda > 0$ and $x_1, x_2 \in X$. Let Y be a subspace of X, and $f : Y \to \mathbb{R}$
a linear function with $f(y) \le p(y)$ for all $y \in Y$. Then there exists a linear
functional $F : X \to \mathbb{R}$ such that $F(y) = f(y)$ for $y \in Y$, and $F(x) \le p(x)$ for
all $x \in X$.
and
Exercise 7.2. Let X be a real vector space and let $K \subseteq X$ be a convex subset. Suppose
that $0 \in K$ and that for every $x \in X$ there is some $t > 0$ with $tx \in K$. Define the gauge
function $p_K(x) = \inf\{t > 0 \mid \frac{1}{t} x \in K\}$. Show that $p_K$ is norm-like in the sense that it is
non-negative, homogeneous for positive scalars, and satisfies the triangle inequality (the
latter two being assumptions in Lemma 7.1).
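To get a feeling for the gauge function of Exercise 7.2, one can compute it by bisection from a membership test for K. In the sketch below K is taken to be the unit ball of the $\ell^1$ norm on $\mathbb{R}^2$ (a hypothetical choice for which $p_K$ should be the $\ell^1$ norm itself); the names gauge and in_l1_ball and the iteration count are assumptions made for the illustration.

```python
def gauge(in_K, x, iterations=200):
    """Approximate p_K(x) = inf{t > 0 : x / t in K} by bisection.
    Assumes K is convex, contains 0, and absorbs every point."""
    hi = 1.0
    while not in_K([xi / hi for xi in x]):   # enlarge hi until x / hi lies in K
        hi *= 2.0
    lo = 0.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid > 0.0 and in_K([xi / mid for xi in x]):
            hi = mid
        else:
            lo = mid
    return hi


def in_l1_ball(v):
    """Membership test for the hypothetical set K = {v : |v_1| + |v_2| <= 1}."""
    return sum(abs(vi) for vi in v) <= 1.0


x, y = [0.3, -1.2], [-1.0, 0.4]
print(gauge(in_l1_ball, x), gauge(in_l1_ball, y),
      gauge(in_l1_ball, [a + b for a, b in zip(x, y)]))
```

The bisection relies on the fact that, for convex K containing 0, the set of admissible t is an interval unbounded to the right. For this sample the printed values illustrate positive homogeneity and the triangle inequality.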
For real vector spaces, the Hahn–Banach theorem follows at once (for
complex spaces a little more work is needed).
Theorem 7.3 (Hahn–Banach theorem). Let X be a real or complex
normed space, and Y a linear subspace. Then for any $y^* \in Y^*$ there exists
an $x^* \in X^*$ such that $\|x^*\| = \|y^*\|$ and $x^*(y) = y^*(y)$ for all $y \in Y$.
That is, any linear functional defined on a subspace may be extended to
a linear functional on the whole space, without increasing the norm.
Proof of Theorem 7.3. Assume first that X is a real normed space.
Let $p(x) = \|y^*\|\|x\|$ and $f(x) = y^*(x)$. Apply the Hahn–Banach lemma
(Lemma 7.1) to find an extension $x^* = F$ to the whole space. To check
that $\|x^*\| \le \|y^*\|$, write $x^*(x) = \theta |x^*(x)|$ with $\theta \in \{\pm 1\}$. Then
$$x^*(ix) = x^*_{\mathbb{R}}(ix) - i x^*_{\mathbb{R}}(i^2 x) = i x^*_{\mathbb{R}}(x) - i^2 x^*_{\mathbb{R}}(ix) = i x^*(x).$$
Finally, $|x^*(x)| = \theta x^*(x)$ for some $\theta \in \mathbb{C}$ with $|\theta| = 1$, and so
which shows that $\|x^*\| = \|y^*\|$ and hence the complex case of the theorem.
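The computation with $x^*_{\mathbb{R}}$ in the proof rests on the fact that a complex-linear functional can be rebuilt from its real part via $x^*(x) = x^*_{\mathbb{R}}(x) - i x^*_{\mathbb{R}}(ix)$. The following sketch checks this identity on $\mathbb{C}^2$ for a hypothetical functional; the names real_part, complexify, and ell are not from the text.

```python
def real_part(ell):
    """The real-linear functional x -> Re ell(x)."""
    return lambda x: ell(x).real


def complexify(ell_R):
    """Rebuild x*(x) = x_R*(x) - i x_R*(i x) from a real-linear functional on C^n."""
    def rebuilt(x):
        ix = [1j * xi for xi in x]
        return ell_R(x) - 1j * ell_R(ix)
    return rebuilt


def ell(x):
    """Hypothetical complex-linear functional on C^2 (not from the text)."""
    return (2 - 1j) * x[0] + 3j * x[1]


rebuilt = complexify(real_part(ell))
x = [1 + 2j, -0.5 + 4j]
print(ell(x), rebuilt(x))
```

The two printed values coincide, and the rebuilt functional is again complex-linear, which is the property verified by the displayed computation of $x^*(ix)$.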
The reader should compare the following result for a general normed vector
space to the characterization of the closed linear hull in Hilbert spaces (see
Corollary 3.26).
7.1 The Hahn–Banach Theorem and its Consequences 213
Proof. The inclusion of the left-hand side in the right-hand side is clear
since $\ker(x^*)$ is a closed subspace for any $x^* \in X^*$. Suppose that $x_0 \notin \overline{\langle S\rangle}$,
and let $Y = \langle x_0\rangle + \langle S\rangle$. Then the functional $y^*$ defined by $y^*(\alpha x_0 + z) = \alpha$
for $z \in \langle S\rangle$ is bounded. For otherwise there would exist, for every $n \ge 1$,
some scalar $\alpha_n \neq 0$ and some $z_n \in \langle S\rangle$ with $|\alpha_n| > n\|\alpha_n x_0 + z_n\|$, which
implies that
$$\bigl\|x_0 + \tfrac{1}{\alpha_n} z_n\bigr\| \le \tfrac{1}{n},$$
for all x ∈ X.
Exercise 7.8. (a) Prove that if the dual space X ∗ of a real normed vector space X is
strictly convex (see Definition 2.17), then the Hahn–Banach extension of a continuous
functional on a subspace to all of X is unique.
(b) Give an explicit example of a situation in which the extension defined by the Hahn–
Banach theorem is not unique.
\[ \|x\| = \max_{x^* \in X^*,\ \|x^*\| \le 1} |x^*(x)| \tag{7.4} \]
\[ \imath : X \longrightarrow X^{**} = (X^*)^*, \qquad x \longmapsto \imath(x) \]
from $X$ into the bidual of $X$ that sends $x \in X$ to the linear functional $\imath(x)$
defined by $\imath(x)(x^*) = x^*(x)$ for $x^* \in X^*$, is an isometric embedding.
214 7 Dual Spaces
As we will see in the next section, some Banach spaces which we have
already encountered are reflexive, but some are not.
Proof of Corollary 7.9. By definition, $|x^*(x_0)| \le \|x^*\|\|x_0\| \le \|x_0\|$ for
all $x^* \in X^*$ with $\|x^*\| \le 1$ and $x_0 \in X$. Moreover, we may apply Corollary 7.4
to obtain some functional $x^* \in X^*$ of norm one with $x^*(x_0) = \|x_0\|$, which
proves (7.4). Now notice that
Exercise 7.12. Let X be a normed vector space and suppose that the dual X ∗ is separable.
Show that X is also separable. In particular, if X is separable but X ∗ is not, then X cannot
be reflexive. Find an example of a Banach space that is not reflexive for that reason.
†
The description of the closed linear hull in Corollary 7.6 can be used as a
spanning criterion: a subset S of a Banach space X spans X (that is, has X
as its closed linear hull) if and only if there is no non-zero x∗ ∈ X ∗ with the
property that S ⊆ ker(x∗ ).
This is a powerful tool, surprisingly often even without a complete descrip-
tion of the dual space. The following result generalizes the Stone–Weierstrass
theorem on the unit interval. The full result also shows the converse, so the
divergence characterises the density.
† The result of this subsection will not be needed in the remainder of the book.
Proof. Let $Y$ be the closed linear hull of the set $\{1, p_{n_1}, p_{n_2}, \dots\}$ in $C([0,1])$.
By Corollary 7.6 we have to show that if $\ell \in C([0,1])^*$ has
for all $k \ge 1$, then $\ell = 0$. In fact, it is enough to show that if $\ell \in C([0,1])^*$
has (7.5) for all $k \ge 1$, then $\ell(p_n) = 0$ for all integers $n \ge 1$. This is because
Corollary 7.6 then shows that $\mathbb{C}[x] \subseteq Y$, after which the Stone–Weierstrass
theorem (Theorem 2.40) may be applied to give $Y = C([0,1])$. So assume
that $\ell \in C([0,1])^*$ satisfies (7.5) for all $k \ge 1$, and assume also that there is
some $n \in \mathbb{N}$ with $\ell(p_n) \ne 0$. We will show that this implies $\sum_{k=1}^{\infty} \frac{1}{n_k} < \infty$.
For $\zeta \in \mathbb{C}$ with $\Re(\zeta) > 0$, we define $p_\zeta(t) = t^\zeta$ for $t \in [0,1]$ (with the
convention that $0^\zeta = 0$). This defines the function $p_\zeta \in C([0,1])$ satisfying
$\|p_\zeta\| \le 1$. Moreover, we have
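\[ \lim_{\mathbb{C} \ni \delta \to 0} \frac{t^{\zeta+\delta} - t^{\zeta}}{\delta} = \lim_{\mathbb{C} \ni \delta \to 0} t^{\zeta}\,\frac{t^{\delta} - 1}{\delta} = t^{\zeta} \log t, \]
for all $t \in [0,1]$ and in fact the convergence is with respect to the $\|\cdot\|_\infty$ norm
(use the complex version of the mean value theorem to check this claimed
uniformity).
Now define $f(\zeta) = \ell(p_\zeta)$ for $\zeta \in \mathbb{C}$ with $\Re(\zeta) > 0$, so $|f(\zeta)| \le \|\ell\|$.
Furthermore, $f$ is analytic for $\Re(\zeta) > 0$ since
\[ \lim_{\mathbb{C} \ni \delta \to 0} \frac{f(\zeta+\delta) - f(\zeta)}{\delta} = \ell\big(t^{\zeta} \log t\big) \]
exists by the above observation regarding uniform convergence. Finally, we
have $f(n_k) = 0$ for $k \ge 1$ by assumption.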
Now define the Blaschke product (22)
\[ B_K(\zeta) = \prod_{k=1}^{K} \frac{\zeta - n_k}{\zeta + n_k}, \]
\[ |B_K(\zeta)| \longrightarrow 1 \]
for $\|\zeta\| > R(\delta)$ and $\Re(\zeta) \le \varepsilon(\delta)$, the positive quantities $R(\delta)$ and $\varepsilon(\delta)$ depend
on $\delta$, and $\delta > 0$ is arbitrary.
Applying the maximum principle for $g_K$ on the half-disk
in Figure 7.1, the function $|g_K|$ must attain its maximum on the boundary
of the half-circle. As we have (7.6) on that boundary, we obtain
\[ \|g_K\| \le \|\ell\|(1 + \delta) \]
first on the half-disk and, by decreasing $\varepsilon(\delta)$ and increasing $R(\delta)$, on all of
the right half-space. As $\delta > 0$ was arbitrary, we obtain $\|g_K\|_\infty \le \|\ell\|$.
[Figure 7.1: the half-disk; axis labels 0 and ε.]
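Recall that $n$ was chosen so that $f(n) = \ell(p_n) \ne 0$. For $\zeta = n$ this shows
that
\[ \prod_{k=1}^{K} \left| 1 + \frac{2n}{n_k - n} \right| = \prod_{k=1}^{K} \left| \frac{n + n_k}{n - n_k} \right| = |B_K(n)|^{-1} \le \frac{\|\ell\|}{|f(n)|} < \infty, \]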
meaning that we have found an upper bound for the product on the left-hand
side independent of K. Notice that nk > n for all but finitely many k ∈ N.
7.2 Banach Limits, Amenable Groups, and the Banach–Tarski Paradox 217
Taking the logarithm and using the fact that $x \ll \log(1 + x)$ for all $x \in [0,1]$,
it follows that the sum $\sum_{k=1}^{K} \frac{1}{n_k - n}$ has an upper bound independent of $K$.
Multiplying the series term-by-term with $\frac{n_k - n}{n_k}$ (and noticing that $\frac{n_k - n}{n_k} \to 1$
as $k \to \infty$), it follows that $\sum_{k=1}^{\infty} \frac{1}{n_k} < \infty$, as claimed. This contradicts our
assumption, and the theorem follows.
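The role of the product in this proof can be illustrated numerically. The following is a minimal sketch, not part of the proof: it evaluates the logarithm of the partial products $\prod_k |(n+n_k)/(n-n_k)|$ for two hypothetical exponent sequences, $n_k = k^2+1$ (where $\sum 1/n_k$ converges) and $n_k = 2k$ (where it diverges), with the illustrative choice $n = 1$.

```python
import math

def partial_log_product(exponents, n):
    """log of prod_k |(n + n_k) / (n - n_k)| over the given exponents n_k != n."""
    return sum(math.log((n + nk) / abs(n - nk)) for nk in exponents if nk != n)

n = 1  # illustrative exponent that does not occur among the n_k below
for K in (10, 100, 1000, 10000):
    squares = [k * k + 1 for k in range(1, K + 1)]  # sum of 1/n_k converges
    evens = [2 * k for k in range(1, K + 1)]        # sum of 1/n_k diverges
    print(K, partial_log_product(squares, n), partial_log_product(evens, n))
```

For the even exponents the product telescopes to $2K+1$, so it is unbounded in $K$, while for $n_k = k^2+1$ the partial products stay bounded, in line with the dichotomy used above.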
On the space $c(\mathbb{N}) = \{(x_n)_{n \in \mathbb{N}} \in \ell^\infty(\mathbb{N}) \mid \lim_{n\to\infty} x_n \text{ exists}\}$ we have the
natural linear functional $\lim$ defined by
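Proof. We work initially over $\mathbb{R}$. Let $c(\mathbb{N}) \subseteq \ell^\infty(\mathbb{N})$ and $\lim \in c(\mathbb{N})^*$ be as
given before the statement of the corollary. Notice that $\|\lim\| = 1$ since
\[ \Big| \lim_{n\to\infty} a_n \Big| \le \sup_{n \ge 1} |a_n|. \]
Moreover,
\[ \operatorname{LIM}\big((a_n)_n - (a_{n+1})_n\big) = L\Big(a_1 - a_2, \frac{a_1 - a_3}{2}, \frac{a_1 - a_4}{3}, \dots\Big) = 0, \]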
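The telescoping behind the last display can be checked directly. The sketch below assumes, as the display suggests, that the functional is applied to the Cesàro averages of the sequence; the sample sequence $a_n = (-1)^n$ and the helper name `cesaro_means` are illustrative only.

```python
from fractions import Fraction

def cesaro_means(seq):
    """Return the averages (1/N) * (seq_1 + ... + seq_N) for N = 1, 2, ..."""
    means, total = [], Fraction(0)
    for N, value in enumerate(seq, start=1):
        total += value
        means.append(total / N)
    return means

# a_1, ..., a_201: a bounded sequence without a limit (illustrative choice)
a = [Fraction((-1) ** n) for n in range(1, 202)]
# the difference sequence a_n - a_{n+1}
diff = [a[i] - a[i + 1] for i in range(len(a) - 1)]
means = cesaro_means(diff)
# Telescoping: the N-th average equals (a_1 - a_{N+1}) / N, which tends to 0
# for every bounded sequence.
```

Since $|a_1 - a_{N+1}| \le 2\sup_n |a_n|$, the averages form a null sequence, which is why the value $0$ is forced.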
in 1949, perhaps as a pun, as these groups are ‘easy to work with’ and hence
a-men-able (US) / a-mean-able (UK), and are groups that ‘admit a mean’,
hence a-mean-able.
Example 7.17. Corollary 7.14 together with the next lemma shows $G = \mathbb{Z}$
is amenable. Moreover, any invariant mean on $\mathbb{Z}$ (there will turn out to be
many, see Exercise 7.25) will have some reassuringly natural properties. For
example, if $E = \{n \in \mathbb{Z} \mid n \text{ is even}\}$, then for any invariant mean $m$ we must
have $m(E) = \frac{1}{2}$ since $2m(E) = m(E) + m(E + 1) = m(\mathbb{Z}) = 1$.
and check that LIM is well-defined, linear, bounded of norm one, positive,
and left-invariant.
Next we show that the class of amenable groups is closed under many
natural operations that allow us to give more examples.
so LIMH (ah0 ) = LIMG ((aG )h0 ) = LIMG (aG ) = LIMH (a), and Lemma 7.18
shows that H is amenable.
For (b) we again use Lemma 7.18, allowing us to work with function-
als on the space of bounded functions on the groups involved. Assume first
that G is amenable, so that H is amenable by (a). For a ∈ ℓ∞ (G/H),
define an element aG ∈ ℓ∞ (G) by setting aG (g) = a(gH). Just as in (a)
we define LIMG/H (a) = LIMG (aG ) to obtain a left-invariant positive func-
tional of norm one on ℓ∞ (G/H), showing that G/H is amenable.
For the converse, assume that H and G/H are both amenable, and
write LIMH and LIMG/H for the associated functionals. For a ∈ ℓ∞ (G)
define the bounded function a on G by
Even though the above shows that many groups are amenable, it is also
easy to give an example of a group that is not amenable.
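\[ G = S_\alpha \sqcup S_{\alpha^{-1}} \sqcup S_\beta \sqcup S_{\beta^{-1}} \sqcup \{e\} \]
However,
\[ \alpha^{-1} S_\alpha = S_\alpha \sqcup S_\beta \sqcup S_{\beta^{-1}} \sqcup \{e\}, \]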
so
for all h ∈ G.
Lemma 7.24. If a countable group has a Følner sequence, then it is amenable.
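Proof. Let $\operatorname{LIM}_{\mathbb{N}}$ be the Banach limit from Corollary 7.14 and let $(F_n)$ be
a Følner sequence. Then for any $a \in \ell^\infty(G)$ we can define
\[ \operatorname{LIM}_G(a) = \operatorname{LIM}_{\mathbb{N}}\Big( \frac{1}{|F_n|} \sum_{g \in F_n} a(g) \Big), \]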
showing left-invariance.
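For $G = \mathbb{Z}$ the averaging in this proof can be made concrete. The sketch below assumes the usual Følner condition $|F_n \,\triangle\, (h + F_n)| / |F_n| \to 0$ for every $h$ (the precise form of the definition is taken as an assumption here) and uses the intervals $F_n = [0, n]$; the function names are illustrative.

```python
def folner_ratio(F, h):
    """|F symmetric-difference (h + F)| / |F| for a finite set F of integers."""
    shifted = {h + g for g in F}
    return len(F ^ shifted) / len(F)

def average_over(F, a):
    """The average (1/|F|) * sum of a(g) over g in F."""
    return sum(a(g) for g in F) / len(F)

def is_even(g):
    return 1 if g % 2 == 0 else 0

for n in (10, 100, 1000):
    F = set(range(0, n + 1))  # F_n = [0, n] in Z
    print(n, folner_ratio(F, 3), average_over(F, is_even))
```

The ratio for the shift $h = 3$ equals $6/(n+1)$, and the averages of the indicator of the even integers approach $\frac{1}{2}$, matching the value $m(E) = \frac{1}{2}$ from Example 7.17.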
Exercise 7.25. Show that the invariant means on $\mathbb{Z}$ constructed by using the Følner
sequences defined by $F_n^{(1)} = [0, n]$, $F_n^{(2)} = [-n, 0]$, and $F_n^{(3)} = [n^2, n^2 + n]$ are all different.
Can you construct infinitely many different invariant means on $\mathbb{Z}$?
Exercise 7.26. Show that any countable abelian group is amenable.
Exercise 7.27. Prove that the discrete Heisenberg group
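\[ H = \left\{ \begin{pmatrix} 1 & k & \ell \\ 0 & 1 & m \\ 0 & 0 & 1 \end{pmatrix} \;\middle|\; k, \ell, m \in \mathbb{Z} \right\} \]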
Exercise 7.28. Show that SL2 (Z) is not amenable. You may use the fact that the
group PSL2 (Z) = SL2 (Z)/{±I} is isomorphic to the free product of Z/2Z and Z/3Z.
The following surprising consequence of the axiom of choice was one of the
original motivations for the study of amenable groups and of measurable sets.
[Figure: the coordinate axes x, y, z with the rotations a and b.]
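and similarly define $\tilde{b}_+ = 5b$ and $\tilde{b}_- = 5b^{-1}$. For some part of the proof we will
be working over the field $\mathbb{F}_5$ (that is, working modulo 5). The matrices arising
can all be viewed as linear transformations of the vector space $\mathbb{Z}^3/5\mathbb{Z}^3 \cong \mathbb{F}_5^3$.
We want to study how they act on the vectors
\[ v = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix},\quad w_\alpha = \begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix},\quad w_{\alpha^{-1}} = \begin{pmatrix} 1 \\ 2 \\ 0 \end{pmatrix},\quad w_\beta = \begin{pmatrix} 0 \\ 2 \\ 1 \end{pmatrix},\quad w_{\beta^{-1}} = \begin{pmatrix} 0 \\ 1 \\ 2 \end{pmatrix}. \]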
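but
\[ \tilde{a}_+ w_{\alpha^{-1}} = \begin{pmatrix} 3 & -4 & 0 \\ 4 & 3 & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \\ 0 \end{pmatrix} \equiv \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}. \]
The same applies to the other matrices, which in summary means that each
of the matrices $\tilde{a}_+, \tilde{a}_-, \tilde{b}_+, \tilde{b}_-$ has its own non-zero eigenvector in $\mathbb{F}_5^3$, maps
the eigenvector of the matrix with the same symbol but opposite sign to the
zero vector, but maps $v$ and the other three to a multiple of its eigenvector.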
Suppose now that $\gamma$ is a reduced word of length $n \ge 1$ in $F_2$, the free
group with generators $\alpha$ and $\beta$ (that is, a finite string of symbols chosen
from $\alpha, \alpha^{-1}, \beta, \beta^{-1}$ with the property that no symbol is immediately followed
by its inverse). Define a homomorphism $\phi : F_2 \to \mathrm{SO}_3(\mathbb{R})$ by defining
it on the generators by $\phi(\alpha) = a$, $\phi(\beta) = b$ and then extending to $F_2$ using
the homomorphism property, and use this to define $\tilde{\phi}(\gamma) = 5^n \phi(\gamma)$. Equivalently,
$\tilde{\phi}(\gamma) \in \mathrm{Mat}_{33}(\mathbb{Z})$ may be obtained by multiplying $\tilde{a}_+, \tilde{a}_-, \tilde{b}_+, \tilde{b}_-$ in the
order and multiplicities corresponding to the appearance of $\alpha, \alpha^{-1}, \beta, \beta^{-1}$ in
the word $\gamma$. As the word $\gamma$ is reduced, we see by induction on $n$ and the
calculations above that
\[ \tilde{\phi}(\gamma) v \in \mathbb{Z}^3 \]
modulo 5 is a non-zero multiple of $w_\eta$ where $\eta \in \{\alpha, \alpha^{-1}, \beta, \beta^{-1}\}$ is the
left-most symbol of $\gamma$. In particular, $\tilde{\phi}(\gamma)$ is not divisible by 5 and $\phi(\gamma) \ne I$
because $\tilde{\phi}(\gamma) \ne 5^n I$. Thus $\phi$ is injective and so $\operatorname{im} \phi = \langle a, b \rangle \cong F_2$.
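The induction in this argument can be checked by brute force for short words. The sketch below works with the matrices reduced modulo 5. The matrix for $\tilde{a}_+$ is the one displayed above; the matrices used for $\tilde{b}_\pm$ are an assumption (the rotation about the $x$-axis by the same angle), chosen because they are consistent with the vectors $w_\beta$ and $w_{\beta^{-1}}$. The letters `A` and `B` stand for $\alpha^{-1}$ and $\beta^{-1}$.

```python
from itertools import product

# 5a, 5a^{-1}, 5b, 5b^{-1} reduced modulo 5. The entries for b are an
# assumption (rotation about the x-axis with the same angle as a).
M = {
    "a": ((3, -4, 0), (4, 3, 0), (0, 0, 0)),
    "A": ((3, 4, 0), (-4, 3, 0), (0, 0, 0)),
    "b": ((0, 0, 0), (0, 3, -4), (0, 4, 3)),
    "B": ((0, 0, 0), (0, 3, 4), (0, -4, 3)),
}
W = {"a": (2, 1, 0), "A": (1, 2, 0), "b": (0, 2, 1), "B": (0, 1, 2)}
V = (1, 1, 1)
INVERSE = {"a": "A", "A": "a", "b": "B", "B": "b"}

def apply(m, x):
    """Matrix-vector product over the field with five elements."""
    return tuple(sum(m[i][j] * x[j] for j in range(3)) % 5 for i in range(3))

def is_reduced(word):
    return all(INVERSE[s] != t for s, t in zip(word, word[1:]))

def image_of_v(word):
    x = V
    for s in reversed(word):  # the right-most symbol acts first
        x = apply(M[s], x)
    return x

def is_nonzero_multiple(x, w):
    return any(x == tuple(c * wi % 5 for wi in w) for c in range(1, 5))

def check(max_len):
    """True if the image of v is a non-zero multiple of w_eta for every
    reduced word of length at most max_len with left-most symbol eta."""
    return all(
        is_nonzero_multiple(image_of_v(word), W[word[0]])
        for n in range(1, max_len + 1)
        for word in product("aAbB", repeat=n)
        if is_reduced(word)
    )
```

Calling `check(6)` runs over all reduced words of length at most six; this is only a sanity check of the inductive step under the stated assumption on $b$, not a substitute for the proof.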
In the proof of Theorem 7.29 the free subgroup $\langle a, b \rangle < \mathrm{SO}_3(\mathbb{R})$ from
Proposition 7.31 will play a critical role. It will be convenient to define two
subsets $B_1, B_2$ of $\mathbb{R}^3$ to be equivalent, written $B_1 \sim B_2$, if they can be
decomposed as $B_1 = P_1 \sqcup \cdots \sqcup P_n$ and $B_2 = Q_1 \sqcup \cdots \sqcup Q_n$ into finitely many
disjoint subsets such that $Q_k$ is the image of $P_k$ under some isometric motion
for $k = 1, \dots, n$.
Proof of Theorem 7.29. Let $B = B_1^{\mathbb{R}^3}$ be the closed unit ball in $\mathbb{R}^3$.
Step 1. We claim that $B \sim B \setminus \{0\}$ by using a ‘Hilbert’s Hotel’ argument.(24)
To see this, let $x_0 = (\frac{1}{2}, 0, 0)^t$ and let $\gamma : \mathbb{R}^3 \to \mathbb{R}^3$ be an irrational rotation
(meaning that $\gamma^n = I$ for some $n \in \mathbb{Z}$ implies that $n = 0$) about the point $x_0$
in the $x$-$y$ plane extended trivially to a rotation about the line parallel to
the $z$-axis through $x_0$, so that the orbit $D = \{\gamma^n(0) \mid n \in \mathbb{N}_0\} \subseteq B$ is infinite.
Therefore
\[ B = (B \setminus D) \sqcup D \sim (B \setminus D) \sqcup \gamma D = B \setminus \{0\}, \]
proving the claim.
Now let $H < \mathrm{SO}_3(\mathbb{R})$ be the free subgroup constructed in Proposition 7.31.
Since $H$ is countable and every non-trivial rotation in $\mathrm{SO}_3(\mathbb{R})$ has a
single one-dimensional eigenspace with eigenvalue 1, we can find a countable
union $E$ of lines through the origin such that $HE = E$ and with the property
that no vector in $B \setminus E$ is fixed by a non-trivial element of $H$.
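Step 2. By using countably many Hilbert Hotel arguments at once we claim
that $B \setminus \{0\} \sim B \setminus E$. To see this, notice that the set $\mathbb{S}^2 \cap E$ is countable and
so the set $P$ of pairs of vectors $v_1, v_2 \in \mathbb{S}^2 \cap E$ with $v_1 \ne v_2$ is also countable.
Therefore
\[ W = \{ w \in \mathbb{R}^3 \mid w \perp v_1 - v_2 \text{ for some } (v_1, v_2) \in P \} \]
\[ \langle v_1, x_1 \rangle = \langle \gamma^m v_1, x_1 \rangle = \langle \gamma^n v_2, x_1 \rangle = \langle v_2, x_1 \rangle \]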
as claimed.
Step 3. We claim that $B \setminus E \sim (B \setminus E) \sqcup ((B \setminus E) + (3, 0, 0)^t)$. This is clearly the
main step in the argument, and it is here that we will use the fact that $H$ is
a free group, and in particular the resulting decomposition
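which can be found by a direct application of the axiom of choice (and will
not be measurable). We now decompose $B \setminus E$ into the four disjoint sets
\[ B_1 = S_a C \sqcup \bigsqcup_{n \ge 0} a^{-n} C, \qquad B_2 = S_{a^{-1}} C \setminus \bigsqcup_{n \ge 1} a^{-n} C, \qquad B_3 = S_b C \]
and $B_4 = S_{b^{-1}} C$.
Applying $a$ to $B_2$ we obtain
\[ a B_2 = \big(S_{a^{-1}} C \sqcup C \sqcup S_b C \sqcup S_{b^{-1}} C\big) \setminus \bigsqcup_{n \ge 0} a^{-n} C = B_2 \sqcup B_3 \sqcup B_4 \sim B_2 \sqcup \big((B_3 \sqcup B_4) + (3, 0, 0)^t\big), \]
Leaving $B_1$ and $B_3$ untouched and taking the union this proves the claim in
Step 3.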
Applying Step 2 and Step 1 (twice each) backwards, the theorem follows.
we use this notation and terminology we do not assume that Y is indeed the
whole dual to X or vice versa. The reader may start with the following as a
warm-up exercise on how dual spaces may be found.
Exercise 7.33. (a) Recall that $c_0(\mathbb{N}) = \{(a_n) \mid \lim_{n\to\infty} a_n = 0\} \subseteq \ell^\infty(\mathbb{N})$ is a Banach
space with respect to the supremum norm $\|\cdot\|_\infty$. Show that there is an isometric
isomorphism $(c_0(\mathbb{N}))^* \cong \ell^1(\mathbb{N})$, where the dual pairing is given by
\[ \langle (a_n), (b_n) \rangle = \sum_{n=1}^{\infty} a_n b_n \]
which satisfies
\[ \|\phi(g)\|_{\mathrm{op}} = \sup_{\|f\|_1 \le 1} \Big| \int_X f g \, d\mu \Big| \le \sup_{\|f\|_1 \le 1} \int_X |f||g| \, d\mu \le \|g\|_\infty. \]
For the converse we assume that $g \ne 0$, let $\varepsilon \in (0, \|g\|_\infty)$ and choose a
measurable set $A \subseteq \{x \in X \mid |g(x)| > \|g\|_\infty - \varepsilon\}$ with $\mu(A) > 0$ (which is
possible by definition of the essential supremum) and with $\mu(A) < \infty$ (which
is possible since $\mu$ is σ-finite). Now define
7.3 The Duals of Lpµ (X) 229
\[ f = \frac{1}{\mu(A)} \mathbb{1}_A \frac{|g(x)|}{g(x)}, \]
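\[ \|f\|_1 \le \mu(X)^{1/2} \|f\|_2 \]
so that $f = \mathbb{1}_A \frac{|g|}{g} \in L^2_\mu(X) \subseteq L^1_\mu(X)$ and $\|f\|_1 = \mu(A)$. If $\mu(A) > 0$ then
\[ \|\ell\|_{\mathrm{op}} \, \mu(A) < \int_A |g| \, d\mu = \int_X f g \, d\mu = |\ell(f)| \le \|\ell\|_{\mathrm{op}} \, \mu(A) \]
gives a contradiction. Thus $\mu(A) = 0$ and so $\|g\|_\infty \le \|\ell\|_{\mathrm{op}}$. Since $\ell$ and $\phi(g)$
agree on the dense subset $L^2_\mu(X) \subseteq L^1_\mu(X)$, we have $\ell = \phi(g)$, as required.
If $\mu$ is σ-finite with $X = \bigsqcup_{n=1}^{\infty} Y_n$ and $\mu(Y_n) < \infty$, then we may apply the
argument above to $\ell|_{L^1_\mu(Y_n)}$ to find some $g_n \in L^\infty_\mu(Y_n)$ with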
contains all simple functions in its closure, so that we have $\overline{V} = L^1_\mu(X)$. By
construction $\ell$ and $\phi(g)$ coincide on $V$, so once again $\ell = \phi(g)$, as required.
for $f \in L^p_\mu(X)$ and $g \in L^q_\mu(X)$. The operator norm of the functional determined
by $g$ is precisely $\|g\|_q$.
Proof. For $f \in L^p_\mu(X)$ and $g \in L^q_\mu(X)$ with $\frac{1}{p} + \frac{1}{q} = 1$ we have
by the Hölder inequality (Theorem B.15). It follows that the linear functional
defined by $g$ on $L^p_\mu(X)$ is bounded, with norm less than or equal to $\|g\|_q$. If
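we set
\[ f = \begin{cases} \frac{|g|}{g}\,|g|^{q/p} & \text{if } g \ne 0, \\ 0 & \text{if } g = 0 \end{cases} \]
then
\[ \|f\|_p = \Big( \int |g|^q \, d\mu \Big)^{1/p} = \|g\|_q^{q/p} < \infty \]
and
\[ \langle f, g \rangle = \int_X |g|^{1 + \frac{q}{p}} \, d\mu = \int_X |g|^q \, d\mu = \|f\|_p \|g\|_q \]
shows that the norm of the functional $\phi(g) \in (L^p_\mu(X))^*$ determined by $g$ must
be equal to $\|g\|_q$. It remains to show that every bounded linear functional $\ell$
in $(L^p_\mu(X))^*$ is determined as above by some $g \in L^q_\mu(X)$.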
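The extremal function $f$ used in this computation can be tested on a finite measure space. The following sketch assumes a four-point space with hypothetical weights $\mu(\{x\})$ and real-valued $g$; the names `p_norm` and `extremal_f` are illustrative.

```python
def p_norm(values, weights, p):
    """The L^p norm with respect to the weighted counting measure."""
    return sum(w * abs(v) ** p for v, w in zip(values, weights)) ** (1.0 / p)

def extremal_f(g, p):
    """f = (|g|/g) * |g|^(q/p) where g != 0, and f = 0 where g = 0."""
    q = p / (p - 1.0)
    return [0.0 if x == 0 else (abs(x) / x) * abs(x) ** (q / p) for x in g]

p = 3.0
q = p / (p - 1.0)                # Hoelder conjugate exponent
weights = [0.5, 1.0, 2.0, 0.25]  # mu({x}) on a four-point space (hypothetical)
g = [1.0, -2.0, 0.0, 3.0]
f = extremal_f(g, p)
pairing = sum(w * a * b for a, b, w in zip(f, g, weights))
# pairing agrees with p_norm(f, weights, p) * p_norm(g, weights, q)
```

For this choice of $f$ the Hölder inequality holds with equality, which is what shows that the functional determined by $g$ has norm exactly $\|g\|_q$.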
Let ℓ ∈ (Lpµ (X))∗ . Replacing ℓ by ℜ(ℓ), respectively by ℑ(ℓ) if necessary,
we may restrict to real-valued functions on X and to R-linear functionals, as
the complex case then follows by putting together the functions associated
to ℜ(ℓ) and ℑ(ℓ).
So we work over the reals and define
for any measurable set $B \subseteq X$. Notice that if $\ell$ were defined by $g$, then $\nu^+(B)$
would be given by $\nu^+(B) = \int_A g \, d\mu = \int_B g^+ \, d\mu$ for $A = \{x \in B \mid g(x) > 0\}$.
Thus for a general $\ell$ we would like to show that $\nu^+ \ll \mu$ is an absolutely
continuous measure on $X$ (which then will give us $g^+$ as a Radon–Nikodym
derivative). Clearly $\nu^+(B_2) \ge \nu^+(B_1) \ge 0$ for measurable $B_1 \subseteq B_2 \subseteq X$. For
measurable disjoint $B_1, B_2 \subseteq X$ and $A_1 \subseteq B_1$, $A_2 \subseteq B_2$ as in the definition
of $\nu^+$, we have
and so $\nu^+$ is a measure on $X$.
Finally, if B ⊆ X has finite µ-measure, then
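where $(X_n)$, with $X_n \subseteq X_{n+1}$ for all $n \ge 1$, is a sequence of sets with finite
measure and $X = \bigcup_{n=1}^{\infty} X_n$. Notice that $g_n^+ \nearrow g^+$ as $n \to \infty$. Now let $h \ge 0$
be a simple function of the form $h = \sum_{j=1}^{m} \beta_j \mathbb{1}_{B_j}$, where $\beta_j > 0$ for all $j$ and
with the sets $B_j$ measurable and pairwise disjoint. Then
\[ \int h g_n^+ \, d\mu \le \int h g^+ \, d\mu = \sum_{j=1}^{m} \beta_j \sup\big\{ \ell(\mathbb{1}_{A_j}) \mid A_j \subseteq B_j \big\} = \sup\Big\{ \ell\Big( \sum_{j=1}^{m} \beta_j \mathbb{1}_{A_j} \Big) \;\Big|\; A_j \subseteq B_j \Big\}, \]
and so
\[ \int h g_n^+ \, d\mu \le \|\ell\|_{\mathrm{op}} \|h\|_p. \]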
Using monotone convergence this estimate extends to all positive $h \in L^p_\mu(X)$.
Applying the argument (for $g_n^+ \in L^q_\mu(X)$) from the beginning of the proof this
shows that $\|g_n^+\|_q \le \|\ell\|_{\mathrm{op}}$ and letting $n \to \infty$ also shows that $\|g^+\|_q \le \|\ell\|_{\mathrm{op}}$
by monotone convergence.
Now define
\[ \ell^- = \phi(g^+) - \ell \in (L^p_\mu(X))^* \]
where $\phi(g^+)$ is the functional determined by $g^+ \in L^q_\mu(X)$. Notice that for
all $B \subseteq X$ measurable with $\mu(B) < \infty$ we have
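\[ \begin{aligned} \ell^-(\mathbb{1}_B) &= \int \mathbb{1}_B g^+ \, d\mu - \ell(\mathbb{1}_B) \\ &= \sup\{ \ell(\mathbb{1}_A - \mathbb{1}_B) \mid A \text{ measurable}, A \subseteq B \} \\ &= \sup\{ -\ell(\mathbb{1}_C) \mid C \text{ measurable}, C \subseteq B \}, \end{aligned} \tag{7.10} \]
or equivalently
\[ \ell(\mathbb{1}_B) = \int \mathbb{1}_B (g^+ - g^-) \, d\mu = \int \mathbb{1}_B g \, d\mu \]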
†
The Riesz–Thorin interpolation theorem (also called the Riesz–Thorin con-
vexity theorem) bounds the norms of linear maps between Lp spaces. This can
be useful because certain Lp spaces have special properties making it easier
to understand properties of operators on them — this particularly applies to
the cases p = 1, 2, and ∞.
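\[ 1 \le q_0 < q < q_1 \le \infty. \]
Then
\[ L^{q_0}_\mu(X) \cap L^{q_1}_\mu(X) \subseteq L^q_\mu(X) \]
and $\|f\|_q \le \|f\|_{q_0}^{1-t} \|f\|_{q_1}^{t}$ for all $f \in L^{q_0}_\mu(X) \cap L^{q_1}_\mu(X)$, where $t \in (0, 1)$ is
determined by the relation $\frac{1}{q} = \frac{1-t}{q_0} + \frac{t}{q_1}$.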
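The norm inequality in this statement can be checked numerically on a finite measure space. This is a minimal sketch with hypothetical weights and values; `interpolation_bounds` is an illustrative helper that returns both sides of the inequality for a given $t$.

```python
def norm(values, weights, q):
    """The L^q norm with respect to the weighted counting measure."""
    return sum(w * abs(v) ** q for v, w in zip(values, weights)) ** (1.0 / q)

def interpolation_bounds(values, weights, q0, q1, t):
    """Return (||f||_q, ||f||_{q0}^(1-t) * ||f||_{q1}^t) with 1/q = (1-t)/q0 + t/q1."""
    q = 1.0 / ((1.0 - t) / q0 + t / q1)
    lhs = norm(values, weights, q)
    rhs = norm(values, weights, q0) ** (1.0 - t) * norm(values, weights, q1) ** t
    return lhs, rhs

values = [3.0, -1.0, 0.5, 2.0]   # hypothetical function values
weights = [1.0, 0.5, 2.0, 0.25]  # hypothetical point masses
for t in (0.1, 0.5, 0.9):
    print(t, interpolation_bounds(values, weights, 1.0, 4.0, t))
```

In each printed pair the first entry should not exceed the second, reflecting the log-convexity of $\frac{1}{q} \mapsto \log \|f\|_q$.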
† The results of this subsection conclude our discussion of Lp -spaces but will not be needed
in the remainder of the book.
Now suppose that $q_1 < \infty$. In this case the numbers $\frac{q_0}{(1-t)q}$ and $\frac{q_1}{tq}$ are
Hölder conjugate by definition of $t$. Let $f \in L^{q_0}_\mu(X) \cap L^{q_1}_\mu(X)$. Applying
Hölder’s inequality (Theorem B.15) gives
\[ \int |f|^q \, d\mu = \int |f|^{(1-t)q} |f|^{tq} \, d\mu \le \big\| |f|^{(1-t)q} \big\|_{q_0/(1-t)q} \, \big\| |f|^{tq} \big\|_{q_1/tq} = \|f\|_{q_0}^{(1-t)q} \|f\|_{q_1}^{tq}. \]
be a linear map such that $\|Tf\|_{q_0} \le M_0 \|f\|_{p_0}$ and $\|Tf\|_{q_1} \le M_1 \|f\|_{p_1}$ for
all $f \in L^{p_0}_\mu(X) \cap L^{p_1}_\mu(X)$ and some constants $M_0, M_1 > 0$. Then $T$ has
a linear extension to a linear space $D$ of (equivalence classes of) functions
on $X$ into the space $L^{q_0}_\nu(Y) + L^{q_1}_\nu(Y)$ with the following properties. If we
define $p_t$ and $q_t$ for any $t \in (0, 1)$ by $\frac{1}{p_t} = \frac{1-t}{p_0} + \frac{t}{p_1}$ and $\frac{1}{q_t} = \frac{1-t}{q_0} + \frac{t}{q_1}$ then
we have $L^{p_t}_\mu(X) \subseteq D$ and $\|Tf\|_{q_t} \le M_0^{1-t} M_1^{t} \|f\|_{p_t}$ for all $f \in L^{p_t}_\mu(X)$. The
conclusion also holds for $t = 0$ if $p_0 < \infty$ and for $t = 1$ if $p_1 < \infty$.
for every $\chi \in \widehat{G}$. For $f \in L^2(G)$ we have $a^{(f)} \in \ell^2(\widehat{G})$ and $\|a^{(f)}\|_2 = \|f\|_2$;
for $f \in L^1(G)$ we have $a^{(f)} \in \ell^\infty(\widehat{G})$ with $\|a^{(f)}\|_\infty \le \|f\|_1$, or formally we
have $p_0 = 2 = q_0$, $p_1 = 1$, $q_1 = \infty$, and $M_0 = M_1 = 1$. The above interpolation
theorem now implies that the map is also defined for functions $f \in L^p(G)$
with $p \in [1, 2]$, taking values in a certain $\ell^q(\widehat{G})$. A short calculation reveals
that in this case $q \in [2, \infty]$ is the Hölder conjugate of $p \in [1, 2]$.
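The resulting inequality can be tested in the simplest compact case $G = \mathbb{Z}/N\mathbb{Z}$. The sketch below assumes the normalized counting measure on $G$, the counting measure on the dual group, and coefficients $a^{(f)}_{\chi_j} = \frac{1}{N}\sum_x f(x) e^{-2\pi i jx/N}$; this normalization is an assumption chosen so that the two endpoint estimates hold with constants $1$.

```python
import cmath

def fourier_coefficients(f):
    """a_j = (1/N) * sum_x f(x) * exp(-2 pi i j x / N) for the group Z/N."""
    N = len(f)
    return [
        sum(f[x] * cmath.exp(-2j * cmath.pi * j * x / N) for x in range(N)) / N
        for j in range(N)
    ]

def lp_norm_haar(f, p):
    """L^p norm for the normalized counting (Haar probability) measure."""
    return (sum(abs(v) ** p for v in f) / len(f)) ** (1.0 / p)

def lq_norm_counting(a, q):
    """l^q norm for the counting measure on the dual group."""
    if q == float("inf"):
        return max(abs(v) for v in a)
    return sum(abs(v) ** q for v in a) ** (1.0 / q)

f = [1, 2 - 1j, 0, -3, 0.5j]  # an arbitrary function on Z/5 (illustrative)
a = fourier_coefficients(f)
print(lq_norm_counting(a, 3), lp_norm_haar(f, 1.5))
```

With $p = \frac{3}{2}$ and its Hölder conjugate $q = 3$, the first printed number should not exceed the second; for $p = q = 2$ the two norms coincide.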
Proof of Theorem 7.38 in the case $p_0 = p_1$. Set $D = L^{p_0}_\mu(X)$, and notice
that for $f \in L^{p_0}_\mu(X)$ we have $\|Tf\|_{q_0} \le M_0 \|f\|_{p_0}$ and $\|Tf\|_{q_1} \le M_1 \|f\|_{p_0}$.
Applying Proposition 7.37 gives the theorem in this case.
For the general case we will need the following result from complex analysis.
since $t \in [0, 1]$. For $t = \Re(z) = 0$ or $t = \Re(z) = 1$ this gives $|\phi_\varepsilon(z)| \le 1$ by
assumption on $\phi$. Moreover,
for sufficiently large $N_\varepsilon$ we see that $|\phi_\varepsilon(z)| \le 1$ for all $z \in S$. By (7.11) it
follows that
\[ |\phi(z)| \le |M_0^{1-z} M_1^{z}| = M_0^{1-t} M_1^{t} \]
for $z \in S$ with $t = \Re(z)$. If $M_0 = 0$ (or $M_1 = 0$) we may apply the argument
above with $M_0$ (or $M_1$) replaced by any $\delta > 0$ and obtain the lemma by
letting $\delta \to 0$.
Proof of Theorem 7.38 in the case $p_0 \ne p_1$. Our first goal is the inequality
\[ \|Tf\|_{q_t} \le M_0^{1-t} M_1^{t} \|f\|_{p_t} \tag{7.12} \]
for a fixed $t \in (0, 1)$ and all $f \in \Sigma_X$. Here $\Sigma_X$ denotes the space of simple
integrable functions on $X$ (and $\Sigma_Y$ is defined similarly). Then $\Sigma_X \subseteq L^p_\mu(X)$
for all $p \in [1, \infty]$ and in particular $T$ is defined on $\Sigma_X$ and satisfies
by the assumption in the theorem and Proposition 7.36. Assume for the
moment that $q_t \in (1, \infty]$. Then the Hölder conjugate $q_t'$ of $q_t$ belongs to $[1, \infty)$
and $\Sigma_Y$ is dense in $L^{q_t'}_\nu(Y)$. Fix some $f \in \Sigma_X$ and assume that
\[ \Big| \int (Tf) g \, d\nu \Big| \le M_0^{1-t} M_1^{t} \|f\|_{p_t} \|g\|_{q_t'} \tag{7.13} \]
for all $g \in \Sigma_Y$. Then the above and Propositions 7.34 and 7.36 imply (7.12).
The case $q_t = 1$ with $q_t' = \infty$ is only slightly different. Assume again (7.13)
and fix some measurable set $B \subseteq Y$ with $\nu(B) < \infty$. Then
is dense in $L^\infty_\nu(B)$ and as before (see also Corollary 7.9) we obtain
independent of $B$. Using the fact that $\nu$ is σ-finite, this again implies (7.12).
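For the proof of (7.13) it suffices to fix some $t \in (0, 1)$, some $f \in \Sigma_X$
with $\|f\|_{p_t} = 1$, and some $g \in \Sigma_Y$ with $\|g\|_{q_t'} = 1$. By definition, we may
express $f$ and $g$ as finite sums
\[ f = \sum_{j=1}^{m} c_j \mathbb{1}_{E_j}, \]
\[ \alpha(z) = (1 - z) p_0^{-1} + z p_1^{-1}, \qquad \beta(z) = (1 - z) q_0^{-1} + z q_1^{-1} \]
\[ g_z = \begin{cases} \sum_{k=1}^{n} |d_k|^{(1-\beta(z))/(1-\beta(t))} \arg(d_k) \mathbb{1}_{F_k} & \text{if } \beta(t) < 1; \\ g & \text{if } \beta(t) = 1. \end{cases} \]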
\[ S = \{ z \in \mathbb{C} \mid 0 \le \Re(z) \le 1 \}. \]
which is the quantity that we wish to estimate. The desired estimate will
follow from Lemma 7.40 once we establish its remaining assumptions.
Boundary estimate: Consider therefore $z = iu$ with $\Re(z) = 0$ and notice
that $\Re(\alpha(iu)) = p_0^{-1}$ and $\Re(1 - \beta(iu)) = 1 - q_0^{-1} = (q_0')^{-1}$. Since the
sets $E_1, \dots, E_m$, respectively $F_1, \dots, F_n$, are disjoint, this gives
and
\[ |g_{iu}| = \begin{cases} |g|^{\Re((1-\beta(iu))/(1-\beta(t)))} = |g|^{q_t'/q_0'} & \text{if } \beta(t) < 1; \\ |g| & \text{if } \beta(t) = 1. \end{cases} \]
Using the assumption on $T$ this gives
\[ T_{p_0}(f_0) - T_{p_0}(f_0') = T(f_0 - f_0') = T(f_1' - f_1) = T_{p_1}(f_1') - T_{p_1}(f_1). \]
and
\[ f^l = f \mathbb{1}_{X \setminus B} \in L^{p_t}_\mu(X) \cap L^{p_0}_\mu(X). \]
If we now choose a sequence $(f_n)$ in $\Sigma_X$ with $|f_n| \le |f|$ for all $n \ge 1$ and
with $f_n \to f$ pointwise as $n \to \infty$ then $f_n^s = f_n \mathbb{1}_B \to f^s$ in $L^{p_t}_\mu(X)$ and
in $L^{p_1}_\mu(X)$ by dominated convergence if $p_1 < \infty$. If however $p_1 = \infty$ then
we can choose the sequence $(f_n)$ of simple functions to also have $f_n^s \to f^s$
with respect to $\|\cdot\|_\infty$ as $n \to \infty$. Similarly, $f_n^l = f_n \mathbb{1}_{X \setminus B} \to f^l$ in $L^{p_t}_\mu(X)$
and in $L^{p_0}_\mu(X)$. Therefore, $T(f_n^s) \to T_{p_t}(f^s)$ in $L^{q_t}_\nu(Y)$ and $T(f_n^s) \to T_{p_1}(f^s)$
in $L^{q_1}_\nu(Y)$ as $n \to \infty$. Choosing a subsequence if necessary, the convergence
7.4 Riesz Representation: The Dual of C(X) 239
also holds pointwise almost everywhere, which gives $T_{p_t}(f^s) = T_{p_1}(f^s)$. The
same argument gives $T_{p_t}(f^l) = T_{p_0}(f^l)$, and $T_{p_t}(f) = T(f)$ follows.
Exercise 7.41. Show that $t \mapsto \log \|T_{p_t}\|$ is convex for $t \in (0, 1)$, where
Exercise 7.42. Let $G$ be a locally compact, σ-compact, metrizable, abelian group. Fix
some $p \in [1, \infty)$ with Hölder conjugate $q$ and some $F \in L^p(G)$. Show (or recall) that
\[ f * F(x) = \int f(t) F(x - t) \, dm_G(t) \]
The next result is useful in many ways. It will allow us to completely de-
scribe C(X)∗ in Section 7.4.5, but it is more often used directly in the form
presented here.
Exercise 7.45. Let X be a σ-compact, locally compact metric space. Let µ be a locally
finite measure on X. Show that µ is regular, meaning that
We will prove Theorem 7.44 in several steps, first showing the claimed
uniqueness of the measure, then showing existence in the totally disconnected
compact case, then the compact case and finally the general case.
7.4.1 Uniqueness
for all $f \in C_c(X)$. This implies that $\mu$ and $\nu$ are locally finite, since for every
compact set $K \subseteq X$ there exists some function $f \in C_c(X)$ with $f \ge \mathbb{1}_K$ by
Urysohn’s lemma (Lemma A.27), which shows that $\mu(K), \nu(K) \le \Lambda(f) < \infty$.
Define $m = \mu + \nu$, so that $\mu \ll m$ and $\nu \ll m$. By Proposition 3.29 there
exist Radon–Nikodym derivatives $f_\mu, f_\nu \ge 0$ with
\[ d\mu = f_\mu \, dm, \qquad d\nu = f_\nu \, dm, \]
As our first step towards the existence of the measure representing a positive
linear functional we consider the following kind of spaces, where the proof is
quite simple.
Definition 7.46. Let X be a topological space. A set C ⊆ X is called clopen
if it is both open and closed in X. The space X is called totally disconnected
if every open set in X is a union of clopen sets, so the topology has a basis
consisting of clopen sets.
Example 7.47. Before we give the proof, let us give examples of compact
metric totally disconnected spaces.
(1) $X = \{1, \dots, a\}^{\mathbb{N}}$ is a compact metrizable space with respect to the
product topology using the discrete topology on $\{1, \dots, a\}$. It is also
totally disconnected, since for any finite collection $F_1, \dots, F_n \subseteq \{1, \dots, a\}$
the set $\pi_1^{-1}(F_1) \cap \cdots \cap \pi_n^{-1}(F_n)$ is both open and closed (here $\pi_j$ is the
projection $X \to \{1, \dots, a\}$ onto the $j$th coordinate).
(2) More generally, we can also take the product $X = \prod_{n=1}^{\infty} A_n$, where
each $A_n$ is a finite set equipped with the discrete topology. Note that
any closed subset $Y \subseteq X$ is again totally disconnected and compact.
One way to define a metric on $X$ as in (1) or (2) and hence also on $Y$ as
in (2) is to set
\[ d(x, y) = \begin{cases} 0 & \text{if } x = y, \text{ and} \\ \frac{1}{n} & \text{if } x_1 = y_1, \dots, x_{n-1} = y_{n-1}, \text{ but } x_n \ne y_n \end{cases} \]
for all points $x, y$ (see also Lemma A.17). In this metric the open ball of
radius $\frac{1}{n}$ and centre $y$ is given by
\[ B_{1/n}(y) = \{ x \mid x_1 = y_1, \dots, x_n = y_n \} = \pi_1^{-1}(\{y_1\}) \cap \cdots \cap \pi_n^{-1}(\{y_n\}). \]
Also note $B_r(y) = B_{1/n}(y)$ if $\frac{1}{n+1} < r \le \frac{1}{n}$. It follows that there are only
countably many balls and that these are all clopen. As every open set $O \subseteq X$
is a union of balls it follows that every open set is actually a countable union
of clopen sets. In particular, the clopen sets generate the Borel σ-algebra.
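The description of the balls in this metric can be checked on a finite truncation. The sketch below replaces $\{1,2,3\}^{\mathbb{N}}$ by sequences of length five (an assumption made only so that the space can be enumerated) and compares open balls with cylinder sets; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import product

def d(x, y):
    """1/n for the first (1-based) index n with x_n != y_n, and 0 if x == y."""
    for n, (a, b) in enumerate(zip(x, y), start=1):
        if a != b:
            return Fraction(1, n)
    return Fraction(0)

def ball(points, y, r):
    """Open ball of radius r around y."""
    return {x for x in points if d(x, y) < r}

def cylinder(points, y, n):
    """All points that agree with y in the first n coordinates."""
    return {x for x in points if x[:n] == y[:n]}

points = set(product((1, 2, 3), repeat=5))  # truncation to five coordinates
y = (1, 2, 3, 1, 2)
print(len(ball(points, y, Fraction(1, 2))), len(cylinder(points, y, 2)))
```

On this truncation the ball of radius $\frac{1}{n}$ coincides with the cylinder set fixing the first $n$ coordinates, and any radius $r$ with $\frac{1}{n+1} < r \le \frac{1}{n}$ gives the same ball.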
Lemma 7.48. Let X be a totally disconnected compact metric space. Then
the Borel σ-algebra is generated by the clopen sets.
As we have already obtained a proof of the lemma in the setting of Ex-
ample 7.47 and since these cases will be sufficient for the proof of The-
orem 7.44 we leave the proof as an exercise.
Exercise 7.49. (a) Prove Lemma 7.48 in general by showing that in a compact totally
disconnected metric space, there are only countably many clopen sets.
(b) Show that every compact totally disconnected metric space is homeomorphic to a
metric space Y as in Example 7.47(2).
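\[ \mu_{\mathcal{C}}(C) = \Lambda(\mathbb{1}_C). \]
This is possible since $\mathbb{1}_C \in C(X)$ as $C$ is both open and closed. It follows
that
• $\mu_{\mathcal{C}}(C) \ge 0$ for $C \in \mathcal{C}$ (Positivity);
• $\mu_{\mathcal{C}}(C_1 \sqcup C_2) = \mu_{\mathcal{C}}(C_1) + \mu_{\mathcal{C}}(C_2)$ for disjoint $C_1, C_2 \in \mathcal{C}$ (Finite additivity).
By Carathéodory’s extension theorem (see Theorem B.4) we can extend $\mu_{\mathcal{C}}$
to a measure on the Borel σ-algebra $\mathcal{B}$ of $X$ if
\[ \mu_{\mathcal{C}}\Big( \bigsqcup_{n=1}^{\infty} C_n \Big) = \sum_{n=1}^{\infty} \mu_{\mathcal{C}}(C_n) \]
for any disjoint sets $C_1, C_2, \dots$ in $\mathcal{C}$ with $\bigsqcup_{n=1}^{\infty} C_n \in \mathcal{C}$. In the totally disconnected
compact setting this is quite easy to check. Suppose that $C_n \in \mathcal{C}$ are
disjoint for $n \ge 1$ and $C = \bigsqcup_{n=1}^{\infty} C_n \in \mathcal{C}$. Then $C$ is compact since $C \in \mathcal{C}$
gives that it is a closed subset of $X$ and $X$ is compact. On the other hand the
sets $C_n \in \mathcal{C}$ are open, so $C = \bigsqcup_{n=1}^{\infty} C_n \in \mathcal{C}$ is an open cover of a compact set.
It follows that $C = \bigsqcup_{n=1}^{N} C_n$ for some $N \ge 1$, and hence $C_n = \emptyset$ for $n > N$.
Hence finite additivity gives
\[ \mu_{\mathcal{C}}(C) = \sum_{n=1}^{N} \mu_{\mathcal{C}}(C_n) = \sum_{n=1}^{\infty} \mu_{\mathcal{C}}(C_n), \]
as required.
Therefore, $\mu_{\mathcal{C}}$ can be extended to a measure $\mu$, defined on the Borel σ-algebra
$\mathcal{B}$ of $X$. By construction
\[ \int_X \mathbb{1}_C \, d\mu = \Lambda(\mathbb{1}_C) \]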
g − ε < f < g + ε.
Hence we may apply $\Lambda$ and $\int \cdot \, d\mu$ and obtain from the positivity of both
these functionals the bounds
\[ \Lambda(f), \int f \, d\mu \in [\Lambda(g) - \varepsilon \Lambda(1), \Lambda(g) + \varepsilon \Lambda(1)] \]
and so
\[ \Big| \int f \, d\mu - \Lambda(f) \Big| \le 2 \varepsilon \Lambda(1). \]
As this holds for all ε > 0 and all f ∈ C(X), the theorem follows.
We now upgrade the result from Section 7.4.2 to the case of a general com-
pact metric space. For this we are going to use the Hahn–Banach lemma
(Lemma 7.1) and the following lemma.
Example 7.51. A few cases of this lemma do not need a proof, and should
help explain why one can think of Y as a symbolic cover.
• If $X = [0, 1]$ then we may take $Y = \{0, 1\}^{\mathbb{N}}$ to be the space of all binary
sequences with the map $\phi((a_n)) = \sum_{n=1}^{\infty} a_n 2^{-n}$ sending the binary
sequence to the real number with that binary expansion.
• Let $X \subseteq [-M, M]^d$ be a compact subset of $\mathbb{R}^d$. By composing with an
affine map, we can assume without loss that $X \subseteq [0, 1]^d = X'$. Define
\[ Y' = \big( \{0, 1\}^{\mathbb{N}} \big)^d \]
\[ \phi' : Y' \longrightarrow X' = [0, 1]^d, \qquad \big( (a_n^{(1)}), \dots, (a_n^{(d)}) \big) \longmapsto \Big( \sum_{n=1}^{\infty} a_n^{(1)} 2^{-n}, \dots, \sum_{n=1}^{\infty} a_n^{(d)} 2^{-n} \Big) \]
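The binary expansion map in the first example can be explored with exact arithmetic on finite strings. This is a sketch on truncated sequences only (finitely many digits, padded by zeros), with the illustrative name `phi`; it shows why the map is surjective but cannot be injective.

```python
from fractions import Fraction

def phi(bits):
    """The value sum_n a_n 2^(-n) of a finite binary string a_1, ..., a_N."""
    return sum(Fraction(b, 2 ** n) for n, b in enumerate(bits, start=1))

N = 20
left = (0,) + (1,) * (N - 1)   # the string 0, 1, 1, ..., 1
right = (1,) + (0,) * (N - 1)  # the string 1, 0, 0, ..., 0
gap = phi(right) - phi(left)   # equals 2^(-N)
print(phi(left), phi(right), gap)
```

As $N$ grows the gap $2^{-N}$ tends to zero, so the two infinite sequences $0111\ldots$ and $1000\ldots$ are different points of $Y$ with the same image $\frac{1}{2}$.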
We postpone the proof of the lemma until after we have seen why it is
useful for the problem at hand.
Proof of Theorem 7.44 for compact metric spaces. Let X be a
compact metric space, and let Y and φ : Y → X be as in Lemma 7.50.
Let Λ : C(X) → R be a positive linear functional. For f ∈ C(X) we have
\[ \Big( \sup_{x \in X} f(x) \Big) \mathbb{1}_X - f \ge 0, \]
so
\[ \Lambda\Big( \Big( \sup_{x \in X} f(x) \Big) \mathbb{1}_X - f \Big) \ge 0 \]
by positivity, or equivalently
for all f ∈ C(X), proving the theorem for a compact metric space X.
We note that the argument above actually proves the following abstract
principle. If φ : Y → X is a continuous surjective map between two compact
spaces, and the Riesz representation theorem holds for Y , then it also holds
for X.
It remains to construct the totally disconnected symbolic cover.
Proof of Lemma 7.50. Recall that since $X$ is a compact metric space,
it is also totally bounded, so for every $m \ge 1$ there exist finitely many
points $x_1^{(m)}, \dots, x_{n(m)}^{(m)} \in X$ with
\[ X = \bigcup_{i=1}^{n(m)} B_{1/m}\big( x_i^{(m)} \big). \tag{7.14} \]
We define
\[ Z = \prod_{m=1}^{\infty} \{1, \dots, n(m)\} \]
with the product topology from the discrete topologies on each of the
spaces $\{1, \dots, n(m)\}$. Then $Z$ is a compact metric space (see Sections A.3
and A.4). We will define $Y$ as a closed subset of $Z$, and will define $\phi : Y \to X$
by
\[ \phi(y) = \lim_{m \to \infty} x_{y(m)}^{(m)}, \]
where $y(m) \in \{1, \dots, n(m)\}$ is the $m$th coordinate of $y$ and $x_{y(m)}^{(m)}$ is the
corresponding centre of the $y(m)$-th ball in the cover (7.14). Our definition
of $Y$ will ensure that $\phi$ is well-defined (that is, the limit defining $\phi$ exists),
continuous, and surjective.
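The closed set $Y$. Define
\[ Y = \Big\{ y \in Z \;\Big|\; B_{1/1}\big( x_{y(1)}^{(1)} \big) \cap \cdots \cap B_{1/m}\big( x_{y(m)}^{(m)} \big) \ne \emptyset \text{ for all } m \ge 1 \Big\}. \]
We will show that $Y$ is closed by proving that its complement $Z \setminus Y$ is open.
So suppose that $z \in Z \setminus Y$, so that
\[ B_{1/1}\big( x_{z(1)}^{(1)} \big) \cap \cdots \cap B_{1/m}\big( x_{z(m)}^{(m)} \big) = \emptyset \]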
for some $m \ge 1$. However, this means that all other sequences with the same
first $m$ coordinates also lie in $Z \setminus Y$. That is,
\[ \pi_1^{-1}(\{z(1)\}) \cap \cdots \cap \pi_m^{-1}(\{z(m)\}) \subseteq Z \setminus Y, \]
This shows that $\big( x_{y(m)}^{(m)} \big)$ is a Cauchy sequence in $X$ and so has a limit in $X$.
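Continuity of $\phi$. Let $y \in Y$ and fix $\varepsilon > 0$. Choose $\ell$ with $\frac{4}{\ell} < \varepsilon$. Suppose
that $z \in Y$ belongs to the neighbourhood $\pi_\ell^{-1}(\{y(\ell)\})$ defined by the $\ell$th
coordinate of $y$. Letting $m \to \infty$ in (7.15) we see that
\[ d\big( x_{y(\ell)}^{(\ell)}, \phi(y) \big) \le \frac{2}{\ell} \]
and similarly
\[ d\big( x_{z(\ell)}^{(\ell)}, \phi(z) \big) \le \frac{2}{\ell}. \]
However, by the choice of $z$ we have $y(\ell) = z(\ell)$ and so
\[ d\big( \phi(z), \phi(y) \big) \le \frac{4}{\ell} < \varepsilon. \]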
and hence
\[ \Lambda(f) \le \Lambda(f_n) \sup_{x \in K_n^o} f(x). \]
for $f \in C(K_n)$ has all the properties needed to apply Lemma 7.1, so $\Lambda|_{C_c(K_n^o)}$
extends to some $\Lambda_n$ defined on $C(K_n)$ and is again positive (use the argument
from Section 7.4.3 to check this), and can be represented by a finite measure
$\mu_n$ defined on the Borel sets in $K_n$. Restricting this measure $\mu_n$ to $K_n^o$,
we obtain a measure $\mu_n = \mu_n|_{K_n^o}$ on $K_n^o$ with
\[ \Lambda(f) = \int_{K_n^o} f \, d\mu_n \]
for all $f \in C_c(K_n^o)$. We claim that these measures can be patched together to
define a locally finite measure $\mu$ on $X$ with the desired properties. For this,
notice that $\mu_{n+1}$ is a measure on $K_{n+1}^o$ which satisfies
\[ \Lambda(f) = \int_{K_{n+1}^o} f \, d\mu_{n+1} = \int_{K_n^o} f \, d\mu_{n+1} \]
as required.
Exercise 7.53. Let $X$ be a σ-compact locally compact metric space, and let $\Lambda$ be a positive
linear functional $C_0(X) \to \mathbb{R}$ (where we do not assume that $\Lambda$ is bounded). Show
that $\Lambda(f) = \int f \, d\mu$ for all $f \in C_0(X)$ for a finite measure $\mu$ on $X$.
In the remainder of this section we again treat the real and the complex case
simultaneously. The following result describes the dual of C0 (X).
for all $f \in C_0(X)$. The operator norm of $\Lambda$ is equal to $\|g\|_{L^1_{|\mu|}(X)}$, which
shows that $C_0(X)^* \cong \mathcal{M}(X)$ under the pairing
\[ \langle f, \mu \rangle = \int f \, d\mu \]
for $f \in C_0(X)$ and $\mu \in \mathcal{M}(X)$, where $\mathcal{M}(X)$ is equipped with the norm in
Exercise 3.33.
We note that in a sense Theorem 7.54 also gives a polar decomposition for
complex signed measures (see Exercises 7.56 and 7.55).
In the proof below we first construct from the linear functional Λ a positive
linear functional |Λ| (which may be called the positive version of Λ) which
will give rise to the positive finite measure |µ|. The existence of g will then
follow from Proposition 7.34. At first sight the construction of |Λ| is surprising
— we will force positivity, and then linearity is a minor miracle. Comparing
this construction to our discussion of the operator norm of integration in
Lemma 2.63 and its proof should make this less surprising.
Proof of Theorem 7.54. Let Λ be a continuous linear functional on C0 (X).
Uniqueness: To see the uniqueness claim in the theorem, suppose that Λ is
represented by dµ1 = g1 d|µ1 | and also by dµ2 = g2 d|µ2 |. Define
\[ \mu = |\mu_1| + |\mu_2| \]
and notice that $|\mu_1|, |\mu_2| \ll \mu$. By Proposition 3.29 this implies that there is
a measurable function $h_j \ge 0$ with $d|\mu_j| = h_j \, d\mu$ for $j = 1, 2$. This shows
that $\Lambda$ is represented by $d\mu_j = g_j h_j \, d\mu$ for $j = 1, 2$, so $(g_1 h_1 - g_2 h_2) \, d\mu$
represents the zero functional on $C_0(X)$. By Lemma 2.63 this implies that
\[ \|g_1 h_1 - g_2 h_2\|_{L^1_\mu} = 0, \]
for any non-negative and continuous $f \in C_{0,\mathbb{R}}(X)$. Clearly $|\Lambda(g)| \le \|\Lambda\|_{\mathrm{op}} \|f\|_\infty$
for all $g$ as in the definition of $|\Lambda|(f)$ and so
x ∈ D0 = {x ∈ D | g(x) = 0}
and so $(f_1 + f_2)^+ + f_1^- + f_2^- = (f_1 + f_2)^- + f_1^+ + f_2^+$. We may apply $|\Lambda|$ to
the latter equation and use the non-negative linearity in (7.17) to get
\[ |\Lambda|\big( (f_1 + f_2)^+ \big) + |\Lambda|(f_1^-) + |\Lambda|(f_2^-) = |\Lambda|\big( (f_1 + f_2)^- \big) + |\Lambda|(f_1^+) + |\Lambda|(f_2^+). \]
for f ∈ Cc,R (X). Now use local compactness, σ-compactness, and Urysohn’s
lemma (see Lemma A.22 and Lemma A.27) to find some non-negative function
fn ∈ Cc,R (X) with fn ↗ 1 as n → ∞ and apply monotone convergence
to obtain
\[ |\mu|(X) = \lim_{n\to\infty} \int f_n \, d|\mu| = \lim_{n\to\infty} |\Lambda|(f_n) \leq \|\Lambda\|_{\mathrm{op}} \tag{7.21} \]
by (7.16). Note that (7.20) extends now to all f ∈ C0,R (X) by applying (7.20)
to the sequence (fn f ) in Cc,R (X) together with continuity of |Λ| and domin-
ated convergence.
Description of Λ: We now return to the study of the original functional Λ.
For any f ∈ C0 (X) we may apply the definition of |Λ| and |µ| to obtain
\[ |\Lambda(f)| = \alpha\Lambda(f) = \Re(\Lambda(\alpha f)) \leq |\Lambda|(|\alpha f|) = \int_X |f| \, d|\mu| \tag{7.22} \]
for some g ∈ L^∞_{|µ|}(X). Moreover, ‖Λ‖op = ‖g‖_{L^1(µ)} by Lemma 2.63, and
together with (7.21) we obtain ‖Λ‖op = ‖g‖_{L^1(µ)} ≤ ‖g‖∞ |µ|(X) ≤ ‖g‖∞ ‖Λ‖op,
and so ‖g‖∞ = 1 follows unless ‖Λ‖op = 0. In the trivial case Λ = 0 we
have |µ| = 0 and may also set g ≡ 1.
Exercise 7.55. In the notation of Theorem 7.54 (and of its proof) show that |g| = 1
for |µ|-almost every x ∈ X.
the form dµ = g d|µ| with |g| = 1 everywhere and |µ| being a positive finite measure. Show
that |µ| is uniquely determined, as is g, |µ|-almost everywhere.
Exercise 7.57. Let X be a locally compact σ-compact metric space, and let Λ be a linear
functional on Cc (X) with the property that for any compact K ⊆ X there is a
constant CK > 0 such that |Λ(f)| ≤ CK ‖f‖∞ for any f ∈ Cc (X) with Supp f ⊆ K. Show
that Λ can be represented by a signed Radon measure on X, meaning that there exists a
Radon measure µ on X and a locally integrable (that is, integrable on any compact subset)
function g on X such that Λ(f) = ∫ f g dµ for all f ∈ Cc (X).
Exercise 7.58. Let X = [0, 1] ⊆ R (though the reader will notice that the same
conclusions hold on most compact metric spaces).
(a) Notice that every finite signed measure µ on X defines a linear functional on the
space L ∞ (X) = {f : X → R | ‖f‖∞ < ∞, f measurable} but that L ∞ (X)∗ contains
other functionals as well.
(b) Notice that every function f ∈ L ∞ (X) defines a linear functional on the space of finite
signed measures M(X) ≅ C(X)∗ . Deduce that C(X) is not reflexive. Show that M(X)∗
contains more functionals than those arising from L ∞ (X).
Exercise 7.59. Find a description of the dual of C n ([0, 1]) for all n ∈ N.
Definition 8.1. Let X be a normed vector space with dual space X ∗ . The
weak topology on X is the weakest (coarsest) topology on X for which all the
elements of X ∗ (which are functions on X) are continuous.
Exercise 8.2. Show that the weak and norm topologies coincide for a finite-dimensional
normed vector space.
\[ N_{\ell_1,\dots,\ell_n;\varepsilon}(x_0) = \bigcap_{i=1}^{n} \{ x \in X \mid |\ell_i(x) - \ell_i(x_0)| < \varepsilon \} \]
Definition 8.4. Let X be a normed vector space with dual space X ∗ . The
weak* topology (read as ‘weak star’ topology) is the weakest (or coarsest)
topology on X ∗ for which all the evaluation maps x∗ 7→ x∗ (x) corresponding
to x ∈ X are continuous.
Once again we can describe the weak* topology by saying that a neigh-
bourhood of x∗0 ∈ X ∗ is a set containing a set of the form
\[ N_{x_1,\dots,x_n;\varepsilon}(x_0^*) = \bigcap_{i=1}^{n} \{ x^* \in X^* \mid |x^*(x_i) - x_0^*(x_i)| < \varepsilon \} \]
for some ε > 0 and x1 , . . . , xn ∈ X. As before, we can show that the weak*
topology and the norm topology on X ∗ are different if X (and hence if X ∗ )
is infinite-dimensional.
Example 8.5. (a) For a Hilbert space H, the weak and weak* topologies are
identical. The same holds for any reflexive Banach space. However, in general
there is no definition of a weak* topology on a given Banach space as there
may not exist a pre-dual of X, meaning a Banach space Y with X = Y ∗ (see
Example 8.81).
(b) Let X = [0, 1] and consider the sequence of measures (µn ) where
\[ \mu_n = \frac{1}{n} \left( \delta_{1/n} + \delta_{2/n} + \cdots + \delta_1 \right), \]
8.1 Weak Topologies and the Banach–Alaoglu Theorem 255
Lemma 8.7. For a Banach space X the weak topology on X and the weak*
topology on X ∗ are Hausdorff.
Proof. For the weak topology this follows from Corollary 7.4: if y ≠ z in X
there exists some ℓ ∈ X ∗ with ℓ(y) ≠ ℓ(z), so that Nℓ;ε (y) ∩ Nℓ;ε (z) = ∅
for ε = |ℓ(z) − ℓ(y)|/2. The proof for the weak* topology is similar, using the fact
that for x∗1 ≠ x∗2 there exists some x ∈ X with x∗1 (x) ≠ x∗2 (x).
Exercise 8.8. Let X be a Banach space and let (xn ) be a sequence converging to x ∈ X
in the weak topology. Show that sup_{n≥1} ‖xn‖ < ∞. In other words, show that weakly
convergent sequences in Banach spaces are bounded.
The following exercise shows that the weak and the weak* topologies have
natural compatibility properties with respect to bounded operators.
256 8 Locally Convex Vector Spaces
The importance of the weak* topology comes from the following theorem,
which was alluded to in the introduction to the chapter.
\[ B_1^{X^*} = \{ \ell \in X^* \mid \|\ell\|_{\mathrm{op}} \leq 1 \} \]
Proof. Let B(r) be the closed (and hence compact) ball of radius r > 0
in R or C depending on the field of scalars. By Tychonoff’s theorem (see
Theorem A.20) the space
\[ Y = \prod_{x \in X} B(\|x\|) \]
is compact with respect to the product topology (see Definition A.16). Now
define the embedding
\[ \phi : B_1^{X^*} \longrightarrow Y, \qquad \ell \longmapsto (\ell(x))_{x \in X} \in Y. \]
which is precisely one of the neighbourhoods of ℓ0 ∈ B_1^{X^*} defining the weak*
topology on B_1^{X^*}. Therefore, φ is a homeomorphism from B_1^{X^*} (with the
restriction of the weak* topology) to a subset of Y (with the product topology).
We claim that
\[ \phi\bigl(B_1^{X^*}\bigr) \subseteq Y \]
is closed, which then implies the theorem, because any closed subset of the
compact space Y is compact.
To see the claim, notice first that φ(B_1^{X^*}) consists of all linear maps in Y .
This is because any element y ∈ Y is a scalar-valued function on X with
\[ y(x) \in B(\|x\|) \]
for all x ∈ X, and so if y is linear then ‖y‖ ≤ 1. The claim now follows easily
since linearity is defined by equations and so is a closed condition, as we will
now show. In fact for any scalars α1 , α2 the set
The weak and weak* topologies are never metrizable for infinite-dimensional
Banach spaces (see Exercise 8.12), but when restricted to the unit ball the
situation is better.
is a neighbourhood of ℓ0 ∈ B_1^{X^*} defined by ε > 0 and some arbitrary x ∈ X.
Choose some x′ ∈ D with ‖x − x′‖ < ε/3, and notice that for all ℓ ∈ B_1^{X^*} we
have |ℓ(x) − ℓ(x′)| < ε/3 and so
\[ N_{x';\varepsilon/3}(\ell_0) \cap B_1^{X^*} \subseteq N_{x;\varepsilon}(\ell_0) \cap B_1^{X^*} \]
by a simple application of the triangle inequality (check this). Thus the
topologies defined on B_1^{X^*} using the evaluation maps for x ∈ D or for x ∈ X
(the latter being the weak* topology by definition) agree.
For the last claim of the proposition, notice that if X is separable, then
by definition there exists a countable dense set D = {x1 , x2 , . . . } ⊆ X. For
every xn ∈ D the weakest topology for which ℓ ↦ ℓ(xn ) is continuous is
the topology induced by the semi-norm ‖ℓ‖_{xn} = |ℓ(xn )|, and so the weak*
topology is the weakest topology on B_1^{X^*} that is stronger than all the
topologies induced by the semi-norms ‖ · ‖_{xn} for n ∈ N. By Lemma A.17 and the
Hausdorff property of the weak* topology from Lemma 8.7, this topology is
metrizable.
Let us finish with the following lemma, which answers both of the following
questions for a Banach space affirmatively:
• Does X ∗ as a vector space with the weak* topology characterize X?
• If the weak and weak* topologies on X ∗ agree, does it follow that X is
reflexive?
Exercise 8.15. We know that the weak topology and the norm topology on infinite-
dimensional Banach spaces are different. In contrast to this, show that a sequence in ℓ1 (N)
converges in the weak topology if and only if it converges in the norm topology.
Exercise 8.17. Let X, Y be normed vector spaces, and let T : X → Y be linear. Show
that T is a bounded operator if and only if T is sequentially continuous with respect to
the weak topology, that is, xn → x weakly in X as n → ∞ implies that T xn → T x weakly
in Y as n → ∞.
Exercise 8.18. Let X be an infinite-dimensional normed vector space. Show that the weak
closure of the unit sphere S = {x ∈ X | kxk = 1} is the closed unit ball
†
As we have seen, weak convergence and norm convergence are in general
quite different. There are, however, situations in which weak convergence
can be upgraded to norm convergence. Analytic functions taking values in a
Banach space provide one setting where this phenomenon is seen.
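The difference between the two modes of convergence can be seen numerically in the standard example of the basis vectors e_n in ℓ2 (N): each pairing ⟨e_n , y⟩ with a fixed y tends to zero, while ‖e_n ‖ = 1 for all n. The following minimal sketch works in a finite truncation; the dimension `DIM`, the test vector `y`, and the helper names are illustrative assumptions rather than notation from the text.

```python
import math

def inner(x, y):
    """Euclidean inner product of two real vectors of equal length."""
    return sum(a * b for a, b in zip(x, y))

def basis_vector(n, dim):
    """The n-th standard basis vector e_n (0-indexed) in R^dim."""
    return [1.0 if k == n else 0.0 for k in range(dim)]

DIM = 2000                                   # assumed truncation of l^2(N)
y = [1.0 / (k + 1) for k in range(DIM)]      # assumed square-summable test vector

indices = (0, 9, 99, 999)
# Pairings <e_n, y> = y_n shrink to zero: the signature of weak convergence to 0.
pairings = [inner(basis_vector(n, DIM), y) for n in indices]
# The norms stay equal to 1, so there is no norm convergence to 0.
norms = [math.sqrt(inner(basis_vector(n, DIM), basis_vector(n, DIM))) for n in indices]
```

Only one test vector is checked here; weak convergence requires the same decay for every y, which in ℓ2 (N) follows from square-summability.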
by the Cauchy integral formula, where the integral is a contour integral over
a circular path with positive orientation winding once around ζ with radius
of ε. Therefore we have
† This subsection will not be needed in the remainder of the book.
8.2 Applications of Weak* Compactness 261
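\[
\begin{aligned}
\ell \circ f(\zeta + h) - \ell \circ f(\zeta) &= \frac{1}{2\pi i} \oint_{|z-\zeta|=\varepsilon} \ell \circ f(z) \left( \frac{1}{z-(\zeta+h)} - \frac{1}{z-\zeta} \right) dz \\
&= \frac{h}{2\pi i} \oint_{|z-\zeta|=\varepsilon} \frac{\ell \circ f(z)}{(z-(\zeta+h))(z-\zeta)} \, dz.
\end{aligned} \tag{8.1}
\]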
Notice that the denominator in the integral on the right-hand side of (8.2)
is uniformly bounded away from zero, and the numerator is bounded above
by M kℓk for some constant M depending only on f , ζ, and ε. It follows that
The Banach–Alaoglu theorem (Theorem 8.10) is quite helpful for the con-
struction of the Haar measure on compact abelian groups and invariant means
(see Section 3.3, Section 7.2 and Section 10.2).
Exercise 8.21. Let G be a compact metric abelian group. Show that there exists a G-
invariant positive functional Λ : CR (G) → R with Λ(1) = 1, and deduce the existence of a
Haar measure on G.
Exercise 8.22. Let H be a separable Hilbert space, and suppose that A ∈ B(H) is a
compact operator on H. Show that A B1H is compact and that A∗ is also a compact
operator.
Exercise 8.23 (Discrete abelian groups are amenable). Let G be any abelian dis-
crete group. Define for any finitely generated subgroup H < G the set SH to be the set
of all positive functionals L ∈ (ℓ∞ (G))∗ which have norm one and are left-invariant
under elements of H. Show that the intersection ⋂_{H f.g.} SH taken over all finitely generated
subgroups H in G is non-empty, and deduce that G is amenable by applying Lemma 7.18.
Exercise 8.24. Use Exercise 8.23 and the Riesz representation theorem to give a different
proof of the existence of Haar measure on a compact abelian group.
The next exercise generalizes Exercise 7.21(b) and shows the existence of
a maximal amenable normal subgroup called the amenable radical of G.
Exercise 8.25 (Amenable radical). Let G be a discrete group. Let A be a set and
suppose that Hα ⊳ G is an amenable normal subgroup for any α ∈ A. Show that the
subgroup ⟨Hα | α ∈ A⟩ generated by these subgroups is an amenable normal subgroup.
8.2.1 Equidistribution
Proposition 8.27. Let X be a compact metric space. Then the space P(X)
of probability measures defined on the Borel σ-algebra of X forms a compact
metric space in the weak* topology. The same applies to
\[ M_{\leq T}(X) = \{ \mu \text{ is a positive measure on } X \text{ with } \mu(X) \leq T \} \]
\[ C(X)^* \cong M(X) \supseteq P(X), \]
where M(X) is the space of finite signed measures defined on the Borel σ-
algebra of X. By Theorem 7.44 the set of probability measures is given by
\[ P(X) = \Bigl\{ \mu \in M(X) \Bigm| \int 1_X \, d\mu = 1 \Bigr\} \cap \bigcap_{f \geq 0} \Bigl\{ \mu \in M(X) \Bigm| \int f \, d\mu \geq 0 \Bigr\} \]
where the intersection is taken over all f ∈ C(X) with f ≥ 0. Since each of
the sets in the intersection is closed in the weak* topology, we see that P(X)
is closed as well.
By the Banach–Alaoglu theorem (Theorem 8.10), and since
\[ P(X) \subseteq B_1^{C(X)^*}, \]
this implies that P(X) is compact in the weak* topology. By Lemma 2.46
we know that C(X) is separable, so by Proposition 8.11 the weak* topology
on P(X) is metrizable. The same argument applies to M_{≤T}(X).
Exercise 8.28. Let X be a locally compact σ-compact metric space. Show for any T > 0
that M_{≤T}(X) (defined as in Proposition 8.27) is compact with respect to the weak*
topology defined by C0 (X) and the identification between C0 (X)∗ and M(X) in the Riesz
representation (Theorem 7.54). Also show that the space of probability measures P(X) is
necessarily not compact if X is not compact.
Definition 8.29. Let X be a compact metric space, and let (µn ) be a se-
quence of probability measures in P(X). We say that (µn ) equidistributes
with respect to a probability measure m ∈ P(X) if µn → m as n → ∞ in
the weak* topology; that is, if
\[ \int_X f \, d\mu_n \longrightarrow \int_X f \, dm \]
Essential Exercise 8.31. Prove Lemma 8.30 using the density of the trigo-
nometric polynomials in C(Td ).
Exercise 8.32. Assume that 1, α1 , . . . , αd ∈ R are linearly independent over Q. Show that
\[ \frac{1}{N} \sum_{n=0}^{N-1} f\bigl( n(\alpha_1,\dots,\alpha_d) \ (\mathrm{mod}\ \mathbb{Z}^d) \bigr) \longrightarrow \int_{\mathbb{T}^d} f(x) \, dx \]
for any f ∈ C(Td ). Use this to generalize Exercise 2.50 to a statement about powers of 2
and 3 with the same exponent.
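The convergence of such orbit averages can be observed numerically in the simplest case d = 1. The sketch below is an illustration and not a proof; the choices α = √2 and f (x) = cos(2πx), whose integral over T is 0, and the helper name `orbit_average` are assumptions made for the example.

```python
import math

def orbit_average(f, alpha, num_terms):
    """Average (1/N) * sum_{n<N} f(n * alpha mod 1) along the rotation orbit."""
    total = 0.0
    for n in range(num_terms):
        total += f((n * alpha) % 1.0)
    return total / num_terms

alpha = math.sqrt(2.0)                        # assumed irrational rotation number

def f(x):
    """Test function on the circle with integral 0 over [0, 1)."""
    return math.cos(2.0 * math.pi * x)

# The averages should approach the integral of f over T, which is 0 here.
averages = {N: orbit_average(f, alpha, N) for N in (10, 100, 1000, 10000)}
```

For this particular f the geometric sum formula gives the explicit bound |average| ≤ 1/(N |sin(πα)|), so the decay is of order 1/N.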
Exercise 8.33. Assume that α1 , . . . , αd ∈ R are linearly independent over Q. Show that
\[ \frac{1}{T} \int_0^T f\bigl( t(\alpha_1,\dots,\alpha_d) \ (\mathrm{mod}\ \mathbb{Z}^d) \bigr) \, dt \longrightarrow \int_{\mathbb{T}^d} f(x)\, dx \]
Notice that
so that Proposition 8.34 will certainly follow from the stronger result that
the orbit {T n (0, 0) | n > 0} is equidistributed in T2 . Dynamical questions
of this sort — concerning equidistribution of an orbit under iteration of a
map — are part of ergodic theory. We will briefly outline how one can use
the Banach–Alaoglu theorem (Theorem 8.10) to prove Proposition 8.34 using
ideas from ergodic theory without developing this theory further, and refer
to [27] for a more thorough treatment.
and, by T -invariance of ν1 ,
\[ \nu_1(B) = \nu_1(T^{-1}B) = \int_{T^{-1}B} f_1 \, d\mu. \]
Let us assume† now that T has a continuous inverse (which is the case for
the map on T2 considered above to which this result will be applied). Then
the above implies that f1 = f1 ◦ T almost everywhere with respect to µ,
since T −1 (T B) = B shows that all measurable sets are pre-images.
Since ν1 ≠ µ the function f1 is not equal to 1 almost everywhere with
respect to µ, and has
\[ \int_X f_1 \, d\mu = \nu_1(X) = 1. \]
Therefore, B = f1⁻¹([0, 1)) satisfies µ(B △ T⁻¹B) = 0 and has µ(B) ∈ (0, 1),
so µ is not ergodic.
The compactness of P(X) can be used to obtain elements of P^T(X) from
sequences of approximately invariant measures.
\[ \mu_n = \frac{1}{n} \sum_{j=0}^{n-1} T_*^j \nu_n \]
for all n ≥ 1. Show that any weak* limit µ of a subsequence of (µn ) is
T-invariant, and deduce that P^T(X) is non-empty.(26)
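The reason the averaged measures are approximately invariant is a telescoping sum: integrating f ◦ T − f against µn leaves only the first and last terms, divided by n. The sketch below checks this numerically under stated assumptions: νn is taken to be a point mass δx for every n, and `skew_map` is a hypothetical continuous map on the 2-torus chosen only for illustration (it need not be the map T studied in the text).

```python
import math

ALPHA = math.sqrt(2.0) - 1.0                  # assumed irrational parameter

def skew_map(point):
    """Hypothetical continuous torus map (x, y) -> (x + ALPHA, y + x) mod 1."""
    x, y = point
    return ((x + ALPHA) % 1.0, (y + x) % 1.0)

def orbit_average(f, start, n):
    """Integral of f against mu_n = (1/n) * sum_{j<n} T_*^j delta_start."""
    total, point = 0.0, start
    for _ in range(n):
        total += f(point)
        point = skew_map(point)
    return total / n

def invariance_defect(f, start, n):
    """|integral of f∘T dmu_n - integral of f dmu_n|; telescopes to |f(T^n x) - f(x)| / n."""
    shifted = orbit_average(lambda p: f(skew_map(p)), start, n)
    return abs(shifted - orbit_average(f, start, n))

def f(point):
    """Continuous test function on the torus with sup norm at most 1."""
    return math.cos(2.0 * math.pi * point[0]) * math.sin(2.0 * math.pi * point[1])

defects = {n: invariance_defect(f, (0.1, 0.2), n) for n in (10, 100, 1000)}
```

Each defect is bounded by 2‖f‖∞ /n, which is what forces every weak* limit of the averages to be T-invariant.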
where we have used the fact that U_{Rα} : f ↦ f ◦ Rα is an isometry of L²_{λT}(T)
and hence maps a convergent series to a convergent series. Notice that
\[ \lim_{\ell\to\infty} z_{n_{k_\ell}} = z. \]
(b) Assume in addition that Z is a compact metric space, and show that the
following gives another equivalent condition:
• For every convergent subsequence (znk ) we have
\[ \lim_{k\to\infty} z_{n_k} = z. \]
(c) Assume now that α ∈ R∖Q, and use this, together with the fact
that P^{Rα}(T) only contains the measure λT and Exercise 8.37, to show the
equidistribution of (nα) in T.
Essential Exercise 8.40. Show that the Lebesgue measure λT2 is T -invariant
and ergodic.
\[ \frac{1}{n} \sum_{j=0}^{n-1} T_*^j (\delta_x \times \lambda_{\mathbb{T}}) \longrightarrow \lambda_{\mathbb{T}^2} \tag{8.5} \]
as n → ∞ (check this).
Fix some ρ ∈ (0, 1/2) and write λ_{y,ρ} = λ|_{Bρ(y)} for the Lebesgue measure
restricted to the ρ-ball Bρ (y) = (y − ρ, y + ρ) ⊆ T around y ∈ T, and consider
the average
\[ \frac{1}{2\rho n} \sum_{j=0}^{n-1} T_*^j (\delta_x \times \lambda_{y,\rho}). \tag{8.6} \]
We want to show that these averages converge to λT2 in the weak* topology.
Proposition 8.27 and Exercise 8.39 imply that for this it is enough to show
that any convergent subsequence has λT2 as its limit. So assume (nk ) is the
index sequence of a convergent subsequence, and denote the limit by µ1 .
Using the convergence in (8.5), we see that
\[ \frac{1}{(1-2\rho) n_k} \sum_{j=0}^{n_k - 1} T_*^j \bigl( \delta_x \times (\lambda_{\mathbb{T}} - \lambda_{y,\rho}) \bigr) \longrightarrow \mu_2 \]
since T j (x, y + z) = T j (x, y) + (0, z) has distance less than ρ from T j (x, y)
for all z ∈ (−ρ, ρ). Using the convergence of (8.6) to λT2 , it follows that
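\[
\begin{aligned}
\limsup_{n\to\infty} \left| \int_{\mathbb{T}^2} f \, d\lambda_{\mathbb{T}^2} - \frac{1}{n} \sum_{j=0}^{n-1} f\bigl(T^j(x,y)\bigr) \right|
&\leq \limsup_{n\to\infty} \left| \int_{\mathbb{T}^2} f\, d\lambda_{\mathbb{T}^2} - \frac{1}{2\rho n} \sum_{j=0}^{n-1} \int f \, dT_*^j(\delta_x \times \lambda_{y,\rho}) \right| + \varepsilon = \varepsilon.
\end{aligned}
\]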
†
We show in this and the next subsection how the Banach–Alaoglu theorem
can help to prove elliptic regularity for weak solutions to equations of the
form ∆g = u with g in H01 (U ), u in L2 (U ), and U ⊆ Rd open and bounded.
In this section we essentially reprove Theorem 5.45 using different methods.
In the next subsection we will assume that U has smooth boundary and
will show the regularity (unlike in Section 5.3.2) up to and including the
boundary. For convenience we will consider only R-valued functions.
Definition 8.41 (Difference quotients). Let U ⊆ Rd and V ⊆ U be open
subsets. For any f ∈ L2 (U ), j = 1, . . . , d and h ∈ R such that V + hej is
contained in U we define the difference quotient Djh f ∈ L2 (V ) by
\[ D_j^h f(x) = \frac{f(x + h e_j) - f(x)}{h} \]
for almost every x ∈ V .
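As a numerical illustration of the definition, and of the bound by the derivative stated in Lemma 8.42 below, the following sketch works in dimension one with U = (0, 1) and V = (s, 1 − s). The function f (x) = sin(2πx), the margin s = 0.1, and the midpoint-rule helper `l2_norm` are assumptions chosen for the example.

```python
import math

def difference_quotient(f, h):
    """Return the function x -> (f(x + h) - f(x)) / h."""
    return lambda x: (f(x + h) - f(x)) / h

def l2_norm(g, a, b, num_points=20000):
    """Midpoint-rule approximation of the L^2 norm of g on the interval (a, b)."""
    width = (b - a) / num_points
    total = sum(g(a + (i + 0.5) * width) ** 2 for i in range(num_points))
    return math.sqrt(total * width)

def f(x):
    return math.sin(2.0 * math.pi * x)

def df(x):
    """Classical derivative of f."""
    return 2.0 * math.pi * math.cos(2.0 * math.pi * x)

s = 0.1                                       # V + [-s, s] stays inside U = (0, 1)
deriv_norm = l2_norm(df, 0.0, 1.0)            # norm of the derivative on U
quotient_norms = {
    h: l2_norm(difference_quotient(f, h), s, 1.0 - s)   # norm of D^h f on V
    for h in (0.1, 0.01, -0.05)
}
```

All three step sizes satisfy 0 < |h| ≤ s, so every value in `quotient_norms` should lie below `deriv_norm`, uniformly in h.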
As one might expect the difference quotient and the weak partial derivative
are related. The first connection below is a direct application of our definition
of the Sobolev spaces.
Lemma 8.42 (Bounding the difference quotient). Let V ⊆ U ⊆ Rd be
open subsets and s > 0 such that V + [−s, s]ej ⊆ U for some j ∈ {1, . . . , d}.
Then, for any function f ∈ H 1 (U ),
\[ \| D_j^h f \|_{L^2(V)} \leq \| \partial_j f \|_{L^2(U)} \]
Proof. If f ∈ C ∞ (U ) ∩ H 1 (U ) then
\[ D_j^h f(x) = \frac{f(x+he_j) - f(x)}{h} = \int_0^1 \partial_j f(x + t h e_j) \, dt \]
for all x ∈ V and 0 < |h| 6 s. By integrating the square of this equation,
applying Cauchy–Schwarz, translation invariance of the Lebesgue measure,
and Fubini’s theorem we obtain
† This and the next subsection finish our discussion of Sobolev spaces and the Laplace
operator. In particular, this material will not be needed in the remainder of the book.
\[
\begin{aligned}
\int_V |D_j^h f(x)|^2 \, dx &= \int_V \left| \int_0^1 \partial_j f(x + t h e_j) \, dt \right|^2 dx \\
&\leq \int_V \int_0^1 |\partial_j f(x + t h e_j)|^2 \, dt \, dx \leq \int_U |\partial_j f(x)|^2 \, dx
\end{aligned}
\]
for all 0 < |h| ≤ s. Then f |V has a weak partial derivative ∂j f on V satisfying
\[ \| \partial_j f \|_{L^2(V)} \leq C. \]
Proof. Let φ ∈ Cc∞ (V ) and note that this implies that Djh φ converges
uniformly to ∂j φ as h → 0. Indeed, we have
ing ‖D_j^{−1/n} f‖_{L²(V)} ≤ C. By the Banach–Alaoglu theorem (Theorem 8.10
and Proposition 8.11) there exists a subsequence (nk ) with the property
that D_j^{−1/nk} f |V converges in the weak* topology to some function v ∈ L²(V)
as k → ∞ with ‖v‖_{L²(V)} ≤ C. We claim that v is the weak partial derivative
sought in the corollary. In fact, by (8.9) we now have for any φ ∈ Cc∞ (V )
\[ \langle f, D_j^{1/n_k} \phi \rangle_{L^2(V)} = - \langle D_j^{-1/n_k} f, \phi \rangle_{L^2(V)} \longrightarrow - \langle v, \phi \rangle_{L^2(V)} \]
as k → ∞. Together with (8.8) this gives ⟨f, ∂j φ⟩_{L²(V)} = −⟨v, φ⟩_{L²(V)} for
any function φ ∈ Cc∞ (V ), which proves the corollary.
Exercise 8.44. Using the same assumptions as in Corollary 8.43 show that the difference
quotients Djh f converge weakly in L2 (V ) to ∂ j f as h → 0. Do they also converge strongly?
for x ∈ Rd is smooth with compact support, and its derivatives are given
by ∂α (f ∗ χ) = f ∗ ∂α χ for all α ∈ Nd0 . If f ∈ L2 (U ) has a weak α-partial
derivative fα ∈ L2 (U ) for some α ∈ Nd0 and the open subset V ⊆ U has the
property that V − Supp χ ⊆ U , then we also have ∂α (f ∗ χ)|V = (fα ∗ χ)|V .
Similarly, if g ∈ L2 (U ) has compact support and satisfies the equation ∆g = u
then ∆(g ∗ χ)|V = (u ∗ χ)|V .
for all x ∈ Rd by the mean value theorem and dominated convergence. Induc-
tion now shows that f ∗ χ ∈ Cc∞ (Rd ) and ∂α (f ∗ χ) = f ∗ ∂α χ for all α ∈ Nd0 ,
as claimed.
Assume next that f ∈ L2 (U ) has the weak α-partial derivative fα ∈ L2 (U )
for some α ∈ Nd0 . Note that fα also has compact support, as it vanishes by
Lemma 5.10 almost everywhere on every open subset on which f vanishes
almost everywhere. Also suppose for χ ∈ Cc∞ (Rd ) that the open subset V ⊆ U
satisfies V − Supp χ ⊆ U . Now let φ ∈ Cc∞ (V ) and consider
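\[
\begin{aligned}
\langle f * \chi, \partial_\alpha \phi \rangle_{L^2(V)} &= \int_V \int_{\operatorname{Supp}\chi} f(x-y)\chi(y)\,dy\, \partial_\alpha\phi(x)\,dx \\
&= \int_{\operatorname{Supp}\chi} \chi(y)\, \langle f, \partial_\alpha(\lambda_{-y}\phi) \rangle_{L^2(U)} \, dy,
\end{aligned}
\]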
where λ−y φ(x) = φ(x + y) has support Supp φ − y and defines for all y
in Supp χ an element of Cc∞ (U ) since V − Supp χ ⊆ U . Using the fact that fα
is the weak partial derivative of f we obtain
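\[
\begin{aligned}
\langle f*\chi, \partial_\alpha\phi\rangle_{L^2(V)} &= (-1)^{\|\alpha\|_1} \int_{\operatorname{Supp}\chi} \chi(y)\, \langle f_\alpha, \lambda_{-y}\phi\rangle_{L^2(U)} \, dy \\
&= (-1)^{\|\alpha\|_1} \int_{\operatorname{Supp}\chi} \chi(y) \int_V f_\alpha(x-y)\phi(x)\,dx\,dy \\
&= (-1)^{\|\alpha\|_1} \langle f_\alpha * \chi, \phi\rangle_{L^2(V)}
\end{aligned}
\]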
for any φ ∈ Cc∞ (V ). By uniqueness of the weak derivative (Lemma 5.10) and
continuity we now obtain ∂α (f ∗ χ)|V = (fα ∗ χ)|V (pointwise) as required.
This argument also gives the claim in the proposition for g ∈ L2 (U )
with ∆g = u ∈ L2 (U ). Indeed, with the same arguments concerning the
support of λ−y φ with y ∈ Supp χ we obtain
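\[
\begin{aligned}
\langle g*\chi, \Delta\phi\rangle_{L^2(V)} &= \int_V\int_{\operatorname{Supp}\chi} g(x-y)\chi(y)\,dy\,\Delta\phi(x)\,dx \\
&= \int_{\operatorname{Supp}\chi} \chi(y)\,\langle g, \Delta(\lambda_{-y}\phi)\rangle_{L^2(U)}\,dy \\
&= \int_{\operatorname{Supp}\chi} \chi(y)\,\langle u, \lambda_{-y}\phi\rangle_{L^2(U)}\,dy \\
&= \int_{\operatorname{Supp}\chi}\chi(y)\int_V u(x-y)\phi(x)\,dx\,dy = \langle u*\chi,\phi\rangle_{L^2(V)},
\end{aligned}
\]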
as required.
We will now use a non-negative function ȷ ∈ Cc∞ (Rd ) as in Exercise 5.17,
so that Supp ȷ is the closed unit ball and ∫ ȷ dx = 1, and define the scaled
function ȷε (x) = ε^{−d} ȷ(x/ε) for all x ∈ Rd and ε > 0.
for x ∈ Rd . Therefore
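\[
\begin{aligned}
\|f_\varepsilon - f\|_{L^2(\mathbb{R}^d)}^2 &= \int \left| \int \bigl(f(x-\varepsilon z) - f(x)\bigr) \jmath(z)\,dz \right|^2 dx \\
&\leq \iint |f(x-\varepsilon z) - f(x)|^2 \jmath(z)\,dz\,dx \\
&= \int \|\lambda_{\varepsilon z} f - f\|_{L^2(\mathbb{R}^d)}^2 \, \jmath(z)\,dz
\end{aligned}
\]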
by Jensen’s inequality (see the first paragraph of the proof of Lemma 3.75)
and Fubini’s theorem. By Lemma 3.74 and dominated convergence the latter
converges to zero as ε → 0.
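The L2 convergence of the mollified functions can be observed numerically in dimension one. The sketch below is a rough illustration under stated assumptions: the bump exp(−1/(1 − z²)) plays the role of the mollifier, its normalisation is done discretely on the quadrature nodes, f is a tent function, and the names `bump`, `mollify`, and `l2_error` are hypothetical helpers.

```python
import math

def bump(z):
    """Unnormalised smooth bump supported on [-1, 1]."""
    return math.exp(-1.0 / (1.0 - z * z)) if abs(z) < 1.0 else 0.0

def mollify(f, eps, x, nodes, weights):
    """Quadrature approximation of the integral of f(x - eps*z) against the mollifier."""
    return sum(w * f(x - eps * z) for z, w in zip(nodes, weights))

def l2_error(f, eps, a=-2.0, b=2.0, num_x=400, num_z=100):
    """Approximate L^2 distance between f and its mollification on (a, b)."""
    nodes = [-1.0 + (k + 0.5) * 2.0 / num_z for k in range(num_z)]
    raw = [bump(z) for z in nodes]
    total = sum(raw)
    weights = [r / total for r in raw]        # discrete normalisation: weights sum to 1
    dx = (b - a) / num_x
    acc = 0.0
    for i in range(num_x):
        x = a + (i + 0.5) * dx
        acc += (mollify(f, eps, x, nodes, weights) - f(x)) ** 2 * dx
    return math.sqrt(acc)

def tent(x):
    """Continuous, compactly supported test function."""
    return max(0.0, 1.0 - abs(x))

errors = [l2_error(tent, eps) for eps in (0.4, 0.2, 0.1, 0.05)]
```

The entries of `errors` should decrease as ε shrinks, in line with the convergence statement; the observed rate depends on the regularity of the chosen f.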
We note that the following corollary to Proposition 8.45 will be combined
with Corollary 8.43.
for all α ∈ N_0^d with ‖α‖₁ ≤ k. Applying Proposition 8.45 and Lemma 8.46
we see that (χf ) ∗ ȷε is in Cc∞ (Rd ) and that
\[ \partial_\alpha\bigl((\chi f) * \jmath_\varepsilon\bigr) = \bigl(\partial_\alpha(\chi f)\bigr) * \jmath_\varepsilon \]
∆f = v ∈ H 0 (U ) = L2 (U ),
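\[ \langle f, \Delta\phi\rangle_{L^2(\mathbb{R}^d)} = \langle f, \Delta(\psi\phi)\rangle_{L^2(U)} = \langle v, \psi\phi\rangle_{L^2(U)} = \langle v, \phi\rangle_{L^2(\mathbb{R}^d)}. \]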
for all i, j ∈ {1, . . . , d} and real numbers h with 0 < |h| ≤ 1. We now let ε → 0
and obtain from (8.10), Proposition 8.45, and Lemma 8.46 that
\[ \| D_i^h \partial_j f \|_2 \leq \| v \|_2 \]
for all h with 0 < |h| ≤ 1. Applying Corollary 8.43, this implies that ∂i ∂j f
exists in L²(Rd) and by Corollary 8.47 it follows that f ∈ H²_loc(Rd). Using the
same function ψ ∈ Cc∞ (U ) as above this implies f = ψf ∈ H²(Rd) ∩ H²(U),
so g ∈ H²_loc(U) since f = χg and χ ∈ Cc∞ (U ) was arbitrary.
Induction on k. The theorem now follows by induction on k ≥ 0. The
case k = 0 is proven above. So assume now that ∆g = u ∈ H^k_loc(U) for
some k ≥ 1. Since we then also have u ∈ H^{k−1}_loc(U) we obtain from the
inductive hypothesis that in fact g ∈ H^{k+1}_loc(U). Let χ ∈ Cc∞ (U ). Lemma 5.50
then gives for f = χg that ∆f = v ∈ H^k(U). If α ∈ N_0^d satisfies ‖α‖₁ ≤ k,
then ∂α f ∈ L²(U) satisfies ∆∂α f = ∂α v since
for all φ ∈ Cc∞ (U ). Hence the argument above for k = 0 applies to ∂α f and
shows that ∂α f ∈ H²_loc(U). As α ∈ N_0^d is arbitrary with ‖α‖₁ ≤ k, it follows
that f satisfies the assumption of Corollary 8.47 for the integer k + 2 and
hence f = ψf ∈ H^{k+2}(U), or equivalently that g ∈ H^{k+2}_loc(U) since f = χg
and χ ∈ Cc∞ (U ) was arbitrary. This concludes the induction and the proof
of the theorem.
The above proof of elliptic regularity, and in particular the step in (8.10),
was tailored very closely to the Laplace operator on open subsets in Rd .
In order to also obtain the regularity at the boundary we start by giving
a different argument which will be more amenable for generalizations (even
though it will be a bit more involved). For this we will use the following
inequality, which is also known as the Cauchy inequality with an ε. For any
measure space (X, B, µ), functions u, v ∈ L2µ (X), and ε > 0 we have
\[ \langle u, v\rangle_{L^2(X,\mu)} \leq \int_X |uv| \, d\mu \leq \varepsilon \|u\|_2^2 + \frac{1}{4\varepsilon}\|v\|_2^2. \tag{8.11} \]
The first inequality is the triangle inequality and the second follows from
integrating the inequality
\[ 0 \leq \left( \sqrt{\varepsilon}\,|u| - \frac{|v|}{2\sqrt{\varepsilon}} \right)^2 = \varepsilon |u|^2 + \frac{|v|^2}{4\varepsilon} - |u||v| \]
over X.
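A quick numerical check of the Cauchy inequality with an ε, for the counting measure on a four-point set, is sketched below. The vectors `u` and `v` and the helper name `cauchy_with_epsilon_gap` are assumptions made for the example.

```python
def cauchy_with_epsilon_gap(u, v, eps):
    """Return eps*||u||^2 + ||v||^2/(4*eps) - sum |u_k v_k| for finite real sequences."""
    lhs = sum(abs(a * b) for a, b in zip(u, v))
    rhs = eps * sum(a * a for a in u) + sum(b * b for b in v) / (4.0 * eps)
    return rhs - lhs

u = [1.0, -2.0, 0.5, 3.0]                     # assumed sample data
v = [0.25, 4.0, -1.0, 2.0]

# The gap is non-negative for every eps > 0, which is the content of the inequality.
gaps = [cauchy_with_epsilon_gap(u, v, eps) for eps in (0.01, 0.5, 1.0, 10.0)]

# Equality holds when |v| = 2*eps*|u| pointwise, for instance v = u with eps = 1/2.
equality_gap = cauchy_with_epsilon_gap(u, u, 0.5)
```

The freedom in ε is what makes the inequality useful: a small ε puts a small weight on ‖u‖² at the cost of a large weight on ‖v‖².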
Third proof of elliptic regularity on open sets. As in the second
proof of elliptic regularity above, we multiply g by some χ ∈ Cc∞ (U ) and
apply Lemma 5.50. This shows that it suffices to consider the case U = Rd and
a function g ∈ H01 (Rd ) with compact support satisfying ∆g = u ∈ H k (Rd ).
We also initially set k = 0 and will use Corollary 8.43 after bounding
\[ \| D_\ell^h g_j \|_2 \]
see also Lemma 5.41. Approximating any v ∈ H01 (Rd ) by smooth functions
with compact support this formula extends to φ = v. We set
where L denotes the left-hand side and R denotes the right-hand side.
Studying the left-hand side. By definition of ⟨·, ·⟩₁ , we have
\[ L = - \sum_{j=1}^{d} \langle g_j, \partial_j D_\ell^{-h}(D_\ell^h g) \rangle = - \sum_{j=1}^{d} \langle g_j, D_\ell^{-h}(D_\ell^h g_j) \rangle, \]
where we used the fact that ∂ j and Dℓh commute (check this). Finally, we
apply the same argument as in (8.9), which gives our main term
\[ M = L = \sum_{j=1}^d \langle D_\ell^h g_j, D_\ell^h g_j\rangle = \sum_{j=1}^d \| D_\ell^h g_j \|_2^2. \]
This is precisely what we wish to estimate, and it is the only term that is
quadratic in the difference quotient of the weak partial derivatives of g.
Bounding the right-hand side. We are aiming to convert (8.13) into an
estimate on M that is uniform with respect to h. For this, we need to bound
the right-hand side R of (8.13). In fact, we have for any ε > 0 that
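\[ |R| = \left| \int u\, D_\ell^{-h}(D_\ell^h g)\,dx \right| \leq \int |D_\ell^{-h}(D_\ell^h g)||u|\,dx \leq \varepsilon \|D_\ell^{-h}(D_\ell^h g)\|_2^2 + \frac{1}{4\varepsilon}\|u\|_2^2 \]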
by (8.11) (Cauchy’s inequality with an ε). In the first expression of the bound
on the right we use the fact that Dℓh g ∈ H01 (Rd ) and the bound on the
difference quotient by the weak partial derivative in Lemma 8.42 to obtain
\[ \|D_\ell^{-h}(D_\ell^h g)\|_2^2 \leq \|\partial_\ell(D_\ell^h g)\|_2^2 = \|D_\ell^h g_\ell\|_2^2. \]
This gives
\[ |R| \leq \varepsilon \|D_\ell^h g_\ell\|_2^2 + \frac{1}{4\varepsilon}\|u\|_2^2 \leq \varepsilon M + \frac{1}{4\varepsilon}\|u\|_2^2, \tag{8.14} \]
which on setting ε = 1/2 gives |R| ≤ (1/2)M + (1/2)‖u‖₂².
Putting the estimates together. Using (8.13) and the estimate for R
in (8.14) we finally see that M ≤ (1/2)M + (1/2)‖u‖₂² and so
\[ \sum_{j=1}^{d} \|D_\ell^h g_j\|_2^2 = M \leq \|u\|_2^2. \]
Note that this upper bound is independent of h and holds for all ℓ in {1, . . . , d}.
Applying Corollary 8.43 we see that ∂ ℓ gj exists for all ℓ, j in {1, . . . , d}. In
other words, all degree two weak partial derivatives of g exist, and so g lies
in H 2 (Rd ) by Corollary 8.47.
Induction on k. The theorem again follows by induction on k as in the
second proof of elliptic regularity above.
to consist of all smooth functions on U with the property that the function
and all partial derivatives can be extended continuously to the closure Ū (see also the
first paragraph of Section 5.3.3). We consider C ∞ (Ū ) as a subspace of C(Ū ).
Proposition 8.50 (Sobolev embedding up to the boundary). Let U
be a bounded and open subset of Rd with smooth boundary. Then
\[ \bigcap_{k \geq 0} H^k(U) = C^\infty(\overline{U}). \tag{8.15} \]
The above theorem and proposition together allow us to complete our dis-
cussion of the Dirichlet boundary value problem and the eigenfunctions of the
Laplace operator, which previously had only weaker than desired conclusions
regarding the behaviour of the functions near the boundary.
We first assume Theorem 8.49 and Proposition 8.50 and show how these
imply the corollary.
Proof of Corollary 8.51. For the Dirichlet boundary value problem we
recall from the proof of Theorem 5.51 that we first extended f ∈ C ∞ (∂U )
to all of U , which under our assumptions leads to a function f ∈ C ∞ (U ).
Proposition 5.42 then gives a function v ∈ H01 (U ) with g = f − v satisfy-
ing ∆g = 0. In other words, ∆v = ∆f ∈ H k (U ) for all k > 0, which by
Theorem 8.49 and Proposition 8.50 gives v, g ∈ C ∞ (U ). Proposition 5.33
now implies that v vanishes at ∂U pointwise.
Similarly, suppose f ∈ H01 (U ) is an eigenfunction of ∆ from Theorem 6.56.
In this case, ∆f = λf implies f ∈ H 3 (U ) by Theorem 8.49, then f ∈ H 5 (U )
and so on. Together with Proposition 8.50 this gives f ∈ C ∞ (U ).
where δ ∈ (0, ε/d) is chosen so that Φ(V ′ ) ⊆ Bε (0). In particular, Φ and all
the partial derivatives of Φ will be bounded on V ′ and Φ(U ′ ) ⊆ U ∩Bε (0). We
note that Φ maps the Lebesgue measure on V ′ ⊆ Rd to the Lebesgue measure
on the open set V = Φ(V ′ ) ⊆ Bε (0). This implies that every f ∈ C ∞ (U ∩ V )
is of the form f = g ◦ Φ−1 for g = f ◦ Φ ∈ C ∞ (U ′ ∩ V ′ ). Moreover, by the
multi-dimensional chain rule and induction we have
H k (U ∩ V ) ∋ f 7−→ f ◦ Φ ∈ H k (U ′ ∩ V ′ )
for all k > 0. In particular, it is enough to prove the desired local statement
for U ′ and V ′ instead of U and V .
Simplifying the notation further we apply a linear map, set δ = 1, and
may suppose in the following that V = (−1, 1)d and U = (−1, 1)d−1 × (0, 1).
Boundary points, trace operators on a box. We define S = (−1, 1)d−1
and will need the trace operator on the hyperplanes Sy = S × {y} inside U
for all possible values of the height parameter y ∈ (0, 1). The trace operators
for y ↘ 0 will allow us to extend functions from U to U ∩ V = U ∪ S0
with S0 = S × {0}. We note that these trace operators already featured in
Section 5.2.2 but (except for Exercise 5.29 and Exercise 5.35) not in the gen-
erality needed here. For completeness we quickly go through the construction
of these operators once more.
To define the trace operators we note that for any y1 , y2 ∈ (0, 1), any
function f ∈ C ∞ (U ), and x ∈ S we have
\[ f(x, y_2) = f(x, y_1) + \int_{y_1}^{y_2} \partial_d f(x,t)\,dt. \tag{8.16} \]
\[ \cdot|_{S_y} : C^\infty(U) \cap H^1(U) \longrightarrow C^\infty(S), \qquad f \longmapsto \bigl( S \ni x \longmapsto f(x,y) \bigr) \]
is linear. Using (8.17) and Cauchy–Schwarz for the integration over t ∈ (0, 1)
it is also easy to see that the trace map is bounded with respect to ‖ · ‖_{H¹(U)}
and ‖ · ‖_{L²(S)} . Therefore the trace map is defined on H¹(U) and takes values
in L²(S). For y1 , y2 ∈ (0, 1) we may also use (8.16) to obtain
\[
\begin{aligned}
\int_S |f(x,y_2) - f(x,y_1)|^2\,dx &= \int_S \left| \int_{y_1}^{y_2} \partial_d f(x,t)\,dt \right|^2 dx \\
&\leq \int_S \int_0^1 |\partial_d f(x,t)|^2\,dt\,dx \; |y_2 - y_1|
\end{aligned}
\]
for f ∈ H k (U ) and y1 , y2 ∈ (0, 1). This also shows that for any sequence (yn )
with yn → 0 as n → ∞ the sequence of functions defined by K ∋ x 7→ f (x, yn )
for n ≥ 1 is a Cauchy sequence with respect to ‖ · ‖_{K,∞} .
Boundary points, conclusion. Thus if k > 1 + (d−1)/2 and f ∈ H^k(U ∩ V)
has Supp f ⊆ U ∩ V , then we can find κ > 0 so that
and apply the above to see that f can be continuously extended to U . This
proves the remaining local statement. As mentioned before, this discussion
also applies to all partial derivatives of f and hence completes the proof of
the proposition.
Much as in the proof of Proposition 8.50, the statement in Theorem 8.49
can be reduced to a purely local statement. Indeed, suppose the following
holds.
(Local statement) For any z^{(0)} ∈ U there exists a neighbourhood V
of z^{(0)} in Rd so that if g ∈ H_0^1(U) with ∆g = u ∈ H^k(U) for some k ≥ 0
and Supp g ⊆ U ∩ V , then g ∈ H^{k+2}(U).
Then we may find a finite cover of U consisting of such neighbourhoods and an
associated smooth partition of unity. Together with Lemma 5.50 this reduces
the proof to the local statements (check this).
Moreover, for any interior point z (0) ∈ U we may take V = U , apply elliptic
regularity on open subsets (Theorem 5.45 or Theorem 8.48), and multiply g
by a smooth ψ ∈ Cc∞ (U ) with ψ ≡ 1 on Supp g to obtain the local statement.
We have therefore reduced the proof of Theorem 8.49 to the following:
(Local statement at boundary points) For any z^{(0)} ∈ ∂U there
exists a neighbourhood V ⊆ Rd so that g ∈ H_0^1(U) with ∆g = u ∈ H^k(U)
for some k ≥ 0 and Supp g ⊆ U ∩ V implies that g ∈ H^{k+2}(U).
where a_{i,j} = Σ_{ℓ=1}^{d} (∂ℓ Ψi )(∂ℓ Ψj ) ◦ Φ and b_i = (∆Ψi ) ◦ Φ belong to C^∞(U′)
for all i, j ∈ {1, . . . , d}. Hence the relationship in (8.20) between g and u′
for all ϕ ∈ Cc∞ (U ′ ), where P is the degree two partial differential operator
\[ P(\varphi) = \sum_{i,j=1}^{d} a_{i,j} \partial_i \partial_j \varphi + \sum_{i=1}^{d} b_i \partial_i \varphi \]
We will not prove this theorem in detail (see Exercise 8.53), but instead
consider only the ‘local version’ needed to complete the proof of elliptic reg-
ularity of the Laplace operator in Theorem 8.49.
(Local statement at boundary of a box) Let V = (−1, 1)^d
and U = (−1, 1)^{d−1} × (0, 1), and let k ≥ 0. Assume that g ∈ H_0^1(U)
with
\[ \operatorname{Supp} g \subseteq \overline{U} \cap V = (-1,1)^{d-1} \times [0,1), \]
and u ∈ H^k(U) with
\[ g \in H^{k+2}(U). \tag{8.22} \]
and
\[ \int g b_i \partial_i \phi \, dx = - \int \partial_i(g b_i) \phi \, dx \]
for all i, j ∈ {1, . . . , d}. Using this we can rewrite (8.23) in the form
\[ - \sum_{i,j=1}^{d} \int a_{i,j} (\partial_i g)(\partial_j \phi)\,dx = \int \tilde{u}\,\phi\,dx \tag{8.24} \]
This allows us to ignore the first and zero order terms in the definition of $P$.
Since (8.24) only involves the first derivatives of $\phi \in C_c^\infty(U)$, it also holds by
continuity for any $\phi = v \in H_0^1(U)$.
The choice of v. We fix some ℓ ∈ {1, . . . , d − 1}. Then
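\[ L = \sum_{i,j=1}^{d} \int D_\ell^h (a_{i,j} g_i)\, D_\ell^h g_j \,dx
   = \underbrace{\sum_{i,j=1}^{d} \int a_{i,j}^h\, D_\ell^h g_i\, D_\ell^h g_j \,dx}_{M}
   + \underbrace{\sum_{i,j=1}^{d} \int \bigl(D_\ell^h a_{i,j}\bigr) g_i\, D_\ell^h g_j \,dx}_{E} \]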
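\begin{align*}
D_\ell^h (a_{i,j} g_i)(x) &= \frac{1}{h}\Bigl( a_{i,j}(x + he_\ell)\, g_i(x + he_\ell) - a_{i,j}(x)\, g_i(x) \Bigr) \\
&= \frac{1}{h}\Bigl( a_{i,j}(x + he_\ell) \bigl( g_i(x + he_\ell) - g_i(x) \bigr) + \bigl( a_{i,j}(x + he_\ell) - a_{i,j}(x) \bigr) g_i(x) \Bigr) \\
&= a_{i,j}^h(x)\, D_\ell^h g_i(x) + \bigl(D_\ell^h a_{i,j}\bigr)(x)\, g_i(x)
\end{align*}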
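The discrete product rule above is an exact algebraic identity, so it can be checked numerically. The following is a minimal one-dimensional sketch; the coefficient `a`, the function `g`, and the helper names are hypothetical choices made only for illustration and do not come from the text.

```python
import math

def diff_quotient(f, x, h):
    """Forward difference quotient D^h f(x) = (f(x + h) - f(x)) / h."""
    return (f(x + h) - f(x)) / h

def product_rule_sides(a, g, x, h):
    """Return (lhs, rhs) of D^h(a g)(x) = a(x + h) D^h g(x) + (D^h a)(x) g(x)."""
    lhs = diff_quotient(lambda t: a(t) * g(t), x, h)
    rhs = a(x + h) * diff_quotient(g, x, h) + diff_quotient(a, x, h) * g(x)
    return lhs, rhs

# Hypothetical smooth coefficient and function (assumptions for this sketch).
def a(t):
    return 2.0 + math.sin(t)

def g(t):
    return math.exp(-t * t)

lhs, rhs = product_rule_sides(a, g, x=0.3, h=0.01)
print(abs(lhs - rhs))  # difference is at the level of floating-point rounding
```

Note that the shifted coefficient a(x + h) appears in the first term, which is why the text introduces the notation $a_{i,j}^h$.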
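The extra term $E$ may be bounded using the Cauchy inequality with an $\varepsilon$ as
in (8.11) to obtain
\begin{align*}
|E| &\le \sum_{i,j=1}^{d} \Bigl( \varepsilon \|D_\ell^h g_j\|_2^2 + \frac{1}{4\varepsilon} \bigl\| \bigl(D_\ell^h a_{i,j}\bigr) g_i \bigr\|_2^2 \Bigr) \\
&\le \varepsilon \theta^{-1} d M + \frac{\kappa d}{4\varepsilon} \sum_{i=1}^{d} \|g_i\|_2^2, \tag{8.27}
\end{align*}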
where we also used (8.26) and bounded the supremum norm of $D_\ell^h a_{i,j}$
on $\operatorname{Supp} g_j$ by some constant $\kappa > 0$.
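For reference, the Cauchy inequality with an $\varepsilon$ invoked here can be checked in one line. This is a sketch assuming that (8.11) has the usual form $ab \le \varepsilon a^2 + \frac{1}{4\varepsilon} b^2$; the symbols $a, b \ge 0$ are generic real numbers and not notation from the text.

```latex
0 \le \Bigl( \sqrt{\varepsilon}\, a - \tfrac{1}{2\sqrt{\varepsilon}}\, b \Bigr)^2
  = \varepsilon a^2 - ab + \tfrac{1}{4\varepsilon} b^2
\quad\Longrightarrow\quad
ab \le \varepsilon a^2 + \tfrac{1}{4\varepsilon} b^2 .
% Applied with a = \|D_\ell^h g_j\|_2 and b = \|(D_\ell^h a_{i,j}) g_i\|_2 after Cauchy--Schwarz:
\Bigl| \int (D_\ell^h a_{i,j})\, g_i \, D_\ell^h g_j \, dx \Bigr|
  \le \|D_\ell^h g_j\|_2 \, \|(D_\ell^h a_{i,j}) g_i\|_2
  \le \varepsilon \|D_\ell^h g_j\|_2^2 + \tfrac{1}{4\varepsilon} \|(D_\ell^h a_{i,j}) g_i\|_2^2 .
```

Summing the last bound over $i, j$ gives the first line of the estimate for $|E|$.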
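Bounding the right-hand side. Since our right-hand side has the same
shape as the right-hand side of (8.13) the argument on p. 277 applies to give
\[ |R| \le \varepsilon \|D_\ell^h g_\ell\|_2^2 + \frac{1}{4\varepsilon} \|\tilde{u}\|_2^2 \le \varepsilon \theta^{-1} M + \frac{1}{4\varepsilon} \|\tilde{u}\|_2^2. \tag{8.28} \]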
\[ \sum_{i=1}^{d} \|D_\ell^h g_i\|_2^2 \le \theta^{-1} |M| \le \frac{2}{\theta\varepsilon}\, C \tag{8.29} \]
for some $s > 0$ implies that $\partial_\ell (g_i|_{W_s})$ exists for any $i \in \{1, \dots, d\}$ and for
any $\ell \in \{1, \dots, d-1\}$. We choose $s > 0$ such that
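for $x \in \mathbb{R}^{d-1} \times (-s, \infty)$ and extended trivially outside that set satisfies
\[ \|\partial_\alpha g - \partial_\alpha g_s\|_2 < \varepsilon \]
for all $\alpha \in \mathbb{N}_0^d$ with $\|\alpha\| \le 2$. By Proposition 8.45 and Lemma 8.46 the
function $g^\delta = g_s * \jmath_\delta$ is smooth, and satisfies $\|g^\delta - g\|_2 < 2\varepsilon$ for $\delta \in (0, s)$
sufficiently small. Also by Proposition 8.45 and our shift of the functions the
derivatives $\partial_\alpha g^\delta$ of $g^\delta$ for all $\alpha \in \mathbb{N}_0^d$ with $\|\alpha\| \le 2$ can be expressed on $U$ by
convolution of $\partial_\alpha g_s$ with $\jmath_\delta$ and so $\|\partial_\alpha g^\delta - \partial_\alpha g\|_{L^2(U)} < 2\varepsilon$ for sufficiently
small $\delta \in (0, s)$ and $\alpha \in \mathbb{N}_0^d$ with $\|\alpha\| \le 2$. Therefore $g \in H^2(U)$, which
concludes the case $k = 0$.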
Induction on $k \ge 0$. The argument above gives the base of the induction
on $k$. Suppose we already know (8.22) for $k - 1 \ge 0$ and assume again
that $u \in H^k(U)$. By the inductive hypothesis we already know $g \in H^{k+1}(U)$.
We again fix some $\ell \in \{1, \dots, d-1\}$ and claim that $\partial_\ell g \in H_0^1(U)$ and that
there exists some $u_\ell \in H^{k-1}(U)$ with
\[ \langle \partial_\ell g, P\phi \rangle = \langle u_\ell, \phi \rangle \tag{8.31} \]
for sufficiently small $h \in \mathbb{R} \setminus \{0\}$ and almost every $x \in U$. Indeed, this holds
for any $f \in H^1(U) \cap C^\infty(U)$ and $x, x + he_\ell \in U$, extends by continuity to
any $f \in H^1(U)$ and almost every $x \in U$ with $x + he_\ell \in U$, and then holds for $f$
with $\operatorname{Supp} f \subseteq U \cap V$ for almost every $x \in U$ and sufficiently small $h \ne 0$. By
the continuity of the regular representation in Lemma 3.74 the identity (8.32)
implies that
\[ \partial_\ell f = \lim_{h \to 0} D_\ell^h f. \]
Using this for $g \in H^2(U)$ and its partial derivatives $\partial_i g$ for $i = 1, \dots, d$ we
find for a given $\varepsilon > 0$ some $h > 0$ such that
and
\[ \|\partial_\ell \partial_i g - D_\ell^h \partial_i g\|_2 < \varepsilon \]
for $i = 1, \dots, d$. Since $g \in H_0^1(U)$, $\operatorname{Supp}(g) \subseteq U \cap V$ and $\ell \in \{1, \dots, d-1\}$ we
have $D_\ell^h g \in H_0^1(U)$ for all sufficiently small $h$, and since $\varepsilon > 0$ was arbitrary
this implies $\partial_\ell g \in H_0^1(U)$, as claimed.
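For the second part (8.31) of the claim we let $\phi \in C_c^\infty(U)$ and calculate
\begin{align*}
\langle \partial_\ell g, P\phi \rangle &= -\langle g, \partial_\ell (P\phi) \rangle \\
&= -\Bigl\langle g, \partial_\ell \Bigl( \sum_{i,j=1}^{d} a_{i,j}\,\partial_i \partial_j \phi + \sum_{i=1}^{d} b_i\,\partial_i \phi + c\phi \Bigr) \Bigr\rangle \\
&= -\Bigl\langle g, P\partial_\ell \phi + \sum_{i,j=1}^{d} (\partial_\ell a_{i,j})\,\partial_i \partial_j \phi + \sum_{i=1}^{d} (\partial_\ell b_i)\,\partial_i \phi + (\partial_\ell c)\,\phi \Bigr\rangle \\
&= -\langle u, \partial_\ell \phi \rangle - \sum_{i,j=1}^{d} \langle g\,\partial_\ell a_{i,j}, \partial_i \partial_j \phi \rangle - \sum_{i=1}^{d} \langle g\,\partial_\ell b_i, \partial_i \phi \rangle - \langle g\,\partial_\ell c, \phi \rangle \\
&= \Bigl\langle \partial_\ell u - \sum_{i,j=1}^{d} \partial_i \partial_j (g\,\partial_\ell a_{i,j}) + \sum_{i=1}^{d} \partial_i (g\,\partial_\ell b_i) - g\,\partial_\ell c, \phi \Bigr\rangle \\
&= \langle u_\ell, \phi \rangle,
\end{align*}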
where
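\[ u_\ell = \underbrace{\partial_\ell u}_{\in H^{k-1}(U)} - \underbrace{\sum_{i,j=1}^{d} \partial_i \partial_j (g\,\partial_\ell a_{i,j})}_{\in H^{k-1}(U)} + \underbrace{\sum_{i=1}^{d} \partial_i (g\,\partial_\ell b_i)}_{\in H^{k}(U)} - \underbrace{g\,\partial_\ell c}_{\in H^{k+1}(U)} \]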
belongs to H k−1 (U ), as claimed. This concludes the induction and hence the
proof of Theorem 8.49.
290 8 Locally Convex Vector Spaces
Exercise 8.53. Complete the proof of Theorem 8.52 following the steps below.
(a) State and give a detailed proof of the extension of Corollary 8.47 that was used in the
above proof.
(b) Generalize Lemma 5.50 to allow the uniformly elliptic operator P instead of just ∆.
(c) Use the assumption that U has smooth boundary and a smooth partition of unity to
localize the situation. Apply the above proof on each of the local statements.
Let X and Y be Banach spaces. Then we have seen that the space B(X, Y )
of bounded linear operators from X to Y together with the operator norm is
again a Banach space.
Since any Banach space has a weak topology, there is of course also a
weak topology on B(X, Y ). There are, however, further topologies that make
special use of the fact that B(X, Y ) is a space of maps.
Definition 8.55. Let $X$ and $Y$ be Banach spaces. The strong operator topology
on $B(X, Y)$ is the weakest topology for which the evaluation maps
\[ B(X, Y) \ni L \longmapsto Lx \in Y \]
The strong operator topology is in many situations more natural than the uni-
form topology, and the study of unitary representations (see Definition 3.73
for the general definition) is an example.
Example 8.56. Let $H = L^2(\mathbb{R})$ and define for $x \in \mathbb{R}$ the unitary map
\[ \rho_x : H \to H \]
8.3 Topologies on the space of bounded operators 291
\[ \|\rho_x - \rho_y\| = \|I - \rho_{y-x}\| = 2 \]
\[ \|\rho_y f_i - \rho_x f_i\|_2 < \varepsilon \]
(1) it is Hausdorff;
(2) it is weaker than the uniform operator topology (defined by the operator norm);
(3) a sequence $(T_n)$ in $B(X, Y)$ converges to $T_0 \in B(X, Y)$ as $n \to \infty$ in the strong
operator topology if and only if $T_n(v) \to T_0(v)$ as $n \to \infty$ for all $v \in X$; and
(4) a filter $\mathcal{F}$ on $B(X, Y)$ converges to $T_0 \in B(X, Y)$ if the filter generated by
\[ \bigl\{ \{ Tv \mid T \in F \} \mid F \in \mathcal{F} \bigr\} \]
Definition 8.58. Let X and Y be Banach spaces. The weak operator topology
on B(X, Y ) is the weakest topology with respect to which the maps
Equivalently, the weak operator topology can be defined using the neighbourhoods
defined by the semi-norms
\[ \|L\|_{x_1, y_1^*; x_2, y_2^*; \dots; x_n, y_n^*} = \max\{ |y_1^*(Lx_1)|, \dots, |y_n^*(Lx_n)| \}. \]
Exercise 8.59. Assume that X and Y are infinite-dimensional Banach spaces. Show that
the uniform topology, the weak topology, the strong operator topology, and the weak
operator topology are all different Hausdorff topologies on B(X, Y ).
Even if we were initially only interested in Banach spaces, the last few sections
should have left no doubt that the next definition is natural and unavoidable.
It gives a class of topological vector spaces generalizing normed vector spaces.
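\[ \{ \|\cdot\|_\alpha \mid \alpha \in A \} \]
\[ N_{\alpha_1, \dots, \alpha_n; \varepsilon}(x_0) = \bigcap_{i=1}^{n} B_\varepsilon^{\|\cdot\|_{\alpha_i}}(x_0) = \Bigl\{ x \in X \mid \max_{i=1,\dots,n} \|x - x_0\|_{\alpha_i} < \varepsilon \Bigr\}. \]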
The vector space X together with this topology is called a locally convex
vector space.
also belongs to the collection (that is, coincides with $\|\cdot\|_\alpha$ for some $\alpha \in A$).
If this is the case, then the neighbourhoods of $x \in X$ are sets containing a
ball of the form
\[ B_\varepsilon^{\|\cdot\|_\alpha}(x_0) = \{ x \in X \mid \|x - x_0\|_\alpha < \varepsilon \} \]
Essential Exercise 8.61. Show that a locally convex vector space (as in
Definition 8.60) has the property that addition and scalar multiplication are
continuous, and that 0 ∈ X has a basis consisting of absorbent balanced
convex sets.
As the next exercise shows, even if a locally convex vector space topology
cannot be described using a norm, the locally convex structure is enough to
obtain results similar to those obtained as corollaries of the Hahn–Banach
theorem (Theorem 7.3).
Exercise 8.62. Let X be a locally convex vector space. Show that the space X ∗ of con-
tinuous linear functionals on X separates points.
We have seen many examples of locally convex vector spaces. These in-
clude normed vector spaces with their norm or weak topology, duals of Banach
spaces with the weak* topology, and the space B(X, Y ) of operators between
two Banach spaces with any of the topologies discussed in Section 8.3. How-
ever, there are further spaces that we have neglected so far because they do
not fit well (or at all) into the framework of normed spaces.
Example 8.63. (1) The space C ∞ ([0, 1]) is a locally convex vector space with
the semi-norms
\[ \|f\|_{C^n([0,1])} = \max_{j=0,\dots,n} \|f^{(j)}\|_\infty \]
\[ \{ \|\cdot\|_F \mid F \in C(U) \}, \]
\[ \operatorname{Supp}(f_n), \operatorname{Supp}(f) \subseteq K \]
Exercise 8.66. Suppose that the topology of a locally convex vector space $X$ is induced
by countably many semi-norms $\|\cdot\|_n$ for $n \in \mathbb{N}$.
(a) Show that a sequence $(x_n)$ in $X$ is a Cauchy sequence with respect to the metric
in (8.33) if and only if $(x_n)$ is a Cauchy sequence with respect to all of the semi-norms $\|\cdot\|_n$
for $n \in \mathbb{N}$.
(b) Show that if two families of semi-norms $\{\|\cdot\|_n\}$ and $\{\|\cdot\|'_n\}$ make $X$ into a locally
convex vector space with the same topology, then $X$ is complete with respect to $d$ if and
only if $X$ is complete with respect to $d'$, where $d'$ is defined using $\{\|\cdot\|'_n\}$ (just as in (8.33)).
(c) Show that the spaces from Example 8.63(1), (2) and (3) are Fréchet spaces.
\[ \|f\|_{\alpha, F} = \|(\partial_\alpha f) F\|_\infty \]
for $f \in C_c^\infty(U)$. Using $F = 1$ and $\alpha = 0$ shows that these include $\|\cdot\|_\infty$, so that
the topology is indeed Hausdorff. We define the space $D(U)$ of distributions
on $U$ to be the space of continuous linear functionals on the locally convex
vector space $C_c^\infty(U)$.
This definition of a distribution is a cheat because we have finessed the
problem that no function F satisfies (8.34) by simply declaring F to be
the distribution (that is, continuous linear functional) which sends the test
function φ to φ(0) without giving a more direct generalization of functions
on R. We may write this formally as
hF, φi = φ(0),
Prove that the resulting map $f \longmapsto F_f$ is linear and injective. Actually it is sufficient to
assume that $f \in L^1_{\mathrm{loc}}(U)$, the space of locally integrable functions, that is, measurable functions
that are integrable on any compact set.
Exercise 8.69. Show that no measurable and locally integrable function $f : \mathbb{R} \to \mathbb{R}$ has
the property (8.34) for all $\phi \in C_c^\infty(\mathbb{R})$.
\[ \partial_\alpha : C_c^\infty(U) \longrightarrow C_c^\infty(U), \qquad f \longmapsto \partial_\alpha f \]
which depends linearly on $\psi \in C^\infty(U)$ and linearly on $F \in D(U)$. Prove also the product
rule
\[ \partial_j (\psi \cdot F) = (\partial_j \psi) \cdot F + \psi \cdot (\partial_j F). \]
Then $p_K(\alpha x) = \alpha p_K(x)$ and $p_K(x + y) \le p_K(x) + p_K(y)$ for all $\alpha > 0$
and $x, y \in X$.
Proof. The positive homogeneity follows directly from the definition. Suppose
now that $x, y \in X$ and $t_x, t_y > 0$ have
\[ \frac{1}{t_x} x,\ \frac{1}{t_y} y \in K. \tag{8.35} \]
Then
\[ \frac{1}{t_x + t_y} (x + y) = \frac{t_x}{t_x + t_y} \cdot \frac{1}{t_x} x + \frac{t_y}{t_x + t_y} \cdot \frac{1}{t_y} y \]
also lies in $K$, since $K$ is convex. Thus $p_K(x + y) \le t_x + t_y$, and since this
holds for all $t_x, t_y$ with (8.35), the triangle inequality follows.
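The norm-like function $p_K$ can be explored numerically. The sketch below approximates $p_K(x) = \inf\{t > 0 \mid \frac{1}{t}x \in K\}$ by bisection, assuming that $K$ is convex and contains $0$ (so that the admissible values of $t$ form an upward-closed set). The ellipse `in_ellipse` and the function name `gauge` are hypothetical choices for illustration only.

```python
import math

def gauge(in_K, x, t_hi=1e6, tol=1e-12):
    """Approximate p_K(x) = inf{t > 0 : x / t in K} by bisection.

    Assumes K is convex with 0 in K and that x / t_hi lies in K.
    """
    if not in_K([c / t_hi for c in x]):
        raise ValueError("x / t_hi is not in K; increase t_hi")
    lo, hi = 0.0, t_hi
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if in_K([c / mid for c in x]):
            hi = mid   # x / mid is in K, so p_K(x) <= mid
        else:
            lo = mid   # x / mid is outside K, so p_K(x) >= mid
    return hi

def in_ellipse(p):
    """Membership test for the convex set K = {(s, t) : s^2 / 4 + t^2 <= 1}."""
    return p[0] ** 2 / 4.0 + p[1] ** 2 <= 1.0

x = [1.0, 0.5]
y = [-0.5, 2.0]
p_x = gauge(in_ellipse, x)
p_y = gauge(in_ellipse, y)
p_sum = gauge(in_ellipse, [s + t for s, t in zip(x, y)])
print(p_x, math.hypot(x[0] / 2.0, x[1]))  # numerical value next to the closed form
print(p_sum <= p_x + p_y)                  # triangle inequality
```

For this particular $K$ the closed form is $p_K(s, t) = \sqrt{s^2/4 + t^2}$, which makes the sketch easy to compare against.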
Exercise 8.72. Use Lemma 8.71 to prove the converse to Exercise 8.61. More precisely,
let X be a vector space endowed with a Hausdorff topology. Assume that addition and
scalar multiplication are continuous and that 0 ∈ X has a basis of neighbourhoods consist-
ing of absorbent balanced convex sets. Show that X is a locally convex space in the sense
of Definition 8.60.
[Figure: the convex set $K$ and the hyperplane $\ell(x) = c$.]
M = K + U = {y + u | y ∈ K, u ∈ U }
and notice that M is convex because both K and U are (check this) and
that M is absorbent as it contains U .
We now apply Lemma 8.71 to obtain the norm-like function $p_M$. By definition,
we have
\[ p_M(\cdot) \le \tfrac{2}{\varepsilon} \max\{ \|\cdot\|_{\alpha_1}, \dots, \|\cdot\|_{\alpha_n} \} \tag{8.36} \]
since $U \subseteq M$.
We claim that $p_M(z) > 1$. For otherwise there exists a sequence $(\lambda_n)$
with $\lambda_n \to 1$ as $n \to \infty$ and with
\[ \tfrac{1}{\lambda_n} z = k_n + u_n \in M = K + U \]
for all $n \ge 1$. Clearly $\frac{1}{\lambda_n} z$ and $u_n$ are bounded in the semi-norms
\[ \|\cdot\|_{\alpha_1}, \dots, \|\cdot\|_{\alpha_n}, \]
so the same holds also for $k_n$. Now rewrite the above equation as
\[ z = k_n + (\lambda_n - 1) k_n + \lambda_n u_n \]
which upgrades to
\[ |\ell(x)| \le \tfrac{2}{\varepsilon} \max\{ \|x\|_{\alpha_1}, \dots, \|x\|_{\alpha_n} \} \]
by linearity of $\ell$ and since the right-hand side is a semi-norm. This gives the
theorem (for $c = 1$).
Since the weak topology is, for infinite-dimensional vector spaces, strictly
coarser than the norm topology, there is no reason why a set that is closed
in the norm topology should be closed in the weak topology. However, for
convex sets the situation is better.
Exercise 8.76. Suppose that $K, L \subseteq X$ are disjoint convex sets in a locally convex vector
space $X$ over $\mathbb{R}$. Suppose one of them has non-empty interior. Show that there exists a
non-trivial continuous linear functional $\ell$ and a constant $c \in \mathbb{R}$ such that $\ell(x) \le c \le \ell(y)$
for all $x \in K$ and $y \in L$.
Exercise 8.77. Let $X$ be a normed vector space over $\mathbb{R}$, and let $K \subseteq X$ be a non-empty
closed and convex subset. Show that
\[ \inf_{x \in K} \|z - x\| = \sup_{\substack{\ell \in X^* \\ \|\ell\| = 1}} \Bigl( \ell(z) - \sup_{x \in K} \ell(x) \Bigr) \]
Exercise 8.78. Let $X$ be a real Banach space and let $\imath : X \to X^{**}$ be the embedding
of $X$ into its bidual as in Corollary 7.9. Show that $\imath(B_1^X)$ is dense in $B_1^{X^{**}}$ when $X^{**}$ is
equipped with the weak* topology.
8.6 Convex Sets 301
An important concept for convex sets, both abstractly and for many concrete
applications (see, for example, Proposition 8.36), is the notion of extreme
points.
Definition 8.79. Let X be a locally convex space and let K ⊆ X be a convex
subset. An element x ∈ K is an extreme point of K if x cannot be expressed
as a proper convex combination of points of K (that is, if x = sy + (1 − s)z
with y, z ∈ K and s ∈ (0, 1) then we must have x = y = z).
As illustrated in Figure 8.3, the set of extreme points of a convex set
will not be closed in general, even in a finite-dimensional setting. In infinite-
dimensional spaces, the situation is more complex still and the extreme points
may even be dense (see Exercise 8.84). The smallest closed convex subset of
a locally convex space X that contains A ⊆ X is called the closed convex
hull of A and is the intersection of all closed convex sets containing A, or
equivalently the closure of the convex hull of A.
Fig. 8.3: The set of extreme points need not be closed: Here two cone-like objects
are glued together at their base so that a single straight line connects the two cone
points in the resulting convex set. The extreme points are the two ends together
with all but one point of the central circle.
is such an element. For this we only need to show that E ∈ F , as the fact
that E < Eα for every α ∈ I then follows directly from the definition of <.
Since each Eα is closed and convex, the same holds for the intersection E.
Since each Eα is non-empty and {Eα | α ∈ I} is linearly ordered, we see
that every finite intersection Eα1 ∩ · · · ∩ Eαn is non-empty, because it must
coincide with one of the sets Eα1 , . . . , Eαn . Since K is compact, we see that
the intersection E is non-empty (see Appendix A.4). It remains to show
that E is an extremal subset. Suppose therefore that x = sy + (1 − s)z ∈ E
with y, z ∈ K and s ∈ (0, 1). Then x ∈ Eα for all α ∈ I as E ⊆ Eα . By
extremality of Eα this forces y, z ∈ Eα for all α ∈ I, and so y, z ∈ E as
required.
In summary, we have shown that we are in a position to use Zorn’s lemma,
so that there must be a maximal element E of F . In our setting, this is a
minimal closed extremal subset of K. We claim that E = {x} is a singleton,
which then implies that x must be an extreme point of K. Indeed, if E
contains two points x0 , y0 , then by Theorem 8.73 there exists a continuous
linear functional ℓ on X with ℓ(x0 ) < ℓ(y0 ). However, by compactness this
implies that
\[ E' = \{ z \in E \mid \ell(z) = \sup \ell|_E \} \]
is a non-empty proper closed convex subset of E. It is also an extremal subset
of K, since if x = sy + (1 − s)z ∈ E ′ with y, z ∈ K and s ∈ (0, 1), then we
must have y, z ∈ E as E is extremal and so ℓ(x) = sℓ(y) + (1 − s)ℓ(z)
and $\ell(y), \ell(z) \le \ell(x) = \max \ell|_E$, which implies that $y, z \in E'$, as required,
since s ∈ (0, 1). However, this is a contradiction since E ⊆ K was supposed to
be a minimal closed extremal subset of K. Therefore, E = {x0 } is a singleton
and we have shown that the set of extreme points of K is non-empty.
Now let M denote the closed convex hull of the set of all extreme points
of K. Clearly M ⊆ K and we need to show that M = K.
Suppose that $x_0$ lies in $K \setminus M$. By Theorem 8.73 there exists a continuous
linear functional $\ell$ with $\ell(y) \le c < \ell(x_0)$ for all $y \in M$. Now let
and $\ell(y), \ell(z) \le \max \ell|_K = \ell(x)$, which implies that $y, z \in F$ since $s \in (0, 1)$
and hence $x = y = z$ by extremality of $x$ in $F$.
This contradiction shows that K = M is the closed convex hull of the set
of extreme points.
The Krein–Milman theorem, together with the Banach–Alaoglu theorem
(Theorem 8.10), can produce some striking consequences.
Example 8.81. Let us show that c0 (N) has no pre-dual. In other words, there
is no Banach space X with the property that X ∗ is isometrically isomorphic
to c0 (N). Indeed, suppose that there is such a Banach space. Then, by the
Banach–Alaoglu theorem, the unit ball of c0 (N) would be weak* compact.
Thus, by the Krein–Milman theorem,† the unit ball would have to contain
some extreme point (an )n>1 . We complete the argument by showing that
there cannot be such an extreme point of the unit ball.
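By definition, $|a_n| \le 1$ for all $n \ge 1$ and $\lim_{n \to \infty} a_n = 0$. Therefore, there
exists some $n_0$ with $|a_{n_0}| < \frac{1}{2}$ and then the sequences $(b_n)$ and $(c_n)$ defined
by
\[ b_n = \begin{cases} a_n & \text{for } n \ne n_0, \\ a_n + \frac{1}{2} & \text{for } n = n_0 \end{cases} \]
and
\[ c_n = \begin{cases} a_n & \text{for } n \ne n_0, \\ a_n - \frac{1}{2} & \text{for } n = n_0 \end{cases} \]
are different, both belong to the unit ball by construction, and we have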
(c) Assume now instead that X is non-compact. Show that the conclusion of the Krein–
Milman theorem (Theorem 8.80) holds for P(X) (despite the fact that the assumptions do
not).
We now further refine the Krein–Milman theorem by showing that every point
of a compact convex set can be obtained as a ‘generalized convex combination’
of extreme points of K. However — even after taking account of convergence
questions — convex combinations alone will not be sufficiently general, as
the next example shows.
Exercise 8.85. Let $X$ and $P(X) \subseteq C(X)^*$ be as in Exercise 8.83. Describe the elements
of $P(X)$ that can be written as a convergent (in norm, or equivalently in the weak* topology)
convex combination $\sum_{n=1}^{\infty} c_n \nu_n$ of extreme points $\nu_n \in P(X)$ with $c_n \ge 0$ for
all $n \ge 1$ and $\sum_{n=1}^{\infty} c_n = 1$. Now let $X = [0, 1]$ and give examples of Borel probability
measures that cannot be obtained as such limits.
for every ℓ ∈ X ∗ .
Notice that each ℓ ∈ X ∗ is continuous on K and hence is integrable with
respect to any µ on K as in the definition above.
Essential Exercise 8.87. Show that the barycentre of a Borel probability
measure µ on a metrizable compact convex subset is uniquely determined
by µ.
Throughout the discussion of this subsection we will assume that the in-
duced topology on K ⊆ X is metrizable, writing simply (as above) that K is
a metrizable subset of X. With Proposition 8.11 and Exercise 8.12 in mind,
it should be clear why we do not wish to assume that X itself is metrizable.
Notice that for a fixed $\ell \in X^*$, this hyperplane is not empty since $\ell$ is linear.
The lemma is equivalent to the statement that
\[ K \cap \bigcap_{\ell \in X^*} H_\ell \ne \emptyset. \]
\[ L : X \longrightarrow \mathbb{R}^n, \qquad x \longmapsto (\ell_1(x), \dots, \ell_n(x)) \]
and note that this is equivalent to (8.37). Hence the claim implies the lemma.
Suppose therefore that (8.38) does not hold. Then by Theorem 8.73 (applied
to $L(K) \subseteq \mathbb{R}^n$) there exists a functional $\varphi$ defined by $\varphi(t) = \sum_{j=1}^{n} a_j t_j$
for $t \in \mathbb{R}^n$ and some row vector $a \in \mathbb{R}^n$ such that
While the issues arising in the proof are functional-analytic, the intuition
behind the statement is essentially geometric, which is more visible in a finite-
dimensional version illustrated in the next exercise.
Exercise 8.91 (Carathéodory’s form of Minkowski’s theorem). Let K ⊆ Rn be a
compact convex subset. Show that any point x0 ∈ K is a convex combination of (n + 1)
extreme points of K.
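A small worked instance may help fix the statement (it is an illustration, not a solution of the exercise). Take $n = 2$ and the unit square $K = [0,1]^2$, whose extreme points are its four vertices; the point $x_0 = (\tfrac12, \tfrac14)$ is chosen arbitrarily for this sketch.

```latex
x_0 = \bigl(\tfrac12, \tfrac14\bigr)
    = \tfrac12\,(0,0) + \tfrac14\,(1,0) + \tfrac14\,(1,1),
\qquad \tfrac12 + \tfrac14 + \tfrac14 = 1 .
% Check of the coordinates:
%   first coordinate:  \tfrac14 \cdot 1 + \tfrac14 \cdot 1 = \tfrac12,
%   second coordinate: \tfrac14 \cdot 1 = \tfrac14 .
```

So $x_0$ is a convex combination of $n + 1 = 3$ of the four vertices; the fourth vertex $(0,1)$ is not needed, and a different choice of three vertices would work for other points of the square.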
For the proof of Theorem 8.90 we will need the following lemma and some
notation. We write A for the space of affine functions on X, that is, functions
of the form a(x) = ℓ(x) + c for some ℓ ∈ X ∗ and c ∈ R. Moreover, recall
that kf kK,∞ denotes the supremum norm of a function f restricted to some
subset K ⊆ X.
Proof. Since the constant function $\|f\|_{K,\infty}$ belongs to $A$, (1) follows at once
from the definition of $\overline{f}$.
Given $x, y \in K$, $\lambda \in [0, 1]$ and $a \in A$ with $a \ge f$ on $K$ we see that
\[ a(\lambda x + (1 - \lambda) y) = \lambda a(x) + (1 - \lambda) a(y) \ge \lambda \overline{f}(x) + (1 - \lambda) \overline{f}(y) \]
by definition of $\overline{f}(x)$ and $\overline{f}(y)$. The claim in (2) follows by taking the infimum
over $a$.
For (3), let $c \in \mathbb{R}$ and $x_0 \in K$ such that $\overline{f}(x_0) < c$. Then there exists
some $a \in A$ with $a \ge f$ and $a(x_0) < c$. Clearly
\[ M = \{ (x, c) \in K \times \mathbb{R} \mid c \le f(x) \} \]
with $f(x) < a(x)$ for all $x \in K$, and so $\overline{f}(x_0) \le a(x_0) = c_0$. Since $x_0 \in K$
and $c_0 > f(x_0)$ were arbitrary we deduce that $\overline{f} = f$ as required.
Now let $r \ge 0$, $a \in A$ and $g$ be as in (5). For $r = 0$ the statement is clear.
For $r > 0$ and a function $a \in A$ on $X$ we have $a \ge f$ if and only if $ra \ge rf$.
Therefore (a) follows from standard properties of the infimum.
Next notice that $a_f \ge f$, $a_g \ge g$ and $a_f, a_g \in A$ implies that
\[ A \ni a_f + a_g \ge f + g \]
\[ \overline{f + a} \le \overline{f} + \overline{a} = \overline{f} + a = \overline{f + a - a} + a \le \overline{f + a} + \overline{-a} + a = \overline{f + a}, \]
\[ \overline{f} = \overline{f - g + g} \le \overline{f - g} + \overline{g} \]
whose properties are crucial for the proof of the theorem. Since the series
converges uniformly on K, we see that F ∈ C(K). We claim that F is strictly
convex, meaning that
for some $n_0 \in \mathbb{N}$ and hence a strict inequality in (8.40) for this choice of $n = n_0$
(by strict convexity of $t \mapsto t^2$). Summing over $n$ gives (8.39).
We now fix x0 ∈ K, which we wish to represent. Using x0 we define the
subspace
V = A + RF ⊆ C(K),
the linear functional
for any $a + cF \in V$, and the function $p$ defined by $p(f) = \overline{f}(x_0)$ for all $f$
in $C(K)$.
By Lemma 8.92(5), the function p is norm-like, as required in the Hahn–
Banach lemma (Lemma 7.1). Also, by Lemma 8.92(5) we have
also for c < 0. Hence all the assumptions of the Hahn–Banach lemma are
satisfied and so there is an extension of Λ (which we again denote by Λ) to
all of C(K) satisfying
for all $f \in C(K)$ (where the last inequality is given by Lemma 8.92(1)). For
a non-negative function $f \in C(K)$ we also have $-f \le 0$ and so $\Lambda(f) \ge 0$.
Since $1 \in A \subseteq V$ we have $\Lambda(1) = 1(x_0) = 1$ by definition of $\Lambda$. Therefore
we may apply the Riesz representation theorem (Theorem 7.44) to obtain a
Borel probability measure $\mu$ on $K$ with
\[ \Lambda(f) = \int_K f \, d\mu \tag{8.41} \]
On the other hand $a \in A$ and $a \ge F$ implies that $a \ge \overline{F}$ and therefore
\[ \int_K \overline{F} \, d\mu \le \int_K a \, d\mu = \Lambda(a) = a(x_0). \]
We claim that
\[ \operatorname{ext}(K) \supseteq \{ x \in K \mid F = \overline{F} \}. \tag{8.44} \]
To see this, let $z = \lambda x + (1 - \lambda) y \in K$ be a non-extreme point (that is,
with $x \ne y \in K$ and $\lambda \in (0, 1)$). Since $F$ is strictly convex, this gives
and the claim follows. Note that once we have shown the measurability
of $\operatorname{ext}(K)$, this claim implies that $\mu(X \setminus \operatorname{ext}(K)) = 0$ by (8.43).
8.7 Further Topics 311
for each $n \ge 1$. Therefore, $\mu(K \setminus \operatorname{ext}(K)) = 0$ by (8.43) and (8.44) and $\mu$
represents $x_0$ by the argument after (8.43). This proves the theorem.
Exercise 8.93. Let K be a metrizable compact convex subset of a locally convex vector
space over R, and let x0 ∈ K. Show that x0 is an extreme point if and only if µ = δx0 is
the only Borel probability measure on K that represents x0 .
The material above follows the monograph of Phelps [86] loosely. We refer
to those notes for many interesting applications of Choquet’s theorem as well
as the generalization of this result to the case of general compact convex sets
in locally convex vector spaces (without the metrizability assumption) in the
form of the Choquet–Bishop–de Leeuw theorem.
Many proofs and theories depend on weak* compactness, the notion of locally
convex vector spaces, or the study of extreme points of convex subsets. We
only mention a few samples and give further references.
• Decay of Matrix Coefficients for Simple Lie Groups (the Howe–Moore
Theorem): If a simple non-compact Lie group $G$ acts unitarily on a Hilbert
space $H$ without non-zero $G$-fixed vectors, then the matrix coefficients
$\langle \pi_g v, w \rangle$ decay to zero as $g \to \infty$ in $G$, for any $v, w \in H$. In
the language of ergodic theory, this means that every measure-preserving
ergodic $G$-action is mixing. This may sound complicated but the proof
for $\mathrm{SL}_d(\mathbb{R})$ only needs as inputs the equality case of the Cauchy–Schwarz
inequality, the Banach–Alaoglu theorem, and matrix multiplication. We
refer to [27, Sec. 11.4] for a discussion of the easier case $G = \mathrm{SL}_2(\mathbb{R})$, and
to [25] for the general case. The weak* compactness is here used on the
Hilbert space $H$.
• In the study of von Neumann algebras two more topologies on B(X, Y )
are used (particularly in the case where X = Y is a Hilbert space):
\[ \|Uv\|_{H_2} = \|v\|_{H_1} \]
The type of operators seen in Example 9.1 are not difficult to deal with
even though they are usually not diagonalizable. Having abandoned the false
hope that all unitary operators will be diagonalizable (that is, describable
ultimately in terms of only countably many scalar multiplications on the
ground field), the next best hope one might have is that they can be fully
described in terms of multiplication by characters as in Example 9.1 (at the
expense of allowing the underlying measure µ to vary). That this is in fact
true is the content of the spectral theory of unitary operators.
Although this is not immediately apparent, a useful concept for the proof of
Theorem 9.2 is the notion of a positive-definite sequence.
\[ p_n(v) = \langle U^n v, v \rangle_H \]
for all n ∈ Z. Thus it is enough to consider the sequence (pn (v)) from Ex-
ample 9.5. Let (cn ) be a finite complex sequence as in Definition 9.3. Then
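\begin{align*}
\sum_{m,n \in \mathbb{Z}} c_m \overline{c_n}\, p_{m-n}(v) &= \sum_{m,n \in \mathbb{Z}} c_m \overline{c_n}\, \langle U^m v, U^n v \rangle_H \\
&= \Bigl\langle \sum_{m \in \mathbb{Z}} c_m U^m v, \sum_{n \in \mathbb{Z}} c_n U^n v \Bigr\rangle_H \ge 0,
\end{align*}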
\[ F_N(\theta) = \frac{1}{N} \sum_{m,n=1}^{N} \chi_{-m+n}(\theta)\, p_{m-n} \]
Recall the notion of a unitary representation from Definition 3.73 and notice
that a unitary operator U defines (and is defined by) an associated unitary
representation π of the group Z given by πn = U n for n ∈ Z.
Definition 9.7. A unitary representation π of a group G on a Hilbert
space H is called cyclic if
\[ H = H_v = \overline{\langle \pi_g v \mid g \in G \rangle} \]
then $H_1$ is $\pi$-invariant. Together with the fact that $\pi_g^* = \pi_{g^{-1}}$ for all $g \in G$
and Lemma 6.30, this implies that $H_1^\perp$ is also $\pi$-invariant. Define $H_2 = H_{w_2^\perp}$,
where $w_2^\perp \in H_1^\perp$ is the orthogonal projection of $w_2$ onto $H_1^\perp$. Again the
spaces $H_1 \oplus H_2$ and $(H_1 \oplus H_2)^\perp$ are $\pi$-invariant, and we can continue the
process by defining $H_3 = H_{w_3^\perp}$, where $w_3^\perp \in (H_1 \oplus H_2)^\perp$ is the orthogonal
projection of $w_3$ onto $(H_1 \oplus H_2)^\perp$. Clearly $w_1 \in H_1$, $w_2 \in H_1 \oplus H_2$, and
$w_3 \in H_1 \oplus H_2 \oplus H_3$.
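\[ H = H_v = \overline{\langle U^n v \mid n \in \mathbb{Z} \rangle} \]
\[
\begin{array}{ccc}
H & \xrightarrow{\;U\;} & H \\
\Big\downarrow{\scriptstyle \phi} & & \Big\downarrow{\scriptstyle \phi} \\
L^2(\mathbb{T}, \mu_v) & \xrightarrow[\;M_{\chi_1}\;]{} & L^2(\mathbb{T}, \mu_v)
\end{array}
\]
of unitary maps.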
Together with the discussion before Theorem 9.2, we see that spectral
measures should be thought of as a replacement for, or a generalization of,
eigenvalues. As we will see during the proof, the spectral measure $\mu_v$ stores
precisely the values of the inner products $\langle U^n v, v \rangle$ for a given unitary operator
$U$ on a Hilbert space $H$ and vector $v \in H$. In the case of an eigenvector
with eigenvalue $\lambda_0 = e^{2\pi i x_0}$, we obtain the Dirac measure $\|v\|^2 \delta_{x_0}$. At the
opposite extreme, it could be that the vectors $\dots, U^{-1} v, v, Uv, \dots$ are all mutually
orthogonal, in which case $\mu_v$ is the multiple $\|v\|^2 m_{\mathbb{T}}$ of the Lebesgue
measure.
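Both extreme cases can be verified directly from the characterizing property $\langle U^n v, v \rangle = \int_{\mathbb{T}} \chi_n \, d\mu_v$. The computation below is a sketch that assumes the convention $\chi_n(x) = e^{2\pi i n x}$ on $\mathbb{T}$ (consistent with the eigenvalue $\lambda_0 = e^{2\pi i x_0}$ above); the Kronecker symbol $\delta_{n,0}$ is notation introduced only for this illustration.

```latex
% Eigenvector case: U v = e^{2\pi i x_0} v.
\langle U^n v, v \rangle = e^{2\pi i n x_0} \|v\|^2
  = \int_{\mathbb{T}} \chi_n \, d\bigl( \|v\|^2 \delta_{x_0} \bigr)
  \qquad \text{for all } n \in \mathbb{Z}.
% Orthogonal case: U^m v \perp U^n v for m \ne n.
\langle U^n v, v \rangle = \|v\|^2 \, \delta_{n,0}
  = \|v\|^2 \int_{\mathbb{T}} e^{2\pi i n x} \, dm_{\mathbb{T}}(x)
  = \int_{\mathbb{T}} \chi_n \, d\bigl( \|v\|^2 m_{\mathbb{T}} \bigr)
  \qquad \text{for all } n \in \mathbb{Z}.
```

Since a finite measure on $\mathbb{T}$ is determined by its Fourier coefficients, these identities identify $\mu_v$ in each case.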
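Proof of Corollary 9.8. Let $v \in H$ be a generator of
\[ H = H_v = \overline{\langle U^n v \mid n \in \mathbb{Z} \rangle}. \]
\[ \phi : H_v = \overline{\langle U^n v \mid n \in \mathbb{Z} \rangle} \longrightarrow L^2(\mathbb{T}, \mu_v) \]
to be the unique extension of the map that sends any finite linear combination
of the vectors $U^n v$ to the corresponding trigonometric polynomial,
\[ \phi : H_v \longrightarrow L^2(\mathbb{T}, \mu_v), \qquad \sum_{|n| \le N} c_n U^n v \longmapsto \sum_{|n| \le N} c_n \chi_n. \]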
While this is a natural attempt at defining the map φ, it is not clear whether
it produces a well-defined map. Curiously (at first encounter), this will follow
from the map being an isometry: for any finite complex sequence (cn ), we
have
9.1 Spectral Theory of Unitary Operators 319
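\begin{align*}
\Bigl\| \sum_{n \in \mathbb{Z}} c_n U^n v \Bigr\|_H^2
&= \sum_{m,n \in \mathbb{Z}} \langle c_m U^m v, c_n U^n v \rangle_H
 = \sum_{m,n \in \mathbb{Z}} c_m \overline{c_n}\, \underbrace{\langle U^{m-n} v, v \rangle_H}_{p_{m-n}(v) = p_{m-n}(\mu_v)} \\
&= \sum_{m,n \in \mathbb{Z}} c_m \overline{c_n} \int \chi_{m-n} \, d\mu_v \\
&= \int \Bigl( \sum_{m \in \mathbb{Z}} c_m \chi_m \Bigr) \overline{\Bigl( \sum_{n \in \mathbb{Z}} c_n \chi_n \Bigr)} \, d\mu_v
 = \Bigl\| \sum_{n \in \mathbb{Z}} c_n \chi_n \Bigr\|_{L^2(\mathbb{T}, \mu_v)}^2 .
\end{align*}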
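and so
\[ \phi\Bigl( U \sum_{n \in \mathbb{Z}} c_n U^n v \Bigr) = \sum_{n \in \mathbb{Z}} c_n \chi_{n+1} = \chi_1 \sum_{n \in \mathbb{Z}} c_n \chi_n = M_{\chi_1} \phi\Bigl( \sum_{n \in \mathbb{Z}} c_n U^n v \Bigr). \]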
That is, the desired formula holds on a dense subset of Hv and so by continuity
on all of Hv . This proves the corollary.
Exercise 9.9. (a) Let $H_v \cong L^2(\mathbb{T}, \mu_v)$ be as in Corollary 9.8. Let $w \in H_v$ and suppose
that $f \in L^2(\mathbb{T}, \mu_v)$ corresponds to $w$. Characterize the property $H_v = H_w$ in terms of $f$.
(b) Apply (a) to the unitary operator defined by $U((a_n)) = (a_{n-1})$ on $H = \ell^2(\mathbb{Z})$ and the
vector $(v_n)$ with $v_n = 0$ for $n \ne 0$ and $v_0 = 1$.
320 9 Unitary Operators and Flows, Fourier Transform
The spectral theory of unitary operators has the spectral theory of self-
adjoint operators as a consequence, as the next exercise shows. However, we
will also give an independent and much more detailed treatment of this theory
in Chapter 12.
Exercise 9.10. (a) For any bounded operator $A : V \to V$ on a Banach space $V$ and any
power series $f(z) = \sum_{k=0}^{\infty} c_k z^k$ whose radius of convergence is bigger than $\|A\|$, show
that the natural definition of $f(A)$ as the limit of the sequence of operators obtained as
partial sums makes sense. Show that if $g(z) = \sum_{n=0}^{\infty} d_n (z - c_0)^n$ is the inverse function
to $f$ defined in a neighbourhood of $f(0) = c_0$ (represented by another power series)
and $\sum_{n=0}^{\infty} |d_n| \bigl( \sum_{k=0}^{\infty} |c_k| \|A\|^k \bigr)^n < \infty$, then we have $g(f(A)) = A$.
(b) Let $A : H \to H$ be a non-zero self-adjoint operator. Replacing $A$ by $\frac{1}{2\|A\|} A$ we may
assume that $\|A\| = \frac{1}{2}$. Apply part (a) to $A$ and the power series corresponding to $e^{iz}$ to
obtain a unitary operator $U : H \to H$. Show that $\|U - I\| \le e^{1/2} - 1 < 1$, and that $A$ can
be recovered from $U$ via the power series representing $\frac{1}{i} \log(z)$ in a neighbourhood of $1$.
(c) Apply Theorem 9.2 to $U$ and show that one can describe $A$ on $H$ by a direct sum of
multiplication operators as in Exercise 6.25(b). In fact, for each of the direct summands
the measure space can be chosen to be a copy of $\mathbb{R}$ together with a measure supported
in $[-\|A\|, \|A\|]$ and the multiplication operator can be chosen to be $M_I(f)(x) = x f(x)$.
For simplicity we have been working in this section with a single unitary
operator (or the group $\mathbb{Z}$) but the approach can be generalized to several
commuting unitary operators (the group $\mathbb{Z}^d$) as outlined in the following
exercise.
Exercise 9.11. (a) Define positive-definite functions on $\mathbb{Z}^d$ (so that the sequence case
corresponds to $d = 1$), and generalize Herglotz's theorem to this context.
(b) State and prove a corollary to part (a) regarding the spectral theory of $d$ commuting
unitary operators, so that Theorem 9.2 corresponds to the case $d = 1$.
In this subsection we will strengthen the spectral theorem for unitary op-
erators by studying the sequence of spectral measures appearing in it more
carefully. We start with a few immediate consequences of the definition of
the spectral measures.
Proof†. In the proof of Theorem 9.2 (more precisely, in the argument after
Definition 9.7), we found a sequence of vectors $w_1, w_2, \dots$ in $H$ such that
\[ H = \bigoplus_{n \ge 1} H_{w_n} \]
(where the sum is possibly finite). Each of the vectors has a spectral measure
$\nu_n = \mu_{w_n}$ for $n \ge 1$. Applying the Lebesgue decomposition theorem (from
Proposition 3.29) to $\nu_1$ and $\nu_n$ for $n \ge 2$ we define
with $\nu_n^{\mathrm{ac}} \ll \nu_1$ and $\nu_n^\perp \perp \nu_1$. From $\nu_n^\perp \perp \nu_1$ it follows that there exists some
measurable $B_n \subseteq \mathbb{T}$ such that $\nu_n^\perp(B_n) = 0$ and $\nu_1(\mathbb{T} \setminus B_n) = 0$, which implies
with (9.1) that $\nu_n^{\mathrm{ac}} = \nu_n|_{B_n}$ and $\nu_n^\perp = \nu_n|_{\mathbb{T} \setminus B_n}$. We will use the set $B_n$ to
decompose $w_n$ into two components. In fact, under the unitary isomorphism
between $H_{w_n}$ and $L^2(\mathbb{T}, \nu_n)$, let $w_n^\perp \in H_{w_n}$ be the vector corresponding
to $c_n \mathbb{1}_{\mathbb{T} \setminus B_n}$ where $c_n > 0$ is chosen so that $\|w_n^\perp\| \le \frac{1}{2^n}$.
Let $w = w_1 + \sum_{n \ge 2} w_n^\perp$, which converges absolutely. Using Lemma 9.12(d)
we now find that
\[ d\mu_w = d\nu_1 + \sum_{n \ge 2} c_n^2\, \mathbb{1}_{\mathbb{T} \setminus B_n} \, d\nu_n, \]
or equivalently
\[ \mu_w = \nu_1 + \sum_{n \ge 2} c_n^2\, \nu_n^\perp. \]
Exercise 9.15. (a) Given finite measures µ and ν on T with µ ≪ ν ≪ µ, show that the
corresponding multiplication operators from Example 9.1 are unitarily isomorphic.
(b) Reformulate Corollary 9.13 to show the existence of a sequence of finite measures $(\nu_n)$
and a measure $\nu_\infty$ with $\nu_m \perp \nu_n$ for all $m \ne n$ with $m, n \in \mathbb{N} \cup \{\infty\}$ with the property
that $H \cong \bigoplus_{n \ge 1} L^2(\mathbb{T}, \nu_n)^n \oplus L^2(\mathbb{T}, \nu_\infty)^{\mathbb{N}}$ and the isomorphism carries $U$ to the sum of
to $L^2(\mathbb{T}, \nu_2)^2$.
(e) Generalize (d) to higher multiplicities and conclude that the sequence of the measure
classes of $\nu_1, \nu_2, \dots, \nu_\infty$ and subspaces in $H$ corresponding to $L^2(\mathbb{T}, \nu_n)^n$ resp. $L^2(\mathbb{T}, \nu_\infty)^{\mathbb{N}}$
are uniquely determined by $U$.
further. For example, one could define h(Mg ) by setting it equal to Mh◦g for
any bounded measurable function h. The reader should verify at this point
that this definition does generalize the prior definition for analytic functions
to measurable functions. Since Theorem 9.2 and Exercise 9.10 describe ar-
bitrary unitary or self-adjoint operators in terms of multiplication operators,
this allows one to also define the operators obtained by applying h to these.
However, from this definition it is not clear whether the result is independent
of the choices made to describe the operator on H as a sum of multiplication
operators. As it turns out, this is the case, and we will discuss this ‘functional
calculus’ in greater detail and in a more general setting in Chapter 12. Here
we aim to give a first taste of this theory, by discussing simpler instances of
the results for a single unitary operator.
For the discussion in this subsection it is convenient to use χ1 as an iso-
morphism from T to S1 = {z ∈ C | |z| = 1}, and transport the spectral
measures µv on T provided by Corollary 9.8 to S1 . We will still use the same
symbol for these measures so that their characterizing property becomes
\[ \langle U^n v, v \rangle_H = \int_{S^1} z^n \, d\mu_v(z) \]
for all n ∈ Z.
The idea behind the proof of this corollary to Bochner’s theorem is simple,
and relies on the polarization identity
\[ \langle U^n v, w \rangle_H = \frac{1}{4} \sum_{\ell=0}^{3} i^{\ell} \big\langle U^n (v + i^{\ell} w), v + i^{\ell} w \big\rangle_H \tag{9.3} \]
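As a numerical sanity check of the polarization identity (9.3), the following Python sketch evaluates both sides for a diagonal unitary operator on $\mathbb{C}^2$. The particular operator, the vectors, and all function names are illustrative assumptions and do not come from the text; the inner product is taken to be linear in the first argument.

```python
import cmath

def inner(x, y):
    """Inner product on C^2: linear in x, conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply_power(phases, n, x):
    """Apply U^n for the diagonal unitary U = diag(phases), each |phase| = 1."""
    return [p ** n * a for p, a in zip(phases, x)]

def polarization_rhs(phases, n, v, w):
    """Right-hand side of (9.3): (1/4) * sum over l of i^l <U^n(v + i^l w), v + i^l w>."""
    total = 0j
    for l in range(4):
        s = 1j ** l
        u = [a + s * b for a, b in zip(v, w)]
        total += s * inner(apply_power(phases, n, u), u)
    return total / 4

# Hypothetical data: a unitary with two eigenvalues on the unit circle.
PHASES = [cmath.exp(0.3j), cmath.exp(-1.1j)]
V = [1 + 2j, 0.5 - 1j]
W = [-0.7j, 2 + 0.25j]

for n in (-2, 0, 3):
    lhs = inner(apply_power(PHASES, n, V), W)
    print(n, abs(lhs - polarization_rhs(PHASES, n, V, W)))
```

The printed differences should be at the level of floating-point rounding for every integer $n$, including negative ones.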
Exercise 9.19. Improve the estimate in (9.5) to the inequality $\|\mu_{v,w}\| \leq \|v\| \|w\|$ for all $v, w \in H$.
Exercise 9.20. Let U be a unitary operator on a separable complex Hilbert space and h
a function in L ∞ (S1 ). When is h(U ) unitary or self-adjoint? What is the norm kh(U )kop ?
for all $f, g \in L^2(X, \mu)$.
Furstenberg introduced the notion of joinings and also gave the first classes
of disjoint systems.
any $f$ in $L^2_0(X, \nu_X)$ and any $g$ in $L^2_0(Y, \nu_Y)$ the spectral measures $\mu_f$ (with respect to $U_T$ on $L^2(X, \nu_X)$) and $\mu_g$ (with respect to $U_S$ on $L^2(Y, \nu_Y)$) are mutually singular. Then the two systems are disjoint.
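\[ U_{T \times S}(f \circ \pi_X) = U_T(f) \circ \pi_X. \]
Moreover,
\[ \big\langle U_{T \times S}^n (f \circ \pi_X), f \circ \pi_X \big\rangle_{L^2(X \times Y, \rho)} = \langle U_T^n f, f \rangle_{L^2(X, \nu_X)} \]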
for all $n \in \mathbb{Z}$, which implies that the spectral measure $\mu_{f \circ \pi_X}$ defined using the unitary operator $U_{T \times S} \colon L^2(X \times Y, \rho) \to L^2(X \times Y, \rho)$ agrees with $\mu_f$. A similar statement holds for $g \in L^2(Y, \nu_Y)$.
Applying this to $f \in L^2_0(X, \nu_X)$ and $g \in L^2_0(Y, \nu_Y)$, using the assumption in the theorem and Lemma 9.12(e) we see that $f \circ \pi_X \perp g \circ \pi_Y$. Now let $A \subseteq X$ and $B \subseteq Y$ be measurable sets and define $f = 1_A - \nu_X(A) \in L^2_0(X, \nu_X)$ and $g = 1_B - \nu_Y(B) \in L^2_0(Y, \nu_Y)$ to obtain
where all the inner products are taken in L2 (X × Y, ρ). As this holds for all
measurable A ⊆ X and B ⊆ Y , we deduce that ρ = νX × νY .
We wrap up the discussion of disjointness, and our excursion into ergodic
theory, by discussing a consequence of disjointness for the dynamics of indi-
vidual points.
Let X be a compact metric space, T : X → X a continuous map, and µ
a T -invariant and ergodic probability measure on X. A consequence of the
pointwise ergodic theorem (one of the fundamental results in ergodic theory,
see [27, Ch. 2, Sec. 4.4.2]) is that µ-almost every point x ∈ X satisfies
\[ \frac{1}{N} \sum_{n=0}^{N-1} f(T^n x) \longrightarrow \int_X f \, d\mu \]
The analogue of the fact that the Fourier series represents the original function (where this is true) is a Fourier inversion formula $f = (\hat f)^{\vee}$. However, the way in which the optimistic identity $f = (\hat f)^{\vee}$ needs to be interpreted as a mathematical theorem is more involved. For example, if $f \in L^2(\mathbb{R}^d)$ then there is no reason to expect the integral defining the Fourier transform in (9.6) to exist. However, we will still be able to obtain a sensible definition of the Fourier transform as an extension of a densely defined bounded operator.
330 9 Unitary Operators and Flows, Fourier Transform
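Thus
\begin{align*}
\hat f(0)^2 &= \int_{\mathbb{R}^2} e^{-\pi x^2} e^{-\pi y^2} \, dx \, dy && \text{(by Fubini)} \\
&= \int_0^\infty \int_0^{2\pi} e^{-\pi r^2} r \, d\theta \, dr && \text{(in polar coordinates)} \\
&= 2\pi \int_0^\infty e^{-\pi r^2} r \, dr = \int_0^\infty e^{-s} \, ds = 1 && \text{(where } \pi r^2 = s\text{)}
\end{align*}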
and as $\hat f(0) > 0$ we get $\hat f(0) = 1$. To verify the claimed formula for a general $t$ in $\mathbb{R}$ we will use the Cauchy integral formula for complex path integrals applied to the holomorphic function $\mathbb{C} \ni z \mapsto e^{-\pi z^2}$. We integrate over a rectangular path $\gamma$ with corners at $\pm M$ and $\pm M + it$ as illustrated in Figure 9.1.
[Figure 9.1: the rectangular path $\gamma$ with corners $-M$, $M$, $M + it$, and $-M + it$.]
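which implies that $e^{-\pi(\pm M + is)^2} \to 0$ uniformly on $[-t, t]$ as $M \to \infty$. Thus letting $M \to \infty$ in (9.8) we see that
\[ 1 = \int_{-\infty}^{\infty} e^{-\pi x^2} \, dx = \underbrace{\int_{-\infty}^{\infty} e^{-\pi x^2} e^{-2\pi i t x} \, dx}_{\hat f(t)} \; e^{\pi t^2}, \]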
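The identity above says that $f(x) = e^{-\pi x^2}$ satisfies $\hat f(t) = e^{-\pi t^2}$. As a hedged numerical illustration, the following Python sketch approximates the defining integral by the trapezoidal rule on a truncated interval; the truncation width, the step count, and the function name are assumptions of this example. Only the cosine part is integrated, since the sine part vanishes by symmetry.

```python
import math

def fourier_gaussian(t, half_width=8.0, steps=16000):
    """Trapezoidal approximation of the integral of e^{-pi x^2} e^{-2 pi i t x} over R.

    The integrand is even in x, so the imaginary (sine) part integrates to zero
    and only the cosine part is summed. The tails beyond |x| = half_width are
    negligible for this integrand.
    """
    h = 2 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(-math.pi * x * x) * math.cos(2 * math.pi * t * x)
    return total * h

for t in (0.0, 0.7, 1.5):
    print(t, fourier_gaussian(t), math.exp(-math.pi * t * t))
```

The two printed columns should agree to many digits, in line with $\hat f(0) = 1$ and the formula for general $t$.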
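\[ \lambda_{x_0}(f) \colon x \longmapsto f(x - x_0) \]
and
\[ M_{\chi(t_0)}(f) \colon x \longmapsto e^{2\pi i x \cdot t_0} f(x). \]
Then $\widehat{\lambda_{x_0}(f)} = M_{\chi(-x_0)}(\hat f)$ and $\widehat{M_{\chi(t_0)}(f)} = \lambda_{t_0}(\hat f)$.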
for all t0 , t ∈ Rd .
for all t ∈ Rd .
\[ \widehat{f_1 * f_2} = \hat f_1 \hat f_2. \]
Thus the integral defining $f_1 * f_2(x)$ exists for almost every $x \in \mathbb{R}^d$, and
\[ \|f_1 * f_2\|_1 \leq \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} |f_1(y) f_2(x - y)| \, dy \, dx = \|f_1\|_1 \|f_2\|_1. \]
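The same two facts, the norm bound for convolutions and the multiplicativity of the transform, can be checked by hand in a discrete analogue. The Python sketch below replaces $\mathbb{R}^d$ by $\mathbb{Z}$ and functions by finitely supported sequences; this substitution, the sample sequences, and the function names are assumptions made only for illustration.

```python
import cmath

def convolve(a, b):
    """Convolution of finitely supported sequences indexed from 0."""
    out = [0j] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def norm1(a):
    """The l^1 norm of a finite sequence."""
    return sum(abs(x) for x in a)

def transform(a, t):
    """Discrete analogue of the Fourier transform: sum over n of a[n] e^{-2 pi i n t}."""
    return sum(x * cmath.exp(-2j * cmath.pi * n * t) for n, x in enumerate(a))

A = [1, -2, 0.5]     # hypothetical sample data
B = [3, 1j]
C = convolve(A, B)

print(norm1(C), "<=", norm1(A) * norm1(B))
print(abs(transform(C, 0.37) - transform(A, 0.37) * transform(B, 0.37)))
```

Because of cancellation between terms the inequality is strict for this data; it becomes an equality when both sequences are non-negative.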
The impatient reader may use the propositions above together with Ex-
ample 9.27 to show that the Fourier transform extends to an isometry
from L2 (Rd ) to L2 (Rd ) via the steps of the following exercise.
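Exercise 9.33. (a) Show that
\[ \mathcal{A} = \Big\{ x \longmapsto \sum_{\text{finite}} c_i e^{-\pi a_i \|x - x_i\|^2 + 2\pi i x \cdot t_i} \;\Big|\; c_i \in \mathbb{C},\ a_i > 0,\ x_i, t_i \in \mathbb{R}^d \Big\} \]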
Exercise 9.35. Show that the Fourier transform calculated in (9.9) does not belong
to L1 (R).
As mentioned above, we will show that the Fourier back transform is the inverse of the Fourier transform. However, as we will see, this requires additional assumptions on the function, since the hypothesis $f \in L^1(\mathbb{R}^d)$ does not imply that $\hat f \in L^1(\mathbb{R}^d)$ (as seen in Exercise 9.35), so there is no reason to expect that the Fourier back transform will be defined on $\hat f$.
where
\[ \phi_r(t) = e^{-\pi \|rt\|^2} \]
for $x_0 \in \mathbb{R}^d$ and $r > 0$. Using Lemma 9.38 we can define the function $f_r$ in two equivalent ways by
\[ f_r(x_0) = \int_{\mathbb{R}^d} \hat f(t) \phi_{r,x_0}(t) \, dt = \int_{\mathbb{R}^d} f(x) \widehat{\phi_{r,x_0}}(x) \, dx \tag{9.10} \]
for all $x_0 \in \mathbb{R}^d$. We will use the two sides of this formula to show that $f_r$ converges as $r \to 0$ both to $(\hat f)^{\vee}$ and to $f$ (in two different ways).
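Pointwise convergence. We first show that $f_r \to (\hat f)^{\vee}$ pointwise as $r \to 0$ (where we will use the left-hand integral in (9.10)). Since
\[ f_r(x_0) = \int_{\mathbb{R}^d} \hat f(t) e^{2\pi i t \cdot x_0} e^{-\pi \|rt\|^2} \, dt \]
and $e^{-\pi \|rt\|^2} \to 1$ as $r \to 0$, we obtain
\[ f_r(x_0) \longrightarrow \int_{\mathbb{R}^d} \hat f(t) e^{2\pi i t \cdot x_0} \, dt = (\hat f)^{\vee}(x_0) \]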
\[ \widehat{\phi_{r,x_0}}(x) = r^{-d} e^{-\pi \|(x - x_0)/r\|^2}. \]
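This gives
\[ f_r(x_0) = \int f(x) r^{-d} e^{-\pi \|(x_0 - x)/r\|^2} \, dx = f * \widehat{\phi_r}(x_0) \]
by using the substitution $z = x/r$ (and recalling that $\int_{\mathbb{R}^d} e^{-\pi \|z\|^2} \, dz = 1$). On taking the norm we obtain
\begin{align*}
\|f * \widehat{\phi_r} - f\|_1 &= \int \Big| \int \big( f(x_0 - rz) - f(x_0) \big) e^{-\pi \|z\|^2} \, dz \Big| \, dx_0 \\
&\leq \iint |f(x_0 - rz) - f(x_0)| e^{-\pi \|z\|^2} \, dz \, dx_0 \\
&\leq \int \|\lambda_{rz}(f) - f\|_1 e^{-\pi \|z\|^2} \, dz
\end{align*}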
\[ \|f * \widehat{\phi_r} - f\|_1 \longrightarrow 0 \]
\[ |f|^2 = f \overline{f} \in L^1(\mathbb{R}^d) \]
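\begin{align*}
\|f * \widehat{\phi_r} - f\|_2^2 &= \int \Big| \int \big( f(x_0 - rz) - f(x_0) \big) e^{-\pi |z|^2} \, dz \Big|^2 \, dx_0 \\
&\leq \iint |\lambda_{rz} f(x_0) - f(x_0)|^2 e^{-\pi |z|^2} \, dz \, dx_0 \\
&= \int \|\lambda_{rz} f - f\|_2^2 \, e^{-\pi |z|^2} \, dz \longrightarrow 0
\end{align*}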
In other words, we have shown that the Fourier transform preserves the inner
product for elements in V. It follows that the Fourier transform extends to
an isometry from L2 (Rd ) to itself, which we again denote by
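Using Proposition 9.29 and $\lambda_{-t_0}(\hat g)(-t) = \hat g(t_0 - t)$ for all $t \in \mathbb{R}^d$ we can extend this to
\begin{align*}
\hat f * \hat g(t_0) &= \int_{\mathbb{R}^d} \hat f(t) \hat g(t_0 - t) \, dt = \hat f * \lambda_{-t_0}(\hat g)(0) \\
&= \hat f * \widehat{M_{\chi(-t_0)}(g)}(0) = \int_{\mathbb{R}^d} f M_{\chi(-t_0)}(g) \, dx = \widehat{fg}(t_0)
\end{align*}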
for all t0 ∈ Rd .
Exercise 9.41. Show that the unitary operator $L^2(\mathbb{R}^d) \ni f \mapsto \hat f \in L^2(\mathbb{R}^d)$ is completely diagonalizable and has only four eigenvalues.
Exercise 9.42. Use the Riesz–Thorin interpolation theorem to prove the Hausdorff–Young inequality. Fix $p \in (1, 2)$. Show that the Fourier transform on $L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$ can be extended to all $f \in L^p(\mathbb{R}^d)$ so that $\|\hat f\|_q \leq \|f\|_p$ where $\frac{1}{p} + \frac{1}{q} = 1$.
As with Fourier series in Section 3.4, smoothness and decay properties of the Fourier transform are closely related. For $x \in \mathbb{R}^d$ and $\alpha \in \mathbb{N}_0^d$ we write $x^{\alpha}$ for $x_1^{\alpha_1} \cdots x_d^{\alpha_d}$ and define $M_{(cI)^{\alpha}} f(x) = (cx)^{\alpha} f(x)$ for any function $f$ on $\mathbb{R}^d$ and scalar $c$.
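Proposition 9.43 (Duality between differentiation and multiplication by monomials). If $x \mapsto x^{\alpha} f(x)$ lies in $L^1(\mathbb{R}^d)$ for all $\alpha \in \mathbb{N}_0^d$ with $\|\alpha\|_1 \leq k$, then $\hat f \in C^k(\mathbb{R}^d)$, and
\[ \partial_{\alpha} \hat f = \widehat{M_{(-2\pi i I)^{\alpha}}(f)} \]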
\[ \widehat{\partial_{\alpha} f}(t) = M_{(2\pi i I)^{\alpha}} \hat f(t) \]
The following exercises describe the main properties of S (Rd ) and of the
Fourier transform on S (Rd ).
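Essential Exercise 9.45. (a) Show that $\mathscr{S}(\mathbb{R}^d)$ is a Fréchet space (see Definition 8.65) with the seminorms $\|f\|_{\alpha,\beta} = \|x^{\alpha} \partial_{\beta} f\|_{\infty}$ for $f \in \mathscr{S}(\mathbb{R}^d)$ and $\alpha, \beta \in \mathbb{N}_0^d$.
(b) If the seminorms $\|f\|'_{\alpha,\beta} = \|\partial_{\beta}(x^{\alpha} f(x))\|_{\infty}$ are used instead, do you get the same Fréchet space?
(c) What happens if we replace the supremum norms by 1-norms or 2-norms?
(d) Show that $\mathscr{S}(\mathbb{R}^d) \subseteq L^p(\mathbb{R}^d)$ for all $p \in [1, \infty]$.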
Essential Exercise 9.46. Show that the Fourier transform $f \mapsto \hat f$ maps $\mathscr{S}(\mathbb{R}^d)$ to itself, is a continuous operator, and has the Fourier back transform $f \mapsto \check f$ as its continuous inverse.
Exercise 9.47. Prove the Poisson summation formula:
\[ \sum_{n \in \mathbb{Z}^d} f(n) = \sum_{n \in \mathbb{Z}^d} \hat f(n) \]
for $f \in \mathscr{S}(\mathbb{R}^d)$.
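Before attempting a proof, it can be reassuring to test the Poisson summation formula numerically. The Python sketch below does this for $d = 1$ and the Gaussian $f(x) = e^{-\pi a x^2}$ with $a > 0$. It assumes the formula $\hat f(t) = a^{-1/2} e^{-\pi t^2 / a}$, which is not stated in this excerpt but follows from the Gaussian computation earlier in the chapter by a change of variables; the cutoff and the function names are likewise illustrative assumptions.

```python
import math

def sum_of_f(a, cutoff=60):
    """Sum of f(n) = e^{-pi a n^2} over integers |n| <= cutoff."""
    return sum(math.exp(-math.pi * a * n * n) for n in range(-cutoff, cutoff + 1))

def sum_of_f_hat(a, cutoff=60):
    """Sum of f_hat(n) = a^{-1/2} e^{-pi n^2 / a} over |n| <= cutoff (assumed transform)."""
    total = sum(math.exp(-math.pi * n * n / a) for n in range(-cutoff, cutoff + 1))
    return total / math.sqrt(a)

for a in (0.5, 1.0, 2.0):
    print(a, sum_of_f(a), sum_of_f_hat(a))
```

Both sums converge extremely fast, so the truncation at $|n| \le 60$ does not affect the comparison at double precision.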
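\[ z \overline{w} + \overline{z} w = 2 \Re(z \overline{w}). \tag{9.12} \]
Then
\begin{align*}
\|f\|_2^2 &= \int_{\mathbb{R}} |f(x)|^2 \, dx = \int_{\mathbb{R}} f(x) \overline{f(x)} \, dx \\
&= \underbrace{\Big[ x |f(x)|^2 \Big]_{-\infty}^{\infty}}_{= 0 \text{ as } f \in \mathscr{S}(\mathbb{R})} - \int_{\mathbb{R}} x \Big( f'(x) \overline{f(x)} + f(x) \overline{f'(x)} \Big) \, dx \\
&= -2 \int_{\mathbb{R}} x \Re\big( f(x) \overline{f'(x)} \big) \, dx. && \text{(by (9.12))}
\end{align*}
Hence
\[ \|f\|_2^2 = -2 \Re \int_{\mathbb{R}} x f(x) \overline{f'(x)} \, dx \leq 2 \|M_I(f)\|_2 \|f'\|_2 \]
so that
\[ \|f\|_2^2 \leq 4\pi \|M_I(f)\|_2 \|M_I(\hat f)\|_2, \]
as claimed in the theorem.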
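The inequality just proved can be illustrated numerically. The Python sketch below compares both sides for two Schwartz functions whose Fourier transforms are known in closed form: the Gaussian $e^{-\pi x^2}$, which is its own transform, and $x e^{-\pi x^2}$, whose transform $-i t e^{-\pi t^2}$ is obtained here from Proposition 9.43 and should be regarded as an assumption of this example. The quadrature routine and all names are illustrative only.

```python
import math

def integrate(g, half_width=8.0, steps=16000):
    """Trapezoidal rule on [-half_width, half_width] for rapidly decaying g."""
    h = 2 * half_width / steps
    total = 0.5 * (g(-half_width) + g(half_width))
    for k in range(1, steps):
        total += g(-half_width + k * h)
    return total * h

def l2_norm(g):
    """The L^2(R) norm of g, computed numerically."""
    return math.sqrt(integrate(lambda x: abs(g(x)) ** 2))

def uncertainty_sides(f, f_hat):
    """Return (||f||_2^2, 4 pi ||M_I f||_2 ||M_I f_hat||_2)."""
    left = l2_norm(f) ** 2
    right = 4 * math.pi * l2_norm(lambda x: x * f(x)) * l2_norm(lambda t: t * f_hat(t))
    return left, right

def gauss(x):
    return math.exp(-math.pi * x * x)

def odd(x):
    return x * math.exp(-math.pi * x * x)

def odd_hat(t):
    # Assumed transform of x e^{-pi x^2}, derived from Proposition 9.43.
    return -1j * t * math.exp(-math.pi * t * t)

print(uncertainty_sides(gauss, gauss))
print(uncertainty_sides(odd, odd_hat))
```

For the Gaussian the two sides coincide, consistent with Exercise 9.50, while for the second function the right-hand side is three times the left.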
Exercise 9.50. Show that if for some $f \in \mathscr{S}(\mathbb{R})$ we have equality in (9.11) then $f$ has the form $f(x) = A e^{-B^2 x^2}$ for constants $A \in \mathbb{C}$ and $B > 0$.
Exercise 9.51. Extend Theorem 9.49 by showing that for any $f \in \mathscr{S}(\mathbb{R})$ and $x_0, t_0$ in $\mathbb{R}$ we have
\[ \|M_{I - x_0}(f)\|_2 \|M_{I - t_0}(\hat f)\|_2 \geq \frac{1}{4\pi} \|f\|_2^2, \]
and that equality holds only if
\[ f(x) = A e^{2\pi i x t_0} e^{-B^2 (x - x_0)^2} \]
p:G→C
Now suppose that $\pi'$, $H'$, and $v'$ have the properties stated in the lemma, namely $p_{\pi,v} = p_{\pi',v'}$. Then the elements
\[ \sum_{m=1}^{\ell} c_m \pi_{g_m} v \in H_v \]
and
\[ \sum_{m=1}^{\ell} c_m \pi'_{g_m} v' \in H'_{v'} \]
have the same norms in their respective Hilbert spaces as both norms can be expressed as above in terms of the positive-definite function $p$. We can define $\Psi$ on a dense subset of $H_v$ by setting
\[ \Psi\Big( \sum_{m=1}^{\ell} c_m \pi_{g_m} v \Big) = \sum_{m=1}^{\ell} c_m \pi'_{g_m} v', \]
As this holds for all $\alpha \in \mathbb{C}$, we may set $\alpha = 0$ and see that $p(e) \geq 0$. Now use both $\alpha = 1$ and $\alpha = i$ to see that $p(g^{-1}) = \overline{p(g)}$. Finally, if $p(g) \neq 0$, we may set $\alpha = -|p(g)|/p(g)$ to see that $2p(e) - 2|p(g)| \geq 0$. It follows that $|p(g)| \leq p(e)$ (which also holds if $p(g) = 0$), with equality for $g = e$, giving the lemma.
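The properties derived in this proof can be observed on a concrete positive-definite function on $\mathbb{R}$. The Python sketch below takes $p(x) = \sum_k w_k e^{2\pi i x t_k}$ with weights $w_k > 0$, that is, the transform of a finite positive measure with finitely many atoms. Definition 9.52 is not reproduced in this excerpt, so the convention used for the quadratic form, namely $\sum_{i,j} c_i \overline{c_j}\, p(x_i - x_j)$, is an assumption, as are the atoms, sample points, and coefficients.

```python
import cmath

# Hypothetical finite positive measure: atoms t_k with weights w_k > 0.
ATOMS = [(-1.5, 0.2), (0.0, 1.0), (0.7, 0.5)]

def p(x):
    """p(x) = sum over k of w_k e^{2 pi i x t_k}."""
    return sum(w * cmath.exp(2j * cmath.pi * x * t) for t, w in ATOMS)

def quadratic_form(points, coeffs):
    """sum over i, j of c_i conj(c_j) p(x_i - x_j) (assumed convention)."""
    return sum(c_i * c_j.conjugate() * p(x_i - x_j)
               for x_i, c_i in zip(points, coeffs)
               for x_j, c_j in zip(points, coeffs))

POINTS = [0.0, 0.3, -1.2, 2.5]
COEFFS = [1 + 0j, -2j, 0.5 + 0.5j, -1 + 0j]

print(quadratic_form(POINTS, COEFFS))          # real and non-negative up to rounding
print([abs(p(x)) <= p(0).real for x in POINTS])
print(abs(p(-0.3) - p(0.3).conjugate()))       # p(-x) equals the conjugate of p(x)
```

For this choice of $p$ the quadratic form equals $\sum_k w_k \big| \sum_i c_i e^{2\pi i x_i t_k} \big|^2$, which explains why it cannot be negative.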
The converse of the statement in Exercise 9.55(b) also holds, and will be
shown later (see Exercise 12.59).
The following describes all positive-definite functions for Rd and hence once
again all cyclic representations of Rd .
for all x ∈ Rd .
We note that this means that $p(x) = \langle M_{\chi(x)} 1, 1 \rangle_{L^2(\mathbb{R}^d, \mu)}$, where $M_{\chi(x)}$ is
p
We postpone the proof of Bochner’s theorem and first discuss one of its
corollaries, the spectral theorem.
Theorem 9.58 (Spectral theorem for $\mathbb{R}^d$). Let $d \geq 1$ and suppose that $\pi$ is a unitary representation $\mathbb{R}^d \circlearrowright H$ on a separable complex Hilbert space $H$. Then there is a decomposition $H = \bigoplus_{n \geq 1} H_{v_n}$ for some sequence $(v_n)$ in $H$. Moreover, for every $v \in H$ the unitary representation $\pi \colon \mathbb{R}^d \circlearrowright H_v$ is unitarily isomorphic to the unitary representation $M_{\chi(x)}$ on $L^2(\mathbb{R}^d, \mu_v)$, where $\mu_v$ is the spectral measure of $v \in H$ (obtained from $p_{\pi,v}$ and Theorem 9.56) and $M_{\chi(x)}$ is the unitary multiplication operator on $L^2(\mathbb{R}^d, \mu_v)$, as above.
Proof. The argument after Definition 9.7 shows that H can be written
as an orthogonal direct sum of cyclic representations Hvn for some vec-
tors v1 , v2 , . . . ∈ H. We apply Bochner’s theorem (Theorem 9.56) to a cyclic
representation, say Hv for v ∈ H, to find the spectral measure. Lemma 9.53,
the comment after Theorem 9.56, and Exercise 9.57 show that the cyclic
representation is isomorphic to the cyclic representation generated by 1
inside $L^2(\mathbb{R}^d, \mu)$. It remains to show that this representation is all of the space $H' = L^2(\mathbb{R}^d, \mu)$.
Suppose therefore that $f \in L^2(\mathbb{R}^d, \mu)$ belongs to the orthogonal complement of $H'_1$, so that
\[ \int_{\mathbb{R}^d} f(t) e^{2\pi i x \cdot t} \, d\mu(t) = \langle f, M_{\chi(-x)} 1 \rangle = 0 \]
9.3 Spectral Theory of Unitary Flows 347
by Fubini’s theorem. Recalling that $\mathscr{S}(\mathbb{R}^d) \supseteq C_c^{\infty}(\mathbb{R}^d)$ is dense in $L^2_{\mu}(\mathbb{R}^d)$, we see that $f = 0$.
Since this holds for all $f \in (H'_1)^{\perp}$ it follows that $H'_1 = H' = L^2(\mathbb{R}^d, \mu)$, as required.
For the proof of Bochner’s theorem it will be convenient to reformulate
the defining property of positive-definite functions in terms of convolution as
in the next lemma.
is a limit of Riemann sums of the form in Definition 9.52, so $\int f * \tilde f \, p \, dx \geq 0$ whenever $f \in C_c(\mathbb{R}^d)$. Approximating an arbitrary function $f \in L^1(\mathbb{R}^d)$ by such functions (using the continuity of a product in a Banach algebra from Proposition 9.31) gives the result.
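(see Lemma 9.59 for the definition of $\tilde f$) and that $|f|^2 \in \mathscr{S}(\mathbb{R}^d)$. By the duality of multiplication and convolution (Corollary 9.40) we obtain
\[ \widehat{|f|^2} = \hat f * \widehat{\overline{f}} = \hat f * \widetilde{\hat f}, \]
and so
\[ \Lambda(|f|^2) = \int \hat f * \widetilde{\hat f} \, p \, dx \geq 0 \]
by Lemma 9.59.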
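We wish to upgrade the above positivity statement to say that $f \geq 0$ and $f \in C_c^{\infty}(\mathbb{R}^d)$ implies $\Lambda(f) \geq 0$. So let $f \in C_c^{\infty}(\mathbb{R}^d)$ be non-negative, and define
\[ h_{\varepsilon}(t) = \sqrt{f(t) + \varepsilon e^{-\pi \|t\|^2}}. \]
We assume first that $f \in C_c^{\infty}(\mathbb{R}^d)$ is real-valued, and fix $\varepsilon > 0$. Then, for sufficiently large $a > 0$, we have $f(t) < (1 + \varepsilon) \|f\|_{\infty} e^{-\pi \|t/a\|^2}$ for all $t \in \mathbb{R}$, and similarly for $-f$. Therefore, we have $(1 + \varepsilon) \|f\|_{\infty} e^{-\pi \|t/a\|^2} - f(t) = |h|^2$ where
\[ h(t) = \sqrt{(1 + \varepsilon) \|f\|_{\infty} e^{-\pi \|t/a\|^2} - f(t)}, \]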
Proof. We will apply the spectral theorem (Theorem 9.2) for unitary flows
to describe the unitary representation in terms of multiplication operators
by scalars. As a slight simplification, we assume that H = Hv is cyclic for
some v ∈ H and refer to Exercise 9.62 for the general case.
Then by the spectral theorem we have $H_v \cong L^2_{\mu_v}(\mathbb{R})$, where $v$ corresponds to $1$ and $\pi_x$ corresponds to $M_{\chi(x)}$ for all $x \in \mathbb{R}$. As the isomorphism between $H_v$ and $L^2_{\mu_v}(\mathbb{R})$ is unitary it maps convergent sequences to convergent sequences and hence also differentiable vectors (as in the definition of $D$) for $\pi$ precisely to the differentiable vectors for $M_{\chi(\cdot)}$. In other words, it suffices to prove the theorem in the case where $\pi = M_{\chi(\cdot)}$ and $H = L^2_{\mu}(\mathbb{R})$ for a finite measure $\mu$ on $\mathbb{R}$. We note that the spectral theorem provides a finite measure, but the proof stays the same if $\mu$ is only locally finite. In this case, we claim that $D$ is given by $D = \{ f \in L^2_{\mu}(\mathbb{R}) \mid M_I(f) \in L^2_{\mu}(\mathbb{R}) \}$. Indeed, for $f \in L^2_{\mu}(\mathbb{R})$ we have
\[ \lim_{x \to 0} \frac{\pi_x f - f}{x}(t) = \lim_{x \to 0} \frac{e^{2\pi i x t} - 1}{x} f(t) = 2\pi i t f(t) \]
Exercise 9.63. Apply the results above to the unitary flow (ρx f )(y) = f (y+x) for x, y ∈ R
and f ∈ L2 (R).
(a) Use the Fourier transform and the proof of Theorem 9.60 to show that
(b) Show that Cc∞ (R) ⊆ D and that Cc∞ (R) is dense in D when D is endowed with the
norm in Graph(A) where A is defined as in Theorem 9.60.
(c) Show that $\operatorname{Graph}(A) = H_0^1(\mathbb{R})$ and that $A = \frac{1}{2\pi i} \partial_x$.
(d) Show moreover that H 1 (R) = H01 (R).
Using the Riesz representation theorem (Theorem 7.44) we now prove a ver-
sion of the existence of Haar measures from p. 92. Throughout this section
we will be working with real-valued functions.
will be approximated by
\[ \sum_{j=1}^{n} c_j \int_G \phi \, dm_G. \]
\[ \Lambda_{\phi}(f) = \frac{M(f : \phi)}{M(f_0 : \phi)}. \]
We may think of $\sum_{j=1}^{n} c_j \lambda_{g_j} \phi$ as a $\phi$-cover of $f$ and of $\sum_{j=1}^{n} c_j$ as the total weight of the $\phi$-cover. Notice that $\{ g \in G \mid \phi(g) > \frac{1}{2} \|\phi\|_{\infty} \}$ is a non-empty open subset of $G$, and since $f \in C_c(G)$ has compact support it is easy to see that a cover of $f$ as in the definition of $M(f : \phi)$ exists, and so $M(f : \phi)$ is a well-defined non-negative real number. Moreover, if $f_0 \leq \sum_{j=1}^{n} c_j \lambda_{g_j} \phi$
† The word ‘gauge’ means a fixed standard of measure like a ruler.
10.1 Haar Measure 355
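then $\|f_0\|_{\infty} \leq \sum_{j=1}^{n} c_j \|\phi\|_{\infty}$ and so $M(f_0 : \phi) \geq \|f_0\|_{\infty} \|\phi\|_{\infty}^{-1} > 0$, which implies that $\Lambda_{\phi}(f) \in \mathbb{R}_{\geq 0}$ is well-defined.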
We collect a few immediate properties of $\Lambda_{\phi}$ for a scalar $\alpha > 0$ and functions $f, f_1, f_2 \in C_c^{+}(G)$:
• (left-invariance) $\Lambda_{\phi}(\lambda_g f) = \Lambda_{\phi}(f)$;
• (positive homogeneity) $\Lambda_{\phi}(\alpha f_1) = \alpha \Lambda_{\phi}(f_1)$;
• (monotonicity) $\Lambda_{\phi}(f_1) \leq \Lambda_{\phi}(f_2)$ whenever $f_1 \leq f_2$; and
• (sub-additivity) $\Lambda_{\phi}(f_1 + f_2) \leq \Lambda_{\phi}(f_1) + \Lambda_{\phi}(f_2)$.
These properties are immediate consequences of the definition of $M(f : \phi)$ and standard properties of the infimum. For instance, if $f_1 \leq \sum_{j=1}^{n} c_j \lambda_{g_j} \phi$ and $f_2 \leq \sum_{k=1}^{m} d_k \lambda_{h_k} \phi$ for some scalars $c_1, \ldots, c_n, d_1, \ldots, d_m > 0$ and group elements $g_1, \ldots, g_n, h_1, \ldots, h_m \in G$, then we obtain a $\phi$-cover of $f_1 + f_2$ in the form $f_1 + f_2 \leq \sum_{j=1}^{n} c_j \lambda_{g_j} \phi + \sum_{k=1}^{m} d_k \lambda_{h_k} \phi$ and so $M(f_1 + f_2 : \phi)$ is bounded above by $\sum_{j=1}^{n} c_j + \sum_{k=1}^{m} d_k$. Since the $\phi$-covers of $f_1$ and $f_2$ were arbitrary this implies that
whenever $\phi, f \in C_c^{+}(G)$. Note that the second line follows from the first on dividing by $M(f_0 : \phi)$. For the proof of the first line in (10.1), suppose that
\[ f \leq \sum_{j=1}^{n} c_j \lambda_{g_j} f_0 \]
and
\[ f_0 \leq \sum_{k=1}^{m} d_k \lambda_{h_k} \phi. \]
Then
\[ f \leq \sum_{j=1}^{n} \sum_{k=1}^{m} c_j d_k \lambda_{g_j h_k} \phi, \]
which gives
\[ M(f : \phi) \leq \Big( \sum_{j=1}^{n} c_j \Big) \Big( \sum_{k=1}^{m} d_k \Big). \]
Since the $f_0$-cover of $f$ and the $\phi$-cover of $f_0$ were arbitrary, this implies (10.1).
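The composition of covers used in this argument can be traced on a toy example. The Python sketch below works on the finite cyclic group $\mathbb{Z}/6$, written additively, with $\lambda_g f(x) = f(x - g)$; the group, the three functions, the two covers, and all names are hypothetical choices made only to illustrate the bookkeeping.

```python
N = 6  # the cyclic group Z/6 (illustrative choice of G)

def translate(g, f):
    """Left translation: (lambda_g f)(x) = f(x - g)."""
    return [f[(x - g) % N] for x in range(N)]

def combine(cover, f):
    """Evaluate sum of c * lambda_g f over the pairs (c, g) in a cover."""
    out = [0.0] * N
    for c, g in cover:
        out = [o + c * v for o, v in zip(out, translate(g, f))]
    return out

def dominates(upper, lower):
    return all(u >= l for u, l in zip(upper, lower))

def compose(cover_of_f, cover_of_f0):
    """The cover with weights c_j d_k and group elements g_j + h_k."""
    return [(c * d, (g + h) % N) for c, g in cover_of_f for d, h in cover_of_f0]

def weight(cover):
    return sum(c for c, _ in cover)

PHI = [1, 0.5, 0, 0, 0, 0]
F0 = [1, 1, 0.5, 0, 0, 0]
F = [2, 2, 2, 1, 0, 0]
COVER_F0_BY_PHI = [(1, 0), (1, 1)]   # f0 <= lambda_0 phi + lambda_1 phi
COVER_F_BY_F0 = [(2, 0), (2, 2)]     # f <= 2 lambda_0 f0 + 2 lambda_2 f0
COMPOSED = compose(COVER_F_BY_F0, COVER_F0_BY_PHI)

print(dominates(combine(COMPOSED, PHI), F), weight(COMPOSED))
```

The composed family is again a $\phi$-cover of $f$, and its total weight is the product of the two total weights, which is exactly the estimate behind (10.1).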
for all $g \in G$. Fixing $g \in G$ and one $j$ in the sum, we see that either
\[ p_k(g) \phi(g_j^{-1} g) = 0 \]
and so
\[ M(f_k : \phi) \leq \sum_{j=1}^{n} c_j (p_k(g_j) + \delta) \]
As ε > 0 was arbitrary, this shows that Λ is additive in the sense that
= m(B1−1 )m(B2 ).
obtain the lemma since we see that the set {g ∈ G | m(gB1 ∩ B2 ) > 0} must
have positive measure with respect to m.
This contradiction proves the claim that f1 must be constant almost every-
where, and hence the proposition.
(d) Show that on a compact metrizable group a left Haar measure is also a right Haar
measure.
\[ \lim_{n \to \infty} \psi_n * f = \lim_{n \to \infty} f * \psi_n = f \]
Exercise 10.8. A Haar measure on the additive reals $(\mathbb{R}, +)$ is (up to a scalar multiple) the Lebesgue measure $dx$. Show that a Haar measure on the multiplicative reals $(\mathbb{R} \setminus \{0\}, \cdot)$ is given by $\frac{dx}{|x|}$.
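The claim of Exercise 10.8 can be explored numerically before proving it. The Python sketch below approximates the $\frac{dx}{|x|}$-mass of an interval by the midpoint rule and compares it with the mass of the dilated interval; the quadrature, the chosen intervals, and the function name are assumptions of this illustration, which is of course not a proof.

```python
import math

def haar_mass(a, b, steps=20000):
    """Midpoint-rule approximation of the integral of dx/|x| over [a, b], with 0 not in [a, b]."""
    h = (b - a) / steps
    return sum(h / abs(a + (k + 0.5) * h) for k in range(steps))

# Multiplying [1, 3] by 2.5 gives [2.5, 7.5]; multiplying by -2 gives [-6, -2].
print(haar_mass(1.0, 3.0), haar_mass(2.5, 7.5), haar_mass(-6.0, -2.0), math.log(3.0))
```

All three masses agree with $\log 3$ up to quadrature error, whereas the Lebesgue lengths $2$, $5$, and $4$ of the same intervals differ.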
under matrix multiplication. Show that $dm_G = \frac{da \, db}{a^2}$ defines a left Haar measure on $G$ and $dm_G^{(\mathrm{right})} = \frac{da \, db}{|a|}$ defines a right Haar measure on $G$. Compute the modular character on $G$ (as defined in Exercise 10.5).
Exercise 10.10. Show that $dm_{\mathrm{GL}_d(\mathbb{R})}(g) = \frac{dg}{|\det g|^d}$ defines a left and right Haar measure on $\mathrm{GL}_d(\mathbb{R})$, where $dg$ denotes Lebesgue measure on the space of real $d \times d$ matrices.
†
Using the material of Chapter 8 we continue the discussion from Sec-
tion 7.2.2, where the concept of amenability was introduced for discrete
groups.
† Apart from Exercise 10.35 in Section 10.3, this section will not be used later.
362 10 Locally Compact Groups, Amenability, Property (T)
Definition 10.12. A group $G$ admits Følner sets if for any compact subset $K$ of $G$ and $\varepsilon > 0$ there exists a measurable set $F \subseteq G$ of positive and finite $m$-measure with
\[ \frac{m(kF \triangle F)}{m(F)} < \varepsilon \]
for all $k \in K$. In this case we will also call $F$ a Følner set (for $(K, \varepsilon)$).
Exercise 10.13. Suppose that G is a locally compact σ-compact metrizable group that
admits Følner sets. Show that there exists a sequence (called a Følner sequence) (Fn )
of measurable sets with positive and finite m-measure so that for any fixed k ∈ G we
have m(kFn △Fn )/m(Fn ) → 0 as n → ∞ and the convergence is uniform on compact
subsets of G.
In the discrete case K and F are finite sets and Definition 10.12 may be
thought of as follows. The Cayley graph Γ (G, K) associated to G and the
10.2 Amenable Groups 363
subset K (which may or may not generate G) is the graph with vertices given
by elements of G, with edges joining g to kg for any k ∈ K. Then G admits
Følner sets means that for any ε > 0 there is a finite set F such that the
number of edges in Γ (G, K) leaving F is at most ε|F |. This stands in stark
contrast to the property of being an expander graph (see Section 10.4).
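For the discrete group $\mathbb{Z}$ with counting measure, intervals give the standard example of Følner sets. The Python sketch below computes the ratio from Definition 10.12 for $F = \{0, 1, \ldots, N-1\}$; since the group is written additively, $kF$ becomes $k + F$. The choice of group, of the sets, and the function name are illustrative assumptions.

```python
def folner_ratio(N, k):
    """|(k + F) symmetric-difference F| / |F| for F = {0, ..., N-1} in the group Z."""
    F = set(range(N))
    shifted = {k + x for x in F}
    return len(shifted ^ F) / len(F)

K = range(-2, 3)   # a finite (hence compact) subset of Z
for N in (10, 100, 1000):
    print(N, max(folner_ratio(N, k) for k in K))
```

For $|k| \leq N$ the ratio equals $2|k|/N$, so for any finite $K$ and any $\varepsilon > 0$ a sufficiently long interval is a Følner set for $(K, \varepsilon)$.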
It should be clear that the two notions above — amenability and admitting
Følner sets — are related. In fact, our main goal in this section is to prove
Lemma 7.24 and its converse in this more general setting. For the more
difficult part of the equivalence one more definition will be useful.
\[ \|\lambda_k f - f\|_2 < \varepsilon \]
for all k ∈ K.
\[ \|\lambda_k f - f\|_1 = \frac{1}{m(F)} \int |1_{kF} - 1_F| \, dm = \frac{1}{m(F)} m(kF \triangle F) < \varepsilon \]
Since ε > 0 and K ⊆ G were arbitrary, we see that (2) implies (3) and (4).
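Assuming that $L^2(G)$ has almost invariant vectors. If $G$ satisfies (4) and $f_2 \in L^2(G)$ satisfies $\|f_2\|_2 = 1$ and $\|\lambda_k f_2 - f_2\|_2 < \varepsilon$ for all $k$ in the compact set $K \subseteq G$, then we define $f(g) = f_2(g)^2$ for all $g \in G$ and see immediately that $f \geq 0$ and $\|f\|_1 = \|f_2\|_2^2 = 1$. Moreover, for $k \in K$ we also have
\begin{align*}
\|\lambda_k f - f\|_1 &= \int_G \big| f_2(k^{-1} g)^2 - f_2(g)^2 \big| \, dm(g) \\
&= \int_G \big| f_2(k^{-1} g) - f_2(g) \big| \, \big| f_2(k^{-1} g) + f_2(g) \big| \, dm(g) \\
&= \big\langle |\lambda_k f_2 - f_2|, |\lambda_k f_2 + f_2| \big\rangle_{L^2(G)} \\
&\leq \|\lambda_k f_2 - f_2\|_2 \|\lambda_k f_2 + f_2\|_2 \leq 2\varepsilon.
\end{align*}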
for all k ∈ K. Since ε > 0 was arbitrary this proves (2) in the discrete case.
In the non-discrete case, the statement in (10.6) is an averaged form of
the inequality we are seeking, and as a result seems to be weaker than what
we need. For the upgrade we use the fact that ε > 0 was arbitrary: we have
shown that for any $\varepsilon > 0$ and $\delta > 0$ there exists a measurable set $F = F_{\alpha}$ such that
\[ \int_K m(kF \triangle F) \, dm(k) < \varepsilon \delta \, m(F) < \infty. \]
Summarising, we have shown for any compact set K, any ε > 0, and any δ > 0
that there exists a measurable set F with finite measure and a subset N ⊆ K
with m(N ) < δ such that
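\[ |\langle f, \lambda_{k^{-1}} \Phi - \Phi \rangle| = |\langle f, \lambda_{k^{-1}} \Phi \rangle - \langle f, \Phi \rangle| = |\langle \lambda_k f - f, \Phi \rangle| \leq \varepsilon \|\Phi\|_{\infty} \]
for $k \in K$ and $\Phi \in L^{\infty}(G)$. Taking the image of such functions under the embedding map $\imath$ into the dual of $L^{\infty}(G)$ we see that
\[ A(\varepsilon, \Phi, k) = \{ M \in \mathscr{M}(G) \mid |M(\Phi_i - \lambda_{k_j} \Phi_i)| \leq \varepsilon \|\Phi_i\|_{\infty} \text{ for all } i, j \} \]
\[ k = (k_1, \ldots, k_n) \in G^n, \]
and any $\ell, n \in \mathbb{N}$. By definition
\[ A(\varepsilon, \Phi, k) \subseteq \mathscr{M}(G) \]
is weak* closed and contained in the closed unit ball of $L^{\infty}(G)^*$ (check this). Since any finite intersection of such sets will contain another such set we see that the collection of sets of the form $A(\varepsilon, \Phi, k)$ has the finite intersection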
property. By the Banach–Alaoglu theorem (Theorem 8.10) it follows that
the intersection over these sets is non-empty. By definition, this intersection
consists of all left-invariant means on L∞ (G).
For the converse, which is perhaps the most surprising part of the whole
proof, we will need the following lemma.
Lemma 10.16. Let G be as above, and let
be the natural embedding into the bidual of L1 (G). Then the weak* closure of
the image of P(G) under ı in L∞ (G)∗ is M (G).
for all $f \in \mathcal{P}(G)$. This implies that $\Phi \leq c$ almost everywhere, since otherwise we could find a measurable set $B \subseteq G$ of finite positive measure with $\Phi(g) > c$ for $g \in B$, and then setting $f = \frac{1}{m(B)} 1_B \in \mathcal{P}(G)$ leads to a contradiction. However, $\Phi \leq c$ almost everywhere also implies that $M(\Phi) \leq c$ by the properties of $M \in \mathscr{M}(G)$. This contradiction proves the lemma.
We start with the discrete case as it is significantly easier.
Proof of (1)=⇒(3) in Theorem 10.15 for discrete G. Assume that
there exists a left-invariant mean M . Using M we wish to find, for any ε > 0
and finite K ⊆ G, a function f ∈ P(G) such that
\[ \|\lambda_k f - f\|_1 < \varepsilon \]
Note that
\[ \langle \lambda_{k^{-1}} \Phi_j - \Phi_j, M \rangle = M(\lambda_{k^{-1}} \Phi_j - \Phi_j) = 0 \]
for the invariant mean $M$. By Lemma 10.16 we know that $\imath(\mathcal{P})$ is dense in $\mathscr{M}(G)$ with respect to the weak* topology, so there must exist an element $f$ of $\mathcal{P}(G)$ for which (10.10) is less than $\varepsilon$ for all $k \in K$ and $j = 1, \ldots, n$, which proves (10.9), (10.8), and hence that $G$ fulfills the Reiter condition in $L^1$.
Φ0 = f0 ∗ Φ
\[ \|\lambda_k f_0 - f_0\|_1 < \varepsilon \]
for all $k \in U$. In other words, for $\Phi_0$ the left regular representation satisfies the continuity claim appearing in Lemma 3.74, but with respect to the $\|\cdot\|_{\infty}$ norm. This property of $\Phi_0$ is called left uniform continuity of $\Phi_0$. As this is precisely the assumption for the strong integral discussed in Proposition 3.81, it follows that for $f_1 \in C_c(G)$ the integral R-$\int f_1(g) \lambda_g \Phi_0 \, dm_G(g)$ can be obtained as a limit with respect to $\|\cdot\|_{\infty}$ of Riemann sums of the form
\[ \sum_{P \in \xi} f_1(g_P) \lambda_{g_P} \Phi_0 \, m_G(P), \]
since M (λg Φ0 ) = M (Φ0 ) for any g ∈ G. Using the estimate (10.11) again and
the density of Cc (G) in L1 (G) this extends to all f1 ∈ L1 (G). Restricting to
functions in P(G) the claim in (10.12) follows.
We now make the definition
for some $f_0 \in \mathcal{P}(G)$. Note that $M_{\mathrm{top}}(1) = M(1) = 1$ and that $\Phi \geq 0$ almost surely implies $f_0 * \Phi \geq 0$ and $M_{\mathrm{top}}(\Phi) \geq 0$. We also claim that this definition is independent of $f_0$. Using this independence we see that
\[ \lim_{n \to \infty} \|f_0 * \psi_n - f_0\|_1 = 0, \]
\[ \|\lambda_{g_j} \psi * f_n - f_n\|_1 \longrightarrow 0 \]
as n → ∞ for every j.
Now let K ⊆ G be a compact subset and fix ε > 0. By Lemma 3.74 there
exists some neighbourhood U of e ∈ G such that
\[ \|\lambda_{g_j} \psi * f_n - f_n\|_1 < \varepsilon \]
where we will think of f (g) as the proportion of positive answers in the neigh-
bourhood U g of g to the question of whether g should belong to an improved
version of F . In case G is not unimodular, we multiplied the integral and the
denominator in the first expression by ∆G (g), used m(U g) = ∆G (g)m(U )
in the denominator and the substitution h = ug ∈ U g for u ∈ U in the
integral (at first reading it may be helpful to assume that G is unimodu-
lar as this simplifies some of the expressions arising). Given any majority
parameter α ∈ (0, 1) we also define the set Fα = Fα (F, U ) by
This gives
βm ({g ∈ G | |f − 1F |(g) > β}) < εm(F )
Applying this construction will give us the desired Følner set. To this end,
fix some non-empty compact subset K ⊆ G. Since K is compact and U has
non-empty interior there exist k1 , . . . , kn ∈ K such that
\[ K \subseteq \bigcup_{j=1}^{n} k_j U. \]
F ′ = F1/2 (F, U )
as above, so that
\[ m(F' \triangle F) < \frac{2\varepsilon}{n} m(F) \tag{10.17} \]
by (10.16). Assuming $\varepsilon < \frac{1}{4}$ we have
Since $n \geq 1$ we see, from the Følner property of $F$ for $k_1 \in K$ and (10.17), that
\[ m(F' \setminus (KF')) \leq m(F' \setminus (k_1 F')) \ll \varepsilon m(F'). \]
For the second inequality we first claim that
U F ′ ⊆ Fα = Fα (F, U 2 ) (10.18)
where the implicit constant only depends on the choice of U . Using the fact
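that $F$ is a Følner set for $(\{k_1, \ldots, k_n\}, \frac{\varepsilon}{n})$ and (10.17), this gives
\[ m(k_j U F' \setminus F') \leq m(k_j U F' \setminus k_j F) + m(k_j F \setminus F) + m(F \setminus F') \ll \frac{\varepsilon}{n} m(F') \]
for $j = 1, \ldots, n$. Taking the union and recalling that $K \subseteq \bigcup_{j=1}^{n} k_j U$, we obtain
\[ m(KF' \setminus F') \leq m\Big( \bigcup_{j=1}^{n} k_j U F' \setminus F' \Big) \ll \varepsilon m(F'). \]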
Exercise 10.25. Let H < G be a closed subgroup with the property that X = G/H
supports a finite G-invariant Borel measure. Show that G is amenable if and only if H is.
d(g, h) = ℓS (gh−1 )
defines a metric on G.
Exercise 10.27. Show that the equivalence class [γS ]∼ of the growth function of a finitely
generated group is well-defined (meaning that it is independent of the choice of symmetric
generating set), allowing us to write γ (G) for any representative of the equivalence class.
Let us start with some fundamental definitions where we will assume that G
is a topological group and π is a unitary representation of G on a complex
Hilbert space H.
\[ \sup_{g \in Q} \|\pi_g v - v\| \leq \varepsilon. \]
Definition 10.31 (Spectral gap). We say that π has spectral gap if π re-
stricted to (HG )⊥ does not have almost-invariant vectors, where
HG = {v ∈ H | πg v = v for all g ∈ G}
and property (T) will be (almost) exclusive. We note that apart from Exercise 10.35 the
following will be independent of Section 10.2.
We note that the letter ‘T’ in property (T) stands for the trivial repres-
entation and that the parentheses indicate a neighbourhood of the trivial
representation. In fact, there is a definition of a topology on the family of
irreducible unitary representations of a topological group G — the Fell to-
pology — such that property (T) is equivalent to the trivial representation
being isolated in that topology.
Finding groups without property (T) is quite easy.
Exercise 10.34. Show that if G is a topological group with property (T), and φ is a
continuous homomorphism from G to G′ with dense image, then G′ also has property (T).
Conclude that the free group F (with at least one generator) does not have property (T).
Exercise 10.35. Let G be a discrete or locally compact σ-compact metrizable group. Show
that G is compact if and only if G is amenable and has property (T).
Exercise 10.37. Show that a discrete group with property (T) is finitely generated.
In the following we will consider the groups $\mathrm{SL}_d(\mathbb{R})$ endowed with the topology induced by the inclusion $\mathrm{SL}_d(\mathbb{R}) \subseteq \mathrm{Mat}_{d,d}(\mathbb{R}) \cong \mathbb{R}^{d^2}$. Každan gave the definition of property (T) in 1967 and also gave the first examples of such groups.
We note that G = SL2 (R) does not have property (T), but despite this,
many of its natural (and all of its irreducible) unitary representations have
spectral gap; we refer to [26] for references and a detailed discussion. The
main tool for proving the above theorem is the following relative version.
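Theorem 10.39 (Každan). $(\mathrm{ASL}_2(\mathbb{R}), \mathbb{R}^2)$ has relative property (T), where
\[ \mathrm{ASL}_2(\mathbb{R}) = \mathrm{SL}_2(\mathbb{R}) \ltimes \mathbb{R}^2 = \left\{ \begin{pmatrix} A & x \\ 0 & 1 \end{pmatrix} \;\middle|\; A \in \mathrm{SL}_2(\mathbb{R}),\ x \in \mathbb{R}^2 \right\}. \]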
As we will see there is a way to push property (T) from the group SL3 (R)
to its discrete counterpart SL3 (Z).
As Margulis showed in 1988 discrete groups with property (T) quickly give
rise to expander families, which we will introduce in the next section.
For the proof of Theorem 10.38 we need the following property of unitary rep-
resentations of G = SLd (R) for d = 2, 3 (due to Mautner [69] and Moore [75]).
For this we define the subgroup
\[ U_{12} = \left\{ u_x = \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix} \;\middle|\; x \in \mathbb{R} \right\} \]
For the proof we will use the following algebraic fact for K = R. For any
field K the group SLd (K) is generated by the elementary unipotent subgroups
(defined as above but with x ∈ K). This may be seen using a modified Gauss
elimination algorithm: given any g ∈ SLd (K) it is clear that the first column is
non-zero. Multiplying g on the left by elements of U12 (or another elementary
unipotent subgroup) corresponds to the row operation of adding a multiple
of the second row to the first row (or the same with any two other rows). For
example, for d = 2 we have
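\[ \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a + xc & b + xd \\ c & d \end{pmatrix} \]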
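To see the generation statement in action for $d = 2$, the following Python sketch checks the row operation above and then writes a matrix in $\mathrm{SL}_2(\mathbb{Q})$ with non-zero lower-left entry as a product of three elementary unipotent matrices. The specific factorization formula and all names are assumptions of this illustration and are not taken from the text; the case $c = 0$ would need one extra row operation first and is not handled here.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def upper(x):
    """Element u_x of the upper elementary unipotent subgroup."""
    return ((1, x), (0, 1))

def lower(y):
    """Element of the lower elementary unipotent subgroup."""
    return ((1, 0), (y, 1))

def factor(g):
    """Factors [u_x, l_c, u_y] with u_x * l_c * u_y == g, assuming det g = 1 and c != 0."""
    (a, b), (c, d) = g
    if c == 0:
        raise ValueError("this sketch assumes a non-zero lower-left entry")
    return [upper(Fraction(a - 1, c)), lower(c), upper(Fraction(d - 1, c))]

G_EXAMPLE = ((2, 3), (1, 2))            # determinant 2*2 - 3*1 = 1
f1, f2, f3 = factor(G_EXAMPLE)
print(matmul(matmul(f1, f2), f3) == G_EXAMPLE)
print(matmul(upper(5), G_EXAMPLE))      # adds 5 times the second row to the first
```

Exact rational arithmetic is used so that the comparison with the original matrix involves no rounding.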
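and calculate
\[ u_n g_n u_{-n/2} = \begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ \frac{1}{n} & 1 \end{pmatrix} \begin{pmatrix} 1 & -\frac{n}{2} \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 0 \\ \frac{1}{n} & \frac{1}{2} \end{pmatrix}, \]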
Since the eigenvalues of $a'$ satisfy $a'_1 \neq a'_3$ and $a'_2 \neq a'_3$ we may repeat the argument for $\mathrm{SL}_2(\mathbb{R})$ twice more and see that $v$ is fixed by all elementary unipotent subgroups, which implies that $v$ is fixed by all of $\mathrm{SL}_3(\mathbb{R})$.
Remaining with the case $d = 3$, suppose that $v$ is fixed by an elementary unipotent subgroup $U$. Since $U$ is again contained in a subgroup $H \cong \mathrm{SL}_2(\mathbb{R})$ we see that $v$ is invariant under a non-trivial positive diagonal element to which we may apply the arguments above.
The case d > 3 follows similarly by induction and will not be needed later,
so we leave this part of the proof to the reader (see Exercise 10.42(a)).
Exercise 10.42. (a) Confirm that the case d > 3 in Proposition 10.41 may be seen using
the same argument.
(b) Suppose that $u \in \mathrm{SL}_d(\mathbb{R})$ is a non-trivial unipotent element (that is, $u \neq I$ and all eigenvalues of $u$ are equal to 1). Show that for any unitary representation $\pi \colon \mathrm{SL}_d(\mathbb{R}) \circlearrowright H$ any $v \in H$ with $\pi_u v = v$ is invariant under all of $\mathrm{SL}_d(\mathbb{R})$.
Proof. Recall that for any v ∈ H the spectral measure µv is uniquely de-
termined by the property
\[ \langle \pi_{u(x)} v, v \rangle = \int_{\mathbb{R}^2} e^{2\pi i x \cdot t} \, d\mu_v(t) \]
where we used $A^{-1} u(x) A = u(A^{-1} x)$ for all $x \in \mathbb{R}^2$. Hence $\mu_{\pi_A v} = (A^t)^{-1}_{*} \mu_v$ by uniqueness of the spectral measure.
µw = µw1 + µw2
which gives
Lemma 10.45 (No invariant measures). The natural action of $\mathrm{SL}_2(\mathbb{R})$ on the projective line $\mathbb{P}^1(\mathbb{R}) = (\mathbb{R}^2 \setminus \{0\}) / \sim$ has no invariant probability measures.
identify P1 (R) with SO2 (R)/M so that the action corresponds to translation
on the group. By uniqueness of the Haar measure (Proposition 10.2) there
is only one SO2 (R)-invariant probability measure on P1 (R). However, other
elements of $\mathrm{SL}_2(\mathbb{R})$ do not preserve that measure. For example, the action of $\begin{pmatrix} e & \\ & e^{-1} \end{pmatrix}$ does not preserve that probability measure (check this).
Then for every $n \geq 1$ there exists some $(Q_n, \frac{1}{n})$-invariant vector $v_n \in H$ with $\|v_n\| = 1$.
Let $\mu_{v_n}$ be the spectral measure of $v_n$ with respect to $\mathbb{R}^2 \lhd \mathrm{ASL}_2(\mathbb{R})$ for each $n \geq 1$. If, for some $n \geq 1$, we have $\mu_{v_n}(\{0\}) > 0$, then by the spectral theorem (Theorem 9.58)
\[ \nu_n = p_{*} \mu_{v_n} \]
\[ F = f \circ p \in \mathscr{L}^{\infty}(\mathbb{R}^2 \setminus \{0\}), \]
and extend it by, for example, setting $F(0) = 0$. For $A \in Q_n \cap \mathrm{SL}_2(\mathbb{R})$ the vector $v_n$ satisfies $\|\pi_A v_n - v_n\| \leq \frac{1}{n}$ and so we have
Z Z Z
f ◦ (At )−1 dνn = F ◦ (At )−1 dµvn = F d(At )−1
∗ µvn
P1 (R) R2 R2
Z Z
1
1
= F dµvn + Of n = f dνn + Of n
R2 P1 (R)
by Lemmas 10.43 and 10.44. Now let n = nk and take k → ∞ to see that
\[ \int_{\mathbb{P}^1(\mathbb{R})} f \circ (A^t)^{-1} \, d\nu = \int_{\mathbb{P}^1(\mathbb{R})} f \, d\nu \]
for all f ∈ C(P1 (R)) and A ∈ SL2 (R). However, this shows that ν is SL2 (R)-
invariant, which contradicts Lemma 10.45.
Exercise 10.47. Show that SLd (R) has property (T) for all d > 3.
The connection between SL3 (R) and its discrete subgroup SL3 (Z) is largely
controlled by the fact that SL3 (Z) is a lattice in SL3 (R). We will not discuss
the important notion of lattices in detail, but instead will work with the
following form of the result, which will be proved after its significance is
established.
Theorem 10.48 (SL3 (Z) is a lattice). There exists a Borel subset F of the
group G = SL3 (R), called a fundamental domain for SL3 (Z) in SL3 (R), such
that $m_G(F) < \infty$ and $G = \bigsqcup_{\gamma \in \mathrm{SL}_3(\mathbb{Z})} F\gamma$.
Apart from this result, we will also need a simple form of induction of
unitary representations which will allow us to lift a unitary representation
of SL3 (Z) to a unitary representation of SL3 (R).
To explain this more generally, we let Γ < G be a discrete subgroup of a
locally compact, σ-compact, metrizable, unimodular group, and let F ⊆ G
be a fundamental domain for Γ in G, that is, a Borel subset such that
\[ G = \bigsqcup_{\gamma \in \Gamma} F\gamma. \tag{10.19} \]
Simple examples include Γ = Zd < G = Rd with F = [0, 1)d for any d > 1;
we refer to [25] for more details on the properties of fundamental domains.
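The simple example Γ = Zd < G = Rd with F = [0, 1)d can be made concrete as follows. This is only a minimal sketch: the function name `decompose` is our own, and the statement that every point has exactly one decomposition holds up to floating-point rounding.

```python
import math

def decompose(x):
    """Write x in R^d as x = f + gamma with f in [0, 1)^d and gamma in Z^d.

    This mirrors the disjoint decomposition G = ⊔ F·gamma for
    Gamma = Z^d, G = R^d and F = [0, 1)^d (the group is abelian and
    written additively, so F·gamma becomes F + gamma).
    """
    gamma = [math.floor(t) for t in x]          # the unique lattice element
    f = [t - g for t, g in zip(x, gamma)]       # the representative in F
    return f, gamma

f, gamma = decompose([2.75, -0.5, 3.0])
print(f, gamma)
```

The uniqueness of `gamma` is what makes F a fundamental domain: two different integer vectors would give representatives differing by a non-zero integer vector, which cannot both lie in [0, 1)d.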
Furthermore, let πΓ : Γ ↷ HΓ be a unitary representation of Γ on a separable
Hilbert space HΓ .
10.3 Property (T) 385
In the following we will think of $f \in \bigoplus_{n \in \mathbb{N}} L^2(F)$ as a measurable $\ell^2(\mathbb{N})$-valued
and square-integrable function $f : F \ni g \mapsto f(g) = (f_1(g), f_2(g), \ldots)$.
We note that f (g) belongs to ℓ2 (N) for almost every g ∈ F by Fubini’s
theorem. We also define the norm of f = (f1 , f2 , . . . ) by
\[ \|f\|_{H_G} = \Bigl( \sum_{n=1}^{\infty} \|f_n\|_2^2 \Bigr)^{1/2} = \Bigl( \int_F \|f(g)\|_2^2 \, dm_G(g) \Bigr)^{1/2}. \]
and
\[ f(g\gamma) = \lim_{n \to \infty} \pi_\Gamma(\gamma)^{-1} (f_1(g), \ldots, f_n(g), 0, 0, \ldots) = \lim_{n \to \infty} \sum_{k=1}^{n} f_k(g) \, \pi_\Gamma(\gamma)^{-1} e_k \]
shows that the components of f (gγ) are a convergent sum of finite linear
combinations of f1 (g), f2 (g), . . . . In particular, f (gγ) depends measurably
on g ∈ F . We will frequently identify f on F with the extension of f to all of G
for every γ ∈ Γ .
by unimodularity of G.
Suppose now that g0 ∈ G and F is a measurable fundamental domain.
Then F ′ = g0 F is another fundamental domain and (10.23) defines a map φ
by
\[ F \ni g \longmapsto g_0 g \in F' \longmapsto g_0 g \gamma_g \in F \]
(with γg ∈ Γ being uniquely determined by the condition g0 gγg ∈ F ). It
is clear that the inverse to this map is given by the same procedure but
using g0−1 . To see that these maps are measure-preserving we consider the
function φ above. Now let B ⊆ F be measurable, and note that φ(B) is
defined by piecewise right translation of the set g0 B ⊆ F ′ = g0 F back to F . In
other words, we use (10.23) and apply the same cut-and-translate procedure
to obtain the desired equality
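\[
\begin{aligned}
m_G(B) = m_G(g_0 B) &= \sum_{\gamma \in \Gamma} m_G\bigl( (g_0 B) \cap (F\gamma^{-1}) \bigr) \\
&= \sum_{\gamma \in \Gamma} m_G\bigl( (g_0 B \gamma) \cap F \bigr) = m_G\Bigl( \bigsqcup_{\gamma \in \Gamma} (g_0 B \gamma) \cap F \Bigr) = m_G(\varphi(B)).
\end{aligned}
\]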
Strictly speaking we should prove that mG (φ−1 (B)) = mG (B) as in the defin-
ition of a measure-preserving map (Definition 8.35), but since φ is invertible
this distinction is not important.
Suppose now that B ⊆ G is measurable with mG (B) > mG (F ). Apply-
ing (10.19) we see that
\[ m_G(F) < m_G(B) = \sum_{\gamma \in \Gamma} m_G\bigl( B \cap (F\gamma^{-1}) \bigr) = \sum_{\gamma \in \Gamma} m_G\bigl( (B\gamma) \cap F \bigr), \]
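\[ \|f\|_{H_G}^2 = \sum_{\gamma \in \Gamma} \int_{F \cap (F'\gamma^{-1})} \|f(g)\|_2^2 \, dm_G(g) = \sum_{\gamma \in \Gamma} \int_{(F\gamma) \cap F'} \|f(h\gamma^{-1})\|_2^2 \, dm_G(h) \]
where we used the consequence $F' = \bigsqcup_{\gamma \in \Gamma} (F\gamma) \cap F'$ of (10.19).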
We are now ready to prove the main properties of the unitary induction
(which in a sense combines the unitary representation πΓ and the measure-
preserving maps discussed in Lemma 10.51).
Proposition 10.52. Let G be a locally compact σ-compact metrizable uni-
modular group and Γ < G a lattice (so that there exists a fundamental do-
main F as in (10.19) with mG (F ) < ∞). Given a unitary representation πΓ
of Γ on a separable Hilbert space HΓ , the Hilbert space HG constructed above
admits a unitary representation πG of G defined by $\pi_{G,g_0} f(g) = f(g_0^{-1} g)$
for g0 , g ∈ G and f ∈ HG . Moreover, HΓ has a non-trivial Γ -fixed vector if
and only if HG has a non-trivial G-fixed vector, and HG has almost invariant
vectors if HΓ has almost invariant vectors.
Note that the formula defining πG,g0 is the same formula as for the left
regular representation on the space of functions on G, but that the space and
the norm are different.
Exercise 10.53. Let Γ < G be a discrete subgroup of a locally compact, σ-compact,
metrizable, unimodular group G. Let πΓ be the left regular representation of Γ on ℓ2 (Γ )
defined by πΓ,γ0 f (γ) = f (γ0−1 γ) for all f ∈ ℓ2 (Γ ) and γ0 , γ ∈ Γ . Show that the induced
representation πG is then unitarily isomorphic to the left regular representation of G.
of f to G will be $f(g) = \frac{1}{\sqrt{m_G(F)}} \, v$ for all g ∈ G. Notice that f ∈ HG
since mG (F ) < ∞. We therefore obtain a unit vector of HG that is invariant
with respect to G.
Pushing invariant vectors back to HΓ . Suppose for the opposite
direction that HG has a G-invariant unit vector f . Since G acts trans-
itively on itself, this implies that f (g) = v for some non-zero v in HΓ
and almost every g ∈ G (by using Exercise 10.4 for each component fj
of f = (f1 , f2 , . . .)). Since we also have f (gγ) = πΓ (γ)−1 f (g) for almost
every g ∈ G and all γ ∈ Γ we see that v ∈ HΓ is a non-zero Γ -invariant
vector.
Lifting almost invariant vectors. Suppose next that HΓ has almost
invariant vectors, let K ⊆ G be a compact subset and let ε > 0. Recall
that mG (F ) < ∞ since Γ < G is a lattice. By regularity of mG there exists
a compact subset L ⊆ F such that $m_G(F \setminus L) < \varepsilon \, m_G(F)$. Since $L^{-1}KL \subseteq G$
is compact and Γ < G is discrete, the set $Q = \Gamma \cap (L^{-1}KL)$ is a finite subset
of Γ . Suppose now that v ∈ HΓ is a (Q, ε)-almost invariant unit vector. Much
as in the discussion of invariant vectors, we define f ∈ HG by setting
\[ f(g) = \begin{cases} v & \text{for } g \in L, \\ 0 & \text{for } g \in F \setminus L \end{cases} \]
and use the formula f (gγ) = πΓ (γ)−1 f (g) for all g ∈ F and γ ∈ Γ to extend f
to a function f ∈ HG .
Now let k ∈ K and g ∈ F . Then
\[ \gamma_g = g^{-1} k g' \in L^{-1}KL \cap \Gamma = Q \]
\[ \lambda_g(f)(h, n) = f(g^{-1}h, n) \]
that is, we decompose F into the part Fin ⊆ F that stays inside F (under
the action of g and of g −1 ) and its relative complement
(see Figure 10.2). It may help to think of B as the bad set on which λg
and πG,g are quite different. We need to estimate its significance.
[Figure 10.2; labels: B+ , Fin , B− and g−1 F , F , gF ]
Fig. 10.2: The circle depicts F , and the action of g translates the circle to the
right, giving rise to the decomposition F = Fin ⊔ B.
as claimed.
Combining (10.25)–(10.26) with (10.24) we can now obtain
Since this holds for any g ∈ V , we obtain the continuity of the unitary
representation and hence the theorem.
(where we write I for the identity matrix) which is also sometimes writ-
ten SO(3, R);
• the positive diagonal subgroup
\[ A = \left\{ \begin{pmatrix} a_1 & 0 & 0 \\ 0 & a_2 & 0 \\ 0 & 0 & a_3 \end{pmatrix} \;\middle|\; a_1, a_2, a_3 > 0 \text{ and } a_1 a_2 a_3 = 1 \right\}; \]
\[ an(a'n')^{-1} = k^{-1}k' \in K \cap AN \]
since K and AN are both subgroups. Since all elements of K are diagonalizable
over C with eigenvalues of absolute value one, we see that K ∩ AN = {I},
which implies k = k ′ and an = a′ n′ . Similarly, since A ∩ N = {I} we now see
in the same way that a = a′ and n = n′ .
A lattice in Rd is a subgroup of the form Λ = gZd for some g ∈ GLd (R).
Recall from Lemma 10.51 that the co-volume of Λ is defined by the Lebesgue
measure of any fundamental domain F ⊆ Rd for Λ. Using F = g[0, 1)d we
see that the co-volume is given by |det g|.
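The identity between the co-volume and |det g| can be checked numerically by counting lattice points in a large box: the number of points of Λ in a region is roughly its area divided by the co-volume. The matrix `g`, the box size `R` and the helper `count_lattice_points` below are hypothetical choices made only for this illustration.

```python
import numpy as np

# Hypothetical example lattice in R^2: Lambda = g Z^2.
g = np.array([[2.0, 1.0],
              [0.0, 3.0]])
covolume = abs(np.linalg.det(g))   # Lebesgue measure of g[0,1)^2

def count_lattice_points(g, R, coeff_range=60):
    """Count points g @ m with m in Z^2 lying in the square [-R, R]^2.

    coeff_range must be large enough that every lattice point in the
    square has integer coefficients in [-coeff_range, coeff_range];
    this is an assumption that holds for the g and R used here.
    """
    count = 0
    for m1 in range(-coeff_range, coeff_range + 1):
        for m2 in range(-coeff_range, coeff_range + 1):
            x, y = g @ np.array([m1, m2], dtype=float)
            if abs(x) <= R and abs(y) <= R:
                count += 1
    return count

R = 50
estimate = (2 * R) ** 2 / count_lattice_points(g, R)
print(covolume, estimate)
```

The estimate only agrees with |det g| up to a boundary error of relative size O(1/R), so it should be read as a plausibility check rather than a computation of the co-volume.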
The next result is part of a theory from 1896 due to Minkowski, who also
invented the descriptive name ‘geometry of numbers’ for it (see [74] for a
reprint and the monograph of Lekkerkerker [60] for more material in this
direction).
Proof. For the proof of the proposition we will also use a version of the
conclusion in two dimensions. Let us assume first that d > 2 and Λ ⊆ Rd is
a discrete subgroup. Let w1 ∈ Λ be a shortest non-zero vector of Λ, let V be
the space (Rw1 )⊥ and let p : Rd → V denote the orthogonal projection. We
claim that any non-zero vector p(w) ∈ p(Λ) has
\[ \|p(w)\| \geq \tfrac{\sqrt{3}}{2} \|w_1\|. \tag{10.27} \]
\[ n_{12} = a_1^{-1} \langle w_2, v_1 \rangle = a_1^{-1} t \langle w_1, v_1 \rangle = t \in [-\tfrac{1}{2}, \tfrac{1}{2}) \]
w − ℓ2 w2 − ℓ3 w3 ∈ Rw1
w = ℓ1 w1 + ℓ2 w2 + ℓ3 w3 ,
Lemma 10.57. The group SL3 (R) is unimodular, and the Haar measure
on SL3 (R) decomposes with respect to the Iwasawa decomposition into the
product of the Haar measure $m_K$ on K and the right Haar measure $m_{AN}^{(r)}$
on AN .
Proof. Notice first that SL3 (R) ⊆ Mat33 (R) = R9 is defined by a single
equation and hence is a hypersurface. We will define the Haar measure mSL3 (R)
on SL3 (R) using the Lebesgue measure mR9 by the following trick. Define a
measure µ on SL3 (R) by µ(B) = mR9 ({tg | t ∈ [0, 1], g ∈ B}) for any Borel
measurable set B ⊆ SL3 (R). To see that the set on the right-hand side is
measurable note that U = {m ∈ Mat33 (R) | det m ∈ (0, 1)} is open since the
determinant map is continuous, and on U the map
contains the non-empty open set {tg | t ∈ (0, 1), g ∈ B} and so in particu-
lar µ(B) > 0. Now let B ⊆ SL3 (R) be measurable and g0 ∈ SL3 (R). Then
\[ \psi : K \times AN \longrightarrow \mathrm{SL}_3(\mathbb{R}), \qquad (k, an) \longmapsto k(an)^{-1}, \]
and note that the Gram–Schmidt procedure in the proof of Lemma 10.55
shows that ψ is a homeomorphism. Define a measure ν on K × AN by
\[ \frac{a_1}{a_2} \frac{a_1}{a_3} \frac{a_2}{a_3} \, \frac{da_1 \, da_2}{a_1 a_2} \, dn_{12} \, dn_{13} \, dn_{23}. \tag{10.30} \]
\[ (a_1, a_2, n_{12}, n_{13}, n_{23}) \longmapsto (a_1, a_2, n_{12} + m_{12}, m_{13} + n_{13} + n_{12} m_{23}, n_{23} + m_{23}), \]
and it is easy to see that this preserves the measure defined by (10.30).
Multiplying on the right by
\[ \begin{pmatrix} b_1 & 0 & 0 \\ 0 & b_2 & 0 \\ 0 & 0 & b_3 \end{pmatrix} \]
with b1 , b2 > 0 and b3 = (b1 b2 )−1 we obtain the map
\[ (a_1, a_2, n_{12}, n_{13}, n_{23}) \longmapsto \bigl( a_1 b_1, \; a_2 b_2, \; \tfrac{b_2}{b_1} n_{12}, \; \tfrac{b_3}{b_1} n_{13}, \; \tfrac{b_3}{b_2} n_{23} \bigr). \]
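we may substitute $m_{12} = \frac{b_2}{b_1} n_{12}$, $m_{13} = \frac{b_3}{b_1} n_{13}$, and $m_{23} = \frac{b_3}{b_2} n_{23}$ to obtain
\[ \int f(a_1 b_1, a_2 b_2, m_{12}, m_{13}, m_{23}) \, \frac{a_1}{a_2} \frac{a_1}{a_3} \frac{a_2}{a_3} \, \frac{b_1}{b_2} \frac{b_1}{b_3} \frac{b_2}{b_3} \, \frac{da_1 \, da_2}{a_1 a_2} \, dm_{12} \, dm_{13} \, dm_{23}. \]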
H
Applying Proposition 10.56 we see that there exists some c > 0 such
that SL3 (R) = B SL3 (Z), where B = KD is called a Siegel set, K = SO3 (R)
and the Borel measurable set D consists of all matrices in AN as in (10.29)
satisfying the conditions
\[ 0 < a_1 \leq c a_2 \leq c^2 a_3, \quad a_3 = \frac{1}{a_1 a_2}, \quad \text{and } n_{12}, n_{13}, n_{23} \in [-\tfrac{1}{2}, \tfrac{1}{2}). \]
By Lemma 10.57 we can calculate $m_{\mathrm{SL}_3(\mathbb{R})}(B)$ by calculating $m_{AN}^{(r)}(D)$. Note
that the conditions on the diagonal entries a1 , a2 , a3 imply that a1 ∈ (0, c1 ]
and $a_2 \in [c_2 a_1, c_3 a_1^{-1/2}]$ for some constants c1 , c2 , c3 > 0. By Lemma 10.58
\[ m_{AN}^{(r)}(D) \leq \int_0^{c_1} \int_{c_2 a_1}^{c_3 a_1^{-1/2}} \frac{a_1}{a_2} \, \frac{a_1}{(a_1 a_2)^{-1}} \, \frac{a_2}{(a_1 a_2)^{-1}} \, \frac{da_2}{a_2} \, \frac{da_1}{a_1} \]
Z c1 Z c 3 a1
−1/2
\[ h_n^{-1} h_n' = \gamma_n \in \mathrm{SL}_3(\mathbb{Z}) \setminus \{I\}. \]
However, this contradicts the fact that SL3 (Z) is a discrete subgroup of SL3 (R).
Next write $\mathrm{SL}_3(\mathbb{R}) = \bigcup_n K_n$ as a countable union of compact subsets (for
example, define Kn to be the intersection of SL3 (R) with closed balls in R9
of radius n > 1 around 0). For each n choose a finite cover Un,1 , . . . , Un,mn
of Kn such that the above injectivity claim holds on each of these sets.
To simplify the notation, let us summarize the above by saying that we
have found a countable list of open sets U1 , U2 , . . . satisfying the injectivity
claim and covering all of SL3 (R). We now define F1 = B ∩ U1 and
\[ g\gamma \notin (F_1 \cup \cdots \cup F_{n-1}) \, \mathrm{SL}_3(\mathbb{Z}), \]
to see that a k-regular graph on n vertices exists if and only if n ≥ k + 1 and nk
is even). Notice that this will impose a sparsity condition on the graph, since
the number of edges |E| will be a linear function of the number of vertices |V|
(in contrast to the case of a complete graph, for which |E| = ½ |V| (|V| − 1)).
In order to define the notion of high connectivity, we will need some pre-
parations. A graph G = (V, E) is called connected if for any two v, w ∈ V there
exists a path from v to w in that there is a list v = v0 , v1 , v2 , . . . , vn = w
of vertices in V with (vi , vi+1 ) ∈ E for i = 0, . . . , n − 1. Such a path may
consist of a single vertex, so each vertex is connected to itself by a path of
length zero. Notice that there is a natural metric on any connected graph:
we may define d(v, w) to be the minimal length of a path from v to w (that
is, the minimal number of edges in a path joining v to w; see Figure 10.3 and
Exercise 10.60). In this metric the diameter of a connected graph G is the
minimal N ∈ N with the property that for any two vertices v and w there is
a path of length no more than N connecting v to w.
Exercise 10.60. Verify that the notion of distance on a graph defines a metric on the set
of vertices of a connected graph.
The smaller the diameter is in comparison with V, the better the connectiv-
ity of the graph is. The worst case with the vertices strung out on a line (or
if we seek a 2-regular graph, arranged around a circle) has diameter |V| − 1
(or ⌊|V|/2⌋). The other extreme case of a complete graph has diameter 1. In
the case of expander graphs we will see that such families may be found with
diameter N ≪ log |V|. The implied constant will depend on k and on ξ (as
in Definition 10.61) but is not allowed to depend on the particular graph G.
Considering the growth rate of the logarithm, it should be clear that this is
a formulation of high connectivity.
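The three extreme cases just mentioned (vertices on a line, around a circle, and the complete graph) can be checked directly with a breadth-first search. The following is a minimal sketch; the helper names `diameter`, `path_graph`, `cycle_graph` and `complete_graph` are our own and the graphs are stored as dictionaries of neighbour sets.

```python
from collections import deque

def diameter(adj):
    """Diameter of a connected graph given as {vertex: set of neighbours}."""
    def eccentricity(source):
        # Breadth-first search computes d(source, w) for every vertex w.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        return max(dist.values())
    return max(eccentricity(v) for v in adj)

def path_graph(n):
    return {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}

def cycle_graph(n):          # intended for n >= 3
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}

def complete_graph(n):
    return {i: set(range(n)) - {i} for i in range(n)}

n = 10
print(diameter(path_graph(n)), diameter(cycle_graph(n)), diameter(complete_graph(n)))
```

For |V| = 10 this reproduces the values |V| − 1, ⌊|V|/2⌋ and 1 from the text.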
Definition 10.61 (Expanders). A sequence of finite k-regular graphs
A few comments are in order. We first note that the above definition of
the boundary of a subset of the vertex set of a graph does not coincide with
the boundary of S considered in the metric space Vi (the latter is empty
since Vi is discrete). Any finite collection of finite k-regular connected graphs
(formally, a sequence as in Definition 10.61 that repeats these) is an expander
family. As this is not at all interesting — and in particular does not achieve
the real benefit of the slower growth rate from the logarithmic bound on the
diameter — one usually requires in addition that |Vi | → ∞ as i → ∞. Notice
that we must also have k > 3, because k = 2 corresponds to a sequence of
regular polygons, which we quickly see cannot be an expander family.
An expander family consists of connected graphs, but as already mentioned
much more is true.
\[ B_a(v) = \{ w \in V_i \mid d(v, w) \leq a \} \]
has more than $\frac{|V_i|}{2}$ elements if the integer a satisfies
\[ a \geq D = \frac{\log(|V_i|/2)}{\log\bigl(1 + \xi/(k+1)\bigr)}. \]
Assuming the claim, suppose that v, w ∈ Vi are any pair of vertices and set a
equal to ⌈D⌉. Then, by the claim, each of |Ba (v)| and |Ba (w)| is greater
than |Vi |/2, so that these two balls must have non-empty intersection. By the
triangle inequality, it follows that
Note that every element of ∂S ∩ S must connect to one element of ∂S∖S and at
most k elements of ∂S ∩ S can connect to the same element of ∂S∖S. We can
use this to define a map from ∂S ∩ S to ∂S∖S that is at most k-to-1, showing
that |∂S ∩ S| ≤ k|∂S∖S|. This, together with |∂S| = |∂S ∩ S| + |∂S∖S|, gives
\[ |\partial S \setminus S| \geq \frac{1}{k+1} |\partial S|. \]
Together with the defining property of expander graphs, and assuming as we
may that ξ ∈ (0, 1), we deduce that
By induction we now prove |B0 (v)| = 1, $|B_1(v)| = k + 1 \geq 1 + \frac{\xi}{k+1}$, and
\[ |B_{n+1}(v)| = |B_n(v)| + |\partial B_n(v) \setminus B_n(v)| \geq \Bigl( 1 + \frac{\xi}{k+1} \Bigr) |B_n(v)| \geq \Bigl( 1 + \frac{\xi}{k+1} \Bigr)^{n+1} \]
for all n with |Bn (v)| ≤ |Vi |/2. Since for n = a ≥ D the lower bound is greater
than or equal to |Vi |/2, this proves the claim.
Thus expander families achieve a balance between the two constraints of
high connectivity (with logarithmic growth of the diameter) and sparsity
of the graph (with only linear growth of the number of edges and a fixed
number of edges at every vertex). However, several questions remain, the
most pressing of which are the following.
• Do expander families exist?
• What is their connection to functional analysis?
The first examples of expander families were found by Pinsker [87] (trans-
lated in [88]) using a non-constructive probabilistic argument. The same year
Margulis [67] (translation in [68]) was able to give an explicit construction(30)
using Každan’s Property (T) for the group SL3 (Z).
Towards the proof of this, we now exhibit a connection between the ex-
pander property and properties of eigenvalues of linear maps associated to
the graphs.
Let G = (V, E) be a finite graph and identify V with the set {1, 2, . . . , |V|}.
The adjacency matrix AG of the graph G is the matrix with |V| rows and |V|
columns and with entries in {0, 1} so that (AG )i,j = 1 if and only if there is
an edge from vertex i to vertex j. A simple graph G with adjacency matrix
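\[ A_G = \begin{pmatrix}
0 & 1 & 0 & 1 & 1 & 0 \\
1 & 0 & 1 & 0 & 0 & 1 \\
0 & 1 & 0 & 1 & 0 & 1 \\
1 & 0 & 1 & 0 & 1 & 0 \\
1 & 0 & 0 & 1 & 0 & 1 \\
0 & 1 & 1 & 0 & 1 & 0
\end{pmatrix} \]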
Several properties of the graph are reflected in the properties of the ad-
jacency matrix. The matrix AG is symmetric by our standing assumption
on the graph G = (V, E). We also define $M_G = \frac{1}{k} A_G$, which is an averaging
operator in the following sense. A vector x ∈ R|V| may be thought of as a
function on the set of vertices, and applying MG to x gives a new function
which at the vertex i is equal to the mean of the values of the function x at
all the neighbours of i. By analogy with the discussion in Section 1.2, one also
studies the graph Laplace operator ∆G = I − MG . Since MG is symmetric, it
is diagonalizable and has only real eigenvalues. Moreover,
\[ \sum_i |(M_G x)_i| = \sum_i \Bigl| \sum_j (M_G)_{i,j} x_j \Bigr| \leq \sum_{i,j} (M_G)_{i,j} |x_j| = \sum_j |x_j|, \]
since
\[ \sum_i (M_G)_{i,j} = 1 \tag{10.31} \]
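The properties of the adjacency matrix and the averaging operator described above can be verified numerically for the six-vertex example. This is a minimal sketch using NumPy; the test vector `x` is an arbitrary choice of ours.

```python
import numpy as np

# Adjacency matrix of the six-vertex example graph from the text.
A = np.array([
    [0, 1, 0, 1, 1, 0],
    [1, 0, 1, 0, 0, 1],
    [0, 1, 0, 1, 0, 1],
    [1, 0, 1, 0, 1, 0],
    [1, 0, 0, 1, 0, 1],
    [0, 1, 1, 0, 1, 0],
], dtype=float)

k = int(A[0].sum())      # the graph is k-regular; here k = 3
M = A / k                # averaging operator M_G = A_G / k

# M is symmetric, so eigvalsh applies and all eigenvalues are real.
eigenvalues = np.linalg.eigvalsh(M)[::-1]    # sorted in descending order

# M x at vertex 1 is the mean of x over the neighbours 2, 4, 5 of vertex 1.
x = np.array([3.0, -1.0, 4.0, 1.0, -5.0, 9.0])
averaged = M @ x

# The l^1 estimate displayed above: sum_i |(M x)_i| <= sum_j |x_j|.
lhs, rhs = np.abs(averaged).sum(), np.abs(x).sum()
print(eigenvalues, lhs, rhs)
```

The largest eigenvalue is 1, with the constant function as eigenvector, and all eigenvalues lie in [−1, 1].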
for Gi , and order its eigenvalues $\lambda_1(M_i) = 1 \geq \lambda_2(M_i) \geq \cdots \geq \lambda_{|V_i|}(M_i)$.
Suppose that there exists some ε > 0 with
\[ \lambda_2(M_i) \leq 1 - \varepsilon \tag{10.32} \]
corresponding to the eigenvalues $\lambda_1 = 1 \geq \lambda_2 \geq \cdots \geq \lambda_{|V|}$.† This then gives
\[ M(1_S) = \sum_j \lambda_j v_j, \]
and finally
\[ M(1_S) - 1_S = \sum_{j=2}^{|V|} (\lambda_j - 1) v_j. \]
Thus we need to relate the last norm to the size of S. To this end, notice
that the constant function 1 is an eigenvector for the eigenvalue λ1 = 1, $\|1\|_2 = \sqrt{|V|}$,
and $\langle 1_S, 1 \rangle = |S|$, so the orthogonal projection of 1S onto 1 is $\frac{|S|}{|V|} 1$. Therefore,
as in (10.33), we may subtract from 1S the vector $v_1 = \frac{|S|}{|V|} 1$ and obtain
† If λj = λj+1 for some j ∈ {2, . . . , |V| − 1} we may and will assume vj+1 = 0.
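\[
\begin{aligned}
\Bigl\| \sum_{j=2}^{|V|} v_j \Bigr\|_2 &= \Bigl\| 1_S - \tfrac{|S|}{|V|} 1 \Bigr\|_2 \\
&\geq \Bigl( 1 - \tfrac{|S|}{|V|} \Bigr) \|1_S\|_2 && \text{(by restricting the sum to } S \text{)} \\
&\geq \tfrac{1}{2} \sqrt{|S|} && \text{(since } \tfrac{|S|}{|V|} \leq \tfrac{1}{2} \text{)}
\end{aligned}
\]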
\[ \sqrt{|\partial S|} \geq \|M(1_S) - 1_S\|_2 \geq \varepsilon \Bigl\| \sum_{j=2}^{|V|} v_j \Bigr\|_2 \geq \frac{\varepsilon}{2} \sqrt{|S|}. \]
As this holds for all subsets S ⊆ V with |S| ≤ |V|/2 and all graphs in the
sequence (Gi )i≥1 , we see that this is an expander family with ξ = ε²/4.
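The chain of inequalities in this argument can be tested by brute force on a small graph. The sketch below uses the six-vertex example graph and takes ε = 1 − λ2. The definition of the boundary used in `boundary` (all vertices having a neighbour on the other side of S) is our assumption, chosen to be consistent with the splitting of ∂S into ∂S ∩ S and ∂S∖S used earlier.

```python
import itertools
import numpy as np

# Six-vertex, 3-regular example graph.
A = np.array([
    [0, 1, 0, 1, 1, 0],
    [1, 0, 1, 0, 0, 1],
    [0, 1, 0, 1, 0, 1],
    [1, 0, 1, 0, 1, 0],
    [1, 0, 0, 1, 0, 1],
    [0, 1, 1, 0, 1, 0],
], dtype=float)
n = len(A)
k = int(A[0].sum())
M = A / k

eigenvalues = np.sort(np.linalg.eigvalsh(M))[::-1]
eps = 1.0 - eigenvalues[1]      # spectral gap 1 - lambda_2
xi = eps ** 2 / 4               # expansion constant from the argument above

def boundary(S):
    """Assumed definition: vertices with a neighbour on the other side of S."""
    inside = np.zeros(n, dtype=bool)
    inside[list(S)] = True
    return [v for v in range(n)
            if np.any(inside[np.flatnonzero(A[v])] != inside[v])]

def chain_holds(S, tol=1e-12):
    """Check sqrt|dS| >= ||M 1_S - 1_S||_2 >= (eps/2) sqrt|S|."""
    indicator = np.zeros(n)
    indicator[list(S)] = 1.0
    middle = np.linalg.norm(M @ indicator - indicator)
    return (np.sqrt(len(boundary(S))) >= middle - tol
            and middle >= eps / 2 * np.sqrt(len(S)) - tol)

# All non-empty subsets with |S| <= |V|/2.
subsets = [S for size in range(1, n // 2 + 1)
           for S in itertools.combinations(range(n), size)]
all_hold = all(chain_holds(S) for S in subsets)
worst_ratio = min(len(boundary(S)) / len(S) for S in subsets)
print(eps, xi, all_hold, worst_ratio)
```

A single graph is of course not an expander family; the point of the check is only that the bound |∂S| ≥ ξ|S| with ξ = ε²/4 is visible, and typically far from sharp, on a concrete example.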
We note that the generating set S, for example, could be taken to comprise
the 12 matrices given by
\[ \begin{pmatrix} 1 & \pm 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]
together with all its conjugates by permutation matrices (a permutation mat-
rix is one obtained by permuting the rows or the columns of the identity mat-
rix). Furthermore, the sequence of sets with the transitive actions could, for
example, be Vn = SL3 (Z/pn Z), where pn denotes the nth odd prime number.
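The count of 12 matrices can be confirmed by enumerating the conjugates explicitly. The following is a minimal sketch; the helper name `elementary` and the encoding of matrices as tuples are our own choices.

```python
import itertools
import numpy as np

def elementary(sign):
    """The matrix with ones on the diagonal and entry `sign` in position (1, 2)."""
    g = np.eye(3, dtype=int)
    g[0, 1] = sign
    return g

generators = set()
for sign in (1, -1):
    for perm in itertools.permutations(range(3)):
        P = np.eye(3, dtype=int)[list(perm)]     # permutation matrix
        conjugate = P @ elementary(sign) @ P.T   # P^{-1} = P^T for permutation matrices
        generators.add(tuple(conjugate.flatten()))

print(len(generators))
```

Each conjugate is an elementary matrix with a single off-diagonal entry ±1, and the set is symmetric: the inverse of such a matrix is obtained by changing the sign of the off-diagonal entry.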
In this section we will prove Corollary 10.66. By Corollary 10.40 we know
that Γ = SL3 (Z) has property (T). In the following argument we let S = S −1
be a finite symmetric set of generators of Γ (not containing the identity). Such
a set exists by Exercise 10.37 (or Exercise 10.67 below).
Essential Exercise 10.67. Prove that SL3 (Z) is generated by the 12 ele-
ments given just after Corollary 10.66. (Note, however, that the argument
after Proposition 10.41 does not apply directly since Z is not a field.)
10.4 Highly Connected Networks: Expanders 407
\[ \pi_n : \Gamma \curvearrowright H_n = \ell^2(V_n) = \mathbb{C}^{V_n} \]
defined by πn,γ f (v) = f (γ −1 v) for all γ ∈ Γ and f ∈ Hn . By transitivity of Γ
a Γ -invariant function in Hn must be constant, that is $H_n^\Gamma = \mathbb{C}1$. Suppose
now that $f \in (H_n^\Gamma)^\perp$ is a unit vector. By the uniform spectral gap property
above, there exists some γ ∈ S such that
We now show that this uniform claim implies that the sequence of
graphs (Gn ) defined by Gn = (Vn , En ) as in Corollary 10.66 is an expander
family by using Proposition 10.65. For this, let ε > 0 and suppose in addition
that $f \in (H_n^\Gamma)^\perp$ is an eigenvector for the averaging operator Mn = MGn
associated to the graph Gn and eigenvalue λ2 (Mn ) ≥ 1 − ε. By definition of
the graph structure and the averaging operator we have
\[ M_n(f) = \frac{1}{|S|} \sum_{\gamma \in S} \pi_{n,\gamma} f, \]
and so
\[ 1 - \varepsilon \leq \lambda_2(M_n) \langle f, f \rangle = \Re \langle M_n(f), f \rangle = \frac{1}{|S|} \sum_{\gamma \in S} \Re \langle \pi_{n,\gamma} f, f \rangle. \]
for all γ ∈ S. Setting $\varepsilon = \frac{\varepsilon_0^2}{2|S|}$, this contradicts (10.34). In other words, using
this ε we have λ2 (Mn ) < 1 − ε for every n, which shows that the assumptions
in Proposition 10.65 are satisfied, and so Corollary 10.66 follows.
Exercise 11.3. Let X be a compact topological space, and let A = C(X). Find the
spectrum of f ∈ C(X) as an element of the Banach algebra C(X).
This theorem is the first of many that relate the algebraic to the topolo-
gical structure in Banach algebras. The spectrum and the spectral radius of
an element are defined in purely algebraic terms, whereas the limit is defined
in terms of the norm. One surprising consequence is the following observa-
tion: If A is a unital Banach algebra contained in a larger Banach algebra B
(with compatible structures), then it is possible for an element a ∈ A to
be non-invertible in A but to be invertible in B. Thus the spectrum of an
element depends on the algebra it is viewed in, and σB (a) ⊆ σA (a) with
strict containment being a possibility (see Exercise 11.7). Despite this, the
spectral radius of a ∈ A is not changed when it is viewed as an element of B,
since Theorem 11.6 expresses it in terms of the norms of powers of a, which
are not affected by the switch from A to B (by the implicit compatibility
assumption).
Exercise 11.7. Let U : ℓ2 (Z) → ℓ2 (Z) be the unitary shift operator from Exercises 6.1
and 6.23(a), so U ((xn )) = (xn+1 ).
(a) Show that the spectrum of U considered within the algebra B of all bounded operators
on ℓ2 (Z) is given by S1 = {λ ∈ C | |λ| = 1}.
(b) Now consider the Banach algebra A generated by U (obtained by taking the closed
linear hull of U^0 = I, U, U^2, . . .). Show that the spectrum of U within A is {λ ∈ C | |λ| ≤ 1}.
11.1 The Spectrum and Spectral Radius 411
Finally, let us comment on the precise shape of the spectral radius for-
mula (11.1). It will be relatively straightforward to show that scalars λ ∈ C
with |λ| > ‖a‖ cannot belong to the spectrum of a ∈ A. However, it is also
clear that in general the norm may be much larger than the spectral radius.
Self-adjoint and, more generally, normal operators on Hilbert spaces will form
a nice exception to this. In fact, even in the elementary case of the algebra
of two-by-two matrices (equipped with the operator norm) the norm of the
matrix
1C
a=
0 1
can be made arbitrarily large by increasing the value of C, but the spectrum
always consists simply of 1 ∈ C. The right-hand side of the spectral radius
formula (11.1) essentially ignores the original size of the matrix a and instead
looks at the exponential growth rate of the norm of $a^n$. In the case at hand
the norm of $a^n$ grows linearly which makes the right-hand side equal to one
(and thus equal to the left-hand side).
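This behaviour is easy to observe numerically. The following sketch computes the nth roots of the operator norms of the powers of a for the arbitrary choice C = 1000; the helper name `root_norms` is our own.

```python
import numpy as np

def root_norms(a, exponents):
    """Return ||a^n||^(1/n) in the operator 2-norm for each n in exponents."""
    return [np.linalg.norm(np.linalg.matrix_power(a, n), 2) ** (1.0 / n)
            for n in exponents]

C = 1000.0
a = np.array([[1.0, C],
              [0.0, 1.0]])

# a is upper triangular, so its spectrum consists of the diagonal entries.
spectral_radius = max(abs(np.diag(a)))
norm_a = np.linalg.norm(a, 2)

values = root_norms(a, [1, 10, 100, 1000])
print(spectral_radius, norm_a, values)
```

The norm of a is roughly C, while the roots decrease towards the spectral radius 1: since the powers of a have the same shape with C replaced by nC, the norm grows only linearly in n and the nth root of a linearly growing quantity tends to one.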
Exercise 11.9. Let k ∈ C([0, 1]2 ) be a continuous function, so that
\[ K(f)(x) = \int_0^x k(x, t) f(t) \, dt \]
for f ∈ C([0, 1]) defines an operator K : C([0, 1]) → C([0, 1]). Determine σ(K).
For the proof of Theorem 11.6 we will use Cauchy integration on the
complex plane and convergent geometric series in the unital Banach algebra.
\[ (1_A - a)^{-1} = \sum_{n=0}^{\infty} a^n. \tag{11.2} \]
Indeed, since the right-hand side converges absolutely we may take the
product and obtain
\[ (1_A - a) \Bigl( \sum_{n=0}^{\infty} a^n \Bigr) = \Bigl( \sum_{n=0}^{\infty} a^n \Bigr) (1_A - a) = \sum_{n=0}^{\infty} a^n - \sum_{n=1}^{\infty} a^n = 1_A \]
(here and below we will sometimes study λ1A − a instead of a − λ1A , which
clearly will not make any difference). For this notice first that by assumption
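\[ \sum_{n=0}^{\infty} \|\lambda^{-mn} a^{mn}\| \leq \sum_{n=0}^{\infty} \bigl( \underbrace{|\lambda|^{-m} \|a^m\|}_{<1} \bigr)^n < \infty, \]
so the series $\sum_{n=0}^{\infty} \lambda^{-mn} a^{mn}$ converges absolutely. Moreover, by combining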
the first three factors in the product
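\[ (\lambda 1_A - a) \, \lambda^{-1} \bigl( 1_A + \lambda^{-1} a + \cdots + \lambda^{-(m-1)} a^{m-1} \bigr) \sum_{n=0}^{\infty} \lambda^{-mn} a^{mn} \]
we obtain
\[ \bigl( 1_A - \lambda^{-m} a^m \bigr) \sum_{n=0}^{\infty} \lambda^{-mn} a^{mn} = 1_A \]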
by (11.2). Noting that the factors commute with each other (which follows
easily from continuity of multiplication in the algebra), this proves the claim,
and so λ ∈ ρ(a).
The case m = 1 gives the remaining statement about the spectrum.
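In the Banach algebra of 2 × 2 matrices the geometric series (11.2) can be summed numerically. The matrix `a` below is an arbitrary example of ours with operator norm less than one, and the cut-off of 200 terms is likewise an arbitrary choice.

```python
import numpy as np

a = np.array([[0.2, 0.5],
              [0.1, 0.3]])
assert np.linalg.norm(a, 2) < 1      # hypothesis needed for absolute convergence

identity = np.eye(2)
partial_sum = np.zeros_like(a)
power = identity.copy()              # current power a^n, starting with a^0
for n in range(200):
    partial_sum += power
    power = power @ a

# (1 - a) * sum_{n < N} a^n = 1 - a^N, so the error is ||a^N||.
error = np.linalg.norm((identity - a) @ partial_sum - identity, 2)
print(error)
```

The telescoping identity in the comment is the finite form of the computation in the text, and the error term decays geometrically because the norm of a is less than one.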
Exercise 11.12. Assume that (an ) and (bn ) are sequences in a Banach algebra satis-
fying an bn = bn an for all n > 1 and with limn→∞ an = a, limn→∞ bn = b. Show
that ab = ba.
We have shown that σ(a) ⊆ C is compact for any element a ∈ A, but have
yet to show that σ(a) is non-empty. This existence theorem uses Cauchy
integration, and to prepare for this we need the following lemma concerning
the resolvent.
with coefficients bn ∈ A.
Proof. We use essentially the same formulas as those that arise in the proof
of Proposition 11.10. Let a ∈ A and λ0 ∈ ρ(a) be as in the lemma. Suppose
that λ ∈ C satisfies $|\lambda - \lambda_0| < \|(\lambda_0 1_A - a)^{-1}\|^{-1}$. Then
is, for $|\lambda - \lambda_0| < \|(\lambda_0 1_A - a)^{-1}\|^{-1}$, an absolutely convergent power series,
as claimed.
414 11 Banach Algebras and the Spectrum
With this analyticity we are ready to prove the first part of Theorem 11.6.
Proof that σ(a) is non-empty. Let a be an element of a unital Banach
algebra, and suppose that σ(a) is empty. We first sketch an argument that
produces a contradiction from this assumption, and then fill in the details.
An entire function. Since σ(a) is empty, the resolvent function
for any closed piecewise differentiable path γ in C. The alert reader may notice
that this usage of Cauchy integration is a bit unorthodox, but should read
on — this will be resolved below. In particular, if γ is the closed positively
oriented path with centre 0 and radius ‖a‖ + 1, then
\[ R(z) = (z 1_A - a)^{-1} = \tfrac{1}{z} \bigl( 1_A - \tfrac{1}{z} a \bigr)^{-1} = \sum_{n=0}^{\infty} z^{-n-1} a^n \tag{11.3} \]
for any z on the path γ, and the sum is absolutely convergent. Therefore,
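\[
\begin{aligned}
0 = \oint_\gamma R(z) \, dz &= \oint_{|z| = \|a\| + 1} \sum_{n=0}^{\infty} z^{-n-1} a^n \, dz \\
&= \sum_{n=0}^{\infty} a^n \oint_{|z| = \|a\| + 1} z^{-n-1} \, dz = 2\pi i \, 1_A,
\end{aligned}
\tag{11.4}
\]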
since
\[ \oint_{|z| = \|a\| + 1} z^{-n-1} \, dz = \begin{cases} 2\pi i & \text{if } n = 0, \\ 0 & \text{if } n \neq 0. \end{cases} \]
Now 1A ≠ 0, and so (11.4) shows that the assumption σ(a) = ∅ leads to a
contradiction.
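For matrices both integrals in this sketch can be approximated by a Riemann sum over the circle. The example matrix `a`, the number of sample points and the helper `circle_integral` are our own choices; the code illustrates the calculation and is not part of the proof.

```python
import numpy as np

def circle_integral(func, radius, samples=4000):
    """Approximate the integral of func over the positively oriented circle |z| = radius."""
    theta = 2 * np.pi * np.arange(samples) / samples
    z = radius * np.exp(1j * theta)
    dz = 1j * z * (2 * np.pi / samples)      # dz = i z d(theta)
    return sum(func(zk) * dzk for zk, dzk in zip(z, dz))

a = np.array([[1.0, 5.0],
              [0.0, -2.0]])
identity = np.eye(2)
radius = np.linalg.norm(a, 2) + 1

# Scalar integrals of z^(-n-1): 2*pi*i for n = 0 and 0 otherwise.
scalar_integrals = [circle_integral(lambda z, n=n: z ** (-n - 1), radius)
                    for n in range(4)]

# Integral of the resolvent R(z) = (z 1 - a)^(-1) over the same circle.
resolvent_integral = circle_integral(lambda z: np.linalg.inv(z * identity - a), radius)
print(scalar_integrals, resolvent_integral)
```

The resolvent integral comes out as 2πi times the identity, in agreement with the right-hand side of (11.4). It is not zero precisely because the spectrum of a matrix is non-empty, so the resolvent is not an entire function.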
Using the standard Cauchy integral formula. The difficulty with
the argument sketched above is that most of the integrals are integrals of A-
valued functions. Even though it is possible to make sense of integration for A-
valued functions (see Proposition 3.81), we do not need to extend the Cauchy
integral formula for A-valued functions because of the following argument
(which could be used to prove such an extension).
Let ℓ ∈ A∗ be a linear functional with ℓ(1A ) ≠ 0 (such a functional is
guaranteed to exist by Theorem 7.3) and consider ℓ ◦ R : ρ(a) = C −→ C. By
Lemma 11.13, R(z) can locally be represented as a power series. By continuity
of ℓ, the same holds for ℓ ◦ R. It follows that ℓ ◦ R : C → C is an entire
function (in the usual sense of complex analysis). Using this entire function
in the calculation in (11.4) we see that
\[ 0 = \oint_{|z| = \|a\| + 1} \ell \circ R(z) \, dz = \sum_{n=0}^{\infty} \ell(a^n) \oint_{|z| = \|a\| + 1} z^{-n-1} \, dz = 2\pi i \, \ell(1_A) \neq 0. \]
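so that $\frac{\alpha_n}{n} \geq \alpha$ for all n ≥ 1. Let β > α be arbitrary and pick k ≥ 1
such that $\frac{\alpha_k}{k} < \beta$. For any n ≥ k we apply division with remainder to
get n = mk + j for some j ∈ {0, . . . , k − 1} and m ≥ 1. By the sub-additivity
property we then have
\[ \frac{\alpha_n}{n} \leq \frac{\alpha_{mk}}{n} + \frac{\alpha_j}{n} \leq \frac{m \alpha_k}{n} + \frac{j \alpha_1}{n} = \frac{mk}{n} \frac{\alpha_k}{k} + \frac{j \alpha_1}{n}. \]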
If now n = mk + j is large enough we see that the right-hand side is less
than β, which proves the statement.
Proof of Theorem 11.6. Notice first that the sequence (αn ) defined by
\[ \alpha_n = \|a^n\| \]
for all m > 0, where $|z^m| = (s + \varepsilon)^m$ and the implicit constant only depends on
the restriction of $R(z) = (z 1_A - a)^{-1}$ to {z ∈ C | |z| = s + ε}, and in particular
does not depend on m and ℓ. Expanding the circle to the radius ‖a‖ + 1 does
not change the integral, so that we may use (11.3) again to see that
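\[
\begin{aligned}
\oint_{|z| = s + \varepsilon} \ell\bigl( (z 1_A - a)^{-1} \bigr) z^m \, dz &= \oint_{|z| = \|a\| + 1} \ell\bigl( (z 1_A - a)^{-1} \bigr) z^m \, dz \\
&= \sum_{n=0}^{\infty} \ell(a^n) \oint_{|z| = \|a\| + 1} z^{-n+m-1} \, dz = 2\pi i \, \ell(a^m).
\end{aligned}
\]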
Taking the mth root and the limit we see that the implicit constant disappears,
and we get
\[ \lim_{m \to \infty} \sqrt[m]{\|a^m\|} \leq s + \varepsilon = \max_{\lambda \in \sigma(a)} |\lambda| + \varepsilon. \]
11.2 C*-algebras
where we used the C*-property of the norm for $a^2$, normality of a, and the C*-property
of the norm for a∗ a and for a. Now suppose that (11.5) holds for a
given n ≥ 1 and set $b = a^{2^n}$. Then
\[ \|a^{2^{n+1}}\| = \|b^2\| = \|b\|^2 = \|a^{2^n}\|^2 = \|a\|^{2^{n+1}}, \]
where we used the definition of b, the case n = 1 for the normal element b,
and the inductive hypothesis. This concludes the induction, proving (11.5)
for all n > 0. Applying Theorem 11.6 now gives the proposition.
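The identity proved by this induction, and its failure without normality, can be observed for 2 × 2 matrices with the operator norm. The matrices `normal` and `nonnormal` and the helper names below are our own illustrative choices.

```python
import numpy as np

def op_norm(m):
    return np.linalg.norm(m, 2)

# A normal matrix: a rotation conjugate of diag(2, i).
t = 0.3
q = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])
normal = q @ np.diag([2.0, 1.0j]) @ q.T

# A non-normal matrix with spectrum {1} but large norm.
nonnormal = np.array([[1.0, 10.0],
                      [0.0, 1.0]])

def power_identity_holds(a, max_n=4):
    """Check ||a^(2^n)|| = ||a||^(2^n) for n = 0, ..., max_n."""
    return all(
        np.isclose(op_norm(np.linalg.matrix_power(a, 2 ** n)), op_norm(a) ** (2 ** n))
        for n in range(max_n + 1)
    )

print(power_identity_holds(normal), power_identity_holds(nonnormal))
```

For the normal matrix the norm equals the largest absolute value of an eigenvalue, as the proposition predicts; for the non-normal matrix the identity already fails at n = 1.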
Starting in Section 12.5, we will use the results of this section and their
refinements in Section 11.3.4 to obtain the spectral theory of commutat-
ive C ∗ -subalgebras of bounded operators on a Hilbert space.
Recall that the dual space A∗ of a Banach algebra A consists of all bounded
linear functionals A → C. If A is in addition commutative (with ab = ba for
all a, b ∈ A) then it is useful to study algebra homomorphisms. The trivial
map χ defined by χ(a) = 0 for all a ∈ A may also be considered an algebra
homomorphism, but we will exclude this trivial map in the discussion below.
If the Banach algebra that we consider also has a unit, then we can link the
notion of algebra homomorphisms to the spectrum of the elements of the
algebra. The following result establishes this link and a great deal more.
Theorem 11.23 (Properties of the Gelfand dual). Let A be a commutative
unital Banach algebra over C. Then $\sigma(A) \subseteq B_1^{A^*}$ is non-empty and
weak* compact, and σ(a) = {χ(a) | χ ∈ σ(A)} for every a ∈ A.
We start the proof of Theorem 11.23 by showing that any algebra homo-
morphism χ : A → C is continuous(31) (and so strictly speaking the continuity
hypothesis in Definition 11.22 could be dropped).
Lemma 11.24. Let A be a commutative Banach algebra, and let χ : A → C
be an algebra homomorphism. Then χ is continuous and kχk 6 1.
Proof. Suppose there is an element a ∈ A with ‖a‖ < 1 and with |χ(a)| ≥ 1.
Replacing a by a/χ(a) we may assume that ‖a‖ < 1 and χ(a) = 1. Then the
series $b = \sum_{n=1}^{\infty} a^n$ converges and satisfies a + ab = b, so that
(a + J)(b + J) = ab + J
Proof. The first claim is an easy consequence of the fact that the multiplic-
ation map A × A → A is continuous by the discussion in Section 2.4.2.
For the second claim, notice that a proper ideal J ⊆ A cannot contain 1A ,
nor indeed any invertible element. By Proposition 11.10 this implies that 1A
\[ S = \{ J \subseteq R \mid J \text{ is an ideal and } J_0 \subseteq J \subsetneq R \} \]
\[ A/M = \mathbb{C}(1_A + M) \cong \mathbb{C}. \]
χ(a − λ1A ) = 0
While the notions of invertibility and spectrum are linked to the existence
of a unit, the definition of the Gelfand dual is not. However, the topological
properties of σ(A) are changed by the absence of a unit.
which is easily seen to be a closed subset of $B_1^{A^*}$ in the weak* topology.
For the last claim of the corollary note that if A has a unit, then The-
orem 11.23 applies and gives the statement. So assume that A does not have
a unit, and consider the algebra A1 = A ⊕ C with the multiplication and
norm as in Exercise 11.1. As argued in the beginning of the proof of The-
orem 11.23, χ1 (1A ) = 1 for any χ1 ∈ σ (A1 ) so that χ1 is uniquely determined
by χ = χ1 |A ∈ σ(A) ∪ {0}. Moreover, any χ ∈ σ(A) ∪ {0} can be extended
to a character χ1 ∈ σ (A1 ) by setting
χ1 (a + λ1A ) = χ(a) + λ
for any a + λ1A ∈ A1 , which allows us to identify σ (A1 ) with σ(A) ∪ {0}.
Applying Theorems 11.23 and 11.6 to A1 now gives
\[ \max_{\chi \in \sigma(A) \cup \{0\}} |\chi(a)| = \max_{\chi \in \sigma(A_1)} |\chi(a)| = \lim_{n \to \infty} \sqrt[n]{\|a^n\|}. \]
Exercise 11.31. Show that every character χ ∈ σ(A) can be extended to a character χ1 ,
as claimed in the proof of Corollary 11.29.
f^o(χ) = χ(f)
Just as in Theorem 11.23 we will always use the weak* topology on σ(A).
As we will see in the course of the proof, this can be deduced from Pro-
position 11.11. This might be a little confusing initially. How can a property
like max_{λ∈σ(a)} |λ| ≤ ‖a‖ imply that σ(a) is real? One way of viewing the
situation is to apply a vertical translation to the set σ(a), as illustrated in
Figure 11.1.
[Figure 11.1 omitted; its labels are σ(a), σ(a − iy1_A), ‖a‖, ‖a − iy1_A‖, and the ball B_{‖a‖}^C.]
Fig. 11.1: Many possible λ ∈ C that satisfy the constraint |λ| ≤ ‖a‖ might not
satisfy the constraint |λ − iy| ≤ ‖a − iy1_A‖ if the norm of a − iy1_A for y ∈ R
is √(‖a‖² + |y|²), as Figure 11.1 suggests. Taking y → ∞ and y → −∞ shows that
the spectrum σ(a) is a subset of R.
11.4 Locally Compact Abelian Groups
where we have used the C*-property, the fact that a and 1_A are self-adjoint,
and the fact that ‖1_A‖ = 1 (see Exercise 11.20 and its hint on p. 581).
Writing λ = x0 + iy0 ∈ C with x0 , y0 ∈ R, the calculation above gives
where S¹ is the multiplicative unit circle and the group operation is pointwise
multiplication.
for f_1, f_2 ∈ L¹(G). The Gelfand dual σ(L¹(G)) of all non-trivial algebra
homomorphisms from L¹(G) onto C is a locally compact σ-compact metrizable
space which can be identified with the Pontryagin dual Ĝ. The Gelfand
transform can be identified with the Fourier back transform.
We now explain the two identifications in more detail. If χ_G ∈ Ĝ is a
continuous group homomorphism
χ_G : G → S¹ = {z ∈ C | |z| = 1},
for χ_G ∈ Ĝ. Since we identify the Pontryagin dual Ĝ with the Gelfand
dual σ(L¹(G)), we see that every character χ_G ∈ Ĝ corresponds precisely
to one χ_A ∈ σ(A) and vice versa, and that
f̌(χ_G) = χ_A(f) = f^o(χ_A)
is the Fourier back transform and is at the same time also the Gelfand
transform.
Proof of Proposition 11.38. By Proposition 3.91, L¹(G) is a separable
Banach algebra. By Exercise 3.92 (see also Lemma 3.59(1)), L¹(G) is
commutative. The proof that every continuous group homomorphism χ_G : G → S¹
gives rise to an algebra homomorphism
\[
\chi_A : L^1(G) \ni f \longmapsto \int_G f \chi_G \, dm
\]
is very similar to the proof for the case G = R^d in Proposition 9.31 and is
therefore left to the reader.
Corollary 11.29 shows that σ(L1 (G)) is locally compact in the weak* to-
pology. The claimed metrizability follows from Proposition 8.11 since L1 (G)
is separable. Finally, the σ-compactness follows since σ(L1 (G)) ∪ {0} is also
compact and metrizable by Corollary 11.29 and Proposition 8.11.
The main claim of the proposition is therefore that every non-trivial al-
gebra homomorphism χA : L1 (G) → C arises from some continuous group
for all f ∈ L¹(G). We have to show that χ can be chosen in C_b(G) and with
the property that χ(gh) = χ(g)χ(h) for all g, h ∈ G (which will also imply
that χ(g) ∈ S¹ for all g ∈ G).
For this proof we apply the algebra homomorphism property of the
map χA ∈ σ(L1 (G)) for f, f0 ∈ L1 (G) and obtain together with Fubini’s
theorem that
\[
\begin{aligned}
\int_G f(h)\,\chi(h)\chi_A(f_0)\,dm(h) = \chi_A(f)\chi_A(f_0) &= \chi_A(f * f_0) \\
&= \int_G \int_G f(h) f_0(g-h)\,dm(h)\,\chi(g)\,dm(g) \\
&= \int_G f(h) \int_G f_0(g-h)\chi(g)\,dm(g)\,dm(h).
\end{aligned}
\]
As this holds for any fixed f0 and for all f ∈ L1 (G), the uniqueness property
in Proposition 7.34 implies that
\[
\chi(h)\chi_A(f_0) = \int_G f_0(g-h)\chi(g)\,dm(g) = \chi_A(f_0^h) \tag{11.6}
\]
\[
\chi_G(g_1+g_2) = \chi_A(f_1)^{-1}\chi_A(f_1^{g_1+g_2}) = \chi_G(g_1)\chi_A(f_1)^{-1}\chi_A(f_1^{g_2}) = \chi_G(g_1)\chi_G(g_2)
\]
so ‖f̂_1‖_∞ = 1 = ‖f_1‖_1, but the maximum value of |f̂_1(t)| is attained precisely
at the point t = 0. Now consider
with
f̂_2(t) = f̂_1(t) − f̂_1(−t)
for t ∈ R and f̂_2(0) = 0. Hence |f̂_2(t)| achieves its maximum for some t_0 ≠ 0,
so that
‖f̂_2‖_∞ = |f̂_2(t_0)| ≤ |f̂_1(t_0)| + |f̂_1(−t_0)| < 2‖f_1‖_1 = ‖f_2‖_1,
showing that the Fourier transform (and hence a Gelfand transform) need
not be an isometry.
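The same phenomenon can be checked by hand in a much smaller setting. The Python sketch below replaces G by the finite cyclic group Z/3 with counting measure, which is an assumption made purely for illustration (the text works with a general G and, in the example above, with R). It evaluates all characters on a function f, checks that the resulting transform turns convolution into pointwise multiplication, and compares the supremum norm of the transform with the L¹ norm of f.

```python
import cmath


def convolve(f, g):
    """Convolution on Z/n: (f * g)(x) = sum over y of f(y) g(x - y)."""
    n = len(f)
    return [sum(f[y] * g[(x - y) % n] for y in range(n)) for x in range(n)]


def character(k, n):
    """The character x -> exp(2 pi i k x / n) of Z/n, as a list of values."""
    return [cmath.exp(2j * cmath.pi * k * x / n) for x in range(n)]


def transform(f):
    """Evaluate each functional f -> sum over x of f(x) chi_k(x) on f."""
    n = len(f)
    return [sum(f[x] * character(k, n)[x] for x in range(n)) for k in range(n)]


f = [1.0, -1.0, 0.0]
g = [2.0, 0.5, -1.0]
f_hat, g_hat, fg_hat = transform(f), transform(g), transform(convolve(f, g))
l1_norm = sum(abs(value) for value in f)          # equals 2
sup_norm = max(abs(value) for value in f_hat)     # equals sqrt(3)
```

Here the transform of f has supremum norm √3 while ‖f‖_1 = 2, so even on this three-element group the transform is a norm-decreasing homomorphism that is not an isometry.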
Exercise 11.40. Let G be as in Proposition 11.38. When does L1 (G) have a unit with
respect to convolution?
As the following exercise shows, the theory developed above is quite power-
ful. In fact the original proof of the Wiener lemma [114] was complicated and
the Gelfand theory allows for a clean simple proof.
Exercise 11.41 (Wiener lemma for C(T^d)). Let f ∈ C(T^d) be the limit of an
absolutely convergent Fourier series with f(x) ≠ 0 for all x ∈ T^d. Show that 1/f is also the limit
of an absolutely convergent Fourier series.
Exercise 11.42. (a) (Wiener theorem for L¹(T^d)) Let f ∈ L¹(T^d). Show that the
span ⟨λ_y f | y ∈ Z^d⟩ is dense in L¹(T^d) if and only if f̂(n) = ∫ f(x)χ_n(x) dx ≠ 0 for
all n ∈ Z^d.
(b) (Wiener theorem for L¹(R^d)) Let f ∈ L¹(R^d). Show that ⟨λ_y f | y ∈ R^d⟩ is dense
in L¹(R^d) if and only if f̂(t) ≠ 0 for all t ∈ R^d.
for j = 1, . . . , n. This shows that U_{K,ε/M}(χ_0) ⊆ N_{f_1,...,f_n;ε}(χ_0), which gives
one direction for the equivalence of the two topologies.
For the reverse direction, we fix some χ_0 ∈ Ĝ, a compact subset K ⊆ G,
and ε > 0. We need to find some f_0, f_1, . . . , f_n ∈ L¹(G) and δ > 0 so that
for all χ ∈ N_{f_0;1/3}(χ_0) and h ∈ G. In other words, the equi-continuity
properties of all elements χ ∈ N_{f_0;1/3}(χ_0) at 0 are controlled by the continuity of the
map G ∋ h ↦ f_0^h ∈ L¹(G).
We set δ = min{ε/5, 1/3}. Using (11.8) and Lemma 3.74 we find some open
neighbourhood B ⊆ G of 0 ∈ G with compact closure such that h ∈ B, g_0 ∈ G,
and χ ∈ N_{f_0;1/3}(χ_0) implies
We define f_j = (1/m(B)) 1_{g_j+B} for j = 1, . . . , n and claim that N_{f_0,f_1,...,f_n;δ}(χ_0)
is the neighbourhood we were looking for. Indeed, let
For any g ∈ g_j + B we can now combine (11.9) and (11.11) for χ and χ_0, and
the assumption χ ∈ N_{f_j;δ}(χ_0) to obtain
for all g ∈ g_j + B. Varying j and using (11.10) this implies (11.7). Thus, the
neighbourhoods of χ_0 ∈ Ĝ in the weak* topology are precisely the neighbourhoods
of χ_0 with respect to the topology of uniform convergence on compact
subsets of G.
With this identification of the topology on Ĝ, it is straightforward to check
that the group operations are continuous. Indeed,
Ĝ ∋ χ ↦ χ⁻¹ = χ̄
is continuous. Similarly, for χ_0, η_0 ∈ Ĝ we therefore know that χ ∈ U_{K,ε/2}(χ_0)
and η ∈ U_{K,ε/2}(η_0) imply
‖χη − χ_0η_0‖_{K,∞} ≤ ‖χη − χ_0η‖_{K,∞} + ‖χ_0η − χ_0η_0‖_{K,∞} < ε/2 + ε/2 = ε,
showing continuity of the group operation Ĝ × Ĝ ∋ (χ, η) ↦ χη ∈ Ĝ.
The following exercises give further examples of the duality between a
group and its dual, both viewed as topological groups.
Exercise 11.44. (a) Suppose that G is a compact metrizable abelian group. Show that Ĝ
is discrete (and countable).
(b) Suppose that G is a countable discrete abelian group. Show that Ĝ is compact (and
metrizable).
• The study of L1 (G) for a locally compact abelian group can lead to a vast
generalization of the theory of Fourier series and the Fourier transform
to all such groups. This is known as Pontryagin duality or harmonic
analysis on locally compact abelian groups and will be discussed further
in Section 12.8.
• Another important class of Banach algebras with additional structure are
the von Neumann algebras. These are special C*-sub-algebras of B(H) for
a Hilbert space H. We refer to Blackadar [11] for an overview.
Chapter 12
Spectral Theory and Functional
Calculus
In this chapter we use results from Chapter 11 to prove the spectral theorem
and develop the functional calculus for single self-adjoint operators and for
certain commutative C*-algebras (arising, for example, from unitary
representations of locally compact abelian groups). As an example of a self-adjoint
operator we discuss the Laplace operator on a regular tree.
In this section we will study the spectrum (as defined for abstract algebras in
Section 11.1) in the context of bounded operators on a Hilbert space. More
precisely, we fix a complex Hilbert space H, let A = B(H) be the Banach
algebra of bounded operators, and study the spectrum of some T ∈ A.
As we have already seen in Exercise 6.25 and Example 9.1, the discrete
spectrum may well be empty for a given bounded operator. For the operators
in these examples the notion of eigenvector has to be generalized to a sequence
of approximate eigenvectors in the following sense.
For normal operators we will see that the approximate point spectrum
coincides with the whole spectrum. We now try to describe the non-discrete
part of the spectrum further.
In general the approximate point spectrum may not yet describe the whole
spectrum, which motivates the next definition.
im(T − λI) ≠ H.
Exercise 12.6. (a) Show that σappt (T ) is a closed subset of C for any T ∈ B(H), and
that σapprox (T ) is a closed subset of C for any normal operator T ∈ B(H).
(b) Let H = L2 ([0, 1])2 and define T ∈ B(H) by T (f, g) = (MI f, f ) for all (f, g) ∈ H (so
that T (f, g)(x) = (xf (x), f (x)) for x ∈ (0, 1)). Show that σapprox (T ) is equal to (0, 1], and
in particular is not closed.
(c) Find an example of an operator T ∈ B(H) for which σdisc (T ) and σcont (T ) are not closed
subsets of C. More specifically, find an example of a self-adjoint operator for which σdisc (T )
is countable and dense in σapprox (T ) = σappt (T ) = [0, 1].
Exercise 12.7. Suppose Tj ∈ B(Hj ) for j = 1, 2 are bounded operators on two Hilbert
spaces H1 , H2 . Let T = T1 × T2 ∈ B(H1 × H2 ). Show that σdisc (T ) = σdisc (T1 )∪ σdisc (T2 ),
and similarly for σappt and σapprox . Find an example of a pair of self-adjoint operators
showing that the corresponding statement does not hold for the continuous spectrum.
Exercise 12.8. (a) Let µ be a compactly supported σ-finite measure on C, and let
for v ∈ L2µ (C) be the multiplication operator corresponding to the identity map on C.
Show that
and that σresid (MI ) is empty (here µcont is the measure determined by the decomposi-
tion µ = µcont + µdisc , where µcont has no atoms and µdisc is purely atomic).
(b) Let (X, B, µ) be a σ-finite measure space, and let g : X → C be a bounded measurable
function. Generalize (a) to the multiplication operator Mg on L2µ (X).
(c) Let X = [0, 1] ⊆ R, and let λcount be the counting measure on Q ∩ [0, 1] considered
as a σ-finite measure on X. Let MI be as in part (a). Describe each of the parts of the
spectrum of MI .
Example 12.9. Let T : ℓ²(N) → ℓ²(N) be the operator from Exercise 6.23(c)
defined by T(v_n) = (0, v_1, v_2, . . . ). Then ‖Tv‖ = ‖v‖ for any v ∈ ℓ²(N), and
so
0 ∉ σ_appt(T) = σ_disc(T) ∪ σ_approx(T).
However, the image of T is the proper closed subspace {v ∈ ℓ2 (N) | v1 = 0},
so 0 ∈ σresid (T ).
Exercise 12.10. For the operator T from Example 12.9, show that
σdisc (T ) = ∅,
σapprox (T ) = σcont (T ) = S1 = {λ ∈ C | |λ| = 1}, and
σ_resid(T) = B_1^C = {λ ∈ C | |λ| < 1}.
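To get a feeling for the claim about the unit circle, one can test candidate approximate eigenvectors numerically. The Python sketch below applies the right shift to the finitely supported vector v = (λ⁻¹, λ⁻², . . . , λ⁻ᴺ) for a point λ on the unit circle and measures ‖(T − λI)v‖/‖v‖. The choice of these vectors, the value of λ, and the function names are assumptions of this illustration and are not part of the exercise statement.

```python
import cmath
import math


def right_shift(v):
    """Right shift on finitely supported sequences: (v1, v2, ...) -> (0, v1, v2, ...)."""
    return [0j] + list(v)


def norm(v):
    """Euclidean (l^2) norm of a finite sequence."""
    return math.sqrt(sum(abs(x) ** 2 for x in v))


def defect(lam, N):
    """Return ||(T - lam I) v|| / ||v|| for v = (lam^-1, lam^-2, ..., lam^-N)."""
    v = [lam ** (-n) for n in range(1, N + 1)]
    shifted = right_shift(v)
    scaled = [lam * x for x in v] + [0j]   # pad to the length of the shifted vector
    difference = [a - b for a, b in zip(shifted, scaled)]
    return norm(difference) / norm(v)


lam = cmath.exp(0.7j)   # an arbitrary point on the unit circle
defects = [defect(lam, N) for N in (10, 100, 1000)]
```

For these vectors all but the first and last coordinates of (T − λI)v cancel, so the quotient equals √(2/N) and tends to 0, which is consistent with λ being an approximate eigenvalue.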
The next lemma gives the main relationship between the parts of the
spectrum from this section and the spectrum in the sense of Definition 11.2
for A = B(H).
λ ∉ σ_appt(T) ∪ σ_resid(T).
V = im(T − λI) ≠ H.
The following definition is useful because it gives an ‘upper bound’ for the
spectrum.
Definition 12.12. The numerical range of T ∈ B(H) is the set
Exercise 12.14. Show that N (T ) is really only an upper bound for the spectrum of an
operator T ∈ B(H) by showing that N (T ) is the convex hull of the eigenvalues of T if T is
diagonalizable, that is, if H admits an orthonormal basis consisting of eigenvectors of T .
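A small numerical experiment makes the convex hull statement plausible. The sketch below assumes the usual definition N(T) = {⟨Tv, v⟩ | ‖v‖ = 1} of the numerical range (the displayed formula of Definition 12.12 is not reproduced above) and takes T = diag(0, 1, i) on C³; the eigenvalues, the random sampling, and the helper names are illustrative choices only.

```python
import random


def numerical_range_point(eigenvalues, v):
    """<Tv, v> for T = diag(eigenvalues) and a unit vector v."""
    return sum(lam * abs(x) ** 2 for lam, x in zip(eigenvalues, v))


def random_unit_vector(dim, rng):
    """A random vector in C^dim, normalized to length one."""
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)]
    length = sum(abs(x) ** 2 for x in v) ** 0.5
    return [x / length for x in v]


rng = random.Random(0)
eigenvalues = [0.0, 1.0, 1j]   # vertices of a triangle in the complex plane
points = [numerical_range_point(eigenvalues, random_unit_vector(3, rng))
          for _ in range(1000)]
```

Each sampled value is the convex combination of the eigenvalues with weights |v_j|², so every point lies in the closed triangle with vertices 0, 1 and i, even though the spectrum consists only of the three vertices.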
12.2 The Spectrum of a Tree
Exercise 12.19. Let (X, µ) be a σ-finite measure space and g ∈ L^∞_µ(X). Show that the
essential spectrum σess (Mg ) of the normal operator Mg is given by
Fig. 12.1: The 3-regular tree is illustrated here by showing all vertices of distance
no more than 4 from a given initial vertex v0 (also called the root). Of course the
pattern repeats indefinitely, from w and from all the other vertices at distance 4
from our chosen root v0 .
At first sight there are three natural operators that we can define on ℓ2 (V)
using the tree structure (and our discussion will also involve a fourth). In the
following we fix p ≥ 2 and a (p + 1)-regular tree (V, E).
Definition 12.20. The averaging operator on ℓ²(V) is defined by
\[
T(f)(v) = \frac{1}{p+1} \sum_{w \sim v} f(w),
\]
Exercise 12.22. Show that the summing operator S : ℓ²(V) → ℓ²(V) on a (p + 1)-regular
tree is a self-adjoint bounded operator with ‖S‖ ≤ p + 1. Show that there is no eigenvalue
λ ∈ σ_disc(S) of absolute value |λ| = p + 1.
While it is not difficult to see that ‖S‖ ≤ p + 1, one might also guess that
this upper bound is not the real value of ‖S‖. Indeed, the proof of the last
statement of Exercise 12.22 already hints at this. Due to the very rapid
growth in the number of vertices in balls B_n^V(v_0) (measured with respect to
the natural path length on the tree), elements of ℓ²(V) must decay rather
rapidly. We start by calculating ‖S‖ and go on to discuss the spectrum of S
on ℓ²(V).
Theorem 12.23. (32) Let p ≥ 2 and let (V, E) be a (p + 1)-regular tree. The
summing operator S : ℓ²(V) → ℓ²(V) satisfies ‖S‖ = 2√p < p + 1.
Proof of the lower bound in Theorem 12.23. We first show ‖S‖ ≥ 2√p
by considering the function f = f_N defined by
\[
f(v) = \begin{cases} p^{-\frac12 d(v,v_0)} & \text{if } d(v,v_0) \le N; \\ 0 & \text{if } d(v,v_0) > N, \end{cases}
\]
Now calculate
\[
S(f)(v) = p^{-(n-1)/2} + \sum_{\substack{w\sim v \\ d(w,v_0)=n+1}} p^{-(n+1)/2} = \sqrt{p}\,p^{-n/2} + p\,p^{-(n+1)/2} = 2\sqrt{p}\,f(v)
\]
by using the same calculation as for ‖f‖²_2 again. On dividing this lower bound
by (12.2) and letting N → ∞ we deduce that ‖S‖ ≥ 2√p.
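The convergence of the quotient ‖S f_N‖/‖f_N‖ to 2√p can be observed numerically without building the tree. The Python sketch below uses only two facts about the (p + 1)-regular tree: the sphere of radius n ≥ 1 around the root has (p + 1)pⁿ⁻¹ vertices, and a vertex at distance n ≥ 1 has one neighbour at distance n − 1 and p neighbours at distance n + 1. The function name and the radial bookkeeping are assumptions of this sketch, not notation from the text.

```python
import math


def radial_norm_ratio(p, N):
    """Return ||S f_N|| / ||f_N|| for the radial function f_N on the (p+1)-regular tree."""
    # a[n] is the value of f_N on the sphere of radius n around the root
    a = [p ** (-n / 2) if n <= N else 0.0 for n in range(N + 3)]

    def sphere_size(n):
        return 1 if n == 0 else (p + 1) * p ** (n - 1)

    norm_f_sq = sum(sphere_size(n) * a[n] ** 2 for n in range(N + 1))
    norm_sf_sq = 0.0
    for n in range(N + 2):           # S f_N is supported on distances 0, ..., N + 1
        if n == 0:
            value = (p + 1) * a[1]   # the root has p + 1 neighbours at distance 1
        else:
            value = a[n - 1] + p * a[n + 1]
        norm_sf_sq += sphere_size(n) * value ** 2
    return math.sqrt(norm_sf_sq / norm_f_sq)


ratios = {N: radial_norm_ratio(2, N) for N in (10, 50, 200)}
```

For p = 2 the computed quotients increase with N and stay below 2√2, in agreement with the lower bound argument.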
Exercise 12.24. Show that the sequence (1/‖f_N‖) f_N in the previous proof is a sequence
of approximate eigenvectors of S in the sense of Definition 12.2.
For the proof of the upper bound we use an argument that goes back to
work of Gabber and Galil [38], which can also be used for other graphs.
Proof of the upper bound in Theorem 12.23. Let G = (V, E) be an
undirected graph with the property that every vertex v ∈ V has at most N
neighbours for some fixed N ∈ N. The summing operator S is again defined
by
\[
S(f)(v) = \sum_{w\sim v} f(w)
\]
or equivalently
\[
\pm 2\Re\bigl(f(v)\overline{f(w)}\bigr) \le |f(v)|^2\lambda(v,w) + |f(w)|^2\lambda(w,v)
\]
for any f ∈ ℓ²(V). Using Lemma 6.31 this implies that ‖S‖ ≤ ρ.
It remains to define λ(v, w) in the context of the (p + 1)-regular tree so
that ρ = 2√p. We again use a root v_0 ∈ V and define
\[
\lambda(v,w) = \begin{cases} p^{-1/2} & \text{if } d(w,v_0) = d(v,v_0) + 1; \\ p^{1/2} & \text{if } d(w,v_0) = d(v,v_0) - 1. \end{cases}
\]
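The following short computation checks this choice. It assumes that ρ stands for the supremum over v of the sums ∑_{w∼v} λ(v, w) and that the weights are required to satisfy λ(v, w)λ(w, v) = 1; both are assumptions about the part of the argument that is not reproduced above.

```latex
% Assumption: \rho = \sup_v \sum_{w \sim v} \lambda(v,w), with \lambda(v,w)\lambda(w,v) = 1.
\begin{align*}
\lambda(v,w)\,\lambda(w,v) &= p^{-1/2}\,p^{1/2} = 1
  && \text{for neighbours } v \sim w, \\
\sum_{w \sim v} \lambda(v,w) &= p^{1/2} + p \cdot p^{-1/2} = 2\sqrt{p}
  && \text{for } v \neq v_0, \\
\sum_{w \sim v_0} \lambda(v_0,w) &= (p+1)\,p^{-1/2} \le 2\sqrt{p}
  && \text{since } p + 1 \le 2p .
\end{align*}
```

A vertex other than the root has one neighbour closer to v_0 and p neighbours further away, which gives the middle line; the root only has neighbours further away.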
We outline in this subsection, via a series of exercises, a proof of the fact that
the summing operator S on the (p + 1)-regular tree has no discrete spectrum.
For this proof we will use yet another normalization of the averaging and
summing operators. We refer to this as the unitarily normalized summation,
U_1 = (1/√p) S.
In fact we will also need the operators U_n for n ≥ 0 as defined in the next
exercise.
Exercise 12.27. For any n ≥ 0, let U_n be the operator that maps any function f on
a (p + 1)-regular tree to the function U_n(f) defined by
\[
U_n(f)(v) = \frac{1}{p^{n/2}} \sum_{\substack{k \le n \\ k \equiv n \ (\mathrm{mod}\ 2)}} \ \sum_{w \sim_k v} f(w),
\]
where w ∼_k v means that w and v have distance k in the (p + 1)-regular tree. Then the
sequence of operators (U_n) satisfies U_0 = I, U_1 = (1/√p) S, and
U_{n+1} = U_1 ∘ U_n − U_{n−1}
for n ≥ 1.
Definition 12.28. The Chebyshev polynomials of the second kind are the
polynomials U_n ∈ Z[x] defined recursively by U_0(x) = 1, U_1(x) = 2x,
and U_{n+1}(x) = 2xU_n(x) − U_{n−1}(x) for n ≥ 1.
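The recursion is easy to run. The Python sketch below evaluates U_n(x) directly from the definition and compares U_n(cos θ) with the classical closed form sin((n + 1)θ)/sin θ; the function name and the sample angle are choices made for this illustration.

```python
import math


def chebyshev_u(n, x):
    """Evaluate U_n(x) via U_0 = 1, U_1 = 2x, U_{k+1} = 2x U_k - U_{k-1}."""
    previous, current = 1.0, 2.0 * x
    if n == 0:
        return previous
    for _ in range(n - 1):
        previous, current = current, 2.0 * x * current - previous
    return current


theta = 0.3
values = [chebyshev_u(n, math.cos(theta)) for n in range(6)]
closed_form = [math.sin((n + 1) * theta) / math.sin(theta) for n in range(6)]
```

The two lists agree up to rounding error, which is the identity that makes the substitution x = cos θ convenient when working with these polynomials.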
\[
\sum_{w \sim_{2n} v} |f(w)|^2 \ge \frac12 \Bigl( U_{2n}(\cos\theta) - \frac1p U_{2n-2}(\cos\theta) \Bigr)^2 |f(v)|^2
\]
12.3 Main Goals: The Spectral Theorem and Functional Calculus
The main goal of this chapter is to establish two related theorems about
normal operators, the first of which gives a complete classification of normal
operators in terms of operators as in the next example (which featured in
other forms before).
Example 12.31. Let H = L²(X, µ) for a σ-finite measure space (X, µ), and
let g : X → C be a bounded measurable function. The multiplication operator
M_g is then normal on H. We claim that the spectrum σ(M_g) is the essential
range of g, consisting of all z ∈ C with the property that µ(g⁻¹(U)) > 0
for any neighbourhood U of z. Note first that we have M_g − λI = M_{g−λ}.
If X_λ = {x ∈ X | g(x) = λ} has positive measure (which clearly implies
that λ belongs to the essential range), then λ lies in σ_disc(M_g) since, for
example, 1_B ∈ ker(M_g − λI)∖{0} for any measurable B ⊆ X_λ of positive finite
measure. If on the other hand λ lies in σ_disc(M_g) and v ∈ ker(M_g − λI)∖{0},
then µ({x ∈ X | v(x) ≠ 0}∖X_λ) = 0, so that µ(X_λ) > 0.
So suppose now g(x) ≠ λ almost everywhere. Then we can solve the equation
(M_g − λI)u = v, formally, for any v ∈ L²(X, µ), by putting
u = (g − λ)⁻¹ v,
v ↦ (g − λ)⁻¹ v
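In the simplest case, where X is a finite set with counting measure, this formal solution is an honest inverse and everything can be checked directly. The Python sketch below is such a finite model; the set X, the values of g, and the function names are assumptions made for illustration only.

```python
def multiply(g, v):
    """Apply the multiplication operator M_g to v, pointwise on a finite set X."""
    return [gx * vx for gx, vx in zip(g, v)]


def resolvent(g, lam, v):
    """Solve (M_g - lam I) u = v, assuming lam is not a value of g."""
    return [vx / (gx - lam) for gx, vx in zip(g, v)]


def resolvent_norm(g, lam):
    """Operator norm of v -> (g - lam)^(-1) v, i.e. the sup norm of 1/(g - lam)."""
    return max(1.0 / abs(gx - lam) for gx in g)


g = [0.0, 0.5, 0.5, 2.0]      # values of g on a four-point set X
v = [1.0, -2.0, 3.0, 4.0]
lam = 1.0 + 1.0j              # not a value of g
u = resolvent(g, lam, v)
check = [mu - lam * ux for mu, ux in zip(multiply(g, u), u)]   # (M_g - lam I) u
```

The norm of the inverse is the reciprocal of the distance from λ to the set of values of g, so it blows up exactly when λ approaches a value of g; in the general setting the essential range plays the role of this set of values.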
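\[
\begin{array}{ccc}
H & \xrightarrow{\ T\ } & H \\
{\scriptstyle\phi}\downarrow & & \downarrow{\scriptstyle\phi} \\
L^2_\mu(X) & \xrightarrow{\ M_g\ } & L^2_\mu(X)
\end{array}
\]
commutes.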
‖f(T)‖_op = ‖f‖_∞   (12.6)
(f (T ))∗ = f (T ∗ )
h(f (T )) = (h ◦ f )(T ).
of g might not lie entirely in σ(Mg ) (and so in the domain of the continuous or
measurable function). In fact, the description of the spectrum σ(Mg ) in (12.4)
shows that
(the complement of the support being the largest open set with measure 0)
so that, for almost every x ∈ X, g(x) lies in σ(Mg ) and therefore f (g(x)) is
defined for almost every x (we can set the value of the function f (g(x)) on
the zero-measure subset where g(x) ∉ σ(M_g) to be 0). This shows that all
the expressions in property (FC4) make sense.
For simplicity we will start with the case of self-adjoint operators, which
only needs the material from Section 11.1 and Section 11.2. In Section 12.5
we will start the discussion of commutative C*-sub-algebras of B(H), which
includes the case of finitely many commuting normal operators and builds on
Section 11.3.
Before continuing, let us note a slightly confusing point in the notation
for the functional calculus. As usual I denotes the identity map x ↦ x
and 1 denotes the constant function x ↦ 1. Thus (FC1) states in particular
that 1(T ) = I and I(T ) = T for any normal operator T ∈ B(H). The
connection to multiplication operators should help to explain why this makes
sense.
\[
\mathrm{FC}(p) = p(T) = \sum_{j=0}^{d} a_j T^j
\]
FC(f)* = FC(f̄)
Proof. For (1), observe first that the statement is trivially true if p is a
constant. If p has degree at least one, then fix λ ∈ C and factor the polyno-
12.4 Self-Adjoint Operators
If λ ∉ p(σ(T)), then the solutions λ_i to the equation p(z) = λ are not in σ(T),
so each factor T − λ_i I is invertible, and hence p(T) − λI is invertible. It follows
that
σ(p(T )) ⊆ p(σ(T )).
Conversely, if λ ∈ p(σ(T )), then one of the λi must lie in σ(T ). Because the
factors commute, we can assume without loss of generality that either i = 1
if T − λi I is not surjective — in which case p(T ) − λI is not surjective either,
or i = d if T − λi I is not injective — in which case neither is p(T ) − λI. In
all situations, λ ∈ σ(p(T )), proving the reverse inclusion by Proposition 4.25.
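For matrices the identity σ(p(T)) = p(σ(T)) can be verified by direct computation. The Python sketch below evaluates p(T) = ∑ a_j Tʲ by Horner's scheme for a symmetric 2×2 matrix and compares the eigenvalues of p(T) with the values of p on the eigenvalues of T. The matrix, the polynomial, and the helper names are assumptions of this example.

```python
import math


def matmul(x, y):
    """Multiply two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly_of_matrix(coeffs, t):
    """Horner evaluation of p(T) = sum over j of coeffs[j] T^j."""
    n = len(t)
    result = [[0.0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = matmul(result, t)
        for i in range(n):
            result[i][i] += c      # add c times the identity
    return result


def poly(coeffs, z):
    """Horner evaluation of the scalar polynomial p(z)."""
    value = 0.0
    for c in reversed(coeffs):
        value = value * z + c
    return value


def eig_sym2(m):
    """Eigenvalues of a real symmetric 2x2 matrix, in increasing order."""
    mean = (m[0][0] + m[1][1]) / 2
    radius = math.sqrt(((m[0][0] - m[1][1]) / 2) ** 2 + m[0][1] ** 2)
    return (mean - radius, mean + radius)


t = [[1.0, 1.0], [1.0, 1.0]]     # self-adjoint with eigenvalues 0 and 2
coeffs = [1.0, -3.0, 1.0]        # p(z) = 1 - 3z + z^2
p_t = poly_of_matrix(coeffs, t)
```

Here p(0) = 1 and p(2) = −1, and these are exactly the eigenvalues of the matrix p(T) computed by the sketch.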
The use of Proposition 4.25 here can be avoided by using the fact that
as desired.
Exercise 12.39. Analyze the proof of Theorem 12.37 above and find out where the argu-
ment fails for a normal operator that is not self-adjoint.
where (λn ) is the sequence of (real) eigenvalues of T with (en ) the sequence of
corresponding eigenvectors. If dim H < ∞ then the spectrum simply consists
of the eigenvalues, and if dim H = ∞ then
The following exercise generalizes Lemma 12.38(1) to any continuous func-
tion.
Exercise 12.44 (Spectral mapping theorem). Let H be a complex Hilbert space, and
let T ∈ B(H) be a self-adjoint operator. Show that σ(f (T )) = f (σ(T )) for any f ∈ C(σ(T )).
\[
B = \tfrac12 (B + B^*) + i\,\tfrac{1}{2i}(B - B^*)
\]
and both ½(B + B*) and (1/(2i))(B − B*) are self-adjoint. Thus it remains to show
that every self-adjoint operator can be written as a linear combination of two
unitary operators.
So let A be a self-adjoint operator on H and assume without loss of generality
that ‖A‖_op ≤ 1. Then I − A² is positive since it is clearly self-adjoint
and
⟨(I − A²)v, v⟩ = ‖v‖² − ‖Av‖² ≥ 0
for all v ∈ H. By Corollary 12.45 we find an operator U = A + i(I − A²)^{1/2}
satisfying
UU* = U*U = A² + (I − A²) = I
and ½(U + U*) = A, which shows that A is the linear combination of two
unitary operators.
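For a diagonal operator the construction reduces to a statement about numbers in [−1, 1], which the following Python sketch checks. Restricting to diagonal A is an assumption made to keep the example free of matrix square roots; the function name and sample entries are illustrative.

```python
import math


def unitary_parts(a_diag):
    """For A = diag(a_diag) with entries in [-1, 1], return the diagonals of
    U = A + i (I - A^2)^(1/2) and of its adjoint U*."""
    u = [complex(a, math.sqrt(1.0 - a * a)) for a in a_diag]
    return u, [z.conjugate() for z in u]


a_diag = [-1.0, -0.25, 0.0, 0.6]
u, u_star = unitary_parts(a_diag)
average = [(z + w) / 2 for z, w in zip(u, u_star)]
```

Each diagonal entry of U lies on the unit circle, so U is unitary, and averaging U with its adjoint recovers the entries of A.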
The next corollary, which will be generalized later, starts to show how the
functional calculus can be used to provide detailed information about the
spectrum.
Corollary 12.46 (Isolated points). Let H be a complex Hilbert space and
let T ∈ B(H) be a bounded self-adjoint operator. Let λ ∈ σ(T ) be an isolated
point meaning that there is some ε > 0 for which σ(T ) ∩ (λ − ε, λ + ε) = {λ}.
Then λ ∈ σdisc (T ).
f = 1{λ} : σ(T ) → C
P = f(T) = f(T)² = P²,
and
P = f(T) = f̄(T) = P*
since f is real-valued, which shows that P is an orthogonal projection.
Moreover, we have an identity of continuous functions
Using the functional calculus, we can now clarify how the spectrum represents
an operator T and its action on vectors v ∈ H.
ℓ : C(σ(T)) → C
f ↦ ⟨f(T)v, v⟩
for all f ∈ C(σ(T)). Moreover, taking f = 1, we obtain (12.11) (which also
implies that ‖ℓ‖ = ‖v‖²).
Hv = {f (T )v | f ∈ C(σ(T ))}
(If )(T ) = T f (T ).
and hence
φ ◦ T (f (T )v) = If = MI φ(f (T )v)
for all f ∈ C(σ(T )). By the density of the vectors f (T )v ∈ Hv for func-
tions f ∈ C(σ(T )) we obtain
φ ◦ T = MI ◦ φ. (12.12)
Thus the above discussion proves a special case of Theorem 12.33, namely
the case where T is self-adjoint and there exists some vector v with Hv = H.
It is important in this reasoning to keep track of the measure µv , which
depends on the vector v, and to remember that elements of L2 are actually
equivalence classes of functions. Indeed, it could well be that µv has support
which is much smaller than the spectrum, and then the values of a continuous
function f outside the support are irrelevant in viewing f as an element
of L²_{µ_v}. In particular, the map C(σ(T)) → L²_{µ_v}(σ(T)) is not necessarily
injective.
Definition 12.52. Let H be a Hilbert space and T ∈ B(H). The cyclic sub-
space generated by a vector v ∈ H (also called the cyclic vector for Hv )
equals the closure
\[
H_v = \overline{\{f(T)v \mid f \in C(\sigma(T))\}} = \overline{\langle T^n v \mid n \in \mathbb{N}_0\rangle}.
\]
It is not always the case that T admits a cyclic vector for all of H. However,
we have the following lemma which allows us to reduce many questions to
the cyclic case.
Notice that if H is separable, the index set in the above result is either
finite or countable, since each Hi is non-zero.
We can now prove Theorem 12.33 for a single self-adjoint operator.
Mg ◦ φ = φ ◦ T.
for all n ≥ 1. If the list of subspaces is finite, H_1, . . . , H_{n_0} say, then we
set H_n = {0} for n > n_0 and still work with the index set N.
By the argument at the beginning of this section, we have unitary maps
whenever this makes sense (for example, if f ≥ 0, equivalently f_n ≥ 0 for
all n, or if f is integrable, which is equivalent to f_n being µ_n-integrable for
all n and the sum of the integrals of |f_n| being convergent). In particular,
\[
\mu(X) = \sum_{n\ge1} \mu_n(\sigma(T)) = \sum_{n\ge1} n^{-2} < \infty,
\]
which implies that g_*µ({z | f(z) < 0}) = µ({(z, n) | f(z) < 0}) = 0 and
so f ≥ 0, since σ(T) = σ(M_g) = Supp(g_*µ) by Example 12.31.
As the following exercises show, the material above is also useful for the study
of unitary representations.
Exercise 12.57. Let G be a topological group, H_1 and H_2 complex Hilbert spaces, and
let π_1 : G ↻ H_1 and π_2 : G ↻ H_2 be unitary representations of G. Suppose that π_1 and π_2
are isomorphic in the sense that there exists a bijective bounded operator T from H_1
to H_2 with Tπ_1(g) = π_2(g)T for all g ∈ G. Show that this implies that π_1 and π_2
are also unitarily isomorphic, meaning that T can be chosen to be in addition a unitary
isomorphism T : H_1 → H_2.
Exercise 12.59. Let G be a topological group, and recall the set P1 (G) of normalized
positive-definite functions in Cb (G) from Exercise 9.55. Show that p ∈ P1 (G) is extreme
in P1 (G) if and only if the associated unitary representation from Exercise 9.55(a) is
irreducible.
12.5 Commuting Normal Operators
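\[
\begin{array}{ccc}
H & \xrightarrow{\ a\ } & H \\
{\scriptstyle\phi}\downarrow & & \downarrow{\scriptstyle\phi} \\
L^2_\mu(X) & \xrightarrow{\ M_{g_a}\ } & L^2_\mu(X)
\end{array}
\]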
A ∋ a ↦ a^o ∈ C(σ(A))
(a*)^o = \overline{a^o}   (12.13)
for all a ∈ A. (We note that Lemma 11.35 in the proof of Corollary 11.34 can
here be replaced by Lemma 12.15.) We recall that σ(A) is the generalization
of the spectrum of a single operator and note that in the following the inverse
map
C(σ(A)) ∋ f = a^o ↦ a ∈ A
should be thought of as a generalized continuous functional calculus.
Proof of Theorem 12.60. Fix v ∈ H and define a linear functional
Λ : C(σ(A)) → C
by
Λ(a^o) = ⟨av, v⟩
for every a ∈ A. We claim that Λ is a positive functional on C(σ(A)). Suppose
that a ∈ A with a^o ≥ 0. Then there exists some b = b* ∈ A (defined using
the Gelfand transform by b^o = √(a^o)) with b² = a. The claimed positivity now
follows, since
Λ(a^o) = ⟨av, v⟩ = ⟨bv, bv⟩ ≥ 0.
By the Riesz representation theorem (Theorem 7.44) there exists a positive
finite measure µ_v on σ(A) such that
\[
\langle av, v\rangle = \int_{\sigma(A)} a^o \, d\mu_v
\]
for all a ∈ A by (12.13). Just as in Sections 9.1.2 and 12.4.4, this induces a
unitary isomorphism between the cyclic subspace H_v = \overline{Av} and L²_{µ_v}(σ(A))
which sends av ∈ Av to a^o ∈ C(σ(A)). In particular, for a, b ∈ A we have
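with measure
\[
\mu = \bigsqcup_{n\in\mathbb{N}} \mu_{w_n^\perp},
\]
where the disjoint union notation indicates that we consider µ_{w_n^⊥} as a measure
on σ(A) × {n} and then take the sum to obtain the measure µ on X.
With this
\[
L^2_\mu(X) \cong \bigoplus_{n\in\mathbb{N}} L^2(X, \mu_{w_n^\perp}) \cong \bigoplus_{n\in\mathbb{N}} H_{w_n^\perp} = H, \tag{12.14}
\]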
12.6 Spectral Measures and the Measurable Functional Calculus
φ : H → L2µ (X),
Exercise 12.61. (a) Suppose that A is the unital commutative C*-algebra generated by T,
a normal operator on a complex Hilbert space H, in the sense that
\[
A = \overline{\langle T^m (T^*)^n \mid m, n \ge 0 \rangle}
\]
is continuous and injective. Give a concrete example to show that the image of ı may not
be all of σ(T1 ) × σ(T2 ).
Exercise 12.62. State and prove a spectral theorem for normal compact operators as a
corollary of Theorem 12.60.
For the proof of Theorem 12.34 we now discuss some more general spectral
measures. As it makes little difference whether we consider a single (self-
adjoint or normal) operator or a commutative Banach algebra (as in the
previous section), we will do the latter. The reader only interested in the
case of a single self-adjoint operator T may replace the use of Theorem 12.60
below by Theorem 12.55, set σ(A) = σ(T ) as in Exercise 12.61(a) and replace
the operation a^o ↦ a ∈ A by the continuous functional calculus
C(σ(T)) ∋ f ↦ f(T).
Using the spectral measures from above, we can now define for every measurable
function f ∈ ℒ^∞(σ(A)) a corresponding operator f_H ∈ B(H). In the
case of A being generated by I and a normal operator T this gives a definition
of f(T) for a function f ∈ ℒ^∞(σ(T)).
for all w ∈ H. By linearity of v 7→ µv,w and the bound (12.17), we see that
v ↦ v_f = f_H v
Proof. Recall that Corollary 11.34 gives the existence of a C*-algebra isomorphism
C(σ(A)) ∋ f = a^o ↦ f_H = a ∈ A (see Theorem 12.37 in the case
of a single self-adjoint operator).
Also recall that in Proposition 12.64 we derived the existence of the family
of finite complex-valued measures {µ_{v,w}} on σ(A) with
\[
\langle f_H v, w\rangle = \langle av, w\rangle = \int_{\sigma(A)} a^o \, d\mu_{v,w} = \int_{\sigma(A)} f \, d\mu_{v,w} \tag{12.18}
\]
f = a^o ∈ C(σ(A)).
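We claim this implies that µ_{v,w} = \overline{µ_{w,v}}. To see this, let a ∈ A and notice
that
\[
\int \overline{a^o} \, d\mu_{v,w} = \langle a^* v, w\rangle = \langle v, aw\rangle = \overline{\langle aw, v\rangle} = \overline{\int a^o \, d\mu_{w,v}} = \int \overline{a^o} \, d\overline{\mu_{w,v}},
\]
for any v, w ∈ H, and since this holds for all f = a^o ∈ C(σ(A)) the claim
follows. Now we use essentially the same identity (in a slightly different order,
and with a different logic) to deduce that (f_H)* = (f̄)_H for f ∈ ℒ^∞(σ(A)). So
let f ∈ ℒ^∞(σ(A)). Then
\[
\langle (f_H)^* v, w\rangle = \langle v, f_H w\rangle = \overline{\langle f_H w, v\rangle}
= \overline{\int f \, d\mu_{w,v}} = \int \overline{f} \, d\mu_{v,w} = \langle (\overline{f})_H v, w\rangle
\]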
As this holds for all f_1 ∈ C(σ(A)) we obtain (12.20) for f_2 ∈ C(σ(A)) and
for all v, w ∈ H. Using µ_{v,w} = \overline{µ_{w,v}} this also shows that
Proof. We first prove (FC5). Suppose that S ∈ B(H) commutes with all
elements a ∈ A. We extend this again to fH for f ∈ L ∞ (σ(A)) using the
fH v = fH ◦ PV (v) = PV ◦ fH (v) ∈ V
first for f ∈ C[z] and then for all f ∈ C(σ(Mg )) by the properties of the
functional calculus in Theorem 12.37. Therefore,
\[
\langle f(M_g)v, w\rangle = \int_{\sigma(M_g)} f \, d\mu_{v,w} = \int_X (f \circ g)\, v\, \overline{w} \, d\mu = \langle M_{f\circ g} v, w\rangle
\]
for all v, w ∈ L²_µ(X) and f ∈ ℒ^∞(σ(M_g)), which proves (FC4).
For the proof of (FC6) we assume first that H is cyclic. By Theorem 12.55
there is a finite measure space (X, µ), some bounded measurable g : X → C,
and a unitary isomorphism φ : H → L2µ (X) such that
φ ◦ T = Mg ◦ φ.
Let h ∈ L ∞ (f (σ(T ))) and apply (FC4) twice more to see that
Exercise 12.69. Suppose that H is a separable complex Hilbert space and that
A ⊆ A′ ⊆ B(H)
(d) Show that the two notions of measurable calculus are compatible in the sense that
any f ∈ ℒ^∞(σ(A)) defines some f′ ∈ ℒ^∞(σ(A′)) by f′ = f ∘ π which satisfies f_H = f′_H.
Exercise 12.70. Generalize the results of Section 9.1.3 to the context of a single normal
operator T ∈ B(H) or a separable commutative unital C*-sub-algebra of B(H).
Exercise 12.71. In the notation of Theorem 12.34, fix a normal operator T, suppose it
has a description as a multiplication operator M_g on some measure space L²_µ(X), and
let ν = g_*µ be the push-forward measure on C. Show that f(T) is now well-defined
with f ∈ L^∞_ν(σ(T)) by proving that if f_1, f_2 ∈ ℒ^∞(σ(T)) agree ν-almost everywhere,
then f_1(T) = f_2(T).
for all v ∈ H and f ∈ C(σ(T)), where the series are well-defined because P_λ
is 0 for λ ∉ σ(T).
To generalize this, it is natural to expect that one must replace the summa-
tions with appropriate integrals. Thus some form of integration for functions
taking values in B(H) is needed. Moreover, ker(T − λI) may be zero for all λ,
and so the projections must be generalized. We start by considering these
two questions abstractly.
ℬ → P(H)
B ↦ Π_B
then
\[
\Pi_B = \sum_{n\ge1} \Pi_{B_n} \tag{12.21}
\]
where the series converges in the strong operator topology (see Section 8.3).
We note that (12.21) simply means that Π_B(v) = ∑_{n≥1} Π_{B_n}(v) for v ∈ H
(see Exercise 8.57). In the study of a single normal operator T on H we
will set X = σ(T ) ⊆ C, and more generally X = σ(A) in the study of a
commutative separable unital C*-sub-algebra A ⊆ B(H). Also notice that
Definition 12.73 resembles in some ways the definition of a (finite) Borel
measure on X. The discussion below will reveal further parallels to Lebesgue
integration.
Lemma 12.74. Let H be a complex Hilbert space and Π a projection-valued
measure on H defined on the σ-algebra ℬ of Borel subsets of a compact metric
space X. If X = ⨆_{j=1}^n B_j is a disjoint decomposition of X into measurable
subsets B_1, . . . , B_n ∈ ℬ, then H = ⨁_{j=1}^n im Π_{B_j} is an orthogonal direct sum
of the closed subspaces im Π_{B_1}, . . . , im Π_{B_n} ⊆ H.
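The lemma is transparent for a diagonal operator on a finite-dimensional space, where Π_B is the coordinate projection onto the eigenvectors whose eigenvalues lie in B. The Python sketch below sets up this model and splits a vector according to a partition of the real line into three pieces. The diagonal model, the particular partition, and the helper names are assumptions of this illustration.

```python
def spectral_projection(eigenvalues, in_set):
    """Diagonal 0/1 entries of Pi_B for T = diag(eigenvalues); in_set tests lam in B."""
    return [1.0 if in_set(lam) else 0.0 for lam in eigenvalues]


def apply(diagonal, v):
    """Apply a diagonal operator to a vector."""
    return [d * x for d, x in zip(diagonal, v)]


def inner(v, w):
    """Real inner product of two vectors."""
    return sum(x * y for x, y in zip(v, w))


eigenvalues = [-1.0, 0.0, 0.0, 2.0, 3.5]
# A disjoint decomposition of the real line into three measurable pieces.
parts = [lambda lam: lam < 0, lambda lam: 0 <= lam < 1, lambda lam: lam >= 1]
projections = [spectral_projection(eigenvalues, part) for part in parts]

v = [1.0, 2.0, -3.0, 4.0, 0.5]
pieces = [apply(pi, v) for pi in projections]
```

The three pieces add up to v and are mutually orthogonal, so ‖v‖² is the sum of the squared norms of the pieces, which is the orthogonal direct sum decomposition in this model.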
Hj ⊥ Hk
Bj ∩ Bk = ∅
we have Π_{B_j∪B_k} = Π_{B_j} + Π_{B_k} by the properties of Π. Applying this operator
to w we obtain Π_{B_j∪B_k} w = w + Π_{B_k} w, and taking the inner product with w
gives
\[
\|\Pi_{B_j\cup B_k} w\|^2 = \langle \Pi_{B_j\cup B_k} w, w\rangle = \|w\|^2 + \langle \Pi_{B_k} w, w\rangle = \|w\|^2 + \|\Pi_{B_k} w\|^2 \ge \|w\|^2.
\]
for any B1 , B2 ∈ B.
which can be constructed as the uniform limit of the following simple approximation.
For any ε > 0 and measurable partition ξ = {P_1, . . . , P_m} of B^C_{‖f‖_∞}
with diam P_j ≤ ε and a choice of sample points λ_j ∈ P_j for j = 1, . . . , m we
define the simple function
\[
f_\xi = \sum_{j=1}^m \lambda_j 1_{f^{-1}(P_j)} \tag{12.22}
\]
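The approximation in (12.22) is straightforward to imitate numerically. The Python sketch below uses a variant of the construction: instead of partitioning the ball, it partitions the whole plane into half-open squares of diameter ε (only finitely many of which meet the bounded range of f) and takes the lower-left corner of each square as its sample point. This choice of partition, the test function, and the function names are assumptions of the sketch.

```python
import cmath
import math


def simple_approximation(f, eps):
    """Return a function taking finitely many values on bounded ranges and
    differing from f by at most eps everywhere."""
    side = eps / math.sqrt(2.0)   # a square with this side length has diameter eps

    def f_xi(x):
        z = f(x)
        # sample point: lower-left corner of the half-open square containing f(x)
        return complex(math.floor(z.real / side) * side,
                       math.floor(z.imag / side) * side)

    return f_xi


def f(x):
    """A bounded test function with values on the unit circle."""
    return cmath.exp(1j * x)


approx = simple_approximation(f, 0.1)
xs = [0.01 * j for j in range(700)]
errors = [abs(f(x) - approx(x)) for x in xs]
```

The error never exceeds ε because f(x) and its sample point lie in the same square, and refining the partition gives the uniform convergence used in the construction.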
Π_B = (1_B)_H
(f_{ξ_n})_H → f_H
φ ◦ πg = Mg ◦ φ (12.25)
The proof of the corollary consists largely of assembling the evidence that
we have already proved it.
Proof of Corollary 12.81. By Proposition 11.38, L1 (G) is a separable
commutative Banach algebra. Applying Exercise 11.1 we obtain the separable
commutative unital Banach algebra L1 (G) ⊕ C, whose elements we will write
as f + λI where I denotes the multiplicative unit of the algebra.
Using Exercise 3.86 we define the bounded operator
By Proposition 3.91 and Exercise 3.92 it follows that the closure A of the
image ı(L1 (G) ⊕ C) is a separable commutative unital sub-algebra of B(H).
By Exercise 12.80, we see that A is also a C*-sub-algebra. Applying
Theorem 12.60 we find a unitary isomorphism
φ : H → L²_µ(X)
that assumption after one has understood the argument below, or by putting
the various cyclic subspaces back together as we have done many times before.
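Since $\imath : L^1(G)\oplus\mathbb{C} \to A$ has dense image, a linear functional on $A$ is uniquely determined by its restriction to the image $\imath\bigl(L^1(G)\oplus\mathbb{C}\bigr)$. Equivalently, the dual map
\[ \imath^* : A^* \longrightarrow \bigl(L^1(G)\oplus\mathbb{C}\bigr)^* \]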
is injective. By Exercise 8.9(b) (see also the hint on p. 574), ı∗ is continuous
with respect to the weak* topology. By Theorem 11.23, X = σ(A) is compact,
and hence the restriction of ı∗ to X is a homeomorphism to
Next we define the measure $\mu' = (\imath^*)_*\mu$, which also gives us the identification $L^2_\mu(X) = L^2_{\mu'}(X')$. Fix some $f \in L^1(G)$, then we have $\phi\circ(f *_\pi) = M_{a^o}\circ\phi$ with $a = \imath(f) = f *_\pi$. We use $\imath^*$ to identify $X$ with the subset $\imath^*(X) \subseteq X'$, and claim that in this sense the function $f^o$ extends the function $a^o$. Indeed if $\chi \in X = \sigma(A)$, then
On the other hand, (1B ) (0) = 0 and hence M1oB 1{0} = 0. Recalling the
o
φ : H → L2µ (X)
we obtain
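\[ \phi\circ(\psi_k^{g_0} *_\pi) = M_{(\psi_k^{g_0})^{\vee}}\circ\phi \tag{12.26} \]
\[ \bigl|(\psi_k^{g_0})^{\vee}(\chi)\bigr| = \Bigl|\int_G\psi_k^{g_0}\chi\,dm\Bigr| \le \|\psi_k^{g_0}\|_1 = 1, \]
for all $\chi\in\widehat G$, so that $\|(\psi_k^{g_0})^{\vee}\|_\infty\le1$, that
\[ \lim_{k\to\infty}(\psi_k^{g_0})^{\vee}(\chi) = \lim_{k\to\infty}\frac1{m(B_k)}\int_{B_k+g_0}\chi\,dm = \chi(g_0), \]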
as claimed.
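Since $\phi$ is a unitary isomorphism, (12.26) shows that $(\psi_k^{g_0} *_\pi v)$ converges in $H$. In order to identify the limit $\tilde v\in H$, fix some $w\in H$ and use the definition of $\psi_k^{g_0} *_\pi$ to see that
\begin{align*}
\langle\tilde v,w\rangle = \lim_{k\to\infty}\langle\psi_k^{g_0} *_\pi v,w\rangle &= \lim_{k\to\infty}\int_G\psi_k^{g_0}(h)\langle\pi_hv,w\rangle\,dm(h)\\
&= \lim_{k\to\infty}\frac1{m(B_k)}\int_{B_k+g_0}\langle\pi_hv,w\rangle\,dm(h) = \langle\pi_{g_0}v,w\rangle
\end{align*}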
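\[ \phi\circ(f *_\pi) = M_{\check f}\circ\phi \]
for any $f\in L^1(G)$. This follows from the definition of $f *_\pi$ and Fubini’s theorem. Indeed, using (12.25) in the form $\phi(\pi_gu)(\chi,n) = \chi(g)\phi(u)(\chi,n)$ for all $(\chi,n)\in\widehat G\times\mathbb N$ we obtain
\begin{align*}
\langle\phi(f *_\pi u),f_1\rangle = \langle f *_\pi u,\phi^*f_1\rangle &= \int_G f(g)\langle\pi_gu,\phi^*f_1\rangle\,dm(g)\\
&= \int_G\int_X\phi(\pi_gu)(\chi,n)\overline{f_1(\chi,n)}f(g)\,d\mu(\chi,n)\,dm(g)\\
&= \int_X\int_G f(g)\chi(g)\,dm(g)\,\phi(u)(\chi,n)\overline{f_1(\chi,n)}\,d\mu(\chi,n)\\
&= \bigl\langle M_{\check f}(\phi(u)),f_1\bigr\rangle
\end{align*}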
Using the spectral theorem for unitary representations as in the last corollary,
we turn to the question of whether the Pontryagin dual group is sufficiently
rich to separate points, as claimed on p. 92.
is not the identity map. Therefore, if we apply Corollary 12.81 to the regular representation $\lambda$ of $G$ on $L^2(G)$ we see that there exists some $\chi\in\widehat G$ with $\chi(g_0)\neq1$. Applying this to $g_0 = g - h$ for some $g,h\in G$ with $g\neq h$, we obtain completeness of characters, as claimed.
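For a finite cyclic group the completeness of characters can be verified directly. The sketch below is only an illustration under our own assumptions: we take G = Z/N, use the characters χ_t(g) = exp(2πi tg/N), and the helper names `character` and `separating_character` are hypothetical.

```python
import cmath

def character(N, t):
    """The character chi_t of Z/N given by g -> exp(2*pi*i*t*g/N)."""
    return lambda g: cmath.exp(2j * cmath.pi * t * g / N)

def separating_character(N, g, h):
    """Return some t with chi_t(g - h) != 1, or None if no such t exists."""
    g0 = (g - h) % N
    for t in range(N):
        if abs(character(N, t)(g0) - 1) > 1e-9:
            return t
    return None

# Every pair of distinct elements of Z/12 is separated by some character.
all_separated = all(
    separating_character(12, g, h) is not None
    for g in range(12) for h in range(12) if g != h
)
```

For g = h no character can separate, and `separating_character` returns `None` in that case.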
\[ \langle g,t\rangle = \chi_t(g)\in\mathbb{S}^1 \]
for the dual pairing of $g\in G$ and $t\in\widehat G$. For $t\in\widehat G$ write $M_t : L^2(G)\to L^2(G)$ for the multiplication operator defined by $M_t(f)(g) = \langle g,t\rangle f(g)$ for $g\in G$. Finally, we will write $\widehat\lambda$ for the regular representation of $\widehat G$ on (equivalence classes of) functions $f$ on $\widehat G$, so that
\[ \widehat\lambda_{t_0}(f)(t) = f(t-t_0) \]
for all $t,t_0\in\widehat G$.
12.8 Locally Compact Abelian Groups and Pontryagin Duality 479
We note that this generalizes Theorem 9.39 and Proposition 9.29, except
that we work here with the Fourier back transform. We split the proof of
the theorem into several steps. Our argument below may not be the most
direct approach, but will also help us to prove Pontryagin duality in the next
subsection. We will assume the hypotheses of Theorem 12.85 throughout.
for every $t\in\widehat G$ and $k\in\mathbb N$. Therefore, $(\psi_k*\widetilde{\psi_k})^{\vee} = |\check\psi_k|^2\ge0$ for every $k\in\mathbb N$. Setting $\psi = \sum_{k=1}^\infty c_k\,\psi_k*\widetilde{\psi_k}$ for some rapidly decaying positive sequence $(c_k)_k$ we obtain $\psi\in L^1(G)\cap L^2(G)$. By the above we have $\check\psi = \sum_{k=1}^\infty c_k|\check\psi_k|^2>0$ and the lemma follows.
\[ w(t,n) = \frac{\phi_0(\psi)(t,n)}{\check\psi(t)} \]
for all f ∈ V and µ0 -almost every (t, n) ∈ X. This represents the main step
towards the lemma, which we will obtain by modifying the unitary isomorph-
ism and the measure as follows.
Since φ0 : L2 (G) → L2µ0 (X) is an isomorphism and V = L1 (G) ∩ L2 (G) is
dense in $L^2(G)$, we see from (12.29) that $w(t,n)\neq0$ $\mu_0$-almost everywhere. Using this we define the σ-finite measure $\mu_1$ on $X$ by
\[ \frac{d\mu_1}{d\mu_0} = |w|^2, \]
for all f ∈ V and µ1 -almost every (t, n) ∈ X. Since any two multiplication
operators on X commute, the new unitary isomorphism still satisfies the
conclusions of the spectral theorems.
To summarize, $\phi_1 : L^2(G)\to L^2_{\mu_1}(X)$ is a unitary isomorphism satisfying (12.30). Finally, since $\phi_1(V)$ is dense in $L^2_{\mu_1}(X)$, this implies that every element of $L^2_{\mu_1}(X)$ can be expressed as a pointwise limit of a sequence in $\phi_1(V)$ and so has a representative that only depends on $t\in\widehat G$. Let $p : X\to\widehat G$ denote the projection to the first coordinate of $X = \widehat G\times\mathbb N$, and write $X = \bigcup_{n\ge1}X_n$ as a union of sets $X_n\subseteq X$ with finite measure. Then for every $n\ge1$ we may use the above observation for the function $1_{X_n}\in L^2_{\mu_1}(X)$ and see that there exists a measurable set $Y_n\subseteq\widehat G$ with $\mu_1(X_n\triangle(Y_n\times\mathbb N)) = 0$. It follows that
\[ \mu_1\Bigl(X\smallsetminus\bigcup_{n\ge1}Y_n\times\mathbb N\Bigr) = 0 \]
and so $p_*\mu_1$ is a σ-finite measure on $\widehat G$ with $L^2_{p_*\mu_1}(\widehat G) = L^2_{\mu_1}(X)$.
Simplifying the notation, we may assume that $\phi : L^2(G)\to L^2_\mu(\widehat G)$ is a unitary isomorphism, that $\mu$ is a σ-finite measure on $\widehat G$, and that in addition to the claims of the spectral theorem it also satisfies (12.27).
Proof. Throughout the proof we will use the function $\psi$ from Lemma 12.86. We first claim that $\mu$ is locally finite. Notice that $\check\psi\in C_0(\widehat G)$ by definition of the topology on $\widehat G$ in Propositions 11.33 and 11.38. Hence
\[ O_{t_0} = \bigl\{t\in\widehat G \bigm| |\check\psi|^2(t) > \tfrac12|\check\psi|^2(t_0)\bigr\} \]
for all $t\in\widehat G$.
Below we combine (12.31) for f = ψ ∈ L1 (G) with a similar claim for the
spectral measure of ψ ∈ L2 (G) and will obtain the lemma from this. By the
assumptions in the lemma we can define the spectral measures $\mu_F$ on $\widehat G$ for the algebra $L^1(G)$ acting on functions $F\in L^2(G)$ by $d\mu_F = |\phi(F)|^2\,d\mu$ since
\[ \langle f *_\lambda F,F\rangle_{L^2(G)} = \int_{\widehat G}\check f\,|\phi(F)|^2\,d\mu \]
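for all $t_0\in\widehat G$, where we use the translation defined by $T_{t_0}(t) = t - t_0$ for all $t\in\widehat G$ and the push-forward of the measure (as defined on p. 265). Indeed, for $f\in L^1(G)$ we have (by our definitions)
\begin{align*}
\int_{\widehat G}\check f\,d\mu_{\chi_{t_0}\psi} &= \langle f *_\lambda(\chi_{t_0}\psi),\chi_{t_0}\psi\rangle_{L^2(G)}\\
&= \int_G f(g)\langle\lambda_g(\chi_{t_0}\psi),\chi_{t_0}\psi\rangle_{L^2(G)}\,dm(g)\\
&= \int_G f(g)\chi_{t_0}(-g)\underbrace{\langle\chi_{t_0}\lambda_g\psi,\chi_{t_0}\psi\rangle_{L^2(G)}}_{=\langle\lambda_g\psi,\psi\rangle}\,dm(g)\\
&= \langle(\chi_{-t_0}f) *_\lambda\psi,\psi\rangle_{L^2(G)} = \int_{\widehat G}(\chi_{-t_0}f)^{\vee}\,d\mu_\psi = \int_{\widehat G}\check f\,d\bigl((T_{t_0})_*\mu_\psi\bigr),
\end{align*}
since $(\chi_{-t_0}f)^{\vee}(t) = \widehat\lambda_{t_0}\check f(t) = \check f(t-t_0) = \check f\circ T_{t_0}(t)$ for all $t\in\widehat G$ by (12.31).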
This proves the claim (12.32) by the uniqueness properties of the spectral
measures (which follow from Exercise 12.88) as f ∈ L1 (G) was arbitrary.
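We now combine (12.31), (12.32), and the assumption that $\phi(f) = \check f$ for any $f\in V$. For some test function $F\in C_c(\widehat G)$ we then have
\begin{align*}
\int_{\widehat G}F|\check\psi|^2\,d\mu = \int_{\widehat G}F\,d\mu_\psi &= \int_{\widehat G}F\circ T_{-t_0}\,d(T_{t_0})_*\mu_\psi\\
&= \int_{\widehat G}\widehat\lambda_{-t_0}F\,d\mu_{\chi_{t_0}\psi} = \int_{\widehat G}\widehat\lambda_{-t_0}F\,\bigl|(\chi_{t_0}\psi)^{\vee}\bigr|^2\,d\mu\\
&= \int_{\widehat G}\widehat\lambda_{-t_0}F\,\widehat\lambda_{-t_0}|\check\psi|^2\,d\mu = \int_{\widehat G}\widehat\lambda_{-t_0}\bigl(F|\check\psi|^2\bigr)\,d\mu.
\end{align*}
Replacing $F$ by $F|\check\psi|^{-2}\in C_c(\widehat G)$ we also obtain
\[ \int_{\widehat G}F\,d\mu = \int_{\widehat G}\widehat\lambda_{-t_0}F\,d\mu \]
for any $t_0\in\widehat G$ and $F\in C_c(\widehat G)$. By the uniqueness property of the measure in the Riesz representation theorem (Theorem 7.54) we deduce that $\mu$ is invariant under translation.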
Finally, note that µ(O) > 0 for any non-empty open subset, since otherwise
every compact subset could be covered by finitely many translates of O and
hence would have measure 0. Since $\mu\neq0$ we deduce that $\mu$ is a Haar measure.
is the Haar measure on $\widehat G$ extends easily to all of $L^2(G)$. This concludes the proof of the theorem.
We are now ready to establish a complete symmetry between $G$ and its dual group $\widehat G$. Using Proposition 11.43 we can define the dual group $\widehat{\widehat G}$ of the dual group $\widehat G$ of $G$ and are led to the question of reflexivity of locally compact abelian groups. Fortunately, the situation here is much better than that for Banach spaces in Chapter 7, as the next result shows. Let us prepare for it with the following exercise.
\[ \phi\circ M_t = \widehat\lambda_{-t}\circ\phi \]
for all $t\in\widehat G$. We will now read this formula backwards and derive the corollary from it. For this we define a unitary isomorphism
\[ U : L^2(\widehat G)\to L^2(\widehat G),\qquad U(f)(t') = f(-t'), \]
and set
\[ \psi = \phi^{-1}\circ U : L^2(\widehat G)\to L^2(G), \]
which is also a unitary isomorphism. Now notice that $U\circ\widehat\lambda_t = \widehat\lambda_{-t}\circ U$ since
\[ U(\widehat\lambda_t(f))(t') = \widehat\lambda_t(f)(-t') = f(-t'-t) \]
and
\[ \widehat\lambda_{-t}(U(f))(t') = U(f)(t'+t) = f(-t'-t) \]
for all $f\in L^2(\widehat G)$ and $t,t'\in\widehat G$. Since $\phi^{-1}\circ\widehat\lambda_{-t} = M_t\circ\phi^{-1}$ it follows that
\[ \psi\circ\widehat\lambda_t = \phi^{-1}\circ U\circ\widehat\lambda_t = \phi^{-1}\circ\widehat\lambda_{-t}\circ U = M_t\circ\phi^{-1}\circ U = M_t\circ\psi \]
Exercise 12.93. Let H < G be a closed subgroup, and define the annihilator group
(a) Show that $\widehat{G/H}\cong H^\perp$.
(b) Using the canonical isomorphism between $G$ and $\widehat{\widehat G}$ we can also define the double annihilator $(H^\perp)^\perp$ as a subgroup of $G$. Show that $(H^\perp)^\perp = H$.
(c) Deduce from this that $\widehat H\cong\widehat G/H^\perp$.
Exercise 12.96. Let (Gn ) be a sequence of compact groups and suppose in addition that
there is a surjective continuous homomorphism φn : Gn+1 → Gn for each n > 1. The
projective limit of the system (Gn , φn ) is defined by
\[ \varprojlim(G_n,\phi_n) = \Bigl\{(g_n)\in\prod_{n\ge1}G_n \Bigm| \phi_n(g_{n+1}) = g_n \text{ for all } n\ge1\Bigr\}. \]
Show that this is again a compact metric abelian group (with the topology inherited from the product topology), and that
\[ \widehat{\varprojlim(G_n,\phi_n)} = \bigcup_{n\ge1}\widehat{G_n}, \]
where we use the injective continuous homomorphism $\widehat{\phi_n} : \widehat{G_n}\to\widehat{G_{n+1}}$ to identify the group $\widehat{G_n}$ with a subgroup of $\widehat{G_{n+1}}$; this direct limit is also written $\varinjlim(\widehat{G_n},\widehat{\phi_n})$.
Exercise 12.97. Formulate and prove the dual statements to Exercise 12.95–12.96 (start-
ing with direct sums, respectively direct limits).
The example above as well as the following ones and Exercise 4.29 show
that unbounded self-adjoint operators cannot reasonably be required to be
defined on the whole Hilbert space. In contrast to Definition 4.27, we will in
this chapter always assume that X = H and Y = H′ are complex Hilbert
spaces, and that the domain DT ⊆ H is dense.
(DT , T ) : H −→ H′ .
Of course bounded operators between two Hilbert spaces are special cases
of this definition and in this case we will keep using the notation B : H → H′ .
We note that the inverse and composition of operators will be understood
here as in set theory: If
(DT , T ) : H → H′
is injective, then
(DT −1 , T −1 ) : H′ → H
is simply the inverse map, and it is densely defined if DT −1 = T (DT ) is dense
in H′ . If (DT , T ) : H → H′ and (DS , S) : H′ → H′′ are densely defined
operators, then
DST = {v ∈ DT | T v ∈ DS }
is a subspace and ST : DST → H′′ is linear, but in general it is not clear
whether this defines a densely defined operator.
We will prove this lemma together with Lemma 13.8, but only after we
have seen a few more examples. We note again that in the lemma above
and in the following definition equality of operators entails equality of their
domains.
Essential Exercise 13.5. (a) Check that Example 13.1 indeed defines a self-
adjoint operator in the sense of Definition 13.4.
(b) When does a complex-valued measurable function on a σ-finite meas-
ure space (X, B, µ) define a densely defined, closable, closed, self-adjoint, or
bounded multiplication operator?
13.1 Examples and Definitions 489
Example 13.6. Let $\frac{d}{dx} : C_c^\infty(\mathbb R)\to H = L^2(\mathbb R)$ be the differentiation operator, and define an operator $T$ by
\[ \operatorname{Graph}(T) = \overline{\operatorname{Graph}\bigl(\tfrac{d}{dx}\bigr)}. \]
By Definitions 5.7, 5.14, and the properties of the weak derivative (Lemma 5.10
applied with d = k = 1) this indeed defines a map
\[ \widehat{\tfrac{d}{dx}f}(t) = 2\pi it\,\widehat f(t) \]
\[ \widehat{D_T} = \widehat{H_0^1(\mathbb R)} = D_{M_g} = \{f\in L^2(\mathbb R)\mid\|tf(t)\|_2<\infty\} \]
\[ \frac{d}{dx} : C_c^\infty((0,1))\longrightarrow L^2((0,1)). \]
(a) Recall that $T_0 : H_0^1((0,1))\to L^2((0,1))$ sending $f$ to $\partial_1f$ extends the operator $\frac{d}{dx}$ to a closed operator $(D_{T_0},T_0) : L^2((0,1))\to L^2((0,1))$.
(b) Recall that $T_p : H^1(\mathbb T)\to L^2(\mathbb T)$ sending $f$ to $\partial_1f$ also extends the operator $\frac{d}{dx}$ to a closed operator $(D_{T_p},T_p) : L^2((0,1))\to L^2((0,1))$.
490 13 Self-Adjoint and Symmetric Operators
(DT , T ) : H → H′
be a closed densely defined operator between two complex Hilbert spaces. The
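orthogonal complement of the closed set
\[ \operatorname{Graph}(T)\subseteq H\times H' \]
is given by $\widetilde{\operatorname{Graph}(T^*)}$, where
\[ \widetilde{\ \cdot\ } : H'\times H\longrightarrow H\times H',\qquad(w,v)\longmapsto(v,-w). \]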
DT ∋ v 7→ hT v, wiH′
for all v ∈ DT . It is easy to check that DT ∗ is a linear subspace and that this
defines the linear operator T ∗ : DT ∗ → H.
For the proof of Lemma 13.3 we wish to show next that $T^*$ is closed. For this it is useful to first prove that
\[ \operatorname{Graph}(T)^\perp = \widetilde{\operatorname{Graph}(T^*)}, \tag{13.2} \]
which in particular will imply Lemma 13.8. Let $w\in D_{T^*}$ so that (13.1) holds for all $v\in D_T$. By definition,
\[ (T^*w,-w)\in\widetilde{\operatorname{Graph}(T^*)} \]
and
\[ \langle(v,Tv),(T^*w,-w)\rangle_{H\times H'} = \langle v,T^*w\rangle_H - \langle Tv,w\rangle_{H'} = 0 \]
for all $v\in D_T$. On the other hand, if $(v',-w)\in\operatorname{Graph}(T)^\perp$ so that
\[ (v',-w) = (T^*w,-w)\in\widetilde{\operatorname{Graph}(T^*)}. \]
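\[ \langle(0,w_0),(T^*w,-w)\rangle_{H\times H'} = 0 \]
for all $w\in D_{T^*}$, or equivalently $(0,w_0)\in\bigl(\widetilde{\operatorname{Graph}(T^*)}\bigr)^\perp$. By (13.2) and the characterization of the closed linear hull in Corollary 3.26 this is in turn equivalent to
\[ (0,w_0)\in\overline{\operatorname{Graph}(T)}. \]
If now $T$ is closable, then $(0,w_0)\in\overline{\operatorname{Graph}(T)}$ implies that $w_0 = 0$ and so $(D_{T^*})^\perp = \{0\}$ and thus $\overline{D_{T^*}} = H'$. On the other hand, if $T$ is not closable, then there exists a non-zero vector $(0,w_0)\in\overline{\operatorname{Graph}(T)}$ and so the element $w_0\in(D_{T^*})^\perp$ shows that $T^*$ is not densely defined.
For the final remark of Lemma 13.3 we apply (13.2) to $T$ and to $T^*$ (and also note that the operator $\widetilde{\ \cdot\ }$ is unitary and $\widetilde{\widetilde{(v,w)}} = -(v,w)$ for all $(v,w)\in H\times H'$) to see that
\begin{align*}
\overline{\operatorname{Graph}(T)} = \operatorname{Graph}(T)^{\perp\perp} &= \bigl(\widetilde{\operatorname{Graph}(T^*)}\bigr)^\perp\\
&= \widetilde{\operatorname{Graph}(T^*)^\perp} = \widetilde{\widetilde{\operatorname{Graph}(T^{**})}} = \operatorname{Graph}(T^{**}),
\end{align*}
as claimed.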
for some finite measure space (X, µ) and measurable function g : X → [0, ∞).
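[Figure: the subspaces $\operatorname{Graph}(T)$ and $\operatorname{Graph}(T)^\perp = \widetilde{\operatorname{Graph}(T^*)}$ of $H\times H'$, with the points $(v,0)$, $(w,0)\in H$ and $(w,Tw)\in\operatorname{Graph}(T)$ marked.]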
B = PH ◦ PGraph ◦ ıH ,
so that
\[ B^* = \imath_H^*\circ P_{\mathrm{Graph}}^*\circ P_H^* = P_H\circ P_{\mathrm{Graph}}\circ\imath_H = B \]
is self-adjoint. Moreover,
Also, by definition,
B = (I + T ∗ T )−1 , (13.5)
\[ (w,Tw)-(v,0)\in\operatorname{Graph}(T)^\perp = \widetilde{\operatorname{Graph}(T^*)} \]
so $w\in D_{T^*T}$ and $w - v = -T^*Tw$, or equivalently
(I + T ∗ T )Bv = w + T ∗ T w = v.
(w, T w) ∈ Graph(T ),
\[ (T^*Tw,-Tw)\in\widetilde{\operatorname{Graph}(T^*)}, \]
and
(v, 0) = (w, T w) + (T ∗ T w, −T w),
which implies that w = Bv = B(I + T ∗ T )w, as claimed.
Now apply Theorem 12.55 to $B$ to find a finite measure space $(X,\mu)$ and some bounded measurable function $h\in L^\infty_\mu(X)$ so that $B$ and $M_h$ are unitarily isomorphic. Since $B$ is injective and satisfies (13.3)–(13.4), $h$ takes values in $(0,1]$ $\mu$-almost everywhere. After modifying $h$ on a null set we may therefore assume that $h$ takes values in $(0,1]$ everywhere.
Using the same isomorphism $\phi$ we claim that $T^*T$ is isomorphic to $M_g$ for $g = \frac1h - 1$. Indeed,
\[ \phi(D_{T^*T}) = \phi(\operatorname{im}(B)) = \operatorname{im}(M_h) = \bigl\{f\in L^2_\mu(X)\bigm|\tfrac1hf\in L^2_\mu(X)\bigr\} = D_{M_g}, \]
† The alert reader may at this point feel a sense of déjà vu (cf. Exercise 13.12).
for all such w. Applying Exercise 13.5(a) or 13.10 gives the theorem.
(these are the Dirichlet boundary conditions), and that its eigenfunctions are (scalar multiples of) the functions $x\mapsto\sin(\pi nx)$ for $n\in\mathbb N$.
(b) Let $(D_T,T) = \bigl(H^1((0,1)),\frac{d}{dx}\bigr)$. Show that $T^*T$ coincides with the negative of the second weak derivative on
(which are the Neumann boundary conditions), and that its eigenfunctions are the functions $x\mapsto\cos(\pi nx)$ for $n\in\mathbb N_0$.
(c) Let $(D_{T_p},T_p) = \bigl(H^1(\mathbb T),\frac{d}{dx}\bigr)$. Show that $T_p^*T_p$ coincides with $-\Delta$ on $H^2(\mathbb T)$ (which corresponds to the periodic boundary conditions).
(d) Show that T0∗ T0 , T ∗ T , and Tp∗ Tp are all different and no one extends any other.
Exercise 13.12. Compare the general construction of this section to the arguments of
Section 6.4.2.
Exercise 13.13. Let $G = (V,E)$ be an undirected simple graph as in Section 10.4 (but possibly infinite) such that any $v\in V$ has finitely many neighbours, and let $\vec E$ be the set of oriented edges as in Section 12.2. Let $H = L^2(V)$ and $H' = L^2(\vec E)$, where we simply use the counting measure on the vertices in $V$ and the edges in $\vec E$. Now define
Exercise 13.14. Let G = (V, E) be a finite undirected graph as in Section 10.4, but now
glue for every edge e ∈ E connecting two vertices v1 , v2 ∈ V a compact line segment Se of
length ℓe > 0 between v1 and v2 . We assume that the graph is undirected and we put for
any two vertices at most one line segment linking them directly. This defines a topological
space Q, called a metric graph, consisting of a network of compact line segments (one for
each edge in the graph) that are glued together at the vertices of the graph. Endow Q with
the measure obtained from using the Lebesgue measure on each line segment Se , which in
particular leads to
13.3 Self-Adjoint Operators 495
\[ L^2(Q) = \sum_{e\in E}L^2(S_e). \]
Define H 1 (Q) to be the space of all continuous functions on Q such that the restriction
to the compact line segment Se ⊆ Q belongs to H 1 (Se ) for every edge e ∈ E. Define the
operator T : H 1 (Q) → L2 (Q) by setting (T f )Se = ∂ fe , where fe = f |Se and the weak
derivative is taken in H 1 (Se ) with respect to the fixed orientation on Se . Then
is a densely defined operator (check this). The study of the eigenfunctions of T ∗ T is called
the theory of quantum graphs.
(a) Describe the operators T ∗ and T ∗ T and their domains, especially in relationship to
the behaviour of the functions in the domain at the vertices.
(b) Show that there exists an orthonormal basis of L2 (Q) consisting of eigenfunctions
of T ∗ T .
(c) Assume now that G consists of four vertices with one vertex in the centre and three ver-
tices connected to it. Prove a version of Weyl’s law for the operator T ∗ T on the associated
quantum graph.
Using the construction from the last section we can also prove the spectral
theorem for general self-adjoint operators.
Theorem 13.15. Let (DT , T ) : H → H be a densely defined self-adjoint
operator. Then there exists a finite measure space (X, µ) and a real-valued
measurable function g : X → R such that (DT , T ) is unitarily isomorphic
to (DMg , Mg ), meaning that there is a unitary isomorphism φ : H → L2µ (X)
such that φ(DT ) = DMg and the diagram
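\[
\begin{array}{ccc}
H\supseteq D_T & \xrightarrow{\ T\ } & H\\
\downarrow{\scriptstyle\phi} & & \downarrow{\scriptstyle\phi}\\
L^2_\mu(X)\supseteq D_{M_g} & \xrightarrow{\ M_g\ } & L^2_\mu(X)
\end{array}
\]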
commutes.
Since a self-adjoint operator T as in Theorem 13.15 is also closed, it is clear
that we could directly apply the method of the previous section to T . Note,
however, that a simple application of Theorem 13.9 only gives a description
of T 2 , which does not allow a description of T . In fact, T 2 has a potentially
smaller domain, and may have lost some information about T (namely the
sign of eigenvalues or approximate eigenvalues). To compensate we will study
two operators: B as in the previous section, and A = T B, as in Figure 13.2.
Proof of Theorem 13.15. Let B = (I + T ∗ T )−1 = (I + T 2 )−1 be as in the
proof of Theorem 13.9. We also define
Fig. 13.2: For the proof of Theorem 13.15 we study the operators A and B. (The figure marks the points $(w,Tw)\in\operatorname{Graph}(T)$ and $(0,Tw) = (0,Av)$.)
A = T B = PH,2 ◦ PGraph ◦ ıH ,
BT ⊆ T B = A. (13.6)
(I + T 2 )v = w
(I + T 2 )T v = T (I + T 2 )v = T w,
BT w = T Bw
by (13.6). Since both AB and BA are defined on all of H this shows that A
and B commute.
Next we have to show that A and B together uniquely determine T (so
that when A and B are realized as multiplication operators we have some
hope of deducing a similar realization for T ). We claim that T = B −1 A and,
in particular,
DT = DB −1 A = {v ∈ H | Av ∈ im(B)}.
To see this, note that B −1 B = I since B is injective and hence
T = B −1 BT ⊆ B −1 T B = B −1 A
by (13.6). For the converse recall the construction of B in the proof of The-
orem 13.9 (see also Figures 13.2 and 13.3) and the definition of A. With these
we obtain
Since $T$ is self-adjoint, Lemma 13.8 shows that $\operatorname{Graph}(T)^\perp = \widetilde{\operatorname{Graph}(T)}$ so that we have equivalently
Taking the sum of (13.7) and (13.8) and using the identity (I + T 2 )B = I
gives
(v, B −1 Av) ∈ Graph(T ).
Thus v ∈ DT and T v = B −1 Av, as claimed (see Figure 13.3).
Now we apply Theorem 12.60 to A and B to obtain a finite measure
space (X, µ) and two functions gA : X → R and gB : X → (0, ∞) such
that A and B are conjugate to MgA and MgB , respectively. Since we have
shown that DT and T are purely defined in terms of A and B, we can finally
use the same unitary isomorphism φ to describe (DT , T ) as follows:
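\begin{align*}
\phi(D_T) &= \phi\bigl(\{v\in H\mid Av\in\operatorname{im}(B)\}\bigr)\\
&= \{f\in L^2_\mu(X)\mid M_{g_A}(f)\in\operatorname{im}(M_{g_B})\}\\
&= \Bigl\{f\in L^2_\mu(X)\Bigm|\tfrac{g_A}{g_B}f\in L^2_\mu(X)\Bigr\} = D_{M_g},
\end{align*}
where we set $g = \frac{g_A}{g_B}$, and also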
Fig. 13.3: As the proof of Theorem 13.15 shows, the two marked segments are translates of each other. (The figure marks $\operatorname{Graph}(T)$, $\operatorname{Graph}(T)^\perp$, and the points $(0,B^{-1}Av)$, $(Bv,0)$, and $(v,0)$.)
for all v ∈ DT .
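The role of the pair A, B can be seen in a toy model. The sketch below is our own illustration, not the argument of the proof: it assumes that T is already a (finite) diagonal multiplication operator with real eigenvalues t, so that B = (I + T²)⁻¹ and A = TB act by t ↦ 1/(1+t²) and t ↦ t/(1+t²). The helper name `resolvent_pair` is hypothetical.

```python
def resolvent_pair(eigenvalues):
    """Diagonal entries of A = T B and B = (I + T^2)^(-1) for diagonal T."""
    B = [1.0 / (1.0 + t * t) for t in eigenvalues]
    A = [t * b for t, b in zip(eigenvalues, B)]
    return A, B

eigenvalues = [-3.0, -0.5, 0.0, 2.0, 10.0]
A, B = resolvent_pair(eigenvalues)

# g = g_A / g_B recovers T, including the sign of each eigenvalue.
recovered = [a / b for a, b in zip(A, B)]

# B alone only sees t^2: the eigenvalues 3 and -3 give the same entry of B.
same_B_entry = resolvent_pair([3.0])[1][0] == resolvent_pair([-3.0])[1][0]
```

Both A and B are bounded here (entries of B lie in (0, 1], entries of A in [−1/2, 1/2]), while the quotient g may be unbounded, which mirrors why the proof works with A and B rather than with T directly.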
Exercise 13.16 (Schur’s lemma for densely defined closed operators). Assume that $\pi_1 : G\circlearrowright H_1$ and $\pi_2 : G\circlearrowright H_2$ are unitary representations of a topological group $G$ such that $\pi_1$ is irreducible. Moreover, assume that $(D_T,T) : H_1\to H_2$ is a densely defined closed operator satisfying $\pi_1(g)D_T\subseteq D_T$ and $T\pi_1(g) = \pi_2(g)T$ on $D_T$ for all $g\in G$. Show that $D_T = H_1$ and that $T$ is bounded, and deduce that the conclusions of Schur’s lemma (Exercise 12.58) hold in this setting.
Graph(ı) = Graph(ı0 ) ⊆ H × HS .
Su = ı∗ ıu,
We finish this chapter (and hence our discussion of spectral theory) with a
series of exercises concerning work going back to von Neumann on the ex-
istence of self-adjoint extensions of a general symmetric operator. The main
tool for this discussion is the Cayley transform, the definition of which may
at first be a little surprising. To motivate the definition we recall that the
spectrum of a self-adjoint bounded operator is a compact subset of R. Gener-
alizing this definition we suppose now that (DT , T ) : H → H is a self-adjoint
operator on a complex Hilbert space H and define its resolvent set by
Next note that the function $\phi(z) = \frac{z-i}{z+i}$ maps $\mathbb R$ bijectively onto the unit circle with the point 1 removed, suggesting a way to associate a unitary operator to a self-adjoint operator.
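The scalar map behind the Cayley transform is easy to test numerically. The following sketch uses hypothetical helper names `cayley` and `inverse_cayley`; the inverse formula z = i(1 + w)/(1 − w) is obtained by solving w = (z − i)/(z + i) for z.

```python
def cayley(z):
    """The map z -> (z - i) / (z + i)."""
    return (z - 1j) / (z + 1j)

def inverse_cayley(w):
    """The inverse map w -> i (1 + w) / (1 - w), defined for w != 1."""
    return 1j * (1 + w) / (1 - w)

samples = [-1000.0, -2.5, -1.0, 0.0, 0.3, 1.0, 7.0, 1000.0]
images = [cayley(x) for x in samples]

# Real numbers land on the unit circle, never on the point 1,
# and the inverse map recovers the original real number.
on_circle = all(abs(abs(w) - 1.0) < 1e-12 for w in images)
avoids_one = all(abs(w - 1.0) > 0.0 for w in images)
round_trip = all(abs(inverse_cayley(w) - x) < 1e-6 for w, x in zip(images, samples))
```

As |x| grows the image approaches the excluded point 1, which corresponds to the fact that 1 is not an eigenvalue of the unitary operator associated to a self-adjoint operator.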
Exercise 13.22. Let (DT , T ) : H → H be a self-adjoint operator on a complex Hilbert
space H.
(a) Show that T + iI is injective and im(T + iI) = H.
(b) Show that U (T v + iv) = T v − iv for v ∈ DT defines a unitary operator U : H → H.
US (Sv + iv) = Sv − iv
13.4 Symmetric Operators 501
SU (w − U w) = i(w + U w)
Essential Exercise 13.25. Show that the procedure above is indeed the
inverse to the Cayley transform by the following steps.
(a) Given a densely defined symmetric operator (DS , S) : H → H, show
that S = SUS .
(b) Given a partially defined isometry (DU , U ) : H → H for which I − U is
injective and im(I − U ) is dense, show that U = USU .
Exercise 13.27. Give an example of a densely defined symmetric operator that does not
have a self-adjoint extension.
(a) Show that S has a self-adjoint extension if and only if n+ (S) = n− (S).
(b) A symmetric operator is called essentially self-adjoint if it has a unique
self-adjoint extension. Show that S is essentially self-adjoint if and only
if n+ (S) = n− (S) = 0.
Exercise 13.29. Find an example of an essentially self-adjoint operator that is not self-
adjoint.
Exercise 13.30. Show that we have n+ (S) = n− (S) = 1 for the operator S = iT0 from
Exercise 13.7. Deduce that we can parameterise the self-adjoint extensions Sα of S (in a
natural manner) by elements α ∈ S1 .
Gauss observed in 1792–93 (at the age of 15 or 16) that the density of primes close to $x$ seemed to be approximately $\frac{1}{\log x}$, leading to the suggestion that the prime counting function
has the asymptotic growth rate $\frac{x}{\log x}$, and this statement is now called the
prime number theorem. After many partial and weaker results, Hadamard
and (independently) de la Vallée-Poussin extended work in which Riemann
introduced complex-analytic methods to give the first proofs of the prime
number theorem in 1896. We will not concern ourselves with the error rate
in this approximation; the best conjectured error rate is a reformulation of
the famous Riemann hypothesis.(35)
Theorem 14.1 (Prime number theorem). For the prime counting function $\pi(x) = \sum_{p\le x}1$ (where $p$ runs over the primes in $\mathbb N$) we have $\pi(x)\sim\frac{x}{\log x}$ as $x\to\infty$.
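The statement of the theorem can be explored numerically. The following is a small sketch using a sieve of Eratosthenes; the helper names `prime_sieve` and `prime_count` are ours, and the computation of course proves nothing about the limit.

```python
import math

def prime_sieve(limit):
    """Return a bytearray whose entry n is 1 exactly if n is prime (limit >= 1)."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            # Cross out the multiples p*p, p*p + p, ... of the prime p.
            is_prime[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return is_prime

def prime_count(x):
    """The prime counting function pi(x) for an integer x >= 1."""
    return sum(prime_sieve(x))

# Ratio pi(x) / (x / log x) for x = 10^3, 10^4, 10^5.
ratios = [prime_count(10 ** k) / (10 ** k / math.log(10 ** k)) for k in (3, 4, 5)]
```

In this range the ratios decrease slowly towards 1, which illustrates how slowly the asymptotic sets in.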
Proof. Suppose that (14.1) holds, and fix some small $\delta>0$. Then we also have
\[ \sum_{1\le n\le x^{1-\delta}}\Lambda(n) = x^{1-\delta} + o(x^{1-\delta}) = o_\delta(x) \]
The first sum on the left is the one we are interested in, so we wish to estimate the second sum. For this, notice that $n = p^k\in(x^{1-\delta},x]$ with $k>1$ implies that $k\le\frac{\log x}{\log p}\le\frac{\log x}{\log2}$ and $p\le x^{1/2}$. Therefore
\[ \sum_{\substack{x^{1-\delta}<p^k\le x\\ k>1}}\log p\le\frac{\log x}{\log2}\sum_{1\le n\le x^{1/2}}\log(x^{1/2}) = O(x^{1/2}\log^2x) = o(x), \]
The upper bound follows similarly using $\log p\ge(1-\delta)\log x$ for $p\in(x^{1-\delta},x]$. Since
\[ \sum_{p\le x^{1-\delta}}1\le x^{1-\delta} = o_\delta\Bigl(\frac{x}{\log x}\Bigr) \]
Exercise 14.3. Prove that Theorem 14.1 implies (14.1) in Lemma 14.2.
In the second reformulation below due to Tao we will start to see the
connection to functional analysis.
Proposition 14.4 (Second reformulation of PNT). Define the $\Lambda$-semi-norm $\|\cdot\|_\Lambda$ by
\[ \|f\|_\Lambda = \limsup_{h\to\infty}\Bigl|\sum_{n\ge1}\frac{\Lambda(n)}{n}f(\log n - h) - \int_{\mathbb R}f(t)\,dt\Bigr| \tag{14.2} \]
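\[ \|\lambda f\|_\Lambda = |\lambda|\,\|f\|_\Lambda \]
and
\[ \|f_1+f_2\|_\Lambda\le\|f_1\|_\Lambda + \|f_2\|_\Lambda \]
for any $f_1,f_2\in C_c(\mathbb R)$ and $\lambda\in\mathbb C$. In other words, $\|\cdot\|_\Lambda$ defines a semi-norm on $C_c(\mathbb R)$ once we have checked that it is well-defined in the sense that $\|f\|_\Lambda<\infty$ for every $f\in C_c(\mathbb R)$. We will prove this in the next section.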
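To get a feeling for the expression inside the limit superior in (14.2), one can evaluate it for a single moderate shift h. The sketch below is only a numerical illustration under our own choices: the test function is the tent function f(t) = max(0, 1 − |t|), which has integral 1, and the helper names `von_mangoldt_table`, `tent`, and `shifted_sum` are hypothetical. A single value of h says nothing rigorous about the limit superior.

```python
import math

def von_mangoldt_table(limit):
    """Return a list lam with lam[n] = Lambda(n) for 0 <= n <= limit."""
    lam = [0.0] * (limit + 1)
    composite = bytearray(limit + 1)
    for p in range(2, limit + 1):
        if not composite[p]:
            for m in range(p * p, limit + 1, p):
                composite[m] = 1
            q = p
            while q <= limit:          # Lambda(p^k) = log p for all k >= 1
                lam[q] = math.log(p)
                q *= p
    return lam

def tent(t):
    """A continuous function supported on [-1, 1] with integral 1."""
    return max(0.0, 1.0 - abs(t))

def shifted_sum(h, lam):
    """Sum of Lambda(n)/n * tent(log n - h); needs len(lam) > exp(h + 1)."""
    lo = max(1, math.ceil(math.exp(h - 1)))
    hi = min(len(lam) - 1, math.floor(math.exp(h + 1)))
    return sum(lam[n] / n * tent(math.log(n) - h) for n in range(lo, hi + 1))

lam = von_mangoldt_table(23000)        # exp(10) is roughly 22026
deviation = abs(shifted_sum(9.0, lam) - 1.0)
```

For this choice the deviation from the integral is already small, in line with what the prime number theorem predicts for large h.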
506 14 The Prime Number Theorem
or equivalently
\[ \sum_{n\ge1}\Lambda(n)g\Bigl(\frac nx\Bigr) = x\int_0^\infty g(u)\,du + o(x). \]
and
\[ \sum_{\frac12x<n\le x}\Lambda(n)\ge\sum_{n\ge1}\Lambda(n)g_-\Bigl(\frac nx\Bigr)\ge\bigl(\tfrac12-\delta\bigr)x + o_\delta(x). \]
We sum estimates of this form to get the desired claim. To handle the error term carefully we fix some $\varepsilon>0$ and suppose $M\ge1$ is such that the error term is bounded in absolute value by $\varepsilon x$ whenever $x\ge M$. Also note that for a fixed $\varepsilon$ both $M$ and $\sum_{n\le M}\Lambda(n)$ are $o_\varepsilon(x)$. Hence we may write
\[ \sum_{n\le x}\Lambda(n) = \sum_{\frac12x<n\le x}\Lambda(n) + \sum_{\frac14x<n\le\frac12x}\Lambda(n) + \cdots + \sum_{\frac1{2^{\ell+1}}x<n\le\frac1{2^\ell}x}\Lambda(n) + o_\varepsilon(x), \]
where $\ell\ge0$ is chosen maximally with $\frac1{2^\ell}x\ge M$. Applying the asymptotic in (14.3) to each sum we see that the main terms add to $x - \frac1{2^{\ell+1}}x = x + o_\varepsilon(x)$
14.2 The Selberg Symmetry Formula and Banach Algebra Norm 507
and the error terms add up to no more than 2εx by choice of M . As ε > 0
was arbitrary, we see that (14.1) in Lemma 14.2 follows.
Exercise 14.5. Show that PNT implies that the Λ-semi-norm vanishes on Cc (R).
This will be an important step towards the proof of PNT. Assuming that
the semi-norm is not identically zero will allow us to construct a Banach
algebra homomorphism from L1 (R) to the completion AΛ of Cc (R) with
respect† to the semi-norm k · kΛ , which will induce a dual homomorphism
from the space of characters AoΛ of AΛ into L1 (R)o ≅ R. This will eventually
lead to a contradiction.
For the proof of Theorem 14.6 we will need some elementary tools from
number theory: the Selberg symmetry formula and Mertens’ theorem.
where the sum is taken over all divisors d of n ∈ N (including both 1 and n
itself).
The special case of convolution with the constant function 1 is of particular interest as it simply corresponds to taking the sum over all divisors,
\[ (f *_D 1)(n) = \sum_{d\mid n}f(d). \]
† The reader may easily check that the formal mechanism of taking the completion with
respect to a semi-norm gives the same result as first forming the quotient with respect to
the kernel of the semi-norm and then taking the completion with respect to the norm.
‡ This is often just denoted $f_1*f_2$ as it is a multiplicative convolution on the semigroup $\mathbb N$. However, as we make more use of the additive convolution in this volume we reserve the unadorned $*$ for the latter.
\[ \sum_{d\mid n}\Lambda(d) = \sum_{\ell=0}^k\Lambda(p^\ell) = k\log p = \log n, \]
where p1 , . . . , pℓ denote distinct primes. Notice that the second case includes
the statement that µ(1) = (−1)0 = 1 as 1 is taken to be a product of no
primes.
Proposition 14.7 (Möbius inversion). Given functions $f,g : \mathbb N\to\mathbb R$ we have $g = f *_D 1$ if and only if $f = g *_D \mu$. Moreover $\delta_1 = 1 *_D \mu$, where
\[ \delta_1(n) = \begin{cases}1 & \text{for } n = 1,\\ 0 & \text{otherwise.}\end{cases} \]
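Möbius inversion is easy to test on small integers. The following sketch implements Dirichlet convolution naively by running over all divisors; the helper names `divisors`, `dirichlet`, and `mobius` are ours, and the choice f = log is only an example.

```python
import math

def divisors(n):
    """All positive divisors of n (including 1 and n)."""
    return [d for d in range(1, n + 1) if n % d == 0]

def dirichlet(f, g):
    """Dirichlet convolution: n -> sum over d | n of f(d) g(n/d)."""
    return lambda n: sum(f(d) * g(n // d) for d in divisors(n))

def mobius(n):
    """(-1)^l if n is a product of l distinct primes, and 0 otherwise."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divides the original n
                return 0
            result = -result
        p += 1
    if n > 1:                   # one remaining prime factor
        result = -result
    return result

one = lambda n: 1
delta1 = dirichlet(one, mobius)          # should be the identity delta_1
g = dirichlet(math.log, one)             # g = log convolved with 1
f_back = dirichlet(g, mobius)            # should recover log
```

Evaluating `delta1` and `f_back` on a range of integers reproduces δ₁ and the original function, as the proposition asserts.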
and
\[ \bigl((f_1 *_D f_2) *_D f_3\bigr)(n) = \sum_{n=def}f_1(d)f_2(e)f_3(f) = \bigl(f_1 *_D (f_2 *_D f_3)\bigr)(n) \]
for any three functions $f_1,f_2,f_3$ on $\mathbb N$ and all $n\in\mathbb N$. Also note that the function $\delta_1$ is an identity for Dirichlet convolution since
\[ (f *_D \delta_1)(n) = \sum_{n=de}f(d)\delta_1(e) = f(n). \]
\[ \sum_{d\mid n}\mu(d) = \mu(1) + \sum_{r=1}^{\ell}\sum_{j_1,\dots,j_r}\mu(p_{j_1}\cdots p_{j_r}) = \sum_{r=0}^{\ell}\binom{\ell}{r}(-1)^r = (1-1)^\ell = 0, \]
where the inner sum over $j_1,\dots,j_r$ runs over all different $r$-tuples of distinct indices within $\{1,\dots,\ell\}$.
Summarizing the above discussions for the von Mangoldt function we have
\[ \Lambda(n) = (\mu *_D \log)(n) = \begin{cases}\log p & \text{if } n = p^k \text{ for a prime } p \text{ and some } k\ge1;\\ 0 & \text{otherwise,}\end{cases} \]
\[ \Lambda_2 = \mu *_D \log^2. \]
Proof. Below we will use the fact that $\Lambda(n) = \log p$ for $n = p^k$ and $k\ge1$ and $\Lambda(n) = 0$ otherwise without explicit reference. Define $f$ to be the second expression in the lemma, that is
\[ f(n) = \Lambda(n)\log n + \sum_{d\mid n}\Lambda(d)\Lambda\Bigl(\frac nd\Bigr) \]
for $n\in\mathbb N$. We first claim that $f$ also equals the third expression (defined by the three cases). In fact, $f(1) = 0$ and if $n = p^k$ then
\[ f(n) = \log p\,\log p^k + \sum_{\ell=1}^{k-1}\log^2p = (2k-1)\log^2p. \]
If $n = p_1^{k_1}p_2^{k_2}$ for two different primes $p_1,p_2$ and $k_1,k_2\ge1$, then $\Lambda(n) = 0$ and
\[ f(n) = \Lambda(p_1^{k_1})\Lambda(p_2^{k_2}) + \Lambda(p_2^{k_2})\Lambda(p_1^{k_1}) = 2\log p_1\log p_2. \]
Finally, if $n$ has three or more distinct prime factors, then clearly $f(n) = 0$ since for $d\mid n$ either $d$ or $\frac nd$ must have at least two prime divisors.
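Now let $n = p_1^{k_1}\cdots p_\ell^{k_\ell}$ and calculate
\begin{align*}
(f *_D 1)(n) = \sum_{d\mid n}f(d) &= \sum_j\sum_{a=1}^{k_j}f(p_j^a) + \sum_{j_1<j_2}\sum_{a_1=1}^{k_{j_1}}\sum_{a_2=1}^{k_{j_2}}f(p_{j_1}^{a_1}p_{j_2}^{a_2})\\
&= \sum_j\sum_{a=1}^{k_j}(2a-1)\log^2p_j + 2\sum_{j_1<j_2}k_{j_1}k_{j_2}\log p_{j_1}\log p_{j_2}\\
&= \sum_jk_j^2\log^2p_j + 2\sum_{j_1<j_2}k_{j_1}k_{j_2}\log p_{j_1}\log p_{j_2}\\
&= \Bigl(\sum_jk_j\log p_j\Bigr)^2 = \log^2n,
\end{align*}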
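The computation above can be double-checked by machine for small n. The sketch below evaluates f(n) = Λ(n) log n + Σ_{d|n} Λ(d)Λ(n/d) directly and compares the divisor sum of f with log² n; the helper names `von_mangoldt`, `lambda2`, and `divisor_sum` are ours.

```python
import math

def von_mangoldt(n):
    """log p if n = p^k for a prime p and some k >= 1, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)          # n itself is prime

def lambda2(n):
    """Lambda(n) log n plus the Dirichlet convolution of Lambda with itself."""
    conv = sum(von_mangoldt(d) * von_mangoldt(n // d)
               for d in range(1, n + 1) if n % d == 0)
    return von_mangoldt(n) * math.log(n) + conv

def divisor_sum(f, n):
    """Sum of f(d) over all divisors d of n."""
    return sum(f(d) for d in range(1, n + 1) if n % d == 0)

max_defect = max(abs(divisor_sum(lambda2, n) - math.log(n) ** 2)
                 for n in range(1, 301))
```

The three cases of the lemma can be read off from sample values: `lambda2(8)` equals 5 log² 2, `lambda2(12)` equals 2 log 2 log 3, and `lambda2(30)` vanishes.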
primes it might be easier to study. Whatever the true rationale may be, it is
still surprising that the following argument relies only on elementary analysis
and Möbius inversion.
Proposition 14.9 (Selberg symmetry formula). We have
\[ \sum_{n\le x}\Lambda_2(n) = 2x\log x + O(x) \]
for $x\ge1$.
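The symmetry formula can be observed numerically. The sketch below is an illustration only: it uses the expression Λ₂(n) = Λ(n) log n + Σ_{d|n} Λ(d)Λ(n/d) from Lemma 14.8, sums the second term as a sum over ordered pairs of prime powers (q, r) with qr ≤ x, and the helper names `von_mangoldt_table`, `selberg_sum`, and `normalised_error` are hypothetical.

```python
import math

def von_mangoldt_table(limit):
    """Return a list lam with lam[n] = Lambda(n) for 0 <= n <= limit."""
    lam = [0.0] * (limit + 1)
    composite = bytearray(limit + 1)
    for p in range(2, limit + 1):
        if not composite[p]:
            for m in range(p * p, limit + 1, p):
                composite[m] = 1
            q = p
            while q <= limit:
                lam[q] = math.log(p)
                q *= p
    return lam

def selberg_sum(x):
    """Sum of Lambda_2(n) over all n <= x for an integer x >= 2."""
    lam = von_mangoldt_table(x)
    prime_powers = [n for n in range(2, x + 1) if lam[n] > 0.0]
    total = sum(lam[n] * math.log(n) for n in prime_powers)
    for q in prime_powers:                 # ordered pairs (q, r) with q*r <= x
        for r in prime_powers:
            if q * r > x:
                break
            total += lam[q] * lam[r]
    return total

def normalised_error(x):
    """(sum of Lambda_2(n) for n <= x, minus 2 x log x) divided by x."""
    return (selberg_sum(x) - 2 * x * math.log(x)) / x

errors = [normalised_error(10 ** k) for k in (3, 4, 5)]
```

The proposition asserts that these normalised errors stay bounded as x grows, and the computed values are consistent with that.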
Proof. We fix some x > 1. The proof will use Möbius inversion and the
following (elementary) asymptotic estimates, which we will prove below.
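\begin{align}
\sum_{m\le y}\frac1m\Bigl(\frac ym\Bigr)^{-1} = \frac{\lfloor y\rfloor}{y} &= 1 + O\Bigl(\frac1y\Bigr); \tag{14.6}\\
\sum_{m\le y}\frac1m &= \log y + c_1 + O\Bigl(\frac1y\Bigr); \tag{14.7}\\
\sum_{m\le y}\frac{\log(y/m)}{m} &= \frac12\log^2y + c_1\log y + c_2 + O\Bigl(\frac{\log(1+y)}{y}\Bigr); \tag{14.8}\\
\frac1y\sum_{m\le y}\log^2m &= \log^2y - 2\log y + 2 + O\Bigl(\frac{\log^2(1+y)}{y}\Bigr), \tag{14.9}
\end{align}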
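Two of these elementary estimates are easy to test numerically. In the sketch below we use the standard fact that the constant c₁ in (14.7) is the Euler–Mascheroni constant, and we compare the observed errors against the error terms 1/y and log²(1 + y)/y with implied constant 1, which is our own choice for the illustration and not a claim of the text. The helper names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329   # numerical value of the constant c_1

def harmonic_error(y):
    """Sum of 1/m over m <= y, minus log y, minus c_1 (compare (14.7))."""
    total = sum(1.0 / m for m in range(1, math.floor(y) + 1))
    return total - math.log(y) - EULER_GAMMA

def log_square_error(y):
    """(1/y) times the sum of log^2 m over m <= y, minus the main term of (14.9)."""
    mean = sum(math.log(m) ** 2 for m in range(1, math.floor(y) + 1)) / y
    return mean - (math.log(y) ** 2 - 2 * math.log(y) + 2)

checks = [
    (abs(harmonic_error(y)) <= 1 / y,
     abs(log_square_error(y)) <= math.log(1 + y) ** 2 / y)
    for y in (10.0, 100.0, 10000.0)
]
```

For these sample values both errors stay within the stated order of magnitude.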
where we changed the order of summation. Applying this with g(n) = log2 n
and f (n) = Λ2 (n) gives
\[ \frac1x\sum_{n\le x}\Lambda_2(n) = \sum_{d\le x}\frac{\mu(d)}{d}\,\frac{1}{x/d}\sum_{m\le x/d}\log^2m, \tag{14.10} \]
and we see that the estimate (14.9) might be useful for $y = \frac xd$. Multiplying (14.8) by 2, (14.7) by a constant $c_3$, (14.6) by a constant $c_4$, and summing
we can choose the constants to match† the right-hand side of the asymp-
† Explicitly, $c_3 = -2 - 2c_1$ and $c_4 = 2 - 2c_2 - c_1c_3$.
where we set $n = dm$ and exchanged the order of summation again. Next we recall that $\sum_{d\mid n}\mu(d) = (1 *_D \mu)(n) = \delta_1(n)$ for all $n\ge1$ by Proposition 14.7, and claim that
\[ \sum_{d\le x}\log^2\Bigl(1+\frac xd\Bigr) = O(x). \tag{14.11} \]
Together we obtain
\[ \frac1x\sum_{n\le x}\Lambda_2(n) = \frac1xF(x) + O(1) = 2\log x + c_3 + c_4 + O(1) = 2\log x + O(1) \]
and so
\[ \Bigl|\sum_{m=1}^{\lfloor y\rfloor}f(m) - \int_1^yf(t)\,dt\Bigr|\le f(1) + f(y+1). \]
Using the substitution $u = \frac xt$ (with $du = x(-t^{-2})\,dt$) we see that
\[ \int_1^x\log^2\Bigl(1+\frac xt\Bigr)dt = \int_1^x\log^2(1+u)\frac{x}{u^2}\,du\le x\int_1^\infty\frac{\log^2(1+u)}{u^2}\,du = O(x), \]
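\[ \sum_{m\le y}\frac1m = \int_1^{\lfloor y\rfloor+1}f(t)\,dt. \]
Also note that $f(t) - \frac1t\ll\frac1{t^2}$. Hence
\begin{align*}
\sum_{m\le y}\frac1m &= \log y + \int_1^{\lfloor y\rfloor+1}f(t)\,dt - \int_1^y\frac1t\,dt\\
&= \log y + \int_1^\infty\Bigl(f(t)-\frac1t\Bigr)dt - \int_y^\infty\Bigl(f(t)-\frac1t\Bigr)dt + \int_y^{\lfloor y\rfloor+1}f(t)\,dt\\
&= \log y + c_1 + O\Bigl(\frac1y\Bigr)
\end{align*}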
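Note that
\[ \int_1^y\frac{\log t}{t}\,dt = \frac12\log^2y \]
and that
\[ \int_y^\infty\frac{\log t}{t^2}\,dt = \int_{\log y}^\infty ue^{-u}\,du = \Bigl[-ue^{-u} - e^{-u}\Bigr]_{\log y}^{\infty} = \frac{\log y+1}{y}. \]
Hence, if we set
\[ -c_2 = \int_1^\infty\Bigl(f(t) - \frac{\log t}{t}\Bigr)dt \]
then we have, similarly,
\begin{align*}
\sum_{m\le y}\frac{\log m}{m} &= \frac12\log^2y + \int_1^yf(t)\,dt - \int_1^y\frac{\log t}{t}\,dt + \int_y^{\lfloor y\rfloor+1}f(t)\,dt\\
&= \frac12\log^2y - c_2 - \int_y^\infty\Bigl(f(t) - \frac{\log t}{t}\Bigr)dt + O\Bigl(\frac{\log(1+y)}{y}\Bigr)\\
&= \frac12\log^2y - c_2 + O\Bigl(\frac{\log(1+y)}{y}\Bigr).
\end{align*}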
\[ \nu_f = \sum_{n=1}^\infty\frac{f(n)}{n}\,\delta_{\log n}. \]
for any Borel subset B ⊆ R, and is again a Radon measure on [0, ∞).
as claimed.
For a Radon measure $\nu$ on $\mathbb R$ and some $h\in\mathbb R$ we define the shifted measure $\lambda_h\nu$ by
\[ \int_{\mathbb R}f\,d\lambda_h\nu = \int_{\mathbb R}f(t-h)\,d\nu(t). \]
as h → ∞.
We will not use this lemma except as motivation for the next step, which
relies on the Selberg symmetry formula.
\[
  d\nu_{\mathrm{sym}}(t) = d\nu_\Lambda(t) + \frac{1}{t}\,d\bigl(\nu_\Lambda \ast \nu_\Lambda\bigr)(t)
\]
and λh νsym converges to 2m in the weak* topology as h → ∞.
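\begin{align*}
  \nu_{\mathrm{sym}}(B) = \nu_{\Lambda_2/\log}(B)
    &= \sum_{n>1} 1_B(\log n)\,\frac{\Lambda_2(n)}{n\log n} \\
    &= \sum_{n>1} 1_B(\log n)\Bigl(\frac{\Lambda(n)}{n} + \frac{(\Lambda \ast_D \Lambda)(n)}{n\log n}\Bigr) \\
    &= \nu_\Lambda(B) + \int 1_B(t)\,\frac{1}{t}\,d\nu_{\Lambda \ast_D \Lambda}(t)
\end{align*}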
by the properties of Λ2 in Lemma 14.8. This gives the identity in the corollary.
To obtain the claimed convergence we apply the Selberg symmetry formula
(Proposition 14.9). By this formula,
\[
  \sum_{n\leq cx}\Lambda_2(n) = 2cx\log x + O(x) = 2cx\log x + o(x\log x)
\]
for any constant c > 0. Dividing by $x\log x$ this gives the asymptotic
\[
  \frac{1}{x\log x}\sum_{ax<n\leq bx}\Lambda_2(n) = 2(b-a) + o(1) \tag{14.12}
\]
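as x → ∞ for any 0 < a < b. Now fix ε > 0 and assume that x is large
enough to ensure that $ax < n \leq bx$ implies that
\[
  1-\varepsilon \leq \frac{\log x}{\log n} \leq 1+\varepsilon.
\]
Multiplying this by $\frac{1}{\log x}\Lambda_2(n)$ and summing over n ∈ (ax, bx] leads to
\[
  1-\varepsilon \leq
  \frac{\frac{1}{x}\sum_{ax<n\leq bx}\frac{\Lambda_2(n)}{\log n}}
       {\frac{1}{x\log x}\sum_{ax<n\leq bx}\Lambda_2(n)}
  \leq 1+\varepsilon.
\]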
= 2(b − a) + o(1)
as x → ∞. Here the left-hand side equals the integral of the function defined
by $f(t) = 1_{(\log a,\log b]}(t)\,e^t$ with respect to $\lambda_{\log x}\nu_{\mathrm{sym}}$, so that
\[
  \int 1_{(\log a,\log b]}(t)\,e^t\,d\lambda_{\log x}\nu_{\mathrm{sym}}
    = 2(b-a) + o(1)
    = 2\int 1_{(\log a,\log b]}(t)\,e^t\,dt + o(1)
\]
\[
  0 \leq \lambda_h\nu_\Lambda \leq \lambda_h\nu_{\mathrm{sym}} \tag{14.13}
\]
we see that $(\lambda_{h_n}\nu_\Lambda)|_{[-\ell,\ell]}$ can be identified with a bounded sequence of
functionals on $C([-\ell,\ell])$. Let $R_\ell = \sup_{n\geq 1}\lambda_{h_n}\nu_\Lambda([-\ell,\ell])$. By the Banach–Alaoglu
theorem (Theorem 8.10 and Proposition 8.11) the closed ball
\[
  B_\ell = \overline{B_{R_\ell}^{C([-\ell,\ell])^*}}
\]
is compact and metrizable in the product topology (see Appendix A.3).
Unfolding the definitions, it follows that $(\lambda_{h_n}\nu_\Lambda)$ has a subsequence $(\lambda_{h_{n_k}}\nu_\Lambda)$
such that $\int f\,d\lambda_{h_{n_k}}\nu_\Lambda$ converges as k → ∞ for any $f \in C_c(\mathbb{R})$. As
integration and the limit are linear, by taking the limit we obtain a linear
functional on $C_c(\mathbb{R})$. Moreover, this functional is non-negative for any
non-negative $f \in C_c(\mathbb{R})$ and so the Riesz representation theorem (Theorem 7.44)
shows that there is a Radon measure µ on R such that
\[
  \lim_{k\to\infty}\int f\,d\lambda_{h_{n_k}}\nu_\Lambda = \int f\,d\mu,
\]
Using the density of $C_c(\mathbb{R})$ in $L^1_{\mu+m}(\mathbb{R})$ we can approximate the
characteristic function $1_B$ of any bounded measurable set by a non-negative
function in $C_c(\mathbb{R})$ simultaneously with respect to both µ and m, which
implies that $\mu(B) \leq 2m(B)$. Using Proposition 3.29 it follows that µ is
absolutely continuous with respect to m and that the Radon–Nikodym
derivative $D = \frac{d\mu}{dm}$ takes values in [0, 2] almost everywhere.
Recall from Lemma 14.12 that we wish to show that D ≡ 1.
Proof of first inequality in Theorem 14.6. Fix some $f \in C_c(\mathbb{R})$ and
choose a sequence $(h_n)$ with $h_n \to \infty$ as n → ∞ such that
\[
  \|f\|_\Lambda = \lim_{n\to\infty}\Bigl|\sum_{m\geq 1}\frac{\Lambda(m)}{m}\,f(\log m - h_n) - \int_{\mathbb{R}} f\,dm\Bigr|.
\]
for x > 1.
Proof. We first claim that
\[
  \sum_{n\leq x}\Lambda(n) = O(x), \tag{14.15}
\]
which will allow us to control error terms in the calculation below. The
bound (14.15) is a trivial consequence of PNT (in the form of the
statement (14.1)), but fortunately the results developed above are sufficient to
prove (14.15) quite directly. In fact, the continuity bound $\|f\|_\Lambda \leq \|f\|_1$
in (14.4) (proven above) implies that
\[
  \limsup_{x\to\infty}\sum_{n\geq 1}\frac{\Lambda(n)}{n}\,f\Bigl(\log\frac{n}{x}\Bigr) \leq 2\|f\|_1
\]
for every f ∈ Cc ((0, ∞)). Using this for $f(t) = e^t g_+(e^t)$ and a non-negative
function $g_+ \in C_c(\mathbb{R})$ with $1_{[\frac{1}{2},1]} \leq g_+$ we obtain (much as in the proof of
Proposition 14.4) that
\[
  \limsup_{x\to\infty}\frac{1}{x}\sum_{\frac{1}{2}x<n\leq x}\Lambda(n) \leq 2\int_0^\infty g_+(t)\,dt,
\]
and so
\[
  \sum_{\frac{1}{2}x<n\leq x}\Lambda(n) \leq Cx
\]
\[
  \leq Cx + \tfrac{1}{2}Cx + \dots + \tfrac{1}{2^\ell}Cx \leq 2Cx,
\]
\[
  \sum_{n\leq x}\log n = x\sum_{d\leq x}\frac{\Lambda(d)}{d} + O(x). \tag{14.16}
\]
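One standard way to see where (14.16) comes from (offered here as a sketch, not necessarily the argument used in the text) is the exact identity $\sum_{n\leq N}\log n = \sum_{d\leq N}\Lambda(d)\lfloor N/d\rfloor$: replacing $\lfloor N/d\rfloor$ by $N/d$ changes the sum by at most $\sum_{d\leq N}\Lambda(d)$, which is $O(N)$ by (14.15). The Python sketch below, with hypothetical function names, checks both statements for one value of $N$.

```python
import math

def von_mangoldt(limit):
    """Return lam with lam[n] = Λ(n) for 0 <= n <= limit (lam[0] is unused)."""
    lam = [0.0] * (limit + 1)
    is_composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if is_composite[p]:
            continue
        for multiple in range(2 * p, limit + 1, p):
            is_composite[multiple] = True
        power = p
        while power <= limit:
            lam[power] = math.log(p)
            power *= p
    return lam

def compare_log_sums(N):
    """Return (sum log n, sum Λ(d) floor(N/d), N * sum Λ(d)/d, sum Λ(d))."""
    lam = von_mangoldt(N)
    log_sum = sum(math.log(n) for n in range(1, N + 1))
    exact = sum(lam[d] * (N // d) for d in range(1, N + 1))
    smooth = N * sum(lam[d] / d for d in range(1, N + 1))
    return log_sum, exact, smooth, sum(lam)

log_sum, exact, smooth, psi = compare_log_sums(1000)
print(log_sum, exact)          # these two agree up to rounding
print(smooth - log_sum, psi)   # the difference is between 0 and the last value
```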
as a sum of three signed measures. For the first of these we recall from
Corollary 14.13 that
λh ρ3 −→ 0 (14.21)
We assume in this section that the semi-norm $\|\cdot\|_\Lambda$ defined in Proposition 14.4
is non-trivial. Let $A_\Lambda$ be the completion of $C_c(\mathbb{R})$ with respect to $\|\cdot\|_\Lambda$ and
note that Theorem 14.6 shows that there is a Banach algebra homomorphism
\[
  \Phi : L^1(\mathbb{R}) \to A_\Lambda.
\]
Essential Exercise 14.16. Give the details of the argument that deduces
the existence of a Banach algebra homomorphism Φ as above from
Theorem 14.6.
and so kf ∗n kΛ > 1 for all n > 1. As argued above, this gives the existence
of ξ ∈ R as in the theorem.
Now let one such ξ ∈ R be fixed and suppose that f ∈ Cc (R) has
f (t)e−2πitξ > 0
Notice that Theorems 14.17 and 14.18 together show that $\|f\|_\Lambda = 0$ for
every $f \in C_c(\mathbb{R})$. Proposition 14.4 then gives the PNT (Theorem 14.1).
Proof of Theorem 14.18 for ξ ≠ 0. In this case we will only use that
the density function D in Proposition 14.14 takes values in [0, 2] ⊆ R almost
surely. Let
\[
  f_0(t) = \begin{cases}
    1 & \text{if } |t| \leq 1, \\
    2 - |t| & \text{if } |t| \in [1,2], \\
    0 & \text{otherwise,}
  \end{cases}
\]
and for a fixed ξ ≠ 0 we define $f(t) = f_0(t)e^{2\pi i t\xi}$. Choose a sequence $(h_n)$
with $h_n \to \infty$ as n → ∞ for which
\[
  \|f\|_\Lambda = \lim_{n\to\infty}\Bigl|\int f\,d\lambda_{h_n}\nu_\Lambda - \int f\,dm\Bigr|.
\]
Choose θ ∈ R with
\[
  \|f\|_\Lambda = e^{i\theta}\int f(D-1)\,dm,
\]
where the strict inequality follows from ξ ≠ 0. Thus the theorem follows in
this case.
\[
  \sum_{n\geq 1}\frac{\Lambda(n)}{n}\,f_0(\log n - h) \leq \sum_{e^{h-(N+1)}\leq n\leq e^{h+1}}\frac{\Lambda(n)}{n}
\]
for all sufficiently large h. Thus $\|f_0\|_\Lambda < \|f_0\|_1$ and the theorem follows.
Essential Exercise 14.21. (a) For any a ∈ Z with gcd(q, a) = 1 show that
the function $f_a = 1_{\{k\in\mathbb{Z} \mid k\equiv a \pmod q\}}$ can be expressed as a linear combination
of Dirichlet characters of modulus q.
(b) Show that in order to prove Theorem 14.19 it is enough to show that
\[
  \sum_{n\leq x}\chi(n)\Lambda(n) = o(x) \tag{14.25}
\]
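To make part (a) concrete in one small case, the following Python sketch builds the four Dirichlet characters modulo 5 from the primitive root 2 and checks numerically that $\frac{1}{\varphi(q)}\sum_\chi \overline{\chi(a)}\chi(n)$ is the indicator of $n \equiv a \pmod q$. The choice $q = 5$, the generator, and all names are assumptions made for illustration; the general case (non-cyclic unit groups) needs more care and is left to the exercise.

```python
import cmath

Q = 5          # assumed prime modulus, so the unit group is cyclic
GENERATOR = 2  # a primitive root modulo 5

def discrete_log_table(q, g):
    """Map each unit r modulo q to the exponent k with g**k ≡ r (mod q)."""
    table = {}
    value = 1
    for k in range(q - 1):
        table[value] = k
        value = (value * g) % q
    return table

LOGS = discrete_log_table(Q, GENERATOR)

def chi(j, n):
    """The j-th Dirichlet character modulo Q, for j = 0, ..., Q - 2."""
    r = n % Q
    if r == 0:
        return 0j
    return cmath.exp(2j * cmath.pi * j * LOGS[r] / (Q - 1))

def indicator_via_characters(a, n):
    """Evaluate (1/φ(Q)) * sum over χ of conj(χ(a)) χ(n)."""
    total = sum(chi(j, a).conjugate() * chi(j, n) for j in range(Q - 1))
    return total / (Q - 1)

for n in range(10):
    print(n, round(indicator_via_characters(3, n).real, 6))
```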
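Working towards the proof of the algebra inequality, we replace the second
von Mangoldt function with the twisted version $\chi\Lambda_2$, which by Lemma 14.8
satisfies
\begin{align*}
  \chi(n)\Lambda_2(n)
    &= \sum_{d|n}\chi(d)\mu(d)\,\chi\bigl(\tfrac{n}{d}\bigr)\log^2\bigl(\tfrac{n}{d}\bigr) \tag{14.26} \\
    &= \chi(n)\Lambda(n)\log n + \sum_{d|n}\chi(d)\Lambda(d)\,\chi\bigl(\tfrac{n}{d}\bigr)\Lambda\bigl(\tfrac{n}{d}\bigr) \\
    &= \chi(n)\Lambda(n)\log n + (\chi\Lambda \ast_D \chi\Lambda)(n),
\end{align*}
\[
  d\nu^{\chi}_{\mathrm{sym}} = d\nu_{\chi\Lambda_2/\log} = d\nu_{\chi\Lambda} + \frac{1}{t}\,d(\nu_{\chi\Lambda}\ast\nu_{\chi\Lambda}),
\]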
where the second equality follows from the formula above, just as in the proof
of Corollary 14.13.
Essential Exercise 14.25. (a) Show that
\[
  \frac{1}{y}\sum_{n\leq y}\chi(n)\log^2 n = O\Bigl(\frac{\log^2(1+y)}{y}\Bigr)
\]
for y ≥ 1.
(b) Deduce the twisted version of the Selberg symmetry formula,
\[
  \sum_{n\leq x}\chi(n)\Lambda_2(n) = O(x)
\]
for x ≥ 1.
(c) Show that $\lambda_h\nu^{\chi}_{\mathrm{sym}} \to 0$ in the weak* topology as h → ∞.
Essential Exercise 14.26. Show that $\|f_1 \ast f_2\|_\chi \leq \|f_1\|_\chi\|f_2\|_\chi$ for all
functions $f_1, f_2 \in C_c(\mathbb{R})$.
Essential Exercise 14.27. Show that Theorem 14.17 also holds in a similar
way for the semi-norm $\|\cdot\|_\chi$.
Essential Exercise 14.28. Use Exercise 14.27 to prove Theorem 14.18 for
the semi-norm $\|\cdot\|_\chi$ and ξ ≠ 0.
It remains to establish the analogue of Theorem 14.18 for $\|\cdot\|_\chi$ and ξ = 0.
In this case we previously used the full force of Mertens’ theorem
(Theorem 14.15). Here we replace this with the statement
\[
  \sum_{n\leq x}\frac{\chi(n)\Lambda(n)}{n} = O(1) \tag{14.27}
\]
for x ≥ 1, which we prove in the following subsection (this is also due to
Dirichlet).
Essential Exercise 14.29. Assuming (14.27), prove Theorem 14.18 for $\|\cdot\|_\chi$
and ξ = 0, and conclude the proof of Theorem 14.19.
In this section we will prove (14.27). This will require a brief excursion into
the beginnings of analytic number theory; we refer to Serre [97] for more
details. The tools needed are basic properties of Dirichlet series and the
Abel summation formula. Following a convention going back to Riemann, we
write s = σ+it with σ, t ∈ R for any s ∈ C. There are shorter proofs of (14.27)
which do not use complex analysis; we refer, for example, to Tao [103] for the
details.
\[
  \theta(mn) = \theta(m)\theta(n)
\]
for all m, n ≥ 1 and is multiplicative if the same property holds for all m, n ≥ 1
with gcd(m, n) = 1. In particular, Dirichlet characters are completely
multiplicative and the Möbius function is multiplicative.
for x > 1.
(2) Writing $\chi_0$ for the trivial character, $L(s,\chi_0)$ has a meromorphic
extension to the half-plane $H^+$ which has a simple pole at s = 1, and is
holomorphic on $H^+ \setminus \{1\}$.
(3) If χ is a non-trivial character, then $L(1,\chi) \neq 0$.
Corollary 14.32. The bound (14.27) holds for any non-trivial Dirichlet
character χ.
The following is a rather simple but useful tool for our discussions.
Lemma 14.33 (Abel summation). For any sequences $(a_n)$ and $(b_n)$,
\[
  S_m = \sum_{n=1}^{m-1} A_n(b_n - b_{n+1}) + A_m b_m
\]
where $A_m = \sum_{n=1}^{m} a_n$ and $S_m = \sum_{n=1}^{m} a_n b_n$, for all m ≥ 1.
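The identity in Lemma 14.33 is easy to test on a finite example. The Python sketch below (function and variable names are hypothetical) evaluates both sides exactly with rational arithmetic, taking for $a_n$ the values of the non-trivial character modulo 4 and $b_n = \frac{1}{n}$.

```python
from fractions import Fraction

def abel_summation_sides(a, b):
    """Return (S_m, right-hand side of Lemma 14.33) for equally long lists.

    The lists are 0-indexed, so a[0] plays the role of a_1.
    """
    m = len(a)
    partial_sums = []          # partial_sums[n] holds A_{n+1}
    running = 0
    for term in a:
        running += term
        partial_sums.append(running)
    direct = sum(a[n] * b[n] for n in range(m))
    by_parts = sum(partial_sums[n] * (b[n] - b[n + 1]) for n in range(m - 1))
    by_parts += partial_sums[m - 1] * b[m - 1]
    return direct, by_parts

a = [1, 0, -1, 0] * 3                        # chi(1), chi(2), ..., chi(12) modulo 4
b = [Fraction(1, n) for n in range(1, 13)]   # b_n = 1/n
direct, by_parts = abel_summation_sides(a, b)
print(direct, by_parts)
```

Since the partial sums $A_n$ of a non-trivial character stay bounded, the right-hand side is the form in which decay of $b_n - b_{n+1}$ can be exploited.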
where $S_k = \sum_{m\leq k}\frac{\chi(m)}{m}$ with $k = \lfloor\frac{x}{d}\rfloor$ is the partial sum appearing in
Theorem 14.31(1). By (14.28) we then have $|S_k - L(1,\chi)| \ll \frac{1}{k}$. Substituting this
into the expression above gives
by (14.15).
We want to show that the left-hand side in the last calculation is also O(1),
as then (14.27) follows since $L(1,\chi) \neq 0$. For this we use Lemma 14.33
with $a_n = \chi(n)$ and $b_n = \frac{\log n}{n}$. Note that
\[
  A_m = \sum_{n=1}^{m} a_n
\]
satisfies $|A_m| \leq \varphi(q)$ since $\sum_{n=0}^{q-1}\chi(n) = 0$ and $\chi(n+q) = \chi(n)$ for all n ∈ N.
This gives
\[
  \Bigl|\sum_{n\leq x}\frac{\chi(n)\log n}{n}\Bigr| \leq \varphi(q)\sum_{n=1}^{\ell-1}|b_n - b_{n+1}| + \varphi(q)\,b_\ell,
\]
\[
  \sum_{n=1}^{m}\frac{\chi(n)}{n^s} = S_m = \sum_{n=1}^{m-1} A_n\bigl(n^{-s} - (n+1)^{-s}\bigr) + A_m m^{-s}. \tag{14.29}
\]
after again using the triangle inequality and $|A_n| \leq \varphi(q)$, since the sum
telescopes. This gives the claim in (14.28).
Properties of $L(s,\chi_0)$. Let $\chi_0$ be the trivial character of modulus q. It
will be convenient to start by recalling some properties of the Riemann zeta
function, defined for ℜ(s) > 1 by
\[
  \zeta(s) = \sum_{n=1}^{\infty}\frac{1}{n^s}.
\]
One easily sees that this series converges absolutely for ℜ(s) > 1, and so defines
a holomorphic function there by Exercise 14.30. To obtain the extension
to $H^+$ and the pole at s = 1, we write
\[
  \zeta(s) - \frac{1}{s-1} = \sum_{n=1}^{\infty} n^{-s} - \int_1^\infty x^{-s}\,dx = \sum_{n=1}^{\infty}\int_n^{n+1}\bigl(n^{-s} - x^{-s}\bigr)\,dx,
\]
and as in the proof of the first part of the theorem, we see that the series on
the right-hand side converges uniformly on any compact subset $K \subseteq H^+$.
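The series on the right-hand side can be summed numerically, since each integral has the closed form $n^{-s} - \frac{(n+1)^{1-s} - n^{1-s}}{1-s}$ for $s \neq 1$. The Python sketch below (a hypothetical helper, restricted to real $s > 0$ with $s \neq 1$) compares the truncated series with $\zeta(2) - 1 = \frac{\pi^2}{6} - 1$ and also evaluates it at $s = \frac{1}{2}$, where the original Dirichlet series for ζ diverges.

```python
import math

def zeta_minus_pole(s, terms=200_000):
    """Approximate zeta(s) - 1/(s - 1) by the first `terms` integrals.

    Each term is the integral of n**(-s) - x**(-s) over [n, n + 1],
    written in closed form; requires real s > 0 with s != 1.
    """
    total = 0.0
    for n in range(1, terms + 1):
        integral = ((n + 1) ** (1.0 - s) - n ** (1.0 - s)) / (1.0 - s)
        total += n ** (-s) - integral
    return total

print(zeta_minus_pole(2.0), math.pi ** 2 / 6 - 1)
print(zeta_minus_pole(0.5))   # finite, although sum n**(-1/2) diverges
```

For $s = \frac{1}{2}$ the terms decay only like $n^{-3/2}$, so the truncated value converges slowly.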
Returning to the trivial Dirichlet character $\chi_0$ of modulus q, we will see
that the difference between $L(s,\chi_0)$ and ζ (or more precisely, their ratio)
is relatively benign. Let $p_1, \dots, p_\ell$ be the finite list of primes that divide q.
Using unique factorization and
\[
  L(s,\chi_0) = \sum_{n:\gcd(n,q)=1} n^{-s}
\]
we obtain
for ℜ(s) > 1 by absolute convergence of all the series involved. Since the
functions $s \mapsto (1 - p^{-s})^{\pm 1}$ are holomorphic on $H^+$, we may use the results
for ζ above and deduce the same properties for $L(s,\chi_0)$.
The last part of the proof of Theorem 14.31 requires some facility with
Dirichlet series provided by the following exercise and lemma.
Essential Exercise 14.34. Show that if $f(s) = \sum_{n\geq 1}\frac{a_n}{n^s}$ converges
absolutely for ℜ(s) > σ0, then $-\sum_{n\geq 1}\frac{a_n\log n}{n^s}$ converges absolutely and uniformly
on compact subsets of {s ∈ C | ℜ(s) > σ0}, and $f'(s) = -\sum_{n\geq 1}\frac{a_n\log n}{n^s}$
there.
Lemma 14.35. Let $(a_n)$ be a real sequence with $a_n \geq 0$ for all n ≥ 1 and
suppose the Dirichlet series $f(s) = \sum_{n\geq 1}\frac{a_n}{n^s}$ converges for ℜ(s) > 1. Suppose
that f can be extended to a meromorphic function on $H^+$, also denoted f.
Then either
• $\sum_{n\geq 1}\frac{a_n}{n^s}$ converges absolutely for ℜ(s) > 0 and f is holomorphic on $H^+$,
or
• there exists some σ0 > 0 such that $\sum_{n\geq 1}\frac{a_n}{n^{\sigma_0}} = \infty$, $\sum_{n\geq 1}\frac{a_n}{n^s}$ converges
Proof. Define
\[
  \sigma_0 = \inf\Bigl\{\sigma > 0 \;\Big|\; \sum_{n\geq 1}\frac{a_n}{n^\sigma} < \infty\Bigr\}.
\]
By non-negativity of the coefficients $a_n$ and monotonicity of $\sigma \mapsto n^{-\sigma}$, we see
that the series $\sum_{n\geq 1}\frac{a_n}{n^s}$ converges absolutely for ℜ(s) > σ0, and so defines a
holomorphic function there (see Exercise 14.30), which must therefore
coincide with f. If σ0 = 0 then we are in the first case of the lemma. If σ0 > 0
and $\sum_{n\geq 1}\frac{a_n}{n^{\sigma_0}} = \infty$, then $f(s) = \sum_{n\geq 1}\frac{a_n}{n^s}$ for ℜ(s) > σ0. Moreover, it is
easy to see (for example, using the monotone convergence theorem) by
non-negativity of the coefficients that
\[
  \lim_{\sigma\searrow\sigma_0} f(\sigma) = \lim_{\sigma\searrow\sigma_0}\sum_{n\geq 1}\frac{a_n}{n^\sigma} = \sum_{n\geq 1}\frac{a_n}{n^{\sigma_0}} = \infty,
\]
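for all k ≥ 0. By the Taylor expansion of f at σ0, this gives for sufficiently
small ε > 0 that
\begin{align*}
  f(\sigma_0 - \varepsilon)
    &= \sum_{k=0}^{\infty}\frac{1}{k!}f^{(k)}(\sigma_0)(-\varepsilon)^k
     = \sum_{k=0}^{\infty}\frac{1}{k!}\sum_{n\geq 1}\frac{a_n(\log n)^k}{n^{\sigma_0}}\,\varepsilon^k \\
    &= \sum_{n\geq 1} a_n\sum_{k=0}^{\infty}\frac{1}{k!}(-\log n)^k n^{-\sigma_0}(-\varepsilon)^k
\end{align*}
since all terms are again non-negative. The inner sum is precisely the Taylor
expansion of $s \mapsto n^{-s}$ at σ0 and so gives $n^{-(\sigma_0-\varepsilon)}$ and hence
\[
  \sum_{n\geq 1}\frac{a_n}{n^{\sigma_0-\varepsilon}} = f(\sigma_0 - \varepsilon) < \infty,
\]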
\[
  s \mapsto L(s,\chi_0)
\]
would be holomorphic on $H^+$. Here the product is taken over all the
characters.
We will see that $\zeta_q$ has a pole at 1 by using Euler product expansions.
Unique factorization in the integers and complete multiplicativity of χ show
that
\[
  L(s,\chi) = \sum_{n\geq 1}\frac{\chi(n)}{n^s}
    = \prod_{p}\Bigl(1 - \frac{\chi(p)}{p^s}\Bigr)^{-1}
    = \prod_{p:\gcd(p,q)=1}\Bigl(1 - \frac{\chi(p)}{p^s}\Bigr)^{-1}
\]
for ℜ(s) > 1. This may be seen by extending the argument for (14.30) to all
primes and using absolute convergence. Taking the product over all Dirichlet
characters χ again gives the function
\[
  \zeta_q(s) = \prod_{\chi}\prod_{p:\gcd(p,q)=1}\Bigl(1 - \frac{\chi(p)}{p^s}\Bigr)^{-1}. \tag{14.31}
\]
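The Euler product for a single $L(s,\chi)$ can be checked numerically in a simple case. The sketch below is an illustration under assumptions not taken from the text: it uses the non-trivial character modulo 4 and the real point $s = 2$, and compares a truncated Dirichlet series with a truncated Euler product. All names are hypothetical.

```python
def chi4(n):
    """The non-trivial Dirichlet character modulo 4."""
    return (0, 1, 0, -1)[n % 4]

def primes_up_to(limit):
    """Return all primes p <= limit (limit >= 2) by the sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            for multiple in range(p * p, limit + 1, p):
                sieve[multiple] = False
    return [p for p, flag in enumerate(sieve) if flag]

def l_series(s, terms):
    """Truncated Dirichlet series: sum of chi4(n) / n**s for n <= terms."""
    return sum(chi4(n) / n ** s for n in range(1, terms + 1))

def l_euler_product(s, prime_bound):
    """Truncated Euler product: product of (1 - chi4(p)/p**s)**(-1), p <= bound."""
    product = 1.0
    for p in primes_up_to(prime_bound):
        product /= 1.0 - chi4(p) / p ** s
    return product

print(l_series(2, 100_000), l_euler_product(2, 10_000))
```

The prime 2 contributes the factor 1 because $\chi(2) = 0$, which matches the restriction to primes with $\gcd(p, q) = 1$ in the last product above.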
where $\omega_{f(p)}$ is a primitive f(p)th root of unity. Using this in the
expression (14.31) for $\zeta_q(s)$ gives
\begin{align*}
  \zeta_q(s)
    &= \prod_{p:\gcd(p,q)=1}\bigl(1 - p^{-f(p)s}\bigr)^{-g(p)} \\
    &= \prod_{p:\gcd(p,q)=1}\bigl(1 + p^{-f(p)s} + p^{-2f(p)s} + \cdots\bigr)^{g(p)} \tag{14.32} \\
    &= \sum_{n:\gcd(n,q)=1}\frac{a_n}{n^s} = \sum_{n\geq 1}\frac{a_n}{n^s}
\end{align*}
for ℜ(s) > 1, where we expanded the Euler product once again into a
convergent Dirichlet series with certain coefficients $a_n$. Notice that the precise
form of (14.32) shows that $a_n \in \mathbb{N}_0$ for all n ∈ N. By Lemma 14.35 the
series $\sum_{n\geq 1}\frac{a_n}{n^s}$ either converges absolutely for ℜ(s) > 0 and is holomorphic
on $H^+$ or there exists some σ0 > 0 such that $\zeta_q$ has a pole at σ0. In the latter
case it follows that σ0 = 1 and $L(1,\chi) \neq 0$ for every non-trivial Dirichlet
character $\chi \neq \chi_0$.
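It remains to show that the former case cannot occur. To see this, notice
that for any prime p with gcd(p, q) = 1 and σ > 0 we have
\[
  \bigl(1 + p^{-f(p)\sigma} + p^{-2f(p)\sigma} + \cdots\bigr)^{g(p)} \geq 1 + p^{-\varphi(q)\sigma} + p^{-2\varphi(q)\sigma} + \cdots
\]
and hence under the assumption that $\zeta_q(\sigma) = \sum_{n\geq 1}\frac{a_n}{n^\sigma}$ converges for σ > 0,
\begin{align*}
  \zeta_q(\sigma) = \sum_{n\geq 1}\frac{a_n}{n^\sigma}
    &= \prod_{p:\gcd(p,q)=1}\bigl(1 + p^{-f(p)\sigma} + p^{-2f(p)\sigma} + \cdots\bigr)^{g(p)} \\
    &\geq \prod_{p:\gcd(p,q)=1}\bigl(1 + p^{-\varphi(q)\sigma} + p^{-2\varphi(q)\sigma} + \cdots\bigr) = L(\varphi(q)\sigma,\chi_0) \tag{14.33}
\end{align*}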
We will be using naive set theory, and in particular will use without specific
reference the axioms of Zermelo–Fraenkel set theory with the axiom of choice
(we refer to Kelley [51] for a good general source on all of the material in this
appendix). This does require some caution. For example, it does not permit
there to be a set that contains all sets, for if there were such a ‘universal’ set V
then its subset C = {A ∈ V | A ∉ A} forces the statement C ∈ C ⇐⇒ C ∉ C,
which is contradictory.
Here are some basic properties of sets that we will use without comment.
(1) A set will never contain itself.
(2) For every set S of sets there is a set $\bigcup_{A\in S} A$, the union, containing all
elements that are contained in some A ∈ S.
(3) For every set A there is a power set P(A) containing all subsets of A.
(4) Any condition on the elements of a set can be used to define a new set,
namely the subset of all elements that satisfy the condition.
Examples of sets include the empty set ∅, the natural numbers N, the real
numbers R, the set of functions R → C, which may also be written as $\mathbb{C}^{\mathbb{R}}$,
and so on.
The following axiom of set theory is less intuitive than those above, but it
plays a central role in analysis.
While this axiom appears quite innocent (indeed, it appears almost
obvious), it turns out to have a number of exotic consequences.(36) The axiom of
choice has many equivalent formulations, one of which is Zorn’s lemma, which
is particularly useful in analysis. In order to state this, recall that a partial
order on a set S is a relation ≼ with the reflexivity property that a ≼ a
for all a ∈ S, the transitivity property that a ≼ b, b ≼ c =⇒ a ≼ c for
all a, b, c ∈ S, and the anti-symmetry property that a ≼ b, b ≼ a =⇒ a = b
for all a, b ∈ S. A partial order is a linear order if for every pair a, b ∈ S
we have either a ≼ b or b ≼ a. A maximal element in a partially ordered
set (S, ≼) is an element m ∈ S for which m ≼ a for some a ∈ S implies
that a = m.
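For a finite partially ordered set the definition of a maximal element can be checked directly. The following Python sketch (names are hypothetical) finds the maximal elements of {2, ..., 10} ordered by divisibility; it illustrates that a partially ordered set may have several maximal elements and no largest one.

```python
def maximal_elements(elements, precedes):
    """Return all m such that precedes(m, a) forces a == m.

    `precedes(a, b)` is assumed to implement a partial order a ≼ b.
    """
    elements = list(elements)
    return [m for m in elements
            if all(a == m or not precedes(m, a) for a in elements)]

def divides(a, b):
    """The divisibility order on positive integers: a ≼ b if a divides b."""
    return b % a == 0

print(maximal_elements(range(2, 11), divides))   # [6, 7, 8, 9, 10]
```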
Zorn’s lemma. Let (S, ≼) be a partially ordered set, and suppose that for
every linearly ordered subset L ⊆ S there exists an element m ∈ S with ℓ ≼ m
for all ℓ ∈ L. Then there exists a maximal element m ∈ S.
One might imagine setting out to prove Zorn’s lemma inductively along
the following lines. Starting with a single element (which certainly forms a
linearly ordered set) one can build larger and larger linearly ordered subsets.
If the current linearly ordered subset L has a maximal element, then it may
also be a maximal element for S, in which case we are done. Otherwise, one
can use the assumed property and add an element to L which is bigger than
every element of L. Repeating this inductively (by transfinite induction, and
noting that this procedure only ends once a maximal element in S is found),
Zorn’s lemma follows. However, in the course of the proof one has to make
(potentially uncountably) many choices, and doing this carefully reveals that
the argument needs the axiom of choice.
The notion of an open set is fundamental for defining continuity and conver-
gence.
Definition A.1. Let X be a set. A family T ⊆ P(X) of subsets of X is called
a topology on X if
• ∅, X ∈ T ;
• if O1 , O2 ∈ T then O1 ∩ O2 ∈ T ;
• if Oi ∈ T for all i ∈ I, where I is an arbitrary index set, then $\bigcup_{i\in I} O_i \in T$.
The pair (X, T ) is called a topological space. The elements of a topology are
called open sets and a set A ⊆ X with X∖A ∈ T is called closed. A set that
is both open and closed is called a clopen set.
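On a finite set the axioms of Definition A.1 can be verified mechanically, because closure under pairwise unions already gives closure under arbitrary unions of a finite family. The Python sketch below (the function name is hypothetical) checks a candidate family of subsets.

```python
from itertools import combinations

def is_topology(X, family):
    """Check the axioms of a topology for a finite family of subsets of X."""
    X = frozenset(X)
    opens = {frozenset(O) for O in family}
    if frozenset() not in opens or X not in opens:
        return False
    for O1, O2 in combinations(opens, 2):
        # Finite intersections and (for a finite family) arbitrary unions
        # both reduce to the pairwise case.
        if O1 & O2 not in opens or O1 | O2 not in opens:
            return False
    return True

print(is_topology({1, 2, 3}, [set(), {1}, {1, 2}, {1, 2, 3}]))   # True
print(is_topology({1, 2, 3}, [set(), {1}, {2}, {1, 2, 3}]))      # False
```

The second family fails because the union {1, 2} of two of its members is missing.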
Given a point x in a topological space, a neighbourhood of x is a set V
containing an open set U that contains x. We will often want to assume that
neighbourhoods are open sets, in which case we will speak of open
neighbourhoods. A topological space is called Hausdorff if for any points x1 ≠ x2 in X
there exist neighbourhoods U1 of x1 and U2 of x2 such that U1 ∩ U2 = ∅.
Many of the topological spaces that we will study are particularly well-
behaved ones arising from a metric.
It is easy to check that the collection of all open sets in a metric space
defines a topology on the metric space. If instead of strict positivity we only
have
• (positivity) d(x, y) ≥ 0 for all x, y ∈ X
then we say that d is a pseudo-metric; this also gives rise to a topology in
the same way.
Definition A.4. Let X be a set and suppose that T1 and T2 are two topo-
logies on X. If the identity map I : X → X viewed as a map from (X, T1 )
to (X, T2 ) is continuous (which means that T2 ⊆ T1 ), then T2 is said to be
weaker or coarser than T1 , and T1 is called stronger or finer than T2 .
The open sets in the initial topology are arbitrary unions of finite
intersections of elements of $f_\imath^{-1}(T_{Y_\imath})$ for various ı ∈ I. The initial topology can also
be characterized by the following universal property. A function g : Z → X
is continuous if and only if $f_\imath \circ g : Z \to Y_\imath$ is continuous for each ı ∈ I.
A particular case of the initial topology is the product topology.
Proof. For the main part of the argument it is important to know that we
may assume that $d_n$ only takes on values in [0, 1). To see this, we claim that
if $d_n$ is any pseudo-metric then $\tilde d_n = \frac{d_n}{1+d_n}$ is a pseudo-metric that defines
the same topology as $d_n$ does.
Positivity and symmetry of $\tilde d_n$ are clear since they hold for $d_n$. Hence it
is enough to check the triangle inequality for $\tilde d_n$. For this, notice first that
the function $u \mapsto \frac{u}{1+u}$ maps from [0, ∞) to [0, 1), is monotone increasing and
satisfies
\[
  \frac{u+v}{1+u+v} \leq \frac{u}{1+u} + \frac{v}{1+v} \tag{A.1}
\]
for u, v ∈ [0, ∞). The inequality (A.1) follows from the inequality
as required. It is clear that $d_n(x,y) < \varepsilon$ for x, y ∈ X implies that $\tilde d_n(x,y) < \varepsilon$.
For the converse, notice that $\tilde d_n(x,y) < \frac{\varepsilon}{1+\varepsilon}$ implies that $d_n(x,y) < \varepsilon$ for
all ε > 0 and x, y ∈ X, since $u \mapsto \frac{u}{1+u}$ is strictly increasing. This
implies that $d_n$ and $\tilde d_n$ define the same open sets.
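The passage from a pseudo-metric $d$ to the bounded one $\frac{d}{1+d}$ can be tried out numerically. The Python sketch below takes $d(x,y) = |x-y|$ on a random sample of real numbers (an arbitrary choice for illustration) and checks the triangle inequality and the bound by 1 for the rescaled distance; the helper names are hypothetical.

```python
import itertools
import random

def bounded_version(d):
    """Return the function (x, y) -> d(x, y) / (1 + d(x, y))."""
    return lambda x, y: d(x, y) / (1.0 + d(x, y))

def euclid(x, y):
    return abs(x - y)

random.seed(0)
points = [random.uniform(-50.0, 50.0) for _ in range(30)]
d_tilde = bounded_version(euclid)

# Largest violation of the triangle inequality over all triples (should be <= 0).
worst = max(d_tilde(x, z) - d_tilde(x, y) - d_tilde(y, z)
            for x, y, z in itertools.product(points, repeat=3))
largest = max(d_tilde(x, y) for x, y in itertools.product(points, repeat=2))
print(worst, largest)
```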
So suppose that $d_n : X \times X \to [0,1)$ is a pseudo-metric for each n ≥ 1.
We define
\[
  d(x,y) = \sum_{n=1}^{\infty}\frac{1}{2^n}\,d_n(x,y).
\]
Since this sum converges on X × X, it defines another pseudo-metric on X.
We claim that the topology induced by d is precisely the weakest topology
that is finer than all the topologies induced by $d_n$ for n ≥ 1.
Suppose first that O ⊆ X is an open set with respect to d, and let x ∈ O.
By definition there exists an ε > 0 with
\[
  \bigcap_{n=1}^{N} B^{d_n}_{\varepsilon/2N}(x) \subseteq B^{d}_{\varepsilon}(x) \subseteq O
\]
since if y ∈ X satisfies $d_n(y,x) < \frac{\varepsilon}{2N}$ for n = 1, . . . , N then
\[
  d(x,y) \leq \sum_{n=1}^{N} d_n(x,y) + \sum_{n=N+1}^{\infty}\frac{1}{2^n} < \varepsilon.
\]
\[
  x \in \bigcap_{n=1}^{N} O_n \subseteq O,
\]
then $\frac{1}{2^n}d_n(y,x) < \frac{\varepsilon}{2^N}$ and for n ∈ N with 1 ≤ n ≤ N this implies
that $d_n(y,x) < \varepsilon$, hence $y \in O_n$ and so the claim. The first part of the
lemma follows.
Now suppose that
\[
  X = \prod_{n=1}^{\infty} X_n
\]
for all n ≥ 1, which implies that $(x_n) = (y_n)$, so d is a metric on X. The
topology induced by the pseudo-metric
The space (X, T ) is called compact if every open cover has a finite subcover,
that is, a finite subset V ⊆ U which is also an open cover.
for any finite subset {ı1 , . . . , ık } ⊆ I, and has the infinite intersection property
if
\[
  \bigcap_{\imath\in I} A_\imath \neq \emptyset.
\]
For metric spaces there are further equivalent properties characterizing com-
pactness.
• A metric space (X, d) is sequentially compact if any sequence (xn ) in X
has a convergent subsequence.
• A metric space (X, d) is compact if and only if it is complete and
totally bounded, meaning that for every ε > 0 there is a finite set of
points {x1 , . . . , xn } in X with
\[
  X = \bigcup_{i=1}^{n} B_\varepsilon(x_i).
\]
Exercise A.19. Recall the proofs that the different notions of compactness coincide for
metric spaces.
Lemma A.22. Let X be a locally compact space. Then for every compact
subset K ⊆ X there exists an open subset O ⊆ X with compact closure that
contains K. If X is in addition σ-compact, then there exists a sequence of
compact sets $(K_n)$ such that $X = \bigcup_{n=1}^{\infty} K_n$ and $K_n \subseteq K_{n+1}^{o}$ for all n ≥ 1.
The implication (3) =⇒ (1) uses the axiom of choice in the form of
Zorn’s lemma (to show that any filter is contained in an ultrafilter; see Ex-
ercise A.25).
Exercise A.25. (a) Use Zorn’s lemma to show that every filter has a finer filter that is
an ultrafilter.
(b) Prove Proposition A.24.
(c) Use Proposition A.24 to prove Tychonoff’s theorem.
This definition, which says that disjoint closed sets can be separated by
open sets, may be thought of as requiring that there are ‘enough’ open sets.
An important consequence is that there are ‘enough’ continuous functions in
the following sense (this presentation is taken from Tao’s blog [103]).
(4) For every closed set K ⊆ X and every open set U ⊇ K, there exists a
continuous function f : X → [0, 1] with $1_K \leq f \leq 1_U$.
For metric spaces the proof of the difficult step below is rather simple:
Given two disjoint closed sets K and L we can define the function f as in (3)
by
\[
  f(x) = \frac{d(x,L)}{d(x,K) + d(x,L)}
\]
for x ∈ X, where d(·, K) and d(·, L) are the continuous distance functions
defined in (2.29).
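The formula for $f$ is simple enough to evaluate directly. The sketch below works in $X = \mathbb{R}$ with the disjoint closed intervals $K = [0,1]$ and $L = [2,3]$; these sets and the helper names are assumptions made for illustration only.

```python
def dist_to_interval(x, lo, hi):
    """Distance from the real number x to the closed interval [lo, hi]."""
    return max(lo - x, 0.0, x - hi)

def urysohn(x, K=(0.0, 1.0), L=(2.0, 3.0)):
    """Evaluate d(x, L) / (d(x, K) + d(x, L)) for disjoint intervals K and L."""
    d_K = dist_to_interval(x, *K)
    d_L = dist_to_interval(x, *L)
    return d_L / (d_K + d_L)   # the denominator is positive since K and L are disjoint

for x in (-1.0, 0.5, 1.5, 2.5):
    print(x, urysohn(x))
```

The function equals 1 on $K$, equals 0 on $L$, and takes values strictly between 0 and 1 elsewhere.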
Proof. The implications (3) ⇐⇒ (4) and (1) ⇐⇒ (2) are clear, since a set
is closed if and only if its complement is open.
Assume now that (3) holds. Given disjoint closed sets K, L ⊆ X, let f be
the function given by (3). Then the open sets U = {x ∈ X | f (x) > 0.9}
and V = {x ∈ X | f (x) < 0.1} show (1).
Assume next that (2) holds, let K = K1 be a closed set, and let U = U0
be an open set with K1 ⊆ U0 . By (2), we can find a closed set K1/2 and an
open set U1/2 with
U0 ⊇ K1/2 ⊇ U1/2 ⊇ K1 .
Applying (2) again twice gives closed sets K1/4 , K3/4 and open sets U1/4 , U3/4
with
U0 ⊇ K1/4 ⊇ U1/4 ⊇ K1/2 ⊇ U1/2 ⊇ K3/4 ⊇ U3/4 ⊇ K1 .
Continuing in exactly the same way and setting K0 = X and U1 = ∅, we
construct for every rational $q \in D = \{\frac{a}{2^n} \mid n \geq 0,\ a \in \mathbb{Z},\ 0 \leq a \leq 2^n\}$ a closed
set $K_q$ and an open set $U_q$ with $K_q \supseteq U_q$ for q ∈ D and with $U_{q_1} \supseteq K_{q_2}$ for
all q1 , q2 ∈ D with q1 < q2 . Now define
\[
  f(x) = \begin{cases}
    0 & \text{for } x \notin U_0, \\
    \sup\{q \in D \mid x \in U_q\} & \text{otherwise.}
  \end{cases}
\]
for s ≤ 1 and $f^{-1}((-\infty,s)) = X$ for s > 1. Hence both $f^{-1}((s,\infty))$
and $f^{-1}((-\infty,s))$ are open sets for any real s, so f is continuous and (4)
follows.
For a continuous function f on a topological space X with values in R, C,
or a vector space, we define the support of f to be
\[
  \operatorname{Supp} f = \overline{\{x \in X \mid f(x) \neq 0\}}
\]
for all x ∈ X.
f : A → [−1, 1]
2
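Continuing inductively starting with $f_3 = \frac{3}{2}\bigl(f_2 - h_2|_A\bigr)$, we find
functions $h_1, h_2, \dots, h_n : X \to [-1,1]$ with
\[
  \Bigl\| f - \bigl(h_1 + \tfrac{2}{3}h_2 + \dots + (\tfrac{2}{3})^{n-1}h_n\bigr)\big|_A \Bigr\|_\infty \leq \bigl(\tfrac{2}{3}\bigr)^n. \tag{A.2}
\]
We set
\[
  F = \sum_{n=1}^{\infty}\bigl(\tfrac{2}{3}\bigr)^{n-1}h_n,
\]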
Measure theory is one approach to making rigorous the idea of the size (or
length, volume, and so on) of a set in an abstract setting. We refer to the
notes of Tao [105] for a good general introduction to measure theory. By
carefully controlling the complexity of the sets allowed in the theory, the
basic intuition (for example, that the volume of the disjoint union of two
sets is the sum of their volumes) can be developed into a powerful theory,
indispensable in several fields including functional analysis and probability.
The path to the definition of the Lebesgue integral starts with a discussion
about which sets (and hence which functions) are allowed in the theory.
Definition B.1. Let X be a set. A family A ⊆ P(X) of subsets of X is called
an algebra if it satisfies the following properties:
• ∅, X ∈ A;
• if A ∈ A then $A^c = X \setminus A \in A$;
• if A1 , . . . , An ∈ A then $\bigcup_{i=1}^{n} A_i \in A$;
and if, in addition,
• if A1 , A2 , · · · ∈ A then $\bigcup_{n=1}^{\infty} A_n \in A$
then A is a σ-algebra.
If A is a σ-algebra, then we call the pair (X, A) a measurable space and
the elements of A measurable sets or A-measurable sets.
It is straightforward to check that the intersection of any collection of σ-
algebras is also a σ-algebra. Hence for any family C ⊆ P(X) of subsets there is
a unique smallest σ-algebra containing C, called the σ-algebra generated by C,
and denoted σ(C). If X is a topological space, then the σ-algebra generated by
all open subsets of X is called the Borel σ-algebra, and is denoted B or B(X).
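On a finite set every algebra is automatically a σ-algebra, so the generated σ-algebra σ(C) can be computed by closing C under complements and pairwise unions. The Python sketch below (the function name is hypothetical) does this by brute force; it is only meant to make the definition tangible on small examples.

```python
def generated_sigma_algebra(X, family):
    """Return sigma(family) on the finite set X as a set of frozensets."""
    X = frozenset(X)
    sets = {frozenset(), X} | {frozenset(C) for C in family}
    changed = True
    while changed:                       # repeat until nothing new appears
        changed = False
        current = list(sets)
        for A in current:
            candidates = [X - A] + [A | B for B in current]
            for C in candidates:
                if C not in sets:
                    sets.add(C)
                    changed = True
    return sets

sigma = generated_sigma_algebra({1, 2, 3, 4}, [{1}, {2}])
print(sorted(sorted(A) for A in sigma))
```

For the family {{1}, {2}} on {1, 2, 3, 4} the atoms are {1}, {2} and {3, 4}, so the generated σ-algebra has $2^3 = 8$ elements.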
where the sum on the right-hand side may or may not converge.
\[
  \mu\Bigl(\bigcup_{n=1}^{\infty} A_n\Bigr) = \sum_{n=1}^{\infty}\mu(A_n);
\]
µ(A) = µ(A)
for any A ∈ A.
One can show rather easily that the integral defined in (B.2) is independent
of the particular description of f as a finite sum as in (B.1). For the next
definition, the analogous claim is an important step in the theory (this is
essentially the monotone convergence theorem discussed below).
\[
  0 \leq f_m \leq f_n \leq f
\]
Implicit in this definition is the fact that (on σ-finite measure spaces) any
non-negative measurable function is a pointwise limit of simple functions.
Notice also that we permit sets to have infinite measure and functions to
have infinite integral. If $f : X \to [0,\infty] = \mathbb{R}_{\geq 0} \cup \{\infty\}$, then we define
\[
  \int_X f\,d\mu = \infty
\]
The space of integrable functions forms a vector space, and the integral is
a linear function on that vector space. Moreover, the integral satisfies the
following fundamental continuity properties, each of which is a consequence
of the σ-additivity of the measure µ.
for the space of integrable functions on a measure space (X, B, µ), and define
\[
  \|f\|_1 = \int |f|\,d\mu
\]
for all $f \in L^1_\mu(X)$. It is easy to check that $\|\lambda f\|_1 = |\lambda|\,\|f\|_1$ and
\[
  \|f + g\|_1 \leq \|f\|_1 + \|g\|_1
\]
(µ × ν)(B × C) = µ(B)ν(C)
and the inequality between the left-hand side and the right-hand side of (B.5)
also holds trivially if f (x) = 0 or g(x) = 0. Integrating (B.5) over x ∈ X gives
\[
  \int |fg|\,d\mu \leq \tfrac{1}{p}\|f\|_p^p + \tfrac{1}{q}\|g\|_q^q = 1,
\]
If now $\|f+g\|_p \in (0,\infty)$ we can divide (B.6) by $\|f+g\|_p^{p/q}$ and the theorem
follows since p − p/q = 1.
However, if $\|f+g\|_p = 0$ then the inequality in the theorem is trivially
satisfied. Finally, note that
Even though measurable functions are typically very far from being continu-
ous, if we are working with a finite measure on the Borel σ-algebra of a metric
space then they are nearly continuous in the following sense.
Theorem B.17 (Lusin: near-continuity of measurable functions).
Let X be a metric space, let µ be a finite measure on the Borel σ-algebra of X,
let Y be a separable metric space, and let f : X → Y be (Borel) measurable.
Then for every ε > 0 there exists a closed set K ⊆ X with µ(X∖K) < ε
such that $f|_K$ is continuous. If X is σ-compact, then K can be chosen to be
compact.
As the proof will show, we will in essence produce the continuity of f |K
by removing very small open subsets around every possible discontinuity. To
do this we will use the following regularity property of measures on metric
spaces.
Lemma B.18 (Regularity of measures). Let (X, d) be a metric space and
let µ be a finite Borel measure on X. Then for every Borel set B ⊆ X and
every ε > 0 there exists a closed set K ⊆ X and an open set O ⊆ X with
\[
  K \subseteq B \subseteq O
\]
with µ(O∖K) < ε. The statement of the lemma is then $\mathcal{A} = \mathcal{B}$, which we will
prove in stages. By definition of the Borel σ-algebra $\mathcal{B}$, it is enough to show
that $\mathcal{A}$ is a σ-algebra containing all the open sets.
Closure under complements. Since taking complements switches open
and closed sets, $\mathcal{A}$ is closed under taking complements. Explicitly, if B lies
in $\mathcal{A}$ and for a given ε > 0 we have K ⊆ B ⊆ O as in the definition of $\mathcal{A}$,
then X∖O ⊆ X∖B ⊆ X∖K and µ((X∖K)∖(X∖O)) = µ(O∖K) < ε, which
shows that $X \setminus B \in \mathcal{A}$.
Open sets. Using the continuous distance function x ↦ d(x, A) for a closed
subset A of X from (2.29) it follows that
\[
  A = \bigcap_{n\geq 1}\underbrace{\{x \in X \mid d(x,A) < \tfrac{1}{n}\}}_{=O_n,\ \text{an open set}}
\]
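\[
  K_1 \subseteq B_1 \subseteq O_1
\]
and
\[
  K_2 \subseteq B_2 \subseteq O_2
\]
as in the definition of $\mathcal{A}$, with $\mu(O_1 \setminus K_1) < \varepsilon$ and $\mu(O_2 \setminus K_2) < \varepsilon$. Now
define K = K1 ∪ K2 , B = B1 ∪ B2 and O = O1 ∪ O2 so that K is closed, O is
open, and K ⊆ B ⊆ O. Moreover, $\mu(O \setminus K) \leq \mu(O_1 \setminus K_1) + \mu(O_2 \setminus K_2) < 2\varepsilon$,
and since ε > 0 was arbitrary we deduce that $B = B_1 \cup B_2 \in \mathcal{A}$. By induction,
the same holds for any finite union.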
Countable unions. By the steps above, $B_1 \cup \dots \cup B_n \in \mathcal{A}$ if $B_1, \dots, B_n \in \mathcal{A}$
for any n ≥ 1. Therefore, and since we are interested in the union of these
sets, we may assume that $B_1, B_2, \dots \in \mathcal{A}$ satisfy $B_n \subseteq B_{n+1}$ for all n ≥ 1.
Define $B_1' = B_1$ and $B_{n+1}' = B_{n+1} \setminus B_n \in \mathcal{A}$ for all n ≥ 1, so that
\[
  \sum_{n=1}^{\infty}\mu(B_n') = \mu\Bigl(\bigcup_{n=1}^{\infty} B_n\Bigr) < \infty
\]
by our assumption that µ is a finite measure. Therefore, for any ε > 0 there
exists some m ≥ 1 with
\[
  \sum_{n=m}^{\infty}\mu(B_{n+1}') < \varepsilon.
\]
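and
\begin{align*}
  \mu(O \setminus K) &\leq \mu(O' \setminus K) + \mu\Bigl(\bigcup_{n=m+1}^{\infty} O_n\Bigr) \\
    &< \varepsilon + \mu\Bigl(\bigcup_{n=m+1}^{\infty}(O_n \setminus B_n')\Bigr) + \mu\Bigl(\bigcup_{n=m+1}^{\infty} B_n'\Bigr) < 3\varepsilon.
\end{align*}
It follows that
\[
  \bigcup_{n=1}^{\infty} B_n \in \mathcal{A}.
\]
Conclusion. By the above $\mathcal{A}$ is a σ-algebra that contains all open sets, and
as mentioned above this forces $\mathcal{A} = \mathcal{B}$.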
\[
  (f|_K)^{-1}(U) = K \cap f^{-1}(U)
\]
Notice first that $K_n \cup X \setminus O_n$ is closed for all n ≥ 1, and so K is also closed.
Second, we have
\[
  \mu(X \setminus K) = \mu\Bigl(X \setminus \bigcap_{n=1}^{\infty}(K_n \cup X \setminus O_n)\Bigr)
    \leq \sum_{n=1}^{\infty}\mu\bigl(\underbrace{X \setminus (K_n \cup X \setminus O_n)}_{=O_n \setminus K_n}\bigr) < \varepsilon
\]
\[
  f|_K^{-1}(U_n) = K \cap f^{-1}(U_n) = K \cap O_n
\]
is an open subset of K (in the induced topology). Since this holds for all the
sets $U_n$ in the basis, it follows that $f|_K$ is continuous.
If now in addition $X = \bigcup_{n=1}^{\infty} L_n$ is a countable union of compact sets,
then $K' = K \cap \bigcup_{n=1}^{N} L_n$ satisfies the final claim of the proposition if N is
sufficiently large.
In the setting considered in this section there is a convenient formulation
of the support of a measure as follows.
Definition B.19. The support Supp µ of a Borel measure µ on the Borel σ-
algebra of a metric space X is the set of all points x ∈ X with the property
that every neighbourhood of x has positive measure.
Notice that with this definition µ(X∖Supp µ) = 0 for the spaces
considered in this section.
this, suppose that ν1 = g1 dµ1 and ν2 = g2 dµ2 are signed measures as above,
and λ1 , λ2 are scalars. Then we may define the finite measure µ = µ1 + µ2
which satisfies µ1 , µ2 ≪ µ and so dµ1 = f1 dµ and dµ2 = f2 dµ for some
non-negative functions $f_1, f_2 \in L^1_\mu(X)$. This gives the presentation dνj = gj fj dµ
for j = 1, 2, and so d(λ1 ν1 + λ2 ν2 ) = (λ1 g1 f1 + λ2 g2 f2 ) dµ defines the linear
combination λ1 ν1 + λ2 ν2 of the signed measures ν1 and ν2 .
Hints for Selected Problems
Exercise 1.3 (p. 3): One way to start the proof is to lift φ to a function φ : R → C and to
show that φ must be differentiable and satisfies a differential equation by comparing φ(x)
with
\[
  \psi(x) = \int_x^{x+\varepsilon}\varphi(t)\,dt = \int_0^{\varepsilon}\varphi(x+t)\,dt = \varphi(x)c
\]
(for ε small enough to ensure that $c = \int_0^{\varepsilon}\varphi(t)\,dt \neq 0$).
Exercise 1.7 (p. 11): Extend the given function first to an odd function on (−1, 1) and
then by periodicity to a function on R/2Z. Then use the Fourier series.
Exercise 2.7 (p. 20): For the first part consider rapidly oscillating functions, which can
have $\|\cdot\|_{C([0,1])}$ small and $\|\cdot\|_{C^1([0,1])}$ large. For the second, use the fundamental theorem
of calculus.
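As a hedged numerical illustration of the first part of this hint, the following Python sketch uses the family $f_n(x) = \sin(nx)/n$. This family is our own choice and is not taken from the text, and the $C^1$ norm is assumed here to be $\|f\|_\infty + \|f'\|_\infty$. The suprema are only approximated on a finite grid.

```python
import math

def sup_norm(g, samples=10001):
    """Approximate sup |g| on [0, 1] by sampling a uniform grid (an estimate only)."""
    return max(abs(g(i / (samples - 1))) for i in range(samples))

def oscillation_norms(n):
    """Return (approx. sup norm, approx. C^1 norm) of f_n(x) = sin(n x) / n.

    Assumption: the C^1 norm is taken to be ||f||_inf + ||f'||_inf.
    """
    f = lambda x: math.sin(n * x) / n   # hypothetical example family
    df = lambda x: math.cos(n * x)      # its derivative
    s = sup_norm(f)
    return s, s + sup_norm(df)
```

For large $n$ the first value is at most $1/n$ while the second stays at least 1, so no constant can bound the $C^1$ norm by the supremum norm.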
to conclude from strict convexity of the unit ball that $\|v\|^{-1} v = \|w\|^{-1} w$.
Exercise 2.26 (p. 29): For (a) assume that (yn ) is a sequence in Y converging to x ∈ X,
and note that (yn ) must be a Cauchy sequence. For the reverse implication in (b) assume
that (yn ) is a Cauchy sequence in Y and note that it then is also a Cauchy sequence in X.
Exercise 2.39 (p. 42): For (b) we note that in the formulation of the compactness criterion
in C0 (X) (which is not given in the exercise) an extra uniformity condition regarding decay
at infinity is necessary.
Exercise 2.43 (p. 47): Notice first that without constants the given proof cannot be
applied. Add a point xnew , forming Xnew = X ⊔ {xnew }. Extend functions in A to Xnew
by setting f (xnew ) = 0 for all f ∈ A. Define Anew = A + R1 (or A + C1) and apply
Theorem 2.40. For (a) let xnew be an isolated point of Xnew . For (b) define the topology
on Xnew so that this space is the one-point compactification of X.
Exercise 2.48 (p. 51): Recall that a Riemann integrable function is a function that can
be approximated from above and below by step functions such that the integral of the
difference is arbitrarily small. With this in mind repeat the argument for (2) =⇒ (1) to
show that (1) =⇒ (3).
Exercise 2.50 (p. 51): Express this in terms of the orbit of 0 under $t \mapsto t + \log_{10} 2$
modulo 1.
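The connection between this orbit and decimal digits can be sketched in Python, assuming (the exercise statement is not reproduced here) that the question concerns leading digits of powers of 2. The leading digit of $2^n$ is $d$ exactly when the orbit point $n\log_{10}2 \bmod 1$ lies in $[\log_{10} d, \log_{10}(d+1))$. The function names below are hypothetical, and floating-point rounding may misclassify points that land extremely close to an interval endpoint.

```python
import math

ALPHA = math.log10(2)  # rotation angle of the map t -> t + log10(2) mod 1

def orbit_point(n):
    """n-th point of the orbit of 0 under t -> t + log10(2) modulo 1."""
    return (n * ALPHA) % 1.0

def leading_digit_from_orbit(n):
    """Leading decimal digit of 2**n, read off from the orbit point.

    The digit is d when log10(d) <= t < log10(d + 1), up to rounding
    at the endpoints of these intervals.
    """
    t = orbit_point(n)
    for d in range(1, 10):
        if t < math.log10(d + 1):
            return d
    return 9

def frequency_of_digit(d, count):
    """Fraction of n in {0, ..., count - 1} whose orbit point gives digit d."""
    hits = sum(1 for n in range(count) if leading_digit_from_orbit(n) == d)
    return hits / count
```

Equidistribution of the orbit then suggests that the frequency of the leading digit $d$ approaches $\log_{10}(d+1) - \log_{10} d$.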
Exercise 2.56 (p. 56): Apply (2.31) for $\|v\| \leq 1$ and consider $L(\|v\|^{-1} v)$ for non-zero
vectors $v \in V$.
Exercise 2.61 (p. 59): Use the Cauchy integral formula to see that $E_z$ and
\[
\imath_O : V \ni f \mapsto f|_O \in C(O)
\]
are continuous. To prove injectivity, define $D_r = \{z \in \mathbb{C} \mid |z| < r\}$ for $r < 1$, $V_r = V(D_r)$
and its completion $H^p(D_r)$. Now consider the maps
\[
H^p(D) \ni f \longmapsto \bigl(\imath_{D_r}(f) \mid r < 1\bigr) \in \prod_{r<1} C(D_r)
\]
and
\[
\prod_{r<1} C(D_r) \ni (f_r \mid r < 1) \longmapsto (f_r \mid r < 1) \in \prod_{r<1} H^p(D_r),
\]
and notice that for $f \in V$ the composition of the two maps is given by
\[
V \ni f \mapsto (f|_{D_r} \mid r < 1)
\]
which satisfies $\|f\|_{H^p(D)} = \sup_{r<1} \|f|_{D_r}\|_{H^p(D_r)}$. Extend this to all $f \in H^p(D)$ and
consider now the case where $f \in H^p(D)$ satisfies $\imath_{D_r}(f) = 0$ for all $r < 1$.
Exercise 2.70 (p. 69): Using the discussion concerning (2.47) the assumption is equivalent
to f ′′ = λf with the boundary conditions f (0) = f (1) = 0.
Exercise 3.5 (p. 74): For (b) express $\langle x_1, y\rangle + \langle x_2, y\rangle$ in terms of the norm using the
definition and apply the parallelogram identity separately to the positive and the negative
parts to obtain $\tfrac12\langle x_1 + x_2, 2y\rangle$. Setting $x_2 = 0$, this gives $\langle x_1, y\rangle = \tfrac12\langle x_1, 2y\rangle$. Now consider
rational multiples of $x_1$ to prove linearity of $\langle\cdot,\cdot\rangle$. For part (c) verify first the complex
polarization identity
\[
\langle x, y\rangle = \frac14 \sum_{k=0}^{3} \mathrm{i}^k \|x + \mathrm{i}^k y\|^2
\]
for elements $x, y \in H$ of a complex inner product space $H$.
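The complex polarization identity can be checked numerically in a few lines of Python. This is only a sanity check in $\mathbb{C}^n$, assuming the inner product is linear in the first argument and conjugate-linear in the second; the helper names are hypothetical.

```python
def inner(x, y):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm_sq(x):
    """Squared norm induced by the inner product."""
    return inner(x, x).real

def polarization(x, y):
    """Right-hand side (1/4) * sum_{k=0}^{3} i^k ||x + i^k y||^2."""
    total = 0j
    for k in range(4):
        ik = 1j ** k
        total += ik * norm_sq([a + ik * b for a, b in zip(x, y)])
    return total / 4
```

For example, with `x = [1+2j, 3-1j]` and `y = [2-1j, 1j]` both `inner(x, y)` and `polarization(x, y)` agree up to rounding.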
Exercise 3.9 (p. 75): Use the polarization identity from Exercise 3.5.
Exercise 3.10 (p. 76): For (a), analyze the proof of the triangle inequality using the
equality case of the Cauchy–Schwarz inequality in Proposition 3.2. For (b) show that the
closed unit ball is strictly convex and apply Exercise 2.18.
Exercise 3.15 (p. 77): For (a), use the inequality $(a^2/(a^2 + b^2))^q + (b^2/(a^2 + b^2))^q \leq 1$.
For (b) use Jensen’s inequality.
Exercise 3.21 (p. 81): For (a) apply Corollary 3.19 to the linear functional sending y
to $B(x, y)$ for a fixed $x \in H$. For (b), notice that $\|Tx\| \geq c\|x\|$ and show that this implies
that $T(H) \subseteq H$ is closed. Finally, $x \in T(H)^{\perp}$ implies $\langle Tx, x\rangle = 0 \geq c\|x\|^2$.
Exercise 3.23 (p. 81): Either use Section 2.2.2, or define an inner product on the closure
of the image of H in the double dual.
Exercise 3.28 (p. 82): For (1) simply use the evaluation maps for every $n \in \mathbb{N}$. For (2)
use the axiom of choice to fix for every $i \in I$ a sequence $(y_m^{(i)})$ of rationals approaching $i$
and define $x_n^{(i)}$ to be equal to 1 if the $n$th rational number appears in the sequence $(y_m^{(i)})$
and to be equal to 0 otherwise. Now use (2) for proving (3), (3) for (4), and that $I$ is
uncountable to conclude.
Exercise 3.30 (p. 85): For (b) recall (from analysis or as a trivial case of Fubini’s theorem)
that if $a_{mn} \geq 0$ then $\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} a_{mn} = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} a_{mn}$ (where the sum is also
allowed to be infinity).
Exercise 3.31 (p. 85): One approach is to use the result for the case of a finite measure
space and apply Exercise 3.30.
Exercise 3.33 (p. 85): First show that $\nu$ can be identified with a linear functional
on $\mathcal{L}^{\infty}(X)$ and show that $\|\nu\|$ is precisely the operator norm. For (b) use Lemma 2.28,
Exercise B.20, and the fact that $\mu = \sum_{n=1}^{\infty} \mu_n$ defines a finite measure if each $\mu_n$ is a finite
measure for $n \geq 1$ and $\sum_{n=1}^{\infty} \mu_n(X) < \infty$.
Exercise 3.34 (p. 86): Working first with real-valued $L^2$ functions, start with the
projection operator $P : L^2_\mu(X, \mathcal{B}) \to L^2_\mu(X, \mathcal{A})$ and show that $\|P(f)\|_1 \leq \|f\|_1$.
Exercise 3.37 (p. 87): For (a), show first that the inner product is well-defined on $\bigoplus_n H_n$
and satisfies all the properties of an inner product. Then show that a Cauchy sequence
in the sum gives rise to Cauchy sequences in each $H_n$ for all $n$. For (b) use (a) and the
canonical map $(v_n) \mapsto \sum_n v_n$ from the abstract Hilbert space sum $\bigoplus_n H_n$ into $H$.
Exercise 3.48 (p. 94): Recall that $\chi_{e_1}, \ldots, \chi_{e_d}$ are sufficient to separate points on $\mathbb{T}^d$,
and that the group of characters generated by these consists of all the characters of the stated
form.
Exercise 3.49 (p. 95): The characters already appeared implicitly in Section 1.1.
Exercise 3.50 (p. 95): For (d), write $L^p(G)$ for functions on $G$ with respect to normalized
Haar measure, and $\ell^p(\widehat{G})$ for functions on $\widehat{G}$ with respect to counting measure. Notice
that $f = 1_{\operatorname{Supp}(f)} f$ so that
Exercise 3.53 (p. 95): For (a) prove first that $\mathbb{T}^{\Gamma}$ is a metric compact abelian group
and show that $G$ is a closed subgroup. For (b) notice that G can be identified with the
group of characters on $G$ and that every $\gamma_0 \in \Gamma$ defines the continuous group
homomorphism $\chi_{\gamma_0} : G \ni (z_\gamma) \mapsto e^{2\pi i z_{\gamma_0}} \in \mathbb{S}^1$, which is non-trivial by the theorem on completeness of
characters (applied to $\Gamma$). If now $\chi$ is a character on $G$, then there exists a neighbourhood $U$
of $0 \in G$ such that $\chi(U) \subseteq B^{\mathbb{S}^1}_{1/10}(1)$. By the definition of the product topology there exist
finitely many $\gamma_1, \ldots, \gamma_d \in \Gamma$ such that $H = \{(z_\gamma) \in G \mid z_{\gamma_1} = \cdots = z_{\gamma_d} = 0\} \subseteq U$.
Using that H is a subgroup and χ is a homomorphism, show that χ(H) = 1, which
shows that χ is well-defined on G/H and depends only on the coordinates zγ1 , . . . , zγd for
any (zγ ) ∈ G. Combine this with Exercise 3.48 to conclude that χ can be expressed in
terms of χγ1 , . . . , χγd .
Exercise 3.55 (p. 96): For part (b), consider the odd extension of a given function f
in L2 ((0, 1)) and apply part (a) rephrased for L2 ((−1, 1)). For part (c) consider the even
extension.
Exercise 3.56 (p. 96): Use de Moivre’s formula $(e^{2\pi i\phi})^n = \cos(2\pi n\phi) + i\sin(2\pi n\phi)$.
Exercise 3.69 (p. 106): Localize to a small open subset $B_\delta(x) \subseteq U$ by multiplying by
a function in $C_c^{\infty}(B_\delta(x))$ which is equal to 1 on $B_{\delta/2}(x)$. Treat the new localized function
as an element on $\mathbb{T}^2$. Now generalize Theorem 3.57 to give an inequality concerning (and
as a result, the existence of) ∂1 ∂2 f at x. This exercise should become easier after reading
Theorem 5.6.
Exercise 3.72 (p. 107): Do this via a familiar sequence of approximations, first for indic-
ator functions of measurable sets, then for simple functions, then for non-negative functions
by monotone convergence, and finally for all integrable functions.
Exercise 3.76 (p. 110): First use the case $p = 1$ to see that
\[
\Bigl\{ x \in X \Bigm| \int_G \phi(g) f(g^{-1}\cdot x)\,dm_G(g) \neq 0 \Bigr\}
\]
\[
\int_G \phi(g) f_1(g^{-1}\cdot x)\,dm_G(g) = \int_G \phi(g) f_2(g^{-1}\cdot x)\,dm_G(g)
\]
for almost every $x \in X$. Use this to see that $\phi * f$ is well-defined for an equivalence class of
functions $f \in L^{\infty}_\mu(X)$.
Exercise 3.77 (p. 110): For the continuity requirement approximate (vn ) by a finitely
supported vector (v1 , . . . , vk , 0, . . .) for some k > 1 and then use continuity of the unitary
representations π1 , . . . , πk .
Exercise 3.82 (p. 113): Prove the same statements first for the Riemann sums.
Exercise 3.86 (p. 115): Use Proposition 3.83 for the first part. For part (b) go through
the proof of Lemma 3.75 to see that it also works for a measure $\nu$. Then take a second
function $f' \in L^2_\mu(X)$ and apply Fubini’s theorem to $\langle f', \nu * f\rangle$ (and similarly to $\langle f', \phi * f\rangle$).
Exercise 3.90 (p. 118): Use integration by parts just as in the proof of Theorem 3.57 to
bound χn π∗ f uniformly on compact subsets of R2 .
Exercise 3.92 (p. 119): Repeat the argument for Lemma 3.59(1).
Exercise 3.93 (p. 119): For $\mu, \nu \in \mathcal{M}(G)$ and any Borel measurable $B \subseteq G$ define
\[
\mu * \nu(B) = \iint 1_B(gh)\,d\mu(g)\,d\nu(h).
\]
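To make the definition concrete, here is a small Python sketch for the finite cyclic group $\mathbb{Z}/n\mathbb{Z}$, written additively so that $gh$ becomes $g + h \bmod n$. The choice of group and the representation of a measure as a list of point masses are assumptions made only for illustration.

```python
def convolve(mu, nu):
    """Point masses of mu * nu on Z/nZ: mass at k sums mu({g}) nu({h}) over g + h = k mod n."""
    n = len(mu)
    out = [0.0] * n
    for g in range(n):
        for h in range(n):
            out[(g + h) % n] += mu[g] * nu[h]
    return out

def measure_of(mu, B):
    """Measure of a subset B of Z/nZ."""
    return sum(mu[k] for k in B)

def double_integral(mu, nu, B):
    """The defining double integral of 1_B(g + h) against d mu(g) d nu(h)."""
    n = len(mu)
    return sum(mu[g] * nu[h]
               for g in range(n) for h in range(n)
               if (g + h) % n in B)
```

On a finite group both routes give the same number, for instance `measure_of(convolve(mu, nu), B) == double_integral(mu, nu, B)` for `mu = [0.5, 0.5, 0, 0]`, `nu = [0, 1, 0, 0]` and `B = {1, 2}`.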
Exercise 4.4 (p. 123): See Example 8.56 for the counter-example.
for n > 1.
so that $f$ is continuous at $x$ if and only if $\operatorname{osc}_f(x) = 0$. Show that the set
is open in $L^1((0, 1))$ for all $0 \leq a < b \leq 1$ and $n \geq 1$, and show that
\[
D_{(a,b),n} = \bigcup_{(c,d):\,a<c<d<b} O_{(c,d),n}
\]
is open and dense. Now take the intersection over $(a, b) \in \mathbb{Q}^2$ with $a < b$ and $n \in \mathbb{N}$.
Exercise 4.20 (p. 128): Consider for every $n \in \mathbb{N}$ the set $B_n^+$ of functions $f$ in $C([0, 1])$
with the property that there exists some $x \in [0, \tfrac12]$ such that
\[
\Bigl| \frac{f(x+h) - f(x)}{h} \Bigr| \leq n
\]
for all $h \in (0, \tfrac12]$. Use compactness of $[0, \tfrac12]$ to show that each $B_n^+$ is closed. Show
that $C([0, 1]) \setminus B_n^+$ is dense, for example by using piecewise linear functions. Repeat the
argument considering difference quotients for $x \in [\tfrac12, 1]$ and $h \in [-\tfrac12, 0)$ to define $B_n^-$.
Conclude that $C([0, 1]) \setminus \bigcup_{n \in \mathbb{N}} (B_n^+ \cup B_n^-)$ is dense and consists of functions that are nowhere
differentiable.
Exercise 4.23 (p. 130): Use the argument from the proof of Lemma 4.22.
Exercise 5.11 (p. 142): Recall first that $C_c(U) \subseteq L^p(U)$ is dense by Proposition 2.51.
Given some $f \in C_c(U)$ choose some open $V$ with compact closure $\overline{V} \subseteq U$. Now apply the
Stone–Weierstrass theorem (in the form of Exercise 2.43(b)) to $C_c^{\infty}(V) \subseteq C_0(V)$.
Exercise 5.15 (p. 143): Describe the relationship between the Fourier coefficients of f
and of ∂ α f and use Lemma 5.4. Alternatively, convolve with a suitable version of ε from
Exercise 5.17 and show that the resulting smooth (or, using Exercise 5.17, L2 ) function
actually approximates f with respect to the norm on H k (Td ).
Exercise 5.16 (p. 144): Show by induction that the derivatives of $\psi$ for $t < 0$ are of the
form $p(\tfrac1t)\psi(t)$ for some real-valued polynomials $p$ and show that such functions converge
to 0 as $t \nearrow 0$. Use this and the mean value theorem for differentiation to show that all
derivatives of $\psi$ at $t = 0$ vanish.
Exercise 5.17 (p. 144): For (a) use Exercise 5.16. For (b) argue as in the proof of
Theorem 3.54. For (c), differentiate under the integral (which may be justified by dominated
convergence). In (e) the appropriate convergence is with respect to k · kp , which can be
obtained using the density of Cc (U ) ⊆ Lp (U ), Lemma 3.75, and parts (b) and (d).
Exercise 5.18 (p. 144): Localize the function f ∈ C(U ) to a small set using a smooth
function of compact support (for example, replace f with f (x)ε (x − x0 ) for ε from Ex-
ercise 5.17) and consider it as a smooth function on Td (see also Lemma 5.36). Then
use (3.18), $\sum_{j=1}^{d} n_j^{2k} \asymp \|n\|_2^{2k}$ for $n \in \mathbb{Z}^d$ and $k \geq 1$, Lemma 5.4, and Theorem 5.6.
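The comparability $\sum_{j} n_j^{2k} \asymp \|n\|_2^{2k}$ used here can be made explicit: for $k \geq 1$ one has $d^{1-k}\|n\|_2^{2k} \leq \sum_{j=1}^d n_j^{2k} \leq \|n\|_2^{2k}$, the lower bound being an instance of Jensen's inequality. The constants $d^{1-k}$ and 1 are our own bookkeeping, not quoted from the text, and the following Python sketch merely checks them on a box of lattice points using exact integer arithmetic.

```python
import itertools

def power_sum(n, k):
    """sum_j n_j^(2k) for an integer vector n."""
    return sum(abs(c) ** (2 * k) for c in n)

def euclid_power(n, k):
    """||n||_2^(2k) = (sum_j n_j^2)^k for an integer vector n."""
    return sum(c * c for c in n) ** k

def check_comparability(d, k, radius):
    """Check d^(1-k) ||n||^(2k) <= sum_j n_j^(2k) <= ||n||^(2k) on {-radius..radius}^d."""
    for n in itertools.product(range(-radius, radius + 1), repeat=d):
        s, e = power_sum(n, k), euclid_power(n, k)
        # Multiply the lower bound through by d^(k-1) to stay within the integers.
        if not (e <= d ** (k - 1) * s and s <= e):
            return False
    return True
```

A finite check of this kind is of course no proof, but it helps confirm that the implied constants depend only on $d$ and $k$.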
Exercise 5.19 (p. 144): Note that for λ ∈ (0, 1) the function f λ (x) = f (λx) is defined on a
slightly larger version of the set U . Let ε > 0 and recall the function ε from Exercise 5.17.
Show that the restriction of fj ∗ ε is the weak ej -partial derivative of f ∗ ε on the
set Uε = {x ∈ U | x + Bε ⊆ U }. Now choose some λ < 1 sufficiently close to 1 and ε > 0
sufficiently small and show that the smooth function f λ ∗ ε is defined on U , its restriction
is close to f in L2 , and that the same applies to the weak partial derivatives.
Exercise 5.25 (p. 146): Either convolve with an approximate identity (that is, with ε
from Exercise 5.17) or show that the functions $f_n = \min\{n, f\}$
all lie in $H^1(B_{1/2})$.
Exercise 5.27 (p. 146): For (a) (and (b)) consider first functions in $C^{\infty}(U) \cap H^k(U)$
(respectively $C_c^{\infty}(V)$). For (c) consider for instance $d = 1$, $V = (0, \tfrac12) \subseteq U = (0, 1)$, find
some $\chi \in C_c^{\infty}(U)$ with $\chi(\tfrac12) \neq 0$ and show that $\chi|_V \in H^1(V) \setminus H^1_0(V)$ using the argument
in Example 5.20.
Exercise 5.30 (p. 147): Use the regular map φ to pull back any function
f0 ∈ C ∞ (U ) ∩ H 1 (U )
(or f0 ∈ H 1 (U )) to an element
Exercise 5.35 (p. 150): Here elements of function spaces on the closed cube are defined
to have the claimed degree of smoothness in the interior of the cube, and in addition have
the property that all the claimed partial derivatives extend continuously to the closure.
For (a) apply the trace operator in Example 5.28 (see Exercise 5.29) for every $\alpha \in \mathbb{N}_0^{d-1}$
with $\|\alpha\|_1 \leq k - 1$ to see that the map
\[
H^k(U) \ni f \mapsto \partial^{\alpha} f|_{S_y} \in L^2(S_y)
\]
Exercise 5.37 (p. 151): Choose ε > 0 such that K + B3ε ⊆ U . Let ε be the function
from Exercise 5.17 and consider ε ∗ 1K+Bε .
Exercise 5.39 (p. 152): Apply the arguments behind Lemma 5.36 and Theorem 5.34 using
some fixed χ ∈ Cc∞ (U ) with χ|K ≡ 1.
Exercise 5.52 (p. 161): Set $K_j = V_j \setminus \bigcup_{i \neq j} V_i$ for $j = 0, \ldots, k$ and apply Lemma A.28 to
find a continuous partition of unity. Combine this with Exercise 5.17 to obtain the smooth
partition of unity.
Exercise 5.54 (p. 163): Average the function over large balls with different centres and
use Proposition 5.53 and the boundedness assumption to estimate the difference between
the values at the two centres.
Exercise 5.57 (p. 165): Assume first either that U is a set of the form U ∩ Bε z (0) or
is convex as in Definition 5.31. Show that for φ ∈ C ∞ (U ) we have
so g satisfies the usual integration by parts formula but even for φ ∈ C ∞ (U ). Then for λ > 1
show that the function $g^{\lambda}$ defined by
\[
g^{\lambda}(x) = \begin{cases} g(\lambda x) & \text{for } \lambda x \in U, \\ 0 & \text{for } \lambda x \notin U \end{cases}
\]
is in $H^1_0(U)$ (for example, by using similar arguments to Exercise 5.19), take $\lambda \searrow 1$, and
conclude that $g \in H^1_0(U)$. Finally, use Lemmas 5.40, 5.41, and $\Delta g = 0$. For more general
sets as in Definition 5.31 use a smooth partition of unity to localize g to sets of the
form U ∩ Bε z (0) (without destroying the feature that g vanishes in the square-mean
sense at the boundary).
Exercise 6.1 (p. 167): For (a) note that an eigenvalue would have absolute value one and
the eigenvector would have to be a sequence with constant absolute value. For (b) consider
geometric sequences.
Exercise 6.6 (p. 169): Use Hölder’s inequality and the Arzela–Ascoli theorem to prove
compactness for $p > 1$. For $p = 1$ compactness fails as one sees from studying a
sequence $(f_n)$ of positive functions with integral one and support $[\tfrac12 - \tfrac1n, \tfrac12 + \tfrac1n]$ for $n \geq 3$.
Exercise 6.9 (p. 170): The special case of k = 0 in (a) is treated in detail in Lemma 6.58
and the method there also works for general k > 0. For (b) use the Arzela–Ascoli theorem
(Theorem 2.38) in the case of compact closure and consider a fixed non-zero function and
all its shifts in the case of U = R. In (c) the answer is negative, for example because the
closure of the image of the unit ball contains all characters.
Exercise 6.12 (p. 171): Use the first part of the proof of Proposition 6.11, Proposi-
tion 2.51, and Lemma 6.7.
Exercise 6.16 (p. 174): Consider the images under $K$ of the functions $f_n = 1_{[3n,3n+1]}$
for $n \geq 1$, all of which have $L^2$ norm one.
Exercise 6.24 (p. 176): Set $V = U(H)^{\perp}$ and show that $U^n(V) \perp U^m(V)$ for all
integers $m, n$ with $0 \leq n < m$. Define $H_{\mathrm{shift}} = \bigoplus_{n \geq 0} U^n V$ and show that
\[
H_{\mathrm{unitary}} = H_{\mathrm{shift}}^{\perp} = \bigcap_{n \geq 0} U^n H
\]
Exercise 6.29 (p. 178): Expand φ and f in terms of the orthonormal basis of The-
orem 6.27, and compare coefficients.
Exercise 6.33 (p. 181): In both cases show that $\bigcap_{n \in J} \ker(A_n - \lambda_n I)$ is invariant
under $A_1, A_2, \ldots$ for any choice of $\lambda_n$ and any choice of index set $J \subseteq \mathbb{N}$. Show moreover
that this intersection is finite-dimensional if $\lambda_1 \neq 0$. Now apply Theorem 6.27 to $A_1$ and
to $A_n$ restricted to the eigenspaces of $A_1$.
Exercise 6.34 (p. 182): To prove the second inequality (6.14) first take the linear hull V0
of the eigenvectors v1 , . . . , vk corresponding to the first k positive eigenvalues (assume first
that there are at least k positive eigenvalues) and calculate the minimum. Then let W be
the linear hull of V0⊥ and the k-th eigenvector vk (also belonging to V0 ), and note that
any k-dimensional subspace V will intersect W non-trivially.
Exercise 6.45 (p. 192): Consider the self-adjoint compact operator A∗ A and apply The-
orem 6.27. Using that basis, define P such that P 2 = A∗ A, and define Q so that A = QP .
Exercise 6.47 (p. 193): Calculate the trace-class norm for k a character (it will be 1) and
use absolute convergence of Fourier series (Theorem 6.47).
Exercise 6.49 (p. 195): For a fixed compact set $Y \subseteq X$ use the proof of Proposition 6.48
to show that $\int_Y |k(x, x)|\,d\mu(x) \leq \|K\|_{\mathrm{tc}}$. Conclude that $k$ is integrable along the diagonal.
Fix an increasing sequence of compact sets with $X = \bigcup_{n \geq 1} Y_n$. For every $n$ consider a
sequence of partitions $(\xi_{n,\ell})_{\ell \geq 1}$ of $Y_n \setminus Y_{n-1}$ as in the proof of Proposition 6.48 (where
we set $Y_0 = \emptyset$). Use this sequence (by enumerating $\mathbb{N}^2$ in some fashion) to define an
orthonormal basis. To conclude, use in addition Lemma 6.41.
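Exercise 6.50 (p. 195): For (a), suppose that $(A_k)$ is a Cauchy sequence with respect
to $\|\cdot\|_{\mathrm{tc}}$. Since $\|\cdot\|_{\mathrm{op}} \leq \|\cdot\|_{\mathrm{tc}}$ we have $\lim_{k \to \infty} A_k = A \in B(H)$. For given $k, \ell, N \geq 1$ and
any list of orthonormal vectors $(v_n)_{n=1,\ldots,N}$ and $(w_n)_{n=1,\ldots,N}$ we have
\[
\sum_{n=1}^{N} |\langle (A_k - A_\ell) v_n, w_n \rangle| \leq \|A_k - A_\ell\|_{\mathrm{tc}}.
\]
For $k = 1$ this shows $A \in \mathrm{TC}(H)$, and taking $k \to \infty$ then gives $\|A_k - A\|_{\mathrm{tc}} \to 0$.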
Exercise 6.52 (p. 195): Let $(v_n)$ and $(w_n)$ be lists of orthonormal vectors, and notice that
\[
\sum_{n=1}^{N} |\langle A v_n, w_n \rangle| = \sum_{n=1}^{N} \Bigl| \int_T \langle A_t v_n, w_n \rangle\,d\mu(t) \Bigr| \leq \int_T \|A_t\|_{\mathrm{tc}}\,d\mu(t)
\]
for all $N > 0$. This gives the first claim; the argument for the trace of $A$ is similar.
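Exercise 6.53 (p. 195): Part (a) follows quickly from the identity
\[
|\langle A e_j, e_k \rangle| = |\langle A^* e_k, e_j \rangle|
\]
for all $j, k$. For (b), let $(f_n)$ be a different orthonormal basis and note that
\[
\sum_{k \geq 1} |\langle A e_j, e_k \rangle|^2 = \|A e_j\|^2 = \sum_{k \geq 1} |\langle A e_j, f_k \rangle|^2,
\]
and so
\[
\sum_{j,k \geq 1} |\langle A e_j, e_k \rangle|^2 = \sum_{j,k \geq 1} |\langle A e_j, f_k \rangle|^2 = \sum_{j,k \geq 1} |\langle e_j, A^* f_k \rangle|^2.
\]
Arguing similarly one can also replace $e_j$ by $f_j$. For (c), suppose first that $B = U$ is unitary
and apply Lemma 6.38 to conclude the argument. For (d) define
\[
\langle A_1, A_2 \rangle_{\mathrm{HS}} = \sum_{j,k \geq 1} \langle A_1 e_j, e_k \rangle \overline{\langle A_2 e_j, e_k \rangle}
\]
and show that the $(a_{mn}) \in \ell^2(\mathbb{N}^2)$ correspond precisely to operators $A \in \mathrm{HS}(H)$ by setting
\[
A\Bigl(\sum_{j \geq 1} c_j e_j\Bigr) = \sum_{k \geq 1} \Bigl(\sum_{j \geq 1} a_{jk} c_j\Bigr) e_k
\]
(Proposition 6.11 with $X = Y = \mathbb{N}$ shows this is a well-defined bounded operator). For (e)
suppose $A, B \in \mathrm{HS}(H)$ and calculate
\[
\|AB\|_{\mathrm{HS}}^2 = \sum_{j,k \geq 1} |\langle AB e_j, e_k \rangle|^2 = \sum_{j,k \geq 1} |\langle B e_j, A^* e_k \rangle|^2
\leq \sum_{j,k \geq 1} \|B e_j\|^2 \|A^* e_k\|^2 = \|B\|_{\mathrm{HS}}^2 \|A\|_{\mathrm{HS}}^2
\]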
by part (a) and (b) and its proof. For (f) assume that $H$ is infinite-dimensional, define $B e_n$
to be $n^{-1/2} e_n$ and show that $B \notin \mathrm{HS}(H)$. For (g) and (h) apply Proposition 6.11.
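Two of the claims in this hint, namely that the sum $\sum_{j,k} |\langle A e_j, e_k\rangle|^2$ does not depend on the orthonormal basis and that $\|AB\|_{\mathrm{HS}} \leq \|A\|_{\mathrm{HS}} \|B\|_{\mathrm{HS}}$, can be sanity-checked in the finite-dimensional real case. The Python sketch below works in $\mathbb{R}^2$ with hypothetical sample matrices and a rotated basis; it illustrates the statements and is not the argument the hint has in mind.

```python
import math

def apply(A, v):
    """Matrix-vector product A v for a real matrix given as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dot(u, v):
    """Euclidean inner product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))

def matmul(A, B):
    """Matrix product A B for real matrices given as lists of rows."""
    cols = list(zip(*B))
    return [[dot(row, col) for col in cols] for row in A]

def hs_norm_sq(A, basis):
    """sum_{j,k} <A e_j, e_k>^2 over a real orthonormal basis (e_j)."""
    return sum(dot(apply(A, e), f) ** 2 for e in basis for f in basis)

def rotated_basis(theta):
    """Orthonormal basis of R^2 obtained by rotating the standard one."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]

STANDARD = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0], [3.0, 4.0]]   # sample matrix, chosen arbitrarily
B = [[0.0, 1.0], [1.0, 1.0]]   # sample matrix, chosen arbitrarily
```

Here `hs_norm_sq(A, STANDARD)` and `hs_norm_sq(A, rotated_basis(0.7))` agree up to rounding, and `hs_norm_sq(matmul(A, B), STANDARD)` does not exceed the product of the two squared norms.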
Exercise 6.54 (p. 196): Let $A, B \in \mathrm{HS}(H)$ and let $(v_n)$ and $(w_n)$ be two orthonormal
lists. Then
\[
\sum_{n=1}^{N} |\langle AB v_n, w_n \rangle| \leq \sum_{n=1}^{N} \|B v_n\| \|A^* w_n\| \leq \Bigl(\sum_{n=1}^{N} \|B v_n\|^2\Bigr)^{1/2} \Bigl(\sum_{n=1}^{N} \|A^* w_n\|^2\Bigr)^{1/2}
\]
shows that $\|AB\|_{\mathrm{tc}} \leq \|A\|_{\mathrm{HS}} \|B\|_{\mathrm{HS}}$ by Exercise 6.53(a) and (b) (and its proof above).
For (b) suppose first that $P$ is positive, self-adjoint and trace-class, and find $A \in \mathrm{HS}(H)$
with $P = A^2$. Then apply Exercise 6.45.
Exercise 6.55 (p. 196): For (a), show that for any $w \in H$ the map $H \ni v \mapsto \langle v, w \rangle_0$
is a bounded linear operator that depends semi-linearly on $w$. Conclude that it must be
of the form $\langle v, w \rangle_0 = \langle v, Aw \rangle_H$ for a bounded operator $A$. Use the properties of $\langle\cdot,\cdot\rangle_0$ to
show that $A$ is positive and self-adjoint. For (b) recall that $H \ni f \mapsto f(x)$ is a bounded
functional and show that $A$ as in (a) is of the form $A(v) = \langle v, v_x \rangle v_x$ for some $v_x \in H$.
For (c) show that $\sup_{x \in K} \|v_x\|$ is finite for all compact subsets $K$ of $U$ (for example, by
analyzing the arguments leading to Theorem 5.34).
Exercise 6.62 (p. 199): Apply the argument behind Theorem 5.45 to prove that
Exercise 6.63 (p. 201): For (a), differentiate under the integral sign to express Jn′ and Jn′′
as integrals. Simplify
x2 Jn′′ (x) + (x2 − n2 )Jn
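Exercise 6.66 (p. 204): Given $f(x) = \sin(\pi R^{-1} n_1 x_1) \cdots \sin(\pi R^{-1} n_d x_d)$, for $x \in (0, R)^d$
and $f(x) = 0$ for $x \in \mathbb{R}^d \setminus (0, R)^d$, define
\[
f_\lambda(x) = f\bigl((\tfrac{R}{2}, \ldots, \tfrac{R}{2}) + \lambda(x - (\tfrac{R}{2}, \ldots, \tfrac{R}{2}))\bigr)
\]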
for λ > 1 and fe = fλ ∗ ε (cf. Exercise 5.17; also see the proof of Corollary 8.47).
Exercise 6.70 (p. 208): Use ∆fn = λn fn and f ∈ Cc∞ (U ) to first show that
for any k > 1. Then fix some compact K ⊆ U and use Exercise 6.62 to bound kfn kK,∞ in
terms of |λn |, whose growth rate we know.
Exercise 7.5 (p. 212): Construct the complement as a kernel of a linear map, using the
Hahn–Banach theorem.
Exercise 7.12 (p. 214): Consider a dense countable subset $\{\ell_1, \ell_2, \ldots\}$ of $X^*$ and choose
for every $\ell_n$ some $x_n \in X$ with $\|x_n\| = 1$ and $|\ell_n(x_n)| \geq \|\ell_n\|/2$. Now take the $\mathbb{Q}$-linear
(or $\mathbb{Q}(i)$-linear) hull of $\{x_n\}$, which is countable, and show that it is dense.
Exercise 7.15 (p. 218): After establishing linearity over C choose θ ∈ C with |θ| = 1
and θ LIM((an )) = LIM((θan )) > 0 to prove that the complex extension has norm one.
Exercise 7.26 (p. 222): If H is abelian and finitely generated, then H is a quotient
of some Zd and so has Følner sequences. Use this for finitely generated subgroups of a
countable abelian group G to find a Følner sequence for G.
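For orientation, the Følner property can be observed numerically for the boxes $F_n = \{0, \ldots, n-1\}^2$ in $\mathbb{Z}^2$: the proportion of $F_n$ that is moved by a fixed translation tends to 0. The Python sketch below assumes the Følner condition is phrased as $|F_n \,\triangle\, (g + F_n)| / |F_n| \to 0$, which may differ in form from the definition used in the text; the function names are hypothetical.

```python
def box(n):
    """The box F_n = {0, ..., n-1}^2 as a set of lattice points."""
    return {(a, b) for a in range(n) for b in range(n)}

def folner_ratio(n, g):
    """|F_n symmetric-difference (g + F_n)| / |F_n| for a translation g in Z^2."""
    F = box(n)
    shifted = {(a + g[0], b + g[1]) for (a, b) in F}
    return len(F ^ shifted) / len(F)
```

For the translation `g = (1, 0)` the ratio equals `2 / n`, so it can be made as small as desired by enlarging the box.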
Exercise 7.27 (p. 222): One approach is to construct a box-like (not cube-like) Følner
sequence. An alternative is to write the group as a semi-direct product and use Proposi-
tion 7.20.
Exercise 7.28 (p. 223): Emulate the strategy used to show that a free group is not
amenable in Example 7.22.
Exercise 7.30 (p. 223): For (a), let $m_G$ be a finitely additive left-invariant mean on $G$.
Let $x_0 \in X$ and $B \subseteq X$ and define $m_X(B) = m_G(\{g \in G \mid g \cdot x_0 \in B\})$. For (b), note
that setting $X = \mathbb{R}^2$ does not immediately work in order to prove that (a) implies (b), as
one would have $m_X(K) = 0$ for any bounded set $K$ in $\mathbb{R}^2$. Instead, use $m_X$ as in (a) to
construct a finitely additive function $m$ defined on all bounded sets by setting
\[
m(B) = c_n m_X(2^{n+1}\mathbb{Z}^2 + B)
\]
for any subset $B$ of $[-2^n, 2^n)^2$ and define $c_n > 0$ so that $m([0, 1)^2) = 1$, and show that
the definition does not depend on $n$.
Exercise 7.45 (p. 240): Check the claim first for open subsets, and then argue along the
lines used to prove Proposition 2.51.
Exercise 7.49 (p. 241): For (a) note that X has a countable base {Un } for the topology,
and write every clopen set as a union of finitely many Un . For (b) use this to construct an
injective continuous map from $X$ to $\{0, 1\}^{\mathbb{N}}$.
Exercise 7.53 (p. 248): Apply Theorem 7.44 to obtain a locally finite measure representing
the restriction of $\Lambda$ to $C_0(X)$. Assuming that $\mu(X) = \infty$, find some function $f \in C_0(X)$
for which $\int_X f\,d\mu = \infty$, and then use positivity to obtain a contradiction. Finally, show
that µ represents Λ on all of C0 (X) by showing that Λ is necessarily bounded.
Exercise 7.57 (p. 252): Combine the argument in Section 7.4.4 with Theorem 7.54.
Exercise 7.58 (p. 252): For (a), notice that $\ell^{\infty}(\mathbb{N})$ can be embedded into the Banach
space $\mathcal{L}^{\infty}(X)$ using the subset $\{\tfrac1n \mid n \in \mathbb{N}\}$. Now extend the Banach limit from $\ell^{\infty}(\mathbb{N})$
to $\mathcal{L}^{\infty}(X)$ and show that it does not arise from a signed measure on $X$. For (b), if $f$ is a
non-measurable bounded function $X \to \mathbb{R}$ then $f$ induces a linear functional on the space
\[
\{\mu \in \mathcal{M}(X) \mid |\mu|(B) = 0 \text{ for all } B \subseteq X \setminus D \text{ for some countable set } D \subseteq X\},
\]
since for each such measure one can define $\int f\,d\mu$ as a countable sum. Now extend this
functional to all of $\mathcal{M}(X)$.
Exercise 8.6 (p. 255): To see that (1) is necessary apply Theorem 4.1.
Exercise 8.9 (p. 256): For (b) prove $(A^*)^{-1} N_{x_1,\ldots,x_n;\varepsilon}(A^* y_0^*) = N_{Ax_1,\ldots,Ax_n;\varepsilon}(y_0^*)$.
Exercise 8.12 (p. 258): For (a) use the Baire category theorem (Theorem 4.12). For (b)
assume that the neighbourhoods of the form Nx1 ,...,xn ;1/n (0) form a basis of the weak*
topology neighbourhoods of 0 ∈ X ∗ and conclude that X is the linear hull of {x1 , x2 , . . . }
by using the same argument as in the proof of Lemma 8.13.
Exercise 8.14 (p. 259): Apply Exercises 7.11–7.12 to reduce to the separable case.
Exercise 8.15 (p. 259): Suppose that there is a sequence that converges weakly but not
in norm. Show that this implies that there is a sequence $(f_n)$ in $\ell^1(\mathbb{N})$ such that $\|f_n\|_1 = 1$
for all $n \geq 1$ but for which $f_n$ converges weakly to 0 as $n \to \infty$. Use this to
construct a strictly increasing sequence of natural numbers $(I_j)$ and a subsequence $(f_{n_j})$ such
that $\sum_{k=1}^{I_{j-1}} |f_{n_j}(k)| \leq \tfrac15$ and $\sum_{k=I_j+1}^{\infty} |f_{n_j}(k)| \leq \tfrac15$ for all $j \geq 1$, where we set $I_0 = 0$.
Using this partition, construct an element $h$ in $\ell^{\infty}(\mathbb{N})$ for which $\sum_{k=1}^{\infty} f_{n_j}(k) h(k)$ does
not converge to 0 as $j \to \infty$.
Exercise 8.21 (p. 261): For every $g \in G$ consider the map $L_g : C_{\mathbb{R}}(G) \to C_{\mathbb{R}}(G)$ defined
by $(L_g f)(x) = f(gx)$. Show that
\[
\{\Lambda \in C(G)^* \mid \Lambda = \Lambda \circ L_{g_1} = \cdots = \Lambda \circ L_{g_n},\ \Lambda \geq 0,\ \Lambda(1) = 1\}
\]
is a closed non-empty subset of the unit ball in $C(G)^*$ for any $g_1, \ldots, g_n \in G$. To see that
these sets are non-empty, use induction and suppose that $\Lambda_0$ belongs to the set defined
by $g_1, \ldots, g_{n-1} \in G$. Then any weak* limit of $\frac{1}{K} \sum_{k=0}^{K-1} \Lambda_0 \circ L_{g_n}^k$ will belong to the set
defined by $g_1, \ldots, g_n \in G$. See also Exercise 8.37 and the discussion there.
Exercise 8.22 (p. 262): For both parts of the exercise, let (vn ) be any sequence in H
with kvn k 6 1, assume without loss of generality that vn → v ∈ H as n → ∞ in the weak*
topology, and recall Exercise 8.6. Use compactness of $A$ to prove that $\|AA^*(v_n - v)\| \to 0$
as $n \to \infty$ and consider
\[
\|A^*(v_n - v)\|^2 = \langle A^*(v_n - v), A^*(v_n - v) \rangle = \langle v_n - v, AA^*(v_n - v) \rangle.
\]
Exercise 8.23 (p. 262): Show first that SH is weak* closed and non-empty. Us-
ing Theorem 8.10 deduce that the intersection is only empty if some finite intersec-
tion SH1 ∩ · · · ∩ SHn is empty. However, H = H1 + · · · + Hn is another finitely generated
subgroup and so SH1 ∩ · · · ∩ SHn = SH is non-empty.
Exercise 8.24 (p. 262): Give G the discrete topology, so that by amenability there is a
Banach limit in (ℓ∞ (G))∗ . Restrict this to C(G) and deduce the existence of a translation-
invariant measure from the Riesz representation theorem.
Exercise 8.26 (p. 262): Suppose without loss of generality that $x_0^* = 0$. Now apply weak*
compactness to the weak* closed subsets $B_s^{X^*} \cap K$ for $s > \inf_{k \in K} \|k\|$.
Exercises 8.32–8.33 (p. 264): Both exercises require the generalization of Section 2.3.3
to Td .
Exercise 8.40 (p. 268): For ergodicity, use Fourier series as in the proof of Lemma 8.38.
Exercise 8.44 (p. 272): Apply Exercise 8.39 to obtain weak* convergence on the set
V + Bs/2 .
Exercise 8.59 (p. 292): The uniform operator topology is the only topology that has
neighbourhoods that are bounded with respect to the operator norm. If xn in X and yn
in Y ∗ have norm one for all n > 1, then
\[
L \longmapsto \sum_{n=1}^{\infty} \frac{1}{2^n} y_n^*(L x_n)
\]
is a continuous functional on B(X, Y ) and so also continuous with respect to the weak
topology. Choosing the sequence (xn ) carefully makes this functional not continuous with
respect to the strong, nor the weak, operator topology. Finally, notice that for the strong
operator topology and x ∈ Xr{0} there exists a neighbourhood, namely Nx;1 (0), such
that {Lx | L ∈ Nx;1 (0)} ⊆ Y is bounded while there is no such neighbourhood in the weak
operator topology.
Exercise 8.62 (p. 293): Apply the Hahn–Banach lemma (Lemma 7.1).
Exercise 8.67 (p. 295): For (c) suppose that ℓ : MF([0, 1])∗ → C is continuous and linear.
Suppose $\varepsilon > 0$ is chosen so that $f \in U_\varepsilon(0)$ implies that $|\ell(f)| < 1$. Given any $f \in \mathrm{MF}([0, 1])$
use a partition of $[0, 1]$ to split $f$ into a finite sum $f = \sum_{k=1}^{n} f_k$ such that $\lambda f_k \in U_\varepsilon(0)$ for
all $\lambda \in \mathbb{C}$.
Exercise 8.75 (p. 300): Look at the proof of Theorem 7.3 to see how to obtain a complex-
linear functional from a real-linear functional.
Exercise 8.76 (p. 300): Without loss of generality we may assume that 0 is an interior
point of $K$. Fix some $y_0 \in L$ so that 0 is an interior point of $M = K - L + y_0$. Since $K$
and $L$ are disjoint, $K - L$ cannot contain 0 and $M$ does not contain $y_0$. Now let $g$ be the
gauge function of $M$ so $g(y_0) \geq 1$. Define $f(\lambda y_0) = \lambda g(y_0)$ for all scalars $\lambda$. Extend $f$ to
the whole space with $f(x) \leq g(x)$ for all $x$ using the Hahn–Banach lemma, and notice
that $f(x) \leq 1$ for all $x \in M$ and $f(y_0) \geq 1$.
Exercise 8.77 (p. 300): Let $r = \inf_{x \in K} \|z - x\|$. Hence for every $\varepsilon > 0$ there is an $x_0 \in K$
such that $v = z - x_0$ satisfies $\|v\| < r + \varepsilon$ and hence for $\ell \in X^*$ with $\|\ell\| = 1$ we have
For the converse set L = Br (z) and apply Exercise 8.76 for K and L.
Exercise 8.78 (p. 300): Consider the compact convex set $K = \overline{\imath(B_1^X)}$ with the
closure taken with respect to the weak* topology, assume that $\ell \in B_1^{X^{**}} \setminus K$, and apply
Theorem 8.73 using the weak* topology on $X^{**}$.
Exercise 8.84 (p. 304): For (a) use the Arzela–Ascoli theorem. For (c) use piecewise
linear functions as in (b) to approximate a given function f ∈ K. For (d) use the fact that
any f ∈ K is almost everywhere differentiable with derivative in [−1, 1].
Exercise 8.87 (p. 304): Notice that two possible barycentres of µ cannot be separated
by X ∗ .
Exercise 8.89 (p. 306): To see that the set of barycentres is closed, show that
Exercise 8.91 (p. 306): Use induction on the dimension n. If x0 is a boundary point,
then there exists a hyperplane V that contains x0 with the property that K lies in one of
the closed half-spaces with boundary V . If x0 is an interior point, take any extreme point y
and find a boundary point z such that x0 is in the line segment from y to z.
Exercise 8.93 (p. 311): One direction is clear using Theorem 8.90. For the other direction,
suppose that $\mu$ represents $x_0$ and $\mu \neq \delta_{x_0}$. Then there exists some $y$ in $\operatorname{Supp}(\mu) \setminus \{x_0\}$,
a linear functional $\ell \in X^*$, and an open neighbourhood $U$ of $y$ with $\mu(U) \in (0, 1)$
and $\sup_{z \in U} \ell(z) < \ell(x_0)$. Now use $\mu = \lambda \frac{1}{\mu(U)} \mu|_U + (1 - \lambda) \frac{1}{\mu(K \setminus U)} \mu|_{K \setminus U}$ with $\lambda = \mu(U)$
and the existence of barycentres (Lemma 8.88) to see that $x_0$ is not extreme.
Exercise 9.9 (p. 319): For (a) the precise condition is $f(x) \neq 0$ for $\mu_v$-almost every $x$.
Simplifying the notation, assume that $H = L^2(\mathbb{T}, \mu)$. If $f$ vanishes on a set $B$ of positive
measure, then clearly $H_f \perp 1_B$. So suppose that $f \neq 0$ almost everywhere. Then clearly $gf \in H_f$
for $g$ any character, hence for $g$ any trigonometric polynomial, hence for $g \in C(\mathbb{T})$, hence
for $g = 1_O$ for any open set $O$ by dominated convergence, and finally for $g = 1_G$ for
any $G_\delta$-set $\bigcap_{n \geq 1} O_n$. Since any measurable set coincides modulo $\mu$ with a $G_\delta$-set, we may
apply dominated convergence once again to obtain the case $g \in L^{\infty}_\mu(\mathbb{T})$. Apply this to
the function defined by $g_n = \frac{1}{f} 1_{\{x \in \mathbb{T} \mid |f(x)| > 1/n\}}$ to obtain $1_{\{x \in \mathbb{T} \mid |f(x)| > 1/n\}} \in H_f$ for
all $n \geq 1$ and conclude that $1 \in H_f$. In (b) the spectral measure is given by the Lebesgue
measure.
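Exercise 9.10 (p. 320): For (a) note first that $\sum_{n=0}^{\infty} d_n \bigl(\sum_{k=1}^{\infty} c_k z^k\bigr)^n = z$ for sufficiently
small $z$, and as an identity in the ring $\mathbb{C}[[z]]$ of formal power series. Using the assumption
that $\sum_{n=0}^{\infty} |d_n| \bigl(\sum_{k=1}^{\infty} |c_k| \|A\|^k\bigr)^n < \infty$ we see first that
\[
g(f(A)) = \sum_{n=0}^{\infty} d_n \Bigl(\sum_{k=1}^{\infty} c_k A^k\Bigr)^n \approx \sum_{n=0}^{N} d_n \Bigl(\sum_{k=1}^{K} c_k A^k\Bigr)^n
\]
\[
\sum_{n=0}^{N} d_n \Bigl(\sum_{k=1}^{K} c_k A^k\Bigr)^n = A + \sum_{\ell=\min\{N,K\}+1}^{NK} e_{N,K,\ell} A^{\ell},
\]
where the last sum can be made arbitrarily small if $N$ and $K$ are sufficiently large.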
and prove that Hρ can be expressed as the direct sum of certain subspaces of L2 (T, µwn ).
Exercise 9.15 (p. 323): For (a), multiply by the square root of the Radon–Nikodym
derivative of one measure with respect to the other. Using (a), we may modify the measures
in Corollary 9.13 to satisfy µn = µ1 |Bn for some nested sequence of Borel sets
B1 = T ⊇ B2 ⊇ · · · ,
hence (b) follows by repeating that argument. For (c), show that f ∈ L2 (T, ν1 ) satisfies
the property defining H(1) . For the converse note that we can define in a measurable way
for any u ∈ Cn with n ≥ 2 or u ∈ ℓ2(N) a vector u′ in the same space with u′ ⊥ u and
with ‖u′‖ = ‖u‖, for example by first projecting onto C2 ⊆ Cn ⊆ ℓ2(N) and there using
the orthogonal direction or a suitable multiple of the first basis vector if the projection
is zero. Now consider a general function F = (f1, f2, . . . , f∞) with fn : (T, νn) → Cn
and f∞ : (T, ν∞) → ℓ2(N). If fn ≠ 0 for some n ∈ {∞, 2, 3, . . . } then, using the argument
above, construct a function fn′ with fn(x) ⊥ fn′(x) and with ‖fn(x)‖ = ‖fn′(x)‖ for νn-almost
every x. Set fm′ = 0 for all m ≠ n and conclude that F does not satisfy the property
defining H(1), showing the reverse inclusion. For (d) argue in a similar way: For F given
by (0, f2, 0, . . . , 0), define F2 by rotating f2 and show the defining property for H(2). We
note again that for n ≥ 3 we can measurably define for every u1, u2 ∈ Cn or in ℓ2(N) a
vector u′ orthogonal to both u1 and u2 with ‖u′‖ = ‖u1‖. Using this argue as before.
Exercise 9.19 (p. 326): Fix some v ∈ H and describe the unitary operator U on Hv by a
multiplication operator on L2 (S1 , µv ). Now calculate the spectral measures µv,w = µv,P (w)
in that context, where P : H → Hv is the orthogonal projection.
Exercise 9.23 (p. 327): For Rα note that the characters are eigenfunctions. For A show
that a character is mapped to a character but that the orbit of any non-trivial character is
infinite, and then apply Lemma 9.12.
Exercise 9.33 (p. 334): For (c), apply the Stone–Weierstrass theorem (see Exercise 2.43).
For (d), first approximate g simultaneously in L1(Rd) and L2(Rd) by some function f0
in Cc(Rd). Then approximate $e^{\pi \|x\|^2} f_0(x)$ by some function f1 ∈ A with respect to ‖·‖∞,
and notice that $f(x) = e^{-\pi \|x\|^2} f_1(x)$ will then approximate g with respect to ‖·‖1 and ‖·‖2.
For (e) consider f1, f2 ∈ A and express the inner product in the form
$$\langle f_1, f_2 \rangle = \int f_1 \overline{f_2} \, dx = \widehat{f_1 \overline{f_2}}(0)$$
Exercise 9.41 (p. 340): Show that the four-fold Fourier transform of a function is again
the original function, and apply the argument used in Section 1.1 (or Theorem 3.80). Also
consider the function f in Example 9.27 together with λx0 f and products of such functions
to prove that all four possible eigenvalues appear.
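A finite analogue of the first step can be checked numerically. The sketch below uses the unitary discrete Fourier transform on Z/8Z (an illustrative stand-in chosen here, not the transform on L2(Rd) from the exercise) and verifies that its fourth power is the identity, so that any eigenvalue λ must satisfy λ^4 = 1.

```python
import cmath

def dft_matrix(n):
    """Unitary DFT matrix with entries exp(-2*pi*i*j*k/n) / sqrt(n)."""
    scale = 1 / n ** 0.5
    return [[scale * cmath.exp(-2j * cmath.pi * row * col / n) for col in range(n)]
            for row in range(n)]

def matmul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]

def max_deviation_from_identity(m):
    """Largest absolute entry of m - I."""
    n = len(m)
    return max(abs(m[i][j] - (1 if i == j else 0))
               for i in range(n) for j in range(n))

F = dft_matrix(8)
F2 = matmul(F, F)    # the reflection x -> -x on Z/8Z, not the identity
F4 = matmul(F2, F2)  # the four-fold transform
second_power_error = max_deviation_from_identity(F2)
fourth_power_error = max_deviation_from_identity(F4)
```

Here `fourth_power_error` is zero up to rounding, while `second_power_error` is not, reflecting that the two-fold transform is a reflection rather than the identity.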
Exercise 9.47 (p. 342): Consider the associated (well-defined) function g in C∞(Td)
defined by $g(x) = \sum_{n \in \mathbb{Z}^d} f(n + x)$.
Exercise 9.48 (p. 342): First show that we can approximate any function in Lp (R) by
a function of compact support, so it is enough to approximate a compactly supported
function f ∈ Lp(R) (so that $\widehat{f} \in C^\infty(\mathbb{R})$). Let $h_1 \in C_c^\infty(\mathbb{R})$ be a non-trivial real-valued
function with h1(x) = h1(−x), define h to be h1 ∗ h1, multiply by a scalar so that h(0) = 1,
and set hr(x) = h(rx) for all r > 0. Now prove that $\widehat{h_r} * f \to f$ in Lp(R) as r → ∞ using
Jensen’s inequality as in the proof of Theorem 9.39.
Exercise 9.50 (p. 343): Use the condition for equality in the Cauchy–Schwarz inequality
to deduce that f must satisfy a differential equation of the form f ′ (x) = λxf (x) for
some λ ∈ R.
Exercise 9.51 (p. 343): Notice that if for f we have equality, then we have equality in
Exercise 9.50 for $g(x) = e^{-2\pi i x t_0} f(x + x_0)$.
Exercise 9.55 (p. 345): For (a) let C[G] be the space of finitely supported complex-valued
measures and use p to define a semi-inner product on C[G] with
for all g, h ∈ G. Then show that πh (δg ) = δhg extends to a unitary representation on
the completion of C[G] modulo the kernel of the semi-norm induced by the semi-inner
product. For (b) assume that p is extreme and the unitary representation in (a) is reducible,
decompose the generator into the components corresponding to an invariant subspace and
its orthocomplement, and study the matrix coefficient of these three vectors.
Exercise 9.63 (p. 352): For (b) notice that this is equivalent to $\widehat{C_c^\infty(\mathbb{R})}$ being dense
in $\operatorname{Graph}(A) = \{(f, g) \in L^2(\mathbb{R}) \times L^2(\mathbb{R}) \mid g(t) = t f(t) \text{ for } t \in \mathbb{R}\}$. For this improve the
argument for Exercise 9.48 (also see its hint on p. 577) by proving that
$$M_I(\widehat{h_r} * f) = (M_I \widehat{h_r}) * f + \widehat{h_r} * (M_I f)$$
and showing that $(M_I \widehat{h_r}) * f \to 0$ in L2(R) as r → ∞. For (d) extend Proposition 9.43 to
weak derivatives.
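The product-type identity for a convolution in the hint for (b) can be sketched as follows, under the assumption that M_I denotes multiplication by the identity function I(t) = t, writing h for the first factor of the convolution, and assuming that all integrals converge absolutely. The idea is to split t = (t − s) + s inside the convolution integral:

```latex
\begin{aligned}
\bigl(M_I (h * f)\bigr)(t)
  &= \int_{\mathbb{R}} t \, h(t-s) f(s) \, ds \\
  &= \int_{\mathbb{R}} (t-s) \, h(t-s) f(s) \, ds + \int_{\mathbb{R}} h(t-s) \, s f(s) \, ds \\
  &= \bigl((M_I h) * f\bigr)(t) + \bigl(h * (M_I f)\bigr)(t).
\end{aligned}
```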
Exercise 10.4 (p. 360): Set $B_1 = f^{-1}\bigl(\{z \in \mathbb{C} \mid \Re(z) \in [\frac{k}{n}, \frac{k+1}{n}),\ \Im(z) \in [\frac{\ell}{n}, \frac{\ell+1}{n})\}\bigr)$ for
some k, ℓ ∈ Z and n ∈ N, and set B2 = G∖B1. Combine the assumption in the exercise
and Lemma 10.3 to conclude that mG (B1 ) = 0 or mG (B2 ) = 0. Vary k, ℓ, n to conclude
the proof.
Exercise 10.5 (p. 360): For (a) show that θ∗ mG (B) = mG (θ −1 B) defines a left Haar
measure and use Proposition 10.2. For the continuity in (b) let B = K be a fixed compact
set and use the regularity of the measure mG . For (c) use the substitution formula
$$\int f \circ \theta \, dm_G = \int f \, d\theta_* m_G.$$
since ψn vanishes outside Un . For the convolution on the right a different argument is
needed as follows. Write
$$f * \psi_n(g) = \int_G f(h) \, \psi_n(\underbrace{h^{-1} g}_{= k^{-1}}) \, dm_G(h) = \int_G f(gk) \, \psi_n(k^{-1}) \, dm_G(k),$$
using the substitution gk = h. From here (depending on how much one wishes to assume
about the sequence of functions) one could assume that each ψn is symmetric in the sense
that ψn (g) = ψn (g −1 ), or use the fact that the modular character is itself a continuous
function so the difference between integrating against ψn (k −1 ) and ψn (k) for k ∈ Un is
small. To deduce the result for f ∈ L1 (G) use the usual approximation arguments.
Exercise 10.20 (p. 373): Use Proposition 7.20(a) and the Følner condition. For the
converse use the same argument as in Exercise 8.23.
Exercise 10.21 (p. 373): For (1) notice that the proof that (3) =⇒ (1) in Theorem 10.15
only uses finite sets. For (2) show, for example, that a function f in P(G) satisfying Reiter’s
condition in Definition 10.14 for a compact K ⊆ G and ε > 0 also satisfies a topological
version for all f0 ∈ P(G) that vanish outside of K. Use this to induce a left-invariant
mean that is also topologically left-invariant.
Exercise 10.22 (p. 373): For (1) show that G × G is amenable. Then use the left-invariant
mean M2 on G × G to define $M(\phi) = M_2((g_1, g_2) \mapsto \phi(g_1 g_2^{-1}))$. For (2) convolve
a function f as in Reiter’s condition with its flipped version $\tilde{f}(g) = f(g^{-1})$ and show
that $f * \tilde{f}$ satisfies Reiter’s condition for left- and right-multiplication.
Exercise 10.23 (p. 373): If G is discrete and uncountable, then combine the following
with the conclusion in Exercise 10.20. So assume that G is σ-compact, locally compact,
and metric. For (1) =⇒ (2), define for f ∈ C(X) the functional Λ(f) = M(g ↦ f(gx))
for some left-invariant mean M on G and some x ∈ X. For (2) =⇒ (3) one would like to
use the action of G on K to find an invariant measure µ and then apply Lemma 8.88 to
find a G-invariant barycentre of µ in K. However, as K is not assumed to be metrizable
this requires a small work-around as follows. Let ‖·‖1, . . . , ‖·‖m be a finite collection of
semi-norms on V and write $G = \bigcup_{n=1}^\infty G_n$ with Gn compact and Gn contained in $G_{n+1}^o$
are finite for any 1 ≤ k ≤ m and n ≥ 1 and are compatible with the topology on V.
Define V0 = {v ∈ V | ‖v‖k,n = 0 for all 1 ≤ k ≤ m and n ≥ 1}, set W = V/V0 and
define p : V → W by p(v) = v + V0 . Equip W with the collection of quotient semi-norms
induced by k · kk,n and show that p is continuous. Show that G acts continuously on W
and that p is equivariant for the G-action. By applying the argument for the metrizable
case outlined above, show that the set
and the linear action λ∗g for the left regular representation λg : L∞ (G) −→ L∞ (G). If G
is discrete this is possible, but in general this does not define a continuous affine action of
the sort considered in (3). For this reason, define the subspace
of left uniformly continuous functions on G, and its dual space X = (LUC(G))∗ with the
topology induced by the semi-norms $\|M\|_{K,\phi} = \sup_{g \in K} |M(\lambda_g \phi)|$ for M ∈ (LUC(G))∗,
where K ⊆ G is a non-empty compact subset and φ ∈ LUC(G). Show that on any bounded
subset B of (LUC(G))∗ the topology induced by these semi-norms agrees with the weak*
topology on B. Show that the action λ∗g for g ∈ G on X satisfies the assumptions of (3)
and deduce that there exists a left-invariant mean on LUC(G). Use the argument from
the proof of Lemma 10.17 and the step (1) =⇒ (3) in Theorem 10.15 to complete the
argument.
Exercise 10.24 (p. 374): For (1), show that the Reiter condition for G implies the Reiter
condition for H. Let f ∈ Cc (G) ∩ P(G) satisfy the Reiter condition for ε > 0 and a
finite K ⊆ H. Define the space X = H\G with the usual map g ↦ Hg from G → X, and
the probability measure ν on X by
$$\nu(B) = \int_G 1_B(Hg) f(g) \, dm(g).$$
is defined and bounded above by |K|ε. Show that F(h0 g) = F(g) for every h0 in H (even
if H is not unimodular), that $\nu\bigl(\{Hg \mid \int f(hg) \, dm_H(h) = 0\}\bigr) = 0$, and choose a compact
subset L ⊆ H so that f(g) > 0 and f(hg) > 0 (or $f(k^{-1}hg) > 0$) implies h ∈ L. Then
$$\begin{aligned}
\int_X F(Hg) \, d\nu &= \int_G F(Hg) f(g) \, dm_G(g) \\
&= \frac{1}{m_H(L)} \sum_{k \in K} \int_G \int_L F(Hg) f(hg) \, dm_H(h) \, dm_G(g) \\
&= \frac{1}{m_H(L)} \sum_{k \in K} \int_G \int_L |f(k^{-1}hg) - f(hg)| \, dm_H(h) \, dm_G(g) \\
&= \sum_{k \in K} \|\lambda_k f - f\|_1 < |K| \varepsilon.
\end{aligned}$$
Finally, use Exercise 10.21. For (2) use Exercise 10.23 (either (2) or (3)).
Exercise 10.28 (p. 374): For (1), fix a generating set and use metric open balls of
increasing radius to define a Følner sequence.
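The ball construction can be tried out numerically in the simplest case. The sketch below takes G = Z² with the generating set {(±1, 0), (0, ±1)} (an illustrative choice, not taken from the exercise) and uses closed balls of integer radius r in the word metric, which coincide with the open balls of radius r + 1. For each generator s it computes |sB Δ B| / |B| and reports the largest ratio.

```python
GENERATORS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # assumed generating set of Z^2

def ball(radius):
    """Closed ball of the given radius around 0 in Z^2 for the word metric of GENERATORS."""
    return {(x, y)
            for x in range(-radius, radius + 1)
            for y in range(-radius, radius + 1)
            if abs(x) + abs(y) <= radius}

def folner_ratio(radius):
    """Largest value of |sB symmetric-difference B| / |B| over the generators s."""
    b = ball(radius)
    worst = 0.0
    for dx, dy in GENERATORS:
        shifted = {(x + dx, y + dy) for (x, y) in b}
        worst = max(worst, len(shifted ^ b) / len(b))
    return worst

ratios = [folner_ratio(r) for r in (1, 5, 25)]
```

In this example the ball of radius r has 2r² + 2r + 1 elements while each shift changes only 2(2r + 1) of them, so the ratios decrease to zero as the radius grows, which is the Følner property along the sequence of balls.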
Exercise 10.30 (p. 375): Take the closed convex hull K of {πg v | g ∈ G} and apply
Theorem 3.13 with v0 = 0.
Exercise 10.35 (p. 376): Apply the definitions and Exercises 10.7 and 10.30.
Exercise 10.37 (p. 377): For any finite subset F ⊆ G define the subgroup HF = ⟨F⟩ generated
by F. Note that G acts on the quotient space G/HF and also unitarily on ℓ2(G/HF).
Consider the direct product representation of G on $\bigoplus_F \ell^2(G/H_F)$.
Exercise 10.47 (p. 384): Combine Theorem 10.38 and Proposition 10.41.
Exercise 10.53 (p. 388): Given a function f ∈ L2 (G) show that for almost every g ∈ G
the function fg : Γ ∋ γ ↦ f(gγ) belongs to ℓ2(Γ). Define φ : L2(G) → HG by φ(f)(g) = fg
for all g ∈ G.
Exercise 10.64 (p. 404): Show that the characteristic function of a ‘connected compon-
ent’ is also an eigenfunction for eigenvalue one, and use the level sets of a non-constant
eigenfunction for the converse.
Exercise 10.67 (p. 406): Combine the argument after Proposition 10.41 with division
with remainder in Z.
Exercise 11.4 (p. 410): For λ = 0, use Corollary 4.30 (or Exercise 6.25) to see that Mg
is invertible if and only if µ({0}) = 0 (respectively, g is non-zero µ-almost everywhere)
and z ↦ 1/z (resp. 1/g) is essentially bounded with respect to µ.
Exercise 11.7 (p. 410): For (a) use the isomorphism between ℓ2 (Z) and L2 (T) provided
by Fourier series (Theorem 3.54). For (b), you may show that A is isomorphic (as a Banach
algebra) to the algebra generated by S with S as in Exercise 6.1(b).
Exercise 11.9 (p. 411): Use Lemma 2.67 and Theorem 11.6.
Exercise 11.12 (p. 413): Recall from Section 2.4.2 that multiplication is continuous.
Exercise 11.18 (p. 417): Use the C∗-property of the norm to first show ‖a‖ ≤ ‖a∗‖ for
all a ∈ A.
Exercise 11.20 (p. 417): Start with the identity $1_A^* 1_A = 1_A 1_A^* = 1_A^*$, apply the star
operator and then use the C∗-property.
Exercise 11.36 (p. 425): For (a), combine Proposition 11.21, Corollary 11.29, and
Exercise 2.43(b). For (c), use (a) and the fact that C0(σ(A)) ⊕ C ≅ C(σ(A) ∪ {∞}).
Exercise 11.41 (p. 428): The Banach algebra of limits of absolutely convergent Fourier
series with pointwise multiplication is isometrically isomorphic to ℓ1(Zd) with convolution.
Apply Theorem 11.23 and Proposition 11.38 to Zd and $\widehat{\mathbb{Z}^d} \cong \mathbb{T}^d$.
Exercise 11.42 (p. 428): Show first that if G is a locally compact metrizable abelian group
then
V = ⟨λy f | y ∈ G⟩
Exercise 12.10 (p. 435): For the description of σresid (T ), prove that
$$(\operatorname{im}(T - \lambda I))^\perp = \ker\bigl(T^* - \overline{\lambda} I\bigr)$$
and then use this together with an explicit description of T ∗ (see Exercises 6.23(c)
and 6.1(b)).
Exercise 12.17 (p. 437): Since the kernel of I − A is the eigenspace of A for eigenvalue 1,
almost injectivity follows directly from compactness of A (see, for example, Exercise 3.40).
The proof that im(I − A) is closed is a little more involved. Assume first that
(I − A)vn = vn − Avn → w
as n → ∞ with vn ∈ ker(I − A)⊥. Show that (vn) is bounded (for example, by assuming
that ‖vn‖ → ∞ as n → ∞ and applying compactness for $v_n' = \|v_n\|^{-1} v_n$). Finally, use
compactness of A to conclude w ∈ im(I − A). To prove that T = I − A is almost surjective
assume that V = (T(H))⊥ is infinite-dimensional, and let (vn) be an orthonormal basis of V
so that ⟨vn, vn − Avn⟩ = 0 for all n ≥ 1. Now choose a subsequence (vnk) with Avnk → w
as k → ∞ and derive a contradiction.
Exercise 12.18 (p. 437): For the first direction assume that T is Fredholm, let H1
be (ker(T))⊥, H2 be im(T), and use Proposition 4.25 to show that T|H1 : H1 → H2
has a bounded inverse. Define $S|_{H_2}$ to be $(T|_{H_1})^{-1}$ and $S|_{H_2^\perp} = 0$. For the converse
apply Exercise 12.17 to the compact operators ST − I and TS − I.
Exercise 12.21 (p. 438): Use Fourier series and the isomorphism ℓ2(Z) ≅ L2(T).
Exercise 12.27 (p. 442): The base cases n = 0 and n = 1 hold trivially by definition.
For n ≥ 1 consider
$$\frac{1}{\sqrt{p}} S\bigl(U_n(f)\bigr)(v) = \frac{1}{\sqrt{p}} \sum_{v' \sim v} U_n(f)(v') = \frac{1}{p^{(n+1)/2}} \sum_{v' \sim v} \ \sum_{\substack{k \leq n, \\ k \equiv n \ (\mathrm{mod}\ 2)}} \ \sum_{w \sim_k v'} f(w)$$
and count how often the term f(w) appears in this sum, distinguishing between the
cases d(w, v) = n + 1 and d(w, v) ≤ n − 1.
Exercise 12.29 (p. 442): Use the addition formula for $\sin\bigl((n + 1)\theta + \theta\bigr)$.
Exercise 12.30 (p. 442): For (a) use the operator $U_{2n} - \frac{1}{p} U_{2n-2}$ and Cauchy–Schwarz
on the finite set $\{w \mid w \sim_{2n} v\}$. For (b) define $\tilde{f}(v) = (-1)^{d(v, v_0)} f(v)$ for a fixed vertex v0.
For (c) treat the case θ = 0 first. If θ > 0 recall first that p ≥ 2 and use Exercise 12.29
and (b) to deduce that it is enough to show that there are infinitely many n with
$$|\sin((2n + 1)\theta)| \geq \tfrac{1}{2} + \varepsilon$$
for some fixed ε > 0. Note that this holds, for example, if $(2n + 1)\theta \in \pi\mathbb{Z} + [\frac{\pi}{4}, \frac{3\pi}{4}]$. Now
consider the following three cases: If θ ≤ π/4, then every closed interval of length π/2 is visited
by the rotation on R/(πZ) by 2θ infinitely often. If θ = π/2 − φ for some φ ∈ (0, π/4], then
$$(2n + 1)\theta + \mathbb{Z}\pi = \tfrac{\pi}{2} - (2n + 1)\phi + \mathbb{Z}\pi$$
and the same argument applies. In the only remaining case θ = π/2
works.
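The recurrence claim in (c) can be explored numerically. The sketch below counts, for a few sample angles θ chosen here for illustration, the indices n below a cutoff with |sin((2n + 1)θ)| ≥ 1/2 + ε. The value ε = 0.2 is an assumption made for the experiment; it is admissible because |sin x| ≥ √2/2 > 0.7 whenever x lies in πZ + [π/4, 3π/4].

```python
import math

def good_indices(theta, eps, n_max):
    """Indices 0 <= n < n_max with |sin((2n+1)*theta)| >= 1/2 + eps."""
    return [n for n in range(n_max)
            if abs(math.sin((2 * n + 1) * theta)) >= 0.5 + eps]

eps = 0.2  # assumed slack; sqrt(2)/2 is about 0.707, which exceeds 1/2 + eps
thetas = [0.1, math.pi / 4, math.pi / 2 - 0.3, math.pi / 2]  # sample angles
counts = [len(good_indices(theta, eps, 1000)) for theta in thetas]
```

The sample angles cover the three cases of the hint: θ ≤ π/4, θ = π/2 − φ with φ ∈ (0, π/4], and θ = π/2, where every index qualifies because the sine has absolute value one. A finite count is of course only consistent with, not a proof of, the statement about infinitely many n.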
Exercise 12.32 (p. 444): Use multiplication by $\bigl(\frac{d\mu}{d\nu}\bigr)^{1/2}$.
Exercise 12.54 (p. 456): Adapt the argument from Section 9.1.2. If H is not separable,
combine these arguments with Zorn’s lemma.
Exercise 12.57 (p. 458): Apply the polar decomposition from Exercise 12.48 to find an
isometry U : H1 → H2 and a positive self-adjoint operator A ∈ B(H1 ) with T = U A.
Deduce that U and A are bijective and show Aπ1 (g) = π1 (g)A and U π1 (g) = π2 (g)U for
all g ∈ G.
Exercise 12.58 (p. 458): For (a), consider the self-adjoint operator A = B ∗ B with
for all g ∈ G. If σ(A) contains more than one point, then there exist two non-zero func-
tions f1 , f2 ∈ C(σ(A)) such that f1 f2 = 0, which implies that V = ker(f1 (A)) is a closed
proper subspace. Then show that V is invariant under π1 (g) for all g ∈ G. For (b), apply
the same argument to $\frac{B + B^*}{2}$ and $\frac{B - B^*}{2i}$.
Exercise 12.59 (p. 458): By Exercise 9.55(b) all that remains is to show that irreducibility
of the unitary representation implies extremality. Suppose therefore that πφ : G ↷ Hφ
and φ = λφ1 + (1 − λ)φ2 for some λ ∈ (0, 1) and φ1, φ2 ∈ P(G). Construct π1 : G ↷ H1
with generator v1 and π2 : G ↷ H2 with generator v2 using Exercise 9.55(a) so that
Exercise 12.61 (p. 461): For (a) and the first part of (b) notice that ı is continuous
by definition of the weak* topology. For the example in (b) set T2 = f (T1 ) for some
function f ∈ C(σ(T1 )), or consider a measure µ on σ(T1 ) × σ(T2 ) whose support projects
surjectively onto each coordinate and define both operators as multiplication operators, or
use, for example, two diagonal 3-by-3 matrices T1 , T2 , each with two different eigenvalues
such that T1 T2 has 3 different eigenvalues.
Exercise 12.65 (p. 462): Show that dµv,w = f0 dµv satisfies (12.15) for all a ∈ A.
Exercise 12.69 (p. 467): For (a) assume that µ(U × N) = 0 for some non-empty open
set U ⊆ σ(A). Use some non-zero f ∈ Cc(U) ↪ C(σ(A)) ≅ A to derive a contradiction.
For (b), notice that continuity of π follows from the definition of the weak* topology.
For (c), write $\widehat{a}$ for the Gelfand transform of a ∈ A when considered as an element of A′
and show that $\widehat{a} = a^o \circ \pi$ for all a ∈ A. Now use the characterizing property of spectral
measures. For (d) use (c). For (e), note that π(σ(A′ )) ⊆ σ(A) is compact and that by (c)
we have Supp(µv,w ) ⊆ π(σ(A′ )) for all v, w ∈ H. Now apply (a).
Exercise 12.72 (p. 468): Note that ST = T S. By (FC5) this gives ST 1/n = T 1/n S,
where T 1/n is defined as in Corollary 12.42. Apply Theorem 12.60 to the C ∗ -algebra
generated by I, S and T 1/n to realize both as multiplication operators Mg resp. Mh
on L2(X, µ) for a finite measure space (X, µ) and two positive functions g, h in $L^\infty_\mu(X)$
with $g^n = h^n$ µ-almost everywhere.
Exercise 12.75 (p. 470): Consider first the case B1 ⊆ B2 and show that in this
case im ΠB1 ⊆ im ΠB2 using the argument in the proof of Lemma 12.74.
Exercise 12.77 (p. 472): First deal with simple functions using the properties of a
projection-valued measure and Exercise 12.75.
Exercise 12.78 (p. 472): It suffices to consider the case f = 0. Fix v ∈ H and
show that µv(B) = ⟨ΠB v, v⟩ for B ∈ B defines a finite measure on X. Then show
that $\bigl\| \int_X f_n(\lambda) \, d\Pi_\lambda v \bigr\|^2 = \bigl\langle \int_X |f_n|^2(\lambda) \, d\Pi_\lambda v, v \bigr\rangle = \int_X |f_n|^2 \, d\mu_v$, and apply dominated
convergence.
Exercise 12.82 (p. 477): In both contexts the cyclic subspace is the minimal invariant
closed subspace containing a given v ∈ H and so it suffices to show that the notions of
invariance are equivalent. It is easy to verify using only the definition of convolution that
a closed subspace that is invariant under the unitary representation is invariant under
convolution. To see the converse, use the same approximation argument as in the proof of
Corollary 12.81.
Exercise 12.88 (p. 481): Show that $L^1(G) \subseteq C_0(\widehat{G})$ is a subalgebra that is closed under
Exercise 12.90 (p. 483): For (a) suppose that gn → g0 and tn → t0 as n → ∞. Recall
from Proposition 11.43 that the topology on $\widehat{G}$ can be defined by uniform convergence on
compact sets, and apply this to K = {gn | n ≥ 1} ∪ {g0}. For (b), notice first that (a) implies
that $\imath(g) : \widehat{G} \to S^1$ is continuous. Moreover, uniform continuity of ⟨·, ·⟩ restricted to K × L
for some compact subset $L \subseteq \widehat{G}$ shows that ı(gn)|L → ı(g0)|L uniformly as n → ∞. By
Proposition 11.43 this shows that ı(gn) → ı(g0) and so $\imath : G \to \widehat{\widehat{G}}$ is continuous. For (c)
approximate f1, f2 ∈ L2(G) by f1′, f2′ ∈ Cc(G) and notice that ⟨λg f1′, f2′⟩ = 0 once g is
outside a certain compact subset. For (d), apply Theorem 12.85 to see that $M_{g_n} f \to 0$ in
the weak topology as n → ∞ for $f \in L^2(\widehat{G})$ and a second time to see that $\widehat{\widehat{\lambda}}_{\imath(g_n)} f \to 0$
weakly as n → ∞ for $f \in L^2(\widehat{\widehat{G}})$. Now use continuity of the unitary representation $\widehat{\widehat{\lambda}}$ of $\widehat{\widehat{G}}$
on $L^2(\widehat{\widehat{G}})$ to conclude that ı(gn) → ∞ as n → ∞.
Exercise 12.93 (p. 484): For (a) notice that a character on G/H can be lifted to G using
composition with the quotient map G → G/H. For (b) use Theorem 12.84 on G/H to
show that (H⊥)⊥ ⊆ H. For (c) apply (a) to $\widehat{G}/H^\perp$.
Exercise 12.94 (p. 484): For (b) suppose first that θ has dense image and conclude
from $\widehat{\theta}(\chi_t) = 1$ for some $t \in \widehat{G}$ that t = 0. For the converse, use Exercise 12.93 to find a
non-trivial character $t \in \widehat{G} \cap (\operatorname{im} \theta)^\perp$ if im θ is not dense in G.
Exercise 12.95 (p. 485): For the isomorphism between the dual group of the product
and the direct sum of the dual groups show that the elements of the direct sum define
characters and that these separate points.
Exercise 12.96 (p. 485): By definition $\varprojlim (G_n, \phi_n)$ is a subgroup of $\prod_{n \geq 1} G_n$. Combine
Exercise 12.93 and Exercise 12.95.
Exercise 12.97 (p. 485): Use Exercise 12.96 and Pontryagin duality.
with $1 \in H^1(\mathbb{T}) \setminus H_0^1((0, 1))$ and $I \in H^1((0, 1)) \setminus H^1(\mathbb{T})$ where I(x) = x for x ∈ (0, 1).
Use Fourier series to show that Tp = −Tp∗ . Use the definition of weak derivatives to show
that T = −T0∗ is the weak derivative on H 1 ((0, 1)).
Exercise 13.10 (p. 494): Show first that (im(B))⊥ = {0} and deduce that $B^{-1}$ is densely
defined. To see that $B^{-1}$ is self-adjoint prove that $\langle B^{-1} u, v \rangle = \langle u, B^{-1} v \rangle$ for u, v ∈ im(B)
and that $\langle B (B^{-1})^* u, v \rangle = \langle u, v \rangle$ for any $u \in D_{(B^{-1})^*}$ and v ∈ H.
Exercise 13.11 (p. 494): To see that in (a) and (b) there are no other eigenfunctions than
the given ones, use elliptic regularity (Theorem 5.34 and Example 5.20) to conclude that
the eigenfunctions satisfy certain differential equations with boundary conditions.
Exercise 13.13 (p. 494): For (a) assume first that there exists a bound N on the number
of neighbours and show that Tinitial(f)((v1, v2)) = f(v1) and Tterminal(f)((v1, v2)) = f(v2)
for any f ∈ L2(V) and $(v_1, v_2) \in \vec{E}$ defines a pair of bounded operators
$$T_{\mathrm{initial}}, T_{\mathrm{terminal}} : L^2(V) \longrightarrow L^2(\vec{E})$$
with T = Tterminal − Tinitial. For the converse consider functions f = δvn so that the
vertex vn ∈ V has more than n neighbours. In (b) the operator T∗ is defined on a subset
of $L^2(\vec{E})$ and maps g ∈ DT∗ to $T^*(g)(v) = \sum_{w \sim v} \bigl(g(w, v) - g(v, w)\bigr)$ for all v ∈ V.
Exercise 13.14 (p. 494): In (a), show that DT∗ is defined by Kirchhoff’s law: That is, a
function
$$f \in L^2(Q) = \bigoplus_{e \in \vec{E}} L^2(S_e)$$
at every vertex v ∈ V. For (b) argue as in Section 6.4.2. For (c) show that the eigenfunctions
are on each interval defined by an appropriate trigonometric function that vanishes on
the three vertices that are not in the centre. Use the Kirchhoff condition in the centre
to find the constraint for the eigenvalues. Assume first that the ratios of the lengths are
incommensurable and reduce the counting to the counting of poles of another trigonometric
function.
Exercise 13.16 (p. 498): Using the natural unitary representation on H1 × H2 show
that Graph(T ) is invariant and that the operators B in the proofs of Theorems 13.9
and 13.15 commute with the unitary representation. Now apply Exercise 12.58(b).
Exercise 13.19 (p. 499): Show that the eigenvalues of H are unbounded, and then use
Exercise 4.29.
Exercise 13.21 (p. 500): By the spectral theorem (Theorem 13.15) it is sufficient to
consider the multiplication operator Mg for a real-valued function g on a finite measure
space.
Exercise 13.23 (p. 501): For (a) take the inner product with v ∈ DS . For (b) simply
calculate $\|Sv \pm iv\|^2$; for (c) and (d) notice that
for all v ∈ DS .
Exercise 13.26 (p. 501): To show that SU is self-adjoint apply Theorem 9.2 to see that
it is sufficient to consider unitary multiplication operators. Then apply Exercise 13.5(a).
Exercise 13.27 (p. 502): Show that U = T from Example 12.9 is an isometry defined on
the whole Hilbert space for which I − U is injective and im(I − U ) is dense, and U cannot
be extended to a unitary operator.
Exercise 13.28 (p. 502): By applying the Cayley transform (and its inverse) it is sufficient
to consider the associated partial isometries.
Exercise 13.30 (p. 502): Show that if g ∈ L2 ((0, 1)) satisfies
for all f ∈ H01 ((0, 1)) it follows that g ∈ H 1 ((0, 1)) satisfies the equation g ′ = ±g. Conclude
from this that n+ (S) = n− (S) = 1.
Exercise 14.16 (p. 523): First take the quotient of Cc(R) by the kernel of ‖·‖Λ using
Lemma 2.15. Apply Theorem 2.32 to obtain the completion AΛ of the quotient of Cc (R).
Use the Banach algebra inequality in Theorem 14.6 to show that the convolution operation
extends to AΛ and gives it the structure of a Banach algebra. Now use (14.4) and the
automatic extension property in Proposition 2.59 to extend the canonical map from Cc (R)
to AΛ to a map from L1 (R) to AΛ .
Exercise 14.21 (p. 527): For (a) use the fact that the characters on the abelian
group (Z/qZ)× form an orthonormal basis of $L^2_m\bigl((\mathbb{Z}/q\mathbb{Z})^\times\bigr)$ where m is the counting
measure multiplied by 1/φ(q) (notice that this is a special case of Exercise 3.50). For (b)
notice that the coefficient of the trivial character in the Fourier expansion of fa is 1/φ(q).
Now combine the assumption and PNT itself in the form (14.1) (see also Exercise 14.3).
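Both statements in this hint can be verified directly in a small case. The sketch below takes q = 5 (an illustrative choice), builds the four characters of (Z/5Z)× from the generator 2, and computes their Gram matrix for the counting measure normalised by 1/φ(q). It also computes the coefficient of the trivial character for an indicator function, on the assumption that fa denotes the indicator of the residue class a.

```python
import cmath
from math import gcd

q = 5  # assumed modulus for the experiment
units = [a for a in range(1, q) if gcd(a, q) == 1]
phi_q = len(units)

# (Z/5Z)^x is cyclic with generator 2; record discrete logarithms.
generator = 2
dlog = {pow(generator, k, q): k for k in range(phi_q)}

def character(r):
    """The character sending the generator to exp(2*pi*i*r/phi(q))."""
    return {a: cmath.exp(2j * cmath.pi * r * dlog[a] / phi_q) for a in units}

def inner(f, g):
    """Inner product for the counting measure multiplied by 1/phi(q)."""
    return sum(f[a] * g[a].conjugate() for a in units) / phi_q

gram = [[inner(character(r), character(s)) for s in range(phi_q)]
        for r in range(phi_q)]

# Indicator of the residue class a = 3 (assumed meaning of f_a).
indicator = {a: (1.0 + 0j if a == 3 else 0j) for a in units}
trivial_coefficient = inner(indicator, character(0))
```

The Gram matrix is the identity up to rounding, and `trivial_coefficient` equals 1/φ(5) = 1/4 regardless of which unit a is chosen.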
Exercise 14.22 (p. 527): Argue along the lines of the proof of Proposition 14.4, but use
Lemma 14.12 to control the error term (as we cannot use monotonicity).
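Exercise 14.23 (p. 527): Estimate
$$\|f\|_\chi = \limsup_{h \to \infty} \Bigl| \sum_n \frac{\chi(n) \Lambda(n)}{n} f(\log n - h) \Bigr| \leq \limsup_{h \to \infty} \sum_n \frac{\Lambda(n)}{n} |f|(\log n - h)$$
Exercise 14.24 (p. 527): For any b ∈ Z define $\Lambda_b = \Lambda f_b = \Lambda 1_{\{k \in \mathbb{Z} \mid k \equiv b \ (\mathrm{mod}\ q)\}}$ so
that $\Lambda = \sum_{b=0}^{q-1} \Lambda_b$ and $\nu_\Lambda = \sum_{b=0}^{q-1} \nu_{\Lambda_b}$. Now use the convergence in Lemma 14.12 and
argue as in the proof of Proposition 14.14 to find a subsequence on which $(\lambda_h \nu_{\Lambda_b})$ converges
for all b = 0, . . . , q − 1. Finally, note that $\nu_{\chi\Lambda} = \sum_{b=0}^{q-1} \chi(b) \nu_{\Lambda_b}$.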
Exercise 14.25 (p. 528): For (a) apply Abel summation (Lemma 14.33) with the
choices an = χ(n) and $b_n = \log^2 n$ and use the fact that |An| ≤ q for all n ≥ 1 to
see that
$$\Bigl| \sum_{n=1}^{m} A_n (b_n - b_{n+1}) \Bigr| \leq q \sum_{n=1}^{m-1} (b_{n+1} - b_n) + q b_m = 2 q b_m \leq 2 q \log^2(1 + y)$$
where m = ⌊y⌋. For (b) use (14.26), re-order the summation, and apply (14.11). For (c)
argue as in Corollary 14.13 (using Corollary 14.13 to control errors).
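The summation-by-parts step in (a) can be checked numerically. The sketch below assumes the standard form of Abel summation, namely that the sum of a_n b_n for n ≤ m equals A_m b_m plus the sum of A_n (b_n − b_(n+1)) for n < m (Lemma 14.33 may be stated differently), reads b_n as (log n)², and uses the non-trivial Dirichlet character modulo q = 4 as an illustrative example.

```python
import math

def chi(n):
    """Non-trivial Dirichlet character modulo 4 (assumed example)."""
    return (0, 1, 0, -1)[n % 4]

q = 4
m = 500  # assumed cutoff, playing the role of floor(y)

# b[n] = (log n)^2 for 1 <= n <= m + 1; index 0 is unused.
b = [0.0] + [math.log(n) ** 2 for n in range(1, m + 2)]

# Partial sums A[n] = chi(1) + ... + chi(n), with A[0] = 0.
A = [0] * (m + 1)
for n in range(1, m + 1):
    A[n] = A[n - 1] + chi(n)

direct = sum(chi(n) * b[n] for n in range(1, m + 1))
by_parts = A[m] * b[m] + sum(A[n] * (b[n] - b[n + 1]) for n in range(1, m))
largest_partial_sum = max(abs(x) for x in A)
```

The two evaluations agree up to rounding, the partial sums stay bounded by q, and the resulting sum is bounded by 2q b_m, in line with the estimate in the hint.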
Exercise 14.26 (p. 528): The argument is similar to the proof of the algebra inequality
for ‖·‖Λ, but much simpler. Use Exercise 14.25(c) to obtain
$$\|f_1 * f_2\|_\chi = \limsup_{h \to \infty} \Bigl| \int f_1 * f_2 \, d\lambda_h \rho \Bigr|$$
where ρ is defined by $d\rho = \frac{1}{t} \, d(\nu_{\chi\Lambda} * \nu_{\chi\Lambda})$, and then repeat the argument for (14.22).
Exercise 14.27 (p. 528): Verify that the proof of Theorem 14.17 works if ‖·‖Λ is simply
replaced by ‖·‖χ throughout.
Exercise 14.28 (p. 528): Use the same f0 and f as in the corresponding case of
Theorem 14.18. If now $\|f\|_\chi = \bigl| \int f D_\chi \, dm \bigr| = \|f\|_1 = \|f_0\|_1$ for the density Dχ from
Exercise 14.24, then |Dχ(t)| = 1 almost everywhere for t in [−2, 2]. However, this
forces Dχ ∈ im χ almost everywhere, which leads to a contradiction.
Exercise 14.29 (p. 529): Use (14.27) as a replacement for Mertens’ theorem in the proof
of the ξ = 0 case in Theorem 14.18. Use this, Exercise 14.27 and Exercise 14.28 to conclude
that ‖·‖χ = 0 for any non-trivial Dirichlet character χ. Conclude by using Exercise 14.22.
Exercise 14.34 (p. 533): If ℜ(s) = σ > σ0 then $\frac{a_n \log n}{n^s} \ll \frac{|a_n|}{n^{(\sigma + \sigma_0)/2}}$, which implies
that $g = -\sum_{n \geq 1} \frac{a_n \log n}{n^s}$ converges absolutely and uniformly on compact subsets of the
half plane {s ∈ C | ℜ(s) > σ0}. Integrating g term-by-term along line segments shows
that f′ = g.
Exercise B.20 (p. 561): Define dµ′ = |g| dµ and
$$g'(x) = \begin{cases} 1 & \text{for } g(x) = 0, \\ \arg g(x) & \text{for } g(x) \neq 0. \end{cases}$$
Notes
(1) (Page v) This description — natural in light of the fact that there seem to be more than
seven hundred books in Mathematical Reviews whose title contains the phrase ‘Functional
Analysis’ — appears in the preface to the monograph of Aubin [2] and is doubtless older
than that.
(2) (Page 6) The Laplace operator is intimately connected with both geometry and phys-
ics. An elegant brief discussion in the notes of Arnold [1, Ch. 4] points out the connection
between the Laplace operator applied to a surface f : R2 → R with |f | small viewed as a
perturbation of a flat sheet, the area of the surface defined by f , and the work required
to bend the surface into this shape. An aspect we are not able to explore here — essential
to the physical meaning of the Laplace operator — is reflected in Arnold’s comment “The
enemies of physics define the Laplace operator in their mathematical textbooks by [rela-
tion (1.5)], which renders this physical object relativistically meaningless (it depends not
only on the function to which the operator is applied, but also on the choice of the coordin-
ate system). On the contrary, the operators [. . . ] and ∆ depend only on the Riemannian
metric and do not depend on the coordinate system.”
(3) (Page 24) The proof here is taken from a note by Väisälä [107], and the original result
states that if X ⊆ C is a compact set for which C∖X is connected, then any continuous
function X → C whose restriction to the interior X° is holomorphic, is a uniform limit
of a sequence of polynomials. Without the additional hypothesis that the function be
holomorphic on the interior the result is simply false, as indicated. If C∖X is not connected
a similar result holds using rational functions instead of polynomials.
(7) (Page 59) Bergman [8] introduced the space of holomorphic functions in a complex
domain with sufficiently regular behaviour at the boundary to ensure they are absolutely
integrable. Part of their importance is that they are Banach spaces; we refer to the mono-
graph of Hedenmalm, Korenblum and Zhu [43] for an accessible treatment.
(8) (Page 77) In fact, the space $L^p_\mu(X)$ (or ℓp(N)) is uniformly convex for any p in (1, ∞),
but the proof for p in (1, 2) is more involved; we refer to Clarkson [18] for the details.
(9) (Page 82) The property that all closed subspaces are complemented in fact charac-
terizes Hilbert spaces in the following sense. Lindenstrauss and Tzafriri [63] showed that
if (V, ‖·‖) is a Banach space in which every closed subspace is complemented then the
norm is equivalent to one induced by a scalar product.
(10) (Page 92) In fact the existence of a left-invariant Borel measure is closely related to
local compactness. Weil [110] showed that if a group has a left-invariant measure for which
a convolution can be defined, then there is a topology on the group with the property
that the completion of the group in that topology is locally compact, and the left-invariant
measure is essentially the Haar measure on the completion. Oxtoby [83], in investigating
what invariant measures can be found on groups that are not locally compact, showed that
a complete separable metric group possesses a left-invariant Borel measure if and only if
the group is locally compact and dense in itself.
(11) (Page 93) In particular, the convergence in L2 does not imply convergence of the
Fourier series at any given point, and a priori does not even imply convergence almost
everywhere. In the classical setting G = T, these questions have been of central importance.
Dirichlet proved that the Fourier series converges at each point if f ∈ C 1 (T), and Paul
du Bois-Reymond showed that there is a function f ∈ C(T) whose Fourier series diverges
at one point. Lusin conjectured that the Fourier series converges almost everywhere to
the function for f ∈ L2 (T), and Kolmogorov [56] found a function in L1 (T) whose Fourier
series diverges almost everywhere. Carleson [16] proved the convergence almost everywhere
for f ∈ L2 (T), an extremely difficult result later extended to f ∈ Lp (T) for p ∈ (1, ∞)
by Hunt [48]. We refer to Lacey [58] for a modern, approachable, account. The situation
is more complicated for functions on compact abelian groups, in part because there is no
canonical way to sum over the group of characters.
(12) (Page 95) This form of uncertainty principle is pointed out for finite cyclic groups
as part of a wider investigation by Donoho and Stark [23]. In the case where G is the
group Z/pZ for a prime p, Tao [104] proved the stronger result that
but the proof requires methods in matrix theory beyond our scope.
(13) (Page 126) The Baire category theorem is a powerful tool across much of topology
and analysis. It was shown by Osgood [81] for R, and independently by Baire [3] for Rd .
It was later applied in functional analysis by Banach and Steinhaus [4].
(14) (Page 128) This analogy is pursued in a monograph by Oxtoby [82], motivated by work
of Sierpiński [98] and Erdős [28], who showed that under the assumption of the continuum
hypothesis there is an injective function f : R → R with f = f −1 with the property
that f (A) is a null set if and only if A is of first category. An approach to constructing sets
with prescribed Diophantine approximation properties was given by Schmidt via what we
now call Schmidt games [93]. The simplest of these takes the following form: Let X be a
metric space, S ⊆ X any subset, and fix constants α, β ∈ (0, 1). The game is played as
follows: the first player, Bob, chooses any open ball B0 ⊆ X with radius ρ0 . Then Alice,
the second player, chooses a ball B1 ⊆ B0 with radius ρ1 = αρ0 . Bob then chooses a
ball B2 ⊆ B1 with radius ρ2 = βρ1 , Alice chooses a ball B3 ⊆ B2 with radius ρ3 = αρ2 ,
and so on. The intersection of all the balls Bn for n ≥ 1 comprises a single point x. If x ∈ S
then Alice wins the game, if not Bob wins. If Alice can force a victory, then the set S is
called (α, β)-winning, and S is said to be α-winning if it is (α, β)-winning for all β ∈ (0, 1).
Clearly S needs to be dense if it is (α, β)-winning, and it may be shown that there are
some null sets that are also meagre and α-winning. Moreover, any countable intersection
of α-winning sets is again α-winning.
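The bookkeeping of the game can be illustrated on the real line. The sketch below makes several assumptions that are not in the text: X = R, the balls are closed intervals, and each player uses a fixed hypothetical strategy that shifts the centre by a constant fraction of the available slack. It is meant only to show that the radii satisfy ρ_{2k} = (αβ)^k ρ_0 and that the balls are nested, so that the centres converge to a single point.

```python
def schmidt_radii(rho0, alpha, beta, moves):
    """Radii rho_0, ..., rho_moves: Alice (odd moves) scales by alpha, Bob (even moves) by beta."""
    radii = [rho0]
    for n in range(1, moves + 1):
        factor = alpha if n % 2 == 1 else beta
        radii.append(radii[-1] * factor)
    return radii


def play_on_line(center0, rho0, alpha, beta, moves, alice_shift=0.0, bob_shift=0.0):
    """Play the game on R with toy strategies (an assumption made for illustration).

    A shift in [-1, 1] moves the new centre by that fraction of the slack
    (old radius minus new radius), which keeps the new interval inside the old one.
    Returns the list of (centre, radius) pairs B_0, B_1, ..., B_moves.
    """
    balls = [(center0, rho0)]
    for n in range(1, moves + 1):
        centre, radius = balls[-1]
        if n % 2 == 1:  # Alice moves
            new_radius, shift = alpha * radius, alice_shift
        else:           # Bob moves
            new_radius, shift = beta * radius, bob_shift
        balls.append((centre + shift * (radius - new_radius), new_radius))
    return balls
```

For example, `play_on_line(0.0, 1.0, 0.5, 0.25, 20, alice_shift=1.0, bob_shift=-1.0)` produces nested intervals whose radii shrink geometrically. Whether the limit point lies in a given set S depends on the strategies, which is the content of the definition of a winning set.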
(15) (Page 143) This is shown by Meyers and Serrin [72]; if the closure is taken of functions
that are smooth up to the boundary then the situation is different. We refer to Evans [30]
for an accessible account.
(16) (Page 182) Horn’s conjecture [47], which was proved in two parts, one by Klyachko [53]
and the other by Knutson and Tao [54], says the following. If A and B are Hermitian n × n
matrices, then an ordered triple (I, J, K) of subsets of {1, . . . , n} with the same cardinality
is called admissible if the inequality
    $\sum_{i \in I} \lambda_i(A + B) \leq \sum_{i \in J} \lambda_i(A) + \sum_{i \in K} \lambda_i(B)$
holds. Horn’s conjecture was that all such admissible inequalities together with the trace
identity (6.13) characterize the possible eigenvalues of pairs of Hermitian matrices and their
sum. We refer to the survey article by Knutson and Tao [55] for the details and references.
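The simplest instance can be checked numerically. The sketch below assumes that eigenvalues are listed in decreasing order, that the trace identity (6.13) is the statement that the eigenvalues of A + B have the same sum as those of A and B together, and that the triple I = J = K = {1} is admissible. It only treats real symmetric 2 × 2 matrices, encoded as a hypothetical triple (a, b, d) standing for [[a, b], [b, d]].

```python
import math
import random


def eig_sym2(a, b, d):
    """Eigenvalues, largest first, of the real symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return (mean + radius, mean - radius)


def check_pair(A, B, tol=1e-9):
    """Return (trace identity holds, top-eigenvalue inequality holds) for A and B."""
    S = tuple(x + y for x, y in zip(A, B))  # entries of A + B
    la, lb, ls = eig_sym2(*A), eig_sym2(*B), eig_sym2(*S)
    trace_ok = abs(sum(ls) - sum(la) - sum(lb)) <= tol
    top_ok = ls[0] <= la[0] + lb[0] + tol
    return trace_ok, top_ok


def random_check(trials=1000, seed=0):
    """Test both relations on randomly chosen symmetric 2 x 2 matrices."""
    rng = random.Random(seed)
    for _ in range(trials):
        A = tuple(rng.uniform(-5, 5) for _ in range(3))
        B = tuple(rng.uniform(-5, 5) for _ in range(3))
        if check_pair(A, B) != (True, True):
            return False
    return True
```

A random search of this kind proves nothing, of course; it merely illustrates what the inequalities assert in the smallest non-trivial case.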
(17) (Page 182) This is one of a large number of results in matrix analysis and its applica-
tions by Weyl [111]. Courant and Hilbert [20, p. 286] give this inequality the following phys-
ically intuitive meaning, familiar to anyone who has used a stringed musical instrument: If
a dynamical system stiffens, then the frequency of its fundamental tone or resonance, and
that of all the overtones, increases.
(18) (Page 196) This relation between semi-norms and traces may be found in the work of
Bessel functions.
(20) (Page 202) Weyl’s motivation came from a problem in black body radiation, though it
was well understood at the time that the mathematical questions also arose in the theory
of vibrations. The result was foreshadowed by Lord Rayleigh [100] in 1877, who used a
three-dimensional lattice point counting problem to count vibrational modes in a cube,
allowing him to asymptotically count the number of ‘overtones’. Sommerfeld and Lorentz
conjectured in 1910 that in fact the quantity was also independent of the shape, giving
the context in which Weyl proved this remarkable theorem. Weyl also gave error terms in
dimensions 2 and 3, and conjectured the form of a second term in terms of the area of the
boundary of U in dimension 3.
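The lattice point count for a cube is easy to reproduce. The sketch below assumes the standard model in which the Dirichlet eigenvalues of the Laplacian on a cube of side length π are the numbers a² + b² + c² with positive integers a, b, c, so that counting eigenvalues up to R² means counting lattice points in one octant of a ball of radius R. The function names are illustrative, and R is assumed to be a non-negative integer.

```python
import math


def count_modes(R):
    """Number of triples (a, b, c) of positive integers with a^2 + b^2 + c^2 <= R^2."""
    total = 0
    for a in range(1, R + 1):
        for b in range(1, R + 1):
            rest = R * R - a * a - b * b
            if rest < 1:
                break  # no admissible c for this b or any larger b
            total += math.isqrt(rest)  # c ranges over 1, ..., floor(sqrt(rest))
    return total


def octant_volume(R):
    """Volume of one octant of the ball of radius R in three dimensions."""
    return math.pi * R ** 3 / 6
```

The ratio `count_modes(R) / octant_volume(R)` stays below 1 and approaches 1 as R grows, which is the leading-order asymptotic in this special case. The gap is of order R², which is consistent with a second term governed by the area of the boundary.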
(21) (Page 202) Milnor [73] noted that a remarkable pair of lattices in R^16 constructed by
Witt [115] gives rise to a pair of 16-dimensional tori that have the same eigenvalues but
different shapes. Much later Gordon, Webb, and Wolpert [40] exhibited two non-convex
polygons in R^2 with the same eigenvalues but different shapes. In the positive direction,
Zelditch [117] showed that the answer to Kac’s question is yes for a large class of convex
subsets of R^2 with analytic boundary.
(22) (Page 215) A sequence (z_n) of complex numbers with |z_n| < 1 for all n ≥ 1 is said
to satisfy the Blaschke condition [12] if $\sum_{n=1}^{\infty} (1 - |z_n|) < \infty$; in this case the Blaschke
product
    $B(z) = \prod_{n=1}^{\infty} \frac{|z_n|}{z_n} \, \frac{z_n - z}{1 - \overline{z_n} z}$,
where the product is taken over all n with z_n ≠ 0, and with a factor z if z_n = 0, is analytic
in the open unit disk and vanishes at each z_n. Finite Blaschke products as used here may be
characterized as the analytic functions on the open unit disk with continuous extension to
the closed unit disk that map the unit circle to itself.
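For a finite set of zeros these properties are easy to observe numerically. The following sketch evaluates a finite Blaschke product with the normalization described above; the function names and the sample zeros in the usage remark are illustrative assumptions.

```python
import cmath


def blaschke(zeros, z):
    """Evaluate the finite Blaschke product with the given zeros (each of modulus < 1) at z."""
    value = 1 + 0j
    for a in zeros:
        if a == 0:
            value *= z  # a zero at the origin contributes the factor z
        else:
            value *= (abs(a) / a) * (a - z) / (1 - a.conjugate() * z)
    return value


def circle_moduli(zeros, angles):
    """Moduli |B(e^{it})| for the given angles t."""
    return [abs(blaschke(zeros, cmath.exp(1j * t))) for t in angles]
```

With `zeros = [0.5 + 0j, 0.3j, 0j]`, the product vanishes at each listed zero, has modulus less than 1 at interior points such as 0.2 + 0.1i, and `circle_moduli` returns values equal to 1 up to rounding error.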
(23) (Page 223) This was shown by Banach and Tarski in 1924 [5]; we refer to the monograph
of Wagon [108] for more details, other related paradoxical decompositions, and the history
of this kind of result.
(24) (Page 226) This alludes to the observation that if every room is occupied in a hotel with
infinitely many rooms then a new guest can always be accommodated. If there are countably
many rooms, this is done by moving each guest to the ‘next’ room: “Sobald nun ein neuer
Gast hinzukommt, braucht der Wirt nur zu veranlassen, dass jeder der alten Gäste das
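Zimmer mit der um 1 höheren Nummer bezieht, und es wird für den Neuangekommenen
das Zimmer 1 frei” [“As soon as a new guest arrives, the landlord need only arrange for each
of the old guests to move into the room numbered one higher, and room 1 becomes free
for the newcomer”] (from a lecture of Hilbert in 1924; see [31, p. 730]).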
(25) (Page 262) We refer to Parthasarathy [84] for a more detailed treatment of the theory
of probability measures on compact metric spaces, and to [27, Ch. 4] for material on
equidistribution from a dynamical point of view.
(26) (Page 267) This is the Kryloff–Bogoliouboff Theorem [57], and it means that a con-
tinuous transformation on a compact metric space always gives rise to one (and perhaps
to many) measure-preserving systems.
(27) (Page 296) The theory of distributions is of central importance in partial differen-
tial equations, where it sometimes allows solutions to be found in the sense of distribu-
tions when they cannot be readily found in the classical sense (as seen in Chapters 5
and 6). The theory of generalized functions was initiated by Sobolev [99] to provide weak
solutions to certain partial differential equations, and then developed systematically by
Schwartz [94], [95].
(28) (Page 342) The uncertainty principle has many extensions, generalizations, and applic-
ations. We would struggle to do better than to quote Folland and Sitaram [34] both for its
extensive bibliography and for its elegant description: “The uncertainty principle is partly
a description of a characteristic feature of quantum mechanical systems, partly a state-
ment about the limitations of one’s ability to perform measurements on a system without
disturbing it, and partly a meta-theorem in harmonic analysis that can be summed up as
follows. A non-zero function and its Fourier transform cannot both be sharply localized.”
(29) (Page 353) The approach developed here is close to that of von Neumann, whose
lectures on the original work of Haar are now available in a convenient form [80].
(30) (Page 403) Margulis’ argument showed in particular that the quotients SL3 (Z)/Λ by
finite index subgroups Λ are (via a standard graph structure on them) an expander family.
To prove this, we will discuss unitary representations of the group SL3 (Z) (that is, actions
of SL3 (Z) by unitary transformations on a Hilbert space). There is also a family of certain
finite quotients SL2 (Z)/Λ that give an expander family, but the proof of this lies deeper
and goes beyond what we will be able to cover. We refer to the monographs of Sarnak [92],
Lubotzky [66] or the notes [26] for the details.
(31) (Page 419) This is the simplest result in the topic of automatic continuity, which
asks for algebraic conditions on Banach algebras A and B that ensure that any algebra
homomorphism χ : A → B is continuous. We refer to the monograph of Dales [21] for a
thorough account.
(32) (Page 439) We refer to the monograph of Lubotzky [66, Sec. 4.5] and the papers of
Kesten [52] and Buck [14] for more details (and for generalizations to other Cayley graphs).
(33) (Page 442) The reader should not confuse the word ‘classical’ with ‘outdated’. Apart
from playing an important role in approximation theory and differential equations these
relations and the resulting polynomials are, in part because of their relation to regular
trees, of great importance for number theory and related areas; we refer to the work of
Lindenstrauss on arithmetic quantum unique ergodicity [62] for a striking instance of this.
(34) (Page 503) The priority for the elementary proof and for some of the steps toward it
ticularly striking one is the existence of paradoxical decompositions, which was discussed
in Section 7.2.3.
References
44. S. Helgason, Differential geometry, Lie groups, and symmetric spaces, in Pure and
Applied Mathematics 80 (Academic Press, Inc. [Harcourt Brace Jovanovich, Publish-
ers], New York-London, 1978).
45. E. Hewitt and K. A. Ross, Abstract harmonic analysis. Vol. I, in Grundlehren der
Mathematischen Wissenschaften 115 (Springer-Verlag, Berlin, second ed., 1979).
46. D. Hilbert and E. Schmidt, Integralgleichungen und Gleichungen mit unendlich vielen
Unbekannten, in Teubner-Archiv zur Mathematik [Teubner Archive on Mathematics],
11 (BSB B. G. Teubner Verlagsgesellschaft, Leipzig, 1989). Edited and with a fore-
word and afterword by A. Pietsch.
47. A. Horn, ‘Eigenvalues of sums of Hermitian matrices’, Pacific J. Math. 12 (1962),
225–241.
48. R. A. Hunt, ‘On the convergence of Fourier series’, in Orthogonal Expansions and
their Continuous Analogues (Proc. Conf., Edwardsville, Ill., 1967), pp. 235–255
(Southern Illinois Univ. Press, Carbondale, Ill., 1968).
49. H. Iwaniec and E. Kowalski, Analytic number theory, in American Mathematical
Society Colloquium Publications 53 (American Mathematical Society, Providence,
RI, 2004).
50. M. Kac, ‘Can one hear the shape of a drum?’, Amer. Math. Monthly 73 (1966), no. 4,
part II, 1–23.
51. J. L. Kelley, General topology (Springer-Verlag, New York-Berlin, 1975). Reprint of
the 1955 edition [Van Nostrand, Toronto, Ont.], Graduate Texts in Mathematics, No.
27.
52. H. Kesten, ‘Symmetric random walks on groups’, Trans. Amer. Math. Soc. 92 (1959),
336–354.
53. A. A. Klyachko, ‘Stable bundles, representation theory and Hermitian operators’,
Selecta Math. (N.S.) 4 (1998), no. 3, 419–445.
54. A. Knutson and T. Tao, ‘The honeycomb model of GLn (C) tensor products. I. Proof
of the saturation conjecture’, J. Amer. Math. Soc. 12 (1999), no. 4, 1055–1090.
55. A. Knutson and T. Tao, ‘Honeycombs and sums of Hermitian matrices’, Notices
Amer. Math. Soc. 48 (2001), no. 2, 175–186.
56. A. Kolmogorov, ‘Une série de Fourier-Lebesgue divergente presque partout’, Funda-
menta math. 4 (1923), 324–328.
57. N. Kryloff and N. Bogoliouboff, ‘La théorie générale de la mesure dans son application
à l’étude des systèmes dynamiques de la mécanique non linéaire’, Ann. of Math. (2)
38 (1937), no. 1, 65–113.
58. M. T. Lacey, ‘Carleson’s theorem: proof, complements, variations’, Publ. Mat. 48
(2004), no. 2, 251–307.
59. P. D. Lax, Functional analysis, in Pure and Applied Mathematics (New York) (Wiley-
Interscience [John Wiley & Sons], New York, 2002).
60. C. G. Lekkerkerker, Geometry of numbers, in Bibliotheca Mathematica, Vol.
VIII (Wolters-Noordhoff Publishing, Groningen; North-Holland Publishing Co.,
Amsterdam-London, 1969).
61. V. B. Lidskiı̆, ‘Non-selfadjoint operators with a trace’, Dokl. Akad. Nauk SSSR 125
(1959), 485–487.
62. E. Lindenstrauss, ‘Invariant measures and arithmetic quantum unique ergodicity’,
Ann. of Math. (2) 163 (2006), no. 1, 165–219.
63. J. Lindenstrauss and L. Tzafriri, ‘On the complemented subspaces problem’, Israel
J. Math. 9 (1971), 263–269.
64. M. Loève, Probability theory. I, in Graduate Texts in Mathematics 45 (Springer-
Verlag, New York, fourth ed., 1977).
65. M. Loève, Probability theory. II, in Graduate Texts in Mathematics 46 (Springer-
Verlag, New York, fourth ed., 1978).
66. A. Lubotzky, Discrete groups, expanding graphs and invariant measures, in Modern
Birkhäuser Classics (Birkhäuser Verlag, Basel, 2010). With an appendix by Jonathan
D. Rogawski, Reprint of the 1994 edition.
93. W. M. Schmidt, ‘On badly approximable numbers and certain games’, Trans. Amer.
Math. Soc. 123 (1966), 178–199.
94. L. Schwartz, Théorie des distributions. Tome I, in Actualités Sci. Ind., no. 1091 =
Publ. Inst. Math. Univ. Strasbourg 9 (Hermann & Cie., Paris, 1950).
95. L. Schwartz, Théorie des distributions. Tome II, in Actualités Sci. Ind., no. 1122 =
Publ. Inst. Math. Univ. Strasbourg 10 (Hermann & Cie., Paris, 1951).
96. A. Selberg, ‘An elementary proof of the prime-number theorem’, Ann. of Math. (2)
50 (1949), 305–313.
97. J.-P. Serre, A course in arithmetic (Springer-Verlag, New York-Heidelberg, 1973).
Translated from the French, Graduate Texts in Mathematics, No. 7.
98. W. Sierpiński, ‘Sur les fonctions jouissant de la propriété de Baire de fonctions con-
tinues’, Ann. of Math. (2) 35 (1934), no. 2, 278–283.
99. S. Soboleff, ‘Méthode nouvelle à résoudre le problème de Cauchy pour les équations
linéaires hyperboliques normales’, Rec. Math. [Mat. Sbornik] N.S. 1(43) (1936), no. 1,
39–72.
100. J. W. Strutt, The theory of sound. Second edition, revised and enlarged. Volume II.
(London: Macmillan. 520 S. 8◦ , 1896) (English). 3rd Baron Rayleigh.
101. M. Takesaki, Theory of operator algebras. I (Springer-Verlag, New York, 1979).
102. M. Talagrand, ‘Pettis integral and measure theory’, Mem. Amer. Math. Soc. 51
(1984), no. 307, ix+224.
103. T. Tao, A Banach algebra proof of the prime number theorem; Urysohn’s lemma; The
prime number theorem in arithmetic progressions; Elementary multiplicative number
theory (https://terrytao.wordpress.com). Accessed: 29th October 2015.
104. T. Tao, ‘An uncertainty principle for cyclic groups of prime order’, Math. Res. Lett.
12 (2005), no. 1, 121–127.
105. T. Tao, An introduction to measure theory, in Graduate Studies in Mathematics 126
(American Mathematical Society, Providence, RI, 2011).
106. F. Trèves, Topological vector spaces, distributions and kernels (Academic Press, New
York, 1967).
107. J. Väisälä, ‘A proof of the Mazur–Ulam theorem’, Amer. Math. Monthly 110 (2003),
no. 7, 633–635.
108. S. Wagon, The Banach-Tarski paradox, in Encyclopedia of Mathematics and its Ap-
plications 24 (Cambridge University Press, Cambridge, 1985). With a foreword by
Jan Mycielski.
109. K. Weierstrass, ‘Über die analytische Darstellbarkeit sogenannter willkürlicher Func-
tionen einer reellen Veränderlichen’, Verl. d. Kgl. Akad. d. Wiss. Berlin 2 (1885),
633–639.
110. A. Weil, L’intégration dans les groupes topologiques et ses applications, in Actual.
Sci. Ind., no. 869 (Hermann et Cie., Paris, 1940).
111. H. Weyl, ‘Das asymptotische Verteilungsgesetz der Eigenwerte linearer partieller Dif-
ferentialgleichungen (mit einer Anwendung auf die Theorie der Hohlraumstrahlung)’,
Math. Ann. 71 (1911), 441–479.
112. H. Weyl, ‘Über die Gleichverteilung von Zahlen mod Eins’, Math. Ann. 77 (1916),
313–352.
113. E. T. Whittaker and G. N. Watson, A course of modern analysis, in Cambridge
Mathematical Library (Cambridge University Press, Cambridge, 1996).
114. N. Wiener, ‘Tauberian theorems’, Ann. of Math. (2) 33 (1932), no. 1, 1–100.
115. E. Witt, ‘Eine Identität zwischen Modulformen zweiten Grades’, Abh. Math. Sem.
Hansischen Univ. 14 (1941), 323–337.
116. D. Witte Morris, Introduction to arithmetic groups (Deductive Press, 2015).
117. S. Zelditch, ‘Spectral determination of analytic bi-axisymmetric plane domains’,
Geom. Funct. Anal. 10 (2000), no. 3, 628–677.
Notation
quences, 20
L^1_µ(X), space of integrable functions, 21
B(X), space of bounded functions, 27
C_b(X), space of continuous bounded functions, 27
C_0(X), space of continuous functions vanishing at infinity, 27
λ_count, counting measure, 28
L^∞(X), space of bounded measurable functions, 29
c_0, space of null sequences, 39
C_R(X), C_C(X), real- and complex-valued continuous functions, 42
1_A, indicator function of the set A, 50
138
∆, weak Laplace operator, 154
K(V, W), K(V), space of compact operators, 168
A^∗, adjoint of operator A, 175
HS(H), space of Hilbert–Schmidt operators on H, 195
ω_d, volume of the unit ball in R^d, 202
T^d_R, scaled torus, 202
F_2, free group on two generators, 225
Σ_X, simple integrable functions on X, 235
T_∗µ, push-forward of a measure, 265
D(U), space of distributions on U, 296
L^1, space of equivalence classes of integrable functions, 297
S(R^d), Schwartz space of functions on R^d, 341
π : G ⟳ H, unitary representation, 344
P_1(G), positive-definite functions on a group, 345
λ_g, left regular representation, 353
λ_g, shift in domain, 362
AB, product of sets in a group, 371
H^G, subspace of invariant vectors in a unitary representation, 375
Λ, von Mangoldt function, 504
Λ_2, second von Mangoldt function, 509
B_ε(·), ε ball in a metric space, 539
lim F, limit of a convergent filter, 540
lim_F f, convergence along a filter, 540
σ(C), σ-algebra generated by C, 551
General Index