Advanced Functional Analysis: Lecture Notes

These lecture notes cover advanced functional analysis topics, focusing on bounded and unbounded self-adjoint operators, as well as locally convex vector spaces. The content is structured into sections that include the spectral theorem, vector space topologies, and basic theory of distributions, with compulsory material outlined for examination purposes. The notes are based on Dirk Werner's textbook and include references to additional resources for further study.


Lecture Notes

Advanced Functional Analysis

Günther Hörmann
Fakultät für Mathematik
Universität Wien
[email protected]

Sommersemester 2023
(Corrected and updated version as of May 23, 2024)
Preface
These lecture notes draw a substantial part of their material and style, as well as their
logical build-up, from Chapters VII and VIII of Dirk Werner’s excellent textbook [Wer18] (in
German). For the prerequisites, which correspond to an introductory course on functional
analysis at our faculty, we refer to Chapters I–VI in [Wer18] or to [Hoe21], both in
German, or, for English texts, to the corresponding chapters in [Con10, Con16, Tes14].
(The latter sources also contain many further aspects and additional material.)
In the course of the semester we might occasionally provide hints to supplementary concepts,
examples, or further applications not covered in these notes. In fact, these notes certainly
do not replace a book on the subject and are particularly sparse with intermediate,
explanatory, or motivating text in between mathematical statements. However, Sections
1–7 as presented in the notes define the compulsory material for the exam (thus excluding
Section 0 and the appendix).
Many thanks go to Michael Kunzinger for several suggestions and corrections regarding a
previous version of these notes and to Nobuya Kakehashi for several subtle mathematical
comments leading to further improvements later on.
Günther Hörmann
Contents

I. Bounded and unbounded self-adjoint operators
   0. Before we begin . . .
   1. The spectral theorem for bounded self-adjoint operators
   2. Unbounded operators
   3. The spectral theorem for unbounded self-adjoint operators

II. Locally convex vector spaces
   4. Vector space topologies
   5. Continuous linear maps and functionals
   6. Weak and weak* topologies
   7. Basic theory of distributions

Appendix

Bibliography
Part I. Bounded and unbounded self-adjoint operators

0. Before we begin . . .

Recalling a few functional analytic basics

Notation and conventions: $\mathbb{N} = \{1, 2, \ldots\}$, $H$ denotes a complex Hilbert space with scalar
product $\langle \cdot, \cdot \rangle$ (conjugate-linear in its second argument), $L(H)$ is the space of bounded linear
operators on $H$; sequences $(f_n)_{n \in \mathbb{N}}$ are often written simply in the form $(f_n)$.

If $S \in L(H)$, then its adjoint $S^* \in L(H)$ can be characterized by the condition
$\langle Sx, y \rangle = \langle x, S^* y \rangle$ for all $x, y \in H$; we have $S^{**} := (S^*)^* = S$. The kernel
$\ker(S) := \{x \in H \mid Sx = 0\}$ is always a closed subspace of $H$, while the range (or image)
$\operatorname{ran}(S) := \{Sx \mid x \in H\}$ is a subspace of $H$ that is not necessarily closed. A basic relation
involving these is $\operatorname{ran}(S)^\perp = \ker(S^*)$, hence also $\ker(S) = \operatorname{ran}(S^*)^\perp$ and
$\overline{\operatorname{ran}(S)} = \operatorname{ran}(S)^{\perp\perp} = \ker(S^*)^\perp$.

Recall that $T \in L(H)$ is self-adjoint, if $T = T^*$, i.e., $\langle Tx, y \rangle = \langle x, Ty \rangle$ for all $x, y \in H$; and
that $T$ is normal, if $T T^* = T^* T$. Clearly, every self-adjoint operator is normal. We call
$T \in L(H)$ positive, if $\langle Tx, x \rangle \ge 0$ holds for all $x \in H$. On complex Hilbert spaces, positive
operators in this sense are automatically self-adjoint ([Wer18, Satz V.5.6]). Finally, an
operator $P \in L(H)$ is an orthogonal projection (onto the closed subspace $\operatorname{ran}(P)$, with
$H = \ker(P) \oplus \operatorname{ran}(P)$ as an orthogonal direct sum) if and only if $P$ is self-adjoint and
$P^2 = P$. In the sequel we will simply speak of projections to mean orthogonal projections.

0.1. Proposition: If $T \in L(H)$ is self-adjoint, then

(i) $\langle Tx, x \rangle \in \mathbb{R}$ for every $x \in H$,

(ii) $\|T\| = \sup_{\|x\| \le 1} |\langle Tx, x \rangle|$.

Remark: Property (i) is, in fact, also sufficient for self-adjointness of a bounded operator
on complex Hilbert spaces (cf. [Wer18, Satz V.5.6]).

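Proof: (i): $\langle Tx, x \rangle = \langle x, Tx \rangle = \overline{\langle Tx, x \rangle}$.

(ii): For any $x \in H$ with $\|x\| \le 1$, we have $|\langle Tx, x \rangle| \le \|Tx\| \cdot \|x\| \le \|T\| \cdot \|x\|^2 \le \|T\|$,
hence it remains to show that $\|T\| \le \sup_{\|x\| \le 1} |\langle Tx, x \rangle| =: M$.

By elementary maneuvering,

$$\begin{aligned}
\langle T(x + y), x + y \rangle - \langle T(x - y), x - y \rangle
&= 2\langle Tx, y \rangle + 2\langle Ty, x \rangle = 2\langle Tx, y \rangle + 2\overline{\langle x, Ty \rangle} \\
&= 2\big(\langle Tx, y \rangle + \overline{\langle Tx, y \rangle}\big) = 4 \operatorname{Re}\langle Tx, y \rangle.
\end{aligned}$$

Therefore, using $|\langle Tz, z \rangle| \le M \|z\|^2$ for every $z \in H$,

$$\begin{aligned}
4 \operatorname{Re}\langle Tx, y \rangle
&\le |\langle T(x + y), x + y \rangle| + |\langle T(x - y), x - y \rangle| \le M \|x + y\|^2 + M \|x - y\|^2 \\
&= M (2\|x\|^2 + 2\|y\|^2) \quad \text{(by the parallelogram law).}
\end{aligned}$$

If $\|x\|, \|y\| \le 1$, then we obtain $4 \operatorname{Re}\langle Tx, y \rangle \le M \cdot 4$, hence $\operatorname{Re}\langle Tx, y \rangle \le M$. Replacing $x$
by $\lambda x$ we deduce

$$\operatorname{Re}(\lambda \langle Tx, y \rangle) \le M \quad \forall \lambda \in \mathbb{C},\ |\lambda| = 1,\ \forall x, y \in H,\ \|x\|, \|y\| \le 1,$$

thus (choosing $\lambda$ with $\lambda \langle Tx, y \rangle = |\langle Tx, y \rangle|$) $|\langle Tx, y \rangle| \le M$, if $\|x\|, \|y\| \le 1$. Since
$\|Tx\| = \sup_{\|y\| \le 1} |\langle Tx, y \rangle|$, we conclude that $\|Tx\| \le M$, if $\|x\| \le 1$, which in
turn yields $\|T\| \le M$.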
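
The polarization-type identity used in part (ii) can be checked numerically. The following is a minimal sketch in plain Python; the matrix `T` and the vectors `x`, `y` are hypothetical sample data (not from the notes), and `inner` and `apply` are illustrative helper names.

```python
def inner(u, v):
    """Scalar product on C^n, conjugate-linear in the second argument."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def apply(T, x):
    """Matrix-vector product T x."""
    return [sum(T[i][j] * x[j] for j in range(len(x))) for i in range(len(T))]

# Hypothetical self-adjoint (Hermitian) 2x2 matrix and two sample vectors
T = [[2 + 0j, 1 - 1j], [1 + 1j, -1 + 0j]]
x = [1 + 2j, -0.5j]
y = [0.3 - 1j, 2 + 0j]

plus = [a + b for a, b in zip(x, y)]
minus = [a - b for a, b in zip(x, y)]

# <T(x+y), x+y> - <T(x-y), x-y>  versus  4 Re <Tx, y>
lhs = inner(apply(T, plus), plus) - inner(apply(T, minus), minus)
rhs = 4 * inner(apply(T, x), y).real

# Part (i): <Tx, x> is real for self-adjoint T
quadratic_form = inner(apply(T, x), x)
```

Both `abs(lhs - rhs)` and `abs(quadratic_form.imag)` should vanish up to rounding errors.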
0.2. Basic closed range condition: Recall that any $T \in L(H)$ satisfying, for some $c > 0$,
the lower bound condition

$$\forall x \in H : \quad \|Tx\| \ge c \|x\|$$

has a closed range $\operatorname{ran}(T) = \{Tx \mid x \in H\}$ in $H$, and the inverse $T^{-1} : \operatorname{ran}(T) \to H$ is
continuous with $\|T^{-1}\| \le 1/c$. In fact, a sequence $(Tx_n)$ in $\operatorname{ran}(T)$ that converges in $H$ is a Cauchy
sequence, and the estimate $\|x_n - x_m\| \le \|Tx_n - Tx_m\|/c$ shows that $(x_n)$ is a Cauchy
sequence in $H$, hence convergent to some $x \in H$; thus $(Tx_n)$ converges to $Tx \in \operatorname{ran}(T)$, and
$\operatorname{ran}(T)$ is closed. Clearly, $T$ is injective and with $y = Tx$ we have $\|T^{-1} y\| = \|x\| \le \|y\|/c$.
0.3. Theorem (Lax-Milgram): Let $B : H \times H \to \mathbb{C}$ be a sesquilinear form.
(a) The following are equivalent:
(i) $B$ is continuous,
(ii) $B$ is separately continuous,
(iii) $\exists M \ge 0$ $\forall x, y \in H$: $|B(x, y)| \le M \|x\| \|y\|$.
(b) If $B$ is continuous and $M$ is as in (a), part (iii), then there exists a unique $T \in L(H)$
such that $B(x, y) = \langle Tx, y \rangle$ for all $x, y \in H$, and we have $\|T\| \le M$.
If, in addition, $B$ is bounded below in the sense that $|B(x, x)| \ge c \|x\|^2$ holds with some
$c > 0$ for all $x \in H$, then $T$ is invertible and $\|T^{-1}\| \le 1/c$.
Proof: (a): (i) ⇒ (ii) is clear.
(ii) ⇒ (iii): For $y \in H$ define the continuous linear functional $l_y$ on $H$ by $l_y(x) := B(x, y)$.
If $x \in H$, then $y \mapsto B(x, y)$ is conjugate-linear and continuous, hence there is some $C_x \ge 0$
such that $|l_y(x)| = |B(x, y)| \le C_x \|y\|$. Thus, the family $A := \{l_y \in H' \mid y \in H, \|y\| \le 1\}$ of
linear functionals is pointwise bounded on $H$, since $\sup_{\|y\| \le 1} |l_y(x)| \le C_x$. By the
Banach-Steinhaus theorem (or uniform boundedness principle), $A$ is bounded in $H'$, i.e., there is
an $M \ge 0$ such that $\|l_y\| \le M$ for all $y \in H$ with $\|y\| \le 1$. In other words,

$$\forall x, y \in H,\ \|x\| \le 1,\ \|y\| \le 1 : \quad |B(x, y)| \le M.$$

Since $B(x, y) = 0$ if $x = 0$ or $y = 0$, and $|B(x/\|x\|, y/\|y\|)| \le M$, if $x \ne 0$ and $y \ne 0$, by the
above inequality, we obtain $|B(x, y)| \le M \|x\| \|y\|$.
(iii) ⇒ (i): Let $(x_0, y_0) \in H \times H$ and $\varepsilon > 0$ be given; we may assume $M > 0$. Choose $\delta \in\, ]0, 1]$ such that
$\delta < \varepsilon/(M (1 + \|x_0\| + \|y_0\|))$, and put $U_\delta := \{(x, y) \in H \times H \mid \|x - x_0\| + \|y - y_0\| < \delta\}$. Then $U_\delta$ is a
neighborhood of $(x_0, y_0)$ in $H \times H$ (with respect to the product topology) and we have for
$(x, y) \in U_\delta$:

$$\begin{aligned}
|B(x, y) - B(x_0, y_0)| &= |(B(x, y) - B(x_0, y)) + (B(x_0, y) - B(x_0, y_0))| \\
&\le |B(x - x_0, y)| + |B(x_0, y - y_0)| \le M \|x - x_0\| \|y\| + M \|x_0\| \|y - y_0\| \\
&\le M \delta (\|y_0\| + \delta) + M \|x_0\| \delta = \delta M (\|x_0\| + \|y_0\| + \delta) \le \delta M (\|x_0\| + \|y_0\| + 1) < \varepsilon.
\end{aligned}$$

(b): If $x \in H$, then $h_x(y) := B(x, y)$ defines a conjugate-linear continuous functional on $H$,
and the conjugate-linear variant of the Riesz-Fréchet theorem provides us with a unique
vector $T(x) \in H$ such that $B(x, y) = h_x(y) = \langle T(x), y \rangle$ for every $y \in H$. We obtain a
map $T : H \to H$, $x \mapsto T(x)$, which is easily seen to be linear, since
$\langle T(\lambda_1 x_1 + \lambda_2 x_2), y \rangle = B(\lambda_1 x_1 + \lambda_2 x_2, y) = \lambda_1 B(x_1, y) + \lambda_2 B(x_2, y) = \langle \lambda_1 T(x_1) + \lambda_2 T(x_2), y \rangle$
for every $y$ implies $T(\lambda_1 x_1 + \lambda_2 x_2) = \lambda_1 T(x_1) + \lambda_2 T(x_2)$.
We have $|\langle Tx, y \rangle| = |B(x, y)| \le M \|x\| \|y\|$, hence $\sup_{\|y\| \le 1} |\langle Tx, y \rangle| \le M \|x\|$, i.e.,
$\|Tx\| \le M \|x\|$, and therefore $\|T\| \le M$.
Uniqueness of $T$: If $\langle Tx, y \rangle = B(x, y) = \langle Sx, y \rangle$ for all $x, y \in H$, then $Tx - Sx \perp H$ for all
$x \in H$, hence $T = S$.
Finally, $c \|x\|^2 \le |B(x, x)| = |\langle Tx, x \rangle| \le \|Tx\| \|x\|$ implies $c \|x\| \le \|Tx\|$, which shows
(cf. 0.2) that $\operatorname{ran}(T)$ is closed and that $T^{-1}$ is continuous as a linear map $\operatorname{ran}(T) \to H$
with $\|T^{-1}\| \le 1/c$. We claim that $\operatorname{ran}(T) = H$, for otherwise there is some $z \ne 0$ with
$z \perp \operatorname{ran}(T)$, and hence $0 = |\langle Tz, z \rangle| = |B(z, z)| \ge c \|z\|^2$, a contradiction.
0.4. The spectrum of a bounded operator $T \in L(H)$: For proofs not given here and
the more general Banach space context see [Wer18, Abschnitt VI.1].
The resolvent set
$$\rho(T) := \{\lambda \in \mathbb{C} \mid \exists (\lambda - T)^{-1} \text{ in } L(H)\}$$
is an open subset of $\mathbb{C}$ and the resolvent map $R : \rho(T) \to L(H)$, $R_\lambda := R(\lambda) := (\lambda - T)^{-1}$,
is analytic, i.e., is locally given by a power series with coefficients from $L(H)$.
The spectrum $\sigma(T) := \mathbb{C} \setminus \rho(T)$ is a compact non-empty subset of $\mathbb{C}$. We have
$\sigma(T^*) = \{\bar{\lambda} \mid \lambda \in \sigma(T)\}$, since $((\lambda - T)^{-1})^* = ((\lambda - T)^*)^{-1} = (\bar{\lambda} - T^*)^{-1}$.
If $|\lambda| > \|T\|$ we can apply the Neumann series to find $(\lambda - T)^{-1} = (I - T/\lambda)^{-1}/\lambda$, thus
$\sigma(T) \subseteq \{\lambda \in \mathbb{C} \mid |\lambda| \le \|T\|\}$.
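
For convenience, the Neumann series argument can be spelled out as follows. This is only a sketch of the standard computation for $|\lambda| > \|T\|$; the series converges in the operator norm because $\|T/\lambda\| < 1$.

```latex
(\lambda - T)^{-1}
  = \frac{1}{\lambda}\Big(I - \frac{T}{\lambda}\Big)^{-1}
  = \frac{1}{\lambda}\sum_{n=0}^{\infty}\Big(\frac{T}{\lambda}\Big)^{n}
  = \sum_{n=0}^{\infty}\frac{T^{n}}{\lambda^{n+1}},
\qquad
\big\|(\lambda - T)^{-1}\big\|
  \le \sum_{n=0}^{\infty}\frac{\|T\|^{n}}{|\lambda|^{n+1}}
  = \frac{1}{|\lambda| - \|T\|}.
```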
We obtain a slightly sharper “encirclement” of the spectrum by the spectral radius
$r(T) := \inf_{n \in \mathbb{N}} \|T^n\|^{1/n} = \lim_{n \to \infty} \|T^n\|^{1/n}$, which clearly satisfies $r(T) \le \|T\|$
(since $\|T^n\|^{1/n} \le (\|T\|^n)^{1/n} = \|T\|$), namely
$$r(T) = \max\{|\lambda| \mid \lambda \in \sigma(T)\}.$$
If $T$ is normal, then $r(T) = \|T\|$. In particular, we can then always find a spectral value
$\lambda \in \sigma(T)$ with $|\lambda| = \|T\|$.
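
For a non-normal operator the spectral radius can be strictly smaller than the norm. The following minimal sketch illustrates this with a hypothetical 2×2 matrix whose only eigenvalue is 0.5, while its operator norm exceeds 1 (the image of the second unit vector has length greater than 1). As a simplifying assumption the Frobenius norm is used instead of the operator norm; in finite dimensions all norms are equivalent, so the limit of the n-th roots is the same.

```python
import math

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def frobenius_norm(A):
    """Frobenius norm; equivalent to the operator norm in finite dimensions."""
    return math.sqrt(sum(abs(a) ** 2 for row in A for a in row))

# Hypothetical non-normal example: both eigenvalues equal 0.5, so r(T) = 0.5
T = [[0.5, 1.0], [0.0, 0.5]]

def root_of_power_norm(T, n):
    """Return ||T^n||^(1/n) with respect to the Frobenius norm."""
    P = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        P = matmul(P, T)
    return frobenius_norm(P) ** (1.0 / n)

values = {n: root_of_power_norm(T, n) for n in (1, 10, 100, 400)}
```

The entries of `values` decrease towards 0.5 as n grows, although the value for n = 1 is larger than 1.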
A spectral value $\lambda \in \mathbb{C}$ is defined by the failure of $(\lambda - T)^{-1}$ to exist in $L(H)$. Note that,
due to the open mapping principle, the bijectivity of $(\lambda - T) : H \to H$ already implies
continuity of its inverse. Hence the “cases of failure” can be separated into the following
three classes:
1. $\lambda - T$ fails to be injective (it has a nontrivial kernel and thus $\lambda$ is an eigenvalue),
2. $\lambda - T$ is injective, not surjective, and $\operatorname{ran}(\lambda - T)$ is dense in $H$,
3. $\lambda - T$ is injective and $\operatorname{ran}(\lambda - T)$ is not dense in $H$ (hence $\lambda - T$ is also not surjective).
Accordingly, we have the following decomposition of the spectrum (as a disjoint union)

$$\begin{aligned}
\sigma(T) &= \{\lambda \in \mathbb{C} \mid \nexists (\lambda - T)^{-1} \text{ in } L(H)\} \\
&= \underbrace{\{\lambda \in \mathbb{C} \mid \ker(\lambda - T) \ne \{0\}\}}_{\sigma_p(T)} \\
&\quad \cup \underbrace{\{\lambda \in \mathbb{C} \mid \ker(\lambda - T) = \{0\},\ \operatorname{ran}(\lambda - T) \ne H,\ \operatorname{ran}(\lambda - T) \text{ dense in } H\}}_{\sigma_c(T)} \\
&\quad \cup \underbrace{\{\lambda \in \mathbb{C} \mid \ker(\lambda - T) = \{0\},\ \operatorname{ran}(\lambda - T) \text{ not dense in } H\}}_{\sigma_r(T)}.
\end{aligned}$$

The point spectrum $\sigma_p(T)$ consists of the eigenvalues of $T$, $\sigma_c(T)$ is called the continuous
spectrum, and $\sigma_r(T)$ is the residual spectrum (see footnote 1). However, we show that the latter does not
play any role for normal (or self-adjoint) operators.
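0.5. Lemma: If $T$ is normal, then $\sigma_r(T) = \emptyset$.
Proof: The operator $\lambda - T$ is normal as well, hence $\ker(\lambda - T) = \ker((\lambda - T)^*)$. (In fact,
$\|(\lambda - T)x\| = \|(\lambda - T)^* x\|$ holds, since
$0 = \langle ((\lambda - T)^*(\lambda - T) - (\lambda - T)(\lambda - T)^*)x, x \rangle = \langle (\lambda - T)^*(\lambda - T)x, x \rangle - \langle (\lambda - T)(\lambda - T)^* x, x \rangle = \|(\lambda - T)x\|^2 - \|(\lambda - T)^* x\|^2$.)
Thus, $\lambda \in \sigma_r(T)$ would imply $\{0\} \ne \operatorname{ran}(\lambda - T)^\perp = \ker((\lambda - T)^*) = \ker(\lambda - T)$, i.e.,
$\lambda \in \sigma_p(T)$, a contradiction.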
0.6. Proposition (on approximate eigenvalues of normal operators): Let $T \in L(H)$
be normal and $\lambda \in \mathbb{C}$. Then the following are equivalent:

(i) $\lambda \in \sigma(T)$,

(ii) there is a sequence $(x_n)$ in $H$ with $\|x_n\| = 1$ and $\lim_{n \to \infty} (\lambda x_n - T x_n) = 0$.

Footnote 1 (on the residual spectrum in 0.4): A simple example for the existence of a residual
spectrum is $\lambda = 0$ for the right-shift $R(x_1, x_2, \ldots) = (0, x_1, x_2, \ldots)$ on $l^2(\mathbb{N})$, since it has
$(1, 0, 0, \ldots)$ orthogonal to its range.


Proof: (i) ⇒ (ii): By the above lemma we have $\sigma(T) = \sigma_p(T) \cup \sigma_c(T)$. If $\lambda$ is an eigenvalue
with normalized eigenvector $x$, then we may simply put $x_n := x$ and obtain $\lambda x_n - T x_n = 0$.

It remains to consider $\lambda \in \sigma_c(T)$. Then $\lambda - T$ is injective and $\operatorname{ran}(\lambda - T)$ is dense in $H$, but
$\operatorname{ran}(\lambda - T) \ne H$. There can be no constant $c > 0$ such that $\|(\lambda - T)x\| \ge c \|x\|$ holds for
all $x \in H$, since this would imply continuity of $(\lambda - T)^{-1}$ as a linear map $\operatorname{ran}(\lambda - T) \to H$
with a continuous extension to $H$ due to the density of $\operatorname{ran}(\lambda - T)$ (and uniform continuity
of continuous linear maps). Therefore, for every $n \in \mathbb{N}$ we can find $y_n \in H$ such that
$\|(\lambda - T) y_n\| < \|y_n\|/n$. Putting $x_n := y_n/\|y_n\|$ we obtain $\|\lambda x_n - T x_n\| < 1/n$.

(ii) ⇒ (i): If $\lambda \notin \sigma(T)$, then $(\lambda - T)^{-1} : H \to H$ is bounded and hence there is some
$M > 0$ such that $\|(\lambda - T)^{-1} z\| \le M \|z\|$ holds for all $z \in H$. Equivalently, upon putting
$z = (\lambda - T)x$, we have $\|x\| \le M \|(\lambda - T)x\|$ for all $x \in H$, which contradicts (ii), since
this means $\|\lambda x - Tx\| \ge 1/M > 0$ for every $x \in H$ with $\|x\| = 1$.
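
As an illustration of condition (ii) for a spectral value that is not an eigenvalue, consider the following standard example (not taken from the notes): the diagonal operator $T$ on $l^2(\mathbb{N})$ with $(Tx)_k := x_k/k$, with $e_n$ denoting the $n$-th unit sequence. This operator is self-adjoint and injective, its range contains all finitely supported sequences and is therefore dense, but it is not surjective.

```latex
% Sketch: approximate eigenvectors for \lambda = 0 and (Tx)_k := x_k / k on l^2(\mathbb{N})
\lambda = 0, \qquad x_n := e_n, \qquad \|x_n\| = 1, \qquad
\|\lambda x_n - T x_n\| = \Big\|\frac{1}{n}\, e_n\Big\| = \frac{1}{n} \longrightarrow 0
\quad (n \to \infty).
```

Hence $0 \in \sigma(T)$, more precisely $0 \in \sigma_c(T)$, although $0$ is not an eigenvalue of $T$.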

And a bit of measure theory

Here we collect a few measure theoretic notions and results (available also, e.g., in Appendix
A and Chapters I, II, VII of [Wer18]); more on the measure theoretic background can be
found in the excellent textbooks [Bau90, Els11] (in German) or [Bau01, Coh80, Con16,
Fol99, Rud86] (in English). Prerequisites from basic topology courses as in [Hoe20]
(contained in English in the book [Wil70]) will be used throughout the course without special
notice. We denote by $P(\Omega)$ the power set of the set $\Omega$.

(A) Measures

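0.7. Definition: Let $\Omega$ be a set. A sigma algebra on $\Omega$ is a subset $\Sigma \subseteq P(\Omega)$ satisfying

(i) $\emptyset \in \Sigma$,

(ii) $A \in \Sigma \Rightarrow \Omega \setminus A \in \Sigma$,

(iii) $A_j \in \Sigma$ $(j \in \mathbb{N})$ $\Rightarrow \bigcup_{j \in \mathbb{N}} A_j \in \Sigma$.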

0.8. Definition: Let $\Omega$ be a topological space. The Borel sigma algebra on $\Omega$ is the
smallest sigma algebra $B(\Omega)$ containing the topology, i.e., the system of open subsets of
$\Omega$. The elements in $B(\Omega)$ are called Borel sets.

Clearly, $B(\Omega)$ contains all closed subsets, and in case of a Hausdorff space also all compact
subsets. In $\mathbb{R}$ every interval is a Borel set, in $\mathbb{C} \cong \mathbb{R}^2$ any product of two intervals belongs
to $B(\mathbb{C})$. For more on these issues see [Els11, Kapitel I, §4].

0.9. Definition: Let $\Sigma$ be a sigma algebra on the set $\Omega$. A measure is a map $\mu : \Sigma \to [0, \infty]$
satisfying

(i) $\mu(\emptyset) = 0$,

(ii) σ-additivity: If $A_1, A_2, \ldots$ is a sequence of pairwise disjoint sets in $\Sigma$, then

$$\mu\Big(\bigcup_{j \in \mathbb{N}} A_j\Big) = \sum_{j \in \mathbb{N}} \mu(A_j).$$

If $\mu(\Omega) < \infty$, $\mu$ is a finite measure, and, if $\mu(\Omega) = 1$, it is said to be a probability measure.
The triple $(\Omega, \Sigma, \mu)$, or often simply the pair $(\Omega, \mu)$, is called a measure space (or probability
space, if $\mu(\Omega) = 1$). If $\Omega$ is a topological space and $\Sigma \supseteq B(\Omega)$, then a measure $\mu$ defined
on $\Sigma$ (or rather its restriction $\mu|_{B(\Omega)}$) is called a Borel measure on $\Omega$.

The two simplest (nontrivial) examples of measures on an arbitrary set $\Omega$ with sigma
algebra $P(\Omega)$ are the Dirac measure $\delta_p$ concentrated at $p \in \Omega$ with $\delta_p(A) = 1$, if $p \in A$,
$\delta_p(A) = 0$ otherwise, and the counting measure $\mu$ with $\mu(A) = \infty$, if $A$ is an infinite subset
of $\Omega$, and $\mu(A)$ equal to the number of elements of $A$, if $A$ is finite.

0.10. Theorem: For every $d \in \mathbb{N}$ there is a unique measure (defined at least) on the Borel
sigma algebra $B(\mathbb{R}^d)$ which is translation invariant and assigns the value $(b_1 - a_1) \cdots (b_d - a_d)$
to the product of closed bounded intervals $[a_1, b_1] \times \cdots \times [a_d, b_d]$. This measure is called
the $d$-dimensional Lebesgue measure.

(B) Construction of the integral

0.11. Definition: Let $\Sigma$ be a sigma algebra on $\Omega$. A function $f : \Omega \to \mathbb{R}$ (or $\Omega \to [0, \infty]$)
is measurable (or $\Sigma$-measurable), if $f^{-1}([a, b[) \in \Sigma$ for any $a \le b$. A complex function
$f : \Omega \to \mathbb{C}$ is measurable, if $\operatorname{Re} f$ and $\operatorname{Im} f$ are measurable functions $\Omega \to \mathbb{R}$. In case of a
topological space with $\Sigma = B(\Omega)$, a measurable function is called Borel measurable.

Continuous functions on a topological space are easily seen to be Borel measurable.

For any $A \subseteq \Omega$ we have the characteristic function (or indicator function) of $A$, defined
by $\chi_A(q) = 1$, if $q \in A$, and $\chi_A(q) = 0$, if $q \notin A$. Clearly, $\chi_A$ is measurable if and only if
$A \in \Sigma$.

Integral of step functions: A simple function or step function (or an elementary function)
on $\Omega$ is a function of the form

$$f = \sum_{j=1}^{m} c_j \chi_{A_j},$$

where $c_j \in \mathbb{C}$ and $A_j \in \Sigma$ $(j = 1, \ldots, m)$, and $A_1, A_2, \ldots, A_m$ are pairwise disjoint. The
integral of a non-negative step function $f$ with respect to a measure $\mu$ on $\Sigma$ is defined by

$$\int f \, d\mu = \sum_{j=1}^{m} c_j \, \mu(A_j).$$
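
The definition of the integral of a non-negative step function can be sketched in a few lines of Python for the two example measures mentioned above (counting measure and Dirac measure) on a finite set. The function names and the sample step function are hypothetical choices for illustration only.

```python
def counting_measure(A):
    """Counting measure of a finite set A."""
    return len(A)

def dirac_measure(p):
    """Dirac measure concentrated at the point p."""
    return lambda A: 1 if p in A else 0

def integrate_step(terms, mu):
    """Integral of sum_j c_j * chi_{A_j} for pairwise disjoint A_j and c_j >= 0.

    `terms` is a list of pairs (c_j, A_j) with A_j given as frozensets.
    """
    sets = [A for _, A in terms]
    for i, A in enumerate(sets):
        for B in sets[i + 1:]:
            if A & B:
                raise ValueError("the sets A_j must be pairwise disjoint")
    return sum(c * mu(A) for c, A in terms)

# Hypothetical step function f = 2 * chi_{1,2} + 5 * chi_{3}
f = [(2.0, frozenset({1, 2})), (5.0, frozenset({3}))]

integral_counting = integrate_step(f, counting_measure)   # 2*2 + 5*1
integral_dirac = integrate_step(f, dirac_measure(3))       # value of f at 3
```

With respect to the Dirac measure at the point 3, the integral simply evaluates the step function at that point.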

Integral of non-negative measurable functions: It can be shown that pointwise limits
of sequences of measurable functions are measurable and that any measurable function
$f : \Omega \to [0, \infty]$ is the pointwise increasing limit of a sequence $(\varphi_n)$ of non-negative step
functions $0 \le \varphi_1(q) \le \varphi_2(q) \le \cdots$ (cf. [Els11, Kapitel III, Satz 4.13]). The real sequence
of integrals $(\int \varphi_n \, d\mu)_{n \in \mathbb{N}}$ is thus increasing and one puts

$$\int f \, d\mu := \lim_{n \to \infty} \int \varphi_n \, d\mu \in [0, \infty].$$

In case $\int f \, d\mu < \infty$ the function $f : \Omega \to [0, \infty]$ is called integrable (or $\mu$-integrable).

Integral of real- or complex-valued measurable functions: Let $f : \Omega \to \mathbb{R}$ be measurable,
then $f_+ := \max(f, 0)$ and $f_- := \max(-f, 0)$ are measurable functions $\Omega \to [0, \infty[$
and $f = f_+ - f_-$. If both $f_+$ and $f_-$ are integrable, then $f$ is called integrable and we put

$$\int f \, d\mu := \int f_+ \, d\mu - \int f_- \, d\mu.$$

A complex-valued measurable function $f : \Omega \to \mathbb{C}$ is called integrable, if $\operatorname{Re} f$ and $\operatorname{Im} f$ are
integrable, and we then put $\int f \, d\mu := \int \operatorname{Re} f \, d\mu + i \int \operatorname{Im} f \, d\mu$.
A property $\Phi$ depending on the points in $\Omega$ is said to hold $\mu$-almost everywhere ($\mu$-a.e.),
if there is a set $N \in \Sigma$ with $\mu(N) = 0$ ($N$ is a null set) such that $\Phi(q)$ holds whenever
$q \notin N$.
0.12. Theorem (on dominated convergence): Let $f$ and $f_1, f_2, \ldots$ be measurable
functions on $\Omega$ and suppose that $f$ is the pointwise limit of $f_n$ $\mu$-almost everywhere. If
there is an integrable function $g$ on $\Omega$ such that, for all $n \in \mathbb{N}$, $|f_n| \le g$ holds $\mu$-almost
everywhere, then $f$ as well as every $f_n$ is integrable and we have

$$\lim_{n \to \infty} \int f_n \, d\mu = \int f \, d\mu.$$

(In fact, the slightly stronger statement $\lim_{n \to \infty} \int |f_n - f| \, d\mu = 0$ is true.)
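
The domination hypothesis cannot simply be dropped. A standard counterexample (not taken from the notes), with $\mu$ the Lebesgue measure on $\Omega = [0, 1]$, reads as follows:

```latex
% Sketch: pointwise convergence without an integrable dominating function
f_n := n \, \chi_{]0, 1/n[} \quad (n \in \mathbb{N}), \qquad
\lim_{n \to \infty} f_n(q) = 0 \ \text{ for every } q \in [0, 1], \qquad
\int f_n \, d\mu = n \cdot \frac{1}{n} = 1 \ \not\longrightarrow\ 0 = \int 0 \, d\mu .
```

Consequently, there can be no integrable function $g$ with $|f_n| \le g$ for all $n$, since otherwise the theorem would force the integrals to converge to $0$.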

Brief review of the construction of $L^p$-spaces: Let $(\Omega, \Sigma, \mu)$ be a measure space.
If $1 \le p < \infty$, we define $L^p(\Omega, \mu)$ as the vector space quotient of
$\mathcal{L}^p := \{f : \Omega \to \mathbb{C} \text{ measurable} \mid \int |f|^p \, d\mu < \infty\}$ modulo
$\mathcal{N} := \{f \text{ measurable} \mid f = 0\ \mu\text{-a.e.}\}$ and equip it with the norm
$\|\text{class of } f\|_p := (\int |f|^p \, d\mu)^{1/p}$. We obtain a Banach space, which in case of a compact
subset $\Omega \subseteq \mathbb{R}^d$ and $\mu$ the (restriction of) $d$-dimensional Lebesgue measure is the completion
of the space of continuous functions on $\Omega$ with respect to $\|\cdot\|_p$. (More generally, the
compactly supported continuous functions on a locally compact Hausdorff space $\Omega$ are
dense in $L^p(\Omega, \mu)$, if $\mu$ is a regular Borel measure.)
In case $p = \infty$ we define $\mathcal{L}^\infty(\Omega)$ to be the set of all measurable functions $f : \Omega \to \mathbb{C}$ that
are bounded $\mu$-a.e., i.e., there exists a set $N \in \Sigma$ with $\mu(N) = 0$ such that the restriction
$f|_{\Omega \setminus N}$ is bounded. On the quotient vector space $L^\infty(\Omega, \mu) := \mathcal{L}^\infty/\mathcal{N}$ we have the norm
$\|\text{class of } f\|_\infty := \inf\{\sup_{x \in \Omega \setminus N} |f(x)| \mid N \in \Sigma, \mu(N) = 0\}$.
Note that the choice $\Omega = \mathbb{N}$ with $\mu$ the counting measure yields the spaces $l^p(\mathbb{N})$ as special cases.

(C) The Banach space of bounded Borel functions

Let $\Omega \subseteq \mathbb{C}$ and denote by $B_b(\Omega)$ the vector space of bounded Borel measurable functions
$f : \Omega \to \mathbb{C}$. We obtain $B_b(\Omega)$ as a closed subspace of the Banach space of all bounded
functions on $\Omega$ equipped with the supremum norm $\|f\|_\infty := \sup_{x \in \Omega} |f(x)|$, since measurability
is preserved even under pointwise limits. The proof of the monotone pointwise approximation
of non-negative measurable functions shows, in fact, that a bounded measurable
function can be uniformly approximated by step functions (cf. [Els11, Kapitel III, Korollar
4.14(a)]), i.e., the step functions are dense in the Banach space $(B_b(\Omega), \|\cdot\|_\infty)$.
We state a technical lemma (cf. [Wer18, Lemma VII.1.5]) that will be useful in constructing
a measurable functional calculus for self-adjoint operators on Hilbert spaces. (A proof is
given in the appendix.)
0.13. Lemma: Let $\Omega \subset \mathbb{C}$ be compact and $(B_b(\Omega), \|\cdot\|_\infty)$ be the Banach space of bounded
Borel measurable functions $\Omega \to \mathbb{C}$. Suppose $U \subseteq B_b(\Omega)$ has the following properties:
(a) $C(\Omega) \subseteq U$,
(b) $f_n \in U$ $(n \in \mathbb{N})$, $\sup_{n \in \mathbb{N}} \|f_n\|_\infty < \infty$, and $f(t) := \lim_{n \to \infty} f_n(t)$ exists for every $t \in \Omega$
$\Longrightarrow f \in U$.
Then $U = B_b(\Omega)$.

(D) Signed and complex measures

0.14. Definition: Let $\Sigma$ be a σ-algebra on the set $\Omega$. A (finite) signed measure is a
σ-additive map $\mu : \Sigma \to \mathbb{R}$. A complex measure is a σ-additive map $\mu : \Sigma \to \mathbb{C}$.

In both cases, $\mu(\emptyset) = 0$ follows from σ-additivity, since we excluded the possibility of
infinite values in the above definition: $\mu(\emptyset) = \mu(\emptyset \cup \emptyset) = \mu(\emptyset) + \mu(\emptyset)$. A map $\mu : \Sigma \to \mathbb{C}$ is
a complex measure if and only if $\operatorname{Re} \mu$ and $\operatorname{Im} \mu$ are finite signed measures.
Let $\mu$ be a signed or complex measure. We define the variation $|\mu| : \Sigma \to [0, \infty[$ of $\mu$ by

$$|\mu|(A) := \sup\Big\{ \sum_{k=1}^{n} |\mu(E_k)| \ \Big|\ n \in \mathbb{N},\ E_1, \ldots, E_n \in \Sigma \text{ pairwise disjoint},\ A = E_1 \cup \ldots \cup E_n \Big\}.$$

The variation $|\mu|$ is a finite (positive) measure and the so-called Jordan decomposition holds
for a signed measure: There are finite (positive) measures $\mu_+$ and $\mu_-$ on $\Sigma$ (concentrated
on measurable subsets with $|\mu|$-null overlap only) such that $\mu = \mu_+ - \mu_-$ and $|\mu| = \mu_+ + \mu_-$.

Integrability of measurable functions $f$ with respect to a (finite) signed measure $\mu$ is defined
in terms of $|\mu|$-integrability and the integral is given by $\int f \, d\mu := \int f \, d\mu_+ - \int f \, d\mu_-$. For
a complex measure $\mu = \mu_1 + i\mu_2$ with real part $\mu_1$ and imaginary part $\mu_2$ the notion of
$\mu$-integrability is also defined in terms of $|\mu|$-integrability and the integral is then given by
$\int f \, d\mu := \int f \, d\mu_1 + i \int f \, d\mu_2$.

(E) Regular measures

0.15. Definition: Let $\Omega$ be a Hausdorff space. A Borel measure $\mu$ on $\Omega$ is regular, if
(i) $\mu(C) < \infty$ for every compact subset $C \subseteq \Omega$,
(ii) for every $A \in B(\Omega)$,

$$\mu(A) = \sup\{\mu(C) \mid C \subseteq A,\ C \text{ compact}\} = \inf\{\mu(O) \mid A \subseteq O,\ O \text{ open}\}.$$

A signed or complex Borel measure $\mu$ is called regular, if the variation $|\mu|$ is regular. We
denote by $M(\Omega)$ the vector space of all signed or complex regular Borel measures on $\Omega$.
0.16. Lemma: If $\Omega$ is a compact metric space or a complete separable metric space or an
open subset of $\mathbb{R}^d$, then every finite Borel measure on $\Omega$ is regular. The Lebesgue measure
is regular. (This result is included in Ulam’s theorem [Els11, Kapitel VIII, Satz 1.16].)

(F) Riesz representation theorem

Let $\Sigma$ be a sigma algebra on the set $\Omega$. The vector space of signed (or complex) measures
on $\Sigma$ becomes a Banach space when equipped with the variation norm $\|\mu\| := |\mu|(\Omega)$ (cf.
[Wer18, Abschnitt I.1, Beispiel (j), Seiten 22-24]). If $\Omega$ is a Hausdorff space and $\Sigma = B(\Omega)$,
then the set of signed (or complex) regular Borel measures $M(\Omega)$ is a closed subspace of
the former ([Els11, Kapitel VIII, Folgerung 2.22.a)]), hence $(M(\Omega), \|\cdot\|)$ is a Banach space.

0.17. Theorem (Riesz representation theorem): Let $\Omega$ be a compact metric space.
Then the normed dual $C(\Omega)'$ of $(C(\Omega), \|\cdot\|_\infty)$ is isometrically isomorphic to $(M(\Omega), \|\cdot\|)$
via the map $R : M(\Omega) \to C(\Omega)'$, given by

$$(R\mu)(f) := \int f \, d\mu.$$

For a proof we refer to [Wer18, Theorem II.2.5] (or [Els11, Kapitel VIII, §2] and [Bau01]
for more general variants of the theorem).

(G) Absolutely continuous functions

We introduce a notion that is stronger than plain continuity, weaker than Lipschitz continuity,
and provides the perfect setting for a general fundamental theorem of calculus.
0.18. Definition: A function $f : [a, b] \to \mathbb{C}$ is absolutely continuous, if it satisfies the
following: $\forall \varepsilon > 0$ $\exists \delta > 0$ such that for any $n \in \mathbb{N}$ and any finite sequence $[a_1, b_1], \ldots, [a_n, b_n]$ of
subintervals of $[a, b]$ with $a \le a_1 < b_1 \le a_2 < b_2 \le \ldots \le a_n < b_n \le b$ we have

$$\sum_{k=1}^{n} (b_k - a_k) < \delta \quad \Longrightarrow \quad \sum_{k=1}^{n} |f(b_k) - f(a_k)| < \varepsilon.$$

The Cantor function and $f : [0, 1] \to \mathbb{R}$, $f(0) := 0$, $f(x) := x \sin(1/x)$ $(x > 0)$, are
prominent examples of continuous functions which are not absolutely continuous. The
function $g : [0, 1] \to \mathbb{R}$, $g(x) := \sqrt{x}$, is absolutely continuous, but not Lipschitz continuous.
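
The failure of absolute continuity for $x \sin(1/x)$ can be made visible numerically. The sketch below uses the adjacent intervals between consecutive points where $\sin(1/x) = \pm 1$; this particular family of intervals and the helper names are choices made here for illustration. The total length of the intervals is small, while the sum of the increments of $f$ behaves like a partial sum of the harmonic series and can be made as large as desired.

```python
import math

def f(x):
    """The function x * sin(1/x), extended by f(0) = 0."""
    return x * math.sin(1.0 / x) if x > 0 else 0.0

def endpoint(k):
    """Points where sin(1/x) equals +1 or -1."""
    return 1.0 / ((k + 0.5) * math.pi)

def interval_family(K, N):
    """Use the intervals [endpoint(k+1), endpoint(k)] for k = K, ..., N-1.

    Returns the total length and the sum of |f(b_k) - f(a_k)|.
    """
    total_length = sum(endpoint(k) - endpoint(k + 1) for k in range(K, N))
    total_increment = sum(abs(f(endpoint(k)) - f(endpoint(k + 1)))
                          for k in range(K, N))
    return total_length, total_increment

length, increment = interval_family(100, 100_000)
```

Here `length` stays below 0.01, whereas `increment` is larger than 3; increasing the second argument makes the increment grow further while the length remains bounded by `endpoint(100)`.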
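0.19. Theorem (Fundamental theorem of calculus): A function $f : [a, b] \to \mathbb{C}$ is
absolutely continuous if and only if it is differentiable almost everywhere, $f'$ defines a
Lebesgue integrable function on $[a, b]$, and we have for every $t \in [a, b]$

$$f(t) = f(a) + \int_a^t f' \, d\mu,$$

where $\mu$ denotes the one-dimensional Lebesgue measure and $\int_a^t f' \, d\mu$ means $\int \chi_{[a,t]} f' \, d\mu$.
(The integral formula is an essential part of the characterization: the Cantor function is
differentiable almost everywhere with integrable derivative $0$, but it is not absolutely continuous.)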

We may even extend the formula of integration by parts to the case of absolutely continuous
functions (see [Els11, Kapitel VII, 4.16] for (a) and [Bog07, 5.8.43] for (b)).
0.20. Proposition (Integration by parts): (a) If $f$ and $g$ are absolutely continuous
functions $[a, b] \to \mathbb{C}$, then

$$\int_a^b f' g \, d\mu = f(b) g(b) - f(a) g(a) - \int_a^b f g' \, d\mu.$$

(b) Suppose $f$ and $g$ are functions on $\mathbb{R}$ that are absolutely continuous on bounded intervals
and such that $f, f' g, f g' \in L^1(\mathbb{R})$, then

$$\int_{-\infty}^{\infty} f' g \, d\mu = - \int_{-\infty}^{\infty} f g' \, d\mu.$$

(H) Image measures

If $\Sigma_j$ is a σ-algebra on $\Omega_j$ $(j = 1, 2)$ and the map $F : \Omega_1 \to \Omega_2$ has the property that
$F^{-1}(A_2) \in \Sigma_1$ whenever $A_2 \in \Sigma_2$ (this is the notion of $\Sigma_1$-$\Sigma_2$-measurability), then a measure
$\mu$ on $\Sigma_1$ can be “transported” to a measure $\nu$ on $\Sigma_2$ by setting $\nu(A_2) := \mu(F^{-1}(A_2))$. We
call $\nu$ the image measure of $\mu$ with respect to $F$ and write $\nu =: F(\mu)$.
0.21. Theorem: A $\Sigma_2$-measurable function $f : \Omega_2 \to \mathbb{C}$ is $F(\mu)$-integrable if and only
if the function $f \circ F : \Omega_1 \to \mathbb{C}$ is $\mu$-integrable. In this case we have the transformation
formula

$$\int_{\Omega_2} f \, dF(\mu) = \int_{\Omega_1} (f \circ F) \, d\mu.$$

(We refer to [Els11, Kapitel V, §3, Unterabschnitt 1] for a proof and to [Els11, Kapitel V,
§4] for further variants of transformation formulae.)
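
For a measure given by finitely many point masses, the image measure and the transformation formula reduce to finite sums. The following minimal sketch uses hypothetical data: a measure `mu` on the set {-2, ..., 2}, the map F(w) = w², and a sample integrand f.

```python
from collections import defaultdict

# Hypothetical discrete measure mu on Omega_1 = {-2, ..., 2}, given by point masses
mu = {-2: 0.1, -1: 0.2, 0: 0.4, 1: 0.2, 2: 0.1}

def F(w):
    """Measurable map Omega_1 -> Omega_2 = {0, 1, 4}."""
    return w * w

def image_measure(measure, F):
    """Point masses of nu = F(mu): nu({t}) = mu(F^{-1}({t}))."""
    nu = defaultdict(float)
    for w, mass in measure.items():
        nu[F(w)] += mass
    return dict(nu)

def integrate(g, measure):
    """Integral of g with respect to a measure consisting of point masses."""
    return sum(g(w) * mass for w, mass in measure.items())

def f(t):
    """Sample integrand on Omega_2."""
    return 3.0 * t + 1.0

nu = image_measure(mu, F)
lhs = integrate(f, nu)                      # integral of f with respect to F(mu)
rhs = integrate(lambda w: f(F(w)), mu)      # integral of f o F with respect to mu
```

Both sides agree up to rounding, as predicted by the transformation formula.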

1. The spectral theorem for bounded self-adjoint operators

1.1. The finite-dimensional case: If $H = \mathbb{C}^n$, equipped with the standard inner product,
and $T$ is a self-adjoint (or normal) operator on $\mathbb{C}^n$, then we know from linear algebra
that there is an orthonormal basis $B$ of $\mathbb{C}^n$ consisting of eigenvectors of $T$, i.e., the matrix
of $T$ with respect to the basis $B$ is diagonal and the eigenvalues of $T$, with their
multiplicities, occur as the entries along the diagonal. Identifying vectors $v$ in $\mathbb{C}^n$ with
functions $v : \{1, 2, \ldots, n\} \to \mathbb{C}$, $v_l := v(l)$ being the $l$-th component of $v$, a diagonal matrix
$D$ with diagonal entries $d(1), \ldots, d(n) \in \mathbb{C}$ acts on $v$ as a multiplication operator, because
$(D \cdot v)(l) = d(l) v(l)$ for $l = 1, \ldots, n$; thus, we may state that the unitary transformation
$U : \mathbb{C}^n \to \mathbb{C}^n$ associated with the orthonormal basis $B$ (used as column vectors of $U$) maps
$T$ to a multiplication operator $D$ on $l^2(\{1, \ldots, n\}) \cong \mathbb{C}^n$ via

$$D = U^{-1} T U. \tag{1.1}$$

In this finite-dimensional case, the spectrum $\sigma(T) = \{\lambda_1, \ldots, \lambda_m\}$ is the set of pairwise
distinct eigenvalues. Let $E_j$ denote the orthogonal projection onto the eigenspace of $\lambda_j$
$(j = 1, \ldots, m)$, then we may write the diagonal representation of $T$ in the abstract form

$$T = \sum_{j=1}^{m} \lambda_j E_j. \tag{1.2}$$

For any polynomial function $p$ on $\mathbb{C}$ we easily derive the following formula (using $E_j E_k = 0$,
if $j \ne k$, and $E_j^l = E_j$ for every $l \in \mathbb{N}$):

$$p(T) = \sum_{j=1}^{m} p(\lambda_j) E_j.$$

Nothing prevents us from using the scheme of this formula to define $f(T)$ for an arbitrary
function $f : \sigma(T) \to \mathbb{C}$, namely

$$f(T) := \sum_{j=1}^{m} f(\lambda_j) E_j.$$

As an example, it is straightforward to show that the above definition with $f = \exp$ agrees
with the usual power series definition of $e^T := \sum_{k=0}^{\infty} T^k/k!$, since $T^k = \sum_{j=1}^{m} \lambda_j^k E_j$.
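
The finite-dimensional formulas can be tested on a concrete matrix. The sketch below uses a hypothetical symmetric 2×2 example with eigenvalues 1 and 3, whose eigenprojections are written down by hand; it compares the spectral definition of exp(T) with a truncated power series. All helper names are illustrative.

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

# Hypothetical self-adjoint example with eigenvalues 1 and 3
T = [[2.0, 1.0], [1.0, 2.0]]
eigenvalues = [1.0, 3.0]
projections = [
    [[0.5, -0.5], [-0.5, 0.5]],   # orthogonal projection onto span{(1, -1)}
    [[0.5, 0.5], [0.5, 0.5]],     # orthogonal projection onto span{(1, 1)}
]

def apply_function(f):
    """f(T) := sum_j f(lambda_j) E_j."""
    result = [[0.0, 0.0], [0.0, 0.0]]
    for lam, E in zip(eigenvalues, projections):
        result = add(result, scale(f(lam), E))
    return result

def exp_power_series(A, terms=40):
    """Truncated power series sum_k A^k / k!."""
    identity = [[1.0, 0.0], [0.0, 1.0]]
    result, term = identity, identity
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, A))
        result = add(result, term)
    return result

reconstruction_error = max_abs_diff(apply_function(lambda t: t), T)
exp_error = max_abs_diff(apply_function(math.exp), exp_power_series(T))
```

Both errors are of the size of rounding errors: the first confirms the representation (1.2) for this example, the second the agreement of the two definitions of the exponential.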

1.2. The case of compact operators: If $T$ is a compact self-adjoint (or normal) operator
on the complex Hilbert space $H$, then the analogue of equation (1.2) holds with an infinite
sum, convergent in the operator norm ([Wer18, Korollar VI.3.3]), and a multiplication
operator analogue of the diagonal matrix representation (1.1) can be constructed via a
unitary map $U : l^2(S) \to H$, where $S$ is a set of cardinality equal to the Hilbert dimension
of $H$ (i.e., the cardinality of a, hence any, complete orthonormal system in $H$), such that
$(U^{-1} T U x)(s) = h(s)\, x(s)$ $(s \in S)$ holds with some $h \in l^\infty(S)$ and for every $x \in l^2(S)$. (We
give a sketch of the construction in the appendix.)
In the current section we discuss the unitary equivalence of a bounded self-adjoint operator
$T$ on a complex Hilbert space $H$ with a multiplication operator on an appropriate space
$L^2(\Omega, \mu)$. We will also discuss how to define $f(T)$ for a bounded Borel measurable function
$f$ on $\sigma(T)$ based on an integral representation generalizing (1.2).
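1.3. Lemma: Let the numerical range of $T \in L(H)$ be $W(T) := \{\langle Tx, x \rangle \mid \|x\| = 1\}$.
Then $W(T)$ is a bounded subset of $\mathbb{C}$ satisfying $\sigma(T) \subseteq \overline{W(T)}$.
Proof: Boundedness of $W(T)$ follows from $|\langle Tx, x \rangle| \le \|Tx\| \|x\| \le \|T\| \|x\|^2$. Let
$\lambda \in \mathbb{C} \setminus \overline{W(T)}$ and let $d$ denote the distance from $\lambda$ to the compact set $\overline{W(T)}$, hence $d > 0$. If
$x \in H$ with $\|x\| = 1$, then

$$0 < d \|x\| = d \le |\lambda - \langle Tx, x \rangle| = |\langle (\lambda - T)x, x \rangle| \le \|(\lambda - T)x\| \|x\| = \|(\lambda - T)x\|,$$

which proves injectivity of $\lambda - T$ and that the inverse $(\lambda - T)^{-1} : \operatorname{ran}(\lambda - T) \to H$ is
continuous with norm bounded by $1/d$; the estimate $d \|x\| \le \|(\lambda - T)x\|$ holds for all
$x \in H$ upon rescaling, which proves closedness of $\operatorname{ran}(\lambda - T)$ (cf. 0.2). We show that
$\operatorname{ran}(\lambda - T)$ is also dense, thus $\operatorname{ran}(\lambda - T) = H$ and $\lambda \in \rho(T)$, and the lemma will be proved.
If $\operatorname{ran}(\lambda - T)$ were not dense in $H$, then we would have some $x_0 \in \operatorname{ran}(\lambda - T)^\perp$ with $\|x_0\| = 1$ and
thus
$$0 = \langle (\lambda - T)x_0, x_0 \rangle = \lambda - \langle Tx_0, x_0 \rangle,$$
contradicting the fact $\lambda \notin W(T)$.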
Recalling Proposition 0.1 we immediately obtain for self-adjoint $T$ the relations $W(T) \subset \mathbb{R}$,
$\|T\| = \max\{|\lambda| \mid \lambda \in \overline{W(T)}\}$, and in combination with the lemma the following statement.
1.4. Corollary: If $T \in L(H)$ is self-adjoint, then $\sigma(T) \subset \mathbb{R}$, more precisely,

$$\sigma(T) \subseteq [m(T), M(T)],$$

where $m(T) := \inf\{\langle Tx, x \rangle \mid \|x\| = 1\}$ and $M(T) := \sup\{\langle Tx, x \rangle \mid \|x\| = 1\}$. In particular,
$\sigma(T) \subset [0, \infty[$ for a positive operator $T$.
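
The statement of the corollary can be illustrated by sampling the numerical range of a small self-adjoint matrix. The sketch below uses a hypothetical 2×2 example with eigenvalues 1 and 3, so that m(T) = 1 and M(T) = 3; the random unit vectors are generated with a fixed seed, and all helper names are illustrative.

```python
import math
import random

def inner(u, v):
    """Scalar product on C^n, conjugate-linear in the second argument."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def apply(T, x):
    return [sum(T[i][j] * x[j] for j in range(len(x))) for i in range(len(T))]

# Hypothetical self-adjoint example with eigenvalues 1 and 3
T = [[2.0, 1.0], [1.0, 2.0]]

random.seed(0)

def random_unit_vector():
    v = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(2)]
    norm = math.sqrt(sum(abs(a) ** 2 for a in v))
    return [a / norm for a in v]

# Sample points <Tx, x> of the numerical range W(T)
samples = [inner(apply(T, x), x)
           for x in (random_unit_vector() for _ in range(1000))]

smallest = min(s.real for s in samples)
largest = max(s.real for s in samples)
```

All sampled values are real up to rounding and lie in the interval [1, 3], which contains the spectrum {1, 3} of this example.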
Let $T \in L(H)$ be self-adjoint. A first task is the definition of $f(T) \in L(H)$ for any
function $f \in C(\sigma(T))$. This is very easy, if $f$ is a polynomial function $f(t) = \sum_{k=0}^{n} a_k t^k$,
since then the only reasonable choice is $f(T) := \sum_{k=0}^{n} a_k T^k$, with the convention $T^0 := I$.

For general $f \in C(\sigma(T))$, one may continuously extend $f$ to a function $\tilde{f}$ on the real
interval $[m(T), M(T)] \supset \sigma(T)$ (see footnote 1) and then use approximation by polynomials according to the
Weierstraß theorem (cf., e.g., [Wer18, Satz I.2.11]). This leads to the so-called continuous
functional calculus, which is summarized in the following theorem. Since this construction
is often contained already in standard introductory courses on functional analysis, we
repeat the technical details of its proof only in the appendix.
1.5. Theorem (Continuous functional calculus): If $T \in L(H)$ is self-adjoint, then
there is a unique map $\Phi : C(\sigma(T)) \to L(H)$ with the following properties:
(a) $\Phi(\mathrm{id}) = T$ and $\Phi(1) = I$,
(b) $\Phi$ is an involutive algebra homomorphism, i.e.,
• $\Phi$ is $\mathbb{C}$-linear,
• $\forall f, g \in C(\sigma(T))$: $\Phi(f \cdot g) = \Phi(f) \cdot \Phi(g)$,
• $\Phi(\bar{f}) = \Phi(f)^*$,
(c) $\Phi$ is continuous with respect to the norm $\|\cdot\|_\infty$ on $C(\sigma(T))$, in fact, isometric,
i.e., $\|\Phi(f)\| = \|f\|_\infty$.
We write $f(T)$ instead of $\Phi(f)$ and call $f \mapsto f(T)$ the continuous functional calculus of $T$.
The following list collects a few more properties of the continuous functional calculus.
1.6. Theorem: Let $T \in L(H)$ be self-adjoint and $f \in C(\sigma(T))$, then the following hold:
(i) $\|f(T)\| = \|f\|_\infty$,
(ii) $f \ge 0$ implies that $f(T)$ is positive,
(iii) $x \in H$ and $Tx = \lambda x$ implies $f(T)x = f(\lambda)x$,
(iv) $f(T)$ is normal; $f(T)$ is self-adjoint, if and only if $f$ is real-valued,
(v) the spectral mapping theorem: $\sigma(f(T)) = f(\sigma(T))$.
Moreover, $C^*(T) := \{f(T) \in L(H) \mid f \in C(\sigma(T))\}$ is a closed involutive commutative
subalgebra of $L(H)$.
Proof: (i): This is only a restatement of part (c) in the above theorem.
(ii): We may write f = g² with g ∈ C(σ(T)) and g ≥ 0. We get f(T) = g(T)g(T) and g(T)* = ḡ(T) = g(T), hence for any x ∈ H,

⟨f(T)x, x⟩ = ⟨g(T)x, g(T)*x⟩ = ⟨g(T)x, ḡ(T)x⟩ = ⟨g(T)x, g(T)x⟩ = ‖g(T)x‖² ≥ 0.

¹ With the Euclidean topology, R and any closed subset of R are normal topological spaces.
(iii): This is elementary if f is a polynomial, and extends to general continuous f by uniform polynomial approximation p_n → f (thanks to the Stone-Weierstraß theorem applied to the compact subset σ(T) ⊆ R; cf., e.g., [Wer18, Satz VIII.4.7]), since this implies p_n(T) → f(T) with convergence in the operator norm.
(iv): f(T)*f(T) = (f̄f)(T) = (ff̄)(T) = f(T)f(T)*; f̄ = f implies f(T)* = f̄(T) = f(T), and for the converse note that f(T) = f(T)* = f̄(T) in view of (i) yields ‖f − f̄‖∞ = ‖f(T) − f̄(T)‖ = 0, hence f = f̄.
(v): For the case of a polynomial function, this relation is elementary (and already established,
e.g., in course of the proof of Theorem 1.5; cf. Claim (i) in the proof given in the appendix).
Let µ ∈ C \ f(σ(T)). Then g := 1/(f − µ) ∈ C(σ(T)) and g · (f − µ) = (f − µ) · g = 1, hence

g(T)(f(T) − µ) = (f(T) − µ)g(T) = I,

which shows that µ ∈ ρ(f(T)); i.e., σ(f(T)) ⊆ f(σ(T)).


Let µ ∈ f(σ(T)). There is some λ ∈ σ(T) such that µ = f(λ) and we have to show that f(λ) ∈ σ(f(T)). For every n ∈ N choose a polynomial p_n with ‖f − p_n‖∞ ≤ 1/n (again thanks to the Stone-Weierstraß theorem). It follows that ‖f(T) − p_n(T)‖ ≤ 1/n as well as |p_n(λ) − f(λ)| ≤ 1/n. Since p_n is a polynomial, we have p_n(λ) ∈ σ(p_n(T)). By Proposition 0.6 there is some approximate eigenvector x_n ∈ H with ‖x_n‖ = 1 and ‖(p_n(T) − p_n(λ))x_n‖ ≤ 1/n. In summary, we obtain for every n ∈ N,

‖(f(T) − µ)x_n‖ = ‖(f(T) − p_n(T) + p_n(T) − p_n(λ) + p_n(λ) − f(λ))x_n‖
≤ ‖f(T) − p_n(T)‖ ‖x_n‖ + ‖(p_n(T) − p_n(λ))x_n‖ + |p_n(λ) − f(λ)| ‖x_n‖ ≤ 3/n,

where each of the three summands is at most 1/n. This means, again by Proposition 0.6, that µ ∈ σ(f(T)).


Since the map f ↦ f(T) = Φ(f) is multiplicative and involutive, C*(T) = Φ(C(σ(T))) is an involutive subalgebra of L(H) and commutativity follows from the same property of the pointwise multiplication in C(σ(T)). The closedness of C*(T) follows from the completeness of C(σ(T)) together with the fact that f ↦ f(T) = Φ(f) is an isometry: If (f_n(T))_{n∈N} is a sequence in C*(T) that converges in L(H), then it is a Cauchy sequence with respect to the operator norm; isometry implies that (f_n)_{n∈N} is a Cauchy sequence in C(σ(T)); hence there is some f ∈ C(σ(T)) such that f_n → f uniformly and thus f_n(T) → f(T) ∈ C*(T).
1.7. Remark: The notation for the closed involutive abelian algebra C*(T) shall indicate the fact that it is exactly the C*-subalgebra of L(H) generated by the normal element T and I. Against the background of this “Banach algebra point of view”, the above continuous functional calculus appears as a special case of isometric embeddings (Gelfand transforms) or isomorphisms of abstract commutative Banach or C*-algebras (cf. [Con10, Chapter VIII] or [Wer18, Korollar IX.3.8]).

In the finite-dimensional case with spectral representation (1.2) the spectrum σ(T) = {λ_1, …, λ_m} is a discrete topological space and the corresponding projections E_1, …, E_m onto the eigenspaces can be obtained via functional calculus: Let f : σ(T) → C be defined by f(λ_j) = 0, if j ≠ k, and f(λ_k) = 1, then $E_k = \sum_{j=1}^{m} f(\lambda_j) E_j = f(T)$. Note that f = χ_{λ_k}, the characteristic function of the singleton {λ_k}. In the general case, σ(T) will not be discrete and hence a characteristic function χ_A, A ⊂ σ(T), will not be continuous. If we are able to extend the functional calculus appropriately, then χ_A(T) is a perfect way to produce an orthogonal projection, since χ_A² = χ_A and χ_A is real-valued. From characteristic functions we get to step functions by linear combinations and then pointwise limits would bring us into the realm of measurable functions.
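As a small sanity check of the finite-dimensional remark, the following sketch computes χ_{λ_k}(T) for a hypothetical 2×2 symmetric matrix. It uses the assumption, valid on a finite spectrum, that the characteristic function of a singleton coincides on σ(T) with a Lagrange interpolation polynomial; all names are illustrative.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

I2 = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
T = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(2)]]   # eigenvalues 1 and 3
spectrum = [Fraction(1), Fraction(3)]

def chi_of_T(k):
    """chi_{lambda_k}(T), computed as the product over j != k of (T - lambda_j)/(lambda_k - lambda_j)."""
    E = I2
    for j, mu in enumerate(spectrum):
        if j != k:
            E = matmul(E, scale(1 / (spectrum[k] - mu), add(T, scale(-mu, I2))))
    return E

E1, E3 = chi_of_T(0), chi_of_T(1)       # projections onto the two eigenspaces
zero = [[0, 0], [0, 0]]
```

The resulting matrices are idempotent, mutually orthogonal, sum to I, and reproduce T as 1·E1 + 3·E3.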

To investigate an extension of the continuous functional calculus for a self-adjoint operator T ∈ L(H), let x, y ∈ H and write for any f ∈ C(σ(T)),

l_{x,y}(f) := ⟨f(T)x, y⟩.

The map l_{x,y} : C(σ(T)) → C is linear and

|l_{x,y}(f)| ≤ ‖f(T)‖ ‖x‖ ‖y‖ = ‖f‖∞ ‖x‖ ‖y‖,

thus, l_{x,y} is an element in the dual space of (C(σ(T)), ‖·‖∞) with ‖l_{x,y}‖ ≤ ‖x‖ ‖y‖. By the Riesz representation theorem (see Theorem 0.17) there is a complex Borel measure µ_{x,y} on σ(T) with ‖µ_{x,y}‖ = ‖l_{x,y}‖, such that

(1.3) $\langle f(T)x, y\rangle = \int_{\sigma(T)} f \, d\mu_{x,y} \quad \forall f \in C(\sigma(T))$.

Now we observe that the right-hand side of (1.3) makes sense also for f ∈ B_b(σ(T)) and, considering its dependence on (x, y), defines a continuous sesquilinear form b_f : H × H → C, since (x, y) ↦ µ_{x,y} clearly is sesquilinear H × H → M(σ(T)) as seen from the defining Equation (1.3) and

(1.4) $|b_f(x, y)| = \Big|\int_{\sigma(T)} f \, d\mu_{x,y}\Big| \le \|f\|_\infty \|\mu_{x,y}\| = \|f\|_\infty \|l_{x,y}\| \le \|f\|_\infty \|x\| \|y\|$.

By the Lax-Milgram theorem (Theorem 0.3(b)), b_f defines an operator f(T) ∈ L(H) such that

(1.5) ⟨f(T)x, y⟩ = b_f(x, y) ∀x, y ∈ H.

Thereby the main ingredient for the measurable functional calculus has been constructed, namely a map B_b(σ(T)) → L(H) given by f ↦ f(T).

1.8. Theorem (Measurable functional calculus): Let T ∈ L(H) be self-adjoint, then there is a unique map Φ̂ : B_b(σ(T)) → L(H) with the following properties (we will typically write f(T) to mean Φ̂(f)):

(a) Φ̂ is an involutive algebra homomorphism that extends the continuous functional calculus Φ, i.e., Φ̂|_{C(σ(T))} = Φ, and is ‖·‖∞-continuous, more precisely, ‖f(T)‖ ≤ ‖f‖∞,

(b) if f_n ∈ B_b(σ(T)) (n ∈ N) is such that sup_{n∈N} ‖f_n‖∞ < ∞ and f_n(t) → f(t) pointwise for t ∈ σ(T), then f_n(T)x → f(T)x for all x ∈ H.
Proof: Uniqueness: Suppose Ψ̂ is another measurable functional calculus satisfying (a) and (b). Define U := {f ∈ B_b(σ(T)) | Ψ̂(f) = Φ̂(f)}. Then C(σ(T)) ⊆ U and U satisfies also condition (b) in Lemma 0.13, hence U = B_b(σ(T)), which proves Ψ̂ = Φ̂.

Existence: If f ∈ B_b(σ(T)), then we define Φ̂(f) := f(T) ∈ L(H) as above via the sesquilinear form b_f, based on the complex Borel measures µ_{x,y} ∈ M(σ(T)) for all x, y ∈ H, by the property in (1.5). Linearity of Φ̂ : B_b(σ(T)) → L(H) is clear from (1.3) and we have |⟨f(T)x, y⟩| = |b_f(x, y)| ≤ ‖f‖∞ ‖x‖ ‖y‖ by (1.4), hence ‖f(T)‖ ≤ ‖f‖∞, which shows continuity of Φ̂.

We easily obtain an intermediate, weaker, variant of (b): If x, y ∈ H are arbitrary and f_n (n ∈ N), f are as in the premise of (b), then by the theorem on dominated convergence

(1.6) $\langle f_n(T)x, y\rangle = \int f_n \, d\mu_{x,y} \to \int f \, d\mu_{x,y} = \langle f(T)x, y\rangle \quad (n \to \infty)$,

i.e., f_n(T)x → f(T)x weakly in H for every x.

We will first show that Φ̂ is multiplicative and involutive before getting back to the improvement of (1.6) establishing (b).

We already know that Φ̂(fg) = Φ̂(f)Φ̂(g), if both f and g are continuous.
Let g be continuous and put U := {f ∈ B_b(σ(T)) | Φ̂(fg) = Φ̂(f)Φ̂(g)}, then we clearly have C(σ(T)) ⊆ U. Suppose that the sequence f_n ∈ U (n ∈ N) is uniformly bounded and pointwise convergent to f ∈ B_b(σ(T)). Then by (1.6),

$\langle \hat\Phi(f)(\hat\Phi(g)x), y\rangle = \lim_{n\to\infty} \langle \hat\Phi(f_n)(\hat\Phi(g)x), y\rangle = \lim_{n\to\infty} \langle \hat\Phi(f_n g)x, y\rangle = \langle \hat\Phi(fg)x, y\rangle \quad \forall x, y \in H$,

which implies Φ̂(f)Φ̂(g) = Φ̂(fg), hence f ∈ U and Lemma 0.13 yields U = B_b(σ(T)).
Now let f be measurable and bounded and put V := {g ∈ B_b(σ(T)) | Φ̂(fg) = Φ̂(f)Φ̂(g)}. We have shown above that C(σ(T)) ⊆ V, and, employing Lemma 0.13 in the same way again (strictly speaking, upon rewriting ⟨Φ̂(f)(Φ̂(g)x), y⟩ = ⟨Φ̂(g)x, Φ̂(f)*y⟩ and then using g_n → g), we find V = B_b(σ(T)), thus, multiplicativity of Φ̂.

A similar technique is used to show Φ̂(f)* = Φ̂(f̄), which clearly holds for continuous functions: Put U := {f ∈ B_b(σ(T)) | Φ̂(f)* = Φ̂(f̄)} and note that C(σ(T)) ⊆ U. If (f_n) is a uniformly bounded sequence in U and converges pointwise to f, then

$\langle \hat\Phi(f)^* x, y\rangle = \langle x, \hat\Phi(f) y\rangle = \overline{\langle \hat\Phi(f) y, x\rangle} = \lim_{n\to\infty} \overline{\langle \hat\Phi(f_n) y, x\rangle} = \lim_{n\to\infty} \langle \hat\Phi(\bar f_n) x, y\rangle = \langle \hat\Phi(\bar f) x, y\rangle$

holds for arbitrary x, y ∈ H and hence f ∈ U; thus, U = B_b(σ(T)) again by Lemma 0.13, i.e., Φ̂ is involutive.

Finally, we prove (b): Let f_n (n ∈ N), f be as in the premise of (b) and x ∈ H. By (1.6) we have f_n(T)x → f(T)x weakly in H. By the properties of Φ̂ we obtain in addition

‖f_n(T)x‖² = ⟨f_n(T)x, f_n(T)x⟩ = ⟨f_n(T)*f_n(T)x, x⟩ = ⟨(f̄_n f_n)(T)x, x⟩
→ ⟨(f̄f)(T)x, x⟩ = ⟨f(T)*f(T)x, x⟩ = ‖f(T)x‖² (n → ∞).

We may therefore conclude that f_n(T)x → f(T)x in H, since ‖f_n(T)x − f(T)x‖² = ‖f_n(T)x‖² − 2 Re⟨f_n(T)x, f(T)x⟩ + ‖f(T)x‖² → ‖f(T)x‖² − 2 Re⟨f(T)x, f(T)x⟩ + ‖f(T)x‖² = 0.

1.9. Remark: While Φ is injective, since ‖Φ(f)‖ = ‖f‖∞, its extension Φ̂ is in general not. For example, we will prove later in this section that for any λ ∈ σ(T), Φ̂(χ_{λ}) ≠ 0 if and only if λ is an eigenvalue.

As a first application of the measurable functional calculus we investigate the orthogonal projections obtained from characteristic functions.

1.10. Lemma: Let T ∈ L(H) be self-adjoint, then the following hold:

(i) for every Borel set A ⊆ σ(T), χ_A(T) is an orthogonal projection,

(ii) χ_∅(T) = 0 and χ_{σ(T)}(T) = I,

(iii) if A_1, A_2, … are pairwise disjoint Borel subsets of σ(T) and x ∈ H, then we have with $A := \bigcup_{j=1}^{\infty} A_j$

$\sum_{j=1}^{\infty} \chi_{A_j}(T)x = \chi_A(T)x$,

(iv) χ_A(T)χ_B(T) = χ_{A∩B}(T) for Borel sets A, B ⊆ σ(T).

Proof: (i) follows from χ_A² = χ_A and χ̄_A = χ_A.

(ii): χ_∅ = 0 ∈ C(σ(T)) and Φ(0) = 0 (by linearity); χ_{σ(T)} = 1 in C(σ(T)) and Φ(1) = I.

(iii): Put $f_n := \sum_{j=1}^{n} \chi_{A_j}$, f := χ_A and apply Theorem 1.8(b).

(iv) follows from χ_A χ_B = χ_{A∩B}.

1.11. Remark: In general, there is no operator norm convergent analogue of (iii), since the projection χ_{A_j}(T) has operator norm 1, unless it is the zero operator (corresponding to the case µ_{x,y}(A_j) = 0 for all x, y ∈ H).

The previous lemma opens the door to so-called projection-valued measures on the Borel sigma algebra B(R) by considering the map E : B(R) → L(H), A ↦ χ_{A∩σ(T)}(T), which is an example of the following concept.

1.12. Definition: A map E : B(R) → L(H), A ↦ E_A is called a spectral measure, if every E_A is an orthogonal projection and

(i) E_∅ = 0, E_R = I,

(ii) if A_1, A_2, … are pairwise disjoint Borel sets in R, $A := \bigcup_{j=1}^{\infty} A_j$, and x ∈ H, then

$\sum_{j=1}^{\infty} E_{A_j} x = E_A x$.

A spectral measure E has compact support, if there is a compact subset K ⊂ R such that E_K = I. It is an exercise to deduce E_A E_B = E_{A∩B} = E_B E_A and monotonicity, i.e., E_A ≤ E_B (in the sense that E_B − E_A is a positive operator), if A ⊆ B.
1.13. Integration with respect to a spectral measure: Let E be a spectral measure. We first define the integral of a step function $h = \sum_{j=1}^{m} \alpha_j \chi_{A_j}$ on R simply by

$\int h \, dE := \sum_{j=1}^{m} \alpha_j E_{A_j}$

and note that this is independent of the representation of h: If $\sum_{j=1}^{m} \alpha_j \chi_{A_j} = \sum_{l=1}^{n} \beta_l \chi_{B_l}$, we may suppose that both A_1, …, A_m and B_1, …, B_n are partitions of R (by possibly adding terms with coefficient 0 in the sum representations). Then $E_{A_j} = \sum_{l=1}^{n} E_{A_j \cap B_l}$, $E_{B_l} = \sum_{j=1}^{m} E_{A_j \cap B_l}$ and α_j = β_l, if A_j ∩ B_l ≠ ∅, therefore

$\sum_{j=1}^{m} \alpha_j E_{A_j} = \sum_{1 \le j \le m,\ 1 \le l \le n} \alpha_j E_{A_j \cap B_l} = \sum_{l=1}^{n} \beta_l E_{B_l}$.

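We claim that ‖∫ h dE‖ ≤ ‖h‖∞. In fact, if ‖x‖ ≤ 1 and $h = \sum_{j=1}^{m} \alpha_j \chi_{A_j}$, then (since the E_{A_j}x are pairwise orthogonal)

$\Big\| \Big(\int h \, dE\Big) x \Big\|^2 = \Big\| \sum_{j=1}^{m} \alpha_j E_{A_j} x \Big\|^2 = \sum_{j=1}^{m} |\alpha_j|^2 \|E_{A_j} x\|^2 \le \Big(\max_{j=1,\dots,m} |\alpha_j|\Big)^2 \sum_{j=1}^{m} \|E_{A_j} x\|^2$
$= \|h\|_\infty^2 \Big\| \sum_{j=1}^{m} E_{A_j} x \Big\|^2 = \|h\|_\infty^2 \big\| E_{\bigcup_{j=1}^{m} A_j} x \big\|^2 \le \|h\|_\infty^2 \|x\|^2$.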

Let f ∈ B_b(R), then by density of the step functions, there is a uniformly convergent sequence of step functions h_n → f. We have

$\Big\| \int h_n \, dE - \int h_m \, dE \Big\| = \Big\| \int (h_n - h_m) \, dE \Big\| \le \|h_n - h_m\|_\infty$,

hence (∫ h_n dE)_{n∈N} is a Cauchy sequence in L(H) and possesses a limit; furthermore, this limit is independent of the choice of approximating sequence (h_n), since any mix of such sequences yields a Cauchy sequence of integrals, which cannot have a different accumulation point. Therefore, we may define

$\int f \, dE := \lim_{n\to\infty} \int h_n \, dE$.

Notations such as ∫ f dE, ∫ f(λ) dE_λ, and ∫_R f dE will be used in the sequel. The map f ↦ ∫ f dE clearly satisfies ∫ f̄ dE = (∫ f dE)*, is linear B_b(R) → L(H), and continuous, since

$\Big\| \int f \, dE \Big\| \le \|f\|_\infty$.
Furthermore, a real-valued function f ∈ B_b(R) gives a self-adjoint operator ∫ f dE ∈ L(H).

If E has compact support, say E_K = I for a compact set K ⊂ R, and f ∈ B_b(K), then we define ∫_K f dE := ∫ χ_K f dE (formally, upon extending f to R, e.g., by the value 0 outside K); we will often abuse notation by still writing ∫ f dE in this case. The choice of K with E_K = I is not essential, since A ∈ B(R) with A ∩ K = ∅ implies E_A = E_A I = E_A E_K = E_{A∩K} = 0 and hence ∫ χ_A f dE = 0.

The measurable functional calculus for a self-adjoint operator T ∈ L(H) provided us with the means to define a compactly supported spectral measure E associated with T by E_A := χ_{A∩σ(T)}(T) (A ∈ B(R)).
Conversely, if a compactly supported spectral measure E is given, say E_K = I with K ⊂ R compact, then any f ∈ C(K) is bounded and Borel measurable. Hence T := ∫ λ dE_λ ∈ L(H) is defined and self-adjoint, since id_K is real-valued. We will show that the spectral measure of T is exactly E and that the integrals with respect to E give the (uniquely determined) measurable functional calculus of T.
1.14. Proposition: Let E be a compactly supported spectral measure and T = ∫ λ dE_λ. Then E_{σ(T)} = I, T is self-adjoint, and its measurable functional calculus is given by

$\Psi : B_b(\sigma(T)) \to L(H), \quad f \mapsto \int_{\sigma(T)} f \, dE$.

In particular, we may conclude that

$\forall A \in B(\sigma(T)) : \quad \chi_A(T) = \Psi(\chi_A) = \int \chi_A \, dE = E_A$.

Proof: We first prove condition (a) of Theorem 1.8: We already know that Ψ is involutive, linear, and continuous. Multiplicativity is clear in case of step functions, since Ψ(χ_A)Ψ(χ_B) = E_A E_B = E_{A∩B} = Ψ(χ_{A∩B}) = Ψ(χ_A χ_B), and follows in general by a routine argument employing uniform approximation by step functions. We clearly have Ψ(id) = T and, by multiplicativity, Ψ(id^n) = T^n for n ∈ N. It remains to show Ψ(1) = I, then we have that Ψ coincides with the continuous functional calculus on polynomials, hence on all of C(σ(T)).
Note that Ψ(1) = ∫_{σ(T)} 1 dE = E_{σ(T)}, thus we need to show that E_{σ(T)} = I. In fact, we will show E_{ρ(T)∩R} = 0, which implies E_{σ(T)} = E_{σ(T)} + E_{ρ(T)∩R} = E_{(σ(T)∪ρ(T))∩R} = E_R = I.
Here, and also in the sequel, we will simplify notation by writing E_A instead of E_{A∩R}, if A ⊆ C is a Borel set.
Choose a < b such that E_{]a,b]} = I, i.e., E_K = I for some compact subset K of ]a, b]. Let µ ∈ ρ(T). If µ ∉ ]a, b], then choosing an open neighborhood U of µ with U ∩ K = ∅ yields E_U = 0.
It remains to consider the case µ ∈ ]a, b]. Recall that the set of invertible operators is open in L(H) and that inversion is a continuous map on that set. Therefore, given C := ‖(µ − T)^{-1}‖ + 1, there is δ > 0 such that

∀S ∈ L(H) : ‖S − (µ − T)‖ ≤ δ ⇒ S is invertible and ‖S^{-1}‖ ≤ C.
We may suppose that δ = (b − a)/N for some N ∈ N and that δ < 1/C.

Put a_k := a + kδ (k = 0, …, N), A_k := ]a_{k−1}, a_k] (k = 1, …, N), and consider the step function $h := \sum_{k=1}^{N} a_k \chi_{A_k}$. Then |λ − h(λ)| ≤ δ for all λ ∈ ]a, b]. Note that $\sum_{k=1}^{N} E_{A_k} = E_{]a,b]} = I$ and E_{A_k}E_{A_l} = E_{A_k∩A_l} = δ_{kl}E_{A_k}. Writing $\mu I = \sum_{k=1}^{N} \mu E_{A_k}$ gives

$\Big\| \sum_{k=1}^{N} (\mu - a_k) E_{A_k} - (\mu - T) \Big\| = \Big\| T - \sum_{k=1}^{N} a_k E_{A_k} \Big\| = \Big\| T - \int h \, dE \Big\| \le \sup_{a < \lambda \le b} |\lambda - h(\lambda)| \le \delta$,

hence $B := \sum_{k=1}^{N} (\mu - a_k) E_{A_k}$ is invertible and ‖B^{-1}‖ ≤ C. Denoting by Σ′ the sum only over those k ∈ {1, …, N} such that E_{A_k} ≠ 0 we may write B = Σ′ (µ − a_k)E_{A_k} and B^{-1} = Σ′ (µ − a_k)^{-1}E_{A_k}. Therefore,

$C \ge \|B^{-1}\| = \max\Big\{ \frac{1}{|\mu - a_k|} \;\Big|\; 1 \le k \le N \text{ and } E_{A_k} \ne 0 \Big\}$.

Since µ ∈ ]a, b] = A_1 ∪ … ∪ A_N there is some k such that µ ∈ A_k = ]a_{k−1}, a_k]. Therefore, |µ − a_k| ≤ |a_{k−1} − a_k| = δ < 1/C, which implies E_{A_k} = 0. If µ < a_k, then U := ]a_{k−1}, a_k[ is an open neighborhood of µ such that E_U = 0. If µ = a_k, then |µ − a_{k+1}| = δ < 1/C and also E_{A_{k+1}} = 0, so that U := ]a_{k−1}, a_{k+1}[ is an open neighborhood of µ with E_U = 0.

To summarize, we found for any µ ∈ ρ(T) an open neighborhood U of µ such that E_U = 0.

If M ⊂ ρ(T) is compact, we can cover M by neighborhoods U(µ) (µ ∈ M) constructed as above with E_{U(µ)} = 0. By compactness, finitely many U(µ_1), …, U(µ_m) suffice, hence (using footnote 2 for the last inequality) $0 \le E_M \le E_{\bigcup_{j=1}^{m} U(\mu_j)} \le \sum_{j=1}^{m} E_{U(\mu_j)} = 0$, hence E_M = 0. For arbitrary x ∈ H, the finite positive Borel measure A ↦ ⟨E_A x, x⟩, B(R) → [0, ∞[ is regular due to Lemma 0.16; it vanishes on compact subsets of ρ(T) and we therefore have ⟨E_{ρ(T)}x, x⟩ = 0; since x was arbitrary and E_{ρ(T)} is self-adjoint, we conclude from Proposition 0.1(ii) that ‖E_{ρ(T)}‖ = sup_{‖x‖≤1} |⟨E_{ρ(T)}x, x⟩| = 0, hence E_{ρ(T)} = 0.

We finally show that Ψ satisfies also condition (b) of Theorem 1.8: It suffices to show the weaker
variant (1.6), since the final argument in the proof of Theorem 1.8 could be repeated here with
Ψ(fn ) in place of fn (T ).

Let x, y ∈ H and denote by ν_{x,y} the complex measure on B(σ(T)) given by A ↦ ⟨E_A x, y⟩. Then for any step function $h = \sum_{j=1}^{m} \alpha_j \chi_{A_j}$ we have, using ⟨E_{A_j}x, y⟩ = ν_{x,y}(A_j),

$\langle \Psi(h)x, y\rangle = \Big\langle \Big(\int h \, dE\Big) x, y \Big\rangle = \Big\langle \sum_{j=1}^{m} \alpha_j E_{A_j} x, y \Big\rangle = \sum_{j=1}^{m} \alpha_j \langle E_{A_j} x, y\rangle = \int h \, d\nu_{x,y}$,

and for g ∈ B_b(σ(T)) we apply uniform approximation by step functions and pass to the limit to obtain

$\langle \Psi(g)x, y\rangle = \int g \, d\nu_{x,y}$.

² E_{A∪B} = E_{A\B} + E_B ≤ E_A + E_B and similarly for finitely many terms.
If (f_n) and f are as in the premise of condition (b), then (1.6) follows from the theorem on dominated convergence, since

$\langle \Psi(f_n)x, y\rangle = \int f_n \, d\nu_{x,y} \to \int f \, d\nu_{x,y} = \langle \Psi(f)x, y\rangle \quad (n \to \infty)$.

Let T ∈ L(H) be self-adjoint and let E denote the spectral measure associated with T, i.e., E_A := χ_{A∩σ(T)}(T) (A ∈ B(R)), so that E_{σ(T)} = χ_{σ(T)}(T) = I, and define S := ∫_{σ(T)} λ dE_λ.
We claim that S = T.
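Let ε > 0 and h be a step function on σ(T) such that ‖id − h‖∞ ≤ ε/2; suppose $h = \sum_{j=1}^{m} \alpha_j \chi_{A_j}$ (with A_j ∈ B(σ(T)) pairwise disjoint). Denote by f(T) the functional calculus of T and by Ψ(f) that of S (according to the above proposition), then

$\|T - S\| \le \|T - h(T)\| + \|h(T) - \Psi(h)\| + \|\Psi(h) - S\|$
$= \|(\mathrm{id} - h)(T)\| + \Big\| \sum_{j=1}^{m} \alpha_j \chi_{A_j}(T) - \int_{\sigma(T)} h(\lambda) \, dE_\lambda \Big\| + \Big\| \int_{\sigma(T)} h(\lambda) \, dE_\lambda - \int_{\sigma(T)} \lambda \, dE_\lambda \Big\|$
$= \|(\mathrm{id} - h)(T)\| + \Big\| \sum_{j=1}^{m} \alpha_j \chi_{A_j}(T) - \sum_{j=1}^{m} \alpha_j E_{A_j} \Big\| + \Big\| \int_{\sigma(T)} (h - \mathrm{id}) \, dE \Big\|$
$\le \|\mathrm{id} - h\|_\infty + \Big\| \sum_{j=1}^{m} \alpha_j (\chi_{A_j}(T) - E_{A_j}) \Big\| + \|h - \mathrm{id}\|_\infty \le \frac{\varepsilon}{2} + 0 + \frac{\varepsilon}{2} = \varepsilon$,

where the middle term vanishes because χ_{A_j}(T) = E_{A_j} by the definition of E. Since ε > 0 was arbitrary, S = T.

Thus, we have proved the following statement.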


1.15. Theorem (Spectral theorem for self-adjoint bounded operators): Let T ∈ L(H) be self-adjoint, then there exists a unique spectral measure E with compact support on R such that

$T = \int_{\sigma(T)} \lambda \, dE_\lambda$.

The map f ↦ ∫_{σ(T)} f dE, B_b(σ(T)) → L(H), defines the measurable functional calculus of T, and f(T) is determined by the measures µ_{x,y} ∈ M(σ(T)) (x, y ∈ H), µ_{x,y}(A) = ⟨E_A x, y⟩, via

$\langle f(T)x, y\rangle = \int_{\sigma(T)} f \, d\mu_{x,y} =: \int_{\sigma(T)} f(\lambda) \, d\langle E_\lambda x, y\rangle \quad \forall x, y \in H$.

1.16. The finite-dimensional case revisited: Let H = C^n, T self-adjoint on H with σ(T) = {λ_1, …, λ_m}, and let E_1, …, E_m denote the projections onto the eigenspaces. Then B(σ(T)) = P(σ(T)) and putting for any A ⊆ σ(T)

$E_A := \sum_{\{j \in \{1,\dots,m\} \mid \lambda_j \in A\}} E_j$

defines the spectral measure of T, since $E_{\sigma(T)} = \sum_{j=1}^{m} E_j = I$ and

$T = \sum_{j=1}^{m} \lambda_j E_j = \sum_{j=1}^{m} \lambda_j E_{\{\lambda_j\}} = \int_{\sigma(T)} \lambda \, dE_\lambda$.
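The finite-dimensional spectral measure can be spelled out in a few lines. The sketch below is only an illustration under simplifying assumptions: T is taken to be the hypothetical diagonal operator diag(1, 1, 4) on C³, so that all projections are diagonal and can be stored by their diagonals; the function names are made up for this example.

```python
# Eigenprojections of the hypothetical operator T = diag(1, 1, 4), stored by their diagonals
E = {1: (1, 1, 0), 4: (0, 0, 1)}
T_diag = (1, 1, 4)

def spectral_measure(A):
    """E_A := sum of the E_j over all eigenvalues lambda_j contained in the set A."""
    diag = [0, 0, 0]
    for lam, proj in E.items():
        if lam in A:
            diag = [d + p for d, p in zip(diag, proj)]
    return tuple(diag)

def product(P, Q):
    """Product of two diagonal matrices given by their diagonals."""
    return tuple(p * q for p, q in zip(P, Q))

def integral(f):
    """int f dE = sum_j f(lambda_j) E_{{lambda_j}}, again returned as a diagonal."""
    diag = [0, 0, 0]
    for lam in E:
        P = spectral_measure({lam})
        diag = [d + f(lam) * p for d, p in zip(diag, P)]
    return tuple(diag)

E_empty = spectral_measure(set())
E_all = spectral_measure({1, 4})
T_from_integral = integral(lambda t: t)
T_squared = integral(lambda t: t ** 2)
```

Here the integral of the identity function reproduces T, and E_A E_B = E_{A∩B} can be checked directly on subsets of the spectrum.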

1.17. Example: Let H = L²([0, 1]) and T : L²([0, 1]) → L²([0, 1]) be the multiplication operator (Tx)(t) = t x(t) (t ∈ [0, 1], x ∈ L²([0, 1])). Clearly, T is self-adjoint and T − λ possesses a continuous inverse, namely, ((T − λ)^{-1}x)(t) = x(t)/(t − λ), if and only if λ ∈ C \ [0, 1], hence σ(T) = [0, 1]. It is an exercise to show that T has no eigenvalues, thus, by Lemma 0.5 we have σ(T) = σ_c(T) = [0, 1]. Let x, y ∈ L²([0, 1]), then

$\langle Tx, y\rangle = \int_0^1 \lambda \, x(\lambda) \overline{y(\lambda)} \, d\lambda = \int_{\sigma(T)} \lambda \, x(\lambda) \overline{y(\lambda)} \, d\lambda$,

which suggests that µ_{x,y} is Lebesgue measure on [0, 1] with density function xȳ, i.e., that the spectral measure E is defined by (E_A x)(t) := χ_{A∩[0,1]}(t)x(t) (A ∈ B(R)). Indeed³, $\langle E_A x, y\rangle = \int_0^1 \chi_{A\cap[0,1]}(\lambda) x(\lambda) \overline{y(\lambda)} \, d\lambda = \int_{A\cap[0,1]} x(\lambda) \overline{y(\lambda)} \, d\lambda$ and

$\int_{\sigma(T)} \lambda \, d\langle E_\lambda x, y\rangle = \int_{\sigma(T)} \mathrm{id} \, d\mu_{x,y} = \langle Tx, y\rangle = \int_0^1 \lambda \, x(\lambda) \overline{y(\lambda)} \, d\lambda$.
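The multiplication operator lends itself to a rough numerical illustration. The following sketch makes several assumptions not contained in the notes: inner products are approximated by a midpoint rule on a uniform grid, the test functions x(t) = t and y(t) = 1 and the set A = [0, 1/2] are arbitrary choices, and the step function h uses right endpoints of a partition as in the proof of Proposition 1.14.

```python
import math

N = 1000                                    # number of quadrature cells on [0, 1]
grid = [(k + 0.5) / N for k in range(N)]    # cell midpoints

def inner(u, v):
    """Midpoint-rule approximation of <u, v> = int_0^1 u(t) conj(v(t)) dt."""
    return sum(u(t) * complex(v(t)).conjugate() for t in grid) / N

def T(u):
    """Multiplication operator (T u)(t) = t u(t)."""
    return lambda t: t * u(t)

def E(indicator):
    """Candidate spectral projection E_A: multiplication by the indicator function of A."""
    return lambda u: (lambda t: indicator(t) * u(t))

x = lambda t: t
y = lambda t: 1.0
chi_A = lambda t: 1.0 if 0.0 <= t <= 0.5 else 0.0   # A = [0, 1/2]

mu_xy_A = inner(E(chi_A)(x), y)   # approximates int_0^{1/2} t dt = 1/8
Tx_y = inner(T(x), y)             # approximates int_0^1 t^2 dt = 1/3

def step(t, m=10):
    """Step function h = sum_k a_k chi_{]a_{k-1}, a_k]} with a_k = k/m."""
    return math.ceil(t * m) / m

# Sampled version of sup |t - h(t)|; it is bounded by the mesh width 1/m,
# which bounds the operator norm of T - int h dE.
sup_dist = max(abs(t - step(t)) for t in grid)
```

The last quantity indicates why T is a norm limit of finite linear combinations of the projections E_A.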

1.18. Remark: Let T ∈ L(H) be self-adjoint with spectral measure E. An operator S ∈ L(H) commutes with T if and only if S commutes with every E_A (A ∈ B(R)). The statement can be shown by first noting that ST = TS is equivalent to S commuting with all powers T^n, in other words, that ⟨ST^n x, y⟩ = ⟨T^n Sx, y⟩ holds for all x, y ∈ H. Using

$\int \lambda^n \, d\langle E_\lambda Sx, y\rangle = \langle T^n Sx, y\rangle = \langle ST^n x, y\rangle = \langle T^n x, S^* y\rangle = \int \lambda^n \, d\langle E_\lambda x, S^* y\rangle = \int \lambda^n \, d\langle SE_\lambda x, y\rangle$

and density of polynomials, this in turn can be seen to mean that the complex measures ν_1(A) := ⟨SE_A x, y⟩ and ν_2(A) := ⟨E_A Sx, y⟩ agree when considered as functionals on C(σ(T)). Varying x and y we obtain SE_A = E_A S.
1.19. Proposition: If S ∈ L(H) is self-adjoint, g : σ(S) → R and f : R → C are bounded and
Borel measurable, then
(f ◦ g)(S) = f (g(S)).

Proof: The composition f ∘ g : σ(S) → C is Borel measurable and bounded, therefore (f ∘ g)(S) is defined by the measurable functional calculus. Let F denote the spectral measure of S. Since g is real-valued, g(S) is self-adjoint and possesses a unique spectral measure E. It suffices to prove the statement of the proposition in case f = χ_A (A ∈ B(R)), since it can then be extended to step functions and measurable functions by the usual techniques and properties of the functional calculus.

³ Strictly speaking, we also need to check that E is a spectral measure, but this is easy.

We have χ_A ∘ g = χ_{g^{-1}(A)}, thus it suffices to show F_{g^{-1}(A)} = E_A for every A ∈ B(R). This in turn is equivalent to the equality of the measures µ_{x,y}(A) := ⟨E_A x, y⟩ and ν_{x,y}(A) := ⟨F_{g^{-1}(A)}x, y⟩ for arbitrary x, y ∈ H. Note that ν_{x,y} is the image measure of the measure ρ_{x,y}(B) := ⟨F_B x, y⟩ (B ∈ B(R)) associated with the spectral measure F of S, since ν_{x,y}(A) = ⟨F_{g^{-1}(A)}x, y⟩ = ρ_{x,y}(g^{-1}(A)), i.e., ν_{x,y} = g(ρ_{x,y}). We obtain

$\int \lambda^n \, d\nu_{x,y} = \int g(\lambda)^n \, d\rho_{x,y} = \langle g(S)^n x, y\rangle = \int \lambda^n \, d\langle E_\lambda x, y\rangle = \int \lambda^n \, d\mu_{x,y}$,

therefore ν_{x,y} and µ_{x,y} are equal as (continuous) linear functionals on polynomial functions on σ(S). By the theorem of Weierstraß, the two measures are equal on C(σ(S)). Since x, y were arbitrary, we have shown F_{g^{-1}(A)} = E_A.
1.20. Example (roots of positive operators): Let T ∈ L(H) be positive and n ∈ N, then there is a unique positive S ∈ L(H) such that S^n = T. Existence is clear, if we note that σ(T) ⊆ [0, ∞[ and put S := T^{1/n}. To show uniqueness, suppose S is positive, hence σ(S) ⊆ [0, ∞[, and satisfies S^n = T. Let g : σ(S) → R be defined by g(s) := s^n and f : R → R, f(t) := (t χ_{σ(T)}(t))^{1/n}. Then f ∘ g = id_{σ(S)}, since σ(T) = σ(S^n) = g(σ(S)), and we obtain from Proposition 1.19

S = (f ∘ g)(S) = f(g(S)) = f(S^n) = f(T) = T^{1/n}.
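In finite dimensions the root T^{1/n} can be written down directly from the spectral decomposition. The sketch below is an illustration only: the positive matrix T = [[2, 1], [1, 2]] with eigenvalues 1 and 3, its eigenprojections E1 and E3, and the exponent n = 3 are hypothetical choices.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Eigenprojections of the hypothetical positive matrix T = [[2, 1], [1, 2]]
E1 = [[0.5, -0.5], [-0.5, 0.5]]   # eigenvalue 1
E3 = [[0.5, 0.5], [0.5, 0.5]]     # eigenvalue 3
T = [[2.0, 1.0], [1.0, 2.0]]

def f_of_T(f):
    """Functional calculus f(T) = f(1) E1 + f(3) E3."""
    return [[f(1.0) * E1[i][j] + f(3.0) * E3[i][j] for j in range(2)]
            for i in range(2)]

n = 3
S = f_of_T(lambda t: t ** (1.0 / n))   # candidate n-th root T^{1/n}

S_power = S
for _ in range(n - 1):                 # compute S^n by repeated multiplication
    S_power = matmul(S_power, S)

max_error = max(abs(S_power[i][j] - T[i][j]) for i in range(2) for j in range(2))
```

Up to rounding, S^n agrees with T, and S is positive since its eigenvalues 1 and 3^{1/3} are nonnegative.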

1.21. Remark (polar decomposition): For arbitrary T ∈ L(H), the operator T*T is positive and we may define the positive operator |T| := (T*T)^{1/2}. Note that ‖|T|x‖² = ⟨x, |T|²x⟩ = ⟨x, T*Tx⟩ = ‖Tx‖², which enables us to define U : ran(|T|) → ran(T) by U(|T|x) := Tx and extend it as an isometry between the closures, $\overline{\mathrm{ran}(|T|)} \to \overline{\mathrm{ran}(T)}$. Putting U = 0 on $\overline{\mathrm{ran}(|T|)}^{\perp} = \mathrm{ran}(|T|)^{\perp} = \ker(|T|) = \ker(T)$, we obtain an operator U ∈ L(H) with ker(U) = ker(T), which acts as an isometry $\ker(T)^{\perp} \to \overline{\mathrm{ran}(T)}$, a so-called partial isometry, with the property U|T| = T. This relation is called polar decomposition of T, since it resembles the similar formula z = e^{iα}|z| for complex numbers. (It can be shown that U|T| = T = (TT*)^{1/2}U, see, e.g., [KR, Theorem 6.1.2].)
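For an invertible real 2×2 matrix the polar decomposition can be computed explicitly. The sketch below rests on assumptions beyond the text: T is a hypothetical invertible matrix (so ker(T) = {0} and U is orthogonal), and the square root of the positive definite matrix M = TᵀT is obtained from the closed formula S = (M + √(det M) I)/√(tr M + 2√(det M)), which follows from the Cayley-Hamilton theorem for 2×2 matrices.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

def sqrt_2x2(M):
    """Positive square root of a symmetric positive definite 2x2 matrix M."""
    s = math.sqrt(M[0][0] * M[1][1] - M[0][1] * M[1][0])   # sqrt(det M)
    t = math.sqrt(M[0][0] + M[1][1] + 2.0 * s)              # trace of the root
    return [[(M[0][0] + s) / t, M[0][1] / t],
            [M[1][0] / t, (M[1][1] + s) / t]]

def inverse_2x2(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))

T = [[1.0, 2.0], [0.0, 1.0]]                 # hypothetical invertible operator on R^2
absT = sqrt_2x2(matmul(transpose(T), T))     # |T| = (T^* T)^{1/2}
U = matmul(T, inverse_2x2(absT))             # U = T |T|^{-1}, defined since |T| is invertible

identity = [[1.0, 0.0], [0.0, 1.0]]
err_polar = max_abs_diff(matmul(U, absT), T)               # U |T| = T
err_isometry = max_abs_diff(matmul(transpose(U), U), identity)   # U^* U = I
```

In this example U turns out to be a rotation by −π/4, mirroring the role of e^{iα} in z = e^{iα}|z|.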
1.22. Description of the spectrum in terms of the spectral measure: Let T ∈ L(H) be self-adjoint with spectral measure E and λ be a complex number.
(a) λ ∉ σ(T) if and only if there is an open neighborhood U ⊆ C of λ such that E_U = 0.
Proof: By construction, E_{σ(T)} = I (Proposition 1.14), thus E_{ρ(T)} = 0, and the resolvent set ρ(T) is open, which proves the ‘only if’ part. To show the reverse implication, suppose U is an open neighborhood of λ such that E_U = 0. Define the bounded measurable function f : σ(T) → C by f(t) := 1/(λ − t), if t ∈ σ(T) \ U, and f(t) := 0, if t ∈ U ∩ σ(T). The function g : σ(T) → C, g(t) := λ − t is also bounded and measurable and f · g = χ_{σ(T)\U}, hence

f(T)(λ − T) = f(T)g(T) = (fg)(T) = χ_{σ(T)\U}(T) = E_{σ(T)\U} = E_{σ(T)} − E_U = I.

The same holds for (λ − T)f(T) = g(T)f(T) = (gf)(T) = (fg)(T), thus showing that f(T) ∈ L(H) is an inverse of λ − T, i.e., λ ∈ ρ(T).
(b) λ ∈ σ_p(T) (i.e., λ is an eigenvalue of T) if and only if E_{λ} ≠ 0.
In this case, E_{λ} projects onto the eigenspace of λ.
Proof: We show that ran(E_{λ}) = ker(λ − T), then the result follows immediately.
• ran(E_{λ}) ⊆ ker(λ − T): x ∈ ran(E_{λ}) means E_{λ}x = x; let y ∈ H be arbitrary and note that (λ − t)χ_{λ}(t) = 0 for all t to obtain

$\langle (\lambda - T)x, y\rangle = \langle (\lambda - T)E_{\{\lambda\}}x, y\rangle = \int (\lambda - t)\chi_{\{\lambda\}}(t) \, d\langle E_t x, y\rangle = 0$;

thus, (λ − T)x = 0.
• ker(λ − T) ⊆ ran(E_{λ}): If Tx = λx, then f(T)x = f(λ)x for every f ∈ C(σ(T)) by Theorem 1.6(iii); by the properties of the measurable functional calculus (Theorem 1.8) and an application of Lemma 0.13, we deduce that the relation also holds with f ∈ B_b(σ(T)); putting f = χ_{λ} yields E_{λ}x = χ_{λ}(λ)x = x, thus, x ∈ ran(E_{λ}).

(c) If λ is an isolated point in σ(T), then λ ∈ σ_p(T).

Proof: Choose an open set U with U ∩ σ(T) = {λ}; we have E_U = E_{U\{λ}} + E_{λ} = E_{λ}, since U \ {λ} ⊆ ρ(T) and E_{ρ(T)} = 0; U is an open neighborhood of λ and λ ∈ σ(T), hence E_U ≠ 0 by (a); thus, E_{λ} ≠ 0 and therefore λ ∈ σ_p(T) by (b).

(d) σ(T) is the smallest compact subset K of R with E_K = I.

Proof: Let K ⊆ R be compact with E_K = I. Then E_{R\K} = 0 and, by (a), µ ∈ R \ K implies µ ∉ σ(T), i.e., σ(T) ⊆ K.

We will now discuss the aspect of diagonalization in terms of unitary equivalence with a multipli-
cation operator.
1.23. Multiplication operator version of the spectral theorem: A vector x ∈ H is said to be a cyclic vector for an operator S ∈ L(H), if the linear span of the set {x, Sx, S²x, …} is dense in H. (Note that then H is necessarily separable, hence possesses a complete orthonormal system of at most countable cardinality.) The multiplication operator T in Example 1.17 possesses the constant function 1 ∈ L²([0, 1]) as a cyclic vector, because (T^n x)(t) = t^n x(t) and therefore span{1, T1, T²1, …} contains the L²-dense subspace of all polynomial functions on [0, 1]. In contrast, the identity operator I ∈ L(H) never possesses a cyclic vector, unless the dimension of H is 0 or 1.
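In finite dimensions cyclicity can be tested by a determinant. The following sketch is an illustration under the assumption H = C³ (here with real entries): by the Cayley-Hamilton theorem the span of {x, Tx, T²x, …} equals the span of x, Tx, T²x, so x is cyclic exactly when these three vectors are linearly independent. The operators and vectors are hypothetical examples.

```python
def matvec(A, v):
    """Apply the matrix A (list of rows) to the vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def krylov_det(T, x):
    """Determinant of the matrix with rows x, Tx, T^2 x; nonzero iff x is cyclic for T."""
    rows = [x]
    for _ in range(2):
        rows.append(matvec(T, rows[-1]))
    return det3(rows)

D = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]    # three distinct eigenvalues
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # identity operator

d_cyclic = krylov_det(D, [1, 1, 1])      # Vandermonde determinant, nonzero
d_identity = krylov_det(I3, [1, 1, 1])   # all rows equal, so zero
d_missing = krylov_det(D, [1, 0, 1])     # x has no component in one eigenspace
```

The vector (1, 1, 1) is cyclic for D, whereas no vector is cyclic for the identity on C³.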
Theorem A: Let T ∈ L(H) be self-adjoint with spectral measure E and suppose x ∈ H is a cyclic vector for T. If µ := µ_{x,x} (recall µ_{x,x}(A) = ⟨E_A x, x⟩), then there exists a unitary operator U : H → L²(σ(T), µ) such that we have, for every ϕ ∈ L²(σ(T), µ),

(UTU^{-1}ϕ)(t) = t ϕ(t) µ-almost everywhere.

Proof: Start out with ϕ ∈ C(σ(T)) ⊆ L²(σ(T), µ), then we have

$\|\varphi\|_2^2 = \int_{\sigma(T)} |\varphi(t)|^2 \, d\mu(t) = \int_{\sigma(T)} \overline{\varphi(t)} \varphi(t) \, d\langle E_t x, x\rangle = \langle \varphi(T)^* \varphi(T)x, x\rangle = \|\varphi(T)x\|^2$,

i.e., the linear map V_0 : C(σ(T)) → H, ϕ ↦ ϕ(T)x is isometric with respect to ‖·‖₂. The unique continuous extension V : L²(σ(T), µ) → H is linear and also isometric. Moreover, ran(V) ⊇ ran(V_0) ⊇ span{x, Tx, T²x, …} is dense in H, hence V is also surjective (since ‖Vψ‖ = ‖ψ‖₂ shows that V has closed range; cf. 0.2); therefore V is a unitary map. It satisfies the following relation for every ϕ in the dense subspace C(σ(T)) ⊆ L²(σ(T), µ):

T(Vϕ) = T(ϕ(T)x) = (id(T)ϕ(T))x = (id · ϕ)(T)x = V(id · ϕ),

in other words, (V^{-1}TVϕ)(t) = t ϕ(t) holds for all ϕ ∈ L²(σ(T), µ) for µ-almost all t ∈ σ(T). Thus, the theorem is proved upon setting U := V^{-1}.

In the sense of Theorem A, the multiplication operator in Example 1.17 is the prototype of any
self-adjoint operator with (purely continuous) spectrum [0, 1] and a cyclic vector x such that µx,x
is (equivalent to) the Lebesgue measure.

The extension of the above theorem to the general situation of a self-adjoint operator T on a Hilbert space H faces merely technical difficulties. The major step in overcoming these is to show that H can be decomposed into T-invariant subspaces H_j (j ∈ J, J some index set) with cyclic vectors in each H_j. Then Theorem A applies to each restriction T_j of T to these components and we come up with an orthogonal direct sum of multiplication operators on L²-spaces with respect to different measure spaces (σ(T_j), B(σ(T_j)), µ_j). This still yields a multiplication operator on the direct sum (although not given as multiplication by a “global identity function”), which can be implemented as a single space L²(Ω, µ) by artificially producing a disjoint union Ω of the sets σ(T_j) and considering the “sum” µ of the measures µ_j on it. We do not go into the details of the constructions here, but refer to [Wer18, Satz VII.1.21 and Lemma VII.1.22] for a proof in the separable case and to [Con10, Chapter IX, Theorem 4.6] or [Kab14, Abschnitt 15.3] for a proof of the general situation.

Theorem B: Let T ∈ L(H) be self-adjoint, then there exists a measure space (Ω, Σ, µ), a bounded measurable function f : Ω → R, and a unitary operator U : H → L²(Ω, µ) such that we have, for every ϕ ∈ L²(Ω, µ),

UTU^{-1}ϕ = f ϕ µ-almost everywhere.

1.24. Example (convolution operators): [In this example we call on a few more basic results from measure theory and Fourier analysis that are typically covered by the companion master level course on Real Analysis; we may also refer to [Con16, Fol99].] Let f ∈ L¹(R) ∩ L²(R) and x ∈ L²(R), then for every s ∈ R, the function t ↦ f(s − t) belongs to L² as well and

$(f * x)(s) := \int_{\mathbb{R}} f(s - t) x(t) \, dt$

defines a measurable function f ∗ x : R → C, the so-called convolution of f and x, which obeys the estimate (a special case of Young's inequality)

‖f ∗ x‖₂ ≤ ‖f‖₁ ‖x‖₂.
The map x ↦ f ∗ x defines a linear continuous operator C_f : L²(R) → L²(R), which is self-adjoint if $\overline{f(r)} = f(-r)$ holds for all r ∈ R, since

$\langle C_f x, y\rangle = \int \int f(s - t) x(t) \, dt \, \overline{y(s)} \, ds = \int x(t) \int f(s - t) \overline{y(s)} \, ds \, dt$,
while $\langle x, C_f y\rangle = \int x(t) \int \overline{f(t - s)} \, \overline{y(s)} \, ds \, dt$.

The Fourier transform, originally defined for functions g ∈ L¹(R) via a formula like $\hat g(\tau) = \int_{\mathbb{R}} \exp(-it\tau) g(t) \, dt / \sqrt{2\pi}$ (factors involving 2π are a matter of convention), extends to a unitary operator F on L²(R) (Plancherel's theorem) and is famous for intertwining differentiation with multiplication by id (on functions with appropriate regularity or decay properties) as well as transforming convolution into multiplication, i.e.,

$F(f * x) = \sqrt{2\pi} \, \hat f \cdot \hat x \quad (f \in L^1 \cap L^2,\ x \in L^2)$.

By the Riemann-Lebesgue lemma, f̂ is continuous and vanishes at infinity.



Thus, C_f is unitarily equivalent to the operator M_h of multiplication by $h = \sqrt{2\pi} \, \hat f$, since F C_f x = F(f ∗ x) = h · x̂ = M_h x̂, i.e.,

F C_f F^{-1} = M_h.

Unitarily equivalent operators have the same spectrum, since λ − T is invertible if and only if λ − UTU^{-1} is. Hence we may determine σ(C_f) via σ(M_h), which in case of a continuous function h vanishing at infinity can be shown to be $\overline{h(\mathbb{R})} = h(\mathbb{R}) \cup \{0\}$. Furthermore, a number λ can be seen to be an eigenvalue of M_h, hence also of C_f, if and only if the set {t ∈ R | h(t) = λ} = h^{-1}({λ}) has positive Lebesgue measure.
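A discrete analogue may help to see the mechanism. The sketch below replaces R by the cyclic group Z/4Z, which is an assumption made purely for illustration: circular convolution plays the role of f ∗ x, and the unnormalized discrete Fourier transform plays the role of F (so no factor √(2π) appears with this convention). The kernel f and the vector x are hypothetical data; f satisfies the discrete counterpart of the symmetry condition for self-adjointness.

```python
import cmath

def dft(v):
    """Unnormalized discrete Fourier transform of the finite sequence v."""
    n = len(v)
    return [sum(v[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def circular_convolution(f, x):
    """(f * x)[s] = sum_t f[(s - t) mod n] x[t]."""
    n = len(f)
    return [sum(f[(s - t) % n] * x[t] for t in range(n)) for s in range(n)]

f = [2.0, 1.0, 0.0, 1.0]   # real kernel with f[r] = f[-r mod 4]
x = [1.0, 2.0, 3.0, 4.0]

lhs = dft(circular_convolution(f, x))           # transform of the convolution
h = dft(f)                                      # multiplier, here (4, 2, 0, 2)
rhs = [hk * xk for hk, xk in zip(h, dft(x))]    # pointwise multiplication

max_error = max(abs(a - b) for a, b in zip(lhs, rhs))
max_imag_h = max(abs(hk.imag) for hk in h)      # multiplier is real for symmetric f
```

In this finite model the values of h are exactly the eigenvalues of the convolution operator, and they are real because the kernel is symmetric.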
1.25. Towards a spectral theorem for normal operators: If T ∈ L(H) is a normal operator,
then T and T ∗ commute and
1 1
S1 := (T + T ∗ ), S2 := (T − T ∗ )
2 2i
are commuting self-adjoint operators such that T = S1 + iS2 . Let E, F denote the spectral
measure of S1 , S2 , respectively. For arbitrary A, B ∈ B(R), the projections EA and FB commute
due to Remark 1.18. The operator $G_{A \times B} := E_A F_B$ is an orthogonal projection, since $G_{A \times B}^* = F_B^* E_A^* = F_B E_A = E_A F_B = G_{A \times B}$ and $G_{A \times B}^2 = (E_A F_B)(E_A F_B) = E_A^2 F_B^2 = E_A F_B = G_{A \times B}$. For
arbitrary x, y ∈ H, it can be shown that the map A × B 7→ hGA×B x, yi can be extended to a
complex measure on B(R2 ), hence also on B(C), and that the following spectral theorem can be
obtained for T = S1 + iS2 along the lines of the constructions for self-adjoint operators carried
out above. We skip the technical details and may refer to, e.g., [Con10, Chapters VIII and IX]
or [Kab14, Abschnitte 15.3-5] for the statements following below. We observe that the spectral
representation at least becomes plausible by this sketch of a calculation:
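\[
\begin{aligned}
T = S_1 + iS_2 = S_1 \cdot I + iI \cdot S_2
  &= \int \lambda\, dE_\lambda \cdot \int 1\, dF_\mu + i \int 1\, dE_\lambda \cdot \int \mu\, dF_\mu \\
  &= \iint (\lambda + i\mu)\, dE_\lambda\, dF_\mu = \int (\lambda + i\mu)\, dG_{(\lambda,\mu)} = \int z\, dG_z .
\end{aligned}
\]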

Theorem: Let T ∈ L(H) be normal, then there exists a unique spectral measure G : B(C) →
L(H) with compact support such that
\[ T = \int_{\sigma(T)} \lambda\, dG_\lambda . \]

The map $f \mapsto \int f(\lambda)\, dG_\lambda$, $B_b(\sigma(T)) \to L(H)$ defines the measurable functional calculus for $T$.

Moreover, T is unitarily equivalent to an operator on some L2 (Ω, µ) of multiplication by a complex


bounded measurable function.

Spectral mapping theorem: If f ∈ C(σ(T )), then σ(f (T )) = f (σ(T )) = {f (λ) | λ ∈ σ(T )}.

1.26. Example (unitary operators): Let U ∈ L(H) be unitary, then U is normal. Recall that

σ(U ) ⊆ S 1 = {z ∈ C | |z| = 1},

which can be argued as follows: Since $U$ is bijective and $\|U\| = 1$, we have $0 \in \rho(U)$ and $\{\lambda \in \mathbb{C} \mid |\lambda| > 1\} \subseteq \rho(U)$; if $0 < |\lambda| < 1$, then $\lambda U$ as well as $(\frac{1}{\lambda} - U^*)$ are bijective, hence $\lambda - U = (-\lambda U)(\frac{1}{\lambda} - U^*)$ is continuously invertible, showing that $\lambda \in \rho(U)$.

Consider the function $\arg : S^1 \to \mathbb{R}$ defined uniquely by the requirements $e^{i \arg(z)} = z$ and
$\arg(S^1) \subseteq \,]-\pi, \pi]$. The function arg is bounded by construction and continuous except for a
jump at z = −1, hence Borel measurable. The measurable functional calculus for U allows us
to define A := arg(U ) ∈ L(H), which is self-adjoint, because arg is real-valued. The relation
(exp ◦(i arg))(z) = exp(i arg(z)) = z now implies

exp(iA) = exp(i arg(U )) = U.

Note that $\|A\| = \|\arg(U)\| \le \|\arg\|_\infty = \pi$ implies $\sigma(A) \subseteq [-\pi, \pi]$.
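
The relation exp(iA) = U can be tested in finite dimensions. The following is a minimal sketch, assuming the cyclic shift on C^4 as the unitary operator (an illustrative choice, not taken from the notes); its spectral projections are known explicitly through the discrete Fourier vectors, and all helper names are hypothetical.

```python
import cmath
import math


def principal_angle(theta):
    """Shift a real angle into the interval ]-pi, pi]."""
    return theta - 2 * math.pi * math.ceil((theta - math.pi) / (2 * math.pi))


def matmul(X, Y):
    n = len(X)
    return [[sum(X[j][m] * Y[m][k] for m in range(n)) for k in range(n)]
            for j in range(n)]


def shift_and_arg(n):
    """Cyclic shift U on C^n (assumed example) and A = arg(U) via spectral projections."""
    U = [[(1 + 0j) if j == (k + 1) % n else 0j for k in range(n)] for j in range(n)]
    A = [[0j] * n for _ in range(n)]
    for m in range(n):
        # eigenvalue exp(-2*pi*i*m/n); its argument taken in ]-pi, pi]
        theta = principal_angle(-2 * math.pi * m / n)
        for j in range(n):
            for k in range(n):
                # entry (j, k) of the orthogonal projection onto the m-th Fourier vector
                A[j][k] += theta * cmath.exp(2j * math.pi * m * (j - k) / n) / n
    return U, A


def exp_i(A, terms=60):
    """Truncated power series for exp(iA); adequate here because ||A|| <= pi."""
    n = len(A)
    iA = [[1j * A[j][k] for k in range(n)] for j in range(n)]
    total = [[(1 + 0j) if j == k else 0j for k in range(n)] for j in range(n)]
    term = [row[:] for row in total]
    for t in range(1, terms):
        term = [[entry / t for entry in row] for row in matmul(term, iA)]
        total = [[total[j][k] + term[j][k] for k in range(n)] for j in range(n)]
    return total


def max_abs_diff(X, Y):
    return max(abs(X[j][k] - Y[j][k]) for j in range(len(X)) for k in range(len(X)))


U, A = shift_and_arg(4)
print(max_abs_diff(exp_i(A), U))  # expected to be at the level of rounding errors
```

Here the eigenvalue −1 of the shift receives the argument π, in line with the convention arg(S^1) ⊆ ]−π, π].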

1.27. Remark (Gelfand theory for commutative unital C∗ -algebras): As with the con-
tinuous functional calculus for self-adjoint bounded operators described in Theorem 1.6, we may
consider the commutative closed subalgebra C ∗ (T ) := {f (T ) | f ∈ C(σ(T ))} of L(H) now for
an arbitrary normal operator $T \in L(H)$. Both $L(H)$ and $C^*(T)$ are examples of a unital $C^*$-algebra, i.e., a Banach algebra $\mathcal{A}$ with a unit element $I \in \mathcal{A}$ and a conjugate linear involution map $\mathcal{A} \to \mathcal{A}$, $A \mapsto A^*$ satisfying $(A^*)^* = A$ and $(AB)^* = B^* A^*$ such that $\|A^* A\| = \|A\|^2$. Since $\|AB\| \le \|A\| \|B\|$ holds in any normed algebra, one easily deduces that involution is an isometry, i.e., $\|A^*\| = \|A\|$ (see [Con10, Chapter VIII, Proposition 1.7] or [Wer18, Lemma IX.3.3]).

While $L(H)$ is not commutative⁴, for any normal operator $T$, the $C^*$-subalgebra $C^*(T) \subseteq L(H)$ is commutative and is, in fact, the smallest $C^*$-algebra of $L(H)$ containing $I$, $T$, and $T^*$. Further examples of commutative unital $C^*$-algebras are provided by $C(X) = \{f : X \to \mathbb{C} \mid f \text{ continuous}\}$ for any compact Hausdorff space $X$, with involution $f^*(x) := \overline{f(x)}$ ($x \in X$) and the supremum norm $\|\cdot\|_\infty$. According to the so-called Gelfand-Naimark theorem, every unital commutative $C^*$-algebra is of this latter form (cf. [Con10, Chapter VIII, Theorem 2.1] or [Wer18, Theorem IX.3.4]).
⁴ unless $H$ is one-dimensional

The basic construction of the Gelfand isomorphism establishing the Gelfand-Naimark theorem proceeds as follows: Let $\mathcal{A}$ be a unital commutative $C^*$-algebra and denote by $X(\mathcal{A})$ the set of all nonzero characters on $\mathcal{A}$, i.e., multiplicative linear functionals $\mathcal{A} \to \mathbb{C}$. It can be shown that $X(\mathcal{A}) \subseteq \mathcal{A}'$ and that every character is also continuous with respect to the weak* topology on $\mathcal{A}$ (cf. Example 4.12.1)). Moreover, the character space $X(\mathcal{A})$ is easily seen to be a weak* closed subset of the unit ball in $\mathcal{A}'$, hence is weak* compact by the Alaoglu-Bourbaki theorem (see Section 6). Recall the canonical isometry $\iota : \mathcal{A} \to \mathcal{A}''$, given by $\iota_A(\mu) := \mu(A)$ for all $A \in \mathcal{A}$ and $\mu \in \mathcal{A}'$. Using the fact $X(\mathcal{A}) \subseteq \mathcal{A}'$, the Gelfand transformation $\mathcal{A} \to C(X(\mathcal{A}))$ is given by $A \mapsto \hat A := \iota_A|_{X(\mathcal{A})}$ and can be shown to be an isometric $*$-isomorphism

\[ \mathcal{A} \cong C(X(\mathcal{A})). \]

In case $\mathcal{A} = C^*(T)$ one can show that $X(C^*(T))$ is homeomorphic to $\sigma(T)$ (see [Wer18, Satz IX.3.6] or [Con10, Chapter VIII, Proposition 2.3]) and therefore the Gelfand transformation implies

\[ C^*(T) \cong C(\sigma(T)). \]

This provides yet another view on the continuous functional calculus for the normal operator T .

2. Unbounded operators
Differential operators, in applications often supplied with boundary conditions, are prominent
examples of linear maps between function spaces, but they can typically not be implemented as
bounded operators with respect to $L^2$-norms. For example, let $V := \{f \in C^1([0,1]) \mid f(0) = 0 = f(1)\}$ and $T_0 : V \to C([0,1])$ be given by $f \mapsto i f'$. Then $T_0$ is bounded, if we equip $V$ with the norm $\|f\|_{\infty,1} = \|f\|_\infty + \|f'\|_\infty$ and $C([0,1])$ with $\|\cdot\|_\infty$ or $\|\cdot\|_2$. But the map $f \mapsto i f'$ is unbounded (hence discontinuous) considered as a linear map $T : V \to L^2([0,1])$ with the $L^2$-norm put on both spaces¹. But the $L^2$-setting still has the advantage of reflecting the following symmetry property of $T$, being defined on the (dense) subspace $V \subseteq L^2([0,1])$:

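\[
\forall x, y \in V : \quad \langle Tx, y\rangle = i \int_0^1 x'(t)\, \overline{y(t)}\, dt
 = i\, x(t)\, \overline{y(t)} \Big|_{t=0}^{t=1} - i \int_0^1 x(t)\, \overline{y'(t)}\, dt = \langle x, Ty\rangle .
\]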
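
The unboundedness of $f \mapsto i f'$ in the $L^2$-norm can be observed numerically. A minimal sketch, assuming the test functions $f_n(t) = \sin(n\pi t)$ from the footnote and a midpoint quadrature rule (the helper names are hypothetical):

```python
import math


def l2_norm(g, steps=4000):
    """Approximate the L^2([0, 1]) norm of g by the midpoint rule."""
    h = 1.0 / steps
    return math.sqrt(sum(abs(g((k + 0.5) * h)) ** 2 for k in range(steps)) * h)


def growth_ratio(n):
    """Ratio ||T f_n||_2 / ||f_n||_2 for f_n(t) = sin(n*pi*t) and T f = i f'."""
    f = lambda t: math.sin(n * math.pi * t)
    Tf = lambda t: 1j * n * math.pi * math.cos(n * math.pi * t)
    return l2_norm(Tf) / l2_norm(f)


for n in (1, 5, 25):
    print(n, growth_ratio(n))  # the ratio grows like n*pi
```

Since the ratio is not bounded in n, no constant C with $\|Tf\|_2 \le C \|f\|_2$ on V can exist.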

2.1. Definition: (a) Consider a linear map T : dom(T ) → H, where dom(T ) ⊆ H is a subspace.
Then T is called an operator on H with domain dom(T ). The operator T is said to be densely
defined, if dom(T ) is dense in H.
(b) An operator S : dom(S) → H is said to be an extension of T , in notation T ⊆ S, if dom(T ) ⊆
dom(S) and Sx = T x holds for all x ∈ dom(T ).
(c) Two operators T and S are equal, we write T = S, if T ⊆ S and S ⊆ T , i.e., dom(T ) = dom(S)
and T x = Sx for all x ∈ dom(T ).
(d) An operator T : dom(T ) → H is symmetric, if

\[ \langle Tx, y\rangle = \langle x, Ty\rangle \qquad \forall x, y \in \mathrm{dom}(T). \]

In the above sense, any element S ∈ L(H) is an operator on H with dom(S) = H. Clearly, if T
is a densely defined operator on H which happens to be continuous (with respect to the Hilbert
space norm restricted to dom(T )), then T has a unique extension T ⊆ S ∈ L(H). Finally, by
the Hellinger-Toeplitz theorem ([Wer18, Satz V.5.5] or [Tes14, §4.1, Theorem 4.9]), a symmetric
operator with dom(T ) = H is bounded and self-adjoint, i.e., T ∈ L(H) with T ∗ = T .
The operator $T$ on $L^2([0,1])$ mentioned above, with domain $V = \{f \in C^1([0,1]) \mid f(0) = 0 = f(1)\}$ and given by $Tf = i f'$, is symmetric. If $Sf = i f'$ with $\mathrm{dom}(S) := \{f \in C^1([0,1]) \mid f(0) = f(1)\}$, then $S$ is a symmetric extension of $T$.
¹ To have an explicit example at hand: We find that $f_n(t) := \sin(n\pi t)$ satisfies $f_n \in V$ and $\|f_n\|_2 = 1/\sqrt{2}$, whereas $\|T f_n\|_2 = n\pi/\sqrt{2} \to \infty$ as $n \to \infty$.

Recall that the adjoint $T^*$ of a bounded operator $T$ is uniquely defined by the requirement $\langle Tx, y\rangle = \langle x, T^*y\rangle$ for all $x, y \in H$ and can be obtained by assigning to $y \in H$ the unique vector $z \in H$ such that the linear functional $x \mapsto \langle Tx, y\rangle$ is represented by $x \mapsto \langle x, z\rangle$. This construction still works in the more general context, if $x \mapsto \langle Tx, y\rangle$ is defined on a dense subspace, namely $\mathrm{dom}(T)$, and is a continuous linear functional.

2.2. Definition: Let T be a densely defined operator on H. On the subspace

\[ \mathrm{dom}(T^*) := \{y \in H \mid x \mapsto \langle Tx, y\rangle \text{ is continuous } \mathrm{dom}(T) \to \mathbb{C}\} \]

we define the adjoint $T^*$ of $T$ as follows: If $y \in \mathrm{dom}(T^*)$, then $x \mapsto \langle Tx, y\rangle$ has a unique continuous extension to a linear functional on $H$; by the Riesz-Fréchet theorem this functional can be uniquely represented in the form $x \mapsto \langle x, z\rangle$ with $z \in H$; we put $T^*y := z$.
By construction,
\[ \langle Tx, y\rangle = \langle x, T^*y\rangle \qquad \forall x \in \mathrm{dom}(T),\ \forall y \in \mathrm{dom}(T^*). \]
The operator $T$ is called self-adjoint, if $T^* = T$.

As noted above, the Hellinger-Toeplitz theorem guarantees consistency of this new notion of self-
adjointness with the corresponding one for bounded operators. Obviously, self-adjoint operators
are symmetric. Conversely, symmetric operators need not be self-adjoint (in general, they need not even be densely defined): In the above example, $\mathrm{dom}(T) \subsetneq \mathrm{dom}(S) \subseteq \mathrm{dom}(T^*)$, since $f \mapsto \langle i f', g\rangle = \langle f, i g'\rangle$ is $L^2$-continuous on $\mathrm{dom}(T)$ for every $g \in \mathrm{dom}(S)$; therefore, $T$ is symmetric, densely defined, but not self-adjoint.

A densely defined symmetric operator satisfies $\mathrm{dom}(T) \subseteq \mathrm{dom}(T^*)$, since $x \mapsto \langle Tx, y\rangle = \langle x, Ty\rangle$ is continuous $\mathrm{dom}(T) \to \mathbb{C}$ for every $y \in \mathrm{dom}(T)$, thus $T \subseteq T^*$ and $T^*$ is densely defined as well; moreover, $T^{**} := (T^*)^*$ can be defined in this case. In general (without symmetry), a densely defined operator $T$ may have non-dense $\mathrm{dom}(T^*)$, even examples with $\mathrm{dom}(T^*) = \{0\}$ can be found (see, e.g., [Wer18, page 371]).

2.3. Definition: Let T be an operator on H with graph

gr(T ) := {(x, T x) ∈ H × H | x ∈ dom(T )}

and equip $H \times H$ with the inner product $\langle (x, y), (u, v)\rangle_2 := \langle x, u\rangle + \langle y, v\rangle$, defining the norm $\|(x, y)\|_2 = \sqrt{\|x\|^2 + \|y\|^2}$ (which induces the product topology on $H \times H$), thus obtaining a Hilbert space structure on $H \times H$. The operator $T$ is closed (or has closed graph), if $\mathrm{gr}(T)$ is
closed in H × H. Equivalently, for any sequence (xn ) in dom(T ) such that xn → x and T xn → y
in H, we have x ∈ dom(T ) and y = T x.

Recall that the closed graph theorem states that a closed operator T with dom(T ) = H is
continuous. In case of unbounded operators the closedness serves as a partial replacement of
continuity properties. To give an example of an operator that is not closed we turn again to $Tx = ix'$ with $\mathrm{dom}(T) = \{x \in C^1([0,1]) \mid x(0) = 0 = x(1)\} \subseteq L^2([0,1])$: Consider $x_n(t) = (\frac{1}{4} + \frac{1}{n})^{1/2} - ((t - \frac{1}{2})^2 + \frac{1}{n})^{1/2}$, $x(t) = \frac{1}{2} - |t - \frac{1}{2}|$, and $y(t) = i\, \mathrm{sgn}(\frac{1}{2} - t)$; then $x_n \to x$ uniformly, hence in $L^2([0,1])$, and $T x_n \to y$ in $L^2([0,1])$ (dominated convergence), but $x \notin \mathrm{dom}(T)$.
(A pity, since $Tx = y$ almost everywhere; the chosen domain is too narrow).
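
The two convergence claims in this counterexample can be checked numerically. A minimal sketch, using the functions $x_n$, $x$, $y$ from the text and simple grid-based approximations of the sup-norm and the $L^2$-norm (the helper names and grid sizes are assumptions):

```python
import math


def x_n(n, t):
    return math.sqrt(0.25 + 1.0 / n) - math.sqrt((t - 0.5) ** 2 + 1.0 / n)


def x_limit(t):
    return 0.5 - abs(t - 0.5)


def T_x_n(n, t):
    # T x_n = i * x_n', computed by hand from the formula for x_n
    return -1j * (t - 0.5) / math.sqrt((t - 0.5) ** 2 + 1.0 / n)


def y_limit(t):
    # y(t) = i * sgn(1/2 - t)
    return 1j * ((t < 0.5) - (t > 0.5))


def sup_error(n, steps=2000):
    """Grid approximation of sup |x_n - x| on [0, 1]."""
    return max(abs(x_n(n, k / steps) - x_limit(k / steps)) for k in range(steps + 1))


def l2_error(n, steps=2000):
    """Midpoint approximation of ||T x_n - y||_2 on [0, 1]."""
    h = 1.0 / steps
    total = sum(abs(T_x_n(n, (k + 0.5) * h) - y_limit((k + 0.5) * h)) ** 2
                for k in range(steps))
    return math.sqrt(total * h)


for n in (100, 10000):
    print(n, sup_error(n), l2_error(n))  # both errors shrink as n grows
```

Both quantities decrease with n, while the limit x has a kink at t = 1/2 and therefore lies outside $C^1([0,1])$.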

2.4. Proposition: Let T : dom(T ) → H be densely defined, then:
(i) T ∗ is closed,
(ii) if T ∗ is densely defined, then T ⊆ T ∗∗ ,
(iii) if T ∗ is densely defined and S is a closed extension of T , then T ∗∗ ⊆ S.
This means that T ∗∗ is the “smallest” closed extension of T , the so-called closure of T .
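Proof: (i): Suppose $y_n \in \mathrm{dom}(T^*)$, $y_n \to y$, and $T^* y_n \to z$. Then we have for every $x \in \mathrm{dom}(T)$,

\[ \langle Tx, y\rangle = \lim \langle Tx, y_n\rangle = \lim \langle x, T^* y_n\rangle = \langle x, z\rangle . \]

Therefore $x \mapsto \langle Tx, y\rangle = \langle x, z\rangle$ is continuous $\mathrm{dom}(T) \to \mathbb{C}$, hence $y \in \mathrm{dom}(T^*)$ and $z = T^* y$.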


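(ii): If $x \in \mathrm{dom}(T)$ and $y \in \mathrm{dom}(T^*)$, then $\langle Tx, y\rangle = \langle x, T^*y\rangle$. Hence $y \mapsto \langle T^*y, x\rangle = \langle y, Tx\rangle$ is continuous $\mathrm{dom}(T^*) \to \mathbb{C}$, hence $x \in \mathrm{dom}(T^{**})$ and $\langle Tx, y\rangle = \langle x, T^*y\rangle = \langle T^{**}x, y\rangle$ for every $y \in \mathrm{dom}(T^*)$. Since $\mathrm{dom}(T^*)$ is dense, we obtain $Tx = T^{**}x$ for every $x \in \mathrm{dom}(T)$, in summary, $T \subseteq T^{**}$.
(iii): The statement follows once we showed $\overline{\mathrm{gr}(T)} = \mathrm{gr}(T^{**})$. Here, $\overline{\mathrm{gr}(T)} \subseteq \mathrm{gr}(T^{**})$ holds, since $T^{**} = (T^*)^*$ has closed graph by (i) and $T \subseteq T^{**}$ by (ii). It remains to prove $\overline{\mathrm{gr}(T)} \supseteq \mathrm{gr}(T^{**})$, which follows, if we show $\mathrm{gr}(T)^\perp \subseteq \mathrm{gr}(T^{**})^\perp$.
Let $(u, v) \in \mathrm{gr}(T)^\perp$. For every $(x, Tx) \in \mathrm{gr}(T)$, we have $0 = \langle (x, Tx), (u, v)\rangle_2 = \langle x, u\rangle + \langle Tx, v\rangle$, hence $\langle Tx, v\rangle = -\langle x, u\rangle$. This shows that $v \in \mathrm{dom}(T^*)$ and $T^*v = -u$.
Let $(z, T^{**}z) \in \mathrm{gr}(T^{**})$ be arbitrary, then

\[ \langle (z, T^{**}z), (u, v)\rangle_2 = \langle z, u\rangle + \langle T^{**}z, v\rangle = \langle z, u\rangle + \langle z, T^*v\rangle = \langle z, \underbrace{u + T^*v}_{u - u}\rangle = 0, \]

thus, $(u, v) \in \mathrm{gr}(T^{**})^\perp$.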


2.5. Corollary: Let T : dom(T ) → H be densely defined, then:
(i) T is symmetric, if and only if T ⊆ T ∗ .
In this case, T ⊆ T ∗∗ ⊆ T ∗ = T ∗∗∗ and also T ∗∗ is symmetric.
(ii) T is closed and symmetric, if and only if T = T ∗∗ ⊆ T ∗ .
(iii) T is self-adjoint, if and only if T = T ∗∗ = T ∗ .
Proof: (i): That symmetry of T implies T ⊆ T ∗ has been noted already in the paragraph preceding
Definition 2.3; the converse is clear. By Proposition 2.4, T ∗ is closed and T ∗∗ is the closure of T ,
therefore T ⊆ T ∗ implies T ⊆ T ∗∗ ⊆ T ∗ . Since T ∗ is densely defined as well, the closure of T ∗ is
T ∗∗∗ ; since T ∗ is closed, we have T ∗ = T ∗∗∗ . In particular, we obtained T ∗∗ ⊆ T ∗∗∗ , hence T ∗∗ is
symmetric.
(ii): If T = T ∗∗ ⊆ T ∗ , then T is symmetric by (i) and closed by Proposition 2.4. [T = (T ∗ )∗ ]
If T is closed and symmetric, then T = T ∗∗ by Proposition 2.4 and T ⊆ T ∗ by (i).
(iii): Self-adjointness is defined by the equality T = T ∗ , which implies that T is closed by Propo-
sition 2.4, hence T = T ∗∗ .

There is an interesting intermediate case between self-adjointness and symmetry for densely de-
fined operators, namely self-adjointness of the adjoint.
2.6. Definition: A densely defined symmetric operator T on H is essentially self-adjoint, if its
closure T ∗∗ is self-adjoint, or, equivalently, T ⊆ T ∗∗ = T ∗ (= T ∗∗∗ by the above corollary).
In general, self-adjoint extensions of symmetric operators need not exist, but if T is essentially
self-adjoint it possesses its closure T ∗∗ = T ∗ as the unique self-adjoint extension. (This is easily
proved upon observing that T ⊆ S implies S ∗ ⊆ T ∗ .)
2.7. An example revisited: We study again the example $Tx = ix'$ with $\mathrm{dom}(T) = \{x \in C^1([0,1]) \mid x(0) = 0 = x(1)\} \subseteq L^2([0,1])$. We have seen above that $T$ is symmetric, but not self-adjoint, and not even closed.
We claim that $\mathrm{dom}(T^*) = \{x : [0,1] \to \mathbb{C} \mid x \text{ is absolutely continuous and } x' \in L^2([0,1])\}$ and $T^*x = ix'$.
Let $x \in \mathrm{dom}(T^*)$ and $y := T^*x \in L^2([0,1])$. We put $F(t) := \int_0^t y(s)\, ds$ and note that, by the fundamental theorem of calculus, $F$ is absolutely continuous² and satisfies $F' = y$ almost everywhere. We may use the formula of integration by parts (Proposition 0.20) and obtain for any $z \in \mathrm{dom}(T)$ (recall that $z(0) = 0 = z(1)$)

\[
\begin{aligned}
\langle Tz, x\rangle = \langle z, T^*x\rangle = \langle z, y\rangle = \langle z, F'\rangle
  &= \int_0^1 z(t)\, \overline{F'(t)}\, dt
   = z(t)\, \overline{F(t)} \Big|_{t=0}^{t=1} - \int_0^1 z'(t)\, \overline{F(t)}\, dt \\
  &= -\langle z', F\rangle = \langle i z', -iF\rangle = \langle Tz, -iF\rangle ,
\end{aligned}
\]

i.e., $\langle Tz, x + iF\rangle = 0$, and therefore

\[ (*) \qquad x + iF \in \mathrm{ran}(T)^\perp . \]

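Observe that $Tz \in C([0,1])$ with $\int_0^1 Tz = i \int_0^1 z' = i(z(1) - z(0)) = 0$ for every $z \in \mathrm{dom}(T)$; on the other hand, any continuous function $w$ with $\int_0^1 w = 0$ is in $\mathrm{ran}(T)$, since $w = iW'$ with $W \in \mathrm{dom}(T)$, if $W(t) := -i \int_0^t w(s)\, ds$ (which yields $W(0) = W(1) = 0$). Thus,
\[
\begin{aligned}
\mathrm{ran}(T)^\perp &= \Big\{w \in C([0,1]) \;\Big|\; \int_0^1 w = 0\Big\}^\perp = \{w \in C([0,1]) \mid \langle w, 1\rangle = 0\}^\perp \\
 &= \Big(\overline{\{w \in C([0,1]) \mid \langle w, 1\rangle = 0\}}\Big)^\perp = \{w \in L^2([0,1]) \mid \langle w, 1\rangle = 0\}^\perp = (\{1\}^\perp)^\perp = \mathrm{span}\{1\}.
\end{aligned}
\]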

We conclude from $(*)$ that $x + iF \in \mathrm{span}\{1\}$, hence $x = -iF + \alpha 1$ with some $\alpha \in \mathbb{C}$, which shows that $x$ is absolutely continuous and $ix' = F' + 0 = y = T^*x \in L^2([0,1])$.
If $x : [0,1] \to \mathbb{C}$ is absolutely continuous with $x' \in L^2([0,1])$, then we obtain for every $z \in \mathrm{dom}(T)$ (again employing integration by parts)

\[ \langle Tz, x\rangle = \langle i z', x\rangle = \langle z, i x'\rangle , \]

which proves that $z \mapsto \langle Tz, x\rangle$ is continuous, thus $x \in \mathrm{dom}(T^*)$.

² $y \in L^1([0,1])$, since $\int_0^1 |y|\, ds = \int_0^1 1 \cdot |y|\, ds \le \|1\|_2 \cdot \|y\|_2 = \|y\|_2$.

As an exercise, one can show along similar lines that $\mathrm{dom}(T^{**}) = \{x \in \mathrm{dom}(T^*) \mid x(0) = 0 = x(1)\} \subsetneq \mathrm{dom}(T^*)$ and $T^{**}x = ix'$, i.e., $T^{**} \subsetneq T^*$. We conclude that $T$ is not essentially self-adjoint (and $T^*$ is not symmetric).
2.8. Example (unbounded multiplication operators): Let (Ω, Σ, µ) be a measure space,
f : Ω → R measurable, dom(T ) := {x ∈ L2 (Ω, µ) | f · x ∈ L2 (Ω, µ)}, and T x := f · x. Since f is
real-valued, T is symmetric.
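We show that $\mathrm{dom}(T)$ is dense: For $n \in \mathbb{N}$ consider $\Omega_n := \{\omega \in \Omega \mid |f(\omega)| \le n\} \in \Sigma$; then $\bigcup_{n \in \mathbb{N}} \Omega_n = \Omega$ and $V_n := \{x \in L^2(\Omega, \mu) \mid \forall \omega \in \Omega \setminus \Omega_n : x(\omega) = 0\} \subseteq \mathrm{dom}(T)$ ($n \in \mathbb{N}$); if $x \in L^2(\Omega, \mu)$, then $x_n := x\, \chi_{\Omega_n} \in V_n \subseteq \mathrm{dom}(T)$ and $x_n \to x$ in $L^2(\Omega, \mu)$ (by dominated convergence).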

So far, we have seen that T is a densely defined symmetric operator, hence T ⊆ T ∗ . We claim
that T is self-adjoint. It suffices to show that dom(T ∗ ) ⊆ dom(T ), since symmetry then implies
T ∗ ⊆ T , which yields T ∗ = T .
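Let $x \in \mathrm{dom}(T^*)$. We note that $\chi_{\Omega_n} f \in L^\infty(\Omega, \mu)$ and that $\chi_{\Omega_n} z \in \mathrm{dom}(T)$ for every $z \in \mathrm{dom}(T)$, hence we have for every $n \in \mathbb{N}$,

\[ \langle z, \chi_{\Omega_n} T^*x\rangle = \langle \chi_{\Omega_n} z, T^*x\rangle = \langle T(\chi_{\Omega_n} z), x\rangle = \langle f \chi_{\Omega_n} z, x\rangle = \langle z, \chi_{\Omega_n} f x\rangle . \]

By density of $\mathrm{dom}(T)$, we deduce that $\chi_{\Omega_n} T^*x = \chi_{\Omega_n} f x$ holds in $L^2(\Omega, \mu)$ for every $n \in \mathbb{N}$. Since $\chi_{\Omega_n} \to 1$ pointwise, we obtain $T^*x = f x$ as measurable functions almost everywhere on $\Omega$, hence $f x \in L^2(\Omega, \mu)$ (because $T^*x$ belongs to $L^2(\Omega, \mu)$), which proves $x \in \mathrm{dom}(T)$.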

Operators of the form λ − T , with λ ∈ C, can always be defined on dom(λ − T ) := dom(T ). We


will make use of this convention from now on.
2.9. Lemma: Let T be a densely defined operator on H:
(i) ker(T ∗ ∓ i) = ran(T ± i)⊥ , in particular,

ker(T ∗ ∓ i) = {0} ⇔ ran(T ± i) dense in H.

(Moreover, being orthogonal complements, the subspaces ker(T ∗ ∓ i) are closed.)


(ii) If T is closed and symmetric, then ran(T ± i) is closed in H.
Proof: (i): Clearly, (T ± i)∗ = T ∗ ∓ i, thus we have

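\[
\begin{aligned}
y \in \mathrm{ran}(T \pm i)^\perp &\iff \forall z \in \mathrm{dom}(T) : \langle (T \pm i)z, y\rangle = 0 \\
 &\iff y \in \mathrm{dom}(T^*) \text{ and } \forall z \in \mathrm{dom}(T) : \langle z, (T^* \mp i)y\rangle = 0 \iff y \in \ker(T^* \mp i).
\end{aligned}
\]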

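(ii): By symmetry of $T$, $\langle Tx, x\rangle \in \mathbb{R}$ for every $x \in \mathrm{dom}(T)$, thus we have

\[ \|(T \pm i)x\|^2 = \|Tx\|^2 + \|x\|^2 \pm 2\, \mathrm{Re} \underbrace{\langle Tx, ix\rangle}_{-i\langle Tx, x\rangle \,\in\, i\mathbb{R}} = \|Tx\|^2 + \|x\|^2 \ge \|x\|^2 . \tag{2.1} \]

Therefore $(T \pm i)^{-1} : \mathrm{ran}(T \pm i) \to \mathrm{dom}(T)$ exists and is continuous.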

To show closedness of ran(T ± i), suppose (xn ) is a sequence in dom(T ) such that the sequence
((T ± i)xn ) in ran(T ± i) converges to y ∈ H. By continuity of (T ± i)−1 , the Cauchy sequence
((T ±i)xn ) in ran(T ±i) is then mapped to the Cauchy sequence (xn ) in dom(T ). Hence x := lim xn
exists in H and T xn → y ∓ ix. Since T is a closed operator, x ∈ dom(T ) and y ∓ ix = T x, i.e.,
y = (T ± i)x ∈ ran(T ± i).
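
The norm identity (2.1) used in this proof can be checked in finite dimensions. A minimal sketch, assuming a Hermitian 2×2 matrix in place of a closed symmetric operator; the matrix, the vector, and the helper names are illustrative choices, not taken from the notes:

```python
def apply(M, x):
    return [sum(M[j][k] * x[k] for k in range(len(x))) for j in range(len(M))]


def norm_sq(x):
    return sum(abs(c) ** 2 for c in x)


def shifted_norm_sq(M, x, sign):
    """Compute ||(M + sign*i) x||^2 for sign = +1 or -1."""
    Mx = apply(M, x)
    return norm_sq([Mx[j] + sign * 1j * x[j] for j in range(len(x))])


# Assumed example data: T is Hermitian, so <Tx, x> is real for every x.
T = [[2.0, 1 - 1j], [1 + 1j, -3.0]]
x = [1 + 2j, -0.5j]

expected = norm_sq(apply(T, x)) + norm_sq(x)
print(shifted_norm_sq(T, x, +1) - expected)  # rounding-level difference
print(shifted_norm_sq(T, x, -1) - expected)  # rounding-level difference
```

The cross term drops out precisely because $\langle Tx, x\rangle$ is real; for a non-Hermitian matrix the two printed differences would in general be nonzero.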
2.10. Theorem: Let T be a symmetric and densely defined operator on H, then the following
are equivalent:
(i) T is self-adjoint,
(ii) T is closed and ker(T ∗ ± i) = {0},
(iii) ran(T ± i) = H.
Proof: (i) ⇒ (ii): T = T ∗ is closed by Proposition 2.4. By symmetry of T ∗ and Equation (2.1) in
the proof of Lemma 2.9 (applied to T ∗ in place of T ), we find that (T ∗ ± i)x = 0 implies x = 0.
(ii) ⇒ (iii): By the above lemma, ran(T ± i)⊥ = ker(T ∗ ∓ i) = {0} and ran(T ± i) is closed, since
T is closed. Thus, we obtain ran(T ± i) = H.
(iii) ⇒ (i): The densely defined symmetric operator T satisfies T ⊆ T ∗ , hence it suffices to show
dom(T ∗ ) ⊆ dom(T ). Let y ∈ dom(T ∗ ). Since H = ran(T − i) we can find some x ∈ dom(T ) such
that (T ∗ − i)y = (T − i)x. Due to T ⊆ T ∗ we may thus write (T ∗ − i)y = (T ∗ − i)x. The previous
lemma gives ker(T ∗ − i) = ran(T + i)⊥ = H ⊥ = {0}, which implies that T ∗ − i is injective, hence
y = x ∈ dom(T ).
The theorem provides us with a very short argument that the operator $T$ of multiplication by the real-valued measurable function $f$ from Example 2.8 is self-adjoint: The functions $1/(f \pm i)$ and $f/(f \pm i)$ are measurable and bounded, hence any $h \in L^2(\Omega, \mu)$ is in the range of $(T \pm i)$, since $(f \pm i) \cdot \frac{h}{f \pm i} = h$ is clear and $f \cdot \frac{h}{f \pm i} = \frac{f}{f \pm i} \cdot h \in L^2(\Omega, \mu)$ shows that $\frac{h}{f \pm i} \in \mathrm{dom}(T)$.
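
The boundedness claimed here follows from an elementary pointwise estimate, valid because $f$ is real-valued; written out as a short worked step:

```latex
\[
\Big| \frac{1}{f \pm i} \Big| = \frac{1}{\sqrt{f^2 + 1}} \le 1,
\qquad
\Big| \frac{f}{f \pm i} \Big| = \frac{|f|}{\sqrt{f^2 + 1}} < 1 .
\]
```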
2.11. Corollary: Let T be a symmetric and densely defined operator on H, then the following
are equivalent:
(i) T is essentially self-adjoint,
(ii) ker(T ∗ ± i) = {0},
(iii) ran(T ± i) is dense in H.
Proof: (ii) ⇔ (iii): Follows directly from Lemma 2.9(i).
(i) ⇔ (ii): Recall that we have T ⊆ T ∗∗ ⊆ T ∗ = T ∗∗∗ by Corollary 2.5(i). The above theorem,
applied to T ∗∗ (which is closed), states that T ∗∗ is self-adjoint, if and only if ker(T ∗ ± i) =
ker(T ∗∗∗ ± i) = {0}.

For a densely defined symmetric operator T , the Hilbert dimensions of ker(T ∗ ± i) are called
deficiency indices of T . The corollary says that T is essentially self-adjoint, if and only if both
deficiency indices are 0. More generally, we will show in the following theorem, that self-adjoint
extensions exist, if and only if the deficiency indices are equal. Recall that ker(T ∗ ±i) = ran(T ∓i)⊥
by Lemma 2.9(i).

2.12. Theorem: A symmetric and densely defined operator T on H possesses self-adjoint ex-
tensions, if and only if its deficiency indices are equal, which means that dim ker(T ∗ + i) =
dim ker(T ∗ − i), where dim refers to the cardinality of a complete orthonormal system.

Remark on the following proof: The basic idea will be to transfer some of the properties of the bijective map $t \mapsto \frac{t+i}{t-i}$ between $\mathbb{R}$ and $\{z \in \mathbb{C} \mid |z| = 1, z \ne 1\}$, with inverse $u \mapsto i\, \frac{u+1}{u-1}$, to the level of operators via the so-called Cayley transform $U := (T + i)(T - i)^{-1}$ of $T$.
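
The scalar model behind the Cayley transform can be checked directly. A minimal sketch, assuming the map is taken in the form $t \mapsto (t+i)/(t-i)$, which matches $U = (T+i)(T-i)^{-1}$ and the stated inverse $u \mapsto i(u+1)/(u-1)$; the function names are hypothetical:

```python
def cayley(t):
    """Scalar Cayley map t -> (t + i)/(t - i), defined for real t."""
    return (t + 1j) / (t - 1j)


def cayley_inverse(u):
    """Inverse map u -> i (u + 1)/(u - 1), defined for |u| = 1, u != 1."""
    return 1j * (u + 1) / (u - 1)


for t in (-3.0, -0.5, 0.0, 2.0, 10.0):
    u = cayley(t)
    # |u| equals 1 and the inverse recovers t, both up to rounding errors
    print(t, abs(u), cayley_inverse(u))
```

The value 1 is never attained, since $(t+i)/(t-i) = 1$ would force $i = -i$; it is only approached as $|t| \to \infty$.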

Proof: Suppose that S is a self-adjoint extension of T . Recall from (2.1) that T − i is injective,
hence U := (T + i)(T − i)−1 can be defined on dom(U ) := ran(T − i). By Theorem 2.10,
ran(S − i) = H and we may define V := (S + i)(S − i)−1 on dom(V ) := H. Equation (2.1)
and Theorem 2.10 applied to S show that V is isometric and surjective, thus unitary. Since
T ⊆ S, V maps ran(T − i) onto ran(T + i); moreover, as unitary operator, V also maps the
respective orthogonal complements onto one another. Therefore, Lemma 2.9(i) implies that V
maps ker(T ∗ + i) = ran(T − i)⊥ unitarily onto ker(T ∗ − i) = ran(T + i)⊥ , hence the deficiency
indices have to be equal.

In proving the reverse implication, start by supposing dim ker(T ∗ + i) = dim ker(T ∗ − i). We learn
from (2.1) that
\[ \|(T + i)x\| = \|(T - i)x\| \qquad \forall x \in \mathrm{dom}(T), \]
therefore, U : ran(T − i) → ran(T + i), (T − i)x 7→ (T + i)x is a well-defined isometric (hence
injective) and surjective operator, which can be uniquely extended to an isometry between the
corresponding closures of the subspaces. Appealing to Lemma 2.9(i), we obtain dim ran(T − i)⊥ =
dim ker(T ∗ + i) = dim ker(T ∗ − i) = dim ran(T + i)⊥ , which allows us to extend U to a unitary
operator V : H → H by simply mapping a complete orthonormal system of ran(T − i)⊥ into such
for ran(T + i)⊥ . We proceed in four steps:

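1. $V - I$ is injective: If $y \in H$ with $(V - I)y = 0$, then also $(V^* - I)y = 0$, since $V - I$ is normal³. We therefore have for every $x \in \mathrm{dom}(T)$,

\[
\begin{aligned}
2i \langle x, y\rangle &= \langle (T + i)x - (T - i)x, y\rangle = \langle V(T - i)x - (T - i)x, y\rangle \\
 &= \langle (V - I)(T - i)x, y\rangle = \langle (T - i)x, (V^* - I)y\rangle = 0,
\end{aligned}
\]

i.e., $y \in \mathrm{dom}(T)^\perp = \{0\}$.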

2. Construction of an extension S of T : Let dom(S) := ran(V − I) and use the injectivity of


V − I to see that (V − I)z 7→ i(V z + z) gives a well-defined linear map S : ran(V − I) → H. If
x ∈ dom(T ), then (V − I)(T − i)x = V (T − i)x − (T − i)x = (T + i)x − (T − i)x = 2ix, which
shows that x ∈ ran(V − I) = dom(S) and, moreover, that
 
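\[
\begin{aligned}
Sx &= S\Big( \frac{1}{2i} (V - I)(T - i)x \Big) = \frac{1}{2i}\, i (V + I)(T - i)x = \frac{1}{2} \big( V(T - i)x + (T - i)x \big) \\
 &= \frac{1}{2} \big( (T + i)x + (T - i)x \big) = \frac{1}{2} (2Tx) = Tx .
\end{aligned}
\]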
³ Again by [Wer18, Lemma V.5.10], $\|Rx\| = \|R^*x\|$, if $R \in L(H)$ is normal. (Recall once more the quick proof: $0 = \langle (R^*R - RR^*)x, x\rangle = \langle R^*Rx, x\rangle - \langle RR^*x, x\rangle = \|Rx\|^2 - \|R^*x\|^2$.)

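3. $S$ is symmetric: If $x \in \mathrm{dom}(S) = \mathrm{ran}(V - I)$, say $x = (V - I)y$ for some $y \in H$, then

\[
\begin{aligned}
\langle Sx, x\rangle &= \langle i(V + I)y, (V - I)y\rangle = i \big( \langle Vy, Vy\rangle + \langle y, Vy\rangle - \langle Vy, y\rangle - \langle y, y\rangle \big) \\
 &= i \big( \langle y, Vy\rangle - \langle Vy, y\rangle \big) = i \cdot 2i\, \mathrm{Im} \langle y, Vy\rangle = -2\, \mathrm{Im} \langle y, Vy\rangle
\end{aligned}
\]

is a real number, in particular $\langle Sx, x\rangle = \overline{\langle Sx, x\rangle} = \langle x, Sx\rangle$. As an exercise, one can show that this implies symmetry of $S$ (e.g., by comparing the expanded expressions for $\langle S(x_1 + x_2), x_1 + x_2\rangle = \langle x_1 + x_2, S(x_1 + x_2)\rangle$ with arbitrary $x_1, x_2 \in \mathrm{dom}(S)$).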
4. S is self-adjoint: We show that ran(S ± i) = H, then the proof is complete by Theorem 2.10.
For every x ∈ H, we have (S − i)(V − I)x = i(V + I)x − i(V − I)x = 2ix and (S + i)(V − I)x =
i(V + I)x + i(V − I)x = 2iV x, thus,
   
\[ x = (S - i)\Big( \frac{1}{2i} (V - I)x \Big) = (S + i)\Big( \frac{1}{2i} (V - I)V^* x \Big), \]

hence x ∈ ran(S − i) and x ∈ ran(S + i).
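
The passage from T to V and back to S in this proof can be traced in a finite-dimensional toy case. A minimal sketch, assuming a Hermitian 2×2 matrix T (so both deficiency indices are 0 and nothing needs to be extended); the matrix and all helper names are illustrative assumptions:

```python
def mat_mul(X, Y):
    return [[sum(X[j][m] * Y[m][k] for m in range(2)) for k in range(2)]
            for j in range(2)]


def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def add_scalar(M, c):
    """Return M + c*I."""
    return [[M[j][k] + (c if j == k else 0) for k in range(2)] for j in range(2)]


def adjoint(M):
    return [[M[k][j].conjugate() for k in range(2)] for j in range(2)]


def cayley_transform(T):
    """V = (T + i)(T - i)^{-1}."""
    return mat_mul(add_scalar(T, 1j), inverse_2x2(add_scalar(T, -1j)))


def inverse_cayley_transform(V):
    """S = i (V + I)(V - I)^{-1}, i.e. (V - I)z -> i (V + I)z."""
    P = mat_mul(add_scalar(V, 1), inverse_2x2(add_scalar(V, -1)))
    return [[1j * entry for entry in row] for row in P]


def max_abs_diff(X, Y):
    return max(abs(X[j][k] - Y[j][k]) for j in range(2) for k in range(2))


T = [[1.0, 2 - 1j], [2 + 1j, -1.0]]  # assumed Hermitian example
V = cayley_transform(T)
print(max_abs_diff(mat_mul(V, adjoint(V)), [[1, 0], [0, 1]]))  # V is unitary
print(max_abs_diff(inverse_cayley_transform(V), T))            # S recovers T
```

In infinite dimensions the interesting part is the choice of the unitary extension V of U; the toy case only illustrates the algebra of steps 2 and 4.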


2.13. Examples: 1) Let $\mathbb{R}_+ := \,]0, \infty[$ and consider $Tx = ix'$ on $\mathrm{dom}(T) = \mathcal{D}(\mathbb{R}_+) := \{h \in C^\infty(\mathbb{R}_+) \mid \mathrm{supp}(h) \text{ compact}\} \subseteq L^2(\mathbb{R}_+)$. It is shown, e.g., in [Wer18, Lemma V.1.10], that $\mathcal{D}(\mathbb{R}_+)$ is dense. By techniques very similar to those applied in Example 2.7 above, one can show that

\[ \mathrm{dom}(T^*) = \{x \in L^2(\mathbb{R}_+) \mid \text{the restriction } x|_I \text{ is absolutely continuous for every compact interval } I \subset \mathbb{R}_+ \text{ and } x' \in L^2(\mathbb{R}_+)\} \]

and $T^*x = ix'$. Therefore, $T$ is symmetric and densely defined.

Determining $\ker(T^* \pm i)$ means solving the ordinary differential equation $y' \pm y = 0$ for $y \in \mathrm{dom}(T^*)$. The solutions are automatically in $C^1(\mathbb{R}_+)$ and have to be of the form $\alpha e^{\mp t}$ with $\alpha \in \mathbb{C}$. The requirement $y \in L^2(\mathbb{R}_+)$ reduces the solution spaces to

\[ \ker(T^* + i) = \mathrm{span}\{e^{-t}\}, \qquad \ker(T^* - i) = \{0\}, \]

and shows that the deficiency indices are different. By Corollary 2.11, T is not essentially self-
adjoint and Theorem 2.12 shows that there is no self-adjoint extension of T .
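
The square-integrability check that separates the two deficiency spaces amounts to the following elementary computation:

```latex
\[
\int_0^\infty |e^{-t}|^2\, dt = \frac{1}{2} < \infty,
\qquad
\int_0^R |e^{t}|^2\, dt = \frac{e^{2R} - 1}{2} \longrightarrow \infty \quad (R \to \infty),
\]
\[
\text{hence} \quad \dim \ker(T^* + i) = 1 \ne 0 = \dim \ker(T^* - i).
\]
```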
2) In Example 2.8 put Ω = Rn , Σ = B(Rn ), µ the Lebesgue measure, and f : Rn → R, f (ξ) = |ξ|2 .
We immediately obtain that the operator M of multiplication by f with domain dom(M ) = {ψ ∈
L2 (Rn ) | ξ 7→ |ξ|2 ψ(ξ) belongs to L2 (Rn )} is self-adjoint.
Consider M0 ϕ := f · ϕ with dense domain

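\[ \mathcal{S}(\mathbb{R}^n) := \{\varphi \in C^\infty(\mathbb{R}^n) \mid \forall \alpha, \beta \in \mathbb{N}^n : \xi \mapsto \xi^\alpha \partial^\beta \varphi(\xi) \text{ is bounded}\} \supseteq \mathcal{D}(\mathbb{R}^n) := \{h \in C^\infty(\mathbb{R}^n) \mid \mathrm{supp}(h) \text{ compact}\}, \tag{2.2} \]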

then M0 is symmetric, densely defined, and M0 ⊆ M . Obviously, ran(M0 ±i) ⊇ D(Rn ) shows that
these ranges are dense (since D(Rn ) is dense in L2 (Rn ), again by [Wer18, Lemma V.1.10]), hence
Corollary 2.11 implies that $M_0$ is essentially self-adjoint. Its unique self-adjoint extension and closure is given by $M$, i.e., $M = M_0^{**}$ (refer to Proposition 2.4 and directly show $\mathrm{gr}(M) \subseteq \overline{\mathrm{gr}(M_0)}$).

Well-known properties of the Fourier transform imply
\[ (\mathcal{F} \Delta h)(\xi) = \sum_{j=1}^n \mathcal{F}(\partial_j^2 h)(\xi) = \sum_{j=1}^n (i\xi_j)^2\, \hat h(\xi) = -\sum_{j=1}^n \xi_j^2\, \hat h(\xi) = -|\xi|^2\, \hat h(\xi) = -(M \hat h)(\xi), \]

thus $M$ is unitarily equivalent to the operator $-\Delta$ on the domain $H^2(\mathbb{R}^n) := \mathcal{F}^{-1} \mathrm{dom}(M)$, which can be identified with the Sobolev space of order 2 (i.e., the $L^2$-functions with weak derivatives up to order 2 belonging to $L^2$). We deduce that $-\Delta$ with domain $H^2(\mathbb{R}^n) \subseteq L^2(\mathbb{R}^n)$ is self-adjoint.
2.14. Revisiting the example revisited in Example 2.7: We study the self-adjoint extensions
of T x = ix0 with dom(T ) = {x ∈ C 1 ([0, 1]) | x(0) = 0 = x(1)} ⊆ L2 ([0, 1]).
The deficiency spaces $\ker(T^* \pm i)$ are easily determined by solving $y' \pm y = 0$ for $y \in \mathrm{dom}(T^*)$. Both basic solutions $t \mapsto e^{\mp t} =: e_\mp(t)$ belong to $\mathrm{dom}(T^*)$, which yields

\[ \mathrm{ran}(T - i)^\perp = \ker(T^* + i) = \mathrm{span}\{e_-\} \quad \text{and} \quad \mathrm{ran}(T + i)^\perp = \ker(T^* - i) = \mathrm{span}\{e_+\}. \]

The operator $U$ constructed in the proof of Theorem 2.12 maps $\mathrm{ran}(T - i) \to \mathrm{ran}(T + i)$, $i(x' - x) \mapsto i(x' + x)$ for any $x \in \mathrm{dom}(T)$. It is extended to a unitary operator $V$ on $L^2([0,1])$ by assigning a fixed normalized vector in $\mathrm{ran}(T - i)^\perp$, say $\tilde e_- := e\, c_0\, e_-$ with $c_0 := \sqrt{2/(e^2 - 1)}$, to a normalized vector in $\mathrm{ran}(T + i)^\perp$, say $V(\tilde e_-) := e^{i\gamma} \cdot \tilde e_+$ for some $\gamma \in \mathbb{R}$, where $\tilde e_+ := c_0\, e_+$. Every $\gamma \in [0, 2\pi[$ gives a different extension $V$ and these are all possible unitary extensions of $U$. The corresponding self-adjoint extension $S \supset T$ is then given on $\mathrm{dom}(S) = \mathrm{ran}(V - I)$ by $(V - I)z \mapsto i(V + I)z$.
Since T ∗∗ is the closure of T , the self-adjoint extensions of T and T ∗∗ agree. The latter can
be shown to be described by the family of operators Sα (α ∈ C, |α| = 1) with Sα x = ix0 and
dom(Sα ) = {x : [0, 1] → C | x is absolutely continuous and x0 ∈ L2 ([0, 1]), x(0) = αx(1)} (see,
e.g., [Sch00, 5.3, Beispiel 3 and 5.1, Beispiel 4, Beispiel 5] or [RS75, Section X.1, Example 1], or
also [Con10, Chapter X, Examples 2.21 and 1.11]).
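
The normalization of the two deficiency vectors can be confirmed numerically. A minimal sketch, assuming the constants are read as $c_0 = \sqrt{2/(e^2 - 1)}$, $\tilde e_- = e\, c_0\, e^{-t}$ and $\tilde e_+ = c_0\, e^{t}$, and using a midpoint rule (helper names hypothetical):

```python
import math

C0 = math.sqrt(2.0 / (math.e ** 2 - 1.0))


def l2_norm_squared(g, steps=2000):
    """Midpoint approximation of the squared L^2([0, 1]) norm of g."""
    h = 1.0 / steps
    return sum(abs(g((k + 0.5) * h)) ** 2 for k in range(steps)) * h


def e_minus_normalized(t):
    return math.e * C0 * math.exp(-t)


def e_plus_normalized(t):
    return C0 * math.exp(t)


print(l2_norm_squared(e_minus_normalized))  # close to 1
print(l2_norm_squared(e_plus_normalized))   # close to 1
```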

2.15. Extension of the notion of a spectrum: In the most general case, the definition of the
spectrum of an operator which is unbounded or/and not defined on the whole Hilbert space is not
completely uniform in the literature. Thus, let us illustrate two technical issues before giving a
definition.
Artefacts with non-dense domains: Let H = l2 (N) and consider the left-shift, given by
(Lx)n = xn+1 for x = (xn ) ∈ l2 (N). The bounded operator L ∈ L(H) clearly has ker(L) =
span{e1 }, in particular 0 ∈ σ(L). If dom(L0 ) := {e1 }⊥ and L0 is L restricted to dom(L0 ), then
we have the bounded inverse R0 : l2 (N) → dom(L0 ) given by the right-shift. Hence we could not
consider 0 to be a spectral value of L0 . However, if dom(L1 ) is any dense subspace of l2 (N) and
L1 is the corresponding restriction of L, then any sequence from dom(L1 ) approximating e1 can
be used to show that L1 is not continuously invertible, hence 0 could be considered a spectral
value of L1 . This indicates why we will define the spectrum only for densely defined operators.
Non-closed operators have empty resolvent sets: Let T : dom(T ) → H be a linear operator
(not necessarily densely defined) and suppose that λ ∈ C is such that λ − T is bijective dom(T ) →
H with bounded inverse. We will show that T has to be closed.
We first show that λ − T is closed: Let (xn ) be a sequence in dom(T ) = dom(λ − T ) such that

xn → x in H and (λ − T )xn → y in H. Then zn := (λ − T )xn belongs to (λ − T )(dom(T )) and
zn → y, hence continuity of (λ − T )−1 implies

xn = (λ − T )−1 zn → (λ − T )−1 y in addition to xn → x,

therefore, x = (λ − T )−1 y ∈ dom(T ) and (λ − T )x = y.


Now we find that also T is closed: Let (xn ) be a sequence in dom(T ) such that xn → x in H and
T xn → z in H. Then we have

(λ − T )xn = λxn − T xn → λx − z.

Since λ − T is closed, this implies x ∈ dom(λ − T ) = dom(T ) and λx − z = (λ − T )x, i.e., T x = z.


This fact is the reason why some authors prefer to define the spectrum only in case of closed
operators, since non-closed operators automatically have spectrum equal to C. However, there
are also closed unbounded densely defined operators with spectrum C, i.e., empty resolvent set.
(An explicit example is given by the multiplication operator (T f )(z) = zf (z) on dom(T ) := {f ∈
L2 (C) | z 7→ zf (z) ∈ L2 (C)} ⊂ L2 (C); details of arguments, e.g., that T is closed etc., can be
easily supplied with the help of [Wei00, Abschnitt 6.1].)

Definition: Let T : dom(T ) → H be a densely defined linear operator, then

(i) $\rho(T) := \{\lambda \in \mathbb{C} \mid \lambda - T \text{ is bijective } \mathrm{dom}(T) \to H \text{ with bounded inverse}\}$ is called the resolvent set of $T$,

(ii) R : ρ(T ) → L(H), λ 7→ Rλ := (λ − T )−1 , is the resolvent map of T ,

(iii) σ(T ) := C \ ρ(T ) is called the spectrum of T .

Remark: (i) For any closed subset S ⊆ C there is some closed densely defined operator T on a
separable Hilbert space H such that σ(T ) = S (see [Wei00, Beispiel 5.10]).

(ii) If T is closed, then it suffices to check whether λ − T is bijective to conclude λ ∈ ρ(T ) (this
follows from [Wer18, Satz IV.4.4]).

(iii) By the observation made above, σ(T ) = C for every non-closed densely defined operator T .
In particular, σ(T ) need not be compact for unbounded operators. Neither is it guaranteed that
the spectrum of an unbounded operator is non-empty. (We will give an example below.)

Proposition: Let T : dom(T ) → H be a densely defined linear operator, then

(i) ρ(T ) is open,

(ii) the resolvent map λ 7→ Rλ is analytic ρ(T ) → L(H) and we have

Rλ − Rµ = (µ − λ)Rλ Rµ ,

(Note that, in particular, Rλ and Rµ commute.)

(iii) σ(T ) is closed.

Proof: Clearly, (iii) follows from (i) due to σ(T ) = C \ ρ(T ). The proofs of (i) and (ii) are very
similar to the case of bounded T , since, by definition, the resolvent Rλ = (λ − T )−1 belongs to
L(H) for every λ ∈ ρ(T ) and maps H into dom(T ): In detail, we obtain the resolvent equation
in (ii) from

Rλ = Rλ (µ − T )Rµ = Rλ (µ − λ + λ − T )Rµ = (µ − λ)Rλ Rµ + Rµ .

As for (i), suppose $\lambda \in \rho(T)$ and $\mu \in \mathbb{C}$ satisfies $|\lambda - \mu| < 1/\|R_\lambda\|$. Then $\|(\lambda - \mu)R_\lambda\| < 1$ and hence $1 - (\lambda - \mu)R_\lambda$ is invertible with inverse given by the Neumann series, thus as a power series in the variable $\mu$ with coefficients in $L(H)$. The resolvent equation would suggest $R_\lambda = R_\mu (1 - (\lambda - \mu)R_\lambda)$, and indeed $S := R_\lambda (1 - (\lambda - \mu)R_\lambda)^{-1} \in L(H)$ is a power series in the variable $\mu$ such that $(\mu - T)Sx = S(\mu - T)x = x$ holds for all $x$ in the dense subspace $\mathrm{dom}(T) \subseteq H$ (note that $R_\lambda$ commutes with $(1 - (\lambda - \mu)R_\lambda)^{-1}$ and use $\mu - T = \mu - \lambda + \lambda - T$). Therefore $\mu \in \rho(T)$, thus $\rho(T)$ is open. Obviously, we have also proven the claim of analyticity in (ii) along the way.
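
The resolvent equation can be verified on a small matrix. A minimal sketch, assuming a 2×2 matrix with eigenvalues −1 and −2 and two points λ, μ in its resolvent set (all concrete values and helper names are illustrative assumptions):

```python
def mat_mul(X, Y):
    return [[sum(X[j][m] * Y[m][k] for m in range(2)) for k in range(2)]
            for j in range(2)]


def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def resolvent(T, lam):
    """R_lam = (lam - T)^{-1} for a 2x2 matrix T."""
    return inverse_2x2([[(lam if j == k else 0) - T[j][k] for k in range(2)]
                        for j in range(2)])


def resolvent_identity_defect(T, lam, mu):
    """Largest entry of (R_lam - R_mu) - (mu - lam) R_lam R_mu in absolute value."""
    R_lam, R_mu = resolvent(T, lam), resolvent(T, mu)
    product = mat_mul(R_lam, R_mu)
    return max(abs(R_lam[j][k] - R_mu[j][k] - (mu - lam) * product[j][k])
               for j in range(2) for k in range(2))


T = [[0.0, 1.0], [-2.0, -3.0]]  # assumed example with spectrum {-1, -2}
print(resolvent_identity_defect(T, 1.0, 2j))  # rounding-level value
```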
Examples: 1) On H = L²([0, 1]) consider T x = ix′ with

dom(T) = {x ∈ C([0, 1]) | x is absolutely continuous, x′ ∈ L²([0, 1]), x(0) = 0}.

The domain is dense and T is a closed operator (without proof, but recommended as an exercise).
We claim that σ(T) = ∅.
Let λ ∈ C and x ∈ L²([0, 1]) be arbitrary and consider the equation (λ − T)y = x, which translates
into the ordinary differential equation y′ + iλy = ix. The solutions to the homogeneous equation
are complex multiples of the function t ↦ exp(−iλt), and the usual “variation of constants” together
with the “initial condition” y(0) = 0 yields the explicit solution formula

\[ y(t) = i \int_0^t e^{i\lambda(s-t)}\, x(s)\, ds =: (S_\lambda x)(t) \qquad (t \in [0,1]). \]

We see that indeed this defines a solution y ∈ dom(T) and thus Sλ : L²([0, 1]) → dom(T) is the
inverse of λ − T. Closedness of T (hence of λ − T) implies boundedness of Sλ (again by [Wer18,
Satz IV.4.4]), but alternatively it is not difficult to see directly that ‖Sλ x‖₂ ≤ e^{|Im(λ)|} ‖x‖₂. Thus,
Sλ is the resolvent Rλ and λ ∈ ρ(T).
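As a numerical sanity check of the solution formula, the following minimal sketch approximates Sλ x by the trapezoidal rule and compares it with the closed form for the constant function x = 1, which for λ ≠ 0 is (1 − e^{−iλt})/λ. The function names `apply_S` and `exact_for_constant_one`, the chosen value of λ, and the quadrature are illustrative assumptions and not part of the notes.

```python
import cmath

def apply_S(lam, x, t, n=2000):
    """Approximate (S_lam x)(t) = i * int_0^t exp(i*lam*(s-t)) x(s) ds (trapezoidal rule)."""
    if t == 0:
        return 0j
    h = t / n
    total = 0j
    for k in range(n + 1):
        s = k * h
        weight = 0.5 if k in (0, n) else 1.0   # trapezoidal end-point weights
        total += weight * cmath.exp(1j * lam * (s - t)) * x(s)
    return 1j * h * total

def exact_for_constant_one(lam, t):
    """Closed form of S_lam applied to the constant function 1 (valid for lam != 0)."""
    return (1 - cmath.exp(-1j * lam * t)) / lam

lam = 2.0 - 1.5j                               # an arbitrary (assumed) complex number
for t in (0.0, 0.3, 1.0):
    approx = apply_S(lam, lambda s: 1.0, t)
    print(t, abs(approx - exact_for_constant_one(lam, t)))   # small quadrature error
```

The closed form satisfies λy − iy′ = 1 and y(0) = 0, so the agreement illustrates that Sλ inverts λ − T for this particular right-hand side; it is of course no substitute for the proof.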
2) Let (Ω, Σ, µ) be a σ-finite⁴ measure space and f : Ω → R be measurable. Consider the multiplication
operator T x = f x on L²(Ω, µ) with domain dom(T) = {x ∈ L²(Ω, µ) | f x ∈ L²(Ω, µ)}.
By Example 2.8, T is self-adjoint.

⁴ I.e., ∃Ωn ∈ Σ (n ∈ N): µ(Ωn) < ∞ and ⋃_{n∈N} Ωn = Ω.

We claim that σ(T) = {λ ∈ R | ∀ε > 0 : µ(f⁻¹([λ − ε, λ + ε])) > 0} =: essential range of f.

If λ ∈ C is not in the essential range of f, then 1/(λ − f) ∈ L^∞(Ω, µ) and the corresponding
multiplication operator is bounded; moreover, it is an inverse of λ − T, since

\[ \frac{f y}{\lambda - f} = -y + \frac{\lambda}{\lambda - f}\, y \;\in\; L^2(\Omega,\mu) + L^\infty(\Omega,\mu)\cdot L^2(\Omega,\mu) \subseteq L^2(\Omega,\mu) \]

for every y ∈ L²(Ω, µ). Therefore, λ ∈ ρ(T).

If λ ∈ ρ(T), then we may argue pointwise µ-almost everywhere to show that the resolvent Rλ =
(λ − T)⁻¹ : L²(Ω, µ) → dom(T) has to be given as operator of multiplication by the function
h := 1/(λ − f). Note that for any E ∈ Σ of finite measure we have

\[ \int_E |h(\omega)|^2 \, d\mu(\omega) = \int_\Omega |h(\omega)\chi_E(\omega)|^2 \, d\mu(\omega) = \|R_\lambda \chi_E\|_2^2 \le \|R_\lambda\|^2 \|\chi_E\|_2^2 = \|R_\lambda\|^2 \mu(E). \]

We show that µ({ω ∈ Ω | |h(ω)| > ‖Rλ‖}) = 0: For n ∈ N let An := {ω ∈ Ω | |h(ω)| ≥ ((n+1)/n) ‖Rλ‖};
if E is a measurable set of finite measure with E ⊆ An, then we deduce from the above that

\[ \|R_\lambda\|^2 \mu(E) \;\ge\; \int_E |h(\omega)|^2 \, d\mu(\omega) \;\ge\; \frac{(n+1)^2}{n^2}\, \|R_\lambda\|^2 \int_E 1 \, d\mu \;=\; \frac{(n+1)^2}{n^2}\, \|R_\lambda\|^2 \mu(E), \]

hence µ(E) = 0 (since certainly Rλ ≠ 0); therefore, by σ-finiteness, µ(An) = 0 for every n ∈ N, and
⋃_{n∈N} An = {ω ∈ Ω | |h(ω)| > ‖Rλ‖} has measure 0.
We conclude that 1/|λ − f(ω)| = |h(ω)| ≤ ‖Rλ‖ holds for µ-almost all ω ∈ Ω, which implies that λ
cannot belong to the essential range of f.
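The simplest instance of this example is a finite set Ω with a weighted counting measure, where L²(Ω, µ) is finite-dimensional and λ lies in the essential range exactly when f takes the value λ at a point of positive mass. The sketch below is only such a toy model; the names `essential_range` and `resolvent_norm` and the sample data are assumptions made for illustration.

```python
def essential_range(f_values, weights):
    """Essential range of f on a finite set whose points carry the masses `weights`.

    On a finite measure space, lam is in the essential range exactly when
    f equals lam at some point of positive mass.
    """
    return sorted({f for f, w in zip(f_values, weights) if w > 0})

def resolvent_norm(lam, f_values, weights):
    """Norm of multiplication by 1/(lam - f) on L^2: the essential supremum of |1/(lam - f)|."""
    return max(1 / abs(lam - f) for f in essential_range(f_values, weights))

f_values = [0.0, 1.0, 2.0, 5.0]
weights = [1.0, 0.5, 0.0, 2.0]      # the point where f = 2 carries no mass

spectrum = essential_range(f_values, weights)
print(spectrum)                                   # the value 2.0 is invisible to the measure
print(resolvent_norm(2.0, f_values, weights))     # finite, so 2.0 lies in the resolvent set
```

In this toy model the resolvent norm equals the reciprocal of the distance from λ to the essential range, in line with the bound |h| ≤ ‖Rλ‖ derived above.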

Theorem: Let T be a symmetric densely defined operator on H, then

T is self-adjoint ⇐⇒ σ(T ) ⊆ R.

Proof: ⇐: We have ∓i ∈ ρ(T ), which implies ran(T ± i) = H. By Theorem 2.10, T is self-adjoint.


⇒: Let z = λ + iµ with real µ ≠ 0 and consider S := (T − λ)/µ on dom(S) := dom(T). The
operator S is self-adjoint and by Equation (2.1) (on page 37), applied to S, we obtain for every
x ∈ dom(T):

‖(z − T)x‖² = ‖(λ + iµ)x − (λ + µS)x‖² = ‖iµx − µSx‖² = µ² ‖(i − S)x‖² ≥ µ² ‖x‖².

From this we learn that the inverse (z − T)⁻¹ exists as linear map ran(z − T) → dom(T) and is
bounded. Since S is self-adjoint, ran(z − T) = ran(i − S) = H by Theorem 2.10 and therefore
z ∈ ρ(T).
Thus, symmetric operators need not have real spectrum; by the theorem, this happens exactly when
they fail to be self-adjoint. Recall that non-closed operators have
spectrum equal to C, hence, in particular, any non-closed densely defined symmetric operator T
has σ(T) = C ⊄ R (for example, T x = ix′ on L²([0, 1]) with dom(T) = {x ∈ C¹([0, 1]) | x(0) =
0 = x(1)}, which has been studied repeatedly above).
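The key estimate ‖(z − T)x‖ ≥ |Im z| ‖x‖ from the proof of “⇒” can be observed directly in finite dimensions, where self-adjoint operators are Hermitian matrices. The following sketch uses an arbitrarily chosen 2×2 Hermitian matrix, vector, and non-real z; all of these are assumptions for illustration.

```python
def mat_vec(A, x):
    """Matrix-vector product for matrices given as lists of rows."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

def norm(x):
    """Euclidean norm on C^n."""
    return sum(abs(c) ** 2 for c in x) ** 0.5

# A Hermitian matrix (equal to its conjugate transpose), i.e. self-adjoint on C^2.
T = [[1.0, 2 - 1j],
     [2 + 1j, -3.0]]

z = 0.7 + 0.25j                       # non-real number with Im z = 0.25
x = [1.0 + 2j, -0.5j]

Tx = mat_vec(T, x)
lhs = norm([z * x[k] - Tx[k] for k in range(2)])   # ||(z - T) x||
rhs = abs(z.imag) * norm(x)                         # |Im z| * ||x||
print(lhs >= rhs, lhs, rhs)
```

In particular z − T is injective with ‖(z − T)⁻¹‖ ≤ 1/|Im z|, which is the finite-dimensional shadow of the boundedness statement in the proof.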

3. The spectral theorem for unbounded self-adjoint operators
3.1. Theorem (Multiplication operator version of the spectral theorem): Let the operator
T : dom(T) → H be self-adjoint, then there exist a measure space (Ω, Σ, µ), a measurable
function f : Ω → R, and a unitary operator U : H → L²(Ω, µ) such that
(a) ∀x ∈ H: x ∈ dom(T) if and only if f · U x ∈ L²(Ω, µ),
(b) T is unitarily equivalent via U to the multiplication operator Mf ϕ := f ϕ on L²(Ω, µ) with
dom(Mf) := {ϕ ∈ L²(Ω, µ) | f ϕ ∈ L²(Ω, µ)}, i.e., for every ϕ ∈ dom(Mf),

U T U⁻¹ ϕ = Mf ϕ = f ϕ µ-almost everywhere.

Proof: By Theorem 2.15, σ(T) ⊆ R and therefore R := (T + i)⁻¹ and (T − i)⁻¹ exist as bounded
operators H → dom(T). The plan of the proof is to show that R is normal, apply the spectral
theorem for bounded normal operators to R, and “map” everything back to T via the formal
relation T = R⁻¹ − i.
Let z1, z2 ∈ H be arbitrary. There exist x, y ∈ dom(T) such that z1 = (T + i)x and z2 = (T − i)y
and we obtain

⟨Rz1, z2⟩ = ⟨x, (T − i)y⟩ = ⟨(T + i)x, y⟩ = ⟨z1, (T − i)⁻¹ z2⟩,

i.e., R∗ = (T − i)⁻¹ and the resolvent equation, Proposition 2.15(ii), shows that

\[ (\ast) \qquad RR^* = R^*R = \frac{1}{2i}\,(R^* - R). \]

Thus, R is normal and by Theorem 1.25 there is a measure space (Ω, Σ, µ), a unitary map U : H →
L²(Ω, µ), and a bounded measurable function g : Ω → C such that U R U⁻¹ = Mg, where Mg
denotes the operator of multiplication by g on L²(Ω, µ).
Being an inverse map, R is injective, hence so is Mg, and N := {ω ∈ Ω | g(ω) = 0} has to be a
µ-null set. Therefore, f(ω) := 1/g(ω) − i is defined for µ-almost every ω ∈ Ω. Moreover, from (∗)
we obtain

\[ |g|^2 = \frac{1}{2i}\,(\overline{g} - g) = \frac{1}{2i}\,(-2i \operatorname{Im}(g)) = -\operatorname{Im}(g) \]

and may deduce that

\[ \forall \omega \in \Omega \setminus N: \quad f(\omega) = \frac{1}{g(\omega)} - i = \frac{\overline{g(\omega)}}{g(\omega)\overline{g(\omega)}} - i = \frac{\operatorname{Re}(g(\omega)) + i|g(\omega)|^2}{|g(\omega)|^2} - i = \frac{\operatorname{Re}(g(\omega))}{|g(\omega)|^2} \in \mathbb{R}, \]
in particular, f can be considered as a real-valued measurable function f : Ω → R such that
f g = 1 − ig holds µ-almost everywhere, which also shows that f g is essentially bounded.
We prove (a): Let x ∈ dom(T), then there is y ∈ H such that x = Ry and we obtain

f · U x = f · U Ry = f · (Mg U y) = f g · U y ∈ L²(Ω, µ),

because f g is essentially bounded and U y ∈ L²(Ω, µ).

To show the reverse implication, suppose f · U x ∈ L²(Ω, µ). Then (f + i)U x ∈ L²(Ω, µ) as well and
we may choose y ∈ H such that (f + i)U x = U y, hence g · U y = g(f + i)U x = (gf + ig)U x = U x.
In other words, x = (U⁻¹ Mg U)y = Ry = (T + i)⁻¹ y ∈ dom(T).
We prove (b): By (a) we have dom(Mf) = U(dom(T)). Let ϕ ∈ dom(Mf) and x ∈ dom(T) such
that ϕ = U x. Choose y ∈ H with x = Ry = (T + i)⁻¹ y, then y = T x + ix and g · U y = U x as
above. We obtain µ-almost everywhere

\[ U T U^{-1}\varphi = U T x = U(y - ix) = U y - i U x = \frac{1}{g}\, U x - i U x = \Bigl(\frac{1}{g} - i\Bigr) U x = f \cdot U x = f \cdot \varphi. \]
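The passage from g back to f can be checked on the level of single spectral values: if T acts as multiplication by a real number t, then R = (T + i)⁻¹ acts as multiplication by g = 1/(t + i). The sketch below verifies the identities |g|² = −Im(g) and f = 1/g − i = t for a few sample values; the helper names `g_of` and `f_of` are assumptions made for illustration.

```python
def g_of(t):
    """Multiplier of R = (T + i)^{-1} at a real spectral value t of T."""
    return 1 / (t + 1j)

def f_of(g):
    """Recover the real multiplier f = 1/g - i from g."""
    return 1 / g - 1j

for t in (-3.0, 0.0, 0.5, 40.0):
    g = g_of(t)
    f = f_of(g)
    # First column: |g|^2 + Im(g), which should vanish; second: f, which should equal t.
    print(t, abs(g) ** 2 + g.imag, f)
```

The vanishing first column is relation (∗) read pointwise, and the second column shows that f is real and reproduces t up to rounding.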

3.2. Example: [As earlier in Example 1.24 we will make use of basic facts about the Fourier
transform and refer again to the course on Real Analysis or [Con16, Fol99].]
We consider the operator T x = −ix′ with dom(T) = S(R) ⊆ L²(R).
Let F denote the Fourier transform as unitary operator L²(R) → L²(R), then F(S(R)) = S(R)
and, similarly to Example 2.13, 2), we obtain (F T F⁻¹ ϕ)(t) = t ϕ(t) (from the so-called “exchange”
of differentiation with multiplication by the Fourier transform), i.e., the unitary equivalence of T
with the operator M of multiplication by the real variable, with dom(M) = S(R). Clearly, T and
M are symmetric and densely defined; moreover, both subspaces ran(T ± i) = F⁻¹(ran(M ± i))
and ran(M ± i) are dense in L²(R), since the latter obviously contains D(R). Thus, T and M
are essentially self-adjoint by Corollary 2.11 and their unique self-adjoint extensions are given by
(their closures) T∗ and M∗.
We claim that dom(M∗) = {ψ ∈ L²(R) | t ↦ t ψ(t) ∈ L²(R)}: It is clear that any ψ ∈ L²(R) such
that t ↦ t ψ(t) ∈ L²(R) belongs to dom(M∗), since we may then write
\( \langle M\varphi, \psi\rangle = \int \varphi(t)\, t\, \overline{\psi(t)}\, dt \)
for every ϕ ∈ dom(M) = S(R) and obtain
\( |\langle M\varphi, \psi\rangle| \le \bigl(\int t^2 |\psi(t)|^2\, dt\bigr)^{1/2} \|\varphi\|_2 \).
To show the reverse inclusion, suppose ψ ∈ L²(R) is such that lψ : ϕ ↦ ⟨Mϕ, ψ⟩ is an L²-norm continuous
linear functional S(R) → C. By density of S(R) ⊆ L²(R), this means that there is some η ∈ L²(R)
such that

\[ (\ast) \qquad \forall \varphi \in \mathcal{S}(\mathbb{R}): \quad \int \varphi(t)\, t\, \overline{\psi(t)}\, dt = \int \varphi(t)\, \overline{\eta(t)}\, dt. \]

(Note that we have lψ(ϕ) = ⟨ϕ, η⟩ for every ϕ ∈ L²(R), but at this moment we have the formula
\( l_\psi(\varphi) = \int t\varphi(t)\overline{\psi(t)}\, dt \) only if ϕ ∈ S(R).) Since both t ↦ t ψ(t) and η are integrable on every
compact subset of R, we may use the following fact, which is shown later in the course of an injectivity
argument in Example 1) of 7.5: If f1 and f2 are locally integrable functions on R such that
∫ ϕ f1 = ∫ ϕ f2 holds for every ϕ ∈ S(R), then f1 = f2 almost everywhere. Thus, (∗) implies that
t ψ(t) = η(t) for almost all t, hence the function t ↦ t ψ(t) belongs to L²(R).

We also obtain (M∗ψ)(t) = t ψ(t) and recognize M∗ as a special case of the self-adjoint multiplication
operator in Example 2.8. Therefore, T∗x = −ix′ on the domain H¹(R) := F⁻¹(dom(M∗)),
which can be shown to consist of all L²-functions that are absolutely continuous on compact
intervals and have derivative in L²(R); this is the (L²-based) Sobolev space of order 1 on R.
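For orientation, the “exchange” formula used in this example can be written out as follows. The normalization of the Fourier transform is an assumption here (the unitary convention with kernel e^{−ist} and factor (2π)^{−1/2}); with a different convention, signs and constants change accordingly. For x ∈ S(R), integration by parts without boundary terms gives:

```latex
\[
(\mathcal{F}x)(t) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} e^{-ist}\, x(s)\, ds ,
\]
\[
\mathcal{F}(x')(t)
= \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} e^{-ist}\, x'(s)\, ds
= -\frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} (-it)\, e^{-ist}\, x(s)\, ds
= it\, (\mathcal{F}x)(t),
\]
\[
\mathcal{F}(Tx)(t) = \mathcal{F}(-i x')(t) = t\, (\mathcal{F}x)(t),
\qquad\text{i.e.}\qquad \mathcal{F}\, T\, \mathcal{F}^{-1} \varphi = M \varphi \quad (\varphi \in \mathcal{S}(\mathbb{R})).
\]
```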
3.3. Spectral representation of an unbounded self-adjoint operator: Let T be self-adjoint
and Mf as in Theorem 3.1. If h : R → C is a bounded measurable function, then h ◦ f : Ω → C
is bounded measurable and the operator h(Mf) := M_{h◦f} of multiplication by
h ◦ f is bounded on L²(Ω, µ). The map h ↦ h(Mf) is continuous and multiplicative Bb(R) →
L(L²(Ω, µ)), since clearly ‖M_{h◦f}‖ ≤ ‖h‖∞ and (h1 · h2) ◦ f = (h1 ◦ f) · (h2 ◦ f).
For any Borel subset A ⊆ R we put

\[ F_A := \chi_A(M_f) = M_{\chi_A \circ f} = M_{\chi_{f^{-1}(A)}}. \]

It is not difficult to check that F : A ↦ FA is a spectral measure and that

\[ h(M_f) = \int_{\mathbb{R}} h(\lambda)\, dF_\lambda \qquad \forall h \in B_b(\mathbb{R}), \]

since the latter obviously holds for step functions and the multiplication operator action reduces
the question about the relevant limits to the approximation of bounded measurable functions by
step functions. In general, F will not have compact support.
Let EA := U⁻¹ FA U (A ∈ B(R)), then E : A ↦ EA is a spectral measure B(R) → L(H) and for
any bounded measurable function h : R → C we have

\[ \int_{\mathbb{R}} h(\lambda)\, dE_\lambda = U^{-1} h(M_f)\, U, \]

again by extending the obvious relation for characteristic functions to step functions and further
to bounded measurable functions via approximation. We put h(T) := U⁻¹ h(Mf) U and obtain
the usual properties of a functional calculus for the map h ↦ h(T).
We will extend this functional calculus to allow for a measurable unbounded real-valued function
h : R → R, thereby obtaining a self-adjoint unbounded operator h(T) on the dense domain

\[ D_h := \Bigl\{ x \in H \;\Big|\; \int_{\mathbb{R}} |h(\lambda)|^2 \, d\langle E_\lambda x, x\rangle < \infty \Bigr\} \]

(the density is not obvious and will be proved below).

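Let x, y ∈ H and ϕ, ψ ∈ L²(Ω, µ) such that U x = ϕ, U y = ψ, then for any A ∈ B(R),

\[ \langle E_A x, y\rangle = \langle U^{-1} F_A U x, U^{-1} U y\rangle = \langle F_A \varphi, \psi\rangle = \int_\Omega \chi_{f^{-1}(A)}\, \varphi\, \overline{\psi}\, d\mu = \int_{f^{-1}(A)} \varphi\, \overline{\psi}\, d\mu. \]

Putting \( \nu(B) := \int_B \varphi \overline{\psi}\, d\mu \) we may interpret the last expression above as image measure f(ν)(A) =
ν(f⁻¹(A)), hence ⟨EA x, y⟩ = f(ν)(A). The transformation formula for the integral of a d⟨Eλ x, y⟩-
integrable function g : R → R with respect to image measures then gives

\[ (3.1) \qquad \int_{\mathbb{R}} g(\lambda)\, d\langle E_\lambda x, y\rangle = \int_\Omega (g \circ f)\, d\nu = \int_\Omega (g \circ f) \cdot \varphi \cdot \overline{\psi}\, d\mu. \]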

Applying the above equation to |h|² we find that x ∈ Dh if and only if ϕ = U x satisfies
\( \int_\Omega |h \circ f|^2 |\varphi|^2 \, d\mu < \infty \). Note that Example 2.8, now with h ◦ f in place of f there, shows that the set
of such ϕ, namely, dom(M_{h◦f}), is dense in L²(Ω, µ). Therefore, the corresponding set of vectors
x = U⁻¹ϕ, which is Dh, is dense in H.
For every x ∈ Dh and y ∈ H, the integral \( \int_{\mathbb{R}} h(\lambda)\, d\langle E_\lambda x, y\rangle \) exists: Using ϕ = U x, ψ = U y and
the analogue of (3.1) for the variations of the complex measures, we have

\[ \int_{\mathbb{R}} |h(\lambda)| \, d|\langle E_\lambda x, y\rangle| = \int_\Omega |h \circ f| \cdot |\varphi| \cdot |\psi| \, d\mu \le \Bigl( \int_\Omega |h \circ f|^2 \cdot |\varphi|^2 \, d\mu \Bigr)^{1/2} \Bigl( \int_\Omega |\psi|^2 \, d\mu \Bigr)^{1/2} = \Bigl( \int_{\mathbb{R}} |h(\lambda)|^2 \, d\langle E_\lambda x, x\rangle \Bigr)^{1/2} \|U y\|_2 < \infty, \]

where ‖U y‖₂ = ‖y‖. Hence, for every x ∈ Dh there is a unique vector z ∈ H such that
\( \int_{\mathbb{R}} h(\lambda)\, d\langle E_\lambda x, y\rangle = \langle z, y\rangle \) for
all y ∈ H; we put h(T)x := z. Therefore, we obtain

\[ (3.2) \qquad \langle h(T)x, y\rangle = \int_{\mathbb{R}} h(\lambda)\, d\langle E_\lambda x, y\rangle \qquad \forall x \in D_h,\ \forall y \in H. \]

This defines an operator h(T) on H with dense domain Dh. The formal notation
\( h(T) = \int_{\mathbb{R}} h(\lambda)\, dE_\lambda \) is commonly used, although this integral is in general not convergent with respect
to the operator norm; the precise statement is Equation (3.2).
The special case h(t) = t gives x ∈ Dh if and only if ϕ = U x satisfies f ϕ ∈ L²(Ω, µ), i.e.,
U x ∈ dom(Mf), equivalently x ∈ dom(T) by Theorem 3.1; moreover, we have

\[ \int_{\mathbb{R}} \lambda \, d\langle E_\lambda x, y\rangle = \int_\Omega f \varphi\, \overline{\psi}\, d\mu = \langle M_f \varphi, \psi\rangle = \langle M_f U x, U y\rangle = \langle U^{-1} M_f U x, y\rangle = \langle T x, y\rangle. \]

Similarly, one may show that ⟨h(T)x, y⟩ = ⟨U⁻¹ M_{h◦f} U x, y⟩ holds for all x ∈ Dh, y ∈ H, which
implies also the self-adjointness of h(T) due to the established unitary equivalence with the
self-adjoint multiplication operator M_{h◦f} with domain
\( \operatorname{dom}(M_{h \circ f}) = \{ \varphi \in L^2(\Omega, \mu) \mid \int_\Omega |h \circ f|^2 |\varphi|^2 \, d\mu < \infty \} \).


To summarize, we have proved the following spectral representation.
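Theorem: Let T : dom(T) → H be self-adjoint, then there exists a unique spectral measure E
such that

\[ \langle T x, y\rangle = \int_{\mathbb{R}} \lambda \, d\langle E_\lambda x, y\rangle \qquad \forall x \in \operatorname{dom}(T),\ y \in H. \]

If h : R → R is measurable and \( D_h := \{ x \in H \mid \int_{\mathbb{R}} |h(\lambda)|^2 \, d\langle E_\lambda x, x\rangle < \infty \} \), then

\[ \langle h(T)x, y\rangle = \int_{\mathbb{R}} h(\lambda)\, d\langle E_\lambda x, y\rangle \qquad \forall x \in D_h,\ y \in H, \]

defines a self-adjoint operator h(T) on H with domain Dh.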
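In finite dimensions the spectral measure is a finite sum of eigenprojections and the integral in the theorem becomes a finite sum. The sketch below illustrates this for the symmetric matrix T = [[2, 1], [1, 2]], whose spectral data (eigenvalues 1 and 3 with the projections listed) are entered by hand; the names `PROJECTIONS` and `apply_h` are assumptions made for this illustration.

```python
# Spectral data of the symmetric matrix T = [[2, 1], [1, 2]]:
# eigenvalue 1 with eigenvector (1, -1)/sqrt(2), eigenvalue 3 with eigenvector (1, 1)/sqrt(2).
# The spectral measure is concentrated on {1, 3}; E({lam}) is the orthogonal projection below.
PROJECTIONS = {
    1.0: [[0.5, -0.5], [-0.5, 0.5]],
    3.0: [[0.5, 0.5], [0.5, 0.5]],
}

def apply_h(h, x):
    """Compute h(T)x as the finite sum over eigenvalues lam of h(lam) * E({lam}) x."""
    result = [0.0, 0.0]
    for lam, P in PROJECTIONS.items():
        Px = [P[i][0] * x[0] + P[i][1] * x[1] for i in range(2)]
        result = [result[i] + h(lam) * Px[i] for i in range(2)]
    return result

x = [1.0, 0.0]
print(apply_h(lambda t: t, x))        # h(t) = t reproduces T x, the first column of T
print(apply_h(lambda t: t * t, x))    # h(t) = t^2 gives T^2 x, the first column of T^2
```

Since every vector lies in Dh here, the domain question that is essential for unbounded T does not show up in this toy case.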

3.4. A glimpse at the formalism of quantum mechanics: [We discuss here the formalism
based directly on Hilbert spaces containing the physical states as in the so-called Schrödinger
picture described in [Kab14, Hal13, Tes14b, Wei00]. The more abstract algebraic approach is based
on C ∗ -algebras of observables and linear functionals on these as states (see, e.g., [Ara99, Thi03]).]

The basic idea is that (the state of) a single particle in R^d is described by a so-called complex wave
function R^d × R → C, (x, t) ↦ ψ(x, t), where ρt(x) := |ψ(x, t)|² is interpreted as the probability
density associated with the (state of the) particle at time t. Thus, the number

\[ \int_K |\psi(x,t)|^2 \, dx = \int_{\mathbb{R}^d} \chi_K(x)\, |\psi(x,t)|^2 \, dx \]

corresponds to the probability of finding the particle (in state ψ) at time t inside the measurable
region K ⊆ R^d and, in particular, \( \int_{\mathbb{R}^d} |\psi(x,t)|^2 \, dx = 1 \), since the particle has to be somewhere.

To put it differently, ψ(., t) ∈ L²(R^d) for every t ∈ R, ‖ψ(., t)‖_{L²} = 1, and the value of the L²-inner
product ⟨χK ψ(., t), ψ(., t)⟩ gives the expectation of the random variable x ↦ χK(x) at time t (for
a particle in state ψ). The bounded linear operator ψ ↦ χK ψ is but one example of an observable;
the most prominent other examples are the unbounded operators of spatial (coordinate) position
Xj : ψ ↦ xj ψ and momentum (coordinates) Pj : ψ ↦ (ħ/i) ∂_{xj} ψ (ψ ∈ S(R^d)).

In more abstract terms, the quantum mechanical (vector) states are represented by elements ψ
of a complex (usually separable) Hilbert space H that are normalized such that ‖ψ‖ = 1. The
observables are represented by densely defined linear operators on H. (Densely defined, since the
observable should be “observable in most of the possible states”.) The expectation (or mean value
of measurement) of an observable A in the state ψ ∈ dom(A) is given by

Eψ(A) := ⟨Aψ, ψ⟩.

In order to guarantee that all expectation values are real numbers, we suppose that A is symmetric
(then \( \langle A\psi, \psi\rangle = \langle \psi, A\psi\rangle = \overline{\langle A\psi, \psi\rangle} \)). But there are clear indications to require even more: The
set of all possible measurements of an observable A, represented by the spectrum σ(A), should be
real, thus, calling on Theorem 2.15, we further suppose that the observables are self-adjoint.

Note that in case ψ is an eigenvector for A corresponding to the eigenvalue λ0 we obtain

Eψ(A) = λ0 ⟨ψ, ψ⟩ = λ0,

since Aψ = λ0 ψ. In general, the mean deviation ∆ψ(A) := ‖Aψ − Eψ(A)ψ‖ of A in the state ψ
will not vanish. We have just seen that precisely the eigenvalues occur as sharp measurements
(i.e., with zero deviation).

Recall from linear algebra that two self-adjoint operators A and B on a finite-dimensional Hilbert
space are simultaneously diagonalizable if and only if they commute, i.e., [A, B] := AB − BA = 0.
Considering now the infinite-dimensional case and interpreting the eigenvalues as the possible
sharp measurements of observables, one can generalize this and ask to what extent simultaneous
measurements Eψ (A), Eψ (B) can be achieved at least with small mean deviations. There is a
general lower bound, which in the example A = Xj , B = Pj of the position and momentum

operators (in the same coordinate) implies the famous Heisenberg uncertainty principle from the
fact [Xj, Pj] = iħI: If ψ ∈ {ϕ ∈ dom(A) ∩ dom(B) | Bϕ ∈ dom(A) and Aϕ ∈ dom(B)}, then

\[ \Delta_\psi(A)\, \Delta_\psi(B) \ge \frac{1}{2}\, |\langle [A, B]\psi, \psi\rangle|. \]

Proof: Let A′ := A − Eψ(A) and B′ := B − Eψ(B). Since I commutes with every operator, we
have¹

\[ \langle [A,B]\psi, \psi\rangle = \langle (AB - BA)\psi, \psi\rangle = \langle (A'B' - B'A')\psi, \psi\rangle = \langle A'B'\psi, \psi\rangle - \langle \psi, (B'A')^*\psi\rangle \]
\[ = \langle A'B'\psi, \psi\rangle - \langle \psi, A'B'\psi\rangle = \langle A'B'\psi, \psi\rangle - \overline{\langle A'B'\psi, \psi\rangle} = 2i \operatorname{Im}\langle A'B'\psi, \psi\rangle. \]

Therefore, we deduce

|⟨[A, B]ψ, ψ⟩| ≤ 2 |⟨A′B′ψ, ψ⟩| = 2 |⟨B′ψ, A′ψ⟩| ≤ 2 ‖B′ψ‖ ‖A′ψ‖ = 2 ∆ψ(A) ∆ψ(B).
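The inequality also holds for Hermitian matrices, where all domain conditions are void, and this allows a quick numerical illustration. The sketch below uses the Pauli matrices σx and σy on C² and the state ψ = (1, 0); these choices and all function names are assumptions for illustration, and the inner product is taken linear in the first argument as in the notes.

```python
def mat_vec(A, x):
    """Matrix-vector product for matrices given as lists of rows."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

def inner(u, v):
    """Inner product, linear in the first and conjugate-linear in the second argument."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def norm(u):
    return abs(inner(u, u)) ** 0.5

def expectation(A, psi):
    """E_psi(A) = <A psi, psi>."""
    return inner(mat_vec(A, psi), psi)

def deviation(A, psi):
    """Mean deviation ||A psi - E_psi(A) psi|| of the observable A in the state psi."""
    e = expectation(A, psi)
    Apsi = mat_vec(A, psi)
    return norm([Apsi[k] - e * psi[k] for k in range(len(psi))])

def commutator_expectation(A, B, psi):
    """<[A, B] psi, psi> with [A, B] = AB - BA."""
    ABpsi = mat_vec(A, mat_vec(B, psi))
    BApsi = mat_vec(B, mat_vec(A, psi))
    return inner([ABpsi[k] - BApsi[k] for k in range(len(psi))], psi)

sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
psi = [1 + 0j, 0j]                     # normalized state

lhs = deviation(sigma_x, psi) * deviation(sigma_y, psi)
rhs = 0.5 * abs(commutator_expectation(sigma_x, sigma_y, psi))
print(lhs, rhs, lhs >= rhs - 1e-12)
```

For this particular state both sides are equal, so the lower bound is attained; other states generally give a strict inequality.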

The dynamical aspect of the quantum mechanical particle in R^d as introduced above is in the
t-evolution of the wave function ψ(., t) ∈ L²(R^d). The model assumption is that ψ(., t) =
U(t)ψ(., 0), where U(t) is linear (superposition principle) and maps states into states, thus
‖U(t)ψ(., 0)‖₂ = ‖ψ(., 0)‖₂. Hence it is plausible to suppose that U(t) is a unitary operator
for every t. Moreover, it is natural to suppose that the evolution of duration t starting at time
s gives the same result as the evolution of duration t + s starting at time 0, i.e., to require
U(t)U(s)ψ(., 0) = U(t)ψ(., s) = ψ(., t + s) = U(t + s)ψ(., 0), or, U(t)U(s) = U(t + s) on the
level of operators. Thus, t ↦ U(t) is a group homomorphism from (R, +) to the group of unitary
operators on L²(R^d). Finally, we will expect a certain continuity property of the evolution, at
least along each trajectory of a given initial state, i.e., continuity of the map t ↦ U(t)ψ(., 0),
R → L²(R^d).

In abstract terms, we implement the dynamical aspect of a quantum mechanical model on the
Hilbert space H of states by a so-called strongly continuous (s-continuous) unitary group of
operators U(t) (t ∈ R), which means that U(t) ∈ L(H) is unitary for every t, U(t)U(s) = U(t + s)
holds for all s, t ∈ R, and for every ψ ∈ H and t0 ∈ R we have lim_{t→t0} U(t)ψ = U(t0)ψ. (We
see that the term ‘strong continuity’ here simply refers to the pointwise continuity of the map
t ↦ U(t), since the convergence U(t) → U(t0) is tested pointwise on H; the topology on L(H)
describing this pointwise convergence is usually called the strong operator topology, hence the
notion of s-continuity.)

We can obtain a strongly continuous unitary group on H from any self-adjoint operator S via
functional calculus. We may simply put U(t) := exp(itS) and check, basically as an exercise, that
all the required properties hold (cf. [Con10, Chapter X, Theorem 5.1] or [Kab14, Satz 16.16] or
[Wei00, Satz 7.4]). Moreover, it is not difficult to see that dom(S) is invariant under U(t) and

\[ \forall \psi \in \operatorname{dom}(S): \quad \frac{d}{dt}\, U(t)\psi := \lim_{h \to 0} \frac{U(t+h)\psi - U(t)\psi}{h} = iS\, U(t)\psi, \]

¹ Boldly using the relation (ST)∗ = T∗S∗ for unbounded operators here without proof.

i.e., ϕ(t) := U(t)ψ satisfies the initial value problem

\[ (3.3) \qquad \frac{d}{dt}\, \varphi(t) = iS\varphi(t), \qquad \varphi(0) = \psi. \]

By the famous theorem of Stone (see [Con10, Chapter X, Theorem 5.6] or [Kab14, Satz 16.18]
or [Wei00, Satz 7.3]), every s-continuous unitary group U(t) (t ∈ R) is given in the above form
U(t) = exp(itS), where S is a unique self-adjoint operator. In fact, S may be constructed on
dom(S) := {ψ ∈ H | lim_{h→0} (U(h)ψ − ψ)/h exists} by the assignment Sψ := −i · lim_{h→0} (U(h)ψ − ψ)/h.
Therefore, the dynamics of the quantum mechanical system is determined by this self-adjoint
operator S. In terms of physics, S corresponds to the energy of the system and is called the
Hamiltonian and the differential equation in (3.3) is called the Schrödinger equation. Observe
that in case the initial value ψ happens to be an eigenvector for S with eigenvalue λ ∈ R, i.e.,
Sψ = λψ, then the solution to (3.3) is given by ϕ(t) = e^{iλt} ψ, which corresponds to a bound
state in physics, since ϕ(t) ∈ span{ψ} for all times t ∈ R; in particular, the expectation values of
observables remain constant: E_{ϕ(t)}(A) = ⟨Aϕ(t), ϕ(t)⟩ = e^{iλt} e^{−iλt} ⟨Aψ, ψ⟩ = Eψ(A).
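The defining properties of the unitary group U(t) = exp(itS) can be observed in the simplest possible setting, a diagonal self-adjoint S on C², where exp(itS) multiplies each coordinate by e^{itλ}. The eigenvalues, the initial state, and the step size below are assumptions chosen only for illustration.

```python
import cmath

EIGENVALUES = [0.5, -2.0]      # hypothetical spectrum of a diagonal self-adjoint S on C^2

def U(t, psi):
    """Apply U(t) = exp(i t S) for the diagonal operator S = diag(EIGENVALUES)."""
    return [cmath.exp(1j * t * lam) * c for lam, c in zip(EIGENVALUES, psi)]

def norm(v):
    return sum(abs(c) ** 2 for c in v) ** 0.5

def dist(u, v):
    return norm([a - b for a, b in zip(u, v)])

psi = [0.6 + 0j, 0.8j]
t, s, h = 0.7, -1.3, 1e-6

print(dist(U(t, U(s, psi)), U(t + s, psi)))      # group law U(t)U(s) = U(t+s)
print(abs(norm(U(t, psi)) - norm(psi)))           # norm preservation
difference_quotient = [(a - b) / h for a, b in zip(U(t + h, psi), U(t, psi))]
iSU = [1j * lam * c for lam, c in zip(EIGENVALUES, U(t, psi))]
print(dist(difference_quotient, iSU))             # small: d/dt U(t)psi = i S U(t)psi
```

All three printed numbers are close to zero; the last one is a forward difference quotient and is therefore only accurate up to a term of order h.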

Part II.

Locally convex vector spaces

4. Vector space topologies
In this second part of the lecture notes, X will denote a vector space with scalar field K = R or
K = C.
4.1. Definition: A seminorm on X is a map p : X → [0, ∞[ such that ∀λ ∈ K, x, y ∈ X:
(a) p(λx) = |λ| p(x),
(b) p(x + y) ≤ p(x) + p(y).
Note that p(x) ≥ 0 and that p(0) = 0 (by (a)), but p(x) = 0 does not imply x = 0.
4.2. Example: Let X = C^{[0,1]} = {x : [0, 1] → C} and for every t ∈ [0, 1] let pt : X → [0, ∞[ be
defined by pt (x) := |x(t)|. Then pt is a seminorm on X and we see that pt (x) = 0 for any function
x with x(t) = 0. The set of seminorms P := {pt | t ∈ [0, 1]} can be used to describe the pointwise
convergence of functions in the sense that xn → x pointwise on [0, 1], if and only if

∀t ∈ [0, 1] : pt (xn − x) → 0 (n → ∞),

written in more abstract terms: ∀p ∈ P we have p(xn − x) → 0 as n → ∞.


The statement for the pointwise convergence of a net (xj )j∈J to x in X is completely analogous:
It is equivalent to the fact that, for every p ∈ P , the net (p(xj − x))j∈J converges to 0 in C, i.e.,
for every ε > 0 there is some j0 ∈ J such that p(xj − x) < ε, if j ≥ j0 .
(Recall that (J, ≤) is a directed set, if ≤ is a reflexive and transitive relation on J such that for
any j1 , j2 ∈ J we can find j3 ∈ J with j1 ≤ j3 and j2 ≤ j3 .)
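The point-evaluation seminorms of this example are easy to experiment with. The following sketch represents elements of X as Python functions on [0, 1]; the factory `p` and the sample functions are assumptions made for illustration.

```python
def p(t):
    """Point-evaluation seminorm p_t(x) = |x(t)| on functions x: [0, 1] -> C."""
    return lambda x: abs(x(t))

x = lambda s: s * (1 - s)            # nonzero function vanishing at s = 0 and s = 1
y = lambda s: 1j * s

p0, p_half = p(0.0), p(0.5)
print(p0(x))                          # 0.0 although x is not the zero function
print(p_half(lambda s: x(s) + y(s)) <= p_half(x) + p_half(y))   # triangle inequality

# Pointwise convergence x_n -> 0 on [0, 1[ for x_n(s) = s**n, tested through the seminorms:
for t in (0.0, 0.5, 0.9):
    print(t, [p(t)(lambda s, n=n: s ** n) for n in (1, 10, 100)])
```

The first line of output shows why a single pt is only a seminorm, while the last loop shows convergence p(xn − 0) → 0 for each fixed t separately, with a rate that depends on t.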
We will show how to define vector space topologies with good properties from seminorms. The
very basic notion is as follows.
4.3. Definition: Let X be a vector space over the field K and τ be a topology on X. Then (X, τ)
is a topological vector space, if addition X × X → X, (x, y) ↦ x + y, and scalar multiplication
K × X → X, (λ, x) ↦ λx, are continuous maps with respect to the product topologies on X × X
and K × X.
The requirements for a topology τ on X to turn (X, τ) into a topological vector space are not
trivial, as the example of the discrete topology on X ≠ {0} illustrates: The scalar multiplication
then cannot be continuous K × X → X, since with any x ≠ 0 and λn = 1/n we have (λn, x) →
(0, x) and 0 · x = 0, but 0 ≠ λn · x does not converge to 0 in X with respect to the discrete
topology.
As a preparation for the study of the so-called locally convex vector space topologies we need the
following notions describing properties of subsets A ⊆ X. Recall that A is convex, if for every
x, y ∈ A and for every λ ∈ R, 0 ≤ λ ≤ 1 we have λx + (1 − λ)y ∈ A.

4.4. Definition: Let A ⊆ X, then A is called
(a) balanced, if for every x ∈ A and every λ ∈ K, |λ| ≤ 1, we have λx ∈ A,
(b) absolutely convex, if A is convex and balanced,
(c) absorbing, if for every x ∈ X there is a number cx > 0 such that for all λ ∈ K, 0 ≤ λ ≤ cx,
we have λx ∈ A. (Since x ∈ (1/λ)A, if 0 < λ ≤ cx, we may say: “Any point in X is swallowed
by appropriate dilations of the subset A.”) [Slightly different definition in [MV92, Tre67].]
4.5. Lemma: A subset A ⊆ X is absolutely convex, if and only if it satisfies

x, y ∈ A and λ, µ ∈ K with |λ| + |µ| ≤ 1 =⇒ λx + µy ∈ A.

Proof: Suppose that A is absolutely convex and let x, y, λ, µ be as specified above. If λ = 0 or
µ = 0, the implication holds, since A is balanced. Let λ ≠ 0 and µ ≠ 0, then both (λ/|λ|)x and
(µ/|µ|)y belong to A, since A is balanced. We obtain

\[ \lambda x + \mu y = (|\lambda| + |\mu|) \cdot \Bigl( \frac{|\lambda|}{|\lambda| + |\mu|} \cdot \frac{\lambda}{|\lambda|}\, x + \frac{|\mu|}{|\lambda| + |\mu|} \cdot \frac{\mu}{|\mu|}\, y \Bigr) \in A, \]

where |λ| + |µ| ≤ 1 and the expression in parentheses is a convex combination of (λ/|λ|)x and
(µ/|µ|)y; here we use that A is convex and balanced.

Suppose the implication in the statement of the lemma holds. Clearly, convex combinations are a
special case of its premise, thus A is convex. To see that A is also balanced, let x ∈ A and ν ∈ K
with |ν| ≤ 1. Then νx = νx + 0x ∈ A.
4.6. Neighborhood systems from seminorms: Let P be a set of seminorms on X. For any
finite subset F ⊆ P and any ε > 0 we define

UF,ε := {x ∈ X | ∀p ∈ F : p(x) ≤ ε}.

We collect all of these subsets in the family U := {UF,ε | ε > 0, F ⊆ P finite} and list its basic
properties:
(1) ∀U ∈ U we have 0 ∈ U .
This is clear, since p(0) = 0.
(2) ∀U1 , U2 ∈ U there exists U ∈ U such that U ⊆ U1 ∩ U2 .
Let Uj = UFj ,εj (j = 1, 2) and put F = F1 ∪ F2 , ε := min(ε1 , ε2 ), then U := UF,ε fulfills the
requirement, since UF,ε ⊆ UF1 ,ε1 ∩ UF2 ,ε2 .
(3) ∀U ∈ U ∃V ∈ U such that V + V ⊆ U .
If U = UF,ε we may put V := UF,ε/2 , since UF,ε/2 + UF,ε/2 ⊆ UF,ε .
(4) Every U ∈ U is absorbing.
Let U = UF,ε and x0 ∈ X. Put α_{x0} := max_{p∈F} p(x0). If α_{x0} = 0, then λx0 ∈ UF,ε for every
λ ∈ K, so any c_{x0} > 0 works. If α_{x0} > 0, then we put c_{x0} := ε/α_{x0} and obtain for every
p ∈ F and every λ ∈ K with 0 ≤ λ ≤ c_{x0} that p(λx0) = λ p(x0) ≤ c_{x0} α_{x0} = ε, i.e., λx0 ∈ UF,ε.

(5) ∀U ∈ U ∀λ ∈ K, λ > 0, there is some V ∈ U with λV ⊆ U .
Let U = UF,ε , λ > 0, and note that λ · UF,ε/λ = UF,ε , thus V := UF,ε/λ does the job.
(6) Every U ∈ U is balanced.
This is clear here from the seminorm properties.
(7) Every U ∈ U is absolutely convex.
This is also clear here from the seminorm properties.
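Properties (3) and (4) can be tried out on a concrete family of seminorms. The sketch below takes X = R² with the two coordinate seminorms |x₁| and |x₂| (an assumed example, not from the notes) and represents UF,ε by its membership test; the helper name `ball` is likewise an assumption.

```python
def ball(seminorms, eps):
    """Membership test for U_{F,eps} = {x : p(x) <= eps for all p in F}."""
    return lambda x: all(p(x) <= eps for p in seminorms)

# Two seminorms on X = R^2 (assumed example): absolute values of the coordinates.
F = [lambda x: abs(x[0]), lambda x: abs(x[1])]

U = ball(F, 1.0)
V = ball(F, 0.5)                      # candidate for property (3): V + V inside U

samples = [(0.5, -0.5), (0.25, 0.1), (-0.5, 0.5), (0.0, 0.3)]   # all lie in V
sums_in_U = all(U((v[0] + w[0], v[1] + w[1])) for v in samples for w in samples)
print(sums_in_U)                                      # True, as predicted by (3)

x0 = (3.0, -4.0)
c = 1.0 / max(p(x0) for p in F)                       # c_{x0} = eps / alpha_{x0} from (4), eps = 1
print(c, U((c * x0[0], c * x0[1])))                   # 0.25 True
```

Only finitely many sample points are tested, so this illustrates the properties rather than proving them; the proofs are the one-line seminorm estimates given in the list above.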

4.7. Proposition: If U is a non-empty system of subsets of X satisfying (1)-(6) as in 4.6, we


obtain a topology τ on X by specifying for any O ⊆ X that

O∈τ ⇐⇒ ∀x ∈ O ∃U ∈ U : x + U ⊆ O.

Then U is a basis of τ -neighborhoods at 0 and (X, τ ) is a topological vector space.


(Moreover, then clearly {x + U | U ∈ U} is a basis of neighborhoods at x, for every x ∈ X.)
Proof: We show that τ is a topology on X: Clearly, X ∈ τ and ∅ ∈ τ (the latter as a “formality”). If
O1, O2 ∈ τ and x ∈ O1 ∩ O2, then choose U1, U2 ∈ U such that x + Uj ⊆ Oj (j = 1, 2). By (2), we
find U ∈ U with x + U ⊆ x + (U1 ∩ U2) ⊆ O1 ∩ O2. Thus O1 ∩ O2 ∈ τ. Finally, if M is some set,
Oq ∈ τ for every q ∈ M, and x ∈ ⋃_{q∈M} Oq, then x ∈ O_{q0} for some q0 ∈ M. There exists some
U ∈ U such that x + U ⊆ O_{q0} ⊆ ⋃_{q∈M} Oq. Thus, ⋃_{q∈M} Oq is open.
Every U ∈ U is a τ -neighborhood of 0: Let O := {x ∈ U | ∃W ∈ U : x+W ⊆ U }. By construction,
0 ∈ O ⊆ U and it remains to show that O is open. Let x ∈ O and W ∈ U with x + W ⊆ U . By
(3), there is some V ∈ U such that V + V ⊆ W . We show that x + V ⊆ O, which implies then
that O is open. If y ∈ x + V , then y + V ⊆ x + V + V ⊆ x + W ⊆ U , hence y ∈ O.
U is a basis of neighborhoods at 0: If B ⊆ X is a τ -neighborhood of 0, then there is some O ∈ τ
with 0 ∈ O ⊆ B. By construction of τ , there is some U ∈ U such that U = 0 + U ⊆ O ⊆ B.
Addition a : X × X → X, a(x, y) := x + y, is continuous: Given O ∈ τ, we prove that a⁻¹(O) is
open in X × X. Let (x, y) ∈ a⁻¹(O), i.e., x + y ∈ O. By construction of τ, there is some U ∈ U
with (x + y) + U ⊆ O. Choose V ∈ U according to (3) with V + V ⊆ U. Then (x + V) × (y + V) is a
neighborhood of (x, y) and (x + V) × (y + V) ⊆ a⁻¹(O), since (x + V) + (y + V) = x + y + V + V ⊆
x + y + U ⊆ O.
Scalar multiplication m : K × X → X, m(λ, x) := λx, is continuous: Given O ∈ τ, we prove
that m⁻¹(O) is open in K × X. Let (λ, x) ∈ m⁻¹(O), i.e., λx ∈ O. By construction of τ, there
is some U ∈ U with λx + U ⊆ O. Our aim is to find some ε > 0 and W ∈ U such that
m(Dε(λ) × (x + W)) ⊆ O, where Dε(λ) = {µ ∈ K | |µ − λ| < ε}.
Choose V ∈ U with V + V ⊆ U (by property (3)) and then 1 ≥ ε > 0 such that εx ∈ V (by
property (4)). Since V is balanced, by (6), we have (µ − λ)x ∈ V for every µ ∈ Dε (λ). Finally,
we employ (5) to choose W ∈ U such that (|λ| + ε)W ⊆ V ; by (6), we further have µW ⊆ V , if
|µ| ≤ |λ| + ε. We obtain, for every µ ∈ Dε (λ) and for every w ∈ W ,

µ(x + w) = λx + (µ − λ)x + µw ∈ λx + V + V ⊆ λx + U ⊆ O.

4.8. Remark: It is not difficult to see that every topological vector space possesses a basis of
neighborhoods at 0 satisfying properties (1)-(6). Clearly, (1) and (2) hold for any neighborhood
basis at 0. Properties (3)-(5) hold for the system of all neighborhoods at 0: Property (3) follows
from continuity of addition at (0, 0). Property (4) follows from continuity of λ ↦ λx at 0 ∈ K
for any x ∈ X, and (5) from continuity of the map x ↦ λx at 0 ∈ X for any λ. Finally, every
neighborhood U of 0 contains a balanced 0-neighborhood W: By continuity of scalar multiplication
at (0, 0) ∈ K × X, we can find ε > 0 and a neighborhood V of 0 such that λV ⊆ U, if |λ| ≤ ε;
setting W := {λv | |λ| ≤ ε, v ∈ V} = ⋃_{|λ|≤ε} λV we clearly obtain a balanced subset, which is also
a 0-neighborhood, since each λV with λ ≠ 0 is one (as the image of V under the homeomorphism
x ↦ λx). In conclusion, the set of all balanced neighborhoods at 0 satisfies (1)-(6).

We learn from 4.6 and 4.7 that a family of seminorms generates a topology for a topological vector
space. In this topology, the basic neighborhoods UF,ε have the additional property (7) of being
absolutely convex, which has not been used in Proposition 4.7. As it turns out, this additional
property is crucial, for example, to guarantee a rich theory of linear functionals. Moreover, most
of the topological vector spaces arising in applications are built from seminorms and it can be
shown that topologically this is equivalent to the existence of a basis of neighborhoods at 0
consisting of (absolutely) convex subsets (see, e.g., [MV92, Lemmata 22.2 and 22.4] or [Tre67,
Proposition 7.6], and [Wer18, Satz VIII.1.5] for a sketch of the proof). The key ingredient in
constructing seminorms from absolutely convex neighborhoods of 0 is the Minkowski functional,
pU (x) := inf{λ > 0 | x ∈ λU } (see also a corresponding lemma in 5.10).
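To get a feeling for the Minkowski functional, one can compute it numerically for a concrete absolutely convex set. The sketch below uses bisection on the membership test of U; the set U = {x ∈ R² : |x₁| + |x₂| ≤ 1}, for which pU is the ℓ¹-norm, as well as the function name `minkowski` and its parameters are assumptions chosen for illustration.

```python
def minkowski(in_U, x, hi=1e6, tol=1e-10):
    """Approximate p_U(x) = inf{lam > 0 : x in lam*U} by bisection.

    Assumes that U is absolutely convex and absorbing, so that x in lam*U holds
    for every lam above the infimum, and that p_U(x) < hi.
    """
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if in_U(tuple(c / mid for c in x)):   # x in mid*U  <=>  x/mid in U
            hi = mid
        else:
            lo = mid
    return hi

# Assumed example: U = {x in R^2 : |x_1| + |x_2| <= 1}; its Minkowski functional is the l^1 norm.
in_diamond = lambda x: abs(x[0]) + abs(x[1]) <= 1.0

print(minkowski(in_diamond, (3.0, -4.0)))    # approximately 7
print(minkowski(in_diamond, (0.0, 0.0)))     # approximately 0
```

The bisection relies on the monotonicity x ∈ λU ⇒ x ∈ λ′U for λ′ ≥ λ, which is where balancedness of U enters.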
4.9. Definition: A topological vector space is said to be locally convex, if it possesses a basis of
neighborhoods at 0 consisting of convex sets.
From the discussion and the references given above, we have the following statement.
4.10. Proposition: A topological vector space (X, τ ) is locally convex, if and only if τ is defined
by a family of seminorms as in 4.6 and 4.7.
In particular, every normed vector space is a locally convex space, which happens to be metrizable
as well and thus also a Hausdorff space. In general, locally convex or topological vector spaces need
not be Hausdorff as the example of the chaotic topology (corresponding to the single seminorm
x 7→ 0) illustrates. However, there are the following useful criteria. (We note that the equivalence
(i) ⇔ (iii) holds in general topological vector spaces).
4.11. Lemma: Let P be a set of seminorms on X generating the locally convex vector space
topology τ . Then the following are equivalent:
(i) (X, τ ) is a Hausdorff space,
(ii) for every x ∈ X, x ≠ 0, there is a p ∈ P such that p(x) ≠ 0,
(iii) there is a neighborhood basis U of 0 such that ⋂_{U∈U} U = {0}.

Proof: (i) ⇒ (ii): Let x ∈ X, x ≠ 0. The neighborhoods at x are of the form x + U, where U
is a neighborhood of 0. By the Hausdorff property, we can separate x and 0, hence we find two
neighborhoods U and V of 0 such that (x + U) ∩ V = ∅. We may suppose that V = UF,ε = {y ∈
X | p(y) ≤ ε, p ∈ F} with a finite subset F ⊆ P and ε > 0. Since x ∉ V, there is some p ∈ F
such that p(x) > ε > 0.

(ii) ⇒ (iii): We only have to note that p(x) = 0 for every p ∈ P , if and only if x ∈ UF,ε for every
finite subset F ⊆ P and every ε > 0.

(iii) ⇒ (i): Let x, y ∈ X, x ≠ y, hence x − y ≠ 0 and there has to be some U ∈ U such that
x − y ∉ U. The map (x, y) ↦ x − y is continuous (being the composition of addition with scalar
multiplication by −1 in the second component), thus, there are neighborhoods V and W of 0 such
that W − V ⊆ U. Therefore x − y ∉ W − V, which means (x + V) ∩ (y + W) = ∅, i.e., we have
found disjoint neighborhoods of x and y.

4.12. Examples: 1) Topology of pointwise convergence: Let T be a set and X be a subspace


of the K-vector space of functions T → K, e.g., T = [0, 1] and X = C([0, 1]). For every t ∈ T ,
consider the seminorm pt (x) = |x(t)| (x ∈ X). The family P = {pt | t ∈ T } defines a locally
convex Hausdorff topology on X.

2) Topology of uniform convergence on compact sets: Let T be a topological space and X be a
subspace of C(T, K) = C(T). For every compact subset K ⊆ T, we put pK(x) := sup_{t∈K} |x(t)|
(x ∈ X). The family of seminorms P = {pK | K ⊆ T compact} generates a locally convex
Hausdorff topology on X.

3) Let Ω ⊆ ℝ^n be open and ℰ(Ω) be the vector space C^∞(Ω) equipped with the following
seminorms: For m ∈ ℕ_0 and K ⊂ Ω compact, define

p_{K,m}(x) := max_{α ∈ ℕ_0^n, |α| ≤ m} sup_{t∈K} |∂^α x(t)| (x ∈ ℰ(Ω)).

The family of seminorms P = {p_{K,m} | m ∈ ℕ_0, K ⊂ Ω compact} generates a locally convex
Hausdorff topology on ℰ(Ω).

4) The so-called Schwartz space 𝒮(ℝ^n) is the set of functions ϕ ∈ C^∞(ℝ^n) such that every
derivative ∂^β ϕ, β ∈ ℕ_0^n, is rapidly decreasing, i.e., for every α ∈ ℕ_0^n, the function x ↦ x^α ∂^β ϕ(x)
is bounded on ℝ^n. We define seminorms on 𝒮(ℝ^n) for any m ∈ ℕ_0 and α ∈ ℕ_0^n by

p_{α,m}(ϕ) := sup_{x∈ℝ^n} (1 + |x|^m) |∂^α ϕ(x)| (ϕ ∈ 𝒮(ℝ^n)),

or occasionally also p_{l,m}(ϕ) := max_{|α|≤l} p_{α,m}(ϕ) (l, m ∈ ℕ_0). The corresponding families of
seminorms define a locally convex Hausdorff topology on 𝒮(ℝ^n).

5) Let Ω ⊆ ℝ^n be open, K ⊂ Ω compact, and 𝒟_K(Ω) denote the set of all functions ϕ ∈ C^∞(Ω)
with supp(ϕ) ⊆ K.
(Recall that supp(ϕ) is the closure, in the trace topology of Ω, of the set {x ∈ Ω | ϕ(x) ≠ 0}.
Equivalently, it is the complement, within Ω, of the largest open set on which ϕ vanishes. I suppose
that the existence of nonzero functions ϕ ∈ C^∞(Ω) with compact support has been shown in your
second or third semester Analysis course; if not, a reference is [Wer18, Beispiel zwischen Definition
V.1.9 und Lemma V.1.10]. Clearly, if K° = ∅, then 𝒟_K(Ω) = {0}.)
We equip 𝒟_K(Ω) with the structure of a locally convex Hausdorff space generated by the family
of seminorms

p_m(ϕ) := max_{|α|≤m} sup_{x∈Ω} |∂^α ϕ(x)| (m ∈ ℕ_0, ϕ ∈ 𝒟_K(Ω)).

6) Let Ω ⊆ ℝ^n be open. The space of test functions 𝒟(Ω) is the set of all functions ϕ ∈ C^∞(Ω)
with compact support, i.e.,

𝒟(Ω) = ⋃_{K⊂Ω, K compact} 𝒟_K(Ω).

We could consider seminorms p_m (m ∈ ℕ_0) as above also on 𝒟(Ω); the only difference is that
the functions now do not have their support inside some fixed compact set K. But it turns out
that the corresponding topology is not appropriate to reflect the structure of 𝒟(Ω) as a union
of the spaces 𝒟_K(Ω) with their topology τ_K defined in the previous example. Such a structural
feature is desirable to obtain a function space with good localization properties and convenient
approximation procedures concentrated on bounded regions. Therefore, we consider the set
P of seminorms p on 𝒟(Ω) such that the restriction p|_{𝒟_K(Ω)} is continuous with respect to τ_K for
every compact set K ⊂ Ω. The locally convex topology on 𝒟(Ω) generated by P is a special
case of the so-called inductive limit topologies from the general theory of topological vector spaces.
It is also Hausdorff, which we will show later in the section on distribution theory.

7) If (X, ‖.‖) is a normed space, then the locally convex topology generated by P = {‖.‖} is the
norm topology on X.

8) Let X be a normed space with dual space X′ (i.e., the space of all continuous linear functionals
X → K). Then the family of seminorms P = {p_{x′} | x′ ∈ X′}, where

p_{x′}(x) := |x′(x)| (x ∈ X),

generates the weak topology σ(X, X′) on X, which is locally convex and Hausdorff (due to the
theorem of Hahn-Banach).

9) Let X′ be the dual space of the normed space X and consider for every x ∈ X the following
seminorm p_x on X′:
p_x(x′) := |x′(x)| (x′ ∈ X′).
The set of seminorms P = {p_x | x ∈ X} generates the weak* topology σ(X′, X) on X′. It coincides
with the topology of pointwise convergence for the subspace of continuous linear functions X → K.
In general, σ(X′, X) is different from the weak topology σ(X′, X″).

10) Let X and Y be normed spaces. Apart from the (operator) norm topology on L(X, Y), the
following two locally convex topologies are also of interest in operator theory: The strong operator
topology is generated by the seminorms

p_x(T) := ‖Tx‖ (T ∈ L(X, Y), x ∈ X),

and the weak operator topology is defined by the seminorms

p_{x,y′}(T) := |y′(Tx)| (T ∈ L(X, Y), x ∈ X, y′ ∈ Y′).

Both are Hausdorff topologies, which in case of the weak operator topology follows by appealing to
the Hahn-Banach theorem. Recall that we have used the notions of (sequential) convergence with
respect to these topologies already implicitly while discussing the measurable functional calculus
of self-adjoint operators on Hilbert spaces (with X = Y = H and y′(x) corresponding to an inner
product).

11) The notion of weak topology in probability theory: Let Ω be a metric space and M(Ω) be
the space of regular signed or complex Borel measures on Ω. For every f in the space of bounded
continuous functions C_b(Ω) we define the seminorm

p_f(µ) := |∫ f dµ|

on M(Ω). The family of seminorms P = {p_f | f ∈ C_b(Ω)} generates a locally convex topology on
M(Ω), which in case of compact Ω corresponds to the weak* topology due to the Riesz representation
theorem (since we then have C_b(Ω) = C(Ω) and C(Ω)′ ≅ M(Ω)). This “probabilistic weak
topology” always has the Hausdorff property; the proof is based on the regularity of the measures
(cf. [Els11, Kapitel VIII, §1, Satz 4.6]).

5. Continuous linear maps and functionals
Some of the locally convex spaces of functions have been introduced essentially in order to provide
a setting for linear functionals on these spaces, which are considered as generalized functions or
distributions. It is the property of local convexity that guarantees the existence of continuous
linear functionals via the Hahn-Banach theorem. Moreover, we have seen in the definition of the
test function space 𝒟(Ω) that seminorms with continuous restrictions to any 𝒟_K(Ω) are used to
define the locally convex topology on 𝒟(Ω).

5.1. Lemma: Let (X, τ) be a locally convex vector space and P be a family of seminorms
generating the topology.

(a) If q : X → [0, ∞[ is a seminorm, then the following are equivalent:

(i) q is continuous,

(ii) q is continuous at 0,

(iii) the set Uq,1 := {x ∈ X | q(x) ≤ 1} is a neighborhood of 0.

(b) Every p ∈ P is continuous.

(c) A seminorm q on X is continuous, if and only if there is a finite subset F ⊆ P and a real
number M > 0 such that
∀x ∈ X : q(x) ≤ M max_{p∈F} p(x).

Proof: (a): The implications (i) ⇒ (ii) ⇒ (iii) are immediate. Suppose that (iii) holds. We
show that then (i) follows, which completes the proof of (a): Let x ∈ X and ε > 0; we set
U := ε · U_{q,1} = {y ∈ X | q(y) ≤ ε} and note that x + U is a neighborhood of x; if y ∈ U, then¹

|q(x + y) − q(x)| ≤ q((x + y) − x) = q(y) ≤ ε,

thus, q(x + U ) ⊆ [q(x) − ε, q(x) + ε], proving continuity of q at x.

(b): This follows from (a),(iii), since by definition of τ , the sets {x ∈ X | p(x) ≤ 1} with p ∈ P
are neighborhoods of 0.
¹ Here, we simply use the “reverse triangle inequality”: q(w) = q((w − z) + z) ≤ q(w − z) + q(z) and
q(z) = q((z − w) + w) ≤ q(z − w) + q(w) = q(w − z) + q(w) implies |q(w) − q(z)| ≤ q(w − z).

(c): By definition of τ and (a), (iii), the continuity of q is equivalent to the existence of a finite
set F ⊆ P and ε > 0 such that U_{F,ε} ⊆ U_{q,1}. Recalling U_{F,ε} = {x ∈ X | ∀p ∈ F : p(x) ≤ ε}, the
inclusion relation U_{F,ε} ⊆ U_{q,1} is equivalent to

(∗) ∀x ∈ X : (1/ε) max_{p∈F} p(x) ≤ 1 ⇒ q(x) ≤ 1.

If the inequality stated in (c) holds, then (∗) follows upon putting ε := 1/M. Conversely, suppose
(∗) holds and let x ∈ X. If we had q(x) > max_{p∈F} p(x)/ε, then we could choose µ > 0 with
q(x) > µ > max_{p∈F} p(x)/ε and obtain a contradiction to (∗) for y := x/µ, since q(y) = q(x)/µ > 1
and max_{p∈F} p(y)/ε = max_{p∈F} p(x)/(εµ) < 1. Thus, the inequality stated in (c) holds with
M := 1/ε.
5.2. Corollary: Let (X, τ) be a locally convex vector space and P be a family of seminorms
generating the topology. If Q ⊇ P is a family of τ -continuous (!) seminorms on X, then Q also
generates the topology τ .
Proof: Clearly, the topology generated by Q ⊇ P is finer than τ , but by the above lemma and
the continuity of each q ∈ Q it is also coarser than τ .
We can now provide simple criteria for the continuity of linear maps between locally convex vector
spaces.
5.3. Theorem: Let (X, τP ) and (Y, τQ ) be two locally convex vector spaces, where τP is generated
by the family P of seminorms on X and τQ by the family of seminorms Q on Y . Let T : X → Y
be a linear map, then the following are equivalent:
(i) T is continuous,
(ii) T is continuous at 0,
(iii) if q is a continuous seminorm on Y , then q ◦ T is a continuous seminorm on X,
(iv) for every q ∈ Q there is a finite subset F ⊆ P and a real number M > 0 such that

∀x ∈ X : q(Tx) ≤ M max_{p∈F} p(x).

Proof: (i) ⇒ (ii) is clear and (ii) ⇒ (i) is easy: Let x ∈ X; a neighborhood of T x is of the form
T x + V , where V is a τQ -neighborhood of 0 in Y ; choose a τP -neighborhood U of 0 in X such
that T (U ) ⊆ V , then clearly T (x + U ) = T x + T (U ) ⊆ T x + V .
(ii) ⇒ (iii): The composition q ◦ T is continuous at 0 and clearly defines a seminorm on X. Hence,
by (a) in the lemma above, q ◦ T is a continuous seminorm.
(iii) ⇒ (iv): This follows from (b), applied to q ∈ Q, and (c), applied to q ◦ T , in the lemma shown
previously.
(iv) ⇒ (ii): Let V be a τ_Q-neighborhood of 0 in Y. We may suppose that V is given by finitely
many seminorms q_1, …, q_n ∈ Q and ε > 0 in the form V = {y ∈ Y | q_j(y) ≤ ε (j = 1, …, n)}.
For every j = 1, …, n, choose F_j ⊆ P and M_j > 0 according to (iv) with q_j in place of q. We
put F := ⋃_{j=1}^n F_j, M := max_{j=1}^n M_j, and consider the τ_P-neighborhood U := U_{F,ε/M} of 0 in X.
If x ∈ U and j ∈ {1, …, n}, then

q_j(Tx) ≤ M_j max_{p∈F_j} p(x) ≤ M max_{p∈F} p(x) ≤ M · (ε/M) = ε,

hence T x ∈ V . Therefore, we have shown that T (U ) ⊆ V , i.e., T is continuous at 0.

The most important special case Y = K (with the Euclidean topology, generated from the single
norm |.|) deserves an explicit statement.

5.4. Corollary: Let X be a locally convex vector space and P be a family of seminorms generating
the topology. A linear functional l on X is continuous, if and only if there are finitely many
seminorms p_1, …, p_m ∈ P and a constant M > 0 such that

∀x ∈ X : |l(x)| ≤ M max_{j=1,…,m} p_j(x).
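
To make the criterion of Corollary 5.4 concrete, here is a small numerical sketch for X = C([0, 1]) with the topology of pointwise convergence (Example 4.12.1). The functional l(x) = x(0) + 2x(1), the sample functions, and all helper names are assumptions chosen only for this illustration. The second half indicates why integration over [0, 1] cannot satisfy such an estimate: for any finite set F of points there is a continuous function vanishing on F whose integral is positive.

```python
import math

def p(t):
    """Seminorm p_t(x) = |x(t)| of the topology of pointwise convergence."""
    return lambda x: abs(x(t))

def l(x):
    """Hypothetical functional l(x) = x(0) + 2 x(1) on C([0, 1])."""
    return x(0.0) + 2.0 * x(1.0)

def bound_holds(x, M=3.0, points=(0.0, 1.0)):
    """Check |l(x)| <= M * max_{t in points} p_t(x) for one function x."""
    return abs(l(x)) <= M * max(p(t)(x) for t in points) + 1e-12

def integral(x, n=10_000):
    """Midpoint-rule approximation of the integral of x over [0, 1]."""
    h = 1.0 / n
    return h * sum(x((k + 0.5) * h) for k in range(n))

def vanishing_on(points):
    """A continuous function that is zero exactly at the given points."""
    def x(t):
        out = 1.0
        for s in points:
            out *= (t - s) ** 2
        return out
    return x

# The estimate of Corollary 5.4 with F = {p_0, p_1} and M = 3.
samples = [math.sin, math.exp, lambda t: t * t - 0.5]
print(all(bound_holds(x) for x in samples))

# For integration, every finite F admits a function with max p_t(x) = 0
# on F but positive integral, so no constant M can work for this F.
F = (0.25, 0.5, 0.75)
x = vanishing_on(F)
print(max(p(t)(x) for t in F), integral(x) > 0.0)
```

The first check only samples a few functions, so it illustrates the inequality rather than proving it; the proof is the one-line estimate |x(0) + 2x(1)| ≤ 3 max(|x(0)|, |x(1)|).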

5.5. Definition: (i) Let (X, τ) be a locally convex vector space. Then its dual space X′, or (X_τ)′,
is the vector space of continuous linear functionals on X.
(ii) Let (X, τ_1) and (Y, τ_2) be locally convex vector spaces, then L(X, Y), or L(X_{τ_1}, Y_{τ_2}), denotes
the vector space of continuous linear maps X → Y.

Note that we have X′ = L(X, K) and the fact that L(X, Y) is a vector space follows from Theorem
5.3.

Recall that a topology τ_1 on a set X is finer than τ_2, if τ_2 ⊆ τ_1, or, equivalently, the identity map
on X is continuous as a map id : (X, τ_1) → (X, τ_2). If τ_1 and τ_2 are locally convex topologies on
the vector space X, then the latter is equivalent to id ∈ L(X_{τ_1}, X_{τ_2}). The topologies are equal, if
both id ∈ L(X_{τ_1}, X_{τ_2}) and id ∈ L(X_{τ_2}, X_{τ_1}) hold.

5.6. Examples: 1) Let (X, ‖.‖) be a normed space and τ denote the topology induced by the
norm, then τ is finer than the weak topology σ(X, X′), since for any x′ ∈ X′ we have

∀x ∈ X : p_{x′}(x) = |x′(x)| ≤ ‖x′‖ ‖x‖,

which shows that id ∈ L(X_τ, X_{σ(X,X′)}).

2) Let X = C(ℝ^n) and denote by τ_1 the topology of uniform convergence on compact sets, by τ_2
the topology of pointwise convergence. Then τ_1 is finer than τ_2, since for any t ∈ ℝ^n we may take
K = {t} as compact set and obtain

∀x ∈ C(ℝ^n) : p_t(x) = |x(t)| = sup_{s∈K} |x(s)| = p_K(x).

We briefly study convergence of nets in locally convex vector spaces. Recall that in general,
sequences are not sufficiently powerful to characterize continuity of maps or closures of subsets (cf.
Example 5.8 below) in topological spaces that are non-metrizable. (More precisely, the “barrier”
is first countability, the AA1 property: in first countable spaces, sequences do suffice.)

5.7. Proposition: Let the locally convex topology τ on X be generated by the family of seminorms
P. Then a net (x_j)_{j∈I} converges to x in X with respect to τ, if and only if lim p(x_j − x) = 0
for every p ∈ P.

Proof: The net (xj ) converges to x, if and only if the net (xj − x) converges to 0. Thus, we may
suppose that x = 0.

If (xj ) converges to 0, then lim p(xj ) = 0 for every p ∈ P due to continuity of p (Lemma 5.1(b)).

Now suppose that lim p(x_j) = 0 for every p ∈ P and let U = U_{F,ε} be a typical neighborhood of 0
with F ⊆ P finite and ε > 0. In particular, lim p(x_j) = 0 for every p ∈ F. For each p ∈ F, choose
j_p ∈ I such that p(x_j) ≤ ε for every j ≥ j_p. Since (I, ≤) is a directed set (and F is finite), there is
some j′ ∈ I such that j′ ≥ j_p for every p ∈ F. We obtain p(x_j) ≤ ε for every p ∈ F and every j ≥ j′,
hence x_j ∈ U if j ≥ j′. Thus, we have shown x_j → 0 in (X, τ).

5.8. Example: Let X = ℓ², equipped with its weak topology σ := σ(ℓ², ℓ²), and consider the
subset
A := {e_m + m e_n | 1 ≤ m < n} ⊆ ℓ²,
where e_k ∈ ℓ² denotes the vector with k-th component 1 and all other components 0.

Claim 1: 0 ∈ Ā (weak closure).

Let U be a typical σ-neighborhood of 0, i.e., U = {x ∈ ℓ² | |⟨x, y_j⟩| ≤ ε (j = 1, …, r)} with ε > 0
and y_1, …, y_r ∈ ℓ². Choose m such that |y_j(m)| ≤ ε/2 for j = 1, …, r (this is possible, since
y_j(k) → 0 as k → ∞). Then choose n > m such that |y_j(n)| ≤ ε/(2m) for j = 1, …, r. We obtain

|⟨e_m + m e_n, y_j⟩| ≤ |y_j(m)| + m |y_j(n)| ≤ ε/2 + m · ε/(2m) = ε (j = 1, …, r),

i.e., e_m + m e_n ∈ A ∩ U. We have therefore shown that V ∩ A ≠ ∅ for every neighborhood V of 0.

Claim 2: There is no sequence in A that converges weakly to 0.

Suppose (e_{m_k} + m_k e_{n_k})_{k∈ℕ} is a sequence in A that converges weakly to 0. By Proposition 5.7,
we conclude that

∀y ∈ ℓ² : lim_{k→∞} |y(m_k) + m_k y(n_k)| = lim_{k→∞} |⟨e_{m_k} + m_k e_{n_k}, y⟩| = 0.

If (m_k)_{k∈ℕ} is bounded, there is some q ∈ ℕ such that m_k = q infinitely often. Choosing y = e_q
and recalling n_k > m_k then produces the contradiction that y(m_k) + m_k y(n_k) = 1 infinitely often
while converging to 0 as k → ∞.
If (m_k)_{k∈ℕ} is unbounded, we may suppose (passing to a subsequence) that both (m_k) and (n_k)
are strictly increasing. Let y ∈ ℓ² be given by y(j) := 1/k, if j = n_k, and y(j) := 0 otherwise. Then
y(m_k) + m_k y(n_k) ≥ m_k/k ≥ 1 for every k ∈ ℕ, which also contradicts y(m_k) + m_k y(n_k) → 0.

Remark: We can obtain a net in A converging weakly to 0 as follows. The set of all typical
neighborhoods U_{F,ε} of 0 becomes a directed set, if we introduce the relation U_{F_1,ε_1} ≤ U_{F_2,ε_2}
meaning U_{F_1,ε_1} ⊇ U_{F_2,ε_2}. To the index set element U_{F,ε} we assign the element a_{U_{F,ε}} := e_m + m e_n
constructed as in the proof of Claim 1 for F = {y_1, …, y_r} and ε > 0.
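
The two-step choice of m and n in the proof of Claim 1 can be sketched as a small search. This is only an illustration under assumptions: each y is modelled as a Python function k ↦ y(k) for a real sequence in ℓ², the two sample sequences are hypothetical, and the search is cut off at an arbitrary limit (if no index is found below the limit, `next` raises `StopIteration`).

```python
def find_element(ys, eps, limit=10**6):
    """Return (m, n) with m < n as in the proof of Claim 1.

    ys  : list of functions k -> y(k), k >= 1, modelling sequences in l^2
    eps : radius of the neighborhood U = {x : |<x, y>| <= eps for y in ys}
    """
    # Step 1: m with |y(m)| <= eps/2 for all y.
    m = next(k for k in range(1, limit)
             if all(abs(y(k)) <= eps / 2 for y in ys))
    # Step 2: n > m with |y(n)| <= eps/(2m) for all y.
    n = next(k for k in range(m + 1, limit)
             if all(abs(y(k)) <= eps / (2 * m) for y in ys))
    return m, n

def pairing(m, n, y):
    """<e_m + m e_n, y> = y(m) + m * y(n)."""
    return y(m) + m * y(n)

# Two hypothetical elements of l^2: (1/k) and ((-1)^k k^(-3/4)).
ys = [lambda k: 1.0 / k, lambda k: (-1.0) ** k / k ** 0.75]
eps = 0.1
m, n = find_element(ys, eps)
print(m, n, [abs(pairing(m, n, y)) for y in ys])
```

Note that n has to be chosen after m, because the required bound ε/(2m) depends on m; this dependence is what prevents a single sequence in A from working for all y simultaneously.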

5.9. Examples: 1) The dual space of the Schwartz space 𝒮(ℝ^n), Example 4.12.4), is denoted by
𝒮′(ℝ^n) and called the space of temperate (or tempered) distributions. The locally convex topology
on 𝒮(ℝ^n) is generated by the seminorms p_{α,m}(ϕ) := sup_{x∈ℝ^n} (1 + |x|^m) |∂^α ϕ(x)| (m ∈ ℕ_0, α ∈ ℕ_0^n).
We show that every f ∈ L^p(ℝ^n) (1 ≤ p ≤ ∞) provides an element in 𝒮′(ℝ^n), in the sense that
the map T_f : 𝒮(ℝ^n) → ℂ,

T_f(ϕ) := ∫_{ℝ^n} f(x) ϕ(x) dx (ϕ ∈ 𝒮(ℝ^n)),

defines a continuous linear functional. Recall Hölder’s inequality, which states that for any g ∈
L^q(ℝ^n), 1/p + 1/q = 1 with the convention 1/∞ = 0, we have ‖fg‖_{L^1} ≤ ‖f‖_{L^p} ‖g‖_{L^q}. Since 𝒮(ℝ^n) ⊆
L^q(ℝ^n) (1 ≤ q ≤ ∞), we obtain for any ϕ ∈ 𝒮(ℝ^n),

|T_f(ϕ)| ≤ ∫_{ℝ^n} |f(x) ϕ(x)| dx = ‖f ϕ‖_{L^1} ≤ ‖f‖_{L^p} ‖ϕ‖_{L^q}.

If p = 1, then q = ∞ and we have 2‖ϕ‖_{L^∞} = p_{0,0}(ϕ), thus, |T_f(ϕ)| ≤ ‖f‖_{L^1} p_{0,0}(ϕ) proves the
continuity of T_f.
If p > 1, then 1 ≤ q < ∞ and

‖ϕ‖_{L^q} = (∫_{ℝ^n} |ϕ(x)|^q dx)^{1/q} = (∫_{ℝ^n} (1 + |x|^{n+1})^{−q} · |(1 + |x|^{n+1}) ϕ(x)|^q dx)^{1/q}
≤ ‖1/(1 + |.|^{n+1})‖_{L^q} · p_{0,n+1}(ϕ) =: M_q · p_{0,n+1}(ϕ),

where we have used |(1 + |x|^{n+1}) ϕ(x)|^q ≤ p_{0,n+1}(ϕ)^q under the integral. This implies
|T_f(ϕ)| ≤ ‖f‖_{L^p} M_q p_{0,n+1}(ϕ) and proves the continuity of T_f.
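
As a plausibility check of the estimate ‖ϕ‖_{L^q} ≤ M_q · p_{0,n+1}(ϕ), the following sketch evaluates both sides numerically in the special case n = 1 and q = 2 for the Gaussian ϕ(x) = exp(−x²). The choice of ϕ, the truncation of ℝ to [−50, 50], the grid size, and all names are assumptions made only for this illustration; the quadrature is a plain midpoint rule.

```python
import math

def midpoint(g, a, b, n):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def phi(x):
    """Sample Schwartz function (an assumption for this sketch)."""
    return math.exp(-x * x)

def weight(x):
    """1 + |x|^(n+1) with n = 1."""
    return 1.0 + x * x

L, N = 50.0, 200_000                      # truncation and grid size (assumed)
grid = [-L + (k + 0.5) * (2 * L / N) for k in range(N)]

# ||phi||_{L^2}, the constant M_2 = ||1/(1+|.|^2)||_{L^2}, and p_{0,2}(phi).
norm_phi = math.sqrt(midpoint(lambda x: phi(x) ** 2, -L, L, N))
M_q = math.sqrt(midpoint(lambda x: weight(x) ** -2, -L, L, N))
p_0_2 = max(weight(x) * abs(phi(x)) for x in grid)

print(norm_phi, M_q * p_0_2, norm_phi <= M_q * p_0_2)
```

In this special case the exact values are known in closed form (the integral of 1/(1 + x²)² over ℝ equals π/2, and the integral of exp(−2x²) equals the square root of π/2), which gives an independent check of the quadrature.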
2) Let X be a normed space, τ denote the topology induced by the norm on X, and σ := σ(X, X′).
As noted above, τ is finer than σ. Thus, if µ : X → K is a weakly continuous linear functional,
then µ is also norm continuous, i.e., we have (X_σ)′ ⊆ (X_τ)′ = X′.
If x′ ∈ X′, then |x′(x)| = p_{x′}(x) for every x ∈ X, therefore x′ is also weakly continuous by
Corollary 5.4; hence, also X′ = (X_τ)′ ⊆ (X_σ)′ holds. To summarize, we have shown that

(X_{σ(X,X′)})′ = X′.

3) Let X and Y be normed spaces and consider the strong operator topology on L(X, Y), described
in Example 4.12.10) and generated by the seminorms p_x(T) := ‖Tx‖ (T ∈ L(X, Y), x ∈ X). We
denote this locally convex space by L_st(X, Y) and claim that a linear functional Φ on L(X, Y)
belongs to L_st(X, Y)′, if and only if Φ is of the following form: There are n ∈ ℕ, x_1, …, x_n ∈ X,
and µ_1, …, µ_n ∈ Y′ such that

Φ(T) = ∑_{j=1}^n µ_j(T x_j) (T ∈ L(X, Y)).

Clearly, any Φ of the above form is continuous with respect to the strong operator topology by
Corollary 5.4, since |Φ(T)| ≤ n max_{j=1}^n ‖µ_j‖ ‖T x_j‖ ≤ (n max_{j=1}^n ‖µ_j‖) · max_{l=1}^n p_{x_l}(T).

Suppose Φ ∈ L_st(X, Y)′, then Corollary 5.4 implies that there are n ∈ ℕ, M > 0, and x_1, …, x_n ∈
X such that

(∗) |Φ(T)| ≤ M max_{1≤j≤n} p_{x_j}(T) = M max_{1≤j≤n} ‖T x_j‖.

Let ℓ_n^∞(Y) denote the K-vector space Y^n equipped with the norm ‖(y_1, …, y_n)‖_∞ := max_{j=1}^n ‖y_j‖.
We claim that any norm continuous linear functional µ ∈ ℓ_n^∞(Y)′ is given in the form

(∗∗) µ((y_1, …, y_n)) = ∑_{j=1}^n µ_j(y_j)

with certain µ_1, …, µ_n ∈ Y′: In fact, define µ_j(z) := µ((0, …, z, 0, …)) (here, z is in the
jth component; j = 1, …, n), then µ_j : Y → K is linear and continuous, since |µ_j(z)| ≤
‖µ‖ ‖(0, …, z, …, 0)‖_∞ = ‖µ‖ ‖z‖; finally, (y_1, …, y_n) = (y_1, 0, …, 0) + … + (0, …, 0, y_n) yields
µ((y_1, …, y_n)) = ∑_{j=1}^n µ_j(y_j).
Consider the subspace V := {(T x_1, …, T x_n) ∈ Y^n | T ∈ L(X, Y)} of ℓ_n^∞(Y). Note that by (∗) we
have Φ(T) = 0, if (T x_1, …, T x_n) = (0, …, 0), which shows that the linear functional ν : V → K,
ν((T x_1, …, T x_n)) := Φ(T) is well-defined. Moreover, (∗) also implies that |ν((T x_1, …, T x_n))| ≤
M ‖(T x_1, …, T x_n)‖_∞, i.e., ν is a norm continuous linear functional on the subspace V ⊆ ℓ_n^∞(Y).
By the Hahn-Banach theorem (for normed spaces), there is an extension µ ∈ ℓ_n^∞(Y)′ of ν. Let µ be
given by µ_1, …, µ_n ∈ Y′ according to (∗∗). Then we obtain, putting (y_1, …, y_n) = (T x_1, …, T x_n)
with T ∈ L(X, Y),

Φ(T) = ν((T x_1, …, T x_n)) = µ((T x_1, …, T x_n)) = ∑_{j=1}^n µ_j(T x_j).

5.10. Hahn-Banach theorem(s) for locally convex vector spaces: Here, I suppose that
you have already seen at least a proof of the extension theorem for normed spaces. Most likely, it
was based on a so-called linear algebraic version as lemma, involving the crucial “lemma of Zorn
argument” and an upper bound of the linear functional in terms of a seminorm or a convex function
or a sublinear function (e.g., as in [Con10, Chapter III, Section 6] or [Tes14, Theorem 4.11] or
[Wer18, Sätze III.1.2, III.1.4 und Lemma III.1.3]; note that a seminorm is a convex function and
also an example of a sublinear function). Therefore, we “recall” the following statement without
repeating a proof.
Linear algebraic version of Hahn-Banach’s extension theorem: Let V be a subspace of
the K-vector space X and p : X → R be sublinear, i.e., p(λx) = λp(x), if λ ≥ 0, and p(x + y) ≤
p(x) + p(y). If l : V → K is a linear functional satisfying Re l(x) ≤ p(x) for every x ∈ V , then
there is a linear functional L : X → K such that L(x) = l(x), if x ∈ V , and Re L(x) ≤ p(x) for
every x ∈ X.
Extension theorem: Let V be a subspace of the locally convex vector space X and l ∈ V′.
Then there exists an extension L ∈ X′ of l.
Proof: Suppose that the locally convex topology on X is generated by the family of seminorms
P. Then the subspace topology on V is generated by the set of restrictions {p|_V | p ∈ P} as
family of seminorms on V (since the basic neighborhoods of 0 in V are just obtained from
those in X intersected with V). By continuity of l, there are M > 0 and p_1, …, p_m ∈ P such
that |l(x)| ≤ M max_{j=1}^m p_j(x) for every x ∈ V. Define p : X → [0, ∞[ by p(x) := M max_{j=1}^m p_j(x)
(x ∈ X), then p is a continuous seminorm on X and Re l(x) ≤ |l(x)| ≤ p(x) for every x ∈ V. By
the linear algebraic extension theorem, there is an extension L of l to X satisfying Re L(x) ≤ p(x)
for every x ∈ X. Since p is a seminorm we have p(λx) = p(x) for every λ ∈ K with |λ| = 1, hence
we may conclude that |L(x)| ≤ p(x) holds for every x ∈ X. This shows that L ∈ X′ (strictly
speaking, upon calling on Lemma 5.1(c) and Corollary 5.4).
We turn now to so-called geometric versions of the Hahn-Banach theorem, which state properties
on separation of convex subsets by values of linear functionals.
Lemma on Minkowski functionals: Let W be a convex neighborhood of 0 in the locally convex
vector space X. Then the Minkowski functional p_W : X → [0, ∞],

p_W(x) := inf{α > 0 | x ∈ αW},

has finite values and defines a sublinear function on X. If W is an absolutely convex neighborhood
of 0, then p_W is a continuous seminorm on X.
Proof: Since W is absorbing², we have p_W(x) < ∞ for every x ∈ X. From the definition of p_W,
we immediately obtain p_W(λx) = λ p_W(x), if λ ≥ 0.
We show subadditivity of p_W, i.e., that p_W(x + y) ≤ p_W(x) + p_W(y) holds for all x, y ∈ X: Let
x, y ∈ X and ε > 0 be arbitrary. By definition of p_W as an infimum, there are λ, µ > 0 such that
λ < p_W(x) + ε/2, µ < p_W(y) + ε/2 and x/λ ∈ W, y/µ ∈ W. By convexity of W,

(x + y)/(λ + µ) = (λ/(λ + µ)) · (x/λ) + (µ/(λ + µ)) · (y/µ) ∈ W,

which implies p_W(x + y) ≤ λ + µ < p_W(x) + p_W(y) + ε. Since ε > 0 was arbitrary, we obtain
p_W(x + y) ≤ p_W(x) + p_W(y).
If, in addition, W is balanced, then λW = W holds whenever |λ| = 1, thus p_W(λx) = p_{λW}(λx) =
p_W(x) in this case. Therefore, we obtain for any λ ≠ 0,

p_W(λx) = p_W((λ/|λ|) |λ| x) = p_W(|λ|x) = |λ| p_W(x) (x ∈ X),

and may conclude that p_W is a seminorm. The continuity of p_W on X now follows from Lemma
5.1(a)(iii), since {x ∈ X | p_W(x) ≤ 1} ⊇ W is a neighborhood of 0.
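
For a concrete feeling for Minkowski functionals, the following sketch approximates p_W by bisection in X = ℝ², given only a membership test for W. The set W (a closed ellipse), the function names, and the fixed number of bisection steps are assumptions made for this illustration. The bisection is justified by the observation that, for convex W containing 0, the set {α > 0 | x ∈ αW} is upward closed.

```python
import math

def minkowski(in_W, x, steps=100):
    """Approximate p_W(x) = inf{a > 0 : x in a*W} by bisection.

    in_W : membership test for a convex set W with 0 in its interior
    x    : point of R^d given as a tuple of floats
    """
    if all(c == 0 for c in x):
        return 0.0
    hi = 1.0
    while not in_W(tuple(c / hi for c in x)):   # W is absorbing
        hi *= 2.0
    lo = 0.0
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if in_W(tuple(c / mid for c in x)):
            hi = mid
        else:
            lo = mid
    return hi

def in_W(v):
    """W = {(a, b) : a^2/4 + b^2 <= 1}, absolutely convex (assumed example)."""
    return v[0] ** 2 / 4.0 + v[1] ** 2 <= 1.0

def exact(v):
    """Closed form of p_W for this particular ellipse."""
    return math.sqrt(v[0] ** 2 / 4.0 + v[1] ** 2)

x, y = (3.0, 4.0), (-1.0, 2.0)
s = (x[0] + y[0], x[1] + y[1])
print(minkowski(in_W, x), exact(x))
print(minkowski(in_W, s) <= minkowski(in_W, x) + minkowski(in_W, y) + 1e-9)
```

Since this W is balanced, the computed functional is also absolutely homogeneous, e.g. p_W(−2x) = 2 p_W(x) up to rounding, in line with the lemma.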
Separation lemma: Let V be a nonempty convex open subset of the locally convex vector space
X. If 0 ∉ V, then there exists x′ ∈ X′ such that Re x′(x) < 0 for every x ∈ V.
Proof: Let x_0 ∈ V and U := −x_0 + V. Then U is convex, open, 0 ∈ U, and −x_0 ∉ U. There
is an absolutely convex neighborhood W of 0 such that W ⊆ U. By the lemma on Minkowski
functionals, we have that p_U is sublinear and p_W is a continuous seminorm.
² Recall: Any neighborhood of 0 is absorbing, since it contains an absorbing basic neighborhood defined
by finitely many seminorms.

Let Y := ℝ-span{−x_0} and define the ℝ-linear functional l : Y → ℝ by l(t(−x_0)) := t p_U(−x_0).
We claim that l(y) ≤ p_U(y) for every y ∈ Y: If t > 0, then l(t(−x_0)) = t p_U(−x_0) = p_U(t(−x_0));
if t ≤ 0, then l(t(−x_0)) = t p_U(−x_0) ≤ 0 ≤ p_U(t(−x_0)). By real linear algebraic extension, there
is an ℝ-linear extension L : X → ℝ such that L(x) ≤ p_U(x) for every x ∈ X. By W ⊆ U we have
also L(x) ≤ p_U(x) ≤ p_W(x), i.e., |L(x)| ≤ p_W(x) for every x ∈ X, since p_W is a seminorm. Thus,
L is a continuous ℝ-linear functional on X, since p_W is a continuous seminorm (again employing
Lemma 5.1(c) and Corollary 5.4).
Let x ∈ V = x_0 + U, say x = x_0 + u with u ∈ U. We claim that p_U(u) < 1: In fact, if p_U(u) ≥ 1,
then u/t ∈ X \ U for every 0 < t < 1; since X \ U is closed, u = lim_{t<1,t→1} u/t ∈ X \ U, a contradiction.
Furthermore, note that p_U(−x_0) ≥ 1, since −x_0 ∉ U. Therefore, we obtain

L(x) = L(u) + L(x_0) = L(u) + l(x_0) ≤ p_U(u) + l(x_0) = p_U(u) − p_U(−x_0) < 0.

Finally, in the case of a complex vector space we obtain a ℂ-linear continuous functional x′ ∈ X′
with Re x′ = L by setting x′(x) := L(x) − iL(ix) (direct computation shows x′((a + ib)x) = … = (a + ib)x′(x)),
which clearly satisfies Re x′(v) = L(v) < 0 for every v ∈ V.
Separation theorem I: Let V_1 and V_2 be convex subsets of the locally convex vector space X.
If V_1 is open and V_1 ∩ V_2 = ∅, then there exists x′ ∈ X′ such that

∀v_1 ∈ V_1, ∀v_2 ∈ V_2 : Re x′(v_1) < Re x′(v_2).

Proof: Let V := V_1 − V_2, then V is convex (a convex combination of differences is a difference of
convex combinations) and open (write V = ⋃_{x∈V_2} (V_1 − x) as union of open subsets; translation
by −x is a homeomorphism!). Since V_1 ∩ V_2 = ∅, we have 0 ∉ V. By the separation lemma we may find
x′ ∈ X′ such that Re x′(v_1) − Re x′(v_2) = Re x′(v_1 − v_2) < 0 for all v_1 ∈ V_1 and v_2 ∈ V_2.
Separation theorem II: Let V be a closed convex subset of the locally convex vector space X.
If x ∈ X \ V, then there exist x′ ∈ X′ and ε > 0 such that

∀v ∈ V : Re x′(x) + ε ≤ Re x′(v).

If, in addition, V is absolutely convex, then there exist y′ ∈ X′ and ε > 0 such that

∀v ∈ V : |y′(v)| + ε ≤ Re y′(x).

Proof: Choose an absolutely convex open neighborhood U of 0 such that (x + U) ∩ V = ∅. By
the separation theorem I there exists x′ ∈ X′ such that

(∗) ∀u ∈ U, ∀v ∈ V : Re x′(x) + Re x′(u) < Re x′(v).

Since U is absorbing, there is some u_0 ∈ U with x′(u_0) ≠ 0 (for otherwise, we had x′ = 0, which
contradicts the above inequality). Being also a balanced subset, U contains every multiple λu_0
with |λ| ≤ 1, hence there is some u_1 ∈ U with Re x′(u_1) > 0. Therefore ε := sup{Re x′(u) | u ∈ U}
is positive. Picking an arbitrary v_0 ∈ V, the inequality in (∗) implies Re x′(u) ≤ Re x′(v_0) −
Re x′(x) for every u ∈ U, hence Re x′(u) is bounded above and ε < ∞. We have thus shown that
Re x′(x) + ε ≤ Re x′(v) for every v ∈ V, which proves the first part of the statement.

Note that, running through the reasoning above with y′ := −x′, we obtain Re y′(x) − ε ≥ Re y′(v)
for every v ∈ V. If V is absolutely convex, then V = λV for every λ ∈ K with |λ| = 1, hence we
may conclude that Re y′(x) − ε ≥ |y′(v)| holds for every v ∈ V.
Corollary: If X is a Hausdorff locally convex vector space, then X′ separates points, i.e., for any
x, y ∈ X with x ≠ y, there is x′ ∈ X′ such that x′(x) ≠ x′(y).
Proof: Apply the separation theorem II to V := {y}, which is convex and also closed (by the
Hausdorff property).
Remark: The proofs of the Hahn-Banach theorems of extension and separation II do rely in
a crucial way on basic properties of locally convex topological vector spaces, since they employ
seminorms or nontrivial convex open neighborhoods of 0. In general topological vector spaces the
extension theorem may fail and it may happen that all convex open 0-neighborhoods are trivial. Both
phenomena are illustrated by the following example: Consider L^p([0, 1]), but with 0 < p < 1. Then it is
easy to see that d(f, g) := ∫_0^1 |f(t) − g(t)|^p dt defines a metric on L^p([0, 1]), which provides the topology of
a topological vector space. However, one can show the following properties (cf. [Wer18, Aufgabe
VIII.8.3] or [Con10, Chapter IV, Example 3.16]):
• If U is a convex neighborhood of 0, then U = L^p([0, 1]).
• There is no nonzero continuous linear functional on L^p([0, 1]), i.e., L^p([0, 1])′ = {0}.
The separation theorem I does hold in topological vector spaces, though the one convex subset
which is supposed to be open might be a trivial subset (cf. [Con10, Chapter IV, Theorem 3.7;
see also the paragraph preceding Example 3.16]).
5.11. An application of Hahn-Banach separation—the Krein-Milman theorem: An
extreme point x_0 in a convex set K in a 𝕂-vector space X is a point x_0 ∈ K that cannot be part
of a non-degenerate line segment inside K, i.e.,

∀x_1, x_2 ∈ K, ∀λ ∈ ℝ, 0 < λ < 1 : x_0 = λx_1 + (1 − λ)x_2 ⇒ x_1 = x_2 = x_0.

The Krein-Milman theorem states that a nonempty compact convex subset of a locally convex
space is the closed convex hull of its extreme points. Before stating and proving this theorem we
clarify or introduce some of the relevant notation.
If B is a subset of a vector space let co B denote its convex hull (i.e., the intersection of all convex
subsets containing B), and, in case of a topological vector space, let \overline{co} B denote the closure of
the convex hull. The latter is a convex set: Let x, y ∈ \overline{co} B and 0 ≤ λ ≤ 1; there are nets (x_j)
and (y_j) in co B with x = lim x_j and y = lim y_j; by continuity of the vector space operations, we
obtain λx + (1 − λ)y = lim(λx_j + (1 − λ)y_j) ∈ \overline{co} B. Therefore, \overline{co} B is the smallest closed convex
set containing B and is called the closed convex hull of B.
As a generalization of the notion of extreme points, a nonempty subset F of a convex set K is
called a face of K, if F is convex and satisfies

∀x_1, x_2 ∈ K, ∀λ ∈ ℝ, 0 < λ < 1 : λx_1 + (1 − λ)x_2 ∈ F ⇒ x_1, x_2 ∈ F.

Thus, x_0 is an extreme point of K, if and only if F := {x_0} is a face of K. We denote by ex K
the set of extreme points of K.

As an interesting example we may mention, without proof, the case X = C(Ω)′ ≅ M(Ω), where Ω
is a compact metric space (and M(Ω) denotes the space of signed or complex regular Borel measures,
cf. the Riesz representation theorem 0.17). Let K denote the subset of all Borel probability
measures on Ω. Then clearly K ≠ ∅ and K is convex. Particular elements in K are the Dirac
measures δ_ω concentrated at ω ∈ Ω and it turns out that (cf. [Wer18, Beispiel (f) in VIII.4])

ex K = {δ_ω | ω ∈ Ω}.

Lemma: Let K be a non-empty compact convex subset of a Hausdorff locally convex vector space
X and ρ ∈ X′. Let c := max{Re ρ(x) | x ∈ K} (making use of the continuity of Re ρ : X → ℝ
and of the compactness of K), then F := {x ∈ K | Re ρ(x) = c} is a compact face of K.
Proof: Clearly, F is nonempty, convex, and closed, hence also compact. If x_1, x_2 ∈ K, 0 < λ < 1,
and λx_1 + (1 − λ)x_2 ∈ F, then Re ρ(x_j) ≤ c (j = 1, 2) and c = Re ρ(λx_1 + (1 − λ)x_2) =
λ Re ρ(x_1) + (1 − λ) Re ρ(x_2), hence Re ρ(x_1) = Re ρ(x_2) = c, which implies x_1, x_2 ∈ F.
Theorem (Krein-Milman): If K is a nonempty compact convex subset of a Hausdorff locally
convex vector space X, then ex K ≠ ∅ and K = \overline{co}(ex K).
Proof: Let ℱ denote the set of all compact faces of K, which is nonempty since K ∈ ℱ and
partially ordered by the inclusion relation. If ℱ_0 ⊆ ℱ is totally ordered, then we may conclude that
F_0 := ⋂_{F∈ℱ_0} F is nonempty (by the finite intersection property for compact sets) and certainly
compact and a face of K. Thus, F_0 is a lower bound of ℱ_0. We may therefore apply Zorn’s lemma
and deduce that there exists F ∈ ℱ that is minimal with respect to inclusion.
We will show that F consists of a single point x ∈ K, i.e., F = {x} and hence x is an extreme
point of K, which then implies that ex K ≠ ∅.
Recall that as a face of K, the set F is nonempty. We prove the above assertion by contradiction.
Suppose there are two distinct elements x_1 and x_2 of F. By the corollary to the Hahn-Banach
theorems there is some ρ ∈ X′ such that Re ρ(x_1) ≠ Re ρ(x_2) (replacing ρ by iρ, if necessary).
Applying the above lemma to F in place of K we may conclude that there is some c ∈ ℝ such
that G := {x ∈ F | Re ρ(x) = c} is a compact face of F. It follows easily from the definition that
G is also a face of K, hence G ∈ ℱ. Since it cannot be that both x_1 and x_2 belong to G, we obtain
that G is a compact face of K and a proper subset of F, contradicting the minimality of F.
It remains to show that K = \overline{co}(ex K). The relation K ⊇ \overline{co}(ex K) is obvious and we will prove
K ⊆ \overline{co}(ex K) by contradiction. Denote E := \overline{co}(ex K) and suppose x_0 ∈ K \ E. By (an obvious
variant of) Hahn-Banach separation II there exist ρ ∈ X′ and a real number a such that

(∗) ∀y ∈ E : Re ρ(x_0) > a ≥ Re ρ(y).

Let c_1 := max{Re ρ(x) | x ∈ K}, then c_1 > a and the set F_1 := {x ∈ K | Re ρ(x) = c_1} is a
compact face of K by the previous lemma. Since F_1 is also nonempty, compact, and convex, the
first part of the proof shows that F_1 has an extreme point x_1, which is then³ also an extreme
point of K. In particular, x_1 ∈ E, while Re ρ(x_1) = c_1 > a, which contradicts (∗).
³ It is easy to see that an extreme point of a face of K is an extreme point of K as well.

Corollary: Let K be a non-empty compact convex subset of a Hausdorff locally convex vector
space X. If ρ ∈ X 0 , then there is an extreme point x0 of K such that Re ρ(x) ≤ Re ρ(x0 ) holds
for all x ∈ K.
Proof: We may put c := max{Re ρ(x) | x ∈ K} and obtain from the above lemma that F := {x ∈
K | Re ρ(x) = c} is a compact face of K. By the Krein-Milman theorem the nonempty compact
convex subset F has an extreme point x0 . It follows from the definition that an extreme point of
a face of K is an extreme point of K, thus x0 is an extreme point of K and, since x0 ∈ F , we
have Re ρ(x0 ) = c ≥ Re ρ(x) for every x ∈ K.
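
The corollary can be checked by hand in a finite-dimensional case. The following sketch uses a hypothetical example chosen for illustration (the square K and the functionals ρ, ρ̃ are not taken from the notes): a linear functional attains its maximum over K at a corner, and when the maximizing face is a whole edge, its extreme points are still corners of K.

```latex
% Illustration only: K = [-1,1]^2 in X = R^2, real scalars.
\[
  K = [-1,1]^2, \qquad \operatorname{ex} K = \{(\pm 1, \pm 1)\}, \qquad \rho(x) = x_1 + 2x_2 .
\]
% The maximum of rho on K is attained exactly at one corner:
\[
  c = \max_{x \in K} \rho(x) = 3, \qquad F = \{x \in K \mid \rho(x) = 3\} = \{(1,1)\} \subseteq \operatorname{ex} K .
\]
% Degenerate case: the maximizing face is an edge, whose extreme points are corners of K.
\[
  \tilde\rho(x) = x_1: \qquad F = \{1\} \times [-1,1], \qquad \operatorname{ex} F = \{(1,-1),\,(1,1)\} \subseteq \operatorname{ex} K .
\]
```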
Remark: One can also show the following related result (cf. [Wer18, Theorem VIII.4.4(c)] or
[Con10, Chapter V, Theorem 7.8] or [KR, Theorem 1.4.5]): If K is a nonempty compact convex
subset of a Hausdorff locally convex vector space X and B is a closed subset of K such that
$K = \overline{\mathrm{co}}\,B$, then B contains the extreme points of K, i.e., B ⊇ ex K.

6. Weak and weak* topologies
6.1. Dual pairs: Let X, Y be 𝕂-vector spaces and (x, y) ↦ ⟨x, y⟩ be a bilinear map X × Y → 𝕂.
We call (X, Y) a dual pair, if

(6.1) ∀x ∈ X, x ≠ 0, ∃y ∈ Y : ⟨x, y⟩ ≠ 0 and ∀y ∈ Y, y ≠ 0, ∃x ∈ X : ⟨x, y⟩ ≠ 0.

For any y ∈ Y, we have the induced linear functional $l_y(x) := \langle x, y\rangle$ on X and the map $y \mapsto l_y$ is
linear from Y into the algebraic dual X* of X. The requirement above says that the latter is an
injective map, and the same holds for the assignment x ↦ ⟨x, ·⟩, X → Y*. In this sense, we have
Y ↪ X* and X ↪ Y* for any dual pair (X, Y). Note that the ranges of these injective maps are
point separating subspaces in the respective duals.

For any y ∈ Y we have the seminorm $p_y(x) := |\langle x, y\rangle|$ on X and the set P of these seminorms
generates a locally convex vector space topology on X, which we call the σ(X, Y)-topology (or
weak topology) on X. The σ(Y, X)-topology on Y is defined analogously. It follows from the
definition of a dual pair and Lemma 4.11 that the σ(X, Y)-topology and the σ(Y, X)-topology are
always Hausdorff. By Proposition 5.7, a net $(x_j)_{j\in I}$ in X converges to 0 in the σ(X, Y)-topology,
if and only if $\langle x_j, y\rangle \to 0$ for every y ∈ Y. Considering $x_j$ (j ∈ I) as elements in Y*, this translates
into pointwise convergence of the net $(x_j)$.
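
To see how much coarser a σ-topology can be than a norm topology, the following sketch works out a standard example. The choice of spaces is an assumption made for illustration and is not part of the notes: X = Y = ℓ² with the bilinear pairing ⟨x, y⟩ = Σ xₖyₖ, and (eₙ) the sequence of unit vectors.

```latex
% Illustration only: the dual pair (l^2, l^2) with the bilinear pairing <x,y> = sum_k x_k y_k.
\[
  \langle x, y \rangle := \sum_{k=1}^{\infty} x_k y_k \qquad (x, y \in \ell^2),
  \qquad e_n := (0, \dots, 0, \underbrace{1}_{n\text{-th entry}}, 0, \dots).
\]
% Convergence to 0 in sigma(l^2, l^2): test against every fixed y.
\[
  \langle e_n, y \rangle = y_n \longrightarrow 0 \quad (n \to \infty) \qquad \text{for every } y \in \ell^2,
  \ \text{since } \sum_{k} |y_k|^2 < \infty .
\]
% No convergence in norm:
\[
  \| e_n \|_2 = 1 \quad \text{for all } n \in \mathbb{N}, \qquad \text{hence } e_n \not\to 0 \text{ in norm.}
\]
```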

Examples: 1) Let X be a locally convex Hausdorff vector space and Y = X′, then (X, X′) is a
dual pair with respect to the bilinear map (x, x′) ↦ x′(x), since the second part in (6.1) follows
from the definition of maps X → 𝕂 and the first part from the corollary of the Hahn-Banach
theorem stated in 5.10. The corresponding topology σ(X, X′) is also called the weak topology on
X and this is consistent with the topology described in Example 4.12.8) in case of a normed space.

2) In the situation of 1), we also obtain a dual pair (X′, X) and σ(X′, X) is the topology of
pointwise convergence, which we call weak* topology as in the case of normed spaces discussed in
Example 4.12.9).

3) Coming back to Example 4.12.11), consider a metric space Ω, $X = C_b(\Omega)$ the space of bounded
continuous functions, and Y = M(Ω) the space of regular signed or complex Borel measures on
Ω. The bilinear map
\[ (f, \mu) \mapsto \int f \, d\mu \qquad (f \in C_b(\Omega),\ \mu \in M(\Omega)) \]
defines the structure of a dual pair (the argument is again based on the regularity of the measures,
see [Els11, Kapitel VIII, §1, Satz 4.6]). The $\sigma(M(\Omega), C_b(\Omega))$-topology corresponds to the notion
of weak topology of probability theory as indicated in Example 4.12.11).

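4) On $X = \mathbb{R}^{\mathbb{R}}$ (the space of functions ℝ → ℝ) consider for any t ∈ ℝ the evaluation functional
$\delta_t : \mathbb{R}^{\mathbb{R}} \to \mathbb{R}$, $f \mapsto f(t)$. Let $Y := \operatorname{span}\{\delta_t \mid t \in \mathbb{R}\}$ (a subspace of $(\mathbb{R}^{\mathbb{R}})^*$). We obtain a dual pair
by defining the bilinear map X × Y → ℝ,
\[ \Big\langle f, \sum_{k=1}^{m} \lambda_k \delta_{t_k} \Big\rangle := \sum_{k=1}^{m} \lambda_k f(t_k), \]
and σ(X, Y) is the topology of pointwise convergence as in Example 4.12.1).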


5) Let X and Y be normed spaces. We will describe the weak operator topology on L(X, Y) from
Example 4.12.10) in terms of a σ-topology of a dual pairing as follows: The (purely algebraic
𝕂-)tensor product X ⊗ Y′ can be described equivalently as the set of all linear maps u : X′ → Y′
of the form $u(x') = \sum_{j=1}^{m} x'(x_j)\, y_j'$ with m ∈ ℕ and $x_j \in X$, $y_j' \in Y'$ (j = 1, …, m), i.e.,
$u = \sum_{j=1}^{m} x_j \otimes y_j'$. We claim that we obtain a dual pair (L(X, Y), X ⊗ Y′) via the bilinear map
⟨·, ·⟩ : L(X, Y) × (X ⊗ Y′) → 𝕂, given by
\[ \Big\langle T, \sum_{j=1}^{m} x_j \otimes y_j' \Big\rangle := \sum_{j=1}^{m} y_j'(T x_j). \]

To show that for any T ≠ 0 there is a u ∈ X ⊗ Y′ with ⟨T, u⟩ ≠ 0, first choose x₀ ∈ X such
that T x₀ ≠ 0, then there is some y₀′ ∈ Y′ with y₀′(T x₀) ≠ 0 by the Hahn-Banach theorem; thus,
⟨T, x₀ ⊗ y₀′⟩ ≠ 0. Furthermore, for given $u = \sum_{j=1}^{m} x_j \otimes y_j' \neq 0$ we have to find T ∈ L(X, Y) such
that ⟨T, u⟩ ≠ 0; there is some x₀′ ∈ X′ such that 0 ≠ u(x₀′) ∈ Y′; hence, there is also some y₀ ∈ Y
such that u(x₀′)(y₀) ≠ 0; we define T ∈ L(X, Y) by T x := x₀′(x) y₀ and arrive at
\[ \langle T, u \rangle = \sum_{j=1}^{m} y_j'(T x_j) = \sum_{j=1}^{m} y_j'\big(x_0'(x_j)\, y_0\big) = \Big( \sum_{j=1}^{m} x_0'(x_j)\, y_j' \Big)(y_0) = u(x_0')(y_0) \neq 0. \]

Any of the seminorms $p_u(T) = |\langle T, u\rangle|$, with $u = \sum_j x_j \otimes y_j'$, is obviously continuous with respect to
the weak operator topology. Therefore, Corollary 5.2 shows that the σ(L(X, Y), X ⊗ Y′)-topology
coincides with the weak operator topology.

6.2. The dual space of (X, σ(X, Y )): We need a preparatory result from linear algebra.
Lemma: Let l, l1 , . . . , ln : X → K be linear functionals. Then the following are equivalent:
(i) l ∈ span{l1 , . . . , ln },
(ii) ∃M > 0 such that $|l(x)| \le M \max_{j=1,\dots,n} |l_j(x)|$ holds for every x ∈ X,
(iii) $\bigcap_{j=1}^{n} \ker(l_j) \subseteq \ker(l)$.

Proof: (i) ⇒ (ii) ⇒ (iii) is clear. It remains to show (iii) ⇒ (i).


Let $V := \{(l_1(x), \dots, l_n(x)) \in \mathbb{K}^n \mid x \in X\}$, then (iii) guarantees that the linear map $\varphi_0 : V \to \mathbb{K}$,
$\varphi_0(l_1(x), \dots, l_n(x)) = l(x)$ is well-defined. Let $\varphi : \mathbb{K}^n \to \mathbb{K}$ be a linear extension of $\varphi_0$, then there
are $\alpha_1, \dots, \alpha_n \in \mathbb{K}$ such that $\varphi(\xi_1, \dots, \xi_n) = \sum_{j=1}^{n} \alpha_j \xi_j$ and we obtain
\[ \forall x \in X : \quad l(x) = \varphi_0(l_1(x), \dots, l_n(x)) = \varphi(l_1(x), \dots, l_n(x)) = \sum_{j=1}^{n} \alpha_j l_j(x). \]
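
The three equivalent conditions of the lemma are easy to verify in a small case. The sketch below uses a hypothetical choice of functionals on X = ℝ³ (not from the notes) and also shows how condition (iii) detects a functional outside the span.

```latex
% Illustration only: X = R^3 with coordinate functionals.
\[
  l_1(x) = x_1, \qquad l_2(x) = x_2, \qquad l(x) = 2x_1 - 3x_2 \qquad (x = (x_1, x_2, x_3) \in \mathbb{R}^3).
\]
% (i): l is a linear combination of l_1 and l_2.
\[
  l = 2\,l_1 - 3\,l_2 \in \operatorname{span}\{l_1, l_2\}.
\]
% (ii): the estimate holds with M = 5.
\[
  |l(x)| \le 2|x_1| + 3|x_2| \le 5 \max\{|l_1(x)|, |l_2(x)|\}.
\]
% (iii): the common kernel is the x_3-axis, on which l vanishes.
\[
  \ker(l_1) \cap \ker(l_2) = \{(0, 0, t) \mid t \in \mathbb{R}\} \subseteq \ker(l).
\]
% Counterexample: for \tilde l(x) = x_3 condition (iii) fails, so \tilde l is not in the span.
\[
  \tilde l(x) = x_3: \qquad (0,0,1) \in \ker(l_1) \cap \ker(l_2), \quad \tilde l(0,0,1) = 1 \neq 0 .
\]
```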

Corollary: Let (X, Y) be a dual pair. A linear functional on X is σ(X, Y)-continuous, if and
only if it is of the form x ↦ ⟨x, y⟩ for some y ∈ Y. Therefore,

$(X_{\sigma(X,Y)})' = Y.$

Proof: By Corollary 5.4, a linear functional l on X is σ(X, Y)-continuous, if and only if it satisfies
an estimate as in statement (ii) of the above lemma with $l_j(x) = \langle x, y_j\rangle$ and $y_j \in Y$. Therefore,
the σ(X, Y)-continuity of l is also equivalent to l being a linear combination $l = \sum_j \alpha_j l_j$, i.e.,
l(x) = ⟨x, y⟩ with $y = \sum_j \alpha_j y_j$.
In particular, in case of a locally convex vector space (X, τ) with the dual pairs (X, X′) or (X′, X),
we obtain the following statements as direct consequences:
(a) A linear functional on X is weakly continuous, if and only if it is τ-continuous.
(Compare with Example 5.9.2), where this result has been noted for normed spaces.)
(b) A linear functional on X′ is weak* continuous, if and only if it is of the form of an evaluation
functional x′ ↦ x′(x) with some x ∈ X, i.e., $(X'_{\sigma(X',X)})' = X$.

6.3. Proposition: Let (X, Y ) be a dual pair. The weak topology σ(X, Y ) has the following
property: A map f from any topological space (T, τ ) into (X, σ(X, Y )) is continuous, if and only
if the composition map y ◦ f : t ↦ ⟨f(t), y⟩ is continuous T → 𝕂 for every y ∈ Y.
We may remark that, in fact ([Sch71, Sections II.5 and IV.1]), σ(X, Y ) is the coarsest topology
on X such that every y ∈ Y defines a continuous function and thus is the initial or projective
topology on X with respect to the linear functionals given by the elements y ∈ Y .
Proof: Clearly, if f : (T, τ ) → (X, σ(X, Y )) is continuous, then y ◦ f is continuous for every y ∈ Y
due to the corollary in 6.2.
To prove the converse, suppose y ◦ f is continuous for every y ∈ Y . Let t ∈ T and U be a
σ(X, Y )-neighborhood of 0 in X. We have to show that there is a τ -neighborhood W of t in
T such that f (W ) ⊆ f (t) + U . We may assume that U is a basis neighborhood of the form
$U = \{x \in X \mid |\langle x, y_j\rangle| \le \varepsilon\ (j = 1, \dots, n)\}$ with $y_1, \dots, y_n \in Y$. Every function $y_j \circ f$ (1 ≤ j ≤ n)
is continuous, thus, there exist τ-neighborhoods $W_j$ of t in T such that $|\langle f(s) - f(t), y_j\rangle| \le \varepsilon$, if
$s \in W_j$. Putting $W := W_1 \cap \dots \cap W_n$ we obtain a neighborhood of t such that f(W) − f(t) ⊆ U.

6.4. Polar subsets: Let (X, Y) be a dual pair, A ⊆ X, and B ⊆ Y. The polar of A is

$A^\circ := \{y \in Y \mid \forall x \in A : \operatorname{Re}\langle x, y\rangle \le 1\}$

and the polar of B is

$B^\circ := \{x \in X \mid \forall y \in B : \operatorname{Re}\langle x, y\rangle \le 1\}.$

Note that $A^{\circ\circ} := (A^\circ)^\circ$ is defined and a subset of X.
Remark: (i) The following can be shown as an exercise: If X is a normed space and $B_X$ denotes
the closed unit ball, then the duality (X, X′) yields $(B_X)^\circ = B_{X'}$. Furthermore, if U is a subspace
of X, then $U^\circ$ coincides with the annihilator $U^\perp = \{x' \in X' \mid x'|_U = 0\}$.
(ii) In the literature, one often finds the definition of the polar of a subset A ⊆ X in the form
$A^\circ = \{y \in Y \mid \forall x \in A : |\langle x, y\rangle| \le 1\}$, which we would call here the absolute polar of A.
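For subsets of X or Y of the dual pair (X, Y), closures shall always refer to the weak topology
σ(X, Y) or σ(Y, X), unless stated otherwise. Let A and $A_j$ (j = 1, 2 or j ∈ J) denote subsets of
X, then we collect the following list of properties¹:
(a) $A_1 \subseteq A_2 \Rightarrow A_2^\circ \subseteq A_1^\circ$,
(b) $0 \in A^\circ$ and $A \subseteq A^{\circ\circ}$,
(c) $A^\circ = \overline{\mathrm{co}}\,(A^\circ)$ and $A^\circ = (\overline{\mathrm{co}}\,A)^\circ$,
(d) if A is balanced, then $A^\circ = \{y \in Y \mid \forall x \in A : |\langle x, y\rangle| \le 1\}$ (absolute polar),
(e) if λ > 0, then $(\lambda A)^\circ = \frac{1}{\lambda} A^\circ$,
(f) $\big(\bigcup_{j\in J} A_j\big)^\circ = \bigcap_{j\in J} A_j^\circ$,
(g) $\big(\bigcap_{j\in J} A_j\big)^\circ \supseteq \overline{\mathrm{co}}\, \bigcup_{j\in J} A_j^\circ$.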

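Proof: Properties (a), (b), and (d)-(f) are clear or very easily shown.
(c): $A^\circ = \bigcap_{x\in A} \{y \in Y \mid \operatorname{Re}\langle x, y\rangle \le 1\}$ is an intersection of closed convex sets, hence it is itself
convex and closed, and therefore $A^\circ = \overline{\mathrm{co}}\,(A^\circ)$. Since $A \subseteq \overline{\mathrm{co}}\,A$, we have $(\overline{\mathrm{co}}\,A)^\circ \subseteq A^\circ$ by (a),
hence it remains to show that any $y \in A^\circ$ belongs to $(\overline{\mathrm{co}}\,A)^\circ$. This follows upon observing that
$\operatorname{Re}\langle \lambda x_1 + (1-\lambda)x_2, y\rangle = \lambda \operatorname{Re}\langle x_1, y\rangle + (1-\lambda) \operatorname{Re}\langle x_2, y\rangle$ (for any real λ) and $\operatorname{Re}\langle \lim x_j, y\rangle = \lim \operatorname{Re}\langle x_j, y\rangle$.
(g): Put $A := \bigcap_{j\in J} A_j$, then $A \subseteq A_j$ for every j ∈ J, hence $A^\circ \supseteq A_j^\circ$ for every j ∈ J, and
therefore $A^\circ \supseteq \bigcup_{j\in J} A_j^\circ$. By (c), $A^\circ$ is convex and closed, thus also $A^\circ \supseteq \overline{\mathrm{co}}\, \bigcup_{j\in J} A_j^\circ$.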

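Bipolar theorem: $A^{\circ\circ} = \overline{\mathrm{co}}\,(A \cup \{0\})$.

Proof: By (b), $0 \in (A^\circ)^\circ = A^{\circ\circ}$ and $A \subseteq A^{\circ\circ}$, hence $A \cup \{0\} \subseteq A^{\circ\circ}$. By (c), $A^{\circ\circ}$ is convex and
closed, hence $\overline{\mathrm{co}}\,(A \cup \{0\}) \subseteq A^{\circ\circ}$.
Let $V := \overline{\mathrm{co}}\,(A \cup \{0\})$ and suppose $x_0 \in A^{\circ\circ} \setminus V$. Since V is closed and convex, the Hahn-Banach
separation theorem II and Corollary 6.2 allow us to find some $y_0 \in Y$ and ε > 0 such that
$\operatorname{Re}\langle x_0, y_0\rangle + \varepsilon \le \operatorname{Re}\langle v, y_0\rangle$ holds for every v ∈ V, or, as noted towards the end in the proof of
Hahn-Banach separation II, with $y_1 := -y_0$ we obtain $\operatorname{Re}\langle x_0, y_1\rangle - \varepsilon \ge \operatorname{Re}\langle v, y_1\rangle$ for every v ∈ V.
Hence there is a real number α such that

$\forall v \in V : \operatorname{Re}\langle x_0, y_1\rangle > \alpha > \operatorname{Re}\langle v, y_1\rangle.$

Since 0 ∈ V, we have α > 0, and putting $y_2 := y_1/\alpha$ we obtain for every a ∈ A ⊆ V,

$\operatorname{Re}\langle x_0, y_2\rangle > 1 > \operatorname{Re}\langle a, y_2\rangle.$

By the second inequality, $y_2 \in A^\circ$. But then the first inequality implies $x_0 \notin A^{\circ\circ}$, which gives a
contradiction. Thus, we also have $A^{\circ\circ} \subseteq V$ and the theorem is proved.

¹ The analogous properties hold for subsets of Y.
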
Corollary: Let C ⊆ X be convex with 0 ∈ C, then we have:

C is σ(X, Y )-closed ⇐⇒ ∃B ⊆ Y : C = B ◦ .

Proof: The implication ‘⇐’ follows by (c), and ‘⇒’ follows upon setting B := C ◦ and appealing
to the bipolar theorem.
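
The bipolar theorem can be verified directly in a two-dimensional case. The following sketch assumes, for illustration only, the dual pair X = Y = ℝ² with the Euclidean inner product as pairing and the two-point set A = {e₁, e₂}; none of these choices come from the notes.

```latex
% Illustration only: X = Y = R^2, <x,y> = x_1 y_1 + x_2 y_2, A = {e_1, e_2}.
\[
  A = \{(1,0),\,(0,1)\}, \qquad
  A^{\circ} = \{y \in \mathbb{R}^2 \mid y_1 \le 1,\ y_2 \le 1\}.
\]
% Bipolar: x belongs to A^{oo} iff sup over y in A^o of <x,y> is at most 1.
% If x_1 < 0 or x_2 < 0 the supremum is +infinity (let y_1 or y_2 tend to -infinity);
% otherwise the supremum is attained at y = (1,1).
\[
  \sup_{y \in A^{\circ}} (x_1 y_1 + x_2 y_2) =
  \begin{cases}
    x_1 + x_2, & \text{if } x_1 \ge 0 \text{ and } x_2 \ge 0, \\
    +\infty, & \text{otherwise.}
  \end{cases}
\]
\[
  A^{\circ\circ} = \{x \in \mathbb{R}^2 \mid x_1 \ge 0,\ x_2 \ge 0,\ x_1 + x_2 \le 1\}
  = \overline{\mathrm{co}}\,(A \cup \{0\}).
\]
```

Note that 0 belongs to the bipolar although 0 ∉ A, which is why the theorem involves A ∪ {0} rather than A alone.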

6.5. Alaoglu-Bourbaki theorem: Let X be a locally convex vector space and U be a neighborhood
of 0 in X. Then $U^\circ$ (in the sense of the dual pair (X, X′)) is weak* compact.
Proof: Let $\tau_p$ denote the topology of pointwise convergence on the set $\mathbb{K}^X$ of all functions X → 𝕂.
Recall that $\tau_p$ corresponds to the product topology on $\mathbb{K}^X = \prod_{x\in X} \mathbb{K}$ and therefore we have
Tychonoff's theorem as a convenient criterion for compactness of product sets in $(\mathbb{K}^X, \tau_p)$. We
have $X' \subseteq \mathbb{K}^X$ and $\tau_p|_{X'} = \sigma(X', X)$ (as noted in 6.1, Example 2) above).
We may suppose that U is absolutely convex, for we always have an absolutely convex neighborhood
V of 0 with V ⊆ U, which gives that $U^\circ \subseteq V^\circ$ and compactness of $U^\circ$ follows from that of
$V^\circ$. By property (d) in 6.4, we may thus suppose that

$U^\circ = \{x' \in X' \mid \forall x \in U : |x'(x)| \le 1\}.$

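For every x ∈ X, there is some $\lambda_x > 0$ with $x \in \lambda_x \cdot U$ (since U is absorbing). If x ∈ U, we choose
$\lambda_x \le 1$ (e.g., $\lambda_x = 1$ is always possible). Since $x/\lambda_x \in U$ by construction, we have for all $x' \in U^\circ$
and x ∈ X,
\[ (\ast) \qquad |x'(x)| = \lambda_x \Big| x'\Big(\frac{x}{\lambda_x}\Big) \Big| \le \lambda_x. \]
By Tychonoff's theorem, the set
\[ K := \prod_{x\in X} \underbrace{\{\lambda \in \mathbb{K} \mid |\lambda| \le \lambda_x\}}_{=:K_x} = \{f \in \mathbb{K}^X \mid \forall x \in X : f(x) \in K_x\} \]
is $\tau_p$-compact and (∗) shows that $U^\circ \subseteq K$. It remains to show that $U^\circ$ is $\tau_p$-closed in K.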


If f ∈ K is in the $\tau_p$-closure of $U^\circ$, then being a pointwise limit of a net of linear functionals from
$U^\circ$, f is a linear map. Due to our choice of $\lambda_x \le 1$ for x ∈ U and knowing that f ∈ K, we have
f(U) ⊆ {λ ∈ 𝕂 | |λ| ≤ 1}, which implies that f is continuous² at 0, hence f ∈ X′. Thus, we have
shown that $\overline{U^\circ}^{\,\tau_p} \subseteq X'$ and therefore deduce $\overline{U^\circ}^{\,\tau_p} = \overline{U^\circ}^{\,\sigma(X',X)} = U^\circ$.
Applying the Alaoglu-Bourbaki theorem to the closed unit ball $B_X$ in a normed space X and
recalling from Remark (i) above that $U := B_X$ implies $U^\circ = (B_X)^\circ = B_{X'}$, we obtain the
following special case (Alaoglu's or Banach-Alaoglu theorem).
6.6. Corollary: Let X be a normed space, then the closed unit ball $B_{X'}$ of X′ is weak* compact.
² Since, e.g., for every ε > 0 we obtain f(εU) ⊆ {λ | |λ| ≤ ε} and εU is a neighborhood of 0 in X.

6.7. Remark: (i) If X is a separable locally convex vector space, then the topology induced
by σ(X′, X) on the polar of a 0-neighborhood is metrizable ([Sch71, Chapter IV, 1.7]) and
compactness means sequential compactness. Thus, in case of a separable normed space X, bounded
sequences in the dual X′ always possess weak* convergent subsequences. Without separability
this is no longer true. An example of this failure is the following: Consider the evaluation
functionals $\delta_n \in (l^\infty)'$ (n ∈ ℕ) given by $\delta_n(x) := x_n$ for every $x = (x_m)_{m\in\mathbb{N}} \in l^\infty$. We have
$\delta_n \in B_{(l^\infty)'} = (B_{l^\infty})^\circ$. Suppose we had a weak* convergent subsequence $(\delta_{n_k})_{k\in\mathbb{N}}$ of $(\delta_n)_{n\in\mathbb{N}}$. For
every $x \in l^\infty$ put $y(x) := \lim_{k\to\infty} \delta_{n_k}(x)$. Then $y \in B_{(l^\infty)'}$, since $B_{(l^\infty)'}$ is weak* closed (being a
weak* compact subset). Let $z = (z_m)_{m\in\mathbb{N}} \in l^\infty$ be defined by
\[ z_m := \begin{cases} 0, & \text{if } \forall k \in \mathbb{N} : m \neq n_k, \\ 1, & \text{if } \exists l \in \mathbb{N} : m = n_{2l}, \\ 0, & \text{if } \exists l \in \mathbb{N}_0 : m = n_{2l+1}. \end{cases} \]
We obtain $\delta_{n_{2l}}(z) = 1$ for l ∈ ℕ and $\delta_{n_{2l+1}}(z) = 0$ for l ∈ ℕ₀ and therefore arrive at the
contradiction $y(z) = \lim_{k\to\infty} \delta_{n_k}(z) = \lim(0, 1, 0, 1, 0, 1, \dots)$, a limit that does not exist.
(ii) Recall that a Banach space X is reflexive, if it is canonically isomorphic to X″ via the map
ι : X → X″, ι(x)(x′) := x′(x). Combining the bipolar and the Alaoglu-Bourbaki theorem, one can
obtain characterizations of reflexivity for Banach spaces (cf. [Wer18, Satz VIII.3.18] or [Con10,
Chapter V, Theorem 4.2]), e.g., in the following form: The Banach space X is reflexive, if and
only if $B_X$ is weakly compact.

6.8. Weak* compactness of convex sets and extreme points: Although we always have
plenty of convex sets in any locally convex space, the compactness of these subsets is often true
only in the weak or weak* topologies and the Alaoglu-Bourbaki theorem plays an important part
in many applications of the Krein-Milman theorem 5.11. A particular case is again the closed
unit ball in the dual of a normed space, where the Krein-Milman theorem applies for the weak*
topology and can be used, e.g., to show that the Banach spaces c₀ (the space of real or complex
sequences converging to 0) and L¹([0, 1]) cannot be the dual of a normed space, since their closed
unit balls do not have any extreme points (cf. [Wer18, Korollar VIII.4.6]).

7. Basic theory of distributions
The theory of distributions aims at an extension of the notion of scalar functions on an open
subset Ω ⊆ ℝⁿ while maintaining a calculus of differentiation. The basic idea is to consider
linear functionals on function spaces and we can make the following observations:

(a) Functions can be considered as linear functionals (see also Example 5.9.1)): For example,
suppose $f \in L^1_{\mathrm{loc}}(\Omega)$, then¹
\[ T_f(\phi) := \int f(x)\phi(x)\, dx \]
defines a linear functional $T_f : C_c(\Omega) \to \mathbb{C}$. But now we are not tied to functions anymore
in implementing such functionals. In particular, we have finite measures acting as functionals
(similarly as in the Riesz representation theorem). One of the most prominent functionals is given
by the Dirac measure $\delta_{x_0}$ concentrated at the point $x_0 \in \Omega$, acting by
\[ \delta_{x_0}(\phi) = \int \phi(x)\, d\delta_{x_0} = \phi(x_0). \]

(b) Differentiation of non-smooth functions can be implemented taking up the idea of integration
by parts (which is the concept of a weak derivative): On Ω = ]0, 1[, for example, if $f \in C^1(]0,1[)$
and $\phi \in C_c^1(]0,1[)$, then
\[ T_{f'}(\phi) = \int_0^1 f'(x)\phi(x)\, dx = -\int_0^1 f(x)\phi'(x)\, dx = -T_f(\phi'). \]
However, if f is not differentiable and we have merely $f \in L^1_{\mathrm{loc}}(]0,1[)$, then we may still define
the linear functional $T_{f'}(\phi) := -T_f(\phi')$ on $C_c^1(]0,1[)$ and consider $T_{f'}$ to be the derivative of $T_f$.
We observe that we may obtain definitions of derivatives up to order k ∈ ℕ, if we define the
functionals instead on the smaller spaces $C_c^k(\Omega)$, or, to be prepared for derivatives of arbitrary
order, consider the space $C_c^\infty(\Omega) = \mathcal{D}(\Omega)$ of test functions known from Example 4.12.6).
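
Observation (b) can be made concrete with a function that has a kink. The following sketch assumes the hypothetical example f(x) = |x − 1/2| on ]0, 1[ (chosen for illustration, not taken from the notes); splitting the integral at the kink and integrating by parts on each piece shows that the functional −T_f(ϕ′) is again given by a function, namely the sign function centered at 1/2.

```latex
% Illustration only: f(x) = |x - 1/2| on ]0,1[, test function phi in C_c^1(]0,1[).
\begin{align*}
  -T_f(\phi')
    &= -\int_0^{1/2} \Big(\tfrac12 - x\Big)\phi'(x)\,dx - \int_{1/2}^{1} \Big(x - \tfrac12\Big)\phi'(x)\,dx \\
    &= -\Big[\Big(\tfrac12 - x\Big)\phi(x)\Big]_0^{1/2} - \int_0^{1/2} \phi(x)\,dx
       - \Big[\Big(x - \tfrac12\Big)\phi(x)\Big]_{1/2}^{1} + \int_{1/2}^{1} \phi(x)\,dx \\
    &= \int_0^1 \operatorname{sgn}\Big(x - \tfrac12\Big)\,\phi(x)\,dx .
\end{align*}
% The boundary terms vanish: phi has compact support in ]0,1[, and the factors (1/2 - x), (x - 1/2) vanish at x = 1/2.
\[
  T_{f'} = T_g \qquad \text{with } g(x) = \operatorname{sgn}\Big(x - \tfrac12\Big) \in L^1_{\mathrm{loc}}(]0,1[).
\]
```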

¹ Recall that $L^1_{\mathrm{loc}}(\Omega)$ is the set of measurable functions that are integrable on every compact subset of Ω,
and $C_c(\Omega)$ denotes the space of continuous functions with compact support.

7.1. The space of test functions: Recall from Examples 5) and 6) in 4.12, that we defined
the locally convex topology τ on $\mathcal{D}(\Omega)$ via the family P of all seminorms p on $\mathcal{D}(\Omega)$ such that
the restriction $p|_{\mathcal{D}_K(\Omega)}$ is continuous $(\mathcal{D}_K(\Omega), \tau_K) \to [0, \infty[$ for every compact subset
K ⊂ Ω. Here, $\mathcal{D}_K(\Omega)$ is the set of functions $\phi \in C_c^\infty(\Omega)$ with supp(ϕ) ⊆ K and the locally convex
topology $\tau_K$ is generated by the family of seminorms
\[ p_m(\phi) := \max_{|\alpha| \le m} \sup_{x\in\Omega} |\partial^\alpha \phi(x)| \qquad (m \in \mathbb{N}_0,\ \phi \in \mathcal{D}_K(\Omega)). \]

Note that pl ≤ pm , if l ≤ m, thus, any finite number of seminorms pm1 , . . . , pmr is bounded by
the single seminorm pm , if m ≥ mj (j = 1, . . . , r). Strictly speaking, we should indicate the
dependence on K in the notation for the seminorms pm , but we do not want to overload the
notation and it is convenient to think of pm as denoting the appropriate restriction to DK (Ω) of
a similar seminorm given on D(Ω).
By Lemma 5.1(c) and the monotonicity of pm with respect to m observed above, a seminorm p
on D(Ω) belongs to P , if and only if for every compact subset K ⊂ Ω there are c > 0 and m ∈ N0
such that

(7.1) ∀ϕ ∈ DK (Ω) : p(ϕ) ≤ c pm (ϕ).

Note that, in general, c and m will depend on K.


Lemma: (a) τ|DK (Ω) = τK .
(b) DK (Ω) is a τ -closed subspace of D(Ω).
(c) (D(Ω), τ ) is a Hausdorff space.
(d) Let Y be a locally convex space and L : D(Ω) → Y be linear. Then L is τ -continuous, if and
only if the restriction L |DK (Ω) is τK -continuous for every compact set K ⊂ Ω.
Proof: (a): The subspace topology τ|DK (Ω) is generated by the family of seminorms Q = {p |DK (Ω) |
p ∈ P } and we have by construction {pm | m ∈ N0 } ⊆ Q and that Q consists of τK -continuous
seminorms, thus, Corollary 5.2 proves the claim.
(b): We have $\mathcal{D}_K(\Omega) = \bigcap_{x\in\Omega\setminus K} \pi_x^{-1}(\{0\})$, where every $\pi_x : \mathcal{D}(\Omega) \to [0, \infty[$, $\pi_x(\phi) := |\phi(x)|$, is
a seminorm belonging to P, since $\pi_x(\phi) \le p_0(\phi)$ for every $\phi \in \mathcal{D}_M(\Omega)$ and M ⊂ Ω compact.
Therefore, every subset $\pi_x^{-1}(\{0\})$ is τ-closed.
(c): We appeal to Lemma 4.11 upon observing that for any $\phi \in \mathcal{D}(\Omega)$ with ϕ ≠ 0 there is some x ∈ Ω
with ϕ(x) ≠ 0 and hence $\pi_x$ as in (b) is a seminorm in P with $\pi_x(\phi) \neq 0$.
(d): If L is τ -continuous, then by (a) the restriction is τK -continuous. To prove the converse we
apply Theorem 5.3(iii): Let q be a continuous seminorm on Y , then by assumption q ◦(L |DK (Ω) ) =
(q ◦ L) |DK (Ω) is a continuous seminorm on DK (Ω), i.e., q ◦ L ∈ P , in particular, q ◦ L is a τ -
continuous seminorm on D(Ω).
In applications of distribution theory it is extremely convenient that linear functionals on D(Ω)
are automatically τ -continuous, if they are sequentially continuous. We will prove this fact in
Theorem 7.3 below, where we will see that it is based essentially on property (d) above. This is
also the reason why we are content here with describing convergence of sequences in D(Ω).
Proposition: Let (ϕl )l∈N be a sequence in D(Ω), then the following are equivalent:

(i) ϕl → 0 with respect to τ ,
(ii) there exists a compact set K ⊂ Ω such that supp(ϕl ) ⊆ K for every l ∈ N and ϕl → 0 holds
in DK (Ω),
(iii) there exists a compact set K ⊂ Ω such that supp(ϕl ) ⊆ K for every l ∈ N and for every
α ∈ Nn0 the sequence (∂ α ϕl )l∈N converges uniformly to 0.
Proof: Statement (iii) is just a reformulation of (ii) and the implication (ii) ⇒ (i) is clear from
the definition of τ (and (a) in the above lemma), hence it suffices to prove (i) ⇒ (ii).
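Suppose (i) holds but (ii) were wrong, i.e., there is no compact subset K ⊂ Ω with the properties
stated. Then we can find an increasing sequence of compact sets $K_1 \subset K_2^\circ \subset K_2 \subset K_3^\circ \subset K_3 \subset \dots \subset \Omega$
(where $K_m^\circ$ denotes the interior of $K_m$) with $\bigcup_{m\in\mathbb{N}} K_m = \Omega$ and a subsequence $(\phi_{l_m})_{m\in\mathbb{N}}$ such that $\phi_{l_m} \in \mathcal{D}_{K_m}(\Omega) \setminus \mathcal{D}_{K_{m-1}}(\Omega)$ for
every m ∈ ℕ (note that (i) and $\operatorname{supp}(\phi_l) \subseteq K$ for every l ∈ ℕ would imply $\phi_l \to 0$ in $\mathcal{D}_K(\Omega)$). Any
compact subset of Ω is contained in some $K_m$, since the sets $K_m^\circ$ (m ∈ ℕ) provide an increasing
open cover of Ω. Therefore, $\mathcal{D}(\Omega) = \bigcup_{m\in\mathbb{N}} \mathcal{D}_{K_m}(\Omega)$.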
S


Choose $x_m \in K_m \setminus K_{m-1}$ with $\alpha_m := |\phi_{l_m}(x_m)| > 0$. We have seen in the previous proof of
the lemma that the seminorms $\pi_m : \phi \mapsto |\phi(x_m)|/\alpha_m$ are τ-continuous on $\mathcal{D}(\Omega)$. We note that
$\pi_m|_{\mathcal{D}_{K_r}(\Omega)} = 0$, if m > r. Hence for every $\phi \in \mathcal{D}(\Omega)$ there are at most finitely many m ∈ ℕ such
that $\pi_m(\phi) \neq 0$ and we may define the seminorm π on $\mathcal{D}(\Omega)$ by $\pi(\phi) = \sum_{m=1}^{\infty} \pi_m(\phi)$. If M ⊂ Ω
is compact, there is some N ∈ ℕ such that $M \subseteq K_N$, which implies $\pi|_{\mathcal{D}_M(\Omega)} = \sum_{m=1}^{N} \pi_m|_{\mathcal{D}_M(\Omega)}$
and shows $\tau_M$-continuity of the restriction. Thus, π ∈ P, in particular, π is τ-continuous.
Since $\phi_l \to 0$ by assumption, we have $\pi(\phi_{l_m}) \to \pi(0) = 0$ (m → ∞), but at the same time
$\pi(\phi_{l_m}) \ge \pi_m(\phi_{l_m}) = 1$, a contradiction.
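
The role of the common compact support in condition (ii) is visible in the following pair of sequences. This is a sketch with a hypothetical fixed test function ϕ on ℝ (an assumption for illustration, not from the notes): both sequences have all derivatives converging uniformly to 0, but only the first one converges in 𝒟(ℝ).

```latex
% Illustration only: fix phi in D(R), phi not identically 0, supp(phi) contained in [-1,1].
\[
  \phi_l(x) := \frac{1}{l}\,\phi(x), \qquad \psi_l(x) := \frac{1}{l}\,\phi(x - l) \qquad (l \in \mathbb{N}).
\]
% Both sequences satisfy the uniform convergence of all derivatives:
\[
  \sup_{x \in \mathbb{R}} |\phi_l^{(k)}(x)| = \sup_{x \in \mathbb{R}} |\psi_l^{(k)}(x)|
  = \frac{1}{l} \sup_{x \in \mathbb{R}} |\phi^{(k)}(x)| \longrightarrow 0 \qquad (l \to \infty,\ k \in \mathbb{N}_0).
\]
% Supports:
\[
  \operatorname{supp}(\phi_l) \subseteq [-1, 1] \ \text{ for all } l, \qquad
  \emptyset \neq \operatorname{supp}(\psi_l) \subseteq [l-1,\, l+1].
\]
% Hence phi_l -> 0 in D(R) by (iii), whereas no compact K contains all supp(psi_l),
% so (ii) fails and psi_l does not converge to 0 with respect to tau.
```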

7.2. Definition: The dual space of $(\mathcal{D}(\Omega), \tau)$ is denoted by $\mathcal{D}'(\Omega)$ and called the space of
distributions on Ω.
7.3. Theorem: Let $T : \mathcal{D}(\Omega) \to \mathbb{C}$ be linear, then the following are equivalent:
(i) $T \in \mathcal{D}'(\Omega)$, i.e., T is continuous,
(ii) for every compact set K ⊂ Ω, the restriction $T|_{\mathcal{D}_K(\Omega)}$ is continuous on $\mathcal{D}_K(\Omega)$,
(iii) for every compact set K ⊂ Ω, there are m ∈ ℕ₀ and c > 0 such that
\[ \forall \phi \in \mathcal{D}_K(\Omega) : \quad |T(\phi)| \le c\, p_m(\phi) = c \max_{|\alpha| \le m} \sup_{x\in\Omega} |\partial^\alpha \phi(x)|, \]
(iv) T is sequentially continuous at 0, i.e., $\phi_l \to 0$ in $\mathcal{D}(\Omega)$ implies $T\phi_l \to 0$ in ℂ.


Proof: (i) ⇔ (ii) is a special case of (d) in Lemma 7.1.
(ii) ⇔ (iii) holds, because the seminorms pm generate the topology τK on DK (Ω).
(i) ⇒ (iv) is the general topological fact that continuity implies sequential continuity.
(iv) ⇒ (ii): We immediately obtain the sequential continuity of $T|_{\mathcal{D}_K(\Omega)}$ on $\mathcal{D}_K(\Omega)$. The locally
convex topology $\tau_K$ on $\mathcal{D}_K(\Omega)$ is generated by the countable set of seminorms $\{p_m \mid m \in \mathbb{N}_0\}$. It is
a routine exercise to show that, generally in a locally convex vector space in these circumstances,
we obtain a metric inducing the same topology by setting (see, e.g., [Kab14, Satz 1.1])
\[ d(\phi, \psi) := \sum_{m=0}^{\infty} 2^{-m}\, \frac{p_m(\phi - \psi)}{1 + p_m(\phi - \psi)} \qquad (\phi, \psi \in \mathcal{D}_K(\Omega)). \]
(For the proof of the triangle inequality, it is advisable to make use of the fact that t ↦ t/(1 + t) is an increasing function [0, ∞[ → ℝ.)

Thus, we have seen that $(\mathcal{D}_K(\Omega), \tau_K)$ is a metrizable space, therefore sequential continuity of
$T|_{\mathcal{D}_K(\Omega)}$ implies continuity.
7.4. Remark: The constant c as well as the nonnegative integer m in condition (iii) of the above
theorem depend, in general, on the compact set K. In case a uniform m can be found for all K,
then the minimum of these numbers m is called the order of the distribution.

7.5. Examples: 1) The linear functional $T_f$, defined for any $f \in L^1_{\mathrm{loc}}(\Omega)$ in observation (a) of
the introduction to the current section, is indeed a distribution. For every compact K ⊂ Ω and
$\phi \in \mathcal{D}_K(\Omega)$, we have
\[ |T_f(\phi)| \le \int_\Omega |f(x)||\phi(x)|\, dx \le \int_K |f(x)|\, dx \cdot \sup_{x\in\Omega} |\phi(x)| = \|f|_K\|_{L^1} \cdot p_0(\phi), \]
hence $T_f \in \mathcal{D}'(\Omega)$ (and of order 0).


Moreover, the map $f \mapsto T_f$ is linear and injective and the distributions in the image of this
embedding $L^1_{\mathrm{loc}}(\Omega) \hookrightarrow \mathcal{D}'(\Omega)$ are called regular distributions.
(Proof of the injectivity: If $T_f = 0$, then $\int_\Omega f\phi = 0$ for every $\phi \in \mathcal{D}(\Omega)$. Let K ⊂ Ω be a
compact set, then we can apply standard approximation procedures² to manufacture a uniformly
bounded sequence $(\phi_l)_{l\in\mathbb{N}}$ in $\mathcal{D}(\Omega)$ with $\phi_l \to \chi_K$ pointwise, thus, dominated convergence implies
$\int_K f = \int \chi_K f = \lim \int \phi_l f = 0$. Since K was arbitrary and the compact sets generate the Borel
sigma algebra B(Ω) (on the σ-compact Hausdorff space Ω, [Els11, Kapitel 1, Folgerung 4.2]), we
obtain³ $\int_{A\cap K} f = 0$ for every A ∈ B(Ω) and every compact set K ⊂ Ω. Therefore, $f|_K = 0$ almost
everywhere for every compact subset K (by [Els11, Kapitel IV, Satz 4.4]), which implies f = 0
almost everywhere, since Ω is σ-compact.)
2) Let µ be a complex Borel measure on Ω (recall that µ is a regular measure by Lemma 0.16).
Then
\[ \mu(\phi) := \int \phi \, d\mu \]
defines a distribution (of order 0), since $|\mu(\phi)| \le |\mu|(K) \cdot p_0(\phi)$ for every $\phi \in \mathcal{D}_K(\Omega)$. An
argument as in 1) shows that also here the map assigning the functional to each measure is
injective. A prominent example, mentioned already in observation (a) of the introduction, is the
Dirac distribution $\delta_{x_0}$ concentrated at the point $x_0 \in \Omega$, which is obtained from the Dirac measure
at $x_0$ and acts by $\delta_{x_0}(\phi) = \phi(x_0)$.

² The construction sketched in [Wer18, p. 461] is as follows: Consider the compact sets $K_l := \{x \in \Omega \mid d(x, K) \le 1/l\}$;
we have $K = \bigcap_{l=1}^{\infty} K_l$; choose $\phi_0 \in \mathcal{D}(\mathbb{R}^n)$ with $\operatorname{supp}(\phi_0) \subseteq \{|x| \le 1\}$ and $\int \phi_0 = 1$
and put $\phi_l(x) := \int_{K_l} l^n \phi_0(l(x - y))\, dy$ for every x ∈ Ω; then $\phi_l$ is smooth (integration with parameters),
has $\operatorname{supp}(\phi_l) \subseteq K_l + \{|x| \le 1/l\}$, and satisfies $\phi_l(x) = 1$, if x ∈ K.

³ Note that we may not suppose that $\int_A f$ exists for every A ∈ B(Ω), since f need not be integrable on
non-compact sets.

We can easily show that $\delta_{x_0}$ is not a regular distribution: Suppose $\delta_{x_0} = T_f$ for some $f \in L^1_{\mathrm{loc}}(\Omega)$,
then $\Omega_0 := \Omega \setminus \{x_0\}$ is an open subset of ℝⁿ and $\mathcal{D}(\Omega_0) \hookrightarrow \mathcal{D}(\Omega)$, since any smooth function with
compact support in $\Omega_0$ can be extended by 0 outside to give an element in $\mathcal{D}(\Omega)$ (compact sets
have finite distance to the boundary of any surrounding open set). By slight abuse of notation,
we have $T_{f|_{\Omega_0}} = (T_f)|_{\mathcal{D}(\Omega_0)} = \delta_{x_0}|_{\mathcal{D}(\Omega_0)} = 0$ and injectivity of $L^1_{\mathrm{loc}}(\Omega_0) \ni g \mapsto T_g \in \mathcal{D}'(\Omega_0)$ yields
that $f|_{\Omega_0} = 0$ almost everywhere. Therefore, we obtain f = 0 almost everywhere, which would
imply $\delta_{x_0} = 0$, a contradiction, since there are functions $\phi \in \mathcal{D}(\Omega)$ with $\phi(x_0) = 1$.
3) The linear functional $T(\phi) := \phi'(0)$ is a distribution (of order 1) on ℝ, i.e., $T \in \mathcal{D}'(\mathbb{R})$, since
$|T(\phi)| \le p_1(\phi)$. (A proof that T is not of order 0 can be obtained from testing with functions
$\phi_n(x) := \phi_0(nx)$, where $\phi_0 \in \mathcal{D}(\mathbb{R})$ satisfies $\phi_0(0) = 1$, $\phi_0'(0) = 1$.)
4) The linear functional $T(\phi) := \sum_{n=0}^{\infty} \phi^{(n)}(n)$ is defined on all of $\mathcal{D}(\mathbb{R})$ and continuous, since with
N ∈ ℕ sufficiently large such that supp(ϕ) ⊆ [−N, N], we obtain⁴ $|T(\phi)| \le \sum_{n=0}^{N-1} |\phi^{(n)}(n)| \le N \cdot p_{N-1}(\phi)$.
This distribution is not of finite order.
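
The scaling argument hinted at in Example 3) can be written out as follows. This is a sketch under the additional assumption, made here for convenience, that supp(ϕ₀) ⊆ [−1, 1], so that all scaled functions live in the fixed compact set K = [−1, 1].

```latex
% Sketch: T(phi) = phi'(0) is not of order 0.
% Assumption for this sketch: supp(phi_0) is contained in [-1,1], phi_0'(0) = 1.
\[
  \phi_n(x) := \phi_0(nx), \qquad \operatorname{supp}(\phi_n) \subseteq \big[-\tfrac1n, \tfrac1n\big] \subseteq K := [-1,1].
\]
% Scaling leaves the sup-norm unchanged but multiplies the derivative at 0 by n:
\[
  p_0(\phi_n) = \sup_{x \in \mathbb{R}} |\phi_0(nx)| = p_0(\phi_0), \qquad
  T(\phi_n) = \phi_n'(0) = n\,\phi_0'(0) = n .
\]
% An estimate of order 0 on D_K(R) would therefore give a contradiction:
\[
  n = |T(\phi_n)| \le c\, p_0(\phi_n) = c\, p_0(\phi_0) \quad \text{for all } n \in \mathbb{N}
  \qquad \text{(impossible for fixed } c > 0\text{).}
\]
```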

In observation (b) of the introductory paragraphs to the current section we have indicated a way to
extend (the linear operation of) differentiation from function spaces to distributions by mimicking
the integration by parts formula and “letting the derivatives fall on the test function”. This can
be described most systematically in terms of the general concept of an adjoint to a linear map.
7.6. Definition: Let X and Y be locally convex spaces and L : X → Y be a continuous linear
map. The adjoint of L is defined as the linear map L′ : Y′ → X′, given by L′(y′) := y′ ◦ L for
every y′ ∈ Y′.
7.7. Remark: Recall that $(X'_{\sigma(X',X)})' = X$ by Corollary 6.2. For every $x \in X = (X'_{\sigma(X',X)})'$,
the map x ◦ L′ : Y′ → 𝕂 is given by (x ◦ L′)(y′) = L′(y′)(x) = (y′ ◦ L)(x) = y′(Lx) and thus a
σ(Y′, Y)-continuous linear functional. Therefore, Proposition 6.3 implies that L′ is continuous as
linear map (Y′, σ(Y′, Y)) → (X′, σ(X′, X)), i.e., adjoints of continuous linear maps are always
weak* continuous.

7.8. Differentiation of distributions: If $\alpha \in \mathbb{N}_0^n$, then $\phi \mapsto (-1)^{|\alpha|} \partial^\alpha \phi$ defines a continuous
linear map $\mathcal{D}(\Omega) \to \mathcal{D}(\Omega)$, since $p_m(\partial^\alpha \phi) \le p_{m+|\alpha|}(\phi)$ shows continuity $\mathcal{D}_K(\Omega) \to \mathcal{D}_K(\Omega)$ for
every compact set K ⊂ Ω and Lemma 7.1(d) applies. We define $\partial^\alpha : \mathcal{D}'(\Omega) \to \mathcal{D}'(\Omega)$ to be the
adjoint of this map, i.e.,
\[ (\partial^\alpha T)(\phi) := (-1)^{|\alpha|}\, T(\partial^\alpha \phi) \qquad (\phi \in \mathcal{D}(\Omega),\ T \in \mathcal{D}'(\Omega)). \]
By the above remark, $\partial^\alpha$ is weak* continuous, and the formula of integration by parts (in several
variables with iterated integrals and vanishing boundary terms due to compact support of the test
functions) shows that the new definition of $\partial^\alpha$ is consistent when applied to regular distributions
stemming from functions that are sufficiently often continuously differentiable, i.e., $\partial^\alpha T_f = T_{\partial^\alpha f}$
in these cases.

⁴ Note that ϕ vanishes of infinite order at the boundary of its support, hence $\phi^{(N)}(N) = 0$.

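Examples: 1) The Heaviside function is $H := \chi_{[0,\infty[} \in L^\infty(\mathbb{R}) \subset L^1_{\mathrm{loc}}(\mathbb{R}) \hookrightarrow \mathcal{D}'(\mathbb{R})$. We have
\[ (T_H)'(\phi) = -T_H(\phi') = -\int_0^\infty \phi'(x)\, dx = \phi(0) = \delta_0(\phi) \qquad (\phi \in \mathcal{D}(\mathbb{R})). \]
Furthermore, $(T_H)''(\phi) = T_H(\phi'') = -\phi'(0) = -\delta_0(\phi') = \delta_0'(\phi)$.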
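2) The (class of the) measurable function f, given by f(x) := log(|x|), belongs to $L^1_{\mathrm{loc}}(\mathbb{R})$ and we
obtain, for every $\phi \in \mathcal{D}(\mathbb{R})$,
\begin{align*}
(T_f)'(\phi) = -T_f(\phi') &= -\int_{-\infty}^{\infty} \phi'(x) \log(|x|)\, dx = -\int_0^\infty \big(\phi'(x) + \phi'(-x)\big) \log(x)\, dx \\
&= -\int_0^\infty \big(\phi(x) - \phi(-x)\big)' \log(x)\, dx = -\lim_{\varepsilon>0,\,\varepsilon\to 0} \int_\varepsilon^\infty \big(\phi(x) - \phi(-x)\big)' \log(x)\, dx \\
&= -\lim_{\varepsilon>0,\,\varepsilon\to 0} \Big[ \big(\phi(x) - \phi(-x)\big) \log x \Big]_\varepsilon^\infty + \lim_{\varepsilon>0,\,\varepsilon\to 0} \int_\varepsilon^\infty \frac{\phi(x) - \phi(-x)}{x}\, dx \\
&= \lim_{\varepsilon>0,\,\varepsilon\to 0} \int_\varepsilon^\infty \frac{\phi(x) - \phi(-x)}{x}\, dx = \int_0^\infty \frac{\phi(x) - \phi(-x)}{x}\, dx,
\end{align*}
since $\phi(x) - \phi(-x) = 2x\phi'(0) + O(x^2)$ near x = 0 and ϕ has compact support. The distribution
vp(1/x) := $(T_f)'$ is called the Cauchy principal value of 1/x, since its action on a test function is
equivalently given by
\[ \mathrm{vp}\Big(\frac{1}{x}\Big)(\phi) = \lim_{\varepsilon>0,\,\varepsilon\to 0} \int_{|x|>\varepsilon} \frac{\phi(x)}{x}\, dx. \]
Note that, with the classical pointwise derivative, we have (log(|x|))′ = 1/x on ℝ \ {0}.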
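
A further computation in the same spirit links the examples above. The sketch below assumes the hypothetical function f(x) = x·H(x) (the positive part of x, not discussed in the notes) and applies the definition of the distributional derivative twice, using Example 1) for the second step.

```latex
% Illustration only: f(x) = x H(x) = max(x, 0), a continuous function with a kink at 0.
\begin{align*}
  (T_f)'(\phi) = -T_f(\phi')
    &= -\int_0^\infty x\,\phi'(x)\,dx
     = -\big[x\,\phi(x)\big]_0^\infty + \int_0^\infty \phi(x)\,dx
     = T_H(\phi) \qquad (\phi \in \mathcal{D}(\mathbb{R})),
\end{align*}
% The boundary term vanishes since phi has compact support and x = 0 at the lower limit.
\[
  (T_f)' = T_H, \qquad (T_f)'' = (T_H)' = \delta_0 .
\]
```

The first derivative is still a regular distribution, while the second one is not, in accordance with Example 7.5.2).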

7.9. Fourier transform of temperate distributions: Finally, we will briefly discuss the
extension of the Fourier transform beyond $L^1(\mathbb{R}^n)$ and $L^2(\mathbb{R}^n)$. Our goal is to define it as the
adjoint of the classical Fourier transform on an appropriate test function space. As it turns out,
$\mathcal{D}(\mathbb{R}^n)$ is not suitable for that purpose, since the Fourier transform $\mathcal{F}\phi$ of a compactly supported
function ϕ is easily seen to define an entire function on $\mathbb{C}^n$ and thus $\mathcal{F}\phi$ cannot have compact
support as well, unless ϕ = 0. However, a good alternative is the Schwartz space $\mathcal{S}(\mathbb{R}^n)$ with
seminorms $p_{\alpha,m}(\phi) := \sup_{x\in\mathbb{R}^n} (1 + |x|^m)\, |\partial^\alpha \phi(x)|$ ($m \in \mathbb{N}_0$, $\alpha \in \mathbb{N}_0^n$). Its dual, the space of
temperate distributions $\mathcal{S}'(\mathbb{R}^n)$, has been introduced already in Example 5.9.1), where we have
also seen that $L^p(\mathbb{R}^n) \hookrightarrow \mathcal{S}'(\mathbb{R}^n)$ (1 ≤ p ≤ ∞) via the map $f \mapsto T_f$, $T_f(\phi) = \int_{\mathbb{R}^n} f\phi$. It is not
difficult to show (see [Wer18, Satz VIII.5.11]) that the identical embedding $\mathcal{D}(\mathbb{R}^n) \hookrightarrow \mathcal{S}(\mathbb{R}^n)$ is
continuous with dense image and therefore has an injective adjoint $\mathcal{S}'(\mathbb{R}^n) \hookrightarrow \mathcal{D}'(\mathbb{R}^n)$. Thus,
every temperate distribution is a distribution.

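Recall (or take for granted or look up in [Wer18, Section V.2] or [Con16, Theorem 6.1] or [Con10,
Chapter X, §6]) that $\mathcal{F}$, defined by
\[ \mathcal{F}\phi(\xi) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} e^{-ix\xi} \phi(x)\, dx \qquad (\phi \in \mathcal{S}(\mathbb{R}^n),\ \xi \in \mathbb{R}^n), \]
is a bijective linear map $\mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$ with inverse given by
\[ (\mathcal{F}^{-1}\psi)(x) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} e^{ix\xi} \psi(\xi)\, d\xi = (\mathcal{F}\psi)(-x). \]
Moreover, we have the so-called "exchange formulae"
\[ \partial^\alpha(\mathcal{F}\phi) = (-i)^{|\alpha|}\, \mathcal{F}(x^\alpha \phi) \qquad \text{and} \qquad \mathcal{F}(\partial^\alpha \phi) = i^{|\alpha|}\, \xi^\alpha\, \mathcal{F}\phi \]
and, by a simple application of Fubini's theorem, we obtain also
\[ (\ast) \qquad \int_{\mathbb{R}^n} (\mathcal{F}\psi)(\xi) \cdot \phi(\xi)\, d\xi = \int_{\mathbb{R}^n} \psi(x) \cdot (\mathcal{F}\phi)(x)\, dx. \]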

We note that an equivalent, technically convenient, family of seminorms on $\mathcal{S}(\mathbb{R}^n)$ is given by
\[ q_{\alpha,Q}(\phi) := \sup_{x\in\mathbb{R}^n} |Q(x)\, \partial^\alpha \phi(x)|, \]
where Q is a polynomial function on ℝⁿ.


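Lemma: (a) For every $\alpha \in \mathbb{N}_0^n$, the assignments $\varphi \mapsto x^\alpha \varphi$ and $\varphi \mapsto \partial^\alpha \varphi$ define continuous linear
maps $\mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$.
(b) There is a constant $c > 0$ such that

$$\forall \varphi \in \mathcal{S}(\mathbb{R}^n)\ \forall \xi \in \mathbb{R}^n : \quad |(\mathcal{F}\varphi)(\xi)| \le c\, p_{0,n+1}(\varphi).$$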

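Proof: (a): Leibniz’ rule applied to $\partial^\beta (x^\alpha \varphi(x))$ yields an upper bound for $p_{\beta,m}(x^\alpha \varphi)$ in the form
of a finite sum with terms $q_{\beta,Q_{m+|\alpha|-l}}$ ($0 \le l \le \max(|\beta|, m + |\alpha|)$), where $Q_{m+|\alpha|-l}$ is a polynomial
function of degree at most $m + |\alpha| - l$. The continuity of $\varphi \mapsto \partial^\alpha \varphi$ is even more obvious, since
$p_{\beta,m}(\partial^\alpha \varphi) = p_{\beta+\alpha,m}(\varphi)$.
(b): Since $|\varphi(x)| \le p_{0,n+1}(\varphi) / (1 + |x|^{n+1})$, we have

$$(2\pi)^{n/2}\, |(\mathcal{F}\varphi)(\xi)| = \Big| \int_{\mathbb{R}^n} e^{-ix\xi}\, \varphi(x)\, dx \Big| \le \int_{\mathbb{R}^n} |\varphi(x)|\, dx \le \int_{\mathbb{R}^n} \frac{dx}{1 + |x|^{n+1}} \cdot p_{0,n+1}(\varphi).$$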
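
The estimate in part (b) gives an explicit admissible constant. The following computation is a sketch under the normalization of $\mathcal{F}$ used in these notes; the closed form for $n = 1$ uses the elementary integral of $1/(1+x^2)$.

```latex
% Constant read off from the chain of inequalities in (b):
\[
  c = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \frac{dx}{1 + |x|^{n+1}} < \infty ,
\]
% finite, since in polar coordinates the radial integrand behaves like r^{-2} at infinity.
% Special case n = 1:
\[
  c = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \frac{dx}{1 + x^2}
    = \frac{\pi}{\sqrt{2\pi}} = \sqrt{\pi/2} \approx 1.2533 .
\]
```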

By part (a) of this lemma and the embedding $\mathcal{S}'(\mathbb{R}^n) \hookrightarrow \mathcal{D}'(\mathbb{R}^n)$, we obtain differentiation as a
linear map $\mathcal{S}'(\mathbb{R}^n) \to \mathcal{S}'(\mathbb{R}^n)$. Moreover, multiplication by polynomial functions is obtained as a
linear map $\mathcal{S}'(\mathbb{R}^n) \to \mathcal{S}'(\mathbb{R}^n)$ as the transpose of the map in (a). For the Fourier transform, we
obtain the analogous result from the following
Theorem: Both $\mathcal{F}$ and $\mathcal{F}^{-1}$ are continuous linear maps $\mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$.

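Proof: For any $\alpha \in \mathbb{N}_0^n$ and any polynomial function $Q(\xi) = \sum_{|\gamma| \le N} a_\gamma \xi^\gamma$, we have to give an
upper bound of $q_{\alpha,Q}(\mathcal{F}\varphi) = \sup_{\xi \in \mathbb{R}^n} |Q(\xi)\, \partial^\alpha (\mathcal{F}\varphi)(\xi)|$ in terms of finitely many seminorms of $\varphi$.
Let $Q(-i\partial) := \sum_{|\gamma| \le N} a_\gamma (-i)^{|\gamma|} \partial^\gamma$, then we have, by (b) in the above lemma and the exchange
formulae,

$$\forall \xi \in \mathbb{R}^n : \quad |Q(\xi)\, (\partial^\alpha \mathcal{F}\varphi)(\xi)| = |\mathcal{F}(Q(-i\partial)(x^\alpha \varphi))(\xi)| \le c\, p_{0,n+1}(Q(-i\partial)(x^\alpha \varphi)).$$

By part (a) in the above lemma, we have an upper bound for $p_{0,n+1}(Q(-i\partial)(x^\alpha \varphi))$ in terms of
$\max_{j=1}^{N} p_{\beta_j,m_j}(\varphi)$, which completes the proof of continuity of $\mathcal{F}$. Since $(\mathcal{F}^{-1}\varphi)(x) = (\mathcal{F}\varphi)(-x)$,
the continuity of $\mathcal{F}^{-1}$ follows as well.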
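
To see the mechanism of this proof in a concrete case, consider $n = 1$, $\alpha = 1$ and $Q(\xi) = \xi$ (our own choice for illustration). The sketch below only uses the exchange formulae and part (b) of the lemma; the explicit factor 2 comes from an elementary inequality stated in the comment.

```latex
% n = 1, alpha = 1, Q(\xi) = \xi, hence Q(-i\,d/dx) = -i\,d/dx.
\[
  \xi\, (\mathcal{F}\varphi)'(\xi)
  = \xi \cdot (-i)\, \mathcal{F}(x\varphi)(\xi)
  = (-i)^2\, \mathcal{F}\big( (x\varphi)' \big)(\xi)
  = -\mathcal{F}(\varphi + x\varphi')(\xi).
\]
% Part (b) of the lemma with n = 1, followed by the triangle inequality:
\[
  |\xi\, (\mathcal{F}\varphi)'(\xi)|
  \le c\, p_{0,2}(\varphi + x\varphi')
  \le c\, \big( p_{0,2}(\varphi) + p_{0,2}(x\varphi') \big).
\]
% Since (1 + x^2)|x| = |x| + |x|^3 \le 2(1 + |x|^3), we get p_{0,2}(x\varphi') \le 2 p_{1,3}(\varphi):
\[
  q_{1,Q}(\mathcal{F}\varphi) \le c\, \big( p_{0,2}(\varphi) + 2\, p_{1,3}(\varphi) \big).
\]
```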

Formula $(\ast)$ shows that on Schwartz functions, considered as regular temperate distributions, the
Fourier transform is its own adjoint. Thus, we come up with the following
Definition: If $T \in \mathcal{S}'(\mathbb{R}^n)$, then its Fourier transform $\mathcal{F}T \in \mathcal{S}'(\mathbb{R}^n)$ is defined by

$$(\mathcal{F}T)(\varphi) := T(\mathcal{F}\varphi) \qquad (\varphi \in \mathcal{S}(\mathbb{R}^n)),$$

i.e., as the adjoint of the Fourier transform on $\mathcal{S}(\mathbb{R}^n)$.

The Fourier transform is a bijective⁵ linear map $\mathcal{S}'(\mathbb{R}^n) \to \mathcal{S}'(\mathbb{R}^n)$ and it is easily verified that
the exchange formulae extend to $\mathcal{S}'(\mathbb{R}^n)$.
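
The verification of one exchange formula on temperate distributions can be sketched as follows. We assume the usual definitions $(\partial^\alpha T)(\varphi) = (-1)^{|\alpha|}\, T(\partial^\alpha \varphi)$ and $(x^\alpha T)(\varphi) = T(x^\alpha \varphi)$, which we take to agree with those used earlier in the notes.

```latex
\begin{align*}
  (\mathcal{F}(\partial^\alpha T))(\varphi)
  &= (\partial^\alpha T)(\mathcal{F}\varphi)
   = (-1)^{|\alpha|}\, T(\partial^\alpha \mathcal{F}\varphi) \\
  % exchange formula on Schwartz functions: \partial^\alpha \mathcal{F}\varphi = (-i)^{|\alpha|} \mathcal{F}(x^\alpha \varphi)
  &= (-1)^{|\alpha|} (-i)^{|\alpha|}\, T(\mathcal{F}(x^\alpha \varphi))
   = i^{|\alpha|}\, (\mathcal{F}T)(x^\alpha \varphi) \\
  &= \big( i^{|\alpha|}\, \xi^\alpha\, \mathcal{F}T \big)(\varphi),
  \qquad \text{i.e.} \quad \mathcal{F}(\partial^\alpha T) = i^{|\alpha|}\, \xi^\alpha\, \mathcal{F}T .
\end{align*}
```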
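Examples: 1) Let $f \in L^1(\mathbb{R}^n)$ and $\varphi \in \mathcal{S}(\mathbb{R}^n)$, then applying Fubini’s theorem we obtain

$$\begin{aligned}
(\mathcal{F}T_f)(\varphi) = T_f(\mathcal{F}\varphi) &= \int_{\mathbb{R}^n} f(x) \cdot (\mathcal{F}\varphi)(x)\, dx = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} f(x) \int_{\mathbb{R}^n} e^{-ix\xi}\, \varphi(\xi)\, d\xi\, dx \\
&= \int_{\mathbb{R}^n} \Big( \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} e^{-ix\xi} f(x)\, dx \Big) \varphi(\xi)\, d\xi = \int_{\mathbb{R}^n} (\mathcal{F}f)(\xi) \cdot \varphi(\xi)\, d\xi = (T_{\mathcal{F}f})(\varphi).
\end{aligned}$$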

(Recall that, by the lemma of Riemann-Lebesgue, $\mathcal{F}f$ is a continuous function [vanishing at infinity],
if $f \in L^1(\mathbb{R}^n)$.) Similarly, one can show that also the classical Fourier-Plancherel transform
on $L^2(\mathbb{R}^n)$ is consistently extended by the Fourier transform on $\mathcal{S}'(\mathbb{R}^n)$.
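2) $$(\mathcal{F}\delta_0)(\varphi) = \delta_0(\mathcal{F}\varphi) = (\mathcal{F}\varphi)(0) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} 1 \cdot \varphi(x)\, dx = \frac{1}{(2\pi)^{n/2}}\, T_1(\varphi).$$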

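3) $(\mathcal{F}T_1)(\varphi) = (2\pi)^{n/2}\, (\mathcal{F}(\mathcal{F}\delta_0))(\varphi) = (2\pi)^{n/2}\, \delta_0(\mathcal{F}\mathcal{F}\varphi)$ and applying $(\mathcal{F}\varphi)(x) = (\mathcal{F}^{-1}\varphi)(-x)$ we
obtain

$$(\mathcal{F}T_1)(\varphi) = (2\pi)^{n/2}\, \delta_0(\mathcal{F}\mathcal{F}\varphi) = (2\pi)^{n/2}\, \varphi(-0) = (2\pi)^{n/2}\, \varphi(0) = (2\pi)^{n/2}\, \delta_0(\varphi).$$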
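
A further example in the same spirit, for $n = 1$, is the Fourier transform of the derivative of the Dirac distribution. This sketch is our own addition; it assumes $\delta_0'(\varphi) = -\varphi'(0)$ and uses the exchange formula $(\mathcal{F}\varphi)' = (-i)\, \mathcal{F}(x\varphi)$.

```latex
\[
  (\mathcal{F}\delta_0')(\varphi)
  = \delta_0'(\mathcal{F}\varphi)
  = -(\mathcal{F}\varphi)'(0)
  = i\, \mathcal{F}(x\varphi)(0)
  = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} i x\, \varphi(x)\, dx .
\]
% Hence F(delta_0') is the regular distribution given by the polynomial \xi \mapsto i\xi / \sqrt{2\pi}:
\[
  \mathcal{F}\delta_0' = \frac{1}{\sqrt{2\pi}}\, T_{i\xi} = i\xi\, \mathcal{F}\delta_0 ,
\]
% in agreement with Example 2) and the exchange formula F(\partial T) = i \xi\, F T.
```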

⁵ It is easy to see that the adjoint of a bijective linear continuous map is bijective.

Appendix
We restate and prove Lemma 0.13
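Lemma: Let $\Omega \subset \mathbb{C}$ be compact and $(B_b(\Omega), \|\cdot\|_\infty)$ be the Banach space of bounded Borel
measurable functions $\Omega \to \mathbb{C}$. Suppose $U \subseteq B_b(\Omega)$ has the following properties:

(a) $C(\Omega) \subseteq U$,

(b) $f_n \in U$ ($n \in \mathbb{N}$), $\sup_{n \in \mathbb{N}} \|f_n\|_\infty < \infty$, and $f(t) := \lim_{n \to \infty} f_n(t)$ exists for every $t \in \Omega$
$\Longrightarrow$ $f \in U$.

Then $U = B_b(\Omega)$.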
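Proof: Let $\mathscr{S} := \{S \subseteq B_b(\Omega) \mid S \supseteq C(\Omega) \text{ and } S \text{ satisfies (b)}\}$ and put $V := \bigcap_{S \in \mathscr{S}} S$. Clearly
$B_b(\Omega) \in \mathscr{S}$, hence $\mathscr{S} \neq \emptyset$ and $C(\Omega) \subseteq V \subseteq U$. We will show that $V = B_b(\Omega)$.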
Claim 1: V is a vector subspace.
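For any $f_0 \in V$ put $V_{f_0} := \{g \in B_b(\Omega) \mid f_0 + g \in V\}$. Note that $V_{f_0}$ possesses property (b).
If $f_0 \in C(\Omega)$, then $C(\Omega) \subseteq V_{f_0}$ and $V_{f_0}$ satisfies (b); hence $V_{f_0} \in \mathscr{S}$ and $V \subseteq V_{f_0}$, i.e.,

$$f_0 \in C(\Omega),\ g \in V \implies f_0 + g \in V.$$

Thus, if $g_0 \in V$, then $f + g_0 \in V$ for every $f \in C(\Omega)$; therefore $C(\Omega) \subseteq V_{g_0}$ and, furthermore,
$V_{g_0} \in \mathscr{S}$; this in turn implies $V \subseteq V_{g_0}$, i.e.,

$$g_0 \in V,\ g \in V \implies g \in V_{g_0} \implies g_0 + g \in V.$$

It remains to show that $\alpha g \in V$, if $\alpha \in \mathbb{C}$ and $g \in V$: Fix an arbitrary $\alpha \in \mathbb{C}$ and put
$V_\alpha := \{g \in B_b(\Omega) \mid \alpha g \in V\}$. Then $C(\Omega) \subseteq V_\alpha$ and $V_\alpha$ satisfies (b), hence $V \subseteq V_\alpha$, i.e.,

$$g \in V \implies \alpha g \in V.$$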

Claim 2: The step functions are contained in V .


From Claim 1 we already know that $V$ is a vector space, hence it suffices to show $\chi_E \in V$ for any
Borel set $E \in \mathcal{B}(\Omega)$. Let $\Delta := \{E \in \mathcal{B}(\Omega) \mid \chi_E \in V\}$.
If $E \subseteq \Omega$ is open (in the subspace topology of $\Omega$), then there is a sequence $(f_n)$ of functions
$f_n \in C(\Omega)$ with $0 \le f_n \le 1$ and converging pointwise to $\chi_E$. (Choose a compact exhaustion
$E = \bigcup_{n \in \mathbb{N}} A_n$ and let $f_n$ be an Urysohn function for the closed pair $A_n$ and $\Omega \setminus E$.) Property (b)
applied to $(f_n)$ yields $\chi_E \in V$, thus $E \in \Delta$. Therefore $\Delta$ contains the topology of $\Omega$, which is a
generating system for the Borel sigma algebra $\mathcal{B}(\Omega)$ stable under finite intersections.
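
One concrete way to produce such a sequence $(f_n)$ uses the distance function. This is a sketch of one possible choice, not necessarily the construction intended in the notes; the symbol $d$ below is introduced only for this illustration and should not be confused with the eigenvalue sequence used later.

```latex
% Assumption: E \neq \Omega, so that \Omega \setminus E is nonempty and closed.
% (For E = \Omega we have \chi_E = 1 \in C(\Omega) directly.)
\[
  d(t) := \operatorname{dist}(t, \Omega \setminus E) = \inf\{ |t - s| : s \in \Omega \setminus E \},
  \qquad
  f_n(t) := \min\{ 1,\ n\, d(t) \} \quad (n \in \mathbb{N}).
\]
% d is continuous, d(t) = 0 for t \notin E and d(t) > 0 for t \in E, hence
\[
  f_n \in C(\Omega), \qquad 0 \le f_n \le 1, \qquad
  \lim_{n \to \infty} f_n(t) = \chi_E(t) \quad \text{for every } t \in \Omega .
\]
% This corresponds to the compact exhaustion A_n := \{ t \in \Omega : d(t) \ge 1/n \}.
```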

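By the theory of Dynkin systems from measure theory ([Els11, Kapitel I, §6.2] or [Bau01, Chapter
I, §2]), we obtain $\Delta = \mathcal{B}(\Omega)$, if we prove the following two properties:
(1) $E \in \Delta \Rightarrow \Omega \setminus E \in \Delta$,
(2) if $(E_n)$ is a sequence of pairwise disjoint sets in $\Delta$, then $E := \bigcup_{n \in \mathbb{N}} E_n \in \Delta$.

Property (1) holds, since $1$ and $\chi_E$ belong to $V$ and therefore also $\chi_{\Omega \setminus E} = 1 - \chi_E$. To show (2) we
simply note that $\chi_E = \sum_{n} \chi_{E_n}$ in the sense of pointwise convergence, where the partial sums belong
to $V$ by Claim 1 and are bounded by $1$; hence (b) implies $\chi_E \in V$.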

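It follows from Claims 1 and 2 that $V$ is $\|\cdot\|_\infty$-dense in $B_b(\Omega)$, since every bounded Borel measurable
function is a uniform limit of step functions. By property (b), every $S \in \mathscr{S}$ is
a closed subset of $B_b(\Omega)$ (a uniformly convergent sequence is bounded and converges pointwise), hence $V$ is also closed and therefore $V = B_b(\Omega)$.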

Sketch of the construction in 1.2 for compact normal operators

The unitary equivalence with a multiplication operator can be constructed along the following
lines: There is an orthonormal (possibly finite) sequence of eigenvectors of $T$, say $W =
\{w_1, w_2, \ldots\} = \{w_j \mid j \in N\}$, where $N = \mathbb{N}$ or $N = \{1, \ldots, m\}$ for some $m \in \mathbb{N}$, corresponding to
the sequence $d \colon N \to \mathbb{C}$ of all non-zero eigenvalues $d(1), d(2), \ldots$ (with multiplicities) of $T$, which
is either finite or converges to $0$ ([Wer18, Theorem VI.3.2]), such that $H = \ker(T) \oplus \overline{\operatorname{span}}(W)$
(orthogonal direct sum). If $\ker(T) \neq \{0\}$, extend $W$ by an orthonormal basis $K = \{y_r \mid r \in R\}$
of $\ker(T)$, where $R \cap N = \emptyset$, to obtain a complete orthonormal system $B = W \cup K$ of $H$ and write
$B = \{b_s \mid s \in S\}$, where $S = N \cup R$ and $b_s := w_j$, if $s = j \in N$, and $b_s := y_r$, if $s = r \in R$.
Let $\{e_s \mid s \in S\}$ be the complete orthonormal system in $l^2(S)$ given by $e_s(t) = 1$, if $t = s$, and $e_s(t) = 0$
otherwise, and define $U$ by unique (uniformly continuous) extension of $U e_s := b_s$ ($s \in S$) to
$l^2(S) = \overline{\operatorname{span}}(\{e_s \mid s \in S\})$. Then $T U e_s = 0$, if $s \in R$, and $T U e_s = d(s)\, b_s$, if $s \in N$, and therefore
we have for any $x \in l^2(S)$, $(U^{-1} T U x)(s) = 0 = 0 \cdot x(s)$, if $s \in R$, and $(U^{-1} T U x)(s) = d(s)\, x(s)$,
if $s \in N$; in other words, $(U^{-1} T U x)(s) = h(s)\, x(s)$, where $h \in l^\infty(S)$ is given by $h|_R = 0$ and
$h|_N = d$.
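
A finite-dimensional instance may help to fix the notation. The matrix below is a hypothetical example on $H = \mathbb{C}^3$ chosen by us; it is self-adjoint (hence normal and, trivially, compact), and the index $r$ labels the single kernel direction.

```latex
% Hypothetical example on H = C^3 (not from the notes).
\[
  T = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad
  w_1 = \tfrac{1}{\sqrt{2}} (1, 1, 0)^{\top},\ d(1) = 1, \qquad
  w_2 = \tfrac{1}{\sqrt{2}} (1, -1, 0)^{\top},\ d(2) = -1, \qquad
  y_r = (0, 0, 1)^{\top} \in \ker(T).
\]
% With N = \{1, 2\}, R = \{r\}, S = \{1, 2, r\} and U e_s := b_s (columns w_1, w_2, y_r):
\[
  U = \begin{pmatrix}
        \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} & 0 \\
        \tfrac{1}{\sqrt{2}} & -\tfrac{1}{\sqrt{2}} & 0 \\
        0 & 0 & 1
      \end{pmatrix},
  \qquad
  U^{-1} T U = \operatorname{diag}(1, -1, 0) = \operatorname{diag}\big( h(1), h(2), h(r) \big).
\]
```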

We restate and prove Theorem 1.5

Theorem (Continuous functional calculus): If $T \in L(H)$ is self-adjoint, then there is a
unique map $\Phi \colon C(\sigma(T)) \to L(H)$ with the following properties:

(a) $\Phi(\mathrm{id}) = T$ and $\Phi(1) = I$,

(b) $\Phi$ is an involutive algebra homomorphism, i.e.,

• $\Phi$ is $\mathbb{C}$-linear,

• $\forall f, g \in C(\sigma(T))$: $\Phi(f \cdot g) = \Phi(f) \cdot \Phi(g)$,

• $\Phi(\bar{f}) = \Phi(f)^*$,

(c) $\Phi$ is continuous with respect to the norm $\|\cdot\|_\infty$ on $C(\sigma(T))$, in fact, isometric,
i.e., $\|\Phi(f)\| = \|f\|_\infty$.
We write $f(T)$ instead of $\Phi(f)$ and call $f \mapsto f(T)$ the continuous functional calculus of $T$.
Proof: Uniqueness: Because of (c) and thanks to the Stone-Weierstraß theorem applied to the
compact subset $\sigma(T) \subseteq \mathbb{R}$ (cf., e.g., [Wer18, Satz VIII.4.7]), $\Phi$ is determined by the values on
polynomial functions; then by linearity, the knowledge of $\Phi(\mathrm{id}^n)$ suffices; finally, by multiplicativity
according to (b), this boils down to fixing $\Phi(\mathrm{id})$ and $\Phi(1)$, which are determined by (a).
Existence: If $f \in C(\sigma(T))$ is the restriction of a polynomial function $p(t) = \sum_{k=0}^{n} a_k t^k$, then we
put $\Phi_0(f) := p(T) = \sum_{k=0}^{n} a_k T^k$. Before we proceed, we have to show that $\Phi_0$ is well-defined⁶,
i.e., if two polynomials $p$ and $q$ satisfy $p|_{\sigma(T)} = q|_{\sigma(T)}$, then $p(T) = q(T)$. This will be clarified
once we have shown the following two claims for a polynomial $p$:
(i) $\sigma(p(T)) = \{p(\lambda) \mid \lambda \in \sigma(T)\} = p(\sigma(T))$,
(ii) $\|p(T)\| = \sup_{\lambda \in \sigma(T)} |p(\lambda)| = \|p|_{\sigma(T)}\|_\infty$.

Proof of (i): This is obvious for constant polynomials, hence we may suppose $\deg(p) \ge 1$.
Let $\lambda \in \sigma(T)$. We may write $p(t) - p(\lambda) = (t - \lambda) h(t)$ with some polynomial $h \neq 0$ and
obtain $p(T) - p(\lambda) = (T - \lambda) h(T)$. Due to the commutativity of $\lambda - T$ with $h(T)$ and $p(T) - p(\lambda)$,
continuous invertibility of $p(T) - p(\lambda)$ would then imply the same for $T - \lambda$, hence $p(\lambda) \in \sigma(p(T))$.
Thus, $p(\sigma(T)) \subseteq \sigma(p(T))$.
Let $\mu \in \sigma(p(T))$, then $\deg(p - \mu) \ge 1$ and we have a polynomial factorization $p(t) - \mu = c\,(t - \lambda_1) \cdots (t - \lambda_m)$
with $\lambda_1, \ldots, \lambda_m \in \mathbb{C}$ and $c \in \mathbb{C} \setminus \{0\}$, which yields $p(T) - \mu = c\,(T - \lambda_1) \cdots (T - \lambda_m)$.
If $\lambda_j$ belonged to $\rho(T)$ for every $j = 1, \ldots, m$, then $p(T) - \mu$ would be continuously invertible, a
contradiction; hence $\lambda_j \in \sigma(T)$ for some $j$ and therefore $\mu = p(\lambda_j) \in p(\sigma(T))$. Thus, $\sigma(p(T)) \subseteq
p(\sigma(T))$.
Proof of (ii): Elementary calculations show that $p \mapsto p(T)$ is an involutive algebra homomorphism
from polynomials into $L(H)$. Note that $p(T)$ is normal, since $p(T)^* p(T) = \bar{p}(T)\, p(T) = (\bar{p} p)(T) =
(p \bar{p})(T) = p(T)\, p(T)^*$, and thus has norm equal to its spectral radius ([Wer18, Satz VI.1.7]). We
obtain

$$\|p(T)\| = \sup\{|\mu| \mid \mu \in \sigma(p(T))\} = \sup\{|p(\lambda)| \mid \lambda \in \sigma(T)\} = \sup_{\lambda \in \sigma(T)} |p(\lambda)|.$$

We have now shown that the map $\Phi_0$ is well-defined on the space $V_0$ of polynomial functions on
$\sigma(T)$ (if $p|_{\sigma(T)} = q|_{\sigma(T)}$, then $\|p(T) - q(T)\| = \|(p - q)|_{\sigma(T)}\|_\infty = 0$ by Claim (ii))
and gives an involutive algebra homomorphism into $L(H)$ which is isometric due to Claim
(ii), since $\|\Phi_0(f)\| = \|f\|_\infty$ for any $f \in V_0$; in particular, $\Phi_0$ is continuous. Let $\Phi$ denote the
unique continuous linear extension of $\Phi_0$ to $C(\sigma(T))$ (making use of the density of $V_0$ due to the
Stone-Weierstraß theorem). That $\Phi$ is isometric follows directly from the density of the polynomial
functions and Claim (ii) above. It remains to show that $\Phi$ is involutive and multiplicative, which
follows from routine arguments using limits of polynomials and the fact that $\Phi_0$ is already known
to be involutive and multiplicative.
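
The two claims and the question of well-definedness can be illustrated by a small matrix example. The operator below is a hypothetical choice of ours on $H = \mathbb{C}^2$; its spectrum is a finite subset of $\mathbb{R}$, so different polynomials may have the same restriction to $\sigma(T)$.

```latex
% Hypothetical example on H = C^2 (not from the notes).
\[
  T = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad T = T^*, \qquad \sigma(T) = \{-1, 1\}.
\]
% Well-definedness: p(t) = t^2 and q(t) = 1 agree on \sigma(T), and indeed
\[
  p(T) = T^2 = I = q(T).
\]
% Claims (i) and (ii) for p(t) = t + 1:
\[
  p(T) = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \qquad
  \sigma(p(T)) = \{0, 2\} = p(\sigma(T)), \qquad
  \|p(T)\| = 2 = \sup_{\lambda \in \sigma(T)} |p(\lambda)| .
\]
% A non-polynomial continuous function: f(t) = |t| equals 1 on \sigma(T), hence f(T) = I.
```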

⁶ Note that $\sigma(T)$ can be a finite discrete subset of $\mathbb{R}$, so that distinct polynomials may have the same restriction to $\sigma(T)$.

Bibliography
[Ara99] H. Araki: Mathematical Theory of Quantum Fields.
Oxford University Press 1999.
[Bau90] H. Bauer: Maß- und Integrationstheorie.
De Gruyter 1990.
[Bau01] H. Bauer: Measure and Integration Theory.
De Gruyter 2001.
[Bog07] V. I. Bogachev: Measure Theory. Volume I.
Springer-Verlag 2007.
[Coh80] D. L. Cohn: Measure Theory.
Birkhäuser 1980.
[Con16] A. Constantin: Fourier Analysis. Part I – Theory.
Cambridge University Press 2016.
[Con10] J. B. Conway: A Course in Functional Analysis.
Springer Science+Business Media, 2nd edition 2010.
[Els11] J. Elstrodt: Maß- und Integrationstheorie.
Springer-Verlag, 7. Auflage 2011.
[Fol99] G. B. Folland: Real Analysis.
Wiley, 2nd edition 1999.
[Hal13] B. C. Hall: Quantum Theory for Mathematicians.
Springer-Verlag 2013.
[Hoe21] G. Hörmann: Funktionalanalysis.
Vorlesung, Universität Wien, Sommersemester 2021.
http://www.mat.univie.ac.at/~gue/material.html
[Hoe20] G. Hörmann: Grundbegriffe der Topologie.
Vorlesung, Universität Wien, Wintersemester 2020/21.
http://www.mat.univie.ac.at/~gue/material.html
[Kab14] W. Kaballo: Aufbaukurs Funktionalanalysis und Operatortheorie.
Springer-Verlag 2014.
[KR] R. V. Kadison and J. R. Ringrose: Fundamentals of the theory of operator algebras.
Volumes I and II.
Academic Press, 1983 and 1986.

[MV92] R. Meise and D. Vogt: Einführung in die Funktionalanalysis.
Vieweg Verlag 1992.
[RS75] M. Reed and B. Simon: Methods of modern mathematical physics.
Volume II: Fourier analysis, self-adjointness.
Academic Press 1975.
[Rud86] W. Rudin: Real and Complex Analysis.
McGraw-Hill, 3rd edition 1986.
[Sch71] H. Schaefer: Topological Vector Spaces.
Springer-Verlag 1971.
[Sch00] H. Schröder: Funktionalanalysis.
Harri Deutsch, 2. Auflage 2000.
[Tes14] G. Teschl: Topics in Real and Functional Analysis.
Lecture Notes, University of Vienna, winter term 2013/14.
http://www.mat.univie.ac.at/~gerald/ftp/book-fa/index.html
[Tes14b] G. Teschl: Mathematical Methods in Quantum Mechanics.
With Applications to Schrödinger Operators.
American Mathematical Society, 2nd edition 2014.
[Thi03] W. Thirring: Quantum Mathematical Physics:
Atoms, Molecules and Large Systems.
Springer-Verlag, 2nd edition 2003.
[Tre67] F. Trèves: Topological vector spaces, distributions and kernels.
Academic Press 1967.
[Wei00] J. Weidmann: Lineare Operatoren in Hilberträumen: Teil I: Grundlagen.
Teubner-Verlag 2000.
[Wer18] D. Werner: Funktionalanalysis.
Springer-Verlag, 8. Auflage 2018.
[Wil70] S. Willard: General Topology.
Addison-Wesley 1970.

