Advanced Functional Analysis: Lecture Notes
Günther Hörmann
Fakultät für Mathematik
Universität Wien
Sommersemester 2023
(Corrected and updated version as of May 23, 2024)
Preface
These lecture notes draw a substantial part of their material and style, as well as their logical build-up, from Chapters VII and VIII of Dirk Werner’s excellent textbook [Wer18] (in German). For the prerequisites, which correspond to an introductory course on functional analysis at our faculty, we refer to Chapters I–VI in [Wer18] or to [Hoe21], both in German, or for English texts to the corresponding chapters in [Con10, Con16, Tes14]. (The latter sources also contain many further aspects and material.)
In the course of the semester we might occasionally provide hints to supplementary concepts, examples, or further applications not covered in these notes. In fact, these notes certainly do not replace a book on the subject and are particularly sparse with intermediate, explanatory, or motivating text in between mathematical statements. However, Sections 1-7 as presented in the notes define the compulsory material for the exam (thus excluding Section 0 and the appendix).
Many thanks go to Michael Kunzinger for several suggestions and corrections regarding a
previous version of these notes and to Nobuya Kakehashi for several subtle mathematical
comments leading to further improvements later on.
Günther Hörmann
Contents
2. Unbounded operators
Appendix
Bibliography
Part I.
0. Before we begin . . .
Notation and conventions: $\mathbb{N} = \{1, 2, \ldots\}$, $H$ denotes a complex Hilbert space with scalar product $\langle \cdot, \cdot \rangle$ (conjugate-linear in its second argument), $L(H)$ is the space of bounded linear operators on $H$; sequences $(f_n)_{n \in \mathbb{N}}$ are often written simply in the form $(f_n)$.
If $S \in L(H)$, then its adjoint $S^* \in L(H)$ can be characterized by the condition $\langle Sx, y \rangle = \langle x, S^* y \rangle$ for all $x, y \in H$; we have $S^{**} := (S^*)^* = S$. The kernel $\ker(S) := \{x \in H \mid Sx = 0\}$ is always a closed subspace of $H$, while the range (or image) $\operatorname{ran}(S) := \{Sx \mid x \in H\}$ is a subspace of $H$ that is not necessarily closed. A basic relation involving these is $\operatorname{ran}(S)^\perp = \ker(S^*)$, hence also $\ker(S) = \operatorname{ran}(S^*)^\perp$ and $\overline{\operatorname{ran}(S)} = \operatorname{ran}(S)^{\perp\perp} = \ker(S^*)^\perp$.
Remark: Property (i) is, in fact, also sufficient for self-adjointness of a bounded operator
on complex Hilbert spaces (cf. [Wer18, Satz V.5.6]).
(ii): For any $x \in H$ with $\|x\| \le 1$, we have $|\langle Tx, x \rangle| \le \|Tx\| \cdot \|x\| \le \|T\| \cdot \|x\|^2 \le \|T\|$, hence it remains to show that $\|T\| \le \sup_{\|x\| \le 1} |\langle Tx, x \rangle| =: M$.
By elementary maneuvering,
Therefore,
$$\forall x \in H : \|Tx\| \ge c \|x\|$$
such that $|l_y(x)| = |B(x, y)| \le C_x \|y\|$. Thus, the family $A := \{l_y \in H' \mid y \in H, \|y\| \le 1\}$ of linear functionals is pointwise bounded on $H$, since $\sup_{\|y\| \le 1} |l_y(x)| \le C_x$. By the Banach-Steinhaus theorem (or uniform boundedness principle), $A$ is bounded in $H'$, i.e., there is an $M \ge 0$ such that $\|l_y\| \le M$ for all $y \in H$ with $\|y\| \le 1$. In other words,
The spectrum $\sigma(T) := \mathbb{C} \setminus \rho(T)$ is a compact non-empty subset of $\mathbb{C}$. We have $\sigma(T^*) = \{\bar\lambda \mid \lambda \in \sigma(T)\}$, since $((\lambda - T)^{-1})^* = ((\lambda - T)^*)^{-1} = (\bar\lambda - T^*)^{-1}$.
If $|\lambda| > \|T\|$ we can apply the Neumann series to find $(\lambda - T)^{-1} = (I - T/\lambda)^{-1}/\lambda$, thus $\sigma(T) \subseteq \{\lambda \in \mathbb{C} \mid |\lambda| \le \|T\|\}$.
We obtain a slightly sharper “encirclement” of the spectrum by the spectral radius $r(T) := \inf_{n \in \mathbb{N}} \|T^n\|^{1/n} = \lim_{n \to \infty} \|T^n\|^{1/n}$, which clearly satisfies $r(T) \le \|T\|$ (since $\|T^n\|^{1/n} \le (\|T\|^n)^{1/n} = \|T\|$), namely
$$r(T) = \max\{|\lambda| \mid \lambda \in \sigma(T)\}.$$
If $T$ is normal, then $r(T) = \|T\|$. In particular, we can then always find a spectral value $\lambda \in \sigma(T)$ with $|\lambda| = \|T\|$.
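The gap between $r(T)$ and $\|T\|$ for non-normal operators can be observed numerically. The following minimal Python sketch is not part of the original notes; the helper names `matmul`, `op_norm`, `gelfand` and the two sample matrices are assumptions chosen for illustration. It evaluates $\|T^n\|^{1/n}$ for real $2 \times 2$ matrices, where the operator norm is the square root of the largest eigenvalue of $A^T A$ (a closed formula that is specific to this small case).

```python
import math

def matmul(a, b):
    """Product of two 2x2 real matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def op_norm(a):
    """Operator norm of a real 2x2 matrix: sqrt of the largest eigenvalue of A^T A."""
    at = [[a[0][0], a[1][0]], [a[0][1], a[1][1]]]
    g = matmul(at, a)                          # Gram matrix A^T A
    tr = g[0][0] + g[1][1]
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    disc = max(tr * tr / 4.0 - det, 0.0)       # guard against rounding below zero
    return math.sqrt(tr / 2.0 + math.sqrt(disc))

def gelfand(a, n):
    """Return ||A^n||^(1/n), the n-th term in the spectral radius formula."""
    p = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        p = matmul(p, a)
    return op_norm(p) ** (1.0 / n)

# Hypothetical sample matrices (not taken from the notes):
jordan = [[1.0, 4.0], [0.0, 1.0]]      # not normal, spectrum {1}, hence r = 1
symmetric = [[2.0, 1.0], [1.0, 2.0]]   # self-adjoint, spectrum {1, 3}, hence r = 3

for n in (1, 10, 50, 200):
    print(n, gelfand(jordan, n), gelfand(symmetric, min(n, 10)))
```

For the non-normal matrix the values $\|T^n\|^{1/n}$ decrease slowly towards $r(T) = 1$ although $\|T\| > 4$, whereas for the symmetric matrix they agree with $\|T\| = 3$ up to rounding.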
A spectral value $\lambda \in \mathbb{C}$ is defined by the failure of $(\lambda - T)^{-1}$ to exist in $L(H)$. Note that, due to the open mapping principle, the bijectivity of $(\lambda - T) : H \to H$ already implies continuity of its inverse. Hence the “cases of failure” can be separated into the following three classes:
1. $\lambda - T$ fails to be injective (it has a nontrivial kernel and thus $\lambda$ is an eigenvalue),
2. $\lambda - T$ is injective, not surjective, and $\operatorname{ran}(\lambda - T)$ is dense in $H$,
3. $\lambda - T$ is injective and $\operatorname{ran}(\lambda - T)$ is not dense in $H$ (hence $\lambda - T$ is also not surjective).
Accordingly, we have the following decomposition of the spectrum (as a disjoint union)
(i) λ ∈ σ(T ),
(ii) ⇒ (i): If $\lambda \notin \sigma(T)$, then $(\lambda - T)^{-1} : H \to H$ is bounded and hence there is some $M > 0$ such that $\|(\lambda - T)^{-1} z\| \le M \|z\|$ holds for all $z \in H$. Equivalently, upon putting $z = (\lambda - T)x$, we have $\|x\| \le M \|(\lambda - T)x\|$ for all $x \in H$, which contradicts (ii), since this means $\|\lambda x - Tx\| \ge 1/M > 0$ for every $x \in H$ with $\|x\| = 1$.
Here we collect a few measure theoretic notions and results (available also, e.g., in Appendix A and Chapters I, II, VII of [Wer18]); more on the measure theoretic background can be found in the excellent textbooks [Bau90, Els11] (in German) or [Bau01, Coh80, Con16, Fol99, Rud86] (in English). Prerequisites from basic topology courses as in [Hoe20] (contained in English in the book [Wil70]) will be used throughout the course without special notice. We denote by $P(\Omega)$ the power set of the set $\Omega$.
(A) Measures
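(i) $\emptyset \in \Sigma$,
(ii) $A \in \Sigma \Rightarrow \Omega \setminus A \in \Sigma$,
(iii) $A_j \in \Sigma$ ($j \in \mathbb{N}$) $\Rightarrow \bigcup_{j \in \mathbb{N}} A_j \in \Sigma$.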
0.8. Definition: Let Ω be a topological space. The Borel sigma algebra on Ω is the
smallest sigma algebra B(Ω) containing the topology, i.e., the system of open subsets of
Ω. The elements in B(Ω) are called Borel sets.
Clearly, $B(\Omega)$ contains all closed subsets, and in case of a Hausdorff space also all compact subsets. In $\mathbb{R}$ every interval is a Borel set, in $\mathbb{C} \cong \mathbb{R}^2$ any product of two intervals belongs to $B(\mathbb{C})$. For more on these issues see [Els11, Kapitel I, §4].
0.9. Definition: Let Σ be a sigma algebra on the set Ω. A measure is a map µ : Σ → [0, ∞]
satisfying
(i) µ(∅) = 0,
The two simplest (nontrivial) examples of measures on an arbitrary set $\Omega$ with sigma algebra $P(\Omega)$ are the Dirac measure $\delta_p$ concentrated at $p \in \Omega$ with $\delta_p(A) = 1$, if $p \in A$, $\delta_p(A) = 0$ otherwise, and the counting measure $\mu$ with $\mu(A) = \infty$, if $A$ is an infinite subset of $\Omega$, and $\mu(A)$ equal to the number of elements of $A$, if $A$ is finite.
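The two examples above can be sketched in a few lines of Python (an illustration, not part of the original notes). The names `dirac` and `counting` are hypothetical, finite subsets of $\mathbb{N}$ are modelled as Python sets, and, purely as an assumption of this sketch, an infinite set is represented by `None`.

```python
import math

def dirac(p):
    """Dirac measure concentrated at p: A -> 1 if p in A, else 0."""
    return lambda A: 1 if p in A else 0

def counting(A):
    """Counting measure; an infinite set is represented by None in this sketch."""
    return math.inf if A is None else len(A)

delta = dirac(3)
A, B = {1, 2, 3}, {4, 5}            # two disjoint finite subsets of N

# Additivity on the disjoint union A ∪ B for both measures
print(delta(A | B), delta(A) + delta(B))
print(counting(A | B), counting(A) + counting(B))
```

Only finite additivity is checked here; sigma-additivity concerns countably infinite unions and is not something a finite computation can verify.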
0.10. Theorem: For every $d \in \mathbb{N}$ there is a unique measure (defined at least) on the Borel sigma algebra $B(\mathbb{R}^d)$ which is translation invariant and assigns the value $(b_1 - a_1) \cdots (b_d - a_d)$ to the product of closed bounded intervals $[a_1, b_1] \times \cdots \times [a_d, b_d]$. This measure is called the $d$-dimensional Lebesgue measure.
For any $A \subseteq \Omega$ we have the characteristic function (or indicator function) of $A$, defined by $\chi_A(q) = 1$, if $q \in A$, and $\chi_A(q) = 0$, if $q \notin A$. Clearly, $\chi_A$ is measurable if and only if $A \in \Sigma$.
Integral of step functions: A simple function or step function (or an elementary function) on $\Omega$ is a function of the form
$$f = \sum_{j=1}^{m} c_j \chi_{A_j},$$
Brief review of the construction of $L^p$-spaces: Let $(\Omega, \Sigma, \mu)$ be a measure space. If $1 \le p < \infty$, we define $L^p(\Omega, \mu)$ as vector space quotient of $\mathcal{L}^p := \{f : \Omega \to \mathbb{C} \text{ measurable} \mid \int_\Omega |f|^p \, d\mu < \infty\}$ […]
[…] subset $\Omega \subseteq \mathbb{R}^d$ and $\mu$ the (restriction of) $d$-dimensional Lebesgue measure is the completion of the space of continuous functions on $\Omega$ with respect to $\|\cdot\|_p$. (More generally, the compactly supported continuous functions on a locally compact Hausdorff space $\Omega$ are dense in $L^p(\Omega, \mu)$, if $\mu$ is a regular Borel measure.)
In case $p = \infty$ we define $\mathcal{L}^\infty(\Omega)$ to be the set of all $\mu$-measurable functions $f : \Omega \to \mathbb{C}$ that are bounded $\mu$-a.e., i.e., there exists a set $N \in \Sigma$ with $\mu(N) = 0$ such that the restriction $f|_{\Omega \setminus N}$ is bounded. On the quotient vector space $L^\infty(\Omega, \mu) := \mathcal{L}^\infty / \mathcal{N}$ we have the norm
$$\|\text{class of } f\|_\infty := \inf\{\sup_{x \in \Omega \setminus N} |f(x)| \mid N \in \Sigma, \mu(N) = 0\}.$$
Note that $\Omega = \mathbb{N}$ and $\mu$ the counting measure gives $l^p(\mathbb{N})$ as special cases.
Let $\Omega \subseteq \mathbb{C}$ and denote by $B_b(\Omega)$ the vector space of bounded Borel measurable functions $f : \Omega \to \mathbb{C}$. We obtain $B_b(\Omega)$ as a closed subspace of the Banach space of all bounded functions on $\Omega$ equipped with the supremum norm $\|f\|_\infty := \sup_{x \in \Omega} |f(x)|$, since measurability is preserved even under pointwise limits. The proof of the monotone pointwise approximation of non-negative measurable functions shows, in fact, that a bounded measurable function can be uniformly approximated by step functions (cf. [Els11, Kapitel III, Korollar 4.14(a)]), i.e., the step functions are dense in the Banach space $(B_b(\Omega), \|\cdot\|_\infty)$.
We state a technical lemma (cf. [Wer18, Lemma VII.1.5]) that will be useful in constructing
a measurable functional calculus for self-adjoint operators on Hilbert spaces. (A proof is
given in the appendix.)
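0.13. Lemma: Let $\Omega \subset \mathbb{C}$ be compact and $(B_b(\Omega), \|\cdot\|_\infty)$ be the Banach space of bounded Borel measurable functions $\Omega \to \mathbb{C}$. Suppose $U \subseteq B_b(\Omega)$ has the following properties:
(a) $C(\Omega) \subseteq U$,
(b) $f_n \in U$ ($n \in \mathbb{N}$), $\sup_{n \in \mathbb{N}} \|f_n\|_\infty < \infty$, and $f(t) := \lim_{n \to \infty} f_n(t)$ exists for every $t \in \Omega$ $\implies f \in U$.
Then $U = B_b(\Omega)$.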
In both cases, µ(∅) = 0 follows from σ-additivity, since we excluded the possibility of
infinite values in the above definition: µ(∅) = µ(∅ ∪ ∅) = µ(∅) + µ(∅). A map µ : Σ → C is
a complex measure if and only if Re µ and Im µ are finite signed measures.
Let $\mu$ be a signed or complex measure. We define the variation $|\mu| : \Sigma \to [0, \infty[$ of $\mu$ by
$$|\mu|(A) := \sup\Big\{ \sum_{k=1}^{n} |\mu(E_k)| \;\Big|\; E_1, \ldots, E_n \in \Sigma \text{ pairwise disjoint}, A = E_1 \cup \ldots \cup E_n \Big\}.$$
The variation $|\mu|$ is a finite (positive) measure and the so-called Jordan decomposition holds for a signed measure: There are finite (positive) measures $\mu^+$ and $\mu^-$ on $\Sigma$ (concentrated on measurable subsets with $|\mu|$-null overlap only) such that $\mu = \mu^+ - \mu^-$ and $|\mu| = \mu^+ + \mu^-$.
For a complex measure $\mu = \mu_1 + i\mu_2$ with real part $\mu_1$ and imaginary part $\mu_2$ the notion of $\mu$-integrability is also defined in terms of $|\mu|$-integrability and the integral is then given by $\int f \, d\mu := \int f \, d\mu_1 + i \int f \, d\mu_2$.
A signed or complex Borel measure is called regular, if the variation |µ| is regular. We
denote by M (Ω) the vector space of all signed or complex regular Borel measures on Ω.
0.16. Lemma: If Ω is a compact metric space or a complete separable metric space or an
open subset of Rd , then every finite Borel measure on Ω is regular. The Lebesgue measure
is regular. (This result is included in Ulam’s theorem [Els11, Kapitel VIII, Satz 1.16].)
Let $\Sigma$ be a sigma algebra on the set $\Omega$. The vector space of signed (or complex) measures on $\Sigma$ becomes a Banach space when equipped with the variation norm $\|\mu\| := |\mu|(\Omega)$ (cf. [Wer18, Abschnitt I.1, Beispiel (j), Seiten 22-24]). If $\Omega$ is a Hausdorff space and $\Sigma = B(\Omega)$, then the set of signed (or complex) regular Borel measures $M(\Omega)$ is a closed subspace of the former ([Els11, Kapitel VIII, Folgerung 2.22.a)]), hence $(M(\Omega), \|\cdot\|)$ is a Banach space.
0.17. Theorem (Riesz representation theorem): Let $\Omega$ be a compact metric space. Then the normed dual $C(\Omega)'$ of $(C(\Omega), \|\cdot\|_\infty)$ is isometrically isomorphic to $(M(\Omega), \|\cdot\|)$ via the map $R : M(\Omega) \to C(\Omega)'$, given by
$$(R\mu)(f) := \int_\Omega f \, d\mu.$$
For a proof we refer to [Wer18, Theorem II.2.5] (or [Els11, Kapitel VIII, §2] and [Bau01]
for more general variants of the theorem).
We introduce a notion that is stronger than plain continuity, weaker than Lipschitz continuity, and provides the perfect setting for a general fundamental theorem of calculus.
0.18. Definition: A function $f : [a, b] \to \mathbb{C}$ is absolutely continuous, if it satisfies the following: $\forall \varepsilon > 0$ $\exists \delta > 0$ such that for any $n \in \mathbb{N}$ and any finite sequence $[a_1, b_1], \ldots, [a_n, b_n]$ of subintervals of $[a, b]$ with $a \le a_1 < b_1 \le a_2 < b_2 \le \ldots \le a_n < b_n \le b$ we have
$$\sum_{k=1}^{n} (b_k - a_k) < \delta \implies \sum_{k=1}^{n} |f(b_k) - f(a_k)| < \varepsilon.$$
The Cantor function and $f : [0, 1] \to \mathbb{R}$, $f(0) := 0$, $f(x) := x \sin(1/x)$ ($x > 0$), are prominent examples of continuous functions which are not absolutely continuous. The function $g : [0, 1] \to \mathbb{R}$, $g(x) := \sqrt{x}$, is absolutely continuous, but not Lipschitz continuous.
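The failure of absolute continuity for $x \sin(1/x)$ can be made concrete with a short Python sketch (an illustration, not part of the original notes; the function names and the index range are assumptions). It uses the points $x_k = 1/((k + \tfrac12)\pi)$, where $\sin(1/x_k) = \pm 1$, and the non-overlapping intervals $[x_{k+1}, x_k]$ for $k = K, \ldots, N$: their total length stays below $x_K$, while the sum of the increments $|f(x_k) - f(x_{k+1})| = x_k + x_{k+1}$ grows like a harmonic series.

```python
import math

def f(x):
    """f(0) = 0, f(x) = x * sin(1/x) for x > 0."""
    return x * math.sin(1.0 / x) if x > 0 else 0.0

def node(k):
    """Point x_k = 1 / ((k + 1/2) * pi), where sin(1/x) equals +1 or -1."""
    return 1.0 / ((k + 0.5) * math.pi)

def length_and_variation(K, N):
    """Total length and sum of |f(b_k) - f(a_k)| over the intervals [x_{k+1}, x_k], k = K..N."""
    total_len = 0.0
    total_var = 0.0
    for k in range(K, N + 1):
        a_k, b_k = node(k + 1), node(k)
        total_len += b_k - a_k
        total_var += abs(f(b_k) - f(a_k))
    return total_len, total_var

# Hypothetical choice of indices: tiny total length, yet increments summing to more than 1
length, variation = length_and_variation(1000, 10000)
print(length, variation)
```

Increasing `K` makes the total length as small as desired, and a sufficiently large `N` still pushes the sum of increments above any fixed $\varepsilon$, so no $\delta$ can work in Definition 0.18.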
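0.19. Theorem (Fundamental theorem of calculus): A function $f : [a, b] \to \mathbb{C}$ is absolutely continuous if and only if it is differentiable almost everywhere, $f'$ defines a Lebesgue integrable function on $[a, b]$, and for every $t \in [a, b]$
$$f(t) = f(a) + \int_a^t f' \, d\mu,$$
where $\mu$ denotes the one-dimensional Lebesgue measure and $\int_a^t f' \, d\mu$ means $\int \chi_{[a,t]} f' \, d\mu$. (The integral formula is part of the characterization: the Cantor function is differentiable almost everywhere with integrable derivative $0$, but is not absolutely continuous.)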
We may even extend the formula of integration by parts to the case of absolutely continuous
functions (see [Els11, Kapitel VII, 4.16] for (a) and [Bog07, 5.8.43] for (b)).
0.20. Proposition (Integration by parts): (a) If $f$ and $g$ are absolutely continuous functions $[a, b] \to \mathbb{C}$, then
$$\int_a^b f' g \, d\mu = f(b)g(b) - f(a)g(a) - \int_a^b f g' \, d\mu.$$
(b) Suppose $f$ and $g$ are functions on $\mathbb{R}$ that are absolutely continuous on bounded intervals and such that $f, f'g, fg' \in L^1(\mathbb{R})$, then
$$\int_{-\infty}^{\infty} f' g \, d\mu = - \int_{-\infty}^{\infty} f g' \, d\mu.$$
(We refer to [Els11, Kapitel V, §3, Unterabschnitt 1] for a proof and to [Els11, Kapitel V,
§4] for further variants of transformation formulae.)
1. The spectral theorem for bounded self-adjoint operators
1.1. The finite-dimensional case: If $H = \mathbb{C}^n$, equipped with the standard inner product, and $T$ is a self-adjoint (or normal) operator on $\mathbb{C}^n$, then we know from linear algebra that there is an orthonormal basis $B$ of $\mathbb{C}^n$ consisting of eigenvectors of $T$, i.e., the matrix of $T$ with respect to the basis $B$ is diagonal and the eigenvalues of $T$, with their multiplicities, occur as the entries along the diagonal. Identifying vectors $v$ in $\mathbb{C}^n$ with functions $v : \{1, 2, \ldots, n\} \to \mathbb{C}$, $v_l := v(l)$ being the $l$-th component of $v$, a diagonal matrix $D$ with diagonal entries $d(1), \ldots, d(n) \in \mathbb{C}$ acts on $v$ as a multiplication operator, because $(D \cdot v)(l) = d(l)v(l)$ for $l = 1, \ldots, n$; thus, we may state that the unitary transformation $U : \mathbb{C}^n \to \mathbb{C}^n$ associated with the orthonormal basis $B$ (used as column vectors of $U$) maps $T$ to a multiplication operator $D$ on $l^2(\{1, \ldots, n\}) \cong \mathbb{C}^n$ via
$$D = U^{-1} T U. \tag{1.1}$$
In this finite-dimensional case, the spectrum $\sigma(T) = \{\lambda_1, \ldots, \lambda_m\}$ is the set of pairwise distinct eigenvalues. Let $E_j$ denote the orthogonal projection onto the eigenspace of $\lambda_j$ ($j = 1, \ldots, m$), then we may write the diagonal representation of $T$ in the abstract form
$$T = \sum_{j=1}^{m} \lambda_j E_j. \tag{1.2}$$
For any polynomial function $p$ on $\mathbb{C}$ we easily derive the following formula (using $E_j E_k = 0$, if $j \ne k$, and $E_j^l = E_j$ for every $l \in \mathbb{N}$):
$$p(T) = \sum_{j=1}^{m} p(\lambda_j) E_j.$$
Nothing prevents us from using the scheme of this formula to define $f(T)$ for an arbitrary function $f : \sigma(T) \to \mathbb{C}$, namely
$$f(T) := \sum_{j=1}^{m} f(\lambda_j) E_j.$$
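The definition of $f(T)$ in the finite-dimensional case can be tried out on a small example. The following Python sketch is not part of the original notes: the symmetric matrix `T`, its eigenprojections `E1`, `E3` (computed by hand from the normalized eigenvectors $(1,-1)/\sqrt2$ and $(1,1)/\sqrt2$), and the helper names are assumptions made for illustration.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

# Hypothetical example: T is symmetric with eigenvalues 1 and 3
T = [[2.0, 1.0], [1.0, 2.0]]
E1 = [[0.5, -0.5], [-0.5, 0.5]]   # orthogonal projection onto the eigenspace of 1
E3 = [[0.5, 0.5], [0.5, 0.5]]     # orthogonal projection onto the eigenspace of 3
spectral_data = [(1.0, E1), (3.0, E3)]

def apply_function(f):
    """f(T) := sum_j f(lambda_j) E_j, following the scheme of (1.2)."""
    result = [[0.0, 0.0], [0.0, 0.0]]
    for lam, E in spectral_data:
        result = add(result, scale(f(lam), E))
    return result

print(apply_function(lambda t: t))        # reproduces T, i.e. formula (1.2)
print(apply_function(lambda t: t * t))    # agrees with the matrix product T T
print(apply_function(math.sqrt))          # a square root of T
```

The same function `apply_function` works for any $f$ defined on $\{1, 3\}$, which is the point of the definition: polynomials are recovered as a special case, and functions such as the square root come for free.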
1.2. The case of compact operators: If $T$ is a compact self-adjoint (or normal) operator on the complex Hilbert space $H$, then the analogue of equation (1.2) holds with an infinite sum, convergent in the operator norm ([Wer18, Korollar VI.3.3]) and a multiplication operator analogue of the diagonal matrix representation (1.1) can be constructed via a unitary map $U : l^2(S) \to H$, where $S$ is a set of cardinality equal to the Hilbert dimension of $H$ (i.e., the cardinality of a, hence any, complete orthonormal system in $H$), such that $(U^{-1} T U x)(s) = h(s) x(s)$ ($s \in S$) holds with some $h \in l^\infty(S)$ and for every $x \in l^2(S)$. (We give a sketch of the construction in the appendix.)
In the current section we discuss the unitary equivalence of a bounded self-adjoint operator $T$ on a complex Hilbert space $H$ with a multiplication operator on an appropriate space $L^2(\Omega, \mu)$. We will also discuss how to define $f(T)$ for a bounded Borel measurable function $f$ on $\sigma(T)$ based on an integral representation generalizing (1.2).
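1.3. Lemma: Let the numerical range of $T \in L(H)$ be $W(T) := \{\langle Tx, x \rangle \mid \|x\| = 1\}$. Then $W(T)$ is a bounded subset of $\mathbb{C}$ satisfying $\sigma(T) \subseteq \overline{W(T)}$.
Proof: Boundedness of $W(T)$ follows from $|\langle Tx, x \rangle| \le \|Tx\| \|x\| \le \|T\| \|x\|^2$. Let $\lambda \in \mathbb{C} \setminus \overline{W(T)}$ and $d$ denote the distance from $\lambda$ to the compact set $\overline{W(T)}$, hence $d > 0$. If $x \in H$ with $\|x\| = 1$, then
$$0 < d \|x\| = d \le |\lambda - \langle Tx, x \rangle| = |\langle (\lambda - T)x, x \rangle| \le \|(\lambda - T)x\| \|x\| = \|(\lambda - T)x\|,$$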
For general $f \in C(\sigma(T))$, one may continuously extend¹ $f$ to a function $\tilde f$ on the real interval $[m(T), M(T)] \supset \sigma(T)$ and then use approximation by polynomials according to the Weierstraß theorem (cf., e.g., [Wer18, Satz I.2.11]). This leads to the so-called continuous functional calculus, which is summarized in the following theorem. Since this construction is often contained already in standard introductory courses on functional analysis, we repeat the technical details of its proof only in the appendix.
1.5. Theorem (Continuous functional calculus): If $T \in L(H)$ is self-adjoint, then there is a unique map $\Phi : C(\sigma(T)) \to L(H)$ with the following properties:
(a) $\Phi(\mathrm{id}) = T$ and $\Phi(1) = I$,
(b) $\Phi$ is an involutive algebra homomorphism, i.e.,
• $\Phi$ is $\mathbb{C}$-linear,
• $\forall f, g \in C(\sigma(T))$: $\Phi(f \cdot g) = \Phi(f) \cdot \Phi(g)$,
• $\Phi(\bar f) = \Phi(f)^*$,
(c) $\Phi$ is continuous with respect to the norm $\|\cdot\|_\infty$ on $C(\sigma(T))$, in fact, isometric, i.e., $\|\Phi(f)\| = \|f\|_\infty$.
We write $f(T)$ instead of $\Phi(f)$ and call $f \mapsto f(T)$ the continuous functional calculus of $T$.
The following list collects a few more properties of the continuous functional calculus.
1.6. Theorem: Let $T \in L(H)$ be self-adjoint and $f \in C(\sigma(T))$, then the following hold:
(i) $\|f(T)\| = \|f\|_\infty$,
(ii) $f \ge 0$ implies that $f(T)$ is positive,
(iii) $x \in H$ and $Tx = \lambda x$ implies $f(T)x = f(\lambda)x$,
(iv) $f(T)$ is normal; $f(T)$ is self-adjoint, if and only if $f$ is real-valued,
(v) the spectral mapping theorem: $\sigma(f(T)) = f(\sigma(T))$.
Moreover, $C^*(T) := \{f(T) \in L(H) \mid f \in C(\sigma(T))\}$ is a closed involutive commutative subalgebra of $L(H)$.
Proof: (i): This is only a restatement of part (c) in the above theorem.
(ii): We may write $f = g^2$ with $g \in C(\sigma(T))$ and $g \ge 0$. We get $f(T) = g(T)g(T)$ and $g(T)^* = \bar g(T) = g(T)$, hence for any $x \in H$,
$$\langle f(T)x, x \rangle = \langle g(T)x, g(T)^* x \rangle = \langle g(T)x, \bar g(T)x \rangle = \langle g(T)x, g(T)x \rangle = \|g(T)x\|^2 \ge 0.$$
¹ With the Euclidean topology, $\mathbb{R}$ and any of its closed subsets are normal topological spaces.
(iii): This is elementary, if $f$ is a polynomial, and extends to general continuous $f$ by uniform polynomial approximation $p_n \to f$ (thanks to the Stone-Weierstraß theorem applied to the compact subset $\sigma(T) \subseteq \mathbb{R}$; cf., e.g., [Wer18, Satz VIII.4.7]), since this implies $p_n(T) \to f(T)$ with convergence in the operator norm.
(iv): $f(T)^* f(T) = (\bar f f)(T) = (f \bar f)(T) = f(T) f(T)^*$; $\bar f = f$ implies $f(T)^* = \bar f(T) = f(T)$, and for the converse note that $f(T) = f(T)^* = \bar f(T)$ in view of (i) yields $\|f - \bar f\|_\infty = \|f(T) - \bar f(T)\| = 0$, hence $f = \bar f$.
(v): For the case of a polynomial function, this relation is elementary (and already established, e.g., in the course of the proof of Theorem 1.5; cf. Claim (i) in the proof given in the appendix). Let $\mu \in \mathbb{C} \setminus f(\sigma(T))$. Then $g := \frac{1}{f - \mu} \in C(\sigma(T))$ and $g \cdot (f - \mu) = (f - \mu) \cdot g = 1$, hence
In the finite-dimensional case with spectral representation (1.2) the spectrum $\sigma(T) = \{\lambda_1, \ldots, \lambda_m\}$ is a discrete topological space and the corresponding projections $E_1, \ldots, E_m$ onto the eigenspaces can be obtained via functional calculus: Let $f : \sigma(T) \to \mathbb{C}$ be defined by $f(\lambda_j) = 0$, if $j \ne k$, and $f(\lambda_k) = 1$, then $E_k = \sum_{j=1}^{m} f(\lambda_j) E_j = f(T)$. Note that $f = \chi_{\{\lambda_k\}}$, the characteristic function of the singleton $\{\lambda_k\}$. In the general case, $\sigma(T)$ will not be discrete and hence a characteristic function $\chi_A$, $A \subset \sigma(T)$, will not be continuous. If we are able to extend the functional calculus appropriately, then $\chi_A(T)$ is a perfect way to produce an orthogonal projection, since $\chi_A^2 = \chi_A$ and $\chi_A$ is real-valued. From characteristic functions we get to step functions by linear combinations and then pointwise limits would bring us into the realm of measurable functions.
thus, $l_{x,y}$ is an element in the dual space of $(C(\sigma(T)), \|\cdot\|_\infty)$ with $\|l_{x,y}\| \le \|x\| \|y\|$. By the Riesz representation theorem (see Theorem 0.17) there is a complex Borel measure $\mu_{x,y}$ on $\sigma(T)$ with $\|\mu_{x,y}\| = \|l_{x,y}\|$, such that
$$\langle f(T)x, y \rangle = \int_{\sigma(T)} f \, d\mu_{x,y} \quad \forall f \in C(\sigma(T)). \tag{1.3}$$
Now we observe that the right-hand side of (1.3) makes sense also for $f \in B_b(\sigma(T))$ and, considering its dependence on $(x, y)$, defines a continuous sesquilinear form $b_f : H \times H \to \mathbb{C}$, since $(x, y) \mapsto \mu_{x,y}$ clearly is sesquilinear $H \times H \to M(\sigma(T))$ as seen from the defining Equation (1.3) and
$$|b_f(x, y)| = \Big| \int_{\sigma(T)} f \, d\mu_{x,y} \Big| \le \|f\|_\infty \|\mu_{x,y}\| = \|f\|_\infty \|l_{x,y}\| \le \|f\|_\infty \|x\| \|y\|. \tag{1.4}$$
By the Lax-Milgram theorem (Theorem 0.3(b)), $b_f$ defines an operator $f(T) \in L(H)$ such that
Thereby the main ingredient for the measurable functional calculus has been constructed, namely
a map Bb (σ(T )) → L(H) given by f 7→ f (T ).
1.8. Theorem (Measurable functional calculus): Let $T \in L(H)$ be self-adjoint, then there is a unique map $\hat\Phi : B_b(\sigma(T)) \to L(H)$ with the following properties (we will typically write $f(T)$ to mean $\hat\Phi(f)$):
(a) $\hat\Phi$ is an involutive algebra homomorphism that extends the continuous functional calculus $\Phi$, i.e., $\hat\Phi|_{C(\sigma(T))} = \Phi$, and is $\|\cdot\|_\infty$-continuous, more precisely, $\|f(T)\| \le \|f\|_\infty$,
(b) if $f_n \in B_b(\sigma(T))$ ($n \in \mathbb{N}$) is such that $\sup_{n \in \mathbb{N}} \|f_n\|_\infty < \infty$ and $f_n(t) \to f(t)$ pointwise for $t \in \sigma(T)$, then $f_n(T) x \to f(T) x$ for all $x \in H$.
Proof: Uniqueness: Suppose $\hat\Psi$ is another measurable functional calculus satisfying (a) and (b). Define $U := \{f \in B_b(\sigma(T)) \mid \hat\Psi(f) = \hat\Phi(f)\}$. Then $C(\sigma(T)) \subseteq U$ and $U$ satisfies also condition (b) in Lemma 0.13, hence $U = B_b(\sigma(T))$, which proves $\hat\Psi = \hat\Phi$.
We easily obtain an intermediate, weaker, variant of (b): If $x, y \in H$ are arbitrary and $f_n$ ($n \in \mathbb{N}$), $f$ are as in the premise of (b), then by the theorem on dominated convergence
$$\langle f_n(T)x, y \rangle = \int f_n \, d\mu_{x,y} \to \int f \, d\mu_{x,y} = \langle f(T)x, y \rangle \quad (n \to \infty), \tag{1.6}$$
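We will first show that $\hat\Phi$ is multiplicative and involutive before getting back to the improvement of (1.6) establishing (b).
$$\langle \hat\Phi(f)(\hat\Phi(g)x), y \rangle = \lim_{n \to \infty} \langle \hat\Phi(f_n)(\hat\Phi(g)x), y \rangle = \lim_{n \to \infty} \langle \hat\Phi(f_n g)x, y \rangle = \langle \hat\Phi(f g)x, y \rangle \quad \forall x, y \in H,$$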
A similar technique is used to show $\hat\Phi(f)^* = \hat\Phi(\bar f)$, which clearly holds for continuous functions: Put $U := \{f \in B_b(\sigma(T)) \mid \hat\Phi(f)^* = \hat\Phi(\bar f)\}$ and note that $C(\sigma(T)) \subseteq U$. If $(f_n)$ is a uniformly bounded sequence in $U$ and converges pointwise to $f$, then
$$\langle \hat\Phi(f)^* x, y \rangle = \langle x, \hat\Phi(f)y \rangle = \overline{\langle \hat\Phi(f)y, x \rangle} = \lim_{n \to \infty} \overline{\langle \hat\Phi(f_n)y, x \rangle} = \lim_{n \to \infty} \langle \hat\Phi(\bar f_n)x, y \rangle = \langle \hat\Phi(\bar f)x, y \rangle$$
holds for arbitrary $x, y \in H$ and hence $f \in U$; thus, $U = B_b(\sigma(T))$ again by Lemma 0.13, i.e., $\hat\Phi$ is involutive.
Finally, we prove (b): Let $f_n$ ($n \in \mathbb{N}$), $f$ be as in the premise of (b) and $x \in H$. By (1.6) we have $f_n(T)x \to f(T)x$ weakly in $H$. By the properties of $\hat\Phi$ we obtain in addition
$$\|f_n(T)x\|^2 = \langle f_n(T)x, f_n(T)x \rangle = \langle f_n(T)^* f_n(T)x, x \rangle = \langle (\bar f_n f_n)(T)x, x \rangle \to \langle (\bar f f)(T)x, x \rangle = \langle f(T)^* f(T)x, x \rangle = \|f(T)x\|^2 \quad (n \to \infty).$$
We may therefore conclude that $f_n(T)x \to f(T)x$ in $H$, since $\|f_n(T)x - f(T)x\|^2 = \|f_n(T)x\|^2 - 2 \operatorname{Re} \langle f_n(T)x, f(T)x \rangle + \|f(T)x\|^2 \to \|f(T)x\|^2 - 2 \operatorname{Re} \langle f(T)x, f(T)x \rangle + \|f(T)x\|^2 = 0$.
1.9. Remark: While $\Phi$ is injective, since $\|\Phi(f)\| = \|f\|_\infty$, its extension $\hat\Phi$ is in general not. For example, we will prove later in this section that for any $\lambda \in \sigma(T)$, $\hat\Phi(\chi_{\{\lambda\}}) \ne 0$ if and only if $\lambda$ is an eigenvalue.
As a first application of the measurable functional calculus we investigate the orthogonal projections obtained from characteristic functions.
(iii) if $A_1, A_2, \ldots$ are pairwise disjoint Borel subsets of $\sigma(T)$ and $x \in H$, then we have with $A := \bigcup_{j=1}^{\infty} A_j$
$$\sum_{j=1}^{\infty} \chi_{A_j}(T)x = \chi_A(T)x,$$
(ii): $\chi_\emptyset = 0 \in C(\sigma(T))$ and $\Phi(0) = 0$ (by linearity); $\chi_{\sigma(T)} = 1$ in $C(\sigma(T))$ and $\Phi(1) = I$.
1.11. Remark: In general, there is no operator norm convergent analogue of (iii), since the projection $\chi_{A_j}(T)$ has operator norm 1, unless it is the zero operator (corresponding to the case $\mu_{x,y}(A_j) = 0$ for all $x, y \in H$).
The previous lemma opens the door to so-called projection-valued measures on the Borel sigma algebra $B(\mathbb{R})$ by considering the map $E : B(\mathbb{R}) \to L(H)$, $A \mapsto \chi_{A \cap \sigma(T)}(T)$, which is an example of the following concept.
(i) $E_\emptyset = 0$, $E_{\mathbb{R}} = I$,
(ii) if $A_1, A_2, \ldots$ are pairwise disjoint Borel sets in $\mathbb{R}$, $A := \bigcup_{j=1}^{\infty} A_j$, and $x \in H$, then
$$\sum_{j=1}^{\infty} E_{A_j} x = E_A x.$$
A spectral measure $E$ has compact support, if there is a compact subset $K \subset \mathbb{R}$ such that $E_K = I$. It is an exercise to deduce $E_A E_B = E_{A \cap B} = E_B E_A$ and monotonicity, i.e., $E_A \le E_B$ (in the sense that $E_B - E_A$ is a positive operator), if $A \subseteq B$.
1.13. Integration with respect to a spectral measure: Let $E$ be a spectral measure. We first define the integral of a step function $h = \sum_{j=1}^{m} \alpha_j \chi_{A_j}$ on $\mathbb{R}$ simply by
$$\int h \, dE := \sum_{j=1}^{m} \alpha_j E_{A_j}$$
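$$\Big\| \Big( \int h \, dE \Big) x \Big\|^2 = \Big\| \sum_{j=1}^{m} \alpha_j E_{A_j} x \Big\|^2 = \sum_{j=1}^{m} |\alpha_j|^2 \|E_{A_j} x\|^2 \le \max_{j=1,\ldots,m} |\alpha_j|^2 \sum_{j=1}^{m} \|E_{A_j} x\|^2 = \|h\|_\infty^2 \Big\| \sum_{j=1}^{m} E_{A_j} x \Big\|^2 = \|h\|_\infty^2 \big\| E_{\bigcup_{j=1}^{m} A_j} x \big\|^2 \le \|h\|_\infty^2 \|x\|^2.$$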
Let $f \in B_b(\mathbb{R})$, then by density of the step functions, there is a uniformly convergent sequence of step functions $h_n \to f$. We have
$$\Big\| \int h_n \, dE - \int h_m \, dE \Big\| = \Big\| \int (h_n - h_m) \, dE \Big\| \le \|h_n - h_m\|_\infty,$$
hence $(\int h_n \, dE)_{n \in \mathbb{N}}$ is a Cauchy sequence in $L(H)$ and possesses a limit; furthermore, this limit is independent of the choice of approximating sequence $(h_n)$, since any mix of such sequences yields a Cauchy sequence of integrals, which cannot have a different accumulation point. Therefore, we may define
$$\int f \, dE := \lim_{n \to \infty} \int h_n \, dE.$$
$$\Big\| \int f \, dE \Big\| \le \|f\|_\infty.$$
Furthermore, a real-valued function $f \in B_b(\mathbb{R})$ gives a self-adjoint operator $\int f \, dE \in L(H)$.
If $E$ has compact support, say $E_K = I$ for a compact set $K \subset \mathbb{R}$, and $f \in B_b(K)$, then we define $\int_K f \, dE := \int \chi_K f \, dE$ (formally, upon extending $f$ to $\mathbb{R}$, e.g., by the value 0 outside $K$); we will often abuse notation by still writing $\int f \, dE$ in this case. The choice of $K$ with $E_K = I$ is not essential, since $A \in B(\mathbb{R})$ with $A \cap K = \emptyset$ implies $E_A = E_A I = E_A E_K = E_{A \cap K} = 0$ and hence $\int \chi_A f \, dE = 0$.
The measurable functional calculus for a self-adjoint operator $T \in L(H)$ provided us with the means to define a compactly supported spectral measure $E$ associated with $T$ by $E_A := \chi_{A \cap \sigma(T)}(T)$ ($A \in B(\mathbb{R})$).
Conversely, if a compactly supported spectral measure $E$ is given, say $E_K = I$ with $K \subset \mathbb{R}$ compact, then any $f \in C(K)$ is bounded and Borel measurable. Hence $T := \int \lambda \, dE_\lambda \in L(H)$ is defined and self-adjoint, since $\mathrm{id}_K$ is real-valued. We will show that the spectral measure of $T$ is exactly $E$ and that the integrals with respect to $E$ give the (uniquely determined) measurable functional calculus of $T$.
1.14. Proposition: Let $E$ be a compactly supported spectral measure and $T = \int \lambda \, dE_\lambda$. Then
Proof: We first prove condition (a) of Theorem 1.8: We already know that $\Psi$ is involutive, linear, and continuous. Multiplicativity is clear in case of step functions, since $\Psi(\chi_A)\Psi(\chi_B) = E_A E_B = E_{A \cap B} = \Psi(\chi_{A \cap B}) = \Psi(\chi_A \chi_B)$, and follows in general by a routine argument employing uniform approximation by step functions. We clearly have $\Psi(\mathrm{id}) = T$ and, by multiplicativity, $\Psi(\mathrm{id}^n) = T^n$ for $n \in \mathbb{N}$. It remains to show $\Psi(1) = I$, then we have that $\Psi$ coincides with the continuous functional calculus on polynomials, hence on all of $C(\sigma(T))$.
Note that $\Psi(1) = \int_{\sigma(T)} 1 \, dE = E_{\sigma(T)}$, thus we need to show that $E_{\sigma(T)} = I$. In fact, we will show $E_{\rho(T) \cap \mathbb{R}} = 0$, which implies $E_{\sigma(T)} = E_{\sigma(T)} + E_{\rho(T) \cap \mathbb{R}} = E_{(\sigma(T) \cup \rho(T)) \cap \mathbb{R}} = E_{\mathbb{R}} = I$.
Here, and also in the sequel, we will simplify notation by writing $E_A$ instead of $E_{A \cap \mathbb{R}}$, if $A \subseteq \mathbb{C}$ is a Borel set.
Choose $a < b$ such that $E_{]a,b]} = I$, i.e., $E_K = I$ for some compact subset $K$ of $]a, b]$. Let $\mu \in \rho(T)$. If $\mu \notin\, ]a, b]$, then choosing an open neighborhood $U$ of $\mu$ with $U \cap K = \emptyset$ yields $E_U = 0$.
It remains to consider the case $\mu \in\, ]a, b]$. Recall that the set of invertible operators is open in $L(H)$ and that inversion is a continuous map on that set. Therefore, given $C := \|(\mu - T)^{-1}\| + 1$, there is $\delta > 0$ such that
We may suppose that $\delta = (b - a)/N$ for some $N \in \mathbb{N}$ and that $\delta < 1/C$.
Put $a_k := a + k\delta$ ($k = 0, \ldots, N$), $A_k := \,]a_{k-1}, a_k]$ ($k = 1, \ldots, N$), and consider the step function $h := \sum_{k=1}^{N} a_k \chi_{A_k}$. Then $|\lambda - h(\lambda)| \le \delta$ for all $\lambda \in\, ]a, b]$. Note that $\sum_{k=1}^{N} E_{A_k} = E_{]a,b]} = I$ and $E_{A_k} E_{A_l} = E_{A_k \cap A_l} = \delta_{kl} E_{A_k}$. Writing $\mu I = \sum_{k=1}^{N} \mu E_{A_k}$ gives
$$\Big\| \sum_{k=1}^{N} (\mu - a_k) E_{A_k} - (\mu - T) \Big\| = \Big\| T - \sum_{k=1}^{N} a_k E_{A_k} \Big\| = \Big\| T - \int h \, dE \Big\| \le \sup_{a < \lambda \le b} |\lambda - h(\lambda)| \le \delta,$$
hence $B := \sum_{k=1}^{N} (\mu - a_k) E_{A_k}$ is invertible and $\|B^{-1}\| \le C$. Denoting by $\sum'$ the sum only over those $k \in \{1, \ldots, N\}$ such that $E_{A_k} \ne 0$ we may write $B = \sum' (\mu - a_k) E_{A_k}$ and $B^{-1} = \sum' (\mu - a_k)^{-1} E_{A_k}$. Therefore,
$$C \ge \|B^{-1}\| = \max\Big\{ \frac{1}{|\mu - a_k|} \;\Big|\; 1 \le k \le N \text{ and } E_{A_k} \ne 0 \Big\}.$$
Since $\mu \in\, ]a, b] = A_1 \cup \ldots \cup A_N$ there is some $k$ such that $\mu \in A_k = \,]a_{k-1}, a_k]$. Therefore, $|\mu - a_k| \le |a_{k-1} - a_k| = \delta < 1/C$, which implies $E_{A_k} = 0$. If $\mu < a_k$, then $U := \,]a_{k-1}, a_k[$ is an open neighborhood of $\mu$ such that $E_U = 0$. If $\mu = a_k$, then $|\mu - a_{k+1}| = \delta < 1/C$ and also $E_{A_{k+1}} = 0$, so that $U := \,]a_{k-1}, a_{k+1}[$ is an open neighborhood of $\mu$ with $E_U = 0$.
We finally show that $\Psi$ satisfies also condition (b) of Theorem 1.8: It suffices to show the weaker variant (1.6), since the final argument in the proof of Theorem 1.8 could be repeated here with $\Psi(f_n)$ in place of $f_n(T)$.
$$\langle \Psi(h)x, y \rangle = \Big\langle \Big( \int h \, dE \Big) x, y \Big\rangle = \Big\langle \sum_{j=1}^{m} \alpha_j E_{A_j} x, y \Big\rangle = \sum_{j=1}^{m} \alpha_j \underbrace{\langle E_{A_j} x, y \rangle}_{\nu_{x,y}(A_j)} = \int h \, d\nu_{x,y},$$
and for $g \in B_b(\sigma(T))$ we apply uniform approximation by step functions and pass to the limit to obtain
$$\langle \Psi(g)x, y \rangle = \int g \, d\nu_{x,y}.$$
² $E_{A \cup B} = E_{A \setminus B} + E_B \le E_A + E_B$ and similarly for finitely many terms.
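If $(f_n)$ and $f$ are as in the premise of condition (b), then (1.6) follows from the theorem on dominated convergence, since
$$\langle \Psi(f_n)x, y \rangle = \int f_n \, d\nu_{x,y} \to \int f \, d\nu_{x,y} = \langle \Psi(f)x, y \rangle \quad (n \to \infty).$$
The map $f \mapsto \int_{\sigma(T)} f \, dE$, $B_b(\sigma(T)) \to L(H)$, defines the measurable functional calculus of $T$, and $f(T)$ is determined by the measures $\mu_{x,y} \in M(\sigma(T))$ ($x, y \in H$), $\mu_{x,y}(A) = \langle E_A x, y \rangle$, via
$$\langle f(T)x, y \rangle = \int_{\sigma(T)} f \, d\mu_{x,y} =: \int_{\sigma(T)} f(\lambda) \, d\langle E_\lambda x, y \rangle \quad \forall x, y \in H.$$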
defines the spectral measure of $T$, since $E_{\sigma(T)} = \sum_{j=1}^{m} E_j = I$ and
$$T = \sum_{j=1}^{m} \lambda_j E_j = \sum_{j=1}^{m} \lambda_j E_{\{\lambda_j\}} = \int_{\sigma(T)} \lambda \, dE_\lambda.$$
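1.17. Example: Let $H = L^2([0, 1])$ and $T : L^2([0, 1]) \to L^2([0, 1])$ be the multiplication operator $(Tx)(t) = t x(t)$ ($t \in [0, 1]$, $x \in L^2([0, 1])$). Clearly, $T$ is self-adjoint and $T - \lambda$ possesses a continuous inverse, namely, $((T - \lambda)^{-1} x)(t) = x(t)/(t - \lambda)$, if and only if $\lambda \in \mathbb{C} \setminus [0, 1]$, hence $\sigma(T) = [0, 1]$. It is an exercise to show that $T$ has no eigenvalues, thus, by Lemma 0.5 we have $\sigma(T) = \sigma_c(T) = [0, 1]$. Let $x, y \in L^2([0, 1])$, then
$$\langle Tx, y \rangle = \int_0^1 \lambda x(\lambda) \overline{y(\lambda)} \, d\lambda = \int_{\sigma(T)} \lambda \, x(\lambda) \overline{y(\lambda)} \, d\lambda,$$
which suggests that $\mu_{x,y}$ is Lebesgue measure on $[0, 1]$ with density function $x \bar y$, i.e., that the spectral measure $E$ is defined by $(E_A x)(t) := \chi_{A \cap [0,1]}(t) x(t)$ ($A \in B(\mathbb{R})$). Indeed³, $\langle E_A x, y \rangle = \int_0^1 \chi_{A \cap [0,1]}(\lambda) x(\lambda) \overline{y(\lambda)} \, d\lambda = \int_{A \cap [0,1]} x(\lambda) \overline{y(\lambda)} \, d\lambda$ and
$$\int_{\sigma(T)} \lambda \, d\langle E_\lambda x, y \rangle = \int_{\sigma(T)} \mathrm{id} \, d\mu_{x,y} = \langle Tx, y \rangle = \int_0^1 \lambda x(\lambda) \overline{y(\lambda)} \, d\lambda.$$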
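A discretized version of this example may help to see the spectral measure at work. The following Python sketch is only an illustration under stated assumptions, not part of the original notes: functions in $L^2([0,1])$ are replaced by their samples at the midpoints of a uniform grid (with real values, so no complex conjugation appears), integrals by midpoint sums, and the names `indicator`, `project`, `inner` as well as the sample functions are hypothetical.

```python
N = 1000
grid = [(i + 0.5) / N for i in range(N)]   # midpoints of a uniform partition of [0, 1]

def indicator(a, b):
    """Samples of the characteristic function of the interval ]a, b]."""
    return [1.0 if a < t <= b else 0.0 for t in grid]

def project(chi, x):
    """Discrete analogue of (E_A x)(t) = chi_A(t) x(t)."""
    return [c * v for c, v in zip(chi, x)]

def inner(x, y):
    """Midpoint-rule approximation of the L^2([0, 1]) inner product (real-valued samples)."""
    return sum(u * v for u, v in zip(x, y)) / N

x = [t for t in grid]          # x(t) = t
y = [1.0 for _ in grid]        # y(t) = 1

chi_A = indicator(0.0, 0.5)
chi_B = indicator(0.25, 1.0)
chi_AB = indicator(0.25, 0.5)  # characteristic function of the intersection

# E_A E_B x compared with E_{A ∩ B} x, and <E_A x, y> as an integral over A
print(project(chi_A, project(chi_B, x)) == project(chi_AB, x))
print(inner(project(chi_A, x), y))
```

In this discretization $E_A$ is simply a diagonal 0/1 matrix, which makes the projection property $E_A^2 = E_A$ and the rule $E_A E_B = E_{A \cap B}$ evident; the second printed value approximates $\int_0^{1/2} t \, dt$.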
1.18. Remark: Let $T \in L(H)$ be self-adjoint with spectral measure $E$. An operator $S \in L(H)$ commutes with $T$, if and only if $S$ commutes with every $E_A$ ($A \in B(\mathbb{R})$). The statement can be shown by first noting that $ST = TS$ is equivalent to $S$ commuting with all powers $T^n$, in other words, that $\langle S T^n x, y \rangle = \langle T^n S x, y \rangle$ holds for all $x, y \in H$. Using
$$\int \lambda^n \, d\langle E_\lambda S x, y \rangle = \langle T^n S x, y \rangle = \langle S T^n x, y \rangle = \langle T^n x, S^* y \rangle = \int \lambda^n \, d\langle E_\lambda x, S^* y \rangle = \int \lambda^n \, d\langle S E_\lambda x, y \rangle$$
and density of polynomials, this in turn can be seen to mean that the complex measures $\nu_1(A) := \langle S E_A x, y \rangle$ and $\nu_2(A) := \langle E_A S x, y \rangle$ agree when considered as functionals on $C(\sigma(T))$. Varying $x$ and $y$ we obtain $S E_A = E_A S$.
³ Strictly speaking, we also need to check that $E$ is a spectral measure, but this is easy.
1.19. Proposition: If $S \in L(H)$ is self-adjoint, $g : \sigma(S) \to \mathbb{R}$ and $f : \mathbb{R} \to \mathbb{C}$ are bounded and Borel measurable, then
$$(f \circ g)(S) = f(g(S)).$$
Proof: The composition $f \circ g : \sigma(S) \to \mathbb{C}$ is Borel measurable and bounded, therefore $(f \circ g)(S)$ is defined by the measurable functional calculus. Let $F$ denote the spectral measure of $S$. Since $g$ is real-valued, $g(S)$ is self-adjoint and possesses a unique spectral measure $E$. It suffices to prove the statement of the proposition in case $f = \chi_A$ ($A \in B(\mathbb{R})$), since it can then be extended to step functions and measurable functions by the usual techniques and properties of the functional calculus.
We have $\chi_A \circ g = \chi_{g^{-1}(A)}$, thus it suffices to show $F_{g^{-1}(A)} = E_A$ for every $A \in B(\mathbb{R})$. This in turn is equivalent to the equality of the measures $\mu_{x,y}(A) := \langle E_A x, y \rangle$ and $\nu_{x,y}(A) := \langle F_{g^{-1}(A)} x, y \rangle$ for arbitrary $x, y \in H$. Note that $\nu_{x,y}$ is the image measure of the measure $\rho_{x,y}(B) := \langle F_B x, y \rangle$ ($B \in B(\mathbb{R})$) associated with the spectral measure $F$ of $S$, since $\nu_{x,y}(A) = \langle F_{g^{-1}(A)} x, y \rangle = \rho_{x,y}(g^{-1}(A))$, i.e., $\nu_{x,y} = g(\rho_{x,y})$. We obtain
$$\int \lambda^n \, d\nu_{x,y} = \int g(\lambda)^n \, d\rho_{x,y} = \langle g(S)^n x, y \rangle = \int \lambda^n \, d\langle E_\lambda x, y \rangle = \int \lambda^n \, d\mu_{x,y},$$
therefore $\nu_{x,y}$ and $\mu_{x,y}$ are equal as (continuous) linear functionals on polynomial functions on $\sigma(S)$. By the theorem of Weierstraß, the two measures are equal on $C(\sigma(S))$. Since $x, y$ were arbitrary, we have shown $F_{g^{-1}(A)} = E_A$.
1.20. Example (roots of positive operators): Let $T \in L(H)$ be positive and $n \in \mathbb{N}$, then there is a unique positive $S \in L(H)$ such that $S^n = T$. Existence is clear, if we note that $\sigma(T) \subseteq [0,\infty[$ and put $S := T^{1/n}$. To show uniqueness, suppose $S$ is positive, hence $\sigma(S) \subseteq [0,\infty[$, and satisfies $S^n = T$. Let $g : \sigma(S) \to \mathbb{R}$ be defined by $g(s) := s^n$ and $f : \mathbb{R} \to \mathbb{R}$, $f(t) := (t\,\chi_{\sigma(T)}(t))^{1/n}$. Then $f \circ g = \mathrm{id}_{\sigma(S)}$, since $\sigma(T) = \sigma(S^n) = g(\sigma(S))$, and we obtain from Proposition 1.19
\[
S = (f \circ g)(S) = f(g(S)) = f(S^n) = f(T) = T^{1/n}.
\]
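For a concrete finite-dimensional illustration of this example (an added sketch, not from the original notes), assume $H = \mathbb{C}^2$, $n = 2$, and the positive matrix $T$ below with eigenvalues $1$ and $9$. The names $P_1$, $P_9$ (orthogonal eigenprojections) and $R$ are introduced here only for illustration.

```latex
\[
T = \begin{pmatrix} 5 & 4\\ 4 & 5 \end{pmatrix} = 1\cdot P_1 + 9\cdot P_9,
\qquad
P_1 = \tfrac12\begin{pmatrix} 1 & -1\\ -1 & 1 \end{pmatrix},\quad
P_9 = \tfrac12\begin{pmatrix} 1 & 1\\ 1 & 1 \end{pmatrix},
\]
\[
T^{1/2} = 1\cdot P_1 + 3\cdot P_9 = \begin{pmatrix} 2 & 1\\ 1 & 2 \end{pmatrix},
\qquad
R := -P_1 + 3P_9 = \begin{pmatrix} 1 & 2\\ 2 & 1 \end{pmatrix},\quad R^2 = T.
\]
```

The matrix $R$ is a self-adjoint square root of $T$ as well, but it has the eigenvalue $-1$ and is therefore not positive; uniqueness is claimed only within the class of positive operators.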
1.21. Remark (polar decomposition): For arbitrary $T \in L(H)$, the operator $T^*T$ is positive and we may define the positive operator $|T| := (T^*T)^{1/2}$. Note that $\||T|x\|^2 = \langle x, |T|^2 x\rangle = \langle x, T^*Tx\rangle = \|Tx\|^2$, which enables us to define $U : \mathrm{ran}(|T|) \to \mathrm{ran}(T)$ by $U(|T|x) := Tx$ and extend it as an isometry to $\overline{\mathrm{ran}(|T|)} \to \overline{\mathrm{ran}(T)}$. Putting $U = 0$ on $\overline{\mathrm{ran}(|T|)}^{\perp} = \mathrm{ran}(|T|)^{\perp} = \ker(|T|) = \ker(T)$, we obtain an operator $U \in L(H)$ with $\ker(U) = \ker(T)$, which acts as an isometry $\ker(T)^{\perp} \to \overline{\mathrm{ran}(T)}$, a so-called partial isometry, with the property $U|T| = T$. This relation is called polar decomposition of $T$, since it resembles the similar formula $z = e^{i\alpha}|z|$ for complex numbers. (It can be shown that $U|T| = T = (TT^*)^{1/2}U$, see, e.g., [KR, Theorem 6.1.2].)
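The construction can be followed by hand for a nilpotent $2\times 2$ matrix (an added illustration under the assumption $H = \mathbb{C}^2$ with standard basis $e_1, e_2$; it is not part of the original notes):

```latex
\[
T = \begin{pmatrix} 0 & 1\\ 0 & 0 \end{pmatrix},\qquad
T^*T = \begin{pmatrix} 0 & 0\\ 0 & 1 \end{pmatrix} = |T|,\qquad
TT^* = \begin{pmatrix} 1 & 0\\ 0 & 0 \end{pmatrix} = (TT^*)^{1/2},
\]
\[
U = \begin{pmatrix} 0 & 1\\ 0 & 0 \end{pmatrix}:\qquad
U|T| = T = (TT^*)^{1/2}U,\qquad
\ker(U) = \ker(T) = \mathrm{span}\{e_1\},\qquad Ue_2 = e_1.
\]
```

Here $U$ maps $\ker(T)^{\perp} = \mathrm{span}\{e_2\}$ isometrically onto $\mathrm{ran}(T) = \mathrm{span}\{e_1\}$, but it is not unitary, which shows that the partial isometry in the polar decomposition need not be invertible.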
1.22. Description of the spectrum in terms of the spectral measure: Let $T \in L(H)$ be self-adjoint with spectral measure $E$ and $\lambda$ be a complex number.
(a) $\lambda \notin \sigma(T)$ if and only if there is an open neighborhood $U \subseteq \mathbb{C}$ of $\lambda$ such that $E_U = 0$.
Proof: By construction, $E_{\sigma(T)} = I$ (Proposition 1.14), thus $E_{\rho(T)} = 0$, and the resolvent set $\rho(T)$ is open, which proves the ‘only if’ part. To show the reverse implication, suppose $U$ is an open neighborhood of $\lambda$ such that $E_U = 0$. Define the bounded measurable function $f : \sigma(T) \to \mathbb{C}$ by $f(t) := 1/(\lambda - t)$, if $t \in \sigma(T) \setminus U$, and $f(t) := 0$, if $t \in U \cap \sigma(T)$. The function $g : \sigma(T) \to \mathbb{C}$, $g(t) := \lambda - t$ is also bounded and measurable and $f \cdot g = \chi_{\sigma(T)\setminus U}$, hence
\[
f(T)(\lambda - T) = f(T)g(T) = (fg)(T) = \chi_{\sigma(T)\setminus U}(T) = E_{\sigma(T)\setminus U} = E_{\sigma(T)} - E_{\sigma(T)\cap U} = I.
\]
The same holds for $(\lambda - T)f(T) = g(T)f(T) = (gf)(T) = (fg)(T)$, thus showing that $f(T) \in L(H)$ is an inverse of $\lambda - T$, i.e., $\lambda \in \rho(T)$.
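(b) $\lambda \in \sigma_p(T)$ (i.e., $\lambda$ is an eigenvalue of $T$) if and only if $E_{\{\lambda\}} \neq 0$. In this case, $E_{\{\lambda\}}$ projects onto the eigenspace of $\lambda$.
Proof: We show that $\mathrm{ran}(E_{\{\lambda\}}) = \ker(\lambda - T)$, then the result follows immediately.
• $\mathrm{ran}(E_{\{\lambda\}}) \subseteq \ker(\lambda - T)$: $x \in \mathrm{ran}(E_{\{\lambda\}})$ means $E_{\{\lambda\}} x = x$; let $y \in H$ be arbitrary and note that $(\lambda - t)\chi_{\{\lambda\}}(t) = 0$ for all $t$ to obtain
\[
\langle (\lambda - T)x, y\rangle = \langle (\lambda - T)E_{\{\lambda\}} x, y\rangle = \int (\lambda - t)\chi_{\{\lambda\}}(t)\,d\langle E_t x, y\rangle = 0;
\]
thus, $(\lambda - T)x = 0$.
• $\ker(\lambda - T) \subseteq \mathrm{ran}(E_{\{\lambda\}})$: If $Tx = \lambda x$, then $f(T)x = f(\lambda)x$ for every $f \in C(\sigma(T))$ by Theorem 1.6(iii); by the properties of the measurable functional calculus (Theorem 1.8) and an application of Lemma 0.13, we deduce that the relation also holds with $f \in B_b(\sigma(T))$; putting $f = \chi_{\{\lambda\}}$ yields $E_{\{\lambda\}} x = \chi_{\{\lambda\}}(\lambda)x = x$, thus, $x \in \mathrm{ran}(E_{\{\lambda\}})$.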
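Both criteria can be tested on the multiplication operator $(Tx)(t) = t\,x(t)$ on $L^2([0,1])$ from Example 1.17, whose spectral measure is $(E_A x)(t) = \chi_{A\cap[0,1]}(t)\,x(t)$. The following lines are an added illustration, not part of the original notes:

```latex
\[
(E_{\{\lambda\}}x)(t) = \chi_{\{\lambda\}\cap[0,1]}(t)\,x(t) = 0 \ \text{ for almost all } t
\quad\Longrightarrow\quad
E_{\{\lambda\}} = 0 \ \text{ for every } \lambda \in \mathbb{R},
\]
\[
\|E_U x\|^2 = \int_{U\cap[0,1]} |x(t)|^2\,dt,
\qquad
E_U \neq 0 \ \text{ whenever the open set } U \text{ meets } [0,1].
\]
```

By (b) this operator has no eigenvalues, while (a) gives $\sigma(T) = [0,1]$: an open set meeting $[0,1]$ intersects it in a set of positive Lebesgue measure, and an open neighborhood of a point outside $[0,1]$ can be chosen disjoint from $[0,1]$.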
We will now discuss the aspect of diagonalization in terms of unitary equivalence with a multiplication operator.
1.23. Multiplication operator version of the spectral theorem: A vector $x \in H$ is said to be a cyclic vector for an operator $S \in L(H)$, if the linear span of the set $\{x, Sx, S^2x, \ldots\}$ is dense in $H$. (Note that then $H$ is necessarily separable, hence possesses a complete orthonormal system of at most countable cardinality.) The multiplication operator $T$ in Example 1.17 possesses the constant function $1 \in L^2([0,1])$ as a cyclic vector, because $(T^n x)(t) = t^n x(t)$ and therefore $\mathrm{span}\{1, T1, T^2 1, \ldots\}$ contains the $L^2$-dense subspace of all polynomial functions on $[0,1]$. In contrast, the identity operator $I \in L(H)$ never possesses a cyclic vector, unless the dimension of $H$ is 0 or 1.
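Theorem A: Let $T \in L(H)$ be self-adjoint with spectral measure $E$ and suppose $x \in H$ is a cyclic vector for $T$. If $\mu := \mu_{x,x}$ (recall $\mu_{x,x}(A) = \langle E_A x, x\rangle$), then there exists a unitary operator $U : H \to L^2(\sigma(T), \mu)$ such that we have, for every $\varphi \in L^2(\sigma(T), \mu)$,
\[
(UTU^{-1}\varphi)(t) = t\,\varphi(t) \quad \text{for $\mu$-almost all } t \in \sigma(T).
\]
Proof: For every $\varphi \in C(\sigma(T))$ we have
\[
\|\varphi\|_2^2 = \int_{\sigma(T)} |\varphi|^2\,d\mu = \langle (\overline{\varphi}\varphi)(T)x, x\rangle = \langle \varphi(T)^*\varphi(T)x, x\rangle = \|\varphi(T)x\|^2,
\]
i.e., the linear map $V_0 : C(\sigma(T)) \to H$, $\varphi \mapsto \varphi(T)x$ is isometric with respect to $\|.\|_2$. The unique continuous extension $V : L^2(\sigma(T), \mu) \to H$ is linear and also isometric. Moreover, $\mathrm{ran}(V) \supseteq \mathrm{ran}(V_0) \supseteq \mathrm{span}\{x, Tx, T^2x, \ldots\}$ is dense in $H$, hence $V$ is also surjective (since $\|V\psi\| = \|\psi\|_2$ shows that $V$ has closed range; cf. 0.2); therefore $V$ is a unitary map. It satisfies the following relation for every $\varphi$ in the dense subspace $C(\sigma(T)) \subseteq L^2(\sigma(T), \mu)$:
\[
TV\varphi = T\varphi(T)x = (\mathrm{id}\cdot\varphi)(T)x = V(\mathrm{id}\cdot\varphi),
\]
in other words, $(V^{-1}TV\varphi)(t) = t\,\varphi(t)$ holds for all $\varphi \in L^2(\sigma(T), \mu)$ for $\mu$-almost all $t \in \sigma(T)$. Thus, the theorem is proved upon setting $U := V^{-1}$.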
In the sense of Theorem A, the multiplication operator in Example 1.17 is the prototype of any
self-adjoint operator with (purely continuous) spectrum [0, 1] and a cyclic vector x such that µx,x
is (equivalent to) the Lebesgue measure.
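At the other extreme from the Lebesgue case, Theorem A can be followed by hand for a matrix with two simple eigenvalues. The following is an added sketch under the assumptions $H = \mathbb{C}^2$, $T = \mathrm{diag}(1,2)$ and $x = (1,1)^{\top}$; here $\delta_1, \delta_2$ denote the Dirac measures at the points $1$ and $2$.

```latex
\[
T = \begin{pmatrix} 1 & 0\\ 0 & 2 \end{pmatrix},\quad
x = \begin{pmatrix} 1\\ 1 \end{pmatrix},\quad
Tx = \begin{pmatrix} 1\\ 2 \end{pmatrix}
\ \Longrightarrow\ \mathrm{span}\{x, Tx\} = \mathbb{C}^2,
\qquad
\mu = \mu_{x,x} = \delta_1 + \delta_2,
\]
\[
V\varphi = \varphi(T)x = \begin{pmatrix} \varphi(1)\\ \varphi(2) \end{pmatrix},
\qquad
\|V\varphi\|^2 = |\varphi(1)|^2 + |\varphi(2)|^2 = \int_{\sigma(T)} |\varphi|^2\,d\mu,
\qquad
(V^{-1}TV\varphi)(t) = t\,\varphi(t)\quad (t \in \{1,2\}).
\]
```

By contrast, the vector $(1,0)^{\top}$ is not cyclic for this $T$, since all its images under powers of $T$ stay in the first coordinate axis.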
The extension of the above theorem to the general situation of a self-adjoint operator T on a
Hilbert space H faces merely technical difficulties. The major step in overcoming these is to show
that H can be decomposed into T -invariant subspaces Hj (j ∈ J, J some index set) with cyclic
vectors in each Hj . Then Theorem A applies to each restriction Tj of T to these components and
we come up with an orthogonal direct sum of multiplication operators on L2 -spaces with respect
to different measure spaces $(\sigma(T_j), \mathcal{B}(\sigma(T_j)), \mu_j)$. This still yields a multiplication operator (although not given as multiplication by a “global identity function”) on the direct sum, which can be implemented as a single space $L^2(\Omega, \mu)$ by artificially producing a disjoint union $\Omega$ of the sets $\sigma(T_j)$ and considering the “sum” $\mu$ of the measures $\mu_j$ on it. We do not go into the details of
the constructions here, but refer to [Wer18, Satz VII.1.21 and Lemma VII.1.22] for a proof in the
separable case and to [Con10, Chapter IX, Theorem 4.6] or [Kab14, Abschnitt 15.3] for a proof of
the general situation.
Theorem B: Let T ∈ L(H) be self-adjoint, then there exists a measure space (Ω, Σ, µ), a bounded
measurable function f : Ω → R, and a unitary operator U : H → L2 (Ω, µ) such that we have, for
every ϕ ∈ L2 (Ω, µ),
U T U −1 ϕ = f ϕ µ-almost everywhere.
1.24. Example (convolution operators): [In this example we call on a few more basic results
from measure theory and Fourier analysis that are typically covered by the companion master
level course on Real Analysis; we may also refer to [Con16, Fol99].] Let f ∈ L1 (R) ∩ L2 (R) and
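$x \in L^2(\mathbb{R})$, then for every $s \in \mathbb{R}$, the function $t \mapsto f(s - t)$ belongs to $L^2$ as well and
\[
(f * x)(s) := \int_{\mathbb{R}} f(s - t)\,x(t)\,dt
\]
is well-defined; moreover, Young's inequality gives
\[
\|f * x\|_2 \le \|f\|_1 \|x\|_2.
\]
The map $x \mapsto f * x$ defines a linear continuous operator $C_f : L^2(\mathbb{R}) \to L^2(\mathbb{R})$, which is self-adjoint, if $f(r) = \overline{f(-r)}$ holds for all $r \in \mathbb{R}$, since
\[
\langle C_f x, y\rangle = \int\!\!\int f(s - t)\,x(t)\,dt\,\overline{y(s)}\,ds = \int x(t) \int f(s - t)\,\overline{y(s)}\,ds\,dt,
\]
\[
\text{while}\quad \langle x, C_f y\rangle = \int x(t) \int \overline{f(t - s)}\;\overline{y(s)}\,ds\,dt.
\]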
operator F on L2 (R) (Plancherel’s theorem) and is famous for intertwining differentiation with
multiplication by id (on functions with appropriate regularity or decay properties) as well as
transforming convolution into multiplication, i.e.,
\[
\mathcal{F}(f * x) = \sqrt{2\pi}\,\widehat{f}\cdot\widehat{x} \qquad (f \in L^1 \cap L^2,\ x \in L^2).
\]
Theorem: Let $T \in L(H)$ be normal, then there exists a unique spectral measure $G : \mathcal{B}(\mathbb{C}) \to L(H)$ with compact support such that
\[
T = \int_{\sigma(T)} \lambda\,dG_\lambda.
\]
The map $f \mapsto \int f(\lambda)\,dG_\lambda$, $B_b(\sigma(T)) \to L(H)$, defines the measurable functional calculus for $T$.
Spectral mapping theorem: If $f \in C(\sigma(T))$, then $\sigma(f(T)) = f(\sigma(T)) = \{f(\lambda) \mid \lambda \in \sigma(T)\}$.
1.26. Example (unitary operators): Let $U \in L(H)$ be unitary, then $U$ is normal. Recall that
\[
\sigma(U) \subseteq S^1 := \{\lambda \in \mathbb{C} \mid |\lambda| = 1\},
\]
which can be argued as follows: Since $U$ is bijective and $\|U\| = 1$, we have $0 \in \rho(U)$ and $\{\lambda \in \mathbb{C} \mid |\lambda| > 1\} \subseteq \rho(U)$; if $0 < |\lambda| < 1$, then $\lambda U$ as well as $(\frac{1}{\lambda} - U^*)$ are bijective, hence $\lambda - U = (-\lambda U)(\frac{1}{\lambda} - U^*)$ is continuously invertible showing that $\lambda \in \rho(U)$.
Consider the function $\arg : S^1 \to \mathbb{R}$ defined uniquely by the requirements $e^{i\arg(z)} = z$ and $\arg(S^1) \subseteq\, ]-\pi, \pi]$. The function $\arg$ is bounded by construction and continuous except for a jump at $z = -1$, hence Borel measurable. The measurable functional calculus for $U$ allows us to define $A := \arg(U) \in L(H)$, which is self-adjoint, because $\arg$ is real-valued. The relation $(\exp \circ (i\arg))(z) = \exp(i\arg(z)) = z$ now implies
\[
U = \exp(iA).
\]
1.27. Remark (Gelfand theory for commutative unital C*-algebras): As with the continuous functional calculus for self-adjoint bounded operators described in Theorem 1.6, we may consider the commutative closed subalgebra $C^*(T) := \{f(T) \mid f \in C(\sigma(T))\}$ of $L(H)$ now for an arbitrary normal operator $T \in L(H)$. Both $L(H)$ and $C^*(T)$ are examples of a unital C*-algebra, i.e., a Banach algebra $\mathcal{A}$ with a unit element $I \in \mathcal{A}$ and a conjugate-linear involution map $\mathcal{A} \to \mathcal{A}$, $A \mapsto A^*$, satisfying $(A^*)^* = A$ and $(AB)^* = B^*A^*$ such that $\|A^*A\| = \|A\|^2$. Since $\|AB\| \le \|A\|\|B\|$ holds in any normed algebra, one easily deduces that involution is an isometry, i.e., $\|A^*\| = \|A\|$ (see [Con10, Chapter VIII, Proposition 1.7] or [Wer18, Lemma IX.3.3]).
While $L(H)$ is not commutative (unless $H$ is one-dimensional), for any normal operator $T$, the C*-subalgebra $C^*(T) \subseteq L(H)$ is commutative and is, in fact, the smallest C*-subalgebra of $L(H)$ containing $I$, $T$, and $T^*$. Further examples of commutative unital C*-algebras are provided by $C(X) = \{f : X \to \mathbb{C} \mid f \text{ continuous}\}$ for any compact Hausdorff space $X$, with involution $f^*(x) := \overline{f(x)}$ ($x \in X$) and the supremum norm $\|.\|_\infty$. According to the so-called Gelfand-Naimark theorem, every unital commutative C*-algebra is of this latter form (cf. [Con10, Chapter VIII, Theorem 2.1] or [Wer18, Theorem IX.3.4]).
The basic construction of the Gelfand isomorphism establishing the Gelfand-Naimark theorem proceeds as follows: Let $\mathcal{A}$ be a unital commutative C*-algebra and denote by $X(\mathcal{A})$ the set of all nonzero characters on $\mathcal{A}$, i.e., multiplicative linear functionals $\mathcal{A} \to \mathbb{C}$. It can be shown that $X(\mathcal{A}) \subseteq \mathcal{A}'$ and that every character is also continuous with respect to the weak* topology on $\mathcal{A}$ (cf. Example 4.12.1)). Moreover, the character space $X(\mathcal{A})$ is easily seen to be a weak* closed subset of the unit ball in $\mathcal{A}'$, hence is weak* compact by the Alaoglu-Bourbaki theorem (see Section 6). Recall the canonical isometry $\iota : \mathcal{A} \to \mathcal{A}''$, given by $\iota_A(\mu) := \mu(A)$ for all $A \in \mathcal{A}$ and $\mu \in \mathcal{A}'$. Using the fact $X(\mathcal{A}) \subseteq \mathcal{A}'$, the Gelfand transformation $\mathcal{A} \to C(X(\mathcal{A}))$ is given by $A \mapsto \widehat{A} := \iota_A|_{X(\mathcal{A})}$ and can be shown to be an isometric *-isomorphism
\[
\mathcal{A} \cong C(X(\mathcal{A})).
\]
In case $\mathcal{A} = C^*(T)$ one can show that $X(C^*(T))$ is homeomorphic to $\sigma(T)$ (see [Wer18, Satz IX.3.6] or [Con10, Chapter VIII, Proposition 2.3]) and therefore the Gelfand transformation implies
\[
C^*(T) \cong C(\sigma(T)).
\]
This provides yet another view on the continuous functional calculus for the normal operator $T$.
2. Unbounded operators
Differential operators, in applications often supplied with boundary conditions, are prominent examples of linear maps between function spaces, but they can typically not be implemented as bounded operators with respect to $L^2$-norms. For example, let $V := \{f \in C^1([0,1]) \mid f(0) = 0 = f(1)\}$ and $T_0 : V \to C([0,1])$ be given by $f \mapsto if'$. Then $T_0$ is bounded, if we equip $V$ with the norm $\|f\|_{\infty,1} = \|f\|_\infty + \|f'\|_\infty$ and $C([0,1])$ with $\|.\|_\infty$ or $\|.\|_2$. But the map $f \mapsto if'$ is unbounded (hence discontinuous) considered as linear map $T : V \to L^2([0,1])$ with the $L^2$-norm put on both spaces (see footnote 1). But the $L^2$-setting still has the advantage of reflecting the following symmetry property of $T$, being defined on the (dense) subspace $V \subseteq L^2([0,1])$:
\[
\forall x, y \in V:\quad \langle Tx, y\rangle = i\int_0^1 x'(t)\,\overline{y(t)}\,dt = i\,x(t)\,\overline{y(t)}\,\Big|_{t=0}^{t=1} - i\int_0^1 x(t)\,\overline{y'(t)}\,dt = \langle x, Ty\rangle.
\]
2.1. Definition: (a) Consider a linear map T : dom(T ) → H, where dom(T ) ⊆ H is a subspace.
Then T is called an operator on H with domain dom(T ). The operator T is said to be densely
defined, if dom(T ) is dense in H.
(b) An operator S : dom(S) → H is said to be an extension of T , in notation T ⊆ S, if dom(T ) ⊆
dom(S) and Sx = T x holds for all x ∈ dom(T ).
(c) Two operators T and S are equal, we write T = S, if T ⊆ S and S ⊆ T , i.e., dom(T ) = dom(S)
and T x = Sx for all x ∈ dom(T ).
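(d) An operator $T : \mathrm{dom}(T) \to H$ is symmetric, if $\langle Tx, y\rangle = \langle x, Ty\rangle$ holds for all $x, y \in \mathrm{dom}(T)$.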
In the above sense, any element S ∈ L(H) is an operator on H with dom(S) = H. Clearly, if T
is a densely defined operator on H which happens to be continuous (with respect to the Hilbert
space norm restricted to dom(T )), then T has a unique extension T ⊆ S ∈ L(H). Finally, by
the Hellinger-Toeplitz theorem ([Wer18, Satz V.5.5] or [Tes14, §4.1, Theorem 4.9]), a symmetric
operator with dom(T ) = H is bounded and self-adjoint, i.e., T ∈ L(H) with T ∗ = T .
The operator $T$ on $L^2([0,1])$ mentioned above, with domain $V = \{f \in C^1([0,1]) \mid f(0) = 0 = f(1)\}$ and given by $Tf = if'$, is symmetric. If $Sf = if'$ with $\mathrm{dom}(S) := \{f \in C^1([0,1]) \mid f(0) = f(1)\}$, then $S$ is a symmetric extension of $T$.
Footnote 1: To have an explicit example at hand: We find that $f_n(t) := \sin(n\pi t)$ satisfies $f_n \in V$ and $\|f_n\|_2 = 1/\sqrt{2}$, whereas $\|Tf_n\|_2 = n\pi/\sqrt{2} \to \infty$ as $n \to \infty$.
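The two norms quoted for $f_n(t) = \sin(n\pi t)$ can be checked directly with the double-angle formulas (an added computation, using only $Tf = if'$ from the text):

```latex
\[
\|f_n\|_2^2 = \int_0^1 \sin^2(n\pi t)\,dt = \int_0^1 \frac{1 - \cos(2n\pi t)}{2}\,dt = \frac12,
\]
\[
\|Tf_n\|_2^2 = \int_0^1 |i\,n\pi\cos(n\pi t)|^2\,dt = n^2\pi^2 \int_0^1 \frac{1 + \cos(2n\pi t)}{2}\,dt = \frac{n^2\pi^2}{2}.
\]
```

Hence $\|Tf_n\|_2 / \|f_n\|_2 = n\pi$ is unbounded in $n$, which is exactly the failure of $L^2$-continuity.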
Recall that the adjoint $T^*$ of a bounded operator $T$ is uniquely defined by the requirement $\langle Tx, y\rangle = \langle x, T^*y\rangle$ for all $x, y \in H$ and can be obtained by assigning to $y \in H$ the unique vector $z \in H$ such that the linear functional $x \mapsto \langle Tx, y\rangle$ is represented by $x \mapsto \langle x, z\rangle$. This construction still works in the more general context, if $x \mapsto \langle Tx, y\rangle$ is defined on a dense subspace, namely $\mathrm{dom}(T)$, and is a continuous linear functional.
As noted above, the Hellinger-Toeplitz theorem guarantees consistency of this new notion of self-adjointness with the corresponding one for bounded operators. Obviously, self-adjoint operators are symmetric. Conversely, symmetric operators need not be self-adjoint (in general, they need not even be densely defined): In the above example, $\mathrm{dom}(T) \subsetneq \mathrm{dom}(S) \subseteq \mathrm{dom}(T^*)$, since $f \mapsto \langle if', g\rangle = \langle f, ig'\rangle$ is $L^2$-continuous on $\mathrm{dom}(T)$ for every $g \in \mathrm{dom}(S)$; therefore, $T$ is symmetric, densely defined, but not self-adjoint.
and equip $H \times H$ with the inner product $\langle (x, y), (u, v)\rangle_2 := \langle x, u\rangle + \langle y, v\rangle$, defining the norm $\|(x, y)\|_2 = \sqrt{\|x\|^2 + \|y\|^2}$ (which induces the product topology on $H \times H$), thus obtaining a Hilbert space structure on $H \times H$. The operator $T$ is closed (or has closed graph), if $\mathrm{gr}(T)$ is closed in $H \times H$. Equivalently, for any sequence $(x_n)$ in $\mathrm{dom}(T)$ such that $x_n \to x$ and $Tx_n \to y$ in $H$, we have $x \in \mathrm{dom}(T)$ and $y = Tx$.
Recall that the closed graph theorem states that a closed operator $T$ with $\mathrm{dom}(T) = H$ is continuous. In case of unbounded operators the closedness serves as a partial replacement of continuity properties. To give an example of an operator that is not closed we turn again to $Tx = ix'$ with $\mathrm{dom}(T) = \{x \in C^1([0,1]) \mid x(0) = 0 = x(1)\} \subseteq L^2([0,1])$: Consider $x_n(t) = (\frac14 + \frac1n)^{1/2} - ((t - \frac12)^2 + \frac1n)^{1/2}$, $x(t) = \frac12 - |t - \frac12|$, and $y(t) = i\,\mathrm{sgn}(\frac12 - t)$; then $x_n \to x$ uniformly, hence in $L^2([0,1])$, and $Tx_n \to y$ in $L^2([0,1])$ (dominated convergence), but $x \notin \mathrm{dom}(T)$. (A pity, since $Tx = y$ almost everywhere; the chosen domain is too narrow.)
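The convergence claims in this example can be made explicit as follows (an added computation for the functions $x_n$, $x$, $y$ defined above):

```latex
\[
x_n'(t) = -\frac{t - \tfrac12}{\sqrt{(t - \tfrac12)^2 + \tfrac1n}},
\qquad |x_n'(t)| \le 1,
\qquad x_n'(t) \to \operatorname{sgn}\bigl(\tfrac12 - t\bigr)\quad (n \to \infty,\ t \in [0,1]),
\]
\[
0 \le \sqrt{a^2 + \tfrac1n} - |a| \le \tfrac{1}{\sqrt{n}}\ \ (a \in \mathbb{R})
\quad\Longrightarrow\quad
\|x_n - x\|_\infty \le \tfrac{1}{\sqrt{n}}.
\]
```

Since $Tx_n = ix_n'$ is bounded by $1$ and converges pointwise to $y$, dominated convergence gives $Tx_n \to y$ in $L^2([0,1])$; moreover $x_n(0) = x_n(1) = 0$, so indeed $x_n \in \mathrm{dom}(T)$, whereas the limit $x$ has a kink at $t = \frac12$ and is not $C^1$.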
2.4. Proposition: Let T : dom(T ) → H be densely defined, then:
(i) T ∗ is closed,
(ii) if T ∗ is densely defined, then T ⊆ T ∗∗ ,
(iii) if T ∗ is densely defined and S is a closed extension of T , then T ∗∗ ⊆ S.
This means that T ∗∗ is the “smallest” closed extension of T , the so-called closure of T .
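Proof: (i): Suppose $y_n \in \mathrm{dom}(T^*)$, $y_n \to y$, and $T^*y_n \to z$. Then we have for every $x \in \mathrm{dom}(T)$,
\[
\langle Tx, y\rangle = \lim_{n\to\infty} \langle Tx, y_n\rangle = \lim_{n\to\infty} \langle x, T^*y_n\rangle = \langle x, z\rangle,
\]
hence $y \in \mathrm{dom}(T^*)$ and $T^*y = z$.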
\[
\langle (z, T^{**}z), (u, v)\rangle_2 = \langle z, u\rangle + \langle T^{**}z, v\rangle = \langle z, u\rangle + \langle z, T^*v\rangle = \langle z, \underbrace{u + T^*v}_{u - u}\rangle = 0,
\]
There is an interesting intermediate case between self-adjointness and symmetry for densely defined operators, namely self-adjointness of the adjoint.
2.6. Definition: A densely defined symmetric operator $T$ on $H$ is essentially self-adjoint, if its closure $T^{**}$ is self-adjoint, or, equivalently, $T \subseteq T^{**} = T^*$ ($= T^{***}$ by the above corollary).
In general, self-adjoint extensions of symmetric operators need not exist, but if $T$ is essentially self-adjoint it possesses its closure $T^{**} = T^*$ as the unique self-adjoint extension. (This is easily proved upon observing that $T \subseteq S$ implies $S^* \subseteq T^*$.)
2.7. An example revisited: We study again the example $Tx = ix'$ with $\mathrm{dom}(T) = \{x \in C^1([0,1]) \mid x(0) = 0 = x(1)\} \subseteq L^2([0,1])$. We have seen above that $T$ is symmetric, but not self-adjoint, and not even closed.
We claim that $\mathrm{dom}(T^*) = \{x : [0,1] \to \mathbb{C} \mid x \text{ is absolutely continuous and } x' \in L^2([0,1])\}$ and $T^*x = ix'$.
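Let $x \in \mathrm{dom}(T^*)$ and $y := T^*x \in L^2([0,1])$. We put $F(t) := \int_0^t y(s)\,ds$ and note that, by the fundamental theorem of calculus, $F$ is absolutely continuous (see footnote 2) and satisfies $F' = y$ almost everywhere. We may use the formula of integration by parts (Proposition 0.20) and obtain for any $z \in \mathrm{dom}(T)$ (recall that $z(0) = 0 = z(1)$)
\[
\langle Tz, x\rangle = \langle z, T^*x\rangle = \langle z, y\rangle = \langle z, F'\rangle = \int_0^1 z(t)\,\overline{F'(t)}\,dt = z(t)\,\overline{F(t)}\,\Big|_{t=0}^{t=1} - \int_0^1 z'(t)\,\overline{F(t)}\,dt
= -\langle z', F\rangle = \langle iz', -iF\rangle = \langle Tz, -iF\rangle,
\]
that is, $\langle Tz, x + iF\rangle = 0$ for every $z \in \mathrm{dom}(T)$, or
\[
(*)\qquad x + iF \in \mathrm{ran}(T)^{\perp}.
\]
Observe that $Tz \in C([0,1])$ with $\int_0^1 Tz = i\int_0^1 z' = i(z(1) - z(0)) = 0$ for every $z \in \mathrm{dom}(T)$; on the other hand, any continuous function $w$ with $\int_0^1 w = 0$ is in $\mathrm{ran}(T)$, since $w = iW'$ with $W \in \mathrm{dom}(T)$, if $W(t) := -i\int_0^t w(s)\,ds$ (which yields $W(0) = W(1) = 0$). Thus,
\[
\mathrm{ran}(T)^{\perp} = \Bigl\{w \in C([0,1]) \Bigm| \int_0^1 w = 0\Bigr\}^{\perp} = \{w \in C([0,1]) \mid \langle w, 1\rangle = 0\}^{\perp}
= \overline{\{w \in C([0,1]) \mid \langle w, 1\rangle = 0\}}^{\,\perp} = \{w \in L^2([0,1]) \mid \langle w, 1\rangle = 0\}^{\perp} = (\{1\}^{\perp})^{\perp} = \mathrm{span}\{1\}.
\]
We conclude from $(*)$ that $x + iF \in \mathrm{span}\{1\}$, hence $x = -iF + \alpha 1$ with some $\alpha \in \mathbb{C}$, which shows that $x$ is absolutely continuous and $ix' = F' + 0 = y = T^*x \in L^2([0,1])$.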
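If $x : [0,1] \to \mathbb{C}$ is absolutely continuous with $x' \in L^2([0,1])$, then we obtain for every $z \in \mathrm{dom}(T)$ (again employing integration by parts)
\[
\langle Tz, x\rangle = i\int_0^1 z'(t)\,\overline{x(t)}\,dt = i\,z(t)\,\overline{x(t)}\,\Big|_{t=0}^{t=1} - i\int_0^1 z(t)\,\overline{x'(t)}\,dt = \langle z, ix'\rangle,
\]
which proves that $z \mapsto \langle Tz, x\rangle$ is continuous, thus $x \in \mathrm{dom}(T^*)$.
As an exercise, one can show along similar lines that $\mathrm{dom}(T^{**}) = \{x \in \mathrm{dom}(T^*) \mid x(0) = 0 = x(1)\} \subsetneq \mathrm{dom}(T^*)$ and $T^{**}x = ix'$, i.e., $T^{**} \subsetneq T^*$. We conclude that $T$ is not essentially self-adjoint (and $T^*$ is not symmetric).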
2.8. Example (unbounded multiplication operators): Let (Ω, Σ, µ) be a measure space,
f : Ω → R measurable, dom(T ) := {x ∈ L2 (Ω, µ) | f · x ∈ L2 (Ω, µ)}, and T x := f · x. Since f is
real-valued, T is symmetric.
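We show that $\mathrm{dom}(T)$ is dense: For $n \in \mathbb{N}$ consider $\Omega_n := \{\omega \in \Omega \mid |f(\omega)| \le n\} \in \Sigma$; then $\bigcup_{n\in\mathbb{N}} \Omega_n = \Omega$ and $V_n := \{x \in L^2(\Omega, \mu) \mid \forall \omega \in \Omega \setminus \Omega_n : x(\omega) = 0\} \subseteq \mathrm{dom}(T)$ ($n \in \mathbb{N}$); if $x \in L^2(\Omega, \mu)$, then $x_n := x\,\chi_{\Omega_n} \in V_n \subseteq \mathrm{dom}(T)$ and $x_n \to x$ in $L^2(\Omega, \mu)$ (by dominated convergence).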
So far, we have seen that T is a densely defined symmetric operator, hence T ⊆ T ∗ . We claim
that T is self-adjoint. It suffices to show that dom(T ∗ ) ⊆ dom(T ), since symmetry then implies
T ∗ ⊆ T , which yields T ∗ = T .
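Let $x \in \mathrm{dom}(T^*)$. We note that $\chi_{\Omega_n} f \in L^\infty(\Omega, \mu)$ and that $\chi_{\Omega_n} z \in \mathrm{dom}(T)$ for every $z \in \mathrm{dom}(T)$, hence we have for every $n \in \mathbb{N}$ and $z \in \mathrm{dom}(T)$,
\[
\langle z, \chi_{\Omega_n} T^*x\rangle = \langle \chi_{\Omega_n} z, T^*x\rangle = \langle T(\chi_{\Omega_n} z), x\rangle = \langle f\chi_{\Omega_n} z, x\rangle = \langle z, \chi_{\Omega_n} f x\rangle.
\]
By density of $\mathrm{dom}(T)$, we deduce that $\chi_{\Omega_n} T^*x = \chi_{\Omega_n} f x$ holds in $L^2(\Omega, \mu)$ for every $n \in \mathbb{N}$. Since $\chi_{\Omega_n} \to 1$ pointwise, we obtain $T^*x = fx$ as measurable functions almost everywhere on $\Omega$, hence $fx \in L^2(\Omega, \mu)$ (because $T^*x$ belongs to $L^2(\Omega, \mu)$), which proves $x \in \mathrm{dom}(T)$.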
To show closedness of $\mathrm{ran}(T \pm i)$, suppose $(x_n)$ is a sequence in $\mathrm{dom}(T)$ such that the sequence $((T \pm i)x_n)$ in $\mathrm{ran}(T \pm i)$ converges to $y \in H$. By continuity of $(T \pm i)^{-1}$, the Cauchy sequence $((T \pm i)x_n)$ in $\mathrm{ran}(T \pm i)$ is then mapped to the Cauchy sequence $(x_n)$ in $\mathrm{dom}(T)$. Hence $x := \lim x_n$ exists in $H$ and $Tx_n \to y \mp ix$. Since $T$ is a closed operator, $x \in \mathrm{dom}(T)$ and $y \mp ix = Tx$, i.e., $y = (T \pm i)x \in \mathrm{ran}(T \pm i)$.
2.10. Theorem: Let T be a symmetric and densely defined operator on H, then the following
are equivalent:
(i) T is self-adjoint,
(ii) T is closed and ker(T ∗ ± i) = {0},
(iii) ran(T ± i) = H.
Proof: (i) ⇒ (ii): T = T ∗ is closed by Proposition 2.4. By symmetry of T ∗ and Equation (2.1) in
the proof of Lemma 2.9 (applied to T ∗ in place of T ), we find that (T ∗ ± i)x = 0 implies x = 0.
(ii) ⇒ (iii): By the above lemma, ran(T ± i)⊥ = ker(T ∗ ∓ i) = {0} and ran(T ± i) is closed, since
T is closed. Thus, we obtain ran(T ± i) = H.
(iii) ⇒ (i): The densely defined symmetric operator T satisfies T ⊆ T ∗ , hence it suffices to show
dom(T ∗ ) ⊆ dom(T ). Let y ∈ dom(T ∗ ). Since H = ran(T − i) we can find some x ∈ dom(T ) such
that (T ∗ − i)y = (T − i)x. Due to T ⊆ T ∗ we may thus write (T ∗ − i)y = (T ∗ − i)x. The previous
lemma gives ker(T ∗ − i) = ran(T + i)⊥ = H ⊥ = {0}, which implies that T ∗ − i is injective, hence
y = x ∈ dom(T ).
The theorem provides us with a very short argument that the operator $T$ of multiplication by the real-valued measurable function $f$ from Example 2.8 is self-adjoint: The functions $1/(f \pm i)$ and $f/(f \pm i)$ are measurable and bounded, hence any $h \in L^2(\Omega, \mu)$ is in the range of $(T \pm i)$, since $(f \pm i)\cdot\frac{h}{f \pm i} = h$ is clear and $f\cdot\frac{h}{f \pm i} = \frac{f}{f \pm i}\cdot h \in L^2(\Omega, \mu)$ shows that $\frac{h}{f \pm i} \in \mathrm{dom}(T)$.
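The boundedness of the two auxiliary functions used in this argument follows from a one-line estimate, which relies only on $f$ being real-valued (an added computation):

```latex
\[
|f \pm i|^2 = f^2 + 1 \ge 1
\quad\Longrightarrow\quad
\left|\frac{1}{f \pm i}\right| = \frac{1}{\sqrt{f^2 + 1}} \le 1,
\qquad
\left|\frac{f}{f \pm i}\right| = \frac{|f|}{\sqrt{f^2 + 1}} < 1.
\]
```

Both bounds hold pointwise on $\Omega$, regardless of how large $|f|$ becomes.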
2.11. Corollary: Let T be a symmetric and densely defined operator on H, then the following
are equivalent:
(i) T is essentially self-adjoint,
(ii) ker(T ∗ ± i) = {0},
(iii) ran(T ± i) is dense in H.
Proof: (ii) ⇔ (iii): Follows directly from Lemma 2.9(i).
(i) ⇔ (ii): Recall that we have T ⊆ T ∗∗ ⊆ T ∗ = T ∗∗∗ by Corollary 2.5(i). The above theorem,
applied to T ∗∗ (which is closed), states that T ∗∗ is self-adjoint, if and only if ker(T ∗ ± i) =
ker(T ∗∗∗ ± i) = {0}.
For a densely defined symmetric operator T , the Hilbert dimensions of ker(T ∗ ± i) are called
deficiency indices of T . The corollary says that T is essentially self-adjoint, if and only if both
deficiency indices are 0. More generally, we will show in the following theorem, that self-adjoint
extensions exist, if and only if the deficiency indices are equal. Recall that ker(T ∗ ±i) = ran(T ∓i)⊥
by Lemma 2.9(i).
2.12. Theorem: A symmetric and densely defined operator $T$ on $H$ possesses self-adjoint extensions, if and only if its deficiency indices are equal, which means that $\dim\ker(T^* + i) = \dim\ker(T^* - i)$, where $\dim$ refers to the cardinality of a complete orthonormal system.
Remark on the following proof: The basic idea will be to transfer some of the properties of the bijective map $t \mapsto \frac{t+i}{t-i}$ between $\mathbb{R}$ and $\{z \in \mathbb{C} \mid |z| = 1, z \neq 1\}$, with inverse $u \mapsto i\,\frac{u+1}{u-1}$, to the level of operators via the so-called Cayley transform $U := (T + i)(T - i)^{-1}$ of $T$.
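The properties of the scalar counterpart $u = \frac{t+i}{t-i}$ of the Cayley transform $U = (T + i)(T - i)^{-1}$ can be verified directly (an added computation; the angle $\theta$ is a parameter introduced here for points of the unit circle):

```latex
\[
u = \frac{t + i}{t - i}\ \ (t \in \mathbb{R}):\qquad
|u| = \frac{\sqrt{t^2 + 1}}{\sqrt{t^2 + 1}} = 1,
\qquad
u - 1 = \frac{2i}{t - i} \neq 0,
\]
\[
u(t - i) = t + i \iff t(u - 1) = i(u + 1) \iff t = i\,\frac{u + 1}{u - 1},
\qquad
u = e^{i\theta}\ (0 < \theta < 2\pi):\quad i\,\frac{e^{i\theta} + 1}{e^{i\theta} - 1} = \cot\frac{\theta}{2} \in \mathbb{R}.
\]
```

Thus real numbers correspond to points of the unit circle other than $1$, which is the scalar model for passing between self-adjoint and unitary operators.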
Proof: Suppose that S is a self-adjoint extension of T . Recall from (2.1) that T − i is injective,
hence U := (T + i)(T − i)−1 can be defined on dom(U ) := ran(T − i). By Theorem 2.10,
ran(S − i) = H and we may define V := (S + i)(S − i)−1 on dom(V ) := H. Equation (2.1)
and Theorem 2.10 applied to S show that V is isometric and surjective, thus unitary. Since
T ⊆ S, V maps ran(T − i) onto ran(T + i); moreover, as unitary operator, V also maps the
respective orthogonal complements onto one another. Therefore, Lemma 2.9(i) implies that V
maps ker(T ∗ + i) = ran(T − i)⊥ unitarily onto ker(T ∗ − i) = ran(T + i)⊥ , hence the deficiency
indices have to be equal.
In proving the reverse implication, start by supposing dim ker(T ∗ + i) = dim ker(T ∗ − i). We learn
from (2.1) that
k(T + i)xk = k(T − i)xk ∀x ∈ dom(T ),
therefore, U : ran(T − i) → ran(T + i), (T − i)x 7→ (T + i)x is a well-defined isometric (hence
injective) and surjective operator, which can be uniquely extended to an isometry between the
corresponding closures of the subspaces. Appealing to Lemma 2.9(i), we obtain dim ran(T − i)⊥ =
dim ker(T ∗ + i) = dim ker(T ∗ − i) = dim ran(T + i)⊥ , which allows us to extend U to a unitary
operator V : H → H by simply mapping a complete orthonormal system of ran(T − i)⊥ into such
for ran(T + i)⊥ . We proceed in four steps:
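3. $S$ is symmetric: If $x \in \mathrm{dom}(S) = \mathrm{ran}(V - I)$, say $x = (V - I)y$ for some $y \in H$, then
\[
\langle Sx, x\rangle = \langle i(V + I)y, (V - I)y\rangle = i\bigl(\|Vy\|^2 - \langle Vy, y\rangle + \langle y, Vy\rangle - \|y\|^2\bigr) = i\bigl(\langle y, Vy\rangle - \overline{\langle y, Vy\rangle}\bigr) = -2\,\mathrm{Im}\,\langle y, Vy\rangle
\]
is a real number, in particular $\langle Sx, x\rangle = \overline{\langle Sx, x\rangle} = \langle x, Sx\rangle$. As an exercise, one can show that this implies symmetry of $S$ (e.g., by comparing the expanded expressions for $\langle S(x_1 + x_2), x_1 + x_2\rangle = \langle x_1 + x_2, S(x_1 + x_2)\rangle$ with arbitrary $x_1, x_2 \in \mathrm{dom}(S)$).
4. $S$ is self-adjoint: We show that $\mathrm{ran}(S \pm i) = H$, then the proof is complete by Theorem 2.10. For every $x \in H$, we have $(S - i)(V - I)x = i(V + I)x - i(V - I)x = 2ix$ and $(S + i)(V - I)x = i(V + I)x + i(V - I)x = 2iVx$, thus,
\[
x = (S - i)\Bigl(\frac{1}{2i}(V - I)x\Bigr) = (S + i)\Bigl(\frac{1}{2i}(V - I)V^*x\Bigr),
\]
which shows $\mathrm{ran}(S \pm i) = H$.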
and shows that the deficiency indices are different. By Corollary 2.11, T is not essentially self-
adjoint and Theorem 2.12 shows that there is no self-adjoint extension of T .
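2) In Example 2.8 put $\Omega = \mathbb{R}^n$, $\Sigma = \mathcal{B}(\mathbb{R}^n)$, $\mu$ the Lebesgue measure, and $f : \mathbb{R}^n \to \mathbb{R}$, $f(\xi) = |\xi|^2$. We immediately obtain that the operator $M$ of multiplication by $f$ with domain $\mathrm{dom}(M) = \{\psi \in L^2(\mathbb{R}^n) \mid \xi \mapsto |\xi|^2\psi(\xi) \text{ belongs to } L^2(\mathbb{R}^n)\}$ is self-adjoint.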
Consider M0 ϕ := f · ϕ with dense domain
then $M_0$ is symmetric, densely defined, and $M_0 \subseteq M$. Obviously, $\mathrm{ran}(M_0 \pm i) \supseteq \mathcal{D}(\mathbb{R}^n)$ shows that these ranges are dense (since $\mathcal{D}(\mathbb{R}^n)$ is dense in $L^2(\mathbb{R}^n)$, again by [Wer18, Lemma V.1.10]), hence Corollary 2.11 implies that $M_0$ is essentially self-adjoint. Its unique self-adjoint extension and closure is given by $M$, i.e., $M = M_0^{**}$ (refer to Proposition 2.4 and directly show $\mathrm{gr}(M) \subseteq \overline{\mathrm{gr}(M_0)}$).
Well-known properties of the Fourier transform imply
\[
(\mathcal{F}\Delta h)(\xi) = \sum_{j=1}^n \mathcal{F}(\partial_j^2 h)(\xi) = \sum_{j=1}^n (i\xi_j)^2\,\widehat{h}(\xi) = -\sum_{j=1}^n \xi_j^2\,\widehat{h}(\xi) = -|\xi|^2\,\widehat{h}(\xi) = -(M\widehat{h})(\xi),
\]
thus $M$ is unitarily equivalent to the operator $-\Delta$ on the domain $H^2(\mathbb{R}^n) := \mathcal{F}^{-1}\mathrm{dom}(M)$, which can be identified with the Sobolev space of order 2 (i.e., the $L^2$-functions with weak derivatives up to order 2 belonging to $L^2$). We deduce that $-\Delta$ with domain $H^2(\mathbb{R}^n) \subseteq L^2(\mathbb{R}^n)$ is self-adjoint.
2.14. Revisiting the example revisited in Example 2.7: We study the self-adjoint extensions of $Tx = ix'$ with $\mathrm{dom}(T) = \{x \in C^1([0,1]) \mid x(0) = 0 = x(1)\} \subseteq L^2([0,1])$.
The deficiency spaces $\ker(T^* \pm i)$ are easily determined by solving $y' \pm y = 0$ for $y \in \mathrm{dom}(T^*)$. Both basic solutions $t \mapsto e^{\mp t} =: e_\mp(t)$ belong to $\mathrm{dom}(T^*)$, which yields
\[
\ker(T^* \pm i) = \mathrm{span}\{e_\mp\},
\]
so both deficiency indices are equal to 1.
The operator $U$ constructed in the proof of Theorem 2.12 maps $\mathrm{ran}(T - i) \to \mathrm{ran}(T + i)$, $i(x' - x) \mapsto i(x' + x)$ for any $x \in \mathrm{dom}(T)$. It is extended to a unitary operator $V$ on $L^2([0,1])$ by assigning a fixed normalized vector in $\mathrm{ran}(T - i)^{\perp}$, say $\widetilde{e}_- := e\,c_0\,e_-$ with $c_0 := \sqrt{2/(e^2 - 1)}$, to a normalized vector in $\mathrm{ran}(T + i)^{\perp}$, say $V(\widetilde{e}_-) := e^{i\gamma}\cdot\widetilde{e}_+$ for some $\gamma \in \mathbb{R}$, where $\widetilde{e}_+ := c_0\,e_+$. Every $\gamma \in [0, 2\pi[$ gives a different extension $V$ and these are all possible unitary extensions of $U$. The corresponding self-adjoint extension $S \supset T$ is then given on $\mathrm{dom}(S) = \mathrm{ran}(V - I)$ by $(V - I)z \mapsto i(V + I)z$.
Since $T^{**}$ is the closure of $T$, the self-adjoint extensions of $T$ and $T^{**}$ agree. The latter can be shown to be described by the family of operators $S_\alpha$ ($\alpha \in \mathbb{C}$, $|\alpha| = 1$) with $S_\alpha x = ix'$ and $\mathrm{dom}(S_\alpha) = \{x : [0,1] \to \mathbb{C} \mid x \text{ is absolutely continuous and } x' \in L^2([0,1]),\ x(0) = \alpha x(1)\}$ (see, e.g., [Sch00, 5.3, Beispiel 3 and 5.1, Beispiel 4, Beispiel 5] or [RS75, Section X.1, Example 1], or also [Con10, Chapter X, Examples 2.21 and 1.11]).
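The normalization constants appearing in this example can be checked by elementary integration (an added computation for $e_\pm(t) = e^{\pm t}$ on $[0,1]$):

```latex
\[
\|e_+\|_2^2 = \int_0^1 e^{2t}\,dt = \frac{e^2 - 1}{2},
\qquad
\|e_-\|_2^2 = \int_0^1 e^{-2t}\,dt = \frac{1 - e^{-2}}{2} = \frac{e^2 - 1}{2e^2},
\]
\[
c_0 = \sqrt{\frac{2}{e^2 - 1}}:\qquad
\|c_0\,e_+\|_2 = 1,
\qquad
\|e\,c_0\,e_-\|_2 = 1.
\]
```

The extra factor $e$ in the normalization of $e_-$ thus compensates for the factor $e^{-2}$ in its squared norm.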
2.15. Extension of the notion of a spectrum: In the most general case, the definition of the
spectrum of an operator which is unbounded or/and not defined on the whole Hilbert space is not
completely uniform in the literature. Thus, let us illustrate two technical issues before giving a
definition.
Artefacts with non-dense domains: Let $H = \ell^2(\mathbb{N})$ and consider the left-shift, given by $(Lx)_n = x_{n+1}$ for $x = (x_n) \in \ell^2(\mathbb{N})$. The bounded operator $L \in L(H)$ clearly has $\ker(L) = \mathrm{span}\{e_1\}$, in particular $0 \in \sigma(L)$. If $\mathrm{dom}(L_0) := \{e_1\}^{\perp}$ and $L_0$ is $L$ restricted to $\mathrm{dom}(L_0)$, then we have the bounded inverse $R_0 : \ell^2(\mathbb{N}) \to \mathrm{dom}(L_0)$ given by the right-shift. Hence we could not consider 0 to be a spectral value of $L_0$. However, if $\mathrm{dom}(L_1)$ is any dense subspace of $\ell^2(\mathbb{N})$ and $L_1$ is the corresponding restriction of $L$, then any sequence from $\mathrm{dom}(L_1)$ approximating $e_1$ can be used to show that $L_1$ is not continuously invertible, hence 0 could be considered a spectral value of $L_1$. This indicates why we will define the spectrum only for densely defined operators.
Non-closed operators have empty resolvent sets: Let T : dom(T ) → H be a linear operator
(not necessarily densely defined) and suppose that λ ∈ C is such that λ − T is bijective dom(T ) →
H with bounded inverse. We will show that T has to be closed.
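We first show that $\lambda - T$ is closed: Let $(x_n)$ be a sequence in $\mathrm{dom}(T) = \mathrm{dom}(\lambda - T)$ such that $x_n \to x$ in $H$ and $(\lambda - T)x_n \to y$ in $H$. Then $z_n := (\lambda - T)x_n$ belongs to $(\lambda - T)(\mathrm{dom}(T))$ and $z_n \to y$, hence continuity of $(\lambda - T)^{-1}$ implies $x_n = (\lambda - T)^{-1}z_n \to (\lambda - T)^{-1}y$; thus $x = (\lambda - T)^{-1}y \in \mathrm{dom}(T)$ and $(\lambda - T)x = y$. Closedness of $T$ now follows: if $(x_n)$ is a sequence in $\mathrm{dom}(T)$ with $x_n \to x$ and $Tx_n \to z$ in $H$, then
\[
(\lambda - T)x_n = \lambda x_n - Tx_n \to \lambda x - z,
\]
so that $x \in \mathrm{dom}(T)$ and $(\lambda - T)x = \lambda x - z$, i.e., $Tx = z$.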
Remark: (i) For any closed subset S ⊆ C there is some closed densely defined operator T on a
separable Hilbert space H such that σ(T ) = S (see [Wei00, Beispiel 5.10]).
(ii) If T is closed, then it suffices to check whether λ − T is bijective to conclude λ ∈ ρ(T ) (this
follows from [Wer18, Satz IV.4.4]).
(iii) By the observation made above, σ(T ) = C for every non-closed densely defined operator T .
In particular, σ(T ) need not be compact for unbounded operators. Neither is it guaranteed that
the spectrum of an unbounded operator is non-empty. (We will give an example below.)
Rλ − Rµ = (µ − λ)Rλ Rµ ,
Proof: Clearly, (iii) follows from (i) due to $\sigma(T) = \mathbb{C} \setminus \rho(T)$. The proofs of (i) and (ii) are very similar to the case of bounded $T$, since, by definition, the resolvent $R_\lambda = (\lambda - T)^{-1}$ belongs to $L(H)$ for every $\lambda \in \rho(T)$ and maps $H$ into $\mathrm{dom}(T)$: In detail, we obtain the resolvent equation in (ii) from
\[
R_\lambda - R_\mu = R_\lambda(\mu - T)R_\mu - R_\lambda(\lambda - T)R_\mu = R_\lambda\bigl((\mu - T) - (\lambda - T)\bigr)R_\mu = (\mu - \lambda)R_\lambda R_\mu.
\]
As for (i), suppose $\lambda \in \rho(T)$ and $\mu \in \mathbb{C}$ satisfies $|\lambda - \mu| < 1/\|R_\lambda\|$. Then $\|(\lambda - \mu)R_\lambda\| < 1$ and hence $1 - (\lambda - \mu)R_\lambda$ is invertible with inverse given by the Neumann series, thus as a power series in the variable $\mu$ with coefficients in $L(H)$. The resolvent equation would suggest $R_\lambda = R_\mu(1 - (\lambda - \mu)R_\lambda)$, and indeed $S := R_\lambda(1 - (\lambda - \mu)R_\lambda)^{-1} \in L(H)$ is a power series in the variable $\mu$ such that $(\mu - T)Sx = S(\mu - T)x = x$ holds for all $x$ in the dense subspace $\mathrm{dom}(T) \subseteq H$ (note that $R_\lambda$ commutes with $(1 - (\lambda - \mu)R_\lambda)^{-1}$ and use $\mu - T = \mu - \lambda + \lambda - T$). Therefore $\mu \in \rho(T)$, thus $\rho(T)$ is open. Obviously, we have also proven the claim of analyticity in (ii) along the way.
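Written out, the Neumann series argument in the proof of (i) reads as follows (an added worked step; the abbreviation $N$ is introduced here only for readability):

```latex
\[
N := \bigl(1 - (\lambda - \mu)R_\lambda\bigr)^{-1} = \sum_{k=0}^{\infty} (\lambda - \mu)^k R_\lambda^k,
\qquad
S = R_\lambda N = \sum_{k=0}^{\infty} (\lambda - \mu)^k R_\lambda^{k+1},
\]
\[
(\mu - T)S = \bigl((\mu - \lambda) + (\lambda - T)\bigr)R_\lambda N
= -(\lambda - \mu)R_\lambda N + N
= \bigl(1 - (\lambda - \mu)R_\lambda\bigr)N = I.
\]
```

The series converges in $L(H)$ because $\|(\lambda - \mu)R_\lambda\| < 1$, and the second line uses $(\lambda - T)R_\lambda = I$ on $H$.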
Examples: 1) On $H = L^2([0,1])$ consider $Tx = ix'$ with
\[
\mathrm{dom}(T) := \{x : [0,1] \to \mathbb{C} \mid x \text{ is absolutely continuous},\ x' \in L^2([0,1]),\ x(0) = 0\}.
\]
The domain is dense and $T$ is a closed operator (without proof, but recommended as an exercise). We claim that $\sigma(T) = \emptyset$.
Let $\lambda \in \mathbb{C}$ and $x \in L^2([0,1])$ be arbitrary and consider the equation $(\lambda - T)y = x$, which translates into the ordinary differential equation $y' + i\lambda y = ix$. The solutions to the homogeneous equation are complex multiples of the function $t \mapsto \exp(-i\lambda t)$ and the usual “variation of constant” and adaptation to the “initial condition” $y(0) = 0$ yields the explicit solution formula
\[
y(t) = i\int_0^t e^{i\lambda(s - t)}\,x(s)\,ds =: (S_\lambda x)(t) \qquad (t \in [0,1]).
\]
We see that indeed this defines a solution $y \in \mathrm{dom}(T)$ and thus $S_\lambda : L^2([0,1]) \to \mathrm{dom}(T)$ is the inverse of $\lambda - T$. Closedness of $T$ (hence of $\lambda - T$) implies boundedness of $S_\lambda$ (again by [Wer18, Satz IV.4.4]), but alternatively it is not difficult to see directly that $\|S_\lambda x\|_2 \le e^{|\mathrm{Im}(\lambda)|}\|x\|_2$. Thus, $S_\lambda$ is the resolvent $R_\lambda$ and $\lambda \in \rho(T)$.
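Both the solution property and the norm bound for $S_\lambda$ can be verified in two lines (an added computation based on the solution formula $y(t) = i\int_0^t e^{i\lambda(s-t)}x(s)\,ds$ above):

```latex
\[
y'(t) = i\,x(t) - i\lambda\,y(t)\ \ \text{(almost everywhere)},
\qquad
(\lambda - T)y = \lambda y - iy' = \lambda y + x - \lambda y = x,
\]
\[
|e^{i\lambda(s - t)}| = e^{\mathrm{Im}(\lambda)(t - s)} \le e^{|\mathrm{Im}(\lambda)|}\ \ (0 \le s \le t \le 1)
\quad\Longrightarrow\quad
|y(t)| \le e^{|\mathrm{Im}(\lambda)|}\int_0^1 |x(s)|\,ds \le e^{|\mathrm{Im}(\lambda)|}\,\|x\|_2.
\]
```

Integrating the square of the last pointwise bound over $[0,1]$ gives $\|S_\lambda x\|_2 \le e^{|\mathrm{Im}(\lambda)|}\|x\|_2$; the final inequality in the display is the Cauchy-Schwarz inequality on $[0,1]$.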
2) Let (Ω, Σ, µ) be a σ-finite4 measure space and f : Ω → R be measurable. Consider the multi-
plication operator T x = f x on L2 (Ω, µ) with domain dom(T ) = {x ∈ L2 (Ω, µ) | f x ∈ L2 (Ω, µ)}.
By Example 2.8, T is self-adjoint.
We claim that σ(T ) = {λ ∈ R | ∀ε > 0 : µ(f^{−1}([λ − ε, λ + ε])) > 0} =: essential range of f .
If λ ∈ C is not in the essential range of f , then 1/(λ − f ) ∈ L∞ (Ω, µ) and the corresponding
multiplication operator is bounded; moreover, it is an inverse of λ − T , since
    f y/(λ − f ) = −y + (λ/(λ − f )) y ∈ L2 (Ω, µ) + L∞ (Ω, µ) · L2 (Ω, µ) ⊆ L2 (Ω, µ)
for every y ∈ L2 (Ω, µ). Therefore, λ ∈ ρ(T ).
[Footnote 4: i.e., ∃Ωn ∈ Σ (n ∈ N): µ(Ωn ) < ∞ and ⋃_{n∈N} Ωn = Ω.]
If λ ∈ ρ(T ), then we may argue pointwise µ-almost everywhere to show that the resolvent Rλ =
(λ − T )^{−1} : L2 (Ω, µ) → dom(T ) has to be given as operator of multiplication by the function
h := 1/(λ − f ). Note that for any E ∈ Σ of finite measure we have
    ∫_E |h(ω)|^2 dµ(ω) = ∫_Ω |h(ω)χE (ω)|^2 dµ(ω) = ‖Rλ χE ‖_2^2 ≤ ‖Rλ ‖^2 ‖χE ‖_2^2 = ‖Rλ ‖^2 µ(E).
    ‖Rλ ‖^2 µ(E) ≥ ∫_E |h(ω)|^2 dµ(ω) ≥ ((n + 1)^2/n^2) ‖Rλ ‖^2 ∫_E 1 dµ = ((n + 1)^2/n^2) ‖Rλ ‖^2 µ(E),
hence µ(E) = 0 (since certainly Rλ ≠ 0); therefore, µ(An ) = 0 for every n ∈ N, and ⋃_{n∈N} An =
{ω ∈ Ω | |h(ω)| > ‖Rλ ‖} has measure 0.
We conclude that 1/|λ − f (ω)| = |h(ω)| ≤ ‖Rλ ‖ holds for µ-almost all ω ∈ Ω, which implies that λ
cannot belong to the essential range of f .
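For a purely atomic measure on a finite set the essential range can be computed directly, which may help to see why null sets do not contribute to the spectrum. The following is only a sketch under that simplifying assumption; the dictionaries f and weights and both function names are hypothetical and chosen for illustration.

```python
def essential_range(f, weights):
    """Essential range of f for a measure on a finite set given by point masses.

    Only values taken on atoms of positive mass count; values on null sets are ignored.
    """
    return sorted({f[w] for w in f if weights[w] > 0})

def resolvent_norm(lam, f, weights):
    """Essential supremum of |1/(lam - f)|, i.e. the norm of the multiplication operator."""
    ess = essential_range(f, weights)
    if lam in ess:
        raise ValueError("lam lies in the essential range, so lam - T has no bounded inverse")
    return max(1 / abs(lam - v) for v in ess)

# sample data (assumption): the point "b" carries no mass
f = {"a": 0.0, "b": 1.0, "c": 5.0}
weights = {"a": 1.0, "b": 0.0, "c": 2.0}

print(essential_range(f, weights))        # the value f("b") = 1.0 is not listed
print(resolvent_norm(1.0, f, weights))    # finite, although 1.0 is a value of f
```

In this toy situation λ = 1 is a value of f but not in its essential range, so it belongs to the resolvent set, and the norm of the resolvent equals the reciprocal of the distance from λ to the essential range.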
T is self-adjoint ⇐⇒ σ(T ) ⊆ R.
‖(z − T )x‖^2 = ‖(λ + iµ)x − (λ + µS)x‖^2 = ‖iµx − µSx‖^2 = µ^2 ‖(i − S)x‖^2 ≥ µ^2 ‖x‖^2 .
From this we learn that the inverse (z − T )^{−1} exists as linear map ran(z − T ) → dom(T ) and is
bounded. Since S is self-adjoint, ran(z − T ) = ran(i − S) = H by Theorem 2.10 and therefore
z ∈ ρ(T ).
Thus, symmetric operators need not have real spectrum. Recall that non-closed operators have
spectrum equal to C, hence, in particular, any non-closed densely defined symmetric operator T
has σ(T ) = C ⊈ R (for example, T x = ix′ on L2 ([0, 1]) with dom(T ) = {x ∈ C1 ([0, 1]) | x(0) =
0 = x(1)}, which has been studied repeatedly above).
3. The spectral theorem for
unbounded self-adjoint operators
3.1. Theorem (Multiplication operator version of the spectral theorem): Let the oper-
ator T : dom(T ) → H be self-adjoint, then there exists a measure space (Ω, Σ, µ), a measurable
function f : Ω → R, and a unitary operator U : H → L2 (Ω, µ) such that
(a) ∀x ∈ H: x ∈ dom(T ) if and only if f · U x ∈ L2 (Ω, µ),
(b) T is unitarily equivalent via U to the multiplication operator Mf ϕ := f ϕ on L2 (Ω, µ) with
dom(Mf ) := {ϕ ∈ L2 (Ω, µ) | f ϕ ∈ L2 (Ω, µ)}, i.e., for every ϕ ∈ dom(Mf ),
U T U −1 ϕ = Mf ϕ = f ϕ µ-almost everywhere.
Proof: By Theorem 2.15, σ(T ) ⊆ R and therefore R := (T + i)^{−1} and (T − i)^{−1} exist as bounded
operators H → dom(T ). The plan of the proof is to show that R is normal, apply the spectral
theorem for bounded normal operators to R, and “map” everything back to T via the formal
relation T = R^{−1} − i.
Let z1 , z2 ∈ H arbitrary. There exist x, y ∈ dom(T ) such that z1 = (T + i)x and z2 = (T − i)y
and we obtain
i.e., R∗ = (T − i)^{−1} and the resolvent equation, Proposition 2.15(ii), shows that
(∗)    RR∗ = R∗ R = (1/(2i)) (R∗ − R).
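The two computations behind this step can be sketched as follows. This is a reconstruction from the surrounding definitions (with x = Rz1 ∈ dom(T ), z2 = (T − i)y, and Rλ = (λ − T )^{−1}), not necessarily the display used originally; it assumes the resolvent equation in the form Rλ − Rµ = (µ − λ)Rλ Rµ and uses that the scalar product is conjugate-linear in its second argument.

```latex
% Adjoint of R = (T+i)^{-1}, using the symmetry of T:
\langle R z_1, z_2\rangle
  = \langle x, (T - i)y\rangle
  = \langle Tx, y\rangle + i\,\langle x, y\rangle
  = \langle (T + i)x, y\rangle
  = \langle z_1, (T - i)^{-1} z_2\rangle .

% Relation (*), writing R = -R_{-i} and R^* = -R_{i}:
R R^* = R_{-i} R_{i}
  = \frac{1}{2i}\,\bigl(R_{-i} - R_{i}\bigr)
  = \frac{1}{2i}\,\bigl(R^* - R\bigr).
```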
Thus, R is normal and by Theorem 1.25 there is a measure space (Ω, Σ, µ), a unitary map U : H →
L2 (Ω, µ), and a bounded measurable function g : Ω → C such that U R U −1 = Mg , where Mg
denotes the operator of multiplication by g on L2 (Ω, µ).
Being an inverse map, R is injective, hence so is Mg , and N := {ω ∈ Ω | g(ω) = 0} has to be a
µ-null set. Therefore, f (ω) := 1/g(ω) − i is defined for µ-almost every ω ∈ Ω. Moreover, from (∗)
we obtain
    |g|^2 = (1/(2i)) (\overline{g} − g) = (1/(2i)) (−2i Im(g)) = − Im(g)
and may deduce that
in particular, f can be considered as a real-valued measurable function f : Ω → R such that
f g = 1 − ig holds µ-almost everywhere, which also shows that f g is essentially bounded.
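The reason why f is real-valued can be spelled out in one line. The following is a sketch reconstructed from f = 1/g − i and the identity |g|^2 = − Im(g); it holds wherever g ≠ 0, i.e., µ-almost everywhere.

```latex
\operatorname{Im}(f)
  = \operatorname{Im}\!\left(\frac{\overline{g}}{|g|^2}\right) - 1
  = -\frac{\operatorname{Im}(g)}{|g|^2} - 1
  = \frac{|g|^2}{|g|^2} - 1
  = 0 .
```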
We prove (a): Let x ∈ dom(T ), then there is y ∈ H such that x = Ry and we obtain
3.2. Example: [As earlier in Example 1.24 we will make use of basic facts about the Fourier
transform and refer again to the course on Real Analysis or [Con16, Fol99].]
We consider the operator T x = −ix′ with dom(T ) = S (R) ⊆ L2 (R).
Let F denote the Fourier transform as unitary operator L2 (R) → L2 (R), then F(S (R)) = S (R)
and, similarly to Example 2.13, 2), we obtain (F T F^{−1} ϕ)(t) = t ϕ(t) (from the so-called “exchange”
of differentiation with multiplication by the Fourier transform), i.e., the unitary equivalence of T
with the operator M of multiplication by the real variable, with dom(M ) = S (R). Clearly, T and
M are symmetric and densely defined; moreover, both subspaces ran(T ± i) = F−1 (ran(M ± i))
and ran(M ± i) are dense in L2 (R), since the latter obviously contains D(R). Thus, T and M
are essentially self-adjoint by Corollary 2.11 and their unique self-adjoint extensions are given by
(their closures) T ∗ and M ∗ .
We claim that dom(M∗) = {ψ ∈ L2 (R) | t ↦ t ψ(t) ∈ L2 (R)}: It is clear that any ψ ∈ L2 (R) such
that t ↦ t ψ(t) ∈ L2 (R) belongs to dom(M∗), since we may then write ⟨M ϕ, ψ⟩ = ∫_R ϕ(t) t \overline{ψ(t)} dt
for every ϕ ∈ dom(M ) = S (R) and obtain |⟨M ϕ, ψ⟩| ≤ (∫_R t^2 |ψ(t)|^2 dt)^{1/2} ‖ϕ‖_2 . To show the
(Note that we have lψ (ϕ) = ⟨ϕ, η⟩ for every ϕ ∈ L2 (R), but at this moment we have the formula
lψ (ϕ) = ∫_R tϕ(t) \overline{ψ(t)} dt only if ϕ ∈ S (R).) Since both t ↦ ψ(t) and η are integrable on every
compact subset of R, we may use the following fact, which is shown later in the course of an injectivity
argument in Example 1) of 7.5: If f1 and f2 are locally integrable functions on R such that
∫_R ϕf1 = ∫_R ϕf2 holds for every ϕ ∈ S (R), then f1 = f2 almost everywhere. Thus, (∗) implies that
t ψ(t) = η(t) for almost all t, hence the function t ↦ t ψ(t) belongs to L2 (R).
We also obtain (M∗ψ)(t) = t ψ(t) and recognize M∗ as a special case of the self-adjoint multiplication
operator in Example 2.8. Therefore, T∗x = −ix′ on the domain H^1(R) := F^{−1}(dom(M∗)),
which can be shown to consist of all L2 -functions that are absolutely continuous on compact
intervals and have derivative in L2 (R)—this is the (L2 -based) Sobolev space of order 1 on R.
3.3. Spectral representation of an unbounded self-adjoint operator: Let T be self-adjoint
and Mf as in Theorem 3.1. If h : R → C is a bounded measurable function, then h ◦ f : Ω → C
is bounded measurable and the multiplication operator h(Mf ) := Mh◦f of multiplication by
h ◦ f is bounded on L2 (Ω, µ). The map h ↦ h(Mf ) is continuous and multiplicative Bb (R) →
L(L2 (Ω, µ)), since clearly ‖M_{h◦f} ‖ ≤ ‖h‖_∞ and (h1 · h2 ) ◦ f = (h1 ◦ f ) · (h2 ◦ f ).
For any Borel subset A ⊆ R we put
since the latter obviously holds for step functions and the multiplication operator action reduces
the question about the relevant limits to the approximation of bounded measurable functions by
step functions. In general, F will not have compact support.
Let EA := U^{−1} FA U (A ∈ B(R)), then E : A ↦ EA is a spectral measure B(R) → L(H) and for
any bounded measurable function h : R → C we have
    ∫_R h(λ) dEλ = U^{−1} h(Mf )U,
again by extending the obvious relation for characteristic functions to step functions and further
to bounded measurable functions via approximation. We put h(T ) := U^{−1} h(Mf )U and obtain
the usual properties of a functional calculus for the map h ↦ h(T ).
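In finite dimensions the construction h(T ) = U^{−1} h(Mf )U reduces to applying h to the eigenvalues. The following is a minimal sketch of this special case only; the 2 × 2 matrix, its hand-computed eigenpairs, and all function names are assumptions made for illustration.

```python
import math

# Real symmetric sample matrix T = [[2, 1], [1, 2]] (assumption) with
# eigenvalues 1 and 3 and orthonormal eigenvectors computed by hand.
s = 1 / math.sqrt(2)
eigenpairs = [(1.0, (s, -s)), (3.0, (s, s))]

def projection(v):
    # orthogonal projection onto span{v} for a real unit vector v
    return [[v[i] * v[j] for j in range(2)] for i in range(2)]

def h_of_T(h):
    """h(T) = sum over eigenvalues lam of h(lam) * P_lam (finite spectral sum)."""
    result = [[0.0, 0.0], [0.0, 0.0]]
    for lam, v in eigenpairs:
        P = projection(v)
        for i in range(2):
            for j in range(2):
                result[i][j] += h(lam) * P[i][j]
    return result

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def close(A, B, tol=1e-9):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

T = h_of_T(lambda t: t)                       # h(t) = t recovers T itself
T_squared = h_of_T(lambda t: t * t)           # h(t) = t^2 gives T*T
print(close(T, [[2, 1], [1, 2]]), close(T_squared, matmul(T, T)))
```

The multiplicativity (h1 · h2 )(T ) = h1 (T ) h2 (T ) can be observed in the same way by comparing h_of_T of a product with the matrix product of the two factors.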
We will extend this functional calculus to allow for a measurable unbounded real-valued function
h : R → R, thereby obtaining a self-adjoint unbounded operator h(T ) on the dense domain
Dh := {x ∈ H | ∫_R |h(λ)|^2 d⟨Eλ x, x⟩ < ∞} (the density is not obvious and will be proved below).
Putting ν(B) := ∫_B ϕ \overline{ψ} dµ we may interpret the last expression above as image measure f (ν)(A) =
ν(f^{−1}(A)), hence ⟨EA x, y⟩ = f (ν)(A). The transformation formula for the integral of a
d⟨Eλ x, y⟩-integrable function g : R → R with respect to image measures then gives
(3.1)    ∫_R g(λ) d⟨Eλ x, y⟩ = ∫_Ω (g ◦ f ) dν = ∫_Ω (g ◦ f ) · ϕ · \overline{ψ} dµ.
Applying the above equation to |h|^2 we find that x ∈ Dh , if and only if ϕ = U x satisfies
∫_Ω |h ◦ f |^2 |ϕ|^2 dµ < ∞. Note that Example 2.8, now with h ◦ f in place of f there, shows that the set
of such ϕ, namely, dom(M_{h◦f} ), is dense in L2 (Ω, µ). Therefore, the corresponding set of vectors
x = U^{−1} ϕ, which is Dh , is dense in H.
For every x ∈ Dh and y ∈ H, the integral ∫_R h(λ) d⟨Eλ x, y⟩ exists: Using ϕ = U x, ψ = U y and
the analogue of (3.1) for the variations of the complex measures, we have
    ∫_R |h(λ)| d|⟨Eλ x, y⟩| = ∫_Ω |h ◦ f | · |ϕ| · |ψ| dµ ≤ (∫_Ω |h ◦ f |^2 · |ϕ|^2 dµ)^{1/2} (∫_Ω |ψ|^2 dµ)^{1/2}
        = (∫_R |h(λ)|^2 d⟨Eλ x, x⟩)^{1/2} ‖U y‖_2 < ∞,    where ‖U y‖_2 = ‖y‖.
Hence, for every x ∈ Dh there is a unique vector z ∈ H such that ∫_R h(λ) d⟨Eλ x, y⟩ = ⟨z, y⟩ for
every y ∈ H; we set h(T )x := z.
This defines an operator h(T ) on H with dense domain Dh . The formal notation h(T ) =
∫_R h(λ) dEλ is commonly used, although this integral is in general not convergent with respect
to the operator norm; the precise statement is Equation (3.2).
The special case h(t) = t gives x ∈ Dh , if and only if ϕ = U x satisfies f ϕ ∈ L2 (Ω, µ), i.e.,
U x ∈ dom(Mf ), equivalently x ∈ dom(T ) by Theorem 3.1; moreover, we have
    ∫_R λ d⟨Eλ x, y⟩ = ∫_Ω f ϕ \overline{ψ} dµ = ⟨Mf ϕ, ψ⟩ = ⟨Mf U x, U y⟩ = ⟨U^{−1} Mf U x, y⟩ = ⟨T x, y⟩.
Similarly, one may show that ⟨h(T )x, y⟩ = ⟨U^{−1} M_{h◦f} U x, y⟩ holds for all x ∈ Dh , y ∈ H, which
implies also the self-adjointness of h(T ) due to the established unitary equivalence with the
self-adjoint multiplication operator M_{h◦f} with domain
dom(M_{h◦f} ) = {ϕ ∈ L2 (Ω, µ) | ∫_Ω |h ◦ f |^2 |ϕ|^2 dµ < ∞}.
    ⟨h(T )x, y⟩ = ∫_R h(λ) d⟨Eλ x, y⟩    ∀x ∈ Dh , y ∈ H.
3.4. A glimpse at the formalism of quantum mechanics: [We discuss here the formalism
based directly on Hilbert spaces containing the physical states as in the so-called Schrödinger
picture described in [Kab14, Hal13, Tes14b, Wei00]. The more abstract algebraic approach is based
on C ∗ -algebras of observables and linear functionals on these as states (see, e.g., [Ara99, Thi03]).]
The basic idea is that (the state of) a single particle in Rd is described by a so-called complex wave
function Rd × R → C, (x, t) ↦ ψ(x, t), where ρt (x) := |ψ(x, t)|^2 is interpreted as the probability
density associated with the (state of the) particle at time t. Thus, the number
    ∫_K |ψ(x, t)|^2 dx = ∫_{Rd} χK (x)|ψ(x, t)|^2 dx
corresponds to the probability of finding the particle (in state ψ) at time t inside the measurable
region K ⊆ Rd and, in particular, ∫_{Rd} |ψ(x, t)|^2 dx = 1, since the particle has to be somewhere.
To put it differently, ψ(., t) ∈ L2 (Rd ) for every t ∈ R, ‖ψ(., t)‖_{L2} = 1, and the value of the L2 -inner
product ⟨χK ψ(., t), ψ(., t)⟩ gives the expectation of the random variable x ↦ χK (x) at time t (for
a particle in state ψ). The bounded linear operator ψ ↦ χK ψ is but one example of an observable,
the most prominent other examples are the unbounded operators of spatial (coordinate) position
Xj : ψ ↦ xj ψ and momentum (coordinates) Pj : ψ ↦ (ℏ/i) ∂_{xj} ψ (ψ ∈ S (Rd )).
In more abstract terms, the quantum mechanical (vector) states are represented by elements ψ
of a complex (usually separable) Hilbert space H that are normalized such that ‖ψ‖ = 1. The
observables are represented by densely defined linear operators on H. (Densely defined, since the
observable should be “observable in most of the possible states”.) The expectation (or mean value
of measurement) of an observable A in the state ψ ∈ dom(A) is given by Eψ (A) := ⟨Aψ, ψ⟩.
In order to guarantee that all expectation values are real numbers, we suppose that A is symmetric
(then ⟨Aψ, ψ⟩ = ⟨ψ, Aψ⟩ = \overline{⟨Aψ, ψ⟩}). But there are clear indications to require even more: The
set of all possible measurements of an observable A, represented by the spectrum σ(A), should be
real, thus, calling on Theorem 2.15, we further suppose that the observables are self-adjoint.
Eψ (A) = λ0 ⟨ψ, ψ⟩ = λ0 ,
since Aψ = λ0 ψ. In general, the mean deviation ∆ψ (A) := ‖Aψ − Eψ (A)ψ‖ of A in the state ψ
will not vanish. We have just seen that precisely the eigenvalues occur as sharp measurements
(i.e., with zero deviation).
Recall from linear algebra that two self-adjoint operators A and B on a finite-dimensional Hilbert
space are simultaneously diagonalizable if and only if they commute, i.e., [A, B] := AB − BA = 0.
Considering now the infinite-dimensional case and interpreting the eigenvalues as the possible
sharp measurements of observables, one can generalize this and ask to what extent simultaneous
measurements Eψ (A), Eψ (B) can be achieved at least with small mean deviations. There is a
general lower bound, which in the example A = Xj , B = Pj of the position and momentum
operators (in the same coordinate) implies the famous Heisenberg uncertainty principle from the
fact [Xj , Pj ] = iℏI: If ψ ∈ {ϕ ∈ dom(A) ∩ dom(B) | Bϕ ∈ dom(A) and Aϕ ∈ dom(B)}, then
    ∆ψ (A) ∆ψ (B) ≥ (1/2) |⟨[A, B]ψ, ψ⟩|.
Proof: Let A′ := A − Eψ (A) and B′ := B − Eψ (B). Since I commutes with every operator, we
have1
Therefore, we deduce
    |⟨[A, B]ψ, ψ⟩| ≤ 2 |⟨A′B′ψ, ψ⟩| = 2 |⟨B′ψ, A′ψ⟩| ≤ 2 ‖B′ψ‖ ‖A′ψ‖ = 2 ∆ψ (A) ∆ψ (B).
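The inequality can be tested in a finite-dimensional toy model, where all domain questions disappear. The following sketch uses the Pauli matrices σx and σy on C^2 as stand-ins for A and B; this choice, the sample states, and the helper names are assumptions for illustration and are not taken from the notes.

```python
import math

def inner(u, v):
    # scalar product on C^n, conjugate-linear in the second argument
    return sum(complex(a) * complex(b).conjugate() for a, b in zip(u, v))

def apply(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def normalize(v):
    n = math.sqrt(inner(v, v).real)
    return [c / n for c in v]

def expectation(M, psi):
    return inner(apply(M, psi), psi)            # E_psi(M) = <M psi, psi>

def deviation(M, psi):
    e = expectation(M, psi)
    r = [a - e * b for a, b in zip(apply(M, psi), psi)]
    return math.sqrt(inner(r, r).real)          # ||M psi - E_psi(M) psi||

def uncertainty_gap(A, B, psi):
    """Left-hand side minus right-hand side of the uncertainty inequality."""
    AB = apply(A, apply(B, psi))
    BA = apply(B, apply(A, psi))
    comm = inner([a - b for a, b in zip(AB, BA)], psi)   # <[A, B] psi, psi>
    return deviation(A, psi) * deviation(B, psi) - 0.5 * abs(comm)

sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]

samples = [normalize(v) for v in ([1, 0], [1, 1j], [2, 1 - 1j], [0.3 + 0.4j, -1])]
gaps = [uncertainty_gap(sigma_x, sigma_y, psi) for psi in samples]
print(gaps)
```

All gaps should be nonnegative up to rounding; for the first sample state both sides of the inequality coincide.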
The dynamical aspect of the quantum mechanical particle in Rd as introduced above is in the
t-evolution of the wave function ψ(., t) ∈ L2 (Rd ). The model assumption is that ψ(., t) =
U (t)ψ(., 0), where U (t) is linear (superposition principle) and maps states into states, thus
‖U (t)ψ(., 0)‖_2 = ‖ψ(., 0)‖_2 . Hence it is plausible to suppose that U (t) is a unitary operator
for every t. Moreover, it is natural to suppose that the evolution of duration t starting at time
s gives the same result as the evolution of duration t + s starting at time 0, i.e., to require
U (t)U (s)ψ(., 0) = U (t)ψ(., s) = ψ(., t + s) = U (t + s)ψ(., 0), or, U (t)U (s) = U (t + s) on the
level of operators. Thus, t ↦ U (t) is a group homomorphism from (R, +) to the group of unitary
operators on L2 (Rd ). Finally, we will expect a certain continuity property of the evolution, at
least along each trajectory of a given initial state, i.e., continuity of the map t ↦ U (t)ψ(., 0),
R → L2 (Rd ).
In abstract terms, we implement the dynamical aspect of a quantum mechanical model on the
Hilbert space H of states by a so-called strongly continuous (s-continuous) unitary group of op-
erators U (t) (t ∈ R), which means that U (t) ∈ L(H) is unitary for every t, U (t)U (s) = U (t + s)
holds for all s, t ∈ R, and for every ψ ∈ H and t0 ∈ R we have lim_{t→t0} U (t)ψ = U (t0 )ψ. (We
see that the term ‘strong continuity’ here simply refers to the pointwise continuity of the map
t ↦ U (t), since the convergence U (t) → U (t0 ) is tested pointwise on H; the topology on L(H)
describing this pointwise convergence is usually called the strong operator topology, hence the
notion of s-continuity.)
We can obtain a strongly continuous unitary group on H from any self-adjoint operator S via
functional calculus. We may simply put U (t) := exp(itS) and check, basically as an exercise, that
all the required properties hold (cf. [Con10, Chapter X, Theorem 5.1] or [Kab14, Satz 16.16] or
[Wei00, Satz 7.4]). Moreover, it is not difficult to see that dom(S) is invariant under U (t) and
    ∀ψ ∈ dom(S) :    d/dt U (t)ψ := lim_{h→0} (U (t + h)ψ − U (t)ψ)/h = iS U (t)ψ,
i.e., ϕ(t) := U (t)ψ satisfies the initial value problem
(3.3)    d/dt ϕ(t) = iSϕ(t),    ϕ(0) = ψ.
[Footnote 1: boldly using the relation (ST )∗ = T∗S∗ for unbounded operators here without proof]
By the famous theorem of Stone (see [Con10, Chapter X, Theorem 5.6] or [Kab14, Satz 16.18]
or [Wei00, Satz 7.3]), every s-continuous unitary group U (t) (t ∈ R) is given in the above form
U (t) = exp(itS), where S is a unique self-adjoint operator. In fact, S may be constructed on
dom(S) := {ψ ∈ H | ∃ lim_{h→0} (U (h)ψ − ψ)/h} by the assignment Sψ := −i · lim_{h→0} (U (h)ψ − ψ)/h.
Therefore, the dynamics of the quantum mechanical system is determined by this self-adjoint
operator S. In terms of physics, S corresponds to the energy of the system and is called the
Hamiltonian and the differential equation in (3.3) is called the Schrödinger equation. Observe
that in case the initial value ψ happens to be an eigenvector for S with eigenvalue λ ∈ R, i.e.,
Sψ = λψ, then the solution to (3.3) is given by ϕ(t) = e^{iλt} ψ, which corresponds to a bound
state in physics, since ϕ(t) ∈ span{ψ} for all times t ∈ R; in particular, the expectation values of
observables remain constant: Eϕ(t) (A) = ⟨Aϕ(t), ϕ(t)⟩ = e^{iλt} e^{−iλt} ⟨Aψ, ψ⟩ = Eψ (A).
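The group properties and the bound-state behaviour can be observed in a 2 × 2 toy model. The following sketch assumes the bounded self-adjoint matrix S = [[0, 1], [1, 0]], for which exp(itS) = cos(t) I + i sin(t) S because S^2 = I; the matrix, the closed form, and the helper names are illustrative assumptions.

```python
import cmath
import math

def U(t):
    # exp(i t S) for S = [[0, 1], [1, 0]], using S*S = I
    c, s = math.cos(t), math.sin(t)
    return [[c, 1j * s], [1j * s, c]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def adjoint(A):
    # conjugate transpose
    return [[complex(A[j][i]).conjugate() for j in range(2)] for i in range(2)]

def apply(A, v):
    return [sum(A[i][j] * v[j] for j in range(2)) for i in range(2)]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

IDENTITY = [[1, 0], [0, 1]]
psi = (1 / math.sqrt(2), 1 / math.sqrt(2))     # eigenvector of S for the eigenvalue 1

group_law = close(matmul(U(0.7), U(-0.3)), U(0.4))          # U(t)U(s) = U(t+s)
unitary = close(matmul(adjoint(U(1.1)), U(1.1)), IDENTITY)  # U(t)* U(t) = I
phi = apply(U(2.0), psi)                                    # should equal e^{2i} psi
print(group_law, unitary, phi)
```

For the eigenvector ψ the evolved state stays in span{ψ} and only picks up the phase e^{iλt} with λ = 1, in line with the discussion of bound states above.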
Part II.
4. Vector space topologies
In this second part of the lecture notes, X will denote a vector space with scalar field K = R or
K = C.
4.1. Definition: A seminorm on X is a map p : X → [0, ∞[ such that ∀λ ∈ K, x, y ∈ X:
(a) p(λx) = |λ| p(x),
(b) p(x + y) ≤ p(x) + p(y).
Note that p(x) ≥ 0 and that p(0) = 0 (by (a)), but p(x) = 0 does not imply x = 0.
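A quick numerical illustration of these properties, using point evaluation p_t (x) = |x(t)| on functions [0, 1] → C, may be helpful. This is only a sketch; the sample functions, the scalar, and the helper names are assumptions for illustration.

```python
def p(t):
    """Return the point-evaluation seminorm x -> |x(t)| on functions [0, 1] -> C."""
    return lambda x: abs(x(t))

def scale(lam, x):
    return lambda t: lam * x(t)

def add(x, y):
    return lambda t: x(t) + y(t)

# sample data (assumptions)
x = lambda t: t - 0.5              # nonzero function vanishing at t = 0.5
y = lambda t: complex(t, 1.0)
lam = 2 - 3j

q = p(0.2)
homogeneity_defect = abs(q(scale(lam, y)) - abs(lam) * q(y))   # property (a)
triangle_slack = q(x) + q(y) - q(add(x, y))                    # property (b)
print(homogeneity_defect, triangle_slack, p(0.5)(x))
```

The last printed value is 0 although x is not the zero function, which shows that a seminorm need not separate points.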
4.2. Example: Let X = C^{[0,1]} = {x : [0, 1] → C} and for every t ∈ [0, 1] let pt : X → [0, ∞[ be
defined by pt (x) := |x(t)|. Then pt is a seminorm on X and we see that pt (x) = 0 for any function
x with x(t) = 0. The set of seminorms P := {pt | t ∈ [0, 1]} can be used to describe the pointwise
convergence of functions in the sense that xn → x pointwise on [0, 1], if and only if
    pt (xn − x) → 0 (n → ∞) for every t ∈ [0, 1].
4.4. Definition: Let A ⊆ X, then A is called
(a) balanced, if for every x ∈ A and every λ ∈ K, |λ| ≤ 1, we have λx ∈ A,
(b) absolutely convex, if A is convex and balanced,
(c) absorbing, if for every x ∈ X there is a number cx > 0 such that for all λ ∈ K, 0 ≤ λ ≤ cx ,
we have λx ∈ A. (Since x ∈ (1/λ)A, if 0 < λ ≤ cx , we may say: “Any point in X is swallowed
by appropriate dilations of the subset A.”) [Slightly different definition in [MV92, Tre67].]
4.5. Lemma: A subset A ⊆ X is absolutely convex, if and only if it satisfies
We collect all of these subsets in the family U := {UF,ε | ε > 0, F ⊆ P finite} and list its basic
properties:
(1) ∀U ∈ U we have 0 ∈ U .
This is clear, since p(0) = 0.
(2) ∀U1 , U2 ∈ U there exists U ∈ U such that U ⊆ U1 ∩ U2 .
Let Uj = UFj ,εj (j = 1, 2) and put F = F1 ∪ F2 , ε := min(ε1 , ε2 ), then U := UF,ε fulfills the
requirement, since UF,ε ⊆ UF1 ,ε1 ∩ UF2 ,ε2 .
(3) ∀U ∈ U ∃V ∈ U such that V + V ⊆ U .
If U = UF,ε we may put V := UF,ε/2 , since UF,ε/2 + UF,ε/2 ⊆ UF,ε .
(4) Every U ∈ U is absorbing.
Let U = UF,ε and x0 ∈ X. Put α_{x0} := max_{p∈F} p(x0 ). If α_{x0} = 0, then x0 ∈ UF,ε . If α_{x0} > 0,
then we put c_{x0} := ε/α_{x0} and obtain for every p ∈ F and every λ ∈ K with 0 ≤ λ ≤ c_{x0}
that p(λx0 ) = λ p(x0 ) ≤ c_{x0} α_{x0} ≤ ε, i.e., λx0 ∈ UF,ε .
(5) ∀U ∈ U ∀λ ∈ K, λ > 0, there is some V ∈ U with λV ⊆ U .
Let U = UF,ε , λ > 0, and note that λ · UF,ε/λ = UF,ε , thus V := UF,ε/λ does the job.
(6) Every U ∈ U is balanced.
This is clear here from the seminorm properties.
(7) Every U ∈ U is absolutely convex.
This is also clear here from the seminorm properties.
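The short arguments for properties (3) and (5) can be written out as follows. This is a sketch that assumes the description U_{F,ε} = {x ∈ X | p(x) ≤ ε for all p ∈ F } of the basic sets, as recalled later in the text.

```latex
% (3): for x, y \in U_{F,\varepsilon/2} and every p \in F,
p(x + y) \le p(x) + p(y) \le \tfrac{\varepsilon}{2} + \tfrac{\varepsilon}{2} = \varepsilon,
\qquad\text{hence } U_{F,\varepsilon/2} + U_{F,\varepsilon/2} \subseteq U_{F,\varepsilon}.

% (5): for \lambda > 0,
\lambda\, U_{F,\varepsilon/\lambda}
  = \{\lambda y \mid p(y) \le \varepsilon/\lambda \ \ \forall p \in F\}
  = \{x \mid p(x) \le \varepsilon \ \ \forall p \in F\}
  = U_{F,\varepsilon}.
```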
O∈τ ⇐⇒ ∀x ∈ O ∃U ∈ U : x + U ⊆ O.
µ(x + w) = λx + (µ − λ)x + µw ∈ λx + V + V ⊆ λx + U ⊆ O.
4.8. Remark: It is not difficult to see that every topological vector space possesses a basis of
neighborhoods at 0 satisfying properties (1)-(6). Clearly, (1) and (2) hold for any neighborhood
basis at 0. Properties (3)-(5) hold for the system of all neighborhoods at 0: Property (3) follows
from continuity of addition at (0, 0). Property (4) follows from continuity of λ ↦ λx at 0 ∈ K
for any x ∈ X, and (5) from continuity of the map x ↦ λx at 0 ∈ X for any λ. Finally, every
neighborhood U of 0 contains a balanced 0-neighborhood W : By continuity of scalar multiplication
at (0, 0) ∈ K × X, we can find ε > 0 and a neighborhood V of 0 such that λV ⊆ U , if |λ| ≤ ε;
setting W := {λv | |λ| ≤ ε, v ∈ V } = ⋃_{|λ|≤ε} λV we clearly obtain a balanced subset, which is also
a 0-neighborhood, since each λV with λ ≠ 0 is one (as the image of V under the homeomorphism
x ↦ λx). In conclusion, the set of all balanced neighborhoods at 0 satisfies (1)-(6).
We learn from 4.6 and 4.7 that a family of seminorms generates a topology for a topological vector
space. In this topology, the basic neighborhoods UF,ε have the additional property (7) of being
absolutely convex, which has not been used in Proposition 4.7. As it turns out, this additional
property is crucial, for example, to guarantee a rich theory of linear functionals. Moreover, most
of the topological vector spaces arising in applications are built from seminorms and it can be
shown that topologically this is equivalent to the existence of a basis of neighborhoods at 0
consisting of (absolutely) convex subsets (see, e.g., [MV92, Lemmata 22.2 and 22.4] or [Tre67,
Proposition 7.6], and [Wer18, Satz VIII.1.5] for a sketch of the proof). The key ingredient in
constructing seminorms from absolutely convex neighborhoods of 0 is the Minkowski functional,
pU (x) := inf{λ > 0 | x ∈ λU } (see also a corresponding lemma in 5.10).
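For a concrete absolutely convex, absorbing set the Minkowski functional can be approximated by bisection. The following sketch works in R^2 with two sample sets given by membership tests; the sets, the tolerance, and the function names are assumptions for illustration, and the bisection relies on x ∈ λU being monotone in λ for such sets.

```python
def minkowski(in_U, x, tol=1e-10):
    """Approximate p_U(x) = inf{lam > 0 | x in lam*U} by bisection.

    in_U is a membership test for an absorbing, absolutely convex set U.
    """
    hi = 1.0
    while not in_U([c / hi for c in x]):   # enlarge until x lies in hi*U
        hi *= 2.0
    lo = 0.0
    while hi - lo > tol:
        mid = (lo + hi) / 2                # mid > 0 because hi - lo > tol
        if in_U([c / mid for c in x]):
            hi = mid
        else:
            lo = mid
    return hi

def in_diamond(x):
    return abs(x[0]) + abs(x[1]) <= 1      # unit ball of the l^1 norm

def in_square(x):
    return max(abs(x[0]), abs(x[1])) <= 1  # unit ball of the sup norm

print(minkowski(in_diamond, [3.0, -4.0]), minkowski(in_square, [3.0, -4.0]))
```

For these two sets the Minkowski functional recovers the norm whose unit ball was used, so the printed values should be close to 7 and 4, respectively.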
4.9. Definition: A topological vector space is said to be locally convex, if it possesses a basis of
neighborhoods at 0 consisting of convex sets.
From the discussion and the references given above, we have the following statement.
4.10. Proposition: A topological vector space (X, τ ) is locally convex, if and only if τ is defined
by a family of seminorms as in 4.6 and 4.7.
In particular, every normed vector space is a locally convex space, which happens to be metrizable
as well and thus also a Hausdorff space. In general, locally convex or topological vector spaces need
not be Hausdorff as the example of the chaotic topology (corresponding to the single seminorm
x ↦ 0) illustrates. However, there are the following useful criteria. (We note that the equivalence
(i) ⇔ (iii) holds in general topological vector spaces).
4.11. Lemma: Let P be a set of seminorms on X generating the locally convex vector space
topology τ . Then the following are equivalent:
(i) (X, τ ) is a Hausdorff space,
(ii) for every x ∈ X, x ≠ 0, there is a p ∈ P such that p(x) ≠ 0,
(iii) there is a neighborhood basis U of 0 such that ⋂_{U ∈U} U = {0}.
Proof: (i) ⇒ (ii): Let x ∈ X, x ≠ 0. The neighborhoods at x are of the form x + U , where U
is a neighborhood of 0. By the Hausdorff property, we can separate x and 0, hence we find two
neighborhoods U and V of 0 such that (x + U ) ∩ V = ∅. We may suppose that V = UF,ε = {y ∈
X | p(y) ≤ ε, p ∈ F } with a finite subset F ⊆ P and ε > 0. Since x ∉ V , there is some p ∈ F
such that p(x) > ε > 0.
(ii) ⇒ (iii): We only have to note that p(x) = 0 for every p ∈ P , if and only if x ∈ UF,ε for every
finite subset F ⊆ P and every ε > 0.
(iii) ⇒ (i): Let x, y ∈ X, x ≠ y, hence x − y ≠ 0 and there has to be some U ∈ U such that
x − y ∉ U . The map (x, y) ↦ x − y is continuous (being the composition of addition with scalar
multiplication by −1 in the second component), thus, there are neighborhoods V and W of 0 such
that W − V ⊆ U . Therefore x − y ∉ W − V , which means (x + V ) ∩ (y + W ) = ∅, i.e., we have
found disjoint neighborhoods of x and y.
3) Let Ω ⊆ Rn be open and E (Ω) be the vector space C ∞ (Ω) equipped with the following
seminorms: For m ∈ N0 and K ⊂ Ω compact, define
4) The so-called Schwartz space S (Rn ) is the set of functions ϕ ∈ C∞(Rn ) such that every
derivative ∂^β ϕ, β ∈ N_0^n , is rapidly decreasing, i.e., for every α ∈ N_0^n , the function x ↦ x^α ∂^β ϕ(x)
is bounded on Rn . We define seminorms on S (Rn ) for any m ∈ N0 and α ∈ N_0^n by
or occasionally also p_{l,m} (ϕ) := max_{|α|≤l} p_{α,m} (ϕ) (l, m ∈ N0 ). The corresponding families of
seminorms define a locally convex Hausdorff topology on S (Rn ).
5) Let Ω ⊆ Rn be open, K ⊂ Ω compact, and DK (Ω) denote the set of all functions ϕ ∈ C ∞ (Ω)
with supp(ϕ) ⊆ K.
(Recall that supp(ϕ) is the closure, in the trace topology of Ω, of the set {x ∈ Ω | ϕ(x) ≠ 0}.
Equivalently, it is the complement, within Ω, of the largest open set, where ϕ vanishes. I suppose
that the existence of nonzero functions ϕ ∈ C ∞ (Ω) with compact support has been shown in your
second or third semester Analysis course; if not, a reference is [Wer18, Beispiel zwischen Definition
V.1.9 und Lemma V.1.10]. Clearly, if K° = ∅, then DK (Ω) = {0}.)
We equip DK (Ω) with the structure of a locally convex Hausdorff space generated by the family
of seminorms
    pm (ϕ) := max_{|α|≤m} sup_{x∈Ω} |∂^α ϕ(x)| (m ∈ N0 , ϕ ∈ DK (Ω)).
6) Let Ω ⊆ Rn be open. The space of test functions D(Ω) is the set of all functions ϕ ∈ C∞(Ω)
with compact support, i.e., D(Ω) = ⋃_{K ⊂ Ω compact} DK (Ω).
We could consider seminorms pm (m ∈ N0 ) as above also on D(Ω), the only difference is that
the functions now do not have their support inside some fixed compact set K. But it turns out
that the corresponding topology is not appropriate to reflect the structure of D(Ω) as a union
of the spaces DK (Ω) with their topology τK defined in the previous example. Such a structural
feature is desirable to obtain a function space with good localization properties and convenient
approximation procedures being concentrated on bounded regions. Therefore, we consider the set
P of seminorms p on D(Ω) such that the restriction p|_{DK (Ω)} is continuous with respect to τK for
every compact set K ⊂ Ω. The locally convex topology on D(Ω) generated from P is a special
case of the so-called inductive limit topologies from the general theory of topological vector spaces.
It is also Hausdorff, which we will show later in the section on distribution theory.
7) If (X, ‖.‖) is a normed space, then the locally convex topology generated by P = {‖.‖} is the
norm topology on X.
8) Let X be a normed space with dual space X′ (i.e., the space of all continuous linear functionals
X → K). Then the family of seminorms P = {p_{x′} | x′ ∈ X′ }, where
    p_{x′} (x) := |x′(x)| (x ∈ X),
generates the weak topology σ(X, X′) on X, which is locally convex and Hausdorff (due to the
theorem of Hahn-Banach).
9) Let X′ be the dual space of the normed space X and consider for every x ∈ X the following
seminorm px on X′:
    px (x′) := |x′(x)| (x′ ∈ X′).
The set of seminorms P = {px | x ∈ X} generates the weak* topology σ(X′, X) on X′. It coincides
with the topology of pointwise convergence for the subspace of continuous linear functions X → K.
In general, σ(X′, X) is different from the weak topology σ(X′, X′′).
10) Let X and Y be normed spaces. Apart from the (operator) norm topology on L(X, Y ), the
following two locally convex topologies are also of interest in operator theory: The strong operator
topology is generated by the seminorms
    px (T ) := ‖T x‖ (T ∈ L(X, Y ), x ∈ X),
and the weak operator topology is generated by the seminorms
    p_{x,y′} (T ) := |y′(T x)| (T ∈ L(X, Y ), x ∈ X, y′ ∈ Y′).
Both are Hausdorff topologies, which in case of the weak operator topology follows by appealing to
the Hahn-Banach theorem. Recall that we have used the notions of (sequential) convergence with
respect to these topologies already implicitly while discussing the measurable functional calculus
of self-adjoint operators on Hilbert spaces (with X = Y = H and y′(x) corresponding to an inner
product).
11) The notion of weak topology in probability theory: Let Ω be a metric space and M (Ω) be
the space of regular signed or complex Borel measures on Ω. For every f in the space of bounded
continuous functions Cb (Ω) we define the seminorm
    pf (µ) := |∫_Ω f dµ|
on M (Ω). The family of seminorms P = {pf | f ∈ Cb (Ω)} generates a locally convex topology on
M (Ω), which in case of compact Ω corresponds to the weak* topology due to the Riesz representation
theorem (since we then have Cb (Ω) = C(Ω) and C(Ω)′ ≅ M (Ω)). This “probabilistic weak
topology” always has the Hausdorff property, the proof is based on the regularity of the measures
(cf. [Els11, Kapitel VIII, §1, Satz 4.6]).
5. Continuous linear maps and
functionals
Some of the locally convex spaces of functions have been introduced essentially in order to provide
a setting for linear functionals on these spaces, which are considered as generalized functions or
distributions. It is the property of local convexity that guarantees the existence of continuous
linear functionals via the Hahn-Banach theorem. Moreover, we have seen in the definition of the
test function space D(Ω) that seminorms with continuous restrictions to any DK (Ω) are used to
define the locally convex topology on D(Ω).
5.1. Lemma: Let (X, τ ) be a locally convex vector space and P be a family of seminorms
generating the topology.
(i) q is continuous,
(ii) q is continuous at 0,
(c) A seminorm q on X is continuous, if and only if there is a finite subset F ⊆ P and a real
number M > 0 such that
    ∀x ∈ X : q(x) ≤ M max_{p∈F} p(x).
Proof: (a): The implications (i) ⇒ (ii) ⇒ (iii) are immediate. Suppose that (iii) holds. We
show that then (i) follows, which completes the proof of (a): Let x ∈ X and ε > 0; we set
U := ε · Uq,1 = {y ∈ X | q(y) ≤ ε} and note that x + U is a neighborhood of x; if y ∈ U , then1
(b): This follows from (a),(iii), since by definition of τ , the sets {x ∈ X | p(x) ≤ 1} with p ∈ P
are neighborhoods of 0.
[Footnote 1: Here, we simply use the “reverse triangle inequality”: q(w) = q((w − z) + z) ≤ q(w − z) + q(z) and
q(z) = q((z − w) + w) ≤ q(z − w) + q(w) = q(w − z) + q(w) imply |q(w) − q(z)| ≤ q(w − z).]
(c): By definition of τ and (a), (iii), the continuity of q is equivalent to the existence of a finite
set F ⊆ P and ε > 0 such that UF,ε ⊆ Uq,1 . Recalling UF,ε = {x ∈ X | ∀p ∈ F : p(x) ≤ ε}, the
inclusion relation UF,ε ⊆ Uq,1 is equivalent to
(∗)    ∀x ∈ X : (1/ε) max_{p∈F} p(x) ≤ 1 ⇒ q(x) ≤ 1.
If the inequality stated in (c) holds, then (∗) follows upon putting ε := 1/M . Conversely, suppose
(∗) holds and let x ∈ X. If we had q(x) > max_{p∈F} p(x)/ε, then we could choose µ > 0 with
q(x) > µ > max_{p∈F} p(x)/ε and obtain a contradiction to (∗) for y := x/µ, since q(y) = q(x)/µ > 1
and max_{p∈F} p(y)/ε = max_{p∈F} p(x)/(εµ) < 1. Thus, the inequality stated in (c) holds with
M := 1/ε.
5.2. Corollary: Let (X, τ ) be a locally convex vector space and P be a family of seminorms
generating the topology. If Q ⊇ P is a family of τ -continuous (!) seminorms on X, then Q also
generates the topology τ .
Proof: Clearly, the topology generated by Q ⊇ P is finer than τ , but by the above lemma and
the continuity of each q ∈ Q it is also coarser than τ .
We can now provide simple criteria for the continuity of linear maps between locally convex vector
spaces.
5.3. Theorem: Let (X, τP ) and (Y, τQ ) be two locally convex vector spaces, where τP is generated
by the family P of seminorms on X and τQ by the family of seminorms Q on Y . Let T : X → Y
be a linear map, then the following are equivalent:
(i) T is continuous,
(ii) T is continuous at 0,
(iii) if q is a continuous seminorm on Y , then q ◦ T is a continuous seminorm on X,
(iv) for every q ∈ Q there is a finite subset F ⊆ P and a real number M > 0 such that
    ∀x ∈ X : q(T x) ≤ M max_{p∈F} p(x).
Proof: (i) ⇒ (ii) is clear and (ii) ⇒ (i) is easy: Let x ∈ X; a neighborhood of T x is of the form
T x + V , where V is a τQ -neighborhood of 0 in Y ; choose a τP -neighborhood U of 0 in X such
that T (U ) ⊆ V , then clearly T (x + U ) = T x + T (U ) ⊆ T x + V .
(ii) ⇒ (iii): The composition q ◦ T is continuous at 0 and clearly defines a seminorm on X. Hence,
by (a) in the lemma above, q ◦ T is a continuous seminorm.
(iii) ⇒ (iv): This follows from (b), applied to q ∈ Q, and (c), applied to q ◦ T , in the lemma shown
previously.
(iv) ⇒ (ii): Let V be a τQ -neighborhood of 0 in Y . We may suppose that V is given by finitely
many seminorms q1 , . . . , qn ∈ Q and ε > 0 in the form V = {y ∈ Y | qj (y) ≤ ε (j = 1, . . . , n)}.
For every j = 1, . . . , n, choose Fj ⊆ P and Mj > 0 according to (iv) with qj in place of q. We
put F := ⋃_{j=1}^n Fj , M := max_{j=1,...,n} Mj , and consider the τP -neighborhood U := UF,ε/M of 0 in X.
If x ∈ U and j ∈ {1, . . . , n}, then
    qj (T x) ≤ Mj max_{p∈Fj} p(x) ≤ M max_{p∈F} p(x) ≤ M · (ε/M ) = ε,
hence T (U ) ⊆ V .
The most important special case Y = K (with the Euclidean topology, generated from the single
norm |.|) deserves an explicit statement.
5.4. Corollary: Let X be a locally convex vector space and P be a family of seminorms generating
the topology. A linear functional l on X is continuous, if and only if there are finitely many
seminorms p1 , . . . , pm ∈ P and a constant M > 0 such that
    ∀x ∈ X : |l(x)| ≤ M max_{j=1,...,m} pj (x).
5.5. Definition: (i) Let (X, τ ) be a locally convex vector space. Then its dual space X′, or (Xτ )′,
is the vector space of continuous linear functionals on X.
(ii) Let (X, τ1 ) and (Y, τ2 ) be locally convex vector spaces, then L(X, Y ), or L(Xτ1 , Yτ2 ), denotes
the vector space of continuous linear maps X → Y .
Note that we have X′ = L(X, K) and the fact that L(X, Y ) is a vector space follows from Theorem
5.3.
Recall that a topology τ1 on a set X is finer than τ2 , if τ2 ⊆ τ1 , or, equivalently, the identity map
on X is continuous as a map id : (X, τ1 ) → (X, τ2 ). If τ1 and τ2 are locally convex topologies on
the vector space X, then the latter is equivalent to id ∈ L(Xτ1 , Xτ2 ). The topologies are equal, if
both id ∈ L(Xτ1 , Xτ2 ) and id ∈ L(Xτ2 , Xτ1 ) hold.
5.6. Examples: 1) Let (X, ‖.‖) be a normed space and τ denote the topology induced by the
norm, then τ is finer than the weak topology σ(X, X′), since for any x′ ∈ X′ we have
\[ p_{x'}(x) = |x'(x)| \le \|x'\| \, \|x\| \qquad (x \in X). \]
2) Let X = C(Rn ) and denote by τ1 the topology of uniform convergence on compact sets, by τ2
the topology of pointwise convergence. Then τ1 is finer than τ2 , since for any t ∈ Rn we may take
K = {t} as compact set and obtain
\[ |f(t)| \le \sup_{s \in K} |f(s)| \qquad (f \in C(\mathbb{R}^n)). \]
We briefly study convergence of nets in locally convex vector spaces. Recall that in general,
sequences are not sufficiently powerful to characterize continuity of maps or closures of subsets (cf.
Example 5.8 below) in topological spaces that are non-metrizable. (More precisely, the “barrier”
are the first countable, or AA1, spaces).
5.7. Proposition: Let the locally convex topology τ on X be generated by the family of semi-
norms P . Then a net (xj )j∈I converges to x in X with respect to τ , if and only if lim p(xj − x) = 0
for every p ∈ P .
Proof: The net (xj ) converges to x, if and only if the net (xj − x) converges to 0. Thus, we may
suppose that x = 0.
If (xj ) converges to 0, then lim p(xj ) = 0 for every p ∈ P due to continuity of p (Lemma 5.1(b)).
Now suppose that lim p(xj ) = 0 for every p ∈ P and let U = UF,ε be a typical neighborhood of 0
with F ⊆ P finite and ε > 0. In particular, lim p(xj ) = 0 for every p ∈ F . For each p ∈ F , choose
jp ∈ I such that p(xj ) ≤ ε for every j ≥ jp . Since (I, ≤) is a directed set (and F is finite), there is
some j′ ∈ I such that j′ ≥ jp for every p ∈ F . We obtain p(xj ) ≤ ε for every p ∈ F and j ≥ j′,
hence xj ∈ U if j ≥ j′. Thus, we have shown xj → 0 in (X, τ ).
5.8. Example: Let X = l2 , equipped with its weak topology σ := σ(l2 , l2 ), and consider the
subset
A := {em + m en | 1 ≤ m < n} ⊆ l2 ,
where ek ∈ l2 denotes the vector with k-th component 1 and all other components 0.
Suppose (emk + mk enk )k∈N is a sequence in A that converges weakly to 0. By Proposition 5.7,
we conclude that
\[ y(m_k) + m_k \, y(n_k) \to 0 \quad (k \to \infty) \qquad \text{for every } y \in l^2. \]
If (mk )k∈N is bounded, there is some q ∈ N such that mk = q infinitely often. Choosing y = eq
and recalling nk > mk , then produces the contradiction that y(mk ) + mk y(nk ) = 1 infinitely often
and converges to 0 as k → ∞.
If (mk )k∈N is unbounded, we may suppose that both (mk ) and (nk ) are strictly increasing. Let
y ∈ l2 be given by y(j) := 1/k, if j = nk , and y(j) := 0 otherwise. Then y(mk ) + mk y(nk ) ≥
mk /k ≥ 1 for every k ∈ N, which also contradicts y(mk ) + mk y(nk ) → 0.
Remark: We can obtain a net in A converging weakly to 0 as follows. The set of all typical
neighborhoods UF,ε of 0 becomes a directed set, if we introduce the relation UF1 ,ε1 ≤ UF2 ,ε2
meaning UF1 ,ε1 ⊇ UF2 ,ε2 . To the index set element UF,ε we assign the element aUF,ε := em + m en
constructed as in the proof of Claim 1 for F = {y1 , . . . , yr } and ε > 0.
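The assignment in the remark can be made concrete for a given finite set F and ε > 0. The following is only a sketch of one standard way to pick such an element of A (first choose m with all |y(m)| small, then n > m with all m|y(n)| small); it is not necessarily the construction used in the proof of Claim 1. The function name `find_point_of_A`, the search bound, and the sample sequences are hypothetical.

```python
def find_point_of_A(ys, eps, search_limit=100_000):
    """Return (m, n) with 1 <= m < n such that e_m + m*e_n lies in U_{F,eps}.

    ys: list of functions k -> y(k), k = 1, 2, ..., each standing for a real
        square-summable sequence (so y(k) -> 0); F = {y_1, ..., y_r}.
    The pairing of e_m + m*e_n with y is y(m) + m*y(n).
    """
    for m in range(1, search_limit):
        # First make the contribution of e_m small for all y in F.
        if all(abs(y(m)) <= eps / 2 for y in ys):
            # Then push n far out so that even m*y(n) is small.
            for n in range(m + 1, search_limit):
                if all(m * abs(y(n)) <= eps / 2 for y in ys):
                    return m, n
    raise ValueError("search_limit too small for these sequences")


# Hypothetical test data: three real sequences in l^2.
ys = [lambda k: 1.0 / k, lambda k: 0.5 ** k, lambda k: 1.0 if k == 3 else 0.0]
eps = 0.1
m, n = find_point_of_A(ys, eps)
print(m, n, [y(m) + m * y(n) for y in ys])
```

Since every y ∈ l2 tends to 0, both searches terminate for any finite F, which is why 0 can be approximated weakly by elements of A even though no sequence in A converges weakly to 0.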
5.9. Examples: 1) The dual space of the Schwartz space S(Rn ), Example 4.12.4), is denoted by
S′(Rn ) and called the space of temperate (or tempered) distributions. The locally convex topology
on S(Rn ) is generated by the seminorms
\[ p_{\alpha,m}(\varphi) := \sup_{x \in \mathbb{R}^n} (1 + |x|^m)\, |\partial^\alpha \varphi(x)| \qquad (m \in \mathbb{N}_0,\ \alpha \in \mathbb{N}_0^n). \]
We show that every f ∈ Lp (Rn ) (1 ≤ p ≤ ∞) provides an element in S′(Rn ), in the sense that
the map Tf : S(Rn ) → C,
\[ T_f(\varphi) := \int_{\mathbb{R}^n} f(x)\, \varphi(x)\, dx \qquad (\varphi \in \mathcal{S}(\mathbb{R}^n)), \]
defines a continuous linear functional. Recall Hölder’s inequality, which states that for any g ∈
Lq (Rn ), 1/p + 1/q = 1 with the convention 1/∞ = 0, we have ‖f g‖L1 ≤ ‖f ‖Lp ‖g‖Lq . Since
S(Rn ) ⊆ Lq (Rn ) (1 ≤ q ≤ ∞), we obtain for any ϕ ∈ S(Rn ),
\[ |T_f(\varphi)| \le \int_{\mathbb{R}^n} |f(x)\, \varphi(x)|\, dx = \|f \varphi\|_{L^1} \le \|f\|_{L^p} \, \|\varphi\|_{L^q}. \]
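If p = 1, then q = ∞ and we have 2‖ϕ‖L∞ = p0,0 (ϕ), thus, |Tf (ϕ)| ≤ ‖f ‖L1 p0,0 (ϕ) proves the
continuity of Tf .
If p > 1, then 1 ≤ q < ∞ and
\[ \|\varphi\|_{L^q} = \Big( \int_{\mathbb{R}^n} |\varphi(x)|^q \, dx \Big)^{1/q} = \Big( \int_{\mathbb{R}^n} \frac{1}{(1+|x|^{n+1})^q}\, \underbrace{\big| (1+|x|^{n+1})\, \varphi(x) \big|^q}_{\le\, p_{0,n+1}(\varphi)^q} \, dx \Big)^{1/q} \le \Big\| \frac{1}{1+|\cdot|^{n+1}} \Big\|_{L^q} \cdot p_{0,n+1}(\varphi) =: M_q \cdot p_{0,n+1}(\varphi), \]
which implies |Tf (ϕ)| ≤ ‖f ‖Lp Mq p0,n+1 (ϕ) and proves the continuity of Tf .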
2) Let X be a normed space, τ denote the topology induced by the norm on X, and σ := σ(X, X′).
As noted above, τ is finer than σ. Thus, if µ : X → K is a weakly continuous linear functional,
then µ is also norm continuous, i.e., we have (Xσ )′ ⊆ (Xτ )′ = X′.
If x′ ∈ X′, then |x′(x)| = px′ (x) for every x ∈ X, therefore x′ is also weakly continuous by
Corollary 5.4; hence, also X′ = (Xτ )′ ⊆ (Xσ )′ holds. To summarize, we have shown that
(Xσ(X,X′) )′ = X′.
3) Let X and Y be normed spaces and consider the strong operator topology on L(X, Y ), described
in Example 4.12.10) and generated by the seminorms px (T ) := ‖T x‖ (T ∈ L(X, Y ), x ∈ X). We
denote this locally convex space by Lst (X, Y ) and claim that a linear functional Φ on L(X, Y )
belongs to Lst (X, Y )′, if and only if Φ is of the following form: There are n ∈ N, x1 , . . . , xn ∈ X,
and µ1 , . . . , µn ∈ Y′ such that
\[ \Phi(T) = \sum_{j=1}^n \mu_j(T x_j) \qquad (T \in L(X, Y)). \]
Clearly, any Φ of the above form is continuous with respect to the strong operator topology by
Corollary 5.4, since
\[ |\Phi(T)| \le n \max_{j=1,\dots,n} \|\mu_j\| \, \|T x_j\| \le \Big( n \max_{j=1,\dots,n} \|\mu_j\| \Big) \cdot \max_{l=1,\dots,n} p_{x_l}(T). \]
Suppose Φ ∈ Lst (X, Y )′, then Corollary 5.4 implies that there are n ∈ N, M > 0, and x1 , . . . , xn ∈
X such that
\[ |\Phi(T)| \le M \max_{j=1,\dots,n} \|T x_j\| \qquad (T \in L(X, Y)). \]
Let l∞n (Y ) denote the K-vector space Y^n equipped with the norm ‖(y1 , . . . , yn )‖∞ := max{‖y1 ‖, . . . , ‖yn ‖}.
We claim that any norm continuous linear functional µ ∈ l∞n (Y )′ is given in the form
\[ (\ast\ast) \qquad \mu((y_1, \dots, y_n)) = \sum_{j=1}^n \mu_j(y_j) \]
5.10. Hahn-Banach theorem(s) for locally convex vector spaces: Here, I suppose that
you have already seen at least a proof of the extension theorem for normed spaces. Most likely, it
was based on a so-called linear algebraic version as lemma, involving the crucial “lemma of Zorn
argument” and an upper bound of the linear functional in terms of a seminorm or a convex function
or a sublinear function (e.g., as in [Con10, Chapter III, Section 6] or [Tes14, Theorem 4.11] or
[Wer18, Sätze III.1.2, III.1.4 und Lemma III.1.3]; note that a seminorm is a convex function and
also an example of a sublinear function). Therefore, we “recall” the following statement without
repeating a proof.
Linear algebraic version of Hahn-Banach’s extension theorem: Let V be a subspace of
the K-vector space X and p : X → R be sublinear, i.e., p(λx) = λp(x), if λ ≥ 0, and p(x + y) ≤
p(x) + p(y). If l : V → K is a linear functional satisfying Re l(x) ≤ p(x) for every x ∈ V , then
there is a linear functional L : X → K such that L(x) = l(x), if x ∈ V , and Re L(x) ≤ p(x) for
every x ∈ X.
Extension theorem: Let V be a subspace of the locally convex vector space X and l ∈ V′.
Then there exists an extension L ∈ X′ of l.
Proof: Suppose that the locally convex topology on X is generated by the family of seminorms
P . Then the subspace topology on V is generated by the set of restrictions {p|V | p ∈ P } as
family of seminorms on V (since the basic neighborhoods of 0 in V are just obtained from
those in X intersected with V ). By continuity of l, there are M > 0 and p1 , . . . , pm ∈ P such
that |l(x)| ≤ M max{p1 (x), . . . , pm (x)} for every x ∈ V . Define p : X → [0, ∞[ by
p(x) := M max{p1 (x), . . . , pm (x)}
(x ∈ X), then p is a continuous seminorm on X and Re l(x) ≤ |l(x)| ≤ p(x) for every x ∈ V . By
the linear algebraic extension theorem, there is an extension L of l to X satisfying Re L(x) ≤ p(x)
for every x ∈ X. Since p is a seminorm we have p(λx) = p(x) for every λ ∈ K with |λ| = 1, hence
we may conclude that |L(x)| ≤ p(x) holds for every x ∈ X. This shows that L ∈ X′ (strictly
speaking, upon calling on Lemma 5.1(c) and Corollary 5.4).
We turn now to so-called geometric versions of the Hahn-Banach theorem, which state properties
on separation of convex subsets by values of linear functionals.
Lemma on Minkowski functionals: Let W be a convex neighborhood of 0 in the locally convex
vector space X. Then the Minkowski functional pW : X → [0, ∞],
\[ p_W(x) := \inf\{\lambda > 0 \mid x/\lambda \in W\}, \]
has finite values and defines a sublinear function on X. If W is an absolutely convex neighborhood
of 0, then pW is a continuous seminorm on X.
Proof: Since W is absorbing (see footnote 2), we have pW (x) < ∞ for every x ∈ X. From the definition of pW ,
we immediately obtain pW (λx) = λ pW (x), if λ ≥ 0.
We show subadditivity of pW , i.e., that pW (x + y) ≤ pW (x) + pW (y) holds for all x, y ∈ X: Let
x, y ∈ X and ε > 0 be arbitrary. By definition of pW as an infimum, there are λ, µ > 0 such that
λ < pW (x) + ε/2, µ < pW (y) + ε/2 and x/λ ∈ W , y/µ ∈ W . By convexity of W ,
\[ \frac{x+y}{\lambda+\mu} = \frac{\lambda}{\lambda+\mu} \cdot \frac{x}{\lambda} + \frac{\mu}{\lambda+\mu} \cdot \frac{y}{\mu} \in W, \]
which implies pW (x + y) ≤ λ + µ < pW (x) + pW (y) + ε. Since ε > 0 was arbitrary, we obtain
pW (x + y) ≤ pW (x) + pW (y).
If, in addition, W is balanced, then λW = W holds whenever |λ| = 1, thus pW (λx) = pλW (λx) =
pW (x) in this case. Therefore, we obtain for any λ ≠ 0,
\[ p_W(\lambda x) = p_W\Big( \frac{\lambda}{|\lambda|}\, |\lambda| x \Big) = p_W(|\lambda| x) = |\lambda|\, p_W(x) \qquad (x \in X), \]
and may conclude that pW is a seminorm. The continuity of pW on X now follows from Lemma
5.1(a)(iii), since {x ∈ X | pW (x) ≤ 1} ⊇ W is a neighborhood of 0.
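As a numerical illustration of the lemma, the Minkowski functional of a convex set in R^2 can be approximated by bisection. This is a minimal sketch under stated assumptions: the set W (an open disk that contains 0 but is not balanced), the helper names `minkowski`, `in_W`, `p`, and the tolerances are all hypothetical choices made for this example.

```python
def minkowski(in_W, x, hi=1e6, tol=1e-9):
    """Approximate p_W(x) = inf{lam > 0 : x/lam in W} by bisection.

    in_W: membership test for a convex set W in R^2 containing 0 in its interior.
    For such W the set {lam > 0 : x/lam in W} is an interval unbounded above,
    so bisection on lam is justified.
    """
    if not in_W((x[0] / hi, x[1] / hi)):
        raise ValueError("x/hi is not in W; increase hi")
    lo = 0.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if in_W((x[0] / mid, x[1] / mid)):
            hi = mid   # x/mid in W, so p_W(x) <= mid
        else:
            lo = mid   # x/mid not in W, so p_W(x) >= mid
    return hi


# Hypothetical W: open disk of radius 1 centred at (0.5, 0); convex, contains 0,
# but is not balanced.
def in_W(v):
    return (v[0] - 0.5) ** 2 + v[1] ** 2 < 1.0


def p(x):
    return minkowski(in_W, x)


x, y = (1.0, 0.0), (-1.0, 2.0)
print(p(x), p((-1.0, 0.0)))                      # differ: p is not a seminorm
print(p((x[0] + y[0], x[1] + y[1])), p(x) + p(y))  # subadditivity
```

For this W one computes by hand pW ((1, 0)) = 2/3 and pW ((−1, 0)) = 2, so pW is sublinear but not symmetric, in line with the lemma: the seminorm property needs W to be absolutely convex.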
Separation lemma: Let V be a nonempty convex open subset of the locally convex vector space
X. If 0 ∉ V , then there exists x′ ∈ X′ such that Re x′(x) < 0 for every x ∈ V .
Proof: Let x0 ∈ V and U := −x0 + V . Then U is convex, open, 0 ∈ U , and −x0 ∉ U . There
is an absolutely convex neighborhood W of 0 such that W ⊆ U . By the lemma on Minkowski
functionals, we have that pU is sublinear and pW is a continuous seminorm.
(Footnote 2. Recall: Any neighborhood of 0 is absorbing, since it contains an absorbing basic neighborhood defined
by finitely many seminorms.)
Let Y := R-span{−x0 } and define the R-linear functional l : Y → R by l(t(−x0 )) := t pU (−x0 ).
We claim that l(y) ≤ pU (y) for every y ∈ Y : If t > 0, then l(t(−x0 )) = tpU (−x0 ) = pU (t(−x0 ));
if t ≤ 0, then l(t(−x0 )) = t pU (−x0 ) ≤ 0 ≤ pU (t(−x0 )). By real linear algebraic extension, there
is an R-linear extension L : X → R such that L(x) ≤ pU (x) for every x ∈ X. By W ⊆ U we have
also L(x) ≤ pU (x) ≤ pW (x), i.e., |L(x)| ≤ pW (x) for every x ∈ X, since pW is a seminorm. Thus,
L is a continuous R-linear functional on X, since pW is a continuous seminorm (again employing
Lemma 5.1(c) and Corollary 5.4).
Let x ∈ V = x0 + U , say x = x0 + u with u ∈ U . We claim that pU (u) < 1: In fact, if pU (u) ≥ 1,
then u/t ∈ X \ U for every 0 < t < 1; since X \ U is closed, u = lim u/t (t < 1, t → 1) belongs to X \ U , a contradiction.
Furthermore, note that pU (−x0 ) ≥ 1, since −x0 ∉ U . Therefore, we obtain
L(x) = L(u) + L(x0 ) = L(u) + l(x0 ) ≤ pU (u) + l(x0 ) = pU (u) − pU (−x0 ) < 0.
Finally, in the case of a complex vector space we obtain a C-linear continuous functional x′ ∈ X′
with Re x′ = L by setting x′(x) := L(x) − iL(ix) (direct computation shows x′((a + ib)x) = . . . = (a + ib)x′(x)),
which clearly satisfies Re x′(v) = L(v) < 0 for every v ∈ V .
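The passage from the R-linear functional L to the C-linear functional x′(x) := L(x) − iL(ix) used at the end of the proof can be checked numerically in a toy case. The sketch below assumes X = C^2 and a hypothetical real-linear L given by the real part of a complex-linear form; the coefficients and test vectors are arbitrary illustrations.

```python
def L(x, a=(1.0 + 2.0j, -0.5j)):
    """A real-linear functional on C^2: L(x) = Re(a_1 x_1 + a_2 x_2).

    The coefficients a are hypothetical and serve only as an illustration.
    """
    return (a[0] * x[0] + a[1] * x[1]).real


def x_prime(x):
    """Complexification x'(x) := L(x) - i L(ix)."""
    ix = (1j * x[0], 1j * x[1])
    return L(x) - 1j * L(ix)


x = (0.3 - 1.2j, 2.0 + 0.7j)
lam = 1.5 - 0.4j
lhs = x_prime((lam * x[0], lam * x[1]))   # x'(lam * x)
rhs = lam * x_prime(x)                    # lam * x'(x)
print(abs(lhs - rhs), x_prime(x).real - L(x))
```

Both printed numbers should vanish up to rounding, reflecting C-linearity of x′ and Re x′ = L.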
Separation theorem I: Let V1 and V2 be convex subsets of the locally convex vector space X.
If V1 is open and V1 ∩ V2 = ∅, then there exists x′ ∈ X′ such that
\[ \operatorname{Re} x'(v_1) < \operatorname{Re} x'(v_2) \qquad (v_1 \in V_1,\ v_2 \in V_2). \]
∀v ∈ V : Re x′(x) + ε ≤ Re x′(v).
If, in addition, V is absolutely convex, then there exist y′ ∈ X′ and ε > 0 such that
∀v ∈ V : |y′(v)| + ε ≤ Re y′(x).
Since U is absorbing, there is some u0 ∈ U with x′(u0 ) ≠ 0 (for otherwise, we would have x′ = 0, which
contradicts the above inequality). Being also a balanced subset, U contains every multiple λu0
with |λ| ≤ 1, hence there is some u1 ∈ U with Re x′(u1 ) > 0. Therefore ε := sup{Re x′(u) | u ∈ U }
is positive. Picking an arbitrary v0 ∈ V , the inequality in (∗) implies Re x′(u) ≤ Re x′(v0 ) −
Re x′(x) for every u ∈ U , hence Re x′(u) is bounded above and ε < ∞. We have thus shown that
Re x′(x) + ε ≤ Re x′(v) for every v ∈ V , which proves the first part of the statement.
Note that, running through the reasoning above with y′ := −x′, we obtain Re y′(x) − ε ≥ Re y′(v)
for every v ∈ V . If V is absolutely convex, then V = λV for every λ ∈ K with |λ| = 1, hence we
may conclude that Re y′(x) − ε ≥ |y′(v)| holds for every v ∈ V .
Corollary: If X is a Hausdorff locally convex vector space, then X′ separates points, i.e., for any
x, y ∈ X with x ≠ y, there is x′ ∈ X′ such that x′(x) ≠ x′(y).
Proof: Apply the separation theorem II to V := {y}, which is convex and also closed (by the
Hausdorff property).
Remark: The proofs of the Hahn-Banach theorems of extension and separation II do rely in
a crucial way on basic properties of locally convex topological vector spaces, since they employ
seminorms or nontrivial convex open neighborhoods of 0. In general topological vector spaces the
extension theorem may fail and it may happen that convex open 0-neighborhoods are trivial. Both
phenomena are illustrated by the following example: Consider Lp ([0, 1]), but with 0 < p < 1. Then
it is easy to see that
\[ d(f, g) := \int_0^1 |f(t) - g(t)|^p \, dt \]
defines a metric on Lp ([0, 1]), which provides the topology of
a topological vector space. However, one can show the following properties (cf. [Wer18, Aufgabe
VIII.8.3] or [Con10, Chapter IV, Example 3.16]):
• If U is a convex neighborhood of 0, then U = Lp ([0, 1]).
• There is no nonzero continuous linear functional on Lp ([0, 1]), i.e., Lp ([0, 1])′ = {0}.
The separation theorem I does hold in topological vector spaces, though the one convex subset
which is supposed to be open might be a trivial subset (cf. [Con10, Chapter IV, Theorem 3.7.;
see also the paragraph preceding Example 3.16]).
5.11. An application of Hahn-Banach separation—the Krein-Milman theorem: An
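extreme point x0 in a convex set K in a vector space X (over the scalar field 𝕂) is a point x0 ∈ K that cannot be part
of a non-degenerate line segment inside K, i.e.,
\[ x_1, x_2 \in K,\ 0 < \lambda < 1,\ x_0 = \lambda x_1 + (1-\lambda) x_2 \ \Longrightarrow\ x_1 = x_2 = x_0. \]
The set of extreme points of K is denoted by ex K.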
The Krein-Milman theorem states that a nonempty compact convex subset of a locally convex
space is the closed convex hull of its extreme points. Before stating and proving this theorem we
clarify or introduce some of the relevant notation.
If B is a subset of a vector space let co B denote its convex hull (i.e., the intersection of all convex
subsets containing B), and, in case of a topological vector space, let $\overline{\mathrm{co}}\,B$ denote the closure of
the convex hull. The latter is a convex set: Let x, y ∈ $\overline{\mathrm{co}}\,B$ and 0 ≤ λ ≤ 1; there are nets (xj )
and (yj ) in co B with x = lim xj and y = lim yj ; by continuity of the vector space operations, we
obtain λx + (1 − λ)y = lim(λxj + (1 − λ)yj ) ∈ $\overline{\mathrm{co}}\,B$. Therefore, $\overline{\mathrm{co}}\,B$ is the smallest closed convex
set containing B and is called the closed convex hull of B.
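As a generalization of the notion of extreme points, a nonempty subset F of a convex set K is
called a face of K, if F is convex and satisfies
\[ x_1, x_2 \in K,\ 0 < \lambda < 1,\ \lambda x_1 + (1-\lambda) x_2 \in F \ \Longrightarrow\ x_1, x_2 \in F. \]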
As an interesting example we may mention, without proof, the case X = C(Ω)′ ≅ M (Ω), where Ω
is a compact metric space (and M (Ω) denotes the space of signed or complex regular Borel
measures, cf. the Riesz representation theorem 0.17). Let K denote the subset of all Borel probability
measures on Ω. Then clearly K ≠ ∅ and K is convex. Particular elements in K are the Dirac
measures δω concentrated at ω ∈ Ω and it turns out that (cf. [Wer18, Beispiel (f) in VIII.4])
ex K = {δω | ω ∈ Ω}.
Lemma: Let K be a non-empty compact convex subset of a Hausdorff locally convex vector space
X and ρ ∈ X′. Let c := max{Re ρ(x) | x ∈ K} (making use of the continuity of Re ρ : X → R
and of the compactness of K), then F := {x ∈ K | Re ρ(x) = c} is a compact face of K.
Proof: Clearly, F is nonempty, convex, and closed, hence also compact. If x1 , x2 ∈ K, 0 < λ < 1,
and λx1 + (1 − λ)x2 ∈ F , then Re ρ(xj ) ≤ c (j = 1, 2) and c = Re ρ(λx1 + (1 − λ)x2 ) =
λ Re ρ(x1 ) + (1 − λ) Re ρ(x2 ), hence Re ρ(x1 ) = Re ρ(x2 ) = c, which implies x1 , x2 ∈ F .
Theorem (Krein-Milman): If K is a nonempty compact convex subset of a Hausdorff locally
convex vector space X, then ex K ≠ ∅ and K = $\overline{\mathrm{co}}$(ex K).
Proof: Let ℱ denote the set of all compact faces of K, which is nonempty since K ∈ ℱ and
partially ordered by the inclusion relation. If ℱ0 ⊆ ℱ is totally ordered, then we may conclude that
\[ F_0 := \bigcap_{F \in \mathcal{F}_0} F \]
is nonempty (by the finite intersection property for compact sets) and certainly
compact and a face of K. Thus, F0 is a lower bound of ℱ0 . We may therefore apply Zorn’s lemma
and deduce that there exists F ∈ ℱ that is minimal with respect to inclusion.
We will show that F consists of a single point x ∈ K, i.e., F = {x} and hence x is an extreme
point of K, which then implies that ex K ≠ ∅.
Recall that as a face of K, the set F is nonempty. We prove the above assertion by contradiction.
Suppose there are two distinct elements x1 and x2 of F . By the corollary to the Hahn-Banach
theorems there is some ρ ∈ X′ such that Re ρ(x1 ) ≠ Re ρ(x2 ).
Applying the above lemma to F in place of K we may conclude that there is some c ∈ R such
that G := {x ∈ F | Re ρ(x) = c} is a compact face of F . It follows easily from the definition that
G is also a face of K, hence G ∈ ℱ. Since it cannot be that both x1 and x2 belong to G, we obtain
that G is a compact face of K and a proper subset of F , contradicting the minimality of F .
It remains to show that K = $\overline{\mathrm{co}}$(ex K). The relation K ⊇ $\overline{\mathrm{co}}$(ex K) is obvious and we will prove
K ⊆ $\overline{\mathrm{co}}$(ex K) by contradiction. Denote E := $\overline{\mathrm{co}}$(ex K) and suppose x0 ∈ K \ E. By (an obvious
variant of) Hahn-Banach separation II there exist ρ ∈ X′ and a real number a such that
\[ (\ast) \qquad \operatorname{Re} \rho(x) \le a < \operatorname{Re} \rho(x_0) \qquad \text{for every } x \in E. \]
Let c1 := max{Re ρ(x) | x ∈ K}, then c1 > a and the set F1 := {x ∈ K | Re ρ(x) = c1 } is a
compact face of K by the previous lemma. Since F1 is also nonempty, compact, and convex, the
first part of the proof shows that F1 has an extreme point x1 , which is then (see footnote 3) also an extreme
point of K. In particular, x1 ∈ E, while Re ρ(x1 ) = c1 > a, which contradicts (∗).
(Footnote 3. It is easy to see that an extreme point of a face of K is an extreme point of K as well.)
Corollary: Let K be a non-empty compact convex subset of a Hausdorff locally convex vector
space X. If ρ ∈ X′, then there is an extreme point x0 of K such that Re ρ(x) ≤ Re ρ(x0 ) holds
for all x ∈ K.
Proof: We may put c := max{Re ρ(x) | x ∈ K} and obtain from the above lemma that F := {x ∈
K | Re ρ(x) = c} is a compact face of K. By the Krein-Milman theorem the nonempty compact
convex subset F has an extreme point x0 . It follows from the definition that an extreme point of
a face of K is an extreme point of K, thus x0 is an extreme point of K and, since x0 ∈ F , we
have Re ρ(x0 ) = c ≥ Re ρ(x) for every x ∈ K.
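In finite dimensions the corollary reduces to the familiar fact that a linear functional on a polytope attains its maximum at a vertex. The following sketch illustrates this for a rectangle in R^2; the rectangle, the functional `rho`, and the random sampling are hypothetical choices for illustration and prove nothing in general.

```python
import random


def rho(x, a=(2.0, -1.0)):
    """A real linear functional on R^2 (the coefficients are hypothetical)."""
    return a[0] * x[0] + a[1] * x[1]


# K is the rectangle with these corners; its extreme points are exactly the corners.
vertices = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]


def random_point_of_K(rng):
    """Random convex combination of the corners, i.e. a point of K."""
    w = [rng.random() + 1e-12 for _ in vertices]   # strictly positive weights
    s = sum(w)
    return (sum(wi * v[0] for wi, v in zip(w, vertices)) / s,
            sum(wi * v[1] for wi, v in zip(w, vertices)) / s)


rng = random.Random(0)
samples = [random_point_of_K(rng) for _ in range(10_000)]
best_vertex = max(vertices, key=rho)      # extreme point maximizing rho
c = rho(best_vertex)
largest_sample_value = max(rho(x) for x in samples)
print(best_vertex, c, largest_sample_value)
```

Here the face F = {x ∈ K | ρ(x) = c} is the single corner (4, 0); for a functional constant along an edge, F would be that edge and any of its endpoints would serve as x0.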
Remark: One can also show the following related result (cf. [Wer18, Theorem VIII.4.4(c)] or
[Con10, Chapter V, Theorem 7.8] or [KR, Theorem 1.4.5]): If K is a nonempty compact convex
subset of a Hausdorff locally convex vector space X and B is a closed subset of K such that
K = $\overline{\mathrm{co}}\,B$, then B contains the extreme points of K, i.e., B ⊇ ex K.
6. Weak and weak* topologies
6.1. Dual pairs: Let X, Y be K-vector spaces and (x, y) ↦ ⟨x, y⟩ be a bilinear map X × Y → K.
We call (X, Y ) a dual pair, if
\[ (6.1) \qquad \forall x \in X,\ x \ne 0,\ \exists y \in Y : \langle x, y \rangle \ne 0 \quad \text{and} \quad \forall y \in Y,\ y \ne 0,\ \exists x \in X : \langle x, y \rangle \ne 0. \]
For any y ∈ Y , we have the induced linear functional ly (x) := ⟨x, y⟩ on X and the map y ↦ ly is
linear from Y into the algebraic dual X∗ of X. The requirement above says that the latter is an
injective map, and the same holds for the assignment x ↦ ⟨x, .⟩, X → Y∗. In this sense, we have
Y ↪ X∗ and X ↪ Y∗ for any dual pair (X, Y ). Note that the ranges of these injective maps are
point separating subspaces in the respective duals.
For any y ∈ Y we have the seminorm py (x) := |⟨x, y⟩| on X and the set P of these seminorms
generates a locally convex vector space topology on X, which we call the σ(X, Y )-topology (or
weak topology) on X. The σ(Y, X)-topology on Y is defined analogously. It follows from the
definition of a dual pair and Lemma 4.11 that the σ(X, Y )-topology and the σ(Y, X)-topology are
always Hausdorff. By Proposition 5.7, a net (xj )j∈I in X converges to 0 in the σ(X, Y )-topology,
if and only if ⟨xj , y⟩ → 0 for every y ∈ Y . Considering xj (j ∈ I) as elements in Y∗, this translates
into pointwise convergence of the net (xj ).
Examples: 1) Let X be a locally convex Hausdorff vector space and Y = X′, then (X, X′) is a
dual pair with respect to the bilinear map (x, x′) ↦ x′(x), since the second part in (6.1) follows
from the definition of maps X → K and the first part from the corollary of the Hahn-Banach
theorem stated in 5.10. The corresponding topology σ(X, X′) is also called the weak topology on
X and this is consistent with the topology described in Example 4.12.8) in case of a normed space.
2) In the situation of 1), we also obtain a dual pair (X′, X) and σ(X′, X) is the topology of
pointwise convergence, which we call weak* topology as in the case of normed spaces discussed in
Example 4.12.9).
3) Coming back to Example 4.12.11), consider a metric space Ω, X = Cb (Ω) the space of bounded
continuous functions, and Y = M (Ω) the space of regular signed or complex Borel measures on
Ω. The bilinear map
\[ (f, \mu) \mapsto \int_\Omega f \, d\mu \qquad (f \in C_b(\Omega),\ \mu \in M(\Omega)) \]
defines the structure of a dual pair (the argument is again based on the regularity of the measures,
see [Els11, Kapitel VIII, §1, Satz 4.6]). The σ(M (Ω), Cb (Ω))-topology corresponds to the notion
of weak topology of probability theory as indicated in Example 4.12.11).
4) On X = R^R (the space of functions R → R) consider for any t ∈ R the evaluation functional
δt : R^R → R, f ↦ f (t). Let Y := span{δt | t ∈ R} (a subspace of (R^R)∗). We obtain a dual pair
by defining the bilinear map X × Y → R,
\[ \Big\langle f, \sum_{k=1}^m \lambda_k \delta_{t_k} \Big\rangle := \sum_{k=1}^m \lambda_k f(t_k), \]
\[ u = \sum_{j=1}^m x_j \otimes y_j'. \]
We claim that we obtain a dual pair (L(X, Y ), X ⊗ Y′) via the bilinear map
⟨., .⟩ : L(X, Y ) × (X ⊗ Y′) → K, given by
\[ \Big\langle T, \sum_{j=1}^m x_j \otimes y_j' \Big\rangle := \sum_{j=1}^m y_j'(T x_j). \]
To show that for any T ≠ 0 there is a u ∈ X ⊗ Y′ with ⟨T, u⟩ ≠ 0, first choose x0 ∈ X such
that T x0 ≠ 0, then there is some y′0 ∈ Y′ with y′0 (T x0 ) ≠ 0 by the Hahn-Banach theorem; thus,
⟨T, x0 ⊗ y′0 ⟩ = y′0 (T x0 ) ≠ 0.
Conversely, for u ≠ 0 in X ⊗ Y′, choose x′0 ∈ X′ and y0 ∈ Y
such that u(x′0 )(y0 ) ≠ 0; we define T ∈ L(X, Y ) by T x := x′0 (x)y0 and arrive at
\[ \langle T, u \rangle = \sum_{j=1}^m y_j'(T x_j) = \sum_{j=1}^m y_j'\big( x_0'(x_j)\, y_0 \big) = \sum_{j=1}^m x_0'(x_j)\, y_j'(y_0) = u(x_0')(y_0) \ne 0. \]
Any of the seminorms pu (T ) = |⟨T, u⟩|, with u = x1 ⊗ y′1 + . . . + xm ⊗ y′m , is obviously continuous with respect to
the weak operator topology. Therefore, Corollary 5.2 shows that the σ(L(X, Y ), X ⊗ Y′)-topology
coincides with the weak operator topology.
6.2. The dual space of (X, σ(X, Y )): We need a preparatory result from linear algebra.
Lemma: Let l, l1 , . . . , ln : X → K be linear functionals. Then the following are equivalent:
(i) l ∈ span{l1 , . . . , ln },
(ii) ∃M > 0 such that |l(x)| ≤ M max{|l1 (x)|, . . . , |ln (x)|} holds for every x ∈ X,
(iii) ker(l1 ) ∩ . . . ∩ ker(ln ) ⊆ ker(l).
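A small finite-dimensional instance may help to see why the three conditions of the lemma go together. The sketch below works on X = R^3 with hypothetical functionals given by coefficient vectors; for l = α1 l1 + α2 l2 it uses the constant M = |α1| + |α2| in (ii), which is one admissible choice and not claimed to be optimal.

```python
import random


def apply(l, x):
    """Evaluate the functional with coefficient vector l at x."""
    return sum(a * b for a, b in zip(l, x))


l1, l2 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
alphas = (2.0, -3.0)                       # hypothetical coefficients
l_good = tuple(alphas[0] * a + alphas[1] * b for a, b in zip(l1, l2))  # in the span
l_bad = (0.0, 0.0, 1.0)                    # not in span{l1, l2}

# (i) => (ii): |l(x)| <= (|alpha_1| + |alpha_2|) * max_j |l_j(x)|.
M = sum(abs(a) for a in alphas)
rng = random.Random(1)
samples = [tuple(rng.uniform(-10, 10) for _ in range(3)) for _ in range(1000)]
estimate_holds = all(
    abs(apply(l_good, x)) <= M * max(abs(apply(l1, x)), abs(apply(l2, x))) + 1e-9
    for x in samples
)

# (iii): e3 spans ker(l1) ∩ ker(l2); l_good vanishes there, l_bad does not.
e3 = (0.0, 0.0, 1.0)
print(estimate_holds, apply(l_good, e3), apply(l_bad, e3))
```

For `l_bad` the vector e3 violates (iii) and, at the same time, shows that no constant M can work in (ii), since the right-hand side vanishes at e3 while the left-hand side equals 1.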
Corollary: Let (X, Y ) be a dual pair. A linear functional on X is σ(X, Y )-continuous, if and
only if it is of the form x ↦ ⟨x, y⟩ for some y ∈ Y . Therefore,
(Xσ(X,Y ) )′ = Y.
Proof: By Corollary 5.4, a linear functional l on X is σ(X, Y )-continuous, if and only if it satisfies
an estimate as in statement (ii) of the above lemma with lj (x) = ⟨x, yj ⟩ and yj ∈ Y . Therefore,
the σ(X, Y )-continuity of l is also equivalent to being a linear combination l = α1 l1 + . . . + αn ln , i.e.,
l(x) = ⟨x, y⟩ with y = α1 y1 + . . . + αn yn .
In particular, in case of a locally convex vector space (X, τ ) with the dual pairs (X, X′) or (X′, X),
we obtain the following statements as direct consequences:
(a) A linear functional on X is weakly continuous, if and only if it is τ -continuous.
(Compare with Example 5.9.2), where this result has been noted for normed spaces.)
(b) A linear functional on X′ is weak* continuous, if and only if it is of the form of an evaluation
functional x′ ↦ x′(x) with some x ∈ X, i.e., (X′σ(X′,X) )′ = X.
6.3. Proposition: Let (X, Y ) be a dual pair. The weak topology σ(X, Y ) has the following
property: A map f from any topological space (T, τ ) into (X, σ(X, Y )) is continuous, if and only
if the composition map y ◦ f : t ↦ ⟨f (t), y⟩ is continuous T → K for every y ∈ Y .
We may remark that, in fact ([Sch71, Sections II.5 and IV.1]), σ(X, Y ) is the coarsest topology
on X such that every y ∈ Y defines a continuous function and thus is the initial or projective
topology on X with respect to the linear functionals given by the elements y ∈ Y .
Proof: Clearly, if f : (T, τ ) → (X, σ(X, Y )) is continuous, then y ◦ f is continuous for every y ∈ Y
due to the corollary in 6.2.
To prove the converse, suppose y ◦ f is continuous for every y ∈ Y . Let t ∈ T and U be a
σ(X, Y )-neighborhood of 0 in X. We have to show that there is a τ -neighborhood W of t in
T such that f (W ) ⊆ f (t) + U . We may assume that U is a basis neighborhood of the form
U = {x ∈ X | |⟨x, yj ⟩| ≤ ε (j = 1, . . . , n)} with y1 , . . . , yn ∈ Y . Every function yj ◦ f (1 ≤ j ≤ n)
is continuous, thus, there exist τ -neighborhoods Wj of t in T such that |⟨f (s) − f (t), yj ⟩| ≤ ε, if
s ∈ Wj . Putting W := W1 ∩ . . . ∩ Wn we obtain a neighborhood of t such that f (W ) − f (t) ⊆ U .
6.4. Polar subsets: Let (X, Y ) be a dual pair, A ⊆ X, and B ⊆ Y . The polar of A is
A◦ := {y ∈ Y | ∀x ∈ A : Re⟨x, y⟩ ≤ 1}
(ii) In the literature, one often finds the definition of the polar of a subset A ⊆ X in the form
A◦ = {y ∈ Y | ∀x ∈ A : |⟨x, y⟩| ≤ 1}, which we would call here the absolute polar of A.
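For subsets of X or Y of the dual pair (X, Y ), closures shall always refer to the weak topology
σ(X, Y ) or σ(Y, X), unless stated otherwise. Let A and Aj (j = 1, 2 or j ∈ J) denote subsets of
X, then we collect the following list of properties (see footnote 1):
(a) $A_1 \subseteq A_2 \Rightarrow A_2^\circ \subseteq A_1^\circ$,
(b) 0 ∈ A◦ and A ⊆ A◦◦,
(c) $A^\circ = \overline{\mathrm{co}}\,(A^\circ)$ and $A^\circ = (\overline{\mathrm{co}}\,A)^\circ$,
(d) if A is balanced, then A◦ = {y ∈ Y | ∀x ∈ A : |⟨x, y⟩| ≤ 1} (absolute polar),
(e) if λ > 0, then (λA)◦ = (1/λ) A◦,
(f) $\big( \bigcup_{j \in J} A_j \big)^\circ = \bigcap_{j \in J} A_j^\circ$,
(g) $\big( \bigcap_{j \in J} A_j \big)^\circ \supseteq \overline{\mathrm{co}}\,\big( \bigcup_{j \in J} A_j^\circ \big)$.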
Proof: Properties (a), (b), and (d)-(f) are clear or very easily shown.
(c): $A^\circ = \bigcap_{x \in A} \{y \in Y \mid \operatorname{Re}\langle x, y\rangle \le 1\}$ is an intersection of closed convex sets, hence it is itself
convex and closed, and therefore $A^\circ = \overline{\mathrm{co}}\,(A^\circ)$. Since $A \subseteq \overline{\mathrm{co}}\,A$, we have $(\overline{\mathrm{co}}\,A)^\circ \subseteq A^\circ$ by (a),
hence it remains to show that any y ∈ A◦ belongs to $(\overline{\mathrm{co}}\,A)^\circ$. This follows upon observing that
Re⟨λx1 + (1 − λ)x2 , y⟩ = λ Re⟨x1 , y⟩ + (1 − λ) Re⟨x2 , y⟩ (for any real λ) and Re⟨lim xj , y⟩ =
lim Re⟨xj , y⟩.
(g): Put $A := \bigcap_{j \in J} A_j$, then A ⊆ Aj for every j ∈ J, hence $A^\circ \supseteq A_j^\circ$ for every j ∈ J, and
therefore $A^\circ \supseteq \bigcup_{j \in J} A_j^\circ$. By (c), A◦ is convex and closed, thus also $A^\circ \supseteq \overline{\mathrm{co}}\,\big( \bigcup_{j \in J} A_j^\circ \big)$.
By the second inequality, y2 ∈ A◦. But then the first inequality implies x0 ∉ A◦◦, which gives a
contradiction. Thus, we also have A◦◦ ⊆ V and the theorem is proved.
(Footnote 1. The analogous properties hold for subsets of Y .)
Corollary: Let C ⊆ X be convex with 0 ∈ C, then we have:
C is σ(X, Y )-closed ⇐⇒ ∃B ⊆ Y : C = B ◦ .
Proof: The implication ‘⇐’ follows by (c), and ‘⇒’ follows upon setting B := C ◦ and appealing
to the bipolar theorem.
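For the dual pair (R^2, R^2) with the Euclidean pairing, polars of finite sets can be explored by direct membership tests. The sketch below is only an illustration on a finite grid: the set A, the grid, and the helper names are hypothetical, and a grid search can disprove membership in the bipolar (by exhibiting a witness) but cannot prove it.

```python
def pair(x, y):
    """Real bilinear pairing <x, y> on R^2 x R^2."""
    return x[0] * y[0] + x[1] * y[1]


def in_polar(A, y):
    """y in A° iff <x, y> <= 1 for every x in the finite set A."""
    return all(pair(x, y) <= 1 for x in A)


def separating_witness(A, x, grid):
    """Return some y in A° (from the grid) with <x, y> > 1, or None.

    A witness proves that x is not in the bipolar A°°; None is only evidence,
    not a proof, because the grid is finite.
    """
    for y in grid:
        if in_polar(A, y) and pair(x, y) > 1:
            return y
    return None


A = [(1.0, 0.0), (0.0, 1.0)]
grid = [(i / 2, j / 2) for i in range(-8, 9) for j in range(-8, 9)]

# Property (e): y in (lam*A)° iff lam*y in A°.
lam = 2.0
scaled_A = [(lam * x[0], lam * x[1]) for x in A]
property_e = all(
    in_polar(scaled_A, y) == in_polar(A, (lam * y[0], lam * y[1])) for y in grid
)

outside = separating_witness(A, (1.0, 1.0), grid)   # (1, 1) is not in A°°
inside = separating_witness(A, (0.5, 0.5), grid)    # (0.5, 0.5) lies in co(A)
print(property_e, outside, inside)
```

Here A◦ = {y | y1 ≤ 1, y2 ≤ 1}, and the bipolar is the closed convex hull of A ∪ {0}, i.e., the triangle with corners (0, 0), (1, 0), (0, 1); this is why (0.5, 0.5) admits no witness while (1, 1) does.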
6.5. Alaoglu-Bourbaki theorem: Let X be a locally convex vector space and U be a
neighborhood of 0 in X. Then U◦ (in the sense of the dual pair (X, X′)) is weak* compact.
Proof: Let τp denote the topology of pointwise convergence on the set K^X of all functions X → K.
Recall that τp corresponds to the product topology on $K^X = \prod_{x \in X} K$ and therefore we have
Tychonoff’s theorem as a convenient criterion for compactness of product sets in (K^X , τp ). We
have X′ ⊆ K^X and τp |X′ = σ(X′, X) (as noted in 6.1, Example 1) above).
We may suppose that U is absolutely convex, for we always have an absolutely convex
neighborhood V of 0 with V ⊆ U , which gives that U◦ ⊆ V◦ and compactness of U◦ follows from that of
V◦. By property (d) in 6.4, we may thus suppose that
sequences in the dual X′ always possess weak* convergent subsequences. Without separability
this is no longer true. An example of this failure is the following: Consider the evaluation
functionals $\delta_n \in (l^\infty)'$ (n ∈ N) given by δn (x) := xn for every x = (xm )m∈N ∈ l∞ . We have
$\delta_n \in B_{(l^\infty)'} = (B_{l^\infty})^\circ$. Suppose we had a weak* convergent subsequence (δnk )k∈N of (δn )n∈N . For
every x ∈ l∞ put y(x) := lim δnk (x) (k → ∞). Then $y \in B_{(l^\infty)'}$, since $B_{(l^\infty)'}$ is weak* closed (being a
weak* compact subset). Let z = (zm )m∈N ∈ l∞ be defined by
\[ z_m := \begin{cases} 0, & \text{if } \forall k \in \mathbb{N} : m \ne n_k, \\ 1, & \text{if } \exists l \in \mathbb{N} : m = n_{2l}, \\ 0, & \text{if } \exists l \in \mathbb{N}_0 : m = n_{2l+1}. \end{cases} \]
We obtain $\delta_{n_{2l}}(z) = 1$ for l ∈ N and $\delta_{n_{2l+1}}(z) = 0$ for l ∈ N0 and therefore arrive at the
contradiction y(z) = lim δnk (z) = lim(0, 1, 0, 1, 0, 1, . . .).
(ii) Recall that a Banach space X is reflexive, if it is canonically isomorphic to X″ via the map
ι : X → X″, ι(x)(x′) := x′(x). Combining the bipolar and the Alaoglu-Bourbaki theorem, one can
obtain characterizations of reflexivity for Banach spaces (cf. [Wer18, Satz VIII.3.18] or [Con10,
Chapter V, Theorem 4.2]), e.g., in the following form: The Banach space X is reflexive, if and
only if BX is weakly compact.
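The counterexample in remark (i) on evaluation functionals on l∞ is easy to simulate on finitely many terms: for any candidate subsequence (nk ) one builds the test sequence z and observes that δnk (z) alternates. The sketch below uses the hypothetical subsequence nk = k^2 and only its first ten indices; the helper name is likewise an assumption.

```python
def alternating_test_sequence(nk):
    """Given the first indices n_1 < n_2 < ... < n_K of a subsequence,
    return z as a function m -> z_m with z_m = 1 exactly for m = n_2, n_4, ...
    """
    ones = {nk[k] for k in range(1, len(nk), 2)}   # nk[1] = n_2, nk[3] = n_4, ...
    return lambda m: 1 if m in ones else 0


nk = [k * k for k in range(1, 11)]     # hypothetical subsequence n_k = k^2
z = alternating_test_sequence(nk)
values = [z(n) for n in nk]            # delta_{n_k}(z) = z_{n_k}
print(values)                          # [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
```

The same recipe works for every strictly increasing (nk ), which is the reason no subsequence of (δn ) can converge in the weak* topology.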
6.8. Weak* compactness of convex sets and extreme points: Although we always have
plenty of convex sets in any locally convex space, the compactness of these subsets is often true
only in the weak or weak* topologies and the Alaoglu-Bourbaki theorem plays an important part
in many applications of the Krein-Milman theorem 5.11. A particular case is again the closed
unit ball in the dual of a normed space, where the Krein-Milman theorem applies for the weak*
topology and can be used, e.g., to show that the Banach spaces c0 (the space of real or complex
sequences converging to 0) and L1 ([0, 1]) cannot be the dual of a normed space, since their closed
unit balls do not have any extreme points (cf. [Wer18, Korollar VIII.4.6]).
7. Basic theory of distributions
The theory of distributions aims at an extension of the notion of scalar functions on an open
subset Ω ⊆ Rn while maintaining a calculus of differentiation. The basic idea is to consider
linear functionals on function spaces and we can make the following observations:
(a) Functions can be considered as linear functionals (see also Example 5.9.1)): For example,
suppose $f \in L^1_{\mathrm{loc}}(\Omega)$, then (see footnote 1)
\[ T_f(\varphi) := \int_\Omega f(x)\, \varphi(x)\, dx \]
defines a linear functional Tf : Cc (Ω) → C. But now we are not tied up with functions anymore
in implementing such functionals. In particular, we have finite measures acting as functionals
(similarly as in the Riesz representation theorem). One of the most prominent functionals is given
by the Dirac measure δx0 concentrated at the point x0 ∈ Ω, acting by
\[ \delta_{x_0}(\varphi) = \int_\Omega \varphi(x)\, d\delta_{x_0} = \varphi(x_0). \]
(b) Differentiation of non-smooth functions can be implemented taking up the idea of integration
by parts (which is the concept of a weak derivative): On Ω = ]0, 1[, for example, if $f \in C^1(]0, 1[)$
and $\varphi \in C_c^1(]0, 1[)$, then
\[ T_{f'}(\varphi) = \int_0^1 f'(x)\, \varphi(x)\, dx = - \int_0^1 f(x)\, \varphi'(x)\, dx = -T_f(\varphi'). \]
However, if f is not differentiable and we have merely $f \in L^1_{\mathrm{loc}}(]0, 1[)$, then we may still define
the linear functional $T_{f'}(\varphi) := -T_f(\varphi')$ on $C_c^1(]0, 1[)$ and consider $T_{f'}$ to be the derivative of $T_f$.
We observe that we may obtain definitions of derivatives up to order k ∈ N, if we define the
functionals instead on the smaller spaces Cck (Ω), or, to be prepared for derivatives of arbitrary
order, consider the space Cc∞ (Ω) = D(Ω) of test functions known from Example 4.12.6).
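The definition of the derivative via −Tf (ϕ′) from observation (b) can be tested numerically for a function with a kink. The sketch below takes f (x) = |x − 1/2| on ]0, 1[, whose weak derivative is the sign function g of x − 1/2, and compares −Tf (ϕ′) with Tg (ϕ) for one smooth bump ϕ. The bump, its centre and radius, the midpoint rule, and all function names are hypothetical choices; a single test function illustrates the identity but does not prove it.

```python
import math


def phi(x, c=0.4, r=0.3):
    """Smooth bump supported in [c - r, c + r], a subset of ]0, 1[."""
    s = (x - c) / r
    if abs(s) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - s * s))


def dphi(x, c=0.4, r=0.3):
    """Derivative of phi, computed by the chain rule."""
    s = (x - c) / r
    if abs(s) >= 1.0:
        return 0.0
    t = 1.0 - s * s
    return phi(x, c, r) * (-2.0 * s / (t * t)) / r


def f(x):
    return abs(x - 0.5)               # continuous, not differentiable at 1/2


def g(x):
    return 1.0 if x > 0.5 else -1.0   # candidate weak derivative of f


def midpoint_integral(h, n=20_000):
    """Midpoint rule for the integral of h over ]0, 1[."""
    return sum(h((k + 0.5) / n) for k in range(n)) / n


lhs = -midpoint_integral(lambda x: f(x) * dphi(x))   # -T_f(phi')
rhs = midpoint_integral(lambda x: g(x) * phi(x))     # T_g(phi)
print(lhs, rhs)
```

The bump is placed asymmetrically around 1/2 so that both sides are nonzero; the two numbers should agree up to the quadrature error.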
(Footnote 1. Recall that $L^1_{\mathrm{loc}}(\Omega)$ is the set of measurable functions that are integrable on every compact subset of Ω,
and Cc (Ω) denotes the space of continuous functions with compact support.)
7.1. The space of test functions: Recall from Examples 5) and 6) in 4.12, that we defined
the locally convex topology τ on D(Ω) via the family of seminorms P on D(Ω) such that, for
every p ∈ P , the restriction p |DK (Ω) is continuous (DK (Ω), τK ) → [0, ∞[ for every compact subset
K ⊂ Ω. Here, DK (Ω) is the set of functions $\varphi \in C_c^\infty(\Omega)$ with supp(ϕ) ⊆ K and the locally convex
topology τK is generated by the family of seminorms
Note that pl ≤ pm , if l ≤ m, thus, any finite number of seminorms pm1 , . . . , pmr is bounded by
the single seminorm pm , if m ≥ mj (j = 1, . . . , r). Strictly speaking, we should indicate the
dependence on K in the notation for the seminorms pm , but we do not want to overload the
notation and it is convenient to think of pm as denoting the appropriate restriction to DK (Ω) of
a similar seminorm given on D(Ω).
By Lemma 5.1(c) and the monotonicity of pm with respect to m observed above, a seminorm p
on D(Ω) belongs to P , if and only if for every compact subset K ⊂ Ω there are c > 0 and m ∈ N0
such that $p(\varphi) \le c\, p_m(\varphi)$ for every $\varphi \in \mathcal{D}_K(\Omega)$.
(i) ϕl → 0 with respect to τ ,
(ii) there exists a compact set K ⊂ Ω such that supp(ϕl ) ⊆ K for every l ∈ N and ϕl → 0 holds
in DK (Ω),
(iii) there exists a compact set K ⊂ Ω such that $\operatorname{supp}(\varphi_l) \subseteq K$ for every l ∈ N and for every
$\alpha \in \mathbb{N}_0^n$ the sequence $(\partial^\alpha \varphi_l)_{l\in\mathbb{N}}$ converges uniformly to 0.
Proof: Statement (iii) is just a reformulation of (ii) and the implication (ii) ⇒ (i) is clear from
the definition of τ (and (a) in the above lemma), hence it suffices to prove (i) ⇒ (ii).
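Suppose (i) holds but (ii) were wrong, i.e., there is no compact subset K ⊂ Ω with the properties
stated. Then we can find an increasing sequence of compact sets
$K_1 \subset K_2^\circ \subset K_2 \subset K_3^\circ \subset K_3 \subset \ldots \subset \Omega$ with $\bigcup_{m\in\mathbb{N}} K_m^\circ = \Omega$
and a subsequence $(\varphi_{l_m})_{m\in\mathbb{N}}$ such that $\varphi_{l_m} \in \mathcal{D}_{K_m}(\Omega) \setminus \mathcal{D}_{K_{m-1}}(\Omega)$ for
every m ∈ N (note that (i) and $\operatorname{supp}(\varphi_l) \subseteq K$ for every l ∈ N would imply $\varphi_l \to 0$ in $\mathcal{D}_K(\Omega)$). Any
compact subset of Ω is contained in some $K_m$, since the sets $K_m^\circ$ (m ∈ N) provide an increasing
open cover of Ω. Therefore, $\mathcal{D}(\Omega) = \bigcup_{m\in\mathbb{N}} \mathcal{D}_{K_m}(\Omega)$.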
Choose $x_m \in K_m^\circ \setminus K_{m-1}$ with $\alpha_m := |\varphi_{l_m}(x_m)| > 0$. We have seen in the proof of
the preceding lemma that the seminorms $\pi_m : \varphi \mapsto |\varphi(x_m)|/\alpha_m$ are τ-continuous on $\mathcal{D}(\Omega)$. We note that
$\pi_m|_{\mathcal{D}_{K_r}(\Omega)} = 0$, if m > r. Hence for every $\varphi \in \mathcal{D}(\Omega)$ there are at most finitely many m ∈ N such
that $\pi_m(\varphi) \neq 0$ and we may define the seminorm π on $\mathcal{D}(\Omega)$ by $\pi(\varphi) = \sum_{m=1}^{\infty} \pi_m(\varphi)$. If M ⊂ Ω
is compact, there is some N ∈ N such that $M \subseteq K_N$, which implies $\pi|_{\mathcal{D}_M(\Omega)} = \sum_{m=1}^{N} \pi_m|_{\mathcal{D}_M(\Omega)}$
and shows $\tau_M$-continuity of the restriction. Thus, π ∈ P, in particular, π is τ-continuous.
Since $\varphi_l \to 0$ by assumption, we have $\pi(\varphi_{l_m}) \to \pi(0) = 0$ (m → ∞), but at the same time
$\pi(\varphi_{l_m}) \ge \pi_m(\varphi_{l_m}) = 1$, a contradiction.
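The role of the common compact set in (ii) and (iii) can be seen in a simple example, sketched here under the assumption Ω = R and with a fixed, hypothetical test function ψ ≠ 0 (these names are not from the notes). The first sequence converges to 0 in $\mathcal{D}(\mathbb{R})$; the second converges to 0 uniformly together with all derivatives, but its supports leave every compact set, so by the equivalence just proved it does not converge with respect to τ:

```latex
% Illustration (requires amsmath): \psi \in \mathcal{D}(\mathbb{R}) fixed,
% \psi \neq 0, with supp(\psi) \subseteq [-1,1] assumed for concreteness.
\begin{align*}
\varphi_l(x) &:= \tfrac{1}{l}\,\psi(x),
  & \operatorname{supp}(\varphi_l) &\subseteq [-1,1], \\
\tilde\varphi_l(x) &:= \tfrac{1}{l}\,\psi(x-l),
  & \operatorname{supp}(\tilde\varphi_l) &\subseteq [l-1,\,l+1], \\
\sup_x |\varphi_l^{(k)}(x)| &= \sup_x |\tilde\varphi_l^{(k)}(x)|
  = \tfrac{1}{l}\,\sup_x |\psi^{(k)}(x)| \to 0
  & &(l \to \infty,\ k \in \mathbb{N}_0).
\end{align*}
```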
7.2. Definition: The dual space of $(\mathcal{D}(\Omega), \tau)$ is denoted by $\mathcal{D}'(\Omega)$ and called the space of
distributions on Ω.
7.3. Theorem: Let $T : \mathcal{D}(\Omega) \to \mathbb{C}$ be linear, then the following are equivalent:
(i) $T \in \mathcal{D}'(\Omega)$, i.e., T is continuous,
(ii) for every compact set K ⊂ Ω, the restriction $T|_{\mathcal{D}_K(\Omega)}$ is continuous on $\mathcal{D}_K(\Omega)$,
(iii) for every compact set K ⊂ Ω, there are $m \in \mathbb{N}_0$ and c > 0 such that
$|T(\varphi)| \le c\, p_m(\varphi)$ for every $\varphi \in \mathcal{D}_K(\Omega)$.
a routine exercise to show that, generally in a locally convex vector space in these circumstances,
we obtain a metric inducing the same topology by setting (see, e.g., [Kab14, Satz 1.1])
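\[ d(\varphi, \psi) := \sum_{m=0}^{\infty} 2^{-m}\, \frac{p_m(\varphi - \psi)}{1 + p_m(\varphi - \psi)} \qquad (\varphi, \psi \in \mathcal{D}_K(\Omega)). \]
(For the proof of the triangle inequality, it is advisable to make use of the fact that $t \mapsto t/(1 + t)$ is an increasing function $[0, \infty[ \to \mathbb{R}$.)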
Thus, we have seen that (DK (Ω), τK ) is a metrizable space, therefore sequential continuity of
T |DK (Ω) implies continuity.
7.4. Remark: The constant c as well as the nonnegative integer m in condition (iii) of the above
theorem depend, in general, on the compact set K. If a single m can be chosen for all compact
sets K, then the smallest such m is called the order of the distribution.
7.5. Examples: 1) The linear functional $T_f$, defined for any $f \in L^1_{\mathrm{loc}}(\Omega)$ in observation (a) of
the introduction to the current section, is indeed a distribution. For every compact K ⊂ Ω and
$\varphi \in \mathcal{D}_K(\Omega)$, we have
\[ |T_f(\varphi)| \le \int_\Omega |f(x)||\varphi(x)|\,dx \le \int_K |f(x)|\,dx \cdot \sup_{x\in\Omega} |\varphi(x)| = \|f|_K\|_{L^1} \cdot p_0(\varphi), \]
everywhere for every compact subset K (by [Els11, Kapitel IV, Satz 4.4]), which implies f = 0
almost everywhere, since Ω is σ-compact.)
2) Let µ be a complex Borel measure on Ω (recall that µ is a regular measure by Lemma 0.16).
Then
\[ \mu(\varphi) := \int_\Omega \varphi\,d\mu \]
defines a distribution (of order 0), since $|\mu(\varphi)| \le |\mu|(K) \cdot p_0(\varphi)$ for every $\varphi \in \mathcal{D}_K(\Omega)$. An
argument as in 1) shows that also here the map assigning the functional to each measure is
injective. A prominent example, mentioned already in observation (a) of the introduction, is the
Dirac distribution $\delta_{x_0}$ concentrated at the point $x_0 \in \Omega$, which is obtained from the Dirac measure
at $x_0$ and acts by $\delta_{x_0}(\varphi) = \varphi(x_0)$.
² The construction sketched in [Wer18, p. 461] is as follows: Consider the compact sets
$K_l := \{x \in \Omega \mid d(x, K) \le 1/l\}$; we have $K = \bigcap_{l=1}^{\infty} K_l$; choose $\varphi_0 \in \mathcal{D}(\mathbb{R}^n)$ with
$\operatorname{supp}(\varphi_0) \subseteq \{|x| \le 1\}$ and $\int \varphi_0 = 1$ and put $\varphi_l(x) := \int_{K_l} l^n \varphi_0(l(x-y))\,dy$ for every x ∈ Ω;
then $\varphi_l$ is smooth (integration with parameters), …
We can easily show that $\delta_{x_0}$ is not a regular distribution: Suppose $\delta_{x_0} = T_f$ for some $f \in L^1_{\mathrm{loc}}(\Omega)$,
then $\Omega_0 := \Omega \setminus \{x_0\}$ is an open subset of $\mathbb{R}^n$ and $\mathcal{D}(\Omega_0) \hookrightarrow \mathcal{D}(\Omega)$, since any smooth function with
compact support in $\Omega_0$ can be extended by 0 outside to give an element in $\mathcal{D}(\Omega)$ (compact sets
have positive distance to the boundary of any surrounding open set). By slight abuse of notation,
we have $T_{f|_{\Omega_0}} = (T_f)|_{\mathcal{D}(\Omega_0)} = \delta_{x_0}|_{\mathcal{D}(\Omega_0)} = 0$ and injectivity of $L^1_{\mathrm{loc}}(\Omega_0) \ni g \mapsto T_g \in \mathcal{D}'(\Omega_0)$ yields
that $f|_{\Omega_0} = 0$ almost everywhere. Therefore, we obtain f = 0 almost everywhere, which would
imply $\delta_{x_0} = 0$, a contradiction, since there are functions $\varphi \in \mathcal{D}(\Omega)$ with $\varphi(x_0) = 1$.
3) The linear functional $T(\varphi) := \varphi'(0)$ is a distribution (of order 1) on R, i.e., $T \in \mathcal{D}'(\mathbb{R})$, since
$|T(\varphi)| \le p_1(\varphi)$. (A proof that T is not of order 0 can be obtained from testing with functions
$\varphi_n(x) := \varphi_0(nx)$, where $\varphi_0 \in \mathcal{D}(\mathbb{R})$ satisfies $\varphi_0(0) = 1$, $\varphi_0'(0) = 1$.)
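The hint in 3) can be spelled out as follows. This is a sketch under the additional assumption $\operatorname{supp}(\varphi_0) \subseteq [-1, 1]$ (not stated in the notes), so that all $\varphi_n$ lie in $\mathcal{D}_K(\mathbb{R})$ with K = [−1, 1]; it also uses $p_0(\varphi) = \sup_x |\varphi(x)|$ as in Example 1):

```latex
% Why T(\varphi) = \varphi'(0) is not of order 0 (requires amsmath).
\begin{align*}
\varphi_n(x) &:= \varphi_0(nx), \qquad
  \operatorname{supp}(\varphi_n) \subseteq [-\tfrac1n, \tfrac1n] \subseteq [-1,1], \\
T(\varphi_n) &= \varphi_n'(0) = n\,\varphi_0'(0) = n, \\
p_0(\varphi_n) &= \sup_x |\varphi_0(nx)| = p_0(\varphi_0).
\end{align*}
```

An estimate $|T(\varphi)| \le c\, p_0(\varphi)$ on $\mathcal{D}_K(\mathbb{R})$ would therefore give $n \le c\, p_0(\varphi_0)$ for every n ∈ N, which is impossible.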
4) The linear functional $T(\varphi) := \sum_{n=0}^{\infty} \varphi^{(n)}(n)$ is defined on all of $\mathcal{D}(\mathbb{R})$ and continuous, since with
N ∈ N sufficiently large such that $\operatorname{supp}(\varphi) \subseteq [-N, N]$, we obtain⁴
$|T(\varphi)| \le \sum_{n=0}^{N-1} |\varphi^{(n)}(n)| \le N \cdot p_{N-1}(\varphi)$. This distribution is not of finite order.
In observation (b) of the introductory paragraphs to the current section we have indicated a way to
extend (the linear operation of) differentiation from function spaces to distributions by mimicking
the integration by parts formula and “letting the derivatives fall on the test function”. This can
be described most systematically in terms of the general concept of an adjoint to a linear map.
7.6. Definition: Let X and Y be locally convex spaces and L : X → Y be a continuous linear
map. The adjoint of L is defined as the linear map $L' : Y' \to X'$, given by $L'(y') := y' \circ L$ for
every $y' \in Y'$.
7.7. Remark: Recall that $(X'_{\sigma(X',X)})' = X$ by Corollary 6.2. For every $x \in X = (X'_{\sigma(X',X)})'$,
By the above remark, $\partial^\alpha$ is weak* continuous, and the formula of integration by parts (in several
variables with iterated integrals and vanishing boundary terms due to compact support of the test
functions) shows that the new definition of $\partial^\alpha$ is consistent when applied to regular distributions
stemming from functions that are sufficiently often continuously differentiable, i.e., $\partial^\alpha T_f = T_{\partial^\alpha f}$
in these cases.
⁴ Note that ϕ vanishes of infinite order at the boundary of its support, hence $\varphi^{(N)}(N) = 0$.
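Examples: 1) The Heaviside function is $H := \chi_{[0,\infty[} \in L^\infty(\mathbb{R}) \subset L^1_{\mathrm{loc}}(\mathbb{R}) \hookrightarrow \mathcal{D}'(\mathbb{R})$. We have
\[ (T_H)'(\varphi) = -T_H(\varphi') = -\int_0^\infty \varphi'(x)\,dx = \varphi(0) = \delta_0(\varphi) \qquad (\varphi \in \mathcal{D}(\mathbb{R})). \]
Furthermore, $(T_H)''(\varphi) = T_H(\varphi'') = -\varphi'(0) = -\delta_0(\varphi') = \delta_0'(\varphi)$.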
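As a further small illustration in the same spirit (an added example, not from the notes), the sign function sgn = 2H − 1 can be differentiated by linearity from the result for H. The only extra ingredient is that the constant function 1 has distributional derivative 0, since the integral of ϕ′ over R vanishes for every test function:

```latex
% Distributional derivative of the sign function (requires amsmath).
\begin{align*}
(T_1)'(\varphi) &= -\int_{-\infty}^{\infty} \varphi'(x)\,dx = 0, \\
(T_{\operatorname{sgn}})'(\varphi) &= 2\,(T_H)'(\varphi) - (T_1)'(\varphi)
  = 2\,\varphi(0) = 2\,\delta_0(\varphi)
  \qquad (\varphi \in \mathcal{D}(\mathbb{R})).
\end{align*}
```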
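2) The (class of the) measurable function f, given by f(x) := log(|x|), belongs to $L^1_{\mathrm{loc}}(\mathbb{R})$ and we
obtain, for every $\varphi \in \mathcal{D}(\mathbb{R})$,
\begin{align*}
(T_f)'(\varphi) = -T_f(\varphi')
 &= -\int_{-\infty}^{\infty} \varphi'(x) \log(|x|)\,dx
  = -\int_0^{\infty} \bigl(\varphi'(x) + \varphi'(-x)\bigr) \log(x)\,dx \\
 &= -\int_0^{\infty} \bigl(\varphi(x) - \varphi(-x)\bigr)' \log(x)\,dx
  = -\lim_{\varepsilon>0,\,\varepsilon\to 0} \int_\varepsilon^{\infty} \bigl(\varphi(x) - \varphi(-x)\bigr)' \log(x)\,dx \\
 &= -\lim_{\varepsilon>0,\,\varepsilon\to 0} \Bigl[ \bigl(\varphi(x) - \varphi(-x)\bigr) \log x \Bigr]_\varepsilon^{\infty}
  + \lim_{\varepsilon>0,\,\varepsilon\to 0} \int_\varepsilon^{\infty} \frac{\varphi(x) - \varphi(-x)}{x}\,dx \\
 &= \lim_{\varepsilon>0,\,\varepsilon\to 0} \int_\varepsilon^{\infty} \frac{\varphi(x) - \varphi(-x)}{x}\,dx
  = \int_0^{\infty} \frac{\varphi(x) - \varphi(-x)}{x}\,dx,
\end{align*}
since $\varphi(x) - \varphi(-x) = 2x\varphi'(0) + O(x^2)$ near x = 0 and ϕ has compact support. The distribution
$\operatorname{vp}(1/x) := (T_f)'$ is called the Cauchy principal value of $\frac{1}{x}$, since its action on a test function is
equivalently given by
\[ \operatorname{vp}\bigl(\tfrac{1}{x}\bigr)(\varphi) = \lim_{\varepsilon>0,\,\varepsilon\to 0} \int_{|x|>\varepsilon} \frac{\varphi(x)}{x}\,dx. \]
Note that, with the classical pointwise derivative, we have $(\log(|x|))' = 1/x$ on R \ {0}.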
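The equivalence of the two expressions for vp(1/x) claimed above follows from a substitution. The following lines sketch this step for a test function $\varphi \in \mathcal{D}(\mathbb{R})$ and ε > 0:

```latex
% Symmetric truncation rewritten as an integral over ]\varepsilon, \infty[ (requires amsmath).
\begin{align*}
\int_{|x|>\varepsilon} \frac{\varphi(x)}{x}\,dx
 &= \int_\varepsilon^\infty \frac{\varphi(x)}{x}\,dx
  + \int_{-\infty}^{-\varepsilon} \frac{\varphi(x)}{x}\,dx \\
 &= \int_\varepsilon^\infty \frac{\varphi(x)}{x}\,dx
  - \int_\varepsilon^\infty \frac{\varphi(-y)}{y}\,dy
  \qquad (\text{substituting } x = -y) \\
 &= \int_\varepsilon^\infty \frac{\varphi(x)-\varphi(-x)}{x}\,dx .
\end{align*}
```

Letting ε → 0 gives the integral over ]0, ∞[ obtained at the end of the computation of $(T_f)'(\varphi)$.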
7.9. Fourier transform of temperate distributions: Finally, we will briefly discuss the
extension of the Fourier transform beyond $L^1(\mathbb{R}^n)$ and $L^2(\mathbb{R}^n)$. Our goal is to define it as the
adjoint of the classical Fourier transform on an appropriate test function space. As it turns out,
$\mathcal{D}(\mathbb{R}^n)$ is not suitable for that purpose, since the Fourier transform $\mathcal{F}\varphi$ of a compactly supported
function ϕ is easily seen to define an entire function on $\mathbb{C}^n$ and thus $\mathcal{F}\varphi$ cannot have compact
support as well, unless ϕ = 0. However, a good alternative is the Schwartz space $\mathcal{S}(\mathbb{R}^n)$ with
seminorms $p_{\alpha,m}(\varphi) := \sup_{x\in\mathbb{R}^n} (1 + |x|^m)\,|\partial^\alpha \varphi(x)|$ ($m \in \mathbb{N}_0$, $\alpha \in \mathbb{N}_0^n$). Its dual, the space of
temperate distributions $\mathcal{S}'(\mathbb{R}^n)$, has been introduced already in Example 5.9.1), where we have
also seen that $L^p(\mathbb{R}^n) \hookrightarrow \mathcal{S}'(\mathbb{R}^n)$ (1 ≤ p ≤ ∞) via the map $f \mapsto T_f$, $T_f(\varphi) = \int_{\mathbb{R}^n} f\varphi$. It is not
difficult to show (see [Wer18, Satz VIII.5.11]) that the identical embedding $\mathcal{D}(\mathbb{R}^n) \hookrightarrow \mathcal{S}(\mathbb{R}^n)$ is
continuous with dense image and therefore has an injective adjoint $\mathcal{S}'(\mathbb{R}^n) \hookrightarrow \mathcal{D}'(\mathbb{R}^n)$. Thus,
every temperate distribution is a distribution.
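Recall (or take for granted or look up in [Wer18, Section V.2] or [Con16, Theorem 6.1] or [Con10,
Chapter X, §6]) that $\mathcal{F}$, defined by
\[ \mathcal{F}\varphi(\xi) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} e^{-ix\xi} \varphi(x)\,dx \qquad (\varphi \in \mathcal{S}(\mathbb{R}^n),\ \xi \in \mathbb{R}^n), \]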
Proof: (a): Leibniz' rule applied to $\partial^\beta(x^\alpha \varphi(x))$ yields an upper bound for $p_{\beta,m}(x^\alpha \varphi)$ in the form
of a finite sum with terms $q_{\beta,Q_{m+|\alpha|-l}}$ (0 ≤ l ≤ max(|β|, m + |α|)), where $Q_{m+|\alpha|-l}$ is a polynomial
function of order at most m + |α| − l. The continuity of $\varphi \mapsto \partial^\alpha \varphi$ is even more obvious, since
$p_{\beta,m}(\partial^\alpha \varphi) = p_{\beta+\alpha,m}(\varphi)$.
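(b):
\[ (2\pi)^{n/2}\, |(\mathcal{F}\varphi)(\xi)| = \Bigl| \int_{\mathbb{R}^n} e^{-ix\xi} \varphi(x)\,dx \Bigr| \le \int_{\mathbb{R}^n} |\varphi(x)|\,dx \le \int_{\mathbb{R}^n} \frac{dx}{1 + |x|^{n+1}} \cdot p_{0,n+1}(\varphi). \]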
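The estimate in (b) uses $|\varphi(x)| \le p_{0,n+1}(\varphi)/(1+|x|^{n+1})$, which is immediate from the definition of the seminorm, and the fact that the remaining integral over $\mathbb{R}^n$ is finite. A quick check of the latter in polar coordinates, where $\omega_{n-1}$ denotes the surface area of the unit sphere in $\mathbb{R}^n$ (a symbol introduced here only for this sketch):

```latex
% Finiteness of the weight integral used in part (b) (requires amsmath).
\begin{align*}
\int_{\mathbb{R}^n} \frac{dx}{1+|x|^{n+1}}
 &= \omega_{n-1} \int_0^\infty \frac{r^{n-1}}{1+r^{n+1}}\,dr \\
 &\le \omega_{n-1} \Bigl( \int_0^1 r^{n-1}\,dr + \int_1^\infty r^{-2}\,dr \Bigr)
  = \omega_{n-1} \Bigl( \tfrac{1}{n} + 1 \Bigr) < \infty .
\end{align*}
```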
By part (a) of this lemma and the embedding $\mathcal{S}'(\mathbb{R}^n) \hookrightarrow \mathcal{D}'(\mathbb{R}^n)$, we obtain differentiation as a
linear map $\mathcal{S}'(\mathbb{R}^n) \to \mathcal{S}'(\mathbb{R}^n)$. Moreover, multiplication by polynomial functions is obtained as a
linear map $\mathcal{S}'(\mathbb{R}^n) \to \mathcal{S}'(\mathbb{R}^n)$ as the transpose of the map in (a). For the Fourier transform, we
obtain the analogous result from the following
Theorem: Both $\mathcal{F}$ and $\mathcal{F}^{-1}$ are continuous linear maps $\mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$.
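Proof: For any $\alpha \in \mathbb{N}_0^n$ and any polynomial function $Q(\xi) = \sum_{|\gamma|\le N} a_\gamma \xi^\gamma$, we have to give an
upper bound of $q_{\alpha,Q}(\mathcal{F}\varphi) = \sup_{\xi\in\mathbb{R}^n} |Q(\xi)\,\partial^\alpha(\mathcal{F}\varphi)(\xi)|$ in terms of finitely many seminorms of ϕ.
Let $Q(-i\partial) := \sum_{|\gamma|\le N} a_\gamma (-i)^{|\gamma|} \partial^\gamma$, then we have, by (b) in the above lemma and the exchange
formulae,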
By part (a) in the above lemma, we have an upper bound for $p_{0,n+1}(Q(-i\partial)(x^\alpha \varphi))$ in terms of
$\max_{j=1}^{N} p_{\beta_j,m_j}(\varphi)$, which completes the proof of continuity of $\mathcal{F}$. Since $(\mathcal{F}^{-1}\varphi)(x) = (\mathcal{F}\varphi)(-x)$,
Formula (∗) shows that on Schwartz functions, considered as regular temperate distributions, the
Fourier transform is its own adjoint. Thus, we come up with the following
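Definition: If $T \in \mathcal{S}'(\mathbb{R}^n)$, then its Fourier transform $\mathcal{F}T \in \mathcal{S}'(\mathbb{R}^n)$ is defined by
$(\mathcal{F}T)(\varphi) := T(\mathcal{F}\varphi)$ for every $\varphi \in \mathcal{S}(\mathbb{R}^n)$.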
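3) $(\mathcal{F}T_1)(\varphi) = (2\pi)^{n/2} (\mathcal{F}(\mathcal{F}\delta_0))(\varphi) = (2\pi)^{n/2}\, \delta_0(\mathcal{F}\mathcal{F}\varphi)$ and applying $(\mathcal{F}\varphi)(x) = (\mathcal{F}^{-1}\varphi)(-x)$ we
obtain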
⁵ It is easy to see that the adjoint of a bijective linear continuous map is bijective.
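As a consistency check for the first equality in item 3) above, one can compute the Fourier transform of $\delta_0$ directly. The sketch assumes the definition $(\mathcal{F}T)(\varphi) := T(\mathcal{F}\varphi)$ for $T \in \mathcal{S}'(\mathbb{R}^n)$, which is the form used in the computation in 3), and writes $T_1$ for the regular distribution given by the constant function 1:

```latex
% Fourier transform of the Dirac distribution (requires amsmath).
\begin{align*}
(\mathcal{F}\delta_0)(\varphi)
 = \delta_0(\mathcal{F}\varphi)
 = (\mathcal{F}\varphi)(0)
 = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \varphi(x)\,dx
 = (2\pi)^{-n/2}\, T_1(\varphi)
 \qquad (\varphi \in \mathcal{S}(\mathbb{R}^n)).
\end{align*}
```

Hence $T_1 = (2\pi)^{n/2}\, \mathcal{F}\delta_0$, in agreement with the identity $\mathcal{F}T_1 = (2\pi)^{n/2}\, \mathcal{F}(\mathcal{F}\delta_0)$ used there.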
Appendix
We restate and prove Lemma 0.13.
Lemma: Let Ω ⊂ C be compact and $(B_b(\Omega), \|.\|_\infty)$ be the Banach space of bounded Borel
measurable functions Ω → C. Suppose $U \subseteq B_b(\Omega)$ has the following properties:
(a) $C(\Omega) \subseteq U$,
(b) $f_n \in U$ (n ∈ N), $\sup_{n\in\mathbb{N}} \|f_n\|_\infty < \infty$, and $f(t) := \lim_{n\to\infty} f_n(t)$ exists for every t ∈ Ω
$\Longrightarrow$ f ∈ U.
Then $U = B_b(\Omega)$.
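Proof: Let $\mathcal{S} := \{S \subseteq B_b(\Omega) \mid S \supseteq C(\Omega) \text{ and } S \text{ satisfies (b)}\}$ and put $V := \bigcap_{S\in\mathcal{S}} S$. Clearly
$B_b(\Omega) \in \mathcal{S}$, hence $\mathcal{S} \neq \emptyset$ and $C(\Omega) \subseteq V \subseteq U$. We will show that $V = B_b(\Omega)$.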
Claim 1: V is a vector subspace.
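For any $f_0 \in V$ put $V_{f_0} := \{g \in B_b(\Omega) \mid f_0 + g \in V\}$. Note that $V_{f_0}$ possesses property (b).
If $f_0 \in C(\Omega)$, then $C(\Omega) \subseteq V_{f_0}$ and $V_{f_0}$ satisfies (b); hence $V_{f_0} \in \mathcal{S}$ and $V \subseteq V_{f_0}$, i.e.,
\[ f_0 \in C(\Omega),\ g \in V \implies f_0 + g \in V. \]
Thus, if $g_0 \in V$, then $f + g_0 \in V$ for every $f \in C(\Omega)$; therefore $C(\Omega) \subseteq V_{g_0}$ and, furthermore,
$V_{g_0} \in \mathcal{S}$; this in turn implies $V \subseteq V_{g_0}$, i.e.,
\[ g_0 \in V,\ g \in V \implies g \in V_{g_0} \implies g_0 + g \in V. \]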
g∈V =⇒ αg ∈ V.
$E = \bigcup_{n\in\mathbb{N}} A_n$ and let $f_n$ be an Urysohn function for the closed pair $A_n$ and Ω \ E.) Property (b)
applied to $(f_n)$ yields $\chi_E \in V$, thus E ∈ ∆. Therefore ∆ contains the topology of Ω, which is a
generating system for the Borel sigma algebra B(Ω) stable under finite intersections.
By the theory of Dynkin systems from measure theory ([Els11, Kapitel I, §6.2] or [Bau01, Chapter
I, §2]), we obtain ∆ = B(Ω), if we prove the following two properties:
(1) E ∈ ∆ ⇒ Ω \ E ∈ ∆,
(2) if $(E_n)$ is a sequence of pairwise disjoint sets in ∆, then $E := \bigcup_{n\in\mathbb{N}} E_n \in \Delta$.
Property (1) holds, since 1 and $\chi_E$ belong to V and therefore also $\chi_{\Omega\setminus E} = 1 - \chi_E$. To show (2) we
simply note that $\chi_E = \sum_{n\in\mathbb{N}} \chi_{E_n}$ in the sense of pointwise convergence, hence (b) implies $\chi_E \in V$.
It follows from Claims 1 and 2 that V is k.k∞ -dense in Bb (Ω). By property (b), every S ∈ S is
a closed subset of Bb (Ω), hence V is also closed and therefore V = Bb (Ω).
The unitary equivalence with a multiplication operator can be constructed along the following
lines: There is an orthonormal (possibly finite) sequence of eigenvectors of T, say
$W = \{w_1, w_2, \ldots\} = \{w_j \mid j \in N\}$, where $N = \mathbb{N}$ or $N = \{1, \ldots, m\}$ for some $m \in \mathbb{N}$, corresponding to
the sequence $d : N \to \mathbb{C}$ of all non-zero eigenvalues d(1), d(2), . . . (with multiplicities) of T, which
is either finite or converges to 0 ([Wer18, Theorem VI.3.2]), such that $H = \ker(T) \oplus \overline{\operatorname{span}}(W)$
(orthogonal direct sum). If $\ker(T) \neq \{0\}$, extend W by a complete orthonormal system $K = \{y_r \mid r \in R\}$
of ker(T), where $R \cap N = \emptyset$, to obtain a complete orthonormal system $B = W \cup K$ of H and write
$B = \{b_s \mid s \in S\}$, where $S = N \cup R$ and $b_s := w_j$, if s = j ∈ N, and $b_s := y_r$, if s = r ∈ R.
Let $e_s \in l^2(S)$ be the complete orthonormal system given by $e_s(t) = 1$, if t = s, and $e_s(t) = 0$
otherwise, and define U by unique (uniformly continuous) extension of $U e_s := b_s$ (s ∈ S) to
$l^2(S) = \overline{\operatorname{span}}(\{e_s \mid s \in S\})$. Then $T U e_s = 0$, if s ∈ R, and $T U e_s = d(s)\, b_s$, if s ∈ N, and therefore
we have for any $x \in l^2(S)$, $(U^{-1} T U x)(s) = 0 = 0 \cdot x(s)$, if s ∈ R, and $(U^{-1} T U x)(s) = d(s)\, x(s)$,
if s ∈ N; in other words, $(U^{-1} T U x)(s) = h(s)\, x(s)$, where $h \in l^\infty(S)$ is given by $h|_R = 0$ and
$h|_N = d$.
• Φ is C-linear,
• Φ(f¯) = Φ(f )∗ ,
(c) Φ is continuous with respect to the norm k.k∞ on C(σ(T )), in fact, isometric,
i.e., kΦ(f )k = kf k∞ .
We write f (T ) instead of Φ(f ) and call f 7→ f (T ) the continuous functional calculus of T .
Proof: Uniqueness: Because of (c) and thanks to the Stone-Weierstraß theorem applied to the
compact subset σ(T ) ⊆ R (cf., e.g., [Wer18, Satz VIII.4.7]), Φ is determined by the values on
polynomial functions; then by linearity, the knowledge of $\Phi(\mathrm{id}^n)$ suffices; finally, by multiplicativity
according to (b), this boils down to fixing Φ(id) and Φ(1), which are determined by (a).
Existence: If $f \in C(\sigma(T))$ is the restriction of a polynomial function $p(t) = \sum_{k=0}^{n} a_k t^k$, then we
Proof of (i): This is obvious for constant polynomials, hence we may suppose deg(p) ≥ 1.
Let λ ∈ σ(T ). We may write p(t) − p(λ) = (t − λ)h(t) with some polynomial h ≠ 0 and
obtain p(T ) − p(λ) = (T − λ)h(T ). Due to the commutativity of λ − T with h(T ) and p(T ) − p(λ),
continuous invertibility of p(T ) − p(λ) would then imply the same for T − λ, hence p(λ) ∈ σ(p(T )).
Thus, p(σ(T )) ⊆ σ(p(T )).
Let µ ∈ σ(p(T )), then deg(p − µ) ≥ 1 and we have a polynomial factorization p(t) − µ = c(t −
λ1 ) · · · (t−λm ) with λ1 , . . . , λm ∈ C and c ∈ C\{0}, which yields p(T )−µ = c(T −λ1 ) · · · (T −λm ).
If λj belonged to ρ(T ) for every j = 1, . . . , m, then p(T ) − µ would be continuously invertible, a
contradiction; hence λj ∈ σ(T ) for some j and therefore µ = p(λj ) ∈ p(σ(T )). Thus, σ(p(T )) ⊆
p(σ(T )).
Proof of (ii): Elementary calculations show that $p \mapsto p(T)$ is an involutive algebra homomorphism
from polynomials into L(H). Note that p(T) is normal, since $p(T)^* p(T) = \bar{p}(T)\, p(T) = (\bar{p} p)(T) =
(p \bar{p})(T) = p(T)\, p(T)^*$, and thus has norm equal to its spectral radius ([Wer18, Satz VI.1.7]). We
obtain
\[ \|p(T)\| = \sup\{|\mu| \mid \mu \in \sigma(p(T))\} = \sup\{|p(\lambda)| \mid \lambda \in \sigma(T)\} = \sup_{\lambda\in\sigma(T)} |p(\lambda)|. \]
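A finite-dimensional sanity check of Claims (i) and (ii), offered as an illustration with assumed data that do not appear in the notes: $H = \mathbb{C}^2$, the self-adjoint operator T = diag(1, 2) with σ(T) = {1, 2}, and the polynomial $p(t) = t^2 - t$.

```latex
% Spectral mapping and norm identity for a diagonal 2x2 example (requires amsmath).
\begin{align*}
p(T) &= T^2 - T
      = \begin{pmatrix} 1 & 0 \\ 0 & 4 \end{pmatrix}
      - \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}
      = \begin{pmatrix} 0 & 0 \\ 0 & 2 \end{pmatrix}, \\
\sigma(p(T)) &= \{0, 2\} = \{p(1), p(2)\} = p(\sigma(T)), \\
\|p(T)\| &= 2 = \sup_{\lambda\in\sigma(T)} |p(\lambda)| .
\end{align*}
```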
We have now shown that the map Φ0 is well-defined on the space V0 of polynomial functions on
σ(T ) and gives an involutive algebra homomorphism into L(H) which is isometric due to Claim
(ii), since kΦ0 (f )k = kf k∞ for any f ∈ V0 ; in particular, Φ0 is continuous. Let Φ denote the
unique continuous linear extension of Φ0 to C(σ(T )) (making use of the density of V0 due to the
Stone-Weierstraß theorem). That Φ is isometric follows directly from the density of the polynomial
functions and Claim (ii) above. It remains to show that Φ is involutive and multiplicative, which
follows from routine arguments using limits of polynomials and the fact that Φ0 is already known
to be involutive and multiplicative.
⁶ Note that σ(T ) can be a finite discrete subset of R.
Bibliography
[Ara99] H. Araki: Mathematical Theory of Quantum Fields.
Oxford University Press 1999.
[Bau90] H. Bauer: Maß- und Integrationstheorie.
De Gruyter 1990.
[Bau01] H. Bauer: Measure and Integration Theory.
De Gruyter 2001.
[Bog07] V. I. Bogachev: Measure Theory. Volume I
Springer-Verlag 2007.
[Coh80] D. L. Cohn: Measure Theory.
Birkhäuser 1980.
[Con16] A. Constantin: Fourier Analysis. Part I – Theory.
Cambridge University Press 2016.
[Con10] J. B. Conway: A Course in Functional Analysis.
Springer Science+Business Media, 2nd edition 2010.
[Els11] J. Elstrodt: Maß- und Integrationstheorie.
Springer-Verlag, 7. Auflage 2011.
[Fol99] G. B. Folland: Real Analysis.
Wiley, 2nd edition 1999.
[Hal13] B. C. Hall: Quantum Theory for Mathematicians.
Springer-Verlag 2013.
[Hoe21] G. Hörmann: Funktionalanalysis.
Vorlesung, Universität Wien, Sommersemester 2021.
http://www.mat.univie.ac.at/~gue/material.html
[Hoe20] G. Hörmann: Grundbegriffe der Topologie.
Vorlesung, Universität Wien, Wintersemester 2020/21.
http://www.mat.univie.ac.at/~gue/material.html
[Kab14] W. Kaballo: Aufbaukurs Funktionalanalysis und Operatortheorie.
Springer-Verlag 2014.
[KR] R. V. Kadison and J. R. Ringrose: Fundamentals of the theory of operator algebras.
Volumes I and II.
Academic Press, 1983 and 1986.
[MV92] R. Meise and D. Vogt: Einführung in die Funktionalanalysis.
Vieweg Verlag 1992.
[RS75] M. Reed and B. Simon: Methods of modern mathematical physics
Volume II: Fourier analysis, self-adjointness.
Academic Press 1975.
[Rud86] W. Rudin: Real and Complex Analysis.
McGraw-Hill, 3rd edition 1986.
[Sch71] H. Schaefer: Topological Vector Spaces.
Springer-Verlag 1971.
[Sch00] H. Schröder: Funktionalanalysis.
Harri Deutsch, 2. Auflage 2000.
[Tes14] G. Teschl: Topics in Real and Functional Analysis.
Lecture Notes, University of Vienna, winter term 2013/14.
http://www.mat.univie.ac.at/~gerald/ftp/book-fa/index.html
[Tes14b] G. Teschl: Mathematical Methods in Quantum Mechanics.
With Applications to Schrödinger Operators
American Mathematical Society, 2nd edition 2014.
[Thi03] W. Thirring: Quantum Mathematical Physics:
Atoms, Molecules and Large Systems.
Springer-Verlag, 2nd edition 2003.
[Tre67] F. Trèves: Topological vector spaces, distributions and kernels.
Academic Press 1967.
[Wei00] J. Weidmann: Lineare Operatoren in Hilberträumen: Teil I: Grundlagen.
Teubner-Verlag 2000.
[Wer18] D. Werner: Funktionalanalysis.
Springer-Verlag, 8. Auflage 2018.
[Wil70] S. Willard: General Topology.
Addison-Wesley 1970.