
Introduction to Functional Analysis

Daniel Daners

School of Mathematics and Statistics


University of Sydney, NSW 2006
Australia

Semester 1, 2017 Copyright ©2008–2017 The University of Sydney


Contents

I Preliminary Material 1
1 The Axiom of Choice and Zorn’s Lemma . . . . . . . . . . . . . . . 1
2 Metric Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
3 Limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
4 Compactness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
5 Continuous Functions . . . . . . . . . . . . . . . . . . . . . . . . . . 12

II Banach Spaces 17
6 Normed Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
7 Examples of Banach Spaces . . . . . . . . . . . . . . . . . . . . . . 20
7.1 Elementary Inequalities . . . . . . . . . . . . . . . . . . . . 20
7.2 Spaces of Sequences . . . . . . . . . . . . . . . . . . . . . . 23
7.3 Lebesgue Spaces . . . . . . . . . . . . . . . . . . . . . . . . 25
7.4 Spaces of Bounded and Continuous Functions . . . . . . . . . 26
8 Basic Properties of Bounded Linear Operators . . . . . . . . . . . . . 27
9 Equivalent Norms . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
10 Finite Dimensional Normed Spaces . . . . . . . . . . . . . . . . . . 33
11 Infinite Dimensional Normed Spaces . . . . . . . . . . . . . . . . . . 34
12 Quotient Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

III Banach algebras and the Stone-Weierstrass Theorem 41


13 Banach algebras . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
14 The Stone-Weierstrass Theorem . . . . . . . . . . . . . . . . . . . . 42

IV Hilbert Spaces 47
15 Inner Product Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . 47
16 Projections and Orthogonal Complements . . . . . . . . . . . . . . . 52
17 Orthogonal Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
18 Abstract Fourier Series . . . . . . . . . . . . . . . . . . . . . . . . . 62

V Linear Operators 69
19 Baire’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
20 The Open Mapping Theorem . . . . . . . . . . . . . . . . . . . . . . 70
21 The Closed Graph Theorem . . . . . . . . . . . . . . . . . . . . . . . 73

22 The Uniform Boundedness Principle . . . . . . . . . . . . . . . . . . 74
23 Closed Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
24 Closable Operators and Examples . . . . . . . . . . . . . . . . . . . 81

VI Duality 85
25 Dual Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
26 The Hahn-Banach Theorem . . . . . . . . . . . . . . . . . . . . . . . 89
27 Reflexive Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
28 Weak convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
29 Dual Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
30 Duality in Hilbert Spaces . . . . . . . . . . . . . . . . . . . . . . . . 97
31 The Lax-Milgram Theorem . . . . . . . . . . . . . . . . . . . . . . . 99

VII Spectral Theory 101


32 Resolvent and Spectrum . . . . . . . . . . . . . . . . . . . . . . . . 101
33 Projections, Complements and Reductions . . . . . . . . . . . . . . . 106
34 The Ascent and Descent of an Operator . . . . . . . . . . . . . . . . 110
35 The Spectrum of Compact Operators . . . . . . . . . . . . . . . . . . 112

Bibliography 117

Acknowledgement
Thanks to Fan Wu from the 2008 Honours Year for providing an extensive list of mis-
prints.

Chapter I
Preliminary Material

In functional analysis many different fields of mathematics come together. The objects
we look at are vector spaces and linear operators. Hence you need to know some basic linear
algebra in general vector spaces. I assume your knowledge of that is sufficient. Second
we will need some basic set theory. In particular, many theorems depend on the axiom
of choice. We briefly discuss that most controversial axiom of set theory and some
equivalent statements. In addition to the algebraic structure on a vector space, we will
look at topologies on them. Of course, these topologies should be compatible with the
algebraic structure. This means that addition and multiplication by scalars should be
continuous with respect to the topology. We will only look at one class of such spaces,
namely normed spaces which are naturally metric spaces. Hence it is essential you
know the basics of metric spaces, and we provide a self contained introduction of what
we need in the course.

1 The Axiom of Choice and Zorn’s Lemma


Suppose that 𝐴 is a set, and that for each 𝛼 ∈ 𝐴 there is a set 𝑋𝛼 . We call (𝑋𝛼 )𝛼∈𝐴 a
family of sets indexed by 𝐴. The set 𝐴 may be finite, countable or uncountable. We
then consider the Cartesian product of the sets 𝑋𝛼 :

∏𝛼∈𝐴 𝑋𝛼

consisting of all “collections” (𝑥𝛼 )𝛼∈𝐴 , where 𝑥𝛼 ∈ 𝑋𝛼 . More formally, ∏𝛼∈𝐴 𝑋𝛼 is the
set of functions

𝑥 ∶ 𝐴 → ⋃𝛼∈𝐴 𝑋𝛼

such that 𝑥(𝛼) ∈ 𝑋𝛼 for all 𝛼 ∈ 𝐴. We write 𝑥𝛼 for 𝑥(𝛼) and (𝑥𝛼 )𝛼∈𝐴 or simply (𝑥𝛼 )
for a given such function 𝑥. Suppose now that 𝐴 ≠ ∅ and 𝑋𝛼 ≠ ∅ for all 𝛼 ∈ 𝐴. Then
there is a fundamental question:


Is ∏𝛼∈𝐴 𝑋𝛼 nonempty in general?

Here is some brief history of the problem, showing how basic and difficult it is:

• Zermelo (1904) (see [14]) observed that it is not obvious from the existing axioms
of set theory that there is a procedure to select a single 𝑥𝛼 from each 𝑋𝛼 in general.
As a consequence he introduced what we call the axiom of choice, asserting that

∏𝛼∈𝐴 𝑋𝛼 ≠ ∅ whenever 𝐴 ≠ ∅ and 𝑋𝛼 ≠ ∅ for all 𝛼 ∈ 𝐴.

It remained open whether his axiom of choice could be derived from the other
axioms of set theory. There was an even more fundamental question on whether
the axiom is consistent with the other axioms!

• Gödel (1938) (see [8]) proved that the axiom of choice is consistent with the other
axioms of set theory. The open question remaining was whether it is independent
of the other axioms.

• P.J. Cohen (1963/1964) (see [4, 5]) finally showed that the axiom of choice is in
fact independent of the other axioms of set theory, that is, it cannot be derived
from them.

The majority of mathematicians accept the axiom of choice, but there is a minority
which does not. Many very basic and important theorems in functional analysis cannot
be proved without the axiom of choice.

We accept the axiom of choice.


There are some non-trivial equivalent formulations of the axiom of choice which are
useful for our purposes. Given two sets 𝑋 and 𝑌 recall that a relation from 𝑋 to 𝑌 is
simply a subset of the Cartesian product 𝑋 ×𝑌 . We now explore some special relations,
namely order relations.

1.1 Definition (partial ordering) A relation ≺ on a set 𝑋 is called a partial ordering


of 𝑋 if

• 𝑥 ≺ 𝑥 for all 𝑥 ∈ 𝑋 (reflexivity);

• 𝑥 ≺ 𝑦 and 𝑦 ≺ 𝑧 imply 𝑥 ≺ 𝑧 (transitivity);

• 𝑥 ≺ 𝑦 and 𝑦 ≺ 𝑥 imply 𝑥 = 𝑦 (anti-symmetry).

We also write 𝑥 ≻ 𝑦 for 𝑦 ≺ 𝑥. We call (𝑋, ≺) a partially ordered set.

1.2 Examples (a) The usual ordering ≤ on ℝ is a partial ordering on ℝ.


(b) Suppose  is a collection of subsets of a set 𝑋. Then inclusion is a partial
ordering. More precisely, if 𝑆, 𝑇 ∈  then 𝑆 ≺ 𝑇 if and only if 𝑆 ⊆ 𝑇 . We say  is
partially ordered by inclusion.
(c) Every subset of a partially ordered set is a partially ordered set by the induced
partial order.
There are more expressions appearing in connection with partially ordered sets.

1.3 Definition Suppose that (𝑋, ≺) is a partially ordered set. Then

(a) 𝑚 ∈ 𝑋 is called a maximal element in 𝑋 if for all 𝑥 ∈ 𝑋 with 𝑥 ≻ 𝑚 we have
𝑥 ≺ 𝑚;

(b) 𝑚 ∈ 𝑋 is called an upper bound for 𝑆 ⊆ 𝑋 if 𝑥 ≺ 𝑚 for all 𝑥 ∈ 𝑆;

(c) A subset 𝐶 ⊆ 𝑋 is called a chain in 𝑋 if 𝑥 ≺ 𝑦 or 𝑦 ≺ 𝑥 for all 𝑥, 𝑦 ∈ 𝐶;

(d) If a partially ordered set (𝑋, ≺) is a chain we call it a totally ordered set.

(e) If (𝑋, ≺) is partially ordered and 𝑥0 ∈ 𝑋 is such that 𝑥0 ≺ 𝑥 for all 𝑥 ∈ 𝑋, then
we call 𝑥0 a first element.

There is a special class of partially ordered sets playing a particularly important role in
relation to the axiom of choice as we will see later.

1.4 Definition (well ordered set) A partially ordered set (𝑋, ≺) is called a well or-
dered set if every subset has a first element.

1.5 Examples (a) ℕ is a well ordered set, but ℤ or ℝ are not well ordered with the
usual order.
(b) ℤ and ℝ are totally ordered with the usual order.

1.6 Remark Well ordered sets are always totally ordered. To see this assume (𝑋, ≺)
is well ordered. Given 𝑥, 𝑦 ∈ 𝑋 we consider the subset {𝑥, 𝑦} of 𝑋. By definition of
a well ordered set we have either 𝑥 ≺ 𝑦 or 𝑦 ≺ 𝑥, which shows that (𝑋, ≺) is totally
ordered. The converse is not true as the example of ℤ given above shows.
There is another, highly non-obvious but very useful statement appearing in connection
with partially ordered sets:

1.7 Zorn’s Lemma Suppose that (𝑋, ≺) is a partially ordered set such that each chain
in 𝑋 has an upper bound. Then 𝑋 has a maximal element.
There is a non-trivial connection between all the apparently different topics we dis-
cussed so far. We state it without proof (see for instance [7]).

1.8 Theorem The following assertions are equivalent

(i) The axiom of choice;

(ii) Zorn’s Lemma;

(iii) Every set can be well ordered.

The axiom of choice may seem “obvious” at the first instance. However, the other two
equivalent statements are certainly not. For instance take 𝑋 = ℝ, which we know is not
well ordered with the usual order. If we accept the axiom of choice then it follows from
the above theorem that there exists a partial ordering making ℝ into a well ordered
set. This is a typical “existence proof” based on the axiom of choice. It does not
give us any hint on how to find a partial ordering making ℝ into a well ordered set.
This reflects Zermelo’s observation that it is not obvious how to choose precisely one

element from each set when given an arbitrary collection of sets. Because of the non-
constructive nature of the axiom of choice and its equivalent counterparts, there are
some mathematicians rejecting the axiom. These mathematicians have the point of
view that everything should be “constructible,” at least in principle, by some means
(see for instance [2]).

2 Metric Spaces
Metric spaces are sets in which we can measure distances between points. We expect
such a “distance function,” called a metric, to have some obvious properties, which we
postulate in the following definition.

2.1 Definition (Metric Space) Suppose 𝑋 is a set. A map 𝑑 ∶ 𝑋 × 𝑋 → ℝ is called a


metric on 𝑋 if the following properties hold:

(i) 𝑑(𝑥, 𝑦) ≥ 0 for all 𝑥, 𝑦 ∈ 𝑋;

(ii) 𝑑(𝑥, 𝑦) = 0 if and only if 𝑥 = 𝑦;

(iii) 𝑑(𝑥, 𝑦) = 𝑑(𝑦, 𝑥) for all 𝑥, 𝑦 ∈ 𝑋.

(iv) 𝑑(𝑥, 𝑦) ≤ 𝑑(𝑥, 𝑧) + 𝑑(𝑧, 𝑦) for all 𝑥, 𝑦, 𝑧 ∈ 𝑋 (triangle inequality).

We call (𝑋, 𝑑) a metric space. If it is clear what metric is being used we simply say 𝑋
is a metric space.

2.2 Example The simplest example of a metric space is ℝ with 𝑑(𝑥, 𝑦) ∶= |𝑥 − 𝑦|.
The standard metric used in ℝ𝑁 is the Euclidean metric given by

𝑑(𝒙, 𝒚) = |𝒙 − 𝒚|2 ∶= ( ∑_{𝑖=1}^{𝑁} |𝑥𝑖 − 𝑦𝑖 |^2 )^{1∕2}

for all 𝒙, 𝒚 ∈ ℝ𝑁 .

2.3 Remark If (𝑋, 𝑑) is a metric space, then every subset 𝑌 ⊆ 𝑋 is a metric space
with the metric restricted to 𝑌 . We say the metric on 𝑌 is induced by the metric on 𝑋.

2.4 Definition (Open and Closed Ball) Let (𝑋, 𝑑) be a metric space. For 𝑟 > 0 we
call
𝐵(𝑥, 𝑟) ∶= {𝑦 ∈ 𝑋 ∶ 𝑑(𝑥, 𝑦) < 𝑟}
the open ball about 𝑥 with radius 𝑟. Likewise we call

𝐵̄(𝑥, 𝑟) ∶= {𝑦 ∈ 𝑋 ∶ 𝑑(𝑥, 𝑦) ≤ 𝑟}

the closed ball about 𝑥 with radius 𝑟.


Using open balls we now define a “topology” on a metric space.

2.5 Definition (Open and Closed Set) Let (𝑋, 𝑑) be a metric space. A subset 𝑈 ⊆ 𝑋
is called open if for every 𝑥 ∈ 𝑈 there exists 𝑟 > 0 such that 𝐵(𝑥, 𝑟) ⊆ 𝑈 . A set 𝑈 is
called closed if its complement 𝑋 ⧵ 𝑈 is open.

2.6 Remark For every 𝑥 ∈ 𝑋 and 𝑟 > 0 the open ball 𝐵(𝑥, 𝑟) in a metric space is
open. To prove this fix 𝑦 ∈ 𝐵(𝑥, 𝑟). We have to show that there exists 𝜀 > 0 such that
𝐵(𝑦, 𝜀) ⊆ 𝐵(𝑥, 𝑟). To do so note that by definition 𝑑(𝑥, 𝑦) < 𝑟. Hence we can choose
𝜀 ∈ ℝ such that 0 < 𝜀 < 𝑟 − 𝑑(𝑥, 𝑦). Thus, by property (iv) of a metric, for 𝑧 ∈ 𝐵(𝑦, 𝜀)
we have 𝑑(𝑥, 𝑧) ≤ 𝑑(𝑥, 𝑦) + 𝑑(𝑦, 𝑧) < 𝑑(𝑥, 𝑦) + 𝑟 − 𝑑(𝑥, 𝑦) = 𝑟. Therefore 𝑧 ∈ 𝐵(𝑥, 𝑟),
showing that 𝐵(𝑦, 𝜀) ⊆ 𝐵(𝑥, 𝑟).
Next we collect some fundamental properties of open sets.

2.7 Theorem Open sets in a metric space (𝑋, 𝑑) have the following properties.
(i) 𝑋, ∅ are open sets;

(ii) arbitrary unions of open sets are open;

(iii) finite intersections of open sets are open.


Proof. Property (i) is obvious. To prove (ii) let 𝑈𝛼 , 𝛼 ∈ 𝐴, be an arbitrary family of
open sets in 𝑋. If 𝑥 ∈ ⋃_{𝛼∈𝐴} 𝑈𝛼 then 𝑥 ∈ 𝑈𝛽 for some 𝛽 ∈ 𝐴. As 𝑈𝛽 is open there exists
𝑟 > 0 such that 𝐵(𝑥, 𝑟) ⊆ 𝑈𝛽 . Hence also 𝐵(𝑥, 𝑟) ⊆ ⋃_{𝛼∈𝐴} 𝑈𝛼 , showing that ⋃_{𝛼∈𝐴} 𝑈𝛼
is open. To prove (iii) let 𝑈𝑖 , 𝑖 = 1, … , 𝑛, be open sets. If 𝑥 ∈ ⋂_{𝑖=1}^{𝑛} 𝑈𝑖 then 𝑥 ∈ 𝑈𝑖 for
all 𝑖 = 1, … , 𝑛. As the sets 𝑈𝑖 are open there exist 𝑟𝑖 > 0 such that 𝐵(𝑥, 𝑟𝑖 ) ⊆ 𝑈𝑖 for all
𝑖 = 1, … , 𝑛. If we set 𝑟 ∶= min𝑖=1,…,𝑛 𝑟𝑖 then obviously 𝑟 > 0 and 𝐵(𝑥, 𝑟) ⊆ ⋂_{𝑖=1}^{𝑛} 𝑈𝑖 ,
proving (iii).

2.8 Remark There is a more general concept than that of a metric space, namely that
of a “topological space.” A collection  of subsets of a set 𝑋 is called a topology if the
following conditions are satisfied
(i) 𝑋, ∅ ∈  ;

(ii) arbitrary unions of sets in  are in  ;

(iii) finite intersections of sets in  are in  .


The elements of  are called open sets, and (𝑋,  ) a topological space. Hence the
open sets in a metric space form a topology on 𝑋.

2.9 Definition (Neighbourhood) Suppose that (𝑋, 𝑑) is a metric space (or more gen-
erally a topological space). We call a set 𝑈 a neighbourhood of 𝑥 ∈ 𝑋 if there exists
an open set 𝑉 ⊆ 𝑈 with 𝑥 ∈ 𝑉 .
Now we define some sets associated with a given subset of a metric space.

2.10 Definition (Interior, Closure, Boundary) Suppose that 𝑈 is a subset of a metric


space (𝑋, 𝑑) (or more generally a topological space). A point 𝑥 ∈ 𝑈 is called an interior
point of 𝑈 if 𝑈 is a neighbourhood of 𝑥. We call

(i) 𝑈̊ ∶= Int(𝑈 ) ∶= {𝑥 ∈ 𝑈 ∶ 𝑥 interior point of 𝑈 } the interior of 𝑈 ;

(ii) 𝑈̄ ∶= {𝑥 ∈ 𝑋 ∶ 𝑈 ∩ 𝑉 ≠ ∅ for every neighbourhood 𝑉 of 𝑥} the closure of 𝑈 ;

(iii) 𝜕𝑈 ∶= 𝑈̄ ⧵ Int(𝑈 ) the boundary of 𝑈 .

2.11 Remark A set 𝑈 is open if and only if 𝑈̊ = 𝑈 and closed if and only if 𝑈̄ = 𝑈 .
Moreover, 𝜕𝑈 is the intersection of 𝑈̄ with the closure of 𝑋 ⧵ 𝑈 .
Sometimes it is convenient to look at products of a (finite) number of metric spaces.
It is possible to define a metric on such a product as well.

2.12 Proposition Suppose that (𝑋𝑖 , 𝑑𝑖 ), 𝑖 = 1, … , 𝑛, are metric spaces. Then 𝑋 =
𝑋1 × 𝑋2 × ⋯ × 𝑋𝑛 becomes a metric space with the metric 𝑑 defined by

𝑑(𝑥, 𝑦) ∶= ∑_{𝑖=1}^{𝑛} 𝑑𝑖 (𝑥𝑖 , 𝑦𝑖 )

for all 𝑥 = (𝑥1 , … , 𝑥𝑛 ) and 𝑦 = (𝑦1 , … , 𝑦𝑛 ) in 𝑋.


Proof. Obviously, 𝑑(𝑥, 𝑦) ≥ 0 and 𝑑(𝑥, 𝑦) = 𝑑(𝑦, 𝑥) for all 𝑥, 𝑦 ∈ 𝑋. Moreover, as
𝑑𝑖 (𝑥𝑖 , 𝑦𝑖 ) ≥ 0 we have 𝑑(𝑥, 𝑦) = 0 if and only if 𝑑𝑖 (𝑥𝑖 , 𝑦𝑖 ) = 0 for all 𝑖 = 1, … , 𝑛. As the 𝑑𝑖
are metrics we get 𝑥𝑖 = 𝑦𝑖 for all 𝑖 = 1, … , 𝑛, that is, 𝑥 = 𝑦. For the triangle inequality note that

𝑑(𝑥, 𝑦) = ∑_{𝑖=1}^{𝑛} 𝑑𝑖 (𝑥𝑖 , 𝑦𝑖 ) ≤ ∑_{𝑖=1}^{𝑛} ( 𝑑𝑖 (𝑥𝑖 , 𝑧𝑖 ) + 𝑑𝑖 (𝑧𝑖 , 𝑦𝑖 ) ) = ∑_{𝑖=1}^{𝑛} 𝑑𝑖 (𝑥𝑖 , 𝑧𝑖 ) + ∑_{𝑖=1}^{𝑛} 𝑑𝑖 (𝑧𝑖 , 𝑦𝑖 ) = 𝑑(𝑥, 𝑧) + 𝑑(𝑧, 𝑦)

for all 𝑥, 𝑦, 𝑧 ∈ 𝑋.

2.13 Definition (Product space) The space and metric introduced in Proposition 2.12
is called a product space and a product metric, respectively.
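As a small numerical illustration of the product metric of Proposition 2.12 (a Python sketch, not part of the original development; the component metrics and helper names are arbitrary choices), one can build the metric on ℝ × ℝ from two copies of the absolute-value metric and spot-check the axioms:

    # Sketch: the product metric d(x, y) = sum_i d_i(x_i, y_i) built from
    # component metrics; here both components are R with d(s, t) = |s - t|.
    def d_abs(s, t):
        # metric on R
        return abs(s - t)

    def product_metric(metrics):
        # given metrics d_1, ..., d_n, return the metric on the product space
        def d(x, y):
            return sum(d_i(x_i, y_i) for d_i, x_i, y_i in zip(metrics, x, y))
        return d

    d = product_metric([d_abs, d_abs])          # metric on R x R
    x, y, z = (0.0, 1.0), (2.0, -1.0), (1.0, 0.5)
    assert d(x, x) == 0 and d(x, y) > 0
    assert d(x, y) == d(y, x)                   # symmetry
    assert d(x, y) <= d(x, z) + d(z, y)         # triangle inequality
    print(d(x, y))                              # 2.0 + 2.0 = 4.0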

3 Limits
Once we have a notion of “closeness” we can discuss the asymptotics of sequences and
continuity of functions.

3.1 Definition (Limit) Suppose (𝑥𝑛 )𝑛∈ℕ is a sequence in a metric space (𝑋, 𝑑), or more
generally a topological space. We say 𝑥0 is a limit of (𝑥𝑛 ) if for every neighbourhood
𝑈 of 𝑥0 there exists 𝑛0 ∈ ℕ such that 𝑥𝑛 ∈ 𝑈 for all 𝑛 ≥ 𝑛0 . We write

𝑥0 = lim_{𝑛→∞} 𝑥𝑛 or 𝑥𝑛 → 𝑥0 as 𝑛 → ∞.

If the sequence has a limit we say it is convergent, otherwise we say it is divergent.

3.2 Remark Let (𝑥𝑛 ) be a sequence in a metric space (𝑋, 𝑑) and 𝑥0 ∈ 𝑋. Then the
following statements are equivalent:

(1) lim_{𝑛→∞} 𝑥𝑛 = 𝑥0 ;

(2) for every 𝜀 > 0 there exists 𝑛0 ∈ ℕ such that 𝑑(𝑥𝑛 , 𝑥0 ) < 𝜀 for all 𝑛 ≥ 𝑛0 .

Proof. Clearly (1) implies (2) by choosing neighbourhoods of the form 𝐵(𝑥, 𝜀). If
(2) holds and 𝑈 is an arbitrary neighbourhood of 𝑥0 we can choose 𝜀 > 0 such that
𝐵(𝑥0 , 𝜀) ⊆ 𝑈 . By assumption there exists 𝑛0 ∈ ℕ such that 𝑑(𝑥𝑛 , 𝑥0 ) < 𝜀 for all 𝑛 ≥ 𝑛0 ,
that is, 𝑥𝑛 ∈ 𝐵(𝑥0 , 𝜀) ⊆ 𝑈 for all 𝑛 ≥ 𝑛0 . Therefore, 𝑥𝑛 → 𝑥0 as 𝑛 → ∞.

3.3 Proposition A sequence in a metric space (𝑋, 𝑑) has at most one limit.
Proof. Suppose that (𝑥𝑛 ) is a sequence in (𝑋, 𝑑) and that 𝑥 and 𝑦 are limits of that
sequence. Fix 𝜀 > 0 arbitrary. Since 𝑥 is a limit there exists 𝑛1 ∈ ℕ such that 𝑑(𝑥𝑛 , 𝑥) <
𝜀∕2 for all 𝑛 > 𝑛1 . Similarly, since 𝑦 is a limit there exists 𝑛2 ∈ ℕ such that 𝑑(𝑥𝑛 , 𝑦) <
𝜀∕2 for all 𝑛 > 𝑛2 . Hence 𝑑(𝑥, 𝑦) ≤ 𝑑(𝑥, 𝑥𝑛 ) + 𝑑(𝑥𝑛 , 𝑦) ≤ 𝜀∕2 + 𝜀∕2 = 𝜀 for all
𝑛 > max{𝑛1 , 𝑛2 }. Since 𝜀 > 0 was arbitrary it follows that 𝑑(𝑥, 𝑦) = 0, and so by
definition of a metric 𝑥 = 𝑦. Thus (𝑥𝑛 ) has at most one limit.

We can characterise the closure of sets by using sequences.

3.4 Theorem Let 𝑈 be a subset of the metric space (𝑋, 𝑑). Then 𝑥 ∈ 𝑈̄ if and only if
there exists a sequence (𝑥𝑛 ) in 𝑈 such that 𝑥𝑛 → 𝑥 as 𝑛 → ∞.
Proof. Let 𝑈 ⊆ 𝑋 and 𝑥 ∈ 𝑈̄ . Then 𝐵(𝑥, 𝜀) ∩ 𝑈 ≠ ∅ for all 𝜀 > 0. For all 𝑛 ∈ ℕ we
can therefore choose 𝑥𝑛 ∈ 𝑈 with 𝑑(𝑥, 𝑥𝑛 ) < 1∕𝑛. By construction 𝑥𝑛 → 𝑥 as 𝑛 → ∞.
If (𝑥𝑛 ) is a sequence in 𝑈 converging to 𝑥 then for every 𝜀 > 0 there exists 𝑛0 ∈ ℕ such
that 𝑥𝑛 ∈ 𝐵(𝑥, 𝜀) for all 𝑛 ≥ 𝑛0 . In particular, 𝐵(𝑥, 𝜀) ∩ 𝑈 ≠ ∅ for all 𝜀 > 0, implying
that 𝑥 ∈ 𝑈̄ as required.

There is another concept closely related to convergence of sequences.

3.5 Definition (Cauchy Sequence) Suppose (𝑥𝑛 ) is a sequence in the metric space
(𝑋, 𝑑). We call (𝑥𝑛 ) a Cauchy sequence if for every 𝜀 > 0 there exists 𝑛0 ∈ ℕ such that
𝑑(𝑥𝑛 , 𝑥𝑚 ) < 𝜀 for all 𝑚, 𝑛 ≥ 𝑛0 .
Some sequences may not converge, but they accumulate at certain points.

3.6 Definition (Point of Accumulation) Suppose that (𝑥𝑛 ) is a sequence in a metric


space (𝑋, 𝑑) or more generally in a topological space. We say that 𝑥0 is a point of
accumulation of (𝑥𝑛 ) if for every neighbourhood 𝑈 of 𝑥0 and every 𝑛0 ∈ ℕ there exists
𝑛 ≥ 𝑛0 such that 𝑥𝑛 ∈ 𝑈 .

3.7 Remark Equivalently we may say 𝑥0 is an accumulation point of (𝑥𝑛 ) if for every
𝜀 > 0 and every 𝑛0 ∈ ℕ there exists 𝑛 ≥ 𝑛0 such that 𝑑(𝑥𝑛 , 𝑥0 ) < 𝜀. Note that it follows
from the definition that every neighbourhood of 𝑥0 contains infinitely many elements
of the sequence (𝑥𝑛 ).

3.8 Proposition Suppose that (𝑋, 𝑑) is a metric space and (𝑥𝑛 ) a sequence in that
space. Then 𝑥 ∈ 𝑋 is a point of accumulation of (𝑥𝑛 ) if and only if

𝑥 ∈ ⋂_{𝑘=1}^{∞} 𝐴𝑘 , where 𝐴𝑘 denotes the closure of {𝑥𝑗 ∶ 𝑗 ≥ 𝑘} in 𝑋. (3.1)

Proof. Suppose that 𝑥 ∈ ⋂_{𝑘=1}^{∞} 𝐴𝑘 . Then 𝑥 ∈ 𝐴𝑘 for all 𝑘 ∈ ℕ. By Theorem 3.4 we
can choose for every 𝑘 ∈ ℕ an element 𝑥𝑛𝑘 ∈ {𝑥𝑗 ∶ 𝑗 ≥ 𝑘} such that 𝑑(𝑥𝑛𝑘 , 𝑥) < 1∕𝑘.
By construction 𝑥𝑛𝑘 → 𝑥 as 𝑘 → ∞, showing that 𝑥 is a point of accumulation of (𝑥𝑛 ).
If 𝑥 is a point of accumulation of (𝑥𝑛 ) then for all 𝑘 ∈ ℕ there exists 𝑛𝑘 ≥ 𝑘 such that
𝑑(𝑥𝑛𝑘 , 𝑥) < 1∕𝑘. Clearly 𝑥𝑛𝑘 → 𝑥 as 𝑘 → ∞, so that 𝑥 lies in the closure of {𝑥𝑛𝑗 ∶ 𝑗 ≥ 𝑘}
for all 𝑘 ∈ ℕ. As {𝑥𝑛𝑗 ∶ 𝑗 ≥ 𝑘} ⊆ {𝑥𝑗 ∶ 𝑗 ≥ 𝑘}, and hence its closure is contained in 𝐴𝑘 ,
for all 𝑘 ∈ ℕ, we obtain (3.1).
In the following theorem we establish a connection between Cauchy sequences and
converging sequences.

3.9 Theorem Let (𝑋, 𝑑) be a metric space. Then every convergent sequence is a
Cauchy sequence. Moreover, if a Cauchy sequence (𝑥𝑛 ) has an accumulation point
𝑥0 , then (𝑥𝑛 ) is a convergent sequence with limit 𝑥0 .
Proof. Suppose that (𝑥𝑛 ) is a convergent sequence with limit 𝑥0 . Then for every 𝜀 > 0
there exists 𝑛0 ∈ ℕ such that 𝑑(𝑥𝑛 , 𝑥0 ) < 𝜀∕2 for all 𝑛 ≥ 𝑛0 . Now
𝜀 𝜀
𝑑(𝑥𝑛 , 𝑥𝑚 ) ≤ 𝑑(𝑥𝑛 , 𝑥0 ) + 𝑑(𝑥0 , 𝑥𝑚 ) = 𝑑(𝑥𝑛 , 𝑥0 ) + 𝑑(𝑥𝑚 , 𝑥0 ) < + = 𝜀
2 2
for all 𝑛, 𝑚 ≥ 𝑛0 , showing that (𝑥𝑛 ) is a Cauchy sequence. Now assume that (𝑥𝑛 ) is a
Cauchy sequence, and that 𝑥0 ∈ 𝑋 is an accumulation point of (𝑥𝑛 ). Fix 𝜀 > 0 arbitrary.
Then by definition of a Cauchy sequence there exists 𝑛0 ∈ ℕ such that 𝑑(𝑥𝑛 , 𝑥𝑚 ) < 𝜀∕2
for all 𝑛, 𝑚 ≥ 𝑛0 . Moreover, since 𝑥0 is an accumulation point there exists 𝑚0 ≥ 𝑛0 such
that 𝑑(𝑥𝑚0 , 𝑥0 ) < 𝜀∕2. Hence
𝜀 𝜀
𝑑(𝑥𝑛 , 𝑥0 ) ≤ 𝑑(𝑥𝑛 , 𝑥𝑚0 ) + 𝑑(𝑥𝑚0 , 𝑥0 ) < + = 𝜀
2 2
for all 𝑛 ≥ 𝑛0 . Hence by Remark 3.2 𝑥0 is the limit of (𝑥𝑛 ).
In a general metric space not all Cauchy sequences have necessarily a limit, hence the
following definition.

3.10 Definition (Complete Metric Space) A metric space is called complete if every
Cauchy sequence in that space has a limit.
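To see concretely how completeness can fail, the following sketch (Python, using exact rational arithmetic from the standard library; the Newton iteration is just one convenient way to produce such a sequence) generates a Cauchy sequence in ℚ with the usual metric whose limit √2 is not rational, so ℚ is not complete:

    from fractions import Fraction

    # Newton iterates x_{n+1} = (x_n + 2/x_n)/2 for sqrt(2), starting at 1.
    # All iterates are rational and the sequence is Cauchy in Q, but its
    # limit sqrt(2) lies outside Q, so the sequence has no limit in Q.
    x = Fraction(1)
    iterates = [x]
    for _ in range(6):
        x = (x + 2 / x) / 2
        iterates.append(x)

    # successive distances |x_n - x_{n+1}| shrink rapidly (Cauchy behaviour)
    for a, b in zip(iterates, iterates[1:]):
        print(float(abs(a - b)))

    # none of the iterates squares to exactly 2
    assert all(q * q != 2 for q in iterates)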
One property of the real numbers is that a nested sequence of closed bounded
intervals whose lengths shrink to zero has a non-empty intersection. This
property is in fact equivalent to the “completeness” of the real number system. We now
prove a counterpart of that fact for metric spaces. There are no intervals in general
metric spaces, so we look at a sequence of nested closed sets whose diameter goes to
zero. The diameter of a set 𝐾 in a metric space (𝑋, 𝑑) is defined by
diam(𝐾) ∶= sup_{𝑥,𝑦∈𝐾} 𝑑(𝑥, 𝑦).

3.11 Theorem (Cantor’s Intersection Theorem) Let (𝑋, 𝑑) be a metric space. Then
the following two assertions are equivalent:

(i) (𝑋, 𝑑) is complete;

(ii) For every sequence of closed sets 𝐾𝑛 ⊆ 𝑋 with 𝐾𝑛+1 ⊆ 𝐾𝑛 for all 𝑛 ∈ ℕ and

diam(𝐾𝑛 ) ∶= sup_{𝑥,𝑦∈𝐾𝑛} 𝑑(𝑥, 𝑦) → 0

as 𝑛 → ∞ we have ⋂_{𝑛∈ℕ} 𝐾𝑛 ≠ ∅.

Proof. First assume that 𝑋 is complete and let 𝐾𝑛 be as in (ii). For every 𝑛 ∈ ℕ we
choose 𝑥𝑛 ∈ 𝐾𝑛 and show that (𝑥𝑛 ) is a Cauchy sequence. By assumption 𝐾𝑛+1 ⊆ 𝐾𝑛
for all 𝑛 ∈ ℕ, implying that 𝑥𝑚 ∈ 𝐾𝑚 ⊆ 𝐾𝑛 for all 𝑚 > 𝑛. Since 𝑥𝑚 , 𝑥𝑛 ∈ 𝐾𝑛 we have

𝑑(𝑥𝑚 , 𝑥𝑛 ) ≤ sup_{𝑥,𝑦∈𝐾𝑛} 𝑑(𝑥, 𝑦) = diam(𝐾𝑛 )

for all 𝑚 > 𝑛. Since diam(𝐾𝑛 ) → 0 as 𝑛 → ∞, given 𝜀 > 0 there exists 𝑛0 ∈ ℕ such
that diam(𝐾𝑛0 ) < 𝜀. Hence, since 𝐾𝑚 ⊆ 𝐾𝑛 ⊆ 𝐾𝑛0 we have

𝑑(𝑥𝑚 , 𝑥𝑛 ) ≤ diam(𝐾𝑛 ) ≤ diam(𝐾𝑛0 ) < 𝜀

for all 𝑚 > 𝑛 > 𝑛0 , showing that (𝑥𝑛 ) is a Cauchy sequence. By completeness of 𝑋,
the sequence (𝑥𝑛 ) converges to some 𝑥 ∈ 𝑋. We know from above that 𝑥𝑚 ∈ 𝐾𝑛 for
all 𝑚 > 𝑛. As 𝐾𝑛 is closed 𝑥 ∈ 𝐾𝑛 . Since this is true for all 𝑛 ∈ ℕ we conclude that

𝑥 ∈ ⋂_{𝑛∈ℕ} 𝐾𝑛 , so the intersection is non-empty as claimed.
Assume now that (ii) is true and let (𝑥𝑛 ) be a Cauchy sequence in (𝑋, 𝑑). Hence
there exists 𝑛0 ∈ ℕ such that 𝑑(𝑥𝑛0 , 𝑥𝑛 ) < 1∕2 for all 𝑛 ≥ 𝑛0 . Similarly, there exists
𝑛1 > 𝑛0 such that 𝑑(𝑥𝑛1 , 𝑥𝑛 ) < 1∕2^2 for all 𝑛 ≥ 𝑛1 . Continuing that way we construct a
sequence (𝑛𝑘 ) in ℕ such that for every 𝑘 ∈ ℕ we have 𝑛𝑘+1 > 𝑛𝑘 and 𝑑(𝑥𝑛𝑘 , 𝑥𝑛 ) < 1∕2^{𝑘+1}
for all 𝑛 > 𝑛𝑘 . We now set 𝐾𝑘 ∶= 𝐵̄(𝑥𝑛𝑘 , 2^{−𝑘}). If 𝑥 ∈ 𝐾𝑘+1 , then since 𝑛𝑘+1 > 𝑛𝑘

𝑑(𝑥𝑛𝑘 , 𝑥) ≤ 𝑑(𝑥𝑛𝑘 , 𝑥𝑛𝑘+1 ) + 𝑑(𝑥𝑛𝑘+1 , 𝑥) < 1∕2^{𝑘+1} + 1∕2^{𝑘+1} = 1∕2^{𝑘} .

Hence 𝑥 ∈ 𝐾𝑘 , showing that 𝐾𝑘+1 ⊆ 𝐾𝑘 for all 𝑘 ∈ ℕ. By assumption (ii) we have
⋂_{𝑘∈ℕ} 𝐾𝑘 ≠ ∅, so choose 𝑥 ∈ ⋂_{𝑘∈ℕ} 𝐾𝑘 . Then 𝑥 ∈ 𝐾𝑘 for all 𝑘 ∈ ℕ, so 𝑑(𝑥𝑛𝑘 , 𝑥) ≤ 1∕2^{𝑘}
for all 𝑘 ∈ ℕ. Hence 𝑥𝑛𝑘 → 𝑥 as 𝑘 → ∞. By Theorem 3.9 the Cauchy sequence
(𝑥𝑛 ) converges, proving (i).

We finally look at product spaces defined in Definition 2.13. The rather simple
proof of the following proposition is left to the reader.

3.12 Proposition Suppose that (𝑋𝑖 , 𝑑𝑖 ), 𝑖 = 1, … , 𝑛 are complete metric spaces. Then
the corresponding product space is complete with respect to the product metric.

4 Compactness
We start by introducing some additional concepts, and show that they are all equivalent
in a metric space. They are all generalisations of “finiteness” of a set.

4.1 Definition (Open Cover, Compactness) Let (𝑋, 𝑑) be a metric space. We call a

collection of open sets (𝑈𝛼 )𝛼∈𝐴 an open cover of 𝑋 if 𝑋 ⊆ ⋃_{𝛼∈𝐴} 𝑈𝛼 . The space 𝑋
is called compact if for every open cover (𝑈𝛼 )𝛼∈𝐴 there exist finitely many 𝛼𝑖 ∈ 𝐴,
𝑖 = 1, … , 𝑚 such that (𝑈𝛼𝑖 )𝑖=1,…,𝑚 is an open cover of 𝑋. We talk about a finite sub-
cover of 𝑋.

4.2 Definition (Sequential Compactness) We call a metric space (𝑋, 𝑑) sequentially


compact if every sequence in 𝑋 has a point of accumulation.

4.3 Definition (Total Boundedness) We call a metric space 𝑋 totally bounded if for
every 𝜀 > 0 there exist finitely many points 𝑥𝑖 ∈ 𝑋, 𝑖 = 1, … , 𝑚, such that (𝐵(𝑥𝑖 , 𝜀))𝑖=1,…,𝑚
is an open cover of 𝑋.
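Total boundedness of a concrete set can be checked directly. The following sketch (Python; the grid construction and sample size are arbitrary choices, not part of the notes) builds a finite 𝜀-net for the unit square [0,1]² with the Euclidean metric and verifies that every point of a random sample lies in one of the balls 𝐵(𝑥𝑖 , 𝜀):

    import math, random

    def eps_net(eps):
        # grid of centres; every point of [0,1]^2 is within eps of some centre
        h = eps / math.sqrt(2)              # grid spacing
        steps = int(1 / h) + 1
        return [(i * h, j * h) for i in range(steps + 1) for j in range(steps + 1)]

    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    eps = 0.1
    net = eps_net(eps)
    random.seed(0)
    sample = [(random.random(), random.random()) for _ in range(1000)]
    assert all(min(dist(p, c) for c in net) < eps for p in sample)
    print(len(net), "balls of radius", eps, "cover the sample")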
It turns out that all the above definitions are equivalent, at least in metric spaces (but
not in general topological spaces).

4.4 Theorem For a metric space (𝑋, 𝑑) the following statements are equivalent:
(i) 𝑋 is compact;

(ii) 𝑋 is sequentially compact;

(iii) 𝑋 is complete and totally bounded.


Proof. To prove that (i) implies (ii) assume that 𝑋 is compact and that (𝑥𝑛 ) is a sequence
in 𝑋. We let 𝐶𝑛 be the closure of {𝑥𝑗 ∶ 𝑗 ≥ 𝑛} and set 𝑈𝑛 ∶= 𝑋 ⧵ 𝐶𝑛 . Then 𝑈𝑛 is open for all 𝑛 ∈ ℕ as
𝐶𝑛 is closed. By Proposition 3.8 the sequence (𝑥𝑛 ) has a point of accumulation if

⋂_{𝑛∈ℕ} 𝐶𝑛 ≠ ∅,

which is equivalent to

⋃_{𝑛∈ℕ} 𝑈𝑛 = ⋃_{𝑛∈ℕ} (𝑋 ⧵ 𝐶𝑛 ) = 𝑋 ⧵ ⋂_{𝑛∈ℕ} 𝐶𝑛 ≠ 𝑋

Clearly 𝐶0 ⊃ 𝐶1 ⊃ ⋯ ⊃ 𝐶𝑛 ≠ ∅ for all 𝑛 ∈ ℕ. Hence every finite intersection of


sets 𝐶𝑛 is nonempty. Equivalently, every finite union of sets 𝑈𝑛 is strictly smaller than
𝑋, so that 𝑋 cannot be covered by finitely many of the sets 𝑈𝑛 . As 𝑋 is compact it

is impossible that ⋃_{𝑛∈ℕ} 𝑈𝑛 = 𝑋 as otherwise a finite number would cover 𝑋 already,
contradicting what we just proved. Hence (𝑥𝑛 ) must have a point of accumulation.
Now assume that (ii) holds. If (𝑥𝑛 ) is a Cauchy sequence it follows from (ii) that it
has a point of accumulation. By Theorem 3.9 we conclude that it has a limit, showing
that 𝑋 is complete. Suppose now that 𝑋 is not totally bounded. Then, there exists
𝜀 > 0 such that 𝑋 cannot be covered by finitely many balls of radius 𝜀. If we let 𝑥0
be arbitrary we can therefore choose 𝑥1 ∈ 𝑋 such that 𝑑(𝑥0 , 𝑥1 ) > 𝜀. By induction

we may construct a sequence (𝑥𝑛 ) such that 𝑑(𝑥𝑗 , 𝑥𝑛 ) ≥ 𝜀 for all 𝑗 = 1, … , 𝑛 − 1.
Indeed, suppose we have 𝑥0 , … , 𝑥𝑛 ∈ 𝑋 with 𝑑(𝑥𝑗 , 𝑥𝑛 ) ≥ 𝜀 for all 𝑗 = 1, … , 𝑛 − 1.
Assuming that 𝑋 is not totally bounded ⋃_{𝑗=1}^{𝑛} 𝐵(𝑥𝑗 , 𝜀) ≠ 𝑋, so we can choose 𝑥𝑛+1
not in that union. Hence 𝑑(𝑥𝑗 , 𝑥𝑛+1 ) ≥ 𝜀 for 𝑗 = 1, … , 𝑛. By construction it follows
that 𝑑(𝑥𝑛 , 𝑥𝑚 ) ≥ 𝜀∕2 for all 𝑛, 𝑚 ∈ ℕ with 𝑛 ≠ 𝑚, showing that (𝑥𝑛 ) does not contain a Cauchy
subsequence, and thus has no point of accumulation. As this contradicts (ii), the space
𝑋 must be totally bounded.
Suppose now that (iii) holds, but 𝑋 is not compact. Then there exists an open cover
(𝑈𝛼 )𝛼∈𝐴 not having a finite sub-cover. As 𝑋 is totally bounded, for every 𝑛 ∈ ℕ there
exist finite sets 𝐹𝑛 ⊆ 𝑋 such that

𝑋 = ⋃_{𝑥∈𝐹𝑛} 𝐵(𝑥, 2^{−𝑛} ). (4.1)

Assuming that (𝑈𝛼 )𝛼∈𝐴 does not have a finite sub-cover, there exists 𝑥1 ∈ 𝐹1 such
that 𝐵(𝑥1 , 2−1 ) and thus 𝐾1 ∶= 𝐵(𝑥1 , 3 ⋅ 2−1 ) cannot be covered by finitely many 𝑈𝛼 .
By (4.1) it follows that there exists 𝑥2 ∈ 𝐹2 such that 𝐵(𝑥1 , 2−1 ) ∩ 𝐵(𝑥2 , 2−2 ) and
therefore 𝐾2 ∶= 𝐵(𝑥2 , 3 ⋅ 2−2 ) is not finitely covered by (𝑈𝛼 )𝛼∈𝐴 . We can continue
this way and choose 𝑥𝑛+1 ∈ 𝐹𝑛+1 such that 𝐵(𝑥𝑛 , 2−𝑛 ) ∩ 𝐵(𝑥𝑛+1 , 2−(𝑛+1) ) and therefore
𝐾𝑛+1 ∶= 𝐵(𝑥𝑛+1 , 3 ⋅ 2^{−(𝑛+1)} ) is not finitely covered by (𝑈𝛼 )𝛼∈𝐴 . Note that 𝐵(𝑥𝑛 , 2−𝑛 ) ∩
𝐵(𝑥𝑛+1 , 2−(𝑛+1) ) ≠ ∅ since otherwise the intersection is finitely covered by (𝑈𝛼 )𝛼∈𝐴 .
Hence if 𝑥 ∈ 𝐾𝑛+1 , then

𝑑(𝑥𝑛 , 𝑥) ≤ 𝑑(𝑥𝑛 , 𝑥𝑛+1 ) + 𝑑(𝑥𝑛+1 , 𝑥) ≤ 1∕2^{𝑛} + 1∕2^{𝑛+1} + 3∕2^{𝑛+1} = 6∕2^{𝑛+1} = 3∕2^{𝑛} ,

implying that 𝑥 ∈ 𝐾𝑛 . Also diam 𝐾𝑛 ≤ 3 ⋅ 2^{−(𝑛−1)} → 0. Since 𝑋 is complete, by
implying that 𝑥 ∈ 𝐾𝑛 . Also diam 𝐾𝑛 ≤ 3 ⋅ 2𝑛−1 → 0. Since 𝑋 is complete, by

Cantor’s intersection Theorem 3.11 there exists 𝑥 ∈ 𝑛∈ℕ 𝐾𝑛 . As (𝑈𝛼 ) is a cover of
𝑋 we have 𝑥 ∈ 𝑈𝛼0 for some 𝛼0 ∈ 𝐴. Since 𝑈𝛼0 is open there exists 𝜀 > 0 such that
𝐵(𝑥, 𝜀) ⊆ 𝑈𝛼0 . Choose now 𝑛 such that 6∕2𝑛 < 𝜀 and fix 𝑦 ∈ 𝐾𝑛 . Since 𝑥 ∈ 𝐾𝑛 we
have 𝑑(𝑥, 𝑦) ≤ 𝑑(𝑥, 𝑥𝑛 ) + 𝑑(𝑥𝑛 , 𝑦) ≤ 6∕2𝑛 < 𝜀. Hence 𝐾𝑛 ⊆ 𝐵(𝑥, 𝜀) ⊆ 𝑈𝛼0 , showing
that 𝐾𝑛 is covered by 𝑈𝛼0 . However, by construction 𝐾𝑛 cannot be covered by finitely
many 𝑈𝛼 , so we have a contradiction. Hence 𝑋 is compact, completing the proof of the
theorem.

The last part of the proof is modelled on the usual proof of the Heine-Borel theorem
asserting that bounded and closed sets are the compact sets in ℝ𝑁 . Hence it is not a
surprise that the Heine-Borel theorem easily follows from the above characterisations
of compactness.

4.5 Theorem (Heine-Borel) A subset of ℝ𝑁 is compact if and only if it is closed and


bounded.
Proof. Suppose 𝐴 ⊆ ℝ𝑁 is compact. By Theorem 4.4 the set 𝐴 is totally bounded, and
thus may be covered by finitely many balls of radius one. A finite union of such balls
is clearly bounded, so 𝐴 is bounded. Again by Theorem 4.4, the set 𝐴 is complete, so
in particular it is closed. Now assume 𝐴 is closed and bounded. As ℝ𝑁 is complete it
follows that 𝐴 is complete. Next we show that 𝐴 is totally bounded. We let 𝑀 be such

that 𝐴 is contained in the cube [−𝑀, 𝑀]𝑁 . Given 𝜀 > 0 the interval [−𝑀, 𝑀] can
be covered by 𝑚 ∶= [2𝑀∕𝜀] + 1 closed intervals of length 𝜀∕2 (here [2𝑀∕𝜀] is the
integer part of 2𝑀∕𝜀). Hence [−𝑀, 𝑀]𝑁 can be covered by 𝑚𝑁 cubes with edges 𝜀∕2
long. Such cubes are contained in open balls of radius 𝜀, so we can cover [−𝑀, 𝑀]𝑁
and thus 𝐴 by a finite number of balls of radius 𝜀. Hence 𝐴 is complete and totally
bounded. By Theorem 4.4 the set 𝐴 is compact.

We can also look at subsets of metric spaces. As they are metric spaces with the metric
induced on them we can talk about compact subsets of a metric space. It follows from
the above theorem that compact subsets of a metric space are always closed (as they are
complete). Often in applications one has sets that are not compact, but their closure is
compact.

4.6 Definition (Relatively Compact Sets) We call a subset of a metric space relatively
compact if its closure is compact.

4.7 Proposition Closed subsets of compact metric spaces are compact.


Proof. Suppose 𝐶 ⊆ 𝑋 is closed and 𝑋 is compact. If (𝑈𝛼 )𝛼∈𝐴 is an open cover of 𝐶
then we get an open cover of 𝑋 if we add the open set 𝑋 ⧵𝐶 to the 𝑈𝛼 . As 𝑋 is compact
there exists a finite sub-cover of 𝑋, and as (𝑋 ⧵ 𝐶) ∩ 𝐶 = ∅ also a finite sub-cover of 𝐶.
Hence 𝐶 is compact.

Next we show that finite products of compact metric spaces are compact.

4.8 Proposition Let (𝑋𝑖 , 𝑑𝑖 ), 𝑖 = 1, … , 𝑛, be compact metric spaces. Then the prod-
uct 𝑋 ∶= 𝑋1 × ⋯ × 𝑋𝑛 is compact with respect to the product metric introduced in
Proposition 2.12.
Proof. By Proposition 3.12 it follows that the product space 𝑋 is complete. By
Theorem 4.4 it is therefore sufficient to show that 𝑋 is totally bounded. Fix 𝜀 > 0.
Since 𝑋𝑖 is totally bounded there exist 𝑥𝑖𝑘 ∈ 𝑋𝑖 , 𝑘 = 1, … 𝑚𝑖 such that 𝑋𝑖 is covered
by the balls 𝐵𝑖𝑘 of radius 𝜀∕𝑛 and centre 𝑥𝑖𝑘 . Then 𝑋 is covered by the balls of ra-
dius 𝜀 with centres (𝑥1𝑘1 , … , 𝑥𝑖𝑘𝑖 , … 𝑥𝑛𝑘𝑛 ), where 𝑘𝑖 = 1, … 𝑚𝑖 . Indeed, suppose that
𝑥 = (𝑥1 , 𝑥2 , … , 𝑥𝑛 ) ∈ 𝑋 is arbitrary. By assumption, for every 𝑖 = 1, … 𝑛 there exist
1 ≤ 𝑘𝑖 ≤ 𝑚𝑖 such that 𝑑(𝑥𝑖 , 𝑥𝑖𝑘𝑖 ) < 𝜀∕𝑛. By definition of the product metric the distance
between (𝑥1𝑘1 , … , 𝑥𝑛𝑘𝑛 ) and 𝑥 is no larger than 𝑑(𝑥1 , 𝑥1𝑘1 )+⋯+𝑑(𝑥𝑛 , 𝑥𝑛𝑘𝑛 ) ≤ 𝑛𝜀∕𝑛 = 𝜀.
Hence 𝑋 is totally bounded and thus 𝑋 is compact.

5 Continuous Functions
We give a brief overview on continuous functions between metric spaces. Throughout,
let 𝑋 = (𝑋, 𝑑) denote a metric space. We start with some basic definitions.

5.1 Definition (Continuous Function) A function 𝑓 ∶ 𝑋 → 𝑌 between two metric


spaces is called continuous at a point 𝑥 ∈ 𝑋 if for every neighbourhood 𝑉 ⊆ 𝑌 of 𝑓 (𝑥)

there exists a neighbourhood 𝑈 ⊆ 𝑋 of 𝑥 such that 𝑓 (𝑈 ) ⊆ 𝑉 . The map 𝑓 ∶ 𝑋 → 𝑌
is called continuous if it is continuous at all 𝑥 ∈ 𝑋. Finally we set

𝐶(𝑋, 𝑌 ) ∶= {𝑓 ∶ 𝑋 → 𝑌 ∣ 𝑓 is continuous}.

The above is equivalent to the usual 𝜀-𝛿 definition.

5.2 Theorem Let 𝑋, 𝑌 be metric spaces and 𝑓 ∶ 𝑋 → 𝑌 a function. Then the follow-
ing assertions are equivalent:

(i) 𝑓 is continuous at 𝑥 ∈ 𝑋;
(ii) For every 𝜀 > 0 there exists 𝛿 > 0 such that 𝑑𝑌 (𝑓 (𝑥), 𝑓 (𝑦)) ≤ 𝜀 for all 𝑦 ∈ 𝑋
with 𝑑𝑋 (𝑥, 𝑦) < 𝛿;

(iii) For every sequence (𝑥𝑛 ) in 𝑋 with 𝑥𝑛 → 𝑥 we have 𝑓 (𝑥𝑛 ) → 𝑓 (𝑥) as 𝑛 → ∞.

Proof. Taking the special neighbourhoods 𝑉 = 𝐵(𝑓 (𝑥), 𝜀) and 𝑈 ∶= 𝐵(𝑥, 𝛿) we see that (ii)
is clearly necessary for 𝑓 to be continuous. To show that (ii) is sufficient let 𝑉 be an
arbitrary neighbourhood of 𝑓 (𝑥). Then there exists 𝜀 > 0 such that 𝐵(𝑓 (𝑥), 𝜀) ⊆ 𝑉 .
By assumption there exists 𝛿 > 0 such that 𝑑𝑌 (𝑓 (𝑥), 𝑓 (𝑦)) ≤ 𝜀 for all 𝑦 ∈ 𝑋 with
𝑑𝑋 (𝑥, 𝑦) < 𝛿, that is, 𝑓 (𝑈 ) ⊆ 𝑉 if we let 𝑈 ∶= 𝐵(𝑥, 𝛿). As 𝑈 is a neighbourhood of 𝑥
it follows that 𝑓 is continuous. Let now 𝑓 be continuous and (𝑥𝑛 ) a sequence in 𝑋 con-
verging to 𝑥. If 𝜀 > 0 is given then there exists 𝛿 > 0 such that 𝑑𝑌 (𝑓 (𝑥), 𝑓 (𝑦)) < 𝜀 for all
𝑦 ∈ 𝑋 with 𝑑𝑋 (𝑥, 𝑦) < 𝛿. As 𝑥𝑛 → 𝑥 there exists 𝑛0 ∈ ℕ such that 𝑑𝑋 (𝑥, 𝑥𝑛 ) < 𝛿 for all
𝑛 ≥ 𝑛0 . Hence 𝑑𝑌 (𝑓 (𝑥), 𝑓 (𝑥𝑛 )) < 𝜀 for all 𝑛 ≥ 𝑛0 . As 𝜀 > 0 was arbitrary 𝑓 (𝑥𝑛 ) → 𝑓 (𝑥)
as 𝑛 → ∞. Assume now that (ii) does not hold. Then there exists 𝜀 > 0 such that for
each 𝑛 ∈ ℕ there exists 𝑥𝑛 ∈ 𝑋 with 𝑑𝑋 (𝑥, 𝑥𝑛 ) < 1∕𝑛 but 𝑑𝑌 (𝑓 (𝑥), 𝑓 (𝑥𝑛 )) ≥ 𝜀 for
all 𝑛 ∈ ℕ. Hence 𝑥𝑛 → 𝑥 in 𝑋 but 𝑓 (𝑥𝑛 ) ̸→ 𝑓 (𝑥) in 𝑌 , so (iii) does not hold. By
contrapositive (iii) implies (ii), completing the proof of the theorem.

Next we want to give various equivalent characterisations of continuous maps (without


proof).

5.3 Theorem (Characterisation of Continuity) Let 𝑋, 𝑌 be metric spaces. Then the


following statements are equivalent:

(i) 𝑓 ∈ 𝐶(𝑋, 𝑌 );

(ii) 𝑓 −1 [𝑂] ∶= {𝑥 ∈ 𝑋 ∶ 𝑓 (𝑥) ∈ 𝑂} is open for every open set 𝑂 ⊆ 𝑌 ;

(iii) 𝑓 −1 [𝐶] is closed for every closed set 𝐶 ⊆ 𝑌 ;

(iv) For every 𝑥 ∈ 𝑋 and every neighbourhood 𝑉 ⊆ 𝑌 of 𝑓 (𝑥) there exists a neigh-
bourhood 𝑈 ⊆ 𝑋 of 𝑥 such that 𝑓 (𝑈 ) ⊆ 𝑉 ;
(v) For every 𝑥 ∈ 𝑋 and every 𝜀 > 0 there exists 𝛿 > 0 such that 𝑑𝑌 (𝑓 (𝑥), 𝑓 (𝑦)) < 𝜀
for all 𝑦 ∈ 𝑋 with 𝑑𝑋 (𝑥, 𝑦) < 𝛿.

5.4 Definition (Distance to a Set) Let 𝐴 be a nonempty subset of 𝑋. We define the
distance between 𝑥 ∈ 𝑋 and 𝐴 by

dist(𝑥, 𝐴) ∶= inf_{𝑎∈𝐴} 𝑑(𝑥, 𝑎).

5.5 Proposition For every nonempty set 𝐴 ⊆ 𝑋 the map 𝑋 → ℝ, 𝑥 → dist(𝑥, 𝐴), is
continuous.
Proof. By the properties of a metric 𝑑(𝑥, 𝑎) ≤ 𝑑(𝑥, 𝑦) + 𝑑(𝑦, 𝑎). By first taking
an infimum on the left hand side and then on the right hand side we get dist(𝑥, 𝐴) ≤
𝑑(𝑥, 𝑦) + dist(𝑦, 𝐴) and thus

dist(𝑥, 𝐴) − dist(𝑦, 𝐴) ≤ 𝑑(𝑥, 𝑦)

for all 𝑥, 𝑦 ∈ 𝑋. Interchanging the roles of 𝑥 and 𝑦 we get dist(𝑦, 𝐴) − dist(𝑥, 𝐴) ≤


𝑑(𝑥, 𝑦), and thus
|dist(𝑥, 𝐴) − dist(𝑦, 𝐴)| ≤ 𝑑(𝑥, 𝑦),
implying the continuity of dist(⋅ , 𝐴).
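The estimate above says that dist(⋅ , 𝐴) is Lipschitz continuous with constant 1, which is easy to check numerically. The following sketch (Python; the finite set 𝐴 in ℝ² and the sampling ranges are arbitrary choices made only for illustration) tests the inequality on random pairs of points:

    import math, random

    A = [(0.0, 0.0), (1.0, 2.0), (-1.5, 0.5)]   # a finite subset of R^2

    def d(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    def dist_to_A(p):
        # dist(p, A) = inf over a in A of d(p, a); a minimum for a finite set
        return min(d(p, a) for a in A)

    random.seed(1)
    for _ in range(1000):
        x = (random.uniform(-3, 3), random.uniform(-3, 3))
        y = (random.uniform(-3, 3), random.uniform(-3, 3))
        # the 1-Lipschitz estimate from Proposition 5.5
        assert abs(dist_to_A(x) - dist_to_A(y)) <= d(x, y) + 1e-12
    print("Lipschitz bound verified on 1000 random pairs")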

We continue to discuss properties of continuous functions on compact sets.

5.6 Theorem If 𝑓 ∈ 𝐶(𝑋, 𝑌 ) and 𝑋 is compact then the image 𝑓 (𝑋) is compact in 𝑌 .
Proof. Suppose that (𝑈𝛼 ) is an open cover of 𝑓 (𝑋) then by continuity 𝑓 −1 [𝑈𝛼 ] are
open sets, and so (𝑓 −1 [𝑈𝛼 ]) is an open cover of 𝑋. By the compactness of 𝑋 it has a
finite sub-cover. Clearly the image of that finite sub-cover is a finite sub-cover of 𝑓 (𝑋)
by (𝑈𝛼 ). Hence 𝑓 (𝑋) is compact.

From the above theorem we can deduce an important property of real valued continuous
functions.

5.7 Theorem (Extreme value theorem) Suppose that 𝑋 is a compact metric space
and 𝑓 ∈ 𝐶(𝑋, ℝ). Then 𝑓 attains its maximum and minimum, that is, there exist
𝑥1 , 𝑥2 ∈ 𝑋 such that 𝑓 (𝑥1 ) = inf{𝑓 (𝑥) ∶ 𝑥 ∈ 𝑋} and 𝑓 (𝑥2 ) = sup{𝑓 (𝑥) ∶ 𝑥 ∈ 𝑋}.
Proof. By Theorem 5.6 the image of 𝑓 is compact, and so by the Heine-Borel theorem
(Theorem 4.5) closed and bounded. Hence the image 𝑓 (𝑋) = {𝑓 (𝑥) ∶ 𝑥 ∈ 𝑋} contains
its infimum and supremum, that is, 𝑥1 and 𝑥2 as required exist.

Continuous functions on compact sets have other nice properties. To discuss these
properties we introduce a stronger notion of continuity.

5.8 Definition (Uniform continuity) We say a function 𝑓 ∶ 𝑋 → 𝑌 is uniformly con-
tinuous if for all 𝜀 > 0 there exists 𝛿 > 0 such that 𝑑𝑌 (𝑓 (𝑥), 𝑓 (𝑦)) < 𝜀 for all 𝑥, 𝑦 ∈ 𝑋
satisfying 𝑑𝑋 (𝑥, 𝑦) < 𝛿.
The difference to continuity is that 𝛿 does not depend on the point 𝑥, but can be chosen
to be the same for all 𝑥 ∈ 𝑋, that is uniformly with respect to 𝑥 ∈ 𝑋. In fact, not all

functions are uniformly continuous. For instance the function 𝑓 ∶ ℝ → ℝ given by
𝑓 (𝑥) = 𝑥2 is not uniformly continuous. To see this note that

|𝑓 (𝑥) − 𝑓 (𝑦)| = |𝑥2 − 𝑦2 | = |𝑥 + 𝑦||𝑥 − 𝑦|

for all 𝑥, 𝑦 ∈ ℝ. Hence, no matter how small |𝑥 − 𝑦| is, |𝑓 (𝑥) − 𝑓 (𝑦)| can be as big
as we like by choosing |𝑥 + 𝑦| large enough. However, the above also shows that 𝑓 is
uniformly continuous on any bounded set of ℝ.
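This failure can be made concrete with a small sketch (Python; the choice of 𝜀, 𝛿 and the sample points is arbitrary): for a fixed distance 𝛿 between the arguments, the difference |𝑓 (𝑥) − 𝑓 (𝑦)| grows without bound as the points move out, so no single 𝛿 works for all 𝑥.

    # f(x) = x^2: for fixed delta, |f(x + delta) - f(x)| = (2x + delta) * delta,
    # which exceeds any given eps once x is large enough, so f is not uniformly
    # continuous on R, although it is uniformly continuous on bounded sets.
    def f(x):
        return x * x

    eps, delta = 1.0, 1e-3
    for x in [1.0, 10.0, 100.0, 1000.0]:
        gap = abs(f(x + delta) - f(x))
        print(f"x = {x:7.1f}   |f(x+delta)-f(x)| = {gap:.4f}")
    # the gap exceeds eps as soon as 2x + delta > eps/delta, here around x = 500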
We next show that continuous functions on compact metric spaces are automatically
uniformly continuous. The direct proof based on standard properties of continuous
functions is taken from [6].

5.9 Theorem (Uniform continuity) Let 𝑋, 𝑌 be complete metric spaces. If 𝑋 is com-


pact, then every function 𝑓 ∈ 𝐶(𝑋, 𝑌 ) is uniformly continuous.
Proof. First note that the function 𝐹 ∶ 𝑋 × 𝑋 → ℝ given by 𝐹 (𝑥, 𝑦) ∶= 𝑑𝑌 (𝑓 (𝑥), 𝑓 (𝑦))
is continuous with respect to the product metric on 𝑋 × 𝑋. Fix 𝜀 > 0 and consider the
inverse image

𝐴𝜀 ∶= 𝐹^{−1}[ [𝜀, ∞) ] ∶= { (𝑥, 𝑦) ∈ 𝑋 × 𝑋 ∶ 𝐹 (𝑥, 𝑦) ≥ 𝜀 }.

As 𝐹 is continuous and [𝜀, ∞) is closed, 𝐴𝜀 is closed by Theorem 5.3. By Proposi-


tion 4.8 𝑋 × 𝑋 is compact and so by Proposition 4.7 𝐴𝜀 is compact. Assume now that
𝐴𝜀 ≠ ∅. Because the real valued function (𝑥, 𝑦) → 𝑑𝑋 (𝑥, 𝑦) is continuous on 𝑋 × 𝑋 it
attains a minimum at some point (𝑥0 , 𝑦0 ) ∈ 𝐴𝜀 ; see Theorem 5.7. In particular,

𝛿 ∶= 𝑑𝑋 (𝑥0 , 𝑦0 ) ≤ 𝑑𝑋 (𝑥, 𝑦)

for all (𝑥, 𝑦) ∈ 𝐴𝜀 . We have 𝛿 > 0 as otherwise 𝑥0 = 𝑦0 and hence (𝑥0 , 𝑦0 ) ∉ 𝐴𝜀 by


definition of 𝐴𝜀 . Furthermore, if 𝑑𝑋 (𝑥, 𝑦) < 𝛿, then (𝑥, 𝑦) ∈ 𝐴𝑐𝜀 and therefore
𝑑𝑋 (𝑥, 𝑦) < 𝛿 ⟹ 𝑑𝑌 (𝑓 (𝑥), 𝑓 (𝑦)) = 𝐹 (𝑥, 𝑦) < 𝜀. (5.1)

This is exactly what is required for uniform continuity. If 𝐴𝜀 = ∅, then (5.1) holds for
every 𝛿 > 0. As the arguments work for every choice of 𝜀 > 0 this proves the uniform
continuity of 𝑓 .

One could give an alternative proof of the above theorem using the covering property
of compact sets, or a contradiction proof based on the sequential compactness.

Chapter II
Banach Spaces

The purpose of this chapter is to introduce a class of vector spaces modelled on ℝ𝑁 .


Besides the algebraic properties we have a “norm” on ℝ𝑁 allowing us to measure dis-
tances between points. We generalise the concept of a norm to general vector spaces and
prove some properties of these “normed spaces.” We will see that all finite dimensional
normed spaces are essentially ℝ𝑁 or ℂ𝑁 . The situation becomes more complicated if
the spaces are infinite dimensional. Functional analysis mainly deals with infinite di-
mensional vector spaces.

6 Normed Spaces
We consider a class of vector spaces with an additional topological structure. The
underlying field is always ℝ or ℂ. Most of the theory is developed simultaneously for
vector spaces over the two fields. Throughout, 𝕂 will be one of the two fields.

6.1 Definition (Normed space) Let 𝐸 be a vector space. A map 𝐸 → ℝ, 𝑥 → ‖𝑥‖ is


called a norm on 𝐸 if
(i) ‖𝑥‖ ≥ 0 for all 𝑥 ∈ 𝐸 and ‖𝑥‖ = 0 if and only if 𝑥 = 0;
(ii) ‖𝛼𝑥‖ = |𝛼|‖𝑥‖ for all 𝑥 ∈ 𝐸 and 𝛼 ∈ 𝕂;
(iii) ‖𝑥 + 𝑦‖ ≤ ‖𝑥‖ + ‖𝑦‖ for all 𝑥, 𝑦 ∈ 𝐸 (triangle inequality).
We call (𝐸, ‖⋅‖) or simply 𝐸 a normed space.
There is a useful consequence of the above definition.

6.2 Proposition (Reversed triangle inequality) Let (𝐸, ‖⋅‖) be a normed space. Then,
for all 𝑥, 𝑦 ∈ 𝐸,
‖𝑥 − 𝑦‖ ≥ | ‖𝑥‖ − ‖𝑦‖ |.
Proof. By the triangle inequality ‖𝑥‖ = ‖𝑥 − 𝑦 + 𝑦‖ ≤ ‖𝑥 − 𝑦‖ + ‖𝑦‖, so ‖𝑥 − 𝑦‖ ≥
‖𝑥‖ − ‖𝑦‖. Interchanging the roles of 𝑥 and 𝑦 and applying (ii) we get ‖𝑥 − 𝑦‖ =
| − 1|‖(−1)(𝑥 − 𝑦)‖ = ‖𝑦 − 𝑥‖ ≥ ‖𝑦‖ − ‖𝑥‖. Combining the two inequalities, the
assertion of the proposition follows.

6.3 Lemma Let (𝐸, ‖⋅‖) be a normed space and define

𝑑(𝑥, 𝑦) ∶= ‖𝑥 − 𝑦‖

for all 𝑥, 𝑦 ∈ 𝐸. Then (𝐸, 𝑑) is a metric space.


Proof. By (i) 𝑑(𝑥, 𝑦) = ‖𝑥 − 𝑦‖ ≥ 0 for all 𝑥, 𝑦 ∈ 𝐸 and 𝑑(𝑥, 𝑦) = ‖𝑥 − 𝑦‖ = 0 if and
only if 𝑥 − 𝑦 = 0, that is, 𝑥 = 𝑦. By (ii) we have 𝑑(𝑥, 𝑦) = ‖𝑥 − 𝑦‖ = ‖(−1)(𝑦 − 𝑥)‖ =
| − 1|‖𝑦 − 𝑥‖ = ‖𝑦 − 𝑥‖ = 𝑑(𝑦, 𝑥) for all 𝑥, 𝑦 ∈ 𝐸. Finally, for 𝑥, 𝑦, 𝑧 it follows from
(iii) that 𝑑(𝑥, 𝑦) = ‖𝑥 − 𝑦‖ = ‖𝑥 − 𝑧 + 𝑧 − 𝑦‖ ≤ ‖𝑥 − 𝑧‖ + ‖𝑧 − 𝑦‖ = 𝑑(𝑥, 𝑧) + 𝑑(𝑧, 𝑦),
proving that 𝑑(⋅ , ⋅) satisfies the axioms of a metric (see Definition 2.1).

We will always equip a normed space with the topology of 𝐸 generated by the metric
induced by the norm. Hence it makes sense to talk about continuity of functions. It
turns out that the topology is compatible with the vector space structure as the following
theorem shows.

6.4 Theorem Given a normed space (𝐸, ‖⋅‖), the following maps are continuous (with
respect to the product topologies).

(1) 𝐸 → ℝ, 𝑥 → ‖𝑥‖ (continuity of the norm);

(2) 𝐸 × 𝐸 → 𝐸, (𝑥, 𝑦) → 𝑥 + 𝑦 (continuity of addition);

(3) 𝕂 × 𝐸 → 𝐸, (𝛼, 𝑥) → 𝛼𝑥 (continuity of multiplication by scalars).


Proof. (1) By the reversed triangle inequality | ‖𝑥‖ − ‖𝑦‖ | ≤ ‖𝑥 − 𝑦‖, implying that
‖𝑥‖ → ‖𝑦‖ as 𝑥 → 𝑦 (that is, as 𝑑(𝑥, 𝑦) = ‖𝑥 − 𝑦‖ → 0). Hence the norm is continuous
as a map from 𝐸 to ℝ.
(2) If 𝑥, 𝑦, 𝑎, 𝑏 ∈ 𝐸 then ‖(𝑥+𝑦)−(𝑎+𝑏)‖ = ‖(𝑥−𝑎)+(𝑦−𝑏)‖ ≤ ‖𝑥−𝑎‖+‖𝑦−𝑏‖ →
0 as 𝑥 → 𝑎 and 𝑦 → 𝑏, showing that 𝑥 + 𝑦 → 𝑎 + 𝑏 as 𝑥 → 𝑎 and 𝑦 → 𝑏. This proves
the continuity of the addition.
(3) For 𝛼, 𝜉 ∈ 𝕂 and 𝑎, 𝑥 ∈ 𝐸 we have, by using the properties of a norm,

‖𝜉𝑥 − 𝛼𝑎‖ = ‖𝜉(𝑥 − 𝑎) + (𝜉 − 𝛼)𝑎‖ ≤ ‖𝜉(𝑥 − 𝑎)‖ + ‖(𝜉 − 𝛼)𝑎‖ = |𝜉| ‖𝑥 − 𝑎‖ + |𝜉 − 𝛼| ‖𝑎‖.

As the last expression goes to zero as 𝑥 → 𝑎 in 𝐸 and 𝜉 → 𝛼 in 𝕂 we have also proved


the continuity of the multiplication by scalars.

When looking at metric spaces we discussed a rather important class of metric


spaces, namely complete spaces. Similarly, complete spaces play a special role in func-
tional analysis.

6.5 Definition (Banach space) A normed space which is complete with respect to the
metric induced by the norm is called a Banach space.

6.6 Example The simplest example of a Banach space is ℝ𝑁 or ℂ𝑁 with the Euclidean
norm.

We give a characterisation of Banach spaces in terms of properties of series. We recall
the following definition.
6.7 Definition (absolute convergence) A series ∑_{𝑘=0}^{∞} 𝑎𝑘 in 𝐸 is called absolutely con-
vergent if ∑_{𝑘=0}^{∞} ‖𝑎𝑘 ‖ converges.

6.8 Theorem A normed space 𝐸 is complete if and only if every absolutely convergent
series in 𝐸 converges.
Proof. Suppose that 𝐸 is complete. Let ∑_{𝑘=0}^{∞} 𝑎𝑘 be an absolutely convergent series in 𝐸,
that is, ∑_{𝑘=0}^{∞} ‖𝑎𝑘 ‖ converges. By the Cauchy criterion for the convergence of a series
in ℝ, for every 𝜀 > 0 there exists 𝑛0 ∈ ℕ such that

∑_{𝑘=𝑚+1}^{𝑛} ‖𝑎𝑘 ‖ < 𝜀

for all 𝑛 > 𝑚 > 𝑛0 . (This simply means that the sequence of partial sums ∑_{𝑘=0}^{𝑛} ‖𝑎𝑘 ‖,
𝑛 ∈ ℕ, is a Cauchy sequence.) Therefore, by the triangle inequality

‖ ∑_{𝑘=0}^{𝑛} 𝑎𝑘 − ∑_{𝑘=0}^{𝑚} 𝑎𝑘 ‖ = ‖ ∑_{𝑘=𝑚+1}^{𝑛} 𝑎𝑘 ‖ ≤ ∑_{𝑘=𝑚+1}^{𝑛} ‖𝑎𝑘 ‖ < 𝜀

for all 𝑛 > 𝑚 > 𝑛0 . Hence the sequence of partial sums ∑_{𝑘=0}^{𝑛} 𝑎𝑘 , 𝑛 ∈ ℕ, is a Cauchy
sequence in 𝐸. Since 𝐸 is complete,



∑_{𝑘=0}^{∞} 𝑎𝑘 = lim_{𝑛→∞} ∑_{𝑘=0}^{𝑛} 𝑎𝑘

exists. Hence if 𝐸 is complete, every absolutely convergent series converges in 𝐸. Next


assume that every absolutely convergent series converges. We have to show that every
Cauchy sequence (𝑥𝑛 ) in 𝐸 converges. By Theorem 3.9 it is sufficient to show that (𝑥𝑛 )
has a convergent subsequence. Since (𝑥𝑛 ) is a Cauchy sequence, for every 𝑘 ∈ ℕ there
exists 𝑚𝑘 ∈ ℕ such that ‖𝑥𝑛 − 𝑥𝑚 ‖ < 2−𝑘 for all 𝑛, 𝑚 > 𝑚𝑘 . Now set 𝑛1 ∶= 𝑚1 + 1.
Then inductively choose 𝑛𝑘 such that 𝑛𝑘+1 > 𝑛𝑘 > 𝑚𝑘 for all 𝑘 ∈ ℕ. Then by the above
‖𝑥𝑛𝑘+1 − 𝑥𝑛𝑘 ‖ < 1∕2^{𝑘}

for all 𝑘 ∈ ℕ. Now observe that

𝑥𝑛𝑘 − 𝑥𝑛1 = ∑_{𝑗=1}^{𝑘−1} (𝑥𝑛𝑗+1 − 𝑥𝑛𝑗 )

for all 𝑘 ∈ ℕ. Hence (𝑥𝑛𝑘 ) converges if and only if the series ∑_{𝑗=1}^{∞} (𝑥𝑛𝑗+1 − 𝑥𝑛𝑗 ) converges.
By choice of 𝑛𝑘 we have


∑_{𝑗=1}^{∞} ‖𝑥𝑛𝑗+1 − 𝑥𝑛𝑗 ‖ ≤ ∑_{𝑗=1}^{∞} 1∕2^{𝑗} = 1 < ∞.

Hence ∑_{𝑗=1}^{∞} (𝑥𝑛𝑗+1 − 𝑥𝑛𝑗 ) is absolutely convergent. By assumption every absolutely con-
vergent series converges and therefore (𝑥𝑛𝑘 ) converges, completing the proof of the
theorem.

7 Examples of Banach Spaces
In this section we give examples of Banach spaces. They will be used throughout the
course. We start by elementary inequalities and a family of norms in 𝕂𝑁 . They serve
as a model for more general spaces of sequences or functions.

7.1 Elementary Inequalities


In this section we discuss inequalities arising when looking at a family of norms on
𝕂𝑁 . For 1 ≤ 𝑝 ≤ ∞ we define the 𝑝-norms of 𝑥 ∶= (𝑥1 , … , 𝑥𝑁 ) ∈ ℝ𝑁 by

|𝑥|𝑝 ∶= ( ∑_{𝑖=1}^{𝑁} |𝑥𝑖 |^𝑝 )^{1∕𝑝}  if 1 ≤ 𝑝 < ∞,  and  |𝑥|∞ ∶= max_{𝑖=1,…,𝑁} |𝑥𝑖 |  if 𝑝 = ∞. (7.1)

At this stage we do not know whether |⋅|𝑝 is a norm. We now prove that the 𝑝-norms are
norms, and derive some relationships between them. First we need Young’s inequality.

7.1 Lemma (Young’s inequality) Let 𝑝, 𝑝′ ∈ (1, ∞) such that


1∕𝑝 + 1∕𝑝′ = 1. (7.2)

Then

𝑎𝑏 ≤ (1∕𝑝) 𝑎^𝑝 + (1∕𝑝′) 𝑏^{𝑝′}
for all 𝑎, 𝑏 ≥ 0.
Proof. The inequality is obvious if 𝑎 = 0 or 𝑏 = 0, so we assume that 𝑎, 𝑏 > 0. As ln
is concave we get from (7.2) that
ln(𝑎𝑏) = ln 𝑎 + ln 𝑏 = (1∕𝑝) ln 𝑎^𝑝 + (1∕𝑝′) ln 𝑏^{𝑝′} ≤ ln( (1∕𝑝) 𝑎^𝑝 + (1∕𝑝′) 𝑏^{𝑝′} ).

Hence, as exp is increasing we have

𝑎𝑏 = exp(ln(𝑎𝑏)) ≤ exp( ln( (1∕𝑝) 𝑎^𝑝 + (1∕𝑝′) 𝑏^{𝑝′} ) ) = (1∕𝑝) 𝑎^𝑝 + (1∕𝑝′) 𝑏^{𝑝′} ,
proving our claim.
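A quick numerical spot check of Young's inequality (a Python sketch; the exponent and the sampling ranges are arbitrary choices, not part of the notes):

    import random

    random.seed(2)
    p = 3.0
    q = p / (p - 1)          # the dual exponent p', so 1/p + 1/q = 1
    for _ in range(10000):
        a, b = random.uniform(0, 10), random.uniform(0, 10)
        assert a * b <= a**p / p + b**q / q + 1e-9   # Young's inequality
    print("Young's inequality holds on all samples")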

The relationship (7.2) is rather important and appears very often. We say that 𝑝′ is the
exponent dual to 𝑝. If 𝑝 = 1 we set 𝑝′ ∶= ∞ and if 𝑝 = ∞ we set 𝑝′ ∶= 1.
From Young’s inequality we get Hölder’s inequality.

7.2 Proposition Let 1 ≤ 𝑝 ≤ ∞ and 𝑝′ the exponent dual to 𝑝. Then for all 𝑥, 𝑦 ∈ 𝕂𝑁


∑_{𝑖=1}^{𝑁} |𝑥𝑖 | |𝑦𝑖 | ≤ |𝑥|𝑝 |𝑦|𝑝′ .

Proof. If 𝑥 = 0 or 𝑦 = 0 then the inequality is obvious. Also if 𝑝 = 1 or 𝑝 = ∞ then
the inequality is also rather obvious. Hence assume that 𝑥, 𝑦 ≠ 0 and 1 < 𝑝 < ∞. By
Young’s inequality (Lemma 7.1) we have
( 1∕(|𝑥|𝑝 |𝑦|𝑝′ ) ) ∑_{𝑖=1}^{𝑁} |𝑥𝑖 | |𝑦𝑖 | = ∑_{𝑖=1}^{𝑁} ( |𝑥𝑖 |∕|𝑥|𝑝 ) ( |𝑦𝑖 |∕|𝑦|𝑝′ )
  ≤ ∑_{𝑖=1}^{𝑁} ( (1∕𝑝) ( |𝑥𝑖 |∕|𝑥|𝑝 )^𝑝 + (1∕𝑝′) ( |𝑦𝑖 |∕|𝑦|𝑝′ )^{𝑝′} )
  = (1∕𝑝) ( 1∕|𝑥|𝑝^𝑝 ) ∑_{𝑖=1}^{𝑁} |𝑥𝑖 |^𝑝 + (1∕𝑝′) ( 1∕|𝑦|𝑝′^{𝑝′} ) ∑_{𝑖=1}^{𝑁} |𝑦𝑖 |^{𝑝′}
  = (1∕𝑝) |𝑥|𝑝^𝑝 ∕ |𝑥|𝑝^𝑝 + (1∕𝑝′) |𝑦|𝑝′^{𝑝′} ∕ |𝑦|𝑝′^{𝑝′} = 1∕𝑝 + 1∕𝑝′ = 1,
from which the required inequality readily follows.
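Hölder's inequality can likewise be tested numerically. The following sketch (Python; the dimension, exponent and random real vectors are arbitrary choices made only for illustration) compares both sides on random data:

    import random

    def p_norm(x, p):
        return sum(abs(t)**p for t in x) ** (1 / p)

    random.seed(3)
    N, p = 5, 2.5
    q = p / (p - 1)                              # dual exponent p'
    for _ in range(1000):
        x = [random.uniform(-1, 1) for _ in range(N)]
        y = [random.uniform(-1, 1) for _ in range(N)]
        lhs = sum(abs(a) * abs(b) for a, b in zip(x, y))
        assert lhs <= p_norm(x, p) * p_norm(y, q) + 1e-12
    print("Hoelder's inequality holds on all samples")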

Now we prove the main properties of the 𝑝-norms.

7.3 Theorem Let 1 ≤ 𝑝 ≤ ∞. Then |⋅|𝑝 is a norm on 𝕂𝑁 . Moreover, if 1 ≤ 𝑝 ≤ 𝑞 ≤ ∞


then

|𝑥|𝑞 ≤ |𝑥|𝑝 ≤ 𝑁^{(𝑞−𝑝)∕(𝑝𝑞)} |𝑥|𝑞 (7.3)
for all 𝑥 ∈ 𝕂𝑁 . (We set (𝑞 − 𝑝)∕𝑝𝑞 ∶= 1∕𝑝 if 𝑞 = ∞.) Finally, the above inequalities
are optimal.
Proof. We first prove that |⋅|𝑝 is a norm. The cases 𝑝 = 1, ∞ are easy, and left to the
reader. Hence assume that 1 < 𝑝 < ∞. By definition |𝑥|𝑝 ≥ 0 and |𝑥|𝑝 = 0 if and only
if |𝑥𝑖 | = 0 for all 𝑖 = 1, … , 𝑁, that is, if 𝑥 = 0. Also, if 𝛼 ∈ 𝕂, then

|𝛼𝑥|𝑝 = ( ∑_{𝑖=1}^{𝑁} |𝛼𝑥𝑖 |^𝑝 )^{1∕𝑝} = ( |𝛼|^𝑝 ∑_{𝑖=1}^{𝑁} |𝑥𝑖 |^𝑝 )^{1∕𝑝} = |𝛼| |𝑥|𝑝 .

Thus it remains to prove the triangle inequality. For 𝑥, 𝑦 ∈ 𝕂𝑁 we have, using Hölder’s
inequality (Proposition 7.2), that


𝑁

𝑁
|𝑥 + 𝑦|𝑝𝑝 = |𝑥𝑖 + 𝑦𝑖 | =𝑝
|𝑥𝑖 + 𝑦𝑖 ||𝑥𝑖 + 𝑦𝑖 |𝑝−1
𝑖=1 𝑖=1

∑𝑁

𝑁
≤ |𝑥𝑖 ||𝑥𝑖 + 𝑦𝑖 | 𝑝−1
+ |𝑦𝑖 ||𝑥𝑖 + 𝑦𝑖 |𝑝−1
𝑖=1 𝑖=1
(∑
𝑁 )1∕𝑝′ (∑
𝑁 )1∕𝑝′
(𝑝−1)𝑝′ (𝑝−1)𝑝′
≤ |𝑥|𝑝 |𝑥𝑖 + 𝑦𝑖 | + |𝑦|𝑝 |𝑥𝑖 + 𝑦𝑖 |
𝑖=1 𝑖=1

( )(∑
𝑁
(𝑝−1)𝑝′
)1∕𝑝′
= |𝑥|𝑝 + |𝑦|𝑝 |𝑥𝑖 + 𝑦𝑖 | .
𝑖=1

Now, observe that
𝑝′ = ( 1 − 1∕𝑝 )^{−1} = 𝑝∕(𝑝 − 1),

so we get from the above that

|𝑥 + 𝑦|𝑝^𝑝 ≤ ( |𝑥|𝑝 + |𝑦|𝑝 ) ( ∑_{𝑖=1}^{𝑁} |𝑥𝑖 + 𝑦𝑖 |^𝑝 )^{(𝑝−1)∕𝑝} = ( |𝑥|𝑝 + |𝑦|𝑝 ) |𝑥 + 𝑦|𝑝^{𝑝−1} .

Hence if 𝑥 + 𝑦 ≠ 0 we get the triangle inequality |𝑥 + 𝑦|𝑝 ≤ |𝑥|𝑝 + |𝑦|𝑝 . The inequality
is obvious if 𝑥 + 𝑦 = 0, so |⋅|𝑝 is a norm on 𝕂𝑁 .
Next we show the first inequality in (7.3). First let 𝑝 < 𝑞 = ∞. If 𝑥 ∈ 𝕂𝑁 we pick
the component 𝑥𝑗 of 𝑥 such that |𝑥𝑗 | = |𝑥|∞ . Hence |𝑥|∞ = |𝑥𝑗 | = ( |𝑥𝑗 |^𝑝 )^{1∕𝑝} ≤ |𝑥|𝑝 ,
proving the first inequality in case 𝑞 = ∞. Assume now that 1 ≤ 𝑝 ≤ 𝑞 < ∞. If 𝑥 ≠ 0
then |𝑥𝑖 |∕|𝑥|𝑝 ≤ 1. Hence, as 1 ≤ 𝑝 ≤ 𝑞 < ∞ we have

( |𝑥𝑖 |∕|𝑥|𝑝 )^𝑞 ≤ ( |𝑥𝑖 |∕|𝑥|𝑝 )^𝑝

for all 𝑥 ∈ 𝕂𝑁 ⧵ {0}. Therefore,

|𝑥|𝑞^𝑞 ∕ |𝑥|𝑝^𝑞 = ∑_{𝑖=1}^{𝑁} ( |𝑥𝑖 |∕|𝑥|𝑝 )^𝑞 ≤ ∑_{𝑖=1}^{𝑁} ( |𝑥𝑖 |∕|𝑥|𝑝 )^𝑝 = |𝑥|𝑝^𝑝 ∕ |𝑥|𝑝^𝑝 = 1

for all 𝑥 ∈ 𝕂𝑁 ⧵ {0}. Hence |𝑥|𝑞 ≤ |𝑥|𝑝 for all 𝑥 ≠ 0. For 𝑥 = 0 the inequality is
trivial. To prove the second inequality in (7.3) assume that 1 ≤ 𝑝 < 𝑞 < ∞. We define
𝑠 ∶= 𝑞∕𝑝. The corresponding dual exponent 𝑠′ is given by
𝑠′ = 𝑠∕(𝑠 − 1) = 𝑞∕(𝑞 − 𝑝).

Applying Hölder’s inequality we get

|𝑥|𝑝^𝑝 = ∑_{𝑖=1}^{𝑁} |𝑥𝑖 |^𝑝 ⋅ 1 ≤ ( ∑_{𝑖=1}^{𝑁} |𝑥𝑖 |^{𝑝𝑠} )^{1∕𝑠} ( ∑_{𝑖=1}^{𝑁} 1^{𝑠′} )^{1∕𝑠′} = 𝑁^{(𝑞−𝑝)∕𝑞} ( ∑_{𝑖=1}^{𝑁} |𝑥𝑖 |^𝑞 )^{𝑝∕𝑞} = 𝑁^{(𝑞−𝑝)∕𝑞} |𝑥|𝑞^𝑝

for all 𝑥 ∈ 𝕂𝑁 , from which the second inequality in (7.3) follows. If 1 ≤ 𝑝 < 𝑞 = ∞
and 𝑥 ∈ 𝕂𝑁 is given we pick 𝑥𝑗 such that |𝑥𝑗 | = |𝑥|∞ . Then
|𝑥|𝑝 = ( ∑_{𝑖=1}^{𝑁} |𝑥𝑖 |^𝑝 )^{1∕𝑝} ≤ ( ∑_{𝑖=1}^{𝑁} |𝑥𝑗 |^𝑝 )^{1∕𝑝} = 𝑁^{1∕𝑝} |𝑥𝑗 | = 𝑁^{1∕𝑝} |𝑥|∞ ,

covering the last case.


We finally show that (7.3) is optimal. For the first inequality look at the standard
basis of 𝕂𝑁 , for which we have equality. For the second choose 𝑥 = (1, 1, … , 1) and
observe that |𝑥|𝑝 = 𝑁^{1∕𝑝} . Hence

𝑁^{(𝑞−𝑝)∕(𝑝𝑞)} |𝑥|𝑞 = 𝑁^{(𝑞−𝑝)∕(𝑝𝑞)} 𝑁^{1∕𝑞} = 𝑁^{1∕𝑝} = |𝑥|𝑝 .

Hence we cannot decrease the constant 𝑁^{(𝑞−𝑝)∕(𝑝𝑞)} in the inequality.
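The two-sided estimate (7.3), including the sharpness of the constant, is also easy to confirm numerically. The following sketch (Python; the dimension and exponents are arbitrary choices) checks (7.3) on random vectors and verifies equality on the right for 𝑥 = (1, … , 1):

    import random

    def p_norm(x, p):
        if p == float("inf"):
            return max(abs(t) for t in x)
        return sum(abs(t)**p for t in x) ** (1 / p)

    random.seed(4)
    N, p, q = 7, 1.5, 4.0
    C = N ** ((q - p) / (p * q))                 # the constant in (7.3)
    for _ in range(1000):
        x = [random.uniform(-1, 1) for _ in range(N)]
        assert p_norm(x, q) <= p_norm(x, p) + 1e-12 <= C * p_norm(x, q) + 1e-9
    # equality on the right for x = (1, ..., 1), on the left for a basis vector
    x = [1.0] * N
    print(p_norm(x, p), C * p_norm(x, q))        # both equal N**(1/p)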

7.2 Spaces of Sequences
Here we discuss spaces of sequences. As most of you have seen this in Metric Spaces
the exposition will be rather brief.
Denote by  the space of all sequences in 𝕂, that is, the space of all functions from
ℕ into 𝕂. We denote its elements by 𝑥 = (𝑥0 , 𝑥1 , 𝑥2 , … ) = (𝑥𝑖 ). We define vector space
operations “component” wise:

• (𝑥𝑖 ) + (𝑦𝑖 ) ∶= (𝑥𝑖 + 𝑦𝑖 ) for all (𝑥𝑖 ), (𝑦𝑖 ) ∈ ;

• 𝛼(𝑥𝑖 ) ∶= (𝛼𝑥𝑖 ) for all 𝛼 ∈ 𝕂 and (𝑥𝑖 ) ∈ .

With these operations  becomes a vector space. Given (𝑥𝑖 ) ∈  and 1 ≤ 𝑝 ≤ ∞ we


define the “𝑝-norms”

|(𝑥𝑖 )|𝑝 ∶= ( ∑_{𝑖=1}^{∞} |𝑥𝑖 |^𝑝 )^{1∕𝑝}  if 1 ≤ 𝑝 < ∞,  and  |(𝑥𝑖 )|∞ ∶= sup_{𝑖∈ℕ} |𝑥𝑖 |  if 𝑝 = ∞. (7.4)

These 𝑝-norms are not finite for all sequences. We define some subspaces of  in the
following way:
• 𝓁𝑝 ∶= 𝓁𝑝 (𝕂) ∶= {𝑥 ∈  ∶ |𝑥|𝑝 < ∞} (1 ≤ 𝑝 ≤ ∞);

• 𝑐0 ∶= 𝑐0 (𝕂) ∶= {(𝑥𝑖 ) ∈  ∶ lim_{𝑖→∞} |𝑥𝑖 | = 0}.
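As a numerical illustration of these spaces and of the strict inclusions proved below (a Python sketch; the truncation at 10^6 terms is an arbitrary choice), the harmonic sequence (1∕𝑖) tends to 0 and has summable squares, so it lies in 𝑐0 and in 𝓁2 , while its partial 1-norms keep growing, reflecting the fact that (1∕𝑖) ∉ 𝓁1 :

    # Partial sums of |x_i|^p for x = (1/i): bounded for p = 2, divergent for p = 1.
    terms = [1.0 / i for i in range(1, 10**6 + 1)]

    partial_1 = sum(terms)                     # grows like log(n), unbounded
    partial_2 = sum(t * t for t in terms)      # approaches pi^2 / 6

    print("sum of 1/i    up to 10^6:", round(partial_1, 3))     # about 14.39
    print("sum of 1/i^2  up to 10^6:", round(partial_2, 6))     # about 1.644933
    print("last term:", terms[-1])             # tends to 0, so (1/i) is in c_0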

The 𝑝-norms for sequences have similar properties as the 𝑝-norms in 𝕂𝑁 . In fact most
properties follow from the finite version given in Section 7.1.

7.4 Proposition (Hölder’s inequality) For 1 ≤ 𝑝 ≤ ∞, 𝑥 ∈ 𝓁𝑝 and 𝑦 ∈ 𝓁𝑝′ we have



∑_{𝑖=1}^{∞} |𝑥𝑖 | |𝑦𝑖 | ≤ |𝑥|𝑝 |𝑦|𝑝′ ,

where 𝑝′ is the exponent dual to 𝑝 defined by (7.2).


Proof. Apply Proposition 7.2 to partial sums and then pass to the limit.

7.5 Theorem Let 1 ≤ 𝑝 ≤ ∞. Then (𝓁𝑝 , |⋅|𝑝 ) and (𝑐0 , |⋅|∞ ) are Banach spaces. More-
over, if 1 < 𝑝 < 𝑞 < ∞ then

𝓁1 ⊊ 𝓁𝑝 ⊊ 𝓁𝑞 ⊊ 𝑐0 ⊊ 𝓁∞ .

Finally, |𝑥|𝑞 ≤ |𝑥|𝑝 for all 𝑥 ∈ 𝓁𝑝 if 1 ≤ 𝑝 < 𝑞 ≤ ∞.


Proof. It readily follows that |⋅|𝑝 is a norm by passing to the limit from the finite
dimensional case in Theorem 7.3. In particular it follows that 𝓁𝑝 , 𝑐0 are subspaces of

. If |(𝑥𝑖 )|𝑝 < ∞ for some (𝑥𝑖 ) ∈  and 𝑝 < ∞ then we must have |𝑥𝑖 | → 0. Hence
𝓁𝑝 ⊂ 𝑐0 for all 1 ≤ 𝑝 < ∞. Clearly 𝑐0 ⊊ 𝓁∞ . If 1 ≤ 𝑝 < 𝑞 < ∞ then by Theorem 7.3
sup_{𝑖=1,…,𝑛} |𝑥𝑖 | ≤ ( ∑_{𝑖=1}^{𝑛} |𝑥𝑖 |^𝑞 )^{1∕𝑞} ≤ ( ∑_{𝑖=1}^{𝑛} |𝑥𝑖 |^𝑝 )^{1∕𝑝} .

Passing to the limit on the right hand side and then taking the supremum on the left
hand side we get
|(𝑥𝑖 )|∞ ≤ |(𝑥𝑖 )|𝑞 ≤ |(𝑥𝑖 )|𝑝 ,
proving the inclusions and the inequalities. To show that the inclusions are proper we
use the harmonic series
∑_{𝑖=1}^{∞} 1∕𝑖 = ∞.

Clearly (1∕𝑖) ∈ 𝑐0 but not in 𝓁1 . Similarly, (1∕𝑖^{1∕𝑝} ) ∈ 𝓁𝑞 for 𝑞 > 𝑝 but not in 𝓁𝑝 .
We finally prove completeness. Suppose that (𝑥𝑛 ) is a Cauchy sequence in 𝓁𝑝 . Then
by definition of the 𝑝-norm

|𝑥𝑖𝑛 − 𝑥𝑖𝑚 | ≤ |𝑥𝑛 − 𝑥𝑚 |𝑝

for all 𝑖, 𝑚, 𝑛 ∈ ℕ. It follows that (𝑥𝑖𝑛 ) is a Cauchy sequence in 𝕂 for every 𝑖 ∈ ℕ. Since
𝕂 is complete
𝑥𝑖 ∶= lim 𝑥𝑖𝑛 (7.5)
𝑛→∞

exists for all 𝑖 ∈ ℕ. We set 𝑥 ∶= (𝑥𝑖 ). We need to show that 𝑥𝑛 → 𝑥 in 𝓁𝑝 , that is, with
respect to the 𝓁𝑝 -norm. Let 𝜀 > 0 be given. By assumption there exists 𝑛0 ∈ ℕ such
that for all 𝑛, 𝑚 > 𝑛0
( ∑_{𝑖=1}^{𝑁} |𝑥𝑖𝑛 − 𝑥𝑖𝑚 |^𝑝 )^{1∕𝑝} ≤ |𝑥𝑛 − 𝑥𝑚 |𝑝 < 𝜀∕2

if 1 ≤ 𝑝 < ∞ and

max_{𝑖=1,…,𝑁} |𝑥𝑖𝑛 − 𝑥𝑖𝑚 | ≤ |𝑥𝑛 − 𝑥𝑚 |∞ < 𝜀∕2
if 𝑝 = ∞. For fixed 𝑁 ∈ ℕ we can let 𝑚 → ∞, so by (7.5) and the continuity of the
absolute value, for all 𝑁 ∈ ℕ and 𝑛 > 𝑛0
( ∑_{𝑖=1}^{𝑁} |𝑥𝑖𝑛 − 𝑥𝑖 |^𝑝 )^{1∕𝑝} ≤ 𝜀∕2

if 1 ≤ 𝑝 < ∞ and

max_{𝑖=1,…,𝑁} |𝑥𝑖𝑛 − 𝑥𝑖 | ≤ 𝜀∕2

if 𝑝 = ∞. Letting 𝑁 → ∞ we finally get

|𝑥𝑛 − 𝑥|𝑝 ≤ 𝜀∕2 < 𝜀
for all 𝑛 > 𝑛0 . Since the above works for every 𝜀 > 0 it follows that 𝑥𝑛 − 𝑥 → 0
in 𝓁𝑝 as 𝑛 → ∞. Finally, as 𝓁𝑝 is a vector space and 𝑥𝑛0 , 𝑥𝑛0 − 𝑥 ∈ 𝓁𝑝 we have

𝑥 = 𝑥𝑛0 − (𝑥𝑛0 − 𝑥) ∈ 𝓁𝑝 . Hence 𝑥𝑛 → 𝑥 in 𝓁𝑝 , showing that 𝓁𝑝 is complete for
1 ≤ 𝑝 ≤ ∞. To show that 𝑐0 is complete we need to show that 𝑥 ∈ 𝑐0 if 𝑥𝑛 ∈ 𝑐0 for all
𝑛 ∈ ℕ. We know that 𝑥𝑛 → 𝑥 in 𝓁∞ . Hence, given 𝜀 > 0 there exists 𝑛0 ∈ ℕ such that
|𝑥𝑛 − 𝑥|∞ < 𝜀∕2 for all 𝑛 ≥ 𝑛0 . Therefore
|𝑥𝑖 | ≤ |𝑥𝑖 − 𝑥𝑖𝑛0 | + |𝑥𝑖𝑛0 | ≤ |𝑥 − 𝑥𝑛0 |∞ + |𝑥𝑖𝑛0 | < 𝜀∕2 + |𝑥𝑖𝑛0 |
for all 𝑖 ∈ ℕ. Since 𝑥𝑛0 ∈ 𝑐0 there exists 𝑖0 ∈ ℕ such that |𝑥𝑖𝑛0 | < 𝜀∕2 for all 𝑖 > 𝑖0 .
Hence |𝑥𝑖 | ≤ 𝜀∕2 + 𝜀∕2 = 𝜀 for all 𝑖 > 𝑖0 , so 𝑥 ∈ 𝑐0 as claimed. This completes the
proof of completeness of 𝓁𝑝 and 𝑐0 .

7.6 Remark The proof of completeness in many cases follows similar steps as the one
above.
(1) Take an arbitrary Cauchy sequence (𝑥𝑛 ) in the normed space 𝐸 and show that (𝑥𝑛 )
converges not in the norm of 𝐸, but in some weaker sense to some 𝑥. (In the above
proof it is, “component-wise” convergence, that is 𝑥𝑖𝑛 converges for each 𝑖 ∈ ℕ to
some 𝑥𝑖 .);
(2) Show that ‖𝑥𝑛 − 𝑥‖𝐸 → 0;
(3) Show that 𝑥 ∈ 𝐸 by using that 𝐸 is a vector space.

7.3 Lebesgue Spaces


The Lebesgue or simply 𝐿𝑝 -spaces are familiar to all those who have taken “Lebesgue
Integration and Fourier Analysis.” We give an outline on the main properties of these
spaces. They are some sort of continuous version of the 𝓁𝑝 -spaces.
Suppose that 𝑋 ⊂ ℝ𝑁 is an open (or simply measurable) set. If 𝑢 ∶ 𝑋 → 𝕂 is a
measurable function we set
‖𝑢‖𝑝 ∶= ( ∫𝑋 |𝑢(𝑥)|^𝑝 𝑑𝑥 )^{1∕𝑝}  if 1 ≤ 𝑝 < ∞,  and  ‖𝑢‖∞ ∶= ess sup_{𝑥∈𝑋} |𝑢(𝑥)|  if 𝑝 = ∞. (7.6)
We then let 𝐿𝑝 (𝑋, 𝕂) = 𝐿𝑝 (𝑋) be the space of all measurable functions on 𝑋 for which
‖𝑢‖𝑝 is finite. Two such functions are equal if they are equal almost everywhere.

7.7 Proposition (Hölder’s inequality) Let 1 ≤ 𝑝 ≤ ∞. If 𝑢 ∈ 𝐿𝑝 (𝑋) and 𝑣 ∈ 𝐿𝑝′ (𝑋)


then
∫𝑋 |𝑢||𝑣| 𝑑𝑥 ≤ ‖𝑢‖𝑝 ‖𝑣‖𝑝′ .
Moreover there are the following facts.

7.8 Theorem Let 1 ≤ 𝑝 ≤ ∞. Then (𝐿𝑝 (𝑋), ‖⋅‖𝑝 ) is a Banach space. If 𝑋 has finite
measure then
𝐿∞ (𝑋) ⊂ 𝐿𝑞 (𝑋) ⊂ 𝐿𝑝 (𝑋) ⊂ 𝐿1 (𝑋)
if 1 < 𝑝 < 𝑞 < ∞. If 𝑋 has infinite measure there are no inclusions between 𝐿𝑝 (𝑋)
and 𝐿𝑞 (𝑋) for 𝑝 ≠ 𝑞.

7.4 Spaces of Bounded and Continuous Functions
Suppose that 𝑋 is a set and 𝐸 = (𝐸, ‖⋅‖) a normed space. For a function 𝑢 ∶ 𝑋 → 𝐸
we let
‖𝑢‖∞ ∶= sup_{𝑥∈𝑋} ‖𝑢(𝑥)‖.

We define the space of bounded functions by

𝐵(𝑋, 𝐸) ∶= {𝑢 ∶ 𝑋 → 𝐸 ∣ ‖𝑢‖∞ < ∞}.

This space turns out to be a Banach space if 𝐸 is a Banach space. Completeness of


𝐵(𝑋, 𝐸) is equivalent to the fact that the uniform limit of a sequence of bounded func-
tions is bounded. The proof follows the steps outlined in Remark 7.6.

7.9 Theorem If 𝑋 is a set and 𝐸 a Banach space, then 𝐵(𝑋, 𝐸) is a Banach space
with the supremum norm.
Proof. Let (𝑢𝑛 ) be a Cauchy sequence in 𝐵(𝑋, 𝐸). As

‖𝑢𝑛 (𝑥) − 𝑢𝑚 (𝑥)‖𝐸 ≤ ‖𝑢𝑛 − 𝑢𝑚 ‖∞ (7.7)


for all 𝑥 ∈ 𝑋 and 𝑚, 𝑛 ∈ ℕ it follows that (𝑢𝑛 (𝑥)) is a Cauchy sequence in 𝐸 for every
𝑥 ∈ 𝑋. Since 𝐸 is complete
𝑢(𝑥) ∶= lim 𝑢𝑛 (𝑥)
𝑛→∞

exists for all 𝑥 ∈ 𝑋. We need to show that 𝑢𝑛 → 𝑢 in 𝐵(𝑋, 𝐸), that is, with respect to
the supremum norm. Let 𝜀 > 0 be given. Then by assumption there exists 𝑛0 ∈ ℕ such
that ‖𝑢𝑛 − 𝑢𝑚 ‖∞ < 𝜀∕2 for all 𝑚, 𝑛 > 𝑛0 . Using (7.7) we get
‖𝑢𝑛 (𝑥) − 𝑢𝑚 (𝑥)‖ < 𝜀∕2
for all 𝑚, 𝑛 > 𝑛0 and 𝑥 ∈ 𝑋. For fixed 𝑥 ∈ 𝑋 we can let 𝑚 → ∞, so by the continuity
of the norm
‖𝑢𝑛 (𝑥) − 𝑢(𝑥)‖ ≤ 𝜀∕2
for all 𝑥 ∈ 𝑋 and all 𝑛 > 𝑛0 . Hence ‖𝑢𝑛 −𝑢‖∞ ≤ 𝜀∕2 < 𝜀 for all 𝑛 > 𝑛0 . Since the above
works for every 𝜀 > 0 it follows that 𝑢𝑛 − 𝑢 → 0 in 𝐵(𝑋, 𝐸) as 𝑛 → ∞. Finally, as 𝐵(𝑋, 𝐸)
is a vector space and 𝑢𝑛0 , 𝑢𝑛0 − 𝑢 ∈ 𝐵(𝑋, 𝐸) we have 𝑢 = 𝑢𝑛0 − (𝑢𝑛0 − 𝑢) ∈ 𝐵(𝑋, 𝐸).
Hence 𝑢𝑛 → 𝑢 in 𝐵(𝑋, 𝐸), showing that 𝐵(𝑋, 𝐸) is complete.

If 𝑋 is a metric space we denote the vector space of all continuous functions by 𝐶(𝑋, 𝐸).
This space does not carry a topology or norm in general. However,

𝐵𝐶(𝑋, 𝐸) ∶= 𝐵(𝑋, 𝐸) ∩ 𝐶(𝑋, 𝐸)

becomes a normed space with norm ‖⋅‖∞ . Note that if 𝑋 is compact then 𝐶(𝑋, 𝐸) =
𝐵𝐶(𝑋, 𝐸). The space 𝐵𝐶(𝑋, 𝐸) turns out to be a Banach space if 𝐸 is a Banach
space. Note that the completeness of 𝐵𝐶(𝑋, 𝐸) is equivalent to the fact that the uniform
limit of continuous functions is continuous. Hence the language of functional analysis
provides a way to rephrase standard facts from analysis in a concise and unified way.

7.10 Theorem If 𝑋 is a metric space and 𝐸 a Banach space, then 𝐵𝐶(𝑋, 𝐸) is a
Banach space with the supremum norm.

Proof. Let 𝑢𝑛 be a Cauchy sequence in 𝐵𝐶(𝑋, 𝐸). Then it is a Cauchy sequence in


𝐵(𝑋, 𝐸). By completeness of that space 𝑢𝑛 → 𝑢 in 𝐵(𝑋, 𝐸), so we only need to show
that 𝑢 is continuous. Fix 𝑥0 ∈ 𝑋 arbitrary. We show that 𝑢 is continuous at 𝑥0 . Clearly

‖𝑢(𝑥) − 𝑢(𝑥0 )‖𝐸 ≤ ‖𝑢(𝑥) − 𝑢𝑛 (𝑥)‖𝐸 + ‖𝑢𝑛 (𝑥) − 𝑢𝑛 (𝑥0 )‖𝐸 + ‖𝑢𝑛 (𝑥0 ) − 𝑢(𝑥0 )‖𝐸 (7.8)

for all 𝑥 ∈ 𝑋 and 𝑛 ∈ ℕ. Fix now 𝜀 > 0 arbitrary. Since 𝑢𝑛 → 𝑢 in 𝐵(𝑋, 𝐸) there
exists 𝑛0 ∈ ℕ such that ‖𝑢𝑛 (𝑥) − 𝑢(𝑥)‖𝐸 < 𝜀∕4 for all 𝑛 > 𝑛0 and 𝑥 ∈ 𝑋. Hence (7.8)
implies that
‖𝑢(𝑥) − 𝑢(𝑥0 )‖𝐸 < 𝜀∕2 + ‖𝑢𝑛 (𝑥) − 𝑢𝑛 (𝑥0 )‖𝐸 . (7.9)
Since 𝑢𝑛0 +1 is continuous at 𝑥0 there exists 𝛿 > 0 such that ‖𝑢𝑛0 +1 (𝑥) − 𝑢𝑛0 +1 (𝑥0 )‖𝐸 < 𝜀∕2
for all 𝑥 ∈ 𝑋 with 𝑑(𝑥, 𝑥0 ) < 𝛿. Using (7.9) with 𝑛 = 𝑛0 + 1 we get ‖𝑢(𝑥) − 𝑢(𝑥0 )‖𝐸 <
𝜀∕2 + 𝜀∕2 = 𝜀 if 𝑑(𝑥, 𝑥0 ) < 𝛿, and so 𝑢 is continuous at 𝑥0 . As 𝑥0 was arbitrary, 𝑢 ∈ 𝐶(𝑋, 𝐸)
as claimed.

Note that the above is a functional analytic reformulation of the fact that a uniformly
convergent sequence of bounded functions is bounded, and similarly that a uniformly
convergent sequence of continuous functions is continuous.
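
The difference between uniform and merely pointwise convergence is easy to see in a small experiment. The following sketch (assuming NumPy) compares 𝑢𝑛(𝑥) = 𝑥^𝑛, whose pointwise limit on [0, 1] is discontinuous, with 𝑣𝑛(𝑥) = 𝑥∕𝑛, which converges uniformly to 0; everything is evaluated on a finite grid, so the suprema are only approximate.

```python
import numpy as np

# Sketch: pointwise vs uniform convergence on [0, 1] (grid approximation).
x = np.linspace(0.0, 1.0, 1001)

# u_n(x) = x^n converges pointwise to a discontinuous limit, so the
# convergence cannot be uniform: sup |u_n - u_m| stays large.
u = lambda n: x ** n
print(np.max(np.abs(u(50) - u(100))))   # roughly 0.25, does not go to 0

# v_n(x) = x / n converges uniformly to 0: the sup norm of v_n tends to 0.
v = lambda n: x / n
print(np.max(np.abs(v(50))), np.max(np.abs(v(100))))
```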

8 Basic Properties of Bounded Linear Operators


One important aim of functional analysis is to gain a deep understanding of properties
of linear operators. We start with some definitions.

8.1 Definition (bounded sets) A subset 𝑈 of a normed space 𝐸 is called bounded if


there exists 𝑀 > 0 such that 𝑈 ⊂ 𝐵(0, 𝑀).

We next define some classes of linear operators.

8.2 Definition Let 𝐸, 𝐹 be two normed spaces.


(a) We denote by Hom(𝐸, 𝐹 ) the set of all linear operators from 𝐸 to 𝐹 . (“Hom”
because linear operators are homomorphisms between vector spaces.) We also set
Hom(𝐸) ∶= Hom(𝐸, 𝐸).
(b) We set
(𝐸, 𝐹 ) ∶= {𝑇 ∈ Hom(𝐸, 𝐹 ) ∶ 𝑇 continuous}
and (𝐸) ∶= (𝐸, 𝐸).
(c) We call 𝑇 ∈ Hom(𝐸, 𝐹 ) bounded if 𝑇 maps every bounded subset of 𝐸 onto a
bounded subset of 𝐹 .

In the following theorem we collect the main properties of continuous and bounded
linear operators. In particular we show that a linear operator is bounded if and only if
it is continuous.

8.3 Theorem For 𝑇 ∈ Hom(𝐸, 𝐹 ) the following statements are equivalent:

(i) 𝑇 is uniformly continuous;

(ii) 𝑇 ∈ (𝐸, 𝐹 );

(iii) 𝑇 is continuous at 𝑥 = 0;

(iv) 𝑇 is bounded;

(v) There exists 𝛼 > 0 such that ‖𝑇 𝑥‖𝐹 ≤ 𝛼‖𝑥‖𝐸 for all 𝑥 ∈ 𝐸.

Proof. The implications (i)⇒(ii) and (ii)⇒(iii) are obvious. Suppose now that 𝑇 is
continuous at 𝑥 = 0 and that 𝑈 is an arbitrary bounded subset of 𝐸. As 𝑇 is continuous
at 𝑥 = 0 there exists 𝛿 > 0 such that ‖𝑇 𝑥‖𝐹 ≤ 1 whenever ‖𝑥‖𝐸 ≤ 𝛿. Since 𝑈 is
bounded 𝑀 ∶= sup𝑥∈𝑈 ‖𝑥‖𝐸 < ∞, so for every 𝑥 ∈ 𝑈

‖(𝛿∕𝑀)𝑥‖𝐸 = (𝛿∕𝑀)‖𝑥‖𝐸 ≤ 𝛿.
Hence by the linearity of 𝑇 and the choice of 𝛿
(𝛿∕𝑀)‖𝑇 𝑥‖𝐹 = ‖𝑇 ((𝛿∕𝑀)𝑥)‖𝐹 ≤ 1,
showing that ‖𝑇 𝑥‖𝐹 ≤ 𝑀∕𝛿
for all 𝑥 ∈ 𝑈 . Therefore, the image of 𝑈 under 𝑇 is bounded, showing that (iii) implies
(iv). Suppose now that 𝑇 is bounded. Then there exists 𝛼 > 0 such that ‖𝑇 𝑥‖𝐹 ≤ 𝛼
whenever ‖𝑥‖𝐸 ≤ 1. Hence, using the linearity of 𝑇
(1∕‖𝑥‖𝐸 )‖𝑇 𝑥‖𝐹 = ‖𝑇 (𝑥∕‖𝑥‖𝐸 )‖𝐹 ≤ 𝛼

for all 𝑥 ∈ 𝐸 with 𝑥 ≠ 0. Since 𝑇 0 = 0 it follows that ‖𝑇 𝑥‖𝐹 ≤ 𝛼‖𝑥‖𝐸 for all 𝑥 ∈ 𝐸.
Hence (iv) implies (v). Suppose now that there exists 𝛼 > 0 such that ‖𝑇 𝑥‖𝐹 ≤ 𝛼‖𝑥‖𝐸
for all 𝑥 ∈ 𝐸. Then by the linearity of 𝑇

‖𝑇 𝑥 − 𝑇 𝑦‖𝐹 = ‖𝑇 (𝑥 − 𝑦)‖𝐹 ≤ 𝛼‖𝑥 − 𝑦‖𝐸 ,

showing the uniform continuity of 𝑇 . Hence (v) implies (i), completing the proof of
the theorem.

Obviously, (𝐸, 𝐹 ) is a vectors space. We will show that it is a normed space if we


define an appropriate norm.

8.4 Definition (operator norm) For 𝑇 ∈ (𝐸, 𝐹 ) we define

‖𝑇 ‖(𝐸,𝐹 ) ∶= inf{𝛼 > 0 ∶ ‖𝑇 𝑥‖𝐹 ≤ 𝛼‖𝑥‖𝐸 for all 𝑥 ∈ 𝐸}

We call ‖𝑇 ‖(𝐸,𝐹 ) the operator norm of 𝐸.

8.5 Remark We could define ‖𝑇 ‖(𝐸,𝐹 ) for all 𝑇 ∈ Hom(𝐸, 𝐹 ), but Theorem 8.3
shows that 𝑇 ∈ (𝐸, 𝐹 ) if and only if ‖𝑇 ‖(𝐸,𝐹 ) < ∞.
Before proving that the operator norm is in fact a norm, we first give other characteri-
sations.

8.6 Proposition Suppose that 𝑇 ∈ (𝐸, 𝐹 ). Then ‖𝑇 𝑥‖𝐹 ≤ ‖𝑇 ‖(𝐸,𝐹 ) ‖𝑥‖𝐸 for all
𝑥 ∈ 𝐸. Moreover,
‖𝑇 𝑥‖𝐹
‖𝑇 ‖(𝐸,𝐹 ) = sup = sup ‖𝑇 𝑥‖𝐹 = sup ‖𝑇 𝑥‖𝐹 = sup ‖𝑇 𝑥‖𝐹 . (8.1)
𝑥∈𝐸⧵{0} ‖𝑥‖𝐸 ‖𝑥‖𝐸 =1 ‖𝑥‖𝐸 <1 ‖𝑥‖𝐸 ≤1

Proof. Fix 𝑇 ∈ (𝐸, 𝐹 ) and set 𝐴 ∶= {𝛼 > 0 ∶ ‖𝑇 𝑥‖𝐹 ≤ 𝛼‖𝑥‖𝐸 for all 𝑥 ∈ 𝐸}. By
definition ‖𝑇 ‖(𝐸,𝐹 ) = inf 𝐴. If 𝛼 ∈ 𝐴, then ‖𝑇 𝑥‖𝐹 ≤ 𝛼‖𝑥‖𝐸 for all 𝑥 ∈ 𝐸. Hence,
for every 𝑥 ∈ 𝐸 we have ‖𝑇 𝑥‖𝐹 ≤ (inf 𝐴)‖𝑥‖𝐸 , proving the first claim. Set now
‖𝑇 𝑥‖𝐹
𝜆 ∶= sup .
𝑥∈𝐸⧵{0} ‖𝑥‖𝐸

Then, ‖𝑇 𝑥‖𝐹 ≤ 𝜆‖𝑥‖𝐸 for all 𝑥 ∈ 𝐸, and so 𝜆 ≥ inf 𝐴 = ‖𝑇 ‖(𝐸,𝐹 ) . By the above we
have ‖𝑇 𝑥‖𝐹 ≤ ‖𝑇 ‖(𝐸,𝐹 ) ‖𝑥‖𝐸 and thus

‖𝑇 𝑥‖𝐹
≤ ‖𝑇 ‖(𝐸,𝐹 )
‖𝑥‖𝐸
for all 𝑥 ∈ 𝐸 ⧵ {0}, implying that 𝜆 ≤ ‖𝑇 ‖(𝐸,𝐹 ) . Combining the two inequalities gives
𝜆 = ‖𝑇 ‖(𝐸,𝐹 ) , proving the first equality in (8.1). Now by the linearity of 𝑇

sup_{𝑥∈𝐸⧵{0}} ‖𝑇 𝑥‖𝐹 ∕‖𝑥‖𝐸 = sup_{𝑥∈𝐸⧵{0}} ‖𝑇 (𝑥∕‖𝑥‖𝐸 )‖𝐹 = sup_{‖𝑥‖𝐸 =1} ‖𝑇 𝑥‖𝐹 ,

proving the second equality in (8.1). To prove the third equality note that
𝛽 ∶= sup_{‖𝑥‖𝐸 <1} ‖𝑇 𝑥‖𝐹 ≤ sup_{‖𝑥‖𝐸 <1} ‖𝑇 𝑥‖𝐹 ∕‖𝑥‖𝐸 ≤ sup_{𝑥∈𝐸⧵{0}} ‖𝑇 𝑥‖𝐹 ∕‖𝑥‖𝐸 = 𝜆.

On the other hand, we have for every 𝑥 ∈ 𝐸 and 𝜀 > 0


‖𝑇 (𝑥∕(‖𝑥‖𝐸 + 𝜀))‖𝐹 ≤ 𝛽

and thus ‖𝑇 𝑥‖𝐹 ≤ 𝛽(‖𝑥‖𝐸 + 𝜀) for all 𝜀 > 0 and 𝑥 ∈ 𝐸. Hence ‖𝑇 𝑥‖𝐹 ≤ 𝛽‖𝑥‖𝐸 for
all 𝑥 ∈ 𝐸, implying that 𝛽 ≥ 𝜆. Combining the inequalities gives 𝛽 = 𝜆, which is the third
equality in (8.1). For the last equality note that ‖𝑇 𝑥‖𝐹 ≤ ‖𝑇 ‖(𝐸,𝐹 ) ‖𝑥‖𝐸 ≤ ‖𝑇 ‖(𝐸,𝐹 )
whenever ‖𝑥‖𝐸 ≤ 1. Hence

sup ‖𝑇 𝑥‖𝐹 ≤ ‖𝑇 ‖(𝐸,𝐹 ) = sup ‖𝑇 𝑥‖𝐹 ≤ sup ‖𝑇 𝑥‖𝐹 ,


‖𝑥‖𝐸 ≤1 ‖𝑥‖𝐸 =1 ‖𝑥‖𝐸 ≤1

implying the last inequality.
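
For a matrix 𝑇 acting between Euclidean spaces the operator norm in (8.1) is the largest singular value. The following sketch (assuming NumPy) compares that exact value with the supremum over randomly sampled unit vectors; the sampled supremum can only underestimate the true norm.

```python
import numpy as np

# Sketch: the characterisations in (8.1) for a matrix T acting on
# (R^5, |.|_2).  For the Euclidean norms the operator norm equals the largest
# singular value, which numpy.linalg.norm(T, 2) returns.
rng = np.random.default_rng(1)
T = rng.standard_normal((3, 5))

exact = np.linalg.norm(T, 2)                 # ||T|| = largest singular value

# sup over ||x|| = 1, approximated by random unit vectors
xs = rng.standard_normal((5, 100_000))
xs /= np.linalg.norm(xs, axis=0)
sampled = np.max(np.linalg.norm(T @ xs, axis=0))

print(exact, sampled)            # sampled <= exact, and close for many samples
assert sampled <= exact + 1e-10
```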

We next show that ‖⋅‖(𝐸,𝐹 ) is a norm.

8.7 Proposition The space ((𝐸, 𝐹 ), ‖⋅‖(𝐸,𝐹 ) ) is a normed space.
Proof. By definition ‖𝑇 ‖(𝐸,𝐹 ) ≥ 0 for all 𝑇 ∈ (𝐸, 𝐹 ). Let now ‖𝑇 ‖(𝐸,𝐹 ) = 0. Then
by (8.1) we have
‖𝑇 𝑥‖𝐹
sup = 0,
𝑥∈𝐸⧵{0} ‖𝑥‖𝐸

so in particular ‖𝑇 𝑥‖𝐹 = 0 for all 𝑥 ∈ 𝐸. Hence 𝑇 = 0 is the zero operator. If 𝜆 ∈ 𝕂,


then by (8.1)

‖𝜆𝑇 ‖(𝐸,𝐹 ) = sup ‖𝜆𝑇 𝑥‖𝐹 = sup |𝜆|‖𝑇 𝑥‖𝐹


‖𝑥‖𝐸 =1 ‖𝑥‖𝐸 =1

= |𝜆| sup ‖𝑇 𝑥‖𝐹 = |𝜆|‖𝑇 ‖(𝐸,𝐹 ) .


‖𝑥‖𝐸 =1

If 𝑆, 𝑇 ∈ (𝐸, 𝐹 ), then again by (8.1)


‖𝑆 + 𝑇 ‖(𝐸,𝐹 ) = sup_{‖𝑥‖𝐸 =1} ‖(𝑆 + 𝑇 )𝑥‖𝐹 ≤ sup_{‖𝑥‖𝐸 =1} (‖𝑆𝑥‖𝐹 + ‖𝑇 𝑥‖𝐹 )
≤ sup_{‖𝑥‖𝐸 =1} (‖𝑆‖(𝐸,𝐹 ) + ‖𝑇 ‖(𝐸,𝐹 ) )‖𝑥‖𝐸 = ‖𝑆‖(𝐸,𝐹 ) + ‖𝑇 ‖(𝐸,𝐹 ) ,

completing the proof of the proposition.

From now on we will always assume that (𝐸, 𝐹 ) is equipped with the operator norm.
Using the steps outlined in Remark 7.6 we prove that (𝐸, 𝐹 ) is complete if 𝐹 is com-
plete.

8.8 Theorem If 𝐹 is a Banach space, then (𝐸, 𝐹 ) is a Banach space with respect to
the operator norm.
Proof. To simplify notation we let ‖𝑇 ‖ ∶= ‖𝑇 ‖(𝐸,𝐹 ) for all 𝑇 ∈ (𝐸, 𝐹 ). Sup-
pose that 𝐹 is a Banach space, and that (𝑇𝑛 ) is a Cauchy sequence in (𝐸, 𝐹 ). By
Proposition 8.6 we have

‖𝑇𝑛 𝑥 − 𝑇𝑚 𝑥‖𝐹 = ‖(𝑇𝑛 − 𝑇𝑚 )𝑥‖𝐹 ≤ ‖𝑇𝑛 − 𝑇𝑚 ‖‖𝑥‖𝐸

for all 𝑥 ∈ 𝐸 and 𝑛, 𝑚 ∈ ℕ. As (𝑇𝑛 ) is a Cauchy sequence in (𝐸, 𝐹 ) it follows that


(𝑇𝑛 𝑥) is a Cauchy sequence in 𝐹 for all 𝑥 ∈ 𝐸. As 𝐹 is complete

𝑇 𝑥 ∶= lim 𝑇𝑛 𝑥
𝑛→∞

exists for all 𝑥 ∈ 𝐸. If 𝑥, 𝑦 ∈ 𝐸 and 𝜆, 𝜇 ∈ 𝕂, then


𝑇𝑛 (𝜆𝑥 + 𝜇𝑦) = 𝜆𝑇𝑛 𝑥 + 𝜇𝑇𝑛 𝑦 for all 𝑛 ∈ ℕ, and letting 𝑛 → ∞ on both sides gives
𝑇 (𝜆𝑥 + 𝜇𝑦) = 𝜆𝑇 𝑥 + 𝜇𝑇 𝑦.
Hence, 𝑇 ∶ 𝐸 → 𝐹 is a linear operator. It remains to show that 𝑇 ∈ (𝐸, 𝐹 ) and that
𝑇𝑛 → 𝑇 in (𝐸, 𝐹 ). As (𝑇𝑛 ) is a Cauchy sequence, for every 𝜀 > 0 there exists 𝑛0 ∈ ℕ
such that
‖𝑇𝑛 𝑥 − 𝑇𝑚 𝑥‖𝐹 ≤ ‖𝑇𝑛 − 𝑇𝑚 ‖‖𝑥‖𝐸 ≤ 𝜀‖𝑥‖𝐸

for all 𝑛, 𝑚 ≥ 𝑛0 and all 𝑥 ∈ 𝐸. Letting 𝑚 → ∞ and using the continuity of the norm
we see that
‖𝑇𝑛 𝑥 − 𝑇 𝑥‖𝐹 ≤ 𝜀‖𝑥‖𝐸
for all 𝑥 ∈ 𝐸 and 𝑛 ≥ 𝑛0 . By definition of the operator norm ‖𝑇𝑛 −𝑇 ‖ ≤ 𝜀 for all 𝑛 ≥ 𝑛0 .
In particular, 𝑇𝑛 − 𝑇 ∈ (𝐸, 𝐹 ) for all 𝑛 ≥ 𝑛0 . Since 𝜀 > 0 was arbitrary, ‖𝑇𝑛 − 𝑇 ‖ → 0
in (𝐸, 𝐹 ). Finally, since (𝐸, 𝐹 ) is a vector space and 𝑇𝑛0 , 𝑇𝑛0 − 𝑇 ∈ (𝐸, 𝐹 ), we
have 𝑇 = 𝑇𝑛0 − (𝑇𝑛0 − 𝑇 ) ∈ (𝐸, 𝐹 ), completing the proof of the theorem.

9 Equivalent Norms
Depending on the particular problem we look at, it may be convenient to work with
different norms. Some norms generate the same topology as the original norm, others
may generate a different topology. Here are some definitions.

9.1 Definition (Equivalent norms) Suppose that 𝐸 is a vector space, and that ‖⋅‖1 and
‖⋅‖2 are norms on 𝐸.

• We say that ‖⋅‖1 is stronger than ‖⋅‖2 if there exists a constant 𝐶 > 0 such that

‖𝑥‖2 ≤ 𝐶‖𝑥‖1

for all 𝑥 ∈ 𝐸. In that case we also say that ‖⋅‖2 is weaker than ‖⋅‖1 .

• We say that the norms ‖⋅‖1 and ‖⋅‖2 are equivalent if there exist two constants
𝑐, 𝐶 > 0 such that
𝑐‖𝑥‖1 ≤ ‖𝑥‖2 ≤ 𝐶‖𝑥‖1
for all 𝑥 ∈ 𝐸.

9.2 Examples (a) Let |⋅|𝑝 denote the 𝑝-norms on 𝕂𝑁 , where 1 ≤ 𝑝 ≤ ∞ as defined in
Section 7.2. We proved in Theorem 7.3 that
|𝑥|𝑞 ≤ |𝑥|𝑝 ≤ 𝑁^{(𝑞−𝑝)∕(𝑝𝑞)} |𝑥|𝑞

for all 𝑥 ∈ 𝕂𝑁 if 1 ≤ 𝑝 ≤ 𝑞 ≤ ∞. Hence all 𝑝-norms on 𝕂𝑁 are equivalent.


(b) Consider 𝓁𝑝 for some 𝑝 ∈ [1, ∞). Then by Theorem 7.5 we have |𝑥|𝑞 ≤ |𝑥|𝑝 for
all 𝑥 ∈ 𝓁𝑝 if 1 ≤ 𝑝 < 𝑞 ≤ ∞. Hence the 𝑝-norm is stronger than the 𝑞-norm considered
as a norm on 𝓁𝑝 . Note that in contrast to the finite dimensional case considered in (a)
there is no equivalence of norms!
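
The finite dimensional inequalities quoted in (a) are easy to test numerically. The following sketch (assuming NumPy) checks |𝑥|𝑞 ≤ |𝑥|𝑝 ≤ 𝑁^{(𝑞−𝑝)∕(𝑝𝑞)} |𝑥|𝑞 on random vectors for one particular choice of 𝑝 and 𝑞.

```python
import numpy as np

# Sketch: the inequalities |x|_q <= |x|_p <= N^((q-p)/(p*q)) |x|_q on K^N
# for 1 <= p <= q, checked on random real vectors.
rng = np.random.default_rng(2)
N, p, q = 10, 2.0, 5.0
C = N ** ((q - p) / (p * q))

for _ in range(1000):
    x = rng.standard_normal(N)
    norm_p = np.sum(np.abs(x) ** p) ** (1 / p)
    norm_q = np.sum(np.abs(x) ** q) ** (1 / q)
    assert norm_q <= norm_p + 1e-12 and norm_p <= C * norm_q + 1e-12
print("all checks passed, C =", C)
```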
The following worthwhile observations are easily checked.

9.3 Remarks (a) Equivalence of norms is an equivalence relation.


(b) Equivalent norms generate the same topology on a space.
(c) If ‖⋅‖1 is stronger than ‖⋅‖2 , then the topology 𝒯1 on 𝐸 induced by ‖⋅‖1 is stronger
than the topology 𝒯2 induced by ‖⋅‖2 . This means that 𝒯1 ⊇ 𝒯2 , that is, sets open with
respect to ‖⋅‖2 are open with respect to ‖⋅‖1 but not necessarily vice versa.

(d) Consider the two normed spaces 𝐸1 ∶= (𝐸, ‖⋅‖1 ) and 𝐸2 ∶= (𝐸, ‖⋅‖2 ). Clearly
𝐸1 = 𝐸2 = 𝐸 as sets, but not as normed (or metric) spaces. By Theorem 8.3 it is
obvious that ‖⋅‖1 is stronger than ‖⋅‖2 if and only if the linear map 𝑖(𝑥) ∶= 𝑥 is a
bounded linear operator 𝑖 ∈ (𝐸1 , 𝐸2 ). If the two norms are equivalent then also
𝑖−1 = 𝑖 ∈ (𝐸2 , 𝐸1 ).

9.4 Lemma If ‖⋅‖1 and ‖⋅‖2 are two equivalent norms, then 𝐸1 ∶= (𝐸, ‖⋅‖1 ) is com-
plete if and only if 𝐸2 ∶= (𝐸, ‖⋅‖2 ) is complete.
Proof. Let (𝑥𝑛 ) be a sequence in 𝐸. Since ‖⋅‖1 and ‖⋅‖2 are equivalent there exist
𝑐, 𝐶 > 0 such that
𝑐‖𝑥𝑛 − 𝑥𝑚 ‖1 ≤ ‖𝑥𝑛 − 𝑥𝑚 ‖2 ≤ 𝐶‖𝑥𝑛 − 𝑥𝑚 ‖1
for all 𝑛, 𝑚 ∈ ℕ. Hence (𝑥𝑛 ) is a Cauchy sequence in 𝐸1 if and only if it is a Cauchy
sequence in 𝐸2 . Denote by 𝑖(𝑥) ∶= 𝑥 the identity map. If 𝑥𝑛 → 𝑥 in 𝐸1 , then 𝑥𝑛 → 𝑥
in 𝐸2 since 𝑖 ∈ (𝐸1 , 𝐸2 ) by Remark 9.3(d). Similarly 𝑥𝑛 → 𝑥 in 𝐸1 if 𝑥𝑛 → 𝑥 in 𝐸2
since 𝑖 ∈ (𝐸2 , 𝐸1 ).
Every linear operator between normed spaces induces a norm on its domain. We show
under what circumstances it is equivalent to the original norm.

9.5 Definition (Graph norm) Suppose that 𝐸, 𝐹 are normed spaces and 𝑇 ∈ Hom(𝐸, 𝐹 ).
We call
‖𝑢‖𝑇 ∶= ‖𝑢‖𝐸 + ‖𝑇 𝑢‖𝐹
the graph norm on 𝐸 associated with 𝑇 .
Let us now explain the term “graph norm.”

9.6 Remark From the linearity of 𝑇 it is rather evident that ‖⋅‖𝑇 is a norm on 𝐸. It is
called the “graph norm” because it really is a norm on the graph of 𝑇 . The graph of 𝑇
is the set
graph(𝑇 ) ∶= {(𝑢, 𝑇 𝑢) ∶ 𝑢 ∈ 𝐸} ⊂ 𝐸 × 𝐹 .
By the linearity of 𝑇 that graph is a linear subspace of 𝐸 × 𝐹 , and ‖⋅‖𝑇 is a norm,
making graph(𝑇 ) into a normed space. Hence the name graph norm.
Since ‖𝑢‖𝐸 ≤ ‖𝑢‖𝐸 +‖𝑇 𝑢‖𝐹 = ‖𝑢‖𝑇 for all 𝑢 ∈ 𝐸 the graph norm of any linear operator
is stronger than the norm on 𝐸. We have equivalence if and only if 𝑇 is bounded!

9.7 Proposition Let 𝑇 ∈ Hom(𝐸, 𝐹 ) and denote by ‖⋅‖𝑇 the corresponding graph
norm on 𝐸. Then 𝑇 ∈ (𝐸, 𝐹 ) if and only if ‖⋅‖𝑇 is equivalent to ‖⋅‖𝐸 .
Proof. If 𝑇 ∈ (𝐸, 𝐹 ), then
( )
‖𝑢‖𝐸 ≤ ‖𝑢‖𝑇 = ‖𝑢‖𝐸 + ‖𝑇 𝑢‖𝐹 ≤ ‖𝑢‖𝐸 + ‖𝑇 ‖(𝐸,𝐹 ) ‖𝑢‖𝐸 = 1 + ‖𝑇 ‖(𝐸,𝐹 ) ‖𝑢‖𝐸
for all 𝑢 ∈ 𝐸, showing that ‖⋅‖𝑇 and ‖⋅‖𝐸 are equivalent. Now assume that the two
norms are equivalent. Hence there exists 𝐶 > 0 such that ‖𝑢‖𝑇 ≤ 𝐶‖𝑢‖𝐸 for all 𝑢 ∈ 𝐸.
Therefore,
‖𝑇 𝑢‖𝐹 ≤ ‖𝑢‖𝐸 + ‖𝑇 𝑢‖𝐹 = ‖𝑢‖𝑇 ≤ 𝐶‖𝑢‖𝐸
for all 𝑢 ∈ 𝐸. Hence 𝑇 ∈ (𝐸, 𝐹 ) by Theorem 8.3.

10 Finite Dimensional Normed Spaces
In the previous section we did not make any assumption on the dimension of a vector
space. We prove that all norms on such spaces are equivalent.

10.1 Theorem Suppose that 𝐸 is a finite dimensional vector space. Then all norms on
𝐸 are equivalent. Moreover, 𝐸 is complete with respect to every norm.
Proof. Suppose that dim 𝐸 = 𝑁. Given a basis (𝑒1 , … , 𝑒𝑁 ), for every 𝑥 ∈ 𝐸 there
exist unique scalars 𝜉1 , … , 𝜉𝑁 ∈ 𝕂 such that


𝑥 = ∑_{𝑖=1}^{𝑁} 𝜉𝑖 𝑒𝑖 .

In other words, the map 𝑇 ∶ 𝕂𝑁 → 𝐸, given by


𝑇 𝜉 ∶= ∑_{𝑖=1}^{𝑁} 𝜉𝑖 𝑒𝑖 (10.1)

for all 𝜉 = (𝜉1 , … , 𝜉𝑁 ), is an isomorphism between 𝕂𝑁 and 𝐸. Let now ‖⋅‖𝐸 be a


norm on 𝐸. Because equivalence of norms is an equivalence relation, it is sufficient
to show that the graph norm of 𝑇 −1 on 𝐸 is equivalent to ‖⋅‖𝐸 . By Proposition 9.7
this is the case if 𝑇 −1 ∈ (𝐸, 𝕂𝑁 ). First we show that 𝑇 ∈ (𝕂𝑁 , 𝐸). Using the
properties of a norm and the Cauchy-Schwarz inequality for the dot product in 𝕂𝑁 (see
also Proposition 7.2 for 𝑝 = 𝑞 = 2) we get

‖𝑇 𝜉‖𝐸 = ‖∑_{𝑖=1}^{𝑁} 𝜉𝑖 𝑒𝑖 ‖_𝐸 ≤ ∑_{𝑖=1}^{𝑁} |𝜉𝑖 | ‖𝑒𝑖 ‖_𝐸 ≤ (∑_{𝑖=1}^{𝑁} ‖𝑒𝑖 ‖_𝐸^2 )^{1∕2} |𝜉|_2 = 𝐶|𝜉|_2

for all 𝜉 ∈ 𝕂𝑁 if we set 𝐶 ∶= (∑_{𝑖=1}^{𝑁} ‖𝑒𝑖 ‖_𝐸^2 )^{1∕2} . Hence, 𝑇 ∈ (𝕂𝑁 , 𝐸). By the
continuity of a norm the map 𝜉 → ‖𝑇 𝜉‖𝐸 is a continuous map from 𝕂𝑁 to ℝ. In
particular it is continuous on the unit sphere 𝑆 = {𝜉 ∈ 𝕂𝑁 ∶ |𝜉|2 = 1}. Clearly 𝑆 is
a compact subset of 𝕂𝑁 . We know from Theorem 5.7 that continuous functions attain
a minimum on such a set. Hence there exists 𝛽 ∈ 𝑆 such that ‖𝑇 𝛽‖𝐸 ≤ ‖𝑇 𝜉‖𝐸 for
all 𝜉 ∈ 𝑆. Since ‖𝛽‖ = 1 ≠ 0 and 𝑇 is an isomorphism, property (i) of a norm (see
Definition 6.1) implies that 𝑐 ∶= ‖𝑇 𝛽‖𝐸 > 0. If 𝜉 ≠ 0, then since 𝜉∕|𝜉|2 ∈ 𝑆,

𝑐 ≤ ‖𝑇 (𝜉∕|𝜉|_2 )‖_𝐸 .

Now by the linearity of 𝑇 and property (ii) of a norm 𝑐|𝜉|2 ≤ ‖𝑇 𝜉‖𝐸 and therefore
|𝑇 −1 𝑥|2 ≤ 𝑐 −1 ‖𝑥‖𝐸 for all 𝑥 ∈ 𝐸. Hence 𝑇 −1 ∈ (𝐸, 𝕂𝑁 ) as claimed. We finally need
to prove completeness. Given a Cauchy sequence in 𝐸 with respect to ‖⋅‖𝐸 we have
from what we just proved that

|𝑇 ^{−1} 𝑥𝑛 − 𝑇 ^{−1} 𝑥𝑚 |_2 ≤ 𝑐^{−1} ‖𝑥𝑛 − 𝑥𝑚 ‖𝐸 ,

showing that (𝑇 −1 𝑥𝑛 ) is a Cauchy sequence in 𝕂𝑁 . By the completeness of 𝕂𝑁 we
have 𝑇 −1 𝑥𝑛 → 𝜂 in 𝕂𝑁 . By continuity of 𝑇 proved above we get 𝑥𝑛 → 𝑇 𝜂, so (𝑥𝑛 )
converges. Since the above arguments work for every norm on 𝐸, this shows that 𝐸 is
complete with respect to every norm.

There are some useful consequences to the above theorem. The first is concerned with
finite dimensional subspaces of an arbitrary normed space.

10.2 Corollary Every finite dimensional subspace of a normed space 𝐸 is closed and
complete in 𝐸.
Proof. If 𝐹 is a finite dimensional subspace of 𝐸, then 𝐹 is a normed space with the
norm induced by the norm of 𝐸. By the above theorem 𝐹 is complete with respect to
that norm, so in particular it is closed in 𝐸.

The second shows that any linear operator on a finite dimensional normed space is
continuous.

10.3 Corollary Let 𝐸, 𝐹 be normed spaces and dim 𝐸 < ∞. If 𝑇 ∶ 𝐸 → 𝐹 is linear,


then 𝑇 is bounded.
Proof. Consider the graph norm ‖𝑥‖𝑇 ∶= ‖𝑥‖𝐸 + ‖𝑇 𝑥‖𝐹 , which is a norm on 𝐸. By
Theorem 10.1 that norm is equivalent to ‖⋅‖𝐸 . Hence by Proposition 9.7 we conclude
that 𝑇 ∈ (𝐸, 𝐹 ) as claimed.

We finally prove a counterpart to the Heine-Borel Theorem (Theorem 4.5) for general
finite dimensional normed spaces. In the next section we will show that the converse
is true as well, providing a topological characterisation of finite dimensional normed
spaces.

10.4 Corollary (General Heine-Borel Theorem) Let 𝐸 be a finite dimensional normed


space. Then 𝐴 ⊂ 𝐸 is compact if and only if 𝐴 is closed and bounded.
Proof. Suppose that dim 𝐸 = 𝑁. Given a basis (𝑒1 , … , 𝑒𝑁 ) define 𝑇 ∈ (𝕂𝑁 , 𝐸) as
in (10.1). We know from the above that 𝑇 and 𝑇 −1 are continuous and therefore map
closed sets onto closed sets (see Theorem 5.3). Also 𝑇 and 𝑇 −1 map bounded sets onto
bounded sets (see Theorem 8.3) and compact sets onto compact sets (see Theorem 5.6).
Hence the assertion of the corollary follows.

11 Infinite Dimensional Normed Spaces


The purpose of this section is to characterise finite and infinite dimensional vector
spaces by means of topological properties.

11.1 Theorem (Almost orthogonal elements) Suppose 𝐸 is a normed space and 𝑀


a proper closed subspace of 𝐸. Then for every 𝜀 ∈ (0, 1) there exists 𝑥𝜀 ∈ 𝐸 with
‖𝑥𝜀 ‖ = 1 and
dist(𝑥𝜀 , 𝑀) ∶= inf ‖𝑥 − 𝑥𝜀 ‖ ≥ 1 − 𝜀.
𝑥∈𝑀

Proof. Fix an arbitrary 𝑥 ∈ 𝐸 ⧵ 𝑀 which exists since 𝑀 is a proper subspace of 𝐸.
As 𝑀 is closed dist(𝑥, 𝑀) ∶= 𝛼 > 0 as otherwise 𝑥 ∈ 𝑀 ̄ = 𝑀. Let 𝜀 ∈ (0, 1) be
arbitrary and note that (1 − 𝜀)^{−1} > 1. Hence by definition of an infimum there exists
𝑚𝜀 ∈ 𝑀 such that
𝑚𝜀 ∈ 𝑀 such that
‖𝑥 − 𝑚𝜀 ‖ ≤ 𝛼∕(1 − 𝜀). (11.1)
We define 𝑥𝜀 ∶= (𝑥 − 𝑚𝜀 )∕‖𝑥 − 𝑚𝜀 ‖.
Then clearly ‖𝑥𝜀 ‖ = 1 and by (11.1) we have

‖𝑥𝜀 − 𝑚‖ = ‖(𝑥 − 𝑚𝜀 )∕‖𝑥 − 𝑚𝜀 ‖ − 𝑚‖ = (1∕‖𝑥 − 𝑚𝜀 ‖) ‖𝑥 − (𝑚𝜀 + ‖𝑥 − 𝑚𝜀 ‖𝑚)‖
≥ ((1 − 𝜀)∕𝛼) ‖𝑥 − (𝑚𝜀 + ‖𝑥 − 𝑚𝜀 ‖𝑚)‖
for all 𝑚 ∈ 𝑀. As 𝑚𝜀 ∈ 𝑀 and 𝑀 is a subspace of 𝐸 we clearly have

𝑚𝜀 + ‖𝑥 − 𝑚𝜀 ‖𝑚 ∈ 𝑀

for all 𝑚 ∈ 𝑀. Thus by our choice of 𝑥


‖𝑥𝜀 − 𝑚‖ ≥ ((1 − 𝜀)∕𝛼) 𝛼 = 1 − 𝜀
for all 𝑚 ∈ 𝑀. Hence 𝑥𝜀 is as required in the theorem.

11.2 Corollary Suppose that 𝐸 has closed subspaces 𝑀𝑖 , 𝑖 ∈ ℕ. If

𝑀1 ⊊ 𝑀2 ⊊ 𝑀3 ⊊ ⋯ ⊊ 𝑀𝑛 ⊊ 𝑀𝑛+1

for all 𝑛 ∈ ℕ then there exist 𝑚𝑛 ∈ 𝑀𝑛 such that ‖𝑚𝑛 ‖ = 1 and dist(𝑚𝑛 , 𝑀𝑛−1 ) ≥ 1∕2
for all 𝑛 ∈ ℕ. Likewise, if

𝑀1 ⊋ 𝑀2 ⊋ 𝑀3 ⊋ ⋯ ⊋ 𝑀𝑛 ⊋ 𝑀𝑛+1

for all 𝑛 ∈ ℕ, then there exist 𝑚𝑛 ∈ 𝑀𝑛 such that ‖𝑚𝑛 ‖ = 1 and dist(𝑚𝑛 , 𝑀𝑛+1 ) ≥ 1∕2
for all 𝑛 ∈ ℕ.
Proof. Consider the first case. As 𝑀𝑛−1 is a proper closed subspace of 𝑀𝑛 we can
apply Theorem 11.1 and select 𝑚𝑛 ∈ 𝑀𝑛 such that dist(𝑚𝑛 , 𝑀𝑛−1 ) ≥ 1∕2. Doing so
inductively for all 𝑛 ∈ ℕ we get the required sequence (𝑚𝑛 ). In the second case we
proceed similarly: There exists 𝑚1 ∈ 𝑀1 such that dist(𝑚1 , 𝑀2 ) ≥ 1∕2. Next choose
𝑚2 ∈ 𝑀2 such that dist(𝑚2 , 𝑀3 ) ≥ 1∕2, and so on.

With the above we are able to give a topological characterisation of finite and infinite
dimensional spaces.

11.3 Theorem A normed space 𝐸 is finite dimensional if and only if the unit sphere
𝑆 = {𝑥 ∈ 𝐸 ∶ ‖𝑥‖ = 1} is compact.

Proof. First assume that dim 𝐸 = 𝑁 < ∞. Since the unit sphere is closed and
bounded, by the general Heine-Borel Theorem (Corollary 10.4) it is compact. Now
suppose that 𝐸 is infinite dimensional. Then there exists a countable linearly indepen-
dent set {𝑒𝑛 ∶ 𝑛 ∈ ℕ}. We set 𝑀𝑛 ∶= span{𝑒𝑘 ∶ 𝑘 = 1, … , 𝑛}. Clearly dim 𝑀𝑛 = 𝑛
and thus by Corollary 10.2 𝑀𝑛 is closed for all 𝑛 ∈ ℕ. As dim 𝑀𝑛+1 > dim 𝑀𝑛 the
sequence (𝑀𝑛 ) satisfies the assumptions of Corollary 11.2. Hence there exist 𝑚𝑛 ∈ 𝑀𝑛
such that ‖𝑚𝑛 ‖ = 1 and dist(𝑚𝑛 , 𝑀𝑛−1 ) ≥ 1∕2 for all 𝑛 ∈ ℕ. However, this implies that
‖𝑚𝑛 − 𝑚𝑘 ‖ ≥ 1∕2 whenever 𝑛 ≠ 𝑘, showing that there is a sequence in 𝑆 which does
not have a convergent subsequence. Hence 𝑆 cannot be compact, completing the proof
of the theorem.
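
The failure of compactness can be made concrete in 𝓁2: the standard unit vectors 𝑒𝑛 all lie on the unit sphere and satisfy ‖𝑒𝑛 − 𝑒𝑚‖ = √2 for 𝑛 ≠ 𝑚, so no subsequence can converge. The following sketch (assuming NumPy, and truncating to the first 100 coordinates, which loses nothing since each 𝑒𝑛 has a single non-zero entry) computes these distances.

```python
import numpy as np

# Sketch: the standard unit vectors e_n (truncated to R^100, viewed as
# elements of l^2) all lie on the unit sphere, yet ||e_n - e_m|| = sqrt(2)
# for n != m, so the sequence has no convergent subsequence.
N = 100
E = np.eye(N)                       # rows are e_1, ..., e_N
dists = np.linalg.norm(E[:, None, :] - E[None, :, :], axis=2)
off_diag = dists[~np.eye(N, dtype=bool)]
print(np.unique(np.round(off_diag, 12)))   # only the value sqrt(2) ~ 1.4142
```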

11.4 Corollary Let 𝐸 be a normed vector space. Then the following assertions are
equivalent:
(i) dim 𝐸 < ∞;

(ii) The unit sphere in 𝐸 is compact;

(iii) The unit ball in 𝐸 is relatively compact;

(iv) Every closed and bounded set in 𝐸 is compact.


Proof. Assertions (i) and (ii) are equivalent by Theorem 11.3. Suppose that (ii) is
true. Denote the unit sphere in 𝐸 by 𝑆. Define a map 𝑓 ∶ [0, 1] × 𝑆 → 𝐸 by setting
𝑓 (𝑡, 𝑥) ∶= 𝑡𝑥. Then by Theorem 6.4 the map 𝑓 is continuous and its image is the
closed unit ball in 𝐸. By Proposition 4.8 the set [0, 1] × 𝑆 is compact. Since the image
of a compact set under a continuous function is compact it follows that the closed unit
ball in 𝐸 is compact, proving (iii). Now assume that (iii) holds. Let 𝑀 be an arbitrary
closed and bounded set in 𝐸. Then there exists 𝑅 > 0 such that 𝑀 ⊂ 𝐵(0, 𝑅). Since
the map 𝑥 → 𝑅𝑥 is continuous on 𝐸 and the closed unit ball is compact it follows that
𝐵(0, 𝑅) is compact. Now 𝑀 is compact because it is a closed subset of a compact set
(see Proposition 4.7), so (iv) follows. If (iv) holds, then in particular the unit sphere is
compact, so (ii) follows.

12 Quotient Spaces
Consider a vector space 𝐸 and a subspace 𝐹 . We define an equivalence relation ∼
between elements 𝑥, 𝑦 in 𝐸 by 𝑥 ∼ 𝑦 if and only if 𝑥 − 𝑦 ∈ 𝐹 . Denote by [𝑥] the
equivalence class of 𝑥 ∈ 𝐸 and set

𝐸∕𝐹 ∶= {[𝑥] ∶ 𝑥 ∈ 𝐸}.

As you probably know from Algebra, this is called the quotient space of 𝐸 modulo 𝐹 .
That quotient space is a vector space over 𝕂 if we define the operations

[𝑥] + [𝑦] ∶= [𝑥 + 𝑦]
𝛼[𝑥] ∶= [𝛼𝑥]

for all 𝑥, 𝑦 ∈ 𝐸 and 𝛼 ∈ 𝕂. It is easily verified that these operations are well defined.
If 𝐸 is a normed space we would like to show that 𝐸∕𝐹 is a normed space with norm

‖[𝑥]‖𝐸∕𝐹 ∶= inf_{𝑧∈𝐹 } ‖𝑥 − 𝑧‖𝐸 . (12.1)

This is a good definition since then ‖[𝑥]‖𝐸∕𝐹 ≤ ‖𝑥‖𝐸 for all 𝑥 ∈ 𝐸, that is, the natural
projection 𝐸 → 𝐸∕𝐹 , 𝑥 → [𝑥] is continuous. Geometrically, ‖[𝑥]‖𝐸∕𝐹 is the distance
of the affine subspace [𝑥] = 𝑥 + 𝐹 from the origin, or equivalently the distance between
the affine spaces 𝐹 and 𝑥 + 𝐹 as Figure 12.1 shows in the situation of two dimensions.
Unfortunately, ‖⋅‖𝐸∕𝐹 is not always a norm, but only if 𝐹 is a closed subspace of 𝐸.

Figure 12.1: Distance between affine subspaces

12.1 Proposition Let 𝐸 be a normed space. Then 𝐸∕𝐹 is a normed space with norm
(12.1) if and only if 𝐹 is a closed subspace of 𝐸.

Proof. Clearly ‖[𝑥]‖𝐸∕𝐹 ≥ 0 for all 𝑥 ∈ 𝐸 and ‖[𝑥]‖𝐸∕𝐹 = 0 if [𝑥] = [0], that is,
𝑥 ∈ 𝐹 . Now suppose that ‖[𝑥]‖𝐸∕𝐹 = 0. We want to show that then [𝑥] = [0] if and
only if 𝐹 is closed. First suppose that 𝐹 is closed. If ‖[𝑥]‖𝐸∕𝐹 = 0, then by definition
there exist 𝑧𝑛 ∈ 𝐹 with ‖𝑥 − 𝑧𝑛 ‖ → 0. Hence 𝑧𝑛 → 𝑥, and since 𝐹 is closed 𝑥 ∈ 𝐹 .
But then [𝑥] = [0] proving what we want. Suppose now 𝐹 is not closed. Then there
exists a sequence 𝑧𝑛 ∈ 𝐹 with 𝑧𝑛 → 𝑥 and 𝑥 ∉ 𝐹 . Hence [𝑥] ≠ [0], but

0 ≤ ‖[𝑥]‖𝐸∕𝐹 ≤ lim ‖𝑥 − 𝑧𝑛 ‖ = 0,
𝑛→∞

that is ‖[𝑥]‖𝐸∕𝐹 = 0 even though [𝑥] ≠ [0]. Hence (12.1) does not define a norm. The
other properties of a norm are valid no matter whether 𝐹 is closed or not. First note
that for 𝛼 ∈ 𝕂 and 𝑥 ∈ 𝐸

‖𝛼[𝑥]‖𝐸∕𝐹 = ‖[𝛼𝑥]‖𝐸∕𝐹 ≤ ‖𝛼(𝑥 − 𝑧)‖𝐸 = |𝛼|‖𝑥 − 𝑧‖𝐸

for all 𝑧 ∈ 𝐹 . Hence ‖𝛼[𝑥]‖𝐸∕𝐹 ≤ |𝛼|‖[𝑥]‖𝐸∕𝐹 with equality if 𝛼 = 0. If 𝛼 ≠ 0, then


by the above
‖[𝑥]‖𝐸∕𝐹 = ‖𝛼 −1 𝛼[𝑥]‖𝐸∕𝐹 ≤ |𝛼|−1 ‖𝛼[𝑥]‖𝐸∕𝐹 ,

so ‖𝛼[𝑥]‖𝐸∕𝐹 ≥ |𝛼|‖[𝑥]‖𝐸∕𝐹 , showing that ‖𝛼[𝑥]‖𝐸∕𝐹 = |𝛼|‖[𝑥]‖𝐸∕𝐹 . Finally let 𝑥, 𝑦 ∈ 𝐸
and fix 𝜀 > 0 arbitrary. By definition of the quotient norm there exist 𝑧, 𝑤 ∈ 𝐹 such
that ‖𝑥 − 𝑧‖𝐸 ≤ ‖[𝑥]‖𝐸∕𝐹 + 𝜀 and ‖𝑦 − 𝑤‖𝐸 ≤ ‖[𝑦]‖𝐸∕𝐹 + 𝜀. Hence

‖[𝑥]+[𝑦]‖𝐸∕𝐹 ≤ ‖𝑥+𝑦−𝑧−𝑤‖𝐸 ≤ ‖𝑥−𝑧‖𝐸 +‖𝑦−𝑤‖𝐸 ≤ ‖[𝑥]‖𝐸∕𝐹 +‖[𝑦]‖𝐸∕𝐹 +2𝜀.

As 𝜀 > 0 was arbitrary ‖[𝑥]+[𝑦]‖𝐸∕𝐹 ≤ ‖[𝑥]‖𝐸∕𝐹 +‖[𝑦]‖𝐸∕𝐹 , so the triangle inequality


holds.

The above proposition justifies the following definition.

12.2 Definition (quotient norm) If 𝐹 is a closed subspace of the normed space 𝐸,


then the norm ‖⋅‖𝐸∕𝐹 is called the quotient norm on 𝐸∕𝐹 .
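
For a concrete example take 𝐸 = ℝ² with the Euclidean norm and 𝐹 = span{(1, 1)}. Then ‖[𝑥]‖_{𝐸∕𝐹} is the distance from 𝑥 to the line 𝐹, and for the Euclidean norm the infimum in (12.1) is attained at the orthogonal projection of 𝑥 onto 𝐹. The sketch below (assuming NumPy; the projection formula is specific to the Euclidean norm) computes it and checks that the value only depends on the class [𝑥].

```python
import numpy as np

# Sketch: the quotient norm (12.1) for E = R^2 with the Euclidean norm and
# F = span{(1, 1)}.  For the Euclidean norm the infimum over z in F is
# attained at the orthogonal projection of x onto F.
f = np.array([1.0, 1.0]) / np.sqrt(2.0)      # unit vector spanning F
x = np.array([3.0, 1.0])

proj = np.dot(x, f) * f                      # best approximation to x in F
quotient_norm = np.linalg.norm(x - proj)     # ||[x]||_{E/F} = dist(x, F)
print(quotient_norm)                         # sqrt(2) for this x

# Representatives of the same class [x] = x + F give the same value:
y = x + 5 * f
print(np.linalg.norm(y - np.dot(y, f) * f))
```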
We next look at completeness properties of quotient spaces.

12.3 Theorem Suppose 𝐸 is a Banach space and 𝐹 a closed subspace. Then 𝐸∕𝐹 is
a Banach space with respect to the quotient norm.
Proof. The only thing left to prove is that 𝐸∕𝐹 is complete with respect to the quo-
tient norm. We use the characterisation of completeness of a normed space given in
Theorem 6.8. Hence let ∑_{𝑛=1}^{∞} [𝑥𝑛 ] be an absolutely convergent series in 𝐸∕𝐹 , that is,

∑_{𝑛=1}^{∞} ‖[𝑥𝑛 ]‖𝐸∕𝐹 ≤ 𝑀 < ∞.

By definition of the quotient norm, for each 𝑛 ∈ ℕ there exists 𝑧𝑛 ∈ 𝐹 such that
‖𝑥𝑛 − 𝑧𝑛 ‖𝐸 ≤ ‖[𝑥𝑛 ]‖𝐸∕𝐹 + 2^{−𝑛} .
Hence,

∑_{𝑛=1}^{𝑚} ‖𝑥𝑛 − 𝑧𝑛 ‖𝐸 ≤ ∑_{𝑛=1}^{𝑚} ‖[𝑥𝑛 ]‖𝐸∕𝐹 + ∑_{𝑛=1}^{𝑚} 2^{−𝑛} ≤ 𝑀 + 2 < ∞

for all 𝑚 ∈ ℕ. This means that ∑_{𝑛=1}^{∞} (𝑥𝑛 − 𝑧𝑛 ) is absolutely convergent and therefore
convergent by Theorem 6.8 and the assumption that 𝐸 be complete. We set


𝑠 ∶= ∑_{𝑛=1}^{∞} (𝑥𝑛 − 𝑧𝑛 ).

Now by choice of 𝑧𝑛 and the definition of the quotient norm


‖(∑_{𝑛=1}^{𝑚} [𝑥𝑛 ]) − [𝑠]‖𝐸∕𝐹 = ‖(∑_{𝑛=1}^{𝑚} [𝑥𝑛 − 𝑧𝑛 ]) − [𝑠]‖𝐸∕𝐹 ≤ ‖(∑_{𝑛=1}^{𝑚} (𝑥𝑛 − 𝑧𝑛 )) − 𝑠‖𝐸 → 0

as 𝑚 → ∞ by choice of 𝑠. Hence ∑_{𝑛=1}^{∞} [𝑥𝑛 ] = [𝑠] converges with respect to the quotient
norm, and so by Theorem 6.8 𝐸∕𝐹 is complete.

Next we look at factorisations of bounded linear operators. Given normed spaces 𝐸, 𝐹


and an operator 𝑇 ∈ (𝐸, 𝐹 ) it follows from Theorem 5.3 that ker 𝑇 ∶= {𝑥 ∈

𝐸 ∶ 𝑇 𝑥 = 0} is a closed subspace of 𝐸. Hence 𝐸∕ ker 𝑇 is a normed space with
the quotient norm. We then define a linear operator 𝑇̂ ∶ 𝐸∕ ker 𝑇 → 𝐹 by setting

𝑇̂ [𝑥] ∶= 𝑇 𝑥

for all 𝑥 ∈ 𝐸. It is easily verified that this operator is well defined and linear. Moreover,
if we set 𝜋(𝑥) ∶= [𝑥], then by definition of the quotient norm ‖𝜋(𝑥)‖𝐸∕ ker 𝑇 ≤ ‖𝑥‖𝐸 ,
so 𝜋 ∈ (𝐸, 𝐸∕ ker 𝑇 ) with ‖𝜋‖(𝐸,𝐸∕ ker 𝑇 ) ≤ 1. Moreover, we have the factorisation

𝑇 = 𝑇̂ ◦𝜋,

meaning that 𝑇 factors through the quotient: 𝑇 ∶ 𝐸 → 𝐹 is the composition of the
quotient map 𝜋 ∶ 𝐸 → 𝐸∕ ker 𝑇 followed by 𝑇̂ ∶ 𝐸∕ ker 𝑇 → 𝐹 .

We summarise the above in the following Theorem.

12.4 Theorem Suppose that 𝐸, 𝐹 are normed spaces and that 𝑇 ∈ (𝐸, 𝐹 ). If 𝑇̂ , 𝜋
are defined as above, then 𝑇̂ ∈ (𝐸∕ ker 𝑇 , 𝐹 ), ‖𝑇 ‖(𝐸,𝐹 ) = ‖𝑇̂ ‖(𝐸∕ ker 𝑇 ,𝐹 ) and we
have the factorisation 𝑇 = 𝑇̂ ◦𝜋.
Proof. The only thing left to prove is that 𝑇̂ ∈ (𝐸∕ ker 𝑇 , 𝐹 ), and that ‖𝑇 ‖ ∶=
‖𝑇 ‖(𝐸,𝐹 ) = ‖𝑇̂ ‖(𝐸∕ ker 𝑇 ,𝐹 ) = ∶ ‖𝑇̂ ‖. First note that

‖𝑇̂ [𝑥]‖𝐹 = ‖𝑇 𝑥‖𝐹 = ‖𝑇 (𝑥 − 𝑧)‖𝐹 ≤ ‖𝑇 ‖‖𝑥 − 𝑧‖𝐸

for all 𝑧 ∈ ker 𝑇 . Hence by definition of the quotient norm

‖𝑇̂ [𝑥]‖𝐹 ≤ ‖𝑇 ‖‖[𝑥]‖𝐸∕ ker 𝑇 .

Now by definition of the operator norm ‖𝑇̂ ‖ ≤ ‖𝑇 ‖ < ∞. In particular, 𝑇̂ ∈ (𝐸∕ ker 𝑇 , 𝐹 ).
To show equality of the operator norms observe that

‖𝑇 𝑥‖𝐹 = ‖𝑇̂ [𝑥]‖𝐹 ≤ ‖𝑇̂ ‖‖[𝑥]‖𝐸∕ ker 𝑇 ≤ ‖𝑇̂ ‖‖𝑥‖𝐸

by definition of the operator and quotient norms. Hence ‖𝑇 ‖ ≤ ‖𝑇̂ ‖, showing that
‖𝑇 ‖ = ‖𝑇̂ ‖.

Chapter III
Banach algebras and the Stone-Weierstrass
Theorem

13 Banach algebras
Some Banach spaces have an additional structure. For instance if we consider the space
of continuous functions 𝐶(𝐾) with 𝐾 compact, then we can multiply the functions as
well and define
(𝑓 𝑔)(𝑥) ∶= 𝑓 (𝑥)𝑔(𝑥)
for all 𝑥 ∈ 𝐾. The natural norm on 𝐶(𝐾) is the supremum norm, and we easily see
that
‖𝑓 𝑔‖∞ ≤ ‖𝑓 ‖∞ ‖𝑔‖∞ ,
which in particular implies that multiplication is continuous. Also, there is a neutral
element for multiplication, namely the constant function with value one. Vector spaces
with this additional multiplicative structure are called algebras. Here we have an addi-
tional topological structure. We introduce the following definition.

13.1 Definition A normed space 𝐸 is called a (commutative) normed algebra if there


is an operation 𝐸 × 𝐸 → 𝐸 called multiplication with the following properties:
(i) 𝑥𝑦 = 𝑦𝑥 for all 𝑥, 𝑦 ∈ 𝐸 (commutative law)
(ii) 𝑥(𝑦𝑧) = (𝑥𝑦)𝑧 for all 𝑥, 𝑦, 𝑧 ∈ 𝐸 (associative law);
(iii) 𝑥(𝑦 + 𝑧) = 𝑥𝑦 + 𝑥𝑧 for all 𝑥, 𝑦, 𝑧 ∈ 𝐸 (distributive law);
(iv) 𝛼(𝑥𝑦) = (𝛼𝑥)𝑦 for all 𝑥, 𝑦 ∈ 𝐸 and 𝛼 ∈ 𝕂;
(v) ‖𝑥𝑦‖ ≤ ‖𝑥‖‖𝑦‖ for all 𝑥, 𝑦 ∈ 𝐸.
We call 𝑒 ∈ 𝐸 a unit if 𝑒𝑥 = 𝑥 for all 𝑥 ∈ 𝐸. If 𝐸 is complete we call 𝐸 a Banach
algebra.
Note that not every Banach algebra has a unit. For instance 𝐿1 (ℝ𝑁 ) is a Banach algebra
with multiplication defined as convolution, that is,

(𝑓 ∗ 𝑔)(𝑥) ∶= ∫_{ℝ𝑁 } 𝑓 (𝑥 − 𝑦)𝑔(𝑦) 𝑑𝑦.

Young’s inequality implies that ‖𝑓 ∗ 𝑔‖1 ≤ ‖𝑓 ‖1 ‖𝑔‖1 , but there is no unity (see
MATH3969). Unity would correspond to the “delta function” which is really a measure,
not a function.

14 The Stone-Weierstrass Theorem


In this section we consider properties of subspaces of the Banach algebra 𝐶(𝐾), where
𝐾 is a compact set. A classical theorem in real analysis, the Weierstrass approxima-
tion theorem asserts that every continuous function on a compact interval in ℝ can be
uniformly approximated by a sequence of polynomials. In other words, the space of
polynomials on a compact interval 𝐼 is dense in 𝐶(𝐼).
The aim of this section is to prove a generalisation of this theorem to Banach al-
gebras with a unit, called the Stone-Weierstrass theorem. It is named after Stone, who
published the generalisation in [12] in 1948.
The standard proof of the Stone-Weierstrass theorem requires a very special case of
the Weierstrass approximation theorem, namely the fact that the absolute value function
|𝑥| can be uniformly approximated by polynomials in the interval [−1, 1]. To achieve
that we write
|𝑥| = √(𝑥^2 ) = √(1 − (1 − 𝑥^2 )).
Using the binomial theorem we can write
|𝑥| = ∑_{𝑘=0}^{∞} (−1)^𝑘 \binom{1∕2}{𝑘} (1 − 𝑥^2 )^𝑘 (14.1)

which is a power series in 𝑡 = (1 − 𝑥2 ), converging for |𝑡| < 1, that is, 𝑥 ∈ (−1, 1).
If we can prove that it converges absolutely and uniformly with respect to 𝑡 ∈ [−1, 1],
then the Weierstrass 𝑀-test shows that (14.1) converges absolutely and uniformly with
respect to 𝑥 ∈ [−1, 1]. Hence the sequence of partial sums of (14.1) provides a uniform
polynomial approximation of |𝑥| on [−1, 1].

14.1 Lemma Let 𝛼 ∈ (0, 1). Then the binomial series


(1 − 𝑡)^𝛼 = ∑_{𝑘=0}^{∞} (−1)^𝑘 \binom{𝛼}{𝑘} 𝑡^𝑘 (14.2)

converges absolutely and uniformly with respect to 𝑡 in the closed ball 𝐵(0, 1) ⊆ ℂ.
Proof. We know that (14.2) holds for 𝑡 in the open ball 𝐵(0, 1). As 𝛼 ∈ (0, 1) we have
\binom{𝛼}{𝑘} = 𝛼(𝛼 − 1) ⋯ (𝛼 − 𝑘 + 1)∕𝑘! = (−1)^{𝑘−1} |\binom{𝛼}{𝑘}|
for all 𝑘 ≥ 1. Hence, for 𝑡 ∈ (0, 1) the series (14.2) can be rewritten in the form
(1 − 𝑡)^𝛼 = 1 − ∑_{𝑘=1}^{∞} |\binom{𝛼}{𝑘}| 𝑡^𝑘 . (14.3)

Rearranging we obtain
∑_{𝑘=1}^{𝑛} |\binom{𝛼}{𝑘}| 𝑡^𝑘 ≤ ∑_{𝑘=1}^{∞} |\binom{𝛼}{𝑘}| 𝑡^𝑘 = 1 − (1 − 𝑡)^𝛼 ≤ 1
for all 𝑛 ∈ ℕ and all 𝑡 ∈ (0, 1). Letting 𝑡 ↑ 1 we see that
∑_{𝑘=1}^{𝑛} |\binom{𝛼}{𝑘}| ≤ 1
for all 𝑛 ∈ ℕ. These are the partial sums of a series with non-negative terms and hence,
∑_{𝑘=1}^{∞} |\binom{𝛼}{𝑘}| ≤ 1
converges. The Weierstrass 𝑀-Test now implies that (14.2) converges absolutely and
uniformly on 𝐵(0, 1).
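
The partial sums of (14.1) therefore give polynomials approximating |𝑥| uniformly on [−1, 1]. The following sketch (assuming NumPy) evaluates them, generating the coefficients \binom{1∕2}{𝑘} by the recursion \binom{𝛼}{𝑘+1} = \binom{𝛼}{𝑘}(𝛼 − 𝑘)∕(𝑘 + 1); the convergence near 𝑥 = 0, where 𝑡 = 1 − 𝑥² = 1, is slow but visible.

```python
import numpy as np

# Sketch: partial sums of the series (14.1) for |x| on [-1, 1].  The
# coefficients binom(1/2, k) are generated by the recursion
# binom(a, k+1) = binom(a, k) * (a - k) / (k + 1).
def abs_approx(x, terms):
    t = 1.0 - x ** 2                  # |x| = (1 - t)^(1/2) with t in [0, 1]
    coeff, s = 1.0, 0.0
    for k in range(terms):
        s += (-1) ** k * coeff * t ** k
        coeff *= (0.5 - k) / (k + 1)  # binom(1/2, k) -> binom(1/2, k+1)
    return s

x = np.linspace(-1.0, 1.0, 2001)
for n in (5, 50, 500):
    print(n, np.max(np.abs(abs_approx(x, n) - np.abs(x))))  # sup error decreases
```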

We now state and prove the main result of this section about the density of sub-
algebras of 𝐶(𝐾). A sub-algebra is simply a subspace of 𝐶(𝐾) that is also a normed
algebra.

14.2 Theorem (Stone-Weierstrass) Let 𝐾 be a compact set and suppose that 𝐵 is a


sub-algebra of 𝐶(𝐾) over 𝕂 satisfying the following conditions:

(i) 𝐵 contains the constant function 1;

(ii) 𝐵 separates points, that is, for every pair of distinct points 𝑥, 𝑦 ∈ 𝐾 there exist
𝑓 ∈ 𝐵 such that 𝑓 (𝑥) ≠ 𝑓 (𝑦).

(iii) If 𝑓 ∈ 𝐵, then also 𝑓̄ ∈ 𝐵, where 𝑓̄ is the complex conjugate of 𝑓 .

Then, 𝐵 is dense in 𝐶(𝐾).


Proof. (a) We first assume that 𝕂 = ℝ. By Lemma 14.1 there exists a sequence of
polynomials 𝑞𝑛 such that |𝑞𝑛 (𝑠) − |𝑠|| < 1∕𝑛^2 for all 𝑛 ∈ ℕ and all 𝑠 ∈ [−1, 1]. We
rescale the inequality and obtain

|𝑛𝑞𝑛 (𝑠) − |𝑛𝑠|| < 1∕𝑛
for all 𝑛 ∈ ℕ and 𝑠 ∈ [−1, 1]. Note that we can rewrite 𝑛𝑞𝑛 (𝑠) in the form 𝑝𝑛 (𝑛𝑠) for
some polynomial 𝑝𝑛 . Given that 𝑞𝑛 (𝑠) = 𝑎0 + 𝑎1 𝑠 + ⋯ + 𝑎𝑚 𝑠𝑚 , the polynomial
𝑝𝑛 (𝑛𝑠) ∶= 𝑛𝑎0 + (𝑛𝑎1 ∕𝑛)(𝑛𝑠) + ⋯ + (𝑛𝑎𝑚 ∕𝑛^𝑚 )(𝑛𝑠)^𝑚
satisfies 𝑝𝑛 (𝑛𝑠) = 𝑛𝑞𝑛 (𝑠) for all 𝑠 ∈ ℝ. In particular, setting 𝑡 = 𝑛𝑠, we have a sequence
of polynomials so that
|𝑝𝑛 (𝑡) − |𝑡|| < 1∕𝑛 (14.4)

for all 𝑛 ∈ ℕ and all 𝑡 ∈ [−𝑛, 𝑛]. Let 𝑓 ∈ 𝐵. Since 𝐵 is an algebra 𝑝𝑛 ◦𝑓 ∈ 𝐵 for all
𝑛 ∈ ℕ, and

|(𝑝𝑛 ◦𝑓 )(𝑥) − |𝑓 (𝑥)|| < 1∕𝑛 (14.5)
for all 𝑛 ≥ ‖𝑓 ‖∞ < ∞ and all 𝑥 ∈ 𝐾. Hence 𝑝𝑛 ◦𝑓 → |𝑓 | in 𝐶(𝐾), which means that
|𝑓 | lies in the closure 𝐵̄ of 𝐵 for all 𝑓 ∈ 𝐵. As a consequence also

1( ) 1( )
max{𝑓 , 𝑔} = 𝑓 + 𝑔 + |𝑓 − 𝑔| and min{𝑓 , 𝑔} = 𝑓 + 𝑔 − |𝑓 − 𝑔| (14.6)
2 2
are in the closure of 𝐵 for all 𝑓 , 𝑔 ∈ 𝐵.
Fix now 𝑓 ∈ 𝐶(𝐾) and 𝑥, 𝑦 ∈ 𝐾 with 𝑥 ≠ 𝑦. By assumption (ii) there exists 𝑔 ∈ 𝐵
such that 𝑔(𝑥) ≠ 𝑔(𝑦). We set
( ) 𝑔(𝑧) − 𝑔(𝑥)
ℎ𝑥𝑦 (𝑧) ∶= 𝑓 (𝑥)1 + 𝑓 (𝑦) − 𝑓 (𝑥)
𝑔(𝑦) − 𝑔(𝑥)

for all 𝑧 ∈ 𝐾. As 1 ∈ 𝐵 by assumption we have ℎ𝑥𝑦 ∈ 𝐵 and ℎ𝑥𝑦 (𝑥) = 𝑓 (𝑥) and
ℎ𝑥𝑦 (𝑦) = 𝑓 (𝑦). Let now 𝜀 > 0 be fixed. Using that ℎ𝑥𝑦 and 𝑓 are continuous the sets
{ }
𝑈𝑥𝑦 ∶= 𝑧 ∈ 𝐾 ∶ ℎ𝑥𝑦 (𝑧) < 𝑓 (𝑧) + 𝜀
{ }
𝑉𝑥𝑦 ∶= 𝑧 ∈ 𝐾 ∶ ℎ𝑥𝑦 (𝑧) > 𝑓 (𝑧) − 𝜀

are open in 𝐾; see Theorem 5.3. Both sets are non-empty since ℎ𝑥𝑦 (𝑥) = 𝑓 (𝑥) and
ℎ𝑥𝑦 (𝑦) = 𝑓 (𝑦) and thus 𝑥, 𝑦 ∈ 𝑈𝑥𝑦 ∩ 𝑉𝑥𝑦 . The family (𝑈𝑥𝑦 )𝑥,𝑦∈𝐾 is clearly an open
cover of 𝐾. Fix now 𝑦 ∈ 𝐾. By the compactness of 𝐾 there exist finitely many points
⋃𝑚
𝑥𝑗 ∈ 𝐾, 𝑘 = 1, … , 𝑚, so that 𝐾 = 𝑗=1 𝑈𝑥𝑗 𝑦 . Applying (14.6) repeatedly we see that

𝑓𝑦 ∶= min_{𝑗=1,…,𝑚} ℎ𝑥𝑗 𝑦 ∈ 𝐵̄ .

By choice of 𝑥𝑗 it follows that

𝑓𝑦 (𝑧) < 𝑓 (𝑧) + 𝜀 for all 𝑧 ∈ 𝐾;
𝑓𝑦 (𝑧) > 𝑓 (𝑧) − 𝜀 for all 𝑧 ∈ 𝑉𝑦 ∶= ⋂_{𝑗=1}^{𝑚} 𝑉𝑥𝑗 𝑦 .

As 𝑉𝑦 is the intersection of finitely many open sets all containing 𝑦 it is open and non-
empty. Again by compactness of 𝐾 we can choose 𝑦1 , … , 𝑦𝓁 such that 𝐾 = ⋃_{𝑗=1}^{𝓁} 𝑉𝑦𝑗 .
Again applying (14.6) we see that

ℎ ∶= max_{𝑗=1,…,𝓁} 𝑓𝑦𝑗 ∈ 𝐵̄ ,

and that
𝑓 (𝑧) − 𝜀 < ℎ(𝑧) < 𝑓 (𝑧) + 𝜀
for all 𝑧 ∈ 𝐾. In particular ‖𝑓 − ℎ‖∞ ≤ 𝜀. As ℎ ∈ 𝐵̄ and the argument above works
for any 𝜀 > 0, we conclude that 𝑓 ∈ 𝐵̄ as well, that is, 𝐵 is dense in 𝐶(𝐾).

(b) Let us now consider the case of 𝕂 = ℂ. Given 𝑓 ∈ 𝐶(𝐾, ℂ) = 𝐶(𝐾, ℝ) +
𝑖𝐶(𝐾, ℝ), assumption (iii) implies that real and imaginary parts given by
Re 𝑓 = (1∕2)(𝑓 + 𝑓̄) and Im 𝑓 = (1∕(2𝑖))(𝑓 − 𝑓̄)

are both in 𝐵 whenever 𝑓 ∈ 𝐵. Set 𝐵ℝ ∶= 𝐵 ∩ 𝐶(𝐾, ℝ). We can apply the real version proved already
to conclude that 𝐵ℝ is dense in 𝐶(𝐾, ℝ). Hence 𝐵 is dense in 𝐶(𝐾, ℂ).

We can deduce the Weierstrass approximation theorem as a corollary.

14.3 Theorem (Weierstrass approximation theorem) Let 𝐼 be a compact subset of


ℝ. Then the space of polynomials ℝ[𝑥] is dense in 𝐶(𝐼).
Proof. The space of polynomials is clearly a sub-algebra containing 1 (the constant
polynomial). It also contains the identity 𝑝(𝑥) = 𝑥, and therefore separates points. Now
the Stone-Weierstrass theorem 14.2 applies.
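
The proof above is not constructive. One classical explicit scheme realising the Weierstrass approximation theorem on [0, 1] (not the construction used in the proof) are the Bernstein polynomials 𝐵𝑛𝑓(𝑥) = ∑_{𝑘=0}^{𝑛} 𝑓(𝑘∕𝑛) \binom{𝑛}{𝑘} 𝑥^𝑘 (1 − 𝑥)^{𝑛−𝑘}. The following sketch (assuming NumPy) shows the uniform error decreasing for a non-smooth continuous function.

```python
import numpy as np
from math import comb

# Sketch: Bernstein polynomials, one classical explicit scheme realising the
# Weierstrass approximation theorem on [0, 1].
def bernstein(f, n, x):
    k = np.arange(n + 1)
    weights = np.array([comb(n, j) for j in k], dtype=float)
    # sum_k f(k/n) * C(n, k) * x^k * (1 - x)^(n - k), evaluated at each x
    return (f(k / n) * weights * x[:, None] ** k
            * (1 - x[:, None]) ** (n - k)).sum(axis=1)

f = lambda t: np.abs(t - 0.3)          # continuous but not differentiable
x = np.linspace(0.0, 1.0, 501)
for n in (10, 100, 1000):
    print(n, np.max(np.abs(bernstein(f, n, x) - f(x))))   # sup error decreases
```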

14.4 Remark The Weierstrass approximation theorem does not apply to 𝐶(𝐾) if 𝐾
is a compact subset of ℂ with non-empty interior. In fact, polynomials are analytic
functions, and a theorem from complex analysis asserts that the uniform limit of
analytic functions is analytic. Hence it is impossible to approximate an arbitrary
continuous function uniformly by polynomials unless it is analytic. All assumptions of
Theorem 14.2 are fulfilled except for the last, the one on the complex conjugates: if
𝑝(𝑧) is a polynomial, then its complex conjugate 𝑝(𝑧) is no longer a polynomial as 𝑧̄
cannot be written as a polynomial in 𝑧.

Chapter IV
Hilbert Spaces

Hilbert spaces are in some sense a direct generalisation of finite dimensional Euclidean
spaces, where the norm has some geometric meaning and angles can be defined by
means of the dot product. The dot product can be used to define the norm and prove
many of its properties. Hilbert space theory is doing this in a similar fashion, where an
inner product is a map with properties similar to the dot product in Euclidean space. We
will emphasise the analogies and see how useful they are to find proofs in the general
context of inner product spaces.

15 Inner Product Spaces


Throughout we let 𝐸 denote a vector space over 𝕂.

15.1 Definition (Inner product, inner product space) A function (⋅ ∣ ⋅) ∶ 𝐸 × 𝐸 →


𝕂 is called an inner product or scalar product if
(i) (𝑢 ∣ 𝑣) = (𝑣 ∣ 𝑢) for 𝑢, 𝑣 ∈ 𝐸,
(ii) (𝑢 ∣ 𝑢) ≥ 0 for all 𝑢 ∈ 𝐸 and (𝑢 ∣ 𝑢) = 0 if and only if 𝑢 = 0.
(iii) (𝛼𝑢 + 𝛽𝑣 ∣ 𝑤) = 𝛼(𝑢 ∣ 𝑤) + 𝛽(𝑣 ∣ 𝑤) for all 𝑢, 𝑣, 𝑤 ∈ 𝐸 and 𝛼, 𝛽 ∈ 𝕂,
We say that 𝐸 equipped with (⋅ ∣ ⋅) is an inner product space.

15.2 Remark As an immediate consequence of the above definition, inner products


have the following properties:
(a) By property (i) we have (𝑢 ∣ 𝑢) = (𝑢 ∣ 𝑢) and therefore (𝑢 ∣ 𝑢) ∈ ℝ for all 𝑢 ∈ 𝐸.
Hence property (ii) makes sense.
(b) Using (i) and (iii) we have

(𝑢 ∣ 𝛼𝑣 + 𝛽𝑤) = 𝛼(𝑢 ∣ 𝑣) + 𝛽(𝑢 ∣ 𝑤)

for all 𝑢, 𝑣, 𝑤 ∈ 𝐸 and 𝛼, 𝛽 ∈ 𝕂. In particular we have

(𝑢 ∣ 𝜆𝑣) = 𝜆(𝑢 ∣ 𝑣)

for all 𝑢, 𝑣 ∈ 𝐸 and 𝜆 ∈ 𝕂.

Next we give some examples of Banach and Hilbert spaces.

15.3 Examples (a) The space ℂ𝑁 equipped with the Euclidean scalar product given by


(𝑥 ∣ 𝑦) ∶= 𝑥 ⋅ 𝑦 = ∑_{𝑖=1}^{𝑁} 𝑥𝑖 𝑦̄𝑖

for all 𝑥 ∶= (𝑥1 , … , 𝑥𝑁 ), 𝑦 ∶= (𝑦1 , … , 𝑦𝑁 ) ∈ ℂ𝑁 is an inner product space. More


generally, if we take a positive definite Hermitian matrix 𝐴 ∈ ℂ𝑁×𝑁 , then

(𝑥 ∣ 𝑦)𝐴 ∶= 𝑥𝑇 𝐴𝑦̄

defines an inner product on ℂ𝑁 .


(b) An infinite dimensional version is 𝓁2 defined in Section 7.2. An inner product
is defined by
(𝑥 ∣ 𝑦) ∶= ∑_{𝑖=1}^{∞} 𝑥𝑖 𝑦̄𝑖

for all (𝑥𝑖 ), (𝑦𝑖 ) ∈ 𝓁2 . The series converges by Hölder’s inequality (Proposition 7.4).
(c) For 𝑢, 𝑣 ∈ 𝐿2 (𝑋) we let

(𝑢 ∣ 𝑣) ∶= ∫𝑋 𝑢(𝑥)\overline{𝑣(𝑥)} 𝑑𝑥.
By Hölder’s inequality Proposition 7.7 the integral is finite and easily shown to be an
inner product.
The Euclidean norm on ℂ𝑁 is defined by means of the dot product, namely by ‖𝑥‖ =
√(𝑥 ⋅ 𝑥) for 𝑥 ∈ ℂ𝑁 . We make a similar definition in the context of general inner product
spaces.

15.4 Definition (induced norm) If 𝐸 is an inner product space with inner product (⋅ ∣
⋅) we define
‖𝑢‖ ∶= √(𝑢 ∣ 𝑢) (15.1)
for all 𝑢 ∈ 𝐸.
Note that from Remark 15.2 we always have (𝑥 ∣ 𝑥) ≥ 0, so ‖𝑥‖ is well defined. We
call ‖⋅‖ a “norm,” but at the moment we do not know whether it really is a norm in the
proper sense of Definition 6.1. We now want to work towards a proof that ‖⋅‖ is a norm
on 𝐸. On the way we look at some geometric properties of inner products and establish
the Cauchy-Schwarz inequality.
By the algebraic properties of the inner products in a space over ℝ and the definition
of the norm we get
‖𝑣 − 𝑢‖2 = ‖𝑢‖2 + ‖𝑣‖2 − 2(𝑢 ∣ 𝑣).
On the other hand, by the law of cosines we know that for vectors 𝑢, 𝑣 ∈ ℝ2

‖𝑣 − 𝑢‖2 = ‖𝑢‖2 + ‖𝑣‖2 − 2‖𝑢‖‖𝑣‖ cos 𝜃.

if we form a triangle from 𝑢, 𝑣 and 𝑣 − 𝑢 as shown in Figure 15.1. Therefore

Figure 15.1: Triangle formed by 𝑢, 𝑣 and 𝑣 − 𝑢.

𝑢 ⋅ 𝑣 = ‖𝑢‖‖𝑣‖ cos 𝜃

and thus
|𝑢 ⋅ 𝑣| ≤ ‖𝑢‖‖𝑣‖.
The latter inequality has a counterpart in general inner product spaces. We give a proof
inspired by (but not relying on) the geometry in the plane. All arguments used purely
depend on the algebraic properties of an inner product and the definition of the induced
norm.

15.5 Theorem (Cauchy-Schwarz inequality) Let 𝐸 be an inner product space with


inner product (⋅ ∣ ⋅). Then
|(𝑢 ∣ 𝑣)| ≤ ‖𝑢‖‖𝑣‖ (15.2)
for all 𝑢, 𝑣 ∈ 𝐸 with equality if and only if 𝑢 and 𝑣 are linearly dependent.
Proof. If 𝑢 = 0 or 𝑣 = 0 the inequality is obvious and 𝑢 and 𝑣 are linearly dependent.
Hence assume that 𝑢 ≠ 0 and 𝑣 ≠ 0. We can then define
𝑛 = 𝑣 − ((𝑢 ∣ 𝑣)∕‖𝑢‖^2 ) 𝑢.
Note that the vector
𝑝 ∶= ((𝑢 ∣ 𝑣)∕‖𝑢‖^2 ) 𝑢
is the projection of 𝑣 in the direction of 𝑢, and 𝑛 is the projection of 𝑣 orthogonal to 𝑢 as
shown in Figure 15.2. Using the algebraic rules for the inner product and the definition

Figure 15.2: Geometric interpretation of 𝑛.

of the norm we get

0 ≤ ‖𝑛‖^2 = (𝑣 ∣ 𝑣) − 2(𝑢 ∣ 𝑣)(𝑣 ∣ 𝑢)∕‖𝑢‖^2 + (𝑢 ∣ 𝑣)\overline{(𝑢 ∣ 𝑣)}(𝑢 ∣ 𝑢)∕‖𝑢‖^4
= ‖𝑣‖^2 − 2|(𝑢 ∣ 𝑣)|^2 ∕‖𝑢‖^2 + |(𝑢 ∣ 𝑣)|^2 ∕‖𝑢‖^2 = ‖𝑣‖^2 − |(𝑢 ∣ 𝑣)|^2 ∕‖𝑢‖^2 .
Therefore |(𝑢 ∣ 𝑣)|2 ≤ ‖𝑢‖2 ‖𝑣‖2 , and by taking square roots we find (15.2). Clearly
equality holds if and only if ‖𝑛‖ = 0, that is, if
𝑣 = ((𝑢 ∣ 𝑣)∕‖𝑢‖^2 ) 𝑢.
Hence we have equality in (15.2) if and only if 𝑢 and 𝑣 are linearly dependent. This
completes the proof of the theorem.
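
The inequality and its equality case are easy to check numerically for the Euclidean inner product on ℂ^𝑁. A minimal sketch, assuming NumPy:

```python
import numpy as np

# Sketch: the Cauchy-Schwarz inequality (15.2) for the Euclidean inner
# product on C^8, checked on random vectors.
rng = np.random.default_rng(3)
u = rng.standard_normal(8) + 1j * rng.standard_normal(8)
v = rng.standard_normal(8) + 1j * rng.standard_normal(8)

inner = np.vdot(v, u)                      # (u | v) = sum u_i * conj(v_i)
print(abs(inner), "<=", np.linalg.norm(u) * np.linalg.norm(v))

# Equality holds exactly when u and v are linearly dependent:
w = (2 - 3j) * u
print(np.isclose(abs(np.vdot(w, u)), np.linalg.norm(u) * np.linalg.norm(w)))
```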

As a consequence we get a different characterisation of the induced norm.

15.6 Corollary If 𝐸 is an inner product space and ‖⋅‖ the induced norm, then
‖𝑢‖ = sup |(𝑢 ∣ 𝑣)| = sup |(𝑢 ∣ 𝑣)|
‖𝑣‖≤1 ‖𝑣‖=1

for all 𝑢 ∈ 𝐸.
Proof. If 𝑢 = 0 the assertion is obvious, so assume that 𝑢 ≠ 0. If ‖𝑣‖ ≤ 1, then
|(𝑢 ∣ 𝑣)| ≤ ‖𝑢‖‖𝑣‖ = ‖𝑢‖ by the Cauchy-Schwarz inequality. Hence
‖𝑢‖ ≤ sup |(𝑢 ∣ 𝑣)|.
‖𝑣‖≤1

Choosing 𝑣 ∶= 𝑢∕‖𝑢‖ we have |(𝑢 ∣ 𝑣)| = ‖𝑢‖^2 ∕‖𝑢‖ = ‖𝑢‖, so equality holds in
the above inequality. Since the supremum over ‖𝑣‖ = 1 is larger or equal to that over
‖𝑣‖ ≤ 1, the assertion of the corollary follows.

Using the Cauchy-Schwarz inequality we can now prove that ‖⋅‖ is in fact a norm.

15.7 Theorem If 𝐸 is an inner product space, then (15.1) defines a norm on 𝐸.


Proof. By property (ii) of an inner product (see Definition 15.1) we have ‖𝑢‖ =
√(𝑢 ∣ 𝑢) ≥ 0 with equality if and only if 𝑢 = 0. If 𝑢 ∈ 𝐸 and 𝜆 ∈ 𝕂, then

‖𝜆𝑢‖ = √(𝜆𝑢 ∣ 𝜆𝑢) = √(𝜆𝜆̄(𝑢 ∣ 𝑢)) = √(|𝜆|^2 ‖𝑢‖^2 ) = |𝜆|‖𝑢‖

as required. To prove the triangle inequality let 𝑢, 𝑣 ∈ 𝐸. By the algebraic properties


of an inner product and the Cauchy-Schwarz inequality we have

‖𝑢 + 𝑣‖2 = (𝑢 + 𝑣 ∣ 𝑢 + 𝑣) = ‖𝑢‖2 + (𝑢 ∣ 𝑣) + (𝑣 ∣ 𝑢) + ‖𝑣‖2


( )2
≤ ‖𝑢‖2 + 2|(𝑢 ∣ 𝑣)|2 + ‖𝑣‖2 ≤ ‖𝑢‖2 + 2‖𝑢‖2 ‖𝑣‖2 + ‖𝑣‖2 = ‖𝑢‖ + ‖𝑣‖ .
Taking square roots the triangle inequality follows. Hence ‖⋅‖ defines a norm.

As a matter of convention we always consider inner product spaces as normed spaces.

15.8 Convention Since every inner product induces a norm we will always assume that
an inner product space is a normed space with the norm induced by the inner product.

Once we have a norm we can talk about convergence and completeness. Note that not
every inner product space is complete, but those which are play a special role.

15.9 Definition (Hilbert space) An inner product space which is complete with re-
spect to the induced norm is called a Hilbert space.

The inner product is a map on 𝐸 × 𝐸. We show that this map is continuous with respect
to the induced norm.

15.10 Proposition (Continuity of inner product) Let 𝐸 be an inner product space.


Then the inner product (⋅ ∣ ⋅) ∶ 𝐸 × 𝐸 → 𝕂 is continuous with respect to the induced
norm.

Proof. If 𝑥𝑛 → 𝑥 and 𝑦𝑛 → 𝑦 in 𝐸 (with respect to the induced norm), then using the
Cauchy-Schwarz inequality

|(𝑥𝑛 ∣ 𝑦𝑛 ) − (𝑥 ∣ 𝑦)| = |(𝑥𝑛 − 𝑥 ∣ 𝑦𝑛 ) + (𝑥 ∣ 𝑦𝑛 − 𝑦)|


≤ |(𝑥𝑛 − 𝑥 ∣ 𝑦𝑛 )| + |(𝑥 ∣ 𝑦𝑛 − 𝑦)| ≤ ‖𝑥𝑛 − 𝑥‖‖𝑦𝑛 ‖ + ‖𝑥‖‖𝑦𝑛 − 𝑦‖ → 0

as 𝑛 → ∞. Note that we also use the continuity of the norm in the above argument to
conclude that ‖𝑦𝑛 ‖ → ‖𝑦‖ (see Theorem 6.4). Hence the inner product is continuous.

The lengths of the diagonals and edges of a parallelogram in the plane satisfy a rela-
tionship. The norm in an inner product space satisfies a similar relationship, called the
parallelogram identity. The identity will play an essential role in the next section.

15.11 Proposition (Parallelogram identity) Let 𝐸 be an inner product space and ‖⋅‖
the induced norm. Then

‖𝑢 + 𝑣‖2 + ‖𝑢 − 𝑣‖2 = 2‖𝑢‖2 + 2‖𝑣‖2 (15.3)

for all 𝑢, 𝑣 ∈ 𝐸.

Proof. By definition of the induced norm and the properties of an inner product

‖𝑢 + 𝑣‖2 + ‖𝑢 − 𝑣‖2 = ‖𝑢‖2 + (𝑢 ∣ 𝑣) + (𝑣 ∣ 𝑢) + ‖𝑣‖2


+ ‖𝑢‖2 − (𝑢 ∣ 𝑣) − (𝑣 ∣ 𝑢) + ‖𝑣‖2 = 2‖𝑢‖2 + 2‖𝑣‖2

for all 𝑢, 𝑣 ∈ 𝐸 as required.

It turns out that the converse is true as well. More precisely, if a norm satisfies (15.3)
for all 𝑢, 𝑣 ∈ 𝐸, then there is an inner product inducing that norm (see [13, Section I.5]
for a proof).
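
This converse gives a practical test: any norm violating (15.3) cannot be induced by an inner product. The sketch below (assuming NumPy) shows that the Euclidean norm on ℝ² satisfies the identity, while the maximum norm does not.

```python
import numpy as np

# Sketch: testing the parallelogram identity (15.3).  The Euclidean norm
# satisfies it, while the maximum norm on R^2 does not, so |.|_inf cannot be
# induced by an inner product.
u, v = np.array([1.0, 0.0]), np.array([0.0, 1.0])

def defect(norm):
    return (norm(u + v) ** 2 + norm(u - v) ** 2
            - 2 * norm(u) ** 2 - 2 * norm(v) ** 2)

print(defect(np.linalg.norm))                        # 0 for the Euclidean norm
print(defect(lambda x: np.linalg.norm(x, np.inf)))   # 1 + 1 - 2 - 2 = -2
```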

16 Projections and Orthogonal Complements
In this section we discuss the existence and properties of “nearest point projections”
from a point onto a set, that is, the points that minimise the distance from a closed set
to a given point.

16.1 Definition (Projection) Let 𝐸 be a normed space and 𝑀 a non-empty closed


subset. We define the set of projections of 𝑥 onto 𝑀 by

𝑃𝑀 (𝑥) ∶= {𝑚 ∈ 𝑀 ∶ ‖𝑥 − 𝑚‖ = dist(𝑥, 𝑀)}.

The meaning of 𝑃𝑀 (𝑥) is illustrated in Figure 16.1 for the Euclidean norm in the plane.
If the set is not convex, 𝑃𝑀 (𝑥) can consist of several points, if it is convex, it is precisely
one.

Figure 16.1: The set of nearest point projections 𝑃𝑀 (𝑥).

We now look at some example. First we look at subsets of ℝ𝑁 , and show that then
𝑃𝑀 (𝑥) is never empty.

16.2 Example Suppose that 𝑀 ⊂ ℝ𝑁 is non-empty and closed and 𝑥 ∈ ℝ𝑁 . If we fix
𝛼 > dist(𝑥, 𝑀), then the set 𝐾 ∶= 𝑀 ∩ 𝐵(𝑥, 𝛼) is closed and bounded,
and dist(𝑥, 𝑀) = dist(𝑥, 𝐾). We know from Proposition 5.5 that the distance function
𝑥 → dist(𝑥, 𝐾) is continuous. Since 𝐾 is compact by the Heine-Borel theorem, the
continuous map 𝑦 → 𝑑(𝑥, 𝑦) attains a minimum on 𝐾. Hence there exists 𝑦 ∈ 𝐾 such
that 𝑑(𝑥, 𝑦) = inf_{𝑧∈𝐾} 𝑑(𝑥, 𝑧) = dist(𝑥, 𝐾) = dist(𝑥, 𝑀), which means that 𝑦 ∈ 𝑃𝑀 (𝑥).
Hence 𝑃𝑀 (𝑥) is non-empty if 𝑀 ⊂ ℝ𝑁 . The same applies to any finite dimensional
space.
The argument to prove that 𝑃𝑀 (𝑥) is non-empty used above very much depends on the
set 𝐾 to be compact. In the following example we show that 𝑃𝑀 (𝑥) can be empty. It is
evident that 𝑃𝑀 (𝑥) can be empty if 𝐸 is not complete. It may be more surprising and
counter intuitive that even if 𝐸 is complete, 𝑃𝑀 (𝑥) may be empty! There is no mystery
about this, it just shows how much our intuition relies on bounded and closed sets to be
compact.

16.3 Example Let 𝐸 ∶= 𝐶([0, 1]) with norm ‖𝑢‖ ∶= ‖𝑢‖∞ + ‖𝑢‖1 . We claim that 𝐸
is complete. Clearly ‖𝑢‖∞ ≤ ‖𝑢‖ for all 𝑢 ∈ 𝐸. Also
1 1
‖𝑢‖1 = |𝑢(𝑥)| 𝑑𝑥 ≤ ‖𝑢‖∞ 1 𝑑𝑥 = ‖𝑢‖∞
∫0 ∫0

for all 𝑢 ∈ 𝐸. Hence ‖⋅‖ is equivalent to the supremum norm ‖⋅‖∞ , so convergence
with respect to ‖⋅‖ is uniform convergence. By Section 7.4 and Lemma 9.4 the space
𝐸 is complete with respect to the norm ‖⋅‖. Now look at the subspace

𝑀 ∶= {𝑢 ∈ 𝐶([0, 1]) ∶ 𝑢(0) = 0}.

Since uniform convergence implies pointwise convergence 𝑀 is closed. (Note that by


closed we mean closed as a subset of the metric space 𝐸, not algebraically closed.)
Denote by 𝟏 the constant function with value 1. If 𝑢 ∈ 𝑀, then

‖𝟏 − 𝑢‖ ≥ ‖𝟏 − 𝑢‖∞ ≥ |1 − 𝑢(0)| = 1, (16.1)



so dist(𝟏, 𝑀) ≥ 1. If we set 𝑢𝑛 (𝑥) ∶= 𝑥^{1∕𝑛} , then 0 ≤ 1 − 𝑢𝑛 (𝑥) → 0 for all 𝑥 ∈ (0, 1], so
by the dominated convergence theorem
1 1
‖𝟏 − 𝑢𝑛 ‖ = ‖𝟏 − 𝑢𝑛 ‖∞ + 1 − 𝑢𝑛 (𝑥) 𝑑𝑥 ≥ 1 + 1 − 𝑢𝑛 (𝑥) 𝑑𝑥 → 1
∫0 ∫0

as 𝑛 → ∞. Hence dist(𝟏, 𝑀) = 1. We show now that 𝑃𝑀 (𝟏) = ∅, that is, there is no


nearest element in 𝑀 from 𝟏. If 𝑢 ∈ 𝑀, then 𝑢(0) = 0 and thus by continuity of 𝑢
there exists an interval [0, 𝛿] such that |𝟏 − 𝑢(𝑥)| > 1∕2 for all 𝑥 ∈ [0, 𝛿]. Hence for all
𝑢 ∈ 𝑀 we have ‖𝟏 − 𝑢‖1 > 0. Using (16.1) we have

‖𝟏 − 𝑢‖ = ‖𝟏 − 𝑢‖∞ + ‖𝟏 − 𝑢‖1 ≥ 1 + ‖𝟏 − 𝑢‖1 > 1

for all 𝑢 ∈ 𝑀, so 𝑃𝑀 (𝟏) is empty.

In the light of the above example it is a non-trivial fact that 𝑃𝑀 (𝑥) is not empty in a
Hilbert space, at least if 𝑀 is closed and convex. By a convex set, as usual, we mean a
subset 𝑀 such that 𝑡𝑥 + (1 − 𝑡)𝑦 ∈ 𝑀 for all 𝑥, 𝑦 ∈ 𝑀 and 𝑡 ∈ [0, 1]. In other words,
if 𝑥, 𝑦 are in 𝑀, so is the line segment connecting them. The essential ingredient in the
proof is the parallelogram identity from Proposition 15.11.

16.4 Theorem (Existence and uniqueness of projections) Let 𝐻 be a Hilbert space


and 𝑀 ⊂ 𝐻 non-empty, closed and convex. Then 𝑃𝑀 (𝑥) contains precisely one ele-
ment which we also denote by 𝑃𝑀 (𝑥).

Proof. Let 𝑀 ⊂ 𝐻 be non-empty, closed and convex. If 𝑥 ∈ 𝑀, then 𝑃𝑀 (𝑥) = 𝑥, so


there is existence and also uniqueness of an element of 𝑃𝑀 (𝑥). Hence we assume that
𝑥 ∉ 𝑀 and set
𝛼 ∶= dist(𝑥, 𝑀) = inf ‖𝑥 − 𝑚‖.
𝑚∈𝑀

Since 𝑀 is closed and 𝑥 ∉ 𝑀 we have 𝛼 > 0. From the parallelogram identity Propo-
sition 15.11 we get

‖𝑚1 − 𝑚2 ‖2 = ‖(𝑚1 − 𝑥) − (𝑚2 − 𝑥)‖2


= 2‖𝑚1 − 𝑥‖^2 + 2‖𝑚2 − 𝑥‖^2 − ‖(𝑚1 − 𝑥) + (𝑚2 − 𝑥)‖^2 .

If 𝑚1 , 𝑚2 ∈ 𝑀, then ‖𝑚𝑖 − 𝑥‖ ≥ 𝛼 for 𝑖 = 1, 2 and by the convexity of 𝑀 we have
(𝑚1 + 𝑚2 )∕2 ∈ 𝑀. Hence

‖(𝑚1 − 𝑥) + (𝑚2 − 𝑥)‖ = ‖𝑚1 + 𝑚2 − 2𝑥‖ = 2‖(𝑚1 + 𝑚2 )∕2 − 𝑥‖ ≥ 2𝛼.
and by using the above

‖𝑚1 − 𝑚2 ‖^2 ≤ 2‖𝑚1 − 𝑥‖^2 + 2‖𝑚2 − 𝑥‖^2 − 4𝛼^2 . (16.2)

for all 𝑚1 , 𝑚2 ∈ 𝑀. We can now prove uniqueness. Given 𝑚1 , 𝑚2 ∈ 𝑃𝑀 (𝑥) we have by


definition ‖𝑚𝑖 − 𝑥‖ = 𝛼 (𝑖 = 1, 2), and so by (16.2)

‖𝑚1 − 𝑚2 ‖2 ≤ 4𝛼 2 − 4𝛼 2 = 0.

Hence ‖𝑚1 − 𝑚2 ‖ = 0, that is, 𝑚1 = 𝑚2 proving uniqueness. As a second step we


prove the existence of an element in 𝑃𝑀 (𝑥). By definition of an infimum there exists a
sequence (𝑥𝑛 ) in 𝑀 such that

‖𝑥𝑛 − 𝑥‖ → 𝛼 ∶= dist(𝑥, 𝑀).

This obviously implies that (𝑥𝑛 ) a bounded sequence in 𝐻, but since 𝐻 is not necessar-
ily finite dimensional, we cannot conclude it is converging without further investigation.
We show that (𝑥𝑛 ) is a Cauchy sequence and therefore converges by the completeness
of 𝐻. Fix now 𝜀 > 0. Since 𝛼 ≤ ‖𝑥𝑛 − 𝑥‖ → 𝛼 there exists 𝑛0 ∈ ℕ such that

𝛼 ≤ ‖𝑥𝑛 − 𝑥‖ ≤ 𝛼 + 𝜀

for all 𝑛 > 𝑛0 . Hence using (16.2)

‖𝑥𝑘 − 𝑥𝑛 ‖^2 ≤ 2‖𝑥𝑘 − 𝑥‖^2 + 2‖𝑥𝑛 − 𝑥‖^2 − 4𝛼^2 ≤ 4(𝛼 + 𝜀)^2 − 4𝛼^2 = 4(2𝛼 + 𝜀)𝜀

for all 𝑛, 𝑘 > 𝑛0 . Hence (𝑥𝑛 ) is a Cauchy sequence as claimed, and so it converges to
some 𝑚 ∈ 𝐻. Since 𝑀 is closed 𝑚 ∈ 𝑀, and by the continuity of the norm ‖𝑥 − 𝑚‖ =
lim_{𝑛→∞} ‖𝑥 − 𝑥𝑛 ‖ = 𝛼, so 𝑚 ∈ 𝑃𝑀 (𝑥).

We next derive a geometric characterisation of the projection onto a convex set. If we


look at a convex set 𝑀 in the plane and the nearest point projection 𝑚𝑥 from a point
𝑥 onto 𝑀, then we expect the angle between 𝑥 − 𝑚𝑥 and 𝑚𝑥 − 𝑚 to be larger or equal
than 𝜋∕2. This means that the inner product (𝑥 − 𝑚𝑥 ∣ 𝑚𝑥 − 𝑚) ≤ 0. We also expect
the converse, that is, if the angle is larger or equal to 𝜋∕2 for all 𝑚 ∈ 𝑀, then 𝑚𝑥 is
the projection. Look at Figure 16.2 for an illustration. A similar fact remains true in
an arbitrary Hilbert space, except that we have to be careful in a complex Hilbert space
because (𝑥 − 𝑚𝑥 ∣ 𝑚𝑥 − 𝑚) does not need to be real.

16.5 Theorem Suppose 𝐻 is a Hilbert space and 𝑀 ⊂ 𝐻 a non-empty closed and


convex subset. Then for a point 𝑚𝑥 ∈ 𝑀 the following assertions are equivalent:

(i) 𝑚𝑥 = 𝑃𝑀 (𝑥);

(ii) Re(𝑚 − 𝑚𝑥 ∣ 𝑥 − 𝑚𝑥 ) ≤ 0 for all 𝑚 ∈ 𝑀.

Figure 16.2: Projection onto a convex set

Proof. By a translation we can assume that 𝑚𝑥 = 0. Assuming that 𝑚𝑥 = 0 = 𝑃𝑀 (𝑥)


we prove that Re(𝑚 ∣ 𝑥) ≤ 0 for all 𝑚 ∈ 𝑀. By definition of 𝑃𝑀 (𝑥) we have ‖𝑥‖ =
‖𝑥 − 0‖ = inf 𝑚∈𝑀 ‖𝑥 − 𝑚‖, so ‖𝑥‖ ≤ ‖𝑥 − 𝑚‖ for all 𝑚 ∈ 𝑀. As 0, 𝑚 ∈ 𝑀 and 𝑀 is
convex we have

‖𝑥‖2 ≤ ‖𝑥 − 𝑡𝑚‖2 = ‖𝑥‖2 + 𝑡2 ‖𝑚‖2 − 2𝑡 Re(𝑚 ∣ 𝑥)

for all 𝑚 ∈ 𝑀 and 𝑡 ∈ (0, 1]. Hence


Re(𝑚 ∣ 𝑥) ≤ (𝑡∕2)‖𝑚‖^2
for all 𝑚 ∈ 𝑀 and 𝑡 ∈ (0, 1]. If we fix 𝑚 ∈ 𝑀 and let 𝑡 go to zero, then Re(𝑚 ∣ 𝑥) ≤ 0
as claimed. Now assume that Re(𝑚 ∣ 𝑥) ≤ 0 for all 𝑚 ∈ 𝑀 and that 0 ∈ 𝑀. We want
to show that 0 = 𝑃𝑀 (𝑥). If 𝑚 ∈ 𝑀 we then have

‖𝑥 − 𝑚‖2 = ‖𝑥‖2 + ‖𝑚‖2 − 2 Re(𝑥 ∣ 𝑚) ≥ ‖𝑥‖2

since Re(𝑚 ∣ 𝑥) ≤ 0 by assumption. As 0 ∈ 𝑀 we conclude that

‖𝑥‖ = inf ‖𝑥 − 𝑚‖,


𝑚∈𝑀

so 0 = 𝑃𝑀 (𝑥) as claimed.

Every vector subspace 𝑀 of a Hilbert space is obviously convex. If it is closed, then


the above characterisation of the projection can be applied. Due to the linear structure
of 𝑀 it simplifies and the projection turns out to be linear. From Figure 16.3 we expect
that (𝑥 − 𝑚𝑥 ∣ 𝑚) = 0 for all 𝑚 ∈ 𝑀 if 𝑚𝑥 is the projection of 𝑥 onto 𝑀 and vice versa.
The corollary also explains why 𝑃𝑀 is called the orthogonal projection onto 𝑀.

16.6 Corollary Let 𝑀 be a closed subspace of the Hilbert space 𝐻. Then 𝑚𝑥 = 𝑃𝑀 (𝑥)
if and only if 𝑚𝑥 ∈ 𝑀 and (𝑥 − 𝑚𝑥 ∣ 𝑚) = 0 for all 𝑚 ∈ 𝑀. Moreover, 𝑃𝑀 ∶ 𝐻 → 𝑀
is linear.
Proof. By the above theorem 𝑚𝑥 = 𝑃𝑀 (𝑥) if and only if Re(𝑚𝑥 − 𝑥 ∣ 𝑚 − 𝑚𝑥 ) ≤ 0
for all 𝑚 ∈ 𝑀. Since 𝑀 is a subspace 𝑚 + 𝑚𝑥 ∈ 𝑀 for all 𝑚 ∈ 𝑀, so using 𝑚 + 𝑚𝑥
instead of 𝑚 we get that

Re(𝑚𝑥 − 𝑥 ∣ (𝑚 + 𝑚𝑥 ) − 𝑚𝑥 ) = Re(𝑚𝑥 − 𝑥 ∣ 𝑚) ≤ 0

Figure 16.3: Projection onto a closed subspace

for all 𝑚 ∈ 𝑀. Replacing 𝑚 by −𝑚 we get − Re(𝑚𝑥 − 𝑥 ∣ 𝑚) = Re(𝑚𝑥 − 𝑥 ∣ −𝑚) ≤ 0,


so we must have Re(𝑚𝑥 − 𝑥 ∣ 𝑚) = 0 for all 𝑚 ∈ 𝑀. Similarly, replacing 𝑚 = ±𝑖𝑚 if
𝐻 is a complex Hilbert space we have

± Im(𝑚𝑥 − 𝑥 ∣ 𝑖𝑚) = Re(𝑚𝑥 − 𝑥 ∣ ±𝑚) ≤ 0,

so also Im(𝑚𝑥 − 𝑥 ∣ 𝑚) = 0 for all 𝑚 ∈ 𝑀. Hence (𝑚𝑥 − 𝑥 ∣ 𝑚) = 0 for all 𝑚 ∈ 𝑀 as


claimed. It remains to show that 𝑃𝑀 is linear. If 𝑥, 𝑦 ∈ 𝐻 and 𝜆, 𝜇 ∈ ℝ, then by what
we just proved

0 = 𝜆(𝑥 − 𝑃𝑀 (𝑥) ∣ 𝑚) + 𝜇(𝑥 − 𝑃𝑀 (𝑦) ∣ 𝑚) = (𝜆𝑥 + 𝜇𝑦 − (𝜆𝑃𝑀 (𝑥) + 𝜇𝑃𝑀 (𝑦)) ∣ 𝑚)

for all 𝑚 ∈ 𝑀. Hence again by what we proved 𝑃𝑀 (𝜆𝑥 + 𝜇𝑦) = 𝜆𝑃𝑀 (𝑥) + 𝜇𝑃𝑀 (𝑦),
showing that 𝑃𝑀 is linear.
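
In ℝ^𝑁 with the Euclidean inner product the projection onto the column space of a matrix can be computed by least squares, and the characterisation above then predicts that the residual is orthogonal to that subspace. A minimal sketch, assuming NumPy:

```python
import numpy as np

# Sketch: the orthogonal projection onto M = column space of A in R^5,
# computed via least squares.  Corollary 16.6 predicts (x - P_M x | m) = 0
# for every m in M, i.e. A^T (x - P_M x) = 0.
rng = np.random.default_rng(4)
A = rng.standard_normal((5, 2))            # columns span a 2-dimensional M
x = rng.standard_normal(5)

coeff, *_ = np.linalg.lstsq(A, x, rcond=None)
p = A @ coeff                              # p = P_M(x)

print(np.allclose(A.T @ (x - p), 0.0))     # x - P_M(x) is orthogonal to M
# p is the nearest point: no other element of M comes closer than p does.
print(np.linalg.norm(x - p) <= np.linalg.norm(x - A @ rng.standard_normal(2)))
```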

We next connect the projections discussed above with the notion of orthogonal com-
plements.

16.7 Definition (Orthogonal complement) For an arbitrary non-empty subset 𝑀 of


an inner product space 𝐻 we set

𝑀 ⟂ ∶= {𝑥 ∈ 𝐻 ∶ (𝑥 ∣ 𝑚) = 0 for all 𝑚 ∈ 𝑀}.

We call 𝑀 ⟂ the orthogonal complement of 𝑀 in 𝐻.


We now establish some elementary but very useful properties of orthogonal comple-
ments.

16.8 Lemma Suppose 𝑀 is a non-empty subset of the inner product space 𝐻. Then

𝑀 ⟂ is a closed subspace of 𝐻 and 𝑀 ⟂ = (𝑀̄ )⟂ = (span 𝑀)⟂ = (\overline{span 𝑀})⟂ .
Proof. If 𝑥, 𝑦 ∈ 𝑀 ⟂ and 𝜆, 𝜇 ∈ 𝕂, then

(𝜆𝑥 + 𝜇𝑦 ∣ 𝑚) = 𝜆(𝑥 ∣ 𝑚) + 𝜇(𝑦 ∣ 𝑚) = 0,

for all 𝑚 ∈ 𝑀, so 𝑀 ⟂ is a subspace of 𝐻. If 𝑥 is from the closure of 𝑀 ⟂ , then there


exist 𝑥𝑛 ∈ 𝑀 ⟂ with 𝑥𝑛 → 𝑥. By the continuity of the inner product

(𝑥 ∣ 𝑚) = lim (𝑥𝑛 ∣ 𝑚) = lim 0 = 0


𝑛→∞ 𝑛→∞

for all 𝑚 ∈ 𝑀. Hence 𝑥 ∈ 𝑀 ⟂ , showing that 𝑀 ⟂ is closed. We next show that 𝑀 ⟂ =
(𝑀̄ )⟂ . Since 𝑀 ⊂ 𝑀̄ we have (𝑀̄ )⟂ ⊂ 𝑀 ⟂ by definition of the orthogonal complement.
Fix 𝑥 ∈ 𝑀 ⟂ and 𝑚 ∈ 𝑀̄ . Then there exist 𝑚𝑛 ∈ 𝑀 with 𝑚𝑛 → 𝑚. By the continuity
of the inner product
(𝑥 ∣ 𝑚) = lim (𝑥 ∣ 𝑚𝑛 ) = lim 0 = 0.
𝑛→∞ 𝑛→∞
Hence 𝑥 ∈ (𝑀̄ )⟂ and thus (𝑀̄ )⟂ ⊃ 𝑀 ⟂ , showing that (𝑀̄ )⟂ = 𝑀 ⟂ . Next we show that
𝑀 ⟂ = (span 𝑀)⟂ . Clearly (span 𝑀)⟂ ⊂ 𝑀 ⟂ since 𝑀 ⊂ span 𝑀. Suppose now that
𝑥 ∈ 𝑀 ⟂ and 𝑚 ∈ span 𝑀. Then there exist 𝑚𝑖 ∈ 𝑀 and 𝜆𝑖 ∈ 𝕂, 𝑖 = 1, … , 𝑛, such
∑𝑛
that 𝑚 = 𝑖=1 𝜆𝑖 𝑚𝑖 . Hence


𝑛
(𝑥 ∣ 𝑚) = 𝜆𝑖 (𝑥 ∣ 𝑚𝑖 ) = 0,
𝑖=1

and thus 𝑥 ∈ (span 𝑀)⟂ . Therefore (span 𝑀)⟂ ⊃ 𝑀 ⟂ and so (span 𝑀)⟂ = 𝑀 ⟂
as claimed. The last assertion of the lemma follows by what we have proved above.
Indeed, applying the identity 𝑀 ⟂ = (𝑀̄ )⟂ to the subspace span 𝑀 gives (span 𝑀)⟂ =
(\overline{span 𝑀})⟂ .

We are now ready to prove the main result on orthogonal projections. It is one of the
most important and useful facts on Hilbert spaces.

16.9 Theorem (orthogonal complements) Suppose that 𝑀 is a closed subspace of


the Hilbert space 𝐻. Then

(i) 𝐻 = 𝑀 ⊕ 𝑀 ⟂ ;

(ii) 𝑃𝑀 is the projection of 𝐻 onto 𝑀 parallel to 𝑀 ⟂ (that is, 𝑃𝑀 (𝑀 ⟂ ) = {0})

(iii) 𝑃𝑀 ∈ (𝐻, 𝑀) with ‖𝑃𝑀 ‖(𝐻,𝑀) ≤ 1.

Proof. (i) By Corollary 16.6 we have (𝑥 − 𝑃𝑀 (𝑥) ∣ 𝑚) = 0 for all 𝑥 ∈ 𝐻 and 𝑚 ∈ 𝑀.


Hence 𝑥 − 𝑃𝑀 (𝑥) ∈ 𝑀 ⟂ for all 𝑥 ∈ 𝐻 and therefore

𝑥 = 𝑃𝑀 (𝑥) + (𝐼 − 𝑃𝑀 )(𝑥) ∈ 𝑀 + 𝑀 ⟂ ,

and thus 𝐻 = 𝑀 + 𝑀 ⟂ . If 𝑥 ∈ 𝑀 ∩ 𝑀 ⟂ , then (𝑥 ∣ 𝑥) = 0, so 𝑥 = 0, showing that


𝐻 = 𝑀 ⊕ 𝑀 ⟂ is a direct sum.
(ii) By Corollary 16.6 the map 𝑃𝑀 is linear. Since 𝑃𝑀 (𝑥) = 𝑥 for 𝑥 ∈ 𝑀 we have
𝑃𝑀2
= 𝑃𝑀 and 𝑃𝑀 (𝑀 ⟂ ) = {0}. Hence 𝑃𝑀 is a projection.
(iii) By (i) we have (𝑃𝑀 (𝑥) ∣ 𝑥 − 𝑃𝑀 (𝑥)) = 0 and so

‖𝑥‖2 = ‖𝑃𝑀 (𝑥) + (𝐼 − 𝑃𝑀 )(𝑥)‖2


= ‖𝑃𝑀 (𝑥)‖2 + ‖𝑥 − 𝑃𝑀 (𝑥)‖2 + 2 Re(𝑃𝑀 (𝑥) ∣ 𝑥 − 𝑃𝑀 (𝑥)) ≥ ‖𝑃𝑀 (𝑥)‖2

for all 𝑥 ∈ 𝐻. Hence 𝑃𝑀 ∈ (𝐻, 𝑀) with ‖𝑃𝑀 ‖(𝐻,𝑀) ≤ 1 as claimed.

16.10 Remark The above theorem in particular implies that for every closed subspace
𝑀 of a Hilbert space 𝐻 there exists a closed subspace 𝑁 such that 𝐻 = 𝑀 ⊕ 𝑁.
We call 𝑀 a complemented subspace. The proof used the existence of a projection.
We know from Example 16.3 that projections onto closed subspaces do not necessarily
exist in a Banach space, so one may not expect every subspace of a Banach space to
be complemented. A rather recent result [9] shows that if every closed subspace of a
Banach space is complemented, then its norm is equivalent to a norm induced by an
inner product! Hence the above theorem provides a unique property of Hilbert spaces.
The above theorem can be used to prove some properties of orthogonal complements.
The first is a very convenient criterion for a subspace of a Hilbert space to be dense.

16.11 Corollary A subspace 𝑀 of a Hilbert space 𝐻 is dense in 𝐻 if and only if


𝑀 ⟂ = {0}.

Proof. Since 𝑀 ⟂ = (𝑀̄ )⟂ by Lemma 16.8 it follows from Theorem 16.9 that

𝐻 = 𝑀̄ ⊕ 𝑀 ⟂

for every subspace 𝑀 of 𝐻. Hence if 𝑀 is dense in 𝐻, then 𝑀̄ = 𝐻 and so 𝑀 ⟂ = {0}.
Conversely, if 𝑀 ⟂ = {0}, then 𝑀̄ = 𝐻, that is, 𝑀 is dense in 𝐻.

We finally use Theorem 16.9 to get a characterisation of the second orthogonal com-
plement of a set.

16.12 Corollary Suppose 𝑀 is a non-empty subset of the Hilbert space 𝐻. Then

𝑀 ⟂⟂ ∶= (𝑀 ⟂ )⟂ = \overline{span 𝑀}.

Proof. By Lemma 16.8 we have 𝑀 ⟂ = (span 𝑀)⟂ = (\overline{span 𝑀})⟂ . Hence by replacing
𝑀 by \overline{span 𝑀} we can assume without loss of generality that 𝑀 is a closed subspace of
𝐻. We have to show that 𝑀 = 𝑀 ⟂⟂ . Since (𝑥 ∣ 𝑚) = 0 for all 𝑥 ∈ 𝑀 and 𝑚 ∈ 𝑀 ⟂ we
have 𝑀 ⊂ 𝑀 ⟂⟂ . Set now 𝑁 ∶= 𝑀 ⟂ ∩ 𝑀 ⟂⟂ . Since 𝑀 is a closed subspace it follows
from Theorem 16.9 that 𝑀 ⟂⟂ = 𝑀 ⊕ 𝑁. By definition 𝑁 ⊂ 𝑀 ⟂ ∩ 𝑀 ⟂⟂ = {0}, so
𝑁 = {0}, showing that 𝑀 = 𝑀 ⟂⟂ .

17 Orthogonal Systems
In ℝ𝑁 , the standard basis or any other basis of mutually orthogonal vectors of length
one play a special role. We look at generalisations of such bases. Recall that two 𝑢, 𝑣
of an inner product space are called orthogonal if (𝑢 ∣ 𝑣) = 0.

17.1 Definition (orthogonal systems) Let 𝐻 be an inner product space with inner
product (⋅ ∣ ⋅) and induced norm ‖⋅‖. Let 𝑀 ⊂ 𝐻 be a non-empty subset.

(i) 𝑀 is called an orthogonal system if (𝑢 ∣ 𝑣) = 0 for all 𝑢, 𝑣 ∈ 𝑀 with 𝑢 ≠ 𝑣.

(ii) 𝑀 is called an orthonormal system if it is an orthogonal system and ‖𝑢‖ = 1 for
all 𝑢 ∈ 𝑀.
(iii) 𝑀 is called a complete orthonormal system or orthonormal basis of 𝐻 if it is an
orthogonal system and span 𝑀 = 𝐻.
Note that the notion of orthogonal system depends on the particular inner product, so
we always have to say with respect to which inner product it is orthogonal.

17.2 Example (a) The standard basis in 𝕂𝑁 is a complete orthonormal system in 𝕂𝑁


with respect to the usual dot product.
(b) The set
𝑀 ∶= {(2𝜋)−1∕2 𝑒𝑖𝑛𝑥 ∶ 𝑛 ∈ ℤ}
forms an orthonormal system in 𝐿2 ((−𝜋, 𝜋), ℂ). Indeed,
‖(2𝜋)^{−1∕2} 𝑒^{𝑖𝑛𝑥} ‖_2^2 = (1∕2𝜋) ∫_{−𝜋}^{𝜋} 𝑒^{𝑖𝑛𝑥} 𝑒^{−𝑖𝑛𝑥} 𝑑𝑥 = (1∕2𝜋) ∫_{−𝜋}^{𝜋} 1 𝑑𝑥 = 1

for all 𝑛 ∈ ℤ. Moreover, if 𝑛 ≠ 𝑚, then
((2𝜋)^{−1∕2} 𝑒^{𝑖𝑛𝑥} ∣ (2𝜋)^{−1∕2} 𝑒^{𝑖𝑚𝑥} ) = (1∕2𝜋) ∫_{−𝜋}^{𝜋} 𝑒^{𝑖𝑛𝑥} 𝑒^{−𝑖𝑚𝑥} 𝑑𝑥
= (1∕2𝜋) ∫_{−𝜋}^{𝜋} 𝑒^{𝑖(𝑛−𝑚)𝑥} 𝑑𝑥 = [𝑒^{𝑖(𝑛−𝑚)𝑥} ∕(2𝜋𝑖(𝑛 − 𝑚))]_{𝑥=−𝜋}^{𝑥=𝜋} = 0
since the exponential function is 2𝜋𝑖-periodic. The system is in fact complete. To
see this we use the Stone-Weierstrass Theorem 14.2. Clearly 𝑒𝑖𝑛𝑥 are continuous func-
tions. The span of these functions is a sub-algebra of 𝐶([−𝜋, 𝜋]) because the products
𝑒^{𝑖𝑛𝑥} 𝑒^{𝑖𝑚𝑥} = 𝑒^{𝑖(𝑛+𝑚)𝑥} are in the system for all 𝑛, 𝑚 ∈ ℤ. For 𝑛 = 0 we have the constant
function. Also the family separates points. Hence the span of these functions is dense in
𝐶([−𝜋, 𝜋]), which in turn is dense in 𝐿2 ((−𝜋, 𝜋)).
(c) The set of real valued functions
1∕√(2𝜋), (1∕√𝜋) cos 𝑛𝑥, (1∕√𝜋) sin 𝑛𝑥, 𝑛 ∈ ℕ ⧵ {0}
forms an orthonormal system on 𝐿2 ((−𝜋, 𝜋), ℝ). Again it turns out that this system is
complete. The proof of the orthogonality is a consequence of the trigonometric identi-
ties
sin 𝑚𝑥 sin 𝑛𝑥 = (1∕2)(cos(𝑚 − 𝑛)𝑥 − cos(𝑚 + 𝑛)𝑥)
cos 𝑚𝑥 cos 𝑛𝑥 = (1∕2)(cos(𝑚 − 𝑛)𝑥 + cos(𝑚 + 𝑛)𝑥)
sin 𝑚𝑥 cos 𝑛𝑥 = (1∕2)(sin(𝑚 − 𝑛)𝑥 + sin(𝑚 + 𝑛)𝑥)

which easily follow from using the standard addition theorems for sin(𝑚 ± 𝑛)𝑥 and
cos(𝑚 ± 𝑛)𝑥
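
The orthogonality relations in this example can also be verified numerically. The following sketch (assuming NumPy) computes the Gram matrix of the first few functions of the real trigonometric system, with the integrals over (−𝜋, 𝜋) replaced by Riemann sums on an equally spaced periodic grid; up to rounding it is the identity matrix.

```python
import numpy as np

# Sketch: Gram matrix of the first few functions of the real trigonometric
# system in Example 17.2(c), with the integrals over (-pi, pi) replaced by
# Riemann sums on an equally spaced periodic grid.
x = np.linspace(-np.pi, np.pi, 4096, endpoint=False)
dx = 2 * np.pi / x.size

funcs = [np.full_like(x, 1.0 / np.sqrt(2 * np.pi))]
for n in range(1, 4):
    funcs.append(np.cos(n * x) / np.sqrt(np.pi))
    funcs.append(np.sin(n * x) / np.sqrt(np.pi))

F = np.array(funcs)                 # rows are the sampled functions
gram = dx * F @ F.T                 # approximations of (f | g) for all pairs
print(np.round(gram, 10))           # the 7 x 7 identity matrix
```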

We next show that orthogonal systems are linearly independent if we remove the zero
element. Recall that by definition an infinite set is linearly independent if every finite
subset is linearly independent. We also prove a generalisation of Pythagoras’ theorem.

17.3 Lemma (Pythagoras theorem) Suppose that 𝐻 is an inner product space and
𝑀 an orthogonal system in 𝐻. Then the following assertions are true:
(i) 𝑀 ⧵ {0} is linearly independent.
(ii) If (𝑥𝑛 ) is a sequence in 𝑀 with 𝑥𝑛 ≠ 𝑥𝑚 for 𝑛 ≠ 𝑚 and 𝐻 is complete, then
∑_{𝑘=0}^{∞} 𝑥𝑘 converges if and only if ∑_{𝑘=0}^{∞} ‖𝑥𝑘 ‖^2 converges. In that case

‖∑_{𝑘=0}^{∞} 𝑥𝑘 ‖^2 = ∑_{𝑘=0}^{∞} ‖𝑥𝑘 ‖^2 . (17.1)

Proof. (i) We have to show that every finite subset of 𝑀 ⧵ {0} is linearly independent.
Hence let 𝑥𝑘 ∈ 𝑀 ⧵ {0}, 𝑘 = 1, … , 𝑛 be a finite number of distinct elements. Assume
that 𝜆𝑘 ∈ 𝕂 are such that
∑𝑛
𝜆𝑘 𝑥𝑘 = 0.
𝑘=0

If we fix 𝑥𝑚 , 𝑚 ∈ {0, … , 𝑛}, then by the orthogonality


(∑
𝑛 ) ∑𝑛
|
0= 𝜆𝑘 𝑥𝑘 |𝑥𝑚 = 𝜆𝑘 (𝑥𝑘 ∣ 𝑥𝑚 ) = 𝜆𝑚 ‖𝑥𝑚 ‖2 .
|
𝑘=0 𝑘=0

Since 𝑥𝑚 ≠ 0 it follows that 𝜆𝑚 = 0 for all 𝑚 ∈ {0, … , 𝑛}, showing that 𝑀 ⧵ {0} is
linearly independent.
(ii) Let (𝑥𝑛 ) be a sequence in 𝑀 with 𝑥𝑛 ≠ 𝑥𝑚 . (We only look at the case of
an infinite set because otherwise there are no issues on convergence). We set 𝑠𝑛 ∶=
∑𝑛 ∑𝑛
𝑥 and 𝑡𝑛 ∶= 𝑘=1 ‖𝑥𝑘 ‖2 the partial sums of the series under consideration. If
𝑘=1 𝑘
1 ≤ 𝑚 < 𝑛, then by the orthogonality
(∑ )
‖∑ | ∑
𝑛 𝑛 𝑛
‖2
‖𝑠𝑛 − 𝑠𝑚 ‖ = ‖
2
𝑥𝑘 ‖ = 𝑥𝑘 | 𝑥𝑗
‖ ‖ |
𝑘=𝑚+1 𝑘=𝑚+1 𝑗=𝑚+1
∑ ∑
𝑛 𝑛

𝑛
= (𝑥𝑘 ∣ 𝑥𝑗 ) = ‖𝑥𝑘 ‖2 = |𝑡𝑛 − 𝑡𝑚 |.
𝑘=𝑚+1 𝑗=𝑚+1 𝑘=𝑚+1

Hence (𝑠𝑛 ) is a Cauchy sequence in 𝐻 if and only if 𝑡𝑛 is a Cauchy sequence in ℝ,


and by the completeness they either both converge or diverge. The identity (17.1) now
follows by setting 𝑚 = 0 in the above calculation and then letting 𝑛 → ∞.

In the case of 𝐻 = 𝕂𝑁 and the standard basis 𝑒𝑖 , 𝑖 = 1, … , 𝑁, we call 𝑥𝑖 = (𝑥 ∣ 𝑒𝑖 ) the


components of 𝑥 ∈ 𝕂𝑁 . The Euclidean norm is given by

𝑛

𝑛
‖𝑥‖ = 2
|𝑥𝑖 | =
2
|(𝑥 ∣ 𝑒𝑖 )|2 .
𝑘=1 𝑘=1

60
If we do not sum over the full standard basis we may only get an inequality, namely

𝑚

𝑛
|(𝑥 ∣ 𝑒𝑖 )| ≤ 2
|(𝑥 ∣ 𝑒𝑖 )|2 ≤ ‖𝑥‖2 .
𝑘=1 𝑘=1

if 𝑚 ≤ 𝑛. We now prove a similar inequality replacing the standard basis by an arbitrary


orthonormal system 𝑀 in an inner product space 𝐻. From the above reasoning we
expect that ∑
|(𝑥 ∣ 𝑚)|2 ≤ ‖𝑥‖
𝑚∈𝑀
for all 𝑥 ∈ 𝑀. The definition of an orthonormal system 𝑀 does not make any assump-
tion on the cardinality of 𝑀, so it may be uncountable. However, if 𝑀 is uncountable,
it is not clear what the series above means. To make sense of the above series we define
∑ ∑
|(𝑥 ∣ 𝑚)|2 ∶= sup |(𝑥 ∣ 𝑚)|2 (17.2)
𝑚∈𝑀 𝑁 ⊂ 𝑀 finite 𝑚∈𝑁

We now prove the expected inequality.

17.4 Theorem (Bessel’s inequality) Let 𝐻 be an inner product space and 𝑀 an or-
thonormal system in 𝐻. Then

|(𝑥 ∣ 𝑚)|2 ≤ ‖𝑥‖2 (17.3)
𝑚∈𝑀

for all 𝑥 ∈ 𝐻. Moreover, the set {𝑚 ∈ 𝑀 ∶ (𝑥 ∣ 𝑚) ≠ 0} is at most countable for every


𝑥 ∈ 𝐻.
Proof. Let 𝑁 = {𝑚𝑘 ∶ 𝑘 = 1, … 𝑛} be a finite subset of the orthonormal set 𝑀 in 𝐻.
Then, geometrically,
∑𝑛
(𝑥 ∣ 𝑚𝑘 )𝑚𝑘
𝑘=1
is the projection of 𝑥 onto the span of 𝑁. By Pythagoras theorem (Lemma 17.3) and
since ‖𝑚𝑘 ‖ = 1 we have

‖∑ ‖2 ∑ ∑
𝑛 𝑛 𝑛
‖ (𝑥 ∣ 𝑚𝑘 )𝑚𝑘 ‖ = |(𝑥 ∣ 𝑚𝑘 )|2 ‖𝑚𝑘 ‖2 = |(𝑥 ∣ 𝑚𝑘 )|2 .
‖ ‖
𝑘=1 𝑘=1 𝑘=1

We expect the norm of the projection to be smaller than the norm of ‖𝑥‖. To see that
we use the properties of the inner product and the above identity to get

‖ ∑
𝑛
‖2 ‖∑
𝑛
‖2
0 ≤ ‖𝑥 − (𝑥 ∣ 𝑚𝑘 )𝑚𝑘 ‖ = ‖𝑥‖ + ‖ (𝑥 ∣ 𝑚𝑘 )𝑚𝑘 ‖
2
‖ ‖ ‖ ‖
𝑘=1 𝑘=1

𝑛
∑ 𝑛
− (𝑥 ∣ 𝑚𝑘 )(𝑥 ∣ 𝑚𝑘 ) − (𝑥 ∣ 𝑚𝑘 )(𝑚𝑘 ∣ 𝑥)
𝑘=1 𝑘=1

𝑛

𝑛
= ‖𝑥‖2 + |(𝑥 ∣ 𝑚𝑘 )|2 − 2 |(𝑥 ∣ 𝑚𝑘 )|2
𝑘=1 𝑘=1
∑ 𝑛
= ‖𝑥‖2 − |(𝑥 ∣ 𝑚𝑘 )|2 .
𝑘=1

61
Hence we have shown that ∑
|(𝑥 ∣ 𝑚)|2 ≤ ‖𝑥‖2
𝑚∈𝑁

for every finite set 𝑁 ⊂ 𝑀. Taking the supremum over all such finite sets (17.3)
follows. To prove the second assertion note that for every given 𝑥 ∈ 𝐻 the sets 𝑀𝑛 ∶=
{𝑚 ∈ 𝑀 ∶ |(𝑥 ∣ 𝑚)| ≥ 1∕𝑛} is finite for every 𝑛 ∈ ℕ as otherwise (17.3) could not be
true. Since countable unions of finite sets are countable, the set

{𝑚 ∈ 𝑀 ∶ (𝑥 ∣ 𝑚) ≠ 0} = 𝑀𝑛
𝑛∈ℕ

is countable as claimed.

17.5 Remark Since for every 𝑥 the set {𝑚 ∈ 𝑀 ∶ (𝑥 ∣ 𝑚) ≠ 0} is countable we


can choose an arbitrary enumeration and write 𝑀𝑥 ∶= {𝑚 ∈ 𝑀 ∶ (𝑥 ∣ 𝑚) ≠ 0} =
∑∞
{𝑚𝑘 ∶ 𝑘 ∈ ℕ}. Since the series 𝑘=1 |(𝑥 ∣ 𝑚𝑘 )|2 has non-negative terms and every such
sequence is unconditionally convergent we have




|(𝑥 ∣ 𝑚)| = 2
|(𝑥 ∣ 𝑚𝑘 )|2
𝑚∈𝑀 𝑘=1

no matter which enumeration we take. Recall that unconditionally convergent means


that a series converges, and every rearrangement also converges to the same limit. We
make this more precise in the next section.

18 Abstract Fourier Series


If 𝑥 is a vector in 𝕂𝑁 and 𝑒𝑖 the standard basis, then we know that


𝑛
(𝑥 ∣ 𝑒𝑘 )𝑒𝑘
𝑘=1

is the orthogonal projection of 𝑥 onto the subspace spanned by 𝑒1 , … , 𝑒𝑛 if 𝑛 ≤ 𝑁, and


that
∑𝑁
𝑥= (𝑥 ∣ 𝑒𝑘 )𝑒𝑘 .
𝑘=1

We might therefore expect that the analogous expression



(𝑥 ∣ 𝑚)𝑚 (18.1)
𝑚∈𝑀

is the orthogonal projection onto span 𝑀 if 𝑀 is an orthonormal system in a Hilbert


space 𝐻. However, there are some difficulties. First of all, 𝑀 does not need to be
countable, so the sum does not necessarily make sense. Since we are not working in ℝ,
we cannot use a definition like (17.2). On the other hand, we know from Theorem 17.4
that the set
𝑀𝑥 ∶= {𝑚 ∈ 𝑀 ∶ (𝑥 ∣ 𝑀) ≠ 0} (18.2)

62
is at most countable. Hence 𝑀𝑥 is finite or its elements can be enumerated. If 𝑀𝑥 is
finite (18.1) makes perfectly good sense. Hence let us assume that 𝑚𝑘 , 𝑘 ∈ ℕ is an
enumeration of 𝑀𝑥 . Hence, rather than (18.1), we could write


(𝑥 ∣ 𝑚𝑘 )𝑚𝑘 .
𝑘=0

This does still not solve all our problems, because the limit of the series may depend
on the particular enumeration chosen. The good news is that this is not the case, and
that the series is unconditionally convergent, that is, the series converges and for every
bijection 𝜎 ∶ ℕ → ℕ we have




(𝑥 ∣ 𝑚𝑘 )𝑚𝑘 = (𝑥 ∣ 𝑚𝜎(𝑘) )𝑚𝜎(𝑘) .
𝑘=0 𝑘=0

Recall that the series on the right hand side is called a rearrangement of the series on
the left. We now show that (18.1) is actually a projection, not onto span 𝑀, but onto
its closure.

18.1 Theorem Suppose that 𝑀 is an orthonormal system in a Hilbert space 𝐻 and set
∑∞
𝑁 ∶= span 𝑀. Let 𝑥 ∈ 𝐻 and 𝑚𝑘 , 𝑘 ∈ ℕ an enumeration of 𝑀𝑥 . Then 𝑘=0 (𝑥 ∣ 𝑚𝑘 )𝑚𝑘
is unconditionally convergent, and


𝑃𝑁 (𝑥) = (𝑥 ∣ 𝑚𝑘 )𝑚𝑘 , (18.3)
𝑘=0

where 𝑃𝑁 (𝑥) is the orthogonal projection onto 𝑁 as defined in Section 16.


Proof. Fix 𝑥 ∈ 𝐻. By Theorem 17.4 the set 𝑀𝑥 is either finite or countable. We let
𝑚𝑘 , 𝑘 ∈ ℕ an enumeration of 𝑀𝑥 , setting for convenience 𝑚𝑘 ∶= 0 for 𝑘 larger than
the cardinality of 𝑀𝑥 if 𝑀𝑥 is finite. Again by Theorem 17.4


|(𝑥 ∣ 𝑚𝑘 )|2 ≤ ‖𝑥‖2 ,
𝑘=0

so by Lemma 17.3 the series




𝑦 ∶= (𝑥 ∣ 𝑚𝑘 )𝑚𝑘
𝑘=0

converges in 𝐻 since 𝐻 is complete. We now use the characterisation of projections


from Corollary 16.6 to show that 𝑦 = 𝑃𝑁 (𝑥). For 𝑚 ∈ 𝑀 we consider
(∑
𝑛 ) ∑𝑛
|
𝑠𝑛 (𝑚) ∶= (𝑥 ∣ 𝑚𝑘 )𝑚𝑘 − 𝑥|𝑚 = (𝑥 ∣ 𝑚𝑘 )(𝑚𝑘 ∣ 𝑚) − (𝑥 ∣ 𝑚).
|
𝑘=0 𝑘=0

Since the series is convergent, the continuity of the inner product shows that


(𝑦 − 𝑥 ∣ 𝑚) = lim 𝑠𝑛 (𝑚) = (𝑥 ∣ 𝑚𝑘 )(𝑚𝑘 ∣ 𝑚) − (𝑥 ∣ 𝑚)
𝑛→∞
𝑘=0

63
exists for all 𝑚 ∈ 𝑀. If 𝑚 ∈ 𝑀𝑥 , that is, 𝑚 = 𝑚𝑗 for some 𝑗 ∈ ℕ, then by the
orthogonality
(𝑦 − 𝑥 ∣ 𝑚) = (𝑥 ∣ 𝑚𝑗 ) − (𝑥 ∣ 𝑚𝑗 ) = 0.
If 𝑚 ∈ 𝑀 ⧵ 𝑀𝑥 , then (𝑥 ∣ 𝑚) = (𝑚𝑘 ∣ 𝑚) = 0 for all 𝑘 ∈ ℕ by definition of 𝑀𝑥 and the
orthogonality. Hence again (𝑦 − 𝑥 ∣ 𝑚) = 0, showing that 𝑦 − 𝑥 ∈ 𝑀 ⟂ . By Lemma 16.8

it follows that 𝑦−𝑥 ∈ span 𝑀 . Now Corollary 16.6 implies that 𝑦 = 𝑃𝑁 (𝑥) as claimed.
Since we have worked with an arbitrary enumeration of 𝑀𝑥 and 𝑃𝑁 (𝑥) is independent
of that enumeration, it follows that the series is unconditionally convergent.

We have just shown that (18.3) is unconditionally convergent. For this reason we can
make the following definition, giving sense to (18.1).

18.2 Definition (Fourier series) Let 𝑀 be an orthonormal system in the Hilbert space
𝐻. If 𝑥 ∈ 𝐻 we call (𝑥 ∣ 𝑚), 𝑚 ∈ 𝑀, the Fourier coefficients of 𝑥 with respect to 𝑀.
Given an enumeration 𝑚𝑘 , 𝑘 ∈ ℕ of 𝑀𝑥 as defined in (18.2) we set
∑ ∑

(𝑥 ∣ 𝑚)𝑚 ∶= (𝑥 ∣ 𝑚𝑘 )𝑚𝑘
𝑚∈𝑀 𝑘=0

and call it the Fourier series of 𝑥 with respect to 𝑀. (For convenience here we let
𝑚𝑘 = 0 for 𝑘 larger than the cardinality of 𝑀𝑥 if it is finite.)
With the above definition, Theorem 18.1 shows that

(𝑥 ∣ 𝑚)𝑚 = 𝑃𝑁 (𝑥)
𝑚∈𝑀

for all 𝑥 ∈ 𝐻 if 𝑁 ∶= span 𝑀. As a consequence of the above theorem we get the


following characterisation of complete orthonormal systems.

18.3 Theorem (orthonormal bases) Suppose that 𝑀 is an orthonormal system in the


Hilbert space 𝐻. Then the following assertions are equivalent:
(i) 𝑀 is complete;

(ii) 𝑥 = (𝑥 ∣ 𝑚)𝑚 for all 𝑥 ∈ 𝐻 (Fourier series expansion);
𝑚∈𝑀

(iii) ‖𝑥‖2 = |(𝑥 ∣ 𝑚)|2 for all 𝑥 ∈ 𝐻 (Parseval’s identity).
𝑚∈𝑀

Proof. (i)⇒(ii): If 𝑀 is complete, then by definition 𝑁 ∶= span 𝑀 = 𝐻 and so by


Theorem 18.1 ∑
𝑥 = 𝑃𝑁 (𝑥) = (𝑥 ∣ 𝑚)𝑚
𝑚∈𝑀

for all 𝑥 ∈ 𝐻, proving (ii).


(ii)⇒(iii): By Lemma 17.3 and since 𝑀𝑥 is countable we have
‖∑ ‖2 ∑
‖𝑥‖2 = ‖ (𝑥 ∣ 𝑚)𝑚‖ = |(𝑥 ∣ 𝑚)|2
‖ ‖
𝑚∈𝑀 𝑚∈𝑀

64
if (ii) holds, so (iii) follows.
(iii)⇒(i): Let 𝑁 ∶= span 𝑀 and fix 𝑥 ∈ 𝑁 ⟂ . By assumption, Theorem 16.9
and 18.1 as well as Lemma 17.3 we have

‖∑ ‖2 ∑
0 = ‖𝑃𝑁 (𝑥)‖2 = ‖ (𝑥 ∣ 𝑚)𝑚‖ = |(𝑥 ∣ 𝑚)|2 = ‖𝑥‖2 .
‖ ‖
𝑚∈𝑀 𝑚∈𝑀


Hence 𝑥 = 0, showing that span 𝑀 = {0}. By Corollary 16.11 span 𝑀 = 𝐻, that is,
𝑀 is complete, proving (i).

We next provide the connection of the above “abstract Fourier series” to the “classical”
Fourier series you may have seen elsewhere. To do so we look at the expansions with
respect to the orthonormal systems considered in Example 17.2.

18.4 Example (a) Let 𝑒𝑖 be the standard basis in 𝕂𝑁 . The Fourier “series” of 𝑥 ∈ 𝕂𝑁
with respect to 𝑒𝑖 is

𝑁
𝑥= (𝑥 ∣ 𝑒𝑖 )𝑒𝑖 .
𝑖=1

Of course we do not usually call this a “Fourier series” but say 𝑥𝑖 ∶= (𝑥 ∣ 𝑒𝑖 ) are the
components of the vector 𝑥 and the above sum the representation of 𝑥 with respect to
the basis 𝑒𝑖 . The example should just illustrate once more the parallels of Hilbert space
theory to various properties of Euclidean spaces.
(b) The Fourier coefficients of 𝑢 ∈ 𝐿2 ((−𝜋, 𝜋), ℂ) with respect to the orthonormal
system
1 𝑖𝑛𝑥
√ 𝑒 , 𝑛 ∈ ℤ,
2𝜋
are given by
𝜋
1
𝑐𝑛 ∶= √ 𝑒−𝑖𝑛𝑥 𝑢(𝑥) 𝑑𝑥.

2𝜋 −𝜋
Hence the Fourier series of 𝑢 with respect to the above system is

∑ ∑ 1
𝜋
𝑢= 𝑐𝑛 𝑒𝑖𝑛𝑥 = √ 𝑒−𝑖𝑛𝑥 𝑢(𝑥) 𝑑𝑥𝑒𝑖𝑛𝑥 .
𝑛∈ℤ 𝑛∈ℤ

2𝜋 −𝜋

This is precisely the complex form of the classical Fourier series of 𝑢. As the orthonor-
mal system under considerationis complete, our theory tells us that the series converges
in 𝐿2 ((−𝜋, 𝜋), ℂ), but we do not get any information on pointwise or uniform conver-
gence.
(c) We now look at 𝑢 ∈ 𝐿2 ((−𝜋, 𝜋), ℝ) and its expansion with respect to the or-
thonormal system given by

1 1 1
√ , √ cos 𝑛𝑥, √ sin 𝑛𝑥, 𝑛 ∈ ℕ ⧵ {0}.
2𝜋 𝜋 𝜋

65
The Fourier coefficients are
𝜋
1
𝑎0 = √ 𝑢(𝑥) 𝑑𝑥
2𝜋 ∫−𝜋
𝜋
1
𝑎𝑛 = √ 𝑢(𝑥) cos 𝑛𝑥 𝑑𝑥
𝜋 ∫−𝜋
𝜋
1
𝑏𝑛 = √ 𝑢(𝑥) sin 𝑛𝑥 𝑑𝑥
𝜋 ∫−𝜋
Hence the Fourier series with respect to the above system is


𝑢 = 𝑎0 + (𝑎𝑛 cos 𝑛𝑥 + 𝑏𝑛 sin 𝑛𝑥),
𝑛=0

which is the classical cosine-sine Fourier series. Again convergence is guaranteed in


𝐿2 ((−𝜋, 𝜋), ℝ), but not pointwise or uniform. The completeness proof is the same as
in (b) as we just consider a “change of basis” here.
Orthonormal bases in linear algebra come from diagonalising symmetric matrices as-
sociated with a particular problem from applications or otherwise. Similarly, orthog-
onal systems of functions come by solving partial differential equations by separation
of variables. There are many such systems like Legendre and Laguerre polynomials,
spherical Harmonics, Hermite functions, Bessel functions and so on. They all fit into
the framework discussed in this section if we choose the right Hilbert space of functions
with the appropriate inner product. The most difficult issue is to prove the completeness
of a particular system. This is often handled very casually as pointed out in a letter by
Bôcher to Lord Rayleigh who extensively worked with orthogonal expansions; see [1].

18.5 Remark One can also get orthonormal systems from any finite or countable set of
linearly independent elements of an inner product space by means of the Gram-Schmidt
orghogonalisation process as seen in second year algebra.
We have mentioned the possibility of uncountable orthonormal systems or bases. They
can occur, but in practice all orthogonal bases arising from applications (like partial
differential equations) are countable. Recall that a metric space is separable if it has a
countable dense subset.

18.6 Theorem A Hilbert space is separable if and only if it has a countable orthonor-
mal basis.
Proof. If the space 𝐻 is finite dimensional and 𝑒𝑖 , 𝑖 = 1, … 𝑁, is an orthonormal basis
of 𝐻, then the set
{∑
𝑁 }
spanℚ {𝑒1 , … , 𝑒𝑁 } ∶= 𝜆𝑘 𝑒𝑘 ∶ 𝜆𝑘 ∈ ℚ(+𝑖ℚ)
𝑘=1

is dense in 𝐻 since ℚ is dense in ℝ, so every finite dimensional Hilbert space is sep-


arable. Now assume that 𝐻 is infinite dimensional and that 𝐻 has a complete count-
able orthonormal system 𝑀 = {𝑒𝑘 ∶ 𝑘 ∈ ℕ}. For every 𝑁 ∈ ℕ we let 𝐻𝑁 ∶=

66
span{𝑒1 , … , 𝑒𝑁 }. Then dim 𝐻𝑁 = 𝑁 and by what we just proved, 𝐻𝑁 is separable.
Since countable unions of countable sets are countable it follows that countable unions
of separable sets are separable. Hence

span 𝑀 = 𝐻𝑁
𝑁∈ℕ

is separable. Since 𝑀 is complete span 𝑀 is dense. Hence any dense subset of span 𝑀
is dense in 𝐻 as well, proving that 𝐻 is separable. Assume now that 𝐻 is a sep-
arable Hilbert space and let 𝐷 ∶= {𝑥𝑘 ∶ 𝑘 ∈ ℕ} be a dense subset of 𝐻. We set
𝐻𝑛 ∶= span{𝑥𝑘 ∶ 𝑘 = 1, … , 𝑛}. Then 𝐻𝑛 is a nested sequence of finite dimen-
sional subspaces of 𝐻 whose union contains 𝐷 and therefore is dense in 𝐻. We have
dim 𝐻𝑛 ≤ dim 𝐻𝑛+1 , possibly with equality. We inductively construct a basis for span 𝐷
by first choosing a basis of 𝐻1 . Given a basis for 𝐻𝑛 we extend it to a basis of 𝐻𝑛+1
if dim 𝐻𝑛+1 > dim 𝐻𝑛 , otherwise we keep the basis we had. Doing that inductively
from 𝑛 = 1 will give a basis for 𝐻𝑛 for each 𝑛 ∈ ℕ. The union of all these bases is a
countable linearly independent set spanning span 𝐷. Applying the Gram-Schmidt or-
thonormalisation process we can get a countable orthonormal system spanning span 𝐷.
Since span 𝐷 is dense, it follows that 𝐻 has a complete countable orthonormal system.

Using the above theorem we show that there is, up to an isometric isomorphism, there
is only one separable Hilbert space, namely 𝓁2 . Hence 𝓁2 plays the same role as 𝕂𝑁 is
isomorphic to an arbitrary 𝑁-dimensional space.

18.7 Corollary Every separable infinite dimensional Hilbert space is isometrically


isomorphic to 𝓁2 .
Proof. Let 𝐻 be a separable Hilbert space. Then by Theorem 18.6 𝐻 has a countable
orthonormal basis {𝑒𝑘 ∶ 𝑘 ∈ ℕ}. We define a linear map 𝑇 ∶ 𝐻 → 𝓁2 by setting
(𝑇 𝑥)𝑖 ∶= (𝑥 ∣ 𝑒𝑖 )
for 𝑥 ∈ 𝐻 and 𝑖 ∈ ℕ. (This corresponds to the components of 𝑥 in case 𝐻 = 𝕂𝑁 .) By
Parseval’s identity from Theorem 18.3 we have


‖𝑥‖ = 2
|(𝑥 ∣ 𝑒𝑖 )|2 = |𝑇 𝑥|22
𝑖=1

Hence 𝑇 is an isometry. Hence it remains to show that 𝑇 is surjective. Let (𝜉𝑖 ) ∈ 𝓁2


and set
∑∞
𝑥 ∶= 𝜉𝑘 𝑒𝑖
𝑖=1
Since (𝜉𝑖 ) ∈ 𝓁2 we have




|𝜉𝑖 | ‖𝑒𝑖 ‖ =
2 2
|𝜉𝑖 |2 < ∞
𝑖=1 𝑖=1

By Lemma 17.3 the series defining 𝑥 converges in 𝐻. Also, by orthogonality, (𝑥 ∣ 𝑒𝑖 ) =


𝜉𝑖 , so 𝑇 𝑥 = (𝜉𝑖 ). Hence 𝑇 is surjective and thus an isometric isomorphism between 𝐻
and 𝓁2 .

67
68
Chapter V
Linear Operators

We have discussed basic properties of linear operators already in Section 8. Here we


want to prove some quite unexpected results on bounded linear operator on Banach
spaces. The essential ingredient to prove these results is Baire’s Theorem from topol-
ogy.

19 Baire’s Theorem
To prove some fundamental properties of linear operators on Banach spaces we need
Baire’s theorem on the intersection of dense open sets.

19.1 Theorem (Baire’s Theorem) Suppose that 𝑋 is a complete metric space and that

𝑂𝑛 , 𝑛 ∈ ℕ, are open dense subsets of 𝑋. Then 𝑛∈ℕ 𝑂𝑛 is dense in 𝑋.

Proof. We want to show that for every 𝑥 ∈ 𝑋 and 𝜀 > 0 the ball 𝐵(𝑥, 𝜀) and 𝑛∈ℕ 𝑂𝑛
have non-empty intersection. Fix 𝑥 ∈ 𝑋 and 𝜀 > 0. We show that there exist sequences
(𝑥𝑛 ) in 𝑋 and (𝜀𝑛 ) in ℝ such that 𝑥1 = 𝑥, 𝜀1 = 𝜀, 𝜀𝑛 → 0 and

𝐵(𝑥𝑛+1 , 𝜀𝑛+1 ) ⊆ 𝐵(𝑥𝑛 , 𝜀𝑛 ) ∩ 𝑂𝑛 ; (19.1)

for all 𝑛 ∈ ℕ. We prove the existence of such sequences inductively. Fix 𝑥 ∈ 𝑋 and
𝜀 > 0. We set 𝑥1 ∶= 𝑥 and 𝜀1 ∶= min{1, 𝜀}. Suppose that we have chosen 𝑥𝑛 and 𝜀𝑛
already. Since 𝑂𝑛 is open and dense in 𝑋 the set 𝐵(𝑥𝑛 , 𝜀𝑛 ) ∩ 𝑂𝑛 is non-empty and open.
Hence we can choose 𝑥𝑛+1 ∈ 𝐵(𝑥𝑛 , 𝜀𝑛 ) and 0 < 𝜀𝑛+1 < min{𝜀𝑛 , 1∕𝑛} such that (19.1)
is true.
From the construction we have 𝐵(𝑥𝑛+1 , 𝜀𝑛+1 ) ⊆ 𝐵(𝑥𝑛 , 𝜀𝑛 ) for all 𝑛 ∈ ℕ and 𝜀𝑛 →
0. Since 𝑋 is complete Cantor’s intersection theorem (Theorem 3.11) implies that
⋂ ⋂
𝑛∈ℕ 𝐵(𝑥𝑛 , 𝜀𝑛 ) is non-empty and so we can choose 𝑦 ∈ 𝑛∈ℕ 𝐵(𝑥𝑛 , 𝜀𝑛 ). By (19.1) we
have
𝑦 ∈ 𝐵(𝑥𝑛+1 , 𝜀𝑛+1 ) ⊆ 𝐵(𝑥𝑛 , 𝜀𝑛 ) ∩ 𝑂𝑛

for every 𝑛 ∈ ℕ and therefore 𝑦 ∈ 𝑂𝑛 for all 𝑛 ∈ ℕ. Hence 𝑦 ∈ 𝑛∈ℕ 𝑂𝑛 . By
construction 𝐵(𝑥𝑛 , 𝜀𝑛 ) ⊆ 𝐵(𝑥, 𝜀) for all 𝑛 ∈ ℕ and so 𝑦 ∈ 𝐵(𝑥, 𝜀) as well. Since 𝑥 and

𝜀 > 0 were chosen arbitrarily we conclude that 𝑛∈ℕ 𝑂𝑛 is dense in 𝑋.

69
19.2 Remark In Baire’s Theorem it is essential that the sets 𝑂𝑛 be open in 𝑋 (or satisfy
a suitable other condition). If we set 𝑂1 = ℚ and 𝑂2 = ℝ ⧵ ℚ, then 𝑂1 and 𝑂2 are
dense subsets of ℝ, but 𝑂1 ∩ 𝑂2 = ∅.

We can also reformulate Baire’s theorem in terms of properties of closed sets.

19.3 Corollary Suppose that 𝑋 is a non-empty complete metric space and that 𝐶𝑛 ,

𝑛 ∈ ℕ, is a family of closed sets in 𝑋 with 𝑋 = 𝑛∈ℕ 𝐶𝑛 . Then there exists 𝑛 ∈ ℕ such
that 𝐶𝑛 has non-empty interior.

Proof. If 𝐶𝑛 has empty interior for all 𝑛 ∈ ℕ, then 𝑂𝑛 ∶= 𝑋 ⧵ 𝐶𝑛 is open and dense in

𝑋 for all 𝑛 ∈ ℕ. By Baire’s Theorem the intersection 𝑛∈ℕ 𝑂𝑛 is dense in 𝑋. Hence
⋃ ⋃ (⋂ )
𝐶𝑛 = (𝑋 ⧵ 𝑂𝑛 ) = 𝑋 ⧵ 𝑂𝑛 ≠ 𝑋.
𝑛∈ℕ 𝑛∈ℕ 𝑛∈ℕ

Therefore, 𝐶𝑛 must have non-empty interior for some 𝑛 ∈ ℕ.

20 The Open Mapping Theorem


We know that a map between metric spaces is continuous if and only pre-images of
open sets are open. We do not want to look at pre-images, but the images of open sets
and look at maps for which such images are open.

20.1 Definition (open map) Suppose that 𝑋, 𝑌 are metric spaces. We call a map 𝑓 ∶ 𝑋 →
𝑌 open if 𝑓 maps every open subset of 𝑋 onto an open subset of 𝑌 .

20.2 Remark To be open is a very strong property of a map. Homeomorphisms are


open because the inverse function is continuous. On the other hand, if a continuous
function is not surjective such as the constant function, we do not expect it to be open.
However, even surjective continuous functions are not necessarily open. As an example
consider
√ 𝑓 (𝑥) √ ∶= 𝑥(𝑥2 − 1) which is surjective from ℝ to ℝ. Indeed, 𝑓 ((−1, 1)) =
[−2∕3 3, 2∕3 3] is closed since the function has a maximum and a minimum in
(−1, 1).

Our aim is to show that linear surjective maps are open. We start by proving a charac-
terisation of open linear maps.

20.3 Proposition Suppose 𝐸 is a Banach space and 𝐹 a normed space. For 𝑇 ∈


(𝐸, 𝐹 ) the following assertions are equivalent:

(i) 𝑇 is open;
( )
(ii) There exists 𝑟 > 0 such that 𝐵(0, 𝑟) ⊆ 𝑇 𝐵(0, 1) ;

( )
(iii) There exists 𝑟 > 0 such that 𝐵(0, 𝑟) ⊆ 𝑇 𝐵(0, 1) .

70
Proof.
( We
) prove (i) implies (ii) and hence also (iii). As 𝐵(0, 1) is open, the set
𝑇 𝐵(0, 1) is open in 𝐹 by (i). Since 0 ∈ 𝑇 (𝐵(0, 1)) there exists 𝑟 > 0 such that
( ) ( )
𝐵(0, 𝑟) ⊆ 𝑇 (𝐵(0, 1)) ⊆ 𝑇 𝐵(0, 1) ⊆ 𝑇 𝐵(0, 1) ,

so (iii) follows.
We next prove that (iii) implies (ii). This is the most difficult part of the proof.
Assume that there exists 𝑟 > 0 such that
( )
𝐵(0, 𝑟) ⊆ 𝑇 𝐵(0, 1) .
( )
We show that 𝐵(0, 𝑟∕2) ⊆ 𝑇 𝐵(0, 1) which proves (ii). Hence let 𝑦 ∈ 𝐵(0, 𝑟∕2). Then
( )
2𝑦 ∈ 𝐵(0, 𝑟) and since 𝐵(0, 𝑟) ⊆ 𝑇 𝐵(0, 1) there exists 𝑥1 ∈ 𝐵(0, 1) such that
𝑟
‖2𝑦 − 𝑇 𝑥1 ‖ ≤ .
2

Hence 4𝑦−2𝑇 𝑥1 ∈ 𝐵(0, 𝑟) and by the same argument as before there exists 𝑥2 ∈ 𝐵(0, 1)
such that
𝑟
‖4𝑦 − 2𝑇 𝑥1 − 𝑇 𝑥2 ‖ ≤ .
2
Continuing this way we can construct a sequence (𝑥𝑛 ) in 𝐵(0, 1) such that
𝑟
‖2𝑛 𝑦 − 2𝑛−1 𝑇 𝑥1 − ⋯ − 2𝑇 𝑥𝑛−1 − 𝑇 𝑥𝑛 ‖ ≤
2
for all 𝑛 ∈ ℕ. Dividing by 2𝑛 we get

‖ ∑ 𝑛
‖ 𝑟
‖𝑦 − 2−𝑘 𝑇 𝑥𝑘 ‖ ≤ 𝑛+1
‖ ‖ 2
𝑘=1

for all 𝑛 ∈ ℕ. Hence




𝑦= 2−𝑘 𝑇 𝑥𝑘 .
𝑘=1

Since ‖𝑥𝑘 ‖ ≤ 1 for all 𝑘 ∈ ℕ we have that





2 ‖𝑥𝑘 ‖ ≤
−𝑘
2−𝑘 = 1
𝑘=1 𝑘=1

and so the series




𝑥 ∶= 2−𝑘 𝑥𝑘
𝑘=1

converges absolutely in 𝐸 because 𝐸 is complete. Moreover, ‖𝑥‖ ≤ 1 and so 𝑥 ∈


𝐵(0, 1). Because 𝑇 is continuous we have
(∑
𝑛 ) ∑
𝑛
−𝑘
𝑇 𝑥 = lim 𝑇 2 𝑥𝑘 = lim 2−𝑘 𝑇 𝑥𝑘 = 𝑦
𝑛→∞ 𝑛→∞
𝑘=1 𝑘=1

71
( )
by construction of 𝑥. Hence 𝑦 ∈ 𝑇 𝐵(0, 1) and (ii) follows.
We finally prove that (ii) implies (i). By (ii) and the linearity of 𝑇 we have
( ) ( )
𝑇 𝐵(0, 𝜀) = 𝜀𝑇 𝐵(0, 1)
( )
for every 𝜀 > 0. Because the map 𝑥 → 𝜀𝑥 is a homeomorphism on 𝐹 the set 𝑇 𝐵(0, 𝜀)
is a neighbourhood of zero for every 𝜀 > 0. Let now 𝑈 ⊆ 𝐸 be open and 𝑦 ∈ 𝑇 (𝑈 ).
As 𝑈 is open there exists 𝜀 > 0 such that

𝐵(𝑥, 𝜀) = 𝑥 + 𝐵(0, 𝜀) ⊆ 𝑈 ,

where 𝑦 = 𝑇 𝑥. Since 𝑧 → 𝑥 + 𝑧 is a homeomorphism and 𝑇 is linear we get


( ) ( ) ( )
𝑇 𝐵(𝑥, 𝜀) = 𝑇 𝑥 + 𝑇 𝐵(0, 𝜀) = 𝑦 + 𝑇 𝐵(0, 𝜀) ⊆ 𝑇 (𝑈 ).
( )
Hence 𝑇 𝐵(𝑥, 𝜀) is a neighbourhood of 𝑦 in 𝑇 (𝑈 ). As 𝑦 was an arbitrary point in
𝑇 (𝑈 ) it follows that 𝑇 (𝑈 ) is open.

We next prove a lemma on convex “balanced” sets.

20.4 Lemma Let 𝐸 be a normed space and 𝑆 ⊆ 𝐸 convex with 𝑆 = −𝑆 ∶= {−𝑥 ∶ 𝑥 ∈


𝑆}. If 𝑆̄ has non-empty interior, then 𝑆̄ is a neighbourhood of zero.
Proof. We first show that 𝑆̄ is convex. If 𝑥, 𝑦 ∈ 𝑆̄ and 𝑥𝑛 , 𝑦𝑛 ∈ 𝑆 with 𝑥𝑛 → 𝑥 and
𝑦𝑛 → 𝑥, then 𝑡𝑥𝑛 + (1 − 𝑡)𝑦𝑛 ∈ 𝑆 for all 𝑛 ∈ ℕ and all 𝑡 ∈ [0, 1]. Letting 𝑛 → ∞ we get
𝑡𝑥 + (1 − 𝑡)𝑦 ∈ 𝑆̄ for all 𝑡 ∈ [0, 1], so 𝑆̄ is convex. Also, if −𝑥𝑛 ∈ −𝑆, then passing to
the limit −𝑥 ∈ −𝑆. ̄ Hence also 𝑆̄ = −𝑆. ̄ If 𝑆̄ has non-empty interior, then there exist
𝑧 ∈ 𝑆̄ and 𝜀 > 0 such that 𝐵(𝑧, 𝜀) ⊆ 𝑆. ̄ Therefore 𝑧 ± ℎ ∈ 𝑆̄ whenever ‖ℎ‖ < 𝜀 and
since 𝑆 = −𝑆 we also have −(𝑧 ± ℎ) ∈ 𝑆.
̄ ̄ ̄ By the convexity of 𝑆̄ we have

1( )
𝑦= (𝑥 + ℎ) + (−𝑥 + ℎ) ∈ 𝑆̄
2

whenever ‖ℎ‖ < 𝜀. Hence 𝐵(0, 𝜀) ⊆ 𝑆,


̄ so 𝑆̄ is a neighbourhood of zero.

We can now prove our main theorem of this section.

20.5 Theorem (open mapping theorem) Suppose that 𝐸 and 𝐹 are Banach spaces.
If 𝑇 ∈ (𝐸, 𝐹 ) is surjective, then 𝑇 is open.
Proof. As 𝑇 is surjective we have
⋃ ( )
𝐹 = 𝑇 𝐵(0, 𝑛)
𝑛∈ℕ

( )
with 𝑇 𝐵(0, 𝑛) closed for all 𝑛 ∈ ℕ. Since 𝐹 is complete, by Corollary 19.3 to Baire’s
( )
theorem there exists 𝑛 ∈ ℕ such that 𝑇 𝐵(0, 𝑛) has non-empty interior. Since the map

72
( )
𝑥 → 𝑛𝑥 is a homeomorphism and 𝑇 is linear, the set 𝑇 𝐵(0, 1) has non-empty interior
as well. Now 𝐵(0, 1) is convex and 𝐵(0, 1) = −𝐵(0, 1). The linearity of 𝑇 implies that
( ) ( )
𝑇 𝐵(0, 1) = −𝑇 𝐵(0, 1)
( )
is convex as well. Since we already know that 𝑇 𝐵(0, 1) has non-empty interior,
( )
Lemma 20.4 implies that 𝑇 𝐵(0, 1) is a neighbourhood of zero, that is, there exists
𝑟 > 0 such that
( )
𝐵(0, 𝑟) ⊆ 𝑇 𝐵(0, 1) .
Since 𝐸 is complete Proposition 20.3 implies that 𝑇 is open.

20.6 Corollary (bounded inverse theorem) Suppose that 𝐸 and 𝐹 are Banach spaces
and that 𝑇 ∈ (𝐸, 𝐹 ) is bijective. Then 𝑇 −1 ∈ (𝐹 , 𝐸).
Proof. We know that 𝑇 −1 ∶ 𝐹 → 𝐸 is linear. If 𝑈 ⊆ 𝐸 is open, then by the open
mapping theorem (𝑇 −1 )−1 (𝑈 ) = 𝑇 (𝑈 ) is open in 𝐹 since 𝑇 is surjective. Hence 𝑇 −1 is
continuous by Theorem 5.3, that is, 𝑇 −1 ∈ (𝐹 , 𝐸).

20.7 Corollary Suppose 𝐸 is a Banach space with respect to the norms ‖⋅‖1 and ‖⋅‖2 .
If ‖⋅‖2 is stronger than ‖⋅‖1 , then the two norms are equivalent.
Proof. Let 𝐸1 ∶= (𝐸, ‖⋅‖1 ) and 𝐸2 ∶= (𝐸, ‖⋅‖2 ). Since ‖⋅‖2 is stronger than ‖⋅‖1
the map 𝐼 ∶ 𝑥 → 𝑥 is bounded and bijective from 𝐸2 into 𝐸1 . Since 𝐸1 and 𝐸2 are
Banach spaces the inverse map 𝐼 −1 ∶ 𝑥 → 𝑥 is bounded as well by the bounded inverse
theorem (Corollary 20.6). Hence there exists 𝑐 > 0 such that ‖𝑥‖2 ≤ 𝑐‖𝑥‖1 for all
𝑥 ∈ 𝐸, proving that the two norms are equivalent.

21 The Closed Graph Theorem


Suppose that 𝑇 ∶ 𝐸 → 𝐹 is a linear map between the normed spaces 𝐸 and 𝐹 . Consider
the graph
𝐺(𝑇 ) = {(𝑥, 𝑇 𝑥) ∶ 𝑥 ∈ 𝐸} ⊆ 𝐸 × 𝐹
which is a vector space with norm

‖(𝑥, 𝑇 𝑥)‖𝑇 = ‖𝑥‖ + ‖𝑇 𝑥‖.

If 𝑇 is continuous, then 𝐺(𝑇 ) is closed as a subset of 𝐸 × 𝐹 . The question is whether


the converse is true as well. The answer is no in general. An example was given in
Assignment 1. In that example we had 𝐸 ∶= 𝐶 1 ([−1, 1]) to 𝐹 ∶= 𝐶([−1, 1]) both
with the supremum norm and 𝑇 𝑢 = 𝑢′ was a differential operator. We also saw in
that assignment that 𝐸 was not complete with respect to the supremum norm. We now
show that this non-completeness makes it possible that an operator with closed graph
is unbounded (discontinuous). We prove that this cannot happen if 𝐸 is complete! The
proof is based on the open mapping theorem.

73
21.1 Theorem (closed graph theorem) Suppose that 𝐸 and 𝐹 are Banach spaces and
that 𝑇 ∈ Hom(𝐸, 𝐹 ). Then 𝑇 ∈ (𝐸, 𝐹 ) if and only if 𝑇 has closed graph.
Proof. If 𝑇 ∈ (𝐸, 𝐹 ) and (𝑥𝑛 , 𝑇 𝑥𝑛 ) → (𝑥, 𝑦) then by the continuity 𝑇 𝑥𝑛 → 𝑇 𝑥 = 𝑦,
so (𝑥, 𝑦) = (𝑥, 𝑇 𝑥) ∈ 𝐺(𝑇 ). This shows that 𝐺(𝑇 ) is closed. Now suppose that 𝐺(𝑇 )
is closed. We define linear maps
𝑆 𝑃
← 𝐺(𝑇 ) ←←→
𝐸 ←←→ ← 𝐹, 𝑥 → (𝑥, 𝑇 𝑥) → 𝑇 𝑥.

Then obviously
‖𝑃 (𝑥, 𝑇 𝑥)‖𝐹 = ‖𝑇 𝑥‖𝐹 ≤ ‖(𝑥, 𝑇 𝑥)‖𝑇 .
Hence 𝑃 ∈ (𝐺(𝑇 ), 𝐹 ). Next we note that 𝑆 ∶ 𝐸 → 𝐺(𝑇 ) is an isomorphism because
clearly 𝑆 is surjective and also injective since 𝑆𝑥 = (𝑥, 𝑇 𝑥) = (0, 0) implies that 𝑥 = 0.
Moreover, 𝑆 −1 (𝑥, 𝑇 𝑥) = 𝑥, and therefore

‖𝑆 −1 (𝑥, 𝑇 𝑥)‖𝐸 = ‖𝑥‖𝐸 ≤ ‖(𝑥, 𝑇 𝑥)‖𝑇

for all 𝑥 ∈ 𝐸. Hence 𝑆 −1 ∈ (𝐺(𝑇 ), 𝐸). Since every closed subspace of a Banach
space is a Banach space, 𝐺(𝑇 ) is a Banach space with respect to the graph norm. By
the bounded inverse theorem (Corollary 20.6) 𝑆 = (𝑆 −1 )−1 ∈ (𝐸, 𝐺(𝑇 )) and so
𝑇 = 𝑃 ◦𝑆 ∈ (𝐸, 𝐹 ) as claimed.

22 The Uniform Boundedness Principle


We show that pointwise boundedness of a family of bounded linear operators between
Banach spaces implies the boundedness of the operator norms. Virtually all textbooks
give a proof based on Baire’s theorem. We give a short elementary proof which is a
simplified version of the original proof by Hahn and Banach. This proof was published
very recently in [11] and does not rely on Baire’s theorem.

22.1 Theorem (uniform boundedness principle) Suppose that 𝐸 and 𝐹 are Banach
spaces. Let (𝑇𝛼 )𝛼∈𝐴 be a family of linear operators in (𝐸, 𝐹 ) such that sup𝛼∈𝐴 ‖𝑇𝛼 𝑥‖𝐹 <
∞ for all 𝑥 ∈ 𝐸. Then sup𝛼∈𝐴 ‖𝑇𝛼 ‖(𝐸,𝐹 ) < ∞.
Proof. Fix 𝑥0 ∈ 𝐸 and 𝑟 > 0. Then by definition of the operator norm

1 1( )
‖𝑇 ℎ‖ = ‖𝑇 (𝑥0 + ℎ) − 𝑇 (𝑥0 − ℎ)‖ ≤ ‖𝑇 (𝑥0 + ℎ)‖ + ‖𝑇 (𝑥0 − ℎ)‖
2 { 2 }
≤ max ‖𝑇 (𝑥0 + ℎ)‖, ‖𝑇 (𝑥0 − ℎ)‖ ≤ sup ‖𝑇 (𝑥0 + ℎ)‖
‖ℎ‖≤𝑟

for all ℎ ∈ 𝐸 with ‖ℎ‖ < 𝑟. By definition of the operator norm we therefore get

𝑟‖𝑇 ‖(𝐸,𝐹 ) = sup ‖𝑇 ℎ‖ ≤ sup ‖𝑇 𝑦‖. (22.1)


‖ℎ‖≤𝑟 𝑦∈𝐵(𝑥0 ,𝑟)

Assume now that sup𝛼∈𝐴 ‖𝑇𝛼 ‖(𝐸,𝐹 ) = ∞. Choose a sequence 𝛼𝑛 ∈ 𝐴 such that
1
‖𝑇𝛼𝑛 ‖(𝐸,𝐹 ) > . (22.2)
4𝑛

74
Let 𝑥0 = 0 and use (22.1) to choose 𝑥1 ∈ 𝐸 such that
1 21
‖𝑥1 − 𝑥0 ‖ < and ‖𝑇𝛼1 𝑥1 ‖ ≥ ‖𝑇 ‖.
3 3 3 𝛼1
Then by (22.1) it is possible to choose 𝑥2 ∈ 𝐸 such that
1 21
‖𝑥2 − 𝑥1 ‖ < and ‖𝑇𝛼2 𝑥2 ‖ ≥ ‖𝑇 ‖.
32 3 32 𝛼2
Continuing that way we inductively choose 𝑥𝑛 ∈ 𝐸 such that
1 21
‖𝑥𝑛 − 𝑥𝑛−1 ‖ < and ‖𝑇𝛼𝑛 𝑥𝑛 ‖ ≥ ‖𝑇 ‖.
3𝑛 3 3𝑛 𝛼𝑛
The sequence (𝑥𝑛 ) is a Cauchy sequence since for 𝑛 > 𝑚 ≥ 0

𝑛
∑𝑛 ( )
1 1 1 1
‖𝑥𝑛 − 𝑥𝑚 ‖ ≤ ‖𝑥𝑘 − 𝑥𝑘−1 ‖ ≤ = − .
𝑘=𝑚+1 𝑘=𝑚+1
3𝑘 2 3𝑚 3𝑛

By completeness of 𝐸 that sequence converges, so 𝑥𝑛 → 𝑥 in 𝐸. Letting 𝑛 → ∞ in the


above estimate we get
1 1
‖𝑥 − 𝑥𝑚 ‖ ≤
2 3𝑚
for all 𝑚 ∈ ℕ. Hence by construction of 𝑥𝑚 and the reversed triangle inequality

‖𝑇𝛼𝑚 𝑥‖ = ‖𝑇𝛼𝑚 (𝑥 − 𝑥𝑚 ) + 𝑇𝛼𝑚 𝑥𝑚 ‖


2 1
≥ ‖𝑇𝛼𝑚 𝑥𝑚 ‖ − ‖𝑇𝛼𝑚 (𝑥 − 𝑥𝑚 )‖ ≥ ‖𝑇 ‖ − ‖𝑇𝛼𝑚 ‖‖𝑥 − 𝑥𝑚 ‖
3 3𝑚 𝛼𝑚
( )
2 1 1 1 1 2 1 1 1
≥ ‖𝑇𝛼𝑚 ‖ − ‖𝑇𝛼𝑚 ‖ = 𝑚 ‖𝑇𝛼𝑚 ‖ − = ‖𝑇 ‖.
33 𝑚 23𝑚 3 3 2 6 3𝑚 𝛼𝑚
Using (22.2) we finally get
( )
1 4 𝑚
‖𝑇𝛼𝑚 𝑥‖ ≥ →∞
6 3
as 𝑚 → ∞. Hence there exists 𝑥 ∈ 𝐸 such that sup𝛼∈𝐴 ‖𝑇𝛼 𝑥‖ = ∞. Hence the assertion
of the theorem follows by contrapositive.

22.2 Remark There are stronger versions of the uniform boundedness theorem: Either
the family is bounded in the operator norm, or it is pointwise unbounded for a rather
large set of points (see [10, Theorem 5.8] for a more precise statement).
We can apply the above to countable families of linear operators, and in particular to
sequences which converge pointwise. We prove that the limit operator is bounded if it
is the pointwise limit of bounded linear operators between Banach spaces. Note that in
general, the pointwise limit of continuous functions is not continuous.

22.3 Corollary Suppose that 𝐸 and 𝐹 are Banach spaces and that (𝑇𝑛 ) is a sequence
in (𝐸, 𝐹 ) such that 𝑇 𝑥 ∶= lim𝑛→∞ 𝑇𝑛 𝑥 exists for all 𝑥 ∈ 𝐸. Then 𝑇 ∈ (𝐸, 𝐹 ) and
‖𝑇 ‖(𝐸,𝐹 ) ≤ lim inf ‖𝑇𝑛 ‖(𝐸,𝐹 ) < ∞. (22.3)
𝑛→∞

75
Proof. 𝑇 is linear because for 𝑥, 𝑦 ∈ 𝐸 and 𝜆, 𝜇 ∈ 𝕂 we have

𝑇 (𝜆𝑥 + 𝜇𝑦) = lim 𝑇𝑛 (𝜆𝑥 + 𝜇𝑦) = lim (𝜆𝑇𝑛 𝑥 + 𝜇𝑇𝑛 𝑦) = 𝜆𝑇 𝑥 + 𝜇𝑇 𝑦.


𝑛→∞ 𝑛→∞

By assumption the sequence (𝑇𝑛 𝑥) is bounded in 𝐹 for


( every 𝑥 ∈) 𝐸. Hence the uniform
boundedness principle (Theorem 22.1) implies that ‖𝑇𝑛 ‖(𝐸,𝐹 ) is a bounded sequence
as well. By definition of the limit inferior we get
( )
‖𝑇 𝑥‖ = lim ‖𝑇𝑛 𝑥‖𝐹 = lim inf ‖𝑇𝑛 𝑥‖𝐹 = lim inf ‖𝑇𝑘 𝑥‖𝐹
𝑛→∞ 𝑛→∞ 𝑛→∞ 𝑘≥𝑛
( ) ( )
≤ lim inf ‖𝑇𝑘 ‖(𝐸,𝐹 ) ‖𝑥‖𝐸 = lim inf ‖𝑇𝑛 ‖(𝐸,𝐹 ) ‖𝑥‖𝐸 < ∞
𝑛→∞ 𝑘≥𝑛 𝑛→∞

for all 𝑥 ∈ 𝐸. Hence by definition of the operator norm (22.3) follows.

Note that in the above proof we cannot in general replace the limit inferior by a limit
because we only know that the sequence ‖𝑇𝑛 ‖ is bounded.
As an application of the uniform boundedness principles we can prove that there
are Fourier series of continuous functions not converging at a given point, say at zero.

22.4 Example Let 𝐶2𝜋 ([−𝜋, 𝜋], ℂ) be the subspace of 𝑢 ∈ 𝐶([−𝜋, 𝜋], ℂ) with 𝑢(−𝜋) =
𝑢(𝜋) equipped with the supremum norm. It represents the space of continuous functions
on ℝ that are 2𝜋-periodic. As 𝐶2𝜋 ([−𝜋, 𝜋], ℂ) ⊆ 𝐿2 ((−𝜋, 𝜋), ℂ) it follows from Exam-
ple 18.4 that the Fourier series of every 𝑓 ∈ 𝐶([−𝜋, 𝜋], ℂ) converges in 𝐿2 ((−𝜋, 𝜋), ℂ).
The question about pointwise convergence is much more delicate. We use the uniform
boundedness principle to show that there are continuous functions, where the Fourier
series diverges. To do so let 𝑢 ∈ 𝐶([−𝜋, 𝜋], ℂ) with Fourier series

1 ∑
∞ 𝜋
𝑢(𝑥) = 𝑢(𝑡)𝑒−𝑖𝑘𝑡 𝑑𝑡𝑒𝑖𝑘𝑥
2𝜋 𝑘=−∞ ∫−𝜋

We will show that the operators 𝑇𝑛 ∶ 𝐶2𝜋 ([−𝜋, 𝜋], ℂ) → ℂ given by the partial sum

1 ∑
𝑛 𝜋
𝑇𝑛 𝑢 ∶= 𝑢𝑛 (0) ∶= 𝑢(𝑡)𝑒−𝑖𝑘𝑡 𝑑𝑡 (22.4)
2𝜋 𝑘=−𝑛 ∫−𝜋

of the Fourier series evaluated at 𝑥 = 0 is a bounded linear operator. It turns out that
‖𝑇𝑛 ‖ → ∞ as 𝑛 → ∞, so the family of operators (𝑇𝑛 )𝑛∈ℕ is not bounded. Hence,
by the uniform boundedness principle there must exist 𝑢 ∈ 𝐶2𝜋 ([−𝜋, 𝜋], ℂ) so that
𝑇𝑛 𝑢 = 𝑢𝑛 (0) is not bounded, that is, the Fourier series of 𝑢 does not converge at 𝑥 = 0.
One can replace 𝑥 = 0 by an arbitrary 𝑥 ∈ [−𝜋, 𝜋]. One can also use Baire’s Theorem
to show that for most 𝑢 ∈ 𝐶2𝜋 ([−𝜋, 𝜋], ℂ), the Fourier series diverges for instance on
[−𝜋, 𝜋] ∩ ℚ. We do not do that here, but refer to [10, Section 5.11].
We now show determine ‖𝑇𝑛 ‖ and show that ‖𝑇𝑛 ‖ → ∞ as 𝑛 → ∞. To do so we
set
∑𝑛
𝐷𝑛 (𝑡) ∶= 𝑒𝑖𝑘𝑡 .
𝑘=−𝑛

76
We can then write the partial sum of the Fourier series of 𝑢 in the form


𝑛
1 ∑ 𝑖𝑘𝑥
𝑛 𝜋
𝑢𝑛 (𝑥) = 𝑐𝑘 𝑒𝑖𝑘𝑥 = 𝑒 𝑢(𝑡)𝑒−𝑖𝑘𝑡 𝑑𝑡
𝑘=−𝑛
2𝜋 𝑘=−𝑛 ∫−𝜋
1
𝜋 ∑ 𝑛
1
𝜋
= 𝑢(𝑡) 𝑒𝑖𝑘((𝑥−𝑡) 𝑑𝑡 = 𝑢(𝑡)𝐷𝑛 (𝑥 − 𝑡) 𝑑𝑡.
2𝜋 ∫−𝜋 𝑘=−𝑛
2𝜋 ∫−𝜋

Using the formula for the partial sum of a geometric series we also see that


𝑛

2𝑛
𝑖𝑘𝑡 −𝑖𝑛𝑡
𝐷𝑛 (𝑡) = 𝑒 =𝑒 (𝑒𝑖𝑡 )𝑘
𝑘=−𝑛 𝑘=0
𝑒𝑖(2𝑛+1)𝑡 − 1 𝑒𝑖(𝑛+1∕2)𝑡 − 𝑒−𝑖(𝑛+1∕2)𝑡 sin(𝑛 + 1∕2)𝑡
= 𝑒−𝑖𝑛𝑡 = = .
𝑒𝑖𝑡 − 1 𝑒𝑖𝑡∕2 − 𝑒−𝑖𝑡∕2 sin(𝑡∕2)

The 2𝜋-periodic function 𝐷𝑛 is called the Dirichlet kernel. To summarise, we obtain

sin(𝑛 + 1∕2)𝑡
𝐷𝑛 (𝑡) = (22.5)
sin(𝑡∕2)

with a removable singluarity lim𝑡→0 𝐷𝑛 (𝑡) = 2𝑛 + 1. Using the Dirichlet kernel we


obtain a closed expression for the partial sum of a Fourier series, namely,
𝜋
1
𝑢𝑛 (𝑥) = 𝑢(𝑡)𝐷𝑛 (𝑥 − 𝑡) 𝑑𝑡. (22.6)
2𝜋 ∫−𝜋

𝑦
2𝑛 + 1

−𝜋 0 𝜋 𝑡

Figure 22.1: The Dirichlet kernel 𝐷𝑛 .

The operators 𝑇𝑛 ∶ 𝐶2𝜋 ([−𝜋, 𝜋], ℂ) → ℂ introduced in (22.4) is hence given by


𝜋
1
𝑇𝑛 𝑢 ∶= 𝑢𝑛 (0) = 𝑢(𝑡)𝐷𝑛 (𝑡) 𝑑𝑡 (22.7)
2𝜋 ∫−𝜋

77
are bounded since, using (22.6).
1
|𝑇𝑛 𝑢| ≤ ‖𝐷 ‖ ‖𝑢‖∞ .
2𝜋 𝑛 1
Hence,
1
‖𝐷 ‖ .
‖𝑇𝑛 ‖ ≤
2𝜋 𝑛 1
We would like to show equality. If we set 𝑢 ∶= sign 𝐷𝑛 , then 𝑢(𝑡)𝐷𝑛 (𝑡) = |𝐷𝑛 (𝑡)| and
thus 𝜋
1 1
|𝑇𝑛 𝑢| = |𝐷 (𝑡)| 𝑑𝑡 = ‖𝐷 ‖ .
2𝜋 ∫−𝜋 𝑛 2𝜋 𝑛 1
Unfortunately, 𝑢 = sign 𝐷𝑛 is not continous, otherwise the above would imply that
1
‖𝑇𝑛 ‖ ≥
‖𝐷 ‖ (22.8)
2𝜋 𝑛 1
However, we can approximate 𝑢 pointwise by functions 𝑢𝑘 ∈ 𝐶2𝜋 ([−𝜋, 𝜋], ℝ) with
values in [−1, 1] by putting in a line at every jump of with slope ±𝑘 as indicated in
Figure 22.2. The dominated convergence theorem then implies that

𝑦 sign 𝐷𝑛

𝑢𝑘

−𝜋 0 𝜋 𝑡

Figure 22.2: Approximating sign 𝐷𝑛 by continuous functions.

𝜋 𝜋 𝜋
lim 𝑢𝑘 (𝑡)𝐷𝑛 (𝑡) 𝑑𝑡 = sign(𝐷𝑛 (𝑡))𝐷𝑛 (𝑡) 𝑑𝑡 = |𝐷𝑛 (𝑡)| 𝑑𝑡 = ‖𝐷𝑛 ‖1
𝑘→∞ ∫−𝜋 ∫−𝜋 ∫−𝜋
and hence (22.8) is valid. In particular 2𝜋‖𝑇𝑛 ‖ = ‖𝐷𝑛 ‖1 for all 𝑛 ∈ ℕ. We finally need
to show that ‖𝑇𝑛 ‖ → ∞ or equivalently that ‖𝐷𝑛 ‖1 → ∞ as 𝑛 → ∞. As sin(𝑡∕2) ≤
|𝑡|∕2 for all 𝑡 ∈ ℝ we first note that
𝜋 𝜋 𝜋
| sin(𝑛 + 1∕2)𝑡|
|𝐷𝑛 (𝑡)| 𝑑𝑡 = 2 |𝐷𝑛 (𝑡)| 𝑑𝑡 = 2 𝑑𝑡
∫−𝜋 ∫0 ∫0 | sin(𝑡∕2)|
𝜋
| sin(𝑛 + 1∕2)𝑡|
≥4 𝑑𝑡 =∶ 4𝐼𝑛
∫0 𝑡
for all 𝑛 ∈ ℕ. Using the substitution 𝑠 = (𝑛 + 1∕2)𝑡 we see that
𝜋
| sin(𝑛 + 1∕2)𝑡| (𝑛+1∕2)𝜋
| sin 𝑠| ∑ 𝑘𝜋 | sin 𝑠|
𝑛
𝐼𝑛 = 𝑑𝑡 = 𝑑𝑠 ≥ 𝑑𝑠
∫0 𝑡 ∫0 𝑠 ∫
𝑘=1 (𝑘−1)𝜋
𝑠
∑𝑛
1
𝑘𝜋
1 ∑ 1 𝑛→∞
𝑛
≥ | sin 𝑠| 𝑑𝑠 = ← ∞
←←←←←←←←→
𝑘=1
𝑘𝜋 ∫(𝑘−1)𝜋 𝜋 1 𝑘
because of the divergence of the harmonic series. Hence, ‖𝐷𝑛 ‖1 → ∞ as claimed.

78
23 Closed Operators
In this section we look at a class of operators which are not necessarily bounded, but
still sharing many properties with bounded operators.

23.1 Definition (closed operator) Let 𝐸, 𝐹 be Banach spaces. We call 𝐴 a closed


operator from 𝐸 to 𝐹 with domain 𝐷(𝐴) if 𝐷(𝐴) is a subspace of 𝐸 and the graph

𝐺(𝐴) ∶= {(𝑥, 𝐴𝑥) ∶ 𝑥 ∈ 𝐷(𝐴)} ⊆ 𝐸 × 𝐹

is closed in 𝐸 × 𝐹 . We write 𝐴 ∶ 𝐷(𝐴) ⊆ 𝐸 → 𝐹 .

23.2 Remark (a) The graph of a linear operator 𝐴 ∶ 𝐷(𝐴) ⊆ 𝐸 → 𝐹 is closed if and
only if 𝑥𝑛 → 𝑥 in 𝐸 and 𝐴𝑥𝑛 → 𝑦 in 𝐹 imply that 𝑥 ∈ 𝐷(𝐴) and 𝐴𝑥 = 𝑦. The condition
looks similar to continuity. The difference is that continuity means that 𝑥𝑛 → 𝑥 in 𝐸
implies that the sequence (𝐴𝑥𝑛 ) automatically converges, whereas this is assumed in
case of a closed operator.
(b) Note that 𝐴 ∶ 𝐷(𝐴) → 𝐹 is always continuous if we consider 𝐷(𝐴) with the
graph norm ‖𝑥‖𝐴 ∶= ‖𝑥‖𝐸 + ‖𝐴𝑥‖𝐹 .
Classical examples of closed but unbounded operators are differential operators. We
demonstrate this with the simplest possible example.

23.3 Example Consider the vector space of continuous functions 𝐸 = 𝐶([0, 1]). De-
fine the operator 𝐴𝑢 ∶= 𝑢′ with domain

𝐷(𝐴) ∶= 𝐶 1 ([0, 1]) ∶= {𝑢 ∶ [0, 1] → ℝ ∣ 𝑢′ ∈ 𝐶([0, 1])}.

We show that 𝐴 ∶ 𝐷(𝐴) ⊆ 𝐸 → 𝐸 is a closed operator which is not bounded.


A standard result from analysis asserts that if 𝑢𝑛 → 𝑢 pointwise and 𝑢′𝑛 → 𝑣 uni-
formly, then 𝑢 is differentiable with 𝑢′ = 𝑣 ∈ 𝐶([0, 1]). Suppose that 𝑢𝑛 → 𝑢 and
𝐴𝑢𝑛 → 𝑣 in 𝐸. Since convergence in the supremum norm is the same as uniform con-
vergence we have in particular that 𝑢𝑛 → 𝑢 pointwise and 𝐴𝑢𝑛 = 𝑢′𝑛 → 𝑣 uniformly.
Hence 𝑢 ∈ 𝐶 1 ([0, 1]) = 𝐷(𝐴) and 𝐴𝑢 = 𝑢′ = 𝑣. Hence 𝐴 is a closed operator. We
next show that 𝐴 is not bounded. To do so let 𝑢𝑛 (𝑥) ∶= 𝑥𝑛 . Then ‖𝑢𝑛 ‖∞ = 1 for all
𝑛 ∈ ℕ, but ‖𝑢′𝑛 ‖∞ = 𝑛 → ∞ as 𝑛 → ∞. Hence 𝐴 maps a bounded set of 𝐷(𝐴) onto an
unbounded set in 𝐸 and therefore 𝐴 is not continuous by Theorem 8.3.
We next prove some basic properties of closed operators:

23.4 Theorem Suppose that 𝐸 and 𝐹 are Banach spaces and that 𝐴 ∶ 𝐷(𝐴) ⊆ 𝐸 → 𝐹
is an injective closed operator.

(i) Then 𝐴−1 ∶ im(𝐴) ⊆ 𝐹 → 𝐸 is a closed operator with domain 𝐷(𝐴−1 ) = im(𝐴).

(ii) If im(𝐴) = 𝐹 , then 𝐴−1 ∈ (𝐹 , 𝐸) is bounded.

(iii) If im(𝐴) = 𝐹 and 𝐴−1 ∶ im(𝐴) ⊆ 𝐹 → 𝐸 is bounded, then im(𝐴) = 𝐹 and


𝐴−1 ∈ (𝐹 , 𝐸).

79
Proof. (i) Note that

𝐺(𝐴) = {(𝑥, 𝐴𝑥) ∶ 𝑥 ∈ 𝐷(𝐴)} = {(𝐴−1 𝑦, 𝑦) ∶ 𝑦 ∈ im(𝐴)}

is closed in 𝐸 × 𝐹 . Since the map (𝑥, 𝑦) → (𝑦, 𝑥) is an isometric isomorphism from


𝐸 × 𝐹 to 𝐹 × 𝐸, the graph of 𝐴−1 is closed in 𝐹 × 𝐸.
(ii) By part (i) the operator 𝐴−1 ∶ im(𝐴) = 𝐹 → 𝐸 is closed. Since 𝐸, 𝐹 are
Banach spaces, the closed graph theorem (Theorem 21.1) implies that 𝐴−1 ∈ (𝐹 , 𝐸).
(iii) Suppose 𝑦 ∈ 𝐹 . Since im(𝐴) is dense there exist 𝑦𝑛 ∈ im(𝐴) such that 𝑦𝑛 → 𝑦
in 𝐹 . If we set 𝑥𝑛 ∶= 𝐴−1 𝑦𝑛 , then by the boundedness of 𝐴−1 we have ‖𝑥𝑛 − 𝑥𝑚 ‖𝐹 ≤
‖𝐴−1 ‖(im(𝐴),𝐸) ‖𝑦𝑛 −𝑦𝑚 ‖𝐸 for all 𝑛, 𝑚 ∈ ℕ. Since 𝑦𝑛 → 𝑦 it follows that (𝑥𝑛 ) is a Cauchy
sequence in 𝐸, so by completeness of 𝐸 we have 𝑥𝑛 = 𝐴−1 𝑦𝑛 → 𝑥 in 𝐸. Since 𝐴−1
is closed by (i) we get that 𝑦 ∈ 𝐷(𝐴−1 ) = im(𝐴) and 𝐴−1 𝑦 = 𝑥. Since 𝑦 ∈ 𝐹 was
arbitrary we get im(𝐴) = 𝐹 .

We can also characterise closed operators in terms of properties of the graph norm on
𝐷(𝐴) as introduced in Definition 9.5.

23.5 Proposition Suppose that 𝐸 and 𝐹 are Banach spaces. Then, the operator 𝐴 ∶ 𝐷(𝐴) ⊆
𝐸 → 𝐹 is closed if and only if 𝐷(𝐴) is a Banach space with respect to the graph norm
‖𝑥‖𝐴 ∶= ‖𝑥‖𝐸 + ‖𝐴𝑥‖𝐹 .
Proof. We first assume that 𝐴 is closed and show that 𝐷(𝐴) is complete with respect
to the graph norm. Let (𝑥𝑛 ) be a Cauchy sequence in (𝐷(𝐴), ‖⋅‖𝐴 ). Fix 𝜀 > 0. Then
there exists 𝑛0 ∈ ℕ such that

‖𝑥𝑛 − 𝑥𝑚 ‖𝐴 = ‖𝑥𝑛 − 𝑥𝑚 ‖𝐸 + ‖𝐴𝑥𝑛 − 𝐴𝑥𝑚 ‖𝐹 < 𝜀

for all 𝑚, 𝑛 > 𝑛0 . Hence (𝑥𝑛 ) and (𝐴𝑥𝑛 ) are Cauchy sequences in 𝐸 and 𝐹 , respectively.
By the completeness of 𝐸 and 𝐹 we conclude that 𝑥𝑛 → 𝑥 and 𝐴𝑥𝑛 → 𝑦. Since 𝐴 is
closed we have 𝑥 ∈ 𝐷(𝐴) and 𝐴𝑥 = 𝑦, so

‖𝑥𝑛 − 𝑥‖𝐴 = ‖𝑥𝑛 − 𝑥‖𝐸 + ‖𝐴𝑥𝑛 − 𝐴𝑥‖𝐹 → 0.

Hence, 𝑥𝑛 → 𝑥 in (𝐷(𝐴), ‖⋅‖𝐴 ), proving that (𝐷(𝐴), ‖⋅‖𝐴 ) is a Banach space.


Suppose now that (𝐷(𝐴), ‖⋅‖𝐴 ) is a Banach space. Suppose that 𝑥𝑛 → 𝑥 in 𝐸 and
𝐴𝑥𝑛 → 𝑦 in 𝐹 . Since (𝑥𝑛 ) is a Cauchy sequence in 𝐸 and (𝐴𝑥𝑛 ) a Cauchy sequence in
𝐹 , the definition of the graph norm

‖𝑥𝑛 − 𝑥𝑚 ‖𝐴 = ‖𝑥𝑛 − 𝑥𝑚 ‖𝐸 + ‖𝐴𝑥𝑛 − 𝐴𝑥𝑚 ‖𝐹

implies that (𝑥𝑛 ) is a Cauchy sequence with respect to the graph norm in 𝐷(𝐴). By the
completeness of 𝐷(𝐴) with respect to the graph norm we have 𝑥𝑛 → 𝑥 in the graph
norm. By the definition of the graph norm this in particular implies 𝑥 ∈ 𝐷(𝐴) and
𝐴𝑥𝑛 → 𝐴𝑥, so 𝐴𝑥 = 𝑦. Hence 𝐴 is closed.

In the following corollary we look at the relationship between continuous and closed
operators. Continuous means with respect to the norm on 𝐷(𝐴) induced by the norm
in 𝐸.

80
23.6 Corollary Suppose that 𝐸, 𝐹 are Banach spaces and that 𝐴 ∶ 𝐷(𝐴) ⊆ 𝐸 → 𝐹 is
a linear operator. Then 𝐴 ∈ (𝐷(𝐴), 𝐹 ) is closed if and only if 𝐷(𝐴) is closed in 𝐸.
Proof. Suppose that 𝐴 ∈ (𝐷(𝐴), 𝐹 ). By Proposition 9.7, the graph norm on 𝐷(𝐴)
is equivalent to the norm of 𝐸 restricted to 𝐷(𝐴). Since 𝐴 is closed we know from
Proposition 23.5 that 𝐷(𝐴) is complete with respect to the graph norm, and therefore
with respect to the norm in 𝐸. Hence 𝐷(𝐴) is closed in 𝐸. Suppose now that 𝐷(𝐴)
is closed in 𝐸. Since 𝐸 is a Banach space, also 𝐷(𝐴) is a Banach space. Hence, the
closed graph theorem (Theorem 21.1) implies that 𝐴 ∈ (𝐷(𝐴), 𝐹 ).

As a consequence of the above result the domain of a closed and unbounded operator
𝐴 ∶ 𝐷(𝐴) ⊆ 𝐸 → 𝐹 is never a closed subset of 𝐸.

24 Closable Operators and Examples


Often operators are initially defined on a relatively small domain with the disadvantage
that they are not closed. We show now that under a suitable assumption there exists a
closed operator extending such a linear operator.

24.1 Definition (extension/closure of a linear operator) (i) Suppose 𝐴 ∶ 𝐷(𝐴) ⊆ 𝐸 →


𝐹 is a linear operator. We call the linear operator 𝐵 ∶ 𝐷(𝐵) ⊆ 𝐸 → 𝐹 and extension of
𝐴 if 𝐷(𝐴) ⊆ 𝐷(𝐵) and 𝐴𝑥 = 𝐵𝑥 for all 𝑥 ∈ 𝐷(𝐴). We write 𝐵 ⊇ 𝐴 if that is the case.
(ii) The linear operator 𝐴 ∶ 𝐷(𝐴) ⊆ 𝐸 → 𝐹 is called closable if there exists a closed
operator 𝐵 ∶ 𝐷(𝐵) ⊆ 𝐸 → 𝐹 with 𝐵 ⊇ 𝐴.
(iii) We call 𝐴̄ the closure of 𝐴 if 𝐴̄ is closed, 𝐴̄ ⊇ 𝐴,
̄ and 𝐵 ⊇ 𝐴̄ for every closed
operator with 𝐵 ⊇ 𝐴.

24.2 Theorem (closure of linear operator) Suppose that 𝐴 ∶ 𝐷(𝐴) ⊆ 𝐸 → 𝐹 is a


linear operator. Then the following assertions are equivalent:

(i) 𝐴 is closable.

(ii) If 𝑥𝑛 → 0 in 𝐸 and 𝐴𝑥𝑛 → 𝑦 in 𝐹 , then 𝑦 = 0.

In that case 𝐺(𝐴) = 𝐺(𝐴) ̄ ⊆ 𝐸 × 𝐹 is the graph of the closure 𝐴,


̄ and 𝐷(𝐴)
̄ = {𝑥 ∈
𝐸 ∶ (𝑥, 𝑦) ∈ 𝐺(𝐴)} its domain.
Proof. Suppose that (i) is true and that 𝐵 is a closed operator with 𝐵 ⊇ 𝐴. If 𝑥𝑛 ∈
𝐷(𝐴), 𝑥𝑛 → 0 and 𝐴𝑥𝑛 → 𝑦, then 𝑥 ∈ 𝐷(𝐵) and 𝐵𝑥𝑛 = 𝐴𝑥𝑛 → 𝑦 as well. Since 𝐵 is
closed 𝐵0 = 0 = 𝑦, so (ii) follows.
Suppose now that (ii) holds. We show that 𝐺(𝐴) is the graph of a linear operator.
To do so we let (𝑥, 𝑦), (𝑥, 𝑦)
̃ ∈ 𝐺(𝐴) and show that 𝑦 = 𝑦.
̃ By choice of (𝑥, 𝑦) there exist
𝑥𝑛 ∈ 𝐷(𝐴) such that 𝑥𝑛 → 𝑥 in 𝐸 and 𝐴𝑥𝑛 → 𝑦 in 𝐹 . Similarly there exist 𝑧𝑛 ∈ 𝐷(𝐴)
with 𝑧𝑛 → 𝑥 and 𝐴𝑧𝑛 → 𝑦. ̃ Then 𝑥𝑛 − 𝑧𝑛 → 0 and 𝐴(𝑥𝑛 − 𝑧𝑛 ) → (𝑦 − 𝑦).
̃ By assumption
(ii) we conclude that 𝑦 − 𝑦̃ = 0, that is, 𝑦 = 𝑦.
̃ Hence we can define an operator 𝐴̄ by
setting 𝐴𝑥
̄ ∶= 𝑦 for all (𝑥, 𝑦) ∈ 𝐺(𝐴). Its domain is given by

̄ ∶= {𝑥 ∈ 𝐸 ∶ (𝑥, 𝑦) ∈ 𝐺(𝐴)}
𝐷(𝐴)

81
and its graph by 𝐺(𝐴) ̄ = 𝐺(𝐴). If 𝑥, 𝑧 ∈ 𝐷(𝐴),̄ then by the above construction there
exist sequences (𝑥𝑛 ) and (𝑧𝑛 ) in 𝐷(𝐴) with 𝐴𝑥𝑛 → 𝐴𝑥
̄ and 𝐴𝑧𝑛 → 𝐴𝑧.̄ Hence

̄ + 𝜇 𝐴𝑧
𝐴(𝜆𝑥𝑛 + 𝜇𝑧𝑛 ) = 𝜆𝐴𝑥𝑛 + 𝜇𝐴𝑧𝑛 → 𝜆𝐴𝑥 ̄

Since 𝜆𝑥𝑛 + 𝜇𝑧𝑛 → 𝜆𝑥 + 𝜇𝑧, by definition of 𝐴̄ we get (𝜆𝑥 + 𝜇𝑧, 𝜆𝐴𝑥 ̄ + 𝜇𝐴𝑧)
̄ ∈ 𝐺(𝐴) ̄
and so 𝐴(𝜆𝑥 + 𝜇𝑧) = 𝜆𝐴𝑥 + 𝜇𝐴𝑧. Hence 𝐷(𝐴) is a subspace of 𝐸 and 𝐴 is linear.
̄ ̄ ̄ ̄ ̄
Because 𝐺(𝐴) ̄ = 𝐺(𝐴) is closed, it follows that 𝐴̄ is a closed operator. Moreover, 𝐴̄ is
the closure of 𝐴 because 𝐺(𝐴) must be contained in the graph of any closed operator
extending 𝐴.

We can use the above in particular to construct extensions of bounded linear operators
defined on dense subspaces.

24.3 Corollary (Extension of linear operators) Suppose that 𝐸, 𝐹 are Banach spaces,
and that 𝐸0 is a dense subspace of 𝐸. If 𝑇0 ∶ (𝐸0 , 𝐹 ), then there exists a unique op-
erator 𝑇 ∈ (𝐸, 𝐹 ) such that 𝑇 𝑥 = 𝑇0 𝑥 for all 𝑥 ∈ 𝐸0 . Moreover, ‖𝑇0 ‖(𝐸0 ,𝐹 ) =
‖𝑇 ‖(𝐸,𝐹 ) .

Proof. We consider 𝑇0 as a linear operator on 𝐸 with domain 𝐷(𝑇0 ) = 𝐸0 . Suppose


that 𝑥𝑛 ∈ 𝐸0 with 𝑥𝑛 → 0 and 𝑇0 𝑥𝑛 → 𝑦. Since 𝑇0 is bounded we have ‖𝑇0 𝑥𝑛 ‖𝐹 ≤
‖𝑇0 ‖(𝐸0 ,𝐹 ) ‖𝑥𝑛 ‖𝐸 → 0, so 𝑦 = 0. By Theorem 24.2 𝑇0 has a closure which we denote
by 𝑇 . We show that 𝐷(𝑇 ) = 𝐸 and that 𝑇 is bounded. If 𝑥𝑛 ∈ 𝐸0 and 𝑥𝑛 → 𝑥 in 𝐸,
then by definition of 𝑇 𝑥 we have

‖𝑇 𝑥‖ = lim ‖𝑇0 𝑥𝑛 ‖𝐹 ≤ ‖𝑇0 ‖(𝐸0 ,𝐹 ) lim ‖𝑥𝑛 ‖𝐸 ≤ ‖𝑇0 ‖(𝐸0 ,𝐹 ) ‖𝑥‖𝐸 .


𝑛→∞ 𝑛→∞

Hence ‖𝑇 ‖(𝐸,𝐹 ) ≤ ‖𝑇0 ‖(𝐸0 ,𝐹 ) . Since clearly ‖𝑇 ‖(𝐸,𝐹 ) ≥ ‖𝑇0 ‖(𝐸0 ,𝐹 ) , the equality of
the operator norms follow. Now Corollary 23.6 implies that 𝐷(𝑇 ) is closed in 𝐸, and
since 𝐸0 ⊆ 𝐷(𝑇 ) is dense in 𝐸 we get 𝐷(𝑇 ) = 𝐸.

We complete this section by an example of a closable operator arising in reformulating


a Sturm-Liouville boundary value problem in a Hilbert space setting. The approach
can be used for much more general boundary value problems for partial differential
equations. We need one class of functions to deal with such problems.

24.4 Definition (support/test function) Let Ω ⊆ ℝ𝑁 be an open set and 𝑢 ∶ Ω → 𝕂 a


function. The set
supp(𝑢) ∶= {𝑥 ∈ Ω ∶ 𝑢(𝑥) ≠ 0}
is called the support of 𝑢. We let

𝐶 ∞ (Ω) ∶= {𝑢 ∶ Ω → 𝕂 ∶ 𝑢 has partial derivatives of all orders}

and
𝐶𝑐∞ (Ω) ∶= {𝑢 ∈ 𝐶 ∞ (Ω) ∶ supp(𝑢) ⊆ Ω is compact}.
The elements of 𝐶𝑐∞ (Ω) are called test functions.

82
24.5 Example Let 𝐼 = [𝑎, 𝑏] be a compact interval, 𝑝 ∈ 𝐶 1 (𝐼), 𝑞 ∈ 𝐶(𝐼) with 𝑝(𝑥) > 0
for all 𝑥 ∈ 𝐼 and 𝛼𝑖 , 𝛽𝑖 ∈ ℝ with 𝛼𝑖2 + 𝛽𝑖2 ≠ 0 (𝑖 = 1, 2). We want to reformulate the
Sturm-Liouville boundary value problem
−(𝑝𝑢′ )′ + 𝑞𝑢 = 𝑓 in (𝑎, 𝑏),
𝛼1 𝑢(𝑎) − 𝛽1 𝑢′ (𝑎) = 0 (24.1)
𝛼2 𝑢(𝑎) + 𝛽2 𝑢′ (𝑎) = 0
in a Hilbert space setting as a problem in 𝐿2 ((𝑎, 𝑏), ℝ). Functions in 𝐿2 ((𝑎, 𝑏), ℝ) are
not generally differentiable, so we first define a linear operator on a smaller domain by
setting
𝐴0 ∶= −(𝑝𝑢′ )′ + 𝑞𝑢
for all 𝑢 ∈ 𝐷(𝐴0 ), where
𝐷(𝐴0 ) ∶= {𝑢 ∈ 𝐶 ∞ (𝐼) ∶ 𝛼1 𝑢(𝑎) − 𝛽1 𝑢′ (𝑎) = 𝛼2 𝑢(𝑎) + 𝛽2 𝑢′ (𝑎) = 0}.
The idea is to deal with the boundary conditions by incorporating them into the defini-
tion of the domain of the differential operator. We note that if 𝐴0 𝑢 = 𝑓 with 𝑢 ∈ 𝐷(𝐴0 ),
then 𝑢 is a solution of (24.1).
We would like to deal with a closed operator, but unfortunately, 𝐴0 is not closed.
We will show that 𝐴0 is closable as an operator in 𝐿2 ((𝑎, 𝑏), ℝ). Clearly 𝐷(𝐴0 ) ⊆
𝐿2 ((𝑎, 𝑏), ℝ). Moreover, 𝐶𝑐∞ ((𝑎, 𝑏)) ⊆ 𝐷(𝐴0 ). Using integration by parts twice we see
that
𝑏 𝑏 𝑏
|𝑏
𝜑𝐴0 𝑢 𝑑𝑥 = 𝑢𝐴0 𝜑 𝑑𝑥 + 𝑝(𝑢𝜑′ − 𝜑𝑢′ )| = 𝑢𝐴0 𝜑 𝑑𝑥 (24.2)
∫𝑎 ∫𝑎 | 𝑎 ∫𝑎
for all 𝑢 ∈ 𝐷(𝐴0 ) and 𝜑 ∈ 𝐶𝑐∞ ((𝑎, 𝑏)) because 𝜑(𝑎) = 𝜑′ (𝑎) = 𝜑(𝑏) = 𝜑′ (𝑏) = 0.
Suppose now that 𝑢𝑛 ∈ 𝐷(𝐴0 ) such that 𝑢𝑛 → 0 in 𝐿2 ((𝑎, 𝑏), ℝ) and 𝐴0 𝑢𝑛 → 𝑣 in
𝐿2 ((𝑎, 𝑏), ℝ). Using (24.2) we and the Cauchy-Schwarz inequality we get
𝑏 𝑏
| | | |
| 𝜑𝐴0 𝑢𝑛 𝑑𝑥| = | 𝑢 𝐴 𝜑 𝑑𝑥| ≤ ‖𝑢𝑛 ‖2 ‖𝐴0 𝜑‖2 → 0
|∫𝑎 | |∫𝑎 𝑛 0 |
as 𝑛 → ∞. Therefore,
𝑏 𝑏
𝜑𝐴0 𝑢𝑛 𝑑𝑥 → 𝜑𝑣 𝑑𝑥 = 0
∫𝑎 ∫𝑎
for all 𝜑 ∈ 𝐶𝑐∞ ((𝑎, 𝑏)). One can show that 𝐶𝑐∞ ((𝑎, 𝑏)) is dense in 𝐿2 ((𝑎, 𝑏), ℝ), so
that 𝑣 = 0 in 𝐿2 ((𝑎, 𝑏), ℝ). By Theorem 24.2 the operator 𝐴0 is closable. Denote
its closure by 𝐴 with domain 𝐷(𝐴). Since 𝐶𝑐∞ ((𝑎, 𝑏)) is dense in 𝐿2 ((𝑎, 𝑏), ℝ) and
𝐶𝑐∞ ((𝑎, 𝑏)) ⊆ 𝐷(𝐴0 ) it follows that 𝐷(𝐴) is a dense subset of 𝐿2 ((𝑎, 𝑏), ℝ).
Instead of (24.1) it is common to study the abstract equation
𝐴𝑢 = 𝑓
in 𝐿2 ((𝑎, 𝑏), ℝ). The solutions may not be differentiable, but if they are, then they are
solutions to (24.1). We call solutions of the abstract equation generalised solutions of
(24.1). The advantage of that setting is that we can use all the Hilbert space theory
including Fourier series expansions of solutions.

83
84
Chapter VI
Duality

25 Dual Spaces
In this section we discuss the space of linear operators from a normed space over 𝕂 into
𝕂. This is called the “dual space” and plays an important role in many applications.
We give some precise definition and then examples.

25.1 Definition (Dual space) If 𝐸 is a normed vector space over 𝕂, then we call

𝐸 ′ ∶= (𝐸, 𝕂)

the dual space of 𝐸. The elements of 𝐸 ′ are often referred to as bounded linear func-
tionals on 𝐸. The operator norm on (𝐸, 𝕂) is called the dual norm on 𝐸 ′ and is
denoted by ‖⋅‖𝐸 ′ . If 𝑓 ∈ 𝐸 ′ , then we define

⟨𝑓 , 𝑥⟩ ∶= 𝑓 (𝑥).

25.2 Remark (a) As 𝕂 (= ℝ or ℂ) is complete it follows from Theorem 8.8 that 𝐸 ′ is


a Banach space, even if 𝐸 is not complete.
(b) For fixed 𝑥 ∈ 𝐸, the map

𝐸 ′ → 𝕂, 𝑓 → ⟨𝑓 , 𝑥⟩

is obviously linear. Hence


⟨⋅ , ⋅⟩ ∶ 𝐸 ′ × 𝐸 → 𝕂
is a bilinear map. It is called the duality map on 𝐸. Despite the similarity in notation,
⟨⋅ , ⋅⟩ is not an inner product, and we want to use different notation for the two concepts.
(c) By definition of the operator norm, the dual norm is given by
|⟨𝑓 , 𝑥⟩|
‖𝑓 ‖𝐸 ′ = sup = sup |⟨𝑓 , 𝑥⟩|
𝑥∈𝐸⧵{0} ‖𝑥‖𝐸 ‖𝑥‖𝐸 ≤1

and as a consequence
|⟨𝑓 , 𝑥⟩| ≤ ‖𝑓 ‖𝐸 ′ ‖𝑥‖𝐸
for all 𝑓 ∈ 𝐸 ′ and 𝑥 ∈ 𝐸. Hence the duality map ⟨⋅ , ⋅⟩ ∶ 𝐸 ′ × 𝐸 → 𝕂 is continuous.

85
We next give some examples of dual spaces. We also look at specific representations
of dual spaces and the associated duality map, which is useful in many applications.
These representations are often not easy to get.

25.3 Example Let 𝐸 = 𝕂𝑁 . Then the dual space is given by all linear maps from 𝕂𝑁
to 𝕂. Such maps can be represented by a 1 × 𝑁 matrix. Hence the elements of 𝕂𝑁
are column vectors, and the elements of (𝕂𝑁 )′ are row vectors, and the duality map is
simply matrix multiplication. More precisely, if
⎡ 𝑥1 ⎤
𝑓 = [𝑓1 , … , 𝑓𝑁 ] ∈ (𝕂𝑁 )′ and 𝑥 = ⎢ ⋮ ⎥ ∈ 𝕂𝑁 ,
⎢ ⎥
⎣𝑥𝑁 ⎦
then
⎡ 𝑥1 ⎤ ∑𝑁
⟨𝑓 , 𝑥⟩ = [𝑓1 , … , 𝑓𝑁 ] ⎢ ⋮ ⎥ = 𝑓 𝑥 ,
⎢ ⎥ 𝑘=1 𝑘 𝑘
⎣𝑥𝑁 ⎦
which is almost like the dot product except for the complex conjugate in the second
argument. By identifying the row vector 𝑓 with the column vector 𝑓 𝑇 we can identify
the dual of 𝕂𝑁 with itself.
Note however, that the dual norm depends by definition on the norm of the original
space. We want to compute the dual norm for the above identification by looking at
𝐸𝑝 = 𝕂𝑁 with the 𝑝-norm as defined in (7.1). We are going to show that, with the
above identification, ( )
1 1
𝐸𝑝′ = 𝐸𝑝′ + ′ =1
𝑝 𝑝
with equal norms. By Hölder’s inequality (Proposition 7.2)
|⟨𝑓 , 𝑥⟩| ≤ |𝑓 |𝑝′ |𝑥|𝑝 , (25.1)
for all 𝑥 ∈ 𝕂𝑁 and 1 ≤ 𝑝 ≤ ∞. Hence by definition of the dual norm ‖𝑓 ‖𝐸𝑝′ ≤ |𝑓 |𝑝′ .
We show that there is equality for 1 ≤ 𝑝 ≤ ∞, that is,
‖𝑓 ‖𝐸𝑝′ = |𝑓 |𝑝′ . (25.2)

We do this by choosing a suitable 𝑥 in (25.1). Let 𝜆𝑘 ∈ 𝕂 such that |𝜆𝑘 | = 1 and


|𝑓𝑘 | = 𝜆𝑘 𝑓𝑘 . First assume that 𝑝 ∈ (1, ∞]. For 𝑘 = 1, … , 𝑁 we set
{
𝜆𝑘 |𝑓𝑘 |1∕(𝑝−1) if 𝑝 ∈ (1, ∞)
𝑥𝑘 ∶=
𝜆𝑘 if 𝑝 = ∞.

Then, noting that 𝑝′ = 𝑝∕(𝑝 − 1) if 𝑝 ∈ (1, ∞) and 𝑝′ = 1 if 𝑝 = ∞, we have


{
𝑓𝑘 𝜆𝑘 |𝑓𝑘 |1∕(𝑝−1) = |𝑓𝑘 |𝑝∕(𝑝−1) = |𝑓𝑘 |𝑝 if 𝑝 ∈ (1, ∞),

𝑓𝑘 𝑥𝑘 =
if 𝑝 = ∞,

𝑓𝑘 𝜆𝑘 = |𝑓𝑘 | = |𝑓𝑘 |𝑝

and hence for that choice of 𝑥



𝑛
′ ′
⟨𝑓 , 𝑥⟩ = |𝑓𝑘 |𝑝 = |𝑓 |𝑝′ |𝑓 |𝑝𝑝′ −1 .
𝑘=1

86

Now if 𝑝 = ∞ then |𝑓 |𝑝𝑝′ −1 = |𝑓 |01 = 1, so ‖𝑓 ‖𝐸𝑝′ = |𝑓 |1 . If 𝑝 ∈ (1, ∞) then, by
definition of 𝑥 and since 𝑝′ = 𝑝∕(𝑝 − 1), we have

|𝑓𝑘 |𝑝 −1 = |𝑓𝑘 |1∕(𝑝−1) = |𝑥𝑘 |

and so

(∑
𝑁 )1−1∕𝑝′ (∑
𝑁 )1∕𝑝
|𝑓 |𝑝𝑝′ −1 = |𝑓𝑘 |𝑝∕(𝑝−1)
= |𝑥𝑘 |𝑝 = |𝑥|𝑝 .
𝑘=1 𝑘=1

Together with (25.1) we conclude that (25.2) is true if 1 < 𝑝 ≤ ∞. In case 𝑝 = 1 we


let 𝑗 ∈ {1, … , 𝑁} such that |𝑓𝑗 | = max𝑘=1,…,𝑁 |𝑓𝑘 |. Then set 𝑥𝑗 ∶= 𝜆𝑗 (𝜆𝑗 as defined
above) and 𝑥𝑘 = 0 otherwise. Then

⟨𝑓 , 𝑥⟩ = 𝜆𝑗 𝑓𝑗 = |𝑓𝑗 | = max |𝑓𝑘 | = |𝑓 |∞ ,


𝑘=1,…,𝑁

so ‖𝑓 ‖𝐸𝑝′ = |𝑓 |∞ if 𝑝 = 1 as well.
We next look at an infinite dimensional analogue to the above example, namely the
sequence spaces 𝓁𝑝 from Section 7.2.

25.4 Example For two sequences 𝑥 = (𝑥𝑘 ) and 𝑦 = (𝑦𝑘 ) in 𝕂 we set




⟨𝑥, 𝑦⟩ ∶= 𝑥𝑘 𝑦𝑘
𝑘=0

whenever the series converges. By Hölder’s inequality (see Proposition 7.4)

|⟨𝑥, 𝑦⟩| ≤ |𝑥|𝑝 |𝑦|𝑝′

for all 𝑥 ∈ 𝓁𝑝 and 𝑦 ∈ 𝓁𝑝′ , where and 1 ≤ 𝑝 ≤ ∞. This means that for every fixed
𝑦 ∈ 𝓁𝑝′ , the linear functional 𝑥 → ⟨𝑦, 𝑥⟩ is bounded, and therefore an element of the
dual space (𝓁𝑝 )′ . Hence, if we identify 𝑦 ∈ 𝓁𝑝′ with the map 𝑥 → ⟨𝑦, 𝑥⟩, then we
naturally have
𝓁𝑝′ ⊆ 𝓁𝑝′
for 1 ≤ 𝑝 ≤ ∞. The above identification is essentially the same as the one in the
previous example on finite dimensional spaces. We prove now that
(i) 𝓁𝑝′ = 𝓁𝑝′ if 1 ≤ 𝑝 < ∞ with equal norm;

(ii) 𝑐0′ = 𝓁1 with equal norm.


Most of the proof works for (i) and (ii). Given 𝑓 ∈ 𝓁𝑝′ we need to find 𝑦 ∈ 𝓁𝑝′ such that
∑∞
⟨𝑓 , 𝑥⟩ = 𝑘=0 𝑦𝑘 𝑥𝑘 for all 𝑥 ∈ 𝓁𝑝 . We set 𝑒𝑘 ∶= (𝛿𝑗𝑘 ) ∈ 𝓁𝑝 . We define

𝑦𝑘 ∶= ⟨𝑓 , 𝑒𝑘 ⟩

and show that 𝑦 ∶= (𝑦𝑘 ) ∈ 𝓁𝑝′ . Further we define 𝑓𝑛 ∈ 𝓁𝑝′ by


𝑛
⟨𝑓𝑛 , 𝑥⟩ ∶= 𝑦𝑘 𝑥𝑘
𝑘=0

87
∑𝑛
If we set 𝑥(𝑛) ∶= 𝑘=0
𝑥𝑘 𝑒𝑘 , then
⟨ ∑ 𝑛 ⟩ ∑𝑛

𝑛
⟨𝑓 , 𝑥 ⟩ = 𝑓 ,
(𝑛)
𝑥𝑘 𝑒𝑘 = 𝑥𝑘 ⟨𝑓 , 𝑒𝑘 ⟩ = 𝑦𝑘 𝑥𝑘 = ⟨𝑓𝑛 , 𝑥⟩ (25.3)
𝑘=0 𝑘=0 𝑘=0

by definition of 𝑦𝑘 for all 𝑛 ∈ ℕ. Next note that 𝑥(𝑛) → 𝑥 in 𝓁𝑝 . Indeed, if 1 ≤ 𝑝 < ∞


and 𝑥 ∈ 𝓁𝑝 , then
∑∞
|𝑥 − 𝑥|𝑝 =
(𝑛) 𝑝
|𝑥𝑘 |𝑝 → 0
𝑘=𝑛+1

as 𝑛 → ∞. If 𝑝 = ∞ and 𝑥 ∈ 𝑐0 , then also

|𝑥(𝑛) − 𝑥|∞ = sup |𝑥𝑘 | → 0


𝑘≥𝑛

as 𝑛 → ∞, so 𝑥(𝑛) → 𝑥 in 𝑐0 . Hence, by definition of 𝑦 and (25.3) we obtain



𝑥𝑘 𝑦𝑘 = lim ⟨𝑓𝑛 , 𝑥⟩ = lim ⟨𝑓 , 𝑥(𝑛) ⟩ = ⟨𝑓 , 𝑥⟩,
𝑛→∞ 𝑛→∞
𝑘=0

and so 𝑓𝑛 → 𝑓 pointwise and ⟨𝑦, 𝑥⟩ = ⟨𝑓 , 𝑥⟩. By the previous example we know


that |(𝑦0 , … , 𝑦𝑛 )|𝑝′ = ‖𝑓𝑛 ‖𝓁𝑝′ for all 𝑛 ∈ ℕ. Hence the uniform boundedness principle
(Theorem 22.1) implies that

|𝑦|𝑝′ = sup |(𝑦0 , … , 𝑦𝑛 )|𝑝′ = sup ‖𝑓𝑛 ‖𝓁𝑝′ < ∞.


𝑛∈ℕ 𝑛∈ℕ

The first equality holds since |(𝑦1 , … , 𝑦𝑛 )|𝑝′ defines a monotone increasing sequence
with limit |𝑦|𝑝′ . In particular we conclude that 𝑦 ∈ 𝓁𝑝′ .
We next show that |𝑦|𝑝′ = ‖𝑓 ‖𝓁𝑝′ . By Hölder’s inequality |⟨𝑓 , 𝑥⟩| = |⟨𝑦, 𝑥⟩| ≤
|𝑦|𝑝′ |𝑥|𝑝 , and so ‖𝑓 ‖𝓁𝑝′ ≤ |𝑦|𝑝′ . To prove the reverse inequality choose 𝑥(𝑛) ∈ 𝓁𝑝
such that ⟨𝑓 , 𝑥(𝑛) ⟩ = |(𝑦0 , … , 𝑦𝑛 )|𝑝′ |𝑥(𝑛) |𝑝 , which is possible by the construction in the
previous example. Thus by definition of the dual norm, |(𝑦0 , … , 𝑦𝑛 )|𝑝′ ≤ ‖𝑓 ‖𝓁𝑝′ for all
𝑛 ∈ ℕ and hence,
|𝑦|𝑝′ = sup |(𝑦0 , … , 𝑦𝑛 )|𝑝′ ≤ ‖𝑓 ‖𝓁𝑝′
𝑛∈ℕ

as required. Hence (i) and (ii) follow.


We finally mention an even more general version of the above, namely the dual spaces
of the 𝐿𝑝 spaces. Proofs can be found in [13, Section IV.9].

25.5 Example If Ω ⊆ ℝ𝑁 is an open set we can consider 𝐿𝑝 (Ω). By Hölder’s inequality


we have
| |
| 𝑓 (𝑥)𝑔(𝑥) 𝑑𝑥| ≤ ‖𝑓 ‖𝑝′ ‖𝑔‖𝑝
|∫Ω |
for 1 ≤ 𝑝 ≤ ∞. Hence, if we identify 𝑓 ∈ 𝐿𝑝′ (Ω) with the linear functional

𝑔→ 𝑓 (𝑥)𝑔(𝑥) 𝑑𝑥,
∫Ω

88
( )′ ( )′
then we have 𝑓 ∈ 𝐿𝑝 (Ω) . With that identification 𝐿𝑝′ (Ω) ⊆ 𝐿𝑝 (Ω) for 1 ≤ 𝑝 ≤
∞. Using the Radon-Nikodym Theorem from measure theory one can show that
( )′
𝐿𝑝 (Ω) = 𝐿𝑝′ (Ω)

if 1 ≤ 𝑝 < ∞ and ( )′
𝐿1 (Ω) ⊂ 𝐿∞ (Ω)
with proper inclusion.

26 The Hahn-Banach Theorem


We have introduced the dual space to a normed vector space, but do not know how big
it is in general. The Hahn-Banach theorem in particular shows the existence of many
bounded linear functionals on every normed vector space. For the most basic version
for vector spaces over ℝ we do not even need to have a norm on the vector space, but
just a semi-norm.

26.1 Theorem (Hahn-Banach, ℝ-version) Suppose that 𝐸 is a vector space over ℝ


and that 𝑝 ∶ 𝐸 → ℝ is a map satisfying

(i) 𝑝(𝑥 + 𝑦) ≤ 𝑝(𝑥) + 𝑝(𝑦) for all 𝑥 ∈ 𝐸 (sub-additivity);

(ii) 𝑝(𝛼𝑥) = 𝛼𝑝(𝑥) for all 𝑥 ∈ 𝐸 and 𝛼 ≥ 0 (positive homogeneity).

Let 𝑀 be a subspace and 𝑓0 ∈ Hom(𝑀, ℝ) with 𝑓0 (𝑥) ≤ 𝑝(𝑥) for all 𝑥 ∈ 𝑀. Then
there exists an extension 𝑓 ∈ Hom(𝐸, ℝ) such that 𝑓 |𝑀 = 𝑓0 and 𝑓 (𝑥) ≤ 𝑝(𝑥) for all
𝑥 ∈ 𝐸.
Proof. (a) We first show that 𝑓0 can be extended as required if 𝑀 has co-dimension
one. More precisely, let 𝑥0 ∈ 𝐸 ⧵𝑀 and assume that span(𝑀 ∪{𝑥0 }) = 𝐸. As 𝑥0 ∉ 𝑀
we can write every 𝑥 ∈ 𝐸 in the form

𝑥 = 𝑚 + 𝛼𝑥0

with 𝛼 ∈ ℝ and 𝑚 ∈ 𝑀 uniquely determined by 𝑥. Hence for every 𝑐 ∈ ℝ, the map


𝑓𝑐 ∈ Hom(𝐸, ℝ) given by

𝑓𝑐 (𝑚 + 𝛼𝑥) ∶= 𝑓0 (𝑚) + 𝑐𝛼

is well defined, and 𝑓𝑐 (𝑚) = 𝑓0 (𝑚) for all 𝑚 ∈ 𝑀. We want to show that it is possible
to choose 𝑐 ∈ ℝ such that 𝑓𝑐 (𝑥) ≤ 𝑝(𝑥) for all 𝑥 ∈ 𝐸. Equivalently

𝑓0 (𝑚) + 𝑐𝛼 ≤ 𝑝(𝑚 + 𝛼𝑥0 )

for all 𝑚 ∈ 𝑀 and 𝛼 ∈ ℝ. By the positive homogeneity of 𝑝 and the linearity of 𝑓 the
above is equivalent to

𝑓0 (𝑚∕𝛼) + 𝑐 ≤ 𝑝(𝑥0 + 𝑚∕𝛼) if 𝛼 > 0,


𝑓0 (−𝑚∕𝛼) − 𝑐 ≤ 𝑝(−𝑥0 − 𝑚∕𝛼) if 𝛼 < 0.

89
Hence we need to show that we can choose 𝑐 such that
𝑐 ≤ 𝑝(𝑥0 + 𝑚∕𝛼) − 𝑓0 (𝑚∕𝛼)
𝑐 ≥ −𝑝(−𝑥0 − 𝑚∕𝛼) + 𝑓0 (−𝑚∕𝛼)

for all 𝑚 ∈ 𝑀 and 𝛼 ∈ ℝ. Note that ±𝑚∕𝛼 is just an arbitrary element of 𝑀, so the
above conditions reduce to
𝑐 ≤ 𝑝(𝑥0 + 𝑚) − 𝑓0 (𝑚)
𝑐 ≥ −𝑝(−𝑥0 + 𝑚) + 𝑓0 (𝑚)

for all 𝑚 ∈ 𝑀. Such a choice of 𝑐 is therefore possible if

−𝑝(−𝑥0 + 𝑚1 ) + 𝑓0 (𝑚1 ) ≤ 𝑝(𝑥0 + 𝑚2 ) − 𝑓0 (𝑚2 )

for all 𝑚1 , 𝑚2 ∈ 𝑀. By the linearity of 𝑓0 this is equivalent to

𝑓0 (𝑚1 + 𝑚2 ) = 𝑓0 (𝑚1 ) + 𝑓0 (𝑚2 ) ≤ 𝑝(−𝑥0 + 𝑚1 ) + 𝑝(𝑥0 + 𝑚2 )

for all 𝑚1 , 𝑚2 ∈ 𝑀. Using the sub-additivity of 𝑝 we can verify this condition since

𝑓0 (𝑚1 + 𝑚2 ) ≤ 𝑝(𝑚1 + 𝑚2 ) = 𝑝(𝑚1 − 𝑥0 + 𝑚2 + 𝑥0 ) ≤ 𝑝(𝑚1 − 𝑥0 ) + 𝑝(𝑚2 + 𝑥0 )

for all 𝑚1 , 𝑚2 ∈ 𝑀. Hence 𝑐 can be chosen as required.


(b) To prove the assertion of the theorem in the general case we use Zorn’s Lemma
discussed in Section 1. We consider the set 𝑋 of all extensions 𝑔 of 𝑓0 such that 𝑔(𝑥) ≤
𝑝(𝑥) for all 𝑥 in the domain 𝐷(𝑔) of 𝑔. We write 𝑔2 ⊇ 𝑔1 if 𝑔2 an extension 𝑔1 . Then
⊆ defines a partial ordering of 𝑋. Moreover, 𝑓0 ∈ 𝑋, so that 𝑋 ≠ ∅. Suppose now

that 𝐶 = {𝑔𝛼 ∶ 𝛼 ∈ 𝐴} is an arbitrary chain in 𝑋 and set 𝐷(𝑔) ∶= 𝛼∈𝐴 𝐷(𝑔𝛼 ) and
𝑔(𝑥) ∶= 𝑔𝛼 (𝑥) if 𝑥 ∈ 𝐷(𝑔𝛼 ). Then 𝑔 is an upper bound of 𝐶 in 𝑋. By Zorn’s Lemma
(Theorem 1.7) 𝑋 has a maximal element 𝑓 with domain 𝐷(𝑓 ). We claim that 𝐷(𝑓 ) =
𝐸. If not, then by part (a) of this proof 𝑓 has a proper extension in 𝑋, so 𝑓 was not
maximal. Hence 𝐷(𝑓 ) = 𝐸, and 𝑓 is a functional as required.

We next want to get a version of the Hahn-Banach theorem for vector spaces over ℂ.
To do so we need to strengthen the assumptions on 𝑝.

26.2 Definition (semi-norm) Suppose that 𝐸 is a vector space over 𝕂. A map 𝑝 ∶ 𝐸 →


ℝ is called a semi-norm on 𝐸 if
(i) 𝑝(𝑥 + 𝑦) ≤ 𝑝(𝑥) + 𝑝(𝑦) for all 𝑥, 𝑦 ∈ 𝐸;

(ii) 𝑝(𝛼𝑥) = |𝛼|𝑝(𝑥) for all 𝑥 ∈ 𝐸 and all 𝛼 ∈ 𝕂.

26.3 Lemma Let 𝑝 be a semi-norm on 𝐸. Then 𝑝(𝑥) ≥ 0 for all 𝑥 ∈ 𝐸 and 𝑝(0) = 0.
Moreover, |𝑝(𝑥) − 𝑝(𝑦)| ≤ 𝑝(𝑥 − 𝑦) for all 𝑥, 𝑦 ∈ 𝐸 (reversed triangle inequality).
Proof. Firstly by (ii) we have 𝑝(0) = 𝑝(0𝑥) = 0𝑝(𝑥) = 0 for all 𝑥 ∈ 𝐸, so 𝑝(0) = 0. If
𝑥, 𝑦 ∈ 𝐸, then by (i)

𝑝(𝑥) = 𝑝(𝑥 − 𝑦 + 𝑦) ≤ 𝑝(𝑥 − 𝑦) + 𝑝(𝑦),

90
so that
𝑝(𝑥) − 𝑝(𝑦) ≤ 𝑝(𝑥 − 𝑦).
Interchanging the roles of 𝑥 and 𝑦 and using (ii) we also have

𝑝(𝑦) − 𝑝(𝑥) ≤ 𝑝(𝑦 − 𝑥) = 𝑝(−(𝑥 − 𝑦)) = 𝑝(𝑥 − 𝑦).

Combining the two inequalities we get |𝑝(𝑥) − 𝑝(𝑦)| ≤ 𝑝(𝑥 − 𝑦) for all 𝑥, 𝑦 ∈ 𝐸.
Choosing 𝑦 = 0 we get 0 ≤ |𝑝(𝑥)| ≤ 𝑝(𝑥), so 𝑝(𝑥) ≥ 0 for all 𝑥 ∈ 𝐸.

We can now prove a version of the Hahn-Banach Theorem for vector spaces over ℂ.

26.4 Theorem (Hahn-Banach, ℂ-version) Suppose that 𝑝 is a semi-norm on the vec-


tor space 𝐸 over 𝕂 and let 𝑀 be a subspace of 𝐸. If 𝑓0 ∈ Hom(𝑀, 𝕂) is such that
|𝑓0 (𝑥)| ≤ 𝑝(𝑥) for all 𝑥 ∈ 𝑀, then there exists an extension 𝑓 ∈ Hom(𝐸, 𝕂) such that
𝑓 |𝑀 = 𝑓0 and |𝑓 (𝑥)| ≤ 𝑝(𝑥) for all 𝑥 ∈ 𝐸
Proof. If 𝕂 = ℝ, then the assertion of the theorem follows from Lemma 26.3 and
Theorem 26.1. Let now 𝐸 be a vector space over ℂ. We split 𝑓0 ∈ Hom(𝑀, ℂ) into
real and imaginary parts
𝑓0 (𝑥) ∶= 𝑔0 (𝑥) + 𝑖ℎ0 (𝑥)
with 𝑔0 , ℎ0 real valued and 𝑔0 , ℎ0 ∈ Hom(𝑀ℝ , ℝ), where 𝑀ℝ is 𝑀 considered as a
vector space over ℝ. Due to the linearity of 𝑓0 we have

0 = 𝑖𝑓0 (𝑥) − 𝑓0 (𝑖𝑥) = 𝑖𝑔0 (𝑥) − ℎ0 (𝑥) − 𝑔0 (𝑖𝑥) − 𝑖ℎ0 (𝑖𝑥)


( ) ( )
= − 𝑔0 (𝑖𝑥) + ℎ0 (𝑥) + 𝑖 𝑔0 (𝑥) − ℎ0 (𝑖𝑥)

for all 𝑥 ∈ 𝑀. In particular, ℎ0 (𝑥) = −𝑔0 (𝑖𝑥) and so 𝑔0 (𝑖𝑥) + ℎ0 (𝑥) = 0 for all 𝑥 ∈ 𝑀.
Therefore,
𝑓0 (𝑥) = 𝑔0 (𝑥) − 𝑖𝑔0 (𝑖𝑥) (26.1)
for all 𝑥 ∈ 𝑀. We now consider 𝐸 as a vector space over ℝ. We denote that vector
space by 𝐸ℝ . We can consider 𝑀ℝ as a vector subspace of 𝐸ℝ . Now clearly 𝑔0 ∈
Hom(𝑀ℝ , ℝ) and by assumption

𝑔0 (𝑥) ≤ |𝑓0 (𝑥)| ≤ 𝑝(𝑥)

for all 𝑥 ∈ 𝑀ℝ . By Theorem 26.1 there exists 𝑔 ∈ Hom(𝐸ℝ , ℝ) such that 𝑔|𝑀ℝ = 𝑔0
and
𝑔(𝑥) ≤ 𝑝(𝑥)
for all 𝑥 ∈ 𝐸ℝ . We set
𝑓 (𝑥) ∶= 𝑔(𝑥) − 𝑖𝑔(𝑖𝑥)
for all 𝑥 ∈ 𝐸ℝ , so that 𝑓 |𝑀 = 𝑓0 by (26.1). To show that 𝑓 is linear from 𝐸 to ℂ we
only need to look at multiplication by 𝑖 because 𝑓 is linear over ℝ. We have

𝑓 (𝑖𝑥) = 𝑔(𝑖𝑥) − 𝑖𝑔(𝑖2 𝑥) = 𝑔(𝑖𝑥) − 𝑖𝑔(−𝑥)


( )
= 𝑔(𝑖𝑥) + 𝑖𝑔(𝑥) = 𝑖 𝑔(𝑥) − 𝑖𝑔(𝑖𝑥) = 𝑖𝑓 (𝑥)

91
for all 𝑥 ∈ 𝐸. It remains to show that |𝑓 (𝑥)| ≤ 𝑝(𝑥). For fixed 𝑥 ∈ 𝐸 we choose 𝜆 ∈ ℂ
with |𝜆| = 1 such that 𝜆𝑓 (𝑥) = |𝑓 (𝑥)|. Then since |𝑓 (𝑥)| ∈ ℝ and using the definition
of 𝑓
|𝑓 (𝑥)| = 𝜆𝑓 (𝑥) = 𝑓 (𝜆𝑥) = 𝑔(𝜆𝑥) ≤ 𝑝(𝜆𝑥) = |𝜆|𝑝(𝑥) = 𝑝(𝑥).
This completes the proof of the theorem.

From the above version of the Hahn-Banach theorem we can prove the existence of
bounded linear functionals with various special properties.

26.5 Corollary Suppose that 𝐸 is a normed space. Then for every 𝑥 ∈ 𝐸, 𝑥 ≠ 0, there
exists 𝑓 ∈ 𝐸 ′ such that ‖𝑓 ‖𝐸 ′ = 1 and ⟨𝑓 , 𝑥⟩ = ‖𝑥‖𝐸 .
Proof. Fix 𝑥 ∈ 𝐸 with 𝑥 ≠ 0 and set 𝑀 ∶= span{𝑥}. Define 𝑓0 ∈ Hom(𝑀, 𝕂) by
setting ⟨𝑓0 , 𝛼𝑥⟩ ∶= 𝛼‖𝑥‖𝐸 for all 𝛼 ∈ 𝕂. Setting 𝑝(𝑦) ∶= ‖𝑦‖ we have

|⟨𝑓0 , 𝛼𝑥⟩| = |𝛼|‖𝑥‖𝐸 = ‖𝛼𝑥‖𝐸 = 𝑝(𝛼𝑥)

for all 𝛼 ∈ 𝕂. By Theorem 26.4 there exists 𝑓 ∈ Hom(𝐸, 𝕂) such that 𝑓 (𝛼𝑥) = 𝑓0 (𝛼𝑥)
for all 𝛼 ∈ 𝕂 and |⟨𝑓 , 𝑦⟩| ≤ 𝑝(𝑦) = ‖𝑦‖𝐸 for all 𝑦 ∈ 𝐸. Hence 𝑓 ∈ 𝐸 ′ and ‖𝑓 ‖𝐸 ′ ≤ 1.
Since |⟨𝑓 , 𝑥⟩| = ‖𝑥‖𝐸 by definition of 𝑓 we have ‖𝑓 ‖𝐸 ′ = 1, completing the proof of
the corollary.

26.6 Remark By the above corollary 𝐸 ′ ≠ {0} for all normed spaces 𝐸 ≠ {0}.
Theorem 26.4 also allows us to get continuous extensions of linear functionals.

26.7 Theorem (extension of linear functionals) Let 𝐸 be normed space and 𝑀 ⊂ 𝐸


a proper subspace. Then for every 𝑓0 ∈ 𝑀 ′ there exists 𝑓 ∈ 𝐸 ′ such that 𝑓 |𝑀 = 𝑓0
and ‖𝑓 ‖𝐸 ′ = ‖𝑓0 ‖𝑀 ′ .
Proof. The function 𝑝 ∶ 𝐸 → ℝ given by 𝑝(𝑥) ∶= ‖𝑓0 ‖𝑀 ′ ‖𝑥‖𝐸 defines a semi-norm
on 𝐸. Clearly
|⟨𝑓0 , 𝑥⟩| ≤ ‖𝑓0 ‖𝑀 ′ ‖𝑥‖𝐸
for all 𝑥 ∈ 𝑀, so by Theorem 26.4 there exists 𝑓 ∈ Hom(𝐸, 𝕂) with 𝑓 |𝑀 = 𝑓0 and

|⟨𝑓 , 𝑥⟩| ≤ 𝑝(𝑥) = ‖𝑓0 ‖𝑀 ′ ‖𝑥‖𝐸

for all 𝑥 ∈ 𝐸. In particular 𝑓 ∈ 𝐸 ′ and ‖𝑓 ‖𝐸 ′ ≤ ‖𝑓0 ‖𝑀 ′ . Moreover,

‖𝑓 ‖𝑀 ′ = sup |⟨𝑓 , 𝑥⟩| ≤ sup |⟨𝑓 , 𝑥⟩| = ‖𝑓 ‖𝐸 ′ ,


𝑥∈𝑀 𝑥∈𝐸
‖𝑥‖𝐸 ≤1 ‖𝑥‖𝐸 ≤1

so that ‖𝑓 ‖𝐸 ′ = ‖𝑓0 ‖𝑀 ′ .

We can use the above theorem to obtain linear functionals with special properties.

26.8 Corollary Let 𝐸 be a normed space and 𝑀 ⊂ 𝐸 a proper closed subspace. Then
for every 𝑥0 ∈ 𝐸 ⧵ 𝑀 there exists 𝑓 ∈ 𝐸 ′ such that 𝑓 |𝑀 = 0 and ⟨𝑓 , 𝑥0 ⟩ = 1.

92
Proof. Fix 𝑥0 ∈ 𝐸 ⧵ 𝑀 be arbitrary and let 𝑀1 ∶= 𝑀 ⊕ span{𝑥0 }. For 𝑚 + 𝛼𝑥0 ∈ 𝑀1
define
⟨𝑓0 , 𝑚 + 𝛼𝑥0 ⟩ ∶= 𝛼.
for all 𝛼 ∈ 𝕂 and 𝑚 ∈ 𝑀. Then clearly ⟨𝑓0 , 𝑥0 ⟩ = 1 and ⟨𝑓0 , 𝑚⟩ = 0 for all 𝑚 ∈ 𝑀.
Moreover, since 𝑀 is closed 𝑑 ∶= dist(𝑥0 , 𝑀) > 0 and

inf ‖𝑚 + 𝛼𝑥0 ‖ = |𝛼| inf ‖ − 𝛼 −1 𝑚 − 𝑥0 ‖ = |𝛼| inf ‖𝑚 − 𝑥0 ‖ = 𝑑|𝛼|.


𝑚∈𝑀 𝑚∈𝑀 𝑚∈𝑀

Hence
|⟨𝑓0 , 𝑚 + 𝛼𝑥0 ⟩| = |𝛼| ≤ 𝑑 −1 ‖𝑚 + 𝛼𝑥0 ‖
for all 𝑚 ∈ 𝑀 and 𝛼 ∈ 𝕂. Hence 𝑓0 ∈ 𝑀1′ . Now we can apply Theorem 26.7 to
conclude the proof.

27 Reflexive Spaces
The dual of a normed space is again a normed space. Hence we can look at the dual of
that space as well. The question is whether or not the second dual coincides with the
original space.

27.1 Definition (Bi-dual space) If 𝐸 is a normed space we call

𝐸 ′′ ∶= (𝐸 ′ )′

the Bi-dual or second dual space of 𝐸.


We have seen that for every fixed 𝑥 ∈ 𝐸 the map 𝑓 → ⟨𝑓 , 𝑥⟩ is linear and

|⟨𝑓 , 𝑥⟩| ≤ ‖𝑓 ‖𝐸 ′ ‖𝑥‖𝐸

for all 𝑓 ∈ 𝐸 ′ . Hence we can naturally identify 𝑥 ∈ 𝐸 with ⟨⋅, 𝑥⟩ ∈ 𝐸 ′′ . With that
canonical identification we have 𝐸 ⊆ 𝐸 ′′ and

‖⟨⋅, 𝑥⟩‖𝐸 ′′ ≤ ‖𝑥‖𝐸 .

By the Hahn-Banach Theorem (Corollary 26.5) there exists 𝑓 ∈ 𝐸 ′ such that ⟨𝑓 , 𝑥⟩ =


‖𝑥‖, so we have
‖⟨⋅, 𝑥⟩‖𝐸 ′′ = ‖𝑥‖𝐸 .
This proves the following proposition. It also turns out that the norms are the same.

27.2 Proposition The map 𝑥 → ⟨⋅, 𝑥⟩ is an isometric embedding of 𝐸 into 𝐸 ′′ , that is,
𝐸 ⊆ 𝐸 ′′ with equal norms.
For some spaces there is equality.

27.3 Definition (reflexive space) We say that a normed space 𝐸 is reflexive if 𝐸 = 𝐸 ′′


with the canonical identification 𝑥 → ⟨⋅, 𝑥⟩.

93
27.4 Remark (a) As dual spaces are always complete (see Remark 25.2(a)), every re-
flexive space is a Banach space.
(b) Not every Banach space is reflexive. We have shown in Example 25.4 that
𝑐0 = 𝓁1 and 𝓁1′ = 𝓁∞ , so that

𝑐0′′ = 𝓁1′ = 𝓁∞ ≠ 𝑐0

Other examples of non-reflexive Banach spaces are 𝐿1 (Ω), 𝐿∞ (Ω) and 𝐵𝐶(Ω).

27.5 Examples (a) Every finite dimensional normed space is reflexive.


(b) 𝓁𝑝 is reflexive for 𝑝 ∈ (1, ∞) as shown in Example 25.4.
(c) 𝐿𝑝 (Ω) is reflexive for 𝑝 ∈ (1, ∞) as shown in Example 25.5.
(d) Every Hilbert space is reflexive as we will show in Section 30.

28 Weak convergence
So far, in a Banach space we have only looked at sequences converging with respect to
the norm in the space. Since we work in infinite dimensions there are more notions of
convergence of a sequence. We introduce one involving dual spaces.

28.1 Definition (weak convergence) Let 𝐸 be a Banach space and (𝑥𝑛 ) a sequence in
𝐸. We say (𝑥𝑛 ) converges weakly to 𝑥 in 𝐸 if

lim ⟨𝑓 , 𝑥𝑛 ⟩ = ⟨𝑓 , 𝑥⟩
𝑛→∞

for all 𝑓 ∈ 𝐸 ′ . We write 𝑥𝑛 ⇀ 𝑥 in that case.

28.2 Remark (a) Clearly, if 𝑥𝑛 → 𝑥 with respect to the norm, then 𝑥𝑛 ⇀ 𝑥 weakly.
(b) The converse is not true, that is, weak convergence does not imply convergence
with respect to the norm. As an example consider the sequence 𝑒𝑛 ∶= (𝛿𝑘𝑛 )𝑘∈ℕ ∈ 𝓁2 .
Then |𝑒𝑛 |2 = 1 and ⟨𝑓 , 𝑒𝑛 ⟩ = 𝑓𝑛 for all 𝑓 = (𝑓𝑘 )𝑘∈ℕ ∈ 𝓁2′ = 𝓁2 . Hence for all 𝑓 ∈ 𝓁2′

⟨𝑓 , 𝑒𝑛 ⟩ = 𝑓𝑛 → 0

as 𝑛 → ∞, so 𝑒𝑛 ⇀ 0 weakly. However, (𝑒𝑛 ) does not converge with respect to the


norm.
(c) The weak limit of a sequence is unique. To see this note that if ⟨𝑓 , 𝑥⟩ = 0 for
all 𝑓 ∈ 𝐸 ′ , then 𝑥 = 0 since otherwise the Hahn-Banach Theorem (Corollary 26.5)
implies the existence of 𝑓 ∈ 𝐸 ′ with ⟨𝑓 , 𝑥⟩ ≠ 0.
From the example in Remark 28.2 we see that in general ‖𝑥𝑛 ‖ ̸→ ‖𝑥‖ if 𝑥𝑛 ⇀ 𝑥 weakly.
However, we can still get an upper bound for ‖𝑥‖𝐸 using the uniform boundedness
principle.

28.3 Proposition Suppose that 𝐸 is a Banach space and that 𝑥𝑛 ⇀ 𝑥 weakly in 𝐸.


Then (𝑥𝑛 ) is bounded and
‖𝑥‖𝐸 ≤ lim inf ‖𝑥𝑛 ‖𝐸 .
𝑛→∞

94
Proof. We consider 𝑥𝑛 as an element of 𝐸 ′′ . By assumption ⟨𝑓 , 𝑥𝑛 ⟩ → ⟨𝑓 , 𝑥⟩ for all
𝑓 ∈ 𝐸 ′ . Hence by the uniform boundedness principle (Corollary 22.3) and Proposi-
tion 27.2
‖𝑥‖𝐸 = ‖𝑥‖𝐸 ′′ ≤ lim inf ‖𝑥𝑛 ‖𝐸 ′′ = lim inf ‖𝑥𝑛 ‖𝐸 < ∞.
𝑛→∞ 𝑛→∞

as claimed.

29 Dual Operators
For every linear operator 𝑇 from a normed space 𝐸 into a normed space 𝐹 we can
define an operator from the dual 𝐹 ′ into 𝐸 ′ by composing a linear functional on 𝑓 with
𝑇 to get a linear functional on 𝐸.

29.1 Definition Suppose that 𝐸 and 𝐹 are Banach spaces and that 𝑇 ∈ (𝐸, 𝐹 ). We
define the dual operator 𝑇 ′ ∶ 𝐹 ′ → 𝐸 ′ by

𝑇 ′ 𝑓 ∶= 𝑓 ◦𝑇

for all 𝑓 ∈ 𝐹 ′ .
Note that for 𝑓 ∈ 𝐹 ′ and 𝑥 ∈ 𝐸 by definition

⟨𝑇 ′ 𝑓 , 𝑥⟩ = ⟨𝑓 , 𝑇 𝑥⟩.

The situation is also given in the following diagram:

𝑇
𝐸 𝐹

𝑓
𝑇 ′ 𝑓 = 𝑓 ◦𝑇
𝕂

29.2 Remark We can also look at the second dual of an operator 𝑇 ∈ (𝐸, 𝐹 ). Then
𝑇 ′′ = (𝑇 ′ )′ ∶ 𝐸 ′′ → 𝐹 ′′ . By definition of 𝑇 ′′ and the canonical embedding of 𝐸 ⊆ 𝐸 ′′
we always haven 𝑇 ′′ |𝐸 = 𝑇 . If 𝐸 and 𝐹 are reflexive, then 𝑇 ′′ = 𝑇 .
We next show that an operator and its dual have the same norm.

29.3 Theorem Let 𝐸 and 𝐹 be normed spaces and 𝑇 ∈ (𝐸, 𝐹 ). Then 𝑇 ′ ∈ (𝐹 ′ , 𝐸 ′ )
and ‖𝑇 ‖(𝐸,𝐹 ) = ‖𝑇 ′ ‖(𝐹 ′ ,𝐸 ′ ) .
Proof. By definition of the dual norm we have

‖𝑇 ′ 𝑓 ‖𝐸 ′ = sup |⟨𝑇 ′ 𝑓 , 𝑥⟩| = sup |⟨𝑓 , 𝑇 𝑥⟩|


‖𝑥‖𝐸 ≤1 ‖𝑥‖𝐸 ≤1

≤ sup ‖𝑓 ‖𝐹 ′ ‖𝑇 𝑥‖𝐹 = ‖𝑇 ‖(𝐸,𝐹 ) ‖𝑓 ‖𝐹 ′ ,


‖𝑥‖𝐸 ≤1

95
so ‖𝑇 ′ ‖(𝐹 ′ ,𝐸 ′ ) ≤ ‖𝑇 ‖(𝐸,𝐹 ) . For the opposite inequality we use the Hahn-Banach The-
orem (Corollary 26.5) which guarantees that for every 𝑥 ∈ 𝐸 there exists 𝑓 ∈ 𝐹 ′ such
that ‖𝑓 ‖𝐹 ′ = 1 and ⟨𝑓 , 𝑇 𝑥⟩ = ‖𝑇 𝑥‖𝐹 . Hence, for that choice of 𝑓 we have

‖𝑇 𝑥‖𝐹 = ⟨𝑓 , 𝑇 𝑥⟩ = ⟨𝑇 ′ 𝑓 , 𝑥⟩ ≤ ‖𝑇 ′ 𝑓 ‖𝐸 ′ ‖𝑥‖𝐸
≤ ‖𝑇 ′ ‖(𝐹 ′ ,𝐸 ′ ) ‖𝑓 ‖𝐹 ′ ‖𝑥‖𝐸 = ‖𝑇 ′ ‖(𝐹 ′ ,𝐸 ′ ) ‖𝑥‖𝐸 .
Hence ‖𝑇 ′ ‖(𝐹 ′ ,𝐸 ′ ) ≥ ‖𝑇 ‖(𝐸,𝐹 ) and the assertion of the theorem follows.

We can use the dual operator to obtain properties of the original linear operator.

29.4 Proposition Let 𝐸 and 𝐹 be normed spaces and 𝑇 ∈ (𝐸, 𝐹 ). Then the image
of 𝑇 is dense in 𝐹 if and only if ker 𝑇 ′ = {0}.
Proof. Suppose that im(𝑇 ) = 𝐹 and that 𝑓 ∈ ker 𝑇 ′ . Then
0 = ⟨𝑇 ′ 𝑓 , 𝑥⟩ = ⟨𝑓 , 𝑇 𝑥⟩
for all 𝑥 ∈ 𝐸. Since im(𝑇 ) is dense in 𝐹 we conclude that 𝑓 = 0, so ker 𝑇 ′ = {0}. We
prove the converse by contrapositive, assuming that 𝑀 ∶= im(𝑇 ) is a proper subspace
of 𝐹 . By the Hahn-Banach Theorem (Corollary 26.8) there exists 𝑓 ∈ 𝐹 ′ with ‖𝑓 ‖𝐹 ′ =
1 and ⟨𝑓 , 𝑦⟩ = 0 for all 𝑦 ∈ 𝑀. In particular
⟨𝑇 ′ 𝑓 , 𝑥⟩ = ⟨𝑓 , 𝑇 𝑥⟩ = 0
for all 𝑥 ∈ 𝐸. Hence 𝑇 ′ 𝑓 = 0 but 𝑓 ≠ 0 and so ker 𝑇 ′ ≠ {0} as claimed.

We apply the above to show that the dual of an isomorphism is also an isomorphism.

29.5 Proposition Let 𝐸 and 𝐹 be normed spaces and 𝑇 ∈ (𝐸, 𝐹 ) is an isomorphism


with 𝑇 −1 ∈ (𝐹 , 𝐸). Then 𝑇 ′ ∈ (𝐹 ′ , 𝐸 ′ ) is an isomorphism and (𝑇 −1 )′ = (𝑇 ′ )−1 .
Proof. By Proposition 29.4 the dual 𝑇 ′ is injective since im(𝑇 ) = 𝐹 . For given 𝑔 ∈ 𝐸 ′
we set 𝑓 ∶= (𝑇 −1 )′ 𝑔. Then
⟨𝑇 ′ 𝑓 , 𝑥⟩ = ⟨𝑓 , 𝑇 𝑥⟩ = ⟨(𝑇 −1 )′ 𝑔, 𝑇 𝑥⟩ = ⟨𝑔, 𝑇 −1 𝑇 𝑥⟩ = ⟨𝑔, 𝑥⟩
for all 𝑥 ∈ 𝐸. Hence 𝑇 ′ is surjective and (𝑇 ′ )−1 = (𝑇 −1 )′ as claimed.

We can apply the above to “embeddings” of spaces.

29.6 Definition (continuous embedding) Suppose that 𝐸 and 𝐹 are normed spaces
and that 𝐸 ⊆ 𝐹 . If the natural injection 𝑖 ∶ 𝐸 → 𝐹 , 𝑥 → 𝑖(𝑥) ∶= 𝑥 is continuous,
then we write 𝐸 → 𝐹 . We say 𝐸 is continuously embedded into 𝐹 . If in addition 𝐸 is
𝑑
dense in 𝐹 , then we write 𝐸 → 𝐹 and say 𝐸 is densely embedded into 𝐹 .
If 𝐸 → 𝐹 and 𝑖 ∈ (𝐸, 𝐹 ) is the natural injection, then the dual is given by
⟨𝑖′ (𝑓 ), 𝑥⟩ = ⟨𝑓 , 𝑖(𝑥)⟩
for all 𝑓 ∈ 𝐹 ′ and 𝑥 ∈ 𝐸. Hence 𝑖′ (𝑓 ) = 𝑓 |𝐸 is the restriction of 𝑓 to 𝐸. Hence 𝐹 ′
𝑑
in some sense is a subset of 𝐸 ′ . However, 𝑖′ is an injection if and only if 𝐸 → 𝐹 by
Proposition 29.4. Hence we can prove the following proposition.

96
𝑑
29.7 Proposition Suppose that 𝐸 → 𝐹 . Then 𝐹 ′ → 𝐸 ′ . If 𝐸 and 𝐹 are reflexive,
𝑑
then 𝐹 ′ → 𝐸 ′ .
Proof. The first statement follows from the comments before. If 𝐸 and 𝐹 are reflexive,
then 𝑖′′ = 𝑖 by Remark 29.2. Hence (𝑖′ )′ = 𝑖 is injective, so by Proposition 29.4 𝑖′ has
𝑑
dense range, that is, 𝐹 ′ → 𝐸 ′ .

30 Duality in Hilbert Spaces


Suppose now that 𝐻 is a Hilbert space. By the Cauchy-Schwarz inequality

|(𝑥 ∣ 𝑦)| ≤ ‖𝑥‖‖𝑦‖

for all 𝑥, 𝑦 ∈ 𝐻. As a consequence

𝐽 (𝑦) ∶= (⋅ ∣ 𝑦) ∈ 𝐻 ′

is naturally in the dual space of 𝐻 and ‖𝐽 (𝑦)‖𝐻 ′ ≤ ‖𝑦‖ for every 𝑦 ∈ 𝐻.

30.1 Proposition The map 𝐽 ∶ 𝐻 → 𝐻 ′ defined above has the following properties:

(i) 𝐽 (𝜆𝑥+𝜇𝑦) = 𝜆𝐽 ̄ (𝑦) for all 𝑥, 𝑦 ∈ 𝐻 and 𝜆, 𝜇 ∈ 𝕂. We say 𝐽 is conjugate


̄ (𝑥)+ 𝜇𝐽
linear.

(ii) ‖𝐽 (𝑦)‖𝐻 ′ = ‖𝑦‖𝐻 for all 𝑦 ∈ 𝐻.

(iii) 𝐽 ∶ 𝐻 → 𝐻 ′ is continuous.

Proof. (i) is obvious from the definition of 𝐽 and the properties of an inner product.
(ii) follows from Corollary 15.6 and the definition of the dual norm. Finally (iii) follows
from (ii) since ‖𝐽 (𝑥) − 𝐽 (𝑦)‖𝐻 ′ = ‖𝐽 (𝑥 − 𝑦)‖𝐻 ′ = ‖𝑥 − 𝑦‖𝐻 .

We now show that 𝐽 is bijective, that is, 𝐻 = 𝐻 ′ if we identify 𝑦 ∈ 𝐻 with 𝐽 (𝑦) ∈ 𝐻 ′ .

30.2 Theorem (Riesz representation theorem) The map 𝐽 ∶ 𝐻 → 𝐻 ′ is a conjugate


linear isometric isomorphism. It is called the Riesz isomorphism.
Proof. By the above proposition, 𝐽 is a conjugate linear isometry. To show that 𝐽
is surjective let 𝑓 ∈ 𝐻 ′ with 𝑓 ≠ 0. By continuity of 𝑓 the kernel ker 𝑓 is a proper
closed subspace of 𝐻, and so by Theorem 16.9 we have

𝐻 = ker 𝑓 ⊕ (ker 𝑓 )⟂

and there exists 𝑧 ∈ (ker 𝑓 )⟂ with ‖𝑧‖ = 1. Clearly

−⟨𝑓 , 𝑥⟩𝑧 + ⟨𝑓 , 𝑧⟩𝑥 ∈ ker 𝑓

97
for all 𝑥 ∈ 𝐻. Since 𝑧 ∈ (ker 𝑓 )⟂ we get
( | )
0 = −⟨𝑓 , 𝑥⟩𝑧 + ⟨𝑓 , 𝑧⟩𝑥|𝑧 = −⟨𝑓 , 𝑥⟩(𝑧 ∣ 𝑧) + ⟨𝑓 , 𝑧⟩(𝑥 ∣ 𝑧) = −⟨𝑓 , 𝑥⟩ + ⟨𝑓 , 𝑧⟩(𝑥 ∣ 𝑧)
|
for all 𝑥 ∈ 𝐻. Hence
( | )
⟨𝑓 , 𝑥⟩ = ⟨𝑓 , 𝑧⟩(𝑥 ∣ 𝑧) = 𝑥|⟨𝑓 , 𝑧⟩𝑧
|
for all 𝑥 ∈ 𝐻, so 𝐽 (𝑦) = 𝑓 if we set 𝑦 = ⟨𝑓 , 𝑧⟩𝑧.
We use the above to show that Hilbert spaces are reflexive.
30.3 Corollary (reflexivity of Hilbert spaces) If 𝐻 is a Hilbert space, then 𝐻 ′ is a
Hilbert space with inner product (𝑓 ∣ 𝑔)𝐻 ′ ∶= (𝐽 −1 (𝑔) ∣ 𝐽 −1 (𝑓 ))𝐻 . The latter inner
product induces the the dual norm. Moreover, 𝐻 is reflexive.
Proof. By the properties of 𝐽 , (𝑓 ∣ 𝑔)𝐻 ′ ∶= (𝐽 −1 (𝑔) ∣ 𝐽 −1 (𝑓 ))𝐻 clearly defines an
inner product on 𝐻 ′ . Moreover, again by the properties of 𝐽
(𝑓 ∣ 𝑓 )𝐻 ′ = (𝐽 −1 (𝑓 ) ∣ 𝐽 −1 (𝑓 ))𝐻 = ‖𝑓 ‖2𝐻 ′ ,
so 𝐻 ′ is a Hilbert space. To prove that 𝐻 is reflexive let 𝑥 ∈ 𝐻 ′′ and denote the Riesz
isomorphisms on 𝐻 and 𝐻 ′ by 𝐽𝐻 and 𝐽𝐻 ′ , respectively. Then by the definition of the
inner product on 𝐻 ′ and 𝐽𝐻
⟨𝑓 , 𝑥⟩𝐻 = ⟨𝑥, 𝑓 ⟩𝐻 ′ = (𝑓 ∣ 𝐽𝐻−1′ (𝑥))𝐻 ′ = (𝐽𝐻−1 𝐽𝐻−1′ (𝑥) ∣ 𝐽𝐻−1 (𝑓 ))𝐻 = ⟨𝑓 , 𝐽𝐻−1 𝐽𝐻−1′ (𝑥)⟩
for all 𝑓 ∈ 𝐻 ′ . Hence by definition of reflexivity 𝐻 is reflexive.
Let finally 𝐻1 and 𝐻2 be Hilbert spaces and 𝑇 ∈ (𝐻1 , 𝐻2 ). Moreover, let 𝐽1 and 𝐽2
be the Riesz isomorphisms on 𝐻1 and 𝐻2 , respectively. If 𝑇 ′ is the dual operator of 𝑇 ,
then we set
𝑇 ∗ = 𝐽1−1 ◦𝑇 ′ ◦𝐽2 ∈ (𝐻2 , 𝐻1 ).
The situation is depicted in the commutative diagram below:
𝑇′
𝐻2′ 𝐻1′

𝐽2 𝐽1
𝑇∗
𝐻2 𝐻1

30.4 Definition (adjoint operators) The operator 𝑇 ∗ ∈ (𝐻2 , 𝐻1 ) as defined above is


called the adjoint operator of 𝑇 . If 𝐻1 = 𝐻2 and 𝑇 = 𝑇 ∗ , then 𝑇 is called self-adjoint,
and if 𝑇 𝑇 ∗ = 𝑇 ∗ 𝑇 , then 𝑇 is called normal.
30.5 Remark If 𝑇 ∈ (𝐻1 , 𝐻2 ) and 𝑇 ∗ its adjoint, then by definition
(𝑇 𝑥 ∣ 𝑦)𝐻2 = (𝑥 ∣ 𝑇 ∗ 𝑦)𝐻1
for all 𝑥 ∈ 𝐻1 and 𝑦 ∈ 𝐻2 . To see this we use the definition of 𝑇 ∗ :
(𝑇 𝑥 ∣ 𝑦) = (𝐽2 𝑦 ∣ 𝐽2 𝑇 𝑥) = (𝐽2 𝑦 ∣ 𝑇 ′ 𝐽1 𝑥) = (𝑦 ∣ 𝐽1−1 𝑇 ′ 𝐽1 𝑥) = (𝑦 ∣ 𝑇 ∗ 𝑥)
for all 𝑥 ∈ 𝐻1 and 𝑦 ∈ 𝐻2 .

98
31 The Lax-Milgram Theorem
In the previous section we looked at the representation of linear functionals on a Hilbert
space by means of the scalar product. Now we look at sesqui-linear forms.

31.1 Definition (sesqui-linear form) Let 𝐻 be a Hilbert space over 𝕂. A map 𝑏 ∶ 𝐻 ×


𝐻 → 𝕂 is called a sesqui-linear form on 𝐻 if
(i) 𝑢 → 𝑏(𝑢, 𝑣) is linear for all 𝑣 ∈ 𝐻;
(ii) 𝑣 → 𝑏(𝑢, 𝑣) is conjugate linear for all 𝑢 ∈ 𝐻.
The sesqui-linear form 𝑏(⋅ , ⋅) is called a Hermitian form if
(iii) 𝑏(𝑢, 𝑣) = 𝑏(𝑣, 𝑢) for all 𝑢, 𝑣 ∈ 𝐻.
Moreover, the sesqui-linear form 𝑏(⋅ , ⋅) is called bounded if there exists 𝑀 > 0 such
that
|𝑏(𝑢, 𝑣)| ≤ 𝑀‖𝑢‖𝐻 ‖𝑣‖𝐻 (31.1)
for all 𝑢, 𝑣 ∈ 𝐻 and coercive if there exists 𝛼 > 0 such that
𝛼‖𝑢‖2𝐻 ≤ Re 𝑏(𝑢, 𝑢) (31.2)
for all 𝑢 ∈ 𝐻.
If 𝐴 ∈ (𝐻), then
𝑏(𝑢, 𝑣) ∶= (𝑢 ∣ 𝐴𝑣) (31.3)
defines a bounded sesqui-linear form on 𝐻. We next show that the converse is also
true, that is, for every bounded sesqui-linear form there exists a linear operator such
that (31.3) holds for all 𝑢, 𝑣 ∈ 𝐻.

31.2 Proposition For every bounded sesqui-linear form 𝑏(⋅ , ⋅) there exists a unique
linear operator 𝐴 ∈ (𝐻) such that (31.3) holds for all 𝑢, 𝑣 ∈ 𝐻. Moreover, ‖𝐴‖(𝐻) ≤
𝑀, where 𝑀 is the constant in (31.1).
Proof. By (31.1) it follows that for every fixed 𝑣 ∈ 𝐻 the map 𝑢 → 𝑏(𝑢, 𝑣) is a bounded
linear functional on 𝐻. The Riesz representation theorem implies that there exists a
unique element 𝐴𝑣 in 𝐻 such that (31.3) holds. The map 𝑣 → 𝐴𝑣 is linear because
̄
(𝑤 ∣ 𝐴(𝜆𝑢 + 𝜇𝑣)) = 𝑏(𝑤, 𝜆𝑢 + 𝜇𝑣) = 𝜆𝑏(𝑤, 𝑢) + 𝜇𝑏(𝑤,
̄ 𝑣)
̄ ∣ 𝐴𝑢) + 𝜇(𝑤
= 𝜆(𝑤 ̄ ∣ 𝐴𝑣) = (𝑤 ∣ 𝜆𝐴𝑢 + 𝜇𝐴𝑣)
for all 𝑢, 𝑣, 𝑤 ∈ 𝐻 and 𝜆, 𝜇 ∈ 𝕂, and so by definition of 𝐴 we get 𝐴(𝜆𝑢 + 𝜇𝑣) =
𝜆𝐴𝑢 + 𝜇𝐴𝑣. To show that 𝐴 is bounded we note that by Corollary 15.6 and (31.1) we
have
‖𝐴𝑣‖𝐻 = sup |(𝑢 ∣ 𝐴𝑣)| = sup |𝑏(𝑢, 𝑣)| ≤ sup 𝑀‖𝑢‖𝐻 ‖𝑣‖𝐻 = 𝑀‖𝑣‖𝐻
‖𝑢‖𝐻 =1 ‖𝑢‖𝐻 =1 ‖𝑢‖𝐻 =1

for all 𝑣 ∈ 𝐻. Hence 𝐴 ∈ (𝐻) and ‖𝐴‖(𝐻) ≤ 𝑀 as claimed.

We next show that if the form is coercive, then the operator constructed above is in-
vertible.

99
31.3 Theorem (Lax-Milgram theorem) Suppose 𝑏(⋅ , ⋅) is a bounded coercive sesqui-
linear form and 𝐴 ∈ (𝐻) the corresponding operator constructed above. Then 𝐴 is
invertible and 𝐴−1 ∈ (𝐻) with ‖𝐴−1 ‖(𝐻) ≤ 𝛼 −1 , where 𝛼 > 0 is the constant from
(31.2).
Proof. By the coercivity of 𝑏(⋅ , ⋅) we have

𝛼‖𝑢‖2𝐻 ≤ Re 𝑏(𝑢, 𝑢) ≤ |𝑏(𝑢, 𝑢)| = |(𝑢 ∣ 𝐴𝑢)| ≤ ‖𝑢‖𝐻 ‖𝐴𝑢‖𝐻

for all 𝑢 ∈ 𝐻. Therefore 𝛼‖𝑢‖𝐻 ≤ ‖𝐴𝑢‖𝐻 for all 𝑢 ∈ 𝐻 and so 𝐴 is injective,


𝐴−1 ∈ (im(𝐴), 𝐻) with ‖𝐴−1 ‖(𝐻) ≤ 𝛼 −1 . We next show that im(𝐴) is dense in 𝐻 by
proving that its orthogonal complement is trivial. If 𝑢 ∈ (im(𝐴))⟂ , then in particular

0 = (𝑢 ∣ 𝐴𝑢) = 𝑏(𝑢, 𝑢) ≥ 𝛼‖𝑢‖2𝐻

by definition of 𝐴 and the coercivity of 𝑏(⋅ , ⋅). Hence 𝑢 = 0 and so Corollary 16.11 im-
plies that im(𝐴) is dense in 𝐻. Since 𝐴 is in particular a closed operator with bounded
inverse, Theorem 23.4(iii) implies that im(𝐴) = 𝐻.

100
Chapter VII
Spectral Theory

Spectral theory is a generalisation of the theory of eigenvalues and eigenvectors of


matrices to operators on infinite dimensional spaces.

32 Resolvent and Spectrum


Consider an 𝑁 × 𝑁 matrix 𝐴. The set of 𝜆 ∈ ℂ for which (𝜆𝐼 − 𝐴) is not invertible is
called the set of eigenvalues of 𝐴. Due to the fact that dim(im(𝐴)) + dim(ker(𝐴)) = 𝑁
the matrix (𝜆𝐼 − 𝐴) is only invertible if it has a trivial kernel. It turns out that there are
at most 𝑁 such 𝜆 called the eigenvalues of 𝐴. We consider a similar set of 𝜆 for closed
operators on a Banach space 𝐸 over ℂ.

32.1 Definition (resolvent and spectrum) Let 𝐸 be a Banach space over ℂ and 𝐴 ∶ 𝐷(𝐴) ⊆
𝐸 → 𝐸 a closed operator. We call

(i) 𝜚(𝐴) ∶= {𝜆 ∈ ℂ ∣ (𝜆𝐼 − 𝐴) ∶ 𝐷(𝐴) → 𝐸 is bijective and (𝜆𝐼 − 𝐴)−1 ∈ (𝐸)}


the resolvent set of 𝐴;

(ii) 𝑅(𝜆, 𝐴) ∶= (𝜆𝐼 − 𝐴)−1 the resolvent of 𝐴 at 𝜆 ∈ 𝜚(𝐴);

(iii) 𝜎(𝐴) ∶= ℂ ⧵ 𝜚(𝐴) the spectrum of 𝐴.

32.2 Remarks (a) If 𝐴 ∶ 𝐷(𝐴) ⊆ 𝐸 → 𝐸 is closed, then (𝜆𝐼 − 𝐴) ∶ 𝐷(𝐴) ⊆ 𝐸 → 𝐸


is closed as well for all 𝜆 ∈ 𝕂. To see this assume that 𝑥𝑛 ∈ 𝐷(𝐴) with 𝑥𝑛 → 𝑥 and
(𝜆𝐼 − 𝐴)𝑥𝑛 → 𝑦 in 𝐸. Then 𝐴𝑥𝑛 = 𝜆𝑥𝑛 − (𝜆𝐼 − 𝐴)𝑥𝑛 → 𝜆𝑥 − 𝑦 in 𝐸. Since 𝐴 is closed
𝑥 ∈ 𝐷(𝐴) and 𝐴𝑥 = 𝜆𝑥 − 𝑦, that is, (𝜆𝐼 − 𝐴)𝑥 = 𝑦. Hence 𝜆𝐼 − 𝐴 is closed.
(b) The graph norms of (𝜆𝐼 − 𝐴) are equivalent for all 𝜆 ∈ 𝕂. This follows since
for every 𝑥 ∈ 𝐷(𝐴) and 𝜆 ∈ 𝕂

‖𝑥‖𝜆𝐼−𝐴 = ‖𝑥‖ + ‖(𝜆𝐼 − 𝐴)𝑥‖ ≤ (1 + |𝜆|)(‖𝑥‖ + ‖𝐴𝑥‖) = (1 + |𝜆|)‖𝑥‖𝐴

and

‖𝑥‖𝐴 = ‖𝑥‖ + ‖𝐴𝑥‖ = ‖𝑥‖ + ‖𝜆𝑥 − (𝜆𝐼 − 𝐴)𝑥‖


≤ (1 + |𝜆|)(‖𝑥‖ + ‖(𝜆𝑥 − 𝐴)𝑥‖) = (1 + |𝜆|)‖𝑥‖𝜆𝐼−𝐴 .

101
We next get some characterisations of the points in the resolvent set.

32.3 Proposition Let 𝐴 ∶ 𝐷(𝐴) ⊆ 𝐸 → 𝐸 be a closed operator on the Banach space


𝐸. Then the following assertions are equivalent.
(i) 𝜆 ∈ 𝜚(𝐴);

(ii) 𝜆𝐼 − 𝐴 ∶ 𝐷(𝐴) → 𝐸 is bijective;

(iii) 𝜆𝐼 − 𝐴 is injective, im(𝜆𝐼 − 𝐴) = 𝐸 and (𝜆𝐼 − 𝐴)−1 ∈ (im(𝜆𝐼 − 𝐴), 𝐸).


Proof. By definition of the resolvent set (i) implies (ii). Since 𝜆𝐼 − 𝐴 is a closed
operator by the above remark, the equivalence of (i) to (iii) follows from Theorem 23.4.

We now want to derive properties of 𝑅(𝜆, 𝐴) = (𝜆𝐼 − 𝐴)−1 as a function of 𝜆. If 𝐴 is


not an operator but a complex number we can use the geometric series to find various
1
series expansions for = (𝜆 − 𝑎)−1 . To get a more precise formulation we need a
𝜆−𝑎
lemma.

32.4 Lemma Let 𝑇 ∈ (𝐸). Then

𝑟 ∶= lim ‖𝑇 𝑛 ‖1∕𝑛 (32.1)


𝑛→∞

exists and 𝑟 ≤ ‖𝑇 ‖.
Proof. First note that

‖𝑇 𝑛+𝑚 𝑥‖ = ‖𝑇 𝑛 𝑇 𝑚 𝑥‖ ≤ ‖𝑇 𝑛 ‖‖𝑇 𝑚 𝑥‖ ≤ ‖𝑇 𝑛 ‖‖𝑇 𝑚 ‖‖𝑥‖

for all 𝑥 ∈ 𝐸, so by definition of the operator norm

‖𝑇 𝑛+𝑚 ‖ ≤ ‖𝑇 𝑛 ‖‖𝑇 𝑚 ‖ (32.2)

for all 𝑛, 𝑚 ≥ 0. Fix now 𝑚 ∈ ℕ, 𝑚 > 1. If 𝑛 > 𝑚, then there exist 𝑞𝑛 , 𝑟𝑛 ∈ ℕ with
𝑛 = 𝑚𝑞𝑛 + 𝑟𝑛 and 0 ≤ 𝑟𝑛 < 𝑚. Hence
𝑞𝑛 ( 𝑟 )
1 1
= 1+ 𝑛 →
𝑛 𝑚 𝑛 𝑚
as 𝑛 → ∞. From (32.2) we conclude that ‖𝑇 𝑛 ‖1∕𝑛 ≤ ‖𝑇 𝑚 ‖𝑞𝑛 ∕𝑛 ‖𝑇 ‖𝑟𝑛 ∕𝑛 for all 𝑛 > 𝑚.
Hence
lim inf ‖𝑇 𝑛 ‖1∕𝑛 ≤ ‖𝑇 𝑚 ‖1∕𝑚
𝑛→∞

for all 𝑚 ∈ ℕ and so

lim inf ‖𝑇 𝑛 ‖1∕𝑛 ≤ lim sup ‖𝑇 𝑛 ‖1∕𝑛 ≤ ‖𝑇 ‖.


𝑛→∞ 𝑛→∞

Hence the limit (32.1) exists, and our proof is complete.

We now look at an analogue of the “geometric series” for bounded linear operators.

102
32.5 Proposition (Neumann series) Let 𝑇 ∈ (𝐸) and set

𝑟 ∶= lim ‖𝑇 𝑘 ‖1∕𝑘 .
𝑘→∞

Then the following assertions are true.


∑∞
(i) 𝑘=0 𝑇 𝑘 converges absolutely in (𝐸) if 𝑟 < 1 and diverges if 𝑟 > 1.
∑∞ ∑∞
(ii) If 𝑘=0 𝑇 𝑘 converges, then 𝑘=0 𝑇 𝑘 = (𝐼 − 𝑇 )−1 .

Proof. Part (i) is a consequence of the root test for the absolute convergence of a series.
For part (ii) we use a similar argument as in case of the geometric series. Clearly


𝑛

𝑛

𝑛

𝑛+1
𝑘 𝑘 𝑘
(𝐼 − 𝑇 ) 𝑇 = 𝑇 (𝐼 − 𝑇 ) = 𝑇 − 𝑇 𝑘 = 𝐼 − 𝑇 𝑛+1
𝑘=0 𝑘=0 𝑘=0 𝑘=0
∑𝑛
for all 𝑛 ∈ ℕ. Since the series 𝑘=0
𝑇 𝑘 converges, in particular 𝑇 𝑛 → 0 in (𝐸) and
therefore




(𝐼 − 𝑇 ) 𝑇 𝑘𝑥 = 𝑇 𝑘 (𝐼 − 𝑇 )𝑥 = 𝑥
𝑘=0 𝑘=0

for all 𝑥 ∈ 𝐸. Hence (ii) follows.

We next use the above to show that the resolvent is analytic on 𝜚(𝐴). The series expan-
sions we find are very similar to corresponding expansions of (𝜆 − 𝑎)−1 based on the
geometric series.

32.6 Definition (analytic function) Let 𝐸 be a Banach space over ℂ and 𝑈 ⊆ ℂ an


open set. A map 𝑓 ∶ 𝑈 → 𝐸 is called analytic (or holomorphic) if for every 𝜆0 ∈ 𝑈
there exists 𝑟 > 0 and 𝑎𝑘 ∈ 𝐸 such that



𝑓 (𝜆) = 𝑎𝑘 (𝜆 − 𝜆0 )𝑘
𝑘=0

for all 𝜆 ∈ 𝑈 with |𝜆 − 𝜆0 | < 𝑟.

32.7 Theorem Let 𝐸 be a Banach space over ℂ and 𝐴 ∶ 𝐷(𝐴) ⊆ 𝐸 → 𝐸 a closed


operator.

(i) The resolvent set 𝜚(𝐴) is open in ℂ and the resolvent 𝑅(⋅ , 𝐴) ∶ 𝜚(𝐴) → (𝐸) is
analytic;
1
(ii) For all 𝜆0 ∈ 𝜚(𝐴) we have ‖𝑅(𝜆0 , 𝐴)‖(𝐸) ≥ .
dist(𝜆0 , 𝜎(𝐴))
(iii) For all 𝜆, 𝜇 ∈ 𝜚(𝐴)

𝑅(𝜆, 𝐴) − 𝑅(𝜇, 𝐴) = (𝜇 − 𝜆)𝑅(𝜆, 𝐴)𝑅(𝜇, 𝐴)

(resolvent equation);

103
(iv) For all 𝜆, 𝜇 ∈ 𝜚(𝐴)

𝑅(𝜆, 𝐴)𝑅(𝜇, 𝐴) = 𝑅(𝜇, 𝐴)𝑅(𝜆, 𝐴)

and
𝐴𝑅(𝜆, 𝐴)𝑥 = 𝑅(𝜆, 𝐴)𝐴𝑥 = (𝜆𝑅(𝜆, 𝐴) − 𝐼)𝑥
for all 𝑥 ∈ 𝐷(𝐴).
Proof. (i) If 𝜚(𝐴) = ∅, then 𝜚(𝐴) is open. Hence suppose that 𝜚(𝐴) ≠ ∅ and fix
𝜆0 ∈ 𝜚(𝐴). Then
𝜆𝐼 − 𝐴 = (𝜆 − 𝜆0 ) + (𝜆0 𝐼 − 𝐴)
for all 𝜆 ∈ ℂ. Hence
( )
(𝜆𝐼 − 𝐴) = (𝜆0 𝐼 − 𝐴) 𝐼 + (𝜆 − 𝜆0 )𝑅(𝜆0 , 𝐴) (32.3)

for all 𝜆 ∈ ℂ. Now fix 𝑟 ∈ (0, 1) and suppose that


𝑟
|𝜆 − 𝜆0 | ≤ .
‖𝑅(𝜆0 , 𝐴)‖(𝐸)

Then |𝜆 − 𝜆0 |‖𝑅(𝜆0 , 𝐴)‖(𝐸) < 1 and by Proposition 32.5 𝐼 + (𝜆 − 𝜆0 )𝑅(𝜆0 , 𝐴) is


invertible and
( )−1 ∑∞
𝐼 + (𝜆 − 𝜆0 )𝑅(𝜆0 , 𝐴) = (−1)𝑘 𝑅(𝜆0 , 𝐴)𝑘 (𝜆 − 𝜆0 )𝑘 .
𝑘=0

Hence by (32.3) the operator (𝜆𝐼 − 𝐴) ∶ 𝐷(𝐴) → 𝐸 has a bounded inverse and
( )−1
𝑅(𝜆, 𝐴) = (𝜆𝐼 − 𝐴)−1 = 𝐼 + (𝜆 − 𝜆0 )𝑅(𝜆0 , 𝐴) 𝑅(𝜆0 , 𝐴)


= (−1)𝑘 𝑅(𝜆0 , 𝐴)𝑘+1 (𝜆 − 𝜆0 )𝑘
𝑘=0

for all 𝜆 ∈ ℂ with


1
|𝜆 − 𝜆0 | < .
‖𝑅(𝜆0 , 𝐴)‖(𝐸)
Therefore 𝜚(𝐴) is open and the map 𝜆 → 𝑅(𝜆, 𝐴) analytic on 𝜚(𝐴).
(ii) From the proof of (i) we have
1
|𝜆 − 𝜆0 | ≥ .
‖𝑅(𝜆0 , 𝐴)‖(𝐸)

for all 𝜆 ∈ 𝜎(𝐴). Hence


1
dist(𝜆0 , 𝜎(𝐴)) = inf |𝜆 − 𝜆0 | ≥ .
𝜆∈𝜎(𝐴) ‖𝑅(𝜆0 , 𝐴)‖(𝐸)

and so (ii) follows.


(iii) If 𝜆, 𝜇 ∈ 𝜚(𝐴), then

(𝜇𝐼 − 𝐴) = (𝜆𝐼 − 𝐴) + (𝜇 − 𝜆)𝐼.

104
If we multiply by 𝑅(𝜆, 𝐴) from the left and by 𝑅(𝜇, 𝐴) from the right we get

𝑅(𝜆, 𝐴) = 𝑅(𝜇, 𝐴) + (𝜇 − 𝜆)𝑅(𝜆, 𝐴)𝑅(𝜇, 𝐴).

from which our claim follows.


(iv) The first identity follows from (iii) since we can interchange the roles of 𝜆 and
𝜇 to get the same result. The last identity follows since

𝐴 = 𝜆𝐼 − (𝜆𝐼 − 𝐴)

by applying 𝑅(𝜆, 𝐴) from the left and from the right.

32.8 Remarks (a) Because 𝜚(𝐴) is open, the spectrum 𝜎(𝐴) is always a closed subset
of ℂ.
(b) Part (ii) if the above theorem shows that 𝜚(𝐴) is the natural domain of the analytic
function given by the resolvent 𝑅(⋅ , 𝐴). The function is Banach space valued, but
nevertheless all theorems from complex analysis can be applied. In particular, one can
analyse the structure of 𝐴 near an isolated point of the spectrum by using Laurent series
expansions, see for instance [3].
Next we look at the spectrum of bounded operators.

32.9 Definition Let 𝐸 be a Banach space and 𝑇 ∈ (𝐸). We call

spr(𝑇 ) ∶= sup{|𝜆| ∶ 𝜆 ∈ 𝜎(𝑇 )}

the spectral radius of 𝑇 .

32.10 Theorem Suppose that 𝐸 is a Banach space and 𝑇 ∈ (𝐸). Then 𝜎(𝑇 ) is non-
empty and bounded. Moreover, spr(𝑇 ) = lim𝑛→∞ ‖𝑇 𝑛 ‖1∕𝑛 ≤ ‖𝑇 ‖.
Proof. Using Proposition 32.5 we see that
( )
1 −1 ∑ 1 𝑘

1
𝑅(𝜆, 𝑇 ) = (𝜆𝐼 − 𝑇 )−1 = 𝐼− 𝑇 = 𝑘+1
𝑇 (32.4)
𝜆 𝜆 𝑘=0
𝜆

if |𝜆| > 𝑟 ∶= lim𝑛→∞ ‖𝑇 𝑛 ‖1∕𝑛 . The above is the Laurent expansion of 𝑅(⋅ , 𝑇 ) about
zero for |𝜆| large. We know that such an expansion converges in the largest annulus
contained in the domain of the analytic function. Hence 𝑟 = spr(𝑇 ) as claimed. We
next prove that 𝜎(𝑇 ) is non-empty. We assume 𝜎(𝑇 ) is empty and derive a contradiction.
If 𝜎(𝑇 ) = ∅, then the function

𝑔 ∶ ℂ → ℂ, 𝜆 → ⟨𝑓 , 𝑅(𝜆, 𝑇 )𝑥⟩

is analytic on ℂ for every 𝑥 ∈ 𝐸 and 𝑓 ∈ 𝐸 ′ . By (32.4) we get

1 ∑ ‖𝑇 ‖𝑘

‖𝑓 ‖𝐸 ′ ‖𝑥‖𝐸
|𝑔(𝜆)| ≤ ‖𝑓 ‖ ′ ‖𝑥‖𝐸 =
|𝜆| 𝑘=0 2𝑘 ‖𝑇 ‖𝑘 𝐸
|𝜆|

105
for all |𝜆| ≥ 2‖𝑇 ‖. In particular, 𝑔 is bounded and therefore by Liouville’s theorem 𝑔
is constant. Again from the above 𝑔(𝜆) → 0 as |𝜆| → ∞, and so that constant must be
zero. Hence
⟨𝑓 , 𝑅(𝜆, 𝑇 )𝑥⟩ = 0
for all 𝑥 ∈ 𝐸 and 𝑓 ∈ 𝐸 ′ . Hence by the Hahn-Banach theorem (Corollary 26.5)
𝑅(𝜆, 𝑇 )𝑥 = 0 for all 𝑥 ∈ 𝐸, which is impossible since 𝑅(𝜆, 𝑇 ) is bijective. This proves
that 𝜎(𝑇 ) ≠ ∅.

Note that 𝜆𝐼 − 𝐴 may fail to have a continuous inverse defined on 𝐸 for several reasons.
We split the spectrum up accordingly.

32.11 Definition Let 𝐴 ∶ 𝐷(𝐴) ⊆ 𝐸 → 𝐸 be a closed operator. We call

• 𝜎𝑝 (𝐴) ∶= {𝜆 ∈ ℂ ∶ 𝜆𝐼 − 𝐴 not injective} the point spectrum of 𝐴;

• 𝜎𝑐 (𝐴) ∶= {𝜆 ∈ ℂ ∶ 𝜆𝐼 − 𝐴 injective, im(𝜆𝐼 − 𝐴) = 𝐸, (𝜆𝐼 − 𝐴)−1 unbounded}


the continuous spectrum of 𝐴;

• 𝜎𝑟 (𝐴) ∶= {𝜆 ∈ ℂ ∶ 𝜆𝐼 − 𝐴 injective, im(𝜆𝐼 − 𝐴) ≠ 𝐸} the residual spectrum


of 𝐴;

32.12 Remarks (a) By definition of 𝜎(𝐴) and Proposition 32.3

𝜎(𝐴) = 𝜎𝑝 (𝐴) ∪ 𝜎𝑐 (𝐴) ∪ 𝜎𝑟 (𝐴)

is clearly a disjoint union.


(b) The elements of 𝜎𝑝 (𝐴) are called the eigenvalues of 𝐴, and the non-zero elements
of ker(𝜆𝐼 − 𝐴) the corresponding eigenvectors.
(c) If dim 𝐸 = 𝑁 < ∞ the spectrum consists of eigenvalues only. This follows
since dim(im(𝜆𝐼 − 𝐴)) + dim(ker(𝜆𝐼 − 𝐴)) = 𝑁 and therefore 𝜆𝐼 − 𝐴 is not invertible
if and only if it has non-trivial kernel.

33 Projections, Complements and Reductions


In finite dimensions the eigenvalues are used to find basis under which the matrix asso-
ciated with a given linear operator becomes as simple as possible. In particular, to every
eigenvalue there is a maximal invariant subspace, the generalised eigenspace associated
with an eigenvalue of a square matrix. We would like to have something similar in in-
finite dimensions. In this section we develop the concepts to formulate such results.

33.1 Definition (projection) Let 𝐸 be a vector space and 𝑃 ∈ Hom(𝐸). We call 𝑃 a


projection if 𝑃 2 = 𝑃 .
We next show that there is a one-to-one correspondence between projections and pairs
of complemented subspaces.

106
33.2 Lemma (i) If 𝑃 ∈ Hom(𝐸) is a projection, then 𝐸 = im(𝑃 ) ⊕ ker(𝑃 ). Moreover,
𝐼 − 𝑃 is a projection and ker(𝑃 ) = im(𝐼 − 𝑃 ).
(ii) If 𝐸1 , 𝐸2 are subspaces of 𝐸 with 𝐸 = 𝐸1 ⊕ 𝐸2 , then there exists a unique
projection 𝑃 ∈ Hom(𝐸) with 𝐸1 = im(𝑃 ) and 𝐸2 = im(𝐼 − 𝑃 ).
Proof. (i) If 𝑥 ∈ 𝐸, then since 𝑃 2 = 𝑃

𝑃 (𝐼 − 𝑃 )𝑥 = 𝑃 𝑥 − 𝑃 2 𝑥 = 𝑃 𝑥 − 𝑃 𝑥 = 0,

so im(𝐼 − 𝑃 ) ⊂ ker(𝑃 ). On the other hand, if 𝑥 ∈ ker(𝑃 ), then (𝐼 − 𝑃 )𝑥 = 𝑥 − 𝑃 𝑥 = 𝑥,


so 𝑥 ∈ im(𝐼 − 𝑃 ), so im(𝐼 − 𝑃 ) = ker(𝑃 ). Next note that 𝑥 = 𝑃 𝑥 + (𝐼 − 𝑃 )𝑥, so
𝐸 = im(𝑃 ) + ker(𝑃 ) by what we just proved. If 𝑥 ∈ im(𝑃 ) ∩ ker(𝑃 ), then there exists
𝑦 ∈ 𝐸 with 𝑃 𝑦 = 𝑥. Since 𝑃 2 = 𝑃 and 𝑥 ∈ ker(𝑃 ) we have 𝑥 = 𝑃 𝑦 = 𝑃 2 𝑦 = 𝑃 𝑥 = 0.
Hence we have 𝐸 = im(𝑃 ) ⊕ ker(𝑃 ). Finally note that (𝐼 − 𝑃 )2 = 𝐼 − 2𝑃 + 𝑃 2 =
𝐼 − 2𝑃 + 𝑃 = 𝐼 − 𝑃 , so 𝐼 − 𝑃 is a projection as well.
(ii) Given 𝑥 ∈ 𝐸 there are unique 𝑥𝑖 ∈ 𝐸𝑖 (𝑖 = 1, 2) such that 𝑥 = 𝑥1 + 𝑥2 . By the
uniqueness of that decomposition the map 𝑃 𝑥 ∶= 𝑥1 is linear, and also 𝑃 2 𝑥 = 𝑃 𝑥 = 𝑥1
for all 𝑥 ∈ 𝐸. Hence 𝑃1 is a projection. Moreover, 𝑥2 = 𝑥 − 𝑥1 = 𝑥 − 𝑃 𝑥, so
𝐸2 im(𝐼 − 𝑃 ). Hence 𝑃 is the projection required.

We call 𝑃 the projection of 𝐸 onto 𝐸1 parallel to 𝐸2 . Geometrically we have a situation


as in Figure 33.1.

𝐸2
𝑥
𝑃)

𝑥
(𝐼

𝑃𝑥
𝐸1

Figure 33.1: Projections onto complementary subspaces

33.3 Proposition Let 𝐸1 , 𝐸2 be subspaces of the Banach space 𝐸 with 𝐸 = 𝐸1 ⊕ 𝐸2


and 𝑃 ∈ Hom(𝐸) the associated projection onto 𝐸1 . Then 𝑃 ∈ (𝐸) if and only if 𝐸1
and 𝐸2 are closed.
Proof. By Lemma 33.2 we have 𝐸1 = ker(𝐼 − 𝑃 ) and 𝐸2 = ker(𝑃 ). Hence if
𝑃 ∈ (𝐸), then 𝐸1 and 𝐸2 are closed. Assume now that 𝐸1 and 𝐸2 are closed. To prove
that 𝑃 is bounded we show 𝑃 has closed graph. Hence assume that (𝑥𝑛 ) is a sequence
in 𝐸 with 𝑥𝑛 → 𝑥 and 𝑃 𝑥𝑛 → 𝑦 in 𝐸. Note that 𝑃 𝑥𝑛 ∈ 𝐸1 and 𝑥𝑛 − 𝑃 𝑥𝑛 ∈ 𝐸2 for all
𝑛 ∈ ℕ. Since 𝐸1 and 𝐸2 are closed, 𝑃 𝑥𝑛 → 𝑦 and 𝑥𝑛 − 𝑃 𝑥𝑛 → 𝑦 − 𝑥 ∈ 𝐸 we conclude
that 𝑥 ∈ 𝐸1 and 𝑦 − 𝑥 ∈ 𝐸2 . Now 𝑥 = 𝑦 + (𝑥 − 𝑦), so 𝑃 𝑥 = 𝑦 because 𝐸 = 𝐸1 ⊕ 𝐸2 .
Therefore 𝑃 has closed graph, and the closed graph theorem (Theorem 21.1) implies
that 𝑃 ∈ (𝐸).

107
33.4 Definition (topological direct sum) Suppose that 𝐸1 , 𝐸2 are subspaces of the normed
space 𝐸. We call 𝐸 = 𝐸1 ⊕ 𝐸2 a topological direct sum if the corresponding projection
is bounded. In that case we say 𝐸2 is a topological complement of 𝐸1 . Finally, a closed
subspace of 𝐸 is called complemented if it has a topological complement.

33.5 Remarks (a) By Proposition 33.3 𝐸1 and 𝐸2 must be closed if 𝐸 = 𝐸1 ⊕ 𝐸2 is a


topological direct sum.
(b) By Theorem 16.9 every closed subspace of a Hilbert space is complemented. A
rather recent result [9] shows that if every closed subspace of a Banach space is com-
plemented, then its norm is equivalent to a norm induced by an inner product! Hence
in general Banach spaces we do not expect every closed subspace to be complemented.
(c) Using Zorn’s lemma 1.7 one can show that every subspace of a vector space has
an algebraic complement.
We show that certain subspaces are always complemented.

33.6 Lemma Suppose that 𝑀 is a closed subspace of the Banach space 𝐸 with finite
co-dimension, that is, dim(𝐸∕𝑀) < ∞. Then 𝑀 is complemented.
Proof. We know that 𝑀 has an algebraic complement 𝑁, that is, 𝐸 = 𝑀 ⊕ 𝑁. Let
𝑃 denote the projection onto 𝑁. Then we have the following commutative diagram.

𝑃
𝐸 𝑁

𝜋
𝑃̂
𝐸∕𝑀

Since 𝑀 is a closed subspace, by Theorem 12.3 𝐸∕𝑀 is a Banach space and 𝑃̂ is


an isomorphism. Since dim(𝐸∕𝑀) < ∞ the operator 𝑃̂ is continuous, and so it is
continuous. Therefore 𝑃 is continuous as well as a composition of two continuous
operators. Hence we can apply Proposition 33.3 to conclude the proof.

Suppose now that 𝐸 is a vector space and 𝐸1 and 𝐸2 are subspaces such that 𝐸 =
𝐸1 ⊕ 𝐸2 . Denote by 𝑃1 and 𝑃2 the corresponding projections. If 𝐴 ∈ Hom(𝐸), then

𝐴𝑥 = 𝑃1 𝐴𝑥 + 𝑃2 𝐴𝑥 = 𝑃1 𝐴(𝑃1 𝑥 + 𝑃2 𝑥) + 𝑃2 𝐴(𝑃1 𝑥 + 𝑃2 𝑥)
= 𝑃1 𝐴𝑃1 𝑥 + 𝑃1 𝐴𝑃2 𝑥 + 𝑃2 𝐴𝑃1 𝑥 + 𝑃2 𝐴𝑃2 𝑥

for all 𝑥 ∈ 𝐸. Setting


𝐴𝑖𝑗 ∶= 𝑃𝑖 𝐴𝑃𝑗 ,
we have 𝐴𝑖𝑗 ∈ Hom(𝐸𝑗 , 𝐸𝑖 ). If we identify 𝑥 ∈ 𝐸 with (𝑥1 , 𝑥2 ) ∶= (𝑃1 𝑥, 𝑃2 𝑥) ∈
𝐸1 × 𝐸2 , then we can write
[ ][ ]
𝐴11 𝐴12 𝑥1
𝐴𝑥 =
𝐴21 𝐴22 𝑥2

108
33.7 Definition (complete reduction) We say 𝐸 = 𝐸1 ⊕ 𝐸2 completely reduces 𝐴 if
𝐴12 = 𝐴21 = 0, that is, [ ]
𝐴11 0
𝐴∼
0 𝐴22
If that is the case we set 𝐴𝑖 ∶= 𝐴𝑖𝑖 and write 𝐴 = 𝐴1 ⊕ 𝐴2 .
We next characterise the reductions by means of the projections.

33.8 Proposition Suppose that 𝐸 = 𝐸1 ⊕𝐸2 and that 𝑃1 and 𝑃2 are the corresponding
projections. Let 𝐴 ∈ Hom(𝐸). Then 𝐸 = 𝐸1 ⊕ 𝐸2 completely reduces 𝐴 if and only if
𝑃1 𝐴 = 𝐴𝑃1 .
Proof. Suppose that 𝐸 = 𝐸1 ⊕ 𝐸2 completely reduces 𝐴. Then

𝑃1 𝐴𝑥 = 𝑃1 𝐴𝑃1 𝑥 + 𝑃1 𝐴𝑃2 𝑥 = 𝑃1 𝐴𝑃1 𝑥 = 𝑃1 𝐴𝑃1 𝑥 + 𝑃2 𝐴𝑃1 𝑥 = 𝐴𝑃1 𝑥.

To prove the converse Note that 𝑃1 𝑃2 = 𝑃2 𝑃1 = 0. Hence 𝐴12 = 𝑃1 𝐴𝑃2 = 𝐴𝑃1 𝑃2 = 0


and also 𝐴21 = 𝑃2 𝐴𝑃1 = 𝑃2 𝑃1 𝐴 = 0, so 𝐸 = 𝐸1 ⊕ 𝐸2 completely reduces 𝐴.

Of course we could replace interchange the roles of 𝑃1 and 𝑃2 in the above proposition.

33.9 Proposition Suppose that 𝐸 = 𝐸1 ⊕ 𝐸2 completely reduces 𝐴 ∈ Hom(𝐸). Let


𝐴 = 𝐴1 ⊕ 𝐴2 be this reduction. Then

(i) ker 𝐴 = ker 𝐴1 ⊕ ker 𝐴2 ;

(ii) im 𝐴 = im 𝐴1 ⊕ im 𝐴2 ;

(iii) 𝐴 is injective (surjective) if and only if 𝐴1 and 𝐴2 are injective (surjective);

(iv) if 𝐴 is bijective, then 𝐸 = 𝐸1 ⊕ 𝐸2 completely reduces 𝐴−1 and 𝐴−1 = 𝐴−1


1
⊕ 𝐴−1
2
.

Proof. (i) If 𝑥 = 𝑥1 + 𝑥2 with 𝑥1 ∈ 𝐸1 and 𝑥2 ∈ 𝐸2 , then 0 = 𝐴𝑥 = 𝐴1 𝑥1 + 𝐴2 𝑥2 if


and only if 𝐴1 𝑥1 = 0 and 𝐴2 𝑥2 = 0. Hence (i) follows.
(ii) If 𝑥 = 𝑥1 + 𝑥2 with 𝑥1 ∈ 𝐸1 and 𝑥2 ∈ 𝐸2 , then 𝑦 = 𝐴𝑥 = 𝐴1 𝑥1 + 𝐴2 𝑥2 = 𝑦1 + 𝑦2
if and only if 𝐴1 𝑥1 = 𝑦1 and 𝐴2 𝑥2 = 𝑦2 . Hence (ii) follows.
(iii) follows directly from (i) and (ii), and (iv) follows from the argument given in
the proof of (ii).

We finally look at resolvent and spectrum of the reductions. It turns out that the spectral
properties of 𝐴 can can be completely described by the operators 𝐴1 and 𝐴2 .

33.10 Proposition Suppose that 𝐸 is a Banach space and that 𝐸 = 𝐸1 ⊕ 𝐸2 a topo-


logical direct sum completely reducing 𝐴 ∈ (𝐸). Let 𝐴 = 𝐴1 ⊕ 𝐴2 be this reduction.
Then

(i) 𝐴𝑖 ∈ (𝐸𝑖 ) (𝑖 = 1, 2);

(ii) 𝜎(𝐴) = 𝜎(𝐴1 ) ∪ 𝜎(𝐴2 );

109
(iii) 𝜎𝑝 (𝐴) = 𝜎𝑝 (𝐴1 ) ∪ 𝜎𝑝 (𝐴2 ).
Proof. By Proposition 33.3 the projections 𝑃1 , 𝑃2 associated with 𝐸 = 𝐸1 ⊕ 𝐸2 are
bounded. Hence 𝐴𝑖 = 𝑃𝑖 𝐴𝑃𝑖 is bounded for 𝑖 = 1, 2. The equality in (ii) follows from
Proposition 33.9(iv) which shows that 𝜚(𝐴) = 𝜚(𝐴1 ) ∩ 𝜚(𝐴2 ). Finally (iii) follows since
by Proposition 33.9(i) shows that ker(𝜆𝐼 − 𝐴) = {0} if and only if ker(𝜆𝐼𝐸1 − 𝐴1 ) =
ker(𝜆𝐼𝐸2 − 𝐴2 ) = {0}.

34 The Ascent and Descent of an Operator


Let 𝐸 be a vector space over 𝕂 and 𝑇 ∈ Hom(𝐸). We then consider two nested se-
quences of subspaces:

{0} = ker 𝑇 0 ⊆ ker 𝑇 ⊆ ker 𝑇 2 ⊆ ker 𝑇 3 ⊆ …

and
𝐸 = im 𝑇 0 ⊇ im 𝑇 ⊇ im 𝑇 2 ⊇ im 𝑇 3 ⊇ … .
We are interested in 𝑛 such that there is equality.

34.1 Lemma If ker 𝑇 𝑛 = ker 𝑇 𝑛+1 , then ker 𝑇 𝑛 = ker 𝑇 𝑛+𝑘 for all 𝑘 ∈ ℕ. Moreover,
if im 𝑇 𝑛 = im 𝑇 𝑛+1 , then im 𝑇 𝑛 = im 𝑇 𝑛+𝑘 for all 𝑘 ∈ ℕ.
Proof. If 𝑘 ≥ 1 and 𝑥 ∈ ker 𝑇 𝑛+𝑘 , then 0 = 𝑇 𝑛+𝑘 𝑥 = 𝑇 𝑛+1 (𝑇 𝑘−1 𝑥) and so 𝑇 𝑘−1 𝑥 ∈
ker 𝑇 𝑛 . By assumption ker 𝑇 𝑛 = ker 𝑇 𝑛+1 and therefore 𝑇 𝑛+𝑘−1 𝑥 = 𝑇 𝑛 (𝑇 𝑘−1 𝑥) = 0.
Hence ker 𝑇 𝑛+𝑘 = ker 𝑇 𝑛+𝑘−1 for all 𝑘 ≥ 1 from which the first assertion follows.
The second assertion is proved similarly. If 𝑘 ≥ 1 and 𝑥 ∈ im 𝑇 𝑛+𝑘−1 , then there
exists 𝑦 ∈ 𝐸 with 𝑥 = 𝑇 𝑛+𝑘−1 𝑦 = 𝑇 𝑘−1 (𝑇 𝑛 𝑦) and so 𝑇 𝑛 𝑦 ∈ im 𝑇 𝑛 . By assumption
im 𝑇 𝑛 = im 𝑇 𝑛+1 and therefore there exists 𝑧 ∈ im 𝑇 𝑛+1 with 𝑇 𝑛+1 𝑧 = 𝑇 𝑛 𝑦. Thus
𝑇 𝑛+𝑘 𝑧 = 𝑇 𝑘−1 (𝑇 𝑛+1 𝑧) = 𝑇 𝑘−1 (𝑇 𝑛 𝑦) = 𝑇 𝑛+𝑘−1 𝑦 = 𝑥 which implies that im 𝑇 𝑛+𝑘−1 =
im 𝑇 𝑛+𝑘 for all 𝑘 ≥ 1. This completes the proof of the lemma.

The above lemma motivates the following definition.

34.2 Definition (ascent and descent) We call

𝛼(𝑇 ) ∶= inf{𝑛 ≥ 1 ∶ ker 𝑇 𝑛 = ker 𝑇 𝑛+1 }

the ascent of 𝑇 and

𝛿(𝑇 ) ∶= inf{𝑛 ≥ 1 ∶ im 𝑇 𝑛 = im 𝑇 𝑛+1 }

the descent of 𝑇
If dim 𝐸 = 𝑁 < ∞, then ascent and descent are clearly finite and since dim(ker 𝑇 𝑛 ) +
dim(im(𝑇 𝑛 )) = 𝑁 for all 𝑛 ∈ ℕ, ascent and descent are always equal. We now show that
they are equal provided they are both finite. The arguments we use are similar to those
to prove the Jordan decomposition theorem for matrices. The additional complication
is that in infinite dimensions there is no dimension formula.

110
34.3 Theorem Suppose that 𝛼(𝑇 ) and 𝛿(𝑇 ) are both finite. Then 𝛼(𝑇 ) = 𝛿(𝑇 ). More-
over, if we set 𝑛 ∶= 𝛼(𝑇 ) = 𝛿(𝑇 ), then
𝐸 = ker 𝑇 𝑛 ⊕ im 𝑇 𝑛
and the above direct sum completely reduces 𝑇 .
Proof. Let 𝛼 ∶= 𝛼(𝑇 ) and 𝛿 ∶= 𝛿(𝑇 ). The plan is to show that
im 𝑇 𝛼 ∩ ker 𝑇 𝑘 = {0} (34.1)
and that
im 𝑇 𝑘 + ker 𝑇 𝛿 = 𝐸 (34.2)
for all 𝑘 ≥ 1. Choosing 𝑘 = 𝛿 in (34.1) and 𝑘 = 𝛼 in (34.2) we then get
im 𝑇 𝛼 ⊕ ker 𝑇 𝛿 = 𝐸. (34.3)
To show that 𝛼 = 𝛿 assume that 𝑥 ∈ ker 𝑇 𝛿+1 . By (34.3) there is a unique decompo-
sition 𝑥 = 𝑥1 + 𝑥2 with 𝑥1 ∈ im 𝑇 𝛼 and 𝑥2 ∈ ker 𝑇 𝛿 . Because ker 𝑇 𝛿 ⊆ ker 𝑇 𝛿+1 we
get 𝑥1 = 𝑥 − 𝑥2 ∈ ker 𝑇 𝛿+1 . We also have 𝑥1 ∈ im 𝑇 𝛼 , so by (34.1) we conclude that
𝑥1 = 0. Hence 𝑥 = 𝑥2 ∈ ker 𝑇 𝛿 , so that ker 𝑇 𝛿 = ker 𝑇 𝛿+1 . By definition of the ascent
we have 𝛼 ≤ 𝛿. Assuming that 𝛼 < 𝛿, then by definition of the descent im 𝑇 𝛿 ⊊ im 𝑇 𝛼 .
By (34.2) we still have im 𝑇 𝛿 + ker 𝑇 𝛿 = 𝐸, contradicting (34.3). Hence 𝛼 = 𝛿.
To prove (34.1) fix 𝑘 ≥ 1 and let 𝑥 ∈ im 𝑇 𝛼 ∩ ker 𝑇 𝑘 . Then 𝑥 = 𝑇 𝛼 𝑦 for some
𝑦 ∈ 𝐸 and 0 = 𝑇 𝑘 𝑥 = 𝑇 𝛼+𝑘 𝑦. As ker 𝑇 𝛼 = ker 𝑇 𝛼+𝑘 we also have 𝑥 = 𝑇 𝛼 𝑦 = 0. To
prove (34.2) fix 𝑘 ≥ 1 and let 𝑥 ∈ 𝐸. As im 𝑇 𝛿 = im 𝑇 𝛿+𝑘 there exists 𝑦 ∈ 𝐸 with
𝑇 𝛿 𝑥 = 𝑇 𝛿+𝑘 𝑦. Hence 𝑇 𝛿 (𝑥 − 𝑇 𝑘 𝑦) = 0, proving that 𝑥 − 𝑇 𝑘 𝑦 ∈ ker 𝑇 𝛿 . Hence we can
write 𝑥 = 𝑇 𝑘 𝑦 + (𝑥 − 𝑇 𝑘 𝑦) from which (34.2) follows.
We finally prove that (34.3) completely reduces 𝑇 . By definition of 𝑛 = 𝛼 = 𝛿
we have 𝑇 (ker 𝑇 𝑛 ) ⊆ ker 𝑇 𝑛+1 = ker 𝑇 𝑛 and 𝑇 (im 𝑇 𝑛 ) ⊆ im 𝑇 𝑛+1 = im 𝑇 𝑛 . This
completes the proof of the theorem.
We well apply the above to the operator 𝜆𝐼 − 𝐴, where 𝐴 is a bounded operator on a
Banach space and 𝜆 ∈ 𝜎𝑝 (𝐴).
34.4 Definition (algebraic multiplicity) Let 𝐸 be a Banach space and over ℂ. If 𝐴 ∈
(𝐸) and 𝜆 ∈ 𝜎𝑝 (𝐴) is an eigenvalue we call
dim ker(𝜆𝐼 − 𝐴)
the geometric multiplicity of 𝜆 and
(⋃ )
dim ker(𝜆𝐼 − 𝐴)𝑛
𝑛∈ℕ

the algebraic multiplicity of 𝜆. The ascent 𝛼(𝜆𝐼 − 𝐴) is called the Riesz index of the
eigenvalue 𝜆.
Note that the definitions here are consistent with the ones from linear algebra. As
it turns out the algebraic multiplicity of the eigenvalue of a matrix as defined above
coincides with the multiplicity of 𝜆 as a root of the characteristic polynomial. In infinite
dimensions there is no characteristic polynomial, so we use the above as the algebraic
multiplicity. The only question now is whether there are classes of operators for which
ascent and descent of 𝜆𝐼 − 𝐴 are finite for every or some eigenvalues. It turns out that
such properties are connected to compactness. This is the topic of the next section.

111
35 The Spectrum of Compact Operators
We want to study a class of operators appearing frequently in the theory of partial
differential equations and elsewhere. They share many properties with operators on
finite dimensional spaces.

35.1 Definition (compact operator) Let 𝐸 and 𝐹 be Banach spaces. Then 𝑇 ∈ Hom(𝐸, 𝐹 )
is called compact if 𝑇 (𝐵) is compact for all bounded sets 𝐵 ⊂ 𝐸. We set

(𝐸, 𝐹 ) ∶= {𝑇 ∈ Hom(𝐸, 𝐹 ) ∶ 𝑇 is compact}

and (𝐸) ∶= (𝐸, 𝐸).

35.2 Remark By the linearity 𝑇 is clearly compact if and only if 𝑇 (𝐵(0, 1)) is compact.

35.3 Proposition Let 𝐸, 𝐹 be Banach spaces. Then (𝐸, 𝐹 ) is a closed subspace of


(𝐸, 𝐹 ).
Proof. If 𝐵 ⊂ 𝐸 is bounded, then by assumption the closure of 𝑇 (𝐵) is compact, and
therefore bounded. To show that (𝐸, 𝐹 ) is a subspace of (𝐸, 𝐹 ) let 𝑇 , 𝑆 ∈ (𝐸, 𝐹 )
and 𝜆, 𝜇 ∈ 𝕂. Let 𝐵 ⊂ 𝐸 be bounded and let (𝑦𝑛 ) be a sequence in (𝜆𝑇 + 𝜇𝑆)(𝐵).
Then there exist 𝑥𝑛 ∈ 𝐵 with 𝑦𝑛 = 𝜆𝑇 𝑥𝑛 + 𝜇𝑆𝑥𝑛 . Since 𝐵 is bounded the sequence
(𝑥𝑛 ) is bounded as well. By the compactness of 𝑇 there exists a subsequence (𝑥𝑛𝑘 ) with
𝑇 𝑥𝑛𝑘 → 𝑦. By the compactness of 𝑆 we can select a further subsequence (𝑥𝑛𝑘 ) with
𝑗
𝑆𝑥𝑛𝑘 → 𝑧. Hence 𝑦𝑛𝑘 = 𝜆𝑇 𝑥𝑛𝑘 + 𝜇𝑆𝑥𝑛𝑘 → 𝜆𝑦 + 𝜇𝑧. This shows that every sequence
𝑗 𝑗 𝑗 𝑗
in (𝜆𝑇 + 𝜇𝑆)(𝐵) has a convergent subsequence, so by Theorem 4.4 the closure of that
set is compact. Hence 𝜆𝑇 + 𝜇𝑆 is a compact operator.
Suppose finally that 𝑇𝑛 ∈ (𝐸, 𝐹 ) with 𝑇𝑛 → 𝑇 in 𝑇 ∈ (𝐸, 𝐹 ). It is sufficient to
show that 𝑇 (𝐵) is compact if 𝐵 = 𝐵(0, 1) is the unit ball. We show that 𝑇 (𝐵) is totally
bounded. To do so fix 𝜖 > 0. Since 𝑇𝑛 → 𝑇 there exists 𝑛 ∈ ℕ such that
𝜀
‖𝑇𝑛 − 𝑇 ‖(𝐸,𝐹 ) < . (35.1)
4

By assumption 𝑇𝑛 is compact and so 𝑇𝑛 (𝐵) is compact. Therefore there exist 𝑦𝑘 ∈


𝑇𝑛 (𝐵), 𝑘 = 1, … , 𝑚, such that


𝑚
𝑇𝑛 (𝐵) ⊆ 𝐵(𝑦𝑘 , 𝜀∕4). (35.2)
𝑘=1

If we let 𝑦 ∈ 𝑇 (𝐵), then 𝑦 = 𝑇 𝑥 for some 𝑥 ∈ 𝐵. By (35.2) there exists 𝑘 ∈ {1, … , 𝑚}


such that 𝑇𝑛 𝑥 ∈ 𝐵(𝑦𝑘 , 𝜀∕4). By (35.1) and since ‖𝑥‖ ≤ 1 we get
𝜀 𝜀 𝜀
‖𝑦𝑘 − 𝑦‖ = ‖𝑦𝑘 − 𝑇𝑛 𝑦‖ + ‖𝑇𝑛 𝑦 − 𝑇 𝑥‖ ≤ + ‖𝑥‖ ≤ ,
4 4 2
so that 𝑦 ∈ 𝐵(𝑦𝑘 , 𝜀∕2). Since 𝑦 ∈ 𝐵 was arbitrary this shows that


𝑚
𝑇 (𝐵) ⊆ 𝐵(𝑦𝑘 , 𝜀∕2)
𝑘=1

112
and therefore

𝑚

𝑚
𝑇 (𝐵) ⊆ 𝐵(𝑦𝑘 , 𝜀∕2) ⊆ 𝐵(𝑦𝑘 , 𝜀).
𝑘=1 𝑘=1

Since 𝜀 > 0 was arbitrary 𝑇 (𝐵) is totally bounded. Since 𝐹 is complete Theorem 4.4
implies that 𝑇 (𝐵) is compact.

We now look at compositions of operators.

35.4 Proposition Let 𝐸, 𝐹 , 𝐺 be Banach spaces. If 𝐾 ∈ (𝐹 , 𝐺), then 𝐾𝑇 ∈ (𝐸, 𝐹 )


for all 𝑇 ∈ (𝐸, 𝐹 ). Likewise, if 𝐾 ∈ (𝐸, 𝐹 ), then 𝑇 𝐾 ∈ (𝐸, 𝐺) for all 𝑇 ∈
(𝐹 , 𝐺).
Proof. If 𝐾 ∈ (𝐹 , 𝐺), 𝑇 ∈ (𝐸, 𝐹 ) and 𝐵 ⊂ 𝐸 bounded, then 𝑇 (𝐵) is bounded
in 𝐹 and therefore 𝐾𝑇 (𝐵) is compact in 𝐺. Hence 𝐾𝑇 is compact. If 𝐾 ∈ (𝐸, 𝐹 ),
𝑇 ∈ (𝐹 , 𝐺) and 𝐵 ⊂ 𝐸 bounded, then 𝐾(𝐵) is compact in 𝐹 . Since 𝑇 is continuous
𝑇 (𝐾(𝐵)) is compact in 𝐺, so 𝑇 𝐾(𝐵) ⊂ 𝑇 (𝐾(𝐵)) is compact. Hence 𝑇 𝐾 is a compact
operator.

35.5 Remark If we define multiplication of operators in (𝐸) to be composition of


operators, then (𝐸) becomes a non-commutative algebra. The above propositions
show that (𝐸) is a closed ideal in that algebra.
The next theorem will be useful to characterise the spectrum of compact operators.

35.6 Theorem Let 𝑇 ∈ (𝐸) and 𝜆 ∈ 𝕂 ⧵ {0}. Then for all 𝑘 ∈ ℕ


( )
(i) dim ker(𝜆𝐼 − 𝑇 )𝑘 < ∞;

(ii) im(𝜆𝐼 − 𝑇 )𝑘 is closed in 𝐸.

Proof. Since ker(𝜆𝐼 − 𝑇 )𝑘 = ker(𝐼 − 𝜆−1 𝑇 )𝑘 and im(𝜆𝐼 − 𝑇 )𝑘 = im(𝐼 − 𝜆−1 𝑇 )𝑘 we


can assume without loss of generality that 𝜆 = 1 by replacing 𝑇 by 𝜆𝑇 . Also, we can
reduce the proof to the case 𝑘 = 1 because
( ) ( )
∑𝑘
∑𝑘
𝑗 𝑘 𝑗−1 𝑘
𝑘
(𝐼 − 𝐾) = (−1) 𝑗
𝑇 =𝐼− (−1) 𝑇 𝑗 = 𝐼 − 𝑇̃
𝑗=0
𝑗 𝑗=1
𝑗
∑𝑘 ()
if we set 𝑇̃ ∶= 𝑗=1 (−1)𝑗−1 𝑘𝑗 𝑇 𝑗 . Note that by Proposition 35.4 the operator 𝑇̃ is
compact.
(i) If 𝑥 ∈ ker(𝐼 − 𝑇 ), then 𝑥 = 𝑇 𝑥 and so

𝑆 ∶= {𝑥 ∈ ker(𝐼 − 𝑇 ) ∶ ‖𝑥‖ = 1} ⊆ 𝑇 (𝐵(0, 1)).

Since 𝑇 is compact 𝑆 is compact as well. Note that 𝑆 is the unit sphere in ker(𝐼 − 𝑇 ),
so by Theorem 11.3 the kernel of 𝐼 − 𝑇 is finite dimensional.
(ii) Set 𝑆 ∶= 𝐼 − 𝑇 . As ker 𝑆 is closed 𝑀 ∶= 𝐸∕ ker 𝑆 is a Banach space by
Theorem 12.3 with norm ‖𝑥‖𝑀 ∶= inf 𝑦∈ker 𝑆 ‖𝑦 − 𝑥‖. Now set 𝐹 ∶= im(𝑆). Let

113
𝑆̂ ∈ (𝑀, 𝐹 ) be the map induced by 𝑆. We show that 𝑆̂ −1 is bounded on im(𝑆). If
not, then there exist 𝑦𝑛 ∈ im(𝑆) such that 𝑦𝑛 → 0 but ‖𝑆̂ −1 𝑦𝑛 ‖𝑀 = 1 for all 𝑛 ∈ ℕ. By
definition of the quotient norm there exist 𝑥𝑛 ∈ 𝐸 with 1 ≤ ‖𝑥𝑛 ‖𝐸 ≤ 2 such that
̂ 𝑛 ] = 𝑥𝑛 − 𝑇 𝑥𝑛 = 𝑦𝑛 → 0.
𝑆[𝑥

Since 𝑇 is compact there exists a subsequence such that 𝑇 𝑥𝑛𝑘 → 𝑧. Hence from the
above
𝑥𝑛𝑘 = 𝑦𝑛𝑘 + 𝑇 𝑥𝑛𝑘 → 𝑧.
By the continuity of 𝑇 we have 𝑇 𝑥𝑛𝑘 → 𝑇 𝑧, so that 𝑧 = 𝑇 𝑧. Hence 𝑧 ∈ ker(𝐼 − 𝑇 ) and
[𝑥𝑛𝑘 ] → [𝑧] = [0] in 𝑀. Since ‖[𝑥𝑛 ‖𝑀 = 1 for all 𝑛 ∈ ℕ this is not possible. Hence
𝑆̂ −1 ∈ (im(𝑆), 𝑀) and by Proposition 23.4 it follows that 𝐹 = im(𝑆)̂ = im(𝐼 − 𝑇 ) is
closed.

We next prove that the ascent and descent of 𝜆𝐼 − 𝑇 is finite if 𝑇 is compact and 𝜆 ≠ 0.

35.7 Theorem Let 𝑇 ∈ (𝐸) and 𝜆 ∈ 𝕂⧵{0}. Then 𝑛 ∶= 𝛼(𝜆𝐼 −𝑇 ) = 𝛿(𝜆𝐼 −𝑇 ) < ∞
and
𝐸 = ker(𝜆𝐼 − 𝑇 )𝑛 ⊕ im(𝜆𝐼 − 𝑇 )𝑛
is a topological direct sum completely reducing 𝜇𝐼 − 𝑇 for all 𝜇 ∈ ℂ.
Proof. Replacing 𝑇 by 𝜆𝑇 we can assume without loss of generality that 𝜆 = 1.
Suppose that 𝛼(𝜆𝐼 − 𝑇 ) = ∞. Then 𝑁𝑘 ∶= ker(𝐼 − 𝑇 )𝑘 is closed in 𝐸 for every 𝑘 ∈ ℕ
and
𝑁0 ⊂ 𝑁1 ⊂ 𝑁2 ⊂ 𝑁3 ⊂ … ,
where all inclusions are proper inclusions. By Corollary 11.2 there exist 𝑥𝑘 ∈ 𝑁𝑘 such
that
1
‖𝑥𝑘 ‖ = 1 and dist(𝑥𝑘 , 𝑁𝑘−1 ) ≥
2
for all 𝑘 ≥ 1. Now

𝑇 𝑥𝑘 − 𝑇 𝑥𝓁 = (𝐼 − 𝑇 )𝑥𝓁 − (𝐼 − 𝑇 )𝑥𝑘 + 𝑥𝑘 − 𝑥𝓁
( )
= 𝑥𝑘 − 𝑥𝓁 − (𝐼 − 𝑇 )𝑥𝓁 + (𝐼 − 𝑇 )𝑥𝑘 = 𝑥𝑘 − 𝑧,

where 𝑧 ∶= 𝑥𝓁 − (𝐼 − 𝑇 )𝑥𝓁 + (𝐼 − 𝑇 )𝑥𝑘 ∈ ker(𝐼 − 𝑇 )𝑘−1 = 𝑁𝑘−1 if 1 ≤ 𝓁 < 𝑘. Hence


by choice of (𝑥𝑘 ) we get
1
‖𝑇 𝑥𝑘 − 𝑇 𝑥𝓁 ‖ = ‖𝑥𝑘 − 𝑧‖ ≥
2
for all 1 ≤ 𝓁 < 𝑘. Therefore the bounded sequence 𝑇 𝑥𝑘 does not have a convergent
subsequence, so 𝑇 is not compact. As a consequence, if 𝑇 ∈ (𝐸, 𝐹 ) is compact, then
𝛼(𝐼 − 𝑇 ) < ∞.
The proof for the descent works quite similarly. We assume that 𝛿(𝐼 − 𝑇 ) = ∞.
Then by Theorem 35.6 𝑀𝑘 ∶= im(𝐼 − 𝑇 )𝑘 is closed in 𝐸 for every 𝑘 ∈ ℕ and

𝑀0 ⊃ 𝑀1 ⊃ 𝑀2 ⊃ 𝑀3 ⊃ … ,

114
where all inclusions are proper inclusions. By Corollary 11.2 there exist 𝑥𝑘 ∈ 𝑀𝑘 such
that
1
‖𝑥𝑘 ‖ = 1 and dist(𝑥𝑘 , 𝑀𝑘+1 ) ≥
2
for all 𝑘 ≥ 1. Now

𝑇 𝑥𝑘 − 𝑇 𝑥𝓁 = (𝐼 − 𝑇 )𝑥𝓁 − (𝐼 − 𝑇 )𝑥𝑘 − 𝑥𝑘 + 𝑥𝓁
( )
= 𝑥𝓁 − 𝑥𝑘 − (𝐼 − 𝑇 )𝑥𝓁 + (𝐼 − 𝑇 )𝑥𝑘 = 𝑥𝑘 − 𝑧,

where 𝑧 ∶= 𝑥𝑘 − (𝐼 − 𝑇 )𝑥𝓁 + (𝐼 − 𝑇 )𝑥𝑘 ∈ im(𝐼 − 𝑇 )𝑘+1 = 𝑀𝑘+1 if 1 ≤ 𝑘 < 𝓁. Hence


by choice of (𝑥𝑘 ) we get
1
‖𝑇 𝑥𝑘 − 𝑇 𝑥𝓁 ‖ = ‖𝑥𝓁 − 𝑧‖ ≥
2
for all 1 ≤ 𝑘 < 𝓁. Therefore the bounded sequence 𝑇 𝑥𝑘 does not have a convergent
subsequence, so 𝑇 is not compact. As a consequence, if 𝑇 ∈ (𝐸, 𝐹 ) is compact, then
𝛿(𝐼 − 𝑇 ) < ∞.
By Theorem 34.3 𝑛 ∶= 𝛼(𝐼 − 𝑇 ) = 𝛿(𝐼 − 𝑇 ) and 𝐸 = ker(𝐼 − 𝑇 )𝑛 ⊕ im(𝐼 − 𝑇 )𝑛
is a direct sum completely reducing 𝐼 − 𝑇 and therefore 𝜇𝐼 − 𝑇 for all 𝜇 ∈ 𝕂. By
Theorem 35.6 the subspaces ker(𝐼 − 𝑇 )𝑛 and im(𝐼 − 𝑇 )𝑛 are closed, so by 33.3 the
above is a topological direct sum

As a consequence of the above theorem we prove that compact operators behave quite
similarly to operators on finite dimensional spaces. The following corollary would be
proved by the dimension formula in finite dimensions.

35.8 Corollary Suppose that 𝑇 ∈ (𝐸) and 𝜆 ≠ 0. Then 𝜆𝐼 − 𝑇 is injective if and


only if it is surjective.
Proof. If 𝜆𝐼 − 𝑇 is injective, then {0} = ker(𝜆𝐼 − 𝑇 ) = ker(𝜆𝐼 − 𝑇 )2 , and by
Theorem 35.7 we get

𝐸 = ker(𝜆𝐼 − 𝑇 ) ⊕ im(𝜆𝐼 − 𝑇 ) = {0} ⊕ im(𝜆𝐼 − 𝑇 ) = im(𝜆𝐼 − 𝑇 ). (35.3)

On the other hand, if 𝐸 = im(𝜆𝐼 − 𝑇 ), then im(𝜆𝐼 − 𝑇 ) = im(𝜆𝐼 − 𝑇 )2 , so that


𝛿(𝜆𝐼 − 𝑇 ) = 1. Hence ker(𝜆𝐼 − 𝑇 ) = {0} by Theorem 35.7 and (35.3).

We finally derive a complete description of the spectrum of closed operators.

35.9 Theorem (Riesz-Schauder) Let 𝐸 be a Banach space over ℂ and 𝑇 ∈ (𝐸)


compact. Then 𝜎(𝑇 ) ⧵ {0} consists of at most countably many isolated eigenvalues of
finite algebraic multiplicity. The only possible accumulation point of these eigenvalues
is zero.
Proof. By Corollary 35.8 and Proposition 32.3 every 𝜆 ∈ 𝜎(𝑇 ) ⧵ {0} is an eigenvalue
of 𝑇 . By Theorems 35.6 and 35.7 these eigenvalues have finite algebraic multiplicity.
Hence it remains to show that they are isolated. Fix 𝜆 ∈ 𝜎(𝑇 ) ⧵ {0} and let 𝑛 its Riesz
index (see Definition 34.4). Then by Theorem 35.7

𝐸 = ker(𝜆𝐼 − 𝑇 )𝑛 ⊕ im(𝜆𝐼 − 𝑇 )𝑛

115
is a topological direct sum completely reducing 𝜇𝐼 − 𝑇 for all 𝜇 ∈ ℂ. We set 𝑁 ∶=
ker(𝜆𝐼 − 𝑇 )𝑛 and 𝑀 ∶= im(𝜆𝐼 − 𝑇 )𝑛 . Let 𝑇 = 𝑇𝑁 ⊕ 𝑇𝑀 be the reduction associated
with the above direct sum decomposition. By construction 𝜆𝐼 − 𝑇𝑁 is nilpotent, so by
Theorem 32.10 spr(𝜆𝐼−𝑇𝑁 ) = 0. In particular this means that 𝜎(𝑇𝑁 ) = {𝜆}. Moreover,
by construction, 𝜆𝐼 − 𝑇𝑀 is injective and so by Proposition 32.3 and Corollary 35.8
𝜆 ∈ 𝜚(𝑇𝑀 ). As 𝜚(𝑇𝑀 ) is open by Theorem 32.7 there exists 𝑟 > 0 such that 𝐵(𝜆, 𝑟) ⊂
𝜚(𝑇𝑀 ). By Theorem 33.10 𝜚(𝑇 ) = 𝜚(𝑇𝑁 ) ∩ 𝜚(𝑇𝑀 ), so 𝐵(𝜆, 𝑟) ⧵ {𝜆} ⊂ 𝜚(𝑇 ), showing
that 𝜆 is an isolated eigenvalue. Since the above works for every 𝜆 ∈ 𝜎(𝑇 ) ⧵ {0} the
assertion of the theorem follows.

35.10 Remarks (a) If 𝑇 ∈ (𝐸) and dim 𝐸 = ∞, then 0 ∈ 𝜎(𝑇 ). If not, then by
Proposition 35.4 the identity operator 𝐼 = 𝑇 𝑇 −1 is compact. Hence the unit sphere in
𝐸 is compact and so by Theorem 11.3 𝐸 is finite dimensional.
(b) The above theorem allows us to decompose a compact operator 𝑇 ∈ (𝐸) in
the following way. Let 𝜆𝑘 ∈ 𝜎𝑝 (𝑇 ) ⧵ {0}, 𝑘 = 1, … , 𝑚, and let 𝑛𝑘 be the corresponding
Riesz indices. Then set
𝑁𝑘 ∶= ker(𝜆𝐼 − 𝑇 )𝑛𝑘
and 𝑇𝑘 ∶= 𝑇 |𝑁𝑘 . Then there exists a closed subspace 𝑀 ⊂ 𝐸 which is invariant under
𝑇 such that
𝐸 = 𝑁1 ⊕ 𝑁 2 ⊕ ⋯ ⊕ 𝑁 𝑚 ⊕ 𝑀
completely reduces 𝑇 = 𝑇1 ⊕ 𝑇2 ⊕ ⋯ ⊕ 𝑇𝑚 ⊕ 𝑇𝑀 . To get this reduction apply Theo-
rem 35.9 inductively.
The above theory is also useful for certain classes of closed operators 𝐴 ∶ 𝐷(𝐴) ⊆
𝐸 → 𝐸, namely those for which 𝑅(𝜆, 𝐴) is compact for some 𝜆 ∈ 𝜚(𝐴). In that case
the resolvent identity from Theorem 32.7 and Proposition 35.4 imply that 𝑅(𝜆, 𝐴) is
compact for all 𝜆 ∈ 𝜚(𝐴). It turns out that 𝜎(𝐴) = 𝜎𝑝 (𝐴) consist of isolated eigenvalues,
and the only possible point of accumulation is infinity. Examples of such a situation
include boundary value problems for partial differential equations such as the one in
Example 24.5.

116
Bibliography

[1] G. Birkhoff and G.-C. Rota. “On the completeness of Sturm-Liouville expan-
sions”. In: Amer. Math. Monthly 67 (1960), pp. 835–841. DOI: 10.2307/2309440.
[2] E. Bishop and D. Bridges. Constructive analysis. Vol. 279. Grundlehren der
Mathematischen Wissenschaften. Berlin: Springer-Verlag, 1985, pp. xii+477.
[3] A. P. Campbell and D. Daners. “Linear Algebra via Complex Analysis”. In:
Amer. Math. Monthly 120.10 (2013), pp. 877–892. DOI: 10.4169/amer.math.
monthly.120.10.877.
[4] P. Cohen. “The independence of the continuum hypothesis”. In: Proc. Nat. Acad.
Sci. U.S.A. 50 (1963), pp. 1143–1148.
[5] P. J. Cohen. “The independence of the continuum hypothesis. II”. In: Proc. Nat.
Acad. Sci. U.S.A. 51 (1964), pp. 105–110.
[6] D. Daners. “Uniform continuity of continuous functions on compact metric spaces”.
In: Amer. Math. Monthly 122.6 (2015), p. 592. DOI: 10 . 4169 / amer . math .
monthly.122.6.592.
[7] J. Dugundji. Topology. Boston, Mass.: Allyn and Bacon Inc., 1966, pp. xvi+447.
[8] K. Gödel. “The consistency of the axiom of choice and of the generalized continuum-
hypothesis”. In: Proc. Nat. Acad. Sci. U.S.A. 24 (1938), pp. 556–557.
[9] J. Lindenstrauss and L. Tzafriri. “On the complemented subspaces problem”. In:
Israel J. Math. 9 (1971), pp. 263–269.
[10] W. Rudin. Real and Complex Analysis. 2nd. New York: McGraw-Hill Inc., 1974.
[11] A. D. Sokal. “A Really Simple Elementary Proof of the Uniform Boundedness
Theorem”. In: Amer. Math. Monthly 118.5 (2011), pp. 450–452. DOI: 10.4169/
amer.math.monthly.118.05.450.
[12] M. H. Stone. “The generalized Weierstrass approximation theorem”. In: Math.
Mag. 21 (1948), pp. 167–184, 237–254. DOI: 10.2307/3029750.
[13] K. Yosida. Functional analysis. 6th. Vol. 123. Grundlehren der Mathematischen
Wissenschaften. Berlin: Springer-Verlag, 1980, pp. xii+501.
[14] E. Zermelo. “Beweis, daß jede Menge wohlgeordnet werden kann”. In: Math.
Ann. 59 (1904), pp. 514–516.

117

You might also like