
Functional Analysis. Jie Xiao

This document contains lecture notes on functional analysis from Jie Xiao of Memorial University of Newfoundland. The notes cover topics in metric spaces, continuous maps, normed linear spaces, Banach spaces, and Hilbert spaces. Chapter 1 defines metric spaces and discusses metrizable topology, completeness, compactness, density and separability of metric spaces. It provides examples of metric spaces such as Euclidean spaces.


Jie Xiao

Lecture Notes on M6310 - Functional Analysis
September 6, 2020

Department of Mathematics and Statistics


Memorial University of Newfoundland
St. John’s, NL, A1C 5S7, Canada
Contents

1 Metric Spaces
1.1 Metrizable Topology and Connectedness
1.2 Completeness
1.3 Compactness, Density and Separability
Problems

2 Continuous Maps
2.1 Criteria for Continuity
2.2 Continuous Maps on Compact or Connected Spaces
2.3 Sequences of Mappings
2.4 Contractions
2.5 Equivalence of Metric Spaces
Problems

3 Normed Linear Spaces
3.1 Linear Spaces, Norms and Quotient Spaces
3.2 Finite Dimensional Spaces
3.3 Bounded Linear Operators
3.4 Linear Functionals via Hahn-Banach Extension
Problems

4 Banach Spaces via Operators and Functionals
4.1 Definition and Beginning Examples
4.2 Uniform Boundedness, Open Map and Closed Graph
4.3 Dual Banach Spaces by Examples
4.4 Weak and Weak* Topologies
4.5 Compact and Dual Operators
Problems

5 Hilbert Spaces and Their Operators
5.1 Definition, Examples and Basic Properties
5.2 Orthogonality, Orthogonal Complement and Duality
5.3 Orthonormal Sets and Bases
5.4 Five Special Bounded Operators
5.5 Compact Operators via the Spectrum
Problems

1
Metric Spaces

The objective of this chapter is to establish a basis which is general enough
to acquaint ourselves with some limit concepts in such an abstract setting as
functional analysis. In this case the word abstract implies that the objects we
deal with do not have to be numbers or even have arithmetic properties like
numbers. The one property that is essential to our idea of limits is the concept
of distance between elements. Each limit concept that we have seen thus far is
based on the question of whether two numbers are close enough to each other
whenever certain conditions hold. So our abstraction is developed in a system
whose elements need possess only the property of distance between pairs of
elements.

1.1 Metrizable Topology and Connectedness

To begin with, we introduce the definition of a metric space followed by some
useful examples, and consider topological properties such as open and closed
sets induced by a metric space.

Definition 1.1. A metric space is a system (X, d) that consists of a set X
whose elements are called points and a function d whose domain is all of
X × X and whose range is in [0, ∞) such that the following three properties
hold:
(i) d(x, y) ≥ 0 for all x, y ∈ X and d(x, y) = 0 if and only if x = y;
(ii) d(x, y) = d(y, x) for every x, y ∈ X;
(iii) d(x, z) ≤ d(x, y) + d(y, z) for every x, y, z ∈ X.

Remark 1.2.
(i) Because of these three properties, of which (ii) and (iii) express
respectively the symmetry and the triangle inequality of the metric space,
the function d is usually called the distance function.
(ii) If d(·, ·) has all the properties (i)-(ii)-(iii) except that d(x, y) = 0 does
not necessarily enforce x = y, then we call d a pseudo-distance and (X, d) a
pseudo-metric space.

Example 1.3.
(i) For $n \in \mathbb{N}$ let $\mathbb{R}^n$ be the set of all finite number sequences $x = (x_k)_{k=1}^n$
where the real number $x_k$ is called the $k$th coordinate of $x$. Define
\[
d_n(x, y) = \Big( \sum_{k=1}^n |x_k - y_k|^2 \Big)^{1/2}, \quad x = (x_k)_{k=1}^n,\ y = (y_k)_{k=1}^n \in \mathbb{R}^n.
\]

Then $\mathbb{R}^n$ with this distance function is called the n-dimensional Euclidean
space, or, more briefly, the Euclidean n-space. Our immediate goal, however,
is to verify that $d_n$ does indeed provide $\mathbb{R}^n$ with a metric distance via proving
the Cauchy-Schwarz inequality for $\mathbb{R}^n$ (see also Theorem ?? for a continuous
form):
\[
\sum_{k=1}^n |x_k y_k| \le \Big( \sum_{k=1}^n |x_k|^2 \Big)^{1/2} \Big( \sum_{k=1}^n |y_k|^2 \Big)^{1/2}.
\]

To see this, consider the quadratic polynomial $Q(t) = At^2 + Bt + C$, where
\[
A = \sum_{k=1}^n |x_k|^2, \quad B = 2 \sum_{k=1}^n |x_k||y_k|, \quad C = \sum_{k=1}^n |y_k|^2.
\]

Then
\[
Q(t) = \sum_{k=1}^n (|x_k| t + |y_k|)^2 \ge 0.
\]

This implies that $Q(t)$ cannot have two distinct real zeros. Therefore, by the
quadratic formula, its discriminant satisfies $B^2 - 4AC \le 0$. In other words,
\[
\Big( \sum_{k=1}^n |x_k||y_k| \Big)^2 - \Big( \sum_{k=1}^n |x_k|^2 \Big) \Big( \sum_{k=1}^n |y_k|^2 \Big) \le 0.
\]

Using the Cauchy-Schwarz inequality, we derive that if $x, y, z \in \mathbb{R}^n$, then
\[
\begin{aligned}
(d_n(x, y) + d_n(y, z))^2 &= d_n(x, y)^2 + 2 d_n(x, y) d_n(y, z) + d_n(y, z)^2 \\
&\ge \sum_{k=1}^n ((x_k - y_k) + (y_k - z_k))^2 \\
&= d_n(x, z)^2.
\end{aligned}
\]

Taking the square root of the first and last members, we further obtain
\[
d_n(x, z) \le d_n(x, y) + d_n(y, z),
\]
which verifies that $(\mathbb{R}^n, d_n)$ is a metric space.


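(ii) For $n \in \mathbb{N}$ let $\mathbb{Q}^n = \mathbb{Q} \times \mathbb{Q} \times \cdots \times \mathbb{Q} = \{(x_1, ..., x_n) \in \mathbb{R}^n : x_1, x_2, ..., x_n \in \mathbb{Q}\}$.
Then $(\mathbb{Q}^n, d_n)$ is also a metric space.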
(iii) Let $g : \mathbb{R} \to \mathbb{R}$ be an increasing function, $E \subseteq \mathbb{R}$ an $m_g$-measurable
set, and $p \in (0, \infty]$. If $\mathbb{F}$ is $\mathbb{R}$ or $\mathbb{C}$, then the Lebesgue-Radon-Stieltjes space
$LRS_g^p(E, \mathbb{F})$ is defined as the collection of all $\mathbb{F}$-valued functions $f = \Re f + i\,\Im f$
for which $|f| = \sqrt{(\Re f)^2 + (\Im f)^2}$ and
\[
\|f\|_{p,m_g,E} =
\begin{cases}
\Big( \int_E |f(x)|^p \, dm_g(x) \Big)^{1/p}, & p \in (0, \infty) \\
\inf\big\{ c \ge 0 : m_g(\{x \in E : |f(x)| > c\}) = 0 \big\}, & p = \infty
\end{cases}
\]
is finite. If
\[
d_{p,m_g,E}(f_1, f_2) =
\begin{cases}
\|f_1 - f_2\|_{p,m_g,E}^p, & p \in (0, 1) \\
\|f_1 - f_2\|_{p,m_g,E}, & p \in [1, \infty],
\end{cases}
\]
then $d_{p,m_g,E}(f_1, f_2) = 0$ if and only if $f_1 = f_2$ $m_g$-a.e. on $E$; and (by
Minkowski's inequality)
\[
d_{p,m_g,E}(f_1, f_2) \le d_{p,m_g,E}(f_1, f_3) + d_{p,m_g,E}(f_3, f_2)
\]
for $f_1, f_2, f_3 \in LRS_g^p(E, \mathbb{F})$. Accordingly, $\big( LRS_g^p(E, \mathbb{F}), d_{p,m_g,E} \big)$ is a pseudo-metric space.
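
As a numerical sanity check of Example 1.3 (i), the following Python sketch evaluates the gaps in the Cauchy-Schwarz inequality and in the triangle inequality for d_n on randomly sampled points. The function names, the sampling range and the seed are illustrative assumptions and do not come from the notes; a finite sample can only illustrate the inequalities, it does not prove them.

```python
import math
import random


def d_n(x, y):
    """Euclidean distance between two real tuples of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def cauchy_schwarz_gap(x, y):
    """Right-hand side minus left-hand side of the Cauchy-Schwarz inequality."""
    lhs = sum(abs(a * b) for a, b in zip(x, y))
    rhs = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return rhs - lhs


def triangle_gap(x, y, z):
    """d_n(x, y) + d_n(y, z) - d_n(x, z); nonnegative up to rounding."""
    return d_n(x, y) + d_n(y, z) - d_n(x, z)


def random_point(n, rng):
    """A point of R^n with coordinates drawn from [-10, 10] (arbitrary choice)."""
    return tuple(rng.uniform(-10.0, 10.0) for _ in range(n))


def smallest_gaps(n=5, trials=1000, seed=0):
    """Smallest observed gaps over a number of random trials."""
    rng = random.Random(seed)
    cs = min(
        cauchy_schwarz_gap(random_point(n, rng), random_point(n, rng))
        for _ in range(trials)
    )
    tri = min(
        triangle_gap(random_point(n, rng), random_point(n, rng), random_point(n, rng))
        for _ in range(trials)
    )
    return cs, tri


if __name__ == "__main__":
    print(smallest_gaps())
```

Both reported gaps should be nonnegative up to floating-point rounding, in agreement with the two inequalities proved above.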

Example 1.3 leads to a general fact about metric spaces.

Theorem 1.4.
(i) Let X be any nonempty set and define the discrete distance d(x, y), x, y ∈ X, by
\[
d(x, y) =
\begin{cases}
1, & x \ne y \\
0, & x = y.
\end{cases}
\]
Then (X, d) is a metric space.
(ii) If (X, d) is a metric space and Y is a subset of X, then (Y, d) is a metric
space for which the domain of d is now restricted to Y × Y .

Proof. (i) It is clear that this d satisfies Definition 1.1 (i) and (ii). To verify
Definition 1.1 (iii), consider any points x, y, z ∈ X. If d(x, z) = 0, then
Definition 1.1 (iii) is trivial. If d(x, z) > 0, then x ≠ z and d(x, z) = 1, and hence y
cannot equal both x and z, so at least one of the distances d(x, y) and d(y, z)
must equal 1. Therefore, Definition 1.1 (iii) follows.
(ii) This is an immediate consequence of the definition of a metric
space and the concept of a subset.
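
The case analysis in the proof of Theorem 1.4 (i) can also be checked by brute force on a small finite set. The sketch below is only an illustration under our own assumptions: `discrete` and `is_metric_on` are hypothetical helper names, and the check covers only the listed sample points, which must be pairwise distinct.

```python
from itertools import product


def discrete(x, y):
    """Discrete distance of Theorem 1.4 (i)."""
    return 0 if x == y else 1


def is_metric_on(points, d, tol=1e-12):
    """Check Definition 1.1 (i)-(iii) for d on a finite list of distinct points."""
    for x, y in product(points, repeat=2):
        # (i) nonnegativity, and d(x, y) = 0 exactly when x = y
        if d(x, y) < 0 or (d(x, y) == 0) != (x == y):
            return False
        # (ii) symmetry
        if d(x, y) != d(y, x):
            return False
    for x, y, z in product(points, repeat=3):
        # (iii) triangle inequality
        if d(x, z) > d(x, y) + d(y, z) + tol:
            return False
    return True


if __name__ == "__main__":
    print(is_metric_on(["a", "b", "c", 1, 2.5], discrete))
    # The squared difference fails the triangle inequality on {0, 1, 2}.
    print(is_metric_on([0, 1, 2], lambda x, y: (x - y) ** 2))
```

The second call shows that the checker is not vacuous: for the squared difference one has d(0, 2) = 4 > d(0, 1) + d(1, 2) = 2.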

The topology for a space of points provides a theory of cluster points and
convergence for sets of points in the space. In a metric space, this convergence
theory is based on the concept of distance between points. For more abstract
spaces it is based on a system of open sets, and there may be no distance
concept associated with the notion of limit. We here take the former approach
and use the metric distance function to define certain open sets, which are
generalizations of the open intervals in R.
In the rest of this section, we always assume that X is a metric space with
distance function d.

Definition 1.5. If x ∈ X and r > 0, then $B_r(x) = B_r^X(x) = \{y \in X : d(x, y) < r\}$,
$\overline{B}_r(x) = \overline{B}_r^X(x) = \{y \in X : d(x, y) \le r\}$, and $S_r(x) = S_r^X(x) =
\{y \in X : d(x, y) = r\}$ are called the open ball, closed ball, and sphere of radius
r about x respectively. A neighborhood of x is any subset of X that contains
some open ball about x.

Remark 1.6. Note that $B_0(x) = \emptyset$ and $\overline{B}_0(x) = \{x\}$.

Example 1.7. Let $X = \mathbb{R}^n$, $x \in \mathbb{R}^n$ and $r > 0$.
(i) If n = 1, then $B_r(x) = (x - r, x + r)$ and $\overline{B}_r(x) = [x - r, x + r]$.
(ii) If n = 2, then $B_r(x)$ and $\overline{B}_r(x)$ are the open and closed disks of radius r
centered at x, respectively.
(iii) If n = 3, then $B_r(x)$ and $\overline{B}_r(x)$ are the open and closed (3-dimensional)
balls of radius r centered at x, respectively.

Definition 1.8. Let A ⊆ X.
(i) x ∈ X is called an interior point of A, denoted $x \in A^\circ$, provided there
exists r > 0 such that $B_r(x) \subseteq A$.
(ii) A is said to be open provided $A \subseteq A^\circ$.
(iii) x ∈ X is called a cluster point of A, denoted $x \in A'$, provided for every
r > 0 one has $B_r(x) \cap (A \setminus \{x\}) \ne \emptyset$.
(iv) x ∈ X is called an isolated point of A provided there exists an $r_0 > 0$ such
that $B_{r_0}(x) \cap A = \{x\}$.
(v) A is said to be closed provided $A' \subseteq A$.
(vi) A is said to be bounded provided $A \subseteq B_r(x)$ for certain r > 0 and some
x ∈ X.

Remark 1.9. It is clear that A is open or closed if and only if $A = A^\circ$ or
$A = \overline{A} = A \cup A'$, respectively.

Example 1.10.
(i) Any open ball $B_r(x)$ in a given metric space X is a bounded open set.
(ii) In $\mathbb{R}^2$, let $A = B_2(0)$ and $x = (\sqrt{2}, \sqrt{2})$; then $x \in A'$, since
\[
(\sqrt{2} - \epsilon/2, \sqrt{2} - \epsilon/2) \in B_2(0) \cap B_\epsilon(x),
\]
where $B_\epsilon(x)$ is an open disk about x with any sufficiently small radius $\epsilon > 0$.
However, in $\mathbb{R}^2$, if $A = B_1(0) \cup \{x\}$ where x = (2, 0), then $x \notin A'$, for the
open disk $B_{1/2}(x)$ contains no point of A except x itself.
(iii) Any singleton {x} of X is closed.
(iv) X and ∅ are not only open, but also closed.
(v) Let F be the subset of $\mathbb{R}^2$ consisting of the points
\[
\{(1, 0), (1/2, 0), (1/3, 0), ..., (1/k, 0), ...\}.
\]

Then (0, 0) is a cluster point of F , and since (0, 0) is not in F we infer that F
is not closed. Also, the point (1, 0) is not an interior point of F , so F is not
open.

The forthcoming theorem is the fundamental relationship between the
notions of open set and closed set.

Theorem 1.11. For a subset A of X let $A^c = X \setminus A$ be the complement of
A in X. Then A is open if and only if $A^c$ is closed.

Proof. Suppose A is open. To show that $A^c$ is closed, we use the contrapositive
method to prove that if $x \notin A^c$ then x is not a cluster point of $A^c$. Suppose
now $x \notin A^c$. Then x ∈ A. Since A is open, there is a ball $B_r(x) \subseteq A$. Thus x
is not a cluster point of $A^c$, as desired.
Conversely, suppose $A^c$ is closed. If A is not open, then A contains some
point y that is not an interior point of A. Thus every ball $B_r(y)$ contains a
point of $A^c$, and this point differs from y because y ∈ A. Consequently, y is a
cluster point of $A^c$, and since y is in A and hence not in $A^c$,
we conclude that $A^c$ is not closed, a contradiction to the assumption.

Remark 1.12. Perhaps the greatest value of Theorem 1.11 is that it gives us
an alternative method for proving that a set is open or closed. It is sometimes
easier to work with the complement of a set than with the set itself. For
instance, we know immediately that X \ {x} is open because {x} is closed.

Theorem 1.13.
(i) The intersection of a finite number of open sets is itself an open set.
(ii) The union of any collection of open sets is an open set.

Proof. (i) Suppose each of $A_1, ..., A_n$ is open, and let $x \in A = \cap_{k=1}^n A_k$. Then
x is in each of $A_1, ..., A_n$, so for each 1 ≤ k ≤ n there exists a positive
radius $r_k$ such that $B_{r_k}(x) \subseteq A_k$. Define $r = \min\{r_1, ..., r_n\}$. Then for each k,
$B_r(x) \subseteq B_{r_k}(x) \subseteq A_k$, and therefore $B_r(x) \subseteq A$. Hence x is in $A^\circ$, and so A is open.
(ii) Suppose that for each j in an index set I, $A_j$ is an open set. Let
$x \in \cup_{j \in I} A_j$. Then x belongs to some $A_j$, say, $A_{j_0}$. Since it is open, there
exists an open ball $B_r(x)$ such that $B_r(x) \subseteq A_{j_0} \subseteq \cup_{j \in I} A_j$. Hence $\cup_{j \in I} A_j$ is
open.

This theorem leads to a result regarding closed sets.

Corollary 1.14.
(i) The union of a finite number of closed sets is a closed set.
(ii) The intersection of any collection of closed sets is itself a closed set.
Proof. It suffices to prove (i), since (ii) follows in the same way from
Theorem 1.13 (ii). Suppose that $F_1, ..., F_n$ are closed sets. It is easy
to verify that $X \setminus \cup_{k=1}^n F_k = \cap_{k=1}^n (X \setminus F_k)$. By Theorem 1.11, each $X \setminus F_k$ is
open. Thus, according to Theorem 1.13 (i) their intersection is open. Hence
$\cup_{k=1}^n F_k$ is closed since its complement is open.

Remark 1.15.
(i) The union of an infinite collection of closed sets need not be closed. For
instance, let $F_k$ be the singleton $\{(1/k, 0)\}$ in $\mathbb{R}^2$. It is clear that (0, 0) is a
cluster point of $\cup_{k=1}^\infty F_k$, but (0, 0) is not in this union. So the union is not
closed. Of course, it is not hard to give an example of an infinite collection of
open sets whose intersection is not open. For example, let $A_k = B_{1/k}(0)$. Clearly,
$\cap_{k=1}^\infty A_k = \{0\}$, which is not open.
(ii) We should take particular notice of the fact that the collections of sets in
Theorem 1.13 and Corollary 1.14 may be arbitrarily large. In particular, the
collections may contain so many sets that it is impossible to describe them as
an infinite sequence of sets. This is why we use the device of an index set I to
describe the collection. Had we written $\{A_n\}_{n=1}^\infty$ for the collection of sets, this
would have indicated a sequence of sets and therefore limited our discussion
to countable collections.

We finish this section by applying the formula
\[
\overline{A} = \{x \in X : B_r(x) \cap A \ne \emptyset \text{ for all } r > 0\}
\]
to a discussion of the connectedness of a metric space, that is, the property of a set
that cannot be naturally separated into two subsets.
The vagueness of the phrase “naturally separated” suggests that it is not
easy to give a precise definition of the property we want. Indeed, we define it
by first stating when a set does not possess this property.

Definition 1.16. The set S ⊆ X is disconnected if there exist nonempty sets
$S_1$ and $S_2$ such that $S_1 \cup S_2 = S$, $\overline{S_1} \cap S_2 = \emptyset$ and $\overline{S_2} \cap S_1 = \emptyset$. If S is not
disconnected, then S is said to be connected.

It is worth pointing out that the above S1 and S2 satisfy a property that is
stronger than being disjoint. Not only must S1 and S2 fail to contain any point
of the other set, they must also fail to contain any cluster point of the other
set. Note that this extra stipulation in Definition 1.16 is significant because
any set containing two or more points can be written as the union of two
nonempty disjoint subsets – for x ∈ S, let S1 = {x} and S2 = S \ {x}. But as
we see in the next theorem, when we are dealing with open sets, disjointness
is sufficient to imply that their union is disconnected.

Theorem 1.17. Let S ⊆ X. Then S is disconnected if and only if there are
disjoint open sets A and B such that S ⊆ A ∪ B, A ∩ S ≠ ∅, and B ∩ S ≠ ∅.
Proof. Suppose that A and B exist as in the statement of Theorem 1.17 and
let $S_1 = A \cap S$ and $S_2 = B \cap S$. Clearly, $S_1$ and $S_2$ are nonempty, and
$S_1 \cup S_2 = S$. Consider an arbitrary point $x \in S_1$; then x is in the open set A,
so for some r > 0, $B_r(x) \subseteq A$. Thus $B_r(x) \cap B = \emptyset$ because $A \cap B = \emptyset$, so
$B_r(x) \cap S_2 = \emptyset$. Thus $x \notin \overline{S_2}$; hence $S_1 \cap \overline{S_2} = \emptyset$. Similarly, $\overline{S_1} \cap S_2 = \emptyset$, and
we conclude that S is disconnected.
Now assume that S is disconnected, say, $S = S_1 \cup S_2$ and $S_1 \cap \overline{S_2} =
\overline{S_1} \cap S_2 = \emptyset$. Then no point of $S_1$ is in $S_2'$. For each $x \in S_1$, choose a radius r
(depending on x) such that $B_r(x) \cap S_2 = \emptyset$. Thus $A = \cup_{x \in S_1} B_{r/2}(x)$ is an open set
that contains $S_1$ and no point of $S_2$. Similarly, we can take a ball $B_s(y)$ for each $y \in S_2$ that satisfies
$B_s(y) \cap S_1 = \emptyset$ and define $B = \cup_{y \in S_2} B_{s/2}(y)$, which is open. It remains only
to show that $A \cap B = \emptyset$. Suppose not, and let $p \in A \cap B$. Then for some $x \in S_1$
and $y \in S_2$, $p \in B_{r/2}(x) \cap B_{s/2}(y)$. But this implies that
\[
d(x, y) \le d(x, p) + d(p, y) < r/2 + s/2 \le \max\{r, s\}.
\]
Therefore either $y \in B_r(x)$ or $x \in B_s(y)$, which contradicts the choice of
either r or s, respectively.

Corollary 1.18. Let S ⊆ X. If S is connected, then $\overline{S}$ is connected.

Proof. Assume that $\overline{S}$ is disconnected, and let A and B be disjoint open
sets as in Theorem 1.17: $\overline{S} \subseteq A \cup B$, $A \cap \overline{S} \ne \emptyset$, and $B \cap \overline{S} \ne \emptyset$. Since S is
obviously contained in A ∪ B, we need to prove only that A ∩ S ≠ ∅ and
B ∩ S ≠ ∅. Let $x \in \overline{S} \cap A$. Since A is open, there is a ball about x such
that $B_r(x) \subseteq A$. But $x \in \overline{S}$ too, so $B_r(x)$ must contain some point y of S.
Therefore $y \in B_r(x) \cap S \subseteq A \cap S$, so A ∩ S ≠ ∅. Similarly, B ∩ S ≠ ∅, so by
Theorem 1.17, S is disconnected. This contradicts the assumption that S is
connected.

1.2 Completeness
To discuss the completeness of a metric space, we start with the definition of
the convergence of a point sequence in the metric space.
A point sequence is a function from $\mathbb{N}$ into a metric space (X, d); that is,
it is a sequence whose terms are points in X. Since our principal examples of
metric spaces are the Euclidean spaces and we use subscripts to designate the
coordinates of points in $\mathbb{R}^n$, we use superscripts to index the terms of a point
sequence: $(x^{(k)})_{k=1}^\infty$ represents a point sequence in a metric space.

Definition 1.19. Let (X, d) be a metric space. The point sequence $(x^{(k)})_{k=1}^\infty$
in X is said to converge to the point x ∈ X, denoted $\lim_{k \to \infty} x^{(k)} = x$,
provided for every $\epsilon > 0$ there is an $N \in \mathbb{N}$ such that $d(x^{(k)}, x) < \epsilon$ whenever
k > N.
Our first task is to establish a connection between convergent point
sequences and cluster points.

Theorem 1.20. Let (X, d) be a metric space and A ⊆ X. Then $x \in A'$ if and
only if there is a non-repeating point sequence in A that converges to x.

Proof. Suppose $x \in A'$ and $r_1 = 1$. Then there exists a point $x^{(1)}$ in $B_{r_1}(x) \cap
(A \setminus \{x\})$. Let $r_2 = \min\{1/2, d(x^{(1)}, x)\}$, and choose a point $x^{(2)}$ in $(A \setminus
\{x\}) \cap B_{r_2}(x)$. After $x^{(1)}, ..., x^{(k)}$ have been defined, let $r_{k+1} = \min\{1/(k +
1), d(x^{(k)}, x)\}$ and choose $x^{(k+1)}$ as a point of $A \setminus \{x\}$ in the open ball $B_{r_{k+1}}(x)$.
Since $r_k \le 1/k$, it is clear that $\lim_{k \to \infty} x^{(k)} = x$; and since $d(x^{(k)}, x) < r_k \le d(x^{(k-1)}, x)$,
the distances to x strictly decrease, so no two $x^{(k)}$'s are the same.
To prove the converse, we simply observe that if A contains such a point
sequence that converges to x, then it is obvious from the definition that every
open ball about x contains infinitely many points of A.
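
The inductive choice of radii in the proof above can be traced in a small Python sketch. It is an illustration under our own assumptions: we take A = (0, 1) in R with cluster point x = 0, and the hypothetical selection rule `pick(r) = r / 2` stands in for "choose a point of A \ {x} in B_r(x)". Exact rational arithmetic avoids rounding effects.

```python
from fractions import Fraction


def nonrepeating_sequence(pick, dist, x, terms):
    """Build x^(1), ..., x^(terms) as in the proof of Theorem 1.20.

    pick(r) must return a point of A minus {x} that lies in the open ball B_r(x);
    dist is the distance function of the metric space.
    """
    seq = []
    r = Fraction(1)  # r_1 = 1
    for k in range(1, terms + 1):
        p = pick(r)  # x^(k) in B_{r_k}(x), with x^(k) != x
        seq.append(p)
        # r_{k+1} = min{1/(k+1), d(x^(k), x)}
        r = min(Fraction(1, k + 1), dist(p, x))
    return seq


if __name__ == "__main__":
    # A = (0, 1), x = 0: the point r/2 lies in A and in B_r(0) for 0 < r <= 1.
    seq = nonrepeating_sequence(
        pick=lambda r: r / 2,
        dist=lambda p, q: abs(p - q),
        x=Fraction(0),
        terms=6,
    )
    print(seq)
```

With this particular selection rule the terms are 1/2, 1/4, 1/8, ..., so they are pairwise distinct and converge to 0, as the proof predicts.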

Example 1.21. In $\mathbb{R}^n$ there is a very strong connection between convergent
point sequences and convergent number sequences. More precisely, $x^{(k)} \to x$
in $\mathbb{R}^n$ if and only if $\lim_{k \to \infty} x_j^{(k)} = x_j$ for each j = 1, ..., n. This is because
$x^{(k)} \in B_\epsilon(x)$ if and only if
\[
d_n(x^{(k)}, x) = \Big( \sum_{j=1}^n |x_j^{(k)} - x_j|^2 \Big)^{1/2} < \epsilon.
\]
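
The equivalence in Example 1.21 rests on the elementary two-sided estimate max_j |x_j^(k) - x_j| ≤ d_n(x^(k), x) ≤ √n · max_j |x_j^(k) - x_j|, which is a standard fact that we add here for illustration. The Python sketch below checks it on a sample sequence in R^3; the sequence `term(k)` and its limit are our own hypothetical choices, not taken from the notes.

```python
import math


def d_n(x, y):
    """Euclidean distance between two real tuples of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def max_coordinate_error(x, y):
    """Largest coordinatewise distance max_j |x_j - y_j|."""
    return max(abs(a - b) for a, b in zip(x, y))


def term(k):
    """k-th term of a sample sequence in R^3 (an assumption for illustration)."""
    return (1.0 / k, 2.0 + (-1) ** k / k, 3.0 - 1.0 / k ** 2)


LIMIT = (0.0, 2.0, 3.0)

if __name__ == "__main__":
    for k in (1, 10, 100, 1000):
        m = max_coordinate_error(term(k), LIMIT)
        d = d_n(term(k), LIMIT)
        # Both columns shrink together, as the two-sided estimate requires.
        print(k, m, d)
```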

Definition 1.22. Suppose (X, d) is a metric space. A point sequence $(x^{(k)})_{k=1}^\infty$
in X is a Cauchy sequence provided for every $\epsilon > 0$ there is an N > 0 such
that $d(x^{(k)}, x^{(m)}) < \epsilon$ whenever k > m > N.

This definition is obviously a generalization of the concept of Cauchy
number sequences treated in Chapter 1.

Example 1.23. A point sequence in $\mathbb{R}^n$ is a Cauchy sequence if and only if
it converges to a point in $\mathbb{R}^n$. To verify this, suppose $(x^{(k)})_{k=1}^\infty$ is a Cauchy
sequence. Since $d_n(x^{(k)}, x^{(m)}) \ge |x_j^{(k)} - x_j^{(m)}|$ for each j = 1, ..., n, it follows
that $(x_j^{(k)})_{k=1}^\infty$ forms a Cauchy number sequence. Thus such a sequence
converges to a point in $\mathbb{R}$, say, $\lim_{k \to \infty} x_j^{(k)} = x_j$. Now Example 1.21 ensures
that $(x^{(k)})_{k=1}^\infty$ converges to $x = (x_1, ..., x_n)$ in $\mathbb{R}^n$. The converse assertion is
true in any metric space, and therefore we give it as the next theorem.

Theorem 1.24. Suppose (X, d) is a metric space.
(i) If $(x^{(k)})_{k=1}^\infty$ in X is convergent, then $(x^{(k)})_{k=1}^\infty$ is a Cauchy sequence.
(ii) If $(x^{(k)})_{k=1}^\infty$ in X is a Cauchy sequence, then it is bounded.
(iii) If x ∈ X and $Y = X \setminus \{x\}$, then any sequence $(x^{(k)})_{k=1}^\infty$ in Y with
$\lim_{k \to \infty} d(x^{(k)}, x) = 0$ is a Cauchy sequence but not a convergent sequence in Y with
respect to d.
Proof. (i) Assume that $(x^{(k)})_{k=1}^\infty$ converges to x, and suppose $\epsilon > 0$. Choose
N > 0 so that $d(x^{(k)}, x) < \epsilon/2$ whenever k > N. This implies that
\[
d(x^{(k)}, x^{(m)}) \le d(x^{(k)}, x) + d(x^{(m)}, x) < \epsilon,
\]
whenever k > m > N. Therefore, $(x^{(k)})_{k=1}^\infty$ is a Cauchy sequence in X.


(ii) Let $(x^{(k)})_{k=1}^\infty$ be a Cauchy sequence. By definition, we find an N > 0
such that $d(x^{(k)}, x^{(m)}) < 1$ whenever k > m > N. In particular, if m = N + 1
this property becomes $d(x^{(k)}, x^{(N+1)}) < 1$ whenever k > N. Therefore all the
points $x^{(k)}$ for k > N lie in the open ball $B_1(x^{(N+1)})$. Now we simply enlarge
the radius of the ball until it also includes the first N points of the sequence.
Define
\[
r = 1 + \max\{1, d(x^{(1)}, x^{(N+1)}), ..., d(x^{(N)}, x^{(N+1)})\}.
\]
Thus r is at least one unit more than the distance between $x^{(N+1)}$ and $x^{(k)}$
for every k = 1, 2, ...; so $B_r(x^{(N+1)})$ contains every point $x^{(k)}$ of the sequence.
(iii) Suppose now x ∈ X is the limit of $(x^{(k)})_{k=1}^\infty$ under d. Then $(x^{(k)})_{k=1}^\infty$
is a sequence in both X and Y, and it converges in X; so by (i) it is a Cauchy
sequence in X. The distance function d on Y × Y is the same one as on X × X,
so $(x^{(k)})_{k=1}^\infty$ is also a Cauchy sequence in Y. But in Y there is no point to
which $(x^{(k)})_{k=1}^\infty$ can converge, because x has been removed and the limit of a
sequence is unique.
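
The radius chosen in part (ii) of the proof can be computed explicitly for a concrete sequence. The following Python sketch is an illustration under our own assumptions: the sample sequence x^(k) = 10/k in R and the index N = 10 (for which |x^(k) - x^(m)| < 1 whenever k > m > N) are hypothetical choices, and `bounding_radius` is our own helper name.

```python
def bounding_radius(seq, N, dist):
    """Center x^(N+1) and radius r = 1 + max{1, d(x^(1), x^(N+1)), ..., d(x^(N), x^(N+1))}.

    seq lists x^(1), x^(2), ... (so seq[i] is x^(i+1)); N is the index from the
    Cauchy condition with epsilon = 1, as in the proof of Theorem 1.24 (ii).
    """
    center = seq[N]  # this is x^(N+1)
    r = 1 + max([1] + [dist(seq[i], center) for i in range(N)])
    return center, r


if __name__ == "__main__":
    seq = [10.0 / k for k in range(1, 201)]  # sample Cauchy sequence in R
    center, r = bounding_radius(seq, 10, lambda p, q: abs(p - q))
    # Every listed term should lie in the open ball B_r(center).
    print(center, r, all(abs(p - center) < r for p in seq))
```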
Accordingly, we see from Theorem 1.24 that Example 1.23 cannot be
generalized to all metric spaces. This idea is the general principle behind the
following example.
Example 1.25.
(i) Let $X = B_1(0)$ be the open unit ball in $\mathbb{R}^2$ with the usual Euclidean
distance. Consider a sequence in $B_1(0)$ that converges to a point on its boundary,
say, $x^{(k)} = (1 - 1/k, 0)$, so that it has cluster point (1, 0). Then this sequence
is a Cauchy sequence in X, but it has no limit in X, so it does not converge
in X.
(ii) Again for $n \in \mathbb{N}$ let $\mathbb{Q}^n$ be the subset of $\mathbb{R}^n$ consisting of all points
for which all coordinates are rational numbers. The point $x = (\sqrt{2}, 0, ..., 0)$
is in $\mathbb{R}^n \setminus \mathbb{Q}^n$, but if $(r_k)_{k=1}^\infty$ is a sequence of rational numbers such that
$\lim_{k \to \infty} r_k = \sqrt{2}$, and $x^{(k)} = (r_k, 0, ..., 0)$, then $\lim_{k \to \infty} x^{(k)} = x$. Therefore
$(x^{(k)})_{k=1}^\infty$ is a Cauchy sequence but does not converge in $\mathbb{Q}^n$.
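
A concrete rational sequence of the kind used in Example 1.25 (ii) can be generated with exact arithmetic. The sketch below uses the Newton (Babylonian) iteration r_{k+1} = (r_k + 2/r_k)/2 starting from r_1 = 2; this particular choice of sequence is our own assumption, since the notes only require some rational sequence converging to √2.

```python
from fractions import Fraction


def rational_sqrt2_sequence(terms):
    """First `terms` iterates of r -> (r + 2/r) / 2 starting at r = 2, as exact fractions."""
    r = Fraction(2)
    out = []
    for _ in range(terms):
        out.append(r)
        r = (r + 2 / r) / 2  # stays rational: sums and quotients of rationals
    return out


if __name__ == "__main__":
    seq = rational_sqrt2_sequence(6)
    for r in seq:
        # r*r - 2 shrinks rapidly but never vanishes, since sqrt(2) is irrational.
        print(r, float(r * r - 2))
```

Consecutive terms get arbitrarily close to each other, which reflects the Cauchy property, while no term squares to 2, so the only candidate limit lies outside Q.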

Naturally, we can give the definition of the completeness of a metric space.


Definition 1.26. Suppose (X, d) is a metric space. S ⊆ X is said to be
complete with respect to d provided every Cauchy sequence in S converges to a
point in S with respect to d. In particular, if X is complete with respect to d,
then we say that (X, d) is a complete metric space.
Theorem 1.27. Let (X, d) be a complete metric space. Then S ⊆ X is
complete with respect to d if and only if S is a closed subset of X.
Proof. In fact, suppose that S is closed and $(x_k)_{k=1}^\infty$ is a Cauchy sequence in
S. Since X is complete, there is an x ∈ X such that $x_k \to x$. But because S
is closed, one gets x ∈ S. Conversely, if S is complete, then for any sequence
$(x_k)_{k=1}^\infty$ in S converging to x ∈ X, one has that this sequence is a Cauchy sequence
in S and, therefore, converges to a point y ∈ S. So x = y ∈ S and S is closed.
In the terminology of Definition 1.26, $\mathbb{R}^n$ is a complete metric space. The
next result gives four alternative properties that are equivalent to
completeness in $\mathbb{R}^n$. Each is an important result in its own right, and collectively they
are the most important tools of analysis. They are immediately recognizable
as multidimensional extensions of the four corresponding theorems for the case $\mathbb{R}^1$.
Theorem 1.28. For $n \in \mathbb{N}$ let $X = \mathbb{R}^n$ be equipped with the distance $d_n(\cdot, \cdot)$.
Then the following statements are equivalent:
(i) X is complete.
(ii) Heine-Borel property: If F is a closed and bounded subset of X and
$\{A_\mu\}_{\mu \in I}$ is a collection of open sets whose union contains F, then there is
a finite subcollection $\{A_{\mu_j}\}_{j=1}^m$ such that $F \subseteq \cup_{j=1}^m A_{\mu_j}$.
(iii) Bolzano-Weierstrass property: If S is an infinite bounded subset of X,
then $S' \ne \emptyset$.
(iv) Bounded sequence property: If $(x^{(k)})_{k=1}^\infty$ is a bounded sequence in X, then
it has a convergent subsequence.
(v) Nested set property: If $(F_k)_{k=1}^\infty$ is a sequence of closed and bounded
nonempty sets such that $F_{k+1} \subseteq F_k \subseteq X$ for each $k \in \mathbb{N}$, then $\cap_{k=1}^\infty F_k \ne \emptyset$.

Proof. Since statements (ii)-(v) refer to bounded sets, it will be more
convenient to use another form of boundedness; that is, a set S is bounded in
X if and only if it is contained in an n-cube Q with side-length 2r > 0, i.e.,
$S \subseteq Q = \{x \in \mathbb{R}^n : |x_j| \le r \text{ for } j = 1, ..., n\}$.
(i)⇒(ii) Let F be a closed and bounded set and suppose that
$\{A_\mu\}_{\mu \in I}$ is an open cover of F as in (ii). Since F is bounded, there is
an n-cube $Q_1$ containing F. Assume that F cannot be covered by any finite
number of the $A_\mu$'s. Subdivide $Q_1$ into $2^n$ n-cubes by halving each of the
coordinate intervals of $Q_1$; that is, the jth coordinate of a point x in one of
the n-cubes is restricted to either [0, r] or [−r, 0] instead of [−r, r]. If it were
possible to cover those parts of F that lie in each one of the $2^n$ n-cubes with
finitely many $A_\mu$'s, then the collection of all of these $A_\mu$'s would still be
finite and would cover F. Therefore one of these $2^n$ n-cubes must
contain a subset of F that cannot be covered by any finite number of the
$A_\mu$'s. Call this n-cube $Q_2$. Now subdivide $Q_2$ into $2^n$ n-cubes and repeat the
process. The result is a sequence of n-cubes such that $Q_{k+1} \subseteq Q_k$, and the
jth coordinate of each point in $Q_k$ is restricted to an interval whose length is
$r 2^{2-k}$. Choose a point sequence $(x^{(k)})_{k=1}^\infty$ such that $x^{(k)}$ is in $F \cap Q_k$. Then
$(x_j^{(k)})_{k=1}^\infty$, the sequence of jth coordinates, forms a Cauchy number sequence
in $\mathbb{R}$ thanks to
\[
|x_j^{(k)} - x_j^{(m)}| \le r(2^{2-k} + \cdots + 2^{2-m}) < r 2^{3-m} \to 0 \quad \text{as } k > m \to \infty.
\]
Consequently, $(x^{(k)})_{k=1}^\infty$ is a Cauchy sequence in $\mathbb{R}^n$. Assuming (i), we
conclude that there is a point $x \in \mathbb{R}^n$ such that $\lim_{k \to \infty} x^{(k)} = x$. Since each $x^{(k)}$
is in F and F is closed, we have x ∈ F. Therefore x is in one of the $A_\mu$'s,
say, $x \in A_{\mu_x}$. But $A_{\mu_x}$ is open, so for some s > 0, $x \in B_s(x) \subseteq A_{\mu_x}$, which
means that $B_s(x)$ contains all but a finite number of the n-cubes $Q_k$. But
this implies that the points of F in all these $Q_k$'s are contained in $A_{\mu_x}$, and
this contradicts the choice of $Q_k$.
(ii)⇒(iii) Suppose that S is a bounded set which has no cluster point. We
show that S must be finite, which establishes (iii). Let B be a closed and
bounded ball containing S. Since no point of B is a cluster point of S, we can
choose an open ball $B_{r_x}(x)$ for each x ∈ B such that $B_{r_x}(x)$ contains no point
of S except possibly x itself. Now $B \subseteq \cup_{x \in B} B_{r_x}(x)$, so $\{B_{r_x}(x) : x \in B\}$
is a collection of open sets whose union contains B. Therefore from (ii) we
conclude that there is a finite subcollection of these open balls that covers B.
But this subcollection must also cover S due to S ⊆ B, and since each open
ball contains at most one point of S, we conclude that S is finite.
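(iii)⇒(iv) Assume that $(x^{(k)})_{k=1}^\infty$ is a bounded point sequence, and let S be
the range of this sequence; that is, $S = \{p \in \mathbb{R}^n : p = x^{(k)} \text{ for some } k\}$. If S
is a finite set, then at least one of its points must appear infinitely many times
as a term of the sequence, in which case $(x^{(k)})_{k=1}^\infty$ has a constant subsequence.
If S has infinitely many points, then (iii) implies that S has a cluster point,
and in proving Theorem 1.20 we have seen how to construct a sequence in S
that converges to that cluster point. Since every open ball about the cluster point
contains infinitely many points of S, the terms can be chosen with strictly
increasing indices, which gives a convergent subsequence of $(x^{(k)})_{k=1}^\infty$.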
(iv)⇒(v) Let $(F_k)_{k=1}^\infty$ be a nested sequence of closed and bounded sets as
in the hypothesis of (v). For each k, choose a point $x^{(k)}$ in $F_k$. Every such
point is in $F_1$, so $(x^{(k)})_{k=1}^\infty$ is a bounded sequence. By (iv), $(x^{(k)})_{k=1}^\infty$ has a
convergent subsequence whose limit we here call x. We see that the nested
property of the sets guarantees that each $F_k$ contains all but possibly a finite
number of the points $x^{(k)}$. Therefore for each k there is a sequence of points
in $F_k$ that converges to x. Since they are closed sets, x must belong to each
$F_k$. Hence $\cap_{k=1}^\infty F_k$ contains x, so the intersection is nonempty.
(v)⇒(i) Let $(x^{(k)})_{k=1}^\infty$ be a Cauchy sequence, and for each m let $F_m$ be the
closure of the set of points $\{x^{(k)} : k \ge m\}$. Thus $(F_m)_{m=1}^\infty$ is a nested sequence
of closed sets. By Theorem 1.24 (ii) we see that $(x^{(k)})_{k=1}^\infty$ is bounded, and
therefore $F_1$ is bounded. By (v) there is a point x in all the sets $F_m$. Let $B_\epsilon(x)$
be any open ball about x, and choose a number N such that k > m > N
implies that $d_n(x^{(k)}, x^{(m)}) < \epsilon/2$. Since x is in $F_{N+1}$, which is the closure of the
set $\{x^{(k)} : k \ge N + 1\}$, there is some point $x^{(m)}$, where m > N, such that
$x^{(m)}$ is in $B_{\epsilon/2}(x)$. Now if k > N, then
\[
d_n(x^{(k)}, x) \le d_n(x^{(k)}, x^{(m)}) + d_n(x^{(m)}, x) < \epsilon.
\]
Hence $\lim_{k \to \infty} x^{(k)} = x$.


Remark 1.29.
(i) In examining the proof of Theorem 1.28, we find that the Euclidean dis-
tance formula was used only in proving that (i) implies (ii). Therefore we
conclude that the other four implications hold in any metric space.
(ii) For $x \in \mathbb{R}^n$ let $X = \mathbb{R}^n \setminus \{x\}$ and the distance between points in X
be the same Euclidean distance as in $\mathbb{R}^n$. Now if $(x^{(k)})_{k=1}^\infty$ is a sequence that
converges to x in $\mathbb{R}^n$ and $x^{(k)} \ne x$, then it is still a Cauchy sequence in X. But
because there is no point in X to which it converges, $(x^{(k)})$ is not convergent
in X. Thus X is not complete. Since completeness is implied by the other four
statements, it follows that all five are false.
(iii) Of course, the fact that all five statements in Theorem 1.28 are true
in $X = \mathbb{R}^n$ is quite dependent upon the distance formula. The use of the
distance formula in proving that (i) implies (ii) is very subtle. It is inherent
in the notions of boundedness and n-cubes that were used.

Example 1.30. Let X be an infinite set and define d by



1 , x 6= y
d(x, y) =
0 , x = y.

Since each point in X can be written as {x} = B_{1/2}(x), we see that every
singleton set is an open set in X. Also X = B_2(x) for any x ∈ X, so X is
bounded. Therefore X is a bounded and closed set, and ∪_{x∈X} {x} is an open
cover of X that cannot be reduced to a finite subcover, because X is infinite.
We now assert that X is complete, because it is not hard to show that if
(x^(k))_{k=1}^∞ is a Cauchy sequence, then there is an N such that
x^(k) = x^(N) whenever k ≥ N. Therefore lim_{k→∞} x^(k) = x^(N), so X is
complete even though the Heine-Borel property does not hold.
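
The two ball identities and the "eventually constant" behaviour used in Example 1.30 can be checked on a small sample. The following is only a sketch: a finite set stands in for the infinite set X, and the helper names `discrete_metric`, `open_ball` and `tail_start` are illustrative choices, not notation from these notes.

```python
def discrete_metric(x, y):
    """The metric of Example 1.30: distance 1 between distinct points."""
    return 0 if x == y else 1


def open_ball(center, radius, points):
    """Points of the sample lying strictly within `radius` of `center`."""
    return {p for p in points if discrete_metric(p, center) < radius}


def tail_start(seq):
    """Smallest N such that seq[k] == seq[N] for every k >= N (finite list)."""
    n = len(seq) - 1
    while n > 0 and seq[n - 1] == seq[-1]:
        n -= 1
    return n


# A finite sample standing in for X (assumption made for illustration only).
sample = set(range(10))
singleton = open_ball(3, 0.5, sample)   # B_{1/2}(3) is the singleton {3}
whole = open_ball(3, 2, sample)         # B_2(3) is the whole sample
```

Under this metric two terms of a sequence are closer than 1 exactly when they are equal, which is why a Cauchy sequence must coincide with a constant sequence from some index on; `tail_start` locates that index for a finite list.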

1.3 Compactness, Density and Separability


As an extension of the Heine-Borel property, we discuss the compactness,
density and separability of sets in a given metric space.
The first important notion induced by the Heine-Borel property is that of
compact sets. To see this picture clearly, we first consider two examples that
enable us to describe some important and attractive results in the theory of
functions.

Example 1.31. A set K ⊆ Rn is said to be compact provided every sequence
in K contains a subsequence that converges to a point in K. This notion can
be characterized by the equivalence of the following three statements:
(i) K is compact;
(ii) K is closed and bounded;
(iii) Every open cover of K has a finite subcover; that is, if K ⊆ ∪α∈I Gα
where each Gα is an open subset of Rn , then there is a finite subset J of the
index set I such that K ⊆ ∪α∈J Gα .
In fact, suppose (i) holds. Then every convergent sequence in K converges
to a point in K. Using the fact that a subset E of Rn is closed if and only if
E equals its closure, we get that K is closed. If K is not bounded, then for
each k ∈ N, there is an x_k ∈ K such that d_n(x_k, 0) > k. Consequently, no
subsequence of (x_k)_{k=1}^∞ can converge and so K is not compact, a contradiction.
Thus, (ii) holds.
If (ii) holds, then an application of Theorem 1.28 (i) yields (iii).
Finally, suppose (iii) holds. To verify (i), it suffices to prove that K is
bounded and closed, since Theorem 1.28 (iv) implies that every sequence in
such a K has a subsequence which converges to a point in K. To see that K is
bounded, set G_x = B_1(x). Then {G_x : x ∈ K} is an open cover of K and so
has a finite subcover, say, G_{x_1}, ..., G_{x_k}. Let M = max{d_n(x_j, 0) : j = 1, ..., k}.
If x ∈ K, then x ∈ G_{x_j} for some j and
d_n(x, 0) ≤ d_n(x, x_j) + d_n(x_j, 0) < 1 + M. Thus, K is bounded. To show
that K is closed, fix a point x ∈ Rn \ K. For any y ∈ K, there are disjoint
open balls U_y and V_y centered at y and x, respectively. Note that these two
balls will generally depend on both x and y. The family {U_y : y ∈ K} is an
open cover of K and so has a finite subcover, say, U_{y_1}, ..., U_{y_k}. Then
K ⊆ ∪_{j=1}^k U_{y_j}, which is disjoint from the open ball B = ∩_{j=1}^k V_{y_j}
centered at x. This ball B must be contained entirely in Rn \ K: otherwise,
B ∩ K ≠ ∅ and hence B ∩ (∪_{j=1}^k U_{y_j}) ⊇ B ∩ K ≠ ∅, a contradiction. It
follows that Rn \ K is open and so K is closed.
Example 1.32. The family F = {(1/k, 2) : k ∈ N} is an open cover of (0, 1];
the family G = {(2^{−k}, 2) : k ∈ N} is a subfamily of F. Each point of (0, 1]
belongs to (2^{−k}, 2) for k sufficiently large, and so G is a subcover for (0, 1]. If
H is a finite subfamily of F, then there is a largest value of k, say k_1, such that
(1/k_1, 2) ∈ H. Since 1/k_1 ∈ (0, 1] does not belong to any element of H,
H is not a subcover for (0, 1]. A close look at this example reveals that the
problem lies in the fact that (0, 1] is not closed. Other examples, which the
reader should construct, show that similar problems can arise with unbounded
sets.
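
The failure of finite subfamilies in Example 1.32 can be tested with exact rational arithmetic. This is a sketch only; the function names `covers` and `uncovered_point` are hypothetical, and a finite subfamily of F is represented by the list of its indices k.

```python
from fractions import Fraction


def covers(point, ks):
    """True if `point` lies in some interval (1/k, 2) with k in `ks`."""
    return any(Fraction(1, k) < point < 2 for k in ks)


def uncovered_point(ks):
    """A point of (0, 1] missed by the finite subfamily {(1/k, 2) : k in ks}.

    The left endpoint 1/k_1 of the widest interval (k_1 = max of ks) lies in
    (0, 1] but in none of the chosen intervals.
    """
    return Fraction(1, max(ks))


finite_family = [1, 3, 10]
witness = uncovered_point(finite_family)
```

Adding the single interval with index 11 covers this particular witness, but the enlarged family again misses its own left endpoint 1/11, so no finite subfamily ever suffices.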
Remark 1.33. There is an important result generalizing the nested set
property: if F = {K_j}_{j∈I} is a class of compact subsets of Rn, n ∈ N, and the
intersection of any finite subclass of F is nonempty, then ∩_{j∈I} K_j ≠ ∅. To see
this, suppose on the contrary that ∩_{j∈I} K_j = ∅ and fix K_i ∈ F, so that
K_i ∩ (∩_{j∈I, j≠i} K_j) = ∅. Then K_i ⊆ ∪_{j∈I, j≠i} (Rn \ K_j).
Since K_i is compact, there are finitely many indices j_1, ..., j_k such that

K_i ⊆ ∪_{l=1}^k (Rn \ K_{j_l}) = Rn \ ∩_{l=1}^k K_{j_l},

so K_i ∩ (∩_{l=1}^k K_{j_l}) = ∅, a contradiction. Accordingly, ∩_{j∈I} K_j ≠ ∅.


An advantage of the last characterization of compactness in Example 1.31
is that it remains meaningful when the discussion is extended to topological
spaces much more general than Rn, and thus serves as the basis for a
definition of compactness in these more general spaces.

Definition 1.34. Suppose (X, d) is a metric space. We say that S ⊆ X is
compact provided, whenever S is contained in the union of a collection of
open subsets of X, S is contained in the union of a finite number of these
open subsets. In particular, if S = X then (X, d) is called a compact metric
space.

It is very natural to give an alternate description of this definition in terms
of cluster points.

Theorem 1.35. Let (X, d) be a metric space and S ⊆ X. Then the following
statements are equivalent:
(i) S is compact;
(ii) Any infinite subset of S has at least one cluster point in S;
(iii) Any sequence of points in S has a subsequence which is convergent to a
point in S.

Proof. (i)⇒(ii) Suppose (i) is true. If (ii) is false, then there is an infinite
subset E of S that has no cluster point in S. Now for each x ∈ S, we can find
an open ball having x as center and containing only a finite number of points
of E. Note that S is compact and covered by all such open balls. So S is a
subset of the union of a finite number of such open balls, each of which
contains only a finite number of points of E. This implies that E is finite,
a contradiction.
(ii)⇒(iii) This implication is trivial.
(iii)⇒(i) Suppose (iii) is valid, but (i) is not true. Let ε > 0 and x_1 ∈ S be
given. Choose x_2 ∈ S such that d(x_2, x_1) ≥ ε. Then choose x_3 ∈ S such
that d(x_3, x_k) ≥ ε for k = 1, 2. Continuing this selection, we can find
x_{n+1} ∈ S such that d(x_{n+1}, x_k) ≥ ε for k = 1, 2, ..., n and n ∈ N. This
process can be carried out because S is covered by all open balls of radius ε
centered at points in S, but this cover has no finite subcover thanks to the
assumption that S is not compact. According to (iii), (x_n)_{n=1}^∞ must
possess a subsequence (x_{n_j})_{j=1}^∞ which is convergent to a point in S, so
d(x_{n_j}, x_{n_k}) < ε whenever n_j and n_k are big enough. This contradicts
d(x_{n_j}, x_{n_k}) ≥ ε for j ≠ k.
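
The selection x_1, x_2, ... in the step (iii)⇒(i) is a greedy choice of points that stay at least ε apart. A minimal sketch on a finite sample of the plane is given below; the sample grid, the value of ε and the name `greedy_separated` are assumptions made for illustration, and on a finite sample the selection necessarily stops.

```python
import math


def greedy_separated(points, eps):
    """Scan `points` in order, keeping each point that is at distance >= eps
    from every point kept so far (Euclidean distance via math.dist)."""
    centers = []
    for p in points:
        if all(math.dist(p, c) >= eps for c in centers):
            centers.append(p)
    return centers


# Hypothetical sample: an 11 x 11 grid in the unit square.
grid = [(i / 10, j / 10) for i in range(11) for j in range(11)]
eps = 0.35
centers = greedy_separated(grid, eps)
```

When the selection stops, every sample point lies within ε of some selected center, so the balls of radius ε about the centers cover the sample. This is the finite cover whose absence lets the selection in the proof continue indefinitely.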

Here is an interesting consequence of Theorem 1.35.

Corollary 1.36. Let (X, d) be a complete metric space. Then a closed subset
S of X is compact if and only if S is totally bounded; that is, for any ε > 0 there
exist finitely many open balls {B_ε(x_j)}_{j=1}^n in X such that S ⊆ ∪_{j=1}^n B_ε(x_j).

Proof. Suppose S ⊆ X is closed. If S is compact and ε > 0, then the trivial
inclusion S ⊆ ∪_{x∈S} B_ε(x) induces a finite cover S ⊆ ∪_{j=1}^n B_ε(x_j), as desired.
Conversely, suppose S is totally bounded. Let S be covered by a collection
of open sets {O_α}_{α∈I}. To reach a contradiction, assume S cannot be covered
by any finite sub-collection of {O_α}_{α∈I}. Since S is closed, from the total
boundedness of S it follows that for ε = 1 there are finitely many open balls
{B_1(x_j)}_{j=1}^{n_1} such that S = ∪_{j=1}^{n_1} (S ∩ B̄_1(x_j)), where B̄_1(x_j)
denotes the closed ball. According to the assumption, one of the closed sets in
{S ∩ B̄_1(x_j)}_{j=1}^{n_1}, say, S_1 = S ∩ B̄_1(x_1), cannot be covered
by finitely many elements of {O_α}_{α∈I}. Do the same with ε = 1/2 and S_1 in
place of ε = 1 and S respectively, and continue this process. The result is a
sequence of closed sets (S_k)_{k=1}^∞ such that:
(a) S ⊃ S_1 ⊃ S_2 ⊃ · · ·;
(b) sup_{x,y∈S_k} d(x, y) ≤ 2k^{−1};
(c) each S_k cannot be covered by finitely many elements of {O_α}_{α∈I}.
Choose x_k ∈ S_k. By (a), (b) and the fact that X is complete and each S_k
is closed, (x_k)_{k=1}^∞ is a Cauchy sequence which is convergent to an element
x ∈ ∩_{k=1}^∞ S_k. Now x ∈ O_α for some α ∈ I; since O_α is open and the
diameters in (b) tend to 0, this implies S_k ⊂ O_α for sufficiently
large k ∈ N, contradicting (c). Therefore, S must be covered by finitely many
elements of {O_α}_{α∈I}, i.e., S is compact.

One example of a compact metric space will give us many more, by means
of the following result.

Theorem 1.37. Let (X, d) be a compact metric space. Then X is complete
with respect to d. Furthermore:
(i) Any closed subset S of X is compact;
(ii) For any sequence of nonempty closed subsets (S_j)_{j=1}^∞ of X, with S_1 ⊇
S_2 ⊇ S_3 ⊇ ..., there is at least one point in ∩_{j∈N} S_j.

Proof. From Theorem 1.35 (iii) it follows that any Cauchy sequence has a
convergent subsequence and therefore is itself convergent. Thus, (X, d) is com-
plete.
(i) Let S be closed. Then S^c is open. If S ⊆ ∪_{α∈I} U_α where U_α ⊆ X is
open for each α ∈ I, then X ⊆ (∪_{α∈I} U_α) ∪ S^c. Since X is compact, we can
find a finite subset J ⊆ I such that X ⊆ (∪_{α∈J} U_α) ∪ S^c. Hence S ⊆ ∪_{α∈J} U_α.
This shows that S is compact.
(ii) If ∩_{j∈N} S_j = ∅, then X = ∅^c = ∪_{j∈N} S_j^c. Since X is compact, there is a
finite number of the open subsets {S_{i_j}^c}_{j=1}^J such that X = ∪_{j=1}^J S_{i_j}^c. Note that
since S_1^c ⊆ S_2^c ⊆ S_3^c ⊆ ..., we must have X = S_k^c for some k, which produces
the contradiction S_k = X^c = ∅.

Remark 1.38. Here, we point out that Theorem 1.37 (ii) does not hold if the
word “compact” is replaced by “complete” in the hypothesis – for example,
X = R and Sk = {x ∈ R : x ≥ k}, k ∈ N.

In addition to this counterexample, we have the following interesting
consequence.

Corollary 1.39. Suppose (X, d) is a metric space. If S ⊆ X is compact, then
it is bounded and closed, but not conversely.

Proof. Since X is the union of its open balls, if S ⊆ X is compact then S is
covered by a finite number of those balls and hence S is bounded. To establish
the closedness of S, we note that (S, d) is a compact metric space and so it is
complete, by Theorem 1.37. Accordingly, S is closed.
For the converse, consider the distance d defined by d(x, y) = 1 if x ≠ y and
d(x, y) = 0 if x = y. Clearly, any infinite set X is bounded and closed under
d, but X is not compact since the open cover X = ∪_{x∈X} B_{1/2}(x) has no
finite subcover.

Next we deal with the density and the separability of metric spaces.

Definition 1.40. Suppose (X, d) is a metric space.


(i) D ⊆ X is said to be dense in X provided every open ball in X contains a
point of D.
(ii) X is called separable provided there is a countable set S = {x1 , x2 , · · ·} ⊆ X
that is dense in X.

To give an example, we turn immediately to Rn, C[a, b] and ℓ∞.

Example 1.41.
(i) For n ∈ N, Qn is a countable dense subset of Rn . This fact can be deduced
from the countability of Q. To show that Qn is dense in Rn, let B_ε(x) be any
open ball of radius ε > 0 and center x = (x_1, ..., x_n) ∈ Rn and use the density
of Q in R to choose rational numbers r_1, ..., r_n such that

|x_j − r_j| < ε/√n for j = 1, ..., n.

Then
\[
d_n(x, r) = \Big( \sum_{j=1}^{n} |x_j - r_j|^2 \Big)^{1/2} < \varepsilon,
\]
and so r = (r_1, ..., r_n) ∈ Qn ∩ B_ε(x).


(ii) Let ℓ2 and ℓ∞ be the spaces of real-valued sequences (x_k)_{k=1}^∞ with
Σ_{j=1}^∞ x_j^2 < ∞ and sup_{k∈N} |x_k| < ∞, respectively. For two sequences
x = (x_j)_{j=1}^∞ and y = (y_j)_{j=1}^∞, define
\[
d_p(x, y) = \begin{cases}
\Big( \sum_{j=1}^{\infty} |x_j - y_j|^2 \Big)^{1/2}, & p = 2, \\
\sup_{1 \le j < \infty} |x_j - y_j|, & p = \infty.
\end{cases}
\]
Then (ℓ2, d_2) and (ℓ∞, d_∞) are metric spaces, but the former is separable
and the latter is not separable. We verify the two assertions about
separability. Let S be the set of all sequences of rational numbers that are
eventually 0. If ε > 0 is given, then for any x = (x_j)_{j=1}^∞ ∈ ℓ2 there is
some N ∈ N such that Σ_{j=N}^∞ |x_j|^2 < (ε/2)^2; at the same time, for each
j ∈ {1, ..., N} there is an r_j ∈ Q such that |r_j − x_j| < ε/(2√N).
Accordingly, the sequence r = (r_1, ..., r_{N−1}, 0, 0, 0, ...) ∈ S satisfies
\[
d_2(x, r) \le \Big( \sum_{j=1}^{N-1} |x_j - r_j|^2 \Big)^{1/2}
+ \Big( \sum_{j=N}^{\infty} x_j^2 \Big)^{1/2} < \varepsilon.
\]
This implies that S is dense in ℓ2. As for the latter, given n ∈ N let
x^(n) = (x_1^(n), x_2^(n), · · · , x_j^(n), · · ·) be an element in ℓ∞. Define
x = (x_1, · · · , x_j, · · ·), where
\[
x_j = \begin{cases}
x_j^{(j)} + 1, & |x_j^{(j)}| \le 1, \\
0, & |x_j^{(j)}| > 1.
\end{cases}
\]
Thus x ∈ ℓ∞. Moreover,

d_∞(x, x^(n)) ≥ |x_n − x_n^(n)| ≥ 1, n ∈ N.

This shows that the set {x^(n)}_{n=1}^∞ cannot be dense in ℓ∞.


(iii) For any finite closed interval [a, b] ⊂ R, the set C[a, b] of all
real-valued continuous functions on [a, b] is a metric space under the
sup-distance:
\[
d(f, g) = \sup_{x \in [a,b]} |f(x) - g(x)| < \infty \quad \text{for } f, g \in C[a, b].
\]
Moreover, it is separable. In fact, let S be the set of piecewise linear
functions of the form
\[
f(t) = s_k + \frac{s_{k+1} - s_k}{t_{k+1} - t_k}\,(t - t_k), \qquad t_k \le t \le t_{k+1},
\]
where a = t_0 < t_1 < · · · < t_n = b is a partition of [a, b], and the s_k, t_k are
rational (with the possible exceptions of t_0 = a and t_n = b). The set S is
countable. Suppose g ∈ C[a, b]. For any ε > 0, there is a δ > 0 such that

|t − s| < δ ⇒ |g(t) − g(s)| < ε/4.

Let a = t_0 < t_1 < · · · < t_n = b be a partition of [a, b] with t_1, · · · , t_{n−1} rational
and such that max_{0≤k≤n−1} (t_{k+1} − t_k) < δ. Let s_0, · · · , s_n be rational numbers
such that max_{0≤k≤n} |g(t_k) − s_k| < ε/4. Then for t_k ≤ t ≤ t_{k+1} we have
\[
f(t) - g(t) = \frac{t_{k+1} - t}{t_{k+1} - t_k}\,\big(s_k - g(t)\big)
+ \frac{t - t_k}{t_{k+1} - t_k}\,\big(s_{k+1} - g(t)\big),
\]
and hence

|f(t) − g(t)|
≤ |s_k − g(t_k)| + |g(t_k) − g(t)| + |s_{k+1} − g(t_{k+1})| + |g(t_{k+1}) − g(t)| < ε.

Thus S is dense in C[a, b].
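
The construction in Example 1.41 (iii) can be tried numerically. The sketch below assumes g = sin on [0, 3] (a 1-Lipschitz function, so δ = ε/4 works), a uniform rational partition, and node values rounded to multiples of 1/q; the function name `rational_interpolant` and the parameters `n` and `q` are illustrative choices, and the sup-distance is only estimated on a finite grid.

```python
import math
from fractions import Fraction


def rational_interpolant(g, a, b, n, q=1000):
    """Piecewise linear f with nodes t_k = a + k(b - a)/n and rational
    node values s_k = round(q * g(t_k)) / q, so |s_k - g(t_k)| <= 1/(2q)."""
    a, b = Fraction(a), Fraction(b)
    ts = [a + (b - a) * Fraction(k, n) for k in range(n + 1)]
    ss = [Fraction(round(q * g(float(t))), q) for t in ts]

    def f(t):
        t = Fraction(t)
        # index k of a subinterval [t_k, t_{k+1}] containing t
        k = min(int((t - a) / (b - a) * n), n - 1)
        slope = (ss[k + 1] - ss[k]) / (ts[k + 1] - ts[k])
        return ss[k] + slope * (t - ts[k])

    return f


eps = 0.1
f = rational_interpolant(math.sin, 0, 3, n=200)       # mesh 0.015 < eps / 4
grid = [Fraction(3 * i, 1000) for i in range(1001)]   # test points in [0, 3]
max_err = max(abs(float(f(t)) - math.sin(float(t))) for t in grid)
```

Here `max_err` only approximates d(f, g), since the supremum is taken over a grid rather than over all of [0, 3]; the estimate in the text guarantees the true sup-distance is below ε.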

Motivated by the foregoing example, we can establish a relationship
between compactness and separability.

Theorem 1.42. Let (X, d) be a metric space. If X is compact, then it is
separable.

Proof. Suppose X is compact and n ∈ N. Since X = ∪_{x∈X} B_{1/n}(x), it follows
that there are an N_n ∈ N and points x_1^(n), ..., x_{N_n}^(n) ∈ X such that
X = ∪_{j=1}^{N_n} B_{1/n}(x_j^(n)). Now, the countable set
S = ∪_{n=1}^∞ ∪_{j=1}^{N_n} {x_j^(n)} is dense in X. As a matter of fact, if x ∈ X
and ε > 0, then there exists an n ∈ N such that n^{−1} < ε, and
x ∈ B_{1/n}(x_k^(n)) for some k ∈ {1, ..., N_n}. Hence
x_k^(n) ∈ B_{1/n}(x) ⊆ B_ε(x), i.e., B_ε(x) ∩ S ≠ ∅.
Therefore, X is separable.

The next result is Lebesgue's property, which is of great generality and
combines the concepts of open sets, density and separability. This theorem
should be compared to the Heine-Borel property, for each statement makes
an assertion about reducing an open covering of a set.

Theorem 1.43. For n ∈ N let X = Rn be equipped with the distance dn (·, ·).
(i) There is a countable collection B of open balls in X such that any open
subset of X can be expressed as the union of some subcollection of B.
(ii) If S is any subset of X and {A_µ}_{µ∈I} is a collection of open sets whose
union contains S, then there is a countable subcollection {A_{µ_k}}_{k=1}^∞ whose
union contains S.

Proof. (i) As a matter of fact, if B stands for the collection of all open balls
Br (q) with r > 0 being a rational number and q ∈ Qn , then for a given open
set A we can consider the subcollection BA consisting of those open balls such
that Br (q) ⊆ A. Their union is obviously contained in A, and the collection
is countable because both Qn and Q are countable. Therefore it remains only
to show that A is contained in the union of this collection. If x is a point
in A, then there is an open ball about x with rational radius r such that
Br (x) ⊆ A. Since Qn is dense, there is a point q of Qn in Br/2 (x). Then x
is in Br/2 (q), and Br/2 (q) is in BA because it is contained in Br (x), which
lies in A. This last assertion is verified by observing that if y ∈ Br/2 (q), then
dn (x, y) ≤ dn (x, q) + dn (q, y) < r. We have shown that an arbitrary point
x ∈ A is contained in some open ball in BA , so A is contained in the union of
all balls in BA .
(ii) Suppose S ⊆ ∪_{µ∈I} A_µ. By (i), each of the open sets A_µ can be written
as the union of all the balls B^µ_{r_k} centered at the points of Qn that have
rational radii and are contained in A_µ, say, A_µ = ∪_{k=1}^∞ B^µ_{r_k}. If x ∈ S
then there are some µ and r_k such that x ∈ B^µ_{r_k} ⊆ A_µ. For each of the
open balls B^µ_{r_k}, choose just one A_µ that contains B^µ_{r_k} and call it
A^µ_{r_k}. The subcollection of all such A^µ_{r_k} is countable because the
balls all belong to the countable collection B of (i) and there is only one
A^µ_{r_k} for each B^µ_{r_k}. Also, the union of this subcollection contains
the union of the B^µ_{r_k}'s, which in turn contains S.
Hence there is a countable subcollection of {A_µ}_{µ∈I} whose union contains S.

Remark 1.44. It is worth remarking that the hypothesis of Theorem 1.43 (ii)
makes no restrictive assumption about S. That is the strength and generality
of this result; it applies to any subset S of Rn. If it is given, in addition, that
S is closed and bounded, then the countable subcollection obtained from the
Lebesgue property can be further reduced to a finite subcollection whose union
still contains S.
Definition 1.45. Suppose (X, d) is a metric space.
(i) A set S ⊆ X is said to be nowhere dense if its closure S̄ contains no open
ball, i.e., (S̄)° = ∅.
(ii) A set S ⊆ X is said to be of first category in X if S is a countable union
of nowhere dense sets.
(iii) A set S ⊆ X is said to be of second category in X if S is not of first
category.
Example 1.46.
(i) The Georg Cantor set K is nowhere dense. To see this, let us recall the
construction of the Cantor set. Set I = [0, 1]. K is constructed by removing
open intervals from I. The first step, which for notational convenience we call
the zeroth step, is the removal of the “middle third” of I: E_{0,1} = (1/3, 2/3).
At the next step (i.e., the real first step) we remove the “middle thirds” of the
remaining two intervals. We continue in this way, at each step removing the
“middle thirds” of the remaining intervals. At the kth step there will be 2^k
intervals remaining and then we remove 2^k “middle thirds”; our notation for
the deleted open intervals is then E_{k,j}, j = 1, 2, ..., 2^k. Note that the
length of E_{k,j} is m_id(E_{k,j}) = 3^{−(k+1)}. The set remaining after removal
of all the middle thirds is called the Cantor set, K. Explicitly,
K = I \ ∪_{k=0}^∞ ∪_{j=1}^{2^k} E_{k,j}. It is clear that K is closed since K^c is
open. Note that the sets E_{k,j} are pairwise disjoint. So it follows from
Theorem ?? that
\[
m_{id}\Big( \bigcup_{k=0}^{\infty} \bigcup_{j=1}^{2^k} E_{k,j} \Big)
= \sum_{k=0}^{\infty} \sum_{j=1}^{2^k} m_{id}(E_{k,j})
= \sum_{k=0}^{\infty} \sum_{j=1}^{2^k} \frac{1}{3^{k+1}} = 1.
\]
Since m_id(I) = 1, one has
\[
1 = m_{id}(I) = m_{id}(K) + m_{id}\Big( \bigcup_{k=0}^{\infty} \bigcup_{j=1}^{2^k} E_{k,j} \Big)
= m_{id}(K) + 1,
\]
which implies m_id(K) = 0. If (K̄)° = K° were not empty, then K would
contain a finite open interval (a, b), then b − a ≤ m_id(K) = 0, and hence
a = b, a contradiction. Thus, K° = ∅.
(ii) Q is of first category in R – this follows from the definition.
(iii) R is of second category in itself – this assertion is not obvious, and in
fact it can only be deduced as a consequence of the so-called Baire category
theorem (named after René-Louis Baire) as follows.
Theorem 1.47. Let (X, d) be a metric space. If X is complete, then it is of
second category, and hence the complement of a first category set in X is of
second category and dense in X.

Proof. Let A_k be nowhere dense in X for every k ∈ N, and let E = ∪_{k=1}^∞ A_k.
We shall show that there is x ∈ X \ E = E^c and this will establish the result.
Let x^(0) ∈ X. Since A_1 is nowhere dense, there is a closed ball B_1 of radius
less than 1/2 inside the closed ball B̄_1(x^(0)) and such that B_1 ∩ A_1 = ∅. Since
A_2 is nowhere dense, there is also a closed ball B_2 of radius smaller than 1/3
such that B_2 ⊆ B_1 and B_2 ∩ A_2 = ∅. Continuing this construction produces
a nested sequence of closed balls (B_k)_{k=1}^∞ such that the radius of B_k is less
than 1/(k + 1) and B_k ∩ A_k = ∅. By the proof of Theorem 1.28 and its
immediate Remark 1.29, there is x ∈ ∩_{k=1}^∞ B_k ⊆ B̄_1(x^(0)). But x ∉ A_k for
any k; so x ∉ E.
If S is of first category, then the density of S^c can be worked out by
replacing B̄_1(x^(0)) by B̄_ε(x^(0)) in the foregoing argument. Of course, S^c must
be of second category: otherwise, X = S ∪ S^c is of first category, a contradiction.
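
The length computation in Example 1.46 (i) can be checked with exact rational arithmetic. The sketch below sums the lengths removed in the first few steps of the Cantor construction; the function name `removed_length` is an illustrative choice.

```python
from fractions import Fraction


def removed_length(steps):
    """Total length of the open middle thirds removed in steps 0, ..., steps-1:
    step k removes 2**k intervals, each of length 3**-(k+1)."""
    return sum(Fraction(2**k, 3 ** (k + 1)) for k in range(steps))


partial_sums = [removed_length(n) for n in range(6)]
remaining = [1 - s for s in partial_sums]   # length left after n steps
```

The partial sums equal 1 − (2/3)^n, so the length left after n steps is (2/3)^n, which tends to 0; this is the geometric series behind the conclusion that the Cantor set has length zero.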

Problems
1.1. For each of the following functions on R2 , determine which of the prop-
erties (i), (ii), and (iii) in Definition 1.1 are satisfied:
(i) d(x, y) = |x1 − y1 | + |x2 − y2 |;
(ii) d(x, y) = max{|x1 − y1 |, |x2 − y2 |};
(iii)
\[
d(x, y) = \begin{cases}
\big( (x_1 - y_1)^2 + (x_2 - y_2)^2 \big)^{1/2}, & x_2 \ge y_2, \\
|x_1 - y_1| + |x_2 - y_2|, & x_2 \le y_2.
\end{cases}
\]

1.2. Prove that if x^(1), ..., x^(m) are points in a metric space (X, d), then
d(x^(1), x^(m)) ≤ Σ_{j=2}^m d(x^(j), x^(j−1)).
1.3. Given n ∈ N.
(i) Let d*_n(x, y) = Σ_{k=1}^n |x_k − y_k| for x, y ∈ Rn. Prove that d*_n satisfies (i),
(ii), and (iii) of Definition 1.1. In addition, show that d_n(x, y) ≤ d*_n(x, y) for
x, y ∈ Rn.
(ii) Let d_{*,n}(x, y) = max_{1≤k≤n} |x_k − y_k| for x, y ∈ Rn. Prove that d_{*,n} satisfies
(i), (ii), and (iii) of Definition 1.1. Discuss the relationship between d_n and
d_{*,n}.
1.4. Prove that A° and Ā are the largest open subset of A and the smallest
closed set containing A, respectively.
1.5. In R2, show that the closure of B_1(0) is B̄_1(0).
1.6. Suppose (X, d) is a metric space. Prove that if S is a finite subset of X
containing at least two points, then S is disconnected.
1.7. Prove that, if f, g > 0 on R so that their graphs

G1 = {x ∈ R2 : x2 = f (x1 )} and G2 = {x ∈ R2 : x2 = −g(x1 )}

are sets in R2 , then S = G1 ∪ G2 is disconnected.



1.8. Decide whether or not each of the following sequences in R2:
(i) x^(k) = ((k − 1)/k, (k + 1)/k);
(ii) x^(k) = (1/k, sin πk);
(iii) x^(k) = (1/k, k);
converges and, if so, find its limit.

1.9. Let X be the interval (0, 1) in R with the usual distance |x − y|. Find a
Cauchy sequence in X that does not converge in X.

1.10. For n ∈ N, prove the following two equivalent statements:
(i) (x^(k))_{k=1}^∞ is a bounded sequence in Rn if and only if each coordinate
sequence (x_j^(k))_{k=1}^∞ (j = 1, ..., n) is bounded in R;
(ii) A is a bounded set in Rn when and only when there exists an n-cube
containing A.

1.11. Prove that if X is an infinite set and its distance function d is given by
\[
d(x, y) = \begin{cases} 1, & x \neq y, \\ 0, & x = y, \end{cases}
\]
then every Cauchy sequence in X is eventually constant, that is, there exists an
N ∈ N such that x^(k) = x^(N) whenever k ≥ N.

1.12. For n ∈ N, prove that D = {x = (x_1, ..., x_n) ∈ Rn : x_1, ..., x_n ∈ R \ Q}
is dense in Rn.

1.13. Let X be a metric space with the distance function d. Show that if
(x^(k))_{k=1}^∞ is a Cauchy sequence in X and has a subsequence (x^(k_j))_{j=1}^∞ that
converges to a point x ∈ X, then d(x^(k), x) → 0.

1.14. Let (X, d) be a metric space. Prove that X is complete if and only if
every nested sequence of closed balls (B̄_{r_k}(x_k))_{k=1}^∞ with r_k → 0 has
non-void intersection.

1.15. Let (X, d) be a complete metric space. Prove that if (S_j)_{j=1}^∞ is a
sequence of nonempty closed subsets of X with S_{j+1} ⊆ S_j for any j ∈ N and
diam(S_j) = sup{d(x, y) : x, y ∈ S_j} → 0 as j → ∞, then ∩_{j=1}^∞ S_j ≠ ∅.

1.16. Suppose (X, d) is a metric space. Show that S ⊆ X is nowhere dense if
and only if every open ball O contains an open ball O′ such that O′ ∩ S = ∅.

1.17. Suppose (X, d) is a metric space. Prove that if E_α ⊆ X is compact for
any α ∈ I, then ∩_{α∈I} E_α is compact.

1.18. Let (X, d) be a metric space and lim_{n→∞} d(x_n, x_0) = 0. Prove that the
set {x_j}_{j=0}^∞ is compact.
2
Continuous Maps

Since every metric space is automatically a topological space with the set
of all open sets as the metrizable topology, we have a notion of continuous
mappings between metric spaces. Note that a function between topological
spaces is said to be continuous if the inverse image of every open set is open
– this is an attempt to capture the intuition that there are no breaks or
separations in the function. Without referring to the topology, this notion
can also be directly defined using limits of sequences. In this chapter, we
consider properties of continuous mappings, contractions with straightforward
applications to integral and differential equations, and equivalence of metric
spaces.

2.1 Criteria for Continuity

By a mapping on a metric space (X, d), frequently written as (X, dX ) in what
follows, we mean a mapping on the set of points in X, and by a mapping
with values in a metric space Y we mean a mapping with values in the set of
points in Y . Thus if f : X → Y is a mapping from the metric space X into the
metric space Y , then to each point x ∈ X is associated a point f (x) ∈ Y . The
mapping f will be called continuous at a point x0 ∈ X if, roughly speaking,
points of X that are near x0 are mapped by f into points of Y that are near
f (x0 ). Below is the precise definition.

Definition 2.1. Let (X, dX ) and (Y, dY ) be two metric spaces, let f : X → Y
be a mapping, and let x0 ∈ X. Then f is said to be continuous at x0 if, given
any ε > 0, there exists a δ > 0 such that if x ∈ X and dX (x, x0 ) < δ, then
dY (f (x), f (x0 )) < ε. Moreover, if f is continuous at all points in X, then f
is said to be continuous on X.

Here we would like to emphasize that δ depends on ε (as well as x0 ).
Therefore, we could more accurately have written δ(ε) instead of δ. We stick
to the notation δ rather than δ(ε) for notational simplicity, always bearing in
mind that each ε must have its own δ.

Example 2.2.
(i) The mapping f : R → R given by f (x) = x2 is continuous.
(ii) Let X be any metric space with the distance d, and x0 a fixed point in
X. Then the mapping f : X → R given by f (x) = d(x, x0 ) for all x ∈ X
is continuous. This follows from a use of the triangle inequality of d. The
special case X = R and x0 = 0 shows that the absolute value function |x| is
continuous.
(iii) If f : X → Y is continuous and S is a subspace of X, then the restriction
of f to S is continuous on S. This is clear from Definition 2.1. One of the
curious consequences of Definition 2.1 occurs at a point x0 ∈ S that is not
a cluster point of S. Since x0 is an isolated point of S, there is an open ball
Bδ (x0 ) that contains no point of S other than x0 . Accordingly, any mapping
defined on S is always continuous at x0 .
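
For Example 2.2 (i), one admissible choice of δ can be written down explicitly: if |x − x0| < δ ≤ 1, then |x^2 − x0^2| = |x − x0| |x + x0| < δ (1 + 2|x0|). The sketch below uses δ = min(1, ε/(1 + 2|x0|)), which is one possible choice and not the only one; the function name `delta_for_square` and the sample values are assumptions made for illustration.

```python
def delta_for_square(x0, eps):
    """One admissible delta for f(x) = x**2 at the point x0 (a sketch):
    |x - x0| < delta implies |x**2 - x0**2| < delta * (1 + 2|x0|) <= eps."""
    return min(1.0, eps / (1.0 + 2.0 * abs(x0)))


def worst_gap(x0, eps, samples=1000):
    """Largest |x**2 - x0**2| over sample points with |x - x0| < delta."""
    d = delta_for_square(x0, eps)
    xs = [x0 + d * i / samples for i in range(-samples + 1, samples)]
    return max(abs(x * x - x0 * x0) for x in xs)


gaps = {(x0, eps): worst_gap(x0, eps)
        for x0 in (-3.0, 0.0, 0.5, 10.0)
        for eps in (1.0, 0.1, 1e-3)}
```

The value of δ shrinks as |x0| grows, which illustrates the dependence of δ on x0 emphasized after Definition 2.1.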

Let us have a look at another example.


Example 2.3. For n, m ∈ N, let T : Rn → Rm be a linear mapping; that is,

T (ax + by) = aT (x) + bT (y) for all x, y ∈ Rn .

Then there exists an m × n real-valued matrix A = [a_{jk}] such that
y = T (x) = Ax, i.e.,
\[
T \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}
= \begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix},
\]
where
\[
y_j = \sum_{k=1}^{n} a_{jk} x_k, \qquad a_{jk} \in \mathbb{R},\ k = 1, ..., n \text{ and } j = 1, ..., m.
\]
Moreover, T is continuous.
To prove this result, we first consider the case m = 1, so that T :
Rn → R is a function. Let e^(k), k = 1, ..., n, be the standard basis vectors
of Rn: the kth coordinate of e^(k) is 1 and the others are 0. Set A_k = T (e^(k)).
Since x = Σ_{k=1}^n x_k e^(k), by the linearity of T we have
\[
T(x) = \sum_{k=1}^{n} x_k T(e^{(k)}) = \sum_{k=1}^{n} x_k A_k.
\]

Thus T is represented by the row matrix [A_1, ..., A_n] whenever x is written as
the column matrix
\[
x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.
\]
Suppose now m > 1. Then
\[
T(x) = \begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix},
\]

where y_j is a function of x, say y_j = f_j (x). The linearity of T implies that T
is linear in each coordinate; that is, each f_j is a linear function from Rn into
R. Therefore, by the first case, each f_j is given by a row matrix:
\[
y_j = f_j(x) = \sum_{k=1}^{n} a_{jk} x_k.
\]

This set of m linear functions f1 , ..., fm determines the m rows of the matrix
[ajk ]. Thus T (x) = Ax.
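In order to prove the continuity of T , let
\[
M_T = \Big( \sum_{j=1}^{m} \sum_{k=1}^{n} a_{jk}^2 \Big)^{1/2}; \qquad
\|x\|_l = \Big( \sum_{k=1}^{l} x_k^2 \Big)^{1/2}; \qquad
d_l(x, y) = \|x - y\|_l
\]
for
\[
x = \begin{pmatrix} x_1 \\ \vdots \\ x_l \end{pmatrix}
\quad \text{and} \quad
y = \begin{pmatrix} y_1 \\ \vdots \\ y_l \end{pmatrix}
\quad \text{in } \mathbb{R}^l.
\]
Then
\[
\|T(x)\|_m = \bigg( \Big( \sum_{k=1}^{n} a_{1k} x_k \Big)^2 + \cdots
+ \Big( \sum_{k=1}^{n} a_{mk} x_k \Big)^2 \bigg)^{1/2}.
\]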

Using the Cauchy-Schwarz inequality, we get
\[
\Big( \sum_{k=1}^{n} a_{jk} x_k \Big)^2
\le \Big( \sum_{k=1}^{n} a_{jk}^2 \Big) \Big( \sum_{k=1}^{n} x_k^2 \Big),
\qquad j = 1, 2, ..., m,
\]
and hence ||T (x)||_m ≤ M_T ||x||_n . Consequently, for any fixed point x0 ∈ Rn
we have
\[
d_m\big(T(x), T(x_0)\big) = \|T(x) - T(x_0)\|_m = \|T(x - x_0)\|_m \le M_T\, d_n(x, x_0).
\]
If ε > 0 is given, then we can define δ = ε/(1 + M_T ), and this implies that

d_m (T (x), T (x0 )) < ε whenever d_n (x, x0 ) < δ.
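
The bound ||T(x) − T(x0)||_m ≤ M_T d_n(x, x0) from Example 2.3 can be spot-checked on random data. This is a sketch under stated assumptions: matrices are plain lists of rows, the dimensions and random seed are arbitrary, and the helper names `apply`, `norm` and `frobenius` are not from the notes (`frobenius` computes the constant called M_T above).

```python
import math
import random


def apply(A, x):
    """Matrix-vector product Ax for A given as a list of rows."""
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]


def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(c * c for c in v))


def frobenius(A):
    """M_T: square root of the sum of the squares of all entries of A."""
    return math.sqrt(sum(a * a for row in A for a in row))


def check_bound(trials=200, m=3, n=4, seed=0):
    """Return True if ||Ax - Ax0|| <= M_T ||x - x0|| held in every trial."""
    rng = random.Random(seed)
    for _ in range(trials):
        A = [[rng.uniform(-5, 5) for _ in range(n)] for _ in range(m)]
        x = [rng.uniform(-5, 5) for _ in range(n)]
        x0 = [rng.uniform(-5, 5) for _ in range(n)]
        lhs = norm([u - v for u, v in zip(apply(A, x), apply(A, x0))])
        rhs = frobenius(A) * norm([u - v for u, v in zip(x, x0)])
        if lhs > rhs + 1e-9:  # small tolerance for floating-point rounding
            return False
    return True


bound_holds = check_bound()
```

Because the constant depends only on the matrix, the same δ = ε/(1 + M_T) serves every base point x0.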

Remark 2.4. Note that the choice of δ above does not depend on x0 where the
continuity is established. Thus we conclude that linear mappings, like linear
functions on R, are uniformly continuous on Rn .

Definition 2.1 may be reformulated by saying that f is continuous at x0
if, given any open ball in Y of center f (x0 ), there exists an open ball in X
centered at x0 whose image under f is contained in the former ball. Clearly,
these open balls can be replaced by open sets. The forthcoming criterion for
the continuity of a mapping from one metric space into another is often useful.

Theorem 2.5. Let (X, dX ) and (Y, dY ) be two metric spaces and let f : X →
Y be a mapping. Then f is continuous if and only if for every open subset V
of Y the inverse image f −1 (V ) = {x ∈ X : f (x) ∈ V } is an open subset of
X.

Proof. To prove this theorem, first suppose f is continuous. We have to show
that if V ⊂ Y is open, then also f^{−1}(V ) is open. Let x0 ∈ f^{−1}(V ). Then
f (x0 ) ∈ V . Since V is open, it contains the open ball in Y of center f (x0 ) and
some radius ε > 0. Since f is continuous at x0 , there is a δ > 0 such that if
x ∈ X and dX (x, x0 ) < δ then dY (f (x), f (x0 )) < ε. This means that if x is
in the open ball in X of center x0 and radius δ then f (x) belongs to the open
ball in Y of center f (x0 ) and radius ε, so that f (x) ∈ V . That is, f^{−1}(V )
contains the open ball in X of center x0 and radius δ. Since x0 was any point
of f^{−1}(V ), the set f^{−1}(V ) is open.
Conversely, suppose that for every open V ⊂ Y , the set f^{−1}(V ) is an open
subset of X. We must show that f is continuous at any point x0 ∈ X. For any
ε > 0 let B_ε (f (x0 )) ⊂ Y be the open ball of center f (x0 ) and radius ε. Then
f^{−1}(B_ε (f (x0 ))) is an open subset of X that contains x0 , and hence contains
the open ball in X of center x0 and some radius δ > 0. Thus if x ∈ X and
dX (x, x0 ) < δ, then dY (f (x), f (x0 )) < ε. This means that f is continuous at
x0 , and this completes the proof.

Remark 2.6. As an immediate consequence of this theorem, we have that if
f : X → R is continuous then for any c ∈ R

{x ∈ X : f (x) > c} and {x ∈ X : f (x) < c}

are open subsets of X. Additionally, it is not hard to see that a continuous
mapping of a continuous mapping is a continuous mapping; that is, if f : X →
Y and g : Y → Z are continuous, so is the mapping g ◦ f : X → Z.

The next result is the sequence criterion for continuity.

Theorem 2.7. Let (X, dX ) and (Y, dY ) be two metric spaces and let f : X →
Y be a mapping. Then f is continuous if and only if for every sequence
(x_k)_{k=1}^∞ of points in X that converges to x0 ∈ X, the sequence
(f (x_k))_{k=1}^∞ converges to f (x0 ) in Y .

Proof. Suppose first that f is continuous at x0 ∈ X and that (x_k)_{k=1}^∞
converges to x0 in X. We have to show that (f (x_k))_{k=1}^∞ converges to f (x0 ) in Y .
Given ε > 0, the continuity of f at x0 implies that there is a δ > 0 such that
dY (f (x), f (x0 )) < ε whenever x ∈ X and dX (x, x0 ) < δ. Because (x_k)_{k=1}^∞
converges to x0 , there exists a positive integer N such that dX (x_k , x0 ) < δ
for all k > N . Hence if k > N then dY (f (x_k ), f (x0 )) < ε, which shows that
(f (x_k))_{k=1}^∞ converges to f (x0 ).
We now suppose the statement after the if and only if is valid. If f is
not continuous at x0 , then there is some ε0 > 0 such that for no number
δ > 0 is it true that whenever x ∈ X and dX (x, x0 ) < δ then necessarily
dY (f (x), f (x0 )) < ε0 . Hence for any k ∈ N we can find a point x_k ∈ X such
that dX (x_k , x0 ) < 1/k and dY (f (x_k ), f (x0 )) ≥ ε0 . Since dX (x_k , x0 ) < 1/k,
the sequence (x_k)_{k=1}^∞ converges to x0 . However (f (x_k))_{k=1}^∞ does not converge
to f (x0 ) since dY (f (x_k ), f (x0 )) ≥ ε0 for all k. This completes the proof.

Example 2.8. Let f be defined on R2 by
\[
f(x) = f(x_1, x_2) = \begin{cases}
0, & x = 0, \\
\dfrac{x_1 x_2}{x_1^2 + x_2^2}, & x \neq 0.
\end{cases}
\]
Then as (x^(k))_{k=1}^∞ approaches 0 along the axes, we see that (f (x^(k)))_{k=1}^∞
converges to 0. But if x^(k) = (1/k, 1/k), then f (x^(k)) = 1/2. Hence f is not
continuous at 0.
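
The two limits in Example 2.8 can be evaluated exactly. The sketch below uses rational arithmetic so that the values along the axes and along the diagonal are computed without rounding; the function name `f` mirrors the example, while the range of k is an arbitrary choice.

```python
from fractions import Fraction


def f(x1, x2):
    """The function of Example 2.8, evaluated on rational inputs."""
    if x1 == 0 and x2 == 0:
        return Fraction(0)
    return Fraction(x1 * x2) / (x1 * x1 + x2 * x2)


ks = range(1, 51)
along_x_axis = [f(Fraction(1, k), 0) for k in ks]               # x^(k) = (1/k, 0)
along_y_axis = [f(0, Fraction(1, k)) for k in ks]               # x^(k) = (0, 1/k)
along_diagonal = [f(Fraction(1, k), Fraction(1, k)) for k in ks]
```

Both sequences of points converge to 0 in R2, yet the values of f are identically 0 along the axes and identically 1/2 along the diagonal, which is the failure of the sequence criterion of Theorem 2.7 at the origin.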

2.2 Continuous Maps on Compact or Connected Spaces


First, we consider continuous mappings on a compact metric space.

Theorem 2.9. Let (X, dX ) and (Y, dY ) be two metric spaces and let f : X →
Y be a continuous mapping. If X is compact, so is its image f (X).

Proof. We must prove that if f (X) = {f (x) : x ∈ X} is covered by the
union of a collection of open subsets of Y then it is covered by the union of
a finite number of these open subsets. So, suppose f (X) ⊆ ∪_{j∈I} V_j where I
is an index set and each V_j is open. Because f is continuous, each inverse image
f^{−1}(V_j ) is open. Also, for any x ∈ X we have f (x) ∈ V_j for some j ∈ I, in
which case x ∈ f^{−1}(V_j ), so that X = ∪_{j∈I} f^{−1}(V_j ). By the compactness of X
we can find a finite subset J ⊆ I such that X = ∪_{j∈J} f^{−1}(V_j ). Therefore

f (X) = ∪_{j∈J} f (f^{−1}(V_j )) ⊆ ∪_{j∈J} V_j .

Thus f (X) is compact.

This theorem has two extremely important immediate consequences.

Corollary 2.10. Let (X, dX ) and (Y, dY ) be two metric spaces and let f :
X → Y be a continuous mapping. If X is compact, then f is bounded.

Proof. Since f (X) is compact, it is bounded according to Corollary 1.39.

The example $f(x) = x$ on $(0, 1)$ shows that a bounded continuous real-valued function need not assume either its least upper bound or its greatest lower bound on its domain of definition. If, however, the domain is compact, then these values are actually taken on by the continuous function.

Corollary 2.11. Let (X, dX ) be a nonempty compact metric space and let
f : X → R be continuous. Then
(i) sup{f (x) : x ∈ X} = M and inf{f (x) : x ∈ X} = m are finite.
(ii) There are points x, y ∈ X such that f (x) = M and f (y) = m.

Proof. (i) By Theorem 2.9 it follows that f (X) is a compact subset of R, hence
closed and bounded. Since f (X) is not empty, f (X) has both a greatest and
a least element.
To prove (ii), by the definition of $M$ we derive that for each $k \in \mathbb{N}$ there is $x_k \in X$ such that $M - 1/k < f(x_k) \le M$. Since $X$ is compact, $(x_k)_{k=1}^\infty$ has a convergent subsequence $(x_{k_j})_{j=1}^\infty$, which converges to a point $x \in X$. Since $f$ is continuous, $f(x) = \lim_{j \to \infty} f(x_{k_j}) = M$. The argument for the existence of the point $y$ such that $f(y) = m$ is similar.

Definition 2.12. Let $(X, d_X)$ and $(Y, d_Y)$ be two metric spaces and let $f : X \to Y$ be a mapping. Then $f$ is said to be uniformly continuous if, given any $\epsilon > 0$, there exists a $\delta > 0$ such that if $x, y \in X$ and $d_X(x, y) < \delta$ then $d_Y(f(x), f(y)) < \epsilon$.

If it happens that a mapping $f : X \to Y$ is such that for a certain subset $S$ of $X$, the restriction of $f$ to $S$ is uniformly continuous, we say that $f$ is uniformly continuous on $S$.
Clearly, a uniformly continuous mapping is continuous. To check continuity at a point $x_0 \in X$ just set $y = x_0$ in the definition. The next theorem will state that conversely if $f$ is continuous then $f$ is actually uniformly continuous, provided $X$ is compact.

Theorem 2.13. Let $(X, d_X)$ and $(Y, d_Y)$ be two metric spaces and let $f : X \to Y$ be a continuous mapping. If $X$ is compact, then $f$ is uniformly continuous; equivalently, $(f(x_j))_{j=1}^\infty$ is a Cauchy sequence in $Y$ whenever $(x_j)_{j=1}^\infty$ is a Cauchy sequence in $X$.

Proof. Suppose $f$ is not uniformly continuous on $X$. Then there are an $\epsilon_0 > 0$ and sequences $(x_k)_{k=1}^\infty$ and $(y_k)_{k=1}^\infty$ of points in $X$ such that
\[
d_X(x_k, y_k) < 1/k \quad \text{and} \quad d_Y(f(x_k), f(y_k)) \ge \epsilon_0.
\]
Since $X$ is compact, $(x_k)_{k=1}^\infty$ has a subsequence $(x_{k_j})_{j=1}^\infty$, which converges to a point $x \in X$. Then $y_{k_j} \to x$ since
\[
0 \le d_X(y_{k_j}, x) \le d_X(y_{k_j}, x_{k_j}) + d_X(x_{k_j}, x) < k_j^{-1} + d_X(x_{k_j}, x).
\]
Since $f$ is continuous, we have
\[
\lim_{j \to \infty} d_Y(f(x_{k_j}), f(x)) = 0 \quad \text{and} \quad \lim_{j \to \infty} d_Y(f(y_{k_j}), f(x)) = 0;
\]
hence $\lim_{j \to \infty} d_Y(f(x_{k_j}), f(y_{k_j})) = 0$ because
\[
d_Y(f(x_{k_j}), f(y_{k_j})) \le d_Y(f(x_{k_j}), f(x)) + d_Y(f(y_{k_j}), f(x)),
\]
contradicting $d_Y(f(x_{k_j}), f(y_{k_j})) \ge \epsilon_0$.
Next, we prove the equivalence. The above argument indicates that if $f$ is not uniformly continuous on $X$, then the sequence $\{x_{k_1}, y_{k_1}, x_{k_2}, y_{k_2}, \dots\}$ is a Cauchy sequence in $X$ but $\{f(x_{k_1}), f(y_{k_1}), f(x_{k_2}), f(y_{k_2}), \dots\}$ is not a Cauchy sequence in $Y$. On the other hand, if $f$ is uniformly continuous, then we conclude that for any $\epsilon > 0$ there is a $\delta > 0$ such that $d_X(x, y) < \delta$ implies $d_Y(f(x), f(y)) < \epsilon$. Now if $(x_j)_{j=1}^\infty$ is a Cauchy sequence in $X$ then for the above obtained $\delta > 0$ there exists $N > 0$ such that $d_X(x_j, x_k) < \delta$ as $k, j > N$ and consequently, $d_Y(f(x_j), f(x_k)) < \epsilon$, i.e., $(f(x_j))_{j=1}^\infty$ is a Cauchy sequence in $Y$.
Remark 2.14. The compactness of the domain is important. For example, the
function f , defined by f (x) = 1/x on (0, 1], is continuous but is not uniformly
continuous on the noncompact set (0, 1]. The problem, of course, lies at the
left end-point of the interval. The function values are arbitrarily large in every
open ball of center 0.
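
The failure of uniform continuity in Remark 2.14 can be illustrated with a short Python sketch (added here as an illustration; the pairs $x = 1/k$, $y = 1/(2k)$ are a choice made for this example). The two points get arbitrarily close while the values of $f(x) = 1/x$ drift apart, so no single $\delta$ can work for, say, $\epsilon = 1$.

```python
def f(x):
    """The function of Remark 2.14 on (0, 1]."""
    return 1.0 / x

ks = [10, 100, 1000, 10000]
pairs = []
for k in ks:
    x, y = 1.0 / k, 1.0 / (2 * k)        # both points lie in (0, 1]
    pairs.append((abs(x - y), abs(f(x) - f(y))))

for dist_in_domain, dist_of_values in pairs:
    # dist_in_domain = 1/(2k) shrinks, dist_of_values = k grows
    print(dist_in_domain, dist_of_values)
```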
Secondly, we discuss continuous mappings on a connected metric space.
Theorem 2.15. Let (X, dX ) and (Y, dY ) be two metric spaces and let f :
X → Y be a continuous mapping. If X is connected, so is its image f (X).
Proof. If $f(X)$ is not connected, then $f(X) = A \cup B$, where $A$ and $B$ are disjoint nonempty subsets of $f(X)$ that are open in $f(X)$. Since $f$ is continuous as a mapping from $X$ onto $f(X)$, it is clear that $f^{-1}(A)$ and $f^{-1}(B)$ are disjoint nonempty open subsets of $X$. We therefore have the expression of $X$ as $X = f^{-1}(A) \cup f^{-1}(B)$. This contradicts the fact that $X$ is connected, proving the theorem.
Below is the intermediate value property as a consequence of the above
theorem.

Example 2.16. Suppose $[a, b]$ is a finite interval of $\mathbb{R}$. Let $f : [a, b] \to \mathbb{R}$ be continuous with $f(a) < f(b)$. If $f(a) < y < f(b)$, then there is $\zeta \in [a, b]$ such that $f(\zeta) = y$.

In fact, since $[a, b]$ is connected, so is $f([a, b])$. Note that any subset of $\mathbb{R}$ which contains two points $p < q$ and does not contain some point between $p$ and $q$ is not connected: for suppose that $p < c < q$ and $S$ is a subset of $\mathbb{R}$ with $p, q \in S$, $c \notin S$, then
\[
S = (S \cap \{x \in \mathbb{R} : x < c\}) \cup (S \cap \{x \in \mathbb{R} : x > c\})
\]
expresses $S$ as the union of two disjoint nonempty subsets that are open in $S$. In other words, any connected subset of $\mathbb{R}$ contains all points between any two of its points. Since $y$ is between the points $f(a)$ and $f(b)$ of $f([a, b])$, which is connected, we therefore have $y \in f([a, b])$.

2.3 Sequences of Mappings


Definition 2.17. Let $(X, d_X)$ and $(Y, d_Y)$ be two metric spaces, and for $k \in \mathbb{N}$ let $f_k : X \to Y$ be a mapping. We say that
(i) $(f_k)_{k=1}^\infty$ converges at $x \in X$ provided $(f_k(x))_{k=1}^\infty$ converges in $Y$;
(ii) $(f_k)_{k=1}^\infty$ converges on $X$, provided $(f_k(x))_{k=1}^\infty$ converges for any $x \in X$;
(iii) $(f_k)_{k=1}^\infty$ converges to a mapping $f : X \to Y$, denoted $f = \lim_{k \to \infty} f_k$, provided $f(x) = \lim_{k \to \infty} f_k(x)$ whenever $x \in X$.

Example 2.18.
(i) Let $f_k : [0, 1] \to \mathbb{R}$ be given by $f_k(x) = x - x/k$. Then $\lim_{k \to \infty} f_k(x) = x$.
(ii) Let $f_k : [0, 1] \to \mathbb{R}$ be given by $f_k(x) = x^k$. Then $\lim_{k \to \infty} f_k(x) = f(x)$ where $f(x) = 0$ if $x \in [0, 1)$ and $f(x) = 1$ if $x = 1$. Note that each $f_k$ is continuous, but $f$ is not.
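
The two sequences of Example 2.18 behave differently when the largest gap to the limit over all of $[0, 1]$ is measured. The Python sketch below is an added illustration under stated assumptions: a finite grid stands in for $[0, 1]$ in case (i), and the witness points $x_k = (1/2)^{1/k}$ in case (ii) are a choice made here, not taken from the notes.

```python
grid = [i / 1000 for i in range(1001)]   # sample points of [0, 1], including 1
ks = [1, 10, 100, 1000]

# (i) f_k(x) = x - x/k with limit f(x) = x: the largest gap on the grid is 1/k.
sup_i = [max(abs((x - x / k) - x) for x in grid) for k in ks]

# (ii) f_k(x) = x**k with limit 0 on [0, 1): at the witness point
# x_k = (1/2)**(1/k) < 1 the gap |f_k(x_k) - 0| is 1/2 for every k.
gap_ii = [(0.5 ** (1.0 / k)) ** k for k in ks]

print(sup_i)    # shrinks like 1/k
print(gap_ii)   # stays at about 0.5
```

In case (i) the largest gap tends to $0$, while in case (ii) there is always a point of $[0, 1)$ where the gap is $1/2$, so the convergence cannot be made small simultaneously at all points.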

Definition 2.19. Let $(X, d_X)$ and $(Y, d_Y)$ be two metric spaces, for $k \in \mathbb{N}$ let $f_k : X \to Y$ be a mapping, and let $f : X \to Y$ be another mapping. Then we say that $(f_k)_{k=1}^\infty$ converges uniformly to $f$ on $X$ if, given any $\epsilon > 0$, there is an $N \in \mathbb{N}$ such that $d_Y(f_k(x), f(x)) < \epsilon$ whenever $k > N$ and $x \in X$. If the restrictions of $(f_k)_{k=1}^\infty$ to a certain subset $S$ of $X$ converge uniformly to some mapping on $S$, we say that $(f_k)_{k=1}^\infty$ converges uniformly on $S$.

Uniform convergence clearly implies convergence. Looking at Example 2.18, we see that the convergence in the first case is uniform, but not in the second. Moreover, we have the following Cauchy criterion for uniform convergence.

Theorem 2.20. Let $(X, d_X)$ and $(Y, d_Y)$ be two metric spaces, let $Y$ be complete, and for $k \in \mathbb{N}$ let $f_k : X \to Y$ be a mapping. Then $(f_k)_{k=1}^\infty$ converges uniformly to $f$ on $X$ when and only when for any $\epsilon > 0$ there is an $N \in \mathbb{N}$ such that $d_Y(f_k(x), f_l(x)) < \epsilon$ whenever $k, l > N$ and $x \in X$.

Proof. It is enough to verify the sufficiency. Suppose the statement after the when and only when is true. So, for any $x \in X$, $(f_k(x))_{k=1}^\infty$ is a Cauchy sequence in $Y$. Since $Y$ is complete, this sequence has a limit. Thus $(f_k)_{k=1}^\infty$ converges to some mapping $f$. Given $\epsilon > 0$, choose an integer $N$ so that we have $d_Y(f_k(x), f_l(x)) < \epsilon/2$ whenever $k, l > N$, for all $x \in X$. Then for any fixed $k > N$ and fixed $x \in X$ the sequence $(f_l(x))_{l=1}^\infty$ is such that all terms after the $N$-th are within distance $\epsilon/2$ of $f_k(x)$, and are therefore in the closed ball in $Y$ of center $f_k(x)$ and radius $\epsilon/2$. Since closed balls are closed sets, the limit $f(x)$ of the convergent sequence $(f_l(x))_{l=1}^\infty$ is also in this closed ball, so that $d_Y(f_k(x), f(x)) \le \epsilon/2$. Hence if $k > N$ we have $d_Y(f_k(x), f(x)) < \epsilon$ for all $x \in X$, proving uniform convergence.

Interestingly, the continuity is preserved under the uniform convergence.

Theorem 2.21. Let $(X, d_X)$ and $(Y, d_Y)$ be two metric spaces and let $(f_k)_{k=1}^\infty$ be a uniformly convergent sequence of continuous mappings from $X$ into $Y$. Then $f = \lim_{k \to \infty} f_k$ is continuous.

Proof. Fix $x_0 \in X$. Let $\epsilon > 0$. By the uniform convergence, there is an $N \in \mathbb{N}$ such that $d_Y(f_k(x), f(x)) < \epsilon/3$ for all $k > N$ and all $x \in X$. Fix one such $k$. Since $f_k$ is continuous at $x_0$, there is a $\delta > 0$ such that if $x \in X$ and $d_X(x, x_0) < \delta$ then $d_Y(f_k(x), f_k(x_0)) < \epsilon/3$. Hence if $x \in X$ and $d_X(x, x_0) < \delta$ we have
\[
d_Y(f(x), f(x_0)) \le d_Y(f(x), f_k(x)) + d_Y(f_k(x), f_k(x_0)) + d_Y(f_k(x_0), f(x_0)) < \epsilon.
\]
Thus $f$ is continuous at $x_0$. We are done.

If $f_1$ and $f_2$ are mappings from $X$ into $Y$, it is natural to try to find some measure of the extent to which $f_1$ and $f_2$ differ, that is to find some sort of distance between $f_1$ and $f_2$. For any specific $x \in X$ we may say that $f_1$ and $f_2$ differ at $x$ by the distance between their values at $x$, that is by $d_Y(f_1(x), f_2(x))$, but we would really like to measure how much $f_1$ and $f_2$ differ over all points of $X$, not just at $x$. There are various ways to do this, depending on the circumstances and purposes in mind, but the most simple-minded method turns out to be one of the most useful. It is to take the distance between $f_1$ and $f_2$ to be $\max\{d_Y(f_1(x), f_2(x)) : x \in X\}$ if this maximum happens to exist. In order to develop this idea we turn aside for a simple lemma.

Lemma 2.22. Let $(X, d_X)$ and $(Y, d_Y)$ be two metric spaces and let $f_1$ and $f_2$ be continuous mappings from $X$ into $Y$. Then the function $x \mapsto d_Y(f_1(x), f_2(x))$ is continuous on $X$.

Proof. This follows from the triangle inequality based estimate below:
\[
\begin{aligned}
|d_Y(f_1(x), f_2(x)) - d_Y(f_1(x_0), f_2(x_0))|
&\le |d_Y(f_1(x), f_2(x)) - d_Y(f_1(x), f_2(x_0))| \\
&\quad + |d_Y(f_1(x), f_2(x_0)) - d_Y(f_1(x_0), f_2(x_0))| \\
&\le d_Y(f_2(x), f_2(x_0)) + d_Y(f_1(x), f_1(x_0)).
\end{aligned}
\]

Now we consider the set $C(X, Y)$ of all continuous mappings from $X$ into $Y$. If $X$ is compact, then it is clear that for $f_1, f_2 \in C(X, Y)$ the following notion
\[
d_{C(X,Y)}(f_1, f_2) = \max\{d_Y(f_1(x), f_2(x)) : x \in X\}
\]
makes sense since any continuous real-valued function on a compact metric space attains a maximum. Moreover, it is easy to prove that $C(X, Y)$ is a metric space under $d_{C(X,Y)}(\cdot, \cdot)$. However, it is abstract in the sense that its points are mappings on another metric space. A sequence of points in $C(X, Y)$ is a sequence of continuous mappings $(f_k)_{k=1}^\infty$ from $X$ into $Y$. This sequence will converge to a point $f \in C(X, Y)$ if and only if $\lim_{k \to \infty} d_{C(X,Y)}(f_k, f) = 0$; in other words, if and only if for each $\epsilon > 0$ there is an $N \in \mathbb{N}$ such that for any integer $k > N$ one has $d_{C(X,Y)}(f_k, f) < \epsilon$; that is, $d_Y(f_k(x), f(x)) < \epsilon$ for all $x \in X$. Equivalently, $(f_k)_{k=1}^\infty$ converges uniformly to $f$.
An application of the last two theorems gives the following result, which covers the well-known Arzelà-Ascoli theorem. The name reflects a line of generalizations going back to Cesare Arzelà's theorem: if $(f_n)$ is a uniformly bounded sequence of Riemann integrable functions that converges pointwise to a Riemann integrable function $f$ on the finite interval $[a, b]$, then
\[
\int_a^b f(x)\,dx = \lim_{n \to \infty} \int_a^b f_n(x)\,dx.
\]

Theorem 2.23. Let (X, dX ) and (Y, dY ) be two metric spaces and let X be
compact and Y complete. Then
(i) C(X, Y ) is a complete metric space with respect to the distance dC(X,Y ) (·, ·).
(ii) A subset $E$ of $C(X, Y)$ is compact if and only if $E$ is bounded, closed, and equi-continuous in the sense that for any $\epsilon > 0$ there is a $\delta > 0$ such that
\[
d_X(x_1, x_2) < \delta \ \Rightarrow\ d_Y(f(x_1), f(x_2)) < \epsilon \quad \text{for all } f \in E.
\]

Proof. (i) Suppose $(f_k)_{k=1}^\infty \subset C(X, Y)$ is a Cauchy sequence. Then for any $\epsilon > 0$ there is an $N \in \mathbb{N}$ such that if $k, l > N$ then $d_{C(X,Y)}(f_k, f_l) < \epsilon$; that is, $d_Y(f_k(x), f_l(x)) < \epsilon$ for all $x \in X$. Since $Y$ is complete, Theorem 2.20 is applied to derive that $(f_k)_{k=1}^\infty$ converges uniformly on $X$ to some mapping $f : X \to Y$. Moreover, Theorem 2.21 tells us that $f$ is continuous. Thus $f \in C(X, Y)$ and $\lim_{k \to \infty} f_k = f$ in the sense of points of the metric space $(C(X, Y), d_{C(X,Y)})$. Thus $C(X, Y)$ is complete under $d_{C(X,Y)}(\cdot, \cdot)$.
(ii) Let $E$ be a subset of $C(X, Y)$. If $E$ is compact, then $E$ is automatically bounded and closed. Moreover, $E$ is totally bounded by Theorem 2.23 and Corollary 1.36, and hence for $\epsilon > 0$ there are finitely many open balls $\{B^{C(X,Y)}_{\epsilon/3}(f_k)\}_{k=1}^n$ of $C(X, Y)$ such that $E \subseteq \bigcup_{k=1}^n B^{C(X,Y)}_{\epsilon/3}(f_k)$. Since each $f_k : X \to Y$ is uniformly continuous, there exists a $\delta_k > 0$ such that
\[
d_X(x_1, x_2) < \delta_k \ \Rightarrow\ d_Y(f_k(x_1), f_k(x_2)) < \epsilon/3.
\]
Let $\delta = \min\{\delta_k : k = 1, 2, \dots, n\}$. Then for a given $f \in E$, choose an $f_k$ such that $d_{C(X,Y)}(f, f_k) < \epsilon/3$. Hence, $d_X(x_1, x_2) < \delta$ yields
\[
d_Y(f(x_1), f(x_2)) \le d_Y(f(x_1), f_k(x_1)) + d_Y(f_k(x_1), f_k(x_2)) + d_Y(f_k(x_2), f(x_2)) < \epsilon.
\]
Conversely, assume the statement after the if and only if. Since $E$ is closed and $C(X, Y)$ is complete, according to Corollary 1.36 and Theorem 1.35 it remains to check that each sequence $(f_k)_{k=1}^\infty$ in $E$ has a Cauchy subsequence.

Since $X$ is separable, there is a sequence $(x_k)_{k=1}^\infty$ which is dense in $X$. Note that $E$ is bounded and then $(f_k(x_1))_{k=1}^\infty$ is bounded in the complete space $Y$. So this sequence has a convergent subsequence, denoted $(f_k^{(1)}(x_1))_{k=1}^\infty$. Similarly, suppose $(f_k^{(2)})_{k=1}^\infty$ is a subsequence of $(f_k^{(1)})_{k=1}^\infty$ such that $(f_k^{(2)}(x_2))_{k=1}^\infty$ converges. In this way, we can inductively obtain a subsequence $(f_k^{(j)})_{k=1}^\infty$ of $(f_k)_{k=1}^\infty$ such that $\lim_{k \to \infty} f_k^{(j)}(x_l)$ exists for $l = 1, 2, \dots, j$. Now the diagonal sequence $(f_k^{(k)})_{k=1}^\infty$ ensures that $(f_k^{(k)}(x_j))_{k=1}^\infty$ is convergent for each $j$. Since $f_k^{(k)} \in E$ and $E$ is equi-continuous, for $\epsilon > 0$ there is a $\delta > 0$, independent of $k$, such that
\[
d_X(x, y) < \delta \ \Rightarrow\ d_Y(f_k^{(k)}(x), f_k^{(k)}(y)) < \epsilon/3.
\]
Also, note that $(x_k)_{k=1}^\infty$ is dense in the compact space $X$. So, it follows that $X = \bigcup_{l=1}^N B^X_\delta(x_l)$ for some natural number $N$, where $B^X_\delta(x_l)$ is the open ball in $X$ centered at $x_l$ with radius $\delta$. Furthermore, the definition of $f_k^{(k)}$ yields a natural number $M$ such that
\[
m, n > M \ \Rightarrow\ d_Y(f_m^{(m)}(x_j), f_n^{(n)}(x_j)) < \epsilon/3, \quad j = 1, 2, \dots, N.
\]
Finally, upon choosing for any $x \in X$ an $x_j$ such that $x \in B^X_\delta(x_j)$, we get that if $m, n > M$ then
\[
\begin{aligned}
d_Y(f_m^{(m)}(x), f_n^{(n)}(x))
&\le d_Y(f_m^{(m)}(x), f_m^{(m)}(x_j)) + d_Y(f_m^{(m)}(x_j), f_n^{(n)}(x_j)) + d_Y(f_n^{(n)}(x_j), f_n^{(n)}(x)) \\
&< \epsilon,
\end{aligned}
\]
and consequently, $(f_k^{(k)})_{k=1}^\infty$ is a Cauchy sequence in $C(X, Y)$, so it is convergent to an element of $E$, and hence $E$ is compact.
Remark 2.24. Theorem 2.23 extends (from a closed interval to a compact met-
ric space) Guido Ascoli’s theorem – every bounded equi-continuous sequence
of real-valued functions on the unit interval [0, 1] has a uniformly convergent
subsequence.

2.4 Contractions
In general, a fixed point of a mapping $f$ from a set $X$ to itself is a point $x_0 \in X$ such that $f(x_0) = x_0$. For example, $0$ is a fixed point of the mapping $f(x) = x^2 + x$ from $\mathbb{R}$ to $\mathbb{R}$. But not all mappings have fixed points: for instance, the mapping $f(x) = x + 1$ has no fixed point on $\mathbb{R}$. So, which mappings have a fixed point is an interesting question. In what follows, we introduce an important tool, the so-called Banach fixed point theorem (named after Stefan Banach), which not only ensures the existence and uniqueness of fixed points of certain self mappings of metric spaces, but also provides a constructive method to find those fixed points.
Definition 2.25. Let $(X, d_X)$ be a metric space. A mapping $f$ from $X$ to itself is a contraction if there is an $\alpha \in (0, 1)$ such that
\[
d_X(f(x_1), f(x_2)) \le \alpha\, d_X(x_1, x_2) \quad \text{for all } x_1, x_2 \in X.
\]
A contraction mapping contracts or shrinks the distance between points
by the factor α. Clearly, any contraction map is uniformly continuous on X.
This property actually suggests the following Banach’s theorem on contraction
mappings and fixed points.
Theorem 2.26. Let (X, dX ) be a complete metric space. If f : X → X is a
contraction mapping, then f has a unique fixed point.
Proof. Let $x_0 \in X$. Define $x_{k+1} = f(x_k)$ for $k \in \mathbb{N} \cup \{0\}$. We claim that $(x_k)_{k=1}^\infty$ is a Cauchy sequence in $X$. In fact
\[
d_X(x_2, x_1) = d_X(f(x_1), f(x_0)) \le \alpha\, d_X(x_1, x_0),
\]
and so
\[
d_X(x_3, x_2) = d_X(f(x_2), f(x_1)) \le \alpha\, d_X(x_2, x_1) \le \alpha^2 d_X(x_1, x_0).
\]
Generally, one has that if $k > j$ then
\[
d_X(x_k, x_j) \le \Big(\sum_{i=j}^{k-1} \alpha^i\Big) d_X(x_1, x_0) \le \Big(\frac{\alpha^j}{1 - \alpha}\Big) d_X(x_1, x_0).
\]
This, together with $\alpha \in (0, 1)$, implies $(x_k)_{k=1}^\infty$ is a Cauchy sequence and hence it converges to a point $x \in X$: $\lim_{k \to \infty} x_k = x$ in $X$. Since $f$ is uniformly continuous, one has
\[
f(x) = \lim_{k \to \infty} f(x_k) = \lim_{k \to \infty} x_{k+1} = x;
\]
that is to say, $x$ is a fixed point.

Regarding the uniqueness, suppose $y$ is also a fixed point of $f$. Then
\[
d_X(x, y) = d_X(f(x), f(y)) \le \alpha\, d_X(x, y)
\]
and hence $d_X(x, y) = 0$; that is, $x = y$, thanks to $\alpha \in (0, 1)$.

Remark 2.27. The above proof is constructive in the sense that the fixed point $x$ is the limit of the iterates given by $x_{k+1} = f(x_k)$, where the initial point $x_0$ is an arbitrary point in $X$. The previous estimate gives the rapidity of the convergence $x_k \to x$:
\[
d_X(x, x_j) \le \Big(\frac{\alpha^j}{1 - \alpha}\Big) d_X(f(x_0), x_0).
\]
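
The iteration and the error estimate of Remark 2.27 can be tried on a concrete contraction. The following Python sketch is an added illustration, not taken from the notes: it assumes the mapping $f = \cos$ on the complete space $[0, 1]$, which maps $[0, 1]$ into itself and, by the mean value theorem, is a contraction with $\alpha = \sin 1 < 1$. The helper name `iterate` is hypothetical.

```python
import math

def iterate(f, x0, steps):
    """Return the list [x_0, x_1, ..., x_steps] with x_{k+1} = f(x_k)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(f(xs[-1]))
    return xs

alpha = math.sin(1.0)            # |cos'(x)| = |sin x| <= sin(1) < 1 on [0, 1]
xs = iterate(math.cos, 0.5, 200)
x_star = xs[-1]                  # numerical stand-in for the fixed point x

# A priori estimate: d(x, x_j) <= alpha**j / (1 - alpha) * d(f(x_0), x_0).
first_step = abs(xs[1] - xs[0])
bound_holds = all(
    abs(x_star - xs[j]) <= alpha ** j / (1 - alpha) * first_step + 1e-12
    for j in range(50)
)

print(x_star, abs(math.cos(x_star) - x_star), bound_holds)
```

The bound is usually pessimistic, because $\alpha$ is a worst-case constant over the whole space, while the actual rate near the fixed point is governed by the local behaviour of $f$.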
Below is an interesting example which shows an application of Theorem
2.26 in integral equations.

Example 2.28. For a finite interval $[a, b]$ of $\mathbb{R}$, let $k : [a, b] \times [a, b] \to \mathbb{R}$ be continuous and $\phi \in C[a, b]$. An equation of the type
\[
f(x) = \lambda \int_a^b k(x, y) f(y)\,dy + \phi(x), \quad \lambda \in \mathbb{R},
\]
where $f$ is an unknown function, is called a Fredholm integral equation of the second kind. This equation, introduced by Ivar Fredholm, has a unique solution $f \in C[a, b]$ for certain $\lambda$.

To check this example, define a mapping $T : C[a, b] \to C[a, b]$ by
\[
T(f)(x) = \lambda \int_a^b k(x, y) f(y)\,dy + \phi(x), \quad \lambda \in \mathbb{R}.
\]
We write the distance function on $C[a, b]$ as
\[
d_{C[a,b]}(f_1, f_2) = \max_{x \in [a,b]} |f_1(x) - f_2(x)| \quad \text{for all } f_1, f_2 \in C[a, b].
\]
Solving the desired equation is equivalent to showing that $T$ has a fixed point. Now
\[
d_{C[a,b]}(T(f_1), T(f_2)) \le |\lambda| \max_{x,y \in [a,b]} |k(x, y)|\,(b - a)\, d_{C[a,b]}(f_1, f_2).
\]
Thus, if $|\lambda| \max_{x,y \in [a,b]} |k(x, y)|\,(b - a) < 1$, then $T$ is a contraction and, therefore, has a unique fixed point by Theorem 2.26.
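
The contraction argument of Example 2.28 also suggests a numerical scheme: discretize the integral and iterate $T$. The Python sketch below is an illustration under stated assumptions rather than a method from the notes. It assumes the kernel $k(x, y) = xy$, $\phi(x) = x$, $\lambda = 1/2$ on $[0, 1]$ (so that $|\lambda| \max |k| (b - a) = 1/2 < 1$) and replaces the integral by the trapezoid rule; the function name `solve_fredholm` is hypothetical. For this choice the exact solution is $f(x) = 6x/5$, as one checks by substituting $f(x) = cx$.

```python
def solve_fredholm(kernel, phi, lam, a, b, n=100, iterations=60):
    """Iterate f <- T(f) on a uniform grid, with the integral replaced by the trapezoid rule."""
    h = (b - a) / n
    xs = [a + i * h for i in range(n + 1)]
    weights = [h] * (n + 1)
    weights[0] = weights[-1] = h / 2          # trapezoid end-point weights
    f = [phi(x) for x in xs]                  # initial point f_0 = phi
    for _ in range(iterations):
        f = [
            lam * sum(w * kernel(x, y) * fy for w, y, fy in zip(weights, xs, f)) + phi(x)
            for x in xs
        ]
    return xs, f

# Assumed test problem: k(x, y) = x*y, phi(x) = x, lambda = 1/2 on [0, 1].
xs, f = solve_fredholm(lambda x, y: x * y, lambda x: x, 0.5, 0.0, 1.0)
print(f[-1])   # value at x = 1, close to the exact value 6/5
```

The discretized operator is itself a contraction on the grid values here, so the iterates converge for any starting function; the remaining error comes from the quadrature rule.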
As a consequence of Theorem 2.26, we can obtain a more general fixed
point theorem.

Corollary 2.29. Let $(X, d_X)$ be a complete metric space, let $f : X \to X$ be a mapping, and let $n - 1 \in \mathbb{N}$. If the $n$-th iteration $f_{(n)}(x) = \underbrace{f(f(f(\cdots(f(x)))))}_{n}$ is a contraction, then $f$ has a unique fixed point.

Proof. Since $f_{(n)}$ is a contraction mapping, it has a unique fixed point $x$ by Theorem 2.26. If $\alpha$ is the contraction constant for $f_{(n)}$, then
\[
d_X(f(x), x) = d_X\big(f(f_{(n)}(x)), f_{(n)}(x)\big) = d_X\big(f_{(n)}(f(x)), f_{(n)}(x)\big) \le \alpha\, d_X(f(x), x)
\]
and hence $f(x) = x$.

If $y$ is another fixed point of $f$ then it is also a fixed point of $f_{(n)}$ since
\[
f_{(n)}(y) = f_{(n-1)}(f(y)) = f_{(n-1)}(y) = \cdots = y.
\]
The uniqueness of the fixed point of $f_{(n)}$ implies $y = x$.

Remark 2.30. In Corollary 2.29, $f$ need not be a contraction. In fact, there is a discontinuous function $f$ from $[0, 1]$ to itself such that its second iteration becomes a contraction. For instance, if
\[
f(x) =
\begin{cases}
\frac{1}{4}, & \text{if } x \in [0, 1/2] \\
\frac{1}{2}, & \text{if } x \in (1/2, 1],
\end{cases}
\]
then $f_{(2)}(x) = 1/4$ for $x \in [0, 1]$.
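
The claim of Remark 2.30 is easy to confirm by direct evaluation. The short Python sketch below is an added illustration (the grid of sample points is a choice made here): it applies the step function twice and observes that the second iterate is constant, hence trivially a contraction, although $f$ itself jumps at $1/2$.

```python
def f(x):
    """Step function of Remark 2.30 on [0, 1]."""
    return 0.25 if x <= 0.5 else 0.5

grid = [i / 100 for i in range(101)]          # sample points of [0, 1]
second_iterate = [f(f(x)) for x in grid]      # f(x) is 1/4 or 1/2, both in [0, 1/2]

print(set(second_iterate))                    # a single value: 1/4
print(f(0.5), f(0.51))                        # jump of f at x = 1/2
```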

Example 2.31. Consider the following Volterra integral equation (VIE), named after Vito Volterra:
\[
f(x) = \lambda \int_a^x k(x, y) f(y)\,dy + \phi(x),
\]
for which the notations are the same as in Example 2.28. If $T : C[a, b] \to C[a, b]$ is defined by
\[
T(f)(x) = \lambda \int_a^x k(x, y) f(y)\,dy + \phi(x),
\]
then the VIE has a unique solution in $C[a, b]$ for any $\lambda \in \mathbb{R}$.

To see this assertion, it is enough to show that $T$ in the above has a unique fixed point in $C[a, b]$. A simple computation implies that if $f_1, f_2 \in C[a, b]$ and $M = \max_{x,y \in [a,b]} |k(x, y)|$ then
\[
|T(f_1)(x) - T(f_2)(x)| \le |\lambda| M\, d_{C[a,b]}(f_1, f_2)\,(x - a),
\]
\[
|T_{(2)}(f_1)(x) - T_{(2)}(f_2)(x)| \le (|\lambda| M)^2\, d_{C[a,b]}(f_1, f_2)\, \frac{(x - a)^2}{2},
\]
and
\[
|T_{(n)}(f_1)(x) - T_{(n)}(f_2)(x)| \le (|\lambda| M)^n\, d_{C[a,b]}(f_1, f_2)\, \frac{(x - a)^n}{n!}.
\]
Hence
\[
d_{C[a,b]}(T_{(n)}(f_1), T_{(n)}(f_2)) \le \frac{(|\lambda| (b - a) M)^n}{n!}\, d_{C[a,b]}(f_1, f_2).
\]
Because $\lim_{n \to \infty} (|\lambda| (b - a) M)^n / n! = 0$, $T_{(n)}$ is a contraction mapping for large $n$ and, therefore, $T$ has a unique fixed point for any value of the parameter $\lambda$ by Corollary 2.29.
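
How large $n$ must be in Example 2.31 depends only on the number $c = |\lambda| (b - a) M$. The Python sketch below is an added illustration: it searches for the first $n$ with $c^n / n! < 1$, which is the first iterate that the estimate above certifies as a contraction. The function name `first_contracting_power` and the sample values of $c$ are assumptions made for this example.

```python
from math import factorial

def first_contracting_power(c):
    """Smallest n >= 1 with c**n / n! < 1, where c stands for |lambda| (b - a) M."""
    n = 1
    while c ** n / factorial(n) >= 1:
        n += 1
    return n

print(first_contracting_power(0.5))   # T itself is already certified
print(first_contracting_power(5.0))   # a higher iterate is needed
```

For $c < 1$ the mapping $T$ itself is certified as a contraction, which is the Fredholm-type situation of Example 2.28; for larger $c$ only a higher iterate is, and Corollary 2.29 is what transfers the fixed point back to $T$.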
As a consequence of Remark 2.27, we consider a family of contractions
that vary continuously with respect to a parameter and prove that the corre-
sponding fixed points also vary continuously.

Corollary 2.32. Let $(X, d_X)$ be a complete metric space and let $(Y, d_Y)$ be a metric space. Suppose $f : Y \times X \to X$ is such that $f(\cdot, x)$ is continuous on $Y$ for any point $x \in X$ and there exists $\alpha \in (0, 1)$ such that
\[
d_X(f(y, x_1), f(y, x_2)) \le \alpha\, d_X(x_1, x_2) \quad \text{for all } y \in Y \text{ and } x_1, x_2 \in X.
\]
For $y \in Y$ let $x_y \in X$ be the unique fixed point of the contraction $f(y, \cdot)$. Then the mapping $y \mapsto x_y$ is continuous from $Y$ to $X$.

Proof. For $y \in Y$, consider the equation $x = f(y, x)$. Let $y_0 \in Y$. Construct a solution of the equation starting with the initial point $x_0 = x_{y_0}$ so that $x_{k+1} = f(y, x_k) \to x_y$. From Remark 2.27 it follows that
\[
d_X(x_{y_0}, x_y) \le \frac{d_X(x_{y_0}, f(y, x_{y_0}))}{1 - \alpha} = \frac{d_X(f(y_0, x_{y_0}), f(y, x_{y_0}))}{1 - \alpha}
\]
so that the mapping $y \mapsto x_y$ is continuous at $y_0$ since the mapping $y \mapsto f(y, x_{y_0})$ is continuous at $y_0$. We are done.

As a further application of the contraction mapping theorem, we give the following existence and uniqueness theorem of Charles Émile Picard for a first-order system of nonlinear ordinary differential equations, based on the continuity condition named after Rudolf Lipschitz.

Theorem 2.33. For $n \in \mathbb{N}$ equip $\mathbb{R}^n$ with the distance
\[
d_n(x, y) = \|x - y\|_n = \sqrt{\sum_{k=1}^n (x_k - y_k)^2}
\]
between $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$ in $\mathbb{R}^n$. Let $D \subseteq \mathbb{R}^n$ be open and $[a, b] \subseteq \mathbb{R}$ be a finite interval. Suppose $f : [a, b] \times D \to \mathbb{R}^n$ is continuous and satisfies a Lipschitz condition with the constant $\kappa > 0$ below:
\[
d_n(f(t, y_1), f(t, y_2)) \le \kappa\, d_n(y_1, y_2) \quad \text{for all } (t, y_1), (t, y_2) \in [a, b] \times D.
\]
Then, there exists a $\delta \in (a, b]$ such that the initial value problem (IVP):
\[
\frac{du(t)}{dt} = f(t, u(t)), \quad u(a) = y_0 \in D
\]
has a unique solution in the interval $[a, \delta]$.

Proof. Clearly, solving this IVP is equivalent to solving the integral equation
\[
u(t) = y_0 + \int_a^t f(s, u(s))\,ds.
\]
So, it is our aim to show the existence and uniqueness of a solution to this integral equation.
Assuming that $\overline{B}_r(y_0) \subseteq D$ is the closed ball centered at $y_0$ with radius $r > 0$, we have
\[
M = \max_{(t,y) \in [a,b] \times \overline{B}_r(y_0)} \|f(t, y)\|_n < \infty.
\]
Now choose $\delta \in (a, b]$ such that $(\delta - a)\kappa < 1$ and $(\delta - a)M \le r$. Let $S$ be the class of continuous mappings from $[a, \delta]$ to $\overline{B}_r(y_0)$. It is clear that $S$ is complete under the distance defined by
\[
d_S(\phi_1, \phi_2) = \max_{t \in [a,\delta]} d_n(\phi_1(t), \phi_2(t)) \quad \text{for all } \phi_1, \phi_2 \in S.
\]
Define a mapping $F : S \to S$ by
\[
F(\phi)(t) = y_0 + \int_a^t f(s, \phi(s))\,ds.
\]
Obviously, $F(\phi)$ is continuously differentiable. If $\phi \in S$ and $t \in [a, \delta]$, then
\[
d_n(F(\phi)(t), y_0) = \Big\| \int_a^t f(s, \phi(s))\,ds \Big\|_n \le M |t - a| \le (\delta - a) M \le r.
\]
This implies $F(\phi) \in S$. Now, our problem amounts to showing that $F$ has a unique fixed point in $S$. Thus, it suffices to verify that $F$ is a contraction. To do so, we note that $t \in [a, \delta]$ and $\phi_1, \phi_2 \in S$ imply
\[
d_n(F(\phi_1)(t), F(\phi_2)(t)) \le \int_a^t \big\| f(s, \phi_1(s)) - f(s, \phi_2(s)) \big\|_n\,ds \le (\delta - a)\kappa\, d_S(\phi_1, \phi_2),
\]
yielding
\[
d_S(F(\phi_1), F(\phi_2)) \le (\delta - a)\kappa\, d_S(\phi_1, \phi_2).
\]
Accordingly, the assertion follows.
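
The mapping $F$ in the proof can be iterated numerically, which is the classical Picard iteration. The following Python sketch is an added illustration under stated assumptions: it treats the scalar case $n = 1$ with the test problem $u' = u$, $u(0) = 1$ on $[0, 1/2]$ (so $\kappa = 1$ and $(\delta - a)\kappa = 1/2 < 1$), and it approximates the integral by the cumulative trapezoid rule on a uniform grid. The function name `picard_iterates` is hypothetical.

```python
import math

def picard_iterates(f, a, y0, delta, n_grid=500, n_iter=30):
    """Apply phi <- F(phi), F(phi)(t) = y0 + int_a^t f(s, phi(s)) ds, on a uniform grid of [a, delta]."""
    h = (delta - a) / n_grid
    ts = [a + i * h for i in range(n_grid + 1)]
    phi = [y0] * (n_grid + 1)                 # start from the constant mapping phi_0 = y0
    for _ in range(n_iter):
        g = [f(t, p) for t, p in zip(ts, phi)]
        new_phi = [y0]
        for i in range(1, n_grid + 1):
            # cumulative trapezoid rule for the integral from a to ts[i]
            new_phi.append(new_phi[-1] + 0.5 * h * (g[i - 1] + g[i]))
        phi = new_phi
    return ts, phi

# Assumed test problem: u' = u, u(0) = 1, whose exact solution is exp(t).
ts, phi = picard_iterates(lambda t, u: u, 0.0, 1.0, 0.5)
print(phi[-1], math.exp(0.5))
```

Starting from the constant mapping, the exact Picard iterates for this test problem are the Taylor polynomials of $e^t$, so the computed values approach $e^{1/2}$ up to the quadrature error.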
Remark 2.34.
(i) Note that any existence theorem for the nonlinear IVP in the above must be local in nature. For example,
\[
y'(t) = y^2(t), \quad y(1) = -1
\]
has the solution $y(t) = -1/t$, which is not defined at $t = 0$ even though $f(t, y) = y^2$ satisfies the required condition on any $D \subseteq \mathbb{R}$ which is assumed to be open and bounded.
(ii) If the Lipschitz condition is dropped, then it is still possible for the IVP to have one solution or even many solutions. For instance, the IVP:
\[
y'(t) = y^{1/3}(t), \quad y(0) = 0
\]
has an infinite number of solutions
\[
y_c(t) =
\begin{cases}
0, & 0 \le t \le c \\
\Big(\dfrac{2(t - c)}{3}\Big)^{3/2}, & c < t \le 1,
\end{cases}
\]
where $c \in [0, 1]$.

2.5 Equivalence of Metric Spaces


It is often necessary to compare two metric spaces and decide in what sense
they are equivalent or to analyze how the structure of a metric space changes
when changing the metric. Note that every metric space is a set with additional
topological structure induced by the metric. So to decide in what sense two
metric spaces are equivalent we have to use continuous mappings between
them.

Definition 2.35. Let (X, dX ) and (Y, dY ) be two metric spaces. Then they
are called:
(i) Topologically isomorphic or homeomorphic provided there exists a home-
omorphism between them – a mapping f : X → Y which is bijective and
continuous, and has continuous inverse;
(ii) Uniformly isomorphic provided there exists a uniform isomorphism be-
tween them – a mapping f : X → Y which is bijective and uniformly contin-
uous, and has uniformly continuous inverse;
(iii) Isometrically isomorphic provided there exists a bijective mapping $f : X \to Y$ such that $d_Y(f(x_1), f(x_2)) = d_X(x_1, x_2)$ for all $x_1, x_2 \in X$;
(iv) Similar provided there exist a positive constant $\kappa > 0$ and a bijective mapping $f : X \to Y$ such that $d_Y(f(x_1), f(x_2)) = \kappa\, d_X(x_1, x_2)$ for all $x_1, x_2 \in X$;
(v) Equivalent provided as sets $X = Y$ and there exist two positive constants $\kappa_1$ and $\kappa_2$ independent of all $x_1, x_2 \in X$ such that
\[
\kappa_1 d_X(x_1, x_2) \le d_Y(x_1, x_2) \le \kappa_2 d_X(x_1, x_2).
\]

The following example helps us grasp Definition 2.35.

Example 2.36.
(i) Let $f : [-\frac{\pi}{2}, \frac{\pi}{2}] \to [-1, 1]$ be defined by $f(x) = \sin x$. Then it is a uniform isomorphism, and hence any two finite closed subintervals of $\mathbb{R}$ are uniformly isomorphic.

(ii) For $x = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_n) \in \mathbb{R}^n$ and $n \in \mathbb{N}$, recall
\[
d_n(x, y) = \|x - y\|_n = \sqrt{\sum_{k=1}^n |x_k - y_k|^2}
\]
and let
\[
d_\infty(x, y) = \|x - y\|_\infty = \max\{|x_k - y_k| : k = 1, \dots, n\}.
\]
Then
\[
n^{-1/2}\, d_n(x, y) \le d_\infty(x, y) \le d_n(x, y).
\]
This means that $(\mathbb{R}^n, d_n)$ and $(\mathbb{R}^n, d_\infty)$ are equivalent. Of course, $d_n$ and $d_\infty$ define the same notions of continuity and convergence and do not need to be distinguished for most purposes. In other words, the identity mapping is a uniform isomorphism from $(\mathbb{R}^n, d_n)$ to $(\mathbb{R}^n, d_\infty)$.
(iii) In $(\mathbb{R}^n, d_n)$, an isometrically isomorphic mapping is a rotation (the movement of a body in such a way that the distance between a certain fixed point and any given point of that body remains constant), a reflection (to invert a geometric figure with respect to a line or plane, but not a point), a translation (to move every point by a fixed distance in the same direction), or a composition of these.
(iv) $(\mathbb{R}^n, d_n)$ is similar to itself. This is because the mapping $f(x) = \kappa x + x_0$, where $\kappa > 0$ is constant and $x_0 \in \mathbb{R}^n$, is bijective and satisfies $d_n(f(x), f(y)) = \kappa\, d_n(x, y)$ for any $x, y \in \mathbb{R}^n$.
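
The comparison between the Euclidean and the maximum distance in part (ii) can be spot-checked on random points. The Python sketch below is an added illustration: it tests the standard inequality $n^{-1/2} d_n(x, y) \le d_\infty(x, y) \le d_n(x, y)$, written equivalently as $d_\infty \le d_n \le \sqrt{n}\, d_\infty$. The dimension, sample size and function names are assumptions made for this example, and a random test is of course no proof.

```python
import math
import random

def d_euclid(x, y):
    """Euclidean distance d_n on R^n."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def d_max(x, y):
    """Maximum distance d_infinity on R^n."""
    return max(abs(a - b) for a, b in zip(x, y))

random.seed(0)
n = 5
violations = 0
for _ in range(1000):
    x = [random.uniform(-1, 1) for _ in range(n)]
    y = [random.uniform(-1, 1) for _ in range(n)]
    de, dm = d_euclid(x, y), d_max(x, y)
    # d_inf <= d_n <= sqrt(n) * d_inf, up to rounding
    if not (dm <= de + 1e-12 and de <= math.sqrt(n) * dm + 1e-12):
        violations += 1

print(violations)
```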
Both a uniform isomorphism and an isometric isomorphism are certainly
homeomorphisms which are not only open mappings (from open sets to open
sets) but also closed mappings (from closed sets to closed sets). Intuitively,
a homeomorphism not only maps points in the first object that are close
together to points in the second object that are close together, but also sends
points in the first object that are not close together to points in the second
object that are not close together. Topology is the study of those properties
of objects that do not change when homeomorphisms are applied. Below is a
classical and important result.
Theorem 2.37. For $n + 1 \in \mathbb{N}$ let $\mathbb{R}^{n+1}$ be equipped with the Euclidean distance
\[
d_{n+1}(x, y) = \|x - y\|_{n+1} = \sqrt{\sum_{k=1}^{n+1} (x_k - y_k)^2}
\]
for $x = (x_1, \dots, x_{n+1})$ and $y = (y_1, \dots, y_{n+1})$ in $\mathbb{R}^{n+1}$.
(i) If $n = 0$, then $\mathbb{R}^{n+1}$ is homeomorphic to the open interval $(-1, 1) \subseteq \mathbb{R}$.
(ii) If $n \ge 1$, then $\mathbb{R}^n$ is homeomorphic to the punctured sphere $S^n \setminus \{(0, \dots, 0, 1)\}$, where
\[
S^n = \{x \in \mathbb{R}^{n+1} : \|x\|_{n+1} = d_{n+1}(x, 0) = 1\}
\]
is the compact unit sphere in $\mathbb{R}^{n+1}$.

Proof. To check (i), we just take $f(x) = \frac{2}{\pi} \arctan x$, which maps $\mathbb{R}$ onto $(-1, 1)$.

As for (ii), we define the mapping $f : S^n \setminus \{(0, \dots, 0, 1)\} \to \mathbb{R}^n$ by letting
\[
y = (y_1, \dots, y_n) = f(x_1, \dots, x_{n+1}) = f(x)
\]
be the point in $\mathbb{R}^n$ such that
\[
y_k = \frac{x_k}{1 - x_{n+1}} \quad \text{for } k = 1, 2, \dots, n.
\]
Next we define $h : \mathbb{R}^n \to \mathbb{R}^{n+1}$ by letting $x = h(y)$ be the point in $\mathbb{R}^{n+1}$ such that
\[
x_k = \frac{2 y_k}{1 + d_n(y, 0)^2} \quad \text{for } k = 1, \dots, n, \quad \text{and} \quad x_{n+1} = 1 - \frac{2}{1 + d_n(y, 0)^2}.
\]
Note that $d_{n+1}(h(y), 0) = 1$ and $x_{n+1} \ne 1$. So, $h(\mathbb{R}^n) \subseteq S^n \setminus \{(0, \dots, 0, 1)\}$. It is easy to see that both $f$ and $h$ are continuous and satisfy $f \circ h(y) = y$ and $h \circ f(x) = x$. Thus, $h$ is the inverse of $f$, and $f$ is a homeomorphism of $S^n \setminus \{(0, \dots, 0, 1)\}$ onto $\mathbb{R}^n$.
Regarding the compactness of $S^n$, we define a function $F : \mathbb{R}^{n+1} \to \mathbb{R}$ via $F(x) = \|x\|_{n+1}$. The triangle inequality for $d_{n+1}$ yields that $F$ is continuous. With this and the closedness of the single point set $\{1\}$, we see that $S^n = F^{-1}(\{1\})$ is closed. Since $d_{n+1}(0, x) = 1$ for $x \in S^n$, $S^n$ is bounded. Being closed and bounded in $\mathbb{R}^{n+1}$, $S^n$ is therefore compact.
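
The two mappings in the proof of (ii) form the stereographic projection and its inverse, and their formulas can be checked numerically. The Python sketch below is an added illustration; the sample point $y \in \mathbb{R}^3$ is chosen here, and the names `f` and `h` simply mirror the proof. It confirms that $h(y)$ lies on the unit sphere, that its last coordinate differs from $1$, and that $f(h(y)) = y$.

```python
import math

def f(x):
    """Projection from the punctured sphere: y_k = x_k / (1 - x_{n+1})."""
    *head, last = x
    return [xk / (1 - last) for xk in head]

def h(y):
    """Inverse mapping: x_k = 2 y_k / (1 + |y|^2), x_{n+1} = 1 - 2 / (1 + |y|^2)."""
    s = 1 + sum(yk * yk for yk in y)
    return [2 * yk / s for yk in y] + [1 - 2 / s]

y = [0.3, -1.2, 2.0]                       # a sample point of R^3
x = h(y)                                   # a point of S^3 in R^4
norm_x = math.sqrt(sum(v * v for v in x))
y_back = f(x)

print(norm_x, x[-1], y_back)
```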

Problems

2.1. Let $f : \mathbb{R}^3 \to \mathbb{R}^2$ be given by $f(x) = (x_1, x_2)$ for $x = (x_1, x_2, x_3) \in \mathbb{R}^3$. Prove that $f$ is continuous on $\mathbb{R}^3$.

2.2. Given f (x) = dX (x, 0) for every point x in the subset X \ {0} of a metric
space X with distance dX , define f (0) so that f is continuous at 0.
2.3. For $n \in \mathbb{N}$ let $f : \mathbb{R}^n \to \mathbb{R}$ be given by $f(x) = \big(\sum_{i=1}^n x_i^2\big)^{1/2}$ for $x = (x_1, \dots, x_n) \in \mathbb{R}^n$, and let $S = (-3, -2)$. Find $f^{-1}(S)$.

2.4. Define $D = \{x = (x_1, x_2, x_3) \in \mathbb{R}^3 : 0 < x_3 < x_1^2 + x_2^2\}$ and let $f$ be the mapping from $\mathbb{R}^3$ into $\mathbb{R}$ given by
\[
f(x) =
\begin{cases}
1, & x \in D \\
0, & x \notin D.
\end{cases}
\]
Show that $f(x^{(k)})$ tends to $f(0)$ as $x^{(k)}$ approaches $0$ along any linear path, but $f$ is discontinuous at $0$.

2.5. Let T be the linear transformation from R2 into itself such that
 
T (1, 1) = (3, −1) and T (1, −1) = (1, 7).

Find the matrix representation of T .

2.6. Given $n \in \mathbb{N}$. If $K$ is a compact subset of $\mathbb{R}^n$, $f : K \to \mathbb{R}$ is continuous, and $f(x) > 0$ for any $x \in K$, show that there is a $\delta > 0$ such that $f(x) \ge \delta$ for any $x \in K$.

2.7. Show that the function $f : [0, 1] \to [0, 1]$, defined by
\[
f(x) =
\begin{cases}
x, & x \in \mathbb{Q} \cap [0, 1] \\
1 - x, & x \in (\mathbb{R} \setminus \mathbb{Q}) \cap [0, 1],
\end{cases}
\]
is onto and thus satisfies the intermediate value property. Show, however, that $f$ is continuous only at $x = 1/2$.

2.8. Define
\[
T : C[0, 1] \to C[0, 1] \quad \text{by} \quad T(f)(x) = \int_0^x f(t)\,dt.
\]
Verify the following three facts:
(i) $T$ is not a contraction;
(ii) $T$ has a unique fixed point;
(iii) $T \circ T$ is a contraction.

2.9. Let $f : \mathbb{R} \to \mathbb{R}$ be differentiable with $|f'(x)| \le \alpha$, where $\alpha \in (0, 1)$. Prove that $f$ is a contraction mapping.

2.10. For a complete metric space $(X, d_X)$, let $T : X \to X$ be a mapping. If
\[
\sum_{k=1}^\infty \sup_{x_1 \ne x_2} \frac{d_X(T_{(k)}(x_1), T_{(k)}(x_2))}{d_X(x_1, x_2)} < \infty,
\]
prove that $T$ has a unique fixed point, where $T_{(k)} = \underbrace{T \circ \cdots \circ T}_{k}$.

2.11. Let $(X, d_X)$ be a metric space. Suppose $K$ is a nonempty compact subset of $X$. If the mapping $T : K \to K$ satisfies
\[
d_X(T(x_1), T(x_2)) < d_X(x_1, x_2) \quad \text{for all } x_1 \ne x_2 \text{ in } K,
\]
prove that there exists precisely one point $x \in K$ such that $x = T(x)$.

2.12. Let $h \in C[0, 1]$. Show that there is an $f \in C[0, 1]$ such that
\[
f(x) - \int_0^x f(x - t) \exp(-t^2)\,dt = h(x).
\]

2.13. Let $(X, d_X)$ be a complete metric space, and let
\[
f : D \subseteq X \to X = f(D)
\]
be a mapping. Suppose there is a number $\kappa > 1$ such that
\[
d_X(f(x_1), f(x_2)) \ge \kappa\, d_X(x_1, x_2) \quad \text{for all } x_1, x_2 \in D.
\]
Prove that there exists exactly one point $x \in D$ such that $f(x) = x$.

2.14. Let (X, dX ) and (Y, dY ) be two metric spaces and f : X → Y bijective.
Prove that the following three conditions are equivalent:
(i) f is a homeomorphism;
(ii) f is continuous, and maps closed subsets of X to closed subsets of Y ;
(iii) f is continuous and maps open subsets of X to open subsets of Y .

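2.15. Suppose $\mathbb{R}$ is equipped with the usual Euclidean distance. If $f : \mathbb{R} \to \mathbb{R}$ is defined by
\[
f(x) =
\begin{cases}
x + \frac{2}{3}, & x \in (-\infty, \frac{1}{2}] \\
\frac{1}{x}, & x \in (\frac{1}{2}, \infty),
\end{cases}
\]
prove that $f$ is not a homeomorphism.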

2.16. Let (X, dX ) be a metric space. Prove that there is a complete metric
space (Y, dY ) and a mapping f : X → Y such that f is an isometry, i.e.,

\[
d_Y\big(f(x_1), f(x_2)\big) = d_X(x_1, x_2) \quad \text{for all } x_1, x_2 \in X,
\]

and f (X) is a dense subset of Y .

2.17. Let (X, dX ) be a metric space and let K be a nonempty closed subset
of X. If f : X → R is defined by f (x) = inf y∈K dX (x, y), prove that f is
continuous, and that f (x) = 0 if and only if x ∈ K.
3
Normed Linear Spaces

Rather than examining functions one at a time, functional analysis begins by studying the common and distinct properties of various spaces of functions. It can also be viewed as arising from the interplay between linear algebra and topology (modelled on the Euclidean spaces), which results in the notion of a normed linear space. Accordingly, we begin our discussion with this fundamental concept which, when combined with increasingly specialized assumptions, leads to increasingly specialized spaces and results. Our investigation involves only linear functional analysis, whose relation to its developing nonlinear counterpart is similar to, and arises from, the connection between the Euclidean spaces and manifolds.

3.1 Linear Spaces, Norms and Quotient Spaces


From now on, we always use F as either R or C = R + iR – the field of real
or complex numbers.

Definition 3.1. A linear (or vector) space over F is a set X equipped with
two functions

+: X ×X →X and ·: F×X →X

with the properties below:


(i) x + y = y + x for all x, y ∈ X;
(ii) (x + y) + z = x + (y + z) for all x, y, z ∈ X;
(iii) There exists 0 ∈ X such that x + 0 = 0 + x = x for all x ∈ X;
(iv) For each x ∈ X there exists −x ∈ X such that x + (−x) = (−x) + x = 0;
(v) α · (β · x) = (αβ) · x for all α, β ∈ F, x ∈ X;
(vi) (α + β) · x = (α · x) + (β · x) for all α, β ∈ F, x ∈ X;
(vii) α · (x + y) = (α · x) + (α · y) for all α ∈ F, x, y ∈ X;

(viii) 1 · x = x for all x ∈ X where 1 is the multiplicative identity in F.


Although our main purpose is to study linear spaces over R or C, we
frequently state that the spaces are over F whenever both R and C may be
handled in the same way.
Example 3.2.
(i) For n ∈ N, Fn is a linear space with the usual vector addition and scalar
multiplication over F.
(ii) Let X be the set of all polynomials with coefficients in R of degree less
than n ∈ N. Then X is a linear space with usual addition of polynomials and
scalar multiplication over R.
(iii) For m, n ∈ N let Mm,n (C) be the set of all complex-valued m×n matrices.
Then Mm,n (C) is a linear space with usual addition of matrices and scalar
multiplication over C.
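(iv) Recall that $\ell_\infty$ represents the set of infinite real-valued sequences $(x_j)_{j=1}^\infty$ which are bounded: $\sup_{j\in\mathbb{N}} |x_j| < \infty$. Then $\ell_\infty$ is a linear space over R with:

\[
\sup_{j\in\mathbb{N}} |x_j + y_j| \le \sup_{j\in\mathbb{N}} |x_j| + \sup_{j\in\mathbb{N}} |y_j|; \qquad \sup_{j\in\mathbb{N}} |\alpha x_j| = |\alpha| \sup_{j\in\mathbb{N}} |x_j| < \infty.
\]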

(v) Let C(S, R) be the class of all continuous functions f : S → R with

(f + g)(x) = f (x) + g(x) and (α · f )(x) = αf (x).

Here S is any nonempty subset of R. Then C(S, R) is a linear space over R.


(vi) Let $C^\infty[a, b]$ be the space of all infinitely differentiable real-valued functions on the finite interval [a, b] of R. Then it is a linear space over R with the same addition and scalar multiplication as in (v).
(vii) For any closed interval [a, b] ⊂ R and any increasing function g : R → R,
the classes R[a, b], RSg [a, b] and LRSg ([a, b]) are also linear spaces over R with
the same addition and scalar multiplication as in (v).
For convenience, we will drop the special notation +, · for vector addition
and scalar multiplication, and simply refer to X as a linear space over F.
Moreover, if F = R then we will say that X is a real linear space; whereas if
F = C, then we will say that X is a complex linear space.
As in the linear algebra of finite-dimensional vector spaces, we have the
concept of a linear subspace which is often called simply a subspace (when
the context serves to distinguish it from other kinds of subspaces) and plays
an important role in functional analysis and related fields of mathematics.
Definition 3.3. Let X be a linear space over F.
(i) A subset Y ⊆ X is called a linear subspace of X provided that

α, β ∈ F and x, y ∈ Y imply αx + βy ∈ Y.

(ii) If $Y = \{x + x_0 : x \in S\}$ where S is a linear subspace of X and $x_0$ is a fixed element of X, then Y is called an affine subset of X.

Example 3.4.
(i) For n − 2 ∈ N, the vectors in Rn of the form (x1 , x2 , x3 , 0, ..., 0) form a
linear subspace of Rn .
(ii) For m, n ∈ N and m ≤ n, the set of polynomials of degree ≤ m forms a
linear subspace of the set of polynomials of degree ≤ n.
(iii) If Y = {(x1 , x2 , 1, ..., 1) ∈ Rn } for n − 2 ∈ N, then Y is an affine subset
of Rn .
(iv) In Example 3.2 (iii), let Y be the set of matrices with certain blocks of
1’s. Then Y is an affine subset of Mm,n (C).
(v) In R3 all lines and planes through the origin are linear subspaces, whereas
all lines and planes not passing through the origin are affine subsets.

A fundamental concept for linear spaces is that of dimension. To see this, suppose X is a linear space over F and n ∈ N. Elements $e_1, e_2, \dots, e_n$ of X are linearly dependent provided there are scalars $\alpha_1, \alpha_2, \dots, \alpha_n$ (not all zero) such that $\sum_{j=1}^n \alpha_j e_j = 0$. If there is no such set of scalars, then they are linearly independent. The linear span of the vectors $\{e_j\}_{j=1}^n$ is the linear subspace of X as follows:

\[
\operatorname{span}\{e_j\}_{j=1}^n = \operatorname{span}\{e_1, \dots, e_n\} = \Big\{ \sum_{j=1}^n \alpha_j e_j : \alpha_j \in \mathbb{F} \Big\}.
\]

In general, the linear span of a subset S of X is defined to be the set of all


finite linear combinations of elements in S; that is, the intersection of all linear
subspaces of X containing S.

Definition 3.5. Let n ∈ N. If the linear space X is equal to the space spanned by a linearly independent set of n vectors in X, then X is said to have the dimension n. If there is no such set of vectors for any n ∈ N, then X is infinite-dimensional. Furthermore, a linearly independent set of vectors that spans X is called a basis for X.

Example 3.6.
(i) For n ∈ N, the space Rn has dimension n; the standard basis is given by
the vectors e1 = (1, 0, · · · , 0), e2 = (0, 1, 0, · · · , 0), · · · , en = (0, 0, · · · , 0, 1).
(ii) For n ∈ N, the set $\{1, t, t^2, \dots, t^n\}$ is a basis for the linear space of real-valued polynomials of degree ≤ n, which has dimension n + 1.
(iii) All linear spaces given in Example 3.2 (iv)-(vii) are infinite-dimensional.
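
In finite dimensions, linear independence can be checked mechanically. The following is a minimal sketch, assuming vectors are given by their coordinates; the helper names `rank` and `is_linearly_independent` are illustrative and not part of the notes. It row-reduces over the rationals and compares the rank with the number of vectors.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of coordinate vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in vec] for vec in vectors]
    n_cols = len(rows[0]) if rows else 0
    rank_count = 0
    for col in range(n_cols):
        # Look for a row at or below the pivot position with a nonzero entry in this column.
        pivot = next((r for r in range(rank_count, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank_count], rows[pivot] = rows[pivot], rows[rank_count]
        # Eliminate this column from every row below the pivot row.
        for r in range(rank_count + 1, len(rows)):
            factor = rows[r][col] / rows[rank_count][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank_count])]
        rank_count += 1
    return rank_count


def is_linearly_independent(vectors):
    """Vectors are linearly independent exactly when their rank equals their number."""
    return rank(vectors) == len(vectors)


standard_basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
dependent_pair = [(1, 2, 3), (2, 4, 6)]  # the second vector is twice the first
print(is_linearly_independent(standard_basis))
print(is_linearly_independent(dependent_pair))
```

Exact rational arithmetic is used here so that the zero test in the pivot search is not affected by rounding.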

Next, we equip a linear space with a norm. The norm on a linear space
is a way of measuring the length of a vector and hence the distance between
two vectors.

Definition 3.7. Let X be a linear space over F. Then a norm on X is a nonnegative function $\|\cdot\| : X \to \mathbb{R}$ with the following three properties:
(i) $\|x\| = 0$ if and only if x = 0;
(ii) $\|x + y\| \le \|x\| + \|y\|$ for all x, y ∈ X;
(iii) $\|\alpha x\| = |\alpha|\,\|x\|$ for all x ∈ X and α ∈ F.
In this case, $(X, \|\cdot\|)$ is called a normed linear space.
Remark 3.8.
(i) In the definition we are assuming that | · | denotes the usual absolute value. If $\|\cdot\|$ is a nonnegative function satisfying only (ii), the triangle inequality, and (iii), homogeneity, then it is called a semi-norm. For instance, $\|\cdot\|_{p,m_g,E}$ defined in Example 1.3 (iii) is a semi-norm whenever p ∈ [1, ∞).
(ii) Whenever $(X, \|\cdot\|)$ is a normed linear space over F, defining

\[
d_X(x_1, x_2) = \|x_1 - x_2\|, \quad x_1, x_2 \in X,
\]

we find that $(X, d_X)$ is a metric space. Indeed, positivity and symmetry of $d_X$ are immediate from the properties of the norm, and the triangle inequality follows from

\[
d_X(x_1, x_3) = \|x_1 - x_3\| \le \|x_1 - x_2\| + \|x_2 - x_3\| = d_X(x_1, x_2) + d_X(x_2, x_3).
\]

(iii) Clearly, if Y is a linear subspace of the linear space X (over F) which is equipped with norm $\|\cdot\|$, then $(Y, \|\cdot\|)$ is also a normed linear space and hence it can be regarded as a linear subspace of $(X, \|\cdot\|)$.
Example 3.9.
(i) For n ∈ N and $x = (x_1, \dots, x_n) \in \mathbb{R}^n$ let $d_n(x, 0) = \sqrt{\sum_{j=1}^n |x_j|^2}$. Then $d_n(\cdot, 0)$ defines a norm on $\mathbb{R}^n$. The only difficulty in verifying this fact is the triangle inequality, for which we use the Cauchy-Schwarz inequality:

\[
\sum_{j=1}^n |x_j y_j| \le \sqrt{\Big(\sum_{j=1}^n |x_j|^2\Big)\Big(\sum_{j=1}^n |y_j|^2\Big)}.
\]

(ii) For n ∈ N, there are many other norms on $\mathbb{R}^n$, called the (n, p)-norms. More precisely, if p ∈ [1, ∞), then

\[
\|x\|_{n,p} = \Big(\sum_{j=1}^n |x_j|^p\Big)^{1/p}
\]

is a norm on $\mathbb{R}^n$. To see this, it suffices to verify the triangle inequality, i.e., the following inequality of Hermann Minkowski (see also Problem 3.1 (i)):

\[
\|x + y\|_{n,p} \le \|x\|_{n,p} + \|y\|_{n,p} \quad \text{for } x, y \in \mathbb{R}^n.
\]

If p = ∞, then

\[
\|x\|_{n,\infty} = \sup_{1 \le j \le n} |x_j|
\]

defines a norm on $\mathbb{R}^n$. It is conventional to write $\ell_p^n$ for these normed spaces. Note that $\ell_p^n$ and $\ell_q^n$ have exactly the same elements, namely those of $\mathbb{R}^n$.

(iii) Recall that $\ell_\infty$ is the linear space of bounded infinite sequences of real numbers. For $x = (x_j)_{j=1}^\infty \in \ell_\infty$, define

\[
\|x\|_p = \begin{cases} \Big(\sum_{j=1}^\infty |x_j|^p\Big)^{1/p}, & 1 \le p < \infty, \\ \sup_{1 \le j < \infty} |x_j|, & p = \infty. \end{cases}
\]

If our attention is restricted to the linear subspace on which $\|\cdot\|_p$ is finite, then $\|\cdot\|_p$ is a norm. The somewhat difficult triangle inequality follows from the Minkowski inequality for infinite sequences, which can be established from the Minkowski inequality for $\mathbb{R}^n$ in (ii); see also Problem 3.1 (ii). This yields an infinite family of normed linear spaces:

\[
\ell_p = \{x \in \ell_\infty : \|x\|_p < \infty\}, \quad p \in [1, \infty].
\]

Let p, q ∈ [1, ∞] and p < q. Then there is a strict inclusion $\ell_p \subset \ell_q$ (for instance, the sequence $x_j = j^{-1/p}$ lies in $\ell_q$ but not in $\ell_p$), and hence $\ell_p$ is a linear subspace of $\ell_q$. Moreover, $(\ell_p, \|\cdot\|_p)$ is separable for p ∈ [1, ∞); for a proof see also Example 1.41 (ii).
(iv) For a finite closed interval [a, b] in R, C[a, b] becomes a normed space with the p-norm below:

\[
\|f\|_p = \begin{cases} \Big(\int_a^b |f(t)|^p\,dt\Big)^{1/p}, & 1 \le p < \infty, \\ \sup_{a \le t \le b} |f(t)|, & p = \infty. \end{cases}
\]

Note that the triangle inequality for p ∈ [1, ∞) follows from the Minkowski inequality $\|f_1 + f_2\|_p \le \|f_1\|_p + \|f_2\|_p$; see also Remark ??.
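
The (n, p)-norms of Example 3.9 (ii) are easy to experiment with numerically. The sketch below is only an illustration on a few hand-picked vectors, not a proof; the function name `p_norm` and the sample data are assumptions made for this example. It checks Minkowski's inequality for several p ≥ 1 and shows that the same formula with p = 1/2 violates the triangle inequality, which is why p is restricted to [1, ∞].

```python
import math


def p_norm(x, p):
    """(n, p)-norm of a finite real vector; p = math.inf gives the sup-norm."""
    if p == math.inf:
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


x = [3.0, -4.0, 1.0]
y = [-1.0, 2.0, 5.0]
s = [a + b for a, b in zip(x, y)]

for p in (1, 1.5, 2, 3, math.inf):
    lhs = p_norm(s, p)
    rhs = p_norm(x, p) + p_norm(y, p)
    # Minkowski: ||x + y||_p <= ||x||_p + ||y||_p (small tolerance for rounding).
    print(p, lhs <= rhs + 1e-12)

# With p = 1/2 the formula is no longer a norm: compare both sides for u + v.
u, v = [1.0, 0.0], [0.0, 1.0]
print(p_norm([1.0, 1.0], 0.5), p_norm(u, 0.5) + p_norm(v, 0.5))
```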

As an application of the above concepts, quotients of normed linear spaces may be formulated. Note that we need both the algebraic structure (linear subspace of a linear space) and a topological property (closedness) to make it all work.

Definition 3.10. Let X be a linear space over F and Y be a linear subspace of X. Then the linear space X/Y, the quotient or factor space, is formed as follows. The elements of X/Y are cosets of Y, that is, sets of the form

x + Y = {x + y : y ∈ Y } where x ∈ X.

The set of cosets is a linear space under the operations:

(x1 + Y ) + (x2 + Y ) = x1 + x2 + Y and λ · (x + Y ) = λx + Y

for any $x_1, x_2, x \in X$ and λ ∈ F. In addition, dim(X/Y) is called the codimension of a linear subspace Y ⊆ X.

Remark 3.11. The above-defined two operations make sense precisely because Y is itself a linear space, so for instance,

\[
Y + Y = \{y_1 + y_2 : y_1, y_2 \in Y\} = Y
\]

and

\[
\lambda Y = \{\lambda y : y \in Y\} = Y \quad \text{for } \lambda \neq 0.
\]

Moreover, the cosets $x_1 + Y$ and $x_2 + Y$ are equal as sets if and only if $x_1 - x_2 \in Y$, denoted $x_1 \sim x_2$, which reads: $x_1$ is equivalent to $x_2$ with respect to Y.

Example 3.12.
(i) Let X = R3 , and let Y be the linear subspace spanned by (1, 1, 0). Then
X/Y is a two-dimensional real vector space since (1, 0, 1) + Y and (0, 0, 1) + Y
generate X/Y .
(ii) The set Y of finitely supported sequences (of which all but a finite number of entries vanish) is a linear subspace of $\ell_1$. The quotient space $\ell_1/Y$ is quite hard to visualize: its elements are equivalence classes under the relation $(x_j)_{j=1}^\infty \sim (y_j)_{j=1}^\infty$ whenever $(x_j)_{j=1}^\infty$ and $(y_j)_{j=1}^\infty$ differ in only finitely many coordinates.
(iii) For n ∈ N let Y be the set of $\ell_1$-sequences of the form $(0, \dots, 0, x_{n+1}, \dots)$ (first n coordinates are zero). Then Y is a linear subspace of $\ell_1$. Here the quotient space $\ell_1/Y$ is isomorphic to $\mathbb{R}^n$.
(iv) Recall that if p, q ∈ [1, ∞] and p < q then $\ell_p \subset \ell_q$; this implies that $\ell_q/\ell_p$ exists as a linear quotient space.
(v) Let X = C[0, 1] and Y = {f ∈ X : f (0) = 0}. Then X/Y is isomorphic
to R.
(vi) Y = C[0, 1] is a linear subspace of X = R[0, 1]. The quotient X/Y is
again a linear space.

Evidently, these examples indicate that not all linear subspaces are equally good: Example 3.12 (i), (iii) and (v) are quite reasonable, whereas Example 3.12 (ii), (iv) and (vi) are examples of linear spaces unlike any we have seen. The reason is the following: The space X/Y is guaranteed to be a normed space with a norm related to the original norm $\|\cdot\|$ on X only when the subspace Y is itself closed. Notice that Example 3.12 (i), (iii) and (v) are precisely the ones in which the linear subspace is closed whenever the space X is treated as a metric space with the distance

\[
d_X(x_1, x_2) = \|x_1 - x_2\| \quad \text{for all } x_1, x_2 \in X.
\]

Theorem 3.13. Let $(X, \|\cdot\|_X)$ be a normed linear space over F and Y a linear subspace of X. Then X/Y is a normed linear space over F under the norm

\[
\|x + Y\|_{X/Y} = \inf_{z \in x + Y} \|z\|_X
\]

when and only when Y is a closed subset of X; that is, if $(y_j)_{j=1}^\infty$ is a sequence in Y with $\lim_{j\to\infty} \|y_j - y\|_X = 0$ then y ∈ Y.

Proof. On the one hand, suppose that Y is closed and note that

\[
\|x + Y\|_{X/Y} = \inf_{z \in x + Y} \|z\|_X = \inf_{y \in Y} \|x - y\|_X.
\]

Firstly, if $\|x + Y\|_{X/Y} = 0$ then there is a sequence of vectors $y_j \in Y$ such that $\lim_{j\to\infty} \|x - y_j\|_X = 0$. Since Y is closed, x ∈ Y and then x + Y = 0 + Y. Conversely, $\|0 + Y\|_{X/Y} = 0$ because 0 ∈ Y.
Secondly, the homogeneity is trivial for λ = 0, and for λ ∈ F with λ ≠ 0 it is clear from

\[
\|\lambda(x + Y)\|_{X/Y} = \inf_{z \in x + Y} \|\lambda z\|_X = |\lambda| \inf_{z \in x + Y} \|z\|_X = |\lambda|\,\|x + Y\|_{X/Y}.
\]

Finally, the triangle inequality is clear from

\[
\begin{aligned}
\|(x_1 + Y) + (x_2 + Y)\|_{X/Y} &= \inf_{z_1 \in x_1 + Y,\ z_2 \in x_2 + Y} \|z_1 + z_2\|_X \\
&\le \inf_{z_1 \in x_1 + Y} \|z_1\|_X + \inf_{z_2 \in x_2 + Y} \|z_2\|_X \\
&= \|x_1 + Y\|_{X/Y} + \|x_2 + Y\|_{X/Y}.
\end{aligned}
\]

On the other hand, suppose Y is not closed. Let x be a point that is not in Y but is in the closure of Y with respect to the distance $\|x_1 - x_2\|_X$. If $(y_j)_{j=1}^\infty$ is a sequence of points of Y that converges to x, then

\[
0 \le \|x + Y\|_{X/Y} = \inf_{z \in x + Y} \|z\|_X \le \inf_{j \in \mathbb{N}} \|x - y_j\|_X = 0.
\]

If $\|\cdot\|_{X/Y}$ were a norm on X/Y, this would force x + Y = 0 + Y, that is, x ∈ Y, which contradicts x ∉ Y. Therefore $\|\cdot\|_{X/Y}$ cannot be a norm on X/Y. We are done.

Example 3.14.
(i) If $X = \mathbb{R}^2$ and $Y = \{(0, y) : y \in \mathbb{R}\}$, then X/Y consists of lines in X of the form (s, t) + Y. Furthermore, if X is equipped with the norm $\|\cdot\|_{2,2}$, then

\[
\|(s, t) + Y\|_{X/Y} = \inf_{(u, v) \in (s, t) + Y} \sqrt{u^2 + v^2} = |s|.
\]

(ii) The quotient space may be a little odd. For instance, let $\ell_{\infty,c}$ denote the space of all sequences $(x_j)_{j=1}^\infty$ with the property that $\lim_{j\to\infty} x_j$ exists. This is a closed linear subspace of $\ell_\infty$. It is a good exercise to give a representation of the quotient $\ell_\infty/\ell_{\infty,c}$.
(iii) Let g : R → R be increasing, E ⊆ R an mg -measurable set, p ∈ [1, ∞].
Then each set LRSgp (E, F) defined in Example 1.3 (iii), is a linear space. If

Ng (E) = {f : E → F : f = 0 mg − a.e. on E},

then LRS pg (E, F) = LRSgp (E, F)/Ng (E) forms a linear quotient space whose
elements can be written as f + Ng (E) where f ∈ LRSgp (E, F). In particular,
if g : R → R is the identity, then LRS pid (E, F) is employed to represent
LRS pg (E, F). More importantly, when 1 ≤ p ≤ ∞ the space LRS pg (E, F) is a
normed linear space under the following norm:

kf + Ng (E)kLRS pg (E,F) = kf kp,mg ,E .

The three properties required by a norm can be easily checked; in particular, the triangle inequality follows from Minkowski's inequality presented in Remark ??. Even though LRS pg (E, F) exists as a space of equivalence classes of mg -a.e.-defined LRSgp (E, F)-functions, for the sake of simplicity of notation, we will follow the customary language to speak of it as a space of functions. This minor abuse of language is uniformly accepted and almost never results in any confusion.
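
The quotient norm of Example 3.14 (i) can be approximated directly from its definition as an infimum. The following is a rough numerical sketch, assuming the Euclidean norm on $\mathbb{R}^2$; the function `quotient_norm`, its search radius, and its iteration count are choices made here for illustration only. It minimizes $\sqrt{s^2 + (t + y)^2}$ over y by ternary search, which is valid because this function of y is convex.

```python
import math


def quotient_norm(s, t, radius=1e3, iterations=200):
    """Approximate inf over y in R of ||(s, t + y)||_2, the norm of the coset (s, t) + Y.

    The map y -> sqrt(s^2 + (t + y)^2) is convex, so ternary search on
    [-radius, radius] approaches its minimum provided the minimizer -t lies inside.
    """
    lo, hi = -radius, radius
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if math.hypot(s, t + m1) < math.hypot(s, t + m2):
            hi = m2  # the minimum lies in [lo, m2]
        else:
            lo = m1  # the minimum lies in [m1, hi]
    return math.hypot(s, t + (lo + hi) / 2)


for s, t in [(3.0, 7.0), (-2.5, -40.0), (0.0, 5.0)]:
    print(s, t, quotient_norm(s, t), abs(s))
```

The two printed values in each row should agree up to rounding, in line with the formula $\|(s, t) + Y\|_{X/Y} = |s|$.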

3.2 Finite Dimensional Spaces


From the last section we have seen that any normed linear space generates
a metric space and consequently brings the notions such as continuity and
convergence (discussed in Chapter 6) into play. Therefore, it is unnecessary to
deal with this issue again. But, to put it more abstractly, we want to emphasize
that every normed linear space is a topological vector space and thus carries
a topological structure which is induced by the norm and a linear mapping.

Definition 3.15.
(i) Two linear spaces X1 and X2 over F are called algebraically isomorphic
provided there is a bijection T : X1 → X2 that is linear:

T (αx + βy) = αT (x) + βT (y) for all α, β ∈ F and x, y ∈ X1 .

(ii) Two normed linear spaces $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ over F are called topologically isomorphic provided there is a linear bijection T : X → Y with the property that there exist two positive constants $c_1$ and $c_2$, independent of x ∈ X, with

\[
c_1 \|x\|_X \le \|T(x)\|_Y \le c_2 \|x\|_X.
\]

If $c_1 = c_2 = 1$ then T is called an isometry and the normed spaces X and Y are called isometric.
(iii) Two norms $\|\cdot\|_{(1)}$ and $\|\cdot\|_{(2)}$ defined on a given linear space X are equivalent provided there are two constants $c_1, c_2 > 0$ such that

\[
c_1 \|\cdot\|_{(1)} \le \|\cdot\|_{(2)} \le c_2 \|\cdot\|_{(1)}.
\]

Example 3.16.
(i) Let Y be the set of all real polynomials of the form $f(t) = y_1 + y_2 t + y_3 t^2$ with the norm $\|f\|_Y = \max\{|y_1|, |y_2|, |y_3|\}$, and X the set of all vectors $x = (x_1, x_2, x_3) \in \mathbb{R}^3$ with the norm $\|x\|_X = \sqrt{x_1^2 + x_2^2 + x_3^2}$. If T : X → Y is given by $T(x) = x_1 + x_2 t + x_3 t^2$, then T is a linear bijection that also satisfies

\[
\|T(x)\|_Y = \max\{|x_1|, |x_2|, |x_3|\} \le \sqrt{|x_1|^2 + |x_2|^2 + |x_3|^2} = \|x\|_X
\]

and

\[
\frac{1}{\sqrt{3}} \|x\|_X \le \|T(x)\|_Y \le \|x\|_X.
\]

So, the two spaces are topologically isomorphic.
(ii) Given p ∈ [1, ∞] and n ∈ N, the previously-defined norm $\|\cdot\|_{n,p}$ and the following norm $\|\cdot\|_{n,*}$ defined via

\[
\|x\|_{n,*} = \max_{1 \le k \le n} |x_k| \quad \text{for } x = (x_1, \dots, x_n) \in \mathbb{R}^n,
\]

are equivalent.
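
For Example 3.16 (ii), one admissible pair of constants is $\|x\|_{n,*} \le \|x\|_{n,p} \le n^{1/p}\|x\|_{n,*}$; with n = 3 and p = 2 this reproduces the factor $1/\sqrt{3}$ of part (i). The sketch below checks these two inequalities on sample vectors. The helper names and the test vectors are assumptions made for this illustration, and a finite check is of course not a proof.

```python
import math


def p_norm(x, p):
    """(n, p)-norm of a finite real vector; p = math.inf gives the max-norm."""
    if p == math.inf:
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def equivalence_holds(x, p, tol=1e-12):
    """Check ||x||_{n,*} <= ||x||_{n,p} <= n^{1/p} ||x||_{n,*} for a single vector x."""
    n = len(x)
    sup = p_norm(x, math.inf)
    upper = sup if p == math.inf else n ** (1.0 / p) * sup
    value = p_norm(x, p)
    return sup <= value + tol and value <= upper + tol


samples = [[3.0, -4.0, 1.0], [1.0, 1.0, 1.0], [0.0, 0.0, -2.0], [0.5, -0.25, 8.0, 2.0]]
for p in (1, 2, 3, math.inf):
    print(p, all(equivalence_holds(x, p) for x in samples))
```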

In fact, we have the following more general result.

Theorem 3.17. Let n ∈ N.
(i) Any two norms on a given n-dimensional linear space X over F are equivalent.
(ii) If X and Y are n-dimensional normed linear spaces over F then X and Y are topologically isomorphic.

Proof. (i) Let $\|\cdot\|_{(k)}$, k = 1, 2, be two norms on X. Choose a basis $\{e_1, \dots, e_n\}$ for X and define a third norm $\|\cdot\|_{(3)}$ as follows. For each x ∈ X there is a unique set of scalars $\alpha_1, \dots, \alpha_n$ in F such that $x = \sum_{j=1}^n \alpha_j e_j$. Let $\|x\|_{(3)} = \sum_{j=1}^n |\alpha_j|$. Suppose each of the norms $\|\cdot\|_{(k)}$, k = 1, 2, is equivalent to $\|\cdot\|_{(3)}$. Then there are positive constants $c_1, C_1$ and $c_2, C_2$ such that

\[
c_k \|\cdot\|_{(k)} \le \|\cdot\|_{(3)} \le C_k \|\cdot\|_{(k)}, \quad k = 1, 2.
\]

It follows that

\[
\frac{c_1}{C_2} \|\cdot\|_{(1)} \le \|\cdot\|_{(2)} \le \frac{C_1}{c_2} \|\cdot\|_{(1)},
\]

implying the desired equivalence.
Now let $\|\cdot\|$ denote either $\|\cdot\|_{(1)}$ or $\|\cdot\|_{(2)}$. We show that $\|\cdot\|$ is equivalent to $\|\cdot\|_{(3)}$. If $x = \sum_{j=1}^n \alpha_j e_j$ then

\[
\|x\| \le \sum_{j=1}^n |\alpha_j|\,\|e_j\| \le \Big(\max_{1 \le j \le n} \|e_j\|\Big) \|x\|_{(3)}.
\]

This implies a one-sided estimate. We shall prove that there is a constant c > 0 such that $\|\cdot\|_{(3)} \le c\|\cdot\|$, i.e.,

\[
\sum_{j=1}^n |\alpha_j| \le c\, \Big\| \sum_{j=1}^n \alpha_j e_j \Big\|.
\]

We may assume that $\|x\|_{(3)} = 1$ by dividing the last inequality by $\sum_{j=1}^n |\alpha_j|$. Now if the last inequality is not true then there would be a sequence $(y_k)_{k=1}^\infty$ where

\[
y_k = \sum_{j=1}^n \alpha_{k,j} e_j \quad \text{with} \quad \|y_k\|_{(3)} = \sum_{j=1}^n |\alpha_{k,j}| = 1 \quad \text{but} \quad \lim_{k\to\infty} \|y_k\| = 0.
\]

Note that $|\alpha_{k,j}| \le 1$ for j = 1, ..., n and every k. So there is a subsequence of $(y_k)_{k=1}^\infty$, denoted itself also, such that $\lim_{k\to\infty} \alpha_{k,j}$ exists and equals, say, $\alpha_j$ for j = 1, 2, ..., n as well as

\[
\lim_{k\to\infty} \Big\| y_k - \sum_{j=1}^n \alpha_j e_j \Big\| \le \max_{1 \le j \le n} \|e_j\| \lim_{k\to\infty} \sum_{j=1}^n |\alpha_{k,j} - \alpha_j| = 0.
\]

Consequently,

\[
\sum_{j=1}^n |\alpha_j| = 1 \quad \text{while} \quad \sum_{j=1}^n \alpha_j e_j = 0.
\]

This is impossible, since $\{e_j\}_{j=1}^n$ are linearly independent. This completes the argument.
(ii) It suffices to show that X is topologically isomorphic to $\mathbb{F}^n$ with the following Euclidean norm:

\[
\|\alpha\|_{n,2} = \Big(\sum_{k=1}^n |\alpha_k|^2\Big)^{1/2} \quad \text{for } \alpha = (\alpha_1, \dots, \alpha_n) \in \mathbb{F}^n.
\]

As in the argument for (i), choosing a basis of X and sending each x ∈ X to its coordinate vector gives an algebraic isomorphism T from X onto $\mathbb{F}^n$. We can use this map to define a new norm on X as follows: For each x ∈ X let

\[
|||x|||_X = \|T(x)\|_{n,2}.
\]

When X is given this new norm, T becomes a topological isomorphism. Moreover, from (i) it turns out that $|||\cdot|||_X$ is equivalent to any norm $\|\cdot\|_X$ on X. Hence T is a topological isomorphism from $(X, \|\cdot\|_X)$ onto $\mathbb{F}^n$.

Interestingly, Theorem 3.17 has some important consequences.

Corollary 3.18. Let n ∈ N and X be a normed linear space over F.
(i) If X is n-dimensional, then it is complete.
(ii) Any n-dimensional linear subspace of X is closed.
(iii) If X is n-dimensional, then every bounded closed subset S of X is compact.

Proof. (i) Let $\{e_k\}_{k=1}^n$ be a basis for X. Then each x ∈ X can be written as $x = \sum_{j=1}^n \alpha_j e_j$, $\alpha_j \in \mathbb{F}$. Set $\|x\|_{n,2} = \big(\sum_{j=1}^n |\alpha_j|^2\big)^{1/2}$. This is a norm on X, and by Theorem 3.17 it is equivalent to the given norm on X. Accordingly, if $\big(x^{(k)} = \sum_{j=1}^n \alpha_j^{(k)} e_j\big)_{k=1}^\infty$ is a Cauchy sequence in X, then $\big((\alpha_1^{(k)}, \dots, \alpha_n^{(k)})\big)_{k=1}^\infty$ is a Cauchy sequence in $\mathbb{F}^n$ which is complete, and hence there exists a point $(\alpha_1, \dots, \alpha_n) \in \mathbb{F}^n$ such that $(\alpha_1^{(k)}, \dots, \alpha_n^{(k)}) \to (\alpha_1, \dots, \alpha_n)$ as k → ∞. Therefore $(x^{(k)})_{k=1}^\infty$ is convergent to $x = \sum_{j=1}^n \alpha_j e_j$ in X, whence X is complete.
(ii) Suppose S is an n-dimensional linear subspace of X. Then S is complete due to (i) and hence any sequence $(x^{(k)})_{k=1}^\infty$ in S which is convergent to x ∈ X must be a Cauchy sequence in S. This yields x ∈ S and so S is closed.
(iii) By Theorem 3.17 (ii) it follows that X is topologically isomorphic to $\mathbb{F}^n$ and hence every bounded closed subset S of X is mapped homeomorphically onto a bounded and closed, i.e., compact, subset of $\mathbb{F}^n$. Accordingly, S is compact.
Even more interestingly, we can say something more about the converse
of Corollary 3.18 (iii).
Theorem 3.19. Let $(X, \|\cdot\|_X)$ be a normed linear space over F.
(i) If S is a proper closed subspace of X, then for each number θ ∈ (0, 1) there exists an element $x_\theta \in X$ such that

\[
\|x_\theta\|_X = 1 \quad \text{and} \quad d(x_\theta, S) = \inf_{x \in S} \|x_\theta - x\|_X \ge \theta.
\]

(ii) If the unit sphere $\{x \in X : \|x\|_X = 1\}$ of X is compact, then X is finite dimensional.
Proof. (i) Since S is closed and not equal to X, there exists an $x_1 \in X \setminus S$ such that $\delta = \inf_{x \in S} \|x_1 - x\|_X > 0$. Consequently, for any ε > 0 there is an $x_2 \in S$ such that $\|x_2 - x_1\|_X < \delta + \epsilon$. Choosing

\[
\epsilon = \frac{\delta(1 - \theta)}{\theta} \quad \text{and} \quad x_\theta = \frac{x_2 - x_1}{\|x_2 - x_1\|_X},
\]

we obtain $\|x_\theta\|_X = 1$ and

\[
\|x - x_\theta\|_X = \frac{\big\|(\|x_2 - x_1\|_X\, x + x_1) - x_2\big\|_X}{\|x_2 - x_1\|_X} \ge \frac{\delta}{\|x_2 - x_1\|_X} \ge \theta
\]

for any x ∈ S. Here we have used the fact that S is a linear subspace of X, so that $x_2 - \|x_2 - x_1\|_X\, x \in S$, together with $\|x_2 - x_1\|_X < \delta + \epsilon = \delta/\theta$.
(ii) Suppose $x_1$ is any element of X with $\|x_1\|_X = 1$ and $S_1$ is the subspace spanned by $x_1$. If $S_1 = X$ then X is finite dimensional and we are done. Otherwise, by Theorem 3.19 (i), Corollary 3.18 (ii) and the fact that $S_1$ is finite dimensional, there exists an $x_2 \in X$ such that $\|x_2\|_X = 1$ and $d(x_2, S_1) \ge 1/2$. Suppose $S_2$ is the linear subspace of X spanned by $x_1$ and $x_2$. Then $S_2$ is a closed linear subspace of X and consequently, if $S_2 \neq X$ then by Theorem 3.19 (i) there exists an $x_3 \in X$ such that $\|x_3\|_X = 1$ and $d(x_3, S_2) \ge 1/2$. Continuing in this manner, we find inductively that if the closed linear subspace $S_n$ spanned by $x_1, \dots, x_n$ is not equal to X then by Theorem 3.19 (i) there exists an $x_{n+1} \in X$ such that $\|x_{n+1}\|_X = 1$ and $d(x_{n+1}, S_n) \ge 1/2$. However, this process must stop at some step, for otherwise there would be an infinite sequence $(x_j)_{j=1}^\infty$ with $\|x_j\|_X = 1$ but $\|x_j - x_k\|_X \ge 1/2$ when j ≠ k. Of course, this sequence has no convergent subsequence, so the unit sphere of X would not be compact (thanks to Theorem 1.37 (iv)), a contradiction. Accordingly, X equals some $S_k$, and the argument is complete.
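
As a concrete illustration of Theorem 3.19 (ii), consider the infinite-dimensional space $\ell_2$ and write $e_j$ for the sequence with 1 in the j-th coordinate and 0 elsewhere (this notation for sequences is an assumption of the sketch, borrowed from Example 3.6 (i)). A short computation gives:

```latex
\[
\|e_j\|_2 = 1, \qquad
\|e_j - e_k\|_2 = \big(|1|^2 + |-1|^2\big)^{1/2} = \sqrt{2} \quad (j \neq k).
\]
```

Thus $(e_j)_{j=1}^\infty$ lies on the unit sphere of $\ell_2$ but has no Cauchy subsequence, so the unit sphere of $\ell_2$ is not compact, in agreement with the theorem.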

The foregoing discussion reveals that the unit sphere of a given normed
linear space is compact if and only if the space is finite dimensional, but also
leads naturally to the following concept.

Definition 3.20. If $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ are normed linear spaces over F, then any one of:
(i) $\|(x, y)\| = \big(\|x\|_X^p + \|y\|_Y^p\big)^{1/p}$, p ∈ [1, ∞);
(ii) $\|(x, y)\| = \max\{\|x\|_X, \|y\|_Y\}$
may be defined as a norm on the Cartesian product X × Y.

This certainly does not exhaust all the possible combinations of the norms $\|x\|_X$ and $\|y\|_Y$, and yet these are the most commonly used ones. An extension to the Cartesian product of n normed linear spaces is defined in the following manner.

Remark 3.21. For n − 1 ∈ N let $\{X_k\}_{k=1}^n$ be a collection of linear spaces over F with norms $\{\|\cdot\|_{X_k}\}_{k=1}^n$. Define their Cartesian product space as

\[
\prod_{k=1}^n X_k = X_1 \times \cdots \times X_n = \big\{(x_1, \dots, x_n) : x_1 \in X_1, \dots, x_n \in X_n\big\},
\]

with vector addition

\[
(x_1, \dots, x_n) + (y_1, \dots, y_n) = (x_1 + y_1, \dots, x_n + y_n)
\]

and scalar multiplication

\[
\alpha(x_1, \dots, x_n) = (\alpha x_1, \dots, \alpha x_n)
\]

for

\[
\alpha \in \mathbb{F}; \quad (x_1, \dots, x_n), (y_1, \dots, y_n) \in \prod_{k=1}^n X_k.
\]

Then it is not hard to show that the function $\prod_{k=1}^n X_k \to \mathbb{R}$ defined by

\[
\|(x_1, \dots, x_n)\|_{\prod_{k=1}^n X_k} = \sum_{k=1}^n \|x_k\|_{X_k}
\]

is a norm on $\prod_{k=1}^n X_k$.

Example 3.22. If n = m + k with k, m, n ∈ N, then $\mathbb{R}^n$ may be viewed as the Cartesian product of $\mathbb{R}^m$ and $\mathbb{R}^k$.
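
To connect Example 3.22 with Definition 3.20 (i), suppose (as an assumption made for this sketch) that $\mathbb{R}^m$ and $\mathbb{R}^k$ carry their (m, p)- and (k, p)-norms for the same p ∈ [1, ∞). Then the product norm of Definition 3.20 (i) is exactly the (n, p)-norm on $\mathbb{R}^n$:

```latex
\[
\|(x, y)\| = \big(\|x\|_{m,p}^p + \|y\|_{k,p}^p\big)^{1/p}
= \Big(\sum_{i=1}^m |x_i|^p + \sum_{i=1}^k |y_i|^p\Big)^{1/p}
= \|(x_1, \dots, x_m, y_1, \dots, y_k)\|_{n,p}.
\]
```

The same computation with max in place of the p-th power sum shows that Definition 3.20 (ii) recovers the norm $\|\cdot\|_{n,\infty}$.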

3.3 Bounded Linear Operators


In this section, we discuss bounded operators, or equivalently, continuous op-
erators, and inverse operators. To begin with, we give the definition for a
linear operator to be bounded.

Definition 3.23. Let $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ be normed linear spaces over F. A linear operator T : X → Y is said to be bounded provided there exists a constant C ≥ 0 such that $\|T(x)\|_Y \le C\|x\|_X$ for all x ∈ X. Define

\[
\|T\| = \|T\|_{X \to Y} = \sup_{x \in X,\ x \neq 0} \frac{\|T(x)\|_Y}{\|x\|_X}.
\]

According to this definition, we have $\|T\| = \sup_{\|x\|_X = 1} \|T(x)\|_Y$, which follows from

\[
\big\| x \|x\|_X^{-1} \big\|_X = 1 \quad \text{for all } x \neq 0.
\]

Example 3.24.
(i) Equip $\ell_2$ with the 2-norm, and let $T : \ell_2 \to \ell_2$ be given by $T(x) = (0, x_1, x_2, \dots)$ when $x = (x_1, x_2, \dots)$. Then it is easy to check that T is a bounded linear operator with $\|T\| = 1$.
(ii) Choose for both X and Y the real space C[0, 1] with the sup-norm. Define T : X → Y by $T(f)(x) = e^x f(x)$, x ∈ [0, 1]. Then T is bounded with $\|T\| = e$.
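
Both operator norms in Example 3.24 can be probed numerically. The sketch below is a rough illustration, assuming finitely supported sequences are stored as tuples and that the sup-norm on [0, 1] is replaced by a maximum over a uniform grid; all function names are hypothetical. It shows that the shift preserves the 2-norm and that the constant function 1 attains the ratio e for the multiplication operator.

```python
import math


def right_shift(x):
    """T(x1, x2, ...) = (0, x1, x2, ...) for a finitely supported sequence stored as a tuple."""
    return (0.0,) + tuple(x)


def norm2(x):
    """2-norm of a finitely supported sequence."""
    return math.sqrt(sum(t * t for t in x))


def sup_norm_on_grid(f, n=1000):
    """Maximum of |f| over the grid k/n, k = 0..n: a stand-in for the sup-norm on [0, 1]."""
    return max(abs(f(k / n)) for k in range(n + 1))


def multiply_by_exp(f):
    """The operator of Example 3.24 (ii): (T f)(x) = e^x f(x)."""
    return lambda x: math.exp(x) * f(x)


x = (3.0, -4.0, 12.0)
print(norm2(right_shift(x)), norm2(x))  # the shift does not change the 2-norm


def one(_x):
    return 1.0


# For f = 1 the ratio ||T f|| / ||f|| on the grid equals exp(1).
print(sup_norm_on_grid(multiply_by_exp(one)) / sup_norm_on_grid(one), math.e)
```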

The following result tells us that, for a linear operator, boundedness is equivalent to continuity.

Theorem 3.25. Let $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ be normed linear spaces over F, and let T : X → Y be a linear operator. Then the following statements are equivalent:
(i) T is continuous in X;
(ii) T is continuous at 0;
(iii) T is bounded;
(iv) T maps bounded subsets of X to bounded subsets of Y .

Proof. In the sequel, we will write A ⇔ B for the statement: If A then B and vice versa, but also use the criterion for a linear map T : X → Y to be continuous at a point x ∈ X:

\[
\|x_n - x\|_X \to 0 \;\Rightarrow\; \|T(x_n) - T(x)\|_Y \to 0.
\]

(i)⇔(ii) The implication (i) ⇒ (ii) is trivial. Regarding (ii) ⇒ (i), we suppose T is continuous at 0 ∈ X. If $x_n \to x$ then $x_n - x \to 0$. Hence

\[
T(x_n - x) \to T(0) = 0 \quad \text{so that} \quad T(x_n) \to T(x) \text{ in } Y.
\]

(iii)⇔(i) If T is bounded and $x_n \to x$ in X, then $\|T(x_n) - T(x)\|_Y \le \|T\|\,\|x_n - x\|_X \to 0$, so $T(x_n) \to T(x)$ as well. It follows that T is continuous at x. Conversely, assume that T is continuous in X. If T is not bounded, then for any n ∈ N there exists a point $x_n \in X$ with $\|T(x_n)\|_Y > n\|x_n\|_X$; in particular $x_n \neq 0$. Let $y_n = \frac{x_n}{n\|x_n\|_X}$, so that $\|y_n\|_X = n^{-1} \to 0$. However, $\|T(y_n)\|_Y > 1$ and T(0) = 0, contradicting the assumption that T is continuous at 0.
(iv)⇔(ii) Suppose (ii) is true. So T is bounded with $\|T\| \le C_1$ for some constant $C_1 > 0$ due to (ii)⇔(i)⇔(iii). If S is a bounded subset of X, then there is a constant $C_2 > 0$ such that $\|x\|_X \le C_2$ for all x ∈ S and hence

\[
\|T(x)\|_Y \le C_1 \|x\|_X \le C_1 C_2 \quad \text{for all } x \in S.
\]

That is to say, T(S) is a bounded subset of Y. Conversely, assume that (iv) is true. Given ε > 0 and the open ball

\[
B_\epsilon^Y(0) = \{y \in Y : \|y\|_Y < \epsilon\}
\]

in Y, let

\[
B_1^X(0) = \{x \in X : \|x\|_X < 1\}
\]

denote the open unit ball in X. By (iv) it follows that $T(B_1^X(0))$ is bounded in Y. Thus, there is a λ > 0 such that

\[
T(B_1^X(0)) \subseteq B_\lambda^Y(0) = \{y \in Y : \|y\|_Y < \lambda\}.
\]

This implies $T(B_{\epsilon\lambda^{-1}}^X(0)) \subseteq B_\epsilon^Y(0)$ since T is linear, and so T is continuous at 0.


Example 3.26. For n ∈ N equip $X = \mathbb{R}^n$ with the (n, 2)-norm and let $\{e_j\}_{j=1}^n$ (for which the j-th coordinate of $e_j$ is 1 and others are 0) be the standard basis for $\mathbb{R}^n$. Then any $x = (x_1, \dots, x_n) \in \mathbb{R}^n$ has the form

\[
x = \sum_{j=1}^n x_j e_j \quad \text{with} \quad \|x\|_2^2 = \sum_{j=1}^n |x_j|^2.
\]

If $T : \mathbb{R}^n \to Y$ is a linear transformation, where $(Y, \|\cdot\|_Y)$ is a normed linear space over F, then the Cauchy-Schwarz inequality implies

\[
\|T(x)\|_Y \le \|x\|_2 \Big(\sum_{j=1}^n \|T(e_j)\|_Y^2\Big)^{1/2}.
\]

So, T is bounded with

\[
\|T\| \le \Big(\sum_{j=1}^n \|T(e_j)\|_Y^2\Big)^{1/2}.
\]
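
The bound of Example 3.26 can be tested on a concrete matrix. In the sketch below, Y is taken to be $\mathbb{R}^2$ with the Euclidean norm and T is a hypothetical 2 × 3 matrix; both are assumptions chosen only for illustration. Since $T(e_j)$ is the j-th column of the matrix, the right-hand side of the estimate is the square root of the sum of squared column norms.

```python
import math


def apply(T, x):
    """Apply the matrix T (a list of rows) to the vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in T]


def norm2(v):
    """Euclidean norm of a finite real vector."""
    return math.sqrt(sum(t * t for t in v))


def column_bound(T):
    """(sum_j ||T(e_j)||^2)^{1/2}, where T(e_j) is the j-th column of T."""
    n = len(T[0])
    columns = [[row[j] for row in T] for j in range(n)]
    return math.sqrt(sum(norm2(c) ** 2 for c in columns))


# A hypothetical linear map from R^3 to R^2, given by its matrix.
T = [[1.0, 2.0, 0.0],
     [0.0, -1.0, 3.0]]
bound = column_bound(T)
test_vectors = [[1.0, 0.0, 0.0], [1.0, 1.0, 1.0], [2.0, -1.0, 0.5], [0.0, 3.0, -4.0]]
for x in test_vectors:
    # ||T(x)|| <= ||x||_2 * bound, with a small tolerance for rounding.
    print(norm2(apply(T, x)) <= bound * norm2(x) + 1e-12)
```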

Indeed, Example 3.26 is a special case of the following result.


Theorem 3.27. Let $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ be two normed linear spaces over F. If X is finite dimensional, then any linear transformation T : X → Y is bounded.

Proof. Note that any two norms on a finite dimensional linear space X over F are equivalent. So, we construct a new norm $\|\cdot\|$ via $\|\cdot\|_X$ and $\|\cdot\|_Y$:

\[
\|x\| = \|x\|_X + \|T(x)\|_Y, \quad x \in X.
\]

Of course, $\|\cdot\|$ is a norm on X and so it is equivalent to $\|\cdot\|_X$. This yields a constant C > 0 with $\|\cdot\| \le C\|\cdot\|_X$. It follows that $\|T(x)\|_Y \le \|x\| \le C\|x\|_X$ for all x ∈ X. In other words, T : X → Y is bounded.

Next, we consider some basic algebraic properties of the space of all


bounded linear operators.

Definition 3.28. Let $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ be two normed linear spaces over F. Denote by B(X, Y) the set of all bounded linear operators from X to Y. In particular, B(X) = B(X, X).

Theorem 3.29. Let $(X, \|\cdot\|_X)$, $(Y, \|\cdot\|_Y)$ and $(Z, \|\cdot\|_Z)$ be normed linear spaces over F. Then
(i) B(X, Y) is a linear space over F with respect to the operations:

\[
(T + S)(x) = T(x) + S(x) \quad \text{and} \quad (\alpha T)(x) = \alpha T(x) \quad \text{for } x \in X,\ \alpha \in \mathbb{F}.
\]

(ii) The function $\|\cdot\| : B(X, Y) \to \mathbb{R}$, defined by

\[
\|T\| = \sup_{x \in X,\ \|x\|_X \neq 0} \frac{\|T(x)\|_Y}{\|x\|_X} \quad \text{for } T \in B(X, Y),
\]

is a norm on B(X, Y).
(iii) If T ∈ B(X, Y) and S ∈ B(Y, Z), then the composition ST = S ◦ T belongs to B(X, Z) with $\|ST\| \le \|S\|\,\|T\|$.
(iv) If T, S ∈ B(X) are invertible; that is, there are two elements $T^{-1}, S^{-1} \in B(X)$ such that

\[
T T^{-1} = T^{-1} T = I \quad \text{and} \quad S S^{-1} = S^{-1} S = I,
\]

where I stands for the identity element of B(X), then ST ∈ B(X) is invertible with $(ST)^{-1} = T^{-1} S^{-1}$.

Proof. (i) This follows from checking the conditions for a linear space with B(X, Y).
(ii) We have to verify three conditions required for a norm. First, it is clear that $\|T\| \ge 0$. If $\|T\| = 0$ then $\|T(x)\|_Y = 0$ for all x ∈ X, and hence T(x) = 0 for all x ∈ X. This gives T = 0. Conversely, T = 0 implies $\|T\| = 0$.
Next,

\[
\|\alpha T\| = \sup_{x \in X,\ \|x\|_X \neq 0} \frac{\|\alpha T(x)\|_Y}{\|x\|_X} = |\alpha|\,\|T\|.
\]

Finally,

\[
\begin{aligned}
\|T + S\| &= \sup_{x \in X,\ \|x\|_X \neq 0} \frac{\|T(x) + S(x)\|_Y}{\|x\|_X} \\
&\le \sup_{x \in X,\ \|x\|_X \neq 0} \frac{\|T(x)\|_Y}{\|x\|_X} + \sup_{x \in X,\ \|x\|_X \neq 0} \frac{\|S(x)\|_Y}{\|x\|_X} \\
&= \|T\| + \|S\|.
\end{aligned}
\]

(iii) Clearly, ST is a linear transformation from X to Z. Since T and S are bounded, we conclude

\[
\|(ST)x\|_Z = \|S(T(x))\|_Z \le \|S\|\,\|T(x)\|_Y \le \|S\|\,\|T\|\,\|x\|_X \quad \text{for all } x \in X.
\]

This yields that ST is bounded and so in B(X, Z) with $\|ST\| \le \|S\|\,\|T\|$, as desired.
(iv) This follows from the following formula:

\[
(ST) T^{-1} S^{-1} = I = T^{-1} S^{-1} (ST).
\]

Example 3.30. Define $T : \ell_2 \to \ell_2$ by $T(x_1, x_2, \dots) = (0, x_1, x_2, \dots)$. This map is bounded but not invertible, since it is clearly not onto. If $S : \ell_2 \to \ell_2$ is given by $S(x_1, x_2, \dots) = (x_2, x_3, \dots)$, then it is bounded but not invertible, because it is clearly not one to one. Note that ST = I ≠ TS.
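
The relation ST = I ≠ TS of Example 3.30 can be seen on finitely supported sequences. In this small sketch a tuple stands for a sequence that continues with zeros; this representation is an assumption made for illustration.

```python
def T(x):
    """Right shift: (x1, x2, ...) -> (0, x1, x2, ...)."""
    return (0,) + tuple(x)


def S(x):
    """Left shift: (x1, x2, ...) -> (x2, x3, ...)."""
    return tuple(x[1:])


x = (5, 7, 11)  # stands for the sequence (5, 7, 11, 0, 0, ...)
print(S(T(x)) == x)  # shifting right and then left recovers x
print(T(S(x)))       # shifting left and then right loses the first coordinate
```

The first coordinate of x is destroyed by S and cannot be recovered by T, which is the reason TS differs from the identity.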

3.4 Linear Functionals via Hahn-Banach Extension


Looking at the Lebesgue-Radon-Stieltjes integrals, we find that if g : R → R is increasing then $L_g(f) = \int_{\mathbb{R}} f\,dm_g$ defines a linear function from the real linear space $LRS_g(\mathbb{R})$ into R. This observation suggests that we should work with the so-called linear functionals.

Definition 3.31. Let X be a linear space over F. A linear map from X to F is called an F-valued linear functional on X. If X is a normed linear space, then $X^* = B(X, \mathbb{F})$ and $X^{**} = (X^*)^*$ are respectively called the dual space and second dual space of X over F; moreover X is said to be reflexive whenever $X^{**}$ and X are isometric.

The question is whether or not the class of F-valued linear functionals on


X consists only of the zero functional. To settle this question, let us consider
the following example of Mahlon Marsh Day type.

Example 3.32. Let $p \in (0, 1)$ and $I = [0, 1]$. If $L$ is a continuous linear functional on $LRS^p_{id}(I, \mathbb{R})$, then $L$ must be the zero functional.
To see this, suppose, to the contrary, that $L \neq 0$. Then, without loss of generality, we may assume that there exists an $f \in LRS^p_{id}(I, \mathbb{R})$ such that
\[
L(f) = 1 \quad \text{and} \quad \|f\|_{p,m_{id},I} \neq 0.
\]
The map $x \mapsto f 1_{[0,x]}$ is continuous from $I$ into $LRS^p_{id}(I, \mathbb{R})$ with respect to its metric, and consequently $\varphi(x) = L(f 1_{[0,x]})$ is a continuous function on $I$ since $L$ is continuous. Note that $\varphi(0) = 0$, $\varphi(1) = 1$ and $f = f 1_{[0,x]} + f 1_{[x,1]}$. So there is an $x_0 \in (0, 1)$ such that
\[
\varphi(x_0) = 2^{-1} \quad \text{and} \quad L(f 1_{[0,x_0]}) = 2^{-1} = L(f 1_{[x_0,1]}).
\]
Since
\[
\|f 1_{[0,x_0]}\|^p_{p,m_{id},I} + \|f 1_{[x_0,1]}\|^p_{p,m_{id},I} = \|f\|^p_{p,m_{id},I},
\]
one of the two terms on the left-hand side of the last equation is not greater than $2^{-1}\|f\|^p_{p,m_{id},I}$, say,
\[
\|f 1_{[0,x_0]}\|^p_{p,m_{id},I} \le 2^{-1}\|f\|^p_{p,m_{id},I},
\]
and then
\[
L(2f 1_{[0,x_0]}) = 1 \quad \text{and} \quad \|2f 1_{[0,x_0]}\|^p_{p,m_{id},I} \le 2^{p-1}\|f\|^p_{p,m_{id},I}.
\]
Continuing the above argument, we obtain a sequence $(f_j)_{j=1}^\infty$ in $LRS^p_{id}(I, \mathbb{R})$ such that for any $j \in \mathbb{N}$,
\[
L(f_j) = 1 \quad \text{and} \quad \|f_j\|^p_{p,m_{id},I} \le 2^{j(p-1)}\|f\|^p_{p,m_{id},I}.
\]
But this is impossible: since $0 < p < 1$, we have $2^{j(p-1)} \to 0$, so $f_j \to 0$ in $LRS^p_{id}(I, \mathbb{R})$, and the continuity of $L$ would force $L(f_j) \to 0$.
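
The halving argument of Example 3.32 can be made concrete with an illustrative choice that is not taken from the notes: $f \equiv 1$ on $[0,1]$ and $L(f) = \int_0^1 f\,dx$. The halving step then produces $f_j = 2^j 1_{[0,2^{-j}]}$, whose integral stays equal to $1$ while $\int |f_j|^p = 2^{j(p-1)}$ shrinks. The sketch below tabulates both quantities exactly for step functions; the function names are hypothetical. It shows why the integral cannot be a continuous functional when $0 < p < 1$.

```python
import math
from typing import List, Tuple


def integral_of_step(height: float, width: float) -> float:
    """Integral over [0, 1] of height * 1_[0, width]."""
    return height * width


def p_power_integral(height: float, width: float, p: float) -> float:
    """Integral over [0, 1] of |height * 1_[0, width]|^p."""
    return (height ** p) * width


def halving_sequence(p: float, steps: int) -> List[Tuple[int, float, float]]:
    """Rows (j, L(f_j), integral of |f_j|^p) for f_j = 2^j * 1_[0, 2^-j]."""
    rows = []
    for j in range(steps + 1):
        height, width = 2.0 ** j, 2.0 ** (-j)
        rows.append((j,
                     integral_of_step(height, width),
                     p_power_integral(height, width, p)))
    return rows


if __name__ == "__main__":
    for j, value, size in halving_sequence(0.5, 6):
        print(f"j={j}  L(f_j)={value}  int|f_j|^p={size:.6f}")
```

For $p \ge 1$ the same computation gives $2^{j(p-1)} \ge 1$, so no such collapse occurs.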

Actually, the above question will be answered in great generality using the
Hahn-Banach extension theorem which is stated below and named for Hans
Hahn and Stefan Banach.

Theorem 3.33. Let $X$ be a linear space over $\mathbb{F}$, and $p : X \to [0, \infty)$ a function with
\[
p(x + y) \le p(x) + p(y) \quad \text{and} \quad p(\lambda x) = |\lambda|\,p(x) \quad \text{for all } \lambda \in \mathbb{F} \text{ and } x, y \in X.
\]
If $Y$ is a linear subspace of $X$ and $f$ is an $\mathbb{F}$-valued linear functional on $Y$ with $|f(x)| \le p(x)$ for all $x \in Y$, then there is an $\mathbb{F}$-valued linear functional $F$ on $X$ such that $F(x) = f(x)$ for all $x \in Y$ and $|F(x)| \le p(x)$ for all $x \in X$.

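Proof. First of all, we demonstrate the theorem for $\mathbb{F} = \mathbb{R}$. We denote by $K$ the set of all pairs $(Y_\alpha, g_\alpha)$ in which $Y_\alpha$ is a linear subspace of $X$ containing $Y$, and $g_\alpha$ is a real linear functional on $Y_\alpha$ with
\[
g_\alpha(x) = f(x) \ \text{for all } x \in Y \quad \text{and} \quad g_\alpha(x) \le p(x) \ \text{for all } x \in Y_\alpha.
\]
Make $K$ into a partially ordered set by defining the relation:
\[
(Y_\alpha, g_\alpha) \preceq (Y_\beta, g_\beta) \quad \text{if } Y_\alpha \subseteq Y_\beta \text{ and } g_\alpha = g_\beta \text{ on } Y_\alpha.
\]
Clearly, every totally ordered subset $\{(Y_\lambda, g_\lambda)\}$ of $K$, that is, a subset in which for any two members at least one of
\[
(Y_\alpha, g_\alpha) \preceq (Y_\beta, g_\beta) \quad \text{and} \quad (Y_\beta, g_\beta) \preceq (Y_\alpha, g_\alpha)
\]
holds, has an upper bound: the subspace $\bigcup_\lambda Y_\lambda$ together with the functional that equals $g_\lambda$ on each $Y_\lambda$. By Zorn's lemma we find that there is a maximal element $(Y_0, g_0)$ in $K$. We now prove that $Y_0 = X$, so that $F = g_0$ is the required extension.
If $Y_0 \neq X$, then there is a $y_1 \in X \setminus Y_0$. Let $Y_1$ be the linear space spanned by $Y_0$ and $y_1$; that is,
\[
Y_1 = \{y + \lambda y_1 : y \in Y_0,\ \lambda \in \mathbb{R}\}.
\]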

Note that $Y_0$ is a subset of $Y_1$, and if $x, y \in Y_0$ then
\[
g_0(y) - g_0(x) = g_0(y - x) \le p(y - x) \le p(y + y_1) + p(-y_1 - x)
\]
and hence
\[
-p(-y_1 - x) - g_0(x) \le p(y + y_1) - g_0(y).
\]
It follows that
\[
a = \sup_{x \in Y_0} \big( -p(-y_1 - x) - g_0(x) \big) \le \inf_{y \in Y_0} \big( p(y + y_1) - g_0(y) \big) = b.
\]
Now for any number $c \in [a, b]$ define
\[
g_1(y + \lambda y_1) = g_0(y) + \lambda c \quad \text{for all } y \in Y_0 \text{ and } \lambda \in \mathbb{R}.
\]

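Then $g_1$ is evidently linear. Furthermore, we have the following three cases:

Case 1: If $\lambda = 0$, then $g_1(y) = g_0(y) \le p(y)$.
Case 2: If $\lambda > 0$ and $y \in Y_0$, then, since $c \le b$,
\[
\begin{aligned}
g_1(y + \lambda y_1) &= \lambda \Big( g_0\Big(\frac{y}{\lambda}\Big) + c \Big) \\
&\le \lambda \Big( g_0\Big(\frac{y}{\lambda}\Big) + p\Big(\frac{y}{\lambda} + y_1\Big) - g_0\Big(\frac{y}{\lambda}\Big) \Big) \\
&= \lambda\, p\Big(\frac{y}{\lambda} + y_1\Big) \\
&= p(y + \lambda y_1).
\end{aligned}
\]
Case 3: If $\lambda < 0$ and $y \in Y_0$, then, since $c \ge a$,
\[
\begin{aligned}
g_1(y + \lambda y_1) &= |\lambda| \Big( g_0\Big(\frac{y}{|\lambda|}\Big) - c \Big) \\
&\le |\lambda| \Big( g_0\Big(\frac{y}{|\lambda|}\Big) - g_0\Big(\frac{y}{|\lambda|}\Big) + p\Big(-y_1 + \frac{y}{|\lambda|}\Big) \Big) \\
&= |\lambda|\, p\Big(-y_1 + \frac{y}{|\lambda|}\Big) \\
&= p(y + \lambda y_1).
\end{aligned}
\]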
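
Accordingly, we get
\[
g_1(y + \lambda y_1) = g_0(y) + \lambda c \le p(y + \lambda y_1) \quad \text{for all } \lambda \in \mathbb{R} \text{ and } y \in Y_0.
\]
This is to say,
\[
(Y_1, g_1) \in K \quad \text{and} \quad (Y_0, g_0) \preceq (Y_1, g_1) \ \text{with } Y_0 \neq Y_1.
\]
This contradicts the maximality of $(Y_0, g_0)$. Hence $Y_0 = X$, and $F = g_0$ satisfies $F = f$ on $Y$ and $F \le p$ on $X$. The preceding argument indicates that if $x \in X$ and $\lambda = \operatorname{sgn} F(x)$, the real number obeying $\lambda F(x) = |F(x)|$, then
\[
|F(x)| = \lambda F(x) = F(\lambda x) \le p(\lambda x) \le p(x).
\]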
Next, we prove the theorem for the case $\mathbb{F} = \mathbb{C}$. Let $f$ be a complex linear functional on $Y$ such that $|f(x)| \le p(x)$ for $x \in Y$. Then $u = \operatorname{Re} f$ is clearly real linear on $Y$ and $|u(x)| = |\operatorname{Re} f(x)| \le |f(x)| \le p(x)$. According to the above $\mathbb{R}$-case there is a real-valued linear functional $U$ on $X$ such that $U(x) = u(x)$ for $x \in Y$ and $|U(x)| \le p(x)$ for $x \in X$. Now if $F(x) = U(x) - iU(ix)$, then $\operatorname{Im} F(x) = -\operatorname{Re}(iF(x)) = -U(ix)$ and $F(ix) = U(ix) - iU(-x) = U(ix) + iU(x) = iF(x)$, and consequently, $F$ is a complex linear extension of $f(x) = u(x) - iu(ix)$ to $X$. Furthermore, if
\[
\alpha = \operatorname{sgn} F(x) =
\begin{cases}
\exp\big(-i \arg F(x)\big), & F(x) \neq 0, \\
0, & F(x) = 0,
\end{cases}
\]
then
\[
|F(x)| = \alpha F(x) = F(\alpha x) = U(\alpha x) \le p(\alpha x) \le p(x).
\]

Here is a series of important and interesting consequences of the Hahn-Banach extension theorem applied to normed linear spaces.

Corollary 3.34. Let $(X, \|\cdot\|_X)$ be a normed space over $\mathbb{F}$, and $Y$ a linear subspace of $X$.
(i) To any $f \in B(Y, \mathbb{F})$ there corresponds an $F \in X^*$ such that
\[
\|F\| = \|f\| \quad \text{and} \quad F(y) = f(y) \ \text{for all } y \in Y.
\]
(ii) If $x_0 \in X$ satisfies
\[
\inf_{y \in Y} \|y - x_0\|_X = d > 0,
\]
then there is an $F \in X^*$ such that
\[
F(x_0) = 1; \quad \|F\| = d^{-1}; \quad F(y) = 0 \ \text{for all } y \in Y.
\]
(iii) If $Y$ is not dense in $X$, then there is a nonzero $F \in X^*$ such that $F(y) = 0$ for all $y \in Y$.
(iv) If $x \neq 0$ in $X$ then there is an $F \in X^*$ such that
\[
\|F\| = 1 \quad \text{and} \quad F(x) = \|x\|_X.
\]
(v) If $y, z \in X$ and $y \neq z$, then there exists an $F \in X^*$ such that $F(y) \neq F(z)$.
(vi) For any $x \in X$,
\[
\|x\|_X = \sup_{0 \neq F \in X^*} \frac{|F(x)|}{\|F\|} = \sup_{F \in X^*,\ \|F\| = 1} |F(x)|.
\]
(vii) If $F \in X^*$ is nonzero and $N = \{x \in X : F(x) = 0\}$, then there exists a one-dimensional subspace $M$ of $X$ such that
\[
X = N + M = \{x + y : x \in N \text{ and } y \in M\} \quad \text{and} \quad N \cap M = \{0\}.
\]
(viii) If $X^*$ is separable, so is $X$.
(ix) If $Y$ is closed and $Y^\perp = \{f \in X^* : f(y) = 0 \text{ for all } y \in Y\}$, then $Y^* = X^*/Y^\perp$ and $(X/Y)^* = Y^\perp$ in the sense that there exist isometries between the corresponding normed spaces.
(x) If $X$ is reflexive and $Y$ is closed, then $Y$ is reflexive.
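
Part (iv) of Corollary 3.34 can be seen by hand in a simple finite-dimensional setting. The sketch below assumes, purely for illustration, that $X = \mathbb{R}^n$ carries the $\ell_1$-norm; it relies on the standard fact that the norm of a functional $F(y) = \sum_j c_j y_j$ on this space is $\max_j |c_j|$. The helper names are hypothetical. Taking $c_j = \operatorname{sgn} x_j$ gives $\|F\| = 1$ and $F(x) = \|x\|_1$.

```python
from typing import Callable, List, Tuple


def l1_norm(x: List[float]) -> float:
    """l1-norm on R^n."""
    return sum(abs(t) for t in x)


def sign(t: float) -> int:
    """Sign of a real number, with sign(0) = 0."""
    return (t > 0) - (t < 0)


def norming_functional(x: List[float]) -> Tuple[Callable[[List[float]], float], List[int]]:
    """Return F(y) = sum_j sign(x_j) * y_j together with its coefficients."""
    coeffs = [sign(t) for t in x]

    def functional(y: List[float]) -> float:
        return sum(c * t for c, t in zip(coeffs, y))

    return functional, coeffs


x = [3.0, -4.0, 0.0, 1.5]
F, coeffs = norming_functional(x)

if __name__ == "__main__":
    print("||x||_1 =", l1_norm(x))
    print("F(x)    =", F(x))
    print("||F||   =", max(abs(c) for c in coeffs))
```

In infinite-dimensional spaces no such explicit formula is available in general, which is where the Hahn-Banach extension theorem is needed.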

Proof. (i) Given $f \in B(Y, \mathbb{F})$, let $p(x) = \|f\|\,\|x\|_X$ for all $x \in X$. Then
\[
|f(x)| \le \|f\|\,\|x\|_X = p(x) \quad \text{for all } x \in Y.
\]
Hence from Theorem 3.33 it turns out that there exists an extension $F \in B(X, \mathbb{F})$ with $F = f$ on $Y$ and $|F| \le p$ on $X$. Clearly, $\|F\| = \|f\|$ follows from $\|F\| \le \|f\|$ and
\[
\|f\| = \sup_{y \in Y,\ \|y\|_X = 1} |f(y)| = \sup_{y \in Y,\ \|y\|_X = 1} |F(y)| \le \|F\|.
\]

(ii) Let $Y_1$ be the linear space spanned by $Y$ and $x_0$. Since $d > 0$, we conclude that $x_0 \notin Y$ and every point $x \in Y_1$ may be written uniquely as $x = y + \lambda x_0$ where $y \in Y$ and $\lambda \in \mathbb{F}$. Define a linear functional
\[
f \in Y_1^* \quad \text{by} \quad f(y + \lambda x_0) = \lambda.
\]
Then $f(y) = 0$ for $y \in Y$ and $f(x_0) = 1$. If $\lambda \neq 0$ and $x = y + \lambda x_0$, then
\[
\|x\|_X = \|y + \lambda x_0\|_X = |\lambda|\,\|\lambda^{-1} y + x_0\|_X \ge |\lambda| d = |f(x)| d,
\]
and hence $\|f\| \le d^{-1}$. Pick a sequence $(y_n)_{n=1}^\infty$ in $Y$ with $\|x_0 - y_n\|_X \to d$. Then
\[
1 = f(x_0 - y_n) \le \|f\|\,\|x_0 - y_n\|_X \to d\|f\|,
\]
so $\|f\| \ge d^{-1}$. Therefore $\|f\| = d^{-1}$. Accordingly, a direct application of (i) produces an $F \in X^*$ such that $F(x) = f(x)$ when $x \in Y_1$, and $\|F\| = \|f\|$, as desired.

(iii) Since $Y$ is not dense in $X$, we conclude that there is an $x_0 \in X$ such that
\[
\inf_{y \in Y} \|y - x_0\|_X = d > 0.
\]
An application of (ii) produces the conclusion in (iii).
(iv) Just apply (ii) with $Y = \{0\}$ and $x \in X$ to get an $f \in X^*$ such that
\[
\|f\| = \|x\|_X^{-1} \quad \text{and} \quad f(x) = 1.
\]
We may then take $F = \|x\|_X f$.
(v) Apply (iv) to $x = y - z$.
(vi) If $x = 0$ then the assertion is trivial. So, suppose $x \neq 0$. Then
\[
\sup_{F \in X^*,\ \|F\| = 1} |F(x)| \le \|x\|_X.
\]
Meanwhile, by (iv) there is an $f \in X^*$ such that
\[
f(x) = \|x\|_X \quad \text{and} \quad \|f\| = 1,
\]
thus
\[
\sup_{F \in X^*,\ \|F\| = 1} |F(x)| = \|x\|_X.
\]

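(vii) If $F \neq 0$ then there is a point $x_0 \neq 0$ such that $F(x_0) = 1$. Note that any element $x \in X$ can then be written as
\[
x = (x - \lambda x_0) + \lambda x_0 \quad \text{with } \lambda = F(x),
\]
where $F(x - \lambda x_0) = 0$, i.e., $x - \lambda x_0 \in N$. So, if $M = \{\lambda x_0 : \lambda \in \mathbb{F}\}$ then the desired decomposition follows right away. It is clear that $M$ is the one-dimensional space spanned by $x_0$. If $x \in N \cap M$, then
\[
x = \lambda x_0 \quad \text{and} \quad 0 = F(x) = \lambda F(x_0) = \lambda,
\]
and hence $x = 0$.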
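(viii) Suppose $\{f_j\}_{j=1}^\infty$ is a dense subset of the unit sphere $\{f \in X^* : \|f\| = 1\}$ and choose $x_j \in \{x \in X : \|x\|_X = 1\}$ to obey $|f_j(x_j)| \ge 1/2$. Let $Y$ be the closed linear span of $\{x_j\}_{j=1}^\infty$. If $Y \neq X$, then the just-verified (iii), after normalization, gives a functional $f \in X^*$ such that $\|f\| = 1$ and $f(x) = 0$ for all $x \in Y$. Now if $\epsilon \in (0, 1/2)$, then by density there exists a functional $f_{j_0} \in \{f_j\}_{j=1}^\infty$ with $\|f - f_{j_0}\| < \epsilon$, and since $x_{j_0} \in Y$,
\[
2^{-1} \le |f_{j_0}(x_{j_0})| = |f(x_{j_0}) - f_{j_0}(x_{j_0})| \le \|f - f_{j_0}\| < \epsilon,
\]
a contradiction. Accordingly, $Y = X$, i.e., $X$ is separable.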


(ix) Suppose $f \in Y^*$. By (i) there is an $F \in X^*$ such that $F = f$ on $Y$ and $\|F\| = \|f\|$. Then $\varphi(f) = F + Y^\perp$ is a well-defined element in $X^*/Y^\perp$. It is easy to prove that $\varphi$ establishes an isometry between $Y^*$ and $X^*/Y^\perp$.
As for the second isometry, we define $\psi(f)(x) = f(x + Y)$ for any $f \in (X/Y)^*$, and then find that $\psi : (X/Y)^* \to Y^\perp$ is an isometry and so $(X/Y)^*$ and $Y^\perp$ are isometric.

(x) According to (vi), every $x \in X$ defines a unique element $\hat{x}$ in the second dual $X^{**}$ of $X$ and $\|\hat{x}\| = \|x\|_X$, and consequently, $\hat{\cdot}$ is always linear and an isometry from $X$ into $X^{**}$. Therefore, $X$ is not only identified with $\hat{X}$ but also regarded as a linear subspace of $X^{**}$. Since $X$ is reflexive, every functional in $X^{**}$ just arises in the fashion $\hat{x}(L) = L(x)$ for $L \in X^*$ and $x \in X$; that is, the mapping $\hat{\cdot}$ is surjective.
Suppose $Y$ is closed. Then (ix) tells us that $Y^* = X^*/Y^\perp$ and $(X^*/Y^\perp)^* = (Y^\perp)^\perp = \{f \in X^{**} : f(Y^\perp) = \{0\}\}$. Since $X$ is viewed as a linear subspace of $X^{**}$, any element $y \in Y$ induces $f(y) = 0$ for all $f \in Y^\perp = \{f \in X^* : f(Y) = \{0\}\}$ and consequently, $y$ is a member of $(Y^\perp)^\perp$. Meanwhile, if there is an element $y_0$ in $(Y^\perp)^\perp \setminus Y$, then the closedness of $Y$ ensures $\inf_{y \in Y} \|y - y_0\|_X = \delta > 0$, and hence by (ii) there exists an $f \in X^*$ such that $f(y_0) = 1$, $f(y) = 0$ for all $y \in Y$ (this implies $f \in Y^\perp$), and $\|f\| = \delta^{-1}$. Note that $X$ is reflexive. So $y_0 \in (Y^\perp)^\perp$ and $f \in Y^\perp$ yield $f(y_0) = 0$, contradicting $f(y_0) = 1$. Therefore, $(Y^\perp)^\perp = Y$, and consequently, $Y^{**} = Y$, as desired.

Problems
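3.1. Let $p \in (1, \infty)$ and $q = p/(p - 1)$. Prove Hölder's and Minkowski's inequalities of the discrete forms:
(i) For vectors $(x_j)_{j=1}^n$ and $(y_j)_{j=1}^n$ in $\mathbb{R}^n$, one has
\[
\sum_{j=1}^n |x_j y_j| \le \Big( \sum_{j=1}^n |x_j|^p \Big)^{\frac{1}{p}} \Big( \sum_{j=1}^n |y_j|^q \Big)^{\frac{1}{q}}
\]
and
\[
\Big( \sum_{j=1}^n |x_j + y_j|^p \Big)^{\frac{1}{p}} \le \Big( \sum_{j=1}^n |x_j|^p \Big)^{\frac{1}{p}} + \Big( \sum_{j=1}^n |y_j|^p \Big)^{\frac{1}{p}}.
\]
(ii) For real-valued sequences $(x_j)_{j=1}^\infty \in \ell_p$ and $(y_j)_{j=1}^\infty \in \ell_q$, one has
\[
\sum_{j=1}^\infty |x_j y_j| \le \Big( \sum_{j=1}^\infty |x_j|^p \Big)^{\frac{1}{p}} \Big( \sum_{j=1}^\infty |y_j|^q \Big)^{\frac{1}{q}}
\]
and
\[
\Big( \sum_{j=1}^\infty |x_j + y_j|^p \Big)^{\frac{1}{p}} \le \Big( \sum_{j=1}^\infty |x_j|^p \Big)^{\frac{1}{p}} + \Big( \sum_{j=1}^\infty |y_j|^p \Big)^{\frac{1}{p}}.
\]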

3.2. Determine the dimensions of the following linear spaces:
(i) The set of vectors $x = (x_1, \dots, x_n)$ in $\mathbb{R}^n$ with $\sum_{j=1}^n x_j = 0$;
(ii) The set $C[0, 1]$ of all continuous functions $f : [0, 1] \to \mathbb{R}$;
(iii) The set of all real-valued polynomials on $[0, 1]$.

3.3. Prove the following two results:
(i) On $C[0, 1]$ the sup-norm is not equivalent to any $p$-norm where $1 \le p < \infty$;
(ii) If $C[0, 1]$ is equipped with the sup-norm, then the unit sphere of $C[0, 1]$ is not compact.

3.4. Prove that if $X$ is a linear space over $\mathbb{F}$, then not only is any norm $\|\cdot\| : X \to \mathbb{R}$ continuous on $X$, but vector addition and scalar multiplication are also continuous whenever $X \times Y$ is equipped with the norm $\|\cdot\|_X + \|\cdot\|_Y$.

3.5. Recall that $\ell_\infty$ is the space of all bounded infinite sequences $x = (x_j)_{j=1}^\infty$ of real numbers with the norm $\|x\|_\infty = \sup_{j \in \mathbb{N}} |x_j|$. Prove that if $\ell_0$ is the class of all infinite sequences of real numbers which have only finitely many non-zero terms, then $\ell_0$ is not closed in $\ell_\infty$.

3.6. Prove
(i) $\lim_{p \to \infty} \|f\|_{p,m_{id},[0,1]} = \|f\|_{\infty,m_{id},[0,1]}$ whenever $\|f\|_{\infty,m_{id},[0,1]} < \infty$;
(ii) $\int_{[0,1]} |f_1 f_2|\,dm_{id} \le \|f_1\|_{1,m_{id},[0,1]} \|f_2\|_{\infty,m_{id},[0,1]}$ with equality if and only if $|f_2(x)| = \|f_2\|_{\infty,m_{id},[0,1]}$ for $m_{id}$-a.e. $x \in [0, 1]$.

3.7. Prove that if $C[0, 1]$ is equipped with the sup-norm and $T : C[0, 1] \to \mathbb{R}$ is given by $T(f) = f(0)$, then $T$ is bounded with $\|T\| = 1$.

3.8.
(i) Suppose that the real infinite matrix $[a_{kj}]$ satisfies $\sup_{k \in \mathbb{N}} \sum_{j=1}^\infty |a_{kj}| < \infty$. Define
\[
T : x = (x_1, x_2, \dots) \mapsto y = T(x) = \Big( \sum_{j=1}^\infty a_{1j} x_j,\ \sum_{j=1}^\infty a_{2j} x_j,\ \dots \Big).
\]
Prove that $T : \ell_\infty \to \ell_\infty$ is bounded and $\|T\| = \sup_{k \in \mathbb{N}} \sum_{j=1}^\infty |a_{kj}|$.
(ii) Let $T : \ell_2 \to \ell_2$ be defined by
\[
T(x) = (0, x_1, x_2, \dots) \quad \text{for } x = (x_1, x_2, \dots) \in \ell_2.
\]
Prove $\|T(x)\|_2 = \|x\|_2$.
(iii) For a finite interval $[a, b]$ in $\mathbb{R}$, let $R^1[a, b]$ be the space of all real-valued Riemann integrable functions $f$ on $[a, b]$ with $\|f\|_1 = \int_a^b |f(x)|\,dx < \infty$. Define $T(f)(x) = \int_a^x f(t)\,dt$. Prove that $T$ is a bounded linear operator from $R^1[a, b]$ to itself with $\|T\| = b - a$.

3.9. Let $C(0, 1)$ be the space of all real-valued continuous functions on $(0, 1)$. Equip $C(0, 1)$ with the 2-norm $\|f\|_2 = \big( \int_0^1 |f(t)|^2\,dt \big)^{\frac{1}{2}}$. Define
\[
T : C(0, 1) \to C(0, 1) \quad \text{by} \quad T(f)(t) = t f(t) \ \text{for } t \in (0, 1).
\]
Prove that $T$ is bounded but not invertible.



3.10. Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be given by $T(x_1, x_2) = (x_1, 0)$. Prove that $T$ is linear, bounded, but not onto, and cannot map open sets to open sets in $\mathbb{R}^2$.

3.11. For a normed linear space $X$ over $\mathbb{F}$, prove the following results:
(i) If $T \in B(X)$ and $T^{-1}$ exists and belongs to $B(X)$, then $(T^{-1})^n = (T^n)^{-1}$ holds for any $n \in \mathbb{N}$.
(ii) If $T, S \in B(X)$ and $TS$ has an inverse in $B(X)$, then $T$ and $S$ must have inverses in $B(X)$.

3.12. Let $K$ be a convex set in a normed linear space $(X, \|\cdot\|_X)$ over $\mathbb{R}$; that is,
\[
x_1, x_2 \in K \text{ and } \lambda \in (0, 1) \ \Rightarrow\ \lambda x_1 + (1 - \lambda) x_2 \in K.
\]
If $0$ is an interior point of $K$, then the Minkowski functional $p$ of $K$ is defined by
\[
p(x) = \inf\{\lambda > 0 : x \in \lambda K\} \quad \text{where } \lambda K = \{\lambda x : x \in K\}.
\]
Prove the following results:
(i) The Minkowski functional of the unit ball in $X$ is the norm $\|\cdot\|_X$.
(ii) The Minkowski functional $p$ is positive homogeneous; that is,
\[
p(\kappa x) = \kappa p(x) \quad \text{for } \kappa > 0 \text{ and } x \in X,
\]
and subadditive; that is,
\[
p(x + y) \le p(x) + p(y) \quad \text{for all } x, y \in X.
\]
(iii) For each $x_0 \notin K$ there exists an $L \in B(X, \mathbb{R})$ such that $L(x) \le L(x_0)$ for $x \in K$.

3.13. Let $X$ be a real normed linear space. By a hyperplane in $X$ is meant a set of the form $H = \{x \in X : f(x) = r\}$ where $f$ is a continuous linear functional on $X$ and $r \in \mathbb{R}$. Prove the following separation theorem: If $E_1$ and $E_2$ are disjoint convex subsets of $X$ and $E_1$ has an interior point, then there is a hyperplane which separates $E_1$ and $E_2$; that is, there exist a continuous functional $f \in B(X, \mathbb{R})$ and an $r \in \mathbb{R}$ such that
\[
E_1 \subseteq \{x \in X : f(x) \le r\} \quad \text{and} \quad E_2 \subseteq \{x \in X : f(x) \ge r\}.
\]

3.14. Let $X$ be a normed linear space over $\mathbb{F}$ and $f : X \to \mathbb{F}$ be a linear functional. Prove that $f \in B(X, \mathbb{F})$ if and only if $f^{-1}(\{0\})$ is closed in $X$.

3.15. Let $X$ be a normed linear space over $\mathbb{F}$. Prove that if $M$ is a closed linear subspace of $X$ and $x \in X \setminus M$ then $M + \mathbb{F}x$ is closed.

3.16. If $M$ is a finite-dimensional subspace of a normed linear space $X$ over $\mathbb{F}$, prove that there is a closed linear subspace $N$ such that
\[
M \cap N = \{0\} \quad \text{and} \quad M + N = X.
\]

3.17. Verify the Krein extension theorem: Let $X$ be a normed linear space over $\mathbb{R}$ and $Y \subseteq X$ such that if $y_1, y_2 \in Y$ and $c \ge 0$ then $y_1 + y_2 \in Y$ and $cy_1 \in Y$. Define a partial order on $X$ by declaring that $x_1 \preceq x_2$ if and only if $x_2 - x_1 \in Y$, and call a linear functional $f$ on $X$ $Y$-positive if $f(x) \ge 0$ for $x \in Y$. Let $M$ be a subspace of $X$ such that for each $x \in X$ there exists $y \in M$ with $x \preceq y$. Prove that if $f$ is a $(Y \cap M)$-positive linear functional on $M$, then there is a $Y$-positive linear functional $F$ on $X$ such that the restriction of $F$ to $M$ is equal to $f$.
4
Banach Spaces via Operators and Functionals

Banach spaces, named after Stefan Banach who investigated them, are one
of the central objects of study in linear functional analysis. Banach spaces
are typically infinite-dimensional spaces containing functions. In this chapter
we are concerned with the definition of a Banach space followed by some
examples, some of the basic algebraic and topological properties of Banach
spaces, continuous linear functionals, as well as bounded and compact linear
operators in Banach spaces.

4.1 Definition and Beginning Examples


As before, F always means the real field R or the complex field C.

Definition 4.1. A normed linear space over F which is complete in the metric
generated by the norm is called a Banach space over F.

Clearly, if a linear space is a Banach space under one norm, then it is also a Banach space under any equivalent norm, and hence for each $n \in \mathbb{N}$, $\mathbb{R}^n$ is complete under any norm. One more beginning example of Banach spaces is listed below.

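Example 4.2.
(i) For a compact metric space $(X, d_X)$, suppose $C(X, \mathbb{F})$ is equipped with the sup-norm $\|f\|_\infty = \sup_{x \in X} |f(x)|$ for $f \in C(X, \mathbb{F})$. Then it is a Banach space over $\mathbb{F}$, because of Theorem 2.23 (i).
(ii) For each $p \in [1, \infty]$, let $\ell_p(\mathbb{N}, \mathbb{F})$ be the space of all $\mathbb{F}$-valued sequences $(x_j)_{j=1}^\infty$ for which
\[
\|(x_j)_{j=1}^\infty\|_p =
\begin{cases}
\big( \sum_{j=1}^\infty |x_j|^p \big)^{\frac{1}{p}}, & p \in [1, \infty), \\
\sup_{j \in \mathbb{N}} |x_j|, & p = \infty,
\end{cases}
\]
is finite. Then $\big(\ell_p(\mathbb{N}, \mathbb{F}), \|\cdot\|_p\big)$ is a Banach space over $\mathbb{F}$. To see this, assume $(x_j)_{j=1}^\infty$ is a Cauchy sequence in $\ell_p(\mathbb{N}, \mathbb{F})$, and write $x_j = (x_j^{(1)}, x_j^{(2)}, \dots)$. Because of $\|\cdot\|_p \ge \|\cdot\|_\infty$, for any $\epsilon > 0$ we may find $N \in \mathbb{N}$ such that $m, n > N \Rightarrow \|x_n - x_m\|_p < \epsilon$, which in turn implies that $\|x_n - x_m\|_\infty < \epsilon$, so for each $k$, $|x_n^{(k)} - x_m^{(k)}| < \epsilon$. In other words, if $(x_j)_{j=1}^\infty$ is a Cauchy sequence in $\ell_p(\mathbb{N}, \mathbb{F})$ then $(x_j^{(k)})_{j=1}^\infty$ is a Cauchy sequence in $\mathbb{F}$ for each $k$. Since $\mathbb{F}$ is complete, for each $k$ there is a $y^{(k)} \in \mathbb{F}$ such that $|x_j^{(k)} - y^{(k)}| \to 0$ as $j \to \infty$. Note that this coordinatewise convergence does not imply by itself $x_j \to y = (y^{(1)}, y^{(2)}, \dots)$. However, if we know that $(x_j)_{j=1}^\infty$ is a Cauchy sequence, then it does. In fact, we prove this for $p < \infty$; the $p = \infty$ case is similar. Fix $\epsilon > 0$, and use the Cauchy criterion to find an $N \in \mathbb{N}$ such that
\[
n, m > N \ \Rightarrow\ \sum_{k=1}^\infty |x_n^{(k)} - x_m^{(k)}|^p < \epsilon.
\]
Now fix $n > N$ and let $m \to \infty$ to see $\sum_{k=1}^\infty |x_n^{(k)} - y^{(k)}|^p \le \epsilon$. This inequality means $\|x_n - y\|_p \le \epsilon^{\frac{1}{p}}$ and $y = (y^{(1)}, y^{(2)}, \dots) \in \ell_p(\mathbb{N}, \mathbb{F})$.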
To give another important class of Banach spaces, we need one more definition and theorem.

Definition 4.3. For a sequence $(x_j)_{j=1}^\infty$ in a normed linear space $(X, \|\cdot\|_X)$ over $\mathbb{F}$, we say that the series $\sum_{j=1}^\infty x_j$ is absolutely convergent provided $\sum_{j=1}^\infty \|x_j\|_X < \infty$. Moreover, a series $\sum_{j=1}^\infty x_j$ is said to converge to $x$ in $X$ provided $\lim_{n \to \infty} \big\| \sum_{j=1}^n x_j - x \big\|_X = 0$.

Theorem 4.4. Let $(X, \|\cdot\|_X)$ be a normed linear space over $\mathbb{F}$. Then it is a Banach space if and only if the absolute convergence of $\sum_{j=1}^\infty x_j$ implies its convergence.

Proof. The argument is motivated by that for Theorem ?? (ii). On the one hand, suppose $(X, \|\cdot\|_X)$ is a Banach space. Consider the sequence of partial sums $s_k = \sum_{j=1}^k x_j$. Since $\sum_{j=1}^\infty x_j$ is absolutely convergent, we conclude that
\[
\|s_m - s_k\|_X \le \sum_{j=k+1}^m \|x_j\|_X \to 0 \quad \text{as } m > k \to \infty.
\]
It follows that $(s_m)_{m=1}^\infty$ is a Cauchy sequence, and consequently this sequence converges since $X$ is complete. Thus, $\sum_{j=1}^\infty x_j$ converges.
On the other hand, assume that absolute convergence implies convergence, and let $(x_j)_{j=1}^\infty$ be a Cauchy sequence in $(X, \|\cdot\|_X)$. Then for each $k \in \mathbb{N}$ there is an $N_k \in \mathbb{N}$ such that
\[
i, j \ge N_k \ \Rightarrow\ \|x_i - x_j\|_X < 2^{-k}.
\]
Without loss of generality, we may assume $N_{k+1} \ge N_k$. This yields that $(x_{N_k})_{k=1}^\infty$ is a subsequence of $(x_k)_{k=1}^\infty$. Set $y_1 = x_{N_1}$ and $y_k = x_{N_k} - x_{N_{k-1}}$ when $k \ge 2$. Note that
\[
\sum_{k=1}^l \|y_k\|_X < \|y_1\|_X + \sum_{k=2}^l 2^{1-k} \le \|y_1\|_X + 1.
\]
So $\big( \sum_{k=1}^l \|y_k\|_X \big)_{l=1}^\infty$ is increasing and bounded, and hence is convergent. By hypothesis, $\sum_{k=1}^\infty y_k$ is convergent. Since $\sum_{k=1}^n y_k = x_{N_n}$, it follows that $(x_{N_n})_{n=1}^\infty$ is convergent in $(X, \|\cdot\|_X)$. This, together with the fact that $(x_k)_{k=1}^\infty$ is a Cauchy sequence, implies that $(x_k)_{k=1}^\infty$ is convergent. Therefore, $(X, \|\cdot\|_X)$ is a Banach space.

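Remark 4.5. It is worthwhile to point out that the implication in Theorem 4.4, from absolute convergence to convergence, may fail in a normed linear space that is not complete. For example, for $j \in \mathbb{N}$ let
\[
h_j(x) =
\begin{cases}
0, & x \in [-1, -1/j], \\
1 + jx, & x \in [-1/j, 0], \\
1, & x \in [0, 1],
\end{cases}
\]
and $f_j = h_{j+1} - h_j$. Then $f_j \in C([-1, 1], \mathbb{R})$ and
\[
\sum_{j=1}^\infty \|f_j\|_{p,m_{id},[-1,1]} = \sum_{j=1}^\infty \frac{1}{j + 1} \Big( \frac{1}{j(p + 1)} \Big)^{\frac{1}{p}} < \infty \quad \text{for } p \in [1, \infty),
\]
but $\sum_{j=1}^\infty f_j = \lim_{n \to \infty} (h_{n+1} - h_1)$ with respect to $\|\cdot\|_{p,m_{id},[-1,1]}$ is outside $C([-1, 1], \mathbb{R})$ since $\lim_{n \to \infty} \|h_{n+1} - 1_{[0,1]}\|_{p,m_{id},[-1,1]} = 0$. This example indicates that $C([-1, 1], \mathbb{R})$ is not complete under $\|\cdot\|_{p,m_{id},[-1,1]}$. Actually, its completion is $LRS^p_{id}([-1, 1], \mathbb{R})$.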
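
The closed form for $\|f_j\|_p$ used in Remark 4.5 can be sanity-checked numerically. The sketch below is an illustration only: it assumes the underlying measure is Lebesgue measure on $[-1, 1]$ and approximates the integral with a midpoint Riemann sum; the function names and the grid size are arbitrary choices, not part of the notes.

```python
def h(j: int, x: float) -> float:
    """Piecewise linear ramp h_j on [-1, 1] as defined in the remark."""
    if x <= -1.0 / j:
        return 0.0
    if x <= 0.0:
        return 1.0 + j * x
    return 1.0


def numeric_p_norm(j: int, p: float, cells: int = 20000) -> float:
    """Midpoint-rule approximation of the p-norm of f_j = h_{j+1} - h_j."""
    dx = 2.0 / cells
    total = 0.0
    for i in range(cells):
        x = -1.0 + (i + 0.5) * dx
        total += abs(h(j + 1, x) - h(j, x)) ** p
    return (total * dx) ** (1.0 / p)


def closed_form_p_norm(j: int, p: float) -> float:
    """The value (1/(j+1)) * (1/(j(p+1)))^(1/p) stated in the remark."""
    return (1.0 / (j + 1)) * (1.0 / (j * (p + 1))) ** (1.0 / p)


if __name__ == "__main__":
    for j in (1, 2, 3, 5):
        print(j, numeric_p_norm(j, 2.0), closed_form_p_norm(j, 2.0))
```

Since the closed form decays like $j^{-1-1/p}$, the series of norms converges, which is the absolute convergence claimed in the remark.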

Example 4.6. Let $LRS^p_g(E, \mathbb{F})$ be as in Example 3.14. Then each $LRS^p_g(E, \mathbb{F})$ is a Banach space whenever $p \in [1, \infty]$. Actually, in the case $p \in [1, \infty)$, the result can be demonstrated via Theorems 4.4 and ?? (ii). As for $p = \infty$, suppose $(f_k)_{k=1}^\infty$ is a Cauchy sequence in $LRS^\infty_g(E, \mathbb{F})$. Clearly, we may assume $|f_k(x)| \le \|f_k\|_{\infty,m_g,E}$ for all $x \in E$ by replacing $f_k$ with an equivalent function for which this is true. Then $\big(f_k(x)\big)_{k=1}^\infty$ is a Cauchy sequence in $\mathbb{F}$ for each $x \in E$ and we can thus get an $\mathbb{F}$-valued function $f$ on $E$ such that $\lim_{k \to \infty} f_k(x) = f(x)$ for each $x \in E$. This in turn yields $\lim_{k \to \infty} \|f - f_k\|_{\infty,m_g,E} = 0$.

Interestingly, Example 4.6 leads to a criterion for a quotient space to be complete.

Theorem 4.7. Let $(X, \|\cdot\|_X)$ be a Banach space over $\mathbb{F}$ and $Y \subseteq X$. Then $X/Y$ is a Banach space under the norm $\|x + Y\|_{X/Y} = \inf_{z \in x + Y} \|z\|_X$ if and only if $Y$ is closed.

Proof. If $Y$ is not closed, then Theorem 3.13 tells us that $\|x + Y\|_{X/Y}$ is not a norm. So, it remains to prove that if $Y$ is closed then $X/Y$ is complete. In doing so, suppose $\sum_{k=1}^\infty (x_k + Y)$ is an absolutely convergent series in $X/Y$. For each $k \in \mathbb{N}$ choose $z_k \in x_k + Y$ such that
\[
\|z_k\|_X \le \|x_k + Y\|_{X/Y} + 2^{-k}.
\]
Since $X$ is a Banach space, it follows that $\sum_{k=1}^\infty z_k$ is absolutely convergent and thus convergent to some $x \in X$ thanks to Theorem 4.4. Note that
\[
\Big\| (x + Y) - \sum_{k=1}^n (x_k + Y) \Big\|_{X/Y} \le \Big\| x - \sum_{k=1}^n z_k \Big\|_X \to 0 \quad \text{as } n \to \infty.
\]
So we obtain that $\sum_{k=1}^\infty (x_k + Y)$ converges to $x + Y \in X/Y$. Namely, $X/Y$ is complete due to Theorem 4.4.

In many situations it makes sense to multiply elements of a Banach space together.

Definition 4.8. Let $(X, \|\cdot\|_X)$ be a Banach space over $\mathbb{F}$. If there is a multiplication $(x, y) \mapsto xy$ from $X \times X$ to $X$ such that for any $x, y, z \in X$ and $\alpha \in \mathbb{F}$,
(i) $x(yz) = (xy)z$;
(ii) $x(y + z) = xy + xz$;
(iii) $(x + y)z = xz + yz$;
(iv) $\alpha(xy) = (\alpha x)y = x(\alpha y)$;
(v) $\|xy\|_X \le \|x\|_X \|y\|_X$,
then $X$ is called a Banach algebra.

Example 4.9.
(i) $(C[0, 1], \|\cdot\|_\infty)$ is a Banach algebra with $(fg)(x) = f(x)g(x)$.
(ii) $LRS^1_{id}(\mathbb{R}, \mathbb{F})$ is a Banach algebra under the convolution
\[
f_1 * f_2(x) = \int_{\mathbb{R}} f_1(x - t) f_2(t)\,dm_{id}(t).
\]
Here, we should note
\[
f \in LRS^1_{id}(\mathbb{R}, \mathbb{F}) \ \Rightarrow\ \int_{\mathbb{R}} f\,dm_{id} = \int_{\mathbb{R}} \operatorname{Re} f\,dm_{id} + i \int_{\mathbb{R}} \operatorname{Im} f\,dm_{id}.
\]
(iii) If $X = \mathbb{R}^n$, $n \in \mathbb{N}$, then by choosing a basis for $\mathbb{R}^n$ we may identify $B(\mathbb{R}^n)$ with the space of $n \times n$ real matrices, which is a Banach algebra under the usual matrix multiplication.
(iv) If $(X, \|\cdot\|_X)$ is a Banach space over $\mathbb{F}$, then $(B(X), \|\cdot\|)$ is a Banach algebra under the composition $ST = S \circ T$.

Beyond Example 4.9 (iv), we can derive more information on all bounded
linear operators on a given Banach space.

Theorem 4.10. Let $(X, \|\cdot\|_X)$ be a Banach space over $\mathbb{F}$.
(i) If $T \in B(X)$ satisfies $\|T\| < 1$ and $I$ is the identity element in $B(X)$, then $I - T \in B(X)$ is invertible and
\[
(I - T)^{-1} = \lim_{n \to \infty} (I + T + T^2 + \cdots + T^n) \quad \text{in } \big(B(X), \|\cdot\|\big),
\]
where $T^k$ is the $k$-th composition $\underbrace{T \circ \cdots \circ T}_{k}$ for $k \in \mathbb{N}$.
(ii) If $\mathcal{I}$ stands for the class of all invertible operators in $B(X)$, then $\mathcal{I}$ is an open set in $B(X)$.

Proof. (i) Fix $x \in X$. Since $m, n \in \mathbb{N}$ and $m > n$ imply
\[
\begin{aligned}
&\|(I + T + T^2 + \cdots + T^m)(x) - (I + T + T^2 + \cdots + T^n)(x)\|_X \\
&\quad = \|T^{n+1}(x) + \cdots + T^m(x)\|_X \\
&\quad \le \|T^{n+1}(x)\|_X + \cdots + \|T^m(x)\|_X \\
&\quad \le \big( \|T^{n+1}\| + \cdots + \|T^m\| \big) \|x\|_X \\
&\quad \le \Big( \sum_{j=n+1}^\infty \|T\|^j \Big) \|x\|_X \\
&\quad = \Big( \frac{\|T\|^{n+1}}{1 - \|T\|} \Big) \|x\|_X \\
&\quad \to 0 \quad \text{as } n \to \infty,
\end{aligned}
\]
we conclude that $\big( (I + T + T^2 + \cdots + T^n)(x) \big)_{n=1}^\infty$ is a Cauchy sequence in $X$. Note that $(X, \|\cdot\|_X)$ is a Banach space. So the sequence converges to a limit $y \in X$. Let $y = A(x)$. It is not hard to show that $A : X \to X$ is a linear operator on $X$. Furthermore, letting $m \to \infty$, we have
\[
\|A(x) - (I + T + T^2 + \cdots + T^n)(x)\|_X \le \Big( \frac{\|T\|^{n+1}}{1 - \|T\|} \Big) \|x\|_X,
\]
so that $A - (I + T + T^2 + \cdots + T^n) \in B(X)$, and thus $A \in B(X)$. Because
\[
\|A - (I + T + T^2 + \cdots + T^n)\| \le \frac{\|T\|^{n+1}}{1 - \|T\|} \to 0 \quad \text{as } n \to \infty,
\]
we derive that $I + T + T^2 + \cdots + T^n \to A$ as $n \to \infty$. It remains to verify $A = (I - T)^{-1}$. For any $x \in X$ we have
\[
\begin{aligned}
\big( (I - T)A \big)(x) &= (I - T) \Big( \lim_{n \to \infty} (I + T + T^2 + \cdots + T^n) \Big)(x) \\
&= (I - T) \lim_{n \to \infty} \big( I(x) + T(x) + T^2(x) + \cdots + T^n(x) \big) \\
&= \lim_{n \to \infty} \big( x - T^{n+1}(x) \big).
\end{aligned}
\]
But
\[
\|T^{n+1}(x)\|_X \le \|T\|^{n+1} \|x\|_X \to 0 \quad \text{as } n \to \infty,
\]
so that $T^{n+1}(x) \to 0$ as $n \to \infty$, and consequently $\big( (I - T)A \big)(x) = x$. Similarly, we have $\big( A(I - T) \big)(x) = x$, whence getting $A = (I - T)^{-1}$.
(ii) Note that $\|T^{-1}\| \neq 0$ for $T \in \mathcal{I}$. So, to prove that the open ball
\[
\big\{ S \in B(X) : \|T - S\| < \|T^{-1}\|^{-1} \big\}
\]
is a subset of $\mathcal{I}$, we just show that every element $S$ of this ball is invertible. Using
\[
\|(T - S)T^{-1}\| \le \|T - S\|\,\|T^{-1}\| < 1,
\]
we obtain that $ST^{-1} = I - (T - S)T^{-1}$ is invertible and thus $S = (ST^{-1})T$ is invertible.
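
The Neumann series of Theorem 4.10 (i) is easy to observe for matrices. The following sketch rests on assumptions made only for illustration: $X = \mathbb{R}^2$ with the sup-norm, so that the operator norm of a matrix is its maximal absolute row sum, and a particular matrix `T` with norm $0.5$. All helper names are hypothetical. It forms the partial sum $I + T + \cdots + T^n$ and checks that it inverts $I - T$ up to rounding.

```python
from typing import List

Matrix = List[List[float]]


def identity(n: int) -> Matrix:
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_mul(a: Matrix, b: Matrix) -> Matrix:
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(a: Matrix, b: Matrix) -> Matrix:
    return [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_sub(a: Matrix, b: Matrix) -> Matrix:
    return [[u - v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]


def max_row_sum_norm(a: Matrix) -> float:
    """Operator norm induced by the sup-norm on R^n."""
    return max(sum(abs(v) for v in row) for row in a)


def neumann_partial_sum(t: Matrix, n_terms: int) -> Matrix:
    """Return I + T + T^2 + ... + T^n_terms."""
    total = identity(len(t))
    power = identity(len(t))
    for _ in range(n_terms):
        power = mat_mul(power, t)
        total = mat_add(total, power)
    return total


T = [[0.2, 0.3], [0.1, 0.4]]          # illustrative matrix with norm 0.5 < 1
A = neumann_partial_sum(T, 60)
# (I - T) A - I equals -T^61, whose norm is at most 0.5^61.
residual = mat_sub(mat_mul(mat_sub(identity(2), T), A), identity(2))

if __name__ == "__main__":
    print("norm of T        :", max_row_sum_norm(T))
    print("norm of residual :", max_row_sum_norm(residual))
```

The residual bound $\|T\|^{n+1}$ mirrors the estimate $\|T^{n+1}(x)\|_X \le \|T\|^{n+1}\|x\|_X$ in the proof.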

Remark 4.11. When $(X, \|\cdot\|_X)$ is a Banach space over $\mathbb{F}$ and $T$ belongs to $B(X)$, we can define an operator
\[
e^T = I + T + \frac{T^2}{2!} + \frac{T^3}{3!} + \cdots,
\]
which makes sense since
\[
\|e^T\| \le 1 + \|T\| + \frac{\|T\|^2}{2!} + \frac{\|T\|^3}{3!} + \cdots = \exp \|T\|.
\]
With this notion, we have that if $x(t) = \big( x_1(t), x_2(t), \dots, x_n(t) \big) \in \mathbb{R}^n$, then the system
\[
\frac{dx}{dt} = \Big( \frac{dx_1(t)}{dt}, \frac{dx_2(t)}{dt}, \dots, \frac{dx_n(t)}{dt} \Big) = x(t)A; \quad x(0) = a \in \mathbb{R}^n,
\]
where $A$ is an $n \times n$ matrix, has a solution $x(t) = a e^{tA}$.
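
The solution formula $x(t) = a e^{tA}$ can be tried on a small case. The sketch below is an illustration with choices that do not come from the notes: the matrix $A$ with rows $(0, 1)$ and $(-1, 0)$, for which $A^2 = -I$ and hence $e^{tA} = (\cos t) I + (\sin t) A$; the initial row vector $a = (1, 0)$; and a truncation of the exponential series after 30 terms. The helper names are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def identity(n: int) -> Matrix:
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_mul(a: Matrix, b: Matrix) -> Matrix:
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def exp_series(a: Matrix, t: float, terms: int = 30) -> Matrix:
    """Truncated series I + tA + (tA)^2/2! + ... + (tA)^terms/terms!."""
    total = identity(len(a))
    term = identity(len(a))
    for k in range(1, terms + 1):
        # term becomes (tA)^k / k! from (tA)^(k-1) / (k-1)!
        term = [[t * v / k for v in row] for row in mat_mul(term, a)]
        total = [[u + v for u, v in zip(rt, rn)] for rt, rn in zip(total, term)]
    return total


def row_times_matrix(vec: List[float], m: Matrix) -> List[float]:
    """Row vector times matrix, matching the convention x(t) = a e^{tA}."""
    return [sum(vec[i] * m[i][j] for i in range(len(vec)))
            for j in range(len(m[0]))]


A = [[0.0, 1.0], [-1.0, 0.0]]
a = [1.0, 0.0]
t = 0.7
x_t = row_times_matrix(a, exp_series(A, t))   # expected (cos t, sin t)

if __name__ == "__main__":
    print("x(t)           =", x_t)
    print("(cos t, sin t) =", (math.cos(t), math.sin(t)))
```

Here $x(t) = (\cos t, \sin t)$ indeed satisfies $dx/dt = (-\sin t, \cos t) = x(t)A$ and $x(0) = a$.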

4.2 Uniform Boundedness, Open Map and Closed Graph

In this section, we proceed to discuss three fundamental results of functional analysis: the uniform boundedness principle (sometimes known as the Banach-Steinhaus theorem due to Stefan Banach and Hugo Dionizy Steinhaus), the open mapping theorem and the closed graph theorem.
Below is the so-called uniform boundedness principle.

Theorem 4.12. Let $(X, \|\cdot\|_X)$ be a Banach space and $(Y, \|\cdot\|_Y)$ a normed linear space over $\mathbb{F}$. Let $\{T_\alpha\}_{\alpha \in I}$ be a family of bounded linear operators $T_\alpha : X \to Y$ indexed by the nonempty set $I$. If $\sup_{\alpha \in I} \|T_\alpha(x)\|_Y < \infty$ for each $x \in X$, then $\sup_{\alpha \in I} \|T_\alpha\| < \infty$.

Proof. For each $j \in \mathbb{N}$ let
\[
X_j = \{x \in X : \sup_{\alpha \in I} \|T_\alpha(x)\|_Y \le j\} = \bigcap_{\alpha \in I} \{x \in X : \|T_\alpha(x)\|_Y \le j\}.
\]
Then each $X_j$ is a closed subset of $X$ and $X = \bigcup_{j=1}^\infty X_j$. Since $X$ is complete, $X$ is of second category. By the Baire category theorem (i.e., Theorem 1.47) we see that $X_j^\circ \neq \emptyset$ for some $j = j_0 \in \mathbb{N}$ and consequently, there are an $x_0 \in X$ and an $r \in (0, \infty)$ such that $B_r(x_0) \subseteq X_{j_0}$. This means that for $x \in B_r(x_0)$ we have $\sup_{\alpha \in I} \|T_\alpha(x)\|_Y \le j_0$. Consequently, for any $x \in B_r(0)$ we obtain $x_0 - x \in B_r(x_0)$ and so for any $\alpha \in I$,
\[
\|T_\alpha(x)\|_Y \le \|T_\alpha(x - x_0)\|_Y + \|T_\alpha(x_0)\|_Y \le 2j_0.
\]
Thus, for each $x \in B_1(0)$ we have $(r/2)x \in B_r(0)$ and so $\|T_\alpha(x)\|_Y \le 4j_0/r$. This yields $\sup_{\alpha \in I} \|T_\alpha\| \le 4j_0/r$, as required.

The boundedness of a sequence of linear operators is useful in the study of convergence of operator sequences. To see this, we introduce two types of convergence involving operator sequences.

Definition 4.13. Let $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ be normed linear spaces over $\mathbb{F}$.
(i) A sequence $(T_n)_{n=1}^\infty$ in $B(X, Y)$ is said to be uniformly convergent provided there is a $T \in B(X, Y)$ such that $\lim_{n \to \infty} \|T_n - T\| = 0$.
(ii) A sequence $(T_n)_{n=1}^\infty$ in $B(X, Y)$ is said to be strongly (or pointwise) convergent on $X$ provided the sequence $\big( T_n(x) \big)_{n=1}^\infty$ is convergent in $Y$ for any $x \in X$. Moreover, if there is a $T \in B(X, Y)$ such that $\lim_{n \to \infty} \|T_n(x) - T(x)\|_Y = 0$ for all $x \in X$, then $(T_n)_{n=1}^\infty$ is said to be strongly convergent to $T$.

Example 4.14. Clearly, uniform convergence implies strong convergence, but not conversely. Consider $\ell_p(\mathbb{N}, \mathbb{F})$, $p \in [1, \infty)$. For each $n \in \mathbb{N}$ define
\[
T_n(x) = (x_n, x_{n+1}, \dots), \quad x = (x_1, x_2, \dots) \in \ell_p.
\]
Then $T_n$ is in $B\big(\ell_p(\mathbb{N}, \mathbb{F})\big)$ with $\|T_n\| \le 1$. Note that if $x = (x_1, x_2, \dots) \in \ell_p(\mathbb{N}, \mathbb{F})$ then
\[
\lim_{n \to \infty} \|T_n(x)\|_p = \lim_{n \to \infty} \Big( \sum_{j=n}^\infty |x_j|^p \Big)^{\frac{1}{p}} = 0.
\]
So $(T_n)_{n=1}^\infty$ is strongly convergent to $0$. Meanwhile, for $n \in \mathbb{N}$ let $e_n = (\underbrace{0, \dots, 0}_{n-1}, 1, 0, \dots)$. Then $\|e_n\|_p = 1$ and $T_n(e_n) = (1, 0, 0, \dots)$ and hence $\|T_n\| \ge \|T_n(e_n)\|_p = 1$. This shows that $(T_n)_{n=1}^\infty$ is not uniformly convergent to $0$.

Here is a special consequence of the uniform boundedness principle.

Theorem 4.15. Let $(X, \|\cdot\|_X)$ be a Banach space over $\mathbb{F}$.
(i) Suppose $(Y, \|\cdot\|_Y)$ is a normed linear space over $\mathbb{F}$. If a sequence $(T_n)_{n=1}^\infty$ in $B(X, Y)$ is strongly convergent, then there exists a $T \in B(X, Y)$ such that $(T_n)_{n=1}^\infty$ is strongly convergent to $T$.
(ii) Suppose $(Y, \|\cdot\|_Y)$ is a Banach space over $\mathbb{F}$. Then a sequence $(T_n)_{n=1}^\infty$ in $B(X, Y)$ is strongly convergent if and only if $\sup_{n \in \mathbb{N}} \|T_n\| < \infty$ and $\big( T_n(x) \big)_{n=1}^\infty$ is convergent in $Y$ for any $x$ in a dense subset $D$ of $X$.

Proof. (i) Let $(T_n)_{n=1}^\infty$ in $B(X, Y)$ be strongly convergent. Then for each $x \in X$ the sequence $\big( T_n(x) \big)_{n=1}^\infty$ is convergent in $Y$ and hence defines a linear operator $T$ on $X$ by setting $T(x) = \lim_{n \to \infty} T_n(x)$. The key is to show that $T \in B(X, Y)$. Note that $\big( T_n(x) \big)_{n=1}^\infty$ is bounded in $Y$. So, from the uniform boundedness principle it turns out that there is a constant $C > 0$ such that $\sup_{n \in \mathbb{N}} \|T_n\| \le C$. Hence $\|T_n(x)\|_Y \le C\|x\|_X$ for all $x \in X$. This implies $\|T(x)\|_Y \le C\|x\|_X$ for all $x \in X$, showing that $T$ is bounded. The definition of $T$ means that $(T_n)_{n=1}^\infty$ converges strongly to $T$.
(ii) The necessity follows from the uniform boundedness principle. As for the sufficiency, we assume the statement after the if and only if. Let $\epsilon > 0$ be given. Then for any $x \in X$ there is an element $z \in D$ such that $\|x - z\|_X < \epsilon$, and there is an $N \in \mathbb{N}$ such that $m > n > N$ implies $\|T_m(z) - T_n(z)\|_Y < \epsilon$. Consequently,
\[
\begin{aligned}
\|T_m(x) - T_n(x)\|_Y
&\le \|T_m(x) - T_m(z)\|_Y + \|T_m(z) - T_n(z)\|_Y + \|T_n(z) - T_n(x)\|_Y \\
&\le \|T_m\|\,\|x - z\|_X + \epsilon + \|T_n\|\,\|x - z\|_X \\
&\le \epsilon \Big( 1 + 2 \sup_{n \in \mathbb{N}} \|T_n\| \Big).
\end{aligned}
\]
That is to say, $\big( T_n(x) \big)_{n=1}^\infty$ is a Cauchy sequence in $Y$ and thus convergent thanks to the completeness of $(Y, \|\cdot\|_Y)$.

Recall that a continuous map between two normed linear spaces has the property that the pre-image of any open set is open, but in general the image of an open set is not open. However, a continuous linear operator from one Banach space onto another always maps open sets to open sets. This is the content of the open mapping theorem as follows.

Theorem 4.16. Let $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ be Banach spaces over $\mathbb{F}$. If $T$ is a continuous linear operator from $X$ onto $Y$, then $T$ is an open map, namely, $T(U)$ is open in $Y$ whenever $U$ is open in $X$.

Proof. We split the argument into three steps. In what follows, we employ $B_r^X(x)$ and $B_r^Y(y)$ to denote the balls of radius $r > 0$ centered at $x \in X$ and $y \in Y$, respectively. In particular, we write $B_r^X$ and $B_r^Y$ for $B_r^X(0)$ and $B_r^Y(0)$, respectively.
Step 1. We prove that for any $\epsilon > 0$ there is a $\delta > 0$ such that
\[
B_\delta^Y \subseteq \overline{T(B_{2\epsilon}^X)}.
\]
To see this, note that
\[
X = \bigcup_{n=1}^\infty n B_\epsilon^X = \bigcup_{n=1}^\infty \{nx : x \in B_\epsilon^X\},
\]
and $T \in B(X, Y)$ is surjective. So we have
\[
Y = T(X) = \bigcup_{n=1}^\infty n T(B_\epsilon^X) = \bigcup_{n=1}^\infty n \overline{T(B_\epsilon^X)}.
\]
Since $(Y, \|\cdot\|_Y)$ is a Banach space, we conclude from the Baire category theorem (i.e., Theorem 1.47) that some $n T(B_\epsilon^X)$ is not nowhere dense in $Y$, and so that
\[
B_r^Y(z) = \{y \in Y : \|y - z\|_Y < r\} \subseteq n \overline{T(B_\epsilon^X)} = \{ny : y \in \overline{T(B_\epsilon^X)}\}
\]
for some $z \in Y$ and some $r > 0$. Thus $\overline{T(B_\epsilon^X)}$ must contain the ball $B_\delta^Y(y_0)$ where $y_0 = z/n$ and $\delta = r/n$. It follows that the set
\[
V = \{y_1 - y_2 : y_1, y_2 \in B_\delta^Y(y_0)\}
\]
is contained in $\overline{T(U)}$, where
\[
U = \{x_1 - x_2 : x_1, x_2 \in B_\epsilon^X\} \subseteq B_{2\epsilon}^X.
\]
Thus, $\overline{T(B_{2\epsilon}^X)} \supseteq V$. Any point $y \in B_\delta^Y$ can be written as $y = (y + y_0) - y_0$, so $B_\delta^Y \subseteq V$, as desired.
Step 2. We further prove that for any $\epsilon > 0$ there is a $\delta > 0$ such that
\[
B_\delta^Y \subseteq T(B_{2\epsilon}^X).
\]
To do so, choose $(\epsilon_n)_{n=1}^\infty$ with $\epsilon_n > 0$ and $\sum_{n=1}^\infty \epsilon_n < \epsilon$. By Step 1 there is a sequence of positive numbers $(\delta_n)_{n=1}^\infty$ such that $B_{\delta_n}^Y \subseteq \overline{T(B_{2\epsilon_n}^X)}$. Without loss of generality, we may assume $\lim_{n \to \infty} \delta_n = 0$.
Let $y \in B_{\delta_1}^Y$. Then $y \in \overline{T(B_{2\epsilon_1}^X)}$ and hence there is a point $x_1 \in B_{2\epsilon_1}^X$ such that $\|y - T(x_1)\|_Y < \delta_2$. Since $y - T(x_1) \in B_{\delta_2}^Y$, we conclude that there is a point $x_2 \in B_{2\epsilon_2}^X$ such that $\|y - T(x_1) - T(x_2)\|_Y < \delta_3$. Continuing this process, we obtain a sequence $(x_n)_{n=1}^\infty$ such that $x_n \in B_{2\epsilon_n}^X$ and
\[
\Big\| y - T\Big( \sum_{k=1}^n x_k \Big) \Big\|_Y < \delta_{n+1}.
\]
Since $\|x_n\|_X < 2\epsilon_n$, we conclude that $\sum_{n=1}^\infty x_n$ is absolutely convergent and hence by Theorem 4.4 that $x = \sum_{n=1}^\infty x_n$ makes sense in $(X, \|\cdot\|_X)$. This implies
\[
\|x\|_X \le \sum_{n=1}^\infty \|x_n\|_X < 2 \sum_{n=1}^\infty \epsilon_n < 2\epsilon.
\]
Since the map $T$ is continuous and $\delta_n \to 0$, we get $y = T(x)$. In other words, for any $y \in B_\delta^Y$ where $\delta = \delta_1$, we have found a point $x \in B_{2\epsilon}^X$ such that $T(x) = y$, whence implying the desired inclusion.
Step 3. We prove that if $O$ is open in $X$ then for any point $x \in O$, there
exists a $\delta > 0$ such that

\[ B_\delta^Y\big(T(x)\big) \subseteq T(O). \]

In fact, if $x \in O$, then there exists an $\epsilon > 0$ such that $B_{2\epsilon}^X(x) \subseteq O$. By Step 2,
we have $B_\delta^Y \subseteq T(B_{2\epsilon}^X)$ for some $\delta > 0$. Hence

\[ B_\delta^Y\big(T(x)\big) = T(x) + B_\delta^Y \subseteq T(x) + T(B_{2\epsilon}^X) = T(x + B_{2\epsilon}^X) = T\big(B_{2\epsilon}^X(x)\big) \subseteq T(O). \]

Of course, this step completes the proof of the theorem.


Remark 4.17. The open mapping theorem cannot ensure that the image of a
closed linear subspace of the domain is closed. For example, let $T : \ell_2(\mathbb{N}, \mathbb{R}) \to \ell_2(\mathbb{N}, \mathbb{R})$
be given by $(T(x))_j = x_{2j}$ for $x = (x_j)_{j=1}^{\infty}$, and let $C$ be the linear
subspace of $\ell_2(\mathbb{N}, \mathbb{R})$ comprising all vectors $x$ with $x_{2j} = x_{2j-1}/(2j)$, $j \in \mathbb{N}$.
Note that $C$ is the intersection of the kernels $\{x \in \ell_2(\mathbb{N}, \mathbb{R}) : L_k(x) = 0\}$
of a countable collection of bounded linear functionals $\{L_k\}$ on $\ell_2(\mathbb{N}, \mathbb{R})$ (cf.
Theorem 4.24). So, $C$ is closed. But $T(C)$ is not closed, since $\ell_2(\mathbb{N}, \mathbb{R}) \ne T(C)$
and $T(C)$ contains all points with finitely many non-zero terms, which means
$\overline{T(C)} = \ell_2(\mathbb{N}, \mathbb{R})$.
As an application of the open mapping theorem we establish a general
property of inverse maps.

Theorem 4.18. Let $X$ and $Y$ be Banach spaces over $\mathbb{F}$ and let $T \in B(X, Y)$.
If $T$ is bijective, then the inverse operator $T^{-1}$ of $T$ is a bounded linear map.

Proof. It suffices to prove the continuity of $T^{-1}$. Since $T = (T^{-1})^{-1}$ maps the
Banach space $X$ onto the Banach space $Y$, we conclude from Theorem 4.16
that $T$ maps open sets in $X$ to open sets in $Y$. This amounts to saying that
$T^{-1}$ is continuous.

Corollary 4.19. Let $\|\cdot\|_{(1)}$ and $\|\cdot\|_{(2)}$ be two norms on a linear space $X$
over $\mathbb{F}$ such that $X$ is a Banach space under each of them. If there is a
constant $C_1 > 0$ such that $\|x\|_{(1)} \le C_1\|x\|_{(2)}$ for all $x \in X$, then there exists
another constant $C_2 > 0$ such that $\|x\|_{(2)} \le C_2\|x\|_{(1)}$ for all $x \in X$.
Consequently, both norms are equivalent.

Proof. Consider the identity operator

\[ I : (X, \|\cdot\|_{(2)}) \to (X, \|\cdot\|_{(1)}) \quad\text{where}\quad I(x) = x \ \text{for all}\ x \in X. \]

Clearly, $I$ is bijective, and it is bounded by the assumed inequality. By
Theorem 4.18 we see that $I^{-1}$ is also bounded, whence getting the norm
inequality in the other direction.

Definition 4.20. Given two normed linear spaces X and Y over F, let T :
X → Y be a linear operator. Then the graph of T is defined as

G(T ) = {(x, y) ∈ X × Y : y = T (x)}.

Moreover, T is said to be closed when G(T ) is a closed subset of X × Y .

The following result is called the closed graph theorem.

Theorem 4.21. Let $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ be Banach spaces over $\mathbb{F}$. Then
a linear operator $T : X \to Y$ is bounded if and only if $T$ is closed.

Proof. We initially observe that $X \times Y$ is a Banach space under the norm
$\|(x, y)\| = \|x\|_X + \|y\|_Y$. Addition and scalar multiplication are defined in the
expected manner:

\[ (x_1, y_1) + (x_2, y_2) = (x_1 + x_2, y_1 + y_2) \quad\text{and}\quad \alpha(x, y) = (\alpha x, \alpha y). \]

The completeness of $(X \times Y, \|\cdot\|)$ follows readily from the completeness of
$(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$.
On the one hand, suppose $T : X \to Y$ is bounded. To prove that
$T$ is closed, i.e., $G(T)$ is closed, we assume $\lim_{n\to\infty} \|x_n - x\|_X = 0$ and
$\lim_{n\to\infty} \|T(x_n) - y\|_Y = 0$. The boundedness of $T$ implies
$\lim_{n\to\infty} \|T(x_n) - T(x)\|_Y = 0$. These two limits plus the triangle inequality imply $T(x) = y$.
This means $(x, y) \in G(T)$. Thus $T$ is closed.
On the other hand, suppose $T$ is closed. Then $G(T)$ is a closed linear
subspace of the Banach space $X \times Y$ and hence is itself a Banach space. To
verify $T \in B(X, Y)$, we consider the projection map $P$ from $G(T)$ onto $X$ via
$P\big(x, T(x)\big) = x$. Clearly, $P$ is linear, bijective and bounded. From Theorem 4.18
it turns out that $P^{-1}$ is a bounded linear map from $X$ onto $G(T)$, so there is
a constant $C > 0$ such that

\[ \|x\|_X + \|T(x)\|_Y = \big\|\big(x, T(x)\big)\big\| = \|P^{-1}(x)\| \le C\|x\|_X \quad\text{for all}\ x \in X. \]

Consequently, $T$ is bounded.

Remark 4.22. It is worth mentioning that the second part of the argument
for Theorem 4.21 has used the assumption that $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$
are complete. Whenever this assumption is removed, the conclusion fails, i.e.,
there exist a linear operator $T_0$ and a non-Banach space $(X_0, \|\cdot\|_{X_0})$ such
that $T_0$ is closed but unbounded on $(X_0, \|\cdot\|_{X_0})$; see also Problem 4.8.

4.3 Dual Banach Spaces by Examples

When dealing with a normed linear space X over the base field F, one typically
is only interested in the continuous linear functionals from the space into F.
According to Theorem 3.29 and Definition 3.31, these form a normed linear
space. In this section, we will not only see that this new normed linear space is
indeed a Banach space, but also completely describe the dual spaces of three
typical Banach spaces.
We begin with a more general result.

Theorem 4.23. Let $(X, \|\cdot\|_X)$ be a normed linear space over $\mathbb{F}$. If $(Y, \|\cdot\|_Y)$
is a Banach space over $\mathbb{F}$, then $B(X, Y)$ is a Banach space over $\mathbb{F}$.

Proof. If $(T_j)_{j=1}^{\infty}$ is a Cauchy sequence in $B(X, Y)$, then it is bounded and so
there is a constant $C > 0$ such that $\|T_j(x)\|_Y \le C\|x\|_X$ for all $x \in X$ and
$j \in \mathbb{N}$. Since

\[ \|T_j(x) - T_k(x)\|_Y \le \|T_j - T_k\|\,\|x\|_X \to 0 \quad\text{as}\ j \ge k \to \infty, \]

$\big(T_j(x)\big)_{j=1}^{\infty}$ is a Cauchy sequence in $Y$. Moreover, since $Y$ is a Banach space,
$\big(T_j(x)\big)_{j=1}^{\infty}$ converges to some $y \in Y$. Accordingly, we define $T(x) = y = \lim_{j\to\infty} T_j(x)$.
Clearly, $T$ is linear, and $\|T(x)\|_Y \le C\|x\|_X$ for all $x \in X$. This means
$T \in B(X, Y)$.
Note that we have not yet proved that $(T_j)_{j=1}^{\infty}$ converges to $T$ in norm.
But, since $(T_j)_{j=1}^{\infty}$ is a Cauchy sequence, for every $\epsilon > 0$ there is an $N \in \mathbb{N}$
such that

\[ j > k > N \ \Rightarrow\ \|T_j - T_k\| < \epsilon. \]

Consequently,

\[ j > k > N \ \Rightarrow\ \|T_j(x) - T_k(x)\|_Y \le \epsilon\|x\|_X \quad\text{for all}\ x \in X. \]

Letting $j \to \infty$, we get

\[ k > N \ \Rightarrow\ \|T(x) - T_k(x)\|_Y \le \epsilon\|x\|_X. \]

That is to say, $\|T_k - T\| \le \epsilon$ for $k > N$. Thus, $\|T_k - T\| \to 0$ as $k \to \infty$.

Theorem 4.23 particularly says that the dual space X ∗ = B(X, F) of the
normed linear space X over a given field F is always a Banach space no matter
whether X is complete or not. To see this picture clearly, we will identify the
dual spaces of three Banach spaces. Thanks to their importance, we will state
them as three independent theorems.
The first one is about the sequence spaces.

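Theorem 4.24. Let $p \in [1, \infty)$. Then $\ell_p(\mathbb{N}, \mathbb{F})^* = \ell_{\frac{p}{p-1}}(\mathbb{N}, \mathbb{F})$ in the sense
that $L \in \ell_p(\mathbb{N}, \mathbb{F})^*$ if and only if there is a unique $y = (y_k)_{k=1}^{\infty} \in \ell_{\frac{p}{p-1}}(\mathbb{N}, \mathbb{F})$
such that

\[ L(x) = \sum_{k=1}^{\infty} x_k y_k \quad\text{for all}\ x = (x_k)_{k=1}^{\infty} \in \ell_p(\mathbb{N}, \mathbb{F}). \]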

Proof. Since

\[ \sum_{k=1}^{\infty} |x_k y_k| \le \|x\|_p \|y\|_{\frac{p}{p-1}}, \]

we have that the above-defined functional $L$ is an element in $\ell_p(\mathbb{N}, \mathbb{F})^*$.
Conversely, assume $L \in \ell_p(\mathbb{N}, \mathbb{F})^*$. To get the representation of $L$, we consider
two cases.
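Case 1: $p \in (1, \infty)$. Suppose $x = (x_k)_{k=1}^{\infty} \in \ell_p(\mathbb{N}, \mathbb{F})$. For each $j \in \mathbb{N}$ let $e_j$
be the vector in $\ell_p(\mathbb{N}, \mathbb{F})$ having the $j$th entry equal to 1 and all other entries
equal to 0. Set $z_n = \sum_{j=1}^{n} x_j e_j$. Then $z_n \in \ell_p(\mathbb{N}, \mathbb{F})$ and

\[ \lim_{n\to\infty} \|x - z_n\|_p = \lim_{n\to\infty} \Big(\sum_{j=n+1}^{\infty} |x_j|^p\Big)^{\frac{1}{p}} = 0. \]

Consequently, $L(z_n) = \sum_{j=1}^{n} x_j L(e_j)$ and

\[ |L(x) - L(z_n)| = |L(x - z_n)| \le \|L\|\,\|x - z_n\|_p \to 0 \quad\text{as}\ n \to \infty. \]

Hence $L(x) = \sum_{j=1}^{\infty} x_j L(e_j)$. Let $y_j = L(e_j)$ and $y = (y_j)_{j=1}^{\infty}$. Then

\[ L(x) = \sum_{j=1}^{\infty} x_j y_j, \]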

and hence it remains to show $y \in \ell_{\frac{p}{p-1}}(\mathbb{N}, \mathbb{F})$. To this end, we choose

\[ x_j = \begin{cases} |y_j|^{\frac{p}{p-1}-2}\,\overline{y_j}, & y_j \ne 0 \\ 0, & y_j = 0. \end{cases} \]

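This choice gives

\[ \|z_n\|_p^p = \sum_{j=1}^{n} |x_j|^p = \sum_{j=1}^{n} |y_j|^{\frac{p}{p-1}}. \]

Moreover

\[ L(z_n) = \sum_{j=1}^{n} |y_j|^{\frac{p}{p-1}} \quad\text{and}\quad |L(z_n)| \le \|L\|\,\|z_n\|_p = \|L\| \Big(\sum_{j=1}^{n} |y_j|^{\frac{p}{p-1}}\Big)^{\frac{1}{p}}. \]

Hence

\[ \Big(\sum_{j=1}^{n} |y_j|^{\frac{p}{p-1}}\Big)^{\frac{p-1}{p}} \le \|L\|. \]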

This yields $\|y\|_{\frac{p}{p-1}} \le \|L\| < \infty$, and so that for any $x \in \ell_p(\mathbb{N}, \mathbb{F})$,

\[ |L(x)| = \Big|\sum_{j=1}^{\infty} x_j y_j\Big| \le \|x\|_p \|y\|_{\frac{p}{p-1}}. \]

Accordingly, $\|L\| = \|y\|_{\frac{p}{p-1}}$. The uniqueness of $y$ is obvious.

Case 2: $p = 1$. To handle this case, we follow the foregoing argument until
the definition of the vector $y$. To prove $y \in \ell_\infty(\mathbb{N}, \mathbb{F})$, we just observe that
for each $j \in \mathbb{N}$,

\[ |y_j| = |L(e_j)| \le \|L\|\,\|e_j\|_1 = \|L\|. \]

Additionally, we obtain that for any $x \in \ell_1(\mathbb{N}, \mathbb{F})$,

\[ |L(x)| = \Big|\sum_{j=1}^{\infty} x_j y_j\Big| \le \|y\|_\infty \|x\|_1. \]

Thus, $\|L\| = \|y\|_\infty$. Of course, $y$ is uniquely determined.
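
As a numerical sanity check of the extremal choice used in Case 1, the following Python sketch works with finitely supported real vectors (an assumption made only so that all sums are finite) and confirms that the choice $x_j = |y_j|^{\frac{p}{p-1}-2} y_j$ turns Hölder's inequality into an equality. The helper names `p_norm`, `extremal_vector` and `pairing` are illustrative and do not come from the notes.

```python
import math


def p_norm(v, p):
    """Finite-dimensional l_p norm of a real vector."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)


def extremal_vector(y, p):
    """Return x with x_j = |y_j|^(q-2) * y_j, where q = p/(p-1) (real case)."""
    q = p / (p - 1.0)
    return [abs(t) ** (q - 2.0) * t if t != 0 else 0.0 for t in y]


def pairing(x, y):
    """The functional L(x) = sum_j x_j * y_j represented by y."""
    return sum(a * b for a, b in zip(x, y))


p = 3.0
q = p / (p - 1.0)
y = [1.0, -2.0, 0.0, 0.5]
x = extremal_vector(y, p)

value = pairing(x, y)                     # L(x)
bound = p_norm(x, p) * p_norm(y, q)       # Hoelder upper bound
print(value, bound)
```

For this particular `x` the two printed numbers agree up to rounding, which is the finite-dimensional shadow of the identity $\|L\| = \|y\|_{\frac{p}{p-1}}$.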


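Remark 4.25.
(i) From the argument for Theorem 4.24 it follows that $(\mathbb{F}^n)^* = \mathbb{F}^n$, i.e.,
$L \in (\mathbb{F}^n)^*$ if and only if there exists a unique $y = (y_1, ..., y_n) \in \mathbb{F}^n$ such that

\[ L(x) = \sum_{k=1}^{n} x_k y_k \quad\text{for all}\ x = (x_1, ..., x_n) \in \mathbb{F}^n. \]

(ii) If $p \in (1, \infty)$, then $\ell_p(\mathbb{N}, \mathbb{F})$ is reflexive. However, if $c_0(\mathbb{N}, \mathbb{F})$ stands for
the space of all vectors $(x_j)_{j=1}^{\infty}$ in $\ell_\infty$ with $\lim_{j\to\infty} x_j = 0$, then
$c_0(\mathbb{N}, \mathbb{F})^* = \ell_1(\mathbb{N}, \mathbb{F})$, which can be checked in a similar way to proving Theorem 4.24, and
hence $c_0(\mathbb{N}, \mathbb{F})$ is not reflexive thanks to $\ell_1(\mathbb{N}, \mathbb{F})^* = \ell_\infty(\mathbb{N}, \mathbb{F})$.
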
The second one characterizes the Banach dual of the space of all $\mathbb{F}$-valued
continuous functions on a given finite closed interval in $\mathbb{R}$.

Theorem 4.26. For $-\infty < a < b < \infty$ equip $f \in C([a, b], \mathbb{F})$ with the sup-norm

\[ \|f\|_\infty = \sup_{x\in[a,b]} |f(x)|, \]

and let $BV([a, b], \mathbb{F})$ be the class of all $\mathbb{F}$-valued functions having bounded
variation on $[a, b]$. If

\[ BV_0([a, b], \mathbb{F}) = \Big\{ g \in BV([a, b], \mathbb{F}) : \int_a^b f(x)\,dg(x) = 0 \ \text{for all}\ f \in C([a, b], \mathbb{F}) \Big\}, \]

then

\[ C([a, b], \mathbb{F})^* = BV([a, b], \mathbb{F})/BV_0([a, b], \mathbb{F}) \]

in the sense that $L \in C([a, b], \mathbb{F})^*$ when and only when there is a unique
equivalence class

\[ g + BV_0([a, b], \mathbb{F}) \in BV([a, b], \mathbb{F})/BV_0([a, b], \mathbb{F}) \]

such that $L$ can be written as the Riemann-Stieltjes integral

\[ L(f) = \int_a^b f(x)\,dg(x) \quad\text{for all}\ f \in C([a, b], \mathbb{F}). \]

Proof. On the one hand, suppose the statement after the when and only when
is true. We may write $g = g_1 + ig_2$ where $g_1, g_2 \in BV[a, b]$, i.e., $V_a^b(g_k) < \infty$
for $k = 1, 2$, in the sense of Definition ??, and consequently,

\[ |L(f)| = \Big|\int_a^b f(x)\,d\big(g_1(x) + ig_2(x)\big)\Big| = \Big|\int_a^b f(x)\,dg_1(x) + i\int_a^b f(x)\,dg_2(x)\Big| \le \|f\|_\infty \big(V_a^b(g_1) + V_a^b(g_2)\big). \]

Thus, this functional $L$ belongs to $C([a, b], \mathbb{F})^*$.
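Conversely, suppose $L \in C([a, b], \mathbb{F})^*$ and $E$ is a Hahn-Banach extension of
$L$ to the space $BD([a, b], \mathbb{F})$ of all functions $f : [a, b] \to \mathbb{F}$ satisfying
$\|f\|_\infty = \sup\{|f(x)| : x \in [a, b]\} < \infty$. For $t \in [a, b]$ let $g(t) = E(1_{[a,t]})$, but also for
$\{t_0, t_1, \cdots, t_n\}$, a partition of $[a, b]$, and for each $k \in \{1, \cdots, n\}$ let $\lambda_k$ be the
sign of $g(t_k) - g(t_{k-1})$, i.e., the number in $\mathbb{F}$ obeying

\[ |g(t_k) - g(t_{k-1})| = \lambda_k \big(g(t_k) - g(t_{k-1})\big). \]

Because of the following representation of 1:

\[ 1 = \sum_{k=1}^{n} \big(1_{[a,t_k]} - 1_{[a,t_{k-1}]}\big) \quad\text{on}\ [a, b], \]

one has

\[ \sum_{k=1}^{n} |g(t_k) - g(t_{k-1})| = \sum_{k=1}^{n} \lambda_k \big(E(1_{[a,t_k]}) - E(1_{[a,t_{k-1}]})\big) \]
\[ \le \Big|E\Big(\sum_{k=1}^{n} \lambda_k (1_{[a,t_k]} - 1_{[a,t_{k-1}]})\Big)\Big| \]
\[ \le \|E\| \Big\|\sum_{k=1}^{n} \lambda_k (1_{[a,t_k]} - 1_{[a,t_{k-1}]})\Big\|_\infty \]
\[ \le \|E\| \Big\|\sum_{k=1}^{n} (1_{[a,t_k]} - 1_{[a,t_{k-1}]})\Big\|_\infty \]
\[ \le \|E\|. \]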
This means that $g$ has bounded variation on $[a, b]$. Now let $f \in C([a, b], \mathbb{F})$
and $t_k = a + \frac{k}{n}(b - a)$ for $k = 0, ..., n$, and define $f_n \in BD([a, b], \mathbb{F})$ via

\[ f_n(x) = \sum_{k=1}^{n} f(t_{k-1}) \big(1_{[a,t_k]}(x) - 1_{[a,t_{k-1}]}(x)\big), \quad x \in [a, b]. \]

Now, it follows from the foregoing representation of 1 and the uniform
continuity of $f$ on $[a, b]$ that

\[ \|f_n - f\|_\infty \le \max_{1\le k\le n}\ \sup_{x\in[t_{k-1},t_k]} |f(x) - f(t_{k-1})| \to 0 \quad\text{as}\ n \to \infty. \]

Note that $E$ is continuous on $BD([a, b], \mathbb{F})$. So it follows from the definition
of the Riemann-Stieltjes integral that

\[ L(f) = E(f) = \lim_{n\to\infty} E(f_n) = \lim_{n\to\infty} \sum_{k=1}^{n} f(t_{k-1})\big(g(t_k) - g(t_{k-1})\big) = \int_a^b f(x)\,dg(x), \]

as required. Finally, if $h + BV_0([a, b], \mathbb{F}) \in BV([a, b], \mathbb{F})/BV_0([a, b], \mathbb{F})$ also
satisfies $L(f) = \int_a^b f(x)\,dh(x)$ for all $f \in C([a, b], \mathbb{F})$, then

\[ \int_a^b f(x)\,d\big(g(x) - h(x)\big) = 0 \quad\text{for all}\ f \in C([a, b], \mathbb{F}), \]

and hence $g - h \in BV_0([a, b], \mathbb{F})$, as desired.
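
The Riemann-Stieltjes sums over the uniform partition $t_k = a + \frac{k}{n}(b-a)$ that appear in this proof can be evaluated directly. The Python sketch below is only an illustration under assumed data: the integrand $f(x) = x$ and the integrator $g(x) = x^2$ on $[0, 1]$ are hypothetical choices, for which $\int_0^1 f\,dg = \int_0^1 2x^2\,dx = 2/3$. The function name `riemann_stieltjes_sum` is likewise not from the notes.

```python
def riemann_stieltjes_sum(f, g, a, b, n):
    """Left-endpoint sum: sum_k f(t_{k-1}) * (g(t_k) - g(t_{k-1}))
    over the uniform partition t_k = a + k*(b - a)/n."""
    total = 0.0
    for k in range(1, n + 1):
        left = a + (k - 1) * (b - a) / n
        right = a + k * (b - a) / n
        total += f(left) * (g(right) - g(left))
    return total


# Illustrative data (an assumption): f(x) = x, g(x) = x^2 on [0, 1].
approximations = [
    riemann_stieltjes_sum(lambda x: x, lambda x: x * x, 0.0, 1.0, n)
    for n in (10, 100, 10000)
]
print(approximations)
```

As $n$ grows the sums approach $2/3$, mirroring the limit $\lim_{n\to\infty} E(f_n)$ in the proof.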


To reach the third dual Banach space, we need the following inequality of
Olof Hanner, which is viewed as a sort of improvement of Minkowski's inequality
in Remark ??.
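Theorem 4.27. Let $g : \mathbb{R} \to \mathbb{R}$ be increasing, $E \subseteq \mathbb{R}$ $m_g$-measurable,
$p \in [1, \infty)$, and $f_1, f_2 \in LRS_g^p(E, \mathbb{F})$.
(i) If $p \in [1, 2]$, then

\[ \|f_1 + f_2\|_{p,m_g,E}^p + \|f_1 - f_2\|_{p,m_g,E}^p \ge \big(\|f_1\|_{p,m_g,E} + \|f_2\|_{p,m_g,E}\big)^p + \big|\|f_1\|_{p,m_g,E} - \|f_2\|_{p,m_g,E}\big|^p \]

and

\[ \big(\|f_1 + f_2\|_{p,m_g,E} + \|f_1 - f_2\|_{p,m_g,E}\big)^p + \big|\|f_1 + f_2\|_{p,m_g,E} - \|f_1 - f_2\|_{p,m_g,E}\big|^p \le 2^p \big(\|f_1\|_{p,m_g,E}^p + \|f_2\|_{p,m_g,E}^p\big). \]

(ii) If $p \in [2, \infty)$, then

\[ \|f_1 + f_2\|_{p,m_g,E}^p + \|f_1 - f_2\|_{p,m_g,E}^p \le \big(\|f_1\|_{p,m_g,E} + \|f_2\|_{p,m_g,E}\big)^p + \big|\|f_1\|_{p,m_g,E} - \|f_2\|_{p,m_g,E}\big|^p \]

and

\[ \big(\|f_1 + f_2\|_{p,m_g,E} + \|f_1 - f_2\|_{p,m_g,E}\big)^p + \big|\|f_1 + f_2\|_{p,m_g,E} - \|f_1 - f_2\|_{p,m_g,E}\big|^p \ge 2^p \big(\|f_1\|_{p,m_g,E}^p + \|f_2\|_{p,m_g,E}^p\big). \]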


Proof. It suffices to verify the first inequalities in (i) and (ii) since the second
ones follow from the first ones via replacing $f_1$ and $f_2$ with $f_1 + f_2$ and $f_1 - f_2$.
Note that the cases $p = 1$ and $p = 2$ are clear from a direct computation. So,
it remains to check the case $p \notin \{1, 2\}$. To this end, we may assume

\[ R = \frac{\|f_2\|_{p,m_g,E}}{\|f_1\|_{p,m_g,E}} \le 1 \quad\text{and}\quad \|f_1\|_{p,m_g,E} = 1. \]

With this assumption and $r \in [0, 1]$, we consider

\[ \phi(r) = (1 + r)^{p-1} + (1 - r)^{p-1} \quad\text{and}\quad \psi(r) = \frac{(1 + r)^{p-1} - (1 - r)^{p-1}}{r^{p-1}} \]

and compute

\[ \frac{d\phi(r)}{dr} + \frac{d\psi(r)}{dr} R^p = (p - 1)\big((1 + r)^{p-2} - (1 - r)^{p-2}\big)\Big(1 - \Big(\frac{R}{r}\Big)^p\Big), \]

which vanishes only at $r = R$ and indicates that $\phi(r) + \psi(r)R^p$ has a maximum
or minimum at $r = R$ when $p < 2$ or $p > 2$. Since

\[ \psi(r) \begin{cases} \le \phi(r), & p < 2 \\ \ge \phi(r), & p > 2, \end{cases} \]

for $R > 1$ one has

\[ \phi(r) + \psi(r)R^p \begin{cases} \le \psi(r) + \phi(r)R^p, & p < 2 \\ \ge \psi(r) + \phi(r)R^p, & p > 2. \end{cases} \]

Accordingly, for $|a|, |b| \ge 0$ one gets

\[ \phi(r)|a|^p + \psi(r)|b|^p \begin{cases} \le |a + b|^p + |a - b|^p, & p < 2 \\ \ge |a + b|^p + |a - b|^p, & p > 2, \end{cases} \]

with equality when $r = b/a \le 1$ and $a, b \ge 0$. This in turn implies

\[ \int_E \big(\phi(r)|f_1|^p + \psi(r)|f_2|^p\big)\,dm_g \begin{cases} \le \int_E \big(|f_1 + f_2|^p + |f_1 - f_2|^p\big)\,dm_g, & p < 2 \\ \ge \int_E \big(|f_1 + f_2|^p + |f_1 - f_2|^p\big)\,dm_g, & p > 2. \end{cases} \]

Of course, the last estimates, along with $r = R$, give the desired inequalities
right away, because $\phi(R) + \psi(R)R^p = (1 + R)^p + (1 - R)^p$.
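
The first inequality in each part of Theorem 4.27 can be probed numerically. The Python sketch below is a hedged illustration, not part of the notes: it assumes the simplest possible setting of real vectors with the counting measure on three points (so the norm is the finite $\ell_p$ norm), and the helper names `p_norm` and `hanner_gap` are hypothetical.

```python
def p_norm(v, p):
    """Finite-dimensional l_p norm of a real vector."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)


def hanner_gap(f1, f2, p):
    """Left side minus right side of the first Hanner inequality:
    ||f1+f2||^p + ||f1-f2||^p - (||f1||+||f2||)^p - | ||f1||-||f2|| |^p."""
    plus = [a + b for a, b in zip(f1, f2)]
    minus = [a - b for a, b in zip(f1, f2)]
    n1, n2 = p_norm(f1, p), p_norm(f2, p)
    left = p_norm(plus, p) ** p + p_norm(minus, p) ** p
    right = (n1 + n2) ** p + abs(n1 - n2) ** p
    return left - right


# Sample vectors chosen arbitrarily for illustration.
f1 = [1.0, 2.0, -1.0]
f2 = [0.5, -1.0, 3.0]
gaps = {p: hanner_gap(f1, f2, p) for p in (1.5, 2.0, 3.0)}
print(gaps)
```

The gap is nonnegative for $p = 1.5$, vanishes up to rounding for $p = 2$ (the parallelogram law), and is nonpositive for $p = 3$, in line with the direction of the inequalities in (i) and (ii).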

An immediate example of the use of Theorem 4.27 is given by the following
projection result which will be used later on.

Corollary 4.28. Let $g : \mathbb{R} \to \mathbb{R}$ be increasing, $E \subseteq \mathbb{R}$ $m_g$-measurable,
$p \in (1, \infty)$, and $K$ a closed convex subset of $LRS_g^p(E, \mathbb{F})$. If $f \in LRS_g^p(E, \mathbb{F}) \setminus K$,
then there exists a function $h_0 \in K$ such that

\[ \|f - h_0\|_{p,m_g,E} = \inf_{h\in K} \|f - h\|_{p,m_g,E}. \]

Moreover, every function $h \in K$ satisfies

\[ \Re \int_E (h - h_0)\,\overline{(f - h_0)}\,|f - h_0|^{p-2}\,dm_g \le 0. \]

Proof. Without loss of generality, we may assume $f = 0$ and set
$\delta = \inf_{h\in K} \|h\|_{p,m_g,E}$.
For the first result, suppose $(h_j)_{j=1}^{\infty}$ is in $K$ with $\lim_{j\to\infty} \|h_j\|_{p,m_g,E} = \delta$.
Since $K$ is convex, $(h_j + h_k)/2 \in K$ and consequently,

\[ 2\delta \le \|h_j + h_k\|_{p,m_g,E} \le \|h_j\|_{p,m_g,E} + \|h_k\|_{p,m_g,E} \to 2\delta \quad\text{as}\ k, j \to \infty. \]

Suppose $(h_n)_{n=1}^{\infty}$ is not a Cauchy sequence in $LRS_g^p(E, \mathbb{F})$. Then there is a
constant $\kappa > 0$ such that $\|h_j - h_k\|_{p,m_g,E} \ge \kappa$ holds for infinitely many $j$'s
and $k$'s.
Case 1: $p \in [2, \infty)$. According to the first inequality in Theorem 4.27 (ii),
we have

\[ \|h_j + h_k\|_{p,m_g,E}^p + \|h_j - h_k\|_{p,m_g,E}^p \le \big(\|h_j\|_{p,m_g,E} + \|h_k\|_{p,m_g,E}\big)^p + \big|\|h_j\|_{p,m_g,E} - \|h_k\|_{p,m_g,E}\big|^p, \]

whence deriving a contradiction

\[ (2\delta)^p < (2\delta)^p + \kappa^p \le (2\delta)^p. \]

Case 2: $p \in (1, 2]$. According to the second inequality in Theorem 4.27 (i),
we have

\[ \big(\|h_j + h_k\|_{p,m_g,E} + \|h_j - h_k\|_{p,m_g,E}\big)^p + \big|\|h_j + h_k\|_{p,m_g,E} - \|h_j - h_k\|_{p,m_g,E}\big|^p \le 2^p \big(\|h_j\|_{p,m_g,E}^p + \|h_k\|_{p,m_g,E}^p\big) \to 2^{p+1}\delta^p \quad\text{as}\ j, k \to \infty, \]

whence getting

\[ |2\delta + \kappa|^p + |2\delta - \kappa|^p \le 2^{p+1}\delta^p. \]

Since $\phi(x) = |2\delta + x|^p$ is a strictly convex function of $x$, that is to say,
$\phi(tx_1 + (1 - t)x_2) < t\phi(x_1) + (1 - t)\phi(x_2)$ for any $t \in (0, 1)$ and all distinct $x_1, x_2 \in \mathbb{R}$,
we obtain the following contradiction:

\[ 2^{1+p}\delta^p < |2\delta + \kappa|^p + |2\delta - \kappa|^p \le 2^{p+1}\delta^p. \]

Thus, $(h_n)_{n=1}^{\infty}$ must be a Cauchy sequence in $LRS_g^p(E, \mathbb{F})$. Note that $K$ is
closed. So $(h_n)_{n=1}^{\infty}$ is convergent to an element $h_0 \in K$ with $\delta = \|h_0\|_{p,m_g,E}$.
For the second result, let $h \in K$. Since $K$ is convex, $h_t = th + (1 - t)h_0 \in K$
for all $t \in [0, 1]$, and consequently, $\phi(t) = \|h_t\|_{p,m_g,E}^p \ge \delta^p$ while $\phi(0) = \delta^p$.
Noticing both

\[ \lim_{t\to 0^+} \frac{|h_t|^p - |h_0|^p}{t} = 2^{-1} p\,|h_0|^{p-2} \big((h - h_0)\overline{h_0} + \overline{(h - h_0)}\,h_0\big) \]

and

\[ |h_0|^p - |2h_0 - h|^p \le \frac{|h_t|^p - |h_0|^p}{t} \le |h|^p - |h_0|^p, \]

which follows from the convexity $\big(tx_1 + (1 - t)x_2\big)^p \le tx_1^p + (1 - t)x_2^p$ of the
function $x^p$ on $[0, \infty)$, we use the dominated convergence theorem to obtain
that $\phi'(0)$ exists with

\[ 0 \le \phi'(0) = p\,\Re \int_E (h - h_0)\,\overline{h_0}\,|h_0|^{p-2}\,dm_g, \]

as desired.
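
To see the two conclusions of Corollary 4.28 in the most familiar case, the Python sketch below takes $p = 2$, real vectors in $\mathbb{R}^3$ with the counting measure, and $K$ the closed unit ball; all of these are simplifying assumptions made for illustration. For $p = 2$ the weight $|f - h_0|^{p-2}$ equals 1, so the variational inequality reduces to $\sum_k (h - h_0)_k (f - h_0)_k \le 0$. The names `project_onto_unit_ball` and `variational_form` are hypothetical.

```python
import math


def norm2(v):
    """Euclidean norm."""
    return math.sqrt(sum(t * t for t in v))


def project_onto_unit_ball(f):
    """Nearest point of the closed Euclidean unit ball to f."""
    n = norm2(f)
    return list(f) if n <= 1.0 else [t / n for t in f]


def variational_form(h, h0, f):
    """sum_k (h_k - h0_k) * (f_k - h0_k), the p = 2 real case of the integral."""
    return sum((a - b) * (c - b) for a, b, c in zip(h, h0, f))


f = [3.0, 4.0, 0.0]                 # a point outside K
h0 = project_onto_unit_ball(f)      # the minimizer, here f / ||f||

# A few members of K (each has norm at most 1).
samples = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, -1.0],
    [-0.6, -0.8, 0.0],
    [0.0, 0.0, 0.0],
]
values = [variational_form(h, h0, f) for h in samples]
print(h0, values)
```

Every entry of `values` is nonpositive, and no sample point is closer to `f` than `h0`, which is the finite-dimensional picture behind the corollary.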

Now, we are ready to identify the dual space of each $LRS_g^p(E, \mathbb{F})$ for
$p \in [1, \infty)$ via the Frigyes Riesz representation theorem below.

Theorem 4.29. Let $g : \mathbb{R} \to \mathbb{R}$ be increasing, $E \subseteq \mathbb{R}$ $m_g$-measurable
and $p \in [1, \infty)$. Then $LRS_g^p(E, \mathbb{F})^* = LRS_g^{\frac{p}{p-1}}(E, \mathbb{F})$ in the sense that
$L \in LRS_g^p(E, \mathbb{F})^*$ if and only if there is a unique element $h \in LRS_g^{\frac{p}{p-1}}(E, \mathbb{F})$
such that

\[ L(f) = \int_E f h\,dm_g \quad\text{for all}\ f \in LRS_g^p(E, \mathbb{F}). \]

Proof. The Hölder inequality (see also Theorem ??) implies the above-formulated
functional is an element of $LRS_g^p(E, \mathbb{F})^*$.
For the reversed conclusion, we may assume $0 \ne L \in LRS_g^p(E, \mathbb{F})^*$. Then
we consider two cases as follows.
Case 1: $p \in (1, \infty)$. For the above-given $L$, define

\[ K = \{f \in LRS_g^p(E, \mathbb{F}) : L(f) = 0\}. \]

Since $L$ is continuous, the subset $K$ of $LRS_g^p(E, \mathbb{F})$ is closed and convex.
Because of $L \ne 0$, there exists an element $f_0 \in LRS_g^p(E, \mathbb{F})$ such that $L(f_0) \ne 0$,
namely, $f_0 \notin K$. According to Corollary 4.28 and the linearity of $K$, we can
find an $h_0 \in K$ such that

\[ \Re \int_E (h - h_0)\,|f_0 - h_0|^{p-2}\,\overline{(f_0 - h_0)}\,dm_g \le 0 \quad\text{for all}\ h \in K. \]

Choosing $h = 2h_0$ and $h = 0$ in the above estimate, we get

\[ \Re \int_E h_0\,|f_0 - h_0|^{p-2}\,\overline{(f_0 - h_0)}\,dm_g = 0, \]

whence finding

\[ \Re \int_E (\pm h)\,|f_0 - h_0|^{p-2}\,\overline{(f_0 - h_0)}\,dm_g \le 0 \quad\text{for all}\ h \in K. \]

Note that $ih \in K$ (when $\mathbb{F} = \mathbb{C}$). So the last inequality still holds when $\pm h$ is
replaced by $\pm ih$. Consequently, the last integral equals 0 for all $h \in K$.
If $f$ is an arbitrary element of $LRS_g^p(E, \mathbb{F})$ and

\[ f = f_1 + f_2; \quad f_1 = \Big(\frac{L(f)}{L(f_0 - h_0)}\Big)(f_0 - h_0), \]

then $f_2 \in K$ and hence

\[ \int_E f\,|f_0 - h_0|^{p-2}\,\overline{(f_0 - h_0)}\,dm_g = \int_E f_1\,|f_0 - h_0|^{p-2}\,\overline{(f_0 - h_0)}\,dm_g + \int_E f_2\,|f_0 - h_0|^{p-2}\,\overline{(f_0 - h_0)}\,dm_g \]
\[ = \int_E f_1\,|f_0 - h_0|^{p-2}\,\overline{(f_0 - h_0)}\,dm_g = L(f)\,\frac{\int_E |f_0 - h_0|^p\,dm_g}{L(f_0 - h_0)}. \]

Clearly, the desired element $h$ determining the representation of $L$ is

\[ h = \frac{|f_0 - h_0|^{p-2}\,\overline{(f_0 - h_0)}\,L(f_0 - h_0)}{\int_E |f_0 - h_0|^p\,dm_g}. \]

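In fact, such an element $h$ is unique: if there is another element $\tilde{h}$ then

\[ \int_E (h - \tilde{h}) f\,dm_g = 0 \quad\text{for}\ f = \overline{(h - \tilde{h})}\,|h - \tilde{h}|^{\frac{2-p}{p-1}} \in LRS_g^p(E, \mathbb{F}), \]

and hence $h = \tilde{h}$ $m_g$-a.e. on $E$.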


Case 2: $p = 1$. We handle this case via two situations. The first one is
$m_g(E) < \infty$. Under this condition, $L$ is also in $LRS_g^p(E, \mathbb{F})^*$ (for any
$p \in (1, \infty)$) since the Hölder inequality yields that

\[ |L(f)| \le \|L\|\,\|f\|_{1,m_g,E} \le \|L\| \big(m_g(E)\big)^{\frac{p-1}{p}} \|f\|_{p,m_g,E} \]

is valid for all $f \in LRS_g^p(E, \mathbb{F})$. However, according to the previous Case 1
there exists a unique $h \in LRS_g^{\frac{p}{p-1}}(E, \mathbb{F})$ such that

\[ L(f) = \int_E f h\,dm_g \quad\text{for all}\ f \in LRS_g^p(E, \mathbb{F}). \]

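So, the inclusion $LRS_g^{p_2}(E, \mathbb{F}) \subseteq LRS_g^{p_1}(E, \mathbb{F})$, which holds for $1 < p_1 < p_2 < \infty$
and follows from Hölder's inequality, indicates that $h$ does not depend
on $p$. Now if $f = |h|^{\frac{p}{p-1}-2}\,\overline{h}$, then $f \in LRS_g^p(E, \mathbb{F})$ and hence

\[ \|h\|_{\frac{p}{p-1},m_g,E}^{\frac{p}{p-1}} = L(f) \le \|L\| \big(m_g(E)\big)^{\frac{p-1}{p}} \|h\|_{\frac{p}{p-1},m_g,E}^{\frac{p}{p-1}-1}. \]

Accordingly, for any $\epsilon > 0$ we have

\[ (\|L\| + \epsilon) \Big(m_g\big(\{x \in E : |h(x)| > \|L\| + \epsilon\}\big)\Big)^{\frac{p-1}{p}} \le \|h\|_{\frac{p}{p-1},m_g,E} \le \|L\| \big(m_g(E)\big)^{\frac{p-1}{p}}. \]

Since $p$ can be taken arbitrarily close to 1, so that $\frac{p-1}{p} \to 0$ and
$\frac{p}{p-1}$ becomes arbitrarily large, the last estimate forces

\[ m_g\big(\{x \in E : |h(x)| > \|L\| + \epsilon\}\big) = 0, \]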

namely, $h \in LRS_g^\infty(E, \mathbb{F})$. This fact further implies that $\int_E |h||f|\,dm_g < \infty$
for all $f \in LRS_g^1(E, \mathbb{F})$. With $f \in LRS_g^1(E, \mathbb{F})$ and $j \in \mathbb{N}$, we consider

\[ f_j(x) = \begin{cases} f(x), & |f(x)| \le j \\ 0, & |f(x)| > j, \end{cases} \]

and find that $|f_j| \in LRS_g^p(E)$, $f_j(x) \to f(x)$ and $|f_j(x)| \le |f(x)|$ for each
$x \in E$. By the dominated convergence theorem it follows that

\[ \lim_{j\to\infty} \|f_j - f\|_{1,m_g,E} = 0 \quad\text{and}\quad \lim_{j\to\infty} \|hf_j - hf\|_{1,m_g,E} = 0 \]

and so that

\[ L(f) = \lim_{j\to\infty} L(f_j) = \lim_{j\to\infty} \int_E f_j h\,dm_g = \int_E f h\,dm_g. \]

The second one is $m_g(E) = \infty$. Since $E = \bigcup_{j=-\infty}^{\infty} E \cap [j, j + 1)$ and
$m_g\big([j, j + 1)\big) < \infty$, any function $f \in LRS_g^1(E, \mathbb{F})$ can be written as

\[ f(x) = \sum_{j=-\infty}^{\infty} f(x) 1_{E\cap[j,j+1)}(x) \quad\text{for}\ x \in E. \]

Note that $f \in LRS_g^1\big(E \cap [j, j + 1), \mathbb{F}\big)$ if and only if $f 1_{E\cap[j,j+1)} \in LRS_g^1(E, \mathbb{F})$.
So, $L_j(f) = L(f 1_{E\cap[j,j+1)})$ defines an element in $LRS_g^1\big(E \cap [j, j + 1), \mathbb{F}\big)^*$,
and consequently, there exists an element $h_j \in LRS_g^\infty\big(E \cap [j, j + 1), \mathbb{F}\big)$ such
that

\[ L(f 1_{E\cap[j,j+1)}) = \int_{E\cap[j,j+1)} f h_j\,dm_g \]

and

\[ \|h_j\|_{\infty,m_g,E\cap[j,j+1)} \le \|L_j\| \le \|L\|. \]

Now, if the function $h$ is defined on $E$ by setting $h = h_j$ on $E \cap [j, j + 1)$ for
each $j \in \mathbb{Z}$, then $h \in LRS_g^\infty(E, \mathbb{F})$ and

\[ L(f) = L\Big(\sum_{j=-\infty}^{\infty} f 1_{E\cap[j,j+1)}\Big) = \sum_{j=-\infty}^{\infty} \int_{E\cap[j,j+1)} f h\,dm_g = \int_E f h\,dm_g, \]

owing to the countable additivity of the measure $m_g$. Moreover, if there is
another function $\tilde{h}$ satisfying the last representation, then $\int_E |h - \tilde{h}||f|\,dm_g = 0$
for all $f \in LRS_g^1(E, \mathbb{F})$, and hence for $f = 1_{E\cap[j,j+1)}$ where $j \in \mathbb{Z}$. This
yields $h = \tilde{h}$ $m_g$-a.e. on $E$. Therefore, the function $h$ is uniquely determined.

Remark 4.30. From Corollary 4.28 to Theorem 4.29 we have used the following
definition of integration

\[ \int_E f\,dm_g = \int_E (\Re f + i\,\Im f)\,dm_g = \int_E \Re f\,dm_g + i \int_E \Im f\,dm_g \]

(for any $|f| \in LRS_g(E)$) and its linearity

\[ \int_E (c_1 f_1 + c_2 f_2)\,dm_g = c_1 \int_E f_1\,dm_g + c_2 \int_E f_2\,dm_g \]

(for any $|f_1|, |f_2| \in LRS_g(E)$ and $c_1, c_2 \in \mathbb{F}$) as well as some elementary
algebraic operations of complex numbers.

4.4 Weak and Weak* Topologies


The purpose of this section is to discuss some special topologies on the Banach
space X and on its dual X ∗ , which are weaker than the usual ones.

Definition 4.31. Let $(X, \|\cdot\|_X)$ be a normed linear space over $\mathbb{F}$. Then a
sequence $(x_k)_{k=1}^{\infty}$ in $X$ is said to converge weakly to $x \in X$, denoted $x_k \xrightarrow{w} x$,
provided $\lim_{k\to\infty} |L(x_k) - L(x)| = 0$ for each $L \in X^*$.

Remark 4.32.
(i) Clearly, $x_k \to x$ in $X$ implies $x_k \xrightarrow{w} x$ (see Theorems ?? (i) and 4.29 for a
typical example), but not conversely. For example, consider $(\ell_2(\mathbb{N}, \mathbb{R}), \|\cdot\|_2)$:
if $e_j = (\underbrace{0, ..., 0}_{j-1}, 1, 0, ...)$ for each $j \in \mathbb{N}$, then $\|e_j\|_2 = 1$ yields that $(e_j)_{j=1}^{\infty}$ does
not converge strongly to 0; nevertheless, according to Theorem 4.24, for any
$L \in B(\ell_2(\mathbb{N}, \mathbb{R}), \mathbb{R}) = \ell_2(\mathbb{N}, \mathbb{R})^*$ there exists a $y = (y_k)_{k=1}^{\infty} \in \ell_2(\mathbb{N}, \mathbb{R})$
such that $\lim_{j\to\infty} L(e_j) = \lim_{j\to\infty} y_j = 0$, namely, $e_j \xrightarrow{w} 0$.
(ii) In addition, a sequence may have at most one weak limit, for if $x \ne y$ then,
as we saw in the proof of Corollary 3.34 (iv), there exists an element $L \in X^*$
such that $L(x) - L(y) = L(x - y) = 1$. If $x_k \xrightarrow{w} x$ and $x_k \xrightarrow{w} y$, then there is
an $N \in \mathbb{N}$ such that $k > N$ implies

\[ |L(x_k) - L(x)| < \frac{1}{2} \quad\text{and}\quad |L(x_k) - L(y)| < \frac{1}{2}, \]

and hence $|L(x) - L(y)| < 1$, a contradiction.
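
The example in Remark 4.32 (i) can be observed on truncated vectors. The Python sketch below is an illustration under assumptions: sequences are cut off at a finite length, and the functional is represented by the hypothetical choice $y_k = 1/k$, which lies in $\ell_2(\mathbb{N}, \mathbb{R})$. The helper names are not from the notes.

```python
import math


def unit_vector(j, length):
    """Truncation of e_j: 1 in position j (1-based), 0 elsewhere."""
    return [1.0 if k == j else 0.0 for k in range(1, length + 1)]


def pairing(x, y):
    """L(x) = sum_k x_k * y_k for the functional represented by y."""
    return sum(a * b for a, b in zip(x, y))


def norm2(x):
    """Euclidean (l_2) norm of a finite vector."""
    return math.sqrt(sum(t * t for t in x))


LENGTH = 1000
y = [1.0 / k for k in range(1, LENGTH + 1)]   # assumed representing sequence
indices = [1, 10, 100, 1000]

values = [pairing(unit_vector(j, LENGTH), y) for j in indices]   # L(e_j) = y_j
norms = [norm2(unit_vector(j, LENGTH)) for j in indices]         # always 1
print(values, norms)
```

The pairings $L(e_j) = y_j$ shrink toward 0 while every $\|e_j\|_2$ stays equal to 1, which is exactly the gap between weak and strong convergence described in the remark.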

The following is a natural way to introduce the weak topology where the
basis is formed by all weakly open sets.

Definition 4.33. Let $(X, \|\cdot\|_X)$ be a normed linear space over $\mathbb{F}$.
(i) Given $\epsilon > 0$ and a finite set $\{L_1, \cdots, L_n\}$ in $X^*$, the $(\epsilon, L_1, \cdots, L_n)$-
neighborhood of a point $x \in X$, denoted $N(x; \epsilon, L_1, \cdots, L_n)$, is defined by

\[ \{y \in X : |L_k(x) - L_k(y)| < \epsilon,\ k = 1, \cdots, n\}. \]

(ii) A subset of $X$ is called weakly open if it is a union of the neighborhoods
$N(x; \epsilon, L_1, \cdots, L_n)$. The complement of a weakly open set is called a weakly
closed set.
(iii) A subset $E$ of $X$ is called weakly compact provided every cover of $E$ by
weakly open sets has a finite subcover.

Remark 4.34. Note that $B_\epsilon^X(x) \subseteq N(x; \epsilon\|L\|, L)$ for each $L \in X^*$. So
Definition 4.33 is a natural generalization of the idea of a ball in a metric space.
From the definition it turns out that $x_k \xrightarrow{w} x$ is equivalent to saying that to
any such weak neighborhood of $x$ there corresponds an $N \in \mathbb{N}$ such that $x_k$
lies in this neighborhood for $k \ge N$.

Using the second dual we introduce the following concept.

Definition 4.35. Let $(X, \|\cdot\|_X)$ be a normed linear space over $\mathbb{F}$, and suppose
$L, L_1, L_2, ... \in X^*$. Then we say that
(i) $(L_n)_{n=1}^{\infty}$ is strongly convergent to $L$ provided $\lim_{n\to\infty} \|L_n - L\| = 0$;
(ii) $(L_n)_{n=1}^{\infty}$ is weak* convergent to $L$ provided $\lim_{n\to\infty} |L_n(x) - L(x)| = 0$ for
all $x \in X$;
(iii) $(L_n)_{n=1}^{\infty}$ is weakly convergent to $L$ provided $\lim_{n\to\infty} |F(L_n) - F(L)| = 0$
for all $F \in X^{**}$.

Remark 4.36.
(i) In general, (ii) and (iii) in Definition 4.35 are not equivalent. But, if there
is an isometry between X and X ∗∗ then both concepts coincide.
(ii) From Theorem 4.15 (ii) we can see that if (X, k · kX ) is a Banach space
over F, then any weak* convergent sequence in X ∗ must be bounded and its
converse is also true whenever this sequence is assumed to be convergent on
a dense subset of X.
(iii) Motivated by Definition 4.35 (iii), we say that $(T_n)_{n=1}^{\infty}$ in $B(X, Y)$ is
weakly convergent to $T \in B(X, Y)$ provided $\lim_{n\to\infty} \big|F\big(T_n(x)\big) - F\big(T(x)\big)\big| = 0$
for all $x \in X$ and $F \in Y^*$. Here $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ are supposed to
be normed linear spaces over $\mathbb{F}$. According to Definition 4.13, we can find out
that the uniform convergence implies the strong convergence which implies
the weak convergence, but not conversely. Since Example 4.14 has shown that
the strong convergence does not yield the uniform convergence, it is enough
to check that the weak convergence does not imply the strong convergence.
To do so, consider $X = Y = \ell_2(\mathbb{N}, \mathbb{R})$ over $\mathbb{R}$, and $T_n(x) = (\underbrace{0, ..., 0}_{n}, x_1, x_2, ...)$
for $x = (x_1, x_2, ...) \in \ell_2(\mathbb{N}, \mathbb{R})$. Clearly, $\|T_n\| = 1$ and if $y = (y_1, y_2, ...) \in \ell_2(\mathbb{N}, \mathbb{R})$ then

\[ \Big|\sum_{k=1}^{\infty} \big(T_n(x)\big)_k y_k\Big| = \Big|\sum_{k=n+1}^{\infty} x_{k-n} y_k\Big| \le \|x\|_2 \Big(\sum_{k=n+1}^{\infty} y_k^2\Big)^{\frac{1}{2}} \]

tends to 0 as $n \to \infty$, and hence by Theorem 4.24 it follows that $(T_n)_{n=1}^{\infty}$ is
weakly convergent to 0. But, $(T_n)_{n=1}^{\infty}$ is not strongly convergent to 0 since,
for $n \ne m$,

\[ \|T_n(e_1) - T_m(e_1)\|_2 = \|e_{n+1} - e_{m+1}\|_2 = \sqrt{2}. \]
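
The behaviour of the shift operators $T_n$ can be reproduced on finite lists. In the Python sketch below, which is only an illustration, vectors are finitely supported, the test functional is represented by the assumed sequence $y_k = 1/k$, and the helper names `shift`, `pairing` and `distance` are hypothetical.

```python
import math


def shift(x, n):
    """T_n(x): prepend n zeros to the (finitely supported) vector x."""
    return [0.0] * n + list(x)


def pairing(x, y):
    """sum_k x_k * y_k; zip stops at the shorter list, which is harmless
    here because x is finitely supported and y is long enough."""
    return sum(a * b for a, b in zip(x, y))


def distance(u, v):
    """l_2 distance between two finitely supported vectors."""
    length = max(len(u), len(v))
    u = list(u) + [0.0] * (length - len(u))
    v = list(v) + [0.0] * (length - len(v))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


x = [1.0, 0.5, 0.25]                       # sample vector (assumption)
y = [1.0 / k for k in range(1, 501)]       # sample functional (assumption)

pairings = [pairing(shift(x, n), y) for n in (1, 10, 100)]
e1 = [1.0]
gap = distance(shift(e1, 2), shift(e1, 5))  # ||T_2(e_1) - T_5(e_1)||_2
print(pairings, gap)
```

The pairings decay as $n$ grows, while the distance between $T_2(e_1)$ and $T_5(e_1)$ stays at $\sqrt{2}$, so $(T_n(e_1))$ cannot be a Cauchy sequence in norm.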

Similarly, we can define the weak-∗ topology on X ∗ via all weak* open
subsets of X ∗ .

Definition 4.37. Let $(X, \|\cdot\|_X)$ be a normed linear space over $\mathbb{F}$.
(i) Given $\epsilon > 0$ and a finite set $\{x_1, \cdots, x_n\}$ in $X$, the $(\epsilon, x_1, \cdots, x_n)$-
neighborhood of a functional $F \in X^*$, denoted $N(F; \epsilon, x_1, \cdots, x_n)$, is defined
by

\[ \{G \in X^* : |G(x_k) - F(x_k)| < \epsilon,\ k = 1, \cdots, n\}. \]

(ii) An arbitrary union of such neighborhoods is called a weak* open set. The
complement of a weak* open set is called a weak* closed set.
(iii) A subset $A$ of $X^*$ is called weak* compact provided every cover of $A$ by
weak* open sets has a finite subcover.

Definition 4.37 yields immediately that a weak* closed subset of a weak*
compact set is itself weak* compact. To grasp the essence of weak* compact
sets, we extend Cartesian products from two sets to a family and work with
their topological compactness.

Definition 4.38.
(i) A collection $\mathcal{T}$ of subsets of a set $X$ is said to be a topology for $X$ provided:
(a) $\emptyset, X \in \mathcal{T}$;
(b) $\bigcap_{j=1}^{n} E_j \in \mathcal{T}$ whenever $n \in \mathbb{N}$ and $E_1, E_2, ..., E_n \in \mathcal{T}$;
(c) $\bigcup_{j\in I} E_j \in \mathcal{T}$ whenever $\{E_j\}_{j\in I}$ is a family in $\mathcal{T}$.
In this case, $(X, \mathcal{T})$ is called a topological space. Meanwhile, members of $\mathcal{T}$
and their complements are called open and closed sets of $(X, \mathcal{T})$ respectively.
Furthermore, $(X, \mathcal{T})$ is called compact provided every cover of $X$ by sets of $\mathcal{T}$
has a finite subcover.
(ii) Let $\{(X_j, \mathcal{T}_j)\}_{j\in I}$ be a family of topological spaces indexed by the set $I$.
Then the Cartesian product of $\{X_j\}_{j\in I}$ is defined by

\[ X = \prod_{j\in I} X_j = \big\{(x_j)_{j\in I} : x_j \in X_j\big\}. \]

Furthermore, for each $j \in I$ the projection $p_j : X \to X_j$ is defined by

\[ p_j\big((x_\alpha)_{\alpha\in I}\big) = x_j. \]

The product topology $\mathcal{T}$ on $X$ is defined to be the topology with the fewest open
sets for which every projection $p_j$ is continuous; that is, $p_j^{-1}(O)$ is open in $X$
whenever $O$ is open in $X_j$. Consequently, the above-defined $(X, \mathcal{T})$ is called a
product topological space.

Lemma 4.39. Let $\mathcal{F}$ be a family of subsets of a given set $X$. If $\mathcal{F}$ enjoys
the finite intersection property (FIP), meaning that each finite collection of
subsets in the family has a nonempty intersection, then $\mathcal{F}$ can be extended to
a family which is maximal with respect to having the FIP.

Proof. Let $\mathcal{T}$ be the collection of all families of subsets of $X$ that contain $\mathcal{F}$
and satisfy the FIP. Define an order on $\mathcal{T}$ via inclusion and let $\{\mathcal{T}_j : j \in J\}$
be any chain in $\mathcal{T}$, that is, a totally ordered subset of $\mathcal{T}$. If $\mathcal{S} = \bigcup_{j\in J} \mathcal{T}_j$, then $\mathcal{S}$
contains $\mathcal{F}$, and if $\{S_1, ..., S_n\}$ is any finite collection of members of $\mathcal{S}$ then
$\{S_1, ..., S_n\} \subseteq \mathcal{T}_{j_0}$ for some $j_0$ since $\{\mathcal{T}_j\}_{j\in J}$ is ordered by inclusion. Now $\mathcal{T}_{j_0}$
enjoys the FIP. So $\bigcap_{k=1}^{n} S_k \ne \emptyset$; that is to say, $\mathcal{S}$ has the FIP, and hence each
chain in $\mathcal{T}$ has an upper bound in $\mathcal{T}$. From Zorn's lemma we conclude that $\mathcal{T}$ has
a maximal element, whence establishing the desired result.

Using this lemma, we can derive the following Andrey Nikolayevich
Tychonoff's theorem.

Theorem 4.40. Let $\{(X_j, \mathcal{T}_j)\}_{j\in I}$ be a family of compact topological spaces
indexed by the set $I$. Then their product topological space $\big(X = \prod_{j\in I} X_j, \mathcal{T}\big)$
is compact.

Proof. By forming the contrapositive statement and taking complements, we
immediately see that $(X, \mathcal{T})$ is compact if and only if any collection of closed
subsets of $(X, \mathcal{T})$ obeying the FIP has a nonempty intersection. For this
reason, we are about to make the following consideration where $\mathcal{T}_j$ and $\mathcal{T}$ will be
dropped from $(X_j, \mathcal{T}_j)$ and $(X, \mathcal{T})$.
Suppose $\mathcal{F}$ is a collection of closed subsets of $X$ which enjoys the FIP. By
Lemma 4.39 we can extend $\mathcal{F}$ to $\mathcal{G}$ which is maximal with respect to having
the FIP. Moreover, for each $j \in I$ the family $\{\overline{p_j(B)} : B \in \mathcal{G}\}$ of subsets of
$X_j$ has the FIP since for any finite collection $\{B_1, ..., B_n\}$ of elements in $\mathcal{G}$
one has

\[ \emptyset \ne p_j\Big(\bigcap_{k=1}^{n} B_k\Big) \subseteq \bigcap_{k=1}^{n} p_j(B_k) \subseteq \bigcap_{k=1}^{n} \overline{p_j(B_k)}. \]

Note that each $X_j$ is compact. By the equivalent statement on the compactness
there exists an $x_j$ in $\bigcap_{B\in\mathcal{G}} \overline{p_j(B)}$.
Define $f \in X$ by setting $f(j) = x_j$. Since $\bigcap_{B\in\mathcal{G}} \overline{B} \subseteq \bigcap_{B\in\mathcal{F}} B$, once
verifying $f \in \bigcap_{B\in\mathcal{G}} \overline{B}$, we complete the argument. In so doing, let $O$ be an
arbitrary open subset of $X$ containing $f$. Then for some finite set of indices
$j_1, ..., j_n$ and open sets $O_{j_k} \subseteq X_{j_k}$, we have

\[ f \in \bigcap_{k=1}^{n} p_{j_k}^{-1}(O_{j_k}) \subseteq O. \]

Accordingly, $x_{j_k} \in O_{j_k}$ and thus $O_{j_k} \cap \overline{p_{j_k}(B)} \ne \emptyset$ for all $B \in \mathcal{G}$. Using the fact
that $O_{j_k}$ is open and $p_{j_k}$ is a projection, we can readily find $O_{j_k} \cap p_{j_k}(B) \ne \emptyset$
and hence $p_{j_k}^{-1}(O_{j_k}) \cap B \ne \emptyset$ for all $B \in \mathcal{G}$. Because $\mathcal{G}$ is maximal with
respect to enjoying the FIP, we must then have $p_{j_k}^{-1}(O_{j_k}) \in \mathcal{G}$ and similarly
$\bigcap_{k=1}^{n} p_{j_k}^{-1}(O_{j_k}) \in \mathcal{G}$, whence obtaining $O \in \mathcal{G}$. From this we see that $O$
intersects every element of $\mathcal{G}$ and so that $f \in \overline{B}$ for all $B \in \mathcal{G}$ due to $f \in O$ and $O$
being an arbitrary open set.

Now, we are ready to discuss the well-known theorem of Leonidas Alaoglu
(also known as the Banach-Alaoglu theorem).

Theorem 4.41. Let $(X, \|\cdot\|_X)$ be a normed linear space over $\mathbb{F}$. Then the
closed unit ball $\overline{B}_1^{X^*}(0) = \{F \in X^* : \|F\| \le 1\}$ of $X^*$ is weak* compact.

Proof. For $x \in X$ let $D_x = \{\lambda \in \mathbb{F} : |\lambda| \le \|x\|_X\}$. Note that $|F(x)| \le \|x\|_X$ for
all $F \in \overline{B}_1^{X^*}(0)$. So $\overline{B}_1^{X^*}(0)$ is a subset of the set of all functions $f$ on $X$ with
$f(x) \in D_x$ for each $x \in X$, i.e., $\overline{B}_1^{X^*}(0) \subseteq \prod_{x\in X} D_x$. The product topology on
$\prod_{x\in X} D_x$ has as open sets unions of neighborhoods of the form

\[ \{f : |f(x_1) - f_0(x_1)| < \epsilon, ..., |f(x_n) - f_0(x_n)| < \epsilon\} \]

for given $f_0$, $\epsilon > 0$ and $x_1, \cdots, x_n \in X$. These open sets when restricted to
$\overline{B}_1^{X^*}(0)$ are just the weak* open subsets of $\overline{B}_1^{X^*}(0)$. According to Theorem
4.40, the compactness of $D_x$ implies the compactness of $\prod_{x\in X} D_x$. Thus to
prove that $\overline{B}_1^{X^*}(0)$ is weak* compact it is enough to show that $\overline{B}_1^{X^*}(0)$ is weak*
closed, since $\prod_{x\in X} D_x$ is compact in the product topology and the argument for
Theorem 1.37 (i) is applicable. Suppose $G$ is in the weak* closure of $\overline{B}_1^{X^*}(0)$; that is, each
weak* neighborhood of $G$ intersects $\overline{B}_1^{X^*}(0)$. Then we must check that $G$ is
linear and $\|G\| \le 1$. Let $x, y \in X$ and $\epsilon > 0$. By the assumption on $G$ we see
that the neighborhood $N(G; \epsilon/3, x, y, x + y)$ contains some $F \in \overline{B}_1^{X^*}(0)$. Since
$F(x + y) = F(x) + F(y)$, we have

\[ |G(x) + G(y) - G(x + y)| \le |G(x) - F(x)| + |G(y) - F(y)| + |G(x + y) - F(x + y)| < \epsilon/3 + \epsilon/3 + \epsilon/3 = \epsilon. \]

So, $G(x + y) = G(x) + G(y)$. In a similar manner, we can prove $G(\lambda x) = \lambda G(x)$
for $\lambda \in \mathbb{F}$. Furthermore, the inequality $|F(x)| \le \|x\|_X$ implies

\[ |G(x)| \le |G(x) - F(x)| + |F(x)| \le \epsilon/3 + \|x\|_X, \]

and hence $|G(x)| \le \|x\|_X$ which means $\|G\| \le 1$. In other words, $G \in \overline{B}_1^{X^*}(0)$.
Therefore, $\overline{B}_1^{X^*}(0)$ is weak* closed and accordingly weak* compact.

Interestingly, the above-established Banach-Alaoglu theorem can be used to derive the weak compactness of the closed unit ball of a reflexive normed linear space.

Corollary 4.42. Let $(X, \|\cdot\|_X)$ be a normed linear space over $\mathbb{F}$. If $X$ is reflexive, then the closed unit ball of $X$ is weakly compact.

Proof. Recall

$$\overline{B}_1^{X}(0) = \{x \in X : \|x\|_X \le 1\}.$$

Since $X$ is reflexive, we conclude that

$$\widehat{\overline{B}_1^{X}(0)} = \{\hat{x} \in X^{**} : x \in \overline{B}_1^{X}(0)\}$$

is exactly the closed unit ball of $X^{**}$. From Theorem 4.41 it follows that $\widehat{\overline{B}_1^{X}(0)}$ is weak* compact. But since $X$ is reflexive, this says that $\overline{B}_1^{X}(0)$ is weakly compact. We are done.

4.5 Compact and Dual Operators


Having had a careful look at the concept of a compact (or weakly compact, or
weak* compact) set, we can find that this concept is a natural generalization
of a finite set. In this section we discuss the so-called compact operators and
their dual forms on Banach spaces which are important in many applications.
Let X and Y be two normed linear spaces over F. If T ∈ B(X, Y ) has
a finite dimensional range T (X) then the image of any bounded set in X
is bounded in T (X) and hence has compact closure owing to T (X) being
isomorphic to a Euclidean space. This motivates the following definition of a
compact operator.

Definition 4.43. Let $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ be normed linear spaces over $\mathbb{F}$. Then a linear operator $T : X \to Y$ is called compact provided the closure $\overline{T(B)}$ is a compact subset of $Y$ for each bounded subset $B$ of $X$.

Here we should note that Definition 4.43 does not require T to be continu-
ous or bounded. Nevertheless, the following result tells us that the continuity
or the boundedness of T is an immediate consequence of the definition.

Theorem 4.44. Let $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ be normed linear spaces over $\mathbb{F}$. If a linear operator $T : X \to Y$ is compact then it is bounded.

Proof. In this case, $\overline{T(\overline{B}_1^X)}$ is compact for the closed unit ball

$$\overline{B}_1^X = \{x \in X : \|x\|_X \le 1\},$$

so it is bounded in $Y$. Accordingly, there exists a constant $C > 0$ such that $\|y\|_Y \le C$ for all $y \in T(\overline{B}_1^X)$. In particular, $\|T(x)\|_Y \le C$ for $x \in \overline{B}_1^X$, namely, $\|T\| \le C$.

Remark 4.45. Definition 4.43 actually produces three more facts.

(i) A linear combination of compact operators is also compact.
(ii) If $T \in B(X)$ is compact and $S \in B(X)$ then $TS$ and $ST$ are compact, and hence the class of all compact operators in $B(X)$ forms a two-sided ideal in algebraic terms.
(iii) A linear operator $T \in B(X, Y)$ is compact if and only if for every bounded sequence $(x_j)_{j=1}^\infty$ in $X$, the sequence $(T(x_j))_{j=1}^\infty$ has a convergent subsequence in $Y$. The necessity is obvious. To see the sufficiency, assume the statement after the "if and only if"; then for any bounded set $B \subseteq X$ and any sequence $(y_k)_{k=1}^\infty$ in $\overline{T(B)}$ there is an $x_k \in B$ such that $\|y_k - T(x_k)\|_Y < k^{-1}$, and consequently $(T(x_k))_{k=1}^\infty$ has, and hence $(y_k)_{k=1}^\infty$ has, a convergent subsequence with limit in $\overline{T(B)}$, establishing the compactness of $\overline{T(B)}$.

Using the criterion in Remark 4.45 (iii), we obtain that the compactness
is preserved under uniform limits in the Banach space setting.

Theorem 4.46. Let $(X, \|\cdot\|_X)$ be a normed linear space and $(Y, \|\cdot\|_Y)$ a Banach space over $\mathbb{F}$. If each $T_j \in B(X, Y)$ is compact and $\lim_{j\to\infty} \|T_j - T\| = 0$ then $T : X \to Y$ is compact.

Proof. Suppose $(x_j)_{j=1}^\infty$ is a bounded sequence in $X$. Since $T_1$ is compact, there is a subsequence $(x_{1,m})_{m=1}^\infty$ of $(x_j)_{j=1}^\infty$ such that $(T_1(x_{1,m}))_{m=1}^\infty$ is convergent. Also since $T_2$ is compact, there is a subsequence $(x_{2,m})_{m=1}^\infty$ of $(x_{1,m})_{m=1}^\infty$ such that $(T_2(x_{2,m}))_{m=1}^\infty$ is convergent. Continuing this process, we can obtain a subsequence $(x_{n+1,m})_{m=1}^\infty$ of $(x_{n,m})_{m=1}^\infty$ such that $(T_{n+1}(x_{n+1,m}))_{m=1}^\infty$ is convergent. It follows that $(T_n(x_{m,m}))_{m=1}^\infty$ is convergent for any $n \in \mathbb{N}$.

Note that $(x_{m,m})_{m=1}^\infty$ is bounded in $X$. So $c = 1 + \sup_{m\in\mathbb{N}} \|x_{m,m}\|_X < \infty$. Given $\epsilon > 0$, $\lim_{j\to\infty} \|T_j - T\| = 0$ implies that there is an $N \in \mathbb{N}$ such that $\|T_N - T\| < \epsilon/(3c)$. Since $(T_N(x_{m,m}))_{m=1}^\infty$ is convergent in $Y$, there is an $N_1 \in \mathbb{N}$ such that

$$k, l > N_1 \implies \|T_N(x_{k,k}) - T_N(x_{l,l})\|_Y < \epsilon/3.$$

With this, we achieve

$$\|T(x_{k,k}) - T(x_{l,l})\|_Y \le \|T(x_{k,k}) - T_N(x_{k,k})\|_Y + \|T_N(x_{k,k}) - T_N(x_{l,l})\|_Y + \|T_N(x_{l,l}) - T(x_{l,l})\|_Y$$
$$\le \|T - T_N\|\,\|x_{k,k}\|_X + \epsilon/3 + \|T - T_N\|\,\|x_{l,l}\|_X < \epsilon,$$

and so $(T(x_{m,m}))_{m=1}^\infty$ is a Cauchy sequence and hence convergent in $Y$, owing to the fact that $Y$ is a Banach space under the norm $\|\cdot\|_Y$.
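Here is an easy example of compact operators.

Example 4.47. For $1 \le p \le \infty$ let the linear operator $T$ on $\ell_p(\mathbb{N}, \mathbb{F})$ be defined by $T\big((x_k)_{k=1}^\infty\big) = (x_k/k)_{k=1}^\infty$. Then $T$ is a compact operator on $\ell_p(\mathbb{N}, \mathbb{F})$.

To check the result, set

$$T_n\big((x_k)_{k=1}^\infty\big) = (x_1, x_2/2, \ldots, x_n/n, 0, \ldots), \quad n \in \mathbb{N}.$$

It is clear that each $T_n\big(\ell_p(\mathbb{N}, \mathbb{F})\big)$ is finite dimensional and so that $T_n$ is compact. Note that for any $x = (x_k)_{k=1}^\infty \in \ell_p(\mathbb{N}, \mathbb{F})$,

$$\|(T - T_n)\big((x_k)_{k=1}^\infty\big)\|_p = \begin{cases} \Big(\sum_{k=n+1}^\infty \frac{|x_k|^p}{k^p}\Big)^{1/p} \le \frac{\|(x_k)\|_p}{n+1}, & p \in [1, \infty), \\ \sup_{k \ge n+1} \frac{|x_k|}{k} \le \frac{\|(x_k)\|_\infty}{n+1}, & p = \infty. \end{cases}$$

So, $\lim_{n\to\infty} \|T - T_n\| = 0$. By Theorem 4.46 we can now conclude that $T$ is a compact operator.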
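
The tail estimate in Example 4.47 can be checked numerically on a finite truncation. The following is only a sketch under stated assumptions: sequences are cut off after finitely many coordinates, and the helper names `apply_T`, `apply_Tn`, `norm_p` and `tail_ratio` are hypothetical, introduced here for illustration.

```python
# Sketch: finite truncation of the diagonal operator T(x)_k = x_k / k.
# Assumption: only finitely many coordinates are kept, so this illustrates
# the estimate ||(T - T_n) x||_p <= ||x||_p / (n + 1) rather than proving it.
import math
import random


def apply_T(x):
    """Apply T to a finite sequence x = (x_1, ..., x_size)."""
    return [xk / k for k, xk in enumerate(x, start=1)]


def apply_Tn(x, n):
    """Apply the finite-rank truncation T_n: keep x_k / k for k <= n, zero after."""
    return [xk / k if k <= n else 0.0 for k, xk in enumerate(x, start=1)]


def norm_p(x, p):
    """The l_p norm of a finite sequence; p may be math.inf."""
    if p == math.inf:
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def tail_ratio(x, n, p):
    """Return ||(T - T_n) x||_p / ||x||_p for a nonzero finite sequence x."""
    diff = [a - b for a, b in zip(apply_T(x), apply_Tn(x, n))]
    return norm_p(diff, p) / norm_p(x, p)


random.seed(0)
x = [random.uniform(-1.0, 1.0) for _ in range(200)]
for n in (1, 5, 20):
    for p in (1, 2, math.inf):
        # The ratio should never exceed 1 / (n + 1), up to rounding.
        assert tail_ratio(x, n, p) <= 1.0 / (n + 1) + 1e-12
```

The bound is attained by the unit vector supported at position $n+1$, which is why the operator norm of $T - T_n$ behaves like $1/(n+1)$.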
Next, we consider the dual operator (also called the adjoint operator) of a bounded operator and see how the compactness of the original operator is reflected in the behavior of its adjoint.

Theorem 4.48. Let $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ be normed linear spaces over $\mathbb{F}$. For $T \in B(X, Y)$ set $T^*(f) = f \circ T$. Then $T^* \in B(Y^*, X^*)$ and $\|T^*\| = \|T\|$. Furthermore, if $X$ and $Y$ are Banach spaces, then $T$ is compact when and only when $T^*$ is compact.
Proof. If $f \in Y^*$ and $x \in X$, it is easy to see that $T^*(f)(x) = f \circ T(x)$ defines an $\mathbb{F}$-valued linear function on $X$ since $f$ and $T$ are linear. Moreover,

$$|T^*(f)(x)| \le \|f\|\,\|T(x)\|_Y \le \|f\|\,\|T\|\,\|x\|_X;$$

that is, $T^*(f) \in X^*$. Clearly, $T^* : Y^* \to X^*$ is linear and bounded. According to Corollary 3.34 (vi), we evaluate

$$\|T\| = \sup_{\|x\|_X = 1} \|T(x)\|_Y = \sup_{\|x\|_X = 1}\, \sup_{\|f\| = 1} |f \circ T(x)| = \sup_{\|f\| = 1}\, \sup_{\|x\|_X = 1} |T^*(f)(x)| = \sup_{\|f\| = 1} \|T^*(f)\| = \|T^*\|.$$

To check the second part, it is enough to verify that if $T$ is compact then so is $T^*$ due to the symmetry between both operators. Suppose $T$ is compact. Let $(f_j)_{j=1}^\infty$ be a sequence in the closed unit ball of $Y^*$, i.e., $\sup_{j\in\mathbb{N}} \|f_j\| \le 1$. Then

$$|f_j(y_1) - f_j(y_2)| \le \|y_1 - y_2\|_Y \quad \text{for } y_1, y_2 \in Y,$$

and hence $(f_j)_{j=1}^\infty$ is equi-continuous. Since the closure $\overline{T(B)}$ of the image $T(B) \subseteq Y$ of the closed unit ball $B \subseteq X$ is compact, an application of Theorem 2.23 yields that $(f_j)_{j=1}^\infty$ has a subsequence $(f_{j_k})_{k=1}^\infty$ which converges uniformly on $\overline{T(B)}$. Because of

$$\|T^*(f_{j_k}) - T^*(f_{j_l})\| = \sup_{\|x\|_X = 1} |f_{j_k} \circ T(x) - f_{j_l} \circ T(x)|,$$

the completeness of $X^*$ implies that $(T^*(f_{j_k}))_{k=1}^\infty$ converges in $X^*$. Accordingly, $T^*$ is compact.
Example 4.49. Let $[a, b]$ and $[c, d]$ be two finite closed intervals of $\mathbb{R}$. For $k(\cdot, \cdot) \in C([c, d] \times [a, b], \mathbb{F})$ let $T : C([a, b], \mathbb{F}) \to C([c, d], \mathbb{F})$ be given by

$$T(f)(x) = \int_a^b k(x, y) f(y)\,dy, \quad x \in [c, d].$$

Then $T$ is a compact operator and so is its adjoint $T^*$, which maps

$$C([c, d], \mathbb{F})^* = BV([c, d], \mathbb{F})/BV_0([c, d], \mathbb{F})$$

into

$$C([a, b], \mathbb{F})^* = BV([a, b], \mathbb{F})/BV_0([a, b], \mathbb{F})$$

by Theorem 4.48, where

$$T^*(g)(y) = \int_c^d k(x, y)\,dg(x), \quad y \in [a, b],$$

which can be determined by Theorem 4.26.

To verify the compactness of $T$, we notice that $k(\cdot, \cdot) \in C([c, d] \times [a, b], \mathbb{F})$ implies

$$M_1 = \|k(\cdot, \cdot)\|_\infty = \sup_{(x,y) \in [c,d] \times [a,b]} |k(x, y)| < \infty,$$

and thus get that if $B$ is a bounded subset of $C([a, b], \mathbb{F})$, that is, there is a constant $M_2 > 0$ such that $\|f\|_\infty \le M_2$ for all $f \in B$, then

$$\|T(f)\|_\infty = \sup_{x \in [c,d]} \Big| \int_a^b k(x, y) f(y)\,dy \Big| \le M_1 M_2 (b - a).$$

This means that $T(B)$ is bounded in $C([c, d], \mathbb{F})$. Since $k(\cdot, \cdot)$ is indeed uniformly continuous on $[c, d] \times [a, b]$, we see that for any $\epsilon > 0$ there is a $\delta > 0$ such that

$$x_1, x_2 \in [c, d] \text{ and } |x_1 - x_2| < \delta \implies |k(x_1, y) - k(x_2, y)| < \frac{\epsilon}{M_2 (b - a)} \text{ for all } y \in [a, b].$$

From this implication we further see that

$$x_1, x_2 \in [c, d] \text{ and } |x_1 - x_2| < \delta \implies |T(f)(x_1) - T(f)(x_2)| \le \epsilon$$

for all $f \in B$. By Theorem 2.23 we can now conclude that the closure $\overline{T(B)}$ is compact in $C([c, d], \mathbb{F})$ and so that $T$ is a compact operator.

Problems

4.1. Prove that if $C^n[0, 1]$ is the class of all functions $f : [0, 1] \to \mathbb{R}$ for which the derivatives $f^{(0)}, f^{(1)}, \ldots, f^{(n)}$ belong to $C[0, 1]$ and satisfy

$$\|f\| = \sup_{0 \le k \le n}\, \sup_{t \in [0,1]} |f^{(k)}(t)| < \infty,$$

then it is a Banach space under this norm $\|\cdot\|$.

4.2. Prove that if $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ are Banach spaces over $\mathbb{F}$ then so is $X \times Y$ under the norm $\|(x, y)\| = \max\{\|x\|_X, \|y\|_Y\}$ for $(x, y) \in X \times Y$.

4.3. Prove that $(C[0, 1], \|\cdot\|_p)$, $p \in [1, \infty)$, is not complete.

4.4. Let $L^1(\mathbb{Z}, \mathbb{F})$ comprise all $\mathbb{F}$-valued functions $f$ on $\mathbb{Z}$ satisfying $\|f\|_1 = \sum_{n=-\infty}^{\infty} |f(n)| < \infty$. Prove that this space is a Banach algebra under the following convolution:

$$f * g(n) = \sum_{m=-\infty}^{\infty} f(n - m) g(m), \quad f, g \in L^1(\mathbb{Z}, \mathbb{F}).$$

4.5. Show by an example that the uniform boundedness principle does not hold once the completeness is dropped.

4.6. Prove that there is a continuous map $f : \mathbb{R} \to \mathbb{R}$ such that $f(U)$ is not open even if $U$ is open.

4.7.
(i) Prove

$$\lim_{n\to\infty} \int_0^{2\pi} \Big|\sin\big(n + \tfrac{1}{2}\big)x\Big|\, \Big|\sin\frac{x}{2}\Big|^{-1} dx = \infty.$$

(ii) Given a Riemann-integrable function $f : [0, 2\pi] \to \mathbb{R}$, let its Fourier series be

$$s(x) = \sum_{m=-\infty}^{\infty} a_m e^{imx} \quad \text{where} \quad a_m = \frac{1}{2\pi} \int_0^{2\pi} f(y) e^{-imy}\,dy.$$

Extend the definition of $f$ to make it $2\pi$-periodic. Define the $n$-th partial sum of the Fourier series to be $s_n(x) = \sum_{m=-n}^{n} a_m e^{imx}$. Prove

$$s_n(x) = \frac{1}{2\pi} \int_0^{2\pi} f(x + y)\, \frac{\sin(n + \frac{1}{2})y}{\sin\frac{y}{2}}\,dy \quad \text{for all } x \in (0, 2\pi).$$

(iii) Let $X$ be the Banach space of continuous functions $f : [0, 2\pi] \to \mathbb{R}$ with $f(0) = f(2\pi)$, with the sup-norm. Prove that the linear operator $T_n : X \to \mathbb{R}$ defined by

$$T_n(f) = \frac{1}{2\pi} \int_0^{2\pi} f(x)\, \frac{\sin(n + \frac{1}{2})x}{\sin\frac{x}{2}}\,dx$$

is bounded, and

$$\|T_n\| = \frac{1}{2\pi} \int_0^{2\pi} \Big| \frac{\sin(n + \frac{1}{2})x}{\sin\frac{x}{2}} \Big|\,dx.$$

(iv) Prove that there exists a continuous function $f : [0, 2\pi] \to \mathbb{R}$ with $f(0) = f(2\pi)$ such that its Fourier series diverges at $x = 0$.

4.8. Let $X = C^1[0, 1]$ and $Y = C[0, 1]$ be equipped with the sup-norm. Prove the following results:
(i) $X$ is not complete;
(ii) The map $(d/dx) : X \to Y$ is closed but not bounded.

4.9. Suppose $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ are Banach spaces over $\mathbb{F}$ and $T \in B(X, Y)$ satisfies $\|T(x)\|_Y \ge c\|x\|_X$ for all $x \in X$ and for some constant $c > 0$. Verify the following facts:
(i) $N(T) = \{x \in X : T(x) = 0\} = \{0\}$;
(ii) $R(T) = \{y \in Y : y = T(x) \text{ for some } x \in X\}$ is closed;
(iii) $T^{-1} \in B\big(R(T), X\big)$ satisfies $\|T^{-1}\| \le c^{-1}$.

4.10. Prove the following two results:
(i) $(f_j)_{j=1}^\infty$ converges weakly to $f$ in the real space $C[0, 1]$ if and only if $\sup_{j\in\mathbb{N}} \|f_j\|_\infty < \infty$ and $\lim_{j\to\infty} f_j(x) = f(x)$ for any $x \in [0, 1]$.
(ii) $\big(x_j = (s_k^{(j)})_{k=1}^\infty\big)_{j=1}^\infty$ converges weakly to $x = (s_k)_{k=1}^\infty$ in the real sequence space $\ell_p$, where $p \in (1, \infty)$, if and only if $\sup_{j\in\mathbb{N}} \|x_j\|_p < \infty$ and $\lim_{j\to\infty} s_k^{(j)} = s_k$ for any $k \in \mathbb{N}$.

4.11. Let $(X, \|\cdot\|_X)$ be a separable Banach space over $\mathbb{F}$ and $M$ be a bounded subset of $X^*$. Prove that every sequence in $M$ contains a weak* convergent subsequence.

4.12. Let $(X, \|\cdot\|_X)$ be a Banach space over $\mathbb{F}$. Show that $X$ is reflexive if and only if $X^*$ is reflexive.

4.13. Let $c_0 = c_0(\mathbb{N}, \mathbb{R})$, the linear subspace of $\ell_\infty = \ell_\infty(\mathbb{N}, \mathbb{R})$ consisting of all real-valued sequences that converge to 0. Prove the following two facts:
(i) $c_0$ is a separable, but non-reflexive Banach space over $\mathbb{R}$;
(ii) $L \in B(c_0, \mathbb{R})$ if and only if there is an element $y = (y_k)_{k=1}^\infty \in \ell_1$ such that $L(x) = \sum_{k=1}^\infty x_k y_k$ for all $x = (x_k)_{k=1}^\infty \in c_0$.

4.14. Let $(X, \|\cdot\|_X)$ be a normed linear space over $\mathbb{F}$ and $K$ be a convex subset of $X$. Prove that the weak closure of $K$ equals its norm closure.

4.15. Let $(X, \|\cdot\|_X)$ be a normed linear space over $\mathbb{F}$ and $A \subseteq X^*$ be weak* compact. Prove:
(i) $A$ is weak* closed;
(ii) Any weak* closed subset of $A$ is weak* compact.

4.16. Let $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ be Banach spaces over $\mathbb{F}$. Prove that if $T : X \to Y$ is a linear mapping such that $f \circ T \in X^*$ for all $f \in Y^*$ then $T \in B(X, Y)$.

4.17. Prove that a topological space $X$ is compact if and only if any collection of closed subsets of $X$ obeying the FIP has a nonempty intersection.

4.18. Let $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ be normed linear and Banach spaces over $\mathbb{F}$ respectively. Prove that the set $B_c(X, Y)$ of all compact operators from $X$ to $Y$ is a Banach space under the usual operator norm for $B(X, Y)$.

4.19. Given a finite interval $[a, b] \subset \mathbb{R}$, let $k(\cdot, \cdot)$ be an $\mathbb{F}$-valued continuous function on $\{(x, y) \in \mathbb{R}^2 : x \in [a, b],\ y \in [a, x]\}$ and let $T : C([a, b], \mathbb{F}) \to C([a, b], \mathbb{F})$ be given by

$$T(f)(x) = \int_a^x k(x, y) f(y)\,dy.$$

Show that $T$ is a compact operator.

4.20. Suppose that $X$ is a Banach space over $\mathbb{F}$ and $T \in B(X)$ is a compact operator. Prove the following assertions:
(i) $TS$ and $ST$ are compact for any $S \in B(X)$;
(ii) $T(X)$ is separable.

4.21. Let $(X, \|\cdot\|_X)$ be a Banach space over $\mathbb{F}$. Prove the following three results:
(i) If $T : X \to X$ is the zero operator, that is, $T(x) = 0$ for all $x \in X$, then $T$ is compact;
(ii) If $T \in B(X)$ is compact and $X$ is infinite dimensional, then $T$ is not invertible;
(iii) If $T \in B(X)$ ensures that $T(X)$ is closed and infinite dimensional, then $T$ is not compact.
5
Hilbert Spaces and Their Operators

Hilbert spaces, named after David Hilbert, are special Banach spaces which
share most of the rich geometric structure of the finite-dimensional Euclidean
spaces. The geometrical wealth of a Hilbert space is produced by a bilinear
form on the space which allows the introduction of a notion of perpendicu-
larity. This chapter is devoted to an introductory study of Hilbert spaces and
their bounded and compact linear operators.

5.1 Definition, Examples and Basic Properties


Once again, F still denotes the field of real or complex numbers.
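Definition 5.1. Let $X$ be a linear space over $\mathbb{F}$. An inner product on $X$ is a map $(x, y) \mapsto \langle x, y\rangle$ from $X \times X$ to $\mathbb{F}$ such that
(a) $\langle \alpha x + \beta y, z\rangle = \alpha\langle x, z\rangle + \beta\langle y, z\rangle$ for all $x, y, z \in X$ and for all $\alpha, \beta \in \mathbb{F}$;
(b) $\langle y, x\rangle = \overline{\langle x, y\rangle}$ for all $x, y \in X$;
(c) $\langle x, x\rangle \ge 0$ for all $x \in X$ and $\langle x, x\rangle = 0$ if and only if $x = 0$.
If $X$ is equipped with $\langle\cdot, \cdot\rangle$, then it is called an inner product space or a pre-Hilbert space over $\mathbb{F}$.

Note that the bar signifies complex conjugation; of course, if the scalar field is $\mathbb{R}$ then the axiom (b) is simply $\langle y, x\rangle = \langle x, y\rangle$. So we will refer to real or complex Hilbert spaces depending on whether $\mathbb{F}$ is $\mathbb{R}$ or $\mathbb{C}$.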
Theorem 5.2. Let $X$ be a linear space over $\mathbb{F}$ under the inner product $\langle\cdot, \cdot\rangle$. If $\|x\| = \sqrt{\langle x, x\rangle}$ for all $x \in X$, then
(i) Cauchy-Schwarz's inequality $|\langle x, y\rangle| \le \|x\|\,\|y\|$ holds for all $x, y \in X$, with equality if and only if $x$ and $y$ are linearly dependent;
(ii) $\|\cdot\|$ defines a norm on $X$;
(iii) The parallelogram law

$$\|x + y\|^2 + \|x - y\|^2 = 2(\|x\|^2 + \|y\|^2)$$

is valid for all $x, y \in X$.

Proof. (i) If $\langle x, y\rangle = 0$, then there is nothing to argue. Suppose now $\langle x, y\rangle \neq 0$. Then $x \neq 0$ and $y \neq 0$. Let $\alpha = \langle x, y\rangle$ and $z = \alpha y$. Then for $t \in \mathbb{R}$ we have

$$0 \le \langle x - tz, x - tz\rangle = \|x\|^2 - 2t|\langle x, y\rangle|^2 + t^2 |\langle x, y\rangle|^2 \|y\|^2.$$

This ensures that the last quadratic function of $t$ takes its absolute minimum at $t = \|y\|^{-2}$. Substituting this value for $t$, we get

$$0 \le \|x - tz\|^2 = \|x\|^2 - \|y\|^{-2} |\langle x, y\rangle|^2$$

with equality if and only if $x - tz = x - \alpha t y = 0$, from which the desired result is immediate.
(ii) It is obvious that $\|x\| = 0 \Leftrightarrow x = 0$ and $\|\lambda x\| = |\lambda|\,\|x\|$. As for the triangle inequality, (i) is used to imply

$$\|x + y\|^2 = \|x\|^2 + 2\Re\langle x, y\rangle + \|y\|^2 \le (\|x\| + \|y\|)^2.$$

(iii) This follows directly from expanding $\|x + y\|^2$ and $\|x - y\|^2$.
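
As a numerical illustration of Theorem 5.2 in the finite-dimensional space $\mathbb{C}^n$, the sketch below evaluates the Cauchy-Schwarz gap and the parallelogram defect for concrete vectors. It assumes the standard inner product on $\mathbb{C}^n$ (linear in the first argument, as in Definition 5.1); the function names are hypothetical and chosen for this example only.

```python
# Sketch: Cauchy-Schwarz and the parallelogram law in C^n.
# Assumption: <x, y> = sum_j x_j * conj(y_j), the standard inner product.
import math


def inner(x, y):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


def norm(x):
    """Norm induced by the inner product."""
    return math.sqrt(inner(x, x).real)


def cauchy_schwarz_gap(x, y):
    """||x|| ||y|| - |<x, y>|, which is nonnegative by Theorem 5.2 (i)."""
    return norm(x) * norm(y) - abs(inner(x, y))


def parallelogram_defect(x, y):
    """||x+y||^2 + ||x-y||^2 - 2(||x||^2 + ||y||^2), zero by Theorem 5.2 (iii)."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    return norm(s) ** 2 + norm(d) ** 2 - 2 * (norm(x) ** 2 + norm(y) ** 2)


x = [1 + 2j, 3 - 1j]
y = [2j, 1 + 1j]
assert cauchy_schwarz_gap(x, y) >= -1e-12
assert abs(parallelogram_defect(x, y)) < 1e-9
```

For linearly dependent vectors, for example $y = (2 - i)x$, the gap is zero up to rounding, matching the equality case in (i).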
The above definition and theorem actually induce the notion of a Hilbert space.

Definition 5.3. A Hilbert space $X$ over $\mathbb{F}$ is a pre-Hilbert space which is complete with respect to the metric

$$d(x, y) = \|x - y\| = \sqrt{\langle x - y, x - y\rangle} \quad \text{for all } x, y \in X.$$

Furthermore, we will refer to real or complex Hilbert space according to whether $\mathbb{F}$ is $\mathbb{R}$ or $\mathbb{C}$.
Example 5.4.
(i) For each $n \in \mathbb{N}$, $\mathbb{F}^n$ is a Hilbert space over $\mathbb{F}$ under the inner product $\langle x, y\rangle = \sum_{j=1}^n x_j \overline{y_j}$ for $x = (x_1, \ldots, x_n)$, $y = (y_1, \ldots, y_n) \in \mathbb{F}^n$.
(ii) Regardless of the cardinality of an arbitrarily given set $X$, we can define the sequence space

$$\ell_2(X, \mathbb{F}) = \Big\{ f : X \to \mathbb{F} : \sum_{x \in X} |f(x)|^2 < \infty \Big\},$$

where

$$\sum_{x \in X} |f(x)|^2 = \sup\Big\{ \sum_{x \in Y} |f(x)|^2 : \text{all nonempty finite subsets } Y \text{ of } X \Big\}.$$

This space becomes a Hilbert space over $\mathbb{F}$ with the inner product

$$\langle f_1, f_2\rangle = \sum_{x \in X} f_1(x) \overline{f_2(x)} \quad \text{for all } f_1, f_2 \in \ell_2(X, \mathbb{F}).$$

Here the right-hand sum is said to converge provided there is a number $s$ in $\mathbb{F}$ such that to each $\epsilon > 0$ there corresponds a finite subset $Y$ of $X$ with the property that if $Z$ is any finite subset of $X$ with $Y \subseteq Z$ then $\big|\sum_{x \in Z} f_1(x) \overline{f_2(x)} - s\big| < \epsilon$. Obviously, $\ell_2(\mathbb{N}, \mathbb{F})$ is just the square-summable $\mathbb{F}$-valued sequence space.
(iii) $LRS^2_{id}([a, b], \mathbb{F})$, often denoted $L^2([a, b], \mathbb{F})$, is a Hilbert space over $\mathbb{F}$ under the following inner product

$$\langle f_1, f_2\rangle = \int_{[a,b]} f_1 \overline{f_2}\,dm_{id} \quad \text{for all } f_1, f_2 \in LRS^2_{id}([a, b], \mathbb{F}).$$

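The following result tells us when a Banach space becomes a Hilbert space.

Theorem 5.5. Let $(X, \|\cdot\|)$ be a normed linear space over $\mathbb{F}$ and satisfy the parallelogram law above.
(i) If $\mathbb{F} = \mathbb{C}$, then

$$\langle x, y\rangle = 4^{-1} \sum_{n=1}^{4} i^n \|x + i^n y\|^2$$

defines an inner product on $X$.
(ii) If $\mathbb{F} = \mathbb{R}$, then

$$\langle x, y\rangle = 4^{-1}\big(\|x + y\|^2 - \|x - y\|^2\big)$$

defines an inner product on $X$.

Proof. It is enough to check (i). As a matter of fact, this $\langle\cdot, \cdot\rangle$ enjoys

$$\langle x, x\rangle = \|x\|^2 + \frac{i|1 + i|^2}{4}\|x\|^2 - \frac{i|1 - i|^2}{4}\|x\|^2 = \|x\|^2 \quad \text{for all } x \in X.$$

To verify that $\langle\cdot, \cdot\rangle$ is actually an inner product, it suffices to prove that Definition 5.1 (a) holds for $\alpha = \beta = 1$ and $\langle\lambda\,\cdot, \cdot\rangle = \lambda\langle\cdot, \cdot\rangle$ for any $\lambda \in \mathbb{C}$.
For the former, we use the parallelogram law to achieve

$$\|u + v + w\|^2 + \|u + v - w\|^2 = 2\|u + v\|^2 + 2\|w\|^2$$

and

$$\|u - v + w\|^2 + \|u - v - w\|^2 = 2\|u - v\|^2 + 2\|w\|^2.$$

Hence

$$2\big(\|u + v\|^2 - \|u - v\|^2\big) = \big(\|u + v + w\|^2 - \|u - v + w\|^2\big) + \big(\|u + v - w\|^2 - \|u - v - w\|^2\big).$$

This infers

$$\Re\langle u + w, v\rangle + \Re\langle u - w, v\rangle = 2\Re\langle u, v\rangle.$$

In a similar fashion, we can establish the relation with the real part $\Re$ replaced by the imaginary part $\Im$:

$$\Im\langle u + w, v\rangle + \Im\langle u - w, v\rangle = 2\Im\langle u, v\rangle.$$

Accordingly,

$$\langle u + w, v\rangle + \langle u - w, v\rangle = 2\langle u, v\rangle.$$

When $u = w$, we have $\langle 2u, v\rangle = 2\langle u, v\rangle$. Taking $u + w = x$, $u - w = y$, $v = z$, we get

$$\langle x, z\rangle + \langle y, z\rangle = 2\Big\langle \frac{x + y}{2}, z\Big\rangle = \langle x + y, z\rangle.$$

To reach the latter, we observe that for any $m \in \mathbb{N}$,

$$\langle mx, y\rangle = \langle (m - 1)x + x, y\rangle = \langle (m - 1)x, y\rangle + \langle x, y\rangle = \cdots = m\langle x, y\rangle.$$

Thus for any $k \in \mathbb{N}$,

$$k\Big\langle \frac{x}{k}, y\Big\rangle = \langle x, y\rangle \quad \text{and} \quad \Big\langle \frac{x}{k}, y\Big\rangle = k^{-1}\langle x, y\rangle.$$

Consequently, for any $r = m/k$,

$$r\langle x, y\rangle = m\Big\langle \frac{x}{k}, y\Big\rangle = \Big\langle \frac{mx}{k}, y\Big\rangle = \langle rx, y\rangle.$$

Since $\lim_{\|x\|\to 0} \|x + i^n y\| = \|y\|$, $\langle x, y\rangle$ is a continuous functional in $x$, and consequently $\lambda\langle x, y\rangle = \langle\lambda x, y\rangle$ for any $\lambda > 0$ thanks to the density of $\mathbb{Q}$ in $\mathbb{R}$. If $\lambda < 0$ then

$$\lambda\langle x, y\rangle - \langle\lambda x, y\rangle = \lambda\langle x, y\rangle - |\lambda|\langle -x, y\rangle = \lambda\langle 0, y\rangle = 0.$$

Also, it is easy to verify

$$i\langle x, y\rangle = 4^{-1} \sum_{n=1}^{4} i^n \|x + i^{n-1} y\|^2 = \langle ix, y\rangle.$$

Therefore, for any $\lambda = \mu + i\nu \in \mathbb{C}$ we have

$$\lambda\langle x, y\rangle = \mu\langle x, y\rangle + i\langle\nu x, y\rangle = \langle(\mu + i\nu)x, y\rangle,$$

as desired.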
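
The complex polarization formula of Theorem 5.5 (i) can be sanity-checked in $\mathbb{C}^n$, where the norm already comes from an inner product. The sketch below is an illustration under that assumption; `inner`, `norm_sq` and `polarization` are hypothetical helper names.

```python
# Sketch: recover <x, y> on C^n from norms alone via
# <x, y> = (1/4) * sum_{n=1}^{4} i^n * ||x + i^n y||^2.
# Assumption: the norm is the Euclidean norm of C^n.


def inner(x, y):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


def norm_sq(x):
    """Squared Euclidean norm of a vector in C^n."""
    return sum(abs(a) ** 2 for a in x)


def polarization(x, y):
    """Right-hand side of the polarization formula in Theorem 5.5 (i)."""
    total = 0j
    for n in range(1, 5):
        w = 1j ** n  # the unit i^n
        total += w * norm_sq([a + w * b for a, b in zip(x, y)])
    return total / 4


x = [1 + 2j, -3j, 0.5]
y = [2 - 1j, 4, 1j]
assert abs(polarization(x, y) - inner(x, y)) < 1e-9
```

The check only confirms consistency in a space that is already a Hilbert space; the content of the theorem is that the same formula defines an inner product whenever the parallelogram law holds.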

5.2 Orthogonality, Orthogonal Complement and Duality

First of all, let us consider orthogonality.

Definition 5.6. Let $X$ be a Hilbert space over $\mathbb{F}$.
(i) The angle between two vectors $x, y \in X$ is defined by

$$\theta_{x,y} = \begin{cases} 0, & x \text{ or } y = 0, \\ \arccos \dfrac{\Re\langle x, y\rangle}{\|x\|\,\|y\|}, & \text{otherwise}. \end{cases}$$

(ii) Two vectors $x, y \in X$ are called orthogonal, denoted $x \perp y$, provided $\langle x, y\rangle = 0$.
(iii) For any subset $S$ of $X$, $S^\perp = \{x \in X : \langle x, y\rangle = 0 \text{ for all } y \in S\}$ is called the orthogonal complement of $S$.

The following result, which has a root in Corollary 4.28 and will be used
later, has natural geometrical and finite-dimensional antecedents.

Theorem 5.7. Let $X$ be a Hilbert space over $\mathbb{F}$. If $M$ is a closed convex subset of $X$, then there exists exactly one $x \in M$ such that $\|x\| = \inf_{y \in M} \|y\|$.

Proof. Let $\delta = \inf_{y \in M} \|y\|$. Then by definition there is a sequence $(x_j)_{j=1}^\infty$ in $M$ such that $\lim_{j\to\infty} \|x_j\| = \delta$. Since $M$ is convex, we conclude $2^{-1}(x_j + x_k) \in M$ and consequently,

$$\|x_k + x_j\|^2 = 4\|2^{-1}(x_k + x_j)\|^2 \ge 4\delta^2.$$

Using the parallelogram law, we obtain

$$4\delta^2 + \|x_k - x_j\|^2 \le \|x_k + x_j\|^2 + \|x_k - x_j\|^2 = 2(\|x_k\|^2 + \|x_j\|^2).$$

Letting $j, k \to \infty$ in the last inequalities, we find that $\|x_j - x_k\| \to 0$ as $j, k \to \infty$; that is, $(x_j)_{j=1}^\infty$ is a Cauchy sequence and so there is a point $x \in M$ such that $x_j \to x$ in $X$ since $X$ is complete and $M$ is closed. The continuity of the norm leads to $\|x\| = \delta$. Finally, $x$ is unique since if $y \in M$ satisfies $\|y\| = \delta$ then the foregoing argument can be used for the sequence $x, y, x, y, x, y, \ldots$ to show that this sequence is a Cauchy sequence, and hence $x = y$.

Next, we introduce the concept of a direct sum which has implicitly ap-
peared in Corollary 3.34.

Definition 5.8. Let X and Y be two linear subspaces of a given linear space
Z over F. Then Z is said to be the direct sum of X and Y , denoted Z = X ⊕Y ,
provided every z ∈ Z can be expressed uniquely in the form z = x + y where
x ∈ X and y ∈ Y and X ∩ Y = {0}.

We are about to prove a theorem on decomposing a Hilbert space into a direct sum of mutually orthogonal closed subspaces. Before doing so, we need the following result.

Lemma 5.9. Let $M$ be a proper closed linear subspace of a Hilbert space $X$ over $\mathbb{F}$. Then there exists a nonzero $z \in X$ such that $\langle z, y\rangle = 0$ for all $y \in M$.

Proof. Given any $x \in X$, the set $x + M$ is a closed convex set. Thus by Theorem 5.7 there exists a unique $z \in x + M$ such that $\|z\| = \inf_{y \in M} \|x + y\|$. Since $M \neq X$, we may assume $x \notin M$, which yields $z \neq 0$, and then prove that $\langle z, y\rangle = 0$ for all $y \in M$. Given any $y \in M$ and any $\alpha \in \mathbb{F}$, we have $z + \alpha y \in x + M$. By our choice of $z$ we then have $\|z + \alpha y\|^2 \ge \|z\|^2$ and consequently,

$$|\alpha|^2 \|y\|^2 + 2\Re\big(\bar{\alpha}\langle z, y\rangle\big) \ge 0 \quad \text{for all } \alpha \in \mathbb{F}.$$

Choose $\theta \in [0, 2\pi)$ with

$$e^{-i\theta}\langle z, y\rangle = |\langle z, y\rangle|$$

and let $\alpha = -te^{i\theta}$, $t > 0$. The inequality obtained by substituting for this $\alpha$ is then $t\|y\|^2 \ge 2|\langle z, y\rangle|$. Letting $t \to 0$ we get $\langle z, y\rangle = 0$. Because $y \in M$ was arbitrary, we must have $\langle z, y\rangle = 0$ for all $y \in M$.
Theorem 5.10. If $M$ is a proper closed linear subspace of a Hilbert space $X$ over $\mathbb{F}$, then $X = M \oplus M^\perp$ and $M^\perp$ is closed.

Proof. Given $x \in X$, apply the procedure in the proof of Lemma 5.9 to obtain $z \in X$ such that $z \in x + M$ and $\langle z, y\rangle = 0$ for all $y \in M$. Then $z \in M^\perp$ and $z = x - y$ for some $y \in M$. Hence $x = y + z$, $y \in M$, and $z \in M^\perp$. Noting that $M \cap M^\perp = \{0\}$ since

$$x \in M \cap M^\perp \implies \langle x, x\rangle = 0 \implies x = 0,$$

we thus have $X = M \oplus M^\perp$. To see that $M^\perp$ is closed, suppose $x_j \in M^\perp$ and $x_j \to x$ in $X$. Then for any $y \in M$ we have, by the Cauchy-Schwarz inequality and $\langle y, x_j\rangle = 0$,

$$|\langle y, x\rangle| \le |\langle y, x_j - x\rangle| + |\langle y, x_j\rangle| \le \|y\|\,\|x_j - x\| \to 0 \quad \text{as } j \to \infty,$$

whence deriving $x \in M^\perp$.
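
In $\mathbb{R}^n$ the decomposition $x = y + z$ with $y \in M$ and $z \in M^\perp$ can be computed directly. The following sketch assumes $M$ is given as the span of finitely many vectors and uses an orthonormalization step to project onto $M$; the names `dot`, `orthonormalize` and `decompose` are hypothetical and not taken from the notes.

```python
# Sketch: orthogonal decomposition x = y + z in R^n with y in M and z in M-perp.
# Assumption: M = span(spanning_vectors), a finite-dimensional (hence closed) subspace.
import math


def dot(x, y):
    """Euclidean inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))


def orthonormalize(vectors, tol=1e-12):
    """Return an orthonormal basis of span(vectors), skipping dependent vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            c = dot(w, e)
            w = [wi - c * ei for wi, ei in zip(w, e)]
        length = math.sqrt(dot(w, w))
        if length > tol:
            basis.append([wi / length for wi in w])
    return basis


def decompose(x, spanning_vectors):
    """Return (y, z) with x = y + z, y in M and z orthogonal to M."""
    y = [0.0] * len(x)
    for e in orthonormalize(spanning_vectors):
        c = dot(x, e)
        y = [yi + c * ei for yi, ei in zip(y, e)]
    z = [xi - yi for xi, yi in zip(x, y)]
    return y, z


M = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
y, z = decompose([1.0, 2.0, 3.0], M)
assert all(abs(dot(z, m)) < 1e-9 for m in M)
```

Here $z$ plays the role of the minimal-norm element of $x + M$ from the proof of Lemma 5.9.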
Finally, we consider the dual space of a Hilbert space. Below is the classical Riesz representation theorem.

Theorem 5.11. Given a Hilbert space $X$ over $\mathbb{F}$ and $y \in X$, define $L_y : X \to \mathbb{F}$ by $L_y(x) = \langle x, y\rangle$. Then $L_y \in X^*$ and $\|L_y\| = \|y\|$. Conversely, for every $f \in X^*$ there exists a unique $y \in X$ such that $f = L_y$.

Proof. Clearly, $L_y \in X^*$ with $\|L_y\| \le \|y\|$. Note that

$$\|y\|^2 = L_y(y) \le \|L_y\|\,\|y\|.$$

So $\|L_y\| = \|y\|$. Conversely, let $f \in X^*$ be given. If $f = 0$, then $f = L_y$ where $y = 0$. Suppose now $f \neq 0$. Then we may assume without loss of generality that $\|f\| = 1$ since

$$f/\|f\| = L_y \implies f = L_{\|f\|y}.$$

For such an $f$, let

$$M = \{x \in X : f(x) = 0\}.$$

Then $M$ is a closed linear subspace of $X$. Of course, $M \neq X$, since otherwise $f = 0$, a contradiction. By Lemma 5.9 we can obtain a nonzero $z \in M^\perp$ such that $f(z) \neq 0$. Now for any $x \in X$, we have

$$x - \frac{f(x)}{f(z)} z \in M \quad \text{and} \quad \Big\langle x - \frac{f(x)}{f(z)} z, z\Big\rangle = 0.$$

Accordingly,

$$\langle x, z\rangle = f(x)\Big\langle \frac{z}{f(z)}, z\Big\rangle.$$

Taking $y = \overline{f(z)}\,\|z\|^{-2} z$, we further get $f(x) = \langle x, y\rangle$, and then $f = L_y$. To see the uniqueness, assume there is another point $w \in X$ such that $f = L_w$. Then $\langle x, w - y\rangle = 0$ for all $x \in X$, and hence

$$\|w - y\|^2 = \langle w - y, w - y\rangle = 0 \quad \text{and} \quad w = y.$$

We are done.
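
The construction in the proof of Theorem 5.11 can be traced in $\mathbb{C}^n$. The sketch below assumes a functional of the form $f(x) = \sum_j a_j x_j$, for which the kernel's orthogonal complement is spanned by the conjugate of $a$; the helper names are hypothetical. It computes the representing vector from any nonzero $z \perp \ker f$ using $y = \overline{f(z)}\,\|z\|^{-2} z$.

```python
# Sketch: Riesz representation in C^n for f(x) = sum_j a_j * x_j.
# Assumption: z is a nonzero vector orthogonal to ker f (here a multiple of conj(a)).


def inner(x, y):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


def make_functional(a):
    """Return the bounded linear functional x -> sum_j a_j * x_j."""
    def f(x):
        return sum(aj * xj for aj, xj in zip(a, x))
    return f


def riesz_vector(f, z):
    """Return y = conj(f(z)) * ||z||^{-2} * z, so that f(x) = <x, y>."""
    scale = complex(f(z)).conjugate() / inner(z, z).real
    return [scale * zj for zj in z]


a = [1 + 1j, 2, -3j]
f = make_functional(a)
# Any nonzero multiple of conj(a) is orthogonal to ker f; the factor is arbitrary.
z = [(2 - 1j) * complex(aj).conjugate() for aj in a]
y = riesz_vector(f, z)
x = [0.5 - 2j, 1j, 3]
assert abs(f(x) - inner(x, y)) < 1e-12
```

The result does not depend on the scaling of $z$, which reflects the uniqueness statement of the theorem.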

Here is an easy consequence of Theorem 5.11 which says that any Hilbert space is self-reflexive.

Corollary 5.12. Let $X$ be a Hilbert space over $\mathbb{F}$. Then the mapping $L : X \to X^*$ given by $L(x)(\cdot) = \langle\cdot, x\rangle$ is an isometric embedding from $X$ onto $X^*$. Moreover, $X^*$ is also a Hilbert space over $\mathbb{F}$.

Proof. The first part has just been proved in Theorem 5.11. Furthermore it is easily seen that $L(x_1 + x_2) = L(x_1) + L(x_2)$, so the isometry is additive. Since

$$L(\alpha x)(y) = \langle y, \alpha x\rangle = \bar{\alpha}\langle y, x\rangle = \bar{\alpha} L(x)(y) \quad \text{for } x, y \in X \text{ and } \alpha \in \mathbb{F},$$

we conclude $L(\alpha x) = \bar{\alpha} L(x)$. If $X^*$ is now equipped with the inner product $\langle L(x), L(y)\rangle = \langle y, x\rangle$, then $X^*$ is a Hilbert space over $\mathbb{F}$.

5.3 Orthonormal Sets and Bases


To better understand the structure of a Hilbert space, in this section we
consider the concepts of orthonormal sets and bases.

Definition 5.13. Let $X$ be a Hilbert space over $\mathbb{F}$. A set $S \subseteq X$ is called orthogonal if any two different elements in $S$ are orthogonal. An orthonormal set is an orthogonal set consisting entirely of elements of norm 1.

A constructive method of orthonormalizing a set of vectors in a Hilbert space is the Gram-Schmidt process, named for Jørgen Pedersen Gram and Erhard Schmidt, as follows.

Theorem 5.14. Let $\{x_j\}_{j=1}^\infty$ be a countable linearly independent set of a Hilbert space $X$ over $\mathbb{F}$. Then a countable orthonormal set $\{e_j\}$ can be constructed so that

$$\mathrm{span}\{e_j\}_{j=1}^n = \mathrm{span}\{x_j\}_{j=1}^n \quad \text{for all } n \in \mathbb{N}.$$

Proof. Define $e_1 = x_1/\|x_1\|$ and proceed inductively. Suppose $e_1, e_2, \ldots, e_n$ have been successively defined for some $n \in \mathbb{N}$. Then $e_{n+1}$ is determined by $y_{n+1}/\|y_{n+1}\|$, where

$$y_{n+1} = x_{n+1} - \sum_{j=1}^{n} \langle x_{n+1}, e_j\rangle e_j.$$

Thus $\|y_{n+1}\| \neq 0$ since otherwise we would have

$$x_{n+1} \in \mathrm{span}(\{e_j\}_{j=1}^n) = \mathrm{span}(\{x_j\}_{j=1}^n),$$

contradicting the linear independence of $\{x_j\}_{j=1}^\infty$. It is clear that

$$\mathrm{span}(\{e_j\}_{j=1}^{n+1}) = \mathrm{span}(\{x_j\}_{j=1}^{n+1})$$

because this is true for $\{e_j\}_{j=1}^n$ and $\{x_j\}_{j=1}^n$. Finally, for $j \in \{1, 2, \ldots, n\}$, we have

$$\langle e_{n+1}, e_j\rangle = \frac{\langle x_{n+1}, e_j\rangle - \langle x_{n+1}, e_j\rangle\langle e_j, e_j\rangle}{\|y_{n+1}\|} = 0,$$

and consequently, the set $\{e_j\}_{j=1}^\infty$ obtained by this inductive construction is an orthonormal set with the desired property.
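
The inductive construction in the proof of Theorem 5.14 translates almost line by line into code for finitely many vectors in $\mathbb{C}^n$. This is a minimal sketch, assuming the standard inner product and exact linear independence of the input; the tolerance `tol` and the function names are choices made for this illustration only.

```python
# Sketch: classical Gram-Schmidt for finitely many vectors in C^n,
# following y_{n+1} = x_{n+1} - sum_j <x_{n+1}, e_j> e_j and e_{n+1} = y_{n+1}/||y_{n+1}||.
import math


def inner(x, y):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


def gram_schmidt(xs, tol=1e-12):
    """Return orthonormal vectors e_1, ..., e_n with the same successive spans as xs."""
    es = []
    for x in xs:
        y = [complex(t) for t in x]
        for e in es:
            c = inner(x, e)  # the coefficient <x_{n+1}, e_j>
            y = [yi - c * ei for yi, ei in zip(y, e)]
        length = math.sqrt(inner(y, y).real)
        if length <= tol:
            raise ValueError("input vectors are (numerically) linearly dependent")
        es.append([yi / length for yi in y])
    return es


es = gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]])
for i, ei in enumerate(es):
    for j, ej in enumerate(es):
        expected = 1.0 if i == j else 0.0
        assert abs(inner(ei, ej) - expected) < 1e-9
```

In floating-point arithmetic this classical form can lose orthogonality for nearly dependent inputs; a modified Gram-Schmidt variant is often preferred numerically, but the version above mirrors the proof.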

Example 5.15. Given $-\infty \le a < b \le \infty$ and a function $\omega : (a, b) \to (0, \infty)$ with the property that $\int_{(a,b)} t^n \omega(t)\,dm_{id}(t)$ converges for all $n \in \mathbb{N}$, define the Hilbert space $L^{2,\omega}((a, b), \mathbb{F})$ to be the linear space of all $\mathbb{F}$-valued functions $f$ on $(a, b)$ with $\int_{(a,b)} |f(t)|^2 \omega(t)\,dm_{id}(t) < \infty$; of course, that two functions in the space coincide means $f_1 = f_2$ $\omega\,dm_{id}$-a.e. on $(a, b)$. It is not hard to prove that the linearly independent set $\{1, t, t^2, \ldots\}$ has a linear span which is dense in $L^{2,\omega}((a, b), \mathbb{F})$. This set, together with the Gram-Schmidt process through the inner product

$$\langle f_1, f_2\rangle_\omega = \int_{(a,b)} f_1(t) \overline{f_2(t)}\, \omega(t)\,dm_{id}(t),$$

derives various families of classical orthonormal functions below:

(i) If $\omega = 1$ and $a = -1$, $b = 1$, then the process generates the Legendre polynomials, named after Adrien-Marie Legendre.
(ii) If $\omega(t) = (1 - t^2)^{-1/2}$ and $a = -1$, $b = 1$, then the process produces the Chebyshev polynomials, named after Pafnuty Chebyshev.
(iii) If $\omega(t) = t^{q-1}(1 - t)^{p-q}$ with $q > 0$, $p - q > -1$ and $a = 0$, $b = 1$, then the process generates the Jacobi polynomials, named after Karl Gustav Jacob Jacobi.
(iv) If $\omega(t) = e^{-t^2}$ and $a = -\infty$, $b = \infty$, then the process produces the Hermite polynomials, named after Charles Hermite.
(v) If $\omega(t) = e^{-t}$ and $a = 0$, $b = \infty$, then the process gives the Laguerre polynomials, named after Edmond Laguerre.
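
Case (i) of Example 5.15 can be reproduced with exact rational arithmetic. The sketch below orthogonalizes $1, t, t^2, \ldots$ on $[-1, 1]$ with weight $\omega = 1$ but, as a deliberate simplification, omits the final normalization (which would introduce square roots), so it returns monic polynomials proportional to the Legendre polynomials. Polynomials are stored as coefficient lists, lowest degree first; all names are hypothetical.

```python
# Sketch: Gram-Schmidt on 1, t, t^2, ... in L^2((-1, 1)) with weight 1,
# without normalization, using exact fractions.
from fractions import Fraction


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def inner(p, q):
    """Exact integral of p(t) q(t) over [-1, 1]; odd powers integrate to zero."""
    prod = poly_mul(p, q)
    return sum(c * Fraction(2, k + 1) for k, c in enumerate(prod) if k % 2 == 0)


def monic_legendre(count):
    """Return the first `count` monic orthogonal polynomials on [-1, 1]."""
    result = []
    for n in range(count):
        p = [Fraction(0)] * n + [Fraction(1)]  # the monomial t^n
        y = list(p)
        for q in result:
            c = inner(p, q) / inner(q, q)  # projection coefficient onto q
            padded = q + [Fraction(0)] * (len(y) - len(q))
            y = [yi - c * qi for yi, qi in zip(y, padded)]
        result.append(y)
    return result


polys = monic_legendre(4)
# t^2 - 1/3 and t^3 - (3/5) t are multiples of the Legendre polynomials P_2 and P_3.
assert polys[2] == [Fraction(-1, 3), 0, 1]
assert polys[3] == [0, Fraction(-3, 5), 0, 1]
```

Dividing each polynomial by its norm $\sqrt{\langle q, q\rangle}$ would give the orthonormal family described in the example.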

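Lemma 5.16. For $n \in \mathbb{N}$, let $\{e_j\}_{j=1}^n$ be an orthonormal set of a Hilbert space $X$ over $\mathbb{F}$. Then
(i) $\sum_{j=1}^n |\langle x, e_j\rangle|^2 \le \|x\|^2$ for all $x \in X$;
(ii) $\big\langle x - \sum_{j=1}^n \langle x, e_j\rangle e_j, e_k\big\rangle = 0$ for all $x \in X$ and $k = 1, 2, \ldots, n$.

Proof. (i) This follows from

$$0 \le \Big\| x - \sum_{j=1}^n \langle x, e_j\rangle e_j \Big\|^2 = \|x\|^2 - \sum_{j=1}^n |\langle x, e_j\rangle|^2,$$

where the last equality is obtained by expanding in the usual fashion and using the orthonormality of $\{e_j\}_{j=1}^n$.
(ii) This follows from

$$\Big\langle x - \sum_{j=1}^n \langle x, e_j\rangle e_j, e_k\Big\rangle = \langle x, e_k\rangle - \langle x, e_k\rangle\langle e_k, e_k\rangle = 0.$$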

Corollary 5.17. Let $I$ be any index set and $\{e_j\}_{j\in I}$ be an orthonormal set of a Hilbert space $X$ over $\mathbb{F}$. Then $S = \{e_j : \langle x, e_j\rangle \neq 0\}$ is countable for any $x \in X$.

Proof. For each $n \in \mathbb{N}$, define $S_n = \{e_j : |\langle x, e_j\rangle|^2 > \|x\|^2/n\}$. By Lemma 5.16 each $S_n$ contains at most $n - 1$ elements. Since $S = \bigcup_{n=1}^\infty S_n$, we conclude that $S$ is countable.

Remark 5.18. Given an arbitrary orthonormal set $\{e_j\}_{j\in I}$ indexed by a set $I$, under Lemma 5.16 and Corollary 5.17, the sums $\sum_{j\in I} |\langle x, e_j\rangle|^2$ and $\sum_{j\in I} \langle x, e_j\rangle e_j$ can be written as the series $\sum_{k=1}^\infty |\langle x, e_{j_k}\rangle|^2$ and $\sum_{k=1}^\infty \langle x, e_{j_k}\rangle e_{j_k}$ respectively, where we restrict ourselves to the countable number of $e_{j_k}$ for which $\langle x, e_{j_k}\rangle \neq 0$.

The following result assures that the two series in Remark 5.18 are well defined.

Theorem 5.19. Let $(x_k)_{k=1}^\infty$ be an orthonormal sequence in a Hilbert space $X$ over $\mathbb{F}$, and let $(c_k)_{k=1}^\infty$ be any sequence in $\mathbb{F}$. Then $\sum_{k=1}^\infty c_k x_k$ is convergent in $X$ if and only if $\sum_{k=1}^\infty |c_k|^2 < \infty$, and if so

$$\Big\| \sum_{k=1}^\infty c_k x_k \Big\|^2 = \sum_{k=1}^\infty |c_k|^2.$$

Moreover, $\sum_{k=1}^\infty c_k x_k$ is independent of the order in which its terms are arranged.

Proof. From the orthonormality of $(x_k)_{k=1}^\infty$ it follows that for $m, n \in \mathbb{N}$ and $m > n$,

$$\Big\| \sum_{k=n}^m c_k x_k \Big\|^2 = \sum_{k=n}^m |c_k|^2.$$

This, together with the completeness of $X$, implies the if and only if part of the theorem. If $n = 1$ and $m \to \infty$, then the desired equality follows.

To complete the proof, assume $\sum_{k=1}^\infty |c_k|^2 < \infty$ and let $y = \sum_{l=1}^\infty c_{k_l} x_{k_l}$ be a rearrangement of $x = \sum_{k=1}^\infty c_k x_k$. Then

$$\|x - y\|^2 = \langle x, x\rangle + \langle y, y\rangle - \langle x, y\rangle - \langle y, x\rangle,$$

and

$$\langle x, x\rangle = \langle y, y\rangle = \sum_{k=1}^\infty |c_k|^2.$$

If

$$s_m = \sum_{k=1}^m c_k x_k \quad \text{and} \quad t_m = \sum_{l=1}^m c_{k_l} x_{k_l},$$

then

$$\langle x, y\rangle = \lim_{m\to\infty} \langle s_m, t_m\rangle = \sum_{k=1}^\infty |c_k|^2.$$

Note that $\langle y, x\rangle = \overline{\langle x, y\rangle} = \langle x, y\rangle$. So it follows that $\|x - y\| = 0$ and hence $x = y$. We are done.
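Here is a statement of Friedrich Wilhelm Bessel about the coefficients of an element in a Hilbert space with respect to an orthonormal set.

Theorem 5.20. Let $I$ be any index set and $\{e_j\}_{j\in I}$ be an orthonormal set of a Hilbert space $X$ over $\mathbb{F}$. Then
(i) Bessel's inequality: $\sum_{j\in I} |\langle x, e_j\rangle|^2 \le \|x\|^2$ for all $x \in X$.
(ii) $\big\langle x - \sum_{j\in I} \langle x, e_j\rangle e_j, e_k\big\rangle = 0$ for all $x \in X$ and $k \in I$.

Proof. (i) This follows from

$$\sum_{j\in I} |\langle x, e_j\rangle|^2 = \lim_{n\to\infty} \sum_{k=1}^n |\langle x, e_{j_k}\rangle|^2 \le \|x\|^2.$$

(ii) Using the continuity of the inner product in its left-hand variable that follows from the Cauchy-Schwarz inequality, we obtain

$$\Big\langle x - \sum_{j\in I} \langle x, e_j\rangle e_j, e_k\Big\rangle = \langle x, e_k\rangle - \Big\langle \sum_{j\in I} \langle x, e_j\rangle e_j, e_k\Big\rangle$$
$$= \langle x, e_k\rangle - \Big\langle \lim_{n\to\infty} \sum_{m=1}^n \langle x, e_{j_m}\rangle e_{j_m}, e_k\Big\rangle$$
$$= \langle x, e_k\rangle - \lim_{n\to\infty} \Big\langle \sum_{m=1}^n \langle x, e_{j_m}\rangle e_{j_m}, e_k\Big\rangle$$
$$= \langle x, e_k\rangle - \langle x, e_k\rangle = 0.$$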

Theorem 5.20 (ii) induces the concept of an orthonormal basis for a Hilbert space.

Definition 5.21. Let $I$ be an index set and $X$ be a Hilbert space over $\mathbb{F}$. Then an orthonormal set $\{e_j\}_{j\in I}$ of $X$ is called an orthonormal basis for $X$ provided that $x = 0$ follows from $\langle x, e_j\rangle = 0$ for all $j \in I$.

This definition actually illustrates that an orthonormal set is an orthonormal basis when it is impossible to add one more nonzero element to the set while still preserving its orthonormality. An orthonormal basis is not generally a "basis", i.e., it is not generally possible to write every member of the space as a linear combination of finitely many members of an orthonormal basis. In the infinite-dimensional case the distinction really matters; that is to say, the definition given above requires only that the span of an orthonormal basis be dense in the normed linear space, not that it equal the entire space. More precisely, we have the following theorem about series due to Marc-Antoine Parseval.

Theorem 5.22. Given an index set I, let {ej }j∈I be an orthonormal set in
a Hilbert space X over F. Then the following statements are equivalent:
(i) {ej }j∈I is an orthonormal basis.
(ii) The closed linear span of {ejP
}j∈I is X.
(iii) Parseval’s identity: kxk2 = j∈I |hx, ej i|2 for all x ∈ X.

Proof. First, we prove that (i) implies (ii) and (iii). If (i) is true, then
Definition 5.21 and Theorem 5.20 (ii) yield
\[ x = \sum_{j\in I}\langle x, e_j\rangle e_j \quad\text{for all } x\in X. \]
Thus every $x\in X$ is a limit of finite linear combinations of the $e_j$, which
gives (ii). Clearly, (iii) follows from this expansion and Theorem 5.19.

Next, we prove that each of (ii) and (iii) implies (i).
(ii)⇒(i) Suppose that $y\in X$ satisfies $\langle y, e_j\rangle = 0$ for all $j\in I$. To prove
$y = 0$, consider $S = \{x\in X : \langle y, x\rangle = 0\}$. It is easy to see that $S$ is a
linear subspace of $X$. Since $e_j\in S$ for every $j\in I$, it follows that $S$ must
contain the linear span of $\{e_j\}_{j\in I}$. On the other hand, $S$ is closed in view
of the continuity of the inner product, and so $S$ must contain the closure of
the linear span of $\{e_j\}_{j\in I}$. Hence $S = X$ by (ii). In particular, we have
$y\in S$ and so $\langle y, y\rangle = 0$, whence $y = 0$ as required.
(iii)⇒(i) Suppose on the contrary that $\{e_j\}_{j\in I}$ does not form an orthonormal
basis of $X$. Then there exists a nonzero $x\in X$ such that $\langle x, e_j\rangle = 0$ for
all $j\in I$. Then
\[ 0 \ne \|x\|^2 = \sum_{j\in I} |\langle x, e_j\rangle|^2 = 0. \]
This is a contradiction.
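
The difference between Bessel's inequality and Parseval's identity can be checked numerically in a finite-dimensional case. The following is only an illustrative sketch: the space $\mathbb{R}^3$, the particular orthonormal vectors, the sample vector and the helper names `inner` and `bessel_sum` are assumptions made for this example and do not appear in the notes.

```python
import math

def inner(x, y):
    """Euclidean inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def bessel_sum(x, orthonormal_set):
    """Sum of |<x, e_j>|^2 over the given orthonormal set."""
    return sum(inner(x, e) ** 2 for e in orthonormal_set)

# An orthonormal basis of R^3 (a rotated copy of the standard basis).
s = 1 / math.sqrt(2)
e1, e2, e3 = (s, s, 0.0), (s, -s, 0.0), (0.0, 0.0, 1.0)

x = (3.0, -1.0, 2.0)
norm_sq = inner(x, x)

# {e1, e2} is orthonormal but not a basis: only Bessel's inequality holds.
partial = bessel_sum(x, [e1, e2])
# {e1, e2, e3} is an orthonormal basis: Parseval's identity holds.
full = bessel_sum(x, [e1, e2, e3])

print(partial, full, norm_sq)
```

For the incomplete set the coefficient sum stays strictly below $\|x\|^2$, while for the full basis the two agree up to rounding.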

Example 5.23.
(i) The set $\{(1,0,0), (0,1,0), (0,0,1)\}$ forms an orthonormal basis of $\mathbb{R}^3$.
(ii) The set $\{e_j : j\in\mathbb{N}\}$, with the $j$th entry of $e_j\in\ell^2$ being 1 and the others
being 0, forms an orthonormal basis of the real sequence space $\ell^2$.
(iii) The set $\{e^{int}\}_{n=-\infty}^{\infty}$ is an orthonormal basis for $LRS^2_{id}([0,2\pi],\mathbb{C})$.
Accordingly, $C([0,2\pi],\mathbb{C})$ is dense in $LRS^2_{id}([0,2\pi],\mathbb{C})$. In fact, this basis is
fundamental to the study of Fourier series, named in honor of Joseph Fourier.

Theorem 5.24.
(i) Every Hilbert space over F has an orthonormal basis.
(ii) Any orthonormal basis in a separable Hilbert space over F is countable.

Proof. (i) Let $X$ be a Hilbert space over $\mathbb{F}$ with the inner product $\langle\cdot,\cdot\rangle$ and
the norm $\|\cdot\|$, and consider the collection $\mathcal{E}$ of orthonormal subsets of $X$. It
is clear that $\mathcal{E}$ is nonempty and can be partially ordered under inclusion. If
$\mathcal{F}$ is any totally ordered subcollection of $\mathcal{E}$, the set $U = \bigcup_{S\in\mathcal{F}} S$ is a member
of $\mathcal{E}$ and an upper bound for $\mathcal{F}$. By Zorn's lemma there is a maximal orthonormal
set $M$. Since $M$ is maximal, we conclude from Definition 5.21 that $M$ is an
orthonormal basis. Otherwise, there would be a nonzero $x_0\in X\setminus M$ such
that $\langle x_0, e_j\rangle = 0$ holds for any $e_j\in M$. Accordingly, $\{x_0/\|x_0\|\}\cup M$ would form
an orthonormal set which properly contains $M$, contradicting the maximality
of $M$.
(ii) If $X$ is separable, and if $\{e_j\}_{j\in I}$ is an uncountable orthonormal basis
indexed by $I$, then for $j, k\in I$ with $j\ne k$, we have
\[ \|e_j - e_k\|^2 = \|e_j\|^2 + \|e_k\|^2 = 2, \]
and so the open balls $B^X_{1/2}(e_j)$ (with center $e_j$ and radius $1/2$) in $X$ are
mutually disjoint. If $\{x_j\}_{j=1}^{\infty}$ is a countable dense subset of $X$, then, because
$\{e_j\}_{j\in I}$ is uncountable, there is an open ball $B^X_{1/2}(e_{j_0})$ which is disjoint
from $\{x_j\}_{j=1}^{\infty}$. Hence $e_{j_0}$ is not in the closure of $\{x_j\}_{j=1}^{\infty}$. This contradicts
the density of $\{x_j\}_{j=1}^{\infty}$ in $X$. We are done.

Example 5.25. There is a non-separable Hilbert space. As a matter of fact, let
$X$ be the linear space of all continuous functions $f : \mathbb{R}\to\mathbb{C}$ satisfying
\[ |||f|||_2 = \Big(\lim_{N\to\infty} (2N)^{-1}\int_{-N}^{N} |f(x)|^2\,dx\Big)^{1/2} < \infty. \]
Although $|||f|||_2 = 0$ does not ensure $f = 0$ on $\mathbb{R}$, if $Y = \{f\in X : |||f|||_2 = 0\}$
then $X/Y$ is a normed linear space over $\mathbb{C}$. If $H$ is the completion of this
quotient space and is equipped with the following inner product
\[ \langle f, g\rangle = \lim_{N\to\infty} (2N)^{-1}\int_{-N}^{N} f(x)\overline{g(x)}\,dx, \]
then $H$ is a Hilbert space over $\mathbb{C}$. Note that $\{e^{it(\cdot)} + Y\}_{t\in\mathbb{R}}$ is an orthonormal
set of $H$, and that the balls
\[ \{f + Y\in H : |||f(\cdot) - e^{it(\cdot)}|||_2 < 1/2\} \quad\text{and}\quad \{f + Y\in H : |||f(\cdot) - e^{is(\cdot)}|||_2 < 1/2\} \]
are disjoint for $t\ne s$. So $H$ is not separable.

Corollary 5.26. Any two infinite dimensional separable Hilbert spaces over
F are isometrically isomorphic.

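Proof. Suppose $X$ and $Y$ are two such spaces equipped with the inner products
$\langle\cdot,\cdot\rangle_X$ and $\langle\cdot,\cdot\rangle_Y$. Theorem 5.24 tells us that there are sequences $(x_j)_{j=1}^{\infty}$ and
$(y_j)_{j=1}^{\infty}$ that form orthonormal bases for $X$ and $Y$ respectively. If $x\in X$ and
$y\in Y$, then
\[ x = \sum_{j=1}^{\infty}\langle x, x_j\rangle_X\, x_j \quad\text{and}\quad y = \sum_{j=1}^{\infty}\langle y, y_j\rangle_Y\, y_j. \]
Define a map $T : X\to Y$ by $T(x) = y$ if $\langle x, x_j\rangle_X = \langle y, y_j\rangle_Y$ for all $j\in\mathbb{N}$. It is
clear that $T$ is linear and one-to-one, and it maps $X$ onto $Y$ since
$(\langle x, x_j\rangle_X)_{j=1}^{\infty}$ and $(\langle y, y_j\rangle_Y)_{j=1}^{\infty}$ run through all elements of the $\mathbb{F}$-valued
sequence space $\ell^2(\mathbb{N},\mathbb{F})$. In addition,
\[ \|T(x)\|^2 = \sum_{j=1}^{\infty} |\langle y, y_j\rangle_Y|^2 = \sum_{j=1}^{\infty} |\langle x, x_j\rangle_X|^2 = \|x\|^2. \]
Thus, $T$ is an isometric isomorphism. The proof is complete.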

5.4 Five Special Bounded Operators


For a Hilbert space $X$ over $\mathbb{F}$, the elements in $B(X)$ are of particular interest.
To see this, suppose $T\in B(X)$. For each fixed $y\in X$, the mapping that
sends $x$ to $\langle T(x), y\rangle$ is a continuous linear functional on $X$. Thus by the Riesz
representation theorem there exists a unique $z\in X$ such that
\[ \langle T(x), y\rangle = L_z(x) = \langle x, z\rangle. \]
This pairing actually produces a new operator on $X$ (see also Theorem 4.48).

Definition 5.27. Let $X$ be a Hilbert space over $\mathbb{F}$. If $T\in B(X)$ then its
adjoint operator $T^*$ is defined by
\[ \langle T(x), y\rangle = \langle x, T^*(y)\rangle \quad\text{for all } x, y\in X. \]


The three facts (i)-(ii)-(iii) in Theorem 5.28 below are the required condi-
tions for ∗ to be an involution.

Theorem 5.28. Let $X$ be a Hilbert space over $\mathbb{F}$. Then the operation $*$ maps
$B(X)$ to itself, and has the following properties for all $T, S\in B(X)$ and
$\alpha, \beta\in\mathbb{F}$:
(i) $(\alpha T + \beta S)^* = \bar\alpha T^* + \bar\beta S^*$;
(ii) $(TS)^* = S^*T^*$;
(iii) $T^{**} = T$;
(iv) $\|T^*\| = \|T\|$;
(v) $\|T^*T\| = \|T\|^2$.

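Proof. First of all, we verify that $*$ maps $B(X)$ to $B(X)$. The linearity of $T^*$
follows from the following calculation:
\begin{align*}
\langle x, T^*(\alpha y_1 + \beta y_2)\rangle &= \langle T(x), \alpha y_1 + \beta y_2\rangle \\
&= \bar\alpha\langle T(x), y_1\rangle + \bar\beta\langle T(x), y_2\rangle \\
&= \langle x, \alpha T^*(y_1) + \beta T^*(y_2)\rangle.
\end{align*}
By the definition of the operator norm, we have
\begin{align*}
\|T^*\| &= \sup_{\|y\|=1}\|T^*(y)\| \\
&\le \sup_{\|x\|=1,\,\|y\|=1} |\langle x, T^*(y)\rangle| \\
&= \sup_{\|x\|=1,\,\|y\|=1} |\langle T(x), y\rangle| \\
&\le \sup_{\|x\|=1}\|T(x)\| = \|T\|.
\end{align*}
This implies that $T^*$ is bounded, i.e., $T^*\in B(X)$.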


Next, we check those five properties.
(i) For any $x, y\in X$, we have
\[ \langle x, (\alpha T + \beta S)^*(y)\rangle = \alpha\langle T(x), y\rangle + \beta\langle S(x), y\rangle = \langle x, (\bar\alpha T^* + \bar\beta S^*)(y)\rangle. \]
(ii) This follows from
\[ \langle x, (TS)^*(y)\rangle = \langle TS(x), y\rangle = \langle S(x), T^*(y)\rangle = \langle x, S^*T^*(y)\rangle. \]
(iii) This follows from
\[ \langle x, T^{**}(y)\rangle = \langle T^*(x), y\rangle = \overline{\langle y, T^*(x)\rangle} = \overline{\langle T(y), x\rangle} = \langle x, T(y)\rangle. \]
(iv) It is known that $\|T^*\|\le\|T\|$ and $T^{**} = T$. So $\|T\| = \|T^{**}\|\le\|T^*\|$. This gives
$\|T\| = \|T^*\|$.
(v) By (iv), we get
\[ \|T^*T\| \le \|T^*\|\,\|T\| = \|T\|^2. \]
For the reverse inequality, we note that
\[ \|T(x)\|^2 = \langle T^*T(x), x\rangle \le \|T^*T(x)\|\,\|x\| \le \|T^*T\|\,\|x\|^2. \]
So, $\|T\|^2\le\|T^*T\|$, which completes the proof.


Example 5.29. If $T$ is the operator on $\ell^2(\mathbb{N},\mathbb{C})$ defined by
\[ T(x_1, x_2, \dots) = (0, x_1, x_2, \dots), \]
then $T^*(x_1, x_2, \dots) = (x_2, x_3, \dots)$. Clearly, $\|T\| = \|T^*\| = 1$.
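
The adjoint relation for the shift of Example 5.29 can be tested on finitely supported sequences. This is a minimal sketch under stated assumptions: sequences are modelled as Python lists with all omitted trailing entries equal to zero, the inner product is taken linear in the first variable (as in Theorem 5.28 (i)), and the names `right_shift`, `left_shift` and `inner` are hypothetical helpers, not notation from the notes.

```python
def right_shift(x):
    """T(x1, x2, ...) = (0, x1, x2, ...) on a finitely supported sequence."""
    return [0] + list(x)

def left_shift(x):
    """T*(x1, x2, ...) = (x2, x3, ...)."""
    return list(x[1:])

def inner(x, y):
    """<x, y> = sum_j x_j * conj(y_j); missing trailing entries count as 0."""
    n = max(len(x), len(y))
    xs = list(x) + [0] * (n - len(x))
    ys = list(y) + [0] * (n - len(y))
    return sum(a * complex(b).conjugate() for a, b in zip(xs, ys))

x = [1 + 2j, 3, -1j]
y = [2, 1j, 4, 1 - 1j]

lhs = inner(right_shift(x), y)   # <T(x), y>
rhs = inner(x, left_shift(y))    # <x, T*(y)>
print(lhs, rhs)

# T*T is the identity, while T T* kills the first coordinate.
print(left_shift(right_shift(x)), right_shift(left_shift(x)))
```

The last line also shows that $T^*T = I$ while $TT^* \ne I$, so this $T$ is an isometry that is not unitary.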




In what follows, we deal with five special types of bounded linear operators
on a given Hilbert space. They are self-adjoint, normal, nonnegative, unitary,
and projection operators.
Definition 5.30. Let $X$ be a Hilbert space over $\mathbb{F}$. An operator $T\in B(X)$ is
said to be self-adjoint or symmetric provided $T = T^*$.
Example 5.31.
(i) On a finite-dimensional Hilbert space over $\mathbb{F}$, a self-adjoint operator is one
that is its own adjoint, or, equivalently, one whose matrix with respect to an
orthonormal basis is Hermitian, i.e., equal to its own conjugate transpose.
(ii) Given $n\in\mathbb{N}$, the operator $T$ defined by $T(f)(x) = x^n f(x)$ is a self-adjoint
operator on $LRS^2_{id}([0,1],\mathbb{C})$.
The structure of self-adjoint operators on infinite dimensional Hilbert
spaces is somewhat subtle, as the following characterization and the remark
after it indicate.
Theorem 5.32. Let $X$ be a Hilbert space over $\mathbb{F}$ and $T\in B(X)$.
(i) If $T$ is self-adjoint, then $\|T\| = \sup_{\|x\|=1} |\langle T(x), x\rangle|$.
(ii) If $\mathbb{F} = \mathbb{C}$, then $T$ is self-adjoint when and only when $\langle T(x), x\rangle$ is real for
all $x\in X$. In particular, $T = 0$ when and only when $\langle T(x), x\rangle = 0$ for all
$x\in X$.
Proof. (i) The Cauchy-Schwarz inequality implies $\|T\|\ge\sup_{\|x\|=1} |\langle T(x), x\rangle|$.
For the reverse estimate, write $M = \sup_{\|z\|=1} |\langle T(z), z\rangle|$, so that
$|\langle T(w), w\rangle|\le M\|w\|^2$ for all $w\in X$. The triangle inequality and the
parallelogram law give, for any $x, y\in X$ with $\|x\| = 1$,
\begin{align*}
2|\langle T(x), y\rangle + \langle T(y), x\rangle| &= |\langle T(x+y), x+y\rangle - \langle T(x-y), x-y\rangle| \\
&\le M(\|x+y\|^2 + \|x-y\|^2) \\
&= 2M(1 + \|y\|^2).
\end{align*}
If $T(x)\ne 0$, taking $y = T(x)/\|T(x)\|$ in the above and using
$\langle T(y), x\rangle = \langle y, T(x)\rangle = \|T(x)\|$, we get $4\|T(x)\|\le 4M$, i.e.,
$\|T(x)\|\le\sup_{\|z\|=1} |\langle T(z), z\rangle|$. Note that the last inequality is trivially valid
for $T(x) = 0$. So, it follows that $\|T\|\le\sup_{\|x\|=1} |\langle T(x), x\rangle|$, and the desired
equality occurs.
(ii) If $T\in B(X)$ is self-adjoint, then $\langle T(x), x\rangle = \langle x, T(x)\rangle = \overline{\langle T(x), x\rangle}$
for any $x\in X$. Hence $\langle T(x), x\rangle$ is real for all $x\in X$. Conversely, if
$f(x) = \langle T(x), x\rangle$, then
\[ f(x+y) = f(x) + f(y) + \langle T(y), x\rangle + \langle T(x), y\rangle \]
and
\[ f(x+iy) = f(x) + f(y) + i\langle T(y), x\rangle - i\langle T(x), y\rangle. \]
Since $f$ is real-valued, we conclude from the last two identities that there are
$r, s\in\mathbb{R}$ such that
\[ \langle T(y), x\rangle + \langle T(x), y\rangle = r \quad\text{and}\quad \langle T(y), x\rangle - \langle T(x), y\rangle = is. \]
This yields
\[ \langle T(y), x\rangle = \frac{r+is}{2} \quad\text{and}\quad \langle T(x), y\rangle = \frac{r-is}{2}, \]
and thus
\[ \langle T(y), x\rangle = \overline{\langle T(x), y\rangle} = \overline{\langle x, T^*(y)\rangle} = \langle T^*(y), x\rangle. \]
Consequently, taking $x = (T - T^*)(y)$, for any $y\in X$ we have
\[ \langle (T-T^*)(y), (T-T^*)(y)\rangle = 0, \quad\text{i.e., } T = T^*. \]
For the second half of (ii), it is enough to prove the sufficiency. Suppose
$\langle T(x), x\rangle = 0$ for all $x\in X$. Then $T$ is self-adjoint, and hence
\[ 4\langle T(x), y\rangle = \langle T(x+y), x+y\rangle - \langle T(x-y), x-y\rangle = 0 \]
holds for $y = T(x)\in X$. This gives $T(x) = 0$ for all $x\in X$, i.e., $T = 0$.

Remark 5.33. Unfortunately, Theorem 5.32 (ii) is not valid for real Hilbert
spaces. For instance, if $X = \mathbb{R}^2$ and $T(x) = T(x_1, x_2) = (-x_2, x_1)$ for
$x = (x_1, x_2)\in\mathbb{R}^2$, then $\langle T(x), x\rangle = 0$ for all $x$, yet $0\ne T\in B(\mathbb{R}^2)$ and $T$ is
not self-adjoint since $T^*(x) = T^*(x_1, x_2) = (x_2, -x_1)$ for $x = (x_1, x_2)\in\mathbb{R}^2$.
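
The counterexample of Remark 5.33 is small enough to check directly. The sketch below assumes the standard inner product on $\mathbb{R}^2$; the sample vectors and the function names `T`, `T_adjoint` and `inner` are chosen for illustration only.

```python
def T(x):
    """Rotation by 90 degrees on R^2: (x1, x2) -> (-x2, x1)."""
    return (-x[1], x[0])

def T_adjoint(x):
    """Adjoint (transpose) of T: (x1, x2) -> (x2, -x1)."""
    return (x[1], -x[0])

def inner(x, y):
    """Standard inner product on R^2."""
    return x[0] * y[0] + x[1] * y[1]

samples = [(1.0, 0.0), (2.0, -3.0), (0.5, 4.0)]

# <T(x), x> vanishes for every x, although T is not the zero operator.
quadratic_values = [inner(T(x), x) for x in samples]

# T_adjoint really is the adjoint: <T(x), y> = <x, T*(y)>.
adjoint_ok = all(
    abs(inner(T(x), y) - inner(x, T_adjoint(y))) < 1e-12
    for x in samples for y in samples
)

print(quadratic_values, adjoint_ok, T((1.0, 0.0)), T_adjoint((1.0, 0.0)))
```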

Definition 5.34. Let $X$ be a Hilbert space over $\mathbb{C}$. An operator $T\in B(X)$ is
said to be nonnegative, denoted $T\ge 0$, provided $\langle T(x), x\rangle\ge 0$ for all $x\in X$.
Moreover, we say that $T_1\le T_2$ if $T_1$ and $T_2$ are self-adjoint operators in $B(X)$
and $T_2 - T_1\ge 0$.

Example 5.35.
(i) Any bounded operator $T$ on a given Hilbert space over $\mathbb{C}$ always generates
a nonnegative operator $T^*T$ since $\langle T^*T(x), x\rangle = \|T(x)\|^2\ge 0$.
(ii) The operator defined in Example 5.31 (ii) is nonnegative.
(iii) The operator $T$ on $\mathbb{C}^2$ determined by $T(x_1, x_2) = (x_1 + 2x_2, 2x_1 + x_2)$
is not nonnegative since $T(1, -1) = (-1, 1)$ and the inner product of the
two vectors $(-1, 1)$ and $(1, -1)$ is $-2$.

According to Theorem 5.32 (ii) and Example 5.35 (iii), any nonnegative
operator is self-adjoint, but not conversely. Here is the basic structure of a
nonnegative operator.

Theorem 5.36. Given a Hilbert space $X$ over $\mathbb{C}$, let $I\in B(X)$ be the identity
operator.
(i) If $-I\le T\le I$ or $0\le T\le 0$, then $\|T\|\le 1$ or $T = 0$, respectively.
(ii) If $T\ge 0$, then $T + I$ is invertible, and the generalized Cauchy-Schwarz
inequality holds:
\[ |\langle T(x), y\rangle| \le \sqrt{\langle T(x), x\rangle\langle T(y), y\rangle} \quad \forall\, x, y\in X. \]
(iii) If $0\le T_1\le T_2\le\cdots\le I$, then there exists a $T_\infty\in B(X)$ such that
$\lim_{j\to\infty}\|T_j(x) - T_\infty(x)\| = 0$ for all $x\in X$.
(iv) $T\ge 0$ when and only when there exists a unique nonnegative operator
$S\in B(X)$, denoted $S = \sqrt{T}$, such that $S^2 = T$.

Proof. (i) Note that $T$ is self-adjoint. So, $I\pm T\ge 0$ implies $\pm\langle T(x), x\rangle\le\|x\|^2$
and $\|T\|\le 1$ by Theorem 5.32 (i). Moreover, $\pm T\ge 0$ implies $\langle T(x), x\rangle = 0$
and $\|T\| = 0$ by Theorem 5.32 (i).
(ii) Since $T\ge 0$, we conclude
\[ (T+I)(x) = 0 \Rightarrow \langle (T+I)(x), x\rangle = 0 \Rightarrow 0\le\langle T(x), x\rangle = -\|x\|^2 \Rightarrow x = 0, \]
and so the injectivity of $T + I$ follows. The surjectivity will follow if we can
prove that $M = (T+I)(X)$, the image of $T + I$, is both dense and closed. The
argument used for injectivity applies to $x\in M^\perp$: since $(T+I)(x)\in M$, we
infer that $\langle (T+I)(x), x\rangle = 0$ for all $x\in M^\perp$ and so $M^\perp = \{0\}$. On the other
hand, for any $x\in X$, we have
\[ \|(T+I)(x)\|^2 = \|T(x)\|^2 + 2\langle T(x), x\rangle + \|x\|^2 \]
and hence $\|x\|\le\|(T+I)(x)\|$. Consequently, if $\big((T+I)(x_j)\big)_{j=1}^{\infty}$ is a Cauchy
sequence in $M$, then $(x_j)_{j=1}^{\infty}$ is a Cauchy sequence in $X$ and hence it is
convergent to some $x\in X$, so that $(T+I)(x_j)\to(T+I)(x)\in M$. This shows that
$M$ is complete and thereby closed in $X$. From Theorem 5.10 it turns out that
$X = M\oplus M^\perp = M$; that is, $T + I$ is surjective.
Regarding the generalized Cauchy-Schwarz inequality, fix $x, y\in X$ and
choose $\zeta\in\mathbb{C}$ with $|\zeta| = 1$ and $\zeta\langle y, T(x)\rangle = |\langle y, T(x)\rangle|$. For any $t\in\mathbb{R}$ let
$\lambda = t\zeta$. Then we get
\begin{align*}
0 &\le \langle x + \lambda y, T(x + \lambda y)\rangle \\
&= \langle x, T(x)\rangle + 2t|\langle y, T(x)\rangle| + t^2\langle y, T(y)\rangle.
\end{align*}
Since this quadratic in $t$ is nonnegative for all real $t$, its discriminant is
nonpositive, whence
\[ |\langle T(x), y\rangle|^2 - \langle x, T(x)\rangle\langle y, T(y)\rangle \le 0. \]

(iii) For $k > j$ and $x\in X$, we use the generalized Cauchy-Schwarz
inequality in (ii) to get
\begin{align*}
\|T_k(x) - T_j(x)\|^2 &= \langle (T_k - T_j)(x), (T_k - T_j)(x)\rangle \\
&\le \sqrt{\langle (T_k - T_j)(x), x\rangle\,\langle (T_k - T_j)^2(x), (T_k - T_j)(x)\rangle} \\
&\le \sqrt{\langle (T_k - T_j)(x), x\rangle}\,\|x\| \\
&\le \sqrt{\langle T_k(x), x\rangle - \langle T_j(x), x\rangle}\,\|x\|
\end{align*}
since $0\le T_k - T_j\le I$ implies $\|T_k - T_j\|\le 1$ and $\|(T_k - T_j)^2\|\le 1$. Note that
$(\langle T_j(x), x\rangle)_{j=1}^{\infty}$ is convergent since it is increasing and bounded. So $\big(T_j(x)\big)_{j=1}^{\infty}$
is a Cauchy sequence in the Hilbert space $X$ and consequently, this sequence
is convergent to some $T_\infty(x)\in X$. According to Theorem 4.15 (i), $T_\infty\in B(X)$.
(iv) The sufficiency is trivial. For the necessity, suppose $T\ge 0$. According
to Theorem 5.32 (i) we have $\|T\| < \infty$ and equivalently $T\le\|T\|I$. Therefore,
without loss of generality we may assume $T\le I$. To find a nonnegative
operator $S$ such that $S^2 = T$, let $U = I - T$, $T_1 = 0$ and $T_{j+1} = 2^{-1}(U + T_j^2)$ for
$j\in\mathbb{N}$. By induction, we find three basic facts as follows:
(a) For $j\in\mathbb{N}$, $0\le T_j$ is a polynomial in $U$ with nonnegative coefficients;
(b) For $j\in\mathbb{N}$, $T_j\le I$. In fact, $T_j\le I$ implies
\[ \langle T_j^2(x), x\rangle = \|T_j(x)\|^2 \le \|x\|^2 \]
and so $T_j^2\le I$ which, along with $U\le I$, yields $T_{j+1}\le I$;
(c) For $j\in\mathbb{N}$, $T_{j+2} - T_{j+1}\ge 0$. This is because the commutativity
$T_{j+1}T_j = T_jT_{j+1}$ gives
\[ T_{j+2} - T_{j+1} = \frac{T_{j+1}^2 - T_j^2}{2} = \frac{(T_{j+1} + T_j)(T_{j+1} - T_j)}{2}, \]
where $T_{j+1} + T_j$ is a polynomial in $U$ with nonnegative coefficients, and so is
$T_{j+1} - T_j$ by $T_2 - T_1 = U/2$ and induction.
Now, (iii) is applied to produce a $T_\infty\in B(X)$ with $\lim_{j\to\infty} T_j(x) = T_\infty(x)$ for
all $x\in X$. Then $T_\infty = 2^{-1}(U + T_\infty^2)$ and hence the desired square root operator
is $S = I - T_\infty$, i.e., $(I - T_\infty)^2 = I - 2T_\infty + T_\infty^2 = I - U = T$.
It remains to check the uniqueness. Suppose $S_1\ge 0$ satisfies $S_1^2 = T$.
Then $S_1T = S_1^3 = TS_1$ and hence $S_1T^n = T^nS_1$ for any $n\in\mathbb{N}$. Now
the previously established limit process implies $SS_1 = S_1S$. Consequently, if
$y = (S - S_1)(x)$ for any $x\in X$, then $(S + S_1)(y) = (S^2 - S_1^2)(x) = 0$, so
$0 = \langle (S + S_1)(y), y\rangle = \langle S(y), y\rangle + \langle S_1(y), y\rangle$, and hence $S(y) = 0 = S_1(y)$
(here $S = (\sqrt{S})^2$ and $S_1 = (\sqrt{S_1})^2$), which ensures
$(S^2 - SS_1)(x) = 0 = (S_1S - S_1^2)(x)$. Accordingly,
$\|(S - S_1)(x)\|^2 = \langle (S - S_1)^2(x), x\rangle = 0$, and so $S = S_1$.
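
The iteration $T_{j+1} = 2^{-1}(U + T_j^2)$ from the proof of (iv) can be run on a small matrix. The following is a sketch, not a recommended numerical method: the $2\times 2$ symmetric matrix `T` with $0\le T\le I$, the fixed number of steps, and the helper names `matmul` and `sqrt_by_iteration` are all assumptions made for illustration.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sqrt_by_iteration(T, steps=200):
    """Approximate sqrt(T) for 0 <= T <= I via T_{j+1} = (U + T_j^2)/2, U = I - T."""
    I = [[1.0, 0.0], [0.0, 1.0]]
    U = [[I[i][j] - T[i][j] for j in range(2)] for i in range(2)]
    Tj = [[0.0, 0.0], [0.0, 0.0]]          # T_1 = 0
    for _ in range(steps):
        sq = matmul(Tj, Tj)
        Tj = [[0.5 * (U[i][j] + sq[i][j]) for j in range(2)] for i in range(2)]
    # The square root is S = I - T_infinity.
    return [[I[i][j] - Tj[i][j] for j in range(2)] for i in range(2)]

# Symmetric matrix with eigenvalues 0.3 and 0.7, so 0 <= T <= I.
T = [[0.5, 0.2], [0.2, 0.5]]
S = sqrt_by_iteration(T)
S2 = matmul(S, S)

print(S)
print(S2)
```

The convergence is only linear and slows down as the smallest eigenvalue of $T$ approaches 0, which is consistent with the proof giving existence rather than an efficient algorithm.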

Definition 5.37. Given a Hilbert space $X$ over $\mathbb{F}$ let $T\in B(X)$. Then $T$ is
called unitary provided $T$ is an isometric isomorphism.
Example 5.38. For $f\in LRS^2_{id}(\mathbb{R},\mathbb{C})$ let
\[ \mathcal{F}(f)(x) = \frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}} f(t)\exp(itx)\,dm_{id}(t), \quad x\in\mathbb{R}; \]
then $\mathcal{F}$ is a unitary operator on $LRS^2_{id}(\mathbb{R},\mathbb{C})$. This operator is called the
Fourier transform and its adjoint operator is determined by
\[ \mathcal{F}^*(f)(x) = \frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}} f(t)\exp(-itx)\,dm_{id}(t), \quad x\in\mathbb{R}. \]
We can characterize unitary operators in terms of adjoints and inner prod-
ucts.
Theorem 5.39. Given a Hilbert space $X$ over $\mathbb{C}$ let $T\in B(X)$. If $I$ is the
identity operator in $B(X)$, then the following statements are equivalent:
(i) $T$ is unitary;
(ii) $\langle T(x), T(y)\rangle = \langle x, y\rangle$ for all $x, y\in X$, and $T$ is surjective;
(iii) $TT^* = T^*T = I$.
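Proof. (i)⇒(ii) If $T$ is unitary, then $\|T(x)\| = \|x\|$ and hence
\begin{align*}
\langle T(x), T(y)\rangle &= 4^{-1}\sum_{n=1}^{4} i^n\|T(x) + i^nT(y)\|^2 \\
&= 4^{-1}\sum_{n=1}^{4} i^n\|T(x + i^ny)\|^2 \\
&= 4^{-1}\sum_{n=1}^{4} i^n\|x + i^ny\|^2 \\
&= \langle x, y\rangle.
\end{align*}
The surjectivity of $T$ follows from the definition of a unitary operator.
(ii)⇒(iii) Clearly, (ii) implies that $T$ is injective. Moreover
\[ \langle T^*T(x), x\rangle = \langle T(x), T(x)\rangle = \langle x, x\rangle. \]
This, along with Theorem 5.32 (ii), yields
\[ \langle (T^*T - I)(x), x\rangle = 0 \Rightarrow T^*T = I \Rightarrow TT^* = I, \]
where the last implication holds because $T$ is bijective, so that $T^* = T^{-1}$.
(iii)⇒(i) If (iii) is true, then $T$ is surjective and
\[ \|T(x)\|^2 = \langle T(x), T(x)\rangle = \langle x, T^*T(x)\rangle = \langle x, x\rangle = \|x\|^2. \]
This implies that $T$ is injective, and thereby $T$ is unitary.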



Theorem 5.39 leads to the following consideration.

Definition 5.40. Let $X$ be a Hilbert space over $\mathbb{C}$. An operator $T\in B(X)$ is
said to be normal if $TT^* = T^*T$.

Interestingly, the normal operators correspond to the complex numbers.

Theorem 5.41. Let $X$ be a Hilbert space over $\mathbb{C}$ and $T\in B(X)$. Then the
following are equivalent:
(i) $T$ is normal.
(ii) $T = T_1 + iT_2$ where $T_1$ and $T_2$ are self-adjoint and $T_1T_2 = T_2T_1$.
(iii) $\|T(x)\| = \|T^*(x)\|$ for all $x\in X$.

Proof. (i)⇒(ii) Put
\[ T_1 = \frac{T + T^*}{2} \quad\text{and}\quad T_2 = \frac{T - T^*}{2i}. \]
It is easy to see that $T = T_1 + iT_2$ holds with $T_1$ and $T_2$ being self-adjoint and
commuting.
(ii)⇒(iii) Using the given decomposition of $T$ plus $T_1T_2 = T_2T_1$, we obtain
\begin{align*}
\|T(x)\|^2 &= \langle x, T^*T(x)\rangle \\
&= \langle x, (T_1 - iT_2)(T_1 + iT_2)(x)\rangle \\
&= \langle x, (T_1T_1 + T_2T_2)(x)\rangle \\
&= \langle x, (T_1 + iT_2)(T_1 - iT_2)(x)\rangle \\
&= \langle x, TT^*(x)\rangle \\
&= \|T^*(x)\|^2.
\end{align*}
(iii)⇒(i) For any $x\in X$, we have
\[ \langle (TT^* - T^*T)(x), x\rangle = \langle TT^*(x), x\rangle - \langle T^*T(x), x\rangle = \|T^*(x)\|^2 - \|T(x)\|^2 = 0. \]
So, Theorem 5.32 (ii) is employed to derive that $TT^* - T^*T = 0$, as desired.
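
The decomposition $T = T_1 + iT_2$ of Theorem 5.41 can be verified on a concrete matrix. This is a sketch under assumptions: the $2\times 2$ complex matrix `T` is a hypothetical example, operators on $\mathbb{C}^2$ are identified with matrices so that the adjoint is the conjugate transpose, and the helper names `adjoint`, `matmul` and `combine` are not taken from the notes.

```python
def adjoint(A):
    """Conjugate transpose of a square matrix given as nested lists."""
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Product of two square matrices of the same size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(a, A, b, B):
    """Linear combination a*A + b*B."""
    n = len(A)
    return [[a * A[i][j] + b * B[i][j] for j in range(n)] for i in range(n)]

T = [[1, 1j], [1j, 1]]                    # a normal, non-self-adjoint matrix
Ts = adjoint(T)

T1 = combine(0.5, T, 0.5, Ts)             # (T + T*)/2
T2 = combine(1 / 2j, T, -1 / 2j, Ts)      # (T - T*)/(2i)
rebuilt = combine(1, T1, 1j, T2)          # T1 + i*T2

def close(A, B, tol=1e-12):
    """Entrywise comparison of two matrices up to a tolerance."""
    return all(abs(A[i][j] - B[i][j]) < tol
               for i in range(len(A)) for j in range(len(A)))

print(close(matmul(T, Ts), matmul(Ts, T)))      # T is normal
print(close(T1, adjoint(T1)), close(T2, adjoint(T2)))
print(close(matmul(T1, T2), matmul(T2, T1)), close(rebuilt, T))
```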

Example 5.42. Let $T = 2iI$ on a given Hilbert space $X$ over $\mathbb{C}$, where $I$ is the
identity operator in $B(X)$. Then $T^* = -2iI$ and $TT^* = T^*T = 4I$. Thus, $T$
is a normal operator but is neither unitary nor self-adjoint.

The fifth special operator that we take account of is the projection oper-
ator.

Definition 5.43. Let $X$ be a Hilbert space over $\mathbb{F}$. Then an operator
$T\in B(X)$ is called a projection provided $T^2 = T$.

Example 5.44. Let $M$ be a closed subspace of a Hilbert space $X$ over $\mathbb{F}$. Then
$X = M\oplus M^\perp$; that is, for any $x\in X$ there are unique $y\in M$ and $z\in M^\perp$ such
that $x = y + z$. If $P_M(x) = y$ then $P_M$ is a projection. In fact,
\[ P_M^2(x) = P_M(y) = y = P_M(x) \Rightarrow P_M^2 = P_M. \]
The fact that $P_M\in B(X)$ follows from
\[ \|x\|^2 = \|y + z\|^2 = \|y\|^2 + \|z\|^2 \ge \|y\|^2 = \|P_M(x)\|^2 \Rightarrow \|P_M\|\le 1. \]
Traditionally, $P_M$ is called the orthogonal projection of $X$ onto $M$. Moreover,
if $M\ne\{0\}$ then $P_M\ne 0$, and for any $x\in M\setminus\{0\}$ we have $x = x + 0$
and $P_M(x) = x$, giving $\|P_M\| = 1$.
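
The decomposition $x = y + z$ of Example 5.44 can be computed explicitly when $M$ is finite dimensional. The sketch below assumes $X = \mathbb{R}^3$ and computes $P_M$ through an orthonormal basis of $M$ via $P_M(x) = \sum_j\langle x, e_j\rangle e_j$; the choice of $M$, the sample vector and the names `inner` and `project` are illustrative assumptions.

```python
def inner(x, y):
    """Euclidean inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def project(x, orthonormal_basis_of_M):
    """P_M(x) = sum_j <x, e_j> e_j for an orthonormal basis (e_j) of M."""
    y = [0.0] * len(x)
    for e in orthonormal_basis_of_M:
        c = inner(x, e)
        y = [yi + c * ei for yi, ei in zip(y, e)]
    return y

# M = the plane spanned by the first two standard basis vectors of R^3.
basis_M = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
x = [3.0, -2.0, 5.0]

y = project(x, basis_M)                  # component in M
z = [xi - yi for xi, yi in zip(x, y)]    # component in the orthogonal complement

print(y, z)
print(project(y, basis_M) == y)          # P_M^2 = P_M
print(inner(y, z), inner(y, y) <= inner(x, x))
```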
The following theorem shows that the above orthogonal projections exist
as a very important subclass of the projection operators.
Theorem 5.45. Let X be a Hilbert space over C. If T ∈ B(X) is a projection,
then the following statements are equivalent:
(i) T is nonnegative.
(ii) T is self-adjoint.
(iii) T is normal.
(iv) T is the orthogonal projection on T (X).
Proof. Since (i)⇒(ii)⇒(iii) are straightforward, we only verify (iii)⇒(iv)⇒(i).
(iii)⇒(iv) Assume (iii) holds. To reach (iv), let $M = T(X)$.
We first prove that $M$ is closed. Given $\{y_j\}_{j=1}^{\infty}$ in $M$ with $y_j\to y$, we have
$y_j = T(x_j)$, $x_j\in X$. Since $T$ is a projection, we can conclude from $y_j\to y$
that
\[ y_j = T(x_j) = T^2(x_j) = T(y_j)\to T(y) \Rightarrow y = T(y)\in M \]
and hence $M$ is closed.
Next, given $x\in X$ write $x = y + z$, $y\in M$ and $z\in M^\perp$. We must verify
$T(x) = y$. Because $T(x) = T(y) + T(z)$, it suffices to prove that $T(y) = y$
and $T(z) = 0$. Since $y\in M$, there is $w\in X$ such that $y = T(w)$ and so
$T(y) = T^2(w) = T(w) = y$. By the definition of $M$ we have $T(z)\in M$, and if
we can also prove that $T(z)\in M^\perp$, then we can conclude that $T(z) = 0$. To
do so, given any $u\in M$, there is $v\in X$ such that $u = T(v)$ and
\[ \langle T(z), u\rangle = \langle T(z), T(v)\rangle = \langle z, T^*T(v)\rangle = \langle z, TT^*(v)\rangle = 0 \]
due to $z\in M^\perp$ and $TT^*(v)\in M$. In other words, $T(z)\in M^\perp$.
(iv)⇒(i) For any $x\in X$, let
\[ x = y + z, \quad y\in M = T(X) \ \text{and}\ z\in M^\perp. \]
Then
\[ \langle T(x), x\rangle = \langle y, y + z\rangle = \langle y, y\rangle + \langle y, z\rangle = \langle y, y\rangle \ge 0. \]
Hence $T$ is nonnegative and the proof is complete.

Remark 5.46. Here it is worth remarking that there exists a natural one-to-
one correspondence between projection operators T and direct sum decom-
positions X = M ⊕ N where M and N are closed. As a matter of fact, if T
is a projection, then X = M ⊕ N where M = T (X) and N = (I − T )(X).
Conversely, if X = M ⊕ N , then we define T as was done for the orthogonal
projection but using N instead of M ⊥ . The orthogonal projections are exactly
those that arise when N = M ⊥ , and these are generally the projections of
interest in Hilbert space theory.

5.5 Compact Operators via the Spectrum

This section is devoted to a spectrum-based study of the structure of a
compact operator acting on a Hilbert space.

Definition 5.47. Let $(X, \|\cdot\|_X)$ be a Banach space over $\mathbb{F}$. Then a linear
operator $T$ is said to be of finite rank provided it is of the following form:
\[ T(x) = \sum_{k=1}^{n} L_k(x)\,x_k \quad\text{for } x\in X, \text{ where } x_k\in X \text{ and } L_k\in X^*. \]

Clearly, an operator of finite rank is compact, for it takes a bounded
sequence into a bounded sequence in a finite dimensional Banach space. Hence,
the image of the sequence has a convergent subsequence. By Theorem 4.46
we know that in a Banach space the limit in norm of operators of finite rank
is compact. So, it is very natural to wonder whether or not the converse is
true; that is, for each compact operator, can we find a sequence of operators
of finite rank that approximates this compact operator in norm? Although the
answer is no in general for Banach spaces, we will show that the answer is yes
for all complex Hilbert spaces.

Definition 5.48. Let $X$ be a Banach space over $\mathbb{C}$ and $T\in B(X)$.
(i) The null space of $T$, denoted $N(T)$, is $\{x\in X : T(x) = 0\}$. The range of
$T$, denoted $R(T)$, is $\{y\in X : y = T(x) \text{ for some } x\in X\}$.
(ii) A complex number $\lambda$ is called an eigenvalue of $T$ if there exists a nonzero
$x\in X$ such that $T(x) = \lambda x$. In this case, $x$ is called an eigenvector associated
with $\lambda$.
(iii) The spectrum of $T$, denoted $\sigma(T)$, is the set of all complex numbers $\lambda$ such
that $T - \lambda I$ does not have a bounded inverse, where $I$ is the identity operator
in $B(X)$. The resolvent set of $T$, written $\rho(T)$, is $\mathbb{C}\setminus\sigma(T)$. The spectral radius
of $T$, denoted $r_\sigma(T)$, is $\sup\{|\lambda| : \lambda\in\sigma(T)\}$. In addition, $\sigma(T)$ can be split into
the following three classes:
(a) The point spectrum
\[ \sigma_p(T) = \{\lambda\in\sigma(T) : N(\lambda I - T)\ne\{0\}\}. \]
(b) The continuous spectrum
\[ \sigma_c(T) = \{\lambda\in\sigma(T) : N(T - \lambda I) = \{0\},\ \overline{R(T - \lambda I)} = X\}. \]
(c) The residual spectrum
\[ \sigma_r(T) = \{\lambda\in\sigma(T) : N(T - \lambda I) = \{0\},\ \overline{R(T - \lambda I)}\ne X\}. \]

Example 5.49.
(i) For the closed interval $[0,1]$ of $\mathbb{R}$, consider $LRS^2_{id}([0,1],\mathbb{C})$ and the
differential operator $T = -d^2/dx^2$ defined on the linear subspace of all
complex-valued functions $f$ having derivatives of any order on $[0,1]$ and
satisfying the boundary conditions $f(0) = f(1) = 0$. Then a computation via
integration by parts indicates that $T$ is self-adjoint; its eigenfunctions are
$f_j(x) = \sin(j\pi x)$, $j\in\mathbb{N}$, with the real eigenvalues $j^2\pi^2$; the well-known
orthogonality of the sine functions follows as a consequence of the property of
being self-adjoint.
(ii) $\sigma(T)$ is compact. In fact, according to Theorem 4.10 (i), $|\lambda| > \|T\|$ implies
\[ \lambda(T - \lambda I)^{-1} = (\lambda^{-1}T - I)^{-1}\in B(X) \]
and then $\lambda\notin\sigma(T)$. By the contrapositive, if $\lambda\in\sigma(T)$ then $|\lambda|\le\|T\|$.
Thus, $r_\sigma(T)\le\|T\|$. This implies that $\sigma(T)$ is bounded. To see that $\sigma(T)$ is
closed, let $\Phi : \mathbb{C}\to B(X)$ be $\Phi(\lambda) = T - \lambda I$. This map is continuous on $\mathbb{C}$
since $\|\Phi(\lambda_1) - \Phi(\lambda_2)\| = \|I\|\,|\lambda_1 - \lambda_2|$. Thus, $\sigma(T)$ is closed, as the set of all
invertible elements of $B(X)$ is open. Of course, $\rho(T) = \mathbb{C}\setminus\sigma(T)$ is open.
(iii) Let $T$ be defined by $T(x_1, x_2, \dots) = (x_2, x_3, \dots)$ on $\ell^2(\mathbb{N},\mathbb{C})$. It is easy
to see $\sigma(T)\subseteq\{\lambda\in\mathbb{C} : |\lambda|\le 1\}$. If $|\lambda| < 1$ then $x = (1, \lambda, \lambda^2, \dots)\in\ell^2(\mathbb{N},\mathbb{C})$
and $T(x) = \lambda x$, and consequently $\{\lambda\in\mathbb{C} : |\lambda| < 1\}\subseteq\sigma_p(T)$. Note
that $\sigma(T)$ is closed. So $\{\lambda\in\mathbb{C} : |\lambda| = 1\}\subseteq\sigma(T)$. Also, $\lambda\in\mathbb{C}$ with $|\lambda| = 1$ and
$T(x) = \lambda x$ imply $x = (x_1, x_2, \dots) = x_1(1, \lambda, \lambda^2, \dots)\notin\ell^2(\mathbb{N},\mathbb{C})$ unless $x_1 = 0$.
Moreover, the adjoint of $T$ is the shift of Example 5.29, which has no
eigenvalues, so $R(T - \lambda I)$ is dense for such $\lambda$. This means
$\{\lambda\in\mathbb{C} : |\lambda| = 1\}\subseteq\sigma_c(T)$. Accordingly, $\sigma_r(T) = \emptyset$.

The following result, which will be used later, gives much more information
on the spectrum of a self-adjoint operator.

Theorem 5.50. Let $X$ be a Hilbert space over $\mathbb{C}$ and $T\in B(X)$ be
self-adjoint.
(i) If $\lambda$ is an eigenvalue of $T$, then $\lambda$ is real.
(ii) If $x_1$ and $x_2$ are eigenvectors associated with different eigenvalues $\lambda_1$ and
$\lambda_2$ of $T$, then $\langle x_1, x_2\rangle = 0$.
(iii) If $\lambda$ is not an eigenvalue of $T$, then $R(T - \lambda I)$ is dense in $X$.
(iv) $\lambda\in\rho(T)$ if and only if $\|(T - \lambda I)(x)\|\ge\kappa\|x\|$ for all $x\in X$ and some
$\kappa > 0$.
(v) If $m_T = \inf_{\|x\|=1}\langle T(x), x\rangle$ and $M_T = \sup_{\|x\|=1}\langle T(x), x\rangle$, then
$\max\{|m_T|, |M_T|\} = \|T\|$ and $\sigma(T)$ is a nonempty subset of the closed interval
$[m_T, M_T]$. In particular, $m_T, M_T\in\sigma(T)$.
(vi) $r_\sigma(T) = \|T\|$.

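Proof. (i) This follows from the fact that $T(x) = \lambda x$ for some $x\ne 0$ implies
\[ \lambda\|x\|^2 = \langle T(x), x\rangle\in\mathbb{R} \]
due to Theorem 5.32.
(ii) In this case, by (i) we have
\[ \lambda_1\langle x_1, x_2\rangle = \langle T(x_1), x_2\rangle = \langle x_1, T(x_2)\rangle = \lambda_2\langle x_1, x_2\rangle \]
which yields $\langle x_1, x_2\rangle = 0$ since $\lambda_1\ne\lambda_2$.
(iii) Suppose $\lambda$ is not an eigenvalue of $T$. If $z\in R(T - \lambda I)^\perp$ then
\[ 0 = \langle z, T(x) - \lambda x\rangle = \langle T(z), x\rangle - \langle\bar\lambda z, x\rangle = \langle (T - \bar\lambda I)z, x\rangle \]
for all $x\in X$. If $z\ne 0$ then this says that $\bar\lambda$ is an eigenvalue of $T$. Since
$T = T^*$, we must then have $\bar\lambda = \lambda$ by (i), so $\lambda$ is an eigenvalue of $T$, which is
a contradiction. Thus $z = 0$ and
\[ X = \overline{R(T - \lambda I)}\oplus R(T - \lambda I)^\perp = \overline{R(T - \lambda I)}\oplus\{0\} = \overline{R(T - \lambda I)}. \]
(iv) If $\lambda\in\rho(T)$ then $(T - \lambda I)^{-1}\in B(X)$ and we may set
$\kappa = \|(T - \lambda I)^{-1}\|^{-1}$. Conversely, suppose $\|(T - \lambda I)(x)\|\ge\kappa\|x\|$ for all $x\in X$ and
some $\kappa > 0$. The given inequality ensures that $N(T - \lambda I) = \{0\}$, so $\lambda$ is not
an eigenvalue of $T$. Next, $R(T - \lambda I)$ is closed. In fact, suppose
$(T - \lambda I)(x_k)\to y$. Then $\big((T - \lambda I)(x_k)\big)_{k=1}^{\infty}$ is a Cauchy sequence and hence
$(x_k)_{k=1}^{\infty}$ is a Cauchy sequence too. Since $X$ is a Hilbert space, there exists a
$z\in X$ such that $x_k\to z$. By continuity we have $(T - \lambda I)(x_k)\to(T - \lambda I)(z)$ and
therefore $y = (T - \lambda I)(z)$. Also, by (iii) above, we see that $R(T - \lambda I)$ is dense
in $X$ and so $R(T - \lambda I) = X$. Hence $T - \lambda I$ is surjective and injective, and the
given inequality shows that its inverse is bounded by $\kappa^{-1}$. Accordingly,
$(T - \lambda I)^{-1}\in B(X)$, namely, $\lambda\in\rho(T)$.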
(v) Put $M = \sup_{\|u\|=1} |\langle T(u), u\rangle| = \max\{|m_T|, |M_T|\}$, which coincides with
$M_T$ whenever $m_T\ge 0$. Clearly, $M\le\|T\|$. To prove $\|T\|\le M$, assume
$\|z\| = 1$ and $T(z)\ne 0$. If
\[ b = \|T(z)\|^{1/2}, \quad x = bz + b^{-1}T(z) \quad\text{and}\quad y = bz - b^{-1}T(z), \]
then by the parallelogram law,
\[ \|x\|^2 + \|y\|^2 = 2^{-1}(\|x + y\|^2 + \|x - y\|^2) = 2b^2 + 2b^{-2}\|T(z)\|^2 = 4\|T(z)\|. \]
Note that $|\langle T(u), u\rangle|\le M\|u\|^2$ for any $u\in X$. So, the last equation and a
routine calculation give
\[ 4\|T(z)\|^2 = \langle T(x), x\rangle - \langle T(y), y\rangle \le M(\|x\|^2 + \|y\|^2) = 4M\|T(z)\|, \]
and consequently, $\|T(z)\|\le M$; that is, $\|T\|\le M$.
We next prove that $\sigma(T)$ comprises only real numbers. In so doing, let
$\lambda = \alpha + i\beta$ where $\beta\ne 0$. Then for any $x\in X$ we have
\[ \|(T - \lambda I)(x)\|^2 = \|T(x) - \alpha x\|^2 + \beta^2\|x\|^2 \ge \beta^2\|x\|^2, \]
and so $\lambda\in\rho(T)$ by (iv). In other words, if $\lambda\in\sigma(T)$ then $\beta = 0$ and hence
$\lambda\in\mathbb{R}$.
After that, if $\lambda\in(-\infty, m_T)$ then for any $x\in X$,
\[ \|T(x) - \lambda x\|\,\|x\| \ge \langle T(x) - \lambda x, x\rangle = \langle T(x), x\rangle - \lambda\|x\|^2 \ge (m_T - \lambda)\|x\|^2 \]
and hence $\lambda\in\rho(T)$ by (iv). Similarly, if $\lambda\in(M_T, \infty)$ then $\lambda\in\rho(T)$. Thus
$\sigma(T)\subseteq[m_T, M_T]$.
Furthermore, in order to verify $M_T\in\sigma(T)$, we may assume $M_T\ge m_T\ge 0$,
and hence $M_T = \|T\|$, by replacing $T$ by $T - m_TI$. By the definition of $M_T$ there
exists a sequence $(x_k)_{k=1}^{\infty}$ with $\|x_k\| = 1$ such that $\lim_{k\to\infty}\langle T(x_k), x_k\rangle = M_T$.
Accordingly,
\[ \|T(x_k) - M_Tx_k\|^2 = \|T(x_k)\|^2 - 2M_T\langle T(x_k), x_k\rangle + M_T^2 \le M_T^2 - 2M_T\langle T(x_k), x_k\rangle + M_T^2 \to 0 \]
as $k\to\infty$. This, along with (iv), implies $M_T\in\sigma(T)$.
Finally, in order to check $m_T\in\sigma(T)$, we consider $S = -T$ and obtain
$-m_T = M_S\in\sigma(S)$ by the previous argument. Thus
\[ T - m_TI = -(S - M_SI) \]
does not have a bounded inverse, and hence $m_T\in\sigma(T)$.
(vi) By (v), we have $m_T, M_T\in\sigma(T)$ and so
\[ r_\sigma(T) \ge \max\{|m_T|, |M_T|\} = \|T\|. \]
This and the general estimate $r_\sigma(T)\le\|T\|$ yield the desired formula.


When working with a compact self-adjoint operator, we find that the struc-
ture of its spectrum is particularly simple.
Theorem 5.51. Let $X$ be a Hilbert space over $\mathbb{C}$, and let $T\in B(X)$ be
compact and self-adjoint.
(i) If $\lambda\in\sigma(T)$ is nonzero, then $\lambda$ is an eigenvalue of $T$.
(ii) If $\{e_\alpha : \alpha\in A\}$ is an orthonormal set of eigenvectors corresponding to
eigenvalues $\lambda_\alpha$ with $|\lambda_\alpha| > c > 0$, then $A$ is a finite set.
Proof. (i) By Theorem 5.50 (iv), we see that if $\lambda\in\sigma(T)$ is not equal to 0 then
there is a sequence $(x_k)_{k=1}^{\infty}$ in $X$ obeying $\|x_k\| = 1$ and $T(x_k) - \lambda x_k\to 0$. Since
$T$ is compact, there exists a subsequence $(x_{k_n})_{n=1}^{\infty}$ such that $T(x_{k_n})\to y\in X$
as $n\to\infty$. Consequently,
\[ x_{k_n} = \lambda^{-1}\big[T(x_{k_n}) - \big(T(x_{k_n}) - \lambda x_{k_n}\big)\big] \to \lambda^{-1}y \quad\text{as } n\to\infty, \]
and therefore $\lambda y = \lim_{n\to\infty}\lambda T(x_{k_n}) = T(y)$ owing to the continuity of $T$;
that is, $\lambda$ is an eigenvalue of $T$ and $y$ is a corresponding eigenvector. Moreover,
$y\ne 0$ since $\|x_k\| = 1$ and $\lambda\ne 0$ give $\|y\| = |\lambda|$.
(ii) Assume $A$ is infinite. Then there exists an infinite sequence of
orthonormal eigenvectors $(y_k)_{k=1}^{\infty}$ corresponding to eigenvalues $(\lambda_k)_{k=1}^{\infty}$ with
$|\lambda_k| > c > 0$ for all $k\in\mathbb{N}$. Note that the orthonormality of $(y_k)_{k=1}^{\infty}$ gives, for
$k\ne l$,
\[ \|T(y_k) - T(y_l)\|^2 = \|\lambda_ky_k\|^2 + \|\lambda_ly_l\|^2 = |\lambda_k|^2 + |\lambda_l|^2 > 2c^2. \]
So $\big(T(y_k)\big)_{k=1}^{\infty}$ has no convergent subsequence, which contradicts the
compactness of $T$.

To reach the main result of this section, we introduce the following termi-
nology.

Definition 5.52. Let $X$ be a Hilbert space over $\mathbb{C}$. If $\lambda$ is an eigenvalue of
$T\in B(X)$, then $N(T - \lambda I)$ is called the eigenspace associated with $\lambda$.

Remark 5.53.
(i) It is clear that $N(T - \lambda I)$ is invariant under $T$; that is, if $x\in N(T - \lambda I)$
then $T(x)\in N(T - \lambda I)$ since $T(T(x)) = T(\lambda x) = \lambda T(x)$.
(ii) If $T$ is compact and self-adjoint, then the eigenspace corresponding to each
nonzero eigenvalue must be finite dimensional, and there can be at most
finitely many eigenvalues $\lambda$ obeying $|\lambda| > c$ for a given constant $c > 0$. The
former clearly follows from Theorem 5.51 (ii). To see the latter, assume that
there were an infinite sequence of different eigenvalues $(\lambda_k)_{k=1}^{\infty}$ obeying
$|\lambda_k| > c > 0$. If $(y_k)_{k=1}^{\infty}$ is a corresponding sequence of norm one eigenvectors,
then by Theorem 5.50 (ii) we see that $(y_k)_{k=1}^{\infty}$ are orthogonal and hence
orthonormal. But $(y_k)_{k=1}^{\infty}$ cannot be infinite by Theorem 5.51 (ii). Thus, there
can be only finitely many eigenvalues obeying $|\lambda_k| > c > 0$. Consequently, the set
of nonzero eigenvalues of the operator is at most countable. Moreover, if the
operator has infinitely many eigenvalues, then they must converge to zero
since at most finitely many eigenvalues can satisfy an inequality of the form
$|\lambda| > c > 0$.

Below is the spectral representation of a compact and self-adjoint operator
on a complex Hilbert space.

Theorem 5.54. Let $X$ be a Hilbert space over $\mathbb{C}$, and let $T\in B(X)$ be
compact and self-adjoint. If $(\lambda_j)_{j=1}^{\infty}$ is an enumeration of the distinct nonzero
eigenvalues of $T$, then for each $x\in X$ one has $T(x) = \sum_{k=1}^{\infty}\lambda_kP_k(x)$, where
each $P_k$ is the orthogonal projection of $X$ onto the eigenspace $N(T - \lambda_kI)$.

Proof. From the definition of each Pk and Theorem 5.7 it follows that for
x ∈ X,
kPk (x) − xk = inf{ky − xk : y ∈ N (T − λk I)}.
If {yk,1 , · · · , yk,nk } is an orthonormal basis for N (T
P− λk I), then by Remark
nk
5.53 (ii), and Theorems 5.20 and 5.22, Pk (x) = j=1 hx, yk,j iyk,j . Because
eigenvectors
S∞ corresponding to different eigenvalues are orthogonal, the set S =
k=1 {yk,1 , · · · , yk,nk } is an orthonormal set. The vector
5.5 Compact Operators via the Spectrum 131

X ∞ X
X nk
y= Pk (x) = hx, yk,nk iyk,nk
k=1 k=1 j=1

is then well-defined since


∞ X
X nk
|hx, yk,nk i|2 ≤ kxk2
k=1 j=1

P∞ from Bessel’s inequality. Since Pk (x) ∈ N (T − λk I), we have


which follows
T (y) = k=1 λk Pk (x). Writing x = y + (x − y), we find that the proof will
be complete upon verifying x − y ∈ N (T ). To this end, let M be the smallest
closed linear subspace of X containing S. Then S is an orthonormal basis for
M and hence y = PM (x), the orthogonal projection onto M of x. From this
it turns out that

x = (I − PM )(x) + PM (x) = PM ⊥ (x) + PM (x).

Since $M$ is invariant under $T$ and $T$ is self-adjoint, $x \in M^{\perp}$ implies $\langle T(x), z\rangle = \langle x, T(z)\rangle = 0$ for $z \in M$; namely, $M^{\perp}$ is invariant under $T$. Therefore, if $U$ stands for the restriction of $T$ to $M^{\perp}$, then $U \in B(M^{\perp})$ and it is a compact self-adjoint operator on the Hilbert space $M^{\perp}$. Now, if $U \ne 0$, then it has a nonzero eigenvalue, which would also be a nonzero eigenvalue of $T$. In other words, there is some nonzero $x \in M^{\perp}$ with $x \in N(T - \lambda I) \subseteq M$ for some nonzero $\lambda \in \sigma(T)$. Noticing $M \cap M^{\perp} = \{0\}$, we get $x = 0$, a contradiction. Therefore, $U$ must be the zero operator and consequently,

$$T(x - y) = U P_{M^{\perp}}(x) = 0.$$

This completes the proof of the theorem.
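
As a finite-dimensional sanity check of Theorem 5.54, the following Python sketch compares $T(x)$ with the sum $\sum_k \lambda_k P_k(x)$ for a real symmetric $2 \times 2$ matrix. The matrix, the test vector and the helper names `apply`, `project` and `spectral_sum` are illustrative assumptions rather than part of these notes, and the eigenpairs are worked out by hand instead of being computed.

```python
import math

# Hypothetical finite-dimensional stand-in for T: a real symmetric 2x2 matrix
# acting on R^2 (every linear operator on a finite-dimensional space is compact).
T = [[2.0, 1.0],
     [1.0, 2.0]]

# Distinct eigenvalues with unit eigenvectors, found by hand:
# T(1, 1) = 3 (1, 1) and T(1, -1) = 1 (1, -1).
s = 1.0 / math.sqrt(2.0)
eigenpairs = [(3.0, (s, s)), (1.0, (s, -s))]


def apply(matrix, x):
    """Matrix-vector product T(x)."""
    return tuple(sum(row[j] * x[j] for j in range(len(x))) for row in matrix)


def project(e, x):
    """Orthogonal projection of x onto span{e} for a unit vector e: <x, e> e."""
    coeff = sum(xi * ei for xi, ei in zip(x, e))
    return tuple(coeff * ei for ei in e)


def spectral_sum(x):
    """Evaluate sum_k lambda_k P_k(x) over the listed eigenpairs."""
    total = [0.0] * len(x)
    for lam, e in eigenpairs:
        p = project(e, x)
        total = [t + lam * pi for t, pi in zip(total, p)]
    return tuple(total)


x = (0.5, -2.0)
direct = apply(T, x)            # T(x) computed from the matrix
via_spectrum = spectral_sum(x)  # T(x) computed from the eigen-expansion
```

Here both eigenspaces are one-dimensional, so each $P_k$ reduces to a single term $\langle x, e\rangle e$; the two tuples `direct` and `via_spectrum` agree up to rounding.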

Remark 5.55. The argument for Theorem 5.54 produces a linear subspace M ⊥
of X comprising elements mapped to zero by T . So, we can choose an or-
thonormal basis for M ⊥ , each element of which is an eigenvector of T with
eigenvalue 0. If these new eigenvectors are included in the sequence of eigen-
vectors constructed in Theorem 5.54, we obtain an orthonormal basis {ej } for
$X$. Since the new eigenvalues are zero, their inclusion does not affect the convergence of the sequence $(\lambda_k)_{k=1}^{\infty}$. Consequently, if $x = \sum_j c_j e_j \in X$ and $\{\lambda_j\}$ are all eigenvalues corresponding to $\{e_j\}$, then

$$T(x) = \sum_j c_j \lambda_j e_j = \sum_{k=1}^{\infty} \lambda_k P_k(x).$$

In some ways, Theorem 5.54 tells us that the self-adjoint compact operators
in a Hilbert space are similar to the symmetric matrices in Rn . Moreover, as
an interesting consequence of Theorem 5.54, we can approximate any compact
operator on a Hilbert space by the operators of finite rank.

Theorem 5.56. Let X be a Hilbert space over C. Then for each compact
operator T ∈ B(X) there is a sequence of operators of finite rank converging
to T in norm.

Proof. First, assume $X$ is separable. Then by Theorem 5.24 we see that $X$ has a countable orthonormal basis $\{e_k\}_{k=1}^{\infty}$. For each $n \in \mathbb{N}$ define the operator $P_n$ via

$$P_n(x) = \sum_{k=1}^{n} \langle x, e_k\rangle e_k, \quad x \in X.$$

From Theorem 5.19 or Remark 5.46 it follows that

$$\max\{\|P_n\|, \|I - P_n\|\} \le 1 \quad \text{for all } n \in \mathbb{N}$$

and

$$\lim_{n\to\infty} \|P_n(x) - x\| = 0 \quad \text{for each } x \in X.$$

Now, if the assertion were false, then there would be a compact operator $T \in B(X)$ and a number $\delta > 0$ such that $\|T - F\| \ge \delta$ for all operators $F$ of finite rank. Hence, for each $n \in \mathbb{N}$ there is an $x_n \in X$ such that $\|x_n\| = 1$ and $\|(T - F_n)(x_n)\| \ge \delta/2$, where the operator $F_n$ defined by

$$F_n(x) = P_n T(x) = \sum_{k=1}^{n} \langle T(x), e_k\rangle e_k, \quad x \in X,$$

is an operator of finite rank. Since $(x_n)_{n=1}^{\infty}$ is bounded and $T$ is compact, there is a subsequence $(x_{n_j})_{j=1}^{\infty}$ such that $(T(x_{n_j}))_{j=1}^{\infty}$ converges to some element $w \in X$. But, since $T - F_{n_j} = (I - P_{n_j})T$,

$$\|(T - F_{n_j})(x_{n_j})\| \le \|(I - P_{n_j})(T(x_{n_j}) - w)\| + \|(I - P_{n_j})(w)\| \le \|T(x_{n_j}) - w\| + \|(I - P_{n_j})(w)\|.$$

Now for $n_j$ big enough, we have

$$\|(I - P_{n_j})(w)\| < \delta/4 \quad \text{and} \quad \|T(x_{n_j}) - w\| < \delta/4.$$

The foregoing estimates therefore produce a contradiction, $\delta/2 \le \|(T - F_{n_j})(x_{n_j})\| < \delta/2$, which proves the assertion in the case when $X$ is separable.
Next, assume $X$ is any Hilbert space over $\mathbb{C}$. Set $S = T^*T$. Then $S$ is self-adjoint and compact. By Theorem 5.54, $S$ can be written in the form

$$S(x) = \sum_{k=1}^{\infty} \lambda_k \langle x, e_k\rangle e_k, \quad x \in X,$$

for some orthonormal sequence $(e_k)_{k=1}^{\infty}$. Suppose now that $X_0$ is the closure of the subspace of all linear combinations of elements of the form $T^n(e_k)$, $n + 1 \in \mathbb{N}$, $k \in \mathbb{N}$. Then $X_0$ is a separable closed linear subspace of $X$, and $T$ maps $X_0$ into itself. By the first part, there exists a sequence $(F_n)_{n=1}^{\infty}$ of finite rank operators on $X_0$ converging in norm to the restriction $\tilde{T}$ of $T$ to $X_0$. Now every element $x \in X$ can be decomposed into the form $x = y + z$ where $y \in X_0$ and $z \in X$ is orthogonal to $X_0$. With this, we get that $z$ is orthogonal to each $e_k$, and so $z \in N(S)$. Accordingly,

$$0 = \langle S(z), z\rangle = \langle T^*T(z), z\rangle = \|T(z)\|^2;$$

that is, $T(z) = 0$. Thus, if $G_n(x) = F_n(y)$ for $n \in \mathbb{N}$, then $G_n$ is of finite rank and $\lim_{n\to\infty} \|G_n - T\| = 0$ thanks to

$$\|G_n(x) - T(x)\| = \|F_n(y) - T(y)\| = \|F_n(y) - \tilde{T}(y)\| \le \|F_n - \tilde{T}\|\,\|x\|.$$

We are done.
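
To make the finite-rank approximation in Theorem 5.56 concrete, here is a small Python sketch for a diagonal operator with entries $1/k$ with respect to an orthonormal basis. The choice of operator, the finite model of size 1000 and the function names are assumptions made only for illustration. For a diagonal operator the operator norm is the supremum of the absolute values of its entries, so the error of the truncation $F_n = P_n T$ is the largest entry beyond index $n$.

```python
def diagonal_entries(size):
    """Diagonal entries 1/k, k = 1..size, of a hypothetical compact operator."""
    return [1.0 / k for k in range(1, size + 1)]


def truncation_error(entries, n):
    """Operator norm of T - P_n T for a diagonal T.

    T - P_n T keeps only the entries with index k > n, so its norm is the
    largest absolute value among them.
    """
    tail = entries[n:]
    return max((abs(t) for t in tail), default=0.0)


entries = diagonal_entries(1000)
ranks = (1, 10, 100)
errors = [truncation_error(entries, n) for n in ranks]
```

In this model `errors[i]` equals $1/(n+1)$ for $n$ in `ranks`, which tends to zero as the rank $n$ grows, in line with the norm convergence asserted by the theorem.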

To close this section, we use Remark 5.55 to produce the forthcoming Fredholm's alternative, which arises in the study of the following homogeneous and inhomogeneous Fredholm integral equations:

$$f(x) - \int_{[a,b]} k(x,y) f(y)\,dm_{id}(y) = 0; \qquad f(x) - \int_{[a,b]} k(x,y) f(y)\,dm_{id}(y) = g(x)$$

for given $g \in LRS^2_{id}([a,b], \mathbb{C})$ and $|k(\cdot,\cdot)|^2 \in LRS_{id}([a,b] \times [a,b])$.

Theorem 5.57. Let $X$ be a Hilbert space over $\mathbb{C}$ and let $T$ be a self-adjoint compact linear operator on $X$.
(i) If the only solution to the homogeneous equation $f - T(f) = 0$ is $f = 0$, then the inhomogeneous equation $f - T(f) = g$ has a unique solution $f$ for each $g \in X$.
(ii) If $f - T(f) = 0$ has nonzero solutions, then $f - T(f) = g$ has a solution for a given $g \in X$ only if $\langle g, f\rangle = 0$ for any solution $f$ to $f - T(f) = 0$, in which case $f - T(f) = g$ has infinitely many solutions, the difference of any two of them being a solution of $f - T(f) = 0$.

Proof. (i) From Remark 5.55 it follows that $X$ has an orthonormal basis $\{e_j\}$ comprising eigenvectors of $T$ with eigenvalues $\{\lambda_j\}$. For $g \in X$, let $g = \sum_j c_j e_j$, and then search for a solution to $f - T(f) = g$ in the form $f = \sum_j d_j e_j$. Then

$$\sum_j d_j e_j - \sum_j d_j \lambda_j e_j = \sum_j c_j e_j$$

and hence $d_j = c_j (1 - \lambda_j)^{-1}$ provided $\lambda_j \ne 1$. If $f - T(f) = 0$ has no nonzero solution, then 1 is not an eigenvalue of $T$ and thus all $d_j$ exist. This indicates that the solution to $f - T(f) = g$ must be of the form $\sum_j c_j (1 - \lambda_j)^{-1} e_j$. Of course, this shows that if $f - T(f) = g$ has a solution then it is unique. Now, it remains to check that the last series not only converges in $X$ but also produces a solution to $f - T(f) = g$.
If 1 were a cluster point of the set of eigenvalues of $T$, then $\sum_j c_j (1 - \lambda_j)^{-1} e_j$ would not generally converge, because its terms would be very large when $\lambda_j$ is close to 1. But we know from Remark 5.53 (ii) that $\lambda_j \to 0$ as $j \to \infty$, and so no subsequence of $\{\lambda_j\}$ can approach 1. This means that 1 is not a cluster point of the set $\{\lambda_j\}$ and, since no $\lambda_j$ equals 1, that $\sup_j |1 - \lambda_j|^{-2} < \infty$. Consequently,

$$\sum_j |c_j (1 - \lambda_j)^{-1}|^2 \le \sup_j |1 - \lambda_j|^{-2} \sum_j |c_j|^2 < \infty.$$

This finiteness, along with Theorem 5.22, implies that $\sum_j c_j (1 - \lambda_j)^{-1} e_j$ is convergent in $X$.

Next, let us verify that the last series is a solution to $f - T(f) = g$. Since $g \in X$ and $\{e_j\}$ is an orthonormal basis for $X$, we conclude that $\sum_j |\langle g, e_j\rangle|^2 < \infty$. Also, since $T$ is self-adjoint and compact, it follows that $\lambda = \inf_k |1 - \lambda_k| > 0$ and then

$$\sum_j |\langle g, e_j\rangle|^2 |1 - \lambda_j|^{-2} \le \lambda^{-2} \sum_j |\langle g, e_j\rangle|^2 < \infty.$$

This yields that $f = \sum_j \langle g, e_j\rangle (1 - \lambda_j)^{-1} e_j$ is convergent in $X$ thanks to Theorem 5.22. Accordingly, using $\lambda_j (1 - \lambda_j)^{-1} = (1 - \lambda_j)^{-1} - 1$,

$$T(f) = \sum_j \langle g, e_j\rangle (1 - \lambda_j)^{-1} T(e_j)$$
$$= \sum_j \langle g, e_j\rangle (1 - \lambda_j)^{-1} \lambda_j e_j$$
$$= \sum_j \langle g, e_j\rangle (1 - \lambda_j)^{-1} e_j - \sum_j \langle g, e_j\rangle e_j$$
$$= f - g.$$

(ii) Given $g \in X$, assume now that $f_1$ satisfies $f - T(f) = g$ and $f_2$ is a nonzero solution to $f - T(f) = 0$. This assumption, plus $T^* = T$, gives

$$\langle f_1, f_2\rangle = \langle T(f_1), f_2\rangle + \langle g, f_2\rangle = \langle f_1, T(f_2)\rangle + \langle g, f_2\rangle = \langle f_1, f_2\rangle + \langle g, f_2\rangle,$$

whence $\langle g, f_2\rangle = 0$. Here, since $f - T(f) = g$ implies $(f + cf_2) - T(f + cf_2) = g$ for any $f_2$ with $f_2 - T(f_2) = 0$ and $c \in \mathbb{C}$, there are infinitely many such solutions to $f - T(f) = g$. The final assertion follows simply from taking the difference between two solutions to $f - T(f) = g$.
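
The coordinatewise reasoning in the proof of Theorem 5.57 can be sketched in Python for a diagonal model in which $T(e_j) = \lambda_j e_j$ and $g = \sum_j c_j e_j$ has finitely many coefficients. The function name `solve_fredholm`, the tolerance and the sample data are hypothetical choices for illustration only.

```python
def solve_fredholm(lams, c, tol=1e-12):
    """Solve d_j - lam_j * d_j = c_j coordinate by coordinate.

    Returns the list of coefficients d_j of a solution f, or None when some
    lam_j equals 1 while c_j is nonzero (g is then not orthogonal to the
    solutions of the homogeneous equation). When lam_j = 1 and c_j = 0 the
    coordinate d_j is free; 0 is chosen here.
    """
    d = []
    for lam, cj in zip(lams, c):
        if abs(1.0 - lam) > tol:
            d.append(cj / (1.0 - lam))
        elif abs(cj) > tol:
            return None
        else:
            d.append(0.0)
    return d


# Case (i): 1 is not an eigenvalue, so the solution exists and is unique.
unique_solution = solve_fredholm([0.5, 0.25, -1.0], [1.0, 3.0, 4.0])

# Case (ii): 1 is an eigenvalue with eigenvector e_1.
no_solution = solve_fredholm([1.0, 0.5], [1.0, 1.0])   # <g, e_1> != 0
one_solution = solve_fredholm([1.0, 0.5], [0.0, 1.0])  # <g, e_1> = 0
```

In case (ii) the returned list is only one of infinitely many solutions, since any multiple of $e_1$ may be added to it.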

Clearly, the above Fredholm's alternative theorem is a simple generalization of facts on systems of linear algebraic equations: an inhomogeneous system has a unique solution when and only when the determinant of the coefficients is nonzero, i.e., the corresponding homogeneous system has no nonzero solution. In addition, the following example shows that the compactness of the operator $T$ in the Fredholm's alternative theorem is essential.

Example 5.58. On $LRS^2_{id}([0,1], \mathbb{C})$ define $T(f)(x) = (1 - x) f(x)$. Then $T$ is self-adjoint but not compact. The homogeneous equation $f - T(f) = 0$ has no solution other than $f = 0$. Nevertheless, the inhomogeneous equation $f - T(f) = g$ has no solution in $LRS^2_{id}([0,1], \mathbb{C})$ for any nonzero constant function $g \in LRS^2_{id}([0,1], \mathbb{C})$: indeed, since $f(x) - T(f)(x) = x f(x)$, the solution to $f - T(f) = g$ is then proportional to $1/x$, $x \in (0,1]$, which is certainly not in $LRS^2_{id}([0,1], \mathbb{C})$.
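
The failure of square integrability of $1/x$ in Example 5.58 can be seen numerically from the truncated integrals $\int_{\varepsilon}^{1} x^{-2}\,dx = 1/\varepsilon - 1$. The Python sketch below tabulates this closed form for a few cutoffs and cross-checks one value with a midpoint rule; the cutoffs, the step count and the function names are assumptions made for illustration.

```python
def truncated_norm_sq_exact(eps):
    """Closed form of the integral of (1/x)^2 over [eps, 1]."""
    return 1.0 / eps - 1.0


def truncated_norm_sq_midpoint(eps, steps=20000):
    """Midpoint-rule approximation of the same integral."""
    h = (1.0 - eps) / steps
    return sum(h / (eps + (i + 0.5) * h) ** 2 for i in range(steps))


cutoffs = [1e-1, 1e-2, 1e-3]
exact_values = [truncated_norm_sq_exact(eps) for eps in cutoffs]
numeric_check = truncated_norm_sq_midpoint(0.1)
```

The values in `exact_values` grow without bound as the cutoff shrinks, so no function proportional to $1/x$ has a finite 2-norm on $[0,1]$.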

Problems
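5.1. Equip $C([-1,1], \mathbb{C})$ with the 2-norm and the inner product

$$\langle f, g\rangle = \int_{-1}^{1} f(x)\overline{g(x)}\,dx.$$

Prove that $C([-1,1], \mathbb{C})$ is an inner product space but not a Hilbert space.

5.2. Equip $\ell_p(\mathbb{N}, \mathbb{C})$, $1 \le p < \infty$, with the $p$-norm. Prove that $\ell_2(\mathbb{N}, \mathbb{C})$ is a Hilbert space under the inner product

$$\langle x, y\rangle = \sum_{j=1}^{\infty} x_j \overline{y_j} \quad \text{for all } x = (x_j)_{j=1}^{\infty},\ y = (y_j)_{j=1}^{\infty} \in \ell_2(\mathbb{N}, \mathbb{C}),$$

but $\ell_p(\mathbb{N}, \mathbb{C})$ is not a Hilbert space under the $p$-norm whenever $p \ne 2$.

5.3. Let $X$ be an inner product space over $\mathbb{F}$.
(i) If $\mathbb{F} = \mathbb{R}$, prove that $x \perp y$ is equivalent to $\|x + y\|^2 = \|x\|^2 + \|y\|^2$. Is this equivalence true for $\mathbb{F} = \mathbb{C}$?
(ii) Prove that $x \perp y$ if and only if $\|x + \beta y\| \ge \|x\|$ for all $\beta \in \mathbb{F}$.

5.4. Let $X$ be a Hilbert space over $\mathbb{F}$. Prove that if $M$ is a closed linear subspace of $X$ then $(M^{\perp})^{\perp} = M$.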

5.5. Equip $LRS^2_{id}([-1,1], \mathbb{C})$ with the inner product $\langle f, g\rangle = \int_{[-1,1]} f \overline{g}\,dm_{id}$.
(i) Let

$$M = \{f \in LRS^2_{id}([-1,1], \mathbb{C}) : f(x) = 0 \text{ for all } x \in [-1, 0]\}.$$

Find $M^{\perp}$.
(ii) Let

$$M_{odd} = \{f \in LRS^2_{id}([-1,1], \mathbb{C}) : f(-x) = -f(x) \text{ for all } x \in [-1,1]\}$$

and

$$M_{even} = \{f \in LRS^2_{id}([-1,1], \mathbb{C}) : f(-x) = f(x) \text{ for all } x \in [-1,1]\}.$$

Prove $LRS^2_{id}([-1,1], \mathbb{C}) = M_{odd} \oplus M_{even}$.



5.6. Equip $LRS^2_{id}([-\pi,\pi], \mathbb{C})$ with the inner product $\langle f, g\rangle = \int_{[-\pi,\pi]} f \overline{g}\,dm_{id}$. Prove:
(i) $\{(2\pi)^{-1/2} e^{inx}\}_{n=-\infty}^{\infty}$ is an orthonormal set.
(ii) $\{(2\pi)^{-1/2},\ \pi^{-1/2}\cos t,\ \pi^{-1/2}\sin t,\ \pi^{-1/2}\cos 2t,\ \pi^{-1/2}\sin 2t,\ \dots\}$ is an orthonormal basis of $LRS^2_{id}([-\pi,\pi], \mathbb{C})$.


5.7. Let $LRS^2_{id}([-\pi,\pi], \mathbb{C})$ be as in Problem 5.6. For $\phi \in C([-\pi,\pi], \mathbb{C})$ let $T(f) = \phi f$ be defined on $LRS^2_{id}([-\pi,\pi], \mathbb{C})$.
(i) Calculate $T^*$ using the inner product defined above.
(ii) Prove that if $\phi$ is real-valued then $T$ is self-adjoint.
(iii) Find a condition on $\phi$ such that $T$ is respectively unitary, nonnegative, or a projection.

5.8. Let $X$ be a Hilbert space over $\mathbb{C}$ and $T \in B(X)$. Prove the following results:
(i) $R(T)^{\perp} = N(T^*)$, $N(T)^{\perp} = \overline{R(T^*)}$.
(ii) $R(T^*)^{\perp} = N(T)$, $N(T^*)^{\perp} = \overline{R(T)}$.

5.9. Equip $C([0,2\pi], \mathbb{C})$ with the sup-norm: $\|f\|_{\infty} = \sup_{x \in [0,2\pi]} |f(x)| < \infty$.
(i) Define $T(f)(x) = x f(x)$ for $f \in C([0,2\pi], \mathbb{C})$. Find $\sigma(T)$.
(ii) Define $T(f)(x) = e^{ix} f(x)$ for $f \in C([0,2\pi], \mathbb{C})$. Prove $\sigma(T) = \{\lambda \in \mathbb{C} : |\lambda| = 1\}$.

5.10. Equip $\ell_2(\mathbb{N}, \mathbb{C})$ with the 2-norm.
(i) If $T(x) = T(x_1, x_2, \dots) = (x_2, x_3, \dots)$, find $\sigma(T)$.
(ii) If $T(x) = T(x_1, x_2, \dots) = (0, -x_1, -x_2, -x_3, \dots)$, prove:
(a) $T$ has no eigenvalue.
(b) $\rho(T) = \{\lambda \in \mathbb{C} : |\lambda| > 1\}$.
(c) $\|(T - \lambda I)^{-1}\| = (|\lambda| - 1)^{-1}$.

5.11. Suppose that $X$ is a Hilbert space over $\mathbb{C}$, and that $T \in B(X)$. Prove $\sigma(T^*) = \{\overline{\lambda} : \lambda \in \sigma(T)\}$.

5.12. Let $(X, \|\cdot\|)$ be a Hilbert space over $\mathbb{C}$. If $T \in B(X)$ is compact and self-adjoint and $x \in X$ satisfies $\|x\| = 1$, prove the following two results:
(i) $\lambda \in \mathbb{R}$ obeys $\|T(x) - \lambda x\|^2 = \|T(x)\|^2 - \langle T(x), x\rangle^2$ when and only when $\lambda = \langle T(x), x\rangle$.
(ii) There exists a $\lambda \in \sigma(T)$ such that

$$|\lambda - \langle T(x), x\rangle| \le \sqrt{\|T(x)\|^2 - \langle T(x), x\rangle^2}.$$

5.13. Let $T(f)(x) = \int_0^1 e^{x+y} f(y)\,dy$. Find the eigenvalues and eigenvectors of $T$.

5.14. Let $X$ be a Hilbert space over $\mathbb{C}$. If $T = T_1 + iT_2$ is a normal operator on $X$, prove $\|T\|^2 = \|T_1^2 + T_2^2\|$ and $\|T^2\| = \|T\|^2$.

5.15. Let $X$ be a Hilbert space over $\mathbb{C}$, $\lambda \in \mathbb{C}$, and $T \in B(X)$ normal. Prove the following results:
(i) $\|(T^* - \overline{\lambda} I)(x)\| = \|(T - \lambda I)(x)\|$ for $x \in X$.
(ii) If $T(x) = \lambda x$ then $T^*(x) = \overline{\lambda} x$.
(iii) $r_{\sigma}(T) = \|T\|$.

5.16. Let $X$ be a Hilbert space over $\mathbb{C}$. Prove that if $T \in B(X)$ and $T^*T$ is compact then $T$ is compact.

5.17. Suppose that $[a_{mn}]$ is an infinite matrix of real numbers obeying $\sum_{m,n=1}^{\infty} a_{mn}^2 < \infty$. Let the linear operator $T$ on $\ell_2$ be defined by $T(x) = y$ where $x = (x_1, x_2, \dots)$ and $y = \left(\sum_{n=1}^{\infty} a_{1n} x_n, \sum_{n=1}^{\infty} a_{2n} x_n, \dots\right)$. Prove that $T$ is a compact operator on $\ell_2$.

5.18. Consider $LRS^2_{id}([0,1], \mathbb{C})$ on which the inner product is given by $\langle f, g\rangle = \int_{[0,1]} f \overline{g}\,dm_{id}$, and suppose $K(f)(x) = \int_{[0,1]} k(x,y) f(y)\,dy$ for $f \in LRS^2_{id}([0,1], \mathbb{C})$.
(i) If

$$k(y,x) = \sum_{k=1}^{n} g_k(x) f_k(y) \quad \text{where } f_k, g_k \in LRS^2_{id}([0,1], \mathbb{C})$$

and $\{g_k\}_{k=1}^{n}$ are linearly independent, prove that $K$ is compact.
(ii) If $\lambda$ is a nonzero eigenvalue, prove that its corresponding eigenvector has the following form:

$$f(x) = \sum_{k=1}^{n} c_k g_k(x) \quad \text{where } c_k \text{ is constant};$$

moreover, if $h_{k,j} = \int_{[0,1]} f_k g_j\,dm_{id}$ then $c_k = \lambda^{-1} \sum_{j=1}^{n} h_{j,k} c_j$, $k = 1, 2, \dots, n$.
(iii) If $f_k = g_k$ and $\langle f_j, g_k\rangle = 0$ for $j \ne k$, evaluate eigenvalues and eigenvectors of $K$.
(iv) If $k(y,x) = \cos\frac{y+x}{\pi}$ for $x, y \in [0,1]$, evaluate eigenvalues and eigenvectors of $K$.

5.19. Suppose $T$ is a compact operator on an infinite-dimensional Hilbert space $(X, \|\cdot\|)$ over $\mathbb{C}$. If $\{e_j\}_{j=1}^{\infty}$ is an orthonormal basis of $X$, prove $\lim_{j\to\infty} \|T(e_j)\| = 0$.

5.20. Let $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ be Hilbert spaces over $\mathbb{C}$. A bounded linear operator $T : X \to Y$ is called a Hilbert-Schmidt (Erhard Schmidt) operator provided there exists an orthonormal basis $\{e_j\}_{j=1}^{\infty}$ in $X$ such that $\sum_{j=1}^{\infty} \|T(e_j)\|_Y^2 < \infty$.
(i) If $T^* : Y \to X$ is defined by

$$\langle T(x), y\rangle_Y = \langle x, T^*(y)\rangle_X, \quad x \in X,\ y \in Y,$$

where $\langle\cdot,\cdot\rangle_X$ and $\langle\cdot,\cdot\rangle_Y$ stand for the inner products on $X$ and $Y$ respectively, prove that $T$ is a Hilbert-Schmidt operator when and only when $T^*$ is a Hilbert-Schmidt operator.

(ii) Prove that any Hilbert-Schmidt operator is compact.
(iii) If $k(\cdot,\cdot) : (a,b) \times (c,d) \to \mathbb{C}$ satisfies $\int_{(a,b)} \int_{(c,d)} |k(x,y)|^2\,dm_{id}(y)\,dm_{id}(x) < \infty$ where $(a,b) \times (c,d) \subseteq \mathbb{R}^2$, prove that the integral operator

$$T(f)(x) = \int_{(c,d)} k(x,y) f(y)\,dm_{id}(y), \quad x \in (a,b),$$

is a Hilbert-Schmidt operator from $LRS^2_{id}((c,d), \mathbb{C})$ to $LRS^2_{id}((a,b), \mathbb{C})$ and therefore compact.

5.21. Let $X$ and $Y$ be Hilbert spaces over $\mathbb{C}$. Suppose $\{e_j\}_{j=1}^{\infty}$ and $\{f_k\}_{k=1}^{\infty}$ are orthonormal bases of $X$ and $Y$ respectively. For a sequence $(c_k)_{k=1}^{\infty}$ in $\mathbb{C}$ define a linear operator $T : X \to Y$ by

$$T(x) = \sum_{k=1}^{\infty} c_k \langle x, e_k\rangle_X f_k, \quad x \in X,$$

where $\langle\cdot,\cdot\rangle_X$ and $\langle\cdot,\cdot\rangle_Y$ stand for the inner products on $X$ and $Y$ respectively. Prove the following results:
(i) $T$ is a bounded operator when and only when $\sup_{k \in \mathbb{N}} |c_k| < \infty$.
(ii) $T$ is a compact operator when and only when $\lim_{k\to\infty} |c_k| = 0$.
(iii) $T$ is a Hilbert-Schmidt operator when and only when $\sum_{k=1}^{\infty} |c_k|^2 < \infty$.
(iv) $T$ is an operator of finite rank when and only when there is an $N \in \mathbb{N}$ such that $|c_n| = 0$ for $n > N$.
