
B4.1: FUNCTIONAL ANALYSIS I
Michaelmas Term 2017

H.A. Priestley

This file contains the full set of webnotes for the course, with a contents
list, by subsections, at the end.

With acknowledgements to the lecturers who have previously shaped this course and who produced online notes for it, on which the present notes have drawn.

Functional Analysis I studies normed spaces in general and complete normed spaces (called Banach spaces) in particular. Such spaces—principally infinite-dimensional ones—form the backbone of a theory that underpins much of applied analysis as well as being worthy of study in its own right.
The importance of normed spaces, in analysis and elsewhere in mathematics and its
applications, is recognised by their introduction in the Part A Metric Spaces course.
The Part B course will assume knowledge of the basic material from Part A. A summary
is provided in Section 0 of these notes. The numbering is significant: Section 0 will not
be covered in lectures.


0. Preliminaries and first examples, from Part A

In which we set out the definition and simple properties of a norm, renew
acquaintance with some familiar normed spaces, and review basic topolog-
ical notions as these specialise to normed spaces.

0.1. Definitions: normed space, equivalent norms.


Let X be a vector space over F, where F = R or C. A norm on X is a function
x 7→ kxk from X to [0, ∞) which satisfies, for all x, y ∈ X and all λ ∈ F,

(N1) kxk > 0 with equality if and only if x = 0;


(N2) kλxk = |λ|kxk;
(N3) kx + yk 6 kxk + kyk.

We then say that (X, k · k) is a normed space. Where no ambiguity would result we
simply say X is a normed space. On the other hand we adopt the notation kxkX when
we need to make the domain explicit.
Note that the restriction to real or complex scalars is needed in (N2). Later, when
we work with two normed spaces at the same time—for example when considering maps
from one space to another—we tacitly assume that F is the same for both.

We say that two norms, k · k and k · k′ , on a vector space X are equivalent if there
exist constants m, M > 0 such that, for all x,
mkxk 6 kxk′ 6 M kxk.

0.2. Basic properties of a norm and introductory examples.


There’s a (real or complex) vector space underlying every normed space X. So don’t
forget what you’ve learned in Linear Algebra courses in Prelims and Part A. It’ll be the
basics that come in handy in FA-I: the Subspace Test rather than Jordan Normal Form.
Obviously, R becomes a normed space when we take F = R and kxk = |x|, and
likewise for C, with F = C. Indeed the norm conditions (N1) (non-degeneracy), (N2)
(homogeneity) and (N3) (Triangle Inequality) mimic basic properties of modulus on R
and on C.
Some basic properties of modulus carry over to a general normed space (X, k · k) with
the same proofs. We highlight

(i) kx1 + · · · + xm k 6 kx1 k + · · · + kxm k for any m = 3, 4, . . . and any x1 , . . . , xm ∈ X;


(ii) kx + yk > |kxk − kyk| for all x, y (reverse triangle inequality).
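By way of illustration, (ii) is a two-line consequence of (N2) and (N3): kxk = k(x + y) + (−y)k 6 kx + yk + kyk, so kxk − kyk 6 kx + yk; interchanging the roles of x and y gives kyk − kxk 6 kx + yk, and together these yield |kxk − kyk| 6 kx + yk.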
Care is needed in manipulating inequalities involving norms, just as it is in handling inequalities involving modulus.
In Section 1 we give a full discussion of what’s involved in verifying the norm prop-
erties, working with a wide range of examples.

Those real or complex vector spaces which are equipped with an inner product carry
a natural norm.

0.3. Proposition (the norm on an inner product space).


Let h · , · i be an inner product on a vector space X (over R or C). Then kxk = (hx, xi)^{1/2} defines a norm on X.

Proof. Note that k · k is well-defined because hx, xi > 0 for all x. (N1) and (N2) follow
directly from properties of the inner product. For (N3) we call on the Cauchy–Schwarz
inequality:
|hx, yi| 6 kxk kyk.
This gives
kx + yk2 = hx + y, x + yi
= kxk2 + hx, yi + hy, xi + kyk2
= kxk2 + 2Rehx, yi + kyk2
6 kxk2 + 2kxk kyk + kyk2
= (kxk + kyk)2 .
Since the norm is non-negative, kx + yk 6 kxk + kyk follows. 

0.4. Subspaces.
Part of the standard machinery of vector space theory involves the ability to form
subspaces. We can consider a norm as an add-on to this general framework.
Given a normed space (X, k · kX ) and a subspace Y of the vector space X, we can
form a new normed space (Y, k · kY ) by defining kykY = kykX for all y ∈ Y .

0.5. Norms on the finite-dimensional spaces Fm . We can define the following norms
on Fm , for F as R or C and m > 1: for x = (x1 , . . . , xm ) ∈ Fm ,
kxk2 = ( Σ_{j=1}^{m} |xj|^2 )^{1/2}   (Euclidean norm),
kxk1 = Σ_{j=1}^{m} |xj| ,
kxk∞ = max_{j=1,...,m} |xj| .

Here the Euclidean norm comes from the standard inner product (the scalar product)
and Proposition 0.3 confirms that it is indeed a norm. Verification that k · k1 and k · k∞
satisfy (N1), (N2) and (N3) is straightforward, using properties of the real and complex
numbers.
These norms on Fm are related as follows:
kxk2 6 kxk1 6 √m kxk2   and   (1/√m) kxk2 6 kxk∞ 6 kxk2 .
These inequalities tell us that any two of these norms are equivalent. In Section 1 we
extend the definitions to define a norm k · kp for any p with 1 6 p < ∞.
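For instance, taking x = (1, 1, . . . , 1) ∈ Fm gives kxk1 = m, kxk2 = √m and kxk∞ = 1, so the constants √m and 1/√m in these inequalities are best possible.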

0.6. Further normed spaces encountered in Part A metric spaces. The following
appeared briefly as examples, with real scalars:
(1) The spaces `1 , `2 and `∞ . These are infinite-dimensional analogues of the finite-
dimensional spaces with the analogous norms. Issues of convergence come into play
here; see Section 1 for details.
(2) Function spaces
(i) Bounded real-valued functions on any set Ω with supremum norm: kf k∞ :=
sup{ |f (x)| | x ∈ Ω }. Boundedness ensures that kf k∞ is finite. Notation in
FA-I is F b (Ω).

(ii) Real-valued continuous functions on a compact set K with the supremum norm.
Here boundedness is guaranteed. Notation: C(K) or CR (K). In particular, K
can be any closed bounded interval in R.
(3) Continuous functions on a closed bounded interval with L1 or L2 norm.

These are just tasters. A more comprehensive catalogue of examples is given in Section 1.

0.7. The metric associated with a norm.


Let (X, k · k) be a normed space. Then d : X × X → [0, ∞) defined by
d(x, y) = kx − yk
is a metric on X, so kx − yk measures the distance between x and y.
All the standard notions associated with a metric space are available in any normed
space: open sets, closed sets, closure, and so on.
Suppose X is a normed space with respect to two different norms, k · k and k · k′ .
Then the norms are equivalent if and only if they give rise to the same open sets. This
easy exercise is an extension of results proved in Part A.
Convergence of a sequence (xn ) in a normed space X is defined in the expected way.
We say (xn ) converges if there exists x ∈ X such that kxn − xk → 0. In this situation
we write xn → x. Note that convergence with respect to the supremum norm on a space
of functions (as in 0.6) is uniform convergence.
Both closedness and continuity can, as in any metric space, be conveniently captured
via sequences.
(i) A set S is closed iff, for any sequence (xn ),
xn ∈ S for all n and xn → x =⇒ x ∈ S.
(ii) Given a non-empty subset S of X, a point x belongs to the closure of S iff there
exists a sequence (xn ) in S such that xn → x.
(iii) A real- or complex-valued function f on X is continuous iff for all sequences (xn ),
xn → x =⇒ f (xn ) → f (x).
[Our reason for including (ii) here is to make it clear that there is no need to involve limit
points, and no advantage in doing so.]

0.8. Proposition. Let X be a normed space. Then the following maps are continuous:

(i) x 7→ kxk, from X to [0, ∞);


(ii) (x, y) 7→ x + y, from X × X to X;
(iii) (λ, x) 7→ λx, from F × X to X.

[Here the norm on X × X can be taken to be that given by (x, y) 7→ kxk + kyk, or any
norm equivalent to this.]

Proof. See Problem sheet 0.

0.9. Corollary (closure of a subspace). Let Y be a subspace of a normed space X. Then the closure of Y is also a subspace.

0.10. Advance notice: continuity of a linear map.


It is natural in the context of normed spaces that we should consider maps between
such spaces which are both linear and continuous. In Metric Spaces it was shown that
these have a characterisation in terms of a condition of boundedness. The notion of a
bounded linear operator and in particular of a bounded linear functional will be
of major importance in FA-I (analysed in detail in Section 3 in the 2017 notes).

0.11. Banach space: definition.


Recall that a metric space (X, d) is complete if every Cauchy sequence converges to
an element of X. This notion applies in particular when the metric comes from a norm.
A sequence (xn ) in a normed space (X, k · k) is a Cauchy sequence if
∀ε > 0 ∃N ∈ N ∀m, n > N kxn − xm k < ε.
A normed space (X, k · k) is a Banach space if it is complete, that is, every Cauchy
sequence converges.
The following elementary result is very useful.

0.12. Proposition (closed subspaces of Banach spaces). Let (X, k·kX ) be a normed
space and Y a subspace of X. Then
(i) If (Y, k · kX ) is a Banach space then Y is closed in X.
(ii) If (X, k · kX ) is a Banach space and Y is closed, then (Y, k · kX ) is a Banach space.

Proof. This is just a definition-chase with sequences.


Consider (i). Suppose Y is a Banach space and that yn → x ∈ X, where (yn ) is a
sequence in Y . Any convergent sequence is Cauchy and hence there exists y ∈ Y such
that yn → y. By uniqueness of limits, x = y, so x ∈ Y and Y is closed.
Consider (ii). Suppose Y is closed and that (yn ) is a Cauchy sequence in Y . Then
(yn ) is also a Cauchy sequence in the Banach space X. Hence there exists x ∈ X such
that kyn − xkX → 0. Since Y is closed, x ∈ Y . 

We record the following Prelims result as a theorem. It underpins mathematical analysis at a deep level, and its implications are far-reaching. More parochially, we shall draw on it when showing that many normed spaces in our catalogue of examples here and in Section 1 are also Banach spaces.

0.13. Theorem (Cauchy Convergence Principle). Each of (R, | · |) and (C, | · |) is a Banach space.

B4.1 FUNCTIONAL ANALYSIS I: webnotes for Sections 1 and 2

1. Normed spaces, Banach spaces and Hilbert spaces

In which we assemble our full cast of characters and start to get to know
them.

Introductory course overview.

Context.
All the spaces we shall consider will be real or complex normed spaces. Many of
these will be important in analysis, both pure and applied. Infinite-dimensional spaces
will predominate, with spaces of functions as primary examples. In rare but important
cases the norm will come from an inner product.
Functional analysis as we study it involves vector spaces with additional structure
(a norm function). Thus linearity is always present and all the maps we consider will
be linear maps. This constrains the potential areas of application: mathematical physics
yes; non-linear systems, no.

The role of a norm.


A norm on a real or complex vector space provides a special kind of metric, one which
is compatible with the linear structure. In particular addition and scalar multiplication
are continuous maps; see 0.8.
A norm provides a measure of distance. Different norms on the same vector space lead
to different ways of measuring ‘nearness’, and the choice can be tailored to an intended
application. So if we’re dealing with, say, a space of differentiable functions we may want
to use a norm which involves a function’s derivative as well as the function itself; see 1.9.
As in ε-δ analysis the presence of a norm allows the study of approximations, with a
measure of how good these are—an aspect of numerical analysis. But approximations are
also a valuable theoretical tool: often any element of a normed space can be recognised
as a limit (in norm) of elements drawn from some special subset. Here’s a sample of such
density results:
• the classic Weierstrass’s Theorem, asserting that a continuous real-valued func-
tion on a closed bounded interval can be uniformly approximated by polynomials;
• the approximation of integrable functions by simple functions or by step functions,
where the norm we use is derived from the integral.
The course contains numerous results of this general type, for a variety of spaces.
Many are theorems of interest in their own right, others are principally useful in con-
structing proofs.
As Section 0 confirmed, we are able to make use of topological ideas, in a setting
more restricted than that of general metric spaces. These ideas include open and closed
sets; continuity; and completeness.

Many of the spaces we meet are complete, that is, Cauchy sequences converge. A
complete normed space is a Banach space. A complete normed space whose norm
comes from an inner product is called a Hilbert space. The latter spaces have special
properties with a geometric flavour: think of them as behaving like Euclidean spaces.

How prevalent is completeness?


1. It turns out that ALL finite-dimensional normed spaces are complete, and all norms
on a finite-dimensional space are equivalent. See Section 4, in which we fit finite-
dimensional spaces into our overall theory.
2. Very many of our key infinite-dimensional examples are Banach spaces. Nevertheless
normed spaces which are not complete are important in certain areas of application,
notably PDE’s. A source of these is non-closed subspaces of familiar Banach spaces.
Only hints of this appear in FA-I.
3. The only concrete Hilbert spaces we see in FA-I are Fm with the Euclidean norm, the
sequence space `2 and L2 -spaces (the last with a light touch). (An in-depth study
of Hilbert spaces, with further examples, forms a major part of FA-II.)

The role of completeness in the theory


Completeness does not feature strongly in the general theorems in FA-I. In par-
ticular the centrepiece Hahn–Banach Theorem is a theorem about normed spaces.
Completeness is however not far away. The HBT concerns the dual space of a normed
space X—defined as the space of continuous linear functionals on X—and this dual space
is a Banach space. The HBT has multiple consequences: for example it allows us to anal-
yse a normed space X by taking ‘snapshots’: we look at how elements of its dual space
act on it. Density will be a recurring theme in the course and the HBT leads to valuable
density results.
Continuous linear maps feature prominently. Spectral theory is more complicated
and more subtle in infinite-dimensional spaces than in the finite-dimensional ones with
which Prelims LAII and Part A Linear Algebra worked. No Rank–Nullity Theorem here!
Spectral theory in FA-I is explored in the context of Banach spaces (at the end of the
course).
At the end of Section 2 we give an informal introduction to the special properties
Hilbert spaces possess, thanks to their being Banach spaces with their norm coming from
an inner product.
FA-I does not cover deep theory of Banach spaces. FA-II does venture into this. Any
Banach space is a complete metric space, and such spaces have a rich theory thanks to the
Baire Category Theorem. This has major applications to Banach spaces in general,
with a clutch of Big Theorems, and to Hilbert spaces in particular, and FA-II includes an
introduction to these. Moreover, many deep theorems in classical analysis stem from the
BCT: the rich supply of continuous functions which are nowhere differentiable (it’s most
of them!); the existence of functions whose Fourier series behave atrociously in respect of
pointwise convergence, . . . .

How reliable is your intuition?


You live in a Hilbert space world, R3 , and draw sketch diagrams in the Hilbert space
R2 . You are accustomed to doing linear algebra in finite-dimensional spaces, backed up
by the use of bases and by the Rank–Nullity Theorem. On the analysis side, the Heine–
Borel Theorem characterises compact sets as those which are closed and bounded, and
continuous real-valued functions on compact sets are bounded and attain their bounds.
In working with finite-dimensional spaces from an algebraic perspective, continuity
and closure don’t come into play. The theory of infinite-dimensional spaces is richer and
more varied. We’d expect to include some topological assumptions: continuity of our
linear maps; subspaces maybe needing to be closed. But the following are warnings that
things may not always go smoothly.

(a) For a continuous linear map T : X → Y , the kernel ker T is closed but the image Im T
need not be. This explains why certain results in the finite-dimensional theory of dual
spaces and dual maps require dimension arguments. Infinite-dimensional analogues
are likely to bring in the closure of Im T .
In a Banach space setting useful sufficient conditions for Im T to be closed are
available.
(b) Minimum distances need not be attained. Given a metric space (X, d) and non-empty
closed subset S ⊆ X we can define the distance from x to S by
dist(x, S) := inf{ d(x, s) | s ∈ S }.
It is an easy (Part A) topology exercise to show x 7→ dist(x, S) is continuous. But
even when d is the metric coming from a norm on a Banach space and S is a closed
subspace, there may not be any s0 ∈ S such that d(x, s0 ) = dist(x, S).
Replace ‘Banach’ by ‘Hilbert’ here and the result is true.
(c) Suppose we have a vector space direct sum X = Y ⊕ Z and consider the projection
map PY : y + z 7→ y (y ∈ Y , z ∈ Z). Any picture you are likely to draw will suggest
that kPY (x)k 6 kxk, whence PY is continuous. Don’t be fooled!
Projection maps on general normed spaces need not be continuous (example on a
problem sheet). Continuity is ensured if X is a Banach space and Y, Z closed (this
fact rests on the BCT). Things work better still in a Hilbert space, for orthogonal
projections.
Here (a), (b) and (c) reflect different extents to which the theory and practice of func-
tional analysis in normed spaces is different from what you’ve seen hitherto. Divergence
from the familiar is greatest in general normed spaces, less so in Banach spaces (though
proofs may require hard work), and least in Hilbert spaces.

Now we embark in earnest on our study of normed spaces.

1.1. Notes on verifying the norm properties.


Suppose we have a vector space X over F, where F = R or C, equipped with some
function x 7→ kxk which we wish to show is a norm.
We have to confirm that kxk is finite for every x ∈ X (let’s call this property (N0))
and that, for all x, y ∈ X and all λ ∈ F, the following hold:
(N1) kxk > 0 with equality if and only if x = 0;
(N2) kλxk = |λ|kxk;
(N3) kx + yk 6 kxk + kyk.
For starters, we need to know X really is a vector space. Virtually always X will be, or can be identified with, a set of F-valued functions on some set Ω, and the addition and scalar multiplication will be given pointwise. Here the domain Ω doesn’t need to have any structure—just a set will do. The set FΩ of all functions from Ω to F is a vector space for pointwise addition and scalar multiplication (from Prelims). So X is a vector space so long as it’s non-empty and closed under pointwise addition and scalar multiplication.
Now consider the norm conditions. Note that (N0), finiteness of kxk, may need
checking. Where a space is specified as a subspace of a familiar vector space by looking
at those elements for which the candidate norm is finite, then convergence questions may
arise in the application of the Subspace Test; see 1.4 for an example.
The issue with (N1) is whether kxk = 0 implies x = 0. This may not be obvious (see
Example 1.11(a) below) or even true (see Subsection 1.2).

Typically, (N2) is obvious or is boringly routine to check. One general observation can sometimes reduce the work. Assume kµxk 6 |µ|kxk holds for any µ ∈ F. Let λ 6= 0
and combine the inequalities we obtain by taking first µ = λ and then µ = 1/λ (some
vector space axioms come into play here!). This gives kλxk = |λ|kxk for λ 6= 0; (N1)
covers the missing case λ = 0.
The norm property which is most often non-trivial to check is the Triangle Inequality
(N3). Note however the proof of (N3) that applies to the norm on an inner product space:
the Cauchy–Schwarz inequality does the job, for all IPS’s at once (recall the proof in 0.3).

1.2. Quotients and seminorms.


In the context of finite-dimensional vector spaces quotient spaces (as studied in Part A
LA) are important as a tool for using induction on dimension to prove major results
(triangularisation, Cayley–Hamilton Theorem, Spectral Theorem, for example). In this
course we rarely encounter quotients of existing normed spaces.

When (N2) and (N3) hold but the ‘only if’ part of (N1) fails—that is, kxk = 0 need not force x = 0—we say that k · k is a seminorm.
The game to play here is to pass to a quotient vector space to convert to a normed space.
The process is set out in Problem sheet Q. 1. For concrete examples see 1.11(b) and (c).
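In outline (a sketch of the standard construction; the problem sheet sets out the details): given a seminorm k · k′ on a vector space X, the set N = { x ∈ X | kxk′ = 0 } is a subspace, by (N2) and (N3). On the quotient space X/N the recipe kx + N k := kxk′ is well defined, since x + N = x′ + N forces kxk′ 6 kx′k′ + kx − x′k′ = kx′k′ and symmetrically; and it satisfies (N1) because kx + N k = 0 forces x ∈ N , that is, x + N is the zero element of X/N .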

We now bring on stage a full cast of characters for the FA-I course.

1.3. More examples of norms on finite-dimensional spaces.


We may define a norm on Fm for any p ∈ [1, ∞) by
k(x1 , . . . , xm )kp = ( Σ_{j=1}^{m} |xj|^p )^{1/p} .

To confirm that the Triangle Inequality holds we can start from an inequality due to
Hölder which reduces to the Cauchy–Schwarz inequality for Fm when p = 2. It states
that for x = (xj ), y = (yj ) ∈ Fm and q such that 1/p + 1/q = 1,
| Σ_{j=1}^{m} xj yj | 6 ( Σ_{j=1}^{m} |xj|^p )^{1/p} ( Σ_{j=1}^{m} |yj|^q )^{1/q} .

An optional exercise on Problem sheet 1 outlines a proof of this. From here one can go on to derive the Triangle Inequality for the p-norm on Fm (a version of Minkowski’s Inequality); this is well covered in textbooks.
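In outline (a sketch only, for 1 < p < ∞ and assuming kx + ykp > 0, the case kx + ykp = 0 being trivial): writing Σ for Σ_{j=1}^{m},
Σ |xj + yj|^p 6 Σ |xj| |xj + yj|^{p−1} + Σ |yj| |xj + yj|^{p−1} 6 ( kxkp + kykp ) ( Σ |xj + yj|^{(p−1)q} )^{1/q} ,
by two applications of Hölder; since (p − 1)q = p, dividing through by ( Σ |xj + yj|^p )^{1/q} gives kx + ykp 6 kxkp + kykp .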

1.4. Sequence spaces.


All the spaces (Fm , k · kp ) (1 6 p 6 ∞) have infinite-dimensional analogues.
We first note that the set, denoted FN , of infinite sequences (xj )j>1 , with coordinates
xj ∈ F, forms a vector space under the usual coordinatewise addition and scalar multi-
plication. (Here we tacitly assume that N = {1, 2, . . .}; we shan’t make a big deal of
whether sequences should start with x1 or x0 and shall use whichever is appropriate in a
given case.) Any subset of FN which satisfies the conditions of the Subspace Test is also
a vector space.
We define
`p = { (xj ) | Σ |xj|^p converges },   k(xj )kp = ( Σ_{j=1}^{∞} |xj|^p )^{1/p}   (1 6 p < ∞)

and
`∞ = { (xj ) | (xj ) is bounded }, k(xj )k∞ = sup |xj |.

Very easy (AOL) arguments confirm that `∞ is a normed space. So now assume
1 6 p < ∞. We claim firstly that each `p is a vector space and secondly that k · kp makes
it into a normed space. The required arguments can be interwoven. This allows us to be
both efficient and rigorous.
By way of illustration, we consider `1 . We set out the (Prelims-level) proof in some detail to show how to avoid being sloppy. [In particular we never write down an infinite sum Σ_{j=1}^{∞} aj before we know that Σ aj converges.]
Let (xj ) and (yj ) be such that Σ |xj | and Σ |yj | converge. By the triangle inequality
in F,
|xj + yj | 6 |xj | + |yj | for all j.
Hence, for all n,
sn := Σ_{j=1}^{n} |xj + yj | 6 Σ_{j=1}^{n} |xj | + Σ_{j=1}^{n} |yj | 6 k(xj )k1 + k(yj )k1 .
By the Monotonic Sequences Theorem applied to (sn ), the series Σ |xj + yj | converges
and moreover
k(xj ) + (yj )k1 = k(xj + yj )k1 6 k(xj )k1 + k(yj )k1 .

Special case: `2 , as an IPS and as a normed space.


You already know that Fm carries the usual Euclidean inner product (alias scalar or
dot product) and that this provides an associated norm. No convergence issues here.
For `2 , convergence of the infinite sums which arise does have to be addressed. Prob-
ably the quickest route is to show that
((xj ), (yj )) 7→ Σ_j xj ȳj   for (xj ), (yj ) ∈ `2

is a well-defined inner product, and then to appeal to Proposition 0.3 for the norm
properties. To this end, let x = (xj ) and y = (yj ) be in `2 . Then for any n > 1, using the
CS inequality in the IPS Fn ,
Σ_{j=1}^{n} |xj yj | 6 ( Σ_{j=1}^{n} |xj|^2 )^{1/2} · ( Σ_{j=1}^{n} |yj|^2 )^{1/2} 6 kxk · kyk ;
We deduce that Σ |xj yj | converges. The inner product properties follow easily from
those in the finite-dimensional case using arguments similar to those used to prove `1 is
a normed space.
Remarks on the sequence spaces and their norms.
All the `p norms are available on any finite-dimensional space, and all are equivalent.
The situation is more complicated for the infinite-dimensional sequence spaces `p . See
Problem sheet Q. 5(i).
For the sequence spaces `p , the choice p = 2, and no other, gives a norm coming from an inner product: the parallelogram law fails for all p 6= 2.
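By way of illustration, take x = e1 = (1, 0, 0, . . .) and y = e2 = (0, 1, 0, . . .) in `p with p < ∞: then kx + yk_p^2 + kx − yk_p^2 = 2 · 2^{2/p} while 2kxk_p^2 + 2kyk_p^2 = 4, and these agree only when p = 2. (In `∞ the left-hand side is 2 and the right-hand side 4.)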

1.5. Products of normed spaces.


We can form the product of two vector spaces, by defining addition and scalar multi-
plication coordinatewise on the cartesian product of the underlying sets. Given normed
spaces (X1 , k · kX1 ) and (X2 , k · kX2 ), we can define various norms on X1 × X2 :
(x1 , x2 ) 7→ kx1 kX1 + kx2 kX2 or (x1 , x2 ) 7→ max{kx1 kX1 , kx2 kX2 },
for example. These mirror the 1-norm and ∞-norm on F2 (the case X1 = X2 = F). These
two norms are equivalent, and either can be employed as convenient (or other choices,
likewise).

1.6. Subspaces of sequence spaces.


Further examples of normed spaces arise as subspaces of familiar normed spaces. For
example, the following are subspaces of (`∞ , k · k∞ ) (bounded sequences):
• c: convergent sequences;
• c0 : sequences which converge to 0;
• c00 : sequences (xj ) such that xj = 0 for all but finitely many j.
Clearly c00 ⊊ c0 ⊊ c ⊊ `∞ . Proofs that c, c0 and c00 are subspaces of `∞ : use (AOL) from
Prelims Analysis.

1.7. Example: sum of subspaces.


Here we present a cautionary tale: the vector space sum of two closed subspaces of a
normed space need not be closed. There are various examples of this phenomenon, but
all work in essentially the same way.
The example we give here lives in `2 × `1 with k((xj ), (yj ))k = k(xj )k2 + k(yj )k1 . Note
that `1 ⊆ `2 (why?) and that (1/j) ∈ `2 \ `1 . Let Y and Z be the subspaces given by
Y = { (0, y) | y ∈ `1 } and Z = { (x, x) | x ∈ `1 }.
Consider the sequence ζn = (1, 1/2, 1/3, . . . , 1/n, 0, 0, . . .). Then (0, −ζn ) ∈ Y and
(ζn , ζn ) ∈ Z. This implies that (ζn , 0) ∈ Y + Z. But ζn → (1, 1/2, 1/3, . . .) in `2 , so
(ζn , 0) converges in `2 × `1 , and its limit ((1/j), 0) cannot belong to Y + Z, since its
first coordinate is not in `1 . However Y and Z are closed.
[For a variation on the same theme see Problem sheet Q. 6.]

1.8. Function spaces with the supremum norm.


Frequent players on the Analysis stage: continuous functions, differentiable functions,
infinitely differentiable functions, integrable functions, polynomials, . . ., with appropriate
domains, for example R, a closed bounded interval [a, b], C. Without explicitly think-
ing about it, we perform vector space operations on classes of real- or complex-valued
functions pointwise. Prelims Analysis confirms that, for example, C[0, 1] is a vector
space and that the solutions of y′′ + y = 0 form a subspace of the space of all real-valued
functions y on R.
Suppose X is a vector space of bounded real- or complex-valued functions on some
set Ω. Then
kf k∞ = sup{ |f (t)| | t ∈ Ω }
defines a norm on X, known as the sup norm. We need to assume the functions are
bounded to guarantee that kf k∞ is finite.
The following are important examples of spaces of F-valued functions (with F as R
or C) which can carry the supremum norm:
• F b (Ω): all bounded functions on a set Ω; the space `∞ is the case that Ω = N.

• C[a, b], the continuous functions on a closed bounded interval [a, b]. More generally
we can take C(Ω), for Ω any compact space. Here boundedness is guaranteed.
• C b (R), bounded continuous functions on R.

1.9. Example: norms on spaces of differentiable functions.


The differentiable functions, or infinitely differentiable functions, on [0, 1] form sub-
spaces of C[0, 1] (1-sided derivatives at the endpoints, as usual). These, and other ex-
amples in the same vein, are normed spaces for the sup norm. But it may be more
appropriate, particularly in the context of approximations, to use a norm which better
reflects the nature of the functions we would like to study. For example, C 1 [a, b] denotes
the space of continuously differentiable functions on [a, b] with norm given by
kf k = kf k∞ + kf ′k∞ .
Certainly kf k∞ 6 kf k, but the norms are not equivalent (recall Metric Spaces, sheet 1,
Q. 4).
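By way of illustration (taking [a, b] = [0, 1]), the functions fn (t) = sin(nπt) satisfy kfn k∞ = 1 for all n > 1 while kfn k = kfn k∞ + kf ′n k∞ = 1 + nπ, so there is no constant M with kf k 6 M kf k∞ for all f ∈ C 1 [0, 1].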

1.10. Example: Lipschitz functions.


See Problem sheet Q. 4. A case where some work is involved to show that the functions
form a vector space and that the candidate norm really is a norm.

1.11. Norms on spaces of integrable functions.


When dealing with functions on R or on subintervals of R (or on some general measure
space) we often want to allow some averaging when measuring how close together two
functions are. Using integrals to measure distance can capture this.
(a) We start with a simple example, from Prelims and picked up again in Part A Metric
Spaces. Consider C[0, 1] and define
kf k1 = ∫_0^1 |f (t)| dt .

It follows from elementary properties of integrals that kf k1 is finite and > 0, and
that (N2) and (N3) hold. It is a Prelims result, too, that
∫_0^1 |f (t)| dt = 0 =⇒ f ≡ 0

(this relies on continuity of f : argue by contradiction, recalling that |f (c)| > 0 forces
|f (t)| > 0 in some interval [0, 1] ∩ (c − δ, c + δ) . . .). A detailed proof can be found
in Metric Spaces notes. Hence (C[0, 1], k · k1 ) is a normed space.

(b) The Lebesgue spaces [presupposing Part A Integration]


Consider
𝓛1 (R) = { f : R → R | f is measurable and ∫_R |f | < ∞ }.
This is a vector space under the usual pointwise operations. The map f 7→ ∫ |f | cannot be a norm: ∫ |f | = 0 implies only that f = 0 almost everywhere (by an MCT application), and such an f need not be the zero function. We have a seminorm but not a norm.
We then let
N = { f : R → R | f = 0 a.e. }.
This is a subspace of 𝓛1 (R) and we can form the quotient space L1 (R) = 𝓛1 (R)/N , whose elements are the equivalence classes for the relation ∼ given by f ∼ g iff f = g a.e. If we denote the equivalence class of f by [f ] we get a well-defined norm on L1 (R) by setting
k[f ]k = ∫ |f | .
This is an instance of the general procedure for obtaining a normed space from a space which carries a seminorm k · k′ by quotienting by the subspace { x | kxk′ = 0 }, as set out in Problem sheet Q. 1. In practice we allow some sloppiness and speak about elements of L1 (R) as if they were ordinary functions: we do not distinguish between functions which are equal a.e. and usually drop the [ ] notation.
It is worth noticing that each equivalence class [f ] can contain at most one function
which is continuous everywhere. So, for example, we don’t get into difficulties if we
treat C[0, 1] as if it were a subspace of L1 [0, 1].
Similarly, for p ∈ (1, ∞) we define Lp (R) = 𝓛p (R)/N where
𝓛p (R) = { f : R → R | f is measurable and ∫ |f |^p < ∞ }.

Once again, the case p = 2 is special. The L2 -norm comes from an inner product:
for the complex case,
hf, gi := ∫ f ḡ .
The fact that the traditional norm on an Lp space indeed gives a normed space
will be assumed in FA-I. A proof of the Triangle Inequality appeared in Part A
Integration, starting from Hardy’s inequality. These results are important and
detailed accounts are available in many textbooks.
(c) For completeness we record that L∞ (R) is defined to be the space of (equivalence
classes of) bounded measurable functions f : R → F, with
kf k∞ = inf{ M > 0 | |f (t)| 6 M a.e. }.
[This space is mentioned only for completeness.]
The Lebesgue spaces are much more central to FA-II than to FA-I. In this course they
feature primarily as illustrations. In FA-II L2 -spaces, associated with various measure
spaces, are central. The `p -spaces, for 1 6 p < ∞ can be subsumed within the Lp -spaces
using counting measure.
Technical note. Where spaces of integrable functions arise in the course as examples
or, very occasionally, on problem sheets, measurability of the functions involved may be
assumed. FA-I is not a course on measure theory. Non-measurable functions are anyway
elusive beasts, and are not encountered in everyday mathematics; their existence relies
on an assumption from Set Theory (Zorn’s Lemma).

To conclude this section we record some of the special properties shared by IPS-based
examples in which kxk = (hx, xi)^{1/2} . We collect together, for occasional use later, the
basic toolkit for working with such a norm.

1.12. Properties of the norm on an inner product space.


(i) Cauchy–Schwarz inequality: Let x, y ∈ X. Then
|hx, yi| 6 kxk kyk,
with equality if and only if x and y are linearly dependent.
(ii) Parallelogram Law:
kx + yk2 + kx − yk2 = 2kxk2 + 2kyk2 .
A norm (on a real or complex vector space) for which the Parallelogram Law fails
cannot come from an inner product (see Problem 0.2 for an example).
[Aside. There is a converse. Let k · k be a norm on a vector space X such that the Parallelogram Law holds. Then there is an inner product h·, ·i on X such that kxk = (hx, xi)^{1/2} . This is a bit tricky to prove.]
(iii) Polarisation: Retrieving the inner product from the norm:
hx, yi = (1/4)( kx + yk^2 − kx − yk^2 )   (F = R),
hx, yi = (1/4)( kx + yk^2 + i kx + iyk^2 − kx − yk^2 − i kx − iyk^2 )   (F = C).
Proof: expand out the expressions on the RHS.
(iv) h·, ·i is a continuous function from X × X to F. Proof: Exercise (note Proposition 0.8).
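By way of illustration, the real case of (iii) is immediate from bilinearity and symmetry:
kx + yk^2 − kx − yk^2 = ( kxk^2 + 2hx, yi + kyk^2 ) − ( kxk^2 − 2hx, yi + kyk^2 ) = 4hx, yi .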

2. Completeness and density

In which we establish the completeness of many of the spaces assembled in Section 1 and non-completeness of a few others, and present some general techniques.
With our cast of examples now assembled on stage, we start to explore their individual
characters against a backdrop of general properties. Here the topology of normed spaces
provides the focus.

2.1. Density.
Recall that a subset S of a metric space (or more generally a topological space) X
is dense if the closure of S is the whole of X. Density is an important notion: it will enable us to talk about
approximations of general elements by special ones, and many proofs in functional analysis
and its applications proceed by establishing a given result first for elements of a dense
subset, or subspace, of a normed space and then extending the result to all of X by a
limiting argument.
First examples of dense subspaces:
(a) Let X = `p (where 1 6 p < ∞). Let en = (δjn ), that is, the sequence all of whose
coordinates are 0 except the nth coordinate, which is 1. Let x = (xj ) be an arbitrary
element of X. Then
kx − Σ_{k=1}^{n} xk ek kp = k(0, 0, . . . , 0, xn+1 , xn+2 , . . .)kp → 0 as n → ∞,
since the pth power of the right-hand side is the tail Σ_{j>n} |xj |^p of a convergent series.

Hence the subspace of `p consisting of vectors having at most finitely many non-zero
coordinates is dense in `p .
(b) [Assumed fact] The step functions are dense in Lp (R) for 1 6 p < ∞. A correspond-
ing result holds for Lp (a, b), where −∞ 6 a < b 6 ∞.
In functional analysis, step function approximations are usually much easier to
work with than approximations by simple functions.

2.2. Density and examples of non-completeness.


The contrapositive of Proposition 0.12(i) leads to easy, and natural, examples of
normed spaces which fail to be complete. Suppose Y is a dense and proper subspace of a
normed space X. Then the closure of Y is X; since Y 6= X, Y cannot be closed and so cannot be complete for the norm inherited from X.

Example. Consider C[−1, 1] with norm kf k1 = ∫_{−1}^{1} |f (t)| dt. To prove non-completeness, consider the sequence (fn ) of continuous piecewise-linear functions for which
fn (t) = −1 if −1 6 t 6 −1/n,   fn (t) = nt if −1/n < t < 1/n,   fn (t) = 1 if 1/n 6 t 6 1.
Suppose for contradiction that kfn − gk1 → 0 where g ∈ C[−1, 1]. But kfn − f k1 → 0, where f = χ_(0,1] − χ_[−1,0) . This implies g = f a.e., which is not possible for a continuous g.

We are ready for a clutch of archetypal completeness proofs. All involve function spaces. There’s some overlap with Part A Metric Spaces but we give several proofs here to reinforce the key points in the strategy. Direct proofs via Cauchy sequences follow a uniform pattern. Subsidiary results can be obtained via “closed subspace of a Banach space is a Banach space” (from 0.12).

2.3. Example: completeness of F b (Ω) (`∞ is a special case).


We take X to be the space of real-valued bounded functions on a set Ω with the sup
norm. Let (fn ) be a Cauchy sequence in X. Then given ε > 0, there exists N such that
∀m, n > N   sup_{s∈Ω} |fm (s) − fn (s)| < ε.

Step 1: identify candidate limit. Fix t ∈ Ω, and let m, n ∈ N. Then
|fm (t) − fn (t)| 6 sup_{s∈Ω} |fm (s) − fn (s)| = kfm − fn k∞ .

It follows that (fn (t)) is a Cauchy sequence in R. Hence there exists a real number, write
it as f (t), such that fn (t) → f (t).

Step 2: from pointwise convergence to sup norm convergence (they’re different!). For s ∈ Ω,
m, n > N =⇒ |fm (s) − fn (s)| < ε.
With s and n fixed let m → ∞ to get
n > N =⇒ |f (s) − fn (s)| 6 ε.
It follows that kf − fn k∞ → 0 as n → ∞.
Step 3: membership condition (that is, f ∈ X). We need f bounded. Using the
inequality in Step 2 and the Triangle Inequality we get, for all s,
|f (s)| 6 |fN (s)| + ε 6 kfN k∞ + ε.
Hence kf k∞ 6 kfN k∞ + ε < ∞. 

2.4. Example: completeness of C[0, 1].


We just outline the steps, noting that the framework is as in 2.3. Let (fn ) be a Cauchy
sequence in X = C[0, 1].
Step 1: candidate limit. (fn ) Cauchy implies for each t that (fn (t)) is Cauchy in R,
and hence convergent, to f (t) say, for each t.
Step 2: From pointwise convergence to sup norm convergence. As in 2.3.
Step 3: f ∈ X. The issue is continuity of f . But what we need is exactly that the uniform
limit of continuous functions is continuous—true by the standard “ε/3 argument” from
Prelims.

An alternative here is simply to show that C[0, 1] is a closed subspace of the Banach
space F b ([0, 1]). This amounts to piggybacking on Example 2.3 to reduce the problem to
Step 3 alone.

2.5. Example: completeness of `1 .


The issue here is as much with the notation as with the substance. We start from a Cauchy sequence in `1 , which we shall denote by (x^(n) ), where x^(n) = (x_j^(n) ).
Step 1: candidate limit, coordinatewise. Since |x_j^(m) − x_j^(n)| 6 kx^(m) − x^(n) k1 , for each j the sequence of jth coordinates is Cauchy and so converges, to some xj . Let x = (xj ).
Step 2: convergence of (x_j^(n)) to (xj ) in `1 -norm. For any K ∈ N we have
Σ_{j=1}^{K} |x_j^(m) − x_j^(n)| 6 kx^(m) − x^(n) k1
and the RHS can be made less than a given ε if m, n > N , for some N ∈ N. Keeping K and n fixed and letting m → ∞, (AOL) gives
Σ_{j=1}^{K} |xj − x_j^(n)| 6 ε for all n > N.

Since this is true for all K we deduce that (xj − x_j^(n)) belongs to `1 for all n > N and that its norm tends to 0 as n → ∞.
Step 3: x = (xj ) ∈ `1 . This follows from Step 2 and the triangle inequality for `1 , by writing x as (xj − x_j^(N)) + (x_j^(N)).

A similar but messier proof shows that `p is complete for 1 < p < ∞.

2.6. Example: completeness of c0 .


An exercise in ε-δ analysis—be careful with quantifiers! Show that (c0 , k · k∞ ) is a closed subspace of `∞ and so complete.

2.7. Example: completeness of C 1 [0, 1].


Consider the space X of real-valued continuously differentiable functions on [0, 1] with the C1 norm: kf k = kf k∞ + kf ′k∞ , as in 1.9. Let (fn ) be a Cauchy sequence in X.
Because kfm − fn k > kfm − fn k∞ and kfm − fn k > kf ′m − f ′n k∞ we can show as in Example 2.4 that there exist continuous functions f and g such that kf − fn k∞ → 0 and kg − f ′n k∞ → 0. It remains to show that f ′ = g. Once we have this, kf − fn k → 0 follows easily by (AOL).
Observe that (by FTC, from Prelims Analysis III),
fn (t) − fn (0) = ∫_0^t f ′n (s) ds   (for all n).
Let n → ∞ to get
f (t) − f (0) = ∫_0^t g(s) ds ;
here we use uniform convergence of (f ′n ) to justify interchanging the limit and the integral on the RHS. But, again from Analysis III, the indefinite integral ∫_0^t g(s) ds is differentiable w.r.t. t with derivative g(t). We conclude that f ′ = g, as required.

2.8. Stocktake on tactics for completeness proofs.


Note that all the completeness proofs presented so far for function spaces (which
subsume sequence spaces) follow the same pattern. Suppose we have such a space X,
which we wish to show is complete. We take a Cauchy sequence (xn ) in X and first
identify a candidate limit, x say, a point at a time (or a coordinate at a time).
We then need to show two things: that x ∈ X and that xn → x with respect to X’s
norm. For this we usually show that x ∈ X by showing that x − xN is in X for suitable N
and deducing that x = (x − xN ) + xN is also in X. The proof that x − xN ∈ X usually
comes out of showing that kx − xn k is small for n > N , for some sufficiently large N . This
strategy underlies our ordering of Steps 2 and 3 in our examples above.
Note that Step 1, identifying the limit pointwise, does not make Step 2 unnecessary. On C[0, 1], for example, convergence w.r.t. the sup norm is uniform convergence, a much stronger condition than pointwise convergence.

A significant consequence of the Cauchy Convergence Principle in elementary Analysis is that every absolutely convergent series Σ an of real or complex numbers is convergent. The proof relies on consideration of the sequences of partial sums of Σ |an | and Σ an . We now carry these ideas over to normed spaces and Banach spaces.

2.9. Series of vectors in a normed space.


Let (xn ) be a sequence in a normed space X and consider the sequence (sn ) of partial
sums:
sn = x1 + · · · + xn (n > 1).
If there exists x ∈ X such that sn → x then we write x = Σ_{n=1}^{∞} xn and say that the series Σ xn converges. The series is said to be absolutely convergent if Σ kxn k converges.

2.10. Theorem (completeness and absolute convergence). Let X be a normed space. Then X is a Banach space if and only if every absolutely convergent series of vectors in X converges.

Proof. =⇒ : this is proved in exactly the same way as for the case in which X is R or
C, with k · k in place of | · |—check Prelims notes!
⇐=: Take any Cauchy sequence (yn ) in X. We endeavour to construct a series Σ xn in X which is absolutely convergent and such that x := Σ_{n=1}^{∞} xn supplies the limit we require for (yn ). We can find natural numbers n1 < n2 < · · · such that
ky` − ym k < 2^{−k} for `, m > nk .
Let sk = y_{nk} . Define (xk ) by
x1 = s1 ,   xk = sk − sk−1 (k > 1).
Then the real series Σ kxk k converges by comparison with Σ 2^{−k} , so Σ xk is absolutely convergent, and hence by assumption it converges. Moreover, by construction, Σ xk is a telescoping series:
y_{nk} = sk = Σ_{`=1}^{k} x` .

We deduce that (ynk ) converges. But, just as in Prelims Analysis, a Cauchy sequence in
a normed space which has a convergent subsequence must itself converge. 

2.11. Applications of Theorem 2.10.


(a) Completeness of the Lp spaces (1 6 p < ∞) (Riesz–Fischer Theorem). The
proof requires mastery of techniques of Lebesgue integration and won’t be discussed
in FA-I.
A spin-off is completeness of the `p spaces: consider counting measure on N.
(b) Completeness of L∞ (R). [Outline proof included just for completeness.] Take an absolutely convergent series Σ fn in L∞ (R). By definition of the norm on L∞ (R), there exists for each n a null set An such that |fn (t)| 6 kfn k_{L∞(R)} + 2^{−n} for t ∉ An . Let A = ∪ An ; this is null. Now let gn = fn χ_{R\A} . Then each gn ∈ F b (R) and Σ kgn k_{F b(R)} converges by comparison, since Σ (kfn k_{L∞} + 2^{−n}) converges.
By completeness of F b (R) (see 2.3), Σ gn converges, to some bounded function G. Moreover, off A, Σ_{k=1}^{n} fk = Σ_{k=1}^{n} gk → G pointwise. Therefore G is measurable, G ∈ L∞ (R) and (for convergence in norm),
k G − Σ_{k=1}^{n} gk k_{F b(R)} > k G − Σ_{k=1}^{n} fk k_{L∞} → 0 as n → ∞.

(c) Quotient spaces. It can be proved from Theorem 2.10 that if X is a Banach space
and Y is a closed subspace of X then X/Y is a Banach space for the quotient norm:
kx + Y k := inf{ kx + yk | y ∈ Y }.
(Here closedness of Y is necessary to ensure that we have a norm rather than merely
a seminorm.)
We shan’t need this result in FA-I. The proof is rather technical and we omit it.
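To see why closedness of Y cannot be dropped: if Y is not closed, pick x in the closure of Y but not in Y ; then kx + Y k = inf{ kx + yk | y ∈ Y } = 0 although x + Y is not the zero element of X/Y , so (N1) fails and we have only a seminorm.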
We conclude this section with some indications of why Hilbert spaces are special and
what distinguishes them from inner product spaces in general and from Banach spaces in
general. This brief account can be seen as providing context to FA-I and as a look-ahead
to FA-II. [The theorems mentioned below do not form part of the examinable syllabus
for FA-I].

2.12. A glimpse at Hilbert spaces. The Prelims and Part A Linear Algebra courses
reveal features and methods that apply only to inner product spaces. Euclidean spaces,
along with much of their geometry, form the prototype for finite-dimensional IPS’s. A
central notion is that of orthogonality and a key theorem asserts that if L is a subspace
of a fd IPS V then
(†) V = L ⊕ L⊥ .
This is a key step in the proof of the Spectral Theorem for self-adjoint operators in finite-dimensional IPS’s, since it allows us to proceed by induction on dim V .
This can be proved using ideas of the Gram–Schmidt process and orthonormal bases.
More geometrically, given x ∈ V , one wants y ∈ L so that x − y ∈ L⊥ , which then implies V = L + L⊥ . That is, we want to pick yx ∈ L so that d(x, yx ) is as small as possible.
Algebraic arguments tell us this will work when the IPS V is finite-dimensional. But does
it work in general?
We want
d(x, yx ) = δ := inf{ d(x, y) | y ∈ L },
that is, kx − yx k = inf{ kx − yk | y ∈ L }. A viable strategy, in a Hilbert space, is to take a sequence (yn ) in L such that kx − yn k → δ and to apply the parallelogram law to show (yn ) is Cauchy, and so convergent. Finally we’d need to add the assumption that L is closed to get yx := lim yn ∈ L. By continuity of the norm function, yx is the closest point we seek.
Hence:–

Theorem (Closest Point Theorem, for subspaces) Let L be a closed subspace of a Hilbert space X. Given x ∈ X there exists a unique point yx ∈ L such that
d(x, yx ) = inf{ d(x, y) | y ∈ L }.
This leads on to a core theorem about Hilbert spaces. (In Section 4 we’ll see that any subspace of a finite-dimensional space is closed, so the theorem subsumes the result for finite-dimensional IPS’s recalled above.) Concerning the final assertion of the theorem, recall the comments about projections in the preamble to Section 1.
Theorem (Projection Theorem) Let X be a Hilbert space and L a closed
subspace of X. Then X = L ⊕ L⊥ .
Moreover kPL (x)k 6 kxk for all x ∈ X, where PL is the projection map
from X onto L.
Full details and the derivation from the Closest Point Theorem can be found in
textbooks covering basic Hilbert space theory.

3. Linear operators between normed spaces

In which we bring on stage structure-preserving maps between normed spaces (the bounded linear operators) and develop their elementary properties.

It is taken for granted in contemporary pure mathematics that we should consider not just mathematical objects having a particular type (groups, vector spaces, metric
spaces, . . .) but also the structure-preserving maps between such objects (respectively,
homomorphisms, linear maps, continuous maps, . . .). That is, we study objects not in
isolation but we study how they relate to other objects of the same type. Furthermore,
this approach ties in with potential applications: for example, on a space of infinitely
differentiable functions, the properties of differential operators may be important in the
mathematical modelling of physical problems (though we do not treat these aspects in
this course).

In Part A Metric Spaces an indication was given that a map from one normed space
to another has special properties when it is linear AND continuous. We shall call such a
map a continuous linear operator. We now elaborate on what was shown in Metric
Spaces. A map T : X → Y satisfying condition (3) in Proposition 3.1 will be said to be
bounded.

3.1. Proposition (characterising continuous linear operators). Let X and Y be normed spaces over F and assume that T : X → Y is linear. Then the following conditions are equivalent:
(1) T is continuous;
(2) T is continuous at 0;
(3) there exists a non-negative constant M such that kT xk 6 M kxk for all x ∈ X.

Proof. (1) =⇒ (2): Trivial.


(2) =⇒ (3): Suppose for contradiction that (3) fails. Then we can find a sequence (xn )
in X such that kxn k 6 1 and kT xn k > n (for all n). Let yn = xn /n so that yn → 0 by
(N2). By (2), T yn → T 0. But T linear and (N2) together imply kT yn k > 1 and T 0 = 0,
which gives the required contradiction.

(3) =⇒ (1): Let x0 , x ∈ X. Then
kT x − T x0 k = kT (x − x0 )k 6 M kx − x0 k.
Hence T is continuous (using the ε–δ definition of continuity of a map between metric
spaces). 

3.2. The norm of a bounded linear operator.


Again let X and Y be normed spaces over F. Write B(X, Y ) for the set of all
continuous, alias bounded, linear operators from X to Y , and B(X) for B(X, X).
The set B(X, Y ) becomes a vector space if we define addition and scalar multiplication
pointwise in the usual way: given S, T ∈ B(X, Y ) and λ ∈ F, we “define” S + T and λT
by
∀x ∈ X (S + T )(x) = Sx + T x,
∀x ∈ X (λT )(x) = λT x.
As in Prelims LA, S + T and λT are linear maps. Moreover it is easy to see from
Proposition 3.1 that they are continuous as maps from one normed space to another.
Check the details!
We’d like to do more: we’d like to make B(X, Y ) into a normed space in its own right.
Given T ∈ B(X, Y ) it is easy to check that the following candidate definitions for kT k all
give the same value:
(Op1) kT k = sup{ kT xk / kxk | x 6= 0 } ;
(Op2) kT k = sup{ kT xk | kxk = 1 };
(Op3) kT k = sup{ kT xk | kxk 6 1 };

(Op4) kT k = inf{ M | ∀x kT xk 6 M kxk }.
Here (Op1), (Op2) and (Op3) are minor variants of each other and we use them interchangeably, as convenient; the link to (Op4) relies on the definition of a sup as the least upper bound. We leave the proofs of equivalence as an exercise, noting as usual that x/kxk has norm 1 provided x 6= 0. Don’t forget about the special case x = 0.
It is a routine exercise to verify that T 7→ kT k does define a norm on B(X, Y ). Note also that (Op1) immediately supplies the very useful fact that
kT xk 6 kT k kxk for all x ∈ X.

3.3. Bounded linear operators and their norms: first examples.


Suppose that X, Y are normed spaces over F and that you are asked to show that
some given map T belongs to B(X, Y ). You need to check
(i) T is a well-defined map from X to Y . Usually this just means you have to confirm
that, for each x ∈ X, the formula for T x makes sense and that T x ∈ Y . (For
example, are there convergence issues to be addressed?)
(ii) T is a linear map. This is likely to be routine checking and should not be lingered
over.
(iii) T is bounded. Usually this can be done by estimating kT xk to exhibit a finite
constant M such that kT xk 6 M kxk for all x.
If you are also asked to calculate kT k, (iii) will have told you that kT k 6 M . You
then have to find the minimum possible M . See the examples for illustrations of how to
do this.

(1) Shift operators on sequence spaces. To illustrate, we work with maps from `1
to `1 . Other examples are available. Define
R(x1 , x2 , . . .) = (0, x1 , x2 , . . .),
L(x1 , x2 , . . .) = (x2 , x3 , . . .).
It is clear that each of R and L maps `1 into `1 and is linear. Also, for any x =
(xj ) ∈ `1 ,
kRxk = kxk and kLxk 6 kxk.
Hence R, L ∈ B(`1 , `1 ), with kRk = 1 immediate from (Op3) and kLk 6 1.
Let en = (δjn )j>1 . Then ken k = 1 for all n. Then kLe2 k = ke1 k = 1 and we
deduce that kLk = 1, with the supremum in (Op3) being attained.
(2) Let X = Y = CR [0, 1] and define T by (T x)(t) = tx(t) (for all t ∈ [0, 1]). We claim
that T ∈ B(X) and kT k = 1. Prelims Analysis confirms that T maps X into X and
is linear [no proof called for here]. Also, for all x ∈ X and t ∈ [0, 1],
|(T x)(t)| = |tx(t)| 6 |x(t)| 6 kxk∞ .
Hence kT xk 6 kxk for all x ∈ X and we have equality when x ≡ 1. It follows that
T is bounded, with kT k = 1.
(3) T as in (2) but now X = Y = L2 (0, 1). Other Lp spaces are available.
First note that g : t 7→ t2 is a bounded measurable function on [0, 1], so x ∈ L2 (0, 1)
implies T x ∈ L2 (0, 1) too and linearity of T is clear. Also
kT xk_2^2 = ∫_0^1 |t x(t)|^2 dt 6 ∫_0^1 |x(t)|^2 dt = kxk_2^2 .

Therefore T is bounded and kT k 6 1.


It is not immediately obvious that kT k > 1: we can’t claim that the sup defining kT k in (Op1) or (Op2) will be attained. To prove that the sup really is 1, we construct a ‘witnessing sequence’. Define xn by
xn (t) = √n χ_{[1−1/n, 1]}(t)   (t ∈ [0, 1]).
Routine calculation gives kxn k_2^2 = n · (1/n) = 1 and
kT xn k_2^2 = ∫_0^1 |t xn (t)|^2 dt = n ∫_{1−1/n}^{1} t^2 dt > n (1 − 1/n)^2 /n = (1 − 1/n)^2 .
Hence kT k > (1 − 1/n) for all n, so kT k > 1.


[Observe that the supremum cannot be shown to be attained because members x
of X are not necessarily such that x(t) = O(t2 ) as t → 1.]
(4) An integral operator. Let X = C[0, 1], real-valued continuous functions on [0, 1]
with the sup norm. Let k ∈ C([0, 1] × [0, 1]). Note that, as a continuous real-valued
function on a compact set, k is bounded. For x ∈ X let
(T x)(t) = ∫_0^1 k(s, t)x(s) ds   (t ∈ [0, 1]).
Note that the integral on the RHS exists for each t since the integrand is continuous.
We claim that T x ∈ C[0, 1] for each x ∈ X. The slickest way to prove this is probably
to use the Continuous DCT to show
lim_{t→t0} ∫_0^1 k(s, t)x(s) ds = ∫_0^1 k(s, t0 )x(s) ds :
a dominating function for the family of functions yt : s 7→ k(s, t)x(s) on [0, 1] is kkk∞ kxk∞ χ_[0,1] . [Alternative strategies: (i) use ‘ordinary’ DCT for sequences and appeal to 2.1 or (ii) use elementary estimation of |(T x)(t) − (T x)(t0 )|, making use of the fact that k is uniformly continuous.] Hence T : C[0, 1] → C[0, 1] and by elementary properties of integrals T is linear. Moreover, for all t,
|(T x)(t)| = | ∫_0^1 k(s, t)x(s) ds | 6 ∫_0^1 |k(s, t)x(s)| ds 6 kkk∞ kxk∞ .

Hence kT xk∞ 6 kkk∞ kxk∞ and therefore T is bounded, with kT k 6 kkk∞ .


See Problem sheet Q. 8 for a special case in which kT k can be calculated explicitly.

3.4. Example: an unbounded operator.


Let X be the space of real-valued continuously differentiable functions on [0, 1] with
the sup norm and let C[0, 1] also have the sup norm. Define T : X → C[0, 1] by (T f )(t) = f ′(t). Then T is linear but is not bounded: find a sequence (fn ) of continuously differentiable functions on [0, 1] for which kfn k∞ 6 1 for all n but kf ′n k∞ → ∞.
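One possible witnessing sequence here: fn (t) = t^n has kfn k∞ = 1 for every n, while (T fn )(t) = n t^{n−1} gives kT fn k∞ = n → ∞.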

[Here X is not a Banach space: it is a dense proper subspace of C[0, 1]. Results on
Banach spaces proved in FA-II show why unbounded operators on Banach spaces are
hard to find.]

3.5. Remarks on calculating operator norms. Suppose X, Y are normed spaces and
T : X → Y is linear. Suppose we seek to show T is bounded and to calculate its norm.
As our examples have shown, it is often relatively easy to find some constant M such that kT xk 6 M kxk for all x. This implies that T is bounded, with kT k 6 M . But it is often harder to find the least such M , as in (Op4) in 3.2.
Two cases can arise.
1. Supremum attained in (Op1)/(Op2)/(Op3). We are in luck! We can show
kT xk 6 M kxk for all x and that there exists x0 6= 0 such that kT x0 k = M kx0 k.
Then the sup in (Op1) is attained and we have kT k = M .
2. Supremum not (necessarily) attained in (Op1)/(Op2)/(Op3).
Recall the Approximation Property for sups from Prelims Analysis I: Let S be
a non-empty subset of R which is bounded above, so sup S exists. Then, given ε > 0,
there exists s ∈ S, depending on ε, such that
sup S − ε < s 6 sup S.
This characterises the sup in the sense that if M is an upper bound for S and there
exists a sequence (sn ) in S such that sn > M − n−1 for all n then M = sup S. Our
witnessing sequence method in 3.3(3) draws on this. Note also Example 3.6.
Note finally that we can also use sequences to witness that a linear map T is un-
bounded: this happens if there exists a sequence (xn ) with kxn k = 1 (or (kxn k) bounded
will do) and kT xn k → ∞ as n → ∞.

3.6. Example: bounded linear operators on sequence spaces.


By way of illustration, consider a given bounded linear operator T : `1 → Y , where
Y is some normed space. We can get information about kT k by considering the images
T ek , where ek = (δkj )j>1 , for k > 1. For any k, we have kek k1 = 1 and hence
sup_k kT ek k 6 kT k.

We claim there is equality. Take any x = (xj ) in `1 and define


x^(n) = Σ_{k=1}^n xk ek = (x1 , . . . , xn , 0, 0, . . .).

Then

kx − x^(n) k1 = Σ_{k=n+1}^∞ |xk | → 0 as n → ∞.

Hence, because T is continuous, T x^(n) → T x in Y . But

T x^(n) = T ( Σ_{k=1}^n xk ek ) = Σ_{k=1}^n xk T ek .

It follows that, for all n,


kT x^(n) k 6 Σ_{k=1}^n |xk | kT ek k 6 (sup_{1 6 k 6 n} kT ek k) Σ_{k=1}^n |xk |
          6 (sup_k kT ek k) Σ_{k=1}^∞ |xk | = (sup_k kT ek k) kxk1 .

By continuity of the norm in Y ,


kT xk = lim_{n→∞} kT x^(n) k 6 (sup_k kT ek k) kxk1 .

This proves our claim that


kT k = sup_{k>1} kT ek k.

Specialising to Y = `p , for 1 6 p < ∞, we deduce that, for any T ∈ B(`1 , `p ),


kT k = sup_{k>1} kT ek kp .

See Problem sheet Q. 11 for a cautionary example concerning a bounded linear op-
erator from `∞ into `1 .
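[Optional numerical aside: the sketch below illustrates the formula kT k = sup_k kT ek k using a truncated version of the operator T (xj ) = (xj /j), a hypothetical choice acting here only on the first N coordinates; the truncation and the value of N are artefacts of working numerically, and for this T the formula gives kT k = 1.]

    import numpy as np

    N = 1000
    weights = 1.0 / np.arange(1, N + 1)

    def T(x):
        # truncated version of T(x_j) = (x_j / j), acting on length-N vectors
        return weights * x

    # ||T e_k||_1 = 1/k, so sup_k ||T e_k||_1 = 1; the general result says this equals ||T||.
    norms_Tek = [np.sum(np.abs(T(np.eye(N)[k]))) for k in range(N)]
    print(max(norms_Tek))

    # Spot-check the bound ||Tx||_1 <= (sup_k ||T e_k||_1) ||x||_1 on a random vector.
    rng = np.random.default_rng(0)
    x = rng.standard_normal(N)
    print(np.sum(np.abs(T(x))) <= max(norms_Tek) * np.sum(np.abs(x)))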

3.7. Aside: matrix norms (attn: numerical analysts).


Norms on vector spaces Mn (R) are useful in numerical linear algebra, for example in
error analysis of solutions of linear equations. To be specific, let us consider X = Rn with
the 1-norm, 2-norm or ∞-norm.
Take the standard basis {e1 , . . . , en } for Rn . Then any linear map T : X → X is
represented with respect to this basis by a matrix A = (aij ) with
T ek = Σ_{i=1}^n aik ei .

Then one may measure the ‘size’ of A using either of the following quantities:
kAk1 = max_{1 6 j 6 n} Σ_{i=1}^n |aij | ,
kAk∞ = max_{1 6 i 6 n} Σ_{j=1}^n |aij | .

It can be shown, cf. Example 3.6, that k · k1 and k · k∞ are respectively the norms on
B(Rn ) when Rn has the 1-norm and the ∞-norm.
One may likewise ask for a formula for the norm on B(Rn ) when Rn has the 2-
norm. This is more elusive. However when we have a real symmetric matrix A, this is
diagonalisable, with eigenvalues λ1 , . . . , λn (not necessarily distinct). Then
kAk2 := √( sup_{kxk2 =1} kAxk2^2 ) = max{ |λ1 |, . . . , |λn | }.
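[Optional numerical aside: these formulas are easy to verify numerically. The sketch below does so for a random 5 × 5 matrix; the matrix and its size are arbitrary.]

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((5, 5))

    norm_1 = np.abs(A).sum(axis=0).max()       # max column sum, the formula for ||A||_1
    norm_inf = np.abs(A).sum(axis=1).max()     # max row sum, the formula for ||A||_inf
    print(norm_1, np.linalg.norm(A, 1))        # the two values agree
    print(norm_inf, np.linalg.norm(A, np.inf))

    # For a real symmetric matrix the 2-norm equals the largest |eigenvalue|.
    S = (A + A.T) / 2
    print(np.linalg.norm(S, 2), np.max(np.abs(np.linalg.eigvalsh(S))))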

We now return to general theory.



3.8. Theorem (completeness). Let X be a normed space and Y a Banach space, both
over the same field. Then the normed space B(X, Y ) is complete.

Proof. The proof goes in very much the same way as the completeness proofs for function
spaces given in Section 2. Let (Tn ) be a Cauchy sequence in B(X, Y ).
Step 1: candidate limit. For each x we have
kTm x − Tn xk = k(Tm − Tn )xk 6 kTm − Tn k kxk.
We deduce that (Tn x) is Cauchy in Y , and so convergent to some T x ∈ Y . We thus have
a map T : X → Y . It follows from the continuity of addition and scalar multiplication
(see 2.2) that T is linear.
Steps 2 and 3: from pointwise convergence to norm convergence and proof
that T ∈ B(X, Y ).
Observe that if, for some N , we can show that the linear map T − TN is bounded
then T = (T − TN ) + TN will be bounded too. For any fixed ε > 0,
∃N such that, for all m, n > N and all x ∈ X, kTm x − Tn xk 6 kTm − Tn k kxk 6 εkxk.
Fix x and n > N and let m → ∞ to get kT x−Tn xk 6 εkxk and hence k(T −Tn )xk 6 εkxk.
This implies in particular that T − TN is bounded. Also n > N implies kT − Tn k 6 ε.
Therefore kT − Tn k → 0. 

3.9. Corollary. For any normed space X over F, with F as R or C, the space X ∗ =
B(X, F) of bounded linear functionals is complete, where the norm is
kf k = sup{ |f (x)|/kxk | x 6= 0 }.

We shall make significant use of spaces of the form X ∗ later in the course.

3.10. Products of linear operators.


We can form compositions of bounded linear operators in just the same way as we can
form composite linear maps between vector spaces. Let T ∈ B(X, Y ) and S ∈ B(Y, Z),
where X, Y, Z are normed spaces over F. Then ST , defined by (ST )(x) = S(T x) for all
x ∈ X, is a member of B(X, Z). Boundedness of ST comes from
k(ST )xk = kS(T x)k 6 kSk kT xk 6 kSk kT k kxk.
Moreover we deduce from this that
kST k 6 kSk kT k.
Assume T ∈ B(X). Then, by induction, T n ∈ B(X) and kT n k 6 kT kn , for n = 1, 2, . . . .

3.11. Kernel and image.


Let X and Y be normed spaces. Let T ∈ B(X, Y ), that is, T is linear and is
continuous, alias bounded.
In Linear Algebra in finite-dimensional spaces, the kernel and image of a linear map
feature prominently. They are still important, but less central, when we work in infinite-
dimensional normed spaces. But it is worth noting that the subspace
ker T := { x ∈ X | T x = 0 }
equals T^{-1} {0}. This is the inverse image under a continuous map of a closed set and so
is closed.
On the other hand, T X = im T is not in general closed. See Example 3.16 for an

instance of this.

3.12. Examples: kernel and image.


Recall the shift operators R and L in Example 3.3(1):

R(x1 , x2 , . . .) = (0, x1 , x2 , . . .),


L(x1 , x2 , . . .) = (x2 , x3 , . . .).

These belong to B(`1 ).


It is easy to see that R is injective but not surjective and that L is surjective but not
injective: its kernel is the space spanned by e1 = (1, 0, 0, . . .). Of course, this is behaviour
that cannot arise for a linear map from a finite-dimensional space to itself, thanks to the
Rank-Nullity Theorem.

3.13. Invertibility of a bounded linear operator.


Suppose we are given a bounded linear operator T ∈ B(X) where X is a normed
space. We would like to know when T has an inverse S which is again a bounded linear
operator. So we seek S ∈ B(X) such that S ◦ T = T ◦ S = I. For this it is necessary that
T be a bijection, but in general this will not be sufficient. Note too that we demand a
two-sided inverse.
When such an S exists we write S = T −1 .

3.14. A checklist for invertibility of T ∈ B(X) (X a normed space).


Mindful of Example 8.3 we now pick apart what we need in order that a bounded
linear operator is invertible: We must have

(a) T injective (equivalently ker T = {0});


(b) T surjective, that is, T X = X.

Assume now that (a) and (b) hold. Then there exists a map S : X → X such that
T ◦ S = S ◦ T = I. We’d then want S to be a bounded linear operator on X. So we need

(c) S is linear: this is always true (routine calculation, just Linear Algebra);
(d) S is bounded, that is, there exists a constant K > 0 such that

kSyk 6 Kkyk ∀y ∈ X.

Given (b), this is equivalent to

(?) ∃δ > 0 such that kT xk > δkxk ∀x ∈ X.

When this holds K := δ −1 provides a bound for kT −1 k.

In practice, the kernel of T can usually be found by direct calculation. Surjectivity


of T is generally much harder to check. A fallback strategy is to aim to show that T X is
both closed and dense in X, from which T X = X would follow.
The following proposition is informative and will be useful later. Note that we need
to assume that X is a Banach space.

3.15. Proposition (closed range). Let X be a Banach space and T ∈ B(X). Assume
that (?) holds for some δ > 0. Then T is injective and T X is closed.
If in addition T X is dense in X, then T is bijective and T is invertible.

Proof. From (?), T x = 0 implies x = 0, so ker T = {0}. Now assume (yn ) is a sequence
such that yn = T xn and yn → y. Then (T xn ) is Cauchy and then (?) implies that (xn ) is
Cauchy. Since X is Banach, there exists x such that xn → x. Then yn = T xn → T x so
y ∈ T X.
Assume further that T X is dense. Then T X = X. Moreover (?) tells us that S = T −1
is bounded. 

3.16. Examples: invertible and non-invertible operators.


(1) Let K be a compact subset of C and consider X = C(K), the space of complex-
valued continuous functions on K with the sup norm. Define T ∈ B(X) by
(T f )(z) = zf (z) (z ∈ K, f ∈ X).
Then T is invertible if and only if 0 ∈/ K. For the ⇐= direction, observe that
0 ∈/ K implies that T is a bijection with inverse (T −1 g)(z) = g(z)/z and that
kT −1 gk 6 kgk/δ where δ := dist(0, K) > 0, so T −1 is bounded.
(2) Let T ∈ B(c0 ) be given by T (xj ) = (xj /j). Then ker T = {0}.
The operator T is not surjective: no (xj ) ∈ c0 exists such that T (xj ) = (1/√j).
However the range T c0 is dense since it contains c00 , the subspace of all sequences
with only finitely many non-zero coordinates.
Also ken k = 1 and kT en k = 1/n. It follows that (?) in 3.14 cannot hold.
Our final invertibility result in this section gives a sufficient condition for an operator
on a Banach space to be invertible and which directly provides a formula for the inverse.
Note how completeness is used in the proof.

3.17. Proposition. Let X be a Banach space. Let T ∈ B(X) be such that kT k < 1.
Then I − T is invertible with inverse given by Σ_{k=0}^∞ T^k (where T^0 := I).
Moreover, if P ∈ B(X) is such that kI − P k < 1 then P is invertible.

Proof. Assume kT k < 1. Define


Sn = I + T + T 2 + · · · + T n .


Since kT^k k 6 kT k^k for all k, the series Σ_k kT^k k converges. Note B(X) := B(X, X) is a
Banach space because X is, by 3.8. Now by Theorem 2.10 there exists S ∈ B(X) such
that kSn − Sk → 0. Also
(I − T )Sn = Sn (I − T ) = I − T^{n+1} .
Letting n → ∞, noting that kT^{n+1} k → 0, we see that (I − T )^{-1} exists and equals S.
For the final assertion put T = I − P and note that P = I − T . 

This result is important in spectral theory and in its applications, for example the
theory of integral equations.
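[Optional numerical aside: on X = R^n the Neumann series can be computed directly. In the sketch below the matrix T is a random choice rescaled so that kT k < 1, and the series is truncated after an arbitrary number of terms; the partial sums are then compared with the matrix inverse.]

    import numpy as np

    rng = np.random.default_rng(2)
    T = rng.standard_normal((4, 4))
    T *= 0.9 / np.linalg.norm(T, 2)        # rescale so that ||T||_2 = 0.9 < 1

    # Partial sums S_n = I + T + ... + T^n of the Neumann series.
    S = np.zeros_like(T)
    term = np.eye(4)
    for _ in range(200):
        S += term
        term = term @ T

    # The truncated series agrees with (I - T)^{-1} up to a tiny residual.
    print(np.max(np.abs(S - np.linalg.inv(np.eye(4) - T))))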

4. Finite-dimensional normed spaces

In which we reveal why linear algebra in finite-dimensional real or complex


vector spaces doesn’t normally explicitly involve analysis, and we show how
finite-dimensional spaces fit into functional analysis.
The key results are
• any two norms on a given finite-dimensional normed space are equiv-
alent;
• any linear operator with finite-dimensional domain is bounded;
• any finite-dimensional normed space is a Banach space;
• any finite-dimensional subspace of a normed space is closed.

4.1. Finite-dimensional spaces, algebraically.


Let X be a vector space over F, where F is R or C. Assume that X is finite-
dimensional, of dimension m, and fix some basis {x1 , . . . , xm }. From Prelims Linear
Algebra, the map
P : (λ1 x1 + · · · + λm xm ) 7→ (λ1 , . . . , λm )
is well-defined and sets up an isomorphism (that is, a linear bijection) from X onto Fm .
Denote its inverse by Q. Note (Prelims LA result) that Q is necessarily linear.

4.2. Theorem (introducing topology). Let X be a vector space over F of dimen-


sion m, equipped with a norm k · k. Let Q : Fm → X be the isomorphism set up in 4.1
and give Fm the ∞-norm. Then
(i) Q is bounded and so continuous;
(ii) Q is a homeomorphism and hence so is P .

Proof. (i) Case m = 1: Here Q : λ1 7→ λ1 x1 with x1 necessarily non-zero. Trivially Q is a


homeomorphism.
Case m > 1:
Q : (λ1 , . . . , λm ) 7→ λ1 x1 + . . . + λm xm .
For each j ∈ {1, . . . , m} there is a projection map πj : (λ1 , . . . , λm ) 7→ λj which is a
bounded linear operator from Fm to F (it has norm 6 1). Now consider for each j the
composite map Qj ◦ πj , where Qj : λj 7→ λj xj :
πj Qj
(λ1 , . . . , λm ) 7−→ λj 7−→ λj xj .
As noted for the case m = 1, each Qj is continuous. Hence Qj ◦πj : Fm → X is continuous.
Finally note that Q = (Q1 ◦ π1 ) + · · · + (Qm ◦ πm ) : Fm → X is continuous too.

(ii) Consider C := { y ∈ Fm | kyk∞ = 1 }. The set C is closed and bounded, and


hence compact by the Heine–Borel Theorem. Therefore Q(C) is compact since Q is
continuous. We want to show the linear bijection P = Q−1 : X → Fm is continuous.
Certainly 0 ∈/ Q(C) since 0 ∈/ C and Q is a linear bijection. As a compact subset of a
metric space Q(C) is necessarily closed in X. Hence there exists δ > 0 such that kxk < δ
implies x ∈/ Q(C). Rephrasing this,
kyk∞ = 1 =⇒ Qy ∈/ { x ∈ X | kxk < δ } =⇒ kQyk > δ.
Since Q is linear and surjective this gives kP xk 6 δ^{-1} kxk for all x ∈ X, so P is continuous,
as required. (Recall (?) in 3.14.) 

We already know that the various p-norms on Fm are equivalent. But we can now do
better.

4.3. Theorem (equivalence of norms). Any two norms on a finite-dimensional normed


space X are equivalent.

Proof. Let k · k and k · k0 be norms on X. Then there exist homeomorphisms


P : (X, k · k) → (Fm , k · k∞ ) and P 0 : (X, k · k0 ) → (Fm , k · k∞ ),
by Theorem 4.2, and these are equal as set maps. Hence the identity maps
(P 0 )−1 ◦ P : (X, k · k) → (X, k · k0 ) and P −1 ◦ P 0 : (X, k · k0 ) → (X, k · k)
are both continuous and are obviously linear. As such, they are bounded linear operators,
and this is the same as saying that the norms k · k and k · k0 are equivalent. 
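[Optional numerical aside: for the p-norms on F^m the equivalence constants can be written down explicitly, for instance kxk∞ 6 kxk2 6 kxk1 6 m kxk∞ for every x. The sketch below simply spot-checks these standard inequalities on random vectors in R^5; the dimension and the number of samples are arbitrary.]

    import numpy as np

    rng = np.random.default_rng(3)
    m = 5
    for _ in range(5):
        x = rng.standard_normal(m)
        n_inf = np.max(np.abs(x))
        n_2 = np.linalg.norm(x)
        n_1 = np.sum(np.abs(x))
        # ||x||_inf <= ||x||_2 <= ||x||_1 <= m * ||x||_inf
        print(n_inf <= n_2 <= n_1 <= m * n_inf)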

4.4. Corollary (boundedness of linear operators on a finite-dimensional normed


space). Let X and Y be normed spaces over F, with X finite-dimensional. Assume that
T : X → Y is linear. Then T is bounded.
Assume X = Y . A linear map T : X → X is invertible if and only if it is a bijection.
Here invertibility is defined as in 3.13.

Proof. Denote the given norm of X by k · k. Take a fixed basis {x1 , . . . , xm } for X and
let x = Σ_j λj xj be a general element of X, where the scalars λj are uniquely determined
by x. Then
kT (λ1 x1 + · · · + λm xm )k = kλ1 T x1 + · · · + λm T xm k
   6 |λ1 | kT x1 k + · · · + |λm | kT xm k 6 ( Σ_{j=1}^m kT xj k ) max_{1 6 j 6 m} |λj |.

Define a new norm k · k0 on X by


kxk0 = kλ1 x1 + . . . + λm xm k0 := max_{1 6 j 6 m} |λj |.
By Theorem 4.3, k · k and k · k0 are equivalent. This implies that there is a constant K
such that kxk0 6 K kxk for all x. Therefore
kT xk 6 K ( Σ_{j=1}^m kT xj k ) kxk   (for all x ∈ X).

We conclude that T is bounded. 

4.5. Theorem (completeness of finite-dimensional normed spaces).


(i) (Fm , k · k∞ ) is a Banach space.
(ii) Let (X, k · k) be a finite-dimensional normed space. Then X is a Banach space.

Proof. (i) This is a very simple instance of the strategy for proving completeness in
function spaces (we can regard Fm as the space of bounded functions from {1, . . . , m} to
F, or just think in terms of coordinates).
For (ii) it suffices to prove that X is complete in some norm since all norms on X
are equivalent and, for equivalent norms, the same sequences are Cauchy. Take a basis
{x1 , . . . , xm } for X and define kΣ_j λj xj k to be max_j |λj |. This is a norm on X and X
with this norm is Banach, exactly as in (i). 

We now consider closed subspaces, recalling Proposition 0.12. We give two proofs
for the following result, one which uses all the machinery of this section, the other much
more direct.

4.6. Proposition. Let Y be a finite-dimensional subspace of a normed space X. Then


Y is closed.

Proof. Method 1: (Y, k · kX ) is a finite-dimensional normed space, so complete by The-


orem 4.5(ii). Hence Y is closed by Proposition 0.12.

Method 2: We proceed by induction on dim Y . If dim Y = 0 then Y = {0} and this is


closed.
Now assume that Z is any closed subspace of X and let y ∈ X \ Z. We claim that
W = span(Z ∪ {y}) is closed. Each element of W is expressible (uniquely) as z + λy,
where z ∈ Z and λ ∈ F. Suppose wn → w where wn = zn + λn y.
Case 1: (λn ) is unbounded. By passing to a subsequence if necessary we may assume
that |λn | → ∞ (and wlog λn 6= 0). Then
|λn |^{-1} kwn k = kλn^{-1} zn + yk.

Letting n → ∞, noting that (wn ) is norm-bounded, we see that −λn^{-1} zn → y. Hence y lies in
the closure of Z, which equals Z since Z is closed, contradicting y ∈ X \ Z.

Case 2: (λn ) is bounded. By Prelims Analysis, (λn ) has a convergent subsequence,


(λnr ) say, converging to some λ ∈ F. Now
znr = wnr − λnr y → w − λy.
But this implies that (znr ) converges to some z, necessarily in Z. Finally we get w ∈ W ,
so W is closed. 

The famous Heine–Borel Theorem tells us that in a finite-dimensional normed space


F^m compactness is equivalent to (closed + bounded). The following result tells us that
the backward implication fails in any infinite-dimensional normed space.

4.7. Theorem (compactness of closed unit ball). Let X be a normed space and let
S := B(0, 1) = { x ∈ X | kxk 6 1 } be the closed unit ball in X. Then S is compact if
and only if X is finite-dimensional.

Proof. ⇐=: By Theorem 4.2, X is isomorphic and homeomorphic to some normed space
Fm . Thus it is sufficient to show that the closed unit ball in a space Fm is compact. But
as noted above in this special case this follows from the Heine–Borel Theorem.

=⇒ : We first introduce some useful notation, which just uses the vector space
structure of X. For A, B ⊆ X and c > 0, let
A + B = { a + b | a ∈ A, b ∈ B } and cA = { ca | a ∈ A }.
Note that if Y is a subspace then Y + Y = Y and cY = Y for all c > 0.
Now assume that S is compact. We shall make use of the fact that a compact subset
of a metric space is totally bounded. This means that for each ε > 0 there exists a
finite set Fε (called an ε-net) in X such that every point of S is within a distance ε of
some point in Fε . The proof is easy: consider the open cover of S consisting of all open
balls in X of radius < ε with centres in S.
Let F be a 1/2-net. Then, for any x ∈ S, there exists u ∈ F such that kx − uk < 1/2.
Saying this another way, S ⊆ F + B(0, 1/2), and this implies S ⊆ F + (1/2)S.
Let Y = span(F ). Then
S ⊆ F + (1/2)S ⊆ Y + (1/2)S.
By a simple notation-chase,
(1/2)S ⊆ Y + (1/4)S.
Putting these together
S ⊆ Y + Y + (1/4)S = Y + (1/4)S.
Proceeding by induction
S ⊆ Y + 2^{−k} S for all k > 1.
But then S ⊆ ∩_k (Y + 2^{−k} S). The set on the RHS is the closure of Y . Since Y is finite-dimensional,
Y is closed. Therefore S ⊆ Y . But this implies X ⊆ Y . Hence X = Y and so X is
finite-dimensional. 

5. Density and separability

In which we reveal different aspects of the notion of density. Major results:


• Stone–Weierstrass Theorem 5.10, on dense subspaces of (C(K), k·k∞ ),
where K is compact;
• Theorem 5.25, which is a technically useful general theorem on ex-
tending a bounded linear operator from a dense subspace to the whole
space.
In addition we discuss the notion of separability. A separable space is
one which has a countable dense subset. Many of our familiar examples
are shown to be separable; a few are inseparable.

5.1. Spaces of continuous functions on compact sets.


Our goal is to identify amenable dense subsets in spaces of continuous functions,
for the sup norm. In particular we’ll be interested in polynomial approximations to
continuous functions on closed bounded subintervals of R.
For definiteness, let’s consider first C[0, 1], the continuous real-valued functions on
[0, 1]. Then C[0, 1] is a normed space for the sup norm, with convergence being uniform
convergence (and C[0, 1] is in fact a Banach space, though this is not directly relevant in
this section).
We know that C[0, 1] supports pointwise-defined operations as follows:
(1) addition and scalar multiplication;
(2) a commutative multiplication, with kf gk 6 kf k kgk, and with a unit for multiplica-
tion, viz. the constant function 1;
(3) modulus, |f |, for f ∈ C[0, 1];
(4) max and min: for f, g ∈ C[0, 1],
f ∨ g := max{f, g}; f ∧ g := min{f, g}.
Just as the algebraic operations and modulus rely for their definitions on the existence
of corresponding operations in R, so too the order-theoretic operations in (4) rely on the
corresponding order-theoretic operations in R and so on the underlying order relation 6
on R. A consequence is that, while (1)–(3) would go through unchanged to complex-
valued functions, (4) would not.

Selecting particular features:



• C[0, 1] is a commutative ring with identity ((1), with scalar multiplication ignored,
& (2));
• C[0, 1] is a commutative Banach algebra ((1) & (2), plus completeness to cover the
inclusion of ‘Banach’ in the name).
• C[0, 1] is a vector lattice: ((1) & (4)).
A surfeit of riches!
[Note: In all cases the formal definitions of the italicised terms involve some axioms
to ensure the various operations interact as we would wish. Not important for us because
we only work with the special case of C(K).]

Observe that we have also met normed spaces with a supplementary operation of
product in Section 3: the spaces B(X, Y ), in which kST k 6 kSk kT k and which are
Banach spaces when Y is a Banach space. These spaces are examples of non-commutative
Banach algebras.

Everything said so far works without change if we replace [0, 1] by any compact
space K. Compactness of K guarantees that kf k∞ is finite for each f ∈ C(K) so the sup
norm is available. Henceforth in this section C(K) will denote the real-valued continuous
functions on a non-empty compact set K.

5.2. Separation of points.


We say that a set Y ⊆ C(K) separates the points of K if, given p, q ∈ K with
p 6= q there exists g ∈ Y such that g(p) 6= g(q). As first examples we note that, in C[0, 1],
each of the following sets separates the points of [0, 1]:

C[0, 1], the polynomials, the piecewise-linear functions.

Obviously, the constant functions fail to separate points. In general, the smaller a subset
Y of C(K) is the less likely it is to separate points. We may ask when C(K) itself
separates the points of K. This is the case if

(i) K = [0, 1] or more generally any closed bounded interval in R;


(ii) K is a compact subset in Rn (exercise);
(iii) K is a compact subset of a metric space (use the fact that for each fixed point a,
the map x 7→ d(x, a) is continuous);
(iv) K is a compact Hausdorff space (quite advanced topology, based on Urysohn’s
Lemma; mentioned here just to complete the picture).

We mention these results because the Stone–Weierstrass Theorem that we prove for a
space C(K) (both forms) would be vacuous if C(K) failed to separate points.

What we are aiming for is sufficient conditions on a subspace Y of C(K) which will
guarantee that it is dense in C(K). This suggests that it needs to be ‘big’. In particular
it’s worth noting that Y cannot be dense if it is characterised by a property which lifts
from Y to its closure but which does not hold universally in C(K). For example, the
following are proper closed subspaces of C[0, 1] and so cannot be dense:
• { f ∈ C[0, 1] | f (1/2) = 0 };
• { f ∈ C[0, 1] | f (0) = f (1) }.

The first of these separates points but fails to contain the constants, while the second
contains the constants but fails to separate points.

5.3. Two-point fit lemma. Let Y be a subspace of C(K) containing the constant
functions and separating the points of K. Let p 6= q in K and α, β ∈ R. Then there
exists g ∈ Y such that
g(p) = α and g(q) = β.

Proof. Since Y separates points, we can first choose f ∈ Y such that f (p) 6= f (q). Now
consider g = λf + µ1 and aim to choose λ, µ ∈ R so that
α = λf (p) + µ, β = λf (q) + µ.
These equations are uniquely soluble for λ and µ. Since Y is a subspace containing the
constants, λf + µ1 ∈ Y . 

5.4. More about lattice operations in C(K).


A subspace L of C(K) is said to be a linear sublattice if f, g ∈ L implies f ∨ g
and f ∧ g belong to L. It is easy to see that L is a linear sublattice if and only if f ∈ L
implies |f | ∈ L. This follows from the (familiar) formulae:
|f | = (f ∨ 0) + ((−f ) ∨ 0);
f ∨ g = (1/2)(f + g + |f − g|) ,   f ∧ g = (1/2)(f + g − |f − g|) .
Note that it is crucial here that L be a subspace.
Note too that we can form the max and the min of any finite number of continuous
real-valued functions and that these are again continuous. But this doesn’t extend to
infinite families of functions.
Now take a linear sublattice L of C(K) which contains the constants and separates
points. Then the Two-point fit Lemma implies that, for a given f in C(K), there is a
member of L which equals f at any two specified points of K. This, thanks to continuity,
should enable us to approximate f locally by elements of L. To get a global approximation
to f , we first seek to use max operations to approximate f from below and then to use
min operations to approximate from above. We will need to call on compactness of K
to ensure that we only need a finite number of approximating functions from L at each
of these two stages. [Those who have seen the proof that a compact Hausdorff space is
normal may recognise similarities.]
So here’s our first density theorem, whose proof follows the strategy outlined above.

5.5. Stone–Weierstrass Theorem (real case, lattice form). Let L be a subspace of


C(K) which is such that
(i) L is a linear sublattice;
(ii) L contains the constant functions;
(iii) L separates the points of K.
Then L is dense in C(K).

Proof. We want to show that, given f ∈ C(K) and any ε > 0, we can find h ∈ L such
that
f − ε < h < f + ε.
Step 1: approximating f at points p, q ∈ K. We claim there exists g ∈ L such that
g(p) = f (p) and g(q) = f (q). If p 6= q this comes from the Two-point fit Lemma. If p = q
a constant function will serve for g.
[This step doesn’t need the assumption that L is closed under max and min.]

Step 2: approximating f from below. Fix p ∈ K and let q ∈ K. Use Step 1 to


construct gq ∈ L such that f (p) = gq (p) and f (q) = gq (q). By continuity of gq − f there
exists an open set Uq 3 q such that
gq (s) > f (s) − ε for all s ∈ Uq .
By compactness of K there exist q1 , . . . , qn ∈ K such that K = Uq1 ∪ · · · ∪ Uqn . Then
g := gq1 ∨ · · · ∨ gqn > f − ε
and g(p) = f (p). Since g depends on p, let’s now denote it g p .

Step 3: approximating f from above. We now vary p. For each p we can choose an
open set Vp containing p such that
g p (t) < f (t) + ε for all t ∈ Vp .
By compactness, there exist p1 , . . . , pm such that K = Vp1 ∪ · · · ∪ Vpm . Now
h := g p1 ∧ · · · ∧ g pm < f + ε.

Step 4: putting the pieces together. Consider h as in Step 3. From Step 2, each
g pi > f − ε. Hence h > f − ε. Therefore kh − f k∞ < ε. Also h ∈ L. We conclude that
f lies in the closure of L; that is, L is dense.

5.6. Example: an application of SWT, lattice form.


The set L of piecewise-linear real-valued continuous functions on a closed bounded
interval [a, b] ⊆ R forms a linear sublattice of C[a, b] (why?) and L contains the constants
and separates points. By 5.5, L is dense in C[a, b]. (This can also be proved directly, by
an ε − δ proof involving uniform continuity.)
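[Optional numerical aside: a direct way to see this density in action is to interpolate a continuous function linearly on an increasingly fine mesh and watch the sup-norm error shrink. In the sketch below the function f and the meshes are arbitrary illustrative choices on [0, 1].]

    import numpy as np

    f = lambda t: np.exp(t) * np.sin(5 * t)        # an arbitrary continuous function on [0, 1]
    t_fine = np.linspace(0, 1, 100001)

    for n in [2, 4, 8, 16, 32]:
        nodes = np.linspace(0, 1, n + 1)
        pl = np.interp(t_fine, nodes, f(nodes))    # piecewise-linear interpolant through the nodes
        print(n, np.max(np.abs(pl - f(t_fine))))   # sup-norm error decreases as the mesh is refined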

We shall now use the sublattice form of SWT to get at a different form of the the-
orem which is particularly useful, and which subsumes Weierstrass’s classic polynomial
approximation theorem. We are going to need the following lemma, which can be seen
as a very special case of the uniform approximation of a continuous function on [0, 1] by
polynomials.

5.7. Technical lemma (approximating t on [0, 1] by polynomials). Define a se-
quence (pn ) of polynomials recursively by
p1 (t) = 0,   pn+1 (t) = pn (t) + (1/2)(t − pn (t)^2 )   (n > 1).
Then (pn (t)) is an increasing sequence for each t ∈ [0, 1] and pn (t) → t^{1/2} uniformly on [0, 1].

Proof. We can prove easily that if


0 = p1 (t) 6 p2 (t) 6 · · · 6 pk (t) 6 t^{1/2} (for all t ∈ [0, 1])
holds for k = n then it holds for k = n + 1 too, and it is certainly true for k = 1. Hence
by induction (pn (t)) is a monotonic increasing sequence bounded above by 1. Therefore
(pn (t)) converges. Moreover, by (AOL) techniques (as in Prelims), the defining recurrence
relation implies that lim pn (t) = t1/2 for each t ∈ [0, 1].
It remains to prove that convergence is uniform. Dini’s Theorem seen in Prelims
Analysis II is exactly what we need. For completeness we include a proof for the case we
want (more topological than that given in Prelims course notes).
First note that (fn (t)), where fn (t) = t^{1/2} − pn (t), is a monotonic decreasing sequence
of continuous functions which converges pointwise to 0. Given ε > 0, let
Fn = { s ∈ [0, 1] | fn (s) > ε }.
Then Fn is closed and Fn+1 ⊆ Fn for all n. Since [0, 1] is compact, the Finite Intersection
Property implies that if all Fn ’s were non-empty then ∩_n Fn 6= ∅. But this would imply
that there is some point t at which fn (t) does not tend to 0, contrary to assumption. So there exists N
such that FN = ∅. Then 0 6 fn (s) < ε for all s ∈ [0, 1] and for all n > N . 
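[Optional numerical aside: the recursion in 5.7 is easy to run. The sketch below evaluates pn on a grid and prints the sup-norm distance from t^{1/2}; the grid and the stopping point are arbitrary. Convergence is visibly monotone and rather slow near t = 0.]

    import numpy as np

    t = np.linspace(0, 1, 10001)
    p = np.zeros_like(t)                       # p_1 = 0
    for n in range(1, 41):
        p = p + 0.5 * (t - p**2)               # p_{n+1} = p_n + (t - p_n^2)/2
        if n in (5, 10, 20, 40):
            print(n, np.max(np.abs(np.sqrt(t) - p)))   # sup-norm error decreases with n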

It is clear that the set of real polynomials on a closed bounded interval of R is not
closed under the lattice operations of max and min: just consider t and −t on [−1, 1].
But the polynomials are closed under forming products.

5.8. Subalgebras of C(K).


We say that a subspace A of C(K) is a subalgebra if f, g ∈ A implies f g ∈ A and A
contains the constant functions. It is an elementary exercise to show that if A is a subalgebra
then the closure of A is a subalgebra too. We claimed earlier that the closure of a subspace is a
subspace. So we only need to check that the closure of A is closed under products. But this comes from
a normed space version of the (AOL) result about limits of products of sequences.
Our reason for bringing in closures of subalgebras is revealed by the following propo-
sition.

5.9. Proposition. Let A be a closed subalgebra of C(K). Then A is a linear sublattice.

Proof. It will suffice to show that f ∈ A implies |f | ∈ A. Assume first that kf k 6 1.


Then 0 6 f^2 6 1 and by 5.7 we can find a sequence (pn ) of polynomials such that
pn (f^2 ) → √(f^2 ) = |f | uniformly on K. But pn (f^2 ) ∈ A since A is a subalgebra, and A is
closed under uniform limits since A is closed for the sup norm.
For the general case, take f 6= 0 and consider f /kf k, which has norm 1. There is
nothing to prove if f = 0. 

5.10. Stone–Weierstrass Theorem (subalgebras form, real case). Let A ⊆ C(K)


be such that
(i) A is a subalgebra of C(K) (that is, it is a subspace closed under products and it
contains the constants);
(ii) A separates the points of K.
Then A is dense in C(K).

Proof. Let L be the closure of A; by 5.8, L is a closed subalgebra. By 5.9, L is a linear sublattice. Also L


contains constants and L separates points since A does. Now SWT for linear sublattices
implies that L is dense. But since L is closed, it equals C(K). 

5.11. Corollary (Weierstrass’s polynomial approximation theorem (real case)).


Every real-valued continuous function on a closed bounded subinterval of R, or more
generally on a compact subset of Rn , is the uniform limit of polynomials.

5.12. Example (on Weierstrass’s Theorem).


Suppose that f ∈ C[0, 1] is such that f is real-valued and
∫_0^1 t^n f (t) dt = 0 for all n = 0, 1, 2, . . . .

Then f is identically zero.



Proof. Take ε > 0. Choose a real polynomial p such that kf − pk < ε. Then, by the
assumption, and properties of integration,
0 6 ∫_0^1 f (t)^2 dt = ∫_0^1 f (t)(f (t) − p(t)) dt.

Now
| ∫_0^1 f (t)(f (t) − p(t)) dt | 6 ∫_0^1 |f (t)(f (t) − p(t))| dt 6 kf k kf − pk 6 kf k ε.
Since ε was arbitrary this forces ∫_0^1 f^2 (t) dt = 0. Since f^2 is continuous and non-negative,
f^2 , and so also f , is identically zero. (Recall 1.11(a).)

5.13. Concluding remarks on SWT.


Note that there exist various direct proofs of Weierstrass’s polynomial approximation
theorem for continuous functions on a closed bounded interval, some of which work for
complex-valued functions too. Given Weierstrass’s Theorem, one could of course appeal
to this to get the general SWT for subalgebras instead of making use of the very special
case of it we obtained in 5.7.
The Stone–Weierstrass Theorem, in its subalgebras form, can be extended to the
case of complex-valued functions. This extension is not entirely routine and FA-I does
not cover it.

5.14. Separability: introductory remarks.


We may think of finite-dimensional normed spaces as being ‘small’ and infinite-
dimensional ones as ‘large’. But can we distinguish some infinite-dimensional normed
spaces as being smaller than others in a way that is meaningful and useful? Certainly a
dichotomy based on the number of vectors—countable or uncountable—wouldn’t be help-
ful, since the only countable normed space is {0} because the scalar field is uncountable.
So what about something more geared to topology?

5.15. Definition: separable.


A normed space X is separable if there is a countable subset D of X such that D is
dense, that is, D = X. A space which is not separable is said to be inseparable.
Explicitly, D is dense in X iff for each x ∈ X,
∀ε > 0 ∃s ∈ D kx − sk < ε.
Of course here s will depend on x.
The definition of density in terms of closure makes sense in any metric space (or any
topological space), but we focus on normed spaces, in which examples proliferate and in
which there are useful results making specific use of the norm.
We record as a lemma an elementary fact which can help make some separability
proofs less messy than they might otherwise be.

5.16. Lemma. Let (X, k · k) and (X, k · k0 ) be normed spaces with equivalent norms.
Then either both are separable or both are inseparable.

Proof. A subset D of X is dense in both spaces or neither: either use the definition of
equivalent norms, or use the result that the two spaces have the same open sets. 

It goes without saying that to understand and to work with separability you need to
know the rudiments of the theory of countable and uncountable sets, as introduced in
Prelims Analysis I. This will be assumed in this section but a brief summary is provided
in a supplementary note, Review of basic facts about countability. It will be legitimate to
quote the results recalled there. We enclose such justifications within square brackets in
the proofs we give below.
Let’s begin with an example to show that separability can’t be taken for granted.

5.17. Example: `∞ is inseparable.


We argue by contradiction. Suppose D were a countable dense subset of `∞ .
Consider the subset U of `∞ consisting of sequences (uj ) such that uj = 0 or 1 for
each j. Then U is uncountable [standard example].
Take any 2 elements u = (uj ) and v = (vj ) in U with u 6= v. Then there exists j such
that |uj − vj | = 1 so ku − vk > 1 (in fact we have equality). Since D is dense, there exist
su and sv in D such that
ku − su k < 1/2 and kv − sv k < 1/2.
If it were the case that su = sv we would have
ku − vk 6 ku − su k + ksu − sv k + ksv − vk < 1,
which is false. Therefore the map u 7→ su is injective, which is impossible since U is
uncountable and D was assumed to be countable [obvious standard fact].

Other examples of inseparable spaces are the space of Lipschitz functions and L∞ (R).
See Problem sheet Q. 17 for details.

5.18. Proposition (first examples of separable spaces). Let the scalar field F be R
or C. The following are separable:
(1) F;
(2) Fm , with any norm;
(3) `1 .

Proof.
(1) Q is countable and dense in R. Q + iQ := { a + ib | a, b ∈ Q } is countable and dense
in C (for example for the k · k∞ -norm, and hence for any norm by 5.16). So F has
a countable dense subset, henceforth denoted CF .
(2) Let’s consider Fm with the ∞-norm and let CF be countable and dense in F. Then
(CF )m is countable [finite product of countable sets]. Take any (x1 , . . . , xm ) ∈ Fm
and any ε > 0. Then for each j there exists sj ∈ CF such that |xj − sj | < ε. Hence
k(x1 , . . . , xm ) − (s1 , . . . , sm )k∞ = max_j |xj − sj | < ε.

Therefore (CF )m is a dense subset of Fm .


Now recall that any two norms on Fm are equivalent (Theorem 4.3) and appeal
to Lemma 5.16.
(3) For n > 1 let
Dn = { (a1 , . . . , an , 0, 0, . . .) | aj ∈ CF }.
Then each Dn is countable [finite product of countable sets] and so D := ∪_n Dn is
countable [countable union of countable sets].

We claim D is dense in `1 . Let x = (xj ) belong to `1 and fix ε > 0. Since x ∈ `1 ,
we can choose N such that Σ_{n>N} |xn | < ε. So
kx − x^(N) k = Σ_{n>N} |xn | < ε, where x^(N) = (x1 , . . . , xN , 0, 0, . . .).

By (2) we can choose (a1 , . . . , aN ) ∈ (CF )N such that k(x1 , . . . , xN )−(a1 , . . . , aN )k1 <
ε. Let s = (a1 , . . . , aN , 0, 0, . . .). Then s ∈ D and kx − sk < 2ε. 

These examples illustrate techniques which are available more widely. As ever, the
span of a set S in a vector space X is
span(S) := { Σ_{i=1}^k λi si | k = 1, 2, . . . , λi scalar, si ∈ S }.

Crucially, we are only allowed to form finite linear combinations here. There is no pre-
sumption here in general that S is countable.

The import of the following theorem is, loosely, that we can build in the separability
of the scalars and facts about countable sets to arrive at a viable test for separability
which avoids the need for messy approximation arguments.

5.19. Theorem (testing for separability).


(i) Let Y be a normed space and let S be a countable set such that span(S) = Y . Then
Y is separable.
(ii) Let X be a normed space such that span(S) is dense in X, where S is countable.
Then X is separable.
(iii) Let X be a normed space and let S be countable and such that span(cl(S)) is dense
in X, where cl(S) denotes the closure of S in X. Then X is separable.

Proof. (i) Every element of Y is a finite linear combination of elements of S. The set T
of finite linear combinations of elements from S with scalars drawn from the countable
set CF is countable:
T = ∪_{n>1} { Σ_{j=1}^n aj sj | aj ∈ CF , sj ∈ S }
[countable union of countable sets]. Moreover T is dense in Y (either slog it out or exploit
the fact that addition and scalar multiplication are continuous (see 0.8)).

(ii) By (i), applied with Y = span(S), we see that Y is separable. Let D be countable
and dense in Y . Write cl_Y and cl_X for closures taken in Y and in X respectively. Then
cl_Y (D) = Y and cl_X (Y ) = X.
But (by elementary topology),
cl_Y (D) = cl_X (D) ∩ Y,
so Y ⊆ cl_X (D) and, since cl_X (Y ) = X by assumption, it follows that D is dense in X. 

(iii) The facts underlying the proof here are that, in a normed space X, the span of a
subset P is the smallest subspace containing P and that the closure of a subset Q is the
smallest closed set containing Q.
We have
S ⊆ span(S) =⇒ cl(S) ⊆ cl(span(S))
            =⇒ span(cl(S)) ⊆ cl(span(S))   (closure of a subspace is a subspace)
            =⇒ cl(span(cl(S))) ⊆ cl(span(S))   (property of closure).
Also
S ⊆ cl(S) =⇒ span(S) ⊆ span(cl(S))
          =⇒ cl(span(S)) ⊆ cl(span(cl(S))).
Putting the inclusions together, we get cl(span(cl(S))) = cl(span(S)). Hence span(S) is dense
in X and X is separable, by (ii). 

5.20. Applications: proofs that particular spaces are separable.


We first revisit our initial examples of separable spaces from 5.18 and then add some
new ones.

(1) Fm has a countable spanning set so is covered by Theorem 5.19(i).

(2) `p is separable for 1 6 p < ∞.


This follows from Theorem 5.19(ii) and the fact that the ‘coordinate vectors’
en = (δnj ) form a countable set whose span is dense.
(3) Easy exercise: prove c0 is separable. A slightly less easy result is that c, the space
of convergent sequences, is separable; see Problem sheet Q. 18.

We now exploit earlier density results by showing that we can find suitable countable
subsets S such that span(S) is dense or span(cl(S)) is dense.

(4) C[a, b] (real-valued continuous functions on [a, b] ⊆ R, with sup norm) is separable.
More generally, C(K), for K a compact subset of Rn , is separable.

Proof. For C[a, b]: Take S to be the set {1, x, x2 , . . .}. Its span is the subspace of
all polynomials, and this is dense by Weierstrass’s Theorem. For the general result,
take S to be the set of monomials in variables x1 , . . . , xn , that is, expressions x1^{q1} · · · xn^{qn} ,
where each qi ∈ N; note that S is countable. SWT (subalgebra form) implies span(S)
is dense. 

(5) Lp (R) (1 6 p < ∞) is separable.

Proof. Apply Theorem 5.19(iii) with S as the set of characteristic functions of bounded
intervals with rational endpoints. Here the closure of S in the Lp norm contains
all characteristic functions of bounded intervals. Then span(cl(S)) contains the step
functions (why?) and so is dense. [Assumed fact: the step functions are dense in
Lp (R).] 

It is a salutary exercise to attempt to prove these results directly from the definition
of separability. For example, you might try to show that a suitable countable set of
piecewise linear functions is dense in C[a, b]. It’s messy! Not recommended.
Likewise, direct construction of a countable dense subset in an Lp space is messy.

Now for another general result, which we present in the setting of metric spaces.

5.21. Theorem. Let Y be a subset of a separable metric space (X, d). Then (Y, d) is
separable.

Proof. Note that we need to find a countable dense subset in Y .


Let DX = {xk }k∈N be a countable dense set in X. Then for each k, there exist points
yk,j (j = 1, 2, . . .) in Y such that
d(yk,j , xk ) < dist(xk , Y ) + 1/j ;
this comes just from the definition of the distance of a point from a subset in a metric
space.
The set DY := { yk,j | k, j ∈ N } is countable [N × N is countable]. We claim DY is
dense in Y . This needs an ε/3-argument. Take ε > 0 and y ∈ Y . By density of DX in X
we can find xk such that d(y, xk ) < ε/3. This implies that dist(xk , Y ) < ε/3. Now there
exists j such that 1/j < ε/3. Then
d(y, yk,j ) 6 d(y, xk ) + d(xk , yk,j )
           6 ε/3 + 1/j + dist(xk , Y )
           6 ε/3 + ε/3 + ε/3 = ε.
Hence DY is dense in Y , as claimed. 

5.22. Separable subspaces: examples and remarks.


(1) Cc (R) (continuous real-valued functions of compact support, sup norm) is separable.
Proof. Use the fact that Cc (R) = ∪_n Cc^(n) (R), where Cc^(n) (R) is the space of continu-
ous functions on R which vanish outside [−n, n]. Each Cc^(n) (R) has a countable dense
set, Dn say, since it can be identified with a subset of C[−n, n], which is separable.
(It is necessary to say ‘a subset of’ here. Why?) Then ∪_n Dn is dense in Cc (R). 
(2) Observe that we can’t get much mileage out of Theorem 5.21 for simplifying our
arguments in 5.20 that particular spaces are separable, since we have few instances
where the theorem would apply. Certainly c0 , with its k · k∞ -norm, is a subspace
of `∞ . Since the latter is inseparable, this does not tell us whether c0 is separable
or not. On the other hand, once c is known to be separable, then separability of c0
would follow.
(3) When we apply Theorem 5.21 with the metric d coming from a norm, then the
subspace Y is required to carry the norm inherited from X. For example, although
`1 ⊆ `2 (as sets), we could not deduce separability of (`1 , k · k1 ) from separability of
(`2 , k · k2 ) because the norms are not the same.

5.23. Addendum: bases in separable normed spaces?


It might be tempting to think that in separable normed spaces one could develop a
notion of bases, where a ‘basis’ would be a countable set of vectors forming a linearly
independent spanning set. But this is highly problematic. We certainly have no assurance
that sums of the form Σ_{n=1}^∞ λn xn will be norm-convergent. There is a notion of basis
(called a Hamel basis) available in an arbitrary vector space V , whereby a set S is a
spanning set for V if every v ∈ V is a finite linear combination of elements of S and S is
linearly independent if every finite subset of S is linearly independent in the usual sense.
But this is of little practical use: it can be shown with the aid of the Baire Category
theorem from FA-II that a Hamel basis for a Banach space is either finite or uncountable.

There is one very special case in which a separable normed space X does have available
a good notion of basis. When X is a separable Hilbert space, the notion of orthonormal
basis works well. An orthonormal sequence (xn ) is an orthonormal basis if hx, xni = 0
for all n implies x = 0. This concept is explored in FA-II. It has connections with or-
thogonal expansions as encountered in certain courses in applied analysis and differential
equations. The Projection Theorem, stated in 2.12, underpins the theory.

We do, however, have a simple positive result which holds in any separable normed
space.

5.24. Separable normed spaces: a density lemma. Let X be a separable normed


space and Y a subspace of X. Then there exists a sequence
Y = L0 6 L1 6 L2 6 · · ·
of subspaces of X such that

L∞ := ∪_{n=1}^∞ Ln
is a dense subspace of X.

Proof. Let D = {s1 , s2 , . . .} be a countable dense subset of X. Define (Ln ) recursively by


L0 = Y, Ln+1 = span (Ln ∪ {sn+1 }) (n > 0).
It is elementary to show that L∞ , as in the statement of the lemma, is a subspace. It
remains to prove L∞ is dense. First observe that D ⊆ L∞ , since each member of D
belongs to some Ln . Finally
X = cl(D) ⊆ cl(L∞ ) ⊆ X, where cl denotes closure in X.
So we get equality throughout. 

We conclude this section with a general theorem on the theme ‘sometimes it is good
enough to have information on a dense subspace’. We shall exploit this and Lemma 5.24
when presenting a proof of the Hahn–Banach Theorem for separable spaces in the next
section. The theorem is also needed in FA-II. Note the crucial requirement that the
operator must map into a Banach space.

5.25. Extension Theorem for a bounded operator on a dense subspace. Let Z


be a dense subspace of a normed space X. Assume that Y is a Banach space and let
∼ ∼
T ∈ B(Z, Y ). Then there exists a unique T ∈ B(X, Y ) such that T Z = T . Moreover

kT k = kT k.

Proof. Let x ∈ X. Then there exists a sequence (zn ) in Z such that kx − zn k → 0. Now,
because T is bounded and linear,
kT zm − T zn k 6 kT k kzm − zn k 6 kT k (kzm − xk + kx − zn k) .
We deduce that (T zn ) is a Cauchy sequence in Y . This converges to an element y ∈ Y ,

depending on x. Denote it by T̃ x. But we now must confront a possible issue of well-
definedness.
Suppose we have a rival sequence (zn0 ) in Z which also converges to x. We need to
show that lim T zn = lim T zn0 . To this end consider
kT zn − T zn0 k 6 kT k kzn − zn0 k 6 kT k (kzn − xk + kx − zn0 k) .
Since the RHS tends to 0, the limits lim T zn and lim T zn0 (which we know exist) are

the same. Therefore T̃ x can be unambiguously defined to be lim T zn where (zn ) is any
sequence converging to x. In addition we see from this that T̃ extends T (consider a
constant sequence). Uniqueness of the extension also follows because any continuous

extension of T to X, say S, must be such that, for zn → x as above,
Sx = lim Szn = lim T zn = T̃ x.

Linearity of T̃ comes from the continuity of addition and scalar multiplication in a
normed space. In more detail, let x, v ∈ X with approximating sequences zn → x and
wn → v, with zn , wn ∈ Z for all n. Let λ, µ ∈ F. Then λzn + µwn → λx + µv. But
T (λzn + µwn ) = λT zn + µT wn → λT̃ x + µT̃ v
and, by the construction of T̃ , we also have T (λzn + µwn ) → T̃ (λx + µv). By uniqueness of
limits in a normed space we see that T̃ is indeed a linear map from X into Y .
Moreover, by continuity of norm,

kT̃ xk = k lim T zn k = lim kT zn k 6 lim kT k kzn k = kT k kxk.
Hence T̃ is bounded, with norm 6 kT k. But because T̃ extends T we must have the
reverse inequality too, so we get equality. 

6. Dual spaces and the Hahn–Banach Theorem

In which we present the Hahn–Banach Theorem and give the proof for
a separable normed space. In which too we begin to reveal the HBT’s
powerful and far-reaching consequences.

6.1. The dual space of a normed space.


Let X be any normed space. We define the dual space of X to be
X ∗ := B(X, F);
its elements are the continuous (alias bounded) linear functionals on X, equipped with
the operator norm
kf k := sup{ |f (x)|/kxk | x 6= 0 }
(the (Op1) formula used here; any of (Op2), (Op3), (Op4) is available instead, as you
prefer). We shall repeatedly use the fact that, for any f ∈ X ∗ ,
|f (x)| 6 kf k kxk for all x ∈ X.

Recall that X ∗ is always a Banach space, whether or not X is complete (see Corol-
lary 3.9): completeness of X ∗ comes from the completeness of the scalar field. Warning:
choice of notation (X ∗ or X 0 ) for bounded linear functionals is not uniform across the
literature and past exam questions.
We slot in here an easy consequence of our Extension Theorem for bounded linear
operators, Theorem 5.25. It tells us that a normed space and any dense subspace of it
have essentially the same dual space.

6.2. Proposition. [dual space of a dense subspace] Suppose Z is a dense subspace of a


normed space X. Then the map J : X ∗ → Z ∗ given by Jf = f ↾Z is a (vector space)
isomorphism which is isometric (kJf k = kf k for all f ∈ X ∗ ).

Proof. Certainly J is well-defined, linear and continuous. Theorem 5.25 ensures that J
is surjective and an isometry. 

We now give a result about linear functionals which is not a specialisation of one true
for linear operators.

6.3. Proposition. Let X be a normed space and f a linear functional on X.

(i) Assume f 6= 0. Then there exists x0 ∈ X such that

X = span(ker f ∪ {x0 }).

Here x0 may be chosen such that f (x0 ) = 1.


(ii) f is bounded if and only if ker f is closed.

Proof. (i) Take x0 with f (x0 ) 6= 0. For any x ∈ X,


x − (f (x)/f (x0 )) x0 ∈ ker f.
Hence X = span(ker f ∪ {x0 }). To obtain the final assertion note that the span is
unchanged if we replace x0 by x0 /f (x0 ) and that f (x0 /f (x0 )) = 1.

(ii), =⇒ direction: ker f = f −1 ({0}), and this is closed if f is continuous.

⇐= direction: Result is trivial if f = 0. So assume f 6= 0. Take x0 as in (i), where


we may assume that f (x0 ) = 1. Denote ker f by Z. Let x ∈/ Z. Because Z is closed,

0 < dist(x0 , Z) := inf{ kx0 − zk | z ∈ Z }.

Then x − f (x)x0 ∈ Z so x/f (x) − x0 ∈ Z too. Hence



kx/f (x)k > δ := dist(x0 , Z).

Hence |f (x)| 6 δ −1 kxk for all x ∈


/ Z. This inequality holds, trivially, for x ∈ Z too.
Hence f is bounded with kf k 6 δ −1 . 

6.4. Cautionary examples.

(1) Example of an unbounded linear operator with closed kernel. Let X := span{en }n>1
in `1 , where as usual en = (δnj ). Then X is a dense and proper subspace of `1 .
Define T : X → `1 by T (xj ) = (jxj ). Then ken k1 = 1 and kT en k1 = n, so T is
unbounded. Clearly ker T = {0}.

(2) Example of an unbounded linear functional. Let X be the subspace of C[0, 1] (real-
valued functions, sup norm) such that X consists of the continuously differentiable
functions. Let f : X → R be given by f (x) = x′(1/2) for x ∈ X. Then f is linear
but not bounded.

6.5. Hahn–Banach Theorem: context.


At its heart, the Hahn–Banach Theorem asserts the existence on normed spaces of
bounded linear functionals with particular properties.
The theorem comes in many formulations and FA-I only covers the most basic ones
and requires proofs only for separable spaces. The HBT is not a theorem about separable
spaces. But its proof is more elementary in that case. [Comments in 6.13 on using set
theory machinery (Zorn’s Lemma) to remove the separability restriction.]
The proof of the core form of the HBT (6.6) involves inequalities between scalar-
valued functions, and so is available only for real normed spaces. However, as we are
able to show quite easily, the statement of the theorem remains valid for complex normed
spaces too (see 6.7). Virtually all the spin-off applications we shall give are valid for real
or complex normed spaces.
There are some versions of HBT with a geometric flavour and these won’t extend to
complex normed spaces. We give only hints in FA-I of such theorems (see 6.15). They
are however important in more advanced courses, including those focusing on functional
analytic methods for PDE’s.

6.6. HAHN–BANACH THEOREM. Let Y be a subspace of a real normed space X.


Let g ∈ Y ∗ . Then there exists f ∈ X ∗ such that
f ↾Y = g and kf kX = kgkY .
Thus the bounded linear functional g on Y has an extension to a bounded linear func-
tional f on the whole of X, with the same norm. There is no requirement that Y be
closed or that X be a Banach space.
Proof [for the case that X is separable]
HBT proof, stage 1. [Separability used here.] Lemma 5.24 allows us to construct an
increasing sequence of subspaces (Ln ) with L0 = Y and the subspace L∞ := ∪_n Ln dense.

We shall extend g step by step to obtain a norm-preserving extension of g to L∞ —the


central part of the proof. This depends on a key lemma.
HBT proof, stage 2: One-step Extension Lemma. [The proof here relies crucially
on the scalars being R.] Let Z be a subspace of a real normed space X and let g ∈ Z ∗ =
B(Z, R). Let W := span(Z ∪ {y}) where y ∈ / Z. Then there exists h ∈ W ∗ such that
h↾Z = g and khkW = kgkZ .
Proof. We may assume wlog that kgkZ = 1. Every element of W is uniquely expressible
as
w = z + λy for some z ∈ Z, λ ∈ R.
If h : W → R is any linear functional extending g then
h(w) = g(z) + λh(y).
Our task is to show that we can choose the value of c := h(y) to make h bounded, with
norm 1. We need in particular
−kz + yk 6 g(z) + c and g(z) + c 6 kz + yk.
Equivalently c has to satisfy
(†) −kz + yk − g(z) 6 c 6 kz + yk − g(z).
Now for z, z 0 ∈ Z,
g(z) − g(z 0 ) = g(z − z 0 ) 6 kz − z 0 k = k(z + y) − (z 0 + y)k 6 kz + yk + kz 0 + yk.

Hence
α := sup_{z′∈Z} {−g(z′ ) − kz′ + yk} 6 inf_{z∈Z} {−g(z) + kz + yk} := β.

We then choose c ∈ [α, β]. Then, working backwards, we see that our requirements in
(†) are satisfied. But so far we’ve not considered what h will do to a general element
of W \ Z. We shall now confirm that khk 6 1 when c is chosen as above. For λ 6= 0,
|h(z + λy)| = |λ| |g(z/λ) + c| 6 |λ| k(z/λ) + yk = kz + λyk;
the idea here is to exploit the fact that Z is closed under scalar multiplication: we call
on (†) with z/λ in place of z, considering separately the cases λ > 0 and λ < 0 to arrive
at the required inequality. Finally note that h, as an extension of g, has khk > kgk. 

HBT proof, stage 3: Extension of g defined on L0 to k defined on V := L∞ .


By applying the One-step Extension Lemma a finite number of times we can extend
g to hN ∈ B(LN , R) for any finite N , where khN k = kgkY . (It may happen that Lm+1 = Lm
for certain values of m. No extension is needed to pass from Lm to Lm+1 in such cases.)
Since N is arbitrary we can see that this process gives us a well-defined extension k
to V for which kkkV = kgkY .

HBT proof, stage 4: The last lap: exploit density of V := L∞ :


We can call on our Extension Theorem 5.25 to extend k ∈ B(V, R) to f ∈ B(X, R)
with kf kX = kkkV = kgkY .

6.7. Hahn–Banach Theorem, complex case. Let X be a complex normed space. Let
Y be a subspace of X and let g ∈ Y ∗ . Then there exists f ∈ X ∗ such that f ↾Y = g and
kf kX = kgkY .

Proof. We first consider real and imaginary parts of linear functionals on a complex vector
space X. Let f be a C-linear map from X into C. Write f (x) = u(x) + iv(x) for each
x ∈ X. Then (just calculate) u and v are real-linear and v(x) = −u(ix) and
f (x) = u(x) − iu(ix).
Conversely, given an R-linear functional u : X → R, we can define f in terms of u as
above to obtain a C-linear functional f : X → C with re f = u.
Now consider a complex normed space X, subspace Y and a complex-linear bounded
linear functional g on Y . Restricting to real scalars, regard X as a real space XR with
Y as a real subspace YR , and let u = re g. Apply Theorem 6.6 to extend u from YR to
a linear functional w on XR with kukYR = kwkXR . Define f by f (x) = w(x) − iw(ix).
This is C-linear and extends g. It remains to check that kf kX = kgkY . First of all,
|u(y)| = |re g(y)| 6 |g(y)| for all y ∈ Y . Hence kukYR 6 kgkY . Consider x ∈ X. Then
f (x) ∈ C. We can choose θ such that |f (x)| = e^{iθ} f (x). Then
|f (x)| = f (e^{iθ} x) = re f (e^{iθ} x) = w(e^{iθ} x) 6 kwkXR ke^{iθ} xk 6 kgkY kxk.
So kf kX 6 kgkY and the reverse inequality is trivial. 

6.8. A simple example using HBT.


Let X be a normed space over F. Let S = {xi }i∈I be a subset of X and {ci }i∈I a
corresponding set of scalars, and let M be a finite non-negative constant. Prove
that a necessary and sufficient condition for there to exist a continuous linear functional
f ∈ X ∗ such that f (xi ) = ci for all i ∈ I and kf k 6 M is that, for every finite subset J
of I,
| Σ_{j∈J} λj cj | 6 M k Σ_{j∈J} λj xj k

for any choice of scalars λj (j ∈ J).

Solution.
=⇒ : Assume f exists. Then, for any J and any scalars λj ,

| Σ_{j∈J} λj cj | = | Σ_{j∈J} λj f (xj ) | = | f ( Σ_{j∈J} λj xj ) | 6 kf k k Σ_{j∈J} λj xj k 6 M k Σ_{j∈J} λj xj k .

⇐=: Let Y be the subspace of X spanned by S, so that the elements of Y are all
finite linear combinations of elements of S. “Define” f on Y by
f ( Σ_{j∈J} λj xj ) = Σ_{j∈J} λj cj .

The key point to observe is that the assumed condition implies that f is well defined:
express a vector x ∈ Y as x = Σ_{j∈J} λj xj and x = Σ_{k∈K} µk xk , where J, K are finite
subsets of I. Consider
Σ_{j∈J} λj xj − Σ_{k∈K} µk xk

and note that this is an element of Y to which the given condition can be applied to
show that the two representations of x give the same value for f (x). Moreover f is linear
(routine) and bounded, with kf kY 6 M . Also by construction each xi ∈ Y and f (xi ) = ci
for each i ∈ I. Now apply HBT to extend f from Y to X. 

Corollaries of HBT tumble out fast from Theorems 6.6 and 6.7 with little work needed.
We state them as results in their own right but they should be seen as direct or indirect
consequences of the main theorem. When not specified the scalar field may be either R
or C.
The idea in all cases is to let Y be a subspace containing the vectors about which we
want information, to define a bounded linear functional on Y to capture this information,
and then to let HBT do the rest.

Our first result tells us that a normed space (other than {0}) will always have a
plentiful supply of non-zero bounded linear functionals.

6.9. Proposition. Let X be a normed space and 0 6= x0 ∈ X. Then


(i) there exists f ∈ X ∗ such that
f (x0 ) = kx0 k and kf k = 1;
(ii) there exists f ∈ X ∗ such that
f (x0 ) = 1 and kf k = 1/kx0 k.

Proof. (i) Define Y = span({x0 }). Define g on Y by


g(λx0 ) = λkx0 k for λ any scalar.
Then g is linear and
kgkY = sup_{λ6=0} |g(λx0 )| / kλx0 k = 1.
Now obtain the required f by extending g from Y to X.
For (ii), either rescale the functional obtained in (i) (divide it by kx0 k), or more directly take g on
Y = span({x0 }) defined by g(λx0 ) = λ and apply HBT. 

6.10. Proposition (separating a point from a closed subspace). Let X be a


normed space. Let M be a proper closed subspace of X and let x ∈ X \ M . Then
there exists f ∈ X ∗ such that
f (x) = 1,    f = 0 on M,    kf k = 1/dist(x, M ).

Proof. Note that because M is closed and x ∈/ M , necessarily dist(x, M ) > 0.
Define Y = span(M ∪ {x}). Note that each element of Y has a unique representation
as y = m + λx, where m ∈ M and λ is a scalar. Let g : Y → F be given by
g(m + λx) = λ (m ∈ M, λ ∈ F).
Then g is well defined and linear and by construction g(m) = 0 for all m ∈ M and
g(x) = 1. Finally,
kgk = sup{ |g(y)|/kyk : 0 6= y ∈ Y }
    = sup{ |λ|/km + λxk : m ∈ M, λ 6= 0, m + λx 6= 0 }
    = sup{ 1/kx + λ−1 mk : m ∈ M, λ 6= 0, m + λx 6= 0 }
    = 1/ inf{ kx + m0 k : m0 ∈ M }
    = 1/dist(x, M ).

Now apply HBT. 

6.11. Theorem (density and bounded linear functionals). Let X be a normed


space and let S ⊆ X. Let M = span(S), the closure of the linear span of S.
(i) If there exists f ∈ X ∗ such that f = 0 on S but f 6≡ 0 then the linear span of S is
not dense.
(ii) M = X if, for all f ∈ X ∗ , f (s) = 0 for all s ∈ S implies f = 0.

Proof. For any f ∈ X ∗ ,


f (s) = 0 for all s ∈ S =⇒ f = 0 on span(S) (since f is linear)
=⇒ f = 0 on M , the closure of span(S) (since f is continuous).

This establishes the contrapositive of (i), and hence (i) itself.

For (ii): Observe that M , being a closure, is a closed subspace. If M is proper then there
exists x ∈ X \ M . In that situation we can find f ∈ X ∗ as in Proposition 6.10 which is
zero on S but not identically 0. 

6.12. Remarks on Theorem 6.11.


Both parts of the theorem are important. Note that (i)—a test for non-density—does
not use HBT. Problem sheet Q. 19 provides an example of non-density established by the
strategy in (i).

Part (ii) should be seen as a density theorem. Given a subset S of a normed space X
we can find out whether span(S) is dense by showing that an element f of X ∗ for which
f (s) = 0 for all s ∈ S has to be identically zero. But in order for this to be useful we
shall need to describe the dual space of X. We address in the next section the problem of
giving concrete descriptions for various of our familiar normed spaces and illustrate the
use of Theorem 6.11 (further examples in problem sheets).

In subsequent sections we shall also see that there is much more to be said about
HBT applications in general. In particular HBT plays a key part in the investigation of
duals of bounded linear operators.

Annexe to Section 6 [optional extras]: More about the Hahn–Banach Theorem


Here we stray beyond the narrow confines of the Part B syllabus in two directions.

6.13. The proof of Theorem 6.6 without the restriction to separable spaces.
[This subsection is aimed at those planning to take Part B Set Theory, and at the naturally
curious.]
We need the principle of set theory known as Zorn’s Lemma. The idea is to extend
g, a bounded linear functional on Y , without change of norm, to a maximal possible
domain, and we hope to show that there is an extension to the whole of X. Let’s consider
the set of all possible extensions:
E := { h ∈ B(Z, R) | Z is a subspace of X with Y ⊆ Z, h|Y = g and khkZ = kgkY }.
Then we can regard E as being partially ordered (formally one says h1 6 h2 iff graph h1 ⊆
graph h2 , which is a fancy way of saying that h2 ’s domain contains h1 ’s domain and that
h2 extends h1 ).
Suppose we have an element of E which is maximal, that is, an extension f of g, with
domain Z say, which cannot be extended any further. Then EITHER Z = X and we
have the extension we were seeking OR Z 6= X and we can apply the One-step Extension
Lemma to extend f , without change of norm, to span(Z ∪ {x}) for some x ∈ X \ Z, thereby contradicting maximality. But how do we
get a maximal element?

Zorn's Lemma: Let E be a non-empty family of sets partially ordered
by inclusion and assume that ∪i∈I Si ∈ E whenever {Si }i∈I is a chain of
elements of E, meaning that for any i, j ∈ I, either Si ⊆ Sj or Sj ⊆ Si .
Then E has an element which is maximal (with respect to set inclusion).
The assumption that E 6= ∅ is important. In the HBT example the initial functional
g on Y gives an element of E.
The idea of a chain of subsets isn’t new: our subspaces {Ln } constructed in ?? and
used in the HBT proof 6.6 form a chain of sets indexed by N and their union is a subspace
to which g naturally extended. For the general case the idea is that extensions belonging
to a chain are compatible, allowing us to take their union and get another extension.
Similarly, in algebra the union of a chain of proper ideals in a commutative ring with 1
is a proper ideal; applying Zorn’s Lemma we get that every proper ideal is contained in
a maximal ideal.

6.14. Relaxing the assumptions in the statement of the real HBT.


We used the norm function to measure how big an extension of a given functional g
is. But careful inspection of the proof of Theorem 6.6 shows that we didn’t make use of

all the properties of a norm. With X a real vector space we could have replaced k · k by
a sublinear functional p : X → R satisfying
p(x + y) 6 p(x) + p(y) ∀x, y ∈ X,
p(αx) = αp(x) ∀x ∈ X, α > 0.
The conditions retain (N3), weaken (N2) to positive homogeneity, and abandon (N1) altogether.
Obviously any norm, any linear functional and any seminorm is a sublinear functional.

Real HBT, sublinear functional form. Let X be a real vector space
equipped with a sublinear functional p. Assume that Y is a subspace of X
and that g : Y → R is linear and such that g(y) 6 p(y) for all y ∈ Y . Then
there exists a linear functional f on X which extends g and is such that
f (x) 6 p(x) for all x ∈ X.

6.15. Real Hahn–Banach Theorem: a geometric view. There is a very extensive


literature on geometric forms of the HBT, of which we give only the briefest possible
glimpse.
Recall from 6.3 that the kernel of a non-zero linear functional f on X has codi-
mension 1, that is, that there exists x0 such that X = span(ker f ∪ {x0 }). This means
that we may think of a set { x ∈ X | f (x) = α }, which is a translate of the subspace
ker f , as a hyperplane, Hα . This may be viewed as dividing X \ Hα into two half-
spaces, { x ∈ X | f (x) < α } and { x ∈ X | f (x) > α }, whose closures intersect in Hα
(assuming f is continuous).
One may then think of Proposition 6.9(i) as constructing a hyperplane which is tan-
gent to the closed unit ball at a vector x0 of norm 1.
In the other direction, one may, for example, seek to separate a closed convex set V
in a real normed space from a point x not in V by a hyperplane given by an element f
of X ∗ so that V lies in one of the associated open half-spaces and x in the other.
For a systematic approach to results of this type (of which there are many variants) one
can proceed in a manner similar to that used with a sublinear functional p, employing a
convex real-valued function F in place of p and replacing the inequality g(y) 6 p(y) for
y ∈ Y by g(y) 6 F (y), and likewise for extensions. But obtaining HBT in the convex
functions form and its applications are a specialised topic. The work involved in carrying
this through is disproportionate if one’s primary interest is in the basic applications of
HBT given earlier in this section and in the following ones.

7. Dual spaces of particular spaces; further applications of the


Hahn–Banach Theorem

In which we investigate the dual spaces of some familiar normed spaces


and illustrate how these can be exploited in conjunction with the Hahn–
Banach Theorem. In addition we apply HBT techniques to obtain various
theoretical results.

7.1. The dual space of a finite-dimensional vector space.


Revise your Part A Linear Algebra! Let X be a finite-dimensional vector space over
F, where F is R or C. We define X 0 to be the space of linear functionals (= linear maps
from X into F) with addition and scalar multiplication defined pointwise. Then

• for any given basis {e1 , . . . , em } of X there exists a dual basis {e01 , . . . , e0m } for
X 0 with e0j (ei ) = δij , and hence
• dim X 0 = dim X (so X and X 0 are isomorphic).
Moreover,
• there is a canonical (= basis-free) isomorphism from X onto its second dual X 00
given by x 7→ εx , where εx (f ) = f (x) for all f ∈ X 0 .
A key example: X = Rm . Here, for any y ∈ Rm we get a linear functional fy : x 7→ x · y,
where · denotes scalar product. Writing x = (x1 , . . . , xm ) and y = (y1 , . . . , ym ) the formula
is

fy (x) = x1 y1 + · · · + xm ym .

Taking the standard basis {e1 , . . . , em } for Rm , we find that fej (ei ) = δij , so that the dual
basis is {fe1 , . . . , fem }.
If we identify fy with y then we may think of (Rm )0 as being Rm . [If one thinks of the
vectors x as column vectors and the vectors y as row vectors then y(x) is simply given
by matrix multiplication.]

When we make the transition from finite-dimensional spaces (pure Linear Algebra)
to normed spaces, we want to consider continuous linear functionals. If X is finite-
dimensional then every linear functional on X is automatically continuous, by Theo-
rem ??. This means that we already know from 7.1 how to describe the elements of X ∗
and how they act on elements of X. The only novel feature is the norm structure, in
particular how the chosen norm in X relates to the operator norm of X ∗ . Not for the
first time there are parallels between `p spaces and spaces (Fm , k · kp ); see 7.5 below.

7.2. Dual spaces of particular normed spaces.


Having an explicit description of the dual space X ∗ of a given normed space X gives us
a lot of valuable information we can exploit in various ways to solve problems involving X.
Specifically, given X, we would like to find a space (Y, k · k) (it’ll necessarily be a
Banach space) and a map J : Y → X ∗ such that
(A) J is well defined, that is, Jy is a bounded linear functional on X for each y ∈ Y ;
(B) J is linear;
(C) kJyk 6 kyk for all y ∈ Y , so J is bounded;
(D) kJyk > kyk for all y ∈ Y (assuming (C) holds, this forces J to be injective);
(E) J is surjective.
Here (C) and (D) together show that J is isometric, that is, kJyk = kyk. Also (C), (D)
and (E) together imply that J −1 is a well-defined map which is linear and bounded.
When such a J exists we say that X ∗ is isometrically isomorphic to Y and write
X ∗ ∼= Y . This is the appropriate notion of isomorphism for normed spaces, telling us
that the spaces involved can if we wish be identified.
Finding a candidate for Y and verifying (A)–(E) may be fairly easy (sequence spaces,
except `∞ ); may involve sophisticated machinery to carry through in full (spaces of con-
tinuous functions, Lebesgue spaces, except L∞ ); or the dual space may be unwieldy and
really hard to describe (`∞ , L∞ ). See 7.5 for more explanation.

7.3. Theorem (dual spaces of sequence spaces).


(1) (c0 )∗ ∼= `1 ;
(2) (`1 )∗ ∼= `∞ ;
(3) (`2 )∗ ∼= `2 ;
(4) (`p )∗ ∼= `q , for 1 < p < ∞, where p−1 + q −1 = 1.
In each of the four cases, the dual space X ∗ of the sequence space X is identified via
an isometric isomorphism J with another sequence space, Y say. When this identification
is made, the action of elements of Y on elements of X is given by
(ηj )(xj ) = Σj xj ηj for all (xj ) ∈ X, (ηj ) ∈ Y.

Proof. To illustrate the strategy we prove (1) in gory detail, following the checklist (A)–
(E), and paying particular attention to issues of well-definedness, which are often linked
to convergence issues.

(A) Defining J: Given x = (xj ) ∈ c0 and η = (ηj ) ∈ `1 , “define”


(Jη)(x) = Σj xj ηj .

Check:
(i) (Jη)(x) ∈ F: we need Σj xj ηj to converge. Proof: Σj |xj ηj | converges by
comparison with Σj |ηj | since (xj ) is bounded; hence Σj xj ηj converges.
(ii) Jη is linear for each η: We need to show (Jη)(λx+µx0 ) = λ(Jη)(x)+µ(Jη)(x0 ),
for x, x0 ∈ c0 and λ, µ ∈ F. This is routine to check.
(iii) Jη is bounded for each η:
|(Jη)(x)| = |Σj xj ηj | 6 Σj |xj ηj | 6 (supj |xj |) Σj |ηj |;

here we have used the fact that Σj |xj ηj | converges (from (i)) and the Triangle
Inequality for infinite sums of scalars (Prelims exercise). Therefore
kJηk 6 kηk1 .

(B) J is linear: We need to show that, for η, η 0 ∈ `1 and λ, µ ∈ F,


J(λη + µη 0 ) = λJη + µJη 0 ,
that is, that
(J(λη + µη 0 ))(x) = (λJη + µJη 0 )(x) for all x ∈ c0 ,

that is, that

(J(λη + µη 0 ))(x) = λ(Jη)(x) + µ(Jη 0 )(x) for all x ∈ c0 ,


because vector space operations in a dual space are defined pointwise. This last
equation is straightforward to check. [(B) is routine but clear thinking is required
about what needs to be checked.]
(C) Upper bound on kJηk: Already proved in (A)(iii).
(D) Lower bound on kJηk: We let 0 6= η = (ηj ) ∈ `1 and look at the effect of Jη on
suitably chosen vectors in c0 to get a lower bound for kJηk. For n = 1, 2, . . ., define
x(n) = (xnj )j>1 by

xnj = |ηj |/ηj if j 6 n and ηj 6= 0,    and xnj = 0 otherwise.


Note that each x(n) ∈ c0 . Moreover x(n) has norm 1 if x(n) 6= 0 and this is the case
for large enough n. Then, for such n,
kJηk > |(Jη)(x(n) )| = Σj6n |ηj |.

It follows that kJηk > Σj |ηj | = kηk1 . (There is nothing to do if η = 0.)
(E) J surjective: Take any f ∈ (c0 )∗ . Let η = (ηj ) where ηj := f (ej ). We claim that
f = Jη. This requires two steps.
(i) (ηj ) ∈ `1 : We have to show that Σj |f (ej )| converges. By Prelims Analysis this
will be true provided the partial sums are bounded above. We make use of x(n)
as defined in (D), where we now take ηj = f (ej ). Then by linearity of f we have

f (x(n) ) = f (Σj6n xnj ej ) = Σj6n xnj f (ej ) = Σj6n |ηj |.

Hence, for all n,


kf k > |f (x(n) )| = Σj6n |ηj |,
as required.
(ii) (Jη)(x) = f (x) for any x = (xj ) ∈ c0 :
|f (x) − Σj6n xj ηj | = |f (x) − f (Σj6n xj ej )| 6 kf k kx − Σj6n xj ej k = kf k supj>n |xj |.

Since x ∈ c0 we deduce that



X
f (x) = xj ηj = (Jη)(x).
j=1

This completes the proof of (1). The way that elements η of `1 act as linear functionals
on elements x of c0 is immediate from the proof we have given.
The characterisations (2), (3) and (4) are handled likewise, with (A)–(C) requiring
almost no adaptation and so routine. Variants are needed for (D) for different spaces, and
(E) is then modified accordingly. Check the details for yourself, referring to textbooks
for confirmation. 
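
By way of illustration, here is a sketch (recorded in LaTeX; the details should still be checked against a textbook) of how step (D) adapts in case (4). Given 0 6= η = (ηj ) ∈ `q , take

  x^{(n)}_j=\begin{cases}|\eta_j|^q/\eta_j & \text{if } j\le n \text{ and } \eta_j\ne 0,\\ 0 & \text{otherwise,}\end{cases}
  \qquad\text{so that}\qquad
  (J\eta)(x^{(n)})=\sum_{j\le n}|\eta_j|^q,\quad
  \|x^{(n)}\|_p=\Bigl(\sum_{j\le n}|\eta_j|^q\Bigr)^{1/p},

using (q − 1)p = q. Hence, for n large enough that x(n) 6= 0,

  \|J\eta\|\;\ge\;\frac{|(J\eta)(x^{(n)})|}{\|x^{(n)}\|_p}
  \;=\;\Bigl(\sum_{j\le n}|\eta_j|^q\Bigr)^{1-1/p}
  \;=\;\Bigl(\sum_{j\le n}|\eta_j|^q\Bigr)^{1/q},

and letting n → ∞ gives kJηk > kηkq .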

7.4. Example: density in `1 . [From FHS 2008, Paper B4, Q. 3; an application of Theorem 6.11.]


Let β1 , β2 , . . . be distinct elements of C such that there exists δ such that for k =
1, 2, . . .,
|βk | 6 δ < 1.
For each k define the sequence xk by
xk = (1, βk , βk2 , . . .).
Show that xk lies in the sequence space `1 and that the closed linear span
span{ xk | k = 1, 2, . . . }
coincides with `1 .

Solution. Certainly each xk ∈ `1 . Now we seek to apply Theorem 6.11. Suppose that
f ∈ (`1 )∗ is such that f (xk ) = 0 for all k. By Theorem 7.3(2), we can identify f with
(ηj ) ∈ `∞ to get

0 = η1 + η2 βk + η3 βk2 + · · ·    for all k.

Reinterpreting this we see that



F (z) := η1 + η2 z + η3 z 2 + · · ·

takes the value 0 at each z = βk . But F is defined by a power series which has radius
of convergence at least 1 since (ηn+1 ) is bounded. It follows that F is holomorphic in
the open unit disc. It has an infinite set of zeros in the closed disc D(0, δ). By the
Bolzano–Weierstrass Theorem, the zeros of F have a limit point in the open unit disc. By the
Identity Theorem F ≡ 0. This implies that ηj = 0 for all j, so f ≡ 0. The required result
follows from Theorem 6.11(ii). 

7.5. Further dual space characterisations, and remarks.


• Finite-dimensional spaces: The duals of the spaces Fm with p-norms (1 6 p < ∞)
follow exactly the same pattern as do their `p analogues except that convergence
checks are not involved in the proofs and surjectivity can be handled by a dimension
argument (recall 7.1). All that is new is identification of the dual space norm:
(Fm , k·kp )∗ ∼= (Fm , k·kq ), where p−1 + q −1 = 1. Moreover we also have (Fm , k·k∞ )∗ ∼=
(Fm , k · k1 ): the convergence obstacles which arise if we try to apply the techniques
of Theorem 7.3 to `∞ don’t arise.
• Spaces of continuous functions: Characterisations of dual spaces go deep into
measure theory. But (see Problem sheet Q.??), it is not difficult to show that C[0, 1]∗
contains a proper dense subspace which is isometrically isomorphic to L1 [0, 1].
• `∞ : In the proof of the characterisation of (c0 )∗ we made use of the fact that we
were dealing with sequences which converge to 0, and not with sequences which are
merely bounded, in just one place: in (E) (proving surjectivity of J). We deduce
that (`∞ )∗ contains a subspace isometrically isomorphic to `1 . But maybe it contains
much more? This is not an easy question!
• Lebesgue spaces: The pattern seen for `p (a special instance of an Lp space)
persists: Lp (R)∗ can be identified with Lq (R) (1 < p < ∞, p−1 + q −1 = 1) and
L1 (R)∗ can be identified with L∞ (R).
As might be expected, L∞ (R)∗ is large and elusive. Similar claims hold when
R with Lebesgue measure is replaced by other measure spaces. All this goes way
beyond FA-I.

7.6. A special case: dual spaces of Hilbert spaces.


To avoid distractions keeping track of complex conjugate signs we shall assume in this
subsection that the scalars are real. All the results we mention have complex analogues.
Our catalogue of dual spaces in Theorem 7.3 includes the result that (`2 )∗ is iso-
metrically isomorphic to `2 . Moreover, for x = (xj ) and y = (yj ) in `2 ,

hx, yi = Σj>1 xj yj

and the expression on the right-hand side is the same as that we have when we regard
y as an element of (`2 )∗ . In other words, every bounded linear functional on `2 is of the
form fy : x 7→ hx, yi, for some y ∈ `2 .
This exactly parallels what you have seen in Linear Algebra for finite-dimensional
(real) inner product spaces: the Riesz Representation Theorem. And we can bring the
Linear Algebra theorem within the normed spaces framework by equipping Rm with the
Euclidean norm associated with the usual scalar product.

So far we’ve only considered particular Hilbert spaces. With a sneak preview of FA-II
territory, we record the general Riesz Representation Theorem (real case) and make brief
comments.
Riesz Representation Theorem Let X be a real Hilbert space. Then
there is a linear isometry J of X onto X ∗ , given by:
(Jy)(x) = hx, yi (x, y ∈ X).
The proof depends on Proposition 6.3 together with, to prove surjectivity of J, the
Projection Theorem 2.12 applied with Z = ker f for f ∈ X ∗ .
It is interesting to review the Hahn–Banach Theorem in the context of Hilbert spaces.
Suppose we have a subspace Y of a real Hilbert space X and g ∈ Y ∗ . The subspace Y is
not assumed to be closed. However Y is a dense subspace of its closure, Z say, and we can appeal
to Proposition 6.2 to extend g to Z without changing the norm. So we may assume
without loss of generality that Y is closed. By Proposition ??, Y is a Hilbert space for
the induced IPS norm. Then, by RRT applied to Y , there exists y ∈ Y such that g = fy ,
where fy (x) = hx, yi for all x ∈ Y . Then fy is the restriction to Y of the continuous
linear functional x 7→ hx, yi on X. Both fy and its extension to X have norm kyk.
This proves the HBT for (real) Hilbert spaces. Note that separability is not relevant
to this argument.

7.7. Aside: Dual spaces and separability.


It need not be the case that a separable normed space has a separable dual: con-
sider `1 .
On the other hand it can be shown that if X ∗ is separable then X is necessarily
separable. [See for example 2011, Paper b4, Question 2 for an outline proof—quite
tough.]

7.8. Looking further afield: second duals.


We mentioned that a finite-dimensional vector space X is naturally isomorphic to its
second dual X 00 . We remark here that we cannot expect in general to have X ∼ = X ∗∗ for
a normed space. In particular this could not occur whenever X is not complete. Also, by
way of an example of a different type, c0 is separable, but (c0 )∗∗ ∼= (`1 )∗ ∼= `∞ , which is
inseparable and so not isometrically isomorphic to c0 .
On the other hand we do have
(`p )∗∗ ∼= (`q )∗ ∼= `p (1 < p < ∞, p−1 + q −1 = 1).

7.9. Embedding a normed space into its second dual.


Let X be a normed space. For each x ∈ X, define a map εx by
εx (f ) = f (x) for all f ∈ X ∗ ;
think of εx as ‘evaluate each f ∈ X ∗ at x’. Let J : x 7→ εx (x ∈ X).
Claims
(i) εx ∈ X ∗∗ for each x;
(ii) J : X → X ∗∗ is well defined and linear;
(iii) J is isometric (and hence injective).
Note that there are two linearity checks required here, both routine, and that the one in
(i) is what ensures that J in (ii) is well defined.

Now consider (iii). We have


kJxk = sup{ |(Jx)(f )| | f ∈ X ∗ and kf k = 1 }
= sup{ |f (x)| | f ∈ X ∗ and kf k = 1 }
6 kxk.
To get the reverse inequality we invoke Proposition 6.9(i). Given x ∈ X there exists
f ∈ X ∗ with f (x) = kxk and kf k = 1 and this witnesses that the sup is attained. In fact
we have for any x that
kxk = sup{ |f (x)| | f ∈ X ∗ and kf k = 1 }.

Now let us define


X̂ := JX ⊆ X ∗∗ .
This gives a faithful copy of X inside the Banach space X ∗∗ . We say that X is reflexive
if JX = X ∗∗ . We do not pursue the theory of reflexive spaces in B4.1—this is substantial
and belongs in a more advanced course (Part C level).

7.10. Completion of a normed space.


Any reflexive space X is automatically a Banach space. But let's consider a general
normed space X. With notation as above we have a subspace X̂ = JX of the Banach
space X ∗∗ which is isometrically isomorphic to the original space X. Consider X̃, the
closure of X̂ in X ∗∗ . Then X̃ is a closed subspace of X ∗∗ and hence is a Banach space.
Moreover JX is dense in X̃. We have proved that any normed space has a completion
in which it embeds as a dense subspace. Note that we needed the HBT, in a form
applicable to all normed spaces [real or complex], to show that J isometrically embeds
X into X ∗∗ .

We can say more about our completion (J, X̃) of X. It is characterised by what is
known as a universal mapping property: given any completion (i, Z) of X (so i is a
linear isometry embedding X into a Banach space Z as a dense subspace) then there
exists a linear isometry ĩ : X̃ → Z such that ĩ ◦ J = i, that is, the triangle formed by
J : X → X̃, i : X → Z and ĩ : X̃ → Z commutes.

The proof of the existence of the lifting ĩ comes straight from the Extension Theorem 5.25
(extend i ◦ J −1 from the dense subspace JX of X̃ to X̃).
Compare this construction with that of the completion of a metric space on Problem
sheet Q. 7. There we worked with a bigger class of spaces, but the completion obtained there
was not exhibited as a Banach space, nor the embedding as a linear map, even when the metric came from a norm.
However a universal mapping property analogous to that for the normed space completion
can be proved for the metric space completion.

8. Bounded linear operators: dual operators; spectral theory

In which we pursue further the theory of bounded linear operators, with


more powerful techniques available than we had in Section 3. Specifically
we shall exploit the Hahn–Banach Theorem in various ways. One such
application deals with the dual of a bounded linear operator, extending
the theory from Part A Linear Algebra to the setting of normed spaces.
Building on our earlier study of invertible operators we demonstrate
that in infinite dimensional spaces there’s much more to spectral theory
than eigenvalues. We extend the idea of spectrum to normed spaces. We
exhibit a number of general results about the spectrum of a bounded linear
operator on X; the deeper ones require X to be a complex Banach space.

8.1. Introducing dual operators.


Recap from Part A:
If X and Y are finite-dimensional vector spaces and T : X → Y is a linear map then
we can define a linear map T 0 : Y 0 → X 0 such that
(T 0 g)(x) = g(T x) for all g ∈ Y 0 , x ∈ X.
Here X 0 and Y 0 are the vector spaces of all linear functionals on X and Y respectively;
T 0 : g 7→ g ◦ T and is certainly a well-defined and linear map. The theory of dual transfor-
mations in the finite-dimensional setting focused on the relationships between the kernel
(respectively, image) of T and the image (respectively, kernel) of T 0 and on matrix repre-
sentations with respect to bases for X and Y and corresponding dual bases for Y 0 and X 0 .
All of this relied heavily on dimension arguments, including the Rank-Nullity Theorem.
The special case of T : X → X, where X is a finite-dimensional inner product space re-
ceived special attention in Prelims and Part A: there the notion of an adjoint T ∗ : X → X
was paramount. A tie-up between T ∗ : X → X and T 0 : X 0 → X 0 results from identifying
X 0 with X itself using the Riesz Representation Theorem for linear functionals on an
inner product space (care needed with complex conjugate signs in the case of complex
scalars). [We don’t consider in FA-I adjoint operators on Hilbert spaces (= inner product
spaces with a Banach space norm): that’s a major topic in FA-II.]

What can we do with dual maps in the context of normed spaces? Given T ∈ B(X, Y ),
where X, Y are normed spaces, we can define a map T 0 by
(T 0 ϕ)(x) = ϕ(T x) for all ϕ ∈ Y ∗ , x ∈ X.
Paralleling the finite-dimensional case, T 0 : ϕ 7→ ϕ ◦ T . Proposition 8.2 confirms that T 0
is a bounded linear operator from Y ∗ to X ∗ . There is a notational awkwardness here we
cannot avoid. We reserve the notation X 0 for the space of all linear functionals on any
space X and so chose to use X ∗ for the dual space of a normed space X, whose elements
are the bounded (alias continuous) linear functionals on X. We use the notation T 0 for
the map dual to T since T ∗ is too well-established usage in the restricted setting of inner
product spaces to be a sensible choice here.

8.2. Proposition (dual operator). Let X, Y be normed spaces (over the same field,
R or C) and T ∈ B(X, Y ). Then T 0 ∈ B(Y ∗ , X ∗ ) and kT 0 k = kT k.

Proof. We only prove that each T 0 ϕ is bounded and T 0 is bounded and that kT 0 k = kT k
(the rest is just linear algebra).
|(T 0 ϕ)(x)| = |ϕ(T x)| 6 kϕk kT xk 6 kϕk kxk kT k.
Hence T 0 ϕ is bounded, with kT 0 ϕk 6 kT k kϕk. Thence T 0 is also bounded, with kT 0 k 6
kT k.

For the reverse inequality for the norm, we need HBT. Take x ∈ X. Assume first
that T x 6= 0. By Proposition 6.9(i) there exists ϕ ∈ Y ∗ such that ϕ(T x) = kT xk and
kϕk = 1. Then
kT xk = |ϕ(T x)| = |(T 0 ϕ)(x)| 6 kT 0 ϕk kxk 6 kT 0 k kxk,
and this also holds, trivially, if T x = 0. Therefore kT k 6 kT 0 k. 

8.3. Annihilators; kernels and images of bounded linear operators and their
duals.
Let X be a normed vector space. For S ⊆ X and Q ⊆ X ∗ let
S ◦ = { f ∈ X ∗ | f (x) = 0 for all x ∈ S },
Q◦ = { x ∈ X | f (x) = 0 for all f ∈ Q }.
Then S ◦ and Q◦ are closed subspaces of X ∗ and X respectively (easy exercise). Moreover
[from a problem sheet question on HBT], for any subspace Y of X, the closure of Y
equals (Y ◦ )◦ :
note the closure sign!
This leads on, easily, to the following results. The proofs are left as exercises. Let
X, Y be normed spaces and T ∈ B(X, Y ), with dual operator T 0 ∈ B(Y ∗ , X ∗ ). Then
(T X)◦ = ker T 0 ,    the closure of T X = (ker T 0 )◦ ;
(T 0 Y ∗ )◦ = ker T, T 0 Y ∗ ⊆ (ker T )◦ .
Where closure signs appear in the general results, they aren’t needed when the domain or
the range of the operator in question is finite-dimensional (why?). Moreover, when T 0 Y ∗
is finite-dimensional, T 0 Y ∗ = (ker T )◦ .
Turning this around, if you seek to extend a result on dual maps from Part A Linear
Algebra, expect a closure sign to come into play in the normed space setting whenever
the proof of the corresponding linear algebra result needs a dimension argument.

8.4. Example (annihilators and dual maps).


Define T : `1 → `1 by
T (xj ) = (αj ),   where αj = 2−j xj if j is odd and αj = 0 if j is even.
It is routine to check that T ∈ B(`1 , `1 ). Since (`1 )∗ ∼
= `∞ we can regard T 0 as a bounded
∞ ∞
linear operator from ` to ` [we shall suppress the map J setting up the isometric
isomorphism from `∞ onto (`1 )∗ ].
We want to find T 0 . Let en = (δnj ) in `1 and take (xj ) = en . Let y = (yj ) in `∞ and
let T 0 y = (βj ). Then, for any n,
Σj>1 βj δnj = (T 0 y)(en ) = y(T en ) = Σk>0 y2k+1 2−n δn(2k+1) .

It follows from this that βj = yj /2j if j is odd, and βj = 0 otherwise.

It is easy to see that


ker T = { (xj ) ∈ `1 | xj = 0 for j odd }
and that
(ker T )◦ = { (yj ) ∈ `∞ | yj = 0 for j even }.

Take y = (yj ) where yj = 1 for j odd and 0 for j even. Then y = T 0 z for some
z = (zj ) would imply that zj = 2j for every odd j, so z could not belong to `∞ . We conclude that
y ∈ (ker T )◦ \ T 0 `∞ . This shows that the inclusion of T 0 Y ∗ in (ker T )◦ may be strict. Here
T 0 Y ∗ is not closed.

We now move on to spectral theory. For reasons which will emerge, the usual setting
for this will be a complex normed space X, but we note that some more elementary results
and some examples work just as well when F = R. At crucial points, clearly flagged, we
shall need to assume that X is a Banach space. Thus the best results overall are available
in the setting of complex Banach spaces.

8.5. Definition: the spectrum of a bounded linear operator.


Let X be a normed space and let T ∈ B(X). The spectrum of T is
σ(T ) = { λ ∈ C | (λI − T ) is not invertible in B(X) }.
We may work with either (λI − T ) or with (T − λI). It makes no difference which we
use.
In identifying σ(T ) for a particular T and to obtain general properties of the spectrum,
we shall exploit results from Section 3, with T replaced by λI − T .

8.6. A closer look at the spectrum of a bounded linear operator.


We now record what our checklist for invertibility in 3.14 tells us about non-invertibility
of an operator λI − T , for T ∈ B(X), where X is a normed space.
The scalar λ ∈ σ(T ) if and only if at least one of the following holds:
(A) λ is an eigenvalue of T . That is, there exists x 6= 0 in X such that T x = λx.
Equivalently, λI − T fails to be injective.
(B) λ is an approximate eigenvalue, meaning that there exists a sequence (xn ) in X
such that
kxn k = 1 and kT xn − λxn k → 0.
(C) λI − T fails to be surjective. This happens in particular if (λI − T )X is not dense.
The set of eigenvalues of T is called the point spectrum of T and denoted σp (T ). The
set of approximate eigenvalues is denoted by σap (T ) and we refer to it as the approximate
point spectrum.
The condition in (B) for an approximate eigenvalue λ asserts that λI − T could
not have a bounded inverse (assuming it were a bijection); recall (?) in the invertibility
checklist. Specifically, this happens if there does not exist K > 0 such that
k(λI − T )xk > Kkxk for all x ∈ X.
This implies (consider failure for K = 1/n for n = 1, 2, . . .) there exists a sequence (un )
with k(λI − T )un k < (1/n)kun k; necessarily un 6= 0. Then let xn = un /kun k to show λ is
an approximate eigenvalue.
We claim that
σp (T ) ⊆ σap (T ) ⊆ σ(T ).
The first inclusion is clear: if x is an eigenvector of norm 1 (wlog) then we may take xn = x
for all n. For the second inclusion assume for contradiction that λ ∈ σap (T )\σ(T ) and that
(xn ) is as in (B). Hence S := (T −λI)−1 exists in B(X). Then xn = S(T −λI)xn → S0 = 0,
contradiction.
The following can be useful pointers to identifying approximate eigenvalues:

(i) In sequence spaces it often happens that a particular λ fails to be an eigenvalue


because any candidate associated eigenvector x = (xj ) would not belong to the
space X because it would not have finite norm. Try ‘truncating’: consider (x(n) ),
where the jth term of x(n) is xj for j 6 n and 0 otherwise, to show that λ is an approximate
eigenvalue.
(ii) Given y ∈ X try to solve the equation (λI − T )x = y for x. If this can be done,
then consider whether (λI − T )−1 is bounded. If it’s not you may be able to find a
sequence (xn ) which witnesses this.

Now for some first examples.

8.7. Example 1 (a multiplication operator).


Let X be the space C(K) of continuous complex-valued functions on a compact subset
K of C, with the sup norm. Define T ∈ B(X) by
(T f )(z) = zf (z) (z ∈ K).
Then σ(T ) = K. This follows immediately from Example 3.16(1), replacing T by λI − T
in that example.

8.8. Example 2 (eigenvalues and approximate eigenvalues).


Let T ∈ B(c0 ) be given by T (xj ) = (xj /j). Each λ = 1/n is an eigenvalue with en as
an associated eigenvector (n = 1, 2, . . .).
Certainly 0 is not an eigenvalue because ker T = {0}. However 0 is an approximate
eigenvalue: consider xn = en and note kT en k = 1/n → 0.
Consider λ ∈ / S := {0} ∪ {1/n | n > 1}. Then S is closed and bounded, hence there
exists δ > 0 such that |λ − s| > δ for all s ∈ S. Let y = (yj ) ∈ c0 and suppose x = (xj )
is such that (λI − T )x = y. This requires (λ − 1/j)xj = yj for all j. This gives xj such
that |xj | 6 |yj |/δ. From this, x ∈ c0 and k(λI − T )−1 k 6 δ −1 .
We conclude that σ(T ) = S.

We now work towards establishing general properties of the spectrum. Here com-
pleteness of X becomes important. The following elementary lemma will be useful when
we need to juggle with invertible operators.

8.9. A lemma on invertible operators.


(i) Let X be a normed space and P, Q ∈ B(X).
(a) Assume P and Q are invertible. Then P Q is invertible with inverse Q−1 P −1 .
(b) Assume that P Q = QP and that P Q is invertible. Then Q is invertible.
(ii) Let X be a Banach space and let P, Q ∈ B(X). Assume that P is invertible and
that
kP − Qk < kP −1 k−1 .
Then Q is invertible.

Proof. We leave (i)(a) as an easy exercise; remember that for an operator to be invertible
we need a 2-sided inverse, in B(X).
Consider (i)(b). Assume S is an inverse for P Q in B(X). Then I = (P Q)S = S(P Q).
Since multiplication is associative and P and Q commute, I = Q(P S) = (SP )Q. We

deduce that Q is a bijection. Invertibility of P Q implies that there exists δ > 0 such that
kP Qxk > δkxk for all x. Then
δkxk 6 kP Qxk 6 kP k kQxk.
Also P Q invertible forces P 6= 0, so kP k 6= 0. Therefore kQxk > (δ/kP k)kxk for all x.
We now deduce from 3.14 that Q is invertible.
[Note that, purely algebraically, we got a left inverse and a right inverse for Q, each
of which is a bounded operator. But we don’t know that these are equal. Hence we need
to argue via the invertibility checklist.]

We now prove (ii). Consider R = (P − Q)P −1 . Then R ∈ B(X) and kRk 6


kP − QkkP −1 k and so kRk < 1. Hence by Proposition 3.17, I − R is invertible. Then
(I − R)P is invertible too. But (I − R)P = Q. 

We already have enough information to prove quite a lot about the spectrum of a
bounded operator on a Banach space. A corresponding result holds when the scalar field
is R.

8.10. Theorem I (basic facts about spectrum). Let T be a bounded linear operator
on a complex Banach space X. Then
(i) σ(T ) ⊆ D(0, kT k), the closed disc center 0 radius kT k.
(ii) σ(T ) is closed.
(iii) σ(T ) is a compact subset of C.

Proof. Proposition 3.17, applied to I − T /λ (λ 6= 0), implies that λI − T is invertible if


|λ| > kT k. Hence (i) holds.
Now assume λ ∈/ σ(T ). Then
k(λI − T ) − (µI − T )k = |λ − µ| < k(λI − T )−1 k−1
if µ is such that |λ − µ| is sufficiently small. Now use Lemma 8.9(ii).

This shows that C \ σ(T ) is an open set, so σ(T ) is closed, which proves (ii). Since σ(T ) is also
bounded, by (i), the Heine–Borel Theorem gives (iii). 

The next example illustrates how the results in Theorem I can be combined to identify
the spectrum of an operator in certain cases.

8.11. Example 3 (left-shift operator on `1 ).


Let T : `1 → `1 be given by
T (x1 , x2 , x3 , . . .) = (x2 , x3 , x4 , . . .).
Then (see 3.3(1)) kT k = 1, so σ(T ) ⊆ D(0, 1).
Now find eigenvalues (if any). Suppose x = (xj ) ∈ `1 is such that T x = λx. Equating
components:
λxj = xj+1 (j > 1).
Hence xj = λj−1 x1 . If x1 = 0, we get x = 0 and this is not an eigenvector. Otherwise
x = (xj ) ∈ `1 iff |λ| < 1. Hence σp (T ) = { λ | |λ| < 1 }, the open unit disc.
Finally
{ λ | |λ| < 1 } = σp (T ) ⊆ σ(T ) ⊆ D(0, 1).
Taking the closure right through and using the fact that σ(T ) is closed we get
σ(T ) = D(0, 1).

The next result can give valuable information about σ(T ) in cases where kT n k can
be found explicitly and where Theorem I together with knowledge of σp (T ) does not pin
down σ(T ) completely; contrast Example 4 below with Example 3.

8.12. Proposition (a refinement of Theorem I(i)). Let T ∈ B(X), where X is


Banach. Then
λ ∈ σ(T ) =⇒ |λ|n 6 kT n k for all n.

Proof. Take n > 1. By 8.10(i) applied to T n it will be enough to prove that


λ ∈ σ(T ) =⇒ λn ∈ σ(T n ).
We shall obtain the contrapositive. Assume λn I − T n is invertible with inverse S. We
claim λI − T is invertible. Observe that
λn I − T n = (λI − T )p(T ) = p(T )(λI − T ),
where p(T ) is a polynomial in T and belongs to B(X). Now apply 8.9(i)(b). 

8.13. Example 4 (an integral operator). [From FHS Paper B4a:3]


Let X = C[0, 2π] (real-valued functions), with the supremum norm. Define T by
(T x)(t) = ∫₀ᵗ sin s x(s) ds.
Then T is a bounded linear operator on X (recall 3.3(4)) and an easy induction gives,
for n > 1,
|(T n x)(t)| 6 (tn /n!) kxk∞ .
Hence, taking the sup first over t and then over x, we get
kT n k 6 (2π)n /n! .
The RHS → 0 as n → ∞ (why?). Hence by Proposition 8.12 the only possible value for
λ when λ ∈ σ(T ) is λ = 0.
We assert that 0 is an approximate eigenvalue. This is witnessed by the sequence
(xn ) where xn (t) = cos nt. To confirm this, note that kxn k = 1 and integrate by parts to
get an upper bound for kT xn k from which we can deduce kT xn − 0 · xn k = kT xn k → 0.
We conclude that σ(T ) = {0}.
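
To spell out the integration by parts referred to above (a routine verification, in LaTeX): with xn (t) = cos nt,

  (Tx_n)(t)=\int_0^t \sin s\,\cos(ns)\,ds
  =\Bigl[\frac{\sin s\,\sin(ns)}{n}\Bigr]_0^t-\frac{1}{n}\int_0^t \cos s\,\sin(ns)\,ds ,

so |(T xn )(t)| 6 1/n + t/n 6 (1 + 2π)/n for all t ∈ [0, 2π], whence kT xn k∞ → 0 as claimed.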

We have already seen that looking at the powers of an operator T may provide
valuable information about its spectrum. We now take this idea further. Here we don’t
need X to be a Banach space but we do need the scalar field to be C.

8.14. Theorem II (Spectral Mapping Theorem for polynomials). Let p be a com-


plex polynomial, not identically zero. Let X be a complex normed space and T ∈ B(X).
Then
σ(p(T )) = p(σ(T )) := { p(λ) | λ ∈ σ(T ) }.

Proof. There is nothing to prove if p is a constant so assume p has degree n > 1. Let
µ ∈ C. We can factorise p(z) − µ as a product of linear factors:
p(z) − µ = α(z − β1 ) · · · (z − βn ),
for some α 6= 0 and β1 , . . . , βn . Then
p(T ) − µI = α(T − β1 I) · · · (T − βn I).

Here the factors commute and µ = p(λ) for some λ if and only if λ ∈ {β1 , . . . , βn }. So
µ ∈ p(σ(T )) ⇐⇒ ∃λ ∈ σ(T ) such that µ = p(λ)
⇐⇒ σ(T ) ∩ {β1 , . . . , βn } 6= ∅.

Assume µ ∈/ p(σ(T )). Then βr ∈/ σ(T ), and hence T − βr I is invertible, for each r. This
implies that p(T ) − µI = α(T − β1 I) · · · (T − βn I) is invertible. We deduce that µ ∈/ σ(p(T )).
Assume µ ∈/ σ(p(T )). Then, for each r,

p(T ) − µI = (T − βr I) · α Πj6=r (T − βj I),

where the two factors commute.

It follows from Lemma 8.9(i)(b) that T − βr I is invertible. Therefore βr ∈/ σ(T ) for
r = 1, . . . , n. Hence µ ∈/ p(σ(T )). 

8.15. Example 5 (use of SMT).


Let T ∈ B(`1 ) be given by
T (xj ) = (xj − 2xj+1 + xj+2 ).
Then T = (I − L)2 , where L is the left-shift operator considered in Example 3: σ(L) =
D(0, 1). By Spectral Mapping Theorem,
σ(T ) = { (1 − λ)2 | |λ| 6 1 }.
Describing σ(T ) explicitly is now an exercise on the geometry of the complex plane. By
going into polar coordinates, one can show that
σ(T ) = { (r, θ) | 0 6 θ < 2π, 0 6 r 6 2 + 2 cos θ }.
The boundary curve is a cardioid. Moral: quite simple polynomial transformations can
change the spectrum in ways which would have been hard to predict.
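
As a check on the polar description (a short calculation, recorded in LaTeX): the boundary curve is traced out by the image of the unit circle λ = e^{iϕ} under λ 7→ (1 − λ)2 , and

  1-e^{i\varphi}=-2i\sin(\varphi/2)\,e^{i\varphi/2},
  \qquad
  (1-e^{i\varphi})^2=-4\sin^2(\varphi/2)\,e^{i\varphi}
  =(2-2\cos\varphi)\,e^{i(\varphi+\pi)} ,

so, writing θ = ϕ + π, the image point has modulus r = 2 − 2 cos ϕ = 2 + 2 cos θ, the cardioid, consistent with the description above.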

Our earlier Proposition 8.12 is a corollary of the Spectral Mapping Theorem. A much
stronger result linking the spectrum of T to powers of T can be proved than that in
Proposition 8.12. We omit the proof of the Spectral Radius Formula, which needs
more advanced theory than that in FA-I and which we shall not need.

8.16. Theorem (Spectral Radius Formula). Let X be a complex Banach space and T ∈ B(X). Then
the spectral radius of T , defined by
rad(σ(T )) = sup{ |λ| | λ ∈ σ(T ) },
is given by
rad(σ(T )) = inf n kT n k1/n = limn→∞ kT n k1/n .
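
Although the proof is beyond FA-I, the formula is easy to experiment with in finite dimensions, where bounded operators are matrices and the spectral radius is the largest modulus of an eigenvalue. The following minimal numerical sketch (Python with NumPy; purely illustrative and not part of the course) compares kT n k1/n with max |eigenvalue| for a sample 2 × 2 matrix:

  import numpy as np

  # Illustrative check of the Spectral Radius Formula for a matrix T:
  # ||T^n||^(1/n) should approach max |eigenvalue of T| as n grows.
  T = np.array([[0.0, 1.0],
                [-0.5, 1.2]])
  spectral_radius = max(abs(np.linalg.eigvals(T)))

  for n in (1, 2, 5, 10, 20, 40):
      Tn = np.linalg.matrix_power(T, n)
      print(n, np.linalg.norm(Tn, 2) ** (1.0 / n))   # operator 2-norm of T^n, nth root

  print("max |eigenvalue| =", spectral_radius)

Here the operator norm kT n k is the largest singular value of the matrix T n , which np.linalg.norm computes for ord = 2.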

We have flagged up already that it can be useful to consider the dual map T 0 when
seeking information about a bounded linear operator T ∈ B(X)—assuming of course we
can (i) describe the dual space X ∗ on which T 0 acts and (ii) describe the action of T 0
explicitly. But even with these provisos concerning its usefulness in specific cases the
following theorem is of interest and the proof instructive. It is crucial that we work with
a Banach space since we shall call on Proposition 3.15.

8.17. Theorem III (T and T 0 in tandem). Let X be a Banach space. Let T ∈ B(X)


and let T 0 ∈ B(X ∗ ) be the dual operator. Then
σ(T ) = σap (T ) ∪ σp (T 0 ).

Proof. The proof depends on Proposition 3.15 and results from 8.3 as they apply to
P := λI − T . Note that P 0 = λI − T 0 . Also we shall use the fact (from Problem sheet
Q. 23) that σ(T 0 ) = σ(T ).
Hence, from 8.6,
σ(T ) ⊇ σap (T ) ∪ σp (T 0 ).
For the proof of the reverse inclusion, assume λ ∈/ σap (T ) ∪ σp (T 0 ). Then there exists
K > 0 such that kP xk > Kkxk for all x. Also ker P 0 = {0} so, from 8.3, P X is dense.
Now Proposition 3.15 implies that P is invertible. 

Something’s missing! So far in all our examples we have seen that the spectrum is
non-empty. But is this true in general? Our final piece of theory will show that if X is
a complex Banach space, then σ(T ) 6= ∅ for any T ∈ B(X). We shall treat the proof of
this quite lightly, aiming to give the flavour without the fine detail. First we need some
preliminaries.

8.18. The resolvent set.


Let T ∈ B(X), where X is a complex Banach space. Then the resolvent set is
ρ(T ) := C \ σ(T ).
For λ ∈ ρ(T ) we can define a bounded linear operator R(λ, T ) = (T − λI)−1 . For
|λ| > kT k this is given by

R(λ, T ) = −Σk>0 λ−k−1 T k ;
this comes from Proposition 3.17 by rescaling.
We also have, for λ, µ ∈ ρ(T ),
(T − µI) = (T − λI) + (λ − µ)I = (T − λI)( I + (λ − µ)(T − λI)−1 ).


We can deduce from this that the map λ 7→ (T − λI)−1 is a continuous function on ρ(T ).
Moreover we have the Resolvent Identity
R(λ, T ) − R(µ, T ) = (λ − µ)R(λ, T )R(µ, T )
(proof is pure linear algebra).
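
For the record, the verification is one line (writing R(λ) for R(λ, T ), in LaTeX):

  R(\lambda)-R(\mu)
  = R(\lambda)\bigl[(T-\mu I)-(T-\lambda I)\bigr]R(\mu)
  = (\lambda-\mu)\,R(\lambda)R(\mu),

using (T − λI)R(λ) = R(λ)(T − λI) = I and the corresponding identities for µ.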

8.19. Theorem IV (non-empty spectrum). Let X be a complex Banach space and


T ∈ B(X). Then σ(T ) 6= ∅.

Proof. We argue by contradiction. If the conclusion is false then R(λ, T ) = (T − λI)−1 is a


bounded linear operator for every λ ∈ C. By the Hahn–Banach Theorem there exists
f ∈ (B(X))∗ such that f (R(0, T )) 6= 0. Consider the map
ϕ : λ 7→ f (R(λ, T )) (λ ∈ C).
This is a composition of continuous maps, so continuous. But, for λ 6= 0,
R(λ, T ) = (−λ(I − λ−1 T ))−1 = −λ−1 (I − λ−1 T )−1 → 0 as |λ| → ∞.
To confirm this, note that (I − λ−1 T )−1 → I.

Provided we can prove that ϕ is holomorphic, then we can conclude from Liouville’s
Theorem that ϕ ≡ 0 and this would contradict the fact that ϕ(0) 6= 0. With some juggling
we can show with the aid of the Resolvent Identity that λ 7→ ϕ(λ) has a convergent
power series expansion in a suitably small neighbourhood of each µ ∈ C. Hence it is
holomorphic. 
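
For completeness, here is one way to carry out the 'juggling' (a sketch only, in LaTeX): fix µ ∈ C and suppose |λ − µ| < kR(µ, T )k−1 . Since T − λI = (T − µI)( I − (λ − µ)R(µ, T ) ), Proposition 3.17 gives

  R(\lambda,T)=\sum_{n\ge 0}(\lambda-\mu)^n R(\mu,T)^{\,n+1}
  \quad\text{in } B(X),
  \qquad\text{whence}\qquad
  \varphi(\lambda)=\sum_{n\ge 0} f\bigl(R(\mu,T)^{\,n+1}\bigr)(\lambda-\mu)^n ,

applying the continuous linear functional f term by term. This is a convergent power series in λ − µ, so ϕ is holomorphic near every µ ∈ C, as required.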

8.20. SUMMARY: Pinning down σ(T ).


The second column in the following table gives a reference to theory that may assist in
determining σ(T ). Which of the various techniques are needed in a particular example
will depend on the type of operator under consideration and what information one finds
along the way.

Table 1. Strategies for finding spectrum

                                                          Theory that is relevant or maybe useful

Methods not involving (B), (C) in 8.6
1. Calculate kT k                                         Theorem I(i)
2. Find σp (T ) & its closure                             Theorem I
3. If expedient, investigate kT n k:                      8.12
   if inf n kT n k1/n = 0, then σ(T ) = {0}               8.12 & Theorem IV
4. Is T a polynomial of a simpler operator?               SMT (Theorem II)

More refined analysis using (B), (C); assume λ ∈/ σp (T )
5. Try to solve y = (λI − T )x for x: is there some y
   for which no solution x ∈ X exists? If so,
   surjectivity fails & λ ∈ σ(T )
6. Is λ ∈ σap (T )? If so, λI − T cannot have a
   bounded inverse & λ ∈ σ(T )
7. Consider T 0                                           Theorem III

Overall, we seek to find for which values of λ the operator (λI − T )


• does have a bounded inverse;
• fails to have a bounded inverse.
Our strategy in practice may be to toggle between looking for when λ ∈ / σ(T ) and when
λ ∈ σ(T ) until all λ ∈ C are identified as being in or out of the spectrum.
Towards finding when a bounded inverse does exist for λI − T , we may first seek a
disc to which σ(T ) must be confined [1. and 3. in the table].
In the other direction, to find points λ which have to belong to σ(T ), we first look
for the eigenvalues. As usual we try to find conditions on λ so that T x = λx has a non-
zero solution for x. Crucially we require x in X. When this membership test fails for a

candidate value of λ we may suspect that λ ∈ σap (T ). In some cases σp (T ) may be dense
in D(0, kT k), giving σ(T ) = D(0, kT k) [as in Example 3]. Sometimes this won’t hold, but
a given operator may be a polynomial in some other operator for which the spectrum can
be easily found. Then the Spectral Mapping Theorem comes into play [4. in the table;
Example 5].
In certain cases there may be few eigenvalues or none at all [the right-shift operator
on `1 provides an example; see Problem sheet Q. 26]. For λ ∈/ σp (T ) and where λI − T is
an inverse map (λI − T )−1 : could it have domain X, i.e., is λI − T surjective? [5. in the
table]. If surjectivity is assured, one then needs to test for boundedness of the inverse
((?) in 3.14). [Illustrations: Examples 1 and 2.]
To show that a given λ ∈/ σp (T ) nevertheless satisfies λ ∈ σ(T ) we want to find whether one of (B), (C) in
8.6 holds. There can be points in the spectrum for which λ is not an eigenvalue or even
an approximate eigenvalue of T and failure of surjectivity may be hard to prove directly.
If X is a space whose dual space is known (for example X = `p with 1 6 p < ∞ (recall
7.3)), then recourse to Theorem III, involving T 0 , may be a good option [7. in the table]:
it may be easier to find eigenvalues of T 0 than to explore when λI − T is surjective. [See
Problem sheet Q. 26 and also the bonus extension question on the final problem sheet.]

8.21. What use is the spectrum?


That’s a story for another, more advanced course. What we’d hope to do is to extend
to suitable operators on infinite-dimensional complex Banach spaces spectral represen-
tations such as we have for self-adjoint operators on finite-dimensional inner product
spaces. We'd expect, depending on how complicated the spectrum is, to need a spectral
representation involving an infinite sum or even a Stieltjes-style integral.

Contents

0. Preliminaries and first examples, from Part A 2


0.1. Definitions: normed space, equivalent norms 2
0.2. Basic properties of a norm and introductory examples 2
0.3. Proposition (the norm on an inner product space) 2
0.4. Subspaces 3
0.5. Norms on the finite-dimensional spaces Fm 3
0.6. Further normed spaces encountered in Part A metric spaces 3
0.7. The metric associated with a norm 4
0.8. Proposition 4
0.9. Corollary (closure of a subspace) 4
0.10. Advance notice: continuity of a linear map 5
0.11. Banach space: definition 5
0.12. Proposition (closed subspaces of Banach spaces) 5
0.13. Theorem (Cauchy Convergence Principle) 5
1. Normed spaces, Banach spaces and Hilbert spaces 6
Introductory course overview 6
Context 6
The role of a norm 6
How reliable is your intuition? 7
1.1. Notes on verifying the norm properties 8
1.2. Quotients and seminorms 9
1.3. More examples of norms on finite-dimensional spaces 9
1.4. Sequence spaces. 9
1.5. Products of normed spaces 11
1.6. Subspaces of sequence spaces 11
1.7. Example: sum of subspaces 11
1.8. Function spaces with the supremum norm 11
1.9. Example: norms on spaces of differentiable functions 12
1.10. Example: Lipschitz functions 12
1.11. Norms on spaces of integrable functions 12
1.12. Properties of the norm on an inner product space 13
2. Completeness and density 14
2.1. Density 14
2.2. Density and examples of non-completeness 14
2.3. Example: completeness of F b (Ω) (`∞ is a special case) 15
2.4. Example: completeness of C[0, 1] 15

2.5. Example: completeness of `1 16


2.6. Example: completeness of c0 16
2.7. Example: completeness of C 1 [0, 1] 16
2.8. Stocktake on tactics for completeness proofs 17
2.9. Series of vectors in a normed space 17
2.10. Theorem (completeness and absolute convergence) 17
2.11. Applications of Theorem 2.10 18
2.12. A glimpse at Hilbert spaces 18
3. Linear operators between normed spaces 19
3.1. Proposition (characterising continuous linear operators) 19
3.2. The norm of a bounded linear operator 20
3.3. Bounded linear operators and their norms: first examples 20
3.4. Example: an unbounded operator 22
3.5. Remarks on calculating operator norms 22
3.6. Example: bounded linear operators on sequence spaces 22
3.7. Aside: matrix norms (attn: numerical analysts) 23
3.8. Theorem (completeness) 24
3.9. Corollary 24
3.10. Products of linear operators 24
3.11. Kernel and image 24
3.12. Examples: kernel and image 25
3.13. Invertibility of a bounded linear operator 25
3.14. A checklist for invertibility of T ∈ B(X) (X a normed space) 25
3.15. Proposition (closed range) 26
3.16. Examples: invertible and non-invertible operators 26
3.17. Proposition 26
4. Finite-dimensional normed spaces 27
4.1. Finite-dimensional spaces, algebraically 27
4.2. Theorem (introducing topology) 27
4.3. Theorem (equivalence of norms) 28
4.4. Corollary (boundedness of linear operators on a finite-dimensional normed
space) 28
4.5. Theorem (completeness of finite-dimensional normed spaces) 28
4.6. Proposition 29
4.7. Theorem (compactness of closed unit ball) 29
5. Density and separability 30
5.1. Spaces of continuous functions on compact sets 30
5.2. Separation of points 31

5.3. Two-point fit lemma 32


5.4. More about lattice operations in C(K) 32
5.5. Stone–Weierstrass Theorem (real case, lattice form) 32
5.6. Example: an application of SWT, lattice form 33

5.7. Technical lemma (approximating t on [0, 1] by polynomials) 33
5.8. Subalgebras of C(K) 34
5.9. Proposition 34
5.10. Stone–Weierstrass Theorem (subalgebras form, real case) 34
5.11. Corollary (Weierstrass’s polynomial approximation theorem (real case)) 34
5.12. Example (on Weierstrass’s Theorem) 34
5.13. Concluding remarks on SWT 35
5.14. Separability: introductory remarks 35
5.15. Definition: separable 35
5.16. Lemma 35
5.17. Example: `∞ is inseparable 36
5.18. Proposition (first examples of separable spaces) 36
5.19. Theorem (testing for separability) 37
5.20. Applications: proofs that particular spaces are separable 38
5.21. Theorem 39
5.22. Separable subspaces: examples and remarks 39
5.23. Addendum: bases in separable normed spaces? 39
5.24. Separable normed spaces: a density lemma 40
5.25. Extension Theorem for a bounded operator on a dense subspace 40
6. Dual spaces and the Hahn–Banach Theorem 41
6.1. The dual space of a normed space 41
6.2. Proposition 42
6.3. Proposition 42
6.4. Cautionary examples 42
6.5. Hahn–Banach Theorem: context 43
6.6. HAHN–BANACH THEOREM 43
6.7. Hahn–Banach Theorem, complex case 44
6.8. A simple example using HBT 44
6.9. Proposition 45
6.10. Proposition (separating a point from a closed subspace) 46
6.11. Theorem (density and bounded linear functionals) 46
6.12. Remarks on Theorem 6.11 46
6.13. The proof of Theorem 6.6 without the restriction to separable spaces 47
6.14. Relaxing the assumptions in the statement of the real HBT 47

6.15. Real Hahn–Banach Theorem: a geometric view 48


7. Dual spaces of particular spaces; further applications of the Hahn–Banach
Theorem 48
7.1. The dual space of a finite-dimensional vector space 48
7.2. Dual spaces of particular normed spaces 49
7.3. Theorem (dual spaces of sequence spaces) 49
7.4. Example: density in `1 51
7.5. Further dual space characterisations, and remarks 52
7.6. A special case: dual spaces of Hilbert spaces 52
7.7. Aside: Dual spaces and separability 53
7.8. Looking further afield: second duals 53
7.9. Embedding a normed space into its second dual 53
7.10. Completion of a normed space 54
8. Bounded linear operators: dual operators; spectral theory 55
8.1. Introducing dual operators 55
8.2. Proposition (dual operator) 55
8.3. Annihilators; kernels and images of bounded linear operators and their duals 56
8.4. Example (annihilators and dual maps) 56
8.5. Definition: the spectrum of a bounded linear operator 57
8.6. A closer look at the spectrum of a bounded linear operator 57
8.7. Example 1 (a multiplication operator) 58
8.8. Example 2 (eigenvalues and approximate eigenvalues) 58
8.9. A lemma on invertible operators 58
8.10. Theorem I (basic facts about spectrum) 59
8.11. Example 3 (left-shift operator on `1 ) 59
8.12. Proposition (a refinement of Theorem I(i)) 60
8.13. Example 4 (an integral operator) 60
8.14. Theorem II (Spectral Mapping Theorem for polynomials) 60
8.15. Example 5 (use of SMT) 61
8.16. Theorem (Spectral Radius Formula) 61
8.17. Theorem III (T and T 0 in tandem) 62
8.18. The resolvent set 62
8.19. Theorem IV (non-empty spectrum) 62
8.20. SUMMARY: Pinning down σ(T ) 63
8.21. What use is the spectrum? 64
