1: FUNCTIONAL ANALYSIS I
Michaelmas Term 2017
H.A. Priestley
This file contains the full set of webnotes for the course, with a contents
list, by subsections, at the end.
In which we set out the definition and simple properties of a norm, renew
acquaintance with some familiar normed spaces, and review basic topolog-
ical notions as these specialise to normed spaces.
We then say that (X, k · k) is a normed space. Where no ambiguity would result we
simply say X is a normed space. On the other hand we adopt the notation kxkX when
we need to make the domain explicit.
Note that the restriction to real or complex scalars is needed in (N2). Later, when
we work with two normed spaces at the same time—for example when considering maps
from one space to another—we tacitly assume that F is the same for both.
We say that two norms, k · k and k · k0 on a vector space X are equivalent if there
exist constants m, M > 0 such that, for all x,
mkxk 6 kxk0 6 M kxk.
Those real or complex vector spaces which are equipped with an inner product carry
a natural norm.
Proof. Note that k · k is well-defined because hx, xi > 0 for all x. (N1) and (N2) follow
directly from properties of the inner product. For (N3) we call on the Cauchy–Schwarz
inequality:
|hx, yi| 6 kxk kyk.
This gives
kx + yk2 = hx + y, x + yi
= kxk2 + hx, yi + hy, xi + kyk2
= kxk2 + 2Rehx, yi + kyk2
6 kxk2 + 2kxk kyk + kyk2
= (kxk + kyk)2 .
Since the norm is non-negative, kx + yk 6 kxk + kyk follows.
0.4. Subspaces.
Part of the standard machinery of vector space theory involves the ability to form subspaces. We can consider a norm as an add-on to this general framework.
Given a normed space (X, k · kX ) and a subspace Y of the vector space X, we can
form a new normed space (Y, k · kY ) by defining kykY = kykX for all y ∈ Y .
0.5. Norms on the finite-dimensional spaces Fm . We can define the following norms on Fm , for F as R or C and m > 1: for x = (x1 , . . . , xm ) ∈ Fm ,
kxk2 = (Σ_{j=1}^m |xj |^2)^{1/2} (Euclidean norm),
kxk1 = Σ_{j=1}^m |xj |,
kxk∞ = max{ |xj | | 1 6 j 6 m }.
Here the Euclidean norm comes from the standard inner product (the scalar product)
and Proposition 0.3 confirms that it is indeed a norm. Verification that k · k1 and k · k∞
satisfy (N1), (N2) and (N3) is straightforward, using properties of the real and complex
numbers.
These norms on Fm are related as follows:
kxk2 6 kxk1 6 √m kxk2 and (1/√m) kxk2 6 kxk∞ 6 kxk2 .
These inequalities tell us that any two of these norms are equivalent. In Section 1 we
extend the definitions to define a norm k · kp for any p with 1 6 p < ∞.
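These relationships are easy to check numerically. The following sketch (an illustrative aside, not from the notes; it assumes NumPy and real scalars) samples random vectors and verifies the two displayed chains of inequalities.

```python
# Numerical sanity check (illustrative only) of the inequalities relating
# the 1-, 2- and infinity-norms on F^m, here with F = R and m = 10.
import numpy as np

rng = np.random.default_rng(0)
m = 10
for _ in range(1000):
    x = rng.standard_normal(m)
    n1 = np.linalg.norm(x, 1)
    n2 = np.linalg.norm(x, 2)
    ninf = np.linalg.norm(x, np.inf)
    assert n2 <= n1 <= np.sqrt(m) * n2 + 1e-12
    assert n2 / np.sqrt(m) - 1e-12 <= ninf <= n2 + 1e-12
print("inequalities verified on 1000 random vectors")
```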
0.6. Further normed spaces encountered in Part A metric spaces. The following
appeared briefly as examples, with real scalars:
(1) The spaces `1 , `2 and `∞ . These are infinite-dimensional analogues of the finite-
dimensional spaces with the analogous norms. Issues of convergence come into play
here; see Section 1 for details.
(2) Function spaces
(i) Bounded real-valued functions on any set Ω with supremum norm: kf k∞ :=
sup{ |f (x)| | x ∈ Ω }. Boundedness ensures that kf k∞ is finite. Notation in
FA-I is F b (Ω).
(ii) Real-valued continuous functions on a compact set K with the supremum norm.
Here boundedness is guaranteed. Notation: C(K) or CR (K). In particular, K
can be any closed bounded interval in R.
(3) Continuous functions on a closed bounded interval with L1 or L2 norm.
0.8. Proposition. Let X be a normed space. Then the following maps are continuous:
[Here the norm on X × X can be taken to be that given by (x, y) 7→ kxk + kyk, or any
norm equivalent to this.]
0.12. Proposition (closed subspaces of Banach spaces). Let (X, k·kX ) be a normed
space and Y a subspace of X. Then
(i) If (Y, k · kX ) is a Banach space then Y is closed in X.
(ii) If (X, k · kX ) is a Banach space and Y is closed, then (Y, k · kX ) is a Banach space.
In which we assemble our full cast of characters and start to get to know
them.
Context.
All the spaces we shall consider will be real or complex normed spaces. Many of
these will be important in analysis, both pure and applied. Infinite-dimensional spaces
will predominate, with spaces of functions as primary examples. In rare but important
cases the norm will come from an inner product.
Functional analysis as we study it involves vector spaces with additional structure
(a norm function). Thus linearity is always present and all the maps we consider will
be linear maps. This constrains the potential areas of application: mathematical physics
yes; non-linear systems, no.
Many of the spaces we meet are complete, that is, Cauchy sequences converge. A
complete normed space is a Banach space. A complete normed space whose norm
comes from an inner product is called a Hilbert space. The latter spaces have special
properties with a geometric flavour: think of them as behaving like Euclidean spaces.
(a) For a continuous linear map T : X → Y , the kernel ker T is closed but the image Im T
need not be. This explains why certain results in the finite-dimensional theory of dual
spaces and dual maps require dimension arguments. Infinite-dimensional analogues
are likely to bring in Im T .
In a Banach space setting useful sufficient conditions for Im T to be closed are
available.
(b) Minimum distances need not be attained. Given a metric space (X, d) and non-empty
closed subset S ⊆ X we can define the distance from x to S by
dist(x, S) := inf{ d(x, s) | s ∈ S }.
It is an easy (Part A) topology exercise to show x 7→ dist(x, S) is continuous. But
even when d is the metric coming from a norm on a Banach space and S is a closed
subspace, there may not be any s0 ∈ S such that d(x, s0 ) = dist(x, S).
Replace ‘Banach’ by ‘Hilbert’ here and the result is true.
(c) Suppose we have a vector space direct sum X = Y ⊕ Z and consider the projection
map PY : y + z 7→ y (y ∈ Y , z ∈ Z). Any picture you are likely to draw will suggest
that kPY (x)k 6 kxk, whence PY is continuous. Don't be fooled!
Projection maps on general normed spaces need not be continuous (example on a
problem sheet). Continuity is ensured if X is a Banach space and Y, Z closed (this
fact rests on the BCT). Things work better still in a Hilbert space, for orthogonal
projections.
Here (a), (b) and (c) reflect different extents to which the theory and practice of func-
tional analysis in normed spaces is different from what you’ve seen hitherto. Divergence
from the familiar is greatest in general normed spaces, less so in Banach spaces (though
proofs may require hard work), and least in Hilbert spaces.
When (N2) and (N3) hold but (N1) does not then we say that k · k is a seminorm.
The game to play here is to pass to a quotient vector space to convert to a normed space.
The process is set out in Problem sheet Q. 1. For concrete examples see 1.11(b) and (c).
We now bring on stage a full cast of characters for the FA-I course.
To confirm that the Triangle Inequality holds we can start from an inequality due to
Hölder which reduces to the Cauchy–Schwarz inequality for Fm when p = 2. It states
that for x = (xj ), y = (yj ) ∈ Fm and q such that 1/p + 1/q = 1,
Σ_{j=1}^m |xj yj | 6 (Σ_{j=1}^m |xj |^p)^{1/p} (Σ_{j=1}^m |yj |^q)^{1/q} .
An optional exercise on Problem sheet 1 outlines a proof of this. From here one can go
on to derive the Triangle Inequality for the p-norm on Fm (a version of Minkowski’s
Inequality); this is well covered in textbooks.
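Hölder's inequality is easy to test numerically. The sketch below (illustrative only, assuming NumPy and real scalars) checks it for a few conjugate pairs (p, q).

```python
# Numerical check (illustrative only) of Hölder's inequality on F^m:
# sum |x_j y_j| <= (sum |x_j|^p)^(1/p) * (sum |y_j|^q)^(1/q), with 1/p + 1/q = 1.
import numpy as np

rng = np.random.default_rng(1)
for p in (1.5, 2.0, 3.0, 7.0):
    q = p / (p - 1.0)                      # conjugate exponent
    x = rng.standard_normal(20)
    y = rng.standard_normal(20)
    lhs = np.sum(np.abs(x * y))
    rhs = np.sum(np.abs(x) ** p) ** (1 / p) * np.sum(np.abs(y) ** q) ** (1 / q)
    assert lhs <= rhs + 1e-12
    print(f"p = {p}: {lhs:.4f} <= {rhs:.4f}")
```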
and
`∞ = { (xj ) | (xj ) is bounded }, k(xj )k∞ = sup |xj |.
Very easy (AOL) arguments confirm that `∞ is a normed space. So now assume
1 6 p < ∞. We claim firstly that each `p is a vector space and secondly that k · kp makes
it into a normed space. The required arguments can be interwoven. This allows us to be
both efficient and rigorous.
By way of illustration, we consider `1 . We set out the (Prelims-level) proof in some detail to show how to avoid being sloppy. [In particular we never write down an infinite sum Σ_{j=1}^∞ aj before we know that Σ aj converges.]
Let (xj ) and (yj ) be such that Σ |xj | and Σ |yj | converge. By the triangle inequality in F,
|xj + yj | 6 |xj | + |yj | for all j.
Hence, for all n,
sn := Σ_{j=1}^n |xj + yj | 6 Σ_{j=1}^n |xj | + Σ_{j=1}^n |yj | 6 k(xj )k1 + k(yj )k1 .
By the Monotonic Sequences Theorem applied to (sn ), the series Σ |xj + yj | converges and moreover
k(xj ) + (yj )k1 = k(xj + yj )k1 6 k(xj )k1 + k(yj )k1 .
is a well-defined inner product, and then to appeal to Proposition 0.3 for the norm
properties. To this end, let x = (xj ) and y = (yj ) be in `2 . Then for any n > 1, using the
CS inequality in the IPS Fn ,
Σ_{j=1}^n |xj yj | 6 (Σ_{j=1}^n |xj |^2)^{1/2} (Σ_{j=1}^n |yj |^2)^{1/2} 6 kxk kyk.
We deduce that Σ |xj yj | converges. The inner product properties follow easily from
those in the finite-dimensional case using arguments similar to those used to prove `1 is
a normed space.
Remarks on the sequence spaces and their norms.
All the `p norms are available on any finite-dimensional space, and all are equivalent.
The situation is more complicated for the infinite-dimensional sequence spaces `p . See
Problem sheet Q. 5(i).
For the sequence spaces `p , the choice p = 2, and no other, gives a norm coming from
an inner product: The parallelogram law fails for all p 6= 2.
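A quick numerical illustration of the last remark (assuming NumPy and real scalars, and testing only the single pair x = e1, y = e2 in F^2): the parallelogram law kx + yk^2 + kx − yk^2 = 2kxk^2 + 2kyk^2 holds for the 2-norm but fails for the other p-norms.

```python
# Illustrative check that the parallelogram law holds for p = 2 only
# (tested on the single pair x = e1, y = e2 in R^2).
import numpy as np

x = np.array([1.0, 0.0])
y = np.array([0.0, 1.0])
for p in (1, 2, 3, np.inf):
    n = lambda v: np.linalg.norm(v, p)
    lhs = n(x + y) ** 2 + n(x - y) ** 2
    rhs = 2 * n(x) ** 2 + 2 * n(y) ** 2
    print(f"p = {p}: lhs = {lhs:.4f}, rhs = {rhs:.4f}, equal: {np.isclose(lhs, rhs)}")
```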
• C[a, b], the continuous functions on a closed bounded interval [a, b]. More generally
we can take C(Ω), for Ω any compact space. Here boundedness is guaranteed.
• C b (R), bounded continuous functions on R.
It follows from elementary properties of integrals that kf k1 is finite and > 0, and
that (N2) and (N3) hold. It is a Prelims result, too, that
∫_0^1 |f (t)| dt = 0 =⇒ f ≡ 0
(this relies on continuity of f : argue by contradiction, recalling that |f (c)| > 0 forces
|f (t)| > 0 in some interval [0, 1] ∩ (c − δ, c + δ) . . .). A detailed proof can be found
in Metric Spaces notes. Hence (C[0, 1], k · k1 ) is a normed space.
Once again, the case p = 2 is special. The L2 -norm comes from an inner product:
for the complex case,
hf, gi := ∫ f ḡ.
The fact that the traditional norm on an Lp space indeed gives a normed space
will be assumed in FA-I. A proof of the Triangle Inequality appeared in Part A
Integration, starting from Hölder's inequality. These results are important and
detailed accounts are available in many textbooks.
(c) For completeness we record that L∞ (R) is defined to be the space of (equivalence
classes of) bounded measurable functions f : R → F, with
kf k∞ = inf{ M > 0 | |f (t)| 6 M a.e. }.
[This space is mentioned only for completeness.]
The Lebesgue spaces are much more central to FA-II than to FA-I. In this course they
feature primarily as illustrations. In FA-II L2 -spaces, associated with various measure
spaces, are central. The `p -spaces, for 1 6 p < ∞ can be subsumed within the Lp -spaces
using counting measure.
Technical note. Where spaces of integrable functions arise in the course as examples
or, very occasionally, on problem sheets, measurability of the functions involved may be
assumed. FA-I is not a course on measure theory. Non-measurable functions are anyway
elusive beasts, and are not encountered in everyday mathematics; their existence relies
on an assumption from Set Theory (Zorn’s Lemma).
To conclude this section we record some of the special properties shared by IPS-based
examples in which kxk = (hx, xi)1/2 . We collect together, for occasional use later, the
basic toolkit for working with such a norm.
2.1. Density.
Recall that a subset S of a metric space (or more generally a topological space) X
is dense if S = X. Density is an important notion: it will enable us to talk about
approximations of general elements by special ones, and many proofs in functional analysis
and its applications proceed by establishing a given result first for elements of a dense
subset, or subspace, of a normed space and then extending the result to all of X by a
limiting argument.
First examples of dense subspaces:
(a) Let X = `p , where 1 6 p < ∞. Let en = (δjn ), that is, the sequence all of whose coordinates are 0 except the nth coordinate, which is 1. Let x = (xj ) be an arbitrary element of X. Then
kx − Σ_{k=1}^n xk ek kp = k(0, 0, . . . , 0, xn+1 , xn+2 , . . .)kp → 0 as n → ∞.
Hence the subspace of `p consisting of vectors having at most finitely many non-zero
coordinates is dense in `p .
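As a concrete illustration (not from the notes; it assumes NumPy and uses a fixed element of `1 truncated to finitely many terms for the computation), the tail norm above can be watched going to 0.

```python
# Illustrative check of the density claim in l^1: for x = (1/j^2), the tail
# ||x - sum_{k<=n} x_k e_k||_1 = sum_{k>n} 1/k^2 tends to 0 as n grows.
import numpy as np

x = 1.0 / np.arange(1, 100001, dtype=float) ** 2   # x truncated to 10^5 terms
for n in (10, 100, 1000, 10000):
    print(f"n = {n:5d}: ||x - x^(n)||_1 ≈ {np.sum(x[n:]):.6f}")
```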
(b) [Assumed fact] The step functions are dense in Lp (R) for 1 6 p < ∞. A correspond-
ing result holds for Lp (a, b), where −∞ 6 a < b 6 ∞.
In functional analysis, step function approximations are usually much easier to
work with than approximations by simple functions.
Example: Consider C[−1, 1] with norm kf k1 = ∫_{−1}^{1} |f (t)| dt. To prove non-completeness, consider the sequence (fn ) of continuous piecewise-linear functions for which
fn (t) = −1 if −1 6 t 6 −1/n, fn (t) = nt if −1/n < t < 1/n, fn (t) = 1 if 1/n 6 t 6 1.
Suppose for contradiction that kfn − gk1 → 0, where g ∈ C[−1, 1]. But kfn − f k1 → 0, where f = χ(0,1] − χ[−1,0) . This implies g = f a.e., which is not possible.
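A small computation (illustrative only; it assumes NumPy and approximates the integrals by a midpoint rule) makes the point concrete: (fn ) is Cauchy in the 1-norm, and its 1-norm distance to the discontinuous function f is 1/n.

```python
# Illustrative computation for the non-completeness example on C[-1, 1].
import numpy as np

def f_n(n, t):
    return np.clip(n * t, -1.0, 1.0)     # the piecewise-linear functions above

def l1_dist(g, h, num=200001):
    # midpoint-rule approximation of the integral of |g - h| over [-1, 1]
    t = np.linspace(-1.0, 1.0, num)
    mid = (t[:-1] + t[1:]) / 2
    return np.sum(np.abs(g(mid) - h(mid))) * (t[1] - t[0])

f = lambda t: np.sign(t)                 # the (discontinuous) pointwise limit
for n in (5, 10, 20, 40):
    d = l1_dist(lambda t: f_n(n, t), f)
    print(f"n = {n:3d}: ||f_n - f||_1 ≈ {d:.5f}   (exact value 1/n = {1/n:.5f})")
print("||f_10 - f_20||_1 ≈", round(l1_dist(lambda t: f_n(10, t), lambda t: f_n(20, t)), 5))
```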
We are ready for a clutch of archetypal completeness proofs. All involve function
spaces. There’s some overlap with Part A Metric Spaces but we give several proofs here
to reinforce the key points in the strategy. Direct proofs via Cauchy sequences follow a
uniform pattern. Subsidiary results can obtained via ”closed subspace of a Banach space
is a Banach space” (from 0.12).
It follows that (fn (t)) is a Cauchy sequence in R. Hence there exists a real number, write
it as f (t), such that fn (t) → f (t).
An alternative here is simply to show that C[0, 1] is a closed subspace of the Banach
space F b ([0, 1]). This amounts to piggybacking on Example 2.3 to reduce the problem to
Step 3 alone.
and the RHS can be made less than a given ε if m, n > N , for some N ∈ N. Keeping K and n fixed and letting m → ∞, (AOL) gives
Σ_{j=1}^K |xj(n) − xj | 6 ε for all n > N.
Since this is true for all K we deduce that (xj(n) − xj ) belongs to `1 for all n > N and that its norm tends to 0 as n → ∞.
Step 3: x = (xj ) ∈ `1 . This follows from Step 2 and the triangle inequality for `1 , by writing x as (xj − xj(N) ) + (xj(N) ).
A similar but messier proof shows that `p is complete for 1 < p < ∞.
Proof. =⇒ : this is proved in exactly the same way as for the case in which X is R or
C, with k · k in place of | · |—check Prelims notes!
⇐=: Take any Cauchy sequence (yn ) in X. We endeavour to construct a series Σ xn in X which is absolutely convergent and such that x := Σ_{n=1}^∞ xn supplies the limit we require for (yn ). We can find natural numbers n1 < n2 < · · · such that
ky` − ym k < 2−k for `, m > nk .
Let sk = ynk . Define (xk ) by
x 1 = s1 , xk = sk − sk−1 (k > 1).
Then the real series Σ kxk k converges by comparison with Σ 2−k , so Σ xk is absolutely convergent, and hence by assumption it converges. Moreover, by construction, Σ xk is a telescoping series:
ynk = sk = Σ_{`=1}^k x` .
We deduce that (ynk ) converges. But, just as in Prelims Analysis, a Cauchy sequence in
a normed space which has a convergent subsequence must itself converge.
(c) Quotient spaces. It can be proved from Theorem 2.10 that if X is a Banach space
and Y is a closed subspace of X then X/Y is a Banach space for the quotient norm:
kx + Y k := inf{ kx + yk | y ∈ Y }.
(Here closedness of Y is necessary to ensure that we have a norm rather than merely
a seminorm.)
We shan’t need this result in FA-I. The proof is rather technical and we omit it.
We conclude this section with some indications of why Hilbert spaces are special and
what distinguishes them from inner product spaces in general and from Banach spaces in
general. This brief account can be seen as providing context to FA-I and as a look-ahead
to FA-II. [The theorems mentioned below do not form part of the examinable syllabus
for FA-I].
2.12. A glimpse at Hilbert spaces. The Prelims and Part A Linear Algebra courses
reveal features and methods that apply only to inner product spaces. Euclidean spaces,
along with much of their geometry, form the prototype for finite-dimensional IPS’s. A
central notion is that of orthogonality and a key theorem asserts that if L is a subspace
of a fd IPS V then
(†) V = L ⊕ L⊥ .
This is a key step in the proof of the Spectral Theorem for self-adjoint operators in finite-dimensional IPS's, since it allows one to proceed by induction on dim V .
This can be proved using ideas of the Gram–Schmidt process and orthonormal bases.
More geometrically, given x ∈ V , one wants y ∈ L so that x − y ∈ L⊥ , which then implies V = L + L⊥ . Geometrically, we want to pick yx ∈ L so that d(x, yx ) is as small as possible.
Algebraic arguments tell us this will work when the IPS V is finite-dimensional. But does
it work in general?
We want
d(x, yx ) = δ := inf{ d(x, y) | y ∈ L },
that is, kx − yx k = inf{ kx − yk | y ∈ L }. A viable strategy, in a Hilbert space, is to
take a sequence (yn ) in L such that kx − yn k → δ and to apply the parallelogram law to
show (yn ) is Cauchy, and so convergent. Finally we’d need to add the assumption that
L is closed to get yx := lim yn ∈ L. By continuity of the norm function, yx is the closest point we seek.
Hence:–
In Part A Metric Spaces an indication was given that a map from one normed space
to another has special properties when it is linear AND continuous. We shall call such a
map a continuous linear operator. We now elaborate on what was shown in Metric
Spaces. A map T : X → Y satisfying condition (3) in Proposition 3.1 will be said to be
bounded.
(1) Shift operators on sequence spaces. To illustrate, we work with maps from `1
to `1 . Other examples are available. Define
R(x1 , x2 , . . .) = (0, x1 , x2 , . . .),
L(x1 , x2 , . . .) = (x2 , x3 , . . .).
It is clear that each of R and L maps `1 into `1 and is linear. Also, for any x =
(xj ) ∈ `1 ,
kRxk = kxk and kLxk 6 kxk.
Hence R, L ∈ B(`1 , `1 ), with kRk = 1 immediate from (Op3) and kLk 6 1.
Let en = (δjn )j>1 . Then ken k = 1 for all n. Then kLe2 k = ke1 k = 1 and we
deduce that kLk = 1, with the supremum in (Op3) being attained.
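A minimal sketch of these two operators (illustrative only, acting on finitely supported sequences represented as Python lists) shows the norm calculations in action.

```python
# Shift operators on l^1, restricted to finitely supported sequences.
def right_shift(x):
    return [0.0] + list(x)           # R(x1, x2, ...) = (0, x1, x2, ...)

def left_shift(x):
    return list(x[1:])               # L(x1, x2, ...) = (x2, x3, ...)

def norm1(x):
    return sum(abs(t) for t in x)

x = [3.0, -1.0, 0.5]
print(norm1(right_shift(x)) == norm1(x))    # ||Rx|| = ||x||, so ||R|| = 1
print(norm1(left_shift(x)) <= norm1(x))     # ||Lx|| <= ||x||, so ||L|| <= 1
e2 = [0.0, 1.0]
print(norm1(left_shift(e2)) == 1.0)         # ||L e2|| = ||e1|| = 1, so ||L|| = 1
```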
(2) Let X = Y = CR [0, 1] and define T by (T x)(t) = tx(t) (for all t ∈ [0, 1]). We claim
that T ∈ B(X) and kT k = 1. Prelims Analysis confirms that T maps X into X and
is linear [no proof called for here]. Also, for all x ∈ X and t ∈ [0, 1],
|(T x)(t)| = |tx(t)| 6 |x(t)| 6 kxk∞ .
Hence kT xk 6 kxk for all x ∈ X and we have equality when x ≡ 1. It follows that
T is bounded, with kT k = 1.
(3) T as in (2) but now X = Y = L2 (0, 1). Other Lp spaces are available.
First note that g : t 7→ t2 is a bounded measurable function on [0, 1], so x ∈ L2 (0, 1)
implies T x ∈ L2 (0, 1) too and linearity of T is clear. Also
kT xk2^2 = ∫_0^1 |tx(t)|^2 dt 6 ∫_0^1 |x(t)|^2 dt = kxk2^2 .
[Here X is not a Banach space: it is a dense proper subspace of C[0, 1]. Results on
Banach spaces proved in FA-II show why unbounded operators on Banach spaces are
hard to find.]
3.5. Remarks on calculating operator norms. Suppose X, Y are normed spaces and
T : X → Y is linear. Suppose we seek to show T is bounded and to calculate its norm.
As our examples have shown, it is often relatively easy to find some constant M such
that kT xk 6 M kxk for all x. This implies that T is bounded, with kT k 6 M . But it is
often harder to find the least such M , as in (Op4) in Proposition 3.1.
Two cases can arise.
1. Supremum attained in (Op1)/(Op2)/(Op3). We are in luck! We can show
kT xk 6 M kxk for all x and that there exists x0 6= 0 such that kT x0 k = M kx0 k.
Then the sup in (Op1) is attained and we have kT k = M .
2. Supremum not (necessarily) attained in (Op1)/(Op2)/(Op3).
Recall the Approximation Property for sups from Prelims Analysis I: Let S be
a non-empty subset of R which is bounded above, so sup S exists. Then, given ε > 0,
there exists s ∈ S, depending on ε, such that
sup S − ε < s 6 sup S.
This characterises the sup in the sense that if M is an upper bound for S and there
exists a sequence (sn ) in S such that sn > M − n−1 for all n then M = sup S. Our
witnessing sequence method in 3.3(3) draws on this. Note also Example 3.6.
Note finally that we can also use sequences to witness that a linear map T is un-
bounded: this happens if there exists a sequence (xn ) with kxn k = 1 (or (kxn k) bounded
will do) and kT xn k → ∞ as n → ∞.
Then
kx − x(n) k1 = Σ_{k=n+1}^∞ |xk | → 0 as n → ∞.
See Problem sheet Q. 11 for a cautionary example concerning a bounded linear op-
erator from `∞ into `1 .
Then one may measure the ‘size’ of A using either of the following quantities:
kAk1 = max_{16j6n} Σ_{i=1}^n |aij | ,
kAk∞ = max_{16i6n} Σ_{j=1}^n |aij | .
It can be shown, cf. Example 3.6, that k · k1 and k · k∞ are respectively the norms on
B(Rn ) when Rn has the 1-norm and the ∞-norm.
One may likewise ask for a formula for the norm on B(Rn ) when Rn has the 2-
norm. This is more elusive. However when we have a real symmetric matrix A, this is
diagonalisable, with eigenvalues λ1 , . . . , λn (not necessarily distinct). Then
kAk2 := sup_{kxk2 =1} kAxk2 = max{ |λ1 |, . . . , |λn | }.
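These formulas agree with the operator norms NumPy computes directly; the sketch below (illustrative only, assuming numpy.linalg) compares the max-column-sum, max-row-sum and eigenvalue formulas with numpy.linalg.norm for a real symmetric matrix.

```python
# Comparing the matrix norm formulas with numpy.linalg.norm.
import numpy as np

A = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])          # real symmetric

col_sum = np.abs(A).sum(axis=0).max()       # candidate for ||A||_1
row_sum = np.abs(A).sum(axis=1).max()       # candidate for ||A||_inf
eig_max = np.abs(np.linalg.eigvalsh(A)).max()

print(col_sum, np.linalg.norm(A, 1))        # both 4.0 (max column sum)
print(row_sum, np.linalg.norm(A, np.inf))   # both 4.0 (max row sum)
print(eig_max, np.linalg.norm(A, 2))        # both max |lambda_i| for symmetric A
```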
3.8. Theorem (completeness). Let X be a normed space and Y a Banach space, both
over the same field. Then the normed space B(X, Y ) is complete.
Proof. The proof goes in very much the same way as the completeness proofs for function
spaces given in Section 2. Let (Tn ) be a Cauchy sequence in B(X, Y ).
Step 1: candidate limit. For each x we have
kTm x − Tn xk = k(Tm − Tn )xk 6 kTm − Tn k kxk.
We deduce that (Tn x) is Cauchy in Y , and so convergent to some T x ∈ Y . We thus have
a map T : X → Y . It follows from the continuity of addition and scalar multiplication
(see 2.2) that T is linear.
Steps 2 and 3: from pointwise convergence to norm convergence and proof
that T ∈ B(X, Y ).
Observe that if, for some N , we can show that the linear map T − TN is bounded
then T = (T − TN ) + TN will be bounded too. For any fixed ε > 0,
∃N ∀m, n > N kTm x − Tn xk 6 kTm − Tn kkxk 6 εkxk.
Fix x and n > N and let m → ∞ to get kT x−Tn xk 6 εkxk and hence k(T −Tn )xk 6 εkxk.
This implies in particular that T − TN is bounded. Also n > N implies kT − Tn k 6 ε.
Therefore kT − Tn k → 0.
3.9. Corollary. For any normed space X over F, with F as R or C, the space X ∗ =
B(X, F) of bounded linear functionals is complete, where the norm is
kf k = sup{ |f (x)|/kxk | x 6= 0 }.
We shall make significant use of spaces of the form X ∗ later in the course.
instance of this.
Assume now that (a) and (b) hold. Then there exists a map S : X → X such that
T ◦ S = S ◦ T = I. We’d then want S to be a bounded linear operator on X. So we need
(c) S is linear: this is always true (routine calculation, just Linear Algebra);
(d) S is bounded, that is, there exists a constant K > 0 such that
kSyk 6 Kkyk ∀y ∈ X.
3.15. Proposition (closed range). Let X be a Banach space and T ∈ B(X). Assume
that (?) holds for some constant K. Then T is injective and T X is closed.
If in addition T X = X, then T is bijective and T is invertible.
Proof. From (?), T x = 0 implies x = 0, so ker T = {0}. Now assume (yn ) is a sequence
such that yn = T xn and yn → y. Then (T xn ) is Cauchy and then (?) implies that (xn ) is
Cauchy. Since X is Banach, there exists x such that xn → x. Then yn = T xn → T x so
y ∈ T X.
Assume further that T X is dense. Then T X = X. Moreover (?) tells us that S = T −1
is bounded.
3.17. Proposition. Let X be a Banach space. Let T ∈ B(X) be such that kT k < 1.
Then I − T is invertible with inverse given by Σ_{k=0}^∞ T^k (where T^0 := I).
Moreover, if P ∈ B(X) is such that kI − P k < 1 then P is invertible.
This result is important in spectral theory and in its applications, for example the
theory of integral equations.
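The Neumann series is easy to watch converging for a matrix; the sketch below (illustrative only, assuming NumPy and using the operator 2-norm of a matrix as kT k) compares the partial sums of Σ T^k with (I − T )−1.

```python
# Illustrative check of the Neumann series for a matrix T with ||T|| < 1.
import numpy as np

rng = np.random.default_rng(2)
T = rng.standard_normal((4, 4))
T *= 0.5 / np.linalg.norm(T, 2)           # rescale so ||T||_2 = 0.5 < 1

I = np.eye(4)
partial, power = np.zeros((4, 4)), np.eye(4)
for k in range(60):                        # partial sum I + T + ... + T^59
    partial += power
    power = power @ T
print(np.max(np.abs(partial - np.linalg.inv(I - T))))   # essentially zero
```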
We already know that the various p-norms on Fm are equivalent. But we can now do
better.
Proof. Denote the given norm of X by k · k. Take a fixed basis {x1 , . . . , xm } for X and let x = Σ λj xj be a general element of X, where the scalars λj are uniquely determined by x. Then
kT (λ1 x1 + · · · + λm xm )k = kλ1 T x1 + · · · + λm T xm k 6 |λ1 |kT x1 k + · · · + |λm |kT xm k 6 (Σ_{j=1}^m kT xj k) max_{16j6m} |λj |.
Proof. (i) This is a very simple instance of the strategy for proving completeness in
function spaces (we can regard Fm as the space of bounded functions from {1, . . . , m} to
F, or just think in terms of coordinates).
For (ii) it suffices to prove that X is complete in some norm since all norms on X are equivalent and, for equivalent norms, the same sequences are Cauchy. Take a basis {x1 , . . . , xm } for X and define kΣ_j λj xj k to be max_j |λj |. This is a norm on X and X with this norm is Banach, exactly as in (i).
We now consider closed subspaces, recalling Proposition 0.12. We give two proofs
for the following result, one which uses all the machinery of this section, the other much
more direct.
4.7. Theorem (compactness of closed unit ball). Let X be a normed space and let
S := B(0, 1) = { x ∈ X | kxk 6 1 } be the closed unit ball in X. Then S is compact if
and only if X is finite-dimensional.
Proof. ⇐=: By Theorem 4.2, X is isomorphic and homeomorphic to some normed space
Fm . Thus it is sufficient to show that the closed unit ball in a space Fm is compact. But
as noted above in this special case this follows from the Heine–Borel Theorem.
=⇒ : We first introduce some useful notation, which just uses the vector space
structure of X. For A, B ⊆ X and c > 0, let
A + B = { a + b | a ∈ A, b ∈ B } and cA = { ca | a ∈ A }.
Note that if Y is a subspace then Y + Y = Y and cY = Y for all c > 0.
Now assume that S is compact. We shall make use of the fact that a compact subset
of a metric space is totally bounded. This means that for each ε > 0 there exists a
finite set Fε (called an ε-net) in X such that every point of S is within a distance ε of
some point in Fε . The proof is easy: consider the open cover of S consisting of all open
balls in X of radius < ε with centres in S.
Let F be a 1/2-net. Then, for any x ∈ S, there exists u ∈ F such that kx − uk < 1/2.
Saying this another way, S ⊆ F + B(0, 1/2), and this implies S ⊆ F + ½S.
Let Y = span(F ). Then
S ⊆ F + ½S ⊆ Y + ½S.
By a simple notation-chase,
½S ⊆ Y + ¼S.
Putting these together,
S ⊆ Y + Y + ¼S = Y + ¼S.
Proceeding by induction
S ⊆ Y + 2−k S for all k > 1.
But then S ⊆ ∩_k (Y + 2−k S). The set on the RHS is the closure of Y . Since Y is finite-dimensional, Y is closed. Therefore S ⊆ Y . But this implies X ⊆ Y . Hence X = Y and so X is finite-dimensional.
• C[0, 1] is a commutative ring with identity ((1), with scalar multiplication ignored,
& (2));
• C[0, 1] is a commutative Banach algebra ((1) & (2), plus completeness to cover the
inclusion of ‘Banach’ in the name).
• C[0, 1] is a vector lattice: ((1) & (4)).
A surfeit of riches!
[Note: In all cases the formal definitions of the italicised terms involve some axioms
to ensure the various operations interact as we would wish. Not important for us because
we only work with the special case of C(K).]
Observe that we have also met normed spaces with a supplementary operation of
product in Section 3: the spaces B(X, Y ), in which kST k 6 kSk kT k and which are
Banach spaces when Y is a Banach space. These spaces are examples of non-commutative
Banach algebras.
Everything said so far works without change if we replace [0, 1] by any compact
space K. Compactness of K guarantees that kf k∞ is finite for each f ∈ C(K) so the sup
norm is available. Henceforth in this section C(K) will denote the real-valued continuous
functions on a non-empty compact set K.
Obviously, the constant functions fail to separate points. In general, the smaller a subset
Y of C(K) is the less likely it is to separate points. We may ask when C(K) itself
separates the points of K. This is the case if
We mention these results because the Stone–Weierstrass Theorem that we prove for a
space C(K) (both forms) would be vacuous if C(K) failed to separate points.
What we are aiming for is sufficient conditions on a subspace Y of C(K) which will
guarantee that it is dense in C(K). This suggests that it needs to be ‘big’. In particular
it’s worth noting that Y cannot be dense if it is characterised by a property which lifts
from Y to its closure but which does not hold universally in C(K). For example, the
following are proper closed subspaces of C[0, 1] and so cannot be dense:
• { f ∈ C[0, 1] | f (1/2) = 0 };
• { f ∈ C[0, 1] | f (0) = f (1) }.
The first of these separates points but fails to contain the constants, while the second
contains the constants but fails to separate points.
5.3. Two-point fit lemma. Let Y be a subspace of C(K) containing the constant
functions and separating the points of K. Let p 6= q in K and α, β ∈ R. Then there
exists g ∈ Y such that
g(p) = α and g(q) = β.
Proof. Since Y separates points, we can first choose f ∈ Y such that f (p) 6= f (q). Now
consider g = λf + µ1 and aim to choose λ, µ ∈ R so that
α = λf (p) + µ, β = λf (q) + µ.
These equations are uniquely soluble for λ and µ. Since Y is a subspace containing the
constants, λf + µ1 ∈ Y .
Proof. We want to show that, given f ∈ C(K) and any ε > 0, we can find h ∈ L such that
f − ε < h < f + ε.
Step 1: approximating f at points p, q ∈ K. We claim there exists g ∈ L such that
g(p) = f (p) and g(q) = f (q). If p 6= q this comes from the Two-point fit Lemma. If p = q
a constant function will serve for g.
[This step doesn’t need the assumption that L is closed under max and min.]
Step 3: approximating f from above. We now vary p. For each p we can choose an open set Vp containing p such that
g p (t) < f (t) + ε for all t ∈ Vp .
By compactness, there exist p1 , . . . , pm such that K = Vp1 ∪ · · · ∪ Vpm . Now
h := g p1 ∧ · · · ∧ g pm < f + ε.
Step 4: putting the pieces together. Consider h as in Step 3. From Step 2, each
g pi > f − ε. Hence h > f − ε. Therefore kh − f k∞ < ε. Also h ∈ L. We conclude that
f belongs to the closure of L.
We shall now use the sublattice form of SWT to get at a different form of the the-
orem which is particularly useful, and which subsumes Weierstrass’s classic polynomial
approximation theorem. We are going to need the following lemma, which can be seen
as a very special case of the uniform approximation of a continuous function on [0, 1] by
polynomials.
5.7. Technical lemma (approximating √t on [0, 1] by polynomials). Define a sequence (pn ) of polynomials recursively by
p1 (t) = 0, pn+1 (t) = pn (t) + ½ (t − pn (t)^2 ) (n > 1).
Then (pn (t)) is an increasing sequence for each t ∈ [0, 1] and pn (t) → t^{1/2} uniformly on [0, 1].
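The recursion is simple to run numerically; the sketch below (illustrative only, assuming NumPy) shows the values increasing in n and the error on a grid of points in [0, 1] going down.

```python
# Illustrative run of p_{n+1}(t) = p_n(t) + (t - p_n(t)^2)/2 with p_1 = 0.
import numpy as np

t = np.linspace(0.0, 1.0, 1001)
p = np.zeros_like(t)                          # p_1 = 0
for n in range(1, 61):
    p_next = p + 0.5 * (t - p ** 2)
    assert np.all(p_next >= p - 1e-15)        # monotone increase at every t
    p = p_next
    if n in (5, 20, 60):
        err = np.max(np.abs(p - np.sqrt(t)))  # error on the grid
        print(f"after {n:2d} steps: max |p(t) - sqrt(t)| = {err:.5f}")
```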
Then Fn is closed and Fn+1 ⊆ Fn for all n. Since [0, 1] is compact, the Finite Intersection Property implies that if all Fn 's were non-empty then ∩_n Fn 6= ∅. But this would imply that there is some point t at which fn (t) does not tend to 0, contrary to assumption. So there exists N such that FN = ∅. Then 0 6 fn (s) < ε for all s ∈ [0, 1] and for all n > N .
It is clear that the set of real polynomials on a closed bounded interval of R is not
closed under the lattice operations of max and min: just consider t and −t on [−1, 1].
But the polynomials are closed under forming products.
Proof. Take ε > 0. Choose a real polynomial p such that kf − pk < ε. Then, by the
assumption, and properties of integration,
0 6 ∫_0^1 f (t)^2 dt = ∫_0^1 f (t)(f (t) − p(t)) dt.
Now
∫_0^1 f (t)(f (t) − p(t)) dt 6 ∫_0^1 |f (t)(f (t) − p(t))| dt 6 kf k kf − pk 6 kf k ε.
Since ε was arbitrary this forces ∫_0^1 f (t)^2 dt = 0. Since f^2 is continuous and non-negative, f^2 , and so also f , is identically zero. (Recall 1.11(a).)
5.16. Lemma. Let (X, k · k) and (X, k · k0 ) be normed spaces with equivalent norms.
Then either both are separable or both are inseparable.
Proof. A subset D of X is dense in both spaces or neither: either use the definition of
equivalent norms, or use the result that the two spaces have the same open sets.
It goes without saying that to understand and to work with separability you need to
know the rudiments of the theory of countable and uncountable sets, as introduced in Prelims Analysis I. This will be assumed in this section but a brief summary is provided
in a supplementary note, Review of basic facts about countability. It will be legitimate to
quote the results recalled there. We enclose such justifications within square brackets in
the proofs we give below.
Let’s begin with an example to show that separability can’t be taken for granted.
Other examples of inseparable spaces are the space of Lipschitz functions and L∞ (R).
See Problem sheet Q. 17 for details.
5.18. Proposition (first examples of separable spaces). Let the scalar field F be R
or C. The following are separable:
(1) F;
(2) Fm , with any norm;
(3) `1 .
Proof.
(1) Q is countable and dense in R. Q + iQ := { a + ib | a, b ∈ Q } is countable and dense in C (for example for the k · k∞ -norm, and hence for any norm by 5.16). So F has
a countable dense subset, henceforth denoted CF .
(2) Let’s consider Fm with the ∞-norm and let CF be countable and dense in F. Then
(CF )m is countable [finite product of countable sets]. Take any (x1 , . . . , xm ) ∈ Fm
and any ε > 0. Then for each j there exists sj ∈ CF such that |xj − sj | < ε. Hence
k(x1 , . . . , xm ) − (s1 , . . . , sm )k∞ = max_j |xj − sj | < ε.
By (2) we can choose (a1 , . . . , aN ) ∈ (CF )N such that k(x1 , . . . , xN )−(a1 , . . . , aN )k1 <
ε. Let s = (a1 , . . . , aN , 0, 0, . . .). Then s ∈ D and kx − sk < 2ε.
These examples illustrate techniques which are available more widely. As ever, the
span of a set S in a vector space X is
span(S) := { Σ_{i=1}^k λi si | k = 1, 2, . . . , λi scalar, si ∈ S }.
Crucially, we are only allowed to form finite linear combinations here. There is no pre-
sumption here in general that S is countable.
The import of the following theorem is, loosely, that we can build in the separability
of the scalars and facts about countable sets to arrive at a viable test for separability
which avoids the need for messy approximation arguments.
Proof. (i) Every element of Y is a finite linear combination of elements of S. The set T
of finite linear combinations of elements from S with scalars drawn from the countable
set CF is countable:
T = ∪_{n>1} { Σ_{j=1}^n aj sj | aj ∈ CF , sj ∈ S }
[countable union of countable sets]. Moreover T is dense in Y (either slog it out or exploit
the fact that addition and scalar multiplication are continuous (see 0.8)).
(ii) By (i), applied with Y = span(S), we see that Y is separable. Let D be countable and dense in Y . Then, writing cl_Y and cl_X for closures taken in Y and in X respectively,
cl_Y (D) = Y and cl_X (Y ) = X.
But (by elementary topology),
cl_Y (D) = cl_X (D) ∩ Y,
so Y ⊆ cl_X (D) and, since cl_X (Y ) = X by assumption, it follows that D is dense in X.
(iii) The facts underlying the proof here are that, in a normed space X, the span of a
subset P is the smallest subspace containing P and that the closure of a subset Q is the
smallest closed set containing Q.
We have
S ⊆ span(S) =⇒ cl(S) ⊆ cl(span(S))
=⇒ span(cl(S)) ⊆ cl(span(S)) (closure of a subspace is a subspace)
We now exploit earlier density results by showing that we can find suitable countable subsets S such that span(S) is dense or span(cl(S)) is dense.
(4) C[a, b] (real-valued continuous functions on [a, b] ⊆ R, with sup norm) is separable.
More generally, C(K), for K a compact subset of Rn , is separable.
Proof. For C[a, b]: Take S to be the set {1, x, x2 , . . .}. Its span is the subspace of
all polynomials, and this is dense by Weierstrass’s Theorem. For the general result,
take S to be the set of monomials in variables x1 , . . . , xn , that is, expressions x1^{q1} · · · xn^{qn} , where each qi ∈ N; note that S is countable. SWT (subalgebra form) implies span(S) is dense.
Proof. Apply Theorem 5.19 with S as the set of characteristic functions of bounded
intervals with rational endpoints. Here the closure of S in the Lp norm contains
all characteristic functions of bounded intervals. Then span(cl(S)) contains the step functions (why?) and so is dense. [Assumed fact: the step functions are dense in Lp (R).]
It is a salutary exercise to attempt to prove these results directly from the definition
of separability. For example, you might try to show that a suitable countable set of
piecewise linear functions is dense in C[a, b]. It’s messy! Not recommended.
Likewise, direct construction of a countable dense subset in an Lp space is messy.
Now for another general result, which we present in the setting of metric spaces.
5.21. Theorem. Let Y be a subset of a separable metric space (X, d). Then (Y, d) is
separable.
There is one very special case in which a separable normed space X does have available
a good notion of basis. When X is a separable Hilbert space, the notion of orthonormal
basis works well. An orthonormal sequence (xn ) is an orthonormal basis if hx, xni = 0
for all n implies x = 0. This concept is explored in FA-II. It has connections with or-
thogonal expansions as encountered in certain courses in applied analysis and differential
equations. The Projection Theorem, stated in 2.12, underpins the theory.
We do, however, have a simple positive result which holds in any separable normed
space.
We conclude this section with a general theorem on the theme ‘sometimes it is good
enough to have information on a dense subspace’. We shall exploit this and Lemma 5.24
when presenting a proof of the Hahn–Banach Theorem for separable spaces in the next
section. The theorem is also needed in FA-II. Note the crucial requirement that the
operator must map into a Banach space.
Proof. Let x ∈ X. Then there exists a sequence (zn ) in Z such that kx − zn k → 0. Now,
because T is bounded and linear,
kT zm − T zn k 6 kT k kzm − zn k 6 kT k (kzm − xk + kx − zn k) .
We deduce that (T zn ) is a Cauchy sequence in Y . This converges to an element y ∈ Y , depending on x. Denote it by T̃ x. But we now must confront a possible issue of well-definedness.
Suppose we have a rival sequence (zn0 ) in Z which also converges to x. We need to
show that lim T zn = lim T zn0 . To this end consider
kT zn − T zn0 k 6 kT k kzn − zn0 k 6 kT k (kzn − xk + kx − zn0 k) .
Since the RHS tends to 0, the limits lim T zn and lim T zn0 (which we know exist) are the same. Therefore T̃ x can be unambiguously defined to be lim T zn where (zn ) is any sequence converging to x. In addition we see from this that T̃ extends T (consider a constant sequence). Uniqueness of the extension also follows because any continuous extension of T to X, say T̂ , must be such that, for zn → x as above,
T̂ x = lim T̂ zn = lim T zn .
Linearity of T̃ comes from the continuity of addition and scalar multiplication in a normed space. In more detail, let x, v ∈ X with approximating sequences zn → x and wn → v, with zn , wn ∈ Z for all n. Let λ, µ ∈ F. Then λzn + µwn → λx + µv. But
T (λzn + µwn ) = λT zn + µT wn → λT̃ x + µT̃ v
and, by the definition of T̃ , we also have T (λzn + µwn ) → T̃ (λx + µv). By uniqueness of limits in a normed space we see that T̃ is indeed a linear map from X into Y .
Moreover, by continuity of norm,
kT̃ xk = k lim T zn k = lim kT zn k 6 lim kT k kzn k = kT k kxk.
Hence T̃ is bounded, with norm 6 kT k. But because T̃ extends T we must have the reverse inequality too, so we get equality.
In which we present the Hahn–Banach Theorem and give the proof for
a separable normed space. In which too we begin to reveal the HBT’s
powerful and far-reaching consequences.
Recall that X ∗ is always a Banach space, whether or not X is complete (see Corol-
lary 3.9): completeness of X ∗ comes from the completeness of the scalar field. Warning:
choice of notation (X ∗ or X 0 ) for bounded linear functionals is not uniform across the
literature and past exam questions.
We slot in here an easy consequence of our Extension Theorem for bounded linear
operators, Theorem 5.25. It tells us that a normed space and any dense subspace of it
have essentially the same dual space.
Proof. Certainly J is well-defined, linear and continuous. Theorem 5.25 ensures that J
is surjective and an isometry.
We now give a result about linear functionals which is not a specialisation of one true
for linear operators.
(1) Example of an unbounded linear operator with closed kernel. Let X := span{en }n>1
in `1 , where as usual en = (δnj ). Then X is a dense and proper subspace of `1 .
Define T : X → `1 by T (xj ) = (jxj ). Then ken k1 = 1 and kT en k1 = n, so T is
unbounded. Clearly ker T = {0}.
(2) Example of an unbounded linear functional. Let X be the subspace of C[0, 1] (real-
valued functions, sup norm) such that X consists of the continuously differentiable
functions. Let f : X → R be given by f (x) = x0 (1/2) for x ∈ X. Then f is linear
but not bounded.
Hence
α := sup_{z 0 ∈Z} {−g(z 0 ) − kz 0 + yk} 6 inf_{z∈Z} {−g(z) + kz + yk} =: β.
We then choose c ∈ [α, β]. Then, working backwards, we see that our requirements in
(†) are satisfied. But so far we’ve not considered what h will do to a general element
of W \ Z. We shall now confirm that khk 6 1 when c is chosen as above. For λ 6= 0,
|h(z + λc)| = |λ||g(z/λ) + c| 6 |λ|k(z/λ) + yk = kz + λyk;
the idea here is to exploit the fact that Z is closed under scalar multiplication: we call
on (†) with z/λ in place of z, considering separately the cases λ > 0 and λ < 0 to arrive
at the required inequality. Finally note that h, as an extension of g, has khk > kgk.
6.7. Hahn–Banach Theorem, complex case. Let X be a complex normed space. Let
Y be a subspace of X and let g ∈ Y ∗ . Then there exists f ∈ X ∗ such that f |Y = g and
kf kX = kgkY .
Proof. We first consider real and imaginary parts of linear functionals on a complex vector
space X. Let f be a C-linear map from X into C. Write f (x) = u(x) + iv(x) for each
x ∈ X. Then (just calculate) u and v are real-linear and v(x) = −u(ix) and
f (x) = u(x) − iu(ix).
Conversely, given an R-linear functional u : X → R, we can define f in terms of u as
above to obtain a C-linear functional f : X → C with re f = u.
Now consider a complex normed space X, subspace Y and a complex-linear bounded
linear functional g on Y . Restricting to real scalars, regard X as a real space XR with
Y as a real subspace YR , and let u = re g. Apply Theorem 6.6 to extend u from YR to
a linear functional w on XR with kukYR = kwkXR . Define f by f (x) = w(x) − iw(ix).
This is C-linear and extends g. It remains to check that kf kX = kgkY . First of all,
|u(y)| = |re g(y)| 6 |g(y)| for all y ∈ Y . Hence kukYR 6 kgkY . Consider x ∈ X. Then f (x) ∈ C. We can choose θ such that |f (x)| = eiθ f (x). Then
|f (x)| = f (eiθ x) = re f (eiθ x) = w(eiθ x) 6 kwkXR keiθ xk 6 kgkY kxk.
So kf kX 6 kgkY and the reverse inequality is trivial.
Solution.
=⇒ : Assume f exists. Then, for any J and any scalars λj ,
|Σ_{j∈J} λj cj | = |Σ_{j∈J} λj f (xj )| = |f (Σ_{j∈J} λj xj )| 6 kf k kΣ_{j∈J} λj xj k 6 M kΣ_{j∈J} λj xj k.
⇐=: Let Y be the subspace of X spanned by S, so that the elements of Y are all
finite linear combinations of elements of S. “Define” f on Y by
f (Σ_{j∈J} λj xj ) = Σ_{j∈J} λj cj .
and note that this is an element of Y to which the given condition can be applied to
show that the two representations of x give the same value for f (x). Moreover f is linear
(routine) and bounded, with kf kY 6 M . Also by construction each xi ∈ Y and f (xi ) = ci
for each i ∈ I. Now apply HBT to extend f from Y to X.
Corollaries of HBT tumble out fast from Theorems 6.6 and 6.7 with little work needed.
We state them as results in their own right but they should be seen as direct or indirect
consequences of the main theorem. When not specified the scalar field may be either R
or C.
The idea in all cases is to let Y be a subspace containing the vectors about which we
want information, to define a bounded linear functional on Y to capture this information,
and then to let HBT do the rest.
Our first result tells us that a normed space (other than {0}) will always have a
plentiful supply of non-zero bounded linear functionals.
For (ii): Observe that M = cl(span(S)) is a closed subspace. If M is proper then there
exists x ∈ X \ M . In that situation we can find f ∈ X ∗ as in Proposition 6.10 which is
zero on S but not identically 0.
Part (ii) should be seen as a density theorem. Given a subset S of a normed space X
we can find out whether span(S) is dense by showing that an element f of X ∗ for which
f (s) = 0 for all s ∈ S has to be identically zero. But in order for this to be useful we
shall need to describe the dual space of X. We address in the next section the problem of
giving concrete descriptions for various of our familiar normed spaces and illustrate the
use of Theorem 6.11 (further examples in problem sheets).
In subsequent sections we shall also see that there is much more to be said about
HBT applications in general. In particular HBT plays a key part in the investigation of
duals of bounded linear operators.
6.13. The proof of Theorem 6.6 without the restriction to separable spaces.
[This subsection is aimed at those planning to take Part B Set Theory, and at the naturally
curious.]
We need the principle of set theory known as Zorn’s Lemma. The idea is to extend
g, a bounded linear functional on Y , without change of norm, to a maximal possible
domain, and we hope to show that there is an extension to the whole of X. Let’s consider
the set of all possible extensions:
E := { h ∈ B(Z, R) | Z is a subspace of X with Z ⊇ Y and h|Y = g }.
Then we can regard E as being partially ordered (formally one says h1 6 h2 iff graph h1 ⊆
graph h2 , which is a fancy way of saying that h2 ’s domain contains h1 ’s domain and that
h2 extends h1 ).
Suppose we have an element of E which is maximal, that is, an extension f of g, with
domain Z say, which cannot be extended any further. Then EITHER Z = X and we
have the extension we were seeking OR Z 6= X and, taking any x ∈ X \ Z, we can apply the One-step Extension Lemma to extend f to span(Z ∪ {x}), thereby contradicting maximality. But how do we
get a maximal element?
all the properties of a norm. With X a real vector space we could have replaced k · k by
a sublinear functional p : X → R satisfying
p(x + y) 6 p(x) + p(y) ∀x, y ∈ X,
p(αx) = αp(x) ∀x ∈ X, α > 0.
The conditions may be seen as weakening of norm condition (N2) and abandoning of (N1).
Obviously any norm, any linear functional and any seminorm is a sublinear functional.
• for any given basis {e1 , . . . , em } of X there exists a dual basis {e01 , . . . , e0m } for
X 0 with e0j (ei ) = δij , and hence
• dim X 0 = dim X (so X and X 0 are isomorphic).
Moreover,
• there is a canonical (= basis-free) isomorphism from X onto its second dual X 00
given by x 7→ εx , where εx (f ) = f (x) for all f ∈ X 0 .
A key example: X = Rm . Here, for any y ∈ Rm we get a linear functional fy : x 7→ x · y,
where · denotes scalar product. Writing x = (x1 , . . . , xm ) and y = (y1 , . . . , ym ) the formula
is
fy (x) = Σ_{j=1}^m xj yj .
Taking the standard basis {e1 , . . . , em } for Rm , we find that fej (ei ) = δij , so that the dual basis is {fe1 , . . . , fem }.
If we identify fy with y then we may think of (Rm )0 as being Rm . [If one thinks of the
vectors x as column vectors and the vectors y as row vectors then y(x) is simply given
by matrix multiplication.]
When we make the transition from finite-dimensional spaces (pure Linear Algebra)
to normed spaces, we want to consider continuous linear functionals. If X is finite-
dimensional then every linear functional on X is automatically continuous, by Theo-
rem ??. This means that we already know from 7.1 how to describe the elements of X ∗
and how they act on elements of X. The only novel feature is the norm structure, in
particular how the chosen norm in X relates to the operator norm of X ∗ . Not for the
first time there are parallels between `p spaces and spaces (Fm , k · kp ); see 7.5 below.
(3) (`2 )∗ ≅ `2 ;
(4) (`p )∗ ≅ `q , for 1 < p < ∞, where p−1 + q −1 = 1.
In each of the four cases, the dual space X ∗ of the sequence space X is identified via
an isometric isomorphism J with another sequence space, Y say. When this identification
is made, the action of elements of Y on elements of X is given by
(ηj )(xj ) = Σ_j xj ηj for all (xj ) ∈ X, (ηj ) ∈ Y.
Proof. To illustrate the strategy we prove (1) in gory detail, following the checklist (A)–
(E), and paying particular attention to issues of well-definedness, which are often linked
to convergence issues.
Check:
(i) (Jη)(x) ∈ F: we need Σ xj ηj to converge. Proof: Σ |xj ηj | converges by comparison with Σ |ηj | since (xj ) is bounded; hence Σ xj ηj converges.
(ii) Jη is linear for each η: We need to show (Jη)(λx+µx0 ) = λ(Jη)(x)+µ(Jη)(x0 ),
for x, x0 ∈ c0 and λ, µ ∈ F. This is routine to check.
(iii) Jη is bounded for each η:
|(Jη)(x)| = |Σ xj ηj | 6 Σ |xj ηj | 6 (sup_j |xj |) Σ |ηj |;
here we have used the fact that Σ |xj ηj | converges (from (i)) and the Triangle Inequality for infinite sums of scalars (Prelims exercise). Therefore
kJηk 6 kηk1 .
because vector space operations in a dual space are defined pointwise. This last
equation is straightforward to check. [(B) is routine but clear thinking is required
about what needs to be checked.]
(C) Upper bound on kJηk: Already proved in (A)(iii).
(D) Lower bound on kJηk: We let 0 6= η = (ηj ) ∈ `1 and look at the effect of Jη on
suitably chosen vectors in c0 to get a lower bound for kJηk. For n = 1, 2, . . ., define
x(n) = (xnj )j>1 by
xnj = |ηj |/ηj if j 6 n and ηj 6= 0, and xnj = 0 otherwise.
Note that each x(n) ∈ c0 . Moreover x(n) has norm 1 if x(n) 6= 0 and this is the case
for large enough n. Then, for such n,
kJηk > |(Jη)(x(n) )| = Σ_{j=1}^n |ηj |.
It follows that kJηk > Σ_{j=1}^∞ |ηj | = kηk1 . (There is nothing to do if η = 0.)
(E) J surjective: Take any f ∈ (c0 )∗ . Let η = (ηj ) where ηj := f (ej ). We claim that
f = Jη. This requires two steps.
(i) (ηj ) ∈ `1 : We have to show that Σ |f (ej )| converges. By Prelims Analysis this will be true provided the partial sums are bounded above. We make use of x(n) as defined in (D) where we now take ηj = f (ej ). Then by linearity of f we have
f (x(n) ) = f (Σ_{j=1}^n xnj ej ) = Σ_{j=1}^n xnj f (ej ) = Σ_{j=1}^n |ηj |.
This completes the proof of (1). The way that elements η of `1 act as linear functionals
on elements x of c0 is immediate from the proof we have given.
The characterisations (2), (3) and (4) are handled likewise, with (A)–(C) requiring
almost no adaptation and so routine. Variants are needed for (D) for different spaces, and
(E) is then modified accordingly. Check the details for yourself, referring to textbooks
for confirmation.
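A minimal numerical sketch (illustrative only; it assumes NumPy, real scalars and a truncation of the sequences to finitely many terms) of how η ∈ `1 acts on c0 and how the sign vectors of step (D) witness the norm:

```python
# The pairing (J eta)(x) = sum x_j eta_j on truncated sequences, and the
# sign-vector trick showing ||J eta|| = ||eta||_1.
import numpy as np

rng = np.random.default_rng(3)
eta = rng.standard_normal(50) * 0.5 ** np.arange(50)   # a (truncated) element of l^1

def J_eta(x):
    return float(np.sum(x * eta))

x = rng.uniform(-1.0, 1.0, size=50)                    # ||x||_inf <= 1
print(abs(J_eta(x)) <= np.sum(np.abs(eta)))            # |(J eta)(x)| <= ||eta||_1

x_n = np.sign(eta)          # for real eta_j != 0 this is |eta_j|/eta_j, as in (D)
print(np.isclose(J_eta(x_n), np.sum(np.abs(eta))))     # the bound is attained
```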
Solution. Certainly each xk ∈ `1 . Now we seek to apply Theorem 6.11. Suppose that
f ∈ (`1 )∗ is such that f (xk ) = 0 for all k. By Theorem ??(2), we can identify f with
(ηj ) ∈ `∞ to get
0 = Σ_{j>1} βk^{j−1} ηj for all k.
takes the value 0 at each a = βk . But F is defined by a power series which has radius
of convergence at least 1 since (ηn+1 ) is bounded. It follows that F is holomorphic in
the open unit disc. It has an infinite set of zeros in the closed disc D(0, α). By the
Bolzano–Weierstrass Theorem, F has a limit point of zeros in the open unit disc. By the
Identity Theorem F ≡ 0. This implies that ηj = 0 for all j, so f ≡ 0. The required result
follows from Theorem 6.11(ii).
and the expression on the right-hand side is the same as that we have when we regard y as an element of (`2 )∗ . In other words, every bounded linear functional on `2 is of the form fy : x 7→ hx, yi, for some y ∈ `2 .
This exactly parallels what you have seen in Linear Algebra for finite-dimensional
(real) inner product spaces: the Riesz Representation Theorem. And we can bring the
Linear Algebra theorem within the normed spaces framework by equipping Rm with the
Euclidean norm associated with the usual scalar product.
So far we’ve only considered particular Hilbert spaces. With a sneak preview of FA-II
territory, we record the general Riesz Representation Theorem (real case) and make brief
comments.
Riesz Representation Theorem Let X be a real Hilbert space. Then
there is a linear isometry J of X onto X ∗ , given by:
(Jy)(x) = hx, yi (x, y ∈ X).
The proof depends on Proposition 6.3 together with, to prove surjectivity of J, the
Projection Theorem 2.12 applied with Z = ker f for f ∈ X ∗ .
It is interesting to review the Hahn–Banach Theorem in the context of Hilbert spaces.
Suppose we have a subspace Y of a real Hilbert space X and g ∈ Y ∗ . The subspace Y is
not assumed to be closed. However Y is a dense subspace of Z := cl(Y ), and we can appeal
to Proposition 6.2 to extend g to Z without changing the norm. So we may assume
without loss of generality that Y is closed. By Proposition ??, Y is a Hilbert space for
the induced IPS norm. Then, by RRT applied to Y , there exists y ∈ Y such that g = fy ,
where fy (x) = hx, yi for all x ∈ Y . Then fy is the restriction to Y of the continuous
linear functional x 7→ hx, yi on X. Both fy and its extension to X have norm kyk. This proves the HBT for (real) Hilbert spaces. Note that separability is not relevant to this argument.
[Diagram: the canonical map J : X → X̃ into the completion X̃, a map i : X → Z, and the lifting ĩ : X̃ → Z with ĩ ◦ J = i.]
The proof of the existence of the lifting ĩ comes straight from Theorem 6.1.
Compare this construction with that of the completion of a metric space on Problem
sheet Q. 7. There we worked with a bigger class of spaces, but the completion we obtained
was not a Banach space or the embedding linear, even when the metric came from a norm.
However a universal mapping property analogous to that for the normed space completion
can be proved for the metric space completion.
What can we do with dual maps in the context of normed spaces? Given T ∈ B(X, Y ),
where X, Y are normed spaces, we can define a map T 0 by
(T 0 ϕ)(x) = ϕ(T x) for all ϕ ∈ Y ∗ , x ∈ X.
Paralleling the finite-dimensional case, T 0 : ϕ 7→ ϕ ◦ T . Proposition 8.2 confirms that T 0
is a bounded linear operator from Y ∗ to X ∗ . There is a notational awkwardness here we
cannot avoid. We reserve the notation X 0 for the space of all linear functionals on any
space X and so choose to use X ∗ for the dual space of a normed space X, whose elements are the bounded (alias continuous) linear functionals on X. We use the notation T 0 for the map dual to T since T ∗ is too well established as a usage in the restricted setting of inner
product spaces to be a sensible choice here.
8.2. Proposition (dual operator). Let X, Y be normed spaces (over the same field,
R or C) and T ∈ B(X, Y ). Then T 0 ∈ B(Y ∗ , X ∗ ) and kT 0 k = kT k.
Proof. We only prove that each T 0 ϕ is bounded and T 0 is bounded and that kT 0 k = kT k
(the rest is just linear algebra).
|(T 0 ϕ)(x)| = |ϕ(T x)| 6 kϕk kT xk 6 kϕk kxk kT k.
Hence T 0 ϕ is bounded, with kT 0 ϕk 6 kT k kϕk. Thence T 0 is also bounded, with kT 0 k 6
kT k.
For the reverse inequality for the norm, we need HBT. Take x ∈ X. Assume first
that T x 6= 0. By Proposition 6.9(i) there exists ϕ ∈ Y ∗ such that ϕ(T x) = kT xk and
kϕk = 1. Then
kT xk = |ϕ(T x)| = |(T 0 ϕ)(x)| 6 kT 0 ϕk kxk 6 kT 0 k kxk,
and this also holds, trivially, if T x = 0. Therefore kT k 6 kT 0 k.
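In the finite-dimensional case of 7.1, with (Rn )∗ identified with Rn via the scalar product, the dual of x 7→ Ax acts as ϕ 7→ Aᵀϕ, and the equality of norms can be checked directly. A sketch (illustrative only; it assumes NumPy and equips both spaces with the Euclidean norm):

```python
# Finite-dimensional illustration of the dual operator and ||T'|| = ||T||.
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((3, 5))          # T : R^5 -> R^3, T x = A x

phi = rng.standard_normal(3)             # a functional on R^3, phi(y) = phi . y
x = rng.standard_normal(5)
print(np.isclose((A.T @ phi) @ x, phi @ (A @ x)))   # (T' phi)(x) = phi(T x)

print(np.isclose(np.linalg.norm(A, 2), np.linalg.norm(A.T, 2)))   # ||T'|| = ||T||
```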
8.3. Annihilators; kernels and images of bounded linear operators and their
duals.
Let X be a normed vector space. For S ⊆ X and Q ⊆ X ∗ let
S ◦ = { f ∈ X ∗ | f (x) = 0 for all x ∈ S },
Q◦ = { x ∈ X | f (x) = 0 for all f ∈ Q }.
Then S ◦ and Q◦ are closed subspaces of X ∗ and X respectively (easy exercise). Moreover
[from a problem sheet question on HBT], for any subspace Y of X,
cl(Y ) = (Y ◦ )◦ :
note the closure sign!
This leads on, easily, to the following results. The proofs are left as exercises. Let
X, Y be normed spaces and T ∈ B(X, Y ), with dual operator T 0 ∈ B(Y ∗ , X ∗ ). Then
(T X)◦ = ker T 0 , cl(T X) = (ker T 0 )◦ ;
(T 0 Y ∗ )◦ = ker T, T 0 Y ∗ ⊆ (ker T )◦ .
Where closure signs appear in the general results, they aren’t needed when the domain or
the range of the operator in question is finite-dimensional (why?). Moreover, when T 0 Y ∗
is finite-dimensional, T 0 Y ∗ = (ker T )◦ .
Turning this around, if you seek to extend a result on dual maps from Part A Linear
Algebra, expect a closure sign to come into play in the normed space setting whenever
the proof of the corresponding linear algebra result needs a dimension argument.
Take y = (yj ) where yj = 1 for j odd and 0 for j even. Then y = T 0 z for some
z = (zj ) would imply that z2k = 22k , so z could not belong to `∞ . We conclude that
y ∈ (ker T )◦ \ T 0 `∞ . This shows that the inclusion of T 0 Y ∗ in (ker T )◦ may be strict. Here
T 0 Y ∗ is not closed.
We now move on to spectral theory. For reasons which will emerge, the usual setting
for this will be a complex normed space X, but we note that some more elementary results
and some examples work just as well when F = R. At crucial points, clearly flagged, we
shall need to assume that X is a Banach space. Thus the best results overall are available
in the setting of complex Banach spaces.
We now work towards establishing general properties of the spectrum. Here com-
pleteness of X becomes important. The following elementary lemma will be useful when
we need to juggle with invertible operators.
Proof. We leave (i)(a) as an easy exercise; remember that for an operator to be invertible
we need a 2-sided inverse, in B(X).
Consider (i)(b). Assume S is an inverse for P Q in B(X). Then I = (P Q)S = S(P Q).
Since multiplication is associative and P and Q commute, I = Q(P S) = (SP )Q. We
deduce that Q is a bijection. Invertibility of P Q implies that there exists δ > 0 such that kP Qxk > δkxk for all x. Then
δkxk 6 kP Qxk 6 kP k kQxk.
Also P Q invertible forces P 6= 0, so kP k 6= 0. Therefore kQxk > (δ/kP k)kxk for all x.
We now deduce from 3.14 that Q is invertible.
[Note that, purely algebraically, we got a left inverse and a right inverse for Q, each
of which is a bounded operator. But we don’t know that these are equal. Hence we need
to argue via the invertibility checklist.]
We already have enough information to prove quite a lot about the spectrum of a
bounded operator on a Banach space. A corresponding result holds when the scalar field
is R.
8.10. Theorem I (basic facts about spectrum). Let T be a bounded linear operator
on a complex Banach space X. Then
(i) σ(T ) ⊆ D(0, kT k), the closed disc center 0 radius kT k.
(ii) σ(T ) is closed.
(iii) σ(T ) is a compact subset of C.
Part (ii) tells us that C \ σ(T ) is an open set, so σ(T ) is closed. Since it is also
bounded, by (i), the Heine–Borel Theorem gives (iii).
The next example illustrates how the results in Theorem I can be combined to identify
the spectrum of an operator in certain cases.
The next result can give valuable information about σ(T ) in cases where kT n k can
be found explicitly and where Theorem I together with knowledge of σp (T ) does not pin
down σ(T ) completely; contrast Example 4 below with Example 3.
We have already seen that looking at the powers of an operator T may provide
valuable information about its spectrum. We now take this idea further. Here we don’t
need X to be a Banach space but we do need the scalar field to be C.
Proof. There is nothing to prove if p is a constant so assume p has degree n > 1. Let
µ ∈ C. We can factorise p(z) − µ as a product of linear factors:
p(z) − µ = α(z − β1 ) · · · (z − βn ),
for some α 6= 0 and β1 , . . . , βn . Then
p(T ) − µI = α(T − β1 I) · · · (T − βn I).
Here the factors commute and µ = p(λ) for some λ if and only if λ ∈ {β1 , . . . , βn }. So
µ ∈ p(σ(T )) ⇐⇒ ∃λ ∈ σ(T ) such that µ = p(λ)
⇐⇒ σ(T ) ∩ {β1 , . . . , βn } 6= ∅.
Assume µ ∉ p(σ(T )). Then no βr lies in σ(T ), and hence (T − βr I) is invertible, for each r. This implies α(T − β1 I) · · · (T − βn I) is invertible. We deduce that µ ∉ σ(p(T )).
Assume µ ∉ σ(p(T )). Then, for each r,
p(T ) − µI = (T − βr I) ∏_{j6=r} (T − βj I).
Our earlier Proposition 8.12 is a corollary of the Spectral Mapping Theorem. A much
stronger result linking the spectrum of T to powers of T can be proved than that in
Proposition 8.12. We omit the proof of the Spectral Radius Formula, which needs
more advanced theory than that in FA-I and which we shall not need.
8.16. Theorem (Spectral Radius Formula). Let X be a complex Banach space and let T ∈ B(X). Then the spectral radius of T , defined by
rad(σ(T )) = sup{ |λ| | λ ∈ σ(T ) },
is given by
rad(σ(T )) = inf_n kT n k^{1/n} = lim_{n→∞} kT n k^{1/n} .
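For matrices the formula is easy to see numerically; the sketch below (illustrative only, assuming NumPy and using the operator 2-norm) shows kT n k^{1/n} approaching max |λi |, which can be much smaller than kT k.

```python
# Illustrative check of the Spectral Radius Formula for a 2x2 matrix.
import numpy as np

T = np.array([[0.5, 10.0],
              [0.0,  0.4]])                      # eigenvalues 0.5 and 0.4
print("rad(sigma(T)) =", np.max(np.abs(np.linalg.eigvals(T))))
print("||T||         =", np.linalg.norm(T, 2))   # much bigger than 0.5

for n in (1, 5, 20, 80):
    Tn = np.linalg.matrix_power(T, n)
    print(n, np.linalg.norm(Tn, 2) ** (1.0 / n)) # tends to 0.5 as n grows
```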
We have flagged up already that it can be useful to consider the dual map T 0 when
seeking information about a bounded linear operator T ∈ B(X)—assuming of course we
can (i) describe the dual space X ∗ on which T 0 acts and (ii) describe the action of T 0
explicitly. But even with these provisos concerning its usefulness in specific cases the
following theorem is of interest and the proof instructive. It is crucial that we work with
a Banach space since we shall call on Proposition 3.15.
Proof. The proof depends on Proposition 3.15 and results from 8.3 as they apply to
P := λI − T . Note that P 0 = λI − T 0 . Also we shall use the fact (from Problem sheet
Q. 23) that σ(T 0 ) = σ(T ).
Hence, from 8.6,
σ(T ) ⊇ σap (T ) ∪ σp (T 0 ).
For the proof of the reverse inclusion, assume λ ∉ σap (T ) ∪ σp (T 0 ). Then there exists K > 0 such that kP xk > Kkxk for all x. Also ker P 0 = {0} so, from 8.3, P X is dense.
Now Proposition 3.15 implies that P is invertible.
Something’s missing! So far in all our examples we have seen that the spectrum is
non-empty. But is this true in general? Our final piece of theory will show that if X is
a complex Banach space, then σ(T ) 6= ∅ for any T ∈ B(X). We shall treat the proof of
this quite lightly, aiming to give the flavour without the fine detail. First we need some
preliminaries.
We can deduce from this that the map λ 7→ (T − λI)−1 is a continuous function on ρ(T ).
Moreover we have the Resolvent Identity
R(λ, T ) − R(µ, T ) = (λ − µ)R(λ, T )R(µ, T )
(proof is pure linear algebra).
Provided we can prove that ϕ is holomorphic, then we can conclude from Liouville’s
Theorem that ϕ ≡ 0 and this would contradict the fact that ϕ(0) 6= 0. With some juggling
we can show with the aid of the Resolvent Identity that λ 7→ ϕ(λ) has a convergent
power series expansion in a suitably small neighbourhood of each µ ∈ C. Hence it is
holomorphic.
candidate value of λ we may suspect that λ ∈ σap (T ). In some cases σp (T ) may be dense
in D(0, kT k), giving σ(T ) = D(0, kT k) [as in Example 3]. Sometimes this won’t hold, but
a given operator may be a polynomial in some other operator for which the spectrum can
be easily found. Then the Spectral Mapping Theorem comes into play [4. in the table;
Example 5].
In certain cases there may be few eigenvalues or none at all [the right-shift operator on `1 provides an example; see Problem sheet Q. 26]. For λ ∉ σp (T ) and where λI − T is
not already guaranteed by 1. or 3. to be invertible, it may then be worth trying to find
an inverse map (λI − T )−1 : could it have domain X, i.e., is λI − T surjective? [5. in the
table]. If surjectivity is assured, one then needs to test for boundedness of the inverse
((?) in 3.14). [Illustrations: Examples 1 and 2.]
To show that some λ ∉ σp (T ) nevertheless lies in σ(T ) we want to find whether one of (B), (C) in
8.6 holds. There can be points in the spectrum for which λ is not an eigenvalue or even
an approximate eigenvalue of T and failure of surjectivity may be hard to prove directly.
If X is a space whose dual space is known (for example X = `p with 1 6 p < ∞ (recall
7.3)), then recourse to Theorem III, involving T 0 , may be a good option [7. in the table]:
it may be easier to find eigenvalues of T 0 than to explore when λI − T is surjective. [See
Problem sheet Q. 26 and also the bonus extension question on the final problem sheet.]
Contents