Linear Analysis 2011
1 Normed Spaces
2 The Hahn-Banach Theorem
3 The Baire Category Theorem
4 The Space C(X)
5 Weak Topologies on Normed Spaces
6 Hilbert Space
7 Spectral Theory
Part IB Linear Algebra, Analysis II and Metric and Topological Spaces are essential.
Normed and Banach spaces. Linear mappings, continuity, boundedness, and norms. Finite-
dimensional normed spaces. [4]
The Baire category theorem. The principle of uniform boundedness, the closed graph theorem
and the inversion theorem; other applications. [5]
The normality of compact Hausdorff spaces. Urysohn’s lemma and Tietze’s extension theorem.
Spaces of continuous functions. The Stone-Weierstrass theorem and applications.
Equicontinuity: the Ascoli-Arzelà theorem. [5]
Inner product spaces and Hilbert spaces; examples and elementary properties. Orthonormal
systems, and the orthogonalization process. Bessel’s inequality, the Parseval equation, and
the Riesz-Fischer theorem. Duality; the self duality of Hilbert space. [5]
Bounded linear operations, invariant subspaces, eigenvectors; the spectrum and resolvent set.
Compact operators on Hilbert space; discreteness of spectrum. Spectral theorem for compact
Hermitian operators. [5]
Appropriate books
B. Bollobás Linear Analysis. 2nd Edition, Cambridge University Press 1999 (£22.99 paper-
back).
C. Goffman and G. Pedrick A First Course in Functional Analysis. 2nd Edition, Oxford
University Press 1999 (£21.00 Hardback)
W. Rudin Real and Complex Analysis. McGraw-Hill International Editions: Mathematics
Series (£31.00 Paperback)
1. Normed Spaces
In this course all vector spaces V will be over R or C. Usually it doesn’t matter which. They
won’t always (or even usually) be finite dimensional. Our vector spaces will have additional
analytic structure; this makes the theory much more interesting.
Definition. A norm on V is a function ‖·‖ : V → [0, ∞) such that
(i) ‖x‖ = 0 ⇐⇒ x = 0
(ii) ‖λx‖ = |λ| ‖x‖ for all λ ∈ R or C
(iii) ‖x + y‖ ≤ ‖x‖ + ‖y‖ (‘subadditive’)
Think of ‖x‖ as the ‘length’ of x, but don’t rely on geometric intuition too much!
Examples
ℓp -norms, Hölder’s Inequality and Minkowski’s Inequality
Hölder’s Inequality. Suppose 1 < p < ∞. Define the conjugate index q by 1/p + 1/q = 1.
Let (a_i)_{i=1}^n and (b_i)_{i=1}^n be two sequences of complex numbers. Then
\[ \sum_{i=1}^n |a_i b_i| \le \Big( \sum_{i=1}^n |a_i|^p \Big)^{1/p} \Big( \sum_{i=1}^n |b_i|^q \Big)^{1/q}. \]
\[ a_1^{\lambda_1} \cdots a_n^{\lambda_n} \le \lambda_1 a_1 + \dots + \lambda_n a_n. \]
Proof. We use the concavity of the function f(x) = log x, which follows from the fact that
f″(x) = −1/x² ≤ 0. (Exercise: this really implies concavity.)
A particular example of this (in the case n = 2) is that ab ≤ a^p/p + b^q/q (∗), whenever
a, b ∈ [0, ∞) and p, q are conjugate indices.
Proof of Hölder’s Inequality. The inequality is homogeneous in the sense that if it is true
for ai , bi , then it is also true for λai and µbi for any λ, µ ∈ R \ {0}.
The inequality is manifestly true when either Σ|a_i|^p or Σ|b_i|^q equals 0 (since then
either all a_i or all b_i are 0). Otherwise, by an appropriate choice of λ, µ we may assume
that Σ_{i=1}^n |a_i|^p = Σ_{i=1}^n |b_i|^q = 1.
But under these conditions, by (∗) we do indeed have
\[ \sum_{i=1}^n |a_i b_i| \le \frac{1}{p} \sum_{i=1}^n |a_i|^p + \frac{1}{q} \sum_{i=1}^n |b_i|^q = 1. \]
□
In other words, the ℓp -norm on Rn (or Cn ) is indeed a norm, since we have the triangle
inequality.
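Proof.
\[ \sum_{i=1}^n |a_i + b_i|^p = \sum_{i=1}^n |a_i + b_i|^{p-1} |a_i + b_i| \le \sum_{i=1}^n |a_i + b_i|^{p-1} |a_i| + \sum_{i=1}^n |a_i + b_i|^{p-1} |b_i|. \]
Now, (p − 1)q = p, since 1/p + 1/q = 1. Therefore, by Hölder’s inequality applied to each sum, this is at most
\[ \Big( \sum_{i=1}^n |a_i + b_i|^p \Big)^{1/q} \left( \Big( \sum_{i=1}^n |a_i|^p \Big)^{1/p} + \Big( \sum_{i=1}^n |b_i|^p \Big)^{1/p} \right). \]
Divide through by (Σ|a_i + b_i|^p)^{1/q} (if this is zero there is nothing to prove):
\[ \Big( \sum_{i=1}^n |a_i + b_i|^p \Big)^{1 - 1/q} \le \Big( \sum_{i=1}^n |a_i|^p \Big)^{1/p} + \Big( \sum_{i=1}^n |b_i|^p \Big)^{1/p}. \]
But 1 − 1/q = 1/p, so done. □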
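As a sanity check on the two inequalities just proved, here is a minimal numerical sketch. The helper names (`lp_norm`, `holder_gap`, `minkowski_gap`, `smallest_gaps`) are illustrative and not part of the notes; the random vectors are just test data.

```python
import random

def lp_norm(xs, p):
    """l^p norm of a finite sequence of real or complex numbers."""
    return sum(abs(x) ** p for x in xs) ** (1.0 / p)

def holder_gap(a, b, p):
    """Right-hand side minus left-hand side of Hoelder's inequality."""
    q = p / (p - 1.0)  # conjugate index: 1/p + 1/q = 1
    lhs = sum(abs(x * y) for x, y in zip(a, b))
    return lp_norm(a, p) * lp_norm(b, q) - lhs

def minkowski_gap(a, b, p):
    """||a||_p + ||b||_p - ||a + b||_p, which should be non-negative."""
    total = [x + y for x, y in zip(a, b)]
    return lp_norm(a, p) + lp_norm(b, p) - lp_norm(total, p)

def random_complex_vector(rng, n):
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]

def smallest_gaps(p, trials=500, n=6, seed=0):
    """Smallest gaps seen over random complex vectors (hypothetical test harness)."""
    rng = random.Random(seed)
    worst_h = worst_m = float("inf")
    for _ in range(trials):
        a = random_complex_vector(rng, n)
        b = random_complex_vector(rng, n)
        worst_h = min(worst_h, holder_gap(a, b, p))
        worst_m = min(worst_m, minkowski_gap(a, b, p))
    return worst_h, worst_m
```

Both gaps should stay non-negative up to rounding. For p = 2 and b proportional to a, the Hölder gap vanishes, which is the equality case of Cauchy-Schwarz.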
Banach Spaces
If V is a normed space with norm ‖·‖ then it can be regarded as a metric space by defining
d(x, y) = ‖x − y‖. (Easy exercise to see that this is a metric space.)
Definition. V is a Banach space if it is complete with respect to this metric. (I.e., Cauchy
sequences converge.)
Examples
• R is a Banach space.
• Rn with ℓ1-norm: ‖x‖₁ = |x_1| + … + |x_n|.
The metric induced on Rn is exactly the product metric on Rn, viewed as a product of
n copies of R.
Recall from Met&Top that if X, Y are metric spaces then we can define a metric on
X × Y by d((x_1, y_1), (x_2, y_2)) = d_X(x_1, x_2) + d_Y(y_1, y_2).
If X and Y are complete then so is X × Y. (Proof: exercise.) If ((x_n, y_n))_{n=1}^∞ is Cauchy
in X × Y then (needs justifying) (x_n) is Cauchy in X and (y_n) is Cauchy in Y. Let
x_n → x∗, y_n → y∗; then (x_n, y_n) → (x∗, y∗).
Suppose ‖·‖ and ‖·‖′ are two norms on Rn (or Cn). Then they are equivalent in the
sense that there are c1, c2 > 0 such that c1‖x‖ ≤ ‖x‖′ ≤ c2‖x‖ for all x ∈ V.
Proof. Since ‘equivalence’ is indeed an equivalence relation, it suffices to show that a given
norm ‖·‖ is equivalent to the ℓ1-norm, ‖·‖₁.
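We claim that x ↦ ‖x‖ is continuous in the metric induced by the ℓ1-norm. To see
this, suppose that ‖v − w‖₁ ≤ δ, that is to say Σ_{i=1}^n |v_i − w_i| ≤ δ. Then
\[
\begin{aligned}
\|v - w\| &= \Big\| \sum_{i=1}^n (v_i - w_i) e_i \Big\| && \text{with } e_1, \dots, e_n \text{ the standard basis} \\
&\le \sum_{i=1}^n |v_i - w_i| \, \|e_i\| && \text{using scalar multiplication and the triangle inequality for } \|\cdot\| \\
&\le M \sum_{i=1}^n |v_i - w_i| && \text{where } M = \sup_i \|e_i\| \\
&\le M\delta.
\end{aligned}
\]
Since |‖v‖ − ‖w‖| ≤ ‖v − w‖, this proves the claim.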
Now, the unit sphere {x : ‖x‖₁ = 1} ⊂ Rn is closed and bounded, hence compact. So
the function x ↦ ‖x‖, being continuous, is bounded and attains its bounds.
In particular, inf_{‖x‖₁=1} ‖x‖ = min_{‖x‖₁=1} ‖x‖ ≠ 0, since the only vector with norm zero is 0.
So there are c1, c2 > 0 such that c1 ≤ ‖x‖ ≤ c2 for all x with ‖x‖₁ = 1. This implies
c1‖v‖₁ ≤ ‖v‖ ≤ c2‖v‖₁ for all v ≠ 0 (take x = v/‖v‖₁), and the case v = 0 is trivial. □
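To make the constants c1, c2 concrete, the following sketch samples the ratio ‖x‖₂/‖x‖₁ on Rⁿ. For the Euclidean norm one has M = max ‖e_i‖₂ = 1, and the minimum over the ℓ1 unit sphere is 1/√n (attained at (1/n, …, 1/n)), so every sampled ratio should lie in [1/√n, 1]. The function names are illustrative assumptions.

```python
import math
import random

def norm1(v):
    return sum(abs(t) for t in v)

def norm2(v):
    return math.sqrt(sum(t * t for t in v))

def ratio_range(norm, n, samples=2000, seed=0):
    """Smallest and largest sampled value of norm(v) / ||v||_1 over random v in R^n."""
    rng = random.Random(seed)
    lo, hi = float("inf"), 0.0
    for _ in range(samples):
        v = [rng.uniform(-1, 1) for _ in range(n)]
        s = norm1(v)
        if s == 0:
            continue  # skip the zero vector, where the ratio is undefined
        r = norm(v) / s
        lo, hi = min(lo, r), max(hi, r)
    return lo, hi
```

Sampling only gives inner estimates of the true constants; the compactness argument above is what guarantees that they exist.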
Proof. Equivalent norms give rise to equivalent metrics: c1 d1(x, y) ≤ d2(x, y) ≤ c2 d1(x, y).
And Cauchy sequences and convergence in one metric mean the same as in the other.
(Exercise: check details.)
For (⇒), it is easy to show that the triangle inequality for ‖·‖ implies that {x : ‖x‖ ≤ 1} is
a convex body.
For (⇐), suppose K is such a convex body. We can define ‖x‖ to be the dilation factor of K
required to ‘hit’ x, that is ‖x‖ = min{λ ≥ 0 : x ∈ λK}.
Why is it a norm? First check it’s well-defined (using non-empty interior). The triangle
inequality follows since λK + µK ⊂ (λ + µ)K – a consequence of convexity.
[Figure: the unit balls of the ℓ∞-, ℓ2- and ℓ1-norms.]
Back to examples of Banach spaces
[Figure: graphs of the piecewise linear functions f_n, which steepen towards a step at x = 1/2.]
E.g.,
\[
f_n(x) = \begin{cases}
0 & \text{if } x \ge \tfrac12 + \tfrac1n \\
1 & \text{if } x \le \tfrac12 - \tfrac1n \\
\tfrac12 - \tfrac{n}{2}\big(x - \tfrac12\big) & \text{if } \tfrac12 - \tfrac1n \le x \le \tfrac12 + \tfrac1n
\end{cases}
\]
This is Cauchy in L1, since ‖f_m − f_n‖₁ ≤ max{2/m, 2/n} → 0 as m, n → ∞.
However, f_n does not converge to any continuous function f. What would f(1/2) be?
Wlog, f(1/2) ≤ 1/2. Since f is continuous, f(x) ≤ 3/4 for |x − 1/2| ≤ δ, for some δ.
By contrast, f_n(x) = 1 if 1/2 − δ ≤ x ≤ 1/2 − δ/2 for n sufficiently large.
Hence, if n is big enough, ‖f − f_n‖₁ ≥ (1/4)(δ/2) = δ/8. So f_n ↛ f in L1.
Remark. For essentially the same reason, C[0, 1] is not a Banach space in any Lp-norm,
1 ≤ p < ∞. (With the sup norm, the case p = ∞, it is complete.)
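The Cauchy property of (f_n) in L1 can be checked numerically. This is a rough sketch using a midpoint rule; the names `f` and `l1_distance` are illustrative, and n ≥ 2 is assumed so that the ramp lies inside [0, 1]. A direct computation of the two triangular areas gives ‖f_m − f_n‖₁ = |1/(2m) − 1/(2n)|, which is well within the bound quoted above.

```python
def f(n, x):
    """The ramp function f_n on [0, 1] (n >= 2 assumed)."""
    if x <= 0.5 - 1.0 / n:
        return 1.0
    if x >= 0.5 + 1.0 / n:
        return 0.0
    return 0.5 - (n / 2.0) * (x - 0.5)

def l1_distance(m, n, steps=20000):
    """Midpoint-rule approximation to the integral of |f_m - f_n| over [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += abs(f(m, x) - f(n, x))
    return total * h
```

For example, `l1_distance(2, 4)` should be close to 1/4 − 1/8 = 0.125.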
More examples
• ℓp = {(x_1, x_2, x_3, …) ∈ C^N : Σ_{i=1}^∞ |x_i|^p < ∞}, with the ℓp-norm, ‖x‖_p = (Σ|x_i|^p)^{1/p}.
(This is a norm by Minkowski’s inequality – let n → ∞.)
This is complete and hence a Banach space.
Proof. Let (x_n)_{n=1}^∞ be a Cauchy sequence in ℓp. Each x_n is a sequence; write them as
x_1 = (x_11, x_12, x_13, …), x_2 = (x_21, x_22, x_23, …), …
Observe that for any fixed i, |x_ni − x_mi| ≤ ‖x_n − x_m‖_p, and therefore (x_ni)_{n=1}^∞ is
a Cauchy sequence. Since C is complete, there exists x∗_i such that x_ni → x∗_i.
Claim. Let m be such that ‖x_m − x_n‖_p ≤ ε for all n ≥ m. Then ‖x_m − x∗‖_p ≤ ε.
Proof of claim. Suppose ‖x_m − x_n‖_p ≤ ε for all n ≥ m. Let R be arbitrary.
Then Σ_{i=1}^R |x_mi − x_ni|^p ≤ ε^p. Letting n → ∞, we get Σ_{i=1}^R |x_mi − x∗_i|^p ≤ ε^p.
Letting R → ∞, we get Σ_{i=1}^∞ |x_mi − x∗_i|^p ≤ ε^p.
Let (V, ‖·‖) be a normed space. The construction of the completion of V is as follows. Let
CV be the set of all Cauchy sequences in V. This is a vector space: if x = (x_n) and y = (y_n),
then λx + µy = (λx_n + µy_n).
We define a norm on Ṽ. For x ∈ CV, let p(x) = lim_{n→∞} ‖x_n‖, which is well-defined as (‖x_n‖)
is Cauchy in R. We have
So (Ṽ, p) is a normed space.
We can regard (V, ‖·‖) as a dense subset of (Ṽ, p). Define ϕ : V → Ṽ, x ↦ [(x, x, …)]. Then
ϕ is linear and p(ϕ(x)) = ‖x‖, so ϕ is isometric, and in particular injective.
Proof. Let [x] ∈ Ṽ and ε > 0. Then x = (x_n) is Cauchy, so ‖x_m − x_n‖ ≤ ε for m, n ≥ n0.
Then p(ϕ(x_{n0}) − [x]) = p(x_{n0} − x_1, x_{n0} − x_2, …) ≤ ε. □
If V is complete then Ṽ ≅ V, with ϕ(V) = Ṽ. For if [x] ∈ Ṽ, write x = (x_n) ∈ CV and
x∞ = lim_{n→∞} x_n. Then x∞ ∈ V by the completeness of V, and we have ϕ(x∞) = [x].
Proof. Let ([x_n]) be a Cauchy sequence in Ṽ, so that x_n = (x_nm)_{m=1}^∞ is Cauchy in V for all n. Since
the closure of ϕ(V) is Ṽ, there exists z_n ∈ V such that p(ϕ(z_n) − [x_n]) ≤ 1/2^n, for each n ∈ N.
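\[
\begin{aligned}
\|z_n - z_m\| &= p\big(\varphi(z_n) - \varphi(z_m)\big) \\
&\le p\big(\varphi(z_n) - [x_n]\big) + p\big([x_n] - [x_m]\big) + p\big([x_m] - \varphi(z_m)\big) \\
&\le \frac{1}{2^n} + p\big([x_n] - [x_m]\big) + \frac{1}{2^m} \to 0 \quad \text{as } n, m \to \infty.
\end{aligned}
\]
So z = (z_n) is Cauchy in V.
\[
\begin{aligned}
p\big([z] - [x_n]\big) &= \lim_{m\to\infty} \|z_m - x_{nm}\| \\
&\le \limsup_{m\to\infty} \|z_m - z_n\| + \limsup_{m\to\infty} \|z_n - x_{nm}\| \\
&\le \limsup_{m\to\infty} \|z_m - z_n\| + \frac{1}{2^n} \to 0 \quad \text{as } n \to \infty.
\end{aligned}
\]
So [x_n] → [z] as n → ∞ and thus (Ṽ, p) is complete. □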
Proof. Exercise.
Proposition. A normed space V is complete iff every absolutely convergent series converges.
Proof. Suppose V is complete and Σ_{n=1}^∞ ‖x_n‖ < ∞. Then for M > N, we have
‖Σ_{n=N}^M x_n‖ ≤ Σ_{n=N}^M ‖x_n‖ → 0, as N → ∞. So the sequence of partial sums is Cauchy, hence converges.
Conversely, suppose that every absolutely convergent series is convergent, and let (x_n)
be a Cauchy sequence. It is sufficient to show that some subsequence (x_{n_i}) converges
to some x ∈ V, as then: ‖x_m − x‖ ≤ ‖x_m − x_{n_i}‖ + ‖x_{n_i} − x‖ → 0 as m, i → ∞.
Linear operators
Let X, Y be two normed spaces. A linear map T : X → Y is said to be continuous if T is
a continuous map in the metric spaces induced by the norms on X and Y .
That is, T must be continuous at every point x0 ∈ X, which means that for all ε > 0 there
exists δ > 0 such that ‖x − x0‖_X ≤ δ implies ‖Tx − Tx0‖_Y ≤ ε.
Lemma. Let T : X → Y be a linear map between normed spaces. The following are
equivalent.
(1) T is continuous
(2) T is continuous at some x0
(3) T is bounded, meaning that T(B_X(1)) ⊂ B_Y(R) for some R (where B_X(r) =
{x ∈ X : ‖x‖ < r}). Equivalently, ‖Tx‖_Y ≤ R‖x‖_X for all x ∈ X.
As a result of this we always speak of bounded operators and never continuous ones.
(2) ⇒ (3). If T is continuous at x0 then there exists δ > 0 such that ‖x − x0‖_X < δ
implies ‖Tx − Tx0‖_Y ≤ 1. I.e., ‖T(x − x0)‖ ≤ 1, by linearity of T.
Hence T(B_X(δ)) ⊂ B_Y(1). By linearity of T again, this implies T(B_X(1)) ⊂ B_Y(1/δ).
(3) ⇒ (1). Note (3) implies that ‖Tx‖_Y ≤ R‖x‖_X. Hence if ε > 0, take δ = ε/R. Then
if ‖x − x0‖ ≤ δ we have ‖Tx − Tx0‖_Y = ‖T(x − x0)‖_Y ≤ R‖x − x0‖_X ≤ ε.
So T is continuous at x0. □
The infimum of all possible values of R for which ‖Tx‖_Y ≤ R‖x‖_X is called the operator
norm of T, written ‖T‖. Equivalently, ‖T‖ = sup_{‖x‖_X = 1} ‖Tx‖_Y.
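For a concrete feel for the operator norm, consider a matrix A acting on Rⁿ with the ℓ1-norm on both sides. In that case ‖A‖ equals the largest absolute column sum, a standard fact not proved in these notes. The sketch below compares that formula with a sampled supremum of ‖Ax‖₁/‖x‖₁; all function names are illustrative.

```python
import random

def norm1(v):
    return sum(abs(t) for t in v)

def apply(A, x):
    """Matrix-vector product, with A given as a list of rows."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]

def op_norm_l1(A):
    """Operator norm of A from (R^n, l1) to (R^m, l1): the largest absolute column sum."""
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))

def sampled_sup(A, trials=2000, seed=0):
    """Largest sampled value of ||Ax||_1 / ||x||_1; a lower estimate of the operator norm."""
    rng = random.Random(seed)
    n = len(A[0])
    best = 0.0
    for _ in range(trials):
        x = [rng.uniform(-1, 1) for _ in range(n)]
        s = norm1(x)
        if s > 0:
            best = max(best, norm1(apply(A, x)) / s)
    return best
```

The supremum is attained at a standard basis vector, so random sampling approaches the true value from below without necessarily reaching it.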
Examples
So done since any two norms on a finite-dimensional vector space are equivalent.
2. Shift map. Let X = ℓ1 = {(x_1, x_2, …) : Σ|x_i| < ∞}.
Both are linear and ‖S‖ = ‖T‖ = 1. Note that S ◦ T = id, T ◦ S ≠ id. (So existence of
a left inverse does not imply existence of a right inverse.)
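The definitions of S and T do not survive in this copy of the notes. The sketch below assumes the usual convention consistent with S ◦ T = id: T is the right shift (x_1, x_2, …) ↦ (0, x_1, x_2, …) and S is the left shift (x_1, x_2, …) ↦ (x_2, x_3, …). Finitely supported sequences are modelled as tuples, and the function names are illustrative.

```python
def right_shift(x):
    """Assumed T: prepend a zero, (x1, x2, ...) -> (0, x1, x2, ...)."""
    return (0,) + tuple(x)

def left_shift(x):
    """Assumed S: drop the first entry, (x1, x2, ...) -> (x2, x3, ...)."""
    return tuple(x)[1:]

def norm1(x):
    return sum(abs(t) for t in x)

x = (1, 2, 3)
s_after_t = left_shift(right_shift(x))   # recovers x
t_after_s = right_shift(left_shift(x))   # loses the first coordinate
```

Under this assumption T preserves the ℓ1-norm and S never increases it, matching ‖S‖ = ‖T‖ = 1.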
We also have the sup norm, ‖f‖∞ = sup_t |f(t)|. Let Y = C[0, 1] with the sup norm,
and define T : X → Y by T(f) = f′. This is certainly linear.
T is bounded if X is given the first norm, but not if X has the sup norm. (Take a
bounded function with unbounded derivative.)
Suppose X, Y are normed spaces. We write B(X, Y) = {T : X → Y : ‖T‖ < ∞}.
When Y = R (or C) we write X∗ = B(X, R) (or B(X, C)). This is the dual space of X.
Elements of X∗ are called functionals.
B(X, Y) is a vector space and in fact a normed space with the operator norm:
Let ε > 0 and suppose ‖T_m − T_n‖ ≤ ε for all n ≥ m. Then ‖T_m x − T_n x‖_Y ≤ ε‖x‖_X.
Letting n → ∞, we get ‖T_m x − Tx‖_Y ≤ ε‖x‖_X. For such an m, we have ‖T_m − T‖ ≤ ε.
Hence T_m → T in the operator norm.
Proof. Let e1 = (1, 0, 0, . . .), e2 = (0, 1, 0, . . .), e3 = (0, 0, 1, 0, . . .), . . . in ℓp . (Exercise: this
is not a basis for ℓp as a vector space.)
It follows that ϕ(x) = ϕ((x_1, x_2, …)) = Σ_{i=1}^∞ x_i y_i = lim_{n→∞} ϕ((x_1, …, x_n, 0, 0, …)).
However, the y_i are not arbitrary.
So ϕ is a bounded linear functional with ‖ϕ‖ ≤ ‖y‖_q. (In fact, equality occurs.)
(⇒). Suppose conversely that |Σ_{i=1}^∞ x_i y_i| ≤ M (Σ_{i=1}^∞ |x_i|^p)^{1/p} for all (x_i) ∈ ℓp.
Let x_i = |y_i|^q y_i^{−1} if i ≤ N (read as 0 when y_i = 0) and x_i = 0 if i > N, for fixed N.
Then (x_i) ∈ ℓp, and so Σ_{i=1}^N |y_i|^q ≤ M (Σ_{i=1}^N |y_i|^{(q−1)p})^{1/p}.
Hence, since (q − 1)p = q and 1/p = 1 − 1/q, we have (Σ_{i=1}^N |y_i|^q)^{1/q} ≤ M.
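The choice x_i = |y_i|^q / y_i is exactly the vector at which the pairing Σ x_i y_i attains ‖y‖_q ‖x‖_p. A small numerical sketch for real, finitely supported y (function names are illustrative, not from the notes):

```python
def lp_norm(xs, p):
    return sum(abs(t) ** p for t in xs) ** (1.0 / p)

def extremal_vector(y, q):
    """x_i = |y_i|^q / y_i, with x_i = 0 where y_i = 0."""
    return [abs(t) ** q / t if t != 0 else 0.0 for t in y]

def pairing(x, y):
    return sum(s * t for s, t in zip(x, y))

def attained_ratio(y, p):
    """phi(x) / ||x||_p for the extremal x; this should equal ||y||_q."""
    q = p / (p - 1.0)  # conjugate index
    x = extremal_vector(y, q)
    return pairing(x, y) / lp_norm(x, p)
```

For p = q = 2 the extremal vector is y itself, and the ratio is the Euclidean norm of y.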
Note. This doesn’t work for ℓ1 , because ℓ1 is much smaller than ℓ∗∞ . (In fact, ℓ∞ is
a pretty exotic object.)
Adjoints
Easy to show T∗ is linear and bounded. Indeed: |T∗g(x)| = |g(Tx)| ≤ ‖g‖ ‖Tx‖ ≤
‖g‖ ‖T‖ ‖x‖, for all g ∈ Y∗, x ∈ X.
2. The Hahn-Banach Theorem
Throughout this section, X ∗ is always a Banach space. But could X ∗ = {0}?
Note convexity follows from subadditivity, p(x + y) ≤ p(x) + p(y), and positive homogeneity.
Then there exists F : M′ → R such that F|_M = f, and F(x) ≤ p(x) for all x ∈ M′.
For t > 0, we require f(y) + tc ≤ p(y + tx0), which is true iff c ≤ (1/t)(p(y + tx0) − f(y)) =
p(y′ + x0) − f(y′), where y′ = y/t.
For t < 0, let t = −s; then we require f(y) − sc ≤ p(y − sx0), which is true iff
c ≥ (1/s)(f(y) − p(y − sx0)) = f(y″) − p(y″ − x0), where y″ = y/s.
Zorn’s Lemma. Suppose that every totally ordered subset S of a non-empty, partially
ordered set P has an upper bound. Then P has a maximal element.
Theorem (Hahn-Banach theorem for real vector spaces). Let X be a real vector
space, p : X → R a convex functional on X, M ⊂ X a linear subspace and f : M → R
a linear functional on M such that f(x) ≤ p(x) for all x ∈ M.
Then there exists F : X → R, linear, such that F|_M = f, and F(x) ≤ p(x) for all
x ∈ X.
Proof. Let
\[ \mathcal{F} = \Big\{ (\tilde M, \tilde f) : \tilde M \subset X \text{ is a linear subspace such that } \tilde M \supset M, \text{ and } \tilde f : \tilde M \to \mathbb{R} \text{ linear},\ \tilde f|_M = f,\ \tilde f(x) \le p(x) \text{ for all } x \in \tilde M \Big\}. \]
Define a partial ordering by (M̃, f̃) ≤ (Ñ, g̃) if M̃ ⊂ Ñ and g̃|_M̃ = f̃.
Then there exists F : X → K, linear, with F|_M = f, and |F(x)| ≤ q(x) for all x ∈ X.
Proof. For K = R, apply Hahn-Banach for real vector spaces. Then there exists F : X → R
such that F|_M = f, F(x) ≤ q(x) and −F(x) = F(−x) ≤ q(−x) = q(x) for all x ∈ X.
So |F(x)| ≤ q(x) for all x ∈ X.
Let F (x) = g(x) − ig(ix). Then F (ix) = g(ix) − ig(−x) = g(ix) + ig(x) = iF (x), so F
is C-linear.
Then for all x ∈ M , we have Re F (x) = g(x) = Re f (x), and Im F (x) = −g(ix) =
−Re f (ix) = −Re if (x) = Im f (x).
Lastly, for x ∈ X choose θ such that e^{iθ}F(x) is real. Then
|F(x)| = |e^{iθ}F(x)| = |F(e^{iθ}x)| = |g(e^{iθ}x)| ≤ q(e^{iθ}x) = q(x). □
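The formula F(x) = g(x) − ig(ix) recovers a complex-linear functional from its real part. The sketch below checks this on Cⁿ, assuming (as the proof suggests) that g = Re f, for a functional f(x) = Σ c_k x_k with hypothetical coefficients c; the names are illustrative.

```python
def f(c, x):
    """A complex-linear functional on C^n with coefficient vector c."""
    return sum(ck * xk for ck, xk in zip(c, x))

def g(c, x):
    """The real part of f, which is only R-linear."""
    return f(c, x).real

def F(c, x):
    """Rebuild a C-linear functional from g via F(x) = g(x) - i g(ix)."""
    ix = [1j * xk for xk in x]
    return complex(g(c, x), -g(c, ix))

c = [1 + 2j, -3j]
x = [2 - 1j, 0.5 + 4j]
difference = abs(F(c, x) - f(c, x))  # should be zero up to rounding
```

The check works because f(ix) = if(x), so −Re f(ix) = Im f(x).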
Consequences of Hahn-Banach
Throughout, K = R or C, and X is a normed space over K.
F : X → K linear with F|_M = f and |F(x)| ≤ ‖f‖ ‖x‖ for all x ∈ X. So ‖F‖ ≤ ‖f‖.
But since F|_M = f, we have ‖F‖ = ‖f‖.
2. Let x0 ∈ X \ {0}. Then there exists f ∈ X∗ such that ‖f‖ = 1 and f(x0) = ‖x0‖.
Indeed, let M = ⟨x0⟩. Then the function f̃(λx0) = λ‖x0‖ for λ ∈ K has ‖f̃‖ = 1. By
Hahn-Banach, there is f ∈ X∗ with ‖f‖ = 1 and f|_M = f̃. So f(x0) = f̃(x0) = ‖x0‖.
3. Let Z be a closed linear subspace of X, and y ∈ X \ Z, and let d = dist(y, Z) = inf_{z∈Z} ‖y − z‖.
Note d > 0. Then there exists F ∈ X∗ such that ‖F‖ = 1, F|_Z = 0, F(y) = d.
Indeed, let M = Z + ⟨y⟩, and f : M → K be defined by f(z + ty) = td. So f is linear
and:
\[ \|f\| = \sup_{z \in Z,\, t \in K,\, t \ne 0} \frac{|f(z + ty)|}{\|z + ty\|} = \sup_{z \in Z,\, t \ne 0} \frac{|t|\, d}{|t|\, \|z/t + y\|} \le \frac{d}{d} = 1, \]
with equality because the infimum of ‖z/t + y‖ over z ∈ Z is exactly d.
So ‖f‖ = 1. By Hahn-Banach, there exists F : X → K such that ‖F‖ = 1 and F|_M = f.
So F(z) = f(z) = 0 for all z ∈ Z, and F(y) = f(y) = d.
4. X∗ separates the points of X. In other words, for all x, y ∈ X with x ≠ y, there exists
f ∈ X∗ such that f(x) ≠ f(y).
Indeed if x ≠ y, then w = x − y ≠ 0. So by Hahn-Banach, there exists F ∈ X∗ such
that ‖F‖ = 1 and F(x) − F(y) = F(x − y) = ‖x − y‖ > 0. So F(x) ≠ F(y).
Indeed, |f(x)| ≤ ‖f‖ ‖x‖ ≤ ‖x‖ for all f ∈ X∗ with ‖f‖ ≤ 1. On the other hand, if
x ∈ X \ {0}, by Hahn-Banach there exists f ∈ X∗ such that ‖f‖ = 1 and f(x) = ‖x‖.
So |f(x)| = ‖x‖ and sup_{‖f‖≤1} |f(x)| ≥ ‖x‖.
Example. ℓp(K) is separable for all 1 ≤ p < ∞, but ℓ∞(K) is not separable.
Proof. Let {f_k}_{k∈N} be a dense subset of X∗. Then for all k ∈ N, there exists x_k ∈ X with
‖x_k‖ = 1 such that |f_k(x_k)| ≥ (1/2)‖f_k‖. Let A be the set of finite linear combinations of
the {x_k} with rational coefficients. Then A is countable.
Then ‖f_{k_j} − f‖ → 0. But ‖f_{k_j} − f‖ ≥ |f_{k_j}(x_{k_j}) − f(x_{k_j})| = |f_{k_j}(x_{k_j})| ≥ (1/2)‖f_{k_j}‖.
Definition. Let X be a normed space. Define X∗∗ = (X∗)∗ to be the second dual of X.
Then X∗∗ is always a Banach space. Also define ϕ_X : X → X∗∗ by x ↦ ϕ_X(x) where
ϕ_X(x) : X∗ → K, f ↦ ϕ_X(x)(f) = f(x).
|ϕ_X(x)(f)| = |f(x)| ≤ ‖f‖ ‖x‖, so sup_{f≠0} |ϕ_X(x)(f)|/‖f‖ ≤ ‖x‖, and so ‖ϕ_X(x)‖ ≤ ‖x‖.
So |ϕ_X(x)(f̃)| = |f̃(x)| = ‖x‖. So |ϕ_X(x)(f̃)|/‖f̃‖ = ‖x‖, and thus ‖ϕ_X(x)‖ ≥ ‖x‖. □
Remark. Every reflexive space is complete. However, there are Banach spaces which are
not reflexive.
Theorem. Let X be a reflexive Banach space and M ⊂ X a closed linear subspace. Then
M is reflexive.
Thus x ∈ M . We now need to show ϕM (x) = m∗∗ . To prove this, note that the
following diagram is commutative:
ϕX
X −→ X ∗∗
j↑ ↑ j ∗∗
ϕM
M −→ M ∗∗
For example, ϕX ◦ j = j ∗∗ ◦ ϕM .
So ϕM (x) = m∗∗ . 2
Theorem. Let X be a Banach space. Then X is reflexive iff X ∗ is reflexive.
Let x∗∗∗ ∈ X ∗∗∗ , and define x∗ = x∗∗∗ ◦ ϕX . This is linear and continuous, so in X ∗ .
For any x∗∗ ∈ X ∗∗ , we have x∗∗ = ϕX (x) for some x ∈ X, since X is reflexive. Then:
ϕX ∗ (x∗ )(x∗∗ ) = x∗∗ (x∗ ) = ϕX (x)(x∗ ) = x∗ (x) = x∗∗∗ (ϕX (x)) = x∗∗∗ (x∗∗ ).
But ℓ1 and ℓ∞ are not reflexive. Indeed, if ℓ1 were reflexive, then ℓ1 ≅ ℓ1∗∗ implies ℓ1∗∗
is separable and then ℓ1∗ ≅ ℓ∞ is separable, a contradiction. Thus ℓ∞ is also not reflexive.
3. Baire Category Theorem
Suppose X is a metric space. A subset A ⊂ X is said to be dense if it intersects every non-empty open
set U ⊂ X. Equivalently, A intersects every open ball B_r(x0) = {x ∈ X : d(x, x0) < r}.
Theorem (Baire Category Theorem). Suppose (G_n)_{n=1}^∞ is a sequence of open dense
subsets of a complete metric space. Then ∩_{n=1}^∞ G_n is dense.
Lemma. Let X be a complete metric space and suppose F_1 ⊃ F_2 ⊃ F_3 ⊃ … are nested,
closed, non-empty subsets of X with diam(F_n) → 0. Then ∩_{n=1}^∞ F_n ≠ ∅.
Suppose x_n → x∗. Then, for each m, the sequence x_m, x_{m+1}, x_{m+2}, … lies entirely in
F_m (as the sets are nested) and tends to x∗.
Since F_m is closed, we have x∗ ∈ F_m, and so x∗ ∈ ∩_{n=1}^∞ F_n. □
Proof of BCT. Consider some ball B_{r0}(x0). We want a point of ∩_{n=1}^∞ G_n inside here.
Since G_1 is open and dense, B_{r0}(x0) meets G_1 in a non-empty open set, and hence B_{r0}(x0) ∩ G_1
contains an open ball. In fact, it contains some closed ball B̄_{r1}(x1) with r1 > 0.
(Indeed, B̄_{ε/2}(t) ⊂ B_ε(t) – that is, every open ball contains a non-empty closed ball.)
Now, B_{r1}(x1) meets G_2, so find a closed ball B̄_{r2}(x2) ⊂ B_{r1}(x1) ∩ G_2. Continue in this
fashion, finding ever smaller closed balls B̄_{r3}(x3) ⊂ B_{r2}(x2) ∩ G_3, etc. Do this in such a way
that r_n → 0.
Since x0 and r0 are arbitrary, we have that ∩_{n=1}^∞ G_n is dense. □
Theorem (Baire Category Theorem II). Suppose X is a complete metric space, and
that (F_n)_{n=1}^∞ is a sequence of closed sets such that X = ∪_{n=1}^∞ F_n.
Then at least one F_n has non-empty interior. That is, some B_ε(x0), ε > 0, is contained in it.
Proof. Take G_n = X \ F_n. Apply Baire Category: since ∩_{n=1}^∞ G_n = ∅, at least one of the G_n is
not dense.
closed sets with empty interior; it is of second category otherwise.
So Baire Category could be stated as: a complete metric space is of second category.
Proof. Define S_n to be the set of all f ∈ C[0, 1] such that there exists some x ∈ (0, 1) such
that ‘the slope of f is bounded near x by n’. That is to say, |f(x) − f(y)| / |x − y| ≤ n for all y
with 0 < |x − y| ≤ 1/n.
If f is differentiable at x then (Mean Value Theorem) f lies in S_n for n big enough. If
the theorem is false, then ∪_{n=1}^∞ S_n = C[0, 1]. So it is sufficient to show that each S_n is
(a) closed and (b) has empty interior.
(a) Suppose (f_i)_{i=1}^∞ is a sequence in S_n such that f_i → f for some f ∈ C[0, 1]. We
aim to show that f ∈ S_n.
(b) We need to show S_n contains no ball B_ε(f0). In other words, for any continuous
f0 ∈ C[0, 1] we want some f ∈ C[0, 1] with ‖f − f0‖∞ < ε but whose slope is not
bounded (in the sense of belonging to S_n).
First, we’ll find a piecewise linear function f1 with ‖f1 − f0‖ < ε/2. In fact, f1 will
be linear on each segment [i/M, (i+1)/M] for i = 0, 1, …, M − 1.
Since f0 is uniformly continuous, if M is sufficiently large then |f0(x) − f0(y)| < ε/4
whenever |x − y| ≤ 1/M. It is an easy exercise to show that this implies ‖f1 − f0‖ < ε/2.
We now define f = f1 + (ε/2)g, where g is a suitable function bounded by 1 everywhere,
so that ‖f − f0‖ < ε.
The slope of g at every point is at least 2M′, hence the slope of f at every point
is at least εM′ − slope(f1). By taking M′ large enough in terms of ε and slope(f1)
(which is bounded since f1 is piecewise linear), we can make this > n. Thus
f ∉ S_n, as required. □
Then there is a ball B_ε(x0), ε > 0, such that sup_{f∈F} sup_{x∈B_ε(x0)} |f(x)| < ∞.
Proof. Define S_n = ∩_{f∈F} {x ∈ X : |f(x)| ≤ n}. Then S_n is an intersection of closed sets,
so is closed.
And ∪_{n=1}^∞ S_n = X – indeed, x ∈ S_n whenever n ≥ sup_{f∈F} |f(x)|.
Proof. Apply the previous theorem to the functions x ↦ ‖Tx‖, for T ∈ 𝒯. This is a family
of continuous functions on a complete metric space X. Hence there is a ball B_ε(x0),
ε > 0, such that sup_{T∈𝒯} sup_{x∈B_ε(x0)} ‖Tx‖ < ∞. Call this M.
In particular, if z ∈ X has ‖z‖ < ε/2 then ‖T(x0 + z)‖ ≤ M and ‖T(x0 − z)‖ ≤ M.
Since 2z = (x0 + z) − (x0 − z), this gives ‖Tz‖ ≤ M.
Scaling up, we see that ‖Ty‖ ≤ 2M/ε for all y with ‖y‖ ≤ 1 and for all T.
Hence sup_{T∈𝒯} ‖T‖ ≤ 2M/ε. □
Heuristic computation: assume f(x) = Σ_k a_k e^{ikx}, and then
\[ \int_0^{2\pi} f(x) e^{-imx}\,dx = \int_0^{2\pi} \sum_k a_k e^{i(k-m)x}\,dx \sim \sum_k a_k \int_0^{2\pi} e^{i(k-m)x}\,dx \sim 2\pi a_m, \]
as
\[ \int_0^{2\pi} e^{inx}\,dx = \begin{cases} 2\pi & \text{if } n = 0 \\ 0 & \text{if } n \ne 0. \end{cases} \]
Thus (we are tempted to say), a_k = (1/2π) ∫_0^{2π} f(x) e^{−ikx} dx = f̂(k).
Question. How rigorous is this? Is f(x) = Σ_k f̂(k) e^{ikx} in any meaningful way?
Note that f 7→ SN f (0) is a linear operator, which we’ll call ϕN . It is defined on the
space X of 2π-periodic continuous functions, which we’ll identify with {f ∈ C[0, 2π] :
f (0) = f (2π)}. Obviously, with the sup norm k k∞ this is a closed subspace of C[0, 2π],
and hence it is a Banach space.
We’ll show that each ϕ_N is bounded but that ‖ϕ_N‖ → ∞ as N → ∞. The result
then follows from uniform boundedness: if sup_N |ϕ_N f| < ∞ for all f ∈ X then
sup_N ‖ϕ_N‖ < ∞, contradiction.
\[ \varphi_N f = S_N f(0) = \sum_{|k| \le N} \hat f(k) = \sum_{|k| \le N} \frac{1}{2\pi} \int_0^{2\pi} f(x) e^{-ikx}\,dx = \frac{1}{2\pi} \int_0^{2\pi} f(x) D_N(x)\,dx, \]
where D_N(x) = Dirichlet kernel = Σ_{|k|≤N} e^{ikx}.
This makes it clear that ϕ_N is bounded:
\[ |\varphi_N f| \le \sup_x |f(x)| \cdot \frac{1}{2\pi} \int_0^{2\pi} |D_N(x)|\,dx, \]
and the integral is certainly finite.
Actually, ‖ϕ_N‖ = (1/2π) ∫_0^{2π} |D_N(x)| dx.
To see this, take f(x) = e^{−i arg D_N(x)}; then ‖f‖∞ = 1 and ϕ_N f = (1/2π) ∫_0^{2π} |D_N(x)| dx.
All we have to show is that ∫_0^{2π} |D_N(x)| dx → ∞. We estimate D_N(x) as follows.
Summing the geometric series gives D_N(x) = sin((N + 1/2)x) / sin(x/2).
This has peaks around x = π(r + 1/2)/(N + 1/2), say of width 1/(10N).
Around such a peak, |sin(N + 1/2)x| ≥ 1/2, but sin(x/2) ≤ x/2 ≤ π(r + 1/2)/(2(N + 1/2)) ≤ 10r/N.
Hence the contribution to the integral from the rth peak is at least (1/(10N)) · (N/(10r)) = 1/(100r).
But the harmonic series diverges, and
\[ \int_0^{2\pi} |D_N(x)|\,dx \ge \sum_{r=1}^{N} \frac{1}{100 r} \ge c \log N. \]
(In fact, ∫_0^{2π} |D_N(x)| dx ∼ C log N.) □
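The slow growth of (1/2π)∫|D_N| can be observed numerically. This sketch evaluates D_N directly from its definition as Σ_{|k|≤N} e^{ikx} = 1 + 2Σ_{k=1}^N cos kx and integrates with a midpoint rule. The function names and step count are illustrative choices; `lebesgue_constant` follows the usual name for this quantity.

```python
import math

def dirichlet_kernel(N, x):
    """D_N(x) = sum over |k| <= N of e^{ikx}, written as a real cosine sum."""
    return 1.0 + 2.0 * sum(math.cos(k * x) for k in range(1, N + 1))

def lebesgue_constant(N, steps=20000):
    """Midpoint-rule approximation to (1 / 2 pi) * integral over [0, 2 pi] of |D_N|."""
    h = 2.0 * math.pi / steps
    total = 0.0
    for j in range(steps):
        total += abs(dirichlet_kernel(N, (j + 0.5) * h))
    return total * h / (2.0 * math.pi)
```

Quadrupling N a few times (say N = 2, 8, 32) should increase the value by a roughly constant amount each time, which is what logarithmic growth looks like.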
Remarks (non-examinable). What went wrong? We looked at S_N f = Σ_{|k|≤N} f̂(k) e^{ikx},
and the cut-off at |k| = N is too sharp.
Look instead at S̃_N f = Σ_{|k|≤N} (1 − |k|/N) f̂(k) e^{ikx}.
Here, S̃_N f(0) = ∫_0^{2π} f(x) K_N(x) dx, where K_N(x) is called the Fejér kernel.
\[ K_N(x) \sim \left( \frac{\sin (N - \tfrac12) x}{\sin \tfrac12 x} \right)^{2}, \quad \text{and} \quad \int_0^{2\pi} |K_N(x)|\,dx < \infty \text{ uniformly in } N. \]
The theory of S̃_N is rather nice.
Proof. It suffices to show (by scaling and linearity properties of T ) that T (BX (1)) is open,
where BX (1) is the open unit ball {x ∈ X : kxkX < 1}. It then follows easily that
T (BX (x0 , ε)) is open for any x0 ∈ X and any ε > 0.
Plan. (1) Apply the Baire Category Theorem to conclude that the closure $\overline{T(B_X(1))}$ contains an
open ball B_Y(δ), δ > 0. Here we use the completeness of Y.
(2) Mess around a bit to show that in fact T(B_X(1)) itself contains B_Y(δ). Here
we use the completeness of X.
(1) T surjective implies that Y = ∪_{n=1}^∞ T(B_X(n)). But T(B_X(n)) = nT(B_X(1)) and
so Y = ∪_{n=1}^∞ nT(B_X(1)), and thus trivially Y = ∪_{n=1}^∞ n$\overline{T(B_X(1))}$.
Applying Baire Category (2nd form), it follows that one of these sets n$\overline{T(B_X(1))}$
has non-empty interior. Hence $\overline{T(B_X(1))}$ has non-empty interior – say it contains
B_Y(y0, δ) = {y ∈ Y : ‖y − y0‖_Y < δ}.
Note that $\overline{T(B_X(1))}$ is symmetric about the origin (i.e. if it contains y then it
contains −y) and is convex since B_X(1) is.
If ‖z‖ < δ then $\overline{T(B_X(1))}$ contains both y0 + z and −y0 + z, and hence contains
(1/2)(y0 + z) + (1/2)(−y0 + z) = z.
Assume, then, that $\overline{T(B_X(1))}$ ⊃ B_Y(1) (∗). We wish to conclude from this that
T(B_X(1)) ⊃ B_Y(1).
Observe that (∗) implies that for any y ∈ Y and ε > 0 there is an x such that
‖x‖_X ≤ ‖y‖_Y and y = Tx + y′ with ‖y′‖_Y ≤ ε.
Find x2 with kx2 k 6 ky2 k such that y2 = T x2 + y3 with ky3 k 6 ε2 . And so on.
We have found x ∈ BX (1) such that T x = y. Since y was arbitrary, this implies
that T (BX (1)) ⊃ BY (1). 2
Proof. Only the boundedness requires proof. But if U ⊂ X is open then (T −1 )−1 U = T U
is open, by the open mapping theorem. 2
Proof. X × Y is a normed space with norm ‖(x, y)‖_{X×Y} = ‖x‖_X + ‖y‖_Y. (Easy exercise:
the topology induced on X × Y is the product topology.)
So X × Y is a Banach space and Γ is a closed linear subspace of it. Hence Γ is also a
Banach space (with the same norm). Indeed, Γ is certainly a normed space. If (γ_n)_{n=1}^∞
is a Cauchy sequence in Γ then it is certainly Cauchy in X × Y and hence converges to
some γ. But Γ closed ⇒ γ ∈ Γ.
4. The Space C(X)
Let X be a topological space. We’ll study the space C(X) of continuous R-valued functions
in some generality. (Much of the theory is easier when X is a metric space.)
Our spaces will be compact (aside: for much of the theory, locally compact will do – see, e.g.,
Rudin’s red book), and, importantly, Hausdorff.
Recall. X is Hausdorff if for every pair a, b of distinct points in X there are disjoint open
sets U, V with a ∈ U , b ∈ V .
Proof. A closed ⇒ A compact: easy. The converse is slightly trickier. Suppose A is compact
and that x ∉ A. For each a ∈ A, the Hausdorff property gives open sets U_a ∋ a and
V_a ∋ x which are disjoint. Since A is compact and the U_a cover A, there is a finite
subcover: U_{a1} ∪ … ∪ U_{an}, say.
But then ∩_{i=1}^n V_{ai} is open, contains x and is disjoint from ∪_{i=1}^n U_{ai} and hence from A.
Since x was arbitrary, X \ A is open. □
Proof. Suppose A, B ⊂ X are disjoint closed sets. For any pair a ∈ A, b ∈ B, there are
disjoint open sets Ua,b and Va,b with a ∈ Ua,b and b ∈ Va,b .
Fix a. The sets Va,b , b ∈ B, form an open cover of B, and since B is compact (by the
lemma above), there is a finite subcover Va,b1 ∪ . . . ∪ Va,bn . Write Va for this set, and
write Ua for Ua,b1 ∩ . . . ∩ Ua,bn .
Then Ua , Va are open and disjoint, and a ∈ Ua whilst B ⊂ Va . But now the Ua form
an open cover of A, and since A is compact (again by the lemma), there is a finite
subcover Ua1 ∪ . . . ∪ Uam . Take U to be this set, and let V = Va1 ∩ . . . ∩ Vam .
Lemma. Let X be a topological space. Then X is normal iff it has the ‘sandwiching property’.
This is: if A is closed, W is open, and A ⊂ W, then there is an open set U with
A ⊂ U ⊂ Ū ⊂ W.
Remark. If X is a metric space, take f(x) = dist(x, A) / (dist(x, A) + dist(x, B)).
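In the metric case the separating function can be computed directly. Here is a minimal sketch on the real line, taking A and B to be disjoint closed intervals given as (lo, hi) pairs; the interval representation and function names are assumptions made for illustration.

```python
def dist_to_interval(x, lo, hi):
    """Distance from the point x to the closed interval [lo, hi]."""
    if x < lo:
        return lo - x
    if x > hi:
        return x - hi
    return 0.0

def urysohn(x, A, B):
    """f(x) = dist(x, A) / (dist(x, A) + dist(x, B)) for disjoint closed intervals A, B."""
    d_a = dist_to_interval(x, *A)
    d_b = dist_to_interval(x, *B)
    return d_a / (d_a + d_b)  # the denominator is positive because A and B are disjoint
```

With A = (0, 1) and B = (2, 3) the function is 0 on A, 1 on B, and takes the value 1/2 at the midpoint 1.5 of the gap.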
Set W = X \ B. This is an open set containing A. At the first step use the sandwiching
property twice to find open sets U_0, U_1 such that
A ⊂ U_0 ⊂ Ū_0 ⊂ U_1 ⊂ Ū_1 ⊂ W.
A ⊂ U_0 ⊂ Ū_0 ⊂ U_{1/2} ⊂ Ū_{1/2} ⊂ U_1 ⊂ Ū_1 ⊂ W.
A ⊂ U_0 ⊂ Ū_0 ⊂ U_{1/4} ⊂ Ū_{1/4} ⊂ U_{1/2} ⊂ Ū_{1/2} ⊂ U_{3/4} ⊂ Ū_{3/4} ⊂ U_1 ⊂ Ū_1 ⊂ W.
Carry on doing this for all dyadic rationals (rationals with denominator a power of 2).
We end up with open sets U_r for each dyadic rational r, with the crucial property that
if r < s then Ū_r ⊂ U_s.
Claim. Suppose α ∈ (0, 1). Then f⁻¹([0, α]) and f⁻¹([α, 1]) are closed.
Proof. First, we show that f⁻¹([0, α]) = ∩_{r>α} Ū_r.
If x ∈ LHS, then f(x) ≤ α, so x ∈ U_r for all r > α, and thus x ∈ ∩_{r>α} U_r ⊂ RHS.
If x ∈ RHS, then x ∈ Ū_r for all r > α, and so x ∈ U_s for all s > r > α, and hence,
since r was arbitrary, for all s > α. Therefore f(x) ≤ α, and so x ∈ LHS.
Second, we show f⁻¹([α, 1]) = X \ ∪_{r<α} U_r. Equivalently, f⁻¹([0, α)) = ∪_{r<α} U_r.
For: x ∈ f⁻¹([0, α)) ⇔ f(x) < α ⇔ x ∈ U_r for some r < α ⇔ x ∈ ∪_{r<α} U_r.
This implies that f⁻¹(open interval) is open, whence f⁻¹(any open set) is open. So f
is continuous, as required. □
Proof. We’ll apply Urysohn’s Lemma repeatedly to find continuous functions on X which
approximate f better and better on S.
Now define A_1 = {x ∈ S : f_1(x) ≤ −(1/3)·(2/3)} and B_1 = {x ∈ S : f_1(x) ≥ (1/3)·(2/3)}.
By construction, f̃|_S = f_0 = f. □
Stone-Weierstrass Theorem
We’ll prove a significant generalisation of the ‘well-known’ fact that any continuous function
on [a, b] can be uniformly approximated by polynomials.
Examples. (a) X = [a, b] and A ⊂ C(X) is the set of real-coefficient polynomials p(x) =
c0 + c1 x + . . . + cn xn .
(b) X = [a1 , b1 ] × [a2 , b2 ] ⊂ R2 , with A the collection of all finite sums g1 (x)h1 (y) +
. . . + gn (x)hn (y), where gi : [a1 , b1 ] → R and hj : [a2 , b2 ] → R are continuous 1-variable
functions.
Idea. The proof has two main steps, rather orthogonal to one another.
(1) Show that A is a lattice. That is, if f, g ∈ A then max(f, g) and min(f, g) ∈ A.
(2) Use the lattice property, compactness of X and separation-of-points property to
show A = C(X).
and so max(f, g) = max(f − g, 0) + g ∈ A, and similarly for min(f, g).
Proof of claim. It suffices to show that there are polynomials Q_n such that
Q_n(t) → √t uniformly on [0, 1], since we may then take P_n(s) = Q_n(s²).
Let δ > 0 be small. There is an analytic branch of √(z + δ) in the ball B_{1/2+δ}(1/2),
and hence there is a Taylor series: √(z + δ) = c_0 + c_1(z − 1/2) + c_2(z − 1/2)² + …,
uniformly convergent for |z − 1/2| ≤ 1/2, say.
Truncating this series at the ith term, and restricting to real values of z (which
certainly includes [0, 1]), we get a sequence of polynomials R_{i,δ}(t) such that
\[ \lim_{i \to \infty} \sup_{t \in [0,1]} \big| R_{i,\delta}(t) - \sqrt{t + \delta} \big| = 0. \]
Note that the constant term of R_{i,δ}(t) tends to √δ as i → ∞ (put t = 0). So
let R̃_{i,δ}(t) = R_{i,δ}(t) − R_{i,δ}(0); then
\[ \lim_{i \to \infty} \sup_{t \in [0,1]} \big| \tilde R_{i,\delta}(t) - \sqrt{t + \delta} + \sqrt{\delta} \big| = 0. \]
Now define Q_n(t) = R̃_{i(n),1/n}(t), where i(n) is chosen large enough that
\[ \sup_{t \in [0,1]} \Big| \tilde R_{i(n),1/n}(t) - \sqrt{t + \tfrac1n} + \sqrt{\tfrac1n} \Big| \le \frac1n. \]
Then
\[ \sup_{t \in [0,1]} \big| Q_n(t) - \sqrt{t} \big| \le \frac1n + \sup_{t \in [0,1]} \Big| \sqrt{t + \tfrac1n} - \sqrt{t} \Big| + \sqrt{\tfrac1n} \to 0 \quad \text{as } n \to \infty. \]
(Exercise.)
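For comparison, there is a different classical way to produce polynomials converging uniformly to √t on [0, 1], which is not the Taylor-series argument used above: the recursion Q_0 = 0, Q_{n+1}(t) = Q_n(t) + (t − Q_n(t)²)/2. Each Q_n is a polynomial in t with zero constant term. The sketch below only evaluates Q_n pointwise rather than expanding coefficients; the function names and grid size are illustrative.

```python
import math

def q(n, t):
    """Value at t of the n-th iterate of Q_{k+1}(t) = Q_k(t) + (t - Q_k(t)^2) / 2, with Q_0 = 0."""
    val = 0.0
    for _ in range(n):
        val = val + 0.5 * (t - val * val)
    return val

def sup_error(n, grid=1000):
    """Largest error |sqrt(t) - Q_n(t)| over an evenly spaced grid in [0, 1]."""
    return max(abs(math.sqrt(k / grid) - q(n, k / grid)) for k in range(grid + 1))
```

The iterates increase to √t, and the standard estimate for this recursion bounds the error by 2/n, so the sampled sup error should shrink roughly like 1/n.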
(2) Consider any closed lattice A ⊂ C(X) with the separation-of-points property. Let
f ∈ C(X) be arbitrary. We need to show that f can be approximated arbitrarily
closely by functions from A. Let ε > 0 be arbitrary.
Fix x. Then the sets U_{x,y} form an open cover of X and so by compactness of X
there is a finite subcover U_{x,y1} ∪ … ∪ U_{x,yn}. Define f_x = min{f_{x,y1}, …, f_{x,yn}}.
This f_x ∈ A and f_x(x) = f(x) and f_x(z) < f(z) + ε for all z ∈ X, since z is in at
least one U_{x,yi}.
finite subcover V_{x1} ∪ … ∪ V_{xn}.
Define f̃ = max{f_{x1}, …, f_{xn}}. Then by construction f̃ ∈ A and f(z) − ε < f̃(z) <
f(z) + ε for all z ∈ X. Since ε was arbitrary, A is indeed dense in C(X), and
hence, since A is closed, A = C(X). □
Proof. Apply the complex-valued Stone-Weierstrass theorem to R/2πZ = [0, 2π]/(0 ∼ 2π).
Note. The an need not be the Fourier coefficients of f (if one is trying to approximate f ).
Next, a slightly more precise version of Stone-Weierstrass – more like the ‘standard’ version.
Theorem. Let $X$ be a compact Hausdorff space. Then any proper closed subalgebra of $C(X)$ is contained in one of:
(i) the subalgebras $A_{x,y} = \{f \in C(X) : f(x) = f(y)\}$, $x \ne y$
(ii) the subalgebras $A_x = \{f \in C(X) : f(x) = 0\}$.
Proof. Fix $x$ and $y$. Let $A$ be an algebra and suppose $A$ is proper. Define $V_{x,y} = \{(f(x), f(y)) : f \in A\} \subset \mathbb{R}^2$.
Assume that $A$ is not contained in any $A_{x,y}$ or $A_x$ or $A_y$. Then the vector space $V_{x,y}$ contains points $(a_1, b_1), (a_2, b_2), (a_3, b_3)$ with $a_1 \ne b_1$, $a_2 \ne 0$, $b_3 \ne 0$.
The only way $V_{x,y}$ could be a proper subspace of $\mathbb{R}^2$ yet still contain these three points would be if $V_{x,y} = \langle (t, u) \rangle$ with $t \ne u$, $t \ne 0$, $u \ne 0$.
But $V_{x,y}$ also contains $(t^2, u^2)$, and since $(t, u)$ and $(t^2, u^2)$ span $\mathbb{R}^2$, this is a contradiction.
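The two facts used in the last step can be spelled out as follows (a short check, not part of the original notes). First, $(t^2, u^2) \in V_{x,y}$ because $A$ is an algebra: if $f \in A$ has $(f(x), f(y)) = (t, u)$, then $f^2 \in A$. Second, the two vectors are linearly independent because the determinant is non-zero:

```latex
\det\begin{pmatrix} t & t^2 \\ u & u^2 \end{pmatrix}
  = t u^2 - t^2 u
  = t u \,(u - t) \neq 0
  \qquad \text{since } t \neq 0,\ u \neq 0,\ t \neq u .
```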
Corollary (typical formulation of Stone-Weierstrass). Let $X$ be a compact Hausdorff space, and $A \subset C(X)$ a subalgebra which separates the points in the sense that for all $x, y \in X$ with $x \ne y$, there is $f \in A$ with $f(x) \ne f(y)$.
Then either $A$ is dense in $C(X)$, or there is $x_0 \in X$ such that $f(x_0) = 0$ for all $f \in A$.
Proof. The assumption is that $A \not\subset A_{x,y}$ for any $x, y$. Hence $\overline{A}$ is everything, or else is proper, and hence contained in $A_x$ for some $x$. □
There are certainly closed subalgebras of C(R) with the separation-of-points property which
are not dense in C(R). For example, C0 (R) = {f ∈ C(R) : f (x) → 0 as |x| → ∞}.
Similar examples exist in any locally compact space. (Recall a space is locally compact if every $x \in X$ lies in some neighbourhood with compact closure.) Indeed, define $C_0(X) = \{f \in C(X) : \{x : |f(x)| \ge \varepsilon\} \text{ is compact for all } \varepsilon > 0\}$. This coincides with what we wrote before when $X = \mathbb{R}$. It is always a closed subalgebra of $C(X)$. (Exercise: by Urysohn’s Lemma, $C_0(X)$ separates the points.)
One may check that $\tilde X$ becomes a compact Hausdorff space. ‘All of this is very easy’, except note that the local compactness of $X$ is required to check the Hausdorff property for $\tilde X$. Indeed, if $x \in X$ then there is an open set $U$ containing $x$ with compact closure. Let $V = (X \setminus \overline{U}) \cup \{\infty\}$. Then $V$ is open, disjoint from $U$ and contains $\infty$.
Another easy check: $C_0(X)$ may be identified with the space $\{\tilde f \in C(\tilde X) : \tilde f(\infty) = 0\}$, where $f \in C_0(X)$ is identified with the function $\tilde f : \tilde X \to \mathbb{R}$ which equals $f$ on $X$ and is 0 at $\infty$. It remains to check that this is continuous.
Now suppose $f \in C_0(X)$. Then for every $\varepsilon > 0$ there is some $\tilde g \in \tilde A$ and a constant $c$ such that $\tilde f = \underbrace{\tilde g + c}_{\in B} + \eta$, where $\|\eta\|_\infty < \frac{\varepsilon}{2}$.
** Non-examinable section **
Theorem (Weierstrass). Suppose $f : [0, 1] \to \mathbb{R}$ is continuous. Then for all $\varepsilon > 0$ there is a polynomial $p \in \mathbb{R}[x]$ with $\|f - p\|_\infty < \varepsilon$ on $[0, 1]$.
Remark. Of course, this follows trivially from Stone-Weierstrass. However, we need the special case $f(t) = \sqrt{t}$ in the proof of that theorem.
Of course, $b_{i,n}(x)$ is $\mathbb{P}(X_1 + \dots + X_n = i)$, where the $X_j$ are independent $\{0, 1\}$-valued random variables with mean $x$.
This implies $\sum_{i=0}^{n} b_{i,n}(x) = 1$. Also $b_{i,n}(x)$ is largest when $i \sim nx$.
Since $f$ is uniformly continuous, this $\to 0$ as long as $\frac{m}{n} \to 0$ (∗)
Also, $|S_2| \le 2\|f\|_\infty \sum_{i : |i - nx| > M} b_{i,n}(x) = 2\|f\|_\infty \, \mathbb{P}(|X_1 + \dots + X_n - nx| > M)$.
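The probabilistic description of $b_{i,n}(x)$ suggests a direct numerical experiment. The sketch below assumes the $b_{i,n}$ are the binomial weights $\binom{n}{i} x^i (1-x)^{n-i}$ and that the approximating polynomial is $\sum_i f(i/n)\, b_{i,n}(x)$ (the Bernstein polynomial); these formulas and all function names are assumptions, since the definitions do not survive in the text above.

```python
from math import comb

def bernstein_weight(i, n, x):
    """P(X_1 + ... + X_n = i) for independent 0/1 variables with mean x."""
    return comb(n, i) * x**i * (1 - x)**(n - i)

def bernstein_poly(f, n, x):
    """Weighted average of the samples f(i/n), using the weights above."""
    return sum(f(i / n) * bernstein_weight(i, n, x) for i in range(n + 1))

def sup_error(f, n, grid=101):
    """Maximum of |B_n f(x) - f(x)| over an evenly spaced grid in [0, 1]."""
    points = [k / (grid - 1) for k in range(grid)]
    return max(abs(bernstein_poly(f, n, x) - f(x)) for x in points)
```

For a test function with a kink, such as f(x) = |x - 1/2|, the error shrinks slowly as n grows, which is consistent with the two-part estimate (nearby terms controlled by uniform continuity, far terms by a tail probability).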
Arzelà-Ascoli Theorem
Suppose $F \subset C(X)$. Say that $F$ is precompact if its closure $\overline{F}$ is compact.
When is F precompact?
Say that F is equicontinuous if for each x ∈ X and for all ε > 0 there is an open set
Ux ∋ x such that y ∈ Ux implies |f (y) − f (x)| < ε for all f ∈ F simultaneously. (I.e., in the
definition of continuity, the same open set works for all f ∈ F .)
Example. Lipschitz functions on $[0, 1]$. The set $F$ of all $f \in C[0, 1]$ with $|f(x)| < c_1$ and $|f(x) - f(y)| < c_2 |x - y|$ (Lipschitz condition). Arzelà-Ascoli says that this is a precompact set of functions. To see equicontinuity, note that $|f(x) - f(y)| < \varepsilon$ whenever $y \in B_\delta(x)$ with $\delta = \frac{\varepsilon}{c_2}$. Actually, we could replace $|x - y|$ with $\psi(|x - y|)$ as long as $\psi(t) \to 0$ as $t \to 0$. When $\psi(t) = t^\alpha$ we talk of the ‘Hölder condition’ with exponent $\alpha$.
Say that F is totally bounded if for every ε > 0 we can cover F by finitely many balls of
radius ε, say B(fi , ε) with fi ∈ F .
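Proof. Suppose first that $F$ has been covered by balls $B(f_i, \frac{\varepsilon}{2})$. If $g \in \overline{F}$ then there is a sequence of functions in $F$ with limit $g$. By the pigeonhole principle we can assume these all lie in the same $B(f_i, \frac{\varepsilon}{2})$. Hence $g \in \overline{B(f_i, \frac{\varepsilon}{2})} \subset B(f_i, \varepsilon)$.
Conversely, suppose that $\overline{F}$ has been covered by balls $B(f_i, \frac{\varepsilon}{2})$ with $f_i \in \overline{F}$. For each $i$, choose $g_i \in F$ with $d(f_i, g_i) < \frac{\varepsilon}{2}$. Then the balls $B(g_i, \varepsilon)$ cover $F$. □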
Proof of Arzelà-Ascoli. Recall that a metric space is compact iff it is complete and totally bounded. Note that $\overline{F}$ is automatically complete, because it is a closed subset of the complete metric space $C(X)$ (with the sup norm). So all we need show is that:
TB ⇒ UB. For any $\varepsilon > 0$ there is some finite collection $\{f_1, \dots, f_n\}$ of functions such that for all $f \in F$, there is $i \in \{1, \dots, n\}$ such that $\|f - f_i\|_\infty \le \varepsilon$.
TB ⇒ EQ. Note that each $f_i$ is continuous at $x$, and so there is an open set $U$ containing $x$ such that if $y \in U$ then $|f_i(x) - f_i(y)| \le \frac{\varepsilon}{3}$ for $i = 1, \dots, n$. Now if $f \in F$ is arbitrary, let $y \in U$. Choose $i$ such that $\|f - f_i\|_\infty \le \frac{\varepsilon}{3}$. Then:
$$|f(y) - f(x)| \le |f(y) - f_i(y)| + |f_i(y) - f_i(x)| + |f_i(x) - f(x)| \le \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon.$$
Thus $F$ is EQ.
UB + EQ ⇒ TB. Let $\varepsilon > 0$. For each $x \in X$, EQ implies that there is a set $U_x \ni x$ such that $|f(y) - f(x)| \le \frac{\varepsilon}{3}$ whenever $f \in F$ and $y \in U_x$. Since $X$ is compact, we may pass to a finite subcover $U_{x_1} \cup \dots \cup U_{x_n}$.
Look at the vectors $\{(f(x_1), \dots, f(x_n)) : f \in F\} \subset \mathbb{R}^n$. Since $F$ is UB these all lie in some box $[-m, m]^n$.
This means we can choose an $\frac{\varepsilon}{3}$-net from this set of vectors. I.e., there is a collection of functions $f_1, \dots, f_k$ such that for each $f \in F$ there is some $i$ such that $|f(x_j) - f_i(x_j)| \le \frac{\varepsilon}{3}$ for $j = 1, \dots, n$.
We have shown that the balls of radius $\varepsilon$ about $f_1, \dots, f_k$ in the sup norm cover $F$. Since $\varepsilon$ was arbitrary, the family $F$ is indeed totally bounded. □
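To see why equicontinuity cannot simply be dropped, one can look at the standard family $f_n(x) = x^n$ on $[0, 1]$, which is uniformly bounded but not equicontinuous at $x = 1$. This example is not taken from the notes; the sketch below (with illustrative function names and grid size) estimates the sup-norm distance between $x^n$ and $x^{2n}$, which equals $\max_{u \in [0,1]} (u - u^2) = \frac{1}{4}$ for every $n$, so no subsequence of $(x^{2^k})$ can be Cauchy in the sup norm.

```python
def sup_distance(f, g, grid=10001):
    """Approximate sup-norm distance on [0, 1] using an evenly spaced grid."""
    points = [i / (grid - 1) for i in range(grid)]
    return max(abs(f(x) - g(x)) for x in points)

def power(n):
    """The function x -> x**n."""
    return lambda x: x ** n

# Distances between x^n and x^(2n) for a few n; each should be close to 1/4.
distances = [sup_distance(power(n), power(2 * n)) for n in (1, 2, 4, 8, 16, 32)]
```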
Proof. The idea is to construct approximate solutions to the equation in a suitable space of
functions V . We’ll ensure that V is compact, hence sequentially compact, and so our
sequence of approximate solutions will have a limit, which is a genuine solution.
Let $M = \max\{\max_{x,y\in[-1,1]} |\Psi(x, y)|,\ 1\}$, and let $\eta = \frac{1}{M}$.
Take $V$ to be the subset of $C[-\eta, \eta]$ with $f(0) = 0$ and Lipschitz $|f(x) - f(x')| \le M |x - x'|$ for all $x, x' \in [-\eta, \eta]$.
Lemma. For each δ > 0 there is a δ-approximate solution to our ODE in V . That
means a function f ∈ V , differentiable except possibly at finitely many points,
and such that |f ′ (x) − Ψ(x, f (x))| 6 δ for x ∈ [−η, η] if f ′ (x) is defined.
Proof of lemma. Let $n$ be a large positive integer, and $x_j = \frac{j\eta}{n}$ for $j = -n, -(n-1), \dots, (n-1), n$. So $-\eta = x_{-n} < \dots < x_{-1} < x_0 < x_1 < \dots < x_n = \eta$.
From this it follows that $|f(x)| \le M|x| \le M\eta \le 1$ for all $x \in (x_i, x_{i+1})$. It follows that $f \in V$.
Let us show that (if n is big enough), then f is a δ-approximate solution to the
ODE.
Suppose $x \in (x_i, x_{i+1})$. Then $|f'(x) - \Psi(x, f(x))| = |\Psi(x_i, f(x_i)) - \Psi(x, f(x))|$.
However, $|x_i - x| \le \frac{\eta}{n}$ and $|f(x_i) - f(x)| \le \frac{2M}{n}$.
Thus $f(x) = \int_0^x \Psi(t, f(t))\, dt$, so $f$ is differentiable and $f'(x) = \Psi(x, f(x))$. □
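The $\delta$-approximate solutions in the lemma are piecewise-linear functions whose slope on $(x_i, x_{i+1})$ is $\Psi(x_i, f(x_i))$, which is Euler's polygon method. A minimal sketch on $[0, \eta]$ only, assuming a sample right-hand side $\Psi(x, y) = x + y$ (so $M = 2$, $\eta = \frac{1}{2}$, and the exact solution with $f(0) = 0$ is $e^x - 1 - x$); the function name and the choice of $\Psi$ are illustrative, not from the notes.

```python
import math

def euler_polygon(psi, eta, n):
    """Nodes x_j = j*eta/n (j = 0..n) and values f(x_j) of the polygon with f(0) = 0."""
    h = eta / n
    xs, fs = [0.0], [0.0]
    for j in range(n):
        slope = psi(xs[-1], fs[-1])     # slope used on the interval (x_j, x_{j+1})
        xs.append((j + 1) * h)
        fs.append(fs[-1] + h * slope)
    return xs, fs

# Sample problem (an assumption for illustration): Psi(x, y) = x + y on [-1, 1]^2.
psi = lambda x, y: x + y
xs, fs = euler_polygon(psi, 0.5, 1000)
exact_at_eta = math.exp(0.5) - 1.5      # value of e^x - 1 - x at x = 1/2
```

As n grows the polygon values approach the exact solution, which is the behaviour the compactness argument guarantees along a subsequence in general.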
5. Weak Topologies on Normed Spaces
On $X$, we always have the norm topology $\tau_{\|\cdot\|}$, defined by: $A \in \tau_{\|\cdot\|}$ iff for all $x \in A$, there exists $\varepsilon > 0$ such that $B_\varepsilon(x) \subset A$. Alternatively, given a family $F$ of functions $f : X \to Y_f$ for some topological spaces $Y_f$, we can also define a topology as follows:
$$S = \{f^{-1}(U) : U \in \tau_{Y_f},\ f \in F\}$$
$$\tau_F = \bigcap \tau, \text{ over topologies } \tau \text{ on } X \text{ with } \tau \supset S$$
Then $\tau_F$ is the smallest topology on $X$ such that every $f \in F$ is continuous. We can show that $\tau_F$ is the set of arbitrary unions of finite intersections of sets of the form $f^{-1}(U)$, where $U \in \tau_{Y_f}$, $f \in F$.
Proof. Let $x, y \in X$, $x \ne y$. Then there exists $f \in F$ such that $f(x) \ne f(y)$. So there are $W_1, W_2 \in \tau_{Y_f}$ such that $f(x) \in W_1$, $f(y) \in W_2$ with $W_1 \cap W_2 = \emptyset$.
Lemma. Let $f_1, \dots, f_n$ be linear functionals on $X$ and $N = \{x \in X : f_i(x) = 0 \text{ for all } i\}$. Then a linear functional $f$ satisfies $f(x) = 0$ for all $x \in N$ iff there exist $\alpha_1, \dots, \alpha_n$ such that $f = \sum \alpha_i f_i$.
Proof. Exercise. □
Recall that a locally convex space is a topological vector space with a basis consisting of
convex sets.
Theorem. Let X be a vector space and F a vector space of linear functionals on X, sepa-
rating the points of X. Then (X, τF ) is a locally convex space, and (X, τF )∗ = F .
Proof. Since $F$ separates points of $X$, we know $(X, \tau_F)$ is Hausdorff. We first check for the existence of a neighbourhood basis consisting of convex sets. Indeed, there is a basis of neighbourhoods of 0 consisting of the sets of the form $V = \{x \in X : |f_i(x)| < \varepsilon_i \text{ for all } i = 1, \dots, n\} = \bigcap_i f_i^{-1}(B_{\varepsilon_i}(0))$ for some $f_i \in F$ and $\varepsilon_i > 0$.
Next we check the continuity of the sum and scalar multiplication. Note that if $V$ is a 0-neighbourhood as before, then $\frac{1}{2}V + \frac{1}{2}V = V$ and $(x + \frac{1}{2}V) + (y + \frac{1}{2}V) = x + y + V$. Consider $+ : X \times X \to X$. If $z = x + y$ and $z \in U \in \tau_F$, then there is a $V$ as above such that $z + V \subset U$, and so $+^{-1}(x + y + V) \supset (x + \frac{1}{2}V) \times (y + \frac{1}{2}V)$, and thus $+$ is continuous, as preimages of open sets are open.
Suppose $x_0 \in X$ is such that $f_i(x_0) = 0$ for all $i$. Then $x_0 \in V$ and moreover $\alpha x_0 \in V$ for all $\alpha$. So $|f(\alpha x_0)| < 1$ for all $\alpha$, and thus $f(x_0) = 0$. By the previous lemma, $f = \sum \alpha_i f_i \in F$. □
Definition. Let X be a normed space, and X ∗ be the dual space. The topology on X
generated by F = X ∗ is known as the weak topology on X and is denoted by τw .
Thus U ⊂ X is open in the weak topology (‘w-open’) iff for every x ∈ U there are
f1 , . . ., fn ∈ X ∗ and ε1 , . . ., εn > 0 such that {y ∈ X : |fi (x) − fi (y)| < εi ∀ i} ⊂ U .
(By replacing $f_i$ by $\frac{1}{\varepsilon_i} f_i$, we may assume each $\varepsilon_i = 1$.)
Thus F ⊂ X ∗ is open in the weak-star topology (‘w∗ -open’) iff for every f ∈ F there
are x1 , . . ., xn ∈ X and ε1 , . . ., εn > 0 such that {g ∈ X ∗ : |ϕX (xi )(f ) − ϕX (xi )(g)| <
εi ∀ i} = {g ∈ X ∗ : |f (xi ) − g(xi )| < εi ∀ i} ⊂ F . (As before, we may assume each
εi = 1.)
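Proof. (⇐). Wlog, $X = \mathbb{R}^n$ or $\mathbb{C}^n$ with norm $\|x\| = \max_i |x_i|$, where $x = (x_1, \dots, x_n)^t$. Let $f_i(x) = x_i$. Then $f_i \in X^*$ for all $i = 1, \dots, n$. For every $\varepsilon > 0$, $B_\varepsilon(0) = \{x \in X : |f_i(x)| < \varepsilon \text{ for all } i\} \in \tau_w$. So $\tau_{\|\cdot\|} \subset \tau_w$, and therefore $\tau_{\|\cdot\|} = \tau_w$.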
Remark. We proved that for f ∈ X ∗ , xn ⇀ x implies f (xn ) → f (x). This is not the
definition of continuity. Only in metric spaces is continuity equivalent to sequential
continuity. In general, continuity implies sequential continuity.
Lemma. Let $x_n \rightharpoonup x$. Then $(x_n)$ is bounded and $\|x\| \le \liminf_{n\to\infty} \|x_n\|$.
Proof. For each $f \in X^*$, $f(x_n) \to f(x)$, and so there exists $c_f$ such that $c_f \ge |f(x_n)| = |\varphi_X(x_n)(f)|$ for all $n \ge 1$. Let $F = \{\varphi_X(x_n) : n \ge 1\}$ be a family of linear functionals on $X^*$ such that for all $f \in X^*$, there exists $c_f > 0$ with $|T(f)| \le c_f$ for all $T \in F$. By Banach-Steinhaus, there exists $c > 0$ such that $\|T\| < c$ for all $T \in F$.
So $\|x_n\| = \|\varphi_X(x_n)\| \le c$ for all $n \ge 1$ and so $(x_n)$ is bounded. By Hahn-Banach, there exists $f \in X^*$ such that $\|f\| = 1$ and $f(x) = \|x\|$. Then $\|x\| = f(x) = \lim_{n\to\infty} f(x_n) \le \liminf_{n\to\infty} \|f\| \|x_n\| = \liminf_{n\to\infty} \|x_n\|$. □
Lemma (Tukey’s Lemma). Let $\mathcal{F}$ be a set system of finite character and $F \in \mathcal{F}$. Then $\mathcal{F}$ has a maximal element containing $F$.
Proof. Let $\mathcal{F}_0 = \{A \in \mathcal{F} : F \subset A\}$. If $\mathcal{C} \subset \mathcal{F}_0$ is totally ordered with respect to inclusion, then $D = \bigcup_{C \in \mathcal{C}} C \in \mathcal{F}$, because its finite subsets are in $\mathcal{F}$. Also $D \supset F$, so $D \in \mathcal{F}_0$ and $D$ is an upper bound for $\mathcal{C}$. By Zorn’s Lemma, $\mathcal{F}_0$ has a maximal element. This is also a maximal element for $\mathcal{F}$ and contains $F$.
Definition. A system $\mathcal{F}$ of subsets of a given set has the finite intersection property if for all $F_1, \dots, F_n \in \mathcal{F}$, we have $\bigcap_{i=1}^n F_i \ne \emptyset$.
Proposition. Let $X$ be a topological space. Then $X$ is compact iff for every system $\mathcal{F}$ of closed subsets of $X$ with the finite intersection property we have $\bigcap_{A \in \mathcal{F}} A \ne \emptyset$.
Proof. $X$ compact
⇐⇒ for all collections of open sets $(U_\alpha)_{\alpha\in\Lambda}$ with $X = \bigcup U_\alpha$, there exist $\alpha_1, \dots, \alpha_n$ such that $X = \bigcup U_{\alpha_i}$
⇐⇒ for all collections of closed sets $(V_\alpha)_{\alpha\in\Lambda}$ with $\emptyset = \bigcap V_\alpha$, there exist $\alpha_1, \dots, \alpha_n$ such that $\emptyset = \bigcap V_{\alpha_i}$, by taking $V_\alpha = U_\alpha^c$
⇐⇒ if $\mathcal{F} = (V_\alpha)_{\alpha\in\Lambda}$ is a system of closed sets with the finite intersection property then $\bigcap V_\alpha \ne \emptyset$. □
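Let $\mathcal{F}$ be the collection of all systems of subsets of $X$ which have the finite intersection property. By definition, $\mathcal{F}$ is of finite character, so by Tukey’s Lemma there exists a maximal system $\mathcal{B}$ of subsets of $X$ having the finite intersection property such that $\mathcal{B} \supset \mathcal{A}$. Since $\bigcap_{B\in\mathcal{B}} \overline{B} \subset \bigcap_{A\in\mathcal{A}} A$, it is enough to show that $\bigcap_{B\in\mathcal{B}} \overline{B} \ne \emptyset$.
However, each $p_{\gamma_i}^{-1}(U_{\gamma_i}) \in \mathcal{B}$, and so $\bigcap_{i=1}^n p_{\gamma_i}^{-1}(U_{\gamma_i}) \in \mathcal{B}$, so $\bigcap_{i=1}^n p_{\gamma_i}^{-1}(U_{\gamma_i}) \cap B \ne \emptyset$.
So, for all $B \in \mathcal{B}$, we have $U \cap B \ne \emptyset$, so $x \in \overline{B}$, and thus $x \in \bigcap_{B\in\mathcal{B}} \overline{B}$, as required. □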
Theorem (Banach-Alaoglu Theorem). Let $X$ be a normed space. Then the closed unit ball $B(X^*) = \{f \in X^* : \|f\| \le 1\}$ is compact in the $\tau_{w^*}$ topology (‘$w^*$-compact’).
Define $\varphi : B(X^*) \to D$, $f \mapsto (f(x))_{x\in X}$.
I.e., $\varphi$ is continuous.
So $f$ is linear. And since $|f(x)| = |\xi_x| \le \|x\|$ for all $x \in X$, $f$ is continuous and $\|f\| \le 1$, i.e. $f \in B(X^*)$. And by definition of $f$, we have $\xi = \varphi(f) \in \varphi(B(X^*))$, which is thus closed. □
Proof. Let $U$ be $w^*$-open in $B(X^{**})$ and let $\theta \in U$. Recall that this means there are $f_1, \dots, f_n \in X^*$ and $\varepsilon_1, \dots, \varepsilon_n > 0$ such that $\{\psi \in X^{**} : |\theta(f_i) - \psi(f_i)| < \varepsilon_i \ \forall i\} \subset U$.
To show that $\varphi_X(B(X))$ is dense, we want to show that $U$ contains a $\psi$ of the form $\varphi_X(x)$ for some $x \in B(X)$. So it is enough to show that for every $f_1, \dots, f_n$ and $\varepsilon > 0$, there exists $x \in B(X)$ such that $\varepsilon > \sum |\theta(f_i) - \varphi_X(x)(f_i)| = \sum |\theta(f_i) - f_i(x)|$.
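Let $b = \hat f \circ \hat J^{-1} \in (\mathbb{K}^n)^*$, say $b = (b_1, \dots, b_n)$. Then
$$\theta(b \circ J) = \theta\left(\sum b_i f_i\right) = \sum b_i \theta(f_i) = (b \circ \hat J)([z]) = \hat f([z]) = \|[z]\|.$$
Since $|(b \circ J)(x)| = |(b \circ \hat J)([x])| = |\hat f([x])| \le \|\hat f\| \, \|[x]\| = \|[x]\| \le \|x\|$, we know $\|b \circ J\| \le 1$ and thus $\|[z]\| \le \|\theta\| \, \|b \circ J\| \le \|\theta\| \le 1$.
So, since $\|[z]\| = \inf_{x \sim z} \|x\|$, there exists $x \in X$ such that $x \sim z$ and $\|x\| \le 1 + \varepsilon$. Then $\sum |\theta(f_i) - f_i(x)| = 0$, and taking $x_0 = \frac{x}{1+\varepsilon}$ we have $\|x_0\| \le 1$ and $\sum |\theta(f_i) - f_i(x_0)| \le c\varepsilon$, for some $c$. □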
Proof. Let $(x_n)$ be a sequence in $B(X)$. Define $X_1 = \overline{\langle (x_n) \rangle}$, the closed linear span. Then $X_1$ is separable and reflexive. So $X_1^*$ is separable. Let $(f_n)$ be a dense sequence in $X_1^*$.
$(f_1(x_n))$ is a bounded sequence in $\mathbb{K}$, so there exists a subsequence $(x_{1,j})$ of $(x_n)$ such that $f_1(x_{1,j})$ converges.
$(f_2(x_{1,j}))$ is a bounded sequence in $\mathbb{K}$, so there exists a subsequence $(x_{2,j})$ of $(x_{1,j})$ such that $f_2(x_{2,j})$ converges.
Continue to find sequences $(x_{j,k})$ for all $j, k \in \mathbb{N}$. Let $v_k = x_{k,k}$. It follows that $f_n(v_k)$ converges for all $n$. So $f(v_k)$ converges for all $f \in X_1^*$, so $(v_k)$ is a $w$-Cauchy sequence and therefore $w$-convergent. □
Corollary. Let X be a reflexive Banach space. Then every bounded sequence in X has a
w-convergent subsequence.
6. Hilbert Spaces
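Let $V$ be a vector space over $\mathbb{R}$. Then an inner product on $V$ is a map $\langle\,\cdot\,, \cdot\,\rangle : V \times V \to \mathbb{R}$ satisfying:
Now let $V$ be a vector space over $\mathbb{C}$. Then a Hermitian inner product on $V$ is a map $\langle\,\cdot\,, \cdot\,\rangle : V \times V \to \mathbb{C}$ satisfying: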
Examples
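Inner products (over $\mathbb{R}$ or $\mathbb{C}$) may be used to define norms: define $\|x\| = \sqrt{\langle x, x \rangle}$.
Proof. It suffices to prove it with $y$ replaced by $y' = \lambda y$, where $|\lambda| = 1$, since $|\langle x, y' \rangle| = |\langle x, y \rangle|$ and $\|y'\| = \|y\|$. By choosing $\lambda$ appropriately, we may assume that $\langle x, y \rangle \in \mathbb{R}$.
Now let $t \in \mathbb{R}$ and note that $\langle x + ty, x + ty \rangle \ge 0$, i.e. $\|x\|^2 + 2t\langle x, y \rangle + t^2 \|y\|^2 \ge 0$. (Note $\langle x, y \rangle = \overline{\langle y, x \rangle} = \langle y, x \rangle$.) As a quadratic in $t$, this has a non-positive discriminant, that is $4\langle x, y \rangle^2 - 4\|x\|^2 \|y\|^2 \le 0$. Rearranging gives the result. □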
It follows that $\|\cdot\|$ satisfies the triangle inequality:
$$\|x + y\|^2 = \langle x + y, x + y \rangle = \|x\|^2 + \|y\|^2 + \langle y, x \rangle + \langle x, y \rangle \le (\|x\| + \|y\|)^2.$$
Definition. If $V$ with the norm $\|\cdot\|$ induced from a (Hermitian) inner product is complete (i.e. if $V$ is a Banach space), then $V$ is called a Hilbert space.
Examples 1 and 2 above are Hilbert spaces: 1 because every finite-dimensional normed space is a Banach space, and 2 because we proved that $\ell^p$ is complete for all $p \ge 1$.
But 3 is not complete. We showed before (see page 5) that $C[a, b]$ is not complete with the $L^1$-norm $\|f\| = \int_a^b |f(t)|\,dt$. Here our norm is the $L^2$-norm $\|f\| = \left(\int_a^b |f(t)|^2\,dt\right)^{1/2}$, but the same example (functions converging to a step function) works.
Completeness
Can turn $\tilde X$ into a complete metric space – it is the unique (up to isomorphism) smallest complete metric space containing $X$.
This construction works well with respect to inner products and norms. In particular if $V$ comes equipped with a Hermitian inner product $\langle\,\cdot\,, \cdot\,\rangle$ (which induces a norm $\|\cdot\|$) then the completion $\tilde V$ of $V$ as a metric space is a Hilbert space (i.e. can be given the structure of a Hilbert space).
The completion of $C_{\mathbb{C}}[a, b]$ with respect to the $L^2$-norm is a very important space, called $L^2[a, b]$. Remarkable fact: there is a reasonably explicit description of $L^2[a, b]$ as the space of Lebesgue measurable functions with $\int_a^b |f|^2 < \infty$.
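A few facts about inner product norms, $\|x\| = \sqrt{\langle x, x \rangle}$.
Polarisation identity
• Over $\mathbb{R}$, $\langle x, y \rangle = \frac{1}{2}(\|x + y\|^2 - \|x\|^2 - \|y\|^2)$. (Trivial proof.)
So knowing the norm tells us the inner product (if there is one).
• Over $\mathbb{C}$, $\langle x, y \rangle = \frac{1}{2\pi} \int_0^{2\pi} \|x + e^{i\theta} y\|^2 e^{i\theta}\, d\theta$.
Expanding out gives: $\frac{1}{2\pi} \int_0^{2\pi} \left( \|x\|^2 e^{i\theta} + \langle y, x \rangle e^{2i\theta} + \|y\|^2 e^{i\theta} + \langle x, y \rangle \right) d\theta$.
But $\int_0^{2\pi} e^{in\theta}\, d\theta = 2\pi$ if $n = 0$, and $= 0$ if $n \in \mathbb{Z} \setminus \{0\}$.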
Parallelogram identity. $\|x - y\|^2 + \|x + y\|^2 = 2\|x\|^2 + 2\|y\|^2$. (This is not satisfied by all norms, only inner product norms.)
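A quick numerical check of the parenthetical remark, as a sketch: the Euclidean norm on $\mathbb{R}^2$ satisfies the parallelogram identity, while the sup norm fails it for $x = (1, 0)$, $y = (0, 1)$. The helper names below are illustrative.

```python
def l2_norm(v):
    """Euclidean norm of a finite real vector."""
    return sum(a * a for a in v) ** 0.5

def sup_norm(v):
    """Sup (max) norm of a finite real vector."""
    return max(abs(a) for a in v)

def parallelogram_defect(norm, x, y):
    """||x+y||^2 + ||x-y||^2 - 2||x||^2 - 2||y||^2; zero for inner product norms."""
    plus = [a + b for a, b in zip(x, y)]
    minus = [a - b for a, b in zip(x, y)]
    return norm(plus) ** 2 + norm(minus) ** 2 - 2 * norm(x) ** 2 - 2 * norm(y) ** 2

x, y = [1.0, 0.0], [0.0, 1.0]
euclidean_defect = parallelogram_defect(l2_norm, x, y)   # zero up to rounding
sup_defect = parallelogram_defect(sup_norm, x, y)        # 1 + 1 - 2 - 2
```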
Another name for a (Hermitian) inner product space is a Euclidean space. So a Hilbert
space is a complete Euclidean space.
This is a closed linear subspace (and hence a Hilbert space), as follows. Obviously, $x^\perp$ is closed under addition and scalar multiplication. It is closed because if $y_n \to y$ then $|\langle x, y - y_n \rangle| \le \|x\| \|y - y_n\| \to 0$ as $n \to \infty$. So $|\langle x, y \rangle - \langle x, y_n \rangle| \to 0$. But $y_n \in x^\perp$ and thus $\langle x, y \rangle = 0$.
More generally, if $S \subset X$ is any set at all, then we define $S^\perp = \bigcap_{x\in S} x^\perp$. This is also a closed linear subspace (as indeed is any intersection of closed linear subspaces).
Suppose that $V = \overline{\operatorname{lin}}\, S$ = closed linear span of $S$, the closure of all linear combinations of elements of $S$. Then $S^\perp = V^\perp$. It’s clear that $V^\perp \subset S^\perp$. (In fact, if $A \subset B$ then $B^\perp \subset A^\perp$.)
Proof. Idea is to take $y$ to be the ‘nearest’ point to $x$. Need: (1) to understand what this means, and (2) check that $x - y \perp Y$.
(1) Let $d = \inf_{y\in Y} \|x - y\|$. Take a sequence $(y_n)_{n=1}^\infty \subset Y$ with $\|x - y_n\|^2 \le d^2 + \frac{1}{n}$.
We claim that this is a Cauchy sequence. Recall the parallelogram law, $\|v + w\|^2 + \|v - w\|^2 = 2\|v\|^2 + 2\|w\|^2$. Apply this with $v = x - y_n$, $w = x - y_m$. We obtain
(For (∗), note that $\frac{1}{2}(y_n + y_m) \in Y$, hence $\|x - \frac{1}{2}(y_n + y_m)\| \ge \inf_{y\in Y} \|x - y\| = d$.)
It follows that $y_n$ tends to some limit $y$. We obviously have $\|x - y\| = \lim_{n\to\infty} \|x - y_n\| = d$.
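$\lambda \in \mathbb{C}$, we can assume that $\langle y^*, z \rangle \in \mathbb{R}$ and is $> 0$.
$$\begin{aligned}
\|x - y'\|^2 &= \langle x - (y + \varepsilon z), x - (y + \varepsilon z) \rangle \\
&= \langle y^* - \varepsilon z, y^* - \varepsilon z \rangle \\
&= \|y^*\|^2 - 2\varepsilon \langle y^*, z \rangle + \varepsilon^2 \|z\|^2 \\
&= \|x - y\|^2 - 2\varepsilon \langle y^*, z \rangle + \varepsilon^2 \|z\|^2
\end{aligned}$$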
Remark. The theorem is most often applied when X is a Hilbert space, in which case it
suffices that Y be a closed subspace (as it’s then automatically complete).
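Furthermore, anything of this type is a bounded linear functional and the identification $f : x_0 \mapsto \varphi$ gives a bijection $X \to X^*$. This is an anti-isometry, in that $f(\lambda x + \mu x') = \overline{\lambda} f(x) + \overline{\mu} f(x')$. Furthermore, $\|\varphi\| = \|x_0\|$, where $\|\varphi\|$ means the norm of $\varphi$ considered as an operator, i.e. $\sup_{\|x\|=1} |\varphi(x)|$.
This gives $X^*$ with the dual norm the structure of a Hilbert space.
It’s easy to see that $\dim Y^\perp \le 1$, since $\varphi|_{Y^\perp}$ has trivial kernel. If $\varphi = 0$ then we’re done, otherwise $\dim Y^\perp = 1$.
Suppose $Y^\perp = \langle z \rangle$ for some $z \in X$. Let $x = y + \lambda z$ be an arbitrary element of $X$. Then $\varphi(x) = \lambda \varphi(z)$. Also, $\langle x, z \rangle = \langle \lambda z, z \rangle = \lambda \|z\|^2$.
Hence, defining $x_0 = \frac{z}{\|z\|^2} \overline{\varphi(z)}$, we get $\langle x, x_0 \rangle = \frac{\varphi(z)}{\|z\|^2} \langle x, z \rangle = \lambda \varphi(z) = \varphi(x)$.
We have $\|\varphi\| = \sup_{\|x\|=1} |\varphi(x)| = \sup_{\|x\|=1} |\langle x, x_0 \rangle| \le \sup_{\|x\|=1} \|x\| \|x_0\| = \|x_0\|$.
On the other hand, $\varphi\left(\frac{x_0}{\|x_0\|}\right) = \frac{\langle x_0, x_0 \rangle}{\|x_0\|} = \|x_0\|$, so $\|\varphi\| \ge \|x_0\|$ as well. □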
A remark on adjoints
Note that $T^{**} = T$. We showed before, for general normed spaces, that $\|T^*\| \le \|T\|$. Hence $\|T\| = \|T^{**}\| \le \|T^*\|$, thus $\|T\| = \|T^*\|$.
** Non-examinable section **
‘Typically’, we’d expect this ‘orbit’ to become equidistributed on X. Ergodic theory studies
this phenomenon. When are time averages and space averages the same?
Von Neumann’s ergodic theorem. Let H be a Hilbert space, and suppose that T : H →
H has norm at most 1. Let Y be the closed subspace of H consisting of T -invariant
vectors, i.e. T y = y. Let π : H → Y be the orthogonal projection.
Write $S_N x = \frac{1}{N}(x + Tx + \dots + T^{N-1}x)$ – ‘time average’. Then $S_N x \to \pi(x)$ as $N \to \infty$.
Suppose X is a compact metric space, and take H = L2 (X), the completion of the space
C(X) of continuous functions on X. E.g., X = [0, 1], C(X) has L2 -norm. Let Ψ : X → X
be a map.
This will induce a map T : H → H by defining T f (x) = f (Ψ(x)). Von Neumann’s ergodic
theorem, in this context, says that either
Proof. First of all note that $\|T^*\| \le 1$ as well. Next suppose that $x$ is $T$-invariant, i.e. $Tx = x$. Claim that $T^*x = x$ as well.
Hence indeed $x = T^*x$.
Proof of idea. First of all, suppose $f \in Y$. Claim $\langle f, \partial g \rangle = 0$ for all $g \in H$. Indeed, $\langle f, \partial g \rangle = \langle f, g - Tg \rangle = \langle f, g \rangle - \langle T^*f, g \rangle = 0$, since $f$ is $T$-invariant. Hence if $f \in Y$ then $f$ is orthogonal to $M$. Conversely suppose $f$ is orthogonal to $M$, then $\langle f, \partial f \rangle = 0$.
But $\|f - Tf\|^2 = \langle f, f - Tf \rangle + \langle f - Tf, f \rangle - \|f\|^2 + \|Tf\|^2 \le 0$. Therefore $f = Tf$ and hence $f \in Y$.
and hence $\|S_N(\partial g)\| \le \frac{2}{N} \|g\| \le \frac{\varepsilon}{2}$ for $N$ sufficiently large.
It follows that if $N$ is big enough then $\|S_N x - \pi(x)\| \le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$.
Orthonormal systems
Let $X$ be a Euclidean space. A collection of elements $(\varphi_i)_{i\in I}$ is said to be orthonormal if $\|\varphi_i\| = 1$ for all $i$, and if $\langle \varphi_i, \varphi_j \rangle = 0$ when $i \ne j$.
Let $X$ be a Hilbert space, and let $(\varphi_i)_{i\in I}$ be an orthonormal system. We say that the system is complete if we can’t add another element $\varphi$ and still have an orthonormal system.
Lemma. An orthonormal system $(\varphi_i)_{i\in I}$ is complete iff its closed linear span is $X$.
Proof. Suppose $(\varphi_i)_{i\in I}$ has closed linear span $X$. Then $((\varphi_i)_{i\in I})^\perp = X^\perp = \{0\}$.
Conversely, suppose $(\varphi_i)_{i\in I}$ is complete, and let $Y$ be its closed linear span. Then $Y^\perp = \{0\}$. But $X = Y \oplus Y^\perp$ and so $X = Y$. □
Proof. Proceed by induction on $i$. Suppose $y_1, \dots, y_i$ are defined.
Then define $\tilde y_{i+1} = x_{i+1} - P_i x_{i+1}$, where $P_i$ is the orthogonal projection on to the closed subspace $\langle x_1, \dots, x_i \rangle = \langle y_1, \dots, y_i \rangle$.
Set $y_{i+1} = \frac{\tilde y_{i+1}}{\|\tilde y_{i+1}\|}$. (Note, $\|\tilde y_{i+1}\| \ne 0$ since $x_{i+1} \notin \langle x_1, \dots, x_i \rangle$.)
Clearly $\tilde y_{i+1}$ and hence $y_{i+1}$ are orthogonal to $\langle x_1, \dots, x_i \rangle = \langle y_1, \dots, y_i \rangle$. Also clearly, $\langle y_1, \dots, y_{i+1} \rangle = \langle y_1, \dots, y_i, \tilde y_{i+1} \rangle = \langle x_1, \dots, x_{i+1} \rangle$. □
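Explicitly, the orthogonal projection of $X$ onto $\langle y_1, \dots, y_i \rangle$ is $\pi(x) = \langle x, y_1 \rangle y_1 + \dots + \langle x, y_i \rangle y_i$.
For $\pi(x) \in \langle y_1, \dots, y_i \rangle$ and $\langle x - \pi(x), y_j \rangle = \langle x, y_j \rangle - \sum_{k=1}^{i} \langle x, y_k \rangle \langle y_k, y_j \rangle = \langle x, y_j \rangle - \langle x, y_j \rangle = 0$.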
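The inductive step, combined with the explicit projection formula, can be written out for real vectors in $\mathbb{R}^n$. This is a minimal sketch with illustrative function names, assuming the inputs are linearly independent lists of floats.

```python
def inner(u, v):
    """Standard real inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(xs):
    """Orthonormalise linearly independent vectors x_1, ..., x_m in order."""
    ys = []
    for x in xs:
        # pi(x) = sum_k <x, y_k> y_k is the projection onto span(y_1, ..., y_i)
        proj = [0.0] * len(x)
        for y in ys:
            c = inner(x, y)
            proj = [p + c * b for p, b in zip(proj, y)]
        y_tilde = [a - p for a, p in zip(x, proj)]
        norm = inner(y_tilde, y_tilde) ** 0.5
        if norm == 0.0:
            raise ValueError("input vectors are linearly dependent")
        ys.append([a / norm for a in y_tilde])
    return ys

ys = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
```

The resulting list ys should satisfy inner(ys[i], ys[j]) = 1 if i = j and 0 otherwise, up to rounding.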
Corollary. Let X be a separable Hilbert space (i.e. X has a countable subset with dense
linear span).
Thin it out if necessary so that the xi are linearly independent. Then apply Gram-
Schmidt. 2
Examples
Proof. Write $x_N = \sum_{n=1}^{N} a_n \varphi_n$. Then $\|x_N - x_M\|^2$ for $N > M$ is $\sum_{n=M+1}^{N} |a_n|^2$ by Pythagoras. So indeed $x_N$ converges iff $\sum |a_n|^2$ does.
Note (Pythagoras) that $\|x_N\|^2 = \sum_{n=1}^{N} |a_n|^2$. □
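Further remarks. If $x = \sum_{n=1}^{\infty} c_n \varphi_n$ then the $c_n$ are called the Fourier coefficients of $x$ with respect to $(\varphi_n)_{n=1}^{\infty}$.
We have $c_n = \langle x, \varphi_n \rangle$. To see this, write $x_m = \sum_{i=1}^{m} c_i \varphi_i$. Then $x_m \to x$.
Also, $\langle x_m, \varphi_n \rangle = \sum_{i=1}^{m} c_i \langle \varphi_i, \varphi_n \rangle = c_n$ if $m \ge n$. So $\lim_{m\to\infty} \langle x_m, \varphi_n \rangle = c_n = \langle x, \varphi_n \rangle$.
If $x, y \in X$, then $\sum_{n=1}^{\infty} \langle x, \varphi_n \rangle \overline{\langle y, \varphi_n \rangle} = \langle \rho_V x, \rho_V y \rangle = \langle \rho_V x, y \rangle = \langle x, \rho_V y \rangle$, by self-adjointness. (∗∗)
So if $(\varphi_n)_{n=1}^{\infty}$ is complete then $\rho_V = I$ and so:
$$\sum_{n=1}^{\infty} |\langle x, \varphi_n \rangle|^2 = \|x\|^2 \quad\text{and}\quad \sum_{n=1}^{\infty} \langle x, \varphi_n \rangle \overline{\langle y, \varphi_n \rangle} = \langle x, y \rangle \quad\text{– Parseval’s identities.}$$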
(Recall, isometric means isomorphic, with the isomorphism preserving the inner product.)
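Proof of (∗∗). This is straightforward.
$\langle \rho_V x, \rho_V y \rangle = \lim_{m\to\infty} \langle x_m, y_m \rangle$, where $x_m = \sum_{n=1}^{m} \langle x, \varphi_n \rangle \varphi_n$ and similarly for $y_m$.
So $\langle \rho_V x, \rho_V y \rangle = \lim_{m\to\infty} \sum_{n=1}^{m} \langle x, \varphi_n \rangle \overline{\langle y, \varphi_n \rangle} = \sum_{n=1}^{\infty} \langle x, \varphi_n \rangle \overline{\langle y, \varphi_n \rangle}$.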
** Non-examinable section **
Proof. Work in L2 [0, 2π], the Hilbert space obtained by completing the continuous functions
C[0, 2π] in the L2 -norm. The exponentials ϕn (x) = einx , n ∈ Z, are a complete
orthonormal system (by Stone-Weierstrass).
$$\hat f(n) = \frac{1}{2\pi} \int_0^{2\pi} f(x) e^{-inx}\, dx = \langle f, \varphi_n \rangle.$$
Thus $\|S_N f - f\|^2 = \sum_{|n| > N} |\langle f, \varphi_n \rangle|^2 \to 0$ as $N \to \infty$.
This doesn’t contradict the earlier counterexample, because $\|S_N f - f\|_2 \to 0$ does not imply $S_N f(x) \to f(x)$ pointwise.
7. Spectral Theory
Let T : X → X be a bounded linear operator on a Banach space.
Examples. Let $X = \ell^2$.
1. Right shift: $T(x_1, x_2, \dots) = (0, x_1, x_2, \dots)$.
This has no eigenvalues. Suppose $(0, x_1, x_2, \dots) = \lambda (x_1, x_2, \dots)$. Then if $\lambda \ne 0$ we have inductively that $x_1 = x_2 = \dots = 0$. And if $\lambda = 0$ then we also have $x_1 = x_2 = \dots = 0$.
2. Left shift: $T(x_1, x_2, \dots) = (x_2, x_3, \dots)$.
Every $\lambda \in \mathbb{C}$ with $|\lambda| < 1$ is an eigenvalue, with eigenvector $(1, \lambda, \lambda^2, \lambda^3, \dots)$.
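The two shifts can be illustrated on finite truncations of sequences. This is only a sketch: genuine elements of $\ell^2$ are infinite sequences, so the functions below (names are illustrative) act on the first few coordinates, and the eigenvector identity for the left shift is checked on all but the last stored coordinate.

```python
def right_shift(x):
    """(x1, x2, ...) -> (0, x1, x2, ...), on a finite list of leading coordinates."""
    return [0] + list(x)

def left_shift(x):
    """(x1, x2, ...) -> (x2, x3, ...), on a finite list of leading coordinates."""
    return list(x[1:])

def geometric(lam, length):
    """Leading coordinates of (1, lam, lam^2, ...)."""
    return [lam ** k for k in range(length)]

lam = 0.5
x = geometric(lam, 20)
shifted = left_shift(x)               # leading coordinates of T x
scaled = [lam * a for a in x[:-1]]    # leading coordinates of lam * x
```

Note also that the left shift undoes the right shift, while the right shift never produces a sequence with non-zero first coordinate, which is one way to see it is not surjective.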
(⇐) is not true: for the right shift, T itself is not invertible, but 0 is not an eigenvalue.
If $\lambda$ is an eigenvalue (i.e. there exists $x \ne 0$ for which $Tx = \lambda x$) then we say that $\lambda$ lies in the point spectrum, $\sigma_p(T)$.
Proof of claim. Indeed, the operator on the right hand side is well-defined since
Also,
To show that $\sigma(T)$ is compact, it suffices to show it’s closed, for which it suffices to show that the invertible operators are an open subset of $B(X)$.
So let $T \in B(X)$ be invertible. If $S$ is another operator, write $S = T\left(I + T^{-1}(S - T)\right)$.
Hence $0 \notin \sigma(T)$, and the result follows. □
Remark. Conversely, it’s clear that if T − λI doesn’t have dense image, or if λ is an ap-
proximate eigenvalue of T , then λ ∈ σ(T ).
From now on, we start specialising, considering in turn the following classes of operators:
Compact Operators
Let X, Y be Banach spaces, and T : X → Y a bounded linear operator.
(Actually, it follows from the open mapping theorem that $T(B_X(1))$ is automatically closed, so the ‘pre’ is superfluous.)
Let $X = C[0, 1]$, say, and let $K : [0, 1]^2 \to \mathbb{C}$ be a continuous function ‘kernel’. Define $T : X \to X$ by $Tf(x) = \int_0^1 f(y) K(x, y)\, dy$.
where $C = \sup_x \int_0^1 |K(x, y)|\, dy \le \|K\|_\infty$, which is finite since $[0, 1]^2$ is compact.
EQ. $$|Tf(x_1) - Tf(x_2)| = \left| \int_0^1 f(y) \left( K(x_1, y) - K(x_2, y) \right) dy \right| \le \|f\|_\infty \int_0^1 |K(x_1, y) - K(x_2, y)|\, dy$$
Here, let $X = Y = \ell^2$.
Hence $\sum_i \|Te_i\|^2 = \sum_j \sum_i |\langle e_i, T^* e'_j \rangle|^2 = \sum_j \|T^* e'_j\|^2$, by Parseval again.
Similarly, if $(\tilde e_i)_{i=1}^{\infty}$ is another orthonormal system then $\sum_i \|T\tilde e_i\|^2 = \sum_j \|T^* e'_j\|^2$.
Hence $\sum \|Te_i\|^2 = \sum \|T\tilde e_i\|^2$, as required. □
(To prove this rigorously, consider $x^{(n)} = (x_1, \dots, x_n, 0, \dots)$, note $Tx^{(n)} = \sum_i \sum_{j \le n} a_{ij} x_j e_i$, and let $n \to \infty$.)
Hence if $x \in B_X(1)$, that is if $\|x\| \le 1$, then $(Tx)_i = \sum_j a_{ij} x_j$, and so by Cauchy-Schwarz $|(Tx)_i| \le \left( \sum_j |a_{ij}|^2 \right)^{1/2} = b_i$.
So in fact $T$ is Hilbert-Schmidt if and only if $\sum_{i,j} |a_{ij}|^2 < \infty$.
Look at the first coordinate $x_1^{(i)}$, $i = 1, 2, \dots$. Since these live in a closed interval, pass to a convergent subsequence. With this new sequence, take a subsequence for which the second coordinates converge, and so on.
Now take the sequence consisting of the $n$th element of the $n$th sequence, and then all coordinates converge. But any such sequence in a Hilbert cube converges. □
Proposition. Let X, Y be Banach spaces. Then B0 (X, Y ), the space of compact operators
from X to Y , is a closed linear subspace of B(X, Y ), the space of all bounded linear
operators from X to Y .
Proof. A compact operator is bounded, since T (BX (1)) is bounded.
To do this, first use compactness of S to pass to a subsequence x′n for which Sx′n
converges. Now use compactness of T to pass to a further subsequence x′′n for which
T x′′n also converges. Then (S + T )x′′n converges.
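Let $\varepsilon > 0$ and let $n$ be such that $\|T_n - T\| \le \frac{\varepsilon}{2}$. Since $T_n$ is compact, $T_n(B_X(1))$ is totally bounded and so there are $y_1, \dots, y_m \in Y$ such that for all $x \in B_X(1)$ there is $i$ such that $\|T_n x - y_i\| \le \frac{\varepsilon}{2}$.
But then $\|Tx - y_i\| \le \|Tx - T_n x\| + \|T_n x - y_i\| \le \|T - T_n\| \|x\| + \frac{\varepsilon}{2} \le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$.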
Furthermore, T is compact, being the limit in the operator norm of the finite-rank operators
Tn defined by Tn (x1 , x2 , . . .) = (α1 x1 , . . ., αn xn , 0, 0, . . .). Each Tn is compact, as are all
finite-rank operators. And
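$$\begin{aligned}
\|(T - T_n)(x_1, x_2, \dots)\| &= \|(0, \dots, 0, \alpha_{n+1} x_{n+1}, \alpha_{n+2} x_{n+2}, \dots)\| \\
&\le \sup_{m > n} |\alpha_m| \, \|(0, \dots, 0, x_{n+1}, x_{n+2}, \dots)\| \\
&\le \sup_{m > n} |\alpha_m| \, \|x\|.
\end{aligned}$$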
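The estimate above can be checked on finite vectors. In this sketch the diagonal entries $\alpha_m = 1/m$ and the test vector are arbitrary illustrative choices, and all function names are assumptions; the point is only that the error of the finite-rank truncation $T_n$ is controlled by the largest remaining $|\alpha_m|$.

```python
import math

def apply_diagonal(alphas, x):
    """(x_1, x_2, ...) -> (alpha_1 x_1, alpha_2 x_2, ...), on finite lists."""
    return [a * b for a, b in zip(alphas, x)]

def truncate(alphas, n):
    """Diagonal entries of T_n: keep the first n alphas, replace the rest by 0."""
    return [a if k < n else 0.0 for k, a in enumerate(alphas)]

def l2_norm(x):
    return math.sqrt(sum(abs(a) ** 2 for a in x))

def tail_sup(alphas, n):
    """sup of |alpha_m| over m > n (1-based), for a finite list of alphas."""
    return max((abs(a) for a in alphas[n:]), default=0.0)

alphas = [1.0 / (m + 1) for m in range(40)]       # alpha_m = 1/m, m = 1..40
x = [(-1) ** k / (k + 1) for k in range(40)]      # an arbitrary test vector
n = 5
diff = [a - b for a, b in zip(apply_diagonal(alphas, x),
                              apply_diagonal(truncate(alphas, n), x))]
```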
Proof. Let $T : X \to X$ be compact, and let $\varepsilon > 0$. Since $T(B_X(1))$ is totally bounded, there are points $y_1, \dots, y_m \in X$ such that for each $x \in B_X(1)$ there is $i$ such that $\|Tx - y_i\| \le \frac{\varepsilon}{2}$.
We’ll show $\|T - \tilde T\| \le \varepsilon$. Let $x \in B_X(1)$ and choose $y_i$ so that $\|Tx - y_i\| \le \frac{\varepsilon}{2}$.
Proof. Since $T(B_X(1))$ is compact, there is a convergent subsequence of the $(Tx_n)$. Assume, relabelling if necessary, that $Tx_n \to z$. But then, $x_n \to \frac{1}{\lambda} z$. Let $z^* = \frac{1}{\lambda} z$.
Then the eigenvalues of $T$ are all real, eigenvectors corresponding to distinct eigenvalues are orthogonal, and $T$ is diagonalisable, meaning that there is an orthonormal basis for $\mathbb{C}^n$ consisting of eigenvectors of $T$. (Apply Gram-Schmidt within each eigenspace.)
Furthermore, these eigenspaces are finite-dimensional, and there are only finitely many
eigenvalues λ with |λ| > ε for any ε > 0.
There is an orthonormal basis for $X$ consisting of eigenvectors of $T$. More precisely, we may write $T = \sum_\lambda \lambda \rho_{E_\lambda}$, where $\rho_{E_\lambda}$ is the orthogonal projection onto the (finite-dimensional) eigenspace $E_\lambda$ corresponding to $\lambda$.
We start with an observation about the spectrum of bounded (not necessarily compact)
self-adjoint operators on Hilbert space.
Suppose then that 0 is not an approximate eigenvalue. Then $T$ is bounded below, say $\|Tx\| \ge \varepsilon \|x\|$.
But then (im T )⊥ = ker T because T is self-adjoint: if hx, T yi = 0 for all y, then
hT x, yi = 0 for all y, so T x = 0.
Corollary. Suppose now that T is compact and self-adjoint. Then σ(T ) \ {0} = σp (T ) \ {0}.
Proof. Combine the preceding lemma with the result from earlier – that non-zero approxi-
mate eigenvalues are actual eigenvalues (for compact operators).
Proposition. Suppose T is compact and self-adjoint. Then T has at least one eigenvalue.
More specifically, either kT k or −kT k is an eigenvalue.
Proof. We’ll show that $T^2 - \|T\|^2 I$ is not invertible. Then it follows that at least one of $T - \|T\| I$ and $T + \|T\| I$ is not invertible. That is, $\sigma(T)$ contains one of $\pm\|T\|$, and from the corollary above it then follows that one of $\pm\|T\|$ lies in $\sigma_p(T)$. (The proof is trivial if $T = 0$, so we may assume not.)
$$\begin{aligned}
\left\| T^2 x_n - \|T\|^2 x_n \right\|^2 &= \|T^2 x_n\|^2 + \|T\|^4 - \|T\|^2 \langle x_n, T^2 x_n \rangle - \|T\|^2 \langle T^2 x_n, x_n \rangle \\
&= \|T^2 x_n\|^2 + \|T\|^4 - 2\|T\|^2 \|T x_n\|^2 \\
&\le 2\|T\|^4 - 2\|T\|^2 \|T x_n\|^2 \quad \text{since } \|T^2 x_n\| \le \|T^2\| \le \|T\|^2 \\
&\to 0 \quad \text{by the choice of the } x_n
\end{aligned}$$
• Ditto the fact that eigenvectors corresponding to distinct eigenvalues are orthogonal: if $Tx = \lambda x$ and $Ty = \mu y$, then $\lambda \langle x, y \rangle = \langle Tx, y \rangle = \langle x, Ty \rangle = \mu \langle x, y \rangle$, so if $\lambda \ne \mu$ then $\langle x, y \rangle = 0$.
If either of these statements fails, then there is some $\varepsilon > 0$, together with an infinite orthonormal sequence $(x_n)_{n=1}^{\infty}$ with $Tx_n = \lambda_n x_n$, $|\lambda_n| > \varepsilon$ for all $n$.
(If the first statement fails, use Gram-Schmidt on an eigenspace $E_\lambda$ with $\dim E_\lambda = \infty$. If the second fails, simply pick one unit vector from each eigenspace.)
But then $\|Tx_n - Tx_m\|^2 = \|\lambda_n x_n - \lambda_m x_m\|^2 = \lambda_n^2 + \lambda_m^2 \ge 2\varepsilon^2$.
So then $(Tx_n)_{n=1}^{\infty}$ does not have a convergent subsequence, contrary to the compactness of $T$.
Then $T(x_1 e_1 + \dots + x_n e_n + \dots + y_1 f_1 + \dots) = \lambda_1 x_1 e_1 + \dots + \lambda_n x_n e_n + \dots + 0 + 0 + \dots$.
The Hahn-Banach Theorem BJG October 2011
Conspicuous by its absence from this course (Cambridge Mathematical Tripos Part
II, Linear Analysis) is the Hahn-Banach theorem. A simple version of it is as follows.
Theorem 1 (Hahn-Banach). Let $V, \tilde V$ be normed spaces with $V \subseteq \tilde V$. Let $\phi : V \to \mathbb{R}$ be a bounded linear functional. Then there is a bounded linear functional $\tilde\phi : \tilde V \to \mathbb{R}$ which extends $\phi$ in the sense that $\tilde\phi|_V = \phi$, and for which $\|\tilde\phi\| = \|\phi\|$.
Proof. Roughly speaking, the idea is to extend $\phi$ to $\tilde\phi$ “one dimension at a time”.
Suppose, then, that $\tilde V = V + \langle w \rangle$, where $w \notin V$. By rescaling we may assume, without
loss of generality, that $\|\phi\| = 1$. We are forced to define
$$\tilde\phi(v + tw) = \phi(v) + t\lambda$$
for all $v \in V$ and $t \in \mathbb{R}$, where $\lambda$ must not depend on v or on t. A map $\tilde\phi$ defined in this
way will always be a linear functional, but our task is to show that by judicious choice
of $\lambda$ we may ensure that $\|\tilde\phi\| \le 1$. For this we require that
$$\phi(v') - \phi(v) \le \|v' + w\| + \|v + w\|$$
for all $v, v' \in V$. However the left-hand side is $\phi(v' - v)$ which, since $\|\phi\| = 1$, has
magnitude at most $\|v' - v\|$. The result is now a consequence of the triangle inequality.
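The step from the norm bound on the extension to the displayed inequality is compressed in the proof. One way to fill it in, written with the same symbols as above, is sketched here; the explicit choice of $\lambda$ in the last comment is one admissible option, not the only one.

```latex
% Sketch: why the displayed inequality is exactly what is needed.
% Dividing by t (for t != 0), the bound ||\tilde\phi|| <= 1 is equivalent to
\[
|\phi(v) + \lambda| \le \|v + w\| \quad \text{for all } v \in V,
\]
% that is, lambda must lie between a family of lower and upper bounds:
\[
-\phi(v) - \|v + w\| \;\le\; \lambda \;\le\; -\phi(v') + \|v' + w\|
\quad \text{for all } v, v' \in V .
\]
% Such a lambda exists if and only if every lower bound is at most every upper bound:
\[
\phi(v') - \phi(v) \le \|v' + w\| + \|v + w\| ,
\]
% and this holds because ||phi|| = 1 and by the triangle inequality:
\[
\phi(v') - \phi(v) = \phi(v' - v) \le \|v' - v\| = \|(v' + w) - (v + w)\| \le \|v' + w\| + \|v + w\| .
\]
% One admissible choice: lambda = sup over v in V of ( -phi(v) - ||v + w|| ).
```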
We have proved the Hahn-Banach theorem when Ṽ is obtained from V by the addition
of one vector. This is already enough to prove the whole theorem when Ṽ is finite-
dimensional (by incrementing the dimension of V one step at a time). Essentially the
same argument works in the infinite-dimensional case, too, although Zorn’s lemma is
needed to make this rigorous.
Consider the set of all extensions of $\phi$, that is to say pairs $(V', \phi')$ where $V \subseteq V' \subseteq \tilde V$,
$\phi'|_V = \phi$, and $\|\phi'\| \le \|\phi\|$. There is an obvious partial order on this set: namely, say
that $(V_1, \phi_1) \preceq (V_2, \phi_2)$ if and only if $V_1 \subseteq V_2$ and $\phi_2|_{V_1} = \phi_1$. Every chain in this partial
order has an upper bound. Indeed if $(V_i, \phi_i)_{i \in I}$ is a chain, then an upper bound for it is
$(V', \phi')$, where $V' = \bigcup_{i \in I} V_i$ and $\phi'$ equals $\phi_i$ on $V_i$, for all i. By Zorn’s lemma, there is
a maximal element $(V_0, \phi_0)$. However by the special case of the theorem proved above,
we could extend $\phi_0$ to $V_0 + \langle w \rangle$ for any $w \notin V_0$. The only possible conclusion is that
there is no $w \notin V_0$, or in other words $V_0 = \tilde V$ and $\phi_0$ is defined on all of $\tilde V$.
Remark. Zorn’s lemma is equivalent to the axiom of choice, and so we have used
the axiom of choice in proving Hahn-Banach. It is known that Hahn-Banach is strictly
weaker than the axiom of choice, but cannot be proven in ZF.
Let us derive some consequences of the theorem. The main point is that, without it,
we are essentially powerless to construct a good supply of bounded linear functionals
on a typical normed space X. With it, however, we immediately see that $X^*$ is quite
rich; indeed for any nonzero $x \in X$ there is some $\phi \in X^*$ such that $\phi(x) \ne 0$. More specifically,
there is some $\phi \in X^*$ with $\|\phi\| = 1$ such that $\phi(x) = \|x\|$. To see these facts, simply
take V to be the subspace $\langle x \rangle$ spanned by x and $\tilde V := X$, and extend the linear functional
$\phi_0 : V \to \mathbb{R}$ defined by $\phi_0(tx) = t\|x\|$.
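For completeness, here is a short check that the functional on the line spanned by x has norm 1, so that Theorem 1 gives an extension with the stated properties. It assumes $x \ne 0$.

```latex
% Check (for x != 0) that phi_0 has norm 1 on V = <x>:
\[
|\phi_0(tx)| = |t|\,\|x\| = \|tx\| \quad \text{for all } t \in \mathbb{R},
\qquad \text{hence } \|\phi_0\| = 1 .
\]
% The Hahn-Banach extension phi of phi_0 to X then satisfies
\[
\|\phi\| = \|\phi_0\| = 1, \qquad \phi(x) = \phi_0(x) = \|x\| \ne 0 .
\]
```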
One may think of this geometrically in terms of convex bodies admitting supporting
hyperplanes. Consider the unit ball $B := \{x \in X : \|x\| \le 1\}$ (which is, roughly speaking, the most
general form of a symmetric convex body) and let $x_0 \in B$ have norm 1. As just remarked, there is
a linear functional $\phi : X \to \mathbb{R}$ with $\phi(x_0) = 1$ and $\|\phi\| \le 1$. Consider the hyperplane
$H := \{x \in X : \phi(x) = 1\}$. Then H meets B at $x_0$ (and possibly at other points).
However if $x \in B$ then $\phi(x) \le \|\phi\| \|x\| \le 1$, and so all of B lies in the half-space
$\{x \in X : \phi(x) \le 1\}$. H is called a supporting hyperplane for B.
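A concrete two-dimensional instance may make the phrase "and possibly at other points" clearer. The space, norm and point below are illustrative choices, not taken from the notes.

```latex
% Illustrative example (not from the notes): X = R^2 with the sup norm.
\[
\|(x_1, x_2)\|_\infty = \max(|x_1|, |x_2|), \qquad
B = [-1, 1]^2, \qquad x_0 = (1, 0) .
\]
% The functional phi(x_1, x_2) = x_1 satisfies phi(x_0) = 1 and has norm at most 1:
\[
|\phi(x)| = |x_1| \le \|x\|_\infty .
\]
% The supporting hyperplane meets B in a whole edge, not just at x_0:
\[
H = \{x : x_1 = 1\}, \qquad
H \cap B = \{(1, s) : -1 \le s \le 1\}, \qquad
B \subseteq \{x : x_1 \le 1\} .
\]
```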
The following interesting fact is little more than a rephrasing of the above.
It follows that
$$\|T^*\| \ge \|T\| - \varepsilon .$$
Since $\varepsilon > 0$ was arbitrary, the result follows.
Proof. Certainly the dual $(\ell^\infty)^*$ contains $\ell^1$, since if $(b_i)_{i \in \mathbb{N}} \in \ell^1$ then the map
$(a_i)_{i \in \mathbb{N}} \mapsto \sum_i a_i b_i$ is a bounded linear functional. Note that any functional of this
type is determined by its values on $\ell^\infty_0$, the closed subspace of $\ell^\infty$ consisting of
sequences which tend to zero. This is obviously a proper subspace of $\ell^\infty$, and so the
quotient space $\ell^\infty / \ell^\infty_0$ is a nontrivial normed space. By Hahn-Banach we may find a
nontrivial bounded linear functional $\psi$ on it. This pulls back under the quotient map
$\pi : \ell^\infty \to \ell^\infty / \ell^\infty_0$ to give a nontrivial functional $\phi \in (\ell^\infty)^*$ defined by $\phi(x) := \psi(\pi(x))$.
Since $\phi$ is trivial on $\ell^\infty_0$, it does not come from $\ell^1$.
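The final sentence of the proof can be unpacked as follows. This is a sketch; the notation $e_j$ for the sequence with a 1 in position j and 0 elsewhere is introduced here and is not used in the notes.

```latex
% Sketch of the last step. Suppose phi came from some b = (b_i) in l^1, i.e.
\[
\phi(a) = \sum_i a_i b_i \quad \text{for all } a \in \ell^\infty .
\]
% Each e_j (1 in position j, 0 elsewhere) tends to zero, so e_j lies in l^infty_0, and therefore
\[
0 = \phi(e_j) = b_j \quad \text{for every } j .
\]
% Thus b = 0 and phi = 0, contradicting the fact that phi is nontrivial.
```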
The applications we have given so far are perhaps not very “surprising”. The next
one is rather more so.
Theorem 5 (Finitely additive measure on Z). Write $\mathcal{P}(\mathbb{Z})$ for the set of all subsets of
$\mathbb{Z}$. Then there is a “measure” $\mu : \mathcal{P}(\mathbb{Z}) \to [0, 1]$ which is normalised so that $\mu(\mathbb{Z}) = 1$,
is shift-invariant in the sense that $\mu(A + 1) = \mu(A)$, and is finitely-additive in the sense
that $\mu(A_1 \cup \dots \cup A_k) = \mu(A_1) + \mu(A_2) + \dots + \mu(A_k)$ whenever $A_1, \dots, A_k$ are disjoint.
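To get a feel for what such a $\mu$ must look like, here are two quick consequences of the three stated properties. They are illustrations only and are not part of the notes.

```latex
% Two quick consequences of normalisation, shift-invariance and finite additivity.
% (a) Even and odd integers: 2Z + 1 is a shift of 2Z, and Z is their disjoint union, so
\[
1 = \mu(\mathbb{Z}) = \mu(2\mathbb{Z}) + \mu(2\mathbb{Z} + 1) = 2\mu(2\mathbb{Z}),
\qquad \mu(2\mathbb{Z}) = \tfrac12 .
\]
% (b) Singletons: by shift-invariance mu({n}) = mu({0}) for every n, so for each k >= 1
\[
k\,\mu(\{0\}) = \mu(\{1, \dots, k\}) \le 1 ,
\qquad \text{hence } \mu(\{0\}) = 0 .
\]
% So mu vanishes on finite sets, and in particular cannot be countably additive.
```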
Proof. For the purposes of this proof write $\ell^\infty$ for the Banach space of bounded
sequences $(x_n)_{n \in \mathbb{Z}}$ indexed by $\mathbb{Z}$. We will in fact construct a linear functional $\phi \in (\ell^\infty)^*$
which is shift-invariant in the sense that $\phi((x_n)_{n \in \mathbb{Z}}) = \phi((x_{n+1})_{n \in \mathbb{Z}})$, positive in the
sense that $\phi((x_n)_{n \in \mathbb{Z}}) \ge 0$ whenever $x_n \ge 0$ for all n, and normalised so that $\phi(\mathbf{1}) = 1$,
where $\mathbf{1}$ is the constant sequence of 1s.
$$\phi_0((x_{n+1} - x_n + c)_{n \in \mathbb{Z}}) := c$$
on V. For any $\varepsilon > 0$ there must be some n such that $x_{n+1} - x_n \ge -\varepsilon$, and so if $c \ge 0$
we certainly have $\|(x_{n+1} - x_n + c)_{n \in \mathbb{Z}}\| = \sup_n |x_{n+1} - x_n + c| \ge |c| - \varepsilon$. Since $\varepsilon$ was
arbitrary, we actually have $\|(x_{n+1} - x_n + c)_{n \in \mathbb{Z}}\| \ge |c|$. The same conclusion holds if
$c \le 0$, and therefore $\|\phi_0\| \le 1$. By the Hahn-Banach theorem there is an extension of
$\phi_0$ to a linear functional $\phi$ on all of $\ell^\infty$ such that $\|\phi\| \le 1$. This is obviously normalised
so that $\phi(\mathbf{1}) = 1$, and $\phi$ is pretty clearly shift-invariant since $(x_n)_{n \in \mathbb{Z}}$ and the shifted
sequence $(x_{n+1})_{n \in \mathbb{Z}}$ differ by an element of $V_0$, on which $\phi_0$ is defined and equal to zero.
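The claim that some n has $x_{n+1} - x_n \ge -\varepsilon$ is stated without proof. A telescoping argument, sketched here for a bounded sequence, is one way to see it and to recover the norm bound for $c \ge 0$.

```latex
% Why some n has x_{n+1} - x_n >= -epsilon. Suppose not; then for every N >= 1,
\[
x_N - x_0 = \sum_{n=0}^{N-1} (x_{n+1} - x_n) < -N\varepsilon ,
\]
% so x_N tends to minus infinity, contradicting boundedness of (x_n).
% Hence, for c >= 0 and such an n,
\[
\sup_m |x_{m+1} - x_m + c| \;\ge\; x_{n+1} - x_n + c \;\ge\; c - \varepsilon = |c| - \varepsilon .
\]
```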
It remains to confirm that $\phi$ is positive, and for this we may suppose without loss of
generality that $\|x\| = 1$, so that $0 \le x_n \le 1$ for all n. Then $\|\mathbf{1} - x\| \le 1$, and so
$$1 - \phi(x) = \phi(\mathbf{1} - x) \le \|\mathbf{1} - x\| \le 1,$$