
Functional Analysis: Gerald Teschl

The document provides an introduction to functional analysis and uses the example of solving linear partial differential equations to illustrate concepts. It introduces Banach and Hilbert spaces as important objects in functional analysis and describes using separation of variables and Fourier series to find solutions to the heat equation. It also discusses how similar techniques can be applied to other common PDEs in physics.


Functional Analysis

Gerald Teschl
Gerald Teschl
Institut für Mathematik
Nordbergstraße 15
Universität Wien
1090 Wien, Austria

E-mail address: [email protected]


URL: http://www.mat.univie.ac.at/~gerald/

1991 Mathematics subject classification. 81-01, 81Qxx

Abstract. This manuscript provides a brief introduction to Functional Analysis.
Warning: This is an incomplete DRAFT!

Keywords and phrases. Functional Analysis, Banach space, Hilbert space.

Typeset by AMS-LaTeX and Makeindex.

Version: October 18, 2004
Copyright © 2004 by Gerald Teschl
Contents

Preface
Chapter 0. Introduction
§0.1. Linear partial differential equations
Chapter 1. A first look at Banach and Hilbert spaces
§1.1. The Banach space of continuous functions
§1.2. The geometry of Hilbert spaces
§1.3. Completeness
§1.4. Bounded operators
Chapter 2. Hilbert spaces
§2.1. Orthonormal bases
§2.2. The projection theorem and the Riesz lemma
§2.3. Orthogonal sums and tensor products
§2.4. Compact operators
§2.5. The spectral theorem for compact symmetric operators
§2.6. Applications to Sturm-Liouville operators
Chapter 3. Banach spaces
Bibliography
Glossary of notations
Index

Preface

The present manuscript was written for my course Functional Analysis given
at the University of Vienna in Winter 2004.
It is available from
http://www.mat.univie.ac.at/~gerald/ftp/book-fa/
Acknowledgments
I’d like to thank ....

Gerald Teschl

Vienna, Austria
October, 2004

Chapter 0

Introduction

Functional analysis is an important tool in the investigation of all kinds of
problems in pure mathematics, physics, biology, economics, etc. In fact, it
is hard to find a branch of science where functional analysis is not used.
The main objects are (infinite dimensional) linear spaces with different
concepts of convergence. The classical theory focuses on linear operators
(i.e., functions) between these spaces, but nonlinear operators are of course
equally important. However, since one of the most important tools in
investigating nonlinear mappings is linearization (differentiation), linear
functional analysis will be our first topic in any case.

0.1. Linear partial differential equations


Rather than overwhelming you with a vast number of classical examples,
I want to focus on one: linear partial differential equations. We will use
this example as a guide throughout this first chapter and will develop all
necessary methods for a successful treatment of our particular problem.
In his investigation of heat conduction Fourier was led to the (one
dimensional) heat or diffusion equation
\[ \frac{\partial}{\partial t} u(t,x) = \frac{\partial^2}{\partial x^2} u(t,x). \tag{0.1} \]
Here u(t, x) is the temperature distribution at time t at the point x. It
is usually assumed that the temperature at x = 0 and x = 1 is fixed, say
u(t, 0) = a and u(t, 1) = b. By considering u(t, x) → u(t, x) − a − (b − a)x it is
clearly no restriction to assume a = b = 0. Moreover, the initial temperature
distribution u(0, x) = u_0(x) is assumed to be known as well.

Since finding the solution seems at first sight not possible, we could try
to find at least some solutions of (0.1) first. We could for example make
an ansatz for u(t, x) as a product of two functions, each of which depends
on only one variable, that is,
\[ u(t,x) = w(t)\, y(x). \tag{0.2} \]
This ansatz is called separation of variables. Plugging everything into
the heat equation and bringing all t-dependent terms to the left side and all
x-dependent terms to the right side, we obtain
\[ \frac{\dot{w}(t)}{w(t)} = \frac{y''(x)}{y(x)}. \tag{0.3} \]
Here the dot refers to differentiation with respect to t and the prime to
differentiation with respect to x.
Now if this equation should hold for all t and x, the quotients must be
equal to a constant −λ. That is, we are led to the equations
\[ -\dot{w}(t) = \lambda w(t) \tag{0.4} \]
and
\[ -y''(x) = \lambda y(x), \qquad y(0) = y(1) = 0, \tag{0.5} \]
which can easily be solved. The first one gives
\[ w(t) = c_1 e^{-\lambda t} \tag{0.6} \]
and the second one
\[ y(x) = c_2 \cos(\sqrt{\lambda}\, x) + c_3 \sin(\sqrt{\lambda}\, x). \tag{0.7} \]
However, y(x) must also satisfy the boundary conditions y(0) = y(1) = 0.
The first one, y(0) = 0, is satisfied if c_2 = 0 and the second one yields (c_3 can
be absorbed by w(t))
\[ \sin(\sqrt{\lambda}) = 0, \tag{0.8} \]
which holds if λ = (πn)², n ∈ N. In summary, we obtain the solutions
\[ u_n(t,x) = c_n e^{-(\pi n)^2 t} \sin(n\pi x), \qquad n \in \mathbb{N}. \tag{0.9} \]

So we have found a large number of solutions, but we still have not
dealt with our initial condition u(0, x) = u_0(x). This can be done using
the superposition principle, which holds since our equation is linear. In fact,
choosing
\[ u(t,x) = \sum_{n=1}^{\infty} c_n e^{-(\pi n)^2 t} \sin(n\pi x), \tag{0.10} \]
where the coefficients c_n decay sufficiently fast, we obtain further solutions
of our equation. Moreover, these solutions satisfy
\[ u(0,x) = \sum_{n=1}^{\infty} c_n \sin(n\pi x) \tag{0.11} \]
and expanding the initial condition into a Fourier series
\[ u_0(x) = \sum_{n=1}^{\infty} u_{0,n} \sin(n\pi x), \tag{0.12} \]
we see that the solution of our original problem is given by (0.10) if we
choose c_n = u_{0,n}.
Of course for this last statement to hold we need to ensure that the series
in (0.10) converges and that we can interchange summation and differentiation.
You are asked to do so in Problem 0.1.
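The recipe (0.10)–(0.12) can be tried out numerically. The following is only a minimal sketch under stated assumptions: the coefficients are computed from the standard sine-series formula u_{0,n} = 2 ∫_0^1 u_0(x) sin(nπx) dx (not written out in the text above), the integral is approximated by the trapezoidal rule, and the series is truncated after finitely many terms. The function names `sine_coefficients` and `heat_solution` are made up for this illustration.

```python
import numpy as np

def sine_coefficients(u0, n_terms, n_grid=2001):
    """Approximate u_{0,n} = 2 * int_0^1 u0(x) sin(n pi x) dx for n = 1..n_terms.

    Assumption: the standard sine-series coefficient formula, evaluated
    with the trapezoidal rule on a uniform grid.
    """
    x = np.linspace(0.0, 1.0, n_grid)
    n = np.arange(1, n_terms + 1)
    integrand = np.sin(np.pi * np.outer(n, x)) * u0(x)   # shape (n_terms, n_grid)
    dx = x[1] - x[0]
    trapezoid = dx * (integrand.sum(axis=1)
                      - 0.5 * (integrand[:, 0] + integrand[:, -1]))
    return 2.0 * trapezoid

def heat_solution(coeffs, t, x):
    """Evaluate the truncated series (0.10) at time t on the points x."""
    n = np.arange(1, len(coeffs) + 1)
    decay = np.exp(-(np.pi * n) ** 2 * t)
    return (coeffs * decay) @ np.sin(np.pi * np.outer(n, x))

# Example initial temperature with u0(0) = u0(1) = 0.
u0 = lambda x: x * (1.0 - x)
xs = np.linspace(0.0, 1.0, 101)
c = sine_coefficients(u0, n_terms=50)
initial_error = np.max(np.abs(heat_solution(c, 0.0, xs) - u0(xs)))
later_profile = heat_solution(c, 0.1, xs)
```

Because of the factor e^{−(πn)² t}, only the first few modes contribute noticeably for t > 0, which is why a modest truncation already works well in this sketch.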
In fact many equations in physics can be solved in a similar way:
• Reaction-Diffusion equation:
\[ \begin{aligned} \frac{\partial}{\partial t} u(t,x) - \frac{\partial^2}{\partial x^2} u(t,x) + q(x) u(t,x) &= 0, \\ u(0,x) &= u_0(x), \\ u(t,0) = u(t,1) &= 0. \end{aligned} \tag{0.13} \]
Here u(t, x) could be the density of some gas in a pipe and q(x) > 0 describes
that a certain amount per time is removed (e.g., by a chemical reaction).
• Wave equation:
\[ \begin{aligned} \frac{\partial^2}{\partial t^2} u(t,x) - \frac{\partial^2}{\partial x^2} u(t,x) &= 0, \\ u(0,x) = u_0(x), \quad \frac{\partial u}{\partial t}(0,x) &= v_0(x), \\ u(t,0) = u(t,1) &= 0. \end{aligned} \tag{0.14} \]
Here u(t, x) is the displacement of a vibrating string which is fixed at x = 0
and x = 1. Since the equation is of second order in time, both the initial
displacement u_0(x) and the initial velocity v_0(x) of the string need to be
known.
• Schrödinger equation:
\[ \begin{aligned} i \frac{\partial}{\partial t} u(t,x) &= -\frac{\partial^2}{\partial x^2} u(t,x) + q(x) u(t,x), \\ u(0,x) &= u_0(x), \\ u(t,0) = u(t,1) &= 0. \end{aligned} \tag{0.15} \]
Here |u(t, x)|² is the probability distribution of a particle trapped in a box
x ∈ [0, 1] and q(x) is a given external potential which describes the forces
acting on the particle.
All these problems (and many others) lead to the investigation of the
following problem
\[ L y(x) = \lambda y(x), \qquad L = -\frac{d^2}{dx^2} + q(x), \tag{0.16} \]
subject to the boundary conditions
\[ y(a) = y(b) = 0. \tag{0.17} \]
Such a problem is called a Sturm–Liouville boundary value problem.
Our example shows that we should prove the following facts about our
Sturm–Liouville problems:
(1) The Sturm–Liouville problem has a countable number of eigen-
values En with corresponding eigenfunctions un (x), that is, un (x)
satisfies the boundary conditions and Lun (x) = En un (x).
(2) The eigenfunctions u_n are complete, that is, any nice function u(x)
can be expanded into a generalized Fourier series
\[ u(x) = \sum_{n=1}^{\infty} c_n u_n(x). \]
This problem is very similar to the eigenvalue problem of a matrix and
we are looking for a generalization of the well-known fact that every
symmetric matrix has an orthonormal basis of eigenvectors. However, our linear
operator L is now acting on some space of functions which is not finite
dimensional, and it is not at all clear what orthogonal should even mean for
functions. Moreover, since we need to handle infinite series, we need convergence
and hence need to define the distance of two functions as well.
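The analogy with symmetric matrices can be made concrete by discretizing (0.16), (0.17). The sketch below is an illustration only and not part of the theory developed here: it assumes the standard second-order finite difference approximation of −d²/dx² on a uniform grid, which turns L into a real symmetric matrix whose lowest eigenvalues approximate those of the boundary value problem. The name `dirichlet_eigenvalues` is hypothetical.

```python
import numpy as np

def dirichlet_eigenvalues(q, n_interior=400, a=0.0, b=1.0):
    """Eigenvalues of a finite difference model of L = -d^2/dx^2 + q(x)
    with y(a) = y(b) = 0.

    Assumption: central differences on n_interior equally spaced interior
    points; the boundary values are zero and therefore dropped.
    """
    h = (b - a) / (n_interior + 1)
    x = a + h * np.arange(1, n_interior + 1)
    main = 2.0 / h**2 + q(x)                      # diagonal entries
    off = -np.ones(n_interior - 1) / h**2         # sub- and superdiagonal
    L = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    # L is real symmetric, so eigvalsh returns real eigenvalues in ascending order.
    return np.linalg.eigvalsh(L), L

eigenvalues, L = dirichlet_eigenvalues(lambda x: np.zeros_like(x))
lowest = eigenvalues[:3]
```

For q = 0 the lowest values can be compared with λ = (πn)² from (0.8). The higher eigenvalues of the matrix are increasingly poor approximations, which is one reason why the infinite dimensional problem needs a theory of its own.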
Hence our program looks as follows:
• What is the distance of two functions? This automatically leads us
to the problem of convergence and completeness.
• If we additionally require the concept of orthogonality, we are led
to Hilbert spaces, which are the proper setting for our eigenvalue
problem.
• Finally, the spectral theorem for compact symmetric operators will
be the solution of our above problem.
Problem 0.1. Find conditions for the initial distribution u0 (x) such that
(0.10) is indeed a solution (i.e., such that interchanging the order of sum-
mation and differentiation is admissible).
Chapter 1

A first look at Banach and Hilbert spaces

1.1. The Banach space of continuous functions


So let us start with the set of continuous functions C(I) on a compact
interval I = [a, b] ⊂ R. Since we want to handle complex models (e.g., the
Schrödinger equation) as well, we will always consider complex valued
functions!
One way of declaring a distance, well-known from calculus, is the
maximum norm:
\[ \|f - g\|_\infty = \max_{x \in I} |f(x) - g(x)|. \tag{1.1} \]
It is not hard to see that with this definition C(I) becomes a normed linear
space:
A normed linear space X is a vector space X over C (or R) with a
real-valued function (the norm) ‖.‖ such that
• ‖f‖ ≥ 0 for all f ∈ X and ‖f‖ = 0 if and only if f = 0,
• ‖λ f‖ = |λ| ‖f‖ for all λ ∈ C and f ∈ X, and
• ‖f + g‖ ≤ ‖f‖ + ‖g‖ for all f, g ∈ X (triangle inequality).
Once we have a norm, we have a distance d(f, g) = ‖f − g‖ and hence
we know when a sequence of vectors f_n converges to a vector f. We
will write f_n → f or lim_{n→∞} f_n = f, as usual, in this case. Moreover, a
mapping F : X → Y between two normed spaces is called continuous if
f_n → f implies F(f_n) → F(f). In fact, it is not hard to see that the norm
is continuous (Problem 1.2).


In addition to the concept of convergence we also have the concept of
a Cauchy sequence and hence the concept of completeness: A normed
space is called complete if every Cauchy sequence has a limit. A complete
normed space is called a Banach space.
Example. The space ℓ1(N) of all sequences a = (a_j)_{j=1}^∞ for which the norm
\[ \|a\|_1 = \sum_{j=1}^{\infty} |a_j| \tag{1.2} \]
is finite, is a Banach space.


To show this, we need to verify three things: (i) ℓ1(N) is a vector space,
that is, closed under addition and scalar multiplication, (ii) ‖.‖_1 satisfies the
three requirements for a norm, and (iii) ℓ1(N) is complete.
First of all observe
\[ \sum_{j=1}^{k} |a_j + b_j| \le \sum_{j=1}^{k} |a_j| + \sum_{j=1}^{k} |b_j| \le \|a\|_1 + \|b\|_1 \tag{1.3} \]
for any finite k. Letting k → ∞ we conclude that ℓ1(N) is closed under
addition and that the triangle inequality holds. That ℓ1(N) is closed under
scalar multiplication and the two other properties of a norm are
straightforward. It remains to show that ℓ1(N) is complete. Let a^n = (a^n_j)_{j=1}^∞ be
a Cauchy sequence, that is, for given ε > 0 we can find an N_ε such that
‖a^m − a^n‖_1 ≤ ε for m, n ≥ N_ε. This implies in particular |a^m_j − a^n_j| ≤ ε for
any fixed j. Thus a^n_j is a Cauchy sequence for fixed j and by completeness
of C has a limit: a^n_j → a_j. Now consider
\[ \sum_{j=1}^{k} |a^m_j - a^n_j| \le \varepsilon \tag{1.4} \]
and take m → ∞:
\[ \sum_{j=1}^{k} |a_j - a^n_j| \le \varepsilon. \tag{1.5} \]
Since this holds for any finite k we even have ‖a − a^n‖_1 ≤ ε. Hence (a − a^n) ∈
ℓ1(N) and since a^n ∈ ℓ1(N) we finally conclude a = a^n + (a − a^n) ∈ ℓ1(N). ⋄

Example. The space ℓ∞(N) of all bounded sequences a = (a_j)_{j=1}^∞ together
with the norm
\[ \|a\|_\infty = \sup_{j \in \mathbb{N}} |a_j| \tag{1.6} \]
is a Banach space (Problem 1.3). ⋄


Now what about convergence in C(I)? A sequence of functions
f_n(x) converges to f if and only if
\[ \lim_{n\to\infty} \|f - f_n\| = \lim_{n\to\infty} \sup_{x \in I} |f_n(x) - f(x)| = 0. \tag{1.7} \]
That is, in the language of real analysis, f_n converges uniformly to f. Now
let us look at the case where f_n is only a Cauchy sequence. Then f_n(x) is
clearly a Cauchy sequence of complex numbers for any fixed x ∈ I. In particular,
by completeness of C, there is a limit f(x) for each x. Thus we get a limiting
function f(x). Moreover, letting m → ∞ in
\[ |f_m(x) - f_n(x)| \le \varepsilon \qquad \forall m, n > N_\varepsilon,\ x \in I \tag{1.8} \]
we see
\[ |f(x) - f_n(x)| \le \varepsilon \qquad \forall n > N_\varepsilon,\ x \in I, \tag{1.9} \]
that is, f_n(x) converges uniformly to f(x). However, up to this point we
don’t know whether it is in our vector space C(I) or not, that is, whether
it is continuous or not. Fortunately, there is a well-known result from real
analysis which tells us that the uniform limit of continuous functions is again
continuous. Hence f(x) ∈ C(I) and thus every Cauchy sequence in C(I)
converges. Or, in other words
Theorem 1.1. C(I) with the maximum norm is a Banach space.

Next we want to know if there is a basis for C(I). In order to have only
countable sums, we would even prefer a countable basis. If such a basis
exists, that is, if there is a set {u_n} ⊂ X of linearly independent vectors
such that every element f ∈ X can be written as
\[ f = \sum_{n} c_n u_n, \qquad c_n \in \mathbb{C}, \tag{1.10} \]
then the span (the set of all linear combinations) of {u_n} is dense in X. A
set whose span is dense is called total and if we have a total set, we also
have a countable dense set (consider only linear combinations with rational
coefficients). A normed linear space containing a countable dense set is
called separable. Luckily this is the case for C(I):
Theorem 1.2 (Weierstraß). Let I be a compact interval. Then the set of
polynomials is dense in C(I).

Proof. Let f(x) ∈ C(I) be given. By considering
f(x) − f(a) − (f(b) − f(a))(x − a)/(b − a) it is no loss to assume that f
vanishes at the boundary points.
Moreover, without restriction we only consider I = [−1/2, 1/2] (why?).
Now the claim follows from the lemma below using
\[ u_n(x) = \frac{1}{I_n} (1 - x^2)^n, \tag{1.11} \]
where
\[ I_n = \int_{-1}^{1} (1 - x^2)^n\, dx = \frac{n!}{\frac{1}{2}(\frac{1}{2} + 1) \cdots (\frac{1}{2} + n)} = \sqrt{\pi}\, \frac{\Gamma(1 + n)}{\Gamma(\frac{3}{2} + n)} = \sqrt{\frac{\pi}{n}} \Big(1 + O\big(\tfrac{1}{n}\big)\Big). \tag{1.12} \]
(Remark: The integral is known as Beta function and the asymptotics follow
from Stirling’s formula.) □
Lemma 1.3 (Smoothing). Let u_n(x) be a sequence of nonnegative continuous
functions on [−1, 1] such that
\[ \int_{|x| \le 1} u_n(x)\, dx = 1 \quad \text{and} \quad \int_{\delta \le |x| \le 1} u_n(x)\, dx \to 0, \quad \delta > 0. \tag{1.13} \]
(In other words, u_n has mass one and concentrates near x = 0 as n → ∞.)
Then for every f ∈ C[−1/2, 1/2] which vanishes at the endpoints, f(−1/2) =
f(1/2) = 0, we have that
\[ f_n(x) = \int_{-1/2}^{1/2} u_n(x - y) f(y)\, dy \tag{1.14} \]
converges uniformly to f(x).

Proof. Since f is uniformly continuous, for given ε we can find a δ
(independent of x) such that |f(x) − f(y)| ≤ ε whenever |x − y| ≤ δ. Moreover, we
can choose n such that ∫_{δ≤|y|≤1} u_n(y) dy ≤ ε. Now abbreviate
M = max{1, max |f|} and note
\[ \Big| f(x) - \int_{-1/2}^{1/2} u_n(x - y) f(x)\, dy \Big| = |f(x)|\, \Big| 1 - \int_{-1/2}^{1/2} u_n(x - y)\, dy \Big| \le M \varepsilon. \tag{1.15} \]
In fact, either the distance of x to one of the boundary points ±1/2 is smaller
than δ and hence |f(x)| ≤ ε, or otherwise the difference between one and the
integral is smaller than ε.
Using this we have
\[ \begin{aligned} |f_n(x) - f(x)| &\le \int_{-1/2}^{1/2} u_n(x - y) |f(y) - f(x)|\, dy + M \varepsilon \\ &\le \int_{|y| \le 1/2,\, |x - y| \le \delta} u_n(x - y) |f(y) - f(x)|\, dy \\ &\quad + \int_{|y| \le 1/2,\, |x - y| \ge \delta} u_n(x - y) |f(y) - f(x)|\, dy + M \varepsilon \\ &\le \varepsilon + 2 M \varepsilon + M \varepsilon = (1 + 3M) \varepsilon, \end{aligned} \tag{1.16} \]
which proves the claim. □
Note that f_n will be as smooth as u_n, hence the title smoothing lemma.
The same idea is used to approximate noncontinuous functions by smooth
ones (of course the convergence will no longer be uniform in this case).
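The smoothing lemma can be watched at work numerically. The following sketch uses the kernel (1.11) and the convolution (1.14); it is meant as an illustration under stated assumptions and not as an efficient method. Both I_n and the convolution integral are approximated by the trapezoidal rule, and the helper names `trapezoid`, `landau_kernel` and `smooth` as well as the test function are chosen only for this example.

```python
import numpy as np

def trapezoid(values, dx):
    """Trapezoidal rule along the last axis on a uniform grid."""
    return dx * (values.sum(axis=-1) - 0.5 * (values[..., 0] + values[..., -1]))

def landau_kernel(n, grid=4001):
    """u_n(x) = (1 - x^2)^n / I_n on [-1, 1], with I_n computed numerically."""
    s = np.linspace(-1.0, 1.0, grid)
    I_n = trapezoid((1.0 - s**2) ** n, s[1] - s[0])
    # clip guards against tiny negative values caused by rounding at |x| = 1
    return lambda x: np.clip(1.0 - x**2, 0.0, None) ** n / I_n

def smooth(f, n, grid=1001):
    """Return f_n(x) = int_{-1/2}^{1/2} u_n(x - y) f(y) dy as a function of x."""
    y = np.linspace(-0.5, 0.5, grid)
    u_n = landau_kernel(n)
    def f_n(x):
        x = np.atleast_1d(x)
        return trapezoid(u_n(x[:, None] - y[None, :]) * f(y)[None, :], y[1] - y[0])
    return f_n

# A continuous but non-smooth function vanishing at the endpoints -1/2 and 1/2.
f = lambda x: 0.5 - np.abs(x)
xs = np.linspace(-0.5, 0.5, 201)
sup_errors = [np.max(np.abs(smooth(f, n)(xs) - f(xs))) for n in (10, 100, 1000)]
```

The list `sup_errors` holds the maximum norm distance between f_n and f on the sample points. It shrinks as n grows, but only slowly, which matches the fact that the kernel width behaves like n^{−1/2}.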
Corollary 1.4. C(I) is separable.

The same is true for ℓ1 (N), but not for ℓ∞ (N) (Problem 1.4)!
Problem 1.1. Show that |‖f‖ − ‖g‖| ≤ ‖f − g‖.
Problem 1.2. Show that the norm, vector addition, and multiplication by
scalars are continuous. That is, if f_n → f, g_n → g, and λ_n → λ then
‖f_n‖ → ‖f‖, f_n + g_n → f + g, and λ_n g_n → λg.
Problem 1.3. Show that ℓ∞(N) is a Banach space.
Problem 1.4. Show that ℓ1 (N) is separable. Show that ℓ∞ (N) is not sep-
arable (Hint: Consider sequences which take only the value one and zero.
How many are there? What is the distance between two such sequences?).

1.2. The geometry of Hilbert spaces


So it looks like C(I) has all the properties we want. However, there is
still one thing missing: How should we define orthogonality in C(I)? In
Euclidean space, two vectors are called orthogonal if their scalar product
vanishes, so we would need a scalar product:
Suppose H is a vector space. A map ⟨., ..⟩ : H × H → C is called a skew
linear form if it is conjugate linear in the first and linear in the second
argument, that is,
\[ \begin{aligned} \langle \lambda_1 f_1 + \lambda_2 f_2, g \rangle &= \lambda_1^* \langle f_1, g \rangle + \lambda_2^* \langle f_2, g \rangle, \\ \langle f, \lambda_1 g_1 + \lambda_2 g_2 \rangle &= \lambda_1 \langle f, g_1 \rangle + \lambda_2 \langle f, g_2 \rangle, \end{aligned} \qquad \lambda_1, \lambda_2 \in \mathbb{C}, \tag{1.17} \]
where ‘∗’ denotes complex conjugation. A skew linear form satisfying the
requirements
(1) ⟨f, f⟩ > 0 for f ≠ 0,
(2) ⟨f, g⟩ = ⟨g, f⟩∗
is called an inner product or scalar product. Associated with every scalar
product is a norm
\[ \|f\| = \sqrt{\langle f, f \rangle}. \tag{1.18} \]
The pair (H, ⟨., ..⟩) is called an inner product space. If H is complete it is
called a Hilbert space.
Example. Clearly C^n with the usual scalar product
\[ \langle a, b \rangle = \sum_{j=1}^{n} a_j^* b_j \tag{1.19} \]
is a (finite dimensional) Hilbert space. ⋄

Example. A somewhat more interesting example is the Hilbert space ℓ2(N),
that is, the set of all sequences
\[ \Big\{ (a_j)_{j=1}^{\infty} \,\Big|\, \sum_{j=1}^{\infty} |a_j|^2 < \infty \Big\} \tag{1.20} \]
with scalar product
\[ \langle a, b \rangle = \sum_{j=1}^{\infty} a_j^* b_j. \tag{1.21} \]
(Show that this is in fact a separable Hilbert space! Problem 1.5) ⋄
Of course I still owe you a proof for the claim that √⟨f, f⟩ is indeed a
norm. Only the triangle inequality is nontrivial; it will follow from the
Cauchy-Schwarz inequality below.
A vector f ∈ H is called normalized or a unit vector if ‖f‖ = 1. Two
vectors f, g ∈ H are called orthogonal or perpendicular (f ⊥ g) if ⟨f, g⟩ =
0 and parallel if one is a multiple of the other.
For two orthogonal vectors we have the Pythagorean theorem:
\[ \|f + g\|^2 = \|f\|^2 + \|g\|^2, \qquad f \perp g, \tag{1.22} \]
which is one line of computation.
Suppose u is a unit vector, then the projection of f in the direction of
u is given by
\[ f_\parallel = \langle u, f \rangle u \tag{1.23} \]
and f⊥ defined via
\[ f_\perp = f - \langle u, f \rangle u \tag{1.24} \]
is perpendicular to u since ⟨u, f⊥⟩ = ⟨u, f − ⟨u, f⟩u⟩ = ⟨u, f⟩ − ⟨u, f⟩⟨u, u⟩ =
0.

[Figure: decomposition of f into the component f∥ parallel to u and the component f⊥ perpendicular to u.]

Taking any other vector λu parallel to u it is easy to see
\[ \|f - \lambda u\|^2 = \|f_\perp + (f_\parallel - \lambda u)\|^2 = \|f_\perp\|^2 + |\langle u, f \rangle - \lambda|^2 \tag{1.25} \]
and hence f∥ = ⟨u, f⟩u is the unique vector parallel to u which is closest to
f.
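In C^n the decomposition (1.23), (1.24) and the minimizing property (1.25) are easy to check by hand or by machine. The sketch below is only an illustration with randomly chosen vectors. It relies on `numpy.vdot`, which conjugates its first argument and therefore matches the convention (1.19); the helper names `inner` and `decompose` are made up for this example.

```python
import numpy as np

def inner(f, g):
    """Scalar product on C^n, conjugate linear in the first argument as in (1.19)."""
    return np.vdot(f, g)

def decompose(f, u):
    """Split f into the part parallel to the unit vector u and the rest."""
    f_par = inner(u, f) * u          # (1.23)
    f_perp = f - f_par               # (1.24)
    return f_par, f_perp

rng = np.random.default_rng(0)
f = rng.standard_normal(4) + 1j * rng.standard_normal(4)
u = rng.standard_normal(4) + 1j * rng.standard_normal(4)
u = u / np.linalg.norm(u)            # normalize so that ||u|| = 1

f_par, f_perp = decompose(f, u)
orthogonality = abs(inner(u, f_perp))            # vanishes up to rounding
lam = 0.3 - 0.7j                                 # some other multiple of u
distance_other = np.linalg.norm(f - lam * u)
distance_best = np.linalg.norm(f - f_par)        # equals ||f_perp||
```

Trying different values of `lam` shows that `distance_other` never drops below `distance_best`, in agreement with (1.25).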
As a first consequence we obtain the Cauchy-Schwarz-Bunjakowski
inequality:
Theorem 1.5 (Cauchy-Schwarz-Bunjakowski). Let H_0 be an inner product
space, then for every f, g ∈ H_0 we have
\[ |\langle f, g \rangle| \le \|f\| \|g\| \tag{1.26} \]
with equality if and only if f and g are parallel.

Proof. It suffices to prove the case ‖g‖ = 1. But then the claim follows
from ‖f‖² = |⟨g, f⟩|² + ‖f⊥‖². □

Note that the Cauchy-Schwarz inequality implies that the scalar product
is continuous in both variables, that is, if f_n → f and g_n → g we have
⟨f_n, g_n⟩ → ⟨f, g⟩.
As another consequence we infer that the map ‖.‖ is indeed a norm:
\[ \|f + g\|^2 = \|f\|^2 + \langle f, g \rangle + \langle g, f \rangle + \|g\|^2 \le (\|f\| + \|g\|)^2. \tag{1.27} \]

But let us return to C(I). Can we find a scalar product which has the
maximum norm as associated norm? Unfortunately the answer is no! The
reason is that the maximum norm does not satisfy the parallelogram law
(Problem 1.7).
Theorem 1.6 (Jordan-von Neumann). A norm is associated with a scalar
product if and only if the parallelogram law
\[ \|f + g\|^2 + \|f - g\|^2 = 2\|f\|^2 + 2\|g\|^2 \tag{1.28} \]
holds.
In this case the scalar product can be recovered from its norm by virtue
of the polarization identity
\[ \langle f, g \rangle = \frac{1}{4} \big( \|f + g\|^2 - \|f - g\|^2 + i \|f - ig\|^2 - i \|f + ig\|^2 \big). \tag{1.29} \]
Proof. If an inner product space is given, verification of the parallelogram
law and the polarization identity is straightforward (Problem 1.6).
To show the converse, we define
\[ s(f, g) = \frac{1}{4} \big( \|f + g\|^2 - \|f - g\|^2 + i \|f - ig\|^2 - i \|f + ig\|^2 \big). \tag{1.30} \]
Then s(f, f) = ‖f‖² and s(f, g) = s(g, f)∗ are straightforward to check.
Moreover, another straightforward computation using the parallelogram law
shows
\[ s(f, g) + s(f, h) = 2 s\big(f, \tfrac{g + h}{2}\big). \tag{1.31} \]
Now choosing h = 0 (and using s(f, 0) = 0) shows s(f, g) = 2s(f, g/2) and
thus s(f, g) + s(f, h) = s(f, g + h). Furthermore, by induction we infer
(m/2^n) s(f, g) = s(f, (m/2^n) g), that is, λs(f, g) = s(f, λg) for every positive
dyadic rational λ. By continuity (check this!) this holds for all λ > 0 and
s(f, −g) = −s(f, g) respectively s(f, ig) = i s(f, g) finishes the proof. □

Note that the parallelogram law and the polarization identity even hold
for skew linear forms.
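Both statements of Theorem 1.6 can be illustrated with a few lines of code. The sketch below is an illustration, not a proof: it samples the functions f(x) = x and g(x) = 1 − x on a grid of [0, 1] (the grid contains the points where the maxima are attained) to show that the maximum norm violates (1.28), and it checks (1.28) and (1.29) for the Euclidean norm on C^5 with random vectors. The function names are chosen for this example only.

```python
import numpy as np

def max_norm(values):
    """Maximum norm of a function given by its samples on a grid."""
    return np.max(np.abs(values))

def parallelogram_defect(norm, f, g):
    """Left side minus right side of the parallelogram law (1.28)."""
    return norm(f + g) ** 2 + norm(f - g) ** 2 - 2 * norm(f) ** 2 - 2 * norm(g) ** 2

def polarization(norm, f, g):
    """Right side of the polarization identity (1.29)."""
    return 0.25 * (norm(f + g) ** 2 - norm(f - g) ** 2
                   + 1j * norm(f - 1j * g) ** 2 - 1j * norm(f + 1j * g) ** 2)

# Maximum norm on C[0, 1]: all four norms equal 1, so the defect is 1 + 1 - 2 - 2.
x = np.linspace(0.0, 1.0, 1001)
defect_max = parallelogram_defect(max_norm, x, 1.0 - x)

# Euclidean norm on C^5: the law holds and (1.29) recovers the scalar product.
rng = np.random.default_rng(1)
a = rng.standard_normal(5) + 1j * rng.standard_normal(5)
b = rng.standard_normal(5) + 1j * rng.standard_normal(5)
defect_euclid = parallelogram_defect(np.linalg.norm, a, b)
recovered = polarization(np.linalg.norm, a, b)
```

Here `recovered` should agree with `np.vdot(a, b)` up to rounding, since `np.vdot` conjugates its first argument just like (1.19).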
But how do we define a scalar product on C(I)? One possibility is
\[ \langle f, g \rangle = \int_a^b f^*(x) g(x)\, dx. \tag{1.32} \]
The corresponding inner product space is denoted by L²(I). Note that we
have
\[ \|f\| \le \sqrt{|b - a|}\, \|f\|_\infty \tag{1.33} \]
and hence the maximum norm is stronger than the L² norm.
Suppose we have two norms ‖.‖_1 and ‖.‖_2 on a space X. Then ‖.‖_2 is
said to be stronger than ‖.‖_1 if there is a constant m > 0 such that
\[ \|f\|_1 \le m \|f\|_2. \tag{1.34} \]
It is straightforward to check that
Lemma 1.7. If ‖.‖_2 is stronger than ‖.‖_1, then any ‖.‖_2 Cauchy sequence
is also a ‖.‖_1 Cauchy sequence.

Hence if a function F : X → Y is continuous in (X, ‖.‖_1) it is also
continuous in (X, ‖.‖_2), and if a set is dense in (X, ‖.‖_2) it is also dense in
(X, ‖.‖_1).
In particular, L² is separable. But is it also complete? Unfortunately
the answer is no:
Example. Take I = [0, 2] and define
\[ f_n(x) = \begin{cases} 0, & 0 \le x \le 1 - \frac{1}{n}, \\ 1 + n(x - 1), & 1 - \frac{1}{n} \le x \le 1, \\ 1, & 1 \le x \le 2, \end{cases} \tag{1.35} \]
then f_n(x) is a Cauchy sequence in L², but there is no limit in L²! Clearly
the limit should be the step function which is 0 for 0 ≤ x < 1 and 1 for
1 ≤ x ≤ 2, but this step function is discontinuous (Problem 1.8)! ⋄

This shows that in infinite dimensional spaces different norms will give
rise to different convergent sequences! In fact, the key to solving
problems in infinite dimensional spaces is often finding the right norm! This is
something which cannot happen in the finite dimensional case.
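The different behavior of the two norms on the sequence (1.35) can be observed numerically. The following sketch is only an illustration and does not replace Problem 1.8: it samples f_n on a uniform grid of [0, 2], approximates the L² norm by the trapezoidal rule, and compares f_n with f_{2n}. The names `f`, `l2_distance` and `max_distance` are chosen for this example.

```python
import numpy as np

def f(n, x):
    """The function f_n from (1.35): 0, then a linear ramp on [1 - 1/n, 1], then 1."""
    return np.clip(1.0 + n * (x - 1.0), 0.0, 1.0)

x = np.linspace(0.0, 2.0, 20001)
dx = x[1] - x[0]

def l2_distance(m, n):
    """Approximate L^2 norm of f_m - f_n on [0, 2] (trapezoidal rule)."""
    d2 = (f(m, x) - f(n, x)) ** 2
    return np.sqrt(dx * (d2.sum() - 0.5 * (d2[0] + d2[-1])))

def max_distance(m, n):
    """Maximum norm of f_m - f_n on the grid."""
    return np.max(np.abs(f(m, x) - f(n, x)))

l2_values = [l2_distance(2 * n, n) for n in (10, 100)]
max_values = [max_distance(2 * n, n) for n in (10, 100)]
```

The entries of `l2_values` decrease with n, while the entries of `max_values` do not: at x = 1 − 1/(2n) the function f_{2n} vanishes and f_n equals 1/2. So the sequence is not Cauchy with respect to the maximum norm, consistent with Theorem 1.1.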
Theorem 1.8. If X is a finite dimensional space, then all norms are
equivalent. That is, for given two norms ‖.‖_1 and ‖.‖_2 there are constants m_1
and m_2 such that
\[ \frac{1}{m_2} \|f\|_1 \le \|f\|_2 \le m_1 \|f\|_1. \tag{1.36} \]
Proof. Clearly we can choose a basis u_j, 1 ≤ j ≤ n, and assume that ‖.‖_2 is
the usual Euclidean norm, ‖Σ_j α_j u_j‖_2² = Σ_j |α_j|². Let f = Σ_j α_j u_j, then
by the triangle and Cauchy-Schwarz inequalities
\[ \|f\|_1 \le \sum_j |\alpha_j| \|u_j\|_1 \le \sqrt{\sum_j \|u_j\|_1^2}\; \|f\|_2 \tag{1.37} \]
and we can choose m_2 = \sqrt{\sum_j \|u_j\|_1^2}.
In particular, if f_n is convergent with respect to ‖.‖_2 it is also convergent
with respect to ‖.‖_1. Thus ‖.‖_1 is continuous with respect to ‖.‖_2 and attains
its minimum m > 0 on the unit sphere (which is compact by the Heine-Borel
theorem). Now choose m_1 = 1/m. □
Problem 1.5. Show that ℓ2 (N) is a separable Hilbert space.
Problem 1.6. Let s(f, g) be a skew linear form and p(f) = s(f, f) the
associated quadratic form. Prove the parallelogram law
\[ p(f + g) + p(f - g) = 2p(f) + 2p(g) \tag{1.38} \]
and the polarization identity
\[ s(f, g) = \frac{1}{4} \big( p(f + g) - p(f - g) + i\, p(f - ig) - i\, p(f + ig) \big). \tag{1.39} \]
Problem 1.7. Show that the maximum norm does not satisfy the parallel-
ogram law.
Problem 1.8. Prove the claims made about fn , defined in (1.35), in the
last example.

1.3. Completeness
Since L2 is not complete, how can we obtain a Hilbert space out of it? Well
the answer is simple: take the completion.
If X is a (incomplete) normed space, consider the set of all Cauchy
sequences X̃. Call two Cauchy sequences equivalent if their difference con-
verges to zero and denote by X̄ the set of all equivalence classes. It is easy
to see that X̄ (and X̃) inherit the vector space structure from X. Moreover,
Lemma 1.9. If x_n is a Cauchy sequence, then ‖x_n‖ converges.

Consequently the norm of a Cauchy sequence (x_n)_{n=1}^∞ can be defined by
‖(x_n)_{n=1}^∞‖ = lim_{n→∞} ‖x_n‖ and is independent of the equivalence class (show
this!). Thus X̄ is a normed space (X̃ is not! why?).
Theorem 1.10. X̄ is a Banach space containing X as a dense subspace if
we identify x ∈ X with the equivalence class of all sequences converging to
x.

Proof. (Outline) It remains to show that X̄ is complete. Let ξ_n = [(x_{n,j})_{j=1}^∞]
be a Cauchy sequence in X̄. Then it is not hard to see that ξ = [(x_{j,j})_{j=1}^∞]
is its limit. □

In particular it is no restriction to assume that a normed linear space or
an inner product space is complete. However, in the important case of L² it is
somewhat inconvenient to work with equivalence classes of Cauchy sequences
and hence we will give a different characterization using the Lebesgue
integral later.

1.4. Bounded operators


A linear map A between two normed spaces X and Y will be called a
(linear) operator
\[ A : D(A) \subseteq X \to Y. \tag{1.40} \]
The linear subspace D(A) on which A is defined is called the domain of A
and is usually required to be dense. The operator A is called bounded if
the following operator norm
\[ \|A\| = \sup_{\|f\|_X = 1} \|Af\|_Y \tag{1.41} \]
is finite.
The set of all bounded linear operators from X to Y is denoted by
L(X, Y ). If X = Y we write L(X, X) = L(X).
Theorem 1.11. The space L(X, Y ) together with the operator norm (1.41)
is a normed space. It is a Banach space if Y is.

Proof. That (1.41) is indeed a norm is straightforward. If Y is complete
and A_n is a Cauchy sequence of operators, then A_n f converges to an element
g for every f. Define a new operator A via Af = g and note A_n → A. □

By construction, a bounded operator is Lipschitz continuous,
\[ \|Af\|_Y \le \|A\| \|f\|_X, \tag{1.42} \]
and hence continuous. The converse is also true:
Theorem 1.12. An operator A is bounded if and only if it is continuous.
Proof. Suppose A is continuous but not bounded. Then there is a sequence
of unit vectors u_n such that ‖Au_n‖ ≥ n. Then f_n = (1/n) u_n converges to 0 but
‖Af_n‖ ≥ 1 does not converge to 0. □
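In finite dimensions every linear operator is bounded and the operator norm (1.41) can be computed explicitly. The sketch below is a finite dimensional illustration only: it assumes X = C^3 and Y = C^4 with their Euclidean norms, in which case the supremum in (1.41) is the largest singular value of the matrix (a standard fact from linear algebra, not derived here). The name `operator_norm` is chosen for this example.

```python
import numpy as np

def operator_norm(A):
    """Operator norm (1.41) of a matrix with respect to the Euclidean norms.

    Assumption: Euclidean norms on both sides, so the supremum over unit
    vectors equals the largest singular value.
    """
    return np.linalg.svd(A, compute_uv=False)[0]

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
norm_A = operator_norm(A)

# Sample many unit vectors: ||A f|| never exceeds ||A||, as in (1.42).
units = rng.standard_normal((3, 2000)) + 1j * rng.standard_normal((3, 2000))
units = units / np.linalg.norm(units, axis=0)
sampled_sup = np.linalg.norm(A @ units, axis=0).max()

# The supremum is attained at the first right singular vector.
_, singular_values, Vh = np.linalg.svd(A)
maximizer = Vh[0].conj()
attained = np.linalg.norm(A @ maximizer)
```

Random sampling only gives a lower bound `sampled_sup` for the supremum, while `attained` reproduces `norm_A` up to rounding. In infinite dimensions no such compactness argument is available, which is why boundedness becomes a genuine requirement.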

Moreover, if A is bounded and densely defined, it is no restriction to
assume that it is defined on all of X.
Theorem 1.13. Let A ∈ L(X, Y) and let Y be a Banach space. If D(A)
is dense, there is a unique (continuous) extension of A to X, which has the
same norm.

Proof. Since a bounded operator maps Cauchy sequences to Cauchy
sequences, this extension can only be given by
\[ Af = \lim_{n\to\infty} Af_n, \qquad f_n \in D(A), \quad f \in X. \tag{1.43} \]
To show that this definition is independent of the sequence f_n → f, let
g_n → f be a second sequence and observe
\[ \|Af_n - Ag_n\| = \|A(f_n - g_n)\| \le \|A\| \|f_n - g_n\| \to 0. \tag{1.44} \]
From continuity of vector addition and scalar multiplication it follows that
our extension is linear. Finally, from continuity of the norm we conclude
that the norm does not increase. □

An operator in L(X, C) is called a bounded linear functional and the
space X∗ = L(X, C) is called the dual space of X.
Problem 1.9. Show that the integral operator
\[ (Kf)(x) = \int_0^1 K(x, y) f(y)\, dy, \tag{1.45} \]
where K(x, y) ∈ C([0, 1] × [0, 1]), defined on D(K) = C[0, 1], is a bounded
operator both in X = C[0, 1] (max norm) and X = L²(0, 1).
Problem 1.10. Show that the differential operator A = d/dx defined on
D(A) = C¹[0, 1] ⊂ C[0, 1] is an unbounded operator.
Chapter 2

Hilbert spaces

2.1. Orthonormal bases


In this section we will investigate orthonormal series and you will notice
hardly any difference between the finite and infinite dimensional cases.
As our first task, let us generalize the projection into the direction of
one vector:
A set of vectors {u_j} is called an orthonormal set if ⟨u_j, u_k⟩ = 0 for j ≠ k
and ⟨u_j, u_j⟩ = 1.

Lemma 2.1. Suppose {u_j}_{j=1}^n is an orthonormal set. Then every f ∈ H
can be written as
\[ f = f_\parallel + f_\perp, \qquad f_\parallel = \sum_{j=1}^{n} \langle u_j, f \rangle u_j, \tag{2.1} \]
where f∥ and f⊥ are orthogonal. Moreover, ⟨u_j, f⊥⟩ = 0 for all 1 ≤ j ≤ n.
In particular,
\[ \|f\|^2 = \sum_{j=1}^{n} |\langle u_j, f \rangle|^2 + \|f_\perp\|^2. \tag{2.2} \]
Moreover, every f̂ in the span of {u_j}_{j=1}^n satisfies
\[ \|f - \hat{f}\| \ge \|f_\perp\| \tag{2.3} \]
with equality holding if and only if f̂ = f∥. In other words, f∥ is uniquely
characterized as the vector in the span of {u_j}_{j=1}^n being closest to f.

Proof. A straightforward calculation shows ⟨u_j, f − f∥⟩ = 0 and hence f∥
and f⊥ = f − f∥ are orthogonal. The formula for the norm follows by
applying (1.22) iteratively.
Now, fix a vector
\[ \hat{f} = \sum_{j=1}^{n} c_j u_j \tag{2.4} \]
in the span of {u_j}_{j=1}^n. Then one computes
\[ \|f - \hat{f}\|^2 = \|f_\parallel + f_\perp - \hat{f}\|^2 = \|f_\perp\|^2 + \|f_\parallel - \hat{f}\|^2 = \|f_\perp\|^2 + \sum_{j=1}^{n} |c_j - \langle u_j, f \rangle|^2, \tag{2.5} \]
from which the last claim follows. □

From (2.2) we obtain Bessel’s inequality
\[ \sum_{j=1}^{n} |\langle u_j, f \rangle|^2 \le \|f\|^2 \tag{2.6} \]
with equality holding if and only if f lies in the span of {u_j}_{j=1}^n.
Of course, since we cannot assume H to be a finite dimensional vector
space, we need to generalize Lemma 2.1 to arbitrary orthonormal sets
{u_j}_{j∈J}. We start by assuming that J is countable. Then Bessel’s inequality
(2.6) shows that
\[ \sum_{j \in J} |\langle u_j, f \rangle|^2 \tag{2.7} \]
converges absolutely. Moreover, for any finite subset K ⊂ J we have
\[ \Big\| \sum_{j \in K} \langle u_j, f \rangle u_j \Big\|^2 = \sum_{j \in K} |\langle u_j, f \rangle|^2 \tag{2.8} \]
by the Pythagorean theorem and thus Σ_{j∈J} ⟨u_j, f⟩u_j is Cauchy if and only
if Σ_{j∈J} |⟨u_j, f⟩|² is. Now let J be arbitrary. Again, Bessel’s inequality
shows that for any given ε > 0 there are at most finitely many j for which
|⟨u_j, f⟩| ≥ ε (namely at most ‖f‖²/ε²). Hence there are at most countably
many j for which |⟨u_j, f⟩| > 0. Thus it follows that
\[ \sum_{j \in J} |\langle u_j, f \rangle|^2 \tag{2.9} \]
is well-defined and (by completeness) so is
\[ \sum_{j \in J} \langle u_j, f \rangle u_j. \tag{2.10} \]
In particular, by continuity of the scalar product we see that Lemma 2.1
holds for arbitrary orthonormal sets without modifications.
Theorem 2.2. Suppose {u_j}_{j∈J} is an orthonormal set in an inner product
space H. Then every f ∈ H can be written as
\[ f = f_\parallel + f_\perp, \qquad f_\parallel = \sum_{j \in J} \langle u_j, f \rangle u_j, \tag{2.11} \]
where f∥ and f⊥ are orthogonal. Moreover, ⟨u_j, f⊥⟩ = 0 for all j ∈ J. In
particular,
\[ \|f\|^2 = \sum_{j \in J} |\langle u_j, f \rangle|^2 + \|f_\perp\|^2. \tag{2.12} \]
Moreover, every f̂ in the span of {u_j}_{j∈J} satisfies
\[ \|f - \hat{f}\| \ge \|f_\perp\| \tag{2.13} \]
with equality holding if and only if f̂ = f∥. In other words, f∥ is uniquely
characterized as the vector in the span of {u_j}_{j∈J} being closest to f.

Note that from Bessel’s inequality (which of course still holds) it follows
that the map f → f∥ is continuous.
Of course we are particularly interested in the case where every f ∈ H
can be written as Σ_{j∈J} ⟨u_j, f⟩u_j. In this case we will call the orthonormal
set {u_j}_{j∈J} an orthonormal basis.
If $H$ is separable it is easy to construct an orthonormal basis. In fact, if $H$
is separable, then there exists a countable total set $\{f_j\}_{j=1}^N$. After throwing
away some vectors we can assume that $f_{n+1}$ cannot be expressed as a linear
combination of the vectors $f_1, \dots, f_n$. Now we can construct an orthonormal
set as follows: We begin by normalizing $f_1$,
\[
  u_1 = \frac{f_1}{\|f_1\|}. \tag{2.14}
\]
Next we take $f_2$ and remove the component parallel to $u_1$ and normalize
again,
\[
  u_2 = \frac{f_2 - \langle u_1, f_2 \rangle u_1}{\|f_2 - \langle u_1, f_2 \rangle u_1\|}. \tag{2.15}
\]
Proceeding like this we define recursively
\[
  u_n = \frac{f_n - \sum_{j=1}^{n-1} \langle u_j, f_n \rangle u_j}{\big\| f_n - \sum_{j=1}^{n-1} \langle u_j, f_n \rangle u_j \big\|}. \tag{2.16}
\]
This procedure is known as Gram-Schmidt orthogonalization. Hence
we obtain an orthonormal set $\{u_j\}_{j=1}^N$ such that $\operatorname{span}\{u_j\}_{j=1}^n = \operatorname{span}\{f_j\}_{j=1}^n$
for any finite $n$ and thus also for $N$. Since $\{f_j\}_{j=1}^N$ is total, we infer that
$\{u_j\}_{j=1}^N$ is an orthonormal basis.
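
The recursion (2.14)-(2.16) can be written down directly for vectors in $\mathbb{C}^n$. This is a minimal sketch, not a numerically robust implementation: the function name `gram_schmidt`, the tolerance `tol`, and the sample vectors are assumptions made for illustration. The coefficients are computed against the current remainder rather than against $f_n$ itself, which gives the same result in exact arithmetic.

```python
import math

def inner(g, f):
    """Scalar product <g, f> on C^n, antilinear in the first argument."""
    return sum(gk.conjugate() * fk for gk, fk in zip(g, f))

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize `vectors` following (2.14)-(2.16).

    Vectors that are (numerically) linear combinations of earlier ones are
    thrown away, mirroring the assumption made in the text.
    """
    basis = []
    for f in vectors:
        r = list(f)
        for u in basis:
            c = inner(u, r)                   # component of r along u
            r = [rk - c * uk for rk, uk in zip(r, u)]
        length = math.sqrt(inner(r, r).real)
        if length > tol:                      # keep only independent vectors
            basis.append([rk / length for rk in r])
    return basis

# The third vector is the sum of the first two and is therefore dropped.
basis = gram_schmidt([[1, 1, 0], [1, 0, 1], [2, 1, 1], [0, 1, 1]])
```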

Theorem 2.3. Every separable inner product space has a countable
orthonormal basis.

Example. In $L^2(-1, 1)$ we can orthogonalize the polynomials $f_n(x) = x^n$.
The resulting polynomials are up to a normalization equal to the Legendre
polynomials
\[
  P_0(x) = 1, \quad P_1(x) = x, \quad P_2(x) = \frac{3 x^2 - 1}{2}, \quad \dots \tag{2.17}
\]
(which are normalized such that $P_n(1) = 1$). ⋄
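
The Legendre example can be reproduced with exact rational arithmetic. The sketch below makes a few assumptions of its own: polynomials are stored as coefficient lists, the monomials are orthogonalized without normalizing in the $L^2$ norm (a variant of Gram-Schmidt that avoids square roots), and each result is rescaled so that $P_n(1) = 1$. The names `inner` and `legendre` are illustrative.

```python
from fractions import Fraction

def inner(p, q):
    """<p, q> = integral over (-1, 1) of p(x) q(x) dx for real polynomials,
    given as coefficient lists [c0, c1, ...] (c_k multiplies x**k)."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:              # odd powers integrate to zero
                total += Fraction(a) * Fraction(b) * Fraction(2, i + j + 1)
    return total

def legendre(n_max):
    """Orthogonalize 1, x, ..., x**n_max and rescale so that P_n(1) = 1."""
    ortho = []
    for n in range(n_max + 1):
        p = [Fraction(0)] * n + [Fraction(1)]           # the monomial x**n
        for q in ortho:
            c = inner(q, p) / inner(q, q)               # component along q
            q_padded = q + [Fraction(0)] * (len(p) - len(q))
            p = [a - c * b for a, b in zip(p, q_padded)]
        ortho.append(p)
    # sum(p) is the value p(1); dividing by it enforces P_n(1) = 1.
    return [[a / sum(p) for a in p] for p in ortho]

P = legendre(3)
```

The entry `P[2]` holds the coefficients $(-\tfrac12, 0, \tfrac32)$ of $P_2$ from (2.17).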

Example. The set of functions
\[
  u_n(x) = \frac{1}{\sqrt{2\pi}} e^{i n x}, \qquad n \in \mathbb{Z}, \tag{2.18}
\]
forms an orthonormal basis for $H = L^2(0, 2\pi)$. The corresponding orthogonal
expansion is just the ordinary Fourier series. ⋄
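
Orthonormality of the functions in (2.18) can be checked numerically. The following sketch only tests $\langle u_m, u_n \rangle = \delta_{mn}$ for a few indices and says nothing about completeness. The equispaced Riemann sum and the number of sample points are assumptions of this illustration; for these trigonometric functions the sum reproduces the integral up to rounding as long as $|n - m|$ is smaller than the number of samples.

```python
import cmath
import math

def u(n, x):
    """The function u_n(x) = exp(i n x) / sqrt(2 pi) from (2.18)."""
    return cmath.exp(1j * n * x) / math.sqrt(2 * math.pi)

def inner(m, n, samples=64):
    """Approximate <u_m, u_n> on (0, 2 pi) by an equispaced Riemann sum."""
    h = 2 * math.pi / samples
    return h * sum(u(m, k * h).conjugate() * u(n, k * h) for k in range(samples))

# Gram matrix for the indices -2, ..., 2; it should be the identity.
gram = {(m, n): inner(m, n) for m in range(-2, 3) for n in range(-2, 3)}
```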

In fact, if there is one countable basis, then it follows that every other
basis is countable as well.
Theorem 2.4. If H is separable, then every orthonormal basis is countable.

Proof. We know that there is at least one countable orthonormal basis
$\{u_j\}_{j \in J}$. Now let $\{u_k\}_{k \in K}$ be a second basis and consider the set
$K_j = \{k \in K \mid \langle u_k, u_j \rangle \ne 0\}$. Since these are the expansion coefficients of $u_j$ with
respect to $\{u_k\}_{k \in K}$, this set is countable. Hence the set $\tilde K = \bigcup_{j \in J} K_j$ is
countable as well. But $k \in K \setminus \tilde K$ implies $u_k = 0$ and hence $\tilde K = K$. □

It even turns out that, up to unitary equivalence, there is only one
(separable) infinite dimensional Hilbert space:
A bijective operator $U \in L(H_1, H_2)$ is called unitary if $U$ preserves
scalar products:
\[
  \langle U g, U f \rangle_2 = \langle g, f \rangle_1, \qquad g, f \in H_1. \tag{2.19}
\]
By the polarization identity this is the case if and only if $U$ preserves norms:
$\|U f\|_2 = \|f\|_1$ for all $f \in H_1$. The two Hilbert spaces $H_1$ and $H_2$ are called
unitarily equivalent in this case.
Let $H$ be an infinite dimensional Hilbert space and let $\{u_j\}_{j \in \mathbb{N}}$ be any
orthonormal basis. Then the map $U : H \to \ell^2(\mathbb{N})$, $f \mapsto (\langle u_j, f \rangle)_{j \in \mathbb{N}}$ is unitary
(by Theorem 2.6 (iii)). In particular,
Theorem 2.5. Any separable infinite dimensional Hilbert space is unitarily
equivalent to $\ell^2(\mathbb{N})$.

To see that any Hilbert space has an orthonormal basis we need to
resort to Zorn's lemma: The collection of all orthonormal sets in $H$ can be
partially ordered by inclusion. Moreover, any linearly ordered chain has an
upper bound (the union of all sets in the chain). Hence Zorn's lemma implies
the existence of a maximal element, that is, an orthonormal set which is not
a proper subset of any other orthonormal set.

Theorem 2.6. For an orthonormal set $\{u_j\}_{j \in J}$ in a Hilbert space $H$ the
following conditions are equivalent:
(i) $\{u_j\}_{j \in J}$ is a maximal orthonormal set.
(ii) For every vector $f \in H$ we have
\[
  f = \sum_{j \in J} \langle u_j, f \rangle u_j. \tag{2.20}
\]
(iii) For every vector $f \in H$ we have
\[
  \|f\|^2 = \sum_{j \in J} |\langle u_j, f \rangle|^2. \tag{2.21}
\]
(iv) $\langle u_j, f \rangle = 0$ for all $j \in J$ implies $f = 0$.

Proof. We will use the notation from Theorem 2.2.
(i) ⇒ (ii): If $f_\perp \ne 0$ then we can normalize $f_\perp$ to obtain a unit vector $\tilde f_\perp$
which is orthogonal to all vectors $u_j$. But then $\{u_j\}_{j \in J} \cup \{\tilde f_\perp\}$ would be a
larger orthonormal set, contradicting maximality of $\{u_j\}_{j \in J}$.
(ii) ⇒ (iii): Follows since (ii) implies $f_\perp = 0$.
(iii) ⇒ (iv): If $\langle u_j, f \rangle = 0$ for all $j \in J$ we conclude $\|f\|^2 = 0$ and hence
$f = 0$.
(iv) ⇒ (i): If $\{u_j\}_{j \in J}$ were not maximal, there would be a unit vector $g$
such that $\{u_j\}_{j \in J} \cup \{g\}$ is a larger orthonormal set. But $\langle u_j, g \rangle = 0$ for all
$j \in J$ implies $g = 0$ by (iv), a contradiction. □

By continuity of the norm it suffices to check (iii), and hence also (ii),
for f in a dense set.

2.2. The projection theorem and the Riesz lemma

Let $M \subseteq H$ be a subset, then $M^\perp = \{f \mid \langle g, f \rangle = 0, \ \forall g \in M\}$ is called
the orthogonal complement of $M$. By continuity of the scalar product
it follows that $M^\perp$ is a closed linear subspace and by linearity that
$(\operatorname{span}(M))^\perp = M^\perp$. For example we have $H^\perp = \{0\}$ since any vector in $H^\perp$
must be in particular orthogonal to all vectors in some orthonormal basis.

Theorem 2.7 (projection theorem). Let $M$ be a closed linear subspace of
a Hilbert space $H$, then every $f \in H$ can be uniquely written as $f = f_\parallel + f_\perp$
with $f_\parallel \in M$ and $f_\perp \in M^\perp$. One writes
\[
  M \oplus M^\perp = H \tag{2.22}
\]
in this situation.

Proof. Since $M$ is closed, it is a Hilbert space and has an orthonormal basis
$\{u_j\}_{j \in J}$. Hence the result follows from Theorem 2.2. □

In other words, to every $f \in H$ we can assign a unique vector $f_\parallel$ which
is the vector in $M$ closest to $f$. The rest $f - f_\parallel$ lies in $M^\perp$. The operator
$P_M f = f_\parallel$ is called the orthogonal projection corresponding to $M$. Clearly
we have $P_{M^\perp} f = f - P_M f = f_\perp$.
Moreover, we see that the vectors in a closed subspace $M$ are precisely
those which are orthogonal to all vectors in $M^\perp$, that is, $M^{\perp\perp} = M$. If $M$
is an arbitrary subset we have at least
\[
  M^{\perp\perp} = \overline{\operatorname{span}(M)}. \tag{2.23}
\]

Finally we turn to linear functionals, that is, to operators $\ell : H \to \mathbb{C}$.
By the Cauchy-Schwarz inequality we know that $\ell_g : f \mapsto \langle g, f \rangle$ is a bounded
linear functional (with norm $\|g\|$). It turns out that in a Hilbert space every
bounded linear functional can be written in this way.
Theorem 2.8 (Riesz lemma). Suppose $\ell$ is a bounded linear functional on
a Hilbert space $H$. Then there is a vector $g \in H$ such that $\ell(f) = \langle g, f \rangle$ for
all $f \in H$. In other words, a Hilbert space is equivalent to its own dual space
$H^* = H$.

Proof. If $\ell \equiv 0$ we can choose $g = 0$. Otherwise $\operatorname{Ker}(\ell) = \{f \mid \ell(f) = 0\}$ is a
proper subspace, which is closed since $\ell$ is bounded, and we can find a unit
vector $\tilde g \in \operatorname{Ker}(\ell)^\perp$. For every $f \in H$
we have $\ell(f) \tilde g - \ell(\tilde g) f \in \operatorname{Ker}(\ell)$ and hence
\[
  0 = \langle \tilde g, \ell(f) \tilde g - \ell(\tilde g) f \rangle = \ell(f) - \ell(\tilde g) \langle \tilde g, f \rangle. \tag{2.24}
\]
In other words, we can choose $g = \ell(\tilde g)^* \tilde g$. □
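
In finite dimensions the construction from this proof can be carried out explicitly. The sketch below assumes $H = \mathbb{C}^n$ with the standard scalar product (antilinear in the first argument). The name `riesz_vector` and the sample functional `ell` are hypothetical, and the unit vector $\tilde g$ is obtained from the coefficients $\ell(e_k)$ in the standard basis, a shortcut that is only available in this finite dimensional setting.

```python
import math

def inner(g, f):
    """Scalar product <g, f> on C^n, antilinear in the first argument."""
    return sum(gk.conjugate() * fk for gk, fk in zip(g, f))

def riesz_vector(ell, n):
    """Return g with ell(f) = <g, f> for a linear functional `ell` on C^n."""
    # Coefficients c_k = ell(e_k) in the standard basis e_1, ..., e_n.
    c = [ell([1 if i == k else 0 for i in range(n)]) for k in range(n)]
    length = math.sqrt(sum(abs(ck) ** 2 for ck in c))
    if length == 0:
        return [0j] * n                       # ell = 0, choose g = 0
    # Unit vector orthogonal to Ker(ell): the conjugated coefficient vector.
    g_tilde = [complex(ck).conjugate() / length for ck in c]
    scale = complex(ell(g_tilde)).conjugate() # ell(g_tilde)^*
    return [scale * gk for gk in g_tilde]     # g = ell(g_tilde)^* g_tilde

# A sample functional on C^2 (illustrative choice).
ell = lambda f: (2 - 1j) * f[0] + 3j * f[1]
g = riesz_vector(ell, 2)
f = [1 + 1j, 2 - 1j]
```

For this functional `inner(g, f)` agrees with `ell(f)` up to rounding.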

The following easy consequence is left as an exercise.
Corollary 2.9. Suppose $B$ is a bounded skew linear form, that is,
\[
  |B(f, g)| \le C \|f\| \, \|g\|. \tag{2.25}
\]
Then there is a unique bounded operator $A$ such that
\[
  B(f, g) = \langle A f, g \rangle. \tag{2.26}
\]

2.3. Orthogonal sums and tensor products


Given two Hilbert spaces $H_1$ and $H_2$ we define their orthogonal sum $H_1 \oplus H_2$
to be the set of all pairs $(f_1, f_2) \in H_1 \times H_2$ together with the scalar product
\[
  \langle (g_1, g_2), (f_1, f_2) \rangle = \langle g_1, f_1 \rangle_1 + \langle g_2, f_2 \rangle_2. \tag{2.27}
\]
It is left as an exercise to verify that $H_1 \oplus H_2$ is again a Hilbert space.
Moreover, $H_1$ can be identified with $\{(f_1, 0) \mid f_1 \in H_1\}$ and we can regard $H_1$
as a subspace of $H_1 \oplus H_2$. Similarly for $H_2$. It is also customary to write $f_1 + f_2$
instead of $(f_1, f_2)$.
More generally, let $H_j$, $j \in \mathbb{N}$, be a countable collection of Hilbert spaces
and define
\[
  \bigoplus_{j=1}^{\infty} H_j = \Big\{ \sum_{j=1}^{\infty} f_j \,\Big|\, f_j \in H_j, \ \sum_{j=1}^{\infty} \|f_j\|_j^2 < \infty \Big\}, \tag{2.28}
\]
which becomes a Hilbert space with the scalar product
\[
  \Big\langle \sum_{j=1}^{\infty} g_j, \sum_{j=1}^{\infty} f_j \Big\rangle = \sum_{j=1}^{\infty} \langle g_j, f_j \rangle_j. \tag{2.29}
\]
Example. $\bigoplus_{j=1}^{\infty} \mathbb{C} = \ell^2(\mathbb{N})$. ⋄

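Suppose $H$ and $\tilde H$ are two Hilbert spaces. Our goal is to construct their
tensor product. The elements should be products $f \otimes \tilde f$ of elements $f \in H$
and $\tilde f \in \tilde H$. Hence we start with the set of all finite linear combinations of
elements of $H \times \tilde H$
\[
  \mathcal{F}(H, \tilde H) = \Big\{ \sum_{j=1}^{n} \alpha_j (f_j, \tilde f_j) \,\Big|\, (f_j, \tilde f_j) \in H \times \tilde H, \ \alpha_j \in \mathbb{C} \Big\}. \tag{2.30}
\]
Since we want $(f_1 + f_2) \otimes \tilde f = f_1 \otimes \tilde f + f_2 \otimes \tilde f$, $f \otimes (\tilde f_1 + \tilde f_2) = f \otimes \tilde f_1 + f \otimes \tilde f_2$,
and $(\alpha f) \otimes \tilde f = f \otimes (\alpha \tilde f)$ we consider $\mathcal{F}(H, \tilde H)/\mathcal{N}(H, \tilde H)$, where
\[
  \mathcal{N}(H, \tilde H) = \operatorname{span}\Big\{ \sum_{j,k=1}^{n} \alpha_j \beta_k (f_j, \tilde f_k) - \Big( \sum_{j=1}^{n} \alpha_j f_j, \sum_{k=1}^{n} \beta_k \tilde f_k \Big) \Big\} \tag{2.31}
\]
and write $f \otimes \tilde f$ for the equivalence class of $(f, \tilde f)$.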


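Next we define
\[
  \langle f \otimes \tilde f, g \otimes \tilde g \rangle = \langle f, g \rangle \langle \tilde f, \tilde g \rangle \tag{2.32}
\]
which extends to a skew linear form on $\mathcal{F}(H, \tilde H)/\mathcal{N}(H, \tilde H)$. To show that we
obtain a scalar product, we need to ensure positivity. Let $f = \sum_i \alpha_i f_i \otimes \tilde f_i \ne 0$
and pick orthonormal bases $u_j$, $\tilde u_k$ for $\operatorname{span}\{f_i\}$, $\operatorname{span}\{\tilde f_i\}$, respectively.
Then
\[
  f = \sum_{j,k} \alpha_{jk} \, u_j \otimes \tilde u_k, \qquad \alpha_{jk} = \sum_i \alpha_i \langle u_j, f_i \rangle \langle \tilde u_k, \tilde f_i \rangle \tag{2.33}
\]
and we compute
\[
  \langle f, f \rangle = \sum_{j,k} |\alpha_{jk}|^2 > 0. \tag{2.34}
\]
The completion of $\mathcal{F}(H, \tilde H)/\mathcal{N}(H, \tilde H)$ with respect to the induced norm is
called the tensor product $H \otimes \tilde H$ of $H$ and $\tilde H$.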
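Lemma 2.10. If $u_j$, $\tilde u_k$ are orthonormal bases for $H$, $\tilde H$, respectively, then
$u_j \otimes \tilde u_k$ is an orthonormal basis for $H \otimes \tilde H$.

Proof. That $u_j \otimes \tilde u_k$ is an orthonormal set is immediate from (2.32). Moreover,
since $\operatorname{span}\{u_j\}$, $\operatorname{span}\{\tilde u_k\}$ are dense in $H$, $\tilde H$, respectively, it is easy to
see that $\operatorname{span}\{u_j \otimes \tilde u_k\}$ is dense in $\mathcal{F}(H, \tilde H)/\mathcal{N}(H, \tilde H)$. But the latter is dense in
$H \otimes \tilde H$. □

Example. We have $H \otimes \mathbb{C}^n = H^n$. ⋄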
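
For finite dimensional spaces the tensor product can be realized concretely, which makes (2.32) easy to test. The sketch assumes the usual identification of $\mathbb{C}^m \otimes \mathbb{C}^n$ with $\mathbb{C}^{mn}$ via products of coordinates (the Kronecker product). The function name `tensor` and the sample vectors are illustrative.

```python
def inner(g, f):
    """Scalar product <g, f> on C^n, antilinear in the first argument."""
    return sum(gk.conjugate() * fk for gk, fk in zip(g, f))

def tensor(f, f_tilde):
    """Coordinates of f (x) f_tilde in C^m (x) C^n, identified with C^(m*n)."""
    return [a * b for a in f for b in f_tilde]

f, g = [1 + 1j, 2], [3, -1j]
f_t, g_t = [1j, 0, 2], [1, 1, 1 - 1j]

# Both sides of (2.32) for the product vectors f (x) f_t and g (x) g_t.
lhs = inner(tensor(f, f_t), tensor(g, g_t))
rhs = inner(f, g) * inner(f_t, g_t)
```

In this identification the products of the standard basis vectors of $\mathbb{C}^m$ and $\mathbb{C}^n$ are exactly the standard basis vectors of $\mathbb{C}^{mn}$, in line with Lemma 2.10.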

It is straightforward to extend the tensor product to any finite number
of Hilbert spaces. We even note
\[
  \Big( \bigoplus_{j=1}^{\infty} H_j \Big) \otimes H = \bigoplus_{j=1}^{\infty} (H_j \otimes H), \tag{2.35}
\]
where equality has to be understood in the sense that both spaces are
unitarily equivalent by virtue of the identification
\[
  \Big( \sum_{j=1}^{\infty} f_j \Big) \otimes f = \sum_{j=1}^{\infty} f_j \otimes f. \tag{2.36}
\]

2.4. Compact operators


A linear operator $A$ defined on a normed space $X$ is called compact if
every sequence $A f_n$ has a convergent subsequence whenever $f_n$ is bounded.
The set of all compact operators is denoted by $C(X)$. It is not hard to see
that the set of compact operators is an ideal of the set of bounded operators
(Problem 2.1):
Theorem 2.11. Every compact linear operator is bounded. Linear combinations
of compact operators are compact and the product of a bounded and
a compact operator is again compact.

If $X$ is a Banach space then this ideal is even closed:
Theorem 2.12. Let $X$ be a Banach space, and let $A_n$ be a convergent
sequence of compact operators. Then the limit $A$ is again compact.

Proof. Let $f_j^{(0)}$ be a bounded sequence. Choose a subsequence $f_j^{(1)}$ such
that $A_1 f_j^{(1)}$ converges. From $f_j^{(1)}$ choose another subsequence $f_j^{(2)}$ such that
$A_2 f_j^{(2)}$ converges and so on. Since $f_j^{(n)}$ might disappear as $n \to \infty$, we
consider the diagonal sequence $f_j = f_j^{(j)}$. By construction, $f_j$ is a subsequence
of $f_j^{(n)}$ for $j \ge n$ and hence $A_n f_j$ is Cauchy for any fixed $n$. Now
\begin{align*}
  \|A f_j - A f_k\| &= \|(A - A_n)(f_j - f_k) + A_n (f_j - f_k)\| \\
  &\le \|A - A_n\| \, \|f_j - f_k\| + \|A_n f_j - A_n f_k\| \tag{2.37}
\end{align*}
shows that $A f_j$ is Cauchy since the first term can be made arbitrarily small
by choosing $n$ large and the second by the Cauchy property of $A_n f_j$. □

Note that it suffices to verify compactness on a dense set.
Theorem 2.13. Let $X$ be a normed space and $A \in C(X)$. Let $\overline{X}$ be its
completion, then $\overline{A} \in C(\overline{X})$, where $\overline{A}$ is the unique extension of $A$.

Proof. Let $f_n \in \overline{X}$ be a given bounded sequence. We need to show that
$\overline{A} f_n$ has a convergent subsequence. Pick $f_n^j \in X$ such that $\|f_n^j - f_n\| \le \frac{1}{j}$ and
by compactness of $A$ we can assume that $A f_n^n \to g$. But then
$\|\overline{A} f_n - g\| \le \|A\| \, \|f_n - f_n^n\| + \|A f_n^n - g\|$ shows that $\overline{A} f_n \to g$. □

One of the most important examples of compact operators are integral
operators:
Lemma 2.14. The integral operator
\[
  (K f)(x) = \int_a^b K(x, y) f(y) \, dy, \tag{2.38}
\]
where $K(x, y) \in C([a, b] \times [a, b])$, defined on $L^2(a, b)$ is compact.

Proof. First of all note that $K(., ..)$ is continuous on $[a, b] \times [a, b]$ and hence
uniformly continuous. In particular, for every $\varepsilon > 0$ we can find a $\delta > 0$
such that $|K(y, t) - K(x, t)| \le \varepsilon$ whenever $|y - x| \le \delta$. Let $g(x) = K f(x)$,
then
\begin{align*}
  |g(x) - g(y)| &\le \int_a^b |K(y, t) - K(x, t)| \, |f(t)| \, dt \\
  &\le \varepsilon \int_a^b |f(t)| \, dt \le \varepsilon \|1\| \, \|f\|, \tag{2.39}
\end{align*}
whenever $|y - x| \le \delta$. Hence, if $f_n(x)$ is a bounded sequence in $L^2(a, b)$,
then $g_n(x) = K f_n(x)$ is equicontinuous and has a uniformly convergent
subsequence by the Arzelà-Ascoli theorem (Theorem 2.15 below). But a
uniformly convergent sequence is also convergent in the norm induced by
the scalar product. Therefore $K$ is compact. □

Note that (almost) the same proof shows that K is compact when defined
on C[a, b].
Theorem 2.15 (Arzelà-Ascoli). Suppose the sequence of functions $f_n(x)$,
$n \in \mathbb{N}$, on a compact interval is (uniformly) equicontinuous, that is, for
every $\varepsilon > 0$ there is a $\delta > 0$ (independent of $n$) such that
\[
  |f_n(x) - f_n(y)| \le \varepsilon \quad \text{if} \quad |x - y| < \delta. \tag{2.40}
\]
If the sequence $f_n$ is bounded, then there is a uniformly convergent
subsequence.

(The proof is not difficult but I still don’t want to repeat it here since it
is covered in most real analysis courses.)
Compact operators are very similar to (finite) matrices as we will see in
the next section.
Problem 2.1. Show that compact operators form an ideal.

2.5. The spectral theorem for compact symmetric operators

Let $H$ be an inner product space. A linear operator $A$ is called symmetric
if its domain is dense and if
\[
  \langle g, A f \rangle = \langle A g, f \rangle, \qquad f, g \in D(A). \tag{2.41}
\]
A number $z \in \mathbb{C}$ is called an eigenvalue of $A$ if there is a nonzero vector
$u \in D(A)$ such that
\[
  A u = z u. \tag{2.42}
\]
The vector $u$ is called a corresponding eigenvector in this case. An eigenvalue
is called simple if there is only one linearly independent eigenvector.
Theorem 2.16. Let $A$ be symmetric. Then all eigenvalues are real and
eigenvectors corresponding to different eigenvalues are orthogonal.

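Proof. Suppose $\lambda$ is an eigenvalue with corresponding normalized eigenvector
$u$. Then $\lambda = \langle u, A u \rangle = \langle A u, u \rangle = \lambda^*$, which shows that $\lambda$ is real.
Furthermore, if $A u_j = \lambda_j u_j$, $j = 1, 2$, we have
\[
  (\lambda_1 - \lambda_2) \langle u_1, u_2 \rangle = \langle A u_1, u_2 \rangle - \langle u_1, A u_2 \rangle = 0 \tag{2.43}
\]
finishing the proof. □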

Now we show that $A$ has an eigenvalue at all (which is not clear in the
infinite dimensional case)!
Theorem 2.17. A symmetric compact operator has an eigenvalue $\alpha_1$ which
satisfies $|\alpha_1| = \|A\|$.

Proof. We set $\alpha = \|A\|$ and assume $\alpha \ne 0$ (i.e., $A \ne 0$) without loss of
generality. Since
\[
  \|A\|^2 = \sup_{f : \|f\| = 1} \|A f\|^2 = \sup_{f : \|f\| = 1} \langle A f, A f \rangle = \sup_{f : \|f\| = 1} \langle f, A^2 f \rangle \tag{2.44}
\]
there exists a normalized sequence $u_n$ such that
\[
  \lim_{n \to \infty} \langle u_n, A^2 u_n \rangle = \alpha^2. \tag{2.45}
\]
Since $A$ is compact, it is no restriction to assume that $A^2 u_n$ converges, say
$\lim_{n \to \infty} A^2 u_n = \alpha^2 u$. Now
\begin{align*}
  \|(A^2 - \alpha^2) u_n\|^2 &= \|A^2 u_n\|^2 - 2 \alpha^2 \langle u_n, A^2 u_n \rangle + \alpha^4 \\
  &\le 2 \alpha^2 (\alpha^2 - \langle u_n, A^2 u_n \rangle) \tag{2.46}
\end{align*}
(where we have used $\|A^2 u_n\| \le \|A\| \, \|A u_n\| \le \|A\|^2 \|u_n\| = \alpha^2$) implies
$\lim_{n \to \infty} (A^2 u_n - \alpha^2 u_n) = 0$ and hence $\lim_{n \to \infty} u_n = u$. In addition, $u$ is
a normalized eigenvector of $A^2$ since $(A^2 - \alpha^2) u = 0$. Factorizing this last
equation according to $(A - \alpha) u = v$ and $(A + \alpha) v = 0$ shows that either
$v \ne 0$ is an eigenvector corresponding to $-\alpha$ or $v = 0$ and hence $u \ne 0$ is an
eigenvector corresponding to $\alpha$. □

Note that for a bounded operator $A$, there cannot be an eigenvalue with
absolute value larger than $\|A\|$, that is, the set of eigenvalues is bounded by
$\|A\|$ (Problem 2.2).
Now consider a symmetric compact operator $A$ with eigenvalue $\alpha_1$ (as
above) and corresponding normalized eigenvector $u_1$. Setting
\[
  H_1 = \{u_1\}^\perp = \{f \in H \mid \langle u_1, f \rangle = 0\} \tag{2.47}
\]
we can restrict $A$ to $H_1$ using
\[
  D(A_1) = D(A) \cap H_1 = \{f \in D(A) \mid \langle u_1, f \rangle = 0\} \tag{2.48}
\]
since $f \in D(A_1)$ implies
\[
  \langle u_1, A f \rangle = \langle A u_1, f \rangle = \alpha_1 \langle u_1, f \rangle = 0 \tag{2.49}
\]
and hence $A f \in H_1$. Denoting this restriction by $A_1$, it is not hard to
see that $A_1$ is again a symmetric compact operator. Hence we can apply
Theorem 2.17 iteratively to obtain a sequence of eigenvalues $\alpha_j$ with
corresponding normalized eigenvectors $u_j$. Moreover, by construction, $u_n$ is
orthogonal to all $u_j$ with $j < n$ and hence the eigenvectors $\{u_j\}$ form an
orthonormal set. This procedure will not stop unless $H$ is finite dimensional.
However, note that $\alpha_j = 0$ for $j \ge n$ might happen if $A_n = 0$.
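
For a real symmetric matrix the iterative procedure can be imitated numerically. The sketch below is not the argument of the text but a finite dimensional analogue under several assumptions: the maximizing sequence from the proof of Theorem 2.17 is replaced by power iteration for $A^2$ on the orthogonal complement of the eigenvectors found so far, the fixed starting vector, iteration count, and tolerance are arbitrary, and the names `next_eigenpair` and `remove_components` are illustrative. The final step follows the factorization $(A - \alpha) u = v$, $(A + \alpha) v = 0$ from that proof.

```python
import math

def matvec(A, x):
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def normalize(x):
    length = math.sqrt(dot(x, x))
    return [xk / length for xk in x]

def remove_components(x, found):
    """Project x onto the orthogonal complement of the vectors in `found`."""
    for u in found:
        c = dot(u, x)
        x = [xk - c * uk for xk, uk in zip(x, u)]
    return x

def next_eigenpair(A, found, iterations=200, tol=1e-8):
    """One step of the procedure: an eigenpair of A restricted to found^perp."""
    u = remove_components([1.0 + 0.1 * k for k in range(len(A))], found)
    for _ in range(iterations):               # power iteration for A^2
        w = remove_components(matvec(A, matvec(A, u)), found)
        if math.sqrt(dot(w, w)) < tol:
            return 0.0, normalize(u)          # the restriction of A is zero
        u = normalize(w)
    alpha = math.sqrt(dot(u, matvec(A, matvec(A, u))))
    v = [a - alpha * b for a, b in zip(matvec(A, u), u)]   # v = (A - alpha) u
    if math.sqrt(dot(v, v)) > tol:
        return -alpha, normalize(v)           # (A + alpha) v = 0
    return alpha, u                           # (A - alpha) u = 0

# Sample matrix with eigenvalues 3, 1 and -1 (illustrative choice).
A = [[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, -1.0]]
pairs = []
for _ in range(len(A)):
    pairs.append(next_eigenpair(A, [u for _, u in pairs]))
eigenvalues = [alpha for alpha, _ in pairs]
```

On this matrix the first step finds the eigenvalue of largest absolute value, 3. On the remaining subspace the eigenvalues 1 and $-1$ have the same absolute value, which is exactly the situation the passage through $A^2$ is designed to handle.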
Theorem 2.18. Suppose $H$ is a Hilbert space and $A : H \to H$ is a compact
symmetric operator. Then there exists a sequence of real eigenvalues $\alpha_j$
converging to 0. The corresponding normalized eigenvectors $u_j$ form an
orthonormal set and every $f \in H$ can be written as
\[
  f = \sum_{j=1}^{\infty} \langle u_j, f \rangle u_j + h, \tag{2.50}
\]
where $h$ is in the kernel of $A$, that is, $A h = 0$.
In particular, if 0 is not an eigenvalue, then the eigenvectors form an
orthonormal basis (in addition, $H$ need not be complete in this case).

Proof. Existence of the eigenvalues $\alpha_j$ and the corresponding eigenvectors
has already been established. If the eigenvalues should not converge to zero,
there is a subsequence such that $v_k = \alpha_{j_k}^{-1} u_{j_k}$ is a bounded sequence for
which $A v_k$ has no convergent subsequence since
$\|A v_k - A v_l\|^2 = \|u_{j_k} - u_{j_l}\|^2 = 2$.
Next, setting
\[
  f_n = \sum_{j=1}^{n} \langle u_j, f \rangle u_j, \tag{2.51}
\]
we have
\[
  \|A (f - f_n)\| \le |\alpha_n| \, \|f - f_n\| \le |\alpha_n| \, \|f\| \tag{2.52}
\]
since $f - f_n \in H_n$. Letting $n \to \infty$ shows $A (f_\infty - f) = 0$ proving (2.50). □

Remark: There are two cases where our procedure might fail to construct
an orthonormal basis of eigenvectors. One case is where there is
an infinite number of nonzero eigenvalues. In this case $\alpha_n$ never reaches 0
and all eigenvectors corresponding to 0 are missed. In the other case, 0 is
reached, but there might not be a countable basis and hence again some of
the eigenvectors corresponding to 0 are missed. In any case one can show
that by adding vectors from the kernel (which are automatically eigenvectors),
one can always extend the eigenvectors $u_j$ to an orthonormal basis of
eigenvectors.
This is all we need and it remains to apply these results to Sturm-Liouville
operators.
Problem 2.2. Show that if $A$ is bounded, then every eigenvalue $\alpha$ satisfies
$|\alpha| \le \|A\|$.

2.6. Applications to Sturm-Liouville operators


Now, after all this hard work, we can show that our Sturm-Liouville operator
\[
  L = -\frac{d^2}{dx^2} + q(x), \tag{2.53}
\]
where $q$ is continuous and real, defined on
\[
  D(L) = \{f \in C^2[0, 1] \mid f(0) = f(1) = 0\} \subset L^2(0, 1) \tag{2.54}
\]
has an orthonormal basis of eigenfunctions.
We first verify that $L$ is symmetric:
\begin{align*}
  \langle f, L g \rangle &= \int_0^1 f(x)^* \big( -g''(x) + q(x) g(x) \big) \, dx \\
  &= \int_0^1 f'(x)^* g'(x) \, dx + \int_0^1 f(x)^* q(x) g(x) \, dx \\
  &= \int_0^1 -f''(x)^* g(x) \, dx + \int_0^1 f(x)^* q(x) g(x) \, dx \tag{2.55} \\
  &= \langle L f, g \rangle.
\end{align*}
Here we have used integration by parts twice (the boundary terms vanish
due to our boundary conditions $f(0) = f(1) = 0$ and $g(0) = g(1) = 0$).
Next we would need to show that $L$ is compact. But this task is bound
to fail, since $L$ is not even bounded (Problem 1.10)!
So here comes the trick: If $L$ is unbounded its inverse $L^{-1}$ might still be
bounded. $L^{-1}$ might even be compact and this is the case here! Since $L$ might
not be injective (0 might be an eigenvalue), we consider $R_L(z) = (L - z)^{-1}$,
$z \in \mathbb{C}$, which is also known as the resolvent of $L$.
A straightforward computation
\begin{align*}
  f(x) = {} & \frac{u_+(z, x)}{W(z)} \Big( \int_0^x u_-(z, t) g(t) \, dt \Big) \\
  & + \frac{u_-(z, x)}{W(z)} \Big( \int_x^1 u_+(z, t) g(t) \, dt \Big), \tag{2.56}
\end{align*}
verifies that $f$ satisfies $(L - z) f = g$, where $u_\pm(z, x)$ are the solutions of the
homogeneous differential equation $-u_\pm''(z, x) + (q(x) - z) u_\pm(z, x) = 0$ satisfying
the initial conditions $u_-(z, 0) = 0$, $u_-'(z, 0) = 1$ respectively $u_+(z, 1) = 0$,
$u_+'(z, 1) = 1$ and
\[
  W(z) = W(u_+(z), u_-(z)) = u_-'(z, x) u_+(z, x) - u_-(z, x) u_+'(z, x) \tag{2.57}
\]
is the Wronski determinant, which is independent of $x$ (check this!).
Note that $z$ is an eigenvalue if and only if $W(z) = 0$. In fact, in this
case $u_+(z, x)$ and $u_-(z, x)$ are linearly dependent and hence
$u_-(z, 1) = c \, u_+(z, 1) = 0$ which shows that $u_-(z)$ satisfies both boundary conditions
and is thus an eigenfunction.
Introducing the Green function
\[
  G(z, x, t) = \frac{1}{W(u_+(z), u_-(z))}
  \begin{cases}
    u_+(z, x) u_-(z, t), & x \ge t, \\
    u_+(z, t) u_-(z, x), & x \le t,
  \end{cases} \tag{2.58}
\]
we see that $(L - z)^{-1}$ is given by
\[
  (L - z)^{-1} g(x) = \int_0^1 G(z, x, t) g(t) \, dt. \tag{2.59}
\]
Moreover, from $G(z, x, t) = G(z, t, x)$ it follows that $(L - z)^{-1}$ is symmetric
for $z \in \mathbb{R}$ (Problem 2.3) and from Lemma 2.14 it follows that it is compact.
Hence Theorem 2.18 applies to $(L - z)^{-1}$ and we obtain:
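
As a sanity check of (2.58) and (2.59), consider the simplest case $q = 0$ and $z = 0$, an assumption made only for this illustration. Then $u_-(x) = x$, $u_+(x) = x - 1$ and $W = -1$, so that $G(0, x, t) = t(1 - x)$ for $x \ge t$. Since $\sin(\pi x)$ satisfies the boundary conditions and $L \sin(\pi x) = \pi^2 \sin(\pi x)$, applying the integral operator to it should return $\sin(\pi x)/\pi^2$. The midpoint rule and the number of nodes are arbitrary choices of this sketch.

```python
import math

def green(x, t):
    """Green function (2.58) for q = 0, z = 0.

    Here u_-(x) = x and u_+(x) = x - 1, so W = u_-' u_+ - u_- u_+' = -1.
    """
    u_minus = lambda s: s
    u_plus = lambda s: s - 1.0
    wronskian = -1.0
    if x >= t:
        return u_plus(x) * u_minus(t) / wronskian
    return u_plus(t) * u_minus(x) / wronskian

def resolvent(g, x, n=2000):
    """Approximate (L^{-1} g)(x) = int_0^1 G(x, t) g(t) dt by the midpoint rule."""
    h = 1.0 / n
    return h * sum(green(x, (k + 0.5) * h) * g((k + 0.5) * h) for k in range(n))

g = lambda t: math.sin(math.pi * t)
# Deviation from the expected value sin(pi x) / pi^2 at a few sample points.
errors = [abs(resolvent(g, x) - g(x) / math.pi ** 2) for x in (0.1, 0.25, 0.5, 0.9)]
```

In the notation of the proof of Theorem 2.19 (with $\lambda = 0$) this corresponds to the eigenvalue $\alpha = 1/\pi^2$ of the resolvent and the eigenvalue $E = 1/\alpha = \pi^2$ of $L$.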
Theorem 2.19. The Sturm-Liouville operator $L$ has a countable number of
eigenvalues $E_n$. All eigenvalues are discrete and simple. The corresponding
normalized eigenfunctions $u_n$ form an orthonormal basis for $L^2(0, 1)$.

Proof. Pick a value $\lambda \in \mathbb{R}$ such that $R_L(\lambda)$ exists. By Theorem 2.18 there
are eigenvalues $\alpha_n$ of $R_L(\lambda)$ with corresponding eigenfunctions $u_n$. Moreover,
$R_L(\lambda) u_n = \alpha_n u_n$ is equivalent to $L u_n = (\lambda + \frac{1}{\alpha_n}) u_n$, which shows that
$E_n = \lambda + \frac{1}{\alpha_n}$ are eigenvalues of $L$ with corresponding eigenfunctions $u_n$.
Now everything follows from Theorem 2.18 except that the eigenvalues are
simple. To show this, observe that if $u_n$ and $v_n$ are two different eigenfunctions
corresponding to $E_n$, then $u_n(0) = v_n(0) = 0$ implies $W(u_n, v_n) = 0$
and hence $u_n$ and $v_n$ are linearly dependent. □
Problem 2.3. Show that the integral operator
\[
  (K f)(x) = \int_0^1 K(x, y) f(y) \, dy, \tag{2.60}
\]
where $K(x, y) \in C([0, 1] \times [0, 1])$, is symmetric if $K(x, y)^* = K(y, x)$.
Chapter 3

Banach spaces

Glossary of notations

$C(H)$ . . . set of compact operators
$C(U)$ . . . set of continuous functions from $U$ to $\mathbb{C}$
$C(U, V)$ . . . set of continuous functions from $U$ to $V$
$C_0^\infty(U, V)$ . . . set of compactly supported smooth functions
$\chi_\Omega(.)$ . . . characteristic function of the set $\Omega$
dim . . . dimension of a linear space
$D(.)$ . . . domain of an operator
$e$ . . . exponential function, $e^z = \exp(z)$
hull(.) . . . convex hull
$H$ . . . a Hilbert space
$i$ . . . complex unity, $i^2 = -1$
Im(.) . . . imaginary part of a complex number
inf . . . infimum
Ker(A) . . . kernel of an operator $A$
$L(X, Y)$ . . . set of all bounded linear operators from $X$ to $Y$
$L(X) = L(X, X)$
max . . . maximum
$\mathbb{N}$ . . . the set of positive integers
$\mathbb{N}_0 = \mathbb{N} \cup \{0\}$
Ran(A) . . . range of an operator $A$
Re(.) . . . real part of a complex number
$\mathbb{R}$ . . . the set of real numbers
sup . . . supremum
supp . . . support of a function
span(M) . . . set of finite linear combinations from $M$
$\mathbb{Z}$ . . . the set of integers
$I$ . . . identity operator
$\sqrt{z}$ . . . square root of $z$ with branch cut along $(-\infty, 0)$
$z^*$ . . . complex conjugation
$\|.\|$ . . . norm
$\|.\|_p$ . . . norm in the Banach space $L^p$
$\langle ., .. \rangle$ . . . scalar product in $H$
$\oplus$ . . . orthogonal sum of linear spaces or operators
$\partial$ . . . gradient
$\partial_\alpha$ . . . partial derivative
$M^\perp$ . . . orthogonal complement
$(\lambda_1, \lambda_2) = \{\lambda \in \mathbb{R} \mid \lambda_1 < \lambda < \lambda_2\}$, open interval
$[\lambda_1, \lambda_2] = \{\lambda \in \mathbb{R} \mid \lambda_1 \le \lambda \le \lambda_2\}$, closed interval
