
OPERATOR THEORY

These lecture notes are based on the courses Operator Theory developed at
King’s College London by G. Barbatis, E.B. Davies and J.A. Erdos, and
Functional Analysis II developed at the University of Sussex by P.J. Bushell,
D.E. Edmunds and D.G. Vassiliev. As usual, all errors are entirely my responsibility. The same applies to the accompanying exercise sheets.

1 Introduction
Spaces
By IF we will always denote either the field IR of real numbers or the field
C of complex numbers.

1.1. Definition A norm on a vector space X over IF is a map k · k : X → [0, ∞) satisfying the conditions
(i) kxk = 0 ⇐⇒ x = 0;
(ii) kλxk = |λ|kxk, ∀x ∈ X, ∀λ ∈ IF;
(iii) kx + yk ≤ kxk + kyk, ∀x, y ∈ X (the triangle inequality).
A vector space X equipped with a norm is called a normed space.

1.2. Definition A normed space X is called a Banach space if it is complete, i.e. if every Cauchy sequence in X is convergent.

1.3. Examples
1. Any finite-dimensional normed space is a Banach space.
2. Let lp , 1 ≤ p ≤ ∞, denote the vector space of all sequences x = (xk )k∈IN , xk ∈ IF, such that

kxkp := ( Σ∞k=1 |xk |p )1/p < ∞ for 1 ≤ p < ∞,    kxk∞ := supk∈IN |xk | < ∞.

Then k · kp is a norm on lp and lp , 1 ≤ p ≤ ∞, are Banach spaces.
3. Let C([0, 1]) be the space of all continuous functions on [0, 1] and

kf kp := ( ∫01 |f (t)|p dt )1/p , 1 ≤ p < ∞,    kf k∞ := sup0≤t≤1 |f (t)|.

Then each k · kp is a norm on C([0, 1]).


The space C([0, 1]) with the norm k · k∞ is a Banach space.
The space C([0, 1]) with the norm k · kp , 1 ≤ p < ∞, is not a Banach
space.
Indeed, let us consider the sequence (fn ) in C([0, 1]), where

fn (t) = 0 if 0 ≤ t ≤ 1/2 − 1/(2n),
fn (t) = nt − (n − 1)/2 if 1/2 − 1/(2n) < t < 1/2 + 1/(2n),
fn (t) = 1 if 1/2 + 1/(2n) ≤ t ≤ 1.

It is easily seen that each fn is a piece-wise linear function, 0 ≤ fn ≤ 1 and

kfn − fm kp < ( ∫ 1 dt )1/p ≤ ( max{1/n, 1/m} )1/p → 0, as m, n → ∞,

where the integral is taken over the interval [1/2 − max{1/(2n), 1/(2m)}, 1/2 + max{1/(2n), 1/(2m)}].

So, (fn ) is a Cauchy sequence with respect to k·kp . Now, suppose there exists f ∈ C([0, 1])
such that kf − fn kp → 0. Taking into account that for an arbitrary δ ∈]0, 1/2[ there exists
N such that
fn = 1 on [1/2 + δ, 1], ∀n > N,
we obtain

∫1/2+δ1 |1 − f (t)|p dt = ∫1/2+δ1 |fn (t) − f (t)|p dt ≤ kf − fn kpp → 0 as n → ∞.

Since the first integral is independent of n it has to be zero, which implies that f = 1 on
[1/2 + δ, 1] and hence on ]1/2, 1] since δ was arbitrary. A similar argument shows that
f = 0 on [0, 1/2[. Thus f is discontinuous at t = 1/2 and we obtain a contradiction.

1.4. Definition Normed spaces X and Y are called isomorphic if there exists a linear isometry from X onto Y . (An isometry is a map which preserves the norm of every point.)

1.5. Theorem For any normed space (X, k · k) there exists a Banach
space (X 0 , k · k0 ) and a linear isometry from X onto a dense linear subspace
of X 0 . Two Banach spaces in which (X, k · k) can be so embedded are isomorphic.

1.6. Definition The space (X 0 , k · k0 ) from Theorem 1.5 is called the completion of (X, k · k).
The completion of the space (C([0, 1]), k · kp ), 1 ≤ p < ∞, is the Banach space Lp ([0, 1]) from the theory of the Lebesgue integral.

Operators
1.7. Definition Let X and Y be vector spaces. A map B : X → Y is
called a linear operator (map) if
B(λx + µz) = λBx + µBz, ∀x, z ∈ X, ∀λ, µ ∈ IF.
1.8. Theorem Let X and Y be normed spaces. For a linear operator
B : X → Y the following statements are equivalent:
(i) B is continuous;
(ii) B is continuous at 0;
(iii) there exists a constant C < +∞ such that kBxk ≤ Ckxk, ∀x ∈ X.

1.9. Definition A linear operator B is bounded if it satisfies the last (and hence all) of the three conditions above. If B is bounded we define its norm by the equality

kBk := inf{C : kBxk ≤ Ckxk, ∀x ∈ X}.
It is easy to see that

kBxk ≤ kBk kxk, ∀x ∈ X, (1.1)

kBk = inf{C : kBxk ≤ C, all x s.t. kxk ≤ 1} = inf{C : kBxk ≤ C, all x s.t. kxk = 1} = sup{kBxk/kxk : x ≠ 0} = sup{kBxk : kxk ≤ 1} = sup{kBxk : kxk = 1}

(prove these relations!).
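For a matrix acting on a finite-dimensional space these relations can be checked numerically. The following sketch is not part of the original notes: it assumes the Euclidean norm on IR^4 and numpy, estimates kBk as sup{kBxk : kxk = 1} by sampling the unit sphere, and compares the estimate with the exact value (which, for the Euclidean norm, is the largest singular value of B).

import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))            # a linear operator on R^4

xs = rng.standard_normal((4, 100_000))
xs /= np.linalg.norm(xs, axis=0)           # random points on the unit sphere
estimate = np.max(np.linalg.norm(B @ xs, axis=0))   # sampled sup{||Bx|| : ||x|| = 1}

exact = np.linalg.norm(B, 2)               # largest singular value of B
print(estimate, exact)                     # estimate <= exact, and close to it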

1.10. Theorem Let X, Y and Z be normed spaces. Then k · k is indeed a norm on the vector space B(X, Y ) of all bounded linear operators from X into Y, and

kABk ≤ kAkkBk, ∀B ∈ B(X, Y ), ∀A ∈ B(Y, Z). (1.2)

Moreover, if Y is complete then B(X, Y ) is a Banach space.

Let X be a Banach space. Theorem 1.10 says that

B(X) := B(X, X)

is actually what we call a Banach algebra.

A vector space E is called an algebra if for any pair (x, y) ∈ E × E a unique product
xy ∈ E is defined with the properties

(xy)z = x(yz),

x(y + z) = xy + xz,
(x + y)z = xz + yz,
λ(xy) = (λx)y = x(λy),
for all x, y, z ∈ E and scalars λ.
E is called an algebra with identity if it contains an element e such that

ex = xe = x, ∀x ∈ E.

A normed algebra is a normed space which is an algebra such that

kxyk ≤ kxkkyk, ∀x, y ∈ E,

and if E has an identity e,


kek = 1.

A Banach algebra is a normed algebra which is complete, considered as a normed space .

1.11. Definition Let X and Y be subspaces of X̃ and Ỹ . An operator B̃ : X̃ → Ỹ is said to be an extension of B : X → Y if B̃x = Bx, ∀x ∈ X.

1.12. Theorem Let X and Y be Banach spaces and D be a dense lin-
ear subspace of X. Let A be a bounded linear operator from D (equipped
with the X−norm) into Y. Then there exists a unique extension of A to
a bounded linear operator Ā : X → Y defined on the whole X; moreover
kAk = kĀk.

1.13. Example Let X = Y be the space L2 ([a, b]), i.e. the completion
of D = (C([a, b]), k · k2 ), (−∞ < a < b < +∞). Consider the operator
A : C([a, b]) → C([a, b]) ⊂ L2 ([a, b]),
(Af )(t) = ∫ab k(t, τ )f (τ ) dτ, where k ∈ C([a, b]2 ).
Theorem 1.12 allows one to extend this operator to a bounded linear operator
Ā : L2 ([a, b]) → L2 ([a, b]), since D is dense in its completion X = L2 ([a, b]).
We only need to prove that A : D → D is a bounded operator.
Denote K = max(t,τ )∈[a,b]2 |k(t, τ )|.

Using the Cauchy–Schwarz inequality we obtain

|(Af )(t)| = | ∫ab k(t, τ )f (τ ) dτ | ≤ ( ∫ab |k(t, τ )|2 dτ )1/2 ( ∫ab |f (τ )|2 dτ )1/2 ≤ K √(b − a) kf k2 , ∀f ∈ C([a, b]).

Therefore

kAf k2 = ( ∫ab |(Af )(t)|2 dt )1/2 ≤ K(b − a)kf k2 , ∀f ∈ C([a, b]),

i.e. A is bounded and kAk ≤ K(b − a).


(Let Y be the Banach space (C([a, b]), k · k∞ ). Then the above proof shows that A can be extended to a bounded linear operator Ā : L2 ([a, b]) → C([a, b]), kAk ≤ K √(b − a).)
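The bound kAk ≤ K(b − a) can be tested numerically by discretizing the integral. The sketch below is my own illustration, not part of the notes; it assumes a = 0, b = 1, the hypothetical kernel k(t, τ ) = cos(tτ ), a Riemann-sum approximation of the integrals and numpy.

import numpy as np

a, b, m = 0.0, 1.0, 2000
t = np.linspace(a, b, m)
h = (b - a) / m
k = np.cos(np.outer(t, t))                 # hypothetical kernel k(t, tau) = cos(t*tau)
K = np.max(np.abs(k))

f = np.sin(3 * t) + t**2                   # a sample continuous function
Af = k @ f * h                             # Riemann sum for (Af)(t) = int k(t,tau) f(tau) dtau
l2 = lambda g: np.sqrt(h * np.sum(np.abs(g) ** 2))   # discrete L2 norm on [a, b]
print(l2(Af), K * (b - a) * l2(f))         # the first number does not exceed the second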

1.14. Definition Let X be a normed space and xn ∈ X, n ∈ IN. We say that the series Σ∞n=1 xn is convergent to a vector x ∈ X if the sequence (Sj ) of partial sums, Sj = Σjn=1 xn , converges to x. We then write

Σ∞n=1 xn = x.

We say that the series converges absolutely if

Σ∞n=1 kxn k < ∞.

1.15. Theorem Any absolutely convergent series in a Banach space is convergent.

Proof: For m > k we have

kSm − Sk k = k Σmn=k+1 xn k ≤ Σmn=k+1 kxn k → 0, as k, m → ∞.

Hence the sequence (Sj ) is Cauchy and therefore converges to some x ∈ X. 2

2 Spectral theory of bounded linear operators

Auxiliary results
Let us recall that for any normed spaces X and Y we denote by B(X, Y )
the space of all bounded linear operators acting from X into Y . We also use the notation B(X) = B(X, X).

2.1. Definition Let A be a linear operator from a vector space X into a vector space Y. The kernel of A is the set

Ker(A) := {x ∈ X : Ax = 0}.

The range of the operator A is the set

Ran(A) := {Ax : x ∈ X}.

Ker(A) and Ran(A) are linear subspaces of X and Y correspondingly.


(Why?) Moreover, if X and Y are normed spaces and A ∈ B(X, Y ), then
Ker(A) is closed (why?); this is not necessarily true for Ran(A).

2.2. Theorem (Banach) Let X and Y be Banach spaces and let B ∈
B(X, Y ) be one-to-one and onto (i.e. Ker(B) = {0} and Ran(B) = Y ).
Then the inverse operator B −1 : Y → X is bounded, i.e. B −1 ∈ B(Y, X).

Proof: The proof of this fundamental result can be found in any textbook on
functional analysis. 2

Let X and Y be vector spaces and let an operator B : X → Y have a right inverse and a left inverse, Br−1 , Bl−1 : Y → X,

Bl−1 B = IX , BBr−1 = IY .
(By I we always denote the identity operator: Ix = x, ∀x ∈ X. Subscript
(if any) indicates the space on which the identity operator acts.) Then B
has a two-sided inverse operator B −1 = Br−1 = Bl−1 . Indeed,
Br−1 = IX Br−1 = (Bl−1 B)Br−1 = Bl−1 (BBr−1 ) = Bl−1 IY = Bl−1 .

2.3. Lemma Let X, Y and Z be vector spaces.


(i) If operators B : X → Y and T : Y → Z are invertible, then T B : X → Z
is invertible too and (T B)−1 = B −1 T −1 .
(ii) If operators B, T : X → X commute: T B = BT , then T B : X → X is
invertible if and only if both B and T are invertible.
(iii) If operators B, T : X → X commute and B is invertible, then B −1 and
T also commute.

Proof: (i)
B −1 T −1 T B = B −1 B = IX , T BB −1 T −1 = T T −1 = IZ .
(ii) According to (i) we have to prove only that the invertibility of T B implies
the invertibility of B and T . Let S : X → X be the inverse of T B, i.e.
ST B = T BS = I. It is clear that ST is a left inverse of B. Since B and
T commute, we have BT S = I, i.e. T S is a right inverse of B. Hence B
is invertible and its inverse equals ST = T S. Similarly we prove that T is
invertible.
(iii)
B −1 T = B −1 T BB −1 = B −1 BT B −1 = T B −1 . 2

2.4. Lemma Let X be a Banach space, B ∈ B(X) and kBk < 1. Then I − B is invertible,

(I − B)−1 = Σ∞n=0 B n (2.1)

and k(I − B)−1 k ≤ 1/(1 − kBk).

Proof: It follows from (1.2) that

kB n k ≤ kB n−1 kkBk ≤ · · · ≤ kBkn , ∀n ∈ IN.

Hence the series in the right-hand side of (2.1) is absolutely convergent and, consequently, convergent in B(X) (see Theorems 1.10 and 1.15). Let us denote its sum by R ∈ B(X). We have

(I − B)R = (I − B) limN →∞ ΣNn=0 B n = limN →∞ (I − B N +1 ) = I,

because kB N +1 k ≤ kBkN +1 → 0 as N → ∞. Analogously we prove that R(I − B) = I. Thus (I − B)−1 = R ∈ B(X). Further,

k(I − B)−1 k = limN →∞ k ΣNn=0 B n k ≤ limN →∞ ΣNn=0 kBkn = limN →∞ (1 − kBkN +1 )/(1 − kBk) = 1/(1 − kBk). 2
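For matrices the Neumann series (2.1) is easy to observe numerically. The sketch below is an added illustration (it assumes the matrix 2-norm and numpy): it sums the series for a random B with kBk = 0.9 < 1 and compares the result with a direct inverse of I − B, together with the norm bound of Lemma 2.4.

import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5))
B *= 0.9 / np.linalg.norm(B, 2)            # rescale so that ||B|| = 0.9 < 1

R = np.zeros((5, 5))
P = np.eye(5)                              # P runs through the powers B^n
for n in range(200):
    R += P
    P = P @ B

exact = np.linalg.inv(np.eye(5) - B)
print(np.linalg.norm(R - exact, 2))        # ~ 0: the partial sums converge to (I - B)^{-1}
print(np.linalg.norm(exact, 2), 1 / (1 - 0.9))   # the bound k(I - B)^{-1}k <= 1/(1 - kBk)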
2.5. Lemma Let X and Y be Banach spaces, A, B ∈ B(X, Y ). Let A be invertible and kBk < 1/kA−1 k. Then A + B is invertible,

(A + B)−1 = ( Σ∞n=0 (−A−1 B)n ) A−1 = A−1 ( Σ∞n=0 (−BA−1 )n )

and

k(A + B)−1 k ≤ kA−1 k / (1 − kBkkA−1 k).
Proof: We have
A + B = A(I + A−1 B) = (I + BA−1 )A
and kA−1 Bk ≤ kA−1 kkBk < 1, kBA−1 k ≤ kBkkA−1 k < 1. Now it is left to
apply Lemmas 2.3(i) and 2.4. 2

The spectrum
In this subsection we will deal only with complex vector spaces, i.e. with the case IF = C, unless stated otherwise.

2.6. Definition Let X be a Banach space and B ∈ B(X). The resolvent set ρ(B) of the operator B is defined to be the set of all λ ∈ C such that B − λI has an inverse operator (B − λI)−1 ∈ B(X). The spectrum σ(B) of the operator B is the complement of ρ(B):

σ(B) = C\ρ(B),

i.e. σ(B) is the set of all λ ∈ C such that B − λI is not invertible on X.


A complex number λ is called an eigenvalue of B if there exists x ∈ X\{0}
such that Bx = λx. In this case x is called an eigenvector of B corresponding
to the eigenvalue λ.

2.7. Lemma All eigenvalues of B belong to σ(B).

Proof: Suppose λ is an eigenvalue and x 6= 0 is a corresponding eigenvector


of B. Then (B − λI)x = 0. Since we have also (B − λI)0 = 0, the operator
B − λI is not one–to–one. So, B − λI is not invertible on X, i.e. λ ∈ σ(B).
2
On the other hand the set of all eigenvalues may be smaller than the spec-
trum.

2.8. Examples
1. If the space X is finite-dimensional then the spectrum of B ∈ B(X) co-
incides with the set of all its eigenvalues, i.e. with the set of zeros of the
determinant of a matrix corresponding to B − λI (with respect to a given
basis of X).
2. Let X be one of the Banach spaces C([0, 1]), Lp ([0, 1]), 1 ≤ p ≤ ∞, and
let B be defined by the formula

Bf (t) = tf (t), t ∈ [0, 1].

Then σ(B) = [0, 1] but B does not have eigenvalues (exercise).

2.9. Lemma Let X be a Banach space and B ∈ B(X). Then σ(B) is
a compact set and
σ(B) ⊂ {λ ∈ C : |λ| ≤ kBk}. (2.2)
Proof: Suppose |λ| > kBk. Then

kBk < |λ| = 1/|λ−1 | = 1/k − λ−1 Ik = 1/k(−λI)−1 k.

Hence, according to Lemma 2.5 the operator B − λI is invertible, i.e. λ ∉ σ(B). So, we have proved (2.2).
Let us take an arbitrary λ0 ∈ ρ(B). Then for any λ ∈ C such that |λ − λ0 | <
1/k(B − λ0 I)−1 k we conclude from Lemma 2.5 and the representation

B − λI = (B − λ0 I) + (λ0 − λ)I (2.3)

that the operator B − λI is invertible, i.e. λ ∈ ρ(B). Hence ρ(B) is an open


set and its complement σ(B) is closed. So, σ(B) is a bounded (see (2.2))
closed set, i.e. a compact set. 2

2.10. Definition Let X be a Banach space and B ∈ B(X). The operator-


valued function

ρ(B) ∋ λ 7→ R(B; λ) := (B − λI)−1 ∈ B(X)

is called the resolvent of the operator B.

2.11. Lemma (the resolvent equation)

R(B; λ) − R(B; λ0 ) = (λ − λ0 )R(B; λ)R(B; λ0 ), ∀λ, λ0 ∈ ρ(B).

Proof:

R(B; λ) − R(B; λ0 ) = (B − λI)−1 − (B − λ0 I)−1 =


(B − λI)−1 ((B − λ0 I) − (B − λI))(B − λ0 I)−1 =
(λ − λ0 )R(B; λ)R(B; λ0 ). 2

2.12. Lemma If operators A, B ∈ B(X) commute: AB = BA, then for


any λ ∈ ρ(B) and µ ∈ ρ(A) the operators A, B, R(A; µ) and R(B; λ) all

commute with each other.
Proof: It is clear that the operators A, B, A − µI and B − λI commute. Now
the desired result follows from Lemma 2.3(iii). 2

Taking A = B in the last lemma we obtain the following result.


2.13. Corollary For any λ, µ ∈ ρ(B) the operators B, R(B; λ) and R(B; µ)
commute. 2

For any normed space X we denote by X ∗ its dual (= adjoint) space, i.e.
the space of all bounded linear functionals on X, i.e. B(X, IF).

2.14. Theorem Let Z be a complex Banach space, Ω ⊂ C be an open


subset of the complex plane and F : Ω → Z be a Z–valued function. Then
the following statements are equivalent:
(i) for any λ0 ∈ Ω there exists the derivative

(dF/dλ)(λ0 ) = F 0 (λ0 ) := limλ→λ0 (λ − λ0 )−1 (F (λ) − F (λ0 )) ∈ Z,

i.e.

k (λ − λ0 )−1 (F (λ) − F (λ0 )) − F 0 (λ0 ) k → 0 as λ → λ0 ;
(ii) any λ0 ∈ Ω has a neighbourhood where F (λ) can be represented by an absolutely convergent series

F (λ) = Σ∞n=0 (λ − λ0 )n Fn (λ0 ), Fn (λ0 ) ∈ Z;

(iii) for any G ∈ Z ∗ the complex–valued function

Ω ∋ λ 7→ G(F (λ)) ∈ C

is holomorphic (= analytic) in Ω.
If Z = B(X, Y ) for some Banach spaces X and Y , then for the operator–
valued function F the above statements are equivalent to
(iv) for any x ∈ X and g ∈ Y ∗ the complex–valued function

Ω ∋ λ 7→ g(F (λ)x) ∈ C

is holomorphic (= analytic) in Ω.

Proof: It is clear that each of the statements (i) and (ii) implies (iii). (Why?)
It is also obvious that (iii) implies (iv). Indeed, for any x ∈ X and g ∈ Y ∗
the mapping
B(X, Y ) ∋ B 7→ g(Bx) ∈ C
is a bounded linear functional on B(X, Y ). The opposite implications are
very non–trivial. We will not give the proof here. It can be found, e.g., in E.
Hille & R.S. Phillips, Functional Analysis and Semi–Groups, Ch. III, Sect.
2. 2

2.15. Definition A vector–valued (operator–valued) function is called holomorphic (= analytic) in Ω if it satisfies the conditions (i)–(iii) (conditions (i)–(iv)) of the above theorem.

2.16. Remark We will not use the non–trivial part of Theorem 2.14. We
will only use the fact that each of the statements (i) and (ii) implies (iii) and
(iv). In the proof of the following theorem we will check that the resolvent
satisfies both (i) and (ii).

2.17. Theorem The B(X)–valued function R(B; ·) is analytic in ρ(B) and has the following properties:

(dR(B; λ)/dλ)|λ=λ0 = R2 (B; λ0 ), ∀λ0 ∈ ρ(B), (2.4)

−λR(B; λ) → I as |λ| → ∞, (2.5)

kR(B; λ)k ≥ 1/d(λ, σ(B)), λ ∈ ρ(B), (2.6)

where d(λ, K) := inf{|λ − µ| : µ ∈ K} denotes the distance of λ from K.

Proof: Let us take an arbitrary λ0 ∈ ρ(B). It follows from Lemma 2.5 and the
proof of Lemma 2.9 that the function R(B; ·) is bounded in a neighbourhood
of λ0 . From the resolvent equation (Lemma 2.11) we obtain

R(B; λ) → R(B; λ0 ) as λ → λ0

and

limλ→λ0 (R(B; λ) − R(B; λ0 ))/(λ − λ0 ) = limλ→λ0 R(B; λ)R(B; λ0 ) = R2 (B; λ0 ).

Thus (2.4) is valid for any point λ0 ∈ ρ(B), i.e. R(B; ·) is analytic in ρ(B). Note that (2.3) and Lemma 2.5 imply the following Taylor expansion

R(B; λ) = Σ∞n=0 (λ − λ0 )n Rn+1 (B; λ0 ), if |λ − λ0 | < 1/k(B − λ0 I)−1 k. (2.7)

Further,

k − λR(B; λ) − Ik = k(I − λ−1 B)−1 − Ik = k Σ∞n=1 λ−n B n k ≤ Σ∞n=1 |λ|−n kBkn = |λ|−1 kBk / (1 − |λ|−1 kBk) = kBk / (|λ| − kBk) → 0, as |λ| → ∞

(see Lemma 2.4) and (2.5) is proved.


Let us take an arbitrary λ0 ∈ ρ(B). The proof of Lemma 2.9 implies that
λ ∈ ρ(B) if |λ − λ0 | < 1/kR(B; λ0 )k. Hence d(λ0 , σ(B)) ≥ 1/kR(B; λ0 )k, i.e.
kR(B; λ0 )k ≥ 1/d(λ0 , σ(B)), ∀λ0 ∈ ρ(B). 2

2.18. Lemma Let X be a normed space. Then for any z ∈ X there exists
g ∈ X ∗ such that g(z) = kzk and kgk = 1.

Proof: Let X0 = lin{z} be the one–dimensional subspace of X spanned


by z:
X0 := {αz| α ∈ IF}.
We define a functional g0 : X0 → IF by the equality g0 (αz) = αkzk. It is clear
that g0 (z) = kzk and kg0 k = 1. (Why?) Due to the Hahn–Banach theorem g0
can be extended to a functional g ∈ X ∗ such that g(z) = kzk and kgk = 1. 2

2.19. Lemma σ(B) 6= ∅ for any B ∈ B(X).

Proof: Let us suppose that σ(B) = ∅ and take arbitrary x ∈ X\{0}, g ∈ X ∗ .
It follows from Theorem 2.17 that the function

f (λ) := g(R(B; λ)x)

is analytic in ρ(B) = C and f (λ) → 0 as |λ| → ∞ (see (2.5) and also


Theorem 2.14 and Remark 2.16). Liouville’s theorem implies f ≡ 0. Let us
take g ∈ X ∗ such that g(R(B; λ0 )x) = kR(B; λ0 )xk for the given x ∈ X\{0}
and some λ0 ∈ C (see Lemma 2.18). Then we obtain

0 = f (λ0 ) = g(R(B; λ0 )x) = kR(B; λ0 )xk,

i.e. R(B; λ0 )x = 0. Consequently x = (B − λ0 I)R(B; λ0 )x = 0 for x ∈ X\{0}. The obtained contradiction shows that σ(B) cannot be empty. 2

Combining Lemmas 2.9, 2.19 and Theorem 2.17 we obtain the following re-
sult.

2.20. Theorem Let X be a Banach space and B ∈ B(X). The spec-


trum of B is a non-empty compact set contained in the closed disk {λ ∈
C : |λ| ≤ kBk} and the resolvent (B − λI)−1 is an analytic B(X)–valued
function on C\σ(B). 2

2.21. Remark Let (ak ) be a sequence of real numbers. We will use the following notation:

lim infn→∞ an := limn→∞ infk≥n ak ,    lim supn→∞ an := limn→∞ supk≥n ak .

These limits exist (finite or infinite) because bn := infk≥n ak and cn := supk≥n ak are monotone. The Sandwich Theorem implies that the limit limn→∞ an exists if

lim infn→∞ an = lim supn→∞ an .

2.22. Definition Let B ∈ B(X). The spectral radius r(B) of the operator
B is defined by the equality

r(B) := sup{|λ| : λ ∈ σ(B)}.

This is the radius of the minimal disk centred at 0 and containing σ(B).

2.23. Theorem (the spectral radius formula) Let X be a Banach space and B ∈ B(X). Then

r(B) = limn→∞ kB n k1/n . (2.8)
Proof: Let us take an arbitrary λ ∈ σ(B). We have

B n − λn I = (B − λI)(λn−1 I + λn−2 B + · · · + λB n−2 + B n−1 ).

Since B−λI is not invertible and the operators in the RHS commute, B n −λn I
is not invertible (see Lemma 2.3(ii)). So, λ ∈ σ(B) implies λn ∈ σ(B n ).
Consequently

r(B)n = (sup{|λ| : λ ∈ σ(B)})n = sup{|λ|n : λ ∈ σ(B)} ≤ r(B n ).

Hence, according to (2.2) (with B n instead of B)

r(B) ≤ (r(B n ))1/n ≤ kB n k1/n ,

and therefore
r(B) ≤ lim infn→∞ kB n k1/n .

Now it is sufficient to prove that

lim supn→∞ kB n k1/n ≤ r(B). (2.9)

If λ ∈ C is such that |λ| > kBk, then λ ∉ σ(B) and

(B − λI)−1 = −λ−1 (I − λ−1 B)−1 = − Σ∞n=0 λ−n−1 B n (2.10)

(see Lemma 2.4). Let us take arbitrary x ∈ X, g ∈ X ∗ and consider the


function
f (λ) := g(R(B; λ)x).
If |λ| > kBk then (2.10) implies the following Laurent expansion

f (λ) = − Σ∞n=0 λ−n−1 g(B n x). (2.11)
Since R(B; ·) is analytic in C\σ(B) (see Theorem 2.17), so is f (see Theorem
2.14 and Remark 2.16). Hence, (2.11) is valid if |λ| > r(B). Taking λ =
aeiθ , a > r(B) and integrating the series for λm+1 f (λ) term by term with
respect to θ we have
∫02π am+1 ei(m+1)θ f (aeiθ ) dθ = − ∫02π Σ∞n=0 am−n ei(m−n)θ g(B n x) dθ = − Σ∞n=0 am−n g(B n x) ∫02π ei(m−n)θ dθ = −2πg(B m x).

Thus

|g(B m x)| ≤ (1/2π) ∫02π am+1 |g(R(B; aeiθ )x)| dθ ≤ am+1 M (a)kgkkxk,

where

M (a) := sup0≤θ≤2π kR(B; aeiθ )k

(see (1.1)). Taking g ∈ X ∗ such that g(B m x) = kB m xk, kgk = 1 (see Lemma 2.18) we obtain

kB m xk ≤ am+1 M (a)kxk, ∀x ∈ X,

i.e.

kB m k ≤ am+1 M (a), ∀m ∈ IN.

Consequently

lim supm→∞ kB m k1/m ≤ a, if a > r(B),

i.e. we have proved (2.9). 2
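For a matrix the spectral radius formula can be observed directly: r(B) is the largest modulus of an eigenvalue, while kB n k1/n is computable for moderate n. The sketch below is an added numerical illustration (numpy and the matrix 2-norm are assumed); a non-normal matrix is chosen on purpose, since then kBk is much larger than r(B) and the convergence in (2.8) is visible.

import numpy as np

B = np.array([[0.5, 10.0],
              [0.0, 0.4]])                 # non-normal: ||B|| is much larger than r(B)

r = np.max(np.abs(np.linalg.eigvals(B)))   # spectral radius, here 0.5
for n in (1, 5, 20, 100):
    Bn = np.linalg.matrix_power(B, n)
    print(n, np.linalg.norm(Bn, 2) ** (1.0 / n))   # decreases towards r as n grows
print("r(B) =", r)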

The functional calculus

Let p(λ) = ΣNn=0 an λn be a polynomial and let B ∈ B(X). It is natural to define p(B) by the equality p(B) = ΣNn=0 an B n . It turns out that one can define f (B) for any complex–valued function f which is analytic in some neighbourhood of σ(B).

Let f be a complex–valued function which is analytic in a disk {λ ∈ C : |λ| < rf }, where rf > 0 is some given number. Then we have the following Taylor expansion

f (λ) = Σ∞n=0 an λn , |λ| < rf ,

where an = f (n) (0)/n! and the series is absolutely convergent.

2.24. Definition Let X be a Banach space, B ∈ B(X) and r(B) < rf , i.e. σ(B) ⊂ {λ ∈ C : |λ| < rf }. Then we define f (B) by the equality

f (B) := Σ∞n=0 an B n . (2.12)

We need to check that this definition makes sense. Let us take ε > 0 such
that r(B) + ε < rf . It follows from the spectral radius formula that for suffi-
ciently large n we have kB n k < (r(B)+ε)n . Therefore the power series (2.12)
is absolutely convergent. Hence, f (B) ∈ B(X) (see Theorems 1.10 and 1.15).
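The series definition (2.12) is straightforward to try out on a matrix. The sketch below is an added illustration (numpy assumed, not part of the notes): it computes f (B) = exp(B) by summing the power series and compares the result with the value obtained from an eigen-decomposition of a diagonalizable B, where f (B) = V diag(f (λ1 ), . . . , f (λn )) V −1 .

import numpy as np

rng = np.random.default_rng(2)
B = 0.5 * rng.standard_normal((4, 4))

# f(B) = exp(B) via the power series sum a_n B^n with a_n = 1/n!
fB = np.zeros((4, 4))
P, fact = np.eye(4), 1.0
for n in range(60):
    fB += P / fact
    P = P @ B
    fact *= n + 1

# the same operator from an eigen-decomposition of B (diagonalizable here)
lam, V = np.linalg.eig(B)
fB_eig = (V * np.exp(lam)) @ np.linalg.inv(V)
print(np.max(np.abs(fB - fB_eig)))         # ~ 1e-15: the two computations agree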

A disadvantage of the above definition is that f has to be analytic in some


disk containing σ(B). This does not seem fair if σ(B) is far from being a
disk, e.g. is a curve. So, it is natural to look for an alternative definition of
f (B).

2.25. Definition A bounded open set Ω ⊂ C is called admissible if its


boundary ∂Ω consists of a finite number of smooth closed pairwise disjoint
curves. The orientation of ∂Ω is chosen in such a way that Ω remains on the
left as we move along ∂Ω in the positive direction.

Let Ω be an admissible set and let f be a complex–valued function which is analytic in some neighbourhood of the closure Cl(Ω) of Ω. Then we can write the Cauchy integral formula from Complex Analysis:

f (λ0 ) = (1/2πi) ∫∂Ω f (λ)/(λ − λ0 ) dλ, λ0 ∈ Ω.

The formal substitution λ0 = B gives

f (B) = (1/2πi) ∫∂Ω f (λ)(λI − B)−1 dλ.

The RHS is well defined and belongs to B(X) if ∂Ω ∩ σ(B) = ∅. Here and below the integrals are understood as norm limits of the corresponding Riemann sums. This motivates the following definition.

2.26. Definition Let X be a Banach space, B ∈ B(X). Suppose f is analytic in some (open) neighbourhood ∆f of σ(B) and Ω is an admissible set such that

σ(B) ⊂ Ω ⊂ Cl(Ω) ⊂ ∆f . (2.13)

Then

f (B) := −(1/2πi) ∫∂Ω f (λ)R(B; λ) dλ. (2.14)

It is always possible to find an admissible set satisfying (2.13). Indeed, σ(B)


is a closed bounded subset of the open set ∆f . Therefore d(σ(B), ∂∆f ) > 0.
Here d(K1 , K2 ) := inf{|λ1 − λ2 | : λ1 ∈ K1 , λ2 ∈ K2 } denotes the distance
between two sets.

2.27. Proposition The RHS of (2.14) does not depend on the choice of
an admissible set satisfying (2.13).

Proof: Let Ω and Ω1 be admissible sets satisfying (2.13). Then there exists an admissible set Ω0 such that

σ(B) ⊂ Ω0 ⊂ Cl(Ω0 ) ⊂ Ω ∩ Ω1 ⊂ ∆f .

It is sufficient to prove that

(1/2πi) ∫∂Ω f (λ)R(B; λ) dλ = (1/2πi) ∫∂Ω0 f (λ)R(B; λ) dλ, (2.15)

because the same argument will apply to the pair Ω1 and Ω0 . Let us take arbitrary x ∈ X, g ∈ X ∗ and prove that

(1/2πi) ∫∂Ω f (λ)g(R(B; λ)x) dλ = (1/2πi) ∫∂Ω0 f (λ)g(R(B; λ)x) dλ. (2.16)

The function λ 7→ f (λ)g(R(B; λ)x) ∈ C is analytic in a neighbourhood of Cl(Ω)\Ω0 . Since the boundary of this set equals ∂Ω ∪ ∂Ω0 , (2.16) follows from the Cauchy theorem.

Thus,

g( (1/2πi) ∫∂Ω f (λ)R(B; λ)x dλ − (1/2πi) ∫∂Ω0 f (λ)R(B; λ)x dλ ) = 0, ∀g ∈ X ∗ .

Consequently

(1/2πi) ∫∂Ω f (λ)R(B; λ)x dλ = (1/2πi) ∫∂Ω0 f (λ)R(B; λ)x dλ, ∀x ∈ X

(see Lemma 2.18), i.e. (2.15) holds. 2

Our aim now is to show that Definition 2.26 does not contradict the “common sense”, i.e. that in the situations where we know what f (B) is, (2.14) gives the same result.

2.28. Lemma Let Ω be an admissible neighbourhood of σ(B). Then for any λ0 ∉ Cl(Ω) we have

(B − λ0 I)m = −(1/2πi) ∫∂Ω (λ − λ0 )m R(B; λ) dλ, ∀m ∈ Z. (2.17)

In particular

R(B; λ0 ) = (B − λ0 I)−1 = −(1/2πi) ∫∂Ω (λ − λ0 )−1 R(B; λ) dλ.

Proof: Let us denote the RHS of (2.17) by Am . Since λ0 ∉ Cl(Ω), the function f (λ) := (λ − λ0 )m is analytic in some neighbourhood of Cl(Ω) and

(1/2πi) ∫∂Ω (λ − λ0 )m dλ = 0.
Using this equality and the resolvent equation (Lemma 2.11)

R(B; λ) = R(B; λ0 ) + (λ − λ0 )R(B; λ0 )R(B; λ),

we obtain

Am = −R(B; λ0 ) (1/2πi) ∫∂Ω (λ − λ0 )m dλ − R(B; λ0 ) (1/2πi) ∫∂Ω (λ − λ0 )m+1 R(B; λ) dλ = R(B; λ0 )Am+1 .

Hence

Am = (B − λ0 I)−1 Am+1 , m = 0, ±1, ±2, . . .

This recursion formula shows that (2.17) follows from the case m = 0. We thus have to prove that

−(1/2πi) ∫∂Ω R(B; λ) dλ = I.

According to Proposition 2.27 it will suffice to prove that

−(1/2πi) ∫|λ|=r R(B; λ) dλ = I

for sufficiently large r > 0. If |λ| = r > kBk we have the following absolutely and uniformly convergent expansion

(B − λI)−1 = − Σ∞n=0 λ−n−1 B n

(see (2.10)). Termwise integration gives

−(1/2πi) ∫|λ|=r R(B; λ) dλ = Σ∞n=0 B n (1/2πi) ∫|λ|=r λ−n−1 dλ = B 0 = I. 2

2.29. Corollary Let Ω be an admissible neighbourhood of σ(B). Then for any polynomial p we have

p(B) = −(1/2πi) ∫∂Ω p(λ)R(B; λ) dλ.

In particular

B n = −(1/2πi) ∫∂Ω λn R(B; λ) dλ, ∀n ∈ IN ∪ {0}.

Proof: It is sufficient to represent p(λ) in the form p(λ) = ΣNm=0 cm (λ − λ0 )m , λ0 ∉ Cl(Ω), and apply the last lemma. 2

2.30. Theorem Let f be analytic in a disk ∆f = {λ ∈ C : |λ| < rf }, where rf > r(B). Then Definitions 2.24 and 2.26 give the same result.

Proof: Termwise integration gives

−(1/2πi) ∫∂Ω f (λ)R(B; λ) dλ = −(1/2πi) ∫∂Ω ( Σ∞n=0 an λn ) R(B; λ) dλ = Σ∞n=0 an ( −(1/2πi) ∫∂Ω λn R(B; λ) dλ ) = Σ∞n=0 an B n

(see Corollary 2.29). 2

Let f and g be functions analytic in some (open) neighbourhoods ∆f and


∆g of σ(B). It follows directly from Definition 2.26 that

(αf + βg)(B) = αf (B) + βg(B), ∀α, β ∈ C. (2.18)

2.31. Theorem (f g)(B) = f (B)g(B).

Proof: Let us take admissible sets Ωf and Ωg such that

σ(B) ⊂ Ωf ⊂ Cl(Ωf ) ⊂ Ωg ⊂ Cl(Ωg ) ⊂ ∆f ∩ ∆g . (2.19)

Using the resolvent equation (Lemma 2.11)

R(B; λ)R(B; µ) = − R(B; λ)/(µ − λ) − R(B; µ)/(λ − µ), λ ≠ µ,

we obtain

f (B)g(B) = ( −(1/2πi) ∫∂Ωf f (λ)R(B; λ) dλ ) ( −(1/2πi) ∫∂Ωg g(µ)R(B; µ) dµ )
= −(1/2πi) ∫∂Ωf f (λ) ( −(1/2πi) ∫∂Ωg g(µ)R(B; λ)R(B; µ) dµ ) dλ
= −(1/2πi) ∫∂Ωf f (λ)R(B; λ) ( (1/2πi) ∫∂Ωg g(µ)/(µ − λ) dµ ) dλ − (1/2πi) ∫∂Ωf f (λ) ( (1/2πi) ∫∂Ωg (g(µ)/(λ − µ)) R(B; µ) dµ ) dλ.

Since λ ∈ Ωg for any λ ∈ ∂Ωf and µ ∉ Cl(Ωf ) for any µ ∈ ∂Ωg (see (2.19)), the Cauchy theorem implies

(1/2πi) ∫∂Ωg g(µ)/(µ − λ) dµ = g(λ),    (1/2πi) ∫∂Ωf f (λ)/(λ − µ) dλ = 0.

Therefore

f (B)g(B) = −(1/2πi) ∫∂Ωf f (λ)g(λ)R(B; λ) dλ − (1/2πi) ∫∂Ωg ( (1/2πi) ∫∂Ωf f (λ)/(λ − µ) dλ ) g(µ)R(B; µ) dµ = (f g)(B) − 0 = (f g)(B). 2

2.32. Corollary f (B)g(B) = g(B)f (B).

Proof: f (B)g(B) = (f g)(B) = (gf )(B) = g(B)f (B). 2

2.33. Theorem Let A, B ∈ B(X) and AB = BA. Suppose f and g are analytic in some neighbourhoods ∆f and ∆g of σ(A) and σ(B) correspondingly. Then f (A)g(B) = g(B)f (A).

Proof: Let us take admissible sets Ωf and Ωg such that

σ(A) ⊂ Ωf ⊂ Cl(Ωf ) ⊂ ∆f , σ(B) ⊂ Ωg ⊂ Cl(Ωg ) ⊂ ∆g .

Since AB = BA implies R(A; µ)R(B; λ) = R(B; λ)R(A; µ) for any λ ∈ ρ(B) and µ ∈ ρ(A) (see Lemma 2.12), we have

f (A)g(B) = ( −(1/2πi) ∫∂Ωf f (µ)R(A; µ) dµ ) ( −(1/2πi) ∫∂Ωg g(λ)R(B; λ) dλ )
= −(1/(4π2 )) ∫∂Ωf ∫∂Ωg f (µ)g(λ)R(A; µ)R(B; λ) dλ dµ
= −(1/(4π2 )) ∫∂Ωg ∫∂Ωf f (µ)g(λ)R(B; λ)R(A; µ) dµ dλ
= ( −(1/2πi) ∫∂Ωg g(λ)R(B; λ) dλ ) ( −(1/2πi) ∫∂Ωf f (µ)R(A; µ) dµ ) = g(B)f (A). 2

2.34. Theorem (the spectral mapping theorem) Let f be analytic
in some neighbourhood of σ(B). Then

σ(f (B)) = f (σ(B)) := {f (λ) : λ ∈ σ(B)}.

Proof: Let us prove that

σ(f (B)) ⊂ f (σ(B)). (2.20)

Let us take any ζ ∈ C\f (σ(B)). Then f − ζ ≠ 0 in some neighbourhood of σ(B). Hence the function g := 1/(f − ζ) is analytic in a neighbourhood of σ(B). Now Theorem 2.31 implies
g(B)(f (B) − ζI) = ( (f − ζ) · 1/(f − ζ) )(B) = 1(B) = I.
Similarly (f (B) − ζI)g(B) = I. Therefore g(B) = R(f (B); ζ) and ζ ∉ σ(f (B)). Thus

ζ ∉ f (σ(B)) =⇒ ζ ∉ σ(f (B)),
i.e. (2.20) holds.
It is left to prove that
f (σ(B)) ⊂ σ(f (B)). (2.21)
Let us take an arbitrary µ ∈ σ(B) and consider the function

h(λ) := (f (λ) − f (µ))/(λ − µ) if λ ≠ µ, and h(µ) := f 0 (µ).

It is clear that h is analytic in some neighbourhood of σ(B) and f (λ)−f (µ) =


(λ − µ)h(λ). It follows from Theorem 2.31 that

f (B) − f (µ)I = (B − µI)h(B).

The operator B − µI is not invertible, because µ ∈ σ(B). Since the operators B − µI and h(B) commute (see Corollary 2.32), the operator f (B) − f (µ)I cannot be invertible (see Lemma 2.3(ii)). Thus

µ ∈ σ(B) =⇒ f (µ) ∈ σ(f (B)),

i.e. (2.21) holds. 2

2.35. Theorem Let f be analytic in some neighbourhood ∆f of σ(B) and g
be analytic in some neighbourhood ∆g of f (σ(B)). Then for the composition
g ◦ f (defined by (g ◦ f )(λ) := g(f (λ))) we have

(g ◦ f )(B) = g(f (B)).

Proof: The RHS of the last equality is well defined due to the spectral map-
ping theorem.
Let us take admissible sets Ωf and Ωg such that

σ(B) ⊂ Ωf ⊂ Cl(Ωf ) ⊂ ∆f ,

σ(f (B)) = f (σ(B)) ⊂ f (Cl(Ωf )) ⊂ Ωg ⊂ Cl(Ωg ) ⊂ ∆g .


For any ζ ∈ ∂Ωg the function hζ := 1/(f − ζ) is analytic in some neighbourhood of Cl(Ωf ). Exactly as in the proof of Theorem 2.34 we prove that hζ (B) = R(f (B); ζ). Using the Cauchy theorem we obtain

g(f (B)) = −(1/2πi) ∫∂Ωg g(ζ)R(f (B); ζ) dζ = −(1/2πi) ∫∂Ωg g(ζ)hζ (B) dζ
= −(1/2πi) ∫∂Ωg g(ζ) ( −(1/2πi) ∫∂Ωf R(B; λ)/(f (λ) − ζ) dλ ) dζ
= −(1/2πi) ∫∂Ωf R(B; λ) ( (1/2πi) ∫∂Ωg g(ζ)/(ζ − f (λ)) dζ ) dλ
= −(1/2πi) ∫∂Ωf g(f (λ))R(B; λ) dλ = −(1/2πi) ∫∂Ωf (g ◦ f )(λ)R(B; λ) dλ = (g ◦ f )(B). 2

Summary

Suppose X is a Banach space and B ∈ B(X). Let A(B) be the set of


all functions f which are analytic in some neighbourhood of σ(B). (The
neighbourhood can depend on f ∈ A(B).) The map

A(B) ∋ f 7→ f (B) ∈ B(X)

defined by (2.14) has the following properties:


(i) if f, g ∈ A(B), α, β ∈ C, then αf + βg, f g ∈ A(B) and (αf + βg)(B) =
αf (B) + βg(B), (f g)(B) = f (B)g(B);
(ii) if f has the power series expansion f (λ) = Σ∞n=0 cn λn , valid in a neighbourhood of σ(B), then f (B) = Σ∞n=0 cn B n ;
(iii) σ(f (B)) = f (σ(B)) (the spectral mapping theorem);
(iv) if f ∈ A(B) and g ∈ A(f (B)), then (g ◦ f )(B) = g(f (B)).

Linear operator equations


Let X and Y be normed spaces and A ∈ B(X, Y ). The adjoint operator
A∗ : Y ∗ → X ∗ is defined by the equality

(A∗ g)(x) := g(Ax), ∀g ∈ Y ∗ , ∀x ∈ X.

It is not difficult to prove that A∗ ∈ B(Y ∗ , X ∗ ) and kA∗ k = kAk (Functional


Analysis I). It turns out that A∗ plays a very important role in the study of
the following linear operator equation

Ax = y, y ∈ Y. (2.22)

2.36. Lemma For the closure of the range of A we have

Ran(A) ⊂ {y ∈ Y : g(y) = 0, ∀g ∈ Ker(A∗ )}. (2.23)

Proof: Let us take an arbitrary y ∈ Ran(A) and g ∈ Ker(A∗ ). There exist


xn ∈ X such that yn := Axn converge to y as n → ∞. Hence

g(y) = g( lim Axn ) = lim g(Axn ) = lim (A∗ g)(xn ) = lim 0(xn ) = 0. 2
n→∞ n→∞ n→∞ n→∞

2.37. Corollary (a necessary condition of solvability) If (2.22) has a
solution then g(y) = 0, ∀g ∈ Ker(A∗ ).

Proof: (2.22) is solvable if and only if y ∈ Ran(A). 2

Now we are going to prove that (2.23) is in fact an equality. We start with
the following corollary of the Hahn–Banach theorem.

2.38. Lemma Let Y0 be a closed linear subspace of Y and y0 ∈ Y \Y0 .


Then there exists g ∈ Y ∗ such that g(y0 ) = 1, g(y) = 0 for all y ∈ Y0 and
kgk = 1/d, where d = d(y0 , Y0 ) = inf y∈Y0 ky0 − yk. (d > 0 because Y0 is
closed.)

Proof: Let
Y1 := {λy0 + y : λ ∈ IF, y ∈ Y0 }.
Define a functional g1 : Y1 → IF by the equality

g1 (λy0 + y) = λ.

It is clear that g1 is linear and g1 (y0 ) = 1, g1 (y) = 0 for all y ∈ Y0 . Further,


kg1 k = supλy0 +y≠0 |g1 (λy0 + y)|/kλy0 + yk = supλy0 +y≠0 |λ|/kλy0 + yk = supλ≠0, y∈Y0 |λ|/kλy0 + yk = supλ≠0, y∈Y0 1/ky0 + λ−1 yk = supz∈Y0 1/ky0 + zk = 1/infz∈Y0 ky0 + zk = 1/infw∈Y0 ky0 − wk = 1/d.
Due to the Hahn–Banach theorem g1 can be extended to a functional g ∈ Y ∗
such that g(y0 ) = 1, g(y) = 0 for all y ∈ Y0 and kgk = 1/d. 2

2.39. Theorem

Ran(A) = {y ∈ Y : g(y) = 0, ∀g ∈ Ker(A∗ )}.

Proof: According to Lemma 2.36 it is sufficient to prove that

N (A) := {y ∈ Y : g(y) = 0, ∀g ∈ Ker(A∗ )} ⊂ Ran(A).

Suppose the contrary: there exists y0 ∈ N (A), y0 6∈ Ran(A). Lemma 2.38
implies the existence of g ∈ Y ∗ such that g(y0 ) = 1, g(y) = 0 for all y ∈
Ran(A). Then
(A∗ g)(x) = g(Ax) = 0, ∀x ∈ X,
i.e A∗ g = 0, i.e. g ∈ Ker(A∗ ). Since y0 ∈ N (A), we have g(y0 ) = 0. Contra-
diction! 2

2.40. Corollary Suppose Ran(A) is closed. Then (2.22) is solvable if and


only if g(y) = 0, ∀g ∈ Ker(A∗ ).

Proof: (2.22) is solvable if and only if y ∈ Ran(A) = Ran(A). 2

Projections
2.41. Definition Let X be a normed space. An operator B ∈ B(X) is
called a projection if it is idempotent, i.e. B 2 = B.

2.42. Lemma Let P ∈ B(X) be a projection. Then Q := I − P is a


projection, P Q = QP = 0 and

Ran(P ) = Ker(Q), Ran(Q) = Ker(P ).

Proof: Q2 = (I − P )2 = I − 2P + P 2 = I − 2P + P = I − P = Q, i.e. Q is a
projection.

P Q = P (I − P ) = P − P 2 = 0 = (I − P )P = QP.

It follows from the equality QP = 0 that Ran(P ) ⊂ Ker(Q) (why?). On


the other hand for any x ∈ Ker(Q) we have x − P x = 0, i.e. x = P x,
hence x ∈ Ran(P ). Thus Ker(Q) ⊂ Ran(P ). So, Ran(P ) = Ker(Q). The
equality Ran(Q) = Ker(P ) follows from the last one if P is replaced by Q. 2

2.43. Definition Let L, L1 , L2 be subspaces of a vector space X. We


say that L is the direct sum of L1 , L2 and write L = L1 ⊕ L2 if

L1 ∩ L2 = {0} and L = L1 + L2 := {y1 + y2 : y1 ∈ L1 , y2 ∈ L2 }.

2.44. Proposition Let L, L1 , L2 be subspaces of a vector space X. Then
L = L1 ⊕ L2 iff every element y of L can be uniquely written as y = y1 + y2
with yk ∈ Lk .

Proof: Exercise. 2

2.45. Lemma If P ∈ B(X) is a projection, then Ran(P ) is closed and


X = Ran(P ) ⊕ Ker(P ).

Proof: Ran(P ) = Ker(I − P ), so Ran(P ) is closed. The equality I =


P + (I − P ) and Lemma 2.42 imply
Ran(P ) + Ker(P ) = Ran(P ) + Ran(I − P ) = X.
So, it is left to prove that Ran(P ) ∩ Ker(P ) = {0}. Let us take an arbi-
trary x ∈ Ran(P ) ∩ Ker(P ). Since x ∈ Ran(P ), there exists y ∈ X such that
x = P y. Consequently P x = P 2 y = P y = x. On the other hand x ∈ Ker(P ),
i.e. P x = 0. Thus x = 0. 2

2.46. Theorem Let X be a Banach space and P ∈ B(X) be a non–trivial


projection, i.e. P 6= 0, I. Then σ(P ) = {0, 1}.

Proof: Using the spectral mapping theorem (Theorem 2.34) we obtain


{0} = σ(0) = σ(P − P 2 ) = {λ − λ2 : λ ∈ σ(P )}.
So, λ ∈ σ(P ) =⇒ λ = 0 or 1, i.e. σ(P ) ⊂ {0, 1}.
On the other hand, Ran(P ) 6= {0} and Ran(I − P ) 6= {0} (why?), i.e.
Ker(I − P ) 6= {0} and Ker(P ) 6= {0} (see Lemma 2.42). Consequently the
operators P and P − I are not invertible, i.e. {0, 1} ⊂ σ(P ). 2
Suppose B ∈ B(X) and Ω is an admissible set such that ∂Ω ∩ σ(B) = ∅. Consider the operator

P := −(1/2πi) ∫∂Ω R(B; λ) dλ. (2.24)

If Ω ∩ σ(B) = ∅ then P = 0 (follows from the Cauchy theorem). If σ(B) ⊂ Ω then P = 1(B) = I (Lemma 2.28). The only non–trivial case is when there exists a non–empty subset σ of σ(B) such that σ ≠ σ(B), σ ⊂ Ω, Ω ∩ (σ(B)\σ) = ∅ and both σ and σ(B)\σ are closed. This can happen only if σ(B) is not connected. Let an admissible set Ω0 and open sets ∆, ∆0 be such that

σ ⊂ Ω ⊂ Cl(Ω) ⊂ ∆, σ(B)\σ ⊂ Ω0 ⊂ Cl(Ω0 ) ⊂ ∆0

and ∆ ∩ ∆0 = ∅. Let us consider the function f : ∆1 := ∆ ∪ ∆0 → C which equals 1 in ∆ and 0 in ∆0 . It is analytic in ∆1 . Since σ(B) ⊂ Ω1 := Ω ∪ Ω0 and ∂Ω1 = ∂Ω ∪ ∂Ω0 , we have

P = −(1/2πi) ∫∂Ω R(B; λ) dλ = −(1/2πi) ∫∂Ω1 f (λ)R(B; λ) dλ = f (B).
It is clear that f 2 = f . Hence P 2 = P (Theorem 2.31). It follows from the spectral
mapping theorem (Theorem 2.34) that σ(P ) = σ(f (B)) = {0, 1}. Thus P is a non–trivial
projection which commutes with g(B) for any function g analytic in some neighbourhood
of σ(B) (Corollary 2.32). P is called the spectral projection corresponding to the spectral
set σ. Since
g(B)P x = P g(B)x ∈ Ran(P ), ∀x ∈ X,
and
P g(B)x0 = g(B)P x0 = g(B)0 = 0, ∀x0 ∈ Ker(P ),
Ran(P ) and Ker(P ) are invariant under g(B), i.e.

g(B)Ran(P ) ⊂ Ran(P ), g(B)Ker(P ) ⊂ Ker(P ).
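The spectral projection (2.24) can be computed numerically for a matrix by discretizing the contour integral. The sketch below is an added illustration (numpy assumed, not from the notes): ∂Ω is a circle around one eigenvalue of a 2 × 2 matrix, the integral is approximated by a simple Riemann sum over the circle, and the result is checked to be a projection commuting with B.

import numpy as np

B = np.array([[1.0, 2.0],
              [0.0, 3.0]])                 # sigma(B) = {1, 3}

c, r, m = 1.0, 0.5, 400                    # circle of radius 0.5 around the eigenvalue 1
P = np.zeros((2, 2), dtype=complex)
for th in np.linspace(0.0, 2.0 * np.pi, m, endpoint=False):
    lam = c + r * np.exp(1j * th)
    dlam = 1j * r * np.exp(1j * th) * (2.0 * np.pi / m)   # d(lambda) along the circle
    R = np.linalg.inv(B - lam * np.eye(2))                # resolvent R(B; lambda)
    P += (-1.0 / (2j * np.pi)) * R * dlam

print(np.round(P.real, 6))                 # [[1, -1], [0, 0]]: projection onto the eigenspace of 1
print(np.linalg.norm(P @ P - P), np.linalg.norm(P @ B - B @ P))   # both ~ 0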

Compact operators
2.47. Definition Let X be a normed space. A set K ⊂ X is relatively com-
pact if every sequence in K has a Cauchy subsequence. K is called compact
if every sequence in K has a subsequence which converges to an element of K.

2.48. Proposition K relatively compact =⇒ K bounded.


K compact =⇒ K closed and bounded.
Any subset of a relatively compact set is relatively compact.
Any closed subset of a compact set is compact.

2.49. Theorem K is relatively compact if and only if it is totally bounded,


i.e. for every ε > 0, X contains a finite set, called an ε−net for K, such that the finite collection of open balls of radius ε with centres in the ε−net covers K.

Proof: See e.g. C. Goffman & G. Pedrick, First Course in Functional Anal-
ysis, Section 1.11, Lemma 1. 2

2.50. Proposition Let X be finite–dimensional. K ⊂ X is relatively
compact iff K is bounded. K ⊂ X is compact iff K is closed and bounded.

2.51. Theorem On a finite–dimensional vector space X, any norm k · k


is equivalent to any other norm k · k0 , i.e. there exist constants c1 , c2 > 0 such
that
c1 kxk ≤ kxk0 ≤ c2 kxk, ∀x ∈ X.
Proof : See e.g. E. Kreyszig, Introductory Functional Analysis with Appli-
cations, Theorem 2.4-5. 2

2.52. Definition Let X and Y be normed spaces. A linear operator


T : X → Y is called compact (or completely continuous) if it maps bounded
sets of X into relatively compact sets of Y . We will denote the set of all
compact linear operators acting from X into Y by Com(X, Y ).

Since any relatively compact set is bounded, T ∈ Com(X, Y ) maps bounded


sets into bounded sets. In particular the image T (SX ) of the unit ball

SX := {x ∈ X : kxk ≤ 1} (2.25)

is bounded, i.e T ∈ B(X, Y ). Thus

Com(X, Y ) ⊂ B(X, Y ).

It follows from Definition 2.47, that a linear operator T : X → Y is compact


iff for any bounded sequence xn ∈ X, n ∈ IN, the sequence (T xn )n∈IN has a
Cauchy subsequence.

2.53. Lemma Let X and Y be normed spaces. A linear operator T :


X → Y is compact iff T (SX ) is relatively compact.

Proof: Since SX is bounded, T ∈ Com(X, Y ) implies that T (SX ) is rela-


tively compact (see Definition 2.52).
Hence we need to prove that if T (SX ) is relatively compact then T ∈
Com(X, Y ). It is clear that for any t > 0 the set tT (SX ) = T (tSX ) is rel-
atively compact. (Why?) Let W ⊂ X be an arbitrary bounded set. Then
W ⊂ tSX if t ≥ sup{kxk : x ∈ W }. Using the obvious fact that any subset

of a relatively compact set is relatively compact (see Proposition 2.48) we
conclude from T (W ) ⊂ T (tSX ) that T (W ) is relatively compact. 2

2.54. Theorem Let X, Y and Z be normed spaces.


(i) If T1 , T2 ∈ Com(X, Y ) and α, β ∈ IF, then αT1 + βT2 ∈ Com(X, Y ).
(ii) If T ∈ Com(X, Y ), A ∈ B(Z, X) and B ∈ B(Y, Z), then T A ∈ Com(Z, Y )
and BT ∈ Com(X, Z).
(iii) If Tk ∈ Com(X, Y ), k ∈ IN, and kT − Tk k → 0 as k → ∞, then
T ∈ Com(X, Y ).

Proof: (i) Let xn ∈ X, n ∈ IN, be an arbitrary bounded sequence. It follows from the compactness of T1 that (T1 xn ) has a Cauchy subsequence (T1 xn(1) ). Using the compactness of T2 we can extract a Cauchy subsequence (T2 xn(2) ) from the sequence (T2 xn(1) ). It is clear that (T1 xn(2) ) is a Cauchy sequence. Consequently any bounded sequence xn ∈ X, n ∈ IN, has a subsequence (xn(2) ) such that ((αT1 + βT2 )xn(2) ) is a Cauchy sequence. Thus αT1 + βT2 ∈ Com(X, Y ).

(ii) Let zn ∈ Z, n ∈ IN, be an arbitrary bounded sequence. Then (Azn ) is a bounded sequence in X and it follows from the compactness of T that (T Azn ) has a Cauchy subsequence (T Azn(1) ). So, T A ∈ Com(Z, Y ). For any bounded sequence xn ∈ X, n ∈ IN, the sequence (T xn ) has a Cauchy subsequence (T xn(1) ). It is easy to see that (BT xn(1) ) is a Cauchy sequence. Hence BT ∈ Com(X, Z).

(iii) Let xn ∈ X, n ∈ IN, be an arbitrary bounded sequence. Since T1 is compact, (T1 xn ) has a Cauchy subsequence (T1 xn(1) ). Using the compactness of T2 we can extract a Cauchy subsequence (T2 xn(2) ) from the sequence (T2 xn(1) ). It is clear that (T1 xn(2) ) is a Cauchy sequence. Repeating this process we obtain sequences (xn(m) )n∈IN in X such that (xn(m1) ) is a subsequence of (xn(m0) ) if m1 > m0 , and (Tk xn(m) )n∈IN is a Cauchy sequence if k ≤ m. Let us consider the diagonal sequence (xn(n) ). It is easily seen that (xn(n) ) is a subsequence of (xn ), and (Tk xn(n) )n∈IN is a Cauchy sequence for any k ∈ IN, because (Tk xn(n) )n≥k is a subsequence of the Cauchy sequence (Tk xn(k) )n∈IN . Let us prove that (T xn(n) ) is also Cauchy. We have

kT xn(n) − T xm(m) k ≤ kT xn(n) − Tk xn(n) k + kTk xn(n) − Tk xm(m) k + kTk xm(m) − T xm(m) k ≤ kT − Tk k ( kxn(n) k + kxm(m) k ) + kTk xn(n) − Tk xm(m) k.
For a given ε > 0, we choose k so large that

kT − Tk k < ε/(3M ),

where M := supn∈IN kxn k. Then we determine n0 so that

kTk xn(n) − Tk xm(m) k < ε/3, ∀n, m ≥ n0 .

Now we have

kT xn(n) − T xm(m) k < ε, ∀n, m ≥ n0 .

Hence (T xn(n) ) is a Cauchy sequence, i.e. T ∈ Com(X, Y ). 2

Propositions (i) and (iii) of the last theorem mean that Com(X, Y ) is a
closed linear subspace of B(X, Y ). Let Y = X. It follows from (ii) that

Com(X) := Com(X, X)
is actually what we call an ideal in the Banach algebra B(X).
2.55. Definition We say that T ∈ B(X, Y ) is a finite rank operator if
Ran(T ) is finite–dimensional.
It is clear that any finite rank operator is compact (see Proposition 2.50).
Theorem 2.54(iii) implies that a limit of a convergent sequence of finite rank
operators is compact.
2.56. Theorem Every bounded subset of a normed space X is relatively
compact iff X is finite–dimensional.

Proof: See e.g. L.A. Liusternik & V.J. Sobolev, Elements of Functional
Analysis, §16). 2
2.57. Corollary The identity operator IX is compact iff X is finite–
dimensional. 2

2.58. Theorem Let T ∈ Com(X, Y ) and let at least one of the spaces
X and Y be infinite–dimensional. Then T is not invertible.
Proof: Suppose T is invertible: T −1 ∈ B(Y, X). Then operators T −1 T = IX
and T T −1 = IY are compact by Theorem 2.54(ii). This however contradicts
Corollary 2.57. 2

2.59. Corollary Let X be an infinite–dimensional Banach space and
T ∈ Com(X). Then 0 ∈ σ(T ). 2
2.60. Theorem Any non-zero eigenvalue λ of a compact operator T ∈
Com(X) has finite multiplicity, i.e. the subspace Xλ spanned by eigenvec-
tors of T corresponding to λ is finite–dimensional.

Proof: Any bounded subset Ω of Xλ is relatively compact because T x =


λx, ∀x ∈ Ω, i.e. Ω = λ−1 T (Ω), and T is a compact operator. Hence Xλ is
finite–dimensional by Theorem 2.56. 2

2.61. Theorem Let T ∈ Com(X). Then σ(T ) is at most countable


and has at most one limit point, namely, 0. Any non-zero point of σ(T )
is an eigenvalue of T , which has finite multiplicity according to Theorem
2.60. In a neighbourhood of λ0 ∈ σ(T )\{0} the resolvent R(T ; λ) admits the following
representation:
R(T ; λ) = Σkn=0 (λ − λ0 )−n−1 (T − λ0 I)n P + R̃(T ; λ),

where R̃(T ; λ) is analytic in some neighbourhood of λ0 , P is the spectral projection cor-


responding to λ0 . P and hence (T − λ0 I)n P are finite rank operators.

Proof: See e.g. W. Rudin, Functional Analysis, Theorem 4.25 and T. Kato,
Perturbation Theory For Linear Operators, Ch. III, §6, Section 5. 2

2.62. Theorem Let X and Y be Banach spaces and T ∈ B(X, Y ). Then


T is compact if and only if T ∗ is compact.

Proof: See W. Rudin, Functional Analysis, Theorem 4.19. 2

Let (X, k · kX ) be a normed space and Z be its linear subspace. Suppose


Z is equipped with a norm k · kZ . We will say that (Z, k · kZ ) is continuously
embedded in (X, k · kX ) if the operator I : (Z, k · kZ ) → (X, k · kX ), Ix = x,
is continuous, i.e.
kxkX ≤ constkxkZ , ∀x ∈ Z.
If I : (Z, k · kZ ) → (X, k · kX ) is compact we say that (Z, k · kZ ) is compactly
embedded in (X, k · kX ).

2.63. Proposition Let (Z, k · kZ ) be compactly embedded in (X, k · kX ),
Y be a normed space and let A : Y → (Z, k · kZ ) be bounded. Then
A ∈ Com(Y, X).

Proof: Apply Theorem 2.54(ii) to the operator A = IA. 2

2.64. Example Let m ∈ IN ∪ {0}, −∞ < a < b < ∞. We denote by C m ([a, b]) the space of all m times continuously differentiable functions on [a, b] equipped with the norm

kuk(m) := Σmj=0 maxt∈[a,b] |dj u(t)/dtj |.

C m ([a, b]) is a Banach space. C m ([a, b]) is compactly embedded in C n ([a, b])
if m > n. (Let X and Z be some spaces of functions defined on a compact
set K. Normally it is reasonable to expect Z to be compactly embedded in
X if Z consists of functions which are “more smooth” than functions from
X.) It follows from Proposition 2.63 that if a linear operator T is continuous
from C n ([a, b]) to C m ([a, b]), m > n, then it is compact in C n ([a, b]). For
example the operator defined by the formula

(T u)(t) := ∫ab k(t, s)u(s) ds, k ∈ C 1 ([a, b]2 ),

is bounded from C 0 ([a, b]) = C([a, b]) to C 1 ([a, b]) and hence compact in C([a, b]).

3 The geometry of Hilbert spaces


3.1. Definition An inner (scalar) product space is a vector space H
together with a map (· , ·) : H × H → IF such that
(λx + µy, z) = λ(x, z) + µ(y, z), (3.1)
(x, y) = (y, x), (3.2)
(x, x) ≥ 0, and (x, x) = 0 ⇐⇒ x = 0, (3.3)
for all x, y, z ∈ H and λ, µ ∈ IF.

(3.1)–(3.3) imply

(x, λy + µz) = λ̄(x, y) + µ̄(x, z), (3.4)

(x, 0) = (0, x) = 0. (3.5)


3.2. Examples
1. CN , (x, y) := ΣNk=1 wk xk ȳk , wk > 0. (For wk = 1 we have the standard inner product on CN .)
2. l2 , (x, y) := Σ∞k=1 xk ȳk . (The series converges because of the inequality |xk yk | ≤ (|xk |2 + |yk |2 )/2.)
3. C([0, 1]), (f, g) := ∫01 f (t)g(t) dt.
4. C 1 ([0, 1]), (f, g) := ∫01 ( f (t)g(t) + f 0 (t)g 0 (t) ) dt.

3.3. Theorem Every inner product space becomes a normed space by set-
ting kxk := (x, x)1/2 .

Proof: Linear Analysis. 2

3.4. Theorem (Cauchy–Schwarz inequality)

|(x, y)| ≤ kxkkyk, ∀x, y ∈ H. (3.6)

Proof: Linear Analysis. 2

3.5. Definition A system {xα }α∈J ⊂ H is called orthogonal if (xα , xβ ) = 0,


for all α, β ∈ J such that α 6= β.

3.6. Proposition (Pythagoras’ theorem) If a system of vectors x1 , . . . , xn ∈


H is orthogonal, then

kx1 + x2 + · · · + xn k2 = kx1 k2 + kx2 k2 + · · · + kxn k2 .

Proof: Express both sides in terms of inner products. 2

3.7. Proposition (polarization identity) If IF = IR then

4(x, y) = kx + yk2 − kx − yk2 , ∀x, y ∈ H.

If IF = C then

4(x, y) = kx + yk2 − kx − yk2 + ikx + iyk2 − ikx − iyk2 , ∀x, y ∈ H.

Proof: The same. 2

3.8. Proposition (parallelogram law)

kx + yk2 + kx − yk2 = 2kxk2 + 2kyk2 , ∀x, y ∈ H.

Proof: The same. 2

According to Proposition 3.8 for the norm induced by an inner product


the parallelogram law holds. Now the following result implies that a norm is
induced by an inner product iff the parallelogram law holds.

3.9. Theorem (Jordan – Von Neumann) A norm satisfying the par-


allelogram law is derived from an inner product. 2

3.10. Definition A complete inner product space is called a Hilbert space.

So, every Hilbert Space is a Banach space and the whole Banach space theory
can be applied to Hilbert spaces.

3.11. Examples 1. CN with any of the norms of Example 3.2-1 is a


Hilbert space.
2. l2 is a Hilbert space.
3. The inner product space from Example 3.2-3 is not a Hilbert space.

Orthogonal complements
3.12. Theorem Let L be a closed linear subspace of a Hilbert space H.
For each x ∈ H there exists a unique point y ∈ L such that

kx − yk = d(x, L) := inf{kx − zk : z ∈ L}.

This point satisfies the equality

(x − y, z) = 0, ∀z ∈ L.

Proof: Step I. Let yn ∈ L be such that kx−yn k → d := d(x, L). Applying the parallelogram
law to the vectors x − yn and x − ym and using the fact that (yn + ym )/2 ∈ L we have

kyn − ym k2 = 2kx − yn k2 + 2kx − ym k2 − kx − yn + x − ym k2 = 2kx − yn k2 + 2kx − ym k2 − 4kx − (yn + ym )/2k2 ≤ 2kx − yn k2 + 2kx − ym k2 − 4d2 → 2d2 + 2d2 − 4d2 = 0 as n, m → ∞.

Hence (yn ) is a Cauchy sequence, its limit y belongs to L and

kx − yk = lim kx − yn k = d(x, L).

Step II. Let us take an arbitrary z ∈ L such that kzk = 1. Then w := y + λz ∈ L, ∀λ ∈ IF


(why?). For λ = (x − y, z) we have

d2 ≤ kx − wk2 = (x − y − λz, x − y − λz) = kx − yk2 − λ(z, x − y) − λ̄(x − y, z) + |λ|2 = kx − yk2 − |λ|2 = d2 − |λ|2 .

Hence
(x − y, z) = λ = 0, ∀z ∈ L s.t. kzk = 1.

Consequently (x − y, z) = 0, ∀z ∈ L\{0} (why?). It is clear that the last equality is valid


for z = 0 as well.
Step III. Let y0 ∈ L be such that kx − y0 k = d(x, L). Then we obtain from Step II:
(x − y0 , z) = 0, ∀z ∈ L. Hence (y − y0 , z) = 0, ∀z ∈ L. Taking z = y − y0 ∈ L we obtain
ky − y0 k2 = 0, i.e. y = y0 . This proves the uniqueness. 2

3.13. Remark The last theorem and its proof remain true in the case
when H is an inner product space and L is its complete linear subspace.

3.14. Definition The orthogonal complement M ⊥ of a set M ⊂ H is


the set
M ⊥ := {x ∈ H : (x, y) = 0, ∀y ∈ M }.
3.15. Proposition Let M be an arbitrary subset of H. Then
(i) M ⊥ is a closed linear subspace of H;
(ii) M ⊂ M ⊥⊥ ;
(iii) M is dense in H =⇒ M ⊥ = {0}.

Proof: Exercise. 2

3.16. Theorem For any closed linear subspace M of a Hilbert space H we

have H = M ⊕ M ⊥ .

Proof: According to Theorem 3.12 for any x ∈ H there exists y ∈ M such


that
kx − yk = d(x, M ) and (x − y, z) = 0, ∀z ∈ M.
So, u := x − y ∈ M ⊥ and
x=y+u
is a decomposition of the required form, i.e. H = M + M ⊥ . Let us prove
now that M ∩ M ⊥ = {0}. For any w ∈ M ∩ M ⊥ we have

kwk2 = (w, w) = 0, i.e. w = 0. 2

Complete orthonormal sets


3.17. Definition The linear span of a subset A of a vector space X is the linear subspace

linA = { ΣNn=1 λn xn : xn ∈ A, λn ∈ IF, n = 1, . . . , N, N ∈ IN }.

If X is a normed space then the closed linear span of A is the closure of linA.
We will use the following notation for the closed linear span of A

spanA := Cl(linA).

3.18. Proposition If A is a finite set then spanA = linA.

Proof: Cf. Example 1.3–1. 2

3.19. Definition A system {eα }α∈J ⊂ H is called orthonormal if

(eα , eβ ) = δαβ , i.e. (eα , eβ ) = 1 if β = α and (eα , eβ ) = 0 if β ≠ α.

3.20. Definition A system {eα }α∈J ⊂ H is called linearly independent if for any finite subset {eα1 , . . . , eαN }, ΣNn=1 cn eαn = 0 =⇒ c1 = · · · = cN = 0.

3.21. Proposition Any orthonormal system is linearly independent.

Proof: Suppose ΣNn=1 cn eαn = 0. Then

0 = ( ΣNn=1 cn eαn , eαm ) = ΣNn=1 cn (eαn , eαm ) = cm

for any m = 1, . . . , N . 2

Let us consider the Gauss approximation problem: suppose {e1 , . . . , eN } is an orthonormal system in an inner product space H. For a given x ∈ H find c1 , . . . , cN such that kx − ΣNn=1 cn en k is minimal.

3.22. Theorem The Gauss approximation problem has a unique solution cn = (x, en ). The vector w := x − ΣNn=1 (x, en )en is orthogonal to L := lin{e1 , . . . , eN }. Moreover, kwk2 = kxk2 − ΣNn=1 |(x, en )|2 (Bessel’s equality) and, consequently, ΣNn=1 |(x, en )|2 ≤ kxk2 (Bessel’s inequality).

Proof: For arbitrary c1 , . . . , cN we have

kx − ΣNn=1 cn en k2 = ( x − ΣNn=1 cn en , x − ΣNm=1 cm em )
= kxk2 − ΣNn=1 cn (en , x) − ΣNm=1 c̄m (x, em ) + ( ΣNn=1 cn en , ΣNm=1 cm em )
= kxk2 − ΣNn=1 |(x, en )|2 + ( ΣNn=1 (x, en )en − ΣNn=1 cn en , ΣNm=1 (x, em )em − ΣNm=1 cm em )
= kxk2 − ΣNn=1 |(x, en )|2 + k ΣNn=1 (x, en )en − ΣNn=1 cn en k2 .

Therefore kx − ΣNn=1 cn en k2 is minimal when cn = (x, en ) and kwk2 = kxk2 − ΣNn=1 |(x, en )|2 . The fact that w is orthogonal to L follows from Theorem
3.12 (see also Remark 3.13). Let us give an independent direct proof of this
fact:
(w, em ) = ( x − ΣNn=1 (x, en )en , em ) = (x, em ) − ΣNn=1 (x, en )(en , em ) = (x, em ) − ΣNn=1 (x, en )δnm = (x, em ) − (x, em ) = 0

for any m = 1, . . . , N . So, w is orthogonal to e1 , . . . , eN , i.e. to L. 2
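Theorem 3.22 is the basis of least–squares approximation. The sketch below is an added illustration (numpy, standard inner product on R^5 assumed): it projects a vector onto the span of two orthonormal vectors using cn = (x, en ) and checks Bessel's equality.

import numpy as np

rng = np.random.default_rng(4)
e = np.linalg.qr(rng.standard_normal((5, 2)))[0]   # two orthonormal columns e_1, e_2
x = rng.standard_normal(5)

c = e.T @ x                                # Fourier coefficients c_n = (x, e_n)
w = x - e @ c                              # w = x - sum_n (x, e_n) e_n
print(e.T @ w)                             # ~ 0: w is orthogonal to lin{e_1, e_2}
print(np.linalg.norm(w)**2, np.linalg.norm(x)**2 - np.sum(c**2))   # Bessel's equality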

3.23. Lemma Let H be a Hilbert space and {en } be an orthonormal system, cn ∈ IF, n ∈ IN, and Σ∞n=1 |cn |2 < ∞. Then the series Σ∞n=1 cn en is convergent and for its sum y we have

kyk = ( Σ∞n=1 |cn |2 )1/2 .

Proof: Let ym = Σmn=1 cn en . Then, using Pythagoras’ theorem (see Proposition 3.6) and the convergence of the series Σ∞n=1 |cn |2 , we obtain

kym − yk k2 = k Σmn=k+1 cn en k2 = Σmn=k+1 |cn |2 → 0, as k, m → ∞ (m > k).

Thus (ym ) is a Cauchy sequence, has a limit y and

kyk2 = limm→∞ kym k2 = limm→∞ Σmn=1 |cn |2 = Σ∞n=1 |cn |2 . 2

3.24. Corollary Let H be a Hilbert space and {en } be an orthonormal set. For any x ∈ H the series Σ∞n=1 (x, en )en is convergent and for its sum y we have

kyk = ( Σ∞n=1 |(x, en )|2 )1/2 ≤ kxk.

Proof: It follows from Theorem 3.22 that

ΣNn=1 |(x, en )|2 ≤ kxk2

for any N ∈ IN. Therefore

Σ∞n=1 |(x, en )|2 ≤ kxk2 . 2

3.25. Definition The Fourier coefficients of an element x ∈ H with respect


to an orthonormal set {en } are the numbers (x, en ).

3.26. Theorem Let {en } be an orthonormal set in a Hilbert space H. Then the following statements are equivalent:
(i) x = Σ∞n=1 (x, en )en , ∀x ∈ H;
(ii) kxk2 = Σ∞n=1 |(x, en )|2 , ∀x ∈ H (Parseval’s identity);
(iii) (x, en ) = 0, ∀n ∈ IN =⇒ x = 0;
(iv) lin{en } is dense in H.

Proof: (i) =⇒ (ii) Follows from Corollary 3.24.

(ii) =⇒ (iii) Immediate.

(iii) =⇒ (iv) For an arbitrary x ∈ H and y := Σ∞n=1 (x, en )en (see Corollary 3.24) we have (x − y, em ) = 0, ∀m ∈ IN (see the proof of Theorem 3.22). Then x = y by (iii). It is clear that y ∈ span{en }. Consequently x ∈ span{en } for any x ∈ H, i.e. Cl(lin{en }) = H.

(iv) =⇒ (i) For an arbitrary x ∈ H and y = Σ∞n=1 (x, en )en we have (x − y, en ) = 0, ∀n ∈ IN. Therefore (x − y, z) = 0, ∀z ∈ lin{en }, i.e. x − y ∈ (lin{en })⊥ . But (lin{en })⊥ = {0} since lin{en } is dense (see Proposition 3.15). Hence x = y. 2

3.27. Definition The orthonormal set {en } is called complete if it sat-


isfies any (and hence all) of the conditions of Theorem 3.26.

3.28. Examples
1. An orthonormal subset of a finite-dimensional Hilbert space is complete
iff it is a basis.
2. Let H = l2 , en = (0, . . . , 0, 1, 0, . . .), where the 1 stands in the n-th position.

Then {en }n∈IN is a complete orthonormal set (why?).
3. The set {en }n∈Z ,
en (t) = ei2πnt ,
is orthonormal in the inner product space C([0, 1]) of Examples 3.2-3, 3.11-3
(check this!). It is one of the basic results of classical Fourier analysis that it also has the properties (ii)–(iv) of Theorem 3.26 (with Z and Σ∞n=−∞ instead of IN and Σ∞n=1 correspondingly). The Fourier series Σ∞n=−∞ (f, en )en of a function f ∈ C([0, 1]) converges to f with respect to the norm k · k2 , but may not be uniformly convergent, i.e. not convergent with respect to the norm k · k∞ .

3.29. Theorem (Gram–Schmidt orthogonalization process) For an


arbitrary countable or finite set {yn } of vectors of an inner product space H
there exists a countable or finite orthonormal set {en } such that

lin{yn } = lin{en }.

Proof: Let us first construct a set {zn } such that lin{yn } = lin{zn } and zn
are linearly independent. We define zn inductively:

z1 = yn1 ,

where yn1 is the first non-zero yn ,

z2 = yn2 ,

where yn2 is the first yn not in the linear span of z1 , and, generally,

zk = ynk ,

where ynk is the first yn not in the linear span of z1 , . . . , zk−1 . It is clear that
lin{yn } = lin{zn } and zn are linearly independent (prove this!).
Now we define inductively vectors en such that for any N the set {en }_{n=1}^N
is orthonormal and lin{en }_{n=1}^N = lin{zn }_{n=1}^N . By our construction z1 is non-
zero, so

e1 = z1 /kz1 k

is well defined and has all the necessary properties in the case N = 1. Assume
that vectors e1 , . . . , eN −1 have been defined. By hypothesis zN ∉ LN −1 :=
lin{en }_{n=1}^{N −1} = lin{zn }_{n=1}^{N −1} , so the vector

uN = zN − ∑_{k=1}^{N −1} (zN , ek )ek

must be non-zero and hence

eN = uN /kuN k

is well defined. We have

lin{en }_{n=1}^N = lin(eN , LN −1 ) = lin(uN , LN −1 ) = lin(zN , LN −1 ) = lin{zn }_{n=1}^N .

Further, uN ∈ ( lin{en }_{n=1}^{N −1} )^⊥ by Theorem 3.22. Consequently eN is orthogo-
nal to any element of the orthonormal set {en }_{n=1}^{N −1} . It is clear that keN k = 1.
So the set {en }_{n=1}^N is orthonormal. This completes the induction step.

Thus we have constructed an orthonormal set {en } such that

lin{en } = lin{zn } = lin{yn }. 2
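
A minimal numerical sketch of this orthogonalization (not from the notes; the
input vectors and the tolerance are arbitrary choices):

    import numpy as np

    def gram_schmidt(vectors, tol=1e-12):
        """Return an orthonormal list spanning the same subspace as `vectors`.

        As in the proof above: vectors that are (numerically) in the span of the
        ones already processed are skipped; the rest are orthogonalized and normalized.
        """
        basis = []
        for y in vectors:
            u = np.asarray(y, dtype=complex)
            for e in basis:
                u = u - np.vdot(e, u) * e      # subtract the component (u, e) e
            if np.linalg.norm(u) > tol:        # skip vectors already in the span
                basis.append(u / np.linalg.norm(u))
        return basis

    # Example: three vectors in C^3; the last one is the sum of the first two.
    vs = [np.array([1.0, 1.0, 0.0]), np.array([1.0, 0.0, 1.0]), np.array([2.0, 1.0, 1.0])]
    es = gram_schmidt(vs)
    print(len(es))                              # 2 orthonormal vectors
    print(np.round([[np.vdot(a, b) for b in es] for a in es], 10))  # identity Gram matrix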

3.30. Definition A normed space is called separable if it contains a count-


able dense subset.

3.31. Theorem A Hilbert space H is separable iff it contains a count-


able (or finite) complete orthonormal set.

Proof: Let H be a separable Hilbert space, {yn }n∈IN be a dense subset and
let {en } be the orthonormal set of the last theorem. Then lin{en } = lin{yn }
is dense in H, i.e. {en } is complete.
Conversely, assume that H contains a countable (or finite) complete or-
thonormal set {en }. Let
QN = { ∑_{n=1}^N λn en : λ1 , . . . , λN ∈ Q + iQ },

where Q is the field of rational numbers, and let

Q = ∪_N QN .

Then (exercise) Q is a countable dense subset of H. 2

Let U : H1 → H2 be an isomorphism of inner product spaces H1 and H2


(see Definition 1.4). It follows from the polarization identity (see Proposition
3.7) that U preserves the inner products

(U x, U y)2 = (x, y)1 , ∀x, y ∈ H1 .

3.32. Theorem All infinite dimensional separable Hilbert spaces are iso-
morphic to l2 and hence to each other.
Proof: Let H be an infinite dimensional separable Hilbert space and let
{en }n∈IN be a complete orthonormal set in it. Let us define U : l2 → H by
the formula U ((λn )n∈IN ) = ∑_{n=1}^∞ λn en (see Lemma 3.23). It is clear that U is
linear. Lemma 3.23 and Theorem 3.26 (Definition 3.27) imply that U is onto
and an isometry. 2

3.33. Remark Suppose {eα } is a not necessarily countable orthonormal set. It is


called complete if
(x, eα ) = 0, ∀α =⇒ x = 0.

Any two complete orthonormal subsets of a Hilbert space H have the same cardinality.
This cardinality is called the dimension of H. Two Hilbert spaces are isomorphic iff they
have the same dimension. (For proofs see e.g. C. Goffman & G. Pedrick, First Course in
Functional Analysis, Section 4.7.).

Bounded linear functionals on a Hilbert space


3.34. Theorem (Riesz) A map f : H → IF is a bounded linear functional on
a Hilbert space H, i.e. a bounded linear operator from H to IF, if and only if
there exists a unique z ∈ H such that

f (x) = (x, z), ∀x ∈ H. (3.7)

Moreover, kf k = kzk.

Proof: It is clear that the equality (3.7) defines a bounded linear functional for an ar-
bitrary z ∈ H. So, we have to prove that any bounded linear functional on H has a unique
representation of the form (3.7).
Uniqueness. If z1 , z2 satisfy (3.7), then

(x, z1 ) = (x, z2 ), ∀x ∈ H,

and therefore z1 − z2 ∈ H⊥ = {0}.


Existence. If f = 0, take z = 0. Suppose f ≠ 0. Then Ker(f ) is a closed linear subspace
of H (see Definition 2.1) and Ker(f ) ≠ H. Hence Ker(f )⊥ ≠ {0} (see Theorem 3.16), so we
can choose y ∈ Ker(f )⊥ , y ≠ 0. We have

f (f (x)y − f (y)x) = f (x)f (y) − f (x)f (y) = 0, ∀x ∈ H,

i.e.
f (x)y − f (y)x ∈ Ker(f ), ∀x ∈ H.
Consequently
(f (x)y − f (y)x, y) = 0, ∀x ∈ H.
The left-hand side of the last equality equals f (x)kyk2 − f (y)(x, y) = f (x)kyk2 − (x, \overline{f (y)}y).
Thus
f (x) = (x, z), ∀x ∈ H, where z := ( \overline{f (y)}/kyk2 ) y.
The equality kf k = kzk follows from the relations

|f (x)| = |(x, z)| ≤ kzkkxk, ∀x ∈ H, and f (z) = kzk2 . 2
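
In IF^n every bounded linear functional has this form; a small numerical check
(not from the notes; the vectors are arbitrary):

    import numpy as np

    rng = np.random.default_rng(0)
    a = rng.standard_normal(4) + 1j * rng.standard_normal(4)

    def f(x):
        return np.sum(a * x)          # an arbitrary bounded linear functional on C^4

    z = np.conj(a)                    # representing vector: f(x) = (x, z) = sum x_i conj(z_i)

    x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
    print(np.allclose(f(x), np.vdot(z, x)))   # (x, z), inner product linear in the first slot
    print(np.linalg.norm(z))                  # by the theorem this equals kf k (= kak here)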

4 Spectral theory in Hilbert spaces


The adjoint operator
4.1. Theorem Let H1 and H2 be Hilbert spaces and B ∈ B(H1 , H2 ). There
exists a unique operator B ∗ ∈ B(H2 , H1 ) such that

(Bx, y) = (x, B ∗ y), ∀x ∈ H1 , ∀y ∈ H2 . (4.1)


Proof: Let us take an arbitrary y ∈ H2 . Then

H1 ∋ x 7→ f (x) := (Bx, y) ∈ IF

is a linear functional and

|f (x)| ≤ kBxkkyk ≤ kBkkxkkyk = (kBkkyk)kxk, ∀x ∈ H1 .

So, f is a bounded linear functional on H1 and kf k ≤ kBkkyk. According to the Riesz


theorem (Theorem 3.34) there exists a unique z = z(B, y) ∈ H1 such that

(Bx, y) = f (x) = (x, z), ∀x ∈ H1 ,

and kzk = kf k ≤ kBkkyk. It is easy to see that the map

H2 ∋ y 7→ B ∗ y := z ∈ H1

is linear (check this!). We have already proved that

kB ∗ yk ≤ kBkkyk, ∀y ∈ H2 ,

i.e. B ∗ is bounded and


kB ∗ k ≤ kBk. (4.2)
It is clear that (4.1) is satisfied and that the constructed operator B ∗ is the unique oper-
ator satisfying this equality. 2

4.2. Definition The operator B ∗ from the previous theorem is called


the adjoint of B.
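
In finite dimensions, with the standard inner product, the adjoint of a matrix B is its
conjugate transpose; a quick numerical check of (4.1) (a sketch, not from the notes;
the matrix and vectors are arbitrary):

    import numpy as np

    rng = np.random.default_rng(1)
    B = rng.standard_normal((3, 2)) + 1j * rng.standard_normal((3, 2))   # B : C^2 -> C^3
    B_star = B.conj().T                                                  # B* : C^3 -> C^2

    x = rng.standard_normal(2) + 1j * rng.standard_normal(2)
    y = rng.standard_normal(3) + 1j * rng.standard_normal(3)

    # (Bx, y) = (x, B*y), with (u, v) = sum u_i conj(v_i) = np.vdot(v, u)
    print(np.isclose(np.vdot(y, B @ x), np.vdot(B_star @ y, x)))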

4.3. Theorem Let H1 , H2 and H3 be Hilbert spaces, B, B1 , B2 ∈ B(H1 , H2 )


and T ∈ B(H2 , H3 ). Then
(i) (αB1 + βB2 )∗ = \overline{α}B1∗ + \overline{β}B2∗ , ∀α, β ∈ IF;
(ii) (T B)∗ = B ∗ T ∗ ;
(iii) B ∗∗ = B;
(iv) kB ∗ k = kBk;
(v) kB ∗ Bk = kBB ∗ k = kBk2 ;
(vi) if B is invertible then so is B ∗ and (B ∗ )−1 = (B −1 )∗ .

Proof: (i)
((αB1 + βB2 )x, y) = α(B1 x, y) + β(B2 x, y) = α(x, B1∗ y) + β(x, B2∗ y) =
(x, (\overline{α}B1∗ + \overline{β}B2∗ )y), ∀x ∈ H1 , ∀y ∈ H2 .

(ii)
(T Bx, z) = (Bx, T ∗ z) = (x, B ∗ T ∗ z), ∀x ∈ H1 , ∀z ∈ H3 .
(iii)

(B ∗∗ x, y) = \overline{(y, B ∗∗ x)} = \overline{(y, (B ∗ )∗ x)} = \overline{(B ∗ y, x)} = (x, B ∗ y) = (Bx, y),

∀x ∈ H1 , ∀y ∈ H2 .

Consequently
((B − B ∗∗ )x, y) = 0, ∀x ∈ H1 , ∀y ∈ H2 .
Taking y = (B − B ∗∗ )x we obtain (B − B ∗∗ )x = 0, ∀x ∈ H1 .
(iv) follows from (4.2) and (iii):

kBk = k(B ∗ )∗ k ≤ kB ∗ k ≤ kBk.

(v)
kB ∗ Bk ≤ kB ∗ kkBk = kBk2
(see (iv)). On the other hand
kBk2 = sup_{kxk=1} kBxk2 = sup_{kxk=1} (Bx, Bx) = sup_{kxk=1} (x, B ∗ Bx) ≤ sup_{kxk=1} kxkkB ∗ Bxk = kB ∗ Bk.

Thus kB ∗ Bk = kBk2 . Using this equality with B ∗ instead of B we derive


from (iii) and (iv) that kBB ∗ k = kBk2 .
(vi) It is sufficient to take the adjoint operators in the equality

BB −1 = IH2 , B −1 B = IH1 ,

(see (ii)). 2

Parts (i)–(iii) of Theorem 4.3 show that B(H) is a Banach algebra with
an involution. Part (v) shows that it is, in fact, a C ∗ –algebra.

A mapping x 7→ x∗ of an algebra E into itself is called an involution if it has the fol-


lowing properties
(αx + βy)∗ = \overline{α}x∗ + \overline{β}y ∗ ,
(xy)∗ = y ∗ x∗ ,
x∗∗ = x,
for all x, y ∈ E and α, β ∈ IF.
A Banach algebra E with an involution satisfying the identity

kx∗ xk = kxk2 , ∀x ∈ E,

is called a C ∗ –algebra.

The analogue of Theorem 4.3(i) for operators acting on normed spaces


has the following form
(αB1 + βB2 )∗ = αB1∗ + βB2∗ , ∀α, β ∈ IF. (4.3)

Let us explain why this equality is different from that of Theorem 4.3(i). If
H1 and H2 are Hilbert spaces, then we have two definitions of the adjoint
of B ∈ B(H1 , H2 ): Definition 4.2 and that for operators acting on normed
spaces. B ∗ ∈ B(H2 , H1 ) according to the first one, while the second one gives
us an operator B ∗ ∈ B(H2∗ , H1∗ ). In order to connect them to each other we
have to use the identification H1∗ ∼ H1 , H2∗ ∼ H2 given by Theorem 3.34.
This identification is conjugate linear (anti-linear):

fj (x) = (x, zj ), j = 1, 2, ∀x ∈ H =⇒ αf1 (x) + βf2 (x) = (x, \overline{α}z1 + \overline{β}z2 ), ∀x ∈ H.

This is why we have the complex conjugation in Theorem 4.3(i) and do not
have it in (4.3).
Note also that we use the following definition of linear operations on X ∗ (for
an arbitrary normed space X)

(f1 + f2 )(x) := f1 (x) + f2 (x), ∀f1 , f2 ∈ X ∗ , ∀x ∈ X,

(αf )(x) := αf (x), ∀f ∈ X ∗ , ∀x ∈ X, ∀α ∈ IF.


One can replace the last definition by

(αf )(x) := \overline{α}f (x), ∀f ∈ X ∗ , ∀x ∈ X, ∀α ∈ IF.

In this case Theorem 4.3(i) is valid for operators acting on normed spaces.

4.4. Theorem Let H1 and H2 be Hilbert spaces, B ∈ B(H1 , H2 ). Then

Ker(B ∗ ) = Ran(B)⊥ , Ker(B) = Ran(B ∗ )⊥ .

Proof:

B ∗ y = 0 ⇐⇒ (x, B ∗ y) = 0, ∀x ∈ H1 ⇐⇒ (Bx, y) = 0, ∀x ∈ H1 ⇐⇒
y ∈ Ran(B)⊥ .

Thus Ker(B ∗ ) = Ran(B)⊥ . Since B ∗∗ = B, the second assertion follows from


the first if B is replaced by B ∗ . 2

4.5. Corollary (cf. Theorem 2.39). Let H1 and H2 be Hilbert spaces,


B ∈ B(H1 , H2 ). Then

Cl(Ran(B)) = Ker(B ∗ )⊥ , Cl(Ran(B ∗ )) = Ker(B)⊥ .

Proof: Note that for any linear subspace M of a Hilbert space H we have
the equality Cl(M ) = M ⊥⊥ . (Prove this!) 2

4.6. Definition Let H, H1 and H2 be Hilbert spaces. An operator B ∈


B(H) is said to be
(i) normal if BB ∗ = B ∗ B,
(ii) self-adjoint if B ∗ = B, i.e. if

(Bx, y) = (x, By), ∀x, y ∈ H.

An operator U ∈ B(H1 , H2 ) is called unitary if U ∗ U = IH1 , U U ∗ = IH2 , i.e.


if U −1 = U ∗ .

Note that any self-adjoint operator is normal. Any unitary operator U ∈


B(H) is normal as well.

4.7. Theorem An operator B ∈ B(H) is normal if and only if

kBxk = kB ∗ xk, ∀x ∈ H. (4.4)

Proof: We have
kBzk2 = (Bz, Bz) = (B ∗ Bz, z),
kB ∗ zk2 = (B ∗ z, B ∗ z) = (BB ∗ z, z)
for any z ∈ H. So, (4.4) holds if B is normal.
If kBzk = kB ∗ zk, ∀z ∈ H, then using the polarization identity (see Propo-
sition 3.7) we obtain

(Bx, By) = (B ∗ x, B ∗ y), ∀x, y ∈ H,

i.e.
((B ∗ B − BB ∗ )x, y) = 0, ∀x, y ∈ H.
Taking y = (B ∗ B − BB ∗ )x we deduce that (B ∗ B − BB ∗ )x = 0, ∀x ∈ H. 2
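
A small numerical illustration of this criterion (a sketch, not from the notes; the
matrices are arbitrary): for a self-adjoint, hence normal, matrix the two norms agree
for every x, while for a non-normal matrix they generally differ.

    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

    normal = A + A.conj().T          # self-adjoint, hence normal
    non_normal = np.triu(A, 1)       # strictly upper triangular, not normal (unless zero)

    for B in (normal, non_normal):
        x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
        print(np.isclose(np.linalg.norm(B @ x), np.linalg.norm(B.conj().T @ x)))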

4.8. Theorem Let B ∈ B(H) be a normal operator. Then


(i) Ran(B ∗ )⊥ = Ker(B) = Ker(B ∗ ) = Ran(B)⊥ ,
(ii) Bx = αx =⇒ B ∗ x = \overline{α}x,
(iii) eigenvectors corresponding to different eigenvalues of B are orthogonal

to each other.

Proof: (i) follows from Theorems 4.4 and 4.7. Applying (i) to B − αI in
place of B we obtain (ii). (Exercise: prove that B − αI is normal.)
Finally, suppose Bx = αx, By = βy and α ≠ β. We have from (ii)

α(x, y) = (αx, y) = (Bx, y) = (x, B ∗ y) = (x, \overline{β}y) = β(x, y).

Since α ≠ β, we conclude (x, y) = 0. 2
4.9. Theorem If U ∈ B(H1 , H2 ), the following statements are equiva-
lent.
(i) U is unitary.
(ii) Ran(U ) = H2 and (U x, U y) = (x, y), ∀x, y ∈ H1 .
(iii) Ran(U ) = H2 and kU xk = kxk, ∀x ∈ H1 .
Proof: If U is unitary, then Ran(U ) = H2 because U U ∗ = IH2 . Also, U ∗ U = IH1 , so
that
(U x, U y) = (x, U ∗ U y) = (x, y), ∀x, y ∈ H1 .
Thus (i) implies (ii). It is obvious that (ii) implies (iii).
It follows from the polarization identity (see Proposition 3.7) that (iii) implies (ii). So, if
(iii) holds, then
(U ∗ U x, y) = (U x, U y) = (x, y), ∀x, y ∈ H1 .

Consequently U ∗ U = IH1 (why?). On the other hand (iii) implies that U is a linear isom-
etry of H1 onto H2 . Hence U is invertible. Since U ∗ U = IH1 , we have U −1 = U ∗ (why?),
i.e. U is unitary. 2

Theorem 4.9 shows that the notion of a unitary operator coincides with the
notion of an isomorphism of Hilbert spaces (see Definition 1.4) and that these
isomorphisms preserve the inner products (see also the equality in the para-
graph between Theorems 3.31 and 3.32).

4.10. Theorem
(i) B is self-adjoint, λ ∈ IR =⇒ λB is self-adjoint.
(ii) B1 , B2 are self-adjoint =⇒ B1 + B2 is self-adjoint.
(iii) Let B1 , B2 be self-adjoint. Then B1 B2 is self-adjoint if and only if B1
and B2 commute.
(iv) If Bn , n ∈ IN are self-adjoint and kB − Bn k → 0, then B is self-adjoint.

Proof: Exercise. 2

4.11. Theorem Let H be a complex Hilbert space (i.e. IF = C) and


B ∈ B(H). Then B is self-adjoint iff (Bx, x) is real for all x ∈ H.

Proof: If B is self-adjoint, then

(Bx, x) = (x, Bx) = \overline{(Bx, x)}, ∀x ∈ H,

i.e. (Bx, x) is real.


Let us prove the converse. For any x, y ∈ H we have

4(Bx, y) = (B(x + y), x + y) − (B(x − y), x − y) +


i(B(x + iy), x + iy) − i(B(x − iy), x − iy) (4.5)

(check this by expanding the right-hand side; this is the polarization identity for
operators) and similarly

4(x, By) = (x + y, B(x + y)) − (x − y, B(x − y)) +


i(x + iy, B(x + iy)) − i(x − iy, B(x − iy)).

Since
(Bz, z) = \overline{(Bz, z)} = (z, Bz), ∀z ∈ H,
we have
(Bx, y) = (x, By), ∀x, y ∈ H,
i.e. B is self-adjoint. 2

4.12. Theorem All eigenvalues of a self-adjoint operator B ∈ B(H) are


real and eigenvectors corresponding to different eigenvalues of B are orthog-
onal to each other.

Proof: According to Theorem 4.8(iii) it is sufficient to prove the first state-


ment. Let λ be an eigenvalue and x ∈ H\{0} be a corresponding eigenvector:
Bx = λx. Using Theorem 4.11 we obtain

λ = (Bx, x)/kxk2 ∈ IR. 2
4.13. Theorem Let H be a Hilbert space. Each of the following four
properties of a projection P ∈ B(H) implies the other three:
(i) P is self-adjoint.
(ii) P is normal.
(iii) Ran(P ) = Ker(P )⊥ .
(iv) (P x, x) = kP xk2 , ∀x ∈ H.
Proof: It is trivial that (i) implies (ii).
Theorem 4.8(i) shows that Ker(P ) = Ran(P )⊥ if P is normal. Since P is a projection
Ran(P ) is closed (see Lemma 2.45). So, Ran(P ) = Ran(P )⊥⊥ = Ker(P )⊥ (see the proof
of Corollary 4.5). Thus (ii) implies (iii).
Let us prove that (iii) implies (i). Indeed, if (iii) holds, then

(P x, (I − P )y) = 0, ((I − P )x, P y) = 0, ∀x, y ∈ H,

(see Lemma 2.42). Therefore,

(P x, y) = (P x, P y + (I − P )y) = (P x, P y) = (P x + (I − P )x, P y) = (x, P y),

i.e. P is self-adjoint. So, we have proved that the properties (i)–(iii) are equivalent to each
other.
Now it is sufficient to prove that (iii) is equivalent to (iv). We have seen above that (iii)
implies the equality (P x, (I − P )x) = 0, ∀x ∈ H. Hence

(P x, x) = (P x, P x) + (P x, (I − P )x) = (P x, P x) = kP xk2 , ∀x ∈ H.

Finally, assume (iv) holds. Let us take arbitrary y ∈ Ran(P ) and z ∈ Ker(P ) and consider
x = y + tz, t ∈ IR. It is clear that P x = y. Consequently

0 ≤ kP xk2 = (P x, x) = (y, y + tz) = kyk2 + t(y, z).

Thus
t(y, z) ≥ −kyk2 , ∀t ∈ IR,
i.e.
(y, z) = 0, ∀y ∈ Ran(P ), ∀z ∈ Ker(P ),

i.e. (iii) holds.


(In the case when H is a complex Hilbert space the proof can be slightly simplified: after
proving the implications (i) =⇒ (ii) =⇒ (iii) =⇒ (iv) as above, we obtain directly from
Theorem 4.11 that (iv) =⇒ (i).) 2

4.14. Definition Property (iii) of the last theorem is usually expressed
by saying that P is an orthogonal projection.
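
A numerical sketch of an orthogonal projection onto a two-dimensional subspace of C^4
(not from the notes; the subspace is arbitrary):

    import numpy as np

    rng = np.random.default_rng(3)
    A = rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2))
    Q, _ = np.linalg.qr(A)            # orthonormal basis of Ran(A)
    P = Q @ Q.conj().T                # orthogonal projection onto that subspace

    print(np.allclose(P @ P, P))          # P is a projection
    print(np.allclose(P, P.conj().T))     # P is self-adjoint, property (i)
    x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
    print(np.isclose(np.vdot(x, P @ x).real, np.linalg.norm(P @ x) ** 2))   # property (iv)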

Numerical range
4.15. Theorem Let H be a Hilbert space and B ∈ B(H). Suppose there
exists c > 0 such that

|(Bx, x)| ≥ ckxk2 , ∀x ∈ H. (4.6)

Then B is invertible and kB −1 k ≤ 1/c.

Proof: It follows from (4.6) that Ker(B) = {0} and

ckxk2 ≤ kBxkkxk, ∀x ∈ H,

i.e.
kxk ≤ c−1 kBxk, ∀x ∈ H. (4.7)
Let us prove that Ran(B) is closed. For any y ∈ Cl(Ran(B)) there exist
xn ∈ H such that Bxn → y as n → ∞. It follows from (4.7) that

kxn − xm k ≤ c−1 kBxn − Bxm k → 0 as n, m → ∞.

Hence (xn ) is a Cauchy sequence in the Hilbert space H. Let us denote its
limit by z. We have
 
Bz = B ( lim_{n→∞} xn ) = lim_{n→∞} Bxn = y.

Thus y ∈ Ran(B), i.e. Ran(B) is closed.


(4.6) implies that if x ∈ Ran(B)⊥ , then x = 0. So, Ran(B)⊥ = {0}, i.e.
Ran(B) is dense in H (why?). Consequently Ran(B) = H and the operator
B is invertible. The inequality kB −1 k ≤ 1/c follows from (4.7). 2

4.16. Definition Let H be a Hilbert space and let B ∈ B(H). Then


the set
Num(B) := {(Bx, x) : kxk = 1, x ∈ H} (4.8)

is called the numerical range of the operator B.

It is clear that

|(Bx, x)| ≤ kBxkkxk ≤ kBkkxk2 , ∀x ∈ H,

so,
Num(B) ⊂ {µ ∈ C : |µ| ≤ kBk}. (4.9)
The numerical range of any bounded linear operator is convex (see e.g. G.
Bachman & L. Narici, Functional Analysis, 21.4).

4.17. Theorem For any B ∈ B(H) we have

σ(B) ⊂ Cl(Num(B)). (4.10)

Proof: Let us take an arbitrary λ ∈ C\Cl(Num(B)) and z ∈ H such that


kzk = 1. It is clear that (Bz, z) ∈ Num(B). Hence

|((B − λI)z, z)| = |(Bz, z) − λ| ≥ d > 0,

where d is the distance from λ to Cl(Num(B)). Now for arbitrary x ∈ H\{0}


we have

|((B − λI)x, x)| = kxk2 |((B − λI)(x/kxk), x/kxk)| ≥ dkxk2 .

According to Theorem 4.15 B − λI is invertible, i.e. λ ∉ σ(B). Hence
σ(B) ⊂ Cl(Num(B)). 2
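
A crude finite-dimensional sketch of (4.9) and (4.10) (not from the notes; the matrix
and the number of sample vectors are arbitrary): every eigenvalue λ of a matrix B is a
value (Bx, x) for a unit eigenvector x, and |(Bx, x)| ≤ kBk for every unit vector x.

    import numpy as np

    rng = np.random.default_rng(4)
    B = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))

    # Each eigenvalue lies in Num(B): for a normalized eigenvector x, (Bx, x) = lambda.
    vals, vecs = np.linalg.eig(B)
    for lam, x in zip(vals, vecs.T):
        x = x / np.linalg.norm(x)
        print(np.isclose(np.vdot(x, B @ x), lam))

    # Inclusion (4.9): |(Bx, x)| <= kBk for random unit vectors x.
    op_norm = np.linalg.norm(B, 2)
    samples = rng.standard_normal((200, 5)) + 1j * rng.standard_normal((200, 5))
    samples /= np.linalg.norm(samples, axis=1, keepdims=True)
    print(all(abs(np.vdot(x, B @ x)) <= op_norm + 1e-12 for x in samples))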

Spectra of self-adjoint and unitary operators
It follows from Theorem 4.11 that if B ∈ B(H) is self-adjoint then Num(B) ⊂
IR. Therefore Theorem 4.17 implies the following result.

4.18. Theorem Let B ∈ B(H) be self-adjoint and

m := inf_{kxk=1} (Bx, x), M := sup_{kxk=1} (Bx, x).

Then σ(B) ⊂ [m, M ] ⊂ IR. 2

4.19. Theorem If U ∈ B(H) is unitary, then σ(U ) ⊂ {λ ∈ C : |λ| = 1}.

Proof: It follows from Theorem 4.9(iii) that kU k = 1 = kU −1 k. If |λ| > 1
then λ ∉ σ(U ) by Lemma 2.9. Suppose |λ| < 1. We have

U − λI = U (I − λU −1 )

and kλU −1 k = |λ| < 1. Hence U − λI is invertible by Lemmas 2.3(i) and 2.4.
So, λ ∉ σ(U ), i.e. σ(U ) ⊂ {λ ∈ C : |λ| = 1}. 2

Spectra of normal operators


4.20. Theorem The spectral radius of an arbitrary normal operator
B ∈ B(H) equals its norm: r(B) = kBk (cf. Theorem 2.23).

Proof: Replacing x by Bx in (4.4) we obtain

kB 2 xk = kB ∗ Bxk, ∀x ∈ H.

So, kB 2 k = kB ∗ Bk. Now Theorem 4.3(v) implies kB 2 k = kB ∗ Bk = kBk2 .


Hence
kB 2 k = kBk2 .
Since (B n )∗ = (B ∗ )n (see Theorem 4.3(ii)), if B is normal then so is B n
(why?). Hence it follows by induction that kB m k = kBkm for all integers m
of the form 2k . Using Theorem 2.23 we can write
r(B) = lim_{n→∞} kB n k^{1/n} = lim_{k→∞} kB^{2^k}k^{1/2^k} = lim_{k→∞} kBk = kBk. 2
4.21. Theorem Let B ∈ B(H) be a normal operator. Then

kBk = sup_{kxk=1} |(Bx, x)|. (4.11)

Proof: Theorems 4.17 and 4.20 imply

kBk = r(B) = sup{|λ| : λ ∈ σ(B)} ≤ sup{|λ| : λ ∈ Cl(Num(B))} =
sup{|λ| : λ ∈ Num(B)} = sup_{kxk=1} |(Bx, x)| ≤ kBk. 2
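
A numerical sketch (not from the notes; the matrices are arbitrary) contrasting a normal
and a non-normal matrix: for the normal one r(B) = kBk as in Theorem 4.20, while for a
non-normal matrix the spectral radius can be strictly smaller than the norm.

    import numpy as np

    def spectral_radius(B):
        return max(abs(np.linalg.eigvals(B)))

    rng = np.random.default_rng(5)
    A = rng.standard_normal((4, 4))
    normal = A + A.T                      # real symmetric, hence normal
    nilpotent = np.triu(A, 1)             # non-normal: all eigenvalues are 0

    print(np.isclose(spectral_radius(normal), np.linalg.norm(normal, 2)))   # True
    print(spectral_radius(nilpotent), np.linalg.norm(nilpotent, 2))         # 0 vs a positive norm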

Compact operators in Hilbert spaces


4.22. Theorem Let X be a normed space, H be a Hilbert space and
let T ∈ Com(X, H). Then there exists a sequence of finite rank operators
Tn ∈ Com(X, H), such that kT − Tn k → 0 as n → ∞ (see Definition 2.55).
Proof: Let y1^{(n)} , . . . , y_{kn}^{(n)} be a 1/n–net of the relatively compact set T (SX )
(see (2.25) and Theorem 2.49) and let Ln := span{y1^{(n)} , . . . , y_{kn}^{(n)} }. Let Pn be
the orthogonal projection onto Ln , i.e.

Pn y = z, where y = z + w, z ∈ Ln , w ∈ Ln^⊥ ,

(see Theorem 3.16). It is easy to see that Tn := Pn T is a finite rank operator
and

k(T − Tn )xk = kT x − Pn T xk = dist(T x, Ln ) < 1/n, ∀x ∈ SX ,

(see Theorem 3.12). Thus kT − Tn k ≤ 1/n. 2

4.23. Remark A subset {en }n∈IN of a Banach space X is called a Schauder
basis if for any x ∈ X there exists a unique representation of the form

x = ∑_{n=1}^∞ λn en , λn ∈ IF.

It is clear that any Banach space having a Schauder basis is separable (cf.
the proof of Theorem 3.31). Theorem 4.22 remains valid if we replace H by
a Banach space having a Schauder basis. In 1973 P. Enflo gave a negative
solution to the long-standing problem of whether every separable Banach
space has a Schauder basis: he constructed a separable Banach space without
a Schauder basis. He also proved that there exist a separable Banach space
and a compact operator acting in it such that this operator is not the limit
of a convergent sequence of finite rank operators.

Hilbert–Schmidt operators
Let H1 and H2 be separable Hilbert spaces and {en }n∈IN be a complete or-
thonormal set in H1 . Then for any operator T ∈ B(H1 , H2 ) the quantity
∑_{n=1}^∞ kT en k2 is independent of the choice of a complete orthonormal set {en }
(this quantity may be infinite). Indeed, if {fn }n∈IN is a complete orthonormal
set in H2 then

∑_{n=1}^∞ kT en k2 = ∑_{n=1}^∞ ∑_{k=1}^∞ |(T en , fk )|2 = ∑_{k=1}^∞ ∑_{n=1}^∞ |(en , T ∗ fk )|2 = ∑_{k=1}^∞ kT ∗ fk k2 .

So, ∑_{n=1}^∞ kT en k2 = ∑_{n=1}^∞ kT e′n k2 for any two complete orthonormal sets {en }
and {e′n } in H1 .

4.24. Definition The Hilbert–Schmidt norm of an operator T ∈ B(H1 , H2 )
is defined as

kT kHS := ( ∑_{n=1}^∞ kT en k2 )^{1/2}

and the operators for which it is finite are called Hilbert–Schmidt operators.

Note that
kT k ≤ kT kHS . (4.12)
Indeed,

kT xk = kT ( ∑_{n=1}^∞ (x, en )en )k = k ∑_{n=1}^∞ (x, en )T en k ≤ ∑_{n=1}^∞ |(x, en )|kT en k ≤
( ∑_{n=1}^∞ |(x, en )|2 )^{1/2} ( ∑_{n=1}^∞ kT en k2 )^{1/2} = kT kHS kxk, ∀x ∈ H1 ,

(see Theorem 3.26).
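
In finite dimensions the Hilbert–Schmidt norm of a matrix is its Frobenius norm; a quick
numerical check of the definition and of (4.12) (a sketch, not from the notes; the matrix
is arbitrary):

    import numpy as np

    rng = np.random.default_rng(6)
    T = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))

    # kT k_HS^2 = sum_n kT e_n k^2 over an orthonormal basis; with the standard basis of C^3
    # the vectors T e_n are the columns of T, so this is the Frobenius norm.
    hs = np.sqrt(sum(np.linalg.norm(T[:, n]) ** 2 for n in range(T.shape[1])))
    print(np.isclose(hs, np.linalg.norm(T, 'fro')))
    print(np.linalg.norm(T, 2) <= hs + 1e-12)     # kT k <= kT k_HS, inequality (4.12)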

4.25. Exercise Prove that the set of all Hilbert–Schmidt operators is a


subspace of B(H1 , H2 ) and that k · kHS is indeed a norm on that subspace.

4.26. Theorem Hilbert–Schmidt operators are compact.

Proof: Let T be a Hilbert–Schmidt operator and let {en }n∈IN be a complete
orthonormal set in H1 . The operator TN defined by

TN ( ∑_{n=1}^∞ an en ) = ∑_{n=1}^N an T en

is a finite rank operator (why?). Moreover for any x = ∑_{n=1}^∞ an en ∈ H1 we
have

kT x − TN xk = k ∑_{n=N+1}^∞ an T en k ≤ ∑_{n=N+1}^∞ |an |kT en k ≤
( ∑_{n=N+1}^∞ |an |2 )^{1/2} ( ∑_{n=N+1}^∞ kT en k2 )^{1/2} ≤ kxk ( ∑_{n=N+1}^∞ kT en k2 )^{1/2}

and therefore

kT − TN k ≤ ( ∑_{n=N+1}^∞ kT en k2 )^{1/2} → 0 as N → ∞.

Hence T is compact being the limit of a sequence of compact operators (see
Theorem 2.54(iii)). 2

4.27. Example (Integral operators) Let H = L2 ([0, 1]) and let a mea-
surable function
k : [0, 1] × [0, 1] → IF
be such that

∫_0^1 ∫_0^1 |k(t, τ )|2 dτ dt < ∞.

Define the operator K by

(Kf )(t) := ∫_0^1 k(t, τ )f (τ )dτ.

Then K : L2 ([0, 1]) → L2 ([0, 1]) is a Hilbert–Schmidt operator and

kKkHS = ( ∫_0^1 ∫_0^1 |k(t, τ )|2 dτ dt )^{1/2} .

Proof: For t ∈ [0, 1] let kt (τ ) := k(t, τ ). Let {en }n∈IN be a complete orthonor-
mal set in L2 ([0, 1]). Then

Ken (t) = ∫_0^1 k(t, τ )en (τ )dτ = (kt , en ).

Hence

kKen k2 = ∫_0^1 |Ken (t)|2 dt = ∫_0^1 |(kt , en )|2 dt

and therefore

∑_{n=1}^∞ kKen k2 = ∑_{n=1}^∞ ∫_0^1 |(kt , en )|2 dt = ∫_0^1 ∑_{n=1}^∞ |(kt , en )|2 dt =
∫_0^1 kkt k2 dt = ∫_0^1 ∫_0^1 |k(t, τ )|2 dτ dt < ∞,

since {en } is a complete orthonormal set in L2 ([0, 1]).
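
A numerical sketch of this example (not from the notes; the kernel k(t, τ ) = min(t, τ )
and the grid size are arbitrary choices): discretizing the kernel and summing its squared
entries approximates the Hilbert–Schmidt norm, whose exact value here is (1/6)^{1/2}.

    import numpy as np

    # Discretization of the kernel k(t, tau) = min(t, tau) on [0, 1] (midpoint rule).
    n = 400
    h = 1.0 / n
    grid = (np.arange(n) + 0.5) * h
    kernel = np.minimum.outer(grid, grid)

    # kKk_HS^2 = int_0^1 int_0^1 |k(t, tau)|^2 dtau dt, approximated on the grid.
    hs_approx = np.sqrt(np.sum(np.abs(kernel) ** 2) * h * h)
    print(hs_approx, np.sqrt(1.0 / 6.0))    # exact value for this kernel is sqrt(1/6)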

5 Spectral theory of compact normal operators
In this chapter H will always denote a Hilbert space.

5.1. Theorem Let T ∈ Com(H) be a normal operator. Then T has


an eigenvalue λ such that |λ| = kT k.

Proof: This result follows from Theorems 2.61 and 4.20. Since we have
not proved Theorem 2.61, we give an independent proof of Theorem 5.1.
There is nothing to prove if T = 0. So, let us suppose that T ≠ 0.
It follows from Theorem 4.21 that there exist xn ∈ H, n ∈ IN such that
kxn k = 1 and |(T xn , xn )| → kT k. We can suppose that the sequence of
numbers (T xn , xn ) is convergent (otherwise we could take a subsequence of
(xn )). Let λ be the limit of this sequence. It is clear that |λ| = kT k ≠ 0. We
have

kT xn − λxn k2 = (T xn − λxn , T xn − λxn ) = kT xn k2 − λ(xn , T xn ) −
\overline{λ}(T xn , xn ) + |λ|2 kxn k2 = kT xn k2 − 2Re(\overline{λ}(T xn , xn )) + |λ|2 ≤
2|λ|2 − 2Re(\overline{λ}(T xn , xn )) → 2|λ|2 − 2|λ|2 = 0 as n → ∞.

Hence kT xn − λxn k → 0 as n → ∞.
Since T is a compact operator the sequence (T xn ) has a Cauchy subsequence
(T xnk )k∈IN . The last sequence is convergent because the space H is complete.
Let us denote its limit by y. It follows from the formula

xnk = (1/λ) (T xnk − (T xnk − λxnk ))

that the sequence (xnk )k∈IN converges to x := λ−1 y. Consequently

T x = T ( lim_{k→∞} xnk ) = lim_{k→∞} T xnk = y = λx

and

kxk = k lim_{k→∞} xnk k = lim_{k→∞} kxnk k = 1.

Thus x ≠ 0 is an eigenvector of T corresponding to the eigenvalue λ. 2

5.2. Theorem (Spectral theorem for a compact normal operator)


Let T ∈ Com(H) be a normal operator. Then there exists a finite or a count-
able orthonormal set {en }_{n=1}^N , N ∈ IN ∪ {∞}, of eigenvectors of T such that
any x ∈ H has a unique representation of the form

x = ∑_{n=1}^N cn en + y, y ∈ Ker(T ), cn ∈ C. (5.1)

One then has

T x = ∑_{n=1}^N λn cn en , (5.2)

where λn ≠ 0 is the eigenvalue of T corresponding to the eigenvector en .
Moreover,

σ(T )\{0} = {λn }_{n=1}^N , (5.3)

|λ1 | ≥ |λ2 | ≥ · · · (5.4)

and

lim_{n→∞} λn = 0, if N = ∞. (5.5)

Proof: Step I We will use an inductive process. Let λ1 be an eigenvalue of T


such that |λ1 | = kT k (see Theorem 5.1) and let e1 be the corresponding eigen-
vector such that ke1 k = 1. Suppose we have already constructed non-zero
eigenvalues λ1 , . . . , λk of T satisfying (5.4) and corresponding eigenvectors
e1 , . . . , ek such that the set {en }_{n=1}^k is orthonormal. Let

Lk := lin{e1 , . . . , ek }, Hk := Lk^⊥ .

It is clear that

L1 ⊂ L2 ⊂ · · · ⊂ Lk , H1 ⊃ H2 ⊃ · · · ⊃ Hk . (5.6)

Any x ∈ Lk has a unique representation of the form x = ∑_{n=1}^k cn en , cn ∈ C.
Consequently

T x = ∑_{n=1}^k λn cn en ∈ Lk , T ∗ x = ∑_{n=1}^k \overline{λn} cn en ∈ Lk

(see Theorem 4.8(ii)). Hence

T Lk ⊂ Lk , T ∗ Lk ⊂ Lk .

For any z ∈ Hk = Lk^⊥ we have

(x, T z) = (T ∗ x, z) = 0, (x, T ∗ z) = (T x, z) = 0, ∀x ∈ Lk ,

because T ∗ x, T x ∈ Lk . Thus T z, T ∗ z ∈ Hk . So,

T Hk ⊂ Hk , T ∗ Hk ⊂ Hk .

Let us consider the restriction of T to the Hilbert space Hk :

Tk := T |Hk ∈ B(Hk ).

It is easy to see that Tk is a compact normal operator (why?). There are two
possibilities.

Case I. Tk = 0. In this case the construction terminates.
Case II. Tk ≠ 0. In this case there exists an eigenvalue λk+1 ≠ 0 of Tk
such that |λk+1 | = kTk k = kT |Hk k (see Theorem 5.1). It follows from the
construction and from (5.6) that |λk+1 | ≤ |λk | ≤ · · · ≤ |λ1 |. Let ek+1 ∈ Hk =
Lk^⊥ be an eigenvector of Tk corresponding to λk+1 and such that kek+1 k =
1. It is clear that ek+1 is an eigenvector of T and that the set {en }_{n=1}^{k+1} is
orthonormal.
This construction gives us a finite or a countable orthonormal set {en }N n=1 ,
N ∈ IN ∪ {∞}, of eigenvectors of T and corresponding non-zero eigenvalues
λn satisfying (5.4). This set is finite if we have Case I for some k and infinite
if we always have Case II.
Step II Let us prove (5.5). Suppose (5.5) does not hold. Then there exists
δ > 0 such that |λn | ≥ δ, ∀n ∈ IN (see (5.4)). Consequently

kT en − T em k2 = kλn en − λm em k2 = |λn |2 + |λm |2 ≥ 2δ2 > 0 if n ≠ m.

Hence (T en ) cannot have a Cauchy subsequence. However, this contradicts


the compactness of T . So, we have proved (5.5).
Step III Let us consider the subspace

HN := ∩_{n=1}^N Hn .

If N is finite, then according to our construction T |HN = 0, i.e. HN ⊂


Ker(T ). If N = ∞, then for any x ∈ HN we have

kT xk ≤ kT |Hk kkxk = |λk+1 |kxk → 0 as k → ∞.

Thus kT xk = 0, i.e. T x = 0, ∀x ∈ HN . Consequently HN ⊂ Ker(T ). On
the other hand Ker(T ) ⊂ Lk^⊥ = Hk for any k by Theorem 4.8(iii). Hence

HN = Ker(T ). (5.7)

Step IV For an arbitrary x ∈ H we have


( x − ∑_{n=1}^N (x, en )en , ek ) = (x, ek ) − ∑_{n=1}^N (x, en )(en , ek ) = (x, ek ) − (x, ek ) = 0,
∀k = 1, . . . , N,

(cf. the proof of Theorem 3.22). Thus x has a representation of the form
(5.1), where cn = (x, en ) and

y := x − ∑_{n=1}^N (x, en )en ∈ HN = Ker(T ).

Taking the inner products of both sides of (5.1) with ek and using (5.7) we
obtain ck = (x, ek ), i.e. a representation of the form (5.1) is unique.
Step V (5.2) follows immediately from (5.1) and the continuity of T . It
is left to prove (5.3). Let us take an arbitrary λ ∈ C\( {λn }_{n=1}^N ∪ {0} ). It is
clear that the distance d from λ to the closed set {λn }_{n=1}^N ∪ {0} is positive.
(5.1), (5.2) imply

(T − λI)x = ∑_{n=1}^N (λn − λ)(x, en )en − λy.

It is easy to see that the operator T − λI is invertible and

(T − λI)−1 x = ∑_{n=1}^N (λn − λ)−1 (x, en )en − λ−1 y. (5.8)

Note that

k(T − λI)−1 xk2 = ∑_{n=1}^N |λn − λ|−2 |(x, en )|2 + |λ|−2 kyk2 ≤
d−2 ( ∑_{n=1}^N |(x, en )|2 + kyk2 ) = d−2 kxk2 < ∞

(see Proposition 3.6, Lemma 3.23 and (5.7)). So, we have proved that λ ∉
σ(T ), i.e. σ(T ) ⊂ {λn }_{n=1}^N ∪ {0}. On the other hand we have {λn }_{n=1}^N ⊂ σ(T ).
2

5.3. Remark Since any self-adjoint operator is normal (see Definition


4.6), the previous theorem is valid for self-adjoint operators. In this case all
eigenvalues λn are real (see Theorem 4.12 or Theorem 4.18). This variant of
Theorem 5.2 is known as the Hilbert–Schmidt theorem.
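
A finite-dimensional numerical sketch (not from the notes; the matrix is arbitrary): for a
Hermitian, hence normal, matrix numpy's eigendecomposition provides exactly the orthonormal
eigenvectors and real eigenvalues of the Hilbert–Schmidt theorem.

    import numpy as np

    rng = np.random.default_rng(7)
    A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
    T = (A + A.conj().T) / 2                 # self-adjoint, hence normal

    lam, E = np.linalg.eigh(T)               # real eigenvalues, orthonormal eigenvector columns
    print(np.allclose(E.conj().T @ E, np.eye(5)))               # {e_n} is orthonormal
    print(np.allclose(T, E @ np.diag(lam) @ E.conj().T))        # T = sum_n lambda_n P_n
    x = rng.standard_normal(5) + 1j * rng.standard_normal(5)
    coeffs = E.conj().T @ x                  # Fourier coefficients (x, e_n)
    print(np.allclose(T @ x, E @ (lam * coeffs)))               # T x = sum_n lambda_n (x, e_n) e_n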

Let Pn be the orthogonal projection onto the one–dimensional subspace gen-
erated by en , i.e. Pn := (·, en )en , and let PKer(T ) be the orthogonal projection
onto Ker(T ). Then (5.1), (5.2) and (5.8) can be rewritten in the following
way:

I = ∑_{n=1}^N Pn + PKer(T ) , (5.9)

T = ∑_{n=1}^N λn Pn , (5.10)

R(T ; λ) = (T − λI)−1 = ∑_{n=1}^N (λn − λ)−1 Pn − λ−1 PKer(T ) (5.11)

(see Step IV of the proof of Theorem 5.2), where the series are strongly
convergent. (A series ∑_{n=1}^∞ An , An ∈ B(X, Y ), is called strongly convergent
if the series ∑_{n=1}^∞ An x converges in Y for any x ∈ X.) On the other hand it
is easy to see that in the case N = ∞ the series (5.9) and (5.11) do not
converge in the B(H)–norm, while (5.10) does (prove this!).
Suppose a function f is analytic in some neighbourhood ∆f of σ(T ) and
Ω is an admissible set such that

σ(T ) ⊂ Ω ⊂ Cl(Ω) ⊂ ∆f .

Then we obtain from Definition 2.26, (5.11) and the Cauchy theorem

f (T )x = ( −(1/(2πi)) ∫_{∂Ω} f (λ)R(T ; λ)dλ ) x =
−(1/(2πi)) ∫_{∂Ω} f (λ) ( ∑_{n=1}^N (λn − λ)−1 Pn x − λ−1 PKer(T ) x ) dλ =
∑_{n=1}^N ( (1/(2πi)) ∫_{∂Ω} f (λ)(λ − λn )−1 dλ ) Pn x + ( (1/(2πi)) ∫_{∂Ω} f (λ)λ−1 dλ ) PKer(T ) x =
∑_{n=1}^N f (λn )Pn x + f (0)PKer(T ) x, ∀x ∈ H. (5.12)

The series here are convergent because (Pn x, Pm x) = 0 if m ≠ n and
∑_{n=1}^N kPn xk2 = ∑_{n=1}^N |(x, en )|2 ≤ kxk2 (see Lemma 3.23 and Corollary 3.24).

Note that if dim(H) < +∞ it may happen that 0 ∉ σ(T ) and 0 ∉ Ω. If in
this case f (0) ≠ 0 then

(1/(2πi)) ∫_{∂Ω} f (λ)λ−1 dλ = 0 ≠ f (0).

Nevertheless (5.12) holds because PKer(T ) = 0, since Ker(T ) = {0}.

It is clear that the RHS of (5.12) is well defined for any function f that is
bounded on σ(T ) but not necessarily analytic. This motivates the following definition.
5.4. Definition Let T ∈ Com(H) be a normal operator and f be a bounded
function on σ(T ). Then

f (T )x := ∑_{n=1}^N f (λn )Pn x + f (0)PKer(T ) x, ∀x ∈ H.

Note that if dim(H) < +∞ and 0 ∉ σ(T ), then f may not be defined
at 0. The above definition, however, still makes sense because in this case
PKer(T ) = 0, since Ker(T ) = {0}, and we assume that f (0)PKer(T ) = 0.
The operator f (T ) is well defined because

k ∑_{n=1}^N f (λn )Pn x + f (0)PKer(T ) x k2 =
∑_{n=1}^N |f (λn )(x, en )|2 + |f (0)|2 kPKer(T ) xk2 ≤
( sup_{λ∈σ(T )} |f (λ)| )2 kxk2 < ∞, ∀x ∈ H, (5.13)

(cf. Step V of the proof of Theorem 5.2). It follows from (5.13) that

kf (T )k ≤ sup_{λ∈σ(T )} |f (λ)|.

It is easy to see that if Ker(T ) = {0}, then

kf (T )k ≤ sup_{λ∈{λn }} |f (λ)| = sup_{λ∈σ(T )\{0}} |f (λ)|.

Since kf (T )en k = kf (λn )en k = |f (λn )| and kf (T )yk = kf (0)yk = |f (0)|kyk
for any y ∈ Ker(T ), we have

kf (T )k ≥ sup_{λ∈σ(T )} |f (λ)| if Ker(T ) ≠ {0},
kf (T )k ≥ sup_{λ∈σ(T )\{0}} |f (λ)| if Ker(T ) = {0}.

Thus

kf (T )k = sup_{λ∈σ(T )} |f (λ)| if Ker(T ) ≠ {0}, (5.14)
kf (T )k = sup_{λ∈σ(T )\{0}} |f (λ)| if Ker(T ) = {0}.
If dim(H) < +∞ and Ker(T ) = {0}, then 0 ∉ σ(T ) and σ(T ) \ {0} =
σ(T ). If dim(H) = +∞ and Ker(T ) = {0}, then 0 ∈ σ(T ) is the only limit
point of the set σ(T ) \ {0}. Therefore

sup_{λ∈σ(T )} |f (λ)| = sup_{λ∈σ(T )\{0}} |f (λ)|, if f is continuous at 0.

However, if f is not continuous at 0, the above equality is not necessarily


true.
It is easy to see that the functional calculus f 7→ f (T ) from Definition
5.4 has the same properties as that from Definition 2.26 (see Theorems 2.30,
2.31, 2.34 and 2.35). The advantages of Definition 5.4 are that f has to be
defined only on σ(T ) and may be non-analytic. The disadvantage is that
T has to be a compact normal operator acting on a Hilbert space, while
Definition 2.26 deals with an arbitrary bounded linear operator acting on a
Banach space.
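
A numerical sketch of Definition 5.4 in finite dimensions (not from the notes; the matrix
and the functions f are arbitrary choices): f (T ) is formed by applying f to the eigenvalues
in the spectral decomposition and keeping the eigenprojections.

    import numpy as np

    rng = np.random.default_rng(8)
    A = rng.standard_normal((4, 4))
    T = (A + A.T) / 2                                   # self-adjoint, hence normal

    def f_of_T(f, T):
        # Definition 5.4 in finite dimensions: apply f to the eigenvalues, keep the projections.
        lam, E = np.linalg.eigh(T)
        return E @ np.diag(f(lam)) @ E.conj().T

    print(np.allclose(f_of_T(lambda t: t ** 2, T), T @ T))     # f(t) = t^2 reproduces T^2
    print(np.allclose(f_of_T(np.ones_like, T), np.eye(4)))     # f == 1 gives the identity, cf. (5.9)
    S = f_of_T(np.abs, T)                                      # e.g. |T|; any bounded f on sigma(T) works
    print(np.allclose(S @ S, T @ T))                           # |T|^2 = T^2 for self-adjoint T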

5.5. Theorem Let T ∈ Com(H) be a normal operator and f be a bounded


function on σ(T ). The operator f (T ) is normal. It is compact if and only if
one of the following statements is true:
(i) N < +∞, dim(Ker(T )) < +∞, i.e. dim(H) < +∞,
(ii) N < +∞, dim(Ker(T )) = +∞ and f (0) = 0,
(iii) N = +∞, dim(Ker(T )) < +∞ and f (λn ) → 0 as n → ∞,
(iv) N = +∞, dim(Ker(T )) = +∞, f (0) = 0 and f (λn ) → 0 as n → ∞.
The operator f (T ) is self–adjoint if and only if one of the following state-
ments is true:
(a) Ker(T ) = {0} and f (λn ) ∈ IR for each n,
(b) Ker(T ) 6= {0}, f (0) ∈ IR and f (λn ) ∈ IR for each n.

Proof: Exercise. 2
