Linear Algebra
Linear Algebra
Michaelmas 2015
These notes are not endorsed by the lecturers, and I have modified them (often
significantly) after lectures. They are nowhere near accurate representations of what
was actually lectured, and in particular, all errors are almost surely mine.
Definition of a vector space (over R or C), subspaces, the space spanned by a subset.
Linear independence, bases, dimension. Direct sums and complementary subspaces.
[3]
Linear maps, isomorphisms. Relation between rank and nullity. The space of linear
maps from U to V , representation by matrices. Change of basis. Row rank and column
rank. [4]
Determinant and trace of a square matrix. Determinant of a product of two matrices
and of the inverse matrix. Determinant of an endomorphism. The adjugate matrix. [3]
Eigenvalues and eigenvectors. Diagonal and triangular forms. Characteristic and
minimal polynomials. Cayley-Hamilton Theorem over C. Algebraic and geometric
multiplicity of eigenvalues. Statement and illustration of Jordan normal form. [4]
Dual of a finite-dimensional vector space, dual bases and maps. Matrix representation,
rank and determinant of dual map. [2]
Bilinear forms. Matrix representation, change of basis. Symmetric forms and their link
with quadratic forms. Diagonalisation of quadratic forms. Law of inertia, classification
by rank and signature. Complex Hermitian forms. [4]
Inner product spaces, orthonormal sets, orthogonal projection, V = W ⊕ W ⊥ . Gram-
Schmidt orthogonalisation. Adjoints. Diagonalisation of Hermitian matrices. Orthogo-
nality of eigenvectors and properties of eigenvalues. [4]
1
Contents IB Linear Algebra
Contents
0 Introduction 3
1 Vector spaces 4
1.1 Definitions and examples . . . . . . . . . . . . . . . . . . . . . . . 4
1.2 Linear independence, bases and the Steinitz exchange lemma . . 6
1.3 Direct sums . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2 Linear maps 15
2.1 Definitions and examples . . . . . . . . . . . . . . . . . . . . . . . 15
2.2 Linear maps and matrices . . . . . . . . . . . . . . . . . . . . . . 19
2.3 The first isomorphism theorem and the rank-nullity theorem . . . 21
2.4 Change of basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.5 Elementary matrix operations . . . . . . . . . . . . . . . . . . . . 27
3 Duality 29
3.1 Dual space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.2 Dual maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4 Bilinear forms I 37
5 Determinants of matrices 41
6 Endomorphisms 49
6.1 Invariants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
6.2 The minimal polynomial . . . . . . . . . . . . . . . . . . . . . . . 53
6.2.1 Aside on polynomials . . . . . . . . . . . . . . . . . . . . 53
6.2.2 Minimal polynomial . . . . . . . . . . . . . . . . . . . . . 54
6.3 The Cayley-Hamilton theorem . . . . . . . . . . . . . . . . . . . 57
6.4 Multiplicities of eigenvalues and Jordan normal form . . . . . . . 63
7 Bilinear forms II 72
7.1 Symmetric bilinear forms and quadratic forms . . . . . . . . . . . 72
7.2 Hermitian form . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
2
0 Introduction IB Linear Algebra
0 Introduction
In IA Vectors and Matrices, we have learnt about vectors (and matrices) in a
rather applied way. A vector was just treated as a “list of numbers” representing
a point in space. We used these to represent lines, conics, planes and many
other geometrical notions. A matrix is treated as a “physical operation” on
vectors that stretches and rotates them. For example, we studied the properties
of rotations, reflections and shears of space. We also used matrices to express
and solve systems of linear equations. We mostly took a practical approach in
the course.
In IB Linear Algebra, we are going to study vectors in an abstract setting.
Instead of treating vectors as “lists of numbers”, we view them as things we
can add and scalar-multiply. We will write down axioms governing how these
operations should behave, just like how we wrote down the axioms of group
theory. Instead of studying matrices as an array of numbers, we instead look at
linear maps between vector spaces abstractly.
In the course, we will, of course, prove that this abstract treatment of linear
algebra is just “the same as” our previous study of “vectors as a list of numbers”.
Indeed, in certain cases, results are much more easily proved by working with
matrices (as an array of numbers) instead of abstract linear maps, and we don’t
shy away from doing so. However, most of the time, looking at these abstractly
will provide a much better fundamental understanding of how things work.
3
1 Vector spaces IB Linear Algebra
1 Vector spaces
1.1 Definitions and examples
Notation. We will use F to denote an arbitrary field, usually R or C.
Intuitively, a vector space V over a field F (or an F-vector space) is a space
with two operations:
– We can add two vectors v1 , v2 ∈ V to obtain v1 + v2 ∈ V .
– We can multiply a scalar λ ∈ F with a vector v ∈ V to obtain λv ∈ V .
Of course, these two operations must satisfy certain axioms before we can
call it a vector space. However, before going into these details, we first look at a
few examples of vector spaces.
Example.
(i) Rn = {column vectors of length n with coefficients in R} with the usual
addition and scalar multiplication is a vector space.
An m × n matrix A with coefficients in R can be viewed as a linear map
from Rm to Rn via v 7→ Av.
This is a motivational example for vector spaces. When confused about
definitions, we can often think what the definition means in terms of Rn
and matrices to get some intuition.
(ii) Let X be a set and define RX = {f : X → R} with addition (f + g)(x) =
f (x) + g(x) and scalar multiplication (λf )(x) = λf (x). This is a vector
space.
More generally, if V is a vector space, X is a set, we can define V X = {f :
X → V } with addition and scalar multiplication as above.
(iii) Let [a, b] ⊆ R be a closed interval, then
4
1 Vector spaces IB Linear Algebra
We always write 0 for the additive identity in V , and call this the identity.
By abuse of notation, we also write 0 for the trivial vector space {0}.
In a general vector space, there is no notion of “coordinates”, length, angle
or distance. For example, it would be difficult to assign these quantities to the
vector space of real-valued continuous functions in [a, b].
From the axioms, there are a few results we can immediately prove.
Proposition. In any vector space V , 0v = 0 for all v ∈ V , and (−1)v = −v,
where −v is the additive inverse of v.
Proof is left as an exercise.
In mathematics, whenever we define “something”, we would also like to define
a “sub-something”. In the case of vector spaces, this is a subspace.
Definition (Subspace). If V is an F-vector space, then U ⊆ V is an (F-linear)
subspace if
(i) u, v ∈ U implies u + v ∈ U .
(ii) u ∈ U, λ ∈ F implies λu ∈ U .
(iii) 0 ∈ U .
These conditions can be expressed more concisely as “U is non-empty and if
λ, µ ∈ F, u, v ∈ U , then λu + µv ∈ U ”.
Alternatively, U is a subspace of V if it is itself a vector space, inheriting the
operations from V .
We sometimes write U ≤ V if U is a subspace of V .
Example.
If we have two subspaces U and V , there are several things we can do with
them. For example, we can take the intersection U ∩ V . We will shortly show
that this will be a subspace. However, taking the union will in general not
produce a vector space. Instead, we need the sum:
Definition (Sum of subspaces). Suppose U, W are subspaces of an F vector
space V . The sum of U and V is
U + W = {u + w : u ∈ U, w ∈ W }.
5
1 Vector spaces IB Linear Algebra
6
1 Vector spaces IB Linear Algebra
1 0 1
(i) Let V = R3 and S = 0 , 1 , 2 . Then
0 1 2
a
hSi = b : a, b ∈ R .
b
7
1 Vector spaces IB Linear Algebra
So S is linearly dependent.
This in turn gives an alternative characterization of what it means to be a
basis:
Proposition. If S = {e1 , · · · , en } is a subset of V over F, then it is a basis if
and only if every v ∈ V can be written uniquely as a finite linear combination
of elements in S, i.e. as
Xn
v= λi ei .
i=1
Then we have
n
X
0=v−v = (λi − µi )ei .
i=1
Linear independence implies that λi − µi = 0 for all i. Hence λi = µi . So v can
be expressed in a unique way.
On the other hand, if S is not linearly independent, then we have
n
X
0= λi ei
i=1
8
1 Vector spaces IB Linear Algebra
Tr = (T \ Tr0 ) ∪ {e1 , · · · , er }
9
1 Vector spaces IB Linear Algebra
spans V .
(Note that the case r = 0 is trivial, since we can take Tr0 = ∅, and the case
r = n is the theorem which we want to achieve.)
Suppose we have these. Since Tr spans V , we can write
k
X
er+1 = λi ti , λi ∈ F, ti ∈ Tr .
i=1
We know that the ei are linearly independent, so not all ti ’s are ei ’s. So there is
some j such that tj ∈ (T \ Tr0 ). We can write this as
1 X λi
tj = er+1 + − ti .
λj λj
i6=j
0
We let Tr+1 = Tr0 ∪ {tj } of order r + 1, and
0
Tr+1 = (T \ Tr+1 ) ∪ {e1 , · · · , er+1 } = (Tr \ {tj }} ∪ {er+1 }
V ⊇ hTr+1 i ⊇ hTr i = V.
So hTr+1 i = V .
Hence we can inductively find Tn .
From this lemma, we can immediately deduce a lot of important corollaries.
Corollary. Suppose V is a vector space over F with a basis of order n. Then
(i) Every basis of V has order n.
(ii) Any linearly independent set of order n is a basis.
(iii) Every spanning set of order n is a basis.
(iv) Every finite spanning set contains a basis.
(v) Every linearly independent subset of V can be extended to basis.
Proof. Let S = {e1 , · · · , en } be the basis for V .
(i) Suppose T is another basis. Since S is independent and T is spanning,
|T | ≥ |S|.
The other direction is less trivial, since T might be infinite, and Steinitz
does not immediately apply. Instead, we argue as follows: since T is
linearly independent, every finite subset of T is independent. Also, S is
spanning. So every finite subset of T has order at most |S|. So |T | ≤ |S|.
So |T | = |S|.
(ii) Suppose now that T is a linearly independent subset of order n, but
hT i =
6 V . Then there is some v ∈ V \ hT i. We now show that T ∪ {v} is
independent. Indeed, if
m
X
λ0 v + λi ti = 0
i=1
10
1 Vector spaces IB Linear Algebra
(iv) Suppose T is any finite spanning set. Let T 0 ⊆ T be a spanning set of least
possible size. This exists because T is finite. If |T 0 | has size n, then done
by (iii). Otherwise by the Steinitz exchange lemma, it has size |T 0 | > n.
So T 0 must be linearly dependent because S is spanning. So P there is some
t0 , · · · , tm ∈ T distinct and λ1 , · · · , λm ∈ F such that t0 = λi ti . Then
T 0 \ {t0 } is a smaller spanning set. Contradiction.
(v) Suppose T is a linearly independent set. Since S spans, there is some
S 0 ⊆ S of order |T | such that (S \ S 0 ) ∪ T spans V by the Steinitz exchange
lemma. So by (ii), (S \ S 0 ) ∪ T is a basis of V containing T .
Note that the last part is where we actually use the full result of Steinitz.
Finally, we can use this to define the dimension.
Definition (Dimension). If V is a vector space over F with finite basis S, then
the dimension of V , written
By the corollary, dim V does not depend on the choice of S. However, it does
depend on F. For example, dimC C = 1 (since {1} is a basis), but dimR C = 2
(since {1, i} is a basis).
After defining the dimension, we can prove a few things about dimensions.
Lemma. If V is a finite dimensional vector space over F, U ⊆ V is a proper
subspace, then U is finite dimensional and dim U < dim V .
Proof. Every linearly independent subset of V has size at most dim V . So let
S ⊆ U be a linearly independent subset of largest size. We want to show that S
spans U and |S| < dim V .
If v ∈ V \ hSi, then S ∪ {v} is linearly independent. So v 6∈ U by maximality
of S. This means that hSi = U .
Since U =6 V , there is some v ∈ V \ U = V \ hSi. So S ∪ {v} is a linearly
independent subset of order |S| + 1. So |S| + 1 ≤ dim V . In particular, dim U =
|S| < dim V .
11
1 Vector spaces IB Linear Algebra
The proof is not hard, as long as we manage to pick the right basis to do the
proof. This is our slogan:
When you choose a basis, always choose the right basis.
We need a basis for all four of them, and we want to compare the bases. So we
want to pick bases that are compatible.
Proof. Let R = {v1 , · · · , vr } be a basis for U ∩W . This is a linearly independent
subset of U . So we can extend it to be a basis of U by
S = {v1 , · · · , vr , ur+1 , · · · , us }.
T = {v1 , · · · , vr , wr+1 , · · · , wt }.
So X X X
λi vi + µj uj = − νk wk .
Since the left hand side is something in U , and the right hand side is something
in W , they both lie in U ∩ W .
Since S is a basis of U , there is only one way of writing the left hand vector
as a sum of vi and uj . However, since R is a basis of U ∩ W , we can write the
left hand vector just as a sum of vi ’s. So we must have µj = 0 for all j. Then
we have X X
λi vi + νk wk = 0.
Finally, since T is linearly independent, λi = νk = 0 for all i, k. So S ∪ T is
linearly independent.
Proposition. If V is a finite dimensional vector space over F and U ∪ V is a
subspace, then
dim V = dim U + dim V /U.
12
1 Vector spaces IB Linear Algebra
Proof. Let {u1 , · · · , um } be a basis for U and extend this to a basis {u1 , · · · , um ,
vm+1 , · · · , vn } for V . We want to show that {vm+1 + U, · · · , vn + U } is a basis
for V /U .
It is easy to see that this spans V /U . If v + U ∈ V /U , then we can write
X X
v= λi ui + µi vi .
Then X X X
v+U = µi (vi + U ) + λi (ui + U ) = µi (vi + U ).
So done.
To show that they are linearly independent, suppose that
X
λi (vi + U ) = 0 + U = U.
U ⊕ W = {(u, w) : u ∈ U, w ∈ W },
13
1 Vector spaces IB Linear Algebra
The difference between these two definitions is that the first is decomposing
V into smaller spaces, while the second is building a bigger space based on two
spaces.
Note, however, that the external direct sum U ⊕ W is the internal direct
sum of U and W viewed as subspaces of U ⊕ W , i.e. as the internal direct sum
of {(u, 0) : u ∈ U } and {(0, v) : v ∈ V }. So these two are indeed compatible
notions, and this is why we give them the same name and notation.
Definition ((Multiple) (internal) direct sum). If U1 , · · · , Un ⊆ V are subspaces
of V , then V is the (internal) direct sum
n
M
V = U1 ⊕ · · · ⊕ Un = Ui
i=1
P
if every v ∈ V can be written uniquely as v = ui with ui ∈ Ui .
This can be extended
P to an infinite sum with the same definition, just noting
that the sum v = ui has to be finite.
14
2 Linear maps IB Linear Algebra
2 Linear maps
In mathematics, apart from studying objects, we would like to study functions
between objects as well. In particular, we would like to study functions that
respect the structure of the objects. With vector spaces, the kinds of functions
we are interested in are linear maps.
= λα(u)i + µα(v)i .
So α is linear.
(ii) Let X be a set and g ∈ FX . Then we define mg : FX → FX by mg (f )(x) =
g(x)f (x). Then mg is linear. For example, f (x) 7→ 2x2 f (x) is linear.
Rx
(iii) Integration I : (C([a, b]), R) → (C([a, b]), R) defined by f 7→ a f (t) dt is
linear.
(iv) Differentiation D : (C ∞ ([a, b]), R) → (C ∞ ([a, b]), R) by f 7→ f 0 is linear.
15
2 Linear maps IB Linear Algebra
So β is linear.
Definition (Image and kernel). Let α : U → V be a linear map. Then the
image of α is
im α = {α(u) : u ∈ U }.
The kernel of α is
ker α = {u : α(u) = 0}.
16
2 Linear maps IB Linear Algebra
If two vector spaces are isomorphic, then it is not too surprising that they
have the same dimension, since isomorphic spaces are “the same”. Indeed this is
what we are going to show.
Proposition. Let α : U → V be an F-linear map. Then
(i) If α is injective and S ⊆ U is linearly independent, then α(S) is linearly
independent in V .
(ii) If α is surjective and S ⊆ U spans U , then α(S) spans V .
(iii) If α is an isomorphism and S ⊆ U is a basis, then α(S) is a basis for V .
Here (iii) immediately shows that two isomorphic spaces have the same
dimension.
Proof.
(i) We prove the contrapositive. Suppose that α is injective and α(S) is linearly
dependent. So there are s0 , · · · , sn ∈ S distinct and λ1 , · · · , λn ∈ F not all
zero such that
n n
!
X X
α(s0 ) = λi α(si ) = α λi si .
i=1 i=1
Then
n
X
v = α(u) = λi α(si ).
i=1
So α(S) spans V .
17
2 Linear maps IB Linear Algebra
defined by
α 7→ (α(e1 ), · · · , α(en )).
Proof. We first make sure this is indeed a function — if α is an isomorphism,
then from our previous proposition, we know that it sends a basis to a basis. So
(α(e1 ), · · · , α(en )) is indeed a basis for V .
We now have to prove surjectivity and injectivity.
Suppose α, β : Fn → V are isomorphism such that Φ(α) = Φ(β). In other
words, α(ei ) = β(ei ) for all i. We want to show that α = β. We have
x1 n
! x1
α ... = α xi β(ei ) = β ... .
X X X
xi ei = xi α(ei ) =
xn i=1 xn
Hence α = β.
Next, suppose that (v1 , · · · , vn ) is an ordered basis for V . Then define
x1
.. X
α . = xi vi .
xn
It is easy to check that this is well-defined and linear. We P also knowP that α is
injective since (v1 , · · · , vn ) is linearly independent. So if xi vi = yi vi , then
xi = yi . Also, α is surjective since (v1 , · · · , vn ) spans V . So α is an isomorphism,
and by construction Φ(α) = (v1 , · · · , vn ).
18
2 Linear maps IB Linear Algebra
Similarly, X
β(u) = ui f (ei ).
So α(u) = β(u) for every u. So α = β. P
For existence, if u ∈ U , we can write u = ui ei in a unique way. So defining
X
α(u) = ui f (ei )
19
2 Linear maps IB Linear Algebra
α
U V
Then the corollary tells us that every A gives rise to an α, and every α corresponds
to an A that fit into this diagram.
Proof. If α is a linear map U → V , then for each 1 ≤ i ≤ m, we can write α(ei )
uniquely as
Xn
α(ei ) = aji fj
j=1
for some aji ∈ F. This gives a matrix A = (aij ). The previous proposition tells
us that every matrix A arises in this way, and α is determined by A.
Definition (Matrix representation). We call the matrix corresponding to a
linear map α ∈ L(U, V ) under the corollary the matrix representing α with
respect to the bases (e1 , · · · , em ) and (f1 , · · · , fn ).
A B
Fr Fs Ft
s(R) s(S) s(T )
α β
U V W
20
2 Linear maps IB Linear Algebra
combination of w1 , · · · , wt :
!
X
βα(ui ) = β Aki vk
k
X
= Aki β(vk )
k
X X
= Aki Bjk wj
k j
!
X X
= Bjk Aki wj
j k
X
= (BA)ji wj
j
ᾱ : U/ ker α → im α
(u + ker α) 7→ α(u)
Note that if we view a vector space as an abelian group, then this is exactly
the first isomorphism theorem of groups.
Proof. We know that 0 ∈ ker α and 0 ∈ im α.
Suppose u1 , u2 ∈ ker α and λ1 , λ2 ∈ F. Then
21
2 Linear maps IB Linear Algebra
Proof. Let ek+1 , · · · , em be a basis for the kernel of α. Then we can extend this
to a basis of the (e1 , · · · , em ).
Let fi = α(ei ) for 1 ≤ i ≤ k. We now show that (f1 , · · · , fk ) is a basis for
im α (and thus k = r). We first show that it spans. Suppose v ∈ im α. Then we
have !
Xm
v=α λi ei
i=1
So v ∈ hf1 , · · · , fk i.
To show linear dependence, suppose that
k
X
µi fi = 0.
i=1
So we have !
k
X
α µi ei = 0.
i=1
22
2 Linear maps IB Linear Algebra
Pk
So i=1 µi ei ∈ ker α. Since (ek+1 , · · · , em ) is a basis for ker α, we can write
k
X m
X
µi ei = µi ei
i=1 i=k+1
Example. Let
W = {x ∈ R5 : x1 + x2 + x3 = 0 = x3 − x4 − x5 }.
α:U ⊕W →V
(u, w) 7→ u + w,
Then we have
This is a result we’ve previously obtained through fiddling with basis and horrible
stuff.
Corollary. Suppose α : U → V is a linear map between vector spaces over F
both of dimension n < ∞. Then the following are equivalent
(i) α is injective;
(ii) α is surjective;
23
2 Linear maps IB Linear Algebra
(iii) α is an isomorphism.
Proof. It is clear that, (iii) implies (i) and (ii), and (i) and (ii) together implies
(iii). So it suffices to show that (i) and (ii) are equivalent.
Note that α is injective iff n(α) = 0, and α is surjective iff r(α) = dim V = n.
By the rank-nullity theorem, n(α) + r(α) = n. So the result follows immediately.
Lemma. Let A ∈ Mn,n (F) = Mn (F) be a square matrix. The following are
equivalent
(i) There exists B ∈ Mn (F) such that BA = In .
(ii) There exists C ∈ Mn (F) such that AC = In .
If these hold, then B = C. We call A invertible or non-singular, and write
A−1 = B = C.
Proof. Let α, β, γ, ι : Fn → Fn be the linear maps represented by matrices
A, B, C, In respectively with respect to the standard basis.
We note that (i) is equivalent to saying that there exists β such that βα = ι.
This is true iff α is injective, which is true iff α is an isomorphism, which is true
iff α has an inverse α−1 .
Similarly, (ii) is equivalent to saying that there exists γ such that αγ = ι.
This is true iff α is injective, which is true iff α is isomorphism, which is true iff
α has an inverse α−1 .
So these are the same things, and we have β = α−1 = γ.
A
Fm Fn
We now want to consider what happens when we have two different basis {ui }
and {ei } of U . These will then give rise to two different maps from Fm to our
space U , and the two basis can be related by a change-of-basis map P . We can
put them in the following diagram:
ιU
U U
(ui ) (ei )
P
Fm Fm
24
2 Linear maps IB Linear Algebra
where ιU is the identity map. If we perform a change of basis for both U and V ,
we can stitch the diagrams together as
ιU α ιV
U U V V
(ui ) (ei ) (fi ) (vi )
P A Q
Fm Fm Fn Fn
B = Q−1 AP.
B = Q−1 AP,
Note that one can view P as the matrix representing the identity map iU
from U with basis (ui ) to U with basis (ei ), and similarly for Q. So both are
invertible.
Proof. On the one hand, we have
n
X XX X
α(ui ) = Bji vj = Bji Q`j f` = [QB]`i f` .
j=1 j ` `
QB = AP.
25
2 Linear maps IB Linear Algebra
Two matrices are equivalent if and only if they represent the same linear map
with respect to different basis.
Corollary. If A ∈ Matn,m (F), then there exists invertible matrices P ∈
GLm (F), Q ∈ GLn (F) so that
I 0
Q−1 AP = r
0 0
So r(AT ) = r.
26
2 Linear maps IB Linear Algebra
This is called a reflection, where the rows we changed are the ith and jth row.
1
..
.
1 λ
n
Eij (λ) =
. ..
1
..
.
1
27
2 Linear maps IB Linear Algebra
n
(ii) AEIj (λ) is obtained by adding λ× column i to column j.
(iii) ATin (λ) is obtained from A by rescaling the ith column by λ.
Multiplying on the left instead of the right would result in the same operations
performed on the rows instead of the columns.
Proposition. If A ∈ Matn,m (F), then there exists invertible matrices P ∈
GLm (F), Q ∈ GLn (F) so that
I 0
Q−1 AP = r
0 0
We are going to start with A, and then apply these operations to get it into
this form.
Proof. We claim that there are elementary matrices E1m , · · · , Eam and F1n , · · · , Fbn
(these E are not necessarily the shears, but any elementary matrix) such that
I 0
E1m · · · Eam AF1n · · · Fbn = r
0 0
This suffices since the Eim ∈ GLM (F) and Fjn ∈ GLn (F). Moreover, to prove the
claim, it suffices to find a sequence of elementary row and column operations
reducing A to this form.
If A = 0, then done. If not, there is some i, j such that Aij 6= 0. By swapping
row 1 and row i; and then column 1 and column j, we can assume A11 6= 0. By
rescaling row 1 by A111 , we can further assume A11 = 1.
Now we can add −A1j times column 1 to column j for each j 6= 1, and then
add −Ai1 times row 1 to row i 6= 1. Then we now have
1 0 ··· 0
0
A = .
..
B
0
It is an exercise to show that the row and column operations do not change
the row rank or column rank, and deduce that they are equal.
28
3 Duality IB Linear Algebra
3 Duality
Duality is a principle we will find throughout mathematics. For example, in IB
Optimisation, we considered the dual problems of linear programs. Here we will
look for the dual of vector spaces. In general, we try to look at our question in a
“mirror” and hope that the mirror problem is easier to solve than the original
mirror.
At first, the definition of the dual might see a bit arbitrary and weird. We
will try to motivate it using what we will call annihilators, but they are much
more useful than just for these. Despite their usefulness, though, they can be
confusing to work with at times, since the dual space of a vector space V will be
constructed by considering linear maps on V , and when we work with maps on
dual spaces, things explode.
x1 − x3 = 0
2x1 − x2 = 0.
29
3 Duality IB Linear Algebra
Example.
x1
– If V = R3 and θ : V → R that sends x2 7→ x1 − x3 , then θ ∈ V ∗ .
x3
It turns out it is rather easy to specify how the dual space looks like, at least
in the case where V is finite dimensional.
Lemma. If V is a finite-dimensional vector space over f with basis (e1 , · · · , en ),
then there is a basis (ε1 , · · · , εn ) for V ∗ (called the dual basis to (e1 , · · · , en ))
such that
εi (ej ) = δij .
Proof. Since linear maps are characterized by their values on a basis, there exists
unique choices for ε1 , · · · , εn ∈ V ∗ . Now we show that (ε1 , · · · , εn ) is a basis.
Suppose θ ∈ V ∗ . We show
Pn that we can write it uniquely Pas a combination of
n
ε1 , · · · , εn . We have θ = i=1 λi εi if and only if θ(ej ) = i=1 λi εi (ej ) (for all
j) if and only if λj = θ(ej ). So we have uniqueness and existence.
Corollary. If V is finite dimensional, then dim V = dim V ∗ .
When V is not finite dimensional, this need not be true. However, we
know that the dimension of V ∗ is at least as big as that of V , since the above
gives a set of dim V many independent vectors in V ∗ . In fact for any infinite
dimensional vector space, dim V ∗ is strictly larger than dim V , if we manage to
define dimensions for infinite-dimensional vector spaces.
It helps to come up with a more concrete example of how dual spaces look
like. Consider the vector space Fn , where we treat each element as a column
∗
vector (with respect to the standard Pnbasis). Then we can regard elements of V
as just row vectors (a1 , · · · , an ) = j=1 aj εj with respect to the dual basis. We
have
! n x1
X X X X .
a j εj ei = aj xi δij = ai xi = a1 · · · an .. .
xi i,j i=1 xn
30
3 Duality IB Linear Algebra
So we can compute
n
! n
! n
!
X X X
Pi` η` (ej ) = Pi` η` Qkj fk
`=1 `=1 k=1
X
= Pi` δ`k Qkj
k,`
X
= Pi` Q`j
k,`
= [P Q]ij
= δij .
Pn
So εi = `=1 P`iT η` .
Now we’ll return to our original motivation, and think how we can define
subspaces of V ∗ in terms of subspaces of V , and vice versa.
Definition (Annihilator). Let U ⊆ V . Then the annihilator of U is
U 0 = {θ ∈ V ∗ : θ(u) = 0, ∀u ∈ U }.
W 0 = {v ∈ V : θ(v) = 0, ∀θ ∈ W }.
31
3 Duality IB Linear Algebra
32
3 Duality IB Linear Algebra
What happens to the matrices when we take the dual map? The answer is
that we get the transpose.
Proposition. Let V, W be finite-dimensional vector spaces over F and α : V →
W be a linear map. Let (e1 , · · · , en ) be a basis for V and (f1 , · · · , fm ) be a basis
for W ; (ε1 , · · · , εn ) and (η1 , · · · , ηm ) the corresponding dual bases.
Suppose α is represented by A with respect to (ei ) and (fi ) for V and W .
Then α∗ is represented by AT with respect to the corresponding dual bases.
Proof. We are given that
m
X
α(ei ) = Aki fk .
k=1
So done.
Note that if α : U → V and β : V → W , θ ∈ W ∗ , then
33
3 Duality IB Linear Algebra
θ ∈ ker α∗ ⇔ α∗ (θ) = 0
⇔ (∀v ∈ V ) θα(v) = 0
⇔ (∀w ∈ im α) θ(w) = 0
⇔ θ ∈ (im α)0 .
So im α∗ ⊆ (ker α)0 .
But we know
dim(ker α)0 + dim ker α = dim V,
So we have
34
3 Duality IB Linear Algebra
35
3 Duality IB Linear Algebra
(i) If U ≤ V , then U 00 = U .
(ii) If α ∈ L(V, W ), then α∗∗ = α.
Proof.
(i) Let u ∈ U . Then u(θ) = θ(u) = 0 for all θ ∈ U 0 . So u annihilates
everything in U 0 . So u ∈ U 00 . So U ⊆ U 00 . We also know that
So we must have U = U 00 .
(ii) The proof of this is basically — the transpose of the transpose is the
original matrix. The only work we have to do is to show that the dual of
the dual basis is the original basis.
Let (e1 , · · · , en ) be a basis for V and (f1 , · · · , fm ) be a basis for W , and let
(ε1 , · · · , εn ) and (η1 , · · · , ηn ) be the corresponding dual basis. We know
that
ei (εj ) = δij = εj (ei ), fi (ηj ) = δij = ηj (fi ).
So (e1 , · · · , en ) is dual to (ε1 , · · · , εn ), and similarly for f and η.
If α is represented by A, then α∗ is represented by AT . So α∗∗ is represented
by (AT )T = A. So done.
Proposition. Let V be a finite-dimensional vector space F and U1 , U2 are
subspaces of V . Then we have
(i) (U1 + U2 )0 = U10 ∩ U20
(ii) We have
So done.
36
4 Bilinear forms I IB Linear Algebra
4 Bilinear forms I
So far, we have been looking at linear things only. This can get quite boring.
For a change, we look at bi linear maps instead. In this chapter, we will look at
bilinear forms in general. It turns out there isn’t much we can say about them,
and hence this chapter is rather short. Later, in Chapter 7, we will study some
special kinds of bilinear forms which are more interesting.
Definition (Bilinear form). Let V, W be vector spaces over F. Then a function
φ : V × W → F is a bilinear form if it is linear in each variable, i.e. for each
v ∈ V , φ(v, · ) : W → F is linear; for each w ∈ W , φ( · , w) : V → F is linear.
Example. The map defined by
V ×V∗ →F
(v, θ) 7→ θ(v) = ev(v)(θ)
is a bilinear form.
Pn
Example. Let V = W = Fn . Then the function (v, w) = i=1 vi wi is bilinear.
Example. If V = W = C([0, 1], R), then
Z a
(f, g) 7→ f g dt
0
is a bilinear form.
Example. Let A ∈ Matm,n (F). Then
φ : Fm × Fn → F
(v, w) 7→ vT Aw
is bilinear. Note that the (real) dot product is the special case of this, where
n = m and A = I.
In fact, this is the most general form of bilinear forms on finite-dimensional
vector spaces.
Definition (Matrix representing bilinear form). Let (e1 , · · · , en ) be a basis for
V and (f1 , · · · , fm ) be a basis for W , and ψ : V × W → F. Then the matrix A
representing ψ with respect to the basis is defined to be
Aij = ψ(ei , fj ).
P P
Note that if v = λi ei and w = µj fj , then by linearity, we get
X
ψ(v, w) = ψ λi ei , w
X
= λi ψ(ei , w)
i
X X
= λi ψ ei , µj fj
i
X
= λi µj ψ(ei , fj )
i,j
= λT Aµ.
37
4 Bilinear forms I IB Linear Algebra
So ψ is determined by A.
We have identified linear maps with matrices, and we have identified bilinear
maps with matrices. However, you shouldn’t think linear maps are bilinear maps.
They are, obviously, two different things. In fact, the matrices representing
matrices and bilinear forms transform differently when we change basis.
Proposition. Suppose (e1 , · · · , en ) and (v1 , · · · , vn ) are basis for V such that
X
vi = Pki ek for all i = 1, · · · , n;
Bij = φ(vi , wj )
X X
=φ Pki ek , Q`j f`
X
= Pki Q`j φ(ek , f` )
X
T
= Pik Ak` Q`j
k,`
= (P T AQ)ij .
Note that while the transformation laws for bilinear forms and linear maps
are different, we still get that two matrices are representing the same bilinear
form with respect to different bases if and only if they are equivalent, since if
B = P −1 AQ, then B = ((P −1 )T )T AQ.
If we are given a bilinear form ψ : V × W → F, we immediately get two linear
maps:
ψL : V → W ∗ , ψR : W → V ∗ ,
defined by ψL (v) = ψ(v, · ) and ψR (w) = ψ( · , w).
For example, if ψ : V × V ∗ → F, is defined by (v, θ) 7→ θ(v), then ψL : V →
V is the evaluation map. On the other hand, ψR : V ∗ → V ∗ is the identity
∗∗
map.
38
4 Bilinear forms I IB Linear Algebra
So we get X
ψL (ei ) = AT`i η` .
So AT represents ψL .
We also have
ψR (fj )(ei ) = Aij .
So X
ψR (fj ) = Akj εk .
Definition (Left and right kernel). The kernel of ψL is left kernel of ψ, while
the kernel of ψR is the right kernel of ψ.
Then by definition, v is in the left kernel if ψ(v, w) = 0 for all w ∈ W .
More generally, if T ⊆ V , then we write
Proof. Since ψR and ψL are represented by A and AT (in some order), they both
have trivial kernel if and only if n(A) = n(AT ) = 0. So we need r(A) = dim V
and r(AT ) = dim W . So we need dim V = dim W and A have full rank, i.e. the
corresponding linear map is bijective. So done.
39
4 Bilinear forms I IB Linear Algebra
F2 × F2 → F
a b
, 7→ ad − bc
c d
40
5 Determinants of matrices IB Linear Algebra
5 Determinants of matrices
We probably all know what the determinant is. Here we are going to give a
slightly more abstract definition, and spend quite a lot of time trying motivate
this definition.
Recall that Sn is the group of permutations of {1, · · · , n}, and there is a
unique group homomorphism ε : Sn → {±1} such that ε(σ) = 1 if σ can be
written as a product of an even number of transpositions; ε(σ) = −1 if σ can be
written as an odd number of transpositions. It is proved in IA Groups that this
is well-defined.
X n
Y
det A = ε(σ) Aiσ(i) .
σ∈Sn i=1
This is a big scary definition. Hence, we will spend the first half of the
chapter trying to understand what this really means, and how it behaves. We
will eventually prove a formula that is useful for computing the determinant,
which is probably how you were first exposed to the determinant.
Example. If n = 2, then S2 = {id, (1 2)}. So
det A = A11 A22 A33 + A12 A23 A31 + A13 A21 A32
− A11 A23 A32 − A22 A31 A13 − A33 A12 A21 .
We will first prove a few easy and useful lemmas about the determinant.
X n
Y
= ε(τ ) Ajτ (j)
τ ∈Sn j=1
= det A.
41
5 Determinants of matrices IB Linear Algebra
Then
n
Y
det A = aii .
i=1
Proof. We have
X n
Y
det A = ε(σ) Aiσ(i)
σ∈Sn i=1
d(v1 , · · · , vn ) = 0.
42
5 Determinants of matrices IB Linear Algebra
since if σ(i) is not k or l, then τ does nothing; if σ(i) is k or l, then τ just swaps
them around, but A(k) = A(l) . So we get
n
X Y n
X Y
Aiσ(i) = Aiσ0 (i) ,
σ∈An i=1 σ 0 ∈τ An i=1
We have shown that determinants are volume forms, but is this the only
volume form? Well obviously not, since 2 det A is also a valid volume form.
However, in some sense, all volume forms are “derived” from the determinant.
Before we show that, we need the following
Lemma. Let d be a volume form on Fn . Then swapping two entries changes
the sign, i.e.
d(v1 , · · · , vi , · · · , vj , · · · , vn ) = −d(v1 , · · · , vj , · · · , vi , · · · , vn ).
0 = d(v1 , · · · , vi + vj , · · · , vi + vj , · · · , vn )
= d(v1 , · · · , vi , · · · , vi , · · · , vn )
+ d(v1 , · · · , vi , · · · , vj , · · · , vn )
+ d(v1 , · · · , vj , · · · , vi , · · · , vn )
+ d(v1 , · · · , vj , · · · , vj , · · · , vn )
= d(v1 , · · · , vi , · · · , vj , · · · , vn )
+ d(v1 , · · · , vj , · · · , vi , · · · , vn ).
So done.
43
5 Determinants of matrices IB Linear Algebra
Corollary. If σ ∈ Sn , then
for any vi ∈ Fn .
Theorem. Let d be any volume form on Fn , and let A = (A(1) · · · A(n) ) ∈
Matn (F). Then
We know that lots of these are zero, since if ik = ij for some k, j, then the term
is zero. So we are just summing over distinct tuples, i.e. when there is some σ
such that ij = σ(j). So we get
X n
Y
d(A(1) , · · · , A(n) ) = d(eσ(1) , · · · , eσ(n) ) Aσ(j)j .
σ∈Sn j=1
So done.
We can rewrite the formula as
It is not hard to see that the same proof gives for any v1 , · · · , vn , we have
44
5 Determinants of matrices IB Linear Algebra
λn
1 λ1
.. ..
. .
1 λk−1
B=
λk
λk+1 1
.. ..
. .
λn 1
So AB has the kth column identically zero. So det(AB) = 0. So it is sufficient
to prove that det(B) 6= 0. But det B = λk 6= 0. So done.
45
5 Determinants of matrices IB Linear Algebra
We are now going to come up with an alternative formula for the determinant
(which is probably the one you are familiar with). To do so, we introduce the
following notation:
Notation. Write Âij for the matrix obtained from A by deleting the ith row
and jth column.
Lemma. Let A ∈ Matn (F). Then
(i) We can expand det A along the jth column by
n
X
det A = (−1)i+j Aij det Âij .
i=1
We could prove this directly from the definition, but that is messy and scary,
so let’s use volume forms instead.
Proof. Since det A = det AT , (i) and (ii) are equivalent. So it suffices to prove
just one of them. We have
where d is the volume form induced by the determinant. Then we can write as
n
!
X
(1) (n)
det A = d A , · · · , Aij ei , · · · , A
i=1
n
X
= Aij d(A(1) , · · · , ei , · · · , A(n) )
i=1
The volume form on the right is the determinant of a matrix with the jth column
replaced with ei . We can move our columns around so that our matrix becomes
Âij 0
B=
stuff 1
We get that det B = det Âij , since the only permutations that give a non-zero
sum are those that send n to n. In the row and column swapping, we have made
n − j column transpositions and n − i row transpositions. So we have
n
X
det A = Aij (−1)n−j (−1)n−i det B
i=1
n
X
= Aij (−1)i+j det Âij .
i=1
This is not only useful for computing determinants, but also computing
inverses.
46
5 Determinants of matrices IB Linear Algebra
The calculation for [A adj A] = (det A)In can be done in a similar manner, or by
considering (A adj A)T = (adj A)T AT = (adj(AT ))AT = (det A)In .
Note that the coefficients of (adj A) are just given by polynomials in the
entries of A, and so is the determinant. So if A is invertible, then its inverse is
given by a rational function (i.e. ratio of two polynomials) in the entries of A.
This is very useful theoretically, but not computationally, since the polyno-
mials are very large. There are better ways computationally, such as Gaussian
elimination.
We’ll end with a useful tricks to compute the determinant.
Lemma. Let A, B be square matrices. Then for any C, we have
A C
det = (det A)(det B).
0 B
Proof. Suppose A ∈ Matk (F), and B ∈ Mat` (F), so C ∈ Matk,` (F). Let
A C
X= .
0 B
X k+`
Y
det X = ε(σ) Xiσ(i) .
σ∈Sk+` i=1
If j ≤ k and i > k, then Xij = 0. We only want to sum over permutations σ such
that σ(i) > k if i > k. So we are permuting the last j things among themselves,
and hence the first k things among themselves. So we can decompose this into
47
5 Determinants of matrices IB Linear Algebra
X k
Y `
Y
det X = ε(σ1 σ2 ) Xiσ1 (i) Xk+j σ2 (k+j)
σ=σ1 σ2 i=1 j=1
k
! `
X Y X Y
= ε(σ1 ) Aiσ1 (i) ε(σ2 ) Bjσ2 (j)
σ1 ∈Sk i=1 σ2 ∈S` j=1
= (det A)(det B)
Corollary.
A1 stuff
A2 Yn
det = det Ai
..
. i=1
0 An
48
6 Endomorphisms IB Linear Algebra
6 Endomorphisms
Endomorphisms are linear maps from a vector space V to itself. One might
wonder — why would we want to study these linear maps in particular, when
we can just work with arbitrary linear maps from any space to any other space?
When we work with arbitrary linear maps, we are free to choose any basis
for the domain, and any basis for the co-domain, since it doesn’t make sense to
require they have the “same” basis. Then we proved that by choosing the right
bases, we can put matrices into a nice form with only 1’s in the diagonal.
However, when working with endomorphisms, we can require ourselves to use
the same basis for the domain and co-domain, and there is much more we can
say. One major objective is to classify all matrices up to similarity, where two
matrices are similar if they represent the same endomorphism under different
bases.
6.1 Invariants
Definition. If V is a (finite-dimensional) vector space over F. An endomorphism
of V is a linear map α : V → V . We write End(V ) for the F-vector space of all
such linear maps, and I for the identity map V → V .
When we think about matrices representing an endomorphism of V , we’ll
use the same basis for the domain and the range. We are going to study some
properties of these endomorphisms that are not dependent on the basis we pick,
known as invariants.
Lemma. Suppose (e1 , · · · , en ) and (f1 , · · · , fn ) are bases for V and α ∈ End(V ).
If A represents α with respect to (e1 , · · · , en ) and B represents α with respect
to (f1 , · · · , fn ), then
B = P −1 AP,
where P is given by
n
X
fi = Pji ej .
j=1
Proof. This is merely a special case of an earlier more general result for arbitrary
maps and spaces.
Definition (Similar matrices). We say matrices A and B are similar or conjugate
if there is some P invertible such that B = P −1 AP .
Recall that GLn (F), the group of invertible n × n matrices. GLn (F) acts on
Matn (F) by conjugation:
(P, A) 7→ P AP −1 .
We are conjugating it this way so that the associativity axiom holds (otherwise
we get a right action instead of a left action). Then A and B are similar iff they
are in the same orbit. Since orbits always partition the set, this is an equivalence
relation.
Our main goal is to classify the orbits, i.e. find a “nice” representative for
each orbit.
49
6 Endomorphisms IB Linear Algebra
tr AB = tr BA.
(iii) We have
50
6 Endomorphisms IB Linear Algebra
is a direct sum.
Proof. Suppose
k
X k
X
xi = yi ,
i=1 i=1
with xi , yi ∈ E(λi ). We want to show that they are equal. We are going to find
some clever map that tells us what xi and yi are. Consider βj ∈ End(V ) defined
by Y
βj = (α − λr ι).
r6=j
Then
k
! k Y
X X
βj xi = (α − λr ι)(xi )
i=1 i=1 r6=j
k Y
X
= (λi − λr )(xi ).
i=1 r6=j
51
6 Endomorphisms IB Linear Algebra
Similarly, we obtain
k
!
X Y
βj yi = (λj − λr )(yj ).
i=1 r6=j
P P
Since we know that xi = yi , we must have
Y Y
(λj − λr )xj = (λj − λr )yj .
r6=j r6=j
Q
Since we know that r6=j (λPr − λj ) 6= 0, we must have xi = yi for all i.
So each expression for xi is unique.
The proof shows that any set of non-zero eigenvectors with distinct eigenvalues
is linearly independent.
Proof.
– (i) ⇔ (ii): Suppose (e1 , · · · , en ) is a basis for V . Then
α(ei ) = Aji ej ,
52
6 Endomorphisms IB Linear Algebra
with m ≥ 0, a0 , · · · , am ∈ F.
We write F[t] for the set of polynomials over F.
Note that we don’t identify a polynomial f with the corresponding function
it represents. For example, if F = Z/pZ, then tp and t are different polynomials,
even though they define the same function (by Fermat’s little theorem/Lagrange’s
theorem). Two polynomials are equal if and only if they have the same coeffi-
cients.
However, we will later see that if F is R or C, then polynomials are equal
if and only if they represent the same function, and this distinction is not as
important.
Definition (Degree). Let f ∈ F[t]. Then the degree of f , written deg f is the
largest n such that an 6= 0. In particular, deg 0 = −∞.
Notice that deg f g = deg f + deg g and deg f + g ≤ max{deg f, deg g}.
Lemma (Polynomial division). If f, g ∈ F[t] (and g =
6 0), then there exists
q, r ∈ F[t] with deg r < deg g such that
f = qg + r.
Proof is omitted.
Lemma. If λ ∈ F is a root of f , i.e. f (λ) = 0, then there is some g such that
f (t) = (t − λ)g(t).
for some g(t), r(t) ∈ F[t] with deg r < deg(t − λ) = 1. So r has to be constant,
i.e. r(t) = a0 for some a0 ∈ F. Now evaluate this at λ. So
So a0 = 0. So r = 0. So done.
Definition (Multiplicity of a root). Let f ∈ F[t] and λ a root of f . We say λ
has multiplicity k if (t − λ)k is a factor of f but (t − λ)k+1 is not, i.e.
53
6 Endomorphisms IB Linear Algebra
We can use the last lemma and induction to show that any non-zero f ∈ F[t]
can be written as
Yk
f = g(t) (t − λi )ai ,
i=1
54
6 Endomorphisms IB Linear Algebra
Now let
k
Y
p(t) = (t − λi ).
i=1
with λ1 , · · · , λk ∈ F distinct, and p(α) = 0 (we can wlog assume p is monic, i.e.
the leading coefficient is 1). We will show that
k
X
V = Eα (λi ).
i=1
In other words, we want to show P that for all v ∈ V , there is some vi ∈ Eα (λi )
for i = 1, · · · , k such that v = vi .
To find these vi out, we let
Y t − λi
qj (t) = .
λj − λi
i6=j
We still have deg q ≤ k − 1, but q(λi ) = 1 for any i. Since q and 1 agree on k
points, we must have q = 1.
Let πj : V → V be given by πj = qj (α). Then the above says that
k
X
πj = ι.
j=1
P
Hence given v ∈ V , we know that v = πj v.
We now check that πj v ∈ Eα (λj ). This is true since
k
1 Y 1
(α − λj ι)πj v = Q (α − λι )(v) = Q p(α)(v) = 0.
i6=j (λj − λi ) i=1 i6=j (λj − λi )
55
6 Endomorphisms IB Linear Algebra
So
αvj = λj vj .
So done.
In the above proof, if v ∈ Eα (λi ), then πj (v) = δij v. So πi is a projection
onto the Eα (λi ).
Definition (Minimal polynomial). The minimal polynomial of α ∈ End(V ) is
the non-zero monic polynomial Mα (t) of least degree such that Mα (α) = 0.
The monic requirement is just for things to look nice, since we can always
divide by the leading coefficient of a polynomial to get a monic version.
Note that if A represents α, then for all p ∈ F[t], p(A) represents p(α).
Thus p(α) is zero iff p(A) = 0. So the minimal polynomial of α is the minimal
polynomial of A if we define MA analogously.
There are two things we want to know — whether the minimal polynomial
exists, and whether it is unique.
Existence is always guaranteed in finite-dimensional cases. If dim V = n < ∞,
2
then dim End(V ) = n2 . So ι, α, α2 , · · · , αn are linearly dependent. So there are
some λ0 , · · · , λn2 ∈ F not all zero such that
2
n
X
λi αi = 0.
i=0
56
6 Endomorphisms IB Linear Algebra
Mβ (β|Ei ) = Mβ (β)|Ei = 0.
Since Mβ (t) is a product of its distinct linear factors, it follows that β|Ei is
diagonalizable. So we can choose a basis Bi of eigenvectors for β|Ei . We can do
this for all i. Sk
Then since V is a direct sum of the Ei ’s, we know that B = i=1 Bi is a
basis for V consisting of eigenvectors for both α and β. So done.
57
6 Endomorphisms IB Linear Algebra
We will not prove this yet, but just talk about it first. It is tempting to prove
this by substituting t = α into det(tι − α) and get det(α − α) = 0, but this is
meaningless, since what the statement χα (t) = det(tι − α) tells us to do is to
expand the determinant of the matrix
t − a11 a12 ··· a1n
a21
t − a22 · · · a2n
.. .. .. ..
. . . .
an1 an2 ··· t − ann
then
ρ(λ1 )
ρ(A) =
.. .
.
ρ(λn )
Qn
Since χA (t) is defined as i=1 (t − λi ), it follows that χA (A) = 0. So if α is
diagonalizable, then the theorem is clear.
This was easy. Diagonalizable matrices are nice. The next best thing we can
look at is upper-triangular matrices.
58
6 Endomorphisms IB Linear Algebra
Then
t − λ1 ∗ ··· ∗
0 t − λ2 ··· ∗ Yn
χα (t) = det . = (t − λi ).
.. .. ..
.. . . . i=1
0 0 ··· t − λn
So it is a product of linear factors.
We are going to prove the converse by induction on the dimension of our
space. The base case dim V = 1 is trivial, since every 1 × 1 matrix is already
upper triangular.
Suppose α ∈ End(V ) and the result holds for all spaces of dimensions
< dim V , and χα is a product of linear factors. In particular, χα (t) has a root,
say λ ∈ F.
Now let U = E(λ) 6= 0, and let W be a complementary subspace to U in V ,
i.e. V = U ⊕ W . Let u1 , · · · , ur be a basis for U and wr+1 , · · · , wn be a basis
for W so that u1 , · · · , ur , wr+1 , · · · , wn is a basis for V , and α is represented by
λIr stuff
0 B
This is not similar to a real upper triangular matrix (if θ is not an integer
multiple of π). This is since the eigenvalues are e±iθ and are not real. On the
other hand, as a complex matrix, it is triangulable, and in fact diagonalizable
since the eigenvalues are distinct.
For this reason, in the rest of the section, we are mostly going to work in C.
We can now prove the Cayley-Hamilton theorem.
59
6 Endomorphisms IB Linear Algebra
Proof. In this proof, we will work over C. By the lemma, we can choose a basis
{e1 , · · · , en } is represented by an upper triangular matrix.
λ1 ∗ · · · ∗
0 λ2 · · · ∗
A= . . .
.. . .
.. . . ..
0 0 ··· λn
V0 = 0 ⊆ V1 ⊆ · · · ⊆ Vn−1 ⊆ Vn = V.
So χα (α) = 0 as required.
Note that if our field F is not C but just a subfield of C, say R, we can just
pretend it is a complex matrix, do the same proof.
60
6 Endomorphisms IB Linear Algebra
We can see this proof more “visually” as follows: for simplicity of expression,
we suppose n = 4. In the basis where α is upper-triangular, the matrices A − λi I
look like this
0 ∗ ∗ ∗ ∗ ∗ ∗ ∗
0 ∗ ∗ ∗ 0 0 ∗ ∗
A − λ1 I =
0 0 ∗ ∗
A − λ2 I = 0 0 ∗ ∗
0 0 0 ∗ 0 0 0 ∗
∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗
0 ∗ ∗ ∗ 0 ∗ ∗ ∗
A − λ3 I =
0 0 0 ∗
A − λ4 I = 0 0 ∗ ∗
0 0 0 ∗ 0 0 0 0
This is exactly what we showed in the proof — after multiplying out the first k
elements of the product (counting from the right), the image is contained in the
span of the first n − k basis vectors.
Proof. We’ll now prove the theorem again, which is somewhat a formalization
of the “nonsense proof” where we just substitute t = α into det(α − tι).
Let α be represented by A, and B = tI − A. Then
But we know that adj B is a matrix with entries in F[t] of degree at most n − 1.
So we can write
61
6 Endomorphisms IB Linear Algebra
We would like to just throw in t = A, and get the desired result, but in all these
derivations, t is assumed to be a real number, and, tIn − A is the matrix
t − a11 a12 ··· a1n
a21
t − a22 · · · a2n
.. .. .. ..
. . . .
an1 an2 ··· t − ann
−AB0 = a0 In
B0 − AB1 = a1 In
..
.
Bn−2 − ABn−1 = an−1 In
ABn−1 − 0 = In
−AB0 = a0 In
AB0 − A2 B1 = a1 A
..
.
An−1 Bn−2 − An Bn−1 = an−1 An−1
An Bn−1 − 0 = An .
(i) λ is an eigenvalue of α.
(ii) λ is a root of χα (t).
(iii) λ is a root of Mα (t).
Proof.
62
6 Endomorphisms IB Linear Algebra
Mα (α)(v) = 0(v) = 0.
(α − λι)g(α)(v) = Mα (α)v = 0.
We can compute χA (t) = (t−1)2 (t−2). So we know that the minimal polynomial
is one of (t − 1)2 (t − 2) and (t − 1)(t − 2).
By direct and boring computations, we can find (A − I)(A − 2I) = 0. So we
know that MA (t) = (t − 1)(t − 2). So A is diagonalizable.
63
6 Endomorphisms IB Linear Algebra
– Let
λ 1 ··· 0
.. ..
0 λ . .
A=
. ..
.
.. ..
. . 1
0 0 ··· λ
We will later show that aλ = n = cλ and gλ = 1.
– Consider A = λI. Then aλ = gλ = n, cλ = 1.
Lemma. If λ is an eigenvalue of α, then
(i) 1 ≤ gλ ≤ aλ
(ii) 1 ≤ cλ ≤ aλ .
Proof.
(i) The first inequality is easy. If λ is an eigenvalue, then E(λ) 6= 0. So
gλ = dim E(λ) ≥ 1. To prove the other inequality, if v1 , · · · , vg is a basis
for E(λ), then we can extend it to a basis for V , and then α is represented
by
λIg ∗
0 B
So χα (t) = (t − λ)g χB (t). So aλ > g = gλ .
(ii) This is straightforward since Mα (λ) = 0 implies 1 ≤ cλ , and since Mα (t) |
χα (t), we know that cλ ≤ αλ .
Lemma. Suppose F = C and α ∈ End(V ). Then the following are equivalent:
(i) α is diagonalizable.
(ii) gλ = aλ for all eigenvalues of α.
(iii) cλ = 1 for all λ.
Proof.
P
– (i) ⇔ (ii): α is diagonalizable iff dim V = dim Eα (λi ). But this is
equivalent to
X X
dim V = gλi ≤ aλi = deg χα = dim V.
P P
So we must have gλi = aλi . Since each gλi is at most aλi , they must
be individually equal.
– (i) ⇔ (iii): α is diagonalizable if and only if Mα (t) is a product of distinct
linear factors if and only if cλ = 1 for all eigenvalues λ.
Definition (Jordan normal form). We say A ∈ MatN (C) is in Jordan normal
form if it is a block diagonal of the form
Jn1 (λ1 ) 0
Jn2 (λ2 )
..
.
0 Jnk (λk )
64
6 Endomorphisms IB Linear Algebra
P
where k ≥ 1, n1 , · · · , nk ∈ N such that n = ni , λ1 , · · · , λk not necessarily
distinct, and
λ 1 ··· 0
.. ..
0 λ . .
Jm (λ) =
. .
.. .. ..
. 1
0 0 ··· λ
is an m × m matrix. Note that Jm (λ) = λIm + Jm (0).
Theorem (Jordan normal form theorem). Every matrix A ∈ Matn (C) is similar
to a matrix in Jordan normal form. Moreover, this Jordan normal form matrix
is unique up to permutation of the blocks.
This is a complete solution to the classification problem of matrices, at least
in C. We will not prove this completely. We will only prove the uniqueness part,
and then reduce the existence part to a special form of endomorphisms. The
remainder of the proof is left for IB Groups, Rings and Modules.
We can rephrase this result using linear maps. If α ∈ End(V ) is an endo-
morphism of a finite-dimensional vector space V over C, then the theorem says
there exists a basis such that α is represented by a matrix in Jordan normal
form, and this is unique as before.
Note that the permutation thing is necessary, since if two matrices in Jordan
normal form differ only by a rearrangement of blocks, then they are similar, by
permuting the basis.
Example. Every 2 × 2 matrix in Jordan normal form is one of the three types:
65
6 Endomorphisms IB Linear Algebra
0 0 ··· λ
If (e1 , · · · , en ) is the standard basis for Cn , we have
Thus we know (
0 i≤k
Jn (0)k (ei ) =
ei−k k<i≤n
In other words, for k < n, we have
k k 0 In−k
(Jn (λ) − λI) = Jn (0) = .
0 0
66
6 Endomorphisms IB Linear Algebra
67
6 Endomorphisms IB Linear Algebra
Hence we know
(
r r−1 1 r≤m
n((Jm (λ) − λIm ) ) − n((Jm (λ) − λIm ) )=
0 otherwise.
V = V1 ⊕ · · · ⊕ Vk ,
68
6 Endomorphisms IB Linear Algebra
This allows us to decompose V into a block diagonal matrix, and then each
block will only have one eigenvalue.
Note that if c1 = · · · = ck = 1, then we recover the diagonalizability theorem.
Hence, it is not surprising that the proof of this is similar to the diagonalizability
theorem. We will again prove this by constructing projection maps to each of
the Vi .
Proof. Let Y
pj (t) = (t − λi )ci .
i6=j
Then p1 , · · · , pk have no common factors, i.e. they are coprime. Thus by Euclid’s
algorithm, there exists q1 , · · · , qk ∈ C[t] such that
X
pi qi = 1.
So X
V = Vj .
To show this is a direct sum, note that πi πj = 0, since the product contains
Mα (α) as a factor. So
X
πi = ιπi = πj πi = πi2 .
P
So π is a projection, and πj |Vj = ιVj . So if v = vi , then applying πi to both
sides gives vi = πi (v).
L Hence there is a unique way of writing v as a sum of
things in Vi . So V = Vj as claimed.
Note that we didn’t really use the fact that the vector space is over C, except
to get that the minimal polynomial is a product of linear factors. In fact, for
arbitrary vector spaces, if the minimal polynomial of a matrix is a product of
linear factors, then it can be put into Jordan normal form. The converse is also
true — if it can be put into Jordan normal form, then the minimal polynomial
is a product of linear factors, since we’ve seen that a necessary and sufficient
condition for the minimal polynomial to be a product of linear factors is for
there to be a basis in which the matrix is upper triangular.
Using this theorem, by restricting α to its generalized eigenspaces, we can
reduce the existence part of the Jordan normal form theorem to the case Mα (t) =
(t − λ)c . Further by replacing α by α − λι, we can reduce this to the case where
0 is the only eigenvalue.
69
6 Endomorphisms IB Linear Algebra
70
6 Endomorphisms IB Linear Algebra
Equivalently, we have
We need to pick our v2 that is in this kernel but not in the kernel of A − I
(which is the eigenspace E1 we have computed above). So we have
1 0 2
v2 = 1 , v1 = 0 , v3 = 1 .
0 1 2
Hence we have
0 1 2
P = 0 1 1
1 0 2
and
1 1 0
P −1 AP = 0 1 0 .
0 0 2
71
7 Bilinear forms II IB Linear Algebra
7 Bilinear forms II
In Chapter 4, we have looked at bilinear forms in general. Here, we want to look
at bilinear forms on a single space, since often there is just one space we are
interested in. We are also not looking into general bilinear forms on a single
space, but just those that are symmetric.
= φ(y, x).
We are going to see what happens when we change basis. As in the case of
endomorphisms, we will require to change basis in the same ways on both sides.
Lemma. Let V is a finite-dimensional vector space, and φ : V × V → F a
bilinear form. Let (e1 , · · · , en ) and (f1 , · · · , fn ) be bases of V such that
n
X
fi = Pki ek .
k=1
72
7 Bilinear forms II IB Linear Algebra
B = P T AP.
q(v) = φ(v, v)
for all v ∈ V .
Note that quadratic forms are not linear maps (they are quadratic).
Example. Let V = R2 and φ be represented by A with respect to the standard
basis. Then
x A11 A12 x
q = x y = A11 x2 + (A12 + A21 )xy + A22 y 2 .
y A21 A22 y
73
7 Bilinear forms II IB Linear Algebra
q(v + w) = φ(v + w, v + w)
= φ(v, v) + φ(v, w) + φ(w, v) + φ(w, w)
= q(v) + 2φ(v, w) + q(w).
So we have
1
φ(v, w) =(q(v + w) − q(v) − q(w)).
2
So it is determined by q, and hence unique.
Theorem. Let V be a finite-dimensional vector space over F, and φ : V ×V → F
a symmetric bilinear form. Then there exists a basis (e1 , · · · , en ) for V such
that φ is represented by a diagonal matrix with respect to this basis.
This tells us classifying symmetric bilinear forms is easier than classifying
endomorphisms, since for endomorphisms, even over C, we cannot always make
it diagonal, but we can for bilinear forms over arbitrary fields.
Proof. We induct over n = dim V . The cases n = 0 and n = 1 are trivial, since
all matrices are diagonal.
Suppose we have proven the result for all spaces of dimension less than n.
First consider the case where φ(v, v) = 0 for all v ∈ V . We want to show that
we must have φ = 0. This follows from the polarization identity, since this φ
induces the zero quadratic form, and we know that there is a unique bilinear
form that induces the zero quadratic form. Since we know that the zero bilinear
form also induces the zero quadratic form, we must have φ = 0. Then φ will
be represented by the zero matrix with respect to any basis, which is trivially
diagonal.
If not, pick e1 ∈ V such that φ(e1 , e1 ) 6= 0. Let
74
7 Bilinear forms II IB Linear Algebra
for some λ, µ, ν ∈ R.
There are two ways to do this. The first way is to follow the proof we just had.
We first find our symmetric bilinear form. It is the bilinear form represented by
the matrix
1 1 3
A = 1 1 2 .
3 2 1
We then find f1 such that φ(f1 , f1 ) 6= 0. We note that q(e1 ) = 1 6= 0. So we pick
1
f1 = e1 = 0 .
0
Then
1 1 3 v1
φ(e1 , v) = 1 0 0 1 1 2 v2 = v1 + v2 + 3v3 .
3 2 1 v3
Next we need to pick our f_2. Since it is to lie in the kernel of φ(f_1, · ), it must
satisfy φ(f_1, f_2) = 0; we also need
    φ(f_2, f_2) ≠ 0.
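This "follow the proof" method can be carried out mechanically by simultaneous row and column operations. Below is a rough numpy sketch of such a procedure; the function name, the tolerance, and the pivot-fixing details are my own choices, not taken from the notes.

import numpy as np

def diagonalize_symmetric(A, eps=1e-12):
    # Returns (P, D) with D = P^T A P diagonal, for a real symmetric matrix A.
    A = np.array(A, dtype=float)
    n = A.shape[0]
    P = np.eye(n)
    for i in range(n):
        if abs(A[i, i]) < eps:
            # Try to move a non-zero diagonal entry into position i.
            for j in range(i + 1, n):
                if abs(A[j, j]) > eps:
                    A[[i, j], :] = A[[j, i], :]; A[:, [i, j]] = A[:, [j, i]]
                    P[:, [i, j]] = P[:, [j, i]]
                    break
            else:
                # Otherwise create one via e_i := e_i + e_j (uses char F != 2).
                for j in range(i + 1, n):
                    if abs(A[i, j]) > eps:
                        A[i, :] += A[j, :]; A[:, i] += A[:, j]
                        P[:, i] += P[:, j]
                        break
                else:
                    continue  # row/column i is already zero
        # Clear the rest of row and column i using the pivot A[i, i].
        for j in range(i + 1, n):
            c = A[j, i] / A[i, i]
            A[j, :] -= c * A[i, :]
            A[:, j] -= c * A[:, i]
            P[:, j] -= c * P[:, i]
    return P, A

A = np.array([[1., 1., 3.],
              [1., 1., 2.],
              [3., 2., 1.]])
P, D = diagonalize_symmetric(A)
print(np.allclose(P.T @ A @ P, D), np.allclose(D, np.diag(np.diag(D))))  # True True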
We see that the diagonal matrix we get is not unique: we can rescale each basis
vector by a non-zero constant and obtain another diagonal matrix representing the same form.
Theorem. Let φ be a symmetric bilinear form on a finite-dimensional real vector
space V. Then there exists a basis (v_1, · · · , v_n) of V such that φ is represented by
    \begin{pmatrix} I_p & & \\ & -I_q & \\ & & 0 \end{pmatrix}
for some non-negative integers p and q.
Note that we have seen these things in special relativity, where the Minkowski
inner product is given by the symmetric bilinear form represented by
    \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix},
in units where c = 1.
Proof. We’ve already shown that there exists a basis (e1 , · · · , en ) such that
φ(ei , ej ) = λi δij for some λ1 , · · · , λn ∈ R. By reordering, we may assume
    λ_i > 0  for 1 ≤ i ≤ p,
    λ_i < 0  for p + 1 ≤ i ≤ r,
    λ_i = 0  for i > r.
We let µ_i be defined by
    µ_i = \begin{cases} \sqrt{λ_i} & 1 ≤ i ≤ p \\ \sqrt{-λ_i} & p + 1 ≤ i ≤ r \\ 1 & i > r. \end{cases}
Defining
    v_i = \frac{1}{µ_i} e_i,
we find that φ is indeed represented by
    \begin{pmatrix} I_p & & \\ & -I_q & \\ & & 0 \end{pmatrix}.
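In practice, over R, a convenient way to find p and q for a given symmetric matrix is to count the signs of its eigenvalues: a real symmetric matrix is orthogonally diagonalizable, and congruent diagonal forms have the same sign counts (this relies on the uniqueness statement below). A small numpy sketch, with the tolerance handling being my own choice:

import numpy as np

def signature(A, eps=1e-10):
    eigs = np.linalg.eigvalsh(A)       # real eigenvalues of a symmetric matrix
    return int(np.sum(eigs > eps)), int(np.sum(eigs < -eps))   # (p, q)

A = np.array([[1., 1., 3.],
              [1., 1., 2.],
              [3., 2., 1.]])
print(signature(A))  # (2, 1) for this matrix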
We will later show that this form is indeed unique. Before that, we make a few
definitions that really only make sense over R.
Definition (Positive/negative (semi-)definite). Let φ be a symmetric bilinear
form on a finite-dimensional real vector space V. We say
(i) φ is positive definite if φ(v, v) > 0 for all v ≠ 0;
(ii) φ is positive semi-definite if φ(v, v) ≥ 0 for all v ∈ V;
(iii) φ is negative definite if φ(v, v) < 0 for all v ≠ 0;
(iv) φ is negative semi-definite if φ(v, v) ≤ 0 for all v ∈ V.
Aij = φ(vi , wj ).
for 1 ≤ i ≤ n, 1 ≤ j ≤ m.
As usual, this determines the whole sesquilinear form; this follows from the
analogous fact for the corresponding bilinear form V̄ × W → C. Let v = \sum_i λ_i v_i
and w = \sum_j µ_j w_j. Then we have
    φ(v, w) = \sum_{i,j} \bar{λ}_i µ_j φ(v_i, w_j) = λ^† A µ.
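A quick numerical check of this formula, taking the v_i and w_j to be the standard basis vectors of C^3 (all data below are random illustrative choices):

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
lam = rng.standard_normal(3) + 1j * rng.standard_normal(3)
mu = rng.standard_normal(3) + 1j * rng.standard_normal(3)

lhs = lam.conj() @ A @ mu              # lambda^dagger A mu
rhs = sum(lam[i].conj() * mu[j] * A[i, j] for i in range(3) for j in range(3))
print(np.isclose(lhs, rhs))  # True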
If A is Hermitian, then
    φ\Big(\sum_i λ_i e_i, \sum_j µ_j e_j\Big) = λ^† A µ = \overline{µ^† A^† λ} = \overline{µ^† A λ} = \overline{φ\Big(\sum_j µ_j e_j, \sum_i λ_i e_i\Big)}.
So done.
Proposition (Change of basis). Let φ be a Hermitian form on a finite-dimensional
vector space V, and let (e_1, · · · , e_n) and (v_1, · · · , v_n) be bases for V such that
    v_i = \sum_{k=1}^n P_{ki} e_k.
If A represents φ with respect to (e_1, · · · , e_n) and B represents φ with respect to
(v_1, · · · , v_n), then B = P^† A P.
Proof. We have
    B_{ij} = φ(v_i, v_j) = φ\Big(\sum_k P_{ki} e_k, \sum_\ell P_{\ell j} e_\ell\Big) = \sum_{k,\ell=1}^n \bar{P}_{ki} P_{\ell j} A_{k\ell} = (P^† A P)_{ij}.
So
    φ(x, y) = \frac{1}{4}\big(ψ(x + y) − ψ(x − y) + iψ(x − iy) − iψ(x + iy)\big).
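A small numpy check of this identity for a Hermitian form φ(x, y) = x†Ay, with random illustrative data:

import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A = (B + B.conj().T) / 2                    # a random Hermitian matrix
phi = lambda x, y: x.conj() @ A @ y          # conjugate-linear in the first argument
psi = lambda x: phi(x, x).real               # psi(x) = phi(x, x), which is real

x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)
rhs = (psi(x + y) - psi(x - y) + 1j*psi(x - 1j*y) - 1j*psi(x + 1j*y)) / 4
print(np.isclose(phi(x, y), rhs))  # True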
Theorem (Hermitian form of Sylvester's law of inertia). Let V be a finite-
dimensional complex vector space and φ a Hermitian form on V. Then there
exist unique non-negative integers p and q such that φ is represented by
    \begin{pmatrix} I_p & 0 & 0 \\ 0 & -I_q & 0 \\ 0 & 0 & 0 \end{pmatrix}
with respect to some basis.
8 Inner product spaces
We will see that if we have an inner product, then we can define lengths and
distances in a sensible way.
Example.
(i) R^n or C^n with the usual inner product
    (x, y) = \sum_{i=1}^n \bar{x}_i y_i.
(iii) More generally, for any w : [0, 1] → R+ continuous, we can define the inner
product on C([0, 1], F) as
    (f, g) = \int_0^1 w(t) \bar{f}(t) g(t) \,dt.
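Such inner products are easy to approximate numerically; here is a small sketch using the trapezoid rule, where the weight and the (real-valued) functions are arbitrary illustrative choices, so the conjugate can be dropped:

import numpy as np

t = np.linspace(0.0, 1.0, 10001)
w = 1.0 + t                          # a continuous positive weight
f = np.sin(np.pi * t)
g = np.exp(t)
vals = w * f * g
dt = t[1] - t[0]
inner = dt * (vals.sum() - 0.5 * (vals[0] + vals[-1]))   # trapezoid rule for (f, g)
print(inner)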
Definition (Norm). Let V be an inner product space. The norm of v ∈ V is
    ‖v‖ = \sqrt{(v, v)}.
This is just the usual notion of norm on R^n and C^n, and it gives a notion of
length in any inner product space. Note that ‖v‖ ≥ 0, with equality if and only if
v = 0.
Note also that the norm ‖ · ‖ determines the inner product by the polarization
identity.
We want to see that this indeed satisfies the definition of a norm, as you might
have seen from Analysis II. To prove this, we need to prove the Cauchy-Schwarz
inequality.
Theorem (Cauchy-Schwarz inequality). Let V be an inner product space and
v, w ∈ V . Then
    |(v, w)| ≤ ‖v‖ ‖w‖.
Proof. If w = 0, then this is trivial. Otherwise, since the norm is positive
definite, for any λ, we get
    0 ≤ ‖v − λw‖^2 = (v − λw, v − λw) = ‖v‖^2 − λ(v, w) − \bar{λ}(w, v) + |λ|^2 ‖w‖^2.
Setting λ = (w, v)/‖w‖^2, this becomes
    0 ≤ ‖v‖^2 − \frac{|(v, w)|^2}{‖w‖^2},
and rearranging gives |(v, w)|^2 ≤ ‖v‖^2 ‖w‖^2, as required.
Corollary (Triangle inequality). Let V be an inner product space and v, w ∈ V. Then
    ‖v + w‖ ≤ ‖v‖ + ‖w‖.
Proof. We compute
    ‖v + w‖^2 = (v + w, v + w)
              = (v, v) + (v, w) + (w, v) + (w, w)
              ≤ ‖v‖^2 + 2‖v‖‖w‖ + ‖w‖^2
              = (‖v‖ + ‖w‖)^2,
using the Cauchy-Schwarz inequality in the third line. So done.
The next thing we do is to define orthogonality. This generalizes the notion
of being “perpendicular”.
Definition (Orthogonal vectors). Let V be an inner product space. Then
v, w ∈ V are orthogonal if (v, w) = 0.
Definition (Orthonormal set). Let V be an inner product space. A set {vi :
i ∈ I} is an orthonormal set if for any i, j ∈ I, we have
(vi , vj ) = δij
If (v_1, · · · , v_n) is an orthonormal basis of V and v = \sum_{i=1}^n λ_i v_i, then we have
    (v_j, v) = \sum_{i=1}^n λ_i (v_j, v_i) = λ_j.
In particular,
    ‖v‖^2 = \sum_{i=1}^n |(v_i, v)|^2.
Theorem (Gram-Schmidt process). Let V be an inner product space and e_1, e_2, . . .
a linearly independent sequence of vectors. Then there is an orthonormal sequence
v_1, v_2, . . . such that
    ⟨v_1, · · · , v_k⟩ = ⟨e_1, · · · , e_k⟩
for every k.
Note that we are not requiring the set to be finite. We are just requiring it
to be countable.
Proof. We construct it iteratively, and prove this by induction on k. The base
case k = 0 is contentless.
Suppose we have already found v1 , · · · , vk that satisfies the properties. We
define
    u_{k+1} = e_{k+1} − \sum_{i=1}^k (v_i, e_{k+1}) v_i.
We want to prove that this is orthogonal to v_j for all j ≤ k. We have
    (v_j, u_{k+1}) = (v_j, e_{k+1}) − \sum_{i=1}^k (v_i, e_{k+1}) δ_{ij} = (v_j, e_{k+1}) − (v_j, e_{k+1}) = 0.
So it is orthogonal.
We want to argue that u_{k+1} is non-zero: if it were zero, then e_{k+1} would lie in
⟨v_1, · · · , v_k⟩ = ⟨e_1, · · · , e_k⟩, contradicting linear independence. So we can set
v_{k+1} = u_{k+1}/‖u_{k+1}‖, and then (v_1, · · · , v_{k+1}) has the required properties.
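The construction in this proof is exactly the Gram-Schmidt process; here is a small numpy sketch of it (the function name and the final check are my own):

import numpy as np

def gram_schmidt(es):
    # Orthonormalize a list of linearly independent vectors.
    vs = []
    for e in es:
        u = e.astype(complex)
        for v in vs:
            u = u - np.vdot(v, e) * v        # subtract the component (v, e) v
        vs.append(u / np.linalg.norm(u))     # u is non-zero by linear independence
    return vs

rng = np.random.default_rng(3)
es = [rng.standard_normal(4) for _ in range(3)]
vs = gram_schmidt(es)
G = np.array([[np.vdot(v, w) for w in vs] for v in vs])
print(np.allclose(G, np.eye(3)))  # True: the v_i are orthonormal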
Definition (Orthogonal complement). Let V be an inner product space and W ≤ V.
The orthogonal complement of W is
    W^⊥ = {v ∈ V : (v, w) = 0 for all w ∈ W}.
Proposition. Let V be a finite-dimensional inner product space and W ≤ V. Then
V = W ⊕ W^⊥.
Proof. First, W ∩ W^⊥ = 0: if v ∈ W ∩ W^⊥, then (v, v) = 0, so v = 0. Now let
v ∈ V, pick an orthonormal basis (e_1, · · · , e_k) of W (which exists by Gram-Schmidt),
and set w = \sum_{i=1}^k (e_i, v) e_i ∈ W. Then for each j we have
(e_j, v − w) = (e_j, v) − (e_j, v) = 0, and hence (u, v − w) = 0 for every u ∈ W.
So we have v − w ∈ W^⊥, and v = w + (v − w) ∈ W + W^⊥. So done.
Definition. If V_1 and V_2 are inner product spaces, we define an inner product on
the direct sum V_1 ⊕ V_2 by
    (v_1 + v_2, w_1 + w_2) = (v_1, w_1)_{V_1} + (v_2, w_2)_{V_2},
with v_1, w_1 ∈ V_1 and v_2, w_2 ∈ V_2.
Here we write v1 + v2 ∈ V1 ⊕ V2 instead of (v1 , v2 ) to avoid confusion.
This external direct sum is equivalent to the internal direct sum of {(v1 , 0) :
v1 ∈ V1 } and {(0, v2 ) : v2 ∈ V2 }.
Proposition. Let V be a finite-dimensional inner product space and W ≤ V.
Let (e_1, · · · , e_k) be an orthonormal basis of W. Let π be the orthogonal
projection of V onto W, i.e. π : V → W is the function that satisfies ker π = W^⊥
and π|_W = id. Then
(i) π is given by the formula
    π(v) = \sum_{i=1}^k (e_i, v) e_i.
(ii) for all v ∈ V and w ∈ W, we have
    ‖v − π(v)‖ ≤ ‖v − w‖,
with equality if and only if π(v) = w. This says π(v) is the point in W
that is closest to v.
Proof.
(i) Let v ∈ V, and define
    w = \sum_{i=1}^k (e_i, v) e_i.
Then w ∈ W, and for each j ≤ k we have (e_j, v − w) = (e_j, v) − (e_j, v) = 0.
So v − w ∈ W^⊥ = ker π, and hence π(v) = π(w) = w, as required.
(ii) This is just Pythagoras’ theorem. Note that if x and y are orthogonal,
then
    ‖x + y‖^2 = (x + y, x + y)
              = (x, x) + (x, y) + (y, x) + (y, y)
              = ‖x‖^2 + ‖y‖^2.
We apply this to our projection. For any w ∈ W, we have
    ‖v − w‖^2 = ‖v − π(v)‖^2 + ‖π(v) − w‖^2 ≥ ‖v − π(v)‖^2,
with equality if and only if ‖π(v) − w‖ = 0, i.e. π(v) = w.
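A small numpy sketch illustrating both parts; the subspace and vectors are random illustrative choices, and numpy's reduced QR factorization is used only to produce an orthonormal basis of W:

import numpy as np

rng = np.random.default_rng(4)
E, _ = np.linalg.qr(rng.standard_normal((5, 2)))   # columns: an orthonormal basis of W

def proj(v):
    return E @ (E.T @ v)                            # pi(v) = sum_i (e_i, v) e_i

v = rng.standard_normal(5)
w = E @ rng.standard_normal(2)                      # an arbitrary element of W
print(np.linalg.norm(v - proj(v)) <= np.linalg.norm(v - w))  # True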
So we get
    α^*(w_j) = \sum_i (v_i, α^*(w_j)) v_i = \sum_i \bar{A}_{ji} v_i.
So done.
What does this mean, conceptually? Note that the inner product on V defines
an isomorphism V → V̄* by v ↦ ( · , v). Similarly, we have an isomorphism
W → W̄*. We can then put them in the following diagram:
            α
    V ------------> W
    |                |
    ≅                ≅
    |                |
    V̄*  - - - - - -  W̄*
            α*
Then α∗ is what fills in the dashed arrow. So α∗ is in some sense the “dual” of
the map α.
Definition (Adjoint). We call the map α∗ the adjoint of α.
We have just seen that if α is represented by A with respect to some or-
thonormal bases, then α∗ is represented by A† .
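A quick numerical illustration with respect to the standard (orthonormal) basis of C^3, where the matrix and vectors are arbitrary: multiplying by A and multiplying by A† are indeed adjoint to each other.

import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)
w = rng.standard_normal(3) + 1j * rng.standard_normal(3)

lhs = np.vdot(A @ v, w)                 # (alpha v, w)
rhs = np.vdot(v, A.conj().T @ w)        # (v, alpha* w)
print(np.isclose(lhs, rhs))  # True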
Definition (Self-adjoint). Let V be an inner product space, and α ∈ End(V ).
Then α is self-adjoint if α = α^*, i.e.
    (α(v), w) = (v, α(w))
for all v, w ∈ V.
Thus if V = Rn with the usual inner product, then A ∈ Matn (R) is self-
adjoint if and only if it is symmetric, i.e. A = AT . If V = Cn with the usual
inner product, then A ∈ Matn (C) is self-adjoint if and only if A is Hermitian,
i.e. A = A† .
Self-adjoint endomorphisms are important, as you may have noticed from IB
Quantum Mechanics. We will later see that these have real eigenvalues with an
orthonormal basis of eigenvectors.
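This can already be seen numerically: numpy's eigh routine, which is designed for Hermitian (and real symmetric) matrices, returns real eigenvalues together with an orthonormal set of eigenvectors. A small sketch with a random Hermitian matrix:

import numpy as np

rng = np.random.default_rng(6)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = (B + B.conj().T) / 2                     # a random Hermitian matrix

eigvals, Q = np.linalg.eigh(A)
print(np.allclose(eigvals.imag, 0))                     # the eigenvalues are real
print(np.allclose(Q.conj().T @ Q, np.eye(4)))           # the eigenvectors are orthonormal
print(np.allclose(A @ Q, Q @ np.diag(eigvals)))         # and they are indeed eigenvectors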
Orthogonal maps
Another important class of endomorphisms is those that preserve lengths. We
will first do this for real vector spaces, since the real and complex versions have
different names.
Definition (Orthogonal endomorphism). Let V be a real inner product space.
Then α ∈ End(V) is orthogonal if
    (α(v), α(w)) = (v, w)
for all v, w ∈ V.
So we know
    α^* α(v_j) = \sum_{i=1}^n (v_i, α^* α(v_j)) v_i = v_j.
Since α^* α = id, it follows that α is invertible with inverse α^*, and it is clear
from the definition that O(V) is closed under composition and inverses. So O(V) is
indeed a group.
This is analogous to our result for general vector spaces and general bases,
where we replace O(V ) with GL(V ).
Proof. Same as the case for general vector spaces and general bases.
Unitary maps
We are going to study the complex version of orthogonal maps, known as unitary
maps. The proofs are almost always identical to the real case, and we will not
write the proofs again.
Definition (Unitary map). Let V be a finite-dimensional complex inner product space.
Then α ∈ End(V) is unitary if
    (α(v), α(w)) = (v, w)
for all v, w ∈ V.
By the polarization identity, α is unitary if and only if ‖α(v)‖ = ‖v‖ for all
v ∈ V.
Definition (Unitary group). The unitary group of V is
    U(V) = {α ∈ End(V) : α is unitary}.
As in the orthogonal case, fixing an orthonormal basis (e_1, · · · , e_n) of V gives a
bijection
    U(V) → {orthonormal bases of V}
    α ↦ (α(e_1), · · · , α(e_n)).
Theorem. Let V be a finite-dimensional inner product space (over R or C) and
α ∈ End(V) self-adjoint. Then
(i) α has a real eigenvalue, and all eigenvalues of α are real;
(ii) eigenvectors of α with distinct eigenvalues are orthogonal.
Proof.
(i) Suppose first V is a complex inner product space. Then by the fundamental
theorem of algebra, α has an eigenvalue, say λ. We pick v ∈ V \ {0} such
that αv = λv. Then
    λ(v, v) = (v, λv) = (v, α(v)) = (α(v), v) = (λv, v) = \bar{λ}(v, v).
Since (v, v) ≠ 0, we get λ = \bar{λ}, so λ ∈ R. In particular, all eigenvalues are real.
Now suppose instead that V is a real inner product space. Let M_α be the minimal
polynomial of α, and suppose for a contradiction that it has an irreducible quadratic
factor f over R. Then (M_α/f)(α) ≠ 0,
since it has degree less than the minimal polynomial. So there is some
v ∈ V such that
    (M_α/f)(α)(v) ≠ 0.
Replacing v by (M_α/f)(α)(v), we may assume v ≠ 0 and f(α)(v) = 0, since
f · (M_α/f) = M_α and M_α(α) = 0. Let U = ⟨v, α(v)⟩. Then this is an
α-invariant subspace of V since f has degree 2.
Now α|_U ∈ End(U) is self-adjoint. So if (e_1, e_2) is an orthonormal basis of
U, then α|_U is represented by a real symmetric matrix, say
    \begin{pmatrix} a & b \\ b & c \end{pmatrix}.
But then χ_{α|_U}(t) = (t − a)(t − c) − b^2 has discriminant (a − c)^2 + 4b^2 ≥ 0, and
hence has real roots. This is a contradiction: M_{α|_U} = f (it divides the irreducible
f and is not constant), and any real root of χ_{α|_U} is a root of M_{α|_U} = f, but f
has no real roots. So M_α is a product of linear factors, and in particular α has a
real eigenvalue.
(ii) Now suppose αv = λv, αw = µw and λ ≠ µ. We need to show (v, w) = 0.
We know
    (α(v), w) = (v, α(w))
by definition. Since λ is real by part (i), this gives
    λ(v, w) = (λv, w) = (α(v), w) = (v, α(w)) = µ(v, w).
Since λ ≠ µ, we must have (v, w) = 0.
    V = ⟨v⟩ ⊥ U.
    V = W ⊥ ⟨v⟩.
This theorem and the analogous one for self-adjoint endomorphisms have a
common generalization, at least for complex inner product spaces. The key fact
that leads to the existence of an orthonormal basis of eigenvectors is that α and α∗
commute. This is clearly a necessary condition: if α is diagonal with respect to some
orthonormal basis, then α^* is also diagonal with respect to that basis (its matrix is
the conjugate transpose), and diagonal matrices commute. It turns out this is also a
sufficient condition, as you will show on example sheet 4.
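The necessary direction is easy to see numerically: if we build α to be diagonal with respect to some orthonormal basis, then α and α* commute. A small numpy sketch, where the construction is an arbitrary illustrative choice:

import numpy as np

rng = np.random.default_rng(7)
# alpha = U D U^dagger with U unitary and D diagonal (complex entries allowed).
U, _ = np.linalg.qr(rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4)))
D = np.diag(rng.standard_normal(4) + 1j * rng.standard_normal(4))
A = U @ D @ U.conj().T
print(np.allclose(A @ A.conj().T, A.conj().T @ A))  # True: alpha and alpha* commute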
However, we cannot generalize this to the real orthogonal case. For example,
    \begin{pmatrix} \cos θ & \sin θ \\ -\sin θ & \cos θ \end{pmatrix} ∈ O(R^2)
cannot be diagonalized over R (if θ ∉ πZ). However, on example sheet 4 you will find a
classification of O(V), and you will see that the above counterexample is, in some
sense, the worst that can happen.