DONALDSON-Linear Algebra
Neil Donaldson
Fall 2022
Linear Algebra, Stephen Friedberg, Arnold Insel & Lawrence Spence, 4th Ed 2003, Prentice Hall.
Vector Spaces Bold-face v denotes a vector in a vector space V over a field F. A vector space is closed
under vector addition and scalar multiplication
∀v1 , v2 ∈ V, ∀λ1 , λ2 ∈ F, λ1 v1 + λ2 v2 ∈ V
Examples. Here are four (families of) vector spaces over the field R.
Linear Combinations and Spans Let β ⊆ V be a subset of a vector space V over F. A linear
combination of vectors in β is any finite sum
λ1 v1 + · · · + λ n v n
where λ j ∈ F and v j ∈ β. The span of β comprises all linear combinations: this is a subspace of V.
λ1 v1 + · · · + λn vn = 0 =⇒ ∀ j, λ j = 0
Example. P2 (R) has standard basis β = {1, x, x2 }: every degree ≤ 2 polynomial can be written uniquely as
a linear combination p( x ) = a + bx + cx2 and so dim P2 (R) = 3. The real numbers a, b, c are the
co-ordinates of p with respect to β; the co-ordinate vector of p is written
[ p]_β = \begin{pmatrix} a \\ b \\ c \end{pmatrix}
Linearity and Linear Maps A function T : V → W between vector spaces V, W over the same field
F is (F-)linear if it respects the linearity properties of V, W
∀v1 , v2 ∈ V, ∀λ1 , λ2 ∈ F, T( λ1 v1 + λ2 v2 ) = λ1 T( v1 ) + λ2 T( v2 )
We write L(V, W ) for the set (indeed vector space!) of linear maps from V to W: this is shortened to
L(V ) if V = W. An isomorphism is an invertible/bijective linear map.
Theorem. If dimF V = n and β is a basis of V, then the co-ordinate map v 7→ [v] β is an isomorphism
of vector spaces V → Fn .
Matrices and Linear Maps If V, W are finite-dimensional, then any linear map T : V → W can be
described using matrix multiplication.
Example. If A = \begin{pmatrix} 2 & -1 \\ 0 & -1 \\ -4 & 3 \end{pmatrix}, then the linear map L_A : R² → R³ (left-multiplication by A) is

L_A\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 2 & -1 \\ 0 & -1 \\ -4 & 3 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 2x - y \\ -y \\ 3y - 4x \end{pmatrix}
The linear map in fact defines the matrix A; we recover the columns of the matrix by feeding the
standard basis vectors to the linear map.
\begin{pmatrix} 2 \\ 0 \\ -4 \end{pmatrix} = L_A\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \begin{pmatrix} -1 \\ -1 \\ 3 \end{pmatrix} = L_A\begin{pmatrix} 0 \\ 1 \end{pmatrix}
whose jth column is obtained by feeding the jth basis vector of β to T and taking its co-ordinate vector
with respect to γ. This fits naturally with the co-ordinate isomorphisms
T(v) = w ⟺ [T]^γ_β [v]_β = [w]_γ
There are two special cases when V = W:
• If β = γ, then we simply write [T]_β instead of [T]^β_β.
• If T = I is the identity map, then Q^γ_β := [I]^γ_β is the change of co-ordinate matrix from β to γ.
Being able to convert linear maps into matrix multiplication is a central skill in linear algebra. Test
your comfort by working through the following; if everything feels familiar, you should consider
yourself in a good place as far as pre-requisites are concerned!
The standard bases of P2 (R) and P1 (R) are, respectively, β = {1, x, x2 } and γ = {1, x }. Observe that
[T(1)]_γ = [0]_γ = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \quad [T( x )]_γ = [1]_γ = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad [T( x^2 )]_γ = [2x ]_γ = \begin{pmatrix} 0 \\ 2 \end{pmatrix}

⟹ [T]^γ_β = \begin{pmatrix} [T(1)]_γ & [T(x)]_γ & [T(x^2)]_γ \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}

[T( a + bx + cx^2 )]_γ = [T]^γ_β [ a + bx + cx^2 ]_β = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}\begin{pmatrix} a \\ b \\ c \end{pmatrix} = \begin{pmatrix} b \\ 2c \end{pmatrix} = [b + 2cx ]_γ \qquad (†)
[T]^γ_η = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 2 & 2 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 1 & 0 \\ 0 & 2 & 2 \end{pmatrix}\begin{pmatrix} a \\ b \\ c \end{pmatrix} = \begin{pmatrix} a + b \\ 2b + 2c \end{pmatrix}

corresponds to an equation [T( p)]_γ = [T]^γ_η [ p]_η for some polynomial p( x ); what is p( x ) in terms
of a, b, c?
β
3. Find the change of co-ordinate matrix Q^β_η and check that the matrices of T are related by

[T]^γ_η = [T]^γ_β Q^β_η
1 Diagonalizability & the Cayley–Hamilton Theorem
1.1 Eigenvalues, Eigenvectors & Diagonalization (Review)
Definition 1.1. Suppose V is a vector space over F and T ∈ L(V ). A non-zero v ∈ V is an eigenvector
of T with eigenvalue λ ∈ F (together an eigenpair) if
T(v) = λv
We start by recalling a couple of basic facts, the first of which is easily proved by induction.
Thankfully there is a systematic way to find eigenvalues and eigenvectors in finite dimensions:
1. Choose any basis ϵ of V and compute the matrix A = [T]ϵ ∈ Mn (F).
2. Observe that
λ ∈ F is an eigenvalue ⇐⇒ ∃[v]ϵ ∈ Fn \ {0} such that A[v]ϵ = λ[v]ϵ
⇐⇒ ∃[v]ϵ ∈ Fn \ {0} such that ( A − λI ) [v]ϵ = 0
⇐⇒ det ( A − λI ) = 0
This last is a degree n polynomial equation whose roots are the eigenvalues.
3. For each eigenvalue λ j , compute the eigenspace Eλ j = N (T − λ j I) to find the eigenvectors.
Remember that Eλ j is a subspace of the original vector space V, so translate back if necessary!
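This procedure is easy to sanity-check on a computer. Below is a minimal sketch using Python/numpy (not part of the original notes; the matrix A is just an illustrative choice, and happens to be the matrix appearing in an example below):

```python
import numpy as np

# Illustrative matrix; any square matrix over R or C works here
A = np.array([[1., -1.,  0.],
              [0.,  2., -2.],
              [0.,  0.,  3.]])

# Steps 2/3: eigenvalues are the roots of det(A - t I); numpy also returns
# (columns of) eigenvectors spanning each eigenspace
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)            # eigenvalues 1, 2, 3 (possibly in another order)

# Check the eigenpair condition A v = lambda v for each column
for lam, v in zip(eigenvalues, eigenvectors.T):
    assert np.allclose(A @ v, lam * v)
```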
Definition 1.3. The characteristic polynomial of T ∈ L(V ) is the degree-n polynomial
The eigenvalues of T are precisely the solutions to the characteristic equation p(t) = 0.
( A − iI)v = \begin{pmatrix} -i & -1 \\ 1 & -i \end{pmatrix}v ⟹ E_i = Span\left\{\begin{pmatrix} i \\ 1 \end{pmatrix}\right\}

and similarly E_{−i} = Span\left\{\begin{pmatrix} -i \\ 1 \end{pmatrix}\right\}. We therefore have an eigenbasis β = \left\{\begin{pmatrix} i \\ 1 \end{pmatrix}, \begin{pmatrix} -i \\ 1 \end{pmatrix}\right\} (of C²), with respect to which

[L_A]_β = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}
T( f )( x ) = f ( x ) + ( x − 1) f ′ ( x )
With respect to the standard basis ϵ = {1, x, x2 }, we have the non-diagonal matrix

A = [T]_ϵ = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 2 & -2 \\ 0 & 0 & 3 \end{pmatrix}
With three distinct eigenvalues, T is diagonalizable. To find the eigenvectors, compute the
nullspaces:
λ_1 = 1: 0 = ( A − λ_1 I )[v_1]_ϵ = \begin{pmatrix} 0 & -1 & 0 \\ 0 & 1 & -2 \\ 0 & 0 & 2 \end{pmatrix}[v_1]_ϵ ⟹ [v_1]_ϵ ∈ Span{(1, 0, 0)ᵀ} ⟹ E_1 = Span{1}

λ_2 = 2: A − λ_2 I = \begin{pmatrix} -1 & -1 & 0 \\ 0 & 0 & -2 \\ 0 & 0 & 1 \end{pmatrix} ⟹ [v_2]_ϵ ∈ Span{(1, −1, 0)ᵀ} ⟹ E_2 = Span{1 − x}

λ_3 = 3: A − λ_3 I = \begin{pmatrix} -2 & -1 & 0 \\ 0 & -1 & -2 \\ 0 & 0 & 0 \end{pmatrix} ⟹ [v_3]_ϵ ∈ Span{(1, −2, 1)ᵀ} ⟹ E_3 = Span{1 − 2x + x²}

With respect to the eigenbasis β = {1, 1 − x, 1 − 2x + x²},

[T]_β = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}
Conditions for diagonalizability of finite-dimensional operators
We now borrow a little terminology from the theory of polynomials.
1. Let λ ∈ F be a root; p(λ) = 0. The algebraic multiplicity mult(λ) is the largest power of λ − t to
divide p(t). Otherwise said, there exists¹ some polynomial q(t) such that

p(t) = (λ − t)^{mult(λ)} q(t), \qquad q(λ) ≠ 0
2. We say that p(t) splits over F if it factorizes completely into linear factors; equivalently
∃ a, λ1 , . . . , λk ∈ F such that
p ( t ) = a ( λ 1 − t ) m1 · · · ( λ k − t ) m k
When p(t) splits, the algebraic multiplicities sum to the degree n of the polynomial
n = m1 + · · · + m k
Of course, we are most interested when p(t) is the characteristic polynomial of a linear map T ∈ L(V ).
If such a polynomial splits, then a = 1 and λ1 , . . . , λk are necessarily the (distinct) eigenvalues of T.
Example 1.6. The field matters! For instance p(t) = t2 + 1 = (t − i )(t + i ) = −(i − t)(−i − t) splits
over C but not over R. Its roots are plainly ±i.
For the purposes of review, we state the main result; this will be proved in the next section.
Theorem 1.7. Let V be finite-dimensional. A linear map T ∈ L(V ) is diagonalizable if and only if,
1. The characteristic polynomial of T splits over F, and
2. The geometric and algebraic multiplicities of each eigenvalue are equal; dim Eλ j = mult(λ j ).
Example 1.8. The matrix A = \begin{pmatrix} 3 & 1 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 5 \end{pmatrix} is easily seen to have eigenvalues λ_1 = 3 and λ_2 = 5. Indeed

p(t) = (3 − t)²(5 − t), \qquad mult(3) = 2, \quad mult(5) = 1

E_3 = Span\left\{\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}\right\}, \quad E_5 = Span\left\{\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}\right\}, \qquad dim E_3 = dim E_5 = 1

Since dim E_3 = 1 < 2 = mult(3), the matrix A is not diagonalizable.
Everything prior to this should be review. If it feels very unfamiliar, revisit your notes from 121A,
particularly sections 5.1 and 5.2 of the textbook.
¹The existence follows from Descartes' factor theorem and the division algorithm for polynomials.
Exercises 1.1 1. For each matrix over R; find its characteristic polynomial, its eigenvalues/spaces,
and its algebraic and geometric multiplicities; decide if it is diagonalizable.
(a) A = \begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 3 & 1 & 0 \\ 0 & 0 & 3 & 1 \\ 0 & 0 & 0 & 3 \end{pmatrix} \qquad (b) B = \begin{pmatrix} -1 & 6 & 0 & 0 \\ -2 & 6 & 0 & 0 \\ 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 3 \end{pmatrix}
2. Suppose A is a real matrix with eigenpair (λ, v). If λ ∉ R, show that (λ̄, v̄) is also an eigenpair.
3. Show that the characteristic polynomial of A = \begin{pmatrix} 3 & -4 \\ 4 & 3 \end{pmatrix} does not split over R. Diagonalize A
over C.
4. Give an example of a 2 × 2 matrix whose entries are rational numbers and whose characteristic
polynomial splits over R, but not over Q.
7. Suppose T ∈ L(V ) is invertible with eigenvalue λ. Prove that λ−1 is an eigenvalue of T−1 with
the same eigenspace Eλ . If T is diagonalizable, prove that T−1 is also diagonalizable.
8. If V is finite-dimensional and T ∈ L(V ), we may define det T to equal det[T] β , where β is any
basis of V. Explain why the choice of basis does not matter; that is, if γ is any other basis of V,
we have det[T]γ = det[T] β .
1.2 Invariant Subspaces and the Cayley–Hamilton Theorem
The proof of Theorem 1.7 is facilitated by a new concept, of which eigenspaces are a special case.
TW : W → W : w 7→ T(w)
Examples 1.10. 1. The trivial subspace {0} and the entire vector space V are invariant for any
linear map T ∈ L(V ).
W is an example of a generalized eigenspace; we’ll study these properly at the end of term.
To prove our diagonalization criterion, we need to see how to factorize the characteristic polynomial.
It turns out that factors of p(t) correspond to T-invariant subspaces!
Example 1.11. W = Span{i, j} is an invariant subspace of A = \begin{pmatrix} 1 & 2 & 4 \\ 0 & 3 & 1 \\ 0 & 0 & 2 \end{pmatrix} ∈ M_3(R). With respect to
the standard basis, the restriction [L_A]_W has matrix \begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix}. The characteristic polynomial p_W(t) = (1 − t)(3 − t) of the
restriction divides p(t) = (1 − t)(3 − t)(2 − t).
Theorem 1.12. Suppose T ∈ L(V ), that dim V is finite and that W is a T-invariant subspace of V.
Then the characteristic polynomial of the restriction TW divides that of T.
Proof. Extend a basis βW of W to a basis β of V. Since T(w) ∈ Span βW for each w ∈ W, we see that
the matrix [T]_β has block form

[T]_β = \begin{pmatrix} A & B \\ O & C \end{pmatrix} ⟹ p(t) = det( A − tI ) det(C − tI ) = p_W(t) det(C − tI )
where pW (t) is the characteristic polynomial, and A = [TW ] βW the matrix of the restriction TW .
Corollary 1.13. If λ is an eigenvalue of T, then T_{E_λ} = λI_{E_λ} is a multiple of the identity, whence p_{E_λ}(t) = (λ − t)^{dim E_λ} divides p(t) and dim E_λ ≤ mult(λ).
We are now in a position to state and prove an extended version of Theorem 1.7.
Theorem 1.14. Suppose dimF V = n and that T ∈ L(V ) has distinct eigenvalues λ1 , . . . , λk . The
following are equivalent:
1. T is diagonalizable.
2. The characteristic polynomial splits over F and dim Eλ j = mult(λ j ) for each j; indeed
p(t) = p_{λ_1}(t) ⋯ p_{λ_k}(t) = (λ_1 − t)^{dim E_{λ_1}} ⋯ (λ_k − t)^{dim E_{λ_k}}

3. \sum_{j=1}^{k} dim E_{λ_j} = n
4. V = Eλ1 ⊕ · · · ⊕ Eλk
Example 1.15. A = \begin{pmatrix} 7 & 0 & -12 \\ 0 & 1 & 0 \\ 2 & 0 & -3 \end{pmatrix} is diagonalizable. Indeed p(t) = (1 − t)²(3 − t) splits, and we have

λ = 1: mult(1) = 2, E_1 = Span\left\{\begin{pmatrix} 2 \\ 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}\right\}, dim E_1 = 2
λ = 3: mult(3) = 1, E_3 = Span\left\{\begin{pmatrix} 3 \\ 0 \\ 1 \end{pmatrix}\right\}, dim E_3 = 1

so that R³ = E_1 ⊕ E_3. With respect to the eigenbasis β = \left\{\begin{pmatrix} 2 \\ 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 3 \\ 0 \\ 1 \end{pmatrix}\right\}, the map is diagonal: [L_A]_β = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 3 \end{pmatrix}
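A quick numerical check of Example 1.15; a minimal sketch assuming numpy is available:

```python
import numpy as np

A = np.array([[7., 0., -12.],
              [0., 1.,   0.],
              [2., 0.,  -3.]])

# Columns of P are the eigenbasis beta from the example
P = np.array([[2., 0., 3.],
              [0., 1., 0.],
              [1., 0., 1.]])
D = np.diag([1., 1., 3.])

# [L_A]_beta = P^{-1} A P should be the diagonal matrix D
assert np.allclose(np.linalg.inv(P) @ A @ P, D)
```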
p(t) = (λ_1 − t)^{m_1} ⋯ (λ_k − t)^{m_k}

splits and ∑ mult(λ_i) = n. The cardinality n of an eigenbasis is at most ∑ dim E_{λ_i} since every
element is an (independent) eigenvector. By Corollary 1.13 (dim E_{λ_j} ≤ mult(λ_j)) we see that

n ≤ ∑ dim E_{λ_j} ≤ ∑ mult(λ_j) = n

whence the inequalities are equalities with each pair equal: dim E_{λ_j} = mult(λ_j).
(3 ⇒ 4) Assume Eλ1 ⊕ · · · ⊕ Eλ j exists.2 If (λ j+1 , v j+1 ) is an eigenpair, then v j+1 ̸∈ Eλ1 ⊕ · · · ⊕ Eλ j for
otherwise this would contradict Lemma 1.2.
By induction, Eλ1 ⊕ · · · ⊕ Eλk exists; by assumption it has dimension n = dim V and therefore
equals V.
T-cyclic Subspaces and the Cayley–Hamilton Theorem
We finish this chapter by introducing a general family of invariant subspaces and using them to prove
a startling result.
Definition 1.16. Let T ∈ L(V ) and let v ∈ V. The T-cyclic subspace generated by v is the span ⟨v⟩ := Span{v, T(v), T²(v), . . .}.
all of which lie in Span{i, k}. Plainly this is the L A -cyclic subspace ⟨i + k⟩.
We were lucky in the example that the general form Am v was so clear. It is helpful to develop a more
precise test for identifying the dimension and a basis of a T-cyclic subspace.
Suppose a T-cyclic subspace ⟨v⟩ = Span{v, T(v), T²(v), . . .} has finite dimension.³ Let k ≥ 1 be
maximal such that the set {v, T(v), . . . , T^{k−1}(v)} is linearly independent.
• If k doesn’t exist, the infinite linearly independent set {v, T(v), . . .} contradicts dim ⟨v⟩ < ∞.
• By the maximality of k, T^k(v) ∈ Span{v, T(v), . . . , T^{k−1}(v)}; by induction this extends to T^m(v) ∈ Span{v, T(v), . . . , T^{k−1}(v)} for every m ≥ k.
It follows that ⟨v⟩ = Span{v, T(v), . . . , Tk−1 (v)}, and we’ve proved a useful criterion.
3 Necessarily the situation if dim V < ∞, when we are thinking about characteristic polynomials.
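In co-ordinates, dim⟨v⟩ is just the rank of the matrix whose columns are v, T(v), T²(v), . . .. A minimal numpy sketch (the matrix and vector here are illustrative choices, not taken from the notes):

```python
import numpy as np

A = np.array([[1., 2., 4.],
              [0., 3., 1.],
              [0., 0., 2.]])
v = np.array([1., 0., 0.])

# Stack v, Av, A^2 v as columns; the rank is dim <v>
krylov = np.column_stack([np.linalg.matrix_power(A, m) @ v for m in range(3)])
print(np.linalg.matrix_rank(krylov))   # dimension of the T-cyclic subspace <v>
```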
Examples 1.20. 1. According to the Theorem, in Example 1.17 we need only have noticed
2. Let T( p( x )) = 3p( x ) − p′′ ( x ) viewed as a linear map T ∈ L( P2 (R)) and consider the T-cyclic
subspace generated by the polynomial p( x ) = x2
We finish by considering the interaction of a T-cyclic subspace with the characteristic polynomial.
Surprisingly, the coefficients of the characteristic polynomial and the linear combination coincide.
[T_W]_{β_W} = \begin{pmatrix} 0 & -9 \\ 1 & 6 \end{pmatrix} ⟹ p_W(t) = t² − 6t + 9
Theorem 1.21. Let T ∈ L(V ) and suppose W = ⟨w⟩ has dim W = k with basis
With a little sneakiness, we can drop the W’s in the second part of the Theorem and observe an
intimate relation between a linear map and its characteristic polynomial.
Proof. Let w ∈ V and consider the cyclic subspace W = ⟨w⟩ generated by w. By Theorem 1.12,
p ( t ) = qW ( t ) pW ( t )
for some polynomial qW . But the previous result says that pW (T)(w) = 0, whence
p(T)(w) = 0
Since we may apply this reasoning to any w ∈ V, we conclude that p(T) is the zero function.
A² − 6A = \begin{pmatrix} 7 & 6 \\ 18 & 19 \end{pmatrix} − 6\begin{pmatrix} 2 & 1 \\ 3 & 4 \end{pmatrix} = −5I
It may seem like a strange thing to do for this matrix, but the characteristic equation can be
used to calculate the inverse of A:
A² − 6A + 5I = 0 ⟹ A( A − 6I ) = −5I ⟹ A^{-1} = \frac{1}{5}(6I − A) = \frac{1}{5}\begin{pmatrix} 4 & -1 \\ -3 & 2 \end{pmatrix}
2. The matrix

A = \begin{pmatrix} 2 & -1 & 8 \\ 0 & 1 & -6 \\ 0 & 0 & 2 \end{pmatrix}

has characteristic polynomial p(t) = (2 − t)²(1 − t), so that A³ = 5A² − 8A + 4I. By Cayley–Hamilton,

A⁴ = AA³ = A(5A² − 8A + 4I)
   = 5A³ − 8A² + 4A = 5(5A² − 8A + 4I) − 8A² + 4A
   = 17A² − 36A + 20I = 17\begin{pmatrix} 4 & -3 & 38 \\ 0 & 1 & -18 \\ 0 & 0 & 4 \end{pmatrix} − 36\begin{pmatrix} 2 & -1 & 8 \\ 0 & 1 & -6 \\ 0 & 0 & 2 \end{pmatrix} + 20\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
   = \begin{pmatrix} 16 & -15 & 358 \\ 0 & 1 & -90 \\ 0 & 0 & 16 \end{pmatrix}
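Both manipulations are easy to verify numerically; a sketch with numpy using the 2 × 2 matrix A = (2 1; 3 4) from the first example, whose characteristic polynomial is t² − 6t + 5:

```python
import numpy as np

A = np.array([[2., 1.],
              [3., 4.]])
I = np.eye(2)

# Cayley-Hamilton: A satisfies its own characteristic equation
assert np.allclose(A @ A - 6*A + 5*I, 0)

# Rearranging gives the inverse: A^{-1} = (6I - A)/5
A_inv = (6*I - A) / 5
assert np.allclose(A @ A_inv, I)
```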
3. Recall Example 1.4.2, where the linear map T( f ( x )) = f ( x ) + ( x − 1) f ′ ( x ) had characteristic polynomial p(t) = (1 − t)(2 − t)(3 − t).
By Cayley–Hamilton, T3 = 6T2 − 11T + 6I. You can check this explicitly, after first computing
T2 ( f ( x )) = f ( x ) + 3( x − 1) f ′ ( x ) + ( x − 1)2 f ′′ ( x ), etc.
Cayley–Hamilton can also be used to simplify higher powers of T and even to compute the
inverse!
I = \frac{1}{6}(T³ − 6T² + 11T) ⟹ T^{-1} = \frac{1}{6}(T² − 6T + 11I)

⟹ T^{-1}( f ( x )) = f ( x ) − \frac{1}{2}( x − 1) f ′ ( x ) + \frac{1}{6}( x − 1)² f ″ ( x )
Exercises 1.2 1. For the linear map T = L_A : R³ → R³ where A = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 2 & 4 \\ 0 & 0 & 2 \end{pmatrix}, find the T-cyclic
subspace generated by the standard basis vector e_3 = (0, 0, 1)ᵀ.
0
2. Let T = L_A, where A = \begin{pmatrix} 1 & 2 & 4 \\ 0 & 3 & 1 \\ 0 & 0 & 2 \end{pmatrix}, and let v = \begin{pmatrix} 1 \\ 1 \\ -1 \end{pmatrix}. Compute T(v) and T²(v). Hence describe
the T-cyclic subspace ⟨v⟩ and its dimension.
3. Given A = \begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 3 & 1 & 0 \\ 0 & 0 & 3 & 1 \\ 0 & 0 & 0 & 3 \end{pmatrix}, find two distinct L_A-invariant subspaces W ≤ R⁴ such that dim W = 3.
4. Suppose that W and X are T-invariant subspaces of V. Prove that the sum
W + X = {w + x : w ∈ W, x ∈ X }
is also T-invariant.
6. Give an example of an infinite-dimensional vector space V, a linear map T ∈ L(V ), and a vector
v such that ⟨v⟩ = V.
7. Let β = {sin x, cos x, 2x sin x, 3x cos x} and T = \frac{d}{dx} ∈ L(Span β). Plainly the subspace W :=
Span{sin x, cos x} is T-invariant. Compute the matrices [T]_β and [T_W]_{β_W} and observe that p(t) = p_W(t)².
9. Check the details of Example 1.23.3 and evaluate T4 as a linear combination of I, T and T2 . In
particular, check the evaluation of T−1 ( f ( x )).
10. Suppose a, b are constants with a ≠ 0 and define T ∈ L(P_2(R)) by T( f ( x )) = a f ( x ) + b f ′ ( x ).
(a) Find the characteristic polynomial of T and identify its eigenspaces. Is T diagonalizable?
(b) Find scalars x, y, z ∈ R such that T³ = xT² + yT + zI.
(c) What are dim L( P2 (R)) and dim Span{Tk : k ∈ N0 }? Explain.
12. If A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} has non-zero determinant, use the Cayley–Hamilton Theorem to obtain the usual formula for the inverse A^{-1}.
14. (a) Consider Example 1.20.2 where T ∈ L( P2 (R)) is defined by T( p( x )) = 3p( x ) − p′′ ( x ).
Prove that all T-cyclic subspaces have dimension ≤ 2.
(b) What if we instead consider S ∈ L( P2 (R)) defined by S( p( x )) = 3p( x ) − p′ ( x )?
[T_W]_{β_W} = \begin{pmatrix} 0 & 0 & \cdots & 0 & -a_0 \\ 1 & 0 & \cdots & 0 & -a_1 \\ 0 & 1 & \cdots & 0 & -a_2 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & -a_{k-1} \end{pmatrix} ∈ M_k(F)

(b) Compute the characteristic polynomial p_W(t) = det([T_W]_{β_W} − tI_k) by expanding the determinant.
2 Inner Product Spaces, part 1
You should be familiar with the scalar/dot product in R². For any vectors x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, y = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}, define

x · y := x_1y_1 + x_2y_2

The norm or length of a vector is ||x|| = √(x · x).
The angle θ between vectors satisfies x · y = ||x|| ||y|| cos θ.
Vectors are orthogonal or perpendicular precisely when x · y = 0.
The dot product is what allows us to compute lengths of and angles between vectors in R2 . An inner
product is an algebraic structure that generalizes this idea to other vector spaces.
Definition 2.1. An inner product space (V, ⟨ , ⟩) is a vector space V over F together with an inner
product: a function ⟨ , ⟩ : V × V → F satisfying the following properties ∀x, y, z ∈ V, λ ∈ F,
The norm or length of x ∈ V is ||x|| := √⟨x, x⟩. A unit vector has ||x|| = 1.
Vectors x, y are perpendicular/orthogonal if ⟨x, y⟩ = 0 and orthonormal if they are additionally unit
vectors.
Definition 2.2. Euclidean space means Rn equipped with the standard inner (dot) product,
⟨x, y⟩ = x · y = yᵀx = \sum_{j=1}^{n} x_jy_j = x_1y_1 + ⋯ + x_ny_n, \qquad ||x|| = \sqrt{\sum_{j=1}^{n} x_j^2}
Unless the inner product is stated explicitly, if we refer to Rn as an inner product space, we mean
Euclidean space. However, there are many other ways to make Rn into an inner product space. . .
Example 2.3. Define an alternative inner product on R2 via
⟨x, y⟩ = x1 y1 + 3x2 y2
x · x = \frac{1}{2}, \qquad y · y = \frac{5}{6}, \qquad x · y = \frac{1}{2\sqrt{3}}
We have the same vector space R2 , but different inner product spaces: (R2 , ⟨ , ⟩) ̸= (R2 , ·).
The above is an example of a weighted inner product: choose weights a1 , . . . , an ∈ R+ and define
⟨x, y⟩ = \sum_{j=1}^{n} a_jx_jy_j = a_1x_1y_1 + ⋯ + a_nx_ny_n
It is a simple exercise to check that this defines an inner product on Rn . In particular, Rn may be
equipped with infinitely many distinct inner products!
More generally, a symmetric matrix A ∈ Mn (R) is positive-definite if x T Ax > 0 for all non-zero x ∈ Rn .
It is straightforward to check that
⟨x, y⟩ := yT Ax
defines an inner product on Rn . In fact all inner products on Rn arise in this fashion! The weighted
inner products correspond to A being diagonal (Euclidean space is A = I), but this is not required.
Example 2.4. The matrix A = \begin{pmatrix} 3 & 1 \\ 1 & 1 \end{pmatrix} is positive-definite and thus defines an inner product
⟨x, y⟩ = 3x1 y1 + x1 y2 + x2 y1 + x2 y2
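A short numpy sketch of this correspondence, using the matrix from Example 2.4 and two arbitrary test vectors:

```python
import numpy as np

A = np.array([[3., 1.],
              [1., 1.]])

# Positive-definite: symmetric with positive eigenvalues
assert np.allclose(A, A.T) and np.all(np.linalg.eigvalsh(A) > 0)

# <x, y> = y^T A x reproduces 3*x1*y1 + x1*y2 + x2*y1 + x2*y2
x, y = np.array([1., 2.]), np.array([-1., 3.])
inner = y @ A @ x
assert np.isclose(inner, 3*x[0]*y[0] + x[0]*y[1] + x[1]*y[0] + x[1]*y[1])
```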
Lemma 2.5 (Basic properties). Let V be an inner product space, let x, y, z ∈ V and λ ∈ F.
1. ⟨0, x⟩ = 0
2. ||x|| = 0 ⇐⇒ x = 0
3. ||λx|| = |λ| ||x||
4. ⟨x, z⟩ = ⟨y, z⟩ for all z ⟹ x = y
5. (Cauchy–Schwarz inequality) |⟨x, y⟩| ≤ ||x|| ||y||, with equality if and only if x, y are parallel
6. (Triangle inequality) ||x + y|| ≤ ||x|| + ||y||, with equality if and only if x, y are parallel and point in the same direction
Be careful with notation: |λ| means the absolute value/modulus (of a scalar), while ||x|| means the norm
(of a vector).
In the real case, Cauchy–Schwarz allows us to define angle via cos θ = \frac{⟨x, y⟩}{||x|| ||y||}, since the right-hand
side lies in the interval [−1, 1]. However, except in Euclidean R² and R³, this notion is of limited use;
orthogonality (⟨x, y⟩ = 0) and orthonormality are usually all we care about.
Proof. Parts 1–3 are exercises. For simplicity, we prove 5 and 6 only when F = R.
4. Let z = x − y, apply the linearity condition and part 2:
5. If y = 0, the result is trivial. WLOG (and by part 3) we may assume ||y|| = 1; if the inequality
holds for this, then it holds for all non-zero y by parts 1 and 4. Now expand:
≥0 (Cauchy–Schwarz)
Equality requires both equality in Cauchy–Schwarz (x, y parallel) and that ⟨x, y⟩ ≥ 0; since x, y
are already parallel, this means that one is a non-negative multiple of the other.
Complex Inner Product Spaces
Definition 2.1 is already set up nicely when C = F. One subtle difference comes from how we expand
linear combinations in the second slot.
The proof is very easy if you remember your complex conjugates; try it!
Warning! If you dabble in the dark arts of Physics, be aware that their convention5 is for an inner
product to be conjugate-linear in the first entry and linear in the second!
Definition 2.7. The standard (Hermitian) inner product and norm on Cn are
⟨x, y⟩ = y^*x = \sum_{j=1}^{n} x_j\bar{y}_j = x_1\bar{y}_1 + ⋯ + x_n\bar{y}_n, \qquad ||x|| = \sqrt{\sum_{j=1}^{n} |x_j|^2}
Note that the weights a j must still be positive real numbers. We may similarly define inner products in
terms of positive-definite matrices ⟨x, y⟩ = y∗ Ax.
⟨x, y⟩ = \begin{pmatrix} \bar{y}_1 & \bar{y}_2 \end{pmatrix}\begin{pmatrix} 3 & -i \\ i & 3 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = 3x_1\bar{y}_1 + ix_1\bar{y}_2 − ix_2\bar{y}_1 + 3x_2\bar{y}_2
Almost all results in this chapter will be written for general inner product spaces, thus covering the
real and complex cases simultaneously. If you don’t feel confident with complex numbers, simply
let F = R and delete all complex conjugates at first read! Very occasionally a different proof will be
required depending on the field. For simplicity, examples will more often use real inner products.
4 The prefix sesqui- means one-and-a-half ; for instance a sesquicentenary is a 150 year anniversary.
5 The common Physics notation relates to ours via ⟨x | y⟩ = ⟨y, x⟩.
Further Examples
As before, the field F must be either R or C.
⟨ A, B⟩ = tr( B∗ A)
where tr is the trace of an n × n matrix; this makes Mm×n (F) into an inner product space.
This isn’t really a new example: if we identify M_{m×n}(F) with F^{mn} by stacking the columns of each matrix, then the
Frobenius inner product is the standard inner product in disguise.
Definition 2.12 (L² inner product). Given a real interval [a, b], the formula

⟨f, g⟩ := \int_a^b f(t)\overline{g(t)}\,dt

defines an inner product on the space C[a, b] of continuous functions.
With careful restriction, this works even for infinite intervals and a larger class of functions.6 Veri-
fying the required properties is straightforward if you know a little analysis; for instance continuity
allows us to conclude
||f||² = \int_a^b |f(x)|^2\,dx = 0 ⟺ f(x) ≡ 0
Example 2.13. Let f ( x ) = x and g( x ) = x2 ; these lie in the inner product space C [−1, 1] with
respect to the L2 inner product.
⟨f, g⟩ = \int_{-1}^{1} x^3\,dx = 0, \qquad ||f||^2 = \int_{-1}^{1} x^2\,dx = \frac{2}{3}, \qquad ||g||^2 = \int_{-1}^{1} x^4\,dx = \frac{2}{5}

With some simple scaling, we see that \left\{\frac{f}{||f||}, \frac{g}{||g||}\right\} = \left\{\sqrt{\tfrac{3}{2}}\,x, \sqrt{\tfrac{5}{2}}\,x^2\right\} forms an orthonormal set.
6 For us, functions will always be continuous (often polynomials) on closed bounded intervals. The square-integrable
functions and L2 -spaces for which the inner product is named are a more complicated business and beyond this course.
Definition 2.14 (ℓ² inner product). A sequence (x_n) is square-summable if \sum_{n=1}^{\infty} |x_n|^2 < ∞. These sequences form a vector space on which we can define an inner product⁷

⟨(x_n), (y_n)⟩ = \sum_{n=1}^{\infty} x_n\bar{y}_n
In essence we’ve taken the standard inner product on Fn and let n → ∞! This example, and its L2
cousin, are the prototypical Hilbert spaces, which have great application to differential equations, sig-
nal processing, etc. Since a rigorous discussion requires a significant amount of analysis (convergence
of series, completeness, integrability), these objects are generally beyond the course.
Our final example of an inner product is a useful, and hopefully obvious, hack to which we shall
repeatedly appeal in examples.
Lemma 2.15. Let V be a vector space over R or C. If β is a basis of V, then there exists exactly one
inner product on V for which β is an orthonormal set.
which is orthogonal to x.
also square-summable.
7. Show that every eigenvalue of a positive definite matrix is positive.
10. Use basic algebra to prove the Cauchy–Schwarz inequality for vectors x = \begin{pmatrix} a \\ b \end{pmatrix} and y = \begin{pmatrix} c \\ d \end{pmatrix} in
R2 with the standard (dot) product.
11. Prove the Cauchy–Schwarz and triangle inequalities for a complex inner product space. What
has to change compared to the proof of Lemma 2.5?
If you know the length of every vector, then you know the inner product!
13. Prove that \int_0^2 \frac{\sqrt{x}}{x+1}\,dx ≤ \frac{2}{\sqrt{3}}.
14. Let m ∈ Z and consider the complex-valued function f m ( x ) = √12π eimx . If ⟨ , ⟩ is the L2 inner
product on C [−π, π ], prove that { f m : m ∈ Z} is an orthonormal set.
This example is central to the study of Fourier series.
(Hint: If complex functions are scary, use Euler’s formula eimx = cos mx + i sin mx and work with the
real-valued functions cos mx and sin mx. The difficulty is that you then need integration by parts. . . )
15. Let ⟨ , ⟩ be an inner product on Fⁿ (recall that F = R or C). Define the matrix A ∈ M_n(F) by
A_{jk} = ⟨e_k, e_j⟩, where {e_1, . . . , e_n} is the standard basis. Verify that A is the matrix of the inner
product:

∀x, y ∈ Fⁿ, ⟨x, y⟩ = y^*Ax
In particular,
More generally, if β = {v_1, . . . , v_n} is a basis then A_{jk} = ⟨v_k, v_j⟩ defines the matrix of the inner
product with respect to β:
2.2 Orthogonal Sets and the Gram–Schmidt Process
We start with a simple definition, relating subspaces to orthogonality.
Definition 2.16. Let U be a subspace of an inner product space V. The orthogonal complement U ⊥ is
the set
U ⊥ = {x ∈ V : ∀u ∈ U, ⟨x, u⟩ = 0}
It is easy to check that U ⊥ is itself a subspace of V and that U ∩ U ⊥ = {0}. It can moreover be seen
that U ⊆ (U ⊥ )⊥ , though equality need not hold in infinite dimensions (see Exercise 7).
Example 2.17. U = Span\left\{\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ -1 \\ 3 \end{pmatrix}\right\} ≤ R³ has orthogonal complement U^⊥ = Span\left\{\begin{pmatrix} 0 \\ 3 \\ 1 \end{pmatrix}\right\}.
Theorem 2.18. Let V be an inner product space and let U = Span β where β = {u1 , . . . , un } is an
orthogonal set of non-zero vectors;
⟨u_j, u_k⟩ = \begin{cases} 0 & \text{if } j ≠ k \\ ||u_j||^2 ≠ 0 & \text{if } j = k \end{cases}
Then:
Observe that (∗) essentially calculates the co-ordinate vector [x] β ∈ Fn . Recalling how unpleasant
such calculations have been in the past, often requiring large matrix inversions, we immediately see
the power of inner products and orthogonal bases.
Proof. 1. Since β spans U, a given x ∈ U may be written
x = \sum_{k=1}^{n} a_ku_k
for some scalars ak . The orthogonality of β recovers the required expression for a j :
⟨x, u_j⟩ = \sum_{k=1}^{n} a_k⟨u_k, u_j⟩ = a_j||u_j||^2
Examples 2.19. 1. Consider the standard orthonormal basis β = {e_1, e_2} of R². For any x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, we easily check that

\sum_{j=1}^{2} ⟨x, e_j⟩e_j = x_1e_1 + x_2e_2 = x
2. In R³, β = {u_1, u_2, u_3} = \left\{\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}, \begin{pmatrix} 2 \\ -1 \\ 0 \end{pmatrix}, \begin{pmatrix} 3 \\ 6 \\ -5 \end{pmatrix}\right\} is an orthogonal set and thus a basis. We
compute the co-ordinates of x = \begin{pmatrix} 7 \\ 4 \\ 2 \end{pmatrix} with respect to β:

x = \sum_{j=1}^{3} \frac{⟨x, u_j⟩}{||u_j||^2}u_j = \frac{7 + 8 + 6}{1 + 4 + 9}\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} + \frac{14 − 4 + 0}{4 + 1 + 0}\begin{pmatrix} 2 \\ -1 \\ 0 \end{pmatrix} + \frac{21 + 24 − 10}{9 + 36 + 25}\begin{pmatrix} 3 \\ 6 \\ -5 \end{pmatrix}
  = \frac{3}{2}\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} + 2\begin{pmatrix} 2 \\ -1 \\ 0 \end{pmatrix} + \frac{1}{2}\begin{pmatrix} 3 \\ 6 \\ -5 \end{pmatrix} ⟹ [x]_β = \begin{pmatrix} 3/2 \\ 2 \\ 1/2 \end{pmatrix}
Compare this with the painfully slow augmented matrix method for finding co-ordinates!
3. Revisiting Example 2.17, let x = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}. Since β = \left\{\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ -1 \\ 3 \end{pmatrix}\right\} is an orthogonal basis of U, we
observe that

π_U(x) = 1\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + \frac{2}{10}\begin{pmatrix} 0 \\ -1 \\ 3 \end{pmatrix} = \frac{1}{5}\begin{pmatrix} 5 \\ -1 \\ 3 \end{pmatrix}, \qquad π_U^⊥(x) = x − π_U(x) = \frac{2}{5}\begin{pmatrix} 0 \\ 3 \\ 1 \end{pmatrix}
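The computation in part 3 is easy to reproduce numerically; a minimal sketch with numpy:

```python
import numpy as np

u1 = np.array([1., 0., 0.])
u2 = np.array([0., -1., 3.])     # orthogonal basis of U from Example 2.17
x  = np.array([1., 1., 1.])

# pi_U(x) = sum <x,u_j>/||u_j||^2 u_j  for an orthogonal basis
proj = sum((x @ u) / (u @ u) * u for u in (u1, u2))
print(proj)          # [1.  -0.2  0.6]  i.e. (5, -1, 3)/5
print(x - proj)      # [0.   1.2  0.4]  i.e. 2(0, 3, 1)/5
```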
The Gram–Schmidt Process
Theorem 2.18 tells us how to compute the orthogonal projections corresponding to V = U ⊕ U ⊥ ,
provided U has a finite, orthogonal basis. Given how useful such bases are, our next goal is to see that
such exist for any finite-dimensional subspace. Helpfully there exists a constructive algorithm.
• Choose u_1 = a_1s_1 where a_1 ≠ 0.
• For each k ≥ 2, choose u_k = a_k\left(s_k − \sum_{j=1}^{k-1} \frac{⟨s_k, u_j⟩}{||u_j||^2}u_j\right) where a_k ≠ 0. \qquad (†)
The purpose of the scalars ak is to give you some freedom; choose them to avoid unpleasant fractions!
If you want a set of orthonormal vectors, it is easier to scale everything after the algorithm is complete.
Indeed, by taking S to be a basis of V and normalizing the resulting β, we conclude:
Corollary 2.21. Every finite-dimensional inner product space has an orthonormal basis.
Example 2.22. S = {s_1, s_2, s_3} = \left\{\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 2 \\ -1 \\ 3 \end{pmatrix}, \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}\right\} is a linearly independent subset of R³.

1. Choose u_1 = s_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} ⟹ ||u_1||^2 = 1

2. s_2 − \frac{⟨s_2, u_1⟩}{||u_1||^2}u_1 = \begin{pmatrix} 2 \\ -1 \\ 3 \end{pmatrix} − 2\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ -1 \\ 3 \end{pmatrix}: choose u_2 = \begin{pmatrix} 0 \\ -1 \\ 3 \end{pmatrix} ⟹ ||u_2||^2 = 10

3. s_3 − \frac{⟨s_3, u_1⟩}{||u_1||^2}u_1 − \frac{⟨s_3, u_2⟩}{||u_2||^2}u_2 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} − \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} − \frac{2}{10}\begin{pmatrix} 0 \\ -1 \\ 3 \end{pmatrix} = \frac{2}{5}\begin{pmatrix} 0 \\ 3 \\ 1 \end{pmatrix}: choose u_3 = \begin{pmatrix} 0 \\ 3 \\ 1 \end{pmatrix}

The orthogonality of β = {u_1, u_2, u_3} is clear. It is now trivial to observe that \left\{u_1, \frac{1}{\sqrt{10}}u_2, \frac{1}{\sqrt{10}}u_3\right\} is
an orthonormal basis of R³.
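A minimal numpy implementation of the algorithm (omitting the optional rescaling by the a_k), applied to the set S above:

```python
import numpy as np

def gram_schmidt(vectors):
    """Return an orthogonal (not yet normalised) basis for the span of the inputs."""
    basis = []
    for s in vectors:
        # subtract the projection of s onto each previously constructed u
        u = s - sum((s @ b) / (b @ b) * b for b in basis)
        basis.append(u)
    return basis

S = [np.array([1., 0., 0.]), np.array([2., -1., 3.]), np.array([1., 1., 1.])]
for u in gram_schmidt(S):
    print(u)   # (1,0,0), (0,-1,3), (0, 1.2, 0.4) -- the last is 2/5 of (0,3,1)
```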
Proof of Theorem 2.20. For each k ≤ n, define Sk = {s1 , . . . , sk } and β k = {u1 , . . . , uk }. We prove by
induction that each β k is an orthogonal set of non-zero vectors and that Span β k = Span Sk . The
Theorem is then the terminal case k = n.
(Base case k = 1) Certainly β 1 = { a1 s1 } is an orthogonal set and Span β 1 = Span S1 .
(Induction step) Fix k ≥ 2, assume β k−1 is an orthogonal non-zero set and that Span β k−1 =
Span Sk−1 . By Theorem 2.18, uk ∈ (Span β k−1 )⊥ . We also see that uk ̸= 0, for if not,
(†) =⇒ sk ∈ Span β k−1 = Span Sk−1
and S would be linearly dependent. It follows that β k is an orthogonal set of non-zero vectors.
Moreover, sk ∈ Span β k =⇒ Span Sk ≤ Span β k . Since these spaces have the same (finite) dimension
k, we conclude that Span β k = Span Sk .
By induction, β is an orthogonal, non-zero spanning set for Span S; by Theorem 2.18, it is a basis.
Example 2.23. This time we work in the space of polynomials P(R) equipped with the L2 inner
product ⟨f, g⟩ = \int_0^1 f(x)g(x)\,dx on the interval [0, 1]. Let S = {1, x, x²} and apply the algorithm:

1. Choose f_1(x) = 1 ⟹ ||f_1||^2 = \int_0^1 1\,dx = 1

2. x − \frac{⟨x, f_1⟩}{||f_1||^2}f_1 = x − \int_0^1 x\,dx = x − \frac{1}{2}
   We choose f_2(x) = 2x − 1, with ||f_2||^2 = \int_0^1 (2x − 1)^2\,dx = \frac{1}{3}

3. x^2 − \frac{⟨x^2, f_1⟩}{||f_1||^2}f_1 − \frac{⟨x^2, f_2⟩}{||f_2||^2}f_2 = x^2 − \int_0^1 x^2\,dx − \frac{\int_0^1 x^2(2x − 1)\,dx}{1/3}(2x − 1) = x^2 − x + \frac{1}{6}
   We choose f_3(x) = 6x^2 − 6x + 1, with ||f_3||^2 = \int_0^1 (6x^2 − 6x + 1)^2\,dx = \frac{1}{5}
This example can be extended to arbitrary degree since the countable set {1, x, x², . . .} is a basis of
P(R). Indeed this shows that ( P(R), ⟨ , ⟩) has an orthonormal basis.
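The same computation can be checked symbolically; a sketch using sympy with the L² inner product on [0, 1]:

```python
import sympy as sp

x = sp.symbols('x')
inner = lambda f, g: sp.integrate(f * g, (x, 0, 1))   # L^2 inner product on [0,1]

S = [sp.Integer(1), x, x**2]
basis = []
for s in S:
    u = s - sum(inner(s, b) / inner(b, b) * b for b in basis)
    basis.append(sp.expand(u))

print(basis)                          # [1, x - 1/2, x**2 - x + 1/6]
print([inner(b, b) for b in basis])   # [1, 1/12, 1/180]
```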
Gram–Schmidt also shows that our earlier discussion of orthogonal projections is generic.
\sum_{u ∈ β} ⟨x, u⟩\,u
Exercises 2.2 1. Apply Gram–Schmidt to obtain an orthogonal basis β for Span S. Then obtain the
co-ordinate representation (Fourier coefficients) of the given vector with respect to β.
(a) S = \left\{\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}\right\} ⊆ R³, \quad x = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}

(b) S = \left\{\begin{pmatrix} 3 & 5 \\ -1 & 1 \end{pmatrix}, \begin{pmatrix} -1 & 9 \\ 5 & -1 \end{pmatrix}, \begin{pmatrix} 7 & -17 \\ 2 & -6 \end{pmatrix}\right\} ⊆ M_2(R), \quad X = \begin{pmatrix} 1 & 27 \\ -4 & 8 \end{pmatrix} (use the Frobenius product)

(c) S = \left\{\begin{pmatrix} 1 \\ i \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ i \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}\right\} ⊆ C³ with x = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}

(d) S = {1, x, x²} with ⟨f, g⟩ = \int_{-1}^{1} f(x)g(x)\,dx and f(x) = x².
Important! You’ll likely need much more practice than this to get comfortable with Gram–
Schmidt; make up your own problems!
2. Let S = {s_1, s_2} = \left\{\begin{pmatrix} 1 \\ 0 \\ 3 \end{pmatrix}, \begin{pmatrix} 2 \\ -1 \\ 0 \end{pmatrix}\right\} and U = Span S ≤ R³. Find π_U(x) if x = \begin{pmatrix} 3 \\ -1 \\ -2 \end{pmatrix}.
(Hint: First apply Gram–Schmidt)
3. Find the orthogonal complement to U = Span{x²} ≤ P_2(R) with respect to the inner product
⟨f, g⟩ = \int_0^1 f(t)g(t)\,dt.
4. Let T ∈ L(V, W ) where V, W are inner product spaces with orthonormal bases β = {v_1, . . . , v_n}
and γ = {w_1, . . . , w_m} respectively. Prove that the matrix A = [T]^γ_β ∈ M_{m×n}(F) of T with
respect to these bases has jkth entry

A_{jk} = ⟨T(v_k), w_j⟩
5. Suppose that β is an orthonormal basis of an n-dimensional inner product space V. Prove that,
∀x, y ∈ V, ⟨x, y⟩ = [y]∗β [x] β
Otherwise said, the co-ordinate isomorphism ϕβ : V → Fn defined by ϕβ (x) = [x] β is an isomorphism
of inner product spaces where we use the standard inner product on Fn
6. Let U be a subspace of an inner product space V. Prove the following:
(a) U ⊥ is a subspace of V.
(b) U ∩ U ⊥ = {0}
(c) U ⊆ (U ⊥ ) ⊥
(d) If V = U ⊕ U ⊥ , then U = (U ⊥ )⊥ (this is always the case when dim U < ∞)
7. Let ℓ2 be the set of square-summable sequences of real numbers (Definition 2.14). Consider the
sequences u1 , u2 , u3 , . . ., where u j is the zero sequence except for a single 1 in the jth entry. For
instance,
u4 = (0, 0, 0, 1, 0, 0, 0, 0, . . .)
(a) Let U = Span{u j : j ∈ N}. Prove that U ⊥ contains only the zero sequence.
(b) Show that the sequence y = \left(\frac{1}{n}\right) lies in ℓ², but does not lie in U.
(b) Briefly explain why the Fourier series is not an element of Span β.
(c) Sketch a few of the Fourier approximations (sum up to m = 5 or 7. . . ) and observe, when
extended to R, how they approximate a discontinuous periodic function.
9. (Hard) Much of Theorem 2.18 remains true, with suitable modifications, even if β is an infinite
set. Restate and prove as much as you can, and identify the false part(s).
2.3 The Adjoint of a Linear Operator
Recall how the standard inner product on Fⁿ may be written in terms of the conjugate-transpose:

⟨x, y⟩ = y^*x = \bar{y}^Tx
We start by inserting a matrix into this expression and interpreting in two different ways. Suppose
A ∈ Mm×n (F), v ∈ Fn and w ∈ Fm , then
\underbrace{⟨A^*w, v⟩}_{\text{in } F^n} = v^*(A^*w) = (v^*A^*)w = (Av)^*w = \underbrace{⟨w, Av⟩}_{\text{in } F^m} \qquad (†)
Example 2.25. As a sanity check, let A = \begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix} ∈ M_2(R), w = \begin{pmatrix} x \\ y \end{pmatrix} and v = \begin{pmatrix} p \\ q \end{pmatrix}. Then,

\left\langle A^T\begin{pmatrix} x \\ y \end{pmatrix}, \begin{pmatrix} p \\ q \end{pmatrix}\right\rangle = \left\langle \begin{pmatrix} 1 & 0 \\ 2 & 3 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}, \begin{pmatrix} p \\ q \end{pmatrix}\right\rangle = \left\langle \begin{pmatrix} x \\ 2x + 3y \end{pmatrix}, \begin{pmatrix} p \\ q \end{pmatrix}\right\rangle = xp + (2x + 3y)q

\left\langle \begin{pmatrix} x \\ y \end{pmatrix}, A\begin{pmatrix} p \\ q \end{pmatrix}\right\rangle = \left\langle \begin{pmatrix} x \\ y \end{pmatrix}, \begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix}\begin{pmatrix} p \\ q \end{pmatrix}\right\rangle = \left\langle \begin{pmatrix} x \\ y \end{pmatrix}, \begin{pmatrix} p + 2q \\ 3q \end{pmatrix}\right\rangle = x(p + 2q) + 3yq
Note how the inner products are evaluated on different spaces. At the level of linear maps this is a
relationship between L A ∈ L(Fn , Fm ) and L A∗ ∈ L(Fm , Fn ), one that is easily generalizable.
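The identity (†) is easy to test numerically; a sketch with numpy using random complex matrices and vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 2)) + 1j * rng.standard_normal((3, 2))  # A in M_{3x2}(C)
v = rng.standard_normal(2) + 1j * rng.standard_normal(2)            # v in C^2
w = rng.standard_normal(3) + 1j * rng.standard_normal(3)            # w in C^3

inner = lambda x, y: y.conj() @ x          # standard Hermitian product <x,y> = y* x

# <A* w, v> computed in C^2 equals <w, A v> computed in C^3
assert np.isclose(inner(A.conj().T @ w, v), inner(w, A @ v))
```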
Definition 2.26. Let T ∈ L(V, W ) where V, W are inner product spaces over the same field F. The
adjoint of T is a function T^* : W → V (read ‘T-star’) satisfying

∀v ∈ V, ∀w ∈ W, \quad ⟨T^*(w), v⟩ = ⟨w, T(v)⟩

Note that the first inner product is computed within V and the second within W.
The adjoint effectively extends the conjugate-transpose to linear maps. We now use the same notation
for three objects, so be careful!
• If A is a real or complex matrix, then A^* = \bar{A}^T is its conjugate-transpose.
• If T is a linear map, then T∗ is its adjoint.
• If V is a vector space, then V ∗ = L(V, F) is its dual space.
Thankfully the two notations line up nicely, as part 3 of our first result shows.
Theorem 2.27 (Basic Properties). 1. If an adjoint exists,8 then it is unique and linear.
3. Suppose V, W are finite-dimensional with orthonormal bases β, γ respectively. Then the matrix
of the adjoint of T ∈ L(V, W ) is the conjugate-transpose of the original: [T^*]^β_γ = ([T]^γ_β)^*.
8 Existence of adjoints is trickier, so we postpone this a little: see Corollary 2.34 and Exercise 12.
Proof. 1. (Uniqueness) Suppose T∗ and S∗ are adjoints of T. Then
Since this holds for all y, Lemma 2.5 part 4 says that ∀x, T∗ (x) = S∗ (x), whence T∗ = S∗ .
(Linearity) Simply translate across, use the linearity of T, and again appeal to Lemma 2.5:
∀z, ⟨T∗ (λx + y), z⟩ = ⟨λx + y, T(z)⟩ = λ ⟨x, T(z)⟩ + ⟨y, T(z)⟩ = λ ⟨T∗ (x), z⟩ + ⟨T∗ (y), z⟩
= ⟨λT∗ (x) + T∗ (y), z⟩
=⇒ T∗ (λx + y) = λT∗ (x) + T∗ (y)
[T^*]^β_γ = A^* = \begin{pmatrix} i & 2 \\ 1 & 1 + i \\ -3 & 4 − 2i \end{pmatrix}
As a sanity check, multiply out a few examples of ⟨ A∗ w, v⟩ = ⟨w, Av⟩; make sure you’re comfortable
with the fact that the left inner product is on C2 and the right on C3 !
The Theorem tells us that every linear map T ∈ L(V, W ) between finite-dimensional spaces has an
adjoint and moreover how to compute it:
1. Choose orthonormal bases (these exist by Corollary 2.21) and find the matrix [T]^γ_β.
2. Take the conjugate-transpose ([T]^γ_β)^* and translate back to find T^* ∈ L(W, V ).
The prospect of twice applying Gram–Schmidt and translating between linear maps and their ma-
trices is unappealing; calculating this way can quickly become an enormous mess! In practice, it is
often better to try a modified approach; see for instance part 2(b) of the next Example.
Examples 2.29. Let T = \frac{d}{dx} ∈ L( P_1 (R)) be the derivative operator; T( a + bx ) = b. We treat P_1 (R)
as an inner product space in two ways.
1. Equip P_1(R) with the inner product for which the standard basis ϵ = {1, x} is orthonormal. Then

[T]_ϵ = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} ⟹ [T^*]_ϵ = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} ⟹ T^*( a + bx ) = ax
2. Equip P_1(R) with the L² inner product ⟨f, g⟩ = \int_0^1 f(x)g(x)\,dx. As we saw in Example 2.23, the basis
β = {f_1, f_2} = {1, 2x − 1} is orthogonal with ||f_1|| = 1 and ||f_2|| = \frac{1}{\sqrt{3}}. We compute the
adjoint of T = \frac{d}{dx} in two different ways.
(a) The basis γ = {g_1, g_2} = \{f_1, \sqrt{3}f_2\} = \{1, \sqrt{3}(2x − 1)\} is orthonormal. Observe that

T(g_1) = 0, \quad T(g_2) = 2\sqrt{3} ⟹ [T]_γ = \begin{pmatrix} 0 & 2\sqrt{3} \\ 0 & 0 \end{pmatrix} ⟹ [T^*]_γ = \begin{pmatrix} 0 & 0 \\ 2\sqrt{3} & 0 \end{pmatrix}

⟹ T^*( a + bx ) = T^*\left(\left(a + \frac{b}{2}\right)g_1 + \frac{b}{2\sqrt{3}}g_2\right) = \left(a + \frac{b}{2}\right) \cdot 2\sqrt{3}\,g_2
  = 3(2a + b)(2x − 1)
(b) Use the orthogonal basis β and the projection formula (Theorem 2.18). With p(x) = a + bx,

T^*(p) = \frac{⟨T^*(p), f_1⟩}{||f_1||^2}f_1 + \frac{⟨T^*(p), f_2⟩}{||f_2||^2}f_2 = ⟨p, T(1)⟩ + ⟨p, T(2x − 1)⟩ \cdot 3(2x − 1)
  = ⟨p, 0⟩ + 3⟨p, 2⟩(2x − 1) = 3\left(\int_0^1 2(a + bx)\,dx\right)(2x − 1)
  = 3(2a + b)(2x − 1)
Note the advantages here: no square roots and no need to change basis at the end!
The calculations for the second example were much nastier, even though we were already in posses-
sion of an orthogonal basis. The crucial point is that the two examples produce different maps T∗ : the
adjoint depends on the inner product!
The Fundamental Subspaces Theorem
To every linear map are associated its range and nullspace. These interact nicely with the adjoint. . .
1. R(T∗ )⊥ = N (T)
The proof is left to Exercise 6. You’ve likely observed this with transposes of small matrices.
Example 2.33. g(p) := \int_0^1 p(x)\,dx is a linear map g : P_2(R) → R. Equip P_2(R) with the inner
product for which the standard basis {1, x, x²} is orthonormal. Then

g( a + bx + cx² ) = a + \frac{1}{2}b + \frac{1}{3}c = \left\langle a + bx + cx², 1 + \frac{1}{2}x + \frac{1}{3}x² \right\rangle
The idea of the proof is very simple: if g(x) = ⟨x, y⟩ then the nullspace of g must equal Span{y}⊥ . . .
Let u ∈ N ( g)⊥ be either of the two unit vectors and define, independently of u,
y := \overline{g(u)}\,u ∈ V
The uniqueness of y follows from the cancellation property (Lemma 2.5, part 4).
Due to the tight correspondence, the map is often decorated as gy . Riesz’s theorem indeed says that
y ↦ g_y is an isomorphism V ≅ V^*. While there are infinitely many isomorphisms between these
spaces, the inner product structure identifies a canonical or preferred choice.
Corollary 2.34. Every linear map on a finite-dimensional inner product space has an adjoint.
Note how only the domain is required to be finite-dimensional! Riesz’s Theorem and the Corollary
also apply to continuous linear operators on (infinite-dimensional) Hilbert spaces, though the proof
is a little trickier.
Proof. Let T ∈ L(V, W ) where dim V < ∞, and suppose z ∈ W is given. Simply define T∗ (z) := y
where y ∈ V is the unique vector in Riesz’s Theorem arising from the linear map
g : V → F, g ( x ) = ⟨T( x ), z ⟩
Exercises 2.3 1. For each inner product space V and linear operator T ∈ L(V ), evaluate T∗ on the
given vector.
(a) V = R² with the standard inner product, T\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 2x + y \\ x − 3y \end{pmatrix} and x = \begin{pmatrix} 3 \\ 5 \end{pmatrix}

(c) V = P_1(R) with ⟨f, g⟩ = \int_0^1 f(t)g(t)\,dt, T(f) = f′ + 3f and f(t) = 4 − 2t
2. Suppose A = \begin{pmatrix} 1 & 1 \\ 4 & 3 \end{pmatrix} and consider the linear map T = L_A ∈ L(R²) where R² is equipped with

⟨x, y⟩ = 4x_1y_1 + x_2y_2

(a) Find the matrix of T with respect to the orthonormal basis β = {v_1, v_2} = \{\frac{1}{2}e_1, e_2\}.
(b) Find the adjoint T∗ and its matrix with respect to the standard basis ϵ = {e1 , e2 }.
(Hint: the answer isn’t A T !)
3. Extending Examples 2.29, find the adjoint of T = \frac{d}{dx} ∈ L( P_2 (R)) with respect to:
(a) The inner product where the standard basis ϵ = {1, x, x2 } is orthonormal.
(b) (Hard!) The L² inner product \int_0^1 f(x)g(x)\,dx.
4. Let T( f ) = f ′′ be a linear transformation of P2 (R) and let ϵ = {1, x, x2 } be the standard basis.
Find T∗ ( a + bx + cx2 ):
(a) With respect to the inner product where ϵ is orthonormal;
(b) With respect to the L² inner product ⟨f, g⟩ = \int_{-1}^{1} f(t)g(t)\,dt.
(Hint: {1, x, 3x2 − 1} is orthogonal)
5. Prove part 2 of Theorem 2.27.
7. For each inner product space V and linear transformation g : V → F, find a vector y ∈ V such
that g(x) = ⟨x, y⟩ for all x ∈ V.
(a) V = R³ with the standard inner product, and g\begin{pmatrix} x \\ y \\ z \end{pmatrix} = x − 2y + 4z

(b) V = C² with the standard inner product, and g\begin{pmatrix} z \\ w \end{pmatrix} = iz − 2w

(c) V = P_2(R) with the L² inner product ⟨f, h⟩ = \int_0^1 f(x)h(x)\,dx, and g(f) = f′(1)
8. (a) In the proof of Theorem 2.32, explain why y depends only on g (not u).
(b) In the proof of Corollary 2.34, check that g(x) := ⟨T(x), z⟩ is linear.
9. Let y, z ∈ V be fixed vectors and define T ∈ L(V ) by T(x) = ⟨x, y⟩ z. Show that T∗ exists and
find an explicit expression.
10. Suppose A ∈ Mm×n (F). Prove that A∗ A is diagonal if and only if the columns of A are orthog-
onal. What additionally would it mean if A∗ A = I?
(a) Prove that the eigenvalues of T∗ are the complex conjugates of those of T.
(Hint: relate the characteristic polynomial p∗ (t) = det(T∗ − tI) to that of T)
(b) Prove that T∗ is diagonalizable if and only if T is.
12. (Hard) We present two linear maps which do not have an adjoint!
(a) Since ϵ = {1, x, x2 , . . .} is a basis of P(R), we may define a linear map T ∈ L( P(R)) via
T( x n ) = 1 for all n; for instance
T(4 + 3x + 2x5 ) = 9
Let ⟨ , ⟩ be the inner product for which ϵ is orthonormal. If T∗ existed, show that
T^*(1) = \sum_{n=0}^{\infty} x^n
T\big((x_n)\big) = \left(\sum_{n=1}^{\infty} \frac{x_n}{n},\ 0, 0, 0, 0, 0, \ldots\right)
Find the adjoint T∗ . If V ≤ ℓ2 is the subspace whose elements have only finitely many
non-zero terms, show that the restriction TV does not have an adjoint.
2.4 Normal & Self-Adjoint Operators and the Spectral Theorem
We now come to the fundamental question of this chapter: for which linear operators T ∈ L(V ) can
we find an orthonormal eigenbasis? Many linear maps are, of course, not even diagonalizable, so in
general this is far too much to hope for! Let’s see what happens if such a basis exists. . .
If β is an orthonormal basis of eigenvectors of T, then [T]_β = diag(λ_1, . . . , λ_n) is diagonal, whence [T^*]_β = ([T]_β)^* = diag(\bar{λ}_1, . . . , \bar{λ}_n).
If V is a real inner product space, then these matrices are identical and so T^* = T. In the complex
case, we instead observe that the diagonal matrices [T]_β and [T^*]_β commute, whence TT^* = T^*T.
Definition 2.35. Suppose T is a linear operator on an inner product space V and assume T has an
adjoint. We say that T is,
Normal if TT∗ = T∗ T,
Self-adjoint if T∗ = T.
The definitions for square matrices over R and C are identical, where ∗ now denotes the conjugate-
transpose.
Example 2.36. The (non-symmetric) real matrix A = \begin{pmatrix} 2 & -1 \\ 1 & 2 \end{pmatrix} is normal but not self-adjoint:

AA^T = \begin{pmatrix} 2 & -1 \\ 1 & 2 \end{pmatrix}\begin{pmatrix} 2 & 1 \\ -1 & 2 \end{pmatrix} = \begin{pmatrix} 5 & 0 \\ 0 & 5 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ -1 & 2 \end{pmatrix}\begin{pmatrix} 2 & -1 \\ 1 & 2 \end{pmatrix} = A^TA
More generally, every non-zero skew-hermitian matrix A∗ = − A is normal but not self-adjoint:
A∗ = − A =⇒ AA∗ = − A2 = A∗ A
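These properties are easy to check numerically; a short numpy sketch using the matrix from Example 2.36:

```python
import numpy as np

A = np.array([[2., -1.],
              [1.,  2.]])

def is_self_adjoint(M):
    return np.allclose(M, M.conj().T)

def is_normal(M):
    return np.allclose(M @ M.conj().T, M.conj().T @ M)

print(is_normal(A), is_self_adjoint(A))      # True False
```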
We saw above that linear maps with an orthonormal eigenbasis are either self-adjoint or normal
depending whether the inner product space is real or complex. Amazingly, this provides a complete
characterisation of such maps!
Theorem 2.37 (Spectral Theorem, version 1). Let T be a linear operator on a finite-dimensional
inner product space V.
Examples 2.38. 1. We diagonalize the self-adjoint linear map T = L_A ∈ L(R²) where A = \begin{pmatrix} 6 & 3 \\ 3 & -2 \end{pmatrix}.
The basis β = {w_1, w_2} is orthonormal, with respect to which [T]_β = \begin{pmatrix} 7 & 0 \\ 0 & -3 \end{pmatrix} is diagonal.
2. The matrix A = \begin{pmatrix} 1 & 3 \\ 0 & -2 \end{pmatrix} is not normal:

AA^* = \begin{pmatrix} 1 & 3 \\ 0 & -2 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 3 & -2 \end{pmatrix} = \begin{pmatrix} 10 & -6 \\ -6 & 4 \end{pmatrix} ≠ \begin{pmatrix} 1 & 3 \\ 3 & 13 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 3 & -2 \end{pmatrix}\begin{pmatrix} 1 & 3 \\ 0 & -2 \end{pmatrix} = A^*A

It is diagonalizable, indeed

γ = \left\{\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ -1 \end{pmatrix}\right\} ⇝ [T]_γ = \begin{pmatrix} 1 & 0 \\ 0 & -2 \end{pmatrix}
3. Let A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} and consider T = L_A acting on both C² and R². Since T is normal but not
self-adjoint, we’ll see how the field really matters in the spectral theorem.
First the complex case: T ∈ L(C2 ) is normal and thus diagonalizable with respect to an or-
thonormal basis of eigenvectors. Here are the details.
Now for the real case: T ∈ L(R2 ) is not self-adjoint and thus should not be diagonalizable
with respect to an orthonormal basis of eigenvectors. Indeed this is trivial; the characteristic
polynomial has no roots in R and so there are no real eigenvalues! It is also clear geometrically:
T is rotation by 90° counter-clockwise around the origin, so it has no eigenvectors.
Proving the Spectral Theorem for Self-Adjoint Operators
It is irrelevant whether V is real or complex. The previous example demonstrates part 2; even when
V = C2 is a complex inner product space, the eigenvalues of a self-adjoint matrix are real.
3. This is trivial if V is complex since every characteristic polynomial splits over C. We therefore
assume V is real. Choose any orthonormal basis γ of V, let A = [T]γ ∈ Mn (R), and define
S := L A ∈ L(Cn ). Then;
We are now able to prove the spectral theorem for self-adjoint operators on a finite-dimensional inner
product space V. The argument applies regardless of whether V is real or complex.
Proving the Spectral Theorem for Normal Operators
What changes for normal operators on complex inner product spaces? Not much! Indeed the proof
is almost identical when T is merely normal.
• We don’t need parts 2 and 3 of Lemma 2.39: every linear operator on a finite-dimensional
complex inner product space has an eigenvalue and we no longer care whether eigenvalues are
real.
– W being T-invariant: This isn’t quite as simple as (∗), but thankfully part 3 of the next
result provides the needed correction.
– TW being normal: We need a replacement for part 1 of Lemma 2.39; this is a little more
involved.
Rather than write out all the details, we leave this to Exercises 6 and 7.
For completeness, and as an analogue/extension of Lemma 2.39, we summarize some of the basic
properties of normal operators. These also apply to self-adjoint operators as a special case.
3. T(x) = λx ⟺ T^*(x) = \bar{λ}x, so that T and T^* have the same eigenvectors and conjugate
eigenvalues. This recovers the previously established fact that λ ∈ R if T is self-adjoint.
Proof. 1. ||T(x)||2 = ⟨T(x), T(x)⟩ = ⟨T∗ T(x), x⟩ = ⟨TT∗ (x), x⟩ = ⟨T∗ (x), T∗ (x)⟩ = ||T∗ (x)||2 .
2. ⟨x, (T − tI)(y)⟩ = ⟨x, T(y)⟩ − \bar{t}⟨x, y⟩ = ⟨T^*(x), y⟩ − ⟨\bar{t}x, y⟩ = ⟨(T^* − \bar{t}I)(x), y⟩ shows that T −
tI has adjoint T^* − \bar{t}I. It is trivial to check that these commute.

3. T(x) = λx ⟺ ||(T − λI)(x)|| = 0 ⟺ ||(T^* − \bar{λ}I)(x)|| = 0 ⟺ T^*(x) = \bar{λ}x, where the middle equivalence uses parts 1 and 2.
4. In part this follows from the spectral theorem, but we can also prove more straightforwardly.
Suppose T(x) = λx and T(y) = µy where λ ̸= µ. By part 3,
Schur’s Lemma
It is reasonable to ask how useful an orthonormal basis can be in general. Here is one answer.
Lemma 2.41 (Schur). Suppose T is a linear operator on a finite-dimensional inner product space V.
If the characteristic polynomial of T splits, then there exists an orthonormal basis β of V such that
[T] β is upper-triangular.
The spectral theorem is a special case; since the proof is similar, we leave it to the exercises.
The conclusion of Schur’s lemma is weaker than the spectral theorem, though it applies to more
operators: indeed if V is complex, it applies to any T! Every example of the spectral theorem is also
an example of Schur’s lemma. Example 2.38.2 provides another, since the matrix A is already upper
triangular with respect to the standard orthonormal basis. Here is another example.
Example 2.42. Consider T(f) = 2f′(x) + xf(1) as a linear map T ∈ L(P_1(R)) with respect to the
L² inner product ⟨f, g⟩ = \int_0^1 f(t)g(t)\,dt. We have

T( a + bx ) = 2b + ( a + b) x

so f_1 := 1 + x is an eigenvector: T(f_1) = 2f_1. A vector orthogonal to f_1 is

1 − \frac{⟨1, 1 + x⟩}{||1 + x||^2}(1 + x) = 1 − \frac{1 + \frac{1}{2}}{1 + 1 + \frac{1}{3}}(1 + x) = \frac{1}{14}(5 − 9x) ⟹ f_2 = 5 − 9x

T(f_2) = −18 − 4x = −13(1 + x) − (5 − 9x) = −13f_1 − f_2 ⟹ [T]_{\{f_1, f_2\}} = \begin{pmatrix} 2 & -13 \\ 0 & -1 \end{pmatrix}
We can also work with the corresponding orthonormal basis as posited in the theorem, though the
matrix is messier:
β = {g_1, g_2} = \left\{\sqrt{\tfrac{3}{7}}(1 + x), \tfrac{1}{\sqrt{7}}(5 − 9x)\right\} ⟹ [T]_β = \begin{pmatrix} 2 & -\frac{13}{\sqrt{3}} \\ 0 & -1 \end{pmatrix}
Alternatively, we could have started with the other eigenvector h1 = 2 − x: an orthogonal vector to
this is h2 = 4 − 9x, with respect to which
[T]_{\{h_1, h_2\}} = \begin{pmatrix} -1 & -13 \\ 0 & 2 \end{pmatrix}
In both cases the eigenvalues are down the diagonal, as must be for an upper-triangular matrix.
In general, it is difficult to quickly find a suitable basis satisfying Schur’s lemma. After trying the
proof in the exercises, you should be able to describe a method, though it is impractically slow!
Exercises 2.4 1. For each linear operator T on an inner product space V, decide whether T is nor-
mal, self-adjoint, or neither. If the spectral theorem permits, find an orthonormal eigenbasis.
3. Suppose S, T are self-adjoint operators on an inner product space V. Prove that ST is self-adjoint
if and only if ST = TS.
(Hint: recall Theorem 2.27)
4. Let T be normal on a finite-dimensional inner product space V. Prove that N (T∗ ) = N (T) and
that R(T∗ ) = R(T).
(Hint: Use Lemma 2.40 and the Fundamental Subspaces Theorem 2.30)
6. Let W be a T-invariant subspace of an inner product space V and let TW ∈ L(W ) be the restric-
tion of T to W. Prove:
(a) W ⊥ is T∗ -invariant.
(b) If W is both T- and T∗ -invariant, then (TW )∗ = (T∗ )W .
(c) If W is both T- and T∗ -invariant and T is normal, then TW is normal.
7. Use the previous question to complete the proof of the spectral theorem for a normal operator
on a finite-dimensional complex inner product space.
8. (a) Suppose S is a normal operator on a finite-dimensional complex inner product space, all of
whose eigenvalues are real. Prove that S is self-adjoint.
(b) Let T be a normal operator on a finite-dimensional real inner product space V whose char-
acteristic polynomial splits. Prove that T is self-adjoint and that there exists an orthonor-
mal basis of V of eigenvectors of T.
(Hint: Mimic the proof of Lemma 2.39 part 3 and use part (a))
9. Prove Schur’s lemma by induction, similarly to the proof of the spectral theorem.
(Hint: T∗ has an eigenvector x; why? Now show that W = Span{x}⊥ is T-invariant. . . )
2.5 Unitary and Orthogonal Operators and their Matrices
In this section we focus on length-preserving transformations of an inner product space.
Definition 2.43. A linear9 isometry of an inner product space V is a linear map T satisfying
∀x ∈ V, ||T(x)|| = ||x||
\left|\left|T\begin{pmatrix} x \\ y \end{pmatrix}\right|\right|^2 = \left|\left|\frac{1}{5}\begin{pmatrix} 4x − 3y \\ 3x + 4y \end{pmatrix}\right|\right|^2 = \frac{1}{25}\left((4x − 3y)^2 + (3x + 4y)^2\right) = x^2 + y^2 = \left|\left|\begin{pmatrix} x \\ y \end{pmatrix}\right|\right|^2
This matrix is very special in that its inverse equals its transpose:
A^{-1} = \frac{1}{\frac{16}{25} + \frac{9}{25}} \cdot \frac{1}{5}\begin{pmatrix} 4 & 3 \\ -3 & 4 \end{pmatrix} = \frac{1}{5}\begin{pmatrix} 4 & 3 \\ -3 & 4 \end{pmatrix} = A^T
We call such matrices orthogonal. The simple version of what follows is that every linear isometry on
Rn is multiplication by an orthogonal matrix.
Definition 2.45. A unitary operator T on an inner product space V is an invertible linear map satis-
fying T∗ T = I = TT∗ . A unitary matrix is a (real or complex) matrix satisfying A∗ A = I.
If V is real, we usually call these orthogonal operators/matrices; this isn’t necessary, since unitary en-
compasses both real and complex spaces. An orthogonal matrix satisfies A T A = I.
Example 2.46. The matrix A = \frac{1}{3}\begin{pmatrix} i & 2 + 2i \\ 2 − 2i & i \end{pmatrix} is unitary:
In infinite dimensions, we need T∗ to be both the left- and right-inverse of T. This isn’t an empty
requirement (see Exercise 13).
9 There also exist non-linear isometries: for instance translations (T( x ) = x + a for any constant a) and complex conjugation
(T(x) = x) on Cn . Together with linear isometries, these essentially comprise all isometries in finite dimensions.
We now tackle the correspondence between unitary operators and isometries.
The finite-dimensional restriction is important in part 2: we use the existence of adjoints, the spectral
theorem, and that a left-inverse is also a right-inverse. See Exercise 13 for an example of a non-unitary
isometry in infinite dimensions.
The proof shows a little more:
Corollary 2.48. On a finite-dimensional space, being unitary is equivalent to each of the following:
While (a) is simply (†), claims (b) and (c) are also worth proving explicitly: see Exercise 9. If β is the
standard orthonormal basis of Fn and T = L A , then the columns of A form the orthonormal set T( β).
This makes identifying unitary/orthogonal matrices easy:
Corollary 2.49. A matrix A ∈ Mn (R) is orthogonal if and only if its columns form an orthonormal
basis of Rn with respect to the standard (dot) inner product.
A matrix A ∈ Mn (C) is unitary if and only if its columns form an orthonormal basis of Cn with
respect to the standard (Hermitian) inner product.
¹⁰In particular, in a real inner product space isometries also preserve the angle θ between vectors since cos θ = \frac{⟨x, y⟩}{||x|| ||y||}.
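In matrix terms, Corollary 2.49 simply says A*A = I; a numpy sketch using the unitary matrix from Example 2.46:

```python
import numpy as np

A = np.array([[1j,      2 + 2j],
              [2 - 2j,  1j    ]]) / 3

# Columns are orthonormal for the Hermitian product iff A* A = I
assert np.allclose(A.conj().T @ A, np.eye(2))
```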
Examples 2.50. 1. The matrix A_θ = \begin{pmatrix} \cos θ & -\sin θ \\ \sin θ & \cos θ \end{pmatrix} ∈ M_2(R) is orthogonal for any θ. Example 2.44 is
this with θ = \tan^{-1}\frac{3}{4} = \sin^{-1}\frac{3}{5} = \cos^{-1}\frac{4}{5}. More generally (Exercise 6), it can be seen that every
real orthogonal 2 × 2 matrix has the form A_θ or

B_θ = \begin{pmatrix} \cos θ & \sin θ \\ \sin θ & -\cos θ \end{pmatrix}

for some angle θ. The effect of L_{A_θ} is to rotate counter-clockwise by θ, while that of L_{B_θ} is to
reflect across the line making angle \frac{1}{2}θ with the positive x-axis.
2. A = \frac{1}{\sqrt{6}}\begin{pmatrix} \sqrt{2} & \sqrt{3} & 1 \\ \sqrt{2} & 0 & -2 \\ -\sqrt{2} & \sqrt{3} & -1 \end{pmatrix} ∈ M_3(R) is orthogonal: check the columns!
3. A = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix} ∈ M_2(C) is unitary: indeed it maps the standard basis to the orthonormal basis

T(β) = \left\{\frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ i \end{pmatrix}, \frac{1}{\sqrt{2}}\begin{pmatrix} i \\ 1 \end{pmatrix}\right\}
It is also easy to check that the characteristic polynomial is

p(t) = \det\begin{pmatrix} \frac{1}{\sqrt{2}} − t & \frac{i}{\sqrt{2}} \\ \frac{i}{\sqrt{2}} & \frac{1}{\sqrt{2}} − t \end{pmatrix} = \left(t − \frac{1}{\sqrt{2}}\right)^2 + \frac{1}{2} ⟹ t = \frac{1}{\sqrt{2}}(1 ± i) = e^{±πi/4}
\left\langle e^{ix}f(x), g(x) \right\rangle = \frac{1}{2π}\int_{-π}^{π} e^{ix}f(x)\overline{g(x)}\,dx = \frac{1}{2π}\int_{-π}^{π} f(x)\overline{e^{-ix}g(x)}\,dx = \left\langle f(x), e^{-ix}g(x) \right\rangle
¹¹An infinite orthonormal set β = {f_k : k ∈ Z} can be found so that every function f ‘equals’ an infinite series in the
sense that ||f − ∑ a_kf_k|| = 0. Since these are not finite sums, β isn’t strictly a basis, though it isn’t uncommon for it to be
so described. Moreover, given that the norm is defined by an integral, this also isn’t a claim that f and ∑ ak f k are equal
as functions. Indeed the infinite series need not be continuous! For these reasons, when working with Fourier series, one
tends to consider a broader class than the continuous functions.
Unitary and Orthogonal Equivalence
Suppose A ∈ Mn (R) is symmetric (self-adjoint) A T = A. By the spectral theorem, A has an orthonor-
mal eigenbasis β = {w1 , . . . , wn }: Aw j = λ j w j . Arranging the eigenbasis as the columns of a matrix,
we see that the columns of U = (w1 · · · wn ) are orthonormal and so U is an orthogonal matrix. We
can therefore write
A = UDU^{-1} = U\begin{pmatrix} λ_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & λ_n \end{pmatrix}U^T
A similar approach works if A ∈ Mn (C) is normal: we now have A = UDU ∗ where U is unitary.
polynomial is p(t) = (2 − t)(2i − t), with orthonormal eigenvectors

w_2 = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -i \end{pmatrix}, \qquad w_{2i} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ i \end{pmatrix}

We conclude that

A = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ -i & i \end{pmatrix}\begin{pmatrix} 2 & 0 \\ 0 & 2i \end{pmatrix}\left[\frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ -i & i \end{pmatrix}\right]^{-1} = \begin{pmatrix} 1 & 1 \\ -i & i \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}\begin{pmatrix} 1 & i \\ 1 & -i \end{pmatrix}
Definition 2.52. Square matrices A, B are unitarily equivalent if there exists a unitary matrix U such
that B = U ∗ AU. Orthogonal equivalence is similar: B = U T AU.
Theorem 2.53. A ∈ Mn (C) is normal if and only if it is unitarily equivalent to a diagonal matrix
(the matrix of its eigenvalues).
A ∈ Mn (R) is self-adjoint (symmetric) if and only if it is orthogonally equivalent to a diagonal matrix.
A T = (U T DU )T = U T D T U = U T DU = A
Exercises 2.5 1. For each matrix A find an orthogonal or unitary U and a diagonal D = U^*AU.

(a) \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix} \quad (b) \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \quad (c) \begin{pmatrix} 2 & 3 − 3i \\ 3 + 3i & 5 \end{pmatrix} \quad (d) \begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{pmatrix}
2. Which of the following pairs are unitarily/orthogonally equivalent? Explain your answers.
(a) A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} and B = \begin{pmatrix} 0 & 2 \\ 2 & 0 \end{pmatrix} \quad (b) A = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} and B = \begin{pmatrix} 2 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}

(c) A = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} and B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & i & 0 \\ 0 & 0 & -i \end{pmatrix}
3. Let a, b ∈ C be such that |a|² + |b|² = 1. Prove that every 2 × 2 matrix of the form \begin{pmatrix} a & -e^{iθ}\bar{b} \\ b & e^{iθ}\bar{a} \end{pmatrix} is
unitary. Are these all the unitary 2 × 2 matrices? Prove or disprove.
unitary. Are these all the unitary 2 × 2 matrices? Prove or disprove.
4. If A, B are orthogonal/unitary, prove that AB and A−1 are also orthogonal/unitary.
(This proves that orthogonal/unitary matrices are groups under matrix multiplication)
5. Check that A = \frac{1}{3}\begin{pmatrix} 4i & -5 \\ 5 & 4i \end{pmatrix} ∈ M_2(C) satisfies A^TA = I (it is a complex orthogonal matrix).
(These don’t have the same nice relationship with inner products, and are thus less useful to us)
6. Supply the details of Example 2.50.1.
(Hints: β = {i, j} is orthonormal, whence { Ai, Aj} must be orthonormal. Now draw pictures to
compute the result of rotating and reflecting the vectors i and j.)
7. Show that the linear map in Example 2.50.4 has no eigenvectors.
8. Prove that A ∈ Mn (C) has an orthonormal basis of eigenvectors whose eigenvalues have mod-
ulus 1, if and only if A is unitary.
9. Prove parts (b) and (c) of Corollary 2.48 for a finite-dimensional inner product space:
(a) If β is an orthonormal basis such that T( β) is orthonormal, then T is unitary.
(b) If T is unitary, and η is an orthonormal basis, then T(η ) is an orthonormal basis.
10. Let T be a linear operator on a finite-dimensional inner product space V. If ||T(x)|| = ||x|| for
all x in some orthonormal basis of V, must T be unitary? Prove or disprove.
11. Let T be a unitary operator on an inner product space V and let W be a finite-dimensional
T-invariant subspace of V. Prove:
(a) T(W ) = W (Hint: show that TW is injective);
(b) W⊥ is T-invariant.
12. Let W be a subspace of an inner product space V such that V = W ⊕ W^⊥. Define T ∈ L(V ) by
T(u + w) = u − w where u ∈ W and w ∈ W ⊥ . Prove that T is unitary and self-adjoint.
13. In the inner product space ℓ2 of square-summable sequences, consider the linear operator
T( x1 , x2 , . . .) = (0, x1 , x2 , . . .). Prove that T is an isometry and compute its adjoint. Check that T
is non-invertible and non-unitary.
14. Prove Schur’s Lemma for matrices. Every A ∈ Mn (R) is orthogonally equivalent and every
A ∈ Mn (C) is unitarily equivalent to an upper triangular matrix.
2.6 Orthogonal Projections
Recall the discussion of the Gram-Schmidt process, where we saw that any finite-dimensional sub-
space W of an inner product space V has an orthonormal basis βW = {w1 , . . . , wn }. In such a situa-
tion, we can define the orthogonal projections onto W and W ⊥ via
π_W : V → W : x ↦ \sum_{j=1}^{n} ⟨x, w_j⟩w_j, \qquad π_W^⊥ : V → W^⊥ : x ↦ x − π_W(x)
Our previous goal was to use orthonormal bases to ease computation. In this section we develop
projections more generally. First recall the notion of a direct sum within a vector space V:
Example 2.55. A = (1/5)[6 −2; 3 −1] is a projection matrix with R(A) = Span{(2, 1)ᵀ} and N(A) = Span{(1, 3)ᵀ}.
Indeed, it is straightforward to describe all projection matrices in M2(R). There are three cases: the zero matrix, the identity matrix, and the rank-one projections
A = (1/(ad − bc)) (a; b)(d  −c) = (1/(ad − bc)) [ad −ac; bd −bc]
(the projection onto Span{(a, b)ᵀ} along Span{(c, d)ᵀ}, where ad − bc ≠ 0).
It should be clear that every projection T has (at most) two eigenspaces:
• R(T) is an eigenspace with eigenvalue 1
• N (T) is an eigenspace with eigenvalue 0
If V is finite-dimensional and ρ, η are bases of R( T ), N ( T ) respectively, then the matrix of T with
respect to ρ ∪ η has block form
[T]_{ρ∪η} = [I 0; 0 0]
where rank I = rank T. In particular, every finite-dimensional projection is diagonalizable.
44
Lemma 2.56. T ∈ L(V ) is a projection if and only if T2 = T.
(⇐) Suppose T² = T. Note first that if r ∈ R(T), then r = T(v) for some v ∈ V, whence T(r) = T²(v) = T(v) = r (†).
Thus T is the identity on R(T). Moreover, if x ∈ R(T) ∩ N(T), (†) says that x = T(x) = 0, whence R(T) ∩ N(T) = {0}
and so R(T) ⊕ N(T) is a well-defined subspace of V.12 To finish things off, let v ∈ V and observe that T(v − T(v)) = T(v) − T²(v) = 0,
so that v = T(v) + (v − T(v)) is a decomposition into R(T)- and N(T)-parts. We conclude that V = R(T) ⊕ N(T): T is the projection onto R(T) along N(T).
Thus far the discussion hasn’t had anything to do with inner products. . .
Definition 2.57. An orthogonal projection is a projection T ∈ L(V) on an inner product space for which we additionally have N(T) = R(T)^⊥ (equivalently R(T) = N(T)^⊥).
In particular, the projection πW defined above is orthogonal: R(πW) = W and N(πW) = W^⊥.
The complementary orthogonal projection πW^⊥ = I − πW has R(πW^⊥) = W^⊥ and N(πW^⊥) = W.
Example (2.55 continued). The identity and zero matrices are both 2 × 2 orthogonal projection
matrices, while those of type 3 are orthogonal if (a, b)ᵀ · (c, d)ᵀ = 0: we obtain
A = (1/(a² + b²)) (a; b)(a  b) = (1/(a² + b²)) [a² ab; ab b²]
More generally, if W ≤ Fn has orthonormal basis {w1 , . . . , wk }, then the matrix of πW is ∑kj=1 w j w∗j .
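For readers who like to check such formulas numerically, here is a minimal numpy sketch; the spanning vectors below are an arbitrary illustration, not taken from the text.

    import numpy as np

    # Minimal check of pi_W = sum_j w_j w_j^* for an (arbitrarily chosen) W <= R^3.
    V = np.array([[1., 1.],
                  [2., 0.],
                  [1., -1.]])                  # columns span W
    W_on, _ = np.linalg.qr(V)                   # orthonormal basis {w_1, w_2} of W
    P = sum(np.outer(w, w) for w in W_on.T)     # pi_W = w_1 w_1^T + w_2 w_2^T

    print(np.allclose(P @ P, P))                # True: P is a projection
    print(np.allclose(P, P.T))                  # True: P is self-adjoint (cf. Theorem 2.58)
    print(np.allclose(P @ V, V))                # True: P fixes W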
12 In
finite dimensions, the rank–nullity theorem and dimension counting finishes the proof here without having to
proceed further:
45
Theorem 2.58. A projection T ∈ L(V ) is orthogonal if and only if it is self-adjoint T = T∗ .
Proof. (⇒) By assumption, R(T) and N(T) are orthogonal subspaces. Letting x, y ∈ V and using subscripts to denote R(T)- and N(T)-parts, we see that
⟨T(x), y⟩ = ⟨xR, yR + yN⟩ = ⟨xR, yR⟩ = ⟨xR + xN, yR⟩ = ⟨x, T(y)⟩
whence T = T*.
(⇐) Suppose T = T*. If r = T(v) ∈ R(T) and n ∈ N(T), then ⟨r, n⟩ = ⟨T(v), n⟩ = ⟨v, T(n)⟩ = 0, so N(T) ≤ R(T)^⊥. Since T is a projection already, we have V = R(T) ⊕ N(T) = R(T) ⊕ R(T)^⊥, from which
N(T) = R(T)^⊥ and R(T) = N(T)^⊥
Theorem 2.59 (Spectral Theorem, mk. II). Let V be a finite-dimensional complex/real inner prod-
uct space and T ∈ L(V ) be normal/self-adjoint with spectrum {λ1 , . . . , λk } and corresponding
eigenspaces E1 , . . . , Ek . Let π j ∈ L(V ) be the orthogonal projection onto Ej . Then:
1. V = E1 ⊕ · · · ⊕ Ek, where the eigenspaces are mutually orthogonal: Ej^⊥ = ⊕_{i≠j} Ei.
2. πi πj = 0 if i ≠ j.
3. (Resolution of the identity) I = π1 + · · · + πk
4. (Spectral decomposition) T = λ1 π1 + · · · + λk πk
Proof. 1. T is diagonalizable and so V is the direct sum of the eigenspaces of T. Since T is normal,
the eigenvectors corresponding to distinct eigenvalues are orthogonal, whence the eigenspaces
are mutually orthogonal. In particular, this says that
Êj := ⊕_{i≠j} Ei ≤ Ej^⊥
46
Examples 2.60. We verify the resolution of the identity and the spectral decomposition; for clarity,
we index projections and eigenspaces by eigenvalue rather than the natural numbers.
1. The symmetric matrix A = [10 2; 2 7] has spectrum {6, 11} and orthonormal eigenvectors
w6 = (1/√5)(1, −2)ᵀ,   w11 = (1/√5)(2, 1)ᵀ
The corresponding projections therefore have matrices
π6 = w6 w6ᵀ = (1/5)(1; −2)(1  −2) = (1/5)[1 −2; −2 4],   π11 = w11 w11ᵀ = (1/5)[4 2; 2 1]
from which the resolution of the identity and the spectral decomposition are readily verified:
π6 + π11 = [1 0; 0 1]   and   6π6 + 11π11 = (1/5)[6+44  −12+22; −12+22  24+11] = A
2. The normal matrix B = (1+i)[1 1; −1 1] has spectrum {2, 2i} and orthonormal eigenvectors
w2 = (1/√2)(i, 1)ᵀ,   w2i = (1/√2)(1, i)ᵀ
The orthogonal projection matrices are therefore
π2 = w2 w2* = (1/2)(i; 1)(−i  1) = (1/2)[1 i; −i 1],   π2i = w2i w2i* = (1/2)[1 −i; i 1]
from which
π2 + π2i = [1 0; 0 1]   and   2π2 + 2iπ2i = [1 i; −i 1] + [i 1; −1 i] = B
3. The matrix C = [0 1 1; 1 0 1; 1 1 0] has spectrum {−1, 2}, an orthonormal eigenbasis
{u, v, w} = { (1/√3)(1, 1, 1)ᵀ, (1/√2)(1, −1, 0)ᵀ, (1/√6)(1, 1, −2)ᵀ }
and eigenspaces E2 = Span{u} and E−1 = Span{v, w}. The orthogonal projections have matrices
π2 = uuᵀ = (1/3)(1; 1; 1)(1  1  1) = (1/3)[1 1 1; 1 1 1; 1 1 1]
π−1 = vvᵀ + wwᵀ = (1/2)[1 −1 0; −1 1 0; 0 0 0] + (1/6)[1 1 −2; 1 1 −2; −2 −2 4] = (1/3)[2 −1 −1; −1 2 −1; −1 −1 2]
It is now easy to check the resolution of the identity and the spectral decomposition:
π 2 + π −1 = I and 2π2 − π−1 = C
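A quick numerical sanity check of these identities, as a minimal numpy sketch using the matrix C above (the rounding is only to group numerically equal eigenvalues):

    import numpy as np

    C = np.array([[0., 1., 1.],
                  [1., 0., 1.],
                  [1., 1., 0.]])
    evals, Q = np.linalg.eigh(C)                 # columns of Q: orthonormal eigenvectors
    projections = {}
    for lam, q in zip(np.round(evals, 8), Q.T):  # group eigenvectors by eigenvalue
        projections[lam] = projections.get(lam, 0) + np.outer(q, q)

    print(np.allclose(sum(projections.values()), np.eye(3)))            # resolution of identity
    print(np.allclose(sum(l * P for l, P in projections.items()), C))   # spectral decomposition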
47
Orthogonal Projections and Minimization Problems
We finish this section with an important observation that drives much of the application of inner
product spaces to other parts of mathematics and beyond. Throughout this discussion, X and Y
denote inner product spaces.
Theorem 2.61. Suppose Y = W ⊕ W^⊥. For any y ∈ Y, the orthogonal projection πW(y) is the unique element of W which minimizes the distance to y:
∀w ∈ W,   ||y − πW(y)|| ≤ ||y − w||,   with equality if and only if w = πW(y)
Proof. Apply Pythagoras' Theorem: since πW^⊥(y) = y − πW(y) ∈ W^⊥ is orthogonal to πW(y) − w ∈ W,
||y − w||² = ||πW^⊥(y)||² + ||πW(y) − w||² ≥ ||y − πW(y)||²
with equality if and only if w = πW(y).
Examples 2.62. 1. We find the closest quadratic polynomial p ∈ P2(R) to e^x with respect to the L² inner product ⟨f, g⟩ = ∫_{−1}^{1} f(x)g(x) dx. Applying Gram–Schmidt to {1, x, x²} gives the orthonormal basis
{ 1/√2,  √(3/2) x,  √(5/8)(3x² − 1) }
from which
p(x) = (1/2)⟨1, e^x⟩ + (3/2)⟨x, e^x⟩ x + (5/8)⟨3x² − 1, e^x⟩ (3x² − 1)
     = (1/2) ∫_{−1}^{1} e^x dx + (3/2) x ∫_{−1}^{1} x e^x dx + (5/8)(3x² − 1) ∫_{−1}^{1} (3x² − 1) e^x dx
     = (1/2)(e − e⁻¹) + 3e⁻¹ x + (5/4)(e − 7e⁻¹)(3x² − 1)
     ≈ 1.18 + 1.10x + 0.179(3x² − 1) ≈ 1 + 1.1x + 0.537x²
The linear and quadratic approximations to y = e^x are drawn. Compare this with the Maclaurin polynomial e^x ≈ 1 + x + ½x² from calculus.
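If desired, the projection can also be checked numerically; the following numpy sketch (quadrature grid chosen arbitrarily) recovers the coefficients of the quadratic approximation:

    import numpy as np

    # Project e^x onto P_2(R) in the L^2 inner product on [-1, 1] and read off
    # the monomial coefficients; they should be roughly 1, 1.1 and 0.537.
    x = np.linspace(-1, 1, 200001)
    w = np.gradient(x)                                   # quadrature weights ~ dx
    ip = lambda f, g: np.sum(f * g * w)                  # <f, g> ~ integral over [-1, 1]

    b0 = np.full_like(x, 1/np.sqrt(2))                   # orthonormal Legendre basis
    b1 = np.sqrt(3/2) * x
    b2 = np.sqrt(5/8) * (3*x**2 - 1)
    f = np.exp(x)
    p = sum(ip(f, b) * b for b in (b0, b1, b2))          # orthogonal projection of e^x
    print(np.polyfit(x, p, 2)[::-1])                     # ~ [1.00, 1.10, 0.537]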
48
2. The nth Fourier approximation of a function f(x) is its orthogonal projection onto the finite-dimensional subspace Wn = Span{1, cos x, sin x, . . . , cos nx, sin nx}. According to the Theorem, this is the unique function Fn(x) ∈ Wn minimizing the integral
||f(x) − Fn(x)||² = ∫_{−π}^{π} |f(x) − Fn(x)|² dx
For example, if f(x) = { 1 if 0 < x ≤ π;  −1 if −π < x ≤ 0 } is extended periodically, then
F_{2n−1}(x) = (4/π) ∑_{j=1}^{n} sin((2j−1)x)/(2j−1) = (4/π)( sin x + (sin 3x)/3 + (sin 5x)/5 + · · · + (sin(2n−1)x)/(2n−1) )
y = f ( x ) and its eleventh Fourier approximation y = F11 ( x )
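Numerically one can watch the L² error shrink as more terms are included; a minimal numpy sketch (grid size chosen arbitrarily):

    import numpy as np

    x = np.linspace(-np.pi, np.pi, 100001)
    f = np.where(x > 0, 1.0, -1.0)                        # the step function above

    def F(n):                                             # (4/pi) * odd-harmonic sine sum
        ks = np.arange(1, n + 1, 2)
        return (4/np.pi) * sum(np.sin(k*x)/k for k in ks)

    for n in (1, 11, 101):
        err2 = np.sum((f - F(n))**2) * (x[1] - x[0])      # ~ ||f - F_n||^2
        print(n, err2)                                    # decreases as n grows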
Exercises 2.6 1. Compute the matrices of the orthogonal projections onto W viewed as subspaces
of the standard inner product spaces Rn or Cn .
(a) W = Span{(4, −1)ᵀ}    (b) W = Span{(1, 2, 1)ᵀ, (1, 0, −1)ᵀ}    (c) W = Span{(i, 1, 0)ᵀ, (1, i, 1)ᵀ}
(d) W = Span{(1, 1, 0)ᵀ, (1, 2, 1)ᵀ} (watch out, these vectors aren't orthogonal!)
49
2. For each of the following matrices, compute the projections onto each eigenspace, verify the
resolution of the identity and the spectral decomposition.
(a) [1 2; 2 1]    (b) [0 −1; 1 0]    (c) [2 3−3i; 3+3i 5]    (d) [2 1 1; 1 2 1; 1 1 2]
6. Let T be a normal operator on a finite-dimensional complex inner product space V. Use the
spectral decomposition T = λ1 π1 + · · · + λk πk to prove:
(a) If Tn is the zero map for some n ∈ N, then T is the zero map.
(b) U ∈ L(V ) commutes with T if and only if U commutes with each π j .
(c) There exists a normal U ∈ L(V ) such that U2 = T.
(d) T is invertible if and only if λ j ̸= 0 for all j.
(e) T is a projection if and only if every λ j = 0 or 1.
(f) T = −T∗ if and only if every λ j is imaginary.
(a) Verify that the general complex (eikx ) and real (cos kx, sin kx) expressions for the Fourier
approximation are correct.
(Hint: use Euler’s formula eikx = cos kx + i sin kx)
(b) Verify the explicit expression for F2n−1 ( x ) when f ( x ) is the given step-function. What is
F2n ( x ) in this case?
50
2.7 The Singular Value Decomposition and the Pseudoinverse
Given T ∈ L(V, W ) between finite-dimensional inner product spaces, the overarching concern of this
chapter is the existence and computation of bases β, γ of V, W with two properties:
• That β, γ be orthonormal, thus facilitating easy calculation within V, W;
γ
• That the matrix [T] β be as simple as possible.
We have already addressed two special cases:
Spectral Theorem When V = W and T is normal/self-adjoint, ∃ β = γ such that [T] β is diagonal.
Schur’s Lemma When V = W and p(t) splits, ∃ β = γ such that [T] β is upper triangular.
In this section we allow V ̸= W and β ̸= γ, and obtain a result that applies to any linear map between
finite-dimensional inner product spaces.
Example 2.63. Let A = [3 1; 2 −2; 1 3] and consider orthonormal bases β = {v1, v2} of R² and γ = {w1, w2, w3} of R³ respectively:
β = { (1/√2)(1, 1)ᵀ, (1/√2)(−1, 1)ᵀ },    γ = { (1/√2)(1, 0, 1)ᵀ, (1/√6)(−1, −2, 1)ᵀ, (1/√3)(1, −1, −1)ᵀ }
Since Av1 = 4w1 and Av2 = 2√3 w2, the matrix [LA]^γ_β = [4 0; 0 2√3; 0 0] is almost diagonal.
Theorem 2.64 (Singular Value Decomposition). Suppose V, W are finite-dimensional inner product spaces and that T ∈ L(V, W) has rank r. Then:
1. There exist orthonormal bases β = {v1, . . . , vn} of V and γ = {w1, . . . , wm} of W, and positive scalars σ1 ≥ · · · ≥ σr > 0, such that T(vj) = σj wj when j ≤ r and T(vj) = 0 otherwise.
2. Any such β is an eigenbasis of T*T, whence the scalars σj are uniquely determined by T: indeed
T*T(vj) = { σj² vj if j ≤ r; 0 otherwise }   and   T*(wj) = { σj vj if j ≤ r; 0 otherwise }
3. (Matrix version) Every rank-r matrix A ∈ M_{m×n}(F) factorizes as A = PΣQ*, where P ∈ Mm(F), Q ∈ Mn(F) are unitary and
Σ = [LA]^γ_β = [diag(σ1, . . . , σr)  O; O  O],   P = (w1 · · · wm),   Q = (v1 · · · vn)
51
Definition 2.65. The numbers σ1 , . . . , σr are the singular values of T. If T is not maximum rank, we
have additional zero singular values σr+1 = · · · = σmin(m,n) = 0.
Freedom of Choice While the singular values are uniquely determined, there is often significant free-
dom regarding the bases β and γ, particularly if any eigenspace of T∗ T has dimension ≥ 2.
Special Case (Spectral Theorem) If V = W and T is normal/self-adjoint, we may choose β to be an
eigenbasis of T, then σj is the modulus of the corresponding eigenvalue (see Exercise 7).
Rank-one decomposition If we write gj : V → F for the linear map gj : v ↦ ⟨v, vj⟩ (recall Riesz's Theorem), then the singular value decomposition says
T = ∑_{j=1}^{r} σj wj gj    that is    T(v) = ∑_{j=1}^{r} σj ⟨v, vj⟩ wj
Example (2.63 cont). We apply the method in the Theorem to A = [3 1; 2 −2; 1 3].
The symmetric(!) matrix AᵀA = [14 2; 2 14] has eigenvalues σ1² = 16, σ2² = 12 and orthonormal eigenvectors v1 = (1/√2)(1, 1)ᵀ, v2 = (1/√2)(−1, 1)ᵀ. The singular values are therefore σ1 = 4, σ2 = 2√3. Now compute
w1 = (1/σ1)Av1 = (1/(4√2)) A(1, 1)ᵀ = (1/√2)(1, 0, 1)ᵀ,    w2 = (1/σ2)Av2 = (1/(2√6)) A(−1, 1)ᵀ = (1/√6)(−1, −2, 1)ᵀ
and observe that these are orthonormal. Finally choose w3 = (1/√3)(1, −1, −1)ᵀ to complete the orthonormal basis γ of R³. A singular value decomposition is therefore
A = [3 1; 2 −2; 1 3] = PΣQ* = [1/√2 −1/√6 1/√3; 0 −2/√6 −1/√3; 1/√2 1/√6 −1/√3] · [4 0; 0 2√3; 0 0] · [1/√2 1/√2; −1/√2 1/√2]
13 Since β is orthonormal, it is common to write v∗j for the map g j = ⟨ , v j ⟩ in general contexts. To those familiar with
the dual space V ∗ = L(V, F), the set { g1 , . . . , gn } = {v1∗ , . . . , v∗n } is the dual basis to β. In this course v∗j will only ever mean
the conjugate-transpose of a column vector in Fn . This discussion is part of why physicists write inner products differently!
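The same decomposition can be confirmed with a computer; here is a minimal numpy sketch (numpy's singular vectors may differ from ours by sign):

    import numpy as np

    A = np.array([[3., 1.],
                  [2., -2.],
                  [1., 3.]])
    P, s, Qstar = np.linalg.svd(A)                 # A = P Sigma Q*
    print(s)                                       # [4.0, 3.4641...] = [4, 2*sqrt(3)]
    Sigma = np.zeros((3, 2)); Sigma[:2, :2] = np.diag(s)
    print(np.allclose(P @ Sigma @ Qstar, A))       # True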
52
Proof. 1. Since T∗ T is self-adjoint, the spectral theorem says it has an orthonormal basis of eigen-
vectors β = {v1 , . . . , vn }. If T∗ T(v j ) = λ j v j , then
⟨T(vj), T(vk)⟩ = ⟨T*T(vj), vk⟩ = λj ⟨vj, vk⟩ = { λj if j = k; 0 if j ≠ k }    (∗)
whence every eigenvalue is a non-negative real number: λj = ||T(vj)||² ≥ 0.
Since rank T∗ T = rank T = r (Exercise 8), exactly r eigenvalues are non-zero; by reordering
basis vectors if necessary, we may assume
λ1 ≥ · · · ≥ λr > 0
If j ≤ r, define σj := √λj > 0 and wj := (1/σj) T(vj); then the set {w1, . . . , wr} is orthonormal by (∗). Extend it (Gram–Schmidt) to an orthonormal basis γ = {w1, . . . , wm} of W; by construction T(vj) = σj wj for j ≤ r and T(vj) = 0 otherwise.
3. This is merely part 1 in the context of T = L A ∈ L(Fn , Fm ). The orthonormal bases β, γ consist
of column vectors and so the (change of co-ordinate) matrices P, Q are unitary.
Examples 2.66. 1. The matrix A = [2 3; 0 2] has AᵀA = [4 6; 6 13] with eigenvalues σ1² = 16 and σ2² = 1 and orthonormal eigenbasis
β = { (1/√5)(1, 2)ᵀ, (1/√5)(−2, 1)ᵀ }
The singular values are therefore σ1 = 4 and σ2 = 1, from which we obtain
γ = { (1/σ1)Av1, (1/σ2)Av2 } = { (1/√5)(2, 1)ᵀ, (1/√5)(−1, 2)ᵀ }
and the decomposition
A = PΣQ* = (1/√5)[2 −1; 1 2] · [4 0; 0 1] · (1/√5)[1 2; −2 1]
The rank-one decomposition is
A = σ1 w1 v1ᵀ + σ2 w2 v2ᵀ = (4/5)(2; 1)(1  2) + (1/5)(−1; 2)(−2  1) = (4/5)[2 4; 1 2] + (1/5)[2 −1; −4 2]
53
2. The decomposition can be very messy to find in non-matrix situations. Here is a classic example
where we simply observe the structure directly.
The L² inner product ⟨f, g⟩ = ∫₀¹ f(x)g(x) dx on P2(R) and P1(R) admits orthonormal bases
β = { √5(6x² − 6x + 1), √3(2x − 1), 1 },    γ = { √3(2x − 1), 1 }
Let T = d/dx be the derivative operator. The matrix of T is already in the required form!
[T]^γ_β = [2√15 0 0; 0 2√3 0]
thus β, γ are suitable bases and the singular values of T are σ1 = 2√15 and σ2 = 2√3.
Since β, γ are orthonormal, we could have used the adjoint method to evaluate this directly:
[T*T]_β = ([T]^γ_β)ᵀ [T]^γ_β = [60 0 0; 0 12 0; 0 0 0]
Up to sign, {[v1]β, [v2]β, [v3]β} is therefore forced to be the standard ordered basis of R³, confirming that β was the correct basis of P2(R) all along!
The Pseudoinverse
The singular value decomposition of a map T ∈ L(V, W ) gives rise to a natural map from W back to
V. This map behaves somewhat like an inverse even when the operator is non-invertible!
Definition 2.67. Given the singular value decomposition of a rank r map T ∈ L(V, W ), the pseu-
doinverse of T is the linear map T† ∈ L(W, V ) defined by
T†(wj) = { (1/σj) vj if j ≤ r; 0 otherwise }
so that
T†T(vj) = { vj if j ≤ r; 0 otherwise }    and    TT†(wj) = { wj if j ≤ r; 0 otherwise }
Schematically: V = Span{v1, . . . , vr} ⊕ Span{vr+1, . . . , vn}, where Span{v1, . . . , vr} = N(T)^⊥ = R(T†) and Span{vr+1, . . . , vn} = N(T) = R(T†)^⊥; similarly W = Span{w1, . . . , wr} ⊕ Span{wr+1, . . . , wm}, where Span{w1, . . . , wr} = R(T) = N(T†)^⊥ and Span{wr+1, . . . , wm} = R(T)^⊥ = N(T†). T and T† restrict to mutually inverse bijections between N(T)^⊥ and R(T).
Otherwise said, the compositions are orthogonal projections: T†T = π_{N(T)^⊥} and TT† = π_{R(T)}
54
Given the singular value decomposition of a matrix A = PΣQ*, its pseudoinverse is the matrix of (LA)†, namely
A† = ∑_{j=1}^{r} (1/σj) vj wj* = QΣ†P*    where    Σ† = [diag(σ1⁻¹, . . . , σr⁻¹)  O; O  O]
Examples 2.68. 1. Again continuing Example 2.63, A = [3 1; 2 −2; 1 3] has pseudoinverse
A† = (1/σ1) v1 w1* + (1/σ2) v2 w2*
   = (1/(4√2)) (1; 1) · (1/√2)(1  0  1) + (1/(2√3·√2)) (−1; 1) · (1/√6)(−1  −2  1)
   = (1/8)[1 0 1; 1 0 1] + (1/12)[1 2 −1; −1 −2 1] = (1/24)[5 4 1; 1 −4 5]
which is exactly what we would have found by computing A† = QΣ†P*. Observe that
A†A = [1 0; 0 1]    and    AA† = (1/3)[2 1 1; 1 2 −1; 1 −1 2]
are the orthogonal projection matrices onto the spaces N ( A)⊥ = Span{v1 , v2 } = R2 and
R( A) = Span{w1 , w2 } ≤ R3 respectively. Both spaces have dimension 2, since rank A = 2.
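Again this is easy to confirm numerically; a minimal numpy sketch:

    import numpy as np

    A = np.array([[3., 1.],
                  [2., -2.],
                  [1., 3.]])
    Adag = np.linalg.pinv(A)
    print(np.round(24 * Adag))                 # [[5, 4, 1], [1, -4, 5]]
    print(np.allclose(Adag @ A, np.eye(2)))    # True: A†A = projection onto N(A)^perp = R^2
    print(np.round(3 * (A @ Adag)))            # [[2, 1, 1], [1, 2, -1], [1, -1, 2]]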
2. The pseudoinverse of T = d/dx : P2(R) → P1(R), as seen in Example 2.66.2, maps
T†(√3(2x − 1)) = (1/(2√15)) √5(6x² − 6x + 1) = (1/(2√3))(6x² − 6x + 1)
T†(1) = (1/(2√3)) √3(2x − 1) = x − 1/2
⟹ T†(a + bx) = T†( (a + b/2) + (b/(2√3)) √3(2x − 1) )
             = (a + b/2)(x − 1/2) + (b/12)(6x² − 6x + 1)
             = (b/2)x² + ax − a/2 − b/6
The pseudoinverse of 'differentiation' therefore returns a particular choice of anti-derivative, namely the unique anti-derivative of a + bx lying in Span{√5(6x² − 6x + 1), √3(2x − 1)}.
Exercises 2.7 1. Find the ingredients β, γ and the singular values for each of the following:
(a) T ∈ L(R², R³) where T(x, y)ᵀ = (x, x + y, x − y)ᵀ
(b) T : P2(R) → P1(R) and T(f) = f″ where ⟨f, g⟩ := ∫₀¹ f(x)g(x) dx
(c) V = W = Span{1, sin x, cos x} and ⟨f, g⟩ = ∫₀^{2π} f(x)g(x) dx, with T(f) = f′ + 2f
55
2. Find a singular value decomposition of each of the matrices:
1 1 1 1 1
1 0 1
(a) 1 1 (b) (c) 1 −1 0
1 0 −1
−1 −1 1 0 −1
rem. What is γ here? What is it about the eigenvalues of A that make this possible?
(Even when T is self-adjoint, the vectors in β need not also be eigenvectors of T!)
8. In the proof of the singular value theorem we claimed that rank T∗ T = rank T. Verify this by
checking explicitly that N (T∗ T) = N (T).
(This is circular logic if you use the decomposition, so you must do without!)
9. Let V, W be finite-dimensional inner product spaces and T ∈ L(V, W ). Prove:
(a) T∗ TT† = T† TT∗ = T∗ .
(Hint: evaluate on the basis γ = {w1 , . . . , wm } in the singular value theorem)
(b) If T is injective, then T∗ T is invertible and T† = (T∗ T)−1 T∗ .
(c) If T is surjective, then TT∗ is invertible and T† = T∗ (TT∗ )−1 .
10. Consider the equation T(x) = b, where T is a linear map between finite-dimensional inner
product spaces. A least-squares solution is a vector x which minimizes ||T(x) − b||.
(a) Prove that x0 = T† (b) is a least-squares solution and that any other has the form x0 + n
for some n ∈ N (T).
(Hint: Theorem 2.61 says that x0 is a least-squares solution if and only if T(x0 ) = πR(T) (b))
(b) Prove that x0 = T† (b) has smaller norm than any other least-squares solution.
(c) If T is injective, prove that x0 = T† (b) is the unique least-squares solution.
11. Find the minimal norm solution to the first system, and the least-squares solution to the second:
3x + 2y + z = 9                3x + y = 1
x − 2y + 3z = 3                2x − 2y = 0
                               x + 3y = 0
56
Linear Regression (non-examinable)
Given a data set {(t j , y j ) : 1 ≤ j ≤ m}, we may employ the least-squares method to find a best-fitting
line y = c0 + c1 t; often used to predict y given a value of t.
The trick is to minimize the sum of the squares of the vertical deviations of the line from the data set.
∑_{j=1}^{m} (yj − c0 − c1 tj)² = ||y − Ax||²    where    A = [t1 1; ⋮ ⋮; tm 1],   x = (c1; c0),   y = (y1; ⋮; ym)
With the indicated notation, we recognize this as a least-squares problem. Indeed if there are at least
two distinct t-values in the data set, then rank A = 2 is maximal and we have a unique best-fitting
line with coefficients given by
(c1; c0) = A†y = (AᵀA)⁻¹Aᵀy
Example 2.69. Given the data set {(0, 1), (1, 1), (2, 0), (3, 2), (4, 2)}, we compute
A = [0 1; 1 1; 2 1; 3 1; 4 1],   y = (1, 1, 0, 2, 2)ᵀ
⟹ x0 = (c1; c0) = (AᵀA)⁻¹Aᵀy = [30 10; 10 5]⁻¹ (15; 6) = (3/10; 3/5)
The regression line therefore has equation y = (3/10)(t + 2).
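The same coefficients drop out of any least-squares routine; a minimal numpy sketch:

    import numpy as np

    t = np.array([0., 1., 2., 3., 4.])
    y = np.array([1., 1., 0., 2., 2.])
    A = np.column_stack([t, np.ones_like(t)])       # columns (t_j, 1)
    x0, *_ = np.linalg.lstsq(A, y, rcond=None)      # minimizes ||y - Ax||
    print(x0)                                       # [0.3, 0.6] = (c1, c0) = (3/10, 3/5)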
The process can be applied more generally to approximate using other functions. To find the best-
fitting quadratic polynomial y = c0 + c1 t + c2 t2 , we’d instead work with
A = [t1² t1 1; ⋮ ⋮ ⋮; tm² tm 1],   x = (c2; c1; c0),   y = (y1; ⋮; ym)   ⟹   ∑_{j=1}^{m} (yj − c0 − c1 tj − c2 tj²)² = ||y − Ax||²
Provided we have at least three distinct values t1, t2, t3, the matrix A is guaranteed to have rank 3 and there will be a best-fitting least-squares quadratic: for the data set of Example 2.69 this is
y = (1/70)(15t² − 39t + 72)
This curve and the best-fitting straight line are shown below.
57
Optional Problems Use a computer to invert any 3 × 3 matrices!
1. Check the calculation for the best-fitting least-squares quadratic in Example 2.69.
2. Find the best-fitting least-squares linear and quadratic approximations to the data set
{(1, 2), (3, 4), (5, 7), (7, 9), (9, 12)}
(a) Show that the equations AᵀA x0 = Aᵀy can be written in matrix form
[∑ tj²  ∑ tj; ∑ tj  m] (c; d) = (∑ tj yj; ∑ yj)
(b) Hence show that
c = Cov(t, y)/σt²   and   d = ȳ − c t̄
where
• t̄ = (1/m) ∑_{j=1}^{m} tj and ȳ = (1/m) ∑_{j=1}^{m} yj are the means (averages),
• σt² = (1/m) ∑_{j=1}^{m} (tj − t̄)² is the variance,
• Cov(t, y) = (1/m) ∑_{j=1}^{m} (tj − t̄)(yj − ȳ) is the covariance.
58
2.8 Bilinear and Quadratic Forms
In this section we slightly generalize the idea of an inner product. Throughout, V is a vector space
over a field F; it need not be an inner product space and F can be any field (not just R or C).
Examples 2.71. 1. If V is a real inner product space, then the inner product ⟨ , ⟩ is a symmetric
bilinear form. Note that a complex inner product is not bilinear!
2. On R², B(x, y) = xᵀ[1 2; 2 0]y = x1y1 + 2x1y2 + 2x2y1 defines a symmetric bilinear form, though not an inner product since it isn't positive definite; for example B(j, j) = 0.
Definition 2.72. Let B be a bilinear form on a finite-dimensional space with basis ϵ = {v1 , . . . , vn }.
The matrix of B with respect to ϵ is the matrix [ B]ϵ = A ∈ Mn (F) with ijth entry
Aij = B(vi , v j )
Given x, y ∈ V, compute their co-ordinate vectors [x]ϵ , [y]ϵ with respect to ϵ, then
B(x, y) = [x]ϵT A[y]ϵ
The set of bilinear forms on V is therefore in bijective correspondence with Mn (F). Moreover,
B(y, x) = [y]ϵᵀ A [x]ϵ = ([y]ϵᵀ A [x]ϵ)ᵀ = [x]ϵᵀ Aᵀ [y]ϵ
Finally, if β is another basis of V, then an appeal to the change of co-ordinate matrix Q^ϵ_β yields
B(x, y) = [x]ϵᵀ A [y]ϵ = (Q^ϵ_β [x]β)ᵀ A (Q^ϵ_β [y]β) = [x]βᵀ (Q^ϵ_β)ᵀ A Q^ϵ_β [y]β   ⟹   [B]β = (Q^ϵ_β)ᵀ [B]ϵ Q^ϵ_β
To summarize:
1. If A is the matrix of B with respect to some basis, then every other matrix of B has the form
Q T AQ for some invertible Q.
2. B is symmetric if and only if its matrix with respect to any (and all) bases is symmetric.
59
Examples 2.74. 1. Example 2.71.2 can be written
B(x, y) = xᵀ[1 2; 2 0]y = x1y1 + 2x1y2 + 2x2y1 = (x1 + 2x2)(y1 + 2y2) − 4x2y2
        = (x1 + 2x2  x2) [1 0; 0 −4] (y1 + 2y2; y2) = xᵀ [1 2; 0 1]ᵀ [1 0; 0 −4] [1 2; 0 1] y
If ϵ is the standard basis, then [B]β = [1 0; 0 −4] where Q^β_ϵ = [1 2; 0 1]. It follows that Q^ϵ_β = [1 −2; 0 1],
from which β = { (1, 0)ᵀ, (−2, 1)ᵀ } is a diagonalizing basis.
2. In general, we may perform a sequence of simultaneous row and column operations to diago-
(λ)
nalize any symmetric B; we require only elementary matrices Eij of type III.14 For instance:
1 −2 3
Q^ϵ_β = E12^(2) E13^(−3) E32^(1) = [1 2 0; 0 1 0; 0 0 1][1 0 −3; 0 1 0; 0 0 1][1 0 0; 0 1 0; 0 1 1] = [1 −1 −3; 0 1 0; 0 1 1]
from which β = { (1, 0, 0)ᵀ, (−1, 1, 1)ᵀ, (−3, 0, 1)ᵀ }. If you're having trouble believing this, invert the
change of co-ordinate matrix and check that
Warning! If F = R then every symmetric B may be diagonalized by an orthonormal basis (for the
usual dot product on Rn ). It is very unlikely that our algorithm will produce such! The algorithm has
two main advantages over the spectral theorem: it is typically faster and it applies to vector spaces
over any field. As a disadvantage, it is highly non-unique.
14 Recall that E_{ij}^{(λ)} is the identity matrix with an additional λ in the ijth entry.
• As a column operation (right-multiplication), A ↦ A E_{ij}^{(λ)} adds λ times the ith column to the jth.
• As a row operation (left-multiplication), A ↦ E_{ji}^{(λ)} A = (E_{ij}^{(λ)})ᵀ A adds λ times the ith row to the jth.
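For concreteness, here is a minimal numpy sketch of the algorithm, under the simplifying assumption that every diagonal pivot encountered is non-zero (the general algorithm sometimes needs a preliminary step, as with E32 in Example 2.74.2); it reproduces the first diagonalization of the next example.

    import numpy as np

    def congruence_diagonalize(A):
        """Diagonalize a symmetric matrix by simultaneous row/column operations,
        assuming each pivot A[i, i] is non-zero when reached."""
        A = np.array(A, dtype=float)
        n = A.shape[0]
        Q = np.eye(n)                        # product of the elementary matrices
        for i in range(n):
            for j in range(i + 1, n):
                lam = -A[i, j] / A[i, i]     # clear the (i, j) and (j, i) entries
                E = np.eye(n); E[i, j] = lam # elementary matrix E_{ij}^{(lam)}
                A = E.T @ A @ E              # simultaneous row and column operation
                Q = Q @ E
        return A, Q                          # Q^T (original A) Q = A (diagonal)

    D, Q = congruence_diagonalize([[1., 6.], [6., 3.]])
    print(D)     # diag(1, -33)
    print(Q)     # [[1, -6], [0, 1]]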
60
Example 2.75. We diagonalize B(x, y) = xᵀ[1 6; 6 3]y = x1y1 + 6x1y2 + 6x2y1 + 3x2y2 in three ways.
• [1 0; −6 1][1 6; 6 3][1 −6; 0 1] = [1 0; 0 −33] = [B]β where β = { (1, 0)ᵀ, (−6, 1)ᵀ }. This corresponds to
2. If B is symmetric and F does not have characteristic two (see aside), then B is diagonalizable.
T : V → F : v 7→ B(x, v)
Aside: Characteristic two fields This means 1 + 1 = 0 in F; this holds, for instance, in the field
Z2 = {0, 1} of remainders modulo 2. We now see the importance of char F ̸= 2 to the above result.
• The proof uses the existence of x ∈ V such that B(x, x) ≠ 0. If B is non-zero, ∃u, v such that B(u, v) ≠ 0. If both B(u, u) = 0 = B(v, v), then x = u + v does the job whenever char F ≠ 2:
B(x, x) = B(u, u) + 2B(u, v) + B(v, v) = 2B(u, v) ≠ 0
• To see that the requirement isn't idle, consider B(x, y) = xᵀ[0 1; 1 0]y on the finite vector space V = Z2². Perhaps surprisingly, the matrix of B is identical with respect to any basis of Z2², whence B is symmetric but non-diagonalizable.
61
In Example 2.75 notice how the three diagonal matrix representations have something in common: in each, exactly one diagonal entry is positive and one is negative. This is a general phenomenon:
Theorem 2.77 (Sylvester’s Law of Inertia). Suppose B is a symmetric bilinear form on a real vector
space V with diagonal matrix representation diag(λ1 , . . . , λn ). Then the number of entries λ j which
are positive/negative/zero is independent of the diagonal representation.
Definition 2.78. The signature of a symmetric bilinear form B is the triple (n+ , n− , n0 ) representing
how many positive, negative and zero terms are in any diagonal representation. Sylvester’s Law says
that the signature is an invariant of a symmetric bilinear form.
Positive-definiteness says that a real inner product on an n-dimensional space has signature (n, 0, 0).
Practitioners of relativity often work in Minkowski spacetime: R4 equipped with a signature (1, 3, 0)
bilinear form, typically
B(x, y) = xᵀ [c² 0 0 0; 0 −1 0 0; 0 0 −1 0; 0 0 0 −1] y = c²x1y1 − x2y2 − x3y3 − x4y4
where c is the speed of light. Vectors are time-, space-, or light-like depending on whether B(x, x) is positive, negative or zero. For instance x = 3c⁻¹e1 + 2e2 + 2e3 + e4 is light-like.
62
Quadratic Forms & Diagonalizing Conics
Given a symmetric bilinear form B on V, the associated quadratic form is
K : V → F : x ↦ B(x, x)
A function K : V → F is termed a quadratic form when such a symmetric bilinear form exists.
Examples 2.80. 1. If B is a real inner product, then K (v) = ⟨v, v⟩ = ||v||2 is the square of the norm.
2. Let dim V = n and A be the matrix of B with respect to a basis β. By the symmetry of A,
n
(
Aij if i = j
K (x) = x Ax = ∑ xi Aij x j = ∑ ãij xi x j where ãij =
T
i,j=1 1≤ i ≤ j ≤ n 2Aij if i ̸= j
E.g., K(x) = 3x1² + 4x2² − 2x1x2 corresponds to the bilinear form B(x, y) = xᵀ[3 −1; −1 4]y
As a fun application, we consider the diagonalization of conics in R2 . The general non-zero conic has
equation
K(x) = ax² + 2bxy + cy²   ↭   B(v, w) = vᵀ[a b; b c]w
λ1 t21 + λ2 t22 + µ1 t1 + µ2 t2 = η, λ1 , λ2 , µ1 , µ2 , η ∈ R
If λj ≠ 0, we may complete the square via the linear transformation sj = tj + μj/(2λj). The canonical
forms are then recovered:
Since B is symmetric, we may take {v1 , v2 } to be an orthonormal basis of R2 , whence any conic may be
put in canonical form by applying only a rotation/reflection and translation (completing the square).
Alternatively, we could diagonalize K using our earlier algorithm; this additionally permits shear
transforms. By Sylvester’s Law, the diagonal entries will have the same number of (+, −, 0) terms
regardless of the method, so the canonical form will be unchanged.
63
Examples 2.81. 1. We describe and plot the conic with equation 7x² + 24xy = 144.
The matrix of the associated bilinear form is [7 12; 12 0], which has orthonormal eigenbasis
β = {v1, v2} = { (1/5)(4, 3)ᵀ, (1/5)(−3, 4)ᵀ }
with eigenvalues (λ1, λ2) = (16, −9). In the rotated basis, this is the canonical hyperbola
16t1² − 9t2² = 144   ⟺   t1²/3² − t2²/4² = 1
which is easily plotted. In case this is too fast, use the change of co-ordinate matrix to compute directly:
Q^ϵ_β = (1/5)[4 −3; 3 4]   ⟹   (t1; t2) = [x]β = Q^β_ϵ (x; y) = (1/5)(4x + 3y; −3x + 4y)
K(x) = (√37 + 2)s1² − (√37 − 2)s2² = −33
however the calculation to find η is time-consuming and the expressions for s1, s2 are extremely ugly.
A similar approach can be applied to higher degree quadratic equations/manifolds: e.g. ellipsoids,
paraboloids and hyperboloids in R3 .
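As a quick check of the eigenvalue computation in Example 2.81.1, a minimal numpy sketch:

    import numpy as np

    B = np.array([[7., 12.],
                  [12., 0.]])
    evals, Q = np.linalg.eigh(B)     # orthonormal eigenbasis in the columns of Q
    print(evals)                     # [-9., 16.]
    # In rotated coordinates (t1, t2) = Q^T (x, y) the conic 7x^2 + 24xy = 144
    # becomes 16 t1^2 - 9 t2^2 = 144 (up to the ordering of the eigenvalues).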
Exercises 2.8 1. Prove that the sum of any two bilinear forms is bilinear, and that any scalar multi-
ple of a bilinear form is bilinear: thus the set of bilinear forms on V is a vector space.
(You can’t use matrices here, since V could be infinite-dimensional!)
2. Compute the matrix of the bilinear form
B(x, y) = x1y1 − 2x1y2 + x2y1 − x3y3
on R³ with respect to the basis β = { (1, 0, 1)ᵀ, (1, 0, −1)ᵀ, (0, 1, 0)ᵀ }.
64
3. Check that the function B( f , g) = f ′ (0) g′′ (0) is a bilinear form on the vector space of twice-
differentiable functions. Find the matrix of B with respect to β = {cos t, sin t, cos 2t, sin 2t}
when restricted to the subspace Span β.
4. For each matrix A, find a diagonal matrix D and an invertible matrix Q such that QᵀAQ = D.
(a) [1 3; 3 2]    (b) [3 1 2; 1 4 0; 2 0 −1]
6. If F does not have characteristic 2, and K (x) = B(x, x) is a quadratic form, prove that we can
recover the bilinear form B via
B(x, y) = (1/2)(K(x + y) − K(x) − K(y))
(a) x2 + y2 + xy = 6
(b) 35x2 + 120xy = 4x + 3y
11. Suppose that a non-empty, non-degenerate15 conic C in R2 has the form ax2 + 2bxy + cy2 + dx +
ey + f = 0, where at least one of a, b, c ̸= 0, and define ∆ = b2 − ac. Prove that:
15 The conic contains at least two points and cannot be factorized as a product of two straight lines: for example, the
65
3 Canonical Forms
3.1 Jordan Forms & Generalized Eigenvectors
Throughout this course we’ve concerned ourselves with variations of a general question: for a given
map T ∈ L(V ), find a basis β such that the matrix [T] β is as close to diagonal as possible. In this
chapter we see what is possible when T is non-diagonalizable.
Example 3.1. The matrix A = [−8 4; −25 12] ∈ M2(R) has characteristic equation
p(t) = (−8 − t)(12 − t) + 100 = t² − 4t + 4 = (t − 2)² = 0
so λ = 2 is the only eigenvalue, with eigenspace
E2 = N([−10 4; −25 10]) = Span{(2, 5)ᵀ}
Take v1 = (2, 5)ᵀ: with respect to any basis β = {v1, v2}, the matrix of LA is then upper triangular, which is better than nothing! How simple can we make this matrix? Let v2 = (x, y)ᵀ, then
Av2 = (−8x + 4y, −25x + 12y)ᵀ = 2(x, y)ᵀ + (−10x + 4y, −25x + 10y)ᵀ = 2v2 + (−5x + 2y)v1
⟹ [LA]β = [2  −5x + 2y; 0  2]
Since v2 cannot be parallel to v1, the only thing we cannot have is a diagonal matrix. The next best thing is for the upper right corner to be 1; for instance we could choose
β = {v1, v2} = { (2, 5)ᵀ, (1, 3)ᵀ }   ⟹   [LA]β = [2 1; 0 2]
A Jordan block with eigenvalue λ is a square matrix of the form
J = ( λ  1          )
    (    λ  ⋱       )
    (       ⋱   1   )
    (           λ   )
where all non-indicated entries are zero. Any 1 × 1 matrix is also a Jordan block.
A Jordan canonical form is a block-diagonal matrix diag( J1 , . . . , Jm ) where each Jk is a Jordan block.
A Jordan canonical basis for T ∈ L(V ) is a basis β of V such that [T] β is a Jordan canonical form.
If a map is diagonalizable, then any eigenbasis is Jordan canonical and the corresponding Jordan
canonical form is diagonal. What about more generally? Does every non-diagonalizable map have a
Jordan canonical basis? If so, how can we find such?
66
Example 3.3. It can easily be checked that β = {v1, v2, v3} = { (1, 0, 1)ᵀ, (1, 2, 0)ᵀ, (1, 1, 1)ᵀ } is a Jordan canonical basis for
A = [−1 2 3; −4 5 4; −2 1 4]
Indeed Av1 = 2v1, Av2 = 3v2 and Av3 = (4, 5, 3)ᵀ = v2 + 3v3, whence
[LA]β = [2 0 0; 0 3 1; 0 0 3]
Generalized Eigenvectors
Example 3.3 was easy to check, but how would we go about finding a suitable β if we were merely
given A? We brute-forced this in Example 3.1, but such is not a reasonable approach in general.
Eigenvectors get us some of the way:
The practical question is how to fill out a Jordan canonical basis once we have a maximal independent
set of eigenvectors. We now define the necessary objects.
A non-zero vector v ∈ V is a generalized eigenvector of T with eigenvalue λ if (T − λI)^k(v) = 0 for some k ∈ N; the generalized eigenspace Kλ consists of all such vectors together with 0.
As with eigenspaces, the generalized eigenspaces of A ∈ Mn (F) are those of the map L A ∈ L(Fn ).
It is easy to check that our earlier Jordan canonical bases consist of generalized eigenvectors.
Example 3.1: We have one eigenvalue λ = 2. Since (A − 2I)² = [0 0; 0 0] is the zero matrix, every vector lies in K2 = N(A − 2I)² = R².
Example 3.3: Here K3 = Span{v2, v3} and K2 = E2 = Span{v1}.
67
In order to easily compute generalized eigenspaces, it is useful to invoke the main result of this
section. We postpone the proof for a while due to its meatiness.
Theorem 3.5. Suppose that the characteristic polynomial of T ∈ L(V ) splits over F:
p ( t ) = ( λ 1 − t ) m1 · · · ( λ k − t ) m k
where the λ j are the distinct eigenvalues of T with algebraic multiplicities m j . Then:
Compare this with the statement on diagonalizability from the start of the course.
With regard to part 2; we shall eventually be able to choose this to be a Jordan canonical basis. In
conclusion: a map has a Jordan canonical basis if and only if its characteristic polynomial splits.
R3 = K2 ⊕ K3
2. We find the generalized eigenspaces of the matrix A = [5 2 −1; 0 0 0; 9 6 −1].
The characteristic polynomial is
p(t) = det(A − tI) = −t · det[5−t  −1; 9  −1−t] = −t(t² − 4t + 4) = (0 − t)(2 − t)²
• λ = 0 has multiplicity 1; indeed K0 = N(A − 0I)¹ = N(A) = Span{(1, −1, 3)ᵀ} is just the eigenspace E0.
• λ = 2 has multiplicity 2; indeed
A − 2I = [3 2 −1; 0 −2 0; 9 6 −3],   (A − 2I)² = [0 −4 0; 0 4 0; 0 −12 0]
whence K2 = N(A − 2I)² = Span{(1, 0, 0)ᵀ, (0, 0, 1)ᵀ}.
68
Properties of Generalized Eigenspaces and the Proof of Theorem 3.5
A lot of work is required to justify our main result. Feel free to skip the proofs at first reading.
2. Kλ is T-invariant.
(T − µI)(x) = T(x) − µx ∈ Kλ
whence Kλ is (T − µI)-invariant.
Suppose, for a contradiction, that T − µI is not injective on Kλ . Then
Let k ∈ N be minimal such that (T − λI)k (y) = 0 and let z = (T − λI)k−1 (y). Plainly
z ̸= 0, for otherwise k is not minimal. Moreover,
69
Now to prove Theorem 3.5: remember that the characteristic polynomial of T is assumed to split.
(Parts 1(b) and 2) We prove simultaneously by induction on the number of distinct eigenvalues of T.
(Base case) If T has only one eigenvalue, then p(t) = (λ − t)m . Another appeal to Cayley–
Hamilton says (T − λI)m (x) = 0 for all x ∈ V. Thus V = Kλ and dim Kλ = m.
(Induction step) Fix k and suppose the results hold for maps with k distinct eigenvalues. Let T
have distinct eigenvalues λ1 , . . . , λk , µ, with multiplicities m1 , . . . , mk , m respectively. Define16
W = R(T − µI)m
The subspace W has the following properties, the first two of which we leave as exercises:
• W is T-invariant.
• W ∩ Kµ = {0} so that µ is not an eigenvalue of the restriction TW .
• Each Kλ j ≤ W: since (T − µI)Kλ is an isomorphism (Lemma part 3), we can invert,
j
m
1
x ∈ Kλ j =⇒ x = (T − µI)m (T − µI)− Kλ (x) ∈ R(T − µI)m = W
j
Since W ∩ Kµ = {0} it is enough finally to use the rank–nullity theorem and count dimensions:
k
dim V = rank(T − µI)m + null(T − µI)m = dim W + dim Kµ = ∑ dim Kλ j
+ dim Kµ
j =1
(∗)
≤ m1 + · · · + mk + m = deg( p(t)) = dim V
The inequality is thus an equality; each dim Kλ j = m j and dim Kµ = m. We conclude that
V = K λ1 ⊕ · · · ⊕ K λ k ⊕ K µ
which completes the induction step and thus the proof. Whew!
16 This is yet another argument where we consider a suitable subspace to which we can apply an induction hypothesis;
recall the spectral theorem, Schur’s lemma, bilinear form diagonalization, etc. Theorem 3.12 will provide one more!
70
Cycles of Generalized Eigenvectors
By Theorem 3.5, for every linear map whose characteristic polynomial splits there exists generalized
eigenbasis. This isn’t the same as a Jordan canonical basis, but we’re very close!
Example 3.8. The matrix A = [5 1 0; 0 5 1; 0 0 5] ∈ M3(R) is a single Jordan block, whence there is a single generalized eigenspace K5 = R³ and the standard basis ϵ = {e1, e2, e3} is Jordan canonical.
The crucial observation for what follows is that one of these vectors e3 generates the others via re-
peated applications of A − 5I:
e2 = ( A − 5I )e3 , e1 = ( A − 5I )e2 = ( A − 5I )2 e3
A cycle of generalized eigenvectors is a set of the form βx = {(T − λI)^{k−1}(x), . . . , (T − λI)(x), x}, where the generator x ∈ Kλ is non-zero and k is minimal such that (T − λI)^k(x) = 0.
2. Span βx is T-invariant. With respect to βx, the matrix of the restriction of T is the k × k Jordan block [T_{Span βx}]_{βx} with λ on the diagonal and 1 on the superdiagonal.
In what follows, it will be useful to consider the linear map U = T − λI. Note the following:
• The nullspace of U is the eigenspace: N (U) = Eλ ≤ Kλ .
• T commutes with U: that is TU = UT.
• β x = {Uk−1 (x), . . . , U(x), x}; that is, Span β x = ⟨x⟩ is the U-cyclic subspace generated by x.
a0 Uk−1 (x) = 0 =⇒ a0 = 0
Now feed the same combination to Uk−2 , etc., to see that all coefficients a j = 0.
71
The basic approach to finding a Jordan canonical basis is to find the generalized eigenspaces and play
with cycles until you find a basis for each Kλ . Many choices of canonical basis exist for a given map!
We’ll consider a more systematic method in the next section.
Examples 3.11. 1. The characteristic polynomial of A = [1 0 2; 0 1 6; 6 −2 1] ∈ M3(R) splits:
p(t) = (1 − t) det[1−t  6; −2  1−t] + 2 det[0  1−t; 6  −2] = (1 − t)((1 − t)² + 12 − 12) = (1 − t)³
With only one eigenvalue we see that K1 = R³. Simply choose any vector in R³ and see what U = A − I does to it! For instance, with x = e1,
βx = { U²(e1), U(e1), e1 } = { (12, 36, 0)ᵀ, (0, 0, 6)ᵀ, (1, 0, 0)ᵀ }
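The cycle is quickly confirmed by machine; a minimal numpy sketch:

    import numpy as np

    A = np.array([[1., 0., 2.],
                  [0., 1., 6.],
                  [6., -2., 1.]])
    U = A - np.eye(3)
    x = np.array([1., 0., 0.])
    print(U @ x)           # [0, 0, 6]
    print(U @ U @ x)       # [12, 36, 0]
    print(U @ U @ U @ x)   # [0, 0, 0], consistent with (A - I)^3 = O and K_1 = R^3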
d
3. Let T = on P3 (R). With respect to the standard basis ϵ = {1, x, x2 , x3 },
dx
A = [T]ϵ = [0 1 0 0; 0 0 2 0; 0 0 0 3; 0 0 0 0]
72
Our final results state that this process works generally.
Theorem 3.12. Let T ∈ L(V ) have an eigenvalue λ. If dim Kλ < ∞, then there exists a basis
β λ = β x1 ∪ · · · ∪ β xn of Kλ consisting of finitely many linearly independent cycles.
Intuition suggests that we create cycles β x j by starting with a basis of the eigenspace Eλ and extending
backwards: for each x, if x = (T − λI)(y), then x ∈ β y ; now repeat until you have a maximum length
cycle. This is essentially what we do, though a sneaky induction is required to make sure we keep
track of everything and guarantee that the result really is a basis of Kλ .
(i) For the induction hypothesis, suppose every generalized eigenspace with dimension < m (for
any linear map!) has a basis consisting of independent cycles of generalized eigenvectors.
(ii) Define W = R(U) ∩ Eλ : that is
w ∈ W ⟺ U(w) = 0 and w = U(v) for some v ∈ Kλ
Let k = dim W, choose a complementary subspace X such that Eλ = W ⊕ X and select a basis
{xk+1 , . . . , xn } of X. If k = 0, the induction step is finished (why?). Otherwise we continue. . .
(iii) The calculation in the proof of Lemma 3.10 (take j = 1) shows that R(U) is T-invariant; it is
therefore the single generalized eigenspace K̃λ of TR(U) .
(iv) By the rank–nullity theorem,
By the induction hypothesis, R(U) has a basis of independent cycles. Since the last non-zero
element in each cycle is an eigenvector, this basis consists of k distinct cycles β x̂1 ∪ · · · ∪ β x̂k
whose terminal vectors form a basis of W.
(v) Since each x̂ j ∈ R(U), there exist vectors x1 , . . . , xk such that x̂ j = U(x j ). Including the length-
one cycles generated by the basis of X, the cycles β x1 , . . . , β xn now contain
vectors. We leave as an exercise the verification that these vectors are linearly independent.
Corollary 3.13. Suppose that the characteristic polynomial of T ∈ L(V ) splits (necessarily dim V <
∞). Then there exists a Jordan canonical basis, namely the union of bases β λ from Theorem 3.12.
Proof. By Theorem 3.5, V is the direct sum of generalized eigenspaces. By the previous result, each
Kλ has a basis β λ consisting of finitely many cycles. By Lemma 3.10, the matrix of TKλ has Jordan
canonical form with respect to β λ . It follows that β = β λ is a Jordan canonical basis for T.
S
73
Exercises 3.1 1. For each matrix, find the generalized eigenspaces Kλ , find bases consisting of
unions of disjoint cycles of generalized eigenvectors, and thus find a Jordan canonical form J
and invertible Q so that the matrix may be expressed as QJQ−1 .
(a) A = [1 1; −1 3]    (b) B = [1 2; 3 2]    (c) C = [11 −4 −5; 21 −8 −11; 3 −1 0]
(d) D = [2 1 0 0; 0 2 1 0; 0 0 3 0; 0 1 −1 3]
2. If β = {v1, . . . , vn} is a Jordan canonical basis, what can you say about v1? Briefly explain why the linear map LA ∈ L(R²) where A = [0 −1; 1 0] has no Jordan canonical form.
(a) In step (ii), suppose dim W = k = 0. Explain why {x1 , . . . , xn } is in fact a basis of Kλ , so
that the rest of the proof is unnecessary.
(b) In step (v), prove that the m vectors in the cycles β x1 , . . . , β xn are linearly independent.
(Hint: model your argument on part 1 of Lemma 3.10)
74
3.2 Cycle Patterns and the Dot Diagram
In this section we obtain a useful result that helps us compute Jordan forms more efficiently and
systematically. To give us some clues how to proceed, here is a lengthy example.
Example 3.14. Precisely three Jordan canonical forms A, B, C ∈ M3 (R) correspond to the charac-
teristic polynomial p(t) = (5 − t)3 :
A = [5 0 0; 0 5 0; 0 0 5]      B = [5 1 0; 0 5 0; 0 0 5]      C = [5 1 0; 0 5 1; 0 0 5]
In all three cases the standard basis β = {e1 , e2 , e3 } is Jordan canonical, so how do we distinguish
things? By considering the number and lengths of the cycles of generalized eigenvectors.
For A, every non-zero vector is an eigenvector and every cycle has length one:
βe1 = {e1}, βe2 = {e2}, βe3 = {e3}   ⟹   β = βe1 ∪ βe2 ∪ βe3 = {e1, e2, e3}
For B, a vector v = (a, b, c)ᵀ satisfies (B − 5I)v = (b, 0, 0)ᵀ and (B − 5I)²v = 0, so cycles have length at most two.
For C, a vector v = (a, b, c)ᵀ satisfies (C − 5I)v = (b, c, 0)ᵀ, (C − 5I)²v = (c, 0, 0)ᵀ, (C − 5I)³v = 0, and so
generates a cycle with maximum length three provided c ≠ 0. Indeed this cycle is a Jordan basis, so one cycle is all we need:
β = βe3 = { (C − 5I)²e3, (C − 5I)e3, e3 } = {e1, e2, e3}
Why is the example relevant? Suppose that dimR V = 3 and that T ∈ L(V ) has characteristic polyno-
mial p(t) = (5 − t)3 . Theorem 3.12 tells us that T has a Jordan canonical form, and that is is moreover
one of the above matrices A, B, C. Our goal is to develop a method whereby the pattern of cycle-
lengths can be determined, thus allowing us to be able to discern which Jordan form is correct. As a
side-effect, this will also demonstrate that the pattern of cycle lengths for a given T is independent of
the Jordan basis so that, up to some reasonable restriction, the Jordan form of T is unique. To aid us
in this endeavor, we require some terminology. . .
75
Definition 3.15. Let V be finite dimensional and Kλ a generalized eigenspace of T ∈ L(V ). Follow-
ing the Theorem 3.12, assume that β λ = β x1 ∪ · · · ∪ β xn is a Jordan canonical basis of TKλ , where the
cycles are arranged in non-increasing length. That is:
2. k1 ≥ k2 ≥ · · · ≥ k n
The dot diagram of TKλ is a representation of the elements of β λ , one dot for each vector: the jth column
represents the elements of β x j arranged vertically with x j at the bottom.
Given a linear map, our eventual goal is to identify the dot diagram as an intermediate step in the
computation of a Jordan basis. First, however, we observe how the conversion of dot diagrams to a
Jordan form is essentially trivial.
Example 3.16. Suppose dim V = 14 and that T ∈ L(V ) has the following eigenvalues and dot
diagrams:
λ1 = −4      λ2 = 7      λ3 = 12
• • • •      • •         • • •
• •          • •
             •
T has a Jordan canonical basis β with respect to which its Jordan canonical form is
[T]β = diag( [−4 1; 0 −4], [−4 1; 0 −4], (−4), (−4), [7 1 0; 0 7 1; 0 0 7], [7 1; 0 7], (12), (12), (12) )
Note how the sizes of the Jordan blocks are non-increasing within each eigenvalue. For instance, for
λ1 = −4, the sequence of cycle lengths (k j ) is 2 ≥ 2 ≥ 1 ≥ 1.
76
Theorem 3.17. Suppose β λ is a Jordan canonical basis of TKλ as described in Definition 3.15, and
suppose the ith row of the dot diagram has ri entries. Then:
1. For each r ∈ N, the vectors associated to the dots in the first r rows form a basis of N(T − λI)^r.
2. r1 = null(T − λI) = dim V − rank(T − λI)
3. When i > 1, ri = null(T − λI)^i − null(T − λI)^{i−1} = rank(T − λI)^{i−1} − rank(T − λI)^i
Example (3.14 cont). We describe the dot diagrams of the three matrices A, B, C, along with the
corresponding vectors in the Jordan canonical basis β and the values ri .
A: • • •    with (x1, x2, x3) = (e1, e2, e3)
Since A − 5I is the zero matrix, r1 = 3 − rank(A − 5I) = 3. The dot diagram has one row, corresponding to three independent cycles of length one: β = βe1 ∪ βe2 ∪ βe3.
B: • •    with ((B − 5I)x1, x2) = (e1, e3)
   •           x1 = e2
Row 1: B − 5I = [0 1 0; 0 0 0; 0 0 0] ⟹ rank(B − 5I) = 1 and r1 = 3 − 1 = 2. The first row {e1, e3} is a basis of E5 = N(B − 5I).
Row 2: (B − 5I)² is the zero matrix, whence r2 = rank(B − 5I) − rank(B − 5I)² = 1 − 0 = 1.
The dot diagram corresponds to β = βe2 ∪ βe3 = {e1, e2} ∪ {e3}.
C: •    (C − 5I)²x1 = e1
   •    (C − 5I)x1 = e2
   •    x1 = e3
Row 1: C − 5I = [0 1 0; 0 0 1; 0 0 0] ⟹ r1 = 3 − rank(C − 5I) = 1. The first row {e1} is a basis of E5 = N(C − 5I).
Row 2: (C − 5I)² = [0 0 1; 0 0 0; 0 0 0] ⟹ r2 = rank(C − 5I) − rank(C − 5I)² = 2 − 1 = 1. The first two rows {e1, e2} form a basis of N(C − 5I)².
Row 3: (C − 5I)³ is the zero matrix, whence r3 = rank(C − 5I)² − rank(C − 5I)³ = 1 − 0 = 1.
Proof. As previously, let U = T − λI.
1. Since each dot represents a basis vector U p (v j ), any v ∈ Kλ may be written uniquely as a linear
combination of the dots. Applying U simply moves all the dots up a row and all dots in the top
row to 0. It follows that v ∈ N (Ur ) ⇐⇒ it lies in the span of the first r rows. Since the dots
are linearly independent, they form a basis.
2. By part 1, r1 = dim N (U) = null(T − λI) = dim V − rank(T − λI).
3. More generally,
ri = (r1 + · · · + ri ) − (r1 + · · · + ri−1 ) = dim N (Ui ) − dim N (Ui−1 )
= null(Ui ) − null(Ui−1 ) = rank(T − λI)i−1 − rank(T − λI)i
77
Since the ranks of maps (T − λI)i are independent of basis, so also is the dot diagram. . .
Corollary 3.18. For any eigenvalue λ, the dot diagram is uniquely determined by T and λ. If we
list Jordan blocks for each eigenspace in non-increasing order, then the Jordan form of a linear map
is unique up to the order of the eigenvalues.
We now have a slightly more systematic method for finding Jordan canonical bases.
6 2 −4 −6
Example 3.19. The matrix A = 00 30 03 00 has characteristic equation
2 1 −2 −1
6−t −6
p ( t ) = (3 − t )2 = (2 − t)(3 − t)3
2 −1 − t
Since we now have three dots (equalling dim K3 ), the algorithm terminates and the dot diagram
for K3 is • •
•
For the single dot in the second row, we choose something in N(A − 3I)² which isn't an eigenvector; perhaps the simplest choice is x1 = e2, which yields the two-cycle
βx1 = { (A − 3I)x1, x1 } = { (2, 0, 0, 1)ᵀ, (0, 1, 0, 0)ᵀ }
To complete the first row, choose any eigenvector to complete the span: for instance x2 = (0, 2, 1, 0)ᵀ.
0
Other choices are available! For instance, if we'd chosen the two-cycle generated by x1 = e3, we'd obtain a different Jordan basis but the same canonical form J: for example
β̃ = { (3, 0, 0, 2)ᵀ, (−4, 0, 0, −2)ᵀ, (0, 0, 1, 0)ᵀ, (0, 2, 1, 0)ᵀ },    J = [2 0 0 0; 0 3 1 0; 0 0 3 0; 0 0 0 3]
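The dot diagram can also be read off by computing ranks, exactly as Theorem 3.17 prescribes; a minimal numpy sketch for this example:

    import numpy as np

    A = np.array([[6., 2., -4., -6.],
                  [0., 3., 0., 0.],
                  [0., 0., 3., 0.],
                  [2., 1., -2., -1.]])
    U = A - 3*np.eye(4)
    r1 = 4 - np.linalg.matrix_rank(U)                               # dots in row 1
    r2 = np.linalg.matrix_rank(U) - np.linalg.matrix_rank(U @ U)    # dots in row 2
    print(r1, r2)    # 2 1 : rows of size 2 and 1, as found above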
78
We do one final example for a non-matrix map.
Example 3.20. Let ϵ = {1, x, y, x², y², xy} and define T f(x, y) = 2 ∂f/∂x − ∂f/∂y as a linear operator on V = Span ϵ.
There is only one eigenvalue λ = 0 and therefore one generalized eigenspace K0 = V. We could keep
working with matrices, but it is easy to translate the nullspaces of the matrices back to subspaces of
V, from which the necessary data can be read off:
We now have five dots; since dim K0 = 6, the last row has one dot, and the dot diagram is
• • •
• •
•
Since the first two rows span N(T²), we may choose any f1 ∉ N(T²) for the final dot: f1 = xy is suitable, from which the first column of the dot diagram becomes
T²(xy) = −4        •   •
T(xy) = 2y − x     •
xy
Now choose the second dot on the second row to be anything in N(T²) such that the first two rows span N(T²): this time f2 = x² − 4y² is suitable, and the diagram becomes:
T²(xy) = −4        T(x² − 4y²) = 4x + 8y    •
T(xy) = 2y − x     x² − 4y²
xy
The final dot is now chosen so that the first row spans N (T): this time f 3 = x2 + 4y2 + 4xy works.
The result is a Jordan canonical basis and form for T:
β = { −4, 2y − x, xy, 4x + 8y, x² − 4y², x² + 4y² + 4xy },    J = [T]β = diag( [0 1 0; 0 0 1; 0 0 0], [0 1; 0 0], (0) )
As previously, many other choices of cycle-generators f 1 , f 2 , f 3 are available; while these result in
different Jordan canonical bases, Corollary 3.18 assures us that we’ll always obtain the same canonical
form J.
79
Exercises 3.2 1. Let T be a linear operator whose characteristic polynomial splits. Suppose the
eigenvalues and the dot diagrams for the generalized eigenspaces Kλi are as follows:
λ1 = 2 λ2 = 4 λ3 = −3
• • • • • • •
• • •
• •
J = diag( [2 1 0; 0 2 1; 0 0 2], [2 1; 0 2], (3), (3) )
3. For each matrix A find a Jordan canonical form and an invertible Q such that A = QJQ−1 .
(a) A = [−3 3 −2; −7 6 −3; 1 −1 2]    (b) A = [0 1 −1; −4 4 −2; −2 1 1]    (c) A = [0 −3 1 2; −2 1 −1 2; −2 1 −1 2; −2 −3 1 4]
4. For each linear operator T, find a Jordan canonical form J and basis β:
5. (Generalized Eigenvector Method for ODEs) Let A ∈ Mn(R) have an eigenvalue λ and suppose βv0 = {vk−1, . . . , v1, v0} is a cycle of generalized eigenvectors for this eigenvalue. Show that
x(t) := e^{λt} ∑_{j=0}^{k−1} bj(t) vj   satisfies   x′(t) = Ax   ⟺   b0′(t) = 0, and bj′(t) = b_{j−1}(t) when j ≥ 1
Hence find the general solution to the system
x′ = [3 1 0 0; 0 3 1 0; 0 0 3 0; 0 0 0 2] x
80
3.3 The Rational Canonical Form (non-examinable)
We finish the course with a very quick discussion of what can be done when the characteristic poly-
nomial of a linear map does not split. In such a situation, we may assume that
p(t) = (−1)ⁿ ϕ1(t)^{m1} · · · ϕk(t)^{mk}    (∗)
where each ϕj (t) is an irreducible monic polynomial over the field.
Example 3.21. The following matrix has characteristic equation p(t) = (t2 + 1)2 (3 − t)
A = [0 −1 0 0 0; 1 0 0 0 0; 0 0 0 −1 0; 0 0 1 0 0; 0 0 0 0 3] ∈ M5(R)
This doesn’t split over R since t2 + 1 = 0 has no real roots. It is, however, diagonalizable over C.
Definition 3.22. The monic polynomial tk + ak−1 tk−1 + · · · + a0 has companion matrix
C = [0 0 0 ⋯ 0 −a0; 1 0 0 ⋯ 0 −a1; 0 1 0 ⋯ 0 −a2; ⋮ ⋱ ⋮; 0 0 0 ⋯ 0 −a_{k−2}; 0 0 0 ⋯ 1 −a_{k−1}]    (when k = 1, this is the 1 × 1 matrix (−a0))
If T ∈ L(V ) has characteristic polynomial (∗), then a rational canonical basis is a basis for which
[T]β = diag(C1, C2, . . . , Cr) = [C1 O ⋯ O; O C2 ⋯ O; ⋮ ⋱ ⋮; O O ⋯ Cr]
where each Cj is a companion matrix of some (ϕj (t))s j where s j ≤ m j . We call [T] β a rational canonical
form of T.
Theorem 3.23. A rational canonical basis exists for any linear operator T on a finite-dimensional
vector space V. The canonical form is unique up to ordering of companion matrices.
Example (3.21 cont). The matrix A is already in rational canonical form: the standard basis is rational
canonical with three companion blocks,
C1 = C2 = [0 −1; 1 0],    C3 = (3)
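It is easy to generate companion matrices and check their characteristic polynomials by machine; a minimal numpy sketch (the helper function below is ours, not a standard numpy routine):

    import numpy as np

    def companion(coeffs):
        """Companion matrix of t^k + a_{k-1}t^{k-1} + ... + a_0,
        where coeffs = [a_0, a_1, ..., a_{k-1}]."""
        k = len(coeffs)
        C = np.zeros((k, k))
        C[1:, :-1] = np.eye(k - 1)        # sub-diagonal of 1s
        C[:, -1] = -np.array(coeffs)      # last column: -a_0, ..., -a_{k-1}
        return C

    C = companion([1.0, 0.0])             # companion matrix of t^2 + 1
    print(C)                               # [[0, -1], [1, 0]]
    print(np.round(np.poly(C), 6))         # [1, 0, 1] : coefficients of t^2 + 1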
81
Example 3.24. Let A = [4 −3; 2 2] ∈ M2(R). Its characteristic polynomial
p(t) = t² − 6t + 14 = (t − 3)² + 5
doesn't split over R and so it has no eigenvalues. Instead simply pick a vector, x = (1, 0)ᵀ (say), define y = Ax = (4, 2)ᵀ, let β = {x, y} and observe that
[LA]β = [0 −14; 1 6]
is a rational canonical form. Indeed this works for any x ≠ 0: if β := {x, Ax}, then Cayley–Hamilton forces
A²x = (6A − 14I)x = −14x + 6Ax   ⟹   [LA]β = [0 −14; 1 6]
A systematic approach to finding rational canonical forms is similar to that for Jordan forms: for each
m
irreducible divisor of p(t), the subspace Kϕ = N ϕ(T) plays a role analogous to a generalized
eigenspace; indeed Kλ = Kϕ for the linear irreducible factor ϕ(t) = λ − t!
We finish with two examples; hopefully the approach is intuitive, even without theoretical justifica-
tion.
If the characteristic polynomial is p(t) = ϕ(t)² where ϕ(t) = t² − 2t + 3, then there are two possible rational canonical forms; here is an example of each.
1. If A = [0 −15 0 −9; 2 2 −3 0; 0 −9 0 −6; −3 0 5 2], then ϕ(A) = O is the zero matrix, whence N(ϕ(A)) = R⁴. Since ϕ(t) isn't the full characteristic polynomial, we expect there to be two independent cycles of length two in the canonical basis. Start with something simple as a guess:
x1 = (1, 0, 0, 0)ᵀ   ⟹   x2 = Ax1 = (0, 2, 0, −3)ᵀ   ⟹   Ax2 = (−3, 4, 0, −6)ᵀ = −3x1 + 2x2
Over C, this example is diagonalizable. Indeed each of the 2 × 2 companion matrices is diago-
nalizable over C.
82
2. Let B = [0 0 2 1; 1 1 −1 −1; 0 1 −2 −16; 0 0 1 5]. This time
ϕ(B) = B² − 2B + 3I = [3 2 −7 −29; −1 1 4 13; 1 −3 −6 −17; 0 1 1 2]   ⟹   N(ϕ(B)) = Span{ (3, −1, 1, 0)ᵀ, (11, −2, 0, 1)ᵀ }
Anything not in this span will suffice as a generator for a single cycle of length four: e.g.,
x1 = (1, 0, 0, 0)ᵀ,   x2 = Bx1 = (0, 1, 0, 0)ᵀ,   x3 = Bx2 = (0, 1, 1, 0)ᵀ,   x4 = Bx3 = (2, 0, −1, 1)ᵀ
Bx4 = (−1, 2, −14, 4)ᵀ = −9x1 + 12x2 − 10x3 + 4x4
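A final numerical check: the β-coordinates of Bx4 (and hence the last column of the companion matrix) can be recovered by solving a linear system; a minimal numpy sketch:

    import numpy as np

    B = np.array([[0., 0., 2., 1.],
                  [1., 1., -1., -1.],
                  [0., 1., -2., -16.],
                  [0., 0., 1., 5.]])
    x1 = np.array([1., 0., 0., 0.])
    beta = np.column_stack([x1, B @ x1, B @ B @ x1, B @ B @ B @ x1])   # {x1, Bx1, B^2 x1, B^3 x1}
    coords = np.linalg.solve(beta, B @ beta[:, -1])                    # B x4 in beta-coordinates
    print(coords)   # [-9, 12, -10, 4] : last column of the companion matrix of (t^2 - 2t + 3)^2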
83