
MA212 Further Mathematical Methods

Lecture 1: Some revision of MA100 linear algebra

Dr James Ward

• Vector spaces and subspaces

• Linear independence and span

The London School of Economics and Political Science


Department of Mathematics

Lecture 1, page 1
Vector spaces

A real vector space is a non-empty set, V , whose elements are


called vectors that is closed under two algebraic operations, namely
Closure under vector addition: Given vectors u, v ∈ V, we
can add them to form the vector u + v ∈ V;
Closure under scalar multiplication: Given a vector u ∈ V
and a scalar α ∈ R, we can multiply the vector by the scalar to
form the vector αu ∈ V.
In MA100, the scalars were always real numbers giving us a real
vector space. Later in the course, we will see what happens when
the scalars are complex numbers.

Lecture 1, page 3
Vector spaces

In each vector space V , there is a special element called the zero


vector and we denote this by 0. This has the property that
Zero vector: For all v ∈ V, v + 0 = v.
There is also a special scalar, called the unit scalar and we denote
this by 1. This has the property that
Unit scalar: For all v ∈ V, 1v = v.

Lecture 1, page 4
Vector spaces

Also, every vector in the vector space has an inverse under the
operation of vector addition
Inverse vectors: For all v ∈ V, there is a vector −v ∈ V such
that v + (−v) = 0.
It should come as no surprise that the vector −v is the vector v
multiplied by the scalar −1, i.e. −v = (−1)v.

Lecture 1, page 5
Vector spaces

Lastly, the operations of vector addition and scalar multiplication


must satisfy some obvious properties.
For all vectors u, v, w ∈ V and all scalars α, β ∈ R,

• u + v = v + u,
• u + (v + w) = (u + v) + w,
• α(u + v) = αu + αv,
• (α + β)u = αu + βu,
• α(βu) = (αβ)u.

And these should all make sense given what you know about how
vectors and scalars behave.

Lecture 1, page 6
Example

The set

    R³ = { (a1, a2, a3)t : a1, a2, a3 ∈ R },

where vector addition and scalar multiplication are defined by

    (a1, a2, a3)t + (b1, b2, b3)t = (a1 + b1, a2 + b2, a3 + b3)t
and
    α(a1, a2, a3)t = (αa1, αa2, αa3)t,

is a real vector space.

Observe that the zero vector in this space is 0 = (0, 0, 0)t.

Lecture 1, page 7
Example

The set

    { [a1 a2; a3 a4] : a1, a2, a3, a4 ∈ R }

of 2 × 2 real matrices, where vector addition and scalar multiplication are
defined entrywise by

    [a1 a2; a3 a4] + [b1 b2; b3 b4] = [a1 + b1  a2 + b2; a3 + b3  a4 + b4]
and
    α[a1 a2; a3 a4] = [αa1  αa2; αa3  αa4],

is a real vector space.

Observe that the zero vector in this space is the 2 × 2 zero matrix [0 0; 0 0].

Lecture 1, page 8
Example

The set of all functions f : [0, 1] → R with the usual ‘point-wise’
operations of vector addition and scalar multiplication is a real
vector space. We’ll call this vector space F[0, 1].
Observe that the zero vector in this space is the function
0 : [0, 1] → R given by 0(x) = 0 for all x ∈ [0, 1].
Note: By ‘point-wise’ operations we just mean that, given the
functions f, g : [0, 1] → R and the scalar α ∈ R, we have the
functions f + g and αf whose values are given by

    (f + g)(x) = f(x) + g(x)    and    (αf)(x) = αf(x),

for all x ∈ [0, 1].

Lecture 1, page 9
Subspaces

Suppose that V is a vector space and U is a subset of V . If U is


also a vector space, then we call U a subspace of V .
Essentially, this means that U is a subspace of V if it is a
non-empty subset of V that is closed under the operations of
vector addition and scalar multiplication being used in V .

Theorem: U is a subspace of a real vector space V if and


only if U is a non-empty subset of V which is both

CUVA: For all u, v ∈ U, u + v ∈ U
(closed under vector addition)
and

CUSM: For all u ∈ U and α ∈ R, αu ∈ U
(closed under scalar multiplication).

Lecture 1, page 10
Example

Show that the set


    U = { (x, y, z)t ∈ R³ : 2x − y + z = 0 }

is a subspace of R³.

Solution (sketch):
① U ⊆ R³ and U ≠ ∅ as 0 ∈ U, since 2(0) − 0 + 0 = 0.
② CUVA: take any (x, y, z)t, (u, v, w)t ∈ U, so 2x − y + z = 0 and 2u − v + w = 0. Then

    2(x + u) − (y + v) + (z + w) = (2x − y + z) + (2u − v + w) = 0 + 0 = 0,

so (x, y, z)t + (u, v, w)t ∈ U and U is CUVA.
③ CUSM: take any (x, y, z)t ∈ U and α ∈ R. Then

    2(αx) − (αy) + (αz) = α(2x − y + z) = 0,

so α(x, y, z)t ∈ U and U is CUSM.
Hence U is a subspace of R³.
Lecture 1, page 11
Example

Show that the set of all continuous functions f : [0, 1] → R is a
subspace of F[0, 1].

Solution (sketch): The zero function is continuous, so the set U of continuous
functions is a non-empty subset of F[0, 1].
CUVA: take any f, g ∈ U, so f, g : [0, 1] → R are continuous. The sum of two
continuous functions is continuous, so f + g ∈ U.
CUSM: take any f ∈ U and α ∈ R. A scalar multiple of a continuous function is
continuous, so αf ∈ U.
Hence U is a subspace of F[0, 1].

Lecture 1, page 12
Linear independence

Definition: Suppose that V is a vector space.


A finite set of vectors {v1, v2, . . . , vk} ⊆ V is linearly
independent if the only solution to the equation

    α1v1 + α2v2 + · · · + αkvk = 0

is α1 = α2 = · · · = αk = 0.

Note: We often call this the trivial solution to the equation as it is


always one of the solutions. The key for linear independence is
that it is the only solution!

Lecture 1, page 13
Example

Show that the set of vectors


    { (1, 3, 1)t, (0, 2, 1)t, (2, 0, −1)t }

is linearly dependent.

Solution (sketch): Observe that

    −2(1, 3, 1)t + 3(0, 2, 1)t + (2, 0, −1)t = (0, 0, 0)t,

so the equation α1v1 + α2v2 + α3v3 = 0 has a solution other than the trivial one
and the set is linearly dependent (LD).

Lecture 1, page 14
Example

Show that the set of vectors


    { (1, 3, 1)t, (0, 2, 1)t, (0, 0, 1)t }

is linearly independent.

Solution (sketch): Let α1(1, 3, 1)t + α2(0, 2, 1)t + α3(0, 0, 1)t = 0. Comparing
components gives

    α1 = 0,    3α1 + 2α2 = 0,    α1 + α2 + α3 = 0,

so α1 = α2 = α3 = 0 is the only solution and the set is linearly independent (LI).

Lecture 1, page 15
A test for linear independence

Question: How can we easily decide if a finite set of vectors in Rn


is linearly independent?
Answer: Write the vectors as the rows of a matrix and perform
row operations to get the row-echelon form (REF) of the matrix.
The vectors are linearly independent if and only if the REF does
not have a row of zeroes.
Recall: We have three types of row operation:
(1) exchange two rows,
(2) multiply a row by a non-zero scalar,
(3) add a non-zero scalar multiple of one row to another row.
The REF has a one as the first entry of each non-zero row and
each such ‘leading one’ only has zeroes below it.
Lecture 1, page 16
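As an aside, the same test is easy to run with a computer algebra system. The
following is a minimal sketch (not from the notes, assuming Python with sympy is
available): the vectors go in as the rows of a matrix, and they are linearly
independent exactly when the rank equals the number of vectors, i.e. when the REF
has no row of zeroes.

    # Row-reduction test for linear independence of vectors in R^n (sketch).
    from sympy import Matrix

    def linearly_independent(vectors):
        A = Matrix([list(v) for v in vectors])   # one vector per row
        return A.rank() == len(vectors)          # no zero row in the REF

    print(linearly_independent([(1, 3, 1), (0, 2, 1), (0, 0, 1)]))   # True
    print(linearly_independent([(1, 3, 1), (0, 2, 1), (2, 0, -1)]))  # False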
Example

Is the set of vectors


    { (1, 1, 0)t, (2, 0, 1)t, (1, 3, 1)t }

linearly independent?

Solution (sketch): Write the vectors as the rows of a matrix and row-reduce. The
resulting REF has no zero rows, so the set is linearly independent (LI).

Lecture 1, page 17
Example

Is the set of vectors


    { (1, 3, 1)t, (0, 2, 1)t, (2, 0, −1)t }

linearly independent?

Solution (sketch): Writing the vectors as the rows of a matrix and row-reducing,

    [ 1  3   1 ]            [ 1  3   1  ]
    [ 0  2   1 ]   → ... →  [ 0  1  1/2 ]
    [ 2  0  −1 ]            [ 0  0   0  ]

there is a row of zeros in the REF, so the set is linearly dependent (LD).

Lecture 1, page 18
More tests for linear independence

When we are dealing with a set of n vectors in Rn , the following


theorem from MA100 gives us a number of useful ways to test for
linear independence.

Theorem: For a square matrix A, the following statements


are equivalent.
(1) The rows of A are linearly independent.
(2) The REF of A does not have a row of zeroes.
(3) The determinant of A is non-zero.
(4) The inverse of A exists.

Lecture 1, page 19
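For n vectors in Rⁿ, the determinant version of the test is also a one-liner. A
minimal sketch (not from the notes, assuming Python with numpy is available):

    # Determinant test: the rows are linearly independent iff det is non-zero.
    import numpy as np

    A = np.array([[1., 3., 1.],
                  [0., 2., 1.],
                  [0., 0., 1.]])    # rows are the vectors being tested
    print(np.linalg.det(A))         # 2.0, non-zero, so the rows are LI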
Other thoughts on linear independence

When we are just dealing with two vectors, we also have the scalar
multiple test for linear independence.

Theorem: Suppose that V is a vector space.


Two non-zero vectors u, v ∈ V are linearly independent if
and only if there is no scalar α such that u = αv.

Also remember that any set that contains the zero vector is linearly
dependent.

Theorem: Suppose that V is a vector space and U ⊆ V.

If 0 ∈ U, then U is linearly dependent.

Lecture 1, page 20
Linear combinations

Definition: Suppose that V is a vector space and that S ⊆ V
is the set of vectors

    S = {u1, u2, . . . , un}.

Any vector of the form

    α1u1 + α2u2 + · · · + αnun    for scalars α1, α2, . . . , αn

is a linear combination of the vectors in S.

As V is closed under vector addition and scalar multiplication, we


know that every such linear combination will also be a vector in V .

Lecture 1, page 21
Linear span

Indeed, if we were to take the set of all possible linear combinations


of the vectors in S, we would get the linear span of S.

Definition: Suppose that V is a vector space and that S ⊆ V
is the set of vectors

    S = {u1, u2, . . . , un}.

The linear span of S is the set of vectors

    Lin(S) = {α1u1 + α2u2 + · · · + αnun | α1, α2, . . . , αn are scalars}.

Instead of ‘Lin(S)’, you may also see ‘span(S)’ being used.

Lecture 1, page 22
Linear spans give subspaces

The linear span of a set of vectors is one way of generating


subspaces.

Theorem: If V is a vector space and S ⊆ V, then Lin(S) is
a subspace of V.

Indeed, we can even show that the linear span of S ⊆ V is the
smallest subspace of V that contains all of the vectors in S.

Theorem: Suppose that V is a vector space and S ⊆ V.

If U is any subspace of V with S ⊆ U, then Lin(S) ⊆ U.

That is, any subspace of V that contains all of the vectors in S


will also contain all of the vectors in Lin(S).

Lecture 1, page 23
Example

In R3 , consider the linear span


    Lin{ (1, 1, 1)t } = { α(1, 1, 1)t : α ∈ R }.

It is a subspace of R3 and it is the smallest subspace of R3 that


contains the vector (1, 1, 1)t .
Geometrically, in R3 , we can think of this as a line through the
origin in the direction (1, 1, 1)t .

Lecture 1, page 24
Example

In F[0, 1], consider the functions fn : [0, 1] → R given by

    fn(x) = xⁿ

for n = 0, 1, 2, . . .. The linear span

    Lin{f0, f1, . . . , fn} = {α0f0 + α1f1 + · · · + αnfn | α0, α1, . . . , αn ∈ R}

is a subspace of F[0, 1] and it is the smallest subspace of F[0, 1]
that contains all of the functions f0, f1, . . . , fn.
For all x ∈ [0, 1], these functions have values given by

    (α0f0 + α1f1 + · · · + αnfn)(x) = α0 + α1x + · · · + αnxⁿ,
and so we can think of this linear span as the set of all polynomials
of degree at most n.
We’ll call this vector space Pn [0, 1].

Lecture 1, page 25
Linear independence and linear span

Lastly, we can use linear spans to say some useful things about
linear independence.

Theorem: Suppose that V is a vector space and that S ⊆ V


is a finite set of vectors.
The following statements are equivalent.
(1) S is linearly independent.
(2) Each vector in Lin(S) can be written as a linear
combination of the vectors in S in exactly one way.
(3) For each u in S, u is not in Lin(S \ {u}).

Note: (3) says that each vector u ∈ S cannot be written as a


linear combination of the vectors left in S after you remove u.

Lecture 1, page 26
MA212 Further Mathematical Methods
Lecture 2: More revision and something new

Dr James Ward

• Bases and dimension

• Testing functions for linear independence
• Wronskians

The London School of Economics and Political Science


Department of Mathematics

Lecture 2, page 1
Information

• Exercises 11 is on Moodle.
  - Attempt questions 2, 3, 5, 6 and 9.
  - Revision of MA100 linear algebra you should know.
  - Submit your written solutions on Moodle by Friday
    or follow any instructions laid down by your class teacher.

• Extra Examples Sessions
  - Start on Tuesday 12:00–13:00 on Zoom.
  - Contact me via the Moodle forum if you want me to cover
    anything in particular.

• Classes start in Week 2. See your timetable for details.

Lecture 2, page 2
Bases

Definition: Suppose that V is a vector space.


A finite set of vectors B ⊆ V is a basis of V if
(1) B is linearly independent and
(2) B spans V , i.e. V = Lin(B).

Equivalently, a basis can be characterised by the following theorem.

Theorem: Suppose that V is a vector space.


A finite set of vectors B ⊆ V is a basis of V if and only if
each vector in V can be written as a linear combination of
the vectors in B in exactly one way.

Lecture 2, page 3
Example

Suppose that, for 1 ≤ k ≤ n, the vector ek has 1 as its kth


component and all its other components are zero.
The set of vectors En = {e1 , e2 , . . . , en } is a basis of Rn .
Why? Because each vector (a1, a2, . . . , an)t ∈ Rⁿ can be
written as
a 1 e1 + a 2 e2 + · · · + a n en ,

but it can’t be done in any other way using a linear combination of


the vectors in En .
We will call En the standard basis of Rn .

Lecture 2, page 4
Finite-dimensional vector spaces

Definition: A vector space is finite-dimensional if it has a


basis that contains a finite number of vectors.

In the case of a finite-dimensional vector space, we can prove that

Theorem: If a vector space is finite-dimensional and it has a


basis which contains n vectors, then any set of n + 1 vectors
from this vector space is linearly dependent.

And one consequence of this is that

Corollary: All bases of a finite-dimensional vector space


contain the same number of vectors.

Lecture 2, page 5
Dimension

This corollary means that we can assign a single number to each


finite-dimensional vector space as follows.

Definition: The dimension of a finite-dimensional vector


space V is the number, dim(V ), of vectors in a basis of V .

For example: Rn has dimension n, i.e. dim(Rn ) = n, as the


standard basis of Rn contains n vectors.
However, not all vector spaces are finite-dimensional.

Lecture 2, page 6
Example (Just for fun!)

The vector space F[0, 1] is not finite-dimensional.

Lecture 2, page 7
Functions: Developing a test for linear independence

We’ll look at three functions f1, f2, f3 ∈ F[0, 1] which are two-times
differentiable. That is, the first and second derivatives of these
functions exist at all points in the domain [0, 1].

We want to find a condition that will guarantee that the set


{f1 , f2 , f3 } is linearly independent.

This condition will easily generalise to any number of functions (as


long as they are suitably differentiable) and any domain.

Lecture 2, page 8
Looking...

Suppose {f1, f2, f3} is LD. Then α1f1 + α2f2 + α3f3 = 0 must have a
non-trivial solution (i.e. at least one of α1, α2, α3 is non-zero).

So, for all x ∈ [0, 1],

    α1 f1(x)   + α2 f2(x)   + α3 f3(x)   = 0
    α1 f1′(x)  + α2 f2′(x)  + α3 f3′(x)  = 0
    α1 f1″(x)  + α2 f2″(x)  + α3 f3″(x)  = 0

i.e.

    [ f1(x)   f2(x)   f3(x)  ] [α1]   [0]
    [ f1′(x)  f2′(x)  f3′(x) ] [α2] = [0]
    [ f1″(x)  f2″(x)  f3″(x) ] [α3]   [0]

As this has a non-trivial solution, the determinant of the coefficient matrix
must be 0 for all x ∈ [0, 1].

Lecture 2, page 9
Concluding...

We define the Wronskian W(x) to be the determinant

            | f1(x)   f2(x)   f3(x)  |
    W(x) =  | f1′(x)  f2′(x)  f3′(x) |
            | f1″(x)  f2″(x)  f3″(x) |

If {f1, f2, f3} is LD, then W(x) = 0 for all x ∈ [0, 1].

Contrapositive: if W(x) ≠ 0 for some x ∈ [0, 1], then {f1, f2, f3} is LI.
Lecture 2, page 10
The Wronskian

Definition: Suppose that f1, f2, . . . , fk are (k − 1)-times
differentiable functions over some domain D.
The Wronskian of these functions is the k × k determinant

             |  f1(x)          f2(x)          · · ·   fk(x)          |
             |  f1′(x)         f2′(x)         · · ·   fk′(x)         |
    W(x) =   |    ..             ..            ..       ..           |
             |  f1^(k−1)(x)    f2^(k−1)(x)    · · ·   fk^(k−1)(x)    |

for x ∈ D.

That is, the Wronskian has a column for each function, the first
row is just the functions and each successive row is the derivative
of the previous row until we get enough rows for a square matrix.

Lecture 2, page 11
The Wronskian test

In general, this gives us the following one way test.

Theorem: Suppose that f1, f2, . . . , fk are (k − 1)-times
differentiable functions over some domain D.
If W(x) ≠ 0 for some x ∈ D, then the set {f1, f2, . . . , fk} is
linearly independent.

Note: To apply this test, we just need to find at least one value
of x in the domain that gives us a non-zero Wronskian.

Lecture 2, page 12
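The Wronskian test is straightforward to carry out symbolically. Here is a minimal
sketch (not from the notes, assuming Python with sympy is available) for the three
functions used in the next example: build the k × k matrix whose rows are successive
derivatives and check that its determinant is non-zero somewhere in the domain.

    # Wronskian of f1(x) = 1, f2(x) = x + 1, f3(x) = x^2 + x + 1 (sketch).
    from sympy import S, symbols, Matrix, simplify

    x = symbols('x')
    fs = [S(1), x + 1, x**2 + x + 1]
    k = len(fs)
    W = Matrix(k, k, lambda i, j: fs[j].diff(x, i)).det()
    print(simplify(W))   # 2, non-zero for every x, so {f1, f2, f3} is LI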
Example

Let f1, f2, f3 : R → R be the functions given by

    f1(x) = 1,   f2(x) = x + 1   and   f3(x) = x² + x + 1.

Show that the set {f1, f2, f3} is linearly independent.

Solution (sketch):

            | 1   x + 1   x² + x + 1 |
    W(x) =  | 0     1       2x + 1   |  = 2,
            | 0     0          2     |

which is non-zero for at least one x ∈ R, so {f1, f2, f3} is LI.
Lecture 2, page 13
Example

Let f1, f2, f3 : R → R be the functions given by

    f1(x) = 1,   f2(x) = sin x   and   f3(x) = sin(2x).

Show that the set {f1, f2, f3} is linearly independent.

Solution (sketch):

            | 1    sin x      sin 2x   |
    W(x) =  | 0    cos x     2 cos 2x  |  = −4 cos x sin 2x + 2 sin x cos 2x.
            | 0   −sin x    −4 sin 2x  |

At x = 0 we get W(0) = 0, which tells us nothing. But at x = π/4 we get
W(π/4) = −2√2 ≠ 0, so W(x) ≠ 0 for at least one x ∈ R and {f1, f2, f3} is LI.

Lecture 2, page 14
A warning!

We have seen that if W(x) ≠ 0 for at least one value of x in the
domain, then we can conclude that the set of functions is linearly
independent.

BUT
If W(x) ≠ 0 for no values of x in the domain, i.e.

    W(x) = 0 for all x in the domain,

then we can not conclude that the set of functions is linearly


dependent.
That is, the converse of our theorem is not true and we do not
have a two way test. We’ll see a counterexample on the next slide.

Lecture 2, page 15
Example

Let f1 , f2 : R ! R be the functions given by


                                    { x²  if x ≥ 0,
    f1(x) = x²    and    f2(x) =    {
                                    { 0   if x < 0.

Show that W(x) = 0 for all x ∈ R but the set {f1, f2} is linearly
independent.

Solution (sketch): For any x ≥ 0,

    W(x) = | x²  x² | = 0,    and for any x < 0,    W(x) = | x²  0 | = 0,
           | 2x  2x |                                      | 2x  0 |

so W(x) = 0 for all x ∈ R.
Now suppose α1f1 + α2f2 = 0, i.e. α1f1(x) + α2f2(x) = 0 for all x ∈ R.
At x = −1: α1f1(−1) + α2f2(−1) = α1 = 0. At x = 1: α1f1(1) + α2f2(1) = α1 + α2 = 0,
so α2 = 0. Hence the only solution is α1 = α2 = 0 and {f1, f2} is LI.
Lecture 2, page 16
MA212: Further Mathematical Methods (Linear Algebra) 2020–21
Exercises 11: Assumed background

For these and all other exercises on this course, you must show all your working.
1. Determine if the following matrices have inverses and, if they do, find them.
    (i)  [1 1 1]      (ii)  [1 1 0]      (iii)  [1 1 0 0]      (iv)  [1 2  1]
         [1 2 3]            [0 1 1]             [0 0 1 1]            [1 4 −1]
         [1 3 5]            [1 0 1]             [1 0 0 1]            [1 8  1]
                                                [0 1 1 0]

2. Determine whether the following sets of vectors in R4 are linearly independent.



(a) {(2, 1, 1, 1)t , (1, 2, 1, 1)t , (3, 1, 0, 2)t , (1, 2, 3, 4)t };

(b) {(1, 2, 0, 1)t , (2, 1, 0, 1)t };

(c) {(2, 1, 1, 1)t , (1, 2, 1, 1)t , (3, 1, 0, 2)t , (3, 3, 0, 2)t }.

For any set that is not linearly independent, find a basis for the subspace spanned by the set.

3. For each n ≥ 2, find a basis B for the subspace

    V = {(v1, v2, . . . , vn)t ∈ Rⁿ : v1 + · · · + vn = 0},

of Rn . You should show that your set B is linearly independent and spans V .

4. Calculate the rank of the following matrix and find a basis for its nullspace.
    [1 2 3 1]
    [2 3 1 2]
    [1 0 7 1]
    [3 4 1 3]
    [1 3 8 1]

5. Find all the solutions to the equation

    [1 2 1]
    [2 4 3] x = b
    [1 2 2]

when (i) b = (1, 2, 1)t and (ii) b = (1, 2, 2)t . In case (i), write down a basis for the subspace of which
the solution set is a translate.

6. Let 1 ≤ p, q ≤ n be given with p ≠ q. Let the n × n matrix A be defined by aii = 1 for all
1 ≤ i ≤ n, apq = a, and otherwise aij = 0. The matrix A represents the row operation that adds a
times row q to row p. What is the inverse of A?

7. Write each of the four vectors

u1 = (1, −2, 1)t,   u2 = (0, 1, 2)t,   u3 = (−1, 2, 1)t   and   u4 = (1, 1, 1)t

as a linear combination of the other three.

Exercises continue overleaf...


8. Determine if the following systems of linear equations have solutions and, if they do, give the
complete solution.

(a) x1 + 2x3 + 3x4 = 2 (b) x1 + 2x2 + 3x3 + 4x4 = 2


x2 + x3 x4 = 2 3x1 x2 + x3 + 3x4 = 1
2x1 2x2 + 3x3 + 7x4 = 6 2x1 + x2 + x3 + 2x4 = 0
3x1 + 4x2 + 8x3 + 7x4 = 2 x1 3x2 + 2x3 + 4x4 = 1


9. Suppose we have a matrix A, and we obtain A0 from A by the elementary row operation of
adding the first row of A to the second row. (That is, the second row of A0 is the sum of the first
two rows of A.)

(a) Show that the row space of A is equal to the row space of A0 .
(You need to show that every vector in the row space of A0 is in the row space of A, and vice
versa).

(b) Suppose the columns c′1, . . . , c′n of A′ satisfy α1c′1 + · · · + αnc′n = 0 for some scalars α1, . . . , αn.
Show that the columns c1, . . . , cn of A also satisfy α1c1 + · · · + αncn = 0.

(c) Deduce that, if the column vectors of A are linearly independent, then the column vectors of A0
are also linearly independent.

You should be able to see that (a)-(c) also hold for any other elementary row operation.
(Rough working for Exercises 11)

1. (i) det = 1(10 − 9) − 1(5 − 3) + 1(3 − 2) = 1 − 2 + 1 = 0, so no inverse.
(ii) det = 1(1 − 0) − 1(0 − 1) + 0 = 2, so the inverse exists; computing the adjugate and
dividing by the determinant gives

    A⁻¹ = (1/2) [ 1 −1  1 ]
                [ 1  1 −1 ]
                [−1  1  1 ]

(iii) det = 0, so no inverse.
(iv) det ≠ 0, so the inverse exists; computing the adjugate and dividing by the determinant gives

    A⁻¹ = [  1     1/2  −1/2 ]
          [ −1/6   0     1/6 ]
          [  1/3  −1/2   1/6 ]

2. (a) Putting the vectors as the rows of a matrix and row-reducing, the REF has no row of
zeros, so the set is linearly independent.
(b) Row-reducing the 2 × 4 matrix gives no row of zeros, so the set is linearly independent.
(c) Row-reducing gives a row of zeros in the REF, so this set is not linearly independent;
a basis for the subspace it spans is given by the non-zero rows of the REF (equivalently,
by the first three vectors).

3. Take B = {u2, . . . , un}, where uj has a 1 in position j and a −1 in position 1. If
α2u2 + · · · + αnun = 0 then, comparing coordinates, α2 = · · · = αn = 0, so B is linearly
independent. Any v = (v1, . . . , vn)t ∈ V has v1 = −v2 − · · · − vn and so can be written as
v2u2 + · · · + vnun, which shows that B spans V. Hence B is a basis of V.

4. Row-reducing A leaves two non-zero rows, so rank(A) = dim(RS(A)) = 2. Writing the
solution of Ax = 0 in parametric form, x = x3v1 + x4v2 for two fixed vectors v1 and v2,
so {v1, v2} is a basis of N(A).

5. (i) Constructing the augmented matrix (A | b) and row-reducing gives the solution set

    (x, y, z)t = (1, 0, 0)t + y(−2, 1, 0)t,    y ∈ R.

(ii) Row-reducing the augmented matrix produces an inconsistent row, so there is no solution.

6. The inverse of A can be derived by subtracting a times row q from row p, so A⁻¹ is
defined by aii = 1 for all 1 ≤ i ≤ n, apq = −a, and aij = 0 otherwise.
(Key: understand how the row operation works.)

7. Suppose a1u1 + a2u2 + a3u3 + a4u4 = 0 and solve for the coefficients; there is a
non-trivial solution, giving a relation between the four vectors from which each one can
be written in terms of the other three, e.g. u2 = (2/3)u1 + u3 + (1/3)u4.

8. (a) Row-reducing the augmented matrix, the equations are consistent and the solution
can be given in terms of the free variable x4 (with x2 = 0).
(b) Row-reducing the augmented matrix, the system has a unique solution.

9. (a) Each row of A′ is a linear combination of rows of A (the new second row is the sum
of the first two rows of A), so row(A′) ⊆ row(A); conversely, the second row of A equals
(second row of A′) − (first row of A′), so row(A) ⊆ row(A′). Hence row(A) = row(A′).
(b) If α1c′1 + · · · + αnc′n = 0, then every row of A′ satisfies α1x1 + · · · + αnxn = 0.
All rows of A other than the second are rows of A′, and the second row of A is the
difference of the first two rows of A′, so every row of A satisfies the same relation,
i.e. α1c1 + · · · + αncn = 0.
(c) If the column vectors of A are linearly independent and α1c′1 + · · · + αnc′n = 0, then
by (b) α1c1 + · · · + αncn = 0, so α1 = · · · = αn = 0. Hence the columns of A′ are also
linearly independent.
MA212: Further Mathematical Methods (Linear Algebra) 2020–21
Solutions to Exercises 11: Assumed background

This document contains answers to the exercises. There may well be some errors; please do let me
know if you spot any.

1. The first and the third matrices are singular, meaning that there are no inverses. One way to
see this is to note that, in the first matrix, adding the first and the third rows gives twice the second
row. That means that the rows are not linearly independent. Similarly for the third matrix, where
you can spot that the sum of the top two rows equals the sum of the bottom two rows.
The inverse of the second matrix is

    (1/2) [ 1 −1  1 ]
          [ 1  1 −1 ]
          [−1  1  1 ]

and the inverse of the fourth matrix is

    [  1     1/2  −1/2 ]
    [ −1/6   0     1/6 ]
    [  1/3  −1/2   1/6 ].
Techniques for finding inverses are covered in MA100.

2. For (a), we place the vectors as the rows of a matrix, and carry out row reductions.
0 1 0 1 0 1
2 1 1 1 1 2 1 1 R2 2R1 ! R2 1 2 1 1
B1 C B
1C R1 $ R2 B2 1C B 1C
B 2 1 1 1 C R3 3R1 ! R3 B0 3 3 C
@3 1 0 2A ! @3 1 0 2 A ! @0 5 3 1A
1 2 3 4 1 2 3 4 R4 R1 ! R4 0 0 4 3
0 1 0 1
1 21 1 1 2 1 1
1 B C B 1 1/3C
3 R2 ! R2 B0 11 1/3C R3 + 5R2 ! R3 B0 1 C
! @0 35 1A ! @0 0 2 2/3A
0 0
4 3 0 0 4 3
0 1
1 2 1 1
B
R4 + 2R3 ! R4 B0 1 1 1/3 C
C
! @0 0 2 2/3 A
0 0 0 13/3
The final matrix arising is in “upper triangular form” or “reduced form”, and no zero row has arisen,
so the set of vectors is linearly independent.
For (b), the only way a pair of vectors can be linearly dependent is if one is a multiple of the other;
that’s clearly not the case here, so the pair is linearly independent.
For (c), the set is linearly dependent; here are four ways to demonstrate that.

(i) 2(2, 1, 1, 1)t + 2(1, 2, 1, 1)t 3(3, 1, 0, 2)t + (3, 3, 0, 2)t = (0, 0, 0, 0)t .

(ii) All four of them lie in the 3-dimensional hyperplane normal to ( 2, 0, 1, 3)t .

(iii) The matrix we get by putting these as the rows of a matrix has determinant zero.

(iv) Carrying out row reductions as in (a) above yields a zero row.

The easiest way to find a basis for the space spanned by this set of vectors, let’s call it C, is to notice
that (by part (a)) the first three vectors form a linear independent set and they span Lin(C), so they
form a basis for Lin(C). Another answer can be found by taking the non-zero rows arising at the
end of the row-reduction process.
3. For j = 2, . . . , n, let uj be the vector with a 1 in position j and a −1 in position 1, with all
other entries zero. Let B = {u2 , . . . , un }. Clearly each element of B is in V . We claim that B is a
basis of V .
First we check that B is linearly independent. Suppose we have α2u2 + · · · + αnun = 0. Then,
considering the jth co-ordinate (j > 1), we see that all the ui have a 0 in position j except uj, which
has a 1; so αj = 0. This means that all of α2, . . . , αn are zero, and the set B is indeed linearly
independent.
Now we check that B spans V . We could do this by a dimension argument: Lin(B) is a subspace
of dimension at least n − 1, contained in V. As V is not the whole of Rⁿ, the dimension of V is no
greater than n − 1, so Lin(B) = V. More directly, if x = (x1, . . . , xn)t is any vector in V, we have
x1 = −x2 − x3 − · · · − xn, and we can write x = x2u2 + · · · + xnun.

4. Subtracting appropriate multiples of the first row from all the other rows reduces the given
matrix A to 0 1
1 2 3 1
B0 1 5 0C
B C
B0 2 10 0C
B C.
@0 2 10 0A
0 1 5 0
Now we see that all the rows except the first are multiples of each other, and one further round of
row reductions gets the matrix into reduced row-echelon form:
0 1
1 0 7 1
B0 1 5 0C
B C
B=B B 0 0 0 0C.
C
@0 0 0 0A
0 0 0 0

As there are only two non-zero rows in the reduced matrix, the rank of the matrix is 2.
The nullspace of A is the same as the nullspace of B. To write down the solution of the matrix
equation Bx = 0, we treat the variables corresponding to rows without a leading 1 as “free”. So, if
x = (x, y, z, w), the solution is given by x = 7z w and y = 5z for z, w 2 R. This can then be
written as 0 1 0 1 0 1
x 7 1
By C B 5C B0C
B C = zB C + wB C,
@z A @1A @0A
w 0 1
and these two vectors, i.e. (7, 5, 1, 0)t and ( 1, 0, 0, 1)t , form a basis for the nullspace of A.

5. For (i), we form the augmented matrix, and row-reduce it:


    [1 2 1 | 1]  R2 − 2R1 → R2  [1 2 1 | 1]  R3 − R2 → R3  [1 2 0 | 1]
    [2 4 3 | 2]      →          [0 0 1 | 0]      →         [0 0 1 | 0]
    [1 2 2 | 1]  R3 − R1 → R3   [0 0 1 | 0]  R1 − R2 → R1  [0 0 0 | 0]

This represents the equivalent equations x + 2y = 1 and z = 0 which means that the solution set can
be written as

    { (x, y, z)t = (1, 0, 0)t + y(−2, 1, 0)t : y ∈ R }.
Geometrically, this is the line through (1, 0, 0)t in the direction of the vector (−2, 1, 0)t and so it’s a
translate of the subspace with basis {(−2, 1, 0)t}.
For (ii), repeating the process gives us an inconsistency and so there are no solutions.
6. The inverse of A is the matrix B where bii = 1 for every i, bpq = −a, and otherwise bij = 0.
You can either use row operations to find this, or you can note that the operation that “undoes” the
effect of A is the subtraction of a times row q from row p, i.e. left-multiplication by the matrix B above.
Thus multiplying on the left by BA leaves the original matrix unchanged, so BA = I or B = A⁻¹.

7. First we find a solution to


a1 u1 + a2 u2 + a3 u3 + a4 u4 = 0.
This gives the matrix equation
    [ 1  0 −1  1 ] [a1]   [0]
    [−2  1  2  1 ] [a2] = [0]
    [ 1  2  1  1 ] [a3]   [0]
                   [a4]

and row reduction transforms this to

    [1 0 0 −2] [a1]   [0]
    [0 1 0  3] [a2] = [0]
    [0 0 1 −3] [a3]   [0]
               [a4]
with a non-zero solution (a1, a2, a3, a4)t = (2, −3, 3, 1)t. From this we get
2u1 − 3u2 + 3u3 + u4 = 0,
and this allows us to write any of the four vectors as a linear combination of the others. For example
we have u2 = (2/3)u1 + u3 + (1/3)u4.

8. For (a), the general solution is x1 = 6 5x4 , x2 = 0 and x3 = 2 + x4 for x4 2 R.


1 1 27 13
For (b), the general solution is x1 = 12 , x2 = 4, x3 = 12 and x4 = 12 .

9. This is an important result, as it underlies our use of row operations to solve all kinds of
practical matrix problems.
For (a), the new second row of A0 is the sum of two rows of A, so it is also in the row space of A.
Therefore all rows of A0 are in the row space of A, and so the row space of A0 is contained in the row
space of A. This holds in reverse too: the second row of A can be obtained from the rows of A0 by
taking the second row of A0 and subtracting the first row. So the row space of A is contained in the
row space of A0 , and we may conclude that the two row spaces are equal.
For (b), we can think of α1c′1 + · · · + αnc′n = 0 as a linear equation satisfied by each row of A′.
In particular, the entries of the first two rows of A′ satisfy the equation, and hence so do the entries
in the second row of A. So the equation is also satisfied by the columns of A.
For (c), we need to show that, if the columns c1 , . . . , cn of A are linearly independent, then the
corresponding columns c01 , . . . , c0n of A0 are linearly independent. This can be done by showing that
the contrapositive of this statement (which is logically equivalent to the original statement) is true.
That is, we show instead that, if the columns c01 , . . . , c0n of A0 are linearly dependent, then the
corresponding columns c1 , . . . , cn of A are linearly dependent.
To see why this is true, consider that if the columns c01 , . . . , c0n of A0 are linearly dependent, then
there are scalars α1, . . . , αn, not all zero, such that
α1c′1 + · · · + αnc′n = 0.
But, by (b), we know that this means that these very same scalars α1, . . . , αn, not all zero, are also
such that
α1c1 + · · · + αncn = 0.
That is, the corresponding columns c1 , . . . , cn of A are linearly dependent, as required.
MA212 Further Mathematical Methods
Lecture 3: Linear transformations and matrices

Dr James Ward

• Linear transformations
• Using matrices to represent linear transformations
• Coordinate vectors and change of basis

The London School of Economics and Political Science


Department of Mathematics

Lecture 3, page 1
Linear transformations

Definition: Suppose that U and V are vector spaces.


A linear transformation is a function T : U → V such that

    T(αu1 + βu2) = αT(u1) + βT(u2),

for all vectors u1, u2 ∈ U and scalars α, β.

Observe that linear transformations ‘preserve’ the operations of

• vector addition as, with α = β = 1, u1 + u2 ∈ U becomes
T(u1 + u2) = T(u1) + T(u2) ∈ V,
• scalar multiplication as, with β = 0, αu1 ∈ U becomes
T(αu1) = αT(u1) ∈ V.

Lecture 3, page 3
Example

Show that the function T : R³ → R² given by

    T (x, y, z)t = (2x − y, x + y + z)t

is a linear transformation.

Solution (sketch): Suppose (x1, y1, z1)t, (x2, y2, z2)t ∈ R³ and α, β ∈ R. Then

    T( α(x1, y1, z1)t + β(x2, y2, z2)t )
      = T(αx1 + βx2, αy1 + βy2, αz1 + βz2)t
      = ( 2(αx1 + βx2) − (αy1 + βy2), (αx1 + βx2) + (αy1 + βy2) + (αz1 + βz2) )t
      = α(2x1 − y1, x1 + y1 + z1)t + β(2x2 − y2, x2 + y2 + z2)t
      = αT(x1, y1, z1)t + βT(x2, y2, z2)t,

so T is a linear transformation.

Lecture 3, page 4
Example

Show that if A is an m × n matrix, then T(x) = Ax is a linear
transformation from Rⁿ to Rᵐ.

Solution (sketch): Suppose x, y ∈ Rⁿ and α, β ∈ R. Then

    T(αx + βy) = A(αx + βy) = αAx + βAy = αT(x) + βT(y),

so T is a linear transformation.

Lecture 3, page 5
Example

If U is the vector space of all continuous functions f : [0, 1] → R,
show that the function T : U → R given by

    T(f) = ∫₀¹ f(x) dx

is a linear transformation.

Solution (sketch): Suppose f, g ∈ U and α, β ∈ R. Then

    T(αf + βg) = ∫₀¹ (αf + βg)(x) dx = ∫₀¹ (αf(x) + βg(x)) dx
               = α∫₀¹ f(x) dx + β∫₀¹ g(x) dx = αT(f) + βT(g),

so T is a linear transformation.

Lecture 3, page 6
Representing linear transformations by matrices

Theorem: If T : Rⁿ → Rᵐ is a linear transformation and
(e1, e2, . . . , en) is the standard ordered basis of Rⁿ, then

    T(x) = AT x    for all x ∈ Rⁿ,

where AT is the m × n matrix

    AT = ( T(e1)  T(e2)  · · ·  T(en) ).

Lecture 3, page 7
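In computational terms, the theorem says AT can be assembled column-by-column from
the images of the standard basis vectors. A minimal sketch (not from the notes,
assuming Python with numpy is available), using the transformation
T(x, y, z) = (2x − y, x + y + z) from the earlier example:

    # Build A_T from T(e1), T(e2), T(e3) and check it reproduces T (sketch).
    import numpy as np

    def T(v):
        x, y, z = v
        return np.array([2*x - y, x + y + z])

    AT = np.column_stack([T(e) for e in np.eye(3)])   # columns are T(e1), T(e2), T(e3)
    print(AT)                                         # [[ 2. -1.  0.]
                                                      #  [ 1.  1.  1.]]
    v = np.array([4., 4., 6.])
    print(AT @ v, T(v))                               # both give [ 4. 14.]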
Proof: Any x 2 Rn can be written as

    x = α1e1 + α2e2 + · · · + αnen = (α1, α2, . . . , αn)t,

for some scalars α1, α2, . . . , αn. Thus, using the linearity of T,

    T(x) = T(α1e1 + α2e2 + · · · + αnen)
         = α1T(e1) + α2T(e2) + · · · + αnT(en)
         = ( T(e1)  T(e2)  · · ·  T(en) ) (α1, α2, . . . , αn)t = AT x

as required.

Lecture 3, page 8
Example

If T : R³ → R² is the linear transformation given by

    T (x, y, z)t = (2x − y, x + y + z)t,

find a matrix AT such that T(x) = AT x for all x ∈ R³.

Solution (sketch): T(e1) = (2, 1)t, T(e2) = (−1, 1)t and T(e3) = (0, 1)t, so

    AT = [2 −1 0]
         [1  1 1]

and T(x) = AT x for all x ∈ R³.

Lecture 3, page 9
Generalising what we have seen

We have seen how to represent a linear transformation


T : Rn ! Rm by a matrix when we are working with the standard
basis of Rn .
But, we will also want to represent a linear transformation by a
matrix when we are working with other bases of Rn ...
. . . Maybe a basis which gives us a diagonal matrix!
To do this, we need to see how we can represent vectors relative to
other bases using coordinate vectors.

Lecture 3, page 10
Coordinate vectors

Definition: Let B = (u1 , u2 , . . . , un ) be an ordered basis of


some vector space U.
If u 2 U can be written as

    u = α1u1 + α2u2 + · · · + αnun,

for some scalars α1, α2, . . . , αn, we call the vector

    [u]B = (α1, α2, . . . , αn)t_B

the coordinate vector of u relative to the ordered basis B.

Lecture 3, page 11
A quick note on ordered bases...

Suppose that B = {u1 , u2 } is a basis of some vector space U.


As B is a set (note the ‘curly’ brackets) it is unordered and so we
are free to consider the vectors in B in any order.
In this case, there are two ways of ordering the vectors, giving us
two ordered bases, i.e.

B1 = (u1 , u2 ) and B2 = (u2 , u1 ).

As B1 and B2 are ordered pairs (note the ‘round’ brackets) they


are ordered and so we are not free to consider the vectors in B1
and B2 in any order.

Lecture 3, page 12
...And coordinate vectors

Now, let’s take the vector u 2 U to be

u = 2u1 + 3u2 .

From this, we can get two di↵erent coordinate vectors depending


on which ordered basis we use, i.e.
" # " #
2 3
[u]B1 = and [u]B2 =
3 2
B1 B2

But, [u]B makes no sense as a coordinate vector because, as B is


unordered, it doesn’t allow us to choose between these two
coordinate vectors.

Lecture 3, page 13
Coordinate vectors: A special case

When we are working with En = (e1 , e2 , . . . , en ), the standard


ordered basis of Rn , we write the coordinate vector of u relative to
this basis as u.
This is because any u 2 Rn can be written as
    u = α1e1 + α2e2 + · · · + αnen,

for some scalars α1, α2, . . . , αn. That is, we have both

    u = (α1, α2, . . . , αn)t    and    [u]En = (α1, α2, . . . , αn)t_En.
Thus, as the components of both vectors must be the same, we
just take [u]En = u.

Lecture 3, page 14
Change of basis

Theorem: If B = (u1 , u2 , . . . , un ) is an ordered basis of Rn ,

    u = MB [u]B    for all u ∈ Rⁿ,

where MB is the n × n matrix

    MB = ( u1  u2  · · ·  un )

whose columns are the vectors of B.

That is, the matrix MB takes the coordinate vector of u relative to


the ordered basis B and gives us the coordinate vector of u
relative to the standard ordered basis of Rn .

Lecture 3, page 15
Proof: Any u 2 Rn can be written as

    u = α1u1 + α2u2 + · · · + αnun,

for some scalars α1, α2, . . . , αn. But this gives us

    u = ( u1  u2  · · ·  un ) (α1, α2, . . . , αn)t = MB [u]B,
as required.

Lecture 3, page 16
Change of basis continued

An obvious consequence of this is that we also have

    [u]B = MB⁻¹ u    for all u ∈ Rⁿ,

and so the matrix MB⁻¹ takes the coordinate vector of u relative to


the standard ordered basis of Rn and gives us the coordinate vector
of u relative to the ordered basis B.
Note: The inverse of MB certainly exists because it is a square
matrix whose columns form a basis of Rn and so they are linearly
independent.

Lecture 3, page 17
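Finding [u]B in practice just means solving the linear system MB[u]B = u. A minimal
sketch (not from the notes, assuming Python with numpy is available); the basis used
here is an illustrative choice (it is the ordered basis B from Exercises 12,
question 5), not necessarily the one in the lecture examples.

    # Convert between standard coordinates and B coordinates (sketch).
    import numpy as np

    M_B = np.array([[1., 1., 0.],    # columns are the basis vectors of B
                    [0., 1., 1.],
                    [1., 0., 1.]])
    u = np.array([4., 4., 6.])

    u_B = np.linalg.solve(M_B, u)    # [u]_B = M_B^{-1} u
    print(u_B)                       # [3. 1. 3.]
    print(M_B @ u_B)                 # recovers u = [4. 4. 6.]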
Representing linear transformations by matrices again

Theorem: Let T : Rn ! Rm be a linear transformation


given by
T (x ) = AT x for all x 2 Rn .

If B and C are ordered bases of Rn and Rm respectively, then



    [T(x)]C = A^{C,B}_T [x]B,

where A^{C,B}_T = MC⁻¹ AT MB.

(MB converts the input, x = MB[x]B, while MC converts the output, T(x) = MC[T(x)]C.)

Proof: For all x ∈ Rⁿ, we have T(x) = AT x and so

    MC [T(x)]C = AT MB [x]B   ⟹   [T(x)]C = MC⁻¹ AT MB [x]B,

as required.

Lecture 3, page 18
Example

Consider the ordered bases of R3 and R2 given by


00 1 0 1 0 11
1 0 1 ! !!
BB C B C B CC 1 1
B = @@0A , @1A , @ 2 AA and C = ,
1 1
1 0 1

respectively and the linear transformation T : R³ → R² given by

    T (x, y, z)t = (2x − y, x + y + z)t.

Find A^{C,B}_T.

Lecture 3, page 19
Solution (sketch): Here AT = [2 −1 0; 1 1 1]. Write MB with the vectors of B as its
columns and MC with the vectors of C as its columns, compute MC⁻¹, and then

    A^{C,B}_T = MC⁻¹ AT MB.
Lecture 3, page 20
Example continued

Verify that this works by finding T(4, 4, 6)t using both AT and A^{C,B}_T.

Solution (sketch):
① Using AT:  T(4, 4, 6)t = AT (4, 4, 6)t = (2·4 − 4, 4 + 4 + 6)t = (4, 14)t.
② Using A^{C,B}_T: first find [x]B = MB⁻¹ (4, 4, 6)t, then compute
[T(x)]C = A^{C,B}_T [x]B, and finally convert back with MC. The vector
MC [T(x)]C is again (4, 14)t, in agreement with ①.
Lecture 3, page 21
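The whole calculation on this slide can be checked numerically. A minimal sketch
(not from the notes, assuming Python with numpy is available): AT is the matrix of
T(x, y, z) = (2x − y, x + y + z) found earlier, while the bases B and C below are
illustrative choices (B is from Exercises 12, question 5 and C is from Exercises 12,
question 6), so the intermediate numbers need not match the lecture's example.

    # A^{C,B}_T = M_C^{-1} A_T M_B and a check that both routes agree (sketch).
    import numpy as np

    AT = np.array([[2., -1., 0.],
                   [1.,  1., 1.]])
    MB = np.array([[1., 1., 0.],          # columns: vectors of B
                   [0., 1., 1.],
                   [1., 0., 1.]])
    MC = np.array([[1., -1.],             # columns: vectors of C
                   [1.,  1.]])

    ACB = np.linalg.inv(MC) @ AT @ MB     # matrix of T relative to B and C

    x = np.array([4., 4., 6.])
    xB = np.linalg.solve(MB, x)           # [x]_B
    TxC = ACB @ xB                        # [T(x)]_C
    print(MC @ TxC)                       # [ 4. 14.]
    print(AT @ x)                         # [ 4. 14.], the same answer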
Representing linear transformations by matrices yet again

Theorem: Suppose that U and V are finite-dimensional


vector spaces and that T : U ! V is a linear transformation.
If B = (u1 , u2 , . . . , un ) and C are ordered bases of U and V
respectively, then

    [T(x)]C = A^{C,B}_T [x]B    for all x ∈ U,

where A^{C,B}_T is the m × n matrix

    A^{C,B}_T = ( [T(u1)]C  [T(u2)]C  · · ·  [T(un)]C ).

Lecture 3, page 22
Proof: Any u 2 U can be written as

    u = α1u1 + α2u2 + · · · + αnun,

for some scalars α1, α2, . . . , αn, with coordinate vector

    [u]B = (α1, α2, . . . , αn)t_B

and, using the linearity of T, we have

    T(u) = T(α1u1 + α2u2 + · · · + αnun)
         = α1T(u1) + α2T(u2) + · · · + αnT(un).

Lecture 3, page 23
Now, suppose that C = (v1 , v2 , . . . , vm ) is the ordered basis for V .
As T(u1), T(u2), . . . , T(un) ∈ V, we can write them as

    T(u1) = a11v1 + a12v2 + · · · + a1mvm,
    T(u2) = a21v1 + a22v2 + · · · + a2mvm,
      ..
    T(un) = an1v1 + an2v2 + · · · + anmvm

for some scalars aij with 1 ≤ i ≤ n and 1 ≤ j ≤ m so that

    [T(u1)]C = (a11, a12, . . . , a1m)t_C,   [T(u2)]C = (a21, a22, . . . , a2m)t_C,
    . . . ,   [T(un)]C = (an1, an2, . . . , anm)t_C

are their coordinate vectors relative to the basis C .

Lecture 3, page 24
So, substituting these into our expression for T (u), we get
    T(u) = α1(a11v1 + a12v2 + · · · + a1mvm) +
           α2(a21v1 + a22v2 + · · · + a2mvm) + · · · +
           αn(an1v1 + an2v2 + · · · + anmvm)

         = (α1a11 + α2a21 + · · · + αnan1)v1 +
           (α1a12 + α2a22 + · · · + αnan2)v2 + · · · +
           (α1a1m + α2a2m + · · · + αnanm)vm

so that

    [T(u)]C = ( α1a11 + α2a21 + · · · + αnan1,
                α1a12 + α2a22 + · · · + αnan2,
                . . . ,
                α1a1m + α2a2m + · · · + αnanm )t_C
is the coordinate vector of T (u) relative to the basis C .

Lecture 3, page 25
Thus, noting that matrix multiplication gives us
    ( α1a11 + α2a21 + · · · + αnan1 )     ( a11  a21  · · ·  an1 ) ( α1 )
    ( α1a12 + α2a22 + · · · + αnan2 )  =  ( a12  a22  · · ·  an2 ) ( α2 )
    (              ..               )     (  ..   ..   ..    ..  ) ( .. )
    ( α1a1m + α2a2m + · · · + αnanm )     ( a1m  a2m  · · ·  anm ) ( αn )

we can write

    [T(u)]C = ( [T(u1)]C  [T(u2)]C  · · ·  [T(un)]C ) [u]B,
as required.

Lecture 3, page 26
Example

As before, consider the ordered bases of R3 and R2 given by


00 1 0 1 0 11
1 0 1 ! !!
BB C B C B CC 1 1
B = @@0A , @1A , @ 2 AA and C = ,
1 1
1 0 1

respectively and the linear transformation T : R³ → R² given by

    T (x, y, z)t = (2x − y, x + y + z)t.

Find A^{C,B}_T.

Lecture 3, page 27
Solution (sketch): For each vector ui of B, compute T(ui) using the formula for T,
write T(ui) as a linear combination of the two vectors of C to obtain [T(ui)]C, and
place these coordinate vectors as the columns of

    A^{C,B}_T = ( [T(u1)]C  [T(u2)]C  [T(u3)]C ).

This gives the same matrix as the one found before using A^{C,B}_T = MC⁻¹ AT MB.
Lecture 3, page 28
Example

Let P2 be the vector space of all polynomial functions of degree at


most two from R to R and let f0, f1, f2 : R → R be the functions
given by

    f0(x) = 1,   f1(x) = x   and   f2(x) = x².

(a) Show that B = (f0, f1, f2) is an ordered basis of P2.
(b) Show that the function T : P2 → P2 given by

    T(f)(x) = f(x + 1)    for all x ∈ R,

is a linear transformation.
(c) Find the matrix A^{B,B}_T.

Lecture 3, page 29
Solution (sketch):
(a) {f0, f1, f2} is linearly independent (e.g. using the Wronskian) and
Lin{f0, f1, f2} = P2, so B spans P2. Hence B is an ordered basis of P2.
(b) Take any f, g ∈ P2 and α, β ∈ R. For all x ∈ R,

    T(αf + βg)(x) = (αf + βg)(x + 1) = αf(x + 1) + βg(x + 1) = αT(f)(x) + βT(g)(x).

Also, writing f = a0f0 + a1f1 + a2f2 with a0, a1, a2 ∈ R, we have, for all x ∈ R,

    T(f)(x) = f(x + 1) = a0 + a1(x + 1) + a2(x + 1)² = (a0 + a1 + a2) + (a1 + 2a2)x + a2x²,

so we can be sure T(f) ∈ P2 as well. Hence T : P2 → P2 is a linear transformation.
(c) T(f0)(x) = f0(x + 1) = 1 = f0, so [T(f0)]B = (1, 0, 0)t_B.
    T(f1)(x) = f1(x + 1) = x + 1 = f0 + f1, so [T(f1)]B = (1, 1, 0)t_B.
    T(f2)(x) = f2(x + 1) = (x + 1)² = x² + 2x + 1 = f0 + 2f1 + f2, so [T(f2)]B = (1, 2, 1)t_B.
Hence

    A^{B,B}_T = [1 1 1]
                [0 1 2]
                [0 0 1]
Lecture 3, page 30
MA212 Further Mathematical Methods
Lecture 4: Similar matrices

Dr James Ward

• Some last words on linear transformations
  - Kernels and images, null spaces and ranges
  - Ranks and nullities
• Similar matrices
• Eigenvalues, eigenvectors and diagonalisation
• Characteristic polynomials, traces and determinants

The London School of Economics and Political Science


Department of Mathematics
Lecture 4, page 1
Kernels and images

Every linear transformation has two subspaces associated with it, a


kernel and an image.

Definition: Suppose that U and V are vector spaces and


that T : U → V is a linear transformation.
• The kernel of T is ker(T) = {u ∈ U | T(u) = 0}.
• The image of T is im(T) = {T(u) ∈ V | u ∈ U}.

Theorem: Suppose that U and V are vector spaces and that


T : U → V is a linear transformation.
• The kernel of T, ker(T), is a subspace of U.
• The image of T, im(T), is a subspace of V.

Lecture 4, page 3
Proof (sketch): T : U → V is a linear transformation, so T(αu1 + βu2) = αT(u1) + βT(u2).

Claim: ker(T) = {u ∈ U | T(u) = 0} is a subspace of U.
ker(T) ⊆ U and ker(T) ≠ ∅, since T(0) = 0 gives 0 ∈ ker(T).
CUVA: if u1, u2 ∈ ker(T), then T(u1 + u2) = T(u1) + T(u2) = 0 + 0 = 0, so u1 + u2 ∈ ker(T).
CUSM: if u ∈ ker(T) and α ∈ R, then T(αu) = αT(u) = α0 = 0, so αu ∈ ker(T).
Hence ker(T) is a subspace of U.

Claim: im(T) = {T(u) ∈ V | u ∈ U} is a subspace of V.
im(T) ⊆ V and im(T) ≠ ∅, since T(0) = 0 gives 0 ∈ im(T).
CUVA: if v1 = T(u1) and v2 = T(u2) for some u1, u2 ∈ U, then
v1 + v2 = T(u1) + T(u2) = T(u1 + u2) with u1 + u2 ∈ U, so v1 + v2 ∈ im(T).
CUSM: if v = T(u) for some u ∈ U and α ∈ R, then αv = αT(u) = T(αu) with αu ∈ U, so αv ∈ im(T).
Hence im(T) is a subspace of V.

Lecture 4, page 4
Example

Suppose that T : R² → R² is a linear transformation given by

    T (x, y)t = (x − y, 2x − 2y)t.

Find ker(T ) and im(T ).

Solution (sketch):

    ker(T) = { (x, y)t ∈ R² | T(x, y)t = 0 } = { (x, y)t ∈ R² | x − y = 0 and 2x − 2y = 0 }
           = { (x, y)t ∈ R² | x = y } = Lin{ (1, 1)t },

    im(T) = { v ∈ R² | v = T(u, w)t for some (u, w)t ∈ R² }
          = { (u − w)(1, 2)t | u, w ∈ R } = Lin{ (1, 2)t }.
Lecture 4, page 5
Kernels and images, ranges and null spaces

The kernel and image of a linear transformation are related to the


range and null space of a matrix.

Theorem: If T : Rⁿ → Rᵐ is a linear transformation and

T (x ) = AT x

for some m ⇥ n matrix AT , then

ker(T ) = N(AT ) and im(T ) = R(AT ).

Proof (sketch):

    ker(T) = {x ∈ Rⁿ | T(x) = 0} = {x ∈ Rⁿ | AT x = 0} = N(AT),
    im(T)  = {T(u) ∈ Rᵐ | u ∈ Rⁿ} = {AT u ∈ Rᵐ | u ∈ Rⁿ} = R(AT).

Lecture 4, page 6
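Both subspaces can be computed directly from AT. A minimal sketch (not from the
notes, assuming Python with sympy is available), using the matrix of
T(x, y) = (x − y, 2x − 2y) from the previous example:

    # ker(T) = N(A_T) and a basis of im(T) = R(A_T) (sketch).
    from sympy import Matrix

    AT = Matrix([[1, -1],
                 [2, -2]])
    print(AT.nullspace())     # [Matrix([[1], [1]])]  -> ker(T) = Lin{(1, 1)^t}
    print(AT.columnspace())   # [Matrix([[1], [2]])]  -> im(T)  = Lin{(1, 2)^t}
    print(AT.rank() + len(AT.nullspace()))   # 2 = dim(R^2), the rank-nullity theorem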
Nullity and rank

Definition: If T is a linear transformation, then the

• nullity of T, η(T), is the dimension of its kernel, i.e.

    η(T) = dim( ker(T) ),

• rank of T, ρ(T), is the dimension of its image, i.e.

    ρ(T) = dim( im(T) ).

Note: This agrees with what we know about the nullity and rank
of a matrix.

Lecture 4, page 7
The rank-nullity (or dimension) theorem

These numbers are related by the following theorem.

Theorem: If U and V are vector spaces and T : U → V is
a linear transformation, then

    ρ(T) + η(T) = dim(U).

(For T : Rⁿ → Rᵐ with matrix A, this reads dim(R(A)) + dim(N(A)) = n.)
Proof: See Section 7.24 of the Anthony and Harvey textbook.
Note: This agrees with what we know as the rank-nullity theorem
for matrices.

Lecture 4, page 8
Similar matrices

Definition: Two n × n matrices A and B are similar if there
is an invertible matrix P such that B = P⁻¹AP.

Note: Equivalently, we can say that two n × n matrices A and B
are similar if there is an invertible matrix Q such that B = QAQ⁻¹
because we can just take Q = P⁻¹ so that Q⁻¹ = (P⁻¹)⁻¹ = P.
Note: Following on from the last lecture, if a linear transformation
T can be represented by a square matrix AT relative to the
standard ordered basis, then it can be represented by the matrix
    A^{B,B}_T = MB⁻¹ AT MB,

relative to the ordered basis B. That is, the similar matrices AT
and A^{B,B}_T represent the same transformation relative to different
bases.
Lecture 4, page 9
Similarity is an equivalence relation

Theorem: Similarity is an equivalence relation as we have


the following.
(a) Every square matrix is similar to itself.
(b) If A is similar to B, then B is similar to A.
(c) If A is similar to B and B is similar to C , then A is
similar to C .

Here, of course, the ‘equivalence’ that similar matrices share is


that they represent the same linear transformation (relative to
different ordered bases).

Lecture 4, page 10
Proof:

Lecture 4, page 11
Diagonalisability

One of our most important applications of similarity is that it


underwrites what we do when we diagonalise square matrices.

Definition: An n ⇥ n matrix A is diagonalisable if it is similar


to some diagonal matrix. That is, if there is an invertible
matrix P such that

    P⁻¹AP = D,

where D is a diagonal matrix.

In MA100, you saw how to diagonalise square matrices using


eigenvalues and eigenvectors.
We’ll talk more about diagonalisation later, but we do want to talk
about eigenvalues again now.
Lecture 4, page 12
Eigenvalues and eigenvectors

Definition: Suppose that A is an n × n matrix.

An eigenvalue of A is any scalar λ such that

    Ax = λx    for some x ≠ 0

and x is called an eigenvector of A corresponding to the
eigenvalue λ.

Note: To be an eigenvector, the vector x must be non-zero.

Otherwise, every scalar λ would satisfy the equation (with x = 0) and there
would be no point looking for eigenvalues!

Lecture 4, page 13
Finding eigenvalues

To find eigenvalues, we note that

    Ax = λx   ⟹   (A − λIn)x = 0,

and for this equation to have a solution x ≠ 0, the matrix A − λIn
cannot have an inverse, i.e. we must have

    |A − λIn| = 0.

Thus, expanding this determinant, we will get a polynomial
equation of degree n in λ that can be solved to give us the
eigenvalues.

Lecture 4, page 14
Characteristic polynomials

For theoretical work, it is convenient to use a slightly different
polynomial of degree n as follows.

Definition: Suppose that A is an n × n matrix.

The characteristic polynomial of A is given by

    pA(λ) = |λIn − A|.

Note: It should be clear that, for an n × n matrix A, we have

    |A − λIn| = (−1)ⁿ |λIn − A| = (−1)ⁿ pA(λ),

and so we can also say that the eigenvalues of A are the solutions
of the equation
    pA(λ) = 0.

Lecture 4, page 15
Characteristic polynomials, traces and determinants

Note: The trace of a square matrix A, denoted Tr(A), is just the


sum of all the diagonal entries of A.

Theorem: If A is an n × n matrix and

    pA(λ) = λⁿ + a_{n−1}λⁿ⁻¹ + · · · + a1λ + a0,

for some constants a_{n−1}, . . . , a1, a0, then

    Tr(A) = −a_{n−1}    and    |A| = (−1)ⁿ a0.

Proof: If you’re interested, see Question 8 on Exercises 12.

Lecture 4, page 16
Example

Find the characteristic polynomial of the matrix


    A = [1 2]
        [3 4]

Hence find the trace and determinant of this matrix.

Solution (sketch): Informally, Tr(A) = 1 + 4 = 5 and |A| = 1·4 − 2·3 = −2.
Using the characteristic polynomial instead,

    pA(λ) = |λI2 − A| = | λ − 1    −2  | = (λ − 1)(λ − 4) − 6 = λ² − 5λ − 2,
                        |  −3    λ − 4 |

so, writing pA(λ) = λ² + a1λ + a0, we have a1 = −5 and a0 = −2, giving

    Tr(A) = −a1 = 5    and    |A| = (−1)² a0 = −2.

Lecture 4, page 17
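The same numbers drop out of a computer algebra system. A minimal sketch (not from
the notes, assuming Python with sympy is available):

    # Characteristic polynomial, trace and determinant of A = [[1, 2], [3, 4]] (sketch).
    from sympy import Matrix, symbols

    lam = symbols('lambda')
    A = Matrix([[1, 2], [3, 4]])
    print(A.charpoly(lam).as_expr())   # lambda**2 - 5*lambda - 2
    print(A.trace(), A.det())          # 5  -2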
Similarity and characteristic polynomials

Theorem: If two matrices are similar, then they have the


same characteristic polynomial.

Proof:

Lecture 4, page 18
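The theorem is easy to illustrate numerically. A minimal sketch (not from the notes,
assuming Python with sympy is available); the matrix P below is an arbitrary
invertible matrix chosen purely for illustration:

    # Similar matrices share a characteristic polynomial (sketch).
    from sympy import Matrix, symbols

    lam = symbols('lambda')
    A = Matrix([[1, 2], [3, 4]])
    P = Matrix([[1, 1], [0, 1]])          # any invertible matrix will do
    B = P.inv() * A * P                   # B is similar to A
    print(A.charpoly(lam).as_expr())      # lambda**2 - 5*lambda - 2
    print(B.charpoly(lam).as_expr())      # the same polynomial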
A warning!

Note: The converse of this theorem is not true.


Example: Show that the matrices
    A = [0 0]    and    B = [0 1]
        [0 0]               [0 0]

have the same characteristic polynomial but they are not similar.

Solution (sketch):

    pA(λ) = |λI2 − A| = | λ  0 | = λ²    and    pB(λ) = |λI2 − B| = | λ −1 | = λ² = pA(λ),
                        | 0  λ |                                    | 0  λ |

so A and B have the same characteristic polynomial. But for any invertible P,

    P⁻¹AP = P⁻¹ 0 P = 0 ≠ B,

so B cannot be similar to A.

Lecture 4, page 19
Similarity and eigenvalues, traces and determinants

An easy consequence of the last theorem is the following.

Corollary: If A and B are similar matrices, then they have


the same eigenvalues, trace and determinant.

Proof (sketch): A and B being similar means they have the same characteristic
polynomial, i.e.

    pA(λ) = |λIn − A| = λⁿ + a_{n−1}λⁿ⁻¹ + · · · + a1λ + a0 = |λIn − B| = pB(λ).

So pA(λ) = 0 ⟺ pB(λ) = 0, and A and B have the same eigenvalues;
Tr(A) = −a_{n−1} = Tr(B), so A and B have the same trace; and
|A| = (−1)ⁿ a0 = |B|, so A and B have the same determinant.

Lecture 4, page 20
Eigenvalues, traces and determinants

One last consequence of what we have seen above is the following.

Theorem: If A is an n × n matrix with eigenvalues
λ1, λ2, . . . , λn, then

    Tr(A) = λ1 + λ2 + · · · + λn    and    |A| = λ1 λ2 · · · λn.

Proof: As the eigenvalues λ1, λ2, . . . , λn of A are the solutions of
the equation pA(λ) = 0 where

    pA(λ) = λⁿ + a_{n−1}λⁿ⁻¹ + · · · + a1λ + a0,

in its factorised form we must have (by the fundamental theorem of algebra)

    pA(λ) = (λ − λ1)(λ − λ2) · · · (λ − λn).

Lecture 4, page 21
But, expanding out these brackets, we get

    pA(λ) = λⁿ − (λ1 + λ2 + · · · + λn)λⁿ⁻¹ + · · · + (−1)ⁿ λ1 λ2 · · · λn,

which means that, comparing the coefficients of the λⁿ⁻¹ and
constant terms, we must have

    a_{n−1} = −(λ1 + λ2 + · · · + λn)    and    a0 = (−1)ⁿ λ1 λ2 · · · λn.

Thus, using our earlier theorem, we have

    Tr(A) = −a_{n−1} = λ1 + λ2 + · · · + λn

and

    |A| = (−1)ⁿ a0 = (−1)²ⁿ λ1 λ2 · · · λn = λ1 λ2 · · · λn,

as required.

Lecture 4, page 22
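This relationship is easy to check numerically for any square matrix. A minimal
sketch (not from the notes, assuming Python with numpy is available), using the
2 × 2 example from earlier in the lecture:

    # Trace = sum of eigenvalues, determinant = product of eigenvalues (sketch).
    import numpy as np

    A = np.array([[1., 2.],
                  [3., 4.]])
    eig = np.linalg.eigvals(A)
    print(eig.sum(), np.trace(A))          # both approximately 5
    print(eig.prod(), np.linalg.det(A))    # both approximately -2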
Example session (week 1)

Recall: f1(x) = x² for x ≤ 0 (and 0 otherwise) and f2(x) = x² for x ≥ 0 (and 0
otherwise) are functions f1, f2 : R → R.
These functions are linearly independent, as α1f1 + α2f2 = 0 gives us:
for all x ∈ R, α1f1(x) + α2f2(x) = 0.
Take x = 1: α1f1(1) + α2f2(1) = 0 gives α2 = 0.
Take x = −1: α1f1(−1) + α2f2(−1) = 0 gives α1 = 0.
Hence the only solution is the trivial one α1 = α2 = 0, so {f1, f2} is linearly independent.

1. Are these sets of functions LI or LD?
(1) 1, x, x²                LI
(2) 1, sin x, cos x         LI
(3) 1, sin²x, cos²x         LD, since cos²x = 1 − sin²x
(4) 1, x, x², (1 + x)²      LD, since (1 + x)² = 1 + 2x + x²
(5) 1, x, (1 + x)²          LI

2. Lecture 3, page 29.
MA212: Further Mathematical Methods (Linear Algebra) 2020–21
Exercises 12: Wronskians and Linear Transformations

For these and all other exercises on this course, you must show all your working.
1. Use the Wronskian to show that the set {eˣ, xeˣ, x²eˣ} is linearly independent.

2. Calculate the Wronskian for the functions

    e^{αx},   e^{βx},   e^{γx}.

Hence show that this set of functions is linearly independent if α, β and γ are all different.
[Hint: Use column operations to simplify the determinant before you expand it.]

3. Consider the functions f1, f2 : R → R given by

    f1(x) = { x²  if x ≤ 0        and    f2(x) = { 0   if x ≤ 0
            { 0   if x ≥ 0                       { x²  if x ≥ 0.

Show that the Wronskian, W (x), for these functions is zero for all values of x, but f1 and f2 are
linearly independent.

4. A polynomial function of degree at most k is a function f : R ! R of the form

f (x) = ak xk + · · · + a2 x2 + a1 x + a0 .

Let Pk be the set of polynomial functions of degree at most k.

(a) Show that P3 is a subspace of the vector space of all functions from R to R.

(b) For j = 0, 1, 2, 3, let gj (x) = xj . Show that B = (g0 , . . . , g3 ) is an ordered basis for P3 .

Define the function D : P3 → P3 by D(f) = f′, the derivative of f.

(c) Show that D is a linear transformation.

(d) Find the matrix A^{B,B}_D representing D with respect to the ordered basis B.

(e) Find ker(D) and im(D).

5. A linear transformation T : R³ → R³ is represented by the matrix



    AT = [   2      1     0  ]
         [ −1/2    7/2   1/2 ]
         [ −1/2    1/2   5/2 ]

(a) If B = ((1, 0, 1)t, (1, 1, 0)t, (0, 1, 1)t), what is the matrix A^{B,B}_T representing the linear
transformation T with respect to the ordered basis B? [Hint: Question 1 of Exercises 11 may help.]
(b) Given that C = ((2, 0, 2)t, (3, 3, 0)t, (1, 4, 3)t) is also an ordered basis, show that A^{C,B}_T is the
identity matrix I3. Why does that happen?

Exercises continue overleaf...


6. Let B = ((1, 0, 1)t, (1, 1, 0)t, (0, 1, 1)t) be the ordered basis of R³ as in the previous exercise, and
consider the ordered basis C = ((1, 1)t, (−1, 1)t) of R². Let T : R³ → R² be a linear transformation
represented by the matrix

    A^{C,B}_T = [1 1 1]
                [0 1 0]

with respect to the ordered bases B and C.
Let B′ = ((1, 0, 1)t, (1, 1, 0)t, (0, 1, 1)t) and C′ = ((2, 1)t, (1, 0)t) be alternative ordered bases for
R³ and R² respectively. Find A^{C′,B′}_T, the matrix representing T with respect to the ordered bases B′
and C′.

7. In R3 , the basis is changed from the standard ordered basis (e1 , e2 , e3 ) to the ordered basis
B = (1, 2, 3)t , (3, 1, 2)t , (2, 3, 1)t .
Find the matrix Q such that, if x = (x1 , x2 , x3 ) is an element of R3 written in terms of the standard
basis, then Qx is the same element written in terms of the ordered basis B.

8. If A is an n ⇥ n matrix and x is a variable, the characteristic polynomial of A is given by

    det(xI − A) = xⁿ + a_{n−1}xⁿ⁻¹ + · · · + a1x + a0,

for some constants a_{n−1}, . . . , a1, a0.

Show that (−1)ⁿ a0 is the determinant of A and that −a_{n−1} is the trace of A (i.e. the sum of its
diagonal entries).

(Revision notes)
Subspace: need U ⊆ V and U ≠ ∅, plus CUVA and CUSM.
Linear transformation: T(αf + βg) = αT(f) + βT(g).
Basis: linearly independent and spanning.

(Rough working for Exercises 12)

1. Denote f1(x) = eˣ, f2(x) = xeˣ, f3(x) = x²eˣ. Then f1′(x) = eˣ, f2′(x) = eˣ + xeˣ,
f3′(x) = 2xeˣ + x²eˣ, f1″(x) = eˣ, f2″(x) = 2eˣ + xeˣ and f3″(x) = 2eˣ + 4xeˣ + x²eˣ. Hence

            | f1(x)   f2(x)   f3(x)  |          | 1    x        x²           |
    W(x) =  | f1′(x)  f2′(x)  f3′(x) |  = e³ˣ   | 1   1 + x    2x + x²       |
            | f1″(x)  f2″(x)  f3″(x) |          | 1   2 + x    2 + 4x + x²   |

which is non-zero for at least one x (e.g. W(0) = 2), so {eˣ, xeˣ, x²eˣ} is linearly independent.

2. Denote f1(x) = e^{αx}, f2(x) = e^{βx}, f3(x) = e^{γx}. Then

            | e^{αx}     e^{βx}     e^{γx}   |
    W(x) =  | αe^{αx}    βe^{βx}    γe^{γx}  |  = e^{(α+β+γ)x} (β − α)(γ − α)(γ − β).
            | α²e^{αx}   β²e^{βx}   γ²e^{γx} |

So if α, β and γ are all different, W(x) ≠ 0 and this set of functions is linearly independent.

3. For any x ≤ 0, W(x) = | x²  0 | = 0, and for any x ≥ 0, W(x) = | 0  x² | = 0,
                         | 2x  0 |                                | 0  2x |
so W(x) = 0 for all x ∈ R. (The Wronskian test doesn't work the other way around:
W(x) ≡ 0 does not imply linear dependence.)
Now assume α1f1(x) + α2f2(x) = 0 for all x. At x = −1 this gives α1 = 0, and at x = 1 it
gives α2 = 0, so f1 and f2 are linearly independent.

4. (a) P3 is a non-empty subset of the vector space of all functions from R to R (the zero
function is in P3). CUVA: for any f = a3x³ + a2x² + a1x + a0 and g = b3x³ + b2x² + b1x + b0
in P3, f + g = (a3 + b3)x³ + (a2 + b2)x² + (a1 + b1)x + (a0 + b0) ∈ P3. CUSM: for α ∈ R,
αf = αa3x³ + αa2x² + αa1x + αa0 ∈ P3. Hence P3 is a subspace.
(b) The Wronskian of (g0, g1, g2, g3) is

    W(x) = | 1  x   x²   x³  |
           | 0  1   2x   3x² |  = 1·1·2·6 = 12 ≠ 0,
           | 0  0   2    6x  |
           | 0  0   0    6   |

so the vectors in B are linearly independent, and Lin{g0, g1, g2, g3} =
{a0g0 + a1g1 + a2g2 + a3g3} = P3, so B spans P3. Therefore B is an ordered basis for P3.
(c) Take f, g ∈ P3 and α, β ∈ R. For all x ∈ R, D(αf + βg)(x) = (αf + βg)′(x) =
αf′(x) + βg′(x) = αD(f)(x) + βD(g)(x), and D(f) = a1g0 + 2a2g1 + 3a3g2 ∈ P3.
Hence D : P3 → P3 is a linear transformation.
(d) D(g0) = 0, D(g1) = g0, D(g2) = 2g1 and D(g3) = 3g2, so [D(g0)]B = (0, 0, 0, 0)t,
[D(g1)]B = (1, 0, 0, 0)t, [D(g2)]B = (0, 2, 0, 0)t, [D(g3)]B = (0, 0, 3, 0)t and

    A^{B,B}_D = [0 1 0 0]
                [0 0 2 0]
                [0 0 0 3]
                [0 0 0 0]

(e) ker(D) = {f ∈ P3 | f′ = 0} = {constant functions} = Lin{g0} (dimension 1) and
im(D) = {f′ | f ∈ P3} = {a1 + 2a2x + 3a3x²} = P2 (dimension 3).
Note dim(ker) + dim(im) = 1 + 3 = 4 = dim(P3).

5. (a) A^{B,B}_T = MB⁻¹ AT MB, where MB has the vectors of B as its columns (its inverse was
found in Exercises 11, question 1(ii)). Its columns are [T(Bi)]_B, which gives

    A^{B,B}_T = [2 0 0]
                [0 3 1]
                [0 0 3]

(b) T(B1) = C1 = 2B1, T(B2) = C2 = 3B2 and T(B3) = C3, i.e. AT MB = MC, so
A^{C,B}_T = MC⁻¹ AT MB = I3. This happens because T maps each vector of B to the
corresponding vector of C, so the coordinates of T(Bi) relative to C are just ei.

6. Since AT = MC A^{C,B}_T MB⁻¹, we get A^{C′,B′}_T = MC′⁻¹ AT MB′ = MC′⁻¹ MC A^{C,B}_T MB⁻¹ MB′;
compute MB, MB⁻¹, MB′, MC and MC′⁻¹ and multiply.

7. MB⁻¹ is the matrix that converts standard coordinates into B coordinates, so Q = MB⁻¹,
where MB has the vectors of B as its columns.

8. Since a0 = pA(0) = det(0·I − A) = det(−A) = (−1)ⁿ det(A), multiplying both sides by
(−1)ⁿ gives det(A) = (−1)ⁿ a0.
For the trace, rewrite the characteristic polynomial in factorised form
(x − λ1) · · · (x − λn); the coefficient of xⁿ⁻¹ is −(λ1 + · · · + λn) = −Tr(A), so
Tr(A) = −a_{n−1}.
MA212: Further Mathematical Methods (Linear Algebra) 2020–21
Solutions to Exercises 12: Wronskians and Linear Transformations

This document contains answers to the exercises. There may well be some errors; please do let me
know if you spot any.

1. We calculate the Wronskian:


W(x) = \begin{vmatrix} e^x & xe^x & x^2e^x \\ e^x & (x+1)e^x & (x^2+2x)e^x \\ e^x & (x+2)e^x & (x^2+4x+2)e^x \end{vmatrix} = e^{3x} \begin{vmatrix} 1 & x & x^2 \\ 1 & (x+1) & (x^2+2x) \\ 1 & (x+2) & (x^2+4x+2) \end{vmatrix}.

We don’t need to evaluate this determinant in general. All we need to know is that the Wronskian
is non-zero for at least one value of x. The easiest value to consider is x = 0.
W(0) = e^0 \begin{vmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 2 & 2 \end{vmatrix} = 2.

The fact that W (x) is not identically zero means that the three functions are linearly independent.

2. Again let’s evaluate the Wronskian and simplify as far as is convenient.

W(x) = \begin{vmatrix} e^{αx} & e^{βx} & e^{γx} \\ αe^{αx} & βe^{βx} & γe^{γx} \\ α^2e^{αx} & β^2e^{βx} & γ^2e^{γx} \end{vmatrix} = e^{αx} e^{βx} e^{γx} \begin{vmatrix} 1 & 1 & 1 \\ α & β & γ \\ α^2 & β^2 & γ^2 \end{vmatrix}.

This last determinant is independent of x, so we need to show that it's non-zero whenever α, β, γ are all different. (Note: showing that the Wronskian is zero whenever some pair are equal – why is this obvious from the start? – is not the same thing at all.)
As in the hint, perhaps the easiest way to do this is via column operations.

\begin{vmatrix} 1 & 1 & 1 \\ α & β & γ \\ α^2 & β^2 & γ^2 \end{vmatrix} = \begin{vmatrix} 0 & 0 & 1 \\ α−γ & β−γ & γ \\ α^2−γ^2 & β^2−γ^2 & γ^2 \end{vmatrix} = \begin{vmatrix} α−γ & β−γ \\ α^2−γ^2 & β^2−γ^2 \end{vmatrix} = (α−γ)(β−γ) \begin{vmatrix} 1 & 1 \\ α+γ & β+γ \end{vmatrix} = (α−γ)(β−γ)(β−α),

which is indeed non-zero whenever all three are different.


Note: tackling this by other means, you might find yourself confronted by an expression such as

W(0) = βγ(γ − β) + αγ(α − γ) + αβ(β − α).

If you don't think it's obvious that this expression is non-zero, then it's fundamentally dishonest to write "therefore W(0) ≠ 0 as long as α, β and γ are all different". But, one way forward is to note that, since W(0) is zero if (say) β = γ, then γ − β is a factor of W(0). That is, we can now write:

W(0) = (γ − β)[βγ + α^2 − αβ − αγ] = (γ − β)(β − α)(γ − α).

3. The functions f1 and f2 are di↵erentiable with derivatives given by


f_1'(x) = \begin{cases} 2x & \text{if } x ≤ 0 \\ 0 & \text{if } x ≥ 0 \end{cases} \quad\text{and}\quad f_2'(x) = \begin{cases} 0 & \text{if } x ≤ 0 \\ 2x & \text{if } x ≥ 0, \end{cases}

respectively. This means that, for x ≥ 0 and x ≤ 0, the Wronskian for these functions is

W(x) = \begin{vmatrix} 0 & x^2 \\ 0 & 2x \end{vmatrix} \quad\text{and}\quad W(x) = \begin{vmatrix} x^2 & 0 \\ 2x & 0 \end{vmatrix},

respectively. In either case, W (x) is zero.


On the other hand, assume that a1 f1(x) + a2 f2(x) = 0 for all x. Looking at x = −1, we get a1 = 0.
Looking at x = 1, we get a2 = 0. Therefore f1 and f2 are linearly independent (this is equivalent to
saying that neither one is a multiple of the other).
The lesson is that using the Wronskian is a one-way test for a set of functions to be linearly indepen-
dent. If the Wronskian is non-zero for some value of x, then we are guaranteed linear independence.
However, if the Wronskian is zero for all x, then we cannot conclude anything, as this example shows.

4. Before we get going, it’s important to understand what the set P3 consists of. It’s a set, whose
elements are functions: in particular, it’s a subset of the vector space of all functions from R to R.
A function from R to R is in the space P3 if it can be written as a polynomial of degree at most 3.
So, for instance, the function f given by f(x) = 3x^3 − x + 11 is in P3, as is the constant function f
given by f (x) = 9. The general form of a function f in P3 is f (x) = a3 x3 + a2 x2 + a1 x + a0 , where
the ai are real numbers.
I want to contrast the notations f and f (x): f denotes a function (an element of P3 , in this question),
while f (x) denotes the value of that function at the real number x — so f (x) is a real number,
depending on f and x. When we talk about adding two functions, we mean adding them point-wise:
the function f + g is defined by saying that its value (f + g)(x) at the real number x is given by
(f + g)(x) = f (x) + g(x).
(a) We need to show that, if f and g are two elements of P3 , then f + g is in P3 , as is any scalar
multiple of f .
Suppose that f (x) = c3 x3 + c2 x2 + c1 x + c0 and g(x) = d3 x3 + d2 x2 + d1 x + d0 ; then we have
(f + g)(x) = (c3 + d3 )x3 + · · · + (c0 + d0 ), so f + g is indeed in P3 . Also, for any real number a,
(af )(x) = (ac3 )x3 + · · · + ac0 , so af is in P3 .
(b) By its definition, P3 is the linear span of the set B, so B spans P3 . To check that B is linearly
independent, one good way is to evaluate the Wronskian. (Details omitted, but it’s easy, and
very similar to an example in lectures.)
(c) We need to observe that, for f and g in P3 , D(f + g) = (f + g)0 = f 0 + g 0 = D(f ) + D(g).
Also, for a 2 R, D(af ) = (af )0 = af 0 = aD(f ). We also need to notice that, whenever f is a
polynomial of degree at most 3, so is f 0 : indeed f 0 is a polynomial of degree at most 2.
(d) The vector space P3 has dimension 4, so we know that the function D we’ve just shown to be a
linear transformation from this space to itself will be represented by a 4 × 4 matrix A^{B,B}_D, with respect to the ordered basis B.
We have to think of functions in P3 in terms of their “B-coordinates”: we know that every
function in P3 can be expressed uniquely as a linear combination ↵0 g0 + · · · + ↵3 g3 of the basis
elements: that means its B-coordinates are (↵0 , . . . , ↵3 )t . The matrix takes the set of coordinates
corresponding to a function f , and maps them to the coordinates of the function D(f ).
For instance, in terms of the ordered basis B, the function g3 (x) = x3 has co-ordinates (0, 0, 0, 1)t ,
meaning that g3 = 0 ⇥ g0 + · · · + 1 ⇥ g3 . Now D(g3 )(x) is the derivative of the function x3 , which
is 3x2 = 3g2 (x). In terms of the basis B, D(g3 ) has co-ordinates (0, 0, 3, 0)t . This means that
the matrix A^{B,B}_D representing D with respect to B must map (0, 0, 0, 1)^t to (0, 0, 3, 0)^t. Thinking about how matrix multiplication works, this means that the fourth column of A^{B,B}_D is (0, 0, 3, 0)^t.
In general, the ith column is the image of the ith basis vector, written in terms of the basis. We
have D(g0 ) = 0, D(g1 ) = g0 , D(g2 ) = 2g1 , D(g3 ) = 3g2 , so the full matrix is
A^{B,B}_D = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 0 \end{pmatrix}.
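A quick way to double-check this matrix numerically (an illustrative sketch, not part of the solution) is to store a polynomial a0 + a1 x + a2 x^2 + a3 x^3 by its B-coordinates (a0, a1, a2, a3)^t and confirm that multiplying by the matrix really does differentiate; the example polynomial below is arbitrary.

import numpy as np

# Matrix of D with respect to B = (g0, g1, g2, g3), acting on B-coordinates.
A_D = np.array([[0, 1, 0, 0],
                [0, 0, 2, 0],
                [0, 0, 0, 3],
                [0, 0, 0, 0]], dtype=float)

coeffs = np.array([5.0, -1.0, 4.0, 2.0])   # f(x) = 5 - x + 4x^2 + 2x^3
image = A_D @ coeffs                       # B-coordinates of D(f) = f'

# f'(x) = -1 + 8x + 6x^2, i.e. B-coordinates (-1, 8, 6, 0)^t
print(np.allclose(image, [-1.0, 8.0, 6.0, 0.0]))   # True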

(e) The kernel of D is the set of functions f in P3 such that D(f ) = 0, and of course this is the set
of constant functions, which is Lin({g0 }).
The image of D is the set of polynomial functions of degree at most 2. We’ve already remarked
that D(f ) is a polynomial of degree at most 2, for all f 2 P3 . There are a number of ways of
showing that every polynomial g of degree at most 2 is in the image of D: the most direct is to
let f be an indefinite integral of g — which is a polynomial of degree at most 3, and note that
D(f ) = f 0 = g, and so g is in the image of D.

There’s a lot of simple arithmetic to be done in the next few exercises. Probably the majority of
students will make some kind of arithmetic error along the way. That’s understandable. You may be
reassured to know that, in an exam, most of the marks are for applying the right method, not for your
accuracy at arithmetic. However, you shouldn’t be too careless, and you are supposed, whenever it’s
reasonable, to check your answers. One particular case is when you are working out the inverse
of a matrix: once you’ve got your answer, it’s easy to check whether it’s right. So you should never
get an inverse wrong in your homework. My rule of thumb on marking is to delete one mark for a
small number of arithmetic errors (within reason), and to delete another mark if the final answer is
wrong and could have been checked easily. You have been warned!
5. For (a), we first form M_B, the matrix whose columns are the given vectors, and work out the
inverse. (If you’ve done Question 1 from Exercises 1, then referring back and quoting the answer
from there is legitimate, indeed positively recommended!)
M_B = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix} \quad\text{and}\quad M_B^{-1} = \frac{1}{2}\begin{pmatrix} 1 & -1 & 1 \\ 1 & 1 & -1 \\ -1 & 1 & 1 \end{pmatrix}.

Now
A^{B,B}_T = M_B^{-1} A_T M_B = \frac{1}{2}\begin{pmatrix} 1 & -1 & 1 \\ 1 & 1 & -1 \\ -1 & 1 & 1 \end{pmatrix} \cdot \frac{1}{2}\begin{pmatrix} 4 & 2 & 0 \\ -1 & 7 & 1 \\ -1 & 1 & 5 \end{pmatrix}\begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix}
= \frac{1}{4}\begin{pmatrix} 1 & -1 & 1 \\ 1 & 1 & -1 \\ -1 & 1 & 1 \end{pmatrix}\begin{pmatrix} 4 & 6 & 2 \\ 0 & 6 & 8 \\ 4 & 0 & 6 \end{pmatrix}
= \frac{1}{2}\begin{pmatrix} 1 & -1 & 1 \\ 1 & 1 & -1 \\ -1 & 1 & 1 \end{pmatrix}\begin{pmatrix} 2 & 3 & 1 \\ 0 & 3 & 4 \\ 2 & 0 & 3 \end{pmatrix} = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 3 & 1 \\ 0 & 0 & 3 \end{pmatrix}.

Tip: when manipulating matrices, you’ll be much less prone to arithmetic errors if you use integers
as far as possible, as in the calculation above.
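For those who want to double-check the arithmetic by machine, here is a minimal numpy sketch (illustrative only; A_T, M_B and M_C are the matrices of this question as written above).

import numpy as np

A_T = 0.5 * np.array([[ 4.0,  2.0, 0.0],
                      [-1.0,  7.0, 1.0],
                      [-1.0,  1.0, 5.0]])
M_B = np.array([[1.0, 1.0, 0.0],
                [0.0, 1.0, 1.0],
                [1.0, 0.0, 1.0]])
M_C = np.array([[2.0, 3.0, 1.0],
                [0.0, 3.0, 4.0],
                [2.0, 0.0, 3.0]])

A_BB = np.linalg.inv(M_B) @ A_T @ M_B   # part (a)
A_CB = np.linalg.inv(M_C) @ A_T @ M_B   # part (b)

print(np.round(A_BB, 10))               # [[2 0 0], [0 3 1], [0 0 3]]
print(np.allclose(A_CB, np.eye(3)))     # True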

For (b), we could just calculate M_C^{-1}, and then verify that A^{C,B}_T = M_C^{-1} A_T M_B is equal to the identity matrix. A tiny amount of thought will save you a lot of calculation: all we really need to do is check that A_T M_B = M_C, as indeed it is.
General tip: if you are told what the outcome to some calculation is supposed to be, it is often much
easier to verify the answer than to perform the calculation.
There’s even slightly more to be gained, in the question taken as a whole, by thinking ahead and
planning what we’ll have to calculate before actually doing so. We see that we are asked to work out
AB,B
T = MB 1 AT MB and that in this part we’ll need to know AT MB ; so it pays to do the multiplication
AT MB first when working out AB,B
T — which indeed is what we did in the answer to (a) above. Note:
AT MB is the matrix that takes a vector in B-coordinates and gives its image in standard coordinates
– the question asks us about expressing this image in both B- and C-coordinates.
You were also asked why AC,B
T is the identity matrix. That’s a slightly vague question; in principle
there could be several plausible alternative answers, and it might not be clear to you what the
questioner is looking for. Tip: the one approach guaranteed to attract no credit (which for some
reason seems to be the most popular approach) is to act as though the question isn’t there.
What is the matrix AC,BT supposed to represent? It takes the B-coordinates of a vector v in R3 ,
and returns the C-coordinates of the image T (v ) = AT v . The fact that this matrix is the identity
matrix is equivalent to the observation that AT maps each vector in the basis B to the corresponding
element of the basis C (which you might care to check directly).

6. The easiest way to tackle this is first to find the matrix AT representing T with respect to the
standard bases. We know that AC,B T = MC 1 AT MB , so AT = MC AC,B 1
T MB . Then we can find the
C 0 ,B 0 C,B
matrix AT = MC 01 AT MB 0 = MC 01 MC AT MB 1 MB 0 . Here are the matrices we need. We have
1
MB in the previous answer.
0 1
1 1 0 ✓ ◆ ✓ ◆ ✓ ◆
@ 1 1 2 1 0 1
MB 0 = 0 1 1A , MC = , MC 0 = and 1
MC 0 = .
1 1 1 0 1 2
1 0 1

Notice that we don’t need to invert MB 0 or MC . Now we calculate:


0 1
✓ ◆✓ ◆ 1 1 1 ✓ ◆
C,B 1 1 1 1 1 1 @1 1 1 2
AT = MC AT MB = 1
1 1A = ,
2 1 1 0 1 0 0 0 1
1 1 1

so that ✓ ◆
0 ,B 0 1 0 1
AC
T
1
= M C 0 AT M B 0 =
5 0 3
As an aside, we can see from the form of AT that v = (1, 1, 0)t is in the kernel of the transformation
T . Let’s see how this is reflected with respect to the other bases.
The vector v is the second element of the basis B 0 , so its B 0 -coordinates are (0, 1, 0)t . This means
0 ,B 0
that we should have AC T (0, 1, 0)t = 0, which is equivalent to saying that the second column of the
matrix is all-zero, as indeed it is.
Similarly, with respect to the basis B, we have v = (1, 0, 1)t (0, 1, 1)t ; i.e. the B-coordinates of v
are (1, 0, 1)t . Therefore we should have AC,B t t
T (1, 0, 1) = (0, 0, 0) , and indeed it is so.

7. Remember (and understand) that MB is the matrix that converts B-coordinates to standard
coordinates.
Here, if the basis in the question is B, you’re asked for the matrix that converts standard coordinates
to B-coordinates, so the matrix Q you are asked for is MB 1 .
M_B = \begin{pmatrix} 1 & 3 & 2 \\ 2 & 1 & 3 \\ 3 & 2 & 1 \end{pmatrix} \quad\text{and}\quad M_B^{-1} = \frac{1}{18}\begin{pmatrix} -5 & 1 & 7 \\ 7 & -5 & 1 \\ 1 & 7 & -5 \end{pmatrix}.

So, if a vector is written in standard co-ordinates as x = (x1, x2, x3)^t, then it is written in the new co-ordinates as M_B^{-1} x.
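A quick numerical sanity check (illustrative only): Q = M_B^{-1} should send the standard coordinates of each vector in B to the corresponding standard basis vector (its B-coordinates).

import numpy as np

M_B = np.array([[1.0, 3.0, 2.0],
                [2.0, 1.0, 3.0],
                [3.0, 2.0, 1.0]])
Q = np.linalg.inv(M_B)

print(np.allclose(18 * Q, [[-5, 1, 7], [7, -5, 1], [1, 7, -5]]))  # matches the inverse above
print(np.allclose(Q @ M_B, np.eye(3)))   # each B-vector gets B-coordinates e1, e2, e3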
8. Since a_0 = f(0), setting x to zero gives that a_0 = det(−A). Remember that multiplying each row of a matrix by −1 has the effect of multiplying the determinant by −1, so det(−A) = (−1)^n det(A).
Imagine expanding out the expression for the determinant of xI − A completely: each term will be a product of entries, either of the form −a_{ij} or of the form x − a_{ii}. To get a coefficient for x^{n−1} in any of these terms, one must either multiply by diagonal entries exactly n − 1 times or exactly n times. But it is impossible to multiply by diagonal entries exactly n − 1 times, since one entry −a_{ij} off the diagonal in the product implies that neither x − a_{ii} nor x − a_{jj} will appear in the product.
Therefore the coefficient of x^{n−1} in the characteristic polynomial comes solely from multiplying all the diagonal entries, and is equal to the coefficient of x^{n−1} in the product (x − a_{11}) · · · (x − a_{nn}). This will be the sum −a_{11} − · · · − a_{nn}, i.e. minus the trace of A.
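Both identities are easy to illustrate numerically (this check is not part of the solution; the random matrix below is arbitrary). numpy's np.poly returns the coefficients of det(xI − A), highest power first.

import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.integers(-3, 4, size=(n, n)).astype(float)

c = np.poly(A)               # coefficients of det(xI - A): [1, a_{n-1}, ..., a_1, a_0]
a0, a_nm1 = c[-1], c[1]

print(np.isclose((-1) ** n * a0, np.linalg.det(A)))   # det(A) = (-1)^n a_0
print(np.isclose(-a_nm1, np.trace(A)))                # trace(A) = -a_{n-1}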
MA212 Further Mathematical Methods
Lecture 5: Real inner products

Dr James Ward

⌅ Real inner product spaces


⌅ Norms, the Cauchy-Schwarz inequality and angles
⌅ Orthogonal vectors and some useful results
I The generalised theorem of Pythagoras
I The triangle inequality
⌅ Orthonormal bases and Gram-Schmidt orthogonalisation

The London School of Economics and Political Science


Department of Mathematics
Lecture 5, page 1
Real inner product spaces

Definition: Suppose that V is a real vector space.


An inner product on V is a function h·, ·i : V ⇥ V ! R that,
for all u, v , w 2 V and ↵, 2 R, satisfies
(i) h↵u + v , w i = ↵ hu, w i + hv , w i;
(ii) hu, v i = hv , ui; symmetric
(iii) hu, ui 0 and hu, ui = 0 if and only if u = 0. positivity

A real vector space with an inner product is called a real inner


product space.

Note: We will refer to (i), (ii) and (iii) as linearity in the first
argument, symmetry and positivity respectively.

Lecture 5, page 3
Real inner products are bilinear

Theorem: Suppose that V is a real inner product space.


For all u, v , w 2 V and ↵, 2 R, we have

hu, ↵v + w i = ↵ hu, v i + hu, w i .

Proof: Linearity in the first argument gives us


h↵v + w , ui = ↵ hv , ui + hw , ui ,
and, applying symmetry to each of these inner products, we get
hu, ↵v + w i = ↵ hu, v i + hu, w i ,
as required.
Note: We will refer to this as linearity in the second argument. We
sometimes say that real inner products are bilinear as they are
linear in both arguments.
Lecture 5, page 4
Example

Show that the dot product in R2 , i.e.


\begin{pmatrix} u_1 \\ u_2 \end{pmatrix} \cdot \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = u_1 v_1 + u_2 v_2

is an inner product on R^2.

Take any u, v, w ∈ R^2 and α, β ∈ R.
(i) (αu + βv) · w = (αu_1 + βv_1)w_1 + (αu_2 + βv_2)w_2 = α(u_1 w_1 + u_2 w_2) + β(v_1 w_1 + v_2 w_2) = α(u · w) + β(v · w).
(ii) u · v = u_1 v_1 + u_2 v_2 = v_1 u_1 + v_2 u_2 = v · u.
(iii) u · u = u_1^2 + u_2^2 ≥ 0, and u · u = 0 ⟺ u_1^2 + u_2^2 = 0 ⟺ u_1 = u_2 = 0 ⟺ u = 0.
So the dot product is an inner product on R^2.


Lecture 5, page 5
A useful fact about dot products

Note: We can write the dot product as a matrix product if we let

u · v = v t u,

by identifying the single entry in the 1 ⇥ 1 matrix v t u with the


real number u · v .
E.g. Looking at the dot product in R2 , this works because
!
⇣ ⌘ u ⇣ ⌘ ⇣ ⌘ ⇣ ⌘
t 1
v u = v1 v 2 = v1 u 1 + v2 u 2 = u 1 v 1 + u 2 v2 = u · v .
u2

Of course, we could have used u · v = u t v instead, but this won’t


be so useful as the course progresses!

Lecture 5, page 6
Example

Let C[0, 1] be the vector space of all continuous functions from


[0, 1] to R. Show that
⟨f, g⟩ = ∫_0^1 f(x)g(x) dx

is an inner product on C[0, 1].

Take any f, g, h ∈ C[0, 1] and α, β ∈ R.
(i) ⟨αf + βg, h⟩ = ∫_0^1 (αf(x) + βg(x))h(x) dx = α∫_0^1 f(x)h(x) dx + β∫_0^1 g(x)h(x) dx = α⟨f, h⟩ + β⟨g, h⟩.
(ii) ⟨f, g⟩ = ∫_0^1 f(x)g(x) dx = ∫_0^1 g(x)f(x) dx = ⟨g, f⟩.
(iii) ⟨f, f⟩ = ∫_0^1 f(x)^2 dx ≥ 0. If f = 0 then ⟨f, f⟩ = ∫_0^1 0 dx = 0; conversely, if ⟨f, f⟩ = ∫_0^1 f(x)^2 dx = 0 then, since f is continuous, f(x) = 0 for all x ∈ [0, 1], i.e. f = 0.
Lecture 5, page 7
Remark

When we have real vector spaces like Rn or C[0, 1], we usually use
the inner products
⟨u, v⟩ = u · v   or   ⟨f, g⟩ = ∫_0^1 f(x)g(x) dx,

respectively. That is, these are the standard inner products on


these real vector spaces and we use them if no other inner product
is specified.
But, of course, we can define di↵erent inner products on any given
real vector space to get a di↵erent real inner product space.

Lecture 5, page 8
Example

Is ⟨u, v⟩ = u^t \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix} v an inner product on R^2?

Write A for the matrix above and note that A is symmetric, so A^t = A. Take any u, v, w ∈ R^2 and α, β ∈ R.
(i) ⟨αu + βv, w⟩ = (αu + βv)^t A w = αu^t A w + βv^t A w = α⟨u, w⟩ + β⟨v, w⟩.
(ii) ⟨u, v⟩ = u^t A v = (u^t A v)^t = v^t A^t u = v^t A u = ⟨v, u⟩, since u^t A v is a 1 × 1 matrix and (AB)^t = B^t A^t.
(iii) ⟨u, u⟩ = u^t A u = u_1^2 + 2u_1 u_2 + 2u_2^2 = (u_1 + u_2)^2 + u_2^2 ≥ 0. If u = 0 then ⟨u, u⟩ = 0; conversely, ⟨u, u⟩ = 0 forces u_1 + u_2 = 0 and u_2 = 0, i.e. u = 0.
So this is an inner product on R^2.
Lecture 5, page 9
Example
Is ⟨u, v⟩ = u^t \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix} v an inner product on R^2?

The matrix, A say, is again symmetric, so (i) linearity in the first argument and (ii) symmetry hold exactly as in the previous example. However, (iii) positivity fails:

⟨u, u⟩ = u^t A u = u_1^2 + 4u_1 u_2 + u_2^2 = (u_1 + 2u_2)^2 − 3u_2^2.

In particular, u = (2, −1)^t ≠ 0 gives ⟨u, u⟩ = 4 − 8 + 1 = −3 < 0.
So this is not an inner product on R^2.
Lecture 5, page 10
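A numerical aside (not a method from the slides, just an illustrative check): for a symmetric matrix A, positivity of u^t A u for all non-zero u is equivalent to all eigenvalues of A being positive, so the two examples above can be checked with numpy.

import numpy as np

A_good = np.array([[1.0, 1.0],
                   [1.0, 2.0]])
A_bad = np.array([[1.0, 2.0],
                  [2.0, 1.0]])

print(np.linalg.eigvalsh(A_good))   # both positive, so u^t A u > 0 for all u != 0
print(np.linalg.eigvalsh(A_bad))    # one negative eigenvalue, so positivity fails

u = np.array([2.0, -1.0])
print(u @ A_bad @ u)                # -3.0, the counterexample used above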
Norms and angles

Definition: Suppose that V is a real inner product space.


The norm on V is the function k · k : V ! R given by
p
kv k = hv , v i.

for all v 2 V .

Note: The norm is well defined since we know that hv , v i 0 for


all v 2 V and so the positive square root of hv , v i will always be a
nonnegative real number.
Example: Using the dot product on Rn we find that the norm is
p q
kv k = v · v = v12 + v22 + · · · + vn2 ,
and this is just the length (or magnitude) of the vector v .

Lecture 5, page 11
Unit vectors

Note: If v is a vector in a real inner product space, the vector


v
v̂ = ,
kv k

is a unit vector (i.e. vector of norm one) in the same direction as v .


Example: Using the dot product in R2 , if
!
1 p p
v= we have kv k = 12 + 32 = 10,
3

and so !
1 1
v̂ = p
10 3
is a unit vector in the same direction as the vector v .

Lecture 5, page 12
The Cauchy-Schwarz inequality

To get angles, we need the following result.

Theorem: Suppose that V is a real inner product space.

| hx , y i |  kx kky k for all x , y 2 V .

Note: To be clear, | hx , y i | is the absolute value of hx , y i.


Proof: See Section 10.1.4 of the Anthony and Harvey textbook.

Lecture 5, page 13
Angles

Now, using the Cauchy-Schwarz inequality, we have

hx , y i
1 1
kx kky k

for any two non-zero vectors x , y 2 V .


So, we can define the angle between these vectors, ✓, by taking

hx , y i
cos ✓ =
kx kky k

where 0  ✓  ⇡.
Note: This ‘angle’ will only have its usual geometric significance if
we are using the dot product in Rn .

Lecture 5, page 14
Orthogonal vectors

Even though we may not have a sensible interpretation of ‘angle’


(or even ‘perpendicular’) when we have other inner products or
other vector spaces, we do want to be able to say that two vectors
are orthogonal.

Definition: Suppose that V is a real inner product space.


Two vectors x , y 2 V are orthogonal if hx , y i = 0.

Note: Using the dot product in Rn , if two non-zero vectors are


orthogonal, then they are perpendicular (i.e. at 90 to each other).

Lecture 5, page 15
Zero vectors and orthogonality

Every vector in a real inner product space is orthogonal to the zero


vector in that space.

Theorem: Suppose that V is a real inner product space.


For any w 2 V , h0, w i = 0 and hw , 0i = 0.

Proof: Using linearity in the first argument with ↵, = 0 and any


vectors u, v , w 2 V , we have

h0u + 0v , w i = 0 hu, w i + 0 hv , w i =) h0, w i = 0

and then symmetry gives us hw , 0i = 0, as required.

Lecture 5, page 16
The generalised theorem of Pythagoras

Even though we may lack any sensible geometric interpretation of


the results, we get to keep two familiar theorems from geometry
when we move to unfamiliar inner products or vector spaces.

Theorem: Suppose that V is a real inner product space.


If x , y 2 V are orthogonal, then

kx + y k2 = kx k2 + ky k2 .

Proof: Suppose that x , y 2 V , then we have


kx + y k2 = hx + y , x + y i = hx , x + y i + hy , x + y i
= hx , x i + hx , y i + hy , x i + hy , y i
= hx , x i + 0 + 0 + hy , y i = kx k2 + ky k2
as required.
Lecture 5, page 17
The triangle inequality

Theorem: Suppose that V is a real inner product space.


If x , y 2 V , then

kx + y k  kx k + ky k.

Proof: Suppose that x , y 2 V , then we have


kx + y k2 = hx , x i + hx , y i + hy , x i + hy , y i As in proof above
= kx k2 + 2 hx , y i + ky k2 Symmetry
 kx k2 + 2kx kky k + ky k2 Cauchy-Schwarz
= (kx k + ky k)2
and so, taking the positive square root of both sides, we get
kx + y k  kx k + ky k
as required. Lecture 5, page 18
Orthonormal sets of vectors

Orthogonal vectors are very useful because they give us


orthonormal bases which, unlike the bases we have seen so far,
have some very useful properties.

Definition: Suppose that V is a real inner product space.


A set of vectors {v1 , v2 , . . . , vn } ✓ V is orthonormal if
8
<0 for i 6= j
hvi , vj i =
:1 for i = j

where 1  i, j  n.

Note: In an orthonormal set, the vectors are mutually orthogonal


(i.e. if you take two di↵erent vectors, those vectors are orthogonal)
and unit (i.e. every vector has a norm of one).
Lecture 5, page 19
Orthogonality and linear independence

For now, the key thing about orthonormal sets is that they are
linearly independent.

Theorem: If S is an orthonormal set of vectors in a real inner


product space, then S is a linearly independent set.

Proof: Suppose that {v1 , v2 , . . . , vn } ✓ V is an orthonormal set


of vectors and consider the equation
↵1 v1 + ↵2 v2 + · · · + ↵n vn = 0.
Now, for each 1  i  n, we can take the inner product of both
sides of this equation with the vector vi to get
h↵1 v1 + ↵2 v2 + · · · + ↵i vi + · · · + ↵n vn , vi i = h0, vi i ,

Lecture 5, page 20
which, using linearity in the first argument, gives us

↵1 hv1 , vi i + ↵2 hv2 , vi i + · · · + ↵i hvi , vi i + · · · + ↵n hvn , vi i = 0.

But then, using the orthonormality of the vectors, we get

↵1 (0) + ↵2 (0) + · · · + ↵i (1) + · · · + ↵n (0) = 0 =) ↵i = 0.

Thus, for 1  i  n, ↵i = 0 is the only solution to the equation


and so this set of vectors is linearly independent, as required.

orthogonal t rector LI vector

LI # t

Lecture 5, page 21
Orthonormal bases

With this in hand, we can now define an orthonormal basis of a


real inner product space.

Definition: Suppose that V is a real inner product space.


An orthonormal basis of V is an orthonormal set of vectors
that spans V .

Note: By definition, an orthonormal basis spans its real inner


product space and, by the previous theorem, its vectors form a
linearly independent set. That’s why it gives us a basis!
Example: The standard basis of Rn is an orthonormal basis of Rn
when we are using the dot product on Rn .

Lecture 5, page 22
Gram-Schmidt orthogonalisation

Let’s say that {u1 , u2 , . . . , un } is a basis for some subspace W of


a real inner product space V . In certain circumstances, we may
want to find an orthonormal basis of W , say {ŵ1 , ŵ2 , . . . , ŵn },
which means that we need

⌅ Lin{ŵ1 , ŵ2 , . . . , ŵn } = Lin{u1 , u2 , . . . , un } and


⌅ {ŵ1 , ŵ2 , . . . , ŵn } is an orthonormal set.

To do this, we use the Gram-Schmidt orthonormalisation process


and this runs as follows.
Starting with the basis {u1 , u2 , . . . , un } of W we perform the
following steps.

Lecture 5, page 23
Step 1: Take ŵ1 to be the vector
u1
ŵ1 = .
ku1 k
{ŵ1 } is an orthonormal set which spans Lin{u1 }.
Step 2: Take ŵ2 to be the vector
w2
ŵ2 = where w2 = u2 hu2 , ŵ1 i ŵ1 .
kw2 k
{ŵ1 , ŵ2 } is an orthonormal set which spans Lin{u1 , u2 }.
Step 3: Take ŵ3 to be the vector
w3
ŵ3 = where w 3 = u3 hu3 , ŵ1 i ŵ1 hu3 , ŵ2 i ŵ2 .
kw3 k
{ŵ1 , ŵ2 , ŵ3 } is an orthonormal set which spans Lin{u1 , u2 , u3 }.
Step . . . And so on, following the obvious pattern until...

Lecture 5, page 24
Step n: Take ŵn to be the vector
n 1
X
wn
ŵn = where w n = un hun , ŵi i ŵi
kwn k
i=1

Here {ŵ1 , ŵ2 , . . . , ŵn } is an orthonormal set which spans


Lin{u1 , u2 , u3 , . . . , un }.
And we are done! The set {ŵ1 , ŵ2 , . . . , ŵn } is our orthonormal
basis of W .
Note: As seen in MA100, we can use this to answer questions like
Find an orthonormal basis of a subspace of Rn and extend
it to an orthonormal basis of Rn .

See, for example, questions 7 and 8 in Exercises 13.

Lecture 5, page 25
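The same procedure is easy to run numerically. Below is a minimal numpy sketch of Gram-Schmidt (illustrative only; the two input vectors are the ones from question 7 of Exercises 13), using the dot product on R^3.

import numpy as np

def gram_schmidt(vectors):
    # Orthonormalise a list of linearly independent vectors, in order.
    basis = []
    for u in vectors:
        w = u - sum(np.dot(u, e) * e for e in basis)   # remove components along earlier vectors
        basis.append(w / np.linalg.norm(w))
    return basis

u1 = np.array([2.0, 1.0, 2.0])
u2 = np.array([2.0, 0.0, 1.0])
w1_hat, w2_hat = gram_schmidt([u1, u2])

print(np.allclose([w1_hat @ w1_hat, w2_hat @ w2_hat, w1_hat @ w2_hat], [1.0, 1.0, 0.0]))  # orthonormal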
Example

Consider the vector space C[0, 1] with for all f , g 2 C[0, 1] the
inner product Z 1
hf , g i = f (x)g (x) dx,
0
.
Let U = Lin{f1 , f2 , f3 } be the subspace of C[0, 1] where

f1 (x) = 1, f2 (x) = x and f3 (x) = x 2 ,

for all x 2 [0, 1]. Find an orthonormal basis of U.

Lecture 5, page 26
Using Gram-Schmidt:

Step 1: ‖f_1‖^2 = ⟨f_1, f_1⟩ = ∫_0^1 1 dx = 1, so ‖f_1‖ = 1 and h_1 = f_1/‖f_1‖ = f_1. That is, h_1 : [0, 1] → R with h_1(x) = 1.

Step 2: ⟨f_2, h_1⟩ = ∫_0^1 x dx = 1/2, so g_2 = f_2 − ⟨f_2, h_1⟩h_1 = x − 1/2.
Then ‖g_2‖^2 = ⟨g_2, g_2⟩ = ∫_0^1 (x − 1/2)^2 dx = 1/12, so ‖g_2‖ = 1/√12 and h_2 = g_2/‖g_2‖, i.e. h_2 : [0, 1] → R with h_2(x) = √12 (x − 1/2).

Step 3: ⟨f_3, h_1⟩ = ∫_0^1 x^2 dx = 1/3 and ⟨f_3, h_2⟩ = √12 ∫_0^1 x^2(x − 1/2) dx = √12 (1/4 − 1/6) = 1/√12, so
g_3 = f_3 − ⟨f_3, h_1⟩h_1 − ⟨f_3, h_2⟩h_2 = x^2 − 1/3 − (x − 1/2) = x^2 − x + 1/6.
Then ‖g_3‖^2 = ∫_0^1 (x^2 − x + 1/6)^2 dx = 1/180, so h_3 = g_3/‖g_3‖, i.e. h_3 : [0, 1] → R with h_3(x) = √180 (x^2 − x + 1/6).

Therefore, {h_1, h_2, h_3} is an orthonormal basis of U.
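A short symbolic check of this basis (illustrative only, not part of the notes), using sympy to evaluate the integrals:

import sympy as sp

x = sp.symbols('x')
h1 = sp.Integer(1)
h2 = sp.sqrt(12) * (x - sp.Rational(1, 2))
h3 = sp.sqrt(180) * (x**2 - x + sp.Rational(1, 6))

def ip(f, g):
    # the inner product <f, g> = integral of f(x) g(x) over [0, 1]
    return sp.integrate(f * g, (x, 0, 1))

gram = [[sp.simplify(ip(f, g)) for g in (h1, h2, h3)] for f in (h1, h2, h3)]
print(gram)   # the identity matrix: 1 on the diagonal, 0 elsewhere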
MA212 Further Mathematical Methods
Lecture 6: Orthogonal matrices

Dr James Ward

⌅ Orthogonal matrices and their properties


⌅ Rotations and reflections in R2
⌅ Rotations and reflections in R3

The London School of Economics and Political Science


Department of Mathematics

Lecture 6, page 1
Information

⌅ Exercises 13 is on Moodle.
I Attempt questions 2, 4, 5, 9, 10 and 11.
I Follow your class teacher’s submission instructions.
I Do it! Actually submit your homework!

⌅ Extra Examples Sessions


I Start on Tuesday 12:00–13:00 on Zoom.
I Contact me via the Moodle forum if you want me to cover
anything in particular.

⌅ Classes
I Go to them.

Lecture 6, page 2
Orthogonal matrices

Definition: A real n ⇥ n matrix A is orthogonal if At A = In .

In particular, we can see straight away that

Theorem: If A is an orthogonal matrix, then |A| = 1 or 1.


In particular, A is invertible. ⇐

Proof: As A is orthogonal, it is a real n ⇥ n matrix where

At A = In =) |At A| = |In | =) |At ||A| = 1 =) |A|2 = 1.

Thus, |A| is 1 or 1.
Of course, as |A| =
6 0, this means that A is invertible.

Lecture 6, page 3
A warning!

Note: The converse of this theorem is not true.


Example: Show that the matrix
A = \begin{pmatrix} 2 & 0 \\ 0 & 1/2 \end{pmatrix},

has |A| = 1 but it is not orthogonal.

|A| = 2 × (1/2) − 0 = 1, but

A^t A = \begin{pmatrix} 2 & 0 \\ 0 & 1/2 \end{pmatrix}\begin{pmatrix} 2 & 0 \\ 0 & 1/2 \end{pmatrix} = \begin{pmatrix} 4 & 0 \\ 0 & 1/4 \end{pmatrix} ≠ I_2,

so A is not orthogonal.

Lecture 6, page 4
Other definitions of orthogonal matrices

When we are looking to see whether a matrix is orthogonal, it is


useful to have the following equivalent definitions.

Theorem: The following statements about a real n⇥n matrix


A are equivalent. multiplication of diagonal matrices is commutative AB
:
-
BA

(1) A is orthogonal, i.e. At A = In .


(2) At = A 1.

(3) AAt = In .
(4) The columns of A form an orthonormal set of vectors.

Proof: To establish the equivalences, we’ll show that


(1) () (2), (2) () (3) and (1) () (4).

Lecture 6, page 5
Proof of (1) () (2), i.e. At A = In () At = A 1

Suppose that A is a real n ⇥ n matrix.


LTR: If A is orthogonal, then it is invertible and

At A = In =) (At A)A 1
=A 1
=) At (AA 1
)=A 1
,

which means that At = A 1.

RTL: If At = A 1, then

At A = A 1
A = In ,

which means that A is orthogonal.

Lecture 6, page 6
Proof of (2) () (3), i.e. At = A 1
() AAt = In

Suppose that A is a real n ⇥ n matrix.


LTR: If At = A 1, then

AAt = AA 1
= In ,

as required.
RTL: If AAt = In , then A is invertible as

|AAt | = |In | =) |A||At | = 1 =) |A|2 = 1 =) |A| =


6 0.

Thus, we have
1
A (AAt ) = A 1
=) (A 1
A)At = A 1
,

which means that At = A 1.

Lecture 6, page 7
Proof of (1) () (4), i.e. At A = In () columns of A are ON

Let the vectors x1 , x2 , . . . , xn 2 Rn be the columns of A.


Then the matrix product At A is
0 1 0 1
—– x1t —– 0 1 x1t x1 x1t x2 · · · x1t xn
B C B t C
B—– x2t —–C B t
C Bx2 x1 x2 x2 · · · x2t xn C
B C @x1 x2 · · · xn A = B .. C
B .. C B .. .. ..
. C.
@ . A @ . . . A
—– xnt —– t t
xn x1 xn x2 · · · xnt xn

Thus, we see that A is orthogonal if and only if


8 8
<1 if i = j <1 if i = j
At A = In () xjt xi = () xi · xj =
:0 if i 6= j :0 if i 6= j
which happens if and only if {x1 , x2 , . . . , xn } is an orthonormal set,
as required.

Lecture 6, page 8
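To see the equivalence in action (an illustrative check, not part of the slides), take a rotation matrix of the kind used later in this lecture and confirm numerically that A^t A = I and that the columns are orthonormal; the angle below is arbitrary.

import numpy as np

theta = 0.7
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

cols = A.T                                      # rows of A.T are the columns of A
print(np.allclose(A.T @ A, np.eye(2)))          # A^t A = I, so A is orthogonal
print([float(np.dot(c, c)) for c in cols])      # each column has norm 1
print(float(np.dot(cols[0], cols[1])))          # the columns are orthogonal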
Properties of orthogonal matrices

Orthogonal matrices can also be characterised in terms of dot


products and norms.

Theorem: The following statements about a real n⇥n matrix


A are equivalent.
(1) A is orthogonal.
(2) For all u, v 2 Rn , (Au) · (Av ) = u · v .
(3) For all u 2 Rn , kAuk = kuk.

Proof: To establish the equivalences, we’ll show that

(1) () (2) and (2) () (3).

Lecture 6, page 9
Proof of (1) () (2)

For any u, v 2 Rn , we have

(Au) · (Av ) = (Av )t (Au) = v t At Au.

LTR: Thus, if A is orthogonal, we have At A = In so that

(Au) · (Av ) = v t In u = v t u = u · v ,

as required.
RTL: Also, if for all u, v 2 Rn , we have

(Au) · (Av ) = u · v =) v t At Au = v t u = v t In u,

and so, by question 11(b) from Exercises 13, we have At A = In .


Thus, A is orthogonal, as required.

Lecture 6, page 10
Proof of (2) () (3)

LTR: If, for any u, v 2 Rn , we have (Au) · (Av ) = u · v we can


set v = u to get
(Au) · (Au) = u · u =) kAuk2 = kuk2 .
Thus, as norms are nonnegative, kAuk = kuk, as required.
RTL: If, for any u 2 Rn , we have kAuk = kuk then, for any
u, v 2 Rn , we can use question 4 from Exercises 14 twice to get
4[(Au) · (Av )] = kAu + Av k2 kAu Av k2 (?)
= kA(u + v )k2 kA(u v )k2
= ku + v k2 ku v k2 (†)
= 4[u · v ] (?)
as required. Note: (?)’s indicate applications of the result from the
exercises and (†) indicates two applications of kAuk = kuk.
Lecture 6, page 11
Remarks about orthogonal matrices

This theorem relates orthogonal matrices to dot products and


norms which in turn are related to angles and lengths. That is,
(2) tells us that orthogonal matrices preserve angles in the sense
that, if ✓ is the angle between u and v and is the angle between
Au and Av , we have
cos cos ✓
= =) cos = cos ✓ =) =✓
kAukkAv k kukkv k

as such angles must lie between 0 and ⇡.


(3) tells us that orthogonal matrices preserve lengths, i.e. for any
u 2 Rn , the length of Au is the same as the length of u.

Lecture 6, page 12
Remarks about linear transformations

In particular, notice that due to (2) and (3), we can now see that

⌅ A function from Rn to Rn that preserves the vector space


structure of Rn (i.e. linear combinations) is a linear
transformation that may happen to be represented by any
kind of n ⇥ n matrix.
⌅ A linear transformation from Rn to Rn also preserves the
geometric structure of Rn (i.e. lengths and angles) if and only
if it is represented by an orthogonal n ⇥ n matrix.

Let’s see what this means for linear transformations that are
represented by orthogonal matrices.

Lecture 6, page 13
Orthogonal matrices in R2

Suppose that (e1 , e2 ) is the standard ordered basis of R2 and that


0 1
| |
B C
A = @v1 v2 A so that Ae1 = v1 and Ae2 = v2 .
| |
If A is orthogonal, we know that kv1 k = kv2 k = 1 and v1 · v2 = 0.
Let’s say that
!
cos ✓
v1 =
sin ✓

for some angle 0  ✓ < 2⇡.


[Note: kv1 k2 = cos2 ✓ + sin2 ✓ = 1.]

Now, we have two ways of choosing v2 ...

Lecture 6, page 14
Case 1: First choice of v2

We can take
! !
sin ✓ cos(✓ + ⇡2 )
v2 = =
cos ✓ sin(✓ + ⇡2 )

[Note: kv2 k = 1 and v1 · v2 = 0.]

In this case, the matrix A represents an anti-clockwise rotation


through the angle ✓ and we notice that

cos ✓ sin ✓
|A| = = cos2 ✓ + sin2 ✓ = 1.
sin ✓ cos ✓

Lecture 6, page 15
Case 2: Second choice of v2

We can take
! !

sin ✓ cos(✓ 2)
v2 = = ⇡
cos ✓ sin(✓ 2)

[Note: kv2 k = 1 and v1 · v2 = 0.]

In this case, the matrix A represents a reflection about the line


through the origin in the direction (cos 2✓ , sin 2✓ )t and we notice that

cos ✓ sin ✓
|A| = = cos2 ✓ sin2 ✓ = 1.
sin ✓ cos ✓

Lecture 6, page 16
Orientation

In particular, we say that an orthogonal matrix is orientation


⌅ preserving if its determinant is 1. → rotation
⌅ reversing if its determinant is 1. → reflectio
This is because we have

Lecture 6, page 17
Remarks about rotations in R2

As we have seen, the matrix


!
cos ✓ sin ✓
sin ✓ cos ✓

represents an anticlockwise rotation through the angle ✓.


This means that the matrix
! !
cos( ✓) sin( ✓) cos ✓ sin ✓
=
sin( ✓) cos( ✓) sin ✓ cos ✓

represents a clockwise rotation through the angle ✓.


Indeed, it should be clear that a clockwise rotation through an
angle 0  ✓ < 2⇡ is the same as an anticlockwise rotation through
an angle 2⇡ ✓!

Lecture 6, page 18
Remarks about reflections in R2

If A and B are real 2 ⇥ 2 orthogonal matrices that both represent


reflections, then

|A| = 1 and |B| = 1.

The matrix AB is also orthogonal and involves doing the two


reflections one after the other (B first and then A). However, since

|AB| = |A||B| = ( 1)( 1) = 1,

this matrix represents a rotation! That is, the composition of two


reflections in R2 is a rotation.

Lecture 6, page 19
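A quick numerical illustration of this fact (not in the slides; the angles below are arbitrary): writing a reflection about the line at angle θ/2 in the matrix form used on the previous slides, the product of two reflections is exactly a rotation.

import numpy as np

def reflection(theta):
    # reflection about the line through the origin in direction (cos(theta/2), sin(theta/2))^t
    return np.array([[np.cos(theta),  np.sin(theta)],
                     [np.sin(theta), -np.cos(theta)]])

def rotation(theta):
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

a, b = 0.9, 0.3
A, B = reflection(a), reflection(b)

print(np.isclose(np.linalg.det(A), -1.0), np.isclose(np.linalg.det(B), -1.0))
print(np.allclose(A @ B, rotation(a - b)))   # doing B then A is a rotation through a - b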
Orthogonal matrices in R3

If we have a 3 ⇥ 3 orthogonal matrix, then it must represent one of


the following linear transformations in R3 .
(1) A rotation by an angle ✓ about an axis [through the origin].
(2) A reflection about a plane [through the origin].
(3) A combination of (1) and (2) where the axis of rotation is the
normal to the plane of reflection.
Our goal is to be able to
⌅ find a matrix which represents any of these transformations.
⌅ take an orthogonal matrix and find out which of these
transformations it represents.
But first, we need to understand what these transformations are
and how we can describe them!

Lecture 6, page 20
Case 1: A is a rotation by an angle ✓ about an axis

The axis of rotation is a line through the origin in the direction v .

As v lies on the axis, it does not


rotate and so we have

Av = v .

Thus, v 6= 0 is an eigenvector of
A with eigenvalue 1.

If u is in the plane of rotation, i.e. u ? v , then Au is u rotated


by an angle ✓ where

u · (Au) = kukkAuk cos ✓,

and Au is also in the plane of rotation, i.e. Au ? v too.

Lecture 6, page 21
In particular, this means that A is similar to the matrix
0 1
cos ✓ sin ✓ 0
B C
B = @ sin ✓ cos ✓ 0A .
0 0 1

Observe that |B| = 1 and so |A| = 1 too. That is, this linear
transformation is orientation preserving.
Also, if (u1 , u2 , v̂ ) is an ordered orthonormal basis of R3 where u1
and u2 are vectors in the plane of rotation, A and B are related by
0 1
| | |
B C
B = M t AM with M = @u1 u2 v̂ A .
| | |

Lecture 6, page 22
Case 2: A is a reflection about a plane

The plane of reflection goes through the origin with normal v .


As v is the normal to the plane, it
is completely reflected and so

Av = v.

Thus, v 6= 0 is an eigenvector of
A with eigenvalue 1.
If u is in the plane of reflection, i.e. u ? v , then Au is just u as it
is not reflected, i.e.
Au = u.

Thus, u 6= 0 is an eigenvector of A with eigenvalue 1.

Lecture 6, page 23
In particular, this means that A is similar to the matrix
0 1
1 0 0
B C
B = @0 1 0 A .
0 0 1

Observe that |B| = 1 and so |A| = 1 too. That is, this linear
transformation is orientation reversing.
Also, if (u1 , u2 , v̂ ) is an ordered orthonormal basis of R3 where u1
and u2 are vectors in the plane of reflection, A and B are related by
0 1
| | |
B C
B = M t AM with M = @u1 u2 v̂ A .
| | |

Lecture 6, page 24
Case 3: A is a combination of cases 1 and 2.

The axis of rotation is a line in the direction v and this is also


normal to the plane of reflection.

As v is the normal to the plane, it


is completely reflected and so

Av = v.

Thus, v 6= 0 is an eigenvector of
A with eigenvalue 1.

If u is in the plane of rotation, i.e. u ? v , then Au is u rotated


by an angle ✓ where
u · (Au) = kukkAuk cos ✓,
and Au is also in the plane of rotation, i.e. Au ? v too.
Lecture 6, page 25
In particular, this means that A is similar to the matrix
0 1
cos ✓ sin ✓ 0
B C
B = @ sin ✓ cos ✓ 0 A.
0 0 1

Observe that |B| = 1 and so |A| = 1 too. That is, this linear
transformation is orientation reversing.
Also, if (u1 , u2 , v̂ ) is an ordered orthonormal basis of R3 where u1
and u2 are vectors in the common plane of rotation and reflection,
A and B are related by
0 1
| | |
B C
B = M t AM with M = @u1 u2 v̂ A .
| | |

Lecture 6, page 26
Extra examples session (week 3)

1. From Lecture 5, slide 26: recall that A and B are similar if there is an invertible matrix P such that B = P^{-1}AP, i.e. A = PBP^{-1}.

2. Show: if A is similar to I_n, then A = I_n. Indeed, if there is an invertible matrix P such that A = P^{-1} I_n P, then A = P^{-1}P = I_n.

3. Suppose that A and B are similar, so there is an invertible matrix P with A = P^{-1}BP.
(a) Show that A^2 and B^2 are similar too: A^2 = AA = (P^{-1}BP)(P^{-1}BP) = P^{-1}B(PP^{-1})BP = P^{-1}B^2P, so A^2 and B^2 are similar.
(b) Assuming that A and B are invertible, A^{-1} and B^{-1} are similar: A^{-1} = (P^{-1}BP)^{-1} = P^{-1}B^{-1}P (recall (XYZ)^{-1} = Z^{-1}Y^{-1}X^{-1}), so A^{-1} and B^{-1} are similar.
(c) Show that if A is invertible, then B is invertible: |A| ≠ 0 and |A| = |P^{-1}BP| = |P^{-1}||B||P| = |B| (recall |XY| = |X||Y|), so |B| ≠ 0 and B is invertible.
MA212: Further Mathematical Methods (Linear Algebra) 2020–21
Exercises 13: Similar matrices and real inner products

For these and all other exercises on this course, you must show all your working.
1. Suppose that A and B are similar matrices. Show that A^2 and B^2 are similar. If A is invertible, show that B is also invertible and that A^{-1} and B^{-1} are similar.

2. (a) Suppose that A and B are n ⇥ n matrices, and that A is invertible. Show that the matrices

AB and BA are similar.
(b) Suppose that the n ⇥ n matrices S and T are similar. Show that there are matrices A and B,
with A invertible, such that S = AB and T = BA.
✓ ◆ ✓ ◆
1 1 1 1
(c) Let A = and B = . Show that AB is not similar to BA.
0 0 1 1

3. Let A and B be n ⇥ n matrices. Suppose that there is an invertible n ⇥ n matrix S such


that SAS^{-1} and SBS^{-1} are both diagonal matrices (only zero entries off the diagonal). Show that

AB = BA.


4. Suppose that A is an n ⇥ m matrix, and B is an m ⇥ k matrix.
(a) Show that R(AB) ⊆ R(A) and N(B) ⊆ N(AB).
(b) If also R(B) = Rm , show that R(AB) = R(A).
(c) Deduce that, if A is an n ⇥ m matrix, and B is an m ⇥ n matrix, with m < n, then AB is singular.


5. Verify that
⟨(x1, x2)^t, (y1, y2)^t⟩ = x1 y1 − x1 y2 − x2 y1 + 3 x2 y2
is an inner product in R^2.
Does ⟨(x1, x2)^t, (y1, y2)^t⟩ = x1 y1 − x1 y2 − x2 y1 + x2 y2 also give an inner product on R^2?

6. Let C[−1, 1] be the vector space of continuous functions on [−1, 1]. Show that

⟨f, g⟩ = ∫_{-1}^{1} x^2 f(x)g(x) dx + f(0)g(0)

defines an inner product on C[−1, 1].


Indicate briefly why each of the following does not define an inner product on C[−1, 1].
(i) ⟨f, g⟩ = ∫_{-1}^{1} x f(x)g(x) dx + f(0)g(0),
(ii) ⟨f, g⟩ = ∫_{0}^{1} x^2 f(x)g(x) dx + f(0)g(0),
(iii) ⟨f, g⟩ = ∫_{-1}^{1} x^2 f(x)g(x) dx + f(0)g(1).

7. Use the Gram-Schmidt process to find an orthonormal basis for the subspace
Lin\left\{ \begin{pmatrix} 2 \\ 1 \\ 2 \end{pmatrix}, \begin{pmatrix} 2 \\ 0 \\ 1 \end{pmatrix} \right\}

of R3 .



*
8. Find orthonormal bases for the image and kernel of the linear transformation represented by
the matrix 0 1
1 1 1 1
M = @2 0 1 1A
0 2 1 3
(with respect to the standard bases), and extend them to orthonormal bases of the whole spaces R3
and R4 respectively.

9. A function f : R → R is odd if f(−x) = −f(x) for all x, and even if f(−x) = f(x) for all x.

Show that, if f is odd and g is even, then

∫_{-1}^{1} f(x)g(x) dx = 0.

10. Let P2[−1, 1] be the space of all real polynomial functions on [−1, 1] with degree two or less.
The 3-tuple of functions (1, x, x^2) is an ordered basis of this space, as in Question 4 of Exercises 2.
An inner product is defined on P2[−1, 1] by

⟨f, g⟩ = ∫_{-1}^{1} f(x)g(x) dx.

Find an orthonormal basis of P2[−1, 1] with respect to this inner product.

11. (a) Suppose A is an n ⇥ n matrix with the property that x t Ay = 0 for all x , y in Rn . Show
that A = 0, i.e. every entry aij of A is equal to 0.
(b) Deduce that, if A and B are n ⇥ n matrices with the property that x t Ay = x t By for all x , y in
Rn , then A = B.
(c) Give an example of a non-zero 2 ⇥ 2 matrix A such that x t Ax = 0 for all x in R2 .
Rough working (handwritten notes):

1. Since A and B are similar, A = PBP^{-1} for some invertible P. Then A^2 = AA = PBP^{-1}PBP^{-1} = PB^2P^{-1}, so A^2 and B^2 are similar. If A is invertible then, since B = P^{-1}AP and (P^{-1}A^{-1}P)B = P^{-1}A^{-1}PP^{-1}AP = I, B is invertible with B^{-1} = P^{-1}A^{-1}P, which is similar to A^{-1}.

2. (a) Since A is invertible, AB = A(B(AA^{-1})) = A(BA)A^{-1}, so by definition AB and BA are similar.
(b) Since S and T are similar, there is an invertible matrix A with T = A^{-1}SA; put B = A^{-1}S. Then A is invertible, AB = AA^{-1}S = S and BA = A^{-1}SA = T.
(c) Here AB is the zero matrix while BA is not; for any invertible P, P^{-1}(AB)P = 0 ≠ BA, so AB is not similar to BA.

3. Write D1 = SAS^{-1} and D2 = SBS^{-1}, so A = S^{-1}D1S and B = S^{-1}D2S. Then AB = S^{-1}D1SS^{-1}D2S = S^{-1}D1D2S and BA = S^{-1}D2D1S. Since the multiplication of diagonal matrices is commutative, D1D2 = D2D1, and hence AB = BA.

4. (a) If y ∈ R(AB) then y = ABx = A(Bx) ∈ R(A), so R(AB) ⊆ R(A); if u ∈ N(B) then Bu = 0, so ABu = 0 and u ∈ N(AB), so N(B) ⊆ N(AB).
(b) If R(B) = R^m, then every z = Ax ∈ R(A) can be written as z = A(By) = (AB)y for some y, so R(A) ⊆ R(AB) and, with (a), R(AB) = R(A).
(c) ρ(AB) ≤ ρ(A) ≤ m < n, so AB is singular.

5. Linearity in the first argument and symmetry follow by expanding; for positivity, ⟨(x1, x2)^t, (x1, x2)^t⟩ = x1^2 − 2x1x2 + 3x2^2 = (x1 − x2)^2 + 2x2^2 ≥ 0, with equality only when x1 = x2 = 0, so the first form is an inner product. The second form gives ⟨(1, 1)^t, (1, 1)^t⟩ = 0 for a non-zero vector, violating positivity, so it is not.

6. The given form is linear in the first argument and symmetric, and ⟨f, f⟩ = ∫_{-1}^{1} (xf(x))^2 dx + f(0)^2 ≥ 0 with equality only when f = 0. (i) fails positivity; (ii) has ⟨f, f⟩ = 0 for some non-zero f; (iii) is not symmetric.

7. With u1 = (2, 1, 2)^t, ‖u1‖ = 3, so w1-hat = (1/3)(2, 1, 2)^t; then w2 = u2 − ⟨u2, w1-hat⟩ w1-hat and w2-hat = w2/‖w2‖, giving an orthonormal basis {w1-hat, w2-hat} of the subspace.

8. Row-reduce M to read off a basis of the kernel, apply Gram-Schmidt to obtain an orthonormal basis, and extend to an orthonormal basis of R^4 using vectors orthogonal to the kernel (the rows of M); the first two columns of M span the image, and Gram-Schmidt plus one further vector extends this to an orthonormal basis of R^3.

9. Since f is odd and g is even, fg is odd, so ∫_{-1}^{1} f(x)g(x) dx = ∫_{-1}^{0} fg dx + ∫_{0}^{1} fg dx = 0 by symmetry.

10. Apply Gram-Schmidt to (1, x, x^2): the orthonormal basis is h1 = 1/√2, h2 = √(3/2) x, h3 = (3√5)/(2√2)(x^2 − 1/3).

11. (a) Taking x = e_i and y = e_j gives a_{ij} = 0 for every i, j, so A = 0. (b) x^t A y = x^t B y for all x, y gives x^t(A − B)y = 0, so by (a) A − B = 0 and A = B. (c) A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} satisfies x^t A x = x1x2 − x2x1 = 0 for all x.
MA212: Further Mathematical Methods (Linear Algebra) 2020–21
Solutions to Exercises 13: Similar matrices and real inner products

This document contains answers to the exercises. There may well be some errors; please do let me
know if you spot any.

1. Suppose that A and B are similar, and that A = SBS 1 for some invertible matrix S. Then
we have A2 = (SBS 1 )(SBS 1 ) = SB(S 1 S)BS 1 = SB 2 S 1, so A2 and B 2 are similar.
Suppose now that A is invertible. We can write B = S 1 AS, and note that (S 1 A 1 S)B =
S 1 A 1 SS 1 AS = I, so B is invertible, with inverse B 1 = S 1 A 1 S, which is similar to A 1 .

2. (a) Suppose that A has an inverse A 1. We can write AB = AB(AA 1) = A(BA)A 1, showing
that AB and BA are similar.
(b) Suppose S and T are similar, with S = P 1T P . Then we can set A = P and B = P 1T , so that
A is invertible, AB = T and BA = S.
In other words, two n ⇥ n matrices are similar if and only if they can be written as AB and BA for
some pair of n ⇥ n matrices A and B, with A invertible.
(c) Here AB is the zero matrix, while BA isn’t. Any matrix S(AB)S 1 similar to AB = 0 will also
be the zero matrix, so in particular AB and BA are not similar.

3. Put SAS 1 = D1 and SBS 1 = D2 , where D1 and D2 are diagonal.


Now A = S 1 D1 S, and B = S 1D
2 S, so AB = S 1D
1 SS
1D
2S = S 1D
1 D2 S, and similarly
BA = S 1 D2 D1 S.
Moreover, since the Di are diagonal, we have D1 D2 = D2 D1 , so we can deduce that AB = BA, as
required.

4. (a) Take any vector y in the range of AB; so y = ABx for some x 2 Rk . Now y = Az where
z = Bx 2 Rm , so y 2 R(A). This proves that R(AB) ✓ R(A).
For any vector u 2 N (B), we have Bu = 0, and so ABu = 0, so u 2 N (AB). This shows that
N (B) ✓ N (AB).
(b) Now suppose that R(B) = Rm : we need to show that the range of A is contained in the range of
AB. So, take a vector z in the range of A: that means that Ax = z for some x 2 Rm . As the range
of B is Rm , there is some y 2 Rk with By = x . Putting this together, we have ABy = Ax = z , so
z is in the range of AB, which is what we needed.
(c) Finally, we note that, if A is an n ⇥ m matrix, then the rank ⇢(AB) of AB, which is the dimension
of R(AB), is at most the dimension ⇢(A) of R(A), and this is at most m, since R(A) is spanned by
the columns of A. If AB is an n × n matrix and m < n, this implies that AB is singular.

5. The function is linear in the first variable, since it can be written in the form ⟨x, y⟩ = x^t A y, where here

A = \begin{pmatrix} 1 & -1 \\ -1 & 3 \end{pmatrix}.

The function can be seen to be symmetric, i.e. ⟨x, y⟩ = ⟨y, x⟩, directly from its definition; you'll see that this is effectively the same thing as checking that the matrix A above is symmetric.
You do need to mention these points in your answer, but the main thing is to check positivity. For that, it is not enough to choose a particular x and check that ⟨x, x⟩ ≥ 0: you need to demonstrate this for all vectors x.
We have ⟨(x1, x2)^t, (x1, x2)^t⟩ = x1^2 − 2x1x2 + 3x2^2 and as this is just (x1 − x2)^2 + 2x2^2 it is always non-negative. Also, if this is zero, we must have x1 = x2 and x2 = 0, which gives us x1 = x2 = 0. And, conversely, if x1 = x2 = 0, this just gives us zero.
So the first function we are given does indeed give us an inner product in R^2.
For the second function we are given, we don't have an inner product in R^2 as positivity fails. That is, even though we do always have ⟨(x1, x2)^t, (x1, x2)^t⟩ = (x1 − x2)^2 ≥ 0, the point is that this expression can be equal to zero for non-zero x: for instance, ⟨(1, 1)^t, (1, 1)^t⟩ = 0.
Note: giving an example, like the vector (1, 1)^t above, is the most convincing way to answer the question.
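As a numerical aside (illustrative only, not part of the solution), the same conclusion can be read off from the eigenvalues of the two symmetric matrices involved.

import numpy as np

A1 = np.array([[1.0, -1.0],
               [-1.0, 3.0]])   # <x, y> = x1 y1 - x1 y2 - x2 y1 + 3 x2 y2
A2 = np.array([[1.0, -1.0],
               [-1.0, 1.0]])   # the second candidate form

print(np.linalg.eigvalsh(A1))  # all eigenvalues positive: positivity holds
print(np.linalg.eigvalsh(A2))  # one eigenvalue is 0: <x, x> = 0 for some x != 0

v = np.array([1.0, 1.0])
print(v @ A2 @ v)              # 0.0, matching the counterexample (1, 1)^t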

6. It’s perhaps illuminating to do the second part first: explaining why each of (i)-(iii) fails to
define an inner product. Then we’ll know which points we have to take some care on when proving
that the first example does give an inner product.
(i) This is not an inner product because it fails positivity; if f (x) is large for negative values of
x and small for positive values of x, then hf, f i will be negative. For an explicit example, set
f(x) = x for x ≤ 0 and f(x) = 0 for x ≥ 0. This is a continuous function, so it is in C[−1, 1], yet ⟨f, f⟩ = ∫_{-1}^{0} x^3 dx = −1/4.
(ii) This is not an inner product because hf, f i takes the value 0 on some continuous functions that
are not identically zero. The function f constructed in the answer to (i) is an explicit example.
(iii) This will also fail positivity, but it is even easier to see that it is not symmetric. The expression
hf, gi hg, f i = f (0)g(1) g(0)f (1) is typically non-zero (e.g., take f (x) = 1, g(x) = x) and this
means that we can have hf, gi = 6 hg, f i.
To show that the first function is an inner product, we note that hf, gi is certainly linear in f for
each fixed g: for f1, f2, g ∈ C[−1, 1] and real numbers α and β, we have

⟨αf1 + βf2, g⟩ = ∫_{-1}^{1} x^2 (αf1(x) + βf2(x))g(x) dx + (αf1(0) + βf2(0))g(0)
= α(∫_{-1}^{1} x^2 f1(x)g(x) dx + f1(0)g(0)) + β(∫_{-1}^{1} x^2 f2(x)g(x) dx + f2(0)g(0))
= α⟨f1, g⟩ + β⟨f2, g⟩.

The form is also clearly symmetric: hf, gi = hg, f i.


To check positivity, note that

⟨f, f⟩ = ∫_{-1}^{1} (x f(x))^2 dx + (f(0))^2,

which is always non-negative. This can only be zero if xf (x) = 0 for all x and f (0) = 0, which in
turn is only possible if f = 0. And, conversely, if f = 0, this just gives us zero.

7. This is rather more straightforward than the various parts of the next question, and we omit
the working.
One answer is: v1 = (1/3)(2, 1, 2)^t and v2 = (1/3)(2, −2, −1)^t.
0 1
1 0 1/2 1/2
8. @
We start with the kernel. The given matrix row-reduces to 0 1 1/2 3/2 A.
0 0 0 0
We can now read o↵ that the two vectors (1, 1, 2, 0)t and ( 1, 3, 0, 2)t span the nullspace of the
matrix, which is the same thing as the kernel of the linear transformation. ( This is another time
when it’s convenient to have integer vectors wherever possible. ) Take u1 = (1, 1, 2, 0)t (because it
has the simpler norm) and let v1 = p16 (1, 1, 2, 0)t . ( And here it’s best to keep it in this form; don’t
try to do any “cancellation”. )
Now we set
✓ ◆
t 1 1
w2 = ( 1, 3, 0, 2) p h( 1, 3, 0, 2)t , (1, 1, 2, 0)t i p (1, 1, 2, 0)t
6 6
1 1
= ( 1, 3, 0, 2)t (1, 1, 2, 0)t = ( 4, 8, 2, 6)t .
3 3
Finally, we normalise this to get v2 = p1 ( 2, 4, 1, 3)t .
30
So v1 and v2 form an orthonormal basis of the kernel. It’s easy to check that the two vectors are
orthogonal, so I expect you to do so.
To extend the set {v1 , v2 } to an orthonormal basis of R4 , it helps to observe that the rows of the
original matrix, or indeed the rows of the reduced matrix, are orthogonal to the kernel. So what we
need to find is an orthonormal basis {v3 , v4 } of the space spanned by either of those two pairs of
rows.
For instance, let u3 = (2, 0, 1, 1)t and u4 = (0, 2, 1, 3)t . For v3 , we take p1 (2, 0, 1, 1)t . Now take
6

1 1
w4 = (0, 2, 1, 3)t h(0, 2, 1, 3)t , (2, 0, 1, 1)t i(2, 0, 1, 1)t = (2, 6, 4, 8)t .
6 3
Finally, normalise this to get v4 = p1 (1, 3, 2, 4)t .
30

The vectors v1 , v2 , v3 and v4 form an orthonormal basis of R4 such that v1 and v2 form a basis of
the kernel.
Now we turn to the image. We know that the dimension of the image is 2, so for instance the first
two columns of the matrix span the image.
Set v1 = p1 (1, 2, 0)t and let
5

1 1
w1 = (1, 0, 2)t h(1, 2, 0)t , (1, 0, 2)t i(1, 2, 0)t = (4, 2, 10).
5 5
Now we normalise this to get v2 = p1 (2, 1, 5)t .
30
So v1 and v2 form an orthonormal basis of the image. It’s easy to check that the two vectors are
orthogonal, so I expect you to do so.
Perhaps the quickest way to find v3 is to look back to see why we got a zero row in the row-reduction.
This was because there was a non-trivial linear dependence among the rows: 2R1 R2 R3 = 0.
This means that w3 = (2, 1, 1)t is orthogonal to all the columns of the matrix. Therefore the
normalised multiple p16 (2, 1, 1)t will do for v3 .

Another way is just to write down what w3 has to be: any vector of the form (2, 1, x)t will be
orthogonal to v1 , and for such a vector to be orthogonal to v2 we require 5 + 5x = 0, so x = 1. (
This is more-or-less equivalent to choosing u3 = (0, 0, 1)t , which we can see is already orthogonal to
v1 , and adding an appropriate multiple of v2 , or of (2, 1, 5)t , to u3 . )
Either way, the vectors v1 , v2 and v3 form an orthonormal basis of R3 such that v1 and v2 form a
basis of the image.
9. If f is odd and g is even, then the product f g is odd: f ( x)g( x) = f (x)g(x). Now,
substituting y = x, we have
Z 0 Z 1 Z 1
f (x)g(x) dx = f ( y)g( y) dy = f (y)g(y) dy,
1 0 0

and therefore Z Z Z
1 0 1
f (x)g(x) dx = f (x)g(x) dx + f (x)g(x) dx = 0.
1 1 0

R1
10. Set g0 (x) = 1, g1 (x) = x, and g2 (x) = x2 . We have hg0 , g0 i = p1 g0
1 1 dx = 2, so f0 = 2

p
the constant function 1/ 2 — has norm 1.
We have hg0 , g1 i = 0, by the result in the previous question, as g0 is even and g1 is odd.
q q
Normalising g1 gives us the unit vector f1 = 32 g1 , i.e. f1 (x) = 32 x.
The final p function g2 is again even, so orthogonal to the odd function g1 , and hence to f1 , but
hg2 , f0 i = 2/3, so we put p
2 1
h2 (x) = g2 (x) f0 (x) = x2 ,
3 3
p
3p 5 2 1
and normalise to get f2 (x) = 2 2
(x 3 ).

The set {f0 , f1 , f2 } forms an orthonormal basis of the space.

11. (a) Suppose x t Ay = 0 for all x and y . Consider any entry aij of A; let x = ei and y = ej ,
the standard basis vectors. Then 0 = x t Ay = aij . This holds for every entry of A, so A is the zero
matrix.
(b) If A and B have the stated property, then x t (A B)y = 0 for all x and y . By the previous part,
we have that A B = 0, or A = B.
(c) One example is ✓ ◆
0 1
A= ,
1 0
(in fact, all examples are multiples of this). It is easy to check that x t Ax = x1 x2 x2 x1 = 0, for all
vectors x .
MA212: Further Mathematical Methods (Linear Algebra) 2020–21
Solutions to Exercises 13: Similar matrices and real inner products

This document contains answers to the exercises. There may well be some errors; please do let me
know if you spot any.

1. Suppose that A and B are similar, and that A = SBS^{-1} for some invertible matrix S. Then we have A^2 = (SBS^{-1})(SBS^{-1}) = SB(S^{-1}S)BS^{-1} = SB^2S^{-1}, so A^2 and B^2 are similar.
Suppose now that A is invertible. We can write B = S^{-1}AS, and note that (S^{-1}A^{-1}S)B = S^{-1}A^{-1}SS^{-1}AS = I, so B is invertible, with inverse B^{-1} = S^{-1}A^{-1}S, which is similar to A^{-1}.

2. (a) Suppose that A has an inverse A^{-1}. We can write AB = AB(AA^{-1}) = A(BA)A^{-1}, showing that AB and BA are similar.
(b) Suppose S and T are similar, with S = P^{-1}TP. Then we can set A = P and B = P^{-1}T, so that A is invertible, AB = T and BA = S.
In other words, two n x n matrices are similar if and only if they can be written as AB and BA for some pair of n x n matrices A and B, with A invertible.
(c) Here AB is the zero matrix, while BA isn't. Any matrix S(AB)S^{-1} similar to AB = 0 will also be the zero matrix, so in particular AB and BA are not similar.

3. Put SAS^{-1} = D1 and SBS^{-1} = D2, where D1 and D2 are diagonal.
Now A = S^{-1}D1S and B = S^{-1}D2S, so AB = S^{-1}D1SS^{-1}D2S = S^{-1}D1D2S, and similarly BA = S^{-1}D2D1S.
Moreover, since the Di are diagonal, we have D1D2 = D2D1, so we can deduce that AB = BA, as required.

4. (a) Take any vector y in the range of AB; so y = ABx for some x in R^k. Now y = Az where z = Bx in R^m, so y is in R(A). This proves that R(AB) is contained in R(A).
For any vector u in N(B), we have Bu = 0, and so ABu = 0, so u is in N(AB). This shows that N(B) is contained in N(AB).
(b) Now suppose that R(B) = R^m: we need to show that the range of A is contained in the range of AB. So, take a vector z in the range of A: that means that Ax = z for some x in R^m. As the range of B is R^m, there is some y in R^k with By = x. Putting this together, we have ABy = Ax = z, so z is in the range of AB, which is what we needed.
(c) Finally, we note that, if A is an n x m matrix, then the rank rho(AB) of AB, which is the dimension of R(AB), is at most the dimension rho(A) of R(A), and this is at most m, since R(A) is spanned by the columns of A. If AB is an n x n matrix and m < n, this implies that AB is singular.
5. The function is linear in the first variable, since it can be written in the form <x, y> = x^t A y, where here
A = ( 1  -1 )
    ( -1  3 ).
The function can be seen to be symmetric, i.e. <x, y> = <y, x>, directly from its definition; you'll see that this is effectively the same thing as checking that the matrix A above is symmetric.
You do need to mention these points in your answer, but the main thing is to check positivity. For that, it is not enough to choose a particular x and check that <x, x> >= 0: you need to demonstrate this for all vectors x.
We have <(x1, x2)^t, (x1, x2)^t> = x1^2 - 2x1x2 + 3x2^2 and as this is just (x1 - x2)^2 + 2x2^2 it is always non-negative. Also, if this is zero, we must have x1 = x2 and x2 = 0, which gives us x1 = x2 = 0. And, conversely, if x1 = x2 = 0, this just gives us zero.
So the first function we are given does indeed give us an inner product in R^2.
For the second function we are given, we don't have an inner product in R^2 as positivity fails. That is, even though we do always have <(x1, x2)^t, (x1, x2)^t> = (x1 - x2)^2 >= 0, the point is that this expression can be equal to zero for non-zero x: for instance, <(1, 1)^t, (1, 1)^t> = 0.
Note: giving an example, like the vector (1, 1)^t above, is the most convincing way to answer the question.

6. It's perhaps illuminating to do the second part first: explaining why each of (i)-(iii) fails to define an inner product. Then we'll know which points we have to take some care on when proving that the first example does give an inner product.
(i) This is not an inner product because it fails positivity; if f(x) is large for negative values of x and small for positive values of x, then <f, f> will be negative. For an explicit example, set f(x) = x for x <= 0 and f(x) = 0 for x >= 0. This is a continuous function, so it is in C[-1, 1], yet <f, f> = integral from -1 to 0 of x^3 dx = -1/4.
(ii) This is not an inner product because <f, f> takes the value 0 on some continuous functions that are not identically zero. The function f constructed in the answer to (i) is an explicit example.
(iii) This will also fail positivity, but it is even easier to see that it is not symmetric. The expression <f, g> - <g, f> = f(0)g(1) - g(0)f(1) is typically non-zero (e.g., take f(x) = 1, g(x) = x) and this means that we can have <f, g> != <g, f>.
To show that the first function is an inner product, we note that <f, g> is certainly linear in f for each fixed g: for f1, f2, g in C[-1, 1] and real numbers alpha and beta, we have
<alpha f1 + beta f2, g> = integral from -1 to 1 of x^2 (alpha f1(x) + beta f2(x)) g(x) dx + (alpha f1(0) + beta f2(0)) g(0)
  = alpha ( integral of x^2 f1(x) g(x) dx + f1(0)g(0) ) + beta ( integral of x^2 f2(x) g(x) dx + f2(0)g(0) )
  = alpha <f1, g> + beta <f2, g>.
The form is also clearly symmetric: <f, g> = <g, f>.
To check positivity, note that
<f, f> = integral from -1 to 1 of (x f(x))^2 dx + (f(0))^2,
which is always non-negative. This can only be zero if x f(x) = 0 for all x and f(0) = 0, which in turn (by continuity) is only possible if f = 0. And, conversely, if f = 0, this just gives us zero.

7. This is rather more straightforward than the various parts of the next question, and we omit the working.
One answer is: v1 = (1/3)(2, 1, 2)^t and v2 = (1/3)(2, 2, 1)^t.
8. We start with the kernel. The given matrix row-reduces to
( 1  0  -1/2   1/2 )
( 0  1  -1/2  -3/2 )
( 0  0    0     0  ).
We can now read off that the two vectors (1, 1, 2, 0)^t and (-1, 3, 0, 2)^t span the nullspace of the matrix, which is the same thing as the kernel of the linear transformation. (This is another time when it's convenient to have integer vectors wherever possible.) Take u1 = (1, 1, 2, 0)^t (because it has the simpler norm) and let v1 = (1/sqrt(6))(1, 1, 2, 0)^t. (And here it's best to keep it in this form; don't try to do any "cancellation".)
Now we set
w2 = (-1, 3, 0, 2)^t - ( (1/sqrt(6)) <(-1, 3, 0, 2)^t, (1, 1, 2, 0)^t> ) (1/sqrt(6))(1, 1, 2, 0)^t
   = (-1, 3, 0, 2)^t - (1/3)(1, 1, 2, 0)^t = (1/3)(-4, 8, -2, 6)^t.
Finally, we normalise this to get v2 = (1/sqrt(30))(-2, 4, -1, 3)^t.
So v1 and v2 form an orthonormal basis of the kernel. It's easy to check that the two vectors are orthogonal, so I expect you to do so.
To extend the set {v1, v2} to an orthonormal basis of R^4, it helps to observe that the rows of the original matrix, or indeed the rows of the reduced matrix, are orthogonal to the kernel. So what we need to find is an orthonormal basis {v3, v4} of the space spanned by either of those two pairs of rows.
For instance, let u3 = (2, 0, -1, 1)^t and u4 = (0, 2, -1, -3)^t. For v3, we take (1/sqrt(6))(2, 0, -1, 1)^t. Now take
w4 = (0, 2, -1, -3)^t - (1/6) <(0, 2, -1, -3)^t, (2, 0, -1, 1)^t> (2, 0, -1, 1)^t = (1/3)(2, 6, -4, -8)^t.
Finally, normalise this to get v4 = (1/sqrt(30))(1, 3, -2, -4)^t.
The vectors v1, v2, v3 and v4 form an orthonormal basis of R^4 such that v1 and v2 form a basis of the kernel.
Now we turn to the image. We know that the dimension of the image is 2, so for instance the first two columns of the matrix span the image.
Set v1 = (1/sqrt(5))(1, 2, 0)^t and let
w1 = (1, 0, 2)^t - (1/5) <(1, 2, 0)^t, (1, 0, 2)^t> (1, 2, 0)^t = (1/5)(4, -2, 10)^t.
Now we normalise this to get v2 = (1/sqrt(30))(2, -1, 5)^t.
So v1 and v2 form an orthonormal basis of the image. It's easy to check that the two vectors are orthogonal, so I expect you to do so.
Perhaps the quickest way to find v3 is to look back to see why we got a zero row in the row-reduction. This was because there was a non-trivial linear dependence among the rows: 2R1 - R2 - R3 = 0. This means that w3 = (2, -1, -1)^t is orthogonal to all the columns of the matrix. Therefore the normalised multiple (1/sqrt(6))(2, -1, -1)^t will do for v3.
Another way is just to write down what w3 has to be: any vector of the form (2, -1, x)^t will be orthogonal to v1, and for such a vector to be orthogonal to v2 we require 5 + 5x = 0, so x = -1. (This is more-or-less equivalent to choosing u3 = (0, 0, 1)^t, which we can see is already orthogonal to v1, and adding an appropriate multiple of v2, or of (2, -1, 5)^t, to u3.)
Either way, the vectors v1, v2 and v3 form an orthonormal basis of R^3 such that v1 and v2 form a basis of the image.
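As a quick numerical sanity check of Gram-Schmidt calculations like these, here is a short Python/NumPy sketch (not part of the original solutions; it uses the kernel vectors found above and the standard dot product):

import numpy as np

def gram_schmidt(vectors):
    # classical Gram-Schmidt: subtract projections onto the earlier orthonormal vectors
    basis = []
    for v in vectors:
        w = v.astype(float).copy()
        for b in basis:
            w -= np.dot(v, b) * b
        basis.append(w / np.linalg.norm(w))
    return basis

u1 = np.array([1, 1, 2, 0])
u2 = np.array([-1, 3, 0, 2])
v1, v2 = gram_schmidt([u1, u2])
print(np.round(v1 * np.sqrt(6), 3))    # a multiple of (1, 1, 2, 0)
print(np.round(v2 * np.sqrt(30), 3))   # a multiple of (-2, 4, -1, 3)
print(round(np.dot(v1, v2), 10))       # orthogonality check, ~0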
9. If f is odd and g is even, then the product fg is odd: f(-x)g(-x) = -f(x)g(x). Now, substituting y = -x, we have
integral from -1 to 0 of f(x)g(x) dx = integral from 0 to 1 of f(-y)g(-y) dy = - integral from 0 to 1 of f(y)g(y) dy,
and therefore
integral from -1 to 1 of f(x)g(x) dx = integral from -1 to 0 of f(x)g(x) dx + integral from 0 to 1 of f(x)g(x) dx = 0.

10. Set g0(x) = 1, g1(x) = x, and g2(x) = x^2. We have <g0, g0> = integral from -1 to 1 of 1 dx = 2, so f0 = (1/sqrt(2)) g0, the constant function 1/sqrt(2), has norm 1.
We have <g0, g1> = 0, by the result in the previous question, as g0 is even and g1 is odd.
Normalising g1 gives us the unit vector f1 = sqrt(3/2) g1, i.e. f1(x) = sqrt(3/2) x.
The final function g2 is again even, so orthogonal to the odd function g1, and hence to f1, but <g2, f0> = sqrt(2)/3, so we put
h2(x) = g2(x) - (sqrt(2)/3) f0(x) = x^2 - 1/3,
and normalise to get f2(x) = (3 sqrt(5))/(2 sqrt(2)) (x^2 - 1/3).
The set {f0, f1, f2} forms an orthonormal basis of the space.
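These claims are easy to spot-check numerically. A small Python sketch (an illustration only, assuming the inner product <f, g> = integral from -1 to 1 of f(x)g(x) dx used in this question):

import numpy as np
from scipy.integrate import quad

f0 = lambda x: 1/np.sqrt(2)
f1 = lambda x: np.sqrt(3/2) * x
f2 = lambda x: (3*np.sqrt(5))/(2*np.sqrt(2)) * (x**2 - 1/3)

def ip(f, g):
    # the inner product on C[-1, 1]
    return quad(lambda x: f(x) * g(x), -1, 1)[0]

fs = [f0, f1, f2]
for i in range(3):
    for j in range(3):
        print(i, j, round(ip(fs[i], fs[j]), 6))   # 1 on the diagonal, 0 otherwise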

11. (a) Suppose x^t A y = 0 for all x and y. Consider any entry aij of A; let x = ei and y = ej, the standard basis vectors. Then 0 = x^t A y = aij. This holds for every entry of A, so A is the zero matrix.
(b) If A and B have the stated property, then x^t (A - B) y = 0 for all x and y. By the previous part, we have that A - B = 0, or A = B.
(c) One example is
A = ( 0   1 )
    ( -1  0 ),
(in fact, all examples are multiples of this). It is easy to check that x^t A x = x1 x2 - x2 x1 = 0, for all vectors x.
MA212 Further Mathematical Methods
Lecture 7: Orthogonal matrices continued

Dr James Ward

⌅ Rotations and reflections in R3


⌅ Finding an orthogonal matrix from a given transformation
⌅ Finding the transformation from a given orthogonal matrix
I Finding axes and angles of rotation
I Testing for anticlockwise or clockwise rotations
I Another way of finding angles of rotation

The London School of Economics and Political Science


Department of Mathematics
Lecture 7, page 1
Handwritten summary: rotations and reflections
1. Finding the transformation matrix: choose an orthonormal basis (u1, u2, v) with v along the axis (or normal) and u1, u2 in the plane. Check that the ordered basis is 'left-handed', i.e. det(u1 u2 v) = +1; if the determinant is -1, swap the order of u1 and u2. With respect to this basis the transformation is
A_T^{B,B} = ( cos t  -sin t   0 )
            ( sin t   cos t   0 )
            (   0       0    +-1 )
with +1 if the transformation is orientation preserving and -1 if it is orientation reversing, and then A_T = M_B A_T^{B,B} M_B^{-1}.
2. Describing the transformation represented by a given orthogonal matrix:
Type 1 (rotation, |A| = +1): the axis v is preserved, Av = v, so find v by solving (A - I_3)v = 0; the angle satisfies Tr(A) = 2 cos t + 1.
Type 2 (reflection, |A| = -1 with t = 0) and Type 3 (rotation and reflection, |A| = -1): the normal/axis satisfies Av = -v, so find v by solving (A + I_3)v = 0; the angle satisfies Tr(A) = 2 cos t - 1.
Direction: pick u perpendicular to v and look at det(u Au v): positive means anticlockwise, negative means clockwise.
Orthogonal matrices in R3

If we have a 3 ⇥ 3 orthogonal matrix, then it must represent one of


the following linear transformations in R3 .
(1) A rotation by an angle ✓ about an axis [through the origin].
(2) A reflection about a plane [through the origin].
(3) A combination of (1) and (2) where the axis of rotation is the
normal to the plane of reflection.
Our goal is to be able to
⌅ find a matrix which represents any of these transformations.
⌅ take an orthogonal matrix and find out which of these
transformations it represents.
But first, we need to understand what these transformations are
and how we can describe them!

Lecture 7, page 3
Case 1: A is a rotation by an angle ✓ about an axis

The axis of rotation is a line through the origin in the direction v .

As v lies on the axis, it does not


rotate and so we have

Av = v .

Thus, v 6= 0 is an eigenvector of
A with eigenvalue 1.

If u is in the plane of rotation, i.e. u ? v , then Au is u rotated


by an angle ✓ where

u · (Au) = kukkAuk cos ✓,

and Au is also in the plane of rotation, i.e. Au ? v too.

Lecture 7, page 4
In particular, this means that A is similar to the matrix
B = ( cos t  -sin t  0 )
    ( sin t   cos t  0 )
    (   0       0    1 ).
Observe that |B| = 1 and so |A| = 1 too. That is, this linear transformation is orientation preserving.
Also, if (u1, u2, v) is an ordered orthonormal basis of R^3 where u1 and u2 are vectors in the plane of rotation, A and B are related by
B = M^t A M   with   M = ( u1  u2  v )
(the columns of M being u1, u2 and the unit axis vector v).

Lecture 7, page 5
Case 2: A is a reflection about a plane

The plane of reflection goes through the origin with normal v .


As v is the normal to the plane, it is completely reflected and so
Av = -v.
Thus, v != 0 is an eigenvector of A with eigenvalue -1.
If u is in the plane of reflection, i.e. u is perpendicular to v, then Au is just u as it is not reflected, i.e.
Au = u.
Thus, u != 0 is an eigenvector of A with eigenvalue 1.

Lecture 7, page 6
In particular, this means that A is similar to the matrix
B = ( 1  0   0 )
    ( 0  1   0 )
    ( 0  0  -1 ).
Observe that |B| = -1 and so |A| = -1 too. That is, this linear transformation is orientation reversing.
Also, if (u1, u2, v) is an ordered orthonormal basis of R^3 where u1 and u2 are vectors in the plane of reflection, A and B are related by
B = M^t A M   with   M = ( u1  u2  v ).

Lecture 7, page 7
Case 3: A is a combination of cases 1 and 2.

The axis of rotation is a line in the direction v and this is also


normal to the plane of reflection.

As v is the normal to the plane, it is completely reflected and so
Av = -v.
Thus, v != 0 is an eigenvector of A with eigenvalue -1.
If u is in the plane of rotation, i.e. u is perpendicular to v, then Au is u rotated by an angle t where
u . (Au) = ||u|| ||Au|| cos t,
and Au is also in the plane of rotation, i.e. Au is perpendicular to v too.
Lecture 7, page 8
In particular, this means that A is similar to the matrix
B = ( cos t  -sin t   0 )
    ( sin t   cos t   0 )
    (   0       0    -1 ).
Observe that |B| = -1 and so |A| = -1 too. That is, this linear transformation is orientation reversing.
Also, if (u1, u2, v) is an ordered orthonormal basis of R^3 where u1 and u2 are vectors in the common plane of rotation and reflection, A and B are related by
B = M^t A M   with   M = ( u1  u2  v ).

Lecture 7, page 9
Recap!

A 3 ⇥ 3 orthogonal matrix represents one of the following in R3 .


(1) A rotation by an angle ✓ about an axis [through the origin].
(2) A reflection about a plane [through the origin].
(3) A combination of (1) and (2) where the axis of rotation is the
normal to the plane of reflection.
Further, any such orthogonal matrix is similar to
B = ( cos t  -sin t   0 )
    ( sin t   cos t   0 )      with   M = ( u1  u2  v )
    (   0       0    +-1 )
where t is the angle of rotation and the '+-1' is
⌅ +1 if A is orientation-preserving (i.e. no reflection), and
⌅ -1 if A is orientation-reversing (i.e. a reflection).

Lecture 7, page 10
What about anticlockwise and clockwise?

We define an anticlockwise rotation with reference to the standard


ordered basis (e1 , e2 , e3 ) of R3 which is “left-handed”, i.e. with
your left-hand we have

(Think about Fleming’s


left-hand rule or how the
cross product

e1 ⇥ e2 = e3

works if you know about


such things.)

Lecture 7, page 11
Then, with this in place, taking e3 to be the direction of the axis
of rotation and Lin{e1 , e2 } to be the plane of rotation, an
anticlockwise rotation by ✓ is then defined in the R2 sense, i.e.

(Think: With “left-


handed” axes, an anti-
clockwise rotation goes
from e1 to e2 with e3
pointing up.)

Lecture 7, page 12
In particular, with the cases above in mind, we want our ordered
orthonormal basis (u1 , u2 , v̂ ) to map

e1 ! u 1 , e2 ! u 2 and e3 ! v̂ ,

in an order-preserving way, i.e. we want

| | |
u1 u2 v̂ = 1 so that orientations are preserved.
| | |
And, if we do this, any rotation discussed in the cases from earlier
will be anticlockwise.
Of course, if our determinant is 1, we can make it 1 by
interchanging the vectors u1 and u2 !
And, if we want a clockwise rotation, we can use the same ordered
basis but just use a negative angle instead!

Lecture 7, page 13
What we need to do now...

... We want to be able to answer the following questions.

⌅ Find an orthogonal matrix that represents a given rotation


and/or reflection in R3 .
⌅ Find the rotation and/or reflection that is represented by a
given 3 ⇥ 3 orthogonal matrix.

Lecture 7, page 14
Finding an orthogonal matrix for a given transformation

Method: Find a ‘left-handed’ orthonormal basis of R3 , let’s call it


B, that allows us to represent the transformation by the matrix
A_T^{B,B} = ( cos t  -sin t   0 )
            ( sin t   cos t   0 )
            (   0       0    +-1 )
where we take
⌅ t to be the angle of rotation (positive for anticlockwise and negative for clockwise), and
⌅ +-1 to be +1 if there is no reflection and -1 if there is.
Then change basis to standard coordinates using
A_T = M_B A_T^{B,B} M_B^{-1}
to get our orthogonal matrix.


Lecture 7, page 15
Example

Find the matrix representing an anticlockwise rotation by pi/4 about the axis (1, 1, 0)^t with respect to standard coordinates.
Working (handwritten, cleaned up): a unit vector in the direction of the axis is v = (1/sqrt(2))(1, 1, 0)^t. We can then take u1 = (0, 0, 1)^t and u2 = (1/sqrt(2))(1, -1, 0)^t, and check that the ordered basis is left-handed: det(u1 u2 v) = +1 (if it had been -1, we would swap u1 and u2).
Relative to the basis B = (u1, u2, v), our transformation is
A_T^{B,B} = ( cos(pi/4)  -sin(pi/4)  0 )
            ( sin(pi/4)   cos(pi/4)  0 )
            (     0           0      1 ),
and then A_T = M_B A_T^{B,B} M_B^{-1} = M_B A_T^{B,B} M_B^t, where M_B has columns u1, u2 and v.
A useful check: the resulting A_T should be orthogonal and should fix (1, 1, 0)^t.
Lecture 7, page 16
Finding the transformation from a given orthogonal matrix

Method: To find the axis, v 6= 0, determine whether the matrix is


orientation preserving or reversing by looking at |A|.

⌅ If |A| = +1, the axis is given by Av = v .


⌅ If |A| = 1, the axis is given by Av = v.

Then, to find the angle of rotation, pick a vector u ? v and find


the angle between the vectors u and Au.
[Note: We’ll see how to determine whether a rotation is clockwise
or anticlockwise in a moment.]

Lecture 7, page 18
Example

Find the axis and angle of rotation of the linear transformation


represented by the orthogonal matrix
A = (1/9) (  8  -1   4 )
          (  1  -8  -4 )
          (  4   4  -7 ).

|A| = +1, so A is orientation preserving (a pure rotation).
The axis of rotation v is given by Av = v, i.e. (A - I_3)v = 0. Row-reducing A - I_3 shows that its null space is spanned by v = (4, 0, 1)^t, so this is the axis.
Lecture 7, page 19
Pick u perpendicular to v, say u = (0, 1, 0)^t, and find Au = (1/9)(-1, -8, 4)^t.
Then u . (Au) = ||u|| ||Au|| cos t gives -8/9 = 1 . 1 . cos t, so cos t = -8/9 and therefore t = cos^{-1}(-8/9), which is about 152.73 degrees.
So A is a rotation by about 152.73 degrees about the axis (4, 0, 1)^t.
Check: is it clockwise or anticlockwise?
Lecture 7, page 20
Is this transformation anticlockwise or clockwise?

Suppose, following our method, we have the vectors v and u.


Let’s pick a third vector w to form the ordered orthonormal basis
(û, ŵ , v̂ ) that preserves the ‘left-handedness’ of our our coordinate
axes. That is, the vectors are ordered so that

û ŵ v̂ = 1.

We ask, for a given angle ✓ where 0  ✓  ⇡, does the orthogonal


matrix A represent an anticlockwise or clockwise rotation?

Lecture 7, page 21
That is, what do we have?

One way to consider this question is to look at the determinant of


the vectors û, Aû and v̂ in that order as...

Lecture 7, page 22
det( u  Au  v ) = det( u  ((cos t)u + (sin t)w)  v )
               = (sin t) det( u  w  v )        [subtracting (cos t) times the first column]
               = sin t,
which is non-negative for an anticlockwise rotation (0 <= t <= pi) and non-positive for a clockwise one.
Of course, using non-unit vectors u, Au and v (in that order) will change the magnitude of this determinant but not its sign, giving us the following simple test.
Lecture 7, page 23
An anticlockwise or clockwise test

Find the sign of the determinant

u Au v .

If it is

⌅ positive, then the rotation is anticlockwise.


⌅ negative, then the rotation is clockwise.

Lecture 7, page 24
Example continued

We have found the vectors


u = (0, 1, 0)^t,   Au = (1/9)(-1, -8, 4)^t   and   v = (4, 0, 1)^t
where v is the direction of the axis and the angle is cos^{-1}(-8/9).
Is this rotation anticlockwise or clockwise?

det( u  Au  v ) = det ( 0  -1/9  4 )
                      ( 1  -8/9  0 )
                      ( 0   4/9  1 )  =  17/9 > 0,
so the rotation is anticlockwise. Hence A represents an anticlockwise rotation by cos^{-1}(-8/9) about the axis (4, 0, 1)^t.
Lecture 7, page 25
An alternative method for finding angles of rotation

Recall: If A is a 3 ⇥ 3 orthogonal matrix, then it is similar to


B = ( cos t  -sin t   0 )
    ( sin t   cos t   0 )
    (   0       0    +-1 )
where the '+-1' is
⌅ +1 if A is orientation-preserving, and
⌅ -1 if A is orientation-reversing.
Due to the similarity of A and B, we have
Tr(A) = Tr(B)   which gives   Tr(A) = 2 cos t +- 1.
Thus, as we can easily find Tr(A) and we can easily determine whether A is orientation-preserving or reversing, we can easily use this formula to find t!
(Handwritten note on properties of the trace: Tr(A + B) = Tr(A) + Tr(B), Tr(A) = Tr(A^t), Tr(AB) = Tr(BA), and hence Tr(B) = Tr(P^{-1}AP) = Tr(P^{-1}PA) = Tr(A).)

Lecture 7, page 26
Example continued

Verify that the matrix


A = (1/9) (  8  -1   4 )
          (  1  -8  -4 )
          (  4   4  -7 ),
has an angle of rotation given by cos^{-1}(-8/9).
We found that |A| = +1 and so Tr(A) = 2 cos t + 1. Here Tr(A) = 8/9 - 8/9 - 7/9 = -7/9, so -7/9 = 2 cos t + 1, giving cos t = -8/9 and t = cos^{-1}(-8/9), as before.
Lecture 7, page 27
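The whole procedure of this lecture is easy to automate. Here is a small Python/NumPy sketch (an illustration only, using the example matrix as reconstructed above): it reads off the orientation from the determinant, the axis from the eigenvector for eigenvalue +-1, the angle from the trace, and the direction from the sign of det(u Au v).

import numpy as np

A = np.array([[ 8, -1,  4],
              [ 1, -8, -4],
              [ 4,  4, -7]]) / 9.0

d = np.linalg.det(A)                        # +1: rotation, -1: includes a reflection
eigvals, eigvecs = np.linalg.eig(A)
target = 1.0 if d > 0 else -1.0
v = np.real(eigvecs[:, np.argmin(np.abs(eigvals - target))])    # axis direction
cos_t = (np.trace(A) - d) / 2               # Tr(A) = 2 cos t + 1 or 2 cos t - 1
u = np.array([0.0, 1.0, 0.0])               # any vector not parallel to v would do
u = u - np.dot(u, v) * v / np.dot(v, v)     # make u perpendicular to the axis
sign = np.linalg.det(np.column_stack([u, A @ u, v]))
print("det:", round(d, 3), " axis direction:", np.round(v / np.abs(v).max(), 3))
print("angle (degrees):", round(np.degrees(np.arccos(cos_t)), 2),
      "anticlockwise" if sign > 0 else "clockwise")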
MA212 Further Mathematical Methods
Lecture 8: Complex vector spaces

Dr James Ward

⌅ Complex numbers
⌅ Complex vector spaces
⌅ Complex inner product spaces

The London School of Economics and Political Science


Department of Mathematics

Lecture 8, page 1
Complex numbers

Recall that the set of complex numbers is

C = { a + ib | a, b real }
where i = sqrt(-1) so that i^2 = -1.
Complex numbers can be added and multiplied in the usual way, i.e. for real a, b, c, d we have
(a + ib) + (c + id) = (a + c) + i(b + d),
and
(a + ib)(c + id) = ac + iad + ibc + i^2 bd = (ac - bd) + i(ad + bc).

Lecture 8, page 3
Complex conjugate and modulus

With a, b real, we define the
⌅ complex conjugate of z = a + ib to be z-bar = a - ib.
⌅ modulus of z = a + ib to be |z| = sqrt(a^2 + b^2).
Note, in particular, that the modulus is always a real number.
These two concepts are related since, for a, b real, we have
z z-bar = (a + ib)(a - ib) = a^2 - i^2 b^2 = a^2 + b^2 = |z|^2.
They also give us division since, if z != 0, we can take
1/z = 1/(a + ib) = (a - ib)/((a + ib)(a - ib)) = (a - ib)/(a^2 + b^2) = z-bar/|z|^2,
for a, b not both zero. This gives us w/z = w z-bar / |z|^2.

Lecture 8, page 4
Properties of complex conjugates and moduli

If w, z are complex, the complex conjugate has the following properties:
the conjugate of w + z is w-bar + z-bar, the conjugate of wz is w-bar z-bar, and the conjugate of w/z is w-bar/z-bar if z != 0.
If w, z are complex, the modulus has the following properties:
|wz| = |w| |z|   and   |w/z| = |w|/|z| if z != 0.

This should all be familiar to you from MA100 and so we will not
dwell on it here.

Lecture 8, page 5
Complex vector spaces

A complex vector space is the same as a real vector space except


the scalars are now complex numbers!
Our most important example of a complex vector space is
C^n = { (z1, z2, . . . , zn)^t | z1, z2, . . . , zn complex }.
In this case, vector addition and scalar multiplication are defined in the same way as they are in R^n but now, of course, the scalars are complex numbers.

Lecture 8, page 6
What’s the same in complex vector spaces?

Much of what you’ve seen in real vector spaces (where we have


real vectors and matrices), also holds in complex vector spaces
(where we have complex vectors and matrices). For example...

⌅ Assessing linear independence and dependence.


⌅ What we mean by a subspace and how to find a basis.
⌅ Doing row operations and solving linear equations.
⌅ Finding ranges (and ranks) and null spaces (and nullities).
⌅ Finding determinants and inverses.

The only change is that our scalars (i.e. what we can multiply by
and what we can have as the entries of a vector or matrix) are now
complex numbers.

Lecture 8, page 7
Example

Solve the equation
( 1   0     i   ) (z1)   ( 2 + i  )
( i   1   1 + i ) (z2) = ( 3 - 2i )
( 0   i     1   ) (z3)   (   2i   ).
(Handwritten working, cleaned up: form the augmented matrix and row-reduce it, using complex arithmetic, to reduced row echelon form. One variable turns out to be free, so set z3 = t for a complex number t; the remaining equations then give z1 and z2 in terms of t, and the general solution is written as a fixed complex vector plus t times a fixed complex vector.)
Lecture 8, page 8
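NumPy handles all of this complex arithmetic directly. A small illustrative sketch (the matrix below is a made-up non-singular example, not the one from the slide):

import numpy as np

# a 2x2 complex system Bz = c, chosen only to illustrate complex arithmetic
B = np.array([[1, 1j],
              [2, 3 ]], dtype=complex)
c = np.array([1 + 1j, 2], dtype=complex)

z = np.linalg.solve(B, c)
print(z)                       # the complex solution vector
print(np.allclose(B @ z, c))   # True: it really does solve the system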
Complex conjugates of complex vectors and matrices

However, with complex vectors and matrices comes a new


operation, complex conjugation.
The complex conjugate of a vector z = (z1, z2, z3)^t in C^3 is, unsurprisingly, z-bar = (z1-bar, z2-bar, z3-bar)^t, and the complex conjugate of a 2 x 3 complex matrix
A = ( a11  a12  a13 )   is, unsurprisingly,   A-bar = ( a11-bar  a12-bar  a13-bar )
    ( a21  a22  a23 )                                 ( a21-bar  a22-bar  a23-bar ),

These are easily generalised for vectors in Cn for any n 2 N and


any m ⇥ n matrix.

Lecture 8, page 9
Complex conjugates of complex determinants

Theorem: If A is an n x n complex matrix, then conj(|A|) = |A-bar|.
That is, the complex conjugate of the determinant of A is the determinant of the complex conjugate of A.
Proof: We illustrate this for a 2 x 2 complex matrix.
conj(|A|) = conj(a11 a22 - a12 a21) = a11-bar a22-bar - a12-bar a21-bar = |A-bar|.

This proof obviously generalises for n ⇥ n matrices but gets very


tedious!

Lecture 8, page 10
What’s not the same in complex vector spaces?

One thing that certainly changes is that, in complex vector spaces,


inner products are di↵erent. We’ll look at this now.
And this will lead on to some other routine calculations where we
must take a bit more care when we have a complex vector space.
But, looking forward, what we consider to be a ‘special’ matrix
changes when we allow them to have complex entries. We’ll look
at ‘special’ complex matrices and what makes them ‘special’ in the
next lecture.

Lecture 8, page 11
Complex inner product spaces

Definition: Suppose that V is a complex vector space.


An inner product on V is a function <., .> : V x V -> C that, for all u, v, w in V and complex alpha, beta, satisfies
(i) <alpha u + beta v, w> = alpha <u, w> + beta <v, w>;
(ii) <u, v> = conj(<v, u>);
(iii) <u, u> >= 0 and <u, u> = 0 if and only if u = 0.
A complex vector space with an inner product is called a
complex inner product space.

Note: Unlike in the real case, (ii) is now called complex


conjugate symmetry and, even though u is a complex vector,
hu, ui is still a nonnegative real number.

Lecture 8, page 12
Complex conjugate linearity in the second argument

Theorem: Suppose that V is a complex inner product space.


For all u, v , w 2 V and ↵, 2 C, we have

hu, ↵v + w i = ↵ hu, v i + hu, w i .

Proof: Linearity in the first argument gives us


h↵v + w , ui = ↵ hv , ui + hw , ui ,
and, applying complex conjugate symmetry, we get
hu, ↵v + w i = ↵hu, v i + hu, w i,
which means that, taking the complex conjugate of both sides,
hu, ↵v + w i = ↵ hu, v i + hu, w i ,
as required.
Lecture 8, page 13
Complex inner products are not bilinear

Note: As complex inner products are complex conjugate linear


(and not linear) in the second argument, they are not bilinear.

Note: Just to be clear, complex conjugate symmetry means that


for any vector u in a complex inner product space

hu, ui = hu, ui,

i.e. hu, ui is always a real number. This is necessary for the


hu, ui 0 part of positivity to make sense.

Lecture 8, page 14
Example

Show that the dot product in C^2, i.e.
(u1, u2)^t . (v1, v2)^t = u1 conj(v1) + u2 conj(v2),
is an inner product on C^2.
(i) For u, v, w in C^2 and complex alpha, beta,
<alpha u + beta v, w> = (alpha u1 + beta v1) conj(w1) + (alpha u2 + beta v2) conj(w2)
  = alpha (u1 conj(w1) + u2 conj(w2)) + beta (v1 conj(w1) + v2 conj(w2)) = alpha <u, w> + beta <v, w>.
(ii) conj(<v, u>) = conj(v1 conj(u1) + v2 conj(u2)) = u1 conj(v1) + u2 conj(v2) = <u, v>.
(iii) <u, u> = u1 conj(u1) + u2 conj(u2) = |u1|^2 + |u2|^2 >= 0, and this is zero if and only if u1 = u2 = 0, i.e. if and only if u = 0.
So the dot product is indeed an inner product on C^2.
Lecture 8, page 15
Two di↵erences between real and complex dot products

Looking at the dot product in C2 , we can see that


⌅ It is not symmetric as
u . v = u1 conj(v1) + u2 conj(v2)   and   v . u = v1 conj(u1) + v2 conj(u2)
are not equal, but it is complex conjugate symmetric.
⌅ It is not linear in the second argument as
u . (alpha v) = u1 conj(alpha v1) + u2 conj(alpha v2) = conj(alpha)(u1 conj(v1) + u2 conj(v2))
which is not equal to
alpha (u . v) = alpha (u1 conj(v1) + u2 conj(v2)),
but it is complex conjugate linear in the second argument.
Lecture 8, page 16
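In NumPy, np.vdot(a, b) computes the sum of conj(a_k) b_k, i.e. it conjugates its first argument; so with this course's convention u . v = sum of u_k conj(v_k), we have u . v = np.vdot(v, u). A small sketch (illustrative only, with made-up vectors) showing both differences above:

import numpy as np

u = np.array([1 + 2j, 3j])
v = np.array([2 - 1j, 1 + 1j])

def dot(u, v):
    # the course's complex dot product: u . v = sum_k u_k * conj(v_k)
    return np.vdot(v, u)

print(dot(u, v), dot(v, u))                                 # not equal ...
print(np.isclose(dot(u, v), np.conj(dot(v, u))))            # ... but conjugates of each other
alpha = 2 + 1j
print(np.isclose(dot(u, alpha * v), np.conj(alpha) * dot(u, v)))   # conjugate linear in the 2nd slot
print(np.isclose(dot(alpha * u, v), alpha * dot(u, v)))            # linear in the 1st slot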
Example
!
Is <u, v> = u^t ( 2      1 - i ) v-bar an inner product on C^2?
                ( 1 + i    2   )
Writing A for the matrix above, so that <u, v> = u^t A v-bar, the handwritten working (cleaned up) runs as follows.
(i) For u, v, w in C^2 and complex alpha, beta,
<alpha u + beta v, w> = (alpha u + beta v)^t A w-bar = alpha u^t A w-bar + beta v^t A w-bar = alpha <u, w> + beta <v, w>,
so the form is linear in the first argument.
(ii) For conjugate symmetry, note that <v, u> is a 1 x 1 matrix, so it equals its own transpose, and that A* = A (A is Hermitian). Hence
conj(<v, u>) = v-bar^t A-bar u = (v-bar^t A-bar u)^t = u^t (A-bar)^t v-bar = u^t A* v-bar = u^t A v-bar = <u, v>.
(iii) For positivity,
<u, u> = 2|u1|^2 + (1 - i) u1 conj(u2) + (1 + i) u2 conj(u1) + 2|u2|^2 = 2|u1|^2 + 2 Re((1 - i) u1 conj(u2)) + 2|u2|^2,
which is real; completing the square (or noting that the Hermitian matrix A has positive eigenvalues 2 +- sqrt(2)) shows that <u, u> >= 0, with <u, u> = 0 if and only if u1 = u2 = 0, i.e. u = 0.
Therefore, this is indeed an inner product on C^2.
Lecture 8, page 17
What do we keep from real inner product spaces?

When we move from real to complex inner product spaces, some


things do still work though...

⌅ As we still have positivity, the norm of a vector is defined in


exactly the same way.
This means that unit vectors are defined in the same way.
⌅ We also use the same definition of orthogonal.
Thus orthonormal sets of vectors are defined in the same way.
⌅ Perhaps surprisingly, the Cauchy-Schwarz inequality, the
generalised theorem of Pythagoras and the triangle
inequality also continue to hold. (See Exercises 15.)

Lecture 8, page 18
Example

Show that the set of vectors {v1 , v2 } where


v1 = (1/sqrt(2))(1, i, 0)^t   and   v2 = (1/sqrt(3))(1, -i, 1)^t
is orthonormal when using the dot product in C^3.
We have ||v1||^2 = v1 . v1 = (1/2)(|1|^2 + |i|^2 + 0) = 1 and ||v2||^2 = v2 . v2 = (1/3)(|1|^2 + |-i|^2 + |1|^2) = 1, so both are unit vectors.
Also,
v1 . v2 = (1/sqrt(6))(1 . conj(1) + i . conj(-i) + 0 . conj(1)) = (1/sqrt(6))(1 + i . i) = (1/sqrt(6))(1 - 1) = 0,
so v1 and v2 are orthogonal. Hence {v1, v2} is an orthonormal set.
Lecture 8, page 19
Gram-Schmidt orthogonalisation still works...

This works in the same way, i.e. we follow the procedure


Step 1: Take ŵ1 to be the vector
u1
ŵ1 = .
ku1 k

Step 2: Take ŵ2 to be the vector


w2
ŵ2 = where w2 = u2 hu2 , ŵ1 i ŵ1 .
kw2 k

Step 3: Take ŵ3 to be the vector


w3
ŵ3 = where w 3 = u3 hu3 , ŵ1 i ŵ1 hu3 , ŵ2 i ŵ2 .
kw3 k

Step . . . And so on, until we have used all of the original vectors .

Lecture 8, page 20
...BUT we must take care with the inner products!

For instance, the vectors w1-hat and w2 are orthogonal since
<w2, w1-hat> = <u2 - <u2, w1-hat> w1-hat, w1-hat> = <u2, w1-hat> - <u2, w1-hat> <w1-hat, w1-hat> = 0.
BUT if we INCORRECTLY used, say,
w2 = u2 - <w1-hat, u2> w1-hat,
the vectors w1-hat and w2 are not orthogonal since
<w2, w1-hat> = <u2 - <w1-hat, u2> w1-hat, w1-hat> = <u2, w1-hat> - <w1-hat, u2> <w1-hat, w1-hat> != 0
as, with complex conjugate symmetry, <w1-hat, u2> = conj(<u2, w1-hat>) which need not be the same as <u2, w1-hat>!

Lecture 8, page 21
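A sketch of the procedure in Python/NumPy (illustrative only), being careful to put the arguments of the inner product in the right order: here <a, b> = sum of a_k conj(b_k), which is np.vdot(b, a).

import numpy as np

def ip(a, b):
    # <a, b> = sum_k a_k * conj(b_k)  (linear in a, conjugate linear in b)
    return np.vdot(b, a)

def gram_schmidt(vectors):
    basis = []
    for u in vectors:
        w = u.astype(complex).copy()
        for b in basis:
            w -= ip(u, b) * b          # NOT ip(b, u): the order matters over C
        basis.append(w / np.sqrt(ip(w, w).real))
    return basis

v1 = np.array([1, 1j, 0]) / np.sqrt(2)
v2 = np.array([1, -1j, 1]) / np.sqrt(3)
u3 = np.array([0, 0, 1], dtype=complex)
b1, b2, b3 = gram_schmidt([v1, v2, u3])
print(np.round(b3 * np.sqrt(6), 3))            # a multiple of (-1, i, 2)
print(np.round([ip(b3, b1), ip(b3, b2)], 6))   # both ~0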
Example continued

Extend the orthonormal set of vectors {v1 , v2 } where


v1 = (1/sqrt(2))(1, i, 0)^t   and   v2 = (1/sqrt(3))(1, -i, 1)^t
to an orthonormal basis of C^3.
Gram-Schmidt: pick any u3 not in Lin{v1, v2}, say u3 = (0, 0, 1)^t. Then
w3 = u3 - <u3, v1> v1 - <u3, v2> v2 = (0, 0, 1)^t - 0 . v1 - (1/sqrt(3)) . (1/sqrt(3))(1, -i, 1)^t = (1/3)(-1, i, 2)^t,
and, since ||w3|| = sqrt(6)/3, normalising gives
v3 = (1/sqrt(6))(-1, i, 2)^t.
It is easy to check that v3 is orthogonal to both v1 and v2, so {v1, v2, v3} is an orthonormal basis of C^3.
Lecture 8, page 22
Extra example session (week 4), cleaned up.
1. What does the orthogonal matrix
A = (  0  -1  0 )
    ( -1   0  0 )
    (  0   0  1 )
represent? We have |A| = -1, so A is orientation reversing. The axis/normal v satisfies Av = -v, i.e. (A + I_3)v = 0, which gives v = (1, 1, 0)^t. Picking u = (0, 0, 1)^t perpendicular to v, we get Au = (0, 0, 1)^t = u, so cos t = 1 and t = 0. Hence A represents a reflection in the plane through the origin with normal vector (1, 1, 0)^t (and no rotation).
Note: the determinant test det(u Au v) is positive for anticlockwise and negative for clockwise rotations, but if t = 0 then Au = u and the determinant is 0, so the rotation is neither clockwise nor anticlockwise; the same happens when t = pi, since then Au = -u.
2. Find the matrix that represents a reflection in the plane through the origin with normal (1, 1, 0)^t together with a rotation by pi/4 clockwise about this vector.
Take v = (1/sqrt(2))(1, 1, 0)^t, u1 = (0, 0, 1)^t and u2 = (1/sqrt(2))(1, -1, 0)^t, and check that det(u1 u2 v) = +1 so that the ordered basis (u1, u2, v) is left-handed. Relative to this basis the transformation is
A_T^{B,B} = ( cos(-pi/4)  -sin(-pi/4)   0 )
            ( sin(-pi/4)   cos(-pi/4)   0 )
            (     0            0       -1 )
(the negative angle gives the clockwise rotation and the -1 accounts for the reflection), and then A = M_B A_T^{B,B} M_B^t with M_B = (u1 u2 v).
MA212: Further Mathematical Methods (Linear Algebra) 2020–21
Exercises 14: Orthogonal matrices

For these and all other exercises on this course, you must show all your working.


1. Let u = (a, b, c)^t be a vector in R^3 with a, c != 0. Show that v = (b, -a, 0)^t is orthogonal to u, and that there is a value of x such that w = (a, b, x)^t is orthogonal to both u and v. Find a formula for x. (Handwritten note: this can be a useful conclusion to help find orthonormal vectors.)
2. (a) Find the orthogonal matrix A representing a rotation through an angle pi/3 anticlockwise about an axis in the direction (-1, 1, 1)^t.
(b) Find the orthogonal matrix C representing a reflection in the plane through the origin with normal (-1, 1, 1)^t.
3. Consider the orthogonal matrices
A = (  2/3   2/3  -1/3 )          B = ( 3/5  -4/5  0 )
    ( -1/3   2/3   2/3 )   and        (  0     0   1 )
    (  2/3  -1/3   2/3 )              ( 4/5   3/5  0 ).

(a) Are A and B orientation-preserving or orientation-reversing?


(b) Describe the transformation represented by the matrix A.
(c) Describe the transformation represented by the matrix B.
(d) Without doing any further calculations, answer the following.

(i) Why are AB and BA orientation-reversing orthogonal matrices?

(ii) Why are the angles of rotation of AB and BA the same? (An earlier exercise should help.)

(iii) Why would it be incorrect to conclude that the axes of rotation of AB and BA are the same?

4. Show that, for x and y in any real inner product space,

<x, y> = (1/4) ||x + y||^2 - (1/4) ||x - y||^2.

(You should not assume that we are dealing with any specific inner product!)
(Handwritten attempts at Exercises 14, cleaned up.)
1. Given u = (a, b, c)^t, v = (b, -a, 0)^t and w = (a, b, x)^t: u . v = ab - ba + 0 = 0, so v is orthogonal to u, and w . v = ab - ab + 0 = 0 automatically. For w to be orthogonal to u we need a^2 + b^2 + xc = 0, so (since c != 0) x = -(a^2 + b^2)/c makes w orthogonal to both u and v.
2. (a) A unit vector in the direction of the axis is v = (1/sqrt(3))(-1, 1, 1)^t. Using Question 1, two orthonormal vectors completing the basis are u1 = (1/sqrt(2))(1, 1, 0)^t and u2 = (1/sqrt(6))(-1, 1, -2)^t. Check that det(u1 u2 v) = +1, so (u1, u2, v) is left-handed. Relative to this basis the rotation is
A_T^{B,B} = ( cos(pi/3)  -sin(pi/3)  0 )
            ( sin(pi/3)   cos(pi/3)  0 )
            (     0           0      1 ),
and A = M_B A_T^{B,B} M_B^{-1} = M_B A_T^{B,B} M_B^t.
(b) For the reflection, the same basis can be used with
C_T^{B,B} = ( 1  0   0 )
            ( 0  1   0 )
            ( 0  0  -1 ),
and C = M_B C_T^{B,B} M_B^t.
3. (a) det(A) = +1, so A is orientation preserving; det(B) = -1, so B is orientation reversing.
(b) A is a pure rotation. Its axis v satisfies Av = v, i.e. (A - I)v = 0, giving v = (1, 1, 1)^t. Choosing u perpendicular to v and computing det(u Au v) gives a negative value, so the rotation is clockwise. From Tr(A) = 2 cos t + 1, i.e. 2 = 2 cos t + 1, we get cos t = 1/2 and t = pi/3. Overall, A represents a clockwise rotation by pi/3 about the axis (1, 1, 1)^t.
(c) B includes both a rotation and a reflection. Its axis w satisfies Bw = -w, i.e. (B + I)w = 0, giving w = (1, 2, -2)^t. Choosing u perpendicular to w, the determinant test is again negative, so the rotation is clockwise, and Tr(B) = 2 cos t - 1, i.e. 3/5 = 2 cos t - 1, gives cos t = 4/5 and t = arccos(4/5). Overall, B represents a reflection in the plane whose normal is (1, 2, -2)^t together with a clockwise rotation by arccos(4/5) about that axis.
(d) (i) AB and BA are products of orthogonal matrices, so orthogonal, and det(AB) = det(BA) = det(A) det(B) = -1, so both are orientation reversing. (ii) Since Tr(AB) = Tr(BA), the angle of rotation is the same. (iii) If ABv = -v then, multiplying through by B, we get (BA)(Bv) = -Bv, so the axis of BA is Bv; there is no reason to have v = Bv, so the axes need not be the same.
4. (1/4)||x + y||^2 - (1/4)||x - y||^2 = (1/4)(<x + y, x + y> - <x - y, x - y>) = (1/4)(<x, x> + 2<x, y> + <y, y> - <x, x> + 2<x, y> - <y, y>) = (1/4)(4<x, y>) = <x, y>.
MA212: Further Mathematical Methods (Linear Algebra) 2020–21
Solutions to Exercises 14: Orthogonal matrices

This document contains answers to the exercises. There may well be some errors; please do let me
know if you spot any.

1. In general, a vector orthogonal to u = (a, b, c)^t is v = (b, -a, 0)^t, since u . v = ab - ba = 0.
A vector orthogonal to both u and v will be of the form (a, b, x)^t, for some value of x – any such vector is automatically orthogonal to v. For it to be orthogonal to u, we need a^2 + b^2 + cx = 0, and so x = -(a^2 + b^2)/c.
2. To find A, the method is to change to an orthonormal basis in which (1/sqrt(3))(-1, 1, 1)^t is the third element v3. The previous question helps us to complete this to an orthonormal basis very quickly: a unit vector orthogonal to v3 is v1 = (1/sqrt(2))(1, 1, 0)^t. A unit vector orthogonal to both v1 and v3 is v2 = (1/sqrt(6))(-1, 1, -2)^t. We want to list these vectors as an ordered basis B = (v1, v2, v3) in such a way that the third vector is pointing up out of the plane in which the second is anticlockwise of the first. This amounts to checking that
det(M_B) = (1/6) det ( 1  -1  -1 )
                     ( 1   1   1 )
                     ( 0  -2   1 )  > 0,
which is correct (if it hadn't been, we should swap v1 and v2). Thus, we take
M_B = ( 1/sqrt(2)  -1/sqrt(6)  -1/sqrt(3) )
      ( 1/sqrt(2)   1/sqrt(6)   1/sqrt(3) )
      (    0       -2/sqrt(6)   1/sqrt(3) )
so that M_B^{-1} = M_B^t.
The matrix A_T^{B,B} representing this linear transformation T with respect to the basis B is
( 1/2        -sqrt(3)/2  0 )
( sqrt(3)/2   1/2        0 )
(  0           0         1 ),
since the matrix representing rotation through pi/3 in the anticlockwise (positive) direction in R^2 is
( cos(pi/3)  -sin(pi/3) )
( sin(pi/3)   cos(pi/3) ).
We can now see that
A = A_T = M_B A_T^{B,B} M_B^t = (  2/3  -2/3  1/3 )
                                (  1/3   2/3  2/3 )
                                ( -2/3  -1/3  2/3 ).
This is a product of orthogonal matrices, and you can see that it is indeed orthogonal. You can also see that it maps (-1, 1, 1)^t to itself, as it should.
For the second matrix C, representing the reflection S in the plane normal to (-1, 1, 1)^t, the easiest approach is to note that
A_S^{B,B} = ( 1  0   0 )
            ( 0  1   0 )
            ( 0  0  -1 )
as the vector v3 is mapped to -v3 whereas v1 and v2, as they lie in the plane of reflection, are fixed. Thus
C = A_S = M_B A_S^{B,B} M_B^t = ( 1/3   2/3   2/3 )
                                ( 2/3   1/3  -2/3 )
                                ( 2/3  -2/3   1/3 ).
Note: do you have to make the axis the third vector in the ordered basis? No, but the transformation matrix A_S^{B,B} depends on which of the three vectors in the basis B is the axis. If you are dealing with a rotation, and you want to represent the rotation with respect to a basis using the matrix
( cos t  -sin t  0 )
( sin t   cos t  0 )
(   0       0    1 )
then the axis vector has to be the third vector of the basis. For this question as a whole, there are some clear advantages to using the same change-of-basis matrix (i.e. represent the transformation with respect to the same basis) for both parts.
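The change-of-basis computation above is easy to reproduce numerically. A Python/NumPy sketch (illustrative only, using the basis vectors and matrices from this answer):

import numpy as np

v1 = np.array([1, 1, 0]) / np.sqrt(2)
v2 = np.array([-1, 1, -2]) / np.sqrt(6)
v3 = np.array([-1, 1, 1]) / np.sqrt(3)
M = np.column_stack([v1, v2, v3])

c, s = np.cos(np.pi / 3), np.sin(np.pi / 3)
B_rot = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])
B_ref = np.diag([1.0, 1.0, -1.0])

A = M @ B_rot @ M.T
C = M @ B_ref @ M.T
print(np.round(A * 3))                            # compare with the matrix above (times 3)
print(np.round(C * 3))
print(np.allclose(A @ [-1, 1, 1], [-1, 1, 1]))    # the axis is fixed by the rotation A
print(np.allclose(C @ [-1, 1, 1], [1, -1, -1]))   # and reversed by the reflection C
print(np.allclose(A.T @ A, np.eye(3)), np.allclose(C.T @ C, np.eye(3)))   # both orthogonal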

3. (a) The determinant of the orthogonal matrix A is +1, so this matrix is orientation-preserving. The determinant of B is -1, so it is orientation-reversing.
(b) To find the axis of rotation of A, we look for the eigenvector corresponding to eigenvalue 1. We have
A - I = ( -1/3   2/3  -1/3 )
        ( -1/3  -1/3   2/3 )
        (  2/3  -1/3  -1/3 ),
and it's perfectly legitimate to say that x = (1, 1, 1)^t is in the null space "by inspection": perhaps you spotted that adding the three columns together gives the zero vector.
One way to find the angle of rotation is to find a vector y perpendicular to the axis x, and find the angle between y and Ay. For instance, the vector y = (1, 0, -1)^t is perpendicular to (1, 1, 1), and Ay = (1, -1, 0)^t. If t denotes the angle of rotation, then we have
cos t = y . (Ay) / (||y|| ||Ay||) = 1/(sqrt(2) . sqrt(2)) = 1/2.
(By the way, we have ||Ay|| = ||y|| as A is orthogonal.) Thus the angle of rotation is pi/3.
Another approach is to use the fact that the matrix representing a rotation by t with respect to standard co-ordinates is
( cos t  -sin t  0 )
( sin t   cos t  0 )
(   0       0   1 ),
which has trace (sum of the diagonal entries) 1 + 2 cos t, and trace is preserved under similarity. So the trace of A, which is equal to 2, is also equal to 1 + 2 cos t; this again gives t = pi/3.
To determine whether the rotation is clockwise or anticlockwise, we look at the determinant
det( y  Ay  x ) = det (  1   1  1 )
                      (  0  -1  1 )
                      ( -1   0  1 ).
Then, as this gives us a determinant of -3, which is negative, we can see that we have a clockwise rotation through an angle of pi/3 about the axis (1, 1, 1)^t.
It is also correct to say that the matrix represents an anticlockwise rotation through an angle of pi/3 radians about the axis (-1, -1, -1)^t.
(c) To find the axis of rotation of B, we look for the eigenvector corresponding to eigenvalue -1. We have
B + I = ( 8/5  -4/5  0 )
        (  0     1   1 )
        ( 4/5   3/5  1 ),
and it's perfectly legitimate to say that x = (1, 2, -2)^t is in the null space "by inspection": perhaps you spotted that c1 + 2c2 - 2c3 = 0.
One way to find the angle of rotation is to find a vector y perpendicular to the axis x, and find the angle between y and By. For instance, the vector y = (0, 1, 1)^t is perpendicular to (1, 2, -2), and By = (-4/5, 1, 3/5)^t. If t denotes the angle of rotation, then we have
cos t = y . (By) / (||y|| ||By||) = (8/5)/(sqrt(2) . sqrt(2)) = 4/5.
(By the way, we have ||By|| = ||y|| as B is orthogonal.) Thus the angle of rotation is cos^{-1}(4/5), which is approximately 0.644 radians (or 36.9 degrees).
Another approach is to use the fact that the matrix representing a rotation by t combined with a reflection, with respect to standard co-ordinates, is
( cos t  -sin t   0 )
( sin t   cos t   0 )
(   0       0    -1 ),
which has trace (sum of the diagonal entries) -1 + 2 cos t, and trace is preserved under similarity. So the trace of B, which is equal to 3/5, is also equal to -1 + 2 cos t; this again gives cos t = 4/5.
To determine whether the rotation is clockwise or anticlockwise, we look at the determinant
det( y  By  x ) = det ( 0  -4/5   1 )
                      ( 1    1    2 )
                      ( 1   3/5  -2 ).
Then, as this gives us a determinant of -18/5, which is negative, we can see that we have a clockwise rotation through an angle of cos^{-1}(4/5) about the axis (1, 2, -2)^t combined with a reflection in the plane through the origin with this vector as its normal.
It is also correct to say that the matrix represents an anticlockwise rotation through an angle of cos^{-1}(4/5) radians about the axis (-1, -2, 2)^t combined with a reflection in the plane through the origin with this vector as its normal.
(d) For (i), both AB and BA are products of orthogonal matrices, so orthogonal. Their determinants are both equal to det(A) det(B) = -1, so they're both orientation-reversing.
For (ii), the answer to Question 2 from Exercises 13 tells us that AB and BA are similar. This means, in particular, that they have the same trace. We know that the trace of an orientation-reversing matrix with an angle of rotation t is -1 + 2 cos t, so AB and BA have the same angle of rotation t.
For (iii), if the axis of rotation of AB is v != 0, we have
ABv = -v.
However, multiplying through by B gives us
B(ABv) = -Bv, i.e. (BA)Bv = -Bv,
and so, as B is orthogonal, Bv != 0 is the axis for BA. There is, of course, no reason to suppose that v will give us the same direction as Bv as B will rotate and/or reflect the vector v.
4. All we have to do is expand out the right-hand side to see that

||x + y||^2 = <x + y, x + y> = ||x||^2 + 2<x, y> + ||y||^2,
and similarly
||x - y||^2 = <x - y, x - y> = ||x||^2 - 2<x, y> + ||y||^2,
so their difference is 4<x, y>, which is what we wanted.
It is worth noting that this proof works for any inner product, not just the dot product.
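Because the proof never uses a specific inner product, the identity can be spot-checked with any inner product at all. A small Python sketch (illustrative only, using <x, y> = x^t M y for a made-up symmetric positive definite M):

import numpy as np

rng = np.random.default_rng(0)
P = rng.normal(size=(3, 3))
M = P.T @ P + 3 * np.eye(3)          # a symmetric positive definite matrix

def ip(x, y):
    return x @ M @ y                 # a genuine (non-dot) real inner product

def norm(x):
    return np.sqrt(ip(x, x))

x, y = rng.normal(size=3), rng.normal(size=3)
lhs = ip(x, y)
rhs = 0.25 * norm(x + y)**2 - 0.25 * norm(x - y)**2
print(np.isclose(lhs, rhs))          # True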
MA212 Further Mathematical Methods
Lecture 9: Special complex matrices
Dr James Ward
⌅ Hermitian transposes
⌅ Hermitian matrices
⌅ Unitary matrices
⌅ Normal matrices
(Handwritten margin notes, cleaned up: in the real case we had <x, y> = <y, x> and, in R^n, <x, y> = y^t x; a symmetric matrix has A^t = A, an orthogonal matrix has A^t A = I_n, and <Ax, y> = <x, A^t y>.)
The London School of Economics and Political Science


Department of Mathematics

Lecture 9, page 1
The Hermitian transpose

Definition: The Hermitian transpose, A*, of a complex matrix A is given by
A* = (A-bar)^t.
That is, it is the transpose of the complex conjugate of A.
Note: As the order in which we complex conjugate and transpose doesn't matter, we also have
A* = conj(A^t).

Lecture 9, page 3
Hermitian transposes when A is real

Note: All real matrices are complex matrices where the imaginary
part of each entry just happens to be zero. That is,

if A is real, then A = A.

As such, the Hermitian transpose of a real matrix is just its


transpose, i.e.

if A is real, then A⇤ = (A)t = At .

But, of course, if A is a nonreal complex matrix this is not the case!


! !
1 i 1 1 i
For example: A = gives us A⇤ = .
1+i 2 i i 2+i
Here A⇤ is certainly not At !

Lecture 9, page 4
Some useful facts about Hermitian transposes

The Hermitian transpose of a complex matrix has similar


properties to the transpose of a real matrix.

Theorem: If A is a complex matrix, then
(A*)* = A,   (AB)* = B*A*   and   |A*| = conj(|A|).
Proof: If A and B are complex matrices, then
⌅ (A*)* = (conj(A*))^t = (A^t)^t = A,
⌅ (AB)* = (conj(AB))^t = (A-bar B-bar)^t = (B-bar)^t (A-bar)^t = B*A*, and
⌅ |A*| = |(A-bar)^t| = |A-bar| = conj(|A|).

Lecture 9, page 5
A useful fact about complex dot products

Hermitian transposes also give us a way of writing the complex dot


product as a matrix product because, for u, v 2 Cn , we have

u · v = v ⇤ u,

if we identify the single entry in the 1 x 1 matrix v*u with the complex number u . v.
E.g. Looking at the dot product in C^2, this works because
v*u = ( conj(v1)  conj(v2) ) (u1)  =  conj(v1) u1 + conj(v2) u2  =  u1 conj(v1) + u2 conj(v2)  =  u . v.
                             (u2)
Of course, in this case, we could not have used u*v instead as u*v = v . u = conj(u . v), which need not be the same as u . v!

Lecture 9, page 6
A further fact about complex dot products

Theorem: If A is a complex n ⇥ n matrix, then

x · (A⇤ y ) = (Ax ) · y

for all x , y 2 Cn .

Proof: If A is a complex n ⇥ n matrix and x , y 2 Cn , then

x · (A⇤ y ) = (A⇤ y )⇤ x = y ⇤ (A⇤ )⇤ x = y ⇤ Ax = (Ax ) · y .

Lecture 9, page 7
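This identity is easy to spot-check numerically. A small sketch with a random complex matrix (illustrative only; .conj().T is the Hermitian transpose, and since u . v = np.vdot(v, u), the two sides are written as below):

import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
x = rng.normal(size=3) + 1j * rng.normal(size=3)
y = rng.normal(size=3) + 1j * rng.normal(size=3)

A_star = A.conj().T                   # the Hermitian transpose A*
lhs = np.vdot(A_star @ y, x)          # x . (A* y)
rhs = np.vdot(y, A @ x)               # (A x) . y
print(np.isclose(lhs, rhs))           # True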
Hermitian matrices

Definition: A complex matrix A is Hermitian if A⇤ = A.


For example: The matrix A = (  1   i )   is Hermitian as A* = A.
                            ( -i   2 )
Note: For a complex matrix A to be Hermitian, its diagonal entries must be real (i.e. aii real) and its corresponding off-diagonal entries must be complex conjugates (i.e. aij = conj(aji) with i != j).
Note: Hermitian complex matrices are the complex analogue of symmetric real matrices.
In particular, every real (i.e. A-bar = A) symmetric (i.e. A^t = A) matrix is Hermitian as this gives us A* = (A-bar)^t = A^t = A.

Lecture 9, page 8
Properties of Hermitian matrices

Theorem: If A is an Hermitian matrix, then


(1) The eigenvalues of A are real.
(2) Eigenvectors of A corresponding to distinct eigenvalues
are orthogonal.

Proof: Suppose lambda and mu are eigenvalues of A with eigenvectors x != 0 and y != 0, so that Ax = lambda x and Ay = mu y.
Since A is Hermitian, x . (Ay) = (Ax) . y, and so
x . (mu y) = (lambda x) . y,  i.e.  conj(mu) (x . y) = lambda (x . y),  i.e.  (lambda - conj(mu))(x . y) = 0.
(1) Take mu = lambda and y = x: then (lambda - conj(lambda)) ||x||^2 = 0 and, as ||x||^2 != 0, we get lambda = conj(lambda), i.e. lambda is real.
(2) Take lambda != mu (both real, by (1)): then (lambda - mu)(x . y) = 0 and, as lambda - mu != 0, we get x . y = 0, i.e. x is orthogonal to y.
Lecture 9, page 9
Unitary matrices

Definition: A complex n ⇥ n matrix A is unitary if A⇤ A = In .

In particular, we can see straight away that

Theorem: If A is a unitary matrix, then |A| is modulus one.


In particular, A is invertible.

Proof: As A is unitary, it is a complex n x n matrix where
A*A = I_n, so |A*A| = |I_n|, so |A*||A| = 1, so conj(|A|) |A| = 1,
and conj(|A|) |A| is the square of the modulus of |A|. Thus, the modulus of |A| must be one.
Of course, as |A| != 0, this means that A is invertible.

Lecture 9, page 11
Other definitions of unitary matrices

When we are looking to see whether a matrix is unitary, it is useful


to have the following equivalent definitions.

Theorem: The following statements about a complex n ⇥ n


matrix A are equivalent.
(1) A is unitary, i.e. A*A = I_n.
(2) A* = A^{-1}.
(3) AA* = I_n.
(4) The columns of A form an orthonormal set of vectors.
Proof: To establish the equivalences, we'll show that
(1) <=> (2),   (2) <=> (3)   and   (1) <=> (4).

Lecture 9, page 12
Proof of (1) <=> (2), i.e. A*A = I_n <=> A* = A^{-1}
Suppose that A is a complex n x n matrix.
LTR: If A is unitary, then it is invertible and
A*A = I_n, so (A*A)A^{-1} = A^{-1}, so A*(AA^{-1}) = A^{-1},
which means that A* = A^{-1}.
RTL: If A* = A^{-1}, then
A*A = A^{-1}A = I_n,
which means that A is unitary.

Lecture 9, page 13
Proof of (2) <=> (3), i.e. A* = A^{-1} <=> AA* = I_n
Suppose that A is a complex n x n matrix.
LTR: If A* = A^{-1}, then
AA* = AA^{-1} = I_n,
as required.
RTL: If AA* = I_n, then A is invertible as
|AA*| = |I_n|, so |A||A*| = 1, so |A| conj(|A|) = 1, so |A| != 0.
Thus, we have
A^{-1}(AA*) = A^{-1}, so (A^{-1}A)A* = A^{-1},
which means that A* = A^{-1}.

Lecture 9, page 14
Proof of (1) <=> (4), i.e. A*A = I_n <=> columns of A are ON
Let the vectors x1, x2, . . . , xn in C^n be the columns of A.
Then the matrix product A*A has rows x1*, x2*, . . . , xn* multiplying the columns x1, x2, . . . , xn, so its (i, j) entry is xi* xj.
Thus, we see that A is unitary if and only if
A*A = I_n  <=>  xj* xi = 1 if i = j and 0 if i != j  <=>  xi . xj = 1 if i = j and 0 if i != j,
which happens if and only if {x1, x2, . . . , xn} is an orthonormal set, as required.

Lecture 9, page 15
Remarks about unitary matrices

(4) tells us that the columns of a unitary matrix will give us n


orthonormal vectors in Cn , i.e. its columns will form an
orthonormal basis of Cn .
Note: Unitary complex matrices are the complex analogue of
orthogonal real matrices.
In particular, every real (i.e. A-bar = A) orthogonal (i.e. A^t A = I_n) matrix is unitary as this gives us A*A = (A-bar)^t A = A^t A = I_n.
We could go on and show how unitary matrices can be
characterised in terms of dot products and norms (just like
orthogonal matrices can be) but we won’t!

Lecture 9, page 16
Example
Show that the matrix A = (1/sqrt(2)) ( 1   1 )   is unitary.
                                     ( i  -i )
Verify that its determinant, |A|, has a modulus of one.
We have
A*A = (1/2) ( 1  -i ) ( 1   1 ) = (1/2) ( 2  0 ) = I_2,
            ( 1   i ) ( i  -i )         ( 0  2 )
so A is unitary.
Also, |A| = (1/2)(-i - i) = -i, which has modulus |-i| = 1.
Lecture 9, page 17
A property of unitary matrices

Theorem: All the eigenvalues of a unitary matrix have a


modulus of one.

Proof: Suppose lambda is any eigenvalue of A with eigenvector x != 0, i.e. Ax = lambda x, and that A*A = I_n.
Then, on the one hand,
(Ax)*(Ax) = x*A*Ax = x* I_n x = x*x,
and, on the other hand,
(Ax)*(Ax) = (lambda x)*(lambda x) = conj(lambda) lambda x*x = |lambda|^2 x*x.
Hence (1 - |lambda|^2) x*x = 0 and, as x*x = ||x||^2 > 0 because x != 0, we get |lambda|^2 = 1, i.e. |lambda| = 1.

Lecture 9, page 18
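Both facts are easy to check numerically for the example A = (1/sqrt(2))[[1, 1], [i, -i]] from earlier. A small sketch:

import numpy as np

A = np.array([[1, 1], [1j, -1j]]) / np.sqrt(2)

print(np.allclose(A.conj().T @ A, np.eye(2)))    # A*A = I, so A is unitary
print(np.abs(np.linalg.det(A)))                  # |det A| = 1
print(np.abs(np.linalg.eigvals(A)))              # every eigenvalue has modulus 1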
Normal matrices

Definition: A complex n ⇥ n matrix is normal if A⇤ A = AA⇤ .

Note: All Hermitian matrices are normal as A⇤ = A means that


A⇤ A and AA⇤ are both equal to A2 .
Note: All unitary matrices are normal as A⇤ A and AA⇤ are both
equal to In .
Note: But there are normal matrices which are neither Hermitian
nor unitary as the next example shows.

Lecture 9, page 19
Example
Show that the matrix A = (  0  2 )   is normal.
                         ( -2  i )
Also show that A is neither Hermitian nor unitary.
We have A* = ( 0  -2 ),   and so
             ( 2  -i )
A*A = (  4  -2i )   and   AA* = (  4  -2i ),
      ( 2i    5 )               ( 2i    5 )
so A*A = AA* and A is normal.
However, A* != A, so A is not Hermitian, and A*A != I_2, so A is not unitary.
Lecture 9, page 20
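The three defining conditions translate directly into code. A small sketch (illustrative only) that checks them for the matrix in this example:

import numpy as np

def is_hermitian(A):
    return np.allclose(A.conj().T, A)

def is_unitary(A):
    return np.allclose(A.conj().T @ A, np.eye(A.shape[0]))

def is_normal(A):
    return np.allclose(A.conj().T @ A, A @ A.conj().T)

A = np.array([[0, 2], [-2, 1j]])
print(is_normal(A), is_hermitian(A), is_unitary(A))   # True, False, False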
A quick summary

⌅ A complex n ⇥ n matrix is
I Hermitian if A⇤ = A,
I unitary if A⇤ A = In ,
I normal if A⇤ A = AA⇤ .
⌅ A real Hermitian matrix is symmetric.
A real unitary matrix is orthogonal.
⌅ If we have an Hermitian matrix
I all the eigenvalues are real,
I distinct eigenvalues have orthogonal eigenvectors.
⌅ All the eigenvalues of a unitary matrix have a modulus of one.

Lecture 9, page 21
MA212 Further Mathematical Methods
Lecture 10: Diagonalisation

Dr James Ward

⌅ Eigenvalues, eigenvectors and eigenspaces


⌅ Diagonalisation
⌅ Unitary diagonalisation
⌅ Spectral decomposition

The London School of Economics and Political Science


Department of Mathematics

Lecture 10, page 1


Information

⌅ Exercises 15 is on Moodle.
I Attempt questions 1, 4, 6, 7 and 8.
I Follow your class teacher’s submission instructions.
I Do it! Actually submit your homework!

⌅ Extra Examples Sessions


I Start on Tuesday 12:00–13:00 on Zoom.
I Contact me via the Moodle forum if you want me to cover
anything in particular.

⌅ Classes
I Go to them.

Lecture 10, page 2


Eigenvalues

The eigenvalues of an n x n matrix A are the complex lambda that satisfy
Ax = lambda x
for some vector x != 0. We find these by solving the equation
|A - lambda I_n| = 0,
and if we find that this gives us
(lambda - lambda1)^{m1} (lambda - lambda2)^{m2} ... (lambda - lambdak)^{mk} = 0,
where m1 + m2 + ... + mk = n, the complex numbers lambdai are the eigenvalues.
We call mi the algebraic multiplicity, a_{lambdai}, of lambdai.

Lecture 10, page 3


Eigenvectors and eigenspaces

Given an eigenvalue, lambda, of A we can find an eigenvector by solving
(A - lambda I_n)x = 0,
for a vector x != 0. More generally, this gives us the following.
Definition: If lambda is an eigenvalue of A, its eigenspace E(A, lambda) is the null space of A - lambda I_n.
Note: All possible eigenvectors for lambda are in E(A, lambda) as
E(A, lambda) = N(A - lambda I_n) = {x | (A - lambda I_n)x = 0} = {x | Ax = lambda x}.
However, as E(A, lambda) is a subspace, we also have 0 in E(A, lambda) but, by definition, 0 is not an eigenvector.

Lecture 10, page 4


Algebraic and geometric multiplicities

The geometric multiplicity of lambda is g_lambda = dim E(A, lambda).
That is, the geometric multiplicity tells us how many linearly independent eigenvectors we can find for each eigenvalue.
And, although we won't prove it, you should note that
1 <= g_lambda <= a_lambda.
That is, if a_lambda is the algebraic multiplicity of lambda, then we can get at most a_lambda linearly independent eigenvectors for lambda.

Lecture 10, page 5


Example

Find the eigenvalues of the matrix


A = ( 4  0  -6 )
    ( 2  1   5 )
    ( 3  0  -5 )
and their multiplicities.
Solving |A - lambda I_3| = 0 gives (1 - lambda)(lambda^2 + lambda - 2) = 0, i.e. (lambda - 1)^2 (lambda + 2) = 0, so the eigenvalues are lambda1 = 1 with algebraic multiplicity a_1 = 2 and lambda2 = -2 with a_{-2} = 1.
For lambda = 1, the matrix A - I_3 has rank 2, so its null space has dimension 1 and we can only find one linearly independent eigenvector; that is, g_1 = 1 < 2 = a_1.
Lecture 10, page 6
Diagonalisation

Definition: A matrix A is diagonalisable if there is an


invertible matrix P such that P 1 AP = D where D is
a diagonal matrix.

Note: In MA100, you restricted your attention to the case where A


is real (with real eigenvalues and eigenvectors), but the method
you saw there works even if A is complex or if A is real but just
happens to have complex eigenvalues and eigenvectors.
We now ask: What do we need for A to be diagonalisable?

Lecture 10, page 7


A condition for diagonalisability

Theorem: An n ⇥ n matrix A is diagonalisable if and only if


its eigenvectors form a basis of Cn .

Proof: We’ll show this for the 2 ⇥ 2 case. The proof generalises.
The key here is that, if v1 and v2 are the columns of a matrix P,
0 1 0 1
B C B C
AP = A @v1 v2 A = @Av1 Av2 A

and, if D is a diagonal matrix with entries 1 and 2,


0 1 0 1
!
B C 1 0 B C
PD = @v1 v2 A =@ 1 v1 2 v2 A .
0 2

Lecture 10, page 8


LTR: If A is diagonalisable, then there is an invertible matrix P
such that P 1 AP = D where D is a diagonal matrix. Let v1 and
v2 be the columns of P and let 1 and 2 be the diagonal entries
of D. Now, P 1 AP = D gives us AP = PD, i.e. from above

Av1 = 1 v1 and Av2 = 2 v2 .

Consequently, as P is invertible, we have two linearly independent


eigenvectors v1 and v2 which form a basis of C2 , as required.
RTL: If the eigenvectors of A form a basis of C2 , then we have
two linearly independent vectors v1 and v2 such that

Av1 = 1 v1 and Av2 = 2 v2 .

As v1 and v2 are linearly independent, we can take them to be the


columns of an invertible matrix P giving us, from above,
AP = PD, or P 1 AP = D, where D is a diagonal matrix.

Lecture 10, page 9


Diagonalisability and multiplicities

A direct consequence of the last theorem is that, when we have


repeated eigenvalues (i.e. an algebraic multiplicity a > 1), they
must supply us with enough linearly independent eigenvectors (i.e.
have a geometric multiplicity g = a ) in order for us to form our
basis of Cn .

Corollary: An n ⇥ n matrix A is diagonalisable if and only if


the geometric multiplicity of each eigenvalue is equal to its
algebraic multiplicity.
0 1
4 0 6
B C
For example: The matrix A = @ 2 1 5A from earlier.
3 0 5
This is not diagonalisable as a1 = 2 ≠ 1 = g1.

Lecture 10, page 10
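A quick numerical sanity check of diagonalisation (my own illustration, not from the notes): NumPy's eig returns eigenvalues and eigenvectors, and stacking the eigenvectors as the columns of P gives a diagonal P⁻¹AP whenever they are linearly independent, even when the eigenvalues are complex.

import numpy as np

A = np.array([[0., -1.],
              [1.,  0.]])              # assumed example: a real matrix with eigenvalues +-i
evals, P = np.linalg.eig(A)            # columns of P are eigenvectors
D = np.linalg.inv(P) @ A @ P
print(np.round(D, 10))                 # diag(i, -i) up to rounding
print(np.allclose(D, np.diag(evals)))  # True: P^{-1} A P = D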


Something to note when diagonalising a real matrix

If A happens to be a real matrix (with complex eigenvalues and


eigenvectors), then we have some useful shortcuts!

Theorem: Suppose that A is a real n ⇥ n matrix.


If is an eigenvalue of A with eigenvector v , then is also
an eigenvalue of A and its eigenvector is v .

Proof: Suppose that A is a real n ⇥ n matrix.


If is an eigenvalue of A with eigenvector v 6= 0, then
Av = v =) Av = v =) Av = v.
But, as A is real, this gives us
Av = v,
and so is an eigenvalue of A with eigenvector v 6= 0, as required.
Lecture 10, page 11
Example
!
0 1
Diagonalise the matrix A = .
1 0

|A − λI₂| = 0 gives λ² + 1 = 0, so the eigenvalues are λ = ±i.

Solving (A − iI₂)x = 0 gives an eigenvector v for λ = i and, as A is real, the
complex conjugate of v is an eigenvector for λ = −i (no need to solve again).

So we can take P to have these two eigenvectors as its columns and then
P⁻¹AP = D = diag(i, −i).

Lecture 10, page 12


Unitary diagonalisation

Definition: A matrix A is unitarily diagonalisable if there is a


unitary matrix P such that P ⇤ AP = D where D is a diagonal
matrix.

Note: This is the complex analogue of what you saw in MA100


namely, when A is real (with real eigenvalues and eigenvectors), it
may be orthogonally diagonalisable.
We now ask: What do we need for A to be unitarily diagonalisable?

Lecture 10, page 13


A condition for unitary diagonalisability

For A to be diagonalisable, the columns of P must be linearly


independent eigenvectors of A so that P is invertible. Thus, if we
want P to be unitary, we now need these eigenvectors to be
orthonormal. That is...

Theorem: An n ⇥ n matrix A is unitarily diagonalisable if


and only if its eigenvectors form an orthonormal basis of Cn .

Proof: Repeat the proof we gave for diagonalisability on slides 8


and 9 replacing each occurrence of ‘linearly independent’ with
‘orthonormal’, each occurrence of ‘invertible’ with ‘unitary’ and
each occurrence of ‘P 1 ’ with ‘P ⇤ ’.

Lecture 10, page 14


What matrices are unitarily diagonalisable?

In MA100, you saw that a real matrix (with real eigenvalues and
eigenvectors) is orthogonally diagonalisable if and only if it is
symmetric.
We now see which matrices are unitarily diagonalisable.

Theorem: A matrix is unitarily diagonalisable if and only if


it is normal.

Proof: We will only prove this in the LTR direction. The RTL
direction requires theory that we aren’t going to cover in this
course.

Lecture 10, page 15


LTR: If A is unitarily diagonalisable, then there is a unitary matrix
P such that P ⇤ AP = D where D is diagonalisable. Thus, we have

A = PDP ⇤ and A⇤ = (PDP ⇤ )⇤ = (P ⇤ )⇤ D ⇤ P ⇤ = PD ⇤ P ⇤ .

This gives us

A⇤ A = (PD ⇤ P ⇤ )(PDP ⇤ ) = PD ⇤ In DP ⇤ = PD ⇤ DP ⇤

and

AA⇤ = (PDP ⇤ )(PD ⇤ P ⇤ ) = PDIn D ⇤ P ⇤ = PDD ⇤ P ⇤ .

Now, as D is diagonal, we have D ⇤ D = DD ⇤ (see Question 7 of


Exercises 15) and so A⇤ A = AA⇤ . Thus, A is normal, as required.

Lecture 10, page 16


Example
!
0 1
Unitarily diagonalise the matrix A = .
1 0

First check that A*A and AA* are both equal to I₂, so A is normal and hence
unitarily diagonalisable.

From the earlier example, the eigenvalues are λ₁ = i and λ₂ = −i, with the
eigenvector for −i the complex conjugate of the one for i.

Checking directly, the inner product of the two eigenvectors is 0, so they are
orthogonal, and each has norm √2, so dividing both by √2 gives an orthonormal
pair.

Taking these as the columns of P, P is unitary and P*AP = D = diag(i, −i).

Lecture 10, page 17


The identity matrix

Now we’ll need this result about the n ⇥ n identity matrix In .

Theorem: If {v1 , v2 , . . . , vn } is an orthonormal basis of Cn ,

In = v1 v1⇤ + v2 v2⇤ + · · · + vn vn⇤ .

Proof: We’ll show this for the n = 2 case. The proof generalises.
Let {v1 , v2 } be an orthonormal basis of C2 and let ⌃ be the sum
⌃ = v1 v1⇤ + v2 v2⇤ ,
we want to show that, for all 1  i, j  2, we have
vi⇤ (⌃ I2 )vj = 0 so that we can conclude ⌃ = I2
using what we saw in Question 11 of Exercises 13.

Lecture 10, page 18


So, for any 1  i, j  2, we have
vi⇤ (⌃ I2 )vj = vi⇤ (v1 v1⇤ + v2 v2⇤ I2 )vj
= vi⇤ v1 v1⇤ vj + vi⇤ v2 v2⇤ vj vi⇤ vj
= (v1 · vi )(vj · v1 ) + (v2 · vi )(vj · v2 ) vj · vi .
Now, using the orthonormality of the vi , we have the following.

⌅ If i = j, say they’re both 1, we have


v1⇤ (⌃ I2 )v1 = (1)(1) + (0)(0) 1=0
and similarly when they’re both 2.
⌅ If i 6= j, say i = 1 and j = 2, we have
v1⇤ (⌃ I2 )v2 = (1)(0) + (0)(1) 0=0
and similarly when i = 2 and j = 1.

Thus, for all 1  i, j  2, we have vi⇤ (⌃ I2 )vj = 0 as required.


Lecture 10, page 19
Spectral decompositions

Theorem: If A is a normal matrix, then

⇤ ⇤ ⇤
A= 1 v1 v1 + 2 v2 v2 + ··· + n vn vn

where {v1 , v2 , . . . , vn } is an orthonormal set of eigenvectors


corresponding to the eigenvalues i of A.
We call this the spectral decomposition of A.

Proof: As A is normal, we can find an orthonormal basis of Cn , say


{v1 , v2 , . . . , vn }, where each vi 6= 0 is an eigenvector of A so that

Avi = i vi ,

for some eigenvalue i of A.

Lecture 10, page 20


Thus, using the previous theorem, we have

A = AIn = A(v1 v1⇤ + v2 v2⇤ + · · · + vn vn⇤ )


= Av1 v1⇤ + Av2 v2⇤ + · · · + Avn vn⇤
⇤ ⇤ ⇤
= 1 v1 v1 + 2 v2 v2 + ··· + n vn vn ,

as required.
Note: The n ⇥ n matrices given by the vi vi⇤ are sometimes
denoted by Ei so that the spectral decomposition is

A= 1 E1 + 2 E2 + ··· + n En .

These matrices have some interesting properties and, if we have


time, we will talk about them again later in the course.

Lecture 10, page 21
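To make the formula concrete, here is a small NumPy sketch (my own illustration) that builds the spectral decomposition of a normal matrix from an orthonormal set of eigenvectors and checks that the rank-one pieces add back up to A; the matrix is an assumed Hermitian example.

import numpy as np

A = np.array([[2., 1.],
              [1., 2.]])               # assumed Hermitian (hence normal) example
evals, V = np.linalg.eigh(A)           # eigh returns an orthonormal set of eigenvectors
terms = [evals[i] * np.outer(V[:, i], V[:, i].conj()) for i in range(len(evals))]
print(np.allclose(sum(terms), A))      # True: A = sum_i lambda_i v_i v_i^*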


Example
!
0 1
Find the spectral decomposition of the matrix A = .
1 0
From the last example, A is normal with eigenvalues λ₁ = i and λ₂ = −i and
orthonormal eigenvectors v₁ and v₂, so it is unitarily diagonalisable and its
spectral decomposition is

A = λ₁ v₁v₁* + λ₂ v₂v₂* = i v₁v₁* − i v₂v₂*.

As a check, adding up the two rank-one matrices on the right recovers A.
Lecture 10, page 22


MA212: Further Mathematical Methods (Linear Algebra) 2020–21
Exercises 15: Complex inner products and matrices

For these and all other exercises on this course, you must show all your working.

1. Show that, for x and y in Cn ,
1 1 i i
hx , y i = kx + y k2 kx y k2 + kx + iy k2 kx iy k2 .
4 4 4 4

2. Given that the Cauchy-Schwarz inequality holds in any complex inner product space, prove
that the generalised theorem of Pythagoras and the triangle inequality also hold. That is,
(i) if x and y are orthogonal, then kx + y k2 = kx k2 + ky k2 .
(ii) for any pair x , y of vectors, kx + y k  kx k + ky k;
Hint: Follow the proofs given for the real case.

3. Find the rank of the complex matrix


0 1
i 0 1
M = @ 0 1 0A .
1 i i

Find bases for the range R(M ) and the nullspace N (M ) of M .

4. Find an orthonormal basis for the subspace of C3 spanned by the vectors (i, 0, 1)t and (1, 1, 0)t .

5. Consider the complex matrix


0 1
1 1 1
C = @2 0 1 + iA .
0 1+i 1

Find an orthonormal basis of N (C), and extend it to an orthonormal basis of C3 .


6. (a) Which, if any, of the following matrices are Hermitian? Which are unitary? What does
that tell you about their eigenvalues?
A = ( 2  −2i ; 2i  5 ),   B = ( 0  i ; −i  0 ),   C = ( 0  1 ; i  0 ).

(b) Find the eigenvalues and corresponding eigenvectors of each of A, B and C.


7. Show that all diagonal matrices are normal.
Which complex numbers can arise as an eigenvalue of a normal matrix?
8. Find all the triples (a, b, c) of real numbers for which the matrix

( 1/√2   i/√2 )
( a + ib   c  )

is (i) Hermitian, (ii) unitary and (iii) normal.


(Several pages of handwritten working for Exercises 15 appeared here; they have
not reproduced legibly in this copy. The typed solutions below cover the same
questions.)
MA212: Further Mathematical Methods (Linear Algebra) 2020–21
Solutions to Exercises 15: Complex inner products and matrices

This document contains answers to the exercises. There may well be some errors; please do let me
know if you spot any.

1. As in Question 4 from Exercises 14, we just expand out the various norms and add. The catch
here is that we need to be aware that the inner product (or dot product) in Cn is not symmetric as,
in general, hx , y i = hy , x i. Starting with the first term, we have

kx + y k2 = kx k2 + hx , y i + hy , x i + ky k2 .

and, at this point, we could observe that, for a complex number z, z + z = 2Re(z) so that we have

kx + y k2 = kx k2 + 2Rehx , y i + ky k2 .

However, it’s probably easiest to leave the expression for kx + y k in the original form. Similarly, we
have
kx y k2 = kx k2 hx , y i hy , x i + ky k2 ,
so
kx + y k2 kx y k2 = 2hx , y i + 2hy , x i.
Now we can apply this result for the pair x , iy to get

kx + iy k2 kx iy k2 = 2hx , iy i + 2hiy , x i = 2ihx , y i + 2ihy , x i.

so that, multiplying this by i yields 2hx , y i 2hy , x i. Thus, the whole right-hand side is equal to
4hx , y i, as required.
It is worth noting that these proofs work for any inner product, not just the dot product.

2. For (i), suppose x and y are orthogonal. As we saw in question 1, we have

kx + y k2 = kx k2 + 2Rehx , y i + ky k2

and, as hx , y i = 0, this is just kx k2 + ky k2 , as claimed.


For (ii), we need to show that
kx + y k2  (kx k + ky k)2 .
Expanding both sides, this is equivalent to

kx k2 + ky k2 + 2Rehx , y i  kx k2 + ky k2 + 2kx k ky k.

This does hold, since


Rehx , y i  |hx , y i|  kx k ky k,
using Cauchy-Schwarz at the end.

3. Using row operations to put M in RRE form we get


0 1 0 1 0 1 0 1
i 0 1 1 0 i 1 0 i 1 0 i
@ 0 1 0A R1 ! iR! 1 @
0 1 0 A R3 !R3 +R1
@
! 0 1 0A
R3 !R3 iR2
@
! 0 1 0 A.
1 i i 1 i i 0 i 0 0 0 0

The rank of the matrix M is therefore equal to 2. The first two columns in the RRE form have
leading 1s, so the first two columns of M form a basis of the range: {(i, 0, 1)t , (0, 1, i)t }.
We can read o↵ from the row-reduced matrix that the nullspace is spanned by the single vector
(i, 0, 1)t . ( A word of warning: in the real case, the nullspace is the set of vectors orthogonal to the
row-space. In the complex case, that isn’t right: in fact the nullspace is the set of complex conjugates
of vectors orthogonal to the row space. )

4. We normalise the first vector, setting v1 = (1/√2)(i, 0, 1)ᵗ.
Now we set

w2 = (1, 1, 0)ᵗ − ⟨(1, 1, 0)ᵗ, v1⟩ v1 = (1, 1, 0)ᵗ − (−i/√2) · (1/√2)(i, 0, 1)ᵗ
   = (1, 1, 0)ᵗ + (i/2)(i, 0, 1)ᵗ = ½ (1, 2, i)ᵗ.

You should check that this vector, (1, 2, i)t , really is orthogonal to (i, 0, 1)t .
To finish, we normalise this second vector w2, to get v2 = (1/√6)(1, 2, i)ᵗ.

Thus, the requested orthonormal basis is { (1/√2)(i, 0, 1)ᵗ, (1/√6)(1, 2, i)ᵗ }. This is not the only right answer.
Note: a surprisingly large number of students go on to find a third unit vector v3 orthogonal to both
v1 and v2 , thus obtaining an orthonormal basis of C3 extending the one required. If you want to do
this, you should make it clear in your answer that you’re just doing so for sheer enjoyment (and/or
more practice), and you should display clearly the answer to the question that was actually asked.

5. Subtracting two times the first row from the second leaves
0 1
1 1 1
@0 2 1 + iA .
0 1+i 1
One way or another, we recognise the second and the third row as multiples of one another: to me, it
seems natural to multiply the third row by 1 i at this point. So we eliminate the third row, divide
the second row by two, and add it to the first row, giving us the row-reduced matrix
✓ ◆
1 0 (1 + i)/2
.
0 1 ( 1 + i)/2

The vector v1 = (1 + i, 1 + i, 2)t is in the nullspace: we’ll leave it as it is and normalise it later.
(It’s convenient to multiply up by 2; in fact, v1 is a multiple of the slightly simpler vector (1, i, 1 i)t ,
and it would save a little work to use that vector instead.)
Rather than applying Gram-Schmidt, which is going to involve us in some awkward arithmetic,
let’s use the complex analogue of the method introduced in Question 1 of Exercises 14, to get
v2 = ( 1 i, 1 + i, 0)t and v3 = (1 + i, 1 + i, 2)t .
Finally, we normalise all our vectors to get
0 1 0 1 0 1
1+i 1 i 1+i
1 1 1
u1 = p @ 1 + iA , u2 = @ 1 + iA and u3 = p @ 1 + iA .
2 2 2 2 2
2 0 2

These vectors form an orthonormal basis of C3 which extends the orthonormal basis {u1 } of N (C).
There are plenty of other correct answers.

6. (a) By inspection, A and B are Hermitian: each diagonal entry has to be real, and each
o↵-diagonal entry satisfies aij = aji . (C is not Hermitian since c12 6= c21 .) That tells us that the
eigenvalues of A and B are real.
The matrices B and C are unitary: their columns are vectors of norm 1 that are orthogonal to each
other (while the columns of A clearly don’t have norm 1). That means that the eigenvalues of B and
C have modulus 1.
Combining the two tells us that the only possible eigenvalues of B are ±1.
(b) The eigenvalues of A are 1 and 6, with corresponding eigenvectors (2i, 1)t and (1, 2i)t respec-
tively.
The eigenvalues of B are 1 and 1, with corresponding eigenvectors (i, 1)t and (i, 1)t respectively.
The characteristic polynomial of C is x2 i, so the eigenvalues of C are the square roots of i. What
are these complex numbers? The best approach is to think about complex numbers z in the form
re^{iθ}, where r = |z| and θ is the argument of z. The square roots of z then have the formula ±r^{1/2} e^{iθ/2}.
In this case z = i has modulus 1 and argument π/2, so its “positive” square root has modulus 1 and
argument π/4: in other words (think about the Argand diagram) it is the multiple of (1 + i) that has
modulus 1, namely (1 + i)/√2.
The two eigenvalues, the square roots of i, are thus ±(1/√2)(1 + i). The corresponding eigenvectors can
be written as (1, ±(1/√2)(1 + i))ᵗ.

7. For a diagonal matrix D, DD⇤ is a diagonal matrix with entries given by di di = |di |2 . The
matrix D⇤ D is the same diagonal matrix, so D is normal.
A diagonal matrix has its diagonal entries as its eigenvalues, and every complex number can appear
as an entry on the diagonal of a diagonal matrix. So any complex number whatsoever can be an
eigenvalue of a normal matrix.
Knowing that A is Hermitian or unitary tells us a lot about its eigenvalues: knowing only that A is
normal tells us nothing.

8. (i) For a 2 ⇥ 2 matrix to be Hermitian, the entries on the diagonal have to be real, and
the off-diagonal entries have to be complex conjugates of each other. So c can be any real number,
whereas we must have a = 0 and b = −1/√2.
(ii) For the matrix to be unitary, the two rows have to have norm 1, and be orthogonal to each other.
Here the first row has already been fixed to have norm 1. To be orthogonal to the first row, the
second row must be a multiple of (i, 1) – if the second entry is 1, then the first entry x has to satisfy
0 = (x, 1)ᵗ · (1, i)ᵗ = x + ī = x − i, so x = i. We want the real multiples of this vector of norm 1, and
there are two of these, namely (1/√2)(i, 1) and −(1/√2)(i, 1).
So the answer is that a = 0, and either b = c = 1/√2 or b = c = −1/√2.
(iii) A matrix A is normal if AA⇤ = A⇤ A. Let’s write down what that means for this matrix A:
AA* = ( 1                      (1/√2)(a − ib + ic) )
      ( (1/√2)(a + ib − ic)    a² + b² + c²        )

and

A*A = ( 1/2 + a² + b²       i/2 + ac − ibc )
      ( −i/2 + ac + ibc     1/2 + c²       ).

Notice, by the way, that both AA⇤ and A⇤ A are Hermitian – you can easily check that this is always
the case, and we’ll be exploring this in lectures. This means that, if the top-right entries of the two
matrices are equal, then their bottom-left entries are also equal.
For the two matrices to be equal, we need, equating real and imaginary parts of all the entries,

a² + b² = 1/2,    a/√2 = ac    and    (c − b)/√2 = 1/2 − bc.
p
The second of these equations says that we must have either c = 1/√2 or a = 0.

• If c = 1/√2, then the third equation is automatically satisfied, so we may take any values of
a and b such that a² + b² = 1/2.

• If a = 0, then b = ±1/√2. If now b = −1/√2, then the third equation is satisfied for any
value of c. If b = +1/√2, then the third equation reduces to c = 1/√2, as in the other case, so
there are no new solutions.

In summary, there are two types of solutions. One is where a = 0, b = −1/√2 and c is any real
number – this is exactly the condition for the matrix to be Hermitian. The other is where c = 1/√2,
and a and b are any real numbers such that a² + b² = 1/2, so the matrix is of the form
(1/√2) ( 1  i ; z  1 ),
where z is a complex number of modulus 1. This includes the one example where the matrix is
unitary but not Hermitian (z = i).
We know that every matrix that is either Hermitian or unitary is also normal, so the matrices in (iii)
have to include all those in (i) or (ii).
MA212 Further Mathematical Methods
Lecture 11: Singular values

Dr James Ward

⌅ The matrices A⇤ A and AA⇤


⌅ Singular values
⌅ Singular values decompositions

The London School of Economics and Political Science


Department of Mathematics

Lecture 11, page 1


Context!

So far we have seen the following.

⌅ A square matrix has eigenvalues and eigenvectors.


⌅ With enough linearly independent eigenvectors, we can
diagonalise the matrix and use this in applications.
⌅ With enough orthonormal eigenvectors, we can even
unitarily diagonalise and find a spectral decomposition.

Two questions. What can we do if we have a matrix which is

⌅ not square? That is, it doesn’t even have eigenvalues and


eigenvectors! We’ll look at this today.
⌅ square but not diagonalisable? That is, we just don’t get
enough linearly independent eigenvectors. See Lecture 12.

Lecture 11, page 3


The matrix A⇤ A has nice properties even if A doesn’t!

If A is any complex matrix, especially if it is not square, we look at


A⇤ A because that has some nice properties.

Theorem: Suppose that A is a complex m ⇥ n matrix.


(1) A⇤ A is square and so it has eigenvalues.
(2) A⇤ A is Hermitian and so its eigenvalues are real.
(3) A⇤ A has nonnegative eigenvalues.

Proof: Suppose that A is a complex m ⇥ n matrix.


(1) As A is m ⇥ n, A⇤ A is n ⇥ n and so, as A is square, it has
eigenvalues.
(2) As (A⇤ A)⇤ = A⇤ (A⇤ )⇤ = A⇤ A, the matrix A⇤ A is Hermitian and
so, in particular, it has real eigenvalues.
Lecture 11, page 4
(3) If λ is any eigenvalue of A*A with eigenvector u ≠ 0, we have

A*Au = λu  ⟹  u*A*Au = λ u*u.

However, we can also write

u*A*Au = (Au)*Au = (Au) · (Au) = ‖Au‖²,

which means that, as ‖u‖ ≠ 0, we have

‖Au‖² = λ‖u‖²  ⟹  λ = ‖Au‖²/‖u‖² ≥ 0,

as required.

Lecture 11, page 5


So does the matrix AA⇤ !

If A is any complex matrix, especially if it is not square, we could


also look at AA⇤ because that has the same nice properties.

Corollary: Suppose that A is a complex m ⇥ n matrix.


(1) AA⇤ is square and so it has eigenvalues.
(2) AA⇤ is Hermitian and so its eigenvalues are real.
(3) AA⇤ has nonnegative eigenvalues.

Proof: Taking the complex matrix in the previous theorem to be


A⇤ , we know that (A⇤ )⇤ A⇤ = AA⇤ is square, Hermitian and has
non-negative eigenvalues.

Lecture 11, page 6


A⇤ A and AA⇤ have the same positive eigenvalues!

If A is an m ⇥ n complex matrix, A⇤ A and AA⇤ are di↵erent


matrices because, for instance, one is n ⇥ n and the other is m ⇥ m.
Indeed, even if A is square, there is no reason to suppose that A⇤ A
and AA⇤ would be the same.
However, they do have something in common...

Theorem: Suppose that A is any complex matrix.


A⇤ A and AA⇤ have the same positive eigenvalues.

Proof: Suppose that A is any complex matrix.

Lecture 11, page 7


⌅ If λ is any positive eigenvalue of A*A with eigenvector u ≠ 0,

A*Au = λu  ⟹  AA*Au = λAu  ⟹  AA*(Au) = λ(Au),

where Au ≠ 0 because A*Au = λu ≠ 0 as λ > 0 and u ≠ 0.
Thus, λ is also an eigenvalue of AA* with eigenvector Au.
⌅ If λ is any positive eigenvalue of AA* with eigenvector v ≠ 0,

AA*v = λv  ⟹  A*AA*v = λA*v  ⟹  A*A(A*v) = λ(A*v),

where A*v ≠ 0 because AA*v = λv ≠ 0 as λ > 0 and v ≠ 0.
Thus, λ is also an eigenvalue of A*A with eigenvector A*v.

Consequently, A*A and AA* have exactly the same positive
eigenvalues as required.

Lecture 11, page 8
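A quick numerical illustration of this theorem (my own addition): for a random non-square complex matrix, A*A and AA* have different sizes but the same positive eigenvalues.

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3)) + 1j * rng.standard_normal((2, 3))
eig_AstarA = np.sort(np.linalg.eigvalsh(A.conj().T @ A))   # 3 eigenvalues, one of them ~0
eig_AAstar = np.sort(np.linalg.eigvalsh(A @ A.conj().T))   # 2 eigenvalues
print(np.round(eig_AstarA, 6))
print(np.round(eig_AAstar, 6))   # the positive eigenvalues in the two lists agree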


Singular values

Looking at the positive eigenvalues of the matrix A⇤ A, we can now


define the singular values of the matrix A.

Definition: Suppose that A is any complex matrix.


The singular values of A are the positive square roots of the
positive eigenvalues of A⇤ A.

Note: If A is m ⇥ n, then A⇤ A is n ⇥ n with positive eigenvalues


1 , 2 , . . . , k and the remaining n k eigenvalues (if any) are
p p p
zero. Thus, the singular values of A would be 1, 2, . . . , k.

Note: Given the previous theorem, we can also use AA⇤ to find the
singular values of A since A⇤ A and AA⇤ have the same positive
eigenvalues.

Lecture 11, page 9


Example
!
2 i 0
Find the singular values of the matrix A = .
0 i 2

A* = ( 2  0 ; −i  −i ; 0  2 ), so

AA* = ( 5  1 ; 1  5 ),

and |AA* − λI₂| = 0 gives (5 − λ)² − 1 = 0, i.e. λ = 4 or 6.

So the positive eigenvalues of AA* (and hence of A*A) are 4 and 6, and the
singular values of A are √4 = 2 and √6.

(Working with the 3 × 3 matrix A*A instead gives the eigenvalues 4, 6 and 0
and, of course, the same singular values.)

Lecture 11, page 10


What about eigenvectors?

Theorem: Suppose that A is any complex matrix.


A⇤ A has an
orthonormal set of eigenvectors {u1 , u2 , . . . , uk }
corresponding to its positive eigenvalues 1 , 2 , . . . , k .
Further, the set of vectors {v1 , v2 , . . . , vk } given by

vi = (1/√λi) Aui,

is an orthonormal set of eigenvectors corresponding to the


positive eigenvalues 1 , 2 , . . . , k of the matrix AA⇤ .

Proof: Suppose that A is any complex matrix.


The matrix A⇤ A is normal as (A⇤ A)⇤ (A⇤ A) and (A⇤ A)(A⇤ A)⇤ are
both equal to (A⇤ A)2 . As it is a normal matrix, the eigenvectors of
A⇤ A form an orthonormal basis of Cn .
Lecture 11, page 11
Let’s take λ1, λ2, . . . , λk to be the positive eigenvalues of A*A with
{u1, u2, . . . , uk} as the orthonormal set of eigenvectors that
correspond to these eigenvalues.
That is, we have ui ≠ 0 such that

A*Aui = λi ui   and we set   vi = (1/√λi) Aui,

where vi ≠ 0 since Aui ≠ 0 because A*Aui = λi ui ≠ 0 as λi > 0
and ui ≠ 0. Now, as

AA*vi = AA* (Aui/√λi) = (A/√λi) A*Aui = (A/√λi) λi ui = λi (Aui/√λi) = λi vi,

we can see that the vi are eigenvectors of AA* corresponding to
the positive eigenvalues λi.

Lecture 11, page 12


To see that {v1, v2, . . . , vk} is an orthonormal set, note that

vi · vj = (Aui/√λi) · (Auj/√λj) = (Auj)*Aui / √(λiλj) = uj*A*Aui / √(λiλj)
        = λi uj*ui / √(λiλj) = √(λi/λj) (ui · uj)

so that we can see that they are

⌅ mutually orthogonal because, for i ≠ j, this gives us

vi · vj = √(λi/λj) (ui · uj) = 0,

as the vectors in the set {u1, u2, . . . , uk} are orthogonal, and

⌅ unit because, for i = j, this gives us

vi · vi = √(λi/λi) (ui · ui) = 1,

as the vectors in the set {u1, u2, . . . , uk} are unit.

Lecture 11, page 13


Example continued

Find an orthonormal set of eigenvectors of the matrix A⇤ A.


Hence find an orthonormal set of eigenvectors for the positive
eigenvalues of the matrix AA⇤ .

A*A = ( 4  2i  0 ; −2i  2  −2i ; 0  2i  4 ) and its positive eigenvalues are
λ1 = 4 and λ2 = 6 (the third eigenvalue is 0).

For λ1 = 4: solving (A*A − 4I₃)u = 0 gives u1 = (1/√2)(1, 0, −1)ᵗ.
For λ2 = 6: solving (A*A − 6I₃)u = 0 gives u2 = (1/√3)(i, 1, i)ᵗ.
These are orthonormal: they are normalised eigenvectors of the Hermitian matrix
A*A for distinct eigenvalues.

Using the ‘link’ vi = (1/√λi)Aui,

v1 = (1/2)Au1 = (1/√2)(1, −1)ᵗ   and   v2 = (1/√6)Au2 = (1/√2)(i, i)ᵗ,

which is an orthonormal set of eigenvectors for the positive eigenvalues of AA*
(much quicker than finding these using the old method).
Lecture 11, page 14
And the other way...

Theorem: Suppose that A is any complex matrix.


AA⇤ has an orthonormal set of eigenvectors {v1 , v2 , . . . , vk }
corresponding to its positive eigenvalues 1 , 2 , . . . , k .
Further, the set of vectors {u1 , u2 , . . . , uk } given by

1
u i = p A ⇤ vi ,
i

is an orthonormal set of eigenvectors corresponding to the


positive eigenvalues 1 , 2 , . . . , k of the matrix A⇤ A.

Proof: Repeat the proof of the previous theorem replacing ‘A’


with ‘A⇤ ’ and ‘A⇤ ’ with ‘A’.

Lecture 11, page 15


Example continued

Find an orthonormal set of eigenvectors of the matrix AA⇤ .


Hence find an orthonormal set of eigenvectors for the positive
eigenvalues of the matrix A⇤ A.

AA* = ( 5  1 ; 1  5 ) and its eigenvalues are λ1 = 4 and λ2 = 6.

For λ1 = 4 we get v1 = (1/√2)(1, −1)ᵗ and for λ2 = 6 we get v2 = (1/√2)(1, 1)ᵗ,
an orthonormal set as AA* is Hermitian and the vectors are normalised.

Using the ‘link’ ui = (1/√λi)A*vi,

u1 = (1/2)A*v1 = (1/√2)(1, 0, −1)ᵗ   and   u2 = (1/√6)A*v2 = (1/√3)(1, −i, 1)ᵗ,

which is an orthonormal set of eigenvectors for the positive eigenvalues of A*A.
(This u2 is −i times the one found before; eigenvectors are only determined up
to a scalar of modulus one.)

Lecture 11, page 16


The singular values decomposition of a matrix

Theorem: Suppose that A is any complex matrix.


If {u1 , u2 , . . . , uk } is an orthonormal set of eigenvectors
corresponding to the positive eigenvalues 1 , 2 , . . . , k of
A⇤ A and the vectors vi are given by
1
vi = p Aui ,
i

then we have
p p p
⇤ ⇤ ⇤
A= 1 v1 u 1 + 2 v2 u 2 + ··· + k vk u k .

We call this the singular values decomposition of A.

Proof: Suppose that A is m ⇥ n so that A⇤ A is n ⇥ n.

Lecture 11, page 17


If 1 , 2 , . . . , k are the positive eigenvalues of A⇤ A, its remaining
eigenvalues k+1 , k+2 , . . . , n are all zero and, as A⇤ A is normal,
its eigenvectors ui form an orthonormal basis of Cn . Thus, using
the theorem on slide 18 of lecture 10, we have

In = u1 u1⇤ + u2 u2⇤ + · · · + un un⇤ .

which means that

A = AIn = Au1 u1⇤ + Au2 u2⇤ + · · · + Aun un⇤ .

However, if k + 1  i  n, we have i = 0 and so

A⇤ Aui = 0ui =) ui⇤ A⇤ Aui = 0ui⇤ ui = 0,

which means that

ui⇤ A⇤ Aui = (Aui )⇤ Aui = (Aui ) · (Aui ) = 0 =) Aui = 0.

Lecture 11, page 18


Thus, our expression for A becomes

A = Au1 u1⇤ + Au2 u2⇤ + · · · + Auk uk⇤ ,

where, as 1, 2, . . . , k are the positive eigenvalues of A⇤ A,


1 p
vi = p Aui =) Aui = i vi .
i

Consequently, we have
p p p
⇤ ⇤ ⇤
A= 1 v1 u 1 + 2 v2 u 2 + ··· + k vk u k ,

as required.
Note: The singular values decomposition has some interesting
properties and, if we have time, we will talk about them again later
in the course.

Lecture 11, page 19


Example continued

Find the singular values decomposition of the matrix A.


From the work above, the positive eigenvalues of A*A are λ1 = 4 and λ2 = 6 with
orthonormal eigenvectors u1 = (1/√2)(1, 0, −1)ᵗ and u2 = (1/√3)(i, 1, i)ᵗ, and the
‘link’ gives v1 = (1/√2)(1, −1)ᵗ and v2 = (1/√2)(i, i)ᵗ.

So the singular values decomposition is

A = √λ1 v1u1* + √λ2 v2u2*
  = 2 · (1/√2)(1, −1)ᵗ · (1/√2)(1, 0, −1) + √6 · (1/√2)(i, i)ᵗ · (1/√3)(−i, 1, −i)
  = ( 1  0  −1 ; −1  0  1 ) + ( 1  i  1 ; 1  i  1 ).

Check: adding up the two matrices in the decomposition does indeed give A.
Lecture 11, page 20
A quick summary

Suppose A is an m ⇥ n matrix.
This means that AA⇤ is m ⇥ m and A⇤ A is n ⇥ n.
Let λ1, . . . , λk be the positive eigenvalues of AA* and A*A.
⌅ A*A has an orthonormal set of eigenvectors {u1, . . . , uk} in Cⁿ.

The ‘link’:   vi = (1/√λi) A ui   or   ui = (1/√λi) A* vi.

⌅ AA⇤ has an orthonormal set of eigenvectors {v1 , . . . vk } in Cm .


As long as we start with the ui (or vi ) then use the ‘link’ to find
the vi (or ui ), the singular values decomposition of A will be
A = √λ1 v1u1* + √λ2 v2u2* + · · · + √λk vkuk*.

Lecture 11, page 21
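The whole summary can be checked numerically in a few lines (my own sketch; the matrix is the one printed in the example above, taken at face value).

import numpy as np

A = np.array([[2., 1j, 0.],
              [0., 1j, 2.]])
lams, U = np.linalg.eigh(A.conj().T @ A)          # eigenpairs of A^*A (Hermitian)
keep = lams > 1e-10                               # keep the positive eigenvalues only
lams, U = lams[keep], U[:, keep]
V = (A @ U) / np.sqrt(lams)                       # the 'link': v_i = A u_i / sqrt(lambda_i)
recon = sum(np.sqrt(lams[i]) * np.outer(V[:, i], U[:, i].conj()) for i in range(len(lams)))
print(np.allclose(recon, A))                                                 # True
print(np.sort(np.sqrt(lams)), np.sort(np.linalg.svd(A, compute_uv=False)))   # singular values agree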


MA212 Further Mathematical Methods
Lecture 12: The Jordan normal form

Dr James Ward

⌅ The Jordan normal form
⌅ Dealing with 2 ⇥ 2 matrices we can’t diagonalise
⌅ Dealing with 3 ⇥ 3 matrices we can’t diagonalise...
I Case 1: Twice repeated eigenvalue with one eigenvector
I Case 2: Thrice repeated eigenvalue with two eigenvectors
I Case 3: Thrice repeated eigenvalue with one eigenvector
I Case 3: Thrice repeated eigenvalue with one eigenvector

The London School of Economics and Political Science


Department of Mathematics
Lecture 12, page 1
Information

⌅ Exercises 16 is on Moodle.
I Attempt questions 3, 4, 5, 6, 7 and 8.
I Follow your class teacher’s submission instructions.
I Do it! Actually submit your homework!

⌅ Extra Examples Sessions


I Start on Tuesday 12:00–13:00 on Zoom.
I Contact me via the Moodle forum if you want me to cover
anything in particular.

⌅ Classes
I Go to them.

Lecture 12, page 2


The Jordan normal form

A Jordan normal form (or JNF) is a matrix of the form


0 1
1 ⇤ 0 ··· 0 0
B C
B0 2 ⇤ ··· 0 0C
B C
B0 0 ··· 0 0C
B 3 C
J=B . .. .. .. .. .. C .
B .. . . . . .C
B C
B C
@0 0 0 ··· n 1 ⇤A
0 0 0 ··· 0 n

Here each diagonal entry is a number, each first upper-o↵-diagonal


entry (i.e. the ‘⇤’s) is a zero or a one, and all other entries are zero.
Thus, all diagonal matrices are JNFs (it’s just that all the ‘⇤’s are
zero) and, if a matrix is not diagonal (i.e. at least one of the ‘⇤’s
is one), then the JNF is the next best thing!

Lecture 12, page 3


Every square matrix has a JNF!

The key theorem (which we will not prove) is the following.

Theorem: Every square matrix A is similar to a JNF, i.e.


there is an invertible matrix P such that

1
P AP = J,

where J is a JNF.

Of course, if the matrix A is diagonalisable, we know how to find


J as it will just be our usual diagonal matrix D.
But, when A is not diagonalisable, i.e. when we don’t find enough
linearly independent eigenvectors, how do we find J ?

Lecture 12, page 4


What we’ll do...

In this course, we’ll see how to find P and J when we have a 2 ⇥ 2


or 3 ⇥ 3 matrix and we’ll see that it works, i.e. how we can
guarantee that the P we construct is an invertible matrix such that
P 1 AP = J is a JNF.
This method will generalise to 4 ⇥ 4 and larger square matrices,
but this won’t concern us.
A useful theorem to note in this context is the following.

Theorem: If A is a square matrix, eigenvectors corresponding


to distinct eigenvalues of A are linearly independent.

Proof: See Question 1 from Exercises 17.

Lecture 12, page 5


The 2 ⇥ 2 case

When A is a 2 ⇥ 2 matrix, one of three things can happen when we


look at its eigenvalues and eigenvectors.

⌅ A has two distinct eigenvalues giving us two linearly


independent eigenvectors, i.e. A is diagonalisable.
⌅ A has one [repeated] eigenvalue and we find that it has
I two linearly independent eigenvectors, i.e. A is diagonalisable.
I only one linearly independent eigenvector and so A is not
diagonalisable, i.e. we need to find a proper JNF!

Let’s see what we would do in this last case!

Lecture 12, page 6


Finding a JNF of a 2 ⇥ 2 matrix when we can’t diagonalise

When A is 2 ⇥ 2 with a repeated eigenvalue but only one linearly


independent eigenvector we must have
!
1
J=
0

as A is not diagonalisable.
We need an invertible matrix P that satisfies AP = PJ, i.e.
0 1 0 1
| | | | !
B C B C 1
A @v 1 v 2 A = @v 1 v 2 A
0
| | | |

where v1 and v2 are the columns of P.

Lecture 12, page 7


This gives us
0 1 0 1
B C B C
@Av1 Av2 A = @ v1 v1 + v2 A

and so we must have

Av1 = v1 and Av2 = v1 + v2 .

6 0 to be any eigenvector of A corresponding


So, we can take v1 =
to the eigenvalue and then v2 is any solution to the equation

(A I2 )v2 = v1 .

Here, as v1 is an eigenvector but v2 isn’t, the vectors v1 and v2


are linearly independent, i.e. P is indeed invertible.

Lecture 12, page 8
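Here is a short numerical sketch of this construction (illustration only; the matrix is an assumed example with a repeated eigenvalue and a single linearly independent eigenvector).

import numpy as np

A = np.array([[ 0., 1.],
              [-1., 2.]])                        # assumed defective example: eigenvalue 1 twice
lam = 1.0
N = A - lam * np.eye(2)
v1 = np.array([1., 1.])                          # an eigenvector: N v1 = 0
v2 = np.linalg.lstsq(N, v1, rcond=None)[0]       # any solution of (A - I)v2 = v1
P = np.column_stack([v1, v2])
print(np.round(np.linalg.inv(P) @ A @ P, 10))    # the JNF [[1, 1], [0, 1]]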


Example
!
0 1
Put the matrix A = in Jordan normal form.
1 2

|A − λI₂| = 0 gives λ² − 2λ + 1 = (λ − 1)² = 0, so λ = 1 is a repeated eigenvalue.

Row reducing A − I₂, we only get one linearly independent eigenvector v1, so A
cannot be diagonalised and we go for a proper JNF.

For v2 we solve (A − I₂)v2 = v1 and take any solution.

Then, with P = (v1  v2), we get

P⁻¹AP = J = ( 1  1 ; 0  1 ).

Lecture 12, page 9


The 3 ⇥ 3 case

When A is a 3 ⇥ 3 matrix, one of six things can happen when we


look at its eigenvalues and eigenvectors. We see that A can have
⌅ three distinct eigenvalues, i.e. A is diagonalisable.
⌅ two eigenvalues, one of which is repeated twice, giving us
I two LI eigenvectors, i.e. A is diagonalisable.
I only one LI eigenvector and so A is not diagonalisable and we
need to find a proper JNF! This will be Case 1.
⌅ one eigenvalue which is repeated thrice giving us
I three LI eigenvectors, i.e. A is diagonalisable.
I two LI eigenvectors and so A is not diagonalisable, i.e. we
need to find a proper JNF! This will be Case 2.
I only one LI eigenvector and so A is not diagonalisable, i.e. we
need to find a proper JNF! This will be Case 3.

Let’s see what we would do in these three cases!

Lecture 12, page 10


Case 1: Twice repeated eigenvalue with one eigenvector

When A is 3 ⇥ 3 with eigenvalues , , µ where µ 6= and there is


only one linearly independent eigenvector for we let
0 1
1 0
B C
J = @0 0A
0 0 µ

as A is not diagonalisable.
We need an invertible matrix P that satisfies AP = PJ, i.e.
0 1 0 10 1
| | | | | | 1 0
B C B CB C
A @v1 v2 v3 A = @v1 v2 v3 A @ 0 0A
| | | | | | 0 0 µ

where v1 , v2 and v3 are the columns of P.

Lecture 12, page 11


This gives us
0 1 0 1
| | | | | |
B C B C
@Av1 Av2 Av3 A = @ v1 v1 + v2 µv3 A
| | | | | |

and so we must have

Av1 = v1 , Av2 = v1 + v2 and Av3 = µv3 .

So, we can take v1 6= 0 to be any eigenvector for the eigenvalue ,


v2 to be any solution to the equation

(A I3 )v2 = v1

and v3 to be any eigenvector for the eigenvalue µ.

Lecture 12, page 12


Note: To see that P is invertible, consider the equation

↵v1 + v2 + v3 = 0.

Multiplying through by the matrix A I3 , this gives us

↵(A I3 )v1 + (A I3 )v2 + (A I3 )v3 = 0.

But, as Av1 = v1 , (A I3 )v2 = v1 and Av3 = µv3 , this gives us

v1 + (µ )v3 = 0.

Thus, as µ 6= and {v1 , v3 } is a linearly independent set, we must


have = 0 and = 0. The original equation then gives us ↵ = 0
as v1 6= 0. Thus, as we only get the trivial solution, {v1 , v2 , v3 } is
a linearly independent set.

Lecture 12, page 13


Example

0 1
0 4 4
B C
Put the matrix A = @ 1 0 3A in Jordan normal form.
2 4 7

|A − λI₃| = 0 gives the eigenvalues λ = 2, 2, 3.

For the repeated eigenvalue λ = 2, row reducing A − 2I₃ gives only one linearly
independent eigenvector, so we are in Case 1 and take v1 to be such an eigenvector.

For v2 we solve (A − 2I₃)v2 = v1 and take any solution, and for v3 we take any
eigenvector for λ = 3.

Then P = (v1  v2  v3) is invertible and

P⁻¹AP = J = ( 2  1  0 ; 0  2  0 ; 0  0  3 ).
l: :"
Lecture 12, page 14
Case 2: Thrice repeated eigenvalue with two eigenvectors

When A is 3 ⇥ 3 with eigenvalues , , and there are only two


linearly independent eigenvectors for we let
0 1
1 0
B C
J = @0 0A
0 0

as A is not diagonalisable.
We need an invertible matrix P that satisfies AP = PJ, i.e.
0 1 0 10 1
| | | | | | 1 0
B C B CB C
A @v1 v2 v3 A = @v1 v2 v3 A @ 0 0A
| | | | | | 0 0

where v1 , v2 and v3 are the columns of P.

Lecture 12, page 15


This gives us
0 1 0 1
| | | | | |
B C B C
@Av1 Av2 Av3 A = @ v1 v1 + v2 v3 A
| | | | | |

and so we must have

Av1 = v1 , Av2 = v1 + v2 and Av3 = v3 .

So, with some care, we can take v1 6= 0 to be an eigenvector for


the eigenvalue , v2 to be any solution to the equation

(A I3 )v2 = v1

and v3 to be an eigenvector for that is not a scalar multiple of v1 .

Lecture 12, page 16


Note: To see that P is invertible, consider the equation

↵v1 + v2 + v3 = 0.

Multiplying through by the matrix A I3 , this gives us

↵(A I3 )v1 + (A I3 )v2 + (A I3 )v3 = 0.

But, as Av1 = v1 , (A I3 )v2 = v1 and Av3 = v3 , this gives us

v1 = 0.

Thus, as v1 6= 0, we have = 0 and the original equation is now

↵v1 + v3 = 0.

But, as v1 and v3 were chosen to be linearly independent, this


must give us ↵ = 0 and = 0. Thus, as we only get the trivial
solution, {v1 , v2 , v3 } is a linearly independent set.

Lecture 12, page 17


Example

0 1
2 1 1
B C
Put the matrix A = @ 1 2 1A in Jordan normal form.
−2 −2 −1

|A − λI₃| = 0 gives (λ − 1)³ = 0, so λ = 1 is an eigenvalue of algebraic multiplicity 3.

Row reducing A − I₃ leaves the single equation x + y + z = 0, so there are two
linearly independent eigenvectors (two free variables) and we are in Case 2.

We want v1 to be an eigenvector for which (A − I₃)v2 = v1 has a solution, so we
take v1 to be an eigenvector that is also in the column space of A − I₃, solve that
equation for v2, and then take v3 to be any eigenvector that is not a multiple of v1.

With these as the columns of P we get

P⁻¹AP = J = ( 1  1  0 ; 0  1  0 ; 0  0  1 ).
Lecture 12, page 18


Case 3: Thrice repeated eigenvalue with one eigenvector

When A is 3 ⇥ 3 with eigenvalues , , and there is only one


linearly independent eigenvector for we let
0 1
1 0
B C
J = @0 1A
0 0

as A is not diagonalisable.
We need an invertible matrix P that satisfies AP = PJ, i.e.
0 1 0 10 1
| | | | | | 1 0
B C B CB C
A @v1 v2 v3 A = @v1 v2 v3 A @ 0 1A
| | | | | | 0 0

where v1 , v2 and v3 are the columns of P.

Lecture 12, page 19


This gives us
0 1 0 1
| | | | | |
B C B C
@Av1 Av2 Av3 A = @ v1 v1 + v2 v2 + v3 A
| | | | | |

and so we must have

Av1 = v1 , Av2 = v1 + v2 and Av3 = v2 + v3 .

So, we can take v1 6= 0 to be any eigenvector for the eigenvalue ,


v2 to be any solution to the equation

(A I3 )v2 = v1

and v3 to be any solution to the equation

(A I3 )v3 = v2 .

Lecture 12, page 20


Note: To see that P is invertible, consider the equation

↵v1 + v2 + v3 = 0.

Multiplying through by the matrix A I3 , this gives us

↵(A I3 )v1 + (A I3 )v2 + (A I3 )v3 = 0.

But, as Av1 = v1 , (A I3 )v2 = v1 and (A I3 )v3 = v2 , we get

v1 + v2 = 0.

Thus, as v1 is an eigenvector for and v2 isn’t, they are linearly


independent and so we must have = 0 and = 0. The original
equation then gives us ↵ = 0 as v1 6= 0. Thus, as we only get the
trivial solution, {v1 , v2 , v3 } is a linearly independent set.

Lecture 12, page 21


Example

af! i. g) f! ! !)
'
0 1
-

# and
0 0 1
B C
Put the matrix A = @1 0 3A in Jordan normal form.
0 1 3

( -31=0
evacuees IA dI3l=o
- X O l
d- i. i. ,

I a

O l 3 X -

f! ! ! ) ¥1
erectors canst
TIE: .

eareoi-41.ca#soz-- "

l
÷
a
or

l
LA KI ) hack
-
Usa
( q) Lecture 12, page 22
Extra examples session (note):

Case 2 can be problematic, since we may pick a v1 for which (A − λI₃)v2 = v1 has
no solution, so choosing v1 can be tricky sometimes. The example worked through in
the session had a single eigenvalue λ = 2 of algebraic multiplicity 3 with two linearly
independent eigenvectors; the point is to choose v1 to be an eigenvector that also lies
in the column space of A − λI₃ (so that the equation for v2 is consistent), then solve
(A − λI₃)v2 = v1 for v2, and take v3 to be an eigenvector that is not a multiple of v1.
A short numerical sketch of this appears below.
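A minimal numerical sketch of that point (my own illustration; the matrix is an assumed Case 2 example with eigenvalue 1 repeated three times and two eigenvectors):

import numpy as np

A = np.array([[ 2.,  1.,  1.],
              [ 1.,  2.,  1.],
              [-2., -2., -1.]])
lam = 1.0
N = A - lam * np.eye(3)                          # rank 1, so two free variables: two eigenvectors
v1 = np.array([1., 1., -2.])                     # eigenvector that is ALSO in the column space of N
v2 = np.linalg.lstsq(N, v1, rcond=None)[0]       # so N v2 = v1 has a solution
v3 = np.array([1., -1., 0.])                     # another eigenvector, not a multiple of v1
P = np.column_stack([v1, v2, v3])
print(np.round(np.linalg.inv(P) @ A @ P, 10))    # J = [[1,1,0],[0,1,0],[0,0,1]]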
MA212: Further Mathematical Methods (Linear Algebra) 2020–21
Exercises 16: Complex diagonalisation and singular values

For these and all other exercises on this course, you must show all your working.
✓ ◆
0 1
1. Show that the matrix A = is normal.
1 0
Find a unitary matrix U such that U ⇤ AU is diagonal.

2. Unitarily diagonalise the Hermitian matrices


0 1
✓ ◆ 1 0 1
0 i @0 4 0 A .
(a) and (b)
i 0
1 0 1

0 1
0 2 1
3. Consider the matrix A = @ 2 0 2A.
1 2 0
Find the eigenvalues and eigenvectors of A. Write down an invertible matrix P and a diagonal
matrix D such that D = P 1 AP . (There is no need to calculate P 1 .)
0 1
7 0 9
4. Consider the matrix A = @0 2 0A.
9 0 7
Find an orthogonal matrix P such that P t AP is diagonal.
Express A in the form 1 E1 + 2 E2 + 3 E3 where the i are the eigenvalues of A and the Ei are
3 ⇥ 3 matrices given by Ei = vi vit where the vi are the corresponding eigenvectors.
Verify that, for i 6= j, Ei Ej is equal to the zero matrix and that Ei Ei = Ei .
✓ ◆
1 1 0
5. Find the singular values of the matrix A = .
1 0 1
Find the singular values decomposition of A.

6. Show that, if 1 , . . . , k are the non-zero eigenvalues of an Hermitian matrix A, then | 1 |, . . . , | k |


are the singular values of A.

7. Let A be a non-singular square matrix, with eigenvalues 1, . . . , k. What are the eigenvalues
of its inverse, A 1 ?

8. Let A be an invertible matrix. What are the singular values of A 1 in terms of the singular
values of A?
(Handwritten working for Exercises 16 appeared here but has not reproduced
legibly in this copy. The results that can still be made out are: Q1: A is normal
with eigenvalues ±i, and normalising the two orthogonal eigenvectors gives a
unitary U with U*AU = diag(i, −i); Q3: the eigenvalues are 0 and ±3i; Q4: the
eigenvalues are 2, −2 and 16; Q5: the singular values are 1 and √3; Q6: for
Hermitian A, A*A = A², so the positive eigenvalues of A*A are the λᵢ² and the
singular values of A are the |λᵢ|; Q7: the eigenvalues of A⁻¹ are the 1/λᵢ; Q8: the
singular values of A⁻¹ are the reciprocals of the singular values of A.)
MA212 Further Mathematical Methods
Lecture 13: Using the Jordan normal form

Dr James Ward

⌅ Solving systems of di↵erential equations


⌅ Powers of matrices
⌅ Solving systems of di↵erence equations

The London School of Economics and Political Science


Department of Mathematics

Lecture 13, page 1


Solving systems of di↵erential equations

In MA100, you saw how to solve a system of di↵erential equations


y 0 (t) = Ay (t) when A is diagonalisable, i.e. there is an invertible
matrix P such that P 1 AP = D is diagonal.
Essentially, the method involves setting y (t) = Pz (t) so that
y 0 (t) = Pz 0 (t) and we get

Pz 0 (t) = APz (t) =) z 0 (t) = P 1


APz (t) =) z 0 (t) = Dz (t)

which is a simpler system of di↵erential equations.


As D is diagonal, each zi (t) is given by the separable di↵erential
equation zi0 (t) = i zi (t) which is easily solved to get zi (t) = ci e i t .
We can then use y (t) = Pz (t) to find y (t).

Lecture 13, page 3
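A small numerical sketch of this method (my own addition; the matrix and initial condition are assumed examples): once A is diagonalised, y(t) = P exp(Dt) P⁻¹ y(0) can be evaluated directly.

import numpy as np

A = np.array([[1., -1.],
              [4.,  1.]])                        # assumed example with complex eigenvalues
y0 = np.array([2., 0.])
evals, P = np.linalg.eig(A)
z0 = np.linalg.solve(P, y0)                      # z(0) = P^{-1} y(0)
def y(t):
    return (P @ (np.exp(evals * t) * z0)).real   # imaginary parts cancel for real A and y(0)
print(np.round(y(1.0), 4))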


But, in MA100, this wouldn’t always work!

As we know, in MA100, two unpleasant complications could arise


when we used diagonalisation...

(1) Even if A is a real matrix, it might have complex eigenvalues


and eigenvectors.
This is no longer a problem! We can deal with this now!
We’ll look at an example of this to revise the MA100 method
and see how complex numbers don’t really change anything.
(2) A may not be diagonalisable as it just doesn’t have enough
linearly independent eigenvectors.
This is no longer a problem! We can use the JNF instead!
We’ll look at an example of this to see how things change
when we use a JNF instead of a diagonal matrix.

Lecture 13, page 4


Example 1
! !
1 1 2
Solve y 0 (t) = Ay (t) when A = and y (0) = .
4 1 0
-

|A − λI₂| = 0 gives the complex eigenvalues λ = 1 + 2i and λ = 1 − 2i and, as A is
real, the corresponding eigenvectors form a conjugate pair.

Set y(t) = Pz(t) with P having these eigenvectors as its columns, so that
z′(t) = Dz(t) with D = diag(1 + 2i, 1 − 2i). Solving the two separable equations,

z1(t) = Ae^{(1+2i)t}   and   z2(t) = Be^{(1−2i)t},

and then y(t) = Pz(t). Using y(0) = (2, 0)ᵗ to fix A and B and Euler’s formula
e^{±2it} = cos 2t ± i sin 2t, the imaginary parts cancel and we are left with a real
solution: y1(t) = 2eᵗ cos 2t, with y2(t) a matching multiple of eᵗ sin 2t.
Lecture 13, page 5
Example 2

Solve y 0 (t) = Ay (t) when A is the matrix on slide 22 of lecture 12.


From lecture 12, we have P and J with P⁻¹AP = J where J = ( 1  1  0 ; 0  1  1 ; 0  0  1 ).

Setting y(t) = Pz(t) gives z′(t) = Jz(t), i.e.

z1′ = z1 + z2,   z2′ = z2 + z3   and   z3′ = z3.

Working from the bottom up: z3(t) = Ceᵗ; then z2′ − z2 = Ceᵗ is linear with
integrating factor e^{−t}, so (z2e^{−t})′ = C and z2(t) = (Ct + B)eᵗ; similarly
z1′ − z1 = (Ct + B)eᵗ gives z1(t) = (½Ct² + Bt + A)eᵗ.

Finally, y(t) = Pz(t).

Lecture 13, page 6


Powers of matrices

In MA100, you saw how to find Ak for any k 2 N when A is


diagonalisable, i.e. there is an invertible matrix P such that
P 1 AP = D is diagonal.
Essentially, the method involved noting that

Ak = (PDP 1
)(PDP 1 ) · · · (PDP 1
) = PD k P 1
| {z }

%)
k times
D"
where, as D is diagonal, D k is easy to find.
When we have to use a JNF instead, the same reasoning works, i.e.

Ak = (PJP 1
)(PJP 1 ) · · · (PJP 1
) = PJ k P 1
| {z }
k times

but, as J is a JNF, J k is a bit harder to find.

Lecture 13, page 7


When J is a 2 ⇥ 2 matrix

If J is a 2 ⇥ 2 matrix, our only proper JNF is


!
1
J= .
0
Multiplying this matrix by itself several times, we see that
! ! !
2 2 3 3 2 4 4 3
J2 = 2
, J3 = 3
, J4 = 4
, etc.
0 0 0
and so we can conclude that we have
! !
1 k k k 1
If J = , then J k = k
for k 2 N.
0 0

We’ll see this behaviour again when we look at the 3 ⇥ 3 JNFs.

Lecture 13, page 8


When J is a 3 ⇥ 3 matrix

If J is a 3 ⇥ 3 matrix, we have three proper JNFs given by


0 1 0 1 0 1
1 0 1 0 1 0
B C B C B C
J1 = @ 0 0 A , J2 = @ 0 0 A and J3 = @ 0 1A
0 0 µ 0 0 0 0

depending on whether we are in case 1, 2 or 3 from lecture 12.


We’ll deal with J1 and J2 together as they are pretty easy, J3 will
be a little more complicated.

Lecture 13, page 9


For J1 and J2

Multiplying J1 by itself several times, we see that


0 1 0 1
2 2 0 3 3 2 0
B C B C
J12 = @ 0 2 0 A , J13 = @ 0 3 0 A, etc.
0 0 µ2 0 0 µ3
and so we can conclude that, for k 2 N, we have
0 1
k k k 1 0
B C
J1k = @ 0 k 0 A.
0 0 µk
Notice, in particular, how the 2 ⇥ 2 upper left sub-matrix acts just
like our proper 2 ⇥ 2 JNF and the bottom right 1 ⇥ 1 matrix acts
just like what we see with a diagonal matrix!
Indeed, setting µ = , this gives us the corresponding result for J2k .

Lecture 13, page 10


Thus, if we are in case 1, we have
0 1 0 1
1
0 k k k 1 0
B C B C
J = @0 0 A =) J k = @ 0 k 0 A for k 2 N.
0 0 µ 0 0 µk

and, if we are in case 2, we can set µ = to get


0 1 0 1
1 0 k k k 1 0
B C B C
J = @0 0 A =) J k = @ 0 k 0 A for k 2 N.
0 0 0 0 k

Lecture 13, page 11


For J3

Multiplying J3 by itself several times, we see that


0 1 0 1
2 2 1 3 3 2 3
B 2 2 C, B C
J32 = @ 0 A J33 = @ 0 3 3 2A ,
0 0 2 0 0 3

0 1 0 1
4 4 3 6 2 5 5 4 10 3
B C B C
J34 = @ 0 4 4 3 A , J35 = @ 0 5 5 4 A, etc.
0 0 4 0 0 5

and so we can conclude that

Suppose that k 2 N and a = 12 k(k 1).


0 1 0 1
1 0 k k k 1 a k 2
B C k B k k 1 C.
If J = @ 0 1 A, then J = @ 0 k A
0 0 0 0 k

Lecture 13, page 12
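The Jᵏ formulas above are easy to verify numerically; here is a quick check for the 3 × 3 Jordan block (my own illustration, with an arbitrary λ and k).

import numpy as np

lam, k = 2.0, 5
J = np.array([[lam, 1., 0.],
              [0., lam, 1.],
              [0., 0., lam]])
a = k * (k - 1) / 2
predicted = np.array([[lam**k, k*lam**(k-1), a*lam**(k-2)],
                      [0.,     lam**k,       k*lam**(k-1)],
                      [0.,     0.,           lam**k]])
print(np.allclose(np.linalg.matrix_power(J, k), predicted))   # True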


Solving systems of di↵erence equations

In MA100, you saw how to solve a system of di↵erence equations


yk = Ayk 1 for k 2 N when A is diagonalisable, i.e. there is an
invertible matrix P such that P 1 AP = D is diagonal.
One method you saw (the analogue of what we saw in slide 3)
involves setting yk = Pzk so that yk 1 = Pzk 1 and we get
1
Pzk = APzk 1 =) zk = P APzk 1 =) zk = Dzk 1

which is a simpler system of di↵erence equations that we can solve.


However, the other method you saw involves noting that we have
yk = Ayk 1 = A2 yk 2 = A3 yk 3 = · · · = Ak yk k,

if we use yk = Ayk 1 recursively so that yk 1 = Ayk 2 etc. Thus,


yk = Ak y0 = PD k P 1
y0 ,
and given what we saw above, this is the method we will follow.
Lecture 13, page 13
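As a quick numerical companion to this (not from the notes), here is the yk = PDᵏP⁻¹y0 computation for an assumed diagonalisable example, checked against direct matrix powering.

import numpy as np

A = np.array([[0., -1.],
              [1.,  0.]])               # assumed diagonalisable example
y0 = np.array([2., 0.])
evals, P = np.linalg.eig(A)
k = 7
yk = (P @ (np.diag(evals**k) @ np.linalg.solve(P, y0))).real   # real since A and y0 are real
print(yk)
print(np.allclose(yk, np.linalg.matrix_power(A, k) @ y0))      # True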
But, in MA100, this wouldn’t always work!

Again, as we know, in MA100, two unpleasant complications could


arise when we used diagonalisation...

(1) Even if A is a real matrix, it might have complex eigenvalues


and eigenvectors.
This is no longer a problem! We can deal with this now!
We’ll look at an example of this to revise the MA100 method
and see how complex numbers don’t really change anything.
(2) A may not be diagonalisable as it just doesn’t have enough
linearly independent eigenvectors.
This is no longer a problem! We can use the JNF instead!
We’ll look at an example of this to see how things change
when we use a JNF instead of a diagonal matrix.

Lecture 13, page 14


Example 1
! !
1 1 2
Solve yk = Ayk 1 for k 2 N when A = and y0 = .
4 1 0

(Ii Ii ) ( fi Iz )
"
F- and D=
;

(Li Li ) ( if go.gg) ( Iii ! ) )


"
' ""
PDK p yo
-

y µ Akyo -
-
-
-
=

( ) (
¥q
e- iojk

)
"
Zill ( B- eiojk +
'
( Itzik t Ci -

= =

-
ziutzisk -12in silk -

ti eiojktzicrseiojk

( )
, Z ( KO )
K
=
fs
Hzierseio asinine)
⇐ real

r -
-
Htt =f5

Tano
-
-
211
"
O -
- tan 2=1 -

167rad
Lecture 13, page 15
Example 2
0 1
0 0 1
B C
Solve yk = Ayk 1 for k 2 N when A = @1 0 3A as before.
0 1 3
af! ! ! ) tf ! ! ! ) and

) ( g Ky ! )
"
÷ ' ±

!!
' ""
II
'' "
yr A' yo PJ "
'
with J
"
- -
-
- =


A

y: I I :*::÷÷i÷÷÷÷÷÷i: :L
"" " "
:
=

:
-
. .

' '
til "t÷÷÷H¥÷¥ Lecture 13, page 16
MA212 Further Mathematical Methods
Lecture 14: Di↵erence equations continued

Dr James Ward

⌅ Dominant eigenvalues and long-term behaviour
⌅ Age-specific population growth

(Handwritten note: for large k, if c ≠ 0 then y_k is dominated by the cλ^k term; if c = 0, the behaviour is instead governed by the next, subdominant, term.)
The London School of Economics and Political Science


Department of Mathematics

Lecture 14, page 1


Information

⌅ Exercises 17 is on Moodle.
I Attempt questions 3, 4(a), 5 and 6.
I Follow your class teacher’s submission instructions.
I Do it! Actually submit your homework!

⌅ Extra Examples Sessions


I Start on Tuesday 12:00–13:00 on Zoom.
I Contact me via the Moodle forum if you want me to cover
anything in particular.

⌅ Classes
I Go to them.

Lecture 14, page 2


Dominant eigenvalues

We now want to talk about the long-term behaviour of systems of


di↵erence equations and, to do this, we need the following.

Definition: Suppose that A is an n ⇥ n matrix.


We say that λ is the dominant eigenvalue of A if |λ| > |µ| where µ is any other eigenvalue of A.

Note: A square matrix may not have a dominant eigenvalue.


For instance, if a 3 × 3 matrix has eigenvalues λ₁, λ₂, λ₃ with |λ₁| = |λ₂| and |λ₁| > |λ₃|, then λ₁ is not dominant.
In particular, this means that eigenvalues with an algebraic
multiplicity greater than one can not be dominant.

Lecture 14, page 3


The long-term behaviour of yk = Ayk 1

When discussing this, the following theorem may be useful.

Theorem: Suppose that yk = Ayk 1 is a system of


di↵erence equations where A is a square matrix.
If A has a dominant eigenvalue λ (the eigenvalue with the greatest magnitude), then the dominant behaviour of y_k is given by

y_k ≈ cλ^k v for large k,

where c is a constant that depends on y_0 and v is an eigenvector for the eigenvalue λ.

Proof: As every square matrix has a JNF, we note that


y_k = Ay_{k−1}   ⟹   y_k = PJ^k P^{−1} y_0,
and then use the dominant eigenvalue to simplify J^k for large k.
k. 14, page 4
The proof when the JNF is a diagonal matrix

Suppose A is 2 × 2 and diagonalisable. If λ is the dominant eigenvalue and µ is the other one, we have |λ| > |µ| and

D^k = [λ^k 0; 0 µ^k] = λ^k [1 0; 0 (µ/λ)^k] ≈ λ^k [1 0; 0 0]

for large k. This means that, using y_k = PD^k P^{−1} y_0, we have

y_k ≈ λ^k [v w] [1 0; 0 0] P^{−1} y_0 = λ^k [v w] (c, 0)^t = cλ^k v,

for large k. Here c is the first entry of the vector P^{−1}y_0 and v is an eigenvector for the dominant eigenvalue λ.
This will obviously generalise as long as A is diagonalisable.

Lecture 14, page 5


The proof when we have a proper JNF

As we need a dominant eigenvalue, we must have an eigenvalue λ with an algebraic multiplicity of one where |λ| > |µ| for any other eigenvalue µ.
Based on the cases we considered in lecture 12, this means that our JNF must be

J = [µ 1 0; 0 µ 0; 0 0 λ]
and we could run a proof like the one we saw in the previous slide.
However, due to space constraints, we’ll look at an example of how
this works instead.
What we do there generalises for the J above, but we won’t be
concerned with JNFs for n × n matrices with n ≥ 4 in this course.
Lecture 14, page 6
Example
What is the long-term behaviour of y_k = [1 3 2; 0 1 0; 2 1 1] y_{k−1}?

(dimension check: a 3 × 3 matrix times a 3 × 1 vector gives a 3 × 1 vector)

Lecture 14, page 7
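A quick way to ‘see’ the dominant behaviour is to iterate the recurrence and watch the normalised iterates settle down to a dominant eigenvector. This NumPy sketch is illustrative only (it uses a made-up matrix, not the one in the example above).

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])   # illustrative matrix with a dominant eigenvalue
y = np.array([1.0, 0.0, 0.0])     # some initial condition y_0

for k in range(50):
    y = A @ y

# Compare the direction of y_k with the dominant eigenvector from numpy
eigvals, eigvecs = np.linalg.eig(A)
dom = np.argmax(np.abs(eigvals))
v = eigvecs[:, dom]

print(y / np.linalg.norm(y))      # approximately +/- v
print(v / np.linalg.norm(v))
print("dominant eigenvalue:", eigvals[dom])
```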


A word of warning...

When we use this theorem to conclude that

y_k ≈ cλ^k v for large k,

the constant c will be an entry of the vector P^{−1}y_0 which depends on the given initial condition y_0. Now, if this initial condition happens to make this entry zero, the theorem gives us y_k ≈ 0 for large k and so we do not ‘see’ the dominant behaviour.


In such cases, the long-term behaviour of y_k = Ay_{k−1} we ‘see’ will
be given by some ‘sub-dominant’ behaviour and, to find this, we’d
need to do a calculation to see what is happening instead of using
the theorem we have here!

Lecture 14, page 8


Example continued : subdominant behaviour

What is the long-term behaviour if y0 = (1, 1, 0)t ?

(Handwritten working: for this y_0, the first entry of P^{−1}y_0 is zero, so the coefficient c of the dominant term vanishes. The long-term behaviour of y_k is then governed by the next-largest eigenvalue term, an example of subdominant behaviour.)
Lecture 14, page 9


Age-specific population growth

We want to model the female population of a species.


To do this, let L be the life-span of this population and divide this
into n equal age classes of length L/n.
We will consider two demographic parameters...

⌅ Fertility: for 1  i  n, let ai be the average number of


daughters born to a female in the ith age class and
⌅ Mortality: for 1  i  n 1, let bi be the fraction of females
in the ith age class that are expected to survive long enough
to enter the (i + 1)th age class.

In particular, this means that we have ai 0 and 0  bi  1.


Also, if ai > 0 for some i, we say that the ith age class is fertile.

Lecture 14, page 10


Starting with the initial population, we measure the female population at intervals of L/n so that, for 1 ≤ i ≤ n, we can let y_k(i) be the population in the ith age class at time kL/n.
The number of daughters born between the (k−1)th and kth measurements is

a₁y_{k−1}(1) + a₂y_{k−1}(2) + · · · + a_n y_{k−1}(n),

and so this must be y_k(1). Also, for 1 ≤ i ≤ n − 1, we must have

y_k(i + 1) = b_i y_{k−1}(i).

So, in matrix form we have y_k = Ly_{k−1} where

[ y_k(1) ]   [ a₁   a₂   a₃   · · ·  a_{n−1}  a_n ] [ y_{k−1}(1) ]
[ y_k(2) ]   [ b₁   0    0    · · ·  0        0   ] [ y_{k−1}(2) ]
[ y_k(3) ] = [ 0    b₂   0    · · ·  0        0   ] [ y_{k−1}(3) ]
[   ⋮    ]   [ ⋮    ⋮    ⋮    ⋱      ⋮        ⋮   ] [     ⋮      ]
[ y_k(n) ]   [ 0    0    0    · · ·  b_{n−1}  0   ] [ y_{k−1}(n) ]

and we call L a Leslie matrix.
Lecture 14, page 11
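For readers who like to experiment, here is a small NumPy sketch (not part of the notes) that builds a Leslie matrix from fertility and survival parameters and iterates y_k = Ly_{k−1}; the parameters used are those from the example on the next slide.

```python
import numpy as np

def leslie(a, b):
    """Build the n x n Leslie matrix from fertilities a (length n)
    and survival fractions b (length n-1)."""
    n = len(a)
    L = np.zeros((n, n))
    L[0, :] = a                                   # first row: fertilities
    L[np.arange(1, n), np.arange(0, n - 1)] = b   # sub-diagonal: survival fractions
    return L

L = leslie([0, 4, 3], [1/2, 1/4])    # parameters from the example below
y = np.array([1000.0, 1000.0, 1000.0])

for step in range(1, 4):             # one time step = 15 years here
    y = L @ y
    print(f"after {15*step} years:", y)
```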
Example

Suppose the life-span of a female is 45 years and that we consider


three age classes covering the ages [0, 15), [15, 30) and [30, 45]. If
the demographic parameters are

a1 = 0, a2 = 4, a3 = 3, b1 = 1/2 and b2 = 1/4,

what is the Leslie matrix?


If the initial population in each age class is 1000, how much of the
population is in each age class after 15, 30 and 45 years?

The Leslie matrix is L = [0 4 3; 1/2 0 0; 0 1/4 0] and the initial population vector is y_0 = (1000, 1000, 1000)^t, so

after 15 years: y_1 = Ly_0 = (7000, 500, 250)^t,
after 30 years: y_2 = Ly_1 = (2750, 3500, 125)^t,
after 45 years: y_3 = Ly_2 = (14375, 1375, 875)^t.
Lecture 14, page 12


The female population in the long-term

In such models, we want to know what happens in the long-term.


The following theorem, stated without proof, will help us with this.

Theorem: Every Leslie matrix has exactly one positive real eigenvalue. We’ll call this λ₁.

Furthermore, also without proof, we have

Theorem: If a Leslie matrix has two successive fertile age classes (i.e. a_i, a_{i+1} ≠ 0 for some i), then λ₁ is dominant.

And, of course, if λ₁ is dominant and v is one of its eigenvectors,

y_k ≈ cλ₁^k v for large k,

for some constant c will give us the dominant behaviour of y_k.
Lecture 14, page 13
Example continued
Using our Leslie matrix from earlier, what can we say about the
dominant behaviour of the female population in the long-term?

Here L = [0 4 3; 1/2 0 0; 0 1/4 0] and λ = 3/2 is its unique positive real eigenvalue. Since there are two successive fertile age classes (a₂, a₃ ≠ 0), λ = 3/2 is dominant.
An eigenvector for λ = 3/2 is v = (18, 6, 1)^t, so in the long term y_k ≈ c(3/2)^k (18, 6, 1)^t.
Thus the proportions of the population in the three age classes settle down to the ratio 18 : 6 : 1 and, since y_{k+1} ≈ (3/2)y_k, the population in each age class, and hence the total population, increases by 50% every 15 years.

Lecture 14, page 14
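As a sanity check on the eigenvalue and eigenvector used above, the following NumPy lines (not part of the notes) compute the spectrum of this Leslie matrix directly.

```python
import numpy as np

L = np.array([[0.0,  4.0,  3.0],
              [0.5,  0.0,  0.0],
              [0.0,  0.25, 0.0]])

eigvals, eigvecs = np.linalg.eig(L)
dom = np.argmax(np.abs(eigvals))
print(eigvals[dom])                 # ~ 1.5, the dominant eigenvalue

v = np.real(eigvecs[:, dom])
print(v / v[-1])                    # ~ (18, 6, 1), the proportions 18 : 6 : 1
```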


What if there isn’t a dominant eigenvalue?

If a Leslie matrix does not have a dominant eigenvalue, we have


other ways of seeing how the female population changes over time.
In this course, that usually means you will get some kind of hint...
Example: A female population is governed by the Leslie matrix
L = [0 0 6; 1/2 0 0; 0 1/3 0].

(a) Show that L does not have a dominant eigenvalue.


(b) Find L3 . How will this female population change over time?

Lecture 14, page 15


(a) |L − λI| = −λ³ + 6·(1/2)(1/3) = −λ³ + 1, so the eigenvalues satisfy λ³ = 1: they are λ₁ = 1 and the complex conjugate pair λ₂, λ₃ = e^{±2πi/3}. All three have modulus 1, so no eigenvalue is dominant.
(b) Multiplying out, L³ = I. Thus x_{t+3} = L³x_t = x_t for every t: the population evolves in a cyclical way, repeating itself every three time steps.
Lecture 7, page 20
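The cyclic behaviour is easy to confirm numerically; this NumPy sketch (not part of the notes) checks that L³ = I and iterates the population for a made-up starting vector.

```python
import numpy as np

L = np.array([[0.0, 0.0, 6.0],
              [0.5, 0.0, 0.0],
              [0.0, 1/3, 0.0]])

print(np.linalg.matrix_power(L, 3))        # the 3 x 3 identity matrix, so L^3 = I
print(np.abs(np.linalg.eigvals(L)))        # all eigenvalues have modulus 1

x = np.array([300.0, 200.0, 100.0])        # an illustrative initial population
for t in range(1, 7):
    x = L @ x
    print(t, x)                            # repeats with period 3
```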
MA212: Further Mathematical Methods (Linear Algebra) 2020–21
Exercises 17: Jordan normal forms

For these and all other exercises on this course, you must show all your working.
1. Prove that, if 1 , 2 are distinct eigenvalues of a square matrix A, then the corresponding
eigenvectors v1 , v2 are linearly independent.
Now show that, if A has three distinct eigenvalues 1 , 2 , 3 , then the corresponding eigenvectors are
linearly independent.
2. (a) Let [λ₁ a; 0 λ₂] be a 2 × 2 matrix in Jordan normal form.

Explain briefly why the entry a is either 0 or 1. If a = 1, what can you say about λ₁ and λ₂?

(b) Let S be the set of all 2 × 2 matrices [λ₁ a; 0 λ₂] in Jordan normal form with λ₁ ≤ λ₂.

Show that no two matrices in S are similar.


3. Put the following matrices in Jordan normal form.

(a) A = [0 0 1; 1 0 −3; 0 1 3]   and   (b) B = [1 1 0; 0 0 1; 1 1 1].

4. Solve the following systems of differential equations.

(a) (y₁′, y₂′, y₃′)^t = [0 0 1; 1 0 −3; 0 1 3] (y₁, y₂, y₃)^t   and   (b) (y₁′, y₂′, y₃′)^t = [1 1 0; 0 0 1; 1 1 1] (y₁, y₂, y₃)^t + (0, 0, 1)^t.

5. You are given that A = PJP^{−1} where

A = (1/2)[0 −2 2; −1 −1 3; 1 −1 3],   P = [0 1 1; 1 0 1; 1 1 0]   and   J = [1 1 0; 0 1 0; 0 0 −1].

(a) Use the information given to determine the eigenvalues and corresponding eigenvectors of A.

(b) Find the general solution to the di↵erential equation y 0 = Ay .

(c) Describe the long-term behaviour of this general solution.

(d) Find all solutions for which y1 (t) ! 0 as t ! 1 and y2 (0) = 0.


6. Find the general solution to the system of difference equations

x_k = 3x_{k−1} − y_{k−1},   y_k = x_{k−1} + y_{k−1}   and   z_k = x_{k−1} − y_{k−1} + 2z_{k−1}.

For which initial values (x0 , y0 , z0 ) will zk be positive for all sufficiently large k?
(Handwritten working for Exercises 17, questions 1–6; see the typed solutions below.)
' '
MA212: Further Mathematical Methods (Linear Algebra) 2020–21
Solutions to Exercises 17: Jordan normal forms

This document contains answers to the exercises. There may well be some errors; please do let me
know if you spot any.

1. Suppose that Av₁ = λ₁v₁ and Av₂ = λ₂v₂ with λ₁ ≠ λ₂ and the eigenvectors v₁ and v₂ are non-zero.
Suppose that αv₁ + βv₂ = 0 (†). Applying A to each side gives

0 = A0 = A(αv₁ + βv₂) = αAv₁ + βAv₂ = αλ₁v₁ + βλ₂v₂.

Subtracting λ₁ times equation (†) from this gives 0 = (λ₂ − λ₁)βv₂. As λ₂ − λ₁ ≠ 0 and v₂ ≠ 0, we must have β = 0. From equation (†), we can now deduce that α = 0 as well since v₁ ≠ 0. This shows that {v₁, v₂} is linearly independent.
Now suppose λ₁, λ₂, λ₃ are three distinct eigenvalues of A, with corresponding eigenvectors v₁, v₂, v₃. If we have α₁v₁ + α₂v₂ + α₃v₃ = 0 (⇤) then, applying A, we get α₁λ₁v₁ + α₂λ₂v₂ + α₃λ₃v₃ = 0. Subtracting λ₃ times (⇤) gives α₁(λ₁ − λ₃)v₁ + α₂(λ₂ − λ₃)v₂ = 0. We’ve shown that {v₁, v₂} is a linearly independent set, so we have α₁(λ₁ − λ₃) = 0 and α₂(λ₂ − λ₃) = 0. Thus, as λ₁ ≠ λ₃ and λ₂ ≠ λ₃, we have α₁ = α₂ = 0. Now, from equation (⇤), we can deduce that α₃ = 0 as well since v₃ ≠ 0. This proves that {v₁, v₂, v₃} is linearly independent.
Continuing in this way (i.e. arguing by induction), we can obtain the following important result. If λ₁, . . . , λ_k are distinct eigenvalues of a matrix A, then the corresponding eigenvectors are linearly independent.

2. A 2 × 2 matrix in Jordan Normal Form is of one of the following two types:

(i) [λ₁ 0; 0 λ₂]   or   (ii) [λ 1; 0 λ].

So S consists of diagonal matrices of type (i) with λ₁ ≤ λ₂, and all matrices of type (ii).

Note: A matrix like [λ₁ 1; 0 λ₂], with λ₁ ≠ λ₂, is not in Jordan Normal Form!¹

We know that similar matrices have the same characteristic polynomials. A matrix

[λ₁ ∗; 0 λ₂],

has characteristic polynomial (x − λ₁)(x − λ₂). So λ₁ and λ₂ are uniquely determined by the characteristic polynomial, subject to the requirement that λ₁ ≤ λ₂.
So the only pairs of matrices in the set S with the same characteristic polynomial are those pairs

λI = [λ 0; 0 λ]   and   [λ 1; 0 λ].

These two matrices are not similar, as the first has two linearly independent eigenvectors — indeed
the eigenspace corresponding to eigenvalue is the whole of C2 — and the second does not.
1
Such a matrix has characteristic polynomial (x 1 )(x 2 ), and thus has two linearly independent eigenvectors
(see Question 7 from Exercises 5) and can be diagonalised, so it is similar to (i) — if you think you proved that these
two matrices are not similar then you may wish to go back and find your mistake. A matrix in JNF consists of Jordan
blocks arranged on the diagonal, and each block has all the diagonal entries equal — any diagonal matrix is of this
form, with all its blocks being of size 1.
(Alternatively, if P is any invertible matrix, then P 1( I)P = I: the only matrix similar to I is
I itself.)
So no two matrices in S are similar. We establish (more or less) in lectures that every 2 ⇥ 2 matrix is
similar to a matrix in S: this exercise shows that each 2 ⇥ 2 matrix is similar to exactly one matrix
in S.

3. For (a), the characteristic polynomial of the matrix A is x3 3x2 + 3x 1 = (x 1)3 and so we
have an eigenvalue of 1 with an algebraic multiplicity of three. Following the method given in the
lectures, we then solve
0 1 0 1
1 0 1 1
(A I)x = 0 =) @ 1 1 3A x = 0 =) x = @ 2A
0 1 2 1
is the sole eigenvector. So, we take v1 = (1, 2, 1)t and seek a vector v2 such that
0 1 0 1 0 1
1 0 1 1 1
(A I)v2 = v1 =) @ 1 1 3A v2 = @ 2A =) v2 = @ 1 A ,
0 1 2 1 0
will do the job. We also need a vector v3 such that
0 1 0 1 0 1
1 0 1 1 1
(A I)v3 = v2 =) @ 1 1 3A v 3 = @ 1 A =) v3 = @0A ,
0 1 2 0 0
will do the job. Thus, taking these vectors in the required order we see that
0 1 0 1
1 1 1 1 1 0
P = @ 2 1 0A gives P 1 AP = @0 1 1A = J,
1 0 0 0 0 1
which is the required Jordan Normal Form.
For (b), the characteristic polynomial of this matrix B is pB (x) = x3 2x2 = x2 (x 2) so have
eigenvalues of 2 and 0 where the latter has an algebraic multiplicity of two. Following the method
in the lectures, we then solve
0 1 0 1
1 1 0 1
(B 0I)x = 0 =) @ 0 0 1 x = 0 =) x = 1A ,
A @
1 1 1 0
is the sole eigenvector corresponding to the eigenvalue = 0. So, we take v1 = (1, 1, 0)t and seek a
vector v2 such that
0 1 0 1
1 1 0 1
(B 0I)v2 = v1 =) @0 0 1A v2 = v1 =) v2 = @ 0 A,
1 1 1 1
will do the job. We then solve
0 1 0 1
1 1 0 1
(B 2I)x = 0 =) @0 2 1A x = 0 =) x = @ 1A ,
1 1 1 2
is an eigenvector corresponding to the eigenvalue = 2 and so we can simply take v3 = (1, 1, 2)t .
Thus, taking these vectors in the required order we see that
0 1 0 1
1 1 1 0 1 0
P = @1 0 1A gives P 1 BP = @0 0 0A = J,
0 1 2 0 0 2
which is the required Jordan Normal Form.
4. I do hope you spotted that the matrices here are exactly those in the previous question.
(a) The method is to put y = P z , where
0 1
1 1 1
P =@ 2 1 0A
1 0 0

is the change-of-basis matrix from the previous question. This transforms the system of equations to
0 01 0 10 1
z1 1 1 0 z1
@z20 A = @0 1 1A @z2 A .
z30 0 0 1 z3

The matrix here is of course the matrix P 1 AP in Jordan Normal Form that we found in the previous
question.
Now we solve the equations one after another, working from the bottom row upwards. To start, the
general solution of the equation z30 = z3 is z3 = Xet , for some constant X. Then, we find that
Z
0 t t
z2 z2 = Xe =) z2 e = X dt = Xt + Y,

as the integrating factor is e t and so we have

z2 = Xtet + Y et ,

for some constant Y as the general solution for z2 . Finally we have


Z
t2
z10 z1 = Xtet + Y et =) z1 e t = (Xt + Y ) dt = X + Y t + Z,
2

as the integrating factor is still e t and so we have

t2 t
z1 = X e + Y tet + Zet ,
2
for some constant Z as the general solution for z1 .
Consequently, we have
0 10 2 1 0 2 1
1 1 1 Xt /2 + Y t + Z Xt /2 + (Y X)t + (Z Y + X)
y = P z = e t @ 2 1 0A @ Xt + Y A = et @ Xt2 (2Y X)t (2Z Y ) A .
1 0 0 X Xt2 /2 + Y t + Z

If your answer doesn’t exactly match this in form, it may still be right — one can rewrite the constants
above in terms of (for example) a = X/2, b = Y X, c = Z Y + Z and get another correct solution.
Another thing to note here is that we never needed to use the inverse P 1 .
(b) However, for this question, we do need the inverse of the change-of-basis matrix P we found
before, so that P 1 BP = J. We write y 0 = By + x , set y = P z , and obtain P z 0 = P Jz + x so
that z 0 = Jz + P 1 x . Here
0 1 0 1 0 1
1 1 1 1 3 1 0 1 0
1
P = @1 0 1A , P 1 = @2 2 2A and J = @0 0 0A .
4
0 1 2 1 1 1 0 0 2

Also P 1x =P 1 (0, 0, 1)t = 14 (1, 2, 1)t . Thus the equations have been transformed to
0 01 0 10 1 0 1
z1 0 1 0 z1 1
@z20 A = @0 0 0A @z2 A + @ 2A .1
4
z30 0 0 2 z3 1
Again we solve the equations from the bottom up so that we have
Z
0 1 2t 1 1
z3 2z3 = =) z3 e = e 2t dt = e 2t
+ X,
4 4 8
as the integrating factor is e 2t and so we have
1
z3 = + Xe2t .
8
for some constant X as the general solution for z3 . Then we have
1 1
z20 = =) z2 = t + Y,
2 2
for some constant Y as the general solution for z2 . Finally, we have
✓ ◆
0 1 1 1 2 1
z1 = t+Y + =) z1 = t + Y + t + Z,
2 4 4 4
for some constant Z as the general solution for z1 .
Consequently, we have
0 1
t2 /4 + (Y 1/4)t + Xe2t + (Y + Z 1/8)
y = Pz = @ t2 /4 + (Y + 1/4)t Xe2t + (Z + 1/8) A .
t/2 + 2Xe2t (Y + 1/4)

Again, this is not the only form of the answer.


Another approach is to solve the homogeneous equation y 0 = By , and then find some particular
solution of the equation y00 = By0 + x . The second phase is not easy in general, but in this example,
since x = (0, 0, 1)t is constant, we can take y0 to be the constant vector B 1 x . Thus the general
solution of the equation can be written as y = etB u B 1 x , where u is an arbitrary vector.

5. (a) From the diagonal entries of J, we can read o↵ that the eigenvalues of A are +1 (with an
algebraic multiplicity of two) and 1. The corresponding eigenvectors are (0, 1, 1)t and (1, 1, 0)t , the
first and third columns of P .
Notice that J has a Jordan block of size 2: if v1 and v2 are the first two columns of P (the two
columns corresponding to this block), then (A I)v2 = v1 6= 0, and so v2 is not an eigenvector of
A corresponding to = 1, but v1 is.
In general, if J breaks up into Jordan blocks, the leftmost column of P corresponding to each block
is an eigenvector of A. (For a diagonal matrix, of course, this means that every column of P is an
eigenvector.)
(b) As usual, we first solve z 0 = Jz . The solution to this is, working up: z3 (t) = ae t , z2 (t) = bet ,
z1 (t) = btet + cet , where a, b and c are arbitrary constants.
0 1
bet + ae t
The solution to the original equation is then y = P z = @btet + cet + ae t A.
btet + (b + c)et
(c) As t ! 1, the “dominant” term is btet . If we divide the solution by that dominant term, we see
that 0 b 1 0 1
a
1 t + te2t 0
y = @b + c + a2t A ! @ b A
tet t te
b + b+c
t
b
as t ! 1. (This assumes, of course, that b 6= 0. What happens if this is not the case?)
0 1
0
We might then write y ⇡ btet @1A. Here b is an arbitrary constant, which might be negative.
1
The above expression tells us about the rate of growth of the largest of the yi (t), and the relative
sizes of the yi (t) for large t. Whether this is a satisfactory answer depends on the context. If we
want to know how y2 (t) behaves for large t, this is fine: y2 (t) behaves as btet , for some constant b. If
we want to know the approximate relationship between y2 (t) and y3 (t) for large t, this is also fine:
y2 (t)/y3 (t) ! 1 as t ! 1. We also learn that y1 (t) is “much smaller than” y2 (t) and y3 (t) for large
t, but we might be interested in knowing “how much smaller”. Of course we can read o↵ the answer
to that question as well: y1 (t)/et ! b as t ! 1, so y1 (t) is approximately y2 (t)/t for large t. (More
precisely, ty1 (t)/y2 (t) ! 1 as t ! 1.)
(d) The given conditions tell us that b = 0, and then that c = a. So the solution is: y1 (t) = ae t ,
y2 (t) = a(e t et ), y3 (t) = aet .
0 1 0 10 1
xk 3 1 0 xk 1
6. @ A
We write yk = 1@ 1 0 A @ yk 1
A and call the matrix A.
zk 1 1 2 zk 1

We calculate pA (x) = (x 2)[(x 3)(x 1) + 1] = (x 2)3 and so we have an eigenvalue of 2 with


an algebraic multiplicity of three. Following the method given in the lectures, we then solve
0 1 0 1 0 1
1 1 0 1 0
(A 2I)x = 0 =) @1 1 0A x = 0 =) x = s @1A + t @0A ,
1 1 0 0 1

for s, t 2 R. That is, we find that (1, 1, 0)t and (0, 0, 1)t are two linearly independent eigenvectors
of A. Of course, we can’t diagonalise A as we are one eigenvector short, and so we go for the JNF
instead. Thus, taking note of the matrix A 2I, we choose v1 = (1, 1, 1)t to be our eigenvector so
that 0 1 0 1 0 1
1 1 0 1 1
(A 2I)v2 = v1 =) @1 1 0A v2 = @1A =) v2 = @0A ,
1 1 0 1 0

will do the job. We then take v3 = (0, 0, 1)t as the other eigenvector as it is not a scalar multiple of
the eigenvector v1 . Thus, taking these vectors in the required order we see that
0 1 0 1
1 1 0 2 1 0
P = @1 0 0A gives P 1 AP = @0 2 0A = J,
1 0 1 0 0 2

which is the required Jordan Normal Form.


Note: you might well get a di↵erent matrix P , but you should get the same matrix J (or the similar
matrix with J23 = 1 and J12 = 0). The matrix A has an eigenspace E(A; 2) of dimension 2 — we
can see this, for instance, because A 2I has rank 1 and thus nullity 2 — and so any matrix similar
to it also has exactly two linearly independent eigenvectors: for this J, these are the first and third
standard vectors.
Now we set Ak = (P JP 1 )k = P J k P 1 . To find J k , we think of it as made up of two Jordan blocks:
one is the 1 ⇥ 1 matrix with entry 2 in the bottom right corner, and this matrix has kth power (2k ).
The other block is the 2 ⇥ 2 matrix
✓ ◆ ✓ k ◆
2 1 2 k2k 1
and this has kth power .
0 2 0 2k

So we have
0 10 k 10 1 0 1
1 1 0 2 k2k 1 0 0 1 0 k+2 k 0
Ak = P J k P 1
= @1 0 0A @ 0 2k 0 A @1 1 0A = 2k 1@
k 2 k 0A
1 0 1 0 0 2k 0 1 1 k k 2
and the general solution of the recurrence relations is
0 1 0 1 0 1
xk x0 k(x0 y0 ) + 2x0
@ yk A = Ak @ y0 A = 2k 1 @ k(x0 y0 ) + 2y0 A .
zk z0 k(x0 y0 ) + 2z0

Lastly, zk = 2k 1 [k(x
0 y0 ) + 2z0 ] is positive for large k if either x0 > y0 , or x0 = y0 and z0 > 0.
MA212 Further Mathematical Methods
Lecture 15: Sums and complements

Dr James Ward

⌅ Sums and intersections


⌅ Direct sums
⌅ Complements
⌅ Orthogonal complements

The London School of Economics and Political Science


Department of Mathematics

Lecture 15, page 1


Sums and intersections...

We’re now going to return to real vector spaces so that we can


develop some new ideas.

Definition: Suppose that U and W are two subspaces of a


real vector space.
The sum of U and W is

U + W = {u + w | u 2 U and w 2 W }

and the intersection of U and W is

U \ W = {x | x 2 U and x 2 W }.

Lecture 15, page 3


...are subspaces

Theorem: If U and W are subspaces of a real vector space


V , then U + W and U \ W are also subspaces of V .

Proof:

Lecture 15, page 4




Some geometric insight

In R3 , lines and planes through the origin are subspaces, and so we


can see how sums and intersections work.
(Sketches illustrating U + W and U ∩ W for lines and planes through the origin in R³.)
However, the union of two subspaces, i.e.


U [ W = {x | x 2 U or x 2 W }
is not usually a subspace and that’s why we need the sum U + W !
Lecture 15, page 5
Example

Find U + W and U \ W when


U = {(x, y, 0)^t : x, y ∈ R}   and   W = {(a, a, b)^t : a, b ∈ R}.

Any (x, y, z)^t ∈ R³ can be written as (x − z, y − z, 0)^t + (z, z, z)^t, where the first vector is in U and the second is in W, so U + W = R³.
If v ∈ U ∩ W then v = (x, y, 0)^t and also v = (a, a, b)^t, so b = 0 and x = y = a. Hence

U ∩ W = Lin{(1, 1, 0)^t}.
Lecture 15, page 6
Direct sums

Definition: Suppose that U and W are two subspaces of a


real vector space.
We say that the sum U + W is direct, written U ⊕ W, if
every vector in U + W can be written as a vector in U plus
a vector in W in exactly one way.

We know that x 2 U + W gives us at least one way of writing


x = u + w with u 2 U and w 2 W .
But, if the sum is direct, x 2 U W can only be written as
x = u + w with u 2 U and w 2 W .
in exactly one way.
Essentially, for non-trivial proper subspaces, a sum is direct when
vectors from U and vectors from W are linearly independent. Lecture 15, page 7
Deciding whether a sum is direct

Theorem: Suppose that U and W are subspaces of a real


vector space. The sum U + W is direct if and only if

U \ W = {0}.

Proof: (⟹) Suppose U + W is direct, so every v ∈ U + W can be written as v = u + w with u ∈ U and w ∈ W in exactly one way.
Let x ∈ U ∩ W. Then x = x + 0 with x ∈ U and 0 ∈ W, and also x = 0 + x with 0 ∈ U and x ∈ W. As there is only one way of writing x, these two expressions must agree, so x = 0. Hence the only member of U ∩ W is 0, i.e. U ∩ W = {0}.
Lecture 15, page 8


Deciding whether a sum is direct

Theorem: Suppose that U and W are subspaces of a real


vector space. The sum U + W is direct if and only if

U \ W = {0}.

Proof: (⟸) Suppose U ∩ W = {0} and suppose some v ∈ U + W can be written in two ways, say v = u₁ + w₁ = u₂ + w₂ with u₁, u₂ ∈ U and w₁, w₂ ∈ W.
Then u₁ − u₂ = w₂ − w₁, where the left-hand side is in U and the right-hand side is in W (as U and W are closed under vector addition and scalar multiplication). So u₁ − u₂ and w₂ − w₁ both lie in U ∩ W = {0}, giving u₁ = u₂ and w₁ = w₂. That is, there is only one way of writing v as u + w with u ∈ U and w ∈ W, and so the sum U + W is direct.
Lecture 15, page 8


Complements

Definition: Suppose that U and W are two subspaces of a


real vector space V. W is a complement of U if U ⊕ W = V.

Note: Every subspace U of V has a complement. For instance, if

{u1 , u2 , . . . , uk } is a basis of U,

we can extend this to a basis of V , say

{u1 , u2 , . . . , uk , uk+1 , uk+2 , . . . , uk+n }

so that the subspace

W = Lin{uk+1 , uk+2 , . . . , uk+n }.

is a complement of U.

Lecture 15, page 9


Some geometric insight
(Sketches: in R², two distinct lines U and W through the origin are complements of each other, R² = U ⊕ W; in R³, a plane U and a line W through the origin not contained in U give R³ = U ⊕ W.)
Lecture 15, page 10


Orthogonal complements

Every non-trivial proper subspace U of V can have many di↵erent


complements, but of these, one is very special...

Definition: Suppose that U is a subspace of the real vector


space V . The orthogonal complement of U is

U ? = {v 2 V | hu, v i = 0 for all u 2 U}.

Looking at this geometrically, we have...

Lecture 15, page 11


Some properties of orthogonal complements

Theorem: If U is a subspace of a real [finite-dimensional]


vector space V , then
(1) U⊥ is a subspace of V too (see Exercises 18),

(2) V = U U ?,
(3) dim(V ) = dim(U) + dim(U ? ),
(4) (U ? )? = U.

Proof:

Lecture 15, page 12



(2) Suppose {u₁, u₂, . . . , u_k} is an orthonormal basis of U and extend it to an orthonormal basis {u₁, . . . , u_k, u_{k+1}, . . . , u_n} of V. Any v ∈ V can be written as

v = (α₁u₁ + · · · + α_k u_k) + (α_{k+1}u_{k+1} + · · · + α_n u_n),

where the first bracket is in U and, by orthonormality, the second is in U⊥. So V = U + U⊥. Moreover, if u ∈ U ∩ U⊥ then ⟨u, u⟩ = 0, so u = 0 and U ∩ U⊥ = {0}. Hence V = U ⊕ U⊥.
(3) Counting the basis vectors above, dim(V) = n = k + (n − k) = dim(U) + dim(U⊥).
(4) We have U ⊆ (U⊥)⊥ (proved in Q5 of Exercises 18) and, by (1), (U⊥)⊥ is also a subspace. Applying (3) to U and then to U⊥ gives

dim(U) = dim(V) − dim(U⊥) = dim((U⊥)⊥),

and since U ⊆ (U⊥)⊥ with dim(U) = dim((U⊥)⊥), we conclude that U = (U⊥)⊥.
A useful result about matrices

Theorem: If A is a real m ⇥ n matrix, then R(A)? = N(At ).


t
RCA ) CAH ) NCA)
'
and = N =

Proof: Suppose A is a real m ⇥ n matrix and recall that


(Ax ) · z = (Ax )t z = x t At z = x t (At z ) = x · (At z ).
We then have
z 2 R(A)? () (Ax ) · z = 0 for every Ax 2 R(A)
() x · (At z ) = 0 for every x 2 Rn
() At z = 0
() z 2 N(At )
and so R(A)? = N(At ), as required.
Note: (At )t = A and (U ? )? = U, and so N(A)? = R(At ) too.

Lecture 15, page 14


Example
Verify that R(A)⊥ = N(A^t) using the matrix A = [1 0; 0 1; 0 1].

Here A^t = [1 0 0; 0 1 1], so N(A^t) = {z ∈ R³ : z₁ = 0, z₂ + z₃ = 0} = Lin{(0, 1, −1)^t}, while R(A) = Lin{(1, 0, 0)^t, (0, 1, 1)^t}.
Every vector in N(A^t) is orthogonal to every vector in R(A), since (0, 1, −1)·(1, 0, 0) = 0 and (0, 1, −1)·(0, 1, 1) = 0, so N(A^t) ⊆ R(A)⊥. As dim R(A)⊥ = 3 − rank(A) = 1 = dim N(A^t), we conclude that R(A)⊥ = N(A^t).
Lecture 15, page 15
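The relationship R(A)⊥ = N(A^t) is also easy to check numerically: a vector lies in R(A)⊥ exactly when it is orthogonal to every column of A. The sketch below (not part of the notes) uses the matrix from the example above and a basis of N(A^t) computed via the SVD.

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 1.0]])

# Null space of A^t: right-singular vectors of A^t for (numerically) zero singular values
U, s, Vt = np.linalg.svd(A.T)
rank = len(s[s > 1e-10])
null_At = Vt[rank:].T                  # columns form a basis of N(A^t)

# Each basis vector of N(A^t) is orthogonal to every column of A, i.e. to R(A)
print(A.T @ null_At)                   # ~ zero matrix
print(null_At)                         # proportional to (0, 1, -1)^t
```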


MA212 Further Mathematical Methods
Lecture 16: Projections

Dr James Ward

⌅ Projections
⌅ Orthogonal projections

The London School of Economics and Political Science


Department of Mathematics

Lecture 16, page 1


Information

⌅ Exercises 18 is on Moodle.
I Attempt questions 1, 2, 3, 4, 5 and 6.
I Follow your class teacher’s submission instructions.
I Do it! Actually submit your homework!

⌅ Extra Examples Sessions


I Start on Tuesday 12:00–13:00 on Zoom.
I Contact me via the Moodle forum if you want me to cover
anything in particular.

⌅ Classes
I Go to them.

Lecture 16, page 2


Projections

Definition: Suppose that U and W are subspaces of a real


vector space V such that V = U ⊕ W. That is, every v ∈ V
can be written as
v = u + w,

with u 2 U and w 2 W in exactly one way.


The projection of V onto U parallel to W is the function
T : V ! V given by T (v ) = u.
Geometrically, V = U ⊕ W and a vector v = u + w, with u ∈ U and w ∈ W, is mapped to T(v) = u. (Sketches in R² and R³.)
Lecture 16, page 3
Questions, so many questions...

We can now ask some questions.

⌅ Are projections linear transformations?


Yes! So we can represent them by matrices!
⌅ Are projections represented by any particular kind of matrix?
Yes! The matrix must be idempotent!
⌅ Can we find the matrix that represents a given projection?
Yes! We’ll see two ways of doing this!

Let’s look at each of these in turn...

Lecture 16, page 4


Projections are linear transformations

Theorem: The projection of V onto U parallel to W is a


linear transformation.

Proof: Suppose that α, β ∈ R and v₁, v₂ ∈ V, i.e.

v₁ = u₁ + w₁ and v₂ = u₂ + w₂,

with u₁, u₂ ∈ U and w₁, w₂ ∈ W in exactly one way.
This means that, as U and W are CUVA and CUSM, we have

αv₁ + βv₂ = α(u₁ + w₁) + β(u₂ + w₂) = (αu₁ + βu₂) + (αw₁ + βw₂),

with αu₁ + βu₂ ∈ U and αw₁ + βw₂ ∈ W. Thus,

T(αv₁ + βv₂) = αu₁ + βu₂ = αT(v₁) + βT(v₂),

and so T is a linear transformation.

Lecture 16, page 5


The onto and parallel spaces

As a projection is a linear transformation, it has an image and a


kernel which are related to the ‘onto space’ and ‘parallel space’.

Theorem: If T is the projection of V onto U parallel to W ,


then im(T ) = U and ker(T ) = W .

Proof: Let T be the projection of V onto U parallel to W .

⌅ If u 2 U, then we have u = u + 0 for u 2 U and 0 2 W so


that T (u) = u which means that u 2 im(T ).
⌅ If u 2 im(T ), then for some x 2 V we have T (x ) = u and
so, as T projects onto U, u 2 U.

Thus, the ‘onto space’ U = im(T ), as required.

Lecture 16, page 6


⌅ If w 2 W , then we have w = 0 + w for 0 2 U and w 2 W
so that T (w ) = 0 which means that w 2 ker(T ).
⌅ If w 2 ker(T ), then T (w ) = 0 and so, as T projects onto U,
we must have w = 0 + w for 0 2 U and w 2 W .

Thus, the ‘parallel space’ W = ker(T ), as required.

Now, as we want to represent projections by matrices, this means


that we have the following obvious corollary.

Corollary: If P represents a projection, then it will project


onto its range R(P) parallel to its null space N(P).

Lecture 16, page 7


Idempotent matrices

We now need to introduce the following kind of matrix.

Definition: A real matrix P is idempotent if P 2 = P.

And this is essentially because of the following theorem.

Theorem: If P is an idempotent n ⇥ n matrix, then

Rⁿ = R(P) ⊕ N(P).

Proof: Suppose that P is idempotent, i.e. P 2 = P.


For any v 2 Rn , we have
v = Pv + (In P)v
where Pv 2 R(P) and (In P)v 2 N(P) as...
Lecture 16, page 8
P(In P)v = (P P 2 )v = (P P)v = 0.
Thus, v 2 R(P) + N(P) and so Rn ✓ R(P) + N(P).
Obviously, as R(P), N(P) ✓ Rn , we also have R(P) + N(P) ✓ Rn
and so R(P) + N(P) = Rn .
Further, consider any vector y 2 R(P) \ N(P) so that

y 2 R(P) and y 2 N(P).

This means that there is a vector x 2 Rn such that Px = y and


Py = 0. Thus, we have

y = Px = P 2 x = P(Px ) = Py = 0,

and so we must have R(P) \ N(P) = {0}.


Consequently, Rⁿ = R(P) ⊕ N(P), as required.

Lecture 16, page 9


Idempotent matrices are projections

Theorem: An n ⇥ n matrix P represents a projection if and


only if P is idempotent.

Proof: We prove this both ways.


LTR: Suppose that the n ⇥ n matrix P is a projection onto R(P)
parallel to N(P) so that, for any v 2 Rn , we have

Pv 2 R(P) and P(Pv ) = Pv

as Pv = Pv + 0 with Pv 2 R(P) and 0 2 N(P).


Thus, as P 2 v = Pv for all v 2 Rn , we have P 2 = P.

Lecture 16, page 10


RTL: Suppose that P is idempotent so that Rⁿ = R(P) ⊕ N(P).
This means that, for any v 2 Rn , we have

v = u + w,

with u 2 R(P) and w 2 N(P). Thus,

Pv = Pu + Pw = Pu + 0 = Pu 2 R(P),

and so P is the projection onto R(P) parallel to N(P).

Let’s now see how we can find a matrix which represents a given
projection.
We’ll look at two methods. ‘Method 1’ follows what we have seen
before whereas ‘Method 2’ will be more useful in the next lecture
and beyond.

Lecture 16, page 11


Finding projection matrices: Method 1

We want T to project Rk+n onto U parallel to W where

U = Lin{v1 , v2 , . . . , vk } and W = Lin{vk+1 , vk+2 , . . . , vk+n }.

This means that, relative to the ordered basis

B = (v1 , v2 , . . . , vk , vk+1 , vk+2 , . . . , vk+n ),

our projection is
0 1
| | | | | |
B C
AB,B
T = @e1 e2 · · · ek 0 0 ··· 0A ,
| | | | | |

and the matrix we seek is A_T = M_B A_T^{B,B} M_B^{−1}.

Lecture 16, page 12


Example of Method 1 (projection)

Find the projection of R³ onto U parallel to W when

U = Lin{(1, 0, 1)^t, (0, 1, 1)^t}  (the ‘onto’ space)   and   W = Lin{(2, 1, 0)^t}  (the ‘parallel’ space).

Take the ordered basis B = ((1, 0, 1)^t, (0, 1, 1)^t, (2, 1, 0)^t) of R³ = U ⊕ W. Relative to B the projection has matrix A_T^{B,B} = [1 0 0; 0 1 0; 0 0 0], and M_B is the matrix with the vectors of B as its columns, so

A_T = M_B A_T^{B,B} M_B^{−1},

which has R(A_T) = U and N(A_T) = W as required.
Lecture 16, page 13


Finding projection matrices: Method 2

Suppose that
⌅ A is an n ⇥ m matrix of rank m and
⌅ B is an m ⇥ n matrix of rank m
so that the m ⇥ m matrix BA is invertible.
The projection of Rⁿ onto R(A) parallel to N(B) is then

P = A(BA)^{−1}B.

We can easily verify that P is idempotent, and as

PA = A and BP = B,

we can also show that R(P) = R(A) and N(P) = N(B).

Lecture 16, page 14
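As a quick numerical illustration of this construction (not from the notes; the particular A and B below are those of the worked Method 2 example that follows, as reconstructed), we can build P and check it is idempotent with the right range and null space.

```python
import numpy as np

A = np.array([[1.0], [0.0], [1.0]])       # n x m with rank m = 1, R(A) = U
B = np.array([[3.0, -3.0, 1.0]])          # m x n with rank m = 1, N(B) = W

P = A @ np.linalg.inv(B @ A) @ B          # P = A (BA)^{-1} B

assert np.allclose(P @ P, P)              # P is idempotent
assert np.allclose(P @ A, A)              # PA = A, so R(P) = R(A)
assert np.allclose(B @ P, B)              # BP = B, so N(P) = N(B)
print(P)
```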


R(P) = R(A): if y ∈ R(P) then y = Pv = A[(BA)^{−1}Bv] ∈ R(A), so R(P) ⊆ R(A); conversely, if y = Ax ∈ R(A) then, using PA = A, y = Ax = PAx ∈ R(P), so R(A) ⊆ R(P).
N(P) = N(B): if Bv = 0 then Pv = A(BA)^{−1}Bv = 0, so N(B) ⊆ N(P); conversely, if Pv = 0 then, using BP = B, Bv = BPv = 0, so N(P) ⊆ N(B).
Example of Method 2 (projection)

Find the projection of R³ onto U parallel to W when

U = Lin{(1, 0, 1)^t}   and   W = Lin{(1, 1, 0)^t, (0, 1, 3)^t}.

Here P = A(BA)^{−1}B will project onto R(A) parallel to N(B), so we take A = (1, 0, 1)^t, a 3 × 1 matrix of rank 1 with R(A) = U, and B = (3, −3, 1), a 1 × 3 matrix of rank 1 with N(B) = W (B is orthogonal to both spanning vectors of W).
Then BA = 4, so (BA)^{−1} = 1/4 and

P = A(BA)^{−1}B = (1/4)[3 −3 1; 0 0 0; 3 −3 1].

As checks: P² = P, R(P) = U and N(P) = W.
Lecture 16, page 15


Orthogonal projections

We can now find the projection of Rn onto U parallel to any


complement W of U, i.e. any W such that Rⁿ = U ⊕ W.
But, of all the complements of U, the orthogonal complement U ?
is special as it gives us the following.

Definition: Suppose that U is a subspace of a real vector


space V . The orthogonal projection of V onto U is the
projection of V onto U parallel to U ? .

Geometrically, v = u + w with u ∈ U and w ∈ U⊥, and the orthogonal projection maps v to T(v) = u. (Sketches in R² and R³.)
Lecture 16, page 16
Symmetric idempotent matrices are orthogonal projections

Theorem: An n ⇥ n matrix P represents an orthogonal


projection if and only P is idempotent and symmetric.

Proof: We prove this both ways.


LTR: Suppose P represents an orthogonal projection onto R(P).
As P is a projection, it must be idempotent.
As it is the orthogonal projection onto R(P), for any vectors
x , y 2 Rn , we have

Px 2 R(P) and (In P)y 2 R(P)?

as y = Py + w with Py 2 R(P) and w 2 R(P)? .

Lecture 16, page 17


This means that

(Px) · (I_n − P)y = 0   ⟹   x^t P^t (I_n − P)y = 0,

for all x, y ∈ Rⁿ, i.e. (see Question 11 of Exercises 13) we have

P^t(I_n − P) = 0_n   ⟹   P^t = P^t P.
But, (P t P)t = P t P and so (P t )t = P t which means that P = P t ,
i.e. P is symmetric, as required.
RTL: If P is idempotent and symmetric, then it represents a
projection onto R(P) parallel to N(P) and P t = P. But,

N(P) = N(P t ) = R(P)? ,

and so P is the orthogonal projection onto R(P), as required.

Lecture 16, page 18


Finding orthogonal projections

Method 1 will still work but we can now use M_B^{−1} = M_B^t if we work with an orthonormal basis B of Rⁿ.

You can easily verify that M_B A_T^{B,B} M_B^t is symmetric.

Method 2 now requires that

⌅ A is an n × m matrix of rank m

so that the m × m matrix A^tA is invertible.
The projection of Rⁿ onto R(A) parallel to N(A^t) = R(A)⊥ is then

P = A(A^tA)^{−1}A^t.

You can easily verify that P is symmetric.

Lecture 16, page 19
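A compact numerical sketch (not from the notes; the matrix A is an arbitrary illustrative full-column-rank matrix) showing that P = A(A^tA)^{−1}A^t is symmetric, idempotent and fixes the columns of A:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [2.0, 1.0],
              [0.0, 1.0]])                  # n x m with rank m

P = A @ np.linalg.inv(A.T @ A) @ A.T        # orthogonal projection onto R(A)

assert np.allclose(P, P.T)                  # symmetric
assert np.allclose(P @ P, P)                # idempotent
assert np.allclose(P @ A, A)                # fixes R(A)
print(np.round(P, 4))
```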


Example of Method 1 (orthogonal projection)
(Note: for an orthogonal matrix M, M^{−1} = M^t.)
Find the orthogonal projection of R³ onto U = Lin{(2, 1, 1)^t}.

First find a basis of U⊥, i.e. two linearly independent vectors orthogonal to (2, 1, 1)^t, and put them together with (2, 1, 1)^t to form an ordered basis B of R³ = U ⊕ U⊥. Relative to B the projection has matrix A_T^{B,B} = [1 0 0; 0 0 0; 0 0 0], so

P = M_B A_T^{B,B} M_B^{−1},

and if B is chosen to be orthonormal we can use M_B^{−1} = M_B^t. As checks: P² = P, P^t = P, R(P) = U and N(P) = U⊥.
Lecture 16, page 20


Example of Method 2 (orthogonal projection) RCA)

Find the orthogonal projection of R³ onto R(A) = Lin{(1, 2, 0)^t, (0, 1, 1)^t}.

Here we take A to be the 3 × 2 matrix with these two vectors as its columns. A has rank 2, so A^tA is a 2 × 2 invertible matrix and

P = A(A^tA)^{−1}A^t

is the orthogonal projection onto R(A) parallel to N(A^t) = R(A)⊥.
Lecture 16, page 21


Extra examples session note: suppose we have z′ = Jz with J = [λ 1; 0 λ], i.e.

z₁′ = λz₁ + z₂   and   z₂′ = λz₂.

The second equation gives z₂ = Ae^{λt}. Substituting into the first, z₁′ − λz₁ = Ae^{λt}; multiplying by the integrating factor e^{−λt} gives (z₁e^{−λt})′ = A, so z₁e^{−λt} = At + B and

z₁(t) = (At + B)e^{λt},   z₂(t) = Ae^{λt}.
MA212: Further Mathematical Methods (Linear Algebra) 2020–21
Exercises 18: Dominant eigenvalues, sums and complements

For these and all other exercises on this course, you must show all your working.
1. An n ⇥ n matrix A is called nilpotent if there exists a positive integer k such that Ak is the
zero matrix.
(a) Show that, if all the entries of A on and below the diagonal are zero, then A is nilpotent.
(b) Show that, if A is nilpotent and B is similar to A, then B is nilpotent.
(c) Show that every n ⇥ n matrix is the sum of a diagonalisable matrix and a nilpotent matrix.
(d) Show that, if is an eigenvalue of a nilpotent matrix, then = 0.
(e) Show that 0 is an eigenvalue of every nilpotent matrix.

2. Let A be a real diagonalisable matrix whose eigenvalues are all non-negative.


Show that there is a real matrix B such that B 2 = A. Is the matrix B unique?
What if A is not diagonalisable? (You need only consider the case where A is a 2 ⇥ 2 matrix.)

3. A population develops according to xt+1 = Lxt where


L = [0 2 1 0; 9/10 0 0 0; 0 3/4 0 0; 0 0 1/2 0],
0 0 1/2 0
and the entries of xt represent the number of individuals aged 0-10, 10-20, 20-30 and 30-40 respectively
after t decades. Assume that this population has been developing in this way for a long period of
time and that, in 2010, there are 12,000 individuals in this population.
(a) Given that λ = 3/2 is the dominant eigenvalue of L, find the corresponding eigenvector.
(b) Approximately how many individuals are aged 0-10 in 2010?
(c) Approximately how many individuals were there in 2000?
(For (b) and (c), you may assume that the population has been following the dominant behaviour.)

4. A population develops according to xt+1 = Lxt where


L = [0 3/2 33/25; 2/3 0 0; 0 3/5 0],
and the entries of xt represent the number of individuals aged 0-10, 10-20 and 20-30 respectively
after t decades.
Given that (−3 + √2 i)/5 is an eigenvalue of L, find the other eigenvalues and an eigenvector corresponding to the dominant eigenvalue.
(Note: You should not need to find the characteristic polynomial of L!)
Explain the significance of the dominant eigenvalue and its corresponding eigenvector in terms of the
age-structure of the population.
(You may assume that the population has been following the dominant behaviour.)

Exercises continue overleaf...


5. Suppose that S is any subset of a real inner product space V .
Show that S ? is also a subspace of V and that S ✓ (S ? )? .
Further, show that if C and D are subsets of a real vector space with C ✓ D, then D? ✓ C ? .

6. Consider the matrix

A = [3 1 10; 1 2 5].
Determine the range and null space of A, and also the range and null space of At .
Verify that R(At ) and N (A) are orthogonal complements, as are R(A) and N (At ).
(Handwritten working for Exercises 18, questions 1–6; see the typed solutions below.)
therefore , RCAT) and NCA) are orthogonal complements , as are RA) and MCAT)
MA212: Further Mathematical Methods (Linear Algebra) 2020–21
Solutions to Exercises 18: Dominant eigenvalues, sums and complements

This document contains answers to the exercises. There may well be some errors; please do let me
know if you spot any.

1. (a) For each t, if we take the tth power of such a matrix A, all entries on the first t diagonals
on and above the leading diagonal, as well as all entries below the leading diagonal, are zero. In
particular, An = 0.
(b) If B = P AP 1, where P is invertible, and Ak = 0, then B k = P Ak P 1 = 0.
(c) Write B = P JP 1 , where J is in Jordan Normal Form. Write J = D + N , where D is diagonal
and N is nilpotent (by (a)). Now B = P DP 1 + P N P 1 . The first such matrix is diagonalisable by
construction, and the second is nilpotent by (b).
(d) If is an eigenvalue with corresponding eigenvector v , then Ak v = k v , for all k. As A is
nilpotent, Ak = 0 for some k, meaning that k v = 0. Since v 6= 0, it follows that k = 0 for some
k, and so that = 0.
(e) We know that every matrix has at least one (real or complex) eigenvalue. It follows from (d) that
a nilpotent matrix has zero as an eigenvalue.
(Alternatively, if an n ⇥ n matrix A does not have zero as an eigenvalue, then it represents a trans-
formation from Rn to Rn whose image is the whole of Rn . Thus Ak also represents a transformation
from Rn to Rn whose image is the whole of Rn , for each k. So Ak is never equal to the zero matrix.)

2. Suppose that A is diagonalisable, and let P be the invertible matrix such that P^{−1}AP is a diagonal matrix D with λ₁, λ₂, . . . , λ_n on the diagonal of D. We have assumed that λ_i is non-negative for all i.
We can define D′ to be the diagonal matrix with √λ₁, √λ₂, . . . , √λ_n on the diagonal. Letting A′ be PD′P^{−1} we have that A′A′ = PD′P^{−1}PD′P^{−1} = PD′D′P^{−1} = PDP^{−1} = A.
The solution is not unique as we could have taken D′ to be the diagonal matrix with −√λ₁, √λ₂, . . . , √λ_n on the diagonal.
What if A is a 2 × 2 matrix that is not diagonalisable? We have to consider the case where A = PJP^{−1} and

J = [λ 1; 0 λ],

for λ ≥ 0. In this case, we can confirm that

K = [√λ  1/(2√λ); 0  √λ]

has K² = J, and we can again set A′ = PKP^{−1}. There is a catch in that we’d better not have λ = 0: for instance the matrix

[0 1; 0 0]
does not have a square root.
3. (a) Finding the eigenvector in the usual way, we get (10, 6, 3, 1)t .
(b) The interpretation of the eigenvector is that it gives the proportions of the population in each
age-range. So here the proportions should be (roughly) 1/2, 3/10, 3/20, 1/20 in age ranges 0-10,
10-20, 20-30 and 30-40 respectively. As there are 12,000 individuals in all in 2010, roughly 6,000 of
them are aged 0-10.
(c) The interpretation of the eigenvalue is that it gives the increase in the population over (here) a 10-year period. So there will be (approximately) 12,000λ = 18,000 individuals in 2020. Going the other way, there were (approximately) 12,000/λ = 8,000 individuals in 2000.
4. Suppose we are willing to trust the question-setter that (−3 + √2 i)/5 is an eigenvalue of matrix L. Then we know that, because L is a real matrix, another eigenvalue is the complex conjugate (−3 − √2 i)/5. How can we get the third eigenvalue without evaluating and factorising |L − xI|? The easiest way is to focus on the trace of L: we know that this is the sum of the eigenvalues. The trace of L is 0, and the two eigenvalues we’ve found so far sum to −6/5, so the third one must be 6/5.
The dominant eigenvalue is the one of largest modulus, and this is 6/5.
If the eigenvector corresponding to eigenvalue 6/5 is (x, y, z)^t, then we have to solve the equations

(3/2)y + (33/25)z = (6/5)x,   (2/3)x = (6/5)y   and   (3/5)y = (6/5)z.
We can write down a solution to the second and third of these equations instantly: x = 1, y =
10/18 = 5/9, z = y/2 = 5/18. We can take the integer eigenvector (x, y, z)t = (18, 10, 5)t for
convenience. Then it would be good practice to check that it does solve the first equation.
For the final part: the population grows by a factor of approximately 6/5 every ten years, and the
proportions of the various age-groups in the population should approach 18/33, 10/33, 5/33.

5. To show that S ? is a subspace of V , we note that

• 0 2 S ? as hu, 0i = 0 for all u 2 S, i.e. S ? is non-empty.

• Let v and v 0 be any vectors in S ? , and let u be any vector in S, so that hu, v i = 0 and
hu, v 0 i = 0. This means that

hu, v + v 0 i = hu, v i + hu, v 0 i = 0 + 0 = 0,

so v + v 0 2 S ? , i.e. S ? is closed under vector addition.

• Let v be any vector in S ? and let u be any vector in S, so that hu, v i = 0. If ↵ is any scalar,
we then have
hu, ↵v i = ↵hu, v i = ↵0 = 0,
so ↵v 2 S ? , i.e. S ? is closed under scalar multiplication.

To show that S ✓ (S ? )? , note that every vector u 2 S satisfies hv , ui = hu, v i = 0 for all v 2 S ? .
This is exactly what it means to say that u is in (S ? )? .
Note: if S is not a subspace, then S cannot be equal to (S ? )? , which is. In the finite-dimensional
case, if S is a subspace, then S = (S ? )? : to prove this we can use a dimension argument.
To say that v 2 D? means that v is orthogonal to every vector in D. As C ✓ D, this certainly
implies that v is orthogonal to every vector in C, i.e. v 2 C ? .
6. The range (column space) of the matrix A is a subset of R2 that does not consist of the
multiples of any single column, so it has dimension at least 2, and therefore is the whole of R2 . The
nullity of A is equal to 1 by the rank-nullity theorem. You can check that (3, 1, −1)^t is in the null space, and so the null space is the set of multiples of this vector.
The range R(A^t) is equal to U = Lin({(3, 1, 10)^t, (1, 2, 5)^t}). The work to check that this is the orthogonal complement of V = Lin({(3, 1, −1)^t}) is the same as that needed to check that V is the
null space of A.
The null space of At is {0} (for instance, by the rank-nullity theorem). This is indeed the orthogonal
complement of R(A) = R2 : the only vector orthogonal to every vector in R2 is the zero vector.
Note that it’s not enough just to say that U = R(A) is spanned by (3, 1)t and (1, 2)t , and to note that
these vectors are both orthogonal to 0. The point is that every subspace U satisfies that condition,
but not every subspace is the orthogonal complement of {0}! Indeed, {0}? is the set of all vectors
orthogonal to 0, which is the entire space (here R2 ), so we need to observe that the columns of A do
indeed span R2 .
MA212 Further Mathematical Methods
Lecture 17: Orthogonal projections continued

Dr James Ward

⌅ What’s special about orthogonal projections?


⌅ Least squares

The London School of Economics and Political Science


Department of Mathematics

Lecture 17, page 1


What’s special about orthogonal projections?

Let’s suppose that V = U ⊕ U⊥ and that P is the orthogonal projection onto U. If v ∉ U, what’s special about Pv?

(Sketch in R³: ‖v − Pv‖ ≤ ‖v − u‖ for all u ∈ U.)

Thus, Pv ∈ U is closer to v than any other vector in U, i.e.

distance between v and Pv ≤ distance between v and u

for any u ∈ U.
Lecture 17, page 3
The general result

Theorem: If P is the orthogonal projection of V onto U,

kv Pv k  kv uk,

for any u 2 U.

Proof: Suppose that v 2 V . First note the following.


⌅ As Pv 2 U and u 2 U, there is a vector x 2 U given by
x = Pv − u
as U is CUVA.
⌅ As Pv 2 U and
v = Pv + (v − Pv)
the vector v − Pv must be in U⊥.
Lecture 17, page 4
This means that we have

‖v − u‖² = ‖v − (Pv − x)‖² = ‖(v − Pv) + x‖².

But, as x ∈ U and v − Pv ∈ U⊥, these vectors are orthogonal and so the generalised theorem of Pythagoras gives us

‖v − u‖² = ‖v − Pv‖² + ‖x‖²

and, as ‖x‖² ≥ 0, this means that

‖v − u‖² ≥ ‖v − Pv‖²

from which the result follows.

Lecture 17, page 5


Least squares analysis

Suppose that, for i = 1, 2, . . . , n, we have some data points (ti , yi )


and we know that our y values are subject to some error.
That is, at each tᵢ, we should get yᵢ* but due to an error, say εᵢ, we get yᵢ instead so that

yᵢ = yᵢ* + εᵢ

where the error is unknown and so we can’t find the true value yᵢ* from the data value yᵢ.
Also suppose that we expect this data to follow some ‘law’ relating
y and t which is given by an equation of the form
y = mt + c,
for some constants m and c. What does our data tell us about the
values of these constants?
Lecture 17, page 6
Now, if we had the yi⇤ , it would be easy to find the constants m
and c because we would just have the equations y ⇤ = Ax where
0 1 0 1
y1⇤ t1 1
B ⇤C B C !
By 2 C B t 2 1C m
B . C = B . .C
B . C B . .C c ,
@ . A @ . .A
yn⇤ tn 1
Here, as we assume there is a solution, y ⇤ 2 R(A).
But, we do not have the yi⇤ , we just have the yi and their inherent
errors, i.e. we have the equations y " = Ax where
0 1 0 1
y1 " 1 t1 1
B C B C !
B y2 " 2 C B t2 1C m
B . C=B. .. C ,
B . C B. C
@ . A @. .A c
yn "n tn 1
Here, as we don’t know the errors, we can not find a solution.

Lecture 17, page 7


So, what can we do? We conduct a least squares analysis where
we look for the values of m and c that minimise the sum of the
squares of the errors in the hope that these will be close to their
true values. That is, as y " = Ax , we seek to minimise

"21 + "22 + · · · + "2n = k"k2 = ky Ax k2 ,

over all possible vectors Ax 2 R(A).


But, if P is the orthogonal projection of Rn onto R(A), we have

ky Py k  ky Ax k,

for all Ax 2 R(A). That is, Ax ⇤ 2 R(A) minimises the sum of the
squares of the errors if
Ax ⇤ = Py .

We call x ⇤ = (m⇤ , c ⇤ )t a least squares solution of y " = Ax .

Lecture 17, page 8


So, assuming that At A is invertible, we have

Ax ⇤ = A(At A) 1 t
Ay =) x ⇤ = (At A) 1 t
A y,

is a least squares solution to y " = Ax .

⌅ What if At A isn’t invertible?


Then we need another way of finding P. See Lecture 19.
⌅ Could there be other solutions?
No, as At A is invertible, A must be an n ⇥ 2 matrix of rank 2
so that Ax ⇤ = Py has a unique solution for every Py 2 R(A).

So, generalising if necessary, we have the following.

If At A is invertible, the least squares solution of

y " = Ax is x ⇤ = (At A) 1 t
A y.

Lecture 17, page 9


Example

Given the data

t  0  3  5  8  10
y  2  5  6  9  11    (the errors are assumed to be in the y-data)

find the least squares fit for a curve of the form y = mt + c.

Here y − ε = Ax with

A = (0 1; 3 1; 5 1; 8 1; 10 1), x = (m, c)ᵗ and y = (2, 5, 6, 9, 11)ᵗ,

so AᵗA = (198 26; 26 5) and Aᵗy = (227, 33)ᵗ, and hence

x* = (AᵗA)⁻¹Aᵗy = (1/314)(5 −26; −26 198)(227, 33)ᵗ = (1/314)(277, 632)ᵗ.

The least squares fit is therefore y = (277t + 632)/314, i.e. y ≈ 0.882t + 2.013.
Lecture 17, page 10
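As a quick numerical check of this example (assuming numpy is available), the normal equations can be solved directly and compared with numpy's own least squares routine:

# Check of the example: least squares fit of y = mt + c to the data above.
import numpy as np

t = np.array([0.0, 3.0, 5.0, 8.0, 10.0])
y = np.array([2.0, 5.0, 6.0, 9.0, 11.0])
A = np.column_stack([t, np.ones_like(t)])

x_star = np.linalg.solve(A.T @ A, A.T @ y)   # (m*, c*) = (A^t A)^{-1} A^t y
print(x_star)                                # approx [0.882, 2.013], i.e. (277/314, 632/314)
print(np.allclose(x_star, np.linalg.lstsq(A, y, rcond=None)[0]))  # agrees with numpy's solver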
Where are the errors?

In the last example, we assumed that the errors were in the y -data
so that we could minimise the sum of the squares of the errors

εᵢ = “yᵢ from data” − “yᵢ from fit”

over all possible straight lines y = mt + c.


But, if we had assumed that the errors were in the t-data, we
would need to minimise the sum of the squares of the errors

εᵢ = “tᵢ from data” − “tᵢ from fit”

over all possible straight lines t = ay + b.


There is no reason to suppose that these two minimisation
problems would give us the same answer!

Lecture 17, page 11


Example continued: now suppose the errors are in the t-data.

Find the least squares fit for a curve of the form t = ay + b.

This time t − ε = Ax with

A = (2 1; 5 1; 6 1; 9 1; 11 1), x = (a, b)ᵗ and t = (0, 3, 5, 8, 10)ᵗ,

so AᵗA = (267 33; 33 5), Aᵗt = (227, 26)ᵗ and

x* = (AᵗA)⁻¹Aᵗt = (1/246)(277, −549)ᵗ.

The least squares fit is therefore t = (277y − 549)/246 which, rearranged, gives

y = (246t + 549)/277 ≈ 0.888t + 1.982,

which is different from our earlier fit!

Lecture 17, page 12


Fits for more complicated curves

We can also find fits for curves with more complicated forms...
Example: Find the least squares fit for a curve of the form y = at² + bt + c using the same data (with the errors in the y-data).

Each data point now gives an equation yᵢ = a tᵢ² + b tᵢ + c, so y − ε = Ax with

A = (0 0 1; 9 3 1; 25 5 1; 64 8 1; 100 10 1) and x = (a, b, c)ᵗ,

and the least squares solution is, once again, x* = (AᵗA)⁻¹Aᵗy.
Lecture 17, page 13


MA212 Further Mathematical Methods
Lecture 18: Inverse matrices

Dr James Ward

⌅ Left and right inverses


⌅ Properties of left and right inverses

The London School of Economics and Political Science


Department of Mathematics

Lecture 18, page 1


Inverses: What do we know?

If A is an n ⇥ n matrix with ⇢(A) = n, then it has an inverse A 1

with certain properties.

⌅ A 1A = In ,
⌅ AA 1 = In and
⌅ Ax = b has a unique solution x = A 1b for every b 2 Rn .

We ask: What happens if

⌅ A is an n ⇥ n matrix with ⇢(A) < n?


⌅ A is an m ⇥ n matrix with m 6= n?

Obviously, in such cases, A 1 does not exist.


But, is there anything else that might be able to do a similar job?

Lecture 18, page 3


Left inverses

(If A is invertible, then A^ℓ = A⁻¹ is unique; if not, A can have many different left inverses, depending on the choice of B below.)

Definition: A left inverse of an m ⇥ n matrix A is an n ⇥ m


matrix A` such that A` A = In .

Obviously, if it exists, A 1 is a left inverse of A as A 1A = In .


More generally, if B is any n ⇥ m matrix such that BA is invertible,
the matrix (BA) 1 B is a left inverse of A as

(BA) 1 B A = (BA) 1 (BA) = In .

Note: If A has a left inverse, then there are usually many di↵erent
matrices B that will give us many di↵erent left inverses (BA) 1 B.

Lecture 18, page 4


Example (note that A is not square, so A⁻¹ does not exist)

Find a left inverse of A = (1 0; 1 1; 1 1) with N(A^ℓ) = Lin({(0, 1, −1)ᵗ}).
We want A^ℓ = (BA)⁻¹B where B is a 2×3 matrix chosen so that N(A^ℓ) = N(B) = Lin({(0, 1, −1)ᵗ}) and BA is invertible. Taking

B = (1 0 0; 0 1 1)

gives BA = (1 0; 2 2), so (BA)⁻¹ = (1 0; −1 1/2) and

A^ℓ = (BA)⁻¹B = (1 0 0; −1 1/2 1/2).

Check: A^ℓA = I₂. Note, however, that AA^ℓ ≠ I₃, so A^ℓ is not an inverse of A, and note also that A^ℓ is not unique: a different choice of B (with a different null space) gives a different left inverse.
Lecture 18, page 5
Right inverses

Definition: A right inverse of an m ⇥ n matrix A is an n ⇥ m


matrix Ar such that AAr = Im .

Obviously, if it exists, A 1 is a right inverse of A as AA 1 = In .


More generally, if B is any n ⇥ m matrix such that AB is invertible,
the matrix B(AB) 1 is a right inverse of A as

A B(AB) 1 = (AB)(AB) 1 = Im .

Note: If A has a right inverse, then there are usually many di↵erent
matrices B that will give us many di↵erent right inverses B(AB) 1 .

Lecture 18, page 6


Existence of left and right inverses

Theorem: An m ⇥ n matrix A has a


⌅ left inverse if and only if ⇢(A) = n (full column rank).
⌅ right inverse if and only if ⇢(A) = m (full row rank).
(Recall that A^ℓA = I_n and AA^r = I_m.)

Proof: Essentially, A must have these ranks in order for the


products to give us the appropriate identity matrices.
Remark: If A is n ⇥ n, it will only have left or right inverses if it
has full rank. In this case, we have Ar = A 1 = A` .
If A is m ⇥ n with m 6= n, it can have only left inverses, only right
inverses or neither.

Lecture 18, page 7


Systems of equations and left and right inverses

Let A be m ⇥ n and consider the system of equations Ax = b.

⌅ If A has a left inverse, then we have

A` Ax = A` b =) In x = A` b =) x = A` b.

Of course, this only works when Ax = b has a solution, i.e.


when b 2 R(A). Indeed, this solution will be unique as A has
full column rank.
⌅ If A has a right inverse, we see that x = Ar b is a solution for
any b 2 Rm as we have

A(Ar b) = Im b = b.

Lecture 18, page 8


Projections and left and right inverses

Theorem: Suppose that A is an m ⇥ n matrix.


⌅ If A has a left inverse, then AA` is the projection of Rm
onto R(A) parallel to N(A` ).
⌅ If A has a right inverse, then Ar A is the projection of
Rn onto R(Ar ) parallel to N(A).

Proof: Essentially, we have the following.


For left inverses, we take a matrix B so that AA` = A(BA) 1B is
the projection of Rm onto R(A) parallel to N(B) = N(A` ).
For right inverses, we take a matrix B so that Ar A = B(AB) 1A is
the projection of Rn onto R(B) = R(Ar ) parallel to N(A).

Lecture 18, page 9
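As an illustrative sketch (numpy assumed; the matrices are my own choice, made to match the earlier left-inverse example), we can compute a left inverse (BA)⁻¹B and confirm that AA^ℓ is an idempotent matrix fixing R(A):

# A left inverse A_l = (BA)^{-1} B and the projection A A_l of R^3 onto R(A) parallel to N(A_l).
import numpy as np

A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 1.0]])   # 3x2, full column rank
B = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]])     # any 2x3 with BA invertible
A_l = np.linalg.inv(B @ A) @ B                        # a left inverse of A

print(np.allclose(A_l @ A, np.eye(2)))                # A_l A = I_2
Q = A @ A_l                                           # the projection onto R(A) parallel to N(A_l)
print(np.allclose(Q @ Q, Q), np.allclose(Q @ A, A))   # idempotent, and it fixes R(A)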


MA212: Further Mathematical Methods (Linear Algebra) 2020–21
Exercises 19: Projections and least squares

For these and all other exercises on this course, you must show all your working.
1. An n ⇥ n matrix is called idempotent if A2 = A. Show that 0 and 1 are the only possible
eigenvalues of an idempotent matrix.

2.
✓ Let U = Lin({(1, 2, 1)t , (1, 0, 0)t }) and V = Lin({(0, 1, 1)t }) be subspaces of R3 .
Find the matrix A representing the projection onto U parallel to V .
Also find the matrix B representing the projection onto V parallel to U .
Evaluate A + B, and explain the answer you get.

:
3. Let v = (v1 , . . . , vn )t be any vector in Rn with ||v || = 1. Define the n ⇥ n matrix A = vv t
where v t is represented as a horizontal 1 ⇥ n matrix and v is represented as a vertical n ⇥ 1 matrix.
Show that A is symmetric, idempotent and of rank one. Hence show that A represents the orthogonal
projection onto Lin({v }).
With A as above, set B = I 2A. Show that B is symmetric, orthogonal and that B 2 = I. Is B
orientation-preserving or orientation-reversing? What transformation does it represent?
Hint: what are the eigenvalues of A, counted according to multiplicity? How do the eigenvalues and
eigenvectors of B correspond to those of A?

4. Let S be the subspace of R³ spanned by (1, 2, 3)ᵗ and (1, 1, −1)ᵗ. Find the matrix A representing
the orthogonal projection onto S.

5. Find an orthogonal and a non-orthogonal projection in R3 , each of rank two, that sends the
vectors (1, 1, 0)t and (0, 2, 1)t to themselves.

6. Consider the temperature data given below.

:
t 0 ⇡/3 2⇡/3 ⇡ 4⇡/3 5⇡/3
T 2 10 18 24 22 14
Find the best approximation to this data of the form T = f (t) = a + b cos t + c sin t.
Evaluate your function f (t) at the values t = 0, ⇡/3, . . . , 5⇡/3.

7. Find the least squares fit for a curve of the form

y = 6m/x + c
x 1 2 3 6
y 5 3 2 1
Why would it be wrong to suppose this was equivalent to the problem of fitting a curve of the form
xy = cx + 6m
through the data points (xy, x)?

8. Suppose we are given a number of di↵erent measurements a1 , . . . , an of the value of a constant x.


Use the least squares method to find the best fit function to this data, i.e. (x, a1 ), (x, a2 ), . . . , (x, an ).

Exercises continue overleaf...


9. Given the four points (0, 0), (1, 1), (2, 4) and (4, 7) in R2 , find the lines which minimise the sum
of the squares of the
(a) vertical and (b) horizontal
distances from these points to the line.

10. Consider the data points ( 3, 3), (0, 1) and (1, 4) where the first coordinate is the value for x
and second the value for y.
Find the least squares approximation to this data using only functions of the form y = ax + b and
check how closely it approximates the data.
What is the least squares approximation to this data using functions of the form y = ax2 + bx + c,
and how closely does it approximate the data?
[The original scan contains several pages of handwritten attempts at Exercises 19 (questions 1–10) at this point; they have not survived text extraction. The typed solutions below cover the same questions.]
MA212: Further Mathematical Methods (Linear Algebra) 2020–21
Solutions to Exercises 19: Projections and least squares

This document contains answers to the exercises. There may well be some errors; please do let me
know if you spot any.

1. Suppose Av = λv for some non-zero vector v. Then A²v = A(λv) = λ²v. But A² = A, so
λ²v = λv. As v ≠ 0, it follows that λ² = λ. This is possible only if λ = 0 or λ = 1.

2. There are two ways to do this. One way involves using a change of basis where the basis
consists of the three given vectors. We’ll show how to do this the other way.
We need a matrix C whose range is U (which is easy to write down) and a matrix D whose null
space is V: for this we need two linearly independent vectors orthogonal to (0, 1, 1)ᵗ such as (1, 0, 0)ᵗ
and (0, 1, −1)ᵗ. We write

C = (1 1; 2 0; 1 0) and D = (1 0 0; 0 1 −1).

Now the matrix we want is A = C(DC)⁻¹D, which we calculate to be

A = (1 0 0; 0 2 −2; 0 1 −1).

This matrix A is supposed to be idempotent, have U as its range and (0, 1, 1)t is supposed to be in
its null space: all are easy to check.
Repeating the process, we find that B = (0 0 0; 0 −1 2; 0 −1 2).
We see that A + B = I. Why is this? Given a vector w 2 R3 , it can be written uniquely as u + v
where u is in U and v is in V . Now, for all w , Aw = u and Bw = v , so

(A + B)w = Aw + Bw = u + v = w ,

i.e. A + B is the identity matrix.

3. The (i, j) entry of the matrix A will be vi vj ; therefore the matrix A is symmetric. Alternatively,
observe that (vv t )t = v tt v t = vv t .
Calculating the (i, j) entry of A² we get Σ_k a_{i,k} a_{k,j} = Σ_k (v_i v_k)(v_k v_j) = v_i v_j Σ_k v_k² = v_i v_j, where the last equality arises since ‖v‖ = 1. Therefore the matrix A is idempotent. Alternatively, observe that vᵗv = ‖v‖² = 1
and so (vvᵗ)(vvᵗ) = v(vᵗv)vᵗ = vvᵗ.
As A is symmetric and idempotent, it is an orthogonal projection matrix with a range equal to
the column space of A, which is Lin({v }). The columns of A are all multiples of v , and they are
not all zero, so A has rank one, and the eigenvector with eigenvalue 1 is v . Let’s note here that
the eigenvalues of A, counted according to multiplicity, are 0, 0, . . . , 0, 1. The null space of A is the
orthogonal complement of R(A) (of dimension n 1): any non-zero vector orthogonal to v is an
eigenvector with eigenvalue 0.
As A is symmetric, B = I 2A must also be symmetric. We also have

BB t = B 2 = (I 2A)2 = I 2 4A + 4A2 = I 4(A A2 ) = I,

since A is idempotent. Thus B is orthogonal (B 1 = B = B t ) and B 2 = I.


The eigenvectors of A are all eigenvectors of B as well. If Ax = 0, then Bx = x and, as Av = v, we
have Bv = −v.
Thus the eigenvalues of B, counted according to multiplicity, are 1, 1, . . . , 1, −1. The determinant
of B is −1, so B is orientation-reversing. What transformation does B represent? Every vector in
the (n − 1)-dimensional subspace R(A)⊥ is mapped to itself, while the vector v spanning R(A) is
mapped to −v. So B represents reflection in the plane R(A)⊥ normal to v.
Note: we need to know not just that all the eigenvalues of B are either 1 or −1, but also how many of
each there are. The determinant is the product of the eigenvalues, counted according to multiplicity;
for instance, an orthogonal matrix with eigenvalues 1, −1, −1 has determinant 1, is orientation-preserving,
and its square is the identity (it represents a half-turn about some axis).
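A quick numerical check of these claims (numpy assumed, with an arbitrary unit vector of my own choosing):

# A = v v^t is a rank-one orthogonal projection and B = I - 2A is a reflection with determinant -1.
import numpy as np

v = np.array([2.0, -1.0, 2.0]) / 3.0        # any unit vector
A = np.outer(v, v)
B = np.eye(3) - 2 * A

print(np.allclose(A, A.T), np.allclose(A @ A, A), np.linalg.matrix_rank(A))  # symmetric, idempotent, rank 1
print(np.allclose(B @ B.T, np.eye(3)), np.allclose(B @ B, np.eye(3)))        # orthogonal, B^2 = I
print(np.linalg.det(B), B @ v)                                               # det -1, and Bv = -v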

4. The required orthogonal projection is given by P = A(AᵗA)⁻¹Aᵗ where A is the matrix

A = (1 1; 2 1; 3 −1).

This gives us

AᵗA = (14 0; 0 3), so (AᵗA)⁻¹ = (1/14 0; 0 1/3),

and so

P = A(AᵗA)⁻¹Aᵗ = (1/42)(17 20 −5; 20 26 4; −5 4 41).

5. For the orthogonal projection, we have a formula: let A be the matrix whose columns are the
two vectors, and evaluate P = A(At A) 1 At .
Let’s just pause a moment to contemplate this formula. We see that P A = A, so the two vectors in
the question are mapped to themselves. Moreover, any vector in N (At ) is also in N (P ), but N (At )
= R(A)? . So this transformation does the right thing to all the vectors in R(A) and R(A)? , so it
must be the transformation we are looking for. If this helps you to remember, or reconstruct, the
formula, that would be good.
Anyway, with this matrix A (whose columns are (1, 1, 0)ᵗ and (0, 2, 1)ᵗ), we have

AᵗA = (2 2; 2 5) and (AᵗA)⁻¹ = (1/6)(5 −2; −2 2),

so that the required orthogonal projection is

P = A(AᵗA)⁻¹Aᵗ = (1/6)(5 1 −2; 1 5 2; −2 2 2).

You should expect P to be symmetric (and at least notice if you get an answer that’s not symmetric),
and you should check that P does indeed take the two given vectors to themselves.
To find a non-orthogonal projection, there are a variety of approaches. One can take pretty much
any 2 ⇥ 3 matrix B such that BA is invertible, and work out P = A(BA) 1 B. By much the same
argument as before, this is the projection onto the right range, parallel to the null space of B.
Almost all matrices will work, but there’s still scope for choosing one that cuts down on the amount
of work you’ll have to do. For instance, a really good matrix to choose for B is
✓ ◆
1 0 0
B= ,
0 0 1

since we get BA = I, so that


0 1
1 0 0
1
P = A(BA) B = A(I)B = AB = @ 1 0 2A .
0 0 1

Again, you should check that this matrix P does the job it’s supposed to.

6. We first need to form the matrix A, whose columns are the values of the three functions 1, cos t
and sin t at the various values of t. So
A = (1 1 0; 1 1/2 √3/2; 1 −1/2 √3/2; 1 −1 0; 1 −1/2 −√3/2; 1 1/2 −√3/2).

Now we find the vector of coefficients (a, b, c)ᵗ by plugging the vector y of recorded data into the
equation (a, b, c)ᵗ = (AᵗA)⁻¹Aᵗy. We find:

AᵗA = (6 0 0; 0 3 0; 0 0 3) and (AᵗA)⁻¹ = (1/6 0 0; 0 1/3 0; 0 0 1/3);

and we also have

Aᵗy = (90, −30, −4√3)ᵗ, so (AᵗA)⁻¹Aᵗy = (15, −10, −4√3/3)ᵗ.

That is, the best fit is given by the function f(t) = 15 − 10 cos t − (4√3/3) sin t.
Using this function, we now find that

f (0) = 5, f (⇡/3) = 8, f (2⇡/3) = 18, f (⇡) = 25, f (4⇡/3) = 22, f (5⇡/3) = 12.

These numbers are respectably close to the experimental data given: if you don’t get numbers that
are close to the data, then you should realise that you have made a mistake, and — at least — say
so.
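For completeness, a short numerical check of this fit (numpy assumed):

# Fit T = a + b cos t + c sin t to the temperature data by least squares.
import numpy as np

t = np.arange(6) * np.pi / 3
T = np.array([2.0, 10.0, 18.0, 24.0, 22.0, 14.0])
A = np.column_stack([np.ones_like(t), np.cos(t), np.sin(t)])

a, b, c = np.linalg.solve(A.T @ A, A.T @ T)
print(a, b, c)                                # approx 15, -10, -4*sqrt(3)/3
print(np.round(A @ np.array([a, b, c]), 3))   # approx [5, 8, 18, 25, 22, 12]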

7. The process is the same as in other questions. We set


0 1 0 1
6 1 5
B3 1 C B3C
A=B C
@2 1A and y = @2A ,
B C

1 1 1

and evaluate (m, c)ᵗ = (AᵗA)⁻¹Aᵗy to get m = 22/28 and c = 11/28, so that the best fit of the given
form is

y = (6/x)(22/28) + 11/28 = 33/(7x) + 11/28.
What you’re doing here is minimising the sum of the values of (y 6m/x c)2 , for the four given data
points, over pairs (m, c). The alternative suggested amounts to minimising the sum of the values of
(xy 6m cx)2 . That’s not the same problem — the second formulation amounts to giving more
weight to the data with higher values of x.

8. Here we are trying to find the best fit function to the dataset (x, a1 ), (x, a2 ), . . . , (x, an ). We
can only be expected to fit a value at x, i.e. we are looking for the number a minimising the sum of
the squares of the terms (a ai )2 .
Our method applies. We take A = (1, 1, . . . , 1)t , so that (At A) 1 = 1/n, and
n
1X
(At A) 1
At (a1 , . . . , an )t = ai ,
n
i=1

the mean of the numbers ai .

9. For (a), we have the standard least-squares problem. If the points are values of (x, y), we need
to fit a line of the form y = mx + c. We set
0 1 0 1
0 1 0
B1 1C B1C
A=B C
@2 1A and b = @4A ,
B C

4 1 7
✓ ◆
m
so we want to find the value z ⇤ = such that Az ⇤ is closest to b.
c
As usual, we have z* = (AᵗA)⁻¹Aᵗb so we find that

AᵗA = (21 7; 7 4); (AᵗA)⁻¹ = (1/35)(4 −7; −7 21); Aᵗb = (37, 12)ᵗ; z* = (1/35)(64, −7)ᵗ.

That is, the line we are asked for is y = (64x − 7)/35.
For (b), we need to interchange the roles of x and y so that we are looking for a line of the form
x = ny + d. We set 0 1 0 1
0 1 0
B1 1C B1C
A=B @4
C and b=B C
@2A ,
1A
7 1 4
and we want (n, d)ᵗ = z* = (AᵗA)⁻¹Aᵗb. We find that

AᵗA = (66 12; 12 4); (AᵗA)⁻¹ = (1/60)(2 −6; −6 33); Aᵗb = (37, 7)ᵗ; z* = (1/60)(32, 9)ᵗ.

That is, the required line is now x = (32y + 9)/60, or y = (60x − 9)/32.

10. This is the standard least-squares problem. The points are values of (x, y), and we need to fit
a line of the form y = ax + b. So we set
0 1 0 1
A = (−3 1; 0 1; 1 1) and c = (3, 1, 4)ᵗ,
and we want to find the vector z ⇤ = (a, b)t such that Az ⇤ is closest to c.
As usual, we set z* = (AᵗA)⁻¹Aᵗc and find that

AᵗA = (10 −2; −2 3); (AᵗA)⁻¹ = (1/26)(3 2; 2 10); Aᵗc = (−5, 8)ᵗ; z* = (1/26)(1, 70)ᵗ.

That is, the line we are asked for is y = (x + 70)/26.
This is not a wonderfully accurate approximation: fundamentally, the three points we are given are
not all close to any straight line. This line is nevertheless the “best”, in the sense that it minimises
the sum of the squares of the vertical distances from the points to the line.
We now allow ourselves to use quadratic functions. Let’s see how the algebra works out. We set
A = (9 −3 1; 0 0 1; 1 1 1) and d = (3, 1, 4)ᵗ,

so we want to find the vector z* = (a, b, c)ᵗ such that Az* is closest to d. We know to take
z* = (AᵗA)⁻¹Aᵗd, but here Aᵗ and A are invertible, so this is just z* = A⁻¹(Aᵗ)⁻¹Aᵗd = A⁻¹d.
Thus we have

z* = A⁻¹d = (1/12)(11, 25, 12)ᵗ.

The “best fit” is then

y = (11x² + 25x + 12)/12.
You can check that this does indeed go through all three points.
What did we do, e↵ectively? Instead of solving the problem of finding a vector z ⇤ such that Az ⇤ is
closest to d , we solved the (easier?) problem of finding a vector z ⇤ such that Az ⇤ is equal to d . In
other words, we solved the matrix equation Az ⇤ = d . Our method amounted to setting z ⇤ = A 1 d .
In general, we can fit a quadratic function through any three points with di↵erent values of x. In
even more generality, if we are given any k points in the plane with di↵erent x-values, then we can
find a polynomial of degree k 1 passing through all the points.
To see this, let the points be (x1 , y1 ), (x2 , y2 ), . . . , (xk , yk ), and suppose we are looking for a polynomial
y = a0 + a1 x + a2 x2 + · · · + ak 1 xk 1 ; we are then looking to solve:
0 1
0 2 k 11 a0 0 1
1 x1 x1 · · · x1 B C y1
B1 x2 x2 · · · xk 1 C B 1 C By2 C a
B 2 2 CB a C B C
2 C = B . C.
B .. .. .. . . .. C B
@. . . . . A B . C @ .. A
k 1
@ .. A
1 xk x2k · · · xk yk
ak 1

What we need to verify is that the matrix above is non-singular, for any set of distinct values
x1 , . . . , xk . There are various ways to approach that: here’s a rather abstract treatment. If you think
of expanding out the determinant, each term will contain one entry from each column, so each term
will be a “monomial” of the form x_{i₁} x_{i₂}² · · · x_{i_{k−1}}^{k−1}, of degree 1 + · · · + (k − 1) = k(k − 1)/2. So the whole
expression for the determinant is a polynomial of degree k(k − 1)/2. Now, if we have any pair of the values
equal, say xᵢ = xⱼ, then two of the rows of the matrix will be equal, so the determinant will be zero.
Therefore each of the terms (xᵢ − xⱼ) is a factor of the determinant. There are k(k − 1)/2 such terms, so
the determinant is a constant times the product of all of them. Hence the determinant is non-zero if
all the terms (xᵢ − xⱼ) are non-zero. (Don't worry if you didn't get that.)
MA212 Further Mathematical Methods
Lecture 19: Generalised inverses

Dr James Ward

⌅ Weak generalised inverses (WGIs)


⌅ Strong generalised inverses (SGIs)
⌅ Least squares again...

The London School of Economics and Political Science


Department of Mathematics

Lecture 19, page 1


Weak generalised inverses

Definition: A weak generalised inverse (WGI) of an m ⇥ n


matrix A is an n ⇥ m matrix Ag such that AAg A = A.

Obviously, if it exists, A 1 is a WGI of A as AA 1A = A.


Also, if they exist, left and right inverses of A are WGIs as

AA` A = AIn = A and AAr A = Im A = A.

It follows that, generally speaking, a matrix will have many WGIs


since, if they exist, it can have many left or right inverses.
However, unlike left and right inverses, every matrix has a WGI...

Lecture 19, page 3


Existence of WGIs

Theorem: Every m ⇥ n matrix A has a WGI.

Proof: If A is an m ⇥ n zero matrix, then any n ⇥ m matrix works.


Otherwise, we show that A has a WGI by construction.

Suppose the m ⇥ n matrix A has rank k > 0.


⌅ Write A = BC where B is m ⇥ k, C is k ⇥ n and they
both have rank k.
⌅ Find a left inverse B ` of B and a right inverse C r of C .
⌅ A WGI of A is Ag = C r B ` .

Note that this is indeed a WGI as


AAg A = (BC )(C r B ` )(BC ) = B(CC r )(B ` B)C = BC = A.
Lecture 19, page 4
Example
Find a WGI of the 3×5 matrix A whose rows are (1, 0, 2, 1, 4), (0, 1, 1, 0, 6) and their sum, so that ρ(A) = 2.

Since the first two rows of A are already the non-zero rows of RRE(A), we can write A = BC with

B = (1 0; 0 1; 1 1) (the first two columns of A) and C = (1 0 2 1 4; 0 1 1 0 6),

both of rank 2. Choosing P = (1 0 0; 0 1 0) and Q = (1 0; 0 1; 0 0; 0 0; 0 0) gives PB = I₂ and CQ = I₂, so

B^ℓ = (PB)⁻¹P = P, C^r = Q(CQ)⁻¹ = Q and A^g = C^r B^ℓ = QP = (1 0 0; 0 1 0; 0 0 0; 0 0 0; 0 0 0),

a 5×3 WGI of A. Check: AA^gA = A.
Lecture 19, page 5
Example
Find a WGI of the 4×4 matrix A from the slide, which has ρ(A) = 2.

The working follows the same recipe: compute RRE(A) to confirm that ρ(A) = 2, write A = BC where B (4×2) consists of two linearly independent columns of A and C (2×4) consists of the two non-zero rows of RRE(A), choose P (2×4) and Q (4×2) so that PB and CQ are invertible, and then

A^g = C^r B^ℓ = Q(CQ)⁻¹(PB)⁻¹P

is a WGI of A. Finally, check that AA^gA = A.
Lecture 19, page 6
Projections and WGIs

Theorem: If Ag is a WGI of the m ⇥ n matrix A, then


⌅ AAg projects Rm onto R(A) parallel to N(Ag ) and
⌅ Ag A projects Rn onto R(Ag ) parallel to N(A).

Proof: Essentially, we have the following.


The m ⇥ m matrix AAg is idempotent as
(AAg )2 = (AAg )(AAg ) = (AAg A)A = AAg
and so it will project Rm onto R(A) parallel to N(Ag ).
The n ⇥ n matrix Ag A is idempotent as
(Ag A)2 = (Ag A)(Ag A) = Ag (AAg A) = Ag A
and so it will project Rn onto R(Ag ) parallel to N(A).

Lecture 19, page 7


What about orthogonal projections?

When we find a WGI, we write A = BC and then we can find the


left and right inverses using, say

1
B ` = (PB) P and C r = Q(CQ) 1

for some appropriate matrices P and Q so that

Ag = C r B ` = Q(CQ) 1
(PB) 1
P.

However, if we were to pick P and Q very carefully, the projections


AAg and Ag A would be orthogonal projections.
For this to happen, it turns out that we need P = B t and Q = C t
as we shall see when we consider...

Lecture 19, page 8


Strong generalised inverses

Amongst all the possible WGIs for a matrix, one is very special...

Definition: A strong generalised inverse (SGI) of an m ⇥ n


matrix A is an n ⇥ m matrix As such that
(1) AAs A = A (so an SGI is a WGI),
(2) As AAs = As ,
(3) AAs is symmetric and
(4) As A is symmetric.

In fact, even though a matrix can have many WGIs, it turns out
that only one of these will be an SGI.

Lecture 19, page 9


Existence of SGIs

Theorem: Every m ⇥ n matrix has exactly one SGI.

Proof: To show that there is an SGI we have the following.


If A is an m ⇥ n zero matrix, then As has to be the n ⇥ m zero
matrix in order to ensure that (2) holds.

Otherwise, we follow our earlier construction but now we must


take the left inverse of B and the right inverse of C to be

B ` = (B t B) 1
Bt and C r = C t (CC t ) 1
,

respectively so that the SGI of A = BC is

As = C r B ` = C t (CC t ) 1
(B t B) 1
Bt.

Lecture 19, page 10


To verify that this is indeed an SGI, note the following.
For (1), as before, we have

AAs A = (BC )(C r B ` )(BC ) = B(CC r )(B ` B)C = BC = A.

For (2), we have

As AAs = (C r B ` )(BC )(C r B ` ) = C r (B ` B)(CC r )B ` = C r B ` = As .

For (3) and (4), we have

AAs = (BC )(C r B ` ) = BB ` = B(B t B) 1


Bt,

and
As A = (C r B ` )(BC ) = C r C = C t (CC t ) 1
C.
It’s relatively easy to show that these two matrices are symmetric.
To show that the SGI is unique, see Exercises 20.

Lecture 19, page 11
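A quick numerical check of this formula (numpy assumed, using a small rank-2 example of my own): the four conditions hold, and the result agrees with numpy's np.linalg.pinv, which computes the same (Moore–Penrose) matrix.

# A^s = C^t (C C^t)^{-1} (B^t B)^{-1} B^t satisfies the four defining conditions of an SGI.
import numpy as np

A = np.array([[1.0, 0.0, 2.0], [-1.0, 1.0, 0.0], [0.0, 1.0, 2.0]])
B, C = A[:, :2], np.array([[1.0, 0.0, 2.0], [0.0, 1.0, 2.0]])        # A = B C, both rank 2
A_s = C.T @ np.linalg.inv(C @ C.T) @ np.linalg.inv(B.T @ B) @ B.T

print(np.allclose(A @ A_s @ A, A))            # (1) A A^s A = A
print(np.allclose(A_s @ A @ A_s, A_s))        # (2) A^s A A^s = A^s
print(np.allclose(A @ A_s, (A @ A_s).T))      # (3) A A^s symmetric
print(np.allclose(A_s @ A, (A_s @ A).T))      # (4) A^s A symmetric
print(np.allclose(A_s, np.linalg.pinv(A)))    # agrees with numpy's pseudoinverse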


Example

Find the SGI of the matrix A = (1 0 2; −1 1 0; 0 1 2).

Here ρ(A) = 2 (the third column is twice the sum of the first two), so we write A = BC with

B = (1 0; −1 1; 0 1) (the first two columns of A) and C = (1 0 2; 0 1 2),

both of rank 2. Then

B^ℓ = (BᵗB)⁻¹Bᵗ and C^r = Cᵗ(CCᵗ)⁻¹,

so that

A^s = C^r B^ℓ = Cᵗ(CCᵗ)⁻¹(BᵗB)⁻¹Bᵗ = (1/9)(2 −3 −1; −1 3 2; 2 0 2).
Lecture 19, page 12


Projections and SGIs

Unlike WGIs in general, SGIs give rise to orthogonal projections


and that is what makes them useful.

Theorem: If As is the SGI of the m ⇥ n matrix A, then


⌅ AAs orthogonally projects Rm onto R(A) and
⌅ As A orthogonally projects Rn onto R(As ).

Proof: Essentially, as an SGI is a WGI, the matrices AAs and As A


represent projections as before. But now, as the matrices AAs and
As A are symmetric, they are orthogonal projections.

Lecture 19, page 13


Least squares again...

In the last lecture, we saw that the least squares solution of

y − ε = Ax    is    x = (AᵗA)⁻¹Aᵗy,

and, for (At A) 1 to exist, we needed A to have full column rank.


If this is not the case, as we can always find an SGI, we can use
AAs for the orthogonal projection onto R(A) instead.
That is, a least squares solution of

y − ε = Ax    is    x = A^s y.

Note that, if we have to use this, this will be one of many possible
least squares solutions...

Lecture 19, page 14


Many least squares solutions?

Theorem: Suppose that A is an m ⇥ n matrix.


The least squares solutions to

y − ε = Ax    are given by    x = A^s y + (I_n − A^s A)w,

where w is any vector in Rn .

Proof: A least squares solution of y − ε = Ax is a solution of


Ax = Py where P is the orthogonal projection onto R(A). Thus,
as As always exists, we solve the matrix equation Ax = AAs y .
This matrix equation is consistent because AAs y 2 R(A) and so
its solution, x , must be the sum of a particular solution like As y
and an arbitrary vector in N(A).

Lecture 19, page 15


Now, essentially, every vector in N(A) has the form (I_n − A^s A)w
for some w ∈ Rⁿ since

A(I_n − A^s A)w = (A − AA^s A)w = (A − A)w = 0,

and, consequently, our least squares solutions are given by

x = A^s y + (I_n − A^s A)w,

where w is any vector in Rⁿ.


Note: If A is an m × n matrix of rank n, B = A and C = I_n, giving
us A^s = (AᵗA)⁻¹Aᵗ and so A^s A = (AᵗA)⁻¹AᵗA = I_n. That is, as
expected, x = (AᵗA)⁻¹Aᵗy will still be the only least squares
solution that we find.

Lecture 19, page 16
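A short sketch of this in practice (numpy assumed; the matrix and vectors are my own example): all the solutions x = A^s y + (I_n − A^s A)w have the same residual ‖y − Ax‖, and x = A^s y is the shortest of them.

# Least squares solutions via the SGI when A^t A is singular.
import numpy as np

A = np.array([[1.0, 0.0, 2.0], [-1.0, 1.0, 0.0], [0.0, 1.0, 2.0]])   # rank 2, so A^t A is singular
y = np.array([3.0, 0.0, -3.0])
A_s = np.linalg.pinv(A)                       # the SGI (Moore-Penrose pseudoinverse)

x0 = A_s @ y                                  # one least squares solution (the minimum-norm one)
N = np.eye(3) - A_s @ A                       # columns span N(A)
w = np.array([1.0, 2.0, -1.0])                # any w in R^3 gives another solution
x1 = x0 + N @ w

r0, r1 = np.linalg.norm(y - A @ x0), np.linalg.norm(y - A @ x1)
print(np.isclose(r0, r1), np.linalg.norm(x0) <= np.linalg.norm(x1))  # same residual, x0 is shortest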


Example continued

Find the least squares solutions to y − ε = Ax when

A = (1 0 2; −1 1 0; 0 1 2) and y = (3, 0, −3)ᵗ.

Using the SGI found above,

A^s y = (1/9)(2 −3 −1; −1 3 2; 2 0 2)(3, 0, −3)ᵗ = (1, −1, 0)ᵗ

and

I₃ − A^s A = (1/9)(4 4 −2; 4 4 −2; −2 −2 1),

whose columns are all multiples of (2, 2, −1)ᵗ, so the least squares solutions are

x = (1, −1, 0)ᵗ + t(2, 2, −1)ᵗ for t ∈ R.
Lecture 19, page 17


An observation

Theorem: Suppose that A is an m ⇥ n matrix.


The least squares solution to y − ε = Ax which is closest to
the origin is x = A^s y.

Proof: We note that, for any w ∈ Rⁿ,

[(I_n − A^s A)w] · (A^s y) = [(I_n − A^s A)w]ᵗ(A^s y)
 = wᵗ[I_n − (A^s A)ᵗ]A^s y
 = wᵗ[A^s − A^s AA^s]y
 = wᵗ[A^s − A^s]y
 = 0

and so the vectors (I_n − A^s A)w and A^s y are orthogonal.

Lecture 19, page 18


Thus, the distance between x and the origin, i.e. ‖x‖, satisfies

‖x‖² = ‖A^s y + (I_n − A^s A)w‖² = ‖A^s y‖² + ‖(I_n − A^s A)w‖²

by the generalised theorem of Pythagoras. But, this means that
‖x‖ ≥ ‖A^s y‖ and so x = A^s y is the least squares solution which
is closest to the origin.
Example continued: Verify this using our least squares solution.

Every least squares solution has the form

x = A^s y + (I₃ − A^s A)w = (1, −1, 0)ᵗ + t(2, 2, −1)ᵗ, t ∈ R,

and the two terms are orthogonal, so by the generalised theorem of Pythagoras

‖x‖² = ‖(1, −1, 0)ᵗ‖² + t²‖(2, 2, −1)ᵗ‖² = 2 + 9t².

So the closest x can get to the origin is ‖x‖ = √2, and this occurs when t = 0, i.e. when x = (1, −1, 0)ᵗ = A^s y.
Lecture 19, page 19
MA212 Further Mathematical Methods
Lecture 20: Orthogonal projections in function spaces

Dr James Ward

⌅ Least squares approximations


⌅ Fourier series

The London School of Economics and Political Science


Department of Mathematics

Lecture 20, page 1


What about function spaces?

Question: Can we find the orthogonal projection of a function onto


a subspace U of a function space?
Answer: Yes! If the subspace U is finite-dimensional.
Question: How can we do it? All of our methods so far have
assumed that we are working in Rn .
Answer: We’ll need the theorem on the next slide.
Question: What will this give us?
Answer: It will give us the function in U that is closest to the
given function in the ‘least squares’ sense. We will call this the
least squares approximation of our function.

Lecture 20, page 3


Orthogonal projections in function spaces

Theorem: Suppose that U is a finite-dimensional subspace


of an inner product space V .
If {u1 , u2 , . . . , un } is an orthonormal basis of U, then

P(v ) = hv , u1 i u1 + hv , u2 i u2 + · · · + hv , un i un ,

is the orthogonal projection of v 2 V onto U.

Proof: For any v 2 V , we can write

v = P(v ) + [v P(v )]

where P(v ) 2 U as it is a linear combination of the vectors in our


basis {u1 , u2 , . . . , un } of U.

Lecture 20, page 4


Furthermore, v − P(v) ∈ U⊥ as, for each 1 ≤ i ≤ n, we have

⟨v − P(v), u_i⟩ = ⟨v − Σ_{j=1}^{n} ⟨v, u_j⟩u_j, u_i⟩
 = ⟨v, u_i⟩ − Σ_{j=1}^{n} ⟨v, u_j⟩⟨u_j, u_i⟩
 = ⟨v, u_i⟩ − ⟨v, u_i⟩ = 0

using the orthonormality of the u_i.
Thus, every v ∈ V can be written as
v = P(v) + [v − P(v)]
with P(v) ∈ U and v − P(v) ∈ U⊥ so that we have V = U + U⊥.
Also, if u ∈ U ∩ U⊥, then ⟨u, u⟩ = 0 which implies that u = 0.
That is, we also have U ∩ U⊥ = {0}.
Consequently, V = U ⊕ U⊥ and so P(v) is indeed the orthogonal
projection of v ∈ V onto U. Lecture 20, page 5
Example
Consider the vector space C[0, 1] with the inner product
⟨f, g⟩ = ∫₀¹ f(x)g(x) dx.

In Lecture 5, we found that {h₁, h₂, h₃} with the functions

h₁(x) = 1,  h₂(x) = √3(2x − 1)  and  h₃(x) = √5(6x² − 6x + 1),

is an orthonormal basis of the subspace P₂[0, 1] of C[0, 1].

Find the orthogonal projection of f(x) = √x onto P₂[0, 1].

⟨f, h₁⟩ = ∫₀¹ √x dx = [(2/3)x^{3/2}]₀¹ = 2/3,
⟨f, h₂⟩ = ∫₀¹ √x · √3(2x − 1) dx = √3[(4/5)x^{5/2} − (2/3)x^{3/2}]₀¹ = 2√3/15,
⟨f, h₃⟩ = ∫₀¹ √x · √5(6x² − 6x + 1) dx = √5[(12/7)x^{7/2} − (12/5)x^{5/2} + (2/3)x^{3/2}]₀¹ = −2√5/105.
Lecture 20, page 6
P(f) = ⟨f, h₁⟩h₁ + ⟨f, h₂⟩h₂ + ⟨f, h₃⟩h₃
 = (2/3)·1 + (2√3/15)·√3(2x − 1) − (2√5/105)·√5(6x² − 6x + 1)
 = −(4/7)x² + (48/35)x + 6/35,

the least squares approximation to √x using functions in P₂[0, 1].
Lecture 20, page 7
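A numerical check of this example (numpy and scipy assumed): computing the three inner products by numerical integration reproduces the coefficients above and the quadratic −(4/7)x² + (48/35)x + 6/35.

# Project f(x) = sqrt(x) onto P2[0,1] using the orthonormal basis h1, h2, h3.
import numpy as np
from scipy.integrate import quad

h = [lambda x: 1.0,
     lambda x: np.sqrt(3) * (2 * x - 1),
     lambda x: np.sqrt(5) * (6 * x**2 - 6 * x + 1)]
f = lambda x: np.sqrt(x)

coeffs = [quad(lambda x: f(x) * hi(x), 0, 1)[0] for hi in h]
print(coeffs)                                  # approx [2/3, 2*sqrt(3)/15, -2*sqrt(5)/105]

P = lambda x: sum(c * hi(x) for c, hi in zip(coeffs, h))
print(P(0.25), -4/7 * 0.25**2 + 48/35 * 0.25 + 6/35)       # both approx 0.479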


What have we just found?

As an orthogonal projection, P(f ) is the function that minimises


‖f − g‖² = ∫₀¹ (f(x) − g(x))² dx

over all functions g ∈ P₂[0, 1].

That is, we have found the quadratic that minimises the quantity

∫₀¹ (√x − (ax² + bx + c))² dx

over all a, b, c ∈ R.
In this sense, it is a least squares approximation to f .

Lecture 20, page 8


Let’s take a look at it...

For 0  x  1, our least squares approximation (solid curve) to


p
f (x) = x (dashed curve) looks like this...

And that looks pretty good!


Lecture 20, page 9
Getting better approximations

If we were allowed polynomials of higher degree, say we were using


an orthonormal basis {h1 , h2 , . . . , hn } of Pn [0, 1], then the
orthogonal projection of f onto Pn [0, 1], i.e.
n
X
P(f ) = hf , hi i hi ,
i=1

would give us better approximations to f as n increases.


But, we’d have to find this orthonormal basis (i.e. each hi ) and all
of these inner products (i.e. each hf , hi i). A nightmare!
So, what we really want is an orthonormal basis that is easy to find
and easy to work with! Let’s look at one possible choice...

Lecture 20, page 10


Complex exponentials

We will work in a complex inner product space consisting of
continuous functions which map from [−π, π] to ℂ with

⟨f, g⟩ = ∫_{−π}^{π} f(t) \overline{g(t)} dt.

In this vector space, the functions h_k : [−π, π] → ℂ given by

h_k(t) = e^{ikt}/√(2π),

can be used to form the orthonormal set

{h_{−n}, . . . , h_{−2}, h_{−1}, h₀, h₁, h₂, . . . , h_n}

and we'll call the subspace spanned by this set U_n.

Lecture 20, page 11


Let’s check that this is an orthonormal set...
With h_k(t) = e^{ikt}/√(2π), we have

⟨h_n, h_m⟩ = (1/2π) ∫_{−π}^{π} e^{int} e^{−imt} dt.

If n ≠ m, this is

(1/2π)[e^{i(n−m)t}/(i(n−m))]_{−π}^{π} = (e^{i(n−m)π} − e^{−i(n−m)π})/(2πi(n−m)) = sin((n−m)π)/(π(n−m)) = 0,

while if n = m,

⟨h_n, h_n⟩ = (1/2π) ∫_{−π}^{π} 1 dt = 1.

So the set is indeed orthonormal.
Lecture 20, page 12


Fourier series

The orthogonal projection of a function f : [−π, π] → ℂ
onto U_n is given by

P_n(f)(t) = Σ_{k=−n}^{n} a_k(f) e^{ikt}

where the complex numbers a_k(f) are

a_k(f) = (1/2π) ∫_{−π}^{π} f(t) e^{−ikt} dt.

We call P_n(f) the nth-order Fourier series of f and a_k(f)
is the kth Fourier coefficient of f.

Lecture 20, page 13


Where does this come from?

We orthogonally project f onto U_n using the orthonormal basis {h_k : −n ≤ k ≤ n}:

P_n(f)(t) = Σ_{k=−n}^{n} ⟨f, h_k⟩ h_k(t)
 = Σ_{k=−n}^{n} (∫_{−π}^{π} f(t) e^{−ikt}/√(2π) dt) e^{ikt}/√(2π)
 = Σ_{k=−n}^{n} ((1/2π) ∫_{−π}^{π} f(t) e^{−ikt} dt) e^{ikt}
 = Σ_{k=−n}^{n} a_k(f) e^{ikt}.
Lecture 20, page 14


Fourier series for real functions

The Fourier series Pn (f ) will be a linear combination of the


functions eikt for n  k  n and so, as

e±ikt = cos(kt) ± i sin(kt),

we can rewrite this as a linear combination of the functions

1, sin(t), cos(t), sin(2t), cos(2t), . . . , sin(nt), cos(nt).

Indeed, if f is a function from [ ⇡, ⇡] to R, this should give us


P_n(f) = α₀ + Σ_{k=1}^{n} β_k sin(kt) + Σ_{k=1}^{n} γ_k cos(kt),

for some real numbers α₀, β₁, β₂, . . . , β_n, γ₁, γ₂, . . . , γ_n.

Lecture 20, page 15


Example

Find the Fourier series of the function f (t) = |t|.

For k = 0:

a₀ = (1/2π) ∫_{−π}^{π} |t| dt = (1/2π)(∫_{−π}^{0} (−t) dt + ∫_{0}^{π} t dt) = (1/2π)(π²/2 + π²/2) = π/2.

For k ≠ 0:

a_k = (1/2π) ∫_{−π}^{π} |t| e^{−ikt} dt = (1/2π)(∫_{0}^{π} t e^{−ikt} dt + ∫_{0}^{π} t e^{ikt} dt) = (1/π) ∫_{0}^{π} t cos(kt) dt,

and integrating by parts gives

a_k = (1/π)[t sin(kt)/k + cos(kt)/k²]₀^π = (cos(kπ) − 1)/(πk²) = ((−1)^k − 1)/(πk²),

so a_k = −2/(πk²) when k is odd and a_k = 0 when k is even (and non-zero).

Hence

P_n(f)(t) = Σ_{k=−n}^{n} a_k e^{ikt}
 = π/2 − (2/π) Σ_{1≤k≤n, k odd} (e^{ikt} + e^{−ikt})/k²
 = π/2 − (4/π) Σ_{1≤k≤n, k odd} cos(kt)/k²,

which is real, as we would expect for a real-valued f.
Lecture 20, page 16
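A numerical check of these coefficients (numpy and scipy assumed): computing a_k(f) = (1/2π)∫|t|e^{−ikt}dt by numerical integration reproduces π/2 for k = 0 and ((−1)^k − 1)/(πk²) for k ≠ 0.

# Fourier coefficients of f(t) = |t| on [-pi, pi].
import numpy as np
from scipy.integrate import quad

def a(k):
    re = quad(lambda t: abs(t) * np.cos(k * t), -np.pi, np.pi)[0]
    im = quad(lambda t: -abs(t) * np.sin(k * t), -np.pi, np.pi)[0]
    return (re + 1j * im) / (2 * np.pi)

print(a(0))                                              # approx pi/2
for k in [1, 2, 3, 4, 5]:
    print(k, a(k), ((-1)**k - 1) / (np.pi * k**2))       # matches the formula above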


Let's take a look at it...

Recall that P_n(f)(t) = π/2 − (4/π) Σ_{k≤n, k odd} cos(kt)/k².

For −π ≤ t ≤ π, the first, third and fifth-order Fourier series (solid
curve) of f(t) = |t| (dashed curve) look like this...

P₁(t) = π/2 − (4/π) cos t,  P₃(t) = P₁(t) − (4/9π) cos(3t),  P₅(t) = P₃(t) − (4/25π) cos(5t).

Clearly, as n increases, the Fourier series Pn (f ) gets closer to f .

Lecture 20, page 17


This holds more generally...

Theorem: Suppose that f : [ ⇡, ⇡] ! C is continuous.


If Pn (f ) is the nth-order Fourier series of f , then
(1) kf Pn (f )k ! 0 as n ! 1 and
(2) for t 2 [ ⇡, ⇡], f (t) = lim Pn (f )(t).
n!1

Proof: Well beyond the scope of this course!


Fourier series are very useful in physics and engineering, but let’s
look at a mathematical application...

Lecture 20, page 18


Example continued
Use the Fourier series above to show that

1 + 1/3² + 1/5² + 1/7² + · · · = π²/8.

As f(t) = |t| is continuous, f(t) = lim_{n→∞} P_n(f)(t) for every t ∈ [−π, π], i.e.

|t| = π/2 − (4/π) Σ_{k≥1, k odd} cos(kt)/k².

Setting t = 0, we get

0 = π/2 − (4/π) Σ_{k odd} 1/k²,

and so

1 + 1/3² + 1/5² + 1/7² + · · · = Σ_{k odd} 1/k² = π²/8.
Lecture 20, page 19
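A quick numerical check of this sum (numpy assumed):

# 1 + 1/3^2 + 1/5^2 + ... converges to pi^2 / 8.
import numpy as np

k = np.arange(1, 200001, 2)          # odd k
print(np.sum(1.0 / k**2), np.pi**2 / 8)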


Final words

This is a ‘methods’ course, but to apply the methods properly you


need a decent understanding of the underlying theory.
The exam will test you on the theory, but if you have been paying
attention to what is in the slides and the exercises, you should be
able to handle it.
Obviously, the exam will also test your ability to apply the methods
that you’ve learnt in this course.

Have a great Easter!

Lecture 20, page 20


MA212: Further Mathematical Methods (Linear Algebra) 2020–21
Exercises 20: Inverses and Fourier series

For these and all other exercises on this course, you must show all your working.


1. Write down the equations which u, v, w, x, y and z must satisfy in order for the matrix

(u v; w x; y z)

to be a right inverse of the matrix

A = (1 1 1; 1 1 2).
Hence find all the right inverses of A.
Also show that A has no left inverse.


2. Express the matrix 0 1
1 1 0 1
@
A= 0 1 1 1A
1 1 0 1
in the form A = BC where B and C are both of rank 2 and of sizes 3 ⇥ 2 and 2 ⇥ 4 respectively.
Hence find the strong generalised inverse of A.
Deduce the matrix which represents the orthogonal projection of R3 onto the column space of A.
0 1
1 0 0 1
B 1 1 0 0C
3. Find the strong generalised inverse of A = B
@0
C.
1 1 0A
0 0 1 1
4. Let A be the singular 2 × 2 matrix A = (1 x; y xy), where x and y are real numbers, not both zero.
Find a formula for the strong generalised inverse of A.

5. Let A be a real matrix, with strong generalised inverse As . Show that (As )t = (At )s , i.e. that
the transpose of As is the strong generalised inverse of At .
Deduce that, if A is symmetric, then so is As .

6. Assume that there are only two values for x, namely 0 and 1, but there are four pairs of data,
namely (0, 3), (0, 2), (1, 5) and (1, 3) where the first coordinate is the independent variable x and the
second the dependent variable y.
Using the strong generalised inverse, find the least squares approximation function to the data in
Lin({1, x, x2 }).


7. Let P be the space of all polynomial functions. Define the linear transformation D : P ! P by
D(f ) = f 0 , i.e. each function f in P is mapped to its derivative.
(a) Does D have an inverse?

(b) Find a right inverse of D, i.e. a linear transformation D⇤ : P ! P such that DD⇤ is the
identity.

Exercises continue overleaf...


8. Recall that in question 10 of Exercises 13 we found an orthonormal basis of P₂[−1, 1] with the
inner product given by

⟨f, g⟩ = ∫_{−1}^{1} f(x)g(x) dx.

Using this basis, find a least squares approximation to x³ in P₂[−1, 1].


What exactly is being minimised by this least squares approximation?

9. Find the best approximation of the function t in the interval (−π, π) by a linear combination
of the functions e^{ikt} for k in some range −n ≤ k ≤ n.
Hence write the function t as a sum of functions of the form c_k sin(kt) and d_k cos(kt), for k > 0.
Evaluate both sides of your final expression at t = π/2 to get an expression for π.

10. Find the best approximation of the function t² in the interval (−π, π) by a linear combination
of the functions e^{ikt} for k in some range −n ≤ k ≤ n and rewrite this approximation as a real-valued
function.
By letting n tend to ∞ and evaluating at t = 0, show that Σ_{k=1}^{∞} (−1)^{k+1}/k² = π²/12.
[The remainder of the original scan consists of handwritten attempts at Exercises 20 and further handwritten revision notes (generalised inverses, orthogonal complements, Gram–Schmidt, limits, Riemann–Stieltjes and improper integrals, double integrals, diagonalisation, unitary matrices, Laplace transforms); these have not survived text extraction.]