
MAST10022 Linear Algebra: Advanced

Lawrence Reeves
School of Mathematics and Statistics
University of Melbourne 2024 semester 1

Reading mathematics is not like reading novels or history. You need to think slowly about every
sentence. Usually, you will need to reread the same material later, often more than one rereading.

From “The art of proof” by Beck and Geoghegan

You have probably never had a laboratory course in mathematics. Mathematics is not considered
to be an experimental science, whereas physics, chemistry, and biology are. Research for a chemist
can consist of a laboratory experiment designed to validate a conjecture, to suggest a conjecture,
or simply to see what happens. There is little comparable activity in mathematics.

The main business of mathematics is proving theorems.

From “A first course in abstract algebra” by J. B. Fraleigh

DO NOT POST ON ANY INTERNET SITE

© University of Melbourne 2024


MAST10022 Linear Algebra: Advanced
Subject information

Linear algebra is a core component of undergraduate mathematics. The term ‘linear algebra’ covers many things, from the basic idea of vectors in the plane through to infinite dimensional function spaces and beyond. The fundamental objects of study are ‘vector spaces’ and ‘linear transformations’. Matrices are used as a tool. Appropriate mathematical notation and proof technique will be introduced when investigating the fundamental properties of vector spaces.
In these lectures our goal will be to develop a solid theoretical understanding and to prove the results
(i.e., theorems, propositions, lemmas) considered. There will also be applications and calculation, but
these rest firmly on the theoretical foundations. Extra material and explanation will be presented in
the lectures.
You are expected to attend all lectures in person.

Topics
Topics that we will cover include:

• sets and functions, induction and logic
• matrices
• linear systems
• row operations and elementary matrices
• row reduction and elimination
• determinants
• vector spaces: definition and examples
• subspaces
• bases and dimension
• coordinates relative to a basis
• error-correcting codes
• linear transformations
• matrix of a linear transformation
• eigenvalues and eigenvectors
• diagonalisation
• Cayley-Hamilton theorem
• inner product spaces
• Gram-Schmidt orthogonalisation procedure
• orthogonal diagonalisation
• spectral theorem for Hermitian matrices

Sources of information
• The definitive source of information for this (and indeed any) subject is the University Handbook.
• There are subject materials and a discussion board on the subject LMS site.
• There is lots of general information for Maths and Stats students on the LMS site MSPrime.

Books
The lecture notes are not a textbook. There are many excellent textbooks that cover the material of this subject. It is not necessary to purchase a textbook. You should make use of the library! Here are some suggestions:

• Elementary Linear Algebra by H. Anton and C. Rorres
• Linear Algebra and its Applications by G. Strang
• Linear Algebra Done Right by S. Axler
• Finite-Dimensional Vector Spaces by P. Halmos

Classes
There are three one-hour lectures and two one-hour tutorials per week. The tutorials involve working
in small groups at a whiteboard. MATLAB is used as a tool for some of the exercises.

Assessment
Four written assignments: 12%
Equally weighted and due at 1pm on Friday of weeks 3, 5, 9, 11.
Late assignments will not be accepted (without special consideration).

45-minute online mid-semester test held in week 7: 8%

45-minute computer MATLAB test held in week 12: 10%

3-hour written end-of-semester examination: 70%

How to get help


• Consultation with the lecturer (times available on the LMS)
• Consultation with your tutor (by arrangement with your tutor)
• Subject discussion board (via the LMS)
• Math Assist is a drop-in space for Mathematics and Statistics students to develop their mathematical understanding with support from tutors on duty.

Special consideration
Here is some general information about special consideration.

• To apply for special consideration on the exam please follow the above link to apply online.
• To apply for special consideration on an assignment or the mid-semester test or the MATLAB test, please contact the lecturer directly. Do not apply through the general special consideration page linked above as that can result in a long delay.
• Applications must be submitted no later than four business days after the due date. (This is a university-wide rule.)

Academic integrity
• You must complete all assessment pieces entirely on your own.
• It is academic misconduct to upload assignment or exam questions to online ‘help’ sites.
• It is academic misconduct to show your assignment answers to other students.
• There is more information on the University’s academic integrity site.

Lecturer
Lawrence Reeves
[email protected]
room 173, Peter Hall building

When sending me an email, please use your university email account and include your student number.



Lecture plan

(Subject to change!)
1 Sets and functions
2 Mathematical induction and logic
3 Matrices
4 Matrix inverses and linear systems
5 Row operations and elementary matrices
6 Gaussian elimination and types of solution set
7 Reduced row echelon form and matrix inverses
8 Rank and determinant of a matrix
9 Determinants (continued)
10 Fields
11 Vector spaces
12 Subspaces
13 Linear combinations: span and linear independence
14 Bases and dimension
15 Coordinates relative to a basis
16 Row and column space of a matrix
17 Some techniques for finite-dimensional vector spaces
18 Linear error-correcting codes
19 Linear transformations
20 Matrix representations of linear transformations
21 Change of basis
22 Dual space
23 Eigenvalues and eigenvectors
24 Eigenspaces
25 Diagonalisation
26 Diagonalisation II
27 Powers of a matrix and the Cayley-Hamilton theorem
28 Geometry in Euclidean space
29 Inner products
30 Orthonormal bases
31 The Gram-Schmidt orthogonalisation procedure and orthogonal projection
32 Orthogonal diagonalisation
33 Proof of the spectral theorem
34 Unitary diagonalisation
35 Least squares approximation
A Cardinality
B Existence of bases


Preface: what is a vector?

As motivation for the definition of vector space (Lecture 11), we list some examples of familiar things
that we might already think of as being ‘vectors’. We will see that they share some fundamental
algebraic properties: there is a version of ‘vector addition’ and of ‘scalar multiplication’.

Vectors in the plane

Let u = (1, 1) and v = (1, −2). Then u, v ∈ R^2 and if we add or multiply by a scalar we again get elements of R^2.

u + v = (1, 1) + (1, −2) = (2, −1)        −2u = −2(1, 1) = (−2, −2)

[Figure: the vectors u, v, u + v and −2u drawn in the plane]

Systems of linear equations

Let S be the set consisting of all solutions of the following simultaneous equations:

x + 9y + 6z + 8w = 0
2y + 3w = 0

Then u = (−6, 0, 1, 0) and v = (11, −3, 0, 2) are solutions. Adding two solutions gives another solution, as does multiplying by a scalar: u + v = (5, −3, 1, 2) is a solution and −2u = (12, 0, −2, 0) is a solution.

Polynomials

Consider the set of all polynomials of degree at most 2:

P2(C) = {a0 + a1x + a2x^2 | a0, a1, a2 ∈ C}

If we add any two elements of P2 (C), we get an element of P2 (C). If we multiply an element of P2 (C)
by a scalar, we get an element of P2 (C).

(1 + x + x^2) + (1 − 2x − x^2) = 2 − x

We could also consider the set of all polynomials:

C[x] = {a0 + a1x + · · · + anx^n | n ≥ 0 and a0, a1, . . . , an ∈ C}

This set too is closed under addition and scalar multiplication.



Functions

Consider the set F consisting of all functions from R to R. We have a way of adding two functions
together and of multiplying a function by a scalar. For example, let u, v ∈ F be given by u(x) = sin(x)
and v(x) = e^x. Adding them together gives another element w ∈ F which is given by defining w(x) = sin(x) + e^x.

Vector spaces

All of the structures we’ve listed above share some fundamental properties. This leads to the abstract concept of a vector space. The definition of a vector space, which we will state precisely later, captures the common algebraic properties shared by the above examples. The fundamental idea is that we have
a set (whose elements are called vectors) and a way of adding elements of the set together, as well
as a way of multiplying by a scalar. The elements of the set are called vectors, even if they are also
polynomials or functions or matrices. For a given vector space, the scalars are fixed and can be R or
C or some other field.
The pay-off for this abstraction is that any result or technique we establish for vector spaces applies
to all of the above examples – they don’t need to be treated separately.



LECTURE 1

Sets and functions

Before beginning with the linear algebra content proper we revise some important general concepts
and notations. Sets and functions are fundamental to linear algebra and to modern mathematics in
general.

1.1 Sets

A set is a collection of objects called elements (or members) of that set.* The notation x ∈ A means that x is an element of the set A. The notation x ∉ A is used to mean that x is not a member of A.
Let A and B be sets. We say that A is a subset of B (or is contained in B), written A ⊆ B, if every element of A is also an element of B (i.e., if x ∈ A, then x ∈ B). Two sets are equal if they have the same members. Thus A = B exactly when both A ⊆ B and B ⊆ A. If A ⊆ B and A ≠ B then we say that A is a proper subset of B and (sometimes) write A ⊊ B.
Sets are often defined either by listing their elements, as in A = {0, 2, 3}, or by giving a rule or condition which determines membership in the set, as in A = {x ∈ R | x^3 − 5x^2 + 6x = 0}.
Here are some familiar (mostly mathematical) sets:

• natural numbers: N = {1, 2, 3, 4, . . .}

• integers: Z = {. . . , −3, −2, −1, 0, 1, 2, 3, . . .}

• rational numbers: Q = {x/y | x, y ∈ Z, y ≠ 0}

• real numbers:† R

• complex numbers: C = {x + iy | x, y ∈ R}

• (1, 3] = {x ∈ R | 1 < x ≤ 3}

• Greek alphabet (lower case): {α, β, γ, δ, ε, ζ, η, θ, ι, κ, λ, µ, ν, ξ, o, π, ρ, σ, τ, υ, ϕ, χ, ψ, ω}

In these examples we have the following containment relations: N ⊆ Z ⊆ Q ⊆ R ⊆ C and (1, 3] ⊆ R. Note that (1, 3] ⊈ Q because the interval (1, 3] contains real numbers that are not rational. For example, √2 ∈ (1, 3] but √2 ∉ Q.
As indicated above, the notation {. . .} is used for set formation. Sets are themselves mathematical objects and so can be members of other sets. For instance, the set {3, 5} consists of two elements, namely the numbers 3 and 5. The set {{3, 5}, {3, 7}, {7}, 3} consists of 4 elements, namely the sets {3, 5}, {3, 7}, {7} and the integer 3. Note that 7 ∉ {{3, 5}, {3, 7}, {7}, 3}. Observe that {7} is the set whose only element is the number 7, and we have that 7 ∈ {7} but 7 ⊈ {7}.
The empty set, denoted by ∅, is the set that has no elements, that is, x ∈ ∅ is never true.

* In fact, more care is needed in the definition of a set. In general one must place some restriction on set formation. For example, trying to form {x | x is a set} or {x | x ∉ x} can lead to logical paradoxes (Russell’s paradox). This can be dealt with or excluded in a more formal or axiomatic treatment of set theory. We will be careful to avoid situations where this difficulty arises.
† It’s a bit more involved to define the real numbers precisely, but one can think of them either as the points on the real line or as (infinite) decimal expansions. In this subject we will be using some standard properties of R, but we will not give a construction.

Lemma 1.1

The empty set is a subset of every set.

Proof. Let A be a set. We need to show that the following statement is true:

if a ∈ ∅, then a ∈ A (*)

Let’s suppose that (*) were not true. Then there would be an element a such that a ∈ ∅ is true and
a ∈ A is false. However, since a ∈ ∅ is never true, no such a exists and we conclude that (*) must in
fact be true.

Note that ∅ ∈ {∅} and ∅ ⊆ {∅} but ∅ ∉ ∅.

Operations on sets

The intersection of two sets A and B is the set

A ∩ B = {x | x ∈ A and x ∈ B}

The union of A and B is the set

A ∪ B = {x | x ∈ A or x ∈ B}

Example 1.2.

1) {2m + 5 | m ∈ Z} ∪ {2m | m ∈ Z} = Z
2) {2m + 5 | m ∈ Z} ∩ {2m | m ∈ Z} = ∅
3) {n ∈ N | n is prime} ∩ {2n | n ∈ N} = {2}

The set difference of two sets A and B is the set

A \ B = {x | x ∈ A and x ∉ B}

If B ⊆ A, then A \ B is called the complement of B in A. If the larger set A is clear from the context,
we sometimes write B c for the complement of B in A.

Example 1.3.
1) Z \ N = {. . . , −2, −1, 0}
2) N \ Z = ∅
3) [0, 2] \ N = [0, 1) ∪ (1, 2)
4) N \ [0, 2] = {3, 4, . . . }

Proposition 1.4: De Morgan’s Laws

Let A, B ⊆ X be two sets. Then

1. A ⊆ B iff B^c ⊆ A^c    2. (A ∩ B)^c = A^c ∪ B^c    3. (A ∪ B)^c = A^c ∩ B^c

Given a set A, the power set of A is the set containing all subsets of A. It is denoted P(A).

Example 1.5. For A = {α, β, γ} we have P(A) = {∅, {α}, {β}, {γ}, {α, β}, {α, γ}, {β, γ}, {α, β, γ}}.
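These operations are mechanical enough to experiment with on a computer. Here is a small illustrative sketch in Python (an aside, using Python's built-in set type and the standard itertools module):

from itertools import combinations, product

A = {0, 2, 3}
B = {2, 3, 5}

print(A | B)   # union: {0, 2, 3, 5}
print(A & B)   # intersection: {2, 3}
print(A - B)   # set difference: {0}

# the power set of A: all 2^3 = 8 subsets, listed by size
power_set = [set(c) for r in range(len(A) + 1) for c in combinations(sorted(A), r)]
print(power_set)

# the Cartesian product A x B as a set of ordered pairs
print(set(product(A, B)))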




Let A and B be two sets. We define a set, called the Cartesian product of A and B, by
A × B = {(a, b) | a ∈ A and b ∈ B}
Each element (a, b) of the set A × B is called an ordered pair.

Note. 1. (a, b) = (a′, b′) precisely when a = a′ and b = b′.
2. If A ≠ B (and both are non-empty), then A × B ≠ B × A.
3. If A = B, we often write A^2 in place of A × A.

1.2 Functions

The concept of a function is fundamental in mathematics. Functions on the real numbers are often
described using some sort of a formula (e.g., f (x) = sin(x)), but we want to define the notion of
function in a way that makes sense more generally. The idea is to make a definition out of what is
sometimes called the graph of a function.

Definition 1.6: function

Let A and B be sets. A function from A to B is a subset f of A × B such that for each a ∈ A
there is exactly one element of f whose first entry is a. We write f (a) = b to mean (a, b) ∈ f . We
write f : A → B to mean that f is a function from A to B. The set A is called the domain of the
function and B is called the codomain of the function.

Remark. 1. Functions are often (but not always!) given by a formula such as f : R → R, f(x) = x^2. When written in this way, the subset of A × B is understood to be {(a, f(a)) | a ∈ A}.
2. The domain and codomain are part of the defining data of a function. The following two functions are not the same:
f : R → R, f(x) = x^2
g : [0, ∞) → R, g(x) = x^2

Definition 1.7: injective, surjective, bijective

Let f : A → B be a function.

1. We say that f is injective if for all a1 , a2 ∈ A, if f (a1 ) = f (a2 ) then a1 = a2 .

2. We say that f is surjective if for all b ∈ B there exists a ∈ A with f (a) = b.

3. We say that f is bijective if it is both injective and surjective.

Example 1.8.
1) The function f : R → R, f (x) = x2 is neither injective nor surjective.
2) The function g : [0, ∞) → R, g(x) = x2 is injective but not surjective.
3) The function h : R → [0, ∞), h(x) = x2 is surjective but not injective.
4) The function k : [0, ∞) → [0, ∞), k(x) = x2 is bijective.


Example 1.9.
1) f : N × N → N, f(m, n) = 2^m 3^n is injective (but not surjective).
2) g : N → Z, with g(n) = n/2 if n is even and g(n) = (1 − n)/2 if n is odd, is bijective.

Let f : A → B and g : B → C be two functions. The composition of f and g is the function g ◦ f : A → C given by g ◦ f(a) = g(f(a)). Given a set A, the identity function on A is the function IdA : A → A, IdA(a) = a. If f : A → B is a bijection, then there is a well-defined inverse function f^{-1} : B → A having the property that f ◦ f^{-1} = IdB and f^{-1} ◦ f = IdA. Indeed, if we think of functions as sets of ordered pairs and f is a bijection, then the ordered pairs of f^{-1} are just the pairs of f in reverse order.

Example 1.10. Consider the function f : N → Z≥2, f(n) = n + 1. The corresponding subset of N × Z≥2 is
{(1, 2), (2, 3), (3, 4), . . . }
The function f is a bijection. Its inverse is f^{-1} : Z≥2 → N, f^{-1}(n) = n − 1, which as a subset of Z≥2 × N is
{(2, 1), (3, 2), (4, 3), . . . }
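Definition 1.6 can be modelled directly on a computer: a finite function is literally a set of ordered pairs, and the inverse of a bijection is obtained by reversing every pair, as in Example 1.10. A short illustrative Python sketch:

# the function f: {1,...,5} -> {1,...,5}, f(n) = 6 - n, as a set of ordered pairs
f = {(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)}

def apply(fn, a):
    # a function has exactly one pair whose first entry is a
    [b] = [y for (x, y) in fn if x == a]
    return b

# f is a bijection, so reversing the pairs gives its inverse
f_inv = {(y, x) for (x, y) in f}

print(apply(f, 2))                 # 4
print(apply(f_inv, apply(f, 2)))   # 2, since f_inv(f(a)) = a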

In mathematics one often needs functions of several variables, for example the operation of addition of real numbers is a function of two variables which assigns to each pair of real numbers (x, y) their sum x + y. Thus addition is a function from R^2 to R. More generally, a function of n variables from A to B (or an n-ary function f from A to B) is just a function of the form f : A^n → B.

1.3 Exercises

1. List five elements belonging to each of the following sets:

(a) {n ∈ N | n is divisible by 5}
(b) P({1, 2, 3, 4, 5})
(c) {n ∈ N | n + 1 is a prime}
(d) {2n | n ∈ N}
(e) {r ∈ Q | 0 < r < 1}

2. List all the elements in each of the following sets:

(a) {n ∈ N | n^2 = 3}
(b) {n ∈ Z | 3 < |n| < 7}
(c) {x ∈ R | x < 1 and x > 2}
(d) {3n + 1 | n ∈ N and n ≤ 6}
(e) {n ∈ N | n is a prime and n ≤ 15}

3. Consider the sets

A = {n ∈ N | n is odd}
B = {n ∈ N | n is a prime}
C = {4n + 3 | n ∈ N}
D = {x ∈ R | x^2 − 8x + 15 = 0}

Which are subsets of which? Consider all sixteen possibilities.

4. Consider the sets

A = {n ∈ N | n ≤ 11}
B = {n ∈ N | n is even and n ≤ 20}
E = {n ∈ N | n is even}

Determine each of the following sets:


(a) A ∪ B    (b) A ∩ B    (c) A \ B    (d) B \ A    (e) E ∩ B    (f) B \ E    (g) E \ B    (h) N \ E

5. Prove (directly from the definitions of the operations) that (A ∪ B) ∩ Ac ⊆ B.

6. Prove or disprove each of the following:

(a) A ∩ B = A ∩ C implies B = C
(b) A ∪ B = A ∪ C implies B = C
(c) (A ∩ B = A ∩ C and A ∪ B = A ∪ C) implies B = C

7. Let S = {0, 1, 2, 3, 4} and T = {0, 2, 4}.

(a) How many elements are there in S × T? How many in T × S?


(b) List the elements in {(m, n) ∈ S × T | m < n}.
(c) List the elements in {(m, n) ∈ T × S | m < n}.
(d) List the elements in {(m, n) ∈ S × T | m + n ≥ 3}.
(e) List the elements in {(m, n) ∈ T × S | mn ≥ 4}.
(f) List the elements in {(m, n) ∈ S × S | m + n = 10}.

8. Let S = {1, 2, 3, 4, 5} and T = {a, b, c, d}. For each question below: if the answer is “yes” give
an example; if the answer is “no” explain briefly.

(a) Are there any injective functions from S to T ?


(b) Are there any injective functions from T to S?
(c) Are there any surjective functions from S to T ?
(d) Are there any surjective functions from T to S?
(e) Are there any bijective functions from S to T?

9. Let S = {1, 2, 3, 4, 5} and consider the following functions from S to S: 1S (n) = n, f (n) = 6 − n,
g(n) = max{3, n} and h(n) = max{1, n − 1}.

(a) Write each of these functions as a set of ordered pairs.


(b) Which of these functions are injective and which are surjective?

10. Consider the two functions from N^2 to N defined by f(m, n) = 2^m 3^n and g(m, n) = 2^m 4^n. Show that f is injective but that g is not injective. Is f surjective? Explain. (You may use that every n ∈ N with n ≥ 2 has a unique prime factorisation.)

11. Show that if f : A → B and g : B → C are injective functions, then g ◦ f is injective.

12. Show that composition of functions is associative, that is, h ◦ (g ◦ f ) = (h ◦ g) ◦ f.

13. Here are two ‘shift functions’ mapping N to N:

R : N → N, R(n) = n + 1
L : N → N, L(n) = max{1, n − 1}

(a) Show that R is injective but not surjective.


(b) Show that L is surjective but not injective.
(c) Show that L ◦ R = IdN but that R ◦ L 6= IdN .


Further reading for lecture 1

The extra material at the end of a lecture can include extra theory or references. It is NOT required!
It’s just for those who would like to know more.

• Some references for introductory set theory:

The art of proof: basic training for deeper mathematics, by Beck and Geoghegan, chapter 5.
Naive set theory, by Halmos.
Russell’s paradox, on Wikipedia.
Zermelo-Fraenkel set theory, on Wikipedia.

• The axiomatic definition of the real numbers R and their construction from Q will be covered in the subject MAST20033 Real Analysis: Advanced (amongst others). An important difference between Q and R is that every non-empty subset of R that is bounded above has a least upper bound. A standard construction of R from Q is via “Dedekind cuts” which, roughly speaking, carefully adds least upper bounds to Q.

The Art of Proof, by Beck and Geoghegan, chapter 8.


Principles of Mathematical Analysis, by Rudin, chapter 1.

• Bijections are used in the definition of the cardinality (or size) of a set. Two sets are said to have the same cardinality if there exists a bijection from one to the other. The two sets {1, 2, 3} and {Julia, Ada, Xav} have the same cardinality. It starts to get interesting when we consider infinite sets. Not all infinite sets have the same cardinality. The sets N, Z, and Q all have the same cardinality, but R does not. In particular, there is a bijection from N to Q. That there is no bijection from N to R can be shown with an elegant argument known as ‘Cantor diagonalisation’.

The Art of Proof, by Beck and Geoghegan, chapter 13.

• Theorem, proposition, lemma, corollary, ... What’s the difference?

These are all statements of results that are then proven to be true. The difference is a little subjective, and the choice usually reflects the author’s view of how important or interesting the result is. Theorems are results that are considered important. Propositions are less important but still interesting. Lemmas are usually shorter, often technical, results that are used in proving other statements. A corollary is a statement that follows easily from a previously proven
theorem or proposition.



LECTURE 2

Mathematical induction and logic

We continue with some background material on logic and proof by induction that we will need later
when constructing proofs.

2.1 Some useful notation from logic

2.1.1 Propositions

We will be concerned with statements that are either true or false. They are called propositions
(alternatively statements).

Example 2.1.
Propositions:
• 1 + 1 = 2
• 1 + 1 = 3
• For all integers z ∈ Z, if z^2 is even then z is even.
• All maths lecturers are named Lawrence.
• Every even integer greater than 2 can be written as the sum of two primes.

Not propositions:
• 28
• z is even
• ‘potato’

2.1.2 Operations on propositions (connectives)

Given two propositions p and q, we can combine them to form new propositions. The conjunction
(‘and’) of p and q is denoted p ∧ q and is defined to be true if both p and q are true, and false in all
other cases. The disjunction (‘or’) of p and q is denoted p ∨ q and is defined to be false if both p and
q are false, and true in all other cases. For each, the four possible cases can be listed in a truth table.

conjunction, disjunction, implication, equivalence:

p q | p ∧ q | p ∨ q | p =⇒ q | p ⇐⇒ q
T T |   T   |   T   |    T    |    T
T F |   F   |   T   |    F    |    F
F T |   F   |   T   |    T    |    F
F F |   F   |   F   |    T    |    T
A statement of the form “if p then q" is called an implication (or a conditional statement). It is
written as p =⇒ q. It is defined by the truth table above. Notice that if p is false, then p =⇒ q is
true whatever the truth value of q. The proof of Lemma 1.1 illustrates why this is the correct choice
of definition to make.
A statement of the form “p if and only if q" is called an equivalence. It is written as p ⇐⇒ q and is
defined by the truth table above.
Given a single proposition p, its negation, denoted ¬p (or ∼ p) is the statement that is false if p is true
and is true if p is false.

Example 2.2. Let’s construct a truth table for the proposition ¬p ∨ q. Compare the last column with that for p =⇒ q.

p q | ¬p | ¬p ∨ q
T T |  F |   T
T F |  F |   F
F T |  T |   T
F F |  T |   T

We say that two statements p and q are logically equivalent if p is true precisely when q is true.* This
is written p ≡ q.

Example 2.3. We observed in the previous example that (p =⇒ q) ≡ (¬p∨q). To show this explicitly,
we construct a truth table for (p =⇒ q) ⇐⇒ (¬p ∨ q) and observe that the equivalence has value T
for all possible values of p and q.

p q | p =⇒ q | ¬p ∨ q | (p =⇒ q) ⇐⇒ (¬p ∨ q)
T T |    T    |    T    |    T
T F |    F    |    F    |    T
F T |    T    |    T    |    T
F F |    T    |    T    |    T

Exercise 14. Use a truth table to show that (p =⇒ q) ≡ (¬q =⇒ ¬p). The second statement is called
the contrapositive of the first statement.

The following can be established using truth tables.

Lemma 2.4: De Morgan

Let p and q be two statements. Then

1. ¬(p ∨ q) ≡ (¬p) ∧ (¬q)
2. ¬(p ∧ q) ≡ (¬p) ∨ (¬q)
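Because p and q each take only the values T and F, identities like these can be checked exhaustively by machine. A small illustrative Python sketch verifying Example 2.3, Exercise 14 and both De Morgan laws:

from itertools import product

def implies(p, q):
    # p => q is false only when p is true and q is false
    return not (p and not q)

for p, q in product([True, False], repeat=2):
    assert implies(p, q) == ((not p) or q)            # Example 2.3
    assert implies(p, q) == implies(not q, not p)     # contrapositive (Exercise 14)
    assert (not (p or q)) == ((not p) and (not q))    # De Morgan 1
    assert (not (p and q)) == ((not p) or (not q))    # De Morgan 2
print("all four equivalences hold in every case")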

2.1.3 Quantifiers

The symbol ∀ means ‘for all’ (or ‘for each’ or ‘for every’). It is called the universal quantifier. The
general form of a proposition formed using the universal quantifier is

∀x ∈ A, p(x)

where, for a given x, p(x) is a statement.


The symbol ∃ means ‘there exists’ (or ‘for some’). It is called the existential quantifier. The statement

∃ x ∈ A, p(x)

is true if there is at least one element x in A such that the statement p(x) is true.

Example 2.5. Here are some statements constructed using these quantifiers.

1. ∀ x ∈ R, x^2 ≥ 0 (which is true)

2. ∀ x ∈ R, (x^2 ∈ Q =⇒ x ∈ Q) (which is false)

3. ∃ x ∈ R, x^2 < 0 (which is false)

4. ∀ x ∈ R ∃ y ∈ R, x + y = 0 (which is true)
* This is the same as saying that the statement p ⇐⇒ q is always true (i.e., is a tautology).


5. ∃ y ∈ R ∀ x ∈ R, x + y = 0 (which is false)

Notice the difference between the final two examples above. Example 4 says that every real number
has an additive inverse. For example, given x = π we can take y = (−1) × π. Example 5 says that
there exists one real number that is the additive inverse of every real number.

Lemma 2.6: Negation of statements involving quantifiers

1. ¬(∀x ∈ A, p(x)) ≡ ∃x ∈ A, ¬p(x)

2. ¬(∃x ∈ A, p(x)) ≡ ∀x ∈ A, ¬p(x)

Example 2.7.

statement                                     negation
∀ x ∈ R, x^2 ≥ 0                              ∃ x ∈ R, x^2 < 0
∃ y ∈ R ∀ x ∈ R, x + y = 0                    ∀ y ∈ R ∃ x ∈ R, x + y ≠ 0
∀ x ∈ U, x ∈ ∅ =⇒ x ∈ A                       ∃ x ∈ U, (x ∈ ∅ ∧ x ∉ A)
All maths lecturers are named Lawrence        There exists a maths lecturer whose name is not Lawrence
All flying pigs are purple                    There exists a flying pig that is not purple

Which one of the two statements in the final row is true?

2.2 Induction

In this subject we will be assuming some standard properties of N such as the distributive law. An
important property that we will use in many proofs is given in the following.

Theorem 2.8: Principle of mathematical induction

Let P (n) be a (true or false) statement that depends on a natural number n ∈ N. In order to prove
that P (n) is true for all values of n ∈ N it is sufficient to prove the following:

1) P (1) is true (‘base case’)

2) ∀n ∈ N, P (n) =⇒ P (n + 1) (‘induction step’)

To give a proof of this theorem we would need to consider the definition and construction of the
natural numbers. It is equivalent to the so-called ‘well ordering’ property of N. We will not give a
proof, but shall regard the above as a fundamental property of N (and Z).
Here are some examples of using mathematical induction as a method of proof.

Example 2.9.

1) Claim. For all n ∈ N, the number n^4 − 6n^3 + 11n^2 − 6n is divisible by 4.

Proof. Let P (n) be the statement ‘n^4 − 6n^3 + 11n^2 − 6n is divisible by 4’.† We check that both conditions of the above theorem are satisfied.
Base case: P (1) is the statement ‘1 − 6 + 11 − 6 is divisible by 4’. Since 1 − 6 + 11 − 6 = 0 and 0 is divisible by 4,‡ the statement P (1) is true.
Induction step: Let n ∈ N and suppose that P (n) is true. That P (n) is true means that there exists a k ∈ Z such that n^4 − 6n^3 + 11n^2 − 6n = 4k. To show that P (n + 1) is true we need to show that (n + 1)^4 − 6(n + 1)^3 + 11(n + 1)^2 − 6(n + 1) is divisible by 4. Note that

(n + 1)^4 − 6(n + 1)^3 + 11(n + 1)^2 − 6(n + 1)
    = (n^4 + 4n^3 + 6n^2 + 4n + 1) − 6(n^3 + 3n^2 + 3n + 1) + 11(n^2 + 2n + 1) − 6(n + 1)
    = n^4 − 2n^3 − n^2 + 2n
    = (n^4 − 6n^3 + 11n^2 − 6n) + 4n^3 − 12n^2 + 8n
    = 4k + 4(n^3 − 3n^2 + 2n)
    = 4(k + n^3 − 3n^2 + 2n)

Therefore P (n + 1) is true. It follows from the principle of mathematical induction (Theorem 2.8) that P (n) is true for all n ∈ N.

† Notice that P (n) is a statement that is either true or false. It is not a polynomial, nor is it an integer. It would be an error to write something such as P (n) = n^4 − 6n^3 + 11n^2 − 6n.

2) Claim. For all n ≥ 4 we have n! > 2^n.

Proof. Notice that the claim is that the inequality holds for all n ≥ 4. In order to apply Theorem 2.8 in exactly the form given, we define P (n) to be the statement that (n + 3)! > 2^{n+3}. If we show that P (n) is true for all n ∈ N, we will have established the claim.
Base case: P (1) is the statement that 4! > 2^4. Since 4! = 24 > 16 = 2^4, the statement P (1) is true.
Induction step: Let n ∈ N and suppose that P (n) is true, that is, that (n + 3)! > 2^{n+3}. We need to show that P (n + 1) is true, that is, that ((n + 1) + 3)! > 2^{(n+1)+3}. We have

((n + 1) + 3)! = (n + 4)! = (n + 4)(n + 3)!
              > (n + 4) 2^{n+3}        (since (n + 3)! > 2^{n+3})
              > 2 × 2^{n+3}            (since n + 4 > 2)
              = 2^{n+4} = 2^{(n+1)+3}

Therefore P (n + 1) is true. It follows from the principle of mathematical induction (Theorem 2.8) that P (n) is true for all n ∈ N.
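Checking small cases by computer is not a proof, but it is a useful sanity check before attempting an induction. An illustrative Python sketch for the two claims above:

from math import factorial

# Claim 1: n^4 - 6n^3 + 11n^2 - 6n is divisible by 4 (a spot check, not a proof)
for n in range(1, 101):
    assert (n**4 - 6*n**3 + 11*n**2 - 6*n) % 4 == 0

# Claim 2: n! > 2^n for all n >= 4 (again, only a spot check)
for n in range(4, 101):
    assert factorial(n) > 2**n

print("both claims hold for all tested n")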

Here is another version of the induction statement in which the induction step is, in principle, easier
to prove.

Theorem 2.10: ‘complete induction’

Let P (n) be a (true or false) statement that depends on a natural number n ∈ N. In order to prove
that P (n) is true for all values of n ∈ N it is sufficient to prove the following:

1) P (1) is true (‘base case’)

2) ∀n ∈ N, (P (1) ∧ · · · ∧ P (n)) =⇒ P (n + 1) (‘induction step’)


(i.e., ∀n ∈ N, if P (k) is true for all k ≤ n, then P (n + 1) is true)

Proof. We will show that this theorem follows from Theorem 2.8.§

‡ Every integer m ∈ Z divides 0 since 0 = m × 0.
§ The converse is also true: Theorem 2.10 implies Theorem 2.8.


Let P (n) be as in the statement of the current theorem and assume that both (1) and (2) hold. We need to show that P (n) is true for all n ∈ N. Let Q(n) be the statement that ‘P (k) is true for all k ≤ n’. We want to check that the statements Q(n) satisfy the conditions stated in Theorem 2.8.
Base case: Since Q(1) is simply the statement that P (1) is true, and we are assuming that (1) holds, we have that Q(1) is true.
Induction step: We need to show that if Q(n) is true, then Q(n + 1) is true. Assume then that Q(n) is true. That is, that P (1) is true, and P (2) is true, and P (3) is true, . . . , and P (n) is true. In particular, we have that P (n) is true. Therefore, since we are assuming that (2) (from the current theorem) holds, we have that P (n + 1) is true. Therefore, P (1) is true, and P (2) is true, and P (3) is true, . . . , and P (n) is true, and P (n + 1) is true. That is, Q(n + 1) is true.
It follows from the principle of mathematical induction (Theorem 2.8) that Q(n) is true for all n ∈ N. Therefore, P (n) is true for all n ∈ N.

Example 2.11. Complete induction can be used to prove that every natural number n ≥ 2 can be written as a product of primes.

2.3 Exercises

15. Translate the following into mathematical notation.

(a) The square of 10 is 50 and the cube of 5 is 12.


(b) If 7 is an integer, then 6 is not an integer.

16. Construct truth tables for the following statements.

(a) (p ∧ q) ∨ (¬p ∧ ¬q) (b) (¬q ∧ (p =⇒ q)) =⇒ ¬p

17. Use a truth table to show that (p ⇐⇒ q) is logically equivalent to (¬p ⇐⇒ ¬q).

18. Translate the following into mathematical notation.

(a) All rational numbers are larger than 6.


(b) There is a real-number solution to x2 + 3x − 7 = 0.
(c) There is a natural number whose cube is 8.
(d) The set of all integers that aren’t multiples of 7.

19. Find the negation of the following propositions.

(a) ∀x ∈ R, x^2 = 10
(b) ∃y ∈ N, y < 0
(c) ∃a ∈ N, ∀x ∈ R, ax = 4
(d) ∀y ∈ Q, ∃x ∈ R, xy = 30
20. Let A = [1 1; 0 1]. Use induction to prove that A^n = [1 n; 0 1] for all n ∈ N.

21. Use induction to show that k^4 − 6k^3 + 11k^2 − 6k is divisible by 4 for all k ∈ N.


Further reading for lecture 2 (for interest)

• More on induction and well-ordering of N

The Art of Proof by Beck and Geoghegan, chapter 2.

• Introductory logic

The Art of Proof by Beck and Geoghegan, chapter 3.

• Here is a useful minor reformulation of Theorem 2.8. The difference is that we start at any integer as the base case (in place of 1).

Theorem. Let P (n) be a (true or false) statement that depends on an integer n ∈ Z and let n0 ∈ Z. In order to prove that P (n) is true for all values of n ≥ n0 it is sufficient to prove the following:

1) P (n0) is true (‘base case’)

2) ∀n ∈ Z with n ≥ n0, P (n) =⇒ P (n + 1) (‘induction step’)

Proof. We prove that this theorem is implied by Theorem 2.8.

Let P (n) and n0 be as in the statement of the current theorem and suppose that (1) and (2) both hold. For any n ∈ N, define Q(n) to be the statement P (n0 + n − 1). Note that n0 + n − 1 ≥ n0, since n ≥ 1. We will use Theorem 2.8 to show that Q(n) holds for all n ∈ N.
Base case: Q(1) is the statement P (n0), which is true by condition (1).
Induction step: Suppose that Q(n) is true for some n ∈ N. Then P (n + n0 − 1) is true, and therefore P (n + n0) is true by condition (2). Therefore Q(n + 1) is true.
It follows from the principle of mathematical induction (Theorem 2.8) that Q(n) is true for all n ∈ N. Therefore P (n) is true for all n ≥ n0.



LECTURE 3

Matrices

Matrices are a fundamental tool in linear algebra. We recall some definitions, including the usual arithmetic binary operations and the unary operation of transposition.

Definition 3.1: Matrix

Let m, n ∈ N. A matrix of size m × n is a rectangular array of numbers having m (horizontal) rows and n (vertical) columns. The numbers in the array are called the entries of the matrix. For the moment, the entries are from Z or R or C, but later we will allow other types of entries. For a matrix A of size m × n, we denote by Aij the entry in the i-th row and j-th column of A:

    [ A11  A12  ...  A1n ]
A = [ A21  A22  ...  A2n ]      or    A = [Aij]
    [  .    .         .  ]
    [ Am1  Am2  ...  Amn ]

We denote by Mm,n(R) the set of all matrices of size m × n having real entries.
The notations Mm,n(C), Mm,n(Q), and simply Mm,n are used similarly. For the moment, when F is used it represents one of: Q, R or C. However, all results (and proofs) remain valid for any field.*

* We will see the definition of a field in Lecture 10.

Example 3.2. A = [π 3.1 0; −1/2 4 2i] ∈ M2,3(C), A12 = 3.1, A21 = −1/2

Definition 3.3

We fix some terminology and notation for particular kinds of matrices.

• A matrix with the same number of rows as columns is called a square matrix.

• A matrix with only one row is called a row matrix.

• A matrix with only one column is called a column matrix.

• A matrix with all elements equal to zero is a zero matrix, e.g., [0 0; 0 0], [0 0 0; 0 0 0]. It is often denoted simply by 0 when the size is clear from context.

• A matrix A with Aij = 0 if i ≠ j is called a diagonal matrix, e.g., [2 0 0; 0 −1 0; 0 0 5].

• A square matrix A satisfying Aij = 1 if i = j and Aij = 0 if i ≠ j is called an identity matrix. The identity matrix of size n × n is denoted In. For example, I2 = [1 0; 0 1], I3 = [1 0 0; 0 1 0; 0 0 1].

3.1 Operations on matrices

Definition 3.4: matrix addition

Given two matrices of the same size A, B ∈ Mm,n(F) we define A + B ∈ Mm,n(F) by

(A + B)ij = Aij + Bij

That is, each entry of A + B is the sum of the corresponding entries of A and B.

Note. 1. Only matrices of the same size can be added together!


2. The addition Aij + Bij is in the field F.
Example 3.5. [π 3.1 0; −1/2 4 2i] + [0 0 6; 2 3 8] = [π 3.1 6; 3/2 7 8+2i]

Definition 3.6: scalar multiplication for matrices

Given a matrix A ∈ Mm,n(F) and k ∈ F, define a matrix kA ∈ Mm,n(F) by

(kA)ij = k × Aij

That is, each entry of A is multiplied by k.

Example 3.7. 2 [π 3.1 0; −1/2 4 2i] = [2π 6.2 0; −1 8 4i]

Remark. We write A − B to mean A + (−1)B.
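The entrywise operations above are easy to reproduce numerically. An illustrative sketch in Python (assuming the NumPy library; in MATLAB, which this subject uses, A + B and 2*A behave the same way):

import numpy as np

A = np.array([[np.pi, 3.1, 0], [-0.5, 4, 2j]])   # the 2 x 3 matrix of Example 3.2
B = np.array([[0, 0, 6], [2, 3, 8]])

print(A + B)   # entrywise sum, Definition 3.4 (matches Example 3.5)
print(2 * A)   # scalar multiple, Definition 3.6 (matches Example 3.7)
print(A - B)   # shorthand for A + (-1)B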

There are other useful operations with matrices.

Definition 3.8: matrix multiplication

Given matrices A ∈ Mm,n(F) and B ∈ Mn,p(F) we define their product, AB, to be the matrix in Mm,p(F) given by

(AB)ij = Σ_{k=1}^{n} Aik Bkj

Note. Two matrices A and B can only be multiplied together (in that order) if the number of columns
of A is equal to the number of rows of B.

Exercise 22. Using the definition of matrix multiplication, show that for any matrix A ∈ Mm,n (F),
B ∈ Mn,m (F), and k ∈ F we have:

(a) AIn = A and Im A = A


(b) A(kB) = k(AB)
Example 3.9.
1. [π 3.1 0; −1/2 4 2i] [2 5; 0 6; 7 2] = [2π, 5π + 18.6; −1 + 14i, 43/2 + 4i]
2. [2 5; 0 6; 7 2] [π 3.1 0; −1/2 4 2i] = [2π − 5/2, 26.2, 10i; −3, 24, 12i; 7π − 1, 29.7, 4i]
3. [1 1; −1 −1] [1 −1; −1 1] = [0 0; 0 0]

Remark. 1. Matrix multiplication is not commutative!


The order matters. For two matrices A and B it’s possible that AB is defined, but that BA is
not. Even when both AB and BA are defined, they are not necessarily equal.

2. Matrix multiplication is associative.


For any A ∈ Mm,n (F), B ∈ Mn,p (F), and C ∈ Mp,q (F) we have (AB)C = A(BC). This can be
proved directly from the definition of matrix multiplication.
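Both remarks are easy to see numerically; an illustrative NumPy sketch (@ denotes the matrix product):

import numpy as np

A = np.array([[1, 1], [-1, -1]])
B = np.array([[1, -1], [-1, 1]])

print(A @ B)   # the zero matrix, as in Example 3.9(3)
print(B @ A)   # [[2 2], [-2 -2]]: AB = 0 while BA != 0, so the order matters

C = np.array([[1, 2], [3, 4]])
assert np.array_equal((A @ B) @ C, A @ (B @ C))   # associativity holds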

Definition 3.10: transpose

Given a matrix A ∈ Mm,n we define its transpose to be the matrix A^T ∈ Mn,m given by (A^T)ij = Aji. That is, A^T is obtained from A by interchanging the rows and columns of A. A matrix is called symmetric if A^T = A.

Example 3.11. A = [π 3.1 0; −1/2 4 2], A^T = [π −1/2; 3.1 4; 0 2]

Exercise 23. Using the definitions of matrix addition and transpose prove that for all A, B ∈ Mm,n(F), (A + B)^T = A^T + B^T.

Lemma 3.12

Let A ∈ Mm,n(F) and B ∈ Mn,p(F). Then (AB)^T = B^T A^T.

Proof. Note first that (AB)^T and B^T A^T are both of size p × m. To show that they are equal we need to show that

∀i ∈ {1, . . . , p} ∀j ∈ {1, . . . , m},  ((AB)^T)ij = (B^T A^T)ij

Let i ∈ {1, . . . , p} and j ∈ {1, . . . , m}. Then

((AB)^T)ij = (AB)ji                          (definition of transpose)
           = Σ_{k=1}^{n} Ajk Bki             (definition of multiplication)
           = Σ_{k=1}^{n} Bki Ajk             (multiplication in F is commutative)
           = Σ_{k=1}^{n} (B^T)ik (A^T)kj     (definition of transpose)
           = (B^T A^T)ij                     (definition of multiplication)
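The identity can also be spot-checked numerically on random matrices; an illustrative NumPy sketch:

import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-5, 6, size=(2, 3))   # a random 2 x 3 matrix
B = rng.integers(-5, 6, size=(3, 4))   # a random 3 x 4 matrix

# (AB)^T and B^T A^T agree entrywise, as Lemma 3.12 asserts
assert np.array_equal((A @ B).T, B.T @ A.T)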


3.2 Exercises

24. Suppose that A, B, C, and D are matrices with sizes given by:
A ∈ M2,3, B ∈ M1,3, C ∈ M2,2, D ∈ M2,1.
Determine which of the following expressions are defined. For those which are defined, give the size of the resulting matrix.

(a) CA    (b) BD    (c) DB    (d) CDA^T    (e) CA + DB    (f) AB^T D^T + C

25. Give examples of matrices A ∈ M3,3 (C) such that A is:

(a) a diagonal matrix; that is, Aij = 0 if i ≠ j
(b) a scalar matrix; that is, A is a diagonal matrix with Aii = Ajj for all i, j
(c) a symmetric matrix; that is, Aij = Aji for all i, j
(d) an upper triangular matrix; that is, Aij = 0 if i > j

26. Let
A = [−1 0 1; 2 −1 3; 0 1 −2]    B = [0 4 −2; 3 1 2; −1 0 1]
Find

(a) A + B    (b) 2A − 3B    (c) A − λI (λ ∈ R)    (d) A^T    (e) AB    (f) BA

27. Calculate AB, BC, A^T C^T and (CA)^T given that
A = [1 3 1]    B = [1 2; 0 1; 3 4]    C = [3; 1]

28. Evaluate the following matrix products:

(a) [3 4 2; 1 3 6; 7 −1 0] [1 2; 0 −1; −1 0]
(b) [2 2; 1 −1] [−1; −2]
(c) [7 + i, 6, −4 + 3i] [0; 1 − i; 1]
(d) [0; 1; 1] [7 6 −4]
(e) [3 −6 0; 0 2 −2; 1 −1 −1] [2; 1; 1]
(f) [4 3; 6 6; 8 9] [1/2; 3]

29. Consider the following matrices:


A = [2 0 1; 3 1 −1; −1 2 1]    B = [5 0 0; 2 3 −1; −2 3 4]    C = [1 1 −2; 2 3 5; 0 1 2]

Verify that


(a) A(BC) = (AB)C
(b) (AB)^T = B^T A^T
(c) A(B + C) = AB + AC

30. Find examples of the following:

(a) a non-zero matrix A ∈ M2,2(R) with A^2 = 0
(b) a matrix B ∈ M2,2(R) with B^2 = −I2
(c) matrices C, D ∈ M2,2(R) with no zero entries but with CD = 0

31. (a) Show that if the matrix products AB and BA are both defined, then AB and BA are square
matrices.
(b) Show that if A is an m × n matrix and A(BA) is defined, then B is an n × m matrix.

32. Verify, using appropriate trigonometric identities, that


[cos θ1, −sin θ1; sin θ1, cos θ1] [cos θ2, −sin θ2; sin θ2, cos θ2] = [cos(θ1 + θ2), −sin(θ1 + θ2); sin(θ1 + θ2), cos(θ1 + θ2)]

33. Suppose that a 2 by 2 matrix A ∈ M2,2(C) satisfies AB = BA for every 2 by 2 matrix B. That is, A satisfies
∀B ∈ M2,2(C), AB = BA
In this exercise we show that A must be equal to zI2 for some z ∈ C.
Let A = [a b; c d]. By considering the two cases B = [1 0; 0 0] and B = [0 1; 0 0], show that a = d and b = c = 0.

34. Consider the following matrices:


C = [1 −1; 5 −4]    D = [1 0 0; 0 2 0; 0 0 3]
By setting up and solving appropriate simultaneous equations, find:

(a) all matrices A ∈ M2,2 (C) that satisfy AC = CA


(b) all matrices A ∈ M3,3 (C) that satisfy AD = DA

35. Let A, B ∈ Mn,n(C). Suppose that A^2 = A. Show that (AB − ABA)^2 = 0.
(Note we may not assume that n = 2 nor that AB = BA.)

36. Let A, B ∈ Mn,n be symmetric matrices. Show that AB is symmetric if and only if AB = BA.

37. (a) Give an example of three matrices A, B ∈ M2,2 and C ∈ M2,1 such that C 6= 0 and
AC = BC but A 6= B.
(b) Let A, B ∈ Mm,n (C). Suppose that AC = BC for all C ∈ Mn,1 (C). Show that A = B.


Further material for lecture 3

• Matrices

Elementary Linear Algebra by Anton and Rorres, §1.3


Linear Algebra Done Right by Axler, §3.C

• Some history

Matrices and determinants on MacTutor

• A matrix A ∈ Mm,n(F) determines (and is determined by) a function

f : {1, . . . , m} × {1, . . . , n} → F,    f(i, j) = Aij

The only difference is notation.

• Adjacency matrix of a graph

Given a finite graph with vertices v1, . . . , vn we define a matrix A ∈ Mn,n(R) by Aij = 1 if the vertices vi and vj are connected by an edge, and Aij = 0 if they are not.

[Figure: a graph on the five vertices v1, . . . , v5]

A = [0 1 0 1 1; 1 0 1 0 1; 0 1 0 1 0; 1 0 1 0 0; 1 1 0 0 0]

A^3 = [2 6 1 5 4; 6 2 5 1 4; 1 5 0 4 2; 5 1 4 0 2; 4 4 2 2 2]

Note. Given any k ∈ N, the (i, j)-th entry of A^k gives the number of edge paths of length k from vi to vj. (If you’re feeling adventurous, you could try proving this using induction on k.) For example, there are six edge paths of length three from v1 to v2 in the above graph.



LECTURE 4

Matrix inverses and linear systems

4.1 Inverse of a matrix

Matrices follow many of the algebraic properties that we are familiar with from the real numbers (such as the distributive law). It's natural to think about "dividing by a matrix" in the sense of multiplying by the multiplicative inverse.
An important property of the real numbers is that every non-zero real number has a multiplicative inverse, that is, ∀ a ∈ R \ {0} ∃ b ∈ R, ab = 1. Some, but not all, non-zero square matrices have a multiplicative inverse in the same sense.

Definition 4.1: Matrix inverse

A square matrix A ∈ Mn,n(F) is called invertible if there exists a matrix B ∈ Mn,n(F) such that AB = In and BA = In. The matrix B is called the inverse of A and is denoted A^{-1}. If A is not invertible, we say that A is singular.

Remark. Calling B the inverse of A needs some justification. It’s possible to show that there can be at
most one matrix that satisfies the above property. That is, if B, C ∈ Mn,n (F) are such that AB = In
and BA = In and AC = In and CA = In , then B = C.
Example 4.2.
1. [2 i; i −1]^{-1} = [1 i; i −2]
2. [1 1; −1 −1] is singular
3. [0 −3 −2; 1 −4 −2; −3 4 1]^{-1} = [4 −5 −2; 5 −6 −2; −8 9 3]

It’s easy to check that the given inverses are correct by simply multiplying and verifying that the
result is the identity matrix. We will describe a method for calculating the inverse of a matrix in a
later section.
Remark. Notice that it follows immediately from the definition that:

1. For A to be invertible it must be square

2. (If it exists) A^{-1} has the same size as A

3. If A is invertible, then A^{-1} is invertible and (A^{-1})^{-1} = A

4. In^{-1} = In

We note some other useful results about invertibility.

Lemma 4.3: Product of invertible matrices is invertible

If A and C are invertible matrices of the same size, then AC is invertible and

(AC)^{-1} = C^{-1}A^{-1}

Proof. Let A, C ∈ Mn,n(F) be two invertible matrices. We need to verify that C^{-1}A^{-1} satisfies the conditions given in the definition of the matrix inverse.

(AC)(C^{-1}A^{-1}) = A(CC^{-1})A^{-1}    (associativity)
                   = A In A^{-1}         (since CC^{-1} = In)
                   = AA^{-1}             (since A In = A)
                   = In

and, similarly,

(C^{-1}A^{-1})(AC) = C^{-1}(A^{-1}A)C = C^{-1} In C = C^{-1}C = In
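Numerically, the formula is easy to spot-check; an illustrative NumPy sketch (np.linalg.inv computes the inverse of an invertible matrix):

import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])   # invertible 2 x 2 matrices
C = np.array([[1.0, 1.0], [0.0, 1.0]])

lhs = np.linalg.inv(A @ C)
rhs = np.linalg.inv(C) @ np.linalg.inv(A)
assert np.allclose(lhs, rhs)   # (AC)^{-1} = C^{-1} A^{-1}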

Exercise 38. Let A ∈ Mn,n(F). For k ∈ N we define A^k to be the product of k copies of A, that is,

A^k = A × A × · · · × A    (k copies)

Suppose that A ∈ Mn,n(F) is invertible. Show that

(a) (A^k)^{-1} = (A^{-1})^k for all k ∈ N.
(b) If a ∈ F with a ≠ 0, then aA is invertible and (aA)^{-1} = (1/a)A^{-1}.
(c) A^T is invertible and (A^T)^{-1} = (A^{-1})^T.

The following technical result will be useful later.

Lemma 4.4

Let A, B ∈ Mn,n be two square matrices and let i ∈ {1, 2, . . . , n}. If all entries in the i-th row of
A are equal to zero, then all entries in the i-th row of AB are zero.
In particular, if a square matrix has a row consisting entirely of zeros, then it is singular.

Proof. Let A ∈ Mn,n(F) be a square matrix with all entries in the i-th row of A equal to zero. That is, ∀j ∈ {1, . . . , n}, Aij = 0. Let B ∈ Mn,n(F). For the same fixed i, we have (for all j)

(AB)ij = Σ_{k=1}^{n} Aik Bkj = Σ_{k=1}^{n} 0 × Bkj = 0

Therefore, the i-th row of AB has all entries equal to zero. In particular, (AB)ii ≠ 1 and hence AB ≠ In. Therefore, no matrix B can satisfy the properties needed to be the inverse of A.

4.2 Linear Systems

We would like to have an efficient method to solve simultaneous linear equations. Suppose that we have n variables x1, . . . , xn and m linear equations that they should simultaneously satisfy.

a11 x1 + a12 x2 + · · · + a1n xn = b1
a21 x1 + a22 x2 + · · · + a2n xn = b2
        . . .                            (∗)
am1 x1 + am2 x2 + · · · + amn xn = bm


This is sometimes called a linear system. Letting

A = [aij] ∈ Mm,n(F)    X = [x1; x2; . . . ; xn]    B = [b1; b2; . . . ; bm]

the linear system (∗) can be written as

AX = B    (†)

Example 4.5. The linear system

x + (1 + i)y + 7z = 1 − i
ix + 2y − z = 2 + i

can be written as

[1  1+i  7; i  2  −1] [x; y; z] = [1−i; 2+i]

Given A ∈ Mm,n(F) and B ∈ Mm,1(F) we would like to find all X ∈ Mn,1(F) such that the equation AX = B is satisfied. Thinking about our experience with solving simple equations of the form 2x = 7, our first thought might be to multiply on both sides by A^{-1}. The problem is that A need not be invertible (or even square). However, if A does happen to be invertible, then we have the following.

Proposition 4.6

Let A ∈ Mm,m(F) and B ∈ Mm,1(F). If A is invertible, then the equation AX = B has a unique solution and it is given by X = A^{-1}B.

Proof. First note that A^{-1}B is a solution since A(A^{-1}B) = Im B = B. Now suppose that X is any solution. Then we have

AX = B =⇒ A^{-1}AX = A^{-1}B =⇒ Im X = A^{-1}B =⇒ X = A^{-1}B

Therefore, X = A^{-1}B is the only solution.
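An illustrative numerical version of Proposition 4.6 (assuming NumPy; np.linalg.solve finds the solution of AX = B directly, which in practice is preferred to forming A^{-1}):

import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])   # invertible (see Exercise 39: ad - bc = 4 - 6 != 0)
B = np.array([[5.0], [6.0]])

X = np.linalg.solve(A, B)                      # the unique solution of AX = B
print(X)                                       # [[-4.], [4.5]]
assert np.allclose(A @ X, B)
assert np.allclose(np.linalg.inv(A) @ B, X)    # agrees with X = A^{-1} B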

The general case of a linear system (in which A is not necessarily invertible) is discussed in the next section. The technique will rest on the following observation. Suppose we have A ∈ Mm,n(F) and B ∈ Mm,1(F) and an invertible matrix E ∈ Mm,m(F). Define A′ = EA and B′ = EB. Then for all X ∈ Mn,1(F) we have

AX = B ⇐⇒ A′X = B′

The goal will be to arrange for the new linear system A′X = B′ to be as simple as possible.

4.3 Exercises

39. Let a, b, c, d ∈ C.

(a) Suppose that ad − bc ≠ 0. Show (using the definition of inverse) that the inverse of the matrix [a b; c d] ∈ M2,2(C) is the matrix

(1/(ad − bc)) [d −b; −c a] ∈ M2,2(C)


(b) Suppose now that ad − bc = 0.

i) Show that [a b; c d] [d −b; −c a] = [0 0; 0 0]
ii) Use the above observation to show that the matrix [a b; c d] has no inverse.

40. Let A ∈ Mn,n (C). Suppose that there exists B ∈ Mn,k (C) such that AB = 0 and B 6= 0. Show
that A is not invertible. (Be careful, A need not be equal to 0.)

41. (a) Show that if a square matrix A satisfies A^2 − 4A + 3I = 0 then A^{-1} = (1/3)(4I − A).
(b) Verify these relations in the case that A = [2 1; 1 2]

42. Let
A = [1 0 1; 2 3 4; −1 0 −2]
Show that A^{-1} = −(1/3)(A^2 − 2A − 4I).

43. Write the following systems of linear equations in the form AX = B. Then, by using the inverse
of A, solve the system (i.e., find X).

(a)
2x − 3y = 3
3x − 5y = 1
(b)

x + z = −1
2x + 3y + 4z = 3
−x − 2z = 3

(Your answer to the previous exercise will be useful here.)

44. Suppose A, B, P ∈ Mn,n(F) are such that P is invertible and A = PBP^{-1}. Show that

A^k = PB^kP^{-1}

(for all k ∈ N).

45. Let A be a square matrix. Show that if A^2 is invertible, then A is invertible.

(It is possible to give a proof that uses only what we’ve seen so far. We will see later how this can also be shown using the determinant.)


Extra material for lecture 4

• Matrix inverse

Elementary Linear Algebra by Anton and Rorres, §1.3


Linear Algebra Done Right by Axler, §3.C

• Uniqueness of matrix inverse


Suppose that B, C ∈ Mn,n (F) are such that AB = In and BA = In and AC = In and CA = In .
We would then have AB = AC and

AB = AC =⇒ B(AB) = B(AC) (multiplying on the left by B)


=⇒ (BA)B = (BA)C (associativity)
=⇒ In B = In C (since BA = I)
=⇒ B = C

Therefore, if a matrix does have an inverse, it is unique.

• Linear systems

Elementary Linear Algebra by Anton and Rorres, §1.1



LECTURE 5

Row operations and elementary matrices

5.1 Row operations

For a linear system AX = B in which the matrix A is not invertible, Proposition 4.6 does not apply. To handle the general case we introduce the notion of elementary row operations and the corresponding elementary matrices. They will turn out to be useful in other contexts, including when calculating the inverse of a matrix.
Given a linear system there are certain operations that can be carried out without changing the set of solutions. For example, changing the order in which the equations are listed does not alter the set of solutions. Similarly, multiplying one of the equations by a (non-zero) constant does not change the set of solutions. This leads us to define the following operations on matrices.

Definition 5.1: Elementary row operations

An elementary row operation on a matrix A ∈ Mm,n (F) is one of the following:

1. Interchanging two rows.

2. Multiplying a row by a non-zero element of F.

3. Adding a multiple of one row to another.

Note that applying one of the above row operations does not change the size of the matrix. We
say that two matrices are row equivalent* if one can be obtained from the other by a sequence
of row operations. If A, B ∈ Mm,n (F) are row equivalent, we write A ∼ B.

Example 5.2. [0 i 2; 1 3 −2] ∼ [1 0 −2+6i; 0 1 −2i] since one can be obtained from the other as follows:

R1↔R2:    [0 i 2; 1 3 −2] → [1 3 −2; 0 i 2]
(−i)×R2:  [1 3 −2; 0 i 2] → [1 3 −2; 0 1 −2i]
R1−3R2:   [1 3 −2; 0 1 −2i] → [1 0 −2+6i; 0 1 −2i]

Definition 5.3

An elementary matrix is a matrix obtained from an identity matrix In by performing a single


elementary row operation.

It follows from the definition that elementary matrices are always square.
Example 5.4. The matrices [0 1; 1 0], [1 0; 0 −i], and [1 −3; 0 1] are elementary matrices. To justify this note that

R1↔R2:    [1 0; 0 1] → [0 1; 1 0]
(−i)×R2:  [1 0; 0 1] → [1 0; 0 −i]
R1−3R2:   [1 0; 0 1] → [1 −3; 0 1]
* Row equivalence is an example of what is called an equivalence relation.

The connection between elementary matrices and elementary row operations is given by the following result.

Lemma 5.5

Let A, B ∈ Mm,n (F). Suppose that B is obtained from A by applying a single row operation and
let E ∈ Mm,m (F) be the elementary matrix obtained by applying the same row operation to Im .
Then B = EA.

Proof. We consider the three kinds of row operation separately.

Suppose first that the row operation swaps rows p and q. Then we have

Bij = Aij if i ∉ {p, q},   Bpj = Aqj,   Bqj = Apj
Eij = Iij if i ∉ {p, q},   Epj = Iqj,   Eqj = Ipj

and so

(EA)ij = Σ_{k=1}^{m} Eik Akj = Σ_{k} Iik Akj = Aij = Bij   if i ∉ {p, q}
(EA)pj = Σ_{k} Iqk Akj = Aqj = Bpj
(EA)qj = Σ_{k} Ipk Akj = Apj = Bqj

so that EA = B.

Suppose now that the row operation is to multiply the p-th row by λ ∈ F \ {0}. Then we have

Bij = Aij if i ≠ p,   Bpj = λApj
Eij = Iij if i ≠ p,   Epj = λIpj

so (EA)ij = Σ_{k=1}^{m} Eik Akj = Σ_{k} Iik Akj = Aij = Bij if i ≠ p, and (EA)pj = Σ_{k} λIpk Akj = λApj = Bpj.

Therefore EA = B in this case also.

Finally, suppose that the row operation replaces row p by itself plus λ ∈ F times row q. We have

Bij = Aij if i ≠ p,   Bpj = Apj + λAqj
Eij = Iij if i ≠ p,   Epj = Ipj + λIqj

so (EA)ij = Σ_{k=1}^{m} Eik Akj = Aij = Bij if i ≠ p, and (EA)pj = Σ_{k} (Ipk + λIqk) Akj = Apj + λAqj = Bpj.

Again, this shows that EA = B.

Corollary 5.6

Elementary matrices are invertible.

Proof. Given any row operation ρ there is a row operation ρ′ which undoes the effect of ρ. Let the corresponding elementary matrices be E and E′. Then applying the lemma with A = I we have E′EI = I and EE′I = I. Hence EE′ = I and E′E = I. Therefore E is invertible and E^{-1} = E′.


Example 5.7. Considering the row operations and corresponding elementary matrices from Examples 5.2 and 5.4 we have:

1) R1↔R2:   E1 = [0 1; 1 0],   E1 [0 i 2; 1 3 −2] = [1 3 −2; 0 i 2]
2) (−i)×R2: E2 = [1 0; 0 −i],  E2 [1 3 −2; 0 i 2] = [1 3 −2; 0 1 −2i]
3) R1−3R2:  E3 = [1 −3; 0 1],  E3 [1 3 −2; 0 1 −2i] = [1 0 −2+6i; 0 1 −2i]

Notice that

E3 E2 E1 [0 i 2; 1 3 −2] = E3 E2 [1 3 −2; 0 i 2] = E3 [1 3 −2; 0 1 −2i] = [1 0 −2+6i; 0 1 −2i]

The order in which the Ei have been multiplied is important!
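The whole computation can be replayed numerically; an illustrative NumPy sketch (in Python, 1j denotes i):

import numpy as np

A = np.array([[0, 1j, 2], [1, 3, -2]])

E1 = np.array([[0, 1], [1, 0]])      # R1 <-> R2
E2 = np.array([[1, 0], [0, -1j]])    # (-i) x R2
E3 = np.array([[1, -3], [0, 1]])     # R1 - 3R2

print(E3 @ E2 @ E1 @ A)   # [[1, 0, -2+6j], [0, 1, -2j]]
print(E1 @ E2 @ E3 @ A)   # a different matrix: the order of the Ei matters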

Exercise 46. For each of the row operations ρi in Example 5.7 write down a row operation ρ′i that undoes the effect of ρi. Write down the elementary matrix E′i that corresponds to ρ′i. Verify that Ei E′i = I2.

Lemma 5.8

Let A, B ∈ Mm,n (F). If A and B are row equivalent, then there exists an invertible matrix
E ∈ Mm,m (F) such that B = EA.

Proof. Since A and B are row equivalent, there is a sequence of elementary row operations that transforms A to B:

A →(ρ1)→ A1 →(ρ2)→ A2 →(ρ3)→ · · · →(ρk)→ Ak = B

Let Ei be the elementary matrix corresponding to the row operation ρi and define E = Ek Ek−1 . . . E2 E1. Applying Lemma 5.5 we have

B = Ek Ek−1 . . . E2 E1 A = EA

Since each Ei is invertible (Corollary 5.6), E is invertible (Lemma 4.3).

5.2 Row echelon form

We now define the first version of a "simplified" matrix that will be useful for solving linear systems
and for other applications (such as finding bases and calculating rank).

Definition 5.9: Row echelon form (REF)

The leftmost non-zero entry in a row is called the leading entry of that row.
A matrix is in row echelon form if it satisfies the following conditions:

1. For any two non-zero rows, the leading entry of the lower row is further to the right than
the leading entry in the higher row.

2. Any row that consists entirely of zeros is lower than every non-zero row.


Note. Some authors add the condition that to be in row echelon form all leading entries should be
equal to 1. We are not including this requirement for what we call row echelon form. The extra
condition is not needed for any of our applications.

Examples 5.10.
$\begin{pmatrix} 0 & 1 & -2 & 3 & 4 \end{pmatrix}$ and $\begin{pmatrix} 2 & 0 & 2 & 3 \\ 0 & 4 & 1 & 2 \\ 0 & 0 & 0 & 3 \end{pmatrix}$ are in row echelon form.

$\begin{pmatrix} 0 & 0 & 0 & 2 & 4 \\ 0 & 0 & 3 & 1 & 6 \\ 0 & 0 & 0 & 0 & 0 \\ 2 & -3 & 6 & -4 & 9 \end{pmatrix}$ and $\begin{pmatrix} 1 & 0 & 0 & 3 \\ 0 & 0 & 0 & 3 \\ 0 & 4 & 1 & 2 \end{pmatrix}$ are not in row echelon form.

5.3 Exercises

47. For each of the following row operations find the corresponding elementary matrix.
   
(a) $\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \xrightarrow{R_2 - 3R_1} \begin{pmatrix} 1 & 2 \\ 0 & -2 \end{pmatrix}$

(b) $\begin{pmatrix} 1 & 2 & 5 \\ 3 & 4 & 6 \end{pmatrix} \xrightarrow{R_2 - 3R_1} \begin{pmatrix} 1 & 2 & 5 \\ 0 & -2 & -9 \end{pmatrix}$

(c) $\begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{pmatrix} \xrightarrow{R_2 \leftrightarrow R_3} \begin{pmatrix} 1 & 2 \\ 5 & 6 \\ 3 & 4 \end{pmatrix}$

(d) $\begin{pmatrix} 3 & 1 & 4 \\ 1 & 5 & 9 \\ 2 & 6 & 5 \end{pmatrix} \xrightarrow{R_2 \times (-2)} \begin{pmatrix} 3 & 1 & 4 \\ -2 & -10 & -18 \\ 2 & 6 & 5 \end{pmatrix}$

48. Let A ∈ M3,5 (C). Suppose that B is obtained from A by the following sequence of row opera-
tions in the given order.

1) R1 ↔ R2 2) R3 − 2R1 3) R1 × 3

Find a single matrix E such that B = EA.

49. Which of the following matrices are in row echelon form?


     
(a) $\begin{pmatrix} 0 & 0 & 0 & 0 \\ 1 & 0 & 2 & 1 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 1 \end{pmatrix}$  (b) $\begin{pmatrix} 1 & 2 & 3 & 4 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \end{pmatrix}$  (c) $\begin{pmatrix} 2 & 4 & 1 & 0 \\ 0 & 0 & 2 & 1 \\ 0 & 0 & 0 & 3 \end{pmatrix}$  (d) $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{pmatrix}$  (e) $\begin{pmatrix} 1 & 0 & 3 & 1 \\ 0 & 1 & 2 & 4 \end{pmatrix}$  (f) $\begin{pmatrix} 1 & 3 & 0 & 2 & 0 \\ 0 & 0 & 2 & 2 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 4 \end{pmatrix}$


Extra material for lecture 5

 row operations, row echelon form

Elementary Linear Algebra by Anton and Rorres, §1.1, §1.5

 equivalence relations

The Art of Proof by Beck and Geoghegan, §6.1.

 column operations
Elementary column operations can be defined in a way analogous to row operations. The elementary matrices corresponding to column operations are multiplied on the right rather than on the left. Here's an example to illustrate.
$$\begin{pmatrix} 2 & 1 & 3 \\ 5 & 4 & 6 \end{pmatrix} \xrightarrow{C_1 \leftrightarrow C_2} \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix} \xrightarrow[C_3 - 3C_1]{C_2 - 2C_1} \begin{pmatrix} 1 & 0 & 0 \\ 4 & -3 & -6 \end{pmatrix} \xrightarrow{C_3 - 2C_2} \begin{pmatrix} 1 & 0 & 0 \\ 4 & -3 & 0 \end{pmatrix}$$
$$\begin{pmatrix} 2 & 1 & 3 \\ 5 & 4 & 6 \end{pmatrix} \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & -2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & -3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 4 & -3 & 0 \end{pmatrix}$$



LECTURE 6

Gaussian elimination and types of solution set

We look at how to solve a linear system using ‘Gaussian elimination’ to put a corresponding matrix
into row echelon form. We also look at how to determine if a linear system has no solutions, a unique
solution, or more than one solution.

6.1 Gaussian elimination

Any matrix can be put into row echelon form by performing a sequence of row operations as follows.

Algorithm 6.1: Gaussian elimination* (To put a matrix into REF)

1. Consider the first column that is not all zeros. Interchange rows (if necessary) to bring a
non-zero entry to the top of that column. (The ‘leading entry’.)

2. Add suitable multiples of the top row to lower rows so that all entries below the leading
entry are zero.

3. Start again with Step 1 applied to the matrix without the first row.
(stop if there are no more rows)

Example 6.2. Here's an example of applying the above procedure. It's a good idea to record the row operations being used at each step.
$$\begin{pmatrix} 3 & 2 & -1 & -15 \\ 1 & 1 & -4 & -30 \\ 3 & 1 & 3 & 11 \\ 3 & 3 & -5 & -41 \end{pmatrix} \xrightarrow{R_2 - \frac13 R_1} \begin{pmatrix} 3 & 2 & -1 & -15 \\ 0 & \frac13 & -\frac{11}{3} & -25 \\ 3 & 1 & 3 & 11 \\ 3 & 3 & -5 & -41 \end{pmatrix} \xrightarrow{R_3 - R_1} \begin{pmatrix} 3 & 2 & -1 & -15 \\ 0 & \frac13 & -\frac{11}{3} & -25 \\ 0 & -1 & 4 & 26 \\ 3 & 3 & -5 & -41 \end{pmatrix} \xrightarrow{R_4 - R_1} \begin{pmatrix} 3 & 2 & -1 & -15 \\ 0 & \frac13 & -\frac{11}{3} & -25 \\ 0 & -1 & 4 & 26 \\ 0 & 1 & -4 & -26 \end{pmatrix}$$
$$\xrightarrow{R_3 + 3R_2} \begin{pmatrix} 3 & 2 & -1 & -15 \\ 0 & \frac13 & -\frac{11}{3} & -25 \\ 0 & 0 & -7 & -49 \\ 0 & 1 & -4 & -26 \end{pmatrix} \xrightarrow{R_4 - 3R_2} \begin{pmatrix} 3 & 2 & -1 & -15 \\ 0 & \frac13 & -\frac{11}{3} & -25 \\ 0 & 0 & -7 & -49 \\ 0 & 0 & 7 & 49 \end{pmatrix} \xrightarrow{R_4 + R_3} \begin{pmatrix} 3 & 2 & -1 & -15 \\ 0 & \frac13 & -\frac{11}{3} & -25 \\ 0 & 0 & -7 & -49 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$

The final matrix is in row echelon form. All matrices above are row equivalent to one another.
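For comparison, here is Algorithm 6.1 written out as a short Python sketch (not part of the notes; the function name ref is our own choice). It uses exact rational arithmetic via the standard fractions module, so it reproduces the fractions appearing in Example 6.2.

    from fractions import Fraction

    def ref(rows):
        """Put a list of rows into row echelon form by Gaussian elimination."""
        rows = [[Fraction(x) for x in r] for r in rows]
        m, n = len(rows), len(rows[0])
        top = 0
        for col in range(n):
            # Step 1: find a non-zero entry in this column at or below `top`.
            piv = next((r for r in range(top, m) if rows[r][col] != 0), None)
            if piv is None:
                continue
            rows[top], rows[piv] = rows[piv], rows[top]          # row swap
            # Step 2: clear the entries below the leading entry.
            for r in range(top + 1, m):
                f = rows[r][col] / rows[top][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[top])]
            top += 1                                             # Step 3
            if top == m:
                break
        return rows

    print(ref([[3, 2, -1, -15], [1, 1, -4, -30], [3, 1, 3, 11], [3, 3, -5, -41]]))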

6.2 Using Gaussian elimination to solve a linear system

Given A ∈ Mm,n (F) and B ∈ Mm,1 (F), to solve the linear system AX = B, we can proceed as
follows.

1. Form a matrix [A|B] ∈ Mm,n+1 (F) by adjoining B as an extra (final) column to A.


(This matrix [A|B] is sometimes called an augmented matrix.)
* Named after the 19th century German mathematician Carl Friedrich Gauss. The technique had been previously
discovered by Chinese mathematicians during the Han dynasty (206 BCE – 220 CE).

2. Apply Gaussian elimination starting with [A|B] to obtain a row echelon matrix [A′|B′].

3. Solve the new, simplified set of equations A′X = B′ starting from the last equation and working
up. (This is sometimes called back substitution.)

Remark. Why does this work? We know that [A′|B′] = E[A|B] for some invertible matrix E by
Lemma 5.8. Therefore A′ = EA and B′ = EB. Therefore AX = B and A′X = B′ have the same set
of solutions (see the comment at the end of section 4.2).

Example 6.3. Let's use this technique to find all solutions to the following linear system.

3x + 2y − z = −15
x + y − 4z = −30        (∗)
3x + y + 3z = 11
3x + 3y − 5z = −41

Let $A = \begin{pmatrix} 3 & 2 & -1 \\ 1 & 1 & -4 \\ 3 & 1 & 3 \\ 3 & 3 & -5 \end{pmatrix}$, $B = \begin{pmatrix} -15 \\ -30 \\ 11 \\ -41 \end{pmatrix}$.
Then
$$[A|B] = \begin{pmatrix} 3 & 2 & -1 & -15 \\ 1 & 1 & -4 & -30 \\ 3 & 1 & 3 & 11 \\ 3 & 3 & -5 & -41 \end{pmatrix} \sim \begin{pmatrix} 3 & 2 & -1 & -15 \\ 0 & \frac13 & -\frac{11}{3} & -25 \\ 0 & 0 & -7 & -49 \\ 0 & 0 & 0 & 0 \end{pmatrix} \quad \text{(as shown in Example 6.2)}$$

The new linear system is

3x + 2y − z = −15
(1/3)y − (11/3)z = −25
−7z = −49

We then get

z = 7 (from the last equation)
y = 11z − 75 (from the second last equation)
  = 2 (since z = 7)
x = −(2/3)y + (1/3)z − 5 (from the first equation)
  = −4 (since y = 2 and z = 7)

So the original linear system (∗) has a unique solution and it is given by (x, y, z) = (−4, 2, 7).
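Back substitution itself takes only a few lines of Python (again a sketch, not from the notes; it assumes the non-zero REF rows computed above and that the system has a unique solution).

    from fractions import Fraction

    aug = [[Fraction(x) for x in row] for row in
           [[3, 2, -1, -15],
            [0, "1/3", "-11/3", -25],
            [0, 0, -7, -49]]]        # the non-zero rows of the REF of [A|B]

    n = 3                            # number of unknowns
    x = [Fraction(0)] * n
    for i in reversed(range(n)):     # work upward from the last equation
        lead = next(j for j in range(n) if aug[i][j] != 0)
        s = sum((aug[i][j] * x[j] for j in range(lead + 1, n)), Fraction(0))
        x[lead] = (aug[i][-1] - s) / aug[i][lead]

    print(x)                         # [-4, 2, 7], i.e. (x, y, z) = (-4, 2, 7)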

6.3 Inconsistent linear systems

Definition 6.4

A linear system is called inconsistent if it has no solutions.

Example 6.5. Find all solutions of the system

x− y+ z =3
x − 7y + 3z = −11
2x + y + z = 16


We form the matrix [A|B] and reduce to row echelon form.
$$[A|B] = \begin{pmatrix} 1 & -1 & 1 & 3 \\ 1 & -7 & 3 & -11 \\ 2 & 1 & 1 & 16 \end{pmatrix} \xrightarrow[R_3 - 2R_1]{R_2 - R_1} \begin{pmatrix} 1 & -1 & 1 & 3 \\ 0 & -6 & 2 & -14 \\ 0 & 3 & -1 & 10 \end{pmatrix} \xrightarrow{R_3 + \frac12 R_2} \begin{pmatrix} 1 & -1 & 1 & 3 \\ 0 & -6 & 2 & -14 \\ 0 & 0 & 0 & 3 \end{pmatrix} = [A'|B']$$

The new linear system is

x − y + z = 3
−6y + 2z = −14
0 = 3

This system is clearly never satisfied no matter what values of x, y, z we take!
The linear system is inconsistent.

Note. As we saw in the above example, a linear system is inconsistent if there is a row of the form
[0 · · · 0 | k] (with k ≠ 0) in the row echelon form. Given the way in which row echelon form is defined,
such a row occurs precisely when there is a leading entry in the final column of the row echelon form
matrix [A′|B′]. We shall see in the next section that this is the only situation in which a linear system
is inconsistent.

6.4 Consistent linear systems

Definition 6.6

A linear system is called consistent if it has at least one solution.

In Example 6.3 we saw a consistent linear system for which there was a unique solution. Here is an
example in which there turns out to be more than one solution.

Example 6.7. Find all solutions of the system

x + y + z = 4
2x + y + 2z = 9
3x + 2y + 3z = 13

$$[A|B] = \begin{pmatrix} 1 & 1 & 1 & 4 \\ 2 & 1 & 2 & 9 \\ 3 & 2 & 3 & 13 \end{pmatrix} \xrightarrow[R_3 - 3R_1]{R_2 - 2R_1} \begin{pmatrix} 1 & 1 & 1 & 4 \\ 0 & -1 & 0 & 1 \\ 0 & -1 & 0 & 1 \end{pmatrix} \xrightarrow{R_3 - R_2} \begin{pmatrix} 1 & 1 & 1 & 4 \\ 0 & -1 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$

The new system is

x + y + z = 4
−y = 1

From which we get

y = −1 (from the last (non-trivial) equation)
x = 4 − y − z (from the first equation)
  = 5 − z

For any value of z, we get a solution by taking y = −1 and x = 5 − z. That is, the set of all solutions is

S = {(5 − z, −1, z) | z ∈ F}


Lemma 6.8

Let A ∈ Mm,n(F) and B ∈ Mm,1(F). Suppose that [A|B] ∼ [A′|B′] and that [A′|B′] is in row
echelon form. Define r ∈ N ∪ {0} to be the number of non-zero rows in [A′|B′].

1. The system AX = B is consistent iff there is no leading entry in the final column of [A′|B′].

Suppose now that the system is consistent.

2. Then r ≤ n and the full solution will require n − r parameters.

Sketch of proof. We have already noted above that if there is a leading entry in the final column, then
the system is inconsistent. If there is no leading entry in the final column, then a solution can always
be found using back substitution as above. The linear system A′X = B′ yields r (non-trivial) equations.
For each equation we can write the variable xi that corresponds to the leading entry in terms
of the xj having j > i. The n − r variables that do not correspond to leading entries can be chosen as
parameters.

Note. 1. If r = n (as in Example 6.3), the solution requires no parameters. That is, there is a unique
solution.

2. If r < n, then there will be more than one solution. If F is infinite (eg, Q, R or C), there will be
infinitely many solutions.

6.5 Exercises

50. Use Gaussian elimination to put the following matrices into row echelon form.
   
1 1 −8 −14 0 1 2 3 4
(a)  3 −4 −3 0 (b)  0 0 5 7 6 
2 −1 −7 −10 0 0 5 2 4

51. Solve the linear system whose augmented matrix can be reduced to the following row echelon
form.  
1 −3 4 7
 0 1 2 2 
 
 0 0 1 5 
0 0 0 0

52. Use Gaussian elimination to solve the following linear systems of equations:

(a) x + z = 0 (b) 2x + 3y = 1 (c) x− y+ z =1


x+y =0 x+ y =1 2x + y − z = 3
y+z =0 3x + 2y − 3z = 2

53. Solve the systems of linear equations whose augmented matrices can be reduced to the follow-
ing row echelon forms.
   
(a) $\left(\begin{array}{ccc|c} 1 & -3 & 7 & 1 \\ 0 & 1 & 4 & 0 \\ 0 & 0 & 0 & 1 \end{array}\right)$   (b) $\left(\begin{array}{ccc|c} 1 & 2 & 3 & 4 \\ 0 & 0 & 0 & 0 \end{array}\right)$

54. Use Gaussian elimination to solve the following linear systems of equations:


(a) 4x − 2y = 5 (c) x1 + x2 + x3 + x4 =1
−6x + 3y = 1 2x1 + 3x2 + 3x3 =1
(b) x − 4y = 1 −x1 − 2x2 − 2x3 + x4 =0
−2x + 8y = −2 − x2 − x3 + 2x4 =1

55. Determine the conditions on a, b, c ∈ C so that the system is consistent:

(a) x + 2y − 3z = a (b) x + 2y + 4z = a
3x − y + 2z = b 2x + 3y − z = b
x − 5y + 8z = c 3x + y + 2z = c

56. Determine the values of the constant k ∈ C for which the system has

(i) no solutions,
(ii) a unique solution,
(iii) an infinite number of solutions.

Find the solutions when they exist.

(a) 2x + 3y + z = 11 (c) x+ y+ 2z = 9
x+ y+ z =6 x− y+ z=2
5x − y + 11z = k 4x + 2y + (k − 22)z = k
(b) x1 + x3 = 1
x2 + x3 = 2
2x2 + kx3 = k

57. Determine the values for a, b and c for which the parabola y = ax2 + bx + c passes through the
points:

(a) (0, −3), (1, 0) and (2, 5) (b) (−1, 1), (1, 9) and (2, 16)


Extra material for lecture 6

 Gaussian elimination
Elementary Linear Algebra by Anton and Rorres, §1.2

 A homogeneous linear system is a system of linear equations in which the right-hand side of
each equation is equal to 0. Otherwise, the system is called non-homogeneous. We explore the
relationship between the solution set of a non-homogeneous linear system and the solution set of
the associated homogeneous linear system.
Let A ∈ Mm,n (F) and B ∈ Mm,1 (F). Define

Shom = {X ∈ Mn,1 (F) | AX = 0} and Sinhom = {X ∈ Mn,1 (F) | AX = B}

Let Xp ∈ Sinhom be a fixed solution of the non-homogeneous system. Show that

Sinhom = Xp + Shom

(Notation: Xp + Shom = {Xp + X | X ∈ Shom })

For example, the following linear system is inhomogeneous.

x + y + 3z = 2
x − y + 7z = 4

A particular solution is (x, y, z) = (3, −1, 0)


The corresponding homogeneous system is

x + y + 3z = 0
x − y + 7z = 0

The general solution to the homogeneous system is:

(x, y, z) = t(−5, 2, 1) t ∈ R

That is,
Shom = {t(−5, 2, 1) | t ∈ R}

The general solution to the inhomogeneous system is:

(x, y, z) = (3, −1, 0) + t(−5, 2, 1) t ∈ R

That is,
Sinhom = {(3, −1, 0) + t(−5, 2, 1) | t ∈ R} = (3, −1, 0) + Shom

LECTURE 7

Reduced row echelon form and matrix inverses

7.1 Reduced row echelon form

We further refine the simplified form of a matrix. For linear systems this corresponds to performing
back substitution while still in matrix form. The new form has the advantage of being uniquely
determined by the original matrix.

Definition 7.1: Reduced row echelon form (RREF)

A matrix is in reduced row echelon form if the following three conditions are satisfied.

1. It is in row echelon form.

2. Each leading entry is equal to 1 (sometimes called a leading 1).

3. In each column containing a leading 1, all other entries are zero.

   
Example 7.2. 1) $\begin{pmatrix} 1 & -2 & 3 & -4 & 5 \end{pmatrix}$, $\begin{pmatrix} 1 & 2+i & 0 \\ 0 & 0 & 1 \end{pmatrix}$, $\begin{pmatrix} 0 & 1 & 2+i & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$ are in RREF

2) $\begin{pmatrix} 1 & 0 & 1-i & 3 \\ 0 & 1 & 1 & 2 \\ 0 & 0 & 0 & i \end{pmatrix}$, $\begin{pmatrix} 1 & 0 & 0 & 2 & 4 \\ 0 & 1 & 0 & 1 & 6 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & -4 & 9 \end{pmatrix}$ are not in RREF

Any matrix can be put into reduced row echelon form as follows.

Algorithm 7.3: Gauss-Jordan elimination (To put a matrix into RREF)

1. First use Gaussian elimination (Algorithm 6.1) to put the matrix in row echelon form.

2. Multiply rows by appropriate numbers (type 2 row ops) to create the leading 1’s.

3. Working from the bottom row upward, use row operations of type 3 to create zeros above
the leading entries.

Example 7.4.
$$\begin{pmatrix} -2 & 0 & 6 & 8 & 14 \\ 1 & 0 & -5 & -8 & -13 \\ 2 & 0 & -2 & 0 & -2 \\ 2 & 0 & -5 & -6 & -11 \end{pmatrix} \xrightarrow[\substack{R_3 + R_1 \\ R_4 + R_1}]{R_2 + \frac12 R_1} \begin{pmatrix} -2 & 0 & 6 & 8 & 14 \\ 0 & 0 & -2 & -4 & -6 \\ 0 & 0 & 4 & 8 & 12 \\ 0 & 0 & 1 & 2 & 3 \end{pmatrix} \xrightarrow[R_4 + \frac12 R_2]{R_3 + 2R_2} \begin{pmatrix} -2 & 0 & 6 & 8 & 14 \\ 0 & 0 & -2 & -4 & -6 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}$$
$$\xrightarrow[-\frac12 R_2]{-\frac12 R_1} \begin{pmatrix} 1 & 0 & -3 & -4 & -7 \\ 0 & 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} \xrightarrow{R_1 + 3R_2} \begin{pmatrix} 1 & 0 & 0 & 2 & 2 \\ 0 & 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}$$

The final matrix is in reduced row echelon form.



Consider the linear system

−2x1 + 6x3 + 8x4 = 14
x1 − 5x3 − 8x4 = −13
2x1 − 2x3 = −2
2x1 − 5x3 − 6x4 = −11

The above row operations tell us that the set of solutions is the same as that for the system

x1 + 2x4 = 2
x3 + 2x4 = 3

This gives

x1 = −2x4 + 2
x3 = −2x4 + 3

We choose s = x2 and t = x4 as parameters (since the corresponding columns have no leading entry).
The set of solutions is given by

{(−2t + 2, s, −2t + 3, t) | s, t ∈ F}
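A possible Python sketch of Algorithm 7.3 (not from the notes; the function name rref and the use of exact fractions are our own choices). It reproduces the reduced row echelon form in Example 7.4.

    from fractions import Fraction

    def rref(rows):
        """Put a list of rows into reduced row echelon form (Gauss-Jordan)."""
        rows = [[Fraction(x) for x in r] for r in rows]
        m, n = len(rows), len(rows[0])
        top = 0
        for col in range(n):
            piv = next((r for r in range(top, m) if rows[r][col] != 0), None)
            if piv is None:
                continue
            rows[top], rows[piv] = rows[piv], rows[top]
            rows[top] = [a / rows[top][col] for a in rows[top]]   # leading 1
            for r in range(m):                                    # zeros above and below
                if r != top and rows[r][col] != 0:
                    f = rows[r][col]
                    rows[r] = [a - f * b for a, b in zip(rows[r], rows[top])]
            top += 1
            if top == m:
                break
        return rows

    print(rref([[-2, 0, 6, 8, 14], [1, 0, -5, -8, -13],
                [2, 0, -2, 0, -2], [2, 0, -5, -6, -11]]))
    # first two rows: [1, 0, 0, 2, 2] and [0, 0, 1, 2, 3]; the rest are zero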

7.2 Using row operations to find the inverse of a matrix

Our method for finding the inverse of a matrix is based on the following result.

Theorem 7.5

Let A ∈ Mn,n (F). Then A is invertible if and only if A ∼ In .

Proof. Suppose first that A ∼ In. By Lemma 5.8 there is an invertible matrix E ∈ Mn,n(F) such that
EA = In. Since E is invertible we have
$$EA = I_n \implies E^{-1}EA = E^{-1}I_n \implies A = E^{-1}$$
Therefore A is invertible (since $E^{-1}$ is invertible) and $A^{-1} = E$.


Now suppose that A is invertible. We want to show that A ∼ In . Let R ∈ Mn,n (F) be a matrix in
reduced row echelon form such that A ∼ R (existence is ensured by Algorithm 7.3). By Lemma 5.8
there is an invertible matrix E ∈ Mn,n (F) such that EA = R. Since both E and A are invertible,
R is invertible (Lemma 4.3). Since R is invertible it does not have a zero row (Lemma 4.4). From
the definition of reduced row echelon form we can see that a square matrix that is in reduced row
echelon form and does not have a zero row must be the identity matrix. Therefore, R = In and hence
A ∼ In .

Corollary 7.6

If A ∈ Mn,n (F) is invertible, then it can be written as a product of elementary matrices.

Exercise 58. Suppose that A, B ∈ Mn,n (F) are such that AB = In . Show that A is invertible and that
A−1 = B. (Hint: as in the above proof, let E ∈ Mn,n (F) be invertible such that R = EA is in reduced
row echelon form. Show that RB = E and that therefore R does not have a row of zeros. Deduce
that R = In .)


Using the idea of the above theorem (and its proof) we have a way of finding the inverse of a square
matrix A ∈ Mn,n (F). First we use row operations to put A into reduced row echelon form R. If
R 6= In , then A is not invertible. If R = In , then A is invertible and its inverse is the matrix E. A
convenient way of doing this is the following.

Algorithm 7.7: Finding the inverse of a square matrix (if it exists)

Let A ∈ Mn,n (F).

1. Form the matrix [A | In ] ∈ Mn,2n (F)

2. Use row operations to put the matrix into reduced row echelon form

[A | In ] ∼ [R | B]

(where R, B ∈ Mn,n(F))

3. If R ≠ In, then A is not invertible.

If R = In, then A is invertible and A⁻¹ = B

Remark. If we have row operations ρi (with corresponding elementary matrices Ei) such that
$$[A|I_n] \xrightarrow{\rho_1} [A_1 \mid B_1] \xrightarrow{\rho_2} [A_2 \mid B_2] \xrightarrow{\rho_3} \cdots \xrightarrow{\rho_k} [I_n \mid B]$$
then we have A⁻¹ = B = Ek . . . E2E1.


 
Example 7.8. Let's find the inverse of the matrix $A = \begin{pmatrix} 1 & 2 & 1 \\ -1 & -1 & 1 \\ 0 & 1 & 3 \end{pmatrix} \in M_{3,3}(\mathbb{C})$.
$$[A|I_3] = \left(\begin{array}{ccc|ccc} 1 & 2 & 1 & 1 & 0 & 0 \\ -1 & -1 & 1 & 0 & 1 & 0 \\ 0 & 1 & 3 & 0 & 0 & 1 \end{array}\right) \xrightarrow{R_2 + R_1} \left(\begin{array}{ccc|ccc} 1 & 2 & 1 & 1 & 0 & 0 \\ 0 & 1 & 2 & 1 & 1 & 0 \\ 0 & 1 & 3 & 0 & 0 & 1 \end{array}\right)$$
$$\xrightarrow{R_3 - R_2} \left(\begin{array}{ccc|ccc} 1 & 2 & 1 & 1 & 0 & 0 \\ 0 & 1 & 2 & 1 & 1 & 0 \\ 0 & 0 & 1 & -1 & -1 & 1 \end{array}\right) \quad \text{(at this point we know that the matrix A is invertible)}$$
$$\xrightarrow[R_2 - 2R_3]{R_1 - R_3} \left(\begin{array}{ccc|ccc} 1 & 2 & 0 & 2 & 1 & -1 \\ 0 & 1 & 0 & 3 & 3 & -2 \\ 0 & 0 & 1 & -1 & -1 & 1 \end{array}\right) \xrightarrow{R_1 - 2R_2} \left(\begin{array}{ccc|ccc} 1 & 0 & 0 & -4 & -5 & 3 \\ 0 & 1 & 0 & 3 & 3 & -2 \\ 0 & 0 & 1 & -1 & -1 & 1 \end{array}\right)$$

Therefore $A^{-1} = \begin{pmatrix} -4 & -5 & 3 \\ 3 & 3 & -2 \\ -1 & -1 & 1 \end{pmatrix}$. In addition, from the row operations used we have that
$$\begin{pmatrix} -4 & -5 & 3 \\ 3 & 3 & -2 \\ -1 & -1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & -2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
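Algorithm 7.7 can be sketched in code as follows (not from the notes; it reuses the rref function sketched after Example 7.4). The test at the end checks whether the left block of the reduced matrix is In.

    from fractions import Fraction

    def inverse(A):
        """Return A^{-1} via [A | I] ~ [I | A^{-1}], or None if A is singular."""
        n = len(A)
        aug = [[Fraction(x) for x in row] +
               [Fraction(int(i == j)) for j in range(n)]
               for i, row in enumerate(A)]
        R = rref(aug)                       # [A | I] ~ [R | B]
        if any(R[i][i] != 1 for i in range(n)):
            return None                     # R != I_n, so A is not invertible
        return [row[n:] for row in R]

    print(inverse([[1, 2, 1], [-1, -1, 1], [0, 1, 3]]))
    # [[-4, -5, 3], [3, 3, -2], [-1, -1, 1]] (as Fractions)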


7.3 Exercises

59. For each of the following linear systems, use row reduction to decide whether the system has

(i) no solution, (ii) a unique solution, (iii) more than one solution.

Solve the systems where possible.

(a) 3x − 2y + 4z = 3 (c) 3x − 4y + z = 2
x− y+ z = 7 −5x + 6y + 10z = 7
4x − 3y + 5z = 1 8x − 10y − 9z = −5
(b) x + 2y − z = −1 (d) 2x − 3y + 5z = 10
2x + 7y − z = 3 4x + 7y − 2z = −5
−3x − 12y + z = 0 2x − 4y + 25z = 31

60. Using row reduction, find the set of solutions to the following system of equations:

2x1 + x2 + 3x3 + x4 = 3
x1 + x2 + x3 − x4 = 6
x1 − x2 + 3x3 + 5x4 = −12
4x1 + x2 + 7x3 + 5x4 = −3

61. Determine the values of k ∈ C for which the system of equations has

(i) no solution,
(ii) a unique solution,
(iii) more than one solution.

(a) kx + y + z = 1 (c) x + 2y + kz = 1
x + ky + z = 1 2x + ky + 8z = 3
x + y + kz = 1
(b) 2x + (k − 4)y + (3 − k)z = 1 (d) x − 3z = −3
2y + (k − 3)z = 2 2x + ky − z = −2
x− 2y + z = 1 x + 2y + kz = 1

Find the solutions when they exist.

62. Determine the conditions on a, b, c ∈ C so that the system has a solution:

(a) x + 2y − 3z = a (b) x − 2y + 4z = a
3x − y + 2z = b 2x + 3y − z = b
x − 5y + 8z = c 3x + y + 2z = c

Find the solutions when they exist.

63. The equation of an arbitrary circle in the x-y plane can be written in the form

x2 + y 2 + ax + by + c = 0

where a, b, c are real constants. Find the equation of the unique circle that passes through the
three points (−2, 7), (−4, 5), (4, −3).


64. A traveller who just returned from Europe spent the following amounts.
For housing: $30/day in England, $20/day in France, $20/day in Spain
For food: $20/day in England, $30/day in France, $20/day in Spain
For incidental expenses: $10/day in each country.
The traveller’s records of the trip indicate a total of $340 spent for housing, $320 for food, $140
for incidental expenses while travelling in these countries. Calculate the number of days spent
in each country or show that the records must be incorrect.

65. Frank’s, Dave’s and Phil’s ages are not known, but are related as follows. The sum of Dave’s
and Phil’s ages is 13 more than Frank’s. Frank’s age plus Phil’s age is 19 more than Dave’s age.
If the sum of their ages is 71, how old are Frank, Dave and Phil?

66. Consider a triangle with vertices x, y and z as shown. Show that there exist unique points cx, cy, cz (on the sides indicated) with the property that:
d(x, cy) = d(x, cz) and d(y, cx) = d(y, cz) and d(z, cx) = d(z, cy).
[Figure: a triangle with vertices x, y, z; the point cx lies on the side yz, cy on the side xz, and cz on the side xy.]

67. Consider the traffic flow diagram on the right. The number of cars x1, x2, x3 and x4 entering an intersection must equal the number of cars leaving the intersection.
Determine the smallest nonnegative values of x1, x2, x3 and x4.
[Figure: traffic flow diagram (not reproduced here).]

68. Following the algorithm described in lectures, reduce the following matrices to reduced row
echelon form. Keep a record of the elementary row operations you use.
   
(a) $\begin{pmatrix} 4 & -8 & 16 \\ 1 & -3 & 6 \\ 2 & 1 & 1 \end{pmatrix}$  (b) $\begin{pmatrix} 2+i & 2+i & 5 & 6+i \\ 1-2i & 1-2i & -2+i & 2-i \end{pmatrix}$  (c) $\begin{pmatrix} 1 & 2 \\ -1 & 1 \\ -1 & 2 \\ 0 & 2 \end{pmatrix}$  (d) $\begin{pmatrix} 0 & 2 & 1 & 4 \\ 0 & 0 & 2 & 6 \\ 1 & 0 & -3 & 2 \end{pmatrix}$  (e) $\begin{pmatrix} 1 & 2 & 0 & 1 \\ 2 & 4 & 1 & 1 \\ 3 & 6 & 1 & 1 \end{pmatrix}$  (f) $\begin{pmatrix} 0 & 0 & 2 & 7 \\ 1 & -1 & 1 & 1 \\ 2 & 1 & -4 & 5 \\ -2 & 2 & -5 & 4 \end{pmatrix}$

69. For each of the following matrices, decide whether or not the matrix is invertible and, if it is,
find the inverse.
 
 

1 2 −1
 
1 −1 0
 1 0 1 0
2 0 0 1 0 1
(a) (b) 0 1 2  (c) −1 1 1 (d)  
−3 1 0 0 1 1
0 0 1 0 −1 1
1 1 0 0
 
70. Consider the matrix $A = \begin{pmatrix} 1 & 0 \\ -5 & 2 \end{pmatrix}$.

(a) Write A−1 as a product of two elementary matrices.


(b) Write A as a product of two elementary matrices.

71. (a) Find the inverse of the matrix
$$A = \begin{pmatrix} 2 & 1 & 1 & 1 \\ 2 & 3 & 2 & 2 \\ 4 & 2 & 4 & 3 \\ 6 & 3 & 3 & 5 \end{pmatrix}$$
(b) Check your answer by matrix multiplication.
(c) Write A as a product of elementary matrices.
(d) Use your answer to part (a) to solve the system

2x + y + z + w = 3
2x + 3y + 2z + 2w = 5
4x + 2y + 4z + 3w = 6
6x + 3y + 3z + 5w = 9


Extra material for lecture 7

 reduced row echelon form

Elementary Linear Algebra by Anton and Rorres, §1.2, §1.5

 calculating matrix inverses

Elementary Linear Algebra by Anton and Rorres, §1.5



LECTURE 8

Rank and determinant of a matrix

8.1 Uniqueness of the reduced row echelon form

Proposition 8.1

The reduced row echelon form of a matrix is unique.

The proof is presented (for completeness) in an appendix at the end of this lecture.
Note that this is the same as saying that if R and S are two matrices each in RREF and if R ∼ S, then
we must have R = S.

8.2 Rank of a matrix

Knowing that the reduced row echelon form of A is unique enables us to make the following defini-
tion.

Definition 8.2

The rank of a matrix is the number of non-zero rows in its reduced row echelon form. We denote
the rank of a matrix A by rank(A).

Remark. 1. Although the row echelon form is not unique, it has the same number of non-zero rows
as its reduced row echelon form.

2. If two matrices are row equivalent, then they have the same rank.

3. If A has m rows, then rank(A) ≤ m.

4. If a square matrix R ∈ Mn,n (F) has rank n and is in reduced row echelon form, then R = In .

Proposition 8.3

Let A ∈ Mn,n (F). Then A is invertible if and only if rank(A) = n.

Proof. Assume first that A is invertible. Then A ∼ In by Theorem 7.5. Therefore rank(A) = rank(In ) =
n.
For the converse, assume now that rank(A) = n. Let R be the reduced row echelon form of A. Then
rank(R) = rank(A) = n. Therefore R is an n × n matrix, it is in reduced row echelon form and has
no zero rows (since it has rank n). It must be the case that R = In because In is the only n × n matrix
that is in RREF and has rank n. Therefore A is invertible by Theorem 7.5.

8.3 Determinant
 
Recall that for a 2 × 2 matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ the quantity ad − bc can be used to determine whether or not
the matrix is invertible. It is also equal to the signed area of the parallelogram defined by the vectors
(a, b), (c, d) ∈ R2. The determinant is a generalisation of this quantity to n × n matrices.
The determinant is a number (i.e., element of F) associated to a square matrix (in Mn,n (F)). It gives
a lot of information about the matrix. For example, a square matrix is invertible precisely when its
determinant is non-zero.
Rather than beginning with an explicit formula for the determinant, we list its important properties.
The first three of which are, in fact, enough to determine the value of the determinant.

Properties of determinants

Given a matrix A ∈ Mn,n (F), the determinant of A is a number det(A) ∈ F that satisfies:

1. det(In ) = 1

2. If B is obtained from A by swapping two rows, then det(B) = − det(A).

3. The determinant depends linearly on the first row, that is:


(a) $\det \begin{pmatrix} kA_{11} & kA_{12} & \cdots & kA_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & & & \vdots \\ A_{n1} & A_{n2} & \cdots & A_{nn} \end{pmatrix} = k \det \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & & & \vdots \\ A_{n1} & A_{n2} & \cdots & A_{nn} \end{pmatrix}$

(b) $\det \begin{pmatrix} A_{11}+A'_{11} & \cdots & A_{1n}+A'_{1n} \\ A_{21} & \cdots & A_{2n} \\ \vdots & & \vdots \\ A_{n1} & \cdots & A_{nn} \end{pmatrix} = \det \begin{pmatrix} A_{11} & \cdots & A_{1n} \\ A_{21} & \cdots & A_{2n} \\ \vdots & & \vdots \\ A_{n1} & \cdots & A_{nn} \end{pmatrix} + \det \begin{pmatrix} A'_{11} & \cdots & A'_{1n} \\ A_{21} & \cdots & A_{2n} \\ \vdots & & \vdots \\ A_{n1} & \cdots & A_{nn} \end{pmatrix}$

Note. The above properties tell us exactly what the effect of a row operation is on the determinant.

Example 8.4.
$$\det\begin{pmatrix} 2 & 0 & 0 \\ 0 & 0 & 3 \\ 0 & 1 & 0 \end{pmatrix} = 2\det\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 3 \\ 0 & 1 & 0 \end{pmatrix} = -2\det\begin{pmatrix} 0 & 0 & 3 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} = -6\det\begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$$
$$= 6\det\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix} = -6\det\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = -6$$

The following further properties can all be derived from the first three* .

Properties of determinants

4. If two rows are equal, then the determinant is zero

5. If B is obtained from A by applying a row operation of the third kind (i.e., replacing a row
by itself plus a multiple of another row), then det(B) = det(A)

6. If A has a row of zeros, then det(A) = 0

* In fact, for some fields (those of characteristic 2) property 4 does not follow from the first 3.


 
7. $\det\begin{pmatrix} d_1 & * & \cdots & * \\ 0 & d_2 & \cdots & * \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & d_n \end{pmatrix} = d_1 d_2 \cdots d_n$

8. det(A) = 0 if and only if A is singular

9. det(AB) = det(A) det(B)

10. det(AT ) = det(A)

Note. Another common notation for the determinant of a matrix A which we will sometimes use is
to write |A| in place of det(A)

Example 8.5. We can use row operations to calculate the determinant of a matrix by putting it into
row echelon form.
$$\begin{vmatrix} 1 & 2 & -2 & 0 \\ 2 & 3 & -4 & 1 \\ -1 & -2 & 0 & 2 \\ 0 & 2 & 5 & 3 \end{vmatrix} = \begin{vmatrix} 1 & 2 & -2 & 0 \\ 0 & -1 & 0 & 1 \\ 0 & 0 & -2 & 2 \\ 0 & 2 & 5 & 3 \end{vmatrix} \quad \text{(property 5: } R_2 - 2R_1,\ R_3 + R_1)$$
$$= \begin{vmatrix} 1 & 2 & -2 & 0 \\ 0 & -1 & 0 & 1 \\ 0 & 0 & -2 & 2 \\ 0 & 0 & 5 & 5 \end{vmatrix} \quad \text{(property 5: } R_4 + 2R_2)$$
$$= \begin{vmatrix} 1 & 2 & -2 & 0 \\ 0 & -1 & 0 & 1 \\ 0 & 0 & -2 & 2 \\ 0 & 0 & 0 & 10 \end{vmatrix} \quad \text{(property 5: } R_4 + \tfrac{5}{2} R_3)$$
$$= 1 \times (-1) \times (-2) \times 10 = 20 \quad \text{(property 7)}$$
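The computation in Example 8.5 is exactly how determinants are computed in practice. Here is a Python sketch (not from the notes): only row operations of types 1 and 3 are used, so by properties 2, 5 and 7 the determinant is the product of the diagonal entries of the resulting triangular matrix, times a sign that flips with each row swap.

    from fractions import Fraction

    def det(A):
        rows = [[Fraction(x) for x in r] for r in A]
        n = len(rows)
        sign = 1
        for col in range(n):
            piv = next((r for r in range(col, n) if rows[r][col] != 0), None)
            if piv is None:
                return Fraction(0)        # singular: a column with no pivot
            if piv != col:
                rows[col], rows[piv] = rows[piv], rows[col]
                sign = -sign              # property 2: a swap changes the sign
            for r in range(col + 1, n):   # property 5: type-3 ops leave det alone
                f = rows[r][col] / rows[col][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[col])]
        d = Fraction(sign)
        for i in range(n):
            d *= rows[i][i]               # property 7: product of the diagonal
        return d

    print(det([[1, 2, -2, 0], [2, 3, -4, 1], [-1, -2, 0, 2], [0, 2, 5, 3]]))  # 20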

8.4 Exercises

72. Find the rank of the matrices given in Exercise 50.


73. Find the rank of the matrices given in Exercise 68.

74. (a) Determine the rank of the following matrices: (i) $\begin{pmatrix} 1 & 2 & -1 \\ 3 & -6 & 2 \end{pmatrix}$  (ii) $\begin{pmatrix} 1 & 0 & 1 \\ -2 & 1 & 1 \\ 1 & 1 & 2 \end{pmatrix}$

(b) Determine the rank of the matrix $\begin{pmatrix} 1 & 1 & k \\ 1 & k & 1 \\ k & 1 & 1 \end{pmatrix}$ when: (i) k = 1, (ii) k = −2, (iii) k ∉ {1, −2}.
75. Use row operations to calculate the determinants of the following matrices:

(a) $\begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 7 \\ 1 & 4 & 13 \end{pmatrix}$  (b) $\begin{pmatrix} 2 & 1 & 1 \\ 3 & 0 & -1 \\ 4 & 5 & 2 \end{pmatrix}$  (c) $\begin{pmatrix} \frac13 & \frac35 & \frac25 \\ \frac38 & \frac12 & \frac14 \\ \frac13 & \frac23 & \frac12 \end{pmatrix}$

76. Let a, b, c, d, e, f, g, h, i ∈ F be such that
$$\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = 1$$
Find the following determinants:


(a) $\begin{vmatrix} a & b & c \\ g & h & i \\ d & e & f \end{vmatrix}$  (b) $\begin{vmatrix} a & -b & c \\ d & -e & f \\ g & -h & i \end{vmatrix}$  (c) $\begin{vmatrix} d & e & f \\ 3g & 3h & 3i \\ a & b & c \end{vmatrix}$  (d) $\begin{vmatrix} 2a & 2b & 2c \\ 2d & 2e & 2f \\ 2g & 2h & 2i \end{vmatrix}$  (e) $\begin{vmatrix} a & b & c \\ d+a & e+b & f+c \\ g-2a & h-2b & i-2c \end{vmatrix}$

77. A matrix P is called idempotent if P² = P.

(a) Show that if P is idempotent and det(P) ≠ 0, then P = I.


(b) Give an example of an idempotent matrix that is neither a zero matrix nor an identity
matrix.


Extra material for lecture 8

 Proof of uniqueness of RREF

Proof. We need to show that for any A ∈ Mm,n(F) if A ∼ R1 and A ∼ R2, and both R1 and R2
are in reduced row echelon form, then R1 = R2.
The idea is to show that if R1 ≠ R2 (and they are both in RREF) then there exists X ∈ Mn,1(F)
such that R1X ≠ 0 and R2X = 0. This would establish that R1 ≠ R2 =⇒ R1 ≁ R2 (as required).
Suppose, for a contradiction, that R1 ≠ R2. Let j ∈ {1, . . . , n} be minimal such that the j-th column
of R1 is not equal to the j-th column of R2. Form a new matrix S1 from R1 as follows. Drop all
columns to the right of the j-th column and drop all the columns on its left that do not contain
a leading entry. Define S2 as the matrix similarly obtained from R2. For example
$$R_1 = \begin{pmatrix} 1 & 2 & 0 & 2 & 5 \\ 0 & 0 & 1 & 3 & 6 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} \quad R_2 = \begin{pmatrix} 1 & 2 & 0 & 3 & 6 \\ 0 & 0 & 1 & 4 & 7 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} \quad S_1 = \begin{pmatrix} 1 & 0 & 2 \\ 0 & 1 & 3 \\ 0 & 0 & 0 \end{pmatrix} \quad S_2 = \begin{pmatrix} 1 & 0 & 3 \\ 0 & 1 & 4 \\ 0 & 0 & 0 \end{pmatrix}$$
Note that S1 and S2 will be in reduced row echelon form and S1 ∼ S2. Suppose first that the
j-th column of R1 (which is the last column of S1) contains a leading entry. Then we have
$$S_1 = \begin{pmatrix} I_k & \begin{matrix} 0 \\ \vdots \\ 0 \end{matrix} \\ 0 & 1 \\ \vdots & \vdots \\ 0 & 0 \end{pmatrix} \qquad S_2 = \begin{pmatrix} I_k & \begin{matrix} b_1 \\ \vdots \\ b_k \end{matrix} \\ 0 & 0 \\ \vdots & \vdots \\ 0 & 0 \end{pmatrix}$$
But then $S_1 \begin{pmatrix} b_1 & \cdots & b_k & -1 \end{pmatrix}^T \neq 0$ whereas $S_2 \begin{pmatrix} b_1 & \cdots & b_k & -1 \end{pmatrix}^T = 0$, which is not possible
since S1 ∼ S2. The same argument applies in the case in which the j-th column of R2 contains
a leading entry. If neither the j-th column of R1 nor the j-th column of R2 contains a leading
entry we have
$$S_1 = \begin{pmatrix} I_k & \begin{matrix} a_1 \\ \vdots \\ a_k \end{matrix} \\ 0 & 0 \\ \vdots & \vdots \end{pmatrix} \qquad S_2 = \begin{pmatrix} I_k & \begin{matrix} b_1 \\ \vdots \\ b_k \end{matrix} \\ 0 & 0 \\ \vdots & \vdots \end{pmatrix}$$
Considered as the augmented matrix of a linear system, each has a unique solution. The solutions must
be equal since S1 ∼ S2. Therefore ai = bi for all i. But then S1 = S2, which contradicts the assumption
that their final columns are different.


LECTURE 9

Determinants (continued)

9.1 Derivation of properties

We claim that each of the properties given for the determinant can be derived from the first 3.

Exercise 78. Show that properties 4, 5, and 6 follow from the first three properties. (For property 4
you will need to assume that the field is such that 1 + 1 ≠ 0.)

We will show that properties 7,8 9, and 10 follow from the first six.

Lemma 9.1

Let A, B ∈ Mn,n (F). Suppose B is obtained from A by a sequence of elementary row operations.
Then there is a k ∈ F \ {0} such that det(B) = k det(A). Moreover, if D is obtained from C using
the same sequence of row operations, then det(D) = k det(C).

Proof. (in lecture)

Lemma 9.2: Property 7

If A ∈ Mn,n (F) is in triangular form, then det(A) = A11 A22 . . . Ann .

Proof. Suppose that A is in upper triangular form. That is, we have
$$A = \begin{pmatrix} d_1 & * & * \\ 0 & \ddots & * \\ 0 & 0 & d_n \end{pmatrix} \in M_{n,n}(\mathbb{F})$$
If the diagonal entries $d_i$ are all non-zero, then we have
$$\det A = \begin{vmatrix} d_1 & * & * \\ 0 & \ddots & * \\ 0 & 0 & d_n \end{vmatrix} = d_1 d_2 \cdots d_n \begin{vmatrix} 1 & * & * \\ 0 & \ddots & * \\ 0 & 0 & 1 \end{vmatrix} \quad \text{(by properties 2, 3a)}$$
$$= d_1 d_2 \cdots d_n \begin{vmatrix} 1 & 0 & 0 \\ 0 & \ddots & 0 \\ 0 & 0 & 1 \end{vmatrix} = d_1 d_2 \cdots d_n \quad \text{(by properties 5, 1)}$$
If one (or more) of the $d_i$ is equal to 0, then the REF of A will have a row of zeros and therefore
det(A) = 0.

Lemma 9.3: Property 8

det(A) = 0 if and only if A is singular

Proof. We first show that det(A) = 0 =⇒ A is singular (by establishing the contrapositive).

A invertible =⇒ A ∼ I (Theorem 7.5)
=⇒ det(A) = k det(I) for some k ≠ 0 (Lemma 9.1)
=⇒ det(A) ≠ 0

For the converse, suppose that A is singular. Let R be the matrix in RREF such that R ∼ A. Because
A is singular, R 6= I (Theorem 7.5 ) and R therefore has a row of zeros. Since R has a row of zeros,
det(R) = 0 and therefore det(A) = 0 (Lemma 9.1).

Lemma 9.4: Property 9

det(AB) = det(A) det(B)

Proof. Let R be the RREF of A. Then det(R) = k det(A) for some k ≠ 0. We also have det(RB) =
k det(AB) with the same k.
Assume first that A is invertible. Then R = I and we have

det(A) det(B) = k det(A) det(AB) (since det(B) = k det(AB))


= det(AB) (since k det(A) = det(R) = 1)

On the other hand, if A is singular, then R has a row of zeros. Therefore RB has a row of zeros, and
so det(RB) = 0. We have
1
det(AB) = det(RB) = 0
k
and
det(A) det(B) = 0 det(B) = 0

Lemma 9.5: Property 10

det(AT ) = det(A)

Proof. If A is singular, then AT is also singular (Exercise 38) and therefore det(AT ) = 0 = det(A).
We can assume therefore that A is invertible. It is enough to show that det(E T ) = det(E) for all
elementary matrices E since any invertible matrix can be written as a product of elementary matrices
(Corollary 7.6) and we then have

A = E1 E2 . . . Ek =⇒ det(A) = det(E1 ) det(E2 ) . . . det(Ek )

and

AT = EkT . . . E2T E1T =⇒ det(AT ) = det(EkT ) . . . det(E2T ) det(E1T )


= det(Ek ) . . . det(E2 ) det(E1 )
= det(E1 ) det(E2 ) . . . det(Ek )


It remains to check that det(E T ) = det(E) for elementary matrices. Note that if E is an elementary
matrix corresponding to a row swap or to multiplying a row by a constant, then E = E T . If E
corresponds to the third kind of row operation, then both E and E T are triangular and have all
diagonal entries equal to one.

9.2 Cofactor expansion

The following gives a version of a (recursive) formula for det(A) that can often be useful.

Lemma 9.6: Cofactor expansion along the i-th row

Let n ≥ 2, A ∈ Mn,n(F) and let i ∈ {1, . . . , n}. Then
$$\det(A) = \sum_{j=1}^{n} (-1)^{i+j} A_{ij} \det(A(i,j))$$
where A(i, j) denotes the (n − 1) × (n − 1) matrix obtained from A by deleting the i-th row and
the j-th column.

Remark. This is often referred to as ‘cofactor expansion along the i-th row’.

Example 9.7. Expanding along the third row,
$$\begin{vmatrix} 1 & 2 & -2 \\ 2 & 3 & -4 \\ -1 & -2 & 0 \end{vmatrix} = (-1)^4 \times (-1) \times \begin{vmatrix} 2 & -2 \\ 3 & -4 \end{vmatrix} + (-1)^5 \times (-2) \times \begin{vmatrix} 1 & -2 \\ 2 & -4 \end{vmatrix} + (-1)^6 \times 0 \times \begin{vmatrix} 1 & 2 \\ 2 & 3 \end{vmatrix}$$
$$= 1 \times (-1) \times (-2) + (-1) \times (-2) \times 0 + 1 \times 0 \times (-1) = 2$$

Remark. The value of $(-1)^{i+j}$ is +1 for i = j = 1 (i.e., top left of the matrix) and then alternates
between −1 and +1 as we move one entry vertically or horizontally.
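Lemma 9.6 gives a recursive procedure, and a direct Python sketch (not from the notes) follows, expanding along the first row. Note that this does roughly n! work, so the row-reduction method of Example 8.5 is far more efficient for large matrices.

    def det_cofactor(A):
        """Determinant by cofactor expansion along the first row."""
        n = len(A)
        if n == 1:
            return A[0][0]
        total = 0
        for j in range(n):
            minor = [row[:j] + row[j+1:] for row in A[1:]]  # delete row 1, column j
            total += (-1) ** j * A[0][j] * det_cofactor(minor)
        return total

    print(det_cofactor([[1, 2, -2], [2, 3, -4], [-1, -2, 0]]))   # 2, as in Example 9.7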

9.3 Exercises

79. Use cofactor expansion to calculate the determinant of the matrix in Example 8.5.

80. Evaluate the determinant of the following matrices using cofactor expansion (Lemma 9.6):

(a) $\begin{pmatrix} 2 & 1 \\ 3 & -1 \end{pmatrix}$  (b) $\begin{pmatrix} 2 & 1 & 1 \\ 3 & 0 & -1 \\ 4 & 5 & 2 \end{pmatrix}$  (c) $\begin{pmatrix} 2 & 4 & 2 \\ 1 & 5 & 1 \\ 3 & -7 & 3 \end{pmatrix}$  (d) $\begin{pmatrix} 2 & 3 & 4 & 5 \\ 0 & 3 & 4 & 5 \\ 0 & 0 & 4 & 5 \\ 0 & 0 & 0 & 5 \end{pmatrix}$  (e) $\begin{pmatrix} a & ab \\ b & a^2+b^2 \end{pmatrix}$ (where a, b ∈ C)

81. Evaluate the determinants of the following matrices. For what values of the variables (x, λ, k ∈
C) are the matrices invertible?


 
x 2x −3x
(a)  x x − 1 −3 
0 0 2x − 1
 
λ−1 0 0 0
 2 0 λ+1 0 
(b)  
 1 λ−2 0 0 
2 3 9 λ+2
 
k k+1 k+2
(c)  k + 3 k + 4 k + 5 
k+6 k+7 k+8

82. (a) Determine the values of the parameter λ for which det(A − λI) = 0 when
   
i) $A = \begin{pmatrix} 4 & 2 \\ -3 & -1 \end{pmatrix}$   ii) $A = \begin{pmatrix} 2 & 2 & 1 \\ 2 & 5 & 2 \\ 1 & 2 & 2 \end{pmatrix}$

(b) Find all solutions X of the linear system

(A − λI)X = 0

for each A and each λ you found in part (a).


Extra material for lecture 9

 Think about why the properties listed uniquely determine the value of det(A).

 Try to prove that the cofactor expansion satisfies properties 1–3.

 An alternative way of defining (or calculating) the determinant of a matrix A ∈ Mn,n uses
permutations of the set {1, . . . , n}.
$$\det(A) = \sum_{\sigma \in S_n} \operatorname{sign}(\sigma) A_{1,\sigma(1)} A_{2,\sigma(2)} \cdots A_{n,\sigma(n)}$$

Here Sn is the set of all bijections σ : {1, . . . , n} → {1, . . . , n}.


See, for example, Axler (Chapter 10).



LECTURE 10

Fields

Recall that elements of R3 can be added together and multiplied by scalars. For example

(1, 2, 3) + (−2, 3, −7) = (−1, 5, −4)


−7(−2, 3, −7) = (14, −21, 49)

Thinking about the properties of these operations leads to the definition of a vector space, the central
topic of this subject. A vector space is a generalisation of the vector structure of R3 in which we have
a set of ‘vectors’ together with a version of vector addition and scalar multiplication.
The first thing we will look at is what we can use as ‘scalars’.

10.1 Fields
Examples of fields that we already know are Q, R, and C. They each have two operations (addition
and multiplication) and the two operations satisfy certain familiar properties. For example, every
non-zero element has a multiplicative inverse. Before listing the defining properties we recall the
meaning of ‘associative’ and ‘commutative’ for binary operations.

Definition 10.1

A binary operation on a set S is a function S × S → S. We often use 'infix' notation for binary
operations. For example, we write a + b rather than +(a, b) for the binary operation of addition.
A binary operation ∗ : S × S → S is called:
1. associative if ∀a, b, c ∈ S, (a ∗ b) ∗ c = a ∗ (b ∗ c)

2. commutative if ∀a, b ∈ S, a ∗ b = b ∗ a

Example 10.2. The binary operations of addition and multiplication on R are associative and commutative.
Subtraction gives a binary operation on R that is not associative since, for example, (1 − 2) − 3 ≠
1 − (2 − 3). It is also not commutative since, for example, 1 − 2 ≠ 2 − 1. Matrix multiplication gives
a binary operation on M2,2(R) that is associative but not commutative.

Definition 10.3

A field is a set F together with two binary operations, + and × on F satisfying the following
properties:

A1) ∀ a, b ∈ F, a + b = b + a (addition is commutative)

A2) ∀ a, b, c ∈ F, a + (b + c) = (a + b) + c (addition is associative)

A3) ∃ 0 ∈ F ∀ a ∈ F, a+0=a (additive identity)

A4) ∀ a ∈ F ∃ (−a) ∈ F, a + (−a) = 0 (additive inverses)

M1) ∀ a, b ∈ F, a × b = b × a (multiplication is commutative)

M2) ∀ a, b, c ∈ F, a × (b × c) = (a × b) × c (multiplication is associative)



M3) ∃ 1 ∈ F \ {0} ∀ a ∈ F, a × 1 = a (multiplicative identity)

M4) ∀ a ∈ F \ {0} ∃ (a−1 ) ∈ F, a × a−1 = 1 (multiplicative inverses)

D) ∀ a, b, c ∈ F, a × (b + c) = (a × b) + (a × c) (distributivity)

Example 10.4. 1. Q, R, and C (with the usual operations) are fields


2. Z (with the usual operations) is not a field. It fails to satisfy axiom M4.
3. Q[√2] = {a + b√2 | a, b ∈ Q} ⊂ R is a field (using the usual operations on R)
4. M2,2 (R) with the usual matrix operations is not a field. It fails to satisfy axioms M1 and M4.

Exercise 83. Let F be a field. Show that:


a) ∀x ∈ F, 0 × x = 0 b) ∀x ∈ F, (−1) × x = −x
(Justify every step of your proofs using the axioms in the definition of a field.)

Remark. As is common when writing multiplication in R or C, we will often write ab in place of a × b.

Example 10.5. Here are two fields that each have only a finite number of elements.
The field F2 is a field having two elements F2 = {[0], [1]} with operations given in the tables below.

    (F2, +)              (F2, ×)
     +  | [0] [1]         ×  | [0] [1]
    ----+---------       ----+---------
    [0] | [0] [1]        [0] | [0] [0]
    [1] | [1] [0]        [1] | [0] [1]

The field F3 is defined by F3 = {[0], [1], [2]} with operations given below.

    (F3, +)                   (F3, ×)
     +  | [0] [1] [2]          ×  | [0] [1] [2]
    ----+-------------        ----+-------------
    [0] | [0] [1] [2]         [0] | [0] [0] [0]
    [1] | [1] [2] [0]         [1] | [0] [1] [2]
    [2] | [2] [0] [1]         [2] | [0] [2] [1]

Given any prime p ∈ N there is a field having p elements. It can be constructed as follows. We label
the elements as [0], [1], . . . , [p − 1], that is, Fp = {[0], [1], . . . , [p − 1]}. The operations are defined by

[a] + [b] = [a + b]
[a] × [b] = [a × b]

where [a + b] is defined to be the element [k] ∈ Fp given by the condition that p divides (a + b) − k. In
other words, we add as usual in Z, but then add a multiple of p until the result lies in {0, 1, . . . , p − 1}.
Similarly, the element [a × b] is defined to be the element [k] ∈ Fp given by the condition that p divides
(a × b) − k.

Remark. It’s common to write the elements of Fp simply as {0, 1, 2, . . . , p − 1} rather than
{[0], [1], [2], . . . , [p − 1]}. Another notation is {0, 1, 2, . . . , (p − 1)}. Be careful, Fp is not a ‘subfield’ of
R.
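Computationally, the operations in Fp are just integer operations followed by reduction mod p. A small Python sketch (not from the notes; the function names are our own choices):

    p = 5

    def add(a, b):
        return (a + b) % p

    def mul(a, b):
        return (a * b) % p

    def inv(a):
        # Multiplicative inverse; the search always succeeds since p is prime.
        return next(x for x in range(1, p) if mul(a, x) == 1)

    print(add(3, 4), mul(3, 4), inv(3))   # 2 2 2  (in F_5: 3+4=2, 3*4=2, 3^{-1}=2)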


10.2 Exercises

84. Write down the addition and multiplication tables for F5 and F7 .

85. Let Z4 = {[0], [1], [2], [3]} be equipped with the addition and multiplication given in the tables
below. Show that Z4 is not a field.

    (Z4, +)                     (Z4, ×)
     +  | [0] [1] [2] [3]        ×  | [0] [1] [2] [3]
    ----+-----------------      ----+-----------------
    [0] | [0] [1] [2] [3]       [0] | [0] [0] [0] [0]
    [1] | [1] [2] [3] [0]       [1] | [0] [1] [2] [3]
    [2] | [2] [3] [0] [1]       [2] | [0] [2] [0] [2]
    [3] | [3] [0] [1] [2]       [3] | [0] [3] [2] [1]

86. Show that
$$\left\{ \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}, \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix} \right\} \subset M_{2,2}(\mathbb{F}_2)$$
equipped with the usual matrix operations is a field. That is, check that all the axioms are
satisfied. (It's important to realise that the entries in the matrices are from F2. For example,
$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$.)
 
87. Let A ∈ M3,3(F3) be given by $A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 2 & 0 \\ 2 & 0 & 1 \end{pmatrix}$. (Note that the entries are elements of F3.)

(a) Find the reduced row echelon form of A.


(b) Find the determinant of A.
(c) Find the inverse of A, if it exists.

88. Let F be a field. Show that ∀a, b ∈ F, ab = 0 =⇒ (a = 0 ∨ b = 0)

89. Let F be a field and a ∈ F \ {0}. Show that the function L : F → F given by L(x) = ax is a
bijection.

90. Let F = {0, 1, a, b} be a field having four elements. Show that

(a) ab = 1
(b) a2 = b
(c) a3 = 1
(d) 1 + 1 = 0 (This is harder and just for fun. Feel free to skip it.)


Extra material for lecture 10

 Related algebraic structures


A set equipped with a single operation that satisfies A2, A3, and A4 is called a group. If it also
satisfies A1, it is called an abelian group. You will see groups in the subject MAST20022 Group
Theory and Linear Algebra.
If we drop axiom M4 (but keep all other parts from the definition of a field) we get what is
called a (commutative) ring. Examples of rings that are not fields are: Z and R[X] (where
each is equipped with the usual addition and multiplication). Rings are covered in the subject
MAST30005 Algebra.

 More reading about fields and related structures:

A first course in abstract algebra by John Fraleigh


Algebra by Michael Artin
Concrete abstract algebra : from numbers to Gröbner bases by Niels Lauritzen

LECTURE 11

Vector spaces

11.1 Vector spaces

Consideration of the fundamental properties of ‘vectors’ in R2 leads us to the following definition


of a ‘vector space’. It consists of a field (often called the scalars) and another set whose elements
will be called vectors. In addition there are two binary operations: ‘scalar multiplication’ and ‘vector
addition’.*

Definition 11.1: Vector space

Let F be a field. A vector space over F (also called an F-vector space, or just a vector space)
consists of a non-empty set V together with two binary operations:

vector addition: V × V → V (the image of (u, v) will be denoted u + v)


scalar multiplication: F × V → V (with the image of (α, v) being denoted αv).

These are required to satisfy the following axioms:

1) ∀ u, v ∈ V, u + v = v + u
2) ∀ u, v, w ∈ V, u + (v + w) = (u + v) + w
3) ∃ $\vec{0}$ ∈ V ∀ v ∈ V, v + $\vec{0}$ = v
4) ∀ v ∈ V ∃ (−v) ∈ V, v + (−v) = $\vec{0}$
5) ∀ α ∈ F ∀ u, v ∈ V, α(u + v) = αu + αv
6) ∀ α, β ∈ F ∀ v ∈ V, (α + β)v = αv + βv
7) ∀ α, β ∈ F ∀ v ∈ V, (αβ)v = α(βv)
8) ∀ v ∈ V, 1v = v

Remark. 1. It’s important to note that the scalars form part of the vector space structure.
2. The elements of V are called vectors.
3. The symbol $\vec{0}$ is being used for the zero vector to distinguish it from the zero scalar 0 ∈ F. The
vector $\vec{0}$ is uniquely determined by the property in axiom 3.
4. The additive inverse −v of a vector v is uniquely determined by the property in axiom 4.
5. A vector space over R is often called a real vector space.
A vector space over C is often called a complex vector space.

Exercise 91. Let V be a vector space over a field F. Show that


*
(a) ∀v ∈ V , 0v = 0 (b) ∀v ∈ V , (−1)v = −v
(Each step in your argument should be justified by reference to one of the axioms in the definition.)

Example 11.2. Here are some important examples of vector spaces. Everything we will say about
vector spaces applies to each of these.

1. F = R and V = R2 = {(x, y) | x, y ∈ R} with addition and scalar multiplication defined by:

(x, y) + (a, b) = (x + a, y + b) and α(x, y) = (αx, αy)


* These two operations are distinct from the two binary operations in the field of scalars.

2. F = R and V = Rn = {(x1 , . . . , xn ) | xi ∈ R}. The operations are given by

(x1 , . . . , xn ) + (y1 , . . . , yn ) = (x1 + y1 , . . . , xn + yn )


α(x1 , . . . , xn ) = (αx1 , . . . , αxn )

3. Cn is a vector space over C. (With operations as above for Rn .)

4. (F3 )2 is a vector space over F3 . The operations are the same as above for Rn . Here are some
examples of the operations.
(1, 2) + (2, 2) = (0, 1)
2(2, 1) = (1, 2)

5. (Fp )n is a vector space over Fp . (With operations as above.)

6. Vector space of matrices


Let F = R and V = Mm,n (R) (for some fixed m, n ∈ N) and take the usual operations of scalar
multiplication and matrix addition. Each elements of V is both a matrix and a vector! Similarly
Mm,n (F) can be considered a vector space over F for any field F.

7. Sequence spaces
Let F be any field and let V = {(x1 , x2 , . . . ) | xi ∈ F}.† The operations are given by

(xi )i∈N + (yi )i∈N = (xi + yi )i∈N


α(xi )i∈N = (αxi )i∈N

8. Polynomials of degree at most n


Let F be any field and fix n ∈ N. Let V = Pn (F) = {a0 + a1 x + a2 x2 + · · · + an xn | ai ∈ F}. The
operations are given by the usual operations on polynomials.

(a0 + a1 x + · · · + an xn ) + (b0 + b1 x + · · · + bn xn ) = ((a0 + b0 ) + (a1 + b1 )x + · · · + (an + bn )xn )


α(a0 + a1 x + · · · + an xn ) = (αa0 ) + (αa1 )x + · · · + (αan )xn

9. Polynomials
Let F be any field and let V = F[x] = {a0 + a1 x + a2 x2 + · · · + an xn | n ∈ N, ai ∈ F}. The operations
are given by the usual operations on polynomials.

10. Functions
Let V = F(R, R) be the set of all functions from R to R. We get a vector space over R if we
define operations by

(f + g)(x) = f (x) + g(x)


(αf )(x) = α × f (x)

More generally, given any non-empty set S and any field F we have a vector space V = F(S, F)
whose vectors are functions from S to F and with operations as given above.

11. Continuous functions


Let V = C(R, R) be the set of all continuous functions from R to R. Equipped with the opera-
tions as defined above for F(R, R), this gives a vector space.


An element of this set is the same as a function from N to F.


11.2 Direct sum of vector spaces

We describe a standard way of combining two vector spaces to produce a third. Fix a field F and let
V and W be vector spaces over F.

Definition 11.3: external direct sum

The direct sum of V and W, denoted V ⊕ W, has as its underlying set the Cartesian product of V
and W
V ⊕ W = {(v, w) | v ∈ V, w ∈ W }
with the operations defined by

(v1 , w1 ) + (v2 , w2 ) = (v1 + v2 , w1 + w2 )


k(v, w) = (kv, kw)

Exercise 92. Show that with these operations V ⊕ W is a vector space over F.

Remark. The vector space R ⊕ R is the same as R2 as defined above.

11.3 Exercises

93. Determine whether or not the given set is a real vector space when equipped with the usual
operations. If it is not a vector space, list all properties that fail to hold.

(a) The set of all 2 × 3 matrices whose second column consists of 0’s.
That is, {A ∈ M2,3 (R) | Ai2 = 0 ∀ i}.
(b) The set of all (real) polynomials with positive coefficients.
That is, {a0 + a1 x + a2 x2 + · · · + an xn | ai ∈ R, ai > 0}.
(c) The set of all (real valued) continuous functions with the property that the function is 0 at
every integer. That is, {f ∈ C(R, R) | f (x) = 0 ∀ x ∈ Z}

94. Let V be the set of positive real numbers, that is, V = {x ∈ R | x > 0}. Define the operations of
vector addition ⊕ and scalar multiplication ⊙ as follows:

x ⊕ y = xy for all x, y ∈ V (the operation on the right is multiplication in R)
α ⊙ x = x^α for all α ∈ R and x ∈ V

Show that, equipped with these operations, V forms a real vector space. (What is the zero
vector? What is the (additive) inverse of a vector x ∈ V ?)

95. Let A ∈ M2,2(R) and let V = {X ∈ M2,1(R) | AX = $\begin{pmatrix} 0 \\ 0 \end{pmatrix}$}. Show that V is closed under the usual
operations of matrix addition and scalar multiplication. That is, show that ∀u, v ∈ V, u + v ∈ V
and that ∀u ∈ V ∀α ∈ R, αu ∈ V. Check that V together with these operations forms a vector
space. (We may use standard properties of R.)


Further reading for lecture 11

 Some references for vector spaces:

Elementary Linear Algebra by Anton and Rorres, chapter 4


Linear Algebra Done Right by S. Axler, chapter 1
Finite-Dimensional Vector Spaces by P. Halmos, chapter 1

 Modules
What if we were to relax the requirement that the scalars must be a field? For example, could
we allow Z as the scalars, but keep the rest of the definition of a vector space unchanged? This
leads to what is called a module. Many of our considerations about vector spaces continue to
work for modules, but some do not. Modules are covered in the subject MAST30005 Algebra.

LECTURE 12

Subspaces

Definition 12.1: Subspace

A subspace W of a vector space V is a subset W ⊆ V that is itself a vector space using the
addition and scalar multiplication operations defined on V .

Remark. 1. The operations on W are the restrictions of those on V .

2. W being a subspace of V is denoted W ≤ V.

3. It’s important to realise that subspaces are vector spaces in their own right. Anything we can
prove about vector spaces applies to subspaces.

Examples 12.2. 1. {(t, 2t, 3t) | t ∈ R} ≤ R3

2. {(x, y, z) ∈ R3 | x + y + z = 0} ≤ R3

3. {(x, y, z) ∈ R3 | x + y + z = 1} is not a subspace of R3

4. {(x, y, z) ∈ R3 | z ≥ 0} is not a subspace of R3

To show that a subset fails to be a subspace it can sometimes be useful to note the following.

Lemma 12.3

Let V be a vector space and W ⊆ V. If W is a subspace, then $\vec{0}_V$ ∈ W.
(Where $\vec{0}_V$ denotes the zero vector in the vector space V.)

Exercise 96. Use Exercise 91 to prove the above lemma.

The following theorem allows us to avoid checking all the vector space axioms when showing that a
subset of V is a subspace.

Theorem 12.4: Subspace theorem

Let V be a vector space over F. A subset W ⊆ V is a subspace of V if and only if

0. W is non-empty

1. ∀u, v ∈ W, u+v ∈W (W is closed under vector addition)

2. ∀a ∈ F ∀u ∈ W, au ∈ W (W is closed under scalar multiplication)

Proof. Suppose that W is a subspace of V. Then it is a vector space and therefore satisfies the conditions
in Definition 11.1. Therefore it is non-empty (by axiom 3) and the operations of vector
addition and scalar multiplication give functions W × W → W and F × W → W, and therefore (1)
and (2) are satisfied.

For the converse, suppose that W ⊆ V satisfies (0), (1), and (2). We need to check that W satisfies
Definition 11.1. By (1) and (2), we have that the binary operations of vector addition V × V → V
and scalar multiplication F × V → V restrict to give functions W × W → W and F × W → W. That
axioms 1,2,5,6,7,8 of Definition 11.1 are satisfied is immediate (since they hold in V and W ⊆ V).
Since W is non-empty, there is an element w ∈ W. Then by condition (2) and Exercise 91 we have that
$\vec{0}_V = 0w \in W$. Therefore axiom 3 is satisfied (and $\vec{0}_W = \vec{0}_V$). To see that axiom 4 holds, let v ∈ W.
Appealing to Exercise 91 again we have that −v = (−1)v ∈ W.

Exercise 97. Use the Subspace Theorem to show that the first two examples above (in Example 12.2)
are indeed subspaces.

Exercise 98. Let H and K be subspaces of a vector space V . Prove that the intersection H ∩ K is a
subspace of V .
Examples 12.5. 1. $\{\vec{0}\}$ is always a subspace of V

2. V is always a subspace of V

3. The set of diagonal matrices $\left\{ \begin{pmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{pmatrix} \;\middle|\; a, b, c \in \mathbb{C} \right\}$ is a subspace of M3,3(C).

4. The subset of continuous functions C(R, R) = {f : R → R | f is continuous} is a subspace of F(R, R)*

5. $\left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \;\middle|\; ad - bc = 0 \right\}$ is not a subspace of M2,2(C)

6. {f : [0, 1] → R | f(0) = 2} is not a subspace of F([0, 1], R)

Lemma 12.6: Solution space of a homogeneous linear system

Let F be a field and A ∈ Mm,n (F). The solution space

{X ∈ Mn,1 (F) | AX = 0}

is a subspace of Mn,1 (F).

Proof. Let W = {X ∈ Mn,1 (F) | AX = 0}. By Theorem 12.4 it is enough to show that W is non-empty
and closed under vector addition and scalar multiplication.
Note first that 0 ∈ W since A0 = 0. Therefore W 6= ∅.
Let X, Y ∈ W . Then A(X + Y ) = AX + AY = 0 + 0 = 0. Therefore X + Y ∈ W .
Let X ∈ W and α ∈ F. Then A(αX) = αAX = α × 0 = 0. Therefore αX ∈ W .
Therefore, by the Subspace Theorem, W is a subspace of Mn,1 (F).
 
Example 12.7. Consider the matrix $H = \begin{pmatrix} 0 & 0 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 & 1 \end{pmatrix} \in M_{3,5}(\mathbb{F}_2)$.
The set W = {X ∈ M5,1(F2) | HX = 0} is a subspace of M5,1(F2). Solving the linear system in the
usual way, we get

W = {(0, 0, 0, 0, 0), (1, 1, 1, 0, 0), (1, 0, 0, 1, 1), (0, 1, 1, 1, 1)}


* This relies on the fact (from calculus) that the sum of two continuous functions is continuous and that a multiple of a continuous function is continuous.
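Since F2 is finite, the subspace W in Example 12.7 can also be found by brute force, testing all 2^5 vectors. A Python sketch (not from the notes):

    from itertools import product

    H = [[0, 0, 0, 1, 1],
         [0, 1, 1, 0, 0],
         [1, 0, 1, 0, 1]]

    # Keep the vectors x in (F_2)^5 with Hx = 0, computing mod 2.
    W = [x for x in product([0, 1], repeat=5)
         if all(sum(h * xi for h, xi in zip(row, x)) % 2 == 0 for row in H)]
    print(W)   # [(0,0,0,0,0), (0,1,1,1,1), (1,0,0,1,1), (1,1,1,0,0)]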


12.1 Exercises

99. Sketch each of the following subsets of R2 and determine whether it is

(i) closed under addition,


(ii) closed under scalar multiplication,
(iii) a subspace of R2 .

(a) A = {(x, y) | y ≥ 0}    (c) C = {(x, y) | x² + y² ≤ 1}

(b) B = {(x, y) | x = y}    (d) D = {(x, y) | xy = 0}

100. Decide which of the following are subspaces of C3 . Explain your answers.

(a) A = {(a, b, 0) ∈ C3 | a, b ∈ C} (c) C = {(a, b, c) ∈ C3 | 2a − 3b + 5c = 0}


(b) B = {(a, b, c) ∈ C3 | 2a − 3b + 5c = 4} (d) D = {(a − b, a + b, 2a) ∈ C3 | a, b ∈ C}

101. Show that the following sets of vectors are not subspaces of Rn .

(a) The set of all vectors whose first component is 2.


(b) The set of all vectors except the zero vector.
(c) The set of all vectors the sum of whose components is 1.

102. Use the subspace theorem to decide which of the following are real vector spaces with the usual
operations.

(a) The set of real polynomials of degree exactly n (where n ∈ N is fixed).


(b) The set of real polynomials p with p(0) = 0.
(c) The set of real polynomials p with p(0) = 1.

103. Determine whether or not the given set is a subspace of Mn,n (C).

(a) The set of all matrices, the sum of whose entries is zero.
(b) The set of all matrices whose determinant is zero.
(c) The diagonal matrices.
(d) The matrices with trace equal to 0.

104. Decide which of the following are complex vector spaces with the usual matrix operations.
 
z1 z2
(a) All complex 2 × 2 matrices with z1 and z2 real.
z3 z4
(b) All complex 2 × 2 matrices with z1 + z4 = 0.

105. Let A ∈ Mm,n (F) and B ∈ Mm,1 (F). Show that if B 6= 0, then the set of solutions {X ∈
Mn,1 (F) | AX = B} is not a subspace of Mn,1 (F).

106. Let V = (F2 )3 . Show that W = {(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)} ⊆ V is a subspace of V .


Further reading for lecture 12

 References about subspaces

Elementary Linear Algebra by Anton and Rorres, §4.2


Linear Algebra Done Right by Axler, §1.C

LECTURE 13

Linear combinations: span and linear independence

We define and investigate the important notions of linear dependence and span.

13.1 Linear combinations

Definition 13.1: Linear combination

Let V be a vector space over a field F. An expression of the form

α1 u1 + · · · + αk uk

with u1 , . . . uk ∈ V and α1 , . . . , αk ∈ F is called a linear combination.

Notice that if W 6 V and u1 , . . . uk ∈ W , then the above linear combination gives a vector in W . We
will investigate how to describe subspaces using linear combinations.

13.2 Span

Definition 13.2: The span of a set of vectors

Let V be a vector space with scalars F, and let S ⊆ V be a non-empty subset of V . The span of S
is the set of all linear combinations of vectors from S

span(S) = {α1 u1 + · · · + αk uk | k ∈ N and u1 , . . . , uk ∈ S and α1 , . . . , αk ∈ F} ⊆ V


If S = ∅, we define span(S) = $\{\vec{0}\}$ ⊆ V.

Remark. 1. The above definition of span(S) does not assume that S is finite.

2. It’s immediate from the definition that S ⊆ span(S).

3. An alternative notation for the span of S is hSi.

Examples 13.3. 1. S = {(1, 1), (−3, −3)} ⊆ R2 , span(S) is the line {(x, y) ∈ R2 | y = x}

2. S = {(1, 1), (2, 3)} ⊆ R2 , span(S) = R2

3. S = {(1, 1), (2, 3), (4, 5)} ⊆ R2 , span(S) = R2

4. S = {(−1, 1, 0), (−1, 0, 1)} ⊆ R3 , span(S) is the plane {(x, y, z) | x + y + z = 0} ⊆ R3



Example 13.4. Let A ∈ M4,4 (C) be given by

    A = [ −2  0   6   8 ]
        [  1  0  −5  −8 ]
        [  2  0  −2   0 ]
        [  2  0  −5  −6 ]

Calculation gives that the reduced row echelon form of A is

    R = [ 1  0  0  2 ]
        [ 0  0  1  2 ]
        [ 0  0  0  0 ]
        [ 0  0  0  0 ]

The solution space of the linear system AX = 0 is given by

    { (x, y, z, w)^T | x, y, z, w ∈ C, x = −2w, z = −2w }
        = { (−2w, y, −2w, w)^T | y, w ∈ C }
        = { y (0, 1, 0, 0)^T + w (−2, 0, −2, 1)^T | y, w ∈ C }
        = span{ (0, 1, 0, 0)^T , (−2, 0, −2, 1)^T }
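Row reductions like this one are easy to check by machine. As a quick computational check (a sketch using the Python library sympy, which does exact arithmetic — any exact linear algebra package would do):

    from sympy import Matrix

    A = Matrix([[-2, 0,  6,  8],
                [ 1, 0, -5, -8],
                [ 2, 0, -2,  0],
                [ 2, 0, -5, -6]])

    R, pivots = A.rref()      # reduced row echelon form and the pivot columns
    print(R)                  # Matrix([[1, 0, 0, 2], [0, 0, 1, 2], [0, 0, 0, 0], [0, 0, 0, 0]])

    # a basis for the solution space {X : AX = 0}
    for v in A.nullspace():
        print(v.T)            # (0, 1, 0, 0) and (-2, 0, -2, 1)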

Proposition 13.5

Let V be a vector space with scalars F, and let S ⊆ V be a subset.

1. The span of S, span(S), is a subspace of V .

2. Let W ≤ V be a subspace. If S ⊆ W , then span(S) ≤ W .

Proof. Note first that if S = ∅, then both parts are trivially true since span(S) = {0}. We will assume then that S ≠ ∅.
To show that span(S) is a subspace we show that it satisfies the conditions of the Subspace Theorem. Firstly, span(S) is non-empty since span(S) ⊇ S ≠ ∅. Now suppose that u, v ∈ span(S). Then, from the definition of the span, we have that u = α1 u1 + · · · + αk uk for some αi ∈ F and ui ∈ S, and v = β1 v1 + · · · + βl vl for some βi ∈ F and vi ∈ S. But then

    u + v = α1 u1 + · · · + αk uk + β1 v1 + · · · + βl vl ∈ span(S)

Therefore span(S) is closed under vector addition. Similarly, for any a ∈ F we have that

au = a(α1 u1 + · · · + αk uk ) = (aα1 )u1 + · · · + (aαk )uk ∈ span(S)

and therefore span(S) is closed under scalar multiplication. It follows from the Subspace Theorem
that span(S) is a subspace of V .
For the second part, suppose that W ≤ V and that S ⊆ W . We have

u ∈ span(S) =⇒ u = α1 u1 + · · · + αk uk for some αi ∈ F and ui ∈ S


=⇒ u ∈ W (since W is a subspace and ui ∈ W )

Remark. The above proposition tells us that span(S) is the ‘smallest’ subspace of V that contains S.
We sometimes say that span(S) is the subspace spanned by S.

Exercise 107. Let V be a vector space and let S, T ⊆ V be two subsets. Show that if S ⊆ T , then span(S) ≤ span(T ).

Definition 13.6

Given a subspace W ≤ V of a vector space, we say that a subset S ⊆ V is a spanning set for W
if span(S) = W . We also say that S spans W .


Exercise 108. Show that S ⊆ V is a spanning set for a subspace W ≤ V if and only if

(a) S ⊆ W and

(b) ∀w ∈ W ∃k ∈ N ∃α1 , . . . , αk ∈ F ∃u1 , . . . , uk ∈ S, w = α1 u1 + · · · + αk uk

Example 13.7. We will show that S = {(1, 1), (2, 3), (4, 5)} ⊆ R2 is a spanning set for R2 . By Exercise
108 we need to show that given any (a, b) ∈ R2 , there exist x, y, z ∈ R such that (a, b) = x(1, 1) +
y(2, 3) + z(4, 5). That is, we need to show that the following linear system is consistent:

x + 2y + 4z = a
x + 3y + 5z = b

Forming the augmented matrix of the linear system and row reducing gives:
    [ 1  2  4 | a ]  R2−R1   [ 1  2  4 |  a  ]
    [ 1  3  5 | b ]  ----->  [ 0  1  1 | b−a ]

Therefore the linear system is consistent (for all a, b ∈ R) and we conclude that S is a spanning set for
R2 .
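The same verification can be done symbolically: treating a and b as symbols, we ask for the general solution of the system. A sympy sketch:

    from sympy import symbols, linsolve

    a, b, x, y, z = symbols('a b x y z')
    eqs = [x + 2*y + 4*z - a, x + 3*y + 5*z - b]
    print(linsolve(eqs, x, y, z))
    # {(3*a - 2*b - 2*z, -a + b - z, z)}: a solution exists for every a and b,
    # so S is a spanning set for R^2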

Exercise 109. By setting up an appropriate linear system, show that the set

S = {(1, −1, 1), (3, −2, 3), (1, 0, 1), (1, 1, 1)}

is not a spanning set for R3 . Use your working to find a vector u ∈ R3 such that u ∉ span(S).

13.3 Linear independence

Definition 13.8

Let V be a vector space over a field F. A subset S ⊆ V is called linearly dependent if there are vectors u1 , . . . , uk ∈ S and scalars α1 , . . . , αk ∈ F such that

    α1 u1 + · · · + αk uk = 0

and at least one of the αi is non-zero.


A subset S ⊆ V which is not linearly dependent is called linearly independent.

Note. In other words, a subset S is linearly independent iff ∀k ∈ N ∀u1 , . . . , uk ∈ S ∀α1 , . . . , αk ∈ F,

    α1 u1 + · · · + αk uk = 0 =⇒ ∀i, αi = 0

In any vector space V the empty set ∅ ⊆ V is a linearly independent set.

Example 13.9. We decide whether or not the subset S = {(1, 4, 1), (2, 5, 1), (3, 6, 1)} ⊆ C3 is linearly dependent. From the definition, the set is linearly independent iff x(1, 4, 1) + y(2, 5, 1) + z(3, 6, 1) = (0, 0, 0) =⇒ x = y = z = 0. That is, the set is linearly independent iff the following (homogeneous) linear system has a unique solution

x + 2y + 3z = 0
4x + 5y + 6z = 0
x+y+z =0


We write down the corresponding matrix A = [1 2 3; 4 5 6; 1 1 1]. Reducing to row echelon form gives

    [ 1  2  3 ]  R2−4R1   [ 1   2   3 ]  R3−(1/3)R2   [ 1   2   3 ]
    [ 4  5  6 ]  ------>  [ 0  −3  −6 ]  ---------->  [ 0  −3  −6 ]
    [ 1  1  1 ]  R3−R1    [ 0  −1  −2 ]               [ 0   0   0 ]

Since the solution set will require one parameter (Lemma 6.8), we know that there are non-trivial
solutions. Therefore the set S is linearly dependent.
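A quick machine check of this example (a sympy sketch):

    from sympy import Matrix

    A = Matrix([[1, 2, 3],
                [4, 5, 6],
                [1, 1, 1]])
    print(A.rank())            # 2 < 3, so there are non-trivial solutions
    print(A.nullspace()[0].T)  # (1, -2, 1): (1,4,1) - 2*(2,5,1) + (3,6,1) = 0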

Exercise 110. By solving an appropriate linear system, show that the following subset of P3 (C) is
linearly independent:

{2 + 2x − 2x2 + 6x3 , 1 + 4x − 4x2 + 9x3 , −1 − 2x + 3x2 − 5x3 }

Lemma 13.10

A subset S ⊆ V is linearly dependent if and only if ∃u ∈ S, u ∈ span(S \ {u}).

Proof. Suppose that S is linearly dependent. Let u1 , . . . , uk ∈ S and α1 , . . . , αk ∈ F be such that α1 u1 + · · · + αk uk = 0, and let j ∈ {1, . . . , k} be such that αj ≠ 0. We have

    α1 u1 + · · · + αk uk = 0
      =⇒ −αj uj = α1 u1 + · · · + αj−1 uj−1 + αj+1 uj+1 + · · · + αk uk
      =⇒ uj = (α1 /(−αj )) u1 + · · · + (αj−1 /(−αj )) uj−1 + (αj+1 /(−αj )) uj+1 + · · · + (αk /(−αj )) uk

where in the last step we have relied on the fact that αj ≠ 0. Hence uj ∈ span(S \ {uj }).


For the converse, suppose instead that u ∈ S, u1 , . . . , uk ∈ S \ {u} and α1 , . . . , αk ∈ F are such that
u = α1 u1 + · · · + αk uk . Rearranging gives
    α1 u1 + · · · + αk uk − u = 0

and therefore S is linearly dependent.

Remark. The following are particular cases of the lemma above.

1. If S = {u}, then S is linearly dependent iff u = 0.

2. If S = {u, v} consists of exactly two vectors, then S is linearly dependent iff one of the two vectors is a multiple of the other.

3. If S contains at least two vectors, then S is linearly dependent if and only if


∃u ∈ S ∃u1 , . . . , uk ∈ S \ {u} ∃α1 , . . . , αk ∈ F such that u = α1 u1 + · · · + αk uk

13.4 Exercises

111. Determine which of the following sets span R3 .

(a) {(1, 2, 3), (−1, 0, 1), (0, 1, 2)}


(b) {(−1, 1, 2), (3, 3, 1), (1, 2, 2)}

112. Determine whether the given set spans the given vector space.

(a) V = C2 , S = {(i, 2), (3, 4)}


(b) V = P2 (R), S = {2 + x2 , 3 + x + 2x2 , 1 + x + x2 , 7 + 3x + 5x2 }


(c) V = (F2 )3 , S = {(1, 1, 0), (1, 0, 1), (0, 1, 1)}
(d) V = M2,2 (R), S = { [2 1; 0 0], [0 0; 2 1], [0 −1; 3 0], [0 0; 3 1] }

113. Find (finite) spanning sets for the following subspaces of R3 :

(a) {(2a, b, 0) | a, b ∈ R}
(b) {(a + c, c − b, 3c) | a, b, c ∈ R}
(c) {(4a + d, a + 2b, c − b) | a, b, c, d ∈ R}

114. Find (finite) spanning sets for the given vector spaces.

(a) {A ∈ M2,2 (C) | A^T = A} ≤ M2,2 (C)


(b) {X ∈ M7,1 (F2 ) | HX = 0} where H = [0 0 0 1 1 1 1; 0 1 1 0 0 1 1; 1 0 1 0 1 0 1] ∈ M3,7 (F2 )
(c) C as a vector space over the field C (i.e., V = C and F = C).
(d) C as a vector space over the field R (i.e., V = C and F = R).

115. Determine whether or not the following sets of vectors are linearly independent:

(a) {(1, 2), (0, 2), (1, 0), (−1, 1)} ⊆ R2


(b) {(1, 2), (3, −1)} ⊆ R2
(c) {(1, 0, 1), (−1, 1, 0), (0, 1, 1)} ⊆ R3
(d) {(2, 0, 0, 0), (2, 1, 0, 0), (−1, 3, −2, 0), (1, −2, 4, −3)} ⊆ R4
(e) {(2, −3, 1, −5), (0, 1, 2, 2), (1, −2, 3, 0)} ⊆ R4
(f) {(1, 0, 2, −3), (0, −4, 1, 1), (2, 2, 0, −1), (1, −2, −1, 3)} ⊆ C4

116. By setting up and solving an appropriate linear system, decide whether the vector u is a linear
combination of the vectors in the set S. If so, express u as a linear combination of the vectors in
S.

(a) u = (0, 0, 1) ∈ R3 , S = {(1, 2, 3), (2, 3, 1)} ⊆ R3


(b) u = (0, 1, 1, 2) ∈ R4 , S = {(1, 0, −2, −1), (−2, −1, 2, 0), (1, 1, 1, 1)} ⊆ R4

117. Which of the following subsets of C3 are linearly independent?

(a) {(1 − i, 1, 0), (2, 1 + i, 0), (1 + i, i, 0)}


(b) {(1, 0, −i), (1 + i, 1, 1 − 2i), (0, i, 2)}
(c) {(i, 0, 2 − i), (0, 1, i), (−i, −1 − 4i, 3)}

118. Let a, b, c ∈ C be such that no two of them are equal. Show that the set {(1, a, a2 ), (1, b, b2 ), (1, c, c2 )} ⊆
C3 is linearly independent.

119. Determine whether or not the given set is linearly dependent. If the set is linearly dependent,
write one of its vectors as a linear combination of the others.

(a) {1, 1 + x, 1 + x + x2 } ⊆ P2 (R)


(b) {1 + x2 , 1 + x + 2x2 , x + x2 } ⊆ P2 (R)
(c) { [1 0; 0 −1], [0 1; 0 −1], [0 0; 1 −1], [0 0; 0 1] } ⊆ M2,2 (R)
(d) { [2 0 0; 0 −1 0; 0 0 1], [−2 0 0; 0 −1 0; 0 0 −1] } ⊆ M3,3 (R)
 

120. Show that a set S ⊆ V is linearly dependent iff ∃u ∈ S such that span(S \ {u}) = span(S).


121. Show that if a subset S ⊆ V contains the zero vector 0 ∈ V , then S is linearly dependent.

122. Show that any subset of a linearly independent set is itself linearly independent.

123. Let u, v ∈ V and let α, β ∈ F with α ≠ 0. Show that {u, v} is linearly independent iff {αu + βv, v} is linearly independent.

124. Let A ∈ Mm,n (F) be a matrix that is in row echelon form. Show that the non-zero rows of A
form a linearly independent subset of M1,n (F).


Further material for lecture 13

• Alternative definition of span(S)

As an alternative to Definition 13.2 we could have defined the span of a set of vectors S ⊆ V to be the intersection of all subspaces of V that contain S. With this version it is not necessary to treat the case S = ∅ separately. As an exercise you could check that this definition gives the same thing as Definition 13.2. This version of the definition of span can be readily adapted to other algebraic structures (e.g., groups, rings, fields, etc).



LECTURE 14

Bases and dimension

A fundamental concept when working with vector spaces is that of a basis. It is a spanning set
that is as ‘small’ as possible. A basis gives an efficient way of describing and working with vectors.
Choosing a basis allows us to use coordinates to represent vectors. The dimension of a vector space
is defined using bases.

14.1 Definition of basis

Definition 14.1

Let V be a vector space over a field F. A basis for V is a subset B ⊆ V that satisfies:

1. B is linearly independent, and

2. B is a spanning set for V .

Examples 14.2. 1. {(1, 0), (0, 1)} ⊆ R2 is a basis for R2 .

2. {(1, 1), (3, 4)} ⊆ R2 is also a basis for R2 .

3. {(1, 1), (−3, −3)} ⊆ R2 is not a basis for R2 .

4. {(−1, 1, 0), (−1, 0, 1)} is a basis for the plane P = {(x, y, z) ∈ R3 | x + y + z = 0}


5. {(−2, 1, 0, 0)^T , (−1, 0, −2, 1)^T } is a basis for the solution space V = {X ∈ M4,1 (R) | AX = 0} of the matrix A = [1 2 0 1; 1 2 1 3].

6. Let V ≤ M2,2 (C) be the subspace of all matrices having trace equal to 0.
Then { [1 0; 0 −1], [0 1; 0 0], [0 0; 1 0] } is a basis for V .

A vector space has many subsets that are bases. The following are the standard bases for some common vector spaces. If no other basis is specified, these are the assumed choices.

7. The standard basis for Fn is {(1, 0, 0, . . . , 0), (0, 1, 0, . . . , 0), . . . , (0, 0, 0, . . . , 0, 1)}. These vectors are sometimes denoted as {e1 , e2 , . . . , en }.

8. The standard basis for Mm,n (F) is {Ei,j | i ∈ {1, . . . , m}, j ∈ {1, . . . , n}} where Ei,j is the matrix with a 1 in the (i, j)-th entry and 0 in all other entries.

9. The standard basis for Pn (F) is {1, x, x2 , . . . , xn }.

10. The standard basis for F[x] is {1, x, x2 , . . . }.

The following result illustrates why bases are so useful.



Lemma 14.3: Uniqueness of coordinates

Let V be a vector space and B = {b1 , . . . , bn } ⊆ V a basis for V . Every vector u ∈ V can be
written uniquely as a linear combination of elements from B. That is, there exist unique αi ∈ F
such that
u = α1 b1 + · · · + αn bn

Proof. Since B is a basis, it is a spanning set for V and therefore every element u ∈ V can be written as a linear combination u = α1 b1 + · · · + αn bn where αi ∈ F. We will use the fact that B is linearly independent to show that the αi are uniquely determined by u. Suppose that u = β1 b1 + · · · + βn bn for some βi ∈ F. Then we have

    α1 b1 + · · · + αn bn = β1 b1 + · · · + βn bn
      =⇒ (α1 − β1 )b1 + · · · + (αn − βn )bn = 0
      =⇒ ∀i, (αi − βi ) = 0    (since B is linearly independent)
      =⇒ ∀i, βi = αi

14.2 Dimension

Theorem 14.4

Let V be a vector space and let B = {b1 , . . . , bn } be a basis for V . Let S ⊆ V be a subset.

1. If S is linearly independent, then it has at most n elements.

2. If S is a spanning set, then it has at least n elements.

3. Every basis of V has n elements.

Proof. Note first that the third part is an immediate consequence of the first two parts, which we now
prove.
1) We prove the contrapositive: if S has more than n elements, then S is linearly dependent.
Suppose that |S| > n, and let {u1 , . . . , un+1 } ⊆ S be distinct elements of S. Let αij ∈ F be such that
uj = α1j b1 + · · · + αnj bn . Let A ∈ Mn,n+1 (F) be given by Aij = αij . To show that S is linearly
dependent we will show that there is a non-trivial linear combination of the ui that gives the zero
vector. For x1 , . . . , xn+1 ∈ F we have

    x1 u1 + · · · + xn+1 un+1 = 0
      ⇐⇒ Σ_{j=1}^{n+1} xj ( Σ_{i=1}^{n} αij bi ) = 0
      ⇐⇒ Σ_{i=1}^{n} ( Σ_{j=1}^{n+1} αij xj ) bi = 0
      ⇐⇒ ∀i, Σ_{j=1}^{n+1} αij xj = 0    (since B is linearly independent)
      ⇐⇒ AX = 0, where X = (x1 , x2 , . . . , xn+1 )^T


The final expression is a homogeneous linear system. The solution space of this linear system will need (n + 1) − rank(A) parameters (see Lemma 6.8). Since A has n rows, we have rank(A) ≤ n, and therefore (n + 1) − rank(A) ≥ 1. Therefore the linear system has more than one solution. In particular, there is a non-trivial solution.
2) We will show the contrapositive: if S has fewer than n elements, then it is not a spanning set.
Suppose that S = {u1 , . . . , uk } with k < n. Let αij ∈ F be such that uj = α1j b1 + · · · + αnj bn .
Let A ∈ Mn,k (F) be given by Aij = αij . We will show that there exist γ1 , . . . , γn ∈ F such that
v = γ1 b1 + · · · + γn bn is not in span(S).
    v ∈ span(S) ⇐⇒ ∃xj ∈ F, Σ_{i=1}^{n} γi bi = Σ_{j=1}^{k} xj uj
      ⇐⇒ ∃xj ∈ F, Σ_{i=1}^{n} γi bi = Σ_{j=1}^{k} xj ( Σ_{i=1}^{n} αij bi )
      ⇐⇒ ∃xj ∈ F, Σ_{i=1}^{n} γi bi = Σ_{i=1}^{n} ( Σ_{j=1}^{k} αij xj ) bi
      ⇐⇒ ∃xj ∈ F ∀i, γi = Σ_{j=1}^{k} αij xj    (Lemma 14.3)
      ⇐⇒ ∃xj ∈ F, A (x1 , . . . , xk )^T = (γ1 , . . . , γn )^T

We need to show that there is a choice of the γi such that the linear system AX = (γ1 , . . . , γn )^T is inconsistent. Let E ∈ Mn,n (F) be an invertible matrix such that EA = R is in reduced row echelon form. Since R ∈ Mn,k (F) has fewer columns than rows, the bottom row of R is all zeros. Now let en = (0, . . . , 0, 1)^T ∈ Mn,1 (F) and choose the γi by C = (γ1 , . . . , γn )^T = E^{−1} en . Then E[A | C] = [R | en ]. Since the bottom row of R is all zeros, the linear system is inconsistent.

Definition 14.5: Dimension

The dimension of a vector space V is the size of a basis for V . It is denoted by dim(V ).


We call a vector space finite dimensional if it has a basis with a finite number of elements, and
infinite dimensional otherwise.

Examples 14.6. (cf. Example 14.2)

1. dim(R2 ) = 2
2. dim(P ) = 2, where P ≤ R3 is the plane P = {(x, y, z) ∈ R3 | x + y + z = 0}
3. dim(V ) = 3, where V ≤ M2,2 (C) is V = { [a b; c d] | a + d = 0 }
4. dim(Fn ) = n
5. dim(Mm,n (F)) = mn
6. dim(Pn (F)) = n + 1
7. F[x] is infinite dimensional
8. F(R, R) is infinite dimensional

We finish with an important fact.


Theorem 14.7

Every vector space has a basis.

The full proof is a little too technical to go through here, but the idea of the proof is the following.
Start with a linearly independent set S ⊆ V . If S is not a basis, then there exists u ∈ V such that
S ∪ {u} is linearly independent. Therefore a maximal linearly independent set must be a basis. That
there is a maximal linearly independent set requires the use of ‘Zorn’s Lemma’.
We isolate the following consequence of the above argument.

Lemma 14.8

Let V be a vector space and S ⊆ V a subset. Suppose that |S| = dim(V ). Then S is linearly
independent if and only if S is a spanning set (for V ).

14.3 Exercises

125. Determine whether or not the given set is a basis for C3 .

(a) {(i, 0, −1), (1, 1, 1), (0, −i, i)} (b) {(i, 1, 0), (0, 0, 1)}

126. Which of the following sets of vectors are bases for P2 (R)?

(a) {1 − 3x + 2x2 , 1 + x + 4x2 , 1 − 7x}


(b) {1 + x + x2 , x + x2 , x2 }

127. Find a basis for the given vector space:

(a) The subspace of M2,2 (C) consisting of all diagonal 2 × 2 matrices.


(b) The subspace of M2,2 (F2 ) consisting of all 2 × 2 matrices whose diagonal entries are zero.
(c) The subspace of P3 (R) consisting of all polynomials a0 + a1 x + a2 x2 + a3 x3 with a2 = 0.

128. (a) Show that any set of four polynomials in P2 (C) is linearly dependent.
(b) Show that a set consisting of two polynomials cannot span P2 (C).

129. Let

    A = [  0  1   4 ]
        [  6  1  −8 ]   and   W = { (x, y, z) ∈ R3 : A (x, y, z)^T = 3 (x, y, z)^T }.
        [ −9  3  15 ]

Show that W is a subspace of R3 , and find a basis for it.

130. Let V = R[x] and W = {p(x) ∈ R[x] | p(1) = 0}.


Show that W is a subspace of V , and find a basis for W .

131. Let V be a finite dimensional vector space and W ≤ V a subspace of V .

(a) Show that dim(W ) ≤ dim(V ).
(b) Show that if dim(W ) = dim(V ), then W = V .


Extra material for lecture 14

existence of basis appendix inserted here (pdf cut-paste)



LECTURE 15

Coordinates relative to a basis

15.1 Coordinates

From Lemma 14.3 we know that every vector in a vector space can be written in exactly one way
as a linear combination of a given basis. The scalars that appear as the coefficients in the linear
combination are called the coordinates. Once we fix a basis for a finite dimensional vector space V ,
there is a one-to-one correspondence* between vectors and their coordinate matrices.

Definition 15.1: Coordinates

Let V be a vector space over a field F. Suppose that B = {b1 , . . . , bn } is an ordered basis for V ,
that is, a basis arranged in order: b1 first, b2 second and so on. For v ∈ V we can write

v = α1 b1 + · · · + αn bn for some scalars α1 , . . . , αn ∈ F

The scalars α1 , . . . , αn are uniquely determined by v (see Lemma 14.3) and are called the coor-
dinates of v relative to B.

The column matrix

    [v]B = (α1 , . . . , αn )^T ∈ Mn,1 (F)

is called the coordinate matrix† of v with respect to B.

"1#
0
Example 15.2. 1. V = P4 (C), B = {1, x, x2 , x3 , x4 }, v = 1 + 2x3 + 2x4 , [v]B = (1, 0, 0, 2, 2)^T

2. V = M2,2 (Q), B = { [1 0; 0 0], [0 1; 0 0], [0 0; 1 0], [0 0; 0 1] }, v = [1 0; 2 2], [v]B = (1, 0, 2, 2)^T

3. V = R2 , B = {(1, 2), (3, 4)}, v = (2, 2), [v]B = (−1, 1)^T

4. V = {(x, y, z) ∈ R3 | x + y + z = 0} ≤ R3 , B = {(1, 2, −3), (2, −1, −1)}, v = (−3, 4, −1), [v]B = (1, −2)^T

Note. The coordinates depend on the basis chosen. If we change the basis B, the coordinates of a
vector will change.

Example 15.3. Consider the vector space V = R2 . Then S = {(1, 0), (0, 1)} and C = {(2, 1), (−1, 1)} are bases for V . For v = (1, 5) ∈ V we have

    [v]S = (1, 5)^T    and    [v]C = (2, 3)^T

[Figure: the vector v = (1, 5) sketched together with the two bases.]
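Finding coordinates with respect to a basis amounts to solving a linear system: if P is the matrix whose columns are the basis vectors, then P [v]C = v. A sympy sketch for the basis C above:

    from sympy import Matrix

    P = Matrix([[2, -1],
                [1,  1]])      # columns are the basis vectors (2,1) and (-1,1)
    v = Matrix([1, 5])
    print(P.solve(v).T)        # (2, 3), i.e. (1,5) = 2*(2,1) + 3*(-1,1)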

* i.e., a bijection
† also called the coordinate vector of v

Lemma 15.4

Let V be a finite dimensional vector space over a field F and let B be an ordered basis for V .
Let u, v ∈ V and α ∈ F. Then

[u + v]B = [u]B + [v]B

[αv]B = α[v]B

Proof. Let the ordered basis be B = {b1 , . . . , bn } and let αi , βi ∈ F be such that u = α1 b1 + · · · + αn bn and v = β1 b1 + · · · + βn bn . Then

    u + v = (α1 + β1 )b1 + · · · + (αn + βn )bn

and therefore

    [u]B + [v]B = (α1 , . . . , αn )^T + (β1 , . . . , βn )^T = (α1 + β1 , . . . , αn + βn )^T = [u + v]B

The second statement is similar, and left as an exercise.

Exercise 132. Let V be an n-dimensional vector space with scalars F and let B be a basis for V . Show
that the map ϕ : V → Mn,1 (F) given by ϕ(v) = [v]B is a bijection.

The following observation allows us to convert some questions about an n-dimensional vector space
to the corresponding questions about Fn .

Lemma 15.5

Let V be an n-dimensional vector space with scalars F and let B be a basis for V . Let S ⊆ V be a
subset of V and define T ⊆ Mn,1 (F) by T = {[v]B | v ∈ S}. Then

1. S is linearly independent iff T is linearly independent

2. S is a spanning set for V iff T is a spanning set for Mn,1 (F)

Proof.

    S linearly dependent
      ⇐⇒ ∃u1 , . . . , uk ∈ S ∃α1 , . . . , αk ∈ F (not all zero), α1 u1 + · · · + αk uk = 0V
      ⇐⇒ ∃u1 , . . . , uk ∈ S ∃α1 , . . . , αk ∈ F (not all zero), [α1 u1 + · · · + αk uk ]B = [0V ]B    (by Exercise 132)
      ⇐⇒ ∃u1 , . . . , uk ∈ S ∃α1 , . . . , αk ∈ F (not all zero), α1 [u1 ]B + · · · + αk [uk ]B = 0 in Mn,1 (F)    (by Lemma 15.4)
      ⇐⇒ T is linearly dependent


    S a spanning set for V
      ⇐⇒ ∀v ∈ V ∃u1 , . . . , uk ∈ S ∃α1 , . . . , αk ∈ F, v = α1 u1 + · · · + αk uk
      ⇐⇒ ∀v ∈ V ∃u1 , . . . , uk ∈ S ∃α1 , . . . , αk ∈ F, [v]B = α1 [u1 ]B + · · · + αk [uk ]B
      ⇐⇒ ∀u ∈ Mn,1 (F) ∃u1 , . . . , uk ∈ S ∃α1 , . . . , αk ∈ F, u = α1 [u1 ]B + · · · + αk [uk ]B
      ⇐⇒ T a spanning set for Mn,1 (F)

Examples 15.6.

1. Consider S = {2 − 3x + x2 − 5x3 , x + 2x2 + 2x3 , 1 − 2x + 3x2 } ⊆ P3 (R). Taking coordinates with respect to the standard basis for P3 (R) we get {(2, −3, 1, −5), (0, 1, 2, 2), (1, −2, 3, 0)} ⊆ R4 . Since this set is linearly independent (see Exercise 115(e)), the set S is linearly independent.

2. Consider the vector space V ≤ M2,2 (C) of matrices having trace 0. In Example 14.2.6 we saw that B = { [1 0; 0 −1], [0 1; 0 0], [0 0; 1 0] } is a basis for V . Let S = { [1 0; 1 −1], [−1 1; 0 1], [0 1; 1 0] } ⊆ V . Taking coordinates with respect to B we get {(1, 0, 1)^T , (−1, 1, 0)^T , (0, 1, 1)^T }. Since this set is linearly dependent (see Exercise 115(c)), we have that S is linearly dependent.

15.2 Exercises

133. (a) Show that the set B = {(−2, 2, 2), (3, −2, 3), (2, −1, 1)} is a basis for R3 .
(b) Find the vectors x, y ∈ R3 whose coordinates with respect to B are
    [x]B = (2, 1, 1)^T    and    [y]B = (1, 0, −1)^T

(c) For each of the following vectors find its coordinates with respect to B:

a = (2, −1, 1) b = (1, 0, 5) c = (3, −1, 6)

134. Find the coordinate vector of v with respect to the given basis B for the vector space V .

(a) v = 2 − x + 3x2 , B = {1, x, x2 , x3 }, V = P3 (C).


(b) v = [1 2 1; −1 1 2], B = {Eij | i = 1, 2; j = 1, 2, 3}, V = M2,3 (R).
(Here Eij is the matrix with (i, j) entry equal to 1 and other entries equal to 0.)
(c) v = −2 + (5 + i)x, B = {x + 1, x − 1}, V = P1 (C)
(d) v = [−2 0; 0 3], B = { [i 0; 0 0], [0 0; 0 1] }, V is the vector space of all diagonal 2 × 2 complex matrices.
135. Use coordinate matrices to decide whether or not the given set is linearly independent. If it is
linearly dependent, express one of the vectors as a linear combination of the others.

(a) {x2 + x − 1, x2 − 2x + 3, x2 + 4x − 3} ⊆ P2 (R)
(b) { [1 2; −1 0], [0 −1; 1 1], [1 0; 1 2] } in M2,2 (R)


Extra material for lecture 15

• References about bases and coordinates

Elementary Linear Algebra by Anton and Rorres, §4.4



LECTURE 16

Row and column space of a matrix

16.1 Row and column space of a matrix

We’ve already seen that, given a matrix A ∈ Mm,n (F), its solution space is a subspace of Mn,1 (F).
There are two other spaces we can associate to a matrix.

Definition 16.1

Let A = (aij ) ∈ Mm,n (F) be a matrix.

1. The solution space of A (also called the null space or kernel) is the subspace of Mn,1 (F)
given by {X ∈ Mn,1 (F) | AX = 0}.
It will often be identified with a subspace of Fn .
Explicitly, solspace(A) = {(x1 , . . . , xn ) ∈ Fn | A (x1 , . . . , xn )^T = 0}.


2. The row space, rowspace(A), is the subspace of M1,n (F) spanned by the rows of A.
It will often be identified with a subspace of Fn . Explicitly, we identify the i-th row with
the element ri = (ai1 , . . . , ain ) ∈ Fn and let rowspace(A) = span{r1 , . . . , rm } 6 Fn

3. The column space, colspace(A), is the subspace of Mm,1 (F) spanned by the columns of A.
It will often be identified with a subspace of Fm . Explicitly, we identify the j-th column
with the element cj = (a1j , . . . , amj ) ∈ Fm and let colspace(A) = span{c1 , . . . , cn }

Remark. Observe that, writing cj = (a1j , . . . , amj )^T for the j-th column of A,

    AX = x1 c1 + x2 c2 + · · · + xn cn

Therefore, AX is an element of the column space of A (for any X ∈ Mn,1 (F)).


Example 16.2. As a concrete example, let A = [1 2; 3 4; 1 1] ∈ M3,2 (R).
Then rowspace(A) = span{(1, 2), (3, 4), (1, 1)} ≤ R2 and colspace(A) = span{(1, 3, 1), (2, 4, 1)} ≤ R3 .

Lemma 16.3

Let A, R ∈ Mm,n (F). Suppose that A ∼ R and R is in row echelon form.

1. rowspace(A) = rowspace(R)

2. The non-zero rows of R are a basis for the row space of A.

3. The pivot columns* of A form a basis for the column space of A.



4. Every non-pivot column of A can be written as a linear combination of the columns to its left.

Proof. 1) We show that if two matrices are row equivalent, then they have the same row space. Since row operations are invertible, it is enough to show that if R is obtained from A by a single row operation, then the row space of R is a subset of the row space of A. But this is clear since each row of R is a linear combination of the rows of A.
2) From the first part, we know that the non-zero rows of R form a spanning set for rowspace(A).
That the non-zero rows are linearly independent is exercise 124.
3) Since the position of the leading entries will be the same, we can assume that R is in reduced row
echelon form. The pivot columns of R are then linearly independent. Denoting the columns of A by
c1 , . . . , cn and those of R by d1 , . . . , dn we have that
    α1 c1 + · · · + αn cn = 0 ⇐⇒ A (α1 , . . . , αn )^T = 0    (see the remark after Definition 16.1)
      ⇐⇒ R (α1 , . . . , αn )^T = 0    (A and R are row equivalent)
      ⇐⇒ α1 d1 + · · · + αn dn = 0

Therefore, the pivot columns of A form a linearly independent set. To see that the set of pivot columns
of A forms a spanning set for colspace(A) it is enough to show that each of the non-pivot columns
of A can be written as a linear combination of the pivot columns of A. But this is clearly true for the
columns of R, and therefore for the columns of A by the above calculation.
4) The statement holds for the columns of R since R is in row echelon form. That the same holds for
A then follows from the remark after Definition 16.1 (as in the previous part).

Example 16.4. Let

    A = [ 2  1  3  1  −1 ]             [ 2  1  3  1  −1 ]
        [ 2  1  3  1  −1 ]   and   R = [ 0  0  1  2   2 ]
        [ 2  1  4  3   1 ]             [ 0  0  0  0   0 ]
        [ 2  1  5  5   3 ]             [ 0  0  0  0   0 ]

Note that A ∼ R and that R is in row echelon form. We have that


 The first two rows of R form a basis for  The first and third columns of A form a
the row space of A basis for the column space of A
 The row space of A has dimension 2  The column space of A has dimension 2
 The first two rows of A do not form a ba-  The first and third columns of R do not
sis for the row space of A form a basis for the column space of A
 The row space of A is equal to the row  The column space of A is not equal to the
space of R column space of R

Definition 16.5

The dimension of the row space is called the row rank of a matrix. The dimension of the column
space is called the column rank of a matrix.

From the above lemma we have:

* That is, the columns of A such that the corresponding column in R has a leading entry.


Corollary 16.6

The rank, row rank and column rank of a matrix are all equal.

16.2 Exercises

136. In each part find a basis for, and the dimension of, the indicated subspace.

(a) The solution space of the homogeneous linear system (over R):

x1 − 2x2 + x3 = 0
x2 − x3 + x4 = 0
x1 − x2 + x4 = 0

(b) The solution space (over C) of

x1 − 3x2 + x3 − x5 = 0
x1 − 2x2 + x3 − x4 = 0
x1 − x2 + x3 − 2x4 + x5 = 0

(c) The subspace of R4 of all vectors of the form (x, −y, x − 2y, 3y).

137. For each of the following real matrices find a basis for the

(i) column space,


(ii) row space,
(iii) solution space.
(a) [1 2 1; 2 1 −1]    (b) [1 0 −1; −1 0 1]    (c) [1 −1 3; 0 1 1; 1 1 0; 2 −1 1]

138. Find a basis for each of the column space, row space and solution space of the matrix
    [ 0  2  1  1 ]
    [ 2  0  1  1 ]  ∈ M4,4 (F3 )
    [ 1  1  1  1 ]
    [ 0  1  2  1 ]

What is the rank of the matrix?

139. Let w = (x1 , . . . , xn )^T ∈ Mn,1 (R) be fixed, and let W = span{w}. Show that there exists a matrix A ∈ Mn,n (R) whose solution space is W .


Extra material for lecture 16

• References about row and column spaces

Elementary Linear Algebra by Anton and Rorres, §4.7



LECTURE 17

Some techniques for finite-dimensional vector spaces

We collect here some techniques that result from the theory that we’ve seen so far.

Algorithm 17.1: To decide if a set is linearly independent

Let V be an n-dimensional vector space over a field F and let B be a basis for V .
Let S = {u1 , . . . , uk } ⊆ V . To decide if S is linearly independent:

1. Form the matrix A ∈ Mn,k (F) given by A = [ [u1 ]B · · · [uk ]B ]

2. Calculate rank(A)

3. S is linearly independent iff rank(A) = k

Remark. It is always the case that rank(A) ≤ k.

Example 17.2. Let S = {(2 + 2i) + 2x + 2x2 − 2x3 , i + (1 + i)x + x2 − x3 , −1 − x2 + x3 } ⊆ P3 (C). Is S linearly independent? To apply the above technique we first fix a basis for P3 (C). Let's use the standard basis B = {1, x, x2 , x3 }. Letting u1 = (2 + 2i) + 2x + 2x2 − 2x3 , u2 = i + (1 + i)x + x2 − x3 , u3 = −1 − x2 + x3 we get

    A = [ [u1 ]B [u2 ]B [u3 ]B ] = [ 2+2i    i  −1 ]     [ −2  −1  1 ]
                                   [    2  1+i   0 ]  ∼  [  0   i  1 ]
                                   [    2    1  −1 ]     [  0   0  0 ]
                                   [   −2   −1   1 ]     [  0   0  0 ]

Since the rank is 2, which is strictly less than the number of elements in S, we conclude that S is linearly dependent.

Remark. Since we know that rank(A) = rank(A^T ) (Corollary 16.6), we could have instead calculated the rank of the matrix A^T = [2+2i 2 2 −2; i 1+i 1 −1; −1 0 −1 1].

Algorithm 17.3: To decide whether a subset of V is a spanning set for V

Let V be an n-dimensional vector space over a field F and let B be a basis for V .
Let S = {u1 , . . . , uk } ⊆ V . To decide if S is a spanning set for V :

1. Form the matrix A ∈ Mn,k (F) given by A = [ [u1 ]B · · · [uk ]B ]

2. Calculate rank(A)

3. S is a spanning set for V iff rank(A) = n

Example 17.4. With S ⊆ P3 (C) as in Example 17.2, the calculation done there shows that S is not a
spanning set for P3 (C) because rank(A) = 2 < 4 = dim(P3 (C)).

Algorithm 17.5: To find a subset of a set S = {u1 , . . . , uk } that is a basis for span (S)

Let V be an n-dimensional vector space over a field F and let B be a basis for V .
Let S = {u1 , . . . , uk } ⊆ V . To find a subset of S that is a basis for span(S):

1. Form the matrix A ∈ Mn,k (F) given by A = [ [u1 ]B · · · [uk ]B ]

2. Reduce A to row echelon form R

3. Locate the pivot columns of A

4. The corresponding elements of S form a basis for span(S). That is, the basis is

{ui | the i-th column of R is a pivot column}

Example 17.6. With S ⊆ P3 (C) as in Example 17.2, the calculation done there shows that the set
C = {(2 + 2i) + 2x + 2x2 − 2x3 , i + (1 + i)x + x2 − x3 } is a basis for span(S).

Remark. Alternatively, to find a basis for span(S) we could consider the matrix

    A^T = [ 2+2i    2   2  −2 ]     [ −1  0  −1  1 ]
          [    i  1+i   1  −1 ]  ∼  [  0  1  −i  i ]
          [   −1    0  −1   1 ]     [  0  0   0  0 ]

The non-zero rows in the row echelon form give a basis {(−1, 0, −1, 1), (0, 1, −i, i)} for rowspace(A^T ) and hence a basis {−1 − x2 + x3 , x − ix2 + ix3 } for span(S).

Algorithm 17.7: To find a superset of a linearly independent set that is a basis

Let V be an n-dimensional vector space over a field F and let B be a basis for V .
Let W 6 V be a subspace and fix a basis {w1 , . . . , wm } for W .
Given a linearly independent subset S = {u1 , . . . , uk } ⊆ W , to extend S to obtain a basis of W :

1. Form the matrix A = [ [u1 ]B · · · [uk ]B [w1 ]B · · · [wm ]B ] ∈ Mn,k+m (F)

2. Reduce to row echelon form

3. Locate the pivot columns of A.a

4. The vectors corresponding to the pivot columns form a basis for W . That is

{u1 , . . . , uk } ∪ {wi | the (k + i)-th column of A is a pivot column}

is a basis for W .
a. The first k columns will all be pivot columns since S is linearly independent.

Note. This includes the case in which W = V and B = {w1 , . . . , wm }.

Example 17.8. We saw in the previous example that {−1 − x2 + x3 , x − ix2 + ix3 } ⊆ P3 (C) is a linearly
independent set. We can extend it to a basis for P3 (C) as follows (letting u1 = −1 − x2 + x3 , u2 =
x − ix2 + ix3 and using the standard basis B = {1, x, x2 , x3 } )
    [ [u1 ]B [u2 ]B [1]B [x]B [x2 ]B [x3 ]B ] =

        [ −1   0  1  0  0  0 ]             [ −1  0   1  0  0  0 ]
        [  0   1  0  1  0  0 ]  ∼ · · · ∼  [  0  1   0  1  0  0 ]
        [ −1  −i  0  0  1  0 ]             [  0  0  −1  i  1  0 ]
        [  1   i  0  0  0  1 ]             [  0  0   0  0  1  1 ]
Therefore the set {−1 − x2 + x3 , x − ix2 + ix3 , 1, x2 } is a basis for P3 (C).
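A sketch of the same computation with sympy: eye(4) supplies the coordinate vectors of the standard basis, and the reported pivot columns tell us which vectors to keep.

    from sympy import Matrix, I, eye

    U = Matrix([[-1,  0],
                [ 0,  1],
                [-1, -I],
                [ 1,  I]])        # coordinates of u1, u2 with respect to B
    A = U.row_join(eye(4))        # append [1]_B, [x]_B, [x^2]_B, [x^3]_B
    _, pivots = A.rref()
    print(pivots)                 # (0, 1, 2, 4): keep u1, u2, 1 and x^2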


Algorithm 17.9: To write a vector as a linear combination of a set S

Let V be an n-dimensional vector space over a field F and let B be a basis for V .
Let S = {u1 , . . . , uk } ⊆ V and let w ∈ V . To write w as a linear combination of the elements in S:

1. Form the matrix A = [ [u1 ]B · · · [uk ]B [w]B ]

2. Reduce to row echelon form

3. If the last column is a pivot column, then w ∉ span(S)

4. Otherwise, continue to reduced row echelon form R. The entries in the last column of R
give the coefficients in the linear combination. (See the example below.)

Example 17.10. Consider the set S = {−1 − x2 + x3 , x − ix2 + ix3 } ⊆ P3 (C). Let’s try to write the
element w = 1 + x + x2 + x3 as a linear combination of these two elements:
    [ −1   0  1 ]             [ −1  0  1 ]
    [  0   1  1 ]  ∼ · · · ∼  [  0  1  1 ]
    [ −1  −i  1 ]             [  0  0  i ]
    [  1   i  1 ]             [  0  0  0 ]

From which we conclude that w ∉ span(S). If we consider instead the vector v = −1 + ix we get

    [ −1   0  −1 ]             [ 1  0  1 ]
    [  0   1   i ]  ∼ · · · ∼  [ 0  1  i ]
    [ −1  −i   0 ]             [ 0  0  0 ]
    [  1   i   0 ]             [ 0  0  0 ]

and we see that v = u1 + iu2 , where u1 = −1 − x2 + x3 , u2 = x − ix2 + ix3 .
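Both parts of this example amount to solving the linear system U a = [v]B , where the columns of U are [u1 ]B and [u2 ]B . A sympy sketch; linsolve returns the coefficients when they exist and the empty set otherwise:

    from sympy import Matrix, I, linsolve

    U = Matrix([[-1,  0],
                [ 0,  1],
                [-1, -I],
                [ 1,  I]])             # columns: [u1]_B and [u2]_B
    w = Matrix([1, 1, 1, 1])           # coordinates of 1 + x + x^2 + x^3
    v = Matrix([-1, I, 0, 0])          # coordinates of -1 + ix
    print(linsolve((U, w)))            # EmptySet: w is not in span(S)
    print(linsolve((U, v)))            # {(1, I)}: v = u1 + i*u2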

17.1 Exercises
140. Let W = span(S) where S = { [1 −2; 4 1], [2 −3; 9 −1], [1 0; 6 −5], [2 −5; 7 5] } ⊆ M2,2 (R).

(a) Find a subset T of S that forms a basis for W .


(b) Write each element of S \ T as a linear combination of the elements of T .
(c) Find a basis for M2,2 (R) that contains T .

141. (a) Show that B = {(1, 1, 1), (1, 1, 0), (1, 0, 0)} is a basis for R3 .
(b) Find the coordinates of (4, −3, 2) ∈ R3 relative to B.

142. (a) Show that the set {(1 − i, i), (2, −1 + i)} is linearly dependent in C2
(b) Now consider the vector space V with underlying set C2 , but with R as the field of scalars.
Show that the set {(1 − i, i), (2, −1 + i)} is linearly independent in V
143. Let

    A = [ 1  3  −2  5   4 ]
        [ 1  4   1  3   5 ]  ∈ M4,5 (R)
        [ 1  4   2  4   3 ]
        [ 2  7  −3  6  13 ]

(a) Find a basis for the solution space of A (as a subspace of R5 ).


(b) Extend to a basis of R5 .
(c) Find a basis for the column space of A (as a subspace of R4 ).

144. Which of the following are linear combinations of (0, −2, 2) and (1, 3, −1)?


(a) (2, 2, 2) (b) (0, 4, 5)

145. Let u = (1, 0, −1) and v = (−2, 1, 1).


(a) Write (−1, 2, −1) as a linear combination of u and v.
(b) Show that (−1, 1, 1) cannot be written as a linear combination of u and v.
(c) For what value of c is the vector (1, 1, c) a linear combination of u and v?
146. Is [6 −8; −1 −8] a linear combination of the matrices [4 0; −2 −2], [1 −1; 2 3], [0 2; 1 4]?
147. Express the polynomial −9−7x−15x2 as a linear combination of p1 = 2+x+4x2 , p2 = 1−x+3x2 ,
and p3 = 3 + 2x + 5x2 .
148. Show that the vectors (1, a, a2 ), (1, b, b2 ), (1, c, c2 ) are linearly independent if a, b, c are distinct (i.e., a ≠ b, a ≠ c and b ≠ c).
149. In this question let S = {v1 , v2 , v3 , v4 , v5 } ⊆ R3 and let A be the 3×5 matrix with the i-th column
given by the vector vi . Suppose that the reduced row echelon form of A is
 
1 2 0 −1 0
0 0 1 3 0
0 0 0 0 1
Are the following sets linearly dependent or independent? If linearly dependent, express one
vector as a linear combination of the others.

(a) {v1 , v2 , v3 } (b) {v1 , v3 , v4 } (c) {v1 , v4 , v5 } (d) {v3 , v4 , v5 }

150. In each part explain why the given statement is true “by inspection.”
(a) The set {(1, 0, 3), (−1, 1, 0), (1, 2, 4), (0, −1, −2)} is linearly dependent.
(b) The set {(1, −1, 2), (0, 1, 1)} does not span R3 .
(c) If the set {v1 , v2 , v3 , v4 } of vectors in R4 is linearly independent, then it spans R4 .
(d) The set {(0, 1, −1, 0), (0, −1, 2, 0)} is linearly independent, and so it spans the subspace of
R4 of all vectors of the form (0, a, b, 0).
151. In each part determine whether or not the given set forms a basis for the indicated subspace.
(a) {(1, 2, 3), (−1, 0, 1), (0, 1, 2)} for R3
(b) {(−1, 1, 2), (3, 3, 1), (1, 2, 2)} for R3
(c) {(1, −1, 0), (0, 1, −1)} for the subspace of R3 consisting of all (x, y, z) such that x+y+z = 0.
(d) {(1, 1, 0), (1, 1, 1)} for the subspace of R3 consisting of all (x, y, z) such that y = x + z.
152. Which of the following sets of vectors are bases for R3 ?

(a) {(1, 0, 0), (2, 2, 0), (3, 3, 3)}


(b) {(2, −3, 1), (4, 1, 1), (0, −7, 1)}

153. Find a basis for and the dimension of the subspace of Rn spanned by the following sets.
(a) {(0, 1, −2), (3, 0, 1), (3, 2, −3)} (n = 3)
(b) {(1, 3), (−1, 2), (7, 6)} (n = 2)
(c) {(−1, 2, 0, 4), (3, 1, −1, 2), (−5, 3, 1, 6), (7, 0, −2, 0)} (n = 4)
154. For each of the following sets choose a subset that is a basis for the subspace spanned by the set.
Then express each vector that is not in the basis as a linear combination of the basis vectors.
(a) {(1, 2, 0, −1), (2, −1, 2, 3), (−1, −11, 6, 13), (4, 3, 2, 1)} ⊆ Q4
(b) {(0, −1, −3, 3), (−1, −1, −3, 2), (3, 1, 3, 0), (0, −1, −2, 1)} ⊆ Q4
(c) {(1, 2, −1), (0, 3, 4), (2, 1, −6), (0, 0, 2)} ⊆ Q3



LECTURE 18

Linear error-correcting codes

When storing or transmitting data (on a disk, over the internet etc) errors are often introduced. Er-
rors might result, for example, from physical damage or radiation. In a memory chip, background
radiation can alter the memory contents. We would like to be able to protect against this, and reduce
the risk of using corrupted data.
How can we encode data in order to detect and perhaps correct errors in transmission or storage?
The study of such problems is known as Coding Theory.

18.1 Codes

A key idea is to build in redundancy before sending or storing the data. A simple way to do this is by
repetition. If we wanted to send the message ‘1011’, we could send each bit twice and send ‘11001111’.
If the message ‘11011111’ were received we would know that there had been some corruption of the
message. In this example the sent message is made up of combinations of 00 and 11.
By repeating each bit three times we would even be able to correct errors:

message: 1011 sent: 111000111111 received: 111010111111

We would know that there had been some interference and that the original message was (most probably) 1011, since 010 is ‘closer’ to 000 than to 111.

Definition 18.1

Let A be a finite set. We will refer to A as the alphabet. A code over A is a non-empty subset of
An . The number n is called the length of the code. The elements of a code are called codewords.

Example 18.2. 1. The set {(c, a, t), (d, o, g), (p, i, g), (a, b, c), (l, d, r)} is a code of length 3 over the
alphabet A = {a, b, c, . . . , z}.
2. {(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)} is a code of length 3 over F2 = {0, 1}

Remark. When considering codes it is convenient to drop the commas and parentheses. So the exam-
ples above would be written as {cat, dog, pig, abc, ldr} and {000, 011, 101, 110}

18.2 Linear codes

Definition 18.3

A linear code of length n and rank k is a k-dimensional subspace of Fnp (for some prime p ∈ N).
The code is called a binary linear code if p = 2, and it is a ternary linear code in the case p = 3.

Example 18.4. For the above repetition code, our codewords were 000 and 111. These two elements
together form a subspace of F32 so it is a linear code.

Example 18.5. {(a, b, c) | a, b, c ∈ F2 and a + b + c = 0} = {000, 011, 101, 110} is a subspace of F32 , and
so is a binary linear code. This code could be used as follows.

    original | codeword
    00       | 000
    01       | 101
    10       | 110
    11       | 011

If we receive a word abc we can check whether it is a codeword by calculating (remembering that entries are in F2 ) whether [1 1 1] (a, b, c)^T = 0. If it is a codeword, we know that the intended (original) message was bc.

Definition 18.6

A check matrix for a linear code C is a matrix H such that C is the solution space of H.

Example 18.7. (The Hamming Code H2 (3))


Consider the matrix

    H = [ 0 0 0 1 1 1 1 ]
        [ 0 1 1 0 0 1 1 ]  ∈ M3,7 (F2 )
        [ 1 0 1 0 1 0 1 ]

We take as codewords all the elements in the solution space of H. A basis for the solution space is given by

    { (1, 1, 1, 0, 0, 0, 0)^T , (1, 0, 0, 1, 1, 0, 0)^T , (0, 1, 0, 1, 0, 1, 0)^T , (1, 1, 0, 1, 0, 0, 1)^T }

We then send abcd ∈ F42 using the codeword

    (x1 , . . . , x7 )^T = a(1, 1, 1, 0, 0, 0, 0)^T + b(1, 0, 0, 1, 1, 0, 0)^T + c(0, 1, 0, 1, 0, 1, 0)^T + d(1, 1, 0, 1, 0, 0, 1)^T
                         = [1 1 0 1; 1 0 1 1; 1 0 0 0; 0 1 1 1; 0 1 0 0; 0 0 1 0; 0 0 0 1] (a, b, c, d)^T
                         = (a + b + d, a + c + d, a, b + c + d, b, c, d)^T

We have 4 parameters (called information bits in this setting) which can be chosen arbitrarily. (In
this case they are the 3rd, 5th, 6th and 7th bits.) The other 3 bits are called check bits. (The 1st, 2nd
and 4th bits.)
If we want to send the message abcd ∈ F42 , we calculate the codeword [ a+b+d a+c+d a b+c+d b c d ] and
send it. When a message is received, we check that it is a codeword by multiplication by H.
For example, suppose we wanted to send 1011 (i.e., a = 1, b = 0, c = 1, d = 1).
Given the specified encoding, the codeword for this is 0110011. We send the codeword 0110011
Suppose some interference occurs and the received word is v = 0100011
The receiver knows that an error has occurred because

    Hv^T = H (0, 1, 0, 0, 0, 1, 1)^T = (0, 1, 1)^T ≠ 0


In fact, since Hv^T = (0, 1, 1)^T , the receiver knows that the error occurred in the third bit.

Assuming that a single error has occurred, the (column) matrix Hv^T is equal to a column of H. If it is the nth column, the error occurred in the nth bit.

She therefore knows that to get the intended codeword the third bit should be swapped from 0 to 1
— giving 0110011

The original message is then recovered by dropping the check bits (1st, 2nd and 4th) — giving 1011
This code gives a way of correcting a single error, which we explore below.
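The whole encode–check–correct cycle is easy to simulate. A Python sketch (numpy, with all arithmetic taken mod 2); the helper names encode and decode are ours, chosen for the sketch:

    import numpy as np

    H = np.array([[0, 0, 0, 1, 1, 1, 1],
                  [0, 1, 1, 0, 0, 1, 1],
                  [1, 0, 1, 0, 1, 0, 1]])

    def encode(a, b, c, d):
        # codeword (a+b+d, a+c+d, a, b+c+d, b, c, d) over F2
        return np.array([a + b + d, a + c + d, a, b + c + d, b, c, d]) % 2

    def decode(v):
        s = H @ v % 2                      # the syndrome H v^T
        pos = 4 * s[0] + 2 * s[1] + s[2]   # column j of H is j written in binary
        if pos:                            # non-zero syndrome: flip the offending bit
            v = v.copy()
            v[pos - 1] ^= 1
        return v[[2, 4, 5, 6]]             # recover the information bits a, b, c, d

    v = encode(1, 0, 1, 1)                 # [0 1 1 0 0 1 1]
    v[2] ^= 1                              # interference: third bit flipped
    print(decode(v))                       # [1 0 1 1]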

Exercise 155. Using the above encoding, recover the original messages from the received words:
0110001, 1110000, 0000011

18.3 Hamming distance

When an error occurs, we would like to be able to correct the error, by deciding what the original
message was. In the above triple repetition code, when the non codeword 010 was received we
deduced that the (probable) intended codeword was 000, since this is ‘closer’ to 010 than is the other
codeword 111. We saw another example in the previous section.
In correcting an error, we are assuming that the original codeword (before interference) is the code-
word closest to the received word. This is called the nearest neighbour principle.

Definition 18.8

The Hamming distance between two strings u = a1 · · · an , v = b1 · · · bn ∈ An is the number of


places in which they differ. It is denoted d(u, v).

Example 18.9. So, d(010, 000) = 1 and d(010, 111) = 2

For a code C, an important property is the minimum distance between codewords.

Definition 18.10

The minimum distance of a code C is dmin = min{d(u, v) | u, v ∈ C, u ≠ v}

Proposition 18.11: Nearest neighbour principle

Let C be a code with minimum distance dmin between codewords. Then C can be used to

1. detect up to dmin − 1 errors and


2. correct up to ⌊(dmin − 1)/2⌋ errors.

In general, to find the minimum distance between codewords one needs to calculate the distance be-
tween every pair of codewords (and then take the minimum). However, for linear codes calculating


the minimum distance is simpler than for general codes.

Definition 18.12

The weight w(u) of a word u = a1 · · · an ∈ Fnp is the number of non-zero coordinates, that is,
w(u) = d(u, 0).

Lemma 18.13

For a linear code, the minimum distance between codewords is equal to the smallest non-zero
weight.

Exercise 156. Write out a proof of the above lemma.

Example 18.14. The linear code {0000, 1011, 0110, 1101} ⊂ F42 has minimum weight 2, and hence
minimum distance 2.
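For a code this small, the minimum weight (and hence, by Lemma 18.13, the minimum distance) can be found by brute force. A one-line Python sketch:

    C = ['0000', '1011', '0110', '1101']
    print(min(w.count('1') for w in C if w != '0000'))   # 2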

Suppose a linear code is defined by a check matrix H. How can we calculate the minimum weight
without having to list all the words in the code?

Lemma 18.15: Minimum distance of a linear code

1. For a binary linear code defined by a check matrix H, the minimum weight is the smallest
number r > 0 such that r columns of H sum to zero.

2. More generally, let H ∈ Mm,n (Fp ) be the check matrix for a linear code C. Then the mini-
mum weight of C is equal to the size of the smallest linearly dependent set of columns of
H.

Exercise 157. Prove the above lemma.

18.4 Exercises

158. Determine the minimum distance dmin between codewords for each of the following binary
codes:

(a) {1000, 1011, 0100}


(b) {000000, 101010, 010101, 111001, 011110}
(c) For the code in (b), find the distance of each of the following received words from the
codewords, and hence decode each of the received words using the nearest neighbour
principle: (i) 100010 (ii) 000101 (iii) 000110

159. What is the smallest minimum distance that a code must have in order to correct two errors?
How many errors will it detect?

160. Determine whether each of the following sets of codewords forms a binary linear code.

(a) {000, 110, 100}


(b) {000, 100, 011, 111}
(c) {00000, 01110, 10111, 11001}

161. Verify that each of the following sets gives a binary linear code, and find the minimum distance.


(a) {00000000, 10101010, 01010101, 11111111}


(b) {00000, 00111, 01011, 01100, 10011, 10100, 11000, 11111}

162. Show that

    H = [ 1 0 0 1 0 ]
        [ 0 1 0 1 1 ]
        [ 0 0 1 1 0 ]
cannot be used as the check matrix for a single-error-correcting code by

(a) writing down a codeword of weight 2, and


(b) determining two non-zero codewords which are distance 2 apart.

163. Consider the binary linear code with check matrix


    H = [ 1 0 0 1 1 0 ]
        [ 0 1 0 0 1 1 ]
        [ 0 0 1 1 1 1 ]

(a) Decode the received words 111001, 110111.


(b) Show that if 111111 is received then more than one error has occurred in transmission.
Find two possible codewords which could have been transmitted with two errors occur-
ring, and a codeword which could have been transmitted with three errors.

164. Let

    H = [ 0 0 0 0 0 0 0 1 1 1 ]
        [ 0 0 0 1 1 1 1 0 0 0 ]  ∈ M4,10 (F2 ).
        [ 0 1 1 0 0 1 1 0 0 1 ]
        [ 1 0 1 0 1 0 1 0 1 0 ]
This check matrix defines a linear code whose information bits (i.e., parameters) are the 3rd,
5th, 6th, 7th, 9th, and 10th.

(a) Encode the information messages (i) 101101, (ii) 001011.


(b) Decode the received words (i) 0001110101, (ii) 0000101100, (iii) 1011110111, giving if pos-
sible the information messages which were sent.

165. Let

    H = [ 0 0 0 0 1 1 1 1 1 1 1 1 1 ]
        [ 0 1 1 1 0 0 0 1 1 1 2 2 2 ]  ∈ M3,13 (F3 )
        [ 1 0 1 2 0 1 2 0 1 2 0 1 2 ]
Each codeword has 10 information bits and 3 check bits (1st, 2nd and 5th)

(a) Prove that this code has minimum distance 3.


(b) The word w = 0121002011001 is received. By calculating HwT , show that an error has
occurred.
(c) Assuming that only one error has occurred, find the intended codeword.
(d) What was the original 10-digit message?


Extra material for lecture 18

• References for further reading about codes

Introduction to coding and information theory, by Steven Roman


Coding Theory : A First Course, by San Ling and Chaoping Xing
Mathematics in computing, by Gerard O’Regan, Chapter 9.



LECTURE 19

Linear transformations

We next consider functions between vector spaces that preserve the vector space structure (addition
and scalar multiplication). Such linear transformations arise in many applications and are a central
tool in the study of vector spaces.

19.1 Definition of linear transformation

Definition 19.1: Linear transformation

Let V and W be vector spaces over the same field of scalars F. A linear transformation from V
to W is a function T : V → W that satisfies the following properties:

1. ∀u, v ∈ V , T (u + v) = T (u) + T (v) (T preserves vector addition)

2. ∀u ∈ V ∀α ∈ F, T (αu) = αT (u) (T preserves scalar multiplication)

Remark.

1. Properties 1 and 2 together are equivalent to the single condition:

∀u, v ∈ V ∀α ∈ F, T (u + αv) = T (u) + αT (v)

2. The condition T (αu) = αT (u) implies that ‘lines are mapped to lines (or points)’

Examples 19.2.

1. Let T : R3 → R3 be given by T (x, y, z) = (0, 2x + z, −y). Then T is a linear transformation, since

T ((a, b, c) + (x, y, z)) = T (a + x, b + y, c + z)


= (0, 2(a + x) + (c + z), −(b + y)) = (0, 2a + c, −b) + (0, 2x + z, −y)
= T (a, b, c) + T (x, y, z)

and

T (α(x, y, z)) = T (αx, αy, αz) = (0, 2αx + αz, −αy) = α(0, 2x + z, −y) = αT (x, y, z)

2. The map D : P3 (C) → P2 (C), D(p(x)) = dp/dx, is a linear transformation

3. Many familiar geometric transformations of the plane R2 are linear, such as: reflection across a
line through the origin, rotation about the origin, projection onto a line through the origin.

Exercise 166. Let T : V → W be a linear transformation. Show that


(a) T (0V ) = 0W    (b) ∀v ∈ V , T (−v) = −T (v)

Example 19.3 (Translation is not a linear transformation). The function τ : R2 → R2 given by τ (x, y) = (x + 1, y) is not a linear transformation since τ (0) ≠ 0. More generally, let n ∈ N and t ∈ Rn \ {0}. The function τ : Rn → Rn given by τ (u) = u + t is not a linear transformation.

Exercise 167. Let T : V → W be a linear transformation.

(a) Suppose that U ≤ V is a subspace of V . Prove that the image of U , T (U ) = {T (u) | u ∈ U }, is a subspace of W .

(b) Suppose that U ≤ W is a subspace of W . Prove that the pre-image of U , T −1 (U ) = {v ∈ V | T (v) ∈ U }, is a subspace of V .
(Note that the notation T −1 (U ) for the pre-image does not mean that we are assuming that the linear transformation is invertible.)

19.2 The linear transformations determined by a matrix

There is a natural way to use matrices to define linear transformations. In fact, as we will see later, all
linear transformations (between finite-dimensional vector spaces) can be represented by matrices.

Lemma 19.4: linear transformation given by a matrix

Let A ∈ Mm,n (F). The function T : Mn,1 (F) → Mm,1 (F) given by

    T ((x1 , . . . , xn )^T ) = A (x1 , . . . , xn )^T

is a linear transformation.

Proof. We need to show that given any u, v ∈ Mn,1 (F) and α ∈ F we have T (u + v) = T (u) + T (v)
and T (αu) = αT (u).

T (u + v) = A(u + v) = Au + Av (matrix multiplication is distributive)


= T (u) + T (v)
T (αu) = A(αu) = αA(u) (property of matrix multiplication)
= αT (u)

Examples 19.5 (Some ‘geometric’ transformations of the Euclidean plane).


Using coordinates with respect to the standard basis, we identify M2,1 (R) with R2 in the following.

1. The linear transformation R2 → R2 given by the matrix [−1 0; 0 1] is reflection across the y-axis: T (x, y) = (−x, y).

2. The linear transformation R2 → R2 given by the matrix [cos θ −sin θ; sin θ cos θ] is rotation by θ radians (anticlockwise) about the origin.

3. The linear transformation R2 → R2 given by the matrix [k 0; 0 1] is dilation by factor k > 0 along the x-axis: T (x, y) = (kx, y).

4. The linear transformation R2 → R2 given by the matrix [0 0; 0 1] is orthogonal projection onto the y-axis: T (x, y) = (0, y).

5. The linear transformation R2 → R2 given by the matrix (1/2)[1 1; 1 1] is orthogonal projection onto the line y = x: T (x, y) = ((x + y)/2, (x + y)/2).
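A numerical sketch of item 2 (using numpy): multiplying by the rotation matrix sends (1, 0) to (cos θ, sin θ).

    import numpy as np

    theta = np.pi / 2
    A = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    print(A @ np.array([1.0, 0.0]))   # approximately [0, 1]: (1,0) rotated anticlockwise by 90 degrees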

19.3 Linear transformations and bases

Linear transformations are completely determined by their effect on a spanning set.

Proposition 19.6

Let T1 , T2 : V → W be two linear transformations and let S ⊆ V be a spanning set for V .


If T1 (u) = T2 (u) for all u ∈ S, then T1 = T2 .

Proof. We need to show that ∀v ∈ V , T1 (v) = T2 (v). Let v ∈ V . Then, since S is a spanning set for V , there exist αi ∈ F and ui ∈ S such that v = α1 u1 + · · · + αk uk . Then we have

    T1 (v) = T1 (α1 u1 + · · · + αk uk )
           = α1 T1 (u1 ) + · · · + αk T1 (uk )    (T1 is a linear transformation)
           = α1 T2 (u1 ) + · · · + αk T2 (uk )    (T1 (ui ) = T2 (ui ))
           = T2 (α1 u1 + · · · + αk uk )          (T2 is a linear transformation)
           = T2 (v)

The next result says that a linear transformation can be defined by choosing images for the elements
of a basis.

Theorem 19.7

Let V and W be vector spaces over the same field F. Let B ⊆ V be a basis for V .
Given any function f : B → W , there exists a unique linear transformation T : V → W having
the property that T (b) = f (b) for all b ∈ B.


Proof. We need to show that there exists a linear transformation with the property that T (bi ) = f (bi )
for all i, and that if two linear transformations each have this property, then they are equal.
To establish the existence part of the statement we define a function T : V → W as follows. Given
u ∈ V , we have u = α1 b1 + · · · + αn bn for uniquely determined αi ∈ F and bi ∈ B (Lemma 14.3).
We define T (u) = α1 f (b1 ) + · · · + αn f (bn ). To see that this gives a linear transformation, let u, v ∈ V
and α ∈ F . Then there are b1 , . . . , bn ∈ B and αi , βi ∈ F such that u = α1 b1 + · · · + αn bn and
v = β1 b1 + · · · + βn bn . Then

T (u + v) = T (α1 b1 + · · · + αn bn + β1 b1 + · · · + βn bn )
= T ((α1 + β1 )b1 + · · · + (αn + βn )bn )
= (α1 + β1 )f (b1 ) + · · · + (αn + βn )f (bn ) (definition of T )
= α1 f (b1 ) + · · · + αn f (bn ) + β1 f (b1 ) + · · · + βn f (bn )
= T (α1 b1 + · · · + αn bn ) + T (β1 b1 + · · · + βn bn ) (definition of T )
= T (u) + T (v)

and

T (αu) = T (α(α1 b1 + · · · + αn bn ))
= T ((αα1 )b1 + · · · + (ααn )bn )
= (αα1 )f (b1 ) + · · · + (ααn )f (bn ) (definition of T )
= α(α1 f (b1 ) + · · · + αn f (bn ))
= αT (α1 b1 + · · · + αn bn ) (definition of T )
= αT (u)

The uniqueness part of the theorem is an immediate consequence of Proposition 19.6.

19.4 Exercises

168. Show that each of the following maps is a linear transformation:

(a) S : R2 → R2 , S(x, y) = (2x − y, x + y)


(b) T : R3 → M2,2 (R) given by T (x, y, z) = [y z; −x 0]

169. Determine whether or not the given map is a linear transformation, and justify your answer.

(a) F : R3 → R2 , F (x, y, z) = (0, 2x + y)


(b) K : R2 → R3 , K(x, y) = (x, sin y, 2x + y)

170. Let v1 , v2 , and v3 be vectors in a vector space V and T : V → R3 a linear transformation for
which T (v1 ) = (1, −1, 2), T (v2 ) = (0, 3, 2), and T (v3 ) = (−3, 1, 2). Find T (2v1 − 3v2 + 4v3 ).

171. For the linear transformations of R2 into R2 given by the following matrices:

(i) Sketch the image of the rectangle with vertices (0, 0), (2, 0), (0, 1), (2, 1).
(ii) Describe the geometric effect of the linear transformation.
(a) [0 1; 1 0]    (b) [0 0; 1 0]    (c) [1 1; 0 0]    (d) [1 0; a 1]    (e) [b 0; 0 c]    (f) (1/5)[3 −4; 4 3]


172. Show that there is no line through the origin in R2 that is invariant under the transformation
determined by the matrix  
cos θ − sin θ
A(θ) =
sin θ cos θ
when θ is not an integral multiple of π. Give a geometric interpretation of this observation
commenting on the case when θ = kπ for some k ∈ Z.

173. Let V and W be two vector spaces over a field F. Let S ⊆ V be non-empty and linearly in-
dependent. Use Theorem 19.7 to show that for all functions f : S → W , there exists a linear
transformation T : V → W with the property that T (v) = f (v) for all v ∈ S.


Extra material for lecture 19

 Hom(V, W )
Let V and W be vector spaces over the same field F. The set of all linear transformations from
V to W is itself a vector space over F when given the usual ‘pointwise’ operations.

(S + T )(v) = S(v) + T (v)


(kT )(v) = kT (v)

This vector space is denoted Hom(V, W ).



LECTURE 20

Matrix representations of linear transformations

We saw in Lemma 19.4 that matrices can be used to define linear transformations. In fact, any linear
transformation can be represented by a matrix. Just as the coordinate matrix of a vector depends on
a choice of basis, the matrix of a linear transformation depends on a choice of basis for each of the
domain and codomain.

20.1 Matrix of a linear transformation

Let V, W be finite dimensional vector spaces with the same scalars F and let T : V → W be a linear
transformation. Let B = {b1 , b2 , . . . , bn } be an ordered basis for V and C = {c1 , c2 , . . . , cm } be an
ordered basis for W . Then T (bi ) ∈ W for each i = 1, . . . , n and we can therefore write T (bi ) uniquely
as a linear combination of the basis vectors in C.

T (b1 ) = α11 c1 + α21 c2 + . . . + αm1 cm


T (b2 ) = α12 c1 + α22 c2 + . . . + αm2 cm
.. .. .. .. ..
. . . . .
T (bn ) = α1n c1 + α2n c2 + . . . + αmn cm

Definition 20.1: matrix of a linear transformation

We form a matrix [T ]C,B ∈ Mm,n (F) by defining [T ]C,B = (αij ). This matrix is called the matrix
of T with respect to B and C.

Note. The i-th column of [T ]C,B is given by [T (bi )]C . That is,

[T ]C,B = [ [T (b1 )]C · · · [T (bn )]C ]

Remark. In the case in which V = W and B = C, we sometimes write [T ]B in place of [T ]B,B .

The way in which this matrix represents the linear transformation is given by the following.

Lemma 20.2

Let V and W be finite dimensional vector spaces with bases B and C respectively. Let T : V → W
be a linear transformation. Then

∀u ∈ V, [T (u)]C = [T ]C,B [u]B

Proof. Let B = {b1 , b2 , . . . , bn } and C = {c1 , c2 , . . . , cm }. Let u ∈ V . Then u = α1 b1 + · · · + αn bn for


some αi ∈ F. We have

u = α1 b1 + · · · + αn bn
T (u) = α1 T (b1 ) + · · · + αn T (bn ) (T is linear)
[T (u)]C = α1 [T (b1 )]C + · · · + αn [T (bn )]C (Lemma 15.4) (∗)

and
[T ]C,B [u]B = [ [T (b1 )]C · · · [T (bn )]C ] (α1 , . . . , αn )T
             = α1 [T (b1 )]C + · · · + αn [T (bn )]C    (see the remark on page 16-1)
             = [T (u)]C    (from (∗) above)

In summary, we have the commuting square

    u ∈ V   ──(apply T )──→   T (u) ∈ W
      │                           │
   take coords                take coords
      ↓                           ↓
    [u]B  ──(mult by [T ]C,B )──→  [T (u)]C
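This square is easy to check numerically. Below is a minimal Python sketch (assuming the numpy library is available; code is not part of the subject material), using the transformation and bases of Example 20.4 below, with C taken to be the standard basis.

    import numpy as np

    A = np.array([[1.0, 4.0], [1.0, 1.0]])       # T(x, y) = (x + 4y, x + y) in standard coordinates
    B_mat = np.array([[2.0, 2.0], [-1.0, 1.0]])  # columns: the basis B = {(2, -1), (2, 1)}
    C_mat = np.eye(2)                            # columns: the basis C (here the standard basis)

    T_CB = np.linalg.solve(C_mat, A @ B_mat)     # column i is [T(b_i)]_C, so this is [T]_{C,B}

    u = np.array([2.0, -4.0])                    # a test vector
    u_B = np.linalg.solve(B_mat, u)              # [u]_B
    Tu_C = np.linalg.solve(C_mat, A @ u)         # [T(u)]_C

    assert np.allclose(T_CB @ u_B, Tu_C)         # Lemma 20.2: [T(u)]_C = [T]_{C,B} [u]_B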

Exercise 174. Suppose that A ∈ Mm,n (F) is such that A[u]B = [T (u)]C for all u ∈ V . Show that
A = [T ]C,B . (Hint: Show that the i-th column of A is equal to [T (bi )]C .)

Example 20.3. Consider the linear transformation T : M2,2 (R) → M2,2 (R) where T is defined by

T (A) = AT

Let’s calculate the matrix representation of T with respect to the basis S = {[ 1 0 ; 0 0 ], [ 0 1 ; 0 0 ], [ 0 0 ; 1 0 ], [ 0 0 ; 0 1 ]} (for both domain and codomain).

[T [ 1 0 ; 0 0 ]]S = [[ 1 0 ; 0 0 ]]S = (1, 0, 0, 0)T        [T [ 0 1 ; 0 0 ]]S = [[ 0 0 ; 1 0 ]]S = (0, 0, 1, 0)T
[T [ 0 0 ; 1 0 ]]S = [[ 0 1 ; 0 0 ]]S = (0, 1, 0, 0)T        [T [ 0 0 ; 0 1 ]]S = [[ 0 0 ; 0 1 ]]S = (0, 0, 0, 1)T

Therefore [T ]S,S = [ 1 0 0 0 ; 0 0 1 0 ; 0 1 0 0 ; 0 0 0 1 ]

Exercise 175. With T as above and C = {[ 1 0 ; 0 0 ], [ 0 1 ; 1 0 ], [ 0 1 ; −1 0 ], [ 0 0 ; 0 1 ]}, calculate [T ]C,C .

Example 20.4. Let T : R2 → R2 be the linear transformation given by T (x, y) = (x + 4y, x + y). Let S = {(1, 0), (0, 1)} and B = {(2, −1), (2, 1)}. Then

[T ]S,S = [ 1 4 ; 1 1 ],   [T ]B,B = [ −1 0 ; 0 3 ]   and   [T ]S,B = [ −2 6 ; 1 3 ]

Exercise 176. A linear transformation T : R3 → R2 has matrix [T ] = [ 5 1 −2 ; 1 5 0 ] with respect to the standard bases of R3 and R2 . What is the matrix of T with respect to the bases B = {(1, 1, 0), (1, −1, 0), (1, −1, −2)} of R3 and C = {(1, 1), (1, −1)} of R2 ?

Lemma 20.5

Let U, V, W be finite dimensional vector spaces with bases A, B, C respectively. Let S : U → V


and T : V → W be linear transformations. Then

[T ◦ S]C,A = [T ]C,B [S]B,A


Proof. Let u ∈ U .

[T ◦ S]C,A [u]A = [T ◦ S(u)]C Lemma 20.2


= [T (S(u))]C
= [T ]C,B [S(u)]B Lemma 20.2
= ([T ]C,B [S]B,A )[u]A Lemma 20.2

Since [T ◦ S]C,A [u]A = ([T ]C,B [S]B,A )[u]A for all u ∈ U , we must have that [T ◦ S]C,A = [T ]C,B [S]B,A

Example 20.6. With the linear transformation T : R2 → R2 and bases S and B of Example 20.4, we
have
[T 2 ]S,S = [T ]S,S [T ]S,S = [ 1 4 ; 1 1 ][ 1 4 ; 1 1 ] = [ 5 8 ; 2 5 ]
[T 2 ]B,B = [T ]B,B [T ]B,B = [ −1 0 ; 0 3 ][ −1 0 ; 0 3 ] = [ 1 0 ; 0 9 ]
[T 2 ]S,B = [T ]S,S [T ]S,B = [ 1 4 ; 1 1 ][ −2 6 ; 1 3 ] = [ 2 18 ; −1 9 ]
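Lemma 20.5 makes such products easy to check by machine; a minimal sketch (assuming numpy):

    import numpy as np

    T_SS = np.array([[1, 4], [1, 1]])    # [T]_{S,S} from Example 20.4
    T_SB = np.array([[-2, 6], [1, 3]])   # [T]_{S,B} from Example 20.4

    # Lemma 20.5 with U = V = W = R^2: [T∘T]_{S,B} = [T]_{S,S} [T]_{S,B}
    print(T_SS @ T_SB)                   # [[ 2 18] [-1  9]], agreeing with [T^2]_{S,B} above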

20.2 Kernel and image of a linear transformation

Definition 20.7: Kernel and Image

Let T : V → W be a linear transformation. The kernel (or nullspace) of T is defined to be


ker(T ) = {u ∈ V | T (u) = 0}

The image of T is defined to be

im(T ) = T (V ) = {w ∈ W | w = T (u) for some u ∈ V }

Note. From Exercise 167 we know that ker(T ) 6 V and im(T ) 6 W .

Definition 20.8: rank and nullity of a linear transformation

The dimension of ker(T ) is called the nullity of T and is denoted nullity(T ). The dimension of
im(T ) is called the rank of T and is denoted rank(T ).

Example 20.9. Consider the linear transformation T : P3 (R) → P2 (R) given by differentiation. Then
ker(T ) = span{1} and im(T ) = P2 (R). Hence nullity(T ) = 1 and rank(T ) = 3.

Lemma 20.10

Let T : V → W be a linear transformation. Then T is injective if and only if ker(T ) = {0}.

Proof. Suppose first that T is injective. Then we have

u ∈ ker(T ) ⇐⇒ T (u) = 0 ⇐⇒ T (u) = T (0) ⇐⇒ u = 0

Therefore ker(T ) = {0}.

Now, conversely, suppose that ker(T ) = {0}. For u, v ∈ V we have

T (u) = T (v) =⇒ T (u) − T (v) = 0 =⇒ T (u − v) = 0 =⇒ u − v ∈ ker(T ) =⇒ u − v = 0 =⇒ u = v

Therefore T is injective.

Exercise 177. Let T : V → W be a linear transformation.

(a) Let X ⊆ V be a linearly independent subset of the domain. Show that if T is injective, then
T (X) is linearly independent.

(b) Let Y ⊆ V be a spanning set for the domain. Show that T (Y ) is a spanning set for the image
im(T ).

(c) Use parts (a) and (b) to show that if B is a basis for V and T is injective, then T (B) is a basis for
im(T ).

20.3 Rank-nullity theorem

If both the domain and codomain are finite dimensional, the kernel and image of a linear transfor-
mation T can be calculated from a matrix representation of T :

u ∈ ker(T ) ⇐⇒ [u]B ∈ Solution space of [T ]C,B


w ∈ im(T ) ⇐⇒ [w]C ∈ colspace([T ]C,B )

Therefore,

nullity(T ) = dim(solspace([T ]C,B ))


rank(T ) = rank([T ]C,B )
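These dimensions are exactly what a computer algebra system computes from a matrix representation; a minimal sketch (assuming the sympy library), using the matrix [T ] of Exercise 187 below:

    import sympy as sp

    # [T] from Exercise 187 (T : R^4 -> R^3 with respect to the standard bases)
    A = sp.Matrix([[1, 2, -1, 1],
                   [1, 0,  1, 1],
                   [2, -4, 6, 2]])

    rank = A.rank()                   # rank(T) = rank([T]) = number of pivot columns
    nullity = len(A.nullspace())      # nullity(T) = dim of the solution space
    print(rank, nullity)              # 2 2
    assert rank + nullity == A.cols   # rank-nullity: rank(T) + nullity(T) = dim(V)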

The following is essentially the observation that each column of a matrix is either a pivot column or
a non-pivot column.

Theorem 20.11: Rank-Nullity Theorem

Let T : V → W be a linear transformation. Suppose that V is finite dimensional. Then

rank(T ) + nullity(T ) = dim(V )

Proof. We first prove the theorem under the assumption that W is also finite dimensional. Let n = dim(V ) and
m = dim(W ) and let B be a basis for V and C a basis for W . Let A = [T ]C,B ∈ Mm,n (F).

nullity(T ) = dim(solution space of A) = number of non-pivot columns of A


rank(T ) = dim(column space of A) = number of pivot columns of A
dim(V ) = n = number of columns of A

Therefore rank(T ) + nullity(T ) = dim(V ).


For the general case (without assuming that W is finite dimensional) we define W 0 = im(T ). Then W 0
is finite dimensional, and we apply the previous part to the linear transformation T 0 : V → W 0 given
by T 0 (u) = T (u) to conclude that rank(T 0 ) + nullity(T 0 ) = dim(V ). Then note that ker(T 0 ) = ker(T )
and im(T 0 ) = im(T ) = W 0 .

Example 20.12. For the linear transformation T : P3 (R) → P2 (R) of Example 20.9 we have rank(T ) +
nullity(T ) = 3 + 1 = 4 = dim(P3 (R)).


20.4 Exercises

178. Find the standard matrix of the following linear transformations of R2 .

(a) rotation by π/4                      (c) reflection in the line y = x
(b) rotation by −π/2                     (d) reflection in the x-axis

179. In each part, find a single matrix that performs the indicated succession of operations.
1
(a) Compresses by a factor of 2 in the x-direction, then expands by a factor of 5 in the y-
direction.
(b) Reflects about y = x, then rotates about the origin through an angle of π.
(c) Reflects about the y-axis, then expands by a factor of 5 in the x-direction, and then reflects
about y = x.

180. Find the standard matrix (i.e., with respect to the standard basis of R2 ) of the rotation about the
origin through
(a) π/4 anticlockwise        (b) π

181. Consider the following linear transformations:


K : R3 → R3 K(x, y, z) = (x, x+y, x+y+z) L : R3 → R2 L(x, y, z) = (2x − y, x + 2y)
S : R3 → R3 S(x, y, z) = (z, y, x) T : R2 → R4 T (x, y) = (2x+y, x+y, x−y, x−2y)
Find the matrix that represents each of the following linear transformations (with respect to the
standard bases).

(a) K (b) L (c) S (d) T

182. Find the indicated linear transformation if it is defined. If it is not defined, explain why not.

(a) LK (b) T L (c) S 2 (d) K + S (e) T 2

(Where, for example, LK denotes the composition L ◦ K.)

183. Find the matrix which represents (with respect to the standard bases) those linear transforma-
tions in question 182 which exist.

184. Let T : P2 (R) → P3 (R) be the function defined by multiplication by x. That is, T (a+bx+cx2 ) =
ax + bx2 + cx3 .

(a) Show that T is a linear transformation.


(b) Find the matrix of T with respect to the standard bases B = {1, x, x2 } for P2 (R) and C =
{1, x, x2 , x3 } for P3 (R).

185. Let T : P2 (R) → P2 (R) be the linear transformation defined by T (p(x)) = p(2x + 1), that is,

T (a0 + a1 x + a2 x2 ) = a0 + a1 (2x + 1) + a2 (2x + 1)2

Find [T ]B , where B = {1, x, x2 }.

186. Find the matrix that represents the linear transformation T with respect to the bases B and B 0 .

(a) T : R3 → M2,2 (R) given by T (x, y, z) = [ y z ; −x 0 ]
where B = {e1 , e2 , e3 } and B 0 = {E ij |i = 1, 2; j = 1, 2} (i.e. the standard bases).


(b) T : P3 (C) → P3 (C) given by

T (a0 + a1 x + a2 x2 + a3 x3 ) = (a0 + a2 ) − (a1 + 2a3 )x2

where B = B 0 = {1, x, x2 , x3 }.

187. Consider the linear transformation T : R4 → R3 given by the matrix (wrt the standard bases):
[T ] = [ 1 2 −1 1 ; 1 0 1 1 ; 2 −4 6 2 ]

(a) Determine whether or not v1 = (−2, 0, 0, 2) and v2 = (−2, 2, 2, 0) are in the kernel of T .
(b) Determine whether or not w1 = (1, 3, 1) and w2 = (−1, −1, −2) are in the image of T .
(c) Find the nullity of T and give a basis for the kernel of T . Is the transformation injective?
(d) Find the rank of T and give a basis for the image of T . Is the transformation surjective?

188. For each of the linear transformations T : Rn → Rm given below find:

(i) its standard matrix (i.e., with respect to the standard bases),
(ii) a basis for the kernel,
(iii) a basis for the image.
(a) T (x, y) = (x + y, 3y)
(b) T (x1 , x2 , x3 ) = (x1 + x2 − x3 , 2x1 + x2 )
(c) T (x, y) = (x + 2y, −y, x − y)
(d) T (x1 , x2 , x3 ) = (3x1 − x2 − 6x3 , −2x1 + x2 + 5x3 , 3x1 + 3x2 + 6x3 )

189. Let T : M2,2 (R) → M1,2 (R) be the map defined by


T [ a11 a12 ; a21 a22 ] = [ 2 −1 ] [ a11 a12 ; a21 a22 ] = [ 2a11 − a21   2a12 − a22 ]

(a) Show that T is a linear transformation.


(b) Find bases for the kernel and image of T . Deduce the rank and nullity of T .
(c) Find the matrix of T with respect to the standard bases of M2,2 (R) and M1,2 (R).

190. Let S : P2 (R) → P3 (R) be defined as follows. For each p(x) = a0 + a1 x + a2 x2 , define S(p) = a0 x + (1/2)a1 x2 + (1/3)a2 x3 . The linear transformation S gives the antiderivative of p(x), with the constant term equal to zero.

(a) Find the matrix A that represents S with respect to the bases B = {1, x, x2 } and B 0 =
{1, x, x2 , x3 }
(b) Use A to find the antiderivative of p(x) = 1 − x + 2x2 .

191. Let U, V, W be vector spaces over the same field F with V being finite dimensional. Consider
linear transformations S : U → V and T : V → W that satisfy T ◦ S = 0. Show that rank(S) +
rank(T ) 6 dim(V ).


Extra material for lecture 20

 Let V and W be finite dimensional vector spaces over the same field F. Let n = dim(V ) and
m = dim(W ). Choose bases B for V and C for W and define f : Hom(V, W ) → Mm,n (F) by
f (T ) = [T ]C,B . Then f is a bijective linear transformation.

 Short exact sequences


A sequence of two linear transformations of the form

U ──ϕ──→ V ──ψ──→ W

is said to be exact at V if im(ϕ) = ker(ψ). A short exact sequence is a sequence of linear


transformations of the form
0 ──→ U ──ϕ──→ V ──ψ──→ W ──→ 0        (∗)

that is exact at each of U , V , and W .


Show that:

(a) the sequence (∗) is exact at U iff ϕ is injective;


(b) the sequence (∗) is exact at W iff ψ is surjective;
(c) if the sequence (∗) is exact, then dim(V ) = dim(U ) + dim(W ).

 Commutative diagrams
A diagram of four linear transformations of the form
U  ──ϕ──→  V
│f         │g
↓          ↓
U′ ──ϕ′──→ V′

is said to commute if g ◦ ϕ = ϕ0 ◦ f .
Suppose that in the following diagram of linear transformations the rows are exact and the left
square commutes.
0 ──→ U  ──ϕ──→  V  ──ψ──→  W ──→ 0
      │f         │g         │h
      ↓          ↓          ↓
0 ──→ U′ ──ϕ′──→ V′ ──ψ′──→ W′ ──→ 0
Show that:

(d) There exists a unique linear transformation h : W → W 0 that makes the right square
commute;
(e) If g is surjective, then so too is h;
(f) If f is surjective and g is injective, then h is injective.



LECTURE 21

Change of basis

The matrix of a linear transformation depends on the bases used for both domain and codomain. We
want to develop a convenient way of relating the different matrices that represent a given linear
transformation. In other words, if we change the bases used, how does the matrix representation
change?

21.1 Invertible linear transformations

Definition 21.1: invertible linear transformation

A linear transformation T : V → W is called invertible if there exists a linear transformation


T −1 : W → V satisfying T ◦ T −1 = IdW and T −1 ◦ T = IdV .
Invertible linear transformations are also called isomorphisms. We say that two vector spaces V
and W are isomorphic if there exists an isomorphism T : V → W . We write V ∼ = W to denote
that V and W are isomorphic.

Exercise 192. Show that a linear transformation T : V → W is invertible if and only if T is a bijection.

Note that in the above definition and exercise we are not assuming that either V or W is finite di-
mensional. In the case in which they are both finite dimensional we have the following.

Proposition 21.2: Matrix of an invertible linear transformation

Let V and W be finite dimensional vector spaces and let B and C be bases for V and W respec-
tively. Let T : V → W be a linear transformation.

1. T is invertible if and only if [T ]C,B is invertible

2. If T is invertible, then [T −1 ]B,C = ([T ]C,B )−1

Proof. Suppose first that T is invertible. Then T is a bijection and therefore ker(T ) = {0}. Therefore,
by the rank-nullity theorem, dim(im(T )) = dim(V ). But since T is a bijection, im(T ) = W . Therefore
dim(W ) = dim(V ).

T ◦ T −1 = IdW =⇒ [T ◦ T −1 ]C,C = [IdW ]C,C


=⇒ [T ]C,B [T −1 ]B,C = [IdW ]C,C Lemma 20.5
=⇒ [T ]C,B [T −1 ]B,C = In where n = dim(W )

Similarly,

[T −1 ]B,C [T ]C,B = In where n = dim(V )

It follows that [T −1 ]B,C is the inverse of [T ]C,B .


Suppose instead that A = [T ]C,B is invertible. Since it is invertible, it is square and therefore dim(V ) =
dim(W ). Let n = dim(V ) and A = [T ]C,B ∈ Mn,n (F). Let S : W → V be the linear transformation
defined by [S]B,C = A−1 . Then [T ◦ S]C,C = [T ]C,B [S]B,C = A A−1 = In . Therefore T ◦ S = IdW .

Similarly, [S ◦ T ]B,B = [S]B,C [T ]C,B = A−1 A = In , and therefore S ◦ T = IdV . Therefore T is invertible,
and T −1 = S.

Exercise 193. Let V be an n-dimensional F-vector space and B a basis for V . Show that the map ϕ : V → Mn,1 (F), ϕ(u) = [u]B is an isomorphism. Conclude that V ≅ Fn .

21.2 Change of basis for vectors

How can we convert coordinates with respect to one basis to coordinates with respect to another?

Definition 21.3

Let V be a finite dimensional vector space and let B and C be two bases for V . The transition
matrix from B to C, denoted PC,B , is defined to be

PC,B = [IdV ]C,B

Remark. If B = {b1 , . . . , bn }, then PC,B = [ [b1 ]C · · · [bn ]C ]

Exercise 194. Let V be a finite dimensional vector space and let A, B and C be bases for V . Show that

(a) PC,B is invertible and (PC,B )−1 = PB,C

(b) PC,A = PC,B PB,A

Proposition 21.4

Let V be a finite dimensional vector space and let B and C be two bases for V . Then

∀u ∈ V, [u]C = PC,B [u]B

Proof. Follows immediately from Lemma 20.2.

Example 21.5. Consider the following bases for R2 : B = {(1, 1), (1, −1)} and S = {(1, 0), (0, 1)}.
Then we have PS,B = [ 1 1 ; 1 −1 ] and PB,S = [ 1 1 ; 1 −1 ]−1 = (1/2)[ 1 1 ; 1 −1 ].

If u = (2, −4) ∈ R2 , then [u]B = PB,S [u]S = (1/2)[ 1 1 ; 1 −1 ][ 2 ; −4 ] = [ −1 ; 3 ]

Example 21.6. Consider the following bases for C2 : B = {(1, 1), (1, −1)} and C = {(1+i, 1), (1, 1−i)}.
To find PC,B , we could calculate [(1, 1)]C and [(1, −1)]C . Alternatively, we can proceed as follows:
PC,B = PC,S PS,B = (PS,C )−1 PS,B = [ 1+i 1 ; 1 1−i ]−1 [ 1 1 ; 1 −1 ] = [ −i 2−i ; i −2−i ]
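The same computation in floating point (a minimal sketch, assuming numpy):

    import numpy as np

    # Columns are the basis vectors of B and C, written in the standard basis S
    P_SB = np.array([[1, 1], [1, -1]], dtype=complex)
    P_SC = np.array([[1 + 1j, 1], [1, 1 - 1j]], dtype=complex)

    P_CB = np.linalg.solve(P_SC, P_SB)   # P_{C,B} = (P_{S,C})^{-1} P_{S,B}
    print(np.round(P_CB, 10))            # [[-i, 2-i], [i, -2-i]], as above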

21.3 Change of basis for linear transformations

We can also use transition matrices to relate two different matrix representations of the same linear
transformation.

© University of Melbourne 2024


MAST10022 Linear Algebra: Advanced, 2024 21-3

Proposition 21.7

Let V and W be finite dimensional vector spaces. Let B1 and B2 be two bases for V and let C1
and C2 be two bases for W . Let T : V → W be a linear transformation. Then

[T ]C2 ,B2 = PC2 ,C1 [T ]C1 ,B1 PB1 ,B2

Proof. Let u ∈ V . We have

(PC2 ,C1 [T ]C1 ,B1 PB1 ,B2 ) [u]B2 = PC2 ,C1 [T ]C1 ,B1 (PB1 ,B2 [u]B2 )
= PC2 ,C1 [T ]C1 ,B1 [u]B1 (Proposition 21.4)
= PC2 ,C1 [T (u)]C1 (Lemma 20.2)
= [T (u)]C2 (Proposition 21.4)

It follows that PC2 ,C1 [T ]C1 ,B1 PB1 ,B2 = [T ]C2 ,B2 (see Exercise 174).

Corollary 21.8

Let V be a finite dimensional vector space and let B and C be two bases for V . Denote P = PC,B .
Let T : V → V be a linear transformation. Then

[T ]C = PC,B [T ]B PB,C = P [T ]B P −1

Proof. Apply the Proposition with B2 = C2 = C and B1 = C1 = B. Note that PB,C = (PC,B )−1 .

Example 21.9. Let T : R2 → R2 be the linear transformation defined by T (x, y) = (3x − y, −x + 3y). Using the standard basis B = {(1, 0), (0, 1)} we find the matrix of T is

[T ]B = [ 3 −1 ; −1 3 ]

Now let’s calculate the matrix with respect to the basis C = {(1, 1), (−1, 1)}.

[T ]C = PC,B [T ]B PB,C = [ 1 −1 ; 1 1 ]−1 [ 3 −1 ; −1 3 ][ 1 −1 ; 1 1 ] = [ 2 0 ; 0 4 ]

(Alternatively, we could calculate [T ]C by finding [(1, 1)]C and [(−1, 1)]C .)
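A quick numerical check of Corollary 21.8 for this example (a minimal sketch, assuming numpy):

    import numpy as np

    T_B = np.array([[3.0, -1.0], [-1.0, 3.0]])   # [T]_B, B the standard basis
    P_BC = np.array([[1.0, -1.0], [1.0, 1.0]])   # columns: [c_1]_B and [c_2]_B

    # [T]_C = P_{C,B} [T]_B P_{B,C}, where P_{C,B} = (P_{B,C})^{-1}
    T_C = np.linalg.inv(P_BC) @ T_B @ P_BC
    print(np.round(T_C, 10))                     # diag(2, 4), as above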

The above corollary motivates the following definition.

Definition 21.10

Let A, B ∈ Mn,n (F ). We say that A and B are similar if there exists an invertible matrix P ∈
Mn,n (F) such that A = P BP −1 . It is denoted A ∼ B.(a)
(a) Beware! This is not the same as saying that A and B are row-equivalent.

From Corollary 21.8 we know that [T ]C ∼ [T ]B

Exercise 195. Let A, B ∈ Mn,n (F) and suppose that A and B are similar. Show that there exists a
linear transformation T : Fn → Fn and a basis B of Fn such that A = [T ]S and B = [T ]B (where S is
the standard basis for Fn ).


21.4 Exercises

196. Determine whether or not the given linear transformation is invertible. If it is invertible, com-
pute its inverse.

(a) T : R3 → R3 given by T (x, y, z) = (x + z, x − y + z, y + 2z)


(b) T : R2 → R2 given by T (x, y) = (3x + 2y, −6x − 4y)
(c) Tθ : R2 → R2 an anticlockwise rotation around the origin through an angle of θ.
(d) T θ : R2 → R2 a reflection in the line through the origin which forms an angle θ with the
x-axis.

197. Show that the transformation T : R3 → R3 defined by


T (x, y, z) = (x + y, y + z, z + x)

is invertible and find its inverse

198. (a) Find the transition matrix P from B to C, where B, C are the following bases of R3

B = {(1, −2, 1), (0, 3, 2), (1, 0, −1)} and C = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}

(b) Use P to find [x]B if

i) x = (3, −2, 5) ii) x = (−2, 7, 4)

199. Verify that the given set B is a basis for Rn . Compute the change of basis matrix for each of the
bases, and use it to find the coordinate matrix of v with respect to B.

(a) B = {(1, 2), (1, −2)}, v = (−1, 3)


(b) B = {(1, 1, 1), (1, 0, 1), (−1, 1, 0)}, v = (3, −1, 1)

200. (a) Let T : R2 → R2 be given by [T ]S = [ 1 −1 ; 0 1 ]. Find the matrix [T ]B that represents T with respect to the basis B of question 199 (a).

(b) Let T : R3 → R3 be given by [T ]S = [ 2 −1 0 ; −2 1 2 ; −1 −1 3 ]. Find the matrix [T ]B that represents T with respect to the basis B of question 199 (b).

201. Let T : R3 → R3 be given by T (x, y, z) = (4x + y − 4z, −3x − y + 5z, x). Find the matrix [T ]B
that represents T with respect to the basis B of question 199 (b).

202. Let T : R2 → R2 be defined by T (x1 , x2 ) = (x1 − 2x2 , −x2 ), and let B = {u1 , u2 }, B 0 = {v1 , v2 },
u1 = (1, 0), u2 = (0, 1), where v1 = (2, 1), v2 = (−3, 4).

(a) Write down the matrix [T ]B of T with respect to B.


(b) Compute the matrix [T ]B 0 of T with respect to B 0 .

203. A linear transformation T : R3 → R3 has matrix


[T ]S = [ 13 −4 −5 ; 15 −4 −6 ; 18 −6 −7 ]

with respect to the standard basis for R3 . Find the matrix [T ]B of T with respect to the basis
B = {(1, 2, 1), (0, 1, −1), (2, 3, 2)}.


204. Let B = {b1 , b2 , b3 } be a basis for C3 . Calculate the nullity and rank of the linear transformation
T : C3 → C3 determined by

T (b1 ) = b1 − b2
T (b2 ) = b2 − b3
T (b3 ) = b1 − b3

205. Calculate the nullity and rank of the linear transformation T : (F7 )3 → (F7 )3 determined by

T (1, 0, 0) = (1, 2, 3)
T (0, 1, 0) = (3, 4, 5)
T (0, 0, 1) = (5, 1, 4)


Extra material for lecture 21

 Show that the relation of similarity is an equivalence relation on Mn,n (F). That is, show that
the relation is reflexive, symmetric and transitive.

 Let V be an n-dimensional F-vector space. Show that every invertible n × n matrix is a change
of basis matrix for V . That is, show that for all invertible P ∈ Mn,n (F) there exist bases B and
B 0 for V such that P = [IdV ]B0 ,B .

 Given a linear transformation T : V → V , show that there exist bases B and B 0 for V such that
[T ]B0 ,B is diagonal and all entries are either 0 or 1. (Hint: start with a basis for the kernel of T .)
It is important here that B and B 0 do not have to be the same. We will be investigating later the
special case in which there exists a basis B such that [T ]B,B is diagonal.



LECTURE 22

Dual space

22.1 Linear functionals and the dual space

Let F be a field and V a vector space over F. We consider linear transformations from V to F.

Definition 22.1: Linear functional

A linear functional is a linear transformation ϕ : V → F. That is, ∀u, v ∈ V and ∀k ∈ F

ϕ(u + v) = ϕ(u) + ϕ(v)


ϕ(ku) = kϕ(u)

Exercise 206. Let ϕ : V → F be a linear functional. Show that if ϕ 6= 0, then ϕ is surjective.


Exercise 207. Let u ∈ V \ {0}. Show that there exists ϕ ∈ V ∗ such that ϕ(u) ≠ 0. (You may assume
that V is finite dimensional.)

Examples 22.2. 1. V = R3 , ϕ(x, y, z) = z


2. V = F[x], ϕ(p(x)) = ∫₀¹ p(x) dx

3. V = Mn,n (F), ϕ(A) = A11 + · · · + Ann (i.e., the trace of A)

Definition 22.3: Dual space

Let V be a vector space over F. The dual space, V ∗ , is the vector space made of all linear
functionals with operations given by, for ϕ, ψ ∈ V ∗ and k ∈ F,

(ϕ + ψ)(v) = ϕ(v) + ψ(v)


(kϕ)(v) = kϕ(v)

22.2 Dual basis

Definition 22.4

Suppose that B = {b1 , . . . , bn } is a basis for V . Define b∗1 , . . . , b∗n ∈ V ∗ by(a)

b∗i (bj ) = 1 if i = j, and 0 if i ≠ j

B ∗ = {b∗1 , . . . , b∗n } ⊂ V ∗ is called the dual basis (or the basis dual to B).

(a) That this uniquely determines the b∗i follows from Theorem 19.7.

Theorem 22.5

Let {b1 , . . . , bn } be a basis for V . Then {b∗1 , . . . , b∗n } is a basis for V ∗ , and

∀u ∈ V, u = b∗1 (u)b1 + · · · + b∗n (u)bn


∀ϕ ∈ V ∗ , ϕ = ϕ(b1 )b∗1 + · · · + ϕ(bn )b∗n

Proof. For all u ∈ V we have u = β1 b1 + · · · + βn bn for some βj ∈ F. Then

b∗i (u) = b∗i (β1 b1 + · · · + βn bn ) = β1 b∗i (b1 ) + · · · + βn b∗i (bn ) = βi

Having shown that βi = b∗i (u), we have that u = b∗1 (u)b1 + · · · + b∗n (u)bn .

For the second statement, let ϕ ∈ V ∗ . For all u ∈ V we have u = β1 b1 + · · · + βn bn for some βj ∈ F. Then

(ϕ(b1 )b∗1 + · · · + ϕ(bn )b∗n )(u) = ϕ(b1 )b∗1 (u) + · · · + ϕ(bn )b∗n (u)
                                  = ϕ(b1 )β1 + · · · + ϕ(bn )βn
                                  = ϕ(β1 b1 + · · · + βn bn ) = ϕ(u)

Therefore ϕ = ϕ(b1 )b∗1 + · · · + ϕ(bn )b∗n . To see that {b∗1 , . . . , b∗n } is linearly independent we have

β1 b∗1 + · · · + βn b∗n = 0 =⇒ ∀j, (β1 b∗1 + · · · + βn b∗n )(bj ) = 0 =⇒ ∀j, βj = 0
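In coordinates the dual basis is easy to compute: if the columns of an invertible matrix P are the basis vectors b1 , . . . , bn of Rn (in standard coordinates), then the coefficient row of each functional b∗i is the i-th row of P −1 , since these rows satisfy (row i) · bj = 1 exactly when i = j. A minimal numerical sketch (assuming numpy), using the basis of Exercise 210 (a) below:

    import numpy as np

    # Columns of P: the basis u = (-2, -2, 3), v = (1, 2, -1), w = (-1, -1, 2)
    P = np.array([[-2.0, 1.0, -1.0],
                  [-2.0, 2.0, -1.0],
                  [ 3.0, -1.0, 2.0]])

    Q = np.linalg.inv(P)        # row i of Q: coefficients of the functional b*_i
    print(np.round(Q @ P, 10))  # the identity matrix, i.e. b*_i(b_j) = 1 iff i = j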

Corollary 22.6

It follows that if V is finite dimensional, then V ∗ ≅ V . In particular, dim(V ∗ ) = dim(V ).

22.3 Second dual space

Given u ∈ V we define an element û ∈ V ∗∗ by

û(ϕ) = ϕ(u)

Exercise 208. Show that û is a linear transformation V ∗ → F.

Theorem 22.7: natural isomorphism

Let V be a finite dimensional vector space. The map V → V ∗∗ given by u ↦ û is an isomorphism.


Proof. To see that the map is a linear transformation, for u, v ∈ V , k ∈ F and ϕ ∈ V ∗ we have:

(u + v)^(ϕ) = ϕ(u + v) = ϕ(u) + ϕ(v) = û(ϕ) + v̂(ϕ)
(ku)^(ϕ) = ϕ(ku) = kϕ(u) = kû(ϕ)

To see that the map is injective, let u ∈ V and suppose that û = 0. We have

û = 0 =⇒ ∀ϕ ∈ V ∗ , û(ϕ) = 0
      =⇒ ∀ϕ ∈ V ∗ , ϕ(u) = 0
      =⇒ u = 0    (by Exercise 207)
Therefore the map is injective.
From Corollary 22.6, dim(V ∗∗ ) = dim(V ∗ ) = dim(V ). Since the linear transformation is injective and
the domain and codomain have the same dimension, we conclude that the map is an isomorphism.

Remark. The above isomorphism does not require any choice of basis.

22.4 Transpose linear transformation

Definition 22.8

Given a linear transformation T : V → W we define the transpose of T to be the linear transformation T ∗ : W ∗ → V ∗ given by

T ∗ (ψ) = ψ ◦ T

(so T ∗ (ψ) is the composite V → W → F, first T and then ψ ∈ W ∗ ).

The matrix of the transpose linear transformation T ∗ is the transpose of the matrix of T .

Theorem 22.9

Let V and W be finite dimensional vector spaces. Let B be a basis for V and B ∗ the dual basis
for V ∗ . Let C be a basis for W and C ∗ the dual basis for W ∗ . Let T : V → W be a linear
transformation. Then
[T ∗ ]B∗ ,C ∗ = ([T ]C,B )T

Proof. Let B = {b1 , . . . , bn }, B ∗ = {b∗1 , . . . , b∗n }, C = {c1 , . . . , cm }, and C ∗ = {c∗1 , . . . , c∗m }. Recall that

b∗i (bj ) = 1 if i = j, and 0 if i ≠ j,    and similarly for c∗i (cj ).

Let aij ∈ F be such that T (bj ) = a1j c1 + · · · + amj cm . That is, aij is the (i, j)-th entry of [T ]C,B . The i-th row of [T ]C,B is [ai1 · · · ain ].
We will show that the i-th column of [T ∗ ]B∗ ,C ∗ is the transpose of the i-th row of [T ]C,B .
The i-th column of [T ∗ ]B∗ ,C ∗ is given by [T ∗ (c∗i )]B∗ . We have

T ∗ (c∗i ) = c∗i ◦ T    (definition of transpose linear transformation)
          = (c∗i ◦ T )(b1 )b∗1 + · · · + (c∗i ◦ T )(bn )b∗n    (Theorem 22.5)

The i-th column of [T ∗ ]B∗ ,C ∗ is therefore [c∗i ◦ T (b1 ) · · · c∗i ◦ T (bn )]T . We will be done if we can show that c∗i ◦ T (bj ) = aij . We have

c∗i ◦ T (bj ) = c∗i (a1j c1 + · · · + amj cm ) = a1j c∗i (c1 ) + · · · + amj c∗i (cm ) = aij
k=1 k=1

22.5 Exercises

209. Let ϕ ∈ (R2 )∗ satisfy ϕ(1, −1) = 2 and ϕ(−4, 5) = −3. Find ϕ(x, y).

210. For each of the following bases of R3 , find the dual basis B ∗ = {u∗ , v ∗ , w∗ } of (R3 )∗ .

(a) B = {u = (−2, −2, 3), v = (1, 2, −1), w = (−1, −1, 2)}


(b) B = {u = (−1, 1, 1), v = (−2, 1, 2), w = (−1, 1, 2)}

211. Let V be a finite dimensional vector space with basis B. Show that ∀u ∈ V and ∀ϕ ∈ V ∗ ,

ϕ(u) = [ϕ]TB∗ [u]B

212. For each of the following linear transformations T : R2 → R3 and linear functionals ϕ ∈ (R3 )∗ ,
calculate T ∗ (ϕ)(x, y).

(a) T (x, y) = (x + y, 0, x − y), ϕ(x, y, z) = x + y + z


(b) T (x, y) = (y, x, x + y), ϕ(x, y, z) = z

213. Consider linear transformations S : U → V and T : V → W . Show that (T ◦ S)∗ = S ∗ ◦ T ∗ .

214. Let V = P2 (F). Given a scalar a ∈ F, define ϕa : V → F by

ϕa (a0 + a1 x + a2 x2 ) = a0 + a1 a + a2 a2

Show

(a) ϕa ∈ V ∗
(b) If a 6= b, then ϕa 6= ϕb

Suppose that a, b, c ∈ F are distinct.

(c) Show that {ϕa , ϕb , ϕc } is a basis for V ∗ .


(d) Find the basis B for V , such that B ∗ is the above basis for V ∗ .


Extra material for lecture 22

 If V is infinite dimensional, then V ∗ is not isomorphic to V . Given a basis B = {bi | i ∈ I} we can define {b∗i | i ∈ I} as in Definition 22.4. This set is linearly independent, but is not a spanning set (when I is infinite).
For example, consider V = F2^{<N} , the vector space of infinite sequences (ai )i∈N having the property that ai ≠ 0 for only finitely many i. A basis for V is B = {bi | i ∈ N} where bi has a 1 in the i-th position and zeros elsewhere. The linear functional ϕ ∈ V ∗ defined by ϕ(bi ) = 1 (for all i ∈ N) is not in the span of the set {b∗i | i ∈ N}. The dual space is V ∗ = F2^N , the vector space of all infinite sequences (ai )i∈N , and it has dimension strictly larger than that of V .

 The following connects transition matrices for V and transition matrices for V ∗ .

Theorem. Let V be a (finite dimensional) vector space. Let B and C be two bases for V and let B ∗ and
C ∗ be the (respective) dual bases for V ∗ . Then

(PC ∗ ,B∗ )T = (PC,B )−1

Proof. We will show that (PC ∗ ,B∗ )T PC,B = I. The i-th row of (PC ∗ ,B∗ )T is [b∗i ]TC ∗ = [α1 · · · αn ]
where b∗i = α1 c∗1 + · · · + αn c∗n . The j-th column of PC,B is [bj ]C = [β1 · · · βn ]T where bj =
β1 c1 + · · · + βn cn . Therefore

((PC ∗ ,B∗ )T PC,B )ij = [α1 · · · αn ][β1 · · · βn ]T = α1 β1 + · · · + αn βn

Now observe that

b∗i (bj ) = (α1 c∗1 + · · · + αn c∗n )(β1 c1 + · · · + βn cn ) = α1 β1 + · · · + αn βn

We have shown that

((PC ∗ ,B∗ )T PC,B )ij = b∗i (bj ) = 1 if i = j, and 0 if i ≠ j

 Let S ⊆ V be a subset (not necessarily a subspace) of a vector space V . The annihilator of S is

S 0 = {ϕ ∈ V ∗ | ∀u ∈ S, ϕ(u) = 0}

The annihilator is a subspace of V ∗ (even if S is not a subspace of V ).


In the case in which V is finite dimensional and W 6 V is a subspace, we have the following.

Theorem. Let V be a finite dimensional vector space and W a subspace of V . Then

1. dim(W ) + dim(W 0 ) = dim(V )


2. (W 0 )0 = W (where we are identifying V and V ∗∗ via the natural isomorphism)

Proof. Exercise! Hint for the first part: Let {w1 , . . . , wk } be a basis for W . Extend to a basis for
V , {w1 , . . . , wk , v1 , . . . , vm }. Show that {v∗1 , . . . , v∗m } is a basis for W 0 .



LECTURE 23

Eigenvalues and eigenvectors

23.1 Invariant subspaces

To help with understanding and analysing linear transformations, it is useful to identify subspaces
that are mapped to themselves in the following sense.

Definition 23.1

Given a linear transformation T : V → V , a subspace W 6 V is called an invariant subspace if


T (W ) ⊆ W .

Example 23.2. Let T : R4 → R4 be the linear transformation having standard matrix representation
[T ]S = [ 2 0 1 2 ; 1 2 0 −1 ; −2 1 −1 −1 ; 1 −2 2 1 ]

Let u1 = (−1, 1, 1, 0) and u2 = (0, 0, 1, −1). The subspace U = span{u1 , u2 } is invariant. To see this it is enough to note that

[T (u1 )]S = [T ]S [u1 ]S = (−1, 1, 2, −1)T = [u1 ]S + [u2 ]S ,   therefore T (u1 ) = u1 + u2 ∈ U
[T (u2 )]S = [T ]S [u2 ]S = (−1, 1, 0, 1)T = [u1 ]S − [u2 ]S ,   therefore T (u2 ) = u1 − u2 ∈ U

Exercise 215. Let V be a finite-dimensional F-vector space and let T : V → V be a linear transforma-
tion. Let B = {b1 , . . . , bn } be a basis for V and k ∈ {1, 2, . . . , n − 1}. Define W = span{b1 , . . . , bk } and
suppose that W is invariant (i.e. T (W ) ⊆ W ). Show that

[T ]B = [ A C ; 0 B ]   for some A ∈ Mk,k (F), C ∈ Mk,(n−k) (F) and B ∈ M(n−k),(n−k) (F).

23.2 Definition of eigenvalues and eigenvectors

Consideration of 1-dimensional invariant subspaces leads to the idea of eigenvectors and eigenval-
ues.

Exercise 216. Let V be a vector space and let T : V → V be a linear transformation. Suppose u ∈ V
and λ ∈ F are such that T (u) = λu. Show that the subspace W = span{u} 6 V is invariant.

Definition 23.3: Eigenvalues and eigenvectors of linear transformations

Let V be a vector space with field of scalars F, and let T : V → V be a linear transformation. A scalar λ ∈ F is called an eigenvalue of T if there is a non-zero vector v ∈ V \ {0} such that

T (v) = λv

Such a vector v ∈ V \ {0} is called an eigenvector of T corresponding to the eigenvalue λ.
The set {v ∈ V | T (v) = λv} is a subspace of V , called the eigenspace of λ.

Exercise 217. Let T : V → V be a linear transformation and suppose that λ is an eigenvalue of T . Let
Wλ = {v ∈ V | T (v) = λv} (i.e., the corresponding eigenspace). Show that
(a) Wλ is a subspace of V .   (b) Wλ ≠ {0}   (c) T (Wλ ) ⊆ Wλ

Example 23.4. Let T : R3 → R3 be the linear transformation whose standard matrix is

[T ]S = [ −7 −18 −12 ; 9 20 12 ; −9 −18 −10 ]

T (2, −1, 0) = (4, −2, 0) = 2(2, −1, 0) and T (6, −5, 3) = (12, −10, 6) = 2(6, −5, 3) since

[T (2, −1, 0)]S = [T ]S [(2, −1, 0)]S = [ −7 −18 −12 ; 9 20 12 ; −9 −18 −10 ] (2, −1, 0)T = (4, −2, 0)T
[T (6, −5, 3)]S = [T ]S [(6, −5, 3)]S = [ −7 −18 −12 ; 9 20 12 ; −9 −18 −10 ] (6, −5, 3)T = (12, −10, 6)T

So both (2, −1, 0) and (6, −5, 3) are eigenvectors with eigenvalue 2.

As we’ve seen above, the matrix of a linear transformation is, of course, useful for working with
eigenvectors. We adapt the definition of eigenvalue and eigenvector to matrices in a natural way.

Definition 23.5: Eigenvalues and eigenvectors of a matrix

Let F be a field, and let A ∈ Mn,n (F). A scalar λ ∈ F is an eigenvalue of A if there is a non-zero
column matrix v ∈ Mn,1 (F) such that
Av = λv
Then v is called an eigenvector of A corresponding to eigenvalue λ.

Lemma 23.6

Let V be a finite-dimensional vector space over a field F and let T : V → V be a linear transfor-
mation. Let B be a basis of V and let λ ∈ F and v ∈ V . Then

v is an eigenvector of T with eigenvalue λ ⇐⇒ [v]B is an eigenvector of [T ]B with eigenvalue λ

Proof. This follows from Lemma 20.2 (and Lemma 15.4):

T (v) = λv ⇐⇒ [T (v)]B = [λv]B ⇐⇒ [T ]B [v]B = λ[v]B


Note also that v = 0 ⇐⇒ [v]B = 0.


23.3 Calculating eigenvalues

We can calculate the eigenvalues of a matrix (and hence a linear transformation) using the following.

Proposition 23.7

Let A ∈ Mn,n (F). Then λ ∈ F is an eigenvalue of A if and only if det(A − λIn ) = 0.

Proof. Let λ ∈ F and v ∈ V .


Av = λv ⇐⇒ Av − λv = 0 ⇐⇒ Av − λIn v = 0 ⇐⇒ (A − λIn )v = 0

Therefore,

λ is an eigenvalue ⇐⇒ (A − λIn ) has a non-trivial solution space


⇐⇒ nullity(A − λIn ) > 0
⇐⇒ rank(A − λIn ) < n Rank-nullity theorem
⇐⇒ det(A − λIn ) = 0 Noting that A − λIn is square

Example 23.8. Let’s find the eigenvalues of A = [ 1 4 ; 1 1 ] ∈ M2,2 (R).

det(A − λI2 ) = det([ 1 4 ; 1 1 ] − [ λ 0 ; 0 λ ]) = det [ 1−λ 4 ; 1 1−λ ] = λ² − 2λ − 3 = (λ − 3)(λ + 1)

The eigenvalues of A are −1 and 3.
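A quick numerical cross-check (a minimal sketch, assuming numpy):

    import numpy as np

    A = np.array([[1.0, 4.0], [1.0, 1.0]])
    print(np.linalg.eigvals(A))   # approximately 3 and -1 (in some order)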

Exercise 218. Show that if A ∈ Mn,n (F) is in upper triangular form, then the eigenvalues of A are
exactly the entries on the diagonal of A.

Definition 23.9: Characteristic polynomial

Let A ∈ Mn,n (F). The determinant det(xIn − A) is a polynomial in x called the characteristic
polynomial of A. We will denote it by cA .

cA (x) = det(xIn − A)

The characteristic equation of A is the equation cA (x) = 0

Remark. 1. The characteristic polynomial of A ∈ Mn,n (F) is always monic and of degree exactly
n. That is,
cA (x) = α0 + α1 x + α2 x2 + · · · + αn−1 xn−1 + xn
for some α0 , . . . , αn−1 ∈ F.

2. For A ∈ M2,2 (F) we have cA (x) = det(A) − tr(A)x + x2

3. From Proposition 23.7 we know that the eigenvalues of A are exactly the roots of the char-
acteristic polynomial. If λ ∈ F is an eigenvalue of A, then (x − λ) divides the characteristic
polynomial of A.

© University of Melbourne 2024


23-4 MAST10022 Linear Algebra: Advanced, 2024

Definition 23.10: Algebraic multiplicity of eigenvalues

The algebraic multiplicity of an eigenvalue λ is the largest k ∈ N such that (x − λ)k divides the
characteristic polynomial.

Remark. 1. In general, the sum of the algebraic multiplicities is at most n.

2. In the case in which F = C, there is always at least one eigenvalue.* Further, the sum of the
algebraic multiplicities equals n.
 
Example 23.11. Let A = [ 8 −9 −9 ; 9 −10 −9 ; −1 2 5 ] ∈ M3,3 (R). The characteristic polynomial of A is given by

det(xI3 − A) = | x−8 9 9 ; −9 x+10 9 ; 1 −2 x−5 |
             = | x−8 9 9 ; −x−1 x+1 0 ; 1 −2 x−5 |        (R2 − R1 )
             = (x + 1) | x−8 9 9 ; −1 1 0 ; 1 −2 x−5 |
             = (x + 1) | x+1 9 9 ; 0 1 0 ; −1 −2 x−5 |     (C1 + C2 )
             = (x + 1) | x+1 9 ; −1 x−5 |                  (expanding along the second row)
             = (x + 1)((x + 1)(x − 5) + 9) = (x + 1)(x − 2)²

The eigenvalues of A are therefore −1 and 2. The eigenvalue −1 has algebraic multiplicity 1 and the eigenvalue 2 has algebraic multiplicity 2.
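The characteristic polynomial can be cross-checked with a computer algebra system (a minimal sketch, assuming sympy):

    import sympy as sp

    x = sp.symbols('x')
    A = sp.Matrix([[8, -9, -9], [9, -10, -9], [-1, 2, 5]])
    print(sp.factor(A.charpoly(x).as_expr()))   # (x - 2)**2*(x + 1)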
Example 23.12. Let A = [ 0 −1 ; 1 0 ] ∈ M2,2 (R). The characteristic polynomial is:

det [ x 1 ; −1 x ] = x² + 1

This polynomial has no roots in R. Therefore A has no eigenvalues.


Now, use the same matrix, but considered as an element of M2,2 (C). The characteristic polynomial
is the same, but since it does have roots in C, the matrix has eigenvalues. The eigenvalues are i and
−i. Each has algebraic multiplicity 1.

23.4 Exercises

219. Find the eigenvalues (over C) of the following matrices:


(a) [ 7 −2 ; 15 −4 ]      (b) [ 3 −2 ; 17 −7 ]      (c) [ 1 −1 ; 1 3 ]      (d) [ 3 5 ; −2 −3 ]

220. Find, by inspection, the eigenvalues of the given matrix.


(a) [ 1 2 0 ; 0 3 −1 ; 0 0 4 ]      (b) [ 1 0 0 0 ; 2 3 0 0 ; 4 5 6 0 ; 7 8 9 10 ]

221. Find the eigenvalues (over R) of the following matrices.

* By the Fundamental Theorem of Algebra.


(a) [ 2 −3 6 ; 0 5 −6 ; 0 1 0 ]      (c) [ −5 −8 −12 ; −6 −10 −12 ; 6 10 13 ]      (e) [ 3 1 1 ; 2 4 2 ; 1 1 3 ]
(b) [ 2 1 0 ; 0 2 0 ; 0 0 −3 ]       (d) [ 2 2 2 ; −1 −1 −2 ; 1 2 3 ]              (f) [ 1 1 0 ; 0 1 0 ; 0 0 1 ]

222. Let A ∈ Mn,n (F).

(a) Prove that if λ ∈ F is an eigenvalue of A, then λ2 is an eigenvalue of A2 .


(b) Suppose that A is invertible. Prove that λ is an eigenvalue of A if and only if 1/λ is an
eigenvalue of A−1 . What relationship holds between the eigenvectors of A and A−1 ?

223. Let A ∈ Mn,n (F) and let p ∈ F[x]. Show that

p(A) = 0 =⇒ p(λ) = 0 for all eigenvalues λ of A

224. Let A ∈ Mn,n (C). Show that det(A) is equal to the product of the eigenvalues (with multiplic-
ity) of A .


Further material for lecture 23

 References for eigenvalues and eigenvectors

Elementary Linear Algebra by Anton and Rorres, §5


The Art of Proof by Beck and Geoghegan, §5

 Let A ∈ Mn,n (C). The sum of the eigenvalues (with multiplicity) of A is equal to the trace of A.
The product of the eigenvalues (with multiplicity) is equal to the determinant. Neither of these
is obvious!



LECTURE 24

Eigenspaces

24.1 Calculating the eigenspaces of a matrix

We saw last lecture the definitions of the eigenvalues and eigenspaces for a linear transformation and
for a matrix. If λ ∈ F is an eigenvalue of A ∈ Mn,n (F), the corresponding eigenspace is the
solution space of the matrix A − λIn .
Example 24.1. Let A = [ 2 −3 6 ; 0 5 −6 ; 0 1 0 ] ∈ M3,3 (R).
We find the eigenvalues of A and a basis for each eigenspace.
The characteristic polynomial is given by

det(xI3 − A) = | x−2 3 −6 ; 0 x−5 6 ; 0 −1 x | = (x − 2) | x−5 6 ; −1 x | = (x − 2)(x(x − 5) + 6)
             = (x − 2)(x² − 5x + 6) = (x − 2)²(x − 3)

Therefore the eigenvalues of A are 2 and 3. The eigenvalue 2 has algebraic multiplicity 2 and the
eigenvalue 3 has algebraic multiplicity 1.
To find the eigenspace for eigenvalue 2 we solve for the solution space of A − 2I3 .*
A − 2I3 = [ 0 −3 6 ; 0 3 −6 ; 0 1 −2 ]  (R1 ↔ R3 ) → [ 0 1 −2 ; 0 3 −6 ; 0 −3 6 ]  (R2 − 3R1 , R3 + 3R1 ) → [ 0 1 −2 ; 0 0 0 ; 0 0 0 ]

Therefore, the eigenspace (for eigenvalue 2) is

{(x, y, z) ∈ R3 | y = 2z} = {x(1, 0, 0) + z(0, 2, 1) | x, z ∈ R} = span{(1, 0, 0), (0, 2, 1)}

The set {(1, 0, 0), (0, 2, 1)} is a basis for the eigenspace.
For eigenvalue 3, the eigenspace is the solution space of A − 3I3 .
A − 3I3 = [ −1 −3 6 ; 0 2 −6 ; 0 1 −3 ]  (R2 ↔ R3 ) → [ −1 −3 6 ; 0 1 −3 ; 0 2 −6 ]  (R3 − 2R2 ) → [ −1 −3 6 ; 0 1 −3 ; 0 0 0 ]
        (R1 + 3R2 ) → [ −1 0 −3 ; 0 1 −3 ; 0 0 0 ]  (−1 × R1 ) → [ 1 0 3 ; 0 1 −3 ; 0 0 0 ]

Therefore the set {(−3, 3, 1)} is a basis for the eigenspace.
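The whole computation can be reproduced with a computer algebra system (a minimal sketch, assuming sympy; the eigenvalues may be printed in a different order):

    import sympy as sp

    A = sp.Matrix([[2, -3, 6], [0, 5, -6], [0, 1, 0]])
    for lam, alg_mult, basis in A.eigenvects():
        print(lam, alg_mult, [list(v) for v in basis])
    # 2 2 [[1, 0, 0], [0, 2, 1]]
    # 3 1 [[-3, 3, 1]]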

Definition 24.2: Geometric multiplicity of an eigenvalue

The geometric multiplicity of an eigenvalue is the dimension of the corresponding eigenspace.

* Which is equal to the solution space of 2I3 − A.



Example 24.3. We saw last lecture (Example 23.11) that the matrix A = [ 8 −9 −9 ; 9 −10 −9 ; −1 2 5 ] ∈ M3,3 (R) has
eigenvalues −1 (with algebraic multiplicity 1) and 2 (with algebraic multiplicity 2). Let’s find the
corresponding eigenspaces.
For eigenvalue −1 we have
A − (−1)I3 = [ 9 −9 −9 ; 9 −9 −9 ; −1 2 6 ] ∼ · · · ∼ [ 1 0 4 ; 0 1 5 ; 0 0 0 ]
The eigenspace has basis {(−4, −5, 1)} and geometric multiplicity 1.
For eigenvalue 2 we have
A − 2I3 = [ 6 −9 −9 ; 9 −12 −9 ; −1 2 3 ] ∼ · · · ∼ [ 1 0 3 ; 0 1 3 ; 0 0 0 ]
The eigenspace has basis {(−3, −3, 1)} and geometric multiplicity 1. Notice that, for this matrix, the
geometric multiplicity of the eigenvalue 2 is strictly less than its algebraic multiplicity.

24.2 Characteristic polynomial of a linear transformation

We saw in Lemma 23.6 that the eigenvalues of a linear transformation T are the same as the eigenval-
ues of any matrix representation [T ]B , and that there is a correspondence between the eigenvectors of
T and the eigenvectors of [T ]B . We now define the characteristic polynomial of T to be the character-
istic polynomial of [T ]B . To see that this is independent of the choice of basis B, we note the following
lemma. Recall that if B and C are two bases, then although [T ]B and [T ]C are not equal, they are similar
in the sense that [T ]B = P [T ]C P −1 for some invertible matrix P (see Definition 21.10).

Lemma 24.4

Let A, B ∈ Mn,n (F). If A and B are similar, then they have the same characteristic polynomial.

Proof. Let P ∈ Mn,n be an invertible matrix such that A = P BP −1 . Then


det(A − λIn ) = det(P BP −1 − λIn ) = det(P (BP −1 − λP −1 In )) = det(P (B − λP −1 In P )P −1 )
= det(P (B − λIn )P −1 ) = det(P ) det(B − λIn ) det(P −1 ) = det(P ) det(P −1 ) det(B − λIn )
= det(B − λIn )

Definition 24.5: Characteristic polynomial of a linear transformation

Let V be an n-dimensional vector space over F and let T : V → V be a linear transformation.


The characteristic polynomial of T is defined to be the characteristic polynomial of some
(hence any) matrix representation [T ]B ∈ Mn,n (F) of T .

Example 24.6. Consider the linear transformation T : P2 (F3 ) → P2 (F3 ) given by


T (a + bx + cx2 ) = (2a + b + c) + (b + 2c)x + bx2
Let’s find the characteristic polynomial of T , then its eigenvalues and eigenspaces. We choose a basis
B = {1, x, x2 } for P2 (F3 ).
[T ]B = [ [T (1)]B [T (x)]B [T (x²)]B ] = [ [2]B [1 + x + x²]B [1 + 2x]B ] = [ 2 1 1 ; 0 1 2 ; 0 1 0 ]

© University of Melbourne 2024


MAST10022 Linear Algebra: Advanced, 2024 24-3

Now we calculate the characteristic polynomial of [T ]B .

det(xI3 − [T ]B ) = | x−2 −1 −1 ; 0 x−1 −2 ; 0 −1 x | = (x − 2) | x−1 −2 ; −1 x | = (x − 2)(x² − x − 2)
                  = (x − 2)(x − 2)(x + 1) = (x + 1)³    (since x − 2 = x + 1 in F3 [x])

The characteristic polynomial of T is (x + 1)3 .


There is only one eigenvalue, 2 ∈ F3 , and it has algebraic multiplicity 3 ∈ N.
Now to calculate the corresponding eigenspace.
   
[T ]B − 2I3 = [ 0 1 1 ; 0 2 2 ; 0 1 1 ]  (R2 + R1 , R3 − R1 ) → [ 0 1 1 ; 0 0 0 ; 0 0 0 ]
We see that the solution space has dimension 2 and has a basis {(1, 0, 0), (0, 2, 1)} ⊆ F33 .
The eigenspace of T is therefore the subspace of P2 (F3 ) that has basis {1, 2x + x2 }.
The eigenvalue 2 ∈ F3 has geometric multiplicity 2 ∈ N.

24.3 Algebraic versus geometric multiplicity

In the above examples we saw that the geometric multiplicity was always less than or equal to the
algebraic multiplicity. We prove that this is always the case.

Proposition 24.7

Let V be an n-dimensional vector space over F and let T : V → V be a linear transformation.


Suppose that λ ∈ F is an eigenvalue of T . The geometric multiplicity of λ is less than or equal to
its algebraic multiplicity.

Proof. Let W 6 V be the eigenspace corresponding to λ. Let k = dim(W ) be the geometric multiplic-
ity of λ. Let {w1 , . . . , wk } be a basis for W . Extend to a basis for V , B = {w1 , . . . , wk , bk+1 , . . . , bn }. We
have
[T ]B = [ λIk M ; 0 N ]   for some M ∈ Mk,(n−k) (F) and N ∈ M(n−k),(n−k) (F)

The characteristic polynomial of T is therefore (x − λ)k cN (x) where cN (x) is the characteristic poly-
nomial of the matrix N . Therefore the algebraic multiplicity is greater than or equal to k.

24.4 Exercises

225. Find bases for the eigenspaces of matrices in Exercise 219. Give the algebraic and geometric
multiplicity of each eigenvalue.

226. Find bases for the eigenspaces of matrices in Exercise 221. Give the algebraic and geometric
multiplicity of each eigenvalue.

227. Find 1-dimensional subspaces of R2 that are invariant under the linear transformations given
by the following matrices:
(a) [ 1 0 ; 2 0 ]      (b) [ 1 2 ; −3 −6 ]


Further material for lecture 24

 Let A ∈ Mn,n (F). Prove that the characteristic polynomial cA (x) = det(xIn − A) has degree n
and is monic.



LECTURE 25

Diagonalisation

We have seen in several examples that, given a linear transformation, there is sometimes a basis with
respect to which the matrix of the linear transformation is diagonal. Having such a basis is extremely
useful when working with linear transformations. We will give conditions for the existence of such
a basis and see that it is not always possible.

25.1 Diagonalisability

Definition 25.1: diagonalisable linear transformation

Let V be a finite dimensional vector space. A linear transformation T : V → V is diagonalisable


if there is a basis B for V such that the matrix [T ]B is diagonal.

Example 25.2. Consider the linear transformation T : R3 → R3 with standard matrix given by
[T ]S = [ 2 −3 6 ; 0 5 −6 ; 0 1 0 ]

In Example 24.1 we saw that this matrix has eigenvalues 2 and 3. Further we calculated a basis
{(1, 0, 0), (0, 2, 1)} for the λ = 2 eigenspace and a basis {(−3, 3, 1)} for the λ = 3 eigenspace. In
particular, letting b1 = (1, 0, 0), b2 = (0, 2, 1) and b3 = (−3, 3, 1) we have

T (b1 ) = 2b1 T (b2 ) = 2b2 T (b3 ) = 3b3

Moreover, B = {b1 , b2 , b3 } is a basis for R3 . To see this, note that the matrix P whose columns are [b1 ]S , [b2 ]S , [b3 ]S row-reduces to an upper triangular matrix with non-zero diagonal entries, and is therefore invertible:

P = [ 1 0 −3 ; 0 2 3 ; 0 1 1 ]  (R3 − ½R2 ) → [ 1 0 −3 ; 0 2 3 ; 0 0 −1/2 ]

Then, with respect to the basis B the matrix of T is diagonal:


[T ]B = [ [T (b1 )]B [T (b2 )]B [T (b3 )]B ] = [ [2b1 ]B [2b2 ]B [3b3 ]B ] = [ 2 0 0 ; 0 2 0 ; 0 0 3 ]

Therefore T is diagonalisable.

Remark. Notice that, in the above example, if we let A = [T ]S , and D = [T ]B , then we have that
A = P DP −1 since
A = [T ]S = PS,B [T ]B PB,S = P DP −1

Definition 25.3: diagonalisable matrix

A matrix A ∈ Mn,n (F) is diagonalisable if there is an invertible matrix P ∈ Mn,n (F) and a
diagonal matrix D ∈ Mn,n (F) such that

A = P DP −1

Exercise 228. Let T : V → V be a linear transformation of a finite-dimensional vector space V , and


let B be any basis of V . Show that T is diagonalisable if and only if [T ]B is diagonalisable.

Theorem 25.4

Let V be a finite-dimensional vector space. A linear transformation T : V → V is diagonalisable


if and only if there is a basis B for V with the property that all elements of B are eigenvectors of T .

Proof. Suppose first that B = {b1 , . . . , bn } is a basis for V and that T (bi ) = λi bi . Then

[T ]B = [ [T (b1 )]B · · · [T (bn )]B ] = [ [λ1 b1 ]B · · · [λn bn ]B ] = diag(λ1 , . . . , λn )

Conversely, suppose that C = {c1 , . . . , cn } is a basis of V such that [T ]C is diagonal and let the entries
on the diagonal be µ1 , . . . , µn . Then considering the j-th column of [T ]C we have that
0
.
 .. 
0
 
[T (cj )]C = 
 µj  = [µj cj ]C

0
 . 
..
0

and therefore T (cj ) = µj cj and cj is an eigenvector of T .

25.2 How to diagonalise a matrix

Algorithm 25.5: Diagonalise a matrix

Given a matrix A ∈ Mn,n (F), to find an invertible P ∈ Mn,n (F) and a diagonal D ∈ Mn,n (F)
such that A = P DP −1 (or conclude that no such exist):

1. Calculate the eigenvalues of A

2. For each eigenvalue, find a basis for the corresponding eigenspace.

3. Let B be the union of the eigenspace bases.

4.  If |B| < n, then A is not diagonalisable.
     Otherwise, A is diagonalisable and A = P DP −1 with the following P, D ∈ Mn,n (F):
    Label the elements of B as B = {b1 , . . . , bn } and let λi ∈ F be such that Abi = λi bi .
    Take P = [ [b1 ]S · · · [bn ]S ] and D = diag(λ1 , . . . , λn )
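The algorithm translates directly into code. Below is a minimal sketch (assuming sympy; the helper name diagonalise is ours, and sympy's built-in Matrix.diagonalize performs the same task):

    import sympy as sp

    def diagonalise(A):
        # Algorithm 25.5: return (P, D) with A = P D P^{-1},
        # or None if A is not diagonalisable over the field of its entries.
        n = A.rows
        basis, eigenvalues = [], []
        for lam, alg_mult, vecs in A.eigenvects():
            basis.extend(vecs)                    # step 2: a basis of each eigenspace
            eigenvalues.extend([lam] * len(vecs))
        if len(basis) < n:                        # step 4: |B| < n
            return None
        P = sp.Matrix.hstack(*basis)              # columns [b_1]_S, ..., [b_n]_S
        D = sp.diag(*eigenvalues)
        return P, D

    A = sp.Matrix([[1, 4], [1, 1]])               # the matrix of Example 25.6 below
    P, D = diagonalise(A)
    assert A == P * D * P.inv()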


Example 25.6. Let A = [ 1 4 ; 1 1 ] ∈ M2,2 (R). The characteristic polynomial of A is given by

cA (x) = det [ 1−x 4 ; 1 1−x ] = x² − 2x − 3 = (x − 3)(x + 1)

The eigenvalues are: −1 and 3.


For eigenvalue −1 we have

A − (−1)I2 = [ 2 4 ; 1 2 ]  (R1 ↔ R2 ) → [ 1 2 ; 2 4 ]  (R2 − 2R1 ) → [ 1 2 ; 0 0 ]

A basis for the λ = −1 eigenspace is {(−2, 1)}.
For eigenvalue 3 we have

A − 3I2 = [ −2 4 ; 1 −2 ]  (R1 ↔ R2 ) → [ 1 −2 ; −2 4 ]  (R2 + 2R1 ) → [ 1 −2 ; 0 0 ]

A basis for the λ = 3 eigenspace is {(2, 1)}.
B = {(−2, 1), (2, 1)} is a basis for R2 and if we define

P = [ −2 2 ; 1 1 ]   and   D = [ −1 0 ; 0 3 ]

we then have A = P DP −1 .
Example 25.7. Let B = [ 1 4 ; 0 1 ] ∈ M2,2 (R). The characteristic polynomial of B is given by

cB (x) = det [ 1−x 4 ; 0 1−x ] = (x − 1)(x − 1)

The only eigenvalue is: 1
We have

B − I2 = [ 0 4 ; 0 0 ]  (¼ × R1 ) → [ 0 1 ; 0 0 ]

A basis for the λ = 1 eigenspace is {(1, 0)}.
The matrix B is not diagonalisable: the only eigenspace is 1-dimensional, so there is no basis of R2 consisting of eigenvectors of B.
Example 25.8. Let M = [ 6 −5 ; 8 −6 ] ∈ M2,2 (R). The characteristic polynomial of M is given by

cM (x) = det [ 6−x −5 ; 8 −6−x ] = x² + 4

The matrix M has no eigenvalues (in R). The matrix M is not diagonalisable (over R).

Now let N = [ 6 −5 ; 8 −6 ] ∈ M2,2 (C). The characteristic polynomial of N is given by

cN (x) = det [ 6−x −5 ; 8 −6−x ] = x² + 4

The eigenvalues of N are: −2i and 2i.
For eigenvalue λ = −2i we have

N − (−2i)I2 = [ 6+2i −5 ; 8 −6+2i ]  (R2 − (8/(6+2i))R1 ) → [ 6+2i −5 ; 0 0 ]  ((1/(6+2i)) × R1 ) → [ 1 ¼(−3+i) ; 0 0 ]

A basis for the λ = −2i eigenspace is {(3 − i, 4)}.
For eigenvalue λ = 2i we have

N − 2iI2 = [ 6−2i −5 ; 8 −6−2i ]  (R2 − (8/(6−2i))R1 ) → [ 6−2i −5 ; 0 0 ]  ((1/(6−2i)) × R1 ) → [ 1 ¼(−3−i) ; 0 0 ]

A basis for the λ = 2i eigenspace is {(3 + i, 4)}.
The matrix N is diagonalisable (over C) and we have

N = P DP −1 with P = [ 3−i 3+i ; 4 4 ] and D = [ −2i 0 ; 0 2i ]
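A numerical cross-check over C (a minimal sketch, assuming numpy):

    import numpy as np

    N = np.array([[6.0, -5.0], [8.0, -6.0]])
    evals, evecs = np.linalg.eig(N)
    print(evals)   # approximately 2j and -2j (in some order)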


25.3 Exercises

229. Decide which of the matrices A in questions 219 above are diagonalisable and, if possible, find
an invertible matrix P and a diagonal matrix D such that P −1 AP = D.

230. Decide which of the matrices A in questions 221 above are diagonalisable and, if possible, find
an invertible matrix P and a diagonal matrix D such that P −1 AP = D.


Further material for lecture 25

 Let V be a finite dimensional vector space over a field F. Although not all linear transformations
T : V → V are diagonalisable, if the field is C then there does always exist a basis such that
[T ]B is upper triangular. Here’s an outline of a proof.

– Use induction on dim(V ). Suppose dim(V ) = n + 1.
– Since F = C, T has an eigenvalue λ and corresponding eigenvector u ∈ V \ {0}.
– Extend to a basis {u, b1 , . . . , bn } for V and let W = span{b1 , . . . , bn }.
– Define P : V → W by P (u) = 0 and P (bi ) = bi .
– Let ϕ : W → W be given by ϕ(w) = P ◦ T (w).
– By the induction hypothesis, there is a basis C = {w1 , . . . , wn } for W such that [ϕ]C is upper
triangular.
– A = {u, w1 , . . . , wn } is a basis for V .
– T (u) ∈ span{u} and T (wi ) ∈ span{u, w1 , . . . , wi } (because ϕ(wi ) ∈ span{w1 , . . . , wi })
– Therefore [T ]A is upper triangular.



LECTURE 26

Diagonalisation II

We look at sufficient conditions for diagonalisability and see why the diagonalisation method given
in the last lecture works.

26.1 Distinct eigenvalues

Lemma 26.1

Let T : V → V be a linear transformation of a vector space V . Suppose that v1 , . . . , vk ∈ V are


eigenvectors of T and let λi ∈ F be the corresponding eigenvalues. If the λi are distinct, then the
set {v1 , . . . , vk } is linearly independent.

Proof. We use induction on k.
The base case is k = 1. We want to show that {v1 } is linearly independent. Since v1 is an eigenvector, v1 ≠ 0 and it follows that {v1 } is linearly independent.
For the induction step, assume that the statement is true for k = n ∈ N. We want to show that it holds for k = n + 1. So suppose that we have eigenvectors v1 , . . . , vn+1 ∈ V \ {0} with distinct eigenvalues λ1 , . . . , λn+1 ∈ F. Suppose that α1 , . . . , αn+1 ∈ F are such that α1 v1 + · · · + αn+1 vn+1 = 0. We want to show that αi = 0 for all i ∈ {1, . . . , n + 1}. We have

α1 v1 + · · · + αn+1 vn+1 = 0    (1)

Applying T to both sides of (1) gives α1 T (v1 ) + · · · + αn+1 T (vn+1 ) = 0, and hence

α1 λ1 v1 + · · · + αn+1 λn+1 vn+1 = 0    (2)

Multiplying (1) by λn+1 and then subtracting (2) yields

α1 (λn+1 − λ1 )v1 + · · · + αn (λn+1 − λn )vn = 0

(the (n + 1)-st term cancels). Since {v1 , . . . , vn } is linearly independent (by the induction hypothesis), αi (λn+1 − λi ) = 0 for all i ∈ {1, . . . , n}, and since λn+1 ≠ λi this gives αi = 0 for all i ∈ {1, . . . , n}.
Finally, note that we now have, from (1), that αn+1 vn+1 = 0. Since vn+1 ≠ 0 this implies that αn+1 = 0 also. Therefore the result holds for k = n + 1.
Therefore, by mathematical induction, the result holds for all k ∈ N.
26-2 MAST10022 Linear Algebra: Advanced, 2024

Example 26.2. For the matrix of Example 24.1, (1, 2, 1) is an eigenvector with eigenvalue 2 and
(3, −3, −1) is an eigenvector with eigenvalue 3. The set {(1, 2, 1), (3, −3, −1)} is linearly indepen-
dent.

Exercise 231. Let T : V → V be a linear transformation of a finite dimensional vector space V .


Suppose that λ1 6= λ2 are two eigenvalues of T and let the corresponding eigenspaces be W1 and W2 .
(a) Show that W1 ∩ W2 = {0}

(b) Let B1 be a basis for W1 and let B2 be a basis for W2 . Show that B1 ∪ B2 is linearly independent.

The following result justifies the technique given in the last lecture for diagonalising a matrix.

Theorem 26.3

Let T : V → V be a linear transformation of an n-dimensional vector space V . Then T is


diagonalisable if and only if the geometric multiplicities sum to n.

Proof. Let the eigenvalues of T be λ1 , . . . , λk . Denote by gi and ai the geometric and algebraic mul-
tiplicities of the eigenvalue λi . We know from Proposition 24.7 that gi 6 ai . We also know that
a1 + · · · + ak 6 n.
Suppose that T is diagonalisable. Then there exists a basis B for V with the property that each
element of B is an eigenvector of T . For i ∈ {1, . . . , k} let Bi = {b ∈ B | b has eigenvalue λi }. Note
that B = B1 ∪ B2 ∪ · · · ∪ Bk and that Bi ∩ Bj = ∅ if i ≠ j. We have

n = |B| = |B1| + |B2| + · · · + |Bk| ≤ g1 + g2 + · · · + gk ≤ a1 + a2 + · · · + ak ≤ n

Therefore g1 + · · · + gk = n.
For the converse, suppose now that g1 + · · · + gk = n. Let Ci be a basis for the λi eigenspace. Then |Ci| = gi. From Exercise 231 we know that Ci ∩ Cj = ∅ if i ≠ j. Therefore, if we define C = C1 ∪ C2 ∪ · · · ∪ Ck, we have

|C| = |C1| + |C2| + · · · + |Ck| = g1 + g2 + · · · + gk = n
All elements of C are eigenvectors. If we show that C is linearly independent we will be done since it
would follow that C is a basis.
Denote the elements of Ci by Ci = {ci,1, . . . , ci,gi}. Then

∑i ∑j αi,j ci,j = 0 ⟹ u1 + · · · + uk = 0    where we define ui = αi,1 ci,1 + · · · + αi,gi ci,gi
⟹ ∀i, ui = 0    (by Lemma 26.1)
⟹ ∀i ∀j, αi,j = 0    (since Ci is linearly independent)

Corollary 26.4

Let T : V → V be a linear transformation of an n-dimensional vector space V and let λ ∈ F be an


eigenvalue of T . If the geometric multiplicity of λ is strictly less than its algebraic multiplicity,
then T is not diagonalisable.
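The criterion of Theorem 26.3 is straightforward to test numerically. The following is a minimal sketch, assuming numpy is available and floating-point tolerances are acceptable; the test matrix is a hypothetical example, and each eigenspace dimension is computed as n − rank(A − λI).

    # A minimal numerical sketch of the criterion in Theorem 26.3, assuming numpy.
    # A is diagonalisable exactly when the geometric multiplicities sum to n.
    import numpy as np

    def geometric_multiplicities(A, tol=1e-9):
        """Return a list of (eigenvalue, dimension of its eigenspace) pairs."""
        n = A.shape[0]
        mults = []
        for lam in np.linalg.eigvals(A):
            if any(abs(lam - mu) < 1e-6 for mu, _ in mults):
                continue                      # this eigenvalue is already handled
            rank = np.linalg.matrix_rank(A - lam * np.eye(n), tol=tol)
            mults.append((lam, n - rank))     # nullity of A - lam*I
        return mults

    A = np.array([[2., 1.], [0., 2.]])        # hypothetical test matrix (a Jordan block)
    gm = geometric_multiplicities(A)
    print(gm, "diagonalisable:", sum(d for _, d in gm) == A.shape[0])   # False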


26.2 A sufficient condition for diagonalisability

Proposition 26.5

Let V be an n-dimensional vector space over F. Let T : V → V be a linear transformation.


If T has n distinct eigenvalues in F, then T is diagonalisable.

Note. The converse is false! Linear transformations that have repeated eigenvalues can still be diag-
onalisable.

Proof. Suppose that λ1 , . . . , λn ∈ F are eigenvalues of T and that λi 6= λj when i 6= j. Let mi be the
geometric multiplicity of the eigenvalue λi. Since each mi ≥ 1 and m1 + · · · + mn ≤ n, we must have that m1 + · · · + mn = n.
Diagonalisability then follows from Theorem 26.3.
Example 26.6. 1. The matrix [1 2 3 4; 0 2 3 4; 0 0 3 4; 0 0 0 4] ∈ M4,4(F5) is diagonalisable (and we can see that without the need for any calculations: it is upper triangular with four distinct diagonal entries, so it has four distinct eigenvalues in F5).

2. The matrix [1 − 3i  4i; −2i  1 + 3i] ∈ M2,2(C) has eigenvalues 1 + i and 1 − i and is therefore diagonalisable.

26.3 Exercises

232. Give an example of a linear transformation T : R2 → R2 that is diagonalisable and has only one
eigenvalue.
 
233. Let E = [4 −2 −1; −2 4 −1; −1 −1 1] ∈ M3,3(C)

(a) Calculate the characteristic polynomial of E.


(b) Find the eigenvalues of E.
(c) Without finding any eigenspaces, explain why E is diagonalisable and write down a di-
agonal matrix D that is similar to E.

234. (a) Let T : Q2 → Q2 be the linear transformation given by T (x, y) = (x + y, x − y). Find the
eigenvalues of T . Is T diagonalisable? If it is diagonalisable, give a diagonal matrix D
such that [T ]B = D with respect to some basis B. (You are not being asked to find B.)
(b) Let S : R2 → R2 be the linear transformation given by S(x, y) = (x + y, x − y). Find the
eigenvalues of S. Is S diagonalisable? If it is diagonalisable, give a diagonal matrix D such
that [S]B = D with respect to some basis B. (You are not being asked to find B.)


Further material for lecture 26

 References for diagonalisation

Elementary Linear Algebra by Anton and Rorres, §5.2, p302


The Art of Proof by Beck and Geoghegan, §5.C, p155



LECTURE 27

Powers of a matrix and the Cayley-Hamilton theorem

27.1 Matrix powers

In applications one often needs to apply a transformation many times. If the transformation can be represented by a diagonalisable matrix A, then the powers A^k, and hence the effect of k applications of the transformation, are easy to compute. The first point to appreciate is that computing powers of a diagonal matrix D is easy.
 
Example 27.1. Let D = [1 0 0; 0 −3 0; 0 0 2]. Then

D² = [1 0 0; 0 9 0; 0 0 4],   D³ = [1 0 0; 0 −27 0; 0 0 8],   D^k = [1 0 0; 0 (−3)^k 0; 0 0 2^k]

Example 27.2. The matrix A = [1 4; 1 1] ∈ M2,2(R) is diagonalisable: A = P D P⁻¹ with P = [−2 2; 1 1] and D = [−1 0; 0 3]. Therefore,

A^k = P D^k P⁻¹ = (1/4) [−2 2; 1 1] [(−1)^k 0; 0 3^k] [−1 2; 1 2] = (1/4) [2((−1)^k + 3^k)  4(3^k − (−1)^k); 3^k − (−1)^k  2((−1)^k + 3^k)]
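As a quick sanity check, the computation in Example 27.2 can be reproduced numerically; the following is a sketch assuming numpy is available.

    # A sketch, assuming numpy, of computing A^k via the factorisation A = P D P^{-1}.
    import numpy as np

    A = np.array([[1., 4.], [1., 1.]])
    evals, P = np.linalg.eig(A)               # columns of P are eigenvectors
    k = 5
    Ak = P @ np.diag(evals ** k) @ np.linalg.inv(P)   # A^k = P D^k P^{-1}
    assert np.allclose(Ak, np.linalg.matrix_power(A, k))
    print(np.round(Ak))                       # matches the closed form above with k = 5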

Example 27.3. As an application we will investigate a simple model of population movement be-
tween Victoria and Queensland. Assume that 2% of Victorians move to Queensland each year, 1%
of Queenslanders move to Victoria each year and everybody else stays put. This is an example of a
(discrete-time) 'Markov process'. We investigate what happens in the long term under these assump-
tions.
Let vi be the Victorian population (in millions) after i years and qi be the Queensland population (in
millions) after i years.

vi+1 = 0.98vi + 0.01qi


qi+1 = 0.02vi + 0.99qi
     
Which we can write as [vi+1; qi+1] = A [vi; qi] where A = (1/100)[98 1; 2 99]. Then [vk; qk] = A^k [v0; q0].
   
Diagonalisation gives A = P D P⁻¹ with P = [1 1; 2 −1] and D = [1 0; 0 0.97], so

A^k = P [1 0; 0 (0.97)^k] P⁻¹ = (1/3) [1 + 2(0.97)^k  1 − (0.97)^k; 2 − 2(0.97)^k  2 + (0.97)^k]

and hence

vk = (1/3)(1 + 2(0.97)^k) v0 + (1/3)(1 − (0.97)^k) q0
qk = (1/3)(2 − 2(0.97)^k) v0 + (1/3)(2 + (0.97)^k) q0

Therefore vk → (1/3)(v0 + q0) and qk → (2/3)(v0 + q0) as k → ∞.
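A short numerical experiment, assuming numpy and hypothetical initial populations, illustrates the convergence:

    # A sketch, assuming numpy, of the limit behaviour in Example 27.3.
    import numpy as np

    A = np.array([[0.98, 0.01], [0.02, 0.99]])
    x0 = np.array([4.0, 3.0])                 # hypothetical (v0, q0), in millions
    x500 = np.linalg.matrix_power(A, 500) @ x0
    print(x500)                               # close to ((v0+q0)/3, 2(v0+q0)/3) = (7/3, 14/3)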



27.2 Cayley-Hamilton Theorem

Theorem 27.4: Cayley-Hamilton Theorem

Given a matrix A ∈ Mn,n (F), let its characteristic polynomial be

cA (x) = a0 + a1 x + · · · + an−1 xn−1 + xn

Then
a0 In + a1 A + · · · + an−1 An−1 + An = 0

That is, every square matrix satisfies its own characteristic equation.

Example 27.5. The characteristic polynomial of A = [3 2; 1 4] is cA(x) = x² − 7x + 10 and

A² − 7A + 10 I2 = [11 14; 7 18] − 7[3 2; 1 4] + 10[1 0; 0 1] = [0 0; 0 0]

Proof of Cayley-Hamilton for diagonalisable matrices. First suppose that D ∈ Mn,n(F) is diagonal and let the entries on the diagonal be λ1, . . . , λn ∈ F. Then cD(x) = (x − λ1)(x − λ2) · · · (x − λn). To see that (D − λ1 I)(D − λ2 I) · · · (D − λn I) = 0 note that the entry in the i-th row and i-th column of D − λi I is equal to zero.
Now suppose that A = P D P⁻¹ for some invertible matrix P and diagonal matrix D. Then cA(x) = cD(x) by Lemma 24.4. Writing cA(x) = x^n + an−1 x^{n−1} + · · · + a1 x + a0 we have

A^n + an−1 A^{n−1} + · · · + a1 A + a0 In = P D^n P⁻¹ + an−1 P D^{n−1} P⁻¹ + · · · + a1 P D P⁻¹ + a0 P In P⁻¹
= P (D^n + an−1 D^{n−1} + · · · + a1 D + a0 In) P⁻¹
= P 0 P⁻¹ = 0

Remark. A slightly more involved argument can be used to show that upper triangular matrices
satisfy the statement of the theorem. The full proof (at least for F = C and F = R) is then completed
by showing that all matrices in Mn,n (C) are similar to a matrix in upper triangular form.

The Cayley-Hamilton Theorem can be used to show the following.

Corollary 27.6

Let A ∈ Mn,n(F). For all m ≥ 0, A^m can be expressed as a linear combination of I, A, . . . , A^{n−1}.
If A is invertible, then for all m > 0, A^{−m} can be expressed as a linear combination of I, A, . . . , A^{n−1}.

Example 27.7. Given that the matrix

A = [9 18 −24; 7 20 −24; 7 21 −25]   has characteristic polynomial   cA(x) = x³ − 4x² + x + 6,


we know that

A³ = 4A² − A − 6I3
A⁴ = 4A³ − A² − 6A = 4(4A² − A − 6I3) − A² − 6A = 15A² − 10A − 24I3
A⁵ = 4A⁴ − A³ − 6A² = 4(15A² − 10A − 24I3) − (4A² − A − 6I3) − 6A² = 50A² − 39A − 90I3
A⁻¹ = −(1/6)A² + (2/3)A − (1/6)I3
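These identities are easy to verify numerically; the following sketch assumes numpy is available.

    # A sketch, assuming numpy, verifying c_A(A) = 0 and the expression for A^{-1}
    # obtained from the Cayley-Hamilton theorem in Example 27.7.
    import numpy as np

    A = np.array([[9., 18., -24.], [7., 20., -24.], [7., 21., -25.]])
    I = np.eye(3)
    A2 = A @ A
    print(np.allclose(A2 @ A - 4 * A2 + A + 6 * I, 0))   # True: c_A(A) = 0
    Ainv = -(1 / 6) * A2 + (2 / 3) * A - (1 / 6) * I
    print(np.allclose(Ainv @ A, I))                      # True: this really is A^{-1}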

27.3 Exercises

235. By first diagonalising the matrix, find A⁵, where A is

(a) [3 −2; 2 −2]
(b) [9 18 −24; 7 20 −24; 7 21 −25]
(c) (1/8)[8 1 27 5; 0 18 14 −6; 0 2 −18 −6; 0 8 8 −8]

[Note: The matrix has eigenvalues 3, 2, −1 in (b), and 1, 2, −1, −2 in (c).]

236. Suppose the nth pass through a manufacturing process is modelled by the linear equations xn = A^n x0, where x0 is the initial state of the system and

A = (1/5)[3 2; 2 3]

Show that

A^n = [1/2 1/2; 1/2 1/2] + (1/5)^n [1/2 −1/2; −1/2 1/2]

Then, with the initial state x0 = [p; 1 − p], calculate lim_{n→∞} xn.
237. The Fibonacci sequence
0, 1, 1, 2, 3, 5, 8, 13, . . .
is given by the difference equation Fk+2 = Fk+1 + Fk and the initial conditions F0 = 0, F1 = 1.
   
(a) Letting uk = [Fk+1; Fk], show that uk = A^k u0 where A = [1 1; 1 0].
(b) Show that A is diagonalisable, and find P and D such that A = P D P⁻¹.
(c) Use your answer to (b) to calculate A^k.
(d) Use your answer to (c) to show that

Fk = (1/√5) ( ((1 + √5)/2)^k − ((1 − √5)/2)^k )

(e) (Bonus question!) Show that

lim_{k→∞} Fk+1/Fk = (1 + √5)/2    (the 'golden ratio')
238. Two companies, Lemon and LIME, introduce a new type of computer. At the start, their shares
of the market are 60% and 40%. After a year, Lemon kept 85% of its customers and gained 25%
of LIME’s customers; LIME gained 15% of Lemon’s customers and kept 75% of its customers.
Assume that the total market is constant and that the same fractions shift among the firms every
year.


(a) Write down the market share shift as a system of linear equations.
(b) Express the shift in matrix form.
(c) Find the market shares after 5 and 10 years.
(d) Show that the market eventually reaches a steady state, and give the limit market shares.

239. Verify the Cayley-Hamilton Theorem for the following matrices.

(a) [3 6; 1 2]
(b) [0 1 0; 0 0 1; 1 −3 3]

240. Use the Cayley-Hamilton Theorem to calculate the inverse of the matrix [0 1 0; 0 0 1; 1 −3 3].

241. For each matrix, find a non-zero polynomial satisfied by the matrix.

(a) [2 5; 1 −3]
(b) [1 4 −3; 0 3 1; 0 2 −1]

242. (a) Give an example of a matrix A ∈ M3,3 (R) and a quadratic polynomial p ∈ R[x] such that
p(A) = 0.
(b) Give an example of a matrix B ∈ M3,3 (R) and two different cubic polynomials q, r ∈ R[x] such that q(B) = r(B) = 0.


Further material for lecture 27

 Sketch of a proof of the Cayley-Hamilton theorem for upper-triangular matrices

Claim. If A ∈ Mn,n (F) is upper-triangular, then A satisfies its own characteristic polynomial

Sketch of proof. Suppose

A = [λ1 ∗ · · · ∗; 0 ⋱ ∗; 0 0 λn]   so that   cA(x) = (x − λ1)(x − λ2) · · · (x − λn)

Let ei ∈ Mn,1(F) be the column matrix with a 1 in the i-th row and zeros everywhere else. Define subspaces V0, V1, . . . , Vn ≤ Mn,1(F) by V0 = {0} and Vk = span{e1, . . . , ek} for k ∈ {1, 2, . . . , n}. Note that

∀u ∈ Vk, (A − λk In)u ∈ Vk−1

It follows, applying the factors one at a time starting with A − λn In, that

∀u ∈ Mn,1(F), (A − λ1 In) · · · (A − λn In)u ∈ V0

Since this holds for all u ∈ Mn,1(F), we conclude that (A − λ1 In) · · · (A − λn In) = 0.

 Combining this with the result that all elements of Mn,n (C) are similar to an upper triangular
matrix gives a proof of Cayley-Hamilton in the case F = C. In fact, the same argument works
for any ‘algebraically closed’ field, and it’s then not much work to extend to an arbitrary F.



LECTURE 28

Geometry in Euclidean space

We want to be able to consider geometric notions such as length and angle in a vector space. Before
defining the general notion of an inner product space, we revise the familiar context of Rn .

28.1 Dot product in Rn

To work with length, distance, and angle we need more than just the vector space properties of Rn .

Definition 28.1: Dot product

Let u = (u1 , . . . , un ) ∈ Rn and v = (v1 , . . . , vn ) ∈ Rn . We define the dot product of u and v to be

u · v = u1 v1 + u2 v2 + · · · + un vn ∈ R

Remark. Denoting the standard basis of Rn by S, we have u · v = [u]TS [v]S

Exercise 243. Use the definition of the dot product to verify the following properties. For all u, v, w ∈ Rn and α ∈ R,

(a) u · v = v · u
(b) (αu) · v = α(u · v)
(c) u · (v + w) = u · v + u · w
(d) u · u ≥ 0
(e) u · u = 0 ⟺ u = 0

Definition 28.2: Length and distance

The length (or magnitude or norm) of a vector u = (u1 , u2 , . . . , un ) ∈ Rn is given by


‖u‖ = √(u · u) = √(u1² + u2² + · · · + un²)

A vector u ∈ Rn is called a unit vector if ‖u‖ = 1.
The distance between two vectors u, v ∈ Rn is given by d(u, v) = ‖v − u‖

Example 28.3. ‖(1, −2, 2)‖ = 3 and ‖(1/3)(1, −2, 2)‖ = 1, so (1/3)(1, −2, 2) is a unit vector.

Example 28.4. The distance between two points P(1, 3, −1) and Q(2, 1, −3) is the distance between their position vectors: d(P, Q) = d((1, 3, −1), (2, 1, −3)) = ‖(1, 3, −1) − (2, 1, −3)‖ = ‖(−1, 2, 2)‖ = 3

Definition 28.5: Angle

The angle θ between two non-zero vectors u, v ∈ Rn is given by the expression

u · v = ‖u‖ ‖v‖ cos θ where 0 ≤ θ ≤ π

We say that u and v are orthogonal (or perpendicular) if u · v = 0.


We say that u and v are parallel if one is a scalar multiple of the other.

28.2 Cross product in R3

Definition 28.6: Cross product

Let u = (u1 , u2 , u3 ), v = (v1 , v2 , v3 ) ∈ R3 . The cross product (or vector product) of u and v is
the vector given by

u × v = (u2 v3 − u3 v2 )i + (u3 v1 − u1 v3 ) j + (u1 v2 − u2 v1 )k

Remark. A convenient way to remember this is as a 'determinant'

u × v = det[i j k; u1 u2 u3; v1 v2 v3] = det[u2 u3; v2 v3] i − det[u1 u3; v1 v3] j + det[u1 u2; v1 v2] k

using cofactor expansion along the first row.

Exercise 244 (Properties of the cross product). Show (directly from the definitions) that for any vectors u, v and w ∈ R3, and scalar α ∈ R:

(a) u × v = −(v × u)
(b) u × (v + w) = (u × v) + (u × w)
(c) (αu) × v = α(u × v)
(d) u × u = 0
(e) u · (u × v) = 0

Example 28.7. We can use the cross product to find a vector perpendicular to both (2, 3, 1) and
(1, 1, 1).

(2, 3, 1) × (1, 1, 1) = (2, −1, −1)

Note that (2, −1, −1) · (2, 3, 1) = 0 and (2, −1, −1) · (1, 1, 1) = 0

Lemma 28.8

The cross product satisfies

u × v = ‖u‖ ‖v‖ sin(θ) n̂

where
 n̂ is a unit vector perpendicular to both u and v that points in the direction given by the 'right-hand rule'
 θ ∈ [0, π] is the angle between u and v

28.3 Lines in R3

Lines through the origin are exactly the 1-dimensional subspaces of R3 . Every line in R3 is a translate
of a 1-dimensional subspace of R3 in the following way.


All lines in R3 are of the form

r0 + span(v) = r0 + {t v | t ∈ R}

for some (not unique) r0, v ∈ R3.

The vector equation of a line through a point P0 in the direction determined by a vector v is (where r0 = OP0):

r = r0 + t v,   t ∈ R

Letting r = (x, y, z), r0 = (x0, y0, z0) and v = (a, b, c) and then equating coordinates gives the parametric equations for the line:

x = x0 + ta,   y = y0 + tb,   z = z0 + tc,   t ∈ R

If a ≠ 0, b ≠ 0 and c ≠ 0, we can solve the parametric equations for t and equate. This gives the cartesian form of the straight line:

(x − x0)/a = (y − y0)/b = (z − z0)/c

Example 28.9. Consider the line passing through the points P(−1, 2, 3) and Q(4, −2, 5). It has

vector equation: (x, y, z) = (−1, 2, 3) + t(5, −4, 2), t ∈ R

parametric equations: x = −1 + 5t, y = 2 − 4t, z = 3 + 2t, t ∈ R

cartesian form: (x + 1)/5 = (y − 2)/(−4) = (z − 3)/2

Definition 28.10

Two lines are said to:

 intersect if there is a point lying in both

 be parallel if their direction vectors are parallel

 be skew if they do not intersect and are not parallel

The angle between two lines is the angle between their direction vectors

Example 28.11. Consider the three lines having parametric equations

L1: x = 1 + t, y = 2 − 4t, z = 3 + 2t, t ∈ R
L2: x = −4 + 3t, y = −6 + 2t, z = 3 + t, t ∈ R
L3: x = (1/2)t, y = 1 − 2t, z = 2 + t, t ∈ R

Then L1 and L2 intersect, L1 and L3 are parallel, L2 and L3 are skew.


Distance between a point and a line

Given a point with position vector p and a line with vector equation r = r0 + t u, t ∈ R, the distance from the point to the line is given by

d = ‖u × (p − r0)‖ / ‖u‖

Exercise 245. Use Lemma 28.8 to derive the above expression for the distance between a point and a
line.

Distance between two skew lines

Given two skew lines having vector equations

r = r1 + t u, t ∈ R   and   r = r2 + t v, t ∈ R

the distance between them is given by

d = |(u × v) · (r2 − r1)| / ‖u × v‖

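The two distance formulas above translate directly into code; here is a minimal sketch, assuming numpy, with hypothetical input vectors.

    # A sketch, assuming numpy, of the point-line and skew-line distance formulas.
    import numpy as np

    def dist_point_line(p, r0, u):
        """Distance from the point p to the line r = r0 + t u."""
        return np.linalg.norm(np.cross(u, p - r0)) / np.linalg.norm(u)

    def dist_skew_lines(r1, u, r2, v):
        """Distance between the skew lines r = r1 + t u and r = r2 + t v."""
        n = np.cross(u, v)
        return abs(np.dot(n, r2 - r1)) / np.linalg.norm(n)

    p = np.array([1., 0., 0.])
    r0, u = np.array([0., 0., 0.]), np.array([0., 0., 1.])
    print(dist_point_line(p, r0, u))          # 1.0: distance from (1,0,0) to the z-axis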
28.4 Planes in R3

Planes through the origin are exactly the 2-dimensional subspaces of R3. All planes in R3 are of the form r0 + W where W is a 2-dimensional subspace of R3.

The vector equation of a plane through a point P0 and parallel to both u, v ∈ R3 is (where r0 = OP0):

r = r0 + s u + t v,   s, t ∈ R

The cartesian form of a plane is:

ax + by + cz = d

where a, b, c, d ∈ R and the vector (a, b, c) is perpendicular to both u and v. Such a vector is called a normal to the plane.

Examples 28.12.

1. The plane perpendicular to the direction (1, 2, 3) and through the point (4, 5, 6) is given by x + 2y + 3z = d where d = 1 × 4 + 2 × 5 + 3 × 6. That is, x + 2y + 3z = 32.
2. Consider the plane perpendicular to (1, 0, −2) and containing the point (1, −1, −3).
The cartesian equation is: x − 2z = 7
A vector equation is: (x, y, z) = (1, −1, −3) + s(0, 1, 0) + t(2, 0, 1), s, t ∈ R
3. Consider the plane with vector equation
(x, y, z) = (1, 2, 3) + s(2, 3, 1) + t(1, 1, 1), s, t ∈ R
To find a cartesian equation for the plane, we need a normal to the plane. For this we can use the fact that (2, 3, 1) × (1, 1, 1) = (2, −1, −1) is orthogonal to both (2, 3, 1) and (1, 1, 1). The cartesian equation is of the form 2x − y − z = d. Since (1, 2, 3) lies in the plane, we have that d = 2 − 2 − 3 = −3. The cartesian equation is
2x − y − z = −3

Distance between a point and a plane

Given a point p and a plane that has normal vector n and contains a point r0, the distance from the point to the plane is given by

d = |(p − r0) · n| / ‖n‖


28.5 Exercises

246. Find the following dot products:

(a) (1, 1, 1) · (2, 1, −3)
(b) (2, 1, 1) · (1, −3, 7)
(c) (√2, π, 1) · (√2, −2, 3)

247. Find the angle between the following pairs of vectors in R3 :

(a) (1, 0, 0), (0, 0, 4) (b) (1, −1, 0), (0, 1, 1) (c) (2, −2, 2), (−1, 0, 2)

248. Let u = (3, −1, 4), v = −i − 3j + k, w = (−1, 1, 2). Find (if they exist):

(a) u · v
(b) (3u) · (−2v)
(c) u × v
(d) (v × u) · (−w)
(e) v × 2u
(f) u × (v · w)
(g) (u × v) · w
(h) u · (v · w)
(i) u · (v × w)

249. Find the values of x such that the following pairs of vectors are (i) orthogonal and (ii) parallel.

(a) (x, 1 − 2x, 3) and (1, −x, 3x) (b) (x, x, −1) and (1, x, 6)

250. Write down the equation for the following lines in both vector and cartesian form.

(a) the line passing through P (2, 1, −3) and parallel to v = (1, 2, 2)
(b) the line through P (2, −3, 1) and parallel to the x-axis
(c) the line passing through the points P (2, 0, −2) and Q(1, 4, 2)
(d) the line through P (2, 4, 5) and perpendicular to the plane 5x − 5y − 10z = 2

251. Determine whether the lines L1 and L2 are parallel, intersecting or skew (not parallel or inter-
secting). If they intersect, find the point of intersection. Let the parameters s, t ∈ R.

(a) L1 : x = −6t, y = 1 + 9t, z = −3t and L2 : x = 1 + 2s, y = 4 − 3s, z = s


(b) L1 : x = 1 + t, y = 1 − t, z = 2t and L2 : x = 2 − s, y = s, z = 2
(c) L1: (x − 4)/2 = (y + 5)/4 = (z − 1)/(−3) and L2: x − 2 = (y + 1)/3 = z/2
252. Find the equations of the following planes in both cartesian and (vector) parametric form:

(a) the plane through the point (1, 4, 5) and perpendicular to the vector (7, 1, 4)
(b) the plane through the point (6, 5, −2) and parallel to the plane x + y − z + 1 = 0
(c) the plane through the origin and the points (1, 1, 1) and (1, 2, 3)
(d) the plane that passes through the point (1, 6, −4) and contains the line

x = 1 + 2t, y = 2 − 3t, z = 3 − t where t ∈ R


253. (a) Show that three points A, B and C are collinear if and only if AB × AC = 0. Are the
points A(1, 2, 3), B(3, 1, 0) and C(9, −2, −9) collinear? If yes, find the equation of the line
containing these points.
(b) Show that four points A, B, C and D are coplanar if and only if
AB · (AC × AD) = 0.

Are the points A(1, 1, 1), B(2, 1, 3), C(3, 2, 1) and D(4, 2, 3) coplanar? If yes, find the equa-
tion of the plane containing these points.


254. (a) Find the point of intersection of the line r(t) = (2, 1, 1) + t(−1, 0, 4); t ∈ R with the plane
x − 3y − z = 1.
(b) Find the point of intersection of the line x = 1 + t, y = 2t, z = 3t; t ∈ R with the plane
x + y + z = 1.

255. Find the angle between:

(a) the lines x − 3 = 2 − y, z = 1 and x = 7, y − 2 = z − 5


(b) the planes 2x + y + 3z = 0 and 3x − 2y + 4z − 4 = 0
(c) the line x = 2t − 7, y = 4t − 6, z = t − 5; t ∈ R and a vector normal to the plane
x + 2y − 4z = 0

256. Given the line ` determined by the equations 2x − y + z = 0, x + z − 1 = 0, and M the point
(1, 3, −2), find a cartesian equation of the plane:

(a) passing through M and `


(b) passing through M and orthogonal to `


Extra material for lecture 28

 The cross product is not associative. For example

((1, 0, 0) × (0, 1, 0)) × (1, 1, 0) = (−1, 1, 0) 6= (0, 1, 0) = (1, 0, 0) × ((0, 1, 0) × (1, 1, 0))



LECTURE 29

Inner products

An inner product on a vector space is a generalisation of the dot product on Rn seen in the last lecture.
It will be used to define geometric notions such as length and angle.
When dealing with inner products, the field F will always be either R or C. Given an element α ∈ F,
we denote by ᾱ its complex conjugate and by |α| its absolute value.

29.1 Definition of inner product

Definition 29.1: Inner product

Let F be one of R or C. Let V be a vector space over F.


An inner product on V is a function V × V → F (with the image of (u, v) being denoted ⟨u, v⟩) that satisfies the following conditions. For all u, v, w ∈ V and α ∈ F:

1) ⟨u, v⟩ = \overline{⟨v, u⟩}
2) α⟨u, v⟩ = ⟨αu, v⟩
3) ⟨u + v, w⟩ = ⟨u, w⟩ + ⟨v, w⟩
4) (a) ⟨u, u⟩ ≥ 0
   (b) ⟨u, u⟩ = 0 ⟹ u = 0

The case in which F = R is sometimes referred to as a real inner product.


The case in which F = C is sometimes referred to as an Hermitian inner product.
A vector space V together with a fixed inner product is called an inner product space.

Note.
 The first condition implies that ∀u ∈ V, ⟨u, u⟩ ∈ R and therefore the inequality in 4(a) makes sense.
 The first and second conditions together imply that ∀u, v ∈ V ∀α ∈ F, ⟨u, αv⟩ = ᾱ⟨u, v⟩.
 It's possible to have many different inner products on the same vector space.

Exercise 257. Show that ∀v ∈ V, ⟨0, v⟩ = 0.

Examples 29.2.

1. The dot product on Rn is an inner product

⟨(u1, . . . , un), (v1, . . . , vn)⟩ = u1 v1 + · · · + un vn

2. The Hermitian dot product on Cn is

⟨(u1, . . . , un), (v1, . . . , vn)⟩ = u1 v̄1 + · · · + un v̄n

3. The following is an inner product on R2:

⟨(u1, u2), (v1, v2)⟩ = u1 v1 − u1 v2 − u2 v1 + 5 u2 v2

4. V = Mn,n(C), ⟨A, B⟩ = tr(A (B̄)ᵀ)

5. V = Pn(R), ⟨p, q⟩ = ∫_0^1 p(x)q(x) dx

Examples 29.3.

1. The following is not an inner product on R2:

⟨(u1, u2), (v1, v2)⟩ = u1 v1 − 2u1 v2 − 2u2 v1 + 3u2 v2

2. The following is not an inner product on C2:

⟨(u1, u2), (v1, v2)⟩ = u1 v1 + u2 v2

29.2 Length, distance and orthogonality

Definition 29.4: Length and distance

For a vector space V with an inner product ⟨· , ·⟩ we define the length (or norm) of a vector v ∈ V by

‖v‖ = √⟨v, v⟩

The distance between two vectors u, v ∈ V is defined to be

d(v, u) = ‖v − u‖

A vector u ∈ V with ‖u‖ = 1 is called a unit vector.
Two vectors u, v ∈ V are said to be orthogonal if ⟨u, v⟩ = 0

Exercise 258. Let V be an inner product space. Show that

∀u ∈ V ∀α ∈ F, ‖αu‖ = |α| ‖u‖

Exercise 259. Let u, v be orthogonal vectors in an inner product space V. Show that

‖u + v‖² = ‖u‖² + ‖v‖²

Example 29.5. ⟨(u1, u2), (v1, v2)⟩ = u1 v1 + 2u2 v2 defines an inner product on R2. Letting u = (3, 1) and v = (−2, 3), we have

‖v‖² = ⟨(−2, 3), (−2, 3)⟩ = (−2)² + 2 × 3² = 22
d(u, v) = ‖u − v‖ = ‖(5, −2)‖ = √⟨(5, −2), (5, −2)⟩ = √(25 + 8) = √33
⟨u, v⟩ = ⟨(3, 1), (−2, 3)⟩ = 3 × (−2) + 2 × 1 × 3 = 0

The vectors u and v are orthogonal (with respect to this inner product).

Example 29.6. Consider the real vector space V = C[0, 2π] of all continuous functions f : [0, 2π] → R. We equip V with the following inner product

⟨f, g⟩ = ∫_0^{2π} f(x)g(x) dx

The norms of the functions s, c ∈ V given by s(x) = sin(x) and c(x) = cos(x) are:

‖s‖² = ⟨s, s⟩ = ∫_0^{2π} sin²(x) dx = ∫_0^{2π} (1/2)(1 − cos(2x)) dx = [x/2 − (1/4)sin(2x)]_0^{2π} = π

So ‖s‖ = √π. Similarly, ‖c‖ = √π. The vectors s and c are orthogonal since

⟨s, c⟩ = ∫_0^{2π} sin(x)cos(x) dx = ∫_0^{2π} (1/2)sin(2x) dx = [−(1/4)cos(2x)]_0^{2π} = 0
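The integrals in this example can also be checked numerically; the following sketch assumes scipy is available.

    # A numerical sketch of Example 29.6, assuming scipy: the inner products
    # <s,s>, <c,c> and <s,c> on [0, 2*pi] come out as pi, pi and 0.
    import numpy as np
    from scipy.integrate import quad

    def inner(f, g):
        return quad(lambda x: f(x) * g(x), 0, 2 * np.pi)[0]

    print(inner(np.sin, np.sin))   # ~ 3.14159...  (= pi)
    print(inner(np.cos, np.cos))   # ~ 3.14159...
    print(inner(np.sin, np.cos))   # ~ 0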

29.3 The Cauchy-Schwarz inequality

We would like to define what should be meant by the angle between two vectors in an inner product
space by using the same expression as in Definition 28.5. To be sure that it makes sense we need the
following result.

Theorem 29.7: Cauchy-Schwarz inequality

Let V be an inner product space. Then for all u, v ∈ V

|⟨u, v⟩| ≤ ‖u‖ ‖v‖

Further, equality holds if and only if one vector is a multiple of the other.

Proof. Let u, v ∈ V. If v = 0, the result holds because ⟨u, 0⟩ = 0 and ‖0‖ = 0 (see Exercise 257).
We now consider the case in which v is non-zero. Let p = (1/‖v‖²)⟨u, v⟩v. Note that

‖p‖² = ⟨(1/‖v‖²)⟨u, v⟩v, (1/‖v‖²)⟨u, v⟩v⟩ = (1/‖v‖⁴)|⟨u, v⟩|²⟨v, v⟩ = (1/‖v‖²)|⟨u, v⟩|²

and that w = u − p is orthogonal to p since

⟨p, w⟩ = ⟨p, u − p⟩ = ⟨p, u⟩ − ⟨p, p⟩ = (1/‖v‖²)⟨u, v⟩⟨v, u⟩ − (1/‖v‖²)|⟨u, v⟩|² = 0

Then we have

‖u‖² = ‖w + p‖²
     = ‖w‖² + ‖p‖²    (by Exercise 259)
     ≥ ‖p‖²
     = (1/‖v‖²)|⟨u, v⟩|²

This inequality gives ‖u‖²‖v‖² ≥ |⟨u, v⟩|² and hence ‖u‖ ‖v‖ ≥ |⟨u, v⟩|.
The above inequality is an equality iff ‖w‖² = 0, that is, w = 0. We have

w = 0 ⟺ u = p ⟺ u is a multiple of v

Example 29.8. Consider V = P2(R) with inner product given by ⟨p, q⟩ = ∫_0^1 p(x)q(x) dx. Let u = −x and v = x². Then

⟨u, v⟩ = ∫_0^1 −x³ dx = −1/4,   ‖u‖² = ⟨u, u⟩ = ∫_0^1 x² dx = 1/3,   ‖v‖² = ⟨v, v⟩ = ∫_0^1 x⁴ dx = 1/5

so that

|⟨u, v⟩| = 1/4 ≤ 1/√15 = ‖u‖ ‖v‖
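The same computation can be carried out symbolically; here is a sketch assuming sympy is available.

    # A sketch, assuming sympy, reproducing Example 29.8 exactly: with
    # <p,q> = integral_0^1 p(x) q(x) dx, u = -x and v = x^2, check Cauchy-Schwarz.
    import sympy as sp

    x = sp.symbols('x')
    def inner(p, q):
        return sp.integrate(p * q, (x, 0, 1))

    u, v = -x, x**2
    lhs = abs(inner(u, v))                        # 1/4
    rhs = sp.sqrt(inner(u, u) * inner(v, v))      # 1/sqrt(15)
    print(lhs, rhs, float(lhs) <= float(rhs))     # 1/4 sqrt(15)/15 True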


Definition 29.9

Let V be a real inner product space. The angle between two vectors u, v ∈ V is defined to be θ ∈ [0, π] given by

θ = arccos( ⟨u, v⟩ / (‖u‖ ‖v‖) )

Example 29.10. With u, v ∈ V as in Example 29.8 we have that the angle between u and v is

θ = arccos( (−1/4) / (1/√15) ) = arccos(−√15/4)

Lemma 29.11: Triangle inequality for inner product spaces

Let V be an inner product space. Then ∀u, v ∈ V,

‖u + v‖ ≤ ‖u‖ + ‖v‖

Proof.
‖u + v‖² = ⟨u + v, u + v⟩ = ⟨u, u⟩ + ⟨u, v⟩ + ⟨v, u⟩ + ⟨v, v⟩
        = ‖u‖² + 2 Re(⟨u, v⟩) + ‖v‖²    (Re(z) denotes the real part of z ∈ C)
        ≤ ‖u‖² + 2|⟨u, v⟩| + ‖v‖²
        ≤ ‖u‖² + 2‖u‖ ‖v‖ + ‖v‖²    (Cauchy-Schwarz)
        = (‖u‖ + ‖v‖)²

29.4 Exercises
   
260. Given that U = [u1 u2; u3 u4] and V = [v1 v2; v3 v4] are two 2 × 2 matrices, then

⟨U, V⟩ = u1 v1 + u2 v2 + u3 v3 + u4 v4

defines an inner product on M2,2(R).

(a) Compute ⟨U, V⟩ when U = [3 −2; 4 8] and V = [−1 3; 1 1].
(b) Given A = [−2 5; 3 6], find ‖A‖.
(c) Given A = [2 6; 9 4] and B = [−4 7; 1 6], find the distance between them, d(A, B).
(d) Let A = [2 1; −1 3]. Which of the following matrices are orthogonal to A?
    i) [1 1; 0 −1]
    ii) [2 1; 5 2]

261. Given that p = a0 + a1 x + a2 x² and q = b0 + b1 x + b2 x² are two vectors in P2(R), then

⟨p, q⟩ = a0 b0 + a1 b1 + a2 b2

is an inner product on P2(R).

(a) Compute ⟨p, q⟩ if p = −2 + x + 3x² and q = 4 − 7x².
(b) If p = −2 + 3x + 2x², find ‖p‖.
(c) Given p = 3 − x + x², q = 2 + 5x², find the distance between them, d(p, q).
(d) Show that p = 1 − x + 2x² and q = 2x + x² are orthogonal.

262. For x = (x1, x2), y = (y1, y2) ∈ R², define ⟨x, y⟩ = x1 y1 + 3 x2 y2. Show that ⟨x, y⟩ is an inner product on R².

263. In R², let ⟨(x1, x2), (y1, y2)⟩ = x1 y1 − x2 y2. Is this an inner product? If not, why not?

264. Verify that the operation

⟨(x1, x2), (y1, y2)⟩ = x1 y1 − x1 y2 − x2 y1 + 3 x2 y2

is an inner product on R².

265. Decide which of the suggested operations on x = (x1, x2, x3) and y = (y1, y2, y3) in R³ define an inner product:

(a) ⟨x, y⟩ = x1 y1 + 2 x2 y2 + x3 y3
(b) ⟨x, y⟩ = x1² y1² + x2² y2² + x3² y3²
(c) ⟨x, y⟩ = x1 y1 − x2 y2 + x3 y3
(d) ⟨x, y⟩ = x1 y1 + x2 y2

266. Decide which of the following functions ⟨p, q⟩ on real polynomials p(x) = a0 + a1 x + a2 x² and q(x) = b0 + b1 x + b2 x² define inner products on P2(R):

(a) ⟨p, q⟩ = a0 b0 + a1 b1 + a2 b2
(b) ⟨p, q⟩ = a0 b0

267. For the vectors x = (1, 1, 0), y = (0, 1, 0) in R³ compute the norms ‖x‖ and ‖y‖ using the following inner products.

(a) ⟨x, y⟩ = x1 y1 + x2 y2 + x3 y3
(b) ⟨x, y⟩ = x1 y1 + 3 x2 y2 + x3 y3

268. In each part determine whether the given vectors are orthogonal with respect to the Euclidean
inner product (i.e., the usual dot product).

(a) u = (−1, 3, 2), v = (4, 2, −1) (b) u = (0, 3, −2, 1), v = (5, 2, −1, 0)

269. Endow R⁴ with the Euclidean inner product (i.e. the dot product), and let u = (−1, 1, 0, 2). Determine whether the vector u is orthogonal to the following vectors:

(a) w1 = (0, 0, 0, 0)
(b) w2 = (1, −1, 3, 0)
(c) w3 = (4, 0, 9, 2)

270. Let u = (1 + i, 3i) and v = (4, 2 − i). Use the complex dot product on C² to compute:

(a) u · v   (b) v · u   (c) ‖u‖   (d) ‖v‖

271. Let C3 have the complex dot product. If u = (2i, i, 3i) and v = (i, 6i, k), for what values of k ∈ C
are u and v orthogonal?

272. Show that in every real inner product space: v + w is orthogonal to v − w if and only if ‖v‖ = ‖w‖.

273. Prove that the following holds for all vectors x, y in a real inner product space:

‖x + y‖² + ‖x − y‖² = 2‖x‖² + 2‖y‖²


274. Let A be a real invertible n × n matrix. Show that

⟨x, y⟩ ≡ [x]ᵀ Aᵀ A [y] = (A[x])ᵀ A[y]

defines an inner product on Rⁿ, where [x] and [y] are coordinate matrices with respect to the standard basis.
Show that it fails to be an inner product if A is not invertible.
(Hint: If A is not invertible, then its kernel is non-trivial.)

275. Verify that the Cauchy-Schwarz inequality holds for the given vectors using the Euclidean inner product.

(a) u = (−3, 1, 0), v = (2, −1, 3)
(b) u = (−4, 2, 1), v = (8, −4, −2)

276. Use the Cauchy-Schwarz inequality (applied to the Euclidean inner product on Rⁿ) to show that given any a1, a2, . . . , an ∈ R we have that

(a1 + · · · + an)/n ≤ √((a1² + · · · + an²)/n)

277. Consider R2 and R3 , each with the Euclidean inner product. In each part find the cosine of the
angle between u and v.

(a) u = (1, −3), v = (2, 4) (b) u = (−1, 5, 2), v = (2, 4, −9)

278. For the vectors x = (1, 1, 0), y = (0, 1, 0) in R³ compute the angle between x and y using the following inner products.

(a) ⟨x, y⟩ = x1 y1 + x2 y2 + x3 y3
(b) ⟨x, y⟩ = x1 y1 + 3 x2 y2 + x3 y3


Extra material for lecture 29

 References about inner products

Elementary Linear Algebra by Anton and Rorres, §6, p345


Linear Algebra Done Right by Axler, §6.A, p164



LECTURE 30

Orthonormal bases

Some bases for an inner product space fit nicely with the inner product. Before defining the notion
of an orthonormal basis we note that, after choosing a basis, inner products (on finite-dimensional
vector spaces) can be represented by matrices. If the basis is orthonormal, then the corresponding
matrix is particularly simple.
In this lecture, as in the previous, the field F is either R or C.

30.1 Matrix representation of an inner product

Let V be an n-dimensional vector space over F and let B be a basis for V. Fix a matrix M ∈ Mn,n(F). For u, v ∈ V define

⟨u, v⟩ = [u]ᵀ M \overline{[v]}    (†)

where, for legibility, we have written [u] in place of [u]B and [v] in place of [v]B, and \overline{[v]} denotes the entrywise complex conjugate of [v]. The right hand side is a 1 × 1 matrix which we identify with an element of F.
What conditions on M ensure that this gives an inner product on V?

Exercise 279. Show that the function V × V → F defined by (†) satisfies axioms 2 and 3 in the
definition of an inner product.

We need to add a condition on M in order that axiom 1 be satisfied. We have

\overline{⟨v, u⟩} = \overline{[v]ᵀ M \overline{[u]}} = \overline{[v]}ᵀ \overline{M} [u]
 = (\overline{[v]}ᵀ \overline{M} [u])ᵀ    (since a 1 × 1 matrix is equal to its own transpose)
 = [u]ᵀ \overline{M}ᵀ \overline{[v]} = [u]ᵀ M∗ \overline{[v]}    (where M∗ denotes \overline{M}ᵀ)

Therefore axiom 1 is satisfied if and only if M∗ = M.


We fix some terminology.

Definition 30.1

A matrix M ∈ Mn,n(R) is called a real symmetric matrix if Mᵀ = M.
A matrix M ∈ Mn,n(C) is called a Hermitian matrix if M∗ = M.
A matrix M ∈ Mn,n(F) is called positive definite if M∗ = M and the following condition is satisfied:

∀X ∈ Mn,1(F) \ {0},   Xᵀ M \overline{X} > 0

Exercise 280. Show that if M is positive definite, then the function V × V → F defined by (†) is
an inner product on V . (Hint: The only thing left to show is that axiom 4 in the definition of inner
product is satisfied.)

Given an inner product on V , there always exists a matrix M ∈ Mn,n (F) such that the inner product
is given by the expression (†).

Proposition 30.2: matrix representation of an inner product

Let V be a finite-dimensional inner product space and B a basis for V . There exists a matrix
M ∈ Mn,n (F) such that
∀u, v ∈ V, hu, vi = [u]TB M [v]B

Proof. Suppose B = {b1, . . . , bn} and that we have an inner product ⟨·, ·⟩ on V. Define M ∈ Mn,n(F) to be the matrix whose (i, j)-th entry is given by Mij = ⟨bi, bj⟩. That ⟨u, v⟩ = [u]ᵀ M \overline{[v]} for all u, v ∈ V can be readily verified.

Exercise 281. Show that, with M defined as above, we have ⟨u, v⟩ = [u]ᵀ M \overline{[v]} for all u, v ∈ V.

Example 30.3. We saw in a previous example that

⟨(u1, u2), (v1, v2)⟩ = u1 v1 − u1 v2 − u2 v1 + 5 u2 v2

defines an inner product on R². What is the matrix representation of this inner product (with respect to the standard basis)? Noting that ⟨(1, 0), (1, 0)⟩ = 1, ⟨(1, 0), (0, 1)⟩ = −1, ⟨(0, 1), (1, 0)⟩ = −1 and ⟨(0, 1), (0, 1)⟩ = 5, we have that

⟨(u1, u2), (v1, v2)⟩ = [u1 u2] [1 −1; −1 5] [v1; v2]
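One can check directly that this matrix really does give an inner product: it is symmetric and positive definite. A minimal numerical sketch, assuming numpy:

    # A sketch, assuming numpy: the matrix of Example 30.3 is symmetric with
    # positive eigenvalues, hence positive definite, so (†) defines an inner product.
    import numpy as np

    M = np.array([[1., -1.], [-1., 5.]])
    print(np.allclose(M, M.T))                    # True: symmetric
    print(np.all(np.linalg.eigvalsh(M) > 0))      # True: eigenvalues 3 +/- sqrt(5) > 0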

Exercise 282. Let M, N ∈ Mn,n(F). Show that if

∀X, Y ∈ Mn,1(F), Xᵀ M Y = Xᵀ N Y

then M = N.
(It follows that the matrix representation (with respect to a fixed basis) of an inner product is unique.)

30.2 Orthogonal sets of vectors

Definition 30.4: Orthogonal set of vectors

Let V be an inner product space. A set of vectors {v1, . . . , vk} ⊆ V is called orthogonal if ⟨vi, vj⟩ = 0 whenever i ≠ j.

Examples 30.5. 1. {(1, 0, 0), (0, 1, 0), (0, 0, 1)} is orthogonal in R3 with the dot product as inner
product.

2. So is {(1, 1, 1), (1, −1, 0), (1, 1, −2)} ⊆ R3 using the dot product.
R 2π
3. Consider the inner product space of Example 29.6: V = C[0, 2π] and hf, gi = 0 f (x)g(x)dx.
The set {1, sin(x), cos(x)} ⊆ V is orthogonal.

Lemma 30.6

Let V be an inner product space and S ⊆ V \ {0}. If S is orthogonal, then S is linearly independent.


Proof. Suppose that α1, . . . , αk ∈ F and v1, . . . , vk ∈ S are such that α1 v1 + · · · + αk vk = 0. Then for i ∈ {1, . . . , k} we have

α1 v1 + · · · + αk vk = 0 ⟹ ⟨α1 v1 + · · · + αk vk, vi⟩ = 0
⟹ α1⟨v1, vi⟩ + · · · + αk⟨vk, vi⟩ = 0
⟹ αi⟨vi, vi⟩ = 0    (since ⟨vj, vi⟩ = 0 if j ≠ i)
⟹ αi = 0    (since vi ≠ 0)

Having shown that

α1 v1 + · · · + αk vk = 0 ⟹ ∀i, αi = 0

we conclude that the set S is linearly independent.

Example 30.7. 1. The set {(1, 1, 1), (1, −1, 0), (1, 1, −2)} ⊆ R3 is linearly independent.
2. The set {1, sin(x), cos(x)} ⊆ C[0, 2π] is linearly independent.

30.3 Orthonormal bases

Definition 30.8

A set of vectors {v1, . . . , vk} is called orthonormal if it is orthogonal and each vector has length one. That is,

{v1, . . . , vk} is orthonormal if ⟨vi, vj⟩ = 0 for i ≠ j and ⟨vi, vi⟩ = 1

Remark. Any orthogonal set of non-zero vectors can be made orthonormal by multiplying each vector v by 1/‖v‖.

Examples 30.9.

1. In R3 with the dot product:

(a) {(1, 0, 0), (0, 1, 0), (0, 0, 1)} is orthonormal
(b) {(1, 1, 1), (1, −1, 0), (1, 1, −2)} is orthogonal but is not orthonormal
(c) {(1/√3)(1, 1, 1), (1/√2)(1, −1, 0), (1/√6)(1, 1, −2)} is orthonormal

2. In P2(R) with the inner product ⟨p, q⟩ = ∫_0^1 p(x)q(x) dx:

(a) The set {1, x, x²} is not orthogonal.
(b) The set {1, 2x − 1, 6x² − 6x + 1} is orthogonal but not orthonormal.
(c) The set {1, √3(2x − 1), √5(6x² − 6x + 1)} is orthonormal.

3. In C[0, 2π] with the inner product

⟨f, g⟩ = ∫_0^{2π} f(x)g(x) dx

(a) The set {1, sin(x), cos(x)} is orthogonal but not orthonormal.
(b) The set {1/√(2π), (1/√π)sin(x), (1/√π)cos(x)} is orthonormal.
(c) The (infinite) set

{1/√(2π), (1/√π)sin(x), (1/√π)cos(x), (1/√π)sin(2x), (1/√π)cos(2x), . . . }

is orthonormal.


Definition 30.10

Let V be an inner product space. An orthonormal basis for V is a basis that is an orthonormal
set.

Examples 30.11.

1. In R3 with the dot product:

(a) {(1, 0, 0), (0, 1, 0), (0, 0, 1)} is an orthonormal basis
(b) {(1/√3)(1, 1, 1), (1/√2)(1, −1, 0), (1/√6)(1, 1, −2)} is an orthonormal basis

2. For V = P2(R) with the inner product ⟨p, q⟩ = ∫_0^1 p(x)q(x) dx, the set {1, √3(2x − 1), √5(6x² − 6x + 1)} is an orthonormal basis.

Bases that are orthonormal are particularly convenient to work with. For example, we have the
following.

Lemma 30.12

Let V be an inner product space and let B = {b1, . . . , bn} be an orthonormal basis for V. Then for all v ∈ V we have that

v = ⟨v, b1⟩b1 + · · · + ⟨v, bn⟩bn

Proof. Let v = α1 b1 + · · · + αn bn. We need to show that αi = ⟨v, bi⟩. We have

⟨v, bi⟩ = ⟨α1 b1 + · · · + αn bn, bi⟩ = α1⟨b1, bi⟩ + · · · + αn⟨bn, bi⟩ = αi

Example 30.13. Let V = R³ equipped with the dot product and let

B = {b1 = (1/√3)(1, 1, 1), b2 = (1/√2)(1, −1, 0), b3 = (1/√6)(1, 1, −2)}

We saw above that this is an orthonormal basis. To find coordinates with respect to B, we can just use the inner product. For example

(1, 2, 3) = ((1, 2, 3) · b1)b1 + ((1, 2, 3) · b2)b2 + ((1, 2, 3) · b3)b3    (Lemma 30.12)
 = (1/√3)((1, 2, 3) · (1, 1, 1)) b1 + (1/√2)((1, 2, 3) · (1, −1, 0)) b2 + (1/√6)((1, 2, 3) · (1, 1, −2)) b3
 = 2√3 b1 − (1/√2) b2 − (3/√6) b3

That is, [(1, 2, 3)]B = (1/√2)(2√6, −1, −√3)

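Lemma 30.12 means that, against an orthonormal basis, finding coordinates requires no linear system at all, only inner products. The following sketch, assuming numpy, reproduces Example 30.13.

    # A sketch, assuming numpy, of Lemma 30.12: coordinates with respect to an
    # orthonormal basis are just the inner products <v, b_i>.
    import numpy as np

    B = [np.array([1., 1., 1.]) / np.sqrt(3),
         np.array([1., -1., 0.]) / np.sqrt(2),
         np.array([1., 1., -2.]) / np.sqrt(6)]
    v = np.array([1., 2., 3.])
    coords = [v @ b for b in B]                   # 2*sqrt(3), -1/sqrt(2), -3/sqrt(6)
    print(coords)
    assert np.allclose(sum(c * b for c, b in zip(coords, B)), v)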
Example 30.14. Let W = {(x, y, z) ∈ R³ | x + y + z = 0} ≤ R³ equipped with the dot product. The set

B = {b1 = (1/√2)(−1, 1, 0), b2 = (1/√6)(−1, −1, 2)}

is an orthonormal basis for W. For (−1, 0, 1) ∈ W we have

(−1, 0, 1) · b1 = 1/√2,   (−1, 0, 1) · b2 = 3/√6



(−1, 0, 1) = (1/√2) b1 + (3/√6) b2 = (1/2)(−1, 1, 0) + (1/2)(−1, −1, 2)

30.4 Exercises

283. Use Lemma 30.12 to express the given vector as a linear combination of the vectors in the following orthonormal basis (with respect to the dot product) for R³.

B = {(1/3, −2/3, 2/3), (−2/3, 1/3, 2/3), (2/3, 2/3, 1/3)}

(a) (1, 2, 3)
(b) (−1, 0, 1)

284. Let V be the vector space C[0, 2π] of real-valued continuous functions on the interval [0, 2π] equipped with the inner product ⟨f, g⟩ = ∫_0^{2π} f(x)g(x) dx. Show that the set

{1/√(2π), (1/√π)sin(x), (1/√π)cos(x)} ⊆ V

is an orthonormal set.

285. Let ⟨x, y⟩ be an inner product on a vector space V, and let B = {e1, e2, . . . , en} be an orthonormal basis for V. Prove that:

(a) ⟨α1 e1 + α2 e2 + · · · + αn en, β1 e1 + β2 e2 + · · · + βn en⟩ = α1 β̄1 + α2 β̄2 + · · · + αn β̄n
(b) ⟨x, y⟩ = ⟨x, e1⟩\overline{⟨y, e1⟩} + · · · + ⟨x, en⟩\overline{⟨y, en⟩}
(c) The matrix representation, with respect to B, of the inner product is In.

286. Let V be a finite dimensional inner product space and let B = {b1, . . . , bn} be an orthonormal basis for V. Let T : V → V be a linear transformation and let A = [T]B be the matrix of T with respect to B. Prove that Aij = ⟨T(bj), bi⟩.


Extra material for lecture 30

 Let V be the inner product space of Exercise 284. Show that the infinite set

{1/√(2π), (1/√π)sin(x), (1/√π)cos(x), (1/√π)sin(2x), (1/√π)cos(2x), (1/√π)sin(3x), (1/√π)cos(3x), . . . } ⊆ V

is an orthonormal set. The set is therefore linearly independent.

 A Hilbert space is an inner product space with the property that the associated metric is complete. That the metric is complete is to say that all Cauchy sequences converge.
An example of an infinite-dimensional Hilbert space is the space of "square summable" sequences of complex numbers

ℓ² = {(z1, z2, z3, . . . ) | zi ∈ C, ∑_{i=1}^∞ |zi|² < ∞}

with inner product

⟨(x1, x2, . . . ), (y1, y2, . . . )⟩ = ∑_{i=1}^∞ xi ȳi



LECTURE 31

The Gram-Schmidt orthogonalisation procedure and


orthogonal projection

We have seen how to find a basis for a vector space, but what about finding an orthonormal basis? In
this lecture we discuss a technique, based on the idea contained in Exercise 287 below, for converting
a basis of a finite-dimensional inner product space into an orthonormal basis.

Exercise 287. Let V be an inner product space and let v ∈ V.

a) Let u ∈ V be a unit vector. Show that v − ⟨v, u⟩u is orthogonal to u.
b) Let {u1, . . . , uk} ⊆ V be an orthonormal set. Show that

v − ⟨v, u1⟩u1 − ⟨v, u2⟩u2 − · · · − ⟨v, uk⟩uk

is orthogonal to every element of span{u1, . . . , uk}.

31.1 Gram-Schmidt orthogonalisation procedure

Let V be an inner product space and let {u1, . . . , uk} ⊆ V be an orthonormal set. Suppose that v ∈ V is such that v ∉ W = span{u1, . . . , uk}. Defining w = v − ⟨v, u1⟩u1 − ⟨v, u2⟩u2 − · · · − ⟨v, uk⟩uk we have that w ≠ 0 and w is orthogonal to W. Therefore, if we define uk+1 = w/‖w‖ and add it to the above orthonormal set, the now larger set {u1, . . . , uk+1} ⊆ V is still orthonormal. Applying this observation repeatedly allows us to construct an orthonormal basis for V.

Algorithm 31.1: Gram-Schmidt procedure

Let V be a finite-dimensional inner product space and let {b1, . . . , bn} be a basis for V.
We define u1, w2, u2, . . . , wn, un ∈ V as follows:

1) u1 = (1/‖b1‖) b1
2) w2 = b2 − ⟨b2, u1⟩u1 and u2 = (1/‖w2‖) w2
3) w3 = b3 − ⟨b3, u1⟩u1 − ⟨b3, u2⟩u2 and u3 = (1/‖w3‖) w3
   ⋮
n) wn = bn − ⟨bn, u1⟩u1 − · · · − ⟨bn, un−1⟩un−1 and un = (1/‖wn‖) wn

Then {u1, . . . , un} is an orthonormal basis for V.
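The procedure is short enough to implement directly. The following is a minimal sketch for the dot product on Rⁿ, assuming numpy; for a different inner product one would replace np.dot by the relevant ⟨·, ·⟩. It reproduces the basis found in Example 31.3 below.

    # A minimal sketch of Algorithm 31.1 for the dot product on R^n, assuming numpy.
    import numpy as np

    def gram_schmidt(basis):
        """Turn a list of linearly independent vectors into an orthonormal list."""
        ortho = []
        for b in basis:
            w = b - sum(np.dot(b, u) * u for u in ortho)   # subtract the projections
            ortho.append(w / np.linalg.norm(w))            # normalise
        return ortho

    B = [np.array([1., 1., 1., 1.]),
         np.array([2., 4., 2., 4.]),
         np.array([1., 5., -1., 3.])]
    for u in gram_schmidt(B):
        print(u)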

Let’s note explicitly the following consequence.

Theorem 31.2

Every finite-dimensional inner product space has an orthonormal basis.



Example 31.3. We find an orthonormal basis for the subspace W of R⁴ (with the dot product) spanned by {(1, 1, 1, 1), (2, 4, 2, 4), (1, 5, −1, 3)}. Following the Gram-Schmidt procedure gives:

u1 = (1, 1, 1, 1)/‖(1, 1, 1, 1)‖ = (1/2)(1, 1, 1, 1)
w2 = (2, 4, 2, 4) − ⟨(2, 4, 2, 4), u1⟩u1 = (2, 4, 2, 4) − (1/4)⟨(2, 4, 2, 4), (1, 1, 1, 1)⟩(1, 1, 1, 1)
   = (2, 4, 2, 4) − 3(1, 1, 1, 1) = (−1, 1, −1, 1)
u2 = (−1, 1, −1, 1)/‖(−1, 1, −1, 1)‖ = (1/2)(−1, 1, −1, 1)
w3 = (1, 5, −1, 3) − ⟨(1, 5, −1, 3), u1⟩u1 − ⟨(1, 5, −1, 3), u2⟩u2
   = (1, 5, −1, 3) − 2(1, 1, 1, 1) − 2(−1, 1, −1, 1) = (1, 1, −1, −1)
u3 = (1, 1, −1, −1)/‖(1, 1, −1, −1)‖ = (1/2)(1, 1, −1, −1)

So {(1/2)(1, 1, 1, 1), (1/2)(−1, 1, −1, 1), (1/2)(1, 1, −1, −1)} is an orthonormal basis for W.

31.2 Orthogonal projection

Definition 31.4: Orthogonal complement

Let V be an inner product space and let W ≤ V be a subspace of V. The orthogonal complement of W is defined to be

W⊥ = {v ∈ V | ⟨v, w⟩ = 0 for all w ∈ W}

Exercise 288. Let V be an inner product space and let W ≤ V be a subspace of V. Prove the following properties of the orthogonal complement. Note that we are not assuming that V (or W) is finite dimensional.

a) W⊥ is a subspace of V
b) W ∩ W⊥ = {0}
c) W ⊆ (W⊥)⊥

Lemma 31.5

Let V be a finite-dimensional inner product space and let W ≤ V be a subspace. Every vector v ∈ V can be written in a unique way as v = w + w′ where w ∈ W and w′ ∈ W⊥.

Proof. Let v ∈ V. We first need to show that there exist w ∈ W and w′ ∈ W⊥ such that v = w + w′. Let {w1, . . . , wk} be an orthonormal basis for W. Define

w = ⟨v, w1⟩w1 + · · · + ⟨v, wk⟩wk

Then clearly w ∈ W. Now define w′ = v − w. We have that v = w + w′. We have to show that w′ ∈ W⊥. For i ∈ {1, . . . , k} we have

⟨w, wi⟩ = ⟨⟨v, w1⟩w1 + · · · + ⟨v, wk⟩wk, wi⟩    (definition of w)
       = ⟨v, w1⟩⟨w1, wi⟩ + · · · + ⟨v, wk⟩⟨wk, wi⟩    (linearity)
       = ⟨v, wi⟩    ({w1, . . . , wk} is an orthonormal set)


Therefore ⟨w′, wi⟩ = ⟨v − w, wi⟩ = 0 (for all i) and it follows that w′ ∈ W⊥.
It remains to show that w and w′ are unique. Suppose that u ∈ W and u′ ∈ W⊥ are such that v = u + u′. Then we have

w + w′ = u + u′ ⟹ w − u = u′ − w′ ∈ W ∩ W⊥ ⟹ w − u = u′ − w′ = 0    (Exercise 288)

Definition 31.6: Orthogonal projection

Let V be a finite-dimensional inner product space and W ≤ V a subspace. The orthogonal projection of V onto W is the map

projW : V → V

defined as follows. Given v ∈ V we have v = w + w′ for some (unique) w ∈ W and w′ ∈ W⊥. Define projW(v) = w.
The element projW(v) is sometimes called the projection of v to W.

Exercise 289. Show that projW is a linear transformation and that (projW )2 = projW .

Exercise 290. Let V be a finite-dimensional inner product space and let W ≤ V be a subspace of V.

a) Use the rank-nullity theorem to show that dim(V) = dim(W) + dim(W⊥).
b) Show that (W⊥)⊥ = W.

From the proof of Lemma 31.5 we have the following.

Lemma 31.7

Let V be a finite-dimensional inner product space and W ≤ V a subspace. Let {w1, . . . , wk} be an orthonormal basis for W. Then for any v ∈ V

projW(v) = ⟨v, w1⟩w1 + · · · + ⟨v, wk⟩wk

Example 31.8. Consider P2(R) with the inner product ⟨f, g⟩ = ∫_0^1 f(x)g(x) dx. The orthogonal projection of v = 1 + 2x + 3x² onto the unit vector u = √3 x (i.e., the projection onto the subspace W = span{u}) is

projW(v) = ⟨v, u⟩u = (∫_0^1 √3(x + 2x² + 3x³) dx) √3 x = 3(1/2 + 2/3 + 3/4)x = (23/4)x

Example 31.9. Let W = {(x, y, z) | x + y + z = 0} in V = R³ equipped with the dot product. The set {b1 = (1/√2)(1, −1, 0), b2 = (1/√6)(1, 1, −2)} is an orthonormal basis for W. For v = (1, 2, 3) we have

projW(v) = ⟨v, b1⟩b1 + ⟨v, b2⟩b2 = (−1/2)(1, −1, 0) + (−1/2)(1, 1, −2) = (−1, 0, 1)

Note that v − projW(v) = (2, 2, 2) is orthogonal to W.


Lemma 31.10

Let V be an inner product space and W ≤ V a subspace. Then, for all v ∈ V, projW(v) is the vector in W that is closest to v. (That is, ∀v ∈ V ∀w ∈ W, ‖v − projW(v)‖ ≤ ‖v − w‖.)

Proof. Let p = projW(v) and w ∈ W. Then

‖v − w‖² = ⟨v − w, v − w⟩ = ⟨(v − p) + (p − w), (v − p) + (p − w)⟩
         = ⟨v − p, v − p⟩ + ⟨v − p, p − w⟩ + ⟨p − w, v − p⟩ + ⟨p − w, p − w⟩
         = ‖v − p‖² + ‖p − w‖²    (the cross terms vanish: v − p ∈ W⊥ and p − w ∈ W)
         ≥ ‖v − p‖²

Example 31.11. Let W ≤ R⁴ be as in Example 31.3, that is, W = span{(1, 1, 1, 1), (2, 4, 2, 4), (1, 5, −1, 3)}. We saw that {(1/2)(1, 1, 1, 1), (1/2)(−1, 1, −1, 1), (1/2)(1, 1, −1, −1)} is an orthonormal basis for W.
Using the orthonormal basis we find the point p ∈ W that is closest to v = (2, 2, 1, 3). From Lemma 31.10 we know that p = projW(v).

p = projW(v) = ⟨v, u1⟩u1 + ⟨v, u2⟩u2 + ⟨v, u3⟩u3 = 2(1, 1, 1, 1) + (1/2)(−1, 1, −1, 1) + 0(1, 1, −1, −1)
  = (1/2)(3, 5, 3, 5)
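A numerical sketch of this computation, assuming numpy:

    # A sketch, assuming numpy, of Lemma 31.7 applied to Example 31.11.
    import numpy as np

    U = [np.array([1., 1., 1., 1.]) / 2,
         np.array([-1., 1., -1., 1.]) / 2,
         np.array([1., 1., -1., -1.]) / 2]          # orthonormal basis for W
    v = np.array([2., 2., 1., 3.])
    p = sum((v @ u) * u for u in U)                 # proj_W(v)
    print(p)                                        # [1.5 2.5 1.5 2.5] = (1/2)(3,5,3,5)
    print([float(np.dot(v - p, u)) for u in U])     # v - p is orthogonal to W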

31.3 Exercises

291. Use the Gram-Schmidt procedure to construct orthonormal bases for the subspaces of Rn spanned
by the following sets of vectors (using the dot product):

(a) (1, 0, 1, 0), (2, 1, 1, 1), (1, −1, 1, −1)


(b) (2, 2, −1, 0), (2, 3, 1, −2), (3, 4, 5, −2)
(c) (1, −2, 1, 3, −1), (0, 6, −2, −6, 0), (4, −2, 2, 6, −4)

292. Let P2(R) be the vector space of polynomials of degree at most two with the inner product

⟨p, q⟩ = ∫_{−1}^{1} p(x)q(x) dx

Apply Gram-Schmidt to the basis {1, x, x²} to obtain an orthonormal basis.

293. Find the orthogonal projection of (x, y, z) onto the subspace of R3 spanned by the vectors

(a) {(1, 2, 2), (−2, 2, −1)} (b) {(1, 2, −1), (0, −1, 2)}

294. Consider R3 equipped with the dot product. Let w = (2, −1, −2) and v = (2, 1, 3). Find vectors
v1 and v2 such that v = v1 + v2 , v1 is parallel to w, and v2 is perpendicular to w.

295. Find the standard matrices of the transformations T : R3 → R3 which orthogonally project a
point (x, y, z) onto the following subspaces of R3 . Use the matrix to show the transformation is
idempotent (i.e., T ◦ T = T ).

(a) The z-axis.


(b) The straight line x = y = 2z.
(c) The plane x + y + z = 0.


296. Let Π be the plane in R3 given by x + y + z = 0. Use orthogonal projection to find the point on
Π that is as close as possible to (4, 5, 0).

297. Let V be a finite-dimensional inner product space and let W ≤ V be a subspace. Show that dim W + dim W⊥ = dim V.

298. Let V be a finite-dimensional inner product space and let W ≤ V be a subspace. Let P : V → V be projection onto W. Show that

∀u, v ∈ V, ⟨P(u), v⟩ = ⟨u, P(v)⟩


Extra material for lecture 31

 What happens if we apply Gram-Schmidt to a linearly dependent set? It can be used to produce
an orthonormal basis for the span of the original set of vectors.

 Not all infinite-dimensional inner product spaces have an orthonormal basis.

 Reflection across a subspace


Let V be a finite-dimensional inner product space and W ≤ V a subspace. Reflection across W is the linear transformation R : V → V defined as follows. Given v ∈ V we have v = w + w′ for some (unique) w ∈ W and w′ ∈ W⊥. Define

R(w + w′) = w − w′

Notice that R(v) = v − 2 projW⊥(v)



LECTURE 32

Orthogonal diagonalisation

In this lecture we look at matrix representations with respect to orthonormal bases. Suppose that
B = {u1 , . . . , un } is an orthonormal basis for Rn equipped with the standard inner product (dot
product). Then the transition matrix P = PS,B = [ [u1]S · · · [un]S ] has the property that PᵀP = In.

32.1 Orthogonal matrices

Definition 32.1

A matrix Q ∈ Mn,n(R) is called orthogonal if QᵀQ = In.

Examples 32.2.

1) The following are all orthogonal:

(1/3)[1 2 −2; −2 2 1; 2 1 2] ∈ M3,3(R),   [0 1; 1 0] ∈ M2,2(R),   [cos θ −sin θ; sin θ cos θ] ∈ M2,2(R)

2) The following are not orthogonal:

[1 2 −2; −2 2 1; 2 1 2] ∈ M3,3(R),   [2 0; 0 1/2] ∈ M2,2(R),   (1/5)[4 3; 3 4] ∈ M2,2(R)

Exercise 299. Suppose that Q, P ∈ Mn,n(R) are orthogonal matrices. Show that

(a) Qᵀ is orthogonal
(b) Q⁻¹ is orthogonal
(c) PQ is orthogonal
(d) det(Q) = ±1

Lemma 32.3: Conditions equivalent to orthogonality

Let Q ∈ Mn,n (R). The following are equivalent

1. Q is orthogonal

2. the columns of Q form an orthonormal basis of Mn,1(R) (with respect to the dot product)

3. the rows of Q form an orthonormal basis of M1,n(R) (with respect to the dot product)

4. ‖Qu‖ = ‖u‖ for all u ∈ Mn,1(R)

5. ⟨Qu, Qv⟩ = ⟨u, v⟩ for all u, v ∈ Mn,1(R)

Proof. That 1 ⇔ 2 follows from the way in which matrix multiplication is defined. That 1 ⇔ 3 then
follows from the fact that the transpose of an orthogonal matrix is orthogonal.

For 1 ⇒ 4 we have:

‖Qu‖² = ⟨Qu, Qu⟩ = (Qu)ᵀ(Qu)    (coordinate matrices with respect to the standard basis)
      = uᵀQᵀQu
      = uᵀ In u    (since QᵀQ = In)
      = uᵀu = ‖u‖²

For 4 ⇒ 5 we have:

⟨Qu, Qv⟩ = (1/4)(⟨Qu + Qv, Qu + Qv⟩ − ⟨Qu − Qv, Qu − Qv⟩)
        = (1/4)(⟨Q(u + v), Q(u + v)⟩ − ⟨Q(u − v), Q(u − v)⟩)
        = (1/4)(⟨u + v, u + v⟩ − ⟨u − v, u − v⟩)    (since 4 holds)
        = ⟨u, v⟩

It only remains to show that 5 ⇒ 1.

⟨Qu, Qv⟩ = ⟨u, v⟩ ⟹ uᵀQᵀQv = uᵀv ⟹ uᵀ(QᵀQ − In)v = 0

Since this holds for all u, v ∈ Mn,1(R) we must have that QᵀQ − In = 0.

32.2 Orthogonal diagonalisation

Definition 32.4

A matrix A ∈ Mn,n (R) is said to be orthogonally diagonalisable if there is an orthogonal matrix


Q ∈ Mn,n (R) and a diagonal matrix D ∈ Mn,n (R) such that A = Q D QT .

 
Example 32.5. The matrix A = [1 2; 2 −2] is orthogonally diagonalisable since

A = [1 2; 2 −2] = [−1/√5 2/√5; 2/√5 1/√5] [−3 0; 0 2] [−1/√5 2/√5; 2/√5 1/√5]ᵀ = Q D Qᵀ

(We'll see soon how Q can be found.)

We have the following version of Theorem 25.4.

Theorem 32.6

A matrix A ∈ Mn,n(R) is orthogonally diagonalisable if and only if there is an orthonormal* basis B for Mn,1(R) with the property that all elements of B are eigenvectors of A.

Sketch of proof. The proof is the same as for Theorem 25.4, but with the extra observation that the basis B is orthonormal if and only if the transition matrix is orthogonal.

* with respect to the dot product


32.3 Real symmetric matrices

Theorem 32.7

Let A ∈ Mn,n (R). If A is symmetric (i.e., AT = A), then A is orthogonally diagonalisable.

Although we postpone the proof of this theorem, we note the following

Proposition 32.8

Let A ∈ Mn,n (R). If A is symmetric, then:

1. All roots of cA (x) are real.

2. Eigenvectors having different eigenvalues are orthogonal.

Proof. (Recall that for a matrix A ∈ Mm,n(C), A∗ denotes the conjugate transpose A∗ = \overline{A}ᵀ.)
Suppose that λ ∈ C and v ∈ Mn,1(C) \ {0} are such that Av = λv. Then v∗Av is a 1 × 1 matrix and (v∗Av)ᵀ = v∗Av (since any 1 × 1 matrix is symmetric). It follows that

\overline{v∗Av} = (v∗Av)∗ = v∗A∗v = v∗Av    (using A∗ = \overline{A}ᵀ = Aᵀ = A)

Therefore v∗Av ∈ R. Since v∗Av = v∗λv = λv∗v, and both v∗Av and v∗v are real numbers (and v∗v > 0), it follows that λ is real.
Suppose v1, v2 ∈ Mn,1(R) are two eigenvectors of A with Av1 = λ1 v1 and Av2 = λ2 v2. Then

v1ᵀAv2 = v1ᵀ(λ2 v2) = λ2 v1ᵀv2

and also

v1ᵀAv2 = v1ᵀAᵀv2 = (Av1)ᵀv2 = (λ1 v1)ᵀv2 = λ1 v1ᵀv2

Therefore λ2 v1ᵀv2 = λ1 v1ᵀv2 which, if λ1 ≠ λ2, implies v1ᵀv2 = 0.

Algorithm 32.9: To orthogonally diagonalise a real symmetric matrix

Let A ∈ Mn,n (R) and suppose that AT = A.

1. Find the eigenvalues of A.

2. For each eigenvalue

(a) Find a basis for the eigenspace


(b) Use Gram-Schmidt to convert to an orthonormal basis

3. The union of the eigenspace bases will be an orthonormal basis {u1 , . . . , un } for Rn .
Letting Q be the matrix whose columns are given by the ui and letting D be the diagonal
matrix whose diagonal entries are the corresponding eigenvalues† we then have

A = QDQT


† The order of the eigenvalues must correspond to the order of the eigenvectors ui
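In floating-point practice one rarely carries out these steps by hand: for real symmetric matrices, numpy's eigh routine returns an orthonormal set of eigenvectors directly. A sketch, assuming numpy:

    # A sketch, assuming numpy, of Algorithm 32.9 via np.linalg.eigh, applied to
    # the matrix of Example 32.10 below.
    import numpy as np

    A = np.array([[4., 2., 2.], [2., 4., 2.], [2., 2., 4.]])
    evals, Q = np.linalg.eigh(A)                  # eigenvalues in increasing order
    D = np.diag(evals)
    print(evals)                                  # [2. 2. 8.]
    print(np.allclose(Q.T @ Q, np.eye(3)))        # True: Q is orthogonal
    print(np.allclose(Q @ D @ Q.T, A))            # True: A = Q D Q^T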


 
Example 32.10. We apply the above to the matrix A = [4 2 2; 2 4 2; 2 2 4].
To find the eigenvalues:

det(A − xI3) = det[4−x 2 2; 2 4−x 2; 2 2 4−x] = det[4−x 2 2; x−2 2−x 0; x−2 0 2−x]    (R2 − R1, R3 − R1)
            = (x − 2)² det[4−x 2 2; 1 −1 0; 1 0 −1]
            = (x − 2)² (−det[2 2; 0 −1] − det[4−x 2; 1 −1])    (expanding along the second row)
            = (x − 2)²(2 − (x − 6)) = (x − 2)²(8 − x)

The eigenvalues are 2 and 8.


To find a basis for the eigenspace with λ = 2:

A − 2I3 = [2 2 2; 2 2 2; 2 2 2] ∼ [1 1 1; 0 0 0; 0 0 0]

A basis for the eigenspace is therefore {(−1, 1, 0), (−1, 0, 1)}.
We apply Gram-Schmidt to find an orthonormal basis for the (λ = 2) eigenspace:

u1 = (1/√2)(−1, 1, 0)
w2 = (−1, 0, 1) − ⟨(−1, 0, 1), u1⟩u1
   = (−1, 0, 1) − (1/2)⟨(−1, 0, 1), (−1, 1, 0)⟩(−1, 1, 0)
   = (−1, 0, 1) − (1/2)(−1, 1, 0)
   = (−1/2, −1/2, 1) = (1/2)(−1, −1, 2)
u2 = w2/‖w2‖ = (−1, −1, 2)/‖(−1, −1, 2)‖ = (1/√6)(−1, −1, 2)

Therefore {(1/√2)(−1, 1, 0), (1/√6)(−1, −1, 2)} is an orthonormal basis for the eigenspace.
Now for the λ = 8 eigenspace:

A − 8I3 = [−4 2 2; 2 −4 2; 2 2 −4] ∼ [1 0 −1; 0 1 −1; 0 0 0]

A basis for the eigenspace is therefore {(1, 1, 1)}. Normalising gives an orthonormal basis for the (λ = 8) eigenspace:

u3 = (1/√3)(1, 1, 1)

Finally, if we take

Q = [−1/√2 −1/√6 1/√3; 1/√2 −1/√6 1/√3; 0 2/√6 1/√3]   and   D = [2 0 0; 0 2 0; 0 0 8]

we have that A = Q D Qᵀ.


32.4 Exercises

300. Determine whether or not the given matrix A is orthogonal.

(a) [1/3 2/3 2/3; 2/3 1/3 −2/3; 2/3 −2/3 1/3]
(b) [0 1 0; 1 0 1; 0 1 0]

 
301. Show that the rotation matrix A = [cos θ −sin θ; sin θ cos θ] is orthogonal.

302. For each symmetric matrix A below find a decomposition A = Q D Qᵀ, where Q is orthogonal and D diagonal.

(a) [6 −2; −2 6]
(b) [7 2 0; 2 6 2; 0 2 5]
(c) [−2 0 −36; 0 −3 0; −36 0 −23]
(d) [1 1 0; 1 1 0; 0 0 0]
(e) [3 1 0 0; 1 3 0 0; 0 0 0 0; 0 0 0 0]


Extra material for lecture 32

 Reference

Elementary Linear Algebra by Anton and Rorres, §7, p401



LECTURE 33

Proof of the spectral theorem

Our goal is to show that every real symmetric matrix is orthogonally diagonalisable (Theorem 32.7).
The proof is more easily understood when presented in terms of linear transformations rather than
matrices.

Definition 33.1

Let V be a finite dimensional real inner product space. We will call a linear transformation T : V → V symmetric if it has the property that ⟨T(u), v⟩ = ⟨u, T(v)⟩ for all u, v ∈ V.

Exercise 303. Let V be a finite dimensional real inner product space. Let B be an orthonormal basis
for V and let T : V → V be a linear transformation. Show that T is symmetric if and only if the
matrix [T ]B is symmetric.

Theorem 33.2

Let V be a finite-dimensional real inner product space and T : V → V a linear transformation. If T is symmetric, then there exists an orthonormal basis for V all of whose elements are eigenvectors of T.

Proof. We use induction on n = dim(V ).


Base case: If n = 1, then we choose any non-zero vector u ∈ V and take B = {u/‖u‖}.
Induction step: Assume that for any (n − 1)-dimensional inner product space W and every symmetric linear transformation S : W → W there exists an orthonormal basis for W made up of eigenvectors of S.
Let λ be an eigenvalue of T. From Proposition 32.8 we know that λ ∈ R. Let u ∈ V be an eigenvector of T with eigenvalue λ and with ‖u‖ = 1. Let W = {v ∈ V | ⟨v, u⟩ = 0}. Then we have

(a) W is a subspace of V   (b) T(W) ⊆ W   (c) dim(W) = n − 1

Let S : W → W be given by S(w) = T (w). Then S is a symmetric linear transformation. By


the induction hypothesis, there exists an orthonormal basis C for W such that all elements in C are
eigenvectors of S. We therefore have that C ⊆ V is an orthonormal set and all its elements are
eigenvectors of T . Hence {u} ∪ C is an orthonormal basis for V made up of eigenvectors of T .
Therefore, by mathematical induction a symmetric linear transformation on a finite dimensional real
inner product space is orthogonally diagonalisable.

33.1 An application: conic sections

Suppose that we would like to plot the set of points (x, y) ∈ R² that satisfy the equation
$$6x^2 - 4xy + 3y^2 = 1$$
We can use orthogonal diagonalisation to eliminate the cross term in the above equation.

   
The equation can be written as $X^TAX = 1$ where $X = \begin{pmatrix} x \\ y \end{pmatrix}$ and $A = \begin{pmatrix} 6 & -2 \\ -2 & 3 \end{pmatrix}$.
Since A is real symmetric we know that it can be orthogonally diagonalised. Calculation gives A = QDQᵀ with
$$D = \begin{pmatrix} 2 & 0 \\ 0 & 7 \end{pmatrix} \qquad\text{and}\qquad Q = \frac{1}{\sqrt5}\begin{pmatrix} 1 & -2 \\ 2 & 1 \end{pmatrix}$$
Note that
$$X^TAX = X^TQDQ^TX = (Q^TX)^TD(Q^TX) \qquad (*)$$
Let B = {b1 = (1/√5)(1, 2), b2 = (1/√5)(−2, 1)} be the orthonormal basis of R² corresponding to the columns of Q. We rewrite the above equation using coordinates with respect to B.
Let $X' = \begin{pmatrix} x' \\ y' \end{pmatrix}$ be the coordinates of the point (x, y) with respect to B. Note that PS,B = Q and therefore PB,S = Q⁻¹ = Qᵀ. We have
$$X' = [(x, y)]_B = P_{B,S}[(x, y)]_S = Q^TX$$
Then we have
$$6x^2 - 4xy + 3y^2 = 1 \iff X^TAX = 1 \iff (Q^TX)^TD(Q^TX) = 1 \quad (\text{from } *)$$
$$\iff (X')^TDX' = 1 \iff \begin{pmatrix} x' & y' \end{pmatrix}\begin{pmatrix} 2 & 0 \\ 0 & 7 \end{pmatrix}\begin{pmatrix} x' \\ y' \end{pmatrix} = 1 \iff 2(x')^2 + 7(y')^2 = 1$$
The curve is now much easier to recognise as an ellipse.

[Figure: the ellipse 2(x′)² + 7(y′)² = 1 in the (x, y)-plane, with the rotated axes x′ and y′ in the directions of b1 and b2, and intercepts 1/√2 on the x′-axis and 1/√7 on the y′-axis.]
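The change of coordinates above is straightforward to reproduce numerically. The following is a minimal sketch (assuming NumPy; for illustration only): it diagonalises the coefficient matrix of the quadratic form and reads off the semi-axes of the ellipse.

```python
import numpy as np

# Coefficient matrix of the quadratic form 6x^2 - 4xy + 3y^2
A = np.array([[6.0, -2.0],
              [-2.0, 3.0]])

eigenvalues, Q = np.linalg.eigh(A)   # A = Q D Q^T with Q orthogonal

# In the rotated coordinates X' = Q^T X the curve becomes
# 2(x')^2 + 7(y')^2 = 1, so the semi-axes are 1/sqrt(eigenvalue)
print(eigenvalues)                   # [2. 7.]
print(1 / np.sqrt(eigenvalues))      # [0.707... 0.377...], i.e. 1/sqrt(2) and 1/sqrt(7)
```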

33.2 Exercises

304. Use orthogonal diagonalisation to sketch the curve given by the following equation:
$$5x^2 - 4xy + 8y^2 = 36$$

305. Prove the statements (a), (b), and (c) about W in the above proof of Theorem 33.2.

306. Let T : R² → R² be the linear transformation given by T(x, y) = (−6x, −5x + 4y). The following defines an inner product on R²:
$$\langle (a, b), (x, y)\rangle = ax - ay - bx + 2by$$
(a) Show that T is symmetric (with respect to the inner product above).
(b) Find a basis for R² that is orthonormal (with respect to the inner product above) and composed of eigenvectors of T.


Extra material for lecture 33

 For a longer discussion of the application of diagonalisation to conic sections see
Elementary Linear Algebra by Anton and Rorres, §7.2, p417
The same approach can be applied to quadric surfaces.

 Let V be a finite-dimensional inner product space and W ≤ V a subspace.

(a) Show that projection onto W, projW : V → V, is symmetric.

(b) Show that reflection across W is symmetric.

 Let V ≤ F([0, 1], R) be the real inner product space of all smooth functions f : [0, 1] → R that satisfy f(0) = f(1) = 0. That is,
$$V = \{f \in F([0, 1], \mathbb{R}) \mid f(0) = f(1) = 0,\ \text{and } f \text{ is smooth}\}$$
with inner product
$$\langle f, g\rangle = \int_0^1 f(x)g(x)\,dx$$
Let D : V → V be the linear transformation given by D(f(x)) = df/dx.

(a) Use integration by parts to show that ∀f, g ∈ V, ⟨D(f), g⟩ = −⟨f, D(g)⟩.

(b) Show that D² is symmetric. (Explicitly, D² : V → V, D²(f(x)) = d²f/dx².)



LECTURE 34

Unitary diagonalisation

We note that the results on orthogonal diagonalisation carry over to complex matrices. The proofs
given in the real case apply with only minor changes, and will not be repeated.

34.1 Unitary matrices


Recall that for a matrix A ∈ Mm,n(C) we denote by A∗ the conjugate transpose $A^* = \overline{A}^{\,T}$.

Definition 34.1

A matrix U ∈ Mn,n (C) is called unitary if U ∗ U = In .

Examples 34.2.
$$\frac{1}{\sqrt2}\begin{pmatrix} i & i \\ 1 & -1 \end{pmatrix}, \qquad \frac12\begin{pmatrix} 1+i & 1-i \\ 1-i & 1+i \end{pmatrix} \in M_{2,2}(\mathbb{C})$$
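As a quick check of the second matrix above, here is a minimal numerical sketch (assuming NumPy; for illustration only), where .conj().T forms the conjugate transpose U∗:

```python
import numpy as np

U = np.array([[1 + 1j, 1 - 1j],
              [1 - 1j, 1 + 1j]]) / 2

# U is unitary precisely when U* U = I
print(np.allclose(U.conj().T @ U, np.eye(2)))  # True
```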

Exercise 307. Suppose that U, P ∈ Mn,n(C) are unitary matrices. Show that

(a) U ∗ is unitary (c) P U is unitary

(b) U −1 is unitary (d) | det(U )| = 1

Lemma 34.3: Conditions equivalent to being unitary

Let U ∈ Mn,n (C). The following are equivalent

1. U is unitary

2. the columns of U form an orthonormal basis of Cn


(with respect to the complex dot product)

3. the rows of U form an orthonormal basis of Cn


(with respect to the complex dot product)

4. ‖Uv‖ = ‖v‖ for all v ∈ Cⁿ (complex dot product)

5. ⟨Uu, Uv⟩ = ⟨u, v⟩ for all u, v ∈ Cⁿ (complex dot product)

Proof. We prove that 4 implies 5. For any u, v ∈ Cⁿ,
$$\langle u+v, u+v\rangle - \langle u-v, u-v\rangle = 2(\langle u, v\rangle + \langle v, u\rangle) = 4\,\mathrm{Re}(\langle u, v\rangle)$$
$$\langle U(u+v), U(u+v)\rangle - \langle U(u-v), U(u-v)\rangle = 2(\langle Uu, Uv\rangle + \langle Uv, Uu\rangle) = 4\,\mathrm{Re}(\langle Uu, Uv\rangle)$$
Since 4 holds, the left hand sides above are equal and therefore Re(⟨u, v⟩) = Re(⟨Uu, Uv⟩). Putting iv in place of v in the above calculation gives
$$\langle u, v\rangle - \langle v, u\rangle = \langle Uu, Uv\rangle - \langle Uv, Uu\rangle$$
and therefore Im(⟨u, v⟩) = Im(⟨Uu, Uv⟩) (the imaginary parts are equal). Hence ⟨u, v⟩ = ⟨Uu, Uv⟩.

34.2 Unitary diagonalisation

Definition 34.4

A matrix A ∈ Mn,n (C) is said to be unitarily diagonalisable if there is a unitary matrix U ∈


Mn,n (C) and a diagonal matrix D ∈ Mn,n (C) such that A = U D U ∗ .

 
Example 34.5. The matrix $A = \begin{pmatrix} 1 & 2i \\ -2i & -2 \end{pmatrix}$ is unitarily diagonalisable since
$$A = \begin{pmatrix} 1 & 2i \\ -2i & -2 \end{pmatrix} = \begin{pmatrix} -\frac{i}{\sqrt5} & \frac{2i}{\sqrt5} \\[2pt] \frac{2}{\sqrt5} & \frac{1}{\sqrt5} \end{pmatrix}\begin{pmatrix} -3 & 0 \\ 0 & 2 \end{pmatrix}\begin{pmatrix} -\frac{i}{\sqrt5} & \frac{2i}{\sqrt5} \\[2pt] \frac{2}{\sqrt5} & \frac{1}{\sqrt5} \end{pmatrix}^{*} = UDU^*$$

Theorem 34.6

A matrix A ∈ Mn,n (C) is unitarily diagonalisable if and only if there is an orthonormal basis B
for Cn with the property that all elements of B are eigenvectors of A.

34.3 Hermitian matrices

Theorem 34.7

Let A ∈ Mn,n (C). If A is Hermitian (i.e., A∗ = A), then A is unitarily diagonalisable.

Proposition 34.8

Let A ∈ Mn,n (C). If A is Hermitian (i.e., A∗ = A), then:

1. All eigenvalues of A are real;

2. Eigenvectors having different eigenvalues are orthogonal.

Algorithm 34.9: To unitarily diagonalise a Hermitian matrix

Let A ∈ Mn,n (C) and suppose that A∗ = A.

1. Find the eigenvalues of A.

2. For each eigenvalue

(a) Find a basis for the eigenspace


(b) Use Gram-Schmidt to convert to an orthonormal basis

3. The union of the eigenspace bases will be an orthonormal basis {u1 , . . . , un } for Cn .
Letting U be the matrix whose columns are given by the ui and letting D be the diagonal
matrix whose diagonal entries are the corresponding eigenvalues, we then have

A = U DU ∗

© University of Melbourne 2024


MAST10022 Linear Algebra: Advanced, 2024 34-3

 
Example 34.10. Let $A = \begin{pmatrix} 1 & 1+i \\ 1-i & 2 \end{pmatrix}$.
To find the eigenvalues of A:
$$\det(xI_2 - A) = \det\begin{pmatrix} x-1 & -1-i \\ -1+i & x-2 \end{pmatrix} = (x-1)(x-2) - (1+i)(1-i) = x^2 - 3x = x(x-3)$$
The eigenvalues of A are: 0, 3.

To find the eigenvectors of A:
$$A - 0I_2 = \begin{pmatrix} 1 & 1+i \\ 1-i & 2 \end{pmatrix} \sim \begin{pmatrix} 1 & 1+i \\ 0 & 0 \end{pmatrix}$$
A basis for the λ = 0 eigenspace is {(1 + i, −1)}. An orthonormal basis is {(1/√3)(1 + i, −1)}.
$$A - 3I_2 = \begin{pmatrix} -2 & 1+i \\ 1-i & -1 \end{pmatrix} \sim \begin{pmatrix} -2 & 1+i \\ 0 & 0 \end{pmatrix}$$
A basis for the λ = 3 eigenspace is {(1 + i, 2)}. An orthonormal basis is {(1/√6)(1 + i, 2)}.
(Notice that (1 + i, −1) · (1 + i, 2) = (1 + i)(1 − i) − 2 = 0.)
Letting
$$U = \begin{pmatrix} \frac{1+i}{\sqrt3} & \frac{1+i}{\sqrt6} \\[2pt] \frac{-1}{\sqrt3} & \frac{2}{\sqrt6} \end{pmatrix} \qquad\text{and}\qquad D = \begin{pmatrix} 0 & 0 \\ 0 & 3 \end{pmatrix}$$
we have that A = UDU∗.
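This example can also be checked numerically. A minimal sketch (assuming NumPy; for illustration only): numpy.linalg.eigh also handles complex Hermitian matrices, and returns real eigenvalues, consistent with Proposition 34.8.

```python
import numpy as np

# The Hermitian matrix from Example 34.10
A = np.array([[1, 1 + 1j],
              [1 - 1j, 2]])

eigenvalues, U = np.linalg.eigh(A)
D = np.diag(eigenvalues)

print(eigenvalues)                              # [0. 3.]
print(np.allclose(U.conj().T @ U, np.eye(2)))   # True: U is unitary
print(np.allclose(U @ D @ U.conj().T, A))       # True: A = U D U*
```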

34.4 Exercises

308. Unitarily diagonalise the Hermitian matrix $A = \begin{pmatrix} 2 & i \\ -i & 2 \end{pmatrix}$.

309. Let M ∈ Mn,n(C) be an Hermitian matrix. Show that M is positive definite (see Definition 30.1) if and only if all eigenvalues of M are strictly positive. (Hint: unitarily diagonalise M.)

310. A linear transformation T : V → V on a complex inner product space V is called self-adjoint if
$$\forall u, v \in V, \quad \langle T(u), v\rangle = \langle u, T(v)\rangle$$
(a) Show that all eigenvalues of T are real.
(b) Show that eigenvectors corresponding to distinct eigenvalues are orthogonal.

311. Let A ∈ Mn,n (C) and suppose that A∗ = −A. Suppose that λ ∈ C is an eigenvalue of A. Show
that λ = iy for some y ∈ R.

© University of Melbourne 2024


34-4 MAST10022 Linear Algebra: Advanced, 2024

Further material for lecture 34

 A matrix A ∈ Mn,n(C) is called normal if A∗A = AA∗. Normal matrices are unitarily diagonalisable. (This will be covered in the subject MAST20022 Group Theory and Linear Algebra.) All Hermitian matrices are normal. The matrix $\begin{pmatrix} 2+i & 2-i \\ 2-i & 2+i \end{pmatrix}$ is normal but not Hermitian.



LECTURE 35

Least squares approximation

We give two applications related to the dot product in Rn .

35.1 Least squares line of best fit

Given a set of data points (x1, y1), (x2, y2), . . . , (xn, yn) we want to find the straight line y = a + bx which best approximates the data. A common approach is to minimise the least squares error:
$$E = \text{sum of the squares of the vertical errors } \delta_i = \sum_{i=1}^{n} \delta_i^2 = \sum_{i=1}^{n} \big(y_i - (a + bx_i)\big)^2$$
[Figure: a typical data point (xi, yi) and the line y = a + bx, with δi the vertical error between yi and the value a + bxi on the line.]

Given (x1, y1), . . . , (xn, yn) we want to find a, b ∈ R that minimise the quantity $\sum_{i=1}^{n}(y_i - (a + bx_i))^2$. This can be written as
$$E = \sum_{i=1}^{n}\big(y_i - (a + bx_i)\big)^2 = \|y - Au\|^2$$
where
$$y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix} \qquad A = \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix} \qquad u = \begin{pmatrix} a \\ b \end{pmatrix}$$
The length is that coming from the inner product on Mn,1(R) that corresponds to the dot product on Rⁿ, that is, ⟨v1, v2⟩ = v1ᵀv2.
To minimise ‖y − Au‖ we want u to be such that Au is as close as possible to y (which is fixed).
That is, we want the vector in

W = {Av | v ∈ M2,1(R)} ≤ Mn,1(R)      (W is the column space of A)

that is closest to y.
The closest vector is precisely p = projW (y). To find u we could project y to W to get p and then
solve Au = p to get u (by solving a linear system).

However, we can calculate u more directly (without finding an orthonormal basis for W) by using properties of the projection:
$$w^T(y - \mathrm{proj}_W y) = 0 \quad \forall\, w \in W$$
$$\implies (Av)^T(y - Au) = 0 \quad \forall\, v \in M_{2,1}(\mathbb{R}) \qquad (\text{since } w = Av)$$
$$\implies v^TA^T(y - Au) = 0 \quad \forall\, v \in M_{2,1}(\mathbb{R})$$
$$\implies A^T(y - Au) = 0 \qquad (\text{taking } v \text{ to be each standard basis vector in turn})$$
$$\implies A^Ty - A^TAu = 0$$
$$\implies (A^TA)u = A^Ty \qquad (*)$$

From this we can calculate u, given that we know A and y. It’s just a matter of solving the linear
system.
Note that AᵀA ∈ M2,2(R). If AᵀA is invertible (and it usually is), the solution to (∗) is given by
$$u = (A^TA)^{-1}A^Ty$$

Example 35.1. We find the straight line which best fits the data points: (−1, 1), (1, 1), (2, 3), (4, 5).
$$A^TA = \begin{pmatrix} 1 & 1 & 1 & 1 \\ -1 & 1 & 2 & 4 \end{pmatrix}\begin{pmatrix} 1 & -1 \\ 1 & 1 \\ 1 & 2 \\ 1 & 4 \end{pmatrix} = \begin{pmatrix} 4 & 6 \\ 6 & 22 \end{pmatrix}$$
$$(A^TA)^{-1}A^Ty = \frac{1}{52}\begin{pmatrix} 22 & -6 \\ -6 & 4 \end{pmatrix}\begin{pmatrix} 1 & 1 & 1 & 1 \\ -1 & 1 & 2 & 4 \end{pmatrix}\begin{pmatrix} 1 \\ 1 \\ 3 \\ 5 \end{pmatrix} = \frac{1}{13}\begin{pmatrix} 16 \\ 11 \end{pmatrix}$$
The line of best fit is y = 16/13 + (11/13)x.
[Figure: the four data points and the line of best fit, plotted for −2 ≤ x ≤ 5.]
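In practice this computation is done numerically. A minimal sketch for the data of Example 35.1 (assuming NumPy; the subject uses MATLAB for this, so this is for illustration only), solving the normal equations (∗) directly and then cross-checking with the built-in least squares solver:

```python
import numpy as np

x = np.array([-1.0, 1.0, 2.0, 4.0])
y = np.array([1.0, 1.0, 3.0, 5.0])

# Design matrix with rows (1, x_i)
A = np.column_stack([np.ones_like(x), x])

# Solve the normal equations (A^T A) u = A^T y
u = np.linalg.solve(A.T @ A, A.T @ y)
print(u)                    # [1.2307... 0.8461...] = (16/13, 11/13)

# Equivalent, and numerically more stable for larger problems
u2, *_ = np.linalg.lstsq(A, y, rcond=None)
print(np.allclose(u, u2))   # True
```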

35.2 Polynomial of best fit

The same method works for finding quadratic (or higher degree) fitting curves. To find the quadratic y = a + bx + cx² which best fits data (x1, y1), (x2, y2), . . . , (xn, yn) we take
$$A = \begin{pmatrix} 1 & x_1 & x_1^2 \\ 1 & x_2 & x_2^2 \\ \vdots & \vdots & \vdots \\ 1 & x_n & x_n^2 \end{pmatrix} \qquad y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}$$
and solve
$$A^TAu = A^Ty$$
for
$$u = \begin{pmatrix} a \\ b \\ c \end{pmatrix}$$

Example 35.2. We find the parabola which best fits the data points: (−1, 1), (1, 1), (2, 3), (4, 5).
$$A^TA = \begin{pmatrix} 1 & 1 & 1 & 1 \\ -1 & 1 & 2 & 4 \\ 1 & 1 & 4 & 16 \end{pmatrix}\begin{pmatrix} 1 & -1 & 1 \\ 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 4 & 16 \end{pmatrix} = \begin{pmatrix} 4 & 6 & 22 \\ 6 & 22 & 72 \\ 22 & 72 & 274 \end{pmatrix}$$
$$(A^TA)^{-1}A^Ty = \begin{pmatrix} 83/78 \\ 9/26 \\ 1/6 \end{pmatrix}$$
The parabola of best fit is
$$y = \frac{83}{78} + \frac{9}{26}x + \frac16 x^2$$
[Figure: the four data points and the parabola of best fit, plotted for −2 ≤ x ≤ 5.]
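The quadratic fit is the same computation with a three-column design matrix. A minimal sketch (assuming NumPy; for illustration only — numpy.polyfit would return the same coefficients, listed from highest degree down):

```python
import numpy as np

x = np.array([-1.0, 1.0, 2.0, 4.0])
y = np.array([1.0, 1.0, 3.0, 5.0])

# Design matrix with rows (1, x_i, x_i^2)
A = np.column_stack([np.ones_like(x), x, x**2])

u = np.linalg.solve(A.T @ A, A.T @ y)
print(u)   # [1.0641... 0.3461... 0.1666...] = (83/78, 9/26, 1/6)
```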


35.3 Exercises

312. Find the (least squares) line of best fit for the given data sets.

(a) {(0, 0), (1, 0), (2, 1), (3, 3), (4, 5)}
(b) {(−2, 2), (−1, 1), (0, −1), (1, 0), (2, 3)}

313. A maths lecturer was placed on a rack by his students and stretched to lengths L = 1.7, 2.0
and 2.3 metres when forces of F = 1, 2 and 4 tonnes were applied. Assuming Hooke’s law
L = a + bF , estimate the lecturer’s normal length a.

314. A firm that manufactures widgets finds the daily consumer demand d(x) for widgets as a function of their price x is as in the following table:

x    | 1   | 1.5 | 2   | 2.5 | 3
d(x) | 200 | 180 | 150 | 100 | 25

Using least squares, approximate the daily consumer demand by a linear function.

315. Find the parabola of best fit for the data in Exercise 312(a).
(Use MATLAB for the matrix algebra in this question!)


Extra material for lecture 35

 Reference

Elementary Linear Algebra by Anton and Rorres, §6.5, p387



APPENDIX A

Cardinality

What should it mean for two sets to have the same ‘size’? Does it make sense to say that there are
more rational numbers than there are natural numbers?* Are there more real numbers than there are
rationals?†
Rather than trying to define the size of a set directly, it is convenient to introduce the notion of two
sets ‘having the same number of elements’. The notion of a bijective function is clearly just what is
required.

Definition A.1

Two sets A and B are said to have the same cardinality if there exists a bijection A → B. A set is finite if it is either empty or has the same cardinality as the set {1, 2, . . . , n} for some n. A set that is not finite is called infinite. A set is called countably infinite if it has the same cardinality as N. A set is called countable if it is either finite or countably infinite. A set that is not countable is called uncountable.

Lemma A.2

Let A be a set.

1. If there exists an injective function A → N, then A is countable.

2. If there exists a surjective function N → A, then A is countable.

Proof. For the first statement it suffices to show that if B is an infinite subset of N, then there is a bijection ϕ : N → B. We inductively define a sequence of subsets Bi ⊆ N together with the required function ϕ : N → B. Let B1 = B. Suppose that Bi has been defined and is infinite. Let bi = min(Bi) and define ϕ(i) = bi and Bi+1 = Bi \ {bi}. Since Bi is infinite, Bi+1 is infinite. The function ϕ : N → B defined inductively in this way is clearly a bijection.
For the second statement, suppose now that there exists a surjective function f : N → A. If A is finite,
then A is countable by definition and there is nothing to prove. We can assume, therefore, that A is
infinite. We define a map g : N → A as follows. Define M ⊆ N by

M = {m ∈ N | f(m) ∉ {f(1), f(2), . . . , f(m − 1)}}

Note that

1. M is infinite since A is infinite

2. f (M ) = A

3. If m, n ∈ M and m ≠ n, then f(m) ≠ f(n)

Denote by mi the i-th element of M (using the usual ordering on N) and define

g : N → A, g(i) = f (mi )
* no
† yes

That g is bijective follows from properties 2 and 3 above.

Proposition A.3

1. Z and N have the same cardinality (i.e., Z is countably infinite)

2. N × N and N have the same cardinality (i.e., N × N is countably infinite)

3. Q and N have the same cardinality (i.e., Q is countably infinite)

Proof. 1. The map f : N → Z given by
$$f(n) = \begin{cases} \frac{n}{2} & \text{if } n \text{ is even} \\[2pt] \frac{1-n}{2} & \text{if } n \text{ is odd} \end{cases}$$
is a surjection. (In fact it's a bijection.)

2. The map g : N × N → N given by $g(m, n) = 2^m 3^n$ is injective.

3. From the first two parts, we know that Z × (Z \ {0}) has the same cardinality as N. Then note that the map h : Z × (Z \ {0}) → Q given by h(m, n) = m/n is surjective.
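These maps are concrete enough to experiment with. A small sketch (in Python, for illustration only) of the bijection N → Z from part 1 and the injection N × N → N from part 2:

```python
def f(n):
    """The map N -> Z from part 1: n/2 if n is even, (1-n)/2 if n is odd."""
    return n // 2 if n % 2 == 0 else (1 - n) // 2

def g(m, n):
    """The map N x N -> N from part 2; injective by unique factorisation."""
    return 2**m * 3**n

print([f(n) for n in range(1, 8)])  # [0, 1, -1, 2, -2, 3, -3]

# Spot-check injectivity of g on a finite grid: no repeated values
values = [g(m, n) for m in range(1, 20) for n in range(1, 20)]
print(len(values) == len(set(values)))  # True
```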

At this point we might start to think that all infinite sets are countably infinite. But we’d be wrong.

Theorem A.4

R is uncountable.

Proof. Suppose, for a contradiction, that the interval (0, 1) ⊂ R is countable and let f : N → (0, 1) be a bijection. We consider the decimal expansion of each element:
$$f(1) = 0.a_{11}a_{12}a_{13}\ldots$$
$$f(2) = 0.a_{21}a_{22}a_{23}\ldots$$
$$\vdots$$
$$f(i) = 0.a_{i1}a_{i2}a_{i3}\ldots$$
$$\vdots$$
Each aij ∈ {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} and aij is the j-th digit in the decimal expansion of f(i). Define b ∈ (0, 1) as follows:
$$b = 0.b_1b_2b_3\ldots \qquad\text{where}\qquad b_i = \begin{cases} 7 & \text{if } a_{ii} = 8 \\ 8 & \text{if } a_{ii} \neq 8 \end{cases}$$
Notice that for all i ∈ N, bi ≠ aii, and that b has a unique decimal expansion (it uses only the digits 7 and 8). It follows that for all i ∈ N, b ≠ f(i). This contradicts the assumption that f is surjective. Hence (0, 1) is uncountable, and therefore so is R, since any subset of a countable set is countable.
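The diagonal construction is easy to mimic computationally. A toy sketch (in Python, for illustration only): given any finite list of digit sequences, it produces a sequence that differs from the i-th one in the i-th digit, exactly as in the proof.

```python
def diagonal(expansions):
    """Return a digit sequence differing from the i-th sequence in digit i."""
    return [7 if row[i] == 8 else 8 for i, row in enumerate(expansions)]

expansions = [
    [1, 4, 1, 5, 9],
    [7, 1, 8, 2, 8],
    [4, 1, 4, 2, 1],
    [8, 8, 8, 8, 8],
    [0, 0, 0, 0, 0],
]
print(diagonal(expansions))  # [8, 8, 8, 7, 8]
```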


A similar ‘diagonal argument’ establishes the following.

Theorem A.5

Let A be a set. The power set of A, P(A), does not have the same cardinality as A.

Proof. Suppose, for a contradiction, that there exists a bijection f : A → P(A). Define B = {a ∈ A | a ∉ f(a)}. Since B ⊆ A and f is surjective, there exists b ∈ A such that f(b) = B. Then we have

b ∈ f(b) ⟺ b ∈ B     (since f(b) = B)
         ⟺ b ∉ f(b)  (definition of B)

This is a contradiction, so no such bijection can exist.



APPENDIX B

Existence of bases

THIS IS FOR INTEREST ONLY!


The goal is to present a proof of the following theorem.

Theorem B.1

Let V be a vector space.


1. Every spanning set of V contains a basis of V .

2. Every linearly independent set in V can be extended to a basis of V .

3. Any two bases of V have the same cardinality.

If we assume that V is finite dimensional, the theorem is much easier to prove (see Lecture 17). In the
general case, which we shall consider here, the proof uses some fundamental results from the theory
of infinite sets which are stated without proof in the second section.
Notice that, since V itself is a spanning set for V , we have the following consequence of the theorem.

Corollary B.2

Every vector space has a basis.

B.1 Proof of the theorem


We start by proving the following.

Lemma B.3

Let V be a vector space over F and let X, Y ⊆ V be two subsets. If X is linearly independent and Y spans V, then there is a subset Y′ ⊆ Y such that X ∪ Y′ is a basis of V.

Proof of Lemma B.3. Define S to be the collection of all subsets Z of Y such that X ∪ Z is linearly
independent, that is:

S = {Z | Z ⊆ Y and X ∪ Z is linearly independent}

Let Y 0 ∈ S be maximal in S, that is, for all Z ∈ S we have:

Z ⊇ Y 0 =⇒ Z = Y 0

How do we know that such a maximal element exists? If Y is a finite set, then S is also finite and the
existence is clear. If Y is infinite, the existence of a maximal element Y 0 is less obvious, and is in fact
a fundamental property of set theory. It is called “Zorn’s Lemma”* (see the next section).
* In fact, it is not really a lemma. It is equivalent to something called the “Axiom of Choice,” which is independent of the other axioms of set theory.

By construction X ∪Y 0 is linearly independent. We claim that it is also a spanning set for V . We know
that for all y ∈ Y , y ∈ span(X ∪ Y 0 ) since otherwise X ∪ Y 0 ∪ {y} would be linearly independent,
which contradicts the maximality of Y 0 . Thus Y ⊆ span(X ∪ Y 0 ) and hence span(Y ) ⊆ span(X ∪ Y 0 ).
Since span(Y ) = V and span(X ∪ Y 0 ) ⊆ V , it follows that span(X ∪ Y 0 ) = V . Therefore X ∪ Y 0 is a
spanning set for V , and hence a basis.

We can now prove the first two parts of the theorem.

Proof (of parts 1 and 2 of the theorem).


1. Suppose that Y is a spanning set. Applying Lemma B.3 with this Y and X = ∅ yields a subset
Y 0 ⊆ Y such that Y 0 is a basis.
2. Suppose now that X is any linearly independent set. Taking Y = V and applying Lemma B.3
yields a basis that contains X.

To prove the third part of the theorem, we will use the following lemma.

Lemma B.4

Let V be a vector space over F, let Y ⊆ V be a spanning set for V , and let {x1 , . . . , xm } ⊆ V be a
linearly independent set of vectors. If |Y | > m, then there are elements y1 , . . . , ym ∈ Y such that
{Y \ {y1 , . . . , ym }} ∪ {x1 , . . . , xm } is a spanning set for V . (That is, replacing the yi by the xi still
gives a spanning set.)

Proof. We first note that since {x1 , . . . , xm } is linearly independent, all of the xi are non-zero. Since Y
is a spanning set, there exist a1 , . . . , ak ∈ Y and α1 , . . . , αk ∈ F such that x1 = α1 a1 + · · · + αk ak . As
x1 is non-zero, at least one of the αi is non-zero. By re-ordering the ai if necessary we may assume
that α1 6= 0. We then have that
     
1 α2 αk
a1 = x1 − a2 − · · · − ak
α1 α1 α1

Let y1 = a1. Using the above expression any linear combination of elements from Y can be rewritten as a linear combination of vectors from Y1 = (Y \ {y1}) ∪ {x1}. We simply replace any occurrence of y1 by the right hand side of the above expression. This then gives a linear combination which does not involve y1, but does involve x1. It follows that span(Y) ⊆ span(Y1), and therefore span(Y1) = V.
Suppose now that we have found y1, . . . , yl ∈ Y (where 1 ≤ l < m) such that Yl = (Y \ {y1, . . . , yl}) ∪ {x1, . . . , xl} is a spanning set for V. Since Yl is a spanning set, there exist b1, . . . , bk ∈ Y \ {y1, . . . , yl} and β1, . . . , βk, γ1, . . . , γl ∈ F such that xl+1 = β1b1 + · · · + βkbk + γ1x1 + · · · + γlxl. Since xl+1 is non-zero, at least one of the βi or γj is non-zero. Indeed, not all the βi can be zero, as that would contradict the linear independence of the set {x1, . . . , xl+1}. Re-ordering if necessary, we may assume that β1 ≠ 0.
Then
$$b_1 = \left(\frac{1}{\beta_1}\right)x_{l+1} - \left(\frac{\beta_2}{\beta_1}\right)b_2 - \cdots - \left(\frac{\beta_k}{\beta_1}\right)b_k - \left(\frac{\gamma_1}{\beta_1}\right)x_1 - \cdots - \left(\frac{\gamma_l}{\beta_1}\right)x_l$$
Letting yl+1 = b1 and Yl+1 = {Y \{y1 , . . . , yl+1 }}∪{x1 , . . . , xl+1 }, we have, as above, that span(Yl+1 ) =
V . The lemma then follows by induction.

The third part of the theorem follows from:

Lemma B.5

Let V be a vector space over F and let X, Y ⊆ V be two subsets. If X is linearly independent and Y spans V, then |X| ≤ |Y|.


Proof. We first prove the lemma under the assumption that Y is finite. Let Y = {y1, . . . , yk}. Suppose, in order to get a contradiction, that |X| > k. Choose distinct elements x1, . . . , xk ∈ X. Then {x1, . . . , xk} is linearly independent (since X is), and applying Lemma B.4 we know that

{x1, . . . , xk} = (Y \ {y1, . . . , yk}) ∪ {x1, . . . , xk}

is a spanning set for V. Since |X| > k, there is an element x ∈ X \ {x1, . . . , xk}. As {x1, . . . , xk} is a spanning set for V, x can be expressed as a linear combination of the xi. This contradicts the linear independence of X, so we must in fact have |X| ≤ k.
Consider now the case where Y is not finite. Denote by F(Y) the set of all finite subsets of Y. Then |F(Y)| = |Y| (see Lemma B.6). Define a map Φ : X → F(Y) as follows: for each x ∈ X we choose a finite subset Sx ⊆ Y such that x is a linear combination of Sx, and define Φ(x) = Sx. If |X| > |F(Y)| then there is some element S ∈ F(Y) with infinite preimage (see Lemma B.6). We would then have a finite set S ⊆ Y such that Φ⁻¹(S) ⊆ X is an infinite, linearly independent subset of span(S). This would contradict the first case of this proof.

Proof of part 3 of the theorem. Let B1 and B2 be two bases for V. Since B1 is linearly independent and B2 is a spanning set, Lemma B.5 implies that |B1| ≤ |B2|. On the other hand, B2 is linearly independent and B1 is a spanning set, so we also have |B2| ≤ |B1|. It follows that |B1| = |B2|.

B.2 Results from set theory

In the above proof we used some results about infinite sets. We state them here without proof. The
interested reader should consult an introductory textbook on set theory.

Lemma B.6

Suppose that X and Y are infinite sets and f : X → Y is a function. Then

1. |F (Y )| = |Y | (where F (Y ) is the set of all finite subsets of Y )

2. If |X| > |Y |, then there exists an element y ∈ Y whose preimage, {x ∈ X | f (x) = y}, is
infinite.

Zorn’s Lemma is a fundamental statement in the theory of infinite sets, and is equivalent to the Axiom
of Choice. In order to state Zorn’s Lemma (in the form we use) we make the following definition. If
S is a collection of sets, a non-empty subset C ⊆ S is called a chain if

∀ A, B ∈ C either A ⊆ B or B⊆A

Zorn’s Lemma

Let S be a non-empty collection of sets. Suppose that whenever C is a chain in S the union ∪C∈C C
is also an element of S. Then S contains a maximal element.
