
Math 211 Course Pack

Linear Algebra

Notes By William Thompson based on


Introduction to Linear Algebra for Science and Engineering, Norman and Wolczuk
Contents

1 Euclidean Vector Spaces 4


1.1 Vectors in R2 and R3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.1.1 Introduction to Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.1.2 Vector Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.1.3 Standard Basis and Linear Combinations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.1.4 Zero Vector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.1.5 The Vector Equation of a Line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.1.6 Directed Line Segments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.2 Vectors in Rn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.2.1 Rn and Algebraic Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.2.2 Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.2.3 Spanning Sets and Linear Independence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.2.4 Surfaces in Higher Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.3 Length and Dot Products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.3.1 Length . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.3.2 Angles and Dot Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.3.3 Properties of Length and Dot Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.3.4 The Scalar Equation of a Plane . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.3.5 The Cross Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.4 Projections and Minimum Distance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.4.1 Projections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.4.2 Projections and Minimal Distance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

2 Systems of Linear Equations 23


2.1 Systems of Linear Equations and Elimination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.1.1 Linear Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.1.2 Matrix Representations of a System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.1.3 Row Echelon Form and Consistency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.2 Reduced REF, Rank, and Homogeneous Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.2.1 Reduced Row Echelon Form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.2.2 Rank of a Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.2.3 Homogeneous Linear Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.3 Application to Spanning and Linear Independence . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.3.1 Spanning Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.3.2 Linear Independence Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.3.3 Bases of Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

1
3 Matrices, Linear Mappings, and Inverses 36
3.1 Operations on Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.1.1 Equality, Addition and Scalar Multiplication of Matrices . . . . . . . . . . . . . . . . . . . . 37
3.1.2 Transpose of a Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.1.3 Matrix Multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.1.4 Properties of Matrix Multiplication and the Identity Matrix . . . . . . . . . . . . . . . . . . 41
3.2 Matrix Mappings and Linear Mappings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.2.1 Matrix Mappings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.2.2 Linear Mappings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.2.3 Compositions and Linear Combination of Mappings . . . . . . . . . . . . . . . . . . . . . . 44
3.3 Geometrical Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.3.1 Common Transformation Matrices in R2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.3.2 Common Transformation Matrices in R3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.3.3 General Reflections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.4 Special Subspaces for Systems and Mappings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.4.1 The Four Fundamental Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.4.2 Bases for Row(A), Col(A), and Null(A) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.4.3 The Rank-Nullity Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.5 Inverse Matrices and Inverse Mappings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.5.1 Inverse Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.5.2 A Procedure for Finding the Inverse of a Matrix and Solving Systems of Equations . . . . . 53
3.5.3 Solving Systems of Equations A⃗x = ⃗b Using the Inverse . . . . . . . . . . . . . . . . . . . . 53
3.5.4 Inverse Linear Mappings and the Inverse Matrix Theorem . . . . . . . . . . . . . . . . . . . 54
3.6 Elementary Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.6.1 Elementary Matrices and Row Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.6.2 Representing A and A−1 as a Product of Elementary Matrices . . . . . . . . . . . . . . . . 56
3.7 LU -Decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.7.1 Constructing the LU -Decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.7.2 Using LU -Decomposition to Solve Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

4 Vector Spaces 59
4.1 Spaces of Polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.1.1 Polynomial Vector Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.1.2 Linear Combinations, Spans, and Linear Dependence/Independence . . . . . . . . . . . . . 61
4.2 Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.2.1 Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.2.2 Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.3 Bases and Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
4.3.1 Linear Combinations, Spans and Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
4.3.2 Determining a Basis of a Subspace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.3.3 Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
4.3.4 Extending a Linearly Independent Subset to a Basis . . . . . . . . . . . . . . . . . . . . . . 70
4.4 Coordinates with Respect to a Basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.4.1 Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.4.2 Change-of-Basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.5 General Linear Mappings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.5.1 General Linearity Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.5.2 The General Rank-Nullity Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

2
4.6 Matrix of a Linear Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
4.6.1 The Matrix of L with Respect to the Basis B . . . . . . . . . . . . . . . . . . . . . . . . . . 78
4.6.2 Change of Coordinates and Linear Mappings . . . . . . . . . . . . . . . . . . . . . . . . . . 79

5 Determinants 80
5.1 Determinants in Terms of Cofactors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.1.1 The 2 × 2 Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.1.2 The 3 × 3 Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.1.3 General Cofactor Expansion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
5.2 Elementary Row Operations and the Determinant . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
5.2.1 Determinant Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
5.2.2 Determinant Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
5.3 Inverse by Cofactors and Cramer’s Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
5.3.1 Inverse by Cofactors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
5.3.2 Cramer’s Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
5.3.3 A Formula For the Cross Product Using Determinants . . . . . . . . . . . . . . . . . . . . . 90

6 Eigenvectors and Diagonalization 91


6.1 Eigenvalues and Eigenvectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
6.1.1 Eigenvalues and Eigenvectors of a Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
6.1.2 Finding Eigenvectors and Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
6.1.3 Algebraic and Geometric Multiplicity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
6.2 Diagonalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
6.2.1 Similar Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
6.2.2 Diagonalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
6.3 Powers of Matrices and the Markov Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.3.1 (Large) Powers of Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.3.2 Markov Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
6.3.3 The Power Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101

7 Orthonormal Bases 102


7.1 Orthonormal Bases and Orthogonal Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
7.1.1 Orthonormal Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
7.1.2 Coordinates with Respect to an Orthonormal Basis . . . . . . . . . . . . . . . . . . . . . . . 104
7.1.3 Orthogonal Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
7.2 Projections and the Gram-Schmidt Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
7.2.1 Complementary Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
7.2.2 Projection on Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
7.2.3 The Gram-Schmidt Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108

A (Lengthy) Important Theorems and Definitions 110


A.1 The Inverse Matrix Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
A.2 Vector Spaces and Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
A.2.1 Vector Space Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
A.2.2 Subspace Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113

3
Chapter 1

Euclidean Vector Spaces

4
1.1 Vectors in R2 and R3
1.1.1 Introduction to Vectors
Definition: Points in Space

The collection of all points with two components (x, y) is called the Real 2-Space and is denoted R2. The collection of all points with three components (x, y, z) is called the Real 3-Space and is denoted R3.

Definition: Vectors

An abstraction of a point is a vector. Given a point P = (x1 , x2 ), the corresponding vector P⃗ is written as a column:

P = (x1 , x2 ) ⇐⇒ P⃗ = [ x1 x2 ]T

Vectors are given graphically by arrows where the base is located at some point (x, y) and the tip of the arrow is located at (x + x1 , y + x2 ) in R2. Vectors, and how you graph them, extend naturally with the same notation and graphical meaning to R3.

Notation: Vectors as Bold or With an Arrow

Textbooks often use the notation P instead of P⃗ . Either is acceptable but bold face is impossible to write
by hand, so when writing by hand we always use P⃗ .
 
Example 1: Graph the vectors [ 1 −2 ]T in R2 and [ −1 1 2 ]T in R3.

Note: Notational Issues When Writing

To address the notational issue of wanting to write inline, but vectors being written vertically, we will adopt the notation [ a b c ]T for the column vector with components a, b, c.

5
1.1.2 Vector Operations
Definition: Vector Addition and Subtraction
Let ⃗u and ⃗v be vectors with the same number of components. Vector addition ⃗u +⃗v and subtraction ⃗u −⃗v are
defined by adding and subtracting the same positioned components, respectively. Visually, vector addition
follows the parallelogram law and subtraction follows the tip-to-tip law.
   
Example 2: Let ⃗v = [ 2 −1 ]T and w⃗ = [ 1 2 ]T. Compute and graph ⃗v + w⃗ and ⃗v − w⃗.

Definition: Scalars
In the context of vectors we call (real) numbers scalars.

Definition: Scalar Multiplication

If ⃗v is a vector and t is a scalar, then the scalar multiple t⃗v is computed by multiplying each component of ⃗v by t.
 
Example 3: Let ⃗v = [ 2 4 ]T. Compute and graph 2⃗v and −(1/2)⃗v.

6
1.1.3 Standard Basis and Linear Combinations
Definition: Linear Combination
Let v⃗1 , ..., v⃗n be vectors. The expression a1 v⃗1 + · · · + an v⃗n , where a1 , ..., an are scalars, is called a linear combination of v⃗1 , ..., v⃗n .
     
Example 4: Express ⃗v = [ 5 1 ]T as a linear combination of ⃗s = [ 1 1 ]T and ⃗t = [ 2 0 ]T.
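Writing a vector as a linear combination amounts to solving a small linear system for the coefficients. A minimal NumPy sketch of this idea, using the vectors of Example 4 (NumPy itself is assumed and is not part of these notes):

    import numpy as np

    # Columns of A are the vectors s and t; v is the vector to express.
    A = np.array([[1.0, 2.0],
                  [1.0, 0.0]])
    v = np.array([5.0, 1.0])
    coeffs = np.linalg.solve(A, v)   # solves A @ coeffs = v
    print(coeffs)                    # the entries are the coefficients of s and t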

Definition: Standard Basis Vectors


The vectors ⃗e1 , ⃗e2 , ... are called the standard basis vectors. Each vector ⃗ei is defined as having 1 in the
i’th component and zero everywhere else. Alternate notation used is ⃗i, ⃗j, ....

Theorem: Linear Combination of Standard Basis Vectors


Every vector can be uniquely written as a linear combination of the standard basis vectors.

Example 5: Express w⃗ = [ 3 2 −4 ]T in terms of the standard basis of R3.

1.1.4 Zero Vector


Definition: The Zero (Null) Vector

The zero vector ⃗0 is the vector with all zero components.

Theorem: Properties of the Zero Vector

Let ⃗v be any vector. Then we have

1. ⃗0 + ⃗v = ⃗v

2. ⃗v − ⃗v = ⃗0

3. 0 ⃗v = ⃗0 (notice here 0 is a scalar and ⃗0 is the zero vector)

7
1.1.5 The Vector Equation of a Line
Definition: Parametric Curves in Rn

Let C be a curve in Rn . A vector-valued function that represents C is a function f⃗(t) whose input is a real number and whose output is a vector, with range given by C. Explicitly, the components x1 = f1 (t), ..., xn = fn (t) are called the parametric equations.

We interpret these functions as giving all points traced out by the tip of the representing arrow in standard position. That is, (f1 (t), ..., fn (t)) represents a point on the curve.

Definition: The Parametric Representation of a Line

The standard (vector) representation of a line in Rn that contains a point P and moves in a direction ⃗v is described by the vector-valued function L(t) = P⃗ + t⃗v . Eliminating t to obtain a system of equations depending only on the components x1 , ..., xn gives the scalar form of the line.

Consequently, to construct any line one needs only two ingredients: a point on the line and the direction of the line.

   
Example 6: Consider the line through P (3, 1, −2) that is parallel to the line ⃗y (t) = [ 1 −2 1 ]T + t [ 2 2 −3 ]T. Form the (a) Vector equation; (b) Parametric equations; and (c) Scalar form of this line.

8
1.1.6 Directed Line Segments
Definition: Directed Line Segments

The directed line segment from the point P to the point Q is the vector denoted by P⃗Q. If O is the origin, then we denote O⃗P simply as P⃗ , to be consistent with previous results.

Theorem: The Vector Between Two Points

For any two points P and Q we have P⃗Q = Q⃗ − P⃗ .

Example 7: Find the vector equation of the line that passes through the points P (1, 5, −2) and Q(4, −1, 3).
Demonstrate that the vector equation for a line isn’t necessarily unique by finding three different vector equations.

9
1.2 Vectors in Rn
1.2.1 Rn and Algebraic Operations
Definition: Points in Space

Rn is the set of all points (x1 , ..., xn ) with n components.

Definition: Vector Algebraic Operations

Vector addition, subtraction, and scalar multiplication are defined the same way in Rn (component-wise).

Example 1: Let ⃗v = [ 2 −1 3 4 ]T and w⃗ = [ 3 2 −2 1 ]T. Compute 3⃗v − 2w⃗.

Theorem: Properties Vectors in Rn

For all w⃗ , ⃗x and ⃗y in Rn and s, t ∈ R we have...

Property Name

⃗x + ⃗y ∈ Rn Closure under addition

⃗x + ⃗y = ⃗y + ⃗x Commutativity of vector addition

(⃗x + ⃗y ) + w⃗ = ⃗x + (⃗y + w⃗ ) Associativity of vector addition

There exists vector ⃗0 ∈ Rn such that ⃗0 + ⃗z = ⃗z for all ⃗z ∈ Rn Existence of zero vector

For each ⃗z ∈ Rn there exists −⃗z ∈ Rn such that ⃗z + (−⃗z) = ⃗0 Existence of additive inverse

t⃗x ∈ Rn Closure under scalar multiplication

s(t⃗x) = (st)⃗x Associativity of scalar multiplication

(s + t)⃗x = s⃗x + t⃗x Distributivity of scalar addition

t(⃗x + ⃗y ) = t⃗x + t⃗y Distributivity of vector addition

1 ⃗z = ⃗z for all ⃗z ∈ Rn 1 is the scalar identity

10
1.2.2 Subspaces
Definition: Subspaces

A non-empty subset S of Rn is called a subspace of Rn if for all vectors ⃗x, ⃗y ∈ S and t ∈ R...

1. ⃗0 ∈ S (Non-Empty)

2. ⃗x + ⃗y ∈ S (Closure under addition)

3. t ⃗x ∈ S (Closure under scalar multiplication)


Example 2: Show that the set T = { [ x1 x2 ]T | x1 x2 = 0 } is not a subspace of R2.

Example 3: Prove that the set S = { [ x1 x2 ]T | 2x1 − 3x2 = 0 } is a subspace of R2.

11
1.2.3 Spanning Sets and Linear Independence
Definition: Spanning Sets

Let R = {v⃗1 , ..., v⃗k } be a collection of vectors. We define Span(R) to be the collection of all possible linear combinations of vectors in R. That is, Span(R) = { t1 v⃗1 + t2 v⃗2 + · · · + tk v⃗k | t1 , ..., tk ∈ R }.

Example 4: Let R = { [ −3 0 2 ]T , [ −5 0 1 ]T }. Demonstrate that ⃗v = [ −4 2 5 ]T is not in Span(R).

Theorem: Spanning Sets are Subspaces

Let R be a non-empty collection of vectors in Rn , then Span(R) is a subspace of Rn .

Definition: Spanning Sets

Let R be a non-empty collection of vectors in Rn . If S = Span(R) then we say that S is the subspace spanned by the vectors in R and, in turn, we say that R spans S. The set R is called a spanning set for the subspace S.

Theorem: Linear Dependence Within Spanning Sets

Let v⃗1 , ..., v⃗k be vectors in Rn . If v⃗k can be written as a linear combination of v⃗1 , ..., ⃗vk−1 then

Span {v⃗1 , ..., v⃗k } = Span {v⃗1 , ..., ⃗vk−1 }


         
 1 1 0   1 0 
Example 5: Use the prior theorem to argue that Span  0  ,  1  ,  1  = Span  0  ,  1  .
−1 −1 0 −1 0
   

12
Definition: Linear Dependence and Independence

A collection of vectors {v⃗1 , ..., v⃗k } is said to be linearly dependent if there exist coefficients t1 , ..., tk not
all zero such that

⃗0 = t1 v⃗1 + · · · + tk v⃗k

Alternatively, if the only solution is t1 = t2 = · · · = tk = 0 (called the trivial solution) then we say the
collection is linearly independent.
     
 4 9 1 
Example 6: Show that T =  −1  ,  0  ,  2  is linearly dependent. Hint: −2⃗v1 + ⃗v2 − ⃗v3 .
0 1 1
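The hinted relation can be checked numerically. A minimal NumPy sketch (NumPy is assumed):

    import numpy as np

    v1 = np.array([4.0, -1.0, 0.0])
    v2 = np.array([9.0, 0.0, 1.0])
    v3 = np.array([1.0, 2.0, 1.0])
    # A non-trivial linear combination equal to the zero vector shows dependence.
    print(-2 * v1 + v2 - v3)   # expect [0. 0. 0.]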
 

Theorem: Linear Dependence With The Zero Vector

If a set of vectors contains the zero vector, then it is linearly dependent.

Definition: Basis of a Vector Space

Let R be a collection of vectors in Rn such that S = Span(R). Provided R is linearly independent we say
that R is a basis for S.

Example 7: Consider the set T of the prior example. Argue that the set T is not a basis of Span(T ). Then,
determine a basis of Span(T ).

13
1.2.4 Surfaces in Higher Dimensions
Definition: Planes in Higher Dimensions

(1) Let p⃗, d⃗ ∈ Rn with d⃗ ̸= ⃗0. Then we call the set with vector equation ⃗x(t) = p⃗ + td⃗ a line in Rn that
passes through p⃗.

(2) Let p⃗, d⃗1 , d⃗2 ∈ Rn with {d⃗1 , d⃗2 } being a linearly independent set. Then the set with vector equation
⃗x(t1 , t2 ) = p⃗ + t1 d⃗1 + t2 d⃗2 is called a plane in Rn that passes through p⃗.

(3) Let p⃗, v⃗1 , ..., ⃗vn−1 ∈ Rn with {v⃗1 , ..., ⃗vn−1 } being linearly independent. Then the set with vector
equations ⃗x(t1 , ..., tn−1 ) = p⃗ + t1⃗v1 + · · · + tn−1⃗vn−1 is called a hyperplane in Rn that passes through
p⃗.
     
 1 −1 0 
Example 9: Show that the set Span   2 ,
  1  , 3   is not a hyperplane. Determine what

1 2 3
 
type of surface this space describes.

14
1.3 Length and Dot Products
1.3.1 Length
Definition
The length/norm of a vector ⃗x ∈ Rn is denoted and defined as ∥⃗x∥ = √(x1² + x2² + · · · + xn²). Quite literally, it is the length of the arrow.

Example 1: Let ⃗x = [ 1/3 −2/3 −2/3 ]T. Compute ∥⃗x∥.

Definition: Unit Vectors


A vector ⃗x ∈ Rn such that ∥⃗x∥ = 1 is called a unit vector.

Proposition: Normalization of a Vector

The unit vector moving in the same direction as ⃗x is denoted and given by x̂ = (1/∥⃗x∥) ⃗x.

Example 2: Construct a vector w⃗ in the same direction as ⃗x = [ 1 −2 5 ]T such that ∥w⃗ ∥ = 2.
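A minimal NumPy sketch of computing lengths and unit vectors, using the vector from Example 2 (NumPy is assumed):

    import numpy as np

    x = np.array([1.0, -2.0, 5.0])
    length = np.linalg.norm(x)    # square root of the sum of squared components
    x_hat = x / length            # unit vector in the same direction as x
    w = 2 * x_hat                 # vector of length 2 in that direction
    print(length, np.linalg.norm(w))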

15
1.3.2 Angles and Dot Product
Definition: The Dot Product
The dot product between two vectors p⃗ and ⃗q in Rn is given and denoted by p⃗ · ⃗q = p1 q1 + p2 q2 + · · · + pn qn .

Definition (and Theorem): The Angle Between Vectors

The angle θ between two vectors p⃗ and ⃗q in Rn is given by the equation

p⃗ · ⃗q = ∥⃗p∥ ∥⃗q∥ cos(θ), where θ is always chosen to satisfy 0 ≤ θ ≤ π.

Note: The Derivation Angle Formula

The prior result is a theorem in R2 and R3 derived by using the Law-Of-Cosines. Since we can’t draw such
figures in higher dimensions, we take the above to be the definition of the angle between two vectors in
higher dimensions.
   
Example 3: Find the angle in R3 between the vectors ⃗v = [ 1 2 −1 ]T and w⃗ = [ 1 −1 −1 ]T.
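A minimal NumPy sketch of the angle formula, using the vectors of Example 3 (NumPy is assumed):

    import numpy as np

    v = np.array([1.0, 2.0, -1.0])
    w = np.array([1.0, -1.0, -1.0])
    cos_theta = np.dot(v, w) / (np.linalg.norm(v) * np.linalg.norm(w))
    theta = np.arccos(cos_theta)   # chosen so that 0 <= theta <= pi
    print(theta)                   # angle in radians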

Definition: Orthogonality of Vectors

Two vectors ⃗x and ⃗y in Rn are orthogonal to each other provided ⃗x · ⃗y = 0.


    
Example 4: Let ⃗v = [ 1 0 3 −2 ]T , w⃗ = [ 2 3 0 1 ]T , and ⃗z = [ −1 −1 1 2 ]T. Which pairs of vectors are orthogonal to each other and which ones aren't?

16
1.3.3 Properties of Length and Dot Product
Theorem: Dot Product and Norm Properties

Let ⃗x, ⃗y , ⃗z ∈ Rn and let t ∈ R be a scalar. Then...

Property Name

⃗x · ⃗y = ⃗y · ⃗x Commutativity

⃗x · (⃗y + ⃗z) = ⃗x · ⃗y + ⃗x · ⃗z Distributivity

(t⃗x) · ⃗y = t(⃗x · ⃗y ) = ⃗x · (t⃗y ) Associativity

∥t⃗x∥ = |t| ∥⃗x∥ Common Factor

|⃗x · ⃗y | ≤ ∥⃗x∥ ∥⃗y ∥ Cauchy-Schwarz Inequality

∥⃗x + ⃗y ∥ ≤ ∥⃗x∥ + ∥⃗y ∥ Triangle Inequality

Theorem: Positive Definiteness of the Dot Product and Norm

Let ⃗x ∈ Rn . We have that ⃗x · ⃗x ≥ 0 and equality is obtained if and only if ⃗x = ⃗0. Similarly, ∥⃗x∥ ≥ 0 and
equality is obtained if and only if ⃗x = ⃗0.

Theorem: Relationship Between the Norm and Dot Product

Let ⃗x ∈ Rn , then ∥⃗x∥2 = ⃗x · ⃗x.

Example 5: Suppose that ⃗x and ⃗y are vectors in Rn satisfying ∥⃗x∥ = 3, ∥⃗y ∥ = 2 and the angle between them
is θ = π/3. Compute ⃗x · (5⃗x + ⃗y ).

17
1.3.4 The Scalar Equation of a Plane
Definition: The Scalar Equation of a Plane

If ⃗n ̸= ⃗0 is a vector orthogonal to every vector in a plane, we say that the


vector ⃗n is a normal vector to the plane.

The scalar equation of a plane (or hyperplane) with normal vector ⃗n


containing the point p⃗ is given implicitly by ⃗n · (⃗x − p⃗) = 0.

Example 6: Find the equation of the plane that contains the point P (2, 3, −1) with normal ⃗n = [ 1 −4 1 ]T.

Example 7: Find a normal vector to the plane with scalar equation 5x1 − 6x2 + 7x3 = 11.

Definition: Parallel Planes


Two planes are defined to be parallel if the normal vector to one plane is a non-zero scalar multiple of the
normal vector of the other plane.

Example 8: Construct the scalar equation of the plane that contains the point P (2, 4, −1) and is parallel to
the plane 2x1 + 3x2 − 5x3 = 6.

Definition: Orthogonality of Planes

Two planes are orthogonal if their normal vectors are orthogonal.

18
1.3.5 The Cross Product
Definition: The Cross Product
 
Let ⃗u and ⃗v be vectors in R3 . The cross product is defined to be ⃗u × ⃗v = [ u2 v3 − u3 v2 , u3 v1 − u1 v3 , u1 v2 − u2 v1 ]T.

Theorem: Orthogonality of the Cross Product

Let ⃗u and ⃗v be vectors in R3 . The cross product of them is orthogonal to both ⃗u and ⃗v .
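A minimal NumPy sketch of the cross product and its orthogonality property, with two illustrative vectors (NumPy is assumed):

    import numpy as np

    u = np.array([1.0, 0.0, 1.0])
    v = np.array([0.0, 1.0, -1.0])
    n = np.cross(u, v)                      # the cross product u x v
    print(n, np.dot(n, u), np.dot(n, v))    # both dot products should be 0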

Example 9: Find the scalar equation of the plane through three points P (1, 0, 1), Q(0, 1, −1) and R(2, 1, 0).

Theorem: Cross Product Properties

Let ⃗x, ⃗y , ⃗z ∈ R3 and t ∈ R be a scalar. Then...

Property Name

⃗x × ⃗y = −⃗y × ⃗x Anti-Commutativity

⃗x × ⃗x = ⃗0 Self-Degenerate

⃗x × (⃗y + ⃗z) = (⃗x × ⃗y ) + (⃗x × ⃗z) Distributive

(t⃗x) × ⃗y = t(⃗x × ⃗y ) = ⃗x × (t⃗y ) Scalar Associativity

⃗x × ⃗y = ⃗0 only if ⃗x = ⃗0 or ⃗y is a multiple of ⃗x Collinear-Degenerate

If ⃗n = ⃗x × ⃗y then for any w⃗ ∈ Span{⃗x, ⃗y } we have ⃗n · w⃗ = 0 Orthogonality

19
1.4 Projections and Minimum Distance
1.4.1 Projections
Definition: Projection

Let ⃗u and ⃗v be vectors in Rn where ⃗v ̸= ⃗0. The projection of ⃗u onto ⃗v is defined to be

proj⃗v (⃗u) = ((⃗u · ⃗v )/(⃗v · ⃗v )) ⃗v .

The expression Comp⃗v (⃗u) = (⃗u · ⃗v )/(⃗v · ⃗v ) is called the scalar component.

Definition: Perpendicular Projection

For any vectors ⃗u, ⃗v ∈ Rn with ⃗v ̸= ⃗0, we define the projection


of ⃗u perpendicular to ⃗v to be

perp⃗v ⃗u = ⃗u − proj⃗v ⃗u

Example 1: Let ⃗u = [ 1 −2 0 ]T and ⃗v = [ 3 1 2 ]T. Determine proj⃗v ⃗u and perp⃗v ⃗u.
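A minimal NumPy sketch of the projection formulas, using the vectors of Example 1 (NumPy is assumed):

    import numpy as np

    u = np.array([1.0, -2.0, 0.0])
    v = np.array([3.0, 1.0, 2.0])
    proj = (np.dot(u, v) / np.dot(v, v)) * v   # proj_v(u)
    perp = u - proj                            # perp_v(u)
    print(proj, perp, np.dot(perp, v))         # perp_v(u) is orthogonal to v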

Theorem: Projection Properties

Theorem: Let ⃗x, ⃗y , ⃗z ∈ Rn and let t ∈ R be a scalar. Then...

Property Name

proj⃗x (⃗y + ⃗z) = proj⃗x ⃗y + proj⃗x⃗z Additive Linearity

proj⃗x (t⃗y ) = t proj⃗x ⃗y Scalar Linearity

proj⃗x (proj⃗x ⃗y ) = proj⃗x ⃗y Projection Property

20
1.4.2 Projections and Minimal Distance
Theorem: Distance from a Point to a Line and Distance from a Point to a Plane

Distance Between a Line and a Point: Let P be a point on


the line ⃗x(t) = P⃗ + td⃗ and let Q be a point in Rn . The minimum
distance from Q to the line is given by D = ∥perpd⃗P⃗Q∥.

Distance Between a Point and a Plane: Let R be a point on the plane ⃗n · (⃗x − P⃗ ) = 0 and let Q be a point in Rn . The minimum distance from Q to the plane is given by D = ∥proj⃗n R⃗Q∥.

Example 2: Find the distance between the point Q(4, 3) and the line that runs through the points P (1, 2)
and R(0, 3).
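A minimal NumPy sketch of this minimum-distance computation, using the data of Example 2 (NumPy is assumed):

    import numpy as np

    P = np.array([1.0, 2.0])
    R = np.array([0.0, 3.0])
    Q = np.array([4.0, 3.0])
    d = R - P                                      # direction of the line through P and R
    PQ = Q - P
    perp = PQ - (np.dot(PQ, d) / np.dot(d, d)) * d
    print(np.linalg.norm(perp))                    # minimum distance from Q to the line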

21
Example 3: Find the point on the plane x1 − 2x2 + 2x3 = 5 that is closest to the point Q(2, 1, 1). Hint: In the previous diagram note that R⃗ = Q⃗ + Q⃗R = Q⃗ + proj⃗n Q⃗P .

   
Example 4: Find the point on the line ⃗x(t) = [ 1 −2 3 ]T + t [ 1 0 1 ]T that is closest to the point Q(1, 3, 1).

22
Chapter 2

Systems of Linear Equations

23
2.1 Systems of Linear Equations and Elimination
2.1.1 Linear Equations
Definition: Linear Equations

A linear equation in n variables x1 , ..., xn is an equation that can be written in the form

a1 x1 + a2 x2 + · · · + an xn = b
where the constants a1 , ..., an are called the coefficients of the equation, and b is just a constant without
any particular name. The variables x1 , ..., xn are unknown and usually to be solved for.

Note: Elimination and Rules of Solving Systems of Linear Equations

The standard procedure for solving a system of linear equations is elimination. When dealing with a system
of linear equations there are three actions that you are allowed to use in rendering a possible solution:

1. Multiply one equation by a non-zero constant

2. Interchange two equations

3. Add a multiple of one equation to another equation

Example 1: Use elimination (with back-substitution) to find all solutions of the following system of linear
equations

x1 + x2 − 2x3 = 4

x1 + 3x2 − x3 = 7

2x1 + x2 − 5x3 = 7

24
Definition: Gaussian Elimination with Back-Substitution and Solutions
The solution procedure introduced above is known as Gaussian elimination with back-substitution.
At each step, the entry you are using to eliminate all other entries is called a pivot.

Definition: Solutions of Linear Equations and the General Solution

To any system of linear equations the only possibilities are a unique solution, infinite solutions, or no solution.

In the case of infinite solutions, the variables that are chosen to be independent of the other variables are
called the free variables. The values chosen for the free variables are called the parameters. The final
result where each variable is dependent on the parameters is called the general solution.

Example 2: Use elimination to find all solutions to the following system of linear equations


2x1 + x2 + 5x3 = 0

x2 − 3x3 = 1

x1 + x2 + x3 = 2

25
2.1.2 Matrix Representations of a System
Note: Systems of Equations as a Matrix

We may simplify the prior notation when solving a system of linear equations to that of a spreadsheet of
information. That is, we can drop the variables and replace them as cells containing the coefficients. This
helps simplify matters when solving, e.g.
a11 x1 + a12 x2 = b1        [ a11 a12 | b1 ]
a21 x1 + a22 x2 = b2   ⇐⇒   [ a21 a22 | b2 ]

Here the entry aij represents the coefficient in the i’th row and j’th column. We represent this as [ A | ⃗b ].

Definition: Matrices and Augmented Matrices

A rectangular array of data is called a matrix. A system of the form [ A | ⃗b ] is called an Augmented
Matrix of the system of corresponding linear equations. The matrix A is called the Coefficient Matrix.

Note: Row Operations and Elimination

Actions of elimination when applied to a matrix are called Row Operations. Performing these actions is
called Row Reducing instead of eliminating. Each resulting matrix when applying these operations is called
Row Equivalent. In reducing, it is VERY WRONG to write an equal sign instead of an arrow.

Example 3: By row reducing an augmented matrix (following the procedure of Gaussian elimination) find
the general solution of the system

x1 + x2 + x3 = 5

2x1 + 4x3 = 4

−3x2 + 3x3 = −9
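A minimal SymPy sketch of row reducing the augmented matrix of Example 3 (SymPy is assumed; its rref method carries the reduction all the way to reduced row echelon form, covered in Section 2.2):

    from sympy import Matrix

    # Augmented matrix [ A | b ] for the system of Example 3.
    M = Matrix([[1, 1, 1, 5],
                [2, 0, 4, 4],
                [0, -3, 3, -9]])
    R, pivot_cols = M.rref()   # reduced row echelon form and the pivot columns
    print(R)
    print(pivot_cols)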

26
2.1.3 Row Echelon Form and Consistency
Definition: Row Echelon Form
A matrix is in Row Echelon Form (REF) if...

• When all entries in a row are zero, this row appears below all non-zero rows.

• When two non-zero rows are compared, the first non-zero entry, called the leading entry, in the upper row is to the left of the leading entry in the lower row (i.e. the first non-zero entries cascade from left to right as you move down rows).

Example 4: Reduce the following system to row echelon form and determine all solutions.

x1 + x2 = 1

x2 + x3 = 2

x1 + 2x2 + x3 = −2

Definition: Consistency of a System

If a system has at least one solution we say it is consistent. If a system does not have any solutions we
say it is inconsistent.

Theorem: Possible States of Reduced System

Suppose that the augmented matrix [ A | ⃗b ] is reduced to REF. There are three possibilities:

1. The system is inconsistent;

2. The system is consistent with infinite solutions; or

3. The system is consistent with a unique solution.

27
2.2 Reduced REF, Rank, and Homogeneous Systems
2.2.1 Reduced Row Echelon Form
Note: Gauss-Jordan Elimination
When performing Gaussian elimination, if we choose to use the j-th equation to eliminate the j-th variable of all other rows (not just the rows below it), then you create a new form of elimination called Gauss-Jordan elimination. The benefit of row reducing in this manner is that it avoids the need for back-substitution at the end.

Example 1: Use Gauss-Jordan elimination to solve the following system

x1 + 2x2 + 2x3 = 4
x1 + 3x2 + 3x3 = 5
2x2 + x3 = −2

Definition: Reduced Row Echelon Form


A matrix is said to be in reduced row echelon form (RREF) if

1. It is in row echelon form.

2. All leading entries are 1, called a leading 1.

3. In a column with a leading 1, all other entries in the column are zero.

Theorem
For any given matrix, there is a unique matrix in reduced row echelon form that is row equivalent to it.

28
2.2.2 Rank of a Matrix
Note: Leading 1’s as an Indicator of Solution Types

It is clear that the number of pivots is an indicator of what type of solution a system of linear equations has.
In the case of RREF, pivots are replaced by leading 1’s. Also, if the system is consistent, the number of piv-
ots is a good indicator of home many free variables there are and thus it seems important enough to define it.

Definition: Rank of a Matrix


The rank of a matrix is the number of leading 1’s in its RREF and is denoted by rank(A).
 
Example 2: Determine the rank of the following matrix:

[ 1   0  1 ]
[ −2 −3  1 ]
[ 3   3  0 ]

Theorem: Consistency and Rank

Let [ A | ⃗b ] be an augmented system of m linear equations in n variables.

1. The system is consistent if and only if the rank of the coefficient matrix A is equal to the rank of the
augmented matrix [ A | ⃗b ]. That is, rank(A) = rank([A|⃗b])

2. If the system is consistent, then the number of parameters in the general solution (i.e. number of free
variables) is given by n − rank(A).

Theorem: Requirement for Consistency of all Possible Outputs

Let [ A | ⃗b ] be a system of m linear equations in n variables. Then [ A | ⃗b ] is consistent for all ⃗b ∈ Rm if and only if rank(A) = m.
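A minimal NumPy sketch of the rank test for consistency; the coefficient matrix is the one from Example 2 above and ⃗b is an illustrative right-hand side (NumPy is assumed):

    import numpy as np

    A = np.array([[1.0, 0.0, 1.0],
                  [-2.0, -3.0, 1.0],
                  [3.0, 3.0, 0.0]])
    b = np.array([[1.0], [0.0], [1.0]])          # an illustrative right-hand side
    rank_A = np.linalg.matrix_rank(A)
    rank_Ab = np.linalg.matrix_rank(np.hstack([A, b]))
    print(rank_A, rank_Ab)   # the system [ A | b ] is consistent iff the two ranks agree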

29
2.2.3 Homogeneous Linear Equations
Definition: Homogeneous Systems

A linear equation is homogeneous if the “right-hand side” is zero (when the equation is written in the form we usually write it in). A system of linear equations is homogeneous if all of the equations in the system are homogeneous.

2x1 + x2 = 0

Example 3: Find a general solution of the homogeneous system x1 + x2 − x3 = 0 .

−x2 + 2x3 = 0

Note: Trivial Solution


The values x1 = 0, ..., xn = 0 are always a solution to any homogeneous system. This solution is called the
trivial solution.

Theorem: Consistency of Homogeneous Systems

A homogeneous system is always consistent. It always has either infinite solutions or a unique solution. If
the solution is unique, it is always the trivial solution.

30
2.3 Application to Spanning and Linear Independence
2.3.1 Spanning Problems
         
Example 1: Determine whether the vector ⃗v = [ −2 −3 1 ]T is in the set Span{ [ 1 1 1 ]T , [ 1 −1 5 ]T , [ 2 1 4 ]T , [ −1 −3 3 ]T }.
 

Theorem: Linear Combinations as Systems of Equations

The linear combination ⃗v = t1 w⃗1 + · · · + tn w⃗n is equivalent to a system of linear equations where ⃗v is the “Right-Hand-Side” and the vector w⃗ i gives the coefficients of the i’th column. For example,

[ 3 ]      [ −2 ]      [ −1 ]        [ −2 −1 | 3 ]
[ 4 ] = t1 [ 1  ] + t2 [ 5  ]   ⇐⇒   [ 1   5 | 4 ]
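A minimal NumPy sketch of this translation, applied to Example 1 above: the spanning vectors become the columns of a coefficient matrix and ⃗v becomes the right-hand side (NumPy is assumed):

    import numpy as np

    # Columns are the four spanning vectors; v is the target vector.
    W = np.array([[1.0, 1.0, 2.0, -1.0],
                  [1.0, -1.0, 1.0, -3.0],
                  [1.0, 5.0, 4.0, 3.0]])
    v = np.array([[-2.0], [-3.0], [1.0]])
    rank_W = np.linalg.matrix_rank(W)
    rank_Wv = np.linalg.matrix_rank(np.hstack([W, v]))
    print(rank_W, rank_Wv)   # v is in the span iff the ranks agree (the system is consistent)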

31
     

 1 1 3 
     
2  ,   ,  5  . Find a homogeneous system of linear equations that
1

Example 2: Consider Span   1   3   5 

 
1 1 3
 
defines this set. That is, let ⃗x be an arbitrary vector in the spanning set and find homogeneous conditions on the
components for this vector to be in the set.

32
Example 3: Show that every vector ⃗v ∈ R3 can be written as a linear combination of the vectors

[ 1 0 0 ]T ,  [ 2 2 1 ]T ,  and  [ −4 6 5 ]T

Hint: There’s an indirect and computationally easy way to do this. Remember that a system [ A | ⃗b ] has a solution for every ⃗b ∈ Rm if rank(A) = m.

Theorem: Consistency and Spanning Sets

A set of k vectors {v⃗1 , ..., v⃗k } in Rn spans Rn if and only if the rank of the coefficient matrix of the system
t1 v⃗1 + · · · + tk v⃗k = ⃗v is n.

Theorem: Lower Bound on Dimension for Spanning

Let {v⃗1 , ..., v⃗k } be a set of k vectors in Rn . If Span({v⃗1 , ..., v⃗k }) = Rn then k ≥ n.

33
2.3.2 Linear Independence Problems
     
Example 4: Determine whether the set { [ 1 −1 ]T , [ 2 0 ]T , [ 1 1 ]T } is linearly independent or dependent.

Theorem: Linear Dependence of “Too Many Vectors”

If {v⃗1 , ..., v⃗k } is a collection of vectors in Rn with k > n then the collection is linearly dependent.
     
 1 −2 1 
Example 5: Determine whether the set  2  ,  1  ,  1  is linearly independent or dependent.
1 0 1
 

Theorem: Rank and Linear Independence

A set of vectors {v⃗1 , ..., v⃗k } in Rn is linearly independent if and only if the rank of the coefficient matrix of
the homogeneous system t1 v⃗1 + · · · + tk v⃗k = ⃗0 is k. Furthermore, if {v⃗1 , ...v⃗k } is a linearly independent set
of vectors in Rn then k ≤ n.
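A minimal NumPy sketch of this rank test, using the vectors of Example 5 as the columns of a matrix (NumPy is assumed):

    import numpy as np

    A = np.array([[1.0, -2.0, 1.0],
                  [2.0, 1.0, 1.0],
                  [1.0, 0.0, 1.0]])
    k = A.shape[1]
    print(np.linalg.matrix_rank(A) == k)   # True iff the columns are linearly independent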

34
2.3.3 Bases of Subspaces
     
 1 5 −2 
Example 6: Prove that  1  ,  −2  ,  3  is a basis for R3 . Hint: all this can be argued from the rank.
2 2 1
 

Theorem: Requirement for a Basis of Rn

A set of vectors {v⃗1 , ..., v⃗n } is a basis for Rn if and only if the rank of the coefficient matrix t1 v⃗1 +· · ·+tn v⃗n = ⃗v
is n.
   
 1 1 
Example 7: Show that  2  ,  1  is a basis for the plane −3x1 + 2x2 + x3 = 0
−1 1
 

Definition: Dimension of a Vector Space

If S is a non-trivial subspace of Rn with a basis containing k vectors, then we say that the dimension of
S is k and write dim(S) = k.

35
Chapter 3

Matrices, Linear Mappings, and Inverses

36
3.1 Operations on Matrices
3.1.1 Equality, Addition and Scalar Multiplication of Matrices
Note: Matrices in the Abstract
In this chapter we wish to understand the underlying structure of matrices mathematically, without reference to a system of linear equations. Therefore, most (if not all) matrices in what follows are simply objects that are arrays of data, not necessarily augmented and not necessarily corresponding to a system of equations.

Definition: Characteristics of a Matrix


Let A be a matrix.

• Size of a Matrix: We say that A is a size m × n matrix when A has m rows and n columns. If we want to state or imply the size of an m × n matrix we will write Am×n .

• Equality of Matrices: Two matrices A and B are defined to be equal if and only if they have the same size and their corresponding entries are equal. That is, if aij = bij for 1 ≤ i ≤ m, 1 ≤ j ≤ n.

• Entries of a Matrix: Sometimes we denote the entries of a matrix as (A)ij or aij . We sometimes denote the full matrix as A or [aij ].

• Square Matrices: A size n × n matrix (where the number of rows and columns are equal) is called a square matrix.
Example 1: Construct a square 3 × 3 matrix with entries given by δij = 1 if i = j, and δij = 0 if i ̸= j.

Definition: Triangular and Diagonal Matrices

A square matrix U is said to be upper triangular if the entries beneath the main diagonal are all zero,
that is, if uij = 0 whenever i > j. A square matrix L is said to be lower triangular if the entries above
the main diagonal are all zero, that is, if lij = 0 whenever i < j. A square matrix D such that dij = 0 if
i ̸= j is called a diagonal matrix. We denote an n × n diagonal matrix by D = diag(d11 , d22 , ..., dnn ).

37
Definition: Matrix Addition and Scalar Multiplication

Let A and B be m × n matrices and t ∈ R be a scalar. We define addition of matrices A + B to


be component-wise addition. That is, (A + B)ij = (A)ij + (B)ij and scalar multiplication tA to be
distribution of t into every component of A. That is, (tA)ij = t(A)ij .
     
Example 2: Let

A = [ 2 −1 ]     B = [ −1 3 ]     C = [ 7 0 −1 ]
    [ 1  0 ]         [ −2 4 ]         [ 3 1  2 ]

If possible, compute B − 4A and A + B + C.

Definition: The Zero Matrix


The zero matrix O of size m × n is the matrix that satisfies oij = 0 for all 1 ≤ i ≤ m, 1 ≤ j ≤ n.

Theorem: Properties of Matrices

For all size m × n matrices A, B and C and s, t ∈ R we have...

Property Name

A + B is an m × n matrix Closure under addition

A+B =B+A Commutativity of matrix addition

(A + B) + C = A + (B + C) Associativity of matrix addition

There exists a matrix Om×n such that O + M = M for all Mm×n Existence of zero matrix

For all Mm×n there exists a (−M )m×n such that M + (−M ) = O Existence of additive inverse

sA is an m × n matrix Closure under scalar multiplication

s(tA) = (st)A Associativity of scalar multiplication

(s + t)A = sA + tA Distributivity of scalar addition

s(A + B) = sA + sB Distributivity of matrix addition

1 M = M for all Mm×n 1 is the scalar identity

38
Definition: Span of Matrices

Let B = {A1 , ..., Ak } be a set of m × n matrices. Then the span of B is defined as

Span(B) = {t1 A1 + · · · + tk Ak | t1 , ..., tk ∈ R}


       
Example 3: Determine if the matrix

[ 1 2 ]
[ 3 4 ]

is in the span of

B = { [ 1 1 ] , [ 1 0 ] , [ 0 1 ] }
      [ 0 0 ]   [ 0 1 ]   [ 1 0 ]

Definition: Linear Dependence and Independence

Let B = {A1 , ..., Al } be a set of m × n matrices. Then B is said to be linearly independent if the only solution to the equation t1 A1 + · · · + tl Al = Om×n is t1 = · · · = tl = 0. Otherwise, there is a non-trivial solution, for which we say B is linearly dependent.
     
Example 4: Determine if

B = { [ 1 0 ] , [ 1 2 ] , [ −1 −3 ] }
      [ 1 1 ]   [ 0 1 ]   [ 1  −1 ]

is linearly dependent or independent.

39
3.1.2 Transpose of a Matrix
Definition: Transpose of a Matrix

Let A be an m × n matrix. Then the transpose of A is the n × m matrix denoted AT , whose ij-th entry
is the ji-th entry of A. That is, (AT )ij = (A)ji .
 
Example 5: Let

A = [ −1 6  0 2 ]
    [ 4  2 −1 3 ]

Compute 3AT .

3.1.3 Matrix Multiplication


Definition: Matrix Multiplication

Let B be an m × k matrix with rows ⃗bT1 , ..., ⃗bTm and A be a k × n matrix with columns ⃗a1 , ..., ⃗an . Then we define BA to be the m × n matrix whose ij-th entry is (BA)ij = ⃗bi · ⃗aj .

Example 6: If possible, perform the following operations. If it is not possible, explain why.

[ 2  3 0  1 ] [ 3 1 ]              [ 3 1 ] [ 2  3 0  1 ]
[ 4 −1 2 −1 ] [ 1 2 ]      and     [ 1 2 ] [ 4 −1 2 −1 ]
              [ 2 3 ]              [ 2 3 ]
              [ 0 5 ]              [ 0 5 ]
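A minimal NumPy sketch of these products (NumPy is assumed); it also illustrates the next note, since the two products do not even have the same size:

    import numpy as np

    B = np.array([[2.0, 3.0, 0.0, 1.0],
                  [4.0, -1.0, 2.0, -1.0]])   # 2 x 4
    A = np.array([[3.0, 1.0],
                  [1.0, 2.0],
                  [2.0, 3.0],
                  [0.0, 5.0]])               # 4 x 2
    print(B @ A)   # (2 x 4)(4 x 2) is a 2 x 2 matrix
    print(A @ B)   # (4 x 2)(2 x 4) is a 4 x 4 matrix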

Note: AB ̸= BA — i.e. Matrix Multiplication is Non-Commutative

The previous example shows that AB and BA are not generally equal. Don’t ever assume that they are.

40
3.1.4 Properties of Matrix Multiplication and the Identity Matrix
Theorem: Properties of Matrix Multiplication

If A, B and C are matrices of the correct size so that the required products are defined, and t ∈ R, then...

Property Name

A(B + C) = AB + AC Left Distributivity of Matrix Multiplication

(A + B)C = AC + BC Right Distributivity of Matrix Multiplication

t(AB) = (tA)B = A(tB) Associativity of Scalars

A(BC) = (AB)C Associativity of Matrix Multiplication

(AB)T = B T AT Distributivity of Transpose

Definition: The Identity Matrix

The n × n matrix I = diag(1, 1, ..., 1) is called the identity matrix. We sometimes denote this matrix by In to emphasize its size.

Theorem: The Identity Matrix is the Multiplicative Identity

If A is any m × n matrix then Im A = A = AIn .

Example 7: Compute I2². Also compute

[ 4  −1 ]²
[ 12 −3 ]

just to show that matrices follow their own set of rules. Matrices A that satisfy A² = A are called idempotent matrices.

41
3.2 Matrix Mappings and Linear Mappings
3.2.1 Matrix Mappings
Definition: Matrix Mappings

For any m × n matrix A we define fA : Rn → Rm given by ⃗x ↦ A⃗x and call it the matrix mapping associated to A.
 
Example 1: Let

A = [ 2   2 ]
    [ 1   0 ]
    [ 0  −3 ]
    [ −2  5 ]

State the domain and codomain of fA . Then, compute fA (1, 2).

Theorem: Representation of Matrix Mappings

Let ⃗e1 , ⃗e2 , ..., ⃗en be the standard basis vectors of Rn , let A be an m × n matrix, and let fA : Rn → Rm be the corresponding matrix mapping. Then, for any vector ⃗x = [ x1 · · · xn ]T we have

fA (⃗x) = x1 fA (e⃗1 ) + x2 fA (e⃗2 ) + · · · + xn fA (e⃗n )

Theorem: Linearity of Matrix Mappings

Let A be an m × n matrix with corresponding matrix mapping fA : Rn → Rm . Then, for any ⃗x, ⃗y ∈ Rn and
any t ∈ R, we have

Property Name

fA (⃗x + ⃗y ) = fA (⃗x) + fA (⃗y ) Additive Linearity

fA (t⃗x) = t fA (⃗x) Scalar Linearity

42
3.2.2 Linear Mappings
Definition: Linear Mappings

A function L : Rn → Rm is called a linear mapping (or linear transformation) if for every ⃗x, ⃗y ∈ Rn and
t ∈ R it satisfies the following properties:

Property Name

L(⃗x + ⃗y ) = L(⃗x) + L(⃗y ) Additive Linearity

L(t ⃗x) = t L(⃗x) Scalar Linearity

A linear operator is a linear mapping whose domain and codomain are the same. In particular, the operator defined by Id : Rn → Rn given by ⃗x ↦ ⃗x, i.e. Id(⃗x) = ⃗x, is called the identity mapping.

Example 2: Prove that the mapping f : R2 → R2 defined by f (x1 , x2 ) = (3x1 − x2 , 2x1 ) is a linear operator
by directly showing it satisfies the linearity conditions.

Theorem: Linear Mappings as Matrix Mappings

If L : Rn → Rm is a linear mapping, then L can be represented as a matrix mapping, with the corresponding m × n matrix [L] given by [L] = [ L(e⃗1 ) L(e⃗2 ) · · · L(e⃗n ) ].
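A minimal NumPy sketch of building [L] column by column from the images of the standard basis vectors, using the operator f of Example 2 (NumPy is assumed):

    import numpy as np

    def f(x):
        # the linear operator f(x1, x2) = (3*x1 - x2, 2*x1)
        return np.array([3 * x[0] - x[1], 2 * x[0]])

    e = np.eye(2)                                    # columns are e1 and e2
    F = np.column_stack([f(e[:, 0]), f(e[:, 1])])    # [f] = [ f(e1) f(e2) ]
    x = np.array([1.0, 2.0])
    print(F, F @ x, f(x))                            # the matrix mapping agrees with f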

Example 3: Find [f ] of the previous example and represent the function as a matrix mapping.

43
3.2.3 Compositions and Linear Combination of Mappings
Theorem: Closure of Linear Mappings

If L : Rn → Rm and M : Rn → Rm are linear mappings and t ∈ R is a scalar then

Property Name

(L + M ) : Rn → Rm is linear Closure Under Function Addition

t L : Rn → Rm is linear Closure Under Scalar Multiplication

Theorem
Let L : Rn → Rm , M : Rn → Rm , and N : Rm → Rp be linear mappings and t ∈ R. Then

1. [L + M ] = [L] + [M ]

2. [tL] = t [L]

3. [N ◦ L] = [N ][L]

Example: Let L and M be linear operators on R2 defined by L(x1 , x2 ) = (2x1 + x2 , x1 ) and M (x1 , x2 ) =
(x2 , x1 ). Compute M ◦ L, [M ◦ L], and [M ][L].

44
3.3 Geometrical Transformations
Note: Matrix Mappings as Geometric Transforms

As with projection, we have learned that some linear mappings have geometric significance. We will be
building a toolkit of various representative matrices [L] of operators that have such geometric significance.

3.3.1 Common Transformation Matrices in R2


Theorem: Common Matrices in R2 and Their Geometries
> Rotations About The Origin
The Rotation Matrix of a vector about the origin counter-clockwise by angle θ is

[Rθ ] = [ cos(θ) −sin(θ) ]
        [ sin(θ)  cos(θ) ]

> Stretches and Shrinks (i.e. Scaling Along an Axis)
Let t > 0 be fixed. The Stretching and Shrinking Matrices along the x1 -axis and x2 -axis are given by

[Qx1 ] = [ t 0 ]  ;  [Qx2 ] = [ 1 0 ]
         [ 0 1 ]              [ 0 t ]

respectively. They stretch if t > 1 and shrink if 0 < t < 1.

> Contractions and Dilations (i.e. Scaling)
Let t > 0 be fixed. The Contracting and Dilating Matrices are given by

[T ] = [ t 0 ]
       [ 0 t ]

They dilate if t > 1 and contract if 0 < t < 1.

> Shears
Let t > 0 be fixed. The Horizontal and Vertical Shear Matrices are given by

[Sx1 ] = [ 1 t ]  ;  [Sx2 ] = [ 1 0 ]
         [ 0 1 ]              [ t 1 ]

respectively.

> Reflections
The Reflection Matrices about the x1 - and x2 -axis are given by

[Rx1 ] = [ 1  0 ]  ;  [Rx2 ] = [ −1 0 ]
         [ 0 −1 ]              [  0 1 ]

respectively.

45
 
Example 1: Let ⃗u = [ 1 1 ]T. Transform the vector by shearing it along the x1 -axis by a factor of 3, scale it by a factor of 2, then rotate it counter-clockwise about the origin by an angle of π/2. Loosely graph the resulting vector at each step.
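A minimal NumPy sketch of Example 1's sequence of transformations (NumPy is assumed):

    import numpy as np

    u = np.array([1.0, 1.0])
    S = np.array([[1.0, 3.0],
                  [0.0, 1.0]])                       # horizontal shear by t = 3
    T = np.array([[2.0, 0.0],
                  [0.0, 2.0]])                       # dilation by t = 2
    theta = np.pi / 2
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta), np.cos(theta)]])   # rotation by pi/2
    print(R @ (T @ (S @ u)))                         # shear, then scale, then rotate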

3.3.2 Common Transformation Matrices in R3


Theorem: Common Matrices in R3 and Their Geometries
> Rotation Matrices About an Axis
The Rotation Matrix [Rθ,xi ] that rotates a vector about the xi -axis an angle θ counter-clockwise under the right-hand-rule is given by

[Rθ,x1 ] = [ 1 0       0       ]   [Rθ,x2 ] = [ cos(θ)  0 sin(θ) ]   [Rθ,x3 ] = [ cos(θ) −sin(θ) 0 ]
           [ 0 cos(θ) −sin(θ)  ]              [ 0       1 0      ]              [ sin(θ)  cos(θ) 0 ]
           [ 0 sin(θ)  cos(θ)  ]              [ −sin(θ) 0 cos(θ) ]              [ 0       0      1 ]

> Reflection Matrices About a Coordinate Plane
The Reflection Matrix [Rxi ,xj ] that reflects a vector about the xi xj -coordinate plane is given by

[Rx1 ,x2 ] = [ 1 0  0 ]   [Rx1 ,x3 ] = [ 1  0 0 ]   [Rx2 ,x3 ] = [ −1 0 0 ]
             [ 0 1  0 ]                [ 0 −1 0 ]                [ 0  1 0 ]
             [ 0 0 −1 ]                [ 0  0 1 ]                [ 0  0 1 ]

46
3.3.3 General Reflections
Theorem: General Reflections in Rn

Let refl⃗n : Rn → Rn be a mapping that reflects a point in Rn


about the curve/surface/hypersurface given by a linear equation
through the origin (line in R2 , plane in R3 , hyperplane in Rm
with m ≥ 4) with normal vector ⃗n. This function is a linear
operator and is given by

refl⃗n (⃗a) = ⃗a − 2 proj⃗n (⃗a)

Example 2: Find the matrix representation of the mapping that reflects any point in R3 about the plane through the origin with the normal vector ⃗n = [ 1 3 −2 ]T.
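A minimal NumPy sketch of Example 2: apply the reflection formula to the standard basis vectors and use the images as the columns of the representing matrix (NumPy is assumed):

    import numpy as np

    n = np.array([1.0, 3.0, -2.0])

    def refl(a):
        # refl_n(a) = a - 2 * proj_n(a)
        return a - 2 * (np.dot(a, n) / np.dot(n, n)) * n

    M = np.column_stack([refl(e) for e in np.eye(3)])   # columns are refl(e1), refl(e2), refl(e3)
    print(M)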

47
3.4 Special Subspaces for Systems and Mappings
3.4.1 The Four Fundamental Subspaces
Definition: The Four Fundamental Subspaces

Let L : Rn → Rm be a linear mapping with [L] = A. Then...

• Nullspace: The collection of all ⃗x ∈ Rn such that A⃗x = ⃗0 is called the Nullspace of A, denoted Null(A). It is also sometimes called the Kernel of L and denoted Ker(L).

• Left-Nullspace: The collection of all x⃗∗ ∈ Rm such that AT x⃗∗ = ⃗0 is called the Left Nullspace of A, and we simply denote it Null(AT ) (i.e. it is the nullspace of AT ). It may also be thought of as the kernel of the mapping associated to AT .

• Columnspace: The collection of all ⃗y ∈ Rm such that A⃗x = ⃗y for some ⃗x ∈ Rn is called the Columnspace of A, denoted Col(A) (i.e. it is the span of the columns of A). It is also sometimes called the Range of L and denoted Range(L).

• Rowspace: The collection of all y⃗∗ ∈ Rn such that AT x⃗∗ = y⃗∗ for some x⃗∗ ∈ Rm is called the Rowspace of A, denoted Row(A) (i.e. it is the span of the rows of A). It may also be thought of as the range of the linear mapping associated to AT .

Theorem: The Four Fundamental Subspaces are Subspaces

Let A be a matrix. The sets Null(A), Col(A), Row(A) and Null(AT ) are subspaces.
 
Example 1: Consider the matrix

A = [ 1 1 1 1 ]
    [ 2 3 4 4 ]

Describe conditions that a vector must satisfy (i.e. as
a system of equations that the components must satisfy) for it to be in (a) the Nullspace, Null(A); and (b) for a
vector to be in the Left-Nullspace, Null(AT ).

48
 
Example 2: Consider the matrix

A = [ 1 1 ]
    [ 2 1 ]
    [ 1 3 ]

Describe the (a) Columnspace of A (i.e. Col(A)) as a
spanning set of vectors; and (b) the Rowspace of A (i.e. Row(A)) as a spanning set of vectors. Lastly, is the
spanning set described in (b) a basis for Row(A)?

   
Example 3: Suppose that L is a linear mapping with matrix

A = [ 1 1 ]
    [ 2 1 ]
    [ 1 3 ]

Determine whether ⃗c = [ 1 3 −1 ]T and d⃗ = [ 2 1 9 ]T are in the range of L. Hint: Form [ A | ⃗c | d⃗ ] to kill two birds with one stone.

Note: Vectors in the Range and Consistency

As one may see, a vector ⃗b being in the range of L, or columnspace of A, is the same as the system A⃗x = ⃗b
being consistent.

49
3.4.2 Bases for Row(A), Col(A), and Null(A)
Note: Bases for Row(A), Col(A), and Null(A)

The bases for three of the four fundamental subspaces may be obtained simply by reducing a matrix A to
RREF and interpreting the results. We summarize this in a theorem below.

Theorem: Bases for Row(A), Col(A) and Null(A)

Let A be a matrix.

• Basis of the Nullspace: The spanning set for the general solution of the homogeneous system A⃗x = ⃗0 obtained by the method of Gauss-Jordan elimination (to RREF) is a basis for Null(A).

• Basis of the Columnspace: Let B be the RREF of A. Then, the columns of A that correspond to the columns of B with leading 1’s form a basis for the columnspace of A.

• Basis of the Rowspace: Let B be the RREF of A, then the non-zero rows of B form a basis of Row(A).

Note: Justification for the Above Theorem


The theorem above relies on a few easily observable (but tedious) facts. For the columnspace, this relies
on the observation that by row reductions, the systems A⃗x = 0 and B⃗x = ⃗0 have the same solution space.
Hence, any statement about the linear dependence of the columns of A is true if and only if the same
statement is true for the corresponding columns of B. Similarly, since row operations merely replace a row
with a linear combination of it with another row, then the span of the rows of A are the same as B.

Theorem: Rank and Equality of Dimension for Columnspace and Rowspace

For any matrix A, Rank(A) = dim(Row(A)) = dim(Col(A)). This follows from the prior theorem.
 
Example 4: Find a basis for the rowspace, columnspace and nullspace of

A = [ 1  2 3 ]
    [ −1 3 2 ]
    [ 0  1 1 ]
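A minimal SymPy sketch that produces bases for the three subspaces of Example 4 (SymPy is assumed; its columnspace, rowspace, and nullspace methods follow the same RREF-based reasoning as the theorem above):

    from sympy import Matrix

    A = Matrix([[1, 2, 3],
                [-1, 3, 2],
                [0, 1, 1]])
    print(A.rref())          # RREF and pivot columns; pivot columns of A give a basis of Col(A)
    print(A.columnspace())   # a basis for Col(A)
    print(A.rowspace())      # a basis for Row(A)
    print(A.nullspace())     # a basis for Null(A)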

50
(Continued...)

3.4.3 The Rank-Nullity Theorem


Definition: Nullity of a Matrix

The dimension of the nullspace of A is called the nullity of A, denoted nullity(A).

The Rank-Nullity Theorem

Let A be an m × n matrix, then rank(A)+nullity(A) = n.

Example 5: Use the results of the prior exercise to verify the Rank-Nullity Theorem.

51
3.5 Inverse Matrices and Inverse Mappings
3.5.1 Inverse Matrices
Definition: Inverse of a Matrix
Let A be an n × n matrix (square). If there exists an n × n matrix B such that AB = I = BA, then A is
said to be invertible, and B is called the inverse of A (and vice versa A is the inverse of B). The inverse
of A is denoted A−1 .

Note: Notation for the Inverse

Since matrix multiplication does not commute, it is VERY WRONG to write A−1 as 1/A.

Theorem: AB = I Implies BA = I for Square Matrices

Suppose that A and B are square matrices such that AB = I, then BA = I.


   
Example 1: Show that the matrices

[ 2 5 ]     and     [ 8  −5 ]
[ 3 8 ]             [ −3  2 ]

are inverses of each other.

Theorem: Uniqueness of Inverse

The inverse of a matrix (provided it exists) is unique.

Theorem: Properties of Inverse Matrices

Suppose that A and B are invertible matrices and that t ̸= 0 is a real number. Then...

Property Name

(tA)−1 = (1/t) A−1 Inverse of Scalar Multiple

(AB)−1 = B −1 A−1 Inverse of Product

(AT )−1 = (A−1 )T Inverse of Transpose

52
3.5.2 A Procedure for Finding the Inverse of a Matrix and Solving Systems of Equations
Theorem: Algorithm For Finding An Inverse

To find the inverse of a square matrix A,

• Row reduce the multi-augmented matrix [ A | I ] so that the left block is in reduced row echelon form.

• If the reduced row echelon form is [ I | B ] then A−1 = B.

• If the reduced row echelon form of A is not I, then A is not invertible.

Note: Justification of the Algorithm

The reason why this works is that you are solving the equation AB = I for B. You do this column by
column. Letting ⃗bi represent the i’th column of B you are solving the equation A⃗bi = ⃗ei for each i. This
translates to solving [A|⃗e1 |⃗e2 | · · · |⃗en ] ⇔ [A|I].
 
Example 2: Provided it exists, determine the inverse of

A = [ 1 1 ]
    [ 4 3 ]
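A minimal SymPy sketch of this algorithm applied to Example 2: row reduce [ A | I ] and read off the right block (SymPy is assumed):

    from sympy import Matrix, eye

    A = Matrix([[1, 1],
                [4, 3]])
    M = A.row_join(eye(2))     # the multi-augmented matrix [ A | I ]
    R, _ = M.rref()
    A_inv = R[:, 2:]           # if the left block reduced to I, the right block is the inverse
    print(A_inv)
    print(A * A_inv)           # should print the identity matrix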

3.5.3 Solving Systems of Equations A⃗x = ⃗b Using the Inverse


Note: Using Inverse Matrices to Solve Systems of Equations

Using matrix algebra, we may solve a system of equations A⃗x = ⃗b by multiplying both sides on the left by
the inverse A−1 .
Example 3: Solve the system of equations
    x1 + x2 = 3
    4x1 + 3x2 = −2
using the results of the previous example.

53
3.5.4 Inverse Linear Mappings and the Inverse Matrix Theorem
Definition: Inverse Linear Mappings

If L : Rn → Rn is a linear operator and there exists another mapping M : Rn → Rn such that M ◦ L = Id = L ◦ M , then L is said to be invertible, and M is called the inverse of L, usually denoted L−1 .

Theorem: Matrix Representation of Inverse Mappings

Let L : Rn → Rn be an invertible linear operator. Then, [L−1 ] = [L]−1 .

Theorem: The Inverse Matrix Theorem


We refer the reader to the appendix to read the Inverse Matrix Theorem.

Example 4: Prove that the following linear operator


    L : R3 → R3 ,   (x1 , x2 , x3 ) ↦ (2x1 + x2 , x3 , x2 − 2x3 )

is invertible. Hint: Use the Inverse Matrix Theorem.

54
3.6 Elementary Matrices
3.6.1 Elementary Matrices and Row Operations
Definition: Elementary Matrices

A matrix that can be obtained from the identity matrix by a single elementary row operation is called an
elementary matrix.

Theorem: Left Multiplication by Elementary Matrices is Equivalent to Row Operations

If A is an n × n matrix and E is the elementary matrix obtained from In by a certain elementary row opera-
tion, then the product EA is the matrix obtained from A by performing the same elementary row operation.

As such, there is a sequence of elementary matrices E1 , E2 , ..., Ek such that the product Ek · · · E2 E1 A yields
the RREF of A.
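A small numerical sketch of this theorem (our own illustration, using the data of Example 1 below): applying a row operation to I gives E, and left-multiplying A by E performs that same row operation on A.

```python
import numpy as np

# The elementary matrix for R1 + (-2)R2 -> R1 is obtained by applying that
# operation to I; left-multiplication by it performs the operation on A.
I = np.eye(2)
E = I.copy()
E[0] = E[0] + (-2) * E[1]          # apply R1 + (-2)R2 -> R1 to I

A = np.array([[5.0, -1.0],
              [1.0,  3.0]])        # the matrix from Example 1 below
row_op_on_A = A.copy()
row_op_on_A[0] = row_op_on_A[0] + (-2) * row_op_on_A[1]

print(np.allclose(E @ A, row_op_on_A))   # True: EA equals the row-operated A
```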

Example 1: Let E be the elementary matrix obtained by performing the row operation R1 + (−2)R2 → R1 on I. Let A = \begin{bmatrix} 5 & -1 \\ 1 & 3 \end{bmatrix} and demonstrate the prior theorem by computing EA.

 
Example 2: Let A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 4 \end{bmatrix}. Find a sequence of elementary matrices E1 , ..., Ek such that Ek · · · E1 A is
the RREF of A.

55
3.6.2 Representing A and A−1 as a Product of Elementary Matrices
Note: Inverse of Elementary Matrices

Constructing the inverse of an elementary matrix is easy. If you have an elementary matrix E obtained
from I through some row operation R, you may obtain the elementary matrix E −1 by applying to I the
reverse row operation R−1 (the operation that brings E back to I). Visually, if E is obtained from I by R, then E −1 is obtained from I by R−1 .

Theorem: A and A−1 as a Product of Elementary Matrices

Let A be an invertible matrix, then by the inverse matrix theorem there exists a sequence of elementary
matrices where Ek · · · E1 A = I. Consequently A−1 = Ek · · · E1 and A = (Ek · · · E1 )−1 = E1−1 · · · Ek−1 .

 
Example 3: Let A = \begin{bmatrix} 0 & 3 & 0 \\ 1 & 1 & 0 \\ 0 & -2 & 1 \end{bmatrix}. Write A and A−1 as a product of elementary matrices.

56
3.7 LU -Decomposition
3.7.1 Constructing the LU -Decomposition
Definition: LU -Decomposition

Writing an n × n matrix A as a product A = LU , where L is lower triangular and U is upper triangular, is


called an LU -decomposition of A.

Theorem: Existence of LU -Decomposition

If A is an n × n matrix that can be row reduced to REF WITHOUT SWAPPING ROWS, then there exists
an upper triangular matrix U and lower triangular matrix L such that A = LU .

Theorem: Procedure For LU Decomposition

Suppose that A is a matrix that can be reduced to REF without swapping rows. The following procedure
constructs the LU decomposition:

1. Reduce A to REF through a sequence of row operations stored as elementary matrices Ek · · · E1 A = U .


The resulting matrix U is the upper triangular matrix in the decomposition.

2. Construct L = E1−1 · · · Ek−1 .

3. The resulting expression A = LU is the desired LU decomposition.

Note: Justification of the LU -Decomposition Procedure

The above procedure works because Ek · · · E1 A = U . You must then have that A = (E1−1 · · · Ek−1 )U and
thus we obtain the corresponding desired L.
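A minimal sketch of the procedure in code (our own illustration, assuming no row swaps are ever needed; the function name is hypothetical). The multipliers of the row operations are exactly the entries that fill in L.

```python
import numpy as np

def lu_no_pivoting(A):
    """Doolittle-style LU factorization, assuming A can be reduced to REF
    without row swaps (zero pivots are not handled)."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    U = A.copy()
    L = np.eye(n)
    for j in range(n):
        for i in range(j + 1, n):
            m = U[i, j] / U[j, j]      # multiplier of the row operation Ri - m*Rj
            L[i, j] = m                # inverses of the elementary matrices fill L
            U[i] = U[i] - m * U[j]
    return L, U

B = np.array([[2.0, 1.0, -1.0],
              [-4.0, 3.0, 3.0],
              [6.0, 8.0, -3.0]])       # the matrix from Example 1 below
L, U = lu_no_pivoting(B)
print(np.allclose(L @ U, B))           # True
```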

Example 1: Find an LU -decomposition of B = \begin{bmatrix} 2 & 1 & -1 \\ -4 & 3 & 3 \\ 6 & 8 & -3 \end{bmatrix}.

57
3.7.2 Using LU -Decomposition to Solve Systems

Theorem: Procedure for Solving A⃗x = ⃗b Using LU -Decomposition

Suppose that A has an LU decomposition given by A = LU . Then the following procedure solves the
system A⃗x = LU⃗x = ⃗b for ⃗x.

1. Substitute ⃗y = U⃗x and solve the system L⃗y = ⃗b.

2. Use your result of the previous step to solve U⃗x = ⃗y .

3. The resulting ⃗x is the solution to A⃗x = ⃗b.
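A minimal sketch of the two-substitution solve (our own illustration; the function name is hypothetical). The forward loop implements step 1 and the backward loop implements step 2.

```python
import numpy as np

def solve_with_lu(L, U, b):
    """Solve LUx = b by forward substitution (Ly = b) then back substitution (Ux = y)."""
    n = len(b)
    y = np.zeros(n)
    for i in range(n):                         # step 1: forward substitution
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    x = np.zeros(n)
    for i in reversed(range(n)):               # step 2: back substitution
        x[i] = (y[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
    return x                                   # step 3: x solves Ax = b

# Example usage with the data of Example 2 below, assuming L and U come from
# an earlier LU factorization of the coefficient matrix:
# b = np.array([3.0, -13.0, 4.0])
# x = solve_with_lu(L, U, b)
```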


    
Example 2: Use LU -decomposition to solve the system \begin{bmatrix} 2 & 1 & -1 \\ -4 & 3 & 3 \\ 6 & 8 & -3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 3 \\ -13 \\ 4 \end{bmatrix}. Note: we have
already computed the LU -decomposition in an earlier example. Use this result to start the algorithm.

Note: Solving Systems with Row Swaps Required

Since row swaps are not allowed when performing LU -decomposition, it is recommended that you
interchange the positions of any equations in the system (if required) before forming the matrix.

58
Chapter 4

Vector Spaces

59
4.1 Spaces of Polynomials
4.1.1 Polynomial Vector Space
Definition: Polynomial Space and the Standard Basis of Polynomial Space

The collection of polynomials of degree at most n is denoted Pn . The collection of polynomials given by
{1, x, x2 , ..., xn } is called the monomial basis for Pn .

Definition: Algebra of Polynomials

Let p, q ∈ Pn . If p(x) = an xn + · · · + a1 x + a0 and q(x) = bn xn + · · · + b1 x + b0 and t is a scalar then


polynomial addition is defined as

p(x) + q(x) = (an + bn )xn + · · · + (a1 + b1 )x + (a0 + b0 )


and scalar multiplication is defined as

tp(x) = (tan )xn + · · · + (ta1 )x + (ta0 )

Theorem: Properties of Polynomials

Let p(x), q(x) and r(x) be polynomials in Pn and let s, t ∈ R. Then...

Property Name

p(x) + q(x) is a polynomial of degree at most n Closure under addition

p(x) + q(x) = q(x) + p(x) Commutativity of addition

(p(x) + q(x)) + r(x) = p(x) + (q(x) + r(x)) Associativity of addition

There is a poly. o(x) such that o(x) + s(x) = s(x) for all poly. s(x) Existence of zero

For each s(x) there exists a −s(x) such that s(x) + (−s(x)) = o(x) Existence of inverse

tp(x) is a polynomial of degree at most n Closure under mult.

s(tp(x)) = (st)p(x) Associativity of mult.

(s + t)p(x) = sp(x) + tp(x) Distributivity of scalar add.

t(p(x) + q(x)) = tp(x) + tq(x) Distributivity of poly add.

1 s(x) = s(x) for all s(x) 1 is the scalar identity

60
4.1.2 Linear Combinations, Spans, and Linear Dependence/Independence
Definition: Linear Combination of Polynomials

A linear combination of polynomials p1 (x),...,pn (x) is given by a sum of the form

t1 p1 (x) + · · · + tn pn (x)

for scalars t1 , ..., tn ∈ R.

Definition: Span of Polynomials

Let B = {p1 (x), ..., pk (x)} be a set of polynomials of degree at most n. Then the span of B is defined as

Span(B) = {t1 p1 (x) + · · · + tk pk (x) | t1 , ..., tk ∈ R}

Example 1: Determine if p(x) = 1 + 2x + 3x2 is in the span of B = {1 + x, 1 + x2 , 1 + x + x2 }. If so, write


p(x) as a linear combination of the polynomials in B.

61
Definition: Linear Dependence and Independence of Polynomials

The set B = {p1 (x), ..., pk (x)} is said to be linearly independent if the only solution to the equation

t1 p1 (x) + · · · + tk pk (x) = 0
is t1 = · · · = tk = 0; otherwise, there is a solution in which not all ti are zero, and B is said to be linearly
dependent.

Example 2: Determine if the set B = {2 − x2 , 3x, −2 + x + x2 } is linearly dependent or independent.

62
4.2 Vector Spaces
4.2.1 Vector Spaces
Definition: Vector Spaces

We refer the reader to the appendix for the requirements of an algebraic space to be called a Vector Space.

Theorem: Examples of Vector Spaces

The following are vector spaces:

1. The space Rn equipped with vector addition and scalar multiplication.

2. The space M (m, n) of all size m × n matrices equipped with matrix addition and scalar multiplication.

3. The space Pn equipped with polynomial addition and scalar multiplication.

4. The space of functions F(a, b) = {f |f : (a, b) ⊆ R → R} equipped with function addition and scalar
multiplication.

Example 1: Consider the space R2 with addition defined to be standard vector addition ⊕ = +, and scalar
multiplication defined by k ⊙ (x, y) = (ky, kx). By finding a counterexample, show that this is not a vector space.

Example 2: Consider the space R with addition defined to be x ⊕ y = x2 + y + 1 and scalar multiplication to
be defined by standard multiplication ⊙ = ·. By finding a counterexample, show that this is not a vector space.

63
Example 3: Consider R+ = (0, ∞) the space of positive real numbers. Define addition on this space to be
x ⊕ y = xy and scalar multiplication to be s ⊙ x = xs . Prove that this is a vector space.

V1: Closure under addition

V2: Commutativity of addition

V3: Associativity of addition

V4: Existence of zero

V5: Additive inverse

64
V6: Closure under scalar multiplication

V7: Associativity of scalar multiplication

V8: Distributivity of scalar addition

V9: Distributivity of addition

V10: 1 is the scalar identity

Definition: Exponential Space

We will call the space R+ equipped with x ⊕ y = xy and s ⊙ x = xs the Exponential Space and denote it E.

65
Theorem: Inverse and Zero Properties of Vector Spaces

Let V be a vector space. Then...

1. 0 ⊙ x = 0 for all x ∈ V

2. (−1) ⊙ x = −x for all x ∈ V

3. t ⊙ 0 = 0 for all t ∈ R

Example 4: Demonstrate the prior theorem holds for E, the exponential space.

Example 5: Let V = {(a, b) | a ∈ R, b ∈ R+ }. Define addition in this space to be (a, b) ⊕ (c, d) = (ad + bc, bd)
and scalar multiplication to be t ⊙ (a, b) = (t a b^(t−1) , b^t ). Given this is a vector space, determine the zero vector and
the additive inverse of any vector using the prior theorem.

66
4.2.2 Subspaces
Definition: Subspaces

We refer the reader to the appendix for the definition of a subspace.

Theorem: Subspaces are (“Smaller”) Vector Spaces

A subspace of a vector space is a vector space itself. Consequently, since U ⊆ V, you may think of a
subspace as a smaller vector space sitting inside the larger vector space.

Example 6: Let U = {p(x) ∈ P3 | p(3) = 0}. Show that U is a subspace of P3 .

Example 7: Prove that the set U = {A ∈ M (2, 2) | a11 + a22 = 0} is a subspace of M (2, 2).

67
4.3 Bases and Dimensions
4.3.1 Linear Combinations, Spans and Bases
Theorem: Spanning Sets as Subspaces

If {v1 , ..., vk } is a set of vectors in a vector space V and S is the set of all possible linear combinations of
these vectors,

S = Span({v1 , ..., vk }) = {(t1 ⊙ v1 ) ⊕ · · · ⊕ (tk ⊙ vk ) | t1 , ..., tk ∈ R}


then S is a subspace of V.

Definition: Spanning Sets of a Vector Space

If S is a subspace of the vector space V consisting of all linear combinations of vectors v1 , ..., vk ∈ V, then S
is called the subspace spanned by B = {v1 , ..., vk }, and we say that the set B spans S. The set B is called
a spanning set for the subspace S. We denote S = Span({v1 , ..., vk }) = Span(B).

Definition: Linear Dependence and Independence

If B = {v1 , ..., vk } is a set of vectors in a vector space V, then B is said to be linearly independent if the
only solution to the equation
(t1 ⊙ v1 ) ⊕ · · · ⊕ (tk ⊙ vk ) = 0
is t1 = · · · = tk = 0; otherwise, there is a non-trivial solution and B is said to be linearly dependent.

Theorem: Unique Representation Theorem

Let B = {v1 , ..., vn } be a spanning set for a vector space V. Then every vector in V can be expressed in a
unique way as a linear combination of the vectors of B if and only if the set B is linearly independent.

Definition: Basis of a Vector Space

A set B of vectors in a vector space V is a basis if it is a linearly independent spanning set for V.
     
Example 1: Prove that the set B = { \begin{bmatrix} 1 & 2 \\ -1 & 1 \end{bmatrix} , \begin{bmatrix} 0 & 1 \\ 3 & 1 \end{bmatrix} , \begin{bmatrix} 2 & 5 \\ 1 & 3 \end{bmatrix} } is not a basis for the subspace Span(B).
Note: As a reminder, if we don’t specify the operations or vector space, you can assume they are the standard
operations from the standard space. Also, hint... 2A + B − C = O.

68
4.3.2 Determining a Basis of a Subspace
Note: Finding a Basis

Finding a basis of a subspace can be quite difficult. The technique is to first (somehow) determine a spanning
set, then reduce it to, or prove that it is, a linearly independent set. Finding the spanning set is the creative
step, while turning a spanning set into a basis is quite procedural. One technique is to associate the span
as the range of some matrix, then discern a basis using our results on the basis of the columnspace of that
matrix.

Example 2: Determine a basis for the subspace S = {p(x) ∈ P2 | p(1) = 0} of P2 . Hint: Every element in this
space can be written as p(x) = (x − 1)(ax + b) for some constants a and b.

Example 3: Determine a basis for Span(T ) where T = {1 + x − 2x2 , 2 − x + x2 , 1 − 2x + 3x2 , 1 + 5x + 3x2 }.

Hint: To speed things up, you are given the fact that \begin{bmatrix} 1 & 2 & 1 & 1 \\ 1 & -1 & -2 & 5 \\ -2 & 1 & 3 & 3 \end{bmatrix} has RREF \begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.

69
4.3.3 Dimension
Definition: Dimension of a Vector Space
If B = {v1 , ..., vn } and C = {u1 , ..., uk } are both bases of a vector space V, then k = n. If a vector space
V has a basis with n vectors, then we say that the dimension of V is n and write dim(V) = n. If a vector
space V does not have a basis with finitely many elements, then V is called infinite-dimensional. The
dimension of the trivial vector space V = {0} is defined to be 0.

Example 4: Determine the dimension of the vector space Span(T ) of the previous example.

4.3.4 Extending a Linearly Independent Subset to a Basis


Theorem: Dimension and Linear Independency

Let V be an n-dimensional vector space. Then

1. A set of more than n vectors in V must be linearly dependent.

2. A set of fewer than n vectors cannot span V.

3. A set with n elements of V is a spanning set for V if and only if it is linearly independent.
     
Example 5: Let C = { \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} , \begin{bmatrix} -2 & -1 \\ 1 & 1 \end{bmatrix} , \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} }. Extend C to a basis for M (2, 2). Hint: We’ve
worked with M (2, 2) and know that dim(M (2, 2)) = 4. Thus, you need only find an additional element not in the
span of these three matrices.

70
4.4 Coordinates with Respect to a Basis
4.4.1 Bases
Definition: Coordinate Vectors
Suppose that B = {v1 , ..., vn } is a basis for the vector space V. If x ∈ V with

x = (x1 ⊙ v1 ) ⊕ (x2 ⊙ v2 ) ⊕ · · · ⊕ (xn ⊙ vn )


then the coordinate vector of x with respect to the basis B is

    [x]B = [ x1 , x2 , ..., xn ]T

Example 1: Consider the bases of R2 (you don’t need to check this)


       
    B = {⃗v1 , ⃗v2 } = { [1, −1]T , [1, 1]T }   and   E = {⃗e1 , ⃗e2 } = { [1, 0]T , [0, 1]T }

find [⃗a]B and [⃗a]E where ⃗a = [3, −2]T .

Example 2: The collection B = {1, x, 1 + x2 } is a basis of P2 . Find the B-coordinates of p(x) = 2 + x + 3x2 .

71
4.4.2 Change-of-Basis
Theorem: Linearity of Basis Representations

Let B be a basis for a finite dimensional vector space V. Then, for any x, y ∈ V and s, t ∈ R we have

[(s ⊙ x) ⊕ (t ⊙ y)]B = s[x]B + t[y]B

Note: Matrix Representation of Coordinate Representations

This means that the function gB : V → Rn given by gB (v) = [v]B is a linear function. In the event that
V = Rn this means that we should be able to find a representing matrix [gB ].

Note: Development of the Change-of-Basis Matrix

Consider a general vector space V with two bases B and C = {w1 , ..., wn }.

Let x ∈ V and write x as a linear combination of the vectors in C,

x = (x1 ⊙ w1 ) ⊕ · · · ⊕ (xn ⊙ wn )
That is, [x]C = [ x1 x2 · · · xn ]T . Taking B-coordinates gives

[x]B = [(x1 ⊙ w1 ) ⊕ · · · ⊕ (xn ⊙ wn )]B
     = x1 [w1 ]B + · · · + xn [wn ]B
     = [ [w1 ]B · · · [wn ]B ] [ x1 · · · xn ]T
     = [ [w1 ]B · · · [wn ]B ] [x]C

Theorem: Change-of-Basis Matrix

Let B and C = {w1 , ..., wn } both be bases for a vector space V. The matrix
 
P = [w1 ]B · · · [wn ]B
is called the change of coordinates matrix from C-coordinates to B-coordinates and satisfies

[x]B = P [x]C
and is called the change of coordinates equation. Often an emphatic notation PBC is used.

Theorem: Invertibility Reverses the Change-of-Basis, i.e. (PBC )−1 = PCB

Let B and C both be bases for a finite-dimensional vector space V. Let P be the change of coordinates
matrix from C-coordinates to B-coordinates. Then, P is invertible and P −1 is the change of coordinates
matrix from B-coordinates to C-coordinates.

72
Note: Efficiently Obtaining PBC

Let B = {v1 , v2 , ..., vn }. One will note from prior examples that when forming [x]B you need to solve a
system of linear equations. Hence, when forming PBC = [ [w1 ]B · · · [wn ]B ] you need to solve n systems
of linear equations, i.e. t1 ⊙ v1 ⊕ · · · ⊕ tn ⊙ vn = wk for every 1 ≤ k ≤ n. However, the left-hand
side (i.e. the coefficient matrix) is always the same. Hence, you may efficiently solve for PBC by solving the
multi-augmented system

    [ v1 · · · vn | w1 · · · wn ] ∼ [ I | PBC ]   (RREF)
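When V = Rn the multi-augmented system above can be solved in one call. A minimal NumPy sketch (our own illustration, using the bases of Example 1 below and assuming the basis vectors are stored as matrix columns):

```python
import numpy as np

# Solving [v1 ... vn | w1 ... wn] amounts to solving VB X = VC column by column.
VB = np.array([[1.0, 1.0],
               [-1.0, 1.0]])            # columns are the B basis vectors
VC = np.eye(2)                          # columns are the C basis vectors (here the standard basis)
P_E_to_B = np.linalg.solve(VB, VC)      # column k is [w_k]_B, i.e. the solution of VB t = w_k
print(P_E_to_B)                         # change of coordinates matrix from C (= E) to B
```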

       
Example 1: Earlier we considered the bases B = { [1, −1]T , [1, 1]T } and E = { [1, 0]T , [0, 1]T } for R2 . The
change of basis matrix from E to B is PBE = [ [⃗e1 ]B [⃗e2 ]B ], i.e. its columns are the B-coordinates of [1, 0]T and [0, 1]T . Demonstrate the prior note by setting up the
systems of equations for these columns and solving them. Demonstrate the change-of-basis is consistent with
Example 1 by computing PBE ⃗a where ⃗a = [3, −2]T .


73
Example 2: Let C = {1, x, x2 } be the standard basis of P2 and let B = {1 − x2 , x, −2 + x2 }. Find the change
of coordinates matrix from B to C. Then, use the inverse to find the change of coordinates matrix from C to B.

74
4.5 General Linear Mappings
4.5.1 General Linearity Conditions
Definition: Linear Mappings

If V and W are vector spaces over R, a function L : V → W is a linear mapping if it satisfies the linearity
properties

Property Name

L(x ⊕V y) = L(x) ⊕W L(y) Additive Linearity

L(t ⊙V x) = t ⊙W L(x) Scalar Linearity

for all x, y ∈ V; t ∈ R; ⊕V and ⊙V are the operations of addition and scalar multiplying respectively on V;
and ⊕W and ⊙W are the operations of addition and scalar multiplying respectively on W. If V = W, then
L may be called a linear operator.

Example 1: Let L : M (2, 2) → P1 be defined by L(A) = a21 + (a12 + a22 )x. Prove that L is a linear mapping.

Example 2: Define L : E → R by L(x) = ln(x). Prove that L is a linear function. Note: The function is most
certainly not linear if it is mapping L : R+ → R as usually done in calculus!

75
4.5.2 The General Rank-Nullity Theorem
Definition: The Range and Nullspace/Kernel of a General Linear Mapping

Let V and W be vector spaces over R. The range of a linear mapping L : V → W is defined to be the set

Range(L) = {L(x) ∈ W | x ∈ V}

The nullspace ( or kernel) of L is the set of all vectors in V whose image under L is the zero vector 0W .
We write

Null(L) = {x ∈ V | L(x) = 0W }

Theorem: The Nullspace and Range are Subspaces

Let V and W be vector spaces and let L : V → W be a linear mapping. Then, Null(L) is a subspace of V
and Range(L) is a subspace of W.

Example 3: Consider the linear mapping L : M (2, 2) → P2 given by L(B) = b21 + (b12 + b22 )x + b11 x2 .
Determine whether 1 + x + x2 ∈ Range(L), and if it is, determine a matrix A such that L(A) = 1 + x + x2 .
Afterwards, determine the nullspace of L (as a spanning set).

Theorem: Linear Mappings Fix the Zero Vector

Let V and W be vector spaces and let L : V → W be a linear mapping. Then, L(0V ) = 0W .

Example 4: Demonstrate the prior theorem is true using L : E → R given by L(x) = ln(x) of Example 2.

76
Example 5: Determine a basis for the range and nullspace of the linear mapping L : P1 → R3 defined by
 
    L(a + bx) = [0, 0, a − 2b]T

Definition: The Rank of a General Linear Mapping

Let V and W be vector spaces over R. The rank of a linear mapping L : V → W is the dimension of the
range of L, that is, rank(L) = dim(Range(L)).

Definition: The Nullity of a General Linear Mapping

Let V and W be vector spaces over R. The nullity of a linear mapping L : V → W is the dimension of
the nullspace of L, that is, nullity(L) = dim(Null(L)).

The Rank-Nullity Theorem for a General Linear Mapping

Let V and W be vector spaces over R with dim(V) = n, and let L : V → W be a linear mapping. Then,

rank(L) + nullity(L) = n

Example 6: Confirm the Rank-Nullity Theorem in the previous example.

77
4.6 Matrix of a Linear Mapping
4.6.1 The Matrix of L with Respect to the Basis B
Note: The Representing Matrix of a Mapping in Another Basis

In this subsection we are concerned with the representing matrix [L] of L in another basis B. Specifically, we
want everything stated in the language of B with nothing in the language of the standard basis. That is, for
an input [⃗x]B the output is [L(⃗x)]B and we seek a matrix A such that [L(⃗x)]B = A[⃗x]B , which we will call [L]B .

Let B = {⃗v1 , ..., ⃗vn } be a basis for Rn and let L : Rn → Rn be a linear operator. Then, for any ⃗x ∈ Rn , we
can write ⃗x = b1⃗v1 + · · · + bn⃗vn . Thus by linearity,

L(⃗x) = L(b1⃗v1 + · · · + bn⃗vn ) = b1 L(⃗v1 ) + · · · + bn L(⃗vn )


Representing everything in B coordinates we obtain

[L(⃗x)]B = [b1 L(⃗v1 ) + · · · + bn L(⃗vn )]B
        = b1 [L(⃗v1 )]B + · · · + bn [L(⃗vn )]B
        = [ [L(⃗v1 )]B · · · [L(⃗vn )]B ] [ b1 · · · bn ]T
        = [L]B [⃗x]B

Theorem: The Representing Matrix of L in the Basis B

Let V be a vector space. Suppose that B = {v1 , ..., vn } is any basis for V and that L : V → V is a linear
operator. Define the matrix of the linear operator L with respect to the basis B to be the matrix
 
[L]B = [L(v1 )]B · · · [L(vn )]B
where we have for any x ∈ V, [L(x)]B = [L]B [x]B .
   
Example 1: Let L : R2 → R2 be given by L(x1 , x2 ) = (x2 , x1 ) and let B = { [1, −1]T , [1, 1]T }. Determine [L]B .

78
4.6.2 Change of Coordinates and Linear Mappings
Note: Another Way to Obtain [L]B

There is another way to obtain [L]B which is quite simple (in


theory). Specifically, let S represent the standard basis and let
B represent the basis we wish to work in. Let P = PSB be the
change of basis matrix from B to S. Then we may construct

[L]B = PBS [L]PSB ⇒ [L]B = P −1 [L]P


where [L] is just the regular matrix representation of L with respect to the standard basis (as done in prior
chapters). The logic of this follows by composing the mappings: convert from B-coordinates to standard
coordinates, apply [L], then convert back to B-coordinates.
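A minimal numerical sketch of this construction (our own illustration, using Example 1 of the previous subsection):

```python
import numpy as np

# [L]_B = P^{-1} [L] P, where the columns of P are the B basis vectors
# expressed in the standard basis.
L_std = np.array([[0.0, 1.0],
                  [1.0, 0.0]])          # [L] for L(x1, x2) = (x2, x1)
P = np.array([[1.0, 1.0],
              [-1.0, 1.0]])             # columns are the basis vectors of B
L_B = np.linalg.inv(P) @ L_std @ P
print(L_B)                              # the matrix of L with respect to B
```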

Note: Linear Mapping with Different Basis

Using the prior note, you may extend the logic to find the matrix representation of the linear mapping if
the input is in basis B and the output is in basis C. This is simply [L]_C^B = P_C^S [L] P_S^B , which is very useful.

       
Example 2: Let L be the linear mapping with [L] = A = \begin{bmatrix} 2 & 1 & 3 \\ -1 & 2 & 2 \\ -2 & 3 & 1 \end{bmatrix} and let B = { [1, 1, 0]T , [1, 1, 1]T , [0, 1, 1]T }.

Determine [L]B .

79
Chapter 5

Determinants

80
5.1 Determinants in Terms of Cofactors
5.1.1 The 2 × 2 Case
Note: Consistency of a 2 × 2 System

When solving a system of equations A⃗x = ⃗b, we know from the inverse theorem that the consistency of the
system depends entirely on the coefficient matrix and not ⃗b. Specifically, what conditions are these? We
may talk about rank, and other equivalencies, but in this section we develop a new apparatus for measuring
consistency.

One may solve such a system (in general) to obtain the following result,

    a11 x1 + a12 x2 = b1
    a21 x1 + a22 x2 = b2
        =⇒   x1 = (a22 b1 − a12 b2 ) / (a11 a22 − a12 a21 ) ,   x2 = (a11 b2 − a21 b1 ) / (a11 a22 − a12 a21 )
and thus a condition is that we must have that a11 a22 − a12 a21 ̸= 0.

Definition: The Determinant of a 2 × 2 Matrix


 
The determinant of a 2 × 2 matrix A = \begin{bmatrix} a11 & a12 \\ a21 & a22 \end{bmatrix} is defined by

    det(A) = a11 a22 − a12 a21
We also use the notation |A| = det(A).

Note: Criss-Cross Apple Sauce

It’s simple to remember the above by the ‘criss-cross’ pattern: you take the product of the main diagonal
and subtract the product of the off-diagonal, |A| = a11 a22 − a21 a12 .
   
Example 1: Compute the determinants of \begin{bmatrix} 1 & 2 \\ -3 & -7 \end{bmatrix} and \begin{bmatrix} 2 & 4 \\ 4 & 8 \end{bmatrix}. Which matrices A will always have
A⃗x = ⃗b consistent for every ⃗b ∈ R2 ?

81
5.1.2 The 3 × 3 Case
Consistency of a 3 × 3 System and the Pattern

One may derive a condition for consistency required in a 3 × 3 system through brute force just like the prior
subsection. As one expects, the condition is much messier and given by

    |A| = a11 a22 a33 − a11 a23 a32 − a12 a21 a33 + a12 a23 a31 + a13 a21 a32 − a13 a22 a31 ̸= 0.

There is an easier way to memorize this. We may arrange this as:

    |A| = a11 (a22 a33 − a23 a32 ) − a12 (a21 a33 − a23 a31 ) + a13 (a21 a32 − a22 a31 )

        = a11 · \begin{vmatrix} a22 & a23 \\ a32 & a33 \end{vmatrix} + a12 · (−1) · \begin{vmatrix} a21 & a23 \\ a31 & a33 \end{vmatrix} + a13 · \begin{vmatrix} a21 & a22 \\ a31 & a32 \end{vmatrix}

This might still seem like a mess to memorize until we relate it back to the original matrix to obtain an
easy-to-memorize pattern. Notice that every sub-determinant (and its coefficient) can be obtained by moving
across the top row and, at each entry, crossing out the row and column containing that entry, then multiplying
the entry by the determinant of the remaining 2 × 2 sub-matrix:

    a11 · \begin{vmatrix} a22 & a23 \\ a32 & a33 \end{vmatrix}   (cross out row 1 and column 1)

    a12 · (−1) · \begin{vmatrix} a21 & a23 \\ a31 & a33 \end{vmatrix}   (cross out row 1 and column 2)

    a13 · \begin{vmatrix} a21 & a22 \\ a31 & a32 \end{vmatrix}   (cross out row 1 and column 3)

So the idea is to move across the top row in an alternating sign pattern (+, −, +), taking the appropriate
sub-determinants as we move across. We denote the above sub-determinants by the following

    C11 = \begin{vmatrix} a22 & a23 \\ a32 & a33 \end{vmatrix} ,   C12 = (−1) \begin{vmatrix} a21 & a23 \\ a31 & a33 \end{vmatrix} ,   C13 = \begin{vmatrix} a21 & a22 \\ a31 & a32 \end{vmatrix}

These sub-determinants have a special name. There also isn’t necessarily a reason for us to restrict to just
taking the sub-determinants produced by the first row.

82
Definition: The Cofactors of a 3 × 3 Matrix

Let A be a 3 × 3 matrix. Let A(i, j) denote the 2 × 2 sub-matrix obtained from A by deleting the i-th row
and j-th column. Define the cofactors of a 3×3 matrix to be

Cij = (−1)^(i+j) |A(i, j)|

 
Example 2: Let A = \begin{bmatrix} 2 & -1 & 3 \\ 0 & 4 & -1 \\ 1 & -2 & 3 \end{bmatrix}. Compute C11 and C23 .

Definition: The Determinant of a 3 × 3 Matrix

The determinant of a 3×3 matrix A is defined by |A| = a11 C11 + a12 C12 + a13 C13 .
 
Example 3: Let A = \begin{bmatrix} 2 & -1 & 3 \\ 0 & 4 & -1 \\ 1 & -2 & 3 \end{bmatrix}. Compute |A|. Next, compute a11 C11 + a21 C21 + a31 C31 (the expansion
down column 1 instead).

Theorem: Cofactor Expansion Along Any Row or Column Yields the Same Value

Cofactor expansion along any row or column of a 3 × 3 matrix yields the same determinant value.

83
5.1.3 General Cofactor Expansion
Note: The Pattern of Determinant Expansion is Consistent

The pattern described in the 3 × 3 case continues when defining the determinant of larger square matrices.

Definition: The Cofactors and Determinant of an n × n Matrix

Let A be an n × n matrix. Let A(i, j) denote the (n − 1) × (n − 1) matrix obtained from A by deleting the
i-th row and j-th column. The cofactors of an n × n matrix are defined to be

Cij = (−1)^(i+j) |A(i, j)|


Furthermore, the determinant of an n × n matrix A is defined by |A| = a11 C11 + a12 C12 + · · · + a1n C1n .

Theorem: Cofactor Expansion Along Any Row or Column Yields the Same Value

Cofactor expansion along any row or column of an n × n matrix yields the same determinant value.
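A minimal recursive sketch of cofactor expansion along the first row (our own illustration; the function name is hypothetical, and the method is far slower than row reduction for large matrices):

```python
import numpy as np

def det_by_cofactors(A):
    """Cofactor expansion along the first row; correct but exponentially slow,
    shown only to mirror the definition."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        sub = np.delete(np.delete(A, 0, axis=0), j, axis=1)   # the sub-matrix A(1, j+1)
        total += (-1) ** j * A[0, j] * det_by_cofactors(sub)  # a_1j * C_1j
    return total

A = np.array([[0, 0, 3, 0],
              [0, 5, 6, 0],
              [-2, 3, 0, 4],
              [-5, 1, 2, 3]])           # the matrix from Example 4 below
print(det_by_cofactors(A), np.linalg.det(A))   # the two values agree
```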

Example 4: Compute \begin{vmatrix} 0 & 0 & 3 & 0 \\ 0 & 5 & 6 & 0 \\ -2 & 3 & 0 & 4 \\ -5 & 1 & 2 & 3 \end{vmatrix}.

84
 
Example 5: Calculate the determinant of \begin{bmatrix} 3 & 2 & 0 & -1 \\ 0 & 0 & 0 & 0 \\ 4 & 1 & 2 & 1 \\ 3 & -1 & 0 & 1 \end{bmatrix}.

 
Example 6: Calculate the determinant of \begin{bmatrix} 4 & 2 & 1 & -1 \\ 0 & 2 & 2 & 2 \\ 0 & 0 & -1 & 3 \\ 0 & 0 & 0 & 4 \end{bmatrix}.

Theorem: Rows and Columns of Zeroes in the Determinant


If one row (or column) of an n × n matrix A contains only zeros, then |A| = 0.

Theorem: Triangular Matrix

If A is an n × n upper or lower triangular matrix (which includes diagonal matrices) then the determinant
of A is the product of the diagonal entries of A. That is, |A| = a11 a22 · · · ann .

85
5.2 Elementary Row Operations and the Determinant
5.2.1 Determinant Operations
> Scaled Rows or Columns
Note: Common Factors in a Determinant Row or Column
The first way that we may simplify is by noticing a common factor in a row or column. Suppose that a
column (or a row) of A has been scaled by a factor r. Then,

    \begin{vmatrix} a11 & \cdots & r a1j & \cdots & a1n \\ a21 & \cdots & r a2j & \cdots & a2n \\ \vdots & & \vdots & & \vdots \\ an1 & \cdots & r anj & \cdots & ann \end{vmatrix} = r a1j C1j + r a2j C2j + · · · + r anj Cnj = r(a1j C1j + a2j C2j + · · · + anj Cnj ) = r|A|

Theorem: Common Factors in a Determinant Row or Column


Let A be an n × n matrix and let B be the matrix obtained from A by multiplying the i-th row of A by the
real number r. Then, |B| = r|A|.

Example 1: Given that \begin{vmatrix} a & b \\ c & d \end{vmatrix} = 5, compute \begin{vmatrix} a & -4b \\ -7c & 28d \end{vmatrix}.

> Swapping Rows or Columns


Theorem: Swapping Columns or Rows in a Determinant

Suppose that A is an n × n matrix and that B is the matrix obtained from A by swapping two rows (or
columns). Then, |B| = −|A|.

Example 2: Given that \begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = −3, compute \begin{vmatrix} 3e & d & f \\ 3h & g & i \\ 6b & 2a & 2c \end{vmatrix}.

86
> Adding Multiples of a Row or Column
Theorem: Adding Multiples of a Row or Column to Another Doesn’t Change the Determinant

Suppose that A is an n×n matrix and that B is obtained from A by adding r times the i-th row (or column)
of A to the k-th row (or column resp.). Then, |B| = |A|.

Example 3: Given that \begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = 7, compute \begin{vmatrix} h & g & i \\ e + 2h & d + 2g & f + 2i \\ 2b − 3e & 2a − 3d & 2e − 3f \end{vmatrix}.

Example 4: Compute \begin{vmatrix} 1 & 3 & 1 & 5 \\ 1 & 3 & -3 & -3 \\ 0 & 3 & 1 & 0 \\ 1 & 6 & 2 & 11 \end{vmatrix} by reducing it to an upper triangular matrix.

87
5.2.2 Determinant Properties
Note: Determinants and the Inverse Matrix Theorem
Since |A| ̸= 0 is intrinsically tied to the (general) consistency of a system, it holds a spot on the Inverse
Matrix Theorem. Specifically, we pay special attention to the fact that a matrix is invertible if and only if
the determinant is non-zero.

Theorem: Determinant Properties

Let A and B be n × n matrices and let t ∈ R. Then...

Property Name

|tA| = tn |A| Common Scalar in Determinant

|AB| = |A| · |B| Multiplicativity of the Determinant

|AT | = |A| Determinant of Transpose

|A−1 | = 1/|A| Determinant of Inverse

Note: The Determinant is NOT Additive


Never assume that |A + B| and |A| + |B| are equal. That result is almost never true. If you make this
mistake, you will be heavily penalized.
 
Example 5: Suppose that A and B are 2 × 2 matrices such that |A| = −3 and B = \begin{bmatrix} 1 & -1 \\ 3 & 2 \end{bmatrix}. Compute the
expression |4A3 B T B −1 |.

88
5.3 Inverse by Cofactors and Cramer’s Rule
5.3.1 Inverse by Cofactors
Definition: The Adjugate Matrix

Let A be a size n × n matrix. The cofactor matrix of A is the matrix of cofactors, denoted cof(A) = [Cij ].
The adjugate of A is the matrix adj(A) =cof(A)T .

Theorem: A Formula for the Inverse Matrix


Let A be an invertible size n × n matrix, then A−1 = (1/|A|) adj(A).

Note: Advantage of the Inverse Matrix Formulation

The advantage of this formulation, albeit a computational nuisance, gives an explicit formulation of the
inverse. Before, we only had a procedure for constructing the inverse without a formula.
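A minimal sketch of the formula in code (our own illustration; the function name is hypothetical and np.linalg.det is used for the sub-determinants):

```python
import numpy as np

def inverse_by_adjugate(A):
    """A^(-1) = adj(A) / det(A); the cofactor matrix is built entry by entry.
    A small sketch for explicitness, not an efficient method."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    cof = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            sub = np.delete(np.delete(A, i, axis=0), j, axis=1)   # A(i+1, j+1)
            cof[i, j] = (-1) ** (i + j) * np.linalg.det(sub)      # cofactor C_ij
    adj = cof.T                                                   # adjugate = cof(A)^T
    return adj / np.linalg.det(A)

A = np.array([[2.0, 4.0, -1.0],
              [0.0, 3.0, 1.0],
              [6.0, -2.0, 5.0]])        # the matrix from Example 1 below
print(np.allclose(inverse_by_adjugate(A) @ A, np.eye(3)))    # True
```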
 
Example 1: Use the method of cofactors to construct the inverse of the matrix \begin{bmatrix} 2 & 4 & -1 \\ 0 & 3 & 1 \\ 6 & -2 & 5 \end{bmatrix}.

Note: Important Shortcut for 2 × 2 Inverses


   
Let A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} be an invertible 2 × 2 matrix. Then A−1 = (1/|A|) \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}.

89
5.3.2 Cramer’s Rule
Theorem: Cramer’s Rule

Let A be an invertible size n × n matrix and consider the system A⃗x = ⃗b. Let Ni be the matrix obtained
from A by replacing the i’th column of A by ⃗b. Then xi = |Ni | / |A|.
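A minimal sketch of Cramer's Rule in code (our own illustration; the function name is hypothetical, and we pick k = 0 in the system of Example 2 below simply to have concrete numbers):

```python
import numpy as np

def cramer(A, b):
    """Solve Ax = b by Cramer's rule: x_i = |N_i| / |A|, where N_i is A with
    column i replaced by b.  Assumes A is invertible."""
    A = np.array(A, dtype=float)
    b = np.array(b, dtype=float)
    detA = np.linalg.det(A)
    x = np.zeros(len(b))
    for i in range(len(b)):
        Ni = A.copy()
        Ni[:, i] = b                    # replace the i-th column of A by b
        x[i] = np.linalg.det(Ni) / detA
    return x

A = np.array([[1.0, 1.0, -1.0],
              [2.0, 4.0, 5.0],
              [1.0, 1.0, 2.0]])
b = np.array([3.0, 1.0, 0.0])           # the system of Example 2 with k = 0
print(cramer(A, b))
```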

Example 2: Consider the following system of linear equations



x1 + x2 − x3 = 3

2x1 + 4x2 + 5x3 = 1

x1 + x2 + 2x3 = k

Solve for JUST x3 and then find all values of k such that x3 = 0.

5.3.3 A Formula For the Cross Product Using Determinants


Theorem: A Better Formula for the Cross Product

Let ⃗u, ⃗v ∈ R3 . Then, ⃗u × ⃗v = \begin{vmatrix} ⃗e1 & ⃗e2 & ⃗e3 \\ u1 & u2 & u3 \\ v1 & v2 & v3 \end{vmatrix} = C11⃗e1 + C12⃗e2 + C13⃗e3 .

90
Chapter 6

Eigenvectors and Diagonalization

91
6.1 Eigenvalues and Eigenvectors
6.1.1 Eigenvalues and Eigenvectors of a Mapping
Definition: Eigenvalues and Eigenvectors of a Linear Mapping

Suppose that L : Rn → Rn is a linear operator. A non-zero vector ⃗v ∈ Rn such that L(⃗v ) = λ ⃗v is called an
eigenvector of L; the scalar λ is called an eigenvalue of L. The pair λ, ⃗v is called an eigenpair. This
terminology remains unchanged in the context of matrices where [L] = A and we consider A⃗v = λ⃗v .
 
Example 1: Let ⃗v = [1, −1]T and consider the projection mapping proj⃗v : R2 → R2 . Find two eigenvectors
with distinct eigenvalues.

6.1.2 Finding Eigenvectors and Eigenvalues


Note: Deriving the Characteristic Equation

We wish to find non-zero solutions λ and ⃗v to the system A⃗v = λ⃗v . Suppose that λ is known, then the
system A⃗v = λ⃗v is equivalent to the Homogeneous System (A − λI)⃗v = ⃗0. We know that the only way
to avoid the trivial solution to such a system is if the solution to the system is not unique. We may obtain
such a condition by A − λI not being invertible, that is, if |A − λI| = 0.

Definition: The Characteristic Equation and Characteristic Polynomial

Let A be an n × n matrix. The function given by C(λ) = |A − λI| is called the characteristic polynomial.
The equation given by C(λ) = 0 (or rather |A − λI| = 0) is called the characteristic equation.

Theorem: The Roots of the Characteristic Equation Yield All Eigenvalues

Suppose that A is an n × n matrix. A real number λ is an eigenvalue of A if and only if it satisfies the


characteristic equation |A − λI| = 0. If λ is an eigenvalue of A, then all non-trivial solutions of the
homogeneous system (A − λI)⃗v = ⃗0 are eigenvectors of A that correspond to λ.

92
 
Example 2: Find all eigenvalues of A = \begin{bmatrix} 2 & 2 \\ 1 & 3 \end{bmatrix}.

Definition: Eigenspace

Let λ be an eigenvalue of an n × n matrix A. Then the set containing the zero vector and all eigenvectors
of A corresponding to λ is called the eigenspace of λ. In particular, Eλ = Null(A − λI).

Note: Convention of Naming Eigenvectors

By convention we refer to “the” eigenvectors of A to be the basis vectors of the eigenspaces.

Note: Procedure to Solving the Eigenvalue Problem

Let A be an n × n matrix. To solve the eigenvalue problem A⃗v = λ⃗v for all solutions, complete the following...

1. Form the characteristic equation |A − λI| = 0 and solve for each λ; proceed to the next step if you’re
also finding the eigenvectors...

2. For each λ obtained in the previous step form the homogeneous system (A − λI)⃗v = ⃗0.

3. Find the general solution of each system in the previous step to find the eigenspace.

4. Find a basis for each of the previous solution sets to determine the corresponding eigenvectors.
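In practice this procedure is carried out symbolically, but it can be checked numerically. A minimal NumPy sketch (our own illustration) for the matrix of Example 3 below; note that np.poly returns the coefficients of det(λI − A), which agrees with C(λ) up to sign:

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0],
              [1.0, 1.0, 1.0],
              [1.0, 1.0, 1.0]])
print(np.poly(A))                  # coefficients of the characteristic polynomial (up to sign)
vals, vecs = np.linalg.eig(A)
print(vals)                        # eigenvalues (3 and a repeated 0)
print(vecs)                        # columns are corresponding unit eigenvectors
```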
 
Example 3: Solve the eigenvalue problem associated to A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}.

93
(Continued...)

Note: Properties of the Characteristic Equation

It is important to recognize that C(λ) = |A − λI| is an n-th degree polynomial. As such,

(1) λ1 is a root of C(λ) (i.e. C(λ1 ) = 0) if and only if λ − λ1 is a factor of C(λ).

(2) The total number of roots (real and complex, counting repetitions) is n.

(3) Complex roots of the equation occur in “conjugate pairs,” so that the total number of complex roots
must be even.

(4) If n is odd, there must be at least one real root.

(5) If the entries of A are integers, since the leading coefficient of the characteristic polynomial is ±1, any
rational root must in fact be an integer (by the rational roots theorem).

Theorem: Linear Independence of Eigenvectors

Suppose that λ1 , ..., λk are distinct (λi ̸= λj ) eigenvalues of an n × n matrix A, with corresponding eigen-
vectors ⃗v1 , ..., ⃗vk , respectively. Then {⃗v1 , ..., ⃗vk } is linearly independent.

94
6.1.3 Algebraic and Geometric Multiplicity
Definition: Geometric and Algebraic Multiplicity

Let A be an n × n matrix with eigenvalue λ. The algebraic multiplicity of λ is the number of times λ is
repeated as a root of the characteristic polynomial. The geometric multiplicity of λ is the dimension of
the eigenspace of λ.

Theorem: Inequality Relationship Between Geometric and Algebraic Multiplicity

Let λ be an eigenvalue of an n × n matrix A. Then we always have

1 ≤ Geometric Multiplicity ≤ Algebraic Multiplicity


and if the second inequality is strict we say that the eigenvalue is deficient.
 
Example 4: Determine the algebraic and geometric multiplicity of each eigenvalue of A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}. Are any
eigenvalues deficient?

95
6.2 Diagonalization
6.2.1 Similar Matrices
Definition: “Similarity” of Matrices

If A and B are n × n matrices such that S −1 AS = B for some invertible matrix S, then A and B are said
to be similar. We denote this by A ∼ B.

Definition: Trace of a Matrix
Let A be a square n × n matrix. We define the trace of A to be the sum of its diagonal elements. That is,
    tr(A) = a11 + a22 + · · · + ann = \sum_{i=1}^{n} aii

Theorem: Similar Traits of Similar Matrices


If A and B are n × n matrices such that A ∼ B, then A and B have the same...

1. Determinant (|A| = |B|)

2. Eigenvalues

3. Rank (rank(A) =rank(B))

4. Trace
   
Example 1: Consider the similar matrices A = \begin{bmatrix} 8 & 6 \\ -8 & -6 \end{bmatrix} ∼ B = \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix} related by

    \begin{bmatrix} 8 & 6 \\ -8 & -6 \end{bmatrix} = \begin{bmatrix} 4 & 3 \\ 1 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 4 & 3 \\ 1 & 1 \end{bmatrix} .
Demonstrate the properties of the prior theorem hold with regards to A and B.

96
6.2.2 Diagonalization
Definition: Diagonalizable Matrices

If there exists an invertible matrix P and a diagonal matrix D such that P −1 AP = D (i.e. A is similar to
a diagonal matrix), then we say that A is diagonalizable and that the matrix P diagonalizes A to its
diagonal form D.

Note: Diagonalization by Eigenvectors

One may see that eigenvectors yield a simple way to diagonalize matrices. Specifically, let A be a square
n × n matrix with A⃗vk = λk⃗vk for 1 ≤ k ≤ n. Then, form P = [ ⃗v1 · · · ⃗vn ]. We may see that

    AP = [ A⃗v1 · · · A⃗vn ] = [ λ1⃗v1 · · · λn⃗vn ] = [ ⃗v1 · · · ⃗vn ] diag(λ1 , ..., λn ) = P D.

Thus, P −1 AP = D, or rather, A = P DP −1 .
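A minimal numerical sketch of this construction (our own illustration, using the matrix of Example 1 from the previous subsection):

```python
import numpy as np

# Assemble P from eigenvectors and check that P^{-1} A P is diagonal.
A = np.array([[8.0, 6.0],
              [-8.0, -6.0]])            # the matrix from Example 1 of 6.2.1
vals, P = np.linalg.eig(A)              # columns of P are eigenvectors
D = np.linalg.inv(P) @ A @ P
print(np.round(D, 10))                  # diagonal, with the eigenvalues on the diagonal
print(np.allclose(A, P @ np.diag(vals) @ np.linalg.inv(P)))   # True: A = P D P^{-1}
```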

Theorem: Diagonalization Theorem


An n × n matrix A can be diagonalized if and only if there exists a basis for Rn of eigenvectors of A.
If such a basis {⃗v1 , ..., ⃗vn } exists, the matrix P = [ ⃗v1 · · · ⃗vn ] diagonalizes A to a diagonal matrix
D = diag(λ1 , ..., λn ), where λk is an eigenvalue of A corresponding to ⃗vk for 1 ≤ k ≤ n.

Consequently, a matrix is diagonalizable if and only if every eigenvalue has its geometric multiplicity equal
to its algebraic multiplicity — i.e. no eigenvalues are deficient. In particular, if all n eigenvalues of the
matrix are distinct, then A is diagonalizable.
 
Example 2: Show that the following matrix A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} is not diagonalizable.
Hint: Refer back to an earlier example to save on time.

97
 
Example 3: Diagonalize, if possible, the following matrix A = \begin{bmatrix} 0 & 3 & -2 \\ -2 & 5 & -2 \\ -2 & 3 & 0 \end{bmatrix}.
Hint: Perform the determinant operation R3 + (−R2 ) → R3 and expand on the third row.

98
6.3 Powers of Matrices and the Markov Process
6.3.1 (Large) Powers of Matrices
Theorem: Powers of Diagonal Matrices

If D is a diagonal n × n matrix such that D = diag(d1 , ..., dn ) then D^m = diag(d1^m , ..., dn^m ) for any integer
m ≥ 1.

Theorem: Powers of Similar Matrices

Let A and B be n × n matrices such that A = SBS −1 (i.e. A ∼ B). Then for any integer m ≥ 1 we have
Am = SB m S −1 .
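A minimal sketch using the diagonalization of Example 1 below (our own illustration):

```python
import numpy as np

# Computing B^m through B = P D P^{-1}, so that B^m = P D^m P^{-1}.
P = np.array([[-2.0, 1.0],
              [1.0, 1.0]])
D = np.diag([-1.0, 5.0])
P_inv = np.linalg.inv(P)
B = P @ D @ P_inv                       # reconstructs [[1, 4], [2, 3]] of Example 1

m = 6
B_m = P @ np.diag(np.diag(D) ** m) @ P_inv      # D^m is taken entrywise on the diagonal
print(np.allclose(B_m, np.linalg.matrix_power(B, m)))   # True
```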


Example 1: For the matrix B = \begin{bmatrix} 1 & 4 \\ 2 & 3 \end{bmatrix} the diagonalization is given by

    B = P DP −1 =⇒ \begin{bmatrix} 1 & 4 \\ 2 & 3 \end{bmatrix} = \begin{bmatrix} -2 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} -1 & 0 \\ 0 & 5 \end{bmatrix} \begin{bmatrix} -1/3 & 1/3 \\ 1/3 & 2/3 \end{bmatrix}
Compute B 2 directly and indirectly (using diagonalization). Then, find a closed formula for B m for any general
m ≥ 1.

Definition: Root of a Matrix

Let A be a square matrix with diagonalization A = P DP −1 . The root A1/2 is defined provided that all the
eigenvalues of A are non-negative and is given by A1/2 = P D1/2 P −1 where in D1/2 the root is taken of each
diagonal element of D.

99
6.3.2 Markov Process
Example 2: Smith and Jones are the only competing suppliers of communication services in their community.
At present, they each have a 50% share of the market. However, Smith has recently upgraded his service and a
survey indicates that from one month to the next, 90% of Smith’s customers remain loyal, while 10% switch to
Jones. On the other hand, 70% of Jones’ customers remain loyal and 30% switch to Smith. If this goes on for six
months, how large are their market shares? If this goes on for a long time (∞), how big will Smith’s share become?
     
Hint: The eigenpairs of T = \begin{bmatrix} 0.9 & 0.3 \\ 0.1 & 0.7 \end{bmatrix} are λ1 = 1, ⃗v1 = [3, 1]T and λ2 = 3/5, ⃗v2 = [1, −1]T .
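A minimal sketch of the iteration in Example 2 (our own illustration); the long-run behaviour approaches the fixed state discussed next.

```python
import numpy as np

# Market-share iteration: state_{k+1} = T @ state_k.
T = np.array([[0.9, 0.3],
              [0.1, 0.7]])              # columns sum to 1 (a Markov matrix)
s = np.array([0.5, 0.5])                # initial shares: 50% Smith, 50% Jones

for month in range(6):
    s = T @ s
print(s)                                # shares after six months

for month in range(1000):               # long run: approaches the fixed state (0.75, 0.25)
    s = T @ s
print(s)
```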

100
Definition: Markov Matrices
An n × n matrix T is the Markov matrix (or transition matrix) of an n-state Markov process if

1. tij ≥ 0 for each i and j.

2. Each column sum is 1: t1j + t2j + · · · + tnj = 1 for each j.

The column vector inputs and outputs of the transition matrix, whose components sum to 1, are called the
states of the Markov Process.

The eigenvector of a transition matrix with eigenvalue λ = 1, whose components sum to 1, is called the
fixed (or invariant) state of the transition matrix.

Example 3: Find the fixed state of the previous example.


Hint: You already have the eigenvector, you just need the components to add to 1.

Theorem: Properties of Markov Matrices

Consider a Markov process with Markov matrix T . Then the following properties hold:

(P1) One eigenvalue of a Markov matrix is λ1 = 1.

(P2) The components of the eigenvector associated to λ1 = 1 are all non-negative.

(P3) All other eigenvalues satisfy |λi | ≤ 1.

(P4) Suppose that for some m all the entries in T m are non-zero. Then all the eigenvalues of T except for
λ1 = 1 satisfy |λi | < 1. In this case, for any initial state ⃗s, T m⃗s → ⃗s∗ as m → ∞ where ⃗s∗ is the
eigenstate associated to λ1 = 1.

6.3.3 The Power Method


Theorem: Power Method Algorithm

Make a guess ⃗x0 and normalize to obtain x̂0 . Construct ⃗x1 = Ax̂0 and normalize to obtain x̂1 . Repeat the
process (Ax̂i = ⃗xi+1 ) and eventually x̂m → ⃗v as m → ∞ where ⃗v is the eigenvector associated to the largest
eigenvalue of A. Compute A⃗v and factor out ⃗v to find the largest eigenvalue.
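A minimal sketch of the algorithm (our own illustration; the function name, the random starting guess, and the use of the Rayleigh quotient x · (Ax) to read off the eigenvalue are our own choices):

```python
import numpy as np

def power_method(A, iterations=100, seed=0):
    """Repeatedly apply A and normalize; for a generic start the iterate
    converges to an eigenvector of the dominant eigenvalue."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(A.shape[0])      # the initial guess x_0
    x = x / np.linalg.norm(x)
    for _ in range(iterations):
        x = A @ x                            # x_{i+1} = A x_i
        x = x / np.linalg.norm(x)            # normalize
    eigenvalue = x @ (A @ x)                 # Rayleigh quotient (x is a unit vector)
    return eigenvalue, x

A = np.array([[2.0, 2.0],
              [1.0, 3.0]])                   # the matrix from Example 2 of 6.1.2
print(power_method(A))                       # dominant eigenvalue 4 and an eigenvector
```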

101
Chapter 7

Orthonormal Bases

102
7.1 Orthonormal Bases and Orthogonal Matrices
7.1.1 Orthonormal Bases
Definition: Orthogonal Vectors

A set of vectors {⃗v1 , ..., ⃗vk } in Rn is orthogonal if ⃗vi · ⃗vj = 0 whenever i ̸= j.


     

 1 1 −1 
1 −1   0 
  
Example 1: Show that the set   is an orthogonal set in R4 .
 1 , 1
  ,
   1 
 
1 −1 0
 

Theorem: Linear Independence of Orthogonal Vectors

If {⃗v1 , ..., ⃗vk } is an orthogonal set of non-zero vectors in Rn , it is linearly independent.

Definition: Orthonormal Basis


A set {⃗v1 , ..., ⃗vk } of vectors in Rn is orthonormal if it is orthogonal and each vector ⃗vi is a unit vector
(that is, each vector is normalized). If a basis is orthonormal we call it an orthonormal basis.

Example 2: Normalize the set of orthogonal vectors in the previous example to form an orthonormal set.

103
7.1.2 Coordinates with Respect to an Orthonormal Basis
Theorem: Coordinates with Respect to an Orthogonal Basis

If B = {⃗v1 , ..., ⃗vn } is an orthogonal basis for Rn , then

⃗x = proj⃗v1 (⃗x) + · · · + proj⃗vn (⃗x).


In particular, this means that [⃗x]B = [ b1 · · · bn ]T where bk = (⃗x · ⃗vk )/(⃗vk · ⃗vk ) (called the Scalar Component
and denoted Comp⃗vk (⃗x)) for 1 ≤ k ≤ n.

In particular, if the basis is orthonormal, i.e. B = {v̂1 , ..., v̂n }, then Compv̂k (⃗x) = ⃗x · v̂k and ⃗x = (⃗x · v̂1 )v̂1 +
· · · + (⃗x · v̂n )v̂n .

Example 3: Consider the orthonormal basis of R3 given by


      
1 1 −2 2
1 1 
B=  2 ,
 −1 , −2 
3 3 3
2 2 1

 T
and let ⃗x = 3 −1 5 . Determine [⃗x]B .

Theorem: Lengths and Angles are Preserved in Orthonormal Coordinates

Let B be an orthonormal basis of Rn and let ⃗x, ⃗y ∈ Rn . Then,

∥⃗x∥ = ∥[⃗x]B ∥ and ⃗x · ⃗y = [⃗x]B · [⃗y ]B

Example 4: Demonstrate that ∥⃗x∥ = ∥[⃗x]B ∥ in the previous example.

104
7.1.3 Orthogonal Matrices
Note: Orthogonal Matrices

Let B = {⃗v1 , ..., ⃗vn } be an orthonormal basis of Rn . Construct the matrix

    P = [ ⃗v1 ⃗v2 · · · ⃗vn ]

This matrix satisfies an interesting equation that hopefully should be familiar:

    P T P = \begin{bmatrix} ⃗v1 T \\ ⃗v2 T \\ \vdots \\ ⃗vn T \end{bmatrix} [ ⃗v1 ⃗v2 · · · ⃗vn ]
          = \begin{bmatrix} ⃗v1 · ⃗v1 & ⃗v1 · ⃗v2 & \cdots & ⃗v1 · ⃗vn \\ ⃗v2 · ⃗v1 & ⃗v2 · ⃗v2 & \cdots & ⃗v2 · ⃗vn \\ \vdots & \vdots & \ddots & \vdots \\ ⃗vn · ⃗v1 & ⃗vn · ⃗v2 & \cdots & ⃗vn · ⃗vn \end{bmatrix}
          = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}
          = I

Therefore P T P = I and so P T acts as an inverse to P . Since the inverse is unique, this means that
P T = P −1 .

Definition: Orthogonal Matrices

An n × n matrix P such that P T P = I is called an orthogonal matrix. It follows that P −1 = P T and
that P P T = I = P T P .
 
Example 5: Show that Rθ = \begin{bmatrix} cos(θ) & − sin(θ) \\ sin(θ) & cos(θ) \end{bmatrix} is an orthogonal matrix for any θ. Then, use this to
construct Rθ−1 .

105
7.2 Projections and the Gram-Schmidt Procedure
7.2.1 Complementary Subspaces
Definition: Orthogonal Complements

Let W be a subspace of Rn . We shall say that a vector ⃗x is orthogonal to W if ⃗x · w⃗ = 0 for all w⃗ ∈ W.

We call the set of all vectors orthogonal to W the orthogonal complement of W and denote it W ⊥ . This is
explicitly given by W ⊥ = {⃗x ∈ Rn | ⃗x · w⃗ = 0 for all w⃗ ∈ W }.

Note: Basis of W ⊥

If B = {w⃗1 , ..., w⃗k } is a basis of W then W ⊥ = {⃗x ∈ Rn | ⃗x · w⃗1 = 0, ..., ⃗x · w⃗k = 0}. As the conditions ⃗x
satisfies form a homogeneous system, a basis of the general solution space yields a basis for W ⊥ .
   
 1 −1 
Example 6: Let W = Span  1  ,  0  . Construct a basis for the complement W ⊥ in R3 .
1 1
 

Theorem: Properties of Complementary Spaces

Let S be a k-dimensional subspace of Rn . Then,

1. S ∩ S ⊥ = {⃗0} (i.e. the only element in both S and S ⊥ is the zero vector)

2. dim(S ⊥ ) = n − k (i.e. otherwise stated dim(S)+dim(S ⊥ ) = n)

3. If B = {v̂1 , ..., v̂k } is an orthonormal basis for S and Bperp = {v̂k+1 , ..., v̂n } is an orthonormal basis for
S ⊥ , then B ∪ Bperp = {v̂1 , ..., v̂k , v̂k+1 , ..., v̂n } is an orthonormal basis for Rn .

106
7.2.2 Projection on Subspaces
Definition: Subspace Projection

Let S be a k-dimensional subspace of Rn and let B = {⃗v1 , ..., ⃗vk }


be an orthogonal basis of S. If ⃗x is any vector in Rn , the
projection of ⃗x onto S is defined to be

projS (⃗x) = proj⃗v1 (⃗x) + proj⃗v2 (⃗x) + · · · + proj⃗vk (⃗x)


The projection of ⃗x perpendicular to S is defined to be

perpS (⃗x) = ⃗x − projS (⃗x)
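A minimal sketch of the definition in code (our own illustration; it assumes the basis supplied is orthogonal, as the definition requires):

```python
import numpy as np

def proj_onto_subspace(x, orthogonal_basis):
    """Sum of the projections of x onto each vector of an ORTHOGONAL basis of S."""
    p = np.zeros_like(x, dtype=float)
    for v in orthogonal_basis:
        p += (x @ v) / (v @ v) * v          # proj_v(x) = (x.v / v.v) v
    return p

B = [np.array([1.0, 2.0, 1.0]), np.array([-1.0, 1.0, -1.0])]   # the basis of Example 7
x = np.array([2.0, 1.0, 3.0])
p = proj_onto_subspace(x, B)
print(p, x - p)                             # proj_S(x) and perp_S(x)
```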

     
 1 −1  2
Example 7: Let B =  2  ,  1  be an orthogonal basis for S and let ⃗x =  1 . Determine
1 −1 3
 
projS (⃗x) and perpS (⃗x).

Theorem: The Subspace Approximation Theorem

Let S be a subspace of Rn . Then, for any ⃗x ∈ Rn ,

min⃗s∈S ∥⃗x − ⃗s∥ = ∥⃗x − projS (⃗x)∥ = ∥perpS (⃗x)∥

107
7.2.3 The Gram-Schmidt Procedure
Theorem: The Gram-Schmidt Procedure
If {⃗v1 , ⃗v2 , ..., ⃗vk } is a linearly independent set of vectors in S then there exists an orthogonal set of vectors
{w⃗1 , w⃗2 , ..., w⃗k } in S such that

    Span{⃗v1 , ⃗v2 , ..., ⃗vj } = Span{w⃗1 , w⃗2 , ..., w⃗j }
for all 1 ≤ j ≤ k. Specifically, it may be constructed by the following recursion,

    w⃗1 = ⃗v1
    w⃗2 = ⃗v2 − projw⃗1 (⃗v2 )
    w⃗3 = ⃗v3 − projw⃗1 (⃗v3 ) − projw⃗2 (⃗v3 )
    ...
    w⃗k = ⃗vk − projw⃗1 (⃗vk ) − · · · − projw⃗k−1 (⃗vk )

That is, if Sj = Span{w⃗1 , ..., w⃗j } then w⃗j+1 = perpSj (⃗vj+1 ) for all 1 ≤ j ≤ k − 1 where w⃗1 = ⃗v1 .

Note: Scale of Vectors in the Algorithm

As we only care about the geometry of the objects involved, we may replace any w⃗j at each step with any
scalar multiple of it, which drastically reduces the algebra involved.

Example 8: Find an orthonormal basis for the subspace S ⊂ R4 given by

    S = Span{ [1, 0, 0, 1]T , [−1, 0, 2, 0]T , [0, 0, 1, 0]T }
108
Note: Relaxation of Conditions Required in Gram-Schmidt

The conditions in the Gram-Schmidt procedure may be slightly relaxed: all you need is a spanning set.
Keeping only the non-zero vectors produced, the Gram-Schmidt procedure naturally turns a spanning set
into an orthogonal basis.
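A minimal sketch of the (relaxed) procedure in code (our own illustration; the function name and the tolerance used to discard zero vectors are our own choices, and the output is normalized at the end to give an orthonormal basis):

```python
import numpy as np

def gram_schmidt(vectors, tol=1e-12):
    """Gram-Schmidt recursion applied to a (possibly merely spanning) set;
    zero vectors produced from dependent inputs are discarded."""
    ws = []
    for v in vectors:
        w = np.array(v, dtype=float)
        for u in ws:
            w = w - (w @ u) / (u @ u) * u      # subtract the projection onto each earlier w
        if np.linalg.norm(w) > tol:            # keep only non-zero output vectors
            ws.append(w)
    return [w / np.linalg.norm(w) for w in ws]  # normalize to get an orthonormal basis

S_spanning = [np.array([0.0, -1.0, 2.0]),
              np.array([3.0, 0.0, 1.0]),
              np.array([3.0, 1.0, -1.0])]       # the spanning set from Example 9 below
for q in gram_schmidt(S_spanning):
    print(q)
```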

Example 9: Use the Gram-Schmidt procedure to find an orthonormal basis of the subspace S ⊂ R3 given by
     
 0 3 3 
S = Span   −1 , 0 ,
    1  
2 1 −1
 

109
Appendix A

(Lengthy) Important Theorems and


Definitions

110
A.1 The Inverse Matrix Theorem
Theorem: The Inverse Matrix Theorem
Suppose that L : Rn → Rn is a linear operator with representative matrix A = [L]. Then, the following
statements are equivalent to each other:

1. A is invertible.

2. rank(A) = n.

3. nullity(A) = 0.

4. The RREF of A is I.

5. For all ⃗b ∈ Rn , the system A⃗x = ⃗b is consistent and has a unique solution.

6. The columns of A are linearly independent.

7. The rows of A are linearly independent.

8. The columnspace of A spans Rn .

9. The rowspace of A spans Rn .

10. L is invertible

11. Range(L) = Rn .

12. Null(A) = {⃗0} (or Ker(L) = {⃗0}).

13. A has n pivot positions.

14. There is a square matrix C such that CA = I.

15. There is a square matrix D such that AD = I.

16. The transpose AT is invertible.

17. The columnspace of A is Rn .

18. The rowspace of A is Rn .

19. There exists a sequence of elementary matrices such that Ek · · · E2 E1 A = I.

20. The determinant of A is non-zero.

21. Zero is not an eigenvalue of A.

22. Col(A)⊥ = {⃗0}.

23. Null(A)⊥ = Rn .

111
A.2 Vector Spaces and Subspaces
A.2.1 Vector Space Requirements
Definition: Vector Spaces

A vector space over R is a set V together with an operation ⊕ (called addition and denoted x ⊕ y for
any x, y ∈ V), and an operation ⊙ (called scalar multiplication and denoted s ⊙ x for any x ∈ V and
s ∈ R) such that for any x, y, z ∈ V and s, t ∈ R we have all of the following properties:

Number Property Name

V1 x⊕y ∈V Closure under add.

V2 x⊕y =y⊕x Commutativity of add.

V3 (x ⊕ y) ⊕ z = x ⊕ (y ⊕ z) Associativity of add.

V4 There is a 0 ∈ V such that 0 ⊕ v = v for all v ∈ V Existence of zero

V5 For each v ∈ V there exists a −v ∈ V such that v ⊕ −v = 0 Additive inverse

V6 s⊙x∈V Closure under scalar multi.

V7 s ⊙ (t ⊙ x) = (st) ⊙ x Associativity of scalar multi.

V8 (s + t) ⊙ x = (s ⊙ x) ⊕ (t ⊙ x) Distributivity of scalar add.

V9 t ⊙ (x ⊕ y) = (t ⊙ x) ⊕ (t ⊙ y) Distributivity of add.

V10 1 ⊙ v = v for all v ∈ V 1 is the scalar identity

The elements v in a vector space V are called vectors.

112
A.2.2 Subspace Requirements
Definition: Subspaces

Suppose that V is a vector space. A non-empty subset U ⊆ V is a subspace of V if it satisfies the following
properties:

Number Property Name

S0 0∈U Non-Empty

S1 x ⊕ y ∈ U for all x, y ∈ U Closure under addition

S2 t ⊙ x ∈ U for all x ∈ U and t ∈ R Closure under scalar multiplication

113
