Math 211 Course Pack V5
Linear Algebra
3 Matrices, Linear Mappings, and Inverses 36
3.1 Operations on Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.1.1 Equality, Addition and Scalar Multiplication of Matrices . . . . . . . . . . . . . . . . . . . . 37
3.1.2 Transpose of a Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.1.3 Matrix Multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.1.4 Properties of Matrix Multiplication and the Identity Matrix . . . . . . . . . . . . . . . . . . 41
3.2 Matrix Mappings and Linear Mappings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.2.1 Matrix Mappings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.2.2 Linear Mappings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.2.3 Compositions and Linear Combination of Mappings . . . . . . . . . . . . . . . . . . . . . . 44
3.3 Geometrical Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.3.1 Common Transformation Matrices in R2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.3.2 Common Transformation Matrices in R3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.3.3 General Reflections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.4 Special Subspaces for Systems and Mappings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.4.1 The Four Fundamental Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.4.2 Bases for Row(A), Col(A), and Null(A) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.4.3 The Rank-Nullity Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.5 Inverse Matrices and Inverse Mappings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.5.1 Inverse Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.5.2 A Procedure for Finding the Inverse of a Matrix and Solving Systems of Equations . . . . . 53
3.5.3 Solving Systems of Equations A⃗x = ⃗b Using the Inverse . . . . . . . . . . . . . . . . . . . . 53
3.5.4 Inverse Linear Mappings and the Inverse Matrix Theorem . . . . . . . . . . . . . . . . . . . 54
3.6 Elementary Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.6.1 Elementary Matrices and Row Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.6.2 Representing A and A−1 as a Product of Elementary Matrices . . . . . . . . . . . . . . . . 56
3.7 LU -Decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.7.1 Constructing the LU -Decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.7.2 Using LU -Decomposition to Solve Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4 Vector Spaces 59
4.1 Spaces of Polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.1.1 Polynomial Vector Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.1.2 Linear Combinations, Spans, and Linear Dependence/Independence . . . . . . . . . . . 61
4.2 Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.2.1 Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.2.2 Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.3 Bases and Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
4.3.1 Linear Combinations, Spans and Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
4.3.2 Determining a Basis of a Subspace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.3.3 Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
4.3.4 Extending a Linearly Independent Subset to a Basis . . . . . . . . . . . . . . . . . . . . . . 70
4.4 Coordinates with Respect to a Basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.4.1 Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.4.2 Change-of-Basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.5 General Linear Mappings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.5.1 General Linearity Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.5.2 The General Rank-Nullity Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.6 Matrix of a Linear Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
4.6.1 The Matrix of L with Respect to the Basis B . . . . . . . . . . . . . . . . . . . . . . . . . . 78
4.6.2 Change of Coordinates and Linear Mappings . . . . . . . . . . . . . . . . . . . . . . . . . . 79
5 Determinants 80
5.1 Determinants in Terms of Cofactors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.1.1 The 2 × 2 Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.1.2 The 3 × 3 Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.1.3 General Cofactor Expansion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
5.2 Elementary Row Operations and the Determinant . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
5.2.1 Determinant Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
5.2.2 Determinant Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
5.3 Inverse by Cofactors and Cramer’s Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
5.3.1 Inverse by Cofactors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
5.3.2 Cramer’s Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
5.3.3 A Formula For the Cross Product Using Determinants . . . . . . . . . . . . . . . . . . . . . 90
Chapter 1
1.1 Vectors in R2 and R3
1.1.1 Introduction to Vectors
Definition: Points in Space
The collection of all points of two components (x, y) is called the Real 2-Space and is denoted R2 . The
collection of all points of three components (x, y, z) is called the Real 3-Space and is denoted R3 .
Definition: Vectors
An abstraction of a point is a vector. Given a point P = (p1, p2), the corresponding object P⃗ is called a vector and is written as a column:
P = (p1, p2) ⇐⇒ P⃗ = [p1 p2]^T
Vectors are graphically given by arrows: the base may be placed at any point (x, y), and the tip of the arrow is then located at (x + p1, y + p2) in R2. Vectors, and how you graph them, extend with the same notation and graphical meaning to R3.
Textbooks often use the notation P instead of P⃗ . Either is acceptable but bold face is impossible to write
by hand, so when writing by hand we always use P⃗ .
Example 1: Graph the vector [−1 2]^T in R2 and the vector [1 1 −2]^T in R3.
To address the notational issue of wanting to write vectors inline even though they are written vertically, we will adopt the notation [a b c]^T for the column vector with entries a, b, c.
1.1.2 Vector Operations
Definition: Vector Addition and Subtraction
Let ⃗u and ⃗v be vectors with the same number of components. Vector addition ⃗u +⃗v and subtraction ⃗u −⃗v are
defined by adding and subtracting the same positioned components, respectively. Visually, vector addition
follows the parallelogram law and subtraction follows the tip-to-tip law.
Example 2: Let ⃗v = [2 −1]^T and w⃗ = [1 2]^T. Compute and graph ⃗v + w⃗ and ⃗v − w⃗.
Definition: Scalars
In the context of vectors we call (real) numbers scalars.
If ⃗v is a vector and t is a scalar then multiplication of the two t⃗v is done by distributing the scalar into each
component.
Example 3: Let ⃗v = [2 4]^T. Compute and graph 2⃗v and −(1/2)⃗v.
1.1.3 Standard Basis and Linear Combinations
Definition: Linear Combination
Let v⃗1 , ..., v⃗n be vectors. We say that the expression a1 v⃗1 + · · · + an v⃗n where a1 , ..., an are scalars is called
a linear combination of v⃗1 , ..., v⃗n .
Example 4: Express ⃗v = [5 1]^T as a linear combination of ⃗s = [1 1]^T and ⃗t = [2 0]^T.
Example 5: Express w⃗ = [3 2 −4]^T in terms of the standard basis of R3.
1. ⃗0 + ⃗v = ⃗v
2. ⃗v − ⃗v = ⃗0
1.1.5 The Vector Equation of a Line
Definition: Parametric Curves in Rn
Example 6: Consider the line through P (3, 1, −2) that is parallel to the line ⃗y (t) = [1 −2 1]^T + t [2 2 −3]^T.
Form the (a) Vector equation; (b) Parametric equations; and (c) Scalar form of this line.
1.1.6 Directed Line Segments
Definition: Directed Line Segments
The directed line segment from the point P to the point Q is the vector denoted by P⃗Q. Suppose O is the origin; then we denote O⃗P simply as P⃗, to be consistent with previous results.
Example 7: Find the vector equation of the line that passes through the points P (1, 5, −2) and Q(4, −1, 3).
Demonstrate that the vector equation for a line isn’t necessarily unique by finding three different vector equations.
1.2 Vectors in Rn
1.2.1 Rn and Algebraic Operations
Definition: Points in Space
Vector addition, subtraction, and scalar multiplication are defined in Rn just as before (component-wise).
Example 1: Let ⃗v = [2 −1 3 4]^T and w⃗ = [3 2 −2 1]^T. Compute 3⃗v − 2w⃗.
Property Name
(⃗x + ⃗y ) + w⃗ = ⃗x + (⃗y + w⃗)   Associativity of vector addition
There exists vector ⃗0 ∈ Rn such that ⃗0 + ⃗z = ⃗z for all ⃗z ∈ Rn Existence of zero vector
For each ⃗z ∈ Rn there exists −⃗z ∈ Rn such that ⃗z + (−⃗z) = ⃗0 Existence of additive inverse
1.2.2 Subspaces
Definition: Subspaces
A non-empty subset S of Rn is called a subspace of Rn if for all vectors ⃗x, ⃗y ∈ S and t ∈ R...
1. ⃗0 ∈ S (Non-Empty)
2. ⃗x + ⃗y ∈ S (Closed Under Addition)
3. t⃗x ∈ S (Closed Under Scalar Multiplication)
Example 3: Prove that the set S = { [x1 x2]^T | 2x1 − 3x2 = 0 } is a subspace of R2.
1.2.3 Spanning Sets and Linear Independence
Definition: Spanning Sets
Let R be a non-empty collection of vectors in Rn. If S = Span(R) then we say that S is the subspace spanned by the vectors in R, and in return we say that R spans S. The set R is called a spanning set for the subspace S.
Example 4: Let R = { [−3 0 2]^T, [−5 0 1]^T }. Demonstrate that ⃗v = [−4 2 5]^T is not in Span(R).
Let v⃗1, ..., v⃗k be vectors in Rn. If v⃗k can be written as a linear combination of v⃗1, ..., v⃗k−1, then Span{v⃗1, ..., v⃗k} = Span{v⃗1, ..., v⃗k−1}.
Definition: Linear Dependence and Independence
A collection of vectors {v⃗1 , ..., v⃗k } is said to be linearly dependent if there exist coefficients t1 , ..., tk not
all zero such that
⃗0 = t1 v⃗1 + · · · + tk v⃗k
Alternatively, if the only solution is t1 = t2 = · · · = tk = 0 (called the trivial solution) then we say the
collection is linearly independent.
Example 6: Show that T = { [4 −1 0]^T, [9 0 1]^T, [1 2 1]^T } is linearly dependent. Hint: compute −2⃗v1 + ⃗v2 − ⃗v3.
Let R be a collection of vectors in Rn such that S = Span(R). Provided R is linearly independent we say
that R is a basis for S.
Example 7: Consider the set T of the prior example. Argue that the set T is not a basis of Span(T ). Then,
determine a basis of Span(T ).
1.2.4 Surfaces in Higher Dimensions
Definition: Planes in Higher Dimensions
(1) Let p⃗, d⃗ ∈ Rn with d⃗ ̸= ⃗0. Then we call the set with vector equation ⃗x(t) = p⃗ + td⃗ a line in Rn that
passes through p⃗.
(2) Let p⃗, d⃗1 , d⃗2 ∈ Rn with {d⃗1 , d⃗2 } being a linearly independent set. Then the set with vector equation
⃗x(t1 , t2 ) = p⃗ + t1 d⃗1 + t2 d⃗2 is called a plane in Rn that passes through p⃗.
(3) Let p⃗, v⃗1 , ..., ⃗vn−1 ∈ Rn with {v⃗1 , ..., ⃗vn−1 } being linearly independent. Then the set with vector
equations ⃗x(t1 , ..., tn−1 ) = p⃗ + t1⃗v1 + · · · + tn−1⃗vn−1 is called a hyperplane in Rn that passes through
p⃗.
Example 9: Show that the set Span{ [1 2 1]^T, [−1 1 2]^T, [0 3 3]^T } is not a hyperplane. Determine what type of surface this space describes.
1.3 Length and Dot Products
1.3.1 Length
Definition: Length/Norm
The length (or norm) of a vector ⃗x ∈ Rn is denoted and defined as ∥⃗x∥ = √(x1^2 + x2^2 + · · · + xn^2). Quite literally, it is the length of the arrow.
Example 1: Let ⃗x = [1/3 −2/3 −2/3]^T. Compute ∥⃗x∥.
The unit vector moving in the same direction as ⃗x is denoted and given by x̂ = (1/∥⃗x∥) ⃗x.
Example 2: Construct a vector w⃗ in the same direction as ⃗x = [1 −2 5]^T such that ∥w⃗∥ = 2.
1.3.2 Angles and Dot Product
Definition: The Dot Product
The dot product between two vectors p⃗ and ⃗q in Rn is given and denoted by p⃗ · ⃗q = p1 q1 + p2 q2 + · · · + pn qn .
The angle θ between two non-zero vectors p⃗ and ⃗q satisfies p⃗ · ⃗q = ∥p⃗∥∥⃗q∥ cos θ. This relation is a theorem in R2 and R3, derived by using the Law of Cosines. Since we can't draw such figures in higher dimensions, we take this relation to be the definition of the angle between two vectors in higher dimensions.
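As a quick numerical sketch of this relation (the vectors and the use of numpy below are illustrative assumptions, not part of the course pack), the angle can be recovered directly from the dot product and the norms:

import numpy as np

# Made-up vectors; theta is recovered from p . q = ||p|| ||q|| cos(theta).
p = np.array([1.0, 2.0, -1.0])
q = np.array([0.0, 1.0, 3.0])
cos_theta = (p @ q) / (np.linalg.norm(p) * np.linalg.norm(q))
theta = np.arccos(cos_theta)   # angle in radians
print(theta)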
Example 3: Find the angle between the vectors ⃗v = [1 2 −1]^T and w⃗ = [1 −1 −1]^T in R3.
1.3.3 Properties of Length and Dot Product
Theorem: Dot Product and Norm Properties
Let ⃗x ∈ Rn . We have that ⃗x · ⃗x ≥ 0 and equality is obtained if and only if ⃗x = ⃗0. Similarly, ∥⃗x∥ ≥ 0 and
equality is obtained if and only if ⃗x = ⃗0.
Example 5: Suppose that ⃗x and ⃗y are vectors in Rn satisfying ∥⃗x∥ = 3, ∥⃗y ∥ = 2 and the angle between them
is θ = π/3. Compute ⃗x · (5⃗x + ⃗y ).
1.3.4 The Scalar Equation of a Plane
Definition: The Scalar Equation of a Plane
A plane in R3 through a point P (with position vector p⃗) and with normal vector ⃗n ≠ ⃗0 consists of all points ⃗x satisfying ⃗n · (⃗x − p⃗) = 0; expanding gives the scalar equation n1 x1 + n2 x2 + n3 x3 = ⃗n · p⃗.
Example 6: Find the equation of the plane that contains the point P (2, 3, −1) with normal ⃗n = [1 −4 1]^T.
Example 7: Find a normal vector to the plane with scalar equation 5x1 − 6x2 + 7x3 = 11.
Example 8: Construct the scalar equation of the plane that contains the point P (2, 4, −1) and is parallel to
the plane 2x1 + 3x2 − 5x3 = 6.
1.3.5 The Cross Product
Definition: The Cross Product
Let ⃗u and ⃗v be vectors in R3. The cross product is defined to be ⃗u × ⃗v = [u2 v3 − u3 v2, u3 v1 − u1 v3, u1 v2 − u2 v1]^T.
Let ⃗u and ⃗v be vectors in R3 . The cross product of them is orthogonal to both ⃗u and ⃗v .
Example 9: Find the scalar equation of the plane through three points P (1, 0, 1), Q(0, 1, −1) and R(2, 1, 0).
Property Name
⃗x × ⃗y = −⃗y × ⃗x Anti-Commutativity
⃗x × ⃗x = ⃗0 Self-Degenerate
1.4 Projections and Minimum Distance
1.4.1 Projections
Definition: Projection
proj⃗v ⃗u = ((⃗u · ⃗v)/∥⃗v∥^2) ⃗v   and   perp⃗v ⃗u = ⃗u − proj⃗v ⃗u
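A minimal computational sketch of these two formulas (the vectors ⃗u and ⃗v below are made up; numpy is an assumed helper library):

import numpy as np

u = np.array([2.0, 0.0, 1.0])
v = np.array([1.0, 1.0, 1.0])
proj = ((u @ v) / (v @ v)) * v   # proj_v(u) = (u . v / ||v||^2) v
perp = u - proj                  # perp_v(u) = u - proj_v(u)
print(proj, perp, perp @ v)      # the last value is 0: perp is orthogonal to v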
Example 1: Let ⃗u = [1 −2 0]^T and ⃗v = [3 1 2]^T. Determine proj⃗v ⃗u and perp⃗v ⃗u.
Property Name
1.4.2 Projections and Minimal Distance
Theorem: Distance from a Point to a Line and Distance from a Point to a Plane
Example 2: Find the distance between the point Q(4, 3) and the line that runs through the points P (1, 2)
and R(0, 3).
Example 3: Find the point on the plane x1 − 2x2 + 2x3 = 5 that is closest to the point Q(2, 1, 1). Hint: In the previous diagram note that R⃗ = Q⃗ + Q⃗R = Q⃗ + proj⃗n Q⃗P.
Example 4: Find the point on the line ⃗x(t) = [1 −2 3]^T + t [1 0 1]^T that is closest to the point Q(1, 3, 1).
Chapter 2
2.1 Systems of Linear Equations and Elimination
2.1.1 Linear Equations
Definition: Linear Equations
A linear equation in n variables x1 , ..., xn is an equation that can be written in the form
a1 x1 + a2 x2 + · · · + an xn = b
where the constants a1 , ..., an are called the coefficients of the equation, and b is just a constant without
any particular name. The variables x1 , ..., xn are unknown and usually to be solved for.
The standard procedure for solving a system of linear equations is elimination. When dealing with a system of linear equations there are three actions that you are allowed to use in producing a solution:
1. Interchange two equations.
2. Multiply an equation by a non-zero constant.
3. Add a multiple of one equation to another equation.
Example 1: Use elimination (with back-substitution) to find all solutions of the following system of linear
equations
x1 + x2 − 2x3 = 4
x1 + 3x2 − x3 = 7
2x1 + x2 − 5x3 = 7
Definition: Gaussian Elimination with Back-Substitution and Solutions
The solution procedure introduced above is known as Gaussian elimination with back-substitution.
At each step, the entry you are using to eliminate all other entries is called a pivot.
To any system of linear equations the only possibilities are a unique solution, infinite solutions, or no solution.
In the case of infinite solutions, the variables that are chosen to be independent of the other variables are
called the free variables. The values chosen for the free variables are called the parameters. The final
result where each variable is dependent on the parameters is called the general solution.
Example 2: Use elimination to find all solutions to the following system of linear equations
2x1 + x2 + 5x3 = 0
x2 − 3x3 = 1
x1 + x2 + x3 = 2
2.1.2 Matrix Representations of a System
Note: Systems of Equations as a Matrix
We may simplify the prior notation when solving a system of linear equations to that of a spreadsheet of
information. That is, we can drop the variables and replace them as cells containing the coefficients. This
helps simplify matters when solving, e.g.
a11 x1 + a12 x2 = b1        ⇐⇒        [ a11  a12 | b1 ]
a21 x1 + a22 x2 = b2                  [ a21  a22 | b2 ]
Here the entry aij represents the coefficient in the i'th row and j'th column. We represent this as [ A | ⃗b ].
A rectangular array of data is called a matrix. A system of the form [ A | ⃗b ] is called an Augmented
Matrix of the system of corresponding linear equations. The matrix A is called the Coefficient Matrix.
Actions of elimination when applied to a matrix are called Row Operations. Performing these actions is
called Row Reducing instead of eliminating. Each resulting matrix when applying these operations is called
Row Equivalent. In reducing, it is VERY WRONG to write an equal sign instead of an arrow.
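As a sketch of how this bookkeeping can be checked by machine (sympy and the system below are assumptions for illustration, not part of the course procedure), an augmented matrix can be row reduced directly:

from sympy import Matrix

# A made-up augmented matrix [ A | b ]; rref() returns the reduced row echelon
# form together with the indices of the pivot columns.
M = Matrix([[1, 2, 1, 3],
            [2, 5, 3, 8],
            [1, 1, 0, 1]])
R, pivots = M.rref()
print(R)
print(pivots)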
Example 3: By row reducing an augmented matrix (following the procedure of Gaussian elimination) find
the general solution of the system
x1 + x2 + x3 = 5
2x1 + 4x3 = 4
−3x2 + 3x3 = −9
2.1.3 Row Echelon Form and Consistency
Definition: Row Echelon Form
A matrix is in Row Echelon Form (REF) if...
When all entries in a row are zero, this row appears below all non-zero rows.
When two non-zero rows are compared, the first non-zero entry, called the leading entry, in the
upper row is to the left of the leading entry in the lower row (i.e. the first non-zero entries cascade
from left to right as you move down rows).
Example 4: Reduce the following system to row echelon form and determine all solutions.
x1 + x2 = 1
x2 + x3 = 2
x1 + 2x2 + x3 = −2
If a system has at least one solution we say it is consistent. If a system does not have any solutions we
say it is inconsistent.
Suppose that the augmented matrix [ A | ⃗b ] is reduced to REF. There are three possibilities:
1. Some row has the form [ 0 · · · 0 | c ] with c ≠ 0: the system is inconsistent.
2. The system is consistent and every variable corresponds to a leading entry: there is a unique solution.
3. The system is consistent and some variable has no leading entry: that variable is free and there are infinitely many solutions.
2.2 Reduced REF, Rank, and Homogeneous Systems
2.2.1 Reduced Row Echelon Form
Note: Gauss-Jordan Elimination
When performing Gaussian elimination, if we choose to use the j-th equation to eliminate the j-th variable from all other rows (not just the rows below it), then we obtain a new form of elimination called Gauss-Jordan elimination. The benefit of row reducing in this manner is that it avoids the need for back-substitution at the end.
Example 1: Use Gauss-Jordan elimination to solve the following system:
x1 + 2x2 + 2x3 = 4
x1 + 3x2 + 3x3 = 5
2x2 + x3 = −2
Definition: Reduced Row Echelon Form
A matrix is in Reduced Row Echelon Form (RREF) if:
1. It is in row echelon form.
2. All of its leading entries are 1 (called leading 1's).
3. In a column with a leading 1, all other entries in the column are zero.
Theorem
For any given matrix, there is a unique matrix in reduced row echelon form that is row equivalent to it.
2.2.2 Rank of a Matrix
Note: Leading 1’s as an Indicator of Solution Types
It is clear that the number of pivots is an indicator of what type of solution a system of linear equations has.
In the case of RREF, pivots are replaced by leading 1’s. Also, if the system is consistent, the number of piv-
ots is a good indicator of home many free variables there are and thus it seems important enough to define it.
1. The system is consistent if and only if the rank of the coefficient matrix A is equal to the rank of the
augmented matrix [ A | ⃗b ]. That is, rank(A) = rank([A|⃗b])
2. If the system is consistent, then the number of parameters in the general solution (i.e. number of free
variables) is given by n − rank(A).
Let [ A | ⃗b ] be a system of m linear equations in n variables. Then [ A | ⃗b ] is consistent for all ⃗b ∈ Rm if and only if rank(A) = m.
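A small sketch of the consistency test in part 1 above (the matrix and right-hand side are made up; numpy is an assumed helper library):

import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])
b = np.array([[3.0],
              [1.0]])
augmented = np.hstack([A, b])          # [ A | b ]
print(np.linalg.matrix_rank(A) == np.linalg.matrix_rank(augmented))
# False here: rank(A) = 1 but rank([A|b]) = 2, so this system is inconsistent.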
2.2.3 Homogeneous Linear Equations
Definition: Homogeneous Systems
A linear equation is homogeneous if the "right-hand side" is zero (specifically, when the equation is written in the standard form we usually write it in). A system of linear equations is homogeneous if all of the equations in the system are homogeneous.
2x1 + x2 = 0
Example 3: Find a general solution of the homogeneous system x1 + x2 − x3 = 0 .
−x2 + 2x3 = 0
A homogeneous system is always consistent. It always has either infinite solutions or a unique solution. If
the solution is unique, it is always the trivial solution.
2.3 Application to Spanning and Linear Independence
2.3.1 Spanning Problems
Example 1: Determine whether the vector ⃗v = [−2 −3 1]^T is in the set Span{ [1 1 1]^T, [1 −1 5]^T, [2 1 4]^T, [−1 −3 3]^T }.
The linear combination ⃗v = t1 w⃗1 + · · · + tn w⃗n is equivalent to a system of linear equations where ⃗v is the "Right-Hand-Side" and the vector w⃗i gives the coefficients of the i'th column. For example,
[3 4]^T = t1 [−2 1]^T + t2 [−1 5]^T   ⇐⇒   [ −2  −1 | 3 ]
                                            [  1   5 | 4 ]
Example 2: Consider Span{ [1 2 1 1]^T, [1 1 3 1]^T, [3 5 5 3]^T }. Find a homogeneous system of linear equations that defines this set. That is, let ⃗x be an arbitrary vector in the spanning set and find homogeneous conditions on the components for this vector to be in the set.
Example 3: Show that every vector ⃗v ∈ R3 can be written as a linear combination of the vectors [1 0 0]^T, [2 2 1]^T, and [−4 6 5]^T. Hint: There's an indirect and computationally easy way to do this. Remember that a system [ A | ⃗b ] has a solution for every ⃗b ∈ Rm if rank(A) = m.
A set of k vectors {v⃗1 , ..., v⃗k } in Rn spans Rn if and only if the rank of the coefficient matrix of the system
t1 v⃗1 + · · · + tk v⃗k = ⃗v is n.
Let {v⃗1 , ..., v⃗k } be a set of k vectors in Rn . If Span({v⃗1 , ..., v⃗k }) = Rn then k ≥ n.
2.3.2 Linear Independence Problems
Example 4: Determine whether the set { [1 −1]^T, [2 0]^T, [1 1]^T } is linearly independent or dependent.
If {v⃗1 , ..., v⃗k } is a collection of vectors in Rn with k > n then the collection is linearly dependent.
Example 5: Determine whether the set { [1 2 1]^T, [−2 1 0]^T, [1 1 1]^T } is linearly independent or dependent.
A set of vectors {v⃗1 , ..., v⃗k } in Rn is linearly independent if and only if the rank of the coefficient matrix of
the homogeneous system t1 v⃗1 + · · · + tk v⃗k = ⃗0 is k. Furthermore, if {v⃗1 , ...v⃗k } is a linearly independent set
of vectors in Rn then k ≤ n.
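A sketch of this rank test (the vectors below are made up and chosen to be dependent; numpy is an assumed helper):

import numpy as np

v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([0.0, 1.0, 1.0])
v3 = np.array([1.0, 1.0, 2.0])            # v3 = v1 + v2
A = np.column_stack([v1, v2, v3])         # the vectors as columns of the coefficient matrix
print(np.linalg.matrix_rank(A) == 3)      # False: the rank is 2, so the set is dependent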
2.3.3 Bases of Subspaces
Example 6: Prove that { [1 1 2]^T, [5 −2 2]^T, [−2 3 1]^T } is a basis for R3. Hint: all this can be argued from the rank.
A set of vectors {v⃗1 , ..., v⃗n } is a basis for Rn if and only if the rank of the coefficient matrix t1 v⃗1 +· · ·+tn v⃗n = ⃗v
is n.
Example 7: Show that { [1 2 −1]^T, [1 1 1]^T } is a basis for the plane −3x1 + 2x2 + x3 = 0.
If S is a non-trivial subspace of Rn with a basis containing k vectors, then we say that the dimension of
S is k and write dim(S) = k.
Chapter 3
Matrices, Linear Mappings, and Inverses
3.1 Operations on Matrices
3.1.1 Equality, Addition and Scalar Multiplication of Matrices
Note: Matrices in the Abstract
In this chapter we wish to understand the underlying structure of matrices mathematically, without reference to a system of linear equations. Therefore, most matrices in what follows are simply arrays of data, not necessarily augmented and not necessarily corresponding to a system of equations.
Size of a Matrix: We say that A is a size m × n matrix when A has m rows and n columns. If we
want to subtly state or imply the size of a m × n matrix we will write Am×n .
Equality of Matrices: Two matrices A and B are defined to be equal if and only if they have the
same size and their corresponding entries are equal. That is, if aij = bij for 1 ≤ i ≤ m, 1 ≤ j ≤ n.
Entries of a Matrix: Sometimes we denote the entries of a matrix as (A)ij or aij . We sometimes
denote the full matrix as A or [aij ].
Square Matrices: A size n × n matrix (where the number of rows and columns are equal) is called
a square matrix.
Example 1: Construct a square 3 × 3 matrix with entries given by δij = 1 if i = j, and δij = 0 if i ≠ j.
A square matrix U is said to be upper triangular if the entries beneath the main diagonal are all zero,
that is, if uij = 0 whenever i > j. A square matrix L is said to be lower triangular if the entries above
the main diagonal are all zero, that is, if lij = 0 whenever i < j. A square matrix D such that dij = 0 if
i ̸= j is called a diagonal matrix. We denote an n × n diagonal matrix by D = diag(d11 , d22 , ..., dnn ).
Definition: Matrix Addition and Scalar Multiplication
Property Name
There exists a matrix Om×n such that O + M = M for all Mm×n Existence of zero matrix
For all Mm×n there exists a (−M )m×n such that M + (−M ) = O Existence of additive inverse
Definition: Span of Matrices
Let B = {A1 , ..., Al } be a set of m × n matrices. Then B is said to be linearly independent if the only solution to the equation t1 A1 + · · · + tl Al = Om×n is t1 = · · · = tl = 0. Otherwise, there is a non-trivial solution, for which we say B is linearly dependent.
Example 4: Determine if B = { [1 0; 1 1], [1 2; 0 1], [−1 −3; 1 −1] } is linearly dependent or independent.
3.1.2 Transpose of a Matrix
Definition: Transpose of a Matrix
Let A be an m × n matrix. Then the transpose of A is the n × m matrix denoted AT , whose ij-th entry
is the ji-th entry of A. That is, (AT )ij = (A)ji .
Example 5: Let A = [−1 6 0 2; 4 2 −1 3]. Compute 3A^T.
3.1.3 Matrix Multiplication
Definition: Matrix Multiplication
Let B be an m × k matrix with rows ⃗b1^T, ..., ⃗bm^T and let A be a k × n matrix with columns ⃗a1, ..., ⃗an. Then we define BA to be the m × n matrix whose ij-th entry is (BA)ij = ⃗bi · ⃗aj.
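A short sketch of this definition, checking the entry-by-entry dot products against the built-in product (the matrices are made up; numpy is an assumed helper):

import numpy as np

B = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [3.0, 1.0]])               # 3 x 2
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 1.0, 4.0]])          # 2 x 3
BA = np.zeros((B.shape[0], A.shape[1]))
for i in range(B.shape[0]):
    for j in range(A.shape[1]):
        BA[i, j] = B[i, :] @ A[:, j]     # (BA)_ij = (row i of B) . (column j of A)
print(np.allclose(BA, B @ A))            # True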
Example 6: If possible, perform the following operations. If it is not possible, explain why.
(a) [2 3 0 1; 4 −1 2 −1] [3 1; 1 2; 2 3; 0 5]      (b) [3 1; 1 2; 2 3; 0 5] [2 3 0 1; 4 −1 2 −1]
The previous example shows that AB and BA are not generally equal. Don’t ever assume that they are.
3.1.4 Properties of Matrix Multiplication and the Identity Matrix
Theorem: Properties of Matrix Multiplication
If A, B and C are matrices of the correct size so that the required products are defined, and t ∈ R, then...
Property Name
The n × n matrix I = diag(1, 1, ..., 1) is called the identity matrix. We sometimes denote this matrix by In to emphasize its size.
Example 7: Compute I2^2. Also compute [4 −1; 12 −3]^2, just to show that matrices follow their own sets of rules. All matrices satisfying A^2 = A are called idempotent matrices.
3.2 Matrix Mappings and Linear Mappings
3.2.1 Matrix Mappings
Definition: Matrix Mappings
For any m × n matrix A we define fA : Rn → Rm given by ⃗x 7→ A⃗x and call it the matrix mapping
associated to A.
Example 1: Let A = [2 2; 1 0; 0 −3; −2 5]. State the domain and codomain of fA. Then, compute fA(1, 2).
Let ⃗e1, ⃗e2, ..., ⃗en be the standard basis vectors of Rn, let A be an m × n matrix, and let fA : Rn → Rm be the corresponding matrix mapping. Then, for any vector ⃗x = [x1 · · · xn]^T we have
fA(⃗x) = x1 fA(⃗e1) + x2 fA(⃗e2) + · · · + xn fA(⃗en)
Let A be an m × n matrix with corresponding matrix mapping fA : Rn → Rm . Then, for any ⃗x, ⃗y ∈ Rn and
any t ∈ R, we have
Property Name
3.2.2 Linear Mappings
Definition: Linear Mappings
A function L : Rn → Rm is called a linear mapping (or linear transformation) if for every ⃗x, ⃗y ∈ Rn and
t ∈ R it satisfies the following properties:
Property Name
A linear operator is a linear mapping whose domain and codomain are the same. In particular, the
operator defined by Id : Rn → Rn given by ⃗x 7→ ⃗x, i.e. Id(⃗x) = ⃗x, is called the identity mapping.
Example 2: Prove that the mapping f : R2 → R2 defined by f (x1 , x2 ) = (3x1 − x2 , 2x1 ) is a linear operator
by directly showing it satisfies the linearity conditions.
If L : Rn → Rm is a linear mapping, then L can be represented as a matrix mapping, with the corresponding m × n matrix [L] given by [L] = [ L(⃗e1) L(⃗e2) · · · L(⃗en) ].
Example 3: Find [f ] of the previous example and represent the function as a matrix mapping.
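The theorem above is easy to mechanize: build [L] column by column from the images of the standard basis vectors. The operator g below is a made-up example (not the f of Example 2), and numpy is an assumed helper:

import numpy as np

def g(x):
    x1, x2 = x
    return np.array([x1 + 2.0 * x2, -x2])      # a made-up linear operator on R^2

e = np.eye(2)
G = np.column_stack([g(e[:, 0]), g(e[:, 1])])  # [g] = [ g(e1)  g(e2) ]
x = np.array([3.0, 1.0])
print(np.allclose(G @ x, g(x)))                # True: [g]x reproduces g(x)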
3.2.3 Compositions and Linear Combination of Mappings
Theorem: Closure of Linear Mappings
Property Name
Theorem
Let L : Rn → Rm , M : Rn → Rm , and N : Rm → Rp be linear mappings and t ∈ R. Then
1. [L + M ] = [L] + [M ]
2. [tL] = t [L]
3. [N ◦ L] = [N ][L]
Example: Let L and M be linear operators on R2 defined by L(x1 , x2 ) = (2x1 + x2 , x1 ) and M (x1 , x2 ) =
(x2 , x1 ). Compute M ◦ L, [M ◦ L], and [M ][L].
3.3 Geometrical Transformations
Note: Matrix Mappings as Geometric Transforms
As with projection, we have learned that some linear mappings have geometric significance. We will be
building a toolkit of various representative matrices [L] of operators that have such geometric significance.
> Reflections
The reflection matrices about the x1-axis and the x2-axis are given by
[Rx1] = [1 0; 0 −1]   and   [Rx2] = [−1 0; 0 1]
respectively.
Example 1: Let ⃗u = [1 1]^T. Transform the vector by shearing it along the x1-axis by a factor of 3, scale it by a factor of 2, then rotate it counter-clockwise about the origin by an angle of π/2. Loosely graph the resulting vector at each step.
3.3.2 Common Transformation Matrices in R3
The rotation matrices about the three coordinate axes in R3 are
[Rθ,x1] = [1 0 0; 0 cos(θ) −sin(θ); 0 sin(θ) cos(θ)];   [Rθ,x2] = [cos(θ) 0 sin(θ); 0 1 0; −sin(θ) 0 cos(θ)];   [Rθ,x3] = [cos(θ) −sin(θ) 0; sin(θ) cos(θ) 0; 0 0 1]
3.3.3 General Reflections
Theorem: General Reflections in Rn
Example 2: Find the matrix representation of the mapping that reflects any point in R3 about the plane through the origin with the normal vector ⃗n = [1 3 −2]^T.
3.4 Special Subspaces for Systems and Mappings
3.4.1 The Four Fundamental Subspaces
Definition: The Four Fundamental Subspaces
Nullspace: The collection of all ⃗x ∈ Rn such that A⃗x = ⃗0 is called the Nullspace of A, denoted
Null(A). It is also sometimes called the Kernel of L and denoted Ker(L).
Left-Nullspace: The collection of all x⃗∗ ∈ Rm such that AT x⃗∗ = ⃗0 is called the Left Nullspace of
A, and we simply denote it Null(AT ) (i.e. it is the nullspace of AT ). It may also be thought of as the
kernel of the mapping associated to AT .
Columnspace: The collection of all ⃗y ∈ Rm such that A⃗x = ⃗y for some ⃗x ∈ Rn is called the
Columnspace of A, denoted Col(A) (i.e. it is the span of the columns of A). It is also sometimes
called the Range of L and denoted Range(L).
Rowspace: The collection of all y⃗∗ ∈ Rn such that AT x⃗∗ = y⃗∗ for some x⃗∗ ∈ Rm is called the
Rowspace of A, denoted Row(A) (i.e. it is the span of the rows of A). It may also be thought of as
the range of the linear mapping associated to AT .
Let A be a matrix. The sets Null(A), Col(A), Row(A) and Null(AT ) are subspaces.
Example 1: Consider the matrix A = [1 1 1 1; 2 3 4 4]. Describe conditions that a vector must satisfy (i.e. as a system of equations that the components must satisfy) for it to be in (a) the Nullspace, Null(A); and (b) the Left-Nullspace, Null(A^T).
Example 2: Consider the matrix A = [1 1; 2 1; 1 3]. Describe (a) the Columnspace of A (i.e. Col(A)) as a spanning set of vectors; and (b) the Rowspace of A (i.e. Row(A)) as a spanning set of vectors. Lastly, is the spanning set described in (b) a basis for Row(A)?
Example 3: Suppose that L is a linear mapping with matrix A = [1 1; 2 1; 1 3]. Determine whether ⃗c = [1 3 −1]^T and d⃗ = [2 1 9]^T are in the range of L. Hint: Form [ A | ⃗c | d⃗ ] to kill two birds with one stone.
As one may see, a vector ⃗b being in the range of L, or columnspace of A, is the same as the system A⃗x = ⃗b
being consistent.
3.4.2 Bases for Row(A), Col(A), and Null(A)
Note: Bases for Row(A), Col(A), and Null(A)
The bases for three of the four fundamental subspaces may be obtained simply by reducing a matrix A to
RREF and interpreting the results. We summarize this in a theorem below.
Let A be a matrix.
Basis of the Nullspace: The spanning set for the general solution of the homogeneous A⃗x = ⃗0
obtained by the method of Gauss-Jordan elimination (to RREF) is a basis for Null(A).
Basis of the Columnspace: Let B be RREF of A. Then, the columns of A that correspond to the
columns of B with leading 1’s form a basis for the columnspace of A.
Basis of the Rowspace: Let B be the RREF of A, then the non-zero rows of B form a basis of
Row(A).
For any matrix A, Rank(A) = dim(Row(A)) = dim(Col(A)). This follows from the prior theorem.
Example 4: Find a basis for the rowspace, columnspace and nullspace of A = [1 2 3; −1 3 2; 0 1 1].
Example 5: Use the results of the prior exercise to verify the Rank-Nullity Theorem.
3.5 Inverse Matrices and Inverse Mappings
3.5.1 Inverse Matrices
Definition: Inverse of a Matrix
Let A be an n × n matrix (square). If there exists an n × n matrix B such that AB = I = BA, then A is
said to be invertible, and B is called the inverse of A (and vice versa A is the inverse of B). The inverse
of A is denoted A−1 .
Since matrix multiplication does not commute, it is VERY WRONG to write A−1 as 1/A.
Suppose that A and B are invertible matrices and that t ̸= 0 is a real number. Then...
Property Name
(tA)^(-1) = (1/t) A^(-1)   Inverse of Scalar Multiple
3.5.2 A Procedure for Finding the Inverse of a Matrix and Solving Systems of Equations
Theorem: Algorithm For Finding An Inverse
Row reduce the multi-augmented matrix [ A | I ] so that the left block is in reduced row echelon form. If the left block reduces to I, then the right block is A^(-1); otherwise A is not invertible.
The reason why this works is that you are solving the equation AB = I for B. You do this column by
column. Letting ⃗bi represent the i’th column of B you are solving the equation A⃗bi = ⃗ei for each i. This
translates to solving [A|⃗e1 |⃗e2 | · · · |⃗en ] ⇔ [A|I].
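A sketch of the procedure on a made-up matrix (numpy is an assumed helper; the loop assumes no row swaps are ever needed, i.e. every pivot is non-zero):

import numpy as np

A = np.array([[2.0, 1.0],
              [5.0, 3.0]])
n = A.shape[0]
M = np.hstack([A, np.eye(n)])            # the multi-augmented matrix [ A | I ]
for j in range(n):
    M[j] = M[j] / M[j, j]                # scale row j to get a leading 1
    for i in range(n):
        if i != j:
            M[i] = M[i] - M[i, j] * M[j] # eliminate the rest of column j
print(M[:, n:])                          # the right block is now the inverse of A
print(np.allclose(M[:, n:], np.linalg.inv(A)))   # True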
Example 2: Provided it exists, determine the inverse of A = [1 1; 4 3].
3.5.3 Solving Systems of Equations A⃗x = ⃗b Using the Inverse
Using matrix algebra, we may solve a system of equations A⃗x = ⃗b by multiplying both sides on the left by the inverse: ⃗x = A^(-1)⃗b.
Example 3: Solve the system of equations x1 + x2 = 3, 4x1 + 3x2 = −2 using the results of the previous example.
3.5.4 Inverse Linear Mappings and the Inverse Matrix Theorem
Definition: Inverse Linear Mappings
3.6 Elementary Matrices
3.6.1 Elementary Matrices and Row Operations
Definition: Elementary Matrices
A matrix that can be obtained from the identity matrix by a single elementary row operation is called an
elementary matrix.
If A is an n × n matrix and E is the elementary matrix obtained from In by a certain elementary row opera-
tion, then the product EA is the matrix obtained from A by performing the same elementary row operation.
As such, there is a sequence of elementary matrices E1 , E2 , ..., Ek such that the product Ek · · · E2 E1 A yields the RREF of A.
Example 1: Let E be the elementary matrix obtained by performing the row operation R1 + (−2)R2 → R1 on I. Let A = [5 −1; 1 3] and demonstrate the prior theorem by computing EA.
Example 2: Let A = [1 2 1; 2 4 4]. Find a sequence of elementary matrices E1, ..., Ek such that Ek · · · E1 A is the RREF of A.
3.6.2 Representing A and A−1 as a Product of Elementary Matrices
Note: Inverse of Elementary Matrices
Constructing the inverse of an elementary matrix is easy. If you have an elementary matrix E obtained from I through some row operation R, you may obtain E^(-1) by applying to I the inverse row operation R^(-1), i.e. whatever row operation brings E back to I.
Let A be an invertible matrix, then by the inverse matrix theorem there exists a sequence of elementary
matrices where Ek · · · E1 A = I. Consequently A−1 = Ek · · · E1 and A = (Ek · · · E1 )−1 = E1−1 · · · Ek−1 .
Example 3: Let A = [0 3 0; 1 1 0; 0 −2 1]. Write A and A^(-1) as a product of elementary matrices.
3.7 LU -Decomposition
3.7.1 Constructing the LU -Decomposition
Definition: LU -Decomposition
If A is an n × n matrix that can be row reduced to REF WITHOUT SWAPPING ROWS, then there exists
an upper triangular matrix U and lower triangular matrix L such that A = LU .
Suppose that A is a matrix that can be reduced to REF without swapping rows. The following procedure
constructs the LU decomposition:
The above procedure works because Ek · · · E1 A = U . You must then have that A = (E1−1 · · · Ek−1 )U and
thus we obtain the corresponding desired L.
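A sketch of this construction on a made-up matrix (numpy is an assumed helper; no row swaps are performed, matching the assumption of the theorem):

import numpy as np

A = np.array([[ 2.0,  4.0, -2.0],
              [ 4.0,  9.0, -3.0],
              [-2.0, -3.0,  7.0]])
n = A.shape[0]
U = A.copy()
L = np.eye(n)
for j in range(n):
    for i in range(j + 1, n):
        L[i, j] = U[i, j] / U[j, j]      # the multiplier m in R_i - m R_j -> R_i
        U[i] = U[i] - L[i, j] * U[j]
print(np.allclose(L @ U, A))             # True: A = LU, L lower and U upper triangular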
Example 1: Find an LU -decomposition of B = [2 1 −1; −4 3 3; 6 8 −3].
3.7.2 Using LU -Decomposition to Solve Systems
Suppose that A has an LU -decomposition given by A = LU . Then the following procedure solves the system A⃗x = LU⃗x = ⃗b for ⃗x:
1. Let ⃗y = U⃗x and solve L⃗y = ⃗b for ⃗y by forward-substitution.
2. Solve U⃗x = ⃗y for ⃗x by back-substitution.
Since you are not allowed to row swap when performing LU -decomposition, it is then recommended to
interchange the position of any equations in a system of equations before forming the matrix (if required to
do so).
Chapter 4
Vector Spaces
4.1 Spaces of Polynomials
4.1.1 Polynomial Vector Space
Definition: Polynomial Space and the Standard Basis of Polynomial Space
The collection of polynomials of degree at most n is denoted Pn . The collection of polynomials given by
{1, x, x2 , ..., xn } is called the monomial basis for Pn .
Property Name
There is a poly. o(x) such that o(x) + s(x) = s(x) for all poly. s(x) Existence of zero
For each s(x) there exists a −s(x) such that s(x) + (−s(x)) = o(x) Existence of inverse
4.1.2 Linear Combinations, Spans, and Linear Dependence/Independence
Definition: Linear Combination of Polynomials
Given polynomials p1(x), ..., pn(x) and scalars t1, ..., tn, the expression t1 p1(x) + · · · + tn pn(x) is called a linear combination of p1(x), ..., pn(x).
Let B = {p1(x), ..., pk(x)} be a set of polynomials of degree at most n. Then the span of B is defined as Span(B) = { t1 p1(x) + · · · + tk pk(x) | t1, ..., tk ∈ R }.
Definition: Linear Dependence and Independence of Polynomials
The set B = {p1 (x), ..., pk (x)} is said to be linearly independent if the only solution to the equation
t1 p1 (x) + · · · + tk pk (x) = 0
is t1 = · · · = tk = 0; otherwise, there is a solution where not all ti are zero, for which B is said to be linearly
dependent.
4.2 Vector Spaces
4.2.1 Vector Spaces
Definition: Vector Spaces
We refer the reader to the appendix for the requirements of an algebraic space to be called a Vector Space.
Standard examples of vector spaces include:
1. The space Rn equipped with standard vector addition and scalar multiplication.
2. The space M (m, n) of all size m × n matrices equipped with matrix addition and scalar multiplication.
3. The space Pn of polynomials of degree at most n equipped with polynomial addition and scalar multiplication.
4. The space of functions F(a, b) = {f | f : (a, b) ⊆ R → R} equipped with function addition and scalar multiplication.
Example 1: Consider the space R2 with addition defined to be standard vector addition ⊕ = +, and scalar
multiplication defined by k ⊙ (x, y) = (ky, kx). By finding a counterexample, show that this is not a vector space.
Example 2: Consider the space R with addition defined to be x ⊕ y = x2 + y + 1 and scalar multiplication to
be defined by standard multiplication ⊙ = ·. By finding a counterexample, show that this is not a vector space.
Example 3: Consider R+ = (0, ∞) the space of positive real numbers. Define addition on this space to be
x ⊕ y = xy and scalar multiplication to be s ⊙ x = xs . Prove that this is a vector space.
We will call the space R+ equipped with x ⊕ y = xy and s ⊙ x = x^s the Exponential Space and denote it E.
Theorem: Inverse and Zero Properties of Vector Spaces
1. 0 ⊙ x = 0 for all x ∈ V
2. (−1) ⊙ x = −x for all x ∈ V
3. t ⊙ 0 = 0 for all t ∈ R
Example 4: Demonstrate the prior theorem holds for E, the exponential space.
Example 5: Let V = {(a, b) | a ∈ R, b ∈ R+ }. Define addition in this space to be (a, b) ⊕ (c, d) = (ad + bc, bd) and scalar multiplication to be t ⊙ (a, b) = (t a b^(t−1), b^t). Given this is a vector space, determine the zero vector and the additive inverse of any vector using the prior theorem.
4.2.2 Subspaces
Definition: Subspaces
A subset U of a vector space V that is itself a vector space under the same operations is called a subspace of V. Consequently, since U ⊆ V, you may think of a subspace as a smaller vector space sitting inside another vector space.
Example 7: Prove that the set U = {A ∈ M (2, 2) | a11 + a22 = 0} is a subspace of M (2, 2).
4.3 Bases and Dimensions
4.3.1 Linear Combinations, Spans and Bases
Theorem: Spanning Sets as Subspaces
If {v1 , ..., vk } is a set of vectors in a vector space V and S is the set of all possible linear combinations of these vectors, then S is a subspace of V.
If S is a subspace of the vector space V consisting of all linear combinations of vectors v1 , ..., vk ∈ V, then S
is called the subspace spanned by B = {v1 , ..., vk }, and we say that the set B spans S. The set B is called
a spanning set for the subspace S. We denote S = Span({v1 , ..., vk }) = Span(B).
If B = {v1 , ..., vk } is a set of vectors in a vector space V, then B is said to be linearly independent if the
only solution to the equation
(t1 ⊙ v1 ) ⊕ · · · ⊕ (tk ⊙ vk ) = 0
is t1 = · · · = tk = 0; otherwise, there is a non-trivial solution and we say B is said to be linearly dependent.
Let B = {v1 , ..., vn } be a spanning set for a vector space V. Then every vector in V can be expressed in a
unique way as a linear combination of the vectors of B if and only if the set B is linearly independent.
A set B of vectors in a vector space V is a basis if it is a linearly independent spanning set for V.
Example 1: Prove that the set B = { [1 2; −1 1], [0 1; 3 1], [2 5; 1 3] } is not a basis for the subspace Span(B).
Note: As a reminder, if we don’t specify the operations or vector space, you can assume they are the standard
operations from the standard space. Also, hint... 2A + B − C = O.
4.3.2 Determining a Basis of a Subspace
Note: Finding a Basis
Finding a basis of a subspace can be quite difficult. The technique is to first (somehow) determine a spanning
set, then reduce it to or prove that it is, a linearly independent set. Finding the spanning set is the creative
step, while turning a spanning set into a basis is quite procedural. One technique is to associate the span
as the range of some matrix, then discern a basis using our results on the basis of the columnspace of that
matrix.
Example 2: Determine a basis for the subspace S = {p(x) ∈ P2 | p(1) = 0} of P2 . Hint: Every element in this
space can be written as p(x) = (x − 1)(ax + b) for some constants a and b.
4.3.3 Dimension
Definition
If B = {v1 , ..., vn } and C = {u1 , ..., uk } are both bases of a vector space V, then k = n. If a vector space
V has a basis with n vectors, then we say that the dimension of V is n and write dim(V) = n. If a vector
space V does not have a basis with finitely many elements, then V is called infinite-dimensional. The
dimension of the trivial vector space V = {0} is defined to be 0.
Example 4: Determine the dimension of the vector space Span(T ) of the previous example.
3. If dim(V) = n, then a set with n elements of V is a spanning set for V if and only if it is linearly independent.
Example 5: Let C = { [1 1; 0 1], [−2 −1; 1 1], [1 0; 1 1] }. Extend C to a basis for M (2, 2). Hint: We've worked with M (2, 2) and know that dim(M (2, 2)) = 4. Thus, you need only find an additional element not in the span of these three matrices.
4.4 Coordinates with Respect to a Basis
4.4.1 Bases
Definition: Coordinate Vectors
Suppose that B = {v1 , ..., vn } is a basis for the vector space V. If x ∈ V with
Example 2: The collection B = {1, x, 1 + x2 } is a basis of P2 . Find the B-coordinates of p(x) = 2 + x + 3x2 .
4.4.2 Change-of-Basis
Theorem: Linearity of Basis Representations
Let B be a basis for a finite dimensional vector space V. Then, for any x, y ∈ V and s, t ∈ R we have [(s ⊙ x) ⊕ (t ⊙ y)]B = s[x]B + t[y]B.
This means that the function gB : V → Rn given by gB (v) = [v]B is a linear function. In the event that
V = Rn this means that we should be able to find a representing matrix [gB ].
Consider a general vector space V with two bases B and C = {w1 , ..., wn }. Write any x ∈ V in terms of C as x = (x1 ⊙ w1 ) ⊕ · · · ⊕ (xn ⊙ wn ); that is, [x]C = [x1 x2 · · · xn]^T. Taking B-coordinates gives [x]B = x1 [w1]B + · · · + xn [wn]B.
Let B and C = {w1 , ..., wn } both be bases for a vector space V. The matrix
P = [w1 ]B · · · [wn ]B
is called the change of coordinates matrix from C-coordinates to B-coordinates and satisfies
[x]B = P [x]C
and is called the change of coordinates equation. Often an emphatic notation PBC is used.
Let B and C both be bases for a finite-dimensional vector space V. Let P be the change of coordinates
matrix from C-coordinates to B-coordinates. Then, P is invertible and P −1 is the change of coordinates
matrix from B-coordinates to C-coordinates.
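For the concrete case V = Rn this can be sketched numerically (the basis and the vector below are made up; numpy is an assumed helper). If the columns of R hold the B-basis vectors and the columns of Q hold the C-basis vectors, both written in standard coordinates, then [x]B = R^(-1)x and [x]C = Q^(-1)x, so the change of coordinates matrix from C to B is P = R^(-1)Q:

import numpy as np

R = np.array([[2.0, 1.0],
              [1.0, 1.0]])              # columns: the B-basis vectors
Q = np.eye(2)                           # C is taken to be the standard basis here
P = np.linalg.inv(R) @ Q                # change of coordinates matrix from C to B
x = np.array([3.0, 5.0])                # a vector in standard (= C) coordinates
print(P @ x)                            # its B-coordinates: [-2.  7.]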
Note: Efficiently Obtaining PBC
Example 1: Earlier we considered the bases B = { [1 −1]^T, [1 1]^T } and E = { [1 0]^T, [0 1]^T } for R2. The change of basis matrix from E to B is PB^E = [ [[1 0]^T]B  [[0 1]^T]B ]. Demonstrate the prior note by setting up the systems of equations for [[1 0]^T]B and [[0 1]^T]B and solving for them. Demonstrate the change-of-basis is consistent with Example 1 by computing PB^E ⃗a where ⃗a = [3 −2]^T.
Example 2: Let C = {1, x, x^2} be the standard basis of P2 and let B = {1 − x^2, x, −2 + x^2}. Find the change of coordinates matrix from B to C. Then, use the inverse to find the change of coordinates matrix from C to B.
4.5 General Linear Mappings
4.5.1 General Linearity Conditions
Definition: Linear Mappings
If V and W are vector spaces over R, a function L : V → W is a linear mapping if it satisfies the linearity
properties
Property Name
for all x, y ∈ V; t ∈ R; ⊕V and ⊙V are the operations of addition and scalar multiplying respectively on V;
and ⊕W and ⊙W are the operations of addition and scalar multiplying respectively on W. If V = W, then
L may be called a linear operator.
Example 1: Let L : M (2, 2) → P1 be defined by L(A) = a21 + (a12 + a22 )x. Prove that L is a linear mapping.
Example 2: Define L : E → R by L(x) = ln(x). Prove that L is a linear function. Note: The function is most
certainly not linear if it is mapping L : R+ → R as usually done in calculus!
4.5.2 The General Rank-Nullity Theorem
Definition: The Range and Nullspace/Kernel of a General Linear Mapping
Let V and W be vector spaces over R. The range of a linear mapping L : V → W is defined to be the set
Range(L) = {L(x) ∈ W | x ∈ V}
The nullspace ( or kernel) of L is the set of all vectors in V whose image under L is the zero vector 0W .
We write
Null(L) = {x ∈ V | L(x) = 0W }
Let V and W be vector spaces and let L : V → W be a linear mapping. Then, Null(L) is a subspace of V
and Range(L) is a subspace of W.
Example 3: Consider the linear mapping L : M (2, 2) → P2 given by L(B) = b21 + (b12 + b22 )x + b11 x2 .
Determine whether 1 + x + x2 ∈ Range(L), and if it is, determine a matrix A such that L(A) = 1 + x + x2 .
Afterwards, determine the nullspace of L (as a spanning set).
Let V and W be vector spaces and let L : V → W be a linear mapping. Then, L(0V ) = 0W .
Example 4: Demonstrate the prior theorem is true using L : E → R given by L(x) = ln(x) of Example 2.
Example 5: Determine a basis for the range and nullspace of the linear mapping L : P1 → R3 defined by L(a + bx) = [0 0 a − 2b]^T.
Let V and W be vector spaces over R. The rank of a linear mapping L : V → W is the dimension of the
range of L, that is, rank(L) = dim(Range(L)).
Let V and W be vector spaces over R. The nullity of a linear mapping L : V → W is the dimension of
the nullspace of L, that is, nullity(L) = dim(Null(L)).
Let V and W be vector spaces over R with dim(V) = n, and let L : V → W be a linear mapping. Then,
rank(L) + nullity(L) = n
4.6 Matrix of a Linear Mapping
4.6.1 The Matrix of L with Respect to the Basis B
Note: The Representing Matrix of a Mapping in Another Basis
In this subsection we are concerned with the representing matrix [L] of L in another basis B. Specifically, we
want everything stated in the language of B with nothing in the language of the standard basis. That is, for
an input [⃗x]B the output is [L(⃗x)]B and we seek a matrix A such that [L(⃗x)]B = A[⃗x]B , which we will call [L]B .
Let B = {⃗v1 , ..., ⃗vn } be a basis for Rn and let L : Rn → Rn be a linear operator. Then, for any ⃗x ∈ Rn , we can write ⃗x = b1⃗v1 + · · · + bn⃗vn . Thus by linearity, L(⃗x) = b1 L(⃗v1) + · · · + bn L(⃗vn).
Let V be a vector space. Suppose that B = {v1 , ..., vn } is any basis for V and that L : V → V is a linear
operator. Define the matrix of the linear operator L with respect to the basis B to be the matrix
[L]B = [L(v1 )]B · · · [L(vn )]B
where we have for any x ∈ V, [L(x)]B = [L]B [x]B .
Example 1: Let L : R2 → R2 be given by L(x1 , x2 ) = (x2 , x1 ) and let B = { [1 −1]^T, [1 1]^T }. Determine [L]B.
4.6.2 Change of Coordinates and Linear Mappings
Note: Another Way to Obtain [L]B
Using the prior note, you may extend the logic to find the matrix representation of the linear mapping if the input is in basis B and the output is in basis C. This is simply [L]C^B = PC^S [L] PS^B (where S denotes the standard basis), which is very useful.
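For the operator case V = Rn this gives a quick numerical check (the operator and basis below are made up; numpy is an assumed helper): if P has the B-basis vectors as its columns in standard coordinates, then [L]B = P^(-1)[L]P.

import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 3.0]])              # [L] in the standard basis
P = np.array([[1.0, 1.0],
              [0.0, 1.0]])              # columns: the B-basis vectors
L_B = np.linalg.inv(P) @ A @ P
print(L_B)                              # [[1. 0.] [0. 3.]] -- diagonal in this basis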
Example 2: Let L be the linear mapping with [L] = A = [2 1 3; −1 2 2; −2 3 1] and let B = { [1 1 0]^T, [1 1 1]^T, [0 1 1]^T }. Determine [L]B.
Chapter 5
Determinants
5.1 Determinants in Terms of Cofactors
5.1.1 The 2 × 2 Case
Note: Consistency of a 2 × 2 System
When solving a system of equations A⃗x = ⃗b, we know from the inverse theorem that the consistency of the
system depends entirely on the coefficient matrix and not ⃗b. Specifically, what conditions are these? We
may talk about rank, and other equivalencies, but in this section we develop a new apparatus for measuring
consistency.
One may solve such a system (in general) to obtain the following result:
a11 x1 + a12 x2 = b1
a21 x1 + a22 x2 = b2
   =⇒   x1 = (a22 b1 − a12 b2)/(a11 a22 − a12 a21),   x2 = (a11 b2 − a21 b1)/(a11 a22 − a12 a21)
and thus a condition is that we must have that a11 a22 − a12 a21 ̸= 0.
It's simple to remember the above by the 'criss-cross' pattern: |a11, a12; a21, a22| = a11 a22 − a21 a12. You take the product of the main diagonal and subtract the product of the off-diagonal.
Example 1: Compute the determinants of [1 2; −3 −7] and [2 4; 4 8]. Which matrices A will always have A⃗x = ⃗b be consistent for every ⃗b ∈ R2?
5.1.2 The 3 × 3 Case
Consistency of a 3 × 3 System and the Pattern
One may derive a condition for consistency required in a 3 × 3 system through brute force just like the prior
subsection. As one expects, the condition is much messier and given by
|A| = a11 a22 a33 − a11 a23 a32 − a12 a21 a33 + a12 a23 a31 + a13 a21 a32 − a13 a22 a31 ̸= 0.
There is an easier way to memorize this. We may arrange this as:
|A| = a11 a22 a33 − a11 a23 a32 − a12 a21 a33 + a12 a23 a31 + a13 a21 a32 − a13 a22 a31
= a11 (a22 a33 − a23 a32 ) − a12 (a21 a33 − a23 a31 ) + a13 (a21 a32 − a22 a31 )
= a11 · |a22 a23; a32 a33| + a12 · (−1) · |a21 a23; a31 a33| + a13 · |a21 a22; a31 a32|
This might still seem like a mess to memorize until we relate it back to the original matrix. Notice that every sub-determinant (and its coefficient) can be obtained by moving across the top row and, at each entry, crossing out the row and column containing that entry, then multiplying the entry by the determinant of the remaining 2 × 2 sub-matrix:
crossing out row 1 and column 1 gives the term a11 · |a22 a23; a32 a33|;
crossing out row 1 and column 2 gives the term a12 · (−1) · |a21 a23; a31 a33|;
crossing out row 1 and column 3 gives the term a13 · |a21 a22; a31 a32|.
So the idea is to move across the top row in the alternating sign pattern + → − → +, taking the appropriate sub-determinants as we move across. We denote these sub-determinants by the following definition.
Definition: The Cofactors of a 3 × 3 Matrix
Let A be a 3 × 3 matrix. Let A(i, j) denote the 2 × 2 sub-matrix obtained from A by deleting the i-th row
and j-th column. Define the cofactors of a 3×3 matrix to be Cij = (−1)^(i+j) |A(i, j)|.
Example 2: Let A = [2 −1 3; 0 4 −1; 1 −2 3]. Compute C11 and C23.
The determinant of a 3×3 matrix A is defined by |A| = a11 C11 + a12 C12 + a13 C13 .
Example 3: Let A = [2 −1 3; 0 4 −1; 1 −2 3]. Compute |A|. Next, compute a11 C11 + a21 C21 + a31 C31 (the expansion down column 1 instead).
Theorem: The Order of Determinant Expansion Is Equal Along Any Row or Column
Cofactor expansion along any row or column of a 3 × 3 matrix yields the same determinant value.
5.1.3 General Cofactor Expansion
Note: The Pattern of Determinant Expansion is Consistent
The pattern described in the 3 × 3 continues when defining the determinant in greater square size matrices.
Let A be an n × n matrix. Let A(i, j) denote the (n − 1) × (n − 1) matrix obtained from A by deleting the
i-th row and j-th column. The cofactors of an n × n matrix are defined to be Cij = (−1)^(i+j) |A(i, j)|.
Theorem: The Order of Determinant Expansion is Equal Along Any Row or Column
Cofactor expansion along any row or column of an n × n matrix yields the same determinant value.
Example 4: Compute the determinant |0 0 3 0; 0 5 6 0; −2 3 0 4; −5 1 2 3|.
Example 5: Calculate the determinant of [3 2 0 −1; 0 0 0 0; 2 1 4 1; 3 −1 0 1].
Example 6: Calculate the determinant of [4 2 1 −1; 0 2 2 2; 0 0 −1 3; 0 0 0 4].
If A is an n × n upper or lower triangular matrix (which includes diagonal matrices) then the determinant
of A is the product of the diagonal entries of A. That is, |A| = a11 a22 · · · ann .
5.2 Elementary Row Operations and the Determinant
5.2.1 Determinant Operations
> Scaled Rows or Columns
Note: Common Factors in a Determinant Row or Column
The first way that we may simplify is by noticing a common factor in a row or column. Suppose that B is obtained from A by multiplying a single row (or column) of A by a factor r. Then |B| = r|A|, so a common factor may be pulled out of any one row or column.
Example 1: Given that |a, b; c, d| = 5, compute |a, −4b; −7c, 28d|.
Suppose that A is an n × n matrix and that B is the matrix obtained from A by swapping two rows (or
columns). Then, |B| = −|A|.
Example 2: Given that |a, b, c; d, e, f; g, h, i| = −3, compute |3e, d, f; 3h, g, i; 6b, 2a, 2c|.
> Adding Multiples of a Row or Column
Theorem: Adding Multiples of a Row or Column to Another Doesn’t Change the Determinant
Suppose that A is an n×n matrix and that B is obtained from A by adding r times the i-th row (or column)
of A to the k-th row (or column resp.). Then, |B| = |A|.
Example 3: Given that |a, b, c; d, e, f; g, h, i| = 7, compute |h, g, i; e + 2h, d + 2g, f + 2i; 2b − 3e, 2a − 3d, 2c − 3f|.
Example 4: Compute |1 3 1 5; 1 3 −3 −3; 0 3 1 0; 1 6 2 11| by reducing it to an upper triangular matrix.
5.2.2 Determinant Properties
Note: Determinants and the Inverse Matrix Theorem
Since |A| ̸= 0 is intrinsically tied to the (general) consistency of a system, it holds a spot on the Inverse
Matrix Theorem. Specifically, we pay special attention to the fact that a matrix is invertible if and only if
the determinant is non-zero.
Property Name
5.3 Inverse by Cofactors and Cramer’s Rule
5.3.1 Inverse by Cofactors
Definition: The Adjugate Matrix
Let A be a size n × n matrix. The cofactor matrix of A is the matrix of cofactors, denoted cof(A) = [Cij ].
The adjugate of A is the matrix adj(A) =cof(A)T .
Provided |A| ≠ 0, the inverse is given explicitly by A^(-1) = (1/|A|) adj(A). The advantage of this formulation, albeit a computational nuisance, is that it gives an explicit formula for the inverse; before, we only had a procedure for constructing the inverse without a formula.
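A sketch of this formula, built cofactor by cofactor and checked against the built-in inverse (numpy is an assumed helper; the matrix is the one from Example 1 below):

import numpy as np

A = np.array([[2.0,  4.0, -1.0],
              [0.0,  3.0,  1.0],
              [6.0, -2.0,  5.0]])
n = A.shape[0]
cof = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        minor = np.delete(np.delete(A, i, axis=0), j, axis=1)    # the sub-matrix A(i, j)
        cof[i, j] = (-1) ** (i + j) * np.linalg.det(minor)       # the cofactor C_ij
adj = cof.T                                                      # adj(A) = cof(A)^T
print(np.allclose(adj / np.linalg.det(A), np.linalg.inv(A)))     # True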
Example 1: Use the method of cofactors to construct the inverse of the matrix [2 4 −1; 0 3 1; 6 −2 5].
5.3.2 Cramer’s Rule
Theorem: Cramer’s Rule
Let A be an invertible size n × n matrix and consider the system A⃗x = ⃗b. Let Ni be the matrix obtained from A by replacing the i'th column of A by ⃗b. Then xi = |Ni| / |A|.
Solve for JUST x3 and then find all values of k such that x3 = 0.
Chapter 6
6.1 Eigenvalues and Eigenvectors
6.1.1 Eigenvalues and Eigenvectors of a Mapping
Definition: Eigenvalues and Eigenvectors of a Linear Mapping
Suppose that L : Rn → Rn is a linear operator. A non-zero vector ⃗v ∈ Rn such that L(⃗v ) = λ ⃗v is called an
eigenvector of L; the scalar λ is called an eigenvalue of L. The pair λ, ⃗v is called an eigenpair. This
terminology remains unchanged in the context of matrices where [L] = A and we consider A⃗v = λ⃗v .
Example 1: Let ⃗v = [1 −1]^T and consider the projection mapping proj⃗v : R2 → R2. Find two eigenvectors with distinct eigenvalues.
We wish to find non-zero solutions λ and ⃗v to the system A⃗v = λ⃗v . Suppose that λ is known, then the
system A⃗v = λ⃗v is equivalent to the Homogeneous System (A − λI)⃗v = ⃗0. We know that the only way
to avoid the trivial solution to such a system is if the solution to the system is not unique. We may obtain
such a condition by A − λI not being invertible, that is, if |A − λI| = 0.
Let A be an n × n matrix. The function given by C(λ) = |A − λI| is called the characteristic polynomial.
The equation given by C(λ) = 0 (or rather |A − λI| = 0) is called the characteristic equation.
Example 2: Find all eigenvalues of A = [2 2; 1 3].
Definition: Eigenspace
Let λ be an eigenvalue of an n × n matrix A. Then the set containing the zero vector and all eigenvectors
of A corresponding to λ is called the eigenspace of λ. In particular, Eλ = Null(A − λI).
Let A be an n × n matrix. To solve the eigenvalue problem A⃗v = λ⃗v for all solutions, complete the following...
1. Form the characteristic equation |A − λI| = 0 and solve for each λ; proceed to the next step if you’re
also finding the eigenvectors...
2. For each λ obtained in the previous step form the homogeneous system (A − λI)⃗v = ⃗0.
3. Find the general solution of each system in the previous step to find the eigenspace.
4. Find a basis for each of the previous solution sets to determine the corresponding eigenvectors.
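The same four steps can be carried out numerically with NumPy. The matrix below is chosen only for illustration (any small matrix works the same way), and np.linalg.eig would perform all of the steps in a single call; the version here follows the procedure above.

import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# Step 1: the eigenvalues are the roots of the characteristic equation |A - lambda*I| = 0
eigenvalues = np.linalg.eigvals(A)

# Steps 2-4: for each eigenvalue, find a basis for the eigenspace Null(A - lambda*I)
for lam in eigenvalues:
    M = A - lam * np.eye(2)
    _, s, Vt = np.linalg.svd(M)
    basis = Vt[s < 1e-10]               # rows spanning the (numerical) null space of M
    print("lambda =", lam, "eigenspace basis:", basis)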
Example 3: Solve the eigenvalue problem associated to A = $\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$.
(Continued...)
(2) The total number of roots (real and complex, counting repetitions) is n.
(3) Complex roots of the equation occur in “conjugate pairs,” so that the total number of complex roots
must be even.
(5) If the entries of A are integers, since the leading coefficient of the characteristic polynomial is ±1, any
rational root must in fact be an integer (by the rational roots theorem).
Suppose that λ1 , ..., λk are distinct (λi ̸= λj ) eigenvalues of an n × n matrix A, with corresponding eigen-
vectors ⃗v1 , ..., ⃗vk , respectively. Then {⃗v1 , ..., ⃗vk } is linearly independent.
6.1.3 Algebraic and Geometric Multiplicity
Definition: Geometric and Algebraic Multiplicity
Let A be an n × n matrix with eigenvalue λ. The algebraic multiplicity of λ is the number of times λ is
repeated as a root of the characteristic polynomial. The geometric multiplicity of λ is the dimension of
the eigenspace of λ.
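Both multiplicities are easy to compute numerically. A NumPy sketch for an illustrative matrix with a repeated eigenvalue (the matrix is made up, not taken from the notes):

import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 2.0]])              # characteristic polynomial (2 - lambda)^2

lam = 2.0
# algebraic multiplicity: how many times lam occurs as a root of the characteristic polynomial
alg_mult = int(np.sum(np.isclose(np.linalg.eigvals(A), lam)))
# geometric multiplicity: dim Null(A - lam*I) = n - rank(A - lam*I)
geo_mult = A.shape[0] - np.linalg.matrix_rank(A - lam * np.eye(2))

print(alg_mult, geo_mult)               # 2 and 1: this eigenvalue is deficient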
6.2 Diagonalization
6.2.1 Similar Matrices
Definition: “Similarity” of Matrices
If A and B are n × n matrices such that S −1 AS = B for some invertible matrix S, then A and B are said
to be similar. We denote this by A ∼ B.
Definition
Let A be a square n × n matrix. We define the trace of A to be the sum of its diagonal elements. That is,
tr(A) = a11 + a22 + · · · + ann = $\sum_{i=1}^{n} a_{ii}$
Suppose that A ∼ B. Then A and B share, among other properties, the following:
2. Eigenvalues
4. Trace
Example 1: Consider the similar matrices A = $\begin{bmatrix} 8 & 6 \\ -8 & -6 \end{bmatrix}$ ∼ B = $\begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix}$ related by
$\begin{bmatrix} 8 & 6 \\ -8 & -6 \end{bmatrix} = \begin{bmatrix} 4 & 3 \\ 1 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 4 & 3 \\ 1 & 1 \end{bmatrix}.$
Demonstrate that the properties of the prior theorem hold for A and B.
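As a numerical check of two of the shared properties (eigenvalues and trace), a short NumPy sketch using the matrices of Example 1:

import numpy as np

A = np.array([[8.0, 6.0],
              [-8.0, -6.0]])
B = np.array([[2.0, 0.0],
              [0.0, 0.0]])

print(np.sort(np.linalg.eigvals(A)), np.sort(np.linalg.eigvals(B)))   # both are {0, 2}
print(np.trace(A), np.trace(B))                                       # both equal 2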
6.2.2 Diagonalization
Definition: Diagonalizable Matrices
If there exists an invertible matrix P and a diagonal matrix D such that P −1 AP = D (i.e. A is similar to
a diagonal matrix), then we say that A is diagonalizable and that the matrix P diagonalizes A to its
diagonal form D.
Suppose that A has n linearly independent eigenvectors ⃗v1 , ..., ⃗vn with corresponding eigenvalues λ1 , ..., λn , and let P = [⃗v1 · · · ⃗vn ]. Then
$AP = A\begin{bmatrix} \vec{v}_1 & \cdots & \vec{v}_n \end{bmatrix} = \begin{bmatrix} \lambda_1\vec{v}_1 & \cdots & \lambda_n\vec{v}_n \end{bmatrix} = \begin{bmatrix} \vec{v}_1 & \cdots & \vec{v}_n \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \lambda_n \end{bmatrix} = PD.$
Thus, P^{-1}AP = D, or rather, A = PDP^{-1}.
Consequently, a matrix is diagonalizable if and only if every eigenvalue has its geometric multiplicity equal
to its algebraic multiplicity — i.e. no eigenvalues are deficient. In particular, if all n eigenvalues of the
matrix are distinct, then A is diagonalizable.
Example 2: Show that the following matrix A = $\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$ is not diagonalizable.
Hint: Refer back to an earlier example to save on time.
Example 3: Diagonalize, if possible, the following matrix A = $\begin{bmatrix} 0 & 3 & -2 \\ -2 & 5 & -2 \\ -2 & 3 & 0 \end{bmatrix}$.
Hint: Perform the determinant operation R3 + (−R2 ) → R3 and expand on the third row.
6.3 Powers of Matrices and the Markov Process
6.3.1 (Large) Powers of Matrices
Theorem: Powers of Diagonal Matrices
Let A and B be n × n matrices such that A = SBS^{-1} (i.e. A ∼ B). Then for any integer m ≥ 1 we have
A^m = SB^m S^{-1}.
Example 1: For the matrix B = $\begin{bmatrix} 1 & 4 \\ 2 & 3 \end{bmatrix}$ the diagonalization is given by
$B = PDP^{-1} \implies \begin{bmatrix} 1 & 4 \\ 2 & 3 \end{bmatrix} = \begin{bmatrix} -2 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} -1 & 0 \\ 0 & 5 \end{bmatrix} \begin{bmatrix} -1/3 & 1/3 \\ 1/3 & 2/3 \end{bmatrix}.$
Compute B^2 directly and indirectly (using diagonalization). Then, find a closed formula for B^m for any general
m ≥ 1.
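The point of the theorem is that powers of D are trivial to compute. A NumPy sketch using the diagonalization above, checked against direct matrix multiplication:

import numpy as np

B = np.array([[1.0, 4.0],
              [2.0, 3.0]])
P = np.array([[-2.0, 1.0],
              [1.0, 1.0]])
D = np.diag([-1.0, 5.0])

m = 10
Bm = P @ np.diag(np.diag(D) ** m) @ np.linalg.inv(P)   # B^m = P D^m P^{-1}
print(np.allclose(Bm, np.linalg.matrix_power(B, m)))   # True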
Let A be a square matrix with diagonalization A = PDP^{-1}. The root A^{1/2} is defined provided that all the
eigenvalues of A are non-negative, and is given by A^{1/2} = PD^{1/2}P^{-1}, where D^{1/2} is obtained by taking
the square root of each diagonal element of D.
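A short NumPy sketch of the square-root construction, using an illustrative symmetric matrix whose eigenvalues (1 and 3) are non-negative:

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigenvalues, P = np.linalg.eig(A)                  # columns of P are eigenvectors of A
root = P @ np.diag(np.sqrt(eigenvalues)) @ np.linalg.inv(P)
print(np.allclose(root @ root, A))                 # True: (A^{1/2})^2 = A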
6.3.2 Markov Process
Example 2: Smith and Jones are the only competing suppliers of communication services in their community.
At present, they each have a 50% share of the market. However, Smith has recently upgraded his service and a
survey indicates that from one month to the next, 90% of Smith’s customers remain loyal, while 10% switch to
Jones. On the other hand, 70% of Jones’ customers remain loyal and 30% switch to Smith. If this goes on for six
months, how large are their market shares? If this goes on for a long time (∞), how big will Smith’s share become?
Hint: The eigenpairs of T = $\begin{bmatrix} 0.9 & 0.3 \\ 0.1 & 0.7 \end{bmatrix}$ are λ1 = 1, ⃗v1 = $\begin{bmatrix} 3 \\ 1 \end{bmatrix}$ and λ2 = 3/5, ⃗v2 = $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$.
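A NumPy sketch of the iteration in Example 2: starting from the 50/50 split, apply the transition matrix month by month, then compare with a very long run.

import numpy as np

T = np.array([[0.9, 0.3],
              [0.1, 0.7]])
s = np.array([0.5, 0.5])                 # (Smith, Jones) initial market shares

for month in range(6):
    s = T @ s
print("shares after 6 months:", s)

# long-run behaviour: the shares approach the fixed state, i.e. the lambda = 1
# eigenvector rescaled so that its components sum to 1
print("shares after 200 months:", np.linalg.matrix_power(T, 200) @ np.array([0.5, 0.5]))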
Definition: Markov Matrices
An n × n matrix T is the Markov matrix (or transition matrix) of an n-state Markov process if
The column vector inputs and outputs of the transition matrix, whose components sum to 1, are called the
states of the Markov Process.
The eigenvector of a transition matrix with eigenvalue λ = 1, whose components sum to 1, is called the
fixed (or invariant) state of the transition matrix.
Consider a Markov process with Markov matrix T . Then the following properties hold:
(P4) Suppose that for some m all the entries in T m are non-zero. Then all the eigenvalues of T except for
λ1 = 1 satisfy |λi | < 1. In this case, for any initial state ⃗s, T m⃗s → ⃗s∗ as m → ∞ where ⃗s∗ is the
eigenstate associated to λ1 = 1.
Make a guess ⃗x0 and normalize to obtain x̂0 . Construct ⃗x1 = Ax̂0 and normalize to obtain x̂1 . Repeat the
process (Ax̂i = ⃗xi+1 ) and eventually x̂m → ⃗v as m → ∞ where ⃗v is the eigenvector associated to the largest
eigenvalue of A. Compute A⃗v and factor out ⃗v to find the largest eigenvalue.
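This is the power method. A minimal NumPy sketch of the iteration just described, applied to a matrix chosen only for illustration:

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

x = np.array([1.0, 0.0])                 # initial guess x_0
for _ in range(50):
    x = x / np.linalg.norm(x)            # normalize to get x-hat ...
    x = A @ x                            # ... then multiply by A and repeat

v = x / np.linalg.norm(x)                # approximate dominant eigenvector
print(v)
print(v @ (A @ v))                       # since v is a unit vector, this is ~ the largest eigenvalue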
Chapter 7
Orthonormal Bases
7.1 Orthonormal Bases and Orthogonal Matrices
7.1.1 Orthonormal Bases
Definition: Orthogonal Vectors
Example 2: Normalize the set of orthogonal vectors in the previous example to form an orthonormal set.
7.1.2 Coordinates with Respect to an Orthonormal Basis
Theorem: Coordinates with Respect to an Orthogonal Basis
In particular, if the basis is orthonormal, i.e. B = {v̂1 , ..., v̂n }, then Compv̂k (⃗x) = ⃗x · v̂k and ⃗x = (⃗x · v̂1 )v̂1 +
· · · + (⃗x · v̂n )v̂n .
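As a concrete check, the following NumPy sketch computes coordinates with respect to an orthonormal basis of R2; the basis and the vector are made up for illustration.

import numpy as np

v1 = np.array([1.0, 1.0]) / np.sqrt(2)      # an orthonormal basis {v1, v2} of R^2
v2 = np.array([1.0, -1.0]) / np.sqrt(2)
x = np.array([3.0, 5.0])

c1, c2 = x @ v1, x @ v2                     # Comp_{v_k}(x) = x . v_k
print(c1, c2)
print(np.allclose(c1 * v1 + c2 * v2, x))    # True: x is rebuilt from its coordinates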
7.1.3 Orthogonal Matrices
Note: Orthogonal Matrices
Therefore P^T P = I and so P^T acts as an inverse to P. Since the inverse is unique, this means that
P^T = P^{-1}.
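A quick NumPy check of this property, using a rotation matrix as a standard example of an orthogonal matrix:

import numpy as np

t = np.pi / 6                                    # rotation by 30 degrees
P = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t), np.cos(t)]])

print(np.allclose(P.T @ P, np.eye(2)))           # True: P^T P = I
print(np.allclose(P.T, np.linalg.inv(P)))        # True: P^T = P^{-1}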
7.2 Projections and the Gram-Schmidt Procedure
7.2.1 Complementary Subspaces
Definition: Orthogonal Complements
Note: Basis of W ⊥
1. S ∩ S ⊥ = {⃗0} (i.e. the only element in both S and S ⊥ is the zero vector)
3. If B = {v̂1 , ..., v̂k } is an orthonormal basis for S and B⊥ = {v̂k+1 , ..., v̂n } is an orthonormal basis for
S ⊥ , then B ∪ B⊥ = {v̂1 , ..., v̂k , v̂k+1 , ..., v̂n } is an orthonormal basis for Rn .
7.2.2 Projection on Subspaces
Definition: Subspace Projection
Example 7: Let B = $\left\{ \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}, \begin{bmatrix} -1 \\ 1 \\ -1 \end{bmatrix} \right\}$ be an orthogonal basis for S and let ⃗x = $\begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix}$. Determine
projS (⃗x) and perpS (⃗x).
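The projection formula for an orthogonal basis translates directly into NumPy. The sketch below uses the data of Example 7, so it can serve as a check on the hand computation.

import numpy as np

b1 = np.array([1.0, 2.0, 1.0])       # the orthogonal basis vectors of S from Example 7
b2 = np.array([-1.0, 1.0, -1.0])
x = np.array([2.0, 1.0, 3.0])

# with an orthogonal basis, proj_S(x) is the sum of the projections onto each basis vector
proj = (x @ b1) / (b1 @ b1) * b1 + (x @ b2) / (b2 @ b2) * b2
perp = x - proj
print(proj, perp)
print(b1 @ perp, b2 @ perp)          # both ~0: perp_S(x) is orthogonal to S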
7.2.3 The Gram-Schmidt Procedure
Theorem: The Gram-Schmidt Procedure
If {⃗v1 , ⃗v2 , ..., ⃗vk } is a linearly independent set of vectors in S then there exists an orthogonal set of vectors
{w⃗1 , w⃗2 , ..., w⃗k } in S such that
w⃗1 = ⃗v1
w⃗2 = ⃗v2 − projw⃗1 (⃗v2 )
w⃗3 = ⃗v3 − projw⃗1 (⃗v3 ) − projw⃗2 (⃗v3 )
⋮
w⃗k = ⃗vk − projw⃗1 (⃗vk ) − · · · − projw⃗k−1 (⃗vk )
As we only care about the geometry of the objects involved, we may replace any w⃗j at each step with any
scalar multiple of it, thereby drastically reducing the algebra involved.
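The procedure translates line for line into code. A minimal NumPy sketch (no normalization is performed, so rescale the resulting w's if an orthonormal set is wanted; the input vectors are made up for illustration):

import numpy as np

def gram_schmidt(vectors):
    ws = []
    for v in vectors:
        w = v.astype(float)
        for u in ws:
            w = w - (v @ u) / (u @ u) * u     # subtract proj_u(v)
        ws.append(w)
    return ws

vs = [np.array([1.0, 1.0, 0.0]),              # an illustrative independent set in R^3
      np.array([1.0, 0.0, 1.0]),
      np.array([0.0, 1.0, 1.0])]
for w in gram_schmidt(vs):
    print(w)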
Note: Relaxation of the Conditions Required in Gram-Schmidt
The conditions in the Gram-Schmidt procedure may be slightly relaxed. Specifically, all you need is a
spanning set: discarding any zero vectors produced along the way, the Gram-Schmidt procedure naturally
turns a spanning set into an orthogonal basis.
Example 9: Use the Gram-Schmidt procedure to find an orthonormal basis of the subspace S ⊂ R3 given by
S = Span$\left\{ \begin{bmatrix} 0 \\ -1 \\ 2 \end{bmatrix}, \begin{bmatrix} 3 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 3 \\ 1 \\ -1 \end{bmatrix} \right\}$.
Appendix A
A.1 The Inverse Matrix Theorem
Theorem: The Inverse Matrix Theorem
Suppose that L : Rn → Rn is a linear operator with representative matrix A = [L]. Then, the following
statements are equivalent to each other:
1. A is invertible.
2. rank(A) = n.
3. nullity(A) = 0.
4. The RREF of A is I.
5. For all ⃗b ∈ Rn , the system A⃗x = ⃗b is consistent and has a unique solution.
10. L is invertible.
11. Range(L) = Rn .
23. Null(A)⊥ = Rn .
A.2 Vector Spaces and Subspaces
A.2.1 Vector Space Requirements
Definition: Vector Spaces
A vector space over R is a set V together with an operation ⊕ (called addition and denoted x ⊕ y for
any x, y ∈ V), and an operation ⊙ (called scalar multiplication and denoted s ⊙ x for any x ∈ V and
s ∈ R) such that for any x, y, z ∈ V and s, t ∈ R we have all of the following properties:
V3 (x ⊕ y) ⊕ z = x ⊕ (y ⊕ z) Associativity of add.
V9 t ⊙ (x ⊕ y) = (t ⊙ x) ⊕ (t ⊙ y) Distributivity of add.
A.2.2 Subspace Requirements
Definition: Subspaces
Suppose that V is a vector space. A non-empty subset U ⊆ V is a subspace of V if it satisfies the following
properties:
S0 0∈U Non-Empty