MATH 257 Lecture Notes
ILLINOIS
Department of Mathematics
Table of Contents

Module 1: Introduction to Linear Systems
Module 2: Matrices and Linear Systems
Module 3: Echelon forms of matrices
Module 4: Gaussian elimination
Module 5: Linear combinations
Module 6: Matrix vector multiplication
Module 7: Matrix multiplication
Module 8: Properties of matrix multiplication
Module 9: Elementary matrices
Module 10: Inverse of a matrix
Module 11: Computing an inverse
Module 12: LU decomposition
Module 13: Solving using LU decomposition
Module 14: Spring-mass systems
Module 15: Inner products and orthogonality
Module 16: Subspaces of Rn
Module 17: Column spaces and Nullspaces
Module 18: Abstract vector spaces
Module 19: Linear independence
Module 20: Basis and Dimension
Module 21: The four fundamental subspaces
Module 22: Graphs and adjacency matrices
Module 23: Orthogonal complements
Module 24: Coordinates
Module 25: Orthonormal basis
Module 26: Linear Transformations
Module 27: Coordinate matrix
Module 28: Determinants
Module 29: Cofactor expansion
Module 30: Eigenvectors and eigenvalues
Definition. A linear equation is an equation of the form
a1 x1 + . . . + an xn = b
where a1 , ..., an , b are numbers and x1 , ..., xn are variables.
Example. Which of the following equations are linear equations (or can be rearranged to
become linear equations)?
Solution.
4x1 − 5x2 + 2 = x1
x2 = 2(√6 − x1) + x3
4x1 − 6x2 = x1x2
x2 = 2√x1 − 7
Definition. A linear system is a collection of one or more linear equations involving the
same set of variables, say, x1 , x2 , ..., xn .
A solution of a linear system is a list (s1 , s2 , ..., sn ) of numbers that makes each equation in
the system true when the values s1 , s2 , ..., sn are substituted for x1 , x2 , ..., xn , respectively.
Example. Two equations in two variables:
x1 + x2 = 1 (I)
−x1 + x2 = 0. (II)
What is a solution for this system of linear equations?
Solution.
Example. Does every system of linear equations have a solution?
x1 − 2x2 = −3 (III)
2x1 − 4x2 = 8. (IV)
Solution.
Example. How many solutions are there to the following system?
x1 + x2 = 3 (V)
−2x1 − 2x2 = −6 (VI)
Solution.
Theorem 1. A linear system has either
one unique solution or no solution or infinitely many solutions.
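Theorem 1's trichotomy can be illustrated numerically. The following sketch (not part of the original notes; the function name `classify` is mine) compares the rank of the coefficient matrix with the rank of the augmented matrix:

```python
import numpy as np

def classify(A, b):
    """Classify the linear system Ax = b as having no solution,
    a unique solution, or infinitely many solutions."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float).reshape(-1, 1)
    r = np.linalg.matrix_rank(A)
    r_aug = np.linalg.matrix_rank(np.hstack([A, b]))
    if r < r_aug:
        return "no solution"
    return "unique solution" if r == A.shape[1] else "infinitely many solutions"

print(classify([[1, 1], [-1, 1]], [1, 0]))   # system (I)-(II)
```

Running it on systems (I)-(II), (III)-(IV) and (V)-(VI) above reproduces the three possible outcomes.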
Definition. The solution set of a linear system is the set of all solutions of the linear system.
Two linear systems are equivalent if they have the same solution set.
Example. Consider
x1 − 3x2 = 1 (VII)
−x1 + 5x2 = 3 (VIII)
Transform this linear system into another easier equivalent system.
Solution.
LINEAR ALGEBRA
Matrices and Linear Systems
Definition. An m × n matrix is a rectangular array of numbers with m rows and n columns.
Example. Let’s give a few examples.
Solution.
Remark. Indeed, every row operation is reversible. We already saw how to reverse the
replacement operation. The scaling operation R2 → cR2 (with c ≠ 0) is reversed by the scaling
operation R2 → (1/c)R2. The row interchange R1 ↔ R2 is reversed by performing it again.
Definition. Two matrices are row equivalent, if one matrix can be transformed into the
other matrix by a sequence of elementary row operations.
Theorem 2. If the augmented matrices of two linear systems are row equivalent, then the
two systems have the same solution set.
LINEAR ALGEBRA
Echelon forms of matrices
Definition. A matrix is in echelon form (or row echelon form) if
1. All nonzero rows (rows with at least one nonzero element) are above any rows of all zeros.
2. the leading entry (the first nonzero number from the left) of a nonzero row is always
strictly to the right of the leading entry of the row above it.
Example. Are the following matrices in echelon form? Circle the leading entries.

a)
[ 3 1 2 0 5 ]
[ 0 2 0 1 4 ]
[ 0 0 0 0 0 ]
[ 0 0 0 0 0 ]

b)
[ 0 2 0 1 4 ]
[ 3 1 2 0 5 ]
[ 0 0 0 0 0 ]
[ 0 0 0 0 0 ]

c)
[ 2 −2 3   ]
[ 0  5 0   ]
[ 0  0 5/2 ]

d)
[ 0 1 √3 ]
[ 0 0 2  ]
[ 0 0 0  ]
Definition. A matrix is in row reduced echelon form (or: reduced echelon form, or:
RREF) if it is in echelon form and
3. The leading entry in each nonzero row is 1.
4. Each leading entry is the only nonzero entry in its column.
Example. Are the following matrices in reduced echelon form?

a)
[ 0 1 3 0 0  2 5 0 0 6 ]
[ 0 0 0 0 1 −3 4 0 0 5 ]
[ 0 0 0 0 0  0 0 1 0 0 ]
[ 0 0 0 0 0  0 0 0 1 1 ]

b)
[ 1 0 −2 3 2 −24 ]
[ 0 1 −2 2 0  −7 ]
[ 0 0  0 0 1   4 ]
Theorem 3. Each matrix is row-equivalent to one and only one matrix in reduced echelon
form.
Definition. We say a matrix B is the reduced echelon form (or: the RREF) of a matrix A
if A and B are row-equivalent and B is in reduced echelon form.
Question. Is each matrix also row-equivalent to one and only one matrix in echelon form?
Solution.
LINEAR ALGEBRA
Gaussian elimination
Goal. Solve linear systems for the pivot variables in terms of the free variables (if any) in the
equation.
Algorithm. (Gaussian Elimination) Given a linear system,
(1) Write down the augmented matrix.
(2) Find the reduced echelon form of the matrix.
(3) Write down the equations corresponding to the reduced echelon form.
(4) Express pivot variables in terms of free variables.
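Steps (1)-(2) of the algorithm can be sketched in code. The following is a minimal illustration (not part of the original notes; the function name `rref` is mine), using exact fractions so no rounding occurs:

```python
from fractions import Fraction

def rref(M):
    """Row-reduce a matrix (given as a list of rows) to reduced echelon form,
    using the three elementary row operations."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(A), len(A[0])
    pivot_row = 0
    for col in range(cols):
        # find a row at or below pivot_row with a nonzero entry in this column
        pr = next((r for r in range(pivot_row, rows) if A[r][col] != 0), None)
        if pr is None:
            continue
        A[pivot_row], A[pr] = A[pr], A[pivot_row]       # interchange
        piv = A[pivot_row][col]
        A[pivot_row] = [x / piv for x in A[pivot_row]]  # scaling (leading entry 1)
        for r in range(rows):                           # replacement
            if r != pivot_row and A[r][col] != 0:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[pivot_row])]
        pivot_row += 1
        if pivot_row == rows:
            break
    return A

# augmented matrix of the example system below
print(rref([[3, -7, 8, -5, 8, 9],
            [3, -9, 12, -9, 6, 15]]))
```

From the resulting reduced echelon form one then reads off steps (3)-(4) by hand.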
Example. Find the general solution of

3x1 − 7x2 + 8x3 − 5x4 + 8x5 = 9
3x1 − 9x2 + 12x3 − 9x4 + 6x5 = 15

[Portrait: C.F. Gauß (1777–1855)]

Solution.
RREF of the augmented matrix:
[ 1 0 −2 3 5 −4 ]
[ 0 1 −2 2 1 −3 ]
Example. Is the following linear system consistent? Is the solution unique?
3x2 − 6x3 + 6x4 + 4x5 = −5
3x1 − 7x2 + 8x3 − 5x4 + 8x5 = 9
3x1 − 9x2 + 12x3 − 9x4 + 6x5 = 15
Solution.
Row reduce the augmented matrix:
[ 0  3 −6  6 4 −5 ]      [ 3 −9 12 −9 6 15 ]
[ 3 −7  8 −5 8  9 ]  ⇝   [ 0  2 −4  4 2 −6 ]
[ 3 −9 12 −9 6 15 ]      [ 0  0  0  0 1  4 ]
Theorem 4. A linear system is consistent if and only if an echelon form of the augmented
matrix has no row of the form 0 ... 0 b , where b is nonzero. If a linear system is
consistent, then the linear system has
○ a unique solution (when there are no free variables) or
○ infinitely many solutions (when there is at least one free variable).
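Theorem 4 can be spot-checked numerically. A sketch (not from the notes): the system is inconsistent exactly when the augmented matrix has larger rank than the coefficient matrix, and a consistent system has free variables exactly when the rank is smaller than the number of variables:

```python
import numpy as np

# the consistent example system from above: 3 equations, 5 unknowns
A = np.array([[0, 3, -6, 6, 4],
              [3, -7, 8, -5, 8],
              [3, -9, 12, -9, 6]], dtype=float)
b = np.array([[-5.], [9.], [15.]])

r = np.linalg.matrix_rank(A)
r_aug = np.linalg.matrix_rank(np.hstack([A, b]))
consistent = (r == r_aug)
free_vars = A.shape[1] - r   # number of free variables if consistent
print(consistent, free_vars)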
Example. Consider the linear system whose augmented matrix is
[ 3 4 −3 ]
[ 3 4 −3 ]
[ 6 8 −5 ]
What can you say about the number of solutions of this system?
Solution.
LINEAR ALGEBRA
Linear combinations
Definition. Consider m × n-matrices

A =                          B =
[ a11 a12 · · · a1n ]        [ b11 b12 · · · b1n ]
[ a21 a22 · · · a2n ]        [ b21 b22 · · · b2n ]
[  ..  ..        .. ]  and   [  ..  ..        .. ]
[ am1 am2 · · · amn ]        [ bm1 bm2 · · · bmn ]

a) The sum A + B is
[ a11 + b11 a12 + b12 · · · a1n + b1n ]
[ a21 + b21 a22 + b22 · · · a2n + b2n ]
[    ..        ..              ..     ]
[ am1 + bm1 am2 + bm2 · · · amn + bmn ]

b) The product cA for a scalar c is
[ ca11 ca12 · · · ca1n ]
[ ca21 ca22 · · · ca2n ]
[   ..   ..         .. ]
[ cam1 cam2 · · · camn ]
Example. Calculate

[ 1 0 ]   [ 2 3 ]                [ 2 1  0 ]
[ 5 2 ] + [ 3 1 ] =      and   5·[ 3 1 −1 ] =
Warning. Addition is only defined if A and B have the same number of rows and the same
number of columns.
Definition. A column vector is an m × 1-matrix. A row vector is a 1 × n-matrix.
Example. Give a few examples of column and row vectors.
Remark. The transpose of a column vector is a row vector and vice versa.
Definition. The linear combination of m × n-matrices A1 , A2 , . . . , Ap with coefficients
c1 , c2 , . . . , cp is defined as
c1 A1 + c2 A2 + · · · + cp Ap .
Example. Consider m × n-matrices A1 and A2 . Give examples of linear combinations of these
two matrices.
Solution.
Definition. The span (A1 , . . . , Ap ) is defined as the set of all linear combinations of
A1 , . . . , Ap . Stated another way:
span (A1 , . . . , Ap ) := {c1 A1 + c2 A2 + · · · + cp Ap : c1 , . . . , cp scalars}.
Definition. We denote the set of all column vectors of length m by Rm .
Example. Let
a1 = [ 1 ]    a2 = [ 4  ]    b = [ −1 ]
     [ 0 ]         [ 2  ]        [  8 ]
     [ 3 ]         [ 14 ]        [ −5 ]
Is b a linear combination of a1, a2?
Solution.
Key observation from the previous example: Solving linear systems is the same as finding
linear combinations!
Theorem 5. A vector equation
x1 a1 + x2 a2 + · · · + xn an = b
has the same solution set as the linear system whose augmented matrix is
a1 a2 · · · an | b
LINEAR ALGEBRA
Matrix vector multiplication
Definition. Let x be a vector in Rn and A = [ a1 . . . an ] an m × n-matrix. We define the
product Ax by
Ax = x1 a1 + x2 a2 + . . . + xn an .
Remark.
○ Ax is a linear combination of the columns of A using the entries in x as coefficients.
○ Ax is only defined if the number of entries of x is equal to the number of columns of A.
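The first remark can be checked numerically. A small NumPy sketch (not part of the notes; the matrix and vector are an arbitrary choice of mine):

```python
import numpy as np

A = np.array([[2, 0],
              [1, 1]])
x = np.array([2, 3])

direct = A @ x                               # matrix-vector product
combo = x[0] * A[:, 0] + x[1] * A[:, 1]      # x1*a1 + x2*a2, columns of A
print(direct, combo)                         # the two agree
```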
Example. Consider
A = [ 2 0 ]    B = [ 1 2 ]    x = [ 2 ]
    [ 1 1 ]        [ 0 1 ]        [ 3 ]
                   [ 3 5 ]
Determine Ax and Bx.
Solution.
Example. Consider the following vector equation
x1 [ 1 ] + x2 [ 3 ] = [ 0 ]
   [ 2 ]      [ 4 ]   [ 2 ]
Find a 2 × 2 matrix A such that (x1, x2) is a solution to the above equation if and only if
A [ x1 ] = [ 0 ]
  [ x2 ]   [ 2 ]
Solution.
Theorem 6. Let A = [ a1 . . . an ] be an m × n-matrix and b in Rm. Then the following
are equivalent:
○ (x1, x2, . . . , xn) is a solution of the vector equation x1a1 + x2a2 + · · · + xnan = b.
○ (x1, . . . , xn)T is a solution to the matrix equation Ax = b.
○ (x1, x2, . . . , xn) is a solution of the system with augmented matrix [ A b ].
Notation. We will write Ax = b for the system of equations with augmented matrix [ A b ].
Think of A as a machine: input x, output Ax.

x −→ [ A ] −→ Ax

Example. Consider the matrix A =
[ 0 1 ]
[ 1 0 ]
What does this machine do?
Solution.
Example. Consider the matrix B =
[ 1 0 ]
[ 0 0 ]
What does this machine do?
Solution.
Composition of machines. Let A be an m × n matrix and B be a k × l matrix with l = m
(so that B(Ax) is defined). Now we can compose the two machines:

x −→ [ A ] −→ Ax −→ [ B ] −→ B(Ax)
LINEAR ALGEBRA
Matrix multiplication
Definition. Let A be an m × n-matrix and let B = [ b1 . . . bp ] be an n × p-matrix. We
define
AB := [ Ab1 Ab2 · · · Abp ]
Example. Compute AB where
A = [ 4 −2 ]    B = [ 2 −3 ]
    [ 3 −5 ]        [ 6 −7 ]
    [ 0  1 ]
Solution.
Remark. Ab1 is a linear combination of the columns of A and Ab2 is a linear combination
of the columns of A. Each column of AB is a linear combination of the columns of A using
coefficients from the corresponding columns of B.
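This remark can be verified numerically. A NumPy sketch (not part of the notes), using the same pair of matrices as the example above:

```python
import numpy as np

A = np.array([[4, -2],
              [3, -5],
              [0,  1]])
B = np.array([[2, -3],
              [6, -7]])

AB = A @ B
# each column of AB is A applied to the corresponding column of B
col0 = A @ B[:, 0]
col1 = A @ B[:, 1]
print(AB)
```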
Definition. Let A be an m × n-matrix and let B be an n × p-matrix. The product AB is the
matrix of the composed machine:

x −→ [ B ] −→ Bx −→ [ A ] −→ A(Bx)

x −→ [ AB ] −→ (AB)x

That is, (AB)x = A(Bx) for all x.
Example. Consider
A = [ 2 0 ]    B = [ 1 2 ]    x = [ x1 ]
    [ 1 1 ]        [ 0 1 ]        [ x2 ]
Compute (AB)x and A(Bx). Are these the same?
Solution.
LINEAR ALGEBRA
Properties of matrix multiplication
Definition. The identity matrix In of size n is defined as
In =
[ 1 0 . . . 0 ]
[ 0 1 . . . 0 ]
[ .  .  .   . ]
[ 0 0 . . . 1 ]
Theorem 8. Let A be an m × n matrix and let B and C be matrices for which the indicated
sums and products are defined.
(a) A (BC ) = (AB)C (associative law of multiplication)
(b) A (B + C ) = AB + AC , (B + C ) A = BA + CA (distributive laws)
(c) r (AB) = (rA)B = A(rB) for every scalar r ,
(d) A (rB + sC ) = rAB + sAC for all scalars r , s (linearity of matrix multiplication)
(e) Im A = A = AIn (identity for matrix multiplication)
Warning. The properties above are analogous to properties of real numbers. But NOT ALL
properties of real numbers also hold for matrices.
Example. Let
A = [ 1 1 ]    B = [ 1 0 ]
    [ 0 1 ]        [ 1 1 ]
Determine AB and BA. Are these matrices the same?
Solution.
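The point of the warning can be confirmed numerically. A NumPy sketch (not part of the notes), with the 2 × 2 matrices from the example:

```python
import numpy as np

A = np.array([[1, 1],
              [0, 1]])
B = np.array([[1, 0],
              [1, 1]])

print(A @ B)   # not equal to B @ A: multiplication is not commutative
print(B @ A)
```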
Example. Let
A = [ 1 2 0 ]    B = [  1 2 ]
    [ 3 0 1 ]        [  0 1 ]
                     [ −2 4 ]
Compute (AB)T, ATBT and BTAT.
Solution.

AB = [ 1 2 0 ]   [  1 2 ]
     [ 3 0 1 ] · [  0 1 ] =
                 [ −2 4 ]

(AB)T =

ATBT = [ 1 3 ]   [ 1 0 −2 ]   [ 7 3 10 ]
       [ 2 0 ] · [ 2 1  4 ] = [ 2 0 −4 ]
       [ 0 1 ]                [ 2 1  4 ]

BTAT = [ 1 0 −2 ]   [ 1 3 ]
       [ 2 1  4 ] · [ 2 0 ] =
                    [ 0 1 ]
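The identity (AB)T = BTAT suggested by this computation can be verified numerically. A NumPy sketch (not part of the notes):

```python
import numpy as np

A = np.array([[1, 2, 0],
              [3, 0, 1]])
B = np.array([[ 1, 2],
              [ 0, 1],
              [-2, 4]])

lhs = (A @ B).T    # transpose of the product
rhs = B.T @ A.T    # product of the transposes, in reverse order
print(lhs)
```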
Example. Determine
[ 1 0 ]^3
[ 3 2 ]
Solution.
LINEAR ALGEBRA
Elementary matrices
Definition. An elementary matrix is one that is obtained by performing a single elementary
row operation on an identity matrix.
A permutation matrix is one that is obtained by performing row exchanges on an identity
matrix.
Example. Consider
E1 = [ 1 0 0 ]    E2 = [ 1 0 0 ]    E3 = [ 1 0 0 ]
     [ 0 3 0 ]         [ 0 0 1 ]         [ 0 1 0 ]
     [ 0 0 1 ]         [ 0 1 0 ]         [ 2 0 1 ]
Are these matrices elementary matrices?
Solution.
Question. Let A be a 3 × 3-matrix. What happens to A if you multiply it by one of E1 , E2
and E3 ?
Solution.
E1 A = [ 1 0 0 ]   [ a11 a12 a13 ]
       [ 0 2 0 ] · [ a21 a22 a23 ] =
       [ 0 0 1 ]   [ a31 a32 a33 ]

E2 A = [ 1 0 0 ]   [ a11 a12 a13 ]
       [ 0 0 1 ] · [ a21 a22 a23 ] =
       [ 0 1 0 ]   [ a31 a32 a33 ]

E3 A = [ 1 0 0 ]   [ a11 a12 a13 ]
       [ 0 1 0 ] · [ a21 a22 a23 ] =
       [ 3 0 1 ]   [ a31 a32 a33 ]
Find elementary matrices E1−1 and E2−1 such that A = E1−1 E2−1 B.
Solution.
LINEAR ALGEBRA
Inverse of a matrix
The inverse of a real number a is denoted by a−1 . For example, 7−1 = 1/7 and
7 · 7−1 = 7−1 · 7 = 1.
Remark. Not all real numbers have an inverse: 0−1 is not well defined, since there is no real
number b such that 0 · b = 1.
Remember that the identity matrix In is the n × n-matrix
In =
[ 1 0 · · · 0 ]
[ 0 1 · · · 0 ]
[ .  .  .   . ]
[ 0 0 · · · 1 ]

Definition. An n × n-matrix A is invertible if there is an n × n-matrix C such that
CA = AC = In.
In this case C is called the inverse of A, written A−1.
LINEAR ALGEBRA
Computing an inverse
Question. When is the 1 × 1 matrix [ a ] invertible?
Solution.
Theorem 15. Let A =
[ a b ]
[ c d ]
If ad − bc ≠ 0, then A is invertible and
A−1 = 1/(ad − bc) · [  d −b ]
                    [ −c  a ]
Theorem 17. Suppose A is invertible. Then every sequence of elementary row operations
that reduces A to In will also transform In to A−1.
Algorithm.
○ Place A and I side-by-side to form an augmented matrix [ A | I ].
This is an n × 2n matrix (Big Augmented Matrix), instead of n × (n + 1).
○ Perform row operations on this matrix (which will produce identical operations on A and
I ).
○ By Theorem 17:
[ A | I ] will row reduce to [ I | A−1 ],
or A is not invertible.
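The big-augmented-matrix algorithm can be sketched in code (my own illustration, not part of the notes; the function name `inverse` is mine). It row-reduces [ A | I ] with exact fractions and reads off the right half:

```python
from fractions import Fraction

def inverse(M):
    """Invert a square matrix by row-reducing [A | I] to [I | A^-1]."""
    n = len(M)
    # build the big augmented matrix [A | I]
    A = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(M)]
    for col in range(n):
        # find a pivot at or below the diagonal
        pr = next((r for r in range(col, n) if A[r][col] != 0), None)
        if pr is None:
            raise ValueError("matrix is not invertible")
        A[col], A[pr] = A[pr], A[col]
        piv = A[col][col]
        A[col] = [x / piv for x in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [row[n:] for row in A]

print(inverse([[2, 0, 0], [-3, 0, 1], [0, 1, 0]]))
```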
Example. Find the inverse of
A = [  2 0 0 ]
    [ −3 0 1 ]
    [  0 1 0 ]
if it exists.
Solution.
[ A I ] = [  2 0 0 | 1 0 0 ]             [ 1 0 0 | 1/2 0 0 ]
          [ −3 0 1 | 0 1 0 ]  ∼ · · · ∼  [ 0 1 0 |  0  0 1 ]
          [  0 1 0 | 0 0 1 ]             [ 0 0 1 | 3/2 1 0 ]
Example. Let’s do the previous example step by step.
Solution.
[ A I ] = [  2 0 0 | 1 0 0 ]
          [ −3 0 1 | 0 1 0 ]
          [  0 1 0 | 0 0 1 ]
LINEAR ALGEBRA
LU decomposition
Definition. An n × n matrix A is called
○ upper triangular if all entries below the diagonal are zero:
[ ? ? ? ? ? ]
[ 0 ? ? ? ? ]
[ 0 0 ? ? ? ]
[ 0 0 0 ? ? ]
[ 0 0 0 0 ? ]
○ lower triangular if all entries above the diagonal are zero:
[ ? 0 0 0 0 ]
[ ? ? 0 0 0 ]
[ ? ? ? 0 0 ]
[ ? ? ? ? 0 ]
[ ? ? ? ? ? ]
Example. Give a few examples of upper and lower triangular matrices!
Solution.
Theorem 18. The product of two lower (upper) triangular matrices is lower (upper)
triangular.
Example. Consider the row operation Ri → Ri + cRj where j < i. What can you say about
the elementary matrix E corresponding to this row operation? What about E −1?
Solution.
Remark. The inverse of a lower (upper) triangular matrix (if it exists) is again lower (upper)
triangular.
Definition. A matrix A has an LU decomposition if there is a lower triangular matrix L and
an upper triangular matrix U such that A = LU.
Theorem 19. Let A be an n × n-matrix. If A can be brought to echelon form just using row
operations of the form Ri → Ri + cRj where j < i, then A has an LU-decomposition.
Proof.
Remark. It is important that you do the row operations in the right order!
LINEAR ALGEBRA
Solving linear systems using LU decomposition
Theorem 20. Let A be an n × n-matrix such that A = LU, where L is a lower triangular
matrix and U is an upper triangular matrix, and let b ∈ Rn. In order to find a solution of the
linear system
Ax = b,
it is enough to find a solution of the linear system
Ux = c,
where c satisfies Lc = b.
Proof.
Example. Find a solution to the linear system
[  2  1 1 ] [ x1 ]   [  5 ]
[  4 −6 0 ] [ x2 ] = [ −2 ]
[ −2  7 2 ] [ x3 ]   [  9 ]
Solution.
Recall the LU decomposition
[  2  1 1 ]   [  1  0 0 ] [ 2  1  1 ]
[  4 −6 0 ] = [  2  1 0 ] [ 0 −8 −2 ]
[ −2  7 2 ]   [ −1 −1 1 ] [ 0  0  1 ]
Question. Why do we care about LU decomposition if we already have Gaussian elimination?
Solution.
Theorem 21. Let A be n × n matrix. Then there is a permutation matrix P such that PA
has an LU-decomposition.
Proof.
Example. Let
A = [ 0 0 1 ]
    [ 1 1 0 ]
    [ 2 1 0 ]
Find a permutation matrix P such that PA has an LU decomposition.
Solution.
LINEAR ALGEBRA
Spring-mass systems
Example. Consider the following spring-mass system, consisting of five masses and six
springs fixed between two walls.
m1 m2 m3 m4 m5
Goal.
○ Apply a (steady) force fi to each mass i (applied forces f1, f2, f3, f4, f5).
○ Compute the displacements u1, u2, u3, u4, u5 of the five masses.
[Figure: masses m1, . . . , m5 with displacements u1, . . . , u5 and applied forces f1, . . . , f5.]
Equilibrium. In the equilibrium, the forces at each mass add up to 0.
Hooke’s law. The force F needed to extend or compress a spring by some distance u is
proportional to that distance; that is F = −ku, where k is a constant factor characteristic of
the spring, called its stiffness. Let ki be the stiffness of the i-th spring.
Springs:
○ u1 > 0 ⇝ spring 1 is extended.
○ u2 − u1 > 0 ⇝ spring 2 is extended.

Forces at m1:
○ applied force f1 (if positive, pushes m1 to the right).
○ spring 1: −k1u1 (since u1 > 0, pulls m1 to the left).
○ spring 2: k2(u2 − u1) (since u2 − u1 > 0, pulls m1 to the right).

Equilibrium at m1:
f1 − k1u1 + k2(u2 − u1) = 0
⇝ (k1 + k2)u1 − k2u2 = f1

Forces at m2:
○ spring 2: −k2(u2 − u1) (since u2 − u1 > 0, pulls m2 to the left).
○ spring 3: k3(u3 − u2) (since u3 − u2 > 0, pulls m2 to the right).

Equilibrium at m2:
f2 − k2(u2 − u1) + k3(u3 − u2) = 0
⇝ −k2u1 + (k2 + k3)u2 − k3u3 = f2
Springs:
○ u5 − u4 > 0 ⇝ spring 5 is extended.
○ u5 > 0 ⇝ spring 6 is compressed.

Forces at m5:
○ applied force f5 (if positive, pushes m5 to the right).
○ spring 5: −k5(u5 − u4) (since u5 − u4 > 0, pulls m5 to the left).
○ spring 6: −k6u5 (since u5 > 0, pushes m5 to the left).

Equilibrium at m5:
f5 − k5(u5 − u4) − k6u5 = 0
⇝ −k5u4 + (k5 + k6)u5 = f5
Equilibrium equations.
(k1 + k2)u1 − k2u2 = f1
−k2u1 + (k2 + k3)u2 − k3u3 = f2
−k3u2 + (k3 + k4)u3 − k4u4 = f3
−k4u3 + (k4 + k5)u4 − k5u5 = f4
−k5u4 + (k5 + k6)u5 = f5

In matrix form:
[ k1+k2   −k2     0       0       0     ] [ u1 ]   [ f1 ]
[ −k2     k2+k3   −k3     0       0     ] [ u2 ]   [ f2 ]
[ 0       −k3     k3+k4   −k4     0     ] [ u3 ] = [ f3 ]
[ 0       0       −k4     k4+k5   −k5   ] [ u4 ]   [ f4 ]
[ 0       0       0       −k5     k5+k6 ] [ u5 ]   [ f5 ]
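These equations can be assembled and solved numerically. A NumPy sketch (not from the notes; the helper name `stiffness` and the choice of unit spring constants are mine):

```python
import numpy as np

def stiffness(k):
    """Assemble the tridiagonal stiffness matrix for n masses between
    two walls, given the n+1 spring constants k."""
    n = len(k) - 1
    K = np.zeros((n, n))
    for i in range(n):
        K[i, i] = k[i] + k[i + 1]
        if i + 1 < n:
            K[i, i + 1] = K[i + 1, i] = -k[i + 1]
    return K

k = np.ones(6)                           # six identical springs
f = np.array([1., 0., 0., 0., 0.])       # push only the first mass
u = np.linalg.solve(stiffness(k), f)     # resulting displacements
print(u)
```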
Remark.
○ The purpose of this example is not to find the precise solutions to these equations, but
rather to show you that linear equations appear naturally in engineering.
○ Finite element method: object is broken up into many small parts with connections
between the different parts ; gigantic spring-mass system where the forces from spring
correspond to the interaction between the different parts.
○ In practice: millions of equations, not just five!
○ Solve not for a single force vector f, but for many different vectors. Thus the coefficient
matrix stays the same ; LU-decomposition can make a difference in how quickly such
systems can be solved.
LINEAR ALGEBRA
Inner Product and Orthogonality
Definition. The inner product of v, w ∈ Rn is
v · w = vT w.
Example. If
v = [ v1 ]        [ w1 ]
    [ .. ]  and w = [ .. ]
    [ vn ]        [ wn ]
then v · w is ...
Solution.
Question. Why is v · w = w · v?
Solution.
Question. Why is v · v always larger or equal to 0? For which v is v · v = 0?
Solution.
Theorem 22. Let u, v and w be vectors in Rn , and let c be any scalar. Then
(a) u · v = v · u
(b) (u + v) · w = u · w + v · w
(c) (cu) · v =c (u · v) = u · (cv)
(d) u · u ≥ 0, and u · u = 0 if and only if u = 0.
Definition. Let v, w ∈ Rn.
The norm (or length) of v is
‖v‖ = √(v · v) = √(v1² + · · · + vn²).
The distance between v and w is ‖v − w‖.
We say v and w are orthogonal if v · w = 0.

[Figure: vectors v, w and their difference v − w in the plane.]
Example. Are
[ 1 ]  [  1 ]                 [ 1 ]  [  1 ]
[ 1 ], [ −1 ]  orthogonal? Are [ 1 ], [ −2 ]  orthogonal?
Solution.
Definition. A set of vectors forms an orthonormal set if it is an orthogonal set (the vectors
are pairwise orthogonal) and all vectors in the set are unit vectors.
Example. Let v1 = [ 1; 1 ], v2 = [ 1; −1 ]. Do v1 and v2 form an orthonormal set?
Solution.
LINEAR ALGEBRA
Subspaces of Rn
Definition. A non-empty subset H of Rn is a subspace of Rn if it satisfies the following two
conditions:
○ If u, v ∈ H, then the sum u + v ∈ H. (H is closed under vector addition).
○ If u ∈ H and c is a scalar, then cu ∈ H. (H is closed under scalar multiplication.)
Theorem 24. Let v1 , v2 , . . . , vm ∈ Rn . Then Span (v1 , v2 , . . . , vm ) is a subspace of Rn .
Proof.
Example. Is H = { [ x; x ] : x ∈ R } a subspace of R2?
Solution.
Example. Let Z = { [ 0; 0 ] }. Is Z a subspace of R2?
Solution.
Example. Let H = { [ x; x + 1 ] : x ∈ R }. Is H a subspace of R2?
Solution.

[Figure: the line H in the plane.]
Example. Is U = { [ x; y ] ∈ R2 : x² + y² < 1 } a subspace of R2?
Solution.

[Figure: the open unit disk U.]
Example. Consider V = { [ x; y ] ∈ R2 : xy ≥ 0 }.
Solution.

[Figure: the set V (first and third quadrants).]
Question. Is W = { [ x; y ] ∈ R2 : xy = 0 } a subspace?
Solution.
LINEAR ALGEBRA
Column spaces and Nullspaces
Definition. The column space, written as Col(A), of an m × n matrix A is the set of all
linear combinations of the columns of A. If A = a1 a2 · · · an , then
Col(A) = span (a1 , a2 , . . . , an ).
Example. Describe the column space of
A = [ 1 0 ]
    [ 0 0 ]
Solution.
Definition. The nullspace, written as Nul(A), of an m × n matrix A is the set of all
solutions of Ax = 0.

Example. Let H := { [ v1; v2; v3 ] : v1 + v2 − v3 = 0 }. Find a matrix A such that H = Nul(A).
Solution.
Example. Let A = [ 1 1 −1 ]. Find two vectors v, w such that Nul(A) = span(v, w).
Solution.
[Figure: the plane Nul(A) in R3, spanned by v and w.]
Example. Let A = [ 1 1 −1 ] and let b = [ 1 ]. Observe that A (1, 0, 0)T = b. Use this to
describe {v ∈ R3 : Av = b}.
Solution.

[Figure: the plane Nul(A) and the translated plane w + Nul(A).]
LINEAR ALGEBRA
Abstract vector spaces
○ The most important property of column vectors in Rn is that you can take linear
combinations of them.
○ There are many mathematical objects X , Y , . . . for which a linear combination cX + dY
makes sense and has the usual properties of linear combinations in Rn.
○ We are going to define a vector space in general as a collection of objects for which linear
combinations make sense. The objects of such a set are called vectors.
[Figures: a linear combination of two vectors in R2, and the functions f (x) = x + 1 and
g (x) = sin x − 1 with (f + g )(x) = sin x + x and 2g (x) = 2 sin x − 2.]
Definition. A vector space is a non-empty set V of objects, called vectors, for which linear
combinations make sense. More precisely: on V there are defined two operations, called
addition and multiplication by scalars (real numbers), satisfying the following axioms for all
u, v , w ∈ V and for all scalars c, d ∈ R:
○ u + v is in V . (V is “closed under addition”.)
○ u + v = v + u.
○ (u + v) + w = u + (v + w).
○ There is a vector (called the zero vector) 0V in V such that u + 0V = u.
○ For each u in V , there is a vector −u in V satisfying u + (−u) = 0V .
○ cu is in V . (V is “closed under scalar multiplication”.)
○ c(u + v) = cu + cv.
○ (c + d)u = cu + du.
○ (cd)u = c(du).
○ 1u = u.
In particular, we may talk about linear combinations and span within a vector space, e.g.
3u + 2v or span(u, v).
Example. Explain how the set of functions R → R is a vector space.
Solution.
Question. Is the set of all invertible 2 × 2 matrices a subspace of the vector space of 2 × 2
matrices?
Solution.
Example. Let Pn be the set of all polynomials of degree at most n, that is
Pn = {a0 + a1 t + a2 t 2 + · · · + an t n : a0 , . . . , an ∈ R}.
Explain why it is a vector space. Can you think of it as a subspace of the vector space of all
functions R → R?
Solution.
LINEAR ALGEBRA
Linear Independence
Definition. Vectors v1 , . . . , vp are said to be linearly independent if the equation
x1 v1 + x2 v2 + · · · + xp vp = 0
has only the trivial solution x1 = x2 = · · · = xp = 0. Otherwise they are linearly dependent.
Theorem 29. Vectors v1 , . . . , vp are linearly dependent if and only if there is i ∈ {1, . . . , p}
such that vi ∈ span(v1 , . . . , vi−1 , vi+1 , . . . , vp ).
Question. A single non-zero vector v1 is always linearly independent. Why?
Solution.
Question. Two vectors v1 , v2 are linearly independent if and only if neither of the vectors is a
multiple of the other. Why?
Solution.
Question. Consider an m × n-matrix A in echelon form. The pivot columns of A are linearly
independent. Why?
Solution.
LINEAR ALGEBRA
Basis and Dimension
Definition. Let V be a vector space. A sequence of vectors (v1 , . . . , vp ) in V is a basis of V
if
○ V = span (v1 , . . . , vp ) , and
○ (v1 , . . . , vp ) are linearly independent.
Example. Check that both ( [ 1; 0 ], [ 0; 1 ] ) and ( [ 1; 1 ], [ 1; −1 ] ) are bases of R2.
Solution.
Theorem 31. Any two bases of a vector space V contain the same number of vectors.
Definition. The number of vectors in a basis of V is the dimension of V .
Example. What is the dimension of Rn ?
Solution.
Theorem 32. Suppose that V has dimension d.
○ A sequence of d vectors in V are a basis if they span V .
○ A sequence of d vectors in V are a basis if they are linearly independent.
Proof.
Example. Is ( [ 1; 2; 0 ], [ 0; 1; 1 ], [ 1; 0; 3 ] ) a basis of R3?
Solution.
Theorem 33. A basis is a minimal spanning set of V ; that is the elements of the basis span
V but you cannot delete any of these elements and still get all of V .
Example. Produce a basis of R2 from the vectors
v1 = [ 1; 2 ],  v2 = [ 1; 1 ],  v3 = [ −.5; −2 ].
Solution.

[Figure: v1, v2, v3 in the plane.]
Example. Produce a basis of R2 from the vector [ 2; 1 ].
Solution.
LINEAR ALGEBRA
Bases and dimensions of the four fundamental
subspaces
Algorithm. To find a basis for Nul(A):
○ Find the parametric form of the solutions to Ax = 0.
○ Express solutions x as a linear combination of vectors with the free variables as coefficients.
○ Use these vectors as a basis of Nul(A).
Example. Find a basis for Nul(A) where
A = [ 3  6  6 3 9 ]
    [ 6 12 15 0 3 ]
Solution.
[ 3  6  6 3 9 ]  RREF  [ 1 2 0  5 13 ]
[ 6 12 15 0 3 ]  ⇝     [ 0 0 1 −2 −5 ]
Definition. The rank of a matrix is the number of pivots it has.
Theorem 34. Let A be an m × n matrix with rank r . Then dim Nul(A) = n − r .
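Theorem 34 can be spot-checked numerically on the example above. A NumPy sketch (not part of the notes):

```python
import numpy as np

A = np.array([[3, 6, 6, 3, 9],
              [6, 12, 15, 0, 3]], dtype=float)

r = np.linalg.matrix_rank(A)   # number of pivots
n = A.shape[1]
print(r, n - r)                # rank r and dim Nul(A) = n - r
```

Here r = 2, so the null space has dimension 5 − 2 = 3, matching the three free variables in the RREF.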
Remark. Let A = [ a1 . . . an ] and let U = [ u1 . . . un ] be an echelon form of A. Explain
why
x1 u1 + · · · + xn un = 0 ⇐⇒ x1 a1 + · · · + xn an = 0.
Solution.
Theorem 35. Let A be an m × n matrix with rank r . The pivot columns of A form a basis
of Col(A). In particular, dim Col(A) = r .
Proof.
Example. Find a basis for Col(A) where
A = [ 1 2  0  4 ]
    [ 2 4 −1  3 ]
    [ 3 6  2 22 ]
    [ 4 8  0 16 ]
Solution.
[ 1 2  0  4 ]         [ 1 2 0 4 ]
[ 2 4 −1  3 ]  RREF   [ 0 0 1 5 ]
[ 3 6  2 22 ]  ⇝      [ 0 0 0 0 ]
[ 4 8  0 16 ]         [ 0 0 0 0 ]
Example. Let
A = [ 1 3 ]
    [ 2 6 ]
The RREF of A is
U = [ 1 3 ]
    [ 0 0 ]
Is Col(A) = Col(U)? Is Col(AT) = Col(U T)?
Solution.
LINEAR ALGEBRA
Graphs and adjacency matrices
Definition. A graph is a set of nodes (or: vertices) that are connected through edges.
Example. A graph with 4 nodes and 5 edges:

[Figure: nodes 1, 2, 3, 4 (with a loop at node 2).]

Its adjacency matrix is
[ 0 1 1 0 ]
[ 1 1 1 0 ]
[ 1 1 0 1 ]
[ 0 0 1 0 ]
Definition. Let G be a graph with n nodes. The adjacency matrix of G is the n × n-matrix
A = (aij ) such that
aij = { 1  if there is an edge between node i and node j
      { 0  otherwise.
Definition. A walk of length k on a graph is a sequence of k + 1 nodes (start and end
included) together with k edges joining consecutive nodes; nodes and edges may repeat.
A path is a walk in which all vertices are distinct.
Example. Count the number of walks of length 2 from node 2 to node 3 and the number of
walks of length 3 from node 3 back to node 3:

[Figure: the 4-node graph from above.]
Definition. A graph is connected if for every pair of nodes i and j there is a walk from node
i to node j. A graph is disconnected if it is not connected.
Theorem 39. Let G be a graph and let A be its adjacency matrix. Then the entry in the
i-th row and j-th column of A` is the number of walks of length ` from node j to node i on G.
Proof.
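Theorem 39 can be tested numerically on the example graph (a sketch, not part of the notes; I assume the adjacency matrix as printed above, including the loop at node 2):

```python
import numpy as np

# adjacency matrix of the 4-node example graph
A = np.array([[0, 1, 1, 0],
              [1, 1, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])

A2 = np.linalg.matrix_power(A, 2)
A3 = np.linalg.matrix_power(A, 3)
print(A2[2, 1])   # walks of length 2 from node 2 to node 3
print(A3[2, 2])   # walks of length 3 from node 3 back to node 3
```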
Definition. A directed graph is a set of vertices connected by edges, where the edges have a
direction associated with them.
Example. A directed graph with 4 nodes and 5 edges:

[Figure: nodes 1, 2, 3, 4 with directed edges labeled 1–5.]

Its adjacency matrix is
[ 0 0 0 0 ]
[ 1 0 1 0 ]
[ 1 0 0 0 ]
[ 0 1 1 0 ]
We can also talk about adjacency matrices of directed graphs. We use the following
convention:
Definition. Let G be a directed graph with m edges and n nodes. The adjacency matrix of
G is the n × n matrix A = (ai,j )i,j with
ai,j = { 1  if there is a directed edge from node j to node i
       { 0  otherwise
Directed graphs have another important associated matrix:
Definition. Let G be a directed graph with m edges and n nodes. The edge-node incidence
matrix of G is the m × n matrix A = (ai,j )i,j with
ai,j = { −1  if edge i leaves node j
       { +1  if edge i enters node j
       {  0  otherwise
[Figures: a directed graph with one connected component (nodes 1–4, edges 1–5) and a
directed graph with two connected components.]
Theorem 40. Let G be a directed graph and let A be its edge-node incidence matrix. Then
dim Nul(A) is equal to the number of connected components of G.
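Theorem 40 can be checked numerically (a sketch, not from the notes; the edge list below is one illustrative choice of a connected directed graph on 4 nodes):

```python
import numpy as np

# edge-node incidence matrix: one row per edge, -1 where the edge
# leaves, +1 where it enters (edges 1->2, 1->3, 3->2, 2->4, 3->4)
A = np.array([[-1,  1,  0,  0],
              [-1,  0,  1,  0],
              [ 0,  1, -1,  0],
              [ 0, -1,  0,  1],
              [ 0,  0, -1,  1]])

dim_nul = A.shape[1] - np.linalg.matrix_rank(A)
print(dim_nul)   # number of connected components
```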
Example. Find a basis of the null space of the edge-node incidence matrix of the following
graph:

[Figure: directed graph with nodes 1, 2, 3, 4.]
Solution.
Definition. A cycle in an undirected graph is a walk in which all edges are distinct and the
only repeated vertices are the first and last. By cycles of a directed graph we mean
those of its underlying undirected graph.
Example. Find cycles in the following graph.
Solution.
Definition. The span of all cycle vectors of a graph G is called the cycle space.
Theorem 41. Let G be a directed graph and let A be its edge-node incidence matrix. Then
the cycle space of G is equal to Nul(AT ).
Example. We explain the idea in the case of the following graph G. Let A be its edge-node
incidence matrix, a 5 × 4-matrix. Note that Nul(AT) ⊆ R5.

[Figure: directed graph with nodes 1–4 and edges 1–5.]

Think of y = [ y1; y2; y3; y4; y5 ] as assigning a flow to each edge. If y ∈ Nul(AT):

[ 0 ]          [ −1 −1  0  0  0 ] [ y1 ]   [ −y1 − y2     ]
[ 0 ]  = ATy = [  1  0  1 −1  0 ] [ y2 ] = [ y1 + y3 − y4 ]
[ 0 ]          [  0  1 −1  0 −1 ] [ y3 ]   [ y2 − y3 − y5 ]
[ 0 ]          [  0  0  0  1  1 ] [ y4 ]   [ y4 + y5      ]
                                  [ y5 ]

⇝ y ∈ Nul(AT) if and only if the inflow equals the outflow at each node.
What is the simplest way to balance flow? Assign flow around cycles.
Example (ctd.). Let's solve ATy = 0.

AT = [ −1 −1  0  0  0 ]  RREF  [ 1 0  1 0  1 ]
     [  1  0  1 −1  0 ]  ⇝     [ 0 1 −1 0 −1 ]
     [  0  1 −1  0 −1 ]        [ 0 0  0 1  1 ]
     [  0  0  0  1  1 ]        [ 0 0  0 0  0 ]

y ∈ Nul(AT) ⇝ y1 = −y3 − y5, y2 = y3 + y5, y4 = −y5

⇝ y = [ −y3 − y5 ]        [ −1 ]        [ −1 ]
      [  y3 + y5 ]   = y3 [  1 ]  + y5 [  1 ]
      [  y3      ]        [  1 ]        [  0 ]
      [ −y5      ]        [  0 ]        [ −1 ]
      [  y5      ]        [  0 ]        [  1 ]
                          Cycle 1       Cycle 3
LINEAR ALGEBRA
Orthogonal complements
Example. Let
A = [ 1 2 ]
    [ 2 4 ]
    [ 3 6 ]
Find bases of Nul(A) and Col(AT).
Solution.
[Figure: the perpendicular lines Nul(A) and Col(AT) in the plane.]
Example. Find a basis of the orthogonal complement of span( [ 1; 0; 0 ], [ 0; 1; 1 ] ).
Solution.
LINEAR ALGEBRA
Coordinates
Theorem 44. Let (v1 , . . . , vp ) be a basis of V . Then every vector w in V can be expressed
uniquely as
w = c1 v1 + · · · + cp vp .
Proof.
[Figures: coordinates with respect to a basis (b1, b2) and to the standard basis (e1, e2).]
Definition. In Rn let ei denote the vector with a 1 in the i-th coordinate and 0’s elsewhere.
The standard basis of Rn is the ordered basis En := (e1 , . . . , en ).
Question. For all v ∈ Rn , we have v = vEn . Why?
Solution.
Example. Consider the basis B := ( b1 = [ 1; 1 ], b2 = [ 1; −1 ] ) of R2. Let v ∈ R2 be such
that vB = [ 2; 1 ]. Can you determine v?
Solution.
Definition. Let B and C be two bases of Rn . The change of basis matrix IC,B is the matrix
such that for all v ∈ Rn
IC,B vB = vC
LINEAR ALGEBRA
Orthonormal basis
Theorem 46. Let v1 , . . . , vm ∈ Rn be non-zero and pairwise orthogonal. Then v1 , . . . , vm are
linearly independent.
Proof.
LINEAR ALGEBRA
Linear Transformations
Definition. Let V and W be vector spaces. A map T : V → W is a linear transformation if
T (u + v) = T (u) + T (v) and T (cu) = cT (u) for all u, v ∈ V and all scalars c.
Example. Let Pn be the vector space of all polynomials of degree at most n. Consider the
map T : Pn → Pn−1 given by T (p(t)) := (d/dt) p(t). This map is linear! Why?
Solution.
Theorem 49. Let V , W be two vector spaces, let T : V → W be a linear transformation
and let (v1 , . . . , vn ) be a basis of V . Then T is completely determined by the values
T (v1 ), . . . , T (vn ).
Proof.
Example. Let T : R2 → R3 be a linear transformation with
T ( [ 1; 0 ] ) = [ 1; 2; 3 ]  and  T ( [ 0; 1 ] ) = [ 0; 0; −2 ].
What is T ( [ 1; 2 ] )?
Solution.
Theorem 50. Let T : Rn → Rm be a linear transformation. Then there is an m × n matrix A
such that
such that
○ T (v) = Av, for all v ∈ Rn .
○ A = [ T (e1 ) T (e2 ) . . . T (en ) ], where (e1 , e2 , . . . , en ) is the standard basis of Rn .
Proof.
Remark. We call this A the coordinate matrix of T with respect to the standard bases; we
write TEm ,En .
Example. Let Tα : R2 → R2 be the “rotation over α radians (counterclockwise)” map, that
is Tα (v) is the vector obtained by rotating v over angle α. Find the 2 × 2 matrix Aα such that
Tα (v) = Aα v for all v ∈ R2 .
Solution.
[Figure: e1, e2 and their images Tα(e1), Tα(e2).]
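For reference, a numerical sketch of the rotation map (not part of the notes; it assumes the standard formula Aα = [[cos α, −sin α], [sin α, cos α]], which anticipates the blank Solution above):

```python
import numpy as np

def rotation(alpha):
    """Coordinate matrix of rotation by alpha radians (counterclockwise):
    its columns are the images of the standard basis vectors."""
    return np.array([[np.cos(alpha), -np.sin(alpha)],
                     [np.sin(alpha),  np.cos(alpha)]])

v = np.array([1.0, 0.0])
w = rotation(np.pi / 2) @ v
print(w)   # e1 rotated by 90 degrees is (up to rounding) e2
```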
LINEAR ALGEBRA
Coordinate matrices of linear transformations
Last time: Linear transformation is matrix multiplication. For every linear transformation
T : Rn → Rm , there is an m × n matrix A such that T (v) = Av.
Today: The same in abstract vector spaces.
Theorem 51. Let V , W be two vector spaces, let B = (b1 , . . . , bn ) be a basis of V and
C = (c1 , . . . , cm ) be a basis of W , and let T : V → W be a linear transformation. Then there
is an m × n matrix TC,B such that
○ T (v)C = TC,B vB , for all v ∈ V .
○ TC,B = [ T (b1 )C T (b2 )C . . . T (bn )C ].
v (vector in V )  --apply T-->  T (v) (vector in W )

In coordinates, with bases A, B of V and C, D of W :

(Rn , A)  --apply TC,A-->  (Rm , C)
   | IB,A                     ^ IC,D
   v                          |
(Rn , B)  --apply TD,B-->  (Rm , D)

so TC,A = IC,D TD,B IB,A .
Example. Consider E := {(1, 0), (0, 1)} and B := {(1, −1), (1, 1)} as before. Let T : R2 → R2 be
again the linear transformation v 7→ [3 1; 1 3] v. Determine TB,B .
Solution.
LINEAR ALGEBRA
Determinants
Definition. The determinant of
○ a 2 × 2 matrix is det [a b; c d] = ad − bc,
○ a 1 × 1 matrix is det([a]) = a.
Remark. Recall that
[a b; c d]−1 = 1/(ad − bc) [d −b; −c a].
A is invertible ⇐⇒ det(A) ≠ 0.
Notation. We will write both det [a b; c d] and |a b; c d| for the determinant.
Definition. The determinant is the operation that assigns to each n × n-matrix a number
and satisfies the following conditions:
○ (Normalization) det In = 1,
○ It is affected by elementary row operations as follows:
o (Replacement) Adding a multiple of one row to another row does not change the
determinant.
o (Interchange) Interchanging two different rows reverses the sign of the determinant.
o (Scaling) Multiplying all entries in a row by s, multiplies the determinant by s.
Example. Compute det [2 3 3; 0 1 2; 0 0 6].
Solution.
Theorem 53. The determinant of a triangular matrix is the product of the diagonal entries.
Example. Compute |1 2 0; 3 −1 2; 2 0 1|.
Solution.
Example. (Re)Discover the formula for |a b; c d|.
Solution.
Remark. det(AT ) = det(A) means that everything you know about determinants in terms of
rows of A is also true for the columns of A. In particular:
○ If you exchange two columns in a determinant, the determinant changes by a factor of −1.
○ You can add a multiple of a column to another column without changing the determinant.
○ Multiplying each entry of a column by a scalar s changes the determinant by a factor of s.
LINEAR ALGEBRA
Cofactor expansion
Notation. Let A be an n × n-matrix. We denote by Aij the matrix obtained from matrix A by
deleting the i-th row and j-th column of A.
Example. Let A = [1 2 3 4; 5 6 7 8; 9 10 11 12; 13 14 15 16]. Find A23 and A43 .
Solution.
Definition. Let A be an n × n-matrix. The (i, j)-cofactor of A is the scalar Cij defined by
Cij = (−1)i+j det Aij .
Theorem 57. Let A be an n × n-matrix. Then for every i, j ∈ {1, . . . , n}
det A = ai1 Ci1 + ai2 Ci2 + · · · + ain Cin (expansion across row i)
= a1j C1j + a2j C2j + · · · + anj Cnj (expansion down column j)
Example. Compute |1 2 0; 3 −1 2; 2 0 1| by cofactor expansion across row 1.
Solution.
Example. Compute |1 2 0; 3 −1 2; 2 0 1| by cofactor expansion down column 2 and by cofactor
expansion down column 3.
Solution.
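Cofactor expansion translates directly into a recursive program. A NumPy sketch of expansion across row 1 (Theorem 57 with i = 1), applied to the example matrix; its exponential running time is exactly why the method is impractical for large n:

```python
import numpy as np

# Determinant by cofactor expansion across the first row.
def det_cofactor(A):
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        # A_{1,j+1}: delete row 1 and column j+1
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        total += (-1) ** j * A[0, j] * det_cofactor(minor)
    return total

A = np.array([[1.0, 2.0, 0.0],
              [3.0, -1.0, 2.0],
              [2.0, 0.0, 1.0]])
print(det_cofactor(A))                                 # 1.0
print(np.isclose(det_cofactor(A), np.linalg.det(A)))   # True
```

Each call spawns n recursive calls on (n−1) × (n−1) minors, so the cost grows like n!.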
Question. Why is the method of cofactor expansion not practical for large n?
Solution.
Remark. The cofactor expansion is also called Laplace expansion, because it is due to
Pierre-Simon Laplace.
LINEAR ALGEBRA
Eigenvectors and Eigenvalues
Definition. Let A be an n × n matrix. An eigenvector of A is a nonzero v ∈ Rn such that
Av = λv
for some scalar λ; this λ is called the eigenvalue of A associated with v.
(Figure: eigenvectors v1 , v2 in the x1 x2 -plane.)
Example. Find the eigenvectors and the eigenvalues of A = [0 1; 1 0].
Solution.
Example. Find the eigenvectors and the eigenvalues of B = [1 0; 0 0].
Solution.
Definition. Let λ be an eigenvalue of A. The eigenspace of A associated with λ is the set
Eigλ (A) := {v ∈ Rn : Av = λv} = Nul(A − λI ).
It consists of all the eigenvectors of A with eigenvalue λ and the zero vector.
Example. Draw the eigenspaces of the two matrices A = [0 1; 1 0] and B = [1 0; 0 0].
Solution.
LINEAR ALGEBRA
Computing Eigenvalues and Eigenvectors
Theorem 58. Let A be an n × n matrix and λ be a scalar. Then λ is an eigenvalue of A if
and only if det(A − λI ) = 0.
Proof.
Example. For A = [3 1; 1 3]:
det(A − λI ) = |3−λ 1; 1 3−λ| = (3 − λ)^2 − 1 = λ^2 − 6λ + 8 = 0 ⇝ λ1 = 2, λ2 = 4
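A quick NumPy cross-check of this computation (NumPy is used here purely as illustration):

```python
import numpy as np

# Eigenvalues of A = [3 1; 1 3] as roots of det(A - lambda I) = lambda^2 - 6 lambda + 8
A = np.array([[3.0, 1.0],
              [1.0, 3.0]])
eigvals = np.sort(np.linalg.eigvals(A))
print(np.round(eigvals, 6))                                # [2. 4.]

# Cross-check against the roots of the characteristic polynomial:
print(np.round(np.sort(np.roots([1.0, -6.0, 8.0])), 6))    # [2. 4.]
```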
Definition. Let A be an n × n matrix and let λ be an eigenvalue of A.
○ The algebraic multiplicity of λ is its multiplicity as a root of the characteristic
polynomial, that is, the largest integer k such that (t − λ)k divides pA (t).
○ The geometric multiplicity of λ is the dimension of the eigenspace Eigλ (A) of λ.
Example. Find the eigenvalues of A = [1 1; 0 1] and determine their algebraic and geometric
multiplicities.
Solution.
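The geometric multiplicity can be computed as n − rank(A − λI ). A NumPy sketch (my illustration) for this example:

```python
import numpy as np

# A = [1 1; 0 1]: lambda = 1 is a double root of the characteristic polynomial
# (algebraic multiplicity 2), but Nul(A - I) is only one-dimensional.
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
lam = 1.0

# geometric multiplicity = n - rank(A - lambda I)
geom = A.shape[0] - np.linalg.matrix_rank(A - lam * np.eye(2))
print(geom)   # 1, strictly less than the algebraic multiplicity 2
```

Since the geometric multiplicity is smaller than the algebraic one, A has no eigenbasis.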
Theorem 61. Let A be an n × n matrix and let v1 , . . . , vm be eigenvectors of A
corresponding to different eigenvalues. Then v1 , . . . , vm are linearly independent.
Proof.
Definition. Let A be an n × n matrix with entries aij (i, j = 1, . . . , n). The trace of A is
the sum of the diagonal entries of A; that is,
Tr(A) = a11 + a22 + · · · + ann .
LINEAR ALGEBRA
Markov matrices
Definition. An n × n matrix A is a Markov matrix (or: stochastic
matrix) if it has only non-negative entries, and the entries in each column
add up to 1.
A vector in Rn is a probability vector (or: stochastic vector) if it has only
non-negative entries, and the entries add up to 1.
Definition. A square matrix A is said to be diagonalizable if there is an invertible matrix P
and a diagonal matrix D such that A = PDP −1 .
Theorem 66. Let A be an n × n matrix that has n linearly independent eigenvectors
v1 , v2 , . . . , vn with associated eigenvalues λ1 , . . . , λn . Then A is diagonalizable as PDP −1 ,
where
P = [v1 . . . vn ] and D = diag(λ1 , . . . , λn ).
Proof.
Example. Diagonalize A = [6 −1; 2 3].
Solution.
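The hand computation can be verified with NumPy's eigendecomposition (the eigenvectors NumPy returns are scaled differently from a hand calculation, but P D P⁻¹ still reconstructs A):

```python
import numpy as np

# Diagonalization check for A = [6 -1; 2 3]
A = np.array([[6.0, -1.0],
              [2.0, 3.0]])
eigvals, P = np.linalg.eig(A)   # columns of P are eigenvectors
D = np.diag(eigvals)
print(np.allclose(P @ D @ np.linalg.inv(P), A))   # True
print(np.round(np.sort(eigvals), 6))              # eigenvalues 4 and 5
```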
Definition. Vectors v1 , . . . , vn form an eigenbasis of n × n matrix A if v1 , . . . , vn form a
basis of Rn and v1 , . . . , vn are all eigenvectors of A.
Theorem 67. The following are equivalent for n × n matrix A:
○ A has an eigenbasis.
○ A is diagonalizable.
○ The geometric multiplicities of all eigenvalues of A sum up to n.
Theorem 68. Let A be n × n matrix and let B = (v1 , . . . , vn ) be an eigenbasis of A. Then
there is a diagonal matrix D such that
A = IEn ,B DIB,En .
Proof.
Question. Let A be an n × n matrix with n distinct eigenvalues. Is A diagonalizable?
Solution.
Question. Is A = [0 1; 0 0] diagonalizable?
Solution.
LINEAR ALGEBRA
Powers of Matrices
Idea. If A has an eigenbasis, then we can raise A to large powers easily!
Theorem 69. If A = PDP −1 , where D is a diagonal matrix, then for any m,
Am = PD m P −1
Proof.
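Theorem 69 is easy to check numerically: powering the diagonal factor means powering its entries. A NumPy sketch (my illustration):

```python
import numpy as np

# A^m = P D^m P^{-1}, where D^m just powers the diagonal entries
A = np.array([[6.0, -1.0],
              [2.0, 3.0]])
eigvals, P = np.linalg.eig(A)
m = 8
Am = P @ np.diag(eigvals ** m) @ np.linalg.inv(P)
print(np.allclose(Am, np.linalg.matrix_power(A, m)))   # True
```

For large m this uses one eigendecomposition instead of repeated matrix multiplications.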
Definition. Let A be an n × n-matrix. We define the matrix exponential e^{At} as
e^{At} = I + At + (At)^2 /2! + (At)^3 /3! + · · ·
Example. Compute e^{At} for A = [2 0; 0 1].
Solution.
Theorem. Let A = PDP −1 , where D is a diagonal matrix. Then e^{At} = P e^{Dt} P −1 .
Proof.
Example. Let A = [−2 1; 1 −2] = [1 1; 1 −1] [−1 0; 0 −3] [1 1; 1 −1]−1 . Compute e^{At}.
Solution.
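The two descriptions of e^{At} can be compared numerically: a truncated power series against P e^{Dt} P⁻¹. A NumPy sketch (series truncation at 30 terms is my choice, ample for this matrix):

```python
import numpy as np

# e^{At} via the definition: truncated power series sum_k (At)^k / k!
def expm_series(M, terms=30):
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k          # term is now M^k / k!
        result = result + term
    return result

A = np.array([[-2.0, 1.0],
              [1.0, -2.0]])
t = 0.5
eigvals, P = np.linalg.eig(A)
via_diag = P @ np.diag(np.exp(eigvals * t)) @ np.linalg.inv(P)   # P e^{Dt} P^{-1}
print(np.allclose(expm_series(A * t), via_diag))                 # True
```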
LINEAR ALGEBRA
Linear Differential Equations
Definition. A linear (first order) differential equation is an equation of the form
du/dt = Au
where u is a function from R to Rn . A further condition u(0) = v, for some v in Rn , is called
an initial condition.
Theorem 72. Let A be an n × n matrix and v ∈ Rn . The solution of the differential
equation du/dt = Au with initial condition u(0) = v is u(t) = e^{At} v.
Example. Let n = 1. Find the solution of the differential equation du/dt = −u and u(0) = 1.
Solution.
Theorem 73. Let A be an n × n-matrix, and let v ∈ Rn be an eigenvector of A with
eigenvalue λ. Then e^{At} v = e^{λt} v.
Proof.
Theorem 74. Let A be an n × n-matrix and (v1 , . . . , vn ) be an eigenbasis of A with
eigenvalues λ1 , . . . , λn . If v = c1 v1 + · · · + cn vn , then the unique solution to the differential
equation du/dt = Au with initial condition u(0) = v is
e^{At} v = c1 e^{λ1 t} v1 + · · · + cn e^{λn t} vn .
Proof.
Example. Let A = [0 1; 1 0]. Solve the differential equation du/dt = Au with initial condition
u(0) = (1, 0).
Solution.
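Following Theorem 74, A = [0 1; 1 0] has eigenbasis (1, 1) with λ = 1 and (1, −1) with λ = −1, and (1, 0) = ½(1, 1) + ½(1, −1), giving u(t) = ½e^t (1, 1) + ½e^{−t} (1, −1) = (cosh t, sinh t). A NumPy sketch (my illustration) confirming this at one time point:

```python
import numpy as np

# u(t) = (1/2) e^t (1,1) + (1/2) e^{-t} (1,-1) for A = [0 1; 1 0], u(0) = (1,0)
t = 0.7
u = 0.5 * np.exp(t) * np.array([1.0, 1.0]) + 0.5 * np.exp(-t) * np.array([1.0, -1.0])
print(np.allclose(u, [np.cosh(t), np.sinh(t)]))   # True: u(t) = (cosh t, sinh t)
```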
Example. Let A = [1 2; 2 1]. Solve the differential equation du/dt = Au with initial condition
u(0) = (0, 1).
Solution.
LINEAR ALGEBRA
Orthogonal projection onto a line
Definition. Let v, w ∈ Rn . The orthogonal projection of v onto the line spanned by w is
projw (v) := ((w · v)/(w · w)) w.
Theorem 75. Let v, w ∈ Rn . Then projw (v) is the point in span(w) closest to v; that is,
dist(v, projw (v)) = min over u ∈ span(w) of dist(v, u).
Proof.
(Figure: v, w and projw (v) in the x1 x2 -plane; the projection is the foot of the perpendicular
from v onto the line span(w).)
Remark. Note that v − projw (v) (called the error term) is in span(w)⊥ :
v = projw (v) + (v − projw (v)), with projw (v) ∈ span(w) and v − projw (v) ∈ span(w)⊥ .
Example. Find the orthogonal projection of v = (−2, 1) onto the line spanned by w = (3, 1).
Solution.
Theorem 76. Let w ∈ Rn . Then for all v ∈ Rn
projw (v) = (1/(w · w)) w wT v.
Proof.
Remark. Note that (1/(w · w)) w wT is an n × n matrix, which we call the orthogonal
projection matrix onto span(w).
Example. Let w = (1, 1). Find the orthogonal projection matrix P onto span(w). Use it to
calculate the projections of (1, 0), (1, 1), (1, −1) onto span(w).
Solution.
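The formula of Theorem 76 translates directly into code. A NumPy sketch (my illustration) for w = (1, 1):

```python
import numpy as np

# Projection matrix onto span(w): P = (1/(w.w)) w w^T
w = np.array([1.0, 1.0])
P = np.outer(w, w) / np.dot(w, w)    # [[0.5, 0.5], [0.5, 0.5]]

for v in ([1.0, 0.0], [1.0, 1.0], [1.0, -1.0]):
    print(P @ np.array(v))
# (1,0)  -> (0.5, 0.5)
# (1,1)  -> (1, 1)   (already on the line)
# (1,-1) -> (0, 0)   (orthogonal to w)

print(np.allclose(P @ P, P))   # projecting twice changes nothing: True
```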
LINEAR ALGEBRA
Orthogonal projection onto a subspace
Theorem 77. Let W be a subspace of Rn and v ∈ Rn . Then v can be written uniquely as
v = v̂ + v⊥ , with v̂ in W and v⊥ in W ⊥ .
Proof.
(Figure: v decomposed as v̂ + v⊥ , with v̂ in the plane W = span(w1 , w2 ) and v⊥
perpendicular to W .)
Goal. Suppose Ax = b is inconsistent. Can we still find something like a best solution?
Definition. Let A be an m × n matrix and b ∈ Rm . A least squares solution (short: LSQ
solution) of the system Ax = b is a vector x̂ ∈ Rn such that
‖Ax̂ − b‖ ≤ ‖Ax − b‖ for all x ∈ Rn .
Example. Are there β1 , β2 ∈ R such that the data points
(x1 , y1 ) = (2, 1), (x2 , y2 ) = (5, 2), (x3 , y3 ) = (7, 3), (x4 , y4 ) = (8, 3)
all lie on the line y = β1 + β2 x?
Solution.
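No single line passes through all four points, so one fits the best line via the normal equations Xᵀ X β = Xᵀ y. A NumPy sketch (my illustration, using the four data points above):

```python
import numpy as np

# LSQ line fit y = b1 + b2 x through (2,1), (5,2), (7,3), (8,3)
x = np.array([2.0, 5.0, 7.0, 8.0])
y = np.array([1.0, 2.0, 3.0, 3.0])
X = np.column_stack([np.ones_like(x), x])     # design matrix [1 | x]

beta = np.linalg.solve(X.T @ X, X.T @ y)      # normal equations
print(np.allclose(beta, [2 / 7, 5 / 14]))     # b1 = 2/7, b2 = 5/14: True

# Same answer from numpy's built-in least-squares solver:
print(np.allclose(beta, np.linalg.lstsq(X, y, rcond=None)[0]))   # True
```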
Example. A scientist tries to find the relation between the mysterious quantities x and y .
She measures the following values:
x: −2, −1, 0, 1, 2
y: 5, 2.5, 2.25, 2, 5
Theorem 82. Every subspace of Rn has an orthonormal basis.
Algorithm. (Gram-Schmidt orthonormalization) Given a basis a1 , . . . , am , produce an
orthogonal basis b1 , . . . , bm and an orthonormal basis q1 , . . . , qm .
b1 = a1 , q1 = b1 /‖b1 ‖
b2 = a2 − projspan(q1 ) (a2 ) = a2 − (a2 · q1 )q1 , q2 = b2 /‖b2 ‖
b3 = a3 − projspan(q1 ,q2 ) (a3 ) = a3 − (a3 · q1 )q1 − (a3 · q2 )q2 , q3 = b3 /‖b3 ‖
···
Remark.
○ span(q1 , . . . , qi ) = span(a1 , . . . , ai ) for i = 1, . . . , m.
○ qj ∉ span(a1 , . . . , ai ) for all j > i.
Example. Let V = span((2, 1, 2), (0, 0, 3)). Use the Gram-Schmidt method to find an
orthonormal basis of V .
Solution.
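The algorithm above can be sketched in a few lines of NumPy (my illustration), applied to a1 = (2, 1, 2), a2 = (0, 0, 3):

```python
import numpy as np

# Gram-Schmidt: subtract projections onto earlier q's, then normalize
def gram_schmidt(vectors):
    qs = []
    for a in vectors:
        b = a - sum((a @ q) * q for q in qs)
        qs.append(b / np.linalg.norm(b))
    return qs

q1, q2 = gram_schmidt([np.array([2.0, 1.0, 2.0]), np.array([0.0, 0.0, 3.0])])
print(q1)                                    # (2/3, 1/3, 2/3)
print(np.isclose(q1 @ q2, 0.0))              # orthogonal: True
print(np.isclose(np.linalg.norm(q2), 1.0))   # unit length: True
```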
Theorem 83. (QR decomposition) Let A be an m × n matrix of rank n. There is an
m × n-matrix Q with orthonormal columns and an upper triangular n × n invertible matrix R
such that A = QR.
Proof.
Example. Find the QR decomposition of A = [1 2 4; 0 0 5; 0 3 6].
Solution.
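The hand computation can be checked against NumPy's built-in QR routine (its sign conventions may differ from a hand computation, but Q R still reconstructs A):

```python
import numpy as np

A = np.array([[1.0, 2.0, 4.0],
              [0.0, 0.0, 5.0],
              [0.0, 3.0, 6.0]])
Q, R = np.linalg.qr(A)
print(np.allclose(Q @ R, A))             # reconstructs A: True
print(np.allclose(Q.T @ Q, np.eye(3)))   # orthonormal columns: True
print(np.allclose(R, np.triu(R)))        # R is upper triangular: True
```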
LINEAR ALGEBRA
Spectral Theorem
Theorem 84. Let A be a symmetric n × n matrix. Then A has an orthonormal basis of
eigenvectors.
Proof.
Theorem 85. Let A be a symmetric n × n matrix. Then there is a diagonal matrix D and a
matrix Q with orthonormal columns such that A = QDQ T .
Proof.
(Figure: orthonormal eigenvectors q1 , q2 in the x1 x2 -plane; applying A stretches each qi
along its own axis, to Aq1 and Aq2 .)
LINEAR ALGEBRA
Singular Value Decomposition
Definition. Let A be an m × n matrix. A singular value decomposition of A is a
decomposition A = UΣV T where
○ U is an m × m matrix with orthonormal columns,
○ Σ is an m × n rectangular diagonal matrix with non-negative numbers on the diagonal,
○ V is an n × n matrix with orthonormal columns.
Remark. The diagonal entries σi = Σii which are positive are called the singular values of A.
We usually arrange them in decreasing order, that is σ1 ≥ σ2 ≥ . . .
Question. Let A be an m × n matrix with rank r . Recall why
○ Nul(AT A) = Nul(A) and Nul(AAT ) = Nul(AT ),
○ AT A is symmetric and has rank r .
Solution.
Algorithm. Let A be an m × n matrix with rank r .
○ Find an orthonormal eigenbasis (v1 , . . . , vn ) of AT A with eigenvalues
λ1 ≥ · · · ≥ λr > λr +1 = 0 = · · · = λn .
○ Set σi = √λi for i = 1, . . . , n.
○ Set u1 = (1/σ1 )Av1 , . . ., ur = (1/σr )Avr . (Magic: orthonormal!)
○ Find ur +1 , . . . , um ∈ Rm such that (u1 , . . . , um ) is an orthonormal basis of Rm .
○ Set U = [u1 . . . um ], Σ = the m × n rectangular diagonal matrix with diagonal entries
σ1 , . . . , σmin{m,n} , and V = [v1 . . . vn ].
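The algorithm above can be traced in NumPy for a small rank-2 example (a sketch, using `eigh` for the symmetric matrix Aᵀ A):

```python
import numpy as np

# SVD by hand for A = [-1 1 0; 0 -1 1] (rank r = 2)
A = np.array([[-1.0, 1.0, 0.0],
              [0.0, -1.0, 1.0]])

lams, V = np.linalg.eigh(A.T @ A)       # eigh returns ascending order...
lams, V = lams[::-1], V[:, ::-1]        # ...so reverse to descending
r = 2
sigmas = np.sqrt(lams[:r])              # singular values sqrt(3) and 1
U = A @ V[:, :r] / sigmas               # u_i = (1/sigma_i) A v_i, column by column

Sigma = np.zeros((2, 3))                # rectangular diagonal matrix
Sigma[:r, :r] = np.diag(sigmas)

print(np.allclose(U @ Sigma @ V.T, A))  # reconstructs A: True
print(np.allclose(U.T @ U, np.eye(2)))  # "magic": the u_i are orthonormal: True
```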
Theorem 87. Let A be an m × n matrix with rank r , and let U = [u1 . . . um ],
V = [v1 . . . vn ] be matrices with orthonormal columns and Σ be a rectangular diagonal
m × n matrix such that A = UΣV T is an SVD of A. Then
Definition. Let A be an m × n matrix with rank r . Given the compact singular value
decomposition A = Uc Σc VcT where
○ Uc = [u1 . . . ur ] is an m × r matrix with orthonormal columns,
○ Σc is an r × r diagonal matrix with positive diagonal elements,
○ Vc = [v1 . . . vr ] is an n × r matrix with orthonormal columns,
we define the pseudoinverse A+ of A as Vc Σc−1 UcT .
Example. Recall that
A := [−1 1 0; 0 −1 1] = UΣV T with
U = [1/√2 1/√2; −1/√2 1/√2], Σ = [√3 0 0; 0 1 0],
V T = [−1/√6 2/√6 −1/√6; −1/√2 0 1/√2; 1/√3 1/√3 1/√3].
Determine A+ .
Solution.
Theorem 88. Let v ∈ Col(AT ) and w ∈ Col(A). Then A+ Av = v and AA+ w = w.
Proof.
Remark. If A is n × n and invertible, then Col(A) = Rn . Thus A−1 = A+ .
Question. Let v ∈ Rn be such that v = vr + vn with vr ∈ Col(AT ) and vn ∈ Nul(A).
What is A+ Av?
Solution.
Theorem 89. Let A be an m × n matrix and let b ∈ Rm . Then A+ b is the LSQ solution of
Ax = b (with minimum length).
Proof.
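Theorem 89 can be checked with NumPy's built-in pseudoinverse (a sketch; rather than comparing exact entries, it verifies the defining properties and the LSQ claim via the normal equations):

```python
import numpy as np

A = np.array([[-1.0, 1.0, 0.0],
              [0.0, -1.0, 1.0]])
Aplus = np.linalg.pinv(A)

# Defining properties of the pseudoinverse:
print(np.allclose(A @ Aplus @ A, A))            # True
print(np.allclose(Aplus @ A @ Aplus, Aplus))    # True

# A+ b is an LSQ solution of Ax = b: the normal equations A^T(Ax - b) = 0 hold
b = np.array([1.0, 2.0])
x = Aplus @ b
print(np.allclose(A.T @ (A @ x - b), 0.0))      # True
```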
Remark. This is particularly useful when solving many different LSQ problems of the form
Ax = b, where A stays the same but b varies.
LINEAR ALGEBRA
PCA
Setup.
○ Given m objects, we measure the same n variables.
○ Thus m samples of n-dimensional data ; m × n matrix (each row is a sample)
○ Analyse this matrix to understand what drives the variance in the data.
Definition. Let X = [a1 . . . am ]T be an m × n matrix. We define the column average
µ(X ) of X as
µ(X ) := (1/m)(a1 + · · · + am ).
We say X is centered if µ(X ) = 0.
For X centered, we define the covariance matrix cov(X ) of X as (1/(m − 1)) X T X .
Remark.
○ If X is not centered, replace X by [a1 − µ(X ) . . . am − µ(X )]T .
○ If the columns of X are orthogonal, then cov(X ) is a diagonal matrix ⇝ each variable is
independent.
○ What if not? Idea: Pick an eigenbasis of X T X .
Principal component analysis
○ Input: centered m × n-matrix X .
○ Compute cov(X ).
○ Since cov(X ) is symmetric, we can find an orthonormal eigenbasis v1 , v2 , . . . , vn of cov(X )
with eigenvalues λ1 ≥ · · · ≥ λn ≥ 0.
○ Write cov(X ) as a sum of rank 1 matrices: cov(X ) = λ1 v1 v1T + · · · + λn vn vnT .
○ Each principal component vi explains part of the variance of the data. The larger λi , the
more of the variances is explained by vi .
Example. Let X be a centered 30 × 2-matrix. We plot the data and see that
cov(X ) = [7.515 20.863; 20.863 63.795].
An orthonormal eigenbasis is
v1 = (0.314, 0.95), v2 = (0.95, −0.314)
with λ1 = 70.685 and λ2 = 0.625. Thus
cov(X ) = λ1 v1 v1T + λ2 v2 v2T = [6.97 21.085; 21.085 63.793] + [0.564 −0.186; −0.186 0.0616].
PCA using SVD.
○ Let X be a centered data matrix. Observe that X T X = (m − 1) cov(X ).
○ To find orthonormal eigenbasis of cov(X ), it is enough to find orthonormal eigenbasis of
XTX.
○ Compute the SVD of X ; X = UΣV T .
○ The columns of V = v1 . . . vn are the desired orthonormal eigenbasis.
○ If σi is the singular value for vi , then λi = σi ^2 /(m − 1).
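The steps above can be sketched in NumPy on made-up data (the data matrix here is my own illustration); the eigen-route through cov(X) and the SVD-route give matching eigenvalues:

```python
import numpy as np

# Small synthetic 30 x 2 data matrix, then centered
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2)) @ np.array([[1.0, 2.0], [0.0, 3.0]])
X = X - X.mean(axis=0)

cov = X.T @ X / (X.shape[0] - 1)        # covariance matrix of centered data

# Route 1: orthonormal eigenbasis of cov(X) (cov is symmetric, so use eigh)
lams, V_eig = np.linalg.eigh(cov)

# Route 2: SVD of X; lambda_i = sigma_i^2 / (m - 1)
U, s, Vt = np.linalg.svd(X, full_matrices=False)
lams_svd = s ** 2 / (X.shape[0] - 1)

print(np.allclose(np.sort(lams), np.sort(lams_svd)))   # True
```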
LINEAR ALGEBRA
Review of Complex Numbers
Definition. C = {x + iy | x, y ∈ R} where i = √−1, or i ^2 = −1.
Any point (x, y ) in R2 can be viewed as a complex number: (x, y ) ↔ x + iy .
Definition. Given z = x + iy , w = u + iv , we define
z + w = (x + u) + i(y + v )
zw = (x + iy )(u + iv ) = xu + x(iv ) + (iy )u + (iy )(iv ) = (xu − yv ) + i(xv + yu)
Example. Compute i(x + iy ), i ^2 (x + iy ), i ^3 (x + iy ) and i ^4 (x + iy ).
Solution.
Goal. Use complex numbers (instead of real numbers) as scalars.
Definition. The (complex) vector space Cn consists of all complex column vectors
z = (z1 , z2 , . . . , zn )T , where z1 , z2 , . . . , zn are complex numbers.
○ Now multiplication by a complex scalar makes sense.
○ We can define subspaces, span, independence, basis, dimension for Cn in the usual way.
○ We can multiply complex vectors by complex matrices. Column space and Null space still
make sense.
○ The only difference is the dot product: you need to use the complex conjugate to get a
good notion of length:
(z1 , . . . , zn ) · (w1 , . . . , wn ) = z1 w̄1 + z2 w̄2 + · · · + zn w̄n .
Example. Find the complex eigenvectors and eigenvalues of A = [0 −1; 1 0].
Solution.
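NumPy computes the complex eigenvalues directly (a sketch; the characteristic polynomial λ² + 1 has no real roots, so the eigenvalues are the conjugate pair ±i):

```python
import numpy as np

# The rotation matrix A = [0 -1; 1 0] has no real eigenvalues; over C they are i and -i
A = np.array([[0.0, -1.0],
              [1.0, 0.0]])
eigvals, eigvecs = np.linalg.eig(A)
print(np.sort_complex(eigvals))                      # -i and i (up to rounding)
print(np.isclose(eigvals[0], np.conj(eigvals[1])))   # conjugate pair: True
```

This also illustrates Theorem 91 below: eigenvalues of a real matrix come in conjugate pairs.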
Definition. Let A be an m × n-matrix. The conjugate matrix Ā of A is obtained from A by
taking the complex conjugate of each entry of A.
Theorem 91. Let A be a matrix with real entries and let λ be an eigenvalue of A. Then λ̄ is
also an eigenvalue. Furthermore, if v is an eigenvector with eigenvalue λ, then v̄ is an
eigenvector with eigenvalue λ̄.
Proof.
Definition. Let A be an m × n-matrix. The conjugate transpose AH of A is defined as (Ā)T .
We say the matrix A is Hermitian if A = AH .
Example. Find the eigenvectors and eigenvalues of A = [1 1; 0 1]. Can complex numbers help
you find an eigenbasis of A?