Linear_Algebra_LectureNote
Lectures on YouTube:
https://www.youtube.com/@mathtalent
Seongjai Kim
Email: [email protected]
Learning Objectives
Real-world problems can be approximated as, and resolved by, systems of linear equations
Ax = b,   A = [a11 a12 · · · a1n; a21 a22 · · · a2n; · · · ; am1 am2 · · · amn] ∈ Rm×n,
where one of {x, b} is the input and the other is the output.
Contents
Title ii
Prologue iii
1 Linear Equations 1
1.1. Systems of Linear Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2. Row Reduction and Echelon Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.2.1. Echelon Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.2.2. The General Solution of Linear Systems . . . . . . . . . . . . . . . . . . . . . . . 14
1.3. Vector Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.3.1. Vectors in Rn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.3.2. Linear Combinations and Span . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Programming with Matlab/Octave . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
1.4. Matrix Equation Ax = b . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
1.5. Solution Sets of Linear Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
1.5.1. Solutions of Homogeneous Linear Systems . . . . . . . . . . . . . . . . . . . . . 38
1.5.2. Solutions of Nonhomogeneous Linear Systems . . . . . . . . . . . . . . . . . . . 41
1.7. Linear Independence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
1.8. Linear Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
1.9. The Matrix of A Linear Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
1.9.1. The Standard Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
1.9.2. Existence and Uniqueness Questions . . . . . . . . . . . . . . . . . . . . . . . . 61
2 Matrix Algebra 67
2.1. Matrix Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
2.1.1. Sum, Scalar Multiple, and Matrix Multiplication . . . . . . . . . . . . . . . . . . 68
2.1.2. Properties of Matrix Multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . 72
2.2. The Inverse of a Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
2.3. Characterizations of Invertible Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
3 Determinants 109
3.1. Introduction to Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
3.2. Properties of Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
A Appendix 227
A.1. Understanding / Interpretation of Eigenvalues and Eigenvectors . . . . . . . . . . . . . 228
A.2. Eigenvalues and Eigenvectors of Stochastic Matrices . . . . . . . . . . . . . . . . . . . 231
P Projects 265
P.1. Project Regression Analysis: Linear, Piecewise Linear, and Nonlinear Models . . . . . 266
Bibliography 275
Index 277
CHAPTER 1
Linear Equations
Contents of Chapter 1
1.1. Systems of Linear Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2. Row Reduction and Echelon Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3. Vector Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Programming with Matlab/Octave . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
1.4. Matrix Equation Ax = b . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
1.5. Solution Sets of Linear Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
1.7. Linear Independence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
1.8. Linear Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
1.9. The Matrix of A Linear Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
a1 x1 + a2 x2 + · · · + an xn = b, (1.1)
Solving (1.2): consider the system together with its augmented matrix.

    x1 + 2x2 = 4          [ 1  2 |  4 ]
  −2x1 + 3x2 = −1         [−2  3 | −1 ]

R2 ← R2 + 2·R1: (replacement)

    x1 + 2x2 = 4          [ 1  2 | 4 ]
         7x2 = 7          [ 0  7 | 7 ]

R2 ← R2/7: (scaling)

    x1 + 2x2 = 4          [ 1  2 | 4 ]
          x2 = 1          [ 0  1 | 1 ]

R1 ← R1 − 2·R2: (replacement)

    x1 = 2                [ 1  0 | 2 ]
    x2 = 1                [ 0  1 | 1 ]

(The third elementary row operation, an interchange R1 ↔ R2, simply swaps two rows.)
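These operations are easy to mimic in Matlab/Octave (introduced after Section 1.3) by manipulating the rows of the augmented matrix directly. A minimal sketch reproducing the three steps above; the variable name M is our own choice:

M = [1 2 4; -2 3 -1];          % augmented matrix of (1.2)
M(2,:) = M(2,:) + 2*M(1,:);    % replacement: R2 <- R2 + 2*R1
M(2,:) = M(2,:)/7;             % scaling:     R2 <- R2/7
M(1,:) = M(1,:) - 2*M(2,:);    % replacement: R1 <- R1 - 2*R2
disp(M)                        % [1 0 2; 0 1 1], i.e., x1 = 2, x2 = 1
% an interchange would be:  M([1 2],:) = M([2 1],:)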
Example 1.7. Solve the following system of linear equations, using the 3
EROs. Then, determine if the system is consistent.
x2 − 4x3 = 8
2x1 − 3x2 + 2x3 = 1
4x1 − 8x2 + 12x3 = 1
Solution.
Ans: Inconsistent. (Inconsistency means that there is no point at which the three planes meet.)
Example 1.8. Determine the values of h such that the given system is a
consistent linear system
x + h y = −5
2x − 8y = 6
Solution.
Ans: h ≠ −4
True-or-False 1.9.
a. Every elementary row operation is reversible.
b. Elementary row operations on an augmented matrix never change the
solution of the associated linear system.
c. Two linear systems are equivalent if they have the same solution set.
d. Two matrices are row equivalent if they have the same number of rows.
Solution.
Ans: T,T,T,F
You should show your work when reporting your homework. You can scan your solutions and answers, using a scanner or your phone, and then put them into a single file, either doc/docx or pdf.
Exercises 1.1
1. Consider the augmented matrix of a linear system. State in words the next two elementary row operations that should be performed in the process of solving the system.
   [1 6 −4 0 1; 0 1 7 0 −4; 0 0 −1 2 3; 0 0 2 1 −6]
2. The augmented matrix of a linear system has been reduced by row operations to the
form shown. Continue the appropriate row operations and describe the solution set.
   [1 1 0 0 4; 0 −1 3 0 −7; 0 0 1 −3 1; 0 0 0 2 4]
3. Solve the systems, or determine if they are inconsistent.
   (a) −x2 − 4x3 = 5
       x1 + 3x2 + 5x3 = −2
       3x1 + 7x2 + 7x3 = 6
   (b) x1 + 3x3 = 2
       x2 − 3x4 = 3
       −2x2 + 3x3 + 2x4 = 1
       3x1 + 7x4 = −5
4. Determine the value of h such that the matrix is the augmented matrix of a consistent linear system.
   [2 −3 h; −4 6 −5]
Ans: h = 5/2
5. An important concern in the study of heat transfer is to determine the steady-state temperature distribution of a thin plate when the temperature around the boundary is known. Assume the plate shown in the figure represents a cross section of a metal beam, with negligible heat flow in the direction perpendicular to the plate. Let T1, T2, · · · , T4 denote the temperatures at the four interior nodes of the mesh in the figure. The temperature at a node is approximately equal to the average of the four nearest nodes. For example, T1 = (10 + 20 + T2 + T4)/4, or 4T1 = 10 + 20 + T2 + T4. Write a system of four equations whose solution gives estimates for the temperatures T1, T2, · · · , T4, and solve it. (Figure 1.1)
Example 1.12. Verify whether the following matrices are in echelon form or reduced echelon form.
(a) [1 0 2 0 1; 0 1 3 0 4; 0 0 0 0 0]
(b) [2 0 0 5; 0 0 0 9; 0 1 0 6]
(c) [1 1 0; 0 0 1; 0 0 0]
(d) [1 1 2 2 3; 0 0 1 1 1; 0 0 0 0 4]
(e) [1 0 0 5; 0 1 0 6; 0 0 0 1]
(f) [0 1 0 5; 0 0 0 6; 0 0 1 2]
Pivot Positions
Terminologies
1) A pivot position is a location in A that corresponds to a leading 1
in the reduced echelon form of A.
2) A pivot column is a column of A that contains a pivot position.
Example 1.14. The matrix A is given with its reduced echelon form. Find the pivot positions and pivot columns of A.
A = [1 1 0 2 0; 1 1 1 3 0; 1 1 0 2 4] ∼ (R.E.F) [1 1 0 2 0; 0 0 1 1 0; 0 0 0 0 1]
Solution.
Terminologies
3) Basic variables: In the system Ax = b, the variables that corre-
spond to pivot columns (in [A : b]) are basic variables.
4) Free variables: In the system Ax = b, the variables that correspond
to non-pivotal columns are free variables.
Example 1.16. For the system of linear equations, identify its basic variables and free variables.
   −x1 − 2x2 = −3
         2x3 = 4
         3x3 = 6
Solution. Hint : You may start with its augmented matrix, and apply row operations.
5. Start with the rightmost pivot and work upward and to the left to make zeros above each pivot. If a pivot is not 1, make it 1 by a scaling operation.
Example 1.17. Row reduce the matrix into reduced echelon form.
A = [0 −3 −6 4 9; −2 −3 0 3 −1; 1 4 5 −9 −7]

Solution.
R1 ↔ R3:        [1 4 5 −9 −7; −2 −3 0 3 −1; 0 −3 −6 4 9]
R2 ← R2 + 2R1:  [1 4 5 −9 −7; 0 5 10 −15 −15; 0 −3 −6 4 9]
R2 ← R2/5:      [1 4 5 −9 −7; 0 1 2 −3 −3; 0 −3 −6 4 9]
R3 ← R3 + 3R2:  [1 4 5 −9 −7; 0 1 2 −3 −3; 0 0 0 −5 0]
R3 ← R3/(−5); zeros above the pivot: [1 4 5 0 −7; 0 1 2 0 −3; 0 0 0 1 0]
R1 ← R1 − 4R2:  [1 0 −3 0 5; 0 1 2 0 −3; 0 0 0 1 0]
The combination of Steps 1–4 is called the forward phase of the row reduction, while Step 5 is called the backward phase.
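In Matlab/Octave, both phases are carried out at once by the built-in function rref. A quick check of Example 1.17 (a sketch, not a substitute for the hand computation):

A = [0 -3 -6 4 9; -2 -3 0 3 -1; 1 4 5 -9 -7];   % matrix of Example 1.17
R = rref(A)     % returns [1 0 -3 0 5; 0 1 2 0 -3; 0 0 0 1 0]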
Example 1.18. Find the general solution of the system whose augmented matrix is
[A|b] = [1 0 −5 0 −8 3; 0 1 4 −1 0 6; 0 0 0 0 1 0; 0 0 0 0 0 0]
Solution. Hint : You should first row reduce it for the reduced echelon form.
Example 1.19. Find the general solution of the system whose augmented matrix is
[A|b] = [0 0 0 1 2; 0 1 3 0 2; 0 1 3 2 6; 1 0 −9 0 −8]
Solution.
Properties
1) Any nonzero matrix may be row reduced (i.e., transformed by el-
ementary row operations) into more than one matrix in echelon
form, using different sequences of row operations.
2) Once a matrix is in an echelon form, further row operations do not
change the pivot positions (Remark 1.15).
3) Each matrix is row equivalent to one and only one reduced eche-
lon matrix (Theorem 1.13, p. 10).
True-or-False 1.21.
a. The row reduction algorithm applies to only to augmented matrices for
a linear system.
b. If one row in an echelon form of an augmented matrix is [0 0 0 0 2 0],
then the associated linear system is inconsistent.
c. The pivot positions in a matrix depend on whether or not row inter-
changes are used in the row reduction process.
d. Reducing a matrix to an echelon form is called the forward phase of
the row reduction process.
Solution.
Ans: F,F,F,T
Exercises 1.2
1. Row reduce the matrices to reduced echelon form. Circle the pivot positions in the final matrix and in the original matrix, and list the pivot columns.
(a) [1 2 3 4; 4 5 6 7; 6 7 8 9]
(b) [1 3 5 7; 3 5 7 9; 5 7 9 1]
2. Find the general solutions of the systems (in parametric vector form) whose augmented matrices are given as
(a) [1 −7 0 6 5; 0 0 1 −2 −3; −1 7 −4 2 7]
(b) [1 2 −5 −6 0 −5; 0 1 −6 −3 0 2; 0 0 0 0 1 0; 0 0 0 0 0 0]
Ans: (a) x = [5, 0, −3, 0]T + x2 [7, 1, 0, 0]T + x4 [−6, 0, 2, 1]T ;
Ans: (b) x = [−9, 2, 0, 0, 0]T + x3 [−7, 6, 1, 0, 0]T + x4 [0, 3, 0, 1, 0]T ¹
3. In the following, we use the notation for matrices in echelon form: the leading entries are marked with ■, and any values (including zero) with ∗. Suppose each matrix represents the augmented matrix for a system of linear equations. In each case, determine if the system is consistent. If the system is consistent, determine if the solution is unique.
(a) [■ ∗ ∗ ∗; 0 ■ ∗ ∗; 0 0 ■ ∗]
(b) [0 ■ ∗ ∗ ∗; 0 0 ■ ∗ ∗; 0 0 0 0 ■]
(c) [■ ∗ ∗ ∗ ∗; 0 0 ■ ∗ ∗; 0 0 0 ■ ∗]
4. Choose h and k such that the system has (a) no solution, (b) a unique solution, and (c)
many solutions.
x1 + hx2 = 2
4x1 + 8x2 = k
5. Suppose the coefficient matrix of a system of linear equations has a pivot position in
every row. Explain why the system is consistent.
¹The superscript T denotes the transpose; for example, [a, b, c]T is the column vector [a; b; c].
For example,
[3; −2] ∈ R2 and [1; −5; 3; 4] ∈ R4.
1.3.1. Vectors in Rn
Vectors in R2
" #
a
We can identify a point (a, b) with a column vector , position vector.
b
" # " #
u1 v1
1) Equality of vectors: Two vectors u = and v = are equal
u2 v2
if and only if corresponding entries are equal, i.e., ui = vi , i = 1, 2.
" # " #
u1 v1
2) Addition: Let u = and v = . Then,
u2 v2
" # " # " #
u1 v1 u1 + v1
u+v = + = .
u2 v2 u2 + v2
" # " #
2 1
Remark 1.25. Let a1 = and a2 = . Then
1 −3
" # " # " # " #" # " #
2 1 2x1 + x2 2 1 x1 x1
x1 a1 + x2 a2 = x1 + x2 = = = [a1 a2 ] . (1.10)
1 −3 x1 − 3x2 1 −3 x2 x2
Vectors in Rn
Note: The above vector operations, including the parallelogram rule, are
also applicable for vectors in R3 and Rn , in general.
Algebraic Properties of Rn
For u, v, w ∈ Rn and scalars c and d,
1) u + v = v + u 5) c(u + v) = cu + cv
2) (u + v) + w = u + (v + w) 6) (c + d)u = cu + du
3) u + 0 = 0 + u = u 7) c(du) = (cd)u
4) u + (−u) = (−u) + u = 0 8) 1u = u
where −u = (−1)u
y = c1 v1 + c2 v2 + · · · + cp vp (1.11)
Determine whether or not b can be generated as a linear combination of a1 ,
a2 , and a3 .
Solution. Hint: We should determine whether weights x1, x2, x3 exist such that x1 a1 + x2 a2 + x3 a3 = b, which reads [a1 a2 a3][x1; x2; x3] = b. (See Remark 1.25 on p. 22.)
Note:
1) The vector equation x1 a1 + x2 a2 + · · · + xp ap = b has the same solu-
tion set as a linear system whose augmented matrix is [a1 a2 · · · ap : b].
2) b can be generated as a linear combination of a1, a2, · · · , ap if and only if the linear system Ax = b, whose augmented matrix is [a1 a2 · · · ap : b], is consistent.
Span{v1 , v2 , · · · , vp } = {y | y = c1 v1 + c2 v2 + · · · + cp vp } (1.13)
Example. Determine if b is a linear combination of the columns of the matrix [1 0 5; −2 1 −6; 0 2 8]. (That is, determine if b is in the span of the columns of the matrix.)
Solution.
Example 1.32. Find h so that [2; 1; h] lies in the plane spanned by a1 = [1; 2; 3] and a2 = [2; −1; 1].
Solution.
b ∈ Span{v1, v2, · · · , vp}
⇔ x1 v1 + x2 v2 + · · · + xp vp = b has a solution (1.14)
⇔ the system with augmented matrix [v1 v2 · · · vp : b] is consistent
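Criterion (1.14) can be tested numerically: row reduce the augmented matrix and look for a pivot in its last column. A minimal Matlab/Octave sketch, using the data of Exercise 2 in the next exercise set (the test logic is our own helper code):

a1 = [1; -2; 0];  a2 = [0; 1; 2];  a3 = [5; -6; 8];
b  = [2; -1; 6];
R  = rref([a1 a2 a3 b]);
% b is in Span{a1,a2,a3} iff no row of R looks like [0 ... 0 | nonzero]:
bad = any( all(R(:,1:end-1) == 0, 2) & R(:,end) ~= 0 );
if bad
    disp('b is NOT in the span')
else
    disp('b is in the span')
end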
True-or-False 1.33.
a. Another notation for the vector [1; −2] is [1 −2].
b. The set Span{u, v} is always visualized as a plane through the origin.
c. When u and v are nonzero vectors, Span{u, v} contains the line through
u and the origin.
Solution.
Ans: F,F,T
Exercises 1.3
1. Write a system of equations that is equivalent to the given vector equation; write a vector equation that is equivalent to the given system of equations.
(a) x1 [6; −1] + x2 [−3; 4] = [1; −7]
(b) x2 + 5x3 = 0
    4x1 + 6x2 − x3 = 0
2. Determine if b is a linear combination of a1, a2, and a3.
a1 = [1; −2; 0], a2 = [0; 1; 2], a3 = [5; −6; 8], b = [2; −1; 6].
Ans: Yes
3. Determine if b is a linear combination of the vectors formed from the columns of the matrix A.
(a) A = [1 −4 2; 0 3 5; −2 8 −4], b = [3; −7; −3]
(b) A = [1 −2 −6; 0 3 7; 1 −2 5], b = [11; −5; 9]
4. Let a1 = [1; 4; −2], a2 = [−2; −3; 7], and b = [4; 1; h]. For what value(s) of h is b in the plane spanned by a1 and a2? Ans: h = −17
5. Construct a 3 × 3 matrix A, with nonzero entries, and a vector b in R3 such that b is
not in the set spanned by the columns of A. Hint : Construct a 3 × 4 augmented matrix in
echelon form that corresponds to an inconsistent system.
6. A mining company has two mines. One day's operation at mine #1 produces ore that contains 20 metric tons of copper and 550 kilograms of silver, while one day's operation at mine #2 produces ore that contains 30 metric tons of copper and 500 kilograms of silver. Let v1 = [20; 550] and v2 = [30; 500]. Then v1 and v2 represent the "output per day" of mine #1 and mine #2, respectively.
(a) What physical interpretation can be given to the vector 5v1 ?
(b) Suppose the company operates mine #1 for x1 days and mine #2 for x2 days. Write a
vector equation whose solution gives the number of days each mine should operate
in order to produce 150 tons of copper and 2825 kilograms of silver.
(c) M² Solve the equation in (b).
²The mark M indicates that you have to solve the problem using one of Matlab, Maple, and Mathematica. You may also try "octave" as a free alternative to Matlab. Attach a copy of your code.
The most basic thing you will need to do is to enter vectors and matrices. You would enter commands to Matlab or Octave at a prompt that looks like >>.
• Rows are separated by semicolons (;) or Enter.
• Entries in a row are separated by commas (,) or Space.
For example:

Vectors and Matrices
>> u = [1; 2; 3]   % column vector
u =
   1
   2
   3
>> v = [4; 5; 6];
>> u + 2*v
ans =
    9
   12
   15
>> w = [5, 6, 7, 8]   % row vector
w =
   5   6   7   8
>> A = [2 1; 1 2];    % matrix
>> B = [-2, 5
        1, 2]
B =
  -2   5
   1   2
>> C = A*B            % matrix multiplication
C =
  -3  12
   0   9
You can save the commands in a file to run and get the same results.
tutorial1_vectors.m
u = [1; 2; 3]
v = [4; 5; 6];
u + 2*v
w = [5, 6, 7, 8]
A = [2 1; 1 2];
B = [-2, 5
     1, 2]
C = A*B
Solving equations
Let A = [1 −4 2; 0 3 5; 2 8 −4] and b = [3; −7; −3]. Then Ax = b can be numerically solved by implementing a code as follows.

tutorial2_solve.m
A = [1 -4 2; 0 3 5; 2 8 -4];
b = [3; -7; -3];
x = A\b

Result
x =
   0.75000
  -0.97115
  -0.81731
tutorial3_plot.m
close all

%% a curve
X1 = linspace(0,2*pi,10);   % n=10
Y1 = cos(X1);

%% another curve
X2 = linspace(0,2*pi,20);  Y2 = sin(X2);

%% plot together
plot(X1,Y1,'-or',X2,Y2,'--b','linewidth',3);
legend({'y=cos(x)','y=sin(x)'},'location','best',...
       'FontSize',16,'textcolor','blue')
print -dpng 'fig_cos_sin.png'
while loop
The syntax of a while loop in Matlab is as follows.
while <expression>
<statements>
end
An expression is true when the result is nonempty and contains all nonzero
elements, logical or real numeric; otherwise the expression is false. Here is
an example for the while loop.
n1=11; n2=20;
sum=n1;
while n1<n2
n1 = n1+1; sum = sum+n1;
end
fprintf('while loop: sum=%d\n',sum);
When the code above is executed, the result will be:
while loop: sum=155
for loop
A for loop is a repetition control structure that allows you to efficiently
write a loop that needs to execute a specific number of times. The syntax of
a for loop in Matlab is as follows:
for index = values
<program statements>
end
Here is an example for the for loop.
n1=11; n2=20;
sum=0;
for i=n1:n2
sum = sum+i;
end
fprintf('for loop: sum=%d\n',sum);
When the code above is executed, the result will be:
for loop: sum=155
Example 1.35.
which, in turn, has the same solution set as the system with augmented
matrix
[a1 a2 · · · an : b]. (1.18)
Does {v1 , v2 , v3 } span R3 ? Why or why not?
Example 1.39. Do the vectors [1; 0; 1; −2], [3; 1; 2; −8], and [−2; 1; −3; 2] span R4?
Solution.
True-or-False 1.40.
a. The equation Ax = b is referred to as a vector equation.
b. Each entry in Ax is the result of a dot product.
c. If A ∈ Rm×n and if Ax = b is inconsistent for some b ∈ Rm , then A can-
not have a pivot position in every row.
d. If the augmented matrix [A b] has a pivot position in every row, then
the equation Ax = b is inconsistent.
Solution.
Ans: F,T,T,F
Exercises 1.4
1. Write the system first as a vector equation and then as a matrix equation.
3x1 + x2 − 5x3 = 9
x2 + 4x3 = 0
2. Let u = [0; 0; 4] and A = [3 −5; −2 6; 1 1]. Is u in the plane in R3 spanned by the columns of A? (See the figure.) Why or why not?
Figure 1.8
3. The problems refer to the matrices A and B below. Make appropriate calculations that justify your answers and mention an appropriate theorem.
A = [1 3 0 3; −1 −1 −1 1; 0 −4 2 −8; 2 0 3 −1]
B = [1 3 −2 2; 0 1 1 −5; 1 2 −3 7; −2 −8 2 −1]
(a) How many rows of A contain a pivot position? Does the equation Ax = b have a
solution for each b in R4 ?
(b) Can each vector in R4 be written as a linear combination of the columns of the
matrix A above? Do the columns of A span R4 ?
(c) Can each vector in R4 be written as a linear combination of the columns of the
matrix B above? Do the columns of B span R4 ?
Ans: (a) 3; (b) Theorem 1.37 (d) is not true
4. Let v1 = [0; 0; −2], v2 = [0; −3; 8], v3 = [4; −1; −5]. Does {v1, v2, v3} span R3? Why or why not?
Ans: Yes; the matrix [v1 v2 v3] has a pivot position in each row.
5. Could a set of three vectors in R4 span all of R4 ? Explain. What about n vectors in Rm
when n < m?
6. Suppose A is a 4 × 3 matrix and b ∈ R4 with the property that Ax = b has a unique
solution. What can you say about the reduced echelon form of A? Justify your answer.
Hint : How many pivot columns does A have?
Linear Systems Ax = b:
1. Homogeneous linear systems:
Ax = 0; A ∈ Rm×n , x ∈ Rn , 0 ∈ Rm . (1.19)
Note: Ax = 0 has a nontrivial solution if and only if the system has at least
one free variable.
x = c1 u1 + c2 u2 + · · · + cr ur , (1.21)
Example 1.43. Solve the system and write the solution in parametric
vector form.
x1 + 2x2 − 3x3 = 0
2x1 + x2 − 3x3 = 0
−x1 + x2 = 0
Example 1.45. Solve the following equation in three variables and write the solution in parametric vector form.
x1 − 2x2 + 3x3 = 0
Solution. Hint: x1 is the only basic variable. Thus your solution will be of the form x = x2 v1 + x3 v2, which is a parametric vector equation of the plane.
Solution.
Ans: x = [−1; 2; 0] + x3 [4/3; 0; 1]
A(t v) = 0, (1.26)
True-or-False 1.49.
a. The solution set of Ax = b is the set of all vectors of the form {w =
p + uh }, where uh is the solution of the homogeneous equation Ax = 0.
(Compare with Theorem 1.47, p.42.)
b. The equation Ax = b is homogeneous if the zero vector is a solution.
c. The solution set of Ax = b is obtained by translating the solution of
Ax = 0.
Solution.
Ans: F,T,F
Exercises 1.5
1. Determine if the system has a nontrivial solution. Try to use as few row operations as
possible.
x1 v1 + x2 v2 + · · · + xp vp = 0 (1.27)
c1 v1 + c2 v2 + · · · + cp vp = 0. (1.28)
Solution.
Solution.
Solution.
Example 1.57. Find the value of h so that the vectors are linearly independent.
[3; −6; 1], [−6; 4; −3], [9; h; 3]
Solution.
Solution.
Solution.
True-or-False 1.60.
a. The columns of any 3 × 4 matrix are linearly dependent.
b. If u and v are linearly independent, and if {u, v, w} is linearly depen-
dent, then w ∈ Span{u, v}.
c. Two vectors are linearly dependent if and only if they lie on a line
through the origin.
d. The columns of a matrix A are linearly independent, if the equation
Ax = 0 has the trivial solution.
Solution.
Ans: T,T,T,F
Exercises 1.7
1. Determine if the columns of the matrix form a linearly independent set. Justify each answer.
(a) [−4 −3 0; 0 −1 4; 1 0 3; 5 4 6]
(b) [1 −3 3 −2; −3 7 −1 2; 0 1 −4 3]
2. Find the value(s) of h for which the vectors are linearly dependent. Justify each answer.
(a) [1; −1; 4], [3; −5; 7], [−1; 5; h]
(b) [1; 5; −3], [−2; −9; 6], [3; h; −9]
3. (a) For what values of h is v3 in Span{v1, v2}, and (b) for what values of h is {v1, v2, v3} linearly dependent? Justify each answer.
v1 = [1; −3; 2], v2 = [−3; 9; −6], v3 = [5; −7; h].
Ans: (a) No h; (b) All h
4. Describe the possible echelon forms of the matrix. Use the notation of Exercise 3 in
Section 1.2, p. 19.
T : Rn → Rm, x ↦ Ax (1.30)
" #
1 3
Example 1.64. Let A = . The transformation T : R2 → R2 defined
0 1
by T (x) = Ax is called a shear transformation. Determine the image of a
square [0, 2] × [0, 2] under T .
Solution. Hint: A matrix transformation is an affine mapping, which means that it maps line segments into line segments (and corners to corners).
Example 1.65. Let A = [1 −3; 3 5; −1 7], u = [2; −1], b = [3; 2; −5], c = [3; 2; 5], and define a transformation T : R2 → R3 by T(x) = Ax.
a. Find T (u), the image of u under the transformation T .
b. Find an x ∈ R2 whose image under T is b.
c. Is there more than one x whose image under T is b?
d. Determine if c is in the range of the transformation T .
Solution.
" #
1.5
Ans: b. x = ; c. no; d. no
−0.5
Linear Transformations
Definition 1.66. A transformation T is linear if:
(i) T (u + v) = T (u) + T (v), for all u, v in the domain of T
(ii) T (cu) = cT (u), for all scalars c and all u in the domain of T
T (0) = 0 (1.31)
and
T(c u + d v) = c T(u) + d T(v) , (1.32)
for all vectors u, v in the domain of T and all scalars c, d.
We can easily prove that if T satisfies (1.32), then T is linear: taking c = d = 1 gives (i), and taking d = 0 gives (ii).
Example 1.71. Let θ be the angle measured from the positive x-axis counterclockwise. Then, the rotation can be defined as
R[θ] = [cos θ  −sin θ; sin θ  cos θ] (1.35)
True-or-False 1.72.
a. If A ∈ R3×5 and T is a transformation defined by T (x) = Ax, then the
domain of T is R3 .
b. A linear transformation is a special type of function.
c. The superposition principle is a physical description of a linear trans-
formation.
d. Every matrix transformation is a linear transformation.
e. Every linear transformation is a matrix transformation. (If it is false,
can you find an example that is linear but of no matrix description?)
Solution.
Ans: F,T,T,T,F
Exercises 1.8
1. With T defined by T(x) = Ax, find a vector x whose image under T is b, and determine whether x is unique.
A = [1 −3 2; 0 1 −4; 3 −5 −9], b = [6; −7; −9]
Ans: x = [−5; −3; 1], unique
2. Is b in the range of the transformation x ↦ Ax? Why or why not?
Ans: yes
" # " #
5 −2
4. Use a rectangular coordinate system to plot u = , u= , and their images under
2 4
the given transformation T . (Make a separate and reasonably large sketch.) Describe
2
geometrically
" what#"T does
# to each vector x in R .
−1 0 x1
T (x) =
0 −1 x2
5. Show that the transformation T defined by T (x1 , x2 ) = (2x1 − 3x2 , x1 + 4, 5x2 ) is not linear.
Hint : T (0, 0) = 0?
6. Let T : R3 → R3 be the transformation that projects each vector x = (x1 , x2 , x3 ) onto the
plane x2 = 0, so T (x) = (x1 , 0, x3 ). Show that T is a linear transformation.
Hint : Try to verify (1.32): T (cx + dy) = T (cx1 + dy1 , cx2 + dy2 , cx3 + dy3 ) = · · · = cT (x) + dT (y).
In this section, we will find matrices for linear transformations defined on Rn. Let's begin with an example.
Example 1.73. Suppose T : R2 → R3 is a linear transformation such that
T(e1) = [5; −7; 2], T(e2) = [−3; 8; 0], where e1 = [1; 0] and e2 = [0; 1].
Solution. What we should do is to find a matrix A ∈ R3×2 such that
T(e1) = Ae1 = [5; −7; 2], T(e2) = Ae2 = [−3; 8; 0]. (1.36)
Let x = [x1; x2] ∈ R2. Then
x = x1 [1; 0] + x2 [0; 1] = x1 e1 + x2 e2. (1.37)
It follows from linearity of T that
T(x) = T(x1 e1 + x2 e2) = x1 T(e1) + x2 T(e2) = [T(e1) T(e2)] [x1; x2] = [5 −3; −7 8; 2 0] x, (1.38)
where the last matrix is A.
Thus
T (x) = T (x1 e1 + x2 e2 + · · · + xn en )
= x1 T (e1 ) + x2 T (e2 ) + · · · + xn T (en ) (1.41)
= [T (e1 ) T (e2 ) · · · T (en )] x,
and therefore the standard matrix reads
A = [T (e1 ) T (e2 ) · · · T (en )] . (1.42)
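Formula (1.42) translates directly into code: apply T to each column of the identity matrix and collect the images. A minimal Matlab/Octave sketch; the anonymous function T below is our own example, the shear of Example 1.64:

T = @(x) [x(1) + 3*x(2); x(2)];   % shear transformation from Example 1.64
n = 2;  I = eye(n);
A = zeros(2,n);
for j = 1:n
    A(:,j) = T(I(:,j));           % j-th column of A is T(e_j), as in (1.42)
end
disp(A)                           % prints [1 3; 0 1]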
" #
1 2
Ans: A = .
0 1
Example 1.77. Write the standard matrix for the linear transformation
T : R2 → R4 given by
T (x1 , x2 ) = (x1 + 4x2 , 0, x1 − 3x2 , x1 ).
Solution.
" #
0 −1
Ans: 3) A3 =
−1 0
Definition 1.79.
Figure 1.12: Injective?: Is each b ∈ Rm the image of one and only one x in Rn ?
Is T onto? Is T one-to-one?
Solution.
onto R3 ?
Solution.
Example 1.83. Let T(x) = [1 4; 0 0; 1 −3; 1 0] x. Is T one-to-one (1–1)? Is T onto?
Solution.
Example 1.84. Let T(x) = [1 0 0 1; 0 1 2 3; 0 2 4 6] x. Is T 1–1? Is T onto?
Solution.
Is T 1–1? Is T onto?
Solution.
True-or-False 1.86.
a. A mapping T : Rn → Rm is one-to-one if each vector in Rn maps onto a
unique vector in Rm .
b. If A is a 3 × 2 matrix, then the transformation x 7→ Ax cannot map R2
onto R3 .
c. If A is a 3 × 2 matrix, then the transformation x 7→ Ax cannot be one-
to-one. (See Theorem 1.81, p.62.)
d. A linear transformation T : Rn → Rm is completely determined by its
action on the columns of the n × n identity matrix.
Solution.
Ans: F,T,F,T
Exercises 1.9
1. Assume that T is a linear transformation. Find the standard matrix of T .
(a) T : R2 → R4 , T (e1 ) = (3, 1, 3, 1) and T (e2 ) = (5, 2, 0, 0), where e1 = (1, 0) and e2 =
(0, 1).
(b) T : R2 → R2 first performs a horizontal shear that transforms e2 into e2 − 2e1 (leav-
ing e1 unchanged) and then reflects points through the line x2 = −x1 .
" # " # " #
1 −2 0 −1 0 −1
Ans: (b) shear: and reflection: ; it becomes .
0 1 −1 0 −1 2
2. Show that T is a linear transformation by finding a matrix that implements the map-
ping. Note that x1 , x2 , · · · are not vectors but are entries in vectors.
3. Let T : R2 → R2 be a linear transformation such that T(x1, x2) = (x1 + x2, 4x1 + 5x2). Find x such that T(x) = (3, 8).
Ans: x = [7; −4]
4. Determine if the specified linear transformation is (1) one-to-one and (2) onto. Justify
each answer.
From elementary school on, you have learned about numbers and operations such as addition, subtraction, multiplication, division, and factorization. Matrices are also mathematical objects. Thus you may define matrix operations, similarly to what is done for numbers. Matrix algebra is the study of such matrix operations and related applications. The algorithms and techniques you will learn through this chapter are quite fundamental and important to develop further for application tasks.
Contents of Chapter 2
2.1. Matrix Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
2.2. The Inverse of a Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
2.3. Characterizations of Invertible Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
2.5. Solving Linear Systems by Matrix Factorizations . . . . . . . . . . . . . . . . . . . . . . 87
2.8. Subspaces of Rn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
2.9. Dimension and Rank . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
Let A be an m × n matrix. Let aij denote the entry in row i and column j. Then, we write A = [aij].
Terminologies
• If m = n, A is called a square matrix.
• If A is an n × n matrix, then the entries a11 , a22 , · · · , ann are called di-
agonal entries.
• A diagonal matrix is a square matrix (say n × n) whose non-diagonal
entries are zero.
Ex: Identity matrix In .
a) A + B, A + C?
b) A − 2B
Solution.
Matrix Multiplication
AB as a composition of mappings: x ∈ Rp ↦ Bx ∈ Rn ↦ A(Bx) ∈ Rm (2.2) (Figure 2.2)
Bx = [b1 b2 · · · bp] x = x1 b1 + x2 b2 + · · · + xp bp
⇒ A(Bx) = A(x1 b1 + x2 b2 + · · · + xp bp) = x1 Ab1 + x2 Ab2 + · · · + xp Abp = [Ab1 Ab2 · · · Abp] x (2.3)
⇒ AB = [Ab1 Ab2 · · · Abp] ∈ Rm×p,
where bi ∈ Rn.
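A quick numerical check of (2.3): build AB column by column and compare with the built-in product. (A sketch; the two matrices are arbitrary test data, taken from the Matlab tutorial in Chapter 1.)

A = [2 1; 1 2];  B = [-2 5; 1 2];
AB = zeros(size(A,1), size(B,2));
for j = 1:size(B,2)
    AB(:,j) = A*B(:,j);   % j-th column of AB is A times the j-th column of B
end
disp(AB - A*B)            % zero matrix: the two computations agree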
" #
2 5
Ans: B =
3 1
Solution.
" #
4
Ans: b1 =
−1
Remark 2.7.
1. (Commutativity) Suppose both AB and BA are defined. Then, in general, AB ≠ BA.
   Example: A = [3 −6; −1 2], B = [−1 1; 3 4]. Then AB = [−21 −21; 7 7], BA = [−4 8; 5 −10].
Ak = A · A · · · A (k times)
Transpose of a Matrix
True-or-False 2.11.
a. Each column of AB is a linear combination of the columns of B using
weights from the corresponding column of A.
b. The second row of AB is the second row of A multiplied on the right by
B.
c. The transpose of a sum of matrices equals the sum of their transposes.
Solution.
Challenge 2.12.
a. Show that if the columns of B are linearly dependent, then so are the
columns of AB.
b. Suppose CA = In (the n × n identity matrix). Show that the equation
Ax = 0 has only the trivial solution.
Exercises 2.1
1. Compute the product AB in two ways: (a) by the definition, where Ab1 and Ab2 are computed separately, and (b) by the row-column rule for computing AB.
A = [−1 2; 5 4; 2 −3] and B = [3 2; 2 1].
2. If a matrix A is 5 × 3 and the product AB is 5 × 7, what is the size of B?
" # " # " #
2 −3 8 4 5 −2
3. Let A = , B = , and C = . Verify that AB = AC and yet
−4 6 5 5 3 1
B 6= C.
" # " #
1 −2 −1 2 −1
4. If A = and AB = , determine the first and second columns of
−2 5 6 −9 3
B. " # " #
7 −8
Ans: b1 = , b2 =
4 −5
5. Give a formula for (ABx)T , where x is a vector and A and B are matrices of appropriate
sizes.
6. Let u = [−2; 3; −4] and v = [a; b; c]. Compute uTv, vTu, uvT, and vuT.
Ans: uTv = −2a + 3b − 4c and uvT = [−2a −2b −2c; 3a 3b 3c; −4a −4b −4c].
" #
−5 2
Ans: A−1 =
8 −3
Example 2.18. Find the inverse of A = [0 1 0; 1 0 3; 4 −3 8], if it exists.
Solution.
Example 2.19. Find the inverse of A = [1 2 −1; −4 −7 3; −2 −6 4], if it exists.
Solution.
Theorem 2.20.
a. (Inverse of a 2 × 2 matrix) Let A = [a b; c d]. If ad − bc ≠ 0, then A is invertible and
   A−1 = (1/(ad − bc)) [d −b; −c a] (2.5)
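Formula (2.5) in code, checked against the built-in inv (a sketch; the test matrix is the A of Exercise 1 below):

A = [3 -4; 7 -8];
d = A(1,1)*A(2,2) - A(1,2)*A(2,1);            % ad - bc
if d == 0, error('A is not invertible'); end
Ainv = [A(2,2) -A(1,2); -A(2,1) A(1,1)] / d;  % formula (2.5)
disp(Ainv)                                    % [-2 1; -7/4 3/4]
disp(norm(Ainv - inv(A)))                     % ~0 (agrees up to roundoff)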
Axj = ej , j = 1, 2, · · · , n.
True-or-False 2.24.
a. In order for a matrix B to be the inverse of A, both equations AB = In
and BA = In must be true.
" #
a b
b. If A = and ad = bc, then A is not invertible.
c d
c. If A is invertible, then elementary row operations that reduce A to the
identity In also reduce A−1 to In .
Solution.
Ans: T,T,F
Exercises 2.2
"
# 1 −2 1
3 −4
1. Find the inverses of the matrices, if exist: A = and B = 4 −7 3
7 −8
−2 6 −4
Ans: B is not invertible.
2. Use matrix algebra to show that if A is invertible and D satisfies AD = I, then D = A−1 .
Hint : You may start with AD = I and then multiply A−1 .
3. Solve the equation AB + C = BC for A, assuming that A, B, and C are square and B is
invertible.
4. Explain why the columns of an n × n matrix A span Rn when A is invertible. Hint : If A
is invertible, then Ax = b has a solution for all b in Rn .
5. Suppose A is n × n and the equation Ax = 0 has only the trivial solution. Explain why
A is row equivalent to In . Hint : A has n pivot columns.
6. Suppose A is n × n and the equation Ax = b has a solution for each b in Rn . Explain
why A must be invertible. Hint : A has n pivot columns.
Solution.
Example 2.27. Can a square matrix with two identical columns be invert-
ible?
Exercises 2.3
1. An m × n lower triangular matrix is one whose entries above the main diagonal are 0's. When is a square lower triangular matrix invertible? Justify your answer. Hint: See Example 2.28.
2. Is it possible for a 5 × 5 matrix to be invertible when its columns do not span R5 ? Why
or why not?
Ans: No
3. If A is invertible, then the columns of A−1 are linearly independent. Explain why.
4. If C is 6 × 6 and the equation Cx = v is consistent for every v ∈ R6 , is it possible that
for some v, the equation Cx = v has more than one solution? Why or why not?
Ans: No
5. If the equation Gx = y has more than one solution for some y ∈ Rn , can the columns of
G span Rn ? Why or why not?
Ans: No
6. Let T : R2 → R2 be a linear transformation such that T (x1 , x2 ) = (6x1 − 8x2 , −5x1 + 7x2 ).
Show that T is invertible and find a formula for T −1 . Hint : See Example 2.30.
A = LU, (2.8)
Example. Let A = [1 −2 1; 2 −2 3; −3 2 0].
a) Reduce A to an echelon matrix, using replacement operations.
b) Express the replacement operations as elementary matrices.
c) Find their inverses.
Solution. a), b) & c)
Example. Find an LU factorization of A = [3 −1 1; 9 1 2; −6 5 −5].
Solution. (Forward Phase: Gauss Elimination)
A = [3 −1 1; 9 1 2; −6 5 −5]
  → (E1: R2 ← R2 − 3R1; E2: R3 ← R3 + 2R1) → [3 −1 1; 0 4 −1; 0 3 −3]
  → (E3: R3 ← R3 − (3/4)R2) → [3 −1 1; 0 4 −1; 0 0 −9/4] = U (2.12)

A → U: R2 ← R2 − 3R1 ⇒ R3 ← R3 + 2R1 ⇒ R3 ← R3 − (3/4)R2
E3 E2 E1 A = U ⇒ A = (E3 E2 E1)−1 U, so L = I (E3 E2 E1)−1 = I E1−1 E2−1 E3−1 (2.13)
I → L: R2 ← R2 + 3R1 ⇐ R3 ← R3 − 2R1 ⇐ R3 ← R3 + (3/4)R2
I = [1 0 0; 0 1 0; 0 0 1]
  → (E3−1: R3 ← R3 + (3/4)R2) → [1 0 0; 0 1 0; 0 3/4 1]
  → (E2−1: R3 ← R3 − 2R1; E1−1: R2 ← R2 + 3R1) → [1 0 0; 3 1 0; −2 3/4 1] = L. (2.14)
Example. Redo the LU factorization of A = [3 −1 1; 9 1 2; −6 5 −5], storing the multipliers in place.
Solution. (Practical Implementation):
A = [3 −1 1; 9 1 2; −6 5 −5]
  → (E1: R2 ← R2 − 3R1; E2: R3 ← R3 + 2R1) → [3 −1 1; (3) 4 −1; (−2) 3 −3] (2.15)
  → (E3: R3 ← R3 − (3/4)R2) → [3 −1 1; (3) 4 −1; (−2) (3/4) −9/4],
where the parenthesized entries are the saved multipliers, i.e., the subdiagonal entries of L.
function A = lu_inplace(A)
% In-place LU factorization without pivoting (function wrapper ours):
% on return, U occupies the upper triangle of A and the multipliers
% (the subdiagonal entries of L) are stored below the diagonal.
[m,n] = size(A);
for k = 1:m-1
    A(k+1:m,k) = A(k+1:m,k)/A(k,k);   % ratios to pivot
    for i = k+1:m
        A(i,k+1:n) = A(i,k+1:n) - A(i,k)*A(k,k+1:n);
    end
end
end
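A quick way to exercise the kernel above: factor the matrix from the previous example and rebuild A from the stored factors. (A sketch; lu_inplace is our name for the function above, assumed saved as lu_inplace.m.)

A = [3 -1 1; 9 1 2; -6 5 -5];
F = lu_inplace(A);          % multipliers below the diagonal, U on and above it
L = tril(F,-1) + eye(3);    % unit lower triangular part
U = triu(F);
disp(L*U - A)               % zero matrix: LU reproduces A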
L y = b, (2.17)
i.e.,
ℓ11 y1 = b1
ℓ21 y1 + ℓ22 y2 = b2
ℓ31 y1 + ℓ32 y2 + ℓ33 y3 = b3
   ⋮
ℓn1 y1 + ℓn2 y2 + ℓn3 y3 + · · · + ℓnn yn = bn (2.18)
• With y1 , y2 known, we can solve the third equation for y3 , and so on.
Upper-Triangular Systems
for i=n:-1:1
    if(U(i,i)==0), error('U: singular!'); end
    x(i) = y(i)/U(i,i);
    y(1:i-1) = y(1:i-1) - U(1:i-1,i)*x(i);
end
(2.24)
Example. Use an LU factorization to solve Ax = b, where A = [1 −2 1; 2 −2 3; −3 2 0] and b = [−2; 1; 1].
Solution.
forward_sub.m
function y = forward_sub(L,b)
% function y = forward_sub(L,b)

[m,n] = size(L);
y = zeros(m,1);  y(1) = b(1)/L(1,1);

for i=2:m
    y(i) = ( b(i) - L(i,1:i-1)*y(1:i-1) ) / L(i,i);
end
back_sub.m
function x = back_sub(U,y)
% function x = back_sub(U,y)

[m,n] = size(U);
x = zeros(m,1);  x(m) = y(m)/U(m,m);

for i=m-1:-1:1
    x(i) = ( y(i) - U(i,i+1:end)*x(i+1:end) ) / U(i,i);
end
lu_solve.m
A = [ 1 -2 1
      2 -2 3
     -3  2 0];
b = [-2 1 1]';

x = A\b

[L,U,P] = lu(A)
r = P*b;
y = forward_sub(L,r);
x = back_sub(U,y)

Output
x =
   1
   2
   1

L =
   1.00000   0.00000   0.00000
  -0.33333   1.00000   0.00000
  -0.66667   0.50000   1.00000

U =
  -3.00000   2.00000   0.00000
   0.00000  -1.33333   1.00000
   0.00000   0.00000   2.50000

P =
   0   0   1
   1   0   0
   0   1   0

x =
   1
   2
   1
Exercises 2.5
1. (Hand calculation) Solve the equation Ax = b by using the LU factorization given for A.
A = [4 3 −5; −4 −5 7; 8 6 −8] = [1 0 0; −1 1 0; 2 0 1][4 3 −5; 0 −2 2; 0 0 2] and b = [2; −4; 6].
Here Ly = b requires replacement operations forward, while Ux = y requires replacement and scaling operations backward.
Ans: x = [1/4; 2; 1].
2. (Hand calculation) When A is invertible, Matlab finds A−1 by factoring A = LU , in-
verting L and U , and then computing U −1 L−1 . You will use this method to compute the
inverse of A given in Exercise 1.
(a) Find U −1 , starting from [U I], reduce it to [I U −1 ].
(b) Find L−1 , starting from [L I], reduce it to [I L−1 ].
(c) Compute U −1 L−1 .
Ans: A−1 = [1/8 3/8 1/4; −3/2 −1/2 1/2; −1 0 1/2].
(a) Try to see if you can find LU factorization without pivoting.
(b) M Use Matlab/Octave to find the LU factorization of A. Then recover A from
[L,U,P].
5. Find the LU factorization (without pivoting) of B = [2 5 4 3; −4 −9 −6 −2; 2 7 7 14; −6 −14 −10 −2].
Ans: U = [2 5 4 3; 0 1 2 4; 0 0 −1 3; 0 0 0 3].
¹All problems marked by M will have a higher credit.
2.8. Subspaces of Rn
Example 2.48.
Col A = {u | u = c1 a1 + c2 a2 + · · · + cn an }, (2.25)
Example. Determine whether b is in the column space of A, Col A.
Solution. Clue: ① b ∈ Col A
⇔ ② b is a linear combination of columns of A
⇔ ③ Ax = b is consistent
⇔ ④ [A b] has a solution
Proof.
Remark 2.54.
1. {[1; 0], [0; 2]} is a basis for R2.
2. Let e1 = [1; 0; · · · ; 0], e2 = [0; 1; · · · ; 0], · · · , en = [0; · · · ; 0; 1]. Then {e1, e2, · · · , en} is called the standard basis for Rn.
Solution. [A 0] ∼ [1 2 0 1 0; 0 0 1 2 0; 0 0 0 0 0]
Theorem 2.56. Basis for Nul A can be obtained from the parametric
vector form of solutions of Ax = 0. That is, suppose that the solutions of
Ax = 0 reads
x = x1 u1 + x2 u2 + · · · + xk uk ,
where x1 , x2 , · · · , xk correspond to free variables. Then, a basis for Nul A
is {u1 , u2 , · · · , uk }.
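Matlab can produce such a basis directly: null(A,'r') returns a "rational" basis read off from the reduced echelon form, matching the construction of Theorem 2.56 (plain null(A) returns an orthonormal basis instead; Octave's support of the 'r' option may vary). A sketch with the matrix of Example 2.59 below:

A = [3 -6 9 0; 2 -4 7 2; 3 -6 6 -6];   % matrix of Example 2.59
N = null(A,'r')    % one basis vector per free variable
disp(A*N)          % zero matrix: each column of N solves Ax = 0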
Example 2.59. Matrix A and its echelon form are given. Find a basis for Col A and a basis for Nul A.
A = [3 −6 9 0; 2 −4 7 2; 3 −6 6 −6] ∼ [1 −2 3 0; 0 0 1 2; 0 0 0 0]
Solution.
Exercises 2.8
1. Let v1 = [1; −2; 4; 3], v2 = [4; −7; 9; 7], v3 = [5; −8; 6; 5], and u = [−4; 10; −7; −5]. Determine if u is in the subspace of R4 generated by {v1, v2, v3}.
Ans: No
2. Let v1 = [−3; 0; 6], v2 = [−2; 2; 3], v3 = [0; −6; 3], and p = [1; 14; −9]. Determine if p is in Col A, where A = [v1 v2 v3].
Ans: Yes
3. Give integers p and q such that Nul A is a subspace of Rp and Col A is a subspace of Rq.
A = [3 2 1 −5; −9 −4 1 7; 9 2 −5 1]
B = [1 2 3; 4 5 7; −5 −1 0; 2 7 11]
4. Determine which sets are bases for R3. Justify each answer.
a) [0; 1; −2], [5; −7; 4], [6; 3; 5]
b) [1; −6; −7], [3; −4; 7], [−2; 7; 5], [0; 1; −2]
Ans: a) Yes
5. Matrix A and its echelon form are given. Find a basis for Col A and a basis for Nul A.
A = [1 4 8 −3 −7; −1 2 7 3 4; −2 2 9 5 5; 3 6 9 −5 −2] ∼ [1 4 8 0 5; 0 2 5 0 −1; 0 0 0 1 4; 0 0 0 0 0]
Hint: For a basis for Col A, you can just recognize pivot columns, while you should find the solutions of Ax = 0 for Nul A.
6. a) Suppose F is a 5 × 5 matrix whose column space is not equal to R5 . What can you
say about Nul F ?
b) If R is a 6 × 6 matrix and Nul R is not the zero subspace, what can you say about
Col R?
c) If Q is a 4 × 4 matrix and Col Q = R4 , what can you say about solutions of equations
of the form Qx = b for b in R4 ?
Ans: b) Col R ≠ R6. Why? c) It always has a unique solution
Remark 2.61. The main reason for selecting a basis for a subspace
H (instead of merely a spanning set) is that each vector in H can be
written in only one way as a linear combination of the basis vectors.
x = c1 b1 + c2 b2 + · · · + cp bp ; x = d1 b1 + d2 b2 + · · · + dp bp , x ∈ H.
Show that c1 = d1 , c2 = d2 , · · · , cp = dp .
Solution. Hint : A property of a basis is that basis vectors are linearly independent
Then B is a basis for H = Span{v1 , v2 }, because v1 and v2 are linearly
independent. Determine if x is in H, and if it is, find the coordinate vector
of x relative to B.
Solution.
Remark 2.66. The grid on the plane in Figure 2.4 makes H "look" like R2. The correspondence x ↦ [x]B is a one-to-one correspondence between H and R2 that preserves linear combinations. We call such a correspondence an isomorphism, and we say that H is isomorphic to R2.
Example 2.68.
a) dim Rn = n.
b) Let H be as in Example 2.65. What is dim H?
c) G = Span{u}. What is dim G?
Solution.
Remark 2.69.
1) Dimension of Col A:
Example 2.71. A matrix and its echelon form are given. Find the bases for Col A and Nul A and also state the dimensions of these subspaces.
A = [1 −2 −1 5 4; 2 −1 1 5 6; −2 0 −2 1 −6; 3 1 4 1 5] ∼ [1 0 1 0 0; 0 1 1 0 0; 0 0 0 1 0; 0 0 0 0 1]
Solution.
Example 2.72. Find a basis for the subspace spanned by the given vectors. What is the dimension of the subspace?
[1; −1; −2; 3], [2; −3; −1; 4], [0; −1; 3; 2], [−1; 4; −7; 7], [3; −7; 6; −9]
Solution.
Example 2.75.
True-or-False 2.76.
a. Each line in Rn is a one-dimensional subspace of Rn .
b. The dimension of Col A is the number of pivot columns of A.
c. The dimensions of Col A and Nul A add up to the number of columns of
A.
d. If B = {v1 , v2 , · · · , vp } is a basis for a subspace H of Rn , then the corre-
spondence x 7→ [x]B makes H look and act the same as Rp .
Hint : See Remark 2.66.
Exercises 2.9
1. A matrix and its echelon form is given. Find the bases for Col A and Nul A, and then
state the dimensions of these subspaces.
A = [1 −3 2 −4; 3 9 −1 5; 2 −6 4 −3; 4 12 2 7] ∼ [1 −3 2 −4; 0 0 5 −7; 0 0 0 5; 0 0 0 0]
2. Use the Rank Theorem to justify each answer, or perform construction.
(a) If the subspace of all solutions of Ax = 0 has a basis consisting of three vectors and
if A is a 5 × 7 matrix, what is the rank of A?
Ans: 4
(b) What is the rank of a 4 × 5 matrix whose null space is three-dimensional?
(c) If the rank of a 7 × 6 matrix A is 4, what is the dimension of the solution space of
Ax = 0?
(d) Construct a 4 × 3 matrix with rank 1.
CHAPTER 3
Determinants
For a 2 × 2 matrix A = [u1 u2], the determinant of A (in modulus) is the same as the area of the parallelogram generated by u1 and u2.
In this chapter, you will study the determinant and its properties.
Contents of Chapter 3
3.1. Introduction to Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
3.2. Properties of Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
Ans: 3) 12
Example. Compute the determinant by a cofactor expansion across the first row and down column 3.
Solution.
Ans: −2
Example 3.6. Compute the determinant of A = [1 −2 5 2; 2 −6 −7 5; 0 0 3 0; 5 0 4 4] by a cofactor expansion.
Solution.
True-or-False 3.8.
a. An n × n determinant is defined by determinants of (n − 1) × (n − 1)
submatrices.
b. The (i, j)-cofactor of a matrix A is the matrix Aij obtained by deleting
from A its i-th row and j-th column.
c. The cofactor expansion of det A down a column is equal to the cofactor
expansion along a row.
d. The determinant of a triangular matrix is the sum of the entries on the
main diagonal.
Solution.
Ans: T,F,T,F
Exercises 3.1
1. Compute the determinants using a cofactor expansion across the first row and down the second column.
a) [3 0 4; 2 3 2; 0 5 −1]
b) [2 3 −3; 4 0 3; 6 1 5]
Ans: a) 1, b) −24
2. Compute the determinants by cofactor expansions. At each step, choose a row or column that involves the least amount of computation.
a) [3 5 −6 4; 0 −2 3 −3; 0 0 1 5; 0 0 0 3]
b) [4 0 −7 3 −5; 0 0 2 0 0; 7 3 −6 4 −8; 5 0 5 2 −3; 0 0 9 −1 2]
Ans: a) −18, b) 6
3. The expansion of a 3 × 3 determinant can be remembered by the following device. Write
a second copy of the first two columns to the right of the matrix, and compute the deter-
minant by multiplying entries on six diagonals:
Figure 3.1
Then, add the downward diagonal products and subtract the upward products. Use this
method to compute the determinants for the matrices in Exercise 1. Warning: This
trick does not generalize in any reasonable way to 4 × 4 or larger matrices.
4. Explore the effect of an elementary row operation on the determinant of a matrix. In
each case, state the row operation and describe how it affects the determinant.
" # " # " # " #
a b c d a b a + kc b + kd
a) , b) ,
c d a b c d c d
Ans: b) Replacement does not change the determinant
" #
3 1
5. Let A = . Write 5A. Is det (5A) = 5det A?
4 2
Example 3.10. Compute det A, where A = [1 −4 2; −2 8 −9; −1 7 0], after applying a couple of steps of replacement operations.
Solution.
Ans: 15
c) If A is invertible, then det A−1 = 1/(det A). (∵ det In = 1.)
Solution.
Ans: −30
Then,
T(cx) = c T(x), T(u + v) = T(u) + T(v). (3.6)
This (multi-) linearity property of the determinant turns out to have
many useful consequences that are studied in more advanced courses.
True-or-False 3.14.
a. If the columns of A are linearly dependent, then det A = 0.
b. det (A + B) = det A + det B.
c. If three row interchanges are made in succession, then the new deter-
minant equals the old determinant.
d. The determinant of A is the product of the diagonal entries in A.
e. If det A is zero, then two rows or two columns are the same, or a row or
a column is zero.
Solution.
Ans: T,F,F,F,F
Exercises 3.2
1. Find the determinant by row reduction to echelon form.
[1 3 0 2; −2 −5 7 4; 3 5 2 1; 1 −1 2 −3]
Ans: 0
2. Use determinants to find out if the matrix is invertible.
[2 0 0 6; 1 −7 −5 0; 3 8 6 0; 0 7 5 4]
Ans: Invertible
3. Use determinants to decide if the set of vectors is linearly independent.
[7; −4; −6], [−8; 5; 7], [7; 0; −5]
Ans: linearly independent
4. Compute det B^6, where B = [1 0 1; 1 1 2; 1 2 1].
Ans: 64
5. Show or answer with justification.
a) Let A and P be square matrices, with P invertible. Show that det (P AP −1 ) = det A.
b) Suppose that A is a square matrix such that det A3 = 0. Can A can be invertible?
Ans: No
c) Let U be a square matrix such that U T U = I. Show that det U = ±1.
6. Compute AB and verify that det AB = det A · det B.
A = [3 0; 6 1], B = [2 0; 5 4]
CHAPTER 4
Vector Spaces
A vector space is a nonempty set of objects on which two operations are defined:
• addition and
• scalar multiplication.
In this chapter, we will study basic concepts of such general vector spaces
and their subspaces.
Contents of Chapter 4
4.1. Vector Spaces and Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
4.2. Null Spaces, Column Spaces, and Linear Transformations . . . . . . . . . . . . . . . . 125
Then Pn is a vector space, with the usual polynomial addition and scalar
multiplication.
c) Let V = {all real-valued functions defined on a set D}. Then, V is a vec-
tor space, with the usual function addition and scalar multiplication.
Solution.
b) For each u, w ∈ H, u + w ∈ H
True-or-False 4.10.
a. A vector is an arrow in three-dimensional space.
b. A subspace is also a vector space.
c. R2 is a subspace of R3 .
d. A subset H of a vector space V is a subspace of V if the following con-
ditions are satisfied: (i) the zero vector of V is in H, (ii) u, v, and u + v
are in H, and (iii) c is a scalar and cu is in H.
Solution.
Ans: F,T,F,F (In (ii), there is no statement that u and v represent all possible elements of H)
Exercises 4.1
You may use Definition 4.3, p. 121, or Theorem 4.7, p. 122.
1. Let V be the first quadrant in the xy-plane; that is, let V = {[x; y] : x ≥ 0, y ≥ 0}.
2. Let H be the set of all vectors of the form (s, 3s, 2s).
a) Find a vector v in R3 such that H = Span{v}. Ans: a) v = [1; 3; 2]
b) Why does this show that H is a subspace of R3?
3. Let W be the set of all vectors of the form [5b + 2c; b; c].
Proof.
Solution. [A 0] ∼ [1 −2 0 −1 3 0; 0 0 1 2 −2 0; 0 0 0 0 0 0] (R.E.F)
True-or-False 4.23.
a. The column space of A is the range of the mapping x 7→ Ax.
b. The kernel of a linear transformation is a vector space.
c. Col A is the set of all vectors that can be written as Ax for some x.
That is, Col A = {b | b = Ax, for x ∈ Rn }.
d. Nul A is the kernel of the mapping x 7→ Ax.
e. Col A is the set of all solutions of Ax = b.
Solution.
Ans: T,T,T,T,F
Exercises 4.2
1. Either show that the given set is a vector space, or find a specific example to the contrary.
(a) {[a; b; c] : a + b + c = 2}
(b) {[b − 2d; b + 3d; d] : b, d ∈ R}
Contents of Chapter 5
5.1. Eigenvectors and Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
5.2. The Characteristic Equation and Similarity Transformation . . . . . . . . . . . . . . . 138
5.3. Diagonalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
5.5. Complex Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
5.7. Applications to Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
5.8. Iterative Estimates for Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
5.9. Applications to Markov Chains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
Example 5.6. Find a basis for the eigenspace, and hence the dimension of the eigenspace, of A = [4 0 −1; 3 0 3; 2 −2 5], corresponding to the eigenvalue λ = 3.
Solution.
Example 5.7. Find a basis for the eigenspace, and hence the dimension of the eigenspace, of A = [4 −1 6; 2 1 6; 2 −1 8], corresponding to the eigenvalue λ = 2.
Solution.
What are the eigenvalues of A and B?
Solution.
Ax = 0x = 0. (5.1)
Proof.
c1 v1 + c2 v2 + · · · + cp vp = vp+1 (5.2)
• Multiplying both sides of (5.2) by λp+1 and subtracting the result from
(5.3), we have
c1 (λ1 − λp+1 )v1 + c2 (λ2 − λp+1 )v2 + · · · + cp (λp − λp+1 )vp = 0. (5.4)
c1 = c2 = · · · = cp = 0 ⇒ vp+1 = 0,
which is a contradiction.
Example 5.12. Show that if A2 is the zero matrix, then the only eigenvalue
of A is 0.
Solution. Hint : You may start with Ax = λx, x 6= 0.
True-or-False 5.13.
a. If Ax = λx for some vector x, then λ is an eigenvalue of A.
b. A matrix A is not invertible if and only if 0 is an eigenvalue of A.
c. A number c is an eigenvalue of A if and only if the equation (A−cI)x = 0
has a nontrivial solution.
d. If v1 and v2 are linearly independent eigenvectors, then they corre-
spond to distinct eigenvalues.
e. An eigenspace of A is a null space of a certain matrix.
Solution.
Ans: F,T,T,F,T
Exercises 5.1
" #
7 3
1. Is λ = −2 an eigenvalue of ? Why or why not?
3 −1
Ans: Yes
" # " #
1 −3 1
2. Is an eigenvector of ? If so, find the eigenvalue.
4 −3 8
3. Find a basis for the eigenspace corresponding to each listed eigenvalue.
(a) A = [1 0 −1; 1 −3 0; 4 −13 1], λ = −2
(b) B = [3 0 2 0; 1 3 1 0; 0 1 1 0; 0 0 0 4], λ = 4
Ans: (a) [1; 1; 3] (b) [2; 3; 1; 0] and another vector [0; 0; 0; 1]
4. Find the eigenvalues of the matrix [0 0 0; 0 2 5; 0 0 −1].
5. For A = [1 2 3; 1 2 3; 1 2 3], find one eigenvalue, with no calculation. Justify your answer.
Ans: 0. Why?
6. Prove that λ is an eigenvalue of A if and only if λ is an eigenvalue of AT . (A and AT have
exactly the same eigenvalues, which is frequently used in engineering applications of
linear algebra.)
Hint : 1 λ is an eigenvalue of A
⇔ 2 (A − λI)x = 0, for some x 6= 0
⇔ 3 (A − λI) is not invertible.
Now, try to use the Invertible Matrix Theorem (Theorem 2.25) to finish your proof. Note
that (A − λI)T = (AT − λI).
Ax = λx, x ≠ 0.
Solution.
Similarity
The next theorem illustrates one use of the characteristic polynomial, and
it provides the foundation for the computation of eigenvalues.
Proof. B = P −1 AP . Then,
B − λI = P −1 AP − λI
= P −1 AP − λP −1 P
= P −1 (A − λI)P,
from which we conclude
det (B − λI) = det (P −1 ) det (A − λI) det (P ) = det (A − λI).
True-or-False 5.24.
a. The determinant of A is the product of the diagonal entries in A.
b. An elementary row operation on A does not change the determinant.
c. (det A)(det B) = det AB
d. If λ + 5 is a factor of the characteristic polynomial of A, then 5 is an
eigenvalue of A.
e. The multiplicity of a root r of the characteristic equation of A is called
the algebraic multiplicity of r as an eigenvalue of A.
Solution.
Ans: F,F,T,F,T
Exercises 5.2
" #
5 −3
1. Find the characteristic polynomial and the eigenvalues of .
−4 3 √
Ans: λ = 4 ± 13
2. Find the characteristic polynomial of matrices. [Note. Finding the characteristic poly-
nomial of a 3 × 3 matrix is not easy to do with just row operations, because the variable
is involved.]
(a) [0 3 1; 3 0 2; 1 2 0]
(b) [−1 0 1; −3 4 1; 0 0 2]
Ans: (b) −λ^3 + 5λ^2 − 2λ − 8 = −(λ + 1)(λ − 2)(λ − 4)
(a) Construct a random integer-valued 4 × 4 matrix A, and verify that A and AT have
the same characteristic polynomial (the same eigenvalues with the same multiplic-
ities). Do A and AT have the same eigenvectors?
(b) Make the same analysis of a 5 × 5 matrix.
Note. Figure out by yourself how to generate random integer-valued matrices, how to
make its transpose, and how to get eigenvalues and eigenvectors.
5.3. Diagonalization
5.3.1. The Diagonalization Theorem
A = P DP −1 (or P −1 AP = D) (5.7)
A2 = (P DP −1 )(P DP −1 ) = P D2 P −1
Ak = P Dk P −1
(5.8)
A−1 = P D−1 P −1 (when A is invertible)
det A = det D
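A numerical illustration of (5.8): eig returns a matrix P of eigenvectors and a diagonal D with A·P = P·D, so powers of A can be formed from powers of D. (A sketch; the test matrix is chosen to match the closed-form Ans below.)

A = [7 2; -4 1];        % its eigenvalues are 3 and 5
[P,D] = eig(A);         % columns of P: eigenvectors; diag(D): eigenvalues
k = 5;
Ak = P * D^k / P;       % A^k = P D^k P^{-1}, by (5.8)
disp(norm(Ak - A^k))    % ~0: agrees with the directly computed power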
" #
2 · 5k − 3k 5k − 3k
Ans: Ak =
2 · 3k − 2 · 5k 2 · 3k − 5k
where Avk = λk vk, k = 1, 2, · · · , n, while
PD = [v1 v2 · · · vn] diag(λ1, λ2, · · · , λn) = [λ1 v1 λ2 v2 · · · λn vn]. (5.11)
Ans: P = [1 −1 −1; −1 1 0; 1 0 1] and D = diag(1, −2, −2).
Ans: p(λ) = (λ − 2)^3 (λ − 4). P = [0 0 −1 1; 1 0 0 0; 0 1 0 0; 0 0 1 1] and D = diag(2, 2, 2, 4).
Solution.
True-or-False 5.37.
a. A is diagonalizable if A = P DP −1 , for some matrix D and some invert-
ible matrix P .
b. If Rn has a basis of eigenvectors of A, then A is diagonalizable.
c. If A is diagonalizable, then A is invertible.
d. If A is invertible, then A is diagonalizable.
e. If AP = P D, with D diagonal, then the nonzero columns of P must be
eigenvectors of A.
Solution.
Ans: F,T,F,F,T
Exercises 5.3
1. The matrix A is factored in the form P DP −1. Find the eigenvalues of A and a basis for each eigenspace.
A = [2 2 1; 1 3 1; 1 2 2] = [1 1 2; 1 0 −1; 1 −1 0] · diag(5, 1, 1) · [1/4 1/2 1/4; 1/4 1/2 −3/4; 1/4 −1/2 1/4]
Ax = λx, where x ≠ 0.
det (A − λI) = 0.
" #
0 −1
Example 5.39. Let A = and consider the linear transformation
1 0
x 7→ Ax, x ∈ R2 .
Then
a) It rotates the plane counterclockwise through a quarter-turn.
b) The action of A is periodic, since after four quarter-turns, a vector is
back where it started.
c) Obviously, no nonzero vector is mapped into a multiple of itself, so
A has no eigenvectors in R2 and hence no real eigenvalues.
Suppose A is real and Ax = λx. Taking conjugates, A x̄ = Ā x̄ = \overline{Ax}, and \overline{Ax} = \overline{λx} = λ̄ x̄.
That is,
Ax = λx ⇔ A x̄ = λ̄ x̄. (5.12)
If λ is an eigenvalue of A with corresponding eigenvector x, then the complex conjugate λ̄ is also an eigenvalue, with eigenvector x̄.
" #
1 5
Example 5.41. Let A = .
−2 3
Solution.
A = P C P⁻¹. (5.13)
Example 5.43. Find the eigenvalues of C = [a −b; b a].
Solution.
Exercises 5.5
1. Let each matrix act on C2 . Find the eigenvalues and a basis for each eigenspace in C2 .
" # " #
1 −2 1 5
(a) (b)
1 3 −2 3
" #
−1 + i
Ans: (a) An eigen-pair: λ = 2 + i,
1
" # " #
5 −2 a −b
2. Let A = . Find an invertible matrix P and a matrix C of the form such
1 3 b a
that A = P CP −1 . " # " #
1 −1 4 −1
Ans: P = ,C= .
1 0 1 4
3. Let A be an n × n real matrix with the property that Aᵀ = A. Let x be any vector in Cⁿ, and let q = x̄ᵀAx. The equalities below show that q is a real number by verifying that q̄ = q. Give a reason for each step:
q̄ = \overline{x̄ᵀAx} =(a) xᵀĀx̄ =(b) xᵀAx̄ =(c) (xᵀAx̄)ᵀ =(d) x̄ᵀAᵀx =(e) q
5.7. Applications to Differential Equations
The scalar equation x′ = a x is solved by separation of variables: integrating dx/x = a dt gives
ln |x| = a t + K, i.e., x(t) = C eᵃᵗ.
Example. Solve
dx/dt = 5x, x(0) = 2. (5.17)
(a) Find the solution x(t).
(b) Check that the solution satisfies both the differential equation and the initial condition.
Solution.
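A quick numerical check of the solution x(t) = 2e^{5t} (the script and its name are ours; the derivative is approximated by a centered difference):
check_ivp.m
x = @(t) 2*exp(5*t);               % candidate solution of (5.17)
t = 0.3; h = 1e-6;
dxdt = (x(t+h) - x(t-h))/(2*h);    % centered-difference approximation of x'(t)
disp([dxdt, 5*x(t)])               % the two values agree: x' = 5x
disp(x(0))                         % = 2: the initial condition holds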
where
x(t) = [x1 (t), · · · , xn (t)]ᵀ, x′(t) = [x′1 (t), · · · , x′n (t)]ᵀ, and A = [aij ] ∈ Rⁿˣⁿ.
If (λ, v) is an eigen-pair of A, then x(t) = v e^{λt} is a solution of (5.19).
Solution. The two eigen-pairs are (λi , vi ), i = 1, 2. Then the general solution is x = c1 x1 + c2 x2 = c1 v1 e^{λ1 t} + c2 v2 e^{λ2 t}.
" # " #
2 −t 1 −2t
Ans: x(t) = −2 e +2 e
−3 −1
What are the directions of greatest attraction and greatest repulsion?
Solution.
• Case (ii): If A has a double eigenvalue λ (with v), then you should
find a second generalized eigenvector by solving, e.g.,
(A − λI) w = v. (5.27)
(The above can be derived from a guess: x = tveλt + weλt .) Then the
general solution becomes [2]
x(t) = c1 veλt + c2 (tv + w)eλt (5.28)
If A has a complex eigenvalue λ = a + bi (b ≠ 0) with eigenvector v, let
y1 (t) = [(Re v) cos bt − (Im v) sin bt] eᵃᵗ
y2 (t) = [(Re v) sin bt + (Im v) cos bt] eᵃᵗ.
Then they are linearly independent and satisfy the dynamical system.
Thus, the real-valued general solution of the dynamical system reads
x(t) = c1 y1 (t) + c2 y2 (t).
Exercises 5.7
1. (i) Solve the initial-value problem x0 = Ax with x(0) = (3, 2). (ii) Classify the nature of
the origin as an attractor, repeller, or saddle point of the dynamical system. (iii) Find
the directions of greatest attraction and/or repulsion. (iv) When the origin is a saddle
point, sketch typical trajectories.
" # " #
−2 −5 7 −1
(a) A = (b) A = .
1 4 3 3
Ans:"(a) #The origin is a saddle point.
" #
−5 −1
The direction of G.A. = . The direction of G.R. = .
1 " 1#
1
Ans: (b) The origin is a repeller. The direction of G.R. = .
1
2. Use the strategies in (5.27) and (5.28) to solve
" # " #
7 1 2
x0 = x, x(0) = .
−4 3 −5
"# " # " #
1 5t 1 0 5t
Ans: x(t) = 2 e − t + e
−2 −2 1
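Since the exact solution is x(t) = e^{At}x(0), the answer can be verified with the matrix exponential (script name ours):
check_double_eigenvalue.m
A = [7 1; -4 3]; x0 = [2; -5];
t = 0.7;
xf = 2*[1;-2]*exp(5*t) - (t*[1;-2] + [0;1])*exp(5*t);   % the Ans above
xe = expm(A*t)*x0;                                      % exact: x(t) = e^{At} x(0)
disp(norm(xf - xe))                                     % ~0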
The power method approximates the dominant eigenvalue λ1 (the eigenvalue of largest magnitude) and its associated eigenvector v1 .
• In general, writing x = Σⱼ βj vj in an eigenvector basis,
A^k x = Σ_{j=1}^{n} βj λj^k vj , k = 1, 2, · · · , (5.34)
which gives
A^k x = λ1^k · Σ_{j=1}^{n} βj (λj /λ1 )^k vj = λ1^k · [β1 v1 + β2 (λ2 /λ1 )^k v2 + · · · + βn (λn /λ1 )^k vn ]. (5.35)
• For j = 2, 3, · · · , n, since |λj /λ1 | < 1, we have lim_{k→∞} |λj /λ1 |^k = 0, so A^k x lines up with the direction of v1 . Normalizing at each step gives the algorithm:
initialization: x0 = x/||x||∞
for k = 1, 2, · · ·
    yk = A xk−1 ; µk = ||yk ||∞ (5.37)
    xk = yk /µk
end for
Example 5.58. The matrix A = [−4 1 −1; 1 −3 2; −1 2 −3] has eigenvalues and eigenvectors as follows:
eig(A) = (−6, −3, −1), with associated eigenvectors (1, −1, 1), (−2, −1, 1), (0, 1, 1).
Verify that the sequence produced by the power method converges to the dominant eigenvalue and its associated eigenvector.
Solution.
power_iteration.m
A = [-4 1 -1; 1 -3 2; -1 2 -3];  %[V,D] = eig(A)
x = [1 0 0]';                    % initial vector (loop below reconstructed to reproduce the Output)
for k = 1:10
  y = A*x; [~,i] = max(abs(y)); mu = y(i); x = y/mu;  % scale by signed inf-norm entry
  fprintf('k=%2d: x = [%.5f, %.5f, %.5f], mu=%.5f (error = %.5f)\n',k,x,mu,abs(-6-mu))
end
Output
k= 1: x = [1.00000, -0.25000, 0.25000], mu=-4.00000 (error = 2.00000)
k= 2: x = [1.00000, -0.50000, 0.50000], mu=-4.50000 (error = 1.50000)
k= 3: x = [1.00000, -0.70000, 0.70000], mu=-5.00000 (error = 1.00000)
k= 4: x = [1.00000, -0.83333, 0.83333], mu=-5.40000 (error = 0.60000)
k= 5: x = [1.00000, -0.91176, 0.91176], mu=-5.66667 (error = 0.33333)
k= 6: x = [1.00000, -0.95455, 0.95455], mu=-5.82353 (error = 0.17647)
k= 7: x = [1.00000, -0.97692, 0.97692], mu=-5.90909 (error = 0.09091)
k= 8: x = [1.00000, -0.98837, 0.98837], mu=-5.95385 (error = 0.04615)
k= 9: x = [1.00000, -0.99416, 0.99416], mu=-5.97674 (error = 0.02326)
k=10: x = [1.00000, -0.99708, 0.99708], mu=-5.98833 (error = 0.01167)
Notice that |−6 − µk | ≈ (1/2)|−6 − µk−1 |, for which |λ2 /λ1 | = 1/2.
Avi = λi vi , i = 1, 2, · · · , n. (5.40)
Then (A − qI)vi = (λi − q)vi , (5.41)
and thus we obtain
(A − qI)⁻¹ vi = 1/(λi − q) · vi . (5.42)
• That is, when q ∉ {λ1 , λ2 , · · · , λn }, the eigenvalues of (A − qI)⁻¹ are
1/(λ1 − q), 1/(λ2 − q), · · · , 1/(λn − q), (5.43)
with the same eigenvectors {v1 , v2 , · · · , vn } of A.
Example. Let A be the matrix of Example 5.58. Find the eigenvalue of A nearest to q = −5/2, using the inverse power method.
Solution.
inverse_power.m
A = [-4 1 -1; 1 -3 2; -1 2 -3];  %[V,D] = eig(A)
q = -5/2; x = [1 0 0]'; B = inv(A - q*eye(3));  % (loop below reconstructed to reproduce the Output)
for k = 1:10
  y = B*x; [~,i] = max(abs(y)); x = y/y(i); lam = q + 1/y(i);
  fprintf('k=%2d: x = [%.5f, %.5f, %.5f], lambda=%.7f (error = %.7f)\n',k,x,lam,abs(-3-lam))
end
Output
k= 1: x = [1.00000, 0.40000, -0.40000], lambda=-3.2000000 (error = 0.2000000)
k= 2: x = [1.00000, 0.48485, -0.48485], lambda=-3.0303030 (error = 0.0303030)
k= 3: x = [1.00000, 0.49782, -0.49782], lambda=-3.0043668 (error = 0.0043668)
k= 4: x = [1.00000, 0.49969, -0.49969], lambda=-3.0006246 (error = 0.0006246)
k= 5: x = [1.00000, 0.49996, -0.49996], lambda=-3.0000892 (error = 0.0000892)
k= 6: x = [1.00000, 0.49999, -0.49999], lambda=-3.0000127 (error = 0.0000127)
k= 7: x = [1.00000, 0.50000, -0.50000], lambda=-3.0000018 (error = 0.0000018)
k= 8: x = [1.00000, 0.50000, -0.50000], lambda=-3.0000003 (error = 0.0000003)
k= 9: x = [1.00000, 0.50000, -0.50000], lambda=-3.0000000 (error = 0.0000000)
k=10: x = [1.00000, 0.50000, -0.50000], lambda=-3.0000000 (error = 0.0000000)
Exercises 5.8
1. M The matrix in Example 5.58 has eigenvalues {−6, −3, −1}. We may try to find the
eigenvalue of A nearest to q = −3.1.
(a) Estimate (mathematically) the convergence speed of the inverse power method.
(b) Verify it by implementing the inverse power method, with x0 = [0, 1, 0]T .
2. M Let A = [2 −1 0 0; −1 2 0 −1; 0 0 4 −2; 0 −1 −2 4]. Use the indicated methods to approximate the eigenvalues and their associated eigenvectors of A to within 10⁻¹² accuracy.
Data
• Observe a vector
p = [p1 , p2 , p3 ]ᵀ, (5.45)
where pi is the proportion (probability) of birds on the i-th level. Note: p1 + p2 + p3 = 1.
• After 10 minutes, we have a new distribution of the birds
p′ = [p′1 , p′2 , p′3 ]ᵀ. (5.46)
Model
• We assume that the change from p to p′ is given by a linear operator on R³. In other words, there is a matrix T ∈ R3×3 such that
p′ = T p. (5.47)
The matrix T is called the transition matrix for the Markov chain.
• Another 10 minutes later, we observe another distribution
p″ = T p′. (5.48)
Note: The same matrix T is used in (5.47) and (5.48), because we assume
that the probability of a bird moving to another level is indepen-
dent of time.
• In other words, the probability of a bird moving to a particular level
depends only on the present state of the bird, and not on any past
states.
• This type of model is known as a finite Markov chain.
Definition. A vector p = [p1 , p2 , · · · , pn ]ᵀ with nonnegative entries that add up to 1 is called a probability vector.
• A (left) stochastic matrix is a square matrix whose columns are probability vectors.
Remark. If T = [v1 v2 · · · vn ] is a stochastic matrix and p is a probability vector, then q = T p = p1 v1 + p2 v2 + · · · + pn vn is again a probability vector:
sum(q) = sum(p1 v1 + p2 v2 + · · · + pn vn ) = p1 + p2 + · · · + pn = 1.
x1 = T x0 , x2 = T x1 , x3 = T x2 , · · · (5.49)
xk+1 = T xk , k = 0, 1, 2, · · · (5.50)
Figure 5.4: Annual percentage migration between a city and its suburbs.
Example 5.63. Figure 5.4 shows population movement between a city and
its suburbs. Then, the annual migration between these two parts of the
metropolitan region can be expressed by the migration matrix M:
" #
0.95 0.03
M= . (5.51)
0.05 0.97
Suppose the 2023 population of the region is 60,000 in the city and 40,000 in
the suburbs. What is the distribution of the population in 2024? In 2025?
Solution.
annual_migration.m
M = [0.95 0.03
     0.05 0.97];

x0 = [60000
      40000];

x1 = M*x0
x2 = M*x1

Output
x1 =
   58200
   41800

x2 =
   56544
   43456
Solution. (a) From the information given, we derive the transition matrix:
T = [1/2 1/4 1/6; 1/3 1/2 1/3; 1/6 1/4 1/2]. (5.52)
birds_on_aviary.m
T = [1/2 1/4 1/6
     1/3 1/2 1/3
     1/6 1/4 1/2];           % transition matrix (5.52); these lines reconstructed
p = [1 0 0]'; q = [0 0 1]';
fprintf('p_%-2d = [%.5f %.5f %.5f]; q_%-2d = [%.5f %.5f %.5f]\n',0,p,0,q)
fprintf('%s\n',repelem('-',1,68))

n=12;
for k=1:n
    p = T*p; q = T*q;
    fprintf('p_%-2d = [%.5f %.5f %.5f]; q_%-2d = [%.5f %.5f %.5f]\n',k,p,k,q)
end
Output
p_0  = [1.00000 0.00000 0.00000]; q_0  = [0.00000 0.00000 1.00000]
--------------------------------------------------------------------
p_1  = [0.50000 0.33333 0.16667]; q_1  = [0.16667 0.33333 0.50000]
p_2  = [0.36111 0.38889 0.25000]; q_2  = [0.25000 0.38889 0.36111]
p_3  = [0.31944 0.39815 0.28241]; q_3  = [0.28241 0.39815 0.31944]
p_4  = [0.30633 0.39969 0.29398]; q_4  = [0.29398 0.39969 0.30633]
p_5  = [0.30208 0.39995 0.29797]; q_5  = [0.29797 0.39995 0.30208]
p_6  = [0.30069 0.39999 0.29932]; q_6  = [0.29932 0.39999 0.30069]
p_7  = [0.30023 0.40000 0.29977]; q_7  = [0.29977 0.40000 0.30023]
p_8  = [0.30008 0.40000 0.29992]; q_8  = [0.29992 0.40000 0.30008]
p_9  = [0.30003 0.40000 0.29997]; q_9  = [0.29997 0.40000 0.30003]
p_10 = [0.30001 0.40000 0.29999]; q_10 = [0.29999 0.40000 0.30001]
p_11 = [0.30000 0.40000 0.30000]; q_11 = [0.30000 0.40000 0.30000]
p_12 = [0.30000 0.40000 0.30000]; q_12 = [0.30000 0.40000 0.30000]
Steady-State Vectors
Definition. A steady-state vector of a stochastic matrix T is a probability vector q such that
T q = q. (5.53)
Such a q is found by solving
T x = x ⇔ T x − x = 0 ⇔ (T − I)x = 0. (5.54)
Ans: q = [3/7, 4/7]ᵀ
Example. Let T = [0.5 0.2 0.3; 0.3 0.8 0.3; 0.2 0 0.4].
(a) Is T regular?
(b) Find a steady-state vector for T , using the power method.
Solution.
q = lim_{k→∞} T^k x0 , (5.56)
T^k → [q q · · · q] ∈ Rⁿˣⁿ, as k → ∞. (5.57)
regular_stochastic_Tk.m
T = [0.5 0.2 0.3
     0.3 0.8 0.3
     0.2 0 0.4];
Tk = eye(3);
rref(T-Tk)

for k = 1:20
    Tk = Tk*T;
    if(mod(k,10)==0), fprintf('T^%d =\n',k); disp(Tk) end
end
Output
ans =
    1.0000         0   -3.0000
         0    1.0000   -6.0000
         0         0         0

T^10 =
    0.3002    0.2999    0.3002
    0.5994    0.6004    0.5994
    0.1004    0.0997    0.1004

T^20 =
    0.3000    0.3000    0.3000
    0.6000    0.6000    0.6000
    0.1000    0.1000    0.1000
Ans: T, F, T, F, F
Exercises 5.9
1. Find the steady-state vector, deriving the RREF.
" # .7 .1 .1
.1 .6
(a) (b) .2 .8 .2
.9 .4
.1 .1 .7
Ans: (b) [1/4, 1/2, 1/4]T
2. The weather in Starkville, MS, is either good, indifferent, or bad on any given day.
• If the weather is good today, there is a 60% chance the weather will be good tomor-
row, a 30% chance the weather will be indifferent, and a 10% chance the weather
will be bad.
• If the weather is indifferent today, it will be good tomorrow with probability .40 and
indifferent with probability .30.
• Finally, if the weather is bad today, it will be good tomorrow with probability .40
and indifferent with probability .50.
xk = T^k x0 = T^k ei . (5.58)
(a) Find all eigenvalues and corresponding eigenvectors, using e.g. eig in Matlab.
(b) Express the eigenvector corresponding to λ = 1 as a probability vector p.
(c) Use the power method to find a steady-state vector q, beginning with x0 = e1 .
(d) Compare p with q.
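A sketch for this exercise (script name ours). The transition matrix is assembled from the stated probabilities, with columns ordered good, indifferent, bad; the last entry of each column is forced by the column sum 1:
weather_markov.m
T = [0.6 0.4 0.4        % P(good tomorrow | good, indifferent, bad today)
     0.3 0.3 0.5        % P(indifferent tomorrow | ...)
     0.1 0.3 0.1];      % P(bad tomorrow | ...): columns sum to 1
[V,D] = eig(T);
[~,i] = min(abs(diag(D) - 1));   % locate the eigenvalue lambda = 1
p = V(:,i)/sum(V(:,i))           % steady-state probability vector
x = [1;0;0];                     % power method, x0 = e1
for k = 1:50, x = T*x; end
disp(x')                         % agrees with p'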
Chapter 6
Orthogonality and Least-Squares
Contents of Chapter 6
6.1. Inner Product, Length, and Orthogonality . . . . . . . . . . . . . . . . . . . . . . . . . . 180
6.2. Orthogonal Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
6.3. Orthogonal Projections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
6.4. The Gram-Schmidt Process and QR Factorization . . . . . . . . . . . . . . . . . . . . . 203
6.5. Least-Squares Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
6.6. Machine Learning: Regression Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
Distance in Rn
Definition 6.6. For u, v ∈ Rⁿ, the distance between u and v is dist(u, v) = ||u − v||.
Example 6.7. Compute the distance between the vectors u = (7, 1) and v = (3, 2).
Solution.
Orthogonal Complements
True-or-False 6.19.
a. For any scalar c, u•(cv) = c(u•v).
b. If the distance from u to v equals the distance from u to −v, then u and
v are orthogonal.
c. For a square matrix A, vectors in Col A are orthogonal to vectors in
Nul A.
d. For an m × n matrix A, vectors in the null space of A are orthogonal to
vectors in the row space of A.
Solution.
Ans: T,T,F,T
Exercises 6.1
1. Let x = (3, −1, −5) and w = (6, −2, 3). Find x•x, x•w, and (x•w)/(x•x).
Ans: 35, 5, 1/7
2. Find the distance between u = (0, −5, 2) and z = (−4, −1, 8).
Ans: 2√17
3. Verify the parallelogram law for vectors u and v in Rⁿ:
||u + v||² + ||u − v||² = 2||u||² + 2||v||².
Example 6.21. Is the set {u1 , u2 , u3 } orthogonal?
Solution.
Theorem. If S = {u1 , u2 , · · · , up } is an orthogonal set of nonzero vectors, then S is linearly independent. Indeed, suppose
c1 u1 + c2 u2 + · · · + cp up = 0.
Take the dot product with u1 . Then the above equation becomes c1 (u1 •u1 ) = 0, so c1 = 0; repeating with u2 , · · · , up gives
c1 = c2 = · · · = cp = 0.
Theorem. If {u1 , u2 , · · · , up } is an orthogonal basis for a subspace W of Rⁿ, then the weights in
y = c1 u1 + c2 u2 + · · · + cp up (6.14)
are given by
cj = (y•uj )/(uj •uj ) (j = 1, 2, · · · , p). (6.15)
Example. In Example 6.21, we have seen that S = {u1 , u2 , u3 } is orthogonal. Express the vector y = (11, 0, −5) as a linear combination of the vectors in S.
Solution.
Given y and a nonzero u, decompose
y = ŷ + z, ŷ // u and z ⊥ u. (6.16)
Let ŷ = αu. Then z = y − αu and
y = ŷ + z, ŷ // u and z ⊥ u. (6.17)
The requirement z ⊥ u gives (y − αu)•u = 0, i.e., α = (y•u)/(u•u). Then
ŷ = αu = (y•u)/(u•u) · u, z = y − ŷ. (6.18)
The vector ŷ is called the orthogonal projection of y onto u, and z is called the component of y orthogonal to u.
• Let L = Span{u}. Then we denote
ŷ = (y•u)/(u•u) · u = projL y, (6.19)
which is called the orthogonal projection of y onto L.
• The formula is meaningful whether the angle between y and u is acute or obtuse.
" # " #
7 4
Example 6.27. Let y = and u = .
6 2
Example 6.29. Let v = (4, −12, 8) and w = (2, 1, −3). Find the distance from v to Span{w}.
Solution.
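In Matlab/Octave the computation is one line each (script name ours):
dist_to_span.m
v = [4; -12; 8]; w = [2; 1; -3];
vhat = (v'*w)/(w'*w)*w;     % orthogonal projection of v onto Span{w}
z = v - vhat;               % component of v orthogonal to w
disp(norm(z))               % distance from v to Span{w}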
Example. The vectors v1 , v2 , and v3 form an orthogonal basis for R³. Find the corresponding orthonormal basis.
Solution.
Proof.
Theorems 6.32 & 6.33 are particularly useful when applied to square ma-
trices.
Definition 6.34. An orthogonal matrix is a square matrix U such that Uᵀ = U⁻¹, i.e., UᵀU = UUᵀ = I.
orthogonal_matrix.m
n = 4;

[Q,~] = qr(rand(n));
U = Q;

disp("U ="); disp(U)
disp("U'*U ="); disp(U'*U)

x = rand([n,1]);
fprintf("\nx' ="); disp(x')
fprintf("||x||_2 =");disp(norm(x,2))

Output
U =
   -0.3770    0.6893    0.2283   -0.5750
   -0.3786   -0.2573   -0.8040   -0.3795
   -0.6061    0.3149   -0.1524    0.7143
   -0.5892   -0.5996    0.5274   -0.1231
U'*U =
    1.0000    0.0000    0.0000   -0.0000
    0.0000    1.0000   -0.0000    0.0000
    0.0000   -0.0000    1.0000   -0.0000
   -0.0000    0.0000   -0.0000    1.0000
True-or-False 6.35.
a. If y is a linear combination of nonzero vectors from an orthogonal set,
then the weights in the linear combination can be computed without
row operations on a matrix.
b. If the vectors in an orthogonal set of nonzero vectors are normalized,
then some of the new vectors may not be orthogonal.
c. A matrix with orthonormal columns is an orthogonal matrix.
d. If L is a line through 0 and if ŷ is the orthogonal projection of y onto L, then ||ŷ|| gives the distance from y to L.
e. Every orthogonal set in Rn is linearly independent.
f. If the columns of an m × n matrix A are orthonormal, then the linear
mapping x 7→ Ax preserves lengths.
Solution.
Ans: T,F,F,F,F,T
Exercises 6.2
1. Determine which sets of vectors are orthogonal.
(a) (2, −7, 1), (−6, −3, 9), (3, 1, −1)   (b) (2, −5, −3), (0, 0, 0), (4, −2, 6)
2. Let u1 = (3, −3, 0), u2 = (2, 2, −1), u3 = (1, 1, 4), and x = (5, −3, 1). Write x as a linear combination of u1 , u2 , u3 .
Then
ŷ = αu = (y•u)/(u•u) · u, z = y − ŷ. (6.22)
The vector ŷ is called the orthogonal projection of y onto u, and z is called the component of y orthogonal to u. Let L = Span{u}. Then we denote
ŷ = (y•u)/(u•u) · u = projL y, (6.23)
which is called the orthogonal projection of y onto L.
where {u1 , u2 } is an orthogonal basis for W = Span{u1 , u2 }. Note that, for any v in W with v ≠ ŷ,
||y − v||² = ||y − ŷ||² + ||ŷ − v||²,
so that ||y − ŷ|| < ||y − v||: the projection ŷ is the closest point in W to y.
Solution.
Example. Write v as the sum of two vectors: one in Span{u1 , u2 } and the other in Span{u3 }.
Solution.
If U = [u1 u2 · · · up ] has orthonormal columns, then projW y = U Uᵀ y, where W = Col U .
Ans: (a) U Uᵀ = (1/10) [1 −3; −3 9]   (b) (−2, 6)
True-or-False 6.44.
a. If z is orthogonal to u1 and u2 and if W = Span{u1 , u2 }, then z must be
in W ⊥ .
b. The orthogonal projection ŷ of y onto a subspace W can sometimes depend on the orthogonal basis for W used to compute ŷ.
c. If the columns of an n × p matrix U are orthonormal, then U U T y is the
orthogonal projection of y onto the column space of U .
d. If an n × p matrix U has orthonormal columns, then U U T x = x for all x
in Rn .
Solution.
Ans: T,F,T,F
Exercises 6.3
1. (i) Verify that {u1 , u2 } is an orthogonal set, (ii) find the orthogonal projection of y onto
W = Span{u1 , u2 }, and (iii) write y as a sum of a vector in W and a vector orthogonal to
W.
(a) y = (−1, 2, 6), u1 = (3, −1, 2), u2 = (1, −1, −2)
(b) y = (−1, 4, 3), u1 = (1, 1, 1), u2 = (−1, 3, −2)
Ans: (b) y = ½ (3, 7, 2) + ½ (−5, 1, 4)
2. Find the best approximation to z by vectors of the form c1 v1 + c2 v2 .
(a) z = (3, −1, 1, 13), v1 = (1, −2, −1, 2), v2 = (−4, 1, 0, 3)
(b) z = (3, −7, 2, 3), v1 = (2, −1, −3, 1), v2 = (1, 1, 0, −1)
Ans: (a) ẑ = 3v1 + v2
3. Let z, v1 , and v2 be given as in Exercise 2. Find the distance from z to the subspace of
R4 spanned by v1 and v2 .
Ans: (a) 8
4. Let W be a subspace of Rn . A transformation T : Rn → Rn is defined as
x 7→ T (x) = projW x.
Example. Let W = Span{x1 , x2 }. Find an orthogonal basis for W .
Main idea: Orthogonal projection
{x1 , x2 } ⇒ {x1 , x2 = αx1 + v2 } ⇒ {v1 = x1 , v2 = x2 − αx1 },
where x1 •v2 = 0. Then W = Span{x1 , x2 } = Span{v1 , v2 }.
Solution.
v1 = x1
v2 = x2 − (x2 •v1 )/(v1 •v1 ) v1
v3 = x3 − (x3 •v1 )/(v1 •v1 ) v1 − (x3 •v2 )/(v2 •v2 ) v2 (6.28)
· · ·
vp = xp − (xp •v1 )/(v1 •v1 ) v1 − (xp •v2 )/(v2 •v2 ) v2 − · · · − (xp •vp−1 )/(vp−1 •vp−1 ) vp−1
Then {v1 , v2 , · · · , vp } is an orthogonal basis for W . In addition, Span{v1 , · · · , vk } = Span{x1 , · · · , xk } for each 1 ≤ k ≤ p.
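A compact implementation of (6.28) (the function name is ours; this is the modified variant, which subtracts the projections one at a time — numerically stabler, and producing the same basis in exact arithmetic):
gram_schmidt.m
function V = gram_schmidt(X)
% columns of X: linearly independent; columns of V: orthogonal
V = X;
for k = 2:size(X,2)
    for j = 1:k-1
        V(:,k) = V(:,k) - (V(:,k)'*V(:,j))/(V(:,j)'*V(:,j))*V(:,j);
    end
end
end
For instance, V = gram_schmidt([x1 x2 x3]) returns [v1 v2 v3].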
Theorem (QR Factorization). If A is an m × n matrix with linearly independent columns, then A can be factored as
A = QR, (6.31)
where
• Q is an m × n matrix whose columns are orthonormal and
• R is an n × n upper triangular invertible matrix with positive en-
tries on its diagonal.
We may assume that rkk > 0. (If rkk < 0, multiply both rkk and uk by −1.)
3. Let rk = [r1k , r2k , · · · , rkk , 0, · · · , 0]ᵀ. Then
xk = Q rk . (6.34)
4. Define
R := [r1 r2 · · · rn ]. (6.35)
• Thus
A = [x1 x2 · · · xn ] = QR (6.37)
implies that Q = [u1 u2 · · · un ] and
R = [u1 •x1 , u1 •x2 , u1 •x3 , · · · , u1 •xn ; 0, u2 •x2 , u2 •x3 , · · · , u2 •xn ; 0, 0, u3 •x3 , · · · , u3 •xn ; · · · ; 0, 0, 0, · · · , un •xn ] = Qᵀ A. (6.38)
" # " #
0.8 −0.6 5 0.4
Ans: Q = R=
0.6 0.8 0 2.2
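The Q and R in this answer multiply back to A = [4 −1; 3 2], and they can be reproduced with the construction above (script name ours):
qr_by_gs.m
A = [4 -1; 3 2];
u1 = A(:,1)/norm(A(:,1));          % = [0.8; 0.6]
v2 = A(:,2) - (A(:,2)'*u1)*u1;     % remove the u1-component of x2
u2 = v2/norm(v2);                  % = [-0.6; 0.8]
Q = [u1 u2]; R = Q'*A              % R = [5 0.4; 0 2.2], cf. (6.38)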
True-or-False 6.53.
a. If {v1 , v2 , v3 } is an orthogonal basis for W , then multiplying v3 by a
scalar c gives a new orthogonal basis {v1 , v2 , cv3 }. Clue: c =?
b. The Gram-Schmidt process produces from a linearly independent set
{x1 , x2 , · · · , xp } an orthogonal set {v1 , v2 , · · · , vp } with the property
that for each k, the vectors v1 , v2 , · · · , vk span the same subspace as
that spanned by x1 , x2 , · · · , xk .
c. If A = QR, where Q has orthonormal columns, then R = QT A.
d. If x is not in a subspace W , then x̂ = projW x is not zero.
e. In a QR factorization, say A = QR (when A has linearly independent
columns), the columns of Q form an orthonormal basis for the column
space of A.
Solution.
Ans: F,T,T,F,T
Exercises 6.4
1. The given set is a basis for a subspace W . Use the Gram-Schmidt process to produce an
orthogonal basis for W .
(a) (3, 0, −1), (8, 5, −6)   (b) (1, −4, 0, 1), (7, −7, −4, 1)
Ans: (b) v2 = (5, 1, −4, −1)
2. Find an orthogonal basis for the column space of the matrix
[−1 6 6; 3 −8 3; 1 −2 6; 1 −4 −3]
Ans: v3 = (1, 1, −3, 1)
3. M Let A = [−10 13 7 −11; 2 1 −5 3; −6 3 13 −3; 16 −16 −2 5; 2 1 −5 −7].
(a) Use the Gram-Schmidt process to produce an orthogonal basis for the column space of A.
(b) Use the method in this section to produce a QR factorization of A.
Ans: (a) v4 = (0, 5, 0, 0, −5)
||b − Ax̂|| ≤ ||b − Ax||, for all x ∈ Rⁿ. (6.39)
Figure 6.8: Least-squares approximation for noisy data. The dashed line in cyan is the linear model from random sample consensus (RANSAC). The data has 1,200 inlier points and 300 outlier points.
Definition. x̂ ∈ Rⁿ is a least-squares (LS) solution of Ax = b if
||b − Ax̂|| ≤ ||b − Ax||, for all x ∈ Rⁿ. (6.40)
Figure 6.9: The LS solution x̂ is in Rⁿ.
• Let b̂ = projCol A b. Then Ax = b̂ is consistent, and there is an x̂ ∈ Rⁿ such that
Ax̂ = b̂. (6.41)
• Such an x̂ is an LS solution of Ax = b.
• The quantity ||b − b̂||² = ||b − Ax̂||² is called the least-squares error.
Theorem. The set of LS solutions of Ax = b coincides with the set of solutions of the normal equations
AᵀAx = Aᵀb. (6.43)
Proof. x̂ satisfies Ax̂ = b̂
⇔ b − b̂ = b − Ax̂ ⊥ Col A
⇔ aj •(b − Ax̂) = 0 for all columns aj of A
⇔ ajᵀ(b − Ax̂) = 0 for all columns aj (note that ajᵀ is a row of Aᵀ)
⇔ Aᵀ(b − Ax̂) = 0
⇔ AᵀAx̂ = Aᵀb
Example 6.57. Let A = [1 1; 2 0; −2 1] and b = (−4, 8, 1). Find a least-squares solution of Ax = b.
Ans: (a) x̂ = (1, −1).
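A one-line check via the normal equations (script name ours; the squared-norm error convention follows exercise-6.5.3.m below):
example_6_57.m
A = [1 1; 2 0; -2 1]; b = [-4; 8; 1];
xhat = (A'*A)\(A'*b)         % = [1; -1]
err = norm(b - A*xhat)^2     % least-squares error (= 68)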
Theorem. When the columns of A are linearly independent, AᵀA is invertible and the LS solution of Ax = b is unique:
x̂ = (AᵀA)⁻¹Aᵀb. (6.44)
True-or-False 6.64.
a. The general least-squares problem is to find an x that makes Ax as
close as possible to b.
b. Any solution of AT Ax = AT b is a least-squares solution of Ax = b.
c. If x b = (AT A)−1 AT b.
b is a least-squares solution of Ax = b, then x
d. The normal equations always provide a reliable method for computing
least-squares solutions.
Solution.
Ans: T,T,F,F
Exercises 6.5
1. Find a least-squares solution of Ax = b by (i) constructing the normal equations and (ii) solving for x̂. Also (iii) compute the least-squares error (||b − Ax̂||) associated with the least-squares solution.
(a) A = [−1 2; 2 −3; −1 3], b = (4, 1, 2)
(b) A = [1 −2; −1 2; 0 3; 2 5], b = (3, 1, −4, 2)
Ans: (b) x̂ = (4/3, −1/3)
2. Find (i) the orthogonal projection of b onto Col A and (ii) a least-squares solution of
Ax = b. Also (iii) compute the least-squares error associated with the least-squares
solution.
(a) A = [4 0 1; 1 −5 1; 6 1 0; 1 −1 −5], b = (9, 0, 0, 0)
(b) A = [1 1 0; 1 0 −1; 0 1 1; −1 1 −1], b = (2, 5, 6, 6)
Ans: (b) b̂ = (5, 2, 3, 6) and x̂ = (1/3, 14/3, −5/3)
3. Describe
all least-squares solutions of the system and the associated least-squares error.
x+y = 1
x + 2y = 3
x + 3y = 3
Ans: x
b = (1/3, 1)
For the above problems, you may use either pencil-and-paper or computer programs. For
example, for the last problem, a code can be written as
exercise-6.5.3.m
A = [1 1; 1 2; 1 3];
b = [1;3;3];

ATA = A'*A; ATb = A'*b;   % normal equations (these two lines reconstructed)
xhat = ATA\ATb
error = norm(b-A*xhat)^2
Recall: x̂ is an LS solution of Ax = b if
||b − Ax̂|| ≤ ||b − Ax||, for all x ∈ Rⁿ, (6.48)
and the LS solutions are exactly the solutions of the normal equations
AᵀAx = Aᵀb. (6.49)
When AᵀA is invertible, x̂ = A⁺b, where A⁺ := (AᵀA)⁻¹Aᵀ is the pseudoinverse of A.
Regression Line. Given data points (x1 , y1 ), (x2 , y2 ), · · · , (xm , ym ), we seek a line
y = β0 + β1 x (6.51)
that is as close as possible to the given points. This line is called the least-squares line; it is also called the regression line of y on x, and β0 , β1 are called regression coefficients.
The data points should satisfy, approximately,
β0 + β1 x1 = y1
β0 + β1 x2 = y2 (6.52)
· · ·
β0 + β1 xm = ym
which is equivalently written as Xβ = y, where
X = [1 x1 ; 1 x2 ; · · · ; 1 xm ] ∈ Rᵐˣ², β = [β0 ; β1 ], y = [y1 ; y2 ; · · · ; ym ].
Here we call X the design matrix, β the parameter vector, and y the observation vector.
• (Method of Normal Equations) Thus the LS solution can be determined as
XᵀXβ = Xᵀy ⇒ β = (XᵀX)⁻¹Xᵀy, (6.54)
provided that XᵀX is invertible.
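A sketch of the normal-equations method (script name ours), using the data of Exercise 1(a) in Exercises 6.6 below:
regression_line.m
x = [1 2 4 5]'; y = [0 1 2 3]';    % data points (1,0), (2,1), (4,2), (5,3)
X = [ones(size(x)) x];             % design matrix
beta = (X'*X)\(X'*y)               % = [-0.6; 0.7], i.e., y = -0.6 + 0.7x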
For a quadratic model
y = β0 + β1 x + β2 x²,
the data points should satisfy, approximately,
β0 + β1 x1 + β2 x1² = y1
β0 + β1 x2 + β2 x2² = y2 (6.57)
· · ·
β0 + β1 xm + β2 xm² = ym
• It is equivalently written as
Xβ = y, (6.58)
where
X = [1 x1 x1²; 1 x2 x2²; · · · ; 1 xm xm²], β = [β0 ; β1 ; β2 ], y = [y1 ; y2 ; · · · ; ym ].
Ans: y = 1 + 0.5x²
Ans: y = 1 + 2x + x²
Further Applications
Example 6.73. Find an LS curve of the form y = a cos x + b sin x that best
fits the given points:
(0, 1), (π/4, 2), (π, 0).
Solution.
Ans: (a, b) = (1/2, −1/2 + 2√2) = (0.5, 2.32843)
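The same normal-equations recipe applies with a design matrix built from cos x and sin x (script name ours):
trig_fit.m
x = [0; pi/4; pi]; y = [1; 2; 0];
X = [cos(x) sin(x)];       % one column per basis function
ab = (X'*X)\(X'*y)         % = [0.5; 2.32843], matching the Ans above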
Example 6.75. Find an LS curve of the form y = CeDx that best fits the
given points:
(0, e), (1, e3 ), (2, e5 ).
Solution.
x: 0, 1, 2 and y: e, e³, e⁵ ⇒ ỹ = ln y: 1, 3, 5 ⇒ X = [1 0; 1 1; 1 2], y = [1; 3; 5]
Exercises 6.6
1. Find an LS curve of the form y = β0 + β1 x that best fits the given points.
(a) (1, 0), (2, 1), (4, 2), (5, 3) (b) (2, 3), (3, 2), (5, 1), (6, 0)
Ans: (a) y = −0.6 + 0.7x
2. M Consider the data points
(1, 1.8), (2, 2.7), (3, 3.4), (4, 3.8), (5, 3.9).
For these points, we will try to find the best-fitting model of the form y = β1 x + β2 x².
(a) Find and display the design matrix and the observation vector.
(b) Find the unknown parameter vector.
(c) Find the LS error.
(d) Plot the associated LS curve along with the data.
Appendix A
Appendix
Contents of Chapter A
A.1. Understanding / Interpretation of Eigenvalues and Eigenvectors . . . . . . . . . . . . . 228
A.2. Eigenvalues and Eigenvectors of Stochastic Matrices . . . . . . . . . . . . . . . . . . . 231
and therefore
Ax = Σ_{i=1}^{n} ξi Avi = Σ_{i=1}^{n} ξi λi vi ⇒ A^k x = Σ_{i=1}^{n} ξi λi^k vi . (A.1.4)
Find the area of the solid ellipse, the image of the unit disk by A.
Ans: π · |λ1 · λ2 | = π · 3 · 1 = 3π
x1 = T x0 , x2 = T x1 , x3 = T x2 , · · · (A.2.1)
xk+1 = T xk , k = 0, 1, 2, · · · (A.2.2)
T v = λv ⇒ |λ| ≤ 1. (A.2.7)
(a) Note that det(Tᵀ − λI) = det(T − λI), which implies that T and Tᵀ have exactly the same eigenvalues. Consider the all-ones vector 1 = [1, 1, · · · , 1]ᵀ ∈ Rⁿ. Then
Tᵀ1 = 1,
since each column of T sums to 1; hence λ = 1 is an eigenvalue of Tᵀ, and therefore of T .
q = lim_{k→∞} T^k x0 , (A.2.11)
import numpy as np   # (assumed import; the opening lines of the script were lost)

T = [[1/2,1/4,1/6],
     [1/3,1/2,1/3],
     [1/6,1/4,1/2]]
T = np.array(T)

D,V = np.linalg.eig(T)

print('Eigenvalues:'); print(D)
print('Eigenvectors:'); print(V)
Output
Eigenvalues:
[1.     0.3333 0.1667]
Eigenvectors:
[[-0.5145 -0.7071  0.4082]
 [-0.686  -0.     -0.8165]
 [-0.5145  0.7071  0.4082]]
----- steady-state vector ----
v1 = [0.3 0.4 0.3]
----- power method -----------
k =  0; [1 0 0]
k =  1; [0.5    0.3333 0.1667]
k =  2; [0.3611 0.3889 0.25  ]
k =  3; [0.3194 0.3981 0.2824]
k =  4; [0.3063 0.3997 0.294 ]
k =  5; [0.3021 0.3999 0.298 ]
k =  6; [0.3007 0.4    0.2993]
k =  7; [0.3002 0.4    0.2998]
k =  8; [0.3001 0.4    0.2999]
k =  9; [0.3 0.4 0.3]
k = 10; [0.3 0.4 0.3]
Then
• The resulting matrix is still a stochastic matrix.
• Its eigenvalues become
[ 1. 0.281 -0.1977].
Appendix C
Chapter Review
x1 v1 + x2 v2 + · · · + xp vp = 0 (C.1.1)
c1 v1 + c2 v2 + · · · + cp vp = 0. (C.1.2)
Solution.
Thus
T (x) = T (x1 e1 + x2 e2 + · · · + xn en )
= x1 T (e1 ) + x2 T (e2 ) + · · · + xn T (en ) (C.1.5)
= [T (e1 ) T (e2 ) · · · T (en )] x,
and therefore the standard matrix reads
A = [T (e1 ) T (e2 ) · · · T (en )] . (C.1.6)
Example C.2. Write the standard matrix for the linear transformation
T : R2 → R4 given by
T (x1 , x2 ) = (x1 + 4x2 , 5x1 , −3x2 , x1 − x2 ).
Solution.
Example. Is T onto? Is T one-to-one?
Solution.
§2.8. Subspaces of Rn
Definition 2.47. A subspace of Rn is any set H in Rn that has three
properties:
a) The zero vector is in H.
b) For each u and v in H, the sum u + v is in H.
c) For each u in H and each scalar c, the vector cu is in H.
That is, H is closed under linear combinations.
Col A = {u | u = c1 a1 + c2 a2 + · · · + cn an }, (C.2.1)
Example. Determine if b is in the column space of A, Col A.
Solution. Clue: (1) b ∈ Col A
⇔ (2) b is a linear combination of the columns of A
⇔ (3) Ax = b is consistent
⇔ (4) [A b] has a solution
Theorem 2.56. A basis for Nul A can be obtained from the parametric vector form of the solutions of Ax = 0. That is, suppose the solution of Ax = 0 reads
x = x1 u1 + x2 u2 + · · · + xk uk ,
where x1 , x2 , · · · , xk are the free variables. Then a basis for Nul A is {u1 , u2 , · · · , uk }.
Example C.7. A matrix A and its echelon form are given. Find a basis for Col A and a basis for Nul A.
A = [3 −6 9 0; 2 −4 7 2; 3 −6 6 −6] ∼ [1 −2 3 0; 0 0 1 2; 0 0 0 0]
Solution.
Then B = {v1 , v2 } is a basis for H = Span{v1 , v2 }, because v1 and v2 are linearly independent. Determine if x is in H, and if it is, find the coordinate vector of x relative to B.
Solution.
Example C.9. Find a basis for the subspace spanned by the given vectors. What is the dimension of the subspace?
(1, −1, −2, 3), (2, −3, −1, 4), (−3, 5, 0, −5), (−4, 6, 2, −8)
Solution.
C.3. Determinants
§3.2. Properties of Determinants
Definition 3.1. Let A be an n × n square matrix. The determinant of A is a scalar value denoted by det A or |A|.
1) Let A = [a] ∈ R^{1×1}. Then det A = a.
2) Let A = [a b; c d] ∈ R^{2×2}. Then det A = ad − bc.
Example C.10. Compute det A, where A = [1 −4 2; −1 7 0; −2 8 −9], after applying a couple of steps of replacement operations.
Solution.
§5.3. Diagonalization
Definition 5.25. An n × n matrix A is said to be diagonalizable if there exist an invertible matrix P and a diagonal matrix D such that
A = P DP⁻¹ (or P⁻¹AP = D), (C.5.1)
where Avk = λk vk , k = 1, 2, · · · , n,
while
P D = [v1 v2 · · · vn ] diag(λ1 , λ2 , · · · , λn ) = [λ1 v1 λ2 v2 · · · λn vn ]. (C.5.4)
Definition. A vector p = [p1 , p2 , · · · , pn ]ᵀ with nonnegative entries that add up to 1 is called a probability vector.
• A (left) stochastic matrix is a square matrix whose columns are
probability vectors.
q = T p = p1 v1 + p2 v2 + · · · + pn vn .
sum(q) = sum(p1 v1 + p2 v2 + · · · + pn vn ) = p1 + p2 + · · · + pn = 1.
x1 = T x0 , x2 = T x1 , x3 = T x2 , · · · (C.5.5)
xk+1 = T xk , k = 0, 1, 2, · · · (C.5.6)
Steady-State Vectors
T q = q. (C.5.7)
T x = x ⇔ T x − x = 0 ⇔ (T − I)x = 0. (C.5.8)
q = lim_{k→∞} T^k x0 , (C.5.10)
T k → [q q · · · q] ∈ Rn×n , as k → ∞. (C.5.11)
" #
0 0.5
Example C.17. Let T = .
1 0.5
(a) Is T regular?
(b) What is the first column of lim T k ?
k→∞
If U = [u1 u2 · · · up ] has orthonormal columns, then projW y = U Uᵀ y, where W = Col U .
v1 = x1
v2 = x2 − (x2 •v1 )/(v1 •v1 ) v1
v3 = x3 − (x3 •v1 )/(v1 •v1 ) v1 − (x3 •v2 )/(v2 •v2 ) v2 (C.6.5)
· · ·
vp = xp − (xp •v1 )/(v1 •v1 ) v1 − (xp •v2 )/(v2 •v2 ) v2 − · · · − (xp •vp−1 )/(vp−1 •vp−1 ) vp−1
Then {v1 , v2 , · · · , vp } is an orthogonal basis for W . In addition, Span{v1 , · · · , vk } = Span{x1 , · · · , xk } for each 1 ≤ k ≤ p.
• Thus
A = [x1 x2 · · · xn ] = QR (C.6.9)
implies that Q = [u1 u2 · · · un ] and
R = [u1 •x1 , u1 •x2 , u1 •x3 , · · · , u1 •xn ; 0, u2 •x2 , u2 •x3 , · · · , u2 •xn ; 0, 0, u3 •x3 , · · · , u3 •xn ; · · · ; 0, 0, 0, · · · , un •xn ] = Qᵀ A. (C.6.10)
" # " #
0.8 −0.6 5 0.4
Ans: Q = R=
0.6 0.8 0 2.2
x̂ is an LS solution of Ax = b if
||b − Ax̂|| ≤ ||b − Ax||, for all x ∈ Rⁿ. (C.6.11)
• Let b̂ = projCol A b. Then Ax = b̂ is consistent, and there is an x̂ ∈ Rⁿ such that
Ax̂ = b̂. (C.6.12)
• Such an x̂ is an LS solution of Ax = b.
• The quantity ||b − b̂||² = ||b − Ax̂||² is called the least-squares error.
AᵀAx = Aᵀb. (C.6.13)
x̂ = (AᵀA)⁻¹Aᵀb. (C.6.14)
Regression Line. Given data points (x1 , y1 ), (x2 , y2 ), · · · , (xm , ym ), we seek a line
y = β0 + β1 x (C.6.15)
that is as close as possible to the given points. This line is called the least-squares line; it is also called the regression line of y on x, and β0 , β1 are called regression coefficients.
β0 + β1 x1 = y1
β0 + β1 x2 = y2 (C.6.16)
· · ·
β0 + β1 xm = ym
which is equivalently written as Xβ = y, where
X = [1 x1 ; 1 x2 ; · · · ; 1 xm ], β = [β0 ; β1 ], y = [y1 ; y2 ; · · · ; ym ].
Here we call X the design matrix, β the parameter vector, and y the observation vector.
• (Method of Normal Equations) Thus the LS solution can be determined as
XᵀXβ = Xᵀy ⇒ β = (XᵀX)⁻¹Xᵀy, (C.6.18)
provided that XᵀX is invertible.
Further Applications
Example C.23. Find an LS curve of the form y = a cos x + b sin x that best
fits the given points:
(0, 1), (π/2, 1), (π, −1).
Solution.
Self-study C.24. Find an LS curve of the form y = CeDx that best fits the
given points:
(0, e), (1, e3 ), (2, e5 ).
Solution.
x: 0, 1, 2 and y: e, e³, e⁵ ⇒ ỹ = ln y: 1, 3, 5 ⇒ X = [1 0; 1 1; 1 2], y = [1; 3; 5]
Appendix P
Projects
Contents of Projects
P.1. Project Regression Analysis: Linear, Piecewise Linear, and Nonlinear Models . . . . . 266
• Then, with the predicted y-values on the left and the observed y-values on the right,
a0 + a1 x1 + a2 x1² = y1
a0 + a1 x2 + a2 x2² = y2 (P.1.1)
· · ·
a0 + a1 xm + a2 xm² = ym
• It is equivalently written as
Xp = y, (P.1.2)
where
X = [1 x1 x1²; 1 x2 x2²; · · · ; 1 xm xm²], p = [a0 ; a1 ; a2 ], y = [y1 ; y2 ; · · · ; ym ].
test_data_100.m
close all; clear all

DATA = readmatrix('test-data-100.txt');
x = DATA(:,1); y = DATA(:,2);
p = polyfit(x,y,2);
yhat = polyval(p,x);                 % predicted y-values
LS_error = norm(y-yhat)^2/length(y); % variance

%---------------------------------------------------
fprintf('LS_error= %.3f; p=',LS_error); disp(p)
% Output: LS_error= 0.130; p= 0.3944 -0.6824 0.3577

%---------------------------------------------------
x1 = linspace(min(x),max(x),100);
y1 = polyval(p,x1);                  % regression curve
Nonlinear Regression
Example P.2. See the dataset {(xi , yi )} shown in the figure. When we try to find the best-fitting model of the form
y = c e^{dx}, (P.1.4)
we linearize: ln y = d x + ln c, and fit a line to the transformed data (xi , ln yi ).
nonlinear_regression.m
close all; clear all

FILE = 'seemingly-exp-data.txt';
DATA = readmatrix(FILE);
% fitting: y = c*exp(d*x) ---> ln(y) = d*x + ln(c)
%---------------------------------------------------

x = DATA(:,1); y = DATA(:,2);
lny = log(y);          % data transform
p = polyfit(x,lny,1);  % [p1,p2] = [d, ln(c)]
d = p(1); c = exp(p(2));

yhat = c*exp(d*x);                   % (reconstructed) predicted y-values
LS_error = norm(y-yhat)^2/length(y); % (reconstructed) variance, as in test_data_100.m

%---------------------------------------------------
fprintf('c=%.3f; d=%.3f; LS_error=%.3f\n',c,d,LS_error)
% Output: c=1.346; d=0.669; LS_error=4.212

% figure
Sine_Noisy_Data_Regression.m
close all; clear all

% (assumed setup; the original setup lines were lost in extraction)
DATAFILE = 'sine-noisy-data.txt'; renew_data = 0;
f = @(x) sin(x);          % underlying model, per the script name
a = 0; b = 2*pi; m = 10;  % sampling interval and count; m = 10 as in (P.1.10)

%%-----------------------------------------------
if isfile(DATAFILE) && renew_data == 0
    DATA = readmatrix(DATAFILE);  % np.loadtxt()
else
    X = linspace(a,b,m); Y0 = f(X);
    noise = rand([1,m]); noise = noise-mean(noise(:));
    Y = Y0 + noise; DATA = [X',Y'];
    writematrix(DATA,DATAFILE);   % np.savetxt()
end

%%-----------------------------------------------
x = linspace(a,b,101); y = f(x);
x1 = DATA(:,1); y1 = DATA(:,2);
E = zeros(1,m);
for n = 0:m-1
    p = polyfit(x1,y1,n);   % np.polyfit()
    yhat = polyval(p,x1);   % np.polyval()
    E(n+1) = norm(y1-yhat,2)^2;
    %savefigure(x,y,x1,y1,polyval(p,x),n)
end

% figure
The LS Error
Given the dataset {(xi , yi ) | i = 1, 2, · · · , m} and the model Pn , define the LS-error
En = Σ_{i=1}^{m} (yi − Pn (xi ))², (m = 10), (P.1.10)
which is also called the mean square error.
Project Objective: To find the best model for each of the datasets:
What to Do
First download two datasets:
https://skim.math.msstate.edu/LectureNotes/data/regression-test-data-01.txt
https://skim.math.msstate.edu/LectureNotes/data/regression-test-data-02.txt
Note: You may use parts of the codes shown in this project. Report your
code, numerical outputs, and figures, with a summary.
• A code itself will not give you any credit; include outputs or figures.
• The summary will be worth 20% the full credit.
• Include all into a single file in pdf or doc/docx format.
Bibliography
[1] D. C. Lay, S. R. Lay, and J. J. McDonald, Linear Algebra and Its Applications, 6th Ed., Pearson, 2021.
Index
M, 28, 96, 141, 167, 178, 209, 225
1-norm, 232
2-norm, 232
acute angle, 189
addition, 120, 250
algebraic multiplicity, 139
all-ones vector, 233
annual_migration.m, 171
attractor, 157
augmented matrix, 4
back substitution, 93
back_sub.m, 95
backward phase, 13
basic variables, 12
basis, 99, 244
basis theorem, 107, 246
best approximation, 198, 257
birds_on_aviary.m, 172
birds_on_aviary2.m, 173
change of variables, 224, 264, 269
Chapter Review, 237
characteristic equation, 138, 150
characteristic polynomial, 138
closed, 97, 243
closest point, 198, 257
codomain, 50
coefficient matrix, 4
cofactor, 111, 247
cofactor expansion, 111, 247
column space, 97, 127, 243
column vector, 20
complex conjugate, 151
complex eigenvalue, 150
consistent system, 3
coordinate vector, 103, 245
coordinates, 103, 245
data compression, 230
design matrix, 219, 263
determinant, 73, 109–111, 247
diagonal entries, 68
diagonal matrix, 68
diagonalizable, 142, 252
diagonalization theorem, 143, 252
differential equation, 154
dimension, 105
direction of greatest attraction, 157
distance, 181
domain, 50
dot product, 36, 71, 180
dot product preservation, 193
double eigenvalue, 159
dyadic decomposition, 230
echelon form, 9
eigenspace, 133, 147
eigenvalue, 132, 138, 228
eigenvalues of stochastic matrices, 231
eigenvector, 132, 138, 228
elementary matrix, 87
elementary row operation, 87
elementary row operations, 4, 115
ellipse, 229
equal, 68
equality of vectors, 21
equivalent system, 2
Euler angles, 54
Euler, Leonhard, 228
exercise-6.5.3.m, 216
existence and uniqueness, 3
finite Markov chain, 169
for loop, 33