Quick Reference of Linear Algebra
Hao Zhang
National Key Laboratory for Novel Software Technology, Nanjing University, China
[email protected]
• In general, 𝑨𝑩 ≠ 𝑩𝑨.
• In general, 𝑨𝑩 = 𝑨𝑪 ̸⇒ 𝑩 = 𝑪.
• In general, 𝑨𝑩 = 𝟎 ̸⇒ 𝑨 = 𝟎 ∨ 𝑩 = 𝟎.

1.3 Common Matrix Operations

Table 1 and 2 summarize the properties of common matrix operations.

Lemma 5. Any square matrix 𝑨 ∈ ℝ𝑛×𝑛 can be written as the sum of a symmetric matrix and an anti-symmetric matrix: 𝑨 = ½(𝑨 + 𝑨⊤) + ½(𝑨 − 𝑨⊤).

Lemma 6. diag 𝑨𝑨⊤ = (𝑨 ⊙ 𝑨)𝟏.

Lemma 7. The following are properties of the vec operation.
• (vec 𝑨)⊤(vec 𝑩) = tr 𝑨⊤𝑩.
• (vec 𝑨𝑨⊤)⊤(vec 𝑩𝑩⊤) = tr (𝑨⊤𝑩)⊤(𝑨⊤𝑩) = ‖𝑨⊤𝑩‖²_F = ∑_{𝑖=1}^{𝑛} ∑_{𝑗=1}^{𝑛} (𝒂ᵢ⊤𝒃ⱼ)², where 𝑨 ∶= [𝒂₁ 𝒂₂ ⋯ 𝒂ₙ] and 𝑩 ∶= [𝒃₁ 𝒃₂ ⋯ 𝒃ₙ].

Algorithm 4 (Computing the Rank). rank 𝑨 is determined by the SVD of 𝑨. The rank is the number of nonzero singular values. In practice, extremely small nonzero singular values are treated as zero. In general, rank estimation is not a simple problem.

Algorithm 5 (Computing the Determinant). Get the REF 𝑼 of 𝑨. If there are 𝑝 row interchanges, det 𝑨 = (−1)^𝑝 ⋅ (product of pivots in 𝑼). Although 𝑼 is not unique and the pivots are not unique, the product of the pivots is unique. 𝑇(𝑛) ∼ ⅓𝑛³.

Definition 6 (Vector Space). A set of “vectors” together with rules for vector addition and for multiplication by real numbers. The addition and multiplication must produce vectors that are in the space. The space ℝ𝑛 consists of all column vectors 𝒙 with 𝑛 components. The zero-dimensional space 𝟘 consists only of the zero vector 𝟎.

Definition 7 (Subspace). A subspace of a vector space is a set of vectors (including 𝟎) where all linear combinations stay in that subspace.

Definition 8 (Span). A set of vectors spans a space if their linear combinations fill the space. The span of a set of vectors is the smallest subspace containing those vectors.

Definition 9 (Linear Independence). The columns of 𝑨 are linearly independent iff the nullspace 𝒩(𝑨) = 𝟘, or rank 𝑨 = 𝑛.

Definition 10 (Basis). A basis for a vector space is a set of linearly independent vectors which span the space. Every vector in the space is a unique combination of the basis vectors. The columns of 𝑨 ∈ ℝ𝑛×𝑛 are a basis for ℝ𝑛 iff 𝑨 is invertible.

Definition 11 (Dimension). The dimension of a space is the number of vectors in every basis. dim 𝟘 = 0 since 𝟎 itself forms a linearly dependent set.

Definition 12 (Rank). rank 𝑨 ∶= dim 𝒞(𝑨), the dimension of the column space, which also equals the number of pivots of 𝑨.

There are four fundamental subspaces for a matrix 𝑨, as illustrated in Table 3.
Table 1: Properties of common matrix operations (I).
Algorithm 6 (The 𝑨⁻¹ Algorithm). When 𝑨 is square and invertible, Gaussian elimination on [𝑨 𝑰] produces [𝑹 𝑬]. Since 𝑹 = 𝑰, 𝑬𝑨 = 𝑹 becomes 𝑬𝑨 = 𝑰, and the elimination result is [𝑰 𝑨⁻¹]. 𝑇(𝑛) ∼ 𝑛³. In practice, 𝑨⁻¹ is seldom computed, unless the entries of 𝑨⁻¹ are explicitly needed.
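For example, to solve 𝑨𝒙 = 𝒃 numerically one factors and solves rather than inverting; a minimal NumPy sketch (illustrative only, with a made-up 𝑨 and 𝒃):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))      # assumed invertible for this illustration
b = rng.standard_normal(4)

x_solve = np.linalg.solve(A, b)      # LU-based solve, no explicit inverse
x_inv = np.linalg.inv(A) @ b         # forms the inverse explicitly (usually avoided)

print(np.allclose(x_solve, x_inv))   # True: same answer, different cost and stability
```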
2.3 Ill-conditioned Matrices

Definition 14 (Ill-conditioned Matrix). An invertible matrix that can become singular if some of its entries are changed ever so slightly. In this case, row reduction may produce fewer than 𝑛 pivots as a result of roundoff error. Also, roundoff error can sometimes make a singular matrix appear to be invertible.

Definition 15 (Condition Number cond 𝑨). cond 𝑨 ∶= 𝜎₁/𝜎ᵣ for a matrix 𝑨 ∈ ℝ𝑛×𝑛. The larger the condition number, the closer the matrix is to being singular. cond 𝑰 = 1, and cond(singular matrix) = ∞.

2.4 Least-squares and Projections

It is often the case that 𝑨𝒙 = 𝒃 is overdetermined: 𝑚 > 𝑛. The 𝑛 columns span a small part of ℝ𝑚. Typically 𝒃 is outside 𝒞(𝑨) and there is no solution. One approach is least-squares.

Theorem 11 (Least-squares Approximation). The projection of 𝒃 ∈ ℝ𝑚 onto 𝒞(𝑨) is
𝒃̂ = 𝑨(𝑨⊤𝑨)⁻¹𝑨⊤𝒃 , (7)
where the projection matrix is 𝑷 = 𝑨(𝑨⊤𝑨)⁻¹𝑨⊤, and
arg min_𝒙 (1/𝑚)‖𝑨𝒙 − 𝒃‖² = (𝑨⊤𝑨)⁻¹𝑨⊤𝒃 . (8)
In particular, the projection of 𝒃 onto the line 𝒂 ∈ ℝ𝑚 is
𝒃̂ = (𝒂𝒂⊤/𝒂⊤𝒂) 𝒃 . (9)

Proof. Let 𝒙⋆ ∶= arg min_𝒙 (1/𝑚)‖𝑨𝒙 − 𝒃‖² and let the approximation error be 𝒆 ∶= 𝒃 − 𝑨𝒙⋆. 𝒆 ⟂ 𝒞(𝑨) ⇒ 𝒆 ∈ 𝒩(𝑨⊤) ⇒ 𝑨⊤𝒆 = 𝑨⊤(𝒃 − 𝑨𝒙⋆) = 𝟎 ⇒ 𝑨⊤𝑨𝒙⋆ = 𝑨⊤𝒃. Another proof is by setting the derivative of the objective to zero: (2/𝑚)𝑨⊤𝑨𝒙 − (2/𝑚)𝑨⊤𝒃 = 𝟎.

Lemma 12. 𝑨𝒙 = 𝒃 has a unique least-squares solution for each 𝒃 when the columns of 𝑨 are linearly independent.

Algorithm 7 (Least-squares Approximation). Since cond 𝑨⊤𝑨 = (cond 𝑨)², least-squares is solved by QR factorization:
𝑨⊤𝑨𝒙 = 𝑨⊤𝒃 ⇒ 𝑹⊤𝑹𝒙 = 𝑹⊤𝑸⊤𝒃 ⇒ 𝑹𝒙 = 𝑸⊤𝒃 . (10)
𝑇(𝑚, 𝑛) ∼ 𝑚𝑛².

Another common case is that 𝑨𝒙 = 𝒃 is underdetermined: 𝑚 < 𝑛 or 𝑨 has dependent columns. Typically there are infinitely many solutions. One approach is using regularization.

Theorem 13 (Least-squares Approximation with Regularization).
arg min_𝒙 (1/𝑚)‖𝑨𝒙 − 𝒃‖² + 𝜆‖𝒙‖² = ((1/𝑚)𝑨⊤𝑨 + 𝜆𝑰)⁻¹ (1/𝑚)𝑨⊤𝒃 . (11)

Proof. By setting the derivative to zero: (2/𝑚)𝑨⊤𝑨𝒙 − (2/𝑚)𝑨⊤𝒃 + 2𝜆𝒙 = 𝟎.

Lemma 14 (Weighted Least-squares Approximation). Suppose 𝑪 ∈ ℝ𝑚×𝑚 is a diagonal matrix specifying the weight for the equations. Then
arg min_𝒙 (1/𝑚)(𝑨𝒙 − 𝒃)⊤𝑪(𝑨𝒙 − 𝒃) = (𝑨⊤𝑪𝑨)⁻¹𝑨⊤𝑪𝒃 . (12)

Proof. By setting the derivative to zero: (2/𝑚)𝑨⊤𝑪𝑨𝒙 − (2/𝑚)𝑨⊤𝑪𝒃 = 𝟎.

2.5 Orthogonality

Lemma 15 (Plane in Point-normal Form). The equation of a hyperplane with a point 𝒙₀ in the plane and a normal vector 𝒘 orthogonal to the plane is 𝒘⊤(𝒙 − 𝒙₀) = 0.

Lemma 16. The distance from a point 𝒙 to a plane with a point 𝒙₀ on the plane and a normal vector 𝒘 orthogonal to the plane is |𝒘⊤(𝒙 − 𝒙₀)|/‖𝒘‖.

Definition 16 (Orthogonal Vectors). Two vectors 𝒖 and 𝒗 are orthogonal if 𝒖⊤𝒗 = 0.

Definition 17 (Orthonormal Vectors 𝒒). The columns of 𝑸 are orthonormal if 𝑸⊤𝑸 = [𝒒ᵢ⊤𝒒ⱼ]_{𝑛×𝑛} = 𝑰. If 𝑸 is square, it is called an orthogonal matrix.

Lemma 17. The following are orthogonal matrices.
• Every permutation matrix 𝑷.
• The reflection matrix 𝑰 − 2𝒆𝒆⊤, where 𝒆 is any unit vector.

Lemma 18. Orthogonal matrices 𝑸 preserve certain norms.
• ‖𝑸𝒙‖₂ = ‖𝒙‖₂.
• ‖𝑸₁𝑨𝑸₂⊤‖₂ = ‖𝑨‖₂.
• ‖𝑸₁𝑨𝑸₂⊤‖_F = ‖𝑨‖_F.

Theorem 19. The projection of 𝒃 ∈ ℝ𝑚 onto 𝒞(𝑸) is
𝒃̂ = 𝑸𝑸⊤𝒃 = ∑_{𝑗=1}^{𝑛} 𝒒ⱼ𝒒ⱼ⊤𝒃 . (13)
If 𝑸 is square, 𝒃 = ∑_{𝑗=1}^{𝑛} 𝒒ⱼ𝒒ⱼ⊤𝒃.

Definition 18 (QR Factorization). 𝑨 ∈ ℝ𝑛×𝑛 can be written as 𝑨 = 𝑸𝑹, where the columns of 𝑸 ∈ ℝ𝑛×𝑛 are orthonormal and 𝑹 ∈ ℝ𝑛×𝑛 is an upper triangular matrix.

Algorithm 8 (Gram-Schmidt Process). The idea is to subtract from every new vector its projections in the directions already set, and to divide the resulting vectors by their lengths, such that
𝑨 = [𝒂₁ 𝒂₂ ⋯ 𝒂ₙ] = [𝒒₁ 𝒒₂ ⋯ 𝒒ₙ] [𝒒₁⊤𝒂₁ 𝒒₁⊤𝒂₂ ⋯ 𝒒₁⊤𝒂ₙ; 𝒒₂⊤𝒂₂ ⋯ 𝒒₂⊤𝒂ₙ; ⋱ ⋮; 𝒒ₙ⊤𝒂ₙ] = 𝑸𝑸⊤𝑨 = 𝑸𝑹 . (14)
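Putting Secs. 2.4 and 2.5 together, a minimal NumPy sketch of the least-squares solutions in Eqns. 8, 10, and 11 (the data and the regularization weight `lam` are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, lam = 50, 3, 0.1
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

# Eqn. 8: normal equations (squares the condition number).
x_ne = np.linalg.solve(A.T @ A, A.T @ b)

# Eqn. 10: QR route, R x = Q^T b.
Q, R = np.linalg.qr(A)
x_qr = np.linalg.solve(R, Q.T @ b)

# Eqn. 11: regularized (ridge) solution.
x_reg = np.linalg.solve(A.T @ A / m + lam * np.eye(n), A.T @ b / m)

print(np.allclose(x_ne, x_qr))   # True (up to roundoff)
```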
The algorithm is illustrated in Alg. 1. 𝑇(𝑛) = ∑_{𝑗=1}^{𝑛} ∑_{𝑖=1}^{𝑗} 2𝑛 ∼ 𝑛³. In practice, the roundoff error can build up.

Algorithm 1 QR Factorization.
Input: 𝑨 ∈ ℝ𝑛×𝑛
Output: 𝑸 ∈ ℝ𝑛×𝑛, 𝑹 ∈ ℝ𝑛×𝑛
1: 𝑸 ← 𝑹 ← 𝟎
2: for 𝑗 ← 0 to 𝑛 − 1 do
3:   𝒒ⱼ ← 𝒂ⱼ
4:   for 𝑖 ← 0 to 𝑗 − 1 do
5:     𝑟ᵢⱼ ← 𝒒ᵢ⊤𝒂ⱼ
6:     𝒒ⱼ ← 𝒒ⱼ − 𝑟ᵢⱼ𝒒ᵢ
7:   𝑟ⱼⱼ ← ‖𝒒ⱼ‖
8:   𝒒ⱼ ← 𝒒ⱼ/‖𝒒ⱼ‖
9: return 𝑸, 𝑹

Algorithm 9 (Householder Reflections). In practice, Householder reflections are often used instead of the Gram-Schmidt process, even though the factorization requires about twice as much arithmetic.

3 Application: Solving Linear Systems

Understanding the linear system 𝑨𝒙 = 𝒃:
• Row picture: 𝑚 hyperplanes meet at a single point (if possible).
• Column picture: 𝑛 vectors are combined to produce 𝒃.

Definition 22 (Block Elimination). We perform (row 2) − 𝑪𝑨⁻¹(row 1) to get a zero block in the first column:
[𝑰 𝟎; −𝑪𝑨⁻¹ 𝑰] [𝑨 𝑩; 𝑪 𝑫] = [𝑨 𝑩; 𝟎 𝑫 − 𝑪𝑨⁻¹𝑩] . (15)
The final block 𝑫 − 𝑪𝑨⁻¹𝑩 is called the Schur complement.

Definition 23 (Row Exchange Matrix 𝑷ᵢⱼ). The identity matrix with row 𝑖 and row 𝑗 exchanged. 𝑷ᵢⱼ𝑨 means that we exchange row 𝑖 and row 𝑗 of 𝑨.

Definition 24 (Permutation Matrix 𝑷). A permutation matrix has the rows of the identity 𝑰 in any order. Such a matrix has a single 1 in every row and every column. The simplest permutation matrix is 𝑰; the next simplest are the row exchange matrices 𝑷ᵢⱼ. There are 𝑛! permutation matrices of order 𝑛, half of which have determinant 1 and the other half determinant −1. If 𝑷 is a permutation matrix, then 𝑷⁻¹ = 𝑷⊤, which is also a permutation matrix.

Definition 25 (Augmented Matrix [𝑨 𝒃]). Elimination does the same row operations to 𝑨 and to 𝒃. We can include 𝒃 as an extra column and let elimination act on whole rows of this matrix.

Definition 26 (Row Equivalent). Two matrices are called row equivalent if there is a sequence of elementary row operations that transforms one matrix into the other.

Lemma 20. If the augmented matrices of two linear systems are row equivalent, then the two linear systems have the same solution set.
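As a cross-check of Alg. 1, a minimal NumPy sketch of classical Gram-Schmidt, compared against np.linalg.qr (LAPACK's Householder-based QR, cf. Algorithm 9); the test matrix is made up:

```python
import numpy as np

def gram_schmidt_qr(A):
    """Classical Gram-Schmidt QR of a square matrix with independent columns."""
    n = A.shape[1]
    Q = np.zeros_like(A, dtype=float)
    R = np.zeros((n, n))
    for j in range(n):
        q = A[:, j].copy()
        for i in range(j):                 # subtract projections on earlier q_i
            R[i, j] = Q[:, i] @ A[:, j]
            q -= R[i, j] * Q[:, i]
        R[j, j] = np.linalg.norm(q)
        Q[:, j] = q / R[j, j]
    return Q, R

A = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
Q, R = gram_schmidt_qr(A)
print(np.allclose(Q @ R, A), np.allclose(Q.T @ Q, np.eye(3)))
Qh, Rh = np.linalg.qr(A)                   # Householder-based; column signs may differ
print(np.allclose(np.abs(Qh), np.abs(Q)))
```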
Table 5: The four possibilities for the steady state problem 𝑨𝒙 = 𝒃, where 𝑨 ∈ ℝ𝑚×𝑛 and 𝑟 ∶= rank 𝑨. Gaussian elimination on [𝑨 𝒃] gives 𝑹𝒙 = 𝒅, where 𝑹 ∶= 𝑬𝑨 and 𝒅 ∶= 𝑬𝒃.
• 𝑟 = 𝑚 = 𝑛 (square and invertible): RREF 𝑹 = [𝑰]; particular solution 𝒙ₚ = 𝑨⁻¹𝒃; nullspace matrix [𝟎]; exactly 1 solution; left inverse 𝑨⁻¹; right inverse 𝑨⁻¹.
• 𝑟 = 𝑚 < 𝑛 (short and wide): 𝑹 = [𝑰 𝑭]; 𝒙ₚ = [𝒅; 𝟎]; nullspace matrix [−𝑭; 𝑰]; ∞ solutions; no left inverse; right inverse 𝑨⊤(𝑨𝑨⊤)⁻¹.
• 𝑟 = 𝑛 < 𝑚 (tall and thin): 𝑹 = [𝑰; 𝟎]; 𝒙ₚ = [𝒅] or none; nullspace matrix [𝟎]; 0 or 1 solution; left inverse (𝑨⊤𝑨)⁻¹𝑨⊤; no right inverse.
• 𝑟 < 𝑚, 𝑟 < 𝑛 (not full rank): 𝑹 = [𝑰 𝑭; 𝟎 𝟎]; 𝒙ₚ = [𝒅; 𝟎] or none; nullspace matrix [−𝑭; 𝑰]; 0 or ∞ solutions; no left or right inverse.
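A quick NumPy illustration of the one-sided inverses in the last two columns of Table 5 (made-up matrices, assumed full rank):

```python
import numpy as np

rng = np.random.default_rng(0)
T = rng.standard_normal((5, 3))            # tall and thin, r = n < m
W = rng.standard_normal((3, 5))            # short and wide, r = m < n

left_inv = np.linalg.inv(T.T @ T) @ T.T    # (A^T A)^{-1} A^T
right_inv = W.T @ np.linalg.inv(W @ W.T)   # A^T (A A^T)^{-1}

print(np.allclose(left_inv @ T, np.eye(3)))   # left inverse:  B A = I
print(np.allclose(W @ right_inv, np.eye(3)))  # right inverse: A C = I
```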
Algorithm 12 (The LU Factorization). The algorithm is illustrated in Alg. 4. 𝑇(𝑚, 𝑛) ∼ ⅓𝑚³ + ⅓𝑚²𝑛. For a band matrix 𝑩 with 𝑤 nonzero diagonals below and above its main diagonal, 𝑇(𝑚, 𝑛, 𝑤) ∼ 𝑚𝑤².

Algorithm 4 LU factorization on 𝑨.
Input: 𝑨 ∈ ℝ𝑚×𝑛
Output: 𝑳, 𝑼
1: 𝑳 ← 𝟎 ∈ ℝ𝑚×𝑚
2: 𝑘 ← 0
3: for 𝑗 ← 0 to 𝑛 − 1 do
4:   if the 𝑗-th column is not a pivot column then
5:     continue
6:   Row exchange to make 𝑎ₖⱼ the largest available pivot.
7:   𝑙ₖₖ ← 1
8:   for 𝑖 ← 𝑘 + 1 to 𝑚 − 1 do
9:     𝑙ᵢₖ ← 𝑎ᵢⱼ/𝑎ₖⱼ  ⊳ Multiplier
10:    ⊳ Eliminate row 𝑖 beyond row 𝑘
11:    (row 𝑖 of 𝑨) ← (row 𝑖 of 𝑨) − 𝑙ᵢₖ (row 𝑘 of 𝑨)
12:   𝑘 ← 𝑘 + 1
13: return 𝑳, 𝑨

Lemma 23. Assuming no row exchanges, when a row of 𝑨 starts with zeros, so does that row of 𝑳. When a column of 𝑨 starts with zeros, so does that column of 𝑼.

Algorithm 13 (Solving {𝑨𝒙ₖ = 𝒃ₖ}_{𝑘=1}^{𝐾}). The algorithm is illustrated in Alg. 5. 𝑇(𝑚, 𝑛, 𝐾) ∼ ⅓𝑚³ + ⅓𝑚²𝑛 + 𝑛²𝐾. For a band matrix 𝑩 with 𝑤 nonzero diagonals below and above its main diagonal, 𝑇(𝑚, 𝑛, 𝑤, 𝐾) ∼ 𝑚𝑤² + 2𝑛𝑤𝐾.

Algorithm 5 Solving {𝑨𝒙ₖ = 𝒃ₖ}_{𝑘=1}^{𝐾}.
Input: 𝑨, {𝒃ₖ}_{𝑘=1}^{𝐾}
Output: {𝒙ₖ}_{𝑘=1}^{𝐾}
1: LU factorization 𝑨 = 𝑳𝑼.
2: for 𝑘 ← 1 to 𝐾 do
3:   Solve 𝑳𝒚ₖ = 𝒃ₖ by forward substitution.
4:   Solve 𝑼𝒙ₖ = 𝒚ₖ by backward substitution.
5: return {𝒙ₖ}_{𝑘=1}^{𝐾}

Figure 1: Three different cases (fixed-fixed, fixed-free, and free-free) for a spring system.

• 𝒆 ∈ ℝ𝑚: the stretching distance of each spring.
By Hooke's law, 𝑦ᵢ = 𝑐ᵢ𝑒ᵢ; in matrix form, 𝒚 = 𝑪𝒆.
There are three different cases for these springs, as illustrated in Fig. 1.

Fixed-fixed Case. In this case, 𝑚 = 𝑛 + 1 and the top and bottom springs are fixed. Originally there is no stretching. Then gravity acts to move the masses down by 𝒖. Each spring is stretched by the difference in the displacements of its ends, 𝑒ᵢ = 𝑢ᵢ − 𝑢ᵢ₋₁. Besides, 𝑒₁ = 𝑢₁ since the top is fixed, and 𝑒ₘ = −𝑢ₙ since the bottom is fixed. In matrix form,
𝒆 = 𝑨𝒖 ∶= [1; −1 1; −1 ⋱; ⋱ 1; −1] 𝒖 . (16)
Finally comes the balance equation: the internal forces from the springs balance the external forces on the masses, 𝑓ᵢ = 𝑦ᵢ − 𝑦ᵢ₊₁. In matrix form,
𝒇 = 𝑨⊤𝒚 ∶= [1 −1; 1 −1; ⋱ ⋱; 1 −1] 𝒚 . (17)
Combining the three matrices gives
𝑨⊤𝑪𝑨𝒖 = 𝒇 . (18)
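A minimal NumPy sketch of Eqns. 16-18 for the fixed-fixed chain (spring constants and forces are made up for illustration):

```python
import numpy as np

n = 4                                   # number of masses
m = n + 1                               # number of springs (fixed-fixed)
A = np.zeros((m, n))
for i in range(n):
    A[i, i] = 1.0                       # e_i gets +u_i
    A[i + 1, i] = -1.0                  # e_{i+1} gets -u_i   (Eqn. 16 pattern)

C = np.diag([1.0, 2.0, 2.0, 2.0, 1.0])  # made-up spring constants
f = np.ones(n)                          # gravity-like load on each mass

K = A.T @ C @ A                         # stiffness matrix of Eqn. 18
u = np.linalg.solve(K, f)               # displacements
e = A @ u                               # stretches, Eqn. 16
y = C @ e                               # spring forces (Hooke's law)
print(np.allclose(A.T @ y, f))          # force balance, Eqns. 17-18
```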
When 𝑪 = 𝑰, the stiffness matrix is 𝑲 ∶= 𝑨⊤𝑨.
• 𝑲⁻¹ is a full matrix with all positive entries. 𝑲⁻¹ is also PD.

Fixed-free Case. In this case, 𝑚 = 𝑛: the top is fixed and the bottom is free. When 𝑪 = 𝑰,
𝑨 = [1; −1 1; ⋱ ⋱; −1 1] ,  𝑲 = [2 −1; −1 2 ⋱; ⋱ ⋱ −1; −1 1] . (20)

Free-free Case. In this case, 𝑚 = 𝑛 − 1 and both ends are free. When 𝑪 = 𝑰,
𝑨 = [−1 1; ⋱ ⋱; −1 1] ,  𝑲 = [1 −1; −1 2 ⋱; ⋱ ⋱ −1; −1 1] . (21)
There is a nonzero solution to 𝑨𝒖 = 𝟎: the masses can move, 𝒖 = 𝟏, with no stretching of the springs, 𝒆 = 𝟎. 𝑲 is only PSD, and 𝑲𝒖 = 𝒇 is solvable only for special 𝒇, i.e., 𝟏⊤𝒇 = 0, or the whole line of springs (with both ends free) will take off like a rocket.

3.6 Graphs and Networks

Definition 31 (Adjacency Matrix). Given a directed graph 𝐺 = (𝑉, 𝐸) where |𝑉| = 𝑛, the adjacency matrix is 𝑨 = [𝕀((𝑖, 𝑗) ∈ 𝐸)]_{𝑛×𝑛}, i.e., 𝑎ᵢⱼ = 1 if there is an edge from vertex 𝑖 to vertex 𝑗. The (𝑖, 𝑗) entry of 𝑨^𝑘 counts the number of 𝑘-step paths from vertex 𝑖 to vertex 𝑗. If 𝐺 is undirected, 𝑨 is symmetric.

Definition 32 (Incidence Matrix). Given a directed graph 𝐺 = (𝑉, 𝐸) where |𝑉| = 𝑛 and |𝐸| = 𝑚, the incidence matrix is 𝑨 ∈ ℝ𝑚×𝑛, where 𝑎ᵢⱼ = −1 if edge 𝑖 starts from vertex 𝑗, 𝑎ᵢⱼ = 1 if edge 𝑖 ends at vertex 𝑗, and 𝑎ᵢⱼ = 0 otherwise.

Figure 2: A circuit with a current source into vertex 1.

For the circuit in Fig. 2, the incidence matrix is
𝑨 ∶= [−1 1 0 0; −1 0 1 0; 0 −1 1 0; −1 0 0 1; 0 −1 0 1; 0 0 −1 1] . (22)

We define
• 𝒖 ∈ ℝ𝑛: potentials (the voltages) at the 𝑛 nodes.
• 𝒚 ∈ ℝ𝑚: currents flowing along the 𝑚 edges.
• 𝒇 ∈ ℝ𝑛: the current sources into the 𝑛 nodes.
• 𝑪 ∶= diag(𝑐₁, 𝑐₂, …, 𝑐ₘ) ∈ ℝ𝑚×𝑚: the conductance of each edge.

𝑨𝒖 gives the potential differences across the 𝑚 edges. Ohm's law says that the current 𝑦ᵢ through a resistor is proportional to the potential difference: 𝒚 = 𝑪𝑨𝒖. Kirchhoff's current law says that the net current into every node is zero, which is expressed as
𝑨⊤𝒚 = 𝑨⊤𝑪𝑨𝒖 = 𝒇 . (23)
Kirchhoff's voltage law says that the sum of potential differences around a loop must be zero.

Lemma 25. The following are properties of an incidence matrix 𝑨.
• dim 𝒞(𝑨) = dim 𝒞(𝑨⊤) = 𝑛 − 1.
• dim 𝒩(𝑨) = 1.
• dim 𝒩(𝑨⊤) = 𝑚 − 𝑛 + 1.

Proof. Since we can raise or lower all the potentials by the same constant, 𝟏 ∈ 𝒩(𝑨). Rows of 𝑨 are dependent if the corresponding edges contain a loop. At the end of elimination we have a full set of 𝑟 independent rows. Those 𝑟 edges form a spanning tree of the graph, which has 𝑛 − 1 edges if the graph is connected.

3.7 Two-point Boundary-value Problems

Solving
−d²𝑢(𝑥)/d𝑥² = 𝑓(𝑥), 𝑥 ∈ [0, 1] , (24)
with boundary conditions 𝑢(0) = 0 and 𝑢(1) = 0. This equation describes a steady state system, e.g., the temperature distribution of a rod with a heat source 𝑓(𝑥) and both ends fixed at 0 °C.

Since a computer cannot solve a differential equation exactly, we have to approximate the differential equation with a difference equation. For that reason we can only accept a finite amount of information, at 𝑛 equally spaced points:
𝑢₁ ∶= 𝑢(ℎ), 𝑢₂ ∶= 𝑢(2ℎ), …, 𝑢ₙ ∶= 𝑢(𝑛ℎ) , (25)
𝑓₁ ∶= 𝑓(ℎ), 𝑓₂ ∶= 𝑓(2ℎ), …, 𝑓ₙ ∶= 𝑓(𝑛ℎ) , (26)
where ℎ ∶= 1/(𝑛 + 1). The boundary conditions become 𝑢₀ ∶= 0 and 𝑢ₙ₊₁ ∶= 0.
We approximate the second-order derivative by
−d²𝑢(𝑥)/d𝑥² ≈ −(𝑢(𝑥 + ℎ) − 2𝑢(𝑥) + 𝑢(𝑥 − ℎ))/ℎ² = (−𝑢ⱼ₊₁ + 2𝑢ⱼ − 𝑢ⱼ₋₁)/ℎ² . (27)
Therefore, the differential equation −d²𝑢(𝑥)/d𝑥² = 𝑓(𝑥) becomes
𝑲𝒖 = [2 −1; −1 2 ⋱; ⋱ ⋱ −1; −1 2] [𝑢₁; 𝑢₂; ⋮; 𝑢ₙ] = ℎ² [𝑓₁; 𝑓₂; ⋮; 𝑓ₙ] = ℎ²𝒇 . (28)

Lemma 26. The FLOPs for solving 𝑲𝒖 = ℎ²𝒇 is 𝑇(𝑛) ∼ 3𝑛.

4 Theory: Eigenvalues and Eigenvectors

4.1 Eigenvalues and Eigenvectors

Definition 33 (Eigenvalue 𝜆 and Eigenvector 𝒙). 𝜆 and 𝒙 ≠ 𝟎 are an eigenvalue and eigenvector of 𝑨 if 𝑨𝒙 = 𝜆𝒙.

Algorithm 14 (Solving Eigenvalues and Eigenvectors). 𝑨𝒙 = 𝜆𝒙 ⇒ (𝑨 − 𝜆𝑰)𝒙 = 𝟎. Since 𝒙 ≠ 𝟎, 𝒩(𝑨 − 𝜆𝑰) ≠ 𝟘, which means det(𝑨 − 𝜆𝑰) = 0. The algorithm is illustrated in Alg. 6. In practice, the best way to compute eigenvalues is to compute similar matrices 𝑨₁, 𝑨₂, … that approach a triangular matrix.

Algorithm 6 Solve the eigenvalues and eigenvectors of 𝑨.
Input: 𝑨 ∈ ℝ𝑛×𝑛
Output: 𝜆ᵢ, 𝒙ᵢ
1: Solve det(𝑨 − 𝜆𝑰) = 0, which is a polynomial in 𝜆 of degree 𝑛, for the eigenvalues 𝜆.
2: For each eigenvalue 𝜆, solve (𝑨 − 𝜆𝑰)𝒙 = 𝟎 for the eigenvector 𝒙.
3: return 𝜆ᵢ, 𝒙ᵢ

Definition 34 (Geometric Multiplicity (GM)). The number of independent eigenvectors for 𝜆, which is dim 𝒩(𝑨 − 𝜆𝑰).

Definition 35 (Algebraic Multiplicity (AM)). The number of repetitions of 𝜆 among the eigenvalues. Look at the 𝑛 roots of det(𝑨 − 𝜆𝑰) = 0.

Lemma 27. The following are properties of eigenvalues and eigenvectors.
• For each eigenvalue, GM ≤ AM. A matrix is diagonalizable iff every eigenvalue has GM = AM.
• Each eigenvalue has ≥ 1 eigenvector.
• All eigenvalues are different ⇒ all eigenvectors are independent, which means the matrix can be diagonalized.
• There is no connection between invertibility and diagonalizability. Invertibility is concerned with the eigenvalues (𝜆 = 0 or 𝜆 ≠ 0). Diagonalizability is concerned with the eigenvectors (too few or enough for 𝑺).
• Suppose both 𝑨 and 𝑩 can be diagonalized; they share the same eigenvector matrix 𝑺 iff 𝑨𝑩 = 𝑩𝑨.

4.2 Diagonalizable

Theorem 28 (Diagonalizable). If 𝑨 ∈ ℝ𝑛×𝑛 has 𝑛 independent eigenvectors, 𝑨 is diagonalizable:
𝑨 = 𝑺𝚲𝑺⁻¹ , (29)
where 𝑺 ∶= [𝒙₁ 𝒙₂ ⋯ 𝒙ₙ] and 𝚲 ∶= diag(𝜆₁, 𝜆₂, …, 𝜆ₙ). In other words, 𝑨 is similar to 𝚲.

Proof. 𝑨𝑺 = [𝜆₁𝒙₁ 𝜆₂𝒙₂ ⋯ 𝜆ₙ𝒙ₙ] = 𝑺𝚲.

Definition 36 (Normal Matrix). A square matrix 𝑨 is normal when 𝑨⊤𝑨 = 𝑨𝑨⊤. That includes symmetric, antisymmetric, and orthogonal matrices. In this case, 𝜎ᵢ = |𝜆ᵢ|.

Lemma 29. The eigenvectors of 𝑨 are orthonormal when 𝑨 is normal.

Theorem 30 (Spectral Theorem). Every symmetric matrix 𝑨 has the factorization
𝑨 = 𝑸𝚲𝑸⊤ = ∑_{𝑖=1}^{𝑛} 𝜆ᵢ𝒙ᵢ𝒙ᵢ⊤ . (30)

Properties of eigenvalues and eigenvectors of special matrices are illustrated in Table 10.

5 Application: Solving Dynamic Problems

5.1 Solving Difference and Differential Equations

The algorithms for solving first-order difference and differential equations are illustrated in Alg. 7 and Alg. 8, respectively.

Algorithm 7 Solving 𝒖ₖ₊₁ = 𝑨𝒖ₖ.
Input: 𝑨 ∈ ℝ𝑛×𝑛, 𝒖₀
Output: 𝒖ₖ
1: Diagonalize 𝑨 = 𝑺𝚲𝑺⁻¹.
2: Solve 𝑺𝒄 = 𝒖₀ to write 𝒖₀ as a linear combination of eigenvectors.
3: The solution is 𝒖ₖ = 𝑨^𝑘𝒖₀ = 𝑺𝚲^𝑘𝑺⁻¹𝒖₀ = 𝑺𝚲^𝑘𝒄 = ∑_{𝑖=1}^{𝑛} 𝑐ᵢ𝜆ᵢ^𝑘𝒙ᵢ.
4: return 𝒖ₖ
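A minimal NumPy sketch of Alg. 7 (relying on a library eigensolver, in the spirit of Alg. 14's remark), using a made-up 2×2 matrix:

```python
import numpy as np

A = np.array([[0.9, 0.2],
              [0.1, 0.8]])
u0 = np.array([1.0, 0.0])
k = 20

lam, S = np.linalg.eig(A)          # A = S diag(lam) S^{-1}
c = np.linalg.solve(S, u0)         # write u0 in the eigenvector basis
u_k = S @ (lam**k * c)             # u_k = S Lambda^k c

print(np.allclose(u_k, np.linalg.matrix_power(A, k) @ u0))   # True
```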
Algorithm 8 Solving d𝒖(𝑡)/d𝑡 = 𝑨𝒖(𝑡), where 𝑨 is a constant coefficient matrix.
Input: 𝑨 ∈ ℝ𝑛×𝑛, 𝒖(0)
Output: 𝒖(𝑡)
1: Diagonalize 𝑨 = 𝑺𝚲𝑺⁻¹.
2: Solve 𝑺𝒄 = 𝒖(0) to write 𝒖(0) as a linear combination of eigenvectors.
3: The solution is 𝒖(𝑡) = exp(𝑨𝑡)𝒖(0) = 𝑺 exp(𝚲𝑡)𝑺⁻¹𝒖(0) = 𝑺 exp(𝚲𝑡)𝒄 = ∑_{𝑖=1}^{𝑛} 𝑐ᵢ exp(𝜆ᵢ𝑡)𝒙ᵢ.
4: If two 𝜆's are equal, with only one eigenvector, another solution 𝑡 exp(𝜆𝑡)𝒙 is needed.
5: return 𝒖(𝑡)

Example 1 (Fibonacci Numbers). Find the 𝑘-th Fibonacci number, where the sequence is defined as 𝐹₀ = 0, 𝐹₁ = 1, and 𝐹ₖ₊₂ = 𝐹ₖ₊₁ + 𝐹ₖ.

Solution. Let 𝒖ₖ ∶= [𝐹ₖ₊₁; 𝐹ₖ], then 𝒖₀ = [1; 0] and 𝒖ₖ₊₁ = 𝑨𝒖ₖ ∶= [1 1; 1 0] 𝒖ₖ. 𝜆₁ = (1 + √5)/2, 𝜆₂ = (1 − √5)/2. 𝒙₁ = [𝜆₁; 1], 𝒙₂ = [𝜆₂; 1]. 𝒄 = 1/(𝜆₁ − 𝜆₂) [1; −1]. 𝒖ₖ = 1/(𝜆₁ − 𝜆₂) (𝜆₁^𝑘𝒙₁ − 𝜆₂^𝑘𝒙₂). 𝐹ₖ = (1/√5)(𝜆₁^𝑘 − 𝜆₂^𝑘) = the nearest integer to (1/√5)((1 + √5)/2)^𝑘.

Example 2 (Simple Harmonic Vibration). Solve d²𝑥(𝑡)/d𝑡² + 𝑥(𝑡) = 0, where 𝑥(0) = 1 and d𝑥(0)/d𝑡 = 0. This is 𝑚𝑎 = −𝑘𝑥 with 𝑚 = 1, 𝑘 = 1.

Solution. Let 𝒖(𝑡) ∶= [𝑥; d𝑥/d𝑡], then 𝒖(0) = [1; 0] and d𝒖/d𝑡 = 𝑨𝒖(𝑡) = [0 1; −1 0] 𝒖(𝑡). 𝜆₁ = i, 𝜆₂ = −i. 𝒙₁ = [1; i], 𝒙₂ = [1; −i]. 𝒄 = ½ [1; 1]. 𝒖(𝑡) = ½(exp(i𝑡)𝒙₁ + exp(−i𝑡)𝒙₂). 𝑥(𝑡) = ½(exp(i𝑡) + exp(−i𝑡)) = cos 𝑡.

Definition 37 (Markov Matrices). An 𝑛×𝑛 matrix is a Markov matrix if all entries are nonnegative and each column of the matrix adds up to 1.

Lemma 31. A Markov matrix 𝑨 has the following properties.
• 𝜆₁ = 1 is an eigenvalue of 𝑨.
• Its eigenvector 𝒙₁ is nonnegative, and it is a steady state since 𝑨𝒙₁ = 𝒙₁.
• The other eigenvalues satisfy |𝜆ᵢ| ≤ 1.
• If 𝑨 or any power of 𝑨 has all positive entries, the other |𝜆ᵢ| < 1. The solution 𝑨^𝑘𝒖₀ approaches a multiple of 𝒙₁, which is the steady state 𝒖∞.

Lemma 32. The difference equation 𝒖ₖ₊₁ = 𝑨𝒖ₖ is
• stable if ∀𝑖. |𝜆ᵢ| < 1;
• neutrally stable if ∃𝑖. |𝜆ᵢ| = 1, and all the other |𝜆ᵢ| < 1;
• unstable if ∃𝑖. |𝜆ᵢ| > 1.
In the stable case, the powers 𝑨^𝑘 approach zero and so does 𝒖ₖ = 𝑨^𝑘𝒖₀.

Lemma 33. The differential equation d𝒖(𝑡)/d𝑡 = 𝑨𝒖(𝑡) is
• stable and exp 𝑨𝑡 → 𝟎 if ∀𝑖. Re 𝜆ᵢ < 0;
• neutrally stable if ∃𝑖. Re 𝜆ᵢ = 0, and all the other Re 𝜆ᵢ < 0;
• unstable and exp 𝑨𝑡 is unbounded if ∃𝑖. Re 𝜆ᵢ > 0.

5.2 Singular Value Decomposition

Theorem 34 (SVD Factorization). For a matrix 𝑨 ∈ ℝ𝑚×𝑛 with 𝑟 ∶= rank 𝑨, choose 𝑼 ∈ ℝ𝑚×𝑚 to contain orthonormal eigenvectors of 𝑨𝑨⊤, and 𝑽 ∈ ℝ𝑛×𝑛 to contain orthonormal eigenvectors of 𝑨⊤𝑨. The shared eigenvalues are 𝜎₁², 𝜎₂², …, 𝜎ᵣ². Then
𝑨 = 𝑼𝚺𝑽⊤ = ∑_{𝑖=1}^{𝑟} 𝜎ᵢ𝒖ᵢ𝒗ᵢ⊤ . (31)
𝑼 and 𝑽 satisfy the following.
• The first 𝑟 columns of 𝑼 contain an orthonormal basis for 𝒞(𝑨).
• The last 𝑚 − 𝑟 columns of 𝑼 contain an orthonormal basis for 𝒩(𝑨⊤).
• The first 𝑟 columns of 𝑽 contain an orthonormal basis for 𝒞(𝑨⊤).
• The last 𝑛 − 𝑟 columns of 𝑽 contain an orthonormal basis for 𝒩(𝑨).

Proof. Start from 𝑨⊤𝑨𝒗ᵢ = 𝜎ᵢ²𝒗ᵢ. Multiplying both sides by 𝑨 gives 𝑨𝑨⊤(𝑨𝒗ᵢ) = 𝜎ᵢ²(𝑨𝒗ᵢ), which shows that 𝑨𝒗ᵢ is an eigenvector of 𝑨𝑨⊤ with the shared eigenvalue 𝜎ᵢ². Since ‖𝑨𝒗ᵢ‖ = √(𝒗ᵢ⊤𝑨⊤𝑨𝒗ᵢ) = √(𝜎ᵢ²𝒗ᵢ⊤𝒗ᵢ) = 𝜎ᵢ, we denote 𝒖ᵢ ∶= 𝑨𝒗ᵢ/‖𝑨𝒗ᵢ‖ = 𝑨𝒗ᵢ/𝜎ᵢ, namely 𝑨𝒗ᵢ = 𝜎ᵢ𝒖ᵢ. This shows column by column that 𝑨𝑽 = 𝑼𝚺. Since 𝑽 is orthogonal, 𝑨 = 𝑼𝚺𝑽⊤.

Lemma 35. The largest singular value dominates all eigenvalues and all entries of 𝑨. That is, 𝜎₁ ≥ maxᵢ |𝜆ᵢ| and 𝜎₁ ≥ maxᵢ,ⱼ |𝑎ᵢⱼ|.

Lemma 36. For a square matrix 𝑨, the spectral factorization and the SVD factorization give the same result when 𝑨 is PSD.

Proof. We need orthonormal eigenvectors (𝑨 should be symmetric) and nonnegative eigenvalues (𝑨 should be PSD).
5.3 Leontief's Input-output Model

Leontief divided the US economy into 𝑛 sectors that produce goods or services (e.g., coal, automotive, and communication), and other sectors that only consume goods or services (e.g., consumers and the government).
• Production vector 𝒙 ∈ ℝ𝑛: the output of each producer for one year.
• Final demand vector 𝒃 ∈ ℝ𝑛: the demand for each producer by the consumers for a year.
• Intermediate demand vector 𝒖 ∶= 𝑨𝒙 ∈ ℝ𝑛: the demand for each producer by the producers for a year. 𝑨 ∈ ℝ𝑛×𝑛 is the consumption matrix.

Theorem 37 (Leontief Input-output Model). When there is a production level 𝒙 such that the amounts produced exactly balance the total demand for that production,
𝒙 = 𝑨𝒙 + 𝒃 . (32)
If 𝑨 and 𝒃 have nonnegative entries and the largest eigenvalue of 𝑨 is less than 1, then the solution exists and has nonnegative entries:
𝒙 = (𝑰 − 𝑨)⁻¹𝒃 = (𝑰 + ∑_{𝑘=1}^{∞} 𝑨^𝑘) 𝒃 . (33)

Table 6: Comparison of PD and PSD matrices.

6 Theory: Positive Definite Matrices

Lemma 38. For symmetric matrices, the pivots and the eigenvalues have the same signs.

Lemma 39. 𝒙⊤𝑨𝒙 = 1 is an ellipsoid in 𝑛 dimensions. The axes of the ellipsoid point toward the eigenvectors of 𝑨.

6.2 Unconstrained Optimization

The goal is to solve
arg min_𝒖 𝑓(𝒖) . (34)

Definition 38 (Stationary Point). A point where ∂𝑓/∂𝒖 = 𝟎. Such a point can be a local minimum, a local maximum, or a saddle point.

Lemma 40. 𝑓(𝒖) has a local minimum when ∂²𝑓/∂𝒖² is PD. Similarly, 𝑓(𝒖) has a local maximum when ∂²𝑓/∂𝒖² is ND. If some eigenvalues are positive and some are negative, 𝑓(𝒖) has a saddle point. If ∂²𝑓/∂𝒖² has an eigenvalue 0, the test is inconclusive.

In some cases, we can directly get the stationary point by solving ∂𝑓/∂𝒖 = 𝟎. In other cases, we iteratively approach the stationary point.

Algorithm 16 (Gradient Descent). 𝒖 ← 𝒖 − 𝜂 ∂𝑓/∂𝒖.

Algorithm 17 (Newton's Method). 𝒖 ← 𝒖 − (∂²𝑓/∂𝒖²)⁻¹ ∂𝑓/∂𝒖.

Lemma 41 (Taylor Series).
𝑓(𝒖) ≈ 𝑓(𝒖₀) + (𝒖 − 𝒖₀)⊤ ∂𝑓/∂𝒖|_{𝒖₀} + ½ (𝒖 − 𝒖₀)⊤ ∂²𝑓/∂𝒖²|_{𝒖₀} (𝒖 − 𝒖₀) . (35)
For the constrained problem of Eqn. 36, i.e., minimizing 𝑓(𝒖) subject to 𝑔ᵢ(𝒖) ≤ 0, 𝑖 = 1, 2, …, 𝑚, and ℎⱼ(𝒖) = 0, 𝑗 = 1, 2, …, 𝑛, the Lagrange function is defined as
ℒ(𝒖, 𝜶, 𝜷) ∶= 𝑓(𝒖) + ∑_{𝑖=1}^{𝑚} 𝛼ᵢ𝑔ᵢ(𝒖) + ∑_{𝑗=1}^{𝑛} 𝛽ⱼℎⱼ(𝒖) , (37)
where 𝛼ᵢ ≥ 0.

Lemma 42. The optimization problem of Eqn. 36 is equivalent to
min_𝒖 max_{𝜶,𝜷} ℒ(𝒖, 𝜶, 𝜷) (38)
s. t. 𝛼ᵢ ≥ 0, 𝑖 = 1, 2, …, 𝑚 .

Proof.
min_𝒖 max_{𝜶,𝜷} ℒ(𝒖, 𝜶, 𝜷)
= min_𝒖 (𝑓(𝒖) + max_{𝜶,𝜷} (∑_{𝑖=1}^{𝑚} 𝛼ᵢ𝑔ᵢ(𝒖) + ∑_{𝑗=1}^{𝑛} 𝛽ⱼℎⱼ(𝒖)))
= min_𝒖 (𝑓(𝒖) + {0 if 𝒖 is feasible; ∞ otherwise})
= min_𝒖 𝑓(𝒖), with 𝒖 feasible . (39)
When 𝑔ᵢ is infeasible, 𝑔ᵢ(𝒖) > 0, we can let 𝛼ᵢ = ∞, such that 𝛼ᵢ𝑔ᵢ(𝒖) = ∞; when ℎⱼ is infeasible, ℎⱼ(𝒖) ≠ 0, we can let 𝛽ⱼ = sign(ℎⱼ(𝒖)) ⋅ ∞, such that 𝛽ⱼℎⱼ(𝒖) = ∞. When 𝒖 is feasible, since 𝛼ᵢ ≥ 0 and 𝑔ᵢ(𝒖) ≤ 0, we have 𝛼ᵢ𝑔ᵢ(𝒖) ≤ 0. Therefore, the maximum of 𝛼ᵢ𝑔ᵢ(𝒖) is 0.

Corollary 43 (KKT Conditions). The optimization problem of Eqn. 38 should satisfy the following at the optimum.
• Primal feasibility: 𝑔ᵢ(𝒖) ≤ 0, ℎᵢ(𝒖) = 0;
• Dual feasibility: 𝛼ᵢ ≥ 0;
• Complementary slackness: 𝛼ᵢ𝑔ᵢ(𝒖) = 0.

Definition 40 (Dual Problem). The dual problem of Eqn. 36 is
max_{𝜶,𝜷} min_𝒖 ℒ(𝒖, 𝜶, 𝜷) (40)
s. t. 𝛼ᵢ ≥ 0, 𝑖 = 1, 2, …, 𝑚 .

Definition 41 (Convex Function). A function 𝑓 is convex if
∀𝛼 ∈ [0, 1]. 𝑓(𝛼𝒙 + (1 − 𝛼)𝒚) ≤ 𝛼𝑓(𝒙) + (1 − 𝛼)𝑓(𝒚) , (42)
which means that if we pick two points on the graph of a convex function and draw a straight line segment between them, the portion of the function between these two points will lie below this straight line.

Lemma 45. A function 𝑓 is convex if every point on the tangent line lies below the corresponding point on 𝑓, i.e., 𝑓(𝒚) ≥ 𝑓(𝒙) + (𝒚 − 𝒙)⊤ ∂𝑓(𝒙)/∂𝒙, or ∂²𝑓/∂𝒙² is PSD.

Definition 42 (Affine Function). A function 𝑓 of the form 𝑓(𝒙) = 𝒄⊤𝒙 + 𝑑.

Lemma 46 (Slater Condition). When the primal problem is convex, i.e., 𝑓 and 𝑔ᵢ are convex and ℎⱼ is affine, and there exists at least one point in the feasible region at which the inequalities hold strictly, the dual problem is equivalent to the primal problem.

Proof. The proof is beyond the scope of this note. Please refer to [2] if you are interested.

7 Application: Solving Optimization Problems

7.1 Removable Non-differentiability

Lemma 47. The optimization problem
arg min_𝒖 |𝑓(𝒖)| (43)
is equivalent to
arg min_{𝒖,𝑥} 𝑥 (44)
s. t. 𝑓(𝒖) − 𝑥 ≤ 0 ,
   −𝑓(𝒖) − 𝑥 ≤ 0 .

The (hard-margin) SVM seeks the separating hyperplane with the largest margin:
arg max_{𝒘,𝑏} min_𝑖 (2/‖𝒘‖) |𝒘⊤𝒙ᵢ + 𝑏| (45)
s. t. 𝑦ᵢ(𝒘⊤𝒙ᵢ + 𝑏) > 0, 𝑖 = 1, 2, …, 𝑚 .
Since scaling of (𝒘, 𝑏) does not change the solution, for simplicity we add the constraint that
min_𝑖 |𝒘⊤𝒙ᵢ + 𝑏| = 1 . (46)

Theorem 48 (Standard Form of SVM). The optimization problem of the SVM is equivalent to
arg min_{𝒘,𝑏} ½ 𝒘⊤𝒘 (47)
s. t. 𝑦ᵢ(𝒘⊤𝒙ᵢ + 𝑏) ≥ 1, 𝑖 = 1, 2, …, 𝑚 .
Proof. By contradiction. Suppose the equality in the constraint does not hold at the optimum (𝒘⋆, 𝑏⋆), i.e., min_𝑖 𝑦ᵢ(𝒘⋆⊤𝒙ᵢ + 𝑏⋆) > 1. Then there exists (𝑟𝒘⋆, 𝑟𝑏⋆) with 0 < 𝑟 < 1 such that min_𝑖 𝑦ᵢ((𝑟𝒘⋆)⊤𝒙ᵢ + 𝑟𝑏⋆) = 1 and ½‖𝑟𝒘⋆‖² < ½‖𝒘⋆‖². That implies (𝒘⋆, 𝑏⋆) is not an optimum, which contradicts the assumption. Therefore, Eqn. 47 is equivalent to
arg min_{𝒘,𝑏} ½ 𝒘⊤𝒘 (48)
s. t. min_𝑖 𝑦ᵢ(𝒘⊤𝒙ᵢ + 𝑏) = 1 .
The objective function is equivalent to the Lagrangian form in Eqn. 52, whose stationarity conditions include
∂ℒ/∂𝑏 = 0 ⇒ ∑_{𝑖=1}^{𝑚} 𝛼ᵢ𝑦ᵢ = 0 . (54)
Substituting them into Eqn. 52 gives Eqn. 55.
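As a concrete check of the standard form in Eqn. 47, a hedged sketch using scipy.optimize.minimize with SLSQP; the two-class data set here is made up for illustration:

```python
import numpy as np
from scipy.optimize import minimize

# Made-up linearly separable 2D data with labels y in {-1, +1}.
X = np.array([[2.0, 2.0], [2.5, 3.0], [3.0, 2.5],
              [0.0, 0.0], [0.5, -0.5], [-0.5, 0.5]])
y = np.array([1, 1, 1, -1, -1, -1])

def objective(p):                     # p = (w1, w2, b); Eqn. 47: (1/2) w^T w
    w = p[:2]
    return 0.5 * w @ w

constraints = [{"type": "ineq",       # y_i (w^T x_i + b) - 1 >= 0
                "fun": lambda p, i=i: y[i] * (p[:2] @ X[i] + p[2]) - 1.0}
               for i in range(len(y))]

res = minimize(objective, x0=np.array([1.0, 1.0, 0.0]),
               method="SLSQP", constraints=constraints)
w, b = res.x[:2], res.x[2]
print(res.success, w, b)
print(np.min(y * (X @ w + b)))        # the active margin constraints sit at 1
```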
Table 7: Analogy of real-valued functions with linear transformations.
• Definition: 𝑓 ∶ ℝ → ℝ | 𝑇 ∶ ℝ𝑛 → ℝ𝑚.
• Domain, codomain: ℝ, ℝ | ℝ𝑛, ℝ𝑚.
• Image of 𝑥: 𝑓(𝑥) | 𝑇(𝒙) ∶= 𝑨𝒙.
• Range: {𝑦 ∣ ∃𝑥. 𝑦 = 𝑓(𝑥)} | 𝒞(𝑨).
• Zero set: {𝑥 ∣ 𝑓(𝑥) = 0} | 𝒩(𝑨).
• Inverse: 𝑓⁻¹(𝑦) | 𝑇⁻¹(𝒚) = 𝑨⁻¹𝒚.
• Composition: 𝑔∘𝑓 = 𝑔(𝑓(𝑥)) | 𝑇_𝐵 ∘ 𝑇_𝐴 = 𝑩𝑨𝒙.

Table 8: Terminologies of the linear transformation 𝑇(𝒙) = 𝑨𝒙.

Lemma 52 (General Matrix for a Linear Transformation). Let 𝑇 be a linear transformation from an input space with basis 𝑽 ∈ ℝ𝑛×𝑛 to an output space with basis 𝑼 ∈ ℝ𝑚×𝑚. There exists a unique matrix 𝑨 ∈ ℝ𝑚×𝑛 that gives the coordinate 𝑇(𝒄) = 𝑨𝒄 in the output space when the coordinate in the input space is 𝒄. The 𝑗-th column of 𝑨 is found by solving 𝑇(𝒗ⱼ) = 𝑼𝒂ⱼ.

8.2 Identity Transformations = Change of Basis

Definition 47 (Coordinate). The coordinate of a vector 𝒙 ∈ ℝ𝑛 relative to the basis matrix 𝑾 ∈ ℝ𝑛×𝑛 is the coefficient vector 𝒄 such that 𝒙 = 𝑾𝒄, or equivalently 𝒄 = 𝑾⁻¹𝒙.

Example 3 (Wavelet Transform). Wavelets are little waves. They have different lengths and they are localized at different places. The basis matrix is
𝑾 ∶= [1 1 1 0; 1 1 −1 0; 1 −1 0 1; 1 −1 0 −1] . (56)
These bases are orthogonal. The wavelet transform finds the coefficients 𝒄 when the input signal 𝒙 is expressed in the wavelet basis, 𝒙 = 𝑾𝒄.

Example 4 (Discrete Fourier Transform). The Fourier transform decomposes the signal into waves at equally spaced frequencies. The basis matrix is
𝑭 ∶= [1 1 1 1; 1 i i² i³; 1 i² (i²)² (i²)³; 1 i³ (i³)² (i³)³] . (57)
These bases are orthogonal. The discrete Fourier transform finds the coefficients 𝒄 when the input signal 𝒙 is expressed in the Fourier basis, 𝒙 = 𝑭𝒄.

Lemma 53 (Change of Basis). Suppose we want to change the basis from 𝑽 ∶= [𝒗₁ 𝒗₂ ⋯ 𝒗ₙ] to 𝑼 ∶= [𝒖₁ 𝒖₂ ⋯ 𝒖ₙ]. The coordinate of a vector 𝒙 is 𝒄 in 𝑽, and is 𝒃 in 𝑼. Then 𝒃 = 𝑼⁻¹𝑽𝒄, where 𝑨 ∶= 𝑼⁻¹𝑽 is called the change of basis matrix.

Proof. 𝒙 = 𝑽𝒄 = 𝑼𝒃 ⇒ 𝒃 = 𝑼⁻¹𝑽𝒄.

Algorithm 18 (Solving the Change of Basis Matrix). Perform elementary row operations on [𝑼 𝑽] to get [𝑰 𝑨]. 𝑇(𝑛) ∼ 𝑛³.

Example 5 (Diagonalization). 𝑇(𝒙) ∶= 𝑨𝒙 = 𝑺𝚲𝑺⁻¹𝒙 defines a linear transformation which changes the basis from 𝑰 to 𝑺, then transforms 𝒙 in the space of 𝑺, and last changes the basis from 𝑺 back to 𝑰.

Example 6 (SVD Factorization). 𝑇(𝒙) ∶= 𝑨𝒙 = 𝑼𝚺𝑽⊤𝒙 defines a linear transformation which changes the basis from 𝑰 to 𝑽, then transforms 𝒙 from space 𝑽 to space 𝑼, and last changes the basis from 𝑼 back to 𝑰.

9 Applications: Linear Transformations

9.1 Computer Graphics

Definition 48 (Homogeneous Coordinates). Each point [𝑥; 𝑦] ∈ ℝ² can be identified with the point [𝑥; 𝑦; 1] ∈ ℝ³. Homogeneous coordinates can be transformed via multiplication by 3 × 3 matrices, as illustrated in Table 9. By analogy, each point [𝑥; 𝑦; 𝑧] ∈ ℝ³ can be identified with the point [𝑥; 𝑦; 𝑧; 1] ∈ ℝ⁴.

Table 9: Transformation using homogeneous coordinates.
• Scaling: [𝑐ₓ 0 0; 0 𝑐ᵧ 0; 0 0 1] [𝑥; 𝑦; 1] = [𝑐ₓ𝑥; 𝑐ᵧ𝑦; 1].
• Translation: [1 0 𝑥₀; 0 1 𝑦₀; 0 0 1] [𝑥; 𝑦; 1] = [𝑥 + 𝑥₀; 𝑦 + 𝑦₀; 1].
• Reflection: [0 1 0; 1 0 0; 0 0 1] [𝑥; 𝑦; 1] = [𝑦; 𝑥; 1].
• Clockwise rotation: [cos 𝛼 −sin 𝛼 0; sin 𝛼 cos 𝛼 0; 0 0 1] [𝑥; 𝑦; 1].

Theorem 54 (Perspective Projections). A 3D object is represented on the 2D computer screen by projecting the object onto a viewing plane at 𝑧 = 0. Suppose the eye of a viewer is at the point [0; 0; 𝑑]. A perspective projection maps each point [𝑥; 𝑦; 𝑧] onto an image point [𝑥ₚ; 𝑦ₚ; 0] such that the two points and the eye position (the center of projection) are on a line:
𝑥ₚ = 𝑥/(1 − 𝑧/𝑑) ,  𝑦ₚ = 𝑦/(1 − 𝑧/𝑑) . (58)
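A minimal NumPy sketch of Table 9 and Eqn. 58 (the sample point and parameters are made up):

```python
import numpy as np

p = np.array([2.0, 1.0, 1.0])                 # homogeneous 2D point (x, y, 1)

T = np.array([[1, 0, 3],                      # translation by (3, 4)
              [0, 1, 4],
              [0, 0, 1]], dtype=float)
a = np.deg2rad(90)
R = np.array([[np.cos(a), -np.sin(a), 0],     # rotation matrix from Table 9
              [np.sin(a),  np.cos(a), 0],
              [0,          0,         1]])
print(R @ T @ p)                              # rotate after translating

# Perspective projection of a 3D point onto the plane z = 0 (Eqn. 58).
x, y, z, d = 1.0, 2.0, 4.0, 10.0              # viewer's eye at (0, 0, d)
xp, yp = x / (1 - z / d), y / (1 - z / d)
print(xp, yp)
```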
Table 10: Properties of eigenvalues and eigenvectors of special matrices.

9.2 Principal Component Analysis

Definition 49 (Principal Component Analysis, PCA). Given a set of instances {𝒙ᵢ}_{𝑖=1}^{𝑚} with empirical mean 𝝁 ∈ ℝ𝑑 and empirical covariance 𝚺 ∈ ℝ𝑑×𝑑, PCA wants to find a set of orthonormal bases 𝑾 ∶= [𝒘₁ 𝒘₂ ⋯ 𝒘_{𝑑′}] such that the sum of the variances of the projected data along each component is maximized:
arg max_𝑾 tr cov 𝑾⊤(𝒙 − 𝝁) (59)
s. t. 𝑾⊤𝑾 = 𝑰 .

Theorem 55. The optimum 𝑾 of Eqn. 61 is the top 𝑑′ eigenvectors of 𝚺.

Proof. Since 𝔼[𝑾⊤(𝒙 − 𝝁)] = 𝑾⊤𝔼[𝒙 − 𝝁] = 𝟎,
tr cov 𝑾⊤(𝒙 − 𝝁) = tr 𝔼[(𝑾⊤(𝒙 − 𝝁) − 𝟎)(𝑾⊤(𝒙 − 𝝁) − 𝟎)⊤] = tr 𝑾⊤𝔼[(𝒙 − 𝝁)(𝒙 − 𝝁)⊤]𝑾 = tr 𝑾⊤𝚺𝑾 . (60)
The optimization problem is equivalent to
arg min_𝑾 − tr 𝑾⊤𝚺𝑾 (61)
s. t. 𝑾⊤𝑾 = 𝑰 .
The Lagrange function is
ℒ(𝑾, 𝑩) ∶= − tr 𝑾⊤𝚺𝑾 + (vec 𝑩)⊤(vec(𝑾⊤𝑾 − 𝑰)) = − tr 𝑾⊤𝚺𝑾 + tr 𝑩⊤(𝑾⊤𝑾 − 𝑰) . (62)
We can get the optimum by
∂ℒ/∂𝑾 = 𝟎 ⇒ 𝚺𝑾 = 𝑾𝑩 . (63)

Corollary 56. 𝒙̂ ∶= 𝑸⊤(𝒙 − 𝝁) has 𝔼[𝒙̂] = 𝟎 and cov 𝒙̂ = 𝚲, where 𝚺 = 𝑸𝚲𝑸⊤.

Definition 50 (PCA Whitening). 𝒙̂ ∶= 𝚲^{−1/2}𝑸⊤(𝒙 − 𝝁) has 𝔼[𝒙̂] = 𝟎 and cov 𝒙̂ = 𝑰.

Definition 51 (ZCA Whitening). 𝒙̂ ∶= 𝑸𝚲^{−1/2}𝑸⊤(𝒙 − 𝝁) has 𝔼[𝒙̂] = 𝟎 and cov 𝒙̂ = 𝑰.
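A minimal NumPy sketch of Theorem 55 and the whitening transforms (made-up data; `d_prime` is the number of kept components):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 3)) @ np.diag([3.0, 1.0, 0.3])   # made-up data, rows = instances
mu = X.mean(axis=0)
Sigma = np.cov(X, rowvar=False)

eigval, Q = np.linalg.eigh(Sigma)             # ascending eigenvalues
order = np.argsort(eigval)[::-1]
eigval, Q = eigval[order], Q[:, order]

d_prime = 2
W = Q[:, :d_prime]                            # top d' eigenvectors (Theorem 55)
Z = (X - mu) @ W                              # projected data

X_pca_white = (X - mu) @ Q / np.sqrt(eigval)  # PCA whitening (Definition 50)
print(np.allclose(np.cov(X_pca_white, rowvar=False), np.eye(3), atol=1e-8))
```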
10 Appendix

Lemma 57 (Sum of Series).
∑_{𝑖=1}^{𝑛} 𝑖 = 𝑛(𝑛 + 1)/2 ∼ ½𝑛² , (64)
∑_{𝑖=1}^{𝑛} 𝑖² = 𝑛(𝑛 + 1)(2𝑛 + 1)/6 ∼ ⅓𝑛³ . (65)

References
[1] S. Axler. Linear Algebra Done Right. Springer, 1997.
[2] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
[3] S. Boyd and L. Vandenberghe. Introduction to Applied Linear Algebra: Vectors, Matrices, and Least Squares. Cambridge University Press, 2018.
[4] R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press, 1990.
[5] D. C. Lay, S. R. Lay, and J. J. McDonald. Linear Algebra and Its Applications (Fifth Edition). Pearson, 2014.
[6] K. B. Petersen, M. S. Pedersen, et al. The Matrix Cookbook. Technical University of Denmark, 2008.
[7] G. Strang. Linear Algebra and Its Applications (Fourth Edition). Academic Press, 2006.
[8] G. Strang. Computational Science and Engineering. Wellesley-Cambridge Press, 2007.
[9] G. Strang. Introduction to Linear Algebra (Fourth Edition). Wellesley-Cambridge Press, 2009.