Self Learning LinAlgebra
a quick tour
Olivia Pfeiler
[email protected]
• YouTube channel by Jon Krohn (author of the bestseller Deep Learning Illustrated and host of the SuperDataScience podcast)
Linear Algebra for Machine Learning
Vectors, Matrices, Sum & Multiplication
Example: Vectors used in ML
E.g. classification with Support Vector Machines (SVM), which search for the best separating hyperplane
Example: Matrices used in ML
What is a matrix?
$$A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} = (a_{ij}) \in \mathbb{R}^{m \times n}$$
Sum & product of matrices
A ∈ ℝ^(m×n), B ∈ ℝ^(n×p)
A · B = C, with C ∈ ℝ^(m×p)
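A minimal NumPy sketch of these shape rules (the example matrices below are arbitrary):

```python
import numpy as np

# A is m x n, B is n x p; their product C = A @ B is m x p
A = np.arange(6).reshape(2, 3)      # 2 x 3
B = np.arange(12).reshape(3, 4)     # 3 x 4
C = A @ B                           # 2 x 4

print(C.shape)                      # (2, 4)

# The sum A + A is only defined for matrices of identical shape
print((A + A).shape)                # (2, 3)
```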
Some matrix properties
Non-commutativity: in general AB ≠ BA
Distributivity: (A + B)(C + D) = AC + AD + BC + BD
Transpose: (Aᵀ)ᵀ = A
(AB)ᵀ = BᵀAᵀ
(A + B)ᵀ = Aᵀ + Bᵀ
Invertible (nonsingular): B = A⁻¹ if and only if BA = I = AB
• B is then both a left and a right inverse
• A⁻¹ exists only for square matrices
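A quick NumPy check of these properties (the random matrices are chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

# Non-commutativity: AB and BA generally differ
print(np.allclose(A @ B, B @ A))            # False (in general)

# Transpose rules: (AB)^T = B^T A^T and (A+B)^T = A^T + B^T
print(np.allclose((A @ B).T, B.T @ A.T))    # True
print(np.allclose((A + B).T, A.T + B.T))    # True

# Inverse: A^{-1} A = A A^{-1} = I for a nonsingular square matrix
A_inv = np.linalg.inv(A)
print(np.allclose(A_inv @ A, np.eye(3)))    # True
print(np.allclose(A @ A_inv, np.eye(3)))    # True
```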
Reminder: Hadamard Product
$$\begin{pmatrix} 1 & 1 \\ 2 & 2 \end{pmatrix} \odot \begin{pmatrix} 0 & 1 \\ 4 & 5 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 8 & 10 \end{pmatrix}$$
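In NumPy the Hadamard (element-wise) product is the plain `*` operator, not `@`; a small sketch reproducing the example above:

```python
import numpy as np

A = np.array([[1, 1],
              [2, 2]])
B = np.array([[0, 1],
              [4, 5]])

# Hadamard (element-wise) product: entry-by-entry multiplication
print(A * B)
# [[ 0  1]
#  [ 8 10]]
```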
Special matrices
• Idempotent matrix
Orthogonal matrices
A is orthogonal if
• A is a square matrix and
• rows & columns are orthonormal vectors
(orthogonal unit vectors)
A is orthogonal if
• ATA = I
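A small NumPy sketch; the Q factor of a QR decomposition of a random matrix is used here purely as a convenient example of an orthogonal matrix:

```python
import numpy as np

# The Q factor of a QR decomposition is orthogonal
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))

# Rows and columns are orthonormal, so Q^T Q = Q Q^T = I
print(np.allclose(Q.T @ Q, np.eye(4)))      # True
print(np.allclose(Q @ Q.T, np.eye(4)))      # True

# Consequence: the inverse of an orthogonal matrix is its transpose
print(np.allclose(np.linalg.inv(Q), Q.T))   # True
```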
Tensors
Matrix Operations
Determinant
[Figure: Rule of Sarrus for 3×3 determinants (source: Wikipedia)]
Trace
Inverse
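Determinant, trace (the sum of the diagonal entries) and inverse are all one-liners in NumPy; a small sketch with an arbitrary 3×3 matrix:

```python
import numpy as np

A = np.array([[2., 1., 0.],
              [1., 3., 1.],
              [0., 1., 2.]])

print(np.linalg.det(A))   # determinant (Sarrus' rule gives the same value by hand)
print(np.trace(A))        # trace = sum of diagonal entries = 7.0
print(np.linalg.inv(A))   # inverse exists since det(A) != 0
```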
Eigenvectors & Eigenvalues
Example: Eigenvectors, Eigenvalues
An eigenvector defines a direction in which a space is scaled by a transform. The corresponding eigenvalue gives the factor by which vectors along that direction are scaled: Av = λv.
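A minimal NumPy illustration (the diagonal matrix is chosen only to make the scaling directions obvious):

```python
import numpy as np

A = np.array([[2., 0.],
              [0., 3.]])   # scales the x-direction by 2 and the y-direction by 3

eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)          # [2. 3.]

# Each column v of `eigenvectors` satisfies A v = lambda v
for lam, v in zip(eigenvalues, eigenvectors.T):
    print(np.allclose(A @ v, lam * v))   # True, True
```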
General form of a system of linear equations
• A system with m equations in n unknowns can be written in matrix form
Ax = b with A ∈ ℝ^(m×n), b ∈ ℝ^(m×1), and x ∈ ℝ^(n×1)
Solution types
How to solve systems of linear equations?
• Elimination (e.g. starting from 2x – 3y = 15, …)
• Substitution (e.g. substituting y = 3x, …)
Solving matrix equations
Given
AX = B with A ∈ ℝ^(m×n), X ∈ ℝ^(n×p), B ∈ ℝ^(m×p),
we can multiply both sides from the left by the inverse A⁻¹, provided it exists (which requires A to be square, i.e. m = n), to give
A⁻¹AX = A⁻¹B
X = A⁻¹B
with A⁻¹ ∈ ℝ^(n×n).
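A NumPy sketch of this, using `np.linalg.solve` rather than forming A⁻¹ explicitly (the matrices are arbitrary examples):

```python
import numpy as np

A = np.array([[3., 1.],
              [1., 2.]])          # square and invertible
B = np.array([[9., 8.],
              [8., 7.]])

# Numerically preferable: solve AX = B directly ...
X = np.linalg.solve(A, B)

# ... which agrees with the textbook formula X = A^{-1} B
print(np.allclose(X, np.linalg.inv(A) @ B))   # True
print(np.allclose(A @ X, B))                  # True
```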
Gauss Elimination - Example
• The matrix A has linearly independent columns if and only if the Gram matrix AᵀA is invertible
• A matrix is said to have full rank if its rank equals the largest possible for a matrix of the same dimensions, which is the lesser of the number of rows and columns
Matrix rank - Example
$$A = \begin{pmatrix} 1 & 2 & 1 \\ -2 & -3 & 1 \\ 3 & 5 & 0 \end{pmatrix} \xrightarrow{\text{Gauss elimination}} \begin{pmatrix} 1 & 0 & -5 \\ 0 & 1 & 3 \\ 0 & 0 & 0 \end{pmatrix}$$
rk(A) = 2, because row₃ = row₁ − row₂
• A matrix has reduced rank if its echelon form contains rows of only zeros
• The number of non-zero rows equals rk(A)
• If rk(A) = rk([A|b]) = n (the number of unknowns), the system Ax = b has exactly one solution
• If rk(A) = rk([A|b]) < n, the system is underdetermined and has infinitely many solutions
• If rk(A) < rk([A|b]), the system is inconsistent and has no solution (typical of overdetermined systems)
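The same example matrix checked with NumPy (a small sketch):

```python
import numpy as np

A = np.array([[ 1,  2, 1],
              [-2, -3, 1],
              [ 3,  5, 0]])

print(np.linalg.matrix_rank(A))            # 2  (row3 = row1 - row2, so the rows are dependent)

# Full rank would be min(m, n) = 3 here; rank 2 means A is singular
print(np.isclose(np.linalg.det(A), 0.0))   # True
```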
Matrix Decompositions
What is a matrix decomposition?
• A factorization of a matrix into a product of (typically simpler or more structured) matrices
Eigen decomposition
A = VΛV⁻¹
• Not applicable to all matrices (for details see Goodfellow et al. 2016)
A = VΛVᵀ
• Holds for real symmetric A, with orthogonal V
• Used in e.g. Principal Component Analysis (PCA)
Singular Value Decomposition (SVD): A = UDVᵀ
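A NumPy sketch of both decompositions (random matrices; `eigh` is the symmetric/Hermitian eigensolver and is assumed here for the symmetric case):

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((3, 3))
A = M + M.T                      # symmetric, so A = V Lambda V^T with orthogonal V

lam, V = np.linalg.eigh(A)
print(np.allclose(A, V @ np.diag(lam) @ V.T))   # True

# The SVD works for any (even non-square) matrix
B = rng.standard_normal((4, 3))
U, s, Vt = np.linalg.svd(B, full_matrices=False)
print(np.allclose(B, U @ np.diag(s) @ Vt))      # True
```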
LU & LUP decomposition
A = LU or PA = LU
Applying the LUP decomposition to Ax = b gives LUx = Pb, which is solved by forward substitution (Ly = Pb) followed by back substitution (Ux = y)
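A sketch using SciPy's LU routines, assuming SciPy is available (the system below is an arbitrary example):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[2., 1., 1.],
              [4., 3., 3.],
              [8., 7., 9.]])
b = np.array([1., 2., 3.])

# lu_factor returns the combined LU factors plus the pivot (permutation) information
lu, piv = lu_factor(A)

# lu_solve performs the forward substitution (L) and the back substitution (U)
x = lu_solve((lu, piv), b)
print(np.allclose(A @ x, b))   # True
```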
QR decomposition
A = QR with Q orthogonal and R upper triangular
To solve Ax = b:
1. compute y = Qᵀb
2. compute x from Rx = y via back substitution
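A sketch of these two steps with NumPy's QR factorization and SciPy's triangular solver (the system is an arbitrary example):

```python
import numpy as np
from scipy.linalg import solve_triangular

A = np.array([[2., 1.],
              [1., 3.]])
b = np.array([3., 5.])

Q, R = np.linalg.qr(A)         # Q orthogonal, R upper triangular

# Step 1: y = Q^T b; Step 2: solve R x = y by back substitution
y = Q.T @ b
x = solve_triangular(R, y, lower=False)

print(np.allclose(A @ x, b))   # True
```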