Linear Algebra: Lecture Notes
Rostyslav Hryniv
1st term
Autumn 2019
Singular Value Decomposition, LU decomposition, Cholesky decomposition, QR and around
go to socrative.com
press student login
enter the room LAUCU2019
answer 10 questions on eigenvalues and eigenvectors
Outline
1 Singular Value Decomposition
  Spectral Theorem
  Low rank approximations
  Definition
  Explanation and proof
  Applications of SVD
2 LU decomposition
  Linear systems
  Elementary matrices
  LU factorization
3 Cholesky decomposition
  Motivation
  Applications of Cholesky decomposition
  Algorithm
4 QR and around
  Applications of QR
Spectral Theorem: every symmetric n × n matrix A, with eigenvalues λ1, . . . , λn and an ONB of eigenvectors u1, . . . , un, can be written as
A = λ1 u1 u1^T + · · · + λn un un^T
Example: let
A = (15 10; 10 0)
Why low-rank?
Cost comparison:
a full m × n matrix requires mn numbers to store;
a rank-one matrix σuv^T requires only m + n + 1.
This is important, e.g., for image compression.
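A quick numerical check of the decomposition and its best rank-one truncation (a minimal numpy sketch; the 2 × 2 matrix is the example above):

```python
import numpy as np

A = np.array([[15.0, 10.0],
              [10.0, 0.0]])

# Spectral decomposition of the symmetric A: eigh returns an ONB of
# eigenvectors, so A = lambda_1 u1 u1^T + lambda_2 u2 u2^T.
lam, U = np.linalg.eigh(A)
A_rebuilt = sum(lam[k] * np.outer(U[:, k], U[:, k]) for k in range(len(lam)))
assert np.allclose(A, A_rebuilt)

# Keeping the eigenvalue of largest magnitude gives the best rank-one
# approximation; it costs m + n + 1 numbers instead of m * n.
k = np.argmax(np.abs(lam))
A1 = lam[k] * np.outer(U[:, k], U[:, k])
print(np.round(A1, 2))
```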
If A is non-symmetric, then
its eigenvectors need not be orthogonal, and
it may have too few eigenvectors (one then has to use generalized eigenvectors).
If A is non-square, there are no eigenvalues and eigenvectors at all!
The matrix uv^T has rows u1 v^T, . . . , um v^T; if the rows of A are a1^T, . . . , am^T, then

‖A − uv^T‖F² = ∑_{j=1}^m ‖aj^T − uj v^T‖² = ∑_{j=1}^m ‖aj − uj v‖²

This is minimal if uj v is the projection P∥ aj of aj onto ls(v):

∑_{j=1}^m ‖aj − P∥ aj‖² = ∑_{j=1}^m ‖P⊥ aj‖² = ∑_{j=1}^m ‖aj‖² − ∑_{j=1}^m ‖P∥ aj‖²

Thus one needs to maximize (over unit vectors v)

∑_{j=1}^m ‖P∥ aj‖² = ∑_{j=1}^m |aj^T v|² = ∑_{j=1}^m |v^T aj|² = ‖Av‖²
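A numerical illustration of the last step (a sketch with a random test matrix): the maximizer is the top right singular vector v1, and no unit vector does better than σ1:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))

# v1 = first right singular vector; its value of ||Av|| equals sigma_1.
_, s, Vt = np.linalg.svd(A)
v1 = Vt[0]
assert np.isclose(np.linalg.norm(A @ v1), s[0])

# ||Av|| over random unit vectors never exceeds sigma_1.
for _ in range(1000):
    v = rng.standard_normal(4)
    v /= np.linalg.norm(v)
    assert np.linalg.norm(A @ v) <= s[0] + 1e-12
```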
Motivation
A = UΣV T
Singular values
Example
For
A = (1 1; 0 1; 1 0),
B = A^T A = (2 1; 1 2) has eigenvalues λ1 = 3 and λ2 = 1;
thus σ1 = √3, σ2 = 1
SVD theorem
Theorem (SVD)
Every m × n matrix A can be written as
A = UΣV^T
with an orthogonal m × m matrix U, an orthogonal n × n matrix V, and an m × n matrix Σ whose only nonzero entries are σ1 ≥ · · · ≥ σr > 0 on the main diagonal.
Remark
This is an analogue of the diagonalization A = UDU^T of a symmetric matrix A.
SVD theorem
Theorem (SVD — expanded form)
Every m × n matrix A of rank r can be written as A = UΣV^T, where
U = (u1 . . . ur | ur+1 . . . um),
V = (v1 . . . vr | vr+1 . . . vn),
Σ has σj on its main diagonal and zeros otherwise;
vj are eigenvectors of A^T A with eigenvalues σj²: A^T A vj = σj² vj;
uj := Avj/‖Avj‖ = Avj/σj for j = 1, . . . , r form an ONB for the range of A;
u1, . . . , um is an ONB for R^m;
A = σ1 u1 v1^T + · · · + σr ur vr^T
The vectors u1, . . . , ur are the left singular vectors of A;
v1, . . . , vr are the right singular vectors of A.
Remark: Avj = σj uj, A^T uj = σj vj, A A^T uj = σj² uj
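These relations are easy to verify numerically (a minimal numpy sketch with a random 5 × 3 matrix):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))
U, s, Vt = np.linalg.svd(A)   # A = U Sigma V^T; U is 5x5, Vt is 3x3
V = Vt.T

for j in range(len(s)):
    # v_j is an eigenvector of A^T A with eigenvalue sigma_j^2
    assert np.allclose(A.T @ A @ V[:, j], s[j]**2 * V[:, j])
    # A v_j = sigma_j u_j and A^T u_j = sigma_j v_j
    assert np.allclose(A @ V[:, j], s[j] * U[:, j])
    assert np.allclose(A.T @ U[:, j], s[j] * V[:, j])
```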
UΣ = (σ1 u1 . . . σr ur | 0 . . . 0)   (n − r zero columns)
   = (Av1 . . . Avr | 0 . . . 0) = A(v1 . . . vn) = AV
Example
For A = (1 1; 0 1; 1 0), we find that
σ1 = √3 and σ2 = 1;
v1 = (1/√2, 1/√2)^T and v2 = (1/√2, −1/√2)^T;
u1 = (1/√3)(√2, 1/√2, 1/√2)^T,
u2 = (0, −1/√2, 1/√2)^T,
u3 = (1/√3)(−1, 1, 1)^T
Then
σ1 u1 v1^T + σ2 u2 v2^T = (1 1; 1/2 1/2; 1/2 1/2) + (0 0; −1/2 1/2; 1/2 −1/2) = A
σ1 u1 v1^T is the best rank-one approximation of A in the Frobenius norm ‖A − B‖F = (∑(aij − bij)²)^{1/2}
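The last claim can be checked numerically (a sketch of the same 3 × 2 example; by the Eckart–Young theorem the Frobenius error of the rank-one truncation equals σ2 = 1):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [1.0, 0.0]])
U, s, Vt = np.linalg.svd(A)
A1 = s[0] * np.outer(U[:, 0], Vt[0])   # sigma_1 u1 v1^T

print(np.round(A1, 2))
print(np.linalg.norm(A - A1, 'fro'))   # 1.0 = sigma_2
```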
Interpretation of SVD
x ↦ y := V^T x,  y ↦ z := Σy,  z ↦ Ax = Uz
Reduced SVD
In the SVD representation, some part is uninformative:
vr+1, . . . , vn are chosen arbitrarily in the nullspace of A;
ur+1, . . . , um are chosen arbitrarily in the nullspace of A^T;
Σ has zero rows or columns.
The reduced SVD removes that uninformative part:

A = (u1 · · · ur) diag(σ1, . . . , σr) (v1 · · · vr)^T,

with the three factors of sizes m × r, r × r, and r × n.
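In numpy, the reduced ("thin") SVD is obtained with full_matrices=False (a sketch; for a rank-deficient matrix one would further truncate to the first r singular triples):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((100, 5))

U, s, Vt = np.linalg.svd(A, full_matrices=False)
print(U.shape, s.shape, Vt.shape)   # (100, 5) (5,) (5, 5)
assert np.allclose(A, U @ np.diag(s) @ Vt)
```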
To summarize:
The SVD for arbitrary rectangular matrices is an analogue of the
spectral decomposition for square (especially symmetric) matrices
the factorization A = UΣV^T means:
rotation V^T: change of basis to v1, . . . , vn
stretch Σ: multiplication by singular values along the vj
rotation U: change of basis to u1, . . . , um
in particular, Ax = b is equivalent to Σc = d, where
c := V^T x is the coordinate vector of x in the ONB v1, . . . , vn;
d := U^T b is the coordinate vector of b in the ONB u1, . . . , um;
thus x = ∑ ck vk and b = ∑ dk uk with dk = σk ck for k = 1, . . . , r
Geometrically this means that A maps the unit ball Bn of R^n onto the “degenerate” ellipsoid Em of R^m:

∑_{k=1}^r (dk/σk)² ≤ 1
v1 ↦ e1 ↦ σ1 e1 ↦ σ1 u1
v2 ↦ e2 ↦ σ2 e2 ↦ σ2 u2
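In code, solving Ax = b in SVD coordinates reads as follows (a sketch assuming a square nonsingular A, so that r = m = n):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
b = rng.standard_normal(4)

U, s, Vt = np.linalg.svd(A)
d = U.T @ b        # coordinates of b in the ONB u_1, ..., u_m
c = d / s          # solve Sigma c = d, i.e. c_k = d_k / sigma_k
x = Vt.T @ c       # back from coordinates in v_1, ..., v_n to x
assert np.allclose(A @ x, b)
```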
Polar decomposition
Every square matrix A can be written as A = QS with an orthogonal Q and a symmetric positive semidefinite S.
Why polar?
z = re^{iθ} =⇒ z̄z = |z|² = r²
A = QS =⇒ A^T A = S(Q^T Q)S = S²
Proof.
Write A = UΣV^T = (UV^T)(V ΣV^T) =: QS
Q := UV^T is orthogonal
S := V ΣV^T is symmetric and positive semidefinite
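A numerical version of this proof (a sketch; scipy.linalg.polar computes the same factorization directly):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4))

U, s, Vt = np.linalg.svd(A)
Q = U @ Vt                    # orthogonal factor
S = Vt.T @ np.diag(s) @ Vt    # symmetric positive semidefinite factor

assert np.allclose(A, Q @ S)
assert np.allclose(Q.T @ Q, np.eye(4))          # Q^T Q = I
assert np.all(np.linalg.eigvalsh(S) >= -1e-12)  # S is PSD
```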
Image compression
Instead of storing all mn entries of A, one can store its best rank-r approximation; this needs only
r(1 + m + n)
numbers (r singular values plus r left and r right singular vectors).
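A sketch of the storage count (a random matrix stands in for a real grayscale image here):

```python
import numpy as np

rng = np.random.default_rng(5)
img = rng.random((512, 512))        # placeholder for an image

U, s, Vt = np.linalg.svd(img, full_matrices=False)
r = 20
img_r = U[:, :r] @ np.diag(s[:r]) @ Vt[:r]   # best rank-r approximation

print(img.size)                                # m * n = 262144 numbers
print(r * (1 + img.shape[0] + img.shape[1]))   # r(1 + m + n) = 20500
```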
Pseudo-inverse
A rectangular A cannot be inverted!
However, a pseudo-inverse A+ can be defined s.t.
A+ A ≈ In and AA+ ≈ Im
for Σ, the pseudo-inverse Σ+ should satisfy
Σ+ Σ = Ir ⊕ 0n−r,  Σ Σ+ = Ir ⊕ 0m−r
so Σ+ has the transposed shape of Σ, with each σj replaced by 1/σj;
if A = UΣV^T, then its pseudo-inverse is A+ := V Σ+ U^T: indeed,
A+ A = V Σ+ U^T U Σ V^T = V (Σ+ Σ) V^T = V (Ir ⊕ 0n−r) V^T and, similarly, A A+ = U (Ir ⊕ 0m−r) U^T
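A numerical check (a sketch for a full-column-rank A; then A+ A = In exactly, while A A+ is only the projector onto the range):

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((5, 3))     # rank 3 almost surely

U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_plus = Vt.T @ np.diag(1 / s) @ U.T    # V Sigma^+ U^T

assert np.allclose(A_plus, np.linalg.pinv(A))
assert np.allclose(A_plus @ A, np.eye(3))   # A+ A = I_n here
```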
Best solution to Ax = b:
Recall that Ax = b is solvable ⇐⇒ b belongs to the range (i.e., the column space) of A;
if n > rank A, then there are many solutions (or none)
(the homogeneous equation Ax = 0 has nontrivial solutions);
any solution has the form x = x0 + x1 with x0 a particular solution and x1 any solution of Ax = 0;
if b is not in the range, solve the normal equation A^T Ax = A^T b to get the least-squares solution;
if rank A = n, then A^T A is invertible;
otherwise, the least-squares solution is not unique:
look for the shortest solution x̂.
Claim: if A = UΣV^T, then x̂ = V Σ+ U^T b
‖Ax − b‖ = ‖ΣV^T x − U^T b‖ = ‖Σy − U^T b‖ with y := V^T x (multiplication by the orthogonal U^T preserves norms);
the shortest solution is y = Σ+(U^T b), whence x = V y = (V Σ+ U^T) b
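Numerically, the claim agrees with numpy's SVD-based least-squares solver (a sketch with an artificially rank-deficient A):

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((6, 4))
A[:, 3] = A[:, 0] + A[:, 1]      # force rank A = 3 < n = 4
b = rng.standard_normal(6)

x_hat = np.linalg.pinv(A) @ b    # V Sigma^+ U^T b
x_lsq = np.linalg.lstsq(A, b, rcond=None)[0]   # minimum-norm LS solution
assert np.allclose(x_hat, x_lsq)

# any least-squares solution satisfies the normal equation A^T A x = A^T b
assert np.allclose(A.T @ A @ x_hat, A.T @ b)
```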
SVD vs PCA
Observe that the largest value of ‖Ax‖ with ‖x‖ ≤ 1 is attained at x = v1 and is equal to σ1;
v1 is the first principal axis for A^T A:
indeed, A^T A = V Σ^T U^T U Σ V^T = V Σ^T Σ V^T = V D V^T is the spectral decomposition of the symmetric matrix B := A^T A;
B has eigenvalues σk² with eigenvectors vk;
the quadratic form Q(x) := x^T Bx is equal to ‖Ax‖²;
by the minimax properties of the eigenvalues, σ2² = max{‖Ax‖² : ‖x‖ ≤ 1, x ⊥ v1}, and similarly for σ3², . . .
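A minimal PCA-via-SVD sketch (data must be centered; the scaling diag(3, 1, 0.1) just creates principal axes of different importance):

```python
import numpy as np

rng = np.random.default_rng(8)
X = rng.standard_normal((200, 3)) @ np.diag([3.0, 1.0, 0.1])
X -= X.mean(axis=0)              # center the data

# right singular vectors of X = eigenvectors of X^T X = principal axes
_, s, Vt = np.linalg.svd(X, full_matrices=False)
assert np.allclose(s**2, np.linalg.eigvalsh(X.T @ X)[::-1])
print(Vt[0])                     # first principal axis
```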
Linear systems
Given an m × n matrix A and b ∈ R^m, solve Ax = b for x ∈ R^n.
Solution: x = A^{-1} b for invertible A (m = n).
Computation of A^{-1} is costly, ∼ O(n³), and not always necessary!
Alternatively, use the Gaussian elimination method;
in matrix form, it amounts to an LU representation of A,
where L stands for “lower”- and U for “upper”-triangular.
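For instance (a sketch): np.linalg.solve factorizes A once (LU with partial pivoting, via LAPACK) instead of forming A^{-1} explicitly:

```python
import numpy as np

rng = np.random.default_rng(9)
A = rng.standard_normal((200, 200))
b = rng.standard_normal(200)

x1 = np.linalg.solve(A, b)    # LU-based; never forms A^{-1}
x2 = np.linalg.inv(A) @ b     # explicit inverse: costlier, less accurate
assert np.allclose(x1, x2)
```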
Definition
The matrices that perform the elementary row operations (obtained by applying the corresponding operation to the identity matrix) are called elementary matrices.
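For example (a small numpy sketch): applying a row operation to the identity matrix gives an elementary matrix, and left-multiplying by it performs that operation:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [4.0, 5.0]])

# elementary matrix for the row operation R2 <- R2 - 2*R1
E = np.eye(2)
E[1, 0] = -2.0
print(E @ A)    # [[2. 1.] [0. 3.]] -- the entry below the pivot is zeroed
```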
Lemma
The product of two lower-triangular (upper-triangular) matrices is lower-triangular (upper-triangular).
Proof.
Use the row or column form of matrix-matrix product
LU factorization
Theorem
Assume that an m × n matrix A can be reduced to a row echelon form U using only row substitution operations. Then A = LU with a lower-triangular m × m matrix L.
Proof.
Ek · · · E1 A = U =⇒ L = (Ek · · · E1)^{-1} = E1^{-1} · · · Ek^{-1}
Definition
The above representation A = LU, with an m × m lower-triangular
matrix L and upper-triangular∗ m × n matrix U, is the LU-factorization
of A.
Remark (∗ )
L is unique if all its diagonal entries are 1
If row interchanges are needed, use PA = LU, with P encoding all
row interchanges
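scipy.linalg.lu computes this factorization with row interchanges (a sketch; it returns P, L, U with A = PLU):

```python
import numpy as np
from scipy.linalg import lu

A = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [8.0, 7.0, 9.0]])

P, L, U = lu(A)                  # A = P L U, i.e. P^T A = L U
assert np.allclose(A, P @ L @ U)
print(np.diag(L))                # unit diagonal: [1. 1. 1.]
```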
Ax = b ⇐⇒ { Ly = b, Ux = y }
(first solve Ly = b by forward substitution, then Ux = y by back substitution)
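In code, the two triangular solves look like this (a sketch using scipy's triangular solver):

```python
import numpy as np
from scipy.linalg import lu, solve_triangular

rng = np.random.default_rng(10)
A = rng.standard_normal((4, 4))
b = rng.standard_normal(4)

P, L, U = lu(A)                                # A = P L U
y = solve_triangular(L, P.T @ b, lower=True)   # forward substitution: Ly = P^T b
x = solve_triangular(U, y, lower=False)        # back substitution: Ux = y
assert np.allclose(A @ x, b)
```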
Uniqueness: if A = L1 U1 = L2 U2 with unit diagonals in L1 and L2, then
L2^{-1} L1 is lower-triangular with 1's on the diagonal;
U2 U1^{-1} is upper-triangular;
thus L2^{-1} L1 = U2 U1^{-1} = I.
If A is nonsingular, U has a nonzero diagonal;
one can “factor it out” as D to get A = LDU with U having 1's on the main diagonal;
for symmetric matrices, U = L^T and A = LDL^T;
reason: A^T = U^T D L^T = LDU = A, and use uniqueness.
If A = L^T L, then x^T A x = x^T L^T L x = (Lx)^T (Lx) = ‖Lx‖² ≥ 0, so A is positive semidefinite.
Applications
The algorithm
Idea:
As in Gaussian elimination, make entries below diagonal zero
Recursive algorithm:
1 start with i := 1 and A(1) := A
2 At step i, the matrix A^(i) has the following form:
A^(i) = (I_{i−1} 0 0; 0 a_ii b_i^T; 0 b_i B^(i)),
with the identity matrix I_{i−1} of size i − 1
3 Set
L_i = (I_{i−1} 0 0; 0 √a_ii 0; 0 b_i/√a_ii I_{n−i})
4 Then A^(i) = L_i A^(i+1) L_i^T, where
A^(i+1) = (I_{i−1} 0 0; 0 1 0; 0 0 B^(i) − (1/a_ii) b_i b_i^T)
5 Finally, A^(n+1) = I_n, and so we get A = L L^T with L := L1 L2 . . . Ln
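The algorithm translates into a few lines of numpy (a minimal sketch assuming A is symmetric positive definite; no pivoting or error handling):

```python
import numpy as np

def cholesky_outer(A):
    """Outer-product Cholesky: lower-triangular L with A = L @ L.T."""
    A = A.astype(float)           # work on a copy of A
    n = A.shape[0]
    L = np.zeros_like(A)
    for i in range(n):
        pivot = np.sqrt(A[i, i])              # sqrt(a_ii)
        L[i, i] = pivot
        L[i+1:, i] = A[i+1:, i] / pivot       # b_i / sqrt(a_ii)
        # Schur complement update: B - (1/a_ii) b_i b_i^T
        A[i+1:, i+1:] -= np.outer(A[i+1:, i], A[i+1:, i]) / A[i, i]
    return L

A = np.array([[4.0, 0.0, 2.0],
              [0.0, 1.0, -1.0],
              [2.0, -1.0, 6.0]])
L = cholesky_outer(A)
assert np.allclose(L @ L.T, A)
assert np.allclose(L, np.linalg.cholesky(A))
```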
Example
A = A^(1) = (4 0 2; 0 1 −1; 2 −1 6)  =⇒  L1 = (2 0 0; 0 1 0; 1 0 1)
A^(1) = L1 A^(2) L1^T  =⇒  A^(2) = (1 0 0; 0 1 −1; 0 −1 5)  =⇒  L̃2 = (1 0; −1 1)
Ã^(2) = L̃2 Ã^(3) L̃2^T  =⇒  Ã^(3) = (1 0; 0 4)  =⇒  L̃3 = (1 0; 0 2)
(here Ã^(i) is the trailing 2 × 2 block of A^(i), and L̃i the corresponding block of Li)
L = L1 L2 L3 = (2 0 0; 0 1 0; 1 0 1)(1 0 0; 0 1 0; 0 −1 1)(1 0 0; 0 1 0; 0 0 2) = (2 0 0; 0 1 0; 1 −1 2)
A = L L^T = (2 0 0; 0 1 0; 1 −1 2)(2 0 1; 0 1 −1; 0 0 2)
Advantages of the QR decomposition:
the orthogonal columns of Q make the algorithm stable
(norms do not increase or decrease)
Applications of QR
QR eigenvalue algorithm
On each step, factorize Ak = Qk Rk and set Ak+1 := Rk Qk;
as Rk = Qk^{-1} Ak, one gets Ak+1 = Qk^{-1} Ak Qk;
thus Ak and Ak+1 have the same eigenvalues;
typically, Ak converges to an upper-triangular matrix R
(the Schur form of A);
the eigenvalues of A are then the diagonal entries of R.
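A minimal sketch of the unshifted iteration (production implementations add shifts and first reduce A to Hessenberg form):

```python
import numpy as np

def qr_eigenvalues(A, iters=200):
    """Unshifted QR iteration: returns approximate eigenvalues of A."""
    Ak = A.astype(float)
    for _ in range(iters):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q               # A_{k+1} = Q_k^{-1} A_k Q_k
    return np.diag(Ak)           # diagonal of the (near) Schur form

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
print(sorted(qr_eigenvalues(A)))        # ~ [1.0, 3.0]
print(sorted(np.linalg.eigvals(A)))     # [1.0, 3.0]
```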
Fourier transform and Fast Fourier transform
...