QR Factorization: Triangular Matrices QR Factorization Gram-Schmidt Algorithm Householder Algorithm

This document discusses QR factorization. It begins by introducing triangular matrices and then defines QR factorization as factorizing a matrix A with linearly independent columns as A = QR, where Q is orthonormal and R is upper triangular with nonzero diagonal elements. It provides examples of forward and back substitution using triangular matrices and discusses computing inverses. It also describes applications of QR factorization such as computing pseudoinverses and projections onto the range of a matrix.


L. Vandenberghe, ECE133A (Fall 2019)

6. QR factorization

• triangular matrices

• QR factorization

• Gram–Schmidt algorithm

• Householder algorithm

6.1
Triangular matrix

a square matrix A is lower triangular if A_ij = 0 for j > i

$$
A = \begin{bmatrix}
A_{11} & 0 & \cdots & 0 & 0 \\
A_{21} & A_{22} & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
A_{n-1,1} & A_{n-1,2} & \cdots & A_{n-1,n-1} & 0 \\
A_{n1} & A_{n2} & \cdots & A_{n,n-1} & A_{nn}
\end{bmatrix}
$$

A is upper triangular if A_ij = 0 for j < i (the transpose A^T is lower triangular)

a triangular matrix is unit upper/lower triangular if A_ii = 1 for all i

QR factorization 6.2
Forward substitution

solve Ax = b when A is lower triangular with nonzero diagonal elements

Algorithm

x_1 = b_1 / A_11
x_2 = (b_2 − A_21 x_1) / A_22
x_3 = (b_3 − A_31 x_1 − A_32 x_2) / A_33
⋮
x_n = (b_n − A_n1 x_1 − A_n2 x_2 − ⋯ − A_{n,n−1} x_{n−1}) / A_nn

Complexity: 1 + 3 + 5 + ⋯ + (2n − 1) = n² flops
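As a minimal NumPy sketch of the algorithm above (the function name `forward_subst` is my own, not from the lecture):

```python
import numpy as np

def forward_subst(A, b):
    """Solve Ax = b when A is lower triangular with nonzero diagonal."""
    n = len(b)
    x = np.zeros(n)
    for i in range(n):
        # x_i = (b_i - A_i1 x_1 - ... - A_{i,i-1} x_{i-1}) / A_ii
        x[i] = (b[i] - A[i, :i] @ x[:i]) / A[i, i]
    return x

A = np.array([[2.0, 0, 0], [1, 3, 0], [4, 5, 6]])
b = np.array([2.0, 5, 32])
x = forward_subst(A, b)   # should match np.linalg.solve(A, b)
```

each step uses one inner product and one division, which gives the 1 + 3 + ⋯ + (2n − 1) flop count above.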

QR factorization 6.3
Back substitution

solve Ax = b when A is upper triangular with nonzero diagonal elements

Algorithm

x_n = b_n / A_nn
x_{n−1} = (b_{n−1} − A_{n−1,n} x_n) / A_{n−1,n−1}
x_{n−2} = (b_{n−2} − A_{n−2,n−1} x_{n−1} − A_{n−2,n} x_n) / A_{n−2,n−2}
⋮
x_1 = (b_1 − A_12 x_2 − A_13 x_3 − ⋯ − A_1n x_n) / A_11

Complexity: n² flops
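A matching NumPy sketch, iterating from the last unknown upward (the helper name `back_subst` is mine):

```python
import numpy as np

def back_subst(A, b):
    """Solve Ax = b when A is upper triangular with nonzero diagonal."""
    n = len(b)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        # x_i = (b_i - A_{i,i+1} x_{i+1} - ... - A_in x_n) / A_ii
        x[i] = (b[i] - A[i, i+1:] @ x[i+1:]) / A[i, i]
    return x

R = np.array([[2.0, 4, 2], [0, 2, 8], [0, 0, 4]])
b = np.array([2.0, 10, 8])
x = back_subst(R, b)
```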

QR factorization 6.4
Inverse of a triangular matrix

a triangular matrix A with nonzero diagonal elements is nonsingular:

Ax = 0 =⇒ x=0

this follows from forward or back substitution applied to the equation Ax = 0

• inverse of A can be computed by solving AX = I column by column

$$
A \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}
= \begin{bmatrix} e_1 & e_2 & \cdots & e_n \end{bmatrix}
\qquad (x_i \text{ is column } i \text{ of } X)
$$

• inverse of lower triangular matrix is lower triangular

• inverse of upper triangular matrix is upper triangular

• complexity of computing inverse of n × n triangular matrix is

$$
n^2 + (n-1)^2 + \cdots + 1 \approx \frac{1}{3} n^3 \text{ flops}
$$
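The column-by-column idea above can be sketched as follows, with forward substitution inlined for each unit vector e_j (the function name `tril_inverse` is mine):

```python
import numpy as np

def tril_inverse(A):
    """Invert lower triangular A by solving A x_j = e_j column by column."""
    n = A.shape[0]
    X = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = 1.0
        for i in range(n):  # forward substitution for column j
            X[i, j] = (e[i] - A[i, :i] @ X[:i, j]) / A[i, i]
    return X

L = np.array([[2.0, 0, 0], [1, 3, 0], [4, 5, 6]])
X = tril_inverse(L)
# X is again lower triangular, and L @ X = I
```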

QR factorization 6.5
Outline

• triangular matrices

• QR factorization

• Gram–Schmidt algorithm

• Householder algorithm
QR factorization

if A ∈ R^{m×n} has linearly independent columns then it can be factored as

$$
A = \begin{bmatrix} q_1 & q_2 & \cdots & q_n \end{bmatrix}
\begin{bmatrix}
R_{11} & R_{12} & \cdots & R_{1n} \\
0 & R_{22} & \cdots & R_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & R_{nn}
\end{bmatrix}
$$

• the vectors q_1, . . . , q_n are orthonormal m-vectors:

‖q_i‖ = 1,   q_i^T q_j = 0 if i ≠ j

• the diagonal elements R_ii are nonzero

• if R_ii < 0, we can switch the signs of R_ii, . . . , R_in, and the vector q_i

• most definitions require R_ii > 0; this makes Q and R unique

QR factorization 6.6
QR factorization in matrix notation

if A ∈ R^{m×n} has linearly independent columns then it can be factored as

A = QR

Q-factor

• Q is m × n with orthonormal columns (Q^T Q = I)


• if A is square (m = n), then Q is orthogonal (Q^T Q = QQ^T = I)

R-factor

• R is n × n, upper triangular, with nonzero diagonal elements

• R is nonsingular (its diagonal elements are nonzero)

QR factorization 6.7
Example

 −1 −1 1   −1/2 1/2 −1/2 


 1 3 3   1/2 1/2 −1/2   2 4 2 
    
=  0 2 8 
 −1 −1 5   −1/2 1/2 1/2  
 
0 0 4 

 1 3 7   1/2 1/2 1/2 
  
 
 

  R11 R12 R13


 
q1 q2 q3  0 R22 R23
 
= 
 0 0 R33


 

= QR

QR factorization 6.8
Applications

in the following lectures, we will use the QR factorization to solve

• linear equations
• least squares problems
• constrained least squares problems

here, we show that it gives useful simple formulas for

• the pseudo-inverse of a matrix with linearly independent columns


• the inverse of a nonsingular matrix
• projection on the range of a matrix with linearly independent columns

QR factorization 6.9
QR factorization and (pseudo-)inverse

pseudo-inverse of a matrix A with linearly independent columns (page 4.23):

$$
A^\dagger = (A^T A)^{-1} A^T
$$

• pseudo-inverse in terms of the QR factors of A:

$$
\begin{aligned}
A^\dagger &= ((QR)^T (QR))^{-1} (QR)^T \\
&= (R^T Q^T Q R)^{-1} R^T Q^T \\
&= (R^T R)^{-1} R^T Q^T \qquad (Q^T Q = I) \\
&= R^{-1} R^{-T} R^T Q^T \qquad (R \text{ is nonsingular}) \\
&= R^{-1} Q^T
\end{aligned}
$$

• for a square nonsingular A this is the inverse:

$$
A^{-1} = (QR)^{-1} = R^{-1} Q^T
$$
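The identity A† = R⁻¹Qᵀ can be checked numerically; a sketch with NumPy's built-in QR (solving with R rather than forming R⁻¹ explicitly):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))     # columns independent with probability 1
Q, R = np.linalg.qr(A)              # reduced factorization: Q is 5x3, R is 3x3
A_pinv = np.linalg.solve(R, Q.T)    # R^{-1} Q^T
```

the result agrees with `np.linalg.pinv(A)` up to rounding, and A† A = I.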

QR factorization 6.10
Range

recall the definition of the range of a matrix A ∈ R^{m×n} (page 5.16):

range(A) = { Ax | x ∈ R^n }

suppose A has linearly independent columns with QR factors Q , R

• Q has the same range as A:

y ∈ range(A) ⇐⇒ y = Ax for some x


⇐⇒ y = QRx for some x
⇐⇒ y = Qz for some z
⇐⇒ y ∈ range(Q)

• columns of Q are orthonormal and have the same span as columns of A

QR factorization 6.11
Projection on range

• combining A = QR and A† = R−1QT (from page 6.10) gives

AA† = QRR−1QT = QQT

note the order of the product in AA† and the difference with A† A = I

• recall (from page 5.17) that QQT x is the projection of x on the range of Q
$$
A A^\dagger x = Q Q^T x
$$

[figure: x and its projection A A† x = Q Q^T x onto the plane range(A) = range(Q)]
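A short numerical check that A A† x and Q Qᵀ x give the same projection:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 2))
Q, R = np.linalg.qr(A)
x = rng.standard_normal(6)

proj = Q @ (Q.T @ x)                # projection of x on range(Q) = range(A)
# the residual x - proj is orthogonal to range(A)
```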

QR factorization 6.12
QR factorization of complex matrices

if A ∈ C^{m×n} has linearly independent columns then it can be factored as

A = QR

• Q ∈ C^{m×n} has orthonormal columns (Q^H Q = I)

• R ∈ C^{n×n} is upper triangular with real nonzero diagonal elements

• most definitions choose diagonal elements Rii to be positive

• in the rest of the lecture we assume A is real

QR factorization 6.13
Algorithms for QR factorization

Gram–Schmidt algorithm (page 6.15)


• complexity is 2mn² flops
• not recommended in practice (sensitive to rounding errors)

Modified Gram–Schmidt algorithm


• complexity is 2mn² flops
• better numerical properties

Householder algorithm (page 6.25)


• complexity is 2mn² − (2/3)n³ flops
• represents Q as a product of elementary orthogonal matrices
• the most widely used algorithm (used by the function qr in MATLAB and Julia)

in the rest of the course we will take 2mn² for the complexity of QR factorization
QR factorization 6.14
Outline

• triangular matrices

• QR factorization

• Gram–Schmidt algorithm

• Householder algorithm
Gram–Schmidt algorithm

Gram–Schmidt QR algorithm computes Q and R column by column

• after k steps we have a partial QR factorization

$$
\begin{bmatrix} a_1 & a_2 & \cdots & a_k \end{bmatrix}
= \begin{bmatrix} q_1 & q_2 & \cdots & q_k \end{bmatrix}
\begin{bmatrix}
R_{11} & R_{12} & \cdots & R_{1k} \\
0 & R_{22} & \cdots & R_{2k} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & R_{kk}
\end{bmatrix}
$$

• the columns q_1, . . . , q_k are orthonormal

• the diagonal elements R_11, R_22, . . . , R_kk are positive

• the columns q_1, . . . , q_k have the same span as a_1, . . . , a_k (see page 6.11)

QR factorization 6.15
Computing column k

suppose we have completed the factorization for the first k − 1 columns

• column k of the equation A = QR reads

a_k = R_1k q_1 + R_2k q_2 + ⋯ + R_{k−1,k} q_{k−1} + R_kk q_k

• regardless of how we choose R_1k, . . . , R_{k−1,k}, the vector

q̃_k = a_k − R_1k q_1 − R_2k q_2 − ⋯ − R_{k−1,k} q_{k−1}

will be nonzero: a_1, a_2, . . . , a_k are linearly independent and therefore

a_k ∉ span{a_1, . . . , a_{k−1}} = span{q_1, . . . , q_{k−1}}

• q_k is q̃_k normalized: choose R_kk = ‖q̃_k‖ and q_k = (1/R_kk) q̃_k

• q̃_k and q_k are orthogonal to q_1, . . . , q_{k−1} if we choose R_1k, . . . , R_{k−1,k} as

R_1k = q_1^T a_k,   R_2k = q_2^T a_k,   . . .,   R_{k−1,k} = q_{k−1}^T a_k

QR factorization 6.16
Gram–Schmidt algorithm

Given: an m × n matrix A with linearly independent columns a_1, . . . , a_n

Algorithm

for k = 1 to n
    R_1k = q_1^T a_k
    R_2k = q_2^T a_k
    ⋮
    R_{k−1,k} = q_{k−1}^T a_k
    q̃_k = a_k − (R_1k q_1 + R_2k q_2 + ⋯ + R_{k−1,k} q_{k−1})
    R_kk = ‖q̃_k‖
    q_k = (1/R_kk) q̃_k

QR factorization 6.17
Example

example on page 6.8:

 −1 −1 1 
 1 3 3 
 
a1 a2 a3
 
=  −1 −1 5 

 1 3 7 
 

 R11 R12 R13 
q1 q2 q3  0 R22 R23
  
= 
 0 0 R33


 

First column of Q and R

 −1   −1/2 
 1  1 1/2 
   
q̃1 = a1 =  R11 = k q̃1 k = 2, q1 = q̃1 = 

, 
 −1  R11  −1/2 
 
 1   1/2 
   

QR factorization 6.18
Example

Second column of Q and R

• compute R_12 = q_1^T a_2 = 4

• compute

$$
\tilde q_2 = a_2 - R_{12} q_1
= \begin{bmatrix} -1 \\ 3 \\ -1 \\ 3 \end{bmatrix}
- 4 \begin{bmatrix} -1/2 \\ 1/2 \\ -1/2 \\ 1/2 \end{bmatrix}
= \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}
$$

• normalize to get

$$
R_{22} = \|\tilde q_2\| = 2,
\qquad
q_2 = \frac{1}{R_{22}} \tilde q_2
= \begin{bmatrix} 1/2 \\ 1/2 \\ 1/2 \\ 1/2 \end{bmatrix}
$$

QR factorization 6.19
Example

Third column of Q and R

• compute R_13 = q_1^T a_3 = 2 and R_23 = q_2^T a_3 = 8

• compute

$$
\tilde q_3 = a_3 - R_{13} q_1 - R_{23} q_2
= \begin{bmatrix} 1 \\ 3 \\ 5 \\ 7 \end{bmatrix}
- 2 \begin{bmatrix} -1/2 \\ 1/2 \\ -1/2 \\ 1/2 \end{bmatrix}
- 8 \begin{bmatrix} 1/2 \\ 1/2 \\ 1/2 \\ 1/2 \end{bmatrix}
= \begin{bmatrix} -2 \\ -2 \\ 2 \\ 2 \end{bmatrix}
$$

• normalize to get

$$
R_{33} = \|\tilde q_3\| = 4,
\qquad
q_3 = \frac{1}{R_{33}} \tilde q_3
= \begin{bmatrix} -1/2 \\ -1/2 \\ 1/2 \\ 1/2 \end{bmatrix}
$$

QR factorization 6.20
Example

Final result

 −1 −1 1 
 1 3 3    R11 R12 R13
   
q1 q2 q3  0 R22 R23
 
 −1 −1 5 
 = 
 0 0 R33

 1 3 7 
  
 

 −1/2 1/2 −1/2 
 1/2 1/2 −1/2   2 4 2 
  
=  0 2 8 
 −1/2 1/2 1/2  

0 0 4 

 1/2 1/2 1/2 

 

QR factorization 6.21
Complexity

Complexity of cycle k (of algorithm on page 6.17)

• k − 1 inner products with a_k: (k − 1)(2m − 1) flops

• computation of q̃_k: 2(k − 1)m flops

• computing R_kk and q_k: 3m flops

total for cycle k: (4m − 1)(k − 1) + 3m flops

Complexity for m × n factorization:

$$
\sum_{k=1}^{n} \bigl( (4m-1)(k-1) + 3m \bigr)
= (4m-1)\frac{n(n-1)}{2} + 3mn
\approx 2mn^2 \text{ flops}
$$

QR factorization 6.22
Numerical experiment

• we use the following MATLAB code


[m, n] = size(A);
Q = zeros(m,n);
R = zeros(n,n);
for k = 1:n
R(1:k-1,k) = Q(:,1:k-1)' * A(:,k);
v = A(:,k) - Q(:,1:k-1) * R(1:k-1,k);
R(k,k) = norm(v);
Q(:,k) = v / R(k,k);
end;

• we apply this to a square matrix A of size m = n = 50

• A is constructed as A = USV with U , V orthogonal, S diagonal with

S_ii = 10^(−10(i−1)/(n−1)),   i = 1, . . . , n
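The same experiment can be sketched in NumPy (a transcription of the MATLAB code above; the orthogonal U, V are built here via QR of random matrices, and the exact numbers depend on the random seed):

```python
import numpy as np

def gram_schmidt(A):
    """Classical Gram-Schmidt QR, following the MATLAB code above."""
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for k in range(n):
        R[:k, k] = Q[:, :k].T @ A[:, k]
        v = A[:, k] - Q[:, :k] @ R[:k, k]
        R[k, k] = np.linalg.norm(v)
        Q[:, k] = v / R[k, k]
    return Q, R

# A = USV with U, V orthogonal and geometrically decaying singular values
n = 50
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
S = np.diag(10.0 ** (-10.0 * np.arange(n) / (n - 1)))
A = U @ S @ V

Q, R = gram_schmidt(A)
# deviation from orthogonality: e = max over k of max_{i<k} |q_i^T q_k|
e = max(np.abs(Q[:, :k].T @ Q[:, k]).max() for k in range(1, n))
```

even though Q loses orthogonality, the product QR still reproduces A to machine precision.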

QR factorization 6.23
Numerical experiment

plot shows deviation from orthogonality between qk and previous columns

$$
e_k = \max_{1 \le i < k} |q_i^T q_k|, \qquad k = 2, \ldots, n
$$

[plot: e_k versus k for k = 1, . . . , 50; the vertical axis runs from 0 to 0.8, and e_k grows to order one as k increases]

loss of orthogonality is due to rounding error


QR factorization 6.24
Outline

• triangular matrices

• QR factorization

• Gram–Schmidt algorithm

• Householder algorithm
Householder algorithm

• the most widely used algorithm for QR factorization (qr in MATLAB and Julia)

• less sensitive to rounding error than Gram–Schmidt algorithm

• computes a “full” QR factorization

$$
A = \begin{bmatrix} Q & \tilde Q \end{bmatrix}
\begin{bmatrix} R \\ 0 \end{bmatrix},
\qquad
\begin{bmatrix} Q & \tilde Q \end{bmatrix} \text{ orthogonal}
$$

• the full Q-factor is constructed as a product of orthogonal matrices

$$
\begin{bmatrix} Q & \tilde Q \end{bmatrix} = H_1 H_2 \cdots H_n
$$

each H_i is an m × m symmetric, orthogonal “reflector” (page 5.10)

QR factorization 6.25
Reflector

$$
H = I - 2vv^T \qquad \text{with } \|v\| = 1
$$

• Hx is the reflection of x through the hyperplane {z | v^T z = 0} (see page 5.10)

• H is symmetric

• H is orthogonal

• the matrix–vector product Hx can be computed efficiently as

$$
Hx = x - 2(v^T x)v
$$

complexity is 4p flops if v and x have length p
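A small sketch of the formula above: the product Hx needs only an inner product and a vector update, never the explicit matrix H (which is formed below only to check the claimed properties):

```python
import numpy as np

rng = np.random.default_rng(2)
v = rng.standard_normal(4)
v /= np.linalg.norm(v)              # unit vector, ||v|| = 1
x = rng.standard_normal(4)

Hx = x - 2 * (v @ x) * v            # O(p) product, 4p flops
H = np.eye(4) - 2 * np.outer(v, v)  # explicit H, for checking only
```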

QR factorization 6.26
Reflection to multiple of unit vector

given a nonzero p-vector y = (y_1, y_2, . . . , y_p), define

$$
w = \begin{bmatrix} y_1 + \mathrm{sign}(y_1)\|y\| \\ y_2 \\ \vdots \\ y_p \end{bmatrix},
\qquad
v = \frac{1}{\|w\|} w
$$

• we define sign(0) = 1

• the vector w satisfies

$$
\|w\|^2 = 2(w^T y) = 2\|y\|(\|y\| + |y_1|)
$$

• the reflector H = I − 2vv^T maps y to a multiple of e_1 = (1, 0, . . . , 0):

$$
Hy = y - \frac{2(w^T y)}{\|w\|^2} w = y - w = -\mathrm{sign}(y_1)\|y\| e_1
$$
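A quick check of this construction on the first column of the example matrix from page 6.8, where y = (−1, 1, −1, 1) and ‖y‖ = 2, so Hy should be −sign(−1)·2·e₁ = (2, 0, 0, 0):

```python
import numpy as np

y = np.array([-1.0, 1, -1, 1])
s = 1.0 if y[0] >= 0 else -1.0      # sign(0) = 1 by the convention above
w = y.copy()
w[0] += s * np.linalg.norm(y)       # w = y + sign(y1) ||y|| e1
v = w / np.linalg.norm(w)
Hy = y - 2 * (v @ y) * v            # equals -sign(y1) ||y|| e1
```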

QR factorization 6.27
Geometry

[figure: y, the normal vector w, the first coordinate axis, the reflected vector −sign(y_1)‖y‖e_1, and the hyperplane {x | w^T x = 0}]

the reflection through the hyperplane {x | w^T x = 0} with normal vector

$$
w = y + \mathrm{sign}(y_1)\|y\| e_1
$$

maps y to the vector −sign(y_1)‖y‖ e_1

QR factorization 6.28
Householder triangularization

• computes reflectors H_1, . . . , H_n that reduce A to triangular form:

$$
H_n H_{n-1} \cdots H_1 A = \begin{bmatrix} R \\ 0 \end{bmatrix}
$$

• after step k, the matrix H_k H_{k−1} · · · H_1 A has the following structure:

[figure: m × n block matrix with the first k columns triangularized; elements in positions i, j for i > j and j ≤ k are zero]

QR factorization 6.29
Householder algorithm

the following algorithm overwrites A with $\begin{bmatrix} R \\ 0 \end{bmatrix}$

Algorithm: for k = 1 to n,

1. define y = A_{k:m,k} and compute the (m − k + 1)-vector v_k:

$$
w = y + \mathrm{sign}(y_1)\|y\| e_1,
\qquad
v_k = \frac{1}{\|w\|} w
$$

2. multiply A_{k:m,k:n} with the reflector I − 2v_k v_k^T:

$$
A_{k:m,k:n} := A_{k:m,k:n} - 2 v_k (v_k^T A_{k:m,k:n})
$$

(see page 109 in the textbook for the “slice” notation for submatrices)
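The two steps above translate directly into NumPy; a sketch (the function name `householder_qr` is mine), run on the example matrix from page 6.8:

```python
import numpy as np

def householder_qr(A):
    """Householder triangularization: returns a copy of A overwritten
    with [R; 0], plus the reflector vectors v_1, ..., v_n."""
    A = A.astype(float)
    m, n = A.shape
    vs = []
    for k in range(n):
        y = A[k:, k]
        s = 1.0 if y[0] >= 0 else -1.0        # sign(0) = 1
        w = y.copy()
        w[0] += s * np.linalg.norm(y)          # w = y + sign(y1) ||y|| e1
        v = w / np.linalg.norm(w)
        # A[k:, k:] := A[k:, k:] - 2 v (v^T A[k:, k:])
        A[k:, k:] -= 2.0 * np.outer(v, v @ A[k:, k:])
        vs.append(v)
    return A, vs

A = np.array([[-1.0, -1, 1], [1, 3, 3], [-1, -1, 5], [1, 3, 7]])
R, vs = householder_qr(A)
# R reproduces the triangularized matrix computed on pages 6.33-6.36
```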

QR factorization 6.30
Comments

• in step 2 we multiply A_{k:m,k:n} with the reflector I − 2v_k v_k^T:

$$
(I - 2v_k v_k^T) A_{k:m,k:n} = A_{k:m,k:n} - 2 v_k (v_k^T A_{k:m,k:n})
$$

• this is equivalent to multiplying A with the m × m reflector

$$
H_k = \begin{bmatrix} I & 0 \\ 0 & I - 2v_k v_k^T \end{bmatrix}
= I - 2 \begin{bmatrix} 0 \\ v_k \end{bmatrix}
\begin{bmatrix} 0 \\ v_k \end{bmatrix}^T
$$

• the algorithm overwrites A with

$$
\begin{bmatrix} R \\ 0 \end{bmatrix}
$$

and returns the vectors v_1, . . . , v_n, with v_k of length m − k + 1
QR factorization 6.31
Example

example on page 6.8:

$$
A = \begin{bmatrix}
-1 & -1 & 1 \\ 1 & 3 & 3 \\ -1 & -1 & 5 \\ 1 & 3 & 7
\end{bmatrix}
= H_1 H_2 H_3 \begin{bmatrix} R \\ 0 \end{bmatrix}
$$

we compute reflectors H_1, H_2, H_3 that triangularize A:

$$
H_3 H_2 H_1 A = \begin{bmatrix}
R_{11} & R_{12} & R_{13} \\
0 & R_{22} & R_{23} \\
0 & 0 & R_{33} \\
0 & 0 & 0
\end{bmatrix}
$$

QR factorization 6.32
Example

First column of R

• compute the reflector that maps the first column of A to a multiple of e_1:

$$
y = \begin{bmatrix} -1 \\ 1 \\ -1 \\ 1 \end{bmatrix},
\qquad
w = y - \|y\| e_1 = \begin{bmatrix} -3 \\ 1 \\ -1 \\ 1 \end{bmatrix},
\qquad
v_1 = \frac{1}{\|w\|} w = \frac{1}{2\sqrt{3}} \begin{bmatrix} -3 \\ 1 \\ -1 \\ 1 \end{bmatrix}
$$

• overwrite A with the product of I − 2v_1 v_1^T and A:

$$
A := (I - 2v_1 v_1^T) A = \begin{bmatrix}
2 & 4 & 2 \\
0 & 4/3 & 8/3 \\
0 & 2/3 & 16/3 \\
0 & 4/3 & 20/3
\end{bmatrix}
$$

QR factorization 6.33
Example

Second column of R

• compute the reflector that maps A_{2:4,2} to a multiple of e_1:

$$
y = \begin{bmatrix} 4/3 \\ 2/3 \\ 4/3 \end{bmatrix},
\qquad
w = y + \|y\| e_1 = \begin{bmatrix} 10/3 \\ 2/3 \\ 4/3 \end{bmatrix},
\qquad
v_2 = \frac{1}{\|w\|} w = \frac{1}{\sqrt{30}} \begin{bmatrix} 5 \\ 1 \\ 2 \end{bmatrix}
$$

• overwrite A_{2:4,2:3} with the product of I − 2v_2 v_2^T and A_{2:4,2:3}:

$$
A := \begin{bmatrix} 1 & 0 \\ 0 & I - 2v_2 v_2^T \end{bmatrix} A
= \begin{bmatrix}
2 & 4 & 2 \\
0 & -2 & -8 \\
0 & 0 & 16/5 \\
0 & 0 & 12/5
\end{bmatrix}
$$

QR factorization 6.34
Example

Third column of R

• compute the reflector that maps A_{3:4,3} to a multiple of e_1:

$$
y = \begin{bmatrix} 16/5 \\ 12/5 \end{bmatrix},
\qquad
w = y + \|y\| e_1 = \begin{bmatrix} 36/5 \\ 12/5 \end{bmatrix},
\qquad
v_3 = \frac{1}{\|w\|} w = \frac{1}{\sqrt{10}} \begin{bmatrix} 3 \\ 1 \end{bmatrix}
$$

• overwrite A_{3:4,3} with the product of I − 2v_3 v_3^T and A_{3:4,3}:

$$
A := \begin{bmatrix} I & 0 \\ 0 & I - 2v_3 v_3^T \end{bmatrix} A
= \begin{bmatrix}
2 & 4 & 2 \\
0 & -2 & -8 \\
0 & 0 & -4 \\
0 & 0 & 0
\end{bmatrix}
$$

QR factorization 6.35
Example

Final result

$$
\begin{aligned}
H_3 H_2 H_1 A
&= \begin{bmatrix} I & 0 \\ 0 & I - 2v_3 v_3^T \end{bmatrix}
   \begin{bmatrix} 1 & 0 \\ 0 & I - 2v_2 v_2^T \end{bmatrix}
   (I - 2v_1 v_1^T) A \\
&= \begin{bmatrix} I & 0 \\ 0 & I - 2v_3 v_3^T \end{bmatrix}
   \begin{bmatrix} 1 & 0 \\ 0 & I - 2v_2 v_2^T \end{bmatrix}
   \begin{bmatrix} 2 & 4 & 2 \\ 0 & 4/3 & 8/3 \\ 0 & 2/3 & 16/3 \\ 0 & 4/3 & 20/3 \end{bmatrix} \\
&= \begin{bmatrix} I & 0 \\ 0 & I - 2v_3 v_3^T \end{bmatrix}
   \begin{bmatrix} 2 & 4 & 2 \\ 0 & -2 & -8 \\ 0 & 0 & 16/5 \\ 0 & 0 & 12/5 \end{bmatrix} \\
&= \begin{bmatrix} 2 & 4 & 2 \\ 0 & -2 & -8 \\ 0 & 0 & -4 \\ 0 & 0 & 0 \end{bmatrix}
\end{aligned}
$$

QR factorization 6.36
Complexity

Complexity in cycle k (of the algorithm on page 6.30): the dominant terms are

• (2(m − k + 1) − 1)(n − k + 1) flops for the product v_k^T A_{k:m,k:n}

• (m − k + 1)(n − k + 1) flops for the outer product with v_k

• (m − k + 1)(n − k + 1) flops for the subtraction from A_{k:m,k:n}

the sum is roughly 4(m − k + 1)(n − k + 1) flops

Total for computing R and the vectors v_1, . . . , v_n:

$$
\sum_{k=1}^{n} 4(m-k+1)(n-k+1)
\approx \int_0^n 4(m-t)(n-t)\,dt
= 2mn^2 - \frac{2}{3} n^3 \text{ flops}
$$

QR factorization 6.37
Q-factor

the Householder algorithm returns the vectors v_1, . . . , v_n that define

$$
\begin{bmatrix} Q & \tilde Q \end{bmatrix} = H_1 H_2 \cdots H_n
$$

• usually there is no need to compute the matrix [ Q Q̃ ] explicitly

• the vectors v_1, . . . , v_n are an economical representation of [ Q Q̃ ]

• products with [ Q Q̃ ] or its transpose can be computed as

$$
\begin{bmatrix} Q & \tilde Q \end{bmatrix} x = H_1 H_2 \cdots H_n x,
\qquad
\begin{bmatrix} Q & \tilde Q \end{bmatrix}^T y = H_n H_{n-1} \cdots H_1 y
$$

QR factorization 6.38
Multiplication with Q-factor

• the matrix–vector product H_k x is defined as

$$
H_k x = \begin{bmatrix} I & 0 \\ 0 & I - 2v_k v_k^T \end{bmatrix}
\begin{bmatrix} x_{1:k-1} \\ x_{k:m} \end{bmatrix}
= \begin{bmatrix} x_{1:k-1} \\ x_{k:m} - 2(v_k^T x_{k:m}) v_k \end{bmatrix}
$$

• the complexity of the multiplication H_k x is 4(m − k + 1) flops

• the complexity of multiplication with H_1 H_2 · · · H_n or its transpose is

$$
\sum_{k=1}^{n} 4(m-k+1) \approx 4mn - 2n^2 \text{ flops}
$$

• roughly equal to a matrix–vector product with an m × n matrix (2mn flops)
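A sketch of these products from stored reflector vectors (the helper name `apply_Q` is mine); the v_k used here are the ones computed in the example on pages 6.33–6.35, so applying [Q Q̃] to e₁ should recover q₁ = (−1/2, 1/2, −1/2, 1/2) from page 6.8:

```python
import numpy as np

def apply_Q(vs, x, transpose=False):
    """Multiply [Q Qt] (or its transpose) with x using the reflector
    vectors: H_k acts only on the trailing entries x[k:]."""
    x = x.astype(float)
    order = range(len(vs)) if transpose else reversed(range(len(vs)))
    for k in order:         # H_1 ... H_n x applies H_n first
        v = vs[k]
        x[k:] -= 2.0 * (v @ x[k:]) * v
    return x

# reflector vectors from the example on pages 6.33-6.35
vs = [np.array([-3.0, 1, -1, 1]) / np.sqrt(12),
      np.array([5.0, 1, 2]) / np.sqrt(30),
      np.array([3.0, 1]) / np.sqrt(10)]

e1 = np.array([1.0, 0, 0, 0])
q1 = apply_Q(vs, e1)        # first column of [Q Qt]
```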

QR factorization 6.39
